Emission-line properties of IllustrisTNG galaxies: from local diagnostic diagrams to high-redshift predictions for JWST

We compute synthetic, rest-frame optical and ultraviolet (UV) emission-line properties of galaxy populations at redshifts from z$\approx$0 to z=8 in a full cosmological framework. We achieve this by coupling, in post-processing, the cosmological IllustrisTNG simulations with new-generation nebular-emission models, accounting for line emission from young stars, post-asymptotic-giant-branch (PAGB) stars, accreting black holes (BHs) and, for the first time, fast radiative shocks. The optical emission-line properties of simulated galaxies dominated by different ionizing sources are largely consistent with those expected from classical diagnostic diagrams and reflect the observed increase in [OIII]/H$\beta$ at fixed [NII]/H$\alpha$ and the evolution of the H$\alpha$, [OIII]$\lambda5007$ and [OII]$\lambda3727$ luminosity functions from z$\approx$0 to z$\sim$2. At higher redshift, we find that the emission-line galaxy population is dominated by star-forming and active galaxies, with negligible fractions of shock- and PAGB-dominated galaxies. We highlight 10 UV-diagnostic diagrams able to robustly identify the dominant ionizing sources in high-redshift galaxies. We also compute the evolution of several optical- and UV-line luminosity functions from z=4 to z=7, and the number of galaxies expected to be detectable per field of view in deep, medium-resolution spectroscopic observations with the NIRSpec instrument on board the James Webb Space Telescope. We find that 2-hour-long exposures are sufficient to achieve unbiased censuses of H$\alpha$ and [OIII]$\lambda5007$ emitters, while at least 5 hours are required for H$\beta$, and even 10 hours will detect only progressively smaller fractions of [OII]$\lambda3727$, OIII]$\lambda1663$, CIII]$\lambda1908$, CIV$\lambda1550$, [NII]$\lambda6584$, SiIII]$\lambda1888$ and HeII$\lambda1640$ emitters, especially in the presence of dust.


INTRODUCTION
Optical and ultraviolet (UV) emission from interstellar gas provides important information about the ionizing sources and the properties of the interstellar medium (ISM) in galaxies.Extensive studies of this kind have been enabled for galaxies in the local Universe by large spectroscopic surveys, such as the Sloan Digital Sky Survey (SDSS, e.g., York et al. 2000;Blanton et al. 2017).For example, optical nebular-emission lines provide useful diagnostics to distinguish between ionization by young massive stars (tracing the star formation rate, hereafter SFR), an active galactic nucleus (hereafter AGN), evolved, post-asymptotic giant branch (hereafter post-AGB) stars and fast radiative shocks1 (e.g., Izotov & Thuan 1999;Kobulnicky et al. 1999;Kauffmann et al. 2003;Nagao et al. 2006;Kewley & Ellison 2008;Morisset et al. 2016).In fact, the intensity ratios of strong emission lines, such as  Baldwin et al. (1981, hereafter BPT) and Veilleux & Osterbrock (1987).These diagrams have proven efficient to distinguish active from inactive star-forming galaxies in large samples of local galaxies (e.g.Kewley et al. 2001;Kauffmann et al. 2003).Instead, identifying unambiguously galaxies for which the ISM is primarily ionized by post-AGB stellar populations or fast radiative shocks appears more difficult.In the former case, recent studies suggest that diagnostic diagrams involving the Hα equivalent width (EW), such as EW(Hα) versus [N ii]/Hα (hereafter the WHAN diagram, Cid Fernandes et al. 2010Fernandes et al. , 2011)), appear more promising.In the latter case, Baldwin et al. (1981); Kewley et al. (2019)  Over the past two decades, rest-frame optical spectra have become available for increasingly large samples of more distant galaxies, at redshifts z ∼ 0.5-3, through near-infrared (NIR) spectroscopy (e.g., Pettini & Pagel 2004;Hainline et al. 2009;Steidel et al. 2014;Shapley et al. 2015), in particular with the NIR multi-object spectrographs MOSFIRE (McLean et al. 2010) and FMOS (Kimura 2010).Interestingly, these observations indicate uniformly that star-forming galaxies at z > 1 have systematically larger [O iii]/Hβ ratio at fixed [N ii]/Hα ratio than their local SDSS counterparts (see, e.g., Shapley et al. 2005;Lehnert et al. 2009;Yabe et al. 2012;Steidel et al. 2014;Shapley et al. 2015;Strom et al. 2017).The physical origin of this observation is still being debated, and several explanations have been proposed, such as higher ionization parameter, enhanced N/O abundances, harder ionizing-radiation field (e.g.due to binary star models) or weak, unresolved AGN emission in high-redshift galaxies compared to low-redshift ones (for a detailed discussion, see Hirschmann et al. 2017, and references therein).
The above spectroscopic surveys, together with deep narrow-band surveys over wide sky areas (e.g.HiZELS, Geach et al. 2008), have also allowed studies of the statistics of rest-frame optical emission-line galaxies, quantified through luminosity functions of the Hα, [O iii]λ5007+Hβ and [O ii] λ3727 emission lines at redshifts z ∼ 0-3 (e.g., Tresse et al. 2002;Fujita et al. 2003;Ly et al. 2007;Villar et al. 2008;Shim et al. 2009;Hayes et al. 2010;Sobral et al. 2012Sobral et al. , 2013Sobral et al. , 2015;;Colbert et al. 2013;Khostovan et al. 2015Khostovan et al. , 2018Khostovan et al. , 2020;;Hayashi et al. 2018, and references therein).These emission lines, which are sensitive to the ionizing radiation from hot, massive O-and Btype stars, are often used as tracers of star formation on timescales of ∼ 10 million years, providing insight into the evolution of the cosmic star-formation-rate density and the associated build-up of stellar mass in galaxies (as a complement to indicators based on the UV and far-infrared luminosities).Yet, the luminosity functions derived in the above studies can exhibit large scatter, because of uncertain corrections for dust attenuation, AGN contamination, selection biases, sample completeness, sample variance, filter profiles and various other factors.
In general, the observed line luminosity functions have been found to be well described by a Schechter (1976) functional form.In this context, the most recent studies appear to agree on a negligible evolution of the faint-end slope of line-luminosity functions with redshift and an increase by one or two orders of magnitude of the cutoff luminosity at the bright end (L * ) from z = 0 to z = 2-3 for Hα, [O iii]λ5007+Hβ and [O ii] λ3727, reflecting the similar increase in cosmic SFR density (e.g., Sobral et al. 2015;Khostovan et al. 2015).
At redshift z > 3, emission-line observations are generally still limited, except for a few pioneering studies (see Robertson 2021, for a review).This situation is being revolutionized by the recently launched James Webb Space Telescope (JWST), on board of which the sensitive NIRSpec instrument (Jakobsen et al. 2022;Ferruit et al. 2022) is already detecting rest-frame optical emission lines in z > 3 galaxies (Curti et al. 2023;Katz et al. 2022).Large such samples will be built up over the first year of operations by multiple teams, with rest-frame UV emission lines being observable out to extremely high redshifts.UV lines are particularly prominent in metal-poor, actively star-forming dwarf galaxies in the nearby Universe and have been detected in small samples of high-redshift galaxies with NIR spectrographs (e.g., Pettini & Pagel 2004;Hainline et al. 2009;Erb et al. 2010;Stark et al. 2014;Vanzella 2017;Senchyna et al. 2017;Talia 2017).In addition to JWST, large surveys of high-redshift emission-line galaxies are (expected to be) observed with other current and future observational facilities, such as the Dark Energy Spectroscopic Instrument (DESI), Euclid, and the Nancy Grace Roman Space Telescope.
Despite this observational progress, interpreting observations of rest-frame optical and UV emission lines of highredshift galaxies remains challenging, as illustrated by the persisting debate about the nature of the ionizing radiation in galaxies at z ∼ 3 (see above).Also, at the very low metallicities expected in the youngest galaxies at high redshifts (e.g., Maiolino et al. 2008), the spectral signatures of AGN-dominated and star-formation (SF)-dominated galaxies overlap in standard optical (BPT) diagnostic diagrams (Groves et al. 2006;Feltre et al. 2016;Hirschmann et al. 2019).This led Feltre et al. (2016, see also Nakajima et al. 2018) to explore the location of a wide range of photoionization models of SF-and AGN-dominated galaxies in alternative, UV emission-line dianostic diagrams.While highly instructive, these pioneering studies also present some limitations: (i) the exploration of purely SF-dominated and AGNdominated models did not account for mixed contributions by these ionizing sources, nor for contributions by post-AGB stars nor radiative shocks; (ii) the results may be biased by the inclusion of parameter combinations not found in nature; (iii) the conclusions drawn about line-ratio diagnostic diagrams do not incorporate potential effects linked to the evolution of galaxy properties with cosmic time; and (iv) these studies, based on grids of photoionization models, are not designed to explore the evolution of emission-line properties of galaxy populations (such as luminosity functions) with redshift.
A way to gain insight into the connection between emission-line properties and ionizing sources for the evolving galaxy population is to model galaxy nebular emission in a full cosmological framework.Fully self-consistent models of this kind are currently limited, mainly because of insufficient resolution of the ISM and the neglect of radiationgas coupling in cosmological simulations.As an alternative, some studies proposed the post-processing of cosmological hydrodynamic simulations and semi-analytic models with photoionization models to compute nebular emission of galaxies in a cosmological context, albeit with the drawbacks of either following emission lines produced by only young star clusters (Kewley et al. 2013;Orsi et al. 2014;Shimizu et al. 2016;Wilkins et al. 2020;Pellegrini et al. 2020;Shen et al. 2020;Garg et al. 2022;Baugh et al. 2022), or being limited to a relatively small sample of massive galaxies (Hirschmann et al. 2017(Hirschmann et al. , 2019).In the latter two studies, we included nebular emission from young star clusters, AGN and post-AGB stellar populations, but not the contribution from fast radiative shocks.
In this paper, we appeal to the same methodology as in our previous work (Hirschmann et al. 2017(Hirschmann et al. , 2019) ) to model in a self-consistent way the emission-line properties of galaxy populations, but with two main improvements: we consider the evolution over cosmic time of a much larger sample of galaxies from the IllustrisTNG simulation, covering a full range of masses; and we incorporate the contribution to nebular emission from fast radiative shocks, in addition to the emission from young star clusters, AGN, and post-AGB stellar populations.We achieve this by coupling photoionization models for young stars (Gutkin et al. 2016), AGN (Feltre et al. 2016), post-AGB stars (Hirschmann et al. 2017) and fast, radiative shocks (Alarie & Morisset 2019) with the IllustrisTNG cosmological hydrodynamic simulations.Our methodology, which by design captures SF-dominated, composite, AGN-dominated, post-AGB-dominated and shock-dominated galaxy populations in a full cosmological framework, offers a unique way to address several important questions: • Are the predicted emission-line properties of Illus-trisTNG galaxy populations statistically consistent with the observed, optical emission-line properties of galaxies out to z ∼ 3, as was previously shown for only a small sample of massive galaxies (Hirschmann et al. 2017)?
• Which optical and UV diagnostic diagrams can help robustly distinguish between the main ionizing source(s) in metal-poor galaxy populations at redshifts z > 3, and how do these diagnostics compare with the UV-based criteria derived from the limited sample of Hirschmann et al. (2019)?
• What is the expected evolution of optical and UV lineluminosity functions of galaxies at redshifts z > 3, and how many galaxies of a given type can we expect to be detected by planned JWST/NIRSpec surveys?
Answers to these questions will allow a robust interpretation of the high-quality spectra of very distant galaxies that will be collected in the near future with JWST, providing valuable insight into, for example, the census of emission-line galaxy populations out to z ∼ 8, and the relative contributions by star formation and nuclear activity to reionization.
The paper is structured as follows.We start by describing the theoretical framework of our study in Section 2, including the IllustrisTNG simulation set, the photoionization models and the coupling methodology between simulations and emission-line models.Section 3 presents our main findings about tracing the nature of ionizing sources in galaxy populations over cosmic time, via classical optical and novel UV diagnostic diagrams.In Section 4, we explore: (i) which UV line luminosities reliably trace the SFR in high-redshift galaxies; (ii) the cosmic evolution of optical and UV emission-line luminosity functions; and (iii) number counts of galaxy populations of different types expected to be detected with JWST/NIRSpec at high redshift.We address possible caveats of our approach and put our results into the context of previous theoretical studies in Section 5. Section 6 summarizes our main results.

THEORETICAL FRAMEWORK
In this paper, we model the optical and UV emissionline properties of galaxy populations in a wide redshift range using the IllustrisTNG300, IllustrisTNG100 and Il-lustrisTNG50 simulation suite (hereafter TNG300, TNG100 and TNG50; Pillepich et al. 2018a;Springel et al. 2018;Nelson et al. 2018;Naiman et al. 2018;Marinacci et al. 2018;Nelson et al. 2019;Pillepich et al. 2019).In the following paragraphs, we briefly summarise the simulation details (Section 2.1), the emission-line models (Section 2.2, Gutkin et al. 2016;Feltre et al. 2016;Hirschmann et al. 2017) and the coupling methodology (Section 2.3, Hirschmann et al. 2017, 2019), referring the reader to the original studies for more details.Our novel incorporation of emission-line models of fast radiative shocks (from Alarie & Morisset 2019), and their coupling with shocked regions identified in simulated IllustrisTNG galaxies, are described in Sections 2.2.2 and 2.3.2,respectively.

The IllustrisTNG simulation suite
IllustrisTNG is a suite of publicly available, large volume, cosmological, gravo-magnetohydrodynamical simula-tions, run with the moving-mesh code Arepo (Springel 2010), and composed of three simulations with different volumes and resolutions: TNG300, TNG100 and TNG50.Each IllustrisTNG run self-consistently solves the coupled evolution of dark matter, gas, luminous stars, and supermassive black holes from a starting redshift of z = 127 to z = 0, assuming the currently favoured Planck cosmology (σ8 = 0.8159, Ωm = 0.3089, ΩΛ = 0.6911 and h = 0.6774; see e.g., Planck Collaboration et al. 2016).
All IllustrisTNG simulations include a model for galaxy formation physics (Weinberger et al. 2017;Pillepich et al. 2018b), which has been tuned to match observational constraints on the galaxy stellar-mass function and stellarto-halo mass relation, the total gas-mass content within the virial radius of massive galaxy groups, the stellarmass/stellar-size relation and the relation between blackhole (BH) mass and galaxy mass at z = 0. Specifically, the simulations follow (i) primordial and metal-line radiative cooling in the presence of an ionizing background radiation field; (ii) a pressurised ISM via an effective equation-of-state model; (iii) stochastic star formation; (iv) evolution of stellar populations and the associated chemical enrichment (tracing individual elements) by supernovae (SN) Ia/II, AGB stars and neutron-star (NS)-NS mergers; (v) stellar feedback via an kinetic wind scheme; (vi) BH seeding, growth and feedback (two-mode model for high-accretion and low-accretion phases); (vii) amplification of a small, primordial-seed magnetic field and dynamical impact under the assumption of ideal MHD.The IllustrisTNG simulations have been shown in various studies to provide a fairly realistic representation of the properties of galaxies evolving across cosmic time (e.g., Torrey et al. 2019).
In an IllustrisTNG simulation, the properties of galaxies, galaxy groups, subhaloes and haloes (identified using the FoF and Subfind substructure-identification algorithms, see Davis et al. 1985 andSpringel et al. 2001, respectively), are computed 'on the fly' and saved for each snapshot.In the following, we always consider both central and satellite Subfind haloes and their galaxies.In addition, an on-the-fly cosmic shock finder coupled to the code (Schaal & Springel 2015) uses a ray-tracing method to identify shock surfaces and measure their properties.Schaal et al. (2016) describe the functioning and performances of this algorithm on Illustris-simulation data.In a first step, cells in shock-candidate regions are flagged according to certain criteria (e.g., negative velocity divergence; temperature and density gradients pointing in a same direction).Secondly, the shock surface is defined by the ensemble of cells exhibiting the maximum compression along the shock normal.Thirdly, the Mach number is estimated from the Ranking-Hugoniot temperature jump condition, and a shock is defined only if the Mach-number jump exceeds 1.3.We note that, by default, shock finding is disabled for starforming cells, which are those lying close to the effective equation of state of the ISM in density-temperature (ρ-T ) phase space.That is, shocks are not followed when either the pre-shock cell, the post-shock cell, or the cell in the shock fulfil ρ > ρ sfr and T < 2 × 10 4 (ρ/ρ sfr ) K, where ρ sfr is the density threshold for star formation.By definition, therefore, the IllustrisTNG simulations are not capturing shocks in the cold/warm/star-forming ISM.This also implies that shocks from supernovae explosions, which primarily occur in star-forming regions, are not captured.For more details on the shock-finder algorithm, we refer the reader to Schaal & Springel (2015) and Schaal et al. (2016).

Emission-line models
We use the same methodology as in Hirschmann et al. (2017Hirschmann et al. ( , 2019) ) to compute nebular-line emission of galaxies from the post-processing of the three tiers of IllustrisTNG simulations.At z = 0, TNG50 has ∼ 8, 800 galaxies with stellar masses greater than 10 8 M⊙, TNG100 ∼ 18, 000 galaxies with stellar masses greater than 3 × 10 9 M⊙ and TNG300 ∼ 47, 000 galaxies with stellar masses greater than 10 10 M⊙.For all these galaxies and their progenitors at z > 0, nebular emission from young star clusters (Gutkin et al. 2016), narrow-line regions (NLR) of AGN (Feltre et al. 2016) and post-AGB stars (Hirschmann et al. 2017) is computed using version c13.03 of the photoionization code Cloudy (Ferland et al. 2013), while the emission from fast radiative shocks (Alarie & Morisset 2019) is computed using Mappings V (Sutherland & Dopita 2017).All photoionization calculations considered in this work were performed adopting a common set of element abundances down to metallicities of a few per cent of Solar (from Gutkin et al. 2016).

Emission-line models of young star clusters, AGN and post-AGB stellar populations
To model nebular emission from H ii regions around young stars, AGN narrow-line regions and post-AGB stellar populations, we adopt the same photoionization models as described in section 2.

Emission-line models of fast radiative shocks
To describe the contribution by shock-ionized gas to nebular emission in IllustrisTNG galaxies, we appeal to the 3MdBs database3 of fully-radiative shock models computed by Alarie & Morisset (2019) using Mappings V.The models, computed in plane-parallel geometry, are available over a similarly wide range of interstellar metallicities (Z shock ) to those adopted in the H ii-region, AGN-NLR and PAGB photoionization models described in the previous section [albeit with only two values of the C/O ratio: (C/O) shock = 0.25×(C/O)⊙ and (C/O)⊙].Metal depletion onto dust grains is not included in this case, as in fast shocks, dust can be efficiently destroyed by grain-grain collisions, through both shattering and spallation, and by thermal sputtering (Allen et al. 2008).
The other main adjustable parameters defining the shock-model grid are the shock velocity (100 v shock 1000 km s −1 ), pre-shock density (1 n H,shock 10 4 cm −2 ) and magnetic-field strength (10 −4 B shock 10 µG).We focus here on the predictions of models including nebular emission from both shocked and shock-precursor (i.e., preshock) gas photoionized by the shock (see Alarie & Morisset 2019 for more details).

Coupling nebular-emission models with IllustrisTNG simulations
We combine the IllustrisTNG simulations of galaxy populations described in Section 2.1 with the photoionization models of Section 2.2 by associating, at each time step, each simulated galaxy single H ii-region, AGN-NLR, PAGB and radiative-shock models, which, when combined, constitute the total nebular emission of the galaxy.We achieve this using the procedure described in Hirschmann et al. (2017Hirschmann et al. ( , 2019)), by self-consistently matching the model parameters available from the simulations with those of the emissionline models, a novelty here being the incorporation of the radiative-shock models (see below).The ISM and stellar parameters of simulated galaxies are evaluated by considering all 'bound' gas cells and star particles (as identified by the Subfind algorithm; Section 2.1) for the coupling with the H ii-region and PAGB models, and within 1 kpc around the BH for the coupling with the AGN-NLR models.We note that while the coupling methodology described in the next paragraphs pertains to galaxy-wide nebular emission (as in Hirschmann et al. 2017Hirschmann et al. , 2019)), the versatile nature of our approach allows its application to spatially-resolved studies of galaxies (Hirschmann et al., in prep.).
A few photoionization-model parameters cannot be defined from the simulation, such as the slope of the AGN ionizing spectrum (α), the hydrogen gas density in individual ionized regions (nH), the dust-to-metal mass ratio (ξ d ) and the pre-shock density (n H,shock ), since individual ionised regions are unresolved and dust formation is not followed.For simplicity, we fix the slope of the AGN ionizing spectrum to α = −1.7 and adopt standard hydrogen densities nH,⋆ = 10 2 cm −3 in the H ii-region models, nH,• = 10 3 cm −3 in the AGN-NLR models, and nH,⋄ = 10 cm −3 in the PAGB models.We mitigate the impact of the dust-to-metal mass ratio on the predicted nebular emission of the H iiregion, AGN and PAGB models by sampling two values, ξ d = 0.3 and 0.5, around the Solar-neighbourhood value of ξ d,⊙ = 0.36 (Gutkin et al. 2016).Hirschmann et al. (2017Hirschmann et al. ( , 2019) ) discuss in detail the influence of the above parameters on predicted optical and UV emission-line fluxes.For the pre-shock density, which has a comparably weak influence on most emission lines of interest to us, we adopt for simplicity n H,shock = 1 cm −3 , the lowest value available in the shock models, closest to the typical galaxy-wide densities sampled by the simulations.
In the next paragraphs, we provide slightly more details on the coupling of the IllustrisTNG simulations with H ii-region, AGN-NLR and PAGB photoionization models (Section 2.3.1) and radiative-shock models (Section 2.3.2).Carton et al. 2017). 2 At higher redshift, we take the evolution of the filling factor to follow that of the global average gas density.

2.3.1
The procedure to associate each simulated galaxy with an AGN-NLR and a PAGB photoionization models is identical to that described in Hirschmann et al. (2019).At each time step, we select the AGN-NLR model of metallicity, C/O ratio and ionization parameter closest to the average values computed in a co-moving sphere of 1-kpc radius around the central BH (roughly appropriate to probe the NLR for AGN with luminosities in the range found in our simulations).The PAGB model is selected to be that with age and metallicity closest to the average ones of stars older than 3 Gyr in the simulated galaxy, and with same warm-gas properties as the H ii-region model (since the simulation does not resolve gas in star-forming clouds versus the diffuse ISM).

Coupling simulated galaxies with radiative-shock models
To compute the contribution from radiative shocks to nebular-line emission in a simulated galaxy at a given time step, we consider all cells flagged by the shock-finder algorithm (Section 2.1) and compute their mean magnetic-field strength (B We note that, throughout this paper (except in Section 4.3), we do not consider attenuation by dust outside H ii regions and compare our predictions with observed emissionline ratios corrected for this effect By design, the above optical-line ratios are anyway rather insensitive to dust, as they are defined by lines close in wavelength (for reference, the corrections are less than ∼ 0.015 dex for a V -band attenuation of AV ∼ 0.5 mag).
In addition to line luminosities and line ratios, we also compute the total equivalent widths of some nebular emission lines.We obtain the EW of an emission line by dividing the total line luminosity by the total continuum flux C at the line wavelength, e.g., EW(C iii] λ1908) = L CIII] /C CIII] (expressed in Å).The total continuum C is the sum of the contributions by the stellar (SF and PAGB) and AGN components. 3For the stellar components, we account for both attenuated stellar radiation and nebular recombination continuum.For the AGN component, we consider only the nebular recombination continuum, and do not account for any attenuated radiation from the accreting BH.This assumption should be reasonable for type-2 AGN, where direct AGN radiation is obscured by the surrounding torus, and only emission from the narrow-line region is observed.Instead, for type-1 AGN, the EWs computed here should be interpreted as upper limits.In the remainder of this study, we will focus on exploring the EWs of one optical and five UV lines: EW(Hα), EW(C iii] λ1908), EW(C iv λ1550), EW(O iii] λ1663), EW(Si iii] λ1888) and EW(N iii] λ1750).

IDENTIFICATION OF THE MAIN IONIZING SOURCE(S) IN GALAXY POPULATIONS OVER COSMIC TIME
In this section, we investigate optical and UV diagnostic diagrams to identify the main ionizing sources in full galaxy populations of the TNG50 and TNG100 simulations at z = 0−7.We focus on galaxies containing at least ∼ 1000 star particles, corresponding to stellar masses greater than 10 8 M⊙ and 3 × 10 9 M⊙ in the TNG50 and TNG100 simulations, respectively.Note that in this section, we do not explicitly show results for TNG300 galaxies, whose emissionline properties hardly differ from those of TNG100 galaxies.
In Section 3.1 below, we start by comparing the predicted locations of these different galaxy types at z = 0 in the classical BPT and Mass-Excitation (MEx, Juneau et al. 2011) diagnostic diagrams with those of SDSS galaxies.We also propose alternative diagnostics to reliably identify shock-dominated and PAGB-dominated galaxies.In Section 3.2, we examine how the star formation rates and coldgas fractions of simulated galaxies compare with those of SDSS galaxies.Then, we move on to the redshift range z = 0.5 − 7 and explore, in Section 3.3, to what extent classical BPT diagrams still allow accurate separation of SFdominated, composite and AGN galaxy populations, and whether our simulation predictions can statistically reproduce the observed evolution of [O iii]/Hβ at fixed [N ii]/Hα.In Section 3.4, we investigate changes in the relative fractions of galaxies of different types with cosmic time.Finally, in Section 3.5, we confront our simulations of galaxy populations with the UV diagnostic diagrams proposed by Hirschmann et al. (2019, based on a set of only 20 zoomin simulations of massive galaxies) to assess the robustness of these diagrams to identify different galaxy types during the epoch or reionization, as a guidance for interpreting planned spectroscopic, high-redshift galaxy surveys with, e.g., JWST/NIRSpec.In each panel, the predicted distribution for TNG50 galaxies is overplotted as black contours, while the grey shaded 2D histogram shows the observed distribution of SDSS galaxies.For a fair comparison between models and observations, we have imposed a typical flux-detection limit of 5 × 10 −17 erg s −1 cm −2 for simulated galaxies (motivated by table 1 of Juneau et al. 2014, as in figure 1 of Hirschmann et al. 2017).

Optical diagnostic diagrams at z = 0
In the top-row diagrams, the black dashed and dotted lines indicate the classical empirical criteria of Kauffmann et al. (2003) and Kewley et al. (2001), respectively, to differentiate SF galaxies from composites, AGN and LI(N)ER [low-ionization (nuclear) emission] galaxies.Observationally, galaxies with line ratios below the dashed line are classified as SF-dominated, those with line ratios between the dashed and dotted lines as composites, and those with line ratios above the dotted line as AGN-dominated.In addition, galaxies with line ratios in the bottom-right quadrant defined by dot-dashed lines are classified as LI(N)ER, whose main ionizing sources are still debated (e.g., faint AGN, post-AGB stellar populations, shocks or a mix of these sources; e.g., Belfiore 2016).The lines in the middleand bottom-row diagrams of Fig. 1 show other criteria by Kewley et al. (2001Kewley et al. ( , 2006) ) to separate SF-dominated (below the curved dotted line) from AGN-dominated (above both dotted lines) and LI(N)ER (in the bottom right area) galaxies.
As expected from the results obtained by Hirschmann et al. (2017Hirschmann et al. ( , 2019) ) for a small sample of massive, re-simulated galaxies, we find that the TNG50 and TNG100 galaxy populations at z = 0 overlap with SDSS galaxies in all BPT diagrams in Fig. 1.Moreover, the predicted locations of SF-, AGN-and PAGB-dominated galaxy populations largely coincide with the SF, AGN and LI(N)ER areas defined by standard criteria.Composite galaxies extend over a larger area than expected from the standard criteria toward the AGN region, while shock-dominated galaxies lie preferentially at high log([O iii]/Hβ) 0 and log([O i]/Hα) −1, with [N ii]/Hα and [S ii]/Hα similar to those of other galaxy types.
We note that SF-dominated and composite galaxies reach lower [N ii]/Hα, [S ii]/Hα and [O i]/Hα ratios in the TNG50 population, which samples lower stellar masses and metallicities than the TNG100 simulation.Likewise, there are hardly any PAGB-dominated TNG50 galaxies.Such galaxies are typically more massive than 3 × 10 10 M⊙ and therefore significantly less numerous than in the eight-fold larger TNG100 volume.
It is interesting to note that, while PAGB dominated galaxies account for only 3 percent of all emission-line galaxies in the TNG100 simulation at z = 0, they make up ∼ 20 per cent of galaxies residing in the LI(N)ER area of the [O iii]/Hβ vs. [N ii]/Hα diagram.In the TNG300 simulation, which has a higher stellar mass cut and better statistics for massive galaxies, PAGB-dominated galaxies make up ∼ 40 per cent of all LI(N)ER.Thus, the assumption often made in observational studies that LI(N)ER are typically weak AGN can potentially strongly bias statistics, such as, for ex-ample, the optical AGN luminosity function and the AGN luminosity-SFR relation.
Similarly to PAGB-dominated galaxies, shockdominated galaxies are relatively rare, representing at most 10 per cent of all emission-line galaxies at z = 0 in TNG100 (the fraction is lower in TNG50, because of the lower percentage of massive galaxies).Nevertheless, shocks contribute 5 to 20 per cent of the Hβ-line emission from 81 per cent of SF-dominated galaxies, 98 per cent of composites, 97 per cent of AGN-dominated galaxies and 51 per cent of PAGB-dominated galaxies.
In Fig. 2, we show the analogue of Fig. 1 for the MEx diagram (Juneau et al. 2011), defined by the [O iii]/Hβ ratio and galaxy stellar mass M stellar .The dotted and dashed lines indicate empirical criteria from Juneau et al. (2014) to distinguish between SF-dominated (below the dotted line), composite (between dotted and dashed line) and AGNdominated (above the dashed line) galaxies.We find that AGN-dominated galaxies in the TNG100 simulation mostly overlap with the observationally-defined AGN region.However, the [O iii]/Hβ ratios of AGN-dominated galaxies with lower stellar masses sampled by TNG50 fall below the expectation.Shock-dominated galaxies lie in the same area and are therefore hardly distinguishable from AGN.Instead, SFdominated and composite galaxies extend beyond the area identified by Juneau et al. (2014) in this diagram.In the stellar mass range M stellar = 3 × 10 10 -10 11 M⊙, both populations can reach [O iii]/Hβ ∼ 1, i.e., an order of magnitude above expectation.
The diagnostic diagrams discussed so far in Figs 1 and 2 do not allow us to uniquely identify shock-dominated and PAGB-dominated galaxies.We may appeal for this to other diagnostic diagrams, such as the EW(Hα)versus-[N ii]/Hα (WHAN) diagram (Stasińska et al. 2008;Cid Fernandes et al. 2010, 2011;Stasińska et al. 2015), expected to help distinguish PAGB-dominated galaxies among LI(N)ER, and the (Kewley et al. 2019), expected to help distinguish between shock-dominated and other types of galaxies.We show the analogues of Fig. 1 for these two additional diagnostic diagrams in Figs 3 and 4, respecively.
The black dashed lines in Fig. 3 show the empirical selection criteria of Cid Fernandes et al. (2011), which divide the WHAN diagram into three rectangles.SF-dominated galaxies are expected to reside in the top-left rectangle, AGN-dominated galaxies in the top-right one, and PAGBdominated galaxies (referred to as 'retired' galaxies by Cid Fernandes et al. 2011) in the bottom one, corresponding to EW(Hα) < 3 Å.
Fig. 3 shows that the TNG100 and TNG50 simulated galaxies roughly overlap with SDSS galaxies in the WHAN diagram.As expected, the predicted Hα equivalent widths of PAGB-dominated galaxies lie below those of any other considered galaxy type, with EW(Hα) 1.25 Å.This is less than half the value of 3 Å proposed by Cid Fernandes et al. (2011), although this difference may arise in part from our definition of PAGB-dominated galaxies.Had we considered as PAGB-dominated all galaxies where PAGB stars contribute at least 40 (instead of 50) per cent of the total Hβ luminosity, the upper limit (red line) in Fig. 3 would lie around EW(Hα) ∼ 3-4 Å.Also, in our models, the predicted Hα luminosities of PAGB-dominated galaxies rely on PAGB evolutionary tracks from Miller Bertolami (2016, incorporated in the latest version of the Bruzual & Charlot 2003 code), with shorter lifetimes (but higher luminosities) than in previous widely used prescriptions.Adopting older PAGB models would enhance the predicted EW(Hα) of PAGBdominated galaxies (Section 5.1).In any case, neither changing the definition of PAGB-dominated galaxies nor using older PAGB models would introduce any confusion between PAGB-dominated and SF-or shock-dominated galaxies in Fig. 3.Only the separating EW(Hα) value would change.
In Fig. 4, the black dashed lines delimit the top-right region of the [O iii]/[O ii]-versus-[O i]/Hα diagram expected to be populated by galaxies whose line emission is dominated by fast radiative shocks, according to Kewley et al. (2019, their figure 11).The different types of TNG50 and TNG100 galaxy populations in the different diagrams of Fig. 4 all fall on the footprint of SDSS galaxies, indicating the ability of our models to account for these observations.Furthermore, the simulated shock-dominated galaxies are the only ones to populate the top-right region noted by Kewley et al. (2019), confirming the robustness of this diagnostic diagram to separate shock-dominated galaxies from other galaxy types.
Overall, we consider the agreement between simulated and observed galaxy populations of different types at z = 0 in the different diagnostic diagrams of Figs.1-4 as remarkable, given that, in our approach, different galaxy types are assigned on the basis of physical parameters, such as fractional contribution to the total Hβ luminosity and BHAR/SFR ratio, rather than of observables.Our results confirm previous suggestions that the The layout and colour coding are the same as in the top row of Fig. 1.Overplotted as black dotted and dashed lines are the empirical criteria of Juneau et al. (2014) to distinguish between SF-dominated (below the dotted line), composite (between dashed and dotted lines) and AGN-dominated (above the dashed line) galaxies.Fernandes et al. (2011) to distinguish between SF-dominated (top-left quadrant), AGN-dominated (top-right quadrant), and retired, post-AGB dominated (below horizontal dashed black line) galaxies.The red dashed line shows the criterion suggested by our simulated emission-line galaxies to robustly identify post-AGB-dominated galaxies.Kewley et al. (2019).The layout and colour coding are the same as in the top row of Fig. 1.In each panel, the top-right quadrant delimited by dashed black lines is the region expected to be populated only by galaxies dominated by fast radiative shocks (see figure 11 of Kewley et al. 2019).
Hα diagrams, a 5D line-ratio diagnostics, provide powerful means of distinguishing between SF-, AGN-, PAGB-and shock-dominated galaxies in the nearby Universe.

SFR and cool-gas fraction
After having divided the simulated galaxy populations into different types based on their dominant ionizing sources, and explored their location in various diagnostic diagrams, we continue in this Section to study some of the predicted properties of the different galaxy types at z = 0, such as SFR and baryonic fraction in the 'cool-gas' phase (i.e., gas with T < 10 5 K).We define the cool-gas fraction as Figure 5. Location of the TNG100 and TNG50 galaxy populations at z = 0 in the SFR-versus-stellar mass (top row) and cool-gas fraction-versus-stellar mass (bottom row) diagrams.The layout and colour coding are the same as in the top row of Fig. 1.Also shown by the black solid and dotted lines in the top diagrams are the observed main sequences of local star-forming galaxies from SDSS and GALEX data, respectively (Elbaz et al. 2007;Salim et al. 2007, the grey shaded area indicates the 1σ scatter about the solid line).In the bottom panels, the black squares with 1σ error bars are observed mean gas fractions from various samples of z ∼ 0 galaxies compiled by Peeples & Shankar (2011).
M cool /(M cool + M stellar ), where M cool and M stellar are the cool-gas and stellar masses.
In the top row of Fig. 5, we show the distributions of SFR versus M stellar for the different types of TNG100 and TNG50 galaxy populations, using the same layout and colour coding as in Fig. 1.The observed 'main sequence' (MS) of local star-forming galaxies is depicted by the black solid and dotted lines for SDSS and GALEX data, respectively (Elbaz et al. 2007;Salim et al. 2007).SF-dominated and composite galaxies in our simulations largely follow this observed MS (left two panels), while AGN-dominated galaxies have SFRs typically 1-1.5 dex lower at fixed M stellar (central panel).The shock-and PAGB-dominated galaxies (right two panels) are typically more massive, with M stellar 3×10 10 M⊙, and even less star-forming that the other galaxy types.Shock-dominated galaxies span a wide range of SFRs, between 1 and 10 −4 M⊙ yr −1 , filling the so-called 'green valley' of the SFR-versus-stellar mass plane (e.g., Salim 2014).PAGB-dominated galaxies are fully quiescent (retired) massive galaxies, with SFRs below 0.01 M⊙ yr −1 .
The sequence of decreasing SFRs from AGN-, to shock-, to PAGB-dominated galaxies at fixed stellar mass in Fig. 5 reflects the process of SF quenching by AGN feedback in IllustrisTNG galaxies (Weinberger et al. 2017;Nelson et al. 2018;Terrazas et al. 2020): in an AGN-dominated galaxy, feedback from the heavily accreting central black hole reduces star formation.This also regulates BH growth, via a drop in gas-accretion rate.The kinetic-feedback mode associated with the low BH-accretion phase (Section 2.1) causes shocks to emerge in the ISM, heating and expelling gas from the galaxy.This further lowers the SFR and produces shock-dominated line emission.After this phase, the galaxy becomes quiescent/retired, with very low levels of on-going star formation and BH accretion, so that PAGB stars can dominate line emission.
In the bottom row of Fig. 5, we show the distributions of cool-gas fraction versus stellar mass for the different TNG100 and TNG50 galaxy populations.Also plotted for reference are observed mean gas fractions from various samples of z ∼ 0 galaxies compiled by Peeples & Shankar (2011, black squares).Consistently with our findings above regarding star formation, we find that SF-dominated, composite, AGN-dominated, shock-dominated and PAGB-dominated galaxies represent a sequence of decreasing cool-gas fractions.We note that, at the end of the sequence, over 95 per cent of the stars in PAGB-dominated galaxies are older than 1 Gyr, with mean stellar ages greater than 6 Gyr and mean stellar metallicities between 0.6 and 1.2 times solar, i.e., typically more metal-rich than other galaxy types., 1, 2, 3, 4-5 and 5-7 (from top to bottom), using the same layout and colours as for the z = 0 distributions in the top row of Fig. 1 (except for the highest-redshift bin in the bottom row of Fig. 6, where we show the few assembled TNG100 galaxies as individual symbols).The adopted flux-detection limit cut is the same as in Fig. 1.
The lack of shock-dominated and PAGB-dominated galaxies at redshifts z > 1 in Fig. 6 reflects the generally higher star-formation and BH-accretion rates at these early epochs.Moreover, for all the simulated galaxy types, [O iii]/Hβ appears to globally increase and [N ii]/Hα decrease from low to high redshift.This can be traced back to a drop in interstellar metallicity (the models include secondary Nitrogen production) combined with a rise in SFR and global gas density (controlling the ionization parameter) in the models, as discussed in detail in Hirschmann et al. (2017Hirschmann et al. ( , 2019)).The global drop in [N ii]/Hα toward higher redshifts implies that the SF-dominated, composite and AGNdominated galaxies become less distinguishable in this classical BPT diagram (see also figure 14 of Feltre et al. 2016).This validates, on the basis of a much larger sample of simulated galaxies, the finding by Hirschmann et al. (2019) that standard optical selection criteria to identify the nature of the dominant ionizing source break down for metal-poor galaxies at z > 1.
A remarkable result from Fig. 7 is the agreement between the predicted emission-line properties of simulated galaxies at z ∼ 2 (blue symbols) and the observations of a sample of 251 star-forming galaxies with stellar masses in the range M stellar = 4 × 10 8 -2.5 × 10 11 M⊙ at z ∼ 2.3 by Steidel et al. (2014, thick grey solid line).To gain insight into the physical origin of the rise in [O iii]/Hβ at high redshift, we have conducted an analysis similar to that discussed in section 4 of Hirschmann et al. (2017) by exploring separately the relative influence of different physical parameters on observables, but using now the much larger sample of simulated IllustrisTNG galaxies.Our analysis confirms our previous finding that the increase in [O iii]/Hβ at a fixed galaxy stellar mass and [N ii]/Hα from low to high redshifts is largely driven by an increase in the ionization parameter resulting from the rise in SFR and global gas density.

Galaxy-type fractions over cosmic time
In Fig. 6 above, we have seen that PAGB-and shockdominated galaxies tend to disappear from emission-line diagrams at redshifts z 1. Fig. 8 quantifies how the fraction of galaxies of different types evolves with redshift in the TNG100 (solid lines) and TNG50 (dashed lines) simulations (using the same colour coding as in Figs 1-6).To assess the possible impact of resolution effects on inferred galaxy type fractions, we also show the results obtained when adopting for TNG50 galaxies the same stellar-mass cut as in the TNG100 simulation, i.e., M stellar > 3 × 10 9 M⊙ (dot-dashed lines).For clarity, the areas encompassed by all curves referring to a given galaxy type are colour-shaded.
Fig. 8 shows that, at low redshift (z 1), the emissionline populations in the TNG100 and TNG50 simulations consist primarily of SF-dominated galaxies.Specifically, at z = 0, the TNG100 (TNG50) population consists of about 40 (69) per cent of SF-dominated galaxies, 37 (21) per cent of composites, 10 (4) per cent of AGN-dominated galaxies, 10 (6) per cent of shock-dominated and 3 (≪ 1) per cent of PAGB-dominated galaxies.The different fractions found for the TNG100 and TNG50 simulations (solid and dashed lines, respectively) follow from the different stellarmass distributions: at fixed mass cut, these fractions (solid and dot-dashed lines, respectively) are roughly consistent with one another, indicating no significant resolution effect at low redshift.
At higher redshift, the fraction of SF-dominated galaxies increases from 60 to 80 per cent over the range 1 z 5 in the TNG100 simulation, while the fraction of composites drops from 25 to 18 per cent.In the TNG50 simulation, instead, the fraction of composites increases slightly at the expenses of SF-dominated galaxies, irrespective of the stellar-mass cut.This difference relative to TNG100 may arise from resolution effects, with higher BH-accretion rates -likely due to higher gas densities in the accretion regions around the BHs -being captured by the TNG50 simulation over this redshift interval.In both simulations, the fractions of shock-and PAGB-dominated galaxies drop below 1 per cent at z > 1, implying negligible contributions to line emission (Fig. 6).Also, the fraction of AGN-dominated galaxies falls to only 2-3 per cent from z ∼ 1 to z ∼ 5, because of the higher star-formation activity of high-redshift galaxies compared to their local counterparts, which can more efficiently outshine AGN emission.

UV emission-line diagnostic diagrams for metal-poor galaxies
We have seen in Section 3.3 that optical emission-line diagnostic diagrams can help reliably distinguish active from inactive galaxies out to z ∼ 1, but start to break down for metal-poor galaxies above this redshift.We now investigate the ability of 12 UV emission-line diagnostic diagrams highlighted by Hirschmann et al. (2019, using a small set of 20 cosmological zoom-in simulations of massive galaxies and their progenitors) to help identify the dominant ionizing sources in metal-poor TNG100 and TNG50 galaxies at redshifts z > 1.These diagrams are:    6 (SF-dominated: blue; composites: red; AGN-dominated: green; shock-dominated: lilac; PAGBdominated: yellow).Solid lines refer to TNG100 galaxies more massive than M stellar = 3 × 10 9 M ⊙ and dashed and dot-dashed lines to TNG50 galaxies more massive than M stellar = 1 × 10 8 and 3 × 10 9 M ⊙ , respectively.For clarity, the areas encompassed by all curves referring to a given galaxy type are colour-shaded.
(ii) EW(C iv λ1550) versus C iv λ1550/He ii λ1640; In Fig. 9, we show the locations in these UV diagnostic diagrams of metal-poor galaxy populations of different types from the TNG100 simulation in the redshift interval z = 1-5 (top 12 panels), and the TNG50 simulation in the redshift interval z = 5-7 (bottom 12 panels).The galaxies were selected to have N2O2 < −0.8, corresponding roughly to metallicities below 0.5Z⊙ (Hirschmann et al. 2019).This selection makes the contribution to line emission by post-AGB stellar populations negligible, and we ignore it here (this contribution is most important in evolved, metal-rich galaxies at low redshift; see, e.g., Hirschmann et al. 2017).We note that, by analogy with Hirschmann et al. ( 2019), we have applied a flux-detection limit of 10 −18 erg s −1 cm −2 to all emission lines when building Fig. 9.This corresponds roughly to the point-source emission-line flux sensitivity reached in 10 4 s using NIRSpec on board JWST, with a signal-to-noise ratio of 10.We also applied a detection limit of 0.1 Å on emission-line EWs.This low EW limit is justified by our desire to keep our analysis as general as possible for the proposed diagnostic diagrams to be useful also for future instruments with very high sensitivity (e.g., MOSAIC and HARMONI on the Extremely Large Telescope).We have checked that the results shown in this section do not change qualitatively when applying more rigorous cuts in UV-line fluxes and EWs.
Each panel of Fig. 9 also indicates the selection criteria identified by Hirschmann et al. (2019) to discriminate between SF-dominated and composite galaxies (black dashed line), and between composite and AGN-dominated galaxies (black dotted line).Since we already validated these UV-selection criteria against observational data in Hirschmann et al. ( 2019), for clarity, we do not report these observations in Fig. 9.
The first 12 panels of Fig. 9 show how shock-dominated galaxies largely fall in areas populated by AGN-dominated (and composite) galaxies in these diagrams.Thus, the UVdiagnostic diagrams shown here do not allow unique identification of galaxies, whose nebular emission is dominated by fast, radiative shocks.Given the scarcity of such galaxies in the simulations at z > 1 (Fig. 8), we do not regard this degeneracy as problematic.
In contrast, Fig. 9 clearly demonstrates that the first 10 UV-diagnostic diagrams [labelled (i) to (x)] in the above list can robustly discriminated between SF-dominated, composite and AGN-dominated galaxies over the full redshift range 1 z 7, based on the large statistical sample of galaxies in the TNG100 and TNG50 simulations.This confirms the results obtained using a much smaller set of 20 cosmological zoom-in simulations of massive galaxies and their progenitors by Hirschmann et al. (2019).As described in that study, the reason for the good separability of different galaxy types is mainly that the ratio of any of the five considered (collisionally excited) metal lines to the He ii λ1640 (recombination) line successively decreases from SF-dominated, to composite, to AGN-dominated galaxies.This is because of the harder ionizing radiation of accreting BHs compared to that of stellar populations, which increases the probability of producing doubly-ionized helium.We note that this is less the case for N v λ1240/He ii λ1640, since N v λ1240 requires photons of even higher energy than He ii λ1640 to be produced (77.5 versus 54.4 eV).Hence, N v λ1240/He ii λ1640 is not a clear indicator of the hardness of the ionizing radiation.Also the C iv λ1550/C iii] λ1908 ratio is sensitive to the presence of hard AGN radiation, which increases the probability of triply ionizing carbon.We note that the completeness and purity fractions (not shown)4 of SF-dominated, composite and AGN-dominated galaxies in the diagrams of Fig. 9 corresponding to the diagnostics (i)-(x) above all exceed 75 per cent over the full considered redshift range, further confirming the reliability of these UV-diagnostic diagrams.
To avoid having to rely on the He ii λ1640 line, whose high strength can sometimes be challenging to model in some metal-poor star-forming galaxies (e.g., Senchyna et al. 2017;Steidel et al. 2016;Berg et al. 2018;Jaskot & Ravindranath 2016;Wofford et al. 2021), Hirschmann et al. (2019) proposed alternative UV and optical-UV diagnostic diagrams to discriminate between different galaxy types.Among those are the last two diagnostic diagrams [labelled (xi) to (xii)] in the above list (we do not include here other diagrams involving the very faint N iv] λ1485 line).Fig. 9 shows that, unfortunately, when considering the much larger galaxy populations in the TNG50 and TNG100 simulations (based on different physical models), the different galaxy types overlap in these diagrams and cannot be separated, with completeness and purity fractions of only 10-25 per cent.
In this context, it is comforting that if the He ii λ1640line strength were to increase in the SF-galaxy models (by factors of 2-4, as considered in section 5.2 of Hirschmann et al. 2019), this would hardly affect the purity and completeness fractions of AGN-dominated galaxies, the purity fractions of SF-dominated galaxies and the completeness fractions of composites in the first 10 diagnostic diagrams of Fig. 9 (all fractions would remain above 75 per cent).This indicates that UV-selected AGNdominated galaxies would still be reliably identified, SFdominated samples largely uncontaminated by active galaxies (composites and AGN), and composite samples fairly complete.In contrast, the completeness fractions of SFdominated galaxies and the purity fractions of composites could drop below 20 per cent if the He ii λ1640 luminosity were quadrupled, as SF-dominated galaxies would move toward the region populated by composites in UV diagnostic diagrams.In this case, SF-dominated samples would become incomplete, and composite samples heavily contaminated by SF-dominated galaxies.

LINE-EMISSION CENSUS OF GALAXY POPULATIONS ACROSS COSMIC TIME
In this section, we explore the statistics of optical-and UVline emission from galaxy populations across cosmic time, as enabled by the large galaxy samples in the IllustrisTNG simulation set.We start by examining how the Hα, [O iii]λ5007 and [O ii] λ3727 luminosities predicted by these simulations trace galaxy SFR, and compare the Hα, [O iii]+Hβ and [O ii] luminosity functions at various redshifts with available observations from the literature (Section 4.1).Then, in Section 4.2, we explore the luminosity functions of different UV lines in the redshift range 2 z 7, and which UV lines are the most promising tracers of cosmic star-formation activity at z 7.All luminosity functions and SFR-line luminosity relations shown in that section are corrected for attenuation by dust.5 Finally, in Section 4.3, we compute predictions of number counts of active and inactive galaxies in various optical and UV lines down to the sensitivity limits of deep observing programs currently planned with JWST, including the potential effect of attenuation by dust.

Relation between Hα, [O iii]λ5007and [O ii] λ3727 luminosities and SFR for IllustrisTNG galaxies
We start by exploring how optical-line luminosities relate to the SFR for IllustrisTNG galaxies.Fig. 10 shows the mean relations between Hα, [O iii]λ5007 and [O ii] λ3727 luminosities (from top to bottom) and SFR for TNG100 and TNG50 galaxies at z = 0 (solid and dashed mauve lines, respectively).In the case of TNG100, we distinguish between different types of galaxy populations (blue: SFdominated; red: composite; green: AGN-dominated; orange: PAGB-dominated; lilac: shock-dominated).Also shown for completeness are relations often used in the literature .5 for details).A flux-detection limit of 10 −18 erg s −1 cm −2 was applied to all UV emission lines.Each panel shows the 2D distributions of SF-dominated (blue), composite (red) and AGN-dominated (green) galaxies; the top 12 panels also show the distribution of shock-dominated galaxies (lilac contours; no such galaxies are predicted to be found at z = 5-7).Overplotted in each panel are the selection criteria of Hirschmann et al. (2019) to discriminate between SF-dominated and composite galaxies (black dashed line), and between composite and AGN-dominated galaxies (black dotted line).The good correlation between line luminosity and SFR for SF-dominated galaxies in Fig. 10 confirms that Hα, [O iii]λ5007 and [O ii] λ3727 can provide rough estimates of the SFR in such galaxies.For other galaxy types, contributions to line emission from an AGN, shocks and PAGB stars shift the relations to larger luminosities at fixed SFR.Hence, the ability to constrain galaxy type (Section 3.3) is important to refine SFR estimates from optical-line luminosities.

Hα luminosity function
The top two rows of Fig. 11 show the evolution of the Hα luminosity function of TNG50 (green lines), TNG100 (red lines) and TNG300 (lilac lines) galaxies over the redshift range 1 z 6 (different panels, as indicated).It is worth pointing out that the peaks in the TNG100 and TNG300 luminosity functions are an artefact of the resolution limit, which causes galaxy number densities below the peak to be incomplete and hence not meaningful.The highest resolution of the TNG50 simulation enables us to follow the evolution of the faint end down to luminosities below ∼ 10 41 erg s −1 , and the larger cosmological volumes of the TNG100 and TNG300 simulations enable quantification of the evolution of the exponential cutoff at luminosities near ∼ 10 43 erg s −1 .
At any redshift, the bulk of the Hα luminosity in Fig. 11 arises from H ii regions ionized by young stars (coloured dashed lines).Only in the most luminous Hα emitters, with LHα 10 43 erg s −1 , can the contribution from an AGN become significant.This makes the Hα luminosity function a reliable tracer of the cosmic SFR density of galaxy populations, as often assumed in observational studies (e.g., Ly et al. 2007;Sobral et al. 2013Sobral et al. , 2015)).In fact, Fig. 11 shows that, as a consequence of the rising cosmic SFR density of IllustrisTNG galaxy populations from z = 0.1 to z = 3, the simulated Hα luminosity function strongly evolves over this redshift range: while the faint end hardly changes, the exponential cutoff rises from LHα 10 42 erg s −1 to LHα 10 43 erg s −1 .
The predicted Hα luminosity functions in Fig. 11 are in reasonable agreement with observational determinations (including corrections for dust attenuation) from various deep and wide, narrow-band and spectroscopic surveys conducted at z < 2 (Tresse et al. 2002;Fujita et al. 2003;Villar et al. 2008;Shim et al. 2009;Hayes et al. 2010;Sobral et al. 2013Sobral et al. , 2015;;Khostovan et al. 2018, shown as different grey lines).The large dispersion in these constraints illustrates the uncertainties inherent in their derivation (e.g., cosmic variance, corrections for dust attenuation, filter profile, completeness, etc.).In this respect, the constraints from the HiZELS survey (Sobral et al. 2013(Sobral et al. , 2015, light-grey dashed lines) are of particular interest, as they provide homogeneous estimates of the Hα luminosity function over the redshift range 1 z 2. These are consistent with the evolution predicted by our models, except at the faint end, where resolution effects and the stellar-mass cut at 10 8 M⊙ in the TNG50 simulation are likely to account for the shallower predicted slope.
At redshifts z > 2, where future observational constraints will be gathered by JWST, Euclid and the Roman Space Telescope, the galaxy number density at fixed Hα luminosity is predicted to strongly decline, e.g., by a factor of ∼ 30 between z = 3 and z = 6 at LHα = 10 42 erg s −1 .While this reflects the decline of the cosmic SFR density over this redshift range in the IllustrisTNG simulations, these predictions should be taken with caution, because of the growing limitation by resolution effects affecting the census of small galaxies at high redshift.) luminosity functions of TNG50 (green solid lines), TNG100 (red solid lines) and TNG300 (lilac solid lines) galaxies more massive than 10 8 M ⊙ , 3 × 10 9 M ⊙ and 10 10 M ⊙ , respectively, at z = 0.1, 0.5, 1, 1.5, 2, 3, 4, 5 and 6 (different panels, as indicated).Dashed coloured lines show the results obtained when considering only the contribution by H ii regions to the luminosity of each galaxy.Also shown are observational data from various sources listed in the legend (all corrected for attenuation by dust; different grey-and black-line styles).The peaks in the TNG100 and TNG300 luminosity functions arise from the resolution limits; galaxy number densities below the peak are incomplete and hence not meaningful.

[O iii]+Hβ luminosity function
The middle two rows of Fig. 11 show the evolution of the [O iii]+Hβ luminosity function of the TNG50, TNG100 and TNG300 galaxies, using the same layout as for the Hα luminosity function in the top two rows.Here, the [O iii]+Hβ luminosity is taken to be the sum of the [O iii]λ4346, Hβ and [O iii]λ5007 luminosities, which are hard to separate in narrow-band surveys.We note that the predicted [O iii]+Hβ luminosity function computed in this way differs noticeably from the pure [O iii]λ5007 luminosity function only at the faint end.
As in the case of the Hα line, Fig. 11 shows that the [O iii]+Hβ luminosity function is dominated by H ii regions ionized by young stars at all redshifts.Only at z = 2-3 do luminosities above L [O iii]+Hβ ∼ 10 43.5 erg s −1 start to be dominated by line emission from AGN.The [O iii]λ5007+Hβ luminosity function also exhibits strong evolution with redshift: even though the faint end hardly changes from z = 0.1 to z = 3, the exponential cutoff rises from L [O iii]+Hβ ∼ 10 41.5 erg s −1 at z = 0.1 to L [O iii]+Hβ > 10 43 erg s −1 at z = 3, in fair agreement with the bulk of observational constraints (Colbert et al. 2013;Hayashi et al. 2018;Khostovan et al. 2015Khostovan et al. , 2020)).
The only constraint we could find on the [O iii]+Hβ luminosity function at redshift z > 3 is that derived indirectly from the UV luminosity function at z = 8 by De Barros et al. (2019).For reference, we show this as a thin, black solid line in the z = 5 and z = 6 panels of the fourth row of Fig. 11.At z = 5, our predictions are roughly consistent with this semi-empirical determination, which however lies ∼ 0.5 dex above the predictions at z = 6.The difference could arise from uncertainties in the derivation of this indirect constraint, as well as from resolution effects in the simulations at that redshift.

[O ii] luminosity function
The evolution of the [O ii] λ3727 luminosity function, shown in the bottom two rows of Fig. 11, has characteristics similar to that of the [O iii]+Hβ luminosity function.It is largely dominated by the contribution from H ii regions ionized by young stars and shows strong evolution of the exponential cutoff, from L [O iii] ∼ 10 42.5 erg s −1 at z = 0.1 to L [O iii] ∼ 10 43 erg s −1 at z = 2.At redshifts z = 1-1.5, the predicted evolution is in reasonable agreement with various empirical determinations (Khostovan et al. 2015(Khostovan et al. , 2020;;Hayashi et al. 2018;Sobral et al. 2015;Ly et al. 2007;Takahashi et al. 2007;Bayliss et al. 2011).At lower redshift, the number density of [O ii]-luminous emitters lies above those estimated by Ciardullo et al. (2013) and Ly et al. (2007), while at higher redshift, it lies below that estimated by Khostovan et al. (2015).
Overall, we conclude that the luminosity functions of the Hα, [O iii]+Hβ and [O ii] emission lines predicted by the IllustrisTNG simulations are in reasonable agreement with empirical determination at redshifts from z = 0 to z = 2-3, although there are some discrepancies with the few existing constraints on the number densities of [O ii] λ3727 emitters at z < 1 and z > 1.5.(coloured as indicated) plotted against SFR for galaxies at redshifts z = 4-7 in the TNG100 (solid lines for SFR 10 M ⊙ yr −1 ) and TNG50 (dashed lines, with shaded area indicating the 1σ scatter) simulations.

Evolution of UV-line luminosity functions
Since UV emission lines become prominent in metal-poor galaxies and can be detected with JWST out to high redshifts (e.g., Stark et al. 2014), it is of interest to explore the predictions of our models for the evolution of UV-line luminosity and flux functions at z 2. We can also investigate which UV lines potentially offer the best SFR estimators in high-redshift galaxies, at least when corrected for attenuation by dust, when standard (in particular, Balmer) optical lines fall out of the observational window.
Fig. 12 shows the relations between the average luminosities of different UV lines (lilac: He ii λ1640; blue: C iv λ1550; green: [O iii]λ5007; yellow: Si iii] λ1888; red: C iii] λ1908) and SFR in the TNG100 and TNG50 galaxy populations (solid and dashed lines, respectively) over the redshift range z = 4-7 (shaded, coloured areas show the 1σ scatter about the TNG50 relations).All UV line luminosities appear to correlate with SFR over several orders of magnitude, the tightest relations being obtained for C iii] λ1908, O iii] λ1663, and Si iii] λ1888 (1σ scatter below 0.5 dex), compared to He ii λ1640 and C iv λ1550.This is because the latter two lines are more sensitive to contamination by an AGN component.
In the top row of Fig. 13, we show the luminosity functions of the C iii] λ1908, Si iii] λ1888, O iii] λ1663, He ii λ1640 and C iv λ1550 emission lines for TNG50 (solid lines), TNG100 (dashed lines) and TNG300 (dot-dashed lines) galaxies at different redshifts from z = 2 to z = 7 (different colours, as indicated).As in the case of optical-line luminosity functions (Section 4.1), the TNG50, TNG100 and TNG300 simulations predominantly sample, respectively, the faint, intermediate and bright ranges of UV luminosity functions.In the luminosity ranges where predictions from different simulations overlap, these are in good general agreement, except for galaxies at z 6, where the limitations arising from resolution effects become worse.
Regardless of the UV line considered, we find that at a given luminosity, the number density drops by up to an order of magnitude from z = 2 to z = 6, mainly for galax- , He ii λ1640 and C iv λ1550 luminosity functions (panels from left to right, top row) and flux functions (bottom row) of TNG50 (dashed lines), TNG100 (solid lines) and TNG300 (dot-dashed lines) galaxies more massive than 10 8 M ⊙ , 3 × 10 9 M ⊙ and 10 10 M ⊙ , respectively, at z = 2, 4, 5, 6 and 7 (different colours, as indicated).
ies with luminosities below the exponential cutoff.Instead, the luminous end is largely in place as early as z = 6, and thus evolves less strongly toward low reshift.The fact that the area under the luminosity function (i.e., the integrated luminosity function) decreases from z = 2 to z = 7 is a consequence of the decline in cosmic SFR density over this redshift interval, as shown by, e.g., Pillepich et al. (2018a).
It is interesting to note that the He ii λ1640 and C iv λ1550 luminosity functions exhibit bi-modal distributions.This is because, while the lower-luminosity peak is dominated by line emission from young star clusters, a second peak arises from the high He ii λ1640 and C iv λ1550 luminosities in AGN-dominated galaxies.This bi-modality is not observed in the other UV luminosity functions in Fig. 13, because C iii] λ1908, Si iii] λ1888 and O iii] λ1663 require lower ionization energies, and hence, are less sensitive than He ii λ1640 and C iv λ1550 to the hard ionizing radiation from accreting BH.
As a complement to the rest-frame line-luminosity functions, we show in the bottom row of Fig. 13 the evolution of the associated line-flux functions, which are directly observable quantities.The observed-flux functions differ from the rest-frame luminosity functions because of the redshiftdependent luminosity distance.Although the main trends are similar to those described above for the luminosity functions, the main difference between flux and luminosity functions is the shift of the high-luminosity end toward lower relative fluxes at increasing redshift (since at fixed luminosity, the observed flux decreases with increasing redshift).
Overall, it will be instructive to explore the extent to which future observational studies at high redshift, for example with JWST/NIRSpec, will be consistent with the pre-dicted optical/UV luminosity6 and flux functions in Figs 11 and 13, and whether potential inconsistencies between models and observations might require fundamental changes in models of galaxy evolution.

Predicted number counts in JWST/NIRSpec observations
The models presented in the previous sections allow us to predict the number of high-redshift galaxies that can be detected down to a given line-flux limit with the near-infrared spectrograph NIRSpec on JWST (Jakobsen et al. 2022).An interesting case study is that of the JWST Advanced Deep Extragalactic Survey (JADES),7 which combines NIRCam imaging and NIRSpec spectroscopy distributed in a 'DEEP' and 'MEDIUM' surveys (a guaranteed-time-observation program led by N. Lützgendorf and D. Eisenstein).Among the different components of JADES, we focus on the mediumresolution spectroscopy (resolving power R ∼ 1000) that will be achieved on a few hundred galaxies in the DEEP survey (two NIRSpec pointings with exposure times of 8.4-25.2ksec, i.e. from 2.3 to 7 hr, in each of several grating/filter configurations) and a few thousand galaxies in the MEDIUM survey (24 NIRSpec pointings with exposure times ranging from 2.7 to 9.3 ksec, i.e. 0.75 to 2.6 hr, per grating/filter configuration).We adopt the flux limits listed in Table 1, computed with the pre-flight JWST exposure-time calculator, ).The light blue and red bars, which are the same in all panels, show the number of, respectively, SF-dominated and active (AGN-dominated and composite) galaxies more massive than 10 8 M ⊙ predicted in redshift bins centred on z = 4, 5, 6, 7 and 8 by the TNG50 simulation (the fixed boxlength of the simulation translates into redshift-bin widths shrinking from ∆z = 10 −3 to 5 × 10 −4 over this range).The dark blue and red bars show the subsets of these galaxies whose line emission can be detected with NIRSpec with 5σ significance, assuming exposure times of 2, 5 and 10 hr (from top to bottom).The light-grey thin bars carved in the dark-colour bars show the effect of including dust attenuation with A V = 0.5 and a Calzetti et al. (2000) curve.
for different lines at different redshifts and for different exposure times.
In this context, it is of interest to compute the number of galaxies expected to be detected in a given emission line, given an exposure time, in a NIRSpec field of view (3 × 3 arcmin 2 ).This is shown as a function of redshift in Fig. 14 for the [O ii] λ3727, Hβ, [O iii]λ5007, Hα and [N ii]λ6584 lines (from left to right), assuming exposure times of 2, 5 and 10 hr (from top to bottom).In each panel of Fig. 14, the light blue and red bars show the number of, respectively, SF-dominated and active (AGN-dominated and composite) galaxies more massive than 10 8 M⊙ predicted in redshift bins centred on z = 4, 5, 6, 7 and 8 by the TNG50 simulation.The fixed boxlength of the simulation translates into redshift-bin widths shrinking from ∆z = 10 −3 to 5 × 10 −4 over this range.These predicted numbers are the same in all panels of Fig. 14, independent of emission line and exposure time.In contrast, the dark blue and red bars show the subsets of these galaxies whose line emission can be detected with NIRSpec with 5σ significance.These numbers change with emission line and exposure time.
So far, we have not considered attenuation of emission lines by dust, while galaxies may contain significant amounts of dust, even at early cosmic epochs (see e.g., Schneider et al. 2004;Zavala et al. 2022).We estimate the impact of attenuation by dust on the number counts of SF and active galaxies in Fig. 14 by adopting the dust attenuation curve of Calzetti et al. (2000) with a V -band attenuation of AV = 0.5 mag.The resulting drop in expected  Fig. 14 shows that the TNG50 simulation predicts from ∼ 40 (20) SF-dominated (active) galaxies per NIRSpec field in the z = 4 bin, to ∼ 5 (2) at z = 8.Virtually all of these can be detected in a two-hour exposure with NIRSpec in Hα and [O iii]λ5007, typically the brightest optical emission lines, out to z = 7 (when Hα drops out of the observational window) and z = 8, irrespective of the presence of dust.Instead, exposures of at least 5 hours are required to achieve a nearly unbiased census of Hβ.Even 10-hour exposures will not be sufficient to achieve an unbiased census of [O ii] λ3727 SF-dominated emitters, and even less so of [N ii]λ6584 emitters, especially in the presence of attenuation by dust.At any of the survey depths in Fig. 14, the observed populations in these lines would be biased toward active galaxies.
All this suggests that the JADES/MEDIUM survey will be able to detect Hα and [O iii]λ5007 in the targeted galaxies out to z = 7 and z = 8, respectively, and the DEEP survey extend this to Hβ, while a substantial fraction of SF-dominated, [O ii] λ3727 and [N ii]λ6584 emitters might remain below the detectability limit of these surveys.It is also worth noting that some galaxies less massive than 10 8 M⊙ may have line fluxes above the survey limits and therefore be detectable.As stated before, we do not consider such low-mass galaxies here as they are unresolved in TNG50.
In Fig. 15, we show the analogue of Fig. 14 for the C iv λ1550, He ii λ1640, O iii] λ1663, Si iii] λ1888 and C iii] λ1908 UV emission lines (from left to right).If galaxies were dust-free, the most complete censuses in this wavelength range would be achievable for C iii] λ1908, O iii] λ1663 and C iv λ1550 emitters at z 6, especially for long exposure times, although significant fractions of SF-dominated galaxies would still be missed at lower redshifts.Instead, the weaker He ii λ1640 and Si iii] λ1888 lines can be properly sampled only in active galaxies, because He ii λ1640 requires highly energetic photons, while Si iii] λ1888, hindered by the low abundance of Si relative to C and O, is boosted by AGN radiation.Exposure times of at least 5 hours would be required to start detecting populations of dust-free, SF-  1. NIRSpec-specific flux-detection limits (in units of log[erg s −1 cm −2 ]) for different optical/UV emission lines at redshifts z=4-8 for three different exposure times, as indicated.The 5σ flux limits were computed with the pre-flight (version 1.3) of the JWST exposure-time calculator.
dominated emitters in these lines with NIRSpec, and even 10 hours would not suffice to achieve proper censuses of these populations.
In the presence of dust, the expected number counts in Fig. 15 drop significantly, particularly for SF-dominated galaxies, whose weaker lines compared to AGN-dominated galaxies fall more easily below the detectability limits of surveys.Yet, significant fractions of C iii] λ1908, O iii] λ1663 and C iv λ1550 emitters still appear detectable with the adopted AV = 0.5, at least for exposure times exceeding 10 hours.We caution however that the predicted counts of C iv λ1550 emitters are likely to be overestimated, as we have ignored the enhanced absorption of C iv λ1550 photons by dust due to resonant scattering (e.g., Senchyna et al. 2022).
The results of Fig. 15 therefore suggest that the UV lines most accessible to the JADES/MEDIUM and DEEP surveys, modulo dust attenuation, are likely to be C iii] λ1908 and O iii] λ1663, and, in cases where enhanced attenuation by resonant scattering is particularly modest, C iv λ1550.SF-dominated He ii λ1640 and Si iii] λ1888 emitters are likely to remain more elusive.While this prediction relies on SF models having difficulty to reproduce the observed He ii λ1640 emission of low-redshift analogues of distant SF galaxies (e.g., Plat et al. 2019), it is interesting to note that a non-detection of He ii λ1640 combined with detections of C iii] λ1908 and O iii] λ1663 (and perhaps C iv λ1550) could point to a population of SF-dominated galaxies (as expected from the He ii λ1640-based diagnostic diagrams in Fig. 9).

DISCUSSION
In Sections 3 and 4, we have shown that the predicted optical and UV emission-line properties of the IllustrisTNG galaxy populations are consistent with numerous observational constraints out to redshifts z ∼ 2-3.Based on this success, we could make predictions about the observability of these lines in more distant galaxies, out to z ∼ 7-8, and propose associated spectral diagnostics as guidance for the interpretation of new spectroscopic surveys at high redshift.These results represent a major statistical extension of earlier work by Hirschmann et al. (2017Hirschmann et al. ( , 2019)).
Despite this success, the emission-line catalogues of Il-lustrisTNG galaxies used in our study may be affected by several caveats related to current photoionization models, modern cosmological simulations and our coupling methodology.We discuss these in Sections 5.1 and 5.2 below and also compare in Section 5.3 our emission-line catalogues to those from previous studies of optical and UV emission lines of simulated galaxies.

Caveats in the photoionization models
All photoionization models considered in this work were performed with recent versions of the Cloudy and Mappings photoionization codes, adopting a common set of element abundances down to metallicities of a few per cent solar (Section 2.2).However, these models are computed via 1D calculations of a gas patch irradiated by ionizing photons, assuming a simplified geometry (e.g., spherical, plane-parallel) and constant gas density, which do not reflect the complex 3D gas distributions found in nature.Also, while the H ii-region and AGN-NLR photoionization models employed here are ionization-bounded, some actively star-forming galaxies show evidence of leakage of Lyman-continuum photons (e.g., de Barros et al. 2016;Shapley et al. 2016;Bian et al. 2017;Flury et al. 2022).This generally tends to reduce the intensities of low-ionization relative to highionization lines (see, e.g., the density-bounded calculations of Plat et al. 2019).
The PAGB models we adopted to describe gas photoionized by evolved stellar populations also suffer from uncertainties.In particular, the most recent version of the Bruzual & Charlot (2003) stellar-population-synthesis code we rely on incorporates PAGB evolutionary tracks from Miller Bertolami (2016), which have significantly shorter lifetimes -although compensated in part by higher luminosities -than previous prescriptions.Adopting older PAGB models could enhance the predicted EW(Hα) of PAGBdominated galaxies by a factor of up to 2. We have checked that this would not qualitatively affect our results (in the sense that more galaxies would be PAGB-dominated, but they would still separate robustly in the WHAN diagram of Fig. 3).For the ionizing radiation from an accreting black hole in the AGN-NLR models, for simplicity, we adopted the fixed broken power-law shape described by Feltre et al. (2016), with Sν ∝ ν α and α = −1.7 at wavelengths below 2500 Å.As shown by Hirschmann et al. (2017Hirschmann et al. ( , 2019)), adopting a different value for α would not significantly affect the predicted luminosities of the optical and UV emission lines considered in the present work.
Another, perhaps more serious limitation of the H iiregion and PAGB models used in our study is that they do not include the hot ionizing radiation from binary-star products (envelope-stripped or spun-up stars), as they are based on single-star population-synthesis models.Hard radiation from binary stars will enhance the luminosities of lines from species requiring very high ionizing energies, such as He ii (see, e.g., Xiao et al. 2018).This point is of particular relevance, as no current stellar-population-synthesis model can account for the strong He ii λ1640 and He ii λ4686 emission exhibited by some very metal-poor ( 0.2 Z⊙) SF galaxies (see, e.g., Plat et al. 2019;Stanway & Eldridge 2019).In fact, this is true even for the BPASS v2.2.1 models of Eldridge et al. (2017), which do include binarystar populations.Interestingly, the latest version of the Bruzual & Charlot (2003) single-star models used here, which incorporate updated calculations of stellar evolution and atmospheres (Bressan et al. 2012;Chen et al. 2015), produce more He ii-ionizing photons than BPASS v2.2.1 (see Plat et al. 2019).Other potential sources of He ii emission in such metal-poor galaxies could be mini-AGN, radiative shocks and massive X-ray binaries, although these are heavily debated (e.g., Plat et al. 2019;Senchyna et al. 2020;Simmonds et al. 2021;Umeda et al. 2022).
Despite the above caveats, the predictions of the H iiregion and AGN-NLR models used in this study have been repeatedly shown to perform remarkably well in comparisons with observations of common emission lines from other elements than He ii (e.g., Chevallard et al. 2018;Mignoli et al. 2019;Tang et al. 2019).As shown in Section 3.5, increasing the He ii λ1640-line luminosity predicted by our H iiregion models by a factor of 4 of would not reduce much the effectiveness of the suggested selection criteria in UV diagnostic diagrams in Fig. 9 (in line with the result of Hirschmann et al. 2019).We also find that this would attenuate the bi-modality of the He ii λ1640-luminosity function (Fig. 13) and increase the number counts of SF-dominated, He ii λ1640 emitters by a factor of 2 to 3 (Fig. 15).
Finally, we note that the radiative-shock models of Alarie & Morisset (2019) used in this study update the older calculations of Allen et al. (2008), which were based on a previous version of the Mappings code and spanned a smaller range of metallicities.Both model sets are limited to shocks fast enough ( 100 km s −1 ) to generate sufficient radiation to ionize the pre-shock gas before it enters the shock front (i.e., to allow the pre-shock gas to reach ionization equilibrium before it is shocked; see also Sutherland & Dopita 2017).The models also assume that shocks are propagating in a neutral medium ionized by the hot post-shock gas, which might not always be the case in practice (e.g., if a shock propagates through an ionized medium).As we shall see in the next section, the resolution and identification of shocks in cosmological simulations tend to include stronger assumptions and limitations than do the grids of radiative-shock models.

Caveats in the IllustrisTNG simulations and their coupling with emission-line models
A potential source of inaccuracy of all current large-scale cosmological simulations, including IllustrisTNG, is that they do not resolve the ISM, which forces one to appeal to often simplified and ad-hoc sub-resolution models to describe baryonic processes (see Naab & Ostriker 2016, for a discussion).Specifically, the choice of stellar-and AGNfeedback models has been shown to significantly affect various galaxy properties (e.g.Pillepich et al. 2018b).In this context, it is important to stress that IllustrisTNG is one of the best-tested and explored, large-scale cosmological simulations to date, providing fairly realistic galaxy populations up to z = 3, for example in terms of chemical enrichment, SFR and black-hole accretion-rate histories (although the observed AGN luminosity function is not always well reproduced, especially at low redshift, see Habouzit et al. 2019Habouzit et al. , 2022)), stellar populations and the associated scaling relations.This success provides a solid foundation for our predictions of the emission-line luminosities controlled by these physical quantities.Furthermore, because IllustrisTNG, like other modern large-scale cosmological simulations, does not resolve gas properties, such as density, in individual ionized regions, nor does it track the formation, evolution and destruction of dust grains in the ISM, we must adopt a fixed hydrogen gas density in ionized regions (10 3 cm −3 for NLR, 10 2 cm −3 for H ii regions and 10 cm −3 for line emission from post-AGB stars), as well as a fixed dustto-metal mass ratio (ξ d = 0.3).As described in section 5.1.4 of Hirschmann et al. (2017), both recent observations (De Cia et al. 2016;Wiseman et al. 2017) and semi-analytic models (Popping et al. 2016) suggest that, at given gas metallicity, the dust-to-metal mass ratio hardly changes with redshift out to z = 6.The models further predict that this is the case also at fixed stellar mass (see figs 5 and 6 of Popping et al. 2016) and that galaxies more massive than M stellar > 3 × 10 9 M⊙ are expected to have dust-to-metal mass ratios around 0.2, reasonably close to our adopted value.In Hirschmann et al. (2017Hirschmann et al. ( , 2019)), we further studied the impact of these parameters on optical and UV line ratios, and found that they have a smaller effect on observed line ratios than the parameters of the photoionization models that can be determined from the simulations.
Another consequence of not resolving individual H ii regions in IllustrisTNG is that we have to adopt galaxywide properties of H ii-regions.We follow here the methodology of Charlot & Longhetti (2001, see Section 2 above), which convolves the time evolution of an H ii region with the galaxy star-formation history.While this approach goes beyond most current photoionization models (which typically treat a galaxy as a single region ionized by the entire stellar population), it suffers from shortcomings: even though we assume that a galaxy is composed of several H ii regions (with different star-cluster ages), any variation in the gas properties of the regions across the galaxy (such as density, filling factor and metallicity) is neglected.Regardless, these galaxy-wide line-emission models may be justified by their long-term success in reproducing observed galaxy spectra (Section 5.1).More detailed studies will be needed to fully validate this approach, but they lie beyond the scope of the present paper.
Furthermore, the inability to resolve the ISM in largescale cosmological simulations such as IllustrisTNG implies that shocks cannot be tracked and identified in unresolved cool gas phases.In fact, the on-the-fly shock finder described in Section 2 is by default disabled for star-forming gas.We have therefore intentionally neglected shock-related emission from the dense star-forming ISM itself, which could arise for example from supernova explosions.Instead, in Il-lustrisTNG, we are primarily capturing shocks caused by (AGN-driven) outflows, gas accretion as well as merger events arising in the warm/hot component of the ISM and in more distant regions, such as the disk-halo interface.Thus, in our analysis, we cannot draw any robust conclusion on line emission due to shocks in higher-redshift galaxies, where AGN-driven outflows seem to be less prevalent, and instead shocks due to stellar feedback, neglected in our approach, may become important.Also, at the moment, IllustrisTNG includes radiation-driven feedback only as thermal energy input, which may be insufficient to describe reality.We might therefore miss shocks linked to radiation-driven winds from radiatively efficient AGN, which may become particularly important in high-redshift galaxies.
While our modelling of synthetic emission lines of Il-lustrisTNG galaxies accounts for the effects of dust in H ii regions and AGN NLR (Section 2.3), it neglects attenuation by dust in the diffuse ISM (with the exception of Figs 14 and 15 in Section 4.3).To some extent, this neglect is justified by the fact that we compare our models with observed line ratios and line-luminosity functions corrected for this effect.Moreover, as expected from our results in section 5.1 of Hirschmann et al. (2019), we find that adopting a Calzetti et al. (2000) attenuation law with a V -band attenuation of AV = 0.5 mag would negligibly impact most UV diagnostic diagrams in Fig. 9.This however would reduce the predicted number counts of line emitters at high redshift by factors of up to several, as shown by Figs 14 and 15.A more sophisticated modelling of dust (in the spirit of the on-the-fly dust modelling of McKinnon et al. 2018), beyond the scope of the present paper, will be required for more robust predictions.
Finally, we note that our methodology does not account for the effect of interstellar-line absorption in stellar birth clouds nor in the diffuse ISM.As shown by Vidal-García et al. (2017), while some prominent UV emission lines, such as O iii] λ1663, Si iii] λ1888 and C iii] λ1908, are little sensitive to interstellar absorption, the luminosities of other lines, such as C iv λ1550 and N v λ1240, can be significantly reduced through this effect.While quantifying interstellar-line absorption in simulated galaxies is important, we postpone such a complex analysis to a future study.

Comparison with previous emission-line studies
In recent years, an increasing number of theoretical studies have emerged, which provide models for the nebular emission from young star clusters in galaxies in cosmological simulations.Below, we briefly outline the different methodologies proposed to compute line emission, summarise the main associated goals and results, and discuss how they compare with the method and analysis presented in this work.
The Sags semi-analytic model (Orsi et al. 2014): In this work, the gas metallicity and ionization parameter (via a dependency on gas metallicity) of Sags galaxies are used to select H ii-region models from a Mappings iii library (Levesque et al. 2010).With this method, the authors reproduce the local BPT diagram and the evolution of optical-line luminosity functions (we note that the predicted [O ii] λ3727-luminosity function underestimates the observed one locally, as for our models in Fig. 11).However, their BPT diagrams, specifically the [O iii]/Hβ-ratio, do not evolve with redshift as observed.This model has been further applied to the Galacticus and Sage semi-analytic models (run over merger trees from the MultiDark simulation) to explore the luminosity functions and clustering of [O ii] λ3727-emitting galaxies (Favole et al. 2020).
The Sage semi-analytic model applied to the UNIT project simulations (Knebe et al. 2022): Emission lines for the Sage semi-analytic model are constructed using the same method as in Orsi et al. (2014).Thanks to the large volume of the UNIT simulation (effectively ∼ 5 Gpc 3 ), this study presents galaxy catalogues for Euclid forecasts.The authors start by validating their model against the observed Hα luminosity function, and then explore the abundance and clustering of model Hα-emitting galaxies.
The Galacticus semi-analytic model (Zhai et al. 2019): The emission-line luminosities of Galacticus galaxies are computed using the Cloudy photoionization code (Ferland et al. 2013).The key step is to generate and interpolate tabulated libraries of emission-line luminosities using Cloudy as a function of the number of ionizing photons for various species, the ISM metallicity, the hydrogen gas density and the volume filling factor of H ii regions, which can be computed for Galacticus galaxies.The authors validate their emission-line results against observed Hα and [O iii]λ5007 luminosity functions and make predictions of number counts of line-emitting galaxies for galaxy surveys with the Roman Space Telescope.
The L-Galaxies semi-analytic model (Izquierdo-Villalba et al. 2019): This study adds the Orsi et al. (2014) model to L-Galaxies lightcones (run over the Millennium simulation) to create synthetic galaxy catalogues including emission lines for narrow-band photometric surveys.The authors focus on optical lines (including [Ne iii]λλ3869, 3967) and find fairly good agreement with the observed evolution of the Hα, [O iii]λ5007 and [O ii] λ3727 luminosity functions.
The Galform semi-analytic model (Gonzalez-Perez et al. 2018, 2020;Baugh et al. 2022): Gonzalez-Perez et al. (2018, 2020) assign H ii-region models from Stasińska (1990) to Galform galaxies and investigate line-luminosity functions, halo-occupation distributions and the distribution of these galaxies in the cosmic web.While the original Galform emission-line model provided a poor reproduction of local BPT diagrams, Baugh et al. (2022) propose an updated model based on the published version of the Gutkin et al. (2016) photoionization grid and a relation between ionization parameter, gas metallicity, gas density and specific SFR from Kashino & Inoue (2019).Their study suggests that the offset in [O iii]/Hβ of high-redshift relative to local galaxies results primarily from a higher ionization parameter and higher gas density, largely in line with our results in Hirschmann et al. (2017) and in Section 3.3 above.
The z > 7 cosmological simulation of Shimizu et al. (2016): This study explores the opticaland UV-line emission of z > 7 galaxies in cosmological simulations and their detectability with different current and future observational facilities (JWST, extremely large telescopes).The authors compute Lyα, Hα and Hβ line luminosities using Pegase 2 (Fioc & Rocca-Volmerange 1997).The luminosities of metal lines are scaled to the Hβ luminosity based on Cloudy models and assuming a metallicity-dependent emission efficiency for each line.Shimizu et al. (2016) find that [O iii]λ5007, Hβ, C iv λ1550 and C iii] λ1908 are the brightest lines in high-redshift galaxies and suggest that these could be strong targets for future observational facilities out to even for z ∼ 15.
The Simba cosmological simulation (Garg et al. 2022): In this study, the nebular emission of a simulated galaxy is computed by summing the contributions from all young-star particles.The authors employ a subgrid model to describe the star-cluster mass distribution per star particle and perform Cloudy calculations, using ages and metallicities appropriate for each star particle, as well as an empirical dependence of N/O on O/H (in the spirit of CloudyFSPS, Byler et al. 2017).Even though this methodology provides spatially-resolved information about line emission in a galaxy, it neglects all important dust processes in H ii regions (van Hoof et al. 2004).With this approach, Garg et al. (2022) reproduce the star-forming branch of the local BPT diagram and find that the [O iii]/Hβ offset at high redshift arises primarily from selection effects.Yet, they find a mismatch between the predicted and observed [O iii]/Hβ ratios of z ∼ 2 galaxies, which can be alleviated by invoking higher N/O ratio or a lower ionization parameter at fixed O/H ratio at high redshift.While Hirschmann et al. ( 2017) had a similar conclusion with regard to N/O, the finding of Garg et al. (2022) concerning the ionization parameter disagrees with our result in Section 3.3 above, as well as with those of several previous studies (e.g., Brinchmann et al. 2008;Hirschmann et al. 2017;Kashino et al. 2017;Kewley et al. 2019;Baugh et al. 2022).
The BlueTides cosmological simulation at z = 8 − 13 (Wilkins et al. 2020): As in the Simba simulation of Garg et al. (2022), the nebular emission of a simulated galaxy is computed by summing the contributions from all young-star particles, connecting ionization-bounded Cloudy models to the predicted gas metallicity and ionization parameter of each star particle, and assuming a fixed gas density and dust-to-metal mass ratio.The authors focus on predictions of a few optical (Hα, Hβ and [O iii]λ5007) and UV (C ii] λ2326, C iii] λ1908 and [Ne iii]λλ3869, 3967) emission lines for simulated galaxies at z = 8-13.They find good agreement of the predicted equivalent-width distributions of C iii] λ1908 and [O iii]λ5007+Hβ with the few available observational constraints, but poorer agreement between the predicted and observed (from de Barros et al. 2016) [O iii]λ5007+Hβ luminosity function.As we have seen, at slightly lower redshift (z = 5-6), our predicted [O iii]+Hβ luminosity function based on the IllustrisTNG simulations roughly matches this observational constraint (Fig. 11).
The IllustrisTNG cosmological simulation at high redshifts (Shen et al. 2020): In this work, the Hα, Hβ and [O iii]λ5007 luminosities of IllustrisTNG galaxies at z = 2-8 are computed through the coupling with a Mappings iii library of young ionizing stars clusters (Groves et al. 2008, considering only stellar age and metallicity for the coupling, and including dust physics).The authors find that the predicted [O iii]λ5007+Hβ luminosity function is consistent with observations at z = 2-3, but that the Hα luminosity function at z ∼ 2 is ∼ 0.3 dex dimmer than observed.At higher redshift, the models show hardly any evolution of the luminosity functions out to z ∼ 5, and are slightly dimmer than the observed [O iii]λ5007+Hβ luminosity of De Barros et al. (2019).In contrast, as illustrated by Fig. 11 above, our emission-line treatment of the same cosmological simulations allows us to capture the observed Hα luminosity function of z = 2 galaxies, and we find a marked decline of the Hα and [O iii]+Hβ luminosity functions from z = 3 to z = 6.These contradictory emission-line predictions will be testable with upcoming JWST surveys.
To summarise, the vast majority of recent studies on emission-line predictions of simulated galaxies focus on optical-line luminosity functions, clustering of line emitters and number-count predictions for high-redshift galaxy surveys expected to emerge from current and future observational facilities.Despite the wide variety of modelling approaches, and the differences in photoionization models, basic predictions, such as optical-line luminosity functions are (surprisingly) widely consistent across different studies, including the work presented here.Only two such published studies (Garg et al. 2022;Baugh et al. 2022, aside from our own previous work) explored the location of simulated galaxies in BPT diagrams, and the evolution of the associated line ratios with redshift.The results of Baugh et al. (2022) agree with those presented here (and in Hirschmann et al. 2017) by favouring an evolution of the ionization parameter as the main driver of the rise in [O iii]/Hβ ratio at high redshift, in stark contrast to the theoretical results of Garg et al. (2022), which instead favour selection effects as an explanation for the observed trends.
Compared to the above previous studies, our approach is unique in that we consider a large parameter space of physical quantities, as well as a more self-consistent modelling of the ionization parameter to select H ii-region photoionization models.Even though in some cosmological simulations, line emission is modelled on a spatially resolved basis, it is not clear how much more reliable such an approach is than the one presented here: modern, large-scale cosmological simulations can by no means resolve ionized regions in the ISM and masses of young star clusters, so that additional assumptions/sub-grid models become necessary (often just considering star-particle age and metallicity to model line emission).
Most importantly, we account not only for stellar neb-ular emission of simulated galaxy populations, as done in literature so far, but also simultaneously for line emission due to AGN, post-AGB stars and fast radiative shocks, resulting in comprehensive emission-line galaxy catalogues at different cosmic epochs.

SUMMARY
In this paper, we have presented the first multi-component, optical and UV emission-line catalogues of large simulated galaxy populations at different cosmic epochs.Following the methodology of Hirschmann et al. (2017Hirschmann et al. ( , 2019)), the emission-line catalogues have been constructed by employing the IllustrisTNG cosmological simulations and by self-consistently connecting them to modern, state-of-theart photoionization models (Gutkin et al. 2016;Feltre et al. 2016;Hirschmann et al. 2017;Alarie & Morisset 2019).This allows us to compute in a self-consistent way the line emission from multiple components: young star clusters, AGN NLR, PAGB stellar populations and, for the first time, fast radiative shocks.Specifically, we consider predictions from IllustrisTNG galaxies for the redshift evolution of global and central interstellar metallicity, C/O abundance ratio, SFR, BH accretion rate, global and central average gas densities, the age and metallicity of PAGB stellar populations, and shock properties (velocity, magnetic-field strength, interstellar metallicity and shock surface).Based on these, we select SF, AGN, PAGB and shock nebular-emission models for each simulated galaxy (above the mass-resolution limit) at any redshift.By default, we adopt a fixed dust-to-metal mass ratio, ionized-gas hydrogen density, power-law index of the AGN ionizing radiation and pre-shock gas density.Taking advantage of these emission-line catalogues of IllustrisTNG galaxy populations, we have presented a variety of optical and UV diagnostic diagrams to identify ionizing sources in galaxies in the local Universe and at high redshift, together with the predicted evolution of lineluminosity functions and number counts of distant lineemitting galaxies expected to be detectable in deep spectroscopic observations with the NIRSpec instrument on board JWST.We can summarise our main results as follows: • The synthetic [O iii]/Hβ, [N ii]/Hα, [S ii]/Hα, [O i]/Hα and [O iii]/[O ii] emission-line ratios and Hα equivalent widths of IllustrisTNG galaxies are in excellent statistical agreement with observations of star-forming, active, shockdominated and retired (PAGB) SDSS galaxies in the local Universe.This confirms, for large galaxy populations (and with the different physical models adopted), the result originally obtained by Hirschmann et al. (2017Hirschmann et al. ( , 2019) ) for only 20 re-simulated, massive galaxies.
• Our results are consistent with the classical selection criteria in standard BPT diagnostic diagrams to distinguish SF and active galaxies in the local Universe.Furthermore, we confirm that the WHAN diagram can robustly identify retired galaxies, and the [O iii]/[O ii]-versus-[O i]/Hα diagram shock-dominated galaxies.
• Present-day galaxies classified as 'SF-dominated' and 'composites' (i.e., SF+AGN) are found to be primarily starforming, main-sequence galaxies with large cold-gas fractions.Then, SFR and cold-gas content successively decrease from 'AGN-dominated' to 'shock-dominated' and 'PAGBdominated' galaxies: shock-dominated galaxies are predominantly located in the green valley of the SFR-stellar mass plane, and PAGB-dominated galaxies on the red sequence.This is primarily due to AGN feedback causing shocks in outflowing gas and suppressing star formation and blackhole accretion.This allows the (intrinsically weak) line emission from shocks and PAGB stars to become dominant.
• Shock-and PAGB-dominated galaxies make up between 3 and 10 per cent of the present-day galaxy population, depending on box size and resolution of the simulation.At redshifts z > 1, these fractions drop below 1 per cent, with SF-dominated galaxies being the most abundant population (between 40 and 70 per cent) at all redshifts.
• At fixed [N ii]/Hα ratio, the [O iii]/Hβ ratio is predicted to increase from low to high redshift.For SF-dominated galaxies, the increase is similar to that reported in several observational studies (e.g., Steidel et al. 2014).We confirm our previous finding in Hirschmann et al. (2017) that at fixed galaxy stellar mass, the increase in [O iii]/Hβ from low to high redshift is primarily a consequence of a higher ionization parameter in high-redshift galaxies, regulated by a larger specific SFR and a higher global gas density (consistent with the results of Baugh et al. 2022).
• At high redshifts, classical optical-selection criteria to identify the dominant ionizing sources in galaxies are no longer applicable: a drop in [N ii]/Hα causes composite and AGN-dominated galaxies to shift toward the SF-dominated region of the classical [O iii]/Hβ-versus-[N ii]/Hα BPT diagram.Instead, nine UV diagnostic diagrams and related selection criteria, originally presented in Hirschmann et al. (2019, for 20 re-simulated massive galaxies and their progenitors), provide a powerful means to distinguish SF from active galaxy populations out to early cosmic times (e.g., C iii] λ1908/He ii λ1640 versus O iii] λ1663/He ii λ1640).
• The synthetic emission-line properties of IllustrisTNG galaxies can largely reproduce the observed evolution of the Hα, [O iii]λ5007 and [O ii] λ3727 luminosity functions out to z = 3 (albeit with some tension for [O ii] λ3727).We provide predictions of optical-and UV-line luminosity and flux functions at z > 3, potentially useful for future surveys with, e.g., JWST, Euclid and the Roman Space Telescope.From z = 3 to z = 7, the predicted number densities of optical and UV emission-line galaxies can drop by up to three orders of magnitude (primarily at faint luminosities), reflecting the declining SFR density over that redshift range.
• Finally, we present number counts of SF and active galaxies expected to be detectable at redshifts between z = 4 to z = 8 in deep, medium-resolution spectroscopic observations with the NIRSpec instrument on board JWST.We find that 2-hour-long exposures are sufficient to obtain unbiased censuses of Hα and [O iii]λ5007 emitters, while at least 5 hours are required for Hβ, irrespective of the presence of dust.Instead, for [O ii] λ3727, O iii] λ1663, C iii] λ1908, C iv λ1550, [N ii]λ6584, He ii λ1640 and Si iii] λ1888 emitters, even though the active-galaxy population can still be well sampled, progressively smaller fractions of the SFgalaxy population are likely to be detectable, especially in the presence of dust, even with exposure times exceeding 10 hours.
Overall, the multi-component optical and UV emission-line galaxy catalogues presented in this paper include for the first time not only stellar nebular emission of simulated galaxy populations, as done in literature so far, but also simultaneously line emission due to AGN, post-AGB stars and fast radiative shocks.These catalogues provide useful insights into various diagnostic diagrams and statistical properties of optical and UV emission-line galaxies.These mock observables may be further helpful for the interpretation of near-and far-future emission-line surveys with, e.g., JWST, Euclid and the VLT/MOONs.This study is the first of a series, in which we also plan to explore, for example, the clustering of emission-line galaxies and the use of strongline ratios as tracers of interstellar metallicity, the ionization parameter and other physical quantities of the ISM in high-redshift galaxies.

Figure 1 .
Figure 1.Location of the TNG100 galaxy population at z = 0 in three BPT diagrams (coloured 2D histograms): [O iii]/Hβ versus [N ii]/Hα (top row), [O iii]/Hβ versus [S ii]/Hα (middle row) and [O iii]/Hβ versus [O i]/Hα (bottom row).The locations of different galaxy types (blue: SF-dominated; red: composite; green: AGN-dominated; lilac: shock-dominated; yellow: Post-AGB dominated) are shown in different columns.In each panel, the location of TNG50 galaxies is overplotted as black contours, while the grey shaded 2D histogram shows the observed distribution of SDSS galaxies.Also shown are classical empirical selection criteria.In the [O iii]/Hβ vs.[N ii]/Hα diagram, these are supposed to distinguish SF galaxies (below the dashed line) from composites (between the dashed and dotted lines), AGN (above the dotted line) and LI(N)ER (in the bottom-right quadrant defined by dot-dashed lines), according toKewley et al.  (2001, dotted line)  andKauffmann et al. (2003, dashed and dot-dashed lines).In the [O iii]/Hβ vs. [S ii]/Hα and [O iii]/Hβ vs. [O i]/Hα diagrams, SF-dominated galaxies are supposed to reside in the bottom left corner (below the curved dotted line), while AGN-dominated galaxies should occupy the top part (above the curved and straight dotted lines), and LI(N)ER the right part (right next to the curved and below the straight dotted lines), according toKewley et al. (2001, curved dotted line)  andKewley et al. (2006, straight dotted line).

Figure 2 .
Figure 2. Location of the TNG100 and TNG50 galaxy populations at z = 0 in the Mass-Excitation (MEx, Juneau et al. 2011) diagram.The layout and colour coding are the same as in the top row of Fig.1.Overplotted as black dotted and dashed lines are the empirical criteria ofJuneau et al. (2014) to distinguish between SF-dominated (below the dotted line), composite (between dashed and dotted lines) and AGN-dominated (above the dashed line) galaxies.

Figure 3 .
Figure 3. Location of the TNG100 and TNG50 galaxy populations at z = 0 in the EW(Hα)-versus-[N ii]/Hα (WHAN) diagnostic diagram (Cid Fernandes et al. 2010).The layout and colour coding are the same as in the top row of Fig. 1.Overplotted as black dashed lines in each panel are the empirical criteria of CidFernandes et al. (2011) to distinguish between SF-dominated (top-left quadrant), AGN-dominated (top-right quadrant), and retired, post-AGB dominated (below horizontal dashed black line) galaxies.The red dashed line shows the criterion suggested by our simulated emission-line galaxies to robustly identify post-AGB-dominated galaxies.

Figure 4 .
Figure 4. Location of the TNG100 and TNG50 galaxy populations at z = 0 in the [O iii]/[O ii]-versus-[O i]/Hα diagnostic diagram proposed byKewley et al. (2019).The layout and colour coding are the same as in the top row of Fig.1.In each panel, the top-right quadrant delimited by dashed black lines is the region expected to be populated only by galaxies dominated by fast radiative shocks (see figure11ofKewley et al. 2019).

3. 3
Evolution of galaxy populations in the [OIII]/Hβ-versus-[NII]/Hα diagram Based on the overall good agreement between predicted and observed distributions of different types of galaxy populations at z = 0 (Section 3.1), we now investigate the evolution of TNG100 and TNG50 galaxy populations in the [O iii]/Hβ-versus-[N ii]/Hα diagnostic diagram at higher redshift.In particular, we are interested in the extent to which the larger statistics together with different physical models offered by the IllustrisTNG simulations alter the conclusions drawn by Hirschmann et al. (2017), using a small set of 20 cosmological zoom-in simulations of massive galaxies and their progenitors, regarding the apparent rise in [O iii]/Hβ at fixed [N ii]/Hα from z = 0 to z = 2 in observational samples.The different rows of Fig. 6 show the distributions of TNG100 and TNG50 galaxies in the [O iii]/Hβ-versus-[N ii]/Hα diagram at redshifts z = 0.5

Figure 6 .
Figure 6.Redshift evolution of the location of simulated TNG100 and TNG50 galaxy populations in the [O iii]/Hβ-versus-[N ii]/Hα diagnostic diagram.Rows from top to bottom show the distributions at redshifts z = 0.5, 1, 2, 3, 4-5 and 5-7, as indicated.The layout and colour coding are the same as in the top row of Fig. 1.

Figure 7 .
Figure 7. Average [O iii]/Hβ ratio in bins of [N ii]/Hα ratio for low-mass TNG50 galaxies (M stellar = 0.1-3 × 10 9 M ⊙ , left panel), intermediate-mass TNG100 galaxies (M stellar = 0.3-3 × 10 10 M ⊙ , middle panel) and massive TNG100 galaxies (M stellar > 3 × 10 10 M ⊙ , right panel) at different redshifts (different colours, as indicated).Also shown for reference are the same SDSS data and empirical selection criteria as in the top row of Fig. 1, along with the mean relation observed in a sample of 251 star-forming galaxies at z ∼ 2.3 by Steidel et al. (2014, thick grey solid line).

Figure 8 .
Figure8.Fractions of galaxies of different types in the Illus-trisTNG simulations plotted against redshift.The colour coding is the same as in Figs 1-6 (SF-dominated: blue; composites: red; AGN-dominated: green; shock-dominated: lilac; PAGBdominated: yellow).Solid lines refer to TNG100 galaxies more massive than M stellar = 3 × 10 9 M ⊙ and dashed and dot-dashed lines to TNG50 galaxies more massive than M stellar = 1 × 10 8 and 3 × 10 9 M ⊙ , respectively.For clarity, the areas encompassed by all curves referring to a given galaxy type are colour-shaded.

Figure 9 .
Figure9.Location of metal-poor (N2O2 < −0.8) galaxy populations in the TNG100 simulation at z = 1-5 (top 12 panels) and the TNG50 simulation at z = 5-7 (bottom 12 panels), in 12 different UV-diagnostic diagrams highlighted by Hirschmann et al. (2019, see Section 3.5 for details).A flux-detection limit of 10 −18 erg s −1 cm −2 was applied to all UV emission lines.Each panel shows the 2D distributions of SF-dominated (blue), composite (red) and AGN-dominated (green) galaxies; the top 12 panels also show the distribution of shock-dominated galaxies (lilac contours; no such galaxies are predicted to be found at z = 5-7).Overplotted in each panel are the selection criteria ofHirschmann et al. (2019) to discriminate between SF-dominated and composite galaxies (black dashed line), and between composite and AGN-dominated galaxies (black dotted line).

Figure
Figure 10.Average line luminosities of Hα, [O iii]λ5007 and [O ii] λ3727 (from top to bottom) in bins of SFR for TNG100 (solid mauve line with shaded 1σ scatter) and TNG50 (dashed mauve line) galaxies at z = 0. Relations for different galaxy types are also shown in the case of TNG100 (blue: SF-dominated; red: composite; green: AGN-dominated; orange: PAGB-dominated; lilac: shock-dominated).Also shown for completeness are relations often used in the literature (from Kennicutt et al. 1998 for Hα and [O ii] λ3727, and Sobral et al. 2015 for [O iii]λ5007; black solid line in each panel).These should not be compared in detail with the IllustrisTNG predictions, as they were derived from purely solar-metallicity, SF-galaxy models assuming a Salpeter (1955) IMF truncated at 0.1 and 1M ⊙ .

Figure 11 .
Figure 11.Hα-(top two rows), [O iii]+Hβ-(middle two rows) and [O ii]-(bottom two rows) luminosity functions of TNG50 (green solid lines), TNG100 (red solid lines) and TNG300 (lilac solid lines) galaxies more massive than 10 8 M ⊙ , 3 × 10 9 M ⊙ and 10 10 M ⊙ , respectively, at z = 0.1, 0.5, 1, 1.5, 2, 3, 4, 5 and 6 (different panels, as indicated).Dashed coloured lines show the results obtained when considering only the contribution by H ii regions to the luminosity of each galaxy.Also shown are observational data from various sources listed in the legend (all corrected for attenuation by dust; different grey-and black-line styles).The peaks in the TNG100 and TNG300 luminosity functions arise from the resolution limits; galaxy number densities below the peak are incomplete and hence not meaningful.

Figure 14 .
Figure 14.Number counts of [O ii] λ3727, Hβ, [O iii]λ5007, Hα and [N ii]λ6584 emitters (from left to right) as a function of redshift in a JWST/NIRSpec field of view (3 × 3 arcmin 2).The light blue and red bars, which are the same in all panels, show the number of, respectively, SF-dominated and active (AGN-dominated and composite) galaxies more massive than 10 8 M ⊙ predicted in redshift bins centred on z = 4, 5, 6, 7 and 8 by the TNG50 simulation (the fixed boxlength of the simulation translates into redshift-bin widths shrinking from ∆z = 10 −3 to 5 × 10 −4 over this range).The dark blue and red bars show the subsets of these galaxies whose line emission can be detected with NIRSpec with 5σ significance, assuming exposure times of 2, 5 and 10 hr (from top to bottom).The light-grey thin bars carved in the dark-colour bars show the effect of including dust attenuation with A V = 0.5 and aCalzetti et al. (2000) curve.

Figure 15 .
Figure15.Same as Fig.14, but for the C iv λ1550, He ii λ1640, O iii] λ1663, Si iii] λ1888 and C iii] λ1908 emission lines.The predicted number counts of C iv λ1550 emitters in the presence of dust are likely overestimated, as they ignore the effect of resonant scattering.
number counts is shown by the light-grey thin bars carved in the dark-colour bars. Table