iMaNGA: mock MaNGA galaxies based on IllustrisTNG and MaStar SSPs -- II. The catalogue

Strengthening the synergy between simulations and observations is essential to test galaxy formation and evolution theories. To achieve this goal, in the first paper of this series, we presented a method to generate mock SDSS-IV/MaNGA integral-field spectroscopic galaxy observations from cosmological simulations. In this second paper, we build the iMaNGA catalogue consisting of $\sim$1,000 unique galaxies from the TNG50 cosmological simulations, selected to mimic the SDSS-IV/MaNGA-Primary sample selection. Here we present and discuss the iMaNGA sample and its comparison to the MaNGA-Primary catalogue. The iMaNGA sample recovers the MaNGA-Primary sample well in terms of the stellar mass vs angular size relation and spatial resolution. The S\'ersic index vs angular size relation, instead, is not reproduced well by the simulations, mostly because of a paucity of high-mass elliptical galaxies in TNG50. We also investigate our ability to recover the galaxy kinematics and stellar population properties with full-spectral fitting. We demonstrate that `intrinsic' and `recovered' stellar kinematics, stellar ages and metallicities are consistent, with residuals compatible with zero within 1$\sigma$. The `intrinsic' and `recovered' star formation histories also closely resemble each other. We conclude that our mock generation and spectral fitting processes do not distort the `intrinsic' galaxy properties. Therefore, in the third paper of this series, we can meaningfully test the cosmological simulations, comparing the stellar population properties and kinematics of the iMaNGA mock galaxies and the MaNGA observational results.


INTRODUCTION
During cosmic history, galaxies are shaped by complex physics acting on multiple scales (Somerville & Davé 2015). Hence, hydrodynamical simulations in a cosmological context have been utilized to theoretically predict what we observe. Nowadays, large-scale hydrodynamical simulations of galaxy formation are available, such as Illustris, IllustrisTNG (Nelson et al. 2019a), and EAGLE (Schaye et al. 2014): in large cosmological volumes, baryonic matter and dark matter evolve together from the primordial density fluctuations to the local universe. Thanks to these large simulated samples, we can now test theoretical predictions against the tremendous amount of observational data provided by modern surveys, e.g. the Cosmic Assembly Near-infrared Deep Extragalactic Legacy Survey (CANDELS; Grogin et al. 2011), the Sloan Digital Sky Survey (SDSS; York et al. 2000; Abazajian et al. 2003), the Calar Alto Legacy Integral Field Area survey (CALIFA; Sánchez et al. 2012), the Sydney-AAO Multi-object Integral field spectrograph (SAMI; Allen 2014), and Mapping Nearby Galaxies at Apache Point Observatory (MaNGA; Bundy et al. 2015).
E-mail: lorenza.nanni@port.ac.uk (LN); daniel.thomas@port.ac.uk (DT); claudia.maraston@port.ac.uk (CM)
MNRAS 522, 5479-5499 (2023)
'Forward modelling' is a technique to compare theory to observations, which places model galaxies on the observational plane taking into account a variety of observational effects. This has been employed in e.g. Tonini et al. (2010), Snyder et al. (2015), Torrey et al. (2015), Trayford et al. (2015, 2017), Bottrell et al. (2017), Rodriguez-Gomez et al. (2019), Huertas-Company et al. (2019), and Schulz et al. (2020). Below we provide a brief synopsis of each of these works.
Tonini et al. (2010) and Henriques et al. (2012) show that semi-analytical models of galaxy formation (for an overview of these models see Vogelsberger et al. 2020) better reproduce the observed colours and near-infrared luminosities of high-redshift (z ∼ 2-3) massive galaxies when calculated with stellar population models accounting for the thermally pulsing asymptotic giant branch (TP-AGB) phase of stellar evolution. Snyder et al. (2015) include the effect of a Gaussian point spread function and noise in synthetic images of Illustris galaxies in order to study how optical galaxy morphology depends on mass and star formation rate in the simulations. Torrey et al. (2015) develop a method to build a catalogue of 7000 synthetic images and 40 000 integrated spectra from the Illustris simulations at redshift 0, showing that, from the synthetic data products, it is possible to produce monochromatic or colour-composite images, perform SED fitting, classify morphology, and determine galaxy structural properties, as in the analysis of real galaxies. Trayford et al. (2015) include the effects of obscuration by dust in birth clouds and the interstellar medium in EAGLE simulated galaxies, using a two-component screen model. In Trayford et al. (2017), the dust effect is included with radiative transfer simulations, improving the agreement between the predicted and observed optical colours as a function of stellar mass. Bottrell et al. (2017) generate synthetic images from the Illustris simulations, including noise and the effect of the point spread function, in order to carry out bulge + disc decompositions for SDSS-type galaxy images. Their work reveals that galaxies in Illustris are approximately twice as large and 0.7 mag brighter on average than galaxies in the SDSS, because of a significant deficit of bulge-dominated galaxies in Illustris for $\log M_*/\mathrm{M}_\odot < 11$. Rodriguez-Gomez et al. (2019) generate synthetic images of ∼27 000 galaxies from IllustrisTNG and Illustris, to match Pan-STARRS (Chambers et al. 2016) galaxy observations. The synthetic and real Pan-STARRS images are analysed with the same code (STATMORPH). The comparison reveals that the optical morphologies of IllustrisTNG galaxies are in good agreement with observations, improving on the predictions of the original Illustris simulation. However, the IllustrisTNG model still does not reproduce the observed strong morphology-colour relation because of an excess of both red discs and blue spheroids. Moreover, at fixed stellar mass, observations find discs to be larger than spheroids, while IllustrisTNG does not predict this trend. Huertas-Company et al. (2019) select around 12 000 galaxies in TNG100 to generate mock SDSS images, using the radiative transfer code SKIRT (Camps & Baes 2014) and including PSF and noise to mock SDSS r-band images. Observed and model morphologies are studied with a Convolutional Neural Network. The mass-size relations of the galaxies, divided by morphological type, match satisfactorily. However, there are discrepancies at the high-mass end of the stellar mass functions (SMF), which is dominated by disc galaxies in TNG100 and by early-type galaxies in SDSS. Schulz et al. (2020) investigate the relationship between the UV slope, β, and the ratio between the infrared and UV luminosities (IRX) for 7280 star-forming main-sequence (SFMS) galaxies from TNG50. A generally good agreement is found at z ≥ 1. However, they find a redshift-dependent systematic offset with respect to empirically derived local relations, with the TNG50 IRX-β relation shifting towards lower β and steepening at higher redshifts.
This selection of papers highlights how complex the comparison between observations and simulations is, and the need to include observational effects in the simulations in order to allow for a close comparison. This is the approach we take in the papers of this series, which focus on simulating the MaNGA sample, an integral-field spectroscopic survey of 10 010 nearby galaxies (see Section 2.3).
In Nanni et al. (2022), hereafter Paper I, we introduced our forward modelling procedure to generate realistic mock MaNGA-like galaxies. The main novelties of our method are: the adoption of the MaStar stellar population models (Maraston et al. 2020), which are based on stellar spectra obtained with the same MaNGA spectrograph (see Section 2.2 for details); a radiative transfer-based treatment of the dust; the reconstruction of a wavelength-dependent spectral noise based on MaNGA data; the use of the MaNGA effective point spread function to include observational effects such as dithering. Furthermore, we follow the steps of the MaNGA Data Analysis Pipeline (DAP; Westfall et al. 2019) to analyse the mock data. Specifically, we use two spectral fitting algorithms, namely PPXF (see Cappellari 2017) to obtain stellar and gas kinematics and FIREFLY (Wilkinson et al. 2017) to obtain the stellar populations' properties, i.e. age, chemical composition, star formation history (SFH), reddening, and stellar and remnant masses, as in several analyses of the MaNGA data (e.g. Goddard et al. 2016; Goddard et al. 2017; Neumann et al. 2021, 2022). As for cosmological simulations of galaxy formation and evolution, here we adopt IllustrisTNG (Pillepich et al. 2018b; Nelson et al. 2019a), but we stress that our procedure can be easily applied to any other simulation suite.
In this paper, we describe how we construct a mock MaNGA-like catalogue, which we call the 'iMaNGA sample', by applying the MaNGA-Primary target selection boundaries in redshift and i-band absolute magnitude (see Section 2.3) to TNG50 and applying the post-processing and analysis pipeline presented in Paper I to this selection. This results in ∼1000 unique TNG50 galaxies obeying the selection. Here, we present and discuss the general properties of the mock galaxy catalogue, i.e. morphology, kinematics, and stellar populations. We then discuss how iMaNGA compares to the MaNGA-Primary sample, in particular focusing on the mass versus angular size relation, the spatial resolution, and the Sérsic index versus angular size and mass relations, see Section 6. We finally demonstrate our ability to recover the true values, i.e. the 'intrinsic' galaxy properties in the simulations. In the third paper of this series, we shall conduct a systematic comparison between our mock galaxies and observational results, including our own recent analysis published in Neumann et al. (2021).
The paper is organized as follows. Data and models in use are described in Section 2, while our forward modelling procedure is recalled in Section 3. The construction of the mock galaxy catalogue is presented in Section 4 and results are discussed in Section 5. In particular, we show the general properties of the iMaNGA sample in Section 5.1; we illustrate the morphological characteristics of the iMaNGA sample in Section 5.2; we compare the MaNGA-Primary sample to the iMaNGA one in Section 5.3; we present the results of the analysis of the kinematics in Section 5.4; we study the stellar population properties in Section 5.5. Also, in Section 6 we discuss other works on the construction of MaNGA-like galaxies from simulations. We draw our conclusions in Section 7.

INPUT MODELS AND DATA
Here, we recap the description of models and data used in this work.

The IllustrisTNG simulation suite
IllustrisTNG (Marinacci et al. 2018; Naiman et al. 2018; Nelson et al. 2018, 2019a; Pillepich et al. 2018b, 2019; Springel et al. 2018) is a suite of large-scale hydrodynamical simulations of galaxy formation and evolution, based on its predecessor Illustris (Genel et al. 2014; Vogelsberger et al. 2014; Sijacki et al. 2015). IllustrisTNG, while maintaining the fundamental approach and physical models of Illustris, extends its scientific goals with larger volumes (up to 300 Mpc instead of 100 Mpc) and higher resolution (up to a baryonic mass resolution of $8.5 \times 10^4\,\mathrm{M}_\odot$ instead of $1.6 \times 10^6\,\mathrm{M}_\odot$). Moreover, new physics is incorporated (including magnetic fields and dual-mode black hole feedback, as described in Weinberger et al. 2017; Pillepich et al. 2018a). The fundamental physical processes included in these projects are the formation of cold dense gas clouds and stars; the evolution of stellar populations, stellar winds and feedback; supernova physics and evolution; the formation of supermassive black holes (BHs) and their accretion, radiation, and feedback; and the interstellar medium and its chemical enrichment. Indeed, the formation and evolution of galaxies are shaped by these processes, which act across a broad range of spatial and time-scales, governing galaxies' fundamental characteristics, such as their stellar and gas content, star formation activity, chemical composition, morphology, and also their interactions with the external environment, e.g. in a cluster. Star formation in particular occurs stochastically when the gas number density is ≥ 0.13 particles cm$^{-3}$, according to a Chabrier (2003) initial mass function (IMF) and assuming the Kennicutt-Schmidt law (Schmidt 1959; Kennicutt 1989).
IllustrisTNG simulates three physical box sizes, with cubic volumes of roughly 50, 100, and 300 Mpc side length (named TNG50, TNG100, and TNG300, respectively). Each run has a different resolution. In particular, in TNG50 (Nelson et al. 2019b; Pillepich et al. 2019), the gravitational softening lengths for baryonic and dark matter are $\epsilon_{\rm gas,min} = 74$ pc and $\epsilon_{\rm DM,min} = 288$ pc; the mass resolutions for baryonic and dark matter are $m_{\rm bar} = 8.5 \times 10^4\,\mathrm{M}_\odot$ and $m_{\rm DM} = 4.5 \times 10^5\,\mathrm{M}_\odot$. Each run outputs 100 snapshots from redshift 20.05 to redshift 0.0. Haloes and subhaloes are identified with the Friends-of-Friends and the SUBFIND algorithms, respectively (see Springel et al. 2001b; Nelson et al. 2015).
In this paper, we focus on subhaloes simulated by TNG50 and identified in snapshots from redshift 0.15 to redshift 0.01, which approximately corresponds to the redshift range observed with MaNGA (see Section 2.3). TNG50 is chosen because it allows us to reach the high spatial resolution typical of the MaNGA datacubes (pixel size of 0.5 arcsec, i.e. a spatial sampling ranging from ≈100 pc at z ≈ 0.01 to ≈1.5 kpc at z ≈ 0.15, Section 2.3). A further discussion about the subhalo selection is presented in Section 4.
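The conversion from the 0.5 arcsec pixel size to a physical sampling can be illustrated with a minimal sketch. It assumes a flat ΛCDM cosmology with Planck-like parameters (`h0` and `omega_m` below are assumptions, not values quoted in the text), and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def angular_diameter_distance_mpc(z, h0=67.74, omega_m=0.3089, steps=2000):
    """Angular diameter distance (Mpc) in flat LambdaCDM.

    h0 and omega_m are assumed, Planck-like values chosen for illustration.
    """
    omega_l = 1.0 - omega_m
    dz = z / steps
    total = 0.0
    # Trapezoidal integration of dz' / E(z') between 0 and z.
    for i in range(steps + 1):
        zi = i * dz
        e_z = math.sqrt(omega_m * (1.0 + zi) ** 3 + omega_l)
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight / e_z
    comoving = C_KM_S / h0 * total * dz
    return comoving / (1.0 + z)


def spaxel_size_kpc(z, pixel_arcsec=0.5):
    """Physical size (kpc) subtended by one spaxel at redshift z."""
    theta_rad = pixel_arcsec * math.pi / (180.0 * 3600.0)
    return angular_diameter_distance_mpc(z) * 1000.0 * theta_rad


for z in (0.01, 0.15):
    print(f"z = {z:.2f}: {spaxel_size_kpc(z):.2f} kpc per spaxel")
```

Under these assumptions the sampling grows from roughly a tenth of a kpc at the low-redshift end to more than a kpc at the high-redshift end, the same order as the values quoted above.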

MaStar: SDSS-based stellar population models
We use stellar population models from Maraston et al. (2020), which adopt the MaNGA stellar library MaStar (Yan et al. 2019) for the definition of stellar spectra as a function of effective temperature, gravity, and chemical composition in the population synthesis. MaStar (Abdurro'uf et al. 2022), consisting of ∼60 000 stellar spectra, is the largest stellar library ever assembled. MaStar stellar spectra were obtained with MaNGA fiber bundles and the BOSS optical spectrographs, i.e. the same observational set-up as for MaNGA galaxy observations (see Section 2.3). Therefore, the stellar spectra and the corresponding population models share the same wavelength range, spectral resolution and flux calibration as the MaNGA datacubes.
Here, we use an updated version of the Maraston et al. (2020) Hill et al. 2021). Population models are calculated for eight different values of the IMF slope below 0.6 $\mathrm{M}_\odot$, ranging between 0.3 and 3.8 in the notation in which the Salpeter (1955) slope is 2.35, for each age and metallicity combination.
With the MaStar-based population models, we generate a synthetic spectrum for each stellar particle in the TNG50 galaxies [assuming the Kroupa (2002)
Stellar population models are a key input of galaxy formation simulations and 'forward-modelling' (Baugh 2006; Tonini et al. 2010; Gonzalez-Perez et al. 2014): they provide the link to the observables, and they are instrumental to obtain the physical properties of data, through spectral fitting. Consequently, the choice of the model is an essential part of the comparison between galaxy simulations and observed data. Our adoption of MaNGA-based population models ensures we use the same spectral properties in the simulations as well as in the interpretation of galaxy data. We are therefore able to exclude any bias that would be caused by the adoption of different spectral models. Moreover, as all spectra involved in our work have been obtained with exactly the same instrument and observational set-ups, we achieve the highest degree of consistency to start a meaningful comparison between data and simulations (Paper III).

The MaNGA galaxy survey
MaNGA (Bundy et al. 2015) is the largest Integral Field Spectroscopy (IFS) survey of galaxies to date. It observed 10 010 unique galaxies at a median redshift of z ∼ 0.03 (Abdurro'uf et al. 2022), providing spatially resolved spectra for each of them. MaNGA is part of the SDSS-IV survey (Blanton et al. 2017) and concluded its observations in 2020 August.
The MaNGA IFS (Drory et al. 2015) is built around the SDSS 2.5-m telescope at Apache Point Observatory (Gunn et al. 2006) and uses the SDSS-BOSS spectrograph (Dawson et al. 2013; Smee et al. 2013), with a wavelength range from 3600 to 10 300 Å and an average spectral resolution R ≈ 1800. In particular, the SDSS-BOSS spectrograph has a red and a blue camera, with a dichroic splitting the light around 6000 Å. In the blue channel, the resolution goes from R = 1560 at 3700 Å to R = 2270 at 6000 Å; in the red channel, the resolution goes from R = 1850 at 6000 Å to R = 2650 at 9000 Å. In Paper I, we explain how we mimic the resolution of the SDSS-BOSS spectrograph at different wavelengths when we generate both the synthetic spectra and the noise.
MaNGA has hexagonal-formatted fiber bundles, made from 2 arcsec-core-diameter fibers, conducting dithered observations with Integral Field Units (IFUs), which vary in diameter from 12.5 arcsec (19 fibers) to 32.5 arcsec (127 fibers) (see table 2 in Bundy et al. 2015). The hexagonal-formatted fiber bundles are mimicked in our forward modelling of the simulated galaxies, as explained in Paper I.
MaNGA is characterized by a spatial resolution of 1.8 kpc at the median redshift of 0.037 (Law et al. 2016). MaNGA's characteristic fiber-convolved point-spread function (PSF) has a full width at half maximum (FWHM) of 2.5 arcsec (Law et al. 2015). For each MaNGA datacube, the 'reconstructed' PSF, or effective PSF (ePSF), is supplied in different bands (Law et al. 2016). We use the ePSFs in the different bands when generating the mock MaNGA-like galaxy datacubes (see Paper I).
The MaNGA galaxy sample is divided into a 'Primary' and a 'Secondary' sample, with a 2:1 split, for which the galaxy light sampling extends out to 1.5 effective radii ($R_{\rm eff}$) and 2.5 $R_{\rm eff}$, respectively (Wake et al. 2017). In this paper, we focus on building a Primary MaNGA-like sample from TNG50 simulated galaxies, see Section 4.

MOCK GALAXY INPUT AND CALCULATION
Here, we recapitulate our procedure to generate and analyse mock MaNGA galaxies, as introduced in Paper I.

Modelling the spectrum
Once a simulated galaxy is selected for post-processing, the first step is to model the spectra of its stellar particles. Our spectral modelling depends on the particle's age. If a stellar particle is younger than 4 Myr, we assume it to be a star-forming region and model its emission with the MappingsIII star-forming region models (MIII models, see Groves et al. 2008). For older ages, we use MaStar stellar population models (see Maraston et al. 2020, and Section 2.2). It is important to emphasize that the synthetic spectra are associated directly with the stellar particles in TNG50 by interpolating within the SSP model grid. This is different from the studies presented by Ibarra-Medel et al. (2018) and Sarmiento et al. (2023), in which the stellar particles in the simulations are assigned model spectra from the closest properties in the stellar population model template grid without interpolation, introducing a difference between 'intrinsic' and 'assigned' properties. This point will be reprised in Section 6.
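The age-dependent choice of model and the interpolation within the SSP grid can be sketched as follows. This is only an illustration under stated assumptions: the grid is taken to be regular in age and metallicity, the interpolation is bilinear in log age and metallicity, and `young_model` is a hypothetical stand-in for the star-forming region models; none of these details are specified in the text.

```python
import numpy as np


def bracket(grid, x):
    """Return index i and weight w so that x = (1 - w) * grid[i] + w * grid[i + 1]."""
    x = np.clip(x, grid[0], grid[-1])
    i = int(np.clip(np.searchsorted(grid, x) - 1, 0, len(grid) - 2))
    w = (x - grid[i]) / (grid[i + 1] - grid[i])
    return i, w


def ssp_spectrum(age_gyr, metallicity, ages, mets, templates):
    """Bilinear interpolation of templates[age, metallicity, wavelength].

    Interpolating in log10(age) and metallicity is an assumption of this sketch.
    """
    i, wa = bracket(np.log10(ages), np.log10(age_gyr))
    j, wz = bracket(mets, metallicity)
    return ((1 - wa) * (1 - wz) * templates[i, j]
            + wa * (1 - wz) * templates[i + 1, j]
            + (1 - wa) * wz * templates[i, j + 1]
            + wa * wz * templates[i + 1, j + 1])


def particle_spectrum(age_gyr, metallicity, ages, mets, templates, young_model):
    """Use the star-forming region model below 4 Myr, the SSP grid otherwise."""
    if age_gyr < 0.004:
        return young_model(metallicity)  # hypothetical callable
    return ssp_spectrum(age_gyr, metallicity, ages, mets, templates)
```

A nearest-template assignment would instead return `templates[i, j]` for the closest grid node, which is where the difference between 'intrinsic' and 'assigned' properties mentioned above would come from.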
We mimic an IFU observation to collect the light, generating a datacube with the MaNGA pixel size (0.5 arcsec) and a square FoV of 150 arcsec per side. Thanks to the use of the MaStar stellar population models, the synthetic datacube's spectral resolution and flux calibration are equal to those of MaNGA observations by construction. The virtual instrument is positioned along the z-axis of the cosmological volume in which the galaxy is identified. Note that observing the simulated galaxies with a line-of-sight (LOS) fixed to the cosmological z-axis effectively implies random viewing angles. An in-depth discussion of galaxy inclinations in the iMaNGA sample will be presented in Paper III.

Dust
Dust effects are included in the synthetic datacubes by reconstructing the attenuation curves spaxel by spaxel, employing low-resolution radiative transfer simulations with SKIRT (Baes et al. 2011). We define and discuss this original and fast way to exploit radiative transfer simulations in section 3.2.2 of Paper I. With SKIRT, we mimic an IFU observation with the same FoV and pixel size as for the synthetic datacubes, but with lower spectral resolution. It is important to underline that we do not assume any model for the attenuation curves: the attenuation curves are defined simply as the ratio between the signal in the spaxel with and without dust included in the radiative transfer simulations. The attenuation curves defined at lower spectral resolution are then interpolated on the wavelength array of the synthetic datacubes and applied to the synthetic datacubes, spaxel by spaxel.
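For a single spaxel, the ratio-based attenuation curve and its interpolation onto the finer wavelength grid can be sketched as below. The function names are hypothetical, and linear interpolation with `np.interp` is an assumption of this sketch rather than the scheme documented in Paper I.

```python
import numpy as np


def attenuation_curve(flux_dust, flux_nodust):
    """Ratio of dusty to dust-free flux; set to 1 where the dust-free flux is zero."""
    flux_dust = np.asarray(flux_dust, dtype=float)
    flux_nodust = np.asarray(flux_nodust, dtype=float)
    ratio = np.ones_like(flux_nodust)
    np.divide(flux_dust, flux_nodust, out=ratio, where=flux_nodust > 0)
    return ratio


def apply_attenuation(wave_hi, spec_hi, wave_lo, flux_dust_lo, flux_nodust_lo):
    """Interpolate the low-resolution curve onto wave_hi and attenuate spec_hi."""
    curve_lo = attenuation_curve(flux_dust_lo, flux_nodust_lo)
    curve_hi = np.interp(wave_hi, wave_lo, curve_lo)  # linear interpolation (assumed)
    return np.asarray(spec_hi, dtype=float) * curve_hi
```

Looping this over all spaxels reproduces the spaxel-by-spaxel application described above; no functional form for the attenuation law enters at any point.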

Kinematics
The kinematics are incorporated as follows. Spectra are Doppler-shifted and broadened according to the stellar kinematics from the TNG50 simulations (see section 3.2 in Paper I).
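A minimal sketch of these two operations on a single spectrum is given below. It assumes a non-relativistic Doppler shift, a Gaussian line-of-sight velocity distribution, and a wavelength grid uniform in ln λ; these choices and the function names are illustrative assumptions, not the exact implementation of Paper I.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def doppler_shift(wave, flux, v_los):
    """Shift a rest-frame spectrum by v_los (km/s) and resample it on `wave`."""
    shifted_wave = wave * (1.0 + v_los / C_KM_S)  # non-relativistic approximation
    return np.interp(wave, shifted_wave, flux)


def broaden(wave, flux, sigma_kms):
    """Gaussian velocity broadening; assumes `wave` is uniform in ln(lambda)
    and sigma_kms > 0."""
    dv = C_KM_S * np.log(wave[1] / wave[0])  # velocity width of one pixel
    sigma_pix = sigma_kms / dv
    half = int(np.ceil(5 * sigma_pix))
    x = np.arange(-half, half + 1)
    kernel = np.exp(-0.5 * (x / sigma_pix) ** 2)
    kernel /= kernel.sum()  # flux-conserving kernel
    return np.convolve(flux, kernel, mode="same")
```

In practice each stellar particle would carry its own velocity, so the shift is applied per particle before the light is collected in a spaxel.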

Morphology
For the morphological analysis, we first obtain r-band SDSS-like images by applying the SDSS r-band filter and PSF to the synthetic datacubes. Images are then analysed with STATMORPH, a 2D Sérsic fitting code (Rodriguez-Gomez et al. 2019, see Paper I, section 3.3). The analysis with STATMORPH provides us with the effective radius $R_{\rm eff}$, which is needed to construct a mock MaNGA Primary sample.

Inclusion of observational effects
Once the $R_{\rm eff}$ values of the galaxies are known, we select the appropriate hexagonal fiber-bundle configuration that would be employed by MaNGA to collect the light from the galaxy within 1.5 $R_{\rm eff}$. The available hexagonal diameters in the MaNGA set-up go from 12.5 to 32.5 arcsec, see Section 2.3. As discussed in Paper I, we do not simulate the detailed spatial sampling mechanics of MaNGA observations, as done in Bottrell & Hani (2022), for instance. Instead, we select all the spaxels within the MaNGA hexagonal-formatted fiber-bundle FoV [see also Nevin et al. (2021)] and mimic observational effects such as dithering and resulting covariances between spaxels by exploiting the reconstructed PSF (or effective PSF) provided for each of the observed MaNGA galaxies in different bands.
The effective PSF depends on the exposure time and the observing conditions in the considered band, and also includes dithering effects (Law et al. 2016). The convolution of the datacubes with the effective PSF happens after the implementation of noise. The noise is modelled based on an analysis of the real wavelength-dependent SNR in MaNGA (see Paper I). Since the convolution happens after the inclusion of the noise in each spaxel, the signal of adjacent spaxels, including the noise, becomes correlated. This indirect approach of mimicking the effects of the MaNGA IFS observations leads to significant savings in computing time. It is a simplification compared to reconstructing the detailed mechanics of the observations, as for instance done by Bottrell & Hani (2022). Both the reconstruction of the ePSF and the noise are based on MaNGA LOGCUBE output (Law et al. 2016). We refer the reader to section 3.4 of Paper I for more detail.
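The order of operations (noise first, PSF convolution second) and the resulting spaxel-to-spaxel correlation can be illustrated with a toy datacube. In this sketch a Gaussian kernel stands in for the band-dependent ePSF, and the noise is Gaussian with a single S/N value; both are simplifying assumptions, and all names are hypothetical.

```python
import numpy as np


def gaussian_kernel(fwhm_pix, half_width=8):
    """Normalized 1D Gaussian kernel (a stand-in for the real ePSF)."""
    sigma = fwhm_pix / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    x = np.arange(-half_width, half_width + 1)
    kernel = np.exp(-0.5 * (x / sigma) ** 2)
    return kernel / kernel.sum()


def smooth_image(image, kernel):
    """Separable 2D convolution: apply the 1D kernel along each spatial axis."""
    out = np.apply_along_axis(np.convolve, 0, image, kernel, mode="same")
    return np.apply_along_axis(np.convolve, 1, out, kernel, mode="same")


def observe(cube, snr, fwhm_pix, rng):
    """Add noise to cube[y, x, wavelength], then convolve every wavelength slice."""
    noisy = cube + rng.normal(size=cube.shape) * cube / snr  # assumed noise model
    kernel = gaussian_kernel(fwhm_pix)
    slices = [smooth_image(noisy[:, :, k], kernel) for k in range(cube.shape[2])]
    return np.stack(slices, axis=2)


rng = np.random.default_rng(0)
mock = observe(np.ones((48, 48, 4)), snr=10.0, fwhm_pix=5.0, rng=rng)
```

With a 2.5 arcsec FWHM and 0.5 arcsec pixels the kernel spans about 5 pixels, so neighbouring spaxels in `mock` share most of their noise, which is the covariance the text refers to.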
Following this approach, and thanks to the combination of TNG50 and MaStar Stellar Population models, we produce datacubes having the same spatial sampling, spatial resolution, spectral resolution, SNR as a function of the wavelength, flux calibration, and wavelength range of MaNGA observations.

Data analysis
We follow the procedure developed for the MaNGA Data Analysis Pipeline (Westfall et al. 2019) to analyse the iMaNGA galaxies. First, we employ the Voronoi binning algorithm of Cappellari & Copin (2003), with target S/N$_g$ > 10. We then run the penalized pixel-fitting algorithm (pPXF; Cappellari 2017), to reconstruct gas and stellar kinematics and model the emission lines.
We finally fit the mock spectra with FIREFLY (Wilkinson et al. 2017) to infer the stellar populations' properties, i.e. age, metallicity, and mass, as well as reddening and the SFH. As we show in Paper I on two test galaxies with vastly different properties, our 'mocking' and fitting procedures do not alter the intrinsic properties of the TNG50 galaxies.
Figure 1. Distributions in i-band absolute magnitude, stellar mass and half-mass-stellar radius (as a proxy of galaxy size) of the 'initial sample', i.e. the 48 248 TNG50 galaxies selected to lie in the MaNGA redshift range (i.e. 0.01-0.15) and to have more than 10 000 stellar particles.

MOCK CATALOGUE CONSTRUCTION
Here, we illustrate our method to construct the mock MaNGA catalogue from TNG50, i.e. our iMaNGA sample.

The initial sample of TNG50 galaxies
As explained in Section 2.1, the SUBFIND algorithm is run over all the saved snapshots of the Illustris and IllustrisTNG output cosmological volumes. This algorithm identifies structures in the cosmological volumes as galaxies. To construct the iMaNGA sample from the TNG50 simulations, we first select all the TNG50 galaxies in the MaNGA redshift range (between z ≈ 0.15 and z ≈ 0.01, see Section 2.3). This corresponds to 10 snapshots, from snapshot 88 to snapshot 98. We remove all the galaxies with fewer than 10 000 stellar particles from this sample, to ensure that dust effects are sufficiently resolved (as done in Schulz et al. 2020). We also remove all the 'galaxies' which are flagged as spurious artefacts of the SUBFIND procedure (for more details see Genel et al. 2017; Pillepich et al. 2018b, and the Data Specification page for IllustrisTNG).
We find that 48 248 galaxies in TNG50 satisfy these selection criteria. We shall refer to this sample of galaxies as the 'initial sample'. Fig. 1 displays the initial sample in (from top to bottom): i-band absolute (AB) magnitude, stellar mass, and half-mass-stellar radius $R_{\rm hmsr}$ as a proxy of galaxy sizes. We refer to the i-band AB magnitude as $M_i - 5\log_{10} h$, as in the NSA catalogue, which was used for the MaNGA target selection. As in the NSA catalogue, $h = H_0/(100\,\mathrm{km\,s^{-1}\,Mpc^{-1}})$, and h = 1 throughout the paper. The majority of galaxies in the initial sample have a stellar mass around $9 \times 10^{10}\,\mathrm{M}_\odot$.
Figure 2. The initial sample of TNG50 galaxies in the magnitude-redshift plane, before and after randomizing their redshift (upper and central panel, respectively). The difference between these two redshift values, given as $\Delta t$, is shown in the bottom panel.

Obtaining a smooth spatial sampling
Since we want to recover a smooth distribution in spatial sampling, as in the MaNGA-Primary sample, we need to remove the discreteness of the TNG50 redshift sampling (at low z, the TNG50 snapshots are output every 150 Myr) and make it continuous. To this end, we associate to each galaxy in the initial sample (see Section 4.1) a new redshift, which we call $z_{\rm random}$. This redshift is randomly extracted from a uniform distribution with lower limit equal to the galaxy's snapshot redshift (we refer to it as $z_{\rm TNG50}$), and upper limit equal to the redshift of the previous, higher redshift snapshot. In other words, we allow a galaxy in a given snapshot to have a redshift between the redshift characterizing its snapshot and the redshift of the preceding snapshot. We do not change the redshift of galaxies at the upper redshift boundary, i.e. z ≈ 0.15. In this way, we obtain a smooth distribution in spatial sampling, as for MaNGA datacubes. The results of the whole procedure are visualized in Fig. 2.
From now on, we consider the galaxies in the initial sample as characterized by $z_{\rm random}$, instead of their original redshift $z_{\rm TNG50}$. It should be noted that the new redshift $z_{\rm random}$ is solely used to construct a MaNGA-like catalogue, and to observe the galaxies in it when producing mock MaNGA-like observations (see Section 3). The ages of stellar particles as provided by TNG50 are not modified.
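The redshift randomization can be sketched in a few lines. The function name and the snapshot redshifts in the usage example are hypothetical, illustrative values; only the sampling rule (uniform between a galaxy's snapshot redshift and that of the preceding, higher-redshift snapshot, with the top snapshot left unchanged) follows the description above.

```python
import numpy as np


def randomize_redshifts(z_snap, snapshot_redshifts, rng):
    """Draw z_random uniformly between each galaxy's snapshot redshift and the
    redshift of the preceding (higher-z) snapshot.

    Galaxies in the highest-redshift snapshot keep their redshift.
    """
    grid = np.sort(np.asarray(snapshot_redshifts, dtype=float))
    z_snap = np.asarray(z_snap, dtype=float)
    idx = np.searchsorted(grid, z_snap)  # position of each galaxy's snapshot
    upper = grid[np.minimum(idx + 1, len(grid) - 1)]
    return rng.uniform(z_snap, upper)


# Illustrative snapshot redshifts (assumed values, not the TNG50 output list).
snapshots = [0.01, 0.02, 0.03, 0.05, 0.15]
rng = np.random.default_rng(42)
z_random = randomize_redshifts([0.01, 0.03, 0.15, 0.02], snapshots, rng)
```

The sketch relies on every entry of `z_snap` being exactly one of the values in `snapshot_redshifts`, as is the case when both come from the same snapshot table.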

Magnitude selection: the parent sample
The MaNGA sample selection is solely based on the galaxies' absolute i-band magnitude and redshift, with the final sample achieving an approximately flat distribution in the i-band magnitude (for more information see Yan et al. 2019).
The top panel of Fig. 3 presents the MaNGA-Primary sample selection boundaries (in black) (see Wake et al. 2017), and the TNG50 galaxies in the initial sample falling into it, identified by the new redshift $z_{\rm random}$. The bottom panel shows their distribution in the i-band magnitude. After imposing the Primary sample selection boundaries, we are left with 3152 TNG50 galaxies. We refer to this sample as the 'parent sample'. Fig. 4 displays the magnitude, the stellar mass $M_*$, and the half-mass-stellar radius $R_{\rm hmsr}$ distributions (from top to bottom) for the initial (hatched-filled histograms) and the parent sample (yellow histograms). Excluding galaxies outside the MaNGA selection boundaries (grey points in Fig. 3) changes the galaxy distributions substantially. There is a significantly higher density of high-luminosity objects and high-mass objects once these selection criteria are applied. The size distribution, instead, remains largely unchanged.

The final iMaNGA sample
In order to achieve the final, flat distribution in i-band magnitude over the TNG50 parent sample (Section 4.3), we perform the following steps:
(i) We build the galaxy distributions in i-band magnitude and redshift of the parent sample (bottom panel of Fig. 3);
(ii) To each TNG50 galaxy in the parent sample, we associate a probability p inversely proportional to the count N of the bin containing it (bottom panel of Fig. 3);
(iii) We randomly extract unique galaxies from this sample, assuming a selection probability for each of them equal to 1/N.
In other words, the probability of being selected is larger in underpopulated bins.
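The three steps above can be sketched as a weighted draw without replacement. For brevity this illustration bins in i-band magnitude only, which is a simplification of the procedure described; the function name, bin edges and sample size are hypothetical.

```python
import numpy as np


def flat_subsample(mag, n_draw, bins, rng):
    """Draw n_draw unique galaxies with probability proportional to 1/N,
    where N is the number of galaxies in each galaxy's magnitude bin."""
    mag = np.asarray(mag, dtype=float)
    counts, edges = np.histogram(mag, bins=bins)
    # Bin index of every galaxy; the clip handles values on the last edge.
    which = np.clip(np.digitize(mag, edges) - 1, 0, len(counts) - 1)
    weights = 1.0 / counts[which]
    prob = weights / weights.sum()
    return rng.choice(mag.size, size=n_draw, replace=False, p=prob)


# Toy parent sample: many faint galaxies and few bright ones (assumed numbers).
rng = np.random.default_rng(1)
parent = np.concatenate([rng.uniform(-19.9, -19.1, 900),
                         rng.uniform(-21.9, -21.1, 100)])
selected = flat_subsample(parent, 100, np.arange(-22.0, -18.5, 1.0), rng)
```

All magnitudes are assumed to fall inside the bin range, so every galaxy sits in a bin with N ≥ 1. Because the sparsely populated bright bin carries the same total weight as the crowded faint bin, the selected galaxies are spread far more evenly in magnitude than the parent sample.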
The requirement to achieve a flat distribution constrains the number of galaxies, which ends up being about 1000 in our case. We will refer to this sample as the 'iMaNGA sample'. This is our final sample, which we then post-process and analyse following the method presented in Paper I and recapitulated in Section 3.
The top panel of Fig. 5 displays the MaNGA-Primary sample selection boundaries (in black), and the TNG50 galaxies in the iMaNGA sample, colour-coded by number. The bottom panel shows the distribution in i-band magnitude in the initial (black hatch-filled histogram), parent (yellow histogram), and final iMaNGA (teal empty histogram) samples of TNG50 galaxies. Our method allows us to successfully generate a flat distribution in i-band magnitude. In Fig. 6, we compare these three samples in terms of stellar mass (top panel) and size (bottom panel, where the half-mass-stellar radius is considered a proxy of the galaxy size as usual), with the same meaning of symbols as in the bottom panel of Fig. 5. The iMaNGA sample of ∼1000 unique TNG50 galaxies is characterized by an approximately flat distribution in mass, as well as in i-band magnitude.

RESULTS
In this Section, we present the main characteristics of the iMaNGA sample, in particular the galaxy sizes, masses, and environmental densities (Section 5.1). Then, we report the outcome of the morphological analysis, based on mock r-band SDSS-like images (Section 5.2). Next, we present a direct comparison between some key characteristics of the iMaNGA and the Primary MaNGA samples (Section 5.3). Afterwards, we describe the results of the kinematic analysis over the entire sample, discussing our ability to recover the 'intrinsic' kinematics, as given by TNG50 (Section 5.4). At the end, we present the analysis of the stellar populations' properties and our ability to recover the 'intrinsic' stellar population properties from TNG50 (Section 5.5).

Figure 6. The distribution of TNG50 galaxies in the initial (hatch-filled black histograms), parent (yellow histograms), and iMaNGA (teal empty histograms) samples, in stellar mass and half-mass-stellar radius.

Characteristics of the iMaNGA sample

The environment is defined with the Nth nearest neighbour method: the distance to the Nth nearest neighbour, $d_N$, with N typically varying from 3 to 10, is used as a measure of the local galaxy (projected) overdensity (Muldrew et al. 2011; Etherington & Thomas 2015). The dimensionless overdensity, $1 + \delta$, is described by the following equation:

$$1 + \delta = \frac{\Sigma_N}{\langle \Sigma \rangle},$$

where $\Sigma_N$ is the surface number density described using the Nth neighbour method ($\Sigma_N = N / \pi d_N^2$), while $\langle \Sigma \rangle$ is the mean surface density of galaxies. We assume N = 5 and compute $\langle \Sigma \rangle$ within the TNG50 snapshot the galaxies are in, projecting the subhaloes in each of the simulated volumes along the z-axis of the cube. As a proxy of the galaxy environment, we use $\log(1 + \delta)$ (for more details see Paper I). In each panel of Fig. 7, we report the linear regression (black line) and the Pearson correlation coefficient ρ.
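A minimal sketch of the Nth nearest neighbour estimate follows, assuming NumPy and projected positions in a periodic box. Taking the mean surface density as the total number of galaxies divided by the projected box area is an assumption made for this illustration, as are the function name and the toy box size; the brute-force distance matrix is only suitable for small samples.

```python
import numpy as np

def nth_neighbour_overdensity(x, y, box_size, n=5):
    """Return log10(1 + delta) with Sigma_N = N / (pi * d_N**2).

    Assumption: <Sigma> is the number of galaxies divided by the
    projected area of a periodic box of side `box_size`.
    """
    pos = np.column_stack([x, y])
    diff = pos[:, None, :] - pos[None, :, :]
    # Periodic wrapping of the projected separations.
    diff -= box_size * np.round(diff / box_size)
    dist = np.sqrt((diff ** 2).sum(axis=-1))
    dist.sort(axis=1)          # column 0 is the zero distance to itself
    d_n = dist[:, n]           # distance to the Nth nearest neighbour
    sigma_n = n / (np.pi * d_n ** 2)
    mean_sigma = len(x) / box_size ** 2
    return np.log10(sigma_n / mean_sigma)

# Toy example: unclustered points projected along the z-axis of a cube.
rng = np.random.default_rng(1)
box = 35.0                     # hypothetical box side, arbitrary length units
x = rng.uniform(0.0, box, 500)
y = rng.uniform(0.0, box, 500)
log_overdensity = nth_neighbour_overdensity(x, y, box, n=5)
print("median log10(1 + delta):", np.median(log_overdensity))
```

For an unclustered point set the median of log(1 + δ) stays close to zero, so departures from zero in a real sample trace genuinely dense or sparse environments.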
As expected, galaxy mass and i-band magnitude are closely correlated. However, there are also significant correlations between stellar mass and the half-mass radius as well as the environmental density. These correlations are consistent with observations of real galaxies (see Kauffmann et al. 2004; Bernardi et al. 2010; Li et al. 2018). As expected, due to the design of the MaNGA target selection (see Yan et al. 2019), there is a strong redshift bias, with more massive galaxies lying at higher redshift in order to fit into the IFU apertures. Our iMaNGA sample reproduces this effect.

MNRAS 522, 5479-5499 (2023)

Figure 8. RGB images of 8 galaxies in the iMaNGA sample (see Section 4.4). The RGB images are constructed from their synthetic datacubes, generated with the method presented in section 3.2 in Paper I (see Section 3 here). Different angular sizes and morphologies can be appreciated. The redshift and the ID of the galaxies in TNG50 are reported in the upper-left corner for each galaxy. Their key properties are stated in Table 1.

Table 1. Selected properties of the 8 galaxies in the iMaNGA sample. From left, the redshift, the magnitude in the i band, the stellar mass, and the $R_{\rm HMSR}$ as given by the TNG50 data, while the environment ($1 + \delta$), the effective radius $R_{\rm eff}$ and the Sérsic index n are computed as described in Paper I, from r-band SDSS mock images.

Morphological analysis
As mentioned earlier (and explained in Paper I), morphologies are obtained from the r-band images. Fig. 8 displays the RGB images of 8 galaxies in our iMaNGA sample, to showcase the variety of morphologies reproduced by the TNG50 simulations, and therefore also present in the iMaNGA sample. The RGB images are obtained from the synthetic datacubes, with a square FoV of 150 arcmin as side length, including the effects of dust. Note that we do not include any observational effects, such as noise or PSF, as these images do not have a scientific purpose, being solely meant to visualize the variety of morphology and inclination in iMaNGA. Table 1 reports the key properties of these galaxies: snapshot ID, redshift $z_{\rm random}$, i-band magnitude, stellar mass, HMSR, environment (see Section 5.1), effective radius, and Sérsic index. Comparing the Sérsic indices with Fig. 8, we notice how galaxies showing spiral structures (ID 3, ID 452563, ID 532259, and ID 598178) have n ∼ 1, while elliptical-looking galaxies (ID 439126, ID 561122) have n > 2.

Fig. 9 shows the distribution of Sérsic index (n) in iMaNGA. The top panel presents the entire catalogue (teal histogram). In the central panel, the catalogue is split into galaxies with and without dust particles (grey and black dashed histogram). In the bottom panel, the catalogue is divided into galaxies with and without star-forming particles (grey and black dashed histogram). The distribution in Sérsic index of the galaxies with star-forming and dust particles peaks at low n values (n ∼ 1). Indeed, it is expected to have star formation and dust in disc-like and irregular galaxies (Lianou et al. 2019).
We note that the overall distribution of the Sérsic index in iMaNGA has a peak around n ∼ 1. A similar peak is also present in the distributions of the Sérsic index in the MaNGA catalogue. However, the Sérsic index distribution in MaNGA is bimodal, with another, slightly smaller peak around 4 (see fig. 12 in Fischer, Domínguez Sánchez & Bernardi 2018). This second peak is also present in the iMaNGA sample, but significantly smaller. This is the direct consequence of a paucity of elliptical galaxies in the underlying TNG50 catalogue. We will discuss this point further in the following section.

Figure 9. The distribution of the galaxies' Sérsic index n as computed by running STATMORPH over r-band SDSS-mock images (see Section 3) for all the galaxies in the iMaNGA sample. Upper panel: the r-band Sérsic index distribution for all the galaxies in the iMaNGA sample (see Section 4.4). Central panel: the r-band Sérsic index distribution for the sample, split into galaxies where dust particles are present (grey histogram) and where they are not (black hatch-filled histogram). Bottom panel: the r-band Sérsic index distribution, dividing the sample into galaxies with star-forming particles (grey histogram) and without (black hatch-filled histogram).

Comparison with the MaNGA-Primary sample
In this section, we present a comparison between the main characteristics of the MaNGA-Primary sample and the iMaNGA catalogue. All the properties of the MaNGA-Primary sample, i.e. total stellar mass, effective radius, Sérsic index, and redshift, are retrieved from DRPALL MaNGA data (Law et al. 2016). The total stellar mass of a galaxy in the iMaNGA sample is intrinsic, hence it is directly computed as the sum of the masses of the stellar particles used to construct the synthetic datacube. Other quantities like Sérsic index and effective radius, instead, are derived from the synthetic iMaNGA images.

Angular size and spatial resolution
In Fig. 10, we present the distributions of angular size (left-hand panels), spatial resolution expressed in kpc (central) and in terms of the effective radius (right-hand panels) for MaNGA (top panels) and iMaNGA (bottom panels). The figure mimics fig. 5 of Wake et al. (2017), including the fine binning in the stellar mass range $8.9 \le \log_{10} M_*/{\rm M}_\odot \le 11.3$. The vertical lines illustrate the radius of the hexagonal FoV of the smallest and biggest fibre-bundle configurations used in MaNGA observations.
A slight shift towards smaller angular sizes in iMaNGA with respect to MaNGA is noticeable in the left-hand panels, which is most pronounced for low-mass galaxies. The tails to large angular sizes are very similar between the simulated and real samples, though. This plot demonstrates the necessity of the redshift cut in both the MaNGA and the iMaNGA samples to ensure coverage of 1.5 $R_{\rm eff}$ by the 5 hexagonal FoVs in MaNGA.
The central panels show spatial resolution, calculated from the nominal angular resolution element (2.5 arcsec, both in MaNGA and iMaNGA) and galaxy redshift. The distributions for MaNGA and iMaNGA are very similar. Both peak at around 1 kpc, and exhibit a tail to significantly lower spatial resolution of up to 6 kpc in the most massive galaxies.

Figure 10. Comparison of angular size (left-hand panels), resolution in terms of kpc (central panels) and in terms of the effective radius (right-hand panels), of the iMaNGA sample, generated from TNG50 as described in this paper, to the MaNGA-Primary sample. This plot reproduces the plots in Wake et al. (2017). The information for the MaNGA-Primary sample galaxies is retrieved from DRPALL MaNGA data (Law et al. 2016), as for all the other plots in this Section. The vertical lines report the smallest and biggest FoV of the 5 hexagonal fibre-bundle configurations used in MaNGA observations (see Section 2.3).
The right-hand panels show the distribution of spatial resolution in terms of $R_{\rm eff}$. As for the MaNGA-Primary sample, these distributions in the iMaNGA sample are largely independent of galaxy mass. Also, the median spatial resolutions are very similar: 0.37 $R_{\rm eff}$ for the MaNGA-Primary sample, and 0.40 $R_{\rm eff}$ for the iMaNGA sample.
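The conversion from the 2.5 arcsec resolution element to a physical scale can be sketched as follows. The flat ΛCDM parameters (H0 = 67.74 km/s/Mpc, Ωm = 0.3089) and the 4 kpc effective radius are assumptions made for this illustration, and the function names are hypothetical.

```python
import numpy as np

C_KM_S = 299792.458                      # speed of light in km/s
ARCSEC_IN_RAD = np.pi / (180.0 * 3600.0)

def kpc_per_arcsec(z, h0=67.74, omega_m=0.3089, n_steps=2048):
    """Proper transverse scale in kpc/arcsec for a flat LCDM cosmology.

    The cosmological parameters are assumed values for this sketch.
    """
    zz = np.linspace(0.0, z, n_steps)
    inv_e = 1.0 / np.sqrt(omega_m * (1.0 + zz) ** 3 + (1.0 - omega_m))
    # Trapezoidal integration of dz / E(z) gives the comoving distance.
    integral = np.sum(0.5 * (inv_e[1:] + inv_e[:-1]) * np.diff(zz))
    d_comoving_mpc = C_KM_S / h0 * integral
    d_angular_mpc = d_comoving_mpc / (1.0 + z)
    return d_angular_mpc * 1.0e3 * ARCSEC_IN_RAD

def spatial_resolution_kpc(z, fwhm_arcsec=2.5):
    """Physical size of the nominal angular resolution element."""
    return fwhm_arcsec * kpc_per_arcsec(z)

# Resolution in kpc and in units of a hypothetical effective radius of 4 kpc.
scale_003 = kpc_per_arcsec(0.03)
res_003 = spatial_resolution_kpc(0.03)
print(f"scale at z = 0.03: {scale_003:.3f} kpc/arcsec")
print(f"resolution: {res_003:.2f} kpc = {res_003 / 4.0:.2f} R_eff")
```

Because the physical scale grows with redshift, the massive galaxies placed at higher redshift by the selection are the ones populating the tail towards lower spatial resolution.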
We conclude that, overall, the iMaNGA sample reproduces the trends of the MaNGA-Primary sample for both angular size and spatial resolution well.

Stellar mass and effective radius

Fig. 11 shows a density plot of total stellar mass as a function of the effective radius for both the MaNGA-Primary sample (top panel) and the iMaNGA sample (central panel). As in Fig. 10, we select galaxies with a stellar mass in the range $8.9 \le \log_{10} M_*/{\rm M}_\odot \le 11.3$. Coloured hexagons represent galaxy densities, and the small black points show individual galaxies. The red diamonds represent the median of the total stellar mass along 5 equally sized bins in effective radius, the red error-bars represent the standard deviation in each bin, and the red dashed line shows the linear regression with the Pearson correlation coefficient in the top-left corner. The bottom panel shows the normalized residual of equation (2), where $O_{\rm MaNGA}$ and $O_{\rm iMaNGA}$ represent the median of total stellar mass along the 5 equally sized bins considered for the two samples, and $\sigma_{\rm MaNGA}$ and $\sigma_{\rm iMaNGA}$ represent the standard deviation in each of the bins for the two samples. The correlation between effective radius and stellar mass is well recovered in the iMaNGA sample, in particular at smaller angular size.

Sérsic index

Fig. 12 presents a density plot of the Sérsic index as a function of effective radius (left-hand panels) and stellar mass (right-hand panels), both for the MaNGA-Primary sample (top panels) and the iMaNGA sample (central panels). We show all galaxies in the mass range $8.9 \le \log_{10} M_*/{\rm M}_\odot \le 11.3$. The red diamonds represent the median along 5 equally sized bins with the red error-bars showing the median standard deviation within each bin. We also display the linear regression for the median values (red dashed lines) and the Pearson correlation coefficient (in the upper left corners). With black points, we represent all the galaxies in the catalogues. In the bottom panels, we report the normalized residual between the two samples, computed as in equation (2), where $O_{\rm MaNGA}$ and $O_{\rm iMaNGA}$ represent the median of the Sérsic index, and $\sigma_{\rm MaNGA}$ and $\sigma_{\rm iMaNGA}$ represent the standard deviation in each of the bins.

The MaNGA-Primary sample shows a trend of Sérsic index slightly increasing with effective radius, while the paucity of galaxies with high Sérsic indices leads to the opposite trend in the iMaNGA sample. The right-hand panels show that a correlation exists also between Sérsic index and total stellar mass for both the MaNGA and the iMaNGA sample. The correlation is weaker in iMaNGA, though.
Our results are in agreement with the study by Huertas-Company et al. (2019), who compare mock SDSS r-band images from TNG100 to SDSS. In particular they find that, although the observed mass-size relation is well recovered by TNG100, both in normalization and in slope, SDSS is dominated by elliptical galaxies at the high-mass end while TNG100 is dominated by late-type systems, with a lack of lenticular galaxies at intermediate masses and ellipticals at the high-mass end. This discrepancy is not due to volume, as shown by re-sampling the SDSS results over smaller volumes.
Similar results are also presented in Rodriguez-Gomez et al. ( 2019 ).
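The binned comparison used for the stellar mass and Sérsic index relations can be sketched as below. The normalized residual is written here as the difference of the binned medians divided by the quadrature sum of the binned standard deviations; this specific form is an assumption of the sketch, since only the ingredients of equation (2) are described in the text. The toy data and function names are likewise hypothetical.

```python
import numpy as np

def binned_median_std(x, y, edges):
    """Median and standard deviation of y in bins of x.

    Every bin is assumed to be populated; an empty bin would give NaN.
    """
    k = np.clip(np.digitize(x, edges) - 1, 0, len(edges) - 2)
    n_bins = len(edges) - 1
    med = np.array([np.median(y[k == b]) for b in range(n_bins)])
    std = np.array([np.std(y[k == b]) for b in range(n_bins)])
    return med, std

def normalised_residual(med_a, std_a, med_b, std_b):
    """Assumed form: (O_a - O_b) / sqrt(sigma_a**2 + sigma_b**2)."""
    return (med_a - med_b) / np.sqrt(std_a ** 2 + std_b ** 2)

# Two toy samples drawn from the same underlying relation.
rng = np.random.default_rng(3)
edges = np.linspace(0.0, 10.0, 6)        # 5 equally sized bins
x_obs = rng.uniform(0.0, 10.0, 2000)
y_obs = 9.0 + 0.2 * x_obs + rng.normal(0.0, 0.3, 2000)
x_sim = rng.uniform(0.0, 10.0, 2000)
y_sim = 9.0 + 0.2 * x_sim + rng.normal(0.0, 0.3, 2000)

med_obs, std_obs = binned_median_std(x_obs, y_obs, edges)
med_sim, std_sim = binned_median_std(x_sim, y_sim, edges)
residual = normalised_residual(med_obs, std_obs, med_sim, std_sim)
pearson_obs = np.corrcoef(x_obs, y_obs)[0, 1]
print("normalised residual per bin:", np.round(residual, 3))
print("Pearson coefficient:", round(pearson_obs, 3))
```

With this normalization, a residual within ±1 in a bin means that the two medians differ by less than the combined scatter of the two samples.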

Kinematic analysis
The kinematic analysis of iMaNGA follows the procedure of the MaNGA DAP. In brief (see Paper I for details):

(i) We apply the Voronoi binning scheme to meet a given SNR threshold (i.e. a minimum SNR of about 10 in the g-band images), averaging over neighbouring spaxels. In this way, we define the Voronoi tessellation (Cappellari & Copin 2003) in each iMaNGA datacube. This step is necessary, both for real and mock MaNGA-like datacubes, since accurate measurements of stellar kinematics and stellar population characteristics need a good SNR to be extracted without bias. Where a good SNR cannot be reached, a mask is applied so that those spaxels are not analysed any further.

(ii) We run PPXF (Cappellari 2017) to obtain the stellar kinematics, as well as the gas kinematics and the emission-line best fit (see Belfiore et al. 2019; Westfall et al. 2019). The MILES-HC libraries are used as templates by PPXF for the stellar continuum, while the templates for the emission lines are constructed as explained in Westfall et al. (2019), Section 9.
In total we analyse ∼8 million Voronoi tassels in the iMaNGA sample. Fig. 13 reports examples of a fit with PPXF to the stellar continuum. Fig. 14 displays the stellar peculiar velocity maps for the 8 galaxies in the iMaNGA sample, presented in Fig. 8 and Table 1, as recovered by running PPXF over the stellar continuum in each Voronoi tassel. In Fig. 15, we show the stellar velocity dispersion maps for the same galaxies, as recovered from the PPXF analysis. A variety of kinematical states, FoVs, Voronoi binning tessellation and masking can be appreciated. Comparing these with Fig. 8 and the Sérsic indices listed in Table 1, we notice how elliptical galaxies (ID 439126 at z ∼ 0.06, ID 561122 at z ∼ 0.03) have n > 2 and the highest values of stellar velocity dispersion, while still presenting some pattern of rotation. Disc-like galaxies (ID 3 at z ∼ 0.03, ID 452563 at z ∼ 0.08, ID 532259 at z ∼ 0.05, ID 598178 at z ∼ 0.02), instead, have n ∼ 1, exhibit a clear rotational velocity pattern, and a lower velocity dispersion compared to the ETGs.

Figure 12. As in Fig. 11, for the distribution of the galaxies' Sérsic index n as a function of the r-band $R_{\rm eff}$ and the total stellar mass, both for the MaNGA-Primary sample (upper panels) and the iMaNGA sample (central panels). Bottom panels: normalized residuals (red line), and the 1σ, 2σ, and 3σ level intervals (dashed grey lines). Symbols, curves, and colours as in Fig. 11.

Figure 14. Stellar velocity maps, obtained with PPXF, for the 8 galaxies in the iMaNGA sample from Fig. 8 and Table 1. Maps are colour coded by the stellar peculiar velocity along the LOS. Galaxy redshifts and IDs are stated in the plots, as well as the diameter of the hexagonal MaNGA FoV the galaxy falls in, which is used to 'observe' the galaxy, see Section 3.

In Paper I, we presented an initial comparison between the 'intrinsic' kinematics (peculiar stellar velocity and stellar velocity dispersion), i.e. determined directly from the simulations with equation (5) in Paper I, and the 'recovered' kinematics obtained from running PPXF over the Voronoi-binned mock datacubes, for two example galaxies. We found that residuals are compatible with zero at the 68 per cent confidence level, with no systematic bias for the two iMaNGA datacubes.

We now repeat this comparison, this time for all the tassels in the iMaNGA sample (∼8 million), to test our ability to recover the input values statistically. Fig. 16 displays the distributions of the residuals (red histograms), both for the stellar velocity (i.e. $\Delta v_z = v_{z,{\rm pPXF}} - v_{z,{\rm TNG50}}$) and the stellar velocity dispersion (i.e. $\Delta\sigma_{v_z} = \sigma_{v_z,{\rm pPXF}} - \sigma_{v_z,{\rm TNG50}}$) over all the tassels in the iMaNGA sample. We also report the 0.16, 0.5, and 0.84 quantiles (vertical black lines). The quantiles of the $\Delta v_z$ distribution are $q_{16} = -11.24$, $q_{50} = -2.61$, and $q_{84} = 6.35$; those of the $\Delta\sigma_{v_z}$ distribution are $q_{16} = -4.04$, $q_{50} = 6.16$, and $q_{84} = 16.94$.

At the 1σ level, the residuals are compatible with zero, hence we recover unbiased measurements of the stellar velocity and the stellar velocity dispersion along the LOS.
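The consistency check on the residuals can be written compactly. The sketch below assumes NumPy and hypothetical function names; the toy arrays stand in for the per-tassel values, while the two final checks use the quantiles quoted above.

```python
import numpy as np

def residual_quantiles(recovered, intrinsic):
    """0.16, 0.5, and 0.84 quantiles of (recovered - intrinsic)."""
    residual = np.asarray(recovered) - np.asarray(intrinsic)
    return tuple(np.quantile(residual, [0.16, 0.5, 0.84]))

def consistent_with_zero(q16, q84):
    """True if zero lies inside the central 68 per cent interval."""
    return q16 <= 0.0 <= q84

# Toy per-tassel values (assumed): unbiased recovery with Gaussian scatter.
rng = np.random.default_rng(7)
v_true = rng.normal(0.0, 80.0, 10000)
v_fit = v_true + rng.normal(0.0, 9.0, 10000)
q16, q50, q84 = residual_quantiles(v_fit, v_true)
print(f"toy residual quantiles: {q16:.2f}, {q50:.2f}, {q84:.2f}")

# Quantiles quoted in the text for the velocity and dispersion residuals.
velocity_ok = consistent_with_zero(-11.24, 6.35)
dispersion_ok = consistent_with_zero(-4.04, 16.94)
print(velocity_ok, dispersion_ok)
```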
Figure 16. Distributions of the residuals of the stellar velocity and the stellar velocity dispersion over all the tassels in iMaNGA. We report the 0.16, 0.5, and 0.84 quantiles of the residual distributions (vertical black lines). Residuals are compatible with zero over all tassels at the 68 per cent confidence intervals.

Fig. 17 presents recovered stellar velocity dispersion as a function of intrinsic stellar mass (green circles). The median in 5 equally sized bins in stellar mass (black diamonds) and the standard deviation in each bin (black error bars) are also shown. The black line is the linear regression line for the median values, and the Pearson correlation coefficient is given in the top-left corner. A robust linear correlation between the stellar mass and $\log_{10} \sigma_{v_z}$ is found, in good agreement with both real observations (e.g. Zahid et al. 2016) and other simulations (e.g. Pillepich et al. 2019).
In closing this section, we would like to emphasize that our analysis naturally considers random orientation, since we observe the galaxies with a fixed LOS along the z -axis of the cosmological volume, and the simulated galaxies are randomly orientated in it. Different orientations can be appreciated in Fig. 8 .

Stellar population analysis
As discussed in Paper I, after following the steps in the DAP, we can proceed to perform full spectral fitting of population models to the iMaNGA galaxies' datacubes in order to derive stellar population properties. The Voronoi tassels, stellar kinematics, and emission-line best-fits are used as input to the full spectral fitting code FIREFLY (Wilkinson et al. 2017), using the same strategy detailed in Neumann et al. (2021, 2022) for the analysis of real MaNGA galaxies adopting MaStar stellar population models as fitting templates. The fitting results for the MaNGA galaxies are publicly available as a value-added catalogue (Neumann et al. 2021). Based on the full-spectral fitting with FIREFLY, fundamental galaxy properties are recovered, including mass-weighted and light-weighted metallicities and ages. These properties are essential to investigate the formation and evolution of galaxies.

Mass-weighted metallicity and age maps
Running FIREFLY over the mock IFU datacubes, we are able to spatially resolve the stellar population properties. Therefore, as in e.g. Neumann et al. (2022), we are in the position to produce maps of stellar properties for the iMaNGA datacubes.
Figs 19 and 20 show maps of recovered stellar metallicity and age for the eight galaxies presented in Fig. 8. Here, we report the mass-weighted metallicity and age, i.e. $[Z/H]_{\rm MW}$ and $\log_{10}({\rm age})_{\rm MW}$ (Gyr). We notice that elliptical galaxies (ID 439126 at z ∼ 0.06, ID 561122 at z ∼ 0.03) are metal-rich and old overall, as expected. The iMaNGA disc galaxies (ID 3 at z ∼ 0.03, ID 452563 at z ∼ 0.08, ID 532259 at z ∼ 0.05, ID 598178 at z ∼ 0.02), instead, have generally young stellar populations and show a metal-rich stellar component in the centre, with a steep drop in metallicity towards larger radii.
In Paper I we compared intrinsic age and chemical composition, calculated directly from the stellar particle data in TNG50, with the values recovered from FIREFLY for two iMaNGA galaxies, finding that residuals were consistent with zero at the 68 per cent confidence level with no systematic bias. Here, we repeat this comparison for the entire iMaNGA catalogue, thereby testing if, after the post-processing of the TNG50 galaxies, the 'intrinsic' information is correctly retrieved with our analysis.
To this end, weighted 'intrinsic' properties for the same Voronoi grid are constructed directly from the simulated stellar particles. The weighted properties in each tassel are:

$$\theta_{W,{\rm tassel}} = \frac{\sum_{i=1}^{N_*} \theta_i\, W_{*,i}}{\sum_{i=1}^{N_*} W_{*,i}},$$

where $\theta_{W,{\rm tassel}}$ is either the mass-weighted metallicity ($[Z/H]_{\rm MW}$), the light-weighted metallicity ($[Z/H]_{\rm LW}$), the mass-weighted age (${\rm Age}_{\rm MW}$), or the light-weighted age (${\rm Age}_{\rm LW}$) in the selected tassel; $\theta_i$ is the age or metallicity of the i-th particle of the considered tassel with weight, either mass or light, $W_{*,i}$; $N_*$ is the number of stellar particles in the selected tassel. For the mass-weight, i.e. $M_{*,i}$, we simply consider the stellar particle mass as provided by TNG50; for the light-weight, i.e. $L_{*,i}$, we consider the total luminosity of the stellar spectrum, either from MaStar or MIII, depending on the stellar particle's age.

Figure 21. [...] and the residual distributions for the age ($\Delta\log_{10}({\rm Age}) = \log_{10}({\rm Age})_{\rm Firefly} - \log_{10}({\rm Age})_{\rm TNG50}$, bottom panel). The empty red histograms are the residual distributions for mass-weighted quantities, while the grey histograms illustrate the residual distributions for light-weighted quantities. We also plot the 0.16, 0.5, and 0.84 quantiles of the residual distributions: in red dashed lines, the quantiles for the mass-weighted residuals, in black, the quantiles for the light-weighted. The quantiles are reported at the top of the panels, in red for the MW residuals, and in grey for the LW ones.

The figure shows the following: (i) The residual distribution for the mass-weighted metallicity is characterized by $q_{16} = -0.31$, $q_{50} = -0.07$, and $q_{84} = 0.13$, and that for the light-weighted metallicity by $q_{16} = -0.34$, $q_{50} = -0.12$, and $q_{84} = 0.07$.
These results show that our analysis does not introduce any biases: the residuals are well consistent with zero within the 68 per cent confidence intervals. As to be expected, the realistic observational effects implemented in iMaNGA lead to a scatter in the residuals that is well consistent with the measurement errors.
In Appendix A, we further show the residual as a function of the intrinsic properties (see Fig. A1) and the relation between the residuals on metallicity and age (see Fig. A2). We conclude that we are able to recover the intrinsic properties of the TNG50 galaxy sample.
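The construction of the weighted 'intrinsic' properties on the Voronoi grid can be sketched as a weighted mean per tassel. The sketch assumes NumPy, a hypothetical function name, and that each stellar particle has already been assigned an integer tassel index; the weight is the particle mass for mass-weighted quantities, or its luminosity for light-weighted ones.

```python
import numpy as np

def weighted_property_per_tassel(theta, weight, tassel_id, n_tassels):
    """Weighted mean of a particle property (age or metallicity) per tassel.

    Tassels that contain no particles are returned as NaN.
    """
    theta = np.asarray(theta, dtype=float)
    weight = np.asarray(weight, dtype=float)
    num = np.bincount(tassel_id, weights=theta * weight, minlength=n_tassels)
    den = np.bincount(tassel_id, weights=weight, minlength=n_tassels)
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(den > 0, num / den, np.nan)

# Toy example: three particles, two populated tassels, one empty tassel.
age = np.array([1.0, 3.0, 10.0])         # particle property
mass = np.array([1.0, 1.0, 2.0])         # particle weights
tassel = np.array([0, 0, 1])             # tassel index of each particle
age_mw = weighted_property_per_tassel(age, mass, tassel, n_tassels=3)
print(age_mw)
```

Swapping the mass array for a luminosity array gives the light-weighted counterpart on the same grid, so the two sets of residuals can be compared tassel by tassel.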

Star formation histories
The SFH is an essential quantity to study galaxy formation and evolution processes, but it is also one of the most difficult features to retrieve from observational data. We can use the FIREFLY full-spectral fitting algorithm to resolve the SFH non-parametrically so that we can compare it against the intrinsic SFH from TNG50. Again, we take into account both mass-weighted and light-weighted metrics. Fig. 22 reports the SFH as SSP mass-weights (top left-hand panel) and light-weights (bottom left-hand panel) versus look-back time.
In the top panels, we plot the recovered SFH (teal histogram) and the intrinsic SFH (black hatch-filled histogram) for two example galaxies, i.e. galaxy ID 3 (on the left) and ID 561122 (on the right) at z ∼ 0.03. The two galaxies have different morphologies (see Table 1) and different stellar population properties (see Figs 19 and 20). The central panels show the intrinsic SFH as empty black histogram and the recovered SFH in yellow. In the bottom panels, we report the residuals, i.e. the difference between what is predicted by FIREFLY and what is intrinsic to the simulations. Light-weighted quantities are yellow, mass-weighted quantities are empty teal histograms. Intrinsic SFHs are computed from all the stellar particles in the simulated galaxies within the FoV of the hexagonal fibre bundle assumed to observe its light; recovered SFHs are obtained considering all the Voronoi tassels in the galaxies, from the analysis with FIREFLY.
A further test is presented in Fig. A3, Appendix A, where we present the SFHs of 20 galaxies, divided into ellipticals and spirals. We conclude that the overall shape of the SFH is reproduced well by the full-spectral fitting procedure.
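An intrinsic SFH expressed as normalized weights in bins of look-back time can be sketched as follows. The bin edges, the toy particle data, and the function names are hypothetical; in particular, the age grid of the SSP models used by FIREFLY is not reproduced here.

```python
import numpy as np

def sfh_weights(age_gyr, weight, age_edges):
    """Fraction of the total weight (mass or light) per look-back-time bin."""
    hist, _ = np.histogram(age_gyr, bins=age_edges, weights=weight)
    return hist / hist.sum()

def sfh_residual(recovered, intrinsic):
    """Difference between recovered and intrinsic SFH weights per bin."""
    return np.asarray(recovered) - np.asarray(intrinsic)

# Toy stellar particles inside the FoV (assumed values).
age_gyr = np.array([1.0, 1.5, 9.0])
mass = np.array([1.0, 1.0, 2.0])
age_edges = np.array([0.0, 2.0, 8.0, 14.0])

intrinsic_mw = sfh_weights(age_gyr, mass, age_edges)
recovered_mw = np.array([0.45, 0.05, 0.50])   # hypothetical fit output
residual_mw = sfh_residual(recovered_mw, intrinsic_mw)
print(intrinsic_mw, residual_mw)
```

Using luminosities instead of masses as weights gives the light-weighted version of the same histogram.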

DISCUSSION
Other works have presented methods to obtain mock MaNGA data from simulations (Ibarra-Medel et al. 2018; Duckworth, Tojeiro & Kraljic 2019; Nevin et al. 2021; Bottrell & Hani 2022; Sarmiento et al. 2023). In the following, we briefly compare these works to our project.

Ibarra-Medel et al. (2018)
Ibarra-Medel et al. (2018) post-process two simulated Milky Way-sized galaxies (from Colín et al. 2016). The synthetic spectra are constructed using a combination of MILES and GRANADA models (more details in Cid Fernandes et al. 2013), with 156 spectral templates that cover 39 stellar ages (from 1 Myr to 14.2 Gyr), 4 metallicities ($Z/{\rm Z}_\odot$ = 0.2, 0.4, 1, and 1.5), and a wavelength range between 3600 and 7000 Å. When generating the synthetic spectra, the SSP models are not interpolated to the exact intrinsic quantities as we do; rather, the 'closest' models in terms of age and metallicity are adopted. Because of this choice, Ibarra-Medel et al. (2018) have to distinguish between intrinsic and assigned stellar particles' properties. Also different from us, these authors do not run a radiative transfer code to model dust effects, but apply a simple dust extinction model.

The MaNGA fibre-bundle configuration is mimicked: each stellar particle is associated with one of the MaNGA fibres, each with an FoV of 2.5 arcsec, and the final spectrum in each fibre is given by the stack of all the spectra in it. To each stellar particle, one of the spectra from the templates is assigned; therefore, the final spectrum in each fibre is formed by a combination of discrete ages and metallicities.
This operation is repeated three times to mimic the dithering in MaNGA observations. The noise is implemented with a combination of sigmoid functions to mimic the eBOSS spectrograph behaviours at short and long wavelengths. This sample of mock galaxies is then used to test the ability of Pipe3D (Sánchez et al. 2016) to recover the stellar populations' properties. Ibarra-Medel et al. (2018) find some biases in the recovered properties; in particular, they recover younger stellar populations in the inner, older regions, and slightly older stellar populations in the outer, younger regions, as well as lower global stellar masses.

Duckworth et al. (2019)

Duckworth et al. (2019) construct the spectra for around 4500 galaxies in TNG100 (Marinacci et al. 2018; Naiman et al. 2018; Nelson et al. 2018; Pillepich et al. 2018b; Springel et al. 2018) adopting the FSPS stellar population models (Conroy & Gunn 2010). FSPS contains 22 metallicity steps (from $\log_{10} Z = -3.7$ to $\log_{10} Z = -1.5$), and 94 age steps (from $\log_{10} t = -3.5$ [Gyr] to $\log_{10} t = 1.15$ [Gyr]), and a wavelength range between 3600 and 7000 Å. They bin particles within spaxels with a side length of 0.5 arcsec, over a hexagonal fibre-bundle footprint, mimicking the MaNGA datacubes. In each bin, the mean velocity, velocity dispersion, and total flux for all particles are computed. No extinction is assumed. The SNR as a function of the effective radius in the g band is computed and used to assign noise to the synthetic spectra. A Gaussian kernel with 2 arcsec FWHM is assumed to mimic the MaNGA PSF. The datacubes are then re-binned with the Voronoi algorithm as in the official MaNGA DAP. Then, the intrinsic properties of the stellar and gas particles in each Voronoi tassel are used to investigate the relationship between the rotation of stars and gas with morphology and halo spin. The analysis is conducted over both TNG100 galaxies and MaNGA galaxies within the paper. They find a good agreement between TNG100 and MaNGA.

Figure 22. The SFH recovered by FIREFLY and intrinsic to TNG50, for two example galaxies in iMaNGA. The SFHs are represented as SSP MW (upper panels) or LW (central panels) as a function of look-back time. Left-hand panels: the MW SFH of galaxy ID 3 at z ∼ 0.03. In the top panel, the teal histogram shows the SFH resolved by FIREFLY compared to the black hatch-filled histogram, which illustrates the 'intrinsic' MW SFH reconstructed from TNG50 stellar particle data. The central panel is as the upper one, this time considering the light-weighted (LW) SFH: in particular, the LW SFH as recovered by FIREFLY (yellow histogram), and intrinsic to TNG50 galaxies (black empty histogram). In the bottom panels we report the residuals, i.e. the difference between what is recovered thanks to FIREFLY and what is intrinsic to the simulations; with the empty teal histograms we report the residuals for the MW SSP weights and in yellow for the LW ones. Right-hand panels: as the left ones, with the same meaning of symbols, this time for galaxy ID 561122 at z ∼ 0.03.

Nevin et al. (2021)

Nevin et al. (2021) test an algorithm to identify galaxy mergers with stellar kinematics. Five merging galaxies and matched isolated ones in GADGET (Springel, Yoshida & White 2001a) are considered. The stellar populations are modelled with the code SUNRISE (Jonsson 2006) based on STARBURST99 stellar population synthesis models (Leitherer et al. 1999), and synthetic datacubes are created. STARBURST99 has 5 metallicities (between $\log_{10} Z = -1.40$ and $\log_{10} Z = -3$), and the age coverage is between 1 Myr and 1 Gyr, with a wavelength range between 0.009 and 160 μm. Also in this work, as in ours, and differently from Ibarra-Medel et al. (2018), the stellar spectra are interpolated and therefore the intrinsic stellar population properties are used to define each spectrum. The effects of dust attenuation as well as AGN are included. A Gaussian kernel with FWHM equal to 2.5 arcsec is used to mimic the PSF in MaNGA. Then, the datacubes are re-binned to have a spatial sampling of 0.5 arcsec. Also in this work, the morphology is studied with STATMORPH on SDSS-like r-band images. The hexagonal FoV of the smallest fibre bundles in MaNGA capable of observing the galaxies within 1.5 $R_{\rm eff}$ is adopted. They produce a typical noise spectrum, which is then normalized and used to add random noise to each spaxel in the datacube. The MaNGA DAP is followed, except that pPXF is run only for the stellar component.

The methodology of this work is most similar to ours, albeit with some important differences. The convolution with the PSF in our pipeline is the last step, after the inclusion of the noise, such that the noise between adjacent spaxels correlates. Also, different from Nevin et al. (2021), we do not assume a Gaussian kernel for the PSF, but we reconstruct a MaNGA effective PSF, which includes dithering effects and seeing. The PSF in our approach is wavelength dependent like the ePSF in MaNGA. Instead of reconstructing a typical MaNGA noise, we reconstruct SNR as a function of the wavelength at 1.5 $R_{\rm eff}$ and then this information is used as in equation (4) in Paper I. Finally, we adopt population model spectra based on stellar spectra observed with the same SDSS spectrograph used for the MaNGA galaxies.

Bottrell & Hani (2022)
Bottrell & Hani (2022) present the code REALSIM-IFS, which is capable of modelling the instrumental sampling mechanics for any IFS instrument, MaNGA included. REALSIM-IFS includes the effects of atmospheric seeing, IFU fibre characteristics and setup (designs), dithered exposure strategy, line-spread function, and spatial reconstruction of fibre measurements. Bottrell & Hani (2022) test the application of REALSIM-IFS with TNG50 to create synthetic MaNGA stellar kinematic maps. To our knowledge, this method is the only one capable of reconstructing all the details of the MaNGA observational setup. On the other hand, spectra are not included in their work.

Sarmiento et al. (2023)

Sarmiento et al. (2023) apply the method by Ibarra-Medel et al. (2018) to TNG50 galaxies, using template spectra from Bruzual (private communication) based on the MaStar stellar library (Yan et al. 2019). These include a total of 273 synthetic spectra, with 39 ages between 0.0023 and 13.5 Gyr and 7 metallicities between 0.0001 and 0.43, and a linearly sampled wavelength range between 2000 and 10 000 Å. Stellar particles are associated with the closest one among these 273 synthetic spectra, as in Ibarra-Medel et al. (2018). This implies that also in their case, and differently from us, 'assigned' properties do not necessarily coincide with the 'intrinsic' properties predicted by the simulations (see their figs 6 and 7). The difference between assigned and intrinsic properties jeopardises the comparison with the recovered values, leading to larger scatters and potentially larger residuals.
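The gap between 'assigned' and 'intrinsic' properties under a closest-template scheme can be illustrated with a short sketch. It uses the four metallicities quoted above for the Ibarra-Medel et al. (2018) templates; the particle values, the function name, and the choice of measuring closeness linearly in $Z/{\rm Z}_\odot$ are assumptions made for this illustration.

```python
import numpy as np

def assign_nearest(values, grid):
    """Replace every intrinsic value with the closest grid value.

    Closeness is measured linearly here, which is an assumption of this
    sketch; a logarithmic distance would be an equally plausible choice.
    """
    values = np.asarray(values, dtype=float)
    grid = np.asarray(grid, dtype=float)
    nearest = np.abs(values[:, None] - grid[None, :]).argmin(axis=1)
    return grid[nearest]

# Template metallicity grid in solar units, and toy intrinsic particle values.
template_z = np.array([0.2, 0.4, 1.0, 1.5])
intrinsic_z = np.array([0.25, 0.65, 1.2, 2.0])

assigned_z = assign_nearest(intrinsic_z, template_z)
offset = assigned_z - intrinsic_z
print("assigned:", assigned_z)
print("assigned - intrinsic:", offset)
```

The non-zero offsets are present before any noise or fitting is applied, which is why an interpolation to the exact intrinsic values removes one source of scatter from the comparison with the recovered quantities.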

Mock MaNGA catalogues
Among these works, Duckworth et al. (2019) and Sarmiento et al. (2023) produced mock MaNGA catalogues. In both, the approach was to associate a galaxy in TNG directly with an observed MaNGA galaxy. Duckworth et al. (2019) look for unique matches for a total of 4500 galaxies in TNG100. Galaxies are matched by stellar mass, size, and SDSS g − r colour. Sarmiento et al. (2023) follow a similar approach, matching galaxies in mass, redshift, and effective radius, but they do not match galaxies uniquely; rather, they allow galaxies drawn from TNG50 to be selected multiple times. When a TNG50 galaxy is selected multiple times, it is observed along different lines of sight. In this way, Sarmiento et al. (2023) match the full MaNGA catalogue to galaxies in TNG50, which are, however, not all physically distinct.
Their approach is different from ours. In our work, we adopt the TNG50 catalogue and apply the MaNGA selection criteria to it in order to construct our mock MaNGA sample. As a consequence, there are no direct matches between MaNGA and TNG50 galaxies. Moreover, because we apply the MaNGA sample selection criteria to the relatively small simulated volume of TNG50, our iMaNGA catalogue is significantly smaller than the MaNGA sample.
Both approaches have benefits and pitfalls, but they serve different purposes. In our approach, we aim to observe the universe generated by the IllustrisTNG simulation as if it were observed by MaNGA, rather than matching the simulation to the observations. As a consequence, our iMaNGA catalogue allows us to test the characteristics of the simulation, such as the fraction of certain galaxy types, galaxy scaling relations, etc. For example, we find a paucity of massive elliptical galaxies. Interestingly, residual biases are still observed by Sarmiento et al. (2023), including an excess of disc galaxies at high mass, in agreement with our findings.

CONCLUSIONS
We present a mock MaNGA catalogue, called iMaNGA, generated from the state-of-the-art magnetohydrodynamical galaxy simulations TNG50. We illustrate the general characteristics of the iMaNGA sample in terms of morphology, kinematics, and stellar populations. We further run detailed tests comparing intrinsic galaxy properties from TNG50 with the recovered properties in iMaNGA derived through full-spectral fitting.
We identify galaxies in TNG50 through friends-of-friends algorithms, and initially select by redshift (0.01 ≤ z ≤ 0.15) and number of stellar particles (N > 10 000), and exclude all galaxies that are flagged as being of non-cosmological origin (for more details see Genel et al. 2017; Pillepich et al. 2018b, and the Data Specification page for IllustrisTNG). These criteria lead to the selection of 48 248 galaxies. Since we want to recover a smooth distribution in spatial sampling, as for the MaNGA-Primary sample, we then randomize galaxy redshifts in the initial sample, and select galaxies by i-band magnitude and redshift. In this way, we directly mimic the MaNGA selection criteria, which yields our 'parent sample' containing 3152 galaxies. We finally impose a flat distribution in absolute i-band magnitude following the MaNGA survey design, by randomly selecting unique galaxies from the parent sample. The resulting sample comprises 1000 galaxies from TNG50 and represents the final iMaNGA catalogue. We then post-process our iMaNGA galaxies following the method presented in Paper I (Nanni et al. 2022).

We investigate whether the iMaNGA sample reproduces general trends of the MaNGA-Primary sample. We find that angular sizes and spatial resolution (both in kpc and in terms of $R_{\rm eff}$) are consistent between iMaNGA and MaNGA. Likewise, the relationship between angular size and total stellar mass is matched well. The correlations of morphology with angular size and stellar mass are instead not recovered by the iMaNGA sample. These differences are driven by the fact that TNG50 is dominated by late-type systems, with a paucity of intermediate-mass lenticular galaxies and massive elliptical galaxies, as also found in e.g. Huertas-Company et al. (2019).
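The last selection step summarized above, imposing a flat distribution in absolute i-band magnitude by drawing unique galaxies from the parent sample, can be sketched as follows. This is a minimal illustration rather than the code used to build iMaNGA: the number of bins, the equal-count-per-bin strategy, the toy Gaussian magnitude distribution, and the function name `flatten_in_magnitude` are all assumptions. Only the parent-sample size (3152) and the target size (1000) are taken from the text.

```python
import numpy as np

def flatten_in_magnitude(abs_mag, n_target, n_bins, rng):
    """Pick unique indices so that each magnitude bin is equally populated,
    as far as the parent sample allows. Returns (indices, bin_edges)."""
    edges = np.linspace(abs_mag.min(), abs_mag.max(), n_bins + 1)
    # Bin index 0..n_bins-1 per galaxy (the maximum value is clipped in).
    bin_id = np.clip(np.digitize(abs_mag, edges) - 1, 0, n_bins - 1)
    per_bin = n_target // n_bins
    chosen = []
    for b in range(n_bins):
        members = np.flatnonzero(bin_id == b)
        n_pick = min(per_bin, members.size)
        # replace=False guarantees that no galaxy is selected twice.
        chosen.append(rng.choice(members, size=n_pick, replace=False))
    return np.concatenate(chosen), edges

rng = np.random.default_rng(2)
# Toy parent sample: 3152 galaxies with a peaked magnitude distribution.
parent_mag = rng.normal(-20.5, 1.0, size=3152)
selected, edges = flatten_in_magnitude(parent_mag, n_target=1000,
                                       n_bins=10, rng=rng)
counts, _ = np.histogram(parent_mag[selected], bins=edges)
```

In this toy case the sparsely populated bins at the bright and faint ends hold fewer galaxies than the per-bin quota, so the selected sample is flat only where the parent sample is dense enough. A real selection would need to choose the magnitude range and binning so that the target sample size is reached.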
We demonstrate a generally good agreement between 'recovered' and 'intrinsic' properties, including stellar kinematics, stellar population ages, metallicities, and star formation histories. In particular, the stellar kinematics is recovered by running the full-spectral fitting code PPXF, as in the MaNGA DAP. The 'intrinsic' stellar peculiar velocity and the stellar velocity dispersion along the LOS are both recovered within 1σ. The stellar population properties are recovered by running another full-spectral fitting code, FIREFLY. Both the 'intrinsic' stellar age and stellar metallicity are well recovered, i.e. the residuals over all tassels in the iMaNGA sample are consistent with zero within the 68 per cent confidence interval. We also show the 'recovered' and 'intrinsic' SFHs for 22 galaxies in the iMaNGA catalogue, which show a generally good agreement.
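The statement that residuals are consistent with zero within the 68 per cent confidence interval can be made concrete with a short sketch. The example below uses synthetic numbers only: the bias (0.03 dex), the scatter (0.2 dex), the sample size, and the function names are hypothetical and are not results from the iMaNGA analysis.

```python
import numpy as np

def residual_summary(recovered, intrinsic):
    """Return the 0.16, 0.5 and 0.84 quantiles of recovered - intrinsic."""
    residuals = np.asarray(recovered) - np.asarray(intrinsic)
    q16, q50, q84 = np.quantile(residuals, [0.16, 0.5, 0.84])
    return q16, q50, q84

def consistent_with_zero(q16, q84):
    """True if zero lies inside the central 68 per cent interval."""
    return bool(q16 <= 0.0 <= q84)

rng = np.random.default_rng(3)
# Toy data: log-ages per Voronoi tassel with a small bias and 0.2 dex scatter.
intrinsic_log_age = rng.uniform(0.0, 1.1, size=5000)
recovered_log_age = intrinsic_log_age + rng.normal(0.03, 0.2, size=5000)

q16, q50, q84 = residual_summary(recovered_log_age, intrinsic_log_age)
ok = consistent_with_zero(q16, q84)
```

Here a small median offset still counts as consistent with zero because the scatter is much larger than the bias. The criterion is therefore a statement about the residual distribution as a whole, not about the absence of any systematic trend.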
Finally, we present a comparison with other works which have produced mock MaNGA datacubes and/or mock MaNGA catalogues.
While there are a number of differences in methodology for the construction of the mock MaNGA datacube, the most important difference to highlight for this paper is the method of constructing the mock MaNGA sample. Other works in the literature assign simulated galaxies from TNG50 or TNG100 directly to MaNGA galaxies by matching their basic properties such as mass and effective radius, whereas we take the simulations at face value and apply the target selection criteria from MaNGA to the simulated catalogue. Both approaches have merits. While the former aims at producing a theoretical catalogue that resembles MaNGA as closely as possible, our approach is designed to test galaxy formation models. In our next paper in this series, we will present the scientific analysis of the iMaNGA catalogue in direct comparison with MaNGA.

APPENDIX A: RECOVERY OF STELLAR POPULATION PROPERTIES
In this appendix, we present additional tests of our capability to recover the stellar population properties age and metallicity. Fig. A1 shows the residuals as a function of the intrinsic properties, for both mass-weighted and light-weighted quantities, colour coded by the number of tassels. As in Fig. 21, the residuals are consistent with zero within the 1σ confidence level for the majority of the tassels. Small, negative correlations are present for age, indicating slightly positive residuals (hence overestimation of age) at low ages and slightly negative residuals (hence underestimation of age) at old ages. No such bias is seen for metallicity. Fig. A2 shows the relation between the residuals in metallicity and age, for both mass-weighted (left-hand panel) and light-weighted (right-hand panel) quantities, colour coded by the number of tassels. We can see again that the residuals are consistent with zero within the 1σ confidence level (the green solid lines mark the zero value). The degeneracy between age and metallicity is visible for the light-weighted quantities, i.e. when metallicity is overpredicted, age is underpredicted and vice versa.

Figure A1. The distribution of the residuals between the metallicity (left-hand panels) and age (right-hand panels) of the stellar populations as measured by FIREFLY and as given by TNG50, for all the Voronoi tassels in the iMaNGA sample, as a function of the intrinsic values, colour coded by the number of tassels, for mass-weighted (top panels) and light-weighted (bottom panels) results. The horizontal lines are the 0.16, 0.5, and 0.84 quantiles of these residual distributions, both light-weighted (red dashed lines) and mass-weighted (black solid lines). The green solid lines mark the zero value.

Figure A2. The distributions of the residuals of the stellar population age as a function of the residuals in the stellar population metallicity, as measured by FIREFLY and as given by TNG50, for all the Voronoi tassels in the iMaNGA sample, colour coded by the number of tassels, for mass-weighted (left-hand panel) and light-weighted (right-hand panel) quantities. The horizontal lines are the 0.16, 0.5, and 0.84 quantiles of these residual distributions, both light-weighted (red dashed lines) and mass-weighted (black solid lines). The green solid lines mark the zero value.
Finally, we discuss the SFHs recovered by FIREFLY considering all the Voronoi bins for each galaxy, following the same method as in Section 5.5.2. In Fig. A3, we plot 20 'intrinsic' SFHs in comparison with the 'recovered' SFHs for galaxies in the redshift range between 0.03 and 0.1, and with a total stellar mass between 9.5 and 11 in $\log_{10}(M_*/\mathrm{M}_\odot)$. We separate ellipticals (Sérsic index above 2.5, left-hand columns) from spirals (Sérsic index between 0.5 and 1, right-hand columns). Some systematic differences between intrinsic and recovered SFHs become apparent. The recovered SFHs tend to be flatter (more uniform in time) than the intrinsic SFHs, an effect that is more pronounced for the mass-weighted (MW) SFHs. Our fitting procedure seems to add a fraction of old populations, which is then balanced by a younger component.
For the spirals, instead, a preferred population at around 2 Gyr is seen in the recovered SFHs. A more detailed analysis of these discrepancies in light of our methodology for spectral fitting will be the subject of further work, which goes beyond the scope of this paper. It is important to underline here that for the 'intrinsic' SFH, all the stellar population particles present in the simulated galaxies are considered, while the 'recovered' SFH is instead obtained by running FIREFLY over the Voronoi-tessellated galaxy.
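As an illustration of how an 'intrinsic' SFH can be expressed as mass-weights or light-weights per lookback-time bin, the sketch below bins toy stellar particles. It is not the procedure used for Fig. A3: the particle ages, masses, the power-law luminosity scaling, the 1 Gyr binning, and the function name `sfh_weights` are all hypothetical. In particular, the luminosity scaling is only a device to make young populations brighter per unit mass and is not a stellar population model.

```python
import numpy as np

def sfh_weights(lookback_gyr, weights, bin_edges):
    """Fraction of the total weight (mass or light) per lookback-time bin."""
    hist, _ = np.histogram(lookback_gyr, bins=bin_edges, weights=weights)
    return hist / hist.sum()

rng = np.random.default_rng(4)
n_particles = 20000
# Toy stellar particles: formation lookback times (Gyr) and masses (Msun).
lookback = rng.uniform(0.1, 13.0, size=n_particles)
mass = rng.uniform(5e4, 1e5, size=n_particles)
# Hypothetical luminosity per particle: younger populations are brighter
# per unit mass. Illustrative power law only.
light = mass * lookback ** -0.8

edges = np.linspace(0.0, 14.0, 15)  # 1 Gyr wide bins
sfh_mass_weighted = sfh_weights(lookback, mass, edges)
sfh_light_weighted = sfh_weights(lookback, light, edges)
```

In this toy case the light-weighted SFH assigns a much larger fraction to the youngest bin than the mass-weighted one, which illustrates why the two representations are shown separately and why they can respond differently to fitting degeneracies.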
Figure A3. The SFHs recovered by FIREFLY and computed from the stellar particle information provided by TNG50, for 20 galaxies in the iMaNGA sample. These galaxies are selected from the iMaNGA sample in the redshift range between 0.03 and 0.1 and with a total stellar mass between 9.5 and 11 in $\log_{10}(M_*/\mathrm{M}_\odot)$, and are divided into ellipticals (Sérsic index above 2.5) and spirals (Sérsic index between 0.5 and 1). First two columns: the SFHs of elliptical galaxies in the defined redshift and mass range. In the first column, the teal histograms show the SFHs recovered by FIREFLY, compared to the black hatch-filled histograms, which illustrate the 'intrinsic' SFHs reconstructed from TNG50 stellar particle data. Here, the SFHs are represented as SSP mass-weights as a function of lookback time. The second column is as the first, this time showing the SFHs as SSP light-weights as a function of lookback time: the SFHs as recovered by FIREFLY (yellow histograms) and 'intrinsic' to the TNG50 galaxies (black empty histograms). Last two columns: as the first two columns, this time for spiral galaxies in the same range of mass and redshift.