Redshift and stellar mass dependence of intrinsic shapes of disc-dominated galaxies from COSMOS observations below $z = 1.0$

The high abundance of disc galaxies without a large central bulge challenges predictions of current hydrodynamic simulations of galaxy formation. We aim to shed light on the formation of these objects by studying the redshift and mass dependence of their intrinsic 3D shape distributions in the COSMOS galaxy survey below redshift $z=1.0$. This distribution is inferred from the observed distribution of 2D shapes, using a reconstruction method which we test using hydrodynamic simulations. Our tests reveal a moderate bias for the inferred average disc circularity and relative thickness, but a large bias on the dispersion of these quantities. Applying the reconstruction method on COSMOS data, we find variations of the average disc circularity and relative thickness with redshift of around $\sim1\%$ and $\sim10\%$ respectively, which is comparable to the error estimates on these quantities. The average relative disc thickness shows a significant mass dependence which can be accounted for by the scaling of disc radius with galaxy mass. We conclude that our data provides no evidence for a strong dependence of the average circularity and absolute thickness of disc-dominated galaxies on redshift and mass that is significant with respect to the statistical uncertainties in our analysis. These findings are expected in the absence of disruptive merging or feedback events that would affect galaxy shapes. They hence support a scenario where present-day discs form early ($z>1.0$) and subsequently undergo a tranquil evolution in isolation. However, more data and a better understanding of systematics are needed to reaffirm our results.


INTRODUCTION
The structure of late-type galaxies is comprised of different components with distinct kinematic and morphological characteristics. The defining component is the thin stellar disc with its spiral arm over densities (e.g. Hubble 1926Hubble , 1936Vaucouleurs 1959a,b;Freeman 1970;van der Kruit & Searle 1982). Observations at low redshifts and in our own Milky Way revealed that this thin disc is often enclosed by a thick stellar disc as well as a stellar halo with relatively low densities (e.g. Carollo et al. 2010;Trujillo & Bakos 2013; Martínez-Lombilla & Knapen 2019, among others). Near the galactic center, the spiral arms of most discs transition either into a bar or into a bulge. The contribution of these components to the overall mass and morphology of a given galaxy is determined by the ★ E-mail: kai.d.hoffmann@gmail.com galaxies' formation history (e.g. Binney & Tremaine 2008;Vogelsberger et al. 2020). Therefore, studying how the shapes of late-type galaxies (hereafter referred to as disc galaxies) are distributed at different epochs of the Universe can provide significant insights into the processes which dominate galaxy formation, as detailed below.
During the recent years it was pointed out that the observed abundance of a particular type of objects, namely disc-dominated galaxies with only a small or no central bulge, challenges our current understanding of galaxy formation in the ΛCDM model (e.g. Kormendy et al. 2010). As a matter of fact, bulges are expected to form in galaxies for several reasons. van den Bosch (1998) suggested that bulges can form "inside out" from the low angular momentum components of the initial gas overdensities in which the galaxies are born (see also Kepner 1999). In another popular scenario, bulges result from the redistribution of angular momentum within the disc by central bars, spiral arms or bending instabilities which distort the stellar orbits (Kormendy & Kennicutt 2004;Debattista et al. 2006). This so-called "secular evolution" leads typically to the formation of a small "pseudo" bulge, but may also promote the emergence of a larger "classical" bulge when occurring at high redshifts (Elmegreen et al. 2008;Bournaud 2016). A third and important scenario for bulge formation is the accretion of satellite galaxies, globular clusters or clumpy streams of cold gas, which can move low angular momentum material to the center of the disc (e.g. through violent disc instabilities) without destroying it (e.g. Walker et al. 1996;Dekel et al. 2009;Hopkins et al. 2009;Dubois et al. 2012;Kretschmer et al. 2020). In relatively rare occasions even major mergers can lead to the formation of a bulge-dominated discs, as discs can survive or reform if one of the progenitors is gas-rich (Toomre 1977;Negroponte & White 1983;Hernquist 1992;Naab & Burkert 2003;Hopkins et al. 2008Hopkins et al. , 2009Jackson et al. 2020), especially in the case of prograde mergers (Martin et al. 2018). Since mergers and cold streams are believed to play a significant role in the formation and growth of disc galaxies (e.g. Baugh et al. 1996;Kauffmann et al. 1993;Aguerri et al. 2001;Eliche-Moral et al. 2006) it is difficult to explain how a significant fraction of high mass discs (i.e. ★ 10 10 ) could grow without forming a large bulge (Kautsch et al. 2006;Kautsch 2009;Weinzirl et al. 2009;Buta et al. 2015).
One possible solution to this problem are feedback processes, driven by starbursts, supernovae or active galactic nuclei, which can suppress the formation of a bulge by removing low angular momentum material from the disc, in particular after gas-rich major mergers (e.g. Governato et al. 2010;Brook et al. 2011Brook et al. , 2012Hopkins et al. 2012;Übler et al. 2014;Dubois et al. 2016;Grand et al. 2017). However, simulations suggest that this suppression is only effective for discs with stellar masses of the Milky Way ( 6 × 10 10 M ) or below, but not at higher masses (Brooks & Christensen 2016). An additional challenge for formation scenarios including major mergers are the low density environments in which bulgeless discs typically reside (e.g. Grossi et al. 2018). Kormendy et al. (2010) argued that this finding speaks for a gentle, rather than a violent mass accretion history (see also Kormendy 2016;Jackson et al. 2020). Recently, Peebles (2020) therefore discussed an alternative solution to the problem, according to which small scale non-Gaussianities in the initial conditions of a warm or mixed dark matter universe could lead to a large fraction of bulge-less discs. However, this approach remains to be explored with simulations of galaxy formation.
With this work we aim to shed light on the formation of bulgeless discs. We therefore use a novel approach for discriminating the first scenario in which feedback processes suppress bulge formation after mergers from alternative, less violent mass accretion scenarios by comparing the distributions of shapes of observed discdominated galaxies at different redshift and stellar mass ranges. This approach relies on the hypothesis that disruptive events, such as mergers and strong feedback should dramatically affect the morphology of the remaining galactic disc, besides the absence of a bulge. Such changes can occur in the form of an increase in disc thickness which results from vertical heating by feedback and merging events (e.g. Quinn et al. 1993;Grand et al. 2016) or tidal debris of mergers (Abadi et al. 2003). The latter can furthermore lead to large sub-structures which decrease the disc circularity, such as accreted satellites or spiral arms that form due to tidal interactions during merging events (Springel & Hernquist 2005;Robertson et al. 2006;Peschken et al. 2020). We therefore expect mergers and strong feedback to cause a significant mass and redshift dependence of the discs thickness and circularity.
On the contrary, if the bulge-less discs underwent an early and regular accretion of mass, for instance through infalling smooth streams of cold gas at high redshifts followed by a calm evolution without major mergers, their morphologies should exhibit a much weaker or no dependency on mass and redshift. It has been discussed in the literature that a secular redshift evolution of the disc thickness can be expected from vertical heating by bars, spiral arms, stellar clumps and giant molecular clouds (e.g. Bournaud et al. 2009;Saha et al. 2010;Aumer et al. 2016). However, Grand et al. (2016) find that the effect of such internal perturbations is weak compared to those caused by merging events. Moreover, Park et al. (2021) do not find a significant redshift dependence of the disc thickness in an ensemble of simulated disc galaxies that underwent a quiescent growth without significant mergers since = 1.0.
A challenging aspect of studying observed galaxy morphologies is to interpret the projected two-dimensional (2D) galaxy shape distributions in terms of 3D models of galaxy formation. In this work we address this challenge by reconstructing the distribution of 3D galaxy shapes from the observed distribution of 2D shapes. We thereby assume a simple ellipsoidal model for the 3D light distribution within a given galaxy. The 3D shape of the ellipsoids is thereby fully characterized by two of the three possible ratios between the major, intermediate and minor axes ( 3 , 3 and 3 respectively), (1) For disc galaxies the 3 parameter can be regarded as a measure for the circularity, while 3 and 3 both quantify the relative disc thickness. The 2D galaxy shapes of the projected ellipsoid are described by the axis ratio 2 ≡ 2 2 . (2) The main advantage of such an ellipsoidal approximation is that the distribution of 2D axis ratios can be predicted from a given model for the 3D axis ratio distribution at low computational cost. This allows for the reconstruction of the 3D axes ratio distribution by tuning the corresponding model parameters such that the predicted distribution of 2D axis ratios matches observations. This reconstruction technique dates back to Hubble (1926) who derived first constraints on the intrinsic 3D axis ratios of galaxies by modeling them as oblate ellipsoids. Over the last century it has been shown that, despite its simplicity, this methodology reproduces the observed 2D axis ratio distribution of late-as well as of early-type galaxies, assuming oblate or prolate ellipsoidal models (e.g. Sandage et al. 1970;Binney 1978;Noerdlinger 1979). The agreement with observed axis ratio distributions was further improved by modeling galaxies as triaxial ellipsoids, which allowed for more detailed interpretations of the observations (Benacchio & Galletta 1980;Binney & de Vaucouleurs 1981;Lambas et al. 1992). These early studies where continued using larger samples to study the relation between intrinsic galaxy shapes and properties, such as size, luminosity and color in the local universe, observed by the Sloan Digital Sky Survey (SDSS, e.g. Ryden 2004;Vincent & Ryden 2005;Padilla & Strauss 2008;Rodríguez & Padilla 2013). The evolution of intrinsic shapes with redshift has been studied with the same approach in the galaxy surveys SDSS, 3D-HST, GOODS, COSMOS and CANDELS (Yuma et al. 2011(Yuma et al. , 2012Holden et al. 2012;Chang et al. 2013;van der Wel et al. 2014;Takeuchi et al. 2015;Zhang et al. 2019;Satoh et al. 2019). Physical interpretations drawn from these reconstructed axis ratio distributions rely on the validity of the ellipsoidal model for the 3D galaxy shapes as well as on the accuracy of the model for the 3D axis ratio distribution. In addition, observational sources of systematics on the observed 2D axis ratio distribution, for example those induced by dust extinction within the galaxy, need to be taken into account in the analysis. The objective of our analysis is therefore two fold. Our main goal is to constrain potential explanations for the absence of large bulges in disc galaxies based on the reconstructed 3D shape distribution of disc-dominated galaxies as outlined earlier in this section. Our investigation is based on data from the COSMOS galaxy survey (Scoville et al. 2007), which provides excellent space-based imaging of galaxies over a wide range of redshifts and stellar masses, and is therefore ideal for our analysis.
However, the assumptions on which the reconstruction of the 3D shape distribution is based are highly simplistic which may induce uncertainties in our as well as in previous analyses. Our second goal is therefore to test several of these assumptions and to assess the overall performance of the reconstruction method. For that purpose we employ for the first time two state-of-the-art cosmological hydrodynamic simulations of galaxy formation, Horizon AGN and Illustris TNG. These simulations provide 3D as well as projected 2D galaxy morphologies which allows for an examination of the reconstruction method under controlled conditions. This paper is structured as follows. In Section 2, we present the galaxy catalogues from the COSMOS survey and the hydrodynamic simulations and explain our sample selection. Section 3 provides details on the shape reconstruction method together with validations of the model assumptions and accuracy tests. The method is then applied on the COSMOS data in Section 4. A summary of our results can be found together with our conclusions in Section 5.

COSMOS observations
The 2 deg 2 COSMOS field (Scoville et al. 2007) has been observed extensively by different ground and space based telescopes, including Hubble, Spitzer, VISTA, CFHT and Subaru. The joint analysis of these observations led to different catalogues providing estimates of galaxy properties such as stellar masses, star formation rates and morphological characteristics over a large range in redshift and luminosity. Our analysis is based on three of these catalogues, which are described below.

Photometry
The public COSMOS2015 catalogue 1 (Laigle et al. 2016) comprises photometry in 30 bands, covering ultra-violet to mid-infrared wavelengths. In our analysis we use redshift, stellar mass and specific star formation rate (sSFR) 2 estimates provided in this catalogue, which were derived for each galaxy by fitting templates of spectral energy distributions (SEDs) to the photometric data (Ilbert et al. 2006). We discard objects which are i) residing in regions flagged as "bad" (mostly because they are close to stars or to the edge of the field) ii) saturated and iii) not classified as galaxies by requiring the corresponding catalogue flags to be [flag_hjmcc, flag_peter, type] = [0,0,0]. We further impose cuts at the limiting AB magnitudes in the near-infrared 1 https://www.eso.org/qi/ 2 We are aware that this sSFR estimates is not very robust, but we checked that it is sufficient for the purpose of our analysis.
-band of 24.0 and 24.7 in the deep and ultra ultra-deep fields respectively. These magnitudes are defined within a fixed 3 diameter aperture (Ks_MAG_APER3) and the limits correspond to a 3 detection. After applying these cuts the catalogue contains 252, 527 galaxies. For this sample the standard deviation of the relative differences in redshift with respect to the zCOSMOS-bright spectroscopic control sample (Lilly et al. 2007) is Δ /(1+ ) = 0.007 (catastrophic failure fraction = 0.5%). Note that this error is an optimistic estimate, since it was inferred by comparison to spectroscopic redshifts of bright galaxies and is higher for dimmer objects. However, due to the bright magnitude cuts used in this work (see Section 2.1.5) we expect this inaccuracy to be a realistic estimate for the galaxies in our samples. The accuracy on the stellar mass and star formation rate estimates is expected to be ∼ 0.1 dex and ∼ 0.2 − 0.6 dex respectively at < 1.5 . These values may be lower for our samples as we focus on bright objects only. Nevertheless, Laigle et al. (2019) find evidence that the strong scatter (and bimodality) in the estimates is mainly driven by the inaccurate modelling of dust extinction within the galaxies at the SED-fitting stage (see their Figure B3). We also find strong evidence of dust extinction in the galaxies from our sample (e.g. Fig.  D1). We therefore study the sSFR only for discs with high apparent axis ratios, which we consider to be inclined towards a face-on orientation at which the impact of dust extinction (and dust extinction modelling) is expected to be minimal.

Shapes
We use galaxy shape estimates from the public Advanced Camera for Surveys General Catalog (ACS-GC 3 , Griffith et al. 2012). This catalogue is based on Hubble Space Telescope (HST) imaging in the optical red broad band filter F814W. The absence of atmospheric distortions allows for an excellent image resolution, which is mainly limited by the width of the HST point spread function (PSF) of 0.085" in the F814W filter and the pixel scale of 0.03". Sources were detected using the G software (Häußler et al. 2011), which runs SE (Bertin & Arnouts 1996) and G (Peng et al. 2002) in two subsequent steps. Galaxy shapes are described by the two-dimensional major over minor axis ratios 2 , which are derived by G from fits of a single Sérsic model to each objects image. The surface brightness in this model is given by where Σ is the surface brightness at the effective radius and the parameter is chosen such that encloses half of the total flux. The Sérsic index quantifies the concentration of the surface brightness profile. The 2D axis ratio 2 (provided as BA_GALFIT_HI in the catalogue) enters Equation (3) via = √︁ 2 + ( / 2 ) 2 , where and are the coordinates on the major and minor axis respectively. The modeled surface brightness profiles are convolved with the ACS PSF before being compared to the reference observation during the fit. The morphological parameters from G hence describe the intrinsic 2D galaxy shapes and do not require further PSF correction. We select objects from the catalogue which were classified by SE as galaxies (CLASS_STAR_HI < 0.1) and with good fits to the G model (FLAG_GALFIT_HI=0 and CHI2NU_HI < 2). From the remaining sample we reject 65 objects Figure 1. Random examples of ACS images of late-type galaxies in our volume limited COSMOS sample in the redshift range 0.2 < < 0.4. The galaxies are selected to be disc-dominated according to the ZEST morphological classification scheme (Section 2.1.3). Galaxies in the top and bottom panels are further classified by ZEST as being face-on and edge-on oriented respectively. The three columns on the left (right) show galaxies in our low (high) mass sample, with stellar masses below (above) ★ = 10 10. 35 . The 2D axis ratios 2 , provided in the ACS-GC catalogue, were obtained from single Sérsic profile fits (Section 2.1.2). The image sizes are adjusted for each galaxy.
which have axis ratios equal to zero or larger than unity or effective radii from G above 750 ACS pixels (22.5 ). After these cuts the final catalogue contains 128, 365 objects.

Morphological classification
For the selection of disc-dominated galaxies we use the Zurich Structure & Morphology Catalog 4 which is derived from the same HST imaging data as the ACS-GC. It provides a morphological classification for each galaxy, derived with the Zurich Estimator of Structural Type and is referred to as ZEST catalogue in the following. The ZEST classification is based on a principal component analysis of five non-parametric diagnostics: asymmetry, concentration, Gini coefficient, 2nd-order moment of the brightest 20% of galaxy pixels (M 20 ) and ellipticity (see Scarlata et al. 2006Scarlata et al. , 2007Sargent et al. 2007, who also provide validations of their classification). The catalogue contains galaxies in the COS-MOS field brighter than = 24. We select objects, which i) are classified as galaxies ([acs_mu_class, stellarity]==[1,0]), ii) do not reside in automatically or manually masked regions ([acs_mask, acs_masked]==[0,1]) and iii) are not flagged as unusable or spurious ([acs_clean, junkflag]=[1,0]). After applying these conditions the remaining sample contains 108, 800 galaxies. The morphological classification is considered to be unreliable for galaxies with half-light radii smaller than twice the size of the ACS F814W PSF (i.e. 0.17 , that is 5.6 times larger than the ACS pixel size), which we take into account in our sample selection (Section 2.1.5). We validate the morphological classification for the disc-dominated galaxies used in our analysis in Section 2.1.5 and in Appendix A.

Matched catalogue
Matching objects in these three COSMOS catalogues is not straightforward due to variations in common properties such as positions and magnitudes, which can lead to spurious mismatches. These variations can originate from atmospheric distortions in the groundbased COSMOS2015 data, differences in the employed image analysis software (i.e. SE and G ) and its configuration, in the employed telescopes, cameras and filters, as well as different quality cuts applied before matching. In order to minimise the chance for mismatches, galaxies are matched based on angular positions as well as on magnitudes. We start by matching objects in the COSMOS2015 and ACS-GC catalogue in three steps. 1) We select pairs of galaxies as candidate matches if their angular separation is < 0.6 , which is slightly below the typical seeing of the ground-based telescopes contributing to the COSMOS survey. 2) We discard candidate matches with a difference in brightness of more than 1.0 magnitude. 3) We finally select the matches as those with the smallest difference in angular positions. The COS-MOS2015 magnitudes used in step 2) are measured in the Subaru + band within a fixed 3 aperture (referred to as ip_mag_aper3 in the catalogue). They are compared to the SExtractor magnitudes MAG_BEST_HI from ACS-GC which are defined as magnitudes measured within a flexible elliptical aperture (referred to as MAG_AUTO) or corrected isophotal magnitudes if contaminating sources are located in the vicinity. The average wavelengths weighted by transmission in the Subaru + and ACS F814W filters are 7683.88 Å and 8073.43 Å respectively, while the width and shape of their transmission curves differ significantly. These differences are accounted for by the relatively large tolerance in magnitude. We identify 98, 604 objects in the matched COSMOS2015 and ACS-GC catalogue. From these matched objects 50 (95)% have less than 0.07 (0.22) and 0.12 (0.42) magnitude differences in their matched angular positions and luminosities respectively, which is well below the chosen tolerances described above and indicates that the match is robust. Subsequently we matched the joint COSMOS2015 and ACS-GC catalogue with the ZEST catalogue using the same three-step method with the same tolerances for magnitudes and angular positions. The ACS-GC MAG_BEST_HI are now compared to the SExtractor ACS_MAG_AUTO provided in ZEST.
The final matched catalogue contains 70, 708 objects from which 50 (95)% have less than 0.03 (0.2) and 0.03 (0.16) magnitudes differences in their matched angular positions and lumin-osities respectively. These smaller differences compared to the first matching between COSMOS2015 and ACS-GC can be attributed to the fact that the positions and magnitudes used for the second matching are all derived with SExtractor from the same HST ACS imaging data. A second reason is the cut at = 24 in ZEST, which excludes the dimmest objects with highest uncertainties on position and luminosity estimates. This latter cut also explains the strong drop in the number of objects in the final catalogue.

Volume limited main sample of disc-dominated galaxies
Our study is focused on disc-dominated late-type galaxies in the matched catalogue, which we identify as those with ZEST parameters type = 2 and bulg = 2 or 3. Among them, we select a volume limited main sample adopting cuts in photometric redshift, absolute Subaru + magnitude and comoving effective radius. The selection is displayed in Fig. 2 and described below. Examples of the galaxies in our main sample are shown in Fig. 1 and A1.

Redshift cuts.
The redshift range of our main sample is set to 0.2 < < 1.0 (marked by vertical red dashed lines in Fig. 2). The upper limit is a compromise between a deep selection in redshift and a sufficiently high number of galaxies in the volume limited sample. The lower limit is defined by the end of the distribution, below which very few objects are found due to the small volume of the light cone. The redshifts on which the cuts are applied are the median of the likelihood distribution from the SED template fits, which are referred to as photoz in the COSMOS2015 catalogue.
2.1.5.2 Magnitude cuts. Galaxies are further selected to have apparent isophotal magnitudes in the Subaru + band (referred to as ip_MAG_ISO in the COSMOS2015 catalogue and hereafter as ) brighter than max = 24 above which the ZEST morphological classification becomes unreliable (Scarlata et al. 2007). The apparent magnitude cut introduces a redshift dependent selection by the absolute magnitude which hampers the comparison of galaxy populations at different redshifts (see bottom panel of Fig. 2). We therefore require the absolute restframe Subaru + magnitudes (hereafter referred to as ) to be brighter than max = −21.5, ensuring that all objects in our redshift range are sufficiently bright to be unaffected by the apparent magnitude cut.
Note that the cut in absolute magnitude is chosen to be brighter than a naive cut at max = max − DM( max = 1.0) = 20.1 (indicated as blue dotted line in the bottom panel of Fig. 2) to mitigate effects of dust extinction on the observed axis ratio distribution. In the top panel of Fig. 3 we show that the apparent magnitude of discs is fainter for objects with low axis ratios. This effect can be expected from an increased dust extinction in objects that are inclined towards an edge-on orientation (e.g. Graham & Worley 2008). As a consequence, high redshift discs (at 1.0) that have low axis ratios ( 2 0.5) and where selected by max = −20.1 can fall below the apparent magnitude cut of max = 24, marked as horizontal dashed line in Fig. 3. Dust extinction can therefore introduce a bias in the observed axis ratio distribution towards apparently rounder (i.e. face-on) galaxies, as we discuss in more detail in Appendix D.
In the bottom panel of Fig. 3 we demonstrate that selecting galaxies with absolute magnitudes brighter than = −21.5 ensures that that the apparent magnitudes are brighter than max = 24, even if the galaxies are seen edge-on ( 2 1) at the highest redshifts considered in this work. Figure 2. Selection of the volume limited main sample from disc-dominated galaxies in the matched COSMOS catalogue by photometric redshift , transverse comoving radii ⊥ and apparent and absolute band magnitudes and respectively. The cuts on each variable are marked by red dashed lines, enclosing the selected sample in the red area. The blue dotted and dashed-dotted lines in the top panel show the comoving radii which correspond to one and two times the angular size of the HST PSF respectively. The blue dashed-dotted line in the bottom panel shows the limit on , given by the apparent magnitude cut and the distance modulus . The horizontal blue dotted line in the same panel shows a naive cut on (see Section 2.1.5 for details). The over-abundances at ∼ 0.35 and ∼ 0.7 are known large-scale structures (see e.g Fig. 4 in Gozaliasl et al. 2019;Guzzo et al. 2007), while some sharp vertical spikes in the redshift distribution might be artefacts of the photometric redshift estimation.

Size cuts.
Our final selection addresses the sizes of galaxies. We require objects in our sample to have angular radii which are at least twice as large as the width of the PSF in the HST F814W imaging (i.e. min = 0.17 ) to minimize the effect of inaccuracies in the PSF correction on shape measurements as well as PSF effects on the morphological classifications (Scarlata et al. 2007;Griffith et al. 2012). Simply applying a cut on would lead to objects with smaller physical sizes entering the main sample at lower redshifts. This effect could introduce an apparent evolution of the galaxy shapes with redshift since the apparent physical sizes to elucidate the correlation between both observables. Face-on discs with 2 1 appear brighter than edge-on discs, which indicates a dependence of dust extinction on the disc inclination relative to the observer. Top panel shows the volume limited sample selected as shown in Fig. 2, but with a more conventional cut on the absolute band magnitude of < −20.1. The distribution of this sample is cut off at the apparent magnitude limit at = 24, which biases the 2 distribution, as illustrated in Fig. D1. We mitigated this bias with a more conservative cut of < −21.5. The resulting sample, shown in the bottom panel, is only weakly affected by the cut on . and shapes are correlated (see top panels of Fig. 4 and discussion below). We therefore apply a cut on the transverse comoving radii ⊥ ( ) = ( ), which ensures that the observed angular radii of the galaxies in our sample are always larger than min , even for the most distant objects at max = 1.0. Here ( ) is the angular diameter distance at the source redshift . Assuming a flat ΛCDM universe we obtain the condition ⊥ > min We show ⊥ corresponding to one and two times the angular width of the HST PSF (0.085 ) at a given redshift as dotted and dashed-dotted blue lines respectively in the top panel of Fig. 2 together with the cut on ⊥ which defines the main sample (red dashed horizontal line). The angular radii are the PSF corrected effective radii from the GALFIT single Sèrsic model fits, which quantify the galaxy size along the projected major axis (referred to as RE_GALFIT_HI in the ACS-GC).
For opaque discs with zero thickness the size of the projected major axis would be equal to the three-dimensional major axis, independent of its inclination with respect to the plane of projection. In that case a cut by projected size would not introduce a cut by shape or inclination with respect to the observer. However, real disc galaxies have a non-zero thickness and are not opaque but to a certain degree transparent. Their observed surface brightness profile is hence affected by projection effects, which depend on the Comoving transverse effective radii ⊥ of disc-dominated galaxies in our volume limited COSMOS sample verses the apparent 2D axis ratio 2 . Red bars mark the average ⊥ in bins of 2 and show that the apparent size depends on the apparent axis ratio, presumably as an effect of the disc inclination. The horizontal dashed lines show minimum value for ⊥ used as cut in our sample selection. Bottom: Distribution of 2 for disc galaxies before and after the size cut (black dashed and red solid lines respectively). This figure demonstrates that our size cut does not bias the 2 distribution of our sample.
inclination angle with respect to the observer. As a consequence the diameter of the observed 2D isophotes is larger for edge-on than for face on discs (e.g. Holmberg 1946;de Vaucouleurs 1959;Heidmann et al. 1972). In the top panels of Fig. 4 we show for galaxies brighter than our absolute magnitude cut that ⊥ increases significantly with decreasing apparent axis ratios, indicating the expected dependence of ⊥ on the disc inclination. However, for our bright sample and within the redshift range we consider, the lower limit on ⊥ is sufficiently small to have only a negligible impact on the observed distribution of axis ratios as shown in the bottom panels of Fig.  4. We therefore do not expect a relevant impact of the size cut on the disc inclinations of our sample, which could otherwise bias the axis ratio distribution towards edge-on objects (e.g. Huizinga & van Albada 1992). All cuts defining the volume limited sample are summarized in Table 1. After applying these cuts the final volume limited main sample contains 3, 739 disc-dominated galaxies in total. The redshift distributions of these galaxies is shown in Fig. 5. Fig. 6 we display the relative abundance of the disc-dominated galaxies in our main sample with respect to the total galaxy population in that sample in three redshift bins. These abundances are compared to those of galaxies classified as discs with large bulges (ZEST parameter type=2 and bulg=0 or 1), as elliptical (type=1) and irregular (type=3). The figure confirms reports from the literature that a large fraction of discs galaxies has no significant bulge. The fraction of discs remains roughly constant at 80%, with a slight increase with redshift. It is further interesting to note that at 1.0 the majority of discs has no large bulge, while the opposite is the case at 0.4. The decline of the fraction of bulgeless discs with decreasing redshift lines up with the findings from Sachdeva (2013) based on Chandra Deep Field observations. Fig. 6 further shows that the rate at which this fraction declines is similar to the rate at which the fraction of discs with large bulges increases. This finding is consistent with the scenario in which discs with large bulges form out of bulgeless discs, potentially during mergers. Note, the absolute values of the different fractions may be specific to our sample selection. In particular the lower limit on the size separates out many elliptical galaxies, which tend to be more compact than discs, increasing the fraction of discs in our sample.

Type and color distributions. In
For further validation of our data set we compare the colorcolor diagrams of the disc-dominated and elliptical galaxies in our main sample in Fig. 7. We find that the discs and the ellipticals reside preferentially in the blue star forming and red quenched sequence respectively, which are well separated by the color-color cuts from Laigle et al. (2016). This result can be expected in general from the well known correlation between galaxy morphology and color (e.g. Larson Color-color diagram, based on estimates of the absolute restframe magnitudes in the , and filters from the COSMOS2015 catalogue for galaxies which pass our main sample selection on size, magnitude and redshift. Red and blue dots show objects which are classified in the ZEST catalogue as early-type and disc-dominated late-type respectively. The black dashed line is taken from Laigle et al. (2016) and separates the quenched and the star forming populations. et al. 2009). We find that the bulge dominated discs in our sample populate both, the quenched as well as the star forming sequences (not shown in the figure for clarity). This result lines up with the large bulges found in discs on the red quenched sequence, for instance in data from COSMOS and the Sloan Digital Sky Survey (see Bundy et al. 2010;Guo et al. 2020). Overall, the expected correlation between color an morphology shown in Fig. 7 confirms a posteriori that the morphological classification from ZEST and the photometric properties from the COSMOS2015 are consistent with each other. This test shows that the matching between both catalogues is sufficiently reliable for deriving physical interpretations from their combined morphological and photometric information.

Stellar mass -redshift samples
In order to investigate the redshift and stellar mass dependence of the galaxy shapes, we select three redshift sub-samples and two stellar mass sub-samples from the COSMOS main sample as shown in Fig. 5 and in the top panel of Fig. 8 respectively. For more detailed investigations we further select six sub-samples, which are selected by both, redshift and stellar mass, as shown in the lower panels of Fig. 8. The width of the redshift bins are chosen such that each sample contains a sufficiently large number of objects required for our statistical analysis. The stellar mass cut at cut ★ = 10 10.35 M corresponds to the median stellar mass of the main sample and lies close to the median stellar mass of the redshift sub-samples. The cuts in mass and redshift are summarized in Table 5 together with the number of galaxies in each sub-sample.
We show in the left column of Fig. 9 the distribution of the observed shapes from the ACS-GC catalogue, quantified by the axis ratio 2 , in the three redshift bins. We find that at all redshifts the distributions of the low and high mass samples are skewed towards high and low 2D axis ratios respectively. The difference between the mass samples increases with decreasing redshifts, which is mainly driven by an increasing fraction of high mass galaxies with low 2D axis ratios ( 2 0.3). The axis ratio distribution of low mass galaxies on the other hand shows no strong change with redshift. In fact, we argue in Section 4 that the redshift dependence of both samples is not significant when considering shot-noise errors on the binned distributions.
Nevertheless, a weak trend in redshift can also be seen in the transverse comoving radii ⊥ in the central column of Fig. 9. These radii are shown for discs inclined towards a face-on orientation with apparent axis ratios 2 > 0.5 to reduce the impact of inclination on the observed size (see Fig. 4). As for the axis ratios we find that at high redshift, the size distribution for low and high mass samples is similar. At lower redshift, the deviation between both distributions is slightly larger, mainly due to an increase of the sizes in the high mass disc sample. This trend can be seen in the average radii, displayed as dotted and dash-dotted vertical lines for the high and low mass sample, respectively.
A non-geometric galaxy property which follows the same behaviour is the specific star formation rate ( ) from the COS-MOS2015 catalogue, whose distribution is displayed in the right column of Fig. 9. The selection here is again restricted to galaxies with 2 > 0.5 to mitigate a potential bias in the estimates induced by inaccuracies of dust attenuation model employed in the SED fitting, as reported by Laigle et al. (2019). We see that the distribution is similar for the low and high mass samples at high redshifts. At low redshifts, both distributions differ significantly, which is mainly driven by a strong decrease of the for high mass discs. This result lines up with those of Grossi et al. (2018), who find that the of disc dominated galaxies in COSMOS field decreases with increasing stellar mass and decreasing with redshift. It is worth noting that the mass and redshift dependence is weaker for the (not shown here) than for the . This weaker dependence is consistent with the overall weak redshift dependence of star formation in nearby galaxies, suggested by findings of Kroupa et al. (2020) and may result from a partial compensation by the mass differences between the sub-samples. Overall, our findings indicate a correlation of the galaxy shapes, sizes and specific star formation rates with the stellar masses, which appear to increase as galaxy formation proceeds. We will discuss this result in more detail together with the reconstructed 3D axis ratio distribution in Section 5.
We stress here that our discussion of differences between the ⊥ and distributions from different mass and redshift samples discussed above is based on our visual impressions. Clarifying whether or not these differences are significant would require a statistical analysis, which is beyond the scope of this work.

disc galaxies in hydro-dynamic simulations
We use the state-of-the-art hydrodynamic simulations of galaxy formation Horizon-AGN 5 and Illustris TNG100 6 (hereafter referred to as HAGN and TNG100 respectively) to test the methods and assumptions on which our analysis of the COSMOS data is based. These tests are performed at redshift = 1.0, were the available snapshots of these simulations are closest to the maximum of the redshift distribution of our volume limited COSMOS sample. Both simulations assume a ΛCDM cosmology with recently constrained parameters, cover similar cosmological volumes and include subgrid mechanisms to model the formation of galaxies at a similar degree of complexity, as detailed below. However, their simulation techniques differ significantly, which allows for testing the robustness of our conclusions. The main characteristics of these simulations are compared to each other in Table 2. Note that these simulations where run at relatively low resolutions in order to cover large . Probability distributions of apparent 2D axis ratios, transverse comoving radii and specific star formation rates (left, central and right panels respectively) for disc-dominated galaxies in our volume limited COSMOS main sample with stellar masses below and above 10 10.35 M (blue and red histograms respectively). Vertical panels show results in three redshift bins with limits indicated on the right. The comoving radii and specific star formation rates are shown for discs with apparent axis ratios above 2 > 0.5, to minimize bias induced by projection and dust extinction (see Fig. 4 and D1). Vertical black dashed lines mark the minimum radius of the volume limited sample. Vertical solid and dotted lines indicate the mean radii for each mass sample over the full redshift range and in each redshift bin respectively.
volumes, which has a noticeable impact on the galaxy morphologies as discussed in Section 3.1.

Horizon-AGN
The hydrodynamic simulation HAGN was produced with the gridbased adaptive-mesh-refinement code RAMSES (Teyssier 2002), using cosmological parameters compatible with the constraints from WMAP-7 (Komatsu et al. 2011). The simulation includes the key processes relevant for galaxy formation: cooling, heating and chemical enrichment of gas, the formation and evolution of stars and black holes as well as feedback from stellar winds, supernovae and Active Galactic Nuclei (AGNs) (Dubois et al. 2014). Galaxies were identified in the distribution of stellar particles as groups with more than 50 members using the AdaptaHOP finder (Aubert et al. 2004).
A key achievement of HAGN with respect to previous generations of cosmological hydrodynamic simulations is the broad diversity of galaxy morphologies, with realistic fractions of discy and elliptical objects (Dubois et al. 2016). A reasonable agreement with observed luminosity functions and color distributions has been shown by Kaviraj et al. (2017), while moderate deviations from observed angular clustering have been found by Hatfield et al. (2019). For our analysis we select disc galaxies, which we define via the ratio = rot / between the stellar rotation rot , defined as the mean tangential velocity of stellar particles with respect the galaxies' spin axis and the velocity dispersion (see Chisari et al. (2015) for details). High values of indicate relatively high rotational velocities compared to those radial or parallel to the spin axis. For our analysis we select the 20% of galaxies with the highest values of , which corresponds to a cut at > 1.06 at redshift = 1.0 (Fig. 10). We attribute this low value of to the relatively low resolution of the simulation and demonstrate in Section 3.1 that the selected galaxies are indeed discy, although with "puffed-up" shapes, characterized by relatively high 3 / 3 axis ratios. We further limit the selection to galaxies with more than 500 particles, which corresponds to a stellar mass cut at ★ > 10 9 M , to ensure reliable measurements of the morphological properties (see also Section 2.2.3). The final sample contains 14198 disc galaxies.

Illustris TNG100
The TNG100 (Springel et al. 2018;Naiman et al. 2018;Nelson et al. 2018;Marinacci et al. 2018;Pillepich et al. 2018b) is a magnetohydrodynamic simulation, produced with the moving-mesh code AREPO (Springel 2010), which was run with cosmological parameters from the Planck Collaboration et al. (2016). It includes the same key processes for modeling galaxy formation as HAGN, although with significantly different implementations which are described in Pillepich et al. (2018a). Galaxies are identified in dark matter subhaloes with non-zero stellar components. These subhaloes are detected in friends-of-friends groups by the SUBFIND algorithm (Davis et al. 1985;Springel et al. 2001;Dolag et al. 2009). While it has been calibrated to match the galaxy mass function at = 0, the simulation has been shown to predict main characteristics of observed galaxy populations reasonably well, such as morphological diversity (Rodriguez-Gomez et al. 2019), stellar mass functions at higher redshift (Pillepich et al. 2018b), the color bimodality (Nelson et al. 2018) as well as color-dependent two-point clustering statistics (Springel et al. 2018). The basic properties of the simulation are compared to those of the HAGN simulation in Table  2.
To identify discs in the TNG100 simulation we use each  galaxy's fraction of stellar mass in the disc component with respect to the total stellar mass. For this purpose stellar disc particles have been defined via their circularity parameter ≡ / ( ), where is the specific angular momentum around a selected z-axis and ( ) is the maximum specific angular momentum possible at the specific binding energy of a given stellar particle.
The -axis is thereby given by the principle angular momentum vector of the star forming gas, or the stellar particles, if there is no star forming gas in the system (Vogelsberger et al. 2014;Genel et al. 2015). Particles with > 0.7 are considered to belong to the disc component (Abadi et al. 2003). The circularity parameter is provided on the Illustris database for all subhalos with ★ > 3.4 × 10 8 M within the half mass diameter 2 1/2,★ and at least 100 stellar particles. We classify TNG100 galaxies as discs if their fraction of disc particles is above 0.35, which corresponds to the upper 20% of the distribution at = 1.0, as shown in Fig. 10. As for HAGN, we attribute this low fraction of disc particles used for the cut to the limited resolution of the simulation and demonstrate in Section 3.1 that it is a reasonable choice for selecting discy objects. Note that we choose the disc particle fraction for the morphological classification in TNG100 since this quantity is provided for the = 1.0 snapshot on the public TNG database in contrast to . Besides this practical motivation, using two different morphological classification schemes in both simulations allows for drawing more robust conclusions regarding the validation of our analysis methods. The final sample contains 5, 674 disc galaxies.

Axis ratio measurements
The galaxies' axis ratios provided for the HAGN and TNG100 simulations are computed from the moment of inertia of their stellar mass distributions where ★ is the number of stellar particles in the galaxy, ★ is the stellar mass of the th particle, ★ = ★ ★ , and are the components of the particle position vectors, defined with respect to the center of mass (Chisari et al. 2015;Genel et al. 2015).
The moment of inertia can be defined via the particle positions in 3D as well as in 2D. In the latter case the positions are projected along one coordinate axis of the simulation, assuming an observer at infinity. In the 3D case the square roots of the three absolute eigenvalues 1 ≥ 2 ≥ 3 provide a measure of the 3D major, intermediate and minor axis lengths, respectively, i.e. ( 3 , 3 , 3 ) = √︁ ( 1 , 2 , 3 ). Accordingly, the 2D major and minor axis lengths are given by respectively. The axis ratios are defined according to Equation (1) and (2). We expect that the bias on such axis ratio measurements, which results from the discreteness of the particle distributions (e.g. Joachimi et al. 2013;Hoffmann et al. 2014;Chisari et al. 2015), is negligible, due to the lower mass limits imposed on our sample (see Section 2.2.1 and 2.2.2). The 2D axis ratios of the projected stellar mass distributions in HAGN are used to test our method for reconstructing the 3D axis ratio distribution. For TNG100 such measurements are currently not publicly available. However, we use 2D axis ratios measured from second-order moments in synthetic images from TNG100 galaxies from Rodriguez-Gomez et al. (2019). These images have been produced using the SKIRT radiative transfer code 7 (Baes et al. 2011;Camps & Baes 2015) by the Sloan Digital Sky Survey in the and broad band filters. The axis ratios are obtained from the second-order moments of the pixels in these images that belong to the source. Since these measurements are not PSF-corrected, we do not use them for testing the shape reconstruction method. Instead, we use them to obtain a rough estimation of how strongly the observed galaxy shapes depend on the wavelength range in which they are measured in order to interpret the COSMOS observations in Appendix E.

RECONSTRUCTING 3D GALAXY SHAPE DISTRIBUTIONS FROM 2D OBSERVATIONS
We now present the method used for constraining the distribution of 3D galaxy axis ratios from the observed distribution of 2D axis ratios in COSMOS and test this method using galaxy shapes from the TNG100 and HAGN simulations. The method is based on the assumption that each galaxy can be represented by an absorptionfree, self-similar, coaxial ellipsoidal stellar system, to which we refer to as 3D ellipsoid in the following. The shape of its 3D luminosity density is fully described by the two axis ratios 3 and 3 , given by Equation (1). We expect such a one-component model to be a simplistic, but useful description for the objects in our observed sample: disc-dominated galaxies, discarding bulge-dominated and irregular objects (see Section 2.1.3). However, substructures such as spiral arms could bias our results (see Fig. 1). The isodensity contours of the projected luminosity density (i.e. the isophotes) of such model galaxies are self-similar, coaxial ellipses (Stark 1977), whose 2D axis ratios can be obtained from the 3D axis ratios analytically, as detailed in Appendix B. This allows for an efficient prediction of the 2D axis ratio distribution, ( 2 ), for an ensemble of randomly oriented model galaxies with a given distribution of 3D axis ratios, ( 3 , 3 ). This distribution can hence be constrained by comparing the corresponding ( 2 ) prediction to observations. This approach relies on a physically meaningful model for ( 3 , 3 ), which we motivate in Section 3.1. The free model parameters are obtained from the COSMOS data using Bayesian interference as detailed in Section 3.2 and 3.3. In the latter section we also study the impact of inaccuracies of the employed ( 3 , 3 ) model on the inferred 3D axis ratio distribution.

Model for the 3D axis ratio distribution
The reconstruction of 3D axis ratio distributions relies on a physically meaningful model for those distributions. Several of such models have been proposed in the literature. Sandage et al. (1970) found that the 2 distribution of spiral galaxies can be fitted reasonably well with a simple oblate disc model according to which 3 = 1 for all objects while 3 is normal distributed around 3 0.25. Later studies based on larger samples found that the 2 distribution of spirals is better fitted using slightly triaxial disc models which describe the absence of perfectly circular face-on spirals in observations (e.g. Binney & de Vaucouleurs 1981;Fasano et al. 1993). Using normal distributions for 3 and 3 Lambas et al. (1992) obtained good fits to the observations from the APM Bright Galaxy Survey. The good performance of this model has been confirmed by recent results from Satoh et al. (2019) based on COSMOS data. Ryden (2004) found that a normal distribution for 3 and a log-normal distribution for the ellipticity 3 ≡ 1 − 3 delivers good fits to the 2 distribution of spirals in SDSS observations, which was confirmed by later studies (e.g. Padilla & Strauss 2008;Rodríguez & Padilla 2013). An alternative model based on normal distributions of the ellipticity 3 ≡ 1 − 3 and the triaxiality ≡ (1 − 2 3 )/(1 − 2 3 ) proved to match the 2 distribution from SDSS, 3D-HST, COSMOS and CANDELS observations (Chang et al. 2013;van der Wel et al. 2014). These different models were motivated by the fact that their prediction for the 2D axis ratios (or ellipticities) fit observations reasonably well, which indicates that observations do not provide a strong constrain on the functional form of the 3D axes ratio distribution.
In our analysis we use an approach that overcomes this limitation by employing a new model for the 3D axis ratio distribution which we validate using the HAGN and TNG100 simulations, described in Section 2.2. Since cosmological hydrodynamical simulations reproduce observed galaxy morphologies to varying degree of success and with a known dependence on simulation techniques, sub-grid models and resolution Dubois et al. 2016;Correa et al. 2017;Park et al. 2019;Rodriguez-Gomez et al. 2019;Tacchella et al. 2019), we only use them as a tool to validate our reconstruction method.
In Fig. 11 we show the marginalized probability distributions of 3 and 3 for galaxies classified as discs in the HAGN and TNG100 simulations (see Section 2.2 for details). We find very similar distributions for both simulations, which is interesting to note given a) the significant differences in the physical models employed in the simulations and b) the different parameters used for the morphological classification. The latter appears to be reasonable, as the discs tend to be oblate with 3 < 3 1. Note that the absence of thin discs with 3 << 1 can be attributed to the effective pressurized Equation of state employed in these simulations and Figure 11. Distributions of 3D axis ratios for disc galaxies in hydrodynamic simulations at = 1.0, obtained from the simple moment of inertia. Solid and dotted lines show fits to a normal and a skew normal distribution respectively.
their relatively low resolution. Thin discs are present for instance in the TNG50 run (see Pillepich et al. 2019) or the New Horizon Simulation (Park et al. 2019), which are higher resolution, lower volume versions of the runs analysed here.
We fit the probability distributions with normal and skew normal distributions, given by ( ) ∝ exp{− 2 }, and ( ) = ( )(1 + erf{ }) respectively, with ≡ ( − 0 )/( √ 2 ) 8 . Both functions are truncated outside of the interval (0, 1] to ensure that ∞ > 3 ≥ 3 ≥ 3 . We show in Fig. 11 that the fit of the skew normal distribution matches the axis ratios slightly better than the normal distribution, in particular 3 . However, we decide to neglect the skewness in our modeling, since the improvement in the 3D axis ratio fits is relatively small. Furthermore we show in Appendix C that a skewness on 3 has a very minor impact on the corresponding distribution of 2 and could hence be constrained very poorly by observations. In the same appendix we also show that 3 and 3 are uncorrelated in HAGN as well as in TNG100. Based on these findings we approximate the joint 3D axis ratio distribution with the product ( 3 ) ( 3 ), i.e.
where 0 , , 0 , are the free model parameters. The normalized truncated distribution is then given by

Error estimation and parameter inference
We obtain constraints on our model parameter vector = ( 0 , , 0 , ) from the observed data using Bayesian inference. We assume a multivariate normal distribution of the likelihood for observing the data vector d given the parameters , with The data vector is the observed distribution ( 2 ), measured in 25 bins of equal width within the 2 interval (0, 1] and ( ) is the corresponding prediction from our ( 3 , 3 ) model for the parameters . The posterior distribution of the parameters given the data d is given by Bayes' theorem as where is Π( ) is the prior, which we set to be flat in the intervals (0, 1] and [0.01, 1] for the parameters ( 0 , 0 ) and ( , ) respectively. We thereby include parameter combinations in our sampling that describe axis ratio distributions not only for discs ( 0 << 0 ), but for any kind of ellipsoids, e.g. 3 3 . The covariance matrix is assumed to be given by = 2 , with Poisson shot noise variance 2 ∝ , where are the counts of galaxies in a given bin . The model prediction ( ) is obtained by drawing a sample of 3D axis ratio pairs ( 3 , 3 ) from the ( 3 , 3 ) distribution for a given set of parameters . The corresponding 2 axis ratios are then computed for a random orientation, as detailed in Appendix B. The prediction for ( 2 ) is measured from the resulting 2 sample in the same bins as the observed data. For generating the predictions we choose a sample size of 10 5 , which is a compromise between a fast computation, needed for efficiently estimating the posterior, and having errors on the prediction which are negligible compared to those on the data vector. The latter condition is satisfied, as the number of galaxies in our six COSMOS sub-samples is more than two magnitudes smaller than the samples used for generating the predictions (see Table 5). We estimate ( |d) by sampling the parameter space with the Markov-Chain-Monte-Carlo (MCMC) method using the code emcee 9 (Foreman-Mackey et al. 2013). For each posterior we run 32 independent chains with at last 1, 000 steps each. The best fit parameters are obtained from the position of the maxima of the marginalized posterior distribution.

Testing the reconstruction method
The method for reconstructing the distribution of 3D axis ratios ( 3 , 3 ) from the distribution of 2D axis ratios ( 2 ) is now tested using the HAGN simulation. We begin by validating two assumptions on which this method is based. Those are 1) that the galaxies' 3D stellar mass isodensities are coaxial, self-similar 3D ellipsoids, whose 3D axis ratios can be related analytically to the 2D axes ratios of the projected 2D stellar mass isodensities as described in Appendix B, and 2) that our Gaussian model for the 3D axis ratio distributions from Equation (6) is sufficiently accurate to provide good predictions for the 2D axis ratio distribution. Subsequently, we test how well the distribution of 3D axis ratios in the simulation 9 emcee.readthedocs.io can be recovered from the corresponding distribution of 2D axis ratios.
To test assumption 1) we compare in Fig. 12 the 2 distribution in the HAGN simulation, measured directly from the projected distributions of each galaxy's stellar particles as described in Section 2.2.3 (top panel), to the distribution of 2D axis ratios, obtained analytically from each galaxy's 3D axes ratios assuming ellipsoidal 3D stellar mass isodensities (central panel). We find that the distributions differ significantly from each other, which indicates that the assumption of ellipsoidal stellar mass isodensities can only serve as a very rough approximation. This conclusion can already be expected from the images of face-on late-type galaxies in our COS-MOS sample, shown in Fig. 1. However, certain characteristics are present in both distributions, namely the cut-offs at 2 0.4 and 2 0.9, as well as a skewness in the distributions towards low axis ratios. To test assumption 2) we show in the bottom panel of Fig. 12 the 2 distribution predicted for an ensemble of 10 6 3D axis ratios, drawn from our ( 3 , 3 ) model fit to the HAGN simulation (see Fig. 11 and C2). We find that this predicted distribution is similar to the results obtained from the analytical projection of 3D axis ratios (shown in the central panel of Fig. 12), which indicates that our Gaussian model for ( 3 , 3 ) from Equation (6) is an appropriate approximation for our analysis.
We study the performance of the reconstruction method, starting with a self consistency test (a) in which we fit the ( 3 , 3 ) model to match its own ( 2 ) prediction, using parameters from the HAGN fits shown in Fig. 12. We find that the fit, shown as green line in the bottom panel of Fig. 12, is in good agreement with the reference measurements. The MCMC estimate of the posterior distribution is displayed in Fig. 13 as light and dark green contours, indicating the 68% and 95% confidence intervals, respectively. We find that the parameter constraints are in good agreement with the input parameters of the ( 3 , 3 ) model, shown as black dashed lines in Fig. 13. This result demonstrates that the 3D axis ratio distribution can be reconstructed from the corresponding distribution of 2D axis ratios for a sample of idealised disc galaxies, which satisfy the model assumptions outlined above. Note that this is not necessarily the case for elliptical galaxies, which are not subject of this work. In Appendix F we show that the input parameters parameters are also recovered for samples that are as small as the observational samples used on this work, but with larger confidence intervals.
We perform the same test under more realistic conditions by fitting the 2 distribution obtained by the projection of idealized 3D ellipsoids with axis ratios measured from HAGN galaxies (test b). The fit, shown as red line in the central panel of Fig. 12, is in an good agreement with the corresponding measurements. We attribute the remaining deviations to inaccuracies of the ( 3 , 3 ) model and not to the assumption of ellipsoidal stellar mass isodensities used for obtaining the 2D axis ratios, since this assumption is employed for generating both, the prediction as well as the reference measurements. The parameter constraints, shown as red contours in Fig. 13, are significantly biased with respect to the reference parameters from the ( 3 , 3 ) model fits to the 3D axis ratio distribution in HAGN. However, the relative deviations of the best fits (defined as maxima of the marginalized probabilities) are at the percent level for 0 , 0 and (see Table 3). The parameter shows the strongest deviation from the reference values of roughly 100%.
Our most realistic performance test of the reconstruction method consists in fitting the model to the distribution of 2D axis ratios measured from each galaxy's projected stellar mass distribution (test c). We find that the fit, shown as blue line in the top panel of Fig.  Figure 12. Probability distribution of 2D axis ratios. Histograms in the top and central panels show results from the HAGN simulation, measured from the projected stellar mass distributions and analytic projections derived from 3D axis ratios, assuming coaxial, self-similar ellipsoidal stellar mass isodensities respectively. The histogram in the bottom panel shows the 2D axis ratio distribution for a Gaussian model of the 3D axis ratio distribution with parameters from the 3D fits, shown in Fig. 11. In each panel we show fits of the model prediction as solid lines. The corresponding parameter constraints and best fit values are compared in Fig. 13 and Table 3. Poisson shot-noise estimates of the standard deviation are shown as error bars on the histograms. Note that errors are smaller than the line width in the bottom panel.
12, strongly deviates from the corresponding measurements. This finding can be expected from the previously discussed deviation between the 2 distribution obtained from the galaxies' projected stellar masses and the one obtained analytically from the 3D axis ratio measurements assuming idealized 3D ellipsoidal isodensities since the latter assumption is employed in the modeling. However, some main characteristics, such as the cut-off of the distribution, the position of the local maxima and the skewness towards low axis ratios, are captured by the fit. We find that the parameter constraints, shown as blue contours in Fig. 13 are biased more strongly with respect to the reference values from the direct ( 3 , 3 ) model fits compared to the previous test case b) (see also Table 3).
A potential shortcoming of these tests is that we defined discs in HAGN as the upper 20% of galaxies with the highest values of = / , which corresponds to a selection cut at > 1.06 (see Section 2.2.1). We thereby might include discs with large bulges for which a one-component ellipsoidal model is inaccurate. In order c) Figure 13. Posteriors of the parameters of the Gaussian model for the 3D axis ratio distribution, given by Equation (5). Results are derived from the 2D axis ratio distributions of a) an ensemble of projected ellipsoids with 3D axis ratios that follow the same Gaussian model (model input values are marked as black dashed lines), b) projected 3D ellipsoids with 3D axis ratio distributions of disc galaxies in the HAGN simulation, c) projected stellar mass distributions of disc galaxies in HAGN. Light and dark areas mark 95 and 68% confidence levels respectively.
to investigate how robust our accuracy tests are towards variations of the disc selection criteria, we repeat our analysis using galaxies with > 1.17 and > 1.27. This corresponds to selecting the upper 10% and 5% of galaxies with the highest values respectively and should reduce the fraction of discs with large bulges at the price of higher noise on the axis ratio distributions due to the smaller sample size. Our results from these different samples show no significant improvement in the accuracy of the reconstructed parameters when increasing the minimum values of . This finding speaks against the possibility that discs with large bulges are the main reason for the inaccuracies of the reconstruction method. For test c), that is based on the projected 2D stellar mass distributions, we obtain overall relative deviations between the reconstructed model parameters and those derived from fits to the 3D stellar mass distributions of around ∼ 3%, ∼ 60%, ∼ 8% and ∼ 20% for 0 , , 0 and , respectively (Table 4).
Refocusing on our sample of discs in HAGN with > 1.06, we compare in Fig. 14 the reconstructed distributions for 3 and 3 directly with the HAGN measurements. This figure illustrates that the location of the maxima of the reconstructed 3 and 3 distributions, given by the model parameters 0 and 0 , are close to the positions of the maxima in the measurements. The width of the reconstructed 2 distribution, quantified by the parameter, differs strongly from the HAGN measurements and is therefore not considered in the final discussion of our results.
We emphasize that the conclusion drawn in this section could change when conducting these tests under more realistic conditions. An ideal test would be based on synthetic images including effects of dust extinction, lensing, PSF convolution and pixelization, as well as image noise. These images would then need to be analysed using the same software as used for the observations. Such a realistic  Table 3. Best fit parameters of the model for the 3D axis ratio distribution, derived from disc galaxies in the HAGN simulation at = 1.0. The two top rows show results for the Gaussian and the skew Gaussian model from direct fits to the marginalised distributions of 3 and 3 (see Section 3.1), shown in Fig. 11. The two bottom rows show the parameters of the Gaussian model inferred from the distribution of 2D axis ratios using the reconstruction method. The 2D axis ratios used for the reconstruction are obtained either analytically from the galaxies' 3D shapes, assuming that the stellar mass distributions are perfect 3D ellipsoids, or directly from shape measurements of the projected stellar mass distribution (labelled as projected ellipsoids and projected stellar mass respectively, see Fig. 12).

Figure 14.
Testing the reconstruction of the 3D axis ratio distribution of disc galaxies in the HAGN simulation at = 1.0. Grey histograms show measurements in the simulation with 1 shot-noise errors displayed as black bars for each bin. Black solid lines show the fits to a truncated normal distribution, which are also shown as solid lines in the top panel of Fig.  11. Red dashed lines show the distribution reconstructed from fits to the 2D axis ratio distribution which was derived from the 3D axis ratios using the ellipsoidal galaxy model (see central panel of Fig. 12). Dashed-dotted blue lines show the reconstruction based on fits to the 2D axis ratios measured from the projected stellar mass (see top panel of Fig. 12). Thin lines show predictions for 500 random sampling points of the posterior distributions and reflect the uncertainties expected on the prediction. test is beyond the scope of this work, but would be an interesting objective for future investigations.

3D AXIS RATIO DISTRIBUTIONS IN COSMOS
We apply the method for reconstructing 3D axis ratio distributions to the different samples from the COSMOS data, that were selected by stellar mass and redshift as described in Section 2.1.6. The fits to the observed distribution of 2D axis ratios from which we infer the parameters of our model for the 3D axis ratio distribution are shown in Fig. 15. For the samples that are selected by both, stellar mass and redshift, we find that the fits match the measurements well with 2 values per degree of freedom around unity (see Table 5), capturing the skewness towards low and high values of 2 for the high and low mass samples, respectively (as shown in Fig. 9).
The fits to the samples that are selected only by stellar mass or only by redshift, shown in the top row and left column of Fig  15 respectively, are less accurate as indicated by the higher 2 values per degree of freedom. This finding can be expected on one hand from the larger sample size that decreases the errors on the measurements and hence increases the significance of deviations between measurements and model with respect to the errors. On the other hand the underlying model assumption that the 3D axis ratios, 3 and 3 , follow a Gaussian distribution may be less accurate for samples that are more broadly defined. This could in particular be the case if sub-samples have very distinct 2 distributions, like the low and high mass sub-samples.
However, we note that the model fits the observed 2 distributions overall better than corresponding distributions from the projected stellar mass in the HAGN simulation, shown in the top panel of Fig. 12.
This finding might result from the fact that we selected all disc galaxies in HAGN, including those with large bulges for which a one-component ellipsoidal shape model is clearly an inadequate approximation. Additional reasons may be differences between the 2D shapes of the projected stellar mass and the projected luminosity  Table 5. Parameters of the 3D axis ratio distribution model (equations (5) and (6)), inferred from fits to the 2D axis ratio distribution from the COSMOS survey, shown in Fig. 15. Results are given for different stellar mass -redshift samples, which are selected as indicated in the two left columns. For each model parameter we provide the lower and upper limit of the 68% confidence level, obtained from the marginalized posteriors. The right column shows the 2 goodness of fit per degree of freedom.
densities or shortcomings of the simulation, caused for instance by resolution effects as discussed in Section 2.2. In Fig. 16 we show the MCMC estimates for the posterior probability distribution of the ( 3 , 3 ) model parameters, 0 , 0 , and , quantifying the average disc circularity, the average relative disc thickness and the corresponding dispersions, respectively. The best fit parameters, defined as the positions of the maxima of the posteriors are summarized in Table 5. In this table we also provide results for the parameter 0 ≡ 0 0 , whose posterior distribution we obtain by multiplying the coordinates of the sampling points of the posterior distributions of 0 and 0 . It specifies the position of the maximum in the ( 3 , 3 ) distribution and quantifies the disc thickness relative to the major axis, facilitating a comparison to the disc circularity parameter 0 and the estimation of the absolute disc thickness (see Section 4.2.2). The posteriors shown in Fig. 16 and the corresponding model parameters in Table 5 were derived from 2 distributions measured in 25 bins. When repeating the same analysis using 20 bins we find only marginal changes of the posteriors and no significant change in the corresponding model parameters. However, larger fluctuations may occur for larger variations of the binning. In Appendix F we show that the posteriors can be strongly affected by noise in the data. Varying the binning would change the noise and therefore the posteriors. However, a rigorous investigation of how the binning impacts the posteriors is beyond the scope of our analysis.

Redshift dependence
We start examining the redshift dependence of the galaxy shape distribution by comparing the fits to the different redshift sub-samples with those for the entire redshift range (shown as red solid and blue dashed lines in Fig. 15 respectively). We find that within a given mass range, these different fits are close to each other, which is a first indication that the redshift evolution of the axis ratio distribution in our samples is weak.
This visual impression lines up with the relatively small variation of the best fit parameters for the average disc circularity and the relative disc thickness ( 0 and 0 ) with redshift. For the full mass redshift sub-samples we find the values of 0 and 0 to vary by ∼ 1% and ∼ 10% around the results for the full redshift range respectively (see Table 5). For the higher and low mass sub-samples these variation increase up to ∼ 10% and ∼ 25% for 0 and 0 respectively. The redshift variations of the corresponding dispersions, and are much higher with up to ∼ 50%. However, the amplitudes of these different redshift variations are overall consistent with the 68% confidence intervals from the marginalized posteriors.
In order to obtain a more detailed insight into how the parameter variations with redshift compare to the uncertainties, we show the posterior probability distributions of the ( 3 , 3 ) model parameters from the different sub-samples in Fig. 16. The 68% confidence levels of the posteriors for the full mass samples in the different redshift bins, shown in the top left of Fig. 16, are mostly overlapping. A weak indication for a redshift dependence in the data might be given by the fact that the 68% confidence levels of the intermediate and the high redshift samples are slightly disconnected in the 0 direction. Nonetheless, the 95% confidence levels still overlap clearly in that case. More significant differences between the posteriors from different redshift bins are present for the low and high mass sub-samples, as shown in the bottom left and right of Fig. 16 respectively. However, for the low mass sample the 68% confidence levels still overlap plainly. For the high mass sample the 68% confidence level of the central redshift bin is slightly disjoint from those of the other bins, although the 95% confidence levels from all redshifts still overlap. Overall our Bayesian analysis does not provide convincing evidence for a redshift dependence of the ( 3 , 3 ) model parameters that is significant with respect to the uncertainties. It is thereby important to note that the differences in the shapes of the posteriors are not an indication for such a dependence since they can be expected from the sampling noise in the 2 distributions, as we demonstrate in Appendix F.
It is further important to note that the insignificant redshift dependence of the parameters with respect to the uncertainties does not imply that the shapes of disc-dominated galaxies do not evolve since = 1.0. A potential evolution could just be too weak to be detected reliably with the Bayesian analysis presented here. In particular the large uncertainties on the dispersions of the circularity and relative thickness may obscure even a relatively strong redshift dependence of these parameters. We therefore employ the two-sample Kolmogorov-Smirnov test (e.g. Hodges 1958, hereafter referred to as K-S test) as an independent tool for testing the null hypothesis that the axis ratio distributions from two different redshift samples are drawn randomly from the same redshift independent distribution. We apply this test directly on the observed 2 distributions and are therefore independent of any model assumptions. In Table  6 we show the −values of the K-S test, which is the probability of obtaining a K-S test result that is as large or larger than the one   Table 6. −values of the two-sample Kolmogorov-Smirnov test. The test is applied on pairs of 2D axis ratio distributions measured in three redshift bins (Fig. 15). The redshift bins 0 , 1 , 2 correspond to the ranges [0.2, 0.7], [0.7, 0.9], [0.9, 1.0] respectively. obtained for the two samples under the null hypothesis. A typical, but arbitrary rejection criteria is < 0.05. For the redshifts samples defined over the full mass range the K-S test does not reject the null hypothesis. This result lines up with the strong overlap of the model parameter contours for these samples, which we discussed above. The situation is less conclusive for the -values from the low and high mass sub-samples. For the low mass sub-sample the K-S test suggests, that the 2 distributions from the lowest and highest redshift bins are drawn from different underlying distributions, since the -value of 0.0273 is below 0.05. This finding is interesting, given that the parameter contours from these two redshift bins are strongly overlapping (see bottom left panel of Fig. 16). Given this apparent disagreement and that fact that i) our arbitrarily chosen rejection criteria just slightly surpassed for this redshift bin combination and ii) the null hypothesis has not been rejected for the other redshift bins combinations, we conclude that the low mass sub-sample provides no strong indication for a substantial redshift evolution of the galaxy shapes. For the high mass sub-sample the K-S test suggests, that the 2 values from the intermediate and the highest redshift bin are drawn from different underlying distributions. In this case the rejection criteria is clearly fulfilled by the -value of 0.0053. This finding lines up with with the weaker overlap of the corresponding parameter contours, shown in the bottom right of Fig. 16, and could be an indication for a redshift dependence of the shapes of high mass disc-dominated galaxies. However, it could also result from systematic effects, such as potential misclassifications of galaxy morphologies in the ZEST catalogue. Another explanation for a redshift dependence might be cosmic variance, as galaxy clusters in the COSMOS field (see Fig. 2) could change the mean values and dispersions of the disc thickness and circularity in a given redshift bin. It is thereby interesting to note that the 2 distributions from the lowest and the highest redshift sample are consistent with each other, according to both, the K-S test as well the overlap of the parameter contours for these two samples shown in Fig. 16, which speaks against a physical evolution of the shapes. However, the same systematic effects may also change the 2 distributions in such a way that a detection of a redshift evolution is prevented. Samples with more objects that are distributed over larger areas and morphological classifications based on different techniques would be needed to better understand the impact of misclassification or cosmic variance on our results.

Mass dependence
We now proceed with studying the dependence of the galaxy shapes on stellar mass. Given that the evidence for a redshift evolution of the shapes is overall weak, we will thereby focus on the mass samples that are defined over the entire redshift range (0.2 < < 1.0) to maximize the statistical power of the samples.

Intrinsic disc circularity
The intrinsic circularity of the discs in our samples is quantified by the axis ratio 3 . The maximum and the width of the 3 distribution are described by the parameters 0 and of our ( 3 , 3 ) model. In the top right of Fig. 16 we compare the constraints on these model parameters from the low and high mass samples in the entire redshift range. We find that the high mass sample prefers slightly higher values of 0 and compared to the low mass samples. However, the corresponding confidence intervals from both mass samples are overlapping significantly, which indicates that our data provides no evidence for a mass dependence of the disc circularity.
We find 0 values of around 0.9 for both samples (see Table  5), which is consistent with estimates reported for discs at lower redshifts in previous studies (Sandage et al. 1970;Fasano et al. 1993;Ryden 2004Ryden , 2006Rodríguez & Padilla 2013) and describes the deficit of circular face-on discs with 2 = 1.0 in the observed data (see Fig. 15). Different reasons for this deficit have been discussed in the literature (e.g. Bertola et al. 1991;Huizinga & van Albada 1992;Rix & Zaritsky 1995;Bernstein & Jarvis 2002;Joachimi et al. 2013). On the one hand, it is argued that deviations from perfect circularity could result from observational effects, such as noisy isophotes, artifacts like cosmic rays in the galaxy images, or simply the fact that the images are pixelized. However, we expect the impact of such effects on our measurements to be minor, since i) we selected objects with "good" fits to the Sérsic profile (see Section 2.1.2), which should remove objects whose images are heavily distorted by artefacts from the sample and ii) the ACS pixel scale of 0.03" is well below our lowest limit for the effective angular radius of 0.17" at = 1.0 (see Fig. 2). However, since the deviations from perfect circularity are predicted to be relatively small, even minor systematics might be relevant. On the other hand, one could expect that the disc galaxies are intrinsically not perfectly circular due to patchy star formation activity and substructures, such as spiral arms (see Fig. 1) or galactic warps (see e.g. Binney 1992; Gómez et al. 2017, for the latter). This expectation lines up with the noncircularity of discs in the HAGN and TNG100 simulations, which we see in Fig. 11 and 12 as the lack of galaxies with 3 and 2 close to unity.
Note that we omit entering a detailed discussion of our constraints, since the test of our reconstruction method in Section 3.3 and Appendix F suggested that these constraints may be strongly biased.

Intrinsic disc thickness
The relative disc thickness is quantified by the minor to intermediate axis ratio 3 . The position of the maximum and width of the 3 distribution are described in our ( 3 , 3 ) model by the parameters 0 and . In the top right of Fig. 16 we show that 0 exhibits a significant dependence on stellar mass. It is predicted to be around 0.3 with lower and higher values for the high and low mass sample, respectively (see Table 5). The corresponding constraints on the relative thickness 0 ≡ 0 0 (quantifying the position of the maximum of the 3 ≡ 3 / 3 distribution) are 0.33 0.34 0.31 and 0.24 0.25 0.23 for the low mass and high mass sample respectively, where the upper and lower values here are the limits of the 68% confidence intervals. These values are consistent with results reported for discs at low redshifts which were obtained using similar reconstruction techniques as used in this work (Fasano et al. 1993;Ryden 2004Ryden , 2006Rodríguez & Padilla 2013) or from direct measurements based on edge-on oriented discs (Mosenkov et al. 2015; Figure 16. Posteriors of the 3D axis ratio model parameters from Equation (5), derived from the 2D axis ratio distributions in COSMOS (Fig. 15). Light and dark areas mark 95 and 68 % confidence levels respectively. Results for i) redshift samples defined over the full stellar mass range (top left), ii) stellar mass samples, defined over the full redshift range (top right), iii,iv) redshift samples with stellar masses below and above 10 10. 35 (bottom left and right respectively). The marginalized posterior distributions are shown on the diagonal panels of each sub-figure. Reshetnikov et al. 2016;Mosenkov et al. 2020). The dispersion of the relative disc thickness, , shows a significant mass dependence as well, as it takes lower and higher values for the high and low mass sample respectively. This mass dependence of the thickness dispersion explains why the cuff-off on the left side of the 2 distribution in Fig. 15 is sharper for high than for low mass discs, while we expect also this latter parameter to be significantly biased by ∼ 20% (Section 3.3).
In Fig. 17 we show the reconstructed marginalized distributions of the 3D axis ratios 3 and 3 for the low and high mass sample over the full redshift range, as predicted from the best fit parameters given in Table 5. The uncertainties on the prediction is illustrated as the density of thin lines which are obtained from randomly selected sampling points of the posterior in the ( 3 , 3 ) model parameters space. The corresponding values 0 and 0 are shown as vertical lines. They are enclosed by shaded areas which indicate the 68% confidence intervals for these parameters, given in Table 5. Note that these uncertainties are similar to the systematic bias of ∼ 3% and ∼ 8% for 0 and 0 respectively, which we found in our systematic tests (Section 3.3).
When comparing the distributions of both mass samples, we see that the values of 3 are well below those of 3 as expected for disc galaxies. This indicates that our results carry physically meaningful information, despite the expected biases. Fig. 17 further illustrates that i) high mass discs tend to be thinner with respect to their diameter than low mass discs, ii) high mass discs tend to be slightly more circular than low mass discs, iii) the dispersion of the circularity is larger for high mass than for low mass discs and iv) the dispersion of the relative thickness is larger for low mass than for high mass discs. However, the mass dependence of the peak of the circularity distribution, quantified by 0 , is statistically not significant, since the 68% confidence intervals for 0 overlap (see Figure 17. Distributions of galaxy 3D axis ratios, reconstructed from the distribution of 2D galaxy axis ratio in the COSMOS survey. Results are shown for the low and high mass sample, defined within 0.2 < < 1.0. Vertical lines mark the positions of the maxima, described by the best fit model parameters 0 and 0 from Table 5. Vertical shaded areas indicate the corresponding 68% confidence intervals. Thin lines show predictions for 500 random sampling points of the posterior distribution. also discussion in Section 4.2.1). The lower relative thickness for high mass discs could indicate that these galaxies tend to be more relaxed than low mass discs, which are more prone to perturbations by feedback, merging and tidal interactions. This interpretation is supported by the higher dispersion of the relative thickness of low mass discs and lines up with the overall higher star formation rates for low mass discs, which we see in the right panels of Fig. 9.
However, another reason to expect a lower relative thickness of massive discs with respect to their radius is that the radius is on average larger for discs in the high mass sample, as indicated by the mass dependence of the effective radii shown in the central panels of Fig. 9. Such an increase of the radius with mass could lead to a decrease of the relative thickness, even if the size of the absolute thickness is mass independent.
In order to test if this scenario is consistent with our data we derive an estimate of the expectation value for the absolute thickness, 3 , for the low and high mass sample via our constraints on the peaks of the 3 ≡ 3 / 3 axis ratio distributions, quantified by 0 . Note that we use 3 instead of 3 ≡ 3 / 3 , since we can approximate 3 directly from the observed 2D radii. We estimate the expectation value for 3 using the approximation 3 3 3 / 3 = 3 3 3 0 . This approximation is based on two assumptions: 1) We assume 3 ≡ 3 / 3 3 / 3 . This approximation corresponds to the zero-order term in a perturbative expansion of ( , ) = / around the and . We validated this relation using our sample of disc galaxies in the HAGN simulation and found it to be accurate at the sub-percent level. 2) We assume that the expectation of the 3 distribution is located at its maximum, i.e. 3 0 . This implies that the 3 distribution is symmetric, which is commonly assumed to be the case for observed discs (e.g. Ryden 2004). Again, we also verified that this is a good approximation using the disc galaxies in our HAGN sample.
We obtain the posterior distribution of 0 from the posteriors of 0 and 0 , using 0 = 0 0 as detailed in the beginning of this section. It is shown in the top panel of Fig. 18 for the low mass and high mass disc samples, defined over the entire redshift range. The distributions differ significantly from each other, as it can be expected from the constraints on 0 shown in Fig. 16. We proceed by estimating the posterior for the expectation value of the minor axis size from the posterior of 0 , using the aforementioned relation 3 3 0 , which we apply on each MCMC sampling point. For this last step we need an estimate for the average comoving size of the major axis 3 . This size is approximated with the mean effective comoving radius ⊥ of the disc galaxies in our sample, using objects with 2 > 0.5. This value is shown in the central panels of Fig. 9 as vertical red and blue solid lines for the high and low mass sample respectively. The cut on 2 was thereby chosen to mitigate the impact the projection effects on the effective radius as discussed in Section 2.1.5.
The estimated posterior of 3 is shown for the samples of high mass and low mass discs in the bottom panel of Fig. 18. The two posteriors overlap each other almost completely, which means that our approximation of the average 3D disc thickness shows no significant mass dependence. These posterior distributions are compared to the estimates of the mean major axis sizes 3 which are shown as vertical lines in the same panel. The strong mass dependence, which we find for 3 suggests that the mass dependence in the relative 3D thickness 3 is driven by the mass dependence of the major axis sizes.

SUMMARY AND CONCLUSIONS
We studied the 3D shapes of disc-dominated late-type galaxies from the COSMOS survey in different mass and redshift ranges, tackling the question of how these galaxies could grow without forming a large central bulge. We approximated the 3D light distribution of these galaxies as 3D ellipsoids described by the two ratios 3 ≡ 3 / 3 and 3 ≡ 3 / 3 , which quantify the circularity and relative thickness of the discs, respectively. We inferred the distribution of these 3D axis ratios from the observed distribution of 2D axis ratios, using a reconstruction method based on the assumption that the distribution of 3 and 3 is well approximated by a two-dimensional Gaussian. This Gaussian is characterized by the average disc circularity 0 , the average disc thickness 0 and the corresponding dispersions and . Variations of this method have been widely used in the literature, but their accuracy as well as the assumptions they employ remained to be tested. We began our analysis by performing such tests for the first time using two state-of-the-art hydrodynamic simulations of galaxy formation in cosmological volumes, Horizon-AGN and Illustris TNG100. We demonstrated that the 3D axis ratio distributions of disc galaxies in these simulations are adequately described by a Gaussian model. Reconstructing the 3D axis ratio distribution from the distribution of 2D axis ratios, we found that the inferred parameters of the Gaussian model are biased with respect to those derived directly from fits to the 3D distributions. For our most realistic test, based on 2D galaxy shapes from projected stellar mass distributions, we find this bias to be moderate for the parameters 0 and 0 (∼ 3% and ∼ 8% respectively) but strong for the corresponding dispersions and (∼ 60% and ∼ 20% respectively, see Table 4). We concluded that the bias is mainly driven by an inaccuracy of the ellipsoidal model for the stellar mass distribution, which we use for relating 3D to 2D galaxy shapes analytically during the reconstruction (see Section 3.3). The strong simplification implied when approximating late-type galaxies as 3D ellipsoids becomes obvious in the COSMOS images of such objects, shown in Fig. 1. Nevertheless, our bias estimates derived from the HAGN simulation may be overly pessimistic, since our test is based on all disc galaxies found in this simulation, including those which have a bulge for which a one-component ellipsoidal galaxy model is expected to be inadequate. More realistic tests should be based on synthetic galaxy images, take into account the effect of dust extinction and be analysed in the same way as the reference observational data.
After having tested the shape reconstruction method, we applied it on a volume limited sample of disc-dominated galaxies in COSMOS which is limited in redshift (0.2 < < 1.0), absolute magnitude ( < −21.5) and transverse comoving size ( ⊥ > 0.64 kpc). We demonstrated that a conservative cut on is required in order to minimize the impact of dust extinction on the observed 2D axis ratio distribution. Otherwise, this effect can lead to an apparent redshift evolution of the distribution (Fig. D1) and consequently to incorrect physical interpretations of the observations. We found that the ellipsoidal galaxy shape model in conjunction with the Gaussian model for the 3D axis ratio distribution provides good fits to the distribution of the observed 2D axis ratios (Fig. 15).
The constraints on the parameters 0 and 0 inferred from these fits show a variation with redshift around ∼ 1% and ∼ 10% respectively when considering the full mass range (Table 5). This finding indicates that the redshift evolution of the shapes is relatively weak. Splitting the sample into a high and a low mass sub-sample leads to larger variations of ∼ 10% and ∼ 25% for 0 and 0 respectively. The parameters and vary up to 50% across the different redshift samples. However, overall the variations with redshift lie within or close to the 68% confidence intervals on these parameters. Studying the joint posteriors on the different shape model parameters, we find no clear indication for a redshift evolution that is significant with respect to the uncertainties as the 95% confidence intervals of these parameters from different redshift samples overlap (Fig. 16). We crosschecked this result using the Kolmogorov-Smirnov (K-S) statistic to test the null hypothesis that the 2D axis ratio distributions from two redshift bins are drawn from the same underlying redshift independent distribution. Studying all redshift bin combinations for different mass samples, we find that the K-S test is mostly consistent with the results from the Bayesian analysis. As an exception the K-S test clearly rejects the null hypothesis that the 2D axis ratios of the high mass samples in our intermediate and high redshift bins are drawn from the same distributions. This finding appears to be inconsistent with the agreements between the 2D axis ratio distributions from the high mass samples in the low and high redshift bins and could be an indication for cosmic variance or misclassified objects contaminating the high mass sample at intermediate redshifts. In any case, the fact that the Bayesian analysis revealed no significant redshift dependence indicates that any dependence detected by the K-S test is relatively weak compared to the errors on the data. This weak dependence on redshift is contrasted by a relatively strong dependence on mass (Fig. 16). In particular, the parameter 0 is significantly higher for discs in our sample with stellar masses below 10 10.35 compared to discs with higher masses, even within the expected uncertainties of the reconstruction method (e.g. Fig.  17). This finding indicates that the relative disc thickness decreases with mass. However, the absolute disc thickness, estimated from the relative thickness and a proxy for the disc diameter, shows no mass dependence (Fig. 18), which suggests that the relative thickness of low mass discs is higher mainly because of their smaller diameters (e.g. Fig. 9).
In summary, we found indications that the distribution of the 3D axis ratios of disc-dominated galaxies in our sample is not (or if, then just weakly) redshift dependent in the range 0.2 < < 1.0 and the absolute disc thickness does not depend significantly on mass. These findings speak against mass accretion by major mergers and a subsequent suppression of bulge formation by strong feedback after 1.0, since one could expect that such disruptive events would decrease the discs circularity and increase their thickness due to vertical heating (e.g. Quinn et al. 1993;Grand et al. 2016;Park et al. 2021). The decrease of the specific star formation rates of massive discs with redshift, shown in Fig. 9, suggests that feedback occurs in these galaxies, suppressing star formation by removing cold gas, but may not be sufficiently strong to significantly affect the observed shapes. The absence of major mergers in disc-dominated galaxies since 1 is further supported by the fact that the mean comoving sizes, displayed in Fig. 9, show no redshift evolution for low mass galaxies and a very weak increase with redshift for high mass galaxies. We conclude that disc-dominated galaxies accreted most of their mass before = 1.0 and lived preferentially in isolation ever since. This picture lines up with the results from Grossi et al. (2018), who find that the star formation rates of disc-dominated galaxies show no significant dependence on the density of their environment, which indicates the absence of major interactions (see also Sachdeva & Saha 2016). A tranquil evolution is further supported by the weak redshift dependence of the transverse comoving size distribution (Fig. 17), which has also been reported for disc galaxies in the GOODS fields by Ravindranath et al. (2004).
However, we emphasize that the validity of our results is limited for several reasons. The small sizes of our samples lead to large statistical errors, which may obscure a weak evolution of the galaxy shapes with redshift and could hence affect our conclusions. This limitation can be overcome in future analysis by using measurements of galaxy shapes in larger volumes from upcoming weak lensing surveys like Euclid or the Legacy Survey of Space and Time on the Vera Rubin Observatory. Such high precision measurements will require an improved accuracy of the shape reconstruction method, based for instance on corrections which can be developed using realistic mock images.
Our results may further be affected by inaccuracies of the assumptions on which the reconstruction method is based. One inaccurate assumption is the aforementioned ellipsoidal model for galaxy morphologies, which differs strongly from the complex shapes of real disc galaxies. Another potential inaccuracy may result from the Gaussian model, that we assume for the 3D axis ratio distribution. Our choice for this model was based on 3D axis ratio distributions measured in two hydro-dynamic simulations. However, we also demonstrated that the 2D axis ratio distributions in these simulations differ significantly from the COSMOS observations due to the lack of thin discs with 2 << 1, which we attribute to resolution limits and approximations in the simulation method (see Sec. 2.2). More realistic simulations could reveal that our Gaussian model is inadequate, which could bias the outcome of the reconstruction method.
Potential errors in our analysis may further result from a missclassification of galaxies in the ZEST catalogue. Although this catalogue has been carefully calibrated and passed rigorous validations, it may contain a fraction of objects that are miss-classified as discdominated late-type galaxies and contaminate our samples. By inspecting the color-color distribution of the objects in our catalogue we find no indication of a significant fraction of miss-classified objects. However, visual inspection of randomly selected galaxies revealed the presence of some objects, in particular at high redshifts and low masses, whose morphological class is ambiguous to us (Appendix A). In order to assess the uncertainty from potential miss-classifications on our results one could compare the morphological classification in ZEST against alternative classification methods that are for instance based on machine learning techniques.
More detailed interpretations of the redshift evolution of observed galaxy shapes may also require a characterization of two observational systematic effects which we assumed to be insignificant with respect to the errors on our measurements. A first potential systematic could result from the fact that galaxy shapes are observed in the same filter at different redshifts. High redshift galaxies are therefore seen at bluer rest frame wavelengths than galaxies at lower redshifts. This may affect the observed shapes for two reasons. The first is that the age of stellar populations is not uniformly distributed, as star formation takes place in distinct regions such as spiral arms (e.g. Martin & Kennicutt 2001). The second reason is that extinction and reddening by dust in the source galaxy has a stronger impact on observations in blue than in red rest-frame wavelengths and may hence distort the observed shapes more strongly at high than at lower redshifts. This effect is further complicated by the fact that the distribution and overall density of dust evolves with time. In fact, we find strong indications of dust extinction in our data (Fig. D1). However, studying shapes measured from synthetic galaxy images at = 0.0, we find no strong dependence of the 2D axis ratio distribution on the filter band. This finding suggests that observing shapes in the same filter at different redshift has only a mild effect on the observed shape distribution (see Appendix D). A second systematic effect can be expected from gravitational lensing, while this effect is typically weak with a contribution of less than 1% to the observed ellipticity (e.g. Kirk et al. 2015).
An important outcome of our analysis is that the model for the 3D galaxy shapes and their distribution provides good fits to the observed distribution of 2D shapes for disc-dominated galaxies, despite the expected inaccuracies. This model can hence be employed to generate mock catalogues of galaxy shapes in large cosmological dark matter-only simulations in which stellar mass distributions of individual galaxies are not provided. In order to construct such mocks, Joachimi et al. (2013) approximated disc galaxies by flat opaque cylinders and showed that the resulting 2D ellipticity distribution differs significantly from COSMOS observations. An improvement on this aspect is crucial for building precision mock galaxy catalogues with intrinsic alignments for the preparation of future weak lensing surveys. Disc galaxies are expected to dominate the lensing source samples at high redshifts in which intrinsic alignments contaminate the gravitational shear induced by the large scale structure at lower redshifts (e.g. Fig. 6). Building and testing such improved mocks is subject of our ongoing work. is part of the Delta ITP consortium, a program of the Netherlands Organisation for Scientific Research (NWO) that is funded by the Dutch Ministry of Education, Culture and Science (OCW). The research of JD is supported by the Beecroft Trust and STFC. Our analysis is based on data products from observations made with ESO Telescopes at the LaSilla Paranal Observatory under ESO programme ID 179.A-2005 and on data products produced by TERAPIX and the Cambridge Astronomy Survey Unit on behalf of the UltraVISTA consortium. Our analysis was performed using J N (Kluyver et al. 2016), with the Python packages Colossus (Diemer 2018), emcee (Foreman-Mackey et al. 2013), Matplotlib (Hunter 2007), NumPy (Harris et al. 2020), pandas (Wes McKinney 2010), pyGTC Bocquet & Carter (2016), SciPy (Virtanen et al. 2020).

AVAILABILITY OF DATA
The galaxy catalogues from COSMOS (COSMOS2015, ACS-GC, ZEST) and Illustris TNG used in this work are publicly available at the links provided in Section 2. The matched COSMOS catalogue described in that section is available upon request to K. Hoffmann (kai.d.hoffmann@gmail.com). The Horizon AGN catalogue is available upon reasonable request to N. E. Chisari (n.e.chisari@uu.nl). Figure A1. Object classified as disc-dominated late-type galaxies in the ZEST catalogue of the COSMOS field. The figure shows 16 randomly selected galaxies for each of our 6 redshift-stellar mass samples.

APPENDIX B: FROM 3D TO 2D AXIS RATIOS
We obtain the 2D axis ratios of a projected 3D ellipsoidal stellar system following Joachimi et al. (2013), under the assumption that the 3D system is absorption-free, self-similar and coaxial. A 3D ellipsoid is thereby expressed in a coordinate system defined by the two orthogonal unit vectors {ˆ,ˆ} which span the projection plane and the unit vectorˆ , which is orthogonal to {ˆ,ˆ}, pointing along the observer's line of sight. In this new coordinate system the principal axes are given byS = { ˆ , ˆ , ˆ } ≡ {˜, ,˜, ,˜ , } with S ∈ { 3 , 3 , 3 }. The projected 2D ellipse is given by all points in the projection plane which fulfil W −1 = 1, where with N ≡ 11 + 22 + √ det W. The absolute value of the ellipticity, = √︃ 2 1 + 2 2 , is related to the 2D axis ratio 2 ≡ 2 / 2 of the projected ellipsoid as where 2 and 2 are the principle axes of the 2D ellipse. An alternative approach for obtaining 2 has been derived by Binney (1985) (see also Stark 1977;Benacchio & Galletta 1980). However, we find this latter calculation to be computationally less efficient.

APPENDIX C: VALIDATING THE MODEL FOR THE 3D AXIS RATIO DISTRIBUTION
We use the shapes of disc galaxies, measured in the HAGN and TNG100 simulations to further validate the approximations that are implied by using the Gaussian model for the 3D axis ratio distribution, ( 3 , 3 ), introduced in Section 3.1.
A possible improvement of that model could be the inclusion of a skewness in the 3 dimension, as demonstrated in Fig. 11. In the top panel of Fig. C1, we show the joint distribution of 3 and 3 for the truncated Gaussian model with and without skewness, using parameters which are typical for the disc galaxies in our analysis. The corresponding distribution of the apparent 2D axis ratios are shown in the bottom panel of the same figure. We find that the effect of the skewness is small compared to the errors on our measurements (see Fig. 15). As a consequence the skewness in the 3 distribution cannot be constrained from our observed data and is therefore not included in our ( 3 , 3 ) model.
A second possible improvement of our model could be the inclusion of a correlation between 3 and 3 which could be described by a covariance matrix in Equation (5). In order to asses the necessity for such a model extension we inspect the kernel density estimates of the ( 3 , 3 ) distribution from HAGN and TNG100 in Fig. C2. We find no evidence for a significant correlation between 3 and 3 in both simulations. For a visual impression of the accuracy of our Gaussian model we in compare it to the measurements in Fig. C2, using the model parameters from the fits to the marginalized 3D axis ratio distributions, shown in Fig. 11. The model describes the simulation data reasonably well. while the relatively weak deviations from the measurements appear to result from the neglected skew rather from a neglected covariance.

APPENDIX D: BIAS IN THE 2D AXIS RATIO DISTRIBUTIONS FROM APPARENT MAGNITUDE CUTS
We study the effect of dust extinction on the color and apparent magnitude of the disc-dominated galaxies from our matched catalogue with < −21.5 in Fig. D1. In this figure we display the apparent -band magnitude and the − color index against the apparent 2D axis ratio 2 in three redshift bins.
We see at all redshifts that galaxies with small apparent axis ratios are significantly redder (i.e. higher color index) and dimmer than those with apparent axis ratios close to unity. These effects can be expected from the extinction by dust in the interstellar medium of the source galaxy, as the pathway of light through the dust of the source towards the observer is longer for discs which are seen edge-on than for face-on objects (i.e. 2 << 1 and 2 1 respectively). As a consequence dust extinction can shift discs with inclined orientations below the apparent magnitude limit (marked as solid red horizontal line in Fig. D1), in particular at high redshifts (e.g. Binney & de Vaucouleurs 1981;Huizinga & van Albada 1992). This effect can introduce a redshift-dependent bias in the observed distribution of axis ratios towards apparently round (face-on) discs, which could be mistaken for an evolution in the intrinsic shape distribution, if not taken properly into account. We demonstrate this effect by applying a cut at = 23, shown as dashed blue horizontal line in Fig. D1. The apparent evolution of the axis ratio distribution introduced by this cut can be seen in the bottom panels of Fig. D1.
An additional problem caused by this selection effect is that orientations of galaxies in samples affected by the apparent magnitude cut cannot be expected to be randomly distributed with respect to the observer, which violates a basic requirement for the 3D shape reconstruction method employed in this work. In order to mitigate these biases we select objects with absolute -band magnitudes below −21.5, which appear not to be affected by the apparent magnitude cut of < 24 for < 1.0, used for selecting our volume limited sample. Figure C2. Distributions of the axis ratios 3 ≡ 3 / 3 and 3 ≡ 3 / 3 of disc galaxies in the HAGN and TNG100 simulations at = 1.0. These distributions are compared to the ( 3 , 3 ) model from Equation (5), which was fitted to the marginalized distributions of 3 and 3 as shown in Fig. 11.

APPENDIX E: DEPENDENCE OF GALAXY SHAPES ON COLOR
The galaxy shape measurements used in this work are based on ACS imaging in the F814W filter. This filter corresponds to a different rest-frame wavelength range at each source redshift, which can affect the observed shape of a given galaxy if its color is not uniformly distributed (for instance, due to extinction and reddening by dust or patchy star formation). We study the impact of this systematic effect on the distribution of apparent axis ratios measured from second-order moments in synthetic images of disc galaxies from the TNG100 simulations at = 0.0, produced by Rodriguez-Gomez et al. (2019, see Section 2.2.3). These measurements were performed in the SDSS -and -bands, which correspond roughly to the ACS F814 band at = 0.0 and 0.5 < < 1.0 respectively, as illustrated in Fig. E1. In Fig. E2 we show that the change of the apparent axis ratio distribution is weak, compared to the shot-noise errors which we expect for our COSMOS samples (see Fig. 15). We find the same result when using PSF corrected axis ratios obtained by Rodriguez-Gomez et al. (2019) from Sérsic profile fits to the same synthetic images. These findings line up with those from Georgiou et al. (2019), who report a minor difference between galaxy ellipticities measured in different filters of the KiDS survey. . Apparent 2D axis ratios 2 of disc-dominated galaxies in COSMOS with absolute -band magnitudes brighter than = −21.5 versus the rest-frame color index and the apparent -band magnitude (top and central panels respectively). The solid red lines in the central panels indicate the apparent magnitude cut of our volume limited sample at = 24. The dashed blue lines mark a cut at = 23. The corresponding 2 distributions, shown in the bottom panels, display how the apparent magnitude cut can bias the 2 distribution, in particular at high redshifts.
They are further consistent with from Ryden (2006), who shows that axis ratios measured in the and band are strongly scattered, but not biased. We therefore do not expect our results to be significantly affected by the fact that the ACS F814W filter probes different rest-frame wavelengths at different redshifts.

APPENDIX F: DEPENDENCE OF PARAMETER CONTOURS ON NOISE IN THE 2D AXIS RATIO DISTRIBUTION
We aim to test in this section if the strong variations of the posteriors from different redshifts samples, shown in the central and bottom panels of Fig. 13, can be expected if the axis ratios of galaxies in these samples are drawn randomly from the same redshift independent axis ratio distribution.
We tackle this question with a numerical experiment which is based on two sets of 10 6 artificial ellipsoids whose 3D axis ratio distributions we generate with our Gaussian model from eq. (5) and (6). The parameters used to generate these two samples are the best fit values for the high and low mass samples in the entire redshift range of the COSMOS main sample, given in Table 5. These low and high mass parameters are referred to as parameter set 1 and 2 respectively. The distributions of 2D axis ratios from the corresponding samples are derived assuming random orientations and are shown in the top panels of Fig. F1. From each of the two sets we draw three sub-sets with 500 randomly selected objects, which corresponds roughly to the size of our observational samples. The 2D axis ratio distributions of these sub-sets are shown in the three lower panels of Fig. F1. We proceed by performing the same MCMC SDSS g (z=0.00) Figure E1. Filters of the SDSS -and -bands and the ACS F814W filter used for the imaging in the COSMOS survey (dashed and solid lines respectively). The F814W filter covers the near infrared wavelengths at = 0.0, which corresponds to red and green wavelengths at the lowest and highest redshift bin used in our analysis (with mean redshifts of = 0.56 and = 95 respectively).
parameter inference as for the observational samples, described in Section 3.2. The best fit model prediction for each sub-set is shown as red line in Fig. F1. The corresponding posterior distributions are shown in Fig. F2. As for the observational data we find strong variations across results for the different sub-sets. This finding can  Figure E2. Distribution of apparent 2D axis ratios, measured from synthetic images of disc galaxies in the Illustris TNG100 simulation at = 0.0 in the SDSS -and -bands. Figure F1. 2D axis ratio distributions, generated from our model for the 3D axis ratio distribution for two different sets of parameters. The top panels show results for the parent samples, consisting of 10 6 objects. The bottom panels show results from three random sub-samples (a,b,c) of the parent sample, consisting of 500 objects each. Red lines show fits from 3D axis ratio distribution model to each sub-sample. Figure F2. Posteriors of the parameters of the Gaussian model for the 3D axis ratio distribution, derived from the 2D axis ratios of projected ellipsoids, whose 3D axis ratios follow the same Gaussian model (Fig. F1). Results are shown for the three random sub-samples (a,b,c). Light and dark areas mark 95% and 68% confidence levels respectively. The parameter values used for generating the parent samples are marked as horizontal and vertical lines. Top and bottom panels show results for the model parameter sets 1 and 2 respectively. be expected, since the posterior is computed directly from the data. Variations in the data hence translate into variations in the posterior. The true values of the input model parameters, shown in Fig. F2 by horizontal and vertical lines, lie within the 95% confidence intervals, but often outside of the 68% intervals. This test shows that strong variations of the posteriors can be expected for samples that are drawn from the same underlying distribution if the sample size is similar to the sizes of our observational redshift-stellar mass samples. This paper has been typeset from a T E X/L A T E X file prepared by the author.