It has been widely claimed that several lines of observational evidence point towards a ‘downsizing’ of the process of galaxy formation over cosmic time. This behaviour is sometimes termed ‘antihierarchical’, and contrasted with the ‘bottom-up’ (small objects form first) assembly of the dark matter structures in cold dark matter (CDM) models. In this paper, we address three different kinds of observational evidence that have been described as ‘downsizing’: the stellar mass assembly (i.e. more massive galaxies assemble at higher redshift with respect to low-mass ones), star formation rate (SFR) (i.e. the decline of the specific star formation rate is faster for more massive systems) and the ages of the stellar populations in local galaxies (i.e. more massive galaxies host older stellar populations). We compare a broad compilation of available data sets with the predictions of three different semi-analytic models of galaxy formation within the ΛCDM framework. In the data, we see only weak evidence at best of ‘downsizing’ in stellar mass and in SFR. Despite the different implementations of the physical recipes, the three models agree remarkably well in their predictions. We find that, when observational errors on stellar mass and SFR are taken into account, the models acceptably reproduce the evolution of massive galaxies (M > 10^{11} M⊙ in stellar mass), over the entire redshift range that we consider (0 ≲ z ≲ 4). However, lower mass galaxies, in the stellar mass range 10^{9}–10^{11} M⊙, are formed too early in the models and are too passive at late times. Thus, the models do not correctly reproduce the downsizing trend in stellar mass or the archaeological downsizing, while they qualitatively reproduce the mass-dependent evolution of the SFR.
We demonstrate that these discrepancies are not solely due to a poor treatment of satellite galaxies but are mainly connected to the excessively efficient formation of central galaxies in high-redshift haloes with circular velocities ∼100–200 km s^{-1}. We conclude that some physical processes operating on these mass scales – most probably star formation and/or supernova feedback – are not yet properly treated in these models.
In the last decades, the parameters of the cosmological model have been tightly constrained (Komatsu et al. 2009 and references therein), and the cold dark matter (CDM) paradigm has proved to be very successful in reproducing a large number of observations, particularly on large scales. The current standard paradigm for structure formation predicts that the collapse of dark matter (DM) haloes proceeds in a ‘bottom-up’ fashion, with smaller structures forming first and later merging into larger systems. It has long been known that galaxies do not share the same ‘bottom-up’ evolution, at least in their star formation (SF) histories. The most massive galaxies – mainly giant ellipticals hosted in galaxy groups and clusters – are dominated by old stellar populations. In contrast, faint field galaxies appear to have continued to actively form stars over the last billion years, and their stellar populations are dominated by young stars. This evidence is not necessarily in contrast with the hierarchical clustering of DM haloes as it relates to the ‘formation’ of the main stellar population of a galaxy, which does not necessarily coincide with the ‘assembly’ of its stellar mass and/or the assembly of its parent DM halo.
In the last decade, much observational effort has been devoted to quantifying the dependence of galaxy formation and assembly on stellar mass. In one of the earliest such studies, Cowie et al. (1996) showed that the maximum rest-frame K-band luminosity of galaxies undergoing rapid SF in the Hawaii Deep Field declines smoothly with cosmological time. Cowie and collaborators coined the term ‘downsizing’ (DS) to describe this behaviour. Since then, the same term has been extended to a number of observational trends suggesting either older ages, earlier active SF or earlier assembly for more massive galaxies with respect to their lower mass counterparts. Using the same word to describe very different kinds of observational results has naturally generated some confusion. The underlying thought has clearly been that these observations are all manifestations of the same underlying physical process. It is not clear to which degree this is in fact the case or to what degree these observational trends are ‘antihierarchical’, i.e. whether they are in fact in serious conflict with predictions from models based on ΛCDM cosmology.
It is useful, at this point, to summarize the different types of ‘DS’ that have been discussed in the literature. Clearly, each line of observational evidence discussed below has its own set of uncertainties and potential biases. Here, we report the trends as they have been claimed in the literature, and discuss in more detail the related uncertainties and caveats later. The first two types of DS that we describe are based on the local ‘fossil record’ and are related to the time of ‘formation’ of the stellar population, i.e. they tell us that the bulk of the stars in more massive galaxies formed earlier and on shorter time-scales than in their lower mass counterparts. These two types of DS are as follows.
Chemo-archaeological DS: among elliptical galaxies, more massive objects have higher (up to supersolar) [α/Fe] ratios. This result was first reported by Faber, Worthey & Gonzalez (1992) and Worthey, Faber & Gonzalez (1992), who suggested three possible (and equally acceptable at that time) explanations: (i) different SF time-scales; (ii) a variable initial mass function (IMF) and (iii) selective mass-loss mechanisms. Several studies have since confirmed this observational trend (Carollo, Danziger & Buson 1993; Davies, Sadler & Peletier 1993; Trager et al. 2000b; Kuntschner et al. 2001), and a standard interpretation has become that of shorter formation time-scales in more luminous/massive galaxies (Matteucci 1994; Thomas et al. 2005), though other interpretations have not been conclusively ruled out.
Archaeological DS: more massive galaxies host older stellar populations than lower mass galaxies. A direct estimate of stellar ages is hampered by the well-known age–metallicity degeneracy (e.g. Trager et al. 2000a and references therein), although it has long been known that there are some spectral features (like Balmer lines) that are more sensitive to age than to metallicity (see e.g. Worthey 1994). Recent detailed analyses, based on a combination of spectral indexes or on a detailed fit of the full high-resolution spectrum, have confirmed a weak trend between stellar mass and age both in clusters (Nelan et al. 2005; Thomas et al. 2005, though see Trager, Faber & Dressler 2008) and in the field (Trager et al. 2000a; Heavens et al. 2004; Gallazzi et al. 2005; Panter et al. 2007).
The second kind of observational evidence for DS comes from ‘look-back studies’, or observations of galaxies at different cosmic epochs.
DS in (specific) star formation rate (SFR): the mass of ‘star-forming galaxies’ declines with decreasing redshift. This trend was first seen by Cowie et al. (1996), and there have been many claimed confirmations by subsequent deeper and/or wider observational programmes (Brinchmann et al. 2004; Kodama et al. 2004; Bauer et al. 2005; Feulner et al. 2005; Bundy et al. 2006; Pannella et al. 2006; Papovich et al. 2006; Bell et al. 2007; Noeske et al. 2007; Cowie & Barger 2008; Drory & Alvarez 2008; Vergani et al. 2008; Chen et al. 2009). This trend can also be recast as implying that the SFR density or specific star formation rate (SSFR) declines more rapidly for more massive systems; here there are conflicting claims in the literature about whether such a trend is in fact seen or not (e.g. Juneau et al. 2005; Conselice et al. 2007; Zheng et al. 2007; Mobasher et al. 2009). This trend is reflected in the SSFRs of nearby spiral galaxies, which are higher for lower mass objects (Boselli et al. 2001). A possibly related trend is the increase with time of faint red-sequence galaxies in galaxy clusters (see e.g. De Lucia et al. 2004b, 2007; Gilbank et al. 2008), which may be due to a differential decline of the SSFR.
DS in stellar mass: the high-mass end of the stellar mass function (MF) evolves more slowly than the low-mass end, indicating that massive galaxies were assembled earlier than less massive ones. The same result is found both by correcting the B- or K-band luminosity function (LF) for ‘passive’ evolution (Cimatti, Daddi & Renzini 2006) and by estimating the stellar mass using multiwavelength photometry (Drory et al. 2004, 2005; Borch et al. 2006; Bundy et al. 2006; Fontana et al. 2006; Conselice et al. 2007; Pozzetti et al. 2007; Marchesini et al. 2008; Pérez-González et al. 2008). The significance of these claims has been recently questioned by Marchesini et al. (2008).
DS in metallicity: the stellar metallicity of more massive galaxies appears to decrease with redshift more slowly than for less massive galaxies (Savaglio et al. 2005; Erb et al. 2006; Ando et al. 2007; Maiolino et al. 2008). It is important to note, however, that often different indicators are used at different redshifts, and that there are large uncertainties in the metallicity calibration (Kewley & Ellison 2008).
DS in nuclear activity: the number density of active galactic nuclei (AGN) peaks at higher redshift when brighter objects are considered. This trend is found both for X-ray (Ueda et al. 2003; Hasinger, Miyaji & Schmidt 2005) and for optically (Cristiani et al. 2004; Fontanot et al. 2007) selected AGN, but it strongly depends on the modelling of obscuration (e.g. La Franca et al. 2005).
DS trends have often been considered ‘antihierarchical’, suggesting expected and/or demonstrated difficulties in reconciling the observed trends with predictions from hierarchical galaxy formation models. The naive expectation is that, as for DM haloes, galaxy formation also proceeds in a bottom-up fashion with more massive systems ‘forming’ later. It has already been pointed out in early theoretical work (Baugh, Cole & Frenk 1996; Kauffmann 1996) that the epoch of formation of the stars within a galaxy does not necessarily coincide with the epoch of the galaxy's assembly. Moreover, Neistein, van den Bosch & Dekel (2006) (see also Li, Mo & Gao 2008) suggested that a certain degree of ‘natural DS’ is actually expected in the CDM paradigm if one assumes that there is a minimum halo mass that can support SF and considers the integrated mass in all progenitor haloes rather than just that in the main progenitor. However, several authors (Cimatti et al. 2006; Fontana et al. 2006; Fontanot et al. 2006; Cirasuolo et al. 2008) have argued that the observed mass assembly DS represents a challenge for modern hierarchical galaxy formation models. In addition, CDM models have been unable to reproduce the observed chemo-archaeological DS (Thomas 1999; Nagashima et al. 2005; Thomas et al. 2005; Pipino et al. 2008), and Somerville et al. (2008, hereafter S08) and Trager & Somerville (2009) have shown that the modern generation of models does not quantitatively reproduce the archaeological DS trend in the field or in rich clusters.
Early phenomenological models of joint galaxy–AGN formation by Monaco et al. (2000) and Granato et al. (2001) produced ‘antihierarchical’ formation of elliptical galaxies in ΛCDM haloes by delaying quasar activity in less massive haloes. More recently, it has been suggested that AGN feedback could provide a solution to the ‘DS problem’ (Bower et al. 2006; Croton et al. 2006). The suppression of late gas condensation in massive haloes gives rise to shorter formation time-scales for more massive galaxies (De Lucia et al. 2006), in qualitative agreement with the observed trends. However, the recent work by S08 indicates that the predicted trends may not be as strong as the observed ones, even in the presence of AGN feedback.
Moreover, AGN feedback does not stop the growth in stellar mass via mergers. ΛCDM models predict that the stellar masses of the most massive galaxies have increased by a factor of 2 or more since z∼ 1 via gas-poor ‘dry mergers’ (De Lucia et al. 2006; De Lucia & Blaizot 2007). It has been suggested that if mergers scatter a significant fraction of the stars in the progenitor galaxies into a ‘diffuse stellar component’, then perhaps one can reconcile the CDM predictions with the observed weak evolution in the stellar MF since z∼ 1 (Monaco et al. 2006; Conroy, Wechsler & Kravtsov 2007; S08), but observational uncertainties on the amount of diffuse light are still too large to strongly constrain models of this process.
Despite the large number of papers related to the subject of ‘DS’, a detailed and systematic comparison between a broad compilation of observational data and predictions from hierarchical galaxy formation models is still missing. Our study is a first attempt in this direction. We present here predictions from three different semi-analytic models (see Section 2), all of which have been tuned to provide reasonably good agreement with the observed properties of galaxies in the local Universe, and compare them to an extensive compilation of recent data on the evolution of the stellar MF, SSFRs and SFR densities, as well as with observational determinations of stellar population ages as a function of mass in nearby galaxies. An important new aspect of our study is that we consider three (claimed) ‘manifestations’ of DS simultaneously. Because these very different kinds of observations have very different potential selection effects and biases, this allows us to make a strong argument that, when a discrepancy is seen between the models and all three kinds of observations, this discrepancy is due to shortcomings in the physical ingredients of the models rather than errors or biases in the observations. Similarly, by making use of three independently developed semi-analytic models, which include different implementations of the main physical processes, we can hope to determine which conclusions are robust to model details.
In this study, we do not address the ‘chemo-archaeological DS’ or the ‘DS in metallicity’. Our chemical enrichment models are all based on an instantaneous recycling approximation which prevents us from making detailed comparisons with observed elemental abundances. We have also decided not to discuss here the ‘DS in AGN activity’, which depends strongly on the complicated and poorly understood physics of accretion on to black holes and on its relation with SF activity (see e.g. Menci et al. 2004, 2008; Fontanot et al. 2006).
This paper is organized as follows: in Section 2, we give a brief introduction to the models we use in our study. We then present our results for the DS in stellar mass (Section 3), in SFR (Section 4) and on the archaeological DS (Section 5). In Section 6, we discuss our results and give our conclusions. Throughout this paper, we assume a cosmological model consistent with the Wilkinson Microwave Anisotropy Probe 3 (WMAP3) results.
We consider predictions from three independently developed codes that use semi-analytic modelling (SAM) techniques to simulate the formation of galaxies within the ΛCDM cosmogony (for a review on these techniques, see Baugh 2006). In SAMs, the evolution of the baryonic component of galaxies – which are assumed to form when gas condenses at the centre of DM haloes – is modelled using simple but physically motivated analytic ‘recipes’. The parameters entering these analytic approximations of the various physical processes are usually fixed by comparing model predictions to observational data of local galaxies. Although the treatment of the physical processes is necessarily simplified, this technique allows modellers to explore (at least schematically) a broad range of processes that could not be directly simulated simultaneously (e.g. accretion onto a super-massive black hole on sub-pc scales within the framework of cosmological structure formation), and to explore a wide parameter space.
Most of the various SAMs proposed in the literature are attempting to model the same basic set of physical processes. When a comparison is made of several SAMs with observations, one may focus on differences between models, with the aim of understanding how the details of a particular implementation influence the predictions of galaxy properties. Alternatively, one may concentrate on comparing the model predictions with the observational data. In this case, the focus shifts to assessing whether the general framework, namely ΛCDM + the set of physical processes implemented, gives a plausible description of galaxy populations.
In this paper, we take the second approach. We use three SAMs: (i) the most recent implementation of the Munich model (De Lucia & Blaizot 2007) with its generalization to the WMAP3 cosmology discussed in Wang et al. (2008, hereafter WDL08); (ii) the morgana model, presented in Monaco, Fontanot & Taffoni (2007), adapted to a WMAP3 cosmology, and with some minor improvements which will be presented in Lo Faro et al. (2009); and (iii) the fiducial model presented in S08, which builds on the previous implementation discussed in Somerville, Primack & Faber (2001). All models adopt, for the results discussed in this study, a Chabrier (2003) IMF.
In the following, we briefly summarize the main physical ingredients of SAMs, and then highlight the main differences between the implementations of these ingredients in the three models used here. For more details, we refer to the original papers mentioned above and to the references therein.
We first summarize the elements that are common to all three models. The backbone of all three SAMs is a ‘merger tree’, which describes the formation history of DM haloes through mergers and accretion. When a halo merges with a larger virialized halo, it becomes a ‘subhalo’ and continues to orbit until it either is tidally destroyed or merges with the central object. Gas cools and condenses via atomic cooling, and forms a rotationally supported disc. This cold disc gas becomes available for SF, which is modelled using simple empirical (Schmidt–Kennicutt-like) recipes. Galaxy mergers trigger enhanced ‘bursts’ of SF. After the Universe becomes reionized, gas infall is suppressed in low-mass haloes (≲30–50 km s^{-1}) due to the photoionizing background. SF deposits energy into the cold gas, and may reheat or expel this gas. The production of chemical elements by Type II supernovae is tracked using a simple instantaneous recycling approximation with the effective yield taken as a free parameter. All three codes also track the formation of supermassive black holes, and differentiate between the so-called ‘bright mode’ (or ‘quasar mode’) which is associated with luminous AGN, and ‘radio-mode’ accretion which is related to efficient production of radio jets. The ‘bright mode’ is associated with galaxy–galaxy mergers (WDL08 and S08) or Eddington-limited accretion rates (morgana), while the ‘radio mode’ is associated with low accretion rates (few per cent of Eddington). All three models include ‘radio-mode’ feedback (heating of the hot gas halo by giant radio jets). morgana and S08 also include galactic scale AGN-driven winds, which can remove cold gas from galaxies.
All three models are coupled with stellar population synthesis models and a treatment of dust absorption, and are capable of predicting observable quantities like luminosities and colours in various bands. However, modelling these additional ingredients (especially dust, as recently shown by Fontanot et al. 2009) introduces a large number of additional uncertainties and degrees of freedom in both the model–model and the model–data comparison. To simplify the interpretation of our results, we therefore conduct our entire analysis in the space of ‘physical’ quantities (e.g. stellar masses and SFRs), which are directly predicted by the models, and may be extracted from multiwavelength observations.
Here, we highlight a few of the differences between the model implementations.
Cosmological and numerical parameters. All three semi-analytic models adopt values of the cosmological parameters that are consistent with the WMAP3 results within the quoted errors. WDL08 use a simulation box of 125 h^{-1} Mpc on a side, with cosmological parameters (Ω_0, Ω_Λ, h, σ_8, n_sp) = (0.226, 0.774, 0.743, 0.722, 0.947). morgana uses a 144 h^{-1} Mpc box, and adopts a cosmology with (Ω_0, Ω_Λ, h, σ_8, n_sp) = (0.24, 0.76, 0.72, 0.8, 0.96). S08 uses a grid of 100 realizations of 100 ‘root’ haloes, with circular velocities ranging from 60 to 1200 km s^{-1}, and weights the results with the Sheth & Tormen (1999) halo MF. The S08 model used here assumes the following cosmological parameters (Ω_0, Ω_Λ, h, σ_8, n_sp) = (0.279, 0.721, 0.701, 0.817, 0.96). In all cases, the mass resolution is sufficient to resolve galaxies with stellar mass larger than 10^{9} h^{-1} M⊙. The very small differences in the cosmological parameters in the three models will have a nearly undetectable impact on our predictions, and therefore we make no attempt to correct the results for the slightly different cosmologies.
Merger trees. The WDL08 model uses merger trees extracted from a dissipationless N-body simulation (Springel et al. 2005), morgana uses the Lagrangian semi-analytic code pinocchio (Monaco et al. 2002) and S08 use a method based on the extended Press–Schechter formalism, described in Somerville & Kolatt (1999).
Substructure. The WDL08 model explicitly follows DM substructures in the N-body simulation until tidal truncation and stripping reduce their mass below the resolution limit of the simulation (De Lucia et al. 2004a; Gao et al. 2004). Beyond that point the merger time of the satellite is computed using the classical Chandrasekhar dynamical friction approximation (for more details, see De Lucia & Blaizot 2007; De Lucia & Helmi 2008). morgana and S08 do not track explicitly DM substructures, and assume that satellite galaxies merge on to central galaxies after a dynamical friction time-scale which is assigned at the time the satellite enters the virial radius of the remnant structure, following Taffoni et al. (2003) in the case of morgana and Boylan-Kolchin, Ma & Quataert (2008) in the case of S08. The same models also account for tidal destruction of satellites.
Cooling model. WDL08 and S08 use variations of the original cooling recipe of White & Frenk (1991), while morgana uses a modified model, described and tested against simulations in Viola et al. (2008), that predicts an enhanced cooling rate at the onset of cooling flows.
Galaxy sizes, SF and SN feedback. The three SAMs also differ in the details of the modelling of SF, stellar feedback and galactic winds, as well as in the computation of galaxy sizes. We prefer not to discuss these processes in detail here, and refer the reader to the original papers for more details.
BH growth and AGN feedback. In the WDL08 and S08 models, the radio mode is fuelled by accretion from the hot gas halo, and only haloes that can support a quasi-hydrostatic halo are subject to the radio-mode heating (though the conditions used differ, see Croton et al. 2006 and S08). In morgana, the radio-mode accretion comes from the cold gas reservoir surrounding the black hole. In the WDL08 and S08 models, the ‘bright mode’ or quasar mode is explicitly triggered by galaxy–galaxy mergers (though again, the details of the implementation differ), while in morgana it is associated with Eddington-limited high accretion rates (again coming from the cold reservoir). As noted above, morgana and S08 include galactic scale AGN-driven winds associated with the bright mode, while WDL08 do not.
The three models were each normalized to fit a subset of low-redshift observations. The specific observations used and the weight given to different observations in choosing a favoured normalization are different for each of the three models, and we refer to the original papers for details. The most important free parameters in all three models are those controlling the efficiency of supernova feedback, SF and ‘radio-mode’ AGN feedback. The efficiency of supernova feedback is primarily constrained by the observed low-mass slope of the stellar MF (morgana and S08) or the faint-end slope of the luminosity function (WDL08). The efficiency of SF is mainly constrained by observations of gas fractions in nearby spiral galaxies (WDL08 and S08) or by the cosmic SFR density (morgana). The efficiency of the ‘radio-mode’ AGN feedback is constrained by the bright or high-mass end of the observed LF or stellar MF. Other important parameters are the effective yield of heavy elements, which is constrained by the observed mass–metallicity relation at z= 0, and the efficiency of black hole growth, which is constrained by the observed z= 0 black hole mass versus bulge mass relationship. We emphasize that we made no attempt to tune the models to match each other or to match any of the high-redshift data that we now compare with.
3 DOWNSIZING IN STELLAR MASS
In this section, we focus on the evolution of the galaxy stellar MF. Our model predictions are compared with a compilation of published observational estimates using different data sets and methods to compute stellar masses. In the past, the rest-frame near-infrared light has been widely used as a tracer of the galaxy stellar mass (Cole et al. 2001; Bell et al. 2003). In more recent times, most mass estimates (Drory et al. 2004, 2005; Borch et al. 2006; Bundy et al. 2006; Fontana et al. 2006; Marchesini et al. 2008; Pérez-González et al. 2008) have been based on multiwavelength spectral energy distribution (SED) fitting algorithms. In this approach, broad-band photometry is compared to a library of synthetic SEDs, covering a relatively wide range of possible SF histories, metallicities and dust attenuation values. A suitable algorithm is then used to select the ‘best-fitting’ solution, thus simultaneously determining photometric redshift, galaxy stellar mass and SFR. Stellar mass estimates are therefore subject to several degeneracies (age, metallicity and dust), and their accuracy depends sensitively on the library of SF histories employed and on the wavelength range covered by observations (see e.g. Fontana et al. 2004; Pozzetti et al. 2007; Marchesini et al. 2008). In particular, most of these algorithms assume relatively simple analytic SF histories (with, in some cases, some bursty SF episodes superimposed), while SAMs typically predict much more complex SF histories, with a non-monotonic behaviour and erratic bursts. This may result in certain biases in the physical parameters obtained from this method (Lee et al. 2008). Similarly, using different libraries of SF histories has an effect on the final mass determination (Pozzetti et al. 2007; Stringer et al. 2009).
Additional sources of uncertainty may come from the physical ingredients in the adopted stellar population models, for example the treatment of particular stages of stellar evolution, such as TP-AGB stars (Maraston et al. 2006; Tonini et al. 2009). Moreover, due to the relatively small volumes probed at high redshift, cosmic variance due to large-scale clustering is a significant source of uncertainty, particularly in the number density of high-mass objects.
When high signal-to-noise ratio spectroscopy is available, galaxy stellar masses can be estimated by comparison of the observed spectra with theoretical SEDs (e.g. Panter et al. 2007 for Sloan Digital Sky Survey (SDSS) data). In this case, the finer details of the spectrum can be used to give tighter constraints on, for example, stellar ages and metallicities. However, the method is not free from uncertainties due to model degeneracies, and contamination from AGN and/or strong emission lines (usually not included in the theoretical SEDs) can, in principle, introduce systematic errors or simply limit the accuracy of the mass estimate.
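The template-fitting idea described in this section can be illustrated with a toy sketch. It is not the algorithm of any of the cited surveys: the two-template library, the four band fluxes and all function names are hypothetical, and the redshift is assumed to be known. Each template is expressed as flux per unit stellar mass, so the best-fitting amplitude is itself the mass estimate.

```python
# Toy chi-square template fit. All names and numbers are hypothetical;
# real SED-fitting codes also scan redshift, dust and metallicity.

def best_scale(fluxes, errors, template):
    """Analytic least-squares amplitude that scales a template to the data."""
    num = sum(f * t / e ** 2 for f, t, e in zip(fluxes, template, errors))
    den = sum(t * t / e ** 2 for t, e in zip(template, errors))
    return num / den

def chi2(fluxes, errors, template, scale):
    """Chi-square of the scaled template against the observed fluxes."""
    return sum(((f - scale * t) / e) ** 2
               for f, t, e in zip(fluxes, template, errors))

def fit_sed(fluxes, errors, library):
    """Return (chi2, template name, stellar mass) of the best-fitting template.

    `library` maps a template name to band fluxes per unit stellar mass.
    """
    results = []
    for name, template in library.items():
        scale = best_scale(fluxes, errors, template)
        results.append((chi2(fluxes, errors, template, scale), name, scale))
    return min(results)  # tuples sort by chi2 first

# Made-up library: relative fluxes in four bands, blue to red.
library = {
    "old_passive": [0.2, 0.6, 1.0, 1.2],
    "young_star_forming": [1.0, 1.1, 1.0, 0.8],
}
# Noise-free mock galaxy: an old template scaled to 2e10 (arbitrary units).
observed = [2.0e10 * f for f in library["old_passive"]]
errors = [0.05 * f for f in observed]
chi2_min, best_name, mass = fit_sed(observed, errors, library)
```

With noise-free mock data the fit recovers the input template and amplitude. The degeneracies discussed above arise when several templates give comparable χ² values but different amplitudes.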
In Fig. 1, we show a compilation of different observational measurements of the galaxy stellar MF from Two-Micron All-Sky Survey (2MASS) (Cole et al. 2001), 2MASS+SDSS (Bell et al. 2003), Munich Near-IR Cluster Survey (MUNICS) (Drory et al. 2004), FORS Deep Field + Great Observatories Origins Deep Survey (FDF+GOODS) (Drory et al. 2005), Classifying Objects by Medium-Band Observations (COMBO17) (Borch et al. 2006), DEEP Extragalactic Evolutionary Probe 2 (DEEP2) (Bundy et al. 2006), GOODS-Multiwavelength Southern Infrared Catalog (MUSIC) (Fontana et al. 2006), SDSS (Panter et al. 2007), VIMOS VLT Deep Survey (VVDS) (Pozzetti et al. 2007), Spitzer (Pérez-González et al. 2008), Multiwavelength Survey by Yale-Chile (MUSYC)+Faint InfraRed Extragalactic Survey (FIRES)+GOODS-Chandra Deep Field South (CDFS) (Marchesini et al. 2008) (green points; left- and right-hand panels show the same data). All estimates have been converted to a common (Chabrier) IMF when necessary; we use a factor of 0.25 dex to convert from Salpeter to Chabrier. These stellar MFs are fairly consistent among themselves, but the scatter becomes larger at higher redshift, in particular for the high-mass tail (which is significantly affected by cosmic variance).
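The IMF conversion quoted above amounts to a constant shift in log mass. A minimal sketch follows, assuming that the 0.25 dex offset lowers a Salpeter-based mass (the Chabrier IMF contains fewer low-mass stars, so it implies less mass for the same light). The function names are hypothetical.

```python
# Constant-offset IMF conversion. The 0.25 dex value is the one quoted
# in the text; the sign convention (Chabrier masses are lower) is an
# assumption stated in the accompanying paragraph.

SALPETER_TO_CHABRIER_DEX = 0.25

def log_salpeter_to_chabrier(log_mass_salpeter):
    """Shift log10(M*/Msun) from a Salpeter to a Chabrier IMF."""
    return log_mass_salpeter - SALPETER_TO_CHABRIER_DEX

def salpeter_to_chabrier(mass_salpeter):
    """Same conversion applied to a linear stellar mass."""
    return mass_salpeter * 10.0 ** (-SALPETER_TO_CHABRIER_DEX)

log_mass_chabrier = log_salpeter_to_chabrier(11.0)
mass_chabrier = salpeter_to_chabrier(1.0e11)
```

A shift of 0.25 dex corresponds to multiplying the linear mass by 10^{-0.25}, i.e. roughly 0.56.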
A note on the errors and uncertainties associated with these observationally derived stellar MFs is in order. Most published papers quote only Poisson errors on their MF estimates. However, as noted above, both systematic and random errors can arise from the unknown true SF histories, metallicities and dust corrections, and also from photometric redshift errors, differences in stellar population models, the unknown stellar IMF and its evolution, and cosmic variance. Marchesini et al. (2008) carried out an extensive investigation of the impact of all of these sources of uncertainty on their derived stellar MFs. In their figs 13 and 14, they show a comparison of their results, including these comprehensive error estimates, with the three models presented here. Their analysis shows that the evidence for differential evolution in the stellar MF at z < 2, with more massive galaxies evolving more slowly than less massive ones, becomes weak when all sources of uncertainty in the stellar mass estimates are considered. When this is done, the observed evolution appears to be consistent with pure density evolution.
In order to make DS in stellar mass more evident, we divide galaxies in bins of stellar mass and, by averaging over the MF estimates of Fig. 1, compute the stellar mass density (SMD) of galaxies in these stellar mass bins as a function of redshift. These stellar mass densities agree with estimates published by Conselice et al. (2007) and Cowie & Barger (2008) and are shown in Fig. 2 (left- and right-hand panels contain the same data). The quoted errors refer only to the scatter between the estimates from different samples (note that this scatter is larger than the quoted errors on individual determinations, confirming that these errors are underestimated, as discussed above).
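The stellar mass density in a mass bin is the integral of M★ times the MF over that bin. The sketch below illustrates this for a single analytic MF. It assumes a Schechter form with made-up parameters and illustrative bin edges. It does not reproduce the averaging over the observed MF estimates described above.

```python
import math

def schechter_dlogm(log_m, phi_star, log_m_star, alpha):
    """Schechter number density per dex in stellar mass (hypothetical MF)."""
    x = 10.0 ** (log_m - log_m_star)
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)

def stellar_mass_density(mf, lo, hi, steps=2000):
    """Trapezoidal integral of M * phi(log M) over [lo, hi] in log M."""
    h = (hi - lo) / steps
    total = 0.0
    for i in range(steps + 1):
        log_m = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        total += weight * 10.0 ** log_m * mf(log_m)
    return total * h

# Illustrative parameters only; not a fit to any data set in Fig. 1.
def mf(log_m):
    return schechter_dlogm(log_m, 4.0e-3, 10.9, -1.2)

bins = [(9.0, 10.0), (10.0, 10.5), (10.5, 11.0), (11.0, 11.5), (11.5, 12.0)]
smd = {b: stellar_mass_density(mf, b[0], b[1]) for b in bins}
```

Repeating the calculation for the MF at each redshift would give one ρ★(z) curve per mass bin, which is the quantity plotted in Fig. 2.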
DS in stellar mass should consist of a differential growth of SMD, such that massive galaxies are assembled earlier and more rapidly than low-mass galaxies. Examining Fig. 2, it is hard to claim convincing evidence for such a behaviour from these data: although the evidence for the growth of SMD is clear, its rate of growth is very similar for all mass bins. To further illustrate this point, we perform a linear regression of the log ρ★–z relation in each mass bin; the slopes we obtain are consistent within their statistical errors. We then rescale the densities to the z = 0 value of their regressions and fit the whole sample, obtaining the thick dotted line in the left-hand panels of Fig. 2. The fit, valid in the 10^{9} < M★/M⊙ < 10^{12} range, has a χ^2 probability of >95 per cent.
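The regression and rescaling steps can be sketched as follows. This is an unweighted least-squares illustration on made-up numbers, with hypothetical function names. The weighting by the errors and the χ² probability quoted above are not reproduced.

```python
def linear_fit(x, y):
    """Ordinary least-squares slope and intercept of y against x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x

def rescale_and_pool(bins):
    """Subtract each bin's fitted z = 0 value, then pool all points."""
    pooled_z, pooled_y = [], []
    for z, log_rho in bins.values():
        _, intercept = linear_fit(z, log_rho)
        pooled_z.extend(z)
        pooled_y.extend(v - intercept for v in log_rho)
    return pooled_z, pooled_y

# Made-up data: two mass bins with the same slope, different normalization.
z = [0.0, 0.5, 1.0, 2.0]
bins = {
    "low_mass": (z, [8.0 - 0.3 * zi for zi in z]),
    "high_mass": (z, [7.0 - 0.3 * zi for zi in z]),
}
slopes = {name: linear_fit(zs, ys)[0] for name, (zs, ys) in bins.items()}
pooled_slope, pooled_intercept = linear_fit(*rescale_and_pool(bins))
```

When the per-bin slopes agree, as in this constructed example, a single line describes the pooled, rescaled points. That is the signature of mass-independent growth rather than DS.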
The predicted stellar MFs for the morgana, WDL08 and S08 models (solid, dashed and dot–dashed lines, respectively) are shown in the left-hand panels of Figs 1 and 2, while in the right-hand panels of the two figures, the model stellar masses have been convolved with a statistical error on log M★. As we have discussed above, this error distribution depends on many factors, such as the specific algorithm and stellar population models used to estimate stellar masses for each sample, the magnitude and redshift of the galaxy, and characteristics of each observational survey, such as the volume covered and the number and wavelength coverage of the photometric bands that are available. A detailed accounting of this complex error distribution for each observational data set is clearly beyond the scope of this paper. Rather than simply ignoring the impact of errors in the stellar mass estimates on our data-model comparison, as has usually been done in the past, we adopt a simple approach that is meant to be illustrative rather than definitive. We assume that the error has a Gaussian distribution (independent of mass and redshift) with a standard deviation of 0.25 dex. This assumed uncertainty roughly corresponds to the mean value of the formal error in the stellar mass determination from the GOODS-MUSIC catalogue (Fontana et al. 2006, their Fig. 2), is lower (by about 0.1 dex) than that estimated by Bundy et al. (2006), and is roughly consistent with the findings of Stringer et al. (2009).
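The effect of a Gaussian error on log M★ can be sketched by convolving an analytic MF with a Gaussian kernel of width 0.25 dex. The Schechter parameters and function names below are hypothetical, and the models themselves apply the error to their own predicted masses rather than to a fitted function.

```python
import math

SIGMA_DEX = 0.25  # Gaussian error on log M* assumed in the text

def schechter_dlogm(log_m, phi_star=4.0e-3, log_m_star=10.9, alpha=-1.2):
    """Schechter number density per dex (illustrative parameters only)."""
    x = 10.0 ** (log_m - log_m_star)
    return math.log(10.0) * phi_star * x ** (alpha + 1.0) * math.exp(-x)

def convolved_mf(log_m_obs, mf, sigma=SIGMA_DEX, half_width=5.0, steps=400):
    """MF after Gaussian scatter in log M, by trapezoidal integration.

    Integrates mf(m') * N(log_m_obs - m'; sigma) over m' within
    +/- half_width * sigma of the observed log mass.
    """
    lo = log_m_obs - half_width * sigma
    h = 2.0 * half_width * sigma / steps
    norm = 1.0 / (sigma * math.sqrt(2.0 * math.pi))
    total = 0.0
    for i in range(steps + 1):
        log_m_true = lo + i * h
        weight = 0.5 if i in (0, steps) else 1.0
        kernel = norm * math.exp(-0.5 * ((log_m_obs - log_m_true) / sigma) ** 2)
        total += weight * mf(log_m_true) * kernel
    return total * h

# Ratio of convolved to intrinsic MF on the power-law part and in the tail.
ratio_low = convolved_mf(9.5, schechter_dlogm) / schechter_dlogm(9.5)
ratio_high = convolved_mf(11.8, schechter_dlogm) / schechter_dlogm(11.8)
```

For this toy MF the ratio stays close to unity on the shallow power-law part and becomes much larger than unity beyond the exponential cut-off. More galaxies scatter up from lower masses than scatter down from higher ones.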
The first thing to note is that the models give fairly consistent predictions. Secondly, as redshift increases, the intrinsic model predictions (i.e. without convolution with errors) show a significant deficit of massive galaxies (the two bins 10^11 < M★/M⊙ < 10^11.5 and M★ > 10^11.5 M⊙) with respect to the data. The error convolution does not affect the power-law part of the MF, but it has a significant impact upon its high-mass tail, as already pointed out by Baugh (2006), Kitzbichler & White (2007) and more recently by Stringer et al. (2009). Because the models were tuned to match the z = 0 stellar MF or LF without errors, this convolution causes a small apparent overestimate of the number of the most massive galaxies at z ∼ 0. This could be corrected by tuning the radio-mode AGN feedback in the models. However, there are indications that the observed magnitudes and stellar masses of the brightest local galaxies may be underestimated by significant amounts (see the discussion in S08). Therefore, we do not retune the models to correct this apparent discrepancy. When these observational uncertainties are taken into account, model predictions for massive galaxies are in fairly good agreement with observations over the entire redshift interval probed by the surveys that we considered (with morgana being ∼2σ low at z > 2).
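The different response of the power-law part and the high-mass tail to a 0.25 dex Gaussian error can be illustrated with a toy calculation. The sketch below assumes a Schechter-shaped MF with made-up parameters (`logm_star`, `alpha`, `phi_star` are illustrative, not fitted to any model or data set) and convolves it with the Gaussian kernel on a uniform grid in log M★.

```python
import numpy as np

def schechter_logm(logm, logm_star=11.0, alpha=-1.2, phi_star=1.0):
    """Schechter function per dex of log10(M); parameters are illustrative."""
    x = 10.0 ** (logm - logm_star)
    return np.log(10.0) * phi_star * x ** (alpha + 1.0) * np.exp(-x)

def convolve_with_error(logm, phi, sigma_dex=0.25):
    """Convolve phi(log M) with a Gaussian of width sigma_dex.

    Assumes a uniform grid in log M; the kernel is truncated at 5 sigma
    and normalized to unit sum, so values near the grid edges are unreliable.
    """
    dlogm = logm[1] - logm[0]
    half = int(np.ceil(5.0 * sigma_dex / dlogm))
    offsets = np.arange(-half, half + 1) * dlogm
    kernel = np.exp(-0.5 * (offsets / sigma_dex) ** 2)
    kernel /= kernel.sum()
    return np.convolve(phi, kernel, mode="same")

logm = np.linspace(8.0, 13.0, 1001)
phi_intrinsic = schechter_logm(logm)
phi_convolved = convolve_with_error(logm, phi_intrinsic)

def boost_at(target_logm):
    """Ratio of convolved to intrinsic MF at the grid point nearest target."""
    i = int(np.argmin(np.abs(logm - target_logm)))
    return phi_convolved[i] / phi_intrinsic[i]

boost_power_law = boost_at(10.0)   # well below the knee
boost_tail = boost_at(11.75)       # in the exponential cut-off
```

In this toy case the ratio stays close to unity on the power-law part, while the exponential tail is boosted substantially, because many more galaxies scatter up from lower masses than scatter down from higher ones.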
In lower stellar mass bins (10^10.5 < M★/M⊙ < 10^11 and in particular 10^10 < M★/M⊙ < 10^10.5), all three models overpredict the observed stellar MF and SMD at high redshift (z ≳ 0.5), with the discrepancy increasing with increasing redshift. Thus, a robust prediction of the models seems to be that the evolution of less massive galaxies is slower than that of more massive ones – i.e. the models do not predict stellar mass ‘DS’ but rather the opposite behaviour (sometimes called ‘upsizing’). This discrepancy has already been noted in previous papers (Fontana et al. 2004; Fontanot et al. 2007), and extends to other models. Therefore, the models seem to be discrepant with observations even if the real Universe shows mass-independent density evolution rather than DS. As a caveat, it is worth mentioning that, although the different groups have performed detailed completeness corrections, it is still possible that the high-redshift samples may be incomplete at the lowest stellar masses. In Figs 1 and 2, we show data only for mass ranges where the corresponding authors claim that no completeness correction is necessary.
In order to further investigate the evolution of the predicted galaxy SMF, we separately consider the contribution from central and satellite galaxies at different redshifts. Model predictions (convolved with observational uncertainties as before) are shown in the left-hand panels of Fig. 3. As a reference, in each panel we show the total observed MFs at the considered redshift. The three models predict a similar evolution for the two subpopulations. It is evident that central galaxies are the main contributors to the overprediction of low-mass galaxies at z < 2: the models predict a roughly constant z < 2 number density of low-mass central galaxies, while the low-mass satellite population shows a gradual increase which is due to the infall of field galaxies into galaxy groups and clusters. This implies that small objects are overproduced while they are central galaxies, and the excess is not primarily due to inaccuracies in the modelling of satellites.
In the right-hand panels of Fig. 3, we show the evolution of the galaxy SMF split in bins of parent halo mass at the considered redshift. Again, the three SAMs predict similar trends: it is evident that the main contributors at all redshifts to the low-mass end excess are low-mass (10^11 < Mh/M⊙ < 10^12) DM haloes. These results suggest that, in order to cure the discrepancies seen in these three models and others, we should seek a physical process that can suppress SF in central galaxies hosted by intermediate to low-mass haloes (Mh/M⊙ < 10^12).
4 DOWNSIZING IN STAR FORMATION RATE
The DS in galaxy SFR seen in look-back studies most closely corresponds to the original definition of DS. There are, however, several different forms in which the diagnostics of DS in SFR may have been cast observationally. In addition, one can identify two different trends that might be called ‘DS’: first, the normalization of the ‘star-forming sequence’ of galaxies shifts downwards with decreasing redshift; secondly, galaxies move off the star-forming sequence and become ‘quenched’ or passive as time progresses. These different behaviours may offer clues to the physical mechanisms responsible, for example the downward shift of the SF sequence might be due to simple gas exhaustion by SF, while ‘quenching’ is presumably due to a more dramatic process such as AGN feedback. If DS is occurring, this evolution should happen in a differential way, with more massive objects being quenched earlier and/or more rapidly. In order to probe these different possible ‘paths’ for DS, we will consider several different ways of slicing and plotting the distribution function of SFR as a function of stellar mass and redshift: (1) the two-dimensional distribution of stellar mass and SFR in several redshift bins; (2) the average SFR as a function of stellar mass, plotted in redshift bins; (3) the SFR density contributed by objects of different stellar masses, as a function of redshift, and (4) the evolution of the stellar MF of active versus passive galaxies.
SFRs are estimated using different observational tracers, such as Hα emission lines, ultraviolet (UV), mid- and far-infrared (IR) emission and radio. SFR may also be estimated by fitting SF histories to multiwavelength broad-band SEDs, in a manner similar to that used to estimate stellar masses. SFR estimates are impacted by many of the same sources of uncertainty as stellar mass estimates (such as propagated errors from photometric redshift uncertainties and sensitivity to the assumed stellar population models, stellar IMF and SF histories), and also each tracer carries its own set of potential problems. For example, SFR estimates based on emission lines such as Hα are metallicity-dependent and (typically fairly large) corrections for dust extinction must be applied. A potential advantage to this approach is that dust corrections can be fairly reliably estimated from the Balmer decrement; however, these measurements are currently impractical at high redshift as they would require highly multiplexed, deep near-infrared spectroscopy. SFR estimates based on the UV continuum alone suffer mainly from the very large and uncertain dust corrections (extinction estimates based on the UV spectral slope, while widely used, are quite uncertain). Estimates based on the mid-IR (e.g. 24 μm) suffer from highly uncertain k-corrections (as strong polycyclic aromatic hydrocarbon features move through the observed bandpass), potential strong contamination by AGN, uncertainties in the IR SED templates (due to our lack of knowledge about the composition and state of the emitting dust) and possibly contamination by heating from old stellar populations. Measurements of the longer wavelength thermal IR, near the peak of the dust emission (∼100 μm), offer perhaps the most promising approach for obtaining robust estimates of total SFR. These are, however, currently available only for a small number of very IR-luminous galaxies.
Moreover, all these indicators are usually calibrated on local galaxy samples, and the systematics connected with applying them to higher redshift are poorly known.
The observed SFRs used in this section have been obtained from UV + Spitzer 24 μm (Bell et al. 2007; Zheng et al. 2007), Spitzer 24 μm (Conselice et al. 2007), Galaxy Evolution Explorer (GALEX) far-ultraviolet (FUV) (Schiminovich et al. 2007), emission lines + Spitzer 24 μm (Noeske et al. 2007), SED-fitting continuum at 2800 Å (Drory & Alvarez 2008; Mobasher et al. 2009), GALEX FUV + Spitzer 24 μm (Martin et al. 2007), Balmer absorption lines (Chen et al. 2009), SED fitting + Spitzer 24 μm (Santini et al. 2009) and radio (Dunne et al. 2009).
The comparison of models and data is also made difficult by the complex selection criteria involved. Most SFR estimates used here have poor sensitivity to sources with low SFRs, leading to many upper limits; for instance, SFR estimates for passive objects are poorly constrained by SED fitting techniques. Several authors have attempted to correct for incompleteness by stacking images of objects with similar masses to obtain deeper detections, or by using only galaxies with active SF to compute the average. A proper comparison should take into account the selection effects of each data set; however, systematics are large and poorly understood, so a detailed comparison at this stage is of doubtful utility. With all these caveats in mind, we compare our models to the data at face value, trying again to assess whether DS is seen in the data and to what extent models are consistent with available observations. Moreover, analogously with stellar masses, we convolve model SFRs with a lognormal error distribution; for its amplitude, we use a value of 0.3 dex, roughly equal to the median formal error of SFRs in GOODS-MUSIC. In the light of what is said above, this estimate is clearly naive, but it allows us to determine the gross effect of (random) uncertainties in SFR determinations. We find that our results are fairly insensitive to the inclusion of this error.
In Fig. 4, we show the two-dimensional distribution of SSFR as a function of stellar mass, for several redshifts from z = 2 to 0, for all three models and for a compilation of observational data (Noeske et al. 2007; Schiminovich et al. 2007; Santini et al. 2009). All model galaxies with SSFR < 10^-13 yr^-1 have been assigned SSFR = 10^-13 yr^-1 (this causes the thin quenched sequence at the bottom of each panel). In each panel, we plot the locations of the ‘star-forming’ and ‘quenched’ sequences at z ∼ 0 from local observations based on SDSS+GALEX (Salim et al. 2007; Schiminovich et al. 2007), and of the so-called ‘green valley’ that divides the two sequences. We see that all three models show qualitatively similar behaviour. Perhaps the clearest discrepancy between the models and data is that the SSFRs of low-mass galaxies in all three models are the same as or, in the case of morgana, even lower than those of massive galaxies, while in the observations a clear trend is seen such that lower mass galaxies have higher SSFR. In the models, the slope of the SF sequence does not appear to change significantly over time between z ∼ 2 and 0, while the normalization of this sequence decreases over time. Also in all three models, there are few if any massive passive galaxies in place at z ∼ 2; it remains to be seen whether this is in conflict with observations.
In Fig. 5, we show the evolution of the average SFR of galaxies as a function of stellar mass, for eight redshift bins from z ∼ 0.3 to 3.5; data are taken from Bell et al. (2007), Noeske et al. (2007), Martin et al. (2007), Drory & Alvarez (2008), Chen et al. (2009), Santini et al. (2009) and Dunne et al. (2009). For the GOODS-MUSIC data (Santini et al. 2009), we show with open (filled) circles the bins where, according to the authors, the incompleteness is (not) significant; open symbols are then upper limits to the average SFR. In order to illustrate the redshift evolution of this quantity, in all panels the shaded cyan/yellow area represents the confidence region of the z ∼ 0 observations. In the top row, we show redshifts 0.3–0.7, and show model results in which we average over all galaxies. In the second row, we repeat the redshift bins 0.3, 0.5 and 0.7, but this time include in the model averages only star-forming galaxies (defined here as having SSFR > 10^-11 yr^-1). The remaining panels show model averages for active galaxies only, for higher redshifts 1 < z < 3.5. We only show the low-redshift bins for both active and all galaxies because it is only at these redshifts that there is any significant difference in the results. We can see, however, that at low redshift, the inclusion of passive galaxies causes a turnover in the average SFR at high masses in the WDL08 and S08 models.
We first note again the good agreement between the results of the three different SAMs seen in Fig. 5, a result that we did not necessarily expect given the different implementations of SF and feedback. Regarding the comparison with observations, we find that the average SFR of low-mass galaxies (M★ ≲ 10^11 M⊙) is underestimated by the models at all redshifts, as we already noted from Fig. 4. The average SFRs for massive galaxies generally lie near the middle of the range of different observational estimates at low redshift, and near the lower envelope of observational estimates at higher redshift (z ≳ 2). Several previous studies (Daddi et al. 2007; Elbaz et al. 2007; Santini et al. 2009) compared the predictions of a slightly different version of the WDL08 models with a single observational estimate of the SFR as a function of stellar mass. Elbaz et al. (2007) found that the model predictions were lower than their observational estimates at z ∼ 1 by about a factor of 2, while Daddi et al. (2007) and Santini et al. (2009) found that the models were low by a factor of ∼5 at z ∼ 2. Our results are entirely consistent with their findings, but we also see that (as already noted above) the dispersion in different observational estimates of the average SFR at fixed stellar mass is as large as, or larger than, the discrepancy between the model predictions and the observational estimates of these previous studies.
morgana produces too few massive, passive galaxies at late times, resulting in an overestimate of the SFR of massive objects at low redshift. This was studied in more detail in Kimm et al. (2008), and is due to a less efficient, or delayed, quenching of the cooling flows in massive haloes via radio-mode feedback.
Similar conclusions can be reached by considering the SFR density, as a function of redshift, contributed by galaxies of different stellar mass (Fig. 6). We used the K-band-selected GOODS-MUSIC catalogue, complete to K < 23.5 (Grazian et al. 2006), to compute the SFR density as a function of stellar mass. Following the discussion in Fontana et al. (2006), we translated the magnitude limit into a stellar mass limit, and computed SFRs either with SED fitting using photometry from the near-ultraviolet to the mid-IR or with Spitzer 24 μm fluxes when available. We also plot several other data sets from the literature: local points from SDSS+GALEX from Schiminovich et al. (2007); the results of Conselice et al. (2007), based on the Palomar/DEEP2 Survey; estimates from stacked 24 μm flux from the COMBO-17 survey (Zheng et al. 2007) and estimates from UV luminosity alone (Mobasher et al. 2009).
The three models again give fairly consistent results, although the predictions diverge in the higher mass bins. All three models show a gentle decline in the SFR density for low-mass galaxies, and if anything a somewhat flatter behaviour for the SFR density in massive galaxies. This time the model predictions agree well with the observations for small-mass galaxies, because the higher number of small galaxies compensates for the lower SSFR of the objects.
Another way to characterize DS in SF is by dividing galaxies into active (blue) and passive (red) populations, then computing the two stellar MFs or, alternatively, the K-band luminosity functions. As pointed out by Borch et al. (2006), using the COMBO-17 sample, and Bundy et al. (2006), using DEEP2, the two MFs cross at a characteristic mass which grows with redshift. Instead of using the colour criterion, we divide our sample into passive and active galaxies using a threshold value for the SSFR of 10^-11 yr^-1 (Brinchmann et al. 2004). Fig. 7 shows the evolution of the stellar MFs of active and passive galaxies as predicted by the three models. The MF of active galaxies shows almost no evolution since z ∼ 2, whereas most of the evolution of the MF is due to the buildup of the passive population; this is qualitatively consistent with the observational results. However, observations (Borch et al. 2006; Bundy et al. 2006) show that the stellar MF of red (passive) galaxies peaks at ∼10^11 M⊙ and decreases at lower masses. In other words, in observed samples, low-mass galaxies are predominantly blue (active), while in our models the low-mass slope of the SMF is nearly the same for active and passive galaxies. This result still holds when galaxies are divided using colours rather than SSFR, and this marks another discrepancy between models and data for small galaxies.
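The active/passive split by SSFR can be sketched in a few lines. The example below is only an illustration under stated assumptions: the four-galaxy catalogue, the unit volume and the function name `split_mass_functions` are hypothetical, and the floor of 10^-13 yr^-1 simply mirrors the value assigned to fully quenched model galaxies in Fig. 4.

```python
import numpy as np

SSFR_THRESHOLD = 1e-11  # yr^-1, active/passive division used in the text
SSFR_FLOOR = 1e-13      # yr^-1, value assigned to fully quenched galaxies

def split_mass_functions(mstar, sfr, logm_edges, volume):
    """Return (phi_active, phi_passive) per unit volume per dex.

    mstar in Msun, sfr in Msun/yr, logm_edges in log10(Msun),
    volume in whatever comoving volume unit the catalogue uses.
    """
    mstar = np.asarray(mstar, dtype=float)
    ssfr = np.maximum(np.asarray(sfr, dtype=float) / mstar, SSFR_FLOOR)
    active = ssfr > SSFR_THRESHOLD
    logm = np.log10(mstar)
    dlogm = np.diff(logm_edges)
    n_active = np.histogram(logm[active], bins=logm_edges)[0]
    n_passive = np.histogram(logm[~active], bins=logm_edges)[0]
    return n_active / (volume * dlogm), n_passive / (volume * dlogm)

# Hypothetical toy catalogue: two low-mass and two high-mass galaxies,
# one active and one passive in each mass range.
mstar = [2.0e10, 2.5e10, 2.0e11, 2.5e11]
sfr = [2.0, 0.01, 10.0, 0.0]
edges = np.array([10.0, 10.5, 11.0, 11.5])
phi_active, phi_passive = split_mass_functions(mstar, sfr, edges, volume=1.0)
```

Applying the same function to a colour-selected split would only require replacing the `active` mask, which is why the comparison between the two criteria mentioned above is straightforward to make.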
In Fig. 8, we show the stellar mass-weighted integrals of these functions, i.e. the SMD contained in the active and passive populations, as a function of stellar mass and redshift. In all models, the SMD is dominated by actively star-forming galaxies at high redshift, with the SMD contributed by passive objects growing rapidly at z≲ 1. These results are in qualitative agreement with observational results at z≲ 1 (e.g. Bell et al. 2007). Observational results at higher redshift will soon be available from ongoing and future surveys.
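As a reminder of what the mass-weighted integral means in practice, the SMD in a population is ρ★ = ∫ M★ φ(log M★) d log M★, with φ the MF per dex. A minimal numerical sketch follows; the trapezoidal rule, the grid and the test function are illustrative choices and not the integration scheme used for Fig. 8.

```python
import numpy as np

def stellar_mass_density(logm, phi):
    """Integrate M * phi(log M) over log M with the trapezoidal rule.

    logm: grid in log10(M/Msun); phi: number density per dex on that grid.
    Returns the stellar mass density in Msun per unit volume.
    """
    logm = np.asarray(logm, dtype=float)
    integrand = 10.0 ** logm * np.asarray(phi, dtype=float)
    return float(np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(logm)))

# Sanity check with a made-up MF phi = 1/M, for which the integrand is 1
# and the integral equals the width of the log-mass interval.
logm_grid = np.linspace(9.0, 12.0, 301)
rho_check = stellar_mass_density(logm_grid, 10.0 ** (-logm_grid))
```

Evaluating the function separately on the active and passive MFs gives the two SMD contributions discussed in the text.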
Up until this point in this section, we have discussed the model–data comparison without assessing whether either the predicted or observed behaviour constitutes ‘DS’. The DS-like differential evolution would be seen as an earlier accumulation of massive passive galaxies in Fig. 4, and as a flattening of the slope of the stellar mass–SFR relation in Fig. 5 with increasing time. In both Figs 4 and 5, we see a clear downward shift over time of the star-forming sequence both in the observations and in the models. Given, however, the significant discrepancies seen between different data sets and different SF indicators, and the possible incompleteness of the observations of low-mass passive galaxies at high redshift, we feel that it is difficult to claim that there is currently robust evidence for this differential evolution (DS) in the data in either figure. Once again, however, the models if anything show a reverse DS trend, with passive low-mass galaxies appearing earlier than massive, passive galaxies. In Fig. 6, the signature of DS would be a more rapid drop, with decreasing redshift, of the SFR density for more massive galaxies. Although some previous studies have claimed to see such an effect (e.g. Juneau et al. 2005), the observational compilation that we have shown here does not show clear evidence for this differential decline. The one bin in which a markedly sharp decline is seen (the highest mass bin, see discussion in Santini et al. 2009) may be affected by cosmic variance. Once again, the data appear to be consistent with a constant rate of decline in SFR density for galaxies of all masses.
5 ARCHAEOLOGICAL DOWNSIZING
In this section, we focus on the relation between the z = 0 galaxy stellar mass and the average age of the stellar population (the archaeological DS discussed in Section 1). In Fig. 9, we compare the stellar mass-weighted age of the stellar populations in galaxies as a function of their stellar mass (at z = 0) as predicted by the three models with the observational estimates from Gallazzi et al. (2005). They use high-resolution SDSS spectra to obtain estimates for the ages and metallicities of ∼170 000 galaxies with M★ > 10^9 M⊙. They measure these by comparing a set of absorption features in the spectra (in particular the Lick indices and the 4000 Å break) to a grid of synthetic SEDs covering a wide range of plausible SF histories and metallicities. Both the chosen SF histories and stellar population synthesis codes adopted are a likely source of systematic uncertainty in these estimates. Moreover, corrections must be made for in-filling by emission lines in the age-sensitive spectral features (see Gallazzi et al. 2005 for a complete discussion on how this correction was applied).
Our results show that the model massive galaxies are old, in agreement with the observations. However, two out of three models predict only a mild trend in age from high- to low-mass galaxies, in conflict with the steeper trend seen in the observational estimates (as already pointed out by S08). morgana behaves like models without AGN feedback, which produce an inverted trend (in which massive galaxies are younger than low-mass galaxies); Croton et al. (2006) and De Lucia et al. (2006) showed that including the ‘radio-mode’ AGN feedback makes the massive galaxies older, improving the agreement with the observed trend. Once again, it is low-mass galaxies that are discrepant, in the sense that they form too early and thus have ages that are too old.
The inverted trend predicted by morgana is mainly due to two different physical processes. The younger ages of massive galaxies are related to the inefficient quenching of cooling flows in massive haloes at z < 1 (see the discussion in Kimm et al. 2008). The resulting higher level of SF implies younger ages with respect to WDL08 and S08. The older ages of intermediate-to-low-mass sources are likely due to the enhanced cooling at high redshift discussed in Viola et al. (2008), and due to the associated enhanced SF at early times.
We note that the observational estimates are closer to being luminosity weighted than stellar mass weighted – De Lucia et al. (2006) showed that light-weighted ages show a stronger trend with stellar mass – and also that the ages based on absorption line indices (mainly Balmer lines) tend to actually reflect the age of the most recent SF episode, rather than the luminosity-weighted age (Trager et al. 2000b, 2008). However, Trager & Somerville (2009) find that when these observational biases are modelled by extracting line strengths for the SAM galaxies in the same way as is done for the observations, this effect cannot fully account for the discrepancy between the model ages and the observed ages for low-mass galaxies.
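The difference between mass-weighted and light-weighted ages can be made concrete with a two-population toy galaxy. All numbers below are made up for illustration, in particular the relative light-to-mass ratios; they are not taken from any stellar population model.

```python
import numpy as np

def weighted_age(ages_gyr, weights):
    """Weighted mean age; weights are stellar masses or luminosities."""
    return float(np.average(ages_gyr, weights=weights))

# Hypothetical galaxy: 90 per cent of the mass in a 10 Gyr old population,
# 10 per cent in a 1 Gyr old burst that is assumed to be ten times
# brighter per unit mass (an illustrative value only).
ages = np.array([10.0, 1.0])           # Gyr
masses = np.array([9.0, 1.0])          # arbitrary mass units
light_per_mass = np.array([1.0, 10.0])  # relative, assumed

age_mass_weighted = weighted_age(ages, masses)
age_light_weighted = weighted_age(ages, masses * light_per_mass)
```

Because young stars dominate the light, the light-weighted age of this toy galaxy is much younger than its mass-weighted age, which is the sense of the bias discussed above.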
One can even go a step further and attempt to extract SF histories from galaxy spectra (e.g. Panter et al. 2007) and to construct an average SF history for galaxies binned in terms of their stellar mass at z= 0. We compare our models with the results of Panter et al. (2007), who applied the moped algorithm to high-resolution (3 Å) spectra from the SDSS. This algorithm is similar in spirit to the SED-fitting we described in Section 3, but it treats the SFR as a free parameter (defined on an 11-bin grid), thus allowing for the reconstruction of the SF history of galaxies. We stress that these measurements come with numerous uncertainties. Panter et al. (2007) showed that their reconstructed SF histories depend strongly on the input assumptions. In particular, they demonstrated that the largest systematics are related to the chosen spectrophotometric code, stellar population model, the assumed IMF, the dust attenuation prescription and the calibration of the observed spectra.
These observationally derived SF histories are shown in Fig. 10, where we plot the cosmic SFR density in bins of z= 0 stellar mass as a function of redshift. In each panel, we renormalize both the data and the model predictions to the observed value at z= 0.0844 in order to highlight the differences in the shapes. For technical reasons, we cannot easily extract the SF histories for galaxies selected by present-day stellar mass from the S08 models. We therefore limit this final comparison to the two other models. Fig. 10 shows that both SAMs considered here fail to reproduce the observed trend in detail. Small galaxies form too large a fraction of their stars (compared to the observational estimate) at high redshift. For more massive galaxies, the SFR density evolution seems fairly well reproduced by morgana, while in the WDL08 model too few stars are produced in massive galaxies at low redshift.
6 DISCUSSION AND CONCLUSIONS
We have presented a systematic comparison of semi-analytic models of galaxy formation with observations of local and high-redshift galaxies that have been claimed to show so-called ‘DS’ trends. We had several goals: (i) to reassess the robustness of the claims of observed DS in the literature, based on an extensive comparison of different observational data sets; (ii) to see if a consistent picture is painted by the different observational ‘manifestations’ of DS and (iii) to test to what extent the predictions of hierarchical models of galaxy formation, set within the ΛCDM framework, are consistent with these observational results.
In order to test the general paradigm of galaxy formation within the hierarchical picture rather than a specific model implementation, we considered predictions from three independently developed SAMs (WDL08, morgana and S08). We used physical quantities (stellar masses and SFRs) derived from observations to avoid confusion related to differences arising from the spectro-photometric codes and dust models used by the three SAMs. Of course, we cannot entirely avoid these issues, since the observational estimates of stellar masses and SFRs still depend on stellar population models and contain assumptions about dust content, metallicity, SF history and IMF.
Despite significant differences in the recipes adopted in the three models to describe the physical processes acting on the baryonic component, the predictions are remarkably consistent both for the evolution of the stellar mass and for the SF history. This is encouraging, in that it suggests that our results are relatively robust to the details of the model assumptions.
We summarize our findings in terms of the three different manifestations of DS that we considered here. We remind the reader that, in all cases, the signature of DS is that massive galaxies formed (or were assembled) earlier and more rapidly than lower mass galaxies.
DS in stellar mass. (i) We do not see robust evidence for differential evolution of the stellar mass assembly in the observations, i.e. the data are consistent with an increase in SMD at the same rate for all stellar mass bins. (ii) We find that the models roughly reproduce the evolution of the space density of massive galaxies when their predictions are convolved with a realistic estimate for the observational error on stellar masses. At the same time, all models predict almost no evolution in the number density of galaxies of mass ∼10^10 M⊙ since z ∼ 2, at variance with real galaxies whose number density evolves by a factor of ∼6 in the same redshift interval. Put another way, the models (which are normalized to reproduce the stellar MF at z = 0) overproduce low-mass galaxies relative to observations at high redshift (z ≳ 0.5).
DS in SFR. (i) We find that different estimates of SFR as a function of stellar mass from different methods show large systematic offsets as well as differences in slope. Based on the available observational compilation, we do not see conclusive evidence for differential evolution of the SFR or SFR density for galaxies of different mass. It may therefore be premature to reach any firm conclusions about whether these observations in fact show the signatures of DS. (ii) The models roughly reproduce the increase of the average SSFR and SFR density of galaxies up to z ∼ 4 (though with a possible systematic underestimate), the weak evolution of the stellar MF of actively star-forming galaxies and the buildup of the population of passive galaxies at z < 2. However, the MF of passive galaxies has a much steeper small-mass-end slope than the data, and low-mass galaxies are too passive (have too little SF) at all probed redshifts.
Archaeological DS. (i) The data do clearly show the trend of massive galaxies being older than low-mass galaxies. However, this trend may arise in part from biases related to the SFR reconstruction algorithms. (ii) The WDL08 and S08 SAMs qualitatively reproduce the observed trend, in that low-mass galaxies are younger than high-mass ones. However, the slope of the mean stellar population age versus stellar mass trend is much shallower in the models than in the data. Some, though probably not all, of this discrepancy may be related to observational biases in the age estimates. The SAMs do not agree well with the detailed SF histories as a function of z= 0 stellar mass extracted from galaxy spectra; low-mass galaxies form too large a fraction of their stars at early times, and high-mass galaxies (at least in the WDL08 models) do not have enough SF at late times.
Massive galaxies have long been considered one of the main challenges for hierarchical models. The introduction of so-called ‘radio-mode’ AGN feedback helps keep massive galaxies from forming stars down to z= 0, so that red and old massive galaxies are now produced by the latest generation of SAMs. We find that when the stellar mass errors are accounted for (Baugh 2006; Borch et al. 2006; Kitzbichler & White 2007), discrepancies in the number densities of massive galaxies weaken or disappear. A number of problems still affect model predictions for the most massive galaxies: according to the results shown above, their evolution since z∼ 1, which is driven by mergers, is marginally inconsistent with the data (Fig. 2). Models may also underestimate the number of massive galaxies at z > 2 (see also Marchesini et al. 2008). Depending in part on which observational estimates turn out to be correct, at least in some of the models the SFR in massive galaxies at high redshift may be too low. These residual discrepancies may be solved by better modelling the known processes: the implementation of AGN feedback is still extremely crude. The merger-driven evolution at z < 1 may be slowed down by scattering of stars into the diffuse stellar component of galaxy groups and clusters (Monaco et al. 2006; Conroy et al. 2007; S08). Moreover, better modelling of chemical evolution is needed to address what may be the most severe challenge for massive galaxies, the chemo-archaeological DS.
At the same time, we find serious discrepancies in the model predictions for less massive galaxies in the range 10^9–10^11 M⊙ in stellar mass: they form too early and have too little ongoing SF at later times, so their stellar populations are too old at z = 0. Their number density is nearly constant since z ∼ 2, while observations show that it grows in time. Their SSFR is too low compared with observational data. The low-mass end slope of the SMF of passive galaxies is too steep, again indicating an excess of low-mass passive galaxies. Part of this discrepancy could be due to the overquenching problem for satellite galaxies (Weinmann et al. 2006; Gilbank & Balogh 2008; Kimm et al. 2008; van den Bosch et al. 2008), which is caused by the assumption in all three SAMs that the hot halo is instantly stripped from satellites as they enter a larger host halo, thus shutting off any further cooling on to satellite galaxies. However, as we showed in Fig. 3, the problematic galaxies are predominantly central galaxies in DM haloes with relatively high circular velocities, ∼100–200 km s^-1. Therefore, mechanisms that only impact satellite galaxies (such as ram pressure stripping) or that only work on very low-mass haloes (like photoionization or, probably, pre-heating) are not viable solutions to this problem.
The paradox is that we must suppress the formation of low-mass galaxies in order to fit the low-mass end of the stellar MF or the faint end of the luminosity function within the CDM paradigm. In the three models presented here, as in probably all ΛCDM models in the literature, this is currently accomplished by implementing very strong supernova feedback in low-mass galaxies. Not only is it unclear that this strong SN feedback is physically motivated or in agreement with direct observations of winds in low-mass galaxies, but apparently it does not produce the correct formation histories for low-mass galaxies.
Another hint may come from chemical DS: Maiolino et al. (2008) (see also Lo Faro et al. 2009) showed that the models predict that small galaxies at high redshift are much more metal-rich than observed galaxies at these mass scales. This could indicate that either the metals are efficiently removed from these galaxies, e.g. by winds, or SF (and therefore metal production) is inefficient.
Thinking of a plausible mechanism that can suppress the formation of galaxies in small but compact DM haloes at high z is not so easy: their density is too high and their potential wells are too deep to suppress SF with heating from an external UV background, while massive galactic winds should not destroy galaxies of the same circular velocity at lower redshift. Therefore, the discrepancies discussed above call for a deep rethinking of the feedback schemes currently implemented in SAMs. Alternatively, the problem may be related to the nature of DM; if this is not completely collisionless, the density profiles of small DM haloes may be significantly different from the generally assumed Navarro, Frenk & White (1996) form, and this would influence cooling rates and infall times, galaxy sizes and SFRs.
All model predictions discussed in this paper and the data shown in Fig. 2 are available in electronic format upon request.
We are grateful to Eric Bell and Anna Gallazzi for discussion and careful explanation of their data, to Frank van den Bosch, Maurilio Pannella and Nicola Menci for enlightening discussions, to Adriano Fontana, Andrea Grazian, Sara Salimbeni for help in understanding and extracting information from the GOODS-MUSIC catalogue, to Danilo Marchesini for sharing his data before publication and to Kai Noeske for providing his data in electronic form and for very useful discussions about SF indicators. FF and GDL acknowledge hospitality at the Kavli Institute for Theoretical Physics in Santa Barbara. This research was supported in part by the National Science Foundation under grant no. NSF PHY05-51164. We thank the anonymous referee for suggestions that helped to improve this paper.