The physical properties of star forming galaxies in the low redshift universe

We present a comprehensive study of the physical properties of ~10^5 galaxies with measurable star formation in the SDSS. By comparing physical information extracted from the emission lines with continuum properties, we build up a picture of the nature of star-forming galaxies at z<0.2. We remove essentially all aperture bias using resolved imaging, allowing an accurate estimate of the total SFRs in galaxies. We determine the SFR density to be 1.915^{+0.02}_{-0.01} (rand.) ^{+0.14}_{-0.42} (sys.) h70 10^{-2} Msun/yr/Mpc^3 at z=0.1 (for a Kroupa IMF) and we study the distribution of star formation as a function of various physical parameters. The majority of the star formation in the low redshift universe takes place in moderately massive galaxies (10^10-10^11 Msun), typically in high surface brightness (HSB) disk galaxies. Roughly 15% of all star formation takes place in galaxies that show some sign of an active nucleus. About 20% occurs in starburst galaxies. We show that the ratio of present to past-average star formation rate, the Scalo b-parameter, is nearly constant over almost three orders of magnitude in mass, declining only at M*>10^10 Msun. The volume averaged b parameter is 0.408^{+0.005}_{-0.002} (rand.) ^{+0.029}_{-0.090} (sys.) h70^{-1}. We use this value to constrain the star formation history of the universe. In agreement with other work we find a correlation between b and morphological type, as well as a tight correlation between the 4000Å break (D4000) and b. We discuss how D4000 can be used to estimate b parameters for high redshift galaxies.


INTRODUCTION
One of the most active areas of research during the last decade has been the study of the present and past star formation histories of galaxies. This includes detailed studies of the stellar population of our own Galaxy (e.g. Rocha-Pinto et al. 2000) and of Local Group dwarfs, both individually and as relic evidence of the global star formation history of the Universe (Hernandez et al. 2000). While potentially very powerful, the techniques used in these studies are difficult to extend much beyond the Local Group. At somewhat larger distances, efforts have gone towards understanding the global properties of more massive galaxies and the dependence of star formation on mass and/or morphological classification (e.g. Kennicutt 1983; Kennicutt et al. 1994; Gavazzi & Scodeggio 1996; Boselli et al. 2001). This has been accompanied by careful mapping of both the star formation and the gas distribution in galaxies (e.g. Martin & Kennicutt 2001; Wong & Blitz 2002; Young et al. 1995). Based on these studies, a picture has emerged in which more massive galaxies undergo a larger fraction of their star formation at early times than less massive ones. This has often been said to present a challenge for existing models of galaxy formation.
These studies have traditionally been carried out with relatively small samples of local galaxies. For example, the comparatively large sample of Gavazzi et al. (2002) contains 369 spiral galaxies. With the advent of large fiber-fed surveys such as the 2dF (Colless et al. 2001) and the SDSS (York et al. 2000), it is now possible to extend such studies dramatically in size. Large galaxy surveys have most recently been exploited by Hopkins et al. (2003), who carried out a comprehensive study of different SFR indicators, and by Nakamura et al. (2003) who studied the local Hα luminosity density.
Alongside these careful studies of individual local galaxies, a global picture of the star formation history of the universe has been assembled through careful studies of star-forming galaxies as a function of look-back time (Cowie 1988; Songaila et al. 1990; Lilly et al. 1996; Madau et al. 1996; Madau et al. 1998). The main emphasis of this work has been the star formation history of the universe as a whole. Some workers have also tried to study the evolution of different morphological types (Brinchmann et al. 1998; Guzman et al. 1997). These studies have adopted a wide variety of star formation indicators (Glazebrook et al. 1999; Sullivan et al. 2001; Afonso et al. 2000; Flores et al. 1999). Despite differences between different indicators it is clear that there has been a decline in the rate of star formation since z ∼ 1, although there is still no consensus on how rapid this decline was. There is also no clear consensus on the star formation history before z = 2. Studies in the literature give answers ranging from a clear increase back to z = 8 (Lanzetta et al. 2002), through a constant rate of star formation back to z = 5 (e.g. Baldry et al. 2002; Giavalisco et al. 2003), to a decline at high z (e.g. Madau et al. 1998).
One of the key unanswered questions is what physical parameter(s) drive changes in the star formation rate in individual galaxies. Although an increasing body of evidence points to turbulence being a key influence on star formation on small scales (e.g. MacLow & Klessen 2003, and references therein), the origin of this turbulence and its relevance for star formation on galaxy-sized scales is unclear. It is therefore of major interest to investigate empirically what global quantities correlate strongly with star formation activity. It is logical, however, to expect that the star formation rate is a function of many variables. To investigate this dependence it is necessary to study the star formation rate density function in various projections (see also the approach by Lanzetta et al. 2002). This is also becoming a favoured approach to analyse theoretical models (e.g. Springel & Hernquist 2003; Yang et al. 2003).
We will pursue this goal here by trying to tie together detailed information on individual galaxies with their contribution to the overall cosmic star formation activity. This is possible both through studying a snapshot in time and by comparing current and past average star formation distributions. On a global scale, this provides a nearly model-independent complement to the studies of the cosmic spectrum by Baldry et al. (2002).
This kind of study requires a survey that spans a large range in galaxy properties and has accurate photometric and spectroscopic information. The Sloan Digital Sky Survey (SDSS; York et al. 2000) is an ideal candidate for this kind of work. Its accurate photometry allows good characterisation of the selection function and the structural parameters of galaxies. The large database of accurately calibrated spectra allows a much more careful study of the spectral properties of galaxies than has hitherto been possible.
With such unprecedented quality of data it is pertinent to improve modelling over what has generally been done before. One aspect in particular that is generally missing in earlier work is a rigorous treatment of line formation and dust. This is especially important for the SDSS as it spans a large range in luminosity, so the physical parameters of the galaxies can reasonably be expected to vary. In this paper we have therefore built on the work of Charlot et al. (2002) and fit all strong emission lines simultaneously to constrain these physical properties.
Our paper is structured as follows: We start in Section 2 by briefly describing the data used. Section 3 gives an overview of the models and the modelling needed for our analysis. This includes a discussion of the relationship between Hα-luminosity and SFR. Section 4 discusses in some detail the influence of AGN activity on our estimates of star formation rates (SFRs). Section 5 discusses the very important question of aperture effects in a fibre survey such as the SDSS and sets out our method for removing these. With these preliminaries out of the way, we discuss the local SFR density in Section 6 and give an inventory of star formation in Section 7. The specific star formation rate and its variation with mass is discussed in Section 8 and the variation with other physical parameters is covered in Section 9. The paper is concluded in Section 10. We use a cosmology with H0 = 70 km/s/Mpc, ΩM = 0.3, ΩΛ = 0.7. Where appropriate we define h = H0/(100 km/s/Mpc). We use the Kroupa (2001) universal IMF. To convert to the popular choice of a Salpeter (1955) IMF between 0.1 and 100 M⊙, one should multiply our SFR estimates by 1.5, the ratio of mass in the two IMFs for the same amount of ionising radiation. We will also make use of stellar mass estimates below and to convert these to a Salpeter IMF it is again necessary to multiply the masses by 1.5.
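The IMF conversion quoted above is a single multiplicative factor, which can be sketched as follows. The function names and the example value are hypothetical and only illustrate the arithmetic; the factor 1.5 is the one stated in the text.

```python
# Factor quoted in the text for converting Kroupa (2001) based SFRs and
# stellar masses to a Salpeter (1955) IMF between 0.1 and 100 Msun.
KROUPA_TO_SALPETER = 1.5


def to_salpeter(value_kroupa):
    """Convert a Kroupa-IMF SFR or stellar mass to its Salpeter equivalent."""
    return KROUPA_TO_SALPETER * value_kroupa


def to_kroupa(value_salpeter):
    """Inverse conversion, from a Salpeter-IMF value back to Kroupa."""
    return value_salpeter / KROUPA_TO_SALPETER


sfr_kroupa = 2.0                      # Msun/yr, hypothetical example value
sfr_salpeter = to_salpeter(sfr_kroupa)
```

The same helper applies to stellar masses, since the text gives the same factor for both quantities.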

THE DATA USED
The Sloan Digital Sky Survey (SDSS; York et al. 2000; Stoughton et al. 2002) is an ambitious survey to obtain spectroscopic and photometric data across π steradians. The survey is conducted using a dedicated 2.5 m telescope at Apache Point Observatory. The photometry is obtained using drift-scanning with a unique CCD camera (Gunn et al. 1998), allowing near-simultaneous photometry in five bands (u, g, r, i, z; Fukugita et al. 1996; Smith et al. 2002). The resulting data are reduced in a dedicated photometric pipeline, photo (Lupton et al. 2001), and astrometrically calibrated (Pier et al. 2003). We use the results of photo v5.4. To estimate total magnitudes we use the cmodel magnitudes recommended by the DR2 paper (Abazajian et al. 2004). These are a weighted combination of the exponential and de Vaucouleurs fits to the light profile provided by photo and have the advantage over Petrosian magnitudes of higher S/N (cf. the discussion in Baldry et al. 2003). We have also performed all calculations with the Petrosian total magnitudes and found that the results in the paper do not change if these are used, although we get a few more outliers. Fiber magnitudes in g, r, i are measured directly from the observed spectrum and the r-band magnitude is normalised to the fiber magnitude of photo. When estimating fixed-frame magnitudes we follow Blanton et al. (2003b) and k-correct our magnitudes to z = 0.1 to minimise the errors on the k-correction. We will also adopt their notation and refer to these redshifted magnitudes as ^{0.1}g, ^{0.1}r and ^{0.1}i.
The spectroscopic observations are obtained using two fiber-fed spectrographs on the same telescope, with the fiber placement done using an efficient tiling algorithm. The spectroscopic data are reduced in two independent pipelines, spectro1d (SubbaRao et al., in preparation) and specBS (Schlegel et al., in preparation); we use the specBS redshifts, but the differences between the two pipelines are entirely negligible.
The dataset used in the present study is based on the main SDSS galaxy sample, described by Strauss et al. (2002). We use a subset of Blanton et al. (2003b)'s Sample 10 consisting of 149660 galaxies with spectroscopic observations, 14.5 < r < 17.77 and 0.005 < z < 0.22. This encompasses slightly more galaxies than the SDSS Data Release 1 (Abazajian et al. 2003). The overall continuum properties of this sample are discussed in some detail by Kauffmann et al. (2003a,b) and the sample selection and spectrum pipeline analysis are discussed by Tremonti et al. (2003). The spectrophotometric calibration we use will also be used in Data Release 2 (Abazajian et al. 2004). We make extensive use of stellar masses for the galaxies taken from Kauffmann et al. (2003a). To be consistent with the latter we will throughout use the narrow definition of the 4000Å-break from Balogh et al. (1998) and denote this as D4000.
As detailed in Tremonti et al. (2003) we have chosen to reanalyse the 1D spectra using our own optimised pipeline. This allows us to take more care in the extraction of emission line fluxes than is done in the general purpose SDSS pipeline. The key differences for the work presented here are as follows: we remove the smear exposures so that the spectra are uniformly exposed within a 3" fibre, and we use the latest high-resolution population synthesis models by Bruzual & Charlot (2003) to fit the continuum using a non-negative linear-least-squares routine (Lawson & Hanson 1974). This provides an excellent fit to the continuum and lets us perform continuum subtraction with unprecedented accuracy. The fitting procedure automatically accounts for weak metal absorption under the forbidden lines and for Balmer absorption (see e.g. Bruzual & Charlot 2003 for examples of the quality of fit).
The ability to extract very weak line emission turns out to be very important when studying trends with mass (and hence increasing continuum contribution) and when estimating dust attenuation. We will see below that it is also important to identify possible AGN contamination of the emission lines.
A crucial aspect of the SDSS for studies of the nearby galaxy population is that it has a well-defined and well-studied selection function (Strauss et al. 2002) and covers a large range in absolute magnitude. This means that we can use the galaxy sample not only to study individual galaxies, but to construct distribution functions for volume-limited samples with high accuracy. Note that one can usually extract trends from a judiciously chosen, but statistically poorly characterised sample, but the distribution of a parameter requires a statistically well-defined sample. We will therefore focus our attention on distribution functions throughout our analysis.
We will limit our study to galaxies with 0.005 < z < 0.22. The lower limit reflects our wish to include galaxies of the lowest possible luminosity. At the same time, we wish to avoid redshifts where deviations from Hubble flow are substantial and the accessible volume is very small. At the low redshift limit, a galaxy with the mass-to-light ratio appropriate for a 13.6 Gyr old stellar population and r = 17.77 would have a stellar mass just below 10^8 solar masses. We should therefore be complete to this mass limit. Our sample can thus be used to reconstruct the properties of a volume limited sample of galaxies with M* > 10^8 M⊙. As we will see below, we also include the vast majority of all star formation in the nearby Universe.

Sample definitions
We will attempt to treat the entire sample in a uniform way in our SFR calculations. It will, however, prove necessary in the following discussion to define several subsamples of objects based on their emission line properties. These are defined on the basis of the Baldwin et al. (1981, BPT) diagram shown in Figure 1. The diagram has been divided into three regions which we discuss further below. Although a classification can be made on the basis of Figure 1 regardless of the signal-to-noise (S/N) in the lines, we find that requiring S/N > 3 for all lines is useful. Below this limit, a rapidly increasing fraction of galaxies have negative measured line fluxes and the non-symmetric distribution of galaxies along the y-axis of the BPT diagram leads to classification biases.

Figure 1. The distribution of the galaxies in our sample in the BPT line-ratio diagram. The two lines show the division of our sample into the three subsamples discussed in the text. An unweighted version of this diagram can be seen in Figure 9. The galaxies plotted here have S/N > 3 in all four lines.

Table 1 (column headings only): Subsample, Number.

We have adopted the following subsamples (cf. Table 1):

All: The set of all galaxies in the sample regardless of the S/N of their emission lines.

SF: The star forming galaxies. These are the galaxies with S/N > 3 in all four BPT lines that lie below our most conservative AGN rejection criterion. As discussed in section 4, they are expected to have a very low (< 1%) contribution to Hα from AGN.

C: The objects with S/N > 3 in all four BPT lines that are between the upper and lower lines in Figure 1. We refer to these as composite galaxies. Up to 40% of their Hα-luminosity might come from an AGN. The lower line is taken from Kauffmann et al. (2003c) and is a shifted version of the upper line. It does lie very close to an empirical determination of the SF class (see section 4 below) but we have kept the Kauffmann et al. line to be consistent with that work. No results below are affected by this choice.

AGN: The AGN population consists of the galaxies above the upper line in Figure 1. This line corresponds to the theoretical upper limit for pure starburst models, so that a substantial AGN contribution to the line fluxes is required to move a galaxy above this line. The line has been taken from Equation 5 in Kewley et al. (2001b), but our models have an identical upper limit.

Low S/N AGNs: A minimum classification for AGN galaxies is that they have [N II]6584/Hα > 0.6 (and S/N > 3 in both lines) (e.g. Kauffmann et al. 2003c). It is therefore possible to classify these even if [O III]5007 and/or Hβ have too low S/N to be useful (cf. Figure 3). A similar approach is taken by Miller et al. (2003). In general, we will include these galaxies together with the AGN class.

Low S/N SF: After we have separated out the AGN, the composites and the low S/N AGNs, we have thrown out most galaxies with a possible AGN contribution to their spectra. The remaining galaxies with S/N > 2 in Hα are considered low S/N star formers. We can still estimate the SFR of these galaxies from their line strengths, even though we cannot use the full modelling apparatus described in the following section.

Unclassifiable: Those remaining galaxies that are impossible to classify using the BPT diagram. This class is mostly made up of galaxies with no or very weak emission lines.
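The subsample definitions above amount to a short decision procedure, sketched below. This is an illustration under stated assumptions, not the pipeline used in the paper: the demarcation curves are the commonly quoted forms of the Kewley et al. (2001) and Kauffmann et al. (2003) lines, the dictionary keys and function names are hypothetical, and S/N is taken to be flux divided by its error.

```python
import math


def kewley01(x):
    """Upper (pure starburst limit) line; x = log10([N II]6584/Halpha)."""
    return 0.61 / (x - 0.47) + 1.19


def kauffmann03(x):
    """Lower line, a shifted version of the upper one."""
    return 0.61 / (x - 0.05) + 1.3


def classify(flux, snr):
    """Assign a nuclear class from line fluxes and their S/N.

    flux, snr: dicts keyed by 'Ha', 'Hb', 'OIII', 'NII' (hypothetical keys).
    """
    if all(snr[k] > 3 for k in ("Ha", "Hb", "OIII", "NII")):
        x = math.log10(flux["NII"] / flux["Ha"])
        y = math.log10(flux["OIII"] / flux["Hb"])
        # To the right of a curve's asymptote a galaxy counts as above it.
        if x >= 0.47 or y > kewley01(x):
            return "AGN"
        if x >= 0.05 or y > kauffmann03(x):
            return "C"
        return "SF"
    if snr["NII"] > 3 and snr["Ha"] > 3 and flux["NII"] / flux["Ha"] > 0.6:
        return "Low S/N AGN"
    if snr["Ha"] > 2:
        return "Low S/N SF"
    return "Unclassifiable"


high = {"Ha": 10, "Hb": 10, "OIII": 10, "NII": 10}
example = classify({"Ha": 1.0, "Hb": 0.3, "OIII": 0.3, "NII": 0.1}, high)
```

The order of the tests mirrors the text: the full BPT classification is attempted first, then the [N II]/Hα criterion, and only then the Hα-only fallback.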
Note that this is merely a "nuclear" classification; it does not make any statement about the properties of the parts of the galaxy outside the region sampled by the fibre. In particular, we expect the Unclassifiable category to include a substantial number of galaxies where we only sample the central bulge; for such galaxies there may be considerable amounts of star formation outside the fibre. We will return to this point when discussing aperture corrections below. Figure 2 shows the distribution of log S/N in each line considered for the galaxies in our sample. All galaxies falling outside the plotted range have been included in the three bins at the extremes of the plot. The thick solid line shows the overall S/N distribution and the distributions for each of the different classes defined above are shown with different line styles as indicated by the legend in the first panel. For clarity, the low S/N AGNs have been grouped with the AGN.
These distributions are of considerable interest for understanding possible biases in our classification scheme. The most notable results from these panels can be summarised as follows:
• There is clear evidence that a considerable number of SF galaxies are thrown out because they have low S/N in [O III]5007. The inclusion of the low S/N SF class is therefore very beneficial. As Table 1 shows, this almost doubles the number of galaxies for which we can use Hα to determine the SFR.
Our sample covers a large range in galaxy luminosity and the distributions shown in Figure 2 suppress this dimension. A complementary view of these data is shown in Figure 3. This shows the trend in the median S/N of the different lines as a function of the stellar mass of the galaxies. We have suppressed the results for the [O II]3727 line as it falls out of the sample at low redshift and also lies in a region of the spectrum that has rather different noise characteristics from that where the other lines reside. It shows a similar trend to [O III]5007.

Figure 3. The median signal-to-noise ratio for five of our lines that can be observed over the entire redshift range spanned by our sample. Notice in particular how [O III]5007 goes from being very strong at low masses to being the weakest line at high masses.
The most important point in this figure is that the fraction of galaxies for which we can assign a nuclear classification is a declining function of mass. With any fixed S/N cut, the unclassifiable category will be biased towards more massive galaxies.
The decline in S/N in the lines with mass is seen at M* > 10^10 M⊙. For lower masses, we note that the S/N in Hα is constant over a large range in mass. We will return to this point later. [O III]5007 changes from being the strongest line at low mass to being the weakest at high mass. This is caused by the decreasing electron temperature. At high mass, most of the cooling takes place in mid-IR fine-structure lines, depressing the [O III]5007 flux strongly. At the very highest masses the trend reverses because of the higher fraction of AGN in the most massive galaxies (cf. Kauffmann et al. 2003c).

MODELLING
Our aim is to use the data presented in the previous section to constrain the physical properties of the galaxies in our sample. The main emphasis in this paper is on the SFR. In the following we will sometimes distinguish between SFRs derived directly from the emission lines as discussed in this section and those derived indirectly from the 4000Å-break as discussed in section 4.1. The former we will refer to as SFRe and the latter as SFRd. In the majority of the paper we will use SFRe for some galaxies and SFRd for others and in that case we will use SFR for simplicity; the details will be given towards the end of section 5.
We follow Charlot et al. (2002, hereafter C02) and model the emission lines in our galaxies with the Charlot & Longhetti (2001, CL01) models. These combine galaxy evolution models from Bruzual A. & Charlot (1993, version BC02) with emission line modelling from Cloudy (Ferland 2001; see CL01 for details). These are similar in spirit to other models recently presented in the literature (Kewley et al. 2001a; Zackrisson et al. 2001; Bressan et al. 2002; Moy et al. 2001; Panuzzo et al. 2003), but they contain a physically motivated dust model and are optimised for integrated spectra as opposed to individual HII regions. The modelling is discussed in more detail in Charlot et al. (2003, hereafter Paper I, see also C02); here we will only discuss those aspects important for our current undertaking.
We have generated a model grid using the following effective (i.e. galaxy-wide) parameters: metallicity, Z, the ionization parameter at the edge of the Strömgren sphere, U, the total dust attenuation in the V-band, τV, and the dust-to-metal ratio of the ionized gas, ξ. The parameter ranges and grid sizes adopted are given in Table 2. Since we will only study relative line fluxes in this paper, we are not sensitive to stellar age, star formation history and the relative attenuation by dust in the stellar birthclouds (i.e. giant molecular clouds) and the ambient ISM. In total our grid contains ∼ 2 × 10^5 models.
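As a quick consistency check on the quoted grid size, the number of models is the product of the number of steps along each axis. The step counts below are those listed in Table 2; the dictionary keys are hypothetical labels.

```python
from math import prod

# Number of steps per grid axis, as listed in Table 2.
grid_steps = {
    "log_Z": 24,   # metallicity
    "log_U": 33,   # ionization parameter
    "tau_V": 24,   # total V-band dust attenuation
    "xi": 9,       # dust-to-metal ratio of the ionized gas
}

n_models = prod(grid_steps.values())  # 24 * 33 * 24 * 9
```

This gives 171072 models, in line with the ∼ 2 × 10^5 stated in the text.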
Given the major uncertainty in traditional star formation estimates due to dust attenuation (Sullivan et al. 2001; Bell & Kennicutt 2001; Calzetti 1997), we should briefly summarise our dust treatment, which follows that of Charlot & Fall (2000, hereafter CF00). The CF00 model provides a consistent model for UV to far-IR emission. It does this by incorporating a multi-component dust model where the birth clouds of young stars have a finite lifetime. This ensures a consistent picture for the attenuation of continuum and line emission photons. In addition the model deals with angle-averaged, effective quantities, which depend on the combined optical properties and spatial distribution of the dust.
Each model in our grid has been calculated with a particular dust attenuation and these attenuated line ratios are compared directly to the observed spectrum. This means that all lines contribute to the constraint on the dust attenuation. To first approximation, however, our dust corrections are based on the Hα/Hβ ratio. Note that we do not interpret the derived dust attenuations as coming from a simple foreground screen, a procedure which has been rightfully criticised as simplistic (e.g. Witt & Gordon 2000; Bell & Kennicutt 2001; Serjeant et al. 2002).
This model grid is somewhat larger than that used by C02, but the main difference from C02 is that we adopt a Bayesian approach to calculate the likelihood of each model given the data. This is similar to the approach used by Kauffmann et al. (2003a) in their calculation of stellar masses. If, for notational simplicity, we enumerate the models with a single index, j, we can write the likelihood of a model given the data, P(Mj|{Li}), as

$$\ln P(M_j | \{L_i\}) = -\frac{1}{2} \sum_{i} \left( \frac{L_i - A M_{i,j}}{\sigma_i} \right)^2,$$

where the sum is over all lines observed for that galaxy, Li is the luminosity of line i, σi is its uncertainty, A is an overall scaling factor and M is the model grid; in practice this is a 4D array. The prior adopted is given in Table 2 and is essentially a maximum ignorance, flat prior for all parameters. The choice of prior has very little effect for the high S/N galaxies as long as we span the range of physical parameters in the sample. However, as we will explain below, it can make a difference for low S/N galaxies.
We fit in linear flux units rather than to flux ratios, because the approximation of Gaussianity is more appropriate there. (By comparing the observed ratio of the [O III] lines at 4959Å and 5007Å with the theoretical value, we find that the errors on our flux measurements become non-Gaussian below S/N ≈ 2. Since we limit our SF sample to S/N > 3, this is not a major concern.)
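A fit in linear flux units over a model grid can be sketched as follows. This is a minimal illustration, assuming independent Gaussian errors on the line luminosities, so that the log-likelihood is minus half the chi-square. How the scaling factor A is treated is not specified in the text; here it is simply set to its least-squares value for each model, which is an assumption. The function names and array layout are hypothetical, and NumPy is assumed to be available.

```python
import numpy as np


def grid_log_likelihood(L, sigma, M):
    """Gaussian log-likelihood of every model in a flattened grid.

    L, sigma : observed line luminosities and 1-sigma errors, shape (n_lines,)
    M        : model line luminosities, shape (n_models, n_lines)
    Returns (lnP, A), each of shape (n_models,).
    """
    L = np.asarray(L, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    M = np.asarray(M, dtype=float)
    w = 1.0 / sigma**2
    # Assumed treatment of the overall scaling: per-model least-squares A.
    A = (M * L * w).sum(axis=1) / (M**2 * w).sum(axis=1)
    resid = (L - A[:, None] * M) / sigma
    return -0.5 * (resid**2).sum(axis=1), A


def posterior(lnP, log_prior=None):
    """Normalised posterior over models; a flat prior if none is given."""
    if log_prior is not None:
        lnP = lnP + log_prior
    p = np.exp(lnP - lnP.max())   # subtract the maximum for numerical stability
    return p / p.sum()


models = np.array([[1.0, 2.0, 3.0], [1.0, 1.0, 1.0], [3.0, 2.0, 1.0]])
lnP, A = grid_log_likelihood(2.0 * models[0], [0.1, 0.1, 0.1], models)
post = posterior(lnP)
```

With a four-parameter grid the flattened posterior can be reshaped back to the 4D grid shape before marginalising over individual parameters.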
The important advantage of our approach is that it generates the full likelihood distribution of a given parameter and from this we can rigorously define confidence intervals for our estimates. For high S/N objects, these likelihood distributions are symmetric around a best-fit value. In this situation it is possible to summarise the distribution using the best-fit value and the spread around this value. However, for lower S/N galaxies, the distributions may be double-peaked or non-symmetric and it is necessary to keep the entire likelihood distribution in further calculations. The main disadvantage of working with full distributions is that they are somewhat cumbersome to manipulate. The basic theory is covered in introductory statistics books (e.g. Rice 1995), but for the benefit of the reader we have collected the results most important for the present paper in Appendix A. Figure 4 shows the result of taking a high S/N spectrum and decreasing the S/N of the emission lines. Each likelihood distribution is the result of 100 realisations (assuming Gaussian noise on the line flux). The figure shows S/N = 2, 5 and 15, where the S/N is that of the weakest line of those contributing to the BPT diagram; for this galaxy this is Hβ.
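The handling of likelihood distributions described above can be illustrated with a short sketch: marginalise a gridded posterior onto one parameter axis, then read off percentiles from its cumulative distribution. The function names are hypothetical, NumPy is assumed, and the percentile summary is only adequate for single-peaked distributions; for double-peaked cases the full distribution should be carried along, as the text notes.

```python
import numpy as np


def marginalise(post, grid_shape, axis):
    """Sum a flattened posterior over all grid axes except `axis`."""
    p = np.asarray(post, dtype=float).reshape(grid_shape)
    other_axes = tuple(i for i in range(p.ndim) if i != axis)
    return p.sum(axis=other_axes)


def percentiles(values, pdf, q=(0.16, 0.5, 0.84)):
    """Parameter values at the cumulative probabilities q.

    values : increasing grid of parameter values
    pdf    : (unnormalised) probability at each grid value
    """
    values = np.asarray(values, dtype=float)
    cdf = np.cumsum(pdf) / np.sum(pdf)
    return np.interp(q, cdf, values)


# Hypothetical example: a unit Gaussian sampled on a fine grid.
x = np.linspace(-5.0, 5.0, 1001)
lo, med, hi = percentiles(x, np.exp(-0.5 * x**2))
```

For the Gaussian example the median falls close to zero and the 16th and 84th percentiles close to -1 and +1, up to the half-bin offset of the discrete cumulative sum.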
The distributions broaden as expected when the S/N decreases. There is also a slight shift in the estimates with decreasing S/N; in particular the estimate of the SFRe goes up by 0.1 dex in the mode, somewhat more in the average. Although the magnitude and sign of this shift depend on the particulars of the line fluxes of the galaxy, it is an indication that our apparatus does not function well at low S/N. For this reason, and because we cannot reliably classify galaxies at low S/N, we only use the results of the model fits for the SF class. Figure 5 shows the distribution of the difference between the observed value of the line fluxes and that of the best-fit model, divided by the error, for all galaxies in the SF class. The agreement is excellent, except that the models perhaps slightly overpredict the Hβ flux for the highest S/N objects. This could be caused by uncertainties in the continuum subtraction, but it has negligible effect on our SFRe estimates.
This modelling approach is necessary to get accurate SFRe estimates because the SDSS spans a range in galaxy properties such as Hubble type, mass and emission-line characteristics. Previous studies of star formation activity (see Kennicutt 1998, for a review) using optical spectra have usually assumed a fixed conversion factor between Hα luminosity and SFRe. Dust attenuation is estimated from Hα/Hβ assuming a fixed unattenuated Case B ratio. The fixed conversion factor appears to be a reasonable approximation when stellar absorption in the Balmer lines is taken into account (e.g. Charlot et al. 2002). However, as is well known, and shown quantitatively by CL01, it is not correct in detail. The approximation neglects the effect of metallicity and ionization state. It also neglects the diffuse emission in galaxies. More careful modelling is required in order to avoid creating biases in the SFRe estimates as a function of metallicity or stellar mass. However, in many situations, in particular at high redshift, detailed modelling in the way we have done is very hard to carry out. It is therefore of considerable interest to look at the relationship between the Hα-luminosity and SFRe found in our study.

Parameter   Description                  Range
Z           The metallicity              −1 < log Z/Z⊙ < 0.6 in 24 steps
U           The ionisation parameter     −4.0 < log U < −2.0 in 33 steps
τV          The total dust attenuation   0.01 < τV < 4.0 in 24 steps
ξ           The dust-to-metal ratio      0.1 < ξ < 0.5 in 9 steps

Table 2. The model grid calculated for the present work. This is calculated for a constant star formation history at t = 10^8 yrs; see text for details.

The relationship between Hα-luminosity and SFRe
We will follow CL01 and denote the ratio of observed Hα-luminosity to SFRe as η_Hα:

$$\eta_{\mathrm{H}\alpha} = L_{\mathrm{H}\alpha} / \mathrm{SFR}_e.$$

In this form the Hα-luminosity is assumed to be uncorrected for dust attenuation. We now wish to study how this parameter varies with the physical properties of the galaxy. It will prove convenient to split this into the conversion factor between unattenuated Hα-luminosity and SFRe, which we will refer to as η^0_Hα, and the dust attenuation at Hα in magnitudes, A_Hα, which we will take as

$$A_{\mathrm{H}\alpha} = 1.086\, \tau_V \left( \lambda_{\mathrm{H}\alpha} / \lambda_V \right)^{-0.7} \approx 0.96\, \tau_V, \qquad \lambda_V = 5500\,\mathrm{\AA}.$$

This assumes an attenuation law of the form τ(λ) ∝ λ^−0.7, which is a good approximation to the CF00 model over the wavelength range we consider. Figure 6 shows the likelihood distributions for A_Hα in five bins in stellar mass. These have been constructed by co-adding the individual likelihood distributions for A_Hα for all SF galaxies in the relevant mass range. We note that ignoring the metallicity-dependence of the Case B Hα/Hβ ratio would lead to an overestimate of the dust attenuation by up to ∼ 0.5 magnitudes for the most metal rich galaxies. The most striking result from Figure 6 is the clear increase in dust content at high stellar masses. Tremonti et al. (2003) have shown that there is a strong correlation between metallicity and mass for star-forming galaxies. One would therefore expect that the main physical reason for the correlation between dust attenuation and stellar mass is an increase in the metallicity of the emitting gas. It is also interesting that the width of the A_Hα distribution increases strongly as one goes to more massive galaxies. Some of this is due to the more massive galaxies having lower S/N emission lines and hence more extended likelihood distributions. However, note that we require S/N > 3 for all four lines in the BPT diagram, so it is unlikely that this is the main cause.

Figure 5. The distribution of the difference between the observed value of the line fluxes and that of the best-fit model, divided by the error, for all galaxies in the SF class. The smooth dashed curve shows a unit Gaussian for reference. The horizontally shaded histogram shows the results for the galaxies that have S/N in the line between 3 and 10. The obliquely shaded histogram is for galaxies with S/N > 20 in the line.

Figure 7 shows the corresponding likelihood distributions for η^0_Hα. We have compared this to the conversion factor advocated by Kennicutt (1998) (η^0_Hα(Kennicutt) = 10^41.28 erg/s/M⊙/yr for our adopted IMF). This value is indicated as the vertical dotted line in Figure 7. Clearly the Kennicutt (1998) conversion factor is a very good typical value and compares well with the median value for our sample, log η^0_Hα = 41.27. However, the peak values of log η^0_Hα do vary by nearly 0.4 dex, with the most massive/most metal rich galaxies producing less Hα-luminosity for the same SFRe than low mass/metal poor galaxies. This correlation is also likely to be driven mainly by the changes in gas-phase metallicity with mass, but as discussed by CL01 and Charlot et al. (2002) a changing escape fraction for ionizing photons is also likely to contribute to these changes in η^0_Hα.
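The split of the conversion into a dust term and an unattenuated factor can be sketched numerically. This is an illustration under stated assumptions rather than the fitting procedure of the paper: the attenuation follows τ(λ) ∝ λ^−0.7 as in the text, τV is assumed to be defined at 5500 Å, the conversion factor is fixed at the Kennicutt (1998) value of 10^41.28 erg/s per M⊙/yr quoted above, and the function names are hypothetical.

```python
import math

LOG_ETA0_KENNICUTT = 41.28   # log10 of erg/s per (Msun/yr), value quoted in the text
LAMBDA_HALPHA = 6563.0       # Angstrom
LAMBDA_V = 5500.0            # Angstrom; assumed reference wavelength for tau_V


def a_halpha(tau_v, slope=-0.7):
    """Attenuation at H-alpha in magnitudes for tau(lambda) ~ lambda**slope."""
    tau_halpha = tau_v * (LAMBDA_HALPHA / LAMBDA_V) ** slope
    return 2.5 * math.log10(math.e) * tau_halpha


def sfr_from_halpha(l_halpha_obs, tau_v=0.0, log_eta0=LOG_ETA0_KENNICUTT):
    """SFR in Msun/yr from an observed H-alpha luminosity in erg/s."""
    l_unattenuated = l_halpha_obs * 10 ** (0.4 * a_halpha(tau_v))
    return l_unattenuated / 10 ** log_eta0


a_unit = a_halpha(1.0)                                 # magnitudes per unit tau_V
sfr_dust_free = sfr_from_halpha(10 ** 41.28)           # no attenuation
sfr_dusty = sfr_from_halpha(10 ** 41.28, tau_v=1.0)    # same flux, tau_V = 1
```

A fixed log η^0_Hα ignores the variation of nearly 0.4 dex with mass discussed above, so a sketch like this is only as good as the fixed-factor approximation itself.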

A comparison to other estimators
These trends with mass suggest that any simple recipes for conversion of Hα-luminosity into SFRe are likely to show some biases with stellar mass. To quantify these possible biases, it is useful to compare our SFRe estimates with those obtained from previous methods to convert observed Hα-flux to SFRs (see also Charlot et al. 2002). We noted above that the Kennicutt (1998) conversion factor is a good typical value so we will use this in what follows, i.e.

$\mathrm{SFR}_e = L^0_{H\alpha} / 10^{41.28}$,

where L 0 Hα is the estimated emission luminosity of Hα. Following the discussion above we will also fix the Case B Hα/Hβ ratio to be 2.86 (appropriate for an electron density of n = 100 cm^−3 and electron temperature Te = 10^4 K; Osterbrock 1989), where relevant below. We have also adopted the Seaton (1979) attenuation law in the following. These choices have been made to ensure that we will use a method to estimate SFRs which is very similar to those commonly used in the literature (e.g. Sullivan et al. 2000). The residual differences between methods therefore lie in the estimates of dust attenuation and stellar absorption under the Balmer lines. This leads to the following list of methods of SFRe estimation:
(1) Assume that the observed flux can be converted to a luminosity without any correction for dust or continuum absorption. This underestimates the SFRe, but might be reasonable for strongly star-forming low metallicity galaxies.
(2) Correct for dust attenuation by assuming a default attenuation of AV = 1mag in the emission line and correct for absorption at Hα by assuming a typical stellar absorption, EW abs (Hα), of 2Å. If the assumptions are correct in the mean for the sample this is likely to give a good estimate of the SFRe's, but the scatter around the mean value might be large depending on the sample.
(3) Improve on the dust corrections by using Hα/Hβ (or some other Balmer line combination) to determine the dust attenuation using a fixed dust-free Case B reference value. We can make two assumptions about the absorption underlying Hβ:
(a) EW abs (Hα) = EW abs (Hβ). This is the most common assumption in the literature.
(b) EW abs (Hα)= 0.6EW abs (Hβ). This is a better assumption for the BC03 models and presumably for our galaxies as well.
(4) Finally, with good spectra such as those from the SDSS, it is possible to perform a careful estimate of the absorption at Hα and Hβ, which is what we have done for the present paper. These fluxes can be used to determine the dust attenuation as above and thence the SFRe.
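As a rough illustration of methods 1 to 3 above, the following Python sketch converts an observed Hα luminosity into a SFR with a fixed conversion factor and a Balmer-decrement dust correction. It is a minimal sketch and not the pipeline used for this paper: for simplicity it adopts the τ(λ) ∝ λ^−0.7 attenuation law quoted earlier instead of the Seaton (1979) law, and all function and argument names are assumptions introduced for illustration.

```python
import math

ETA0_KENNICUTT = 10 ** 41.28      # erg/s per (Msun/yr), value quoted in the text
CASE_B = 2.86                     # fixed dust-free Halpha/Hbeta ratio
LAM_HA, LAM_HB = 6563.0, 4861.0   # rest wavelengths in Angstrom


def absorption_corrected(flux, continuum, ew_abs):
    """Add back the line flux hidden by stellar absorption.

    continuum is the flux density at the line (flux units per Angstrom),
    ew_abs the assumed absorption equivalent width in Angstrom.
    """
    return flux + ew_abs * continuum


def a_halpha_from_decrement(f_ha, f_hb, slope=0.7):
    """Attenuation at Halpha (mag) from the observed Halpha/Hbeta ratio.

    Assumes tau(lambda) proportional to lambda**-slope (an assumption of
    this sketch, not the Seaton law used for the comparison in the text).
    """
    # A(Hbeta) - A(Halpha) = A(Halpha) * [(lam_Hb / lam_Ha)**-slope - 1]
    k = (LAM_HB / LAM_HA) ** (-slope) - 1.0
    a_ha = 2.5 * math.log10((f_ha / f_hb) / CASE_B) / k
    return max(a_ha, 0.0)         # ratios below Case B are treated as dust-free


def sfr_from_halpha(lum_ha_obs, a_ha=0.0, eta0=ETA0_KENNICUTT):
    """SFR in Msun/yr from an observed Halpha luminosity in erg/s."""
    return lum_ha_obs * 10 ** (0.4 * a_ha) / eta0
```

In this sketch, method 1 corresponds to calling `sfr_from_halpha` with `a_ha=0`, while methods 3a and 3b differ only in the `ew_abs` values passed to `absorption_corrected` for Hα and Hβ before the decrement is formed.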
We compare these four estimates of SFRe's with our best estimates in Figure 8. To produce this figure we took all SF galaxies with S/N in Hβ> 10 (to ensure that dust attenuations from Hα/Hβ are reliable) and compared each of the methods 1-4 above to the mode of our likelihood distribution for the SFRe's. The figure shows the median of the logarithm of the ratio of the two SFRe estimates in bins of 0.2 dex in stellar mass.
It is immediately clear that methods 1, 3b and 4 all agree well with our estimates for low mass galaxies. This is expected because the equivalent width of stellar absorption is very small compared to that of the emission line (the median ratio is 2-6% at log M * < 10) and the dust attenuation is very low in these low metallicity galaxies. But it is clear that the assumption (method 3a) of equal absorption under Hα and Hβ leads to an overestimate of the dust attenuation and hence an overestimate of the SFRe. Method 2 does not perform well at low masses either, but this is again expected since an average dust attenuation of 1 magnitude at V is too high for these galaxies so again we overestimate the SFRe.
As expected, method 1 starts to diverge from the other estimators when going to more massive and dusty systems (cf. Figure 6). Likewise method 3a gives systematically discrepant results as explained above. Method 2 is similar to method 1 but agrees with the more sophisticated estimators in a different mass range since the underlying assumptions are more appropriate for more massive systems.
Finally, the remaining methods, 3b and 4, agree fairly well with the results of the full fits except at the highest masses. This is because the assumption of a fixed η 0 Hα and Case B ratio for all masses is flawed, as we noted in the discussion of Figure 7. However, it is interesting to note that the changes in the Case B ratio and in η 0 Hα with mass nearly cancel when one estimates the SFRe over a large range in mass. This happy coincidence comes about because the standard Case B ratio will tend to overestimate the unattenuated Hα-flux, whereas ignoring the variation in η 0 Hα will require a larger Hα-flux for the same SFRe. Thus it is a fairly good approximation to use a fixed Case B ratio and η 0 Hα value, but one should never use only one and then apply the relations derived here. Also, it is necessary to include the uncertainty in the Case B ratio when quoting uncertainties; this uncertainty can only be accurately assessed using the full model fits.
At the very highest masses it is however worth noting that ignoring these factors might lead to up to a factor of 3 systematic underestimate of the SFRe's. It is therefore important to adjust the conversion factor and Case B ratio to that expected for the sample galaxies in which case the simple methods will work well.
We should also note that similar trends are seen with the emission equivalent width of Hα, with the result that methods 2, 3b and 4 all give SFRe estimates in good agreement with the full model fits when the Hα emission line has an equivalent width 20Å.
As remarked above, there are a large number of galaxies with significant Hα which cannot be classified in the BPT diagram, the low S/N SF class. We would nevertheless still like to use the Hα line to constrain the SFRe of these galaxies. We do this by adopting the average likelihood distribution for the set of high S/N SF galaxies with masses that lie within a factor of 3 of the mass of the object in question. The range is increased if it contains fewer than 50 galaxies.
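The mass-window averaging just described could be sketched as follows. This is an illustrative sketch under stated assumptions, not the code used for this paper: the text does not specify how the range is increased, so the multiplicative widening step `widen` is hypothetical, as are the function and argument names, and the likelihoods are assumed to be tabulated on a common grid.

```python
import numpy as np


def neighbour_likelihood(log_mass, ref_log_mass, ref_pdfs,
                         factor=3.0, min_count=50, widen=1.25):
    """Average likelihood of high S/N SF galaxies of similar stellar mass.

    log_mass     : log10 stellar mass of the low S/N target galaxy
    ref_log_mass : (N,) log10 stellar masses of the reference galaxies
    ref_pdfs     : (N, K) reference likelihoods on a common K-point grid
    The window starts at a factor `factor` in mass and is widened by the
    hypothetical factor `widen` until it holds at least `min_count`
    galaxies or covers the whole (non-empty) reference sample.
    """
    ref_log_mass = np.asarray(ref_log_mass, dtype=float)
    ref_pdfs = np.asarray(ref_pdfs, dtype=float)
    half_width = np.log10(factor)
    while True:
        sel = np.abs(ref_log_mass - log_mass) <= half_width
        if sel.sum() >= min_count or sel.all():
            break
        half_width += np.log10(widen)
    mean_pdf = ref_pdfs[sel].mean(axis=0)
    return mean_pdf / mean_pdf.sum()   # renormalise to unit sum
```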

Additional uncertainties
There are several residual uncertainties in our determination of SFRe that warrant discussion. First, we have considered only one IMF, the Kroupa (2001) universal IMF. The emission lines we consider here depend on the massive end of the IMF. As long as this does not vary dramatically, the effect of changing the IMF is mostly to change the overall normalisation of our SFRe's. To correct from our choice of IMF to the Salpeter (1955) IMF between 0.1 and 100M⊙, one should multiply our SFRe estimates by 1.5. This is the ratio of mass in the two IMFs for the same amount of ionising radiation. Second, our model assumes that essentially no ionising radiation escapes the galaxies (see CL01). This assumption is necessary in order to relate Hα-luminosity to SFRe and seems reasonable for our sample of relatively massive galaxies. Even local starburst galaxies have escape fractions < 10% (Leitherer et al. 1995; Heckman et al. 2001, see also CL01). Finally, our model uses angle-averaged, "effective" parameter values, whereas any particular galaxy is only seen from one angle. For a given galaxy, there may be a mismatch between the modelling and the observations. We have verified that no significant systematic is present with respect to the inclination of the galaxies, but a residual scatter is likely to remain. This cannot be quantified without rigorous modelling of the spatially resolved properties of galaxies.

Figure 8. Our estimates compared with the simpler SFRe estimates discussed in the text as a function of stellar mass. We have used all SF galaxies with S/N Hβ > 10. The figure shows the logarithm of the ratio of our estimate of the SFRe to that obtained by the alternative method, so a value of zero corresponds to equality. See text for details.
As explained in Paper I, additional uncertainties come from the mix of stellar populations within the fibre, from uncertainties related to interpolation of our model grid (1-2 %), and from uncertainties in stellar tracks and theoretical population modelling (∼ 2 %). We include these uncertainties below by adding a 4 % uncertainty in quadrature.
The largest source of uncertainty in the estimates of SFRe's within the fibre, however, is that due to the possible contamination of our spectra by other sources of ionising radiation. We now turn to a discussion of these.

OTHER SOURCES OF EMISSION LINE FLUX
As with any other non-resolved spectroscopic survey of galaxies, the emission line fluxes of the galaxies in our sample will contain a component due to sources not directly connected to star formation. These include planetary nebulae (PNe), supernova remnants (SNRs), the diffuse ionized gas (DIG) and active galactic nuclei (AGN). The contributions of SNR, PNe and DIG are discussed in Appendix B, where we show that they are of minor importance for our work. Here we will quantify the possible level of AGN contribution in each of the classes defined in section 2.1. We will show that for the Composite and AGN classes, blind application of the modelling machinery described in the previous section is likely to lead to incorrect results. Note that a comprehensive study of the AGN activity in our sample is presented in Kauffmann et al. (2003c).
There are at least three methods commonly employed to deal with the presence of AGN in a galaxy survey:
(i) Remove galaxies with AGNs by cross-correlating the sample with published AGN catalogues (e.g. Condon et al. 2002; Serjeant et al. 2002).
(ii) Identify galaxies with non-stellar ionising spectra using a diagnostic diagram, typically the so-called BPT diagram introduced by Baldwin et al. (1981). This is possible so long as one can detect the lines required for classification (see section 2.1). At z ≳ 0.5, where Hα is redshifted out of the optical, this method is less accurate (see however Rola et al. 1997; Tresse et al. 1996).
(iii) Finally AGN can be subtracted in a statistical manner. This is necessary when no other methods are applicable (e.g. Tresse & Maddox 1998). The method is, of course, sensitive to evolution in AGN activity with redshift and it is therefore necessary to include the related uncertainties in the error budget for the derived SFRs (e.g. Tresse et al. 2002;Flores et al. 1999).
All of these methods tend to classify galaxies as either AGN or starbursts, but as shown by Kauffmann et al. (2003c), many AGN have young stellar populations and ongoing star formation.
Our analysis will be based on the BPT diagram. The left panel of Figure 9 shows the BPT diagram as a 2D histogram for galaxies in our sample with S/N > 3 in all lines. The shading is based on the square root of the number of galaxies in each bin and the grid spacing is 0.05 along the x and y-axes. The lines delineate our AGN, C and SF classes. The cross shows the location of an "average" AGN which is constructed by taking the mean of the luminosities of all AGN galaxies with sigma-clipping of outliers using σ = 3.5. We use this AGN below; the results are robust to the exact average AGN used. Kewley et al. (2001b) advocate the use of several diagnostic diagrams to rigorously isolate AGNs. However, diagnostic diagrams involving [S II]6716/Hα or [O I]6300/Hα are much less effective at assessing the degree to which AGN may contaminate our estimated SFRe. This is because a change in AGN fraction moves a galaxy almost parallel to the SF locus in these diagrams. In the BPT diagram of Figure 9, a change in AGN fraction moves the galaxy perpendicular to this locus.
The upper line in Figure 9 shows the uppermost region that can in principle be reached by systems ionized by normal stellar populations. However, real star forming galaxies (and ionization-bounded HII-regions) populate a narrow sequence in the BPT diagram, because there are strong correlations between ionization parameter, dust-depletion and metallicity. A galaxy with both an AGN and ongoing star formation will be located to the right of this line (see also Kewley et al. 2001b).
The right-hand panel in Figure 9 shows the BPT diagram for the SF class, but this time as a conditional density diagram (i.e. the distribution in each bin along the x-axis has been normalised to unity). This distribution is very similar to that found for HII-regions in nearby galaxies (e.g. Bresolin & Kennicutt 2002;Kennicutt et al. 2000). The dashed lines show fits to the intervals enclosing 95% of the distribution. The fits use only the data with log[N II]6584/H α < −0.5. This ensures that the fits are the same if the AGN and C classes are included in the distribution.
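A conditional density diagram of the kind described here can be built from an ordinary 2D histogram by normalising each column. The following is a minimal sketch using NumPy; the function name and the choice to leave empty columns at zero are assumptions made for illustration, not a description of the plotting code used for Figure 9.

```python
import numpy as np


def conditional_density(x, y, x_edges, y_edges):
    """2D histogram normalised so that each bin along the x-axis sums to one.

    Returns an array of shape (n_x_bins, n_y_bins); row i holds the
    distribution of y for galaxies falling in x bin i.
    """
    counts, _, _ = np.histogram2d(x, y, bins=[x_edges, y_edges])
    col_sums = counts.sum(axis=1, keepdims=True)   # number of objects per x bin
    # Empty x bins are left at zero instead of dividing by zero.
    return np.divide(counts, col_sums,
                     out=np.zeros_like(counts), where=col_sums > 0)
```

Here `x` would be log [N II]6584/Hα and `y` would be log [O III]5007/Hβ.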
We will take the upper dashed line in the right hand panel as the upper limit for our pure star formation sequence. For each galaxy above this limit, we ask what AGN contribution needs to be subtracted to move the galaxy below the limit. The average AGN we use is indicated with a cross in the left-hand panel in Figure 9. The results are fairly robust to the exact location of this average AGN until one gets close to the AGN region.
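To make the subtraction step concrete, the sketch below removes an increasing amount of a template AGN spectrum from a galaxy until the residual line ratios fall on or below an upper limit in the BPT diagram. It is only a sketch under stated assumptions: the dictionary keys, the grid search over the AGN fraction, and the `upper_limit` callable (standing in for the fitted upper dashed line, whose functional form is not given in this section) are all hypothetical.

```python
import numpy as np

LINES = ("ha", "hb", "n2", "o3")   # Halpha, Hbeta, [N II]6584, [O III]5007


def agn_halpha_fraction(galaxy, agn, upper_limit, n_steps=1000):
    """Smallest fraction of the Halpha flux that must be attributed to an AGN.

    galaxy, agn : dicts of line fluxes keyed by LINES; only the line
                  ratios of `agn` matter (it is rescaled to Halpha = 1).
    upper_limit : callable returning the maximum log [O III]/Hbeta allowed
                  for pure star formation at a given log [N II]/Halpha.
    Returns None if no fraction in [0, 1) brings the galaxy below the limit.
    """
    g = np.array([galaxy[k] for k in LINES], dtype=float)
    a = np.array([agn[k] for k in LINES], dtype=float)
    a = a / a[0]                         # AGN fluxes per unit AGN Halpha
    for f in np.linspace(0.0, 1.0, n_steps + 1):
        resid = g - f * g[0] * a         # fluxes left after removing the AGN part
        if np.any(resid <= 0.0):
            return None                  # ran out of flux in some line
        x = np.log10(resid[2] / resid[0])    # log [N II]/Halpha
        y = np.log10(resid[3] / resid[1])    # log [O III]/Hbeta
        if y <= upper_limit(x):
            return f
    return None
```

A galaxy that already lies on or below the limit returns a fraction of zero, so the same function can be applied to the whole sample.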
In Figure 10 we show the results of this exercise for the composite class. The solid histogram shows the fraction of galaxies that have a given fraction of their Hα-flux coming from an AGN. The dashed histogram shows the same for [O III]5007. At the top of the AGN plume in Figure 9, all the [O III]5007-flux can be expected to come from an AGN, but we do not expect more than ∼ 40% of the Hα-flux to have a non-stellar origin for any galaxy 6 .
In summary, we find that 11% of the (observed) Hα-luminosity density in the composites is likely to come from an AGN. For [O III]5007 AGNs contribute 41% of the luminosity density in the composites. This means that using the Hα-luminosity for the Composite class without any correction for AGN activity will only mildly overestimate the SFRe.

Figure 9 (caption fragment). ... Table 1) and after normalising the distribution in each bin along the x-axis to 1. The thick dashed lines show fits to intervals containing 95% of the galaxies as described in the text.
Because different lines are affected by AGN in different ways, model fitting may give unreliable results. We therefore estimate the in-fibre star formation in the Composite and AGN classes using their measured D4000 values, as discussed below.
We should also comment that the pure star forming sequence differs slightly from the definition of the SF class, but we have checked that no more than 0.2% of the Hα-luminosity density in the SF galaxies is expected to be of AGN origin, which is well below our systematic uncertainties and we will ignore this henceforth. The AGN contribution to the [O III]5007-luminosity of the SF galaxies can also be shown to be well below the level necessary to bias our model fits.

Figure 11. The observed relationship between D4000 and specific SFRe (both inside the fibre). The contour lines are equally spaced from 10 galaxies per bin to the maximum. The bin size is 0.01 × 0.1 in the units given in the plot. The red line shows the average at a given D4000 and the blue line shows the mode of the distribution. The dashed lines show the limits containing 95% of the galaxies at a given D4000.

Estimating the SFR/M * from D4000
For the Composite galaxies and AGN in the sample, we cannot use the model fits to determine the SFRe's, because the line fluxes are likely to be affected by the AGN component. In principle it is possible to subtract off an AGN component, but in practice this is a very uncertain procedure. We have therefore decided to use the measured D4000 value to estimate the SFRs and will denote this SFR d .
This method uses the relationship between SFRe/M * and D4000 shown in Figure 11 to estimate the specific SFR d of a galaxy, and from there its total SFR d . The relation shown in the figure has been constructed using the derived SFRe likelihood distributions and the measured D4000 values (we assume Gaussian errors on the latter). The contours show the sum of the PDF in each bin. The red line shows the average of the distribution at a given D4000 and the blue line the mode. The dashed lines show the 95% limits at a given D4000. To derive the likelihood distribution of the SFR d of a given galaxy, we convolve the likelihood distribution in Figure 11 with the likelihood distribution of D4000 for that galaxy.
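One way to read the last step is as a marginalisation over D4000: the conditional distribution of log SFR/M* at each D4000 is weighted by the Gaussian likelihood of the measured D4000. The sketch below implements that reading; it is an assumption about what "convolve" means in practice, and the array layout and names are hypothetical.

```python
import numpy as np


def ssfr_likelihood_from_d4000(p_ssfr_given_d, d_grid, d_obs, d_err):
    """Likelihood of log SFR/M* for one galaxy from its measured D4000.

    p_ssfr_given_d : (n_d, n_s) array; row i is P(log SFR/M* | D4000 = d_grid[i]),
                     each row normalised to unit sum
    d_grid         : (n_d,) D4000 values of the rows
    d_obs, d_err   : measured D4000 and its (Gaussian) uncertainty
    """
    d_grid = np.asarray(d_grid, dtype=float)
    table = np.asarray(p_ssfr_given_d, dtype=float)
    # Gaussian weight of each D4000 row given the measurement.
    w = np.exp(-0.5 * ((d_grid - d_obs) / d_err) ** 2)
    w /= w.sum()
    p = w @ table                 # weighted sum over the D4000 rows
    return p / p.sum()
```

The likelihood of the total SFR d then follows by shifting the log SFR/M* grid by the logarithm of the stellar mass.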
To minimise the possibility of biases we did not make a S/N cut when constructing Figure 11, but we did throw out all AGN, C and low S/N AGN. To test the reliability of this procedure, we have checked that if we co-add the spectra of the low S/N galaxies in the SF region of the BPT diagram and run these spectra through our pipeline, we retrieve the same SFRe's (within a couple of percent) that are obtained if we co-add the SFR d 's derived for individual galaxies using the D4000 calibration outlined here.
In what follows we will use SFR d for the AGN, C and Unclassifiable classes and SFRe for the SF and low S/N SF classes as our best estimate of the fibre SFR, and we will refer to this as the SFR.

APERTURE EFFECTS
The SDSS is a fiber based survey. At the median redshift of the survey, the spectra only sample ≈ 1/3 of the total galaxy light. The presence of radial gradients in galaxy properties can therefore lead to substantial uncertainties when one wants to correct to total quantities. The implications of these biases have been discussed in some detail by several authors (e.g. Kochanek et al. 2001;Gómez et al. 2003;Baldry et al. 2002;Pérez-González et al. 2003;Nakamura et al. 2003).
We now discuss our method for aperture correction which uses the resolved colour information available for each galaxy in the SDSS. We will show that by using this empirically based aperture correction we can remove the aperture bias.
Our aperture correction scheme focuses on the likelihood distribution P (SFR/Li|colour), i.e. the likelihood of the specific star formation rate for a given set of colours. It is essential to keep the entire likelihood distribution in order to avoid biases in our estimates, as the measured distributions of SFR/Li for certain colours turn out to be multi-peaked. We choose to normalise to the 0.1 i-band luminosity. This is the reddest band with uniformly small photometric uncertainties. We have verified that the results do not depend significantly on the choice of photometric band.
In theory we could construct likelihoods for SFR/Li using all the colour information available. This has the disadvantage that even with our large sample, many bins have very few galaxies. The method is then sensitive to outliers. It turns out, however, that 0.1 (g − r) and 0.1 (r − i) contain all the useful information necessary for our purposes. Adding other colours does not significantly improve our constraints. This is also true if we use the model magnitudes, which give somewhat higher S/N in the u and z magnitudes.
We therefore construct P (SFR/Li|colour) on a grid with bins of size 0.05 in 0.1 (g−r) and 0.025 in 0.1 (r−i). We add together the likelihood distributions for SFR/Li (inside the fibre) for all galaxies in each bin (we exclude AGN and low S/N AGN; including them with their SFRs estimated from D4000 does not change the results). The result of this procedure is illustrated in Figure 12. This shows P (SFR/Li|colour) for a subset of the full 0.1 (g − r), 0.1 (r − i) grid. The colour is indicated as 0.1 (g − r)/ 0.1 (r − i) in the top left hand corner of each panel and increases to the right for 0.1 (g − r) and upwards for 0.1 (r − i). The number of galaxies in each bin is also indicated in each panel as well as the number of galaxies that have colours in this bin outside the fibre (see below). The thick and thin bars on the x-axes show the average and median SFR/Li for that bin respectively. Because of the non-symmetric nature of these distributions, the average differs noticeably from the median.
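The co-addition onto the colour grid can be sketched as below. This is an illustrative sketch, not the code used for this paper: it assumes that every galaxy's likelihood of log SFR/Li is tabulated on the same K-point grid, and the function and argument names are hypothetical.

```python
import numpy as np


def build_colour_grid(gr, ri, pdfs, gr_edges, ri_edges):
    """Co-add per-galaxy likelihoods of log SFR/L_i in bins of colour.

    gr, ri   : (N,) colours measured inside the fibre
    pdfs     : (N, K) likelihoods of log SFR/L_i on a common grid
    gr_edges : bin edges in (g - r), e.g. spaced by 0.05
    ri_edges : bin edges in (r - i), e.g. spaced by 0.025
    Returns (summed, counts): the co-added likelihood in each colour bin,
    shape (n_gr, n_ri, K), and the number of galaxies per bin.
    """
    gr = np.asarray(gr, dtype=float)
    ri = np.asarray(ri, dtype=float)
    pdfs = np.asarray(pdfs, dtype=float)
    n_gr, n_ri = len(gr_edges) - 1, len(ri_edges) - 1
    i = np.digitize(gr, gr_edges) - 1
    j = np.digitize(ri, ri_edges) - 1
    ok = (i >= 0) & (i < n_gr) & (j >= 0) & (j < n_ri)   # drop galaxies off the grid
    summed = np.zeros((n_gr, n_ri, pdfs.shape[1]))
    counts = np.zeros((n_gr, n_ri), dtype=int)
    # np.add.at accumulates correctly when several galaxies share a bin.
    np.add.at(summed, (i[ok], j[ok]), pdfs[ok])
    np.add.at(counts, (i[ok], j[ok]), 1)
    return summed, counts
```

Dividing `summed` by its sum along the last axis, for bins with enough galaxies, would give the normalised P (SFR/Li|colour) in each bin.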
The shape of the distributions varies strongly with colour and it is often not well represented by a Gaussian. In addition, higher values of SFR/Li often occur at redder 0.1 (r − i) colours. This is because the emission lines around Hα contribute significantly to the 0.1 i-flux during an episode of star formation. The likelihood distribution also becomes very wide for galaxies with 0.1 (g − r) ≳ 0.7, 0.1 (r − i) ≳ 0.5 and their star formation rates are constrained to no better than a factor of 10. This poor constraint on SFR/Li is simply due to the degeneracy between age, metallicity and dust. This region of colour space is populated both by dusty star-forming galaxies and by galaxies with old stellar populations.
We might consider improving our estimates by including the estimated gas-phase metallicity as a constraint. We cannot implement this procedure for the full sample, however, because gas-phase metallicities can only be determined for the SF class. Including the stellar mass as an additional constraint gives little apparent improvement.

Figure 12. The likelihood distribution of log SFR/L i in bins of 0.1 (g − r) and 0.1 (r − i). The 0.1 (g − r) colour increases to the right and is indicated as the first number in the top left corner of each panel. The second number gives the 0.1 (r − i) colour and this increases upwards in the figure. The thick black mark on the x-axis shows the average SFR/L i whereas the thin mark shows the median. Note that as one moves towards redder colours the distributions become highly non-symmetrical and there is a substantial difference between the average and the median. The number of objects contributing to the distribution shown in a given panel is indicated in each panel as N = X. Note also that this is a subset of the full 41 × 45 grid.
We have also verified that k-correction uncertainties do not appreciably affect our aperture correction estimates. We did this by calculating P (SFR/Li|colour) from galaxies with appropriate colours in a bin of width ∆z = 0.01 in redshift around the galaxy in question. This method uses only observed colours but agrees very well with the procedure using k-corrected colours. The disadvantage of the model is that for a subset of the data there are not enough galaxies with the appropriate colours to construct reliable likelihood distributions. We have therefore settled for the simplest solution; we use only 0.1 (g − r), 0.1 (r − i) to constrain SFR/Li.
We calculate the colour of the galaxy outside the fibre by subtracting off the fibre magnitudes from the cmodel (i.e. total) magnitudes and we convolve this with P (SFR/Li|colour) to get an estimate of SFR/Li outside the fibre. We require that at least 5 galaxies contribute to the estimate of P (SFR/Li|colour) in a given bin. For the bluest colours, there are bins where no empirical estimate of P (SFR/Li|colour) exists, as few galaxies in the survey have very blue fibre colours. For these bins we use the closest bin. We have tested that this procedure does not introduce any significant bias into our estimates. For a few galaxies (176) this procedure failed because of photometric problems; these have been excluded from the further analysis. No noticeable differences would be found if they were included with an average aperture correction for their mass.
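The two ingredients of this step, the light outside the fibre and the fall-back to the closest populated bin, can be sketched as follows. This is a minimal sketch under stated assumptions: magnitudes are converted to fluxes before subtraction, "closest" is taken to mean the smallest distance in bin-index units (the text does not specify the metric), and all names are hypothetical.

```python
import numpy as np


def outside_fibre_mag(m_total, m_fibre):
    """Magnitude of the light outside the fibre in one band.

    Returns NaN where the fibre flux is not smaller than the total flux.
    """
    f_out = (10 ** (-0.4 * np.asarray(m_total, dtype=float))
             - 10 ** (-0.4 * np.asarray(m_fibre, dtype=float)))
    with np.errstate(invalid="ignore", divide="ignore"):
        return np.where(f_out > 0, -2.5 * np.log10(f_out), np.nan)


def nearest_valid_bin(i, j, counts, min_count=5):
    """Indices of the closest colour bin holding at least min_count galaxies.

    Assumes at least one such bin exists; distance is measured in
    bin-index units (an assumption of this sketch).
    """
    valid = np.argwhere(np.asarray(counts) >= min_count)
    d2 = (valid[:, 0] - i) ** 2 + (valid[:, 1] - j) ** 2
    return tuple(valid[np.argmin(d2)])
```

The outside-fibre colour in, for example, g − r is then `outside_fibre_mag(g_total, g_fibre) - outside_fibre_mag(r_total, r_fibre)`.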
We would like to emphasise that our method of aperture correction is purely empirical. Our main assumption is that the distribution of SFR/Li for given 0.1 (g − r), 0.1 (r − i) colours is similar inside and outside the fibre. This assumption and the reliability of this procedure should clearly be tested using resolved spectroscopy. We are carrying out such a program and will report on it in an upcoming paper.
For now, the easiest way to test our assumption is to calculate P (SFR/Li|colours) in bins of (z/zmax)^2 , where zmax is the highest redshift at which the galaxy in question would pass the sample selection criteria. This way we can compare P (SFR/Li|colours) derived from galaxies where the fibre only samples the innermost regions with that derived from galaxies where the fibre samples a larger fraction of the galaxy. We have done this in 8 bins in (z/zmax)^2 and compared the resulting predicted log SFR/Li for the sample galaxies in each of these bins with that derived from the total sample. There is a spread, reflecting the intrinsic scatter in the method, but for (z/zmax)^2 ≳ 0.4 individual bins gave results in agreement with the overall distributions. This indicates that the distribution of SFR/Li at a given 0.1 (g − r), 0.1 (r − i) is fairly similar at different radii as long as we sample ≳ 20% of the total r-band light (the median fraction of light within the fibre for galaxies with (z/zmax)^2 0.4). This is the case for the majority of the sample.

Figure 13 shows that our method removes essentially all the aperture bias from the sample. This figure shows the median SFR/M * as a function of z/zmax in six different mass bins. All galaxies have been included. The specific SFR is estimated from the measured value of D4000 for the AGN and C classes (cf. Section 4.1). The likelihood distributions of SFR/M * have been coadded at each bin in z/zmax.

Figure 13. The value of the specific SFR, SFR/M * , as a function of z/zmax. The solid line with dots shows the trend within the fibre, i.e. both M * and SFR calculated within the fibre, and a clear aperture effect is seen. The dashed line shows the result when assuming that all the SFR takes place inside the fibre, i.e. M * is for the galaxy as a whole and the SFR for the fibre only, and the dash-dotted line shows the result from our aperture corrections. Notice that we very nearly take out all aperture biases.
In the absence of aperture effects (and evolution), this plot should show no trend with z/zmax. Note that evolution is unlikely to provide more than 0.1 dex change over the range plotted in these diagrams. The dashed lines in Figure 13 show the trend if one assumes that all star formation activity takes place within the fibre but we use the stellar mass for the whole galaxy. This will clearly underestimate the true SFR/M * . Figure 13 shows that this quantity decreases towards lower values of z/zmax, as expected.
The solid line with dots in the diagram shows SFR/M * when both quantities are calculated inside the fibre and it is clear that there are still strong aperture effects for galaxies with log M * ≳ 10.5. This is expected since these galaxies often have prominent bulges, in which the specific SFR is expected to be low. For lower mass galaxies, the aperture problems are considerably smaller and it is clear that a simple scaling of the fibre SFR by the r-band flux, as done for example by Hopkins et al. (2003), is an acceptable aperture correction. This is also the case for galaxies that show no sign of star formation activity when more than a third of the total r-band light is sampled. It is not correct for other galaxies.
Finally, the dot-dashed lines show the result after applying our aperture corrections. Our correction has removed the aperture bias, at least in a statistical way. We note that the spread in SFR/M * at a given z/zmax is substantial, typically 2 orders of magnitude. We have also tested whether the distribution of points is the same at all z/zmax after application of our aperture corrections. We find that it is, and that Figure 13 is an appropriate summary of the process even though it only shows the median.

Figure 14 (caption fragment). The line with crosses shows the same for the likelihood distributions for the total SFR derived as described in the text. We have grouped together all galaxies with log SFR/M * < −12 into one point and placed this at log SFR/M * = −13. The x-axis shows log SFR/M * , measured inside the fibre for the solid points and total for the crosses. The points for the total SFR/M * are offset slightly for clarity.
At this point, it is useful to summarise the SFR indicators that are used for each subclass. Inside the fibre we have the following situation:
(i) For the SF class we use the results of the full model fits.
(ii) For the Low S/N SF class we use the observed Hα-luminosity and convert this into a SFR using the ηHα distribution functions derived for the SF class (section 3.1).
(iii) For the AGN, Composite and Unclassifiable classes we use the D4000 value to estimate SFR/M * and SFR using the method outlined in Section 4.1.
Outside the fibre, we always use the colour method to estimate the SFR.
The uncertainties on our individual galaxy SFR estimates are summarised in Figure 14. To construct this figure we have calculated the 68% confidence interval for the log SFR estimate for each galaxy and calculated the spread in this value in bins of log SFR/M * (where the latter is taken to be the median of the distribution, but the results are the same with the mode or the average). All galaxies with log SFR/M * < −12 have been placed in one bin at log SFR/M * = −13.
The solid points show the median 68% confidence interval in dex for the estimate of log SFR inside the fibre for each bin in log SFR/M * . In the case of Gaussian errors this would correspond to 2σ. The error bars show the 68% spread in error estimates within that bin. The crosses show the same for the log SFR estimated after aperture corrections (i.e. total) and are slightly offset for clarity.
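For a likelihood distribution tabulated on a grid, a 68% confidence interval of this kind can be obtained from the cumulative distribution. The sketch below uses the central interval between the 16th and 84th percentiles; whether the intervals in Figure 14 were defined in exactly this way is an assumption, and the function name is hypothetical.

```python
import numpy as np


def central_interval(grid, pdf, level=0.68):
    """Central interval containing `level` of the probability.

    grid : (K,) increasing values of log SFR
    pdf  : (K,) likelihood on that grid (need not be normalised)
    Returns (lower, upper, width); for a Gaussian the width of the 68%
    interval is close to 2 sigma.
    """
    grid = np.asarray(grid, dtype=float)
    cdf = np.cumsum(np.asarray(pdf, dtype=float))
    cdf /= cdf[-1]
    lo, hi = np.interp([(1 - level) / 2, (1 + level) / 2], cdf, grid)
    return lo, hi, hi - lo
```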
Most of the trends are easy to understand. Firstly the SFR estimates have lowest uncertainty at high SFR/M * because the S/N is high in most lines and tight constraints can be obtained. For low values of SFR/M * , the SFR inside the fibre is estimated from the D4000 value. This causes the error estimates to rise because D4000 provides a poorer constraint on SFR/M * . Finally, the error decreases when the D4000 is so large that there is no longer any room for significant star formation.
The uncertainties on the total log SFRs are larger than for log SFR measured inside the fibre at high SFR/M * because the aperture corrections are significantly more uncertain than the SFR estimates from the spectra. The spread in the uncertainty estimates is also increasing for increasing SFR/M * because the outer parts of these galaxies are very blue and few galaxies are available for the calibration of the aperture corrections at these very blue colours (cf. Figure 12). The median uncertainties on the total SFR/M * are moderate, however, since the aperture corrections are on average well constrained as Figure 12 shows. At low values of SFR/M * the total and fibre estimates have comparable uncertainties 7 . This is because the aperture corrections for these galaxies are calibrated by the fibre estimates with similar and broad likelihood distributions. The total log SFR therefore gets a very similar likelihood distribution. Some galaxies with poorly constrained log SFR in the fibre might nevertheless get better constrained log SFR total because they have bluer outsides than insides and hence better constrained aperture corrections.
The best test of our error estimates for the fibre SFRs is repeat observations. Approximately 5000 of our targets have been observed at least twice during the SDSS survey. We have used these duplicate observations to check that the errors we derive for the fiber SFRs are consistent with the dispersion between repeat observations.

THE LOCAL SFR DENSITY
We now turn to estimating the total SFR density of the local universe. As discussed above, we have a likelihood distribution for the SFR for each galaxy. This includes both uncertainties in the model fit and the aperture correction. We also need to make a correction for evolution in the SFR over the 2.5 Gyr spanned by our redshift limits. We do this by assuming that the SFRs of all galaxies evolve as:

$\mathrm{SFR}(z) \propto (1+z)^{\beta}$.

For the integrated SFR density, the favoured evolution exponent up to z ∼ 1 is currently β ∼ 3, although substantial uncertainty exists. We will therefore adopt this form for individual galaxies as well. We will calculate our results at z = 0.1. The differences between β = 0 and β = 3 are then fairly small (∼ 6%, see Table 3 below). Note that a single evolutionary correction for individual galaxies is likely to be wrong. As we will see below, there are clear differences in the star formation histories of galaxies of different mass, so although the integrated star formation density evolves as a power-law, the individual SFRs cannot show the same evolution. Fortunately, most of the results in the following sections are not dependent on the evolution within the sample.
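Assuming a power-law evolution in (1 + z) with exponent β, as discussed above, the correction of an individual SFR to the reference redshift z = 0.1 reduces to a single scaling factor. The sketch below illustrates this; the function name and argument names are assumptions made for illustration.

```python
def sfr_at_reference_z(sfr, z, beta=3.0, z_ref=0.1):
    """Scale a SFR measured at redshift z to z_ref, assuming SFR ∝ (1+z)**beta."""
    return sfr * ((1.0 + z_ref) / (1.0 + z)) ** beta
```

With β = 0 the SFR is left unchanged, while with β = 3 a galaxy observed at z = 0.2 has its SFR scaled by (1.1/1.2)^3 when referred to z = 0.1.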
Our results for the total SFR density are summarised in Table 3. This table gives the mode of the likelihood distribution of ρSFR as well as the 68% confidence intervals. We show the result for β = 0 on the top line and the following lines all assume β = 3; the relative contributions of the different classes do not depend on β. To estimate the star formation density of the survey, we add up the likelihood distributions, using 1/Vmax as a weight (Felten 1976). As discussed in Appendix A, the summation is carried out with a Monte Carlo method. To calculate errors on the final SFR [...]

Table 3. The best estimate of the star formation density at z = 0.1 from our survey. We show the mode and the 68% confidence intervals as well as the percentage contribution to the total for the different classes. The total for β = 0 (no evolution) is shown on the first row, the others use β = 3.

It is, of course, difficult to estimate accurately the systematic uncertainties. The first is the uncertainty in the estimator used to calculate the SFR inside the fibre for the Composite, AGN and Unclassifiable class. We have experimented with various estimators before settling on D4000. They all agree fairly well, with a spread of 2% which has to be taken as a systematic uncertainty.
Our SFR density estimate is sensitive to our aperture correction scheme. We remarked above that our main assumption is that the distribution of log SFR/Li is the same at a given colour inside the fibre as outside. This is not necessarily so, something which is hinted at in the clear difference between the mode, median and average for the likelihood distributions. These indicate that the likelihood distribution for a given colour might be composed of at least two components and the balance between these could be different in the outer regions of the galaxy, cf. the double-peaked nature of some distributions in Figure 12. It is therefore possible to take the difference between the average (the formally correct estimate to use) and the mode as indicating the importance of this assumption. We have therefore carried out the aperture corrections also using the mode as an estimator of log SFR/Li. In this case we find a SFR density 15% lower than that in Table 3.
In addition there is a 6% difference between β = 0 and β = 3, which we take as an additional systematic uncertainty of ±3%. Finally, differences between the linear interpolation scheme we use to draw numbers from the likelihood distribution and higher order schemes also produce a scatter of 1-2% in the calculation of the average. We will therefore adopt $^{+6}_{-21}$% as indicative of the expected range of the systematic uncertainties.
This gives our best estimate for the total SFR density at z = 0.1 as $\rho_{\rm SFR} = 1.915^{+0.02}_{-0.01}\,(\mathrm{rand.})\,^{+0.14}_{-0.42}\,(\mathrm{sys.})$, in units of $10^{-2}\,h_{70}\,M_\odot\,\mathrm{yr}^{-1}\,\mathrm{Mpc}^{-3}$. The random errors correspond to the 68% confidence interval, while the larger systematic error is the expected total range. We have here taken the average of the β = 3 and β = 0 estimates since this difference is included in the systematic uncertainties.
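The 1/Vmax-weighted Monte Carlo summation with bootstrap resampling used for the SFR density can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure of Appendix A: each galaxy's likelihood distribution is represented by a discrete grid (hypothetical layout), values are drawn directly from that grid rather than by linear interpolation, and a central 68% interval is returned rather than the mode. The default repetition counts follow the 100 bootstrap repetitions and 30 Monte Carlo sums quoted in the text.

```python
import random


def draw_sfr(grid, pdf, rng):
    """Draw one SFR value from a discretised likelihood distribution."""
    return rng.choices(grid, weights=pdf, k=1)[0]


def sfr_density_samples(galaxies, n_boot=100, n_mc=30, seed=0):
    """Return realisations of rho_SFR = sum_i SFR_i / Vmax_i.

    galaxies: list of (sfr_grid, sfr_pdf, vmax) tuples; this layout is
    an assumption made for the sketch.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_boot):
        # Bootstrap step: resample the galaxy list with replacement.
        resampled = rng.choices(galaxies, k=len(galaxies))
        for _ in range(n_mc):
            # Monte Carlo step: one SFR draw per galaxy, weighted by 1/Vmax.
            total = sum(draw_sfr(g, p, rng) / vmax for g, p, vmax in resampled)
            samples.append(total)
    return samples


def central_interval(samples, frac=0.68):
    """Central interval containing `frac` of the realisations."""
    s = sorted(samples)
    lo = s[int(round((0.5 - frac / 2.0) * (len(s) - 1)))]
    hi = s[int(round((0.5 + frac / 2.0) * (len(s) - 1)))]
    return lo, hi


if __name__ == "__main__":
    # Toy sample: SFR grids in Msun/yr, Vmax in Mpc^3 (made-up numbers).
    toy = [
        ([0.5, 1.0, 2.0], [0.2, 0.6, 0.2], 1.0e4),
        ([5.0, 10.0], [0.5, 0.5], 1.0e6),
        ([0.05, 0.1], [0.3, 0.7], 2.0e3),
    ]
    print(central_interval(sfr_density_samples(toy)))
```

Drawing from the full likelihood distribution of each galaxy, rather than summing point estimates, is what propagates the per-galaxy uncertainties into the final density.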
We have calculated the SFR density in bins of absolute $^{0.1}r$-band magnitude and this is plotted as solid circles in Figure 15. For comparison, we plot the r-band luminosity density for our sample (see Blanton et al. 2003b, for a more in-depth discussion) scaled to the same peak value as the SFR density (crosses). If we fit both distributions with a Schechter (1976) function of the form
$$\phi(L)\,dL = \phi_* \left(\frac{L}{L_*}\right)^{\alpha_{FS}} e^{-L/L_*}\,\frac{dL}{L_*}$$
we find that the faint-end slopes, αFS, are the same for the two functions, but that $L_*$ is 0.27 magnitudes fainter for the star formation density. The error bars on the SFR density are derived from bootstrap resampling of the sample and do not include systematic errors. These will to good approximation only affect the overall normalisation, and not the shape. The curves are very similar in shape. At faint magnitudes, where the majority of galaxies are strongly star-forming, the r-band light is a good tracer of the SFR activity, but low luminosity galaxies are somewhat more important for the overall SFR density than for the r-band luminosity density. At the bright end of the distribution, there is less star formation per unit r-band luminosity. We will see later that this is even more prominent when plotted against the stellar mass.
Figure 15. The star formation density as a function of r-band absolute magnitude compared with the r-band luminosity density scaled to have the same peak value. The error bars on the SFR density are derived from bootstrap resampling. The error bars on the luminosity density estimate are suppressed for clarity.
Very little of the total SFR density occurs outside our sample (see also [...], their Figure 10). When integrating up the Schechter function, we find that only 1-2% of the total SFR density originates outside our selection limits. We have therefore decided not to correct our values for incompleteness.
Figure 16. [...] Gallego et al. (1995), S00: Sullivan et al. (2000), Se02: Serjeant et al. (2002), SNe: SFR derived from SNe are as reported by Fukugita & Kawasaki (2003) who are also responsible for the upper limits from neutrino observations at SuperKamiokande, G03: [...] for whom we show the full range of the Hα-derived local SFR. The dotted line shows a $(1 + z)^3$ evolution of ρSFR for comparison. All values have been corrected to the Kroupa (2001) universal IMF and h = 0.7.
Figure 16 places our estimate together with other recent estimates from the literature. In order to make this comparison we have adjusted the SFR densities reported using the Salpeter (1955) IMF with a lower mass limit of 0.1 M⊙ and an upper mass limit of 100 M⊙ to our IMF by dividing all values by 1.515. All values have been corrected to h = 0.7, but no attempt has been made to adjust the values for small differences in the assumed cosmological model.
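The conversion of literature values to a common IMF and Hubble parameter can be written out as a short sketch. The factor 1.515 is the one quoted above; the linear scaling with h is an assumption of this sketch, chosen to be consistent with the h70 factor in the units of the quoted SFR density, and the function name is hypothetical.

```python
SALPETER_TO_KROUPA = 1.515  # divisor quoted in the text


def to_kroupa_h70(rho_sfr, imf="salpeter", h=0.7):
    """Convert an SFR density to the Kroupa IMF and h = 0.7.

    rho_sfr : SFR density as published, computed for Hubble parameter h
              (in units of 100 km/s/Mpc).
    imf     : "salpeter" (0.1-100 Msun) or "kroupa".
    Assumption of this sketch: rho_SFR scales linearly with h.
    """
    if imf == "salpeter":
        rho_sfr = rho_sfr / SALPETER_TO_KROUPA
    elif imf != "kroupa":
        raise ValueError("unsupported IMF: %r" % imf)
    return rho_sfr * (0.7 / h)


if __name__ == "__main__":
    # Hypothetical literature value: 3.0 (arbitrary units), Salpeter IMF, h = 0.7.
    print(to_kroupa_h70(3.0, imf="salpeter", h=0.7))
```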
We have indicated different methods of estimating the SFR using different colours in the figure. It is clear that radio determinations (Haarsma et al. 2000; Condon et al. 2002; Serjeant et al. 2002) typically give the highest estimates, although alternative calibrations might bring these down somewhat.
Our value seems to be in good agreement with other Hα-based estimates (Gallego et al. 1995; Tresse & Maddox 1998). In particular our value is in very good agreement with that derived for the SDSS by [...] in their analysis of the cosmic spectrum. We show the full range of their Hα-based estimate for the local SFR in Figure 16 and it is clear we are in very good agreement despite the different approaches taken. It might seem surprising that our error bar is no smaller than many other determinations, which have used samples that are two orders of magnitude smaller than ours. This is mostly due to the adoption of a rather conservative systematic error for the aperture corrections. Finally, to close off this section, we comment on the contributions to the overall ρSFR from the different classes in Table 3. It is noteworthy that despite only contributing about 23% of the total stellar mass in the local universe, the SF class is the dominant class in terms of SFR density. Over 50% of ρSFR comes from this class. Likewise it is interesting that up to 15% of the total SFR takes place in galaxies that show signs of AGN activity (cf. Kauffmann et al. 2003c).
Figure 17. The relationship between the stellar mass and the SFR (both inside the fibre) for all galaxies with no AGN contribution. The figure has been volume weighted and normalised in bins of stellar mass. The contours are therefore showing the conditional likelihood of SFR given a stellar mass. The bin size is 0.1 × 0.1 in the units given in the plot. The red line shows the average at a given stellar mass, whereas the blue line shows the mode of the distribution. The dashed lines show the limits containing 95% of the galaxies at a given stellar mass.

THE OVERALL PROPERTIES OF LOCAL STAR FORMING GALAXIES
We now study the dependence of the SFRs on other physical parameters of the galaxies. Most of the results we show below will be familiar, as the general trends have been known for a long time.
The main importance of what follows is that for the first time it is possible to derive the full distribution functions for these quantities. This adds considerable quantitative information to what was known before. One important point is that the results shown in this section are only very weakly dependent on the systematic uncertainties discussed previously. We will therefore ignore these in what follows. We start with Figure 17, which shows the SFR distribution as a function of stellar mass for the SF, low S/N SF and unclassifiable classes. The figure has been volume weighted and normalised in bins of stellar mass so it shows the conditional probability of SFR given a stellar mass. The clear correlation between SFR and stellar mass over a significant range in log M* is noticeable and will be a recurring theme in what follows. Note that at log M*/M⊙ ≳ 10, the distribution of SFRs broadens significantly and the correlation between stellar mass and SFR breaks down.
In addition to the correlations between different galaxy parameters, it is instructive to study 1-dimensional projections of our multi-dimensional distribution functions. This enables us to address the question of how much of the total SFR density takes place in different kinds of galaxies. To do this we carry out the same Monte Carlo summation that we used to derive the total SFR density, but this time in bins of mass, concentration, size, surface density etc. The results of this exercise are shown in Figures 18 and 19. These show the star formation density per bin for a set of 8 different physical quantities. The bin size is indicated in each panel. The shaded gray area shows the 68% (1σ) confidence interval determined from the Monte Carlo and bootstrap summations.
The top left panel in Figure 18 shows ρSFR as a function of the central concentration of the galaxies, defined as the ratio of the radius containing 90% of the r-band light to that containing 50%. As discussed by Shimasaku et al. (2001) and Strateva et al. (2001), this quantity correlates quite well with morphological type for the galaxies in the SDSS. It is related to other similar quantitative measures of galaxy concentration discussed previously (e.g. Morgan 1958; Doi et al. 1993; Abraham et al. 1994). The plot shows that the ρSFR distribution is broad and that it peaks around the value for a pure disk, R90/R50 = 2.33. The SFR density contributions split into individual galaxy sub-classes are shown by thin lines as described in the legend. We see that the SF class contributes primarily to the disk-like part of the distribution. The star formation in AGN occurs in galaxies that are more bulge-like. The top right panel shows ρSFR as a function of galaxy stellar mass. We can observe several interesting features here:
• A considerable amount of star formation takes place in low mass galaxies. The best-fit Schechter function has a faint-end slope of −0.45 and a characteristic mass of log M/M⊙ = 10.95 with a possible transition to a steeper slope of αFS ≈ −0.55 at log M/M⊙ < 9.8. Approximately 50% of the total SFR takes place at M* > 2 × $10^{10}$ M⊙ and about 90% at M* > $10^{9}$ M⊙.
• The SF class almost completely dominates the SFR budget at masses less than about $10^{10}$ M⊙.
• At masses > $10^{10}$ M⊙ the majority of the SFR takes place in galaxies of the low S/N SF class, with important contributions estimated from galaxies that either cannot be classified or which show signs of AGN in their fibre spectra.
Many of the same trends can also be seen in the bottom left panel which shows ρSFR as a function of the stellar surface mass density (Kauffmann et al. 2003b). Taken together with the top left hand panel, it is clear that the majority of the SFR comes from high-surface brightness spirals. These typically have half-light radii around 3.0 kpc (slightly smaller than the Milky Way).
Turning to Figure 19 we now plot quantities that are determined inside the fibre aperture on the x-axis. We caution that since we only know these quantities within the fibre aperture, they may not be representative of the galaxy as a whole. The top left-hand panel shows that most of the star formation takes place in galaxies with low D4000. 12% of the total SFR density comes from galaxies with D4000 > 1.8, 2% with D4000 > 2.0. Note that the star formation in these galaxies comes entirely from their outer parts and they are probably spiral systems with significant bulges. The bottom left panel shows that the majority of star formation is taking place in galaxies that have a present-to-past average star formation rate between 0.3 and 0.5.
Figure 19. Like the previous figure but this time we show parameters determined inside the fibre along the x-axis and the distribution functions are therefore relative to the SFR density inside the fibre. The top left shows the distribution with respect to the 4000Å break, the gas-phase oxygen abundance is at the top right, the present-to-past average star formation rate (bottom left) and finally the dust attenuation at Hα in the bottom right panel.
The top right panel shows the distribution as a function of gas-phase oxygen abundance. Since the metallicity is derived from the fits to the CL01 models, we only show results for the SF class. It is interesting that most stars are forming in galaxies with solar or slightly super-solar metallicities, depending on the overall calibration of the metallicity scale (e.g. Kennicutt et al. 2003). Finally the bottom right panel shows the star formation density as a function of the attenuation at Hα (again, only for the SF class). This shows a peak around 1.0 magnitudes of extinction in Hα, in good agreement with other determinations (e.g. Kennicutt 1983; Sullivan et al. 2000; Hopkins et al. 2003; Nakamura et al. 2003, C02). Notice that we calculate the SFR-weighted distribution function of AHα, whereas most earlier studies have focused on a sample-averaged value. It is clear from Figure 6 that any such averaging will depend strongly on the sample used. This has also been pointed out by Hopkins et al. (2003), who show that a radio-selected subsample of the SDSS has considerably higher attenuation than the average SDSS galaxy.
We now compare these SFR distributions with number and mass density distributions. Number distributions as function of many photometric properties and as a function of local density have been discussed in detail by Blanton et al. (2003a) and by Hogg et al. (2003). Likewise the mass distributions have been discussed by Kauffmann et al. (2003a,b). Figures 20 and 21 show the distribution of ρSFR together with ρM*, the mass density distribution, and ρN, the number density distribution. In these plots we have normalised the distributions to the total within the plot limits. We remark that the spikes seen in several of the number density distributions are due to a handful of galaxies that are extremely close to our selection limits with very high volume corrections. The bootstrap errors on these spikes are large. Figure 20 shows that although high concentration galaxies are few in number, they contain the majority of the mass (see the discussion by Kauffmann et al. 2003a). As discussed previously, most of the star formation occurs in disk-like systems. The top right hand panel shows that the majority of galaxies are small (cf. Blanton et al. 2003a) and that the galaxies dominating the star formation budget and those dominating the mass budget are fairly similar in size. The bottom left hand panel in the figure shows the dependence on mass. We see that star formation is typically occurring in galaxies that are somewhat less massive than those that contain the bulk of the stellar mass in the local Universe. As we noted previously, most of the star formation is taking place in high surface brightness galaxies. However, the bottom right panel shows that the star formation is distributed over a wider range in stellar surface density than the stellar mass. This is presumably because a significant fraction of the stellar mass is located in ellipticals, which are not currently forming stars.
Figure 20. The contribution to the total number, mass and star formation density as a function of various galaxy parameters. The SF density is shown in blue, the mass density in red and the number density in black. Top left: The contribution to the different densities as a function of the concentration of the galaxies. Top right: The same, but as a function of the half-light radii of the galaxies. Lower left: The density contributions as a function of log stellar mass. Lower right: The contributions as a function of log of the stellar surface density in M⊙/kpc$^2$.
Figure 21 is analogous to Figure 19 and shows properties measured within the fibre aperture. As discussed previously, the majority of the star formation is occurring in galaxies with low D4000. It is nevertheless intriguing that the majority of galaxies in the universe have even lower D4000 than those that dominate the SFR density. A very similar result is illustrated in the top right-hand panel, which contrasts the contributions to ρN, ρSFR and ρM* from galaxies with different present-to-past average star formation rates.

The bottom panels show the density distributions as a function of the attenuation at Hα (left) and as a function of gas phase oxygen abundance (right). These quantities are only calculated for the SF class. We see that the majority of SF galaxies have low metallicity and dust attenuation, two facts that are likely to be closely connected. Most of the mass and most of the star formation are found in galaxies with a wide distribution of AHα, centred around 1.0 magnitudes with a spread of ∼ 0.7 magnitudes for the SFR weighted distribution (see also the related study by Hopkins et al. 2003). For the AHα distribution weighted by stellar mass, we find an average value of ∼ 1.3 magnitudes and a similar spread. The difference reflects the fact that the most massive galaxies are more dusty than the galaxies dominating the SFR budget.

THE SPECIFIC STAR FORMATION RATE
The total SFR of a galaxy is interesting and for certain physical questions it is the relevant quantity. But given the strong correlation between SFR and mass (cf. Figure 17), it is clear that by normalising the SFR by the stellar mass, one can more easily study the relationship between star formation activity and the physical parameters of the galaxies. In this section we will turn our attention towards the star formation rate per unit mass, the specific star formation rate. For compactness we will refer to the specific SFR as rSFR below.
The specific star formation rate is closely related to several other important physical quantities. It defines a characteristic timescale of star formation, $r_{\rm SFR}^{-1} = M_*/\mathrm{SFR}$. Since the supernova rate is proportional to the star formation rate, the specific star formation rate represents the current input of supernova energy per star in the galaxy. Likewise, if the stellar mass is proportional to the mass of the ISM, it is related to the porosity of the ISM in the galaxy (Clarke & Oey 2002). Finally rSFR does not depend on cosmology and is insensitive to the IMF as long as the high mass slopes are comparable. The star formation per unit mass has either explicitly or implicitly been used in numerous studies of field galaxies at z < 1 (e.g. Cowie et al. 1996; Guzman et al. 1997; Ellis 1997, and references therein) by using equivalent widths of Hα or [O II]3727. In the local Universe, it has however often been rephrased in terms of the present to past-average star formation rate (e.g. Kennicutt 1983; Scalo 1986; Kennicutt et al. 1994; Boselli et al. 2001; Gavazzi et al. 2002). This is a suggestive quantity, which immediately gives an indication of the past star formation history of the galaxy and its relation to present-day activity. The main disadvantage of the present-to-past average SFR is that it involves more assumptions than the specific star formation rate, as we will see below.
The conversion between the two can be succinctly summarised as
$$b = \frac{\mathrm{SFR}}{\langle \mathrm{SFR} \rangle} = r_{\rm SFR}\,T\,(1 - R), \qquad (9)$$
where R is the fraction of the total stellar mass initially formed that is returned to the ISM over the lifetime of the galaxy, and T is the time over which the galaxy has formed stars ($T = t_H(z) - t_{\rm form}$, where $t_H(z)$ is the Hubble time at redshift z and $t_{\rm form}$ is the time of formation). Given a choice of IMF, it is possible to reliably estimate R using stellar evolution theory. We use the median R estimated for our galaxies, which is R ∼ 0.5. For T it is customary to take $T = t_H$, ie. to assume that all galaxies started to form stars at t = 0. We will adopt this convention below (with $T = t_H(0.1)$). This assumption is sensible when comparing average b values for large samples, but might be more questionable for individual galaxies. In general, this means that the b values we calculate will be upper limits.
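A minimal numerical sketch of this conversion follows. It assumes a flat ΛCDM cosmology with Ωm = 0.3 and h = 0.7 to evaluate the Hubble time at z = 0.1; the value of Ωm is an assumption of this sketch and is not stated in this section. R = 0.5 is the median returned fraction quoted above, and the function names are hypothetical.

```python
import math


def hubble_time_gyr(z, h=0.7, omega_m=0.3):
    """Age of a flat Lambda-CDM universe at redshift z, in Gyr.

    Uses the closed-form age for a flat matter + Lambda model.
    omega_m = 0.3 is an assumption of this sketch.
    """
    omega_l = 1.0 - omega_m
    inv_h0_gyr = 977.8 / (100.0 * h)  # 1/H0 in Gyr for H0 in km/s/Mpc
    x = math.sqrt(omega_l / omega_m) * (1.0 + z) ** -1.5
    return (2.0 / 3.0) * inv_h0_gyr / math.sqrt(omega_l) * math.asinh(x)


def birthrate_parameter(r_sfr_per_yr, z=0.1, returned_fraction=0.5):
    """b = r_SFR * T * (1 - R), with T taken as the Hubble time at z."""
    t_yr = hubble_time_gyr(z) * 1.0e9
    return r_sfr_per_yr * t_yr * (1.0 - returned_fraction)


if __name__ == "__main__":
    # Hypothetical galaxy with a specific SFR of 1e-10 per year.
    print(hubble_time_gyr(0.1), birthrate_parameter(1.0e-10))
```

Because T is set to the full Hubble time, a galaxy that began forming stars later would have a smaller true T, which is why the b values obtained this way are upper limits.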
Throughout this discussion, we will use b and rSFR interchangeably, depending on which is more appropriate. As highlighted by Rocha-Pinto et al. (2000), the b value is a convenient parameterisation of "burstiness" and b = 1 provides a convenient divide between galaxies consistent with a constant or declining SFR over the age of the universe and those that have a higher SFR at present than they have typically had in the past. Before turning to the results for our sample, we note that there are at least two reasonable definitions for the average rSFR and one must take care to specify which one is being discussed.
The average rSFR value of a sample of N galaxies is simply
$$r^{g}_{\rm SFR} = \frac{\sum_{i=1}^{N} w_i\,(\mathrm{SFR}/M_*)_i}{\sum_{i=1}^{N} w_i}, \qquad (10)$$
where the weights $w_i$ can be unity or e.g. 1/Vmax depending on the question asked. This is the appropriate quantity to quote when talking about the typical galaxy. In practice we calculate this using the full likelihood distributions of rSFR for each galaxy. We also usually choose to quote the median rather than the average as this is insensitive to binning (cf. discussion in Appendix A). $r^{g}_{\rm SFR}$ can be very different from the rSFR value for a volume limited sample:
$$r^{V}_{\rm SFR} = \frac{\rho_{\rm SFR}}{\rho_{M_*}} = \frac{\sum_i \mathrm{SFR}_i/V_{\max,i}}{\sum_i M_{*,i}/V_{\max,i}},$$
which is the appropriate quantity to quote for the star formation history of the universe as a whole. To calculate $r^{V}_{\rm SFR}$ using our likelihood distributions, we employ the same Monte Carlo summation technique used in the calculation of ρSFR in section 6. It is important to include the stellar mass in this Monte Carlo code, because, as mentioned in Appendix A, neglecting to do so will bias our results. (To investigate the size of this bias we carried out the calculation of $r^{V}_{\rm SFR}$ with and without the full likelihood distribution of log M* and found that ignoring the uncertainty in log M* resulted in an overestimate of rSFR of about 15%.)
We begin by calculating $b^{V}$ for the sample as a whole. This is a straightforward application of the results in the previous section and we show this quantity in Table 4. We see that for h = 0.7, the present day universe is forming stars at slightly higher than 1/3 of its past average rate, depending on the time-span one averages over in equation (9). Once again we have calculated the likelihood distribution using the Monte Carlo method discussed in Appendix A with 100 bootstrap repetitions and 30 Monte Carlo sums for each repetition. The resulting errors are also indicated in Table 4, but as with the total SFR density in section 6 the random errors are substantially smaller than the systematic ones.
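The difference between the two averages can be made concrete with a small sketch. It uses point estimates for SFR and M* rather than the full likelihood distributions, and a weighted mean rather than the median, so it illustrates only the definitions; the function names and the toy galaxies are hypothetical.

```python
def galaxy_averaged_rsfr(sfr, mstar, weights):
    """Weighted mean of the per-galaxy specific SFR (the 'typical galaxy')."""
    num = sum(w * s / m for s, m, w in zip(sfr, mstar, weights))
    return num / sum(weights)


def volume_averaged_rsfr(sfr, mstar, weights):
    """Weighted total SFR divided by weighted total stellar mass."""
    total_sfr = sum(w * s for s, w in zip(sfr, weights))
    total_mass = sum(w * m for m, w in zip(mstar, weights))
    return total_sfr / total_mass


if __name__ == "__main__":
    # Toy sample: one actively star-forming dwarf and one quiescent giant.
    sfr = [0.1, 1.0]        # Msun/yr
    mstar = [1.0e8, 1.0e11]  # Msun
    weights = [1.0, 1.0]     # would be 1/Vmax for a volume-weighted estimate
    print(galaxy_averaged_rsfr(sfr, mstar, weights))
    print(volume_averaged_rsfr(sfr, mstar, weights))
```

In this toy case the galaxy-averaged value is driven by the dwarf, while the volume-averaged value is set almost entirely by the massive galaxy, which holds nearly all the stellar mass.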
Following the discussion in section 6, we adopt the same estimates of systematic errors on the total SFRs and find a present-to-past average SFR for the local universe of
$$b^{V} = 0.408^{+0.005}_{-0.002}\,(\mathrm{rand.})\,^{+0.029}_{-0.090}\,(\mathrm{sys.})\; h_{70}^{-1},$$
with random errors corresponding to the 68% confidence interval. It is interesting to compare $b^{V}$ for the different classes we have defined. If we look at quantities determined inside the fibre, it follows almost by definition that the SF class has the highest $b^{V}$ value and the unclassifiable category the lowest. If we look at the values derived from the total SFR, the SF class still dominates, but it is interesting to note that the global b value for this class is the same as the central one. For the other classes one typically finds that the global b is higher than that inside the fibre. This is, of course, partly a sample selection issue: galaxies classed as SF are those that have relatively more star formation inside the fibre. It is also interesting to note that galaxies showing indications of AGN activity form an intermediate category between the Composites and the unclassifiable. This is again logical because AGN would have been classified as composites if they had substantially higher SFR.
We now study how the specific star formation rate varies between galaxies of different stellar mass. We split the galaxy sample into nine bins in total stellar mass. We sum up the likelihood distributions of rSFR, ie. we calculate r g SFR without the normalisation in Equation (10). We show the resulting distributions of r g SFR in Figure 22. The histograms are normalised to the total number of galaxies within that bin. No volume correction has been applied, but the difference to that shown is very small except in the lowest mass bins. (Note that we might not be complete in mass in the lowest bin.) The contribution to the distributions that is located outside the plotted range has been distributed on the two lowest and two highest bins respectively.
The histogram for galaxies with 10.0 < log M* < 10.5 is repeated in each panel for reference. The shaded histograms show the distributions of those galaxies classified as recent starbursts based on their D4000 and HδA values by Kauffmann et al. (2003a). The thick black line shows the median volume weighted rSFR value ($r^{g}_{\rm SFR}$), and the red line shows $r^{V}_{\rm SFR}$. The latter is higher as the total SFR is dominated by galaxies with extreme values of rSFR. Figure 22 shows that there is a reasonably good agreement between the recent burst estimates of Kauffmann et al. (2003a), which are sensitive to activity in the last Gyr, and the activity indicated by emission lines. Thus galaxies that are currently undergoing elevated star formation activity (relative to the Hubble time) have typically done so for at least a Gyr. It is also clear that the fraction of galaxies that show clear signatures of recent starbursts declines rapidly with mass. This can either be due to a lower gas fraction in more massive galaxies and/or the fact that periods of elevated star formation may be shorter in duration in more massive galaxies. Figure 22 also shows that the peak of the distribution shifts rather little over a large range in mass (log M* ≲ 10.5). The peak value is very close to b = 1. In other words, most galaxies with log M* ≲ 10.5 are forming stars at a rate that is consistent with their past average (ie. at a constant rate). Above a few × $10^{10}$ M⊙, we see a clear change in the shape of the distribution of $b^{g}$ with a transition towards low specific SFRs. This coincides well with the transition mass defined by Kauffmann et al. (2003a). Also note that the second peak showing up at rSFR ∼ $10^{-12}$ at high mass is broader than the one at high rSFR because the SFRs are less well constrained there.
Figure 22. The observed specific star formation as a function of total stellar mass in the mass ranges indicated in the top left hand corner of each panel. Shaded areas show recent starbursts, see text for details. Each histogram is normalised to the total number of galaxies in the mass range, given as N in each panel. The dashed histogram is the distribution in the log M* ∈ (10.0, 10.5] bin and is included for reference. The thick black bar shows the average b value, $b^{g}$, in each mass range and the thick red bar shows $b^{V}$. The latter is not shown in the first panel since we might not be statistically complete there. All objects with log SFR/M* ≲ −12 have been placed as a Gaussian around log SFR/M* = −12 as discussed in the text.
Despite this near-constancy of the peak, there is a marked change in the shape of the tails. At low masses there is a substantial tail towards higher b-values, ie. a trend towards more bursty star formation activity. This changes to a substantial tail towards low b-values at high masses. Many massive galaxies are now forming stars at a depressed rate compared to their past average.

THE BIRTHRATE PARAMETER AS A FUNCTION OF PHYSICAL PARAMETERS
Section 7 focused on the distribution of the SFR density as a function of different physical parameters. We will now study the relation between the specific SFR and these same quantities. This will help us understand which factors control/are controlled by the star formation activity in individual galaxies.
Most previous studies have focused on the trend with mass and Hubble type. A key question has been which of these parameters correlates better with b. The studies of local spirals by Gavazzi & Scodeggio (1996) and Boselli et al. (2001) indicated that the driving variable was the stellar mass content. Other studies have made more of the correlation with Hubble type (e.g. Kennicutt et al. 1994). Here we will improve on these studies with a sample that is several orders of magnitude larger than the previous studies and with more careful and internally consistent modelling. We will not discuss the dependence on local density here. Recent studies of the correlation between star formation and density can be found in Gómez et al. (2003) and Balogh et al. (1998, 2001).
Throughout this analysis, we will show results using $b^{g}$ and $b^{V}$ derived from our total SFR estimates. In Figure 23, we show the trend of b with central concentration/Hubble type. The left hand panel in Figure 23 contrasts $b^{g}$ (solid line) with $b^{V}$ (crosses). The solid line shows the median of the unweighted $b^{g}$ calculated from the right panel using the formalism discussed in Appendix A. We do not show the confidence intervals, but as is evident from the full likelihood distributions, they are wide. The crosses show $b^{V}$ in each bin in R90/R50. This is calculated using Monte Carlo summation and is basically the ratio of the black and red curves in Figure 20. In general $b^{V}$ is larger than $b^{g}$ because rSFR has substantial tails towards large values. $b^{V}$ is therefore dominated by galaxies away from the median.
Figure 23. [...] The right hand panel shows the (log of the) observed likelihood distribution of rSFR with respect to the concentration parameter, R90/R50, calculated as described in the text. The shading shows the conditional likelihood distribution (volume corrected) given a value for R90/R50. The contributions to the likelihood distributions below the plotted range have been put in the two lowest bins in rSFR.
The right panel of Figure 23 shows the conditional likelihood distribution of rSFR given the concentration parameter, C = R90/R50, for the local universe. This has been constructed by assuming that P(C) is independent of P(rSFR) and assuming a 5% error on C (this latter assumption is not important for the results). Each distribution has been weighted by 1/Vmax. The distribution in the right panel then follows from a straight addition of all the individual P(C, rSFR) and normalisation of the distribution in each bin of C to produce the conditional likelihood distribution. As in Figure 22, contributions to the likelihood distributions outside the plotted range in rSFR have been placed in the two lowest bins in rSFR.
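The construction of such a conditional distribution can be sketched as a 1/Vmax-weighted two-dimensional histogram that is normalised separately in each bin of C. This is a simplified illustration: it uses point values of C and log rSFR instead of the full P(C, rSFR) of each galaxy, and it clamps out-of-range values into the edge bins rather than spreading them over two bins. The function name and the toy inputs are hypothetical.

```python
import bisect


def conditional_distribution(c_vals, log_r_vals, vmax, c_edges, r_edges):
    """Return P(log rSFR bin | C bin) as a list of rows, one per C bin.

    Galaxies are weighted by 1/Vmax. Galaxies outside the C range are
    skipped; log rSFR values outside the plotted range are clamped into
    the lowest or highest bin (a simplification of the scheme in the text).
    """
    n_c, n_r = len(c_edges) - 1, len(r_edges) - 1
    hist = [[0.0] * n_r for _ in range(n_c)]
    for c, r, v in zip(c_vals, log_r_vals, vmax):
        i = bisect.bisect_right(c_edges, c) - 1
        if i < 0 or i >= n_c:
            continue
        j = min(max(bisect.bisect_right(r_edges, r) - 1, 0), n_r - 1)
        hist[i][j] += 1.0 / v
    # Normalise each C bin so that it sums to unity (if it is populated).
    for row in hist:
        total = sum(row)
        if total > 0.0:
            row[:] = [x / total for x in row]
    return hist


if __name__ == "__main__":
    c_edges = [1.5, 2.5, 3.5]
    r_edges = [-12.5, -11.5, -10.5, -9.5]
    rows = conditional_distribution(
        [2.0, 2.0, 3.0], [-10.0, -11.0, -12.0], [1.0, 2.0, 1.0],
        c_edges, r_edges)
    for row in rows:
        print(row)
```

Normalising per bin of C is what turns the joint distribution into a conditional one, so sparsely populated concentration bins are displayed on the same footing as well populated ones.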
These two panels illustrate the bimodal nature of the distribution of galaxies. This has recently been described and quantified in a number of papers (e.g. Strateva et al. 2001; Blanton et al. 2003a; Kauffmann et al. 2003b; Baldry et al. 2003). As can be seen, galaxies split into two basic populations: concentrated galaxies with low specific star formation rates, and low mass, less concentrated galaxies with high specific star formation rates. The distribution at high concentration is broad, however, because there is also a fairly important contribution from concentrated, actively star forming galaxies at very low redshift. When the figure is plotted for the sample directly, ie. without volume weighting, the two regimes are much more clearly separated.
Figure 24 shows, with the same layout, the distribution of b as a function of stellar mass. We see that although there is a clear decline in b with increasing stellar mass, the trend in $b^{g}$ is weak over a large range in stellar mass until it suddenly drops at around $10^{11}\,h_{70}^{-2}$ M⊙. The $b^{V}$ value shows a much smoother trend. This is caused by the large spread in rSFR at given mass. If we compare Figures 24 and 23, we see that the Hubble type correlates somewhat less well with the b parameter than the stellar mass does.
Figure 25 shows b as a function of stellar mass surface density, µ*. The connection between this and M* has been discussed in detail by Kauffmann et al. (2003b). Those authors show that µ* correlates well with M*, except at the very highest masses where µ* reaches a saturation value. It is therefore not surprising that we see a very similar trend with µ* as we saw with M* in Figure 24. When we plot the figure without volume weighting it is very similar but the low SFR peak becomes more prominent at high µ* than at high M*, suggesting that the surface density of stars is more important than the stellar mass in determining when star formation is turned off.
Figure 26 again presents a similar view. This time the b parameter is plotted as a function of absolute r-band magnitude at z = 0.1. The correlation here is fairly good. For galaxies fainter than about Mr = −20, the b^g parameter increases by about 40% per magnitude.
Figure 27 shows the relationship between b and the 4000 Å break. The left panel shows this to be a fairly tight relation, which is one reason why Kauffmann et al. (2003a) were so successful in constraining the M/L ratio for galaxies using the 4000 Å break and the HδA index. We note that this relation does have the potential to constrain star formation in high redshift galaxies, as the Balmer break can be measured out to considerably higher redshift than Hα. From Figure 27 we see that a measurement of D4000 allows one to constrain b^g to within a factor of three (1σ) for D4000 < 1.6. Although this is not very precise, it should be weighed against the fact that it is very difficult to estimate star formation rates to better than a factor of two with any technique. It is, however, important to caution that the D4000 index intrinsically depends on the SFR over the last Gyr rather than on the current SFR, as well as on the metallicity. To use the relation in Figure 27 it is therefore necessary to assume that the present SFR is similar to the average activity over the last Gyr.
We caution that D4000 is measured within the fibre, whereas our SFR estimates are for the entire galaxy. If we make the same plot using SFR estimated within the fibre, we still recover a tight correlation between b^g and D4000. Moreover, we find that b^g and b^V now coincide. It is galaxies that have high D4000 within the fibre but significant star formation outside which make b^V considerably higher than b^g at high D4000 in Figure 27.
We finish by showing b versus R50 in Figure 28. It is clear that the low and high b galaxies co-exist over the entire range in R50, but there is a general trend for higher b^V at the smallest sizes and for large galaxies to have typically lower values of b^V.

Comparison with previous studies
Before we turn to a discussion of these results, it is enlightening to compare our results with previous work in more detail. The main comparison samples are those presented by Kennicutt et al. (1994) and Gavazzi et al. (2002) in their studies of star formation histories in spiral galaxies. They calculate b-parameters for a range in morphological types and in Figure 29 we plot this distribution of b versus approximate morphological type for our sample. We have taken the relation between RC3 T-type and concentration from Shimasaku et al. (2001) to convert concentration measures into approximate T-types. We fitted a linear relation between T and C from their Table 3 which is indicated on the upper axis. In the same plot we show the results of Kennicutt et al. (1994, their Table 4) as crosses and those of Gavazzi et al. (2002) as squares, with big squares corresponding to the median value (BCGs are not shown). The data we show for our study are the unweighted b^g values which should be appropriate for comparison to local studies.
Figure 29. The present to past-average star formation rate for the sample plotted against central concentration in the r-band. The top x-axis shows the mean T-type implied by the concentration according to the recipe described in the text. The crosses show the sample of Kennicutt et al. (1994) and the boxes that of Gavazzi et al. (2002).
There is clearly a considerable uncertainty in the x-axis, but the general agreement in trends and turn-over between the different samples indicates that the conversion between C and T-type derived by Shimasaku et al. (2001) is fairly reliable. It is also clear that the values from Gavazzi et al. (2002) are considerably lower than ours. This may be due in part to differences in galaxy density in the two samples, but is more likely due to differences in the method used to determine the stellar mass. Gavazzi et al. (2002) use an average dark matter mass to stellar mass ratio to derive stellar masses. We have chosen not to attempt to adjust for differences in methodology, as our main interest is in comparing the overall trend.

DISCUSSION
The preceding sections have provided a detailed inventory of the star formation activity in the local universe. One possibly surprising result is that the dominant contribution to the SFR density comes from galaxies with on-going star formation in their inner few kiloparsecs (the SF and low S/N SF classes). These galaxies contribute ∼ 72% of the total SFR density, whereas galaxies with signs of AGN (Composites and AGN) contribute about 15% of the total. The remaining 12% of ρSFR is located in galaxies we are unable to classify based on their fibre spectra. These are almost certainly galaxies with substantial bulges, in which stars are forming in the outer disk. A thorough examination of this question has to be postponed to a different article, but it is clear that the relative importance of star formation in the outer disk is a function of the mass or luminosity of the galaxy. When we look at narrow redshift ranges we find that the median aperture corrections increase by a factor of 2 from stellar masses around 10^9 M⊙ to M* ∼ 10^11 M⊙. The natural question is how this relates to star formation in bulges as compared to disks, but this will have to await a study of the resolved properties of our galaxies.
We found that galaxies in the SF class have SFR profiles that follow the r-band light. For these galaxies, it is possible, and accurate, to remove aperture bias by simple scaling with the r-band flux (cf. Hopkins et al. 2003).
We have, for the first time, been able to calibrate the conversion factor from Hα to SFR as a function of mass and we show that the standard Kennicutt (1998) conversion factor is a very good average correction for most galaxies in the local Universe. By lucky coincidence, trends in the conversion factor and in the intrinsic Case B ratio of Hα to Hβ tend to cancel each other out. As a result, it will be sufficient to take a fixed Case B ratio and a fixed η^0_Hα for most studies, even though neither of the two assumptions is correct.
One of the main questions we want to address is what excites large-scale star formation in galaxies. Seen from this angle the different parameters we have studied fall into two categories. First there is the incidental set: those parameters which are controlled by star formation activity rather than the other way around. Among these are D4000 and colour. Of more interest are those parameters that may have some direct effect on star formation such as M*, µ*, Hubble type and size.
We have found that the stellar mass correlates well with the SFR, a result found previously for a much smaller sample by Gavazzi & Scodeggio (1996). This is quite likely because even though the gas fractions of galaxies decline with increasing mass, the total gas masses still increase as a function of mass (Boselli et al. 2001).
Because of this overall correlation, the SFR density distribution functions are rather similar to the mass density distribution functions discussed by Kauffmann et al. (2003a). We find that the majority of star formation takes place in high surface brightness disk-dominated galaxies spanning a range of concentrations. These span a broad range of masses, peaking for galaxies slightly less massive than 10^11 M⊙ and with half-light radii around 3 kpc. The star forming gas has typically solar or slightly super-solar metallicity.
Our most striking result is contained in Figure 18, which shows that the majority of star formation in the local universe takes place in massive galaxies. This is a robust prediction of hierarchical models and the shape of our distribution matches that of Springel & Hernquist (2003) (although these authors show ρSFR versus the halo mass). This clearly warrants a more in-depth comparison and we will follow this up in a forthcoming paper.
Another interesting question is whether star formation in the local universe is predominantly bursty or relatively quiescent. As we saw in Figure 24, the answer depends strongly on the mass of the galaxy. It is safe to say, however, that the vast majority of star formation in the local universe takes place in galaxies that are not undergoing extreme starbursts. Quantitatively we find that 28% of ρSFR comes from galaxies with SFR/⟨SFR⟩ > 1 and 50% from galaxies with SFR/⟨SFR⟩ > 0.54, but only 3% of ρSFR comes from galaxies with SFR/⟨SFR⟩ > 10. The traditional definition of starbursts is, however, b ≳ 2-3, and with that definition around 20% of the local SFR density takes place in starbursts. We should caution that our method of inferring the SFR from optical emission lines is likely to miss part of the SFR occurring in extremely dusty FIR-dominated galaxies. Given that such systems are very rare at low redshifts (e.g. Sanders & Mirabel 1996), this does not significantly affect our results.
Although IR-luminous galaxies are rare in the local universe, most local star formation occurs in galaxies with substantial optical attenuation in their central parts. It is important to realise that this quantity might suffer from aperture biases so the galaxy averaged attenuation could be lower. The relationship between dust attenuation and SFR activity has been discussed by many authors. The most recent discussion is given by Hopkins et al. (2003) for a subsample of our SDSS data. Our results are in good agreement with theirs. Our main result is that the SFR-weighted dust attenuation at Hα is AHα ∼ 1.0 ± 0.5 magnitudes, with a good correlation between dust attenuation and SFR. The exact relationship between dust and star formation remains unclear. We see evidence for a strong correlation between stellar mass and dust attenuation, presumably reflecting a correlation between metallicity and dust content. The correlation between SFR and dust may simply reflect a more fundamental correlation between dust and metallicity. This question clearly merits further investigation. Another consequence of the moderately strong dust attenuation present in most galaxies is that the colours of the galaxies that dominate the SFR density have intermediate values; this is why we needed a second colour to get a good constraint on star formation activity in section 5.
To isolate the influence of other factors that may regulate the star formation, it is clearly necessary to take out the dominant trend with mass. One should probably normalise by the mass of the gas reservoir, but as this information is not available we have normalised by stellar mass. Previous studies by Gavazzi et al. (2002) and Boselli et al. (2001) have shown that SFR/M* correlates with mass. We confirm this trend. The median value of rSFR is fairly constant over a wide range at low masses and then decreases smoothly at masses greater than M* ∼ 10^10 M⊙. The simplest interpretation is that low mass galaxies form stars at an approximately constant rate, whereas the star formation history is increasingly peaked towards earlier times for more massive galaxies.
These trends have arguably been seen before. They agree with the "downsizing" scenario of Cowie et al. (1996). What has not been apparent up to now is the fact that the extremes of the distribution change noticeably. As we go up in mass, there are fewer and fewer galaxies with very high rSFR and a corresponding tail towards low rSFR starts to appear.
It is clear from the previous section that no single factor stands out as controlling rSFR. This is not entirely surprising since the properties we have used are simple and somewhat limited. It might turn out that more detailed properties correlate better with b, for example lopsidedness (Rudnick et al. 2000), local density (Gómez et al. 2003) or merging state (e.g. Le Fèvre et al. 2000). We will study these relations further in future work.
We mentioned previously that the r^V_SFR value for the sample as a whole provides a strong constraint on the star formation history of the universe. This has been discussed in some detail by Boselli et al. (2001), but that study did not use a SFH that is compatible with look-back studies: it assumed an exponential star formation history for all galaxies. Our b^V value with our cosmology corresponds to an effective τ = 7^{+0.7}_{-1.5} Gyr (including systematic errors) if one considers the whole universe to have an exponential SFH.
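For orientation, the link between b and τ for an exponential SFH can be written down in closed form. The following is a sketch under simplifying assumptions that are not taken from the text: star formation starts at t = 0 and runs to the age t_0 at the epoch of observation, and no correction is made for mass returned to the interstellar medium.

```latex
\mathrm{SFR}(t) \propto e^{-t/\tau},
\qquad
\langle \mathrm{SFR} \rangle
  = \frac{1}{t_0} \int_0^{t_0} \mathrm{SFR}(t)\, dt
  \propto \frac{\tau}{t_0} \left( 1 - e^{-t_0/\tau} \right),
\qquad
b = \frac{\mathrm{SFR}(t_0)}{\langle \mathrm{SFR} \rangle}
  = \frac{t_0/\tau}{e^{t_0/\tau} - 1} .
```

In this form b tends to 1 for τ much larger than t_0 (a constant SFR) and falls towards zero as τ becomes short, so a measured b can be inverted numerically for an effective τ once t_0 is fixed.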
We will however adopt the parameterisation of the SFH of the universe used by Baldry et al. (2002). In general form this reads
$$ \mathrm{SFR} \propto \begin{cases} (1+z)^{\alpha} & \text{for } z \in [z_d, z_c] \\ (1+z)^{\beta} & \text{for } z < z_d \\ 0 & \text{for } z > z_c \end{cases} \qquad (13) $$
where we will adopt z_d = 1, z_c = 5 to be consistent with Baldry et al. (2002) and . Those authors showed that the local "cosmic spectrum" is well fit by a model with β = 3 and α = 0.
To calculate b^V for the SFH defined by Equation (13) we simply integrate over time and define
$$ \langle \mathrm{SFR} \rangle = \frac{1}{t_0} \int_0^{t_0} \mathrm{SFR}(t)\, dt, $$
with t_0 being the age of the universe at z = 0.1. This, together with equation (13), then gives the present to past-average SFR for the model, b^V = SFR(z = 0.1)/⟨SFR⟩.
It is interesting to note that because of the way we measure M* and the SFR, there is a dependence on the Hubble parameter in our b value calculations. That dependence cancels out in equation (13). To constrain α and β we thus need a prior on H0. We will take the simple approach and choose h = 0.7. We use the constraint on b^V given in equation (12). Figure 30 shows the resulting constraints on α and β. Since the b^V value only constrains the z = 0.1 value of the SFR and the integral under the curve, there is a strong degeneracy even with a fixed functional form. This degeneracy cannot be lifted without independent constraints, but it is still of considerable interest. Most noticeably it is clear that any really strong evolution, β > 4, at z < 1 is excluded by the current data.
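The model calculation described above can be sketched numerically. This is an illustration under assumptions that are not stated in this section: a flat ΛCDM cosmology with Ωm = 0.3 and ΩΛ = 0.7, radiation neglected, the two power-law branches joined continuously at z_d, and no correction for returned mass. The function name `model_b` is hypothetical.

```python
import numpy as np

def model_b(alpha, beta, z_obs=0.1, z_d=1.0, z_c=5.0,
            omega_m=0.3, omega_l=0.7, n=20001):
    """Present to past-average SFR, b = SFR(z_obs) / <SFR>, for the model SFH."""
    def sfr(z):
        # (1+z)^beta below z_d; (1+z)^alpha above, scaled to be continuous
        # at z_d (an assumption). The cut-off at z_c is enforced by the
        # integration limits below.
        low = (1.0 + z) ** beta
        high = (1.0 + z_d) ** (beta - alpha) * (1.0 + z) ** alpha
        return np.where(z < z_d, low, high)

    def dt_da(a):
        # dt/da in units of 1/H0 for flat LambdaCDM (assumed cosmology).
        return np.sqrt(a) / np.sqrt(omega_m + omega_l * a ** 3)

    def trapezoid(y, x):
        return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))

    a_obs = 1.0 / (1.0 + z_obs)
    a_all = np.linspace(0.0, a_obs, n)               # Big Bang to z_obs
    t0 = trapezoid(dt_da(a_all), a_all)              # age at z_obs
    a_sf = np.linspace(1.0 / (1.0 + z_c), a_obs, n)  # z_c down to z_obs
    formed = trapezoid(sfr(1.0 / a_sf - 1.0) * dt_da(a_sf), a_sf)
    # Both t0 and the integral scale as 1/H0, so H0 drops out of b.
    return float(sfr(z_obs)) * t0 / formed
```

Evaluating `model_b` on a grid of (α, β) and comparing with a measured value of b is one way to trace out a degeneracy of the kind discussed above; the sketch makes no attempt to reproduce Figure 30.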
The dotted lines in Figure 30 show the effect of systematic uncertainties on these constraints (for simplicity we have symmetrised these). The dashed lines show the purely statistical errors (1σ). It is clear that if we could remove all systematic uncertainties, the method would give a very powerful constraint on the star formation history of the universe. It is therefore worth asking what is required to further reduce the systematic uncertainties.
A thorough test of our aperture corrections using resolved spectroscopy would substantially reduce the systematic uncertainty in our SFR estimates. To reduce uncertainties due to evolution, one could limit oneself to a narrow range in redshift, but that would mean a corresponding increase in the uncertainty in extrapolating to faint galaxies. Alternatively, it might be possible to directly constrain the evolution inside the sample, but that is a rather delicate operation due to the complex aperture effects. We have postponed this to a later paper.
Systematic uncertainties notwithstanding, the results presented here provide remarkably good constraints on the cosmic star formation history using only data at z < 0.22. It is clear that a similar study carried out at higher redshift will be of great interest and provide a very interesting complement to recent studies of the mass-assembly of the universe (Drory et al. 2001; Dickinson et al. 2003; Fontana et al. 2003; Rudnick et al. 2003).
In general it is our firm belief that future studies of high redshift data should try as far as possible to study the evolution of the full distribution functions in question rather than summarising the distributions to a single value. As we have seen, important changes might occur in the tails of the distributions with little influence on the median value. These physically interesting details can only be extracted through the use of distribution functions. It has been standard to study luminosity functions. With the techniques introduced in the present paper it is now possible to extend the analysis to a wide variety of distribution functions. This has been possible due to the unprecedented dataset provided by the SDSS survey.

DISTRIBUTIONS
Throughout this paper we use the full likelihoods of all parameters. This means that in many situations it is necessary to combine likelihood distributions. For the benefit of the reader we summarise some of the most important results here. We will use standard notation and write e.g. fZ(z) for the likelihood of a value z in the likelihood distribution Z. Thus we represent the addition of two likelihood distributions as Z = X + Y. This sum is discussed in most introductory statistics textbooks and the well known result is that
$$ f_Z(z) = \int_{-\infty}^{\infty} f_{X,Y}(x, z - x)\, dx = \int_{-\infty}^{\infty} f_X(x)\, f_Y(z - x)\, dx, $$
where the latter equality is valid in the case of independent variables.
In the present work we do not often use the straight addition of distributions. The general case is not normally given in statistics textbooks. It is however easy to derive. We will focus our attention on the cumulative distribution function (CDF) F(z ≥ ẑ). In the continuous case this is given as
$$ F(z \ge \hat{z}) = \iint_R f_{X,Y}(x, y)\, dx\, dy, $$
where R is the region where the constraint z ≥ ẑ is fulfilled. If we write the general composition of two likelihood distributions as Z = g(X, Y) then the constraint z ≥ ẑ translates into z ≥ g(x, y). For well-behaved distributions we can invert this to write y ≤ h(x, z). In general we can then write
$$ F(z \ge \hat{z}) = \int_{-\infty}^{\infty} dx \int_{-\infty}^{h(x,z)} f_{X,Y}(x, y)\, dy. $$
A simple change of variable presents itself and by setting y = h(x, u) we get
$$ F(z \ge \hat{z}) = \int_{-\infty}^{\infty} dx \int^{z} f_{X,Y}(x, h(x,u))\, J\, du, $$
with J = ∂h/∂u being the Jacobian of the variable transformation. By taking the derivative of this we can get the likelihood distribution fZ(z):
$$ f_Z(z) = \int_{-\infty}^{\infty} f_{X,Y}(x, h(x,z)) \left. \frac{\partial h}{\partial u} \right|_{u=z} dx, $$
which can be evaluated for any given g(X, Y). The extension to discrete distribution functions is straightforward (see below).
As an example we can take the addition of two likelihood distributions for the SFR which we need for the aperture corrections.
Here we have Z = log10(10^X + 10^Y), since the likelihood distributions are in log space. Thus calculating the Jacobian and writing it as a discrete distribution we get
$$ f_Z(z_j) = \sum_i f_X(x_i)\, f_Y\!\left( \log_{10}(10^{z_j} - 10^{x_i}) \right) \frac{10^{z_j}}{10^{z_j} - 10^{x_i}}. $$
This formalism extends fairly straightforwardly to the iterated composition of many (independent) likelihood distributions but this is not a practical approach for arbitrary likelihood distributions due to combinatorial explosion (the sum/integral has to be done over all combinations of variables that combine to a particular value of z). Instead it is easier in this situation to calculate the combined likelihood distribution in an approximate manner by carrying out a Monte Carlo sampling of the distributions. In this case we first form the CDF, F (x), for each galaxy. We then draw, for each galaxy, a random number, p, between 0 and 1 and solve F (x) = p numerically using linear interpolation. We use the latter approach for instance when evaluating the star formation density of the local universe. The main disadvantage of a Monte Carlo approach is that one needs to carry out a large number of repetitions to ensure a sufficient sampling of the final likelihood distribution and to estimate the confidence intervals with good accuracy.
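The Monte Carlo procedure described above can be sketched as follows. It is a minimal illustration that assumes all likelihoods are tabulated on a common grid in log space; the per-object weights (for instance 1/Vmax) and the function names are assumptions made for the example, not details taken from the text.

```python
import numpy as np

def sample_from_likelihood(grid, like, size, rng):
    """Draw values from a gridded likelihood by solving F(x) = p."""
    grid = np.asarray(grid, dtype=float)
    like = np.asarray(like, dtype=float)
    # Trapezoidal CDF on the grid points, normalised to run from 0 to 1.
    steps = 0.5 * (like[1:] + like[:-1]) * np.diff(grid)
    cdf = np.concatenate(([0.0], np.cumsum(steps)))
    cdf /= cdf[-1]
    p = rng.random(size)
    # Linear interpolation of the inverse CDF.
    return np.interp(p, cdf, grid)

def monte_carlo_sum(grid, likes, weights, n_rep=1000, seed=0):
    """Monte Carlo realisations of sum_i w_i * 10**x_i."""
    rng = np.random.default_rng(seed)
    totals = np.zeros(n_rep)
    for like, w in zip(likes, weights):
        # One independent draw per repetition for every object.
        totals += w * 10.0 ** sample_from_likelihood(grid, like, n_rep, rng)
    return totals
```

Percentiles of the returned array, for example from `np.percentile`, then give confidence intervals on the summed quantity; their accuracy is limited by `n_rep`.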
Before continuing we would like to emphasise a technical issue which is of considerable importance for some combinations of likelihood distributions: Ignoring the likelihood distribution of a particular variable (in effect setting its error to zero) will in general bias the estimate of any combination of this variable with any other unless they are both Gaussians. A specific example relevant for our work is the ratio of two variables whose likelihood distributions are kept in log space.
In particular the calculation of rSFR in section 8 falls into this category. Here we have Z = 10^Y/10^X where X is the likelihood distribution of the stellar mass and Y that of the SFR. If we now for illustrative purposes assume that these are Gaussian in log space, log SFR ∼ N(log SFR_0, σ_SFR) and likewise for log M*, then a straightforward calculation gives us that the mean of Z, ⟨z⟩ = ∫_0^∞ z fZ(z) dz, can be written