Abstract
We measure moments of the galaxy count probability distribution function in the Twodegree Field Galaxy Redshift Survey (2dFGRS). The survey is divided into volumelimited subsamples in order to examine the dependence of the higherorder clustering on galaxy luminosity. We demonstrate the hierarchical scaling of the averaged ppoint galaxy correlation functions, , up to p= 6. The hierarchical amplitudes, , are approximately independent of the cell radius used to smooth the galaxy distribution on small to medium scales. On larger scales we find that the higherorder moments can be strongly affected by the presence of rare, massive superstructures in the galaxy distribution. The skewness S_{3} has a weak dependence on luminosity, approximated by a linear dependence on log luminosity. We discuss the implications of our results for simple models of linear and nonlinear bias that relate the galaxy distribution to the underlying mass.
Introduction
The pattern of galaxy clustering can be quantified by measuring the galaxy count probability distribution function (CPDF) on a range of smoothing scales. The CPDF gives the probability that a randomly chosen region of the Universe will contain a particular number of galaxies, and is typically expressed as a function of both the size of the region smoothed over and the galaxy number within that volume. Traditionally, most effort has been directed at measuring the second moment of the count distribution, the variance, , through the autocorrelation function or, equivalently, its Fourier transform, the power spectrum (e.g. Percival et al. 2001; Padilla & Baugh 2003; Tegmark et al. 2004). The higherorder moments of the CPDF, expressed as volumeaveraged correlation functions, , provide a much more detailed description of galaxy clustering, probing the shape of the low and high count tails of the distribution.
The higherorder moments of the dark matter distribution are known to display a hierarchical scaling in which the ppoint volumeaveraged correlation functions, , can be written in terms of the variance of the count distribution, : (e.g. see Peebles 1980; Juszkiewicz, Bouchet & Colombi 1993; Bernardeau 1994; Baugh, Gaztañaga & Efstathiou 1995; Gaztañaga & Baugh 1995; Fosalba & Gaztañaga 1998). This scaling is a signature of the evolution under gravitational instability of an initially Gaussian distribution of density fluctuations. A remarkable feature of the scaling is that the values of the hierarchical amplitudes, S_{p}, on scales for which the density field evolves linearly or in a quasilinear fashion, are insensitive to cosmic epoch and essentially independent of the cosmological density parameter or the value of the cosmological constant. For a comprehensive review of such results see Bernardeau et al. (2002) and references therein.
Departures from the hierarchical scaling of the higherorder moments could conceivably arise in three ways as follows.
 (i)
A strongly nonGaussian distribution of primordial density waves as could arise, for instance, due to a seed nonlinear fluctuation such as a global texture (see Gaztañaga & Mahonen 1996; Gaztañaga & Fosalba 1998; Scoccimarro, Sefusatti & Zaldarriaga 2004 for examples of how the S_{p} scale in this case). This avenue now seems unlikely, following the clear detection of multiple acoustic peaks in the power spectrum of cosmic microwave background temperature fluctuations (Netterfield et al. 2002; Hinshaw et al. 2003; Mason et al. 2003; Scott et al. 2003; Kuo et al. 2004); such peaks are difficult to reconcile with models that include cosmological defects (Kamionkowski & Kowsowsky 1999). Moreover strongly nonGaussian primordial fluctuations are ruled out by the firstyear Wilkinson Microwave Anisotropy Probe (WMAP) results (Komatsu et al. 2003; Gaztañaga & Wagg 2003).
 (ii)
A weakly nonGaussian distributed primordial density field, resulting from a nonlinear perturbation to a Gaussian density field. This scenario is difficult to distinguish from the evolution of an initially Gaussian field under gravitational instability, because the perturbation can introduce a shift to the amplitudes S_{p} that is also hierarchical. This can happen even in the case where the nonlinear perturbation produces a negligible effect on the power spectrum (Bernardeau et al. 2002).
 (iii)
The spatial bias between the galaxy distribution and the underlying distribution of dark matter. Fry & Gaztañaga (1993) demonstrated that, under a local biasing prescription, the hierarchical scaling of the higherorder moments is preserved but the amplitudes S_{p} can change as a function of time or luminosity. This conclusion is also reached using more sophisticated, physically motivated semianalytic models of galaxy formation (Kauffmann et al. 1999; Benson et al. 2000; Scoccimarro et al. 2001).
Previous attempts to measure the higherorder correlation functions have been hamstrung by the small size of the available redshift surveys, a shortcoming that is exacerbated once volumelimited subsamples are constructed (Hui & Gaztañaga 1999). Nevertheless, early countsincells studies established that the first few higherorder moments of the galaxy distribution displayed the hierarchical scaling expected in the gravitational instability framework (Groth & Peebles 1977; Peebles 1980; Gaztañaga 1992; Bouchet et al. 1993; Fry & Gaztañaga 1994; Ghigna et al. 1996; Feldman et al. 2001). Such analyses were typically limited to measuring the three and fourpoint correlation functions. The nature of the dependence of the hierarchical amplitudes on luminosity has not been convincingly established. Recent work to investigate this in the optical (Hoyle, Szapudi & Baugh 2000) and in the farinfrared (Szapudi et al. 2000) was restricted to probing fairly narrow ranges of luminosity due to the size of the redshift surveys then available.
The advent of multifibre spectrographs exploited by sustained observing campaigns has led to a new generation of redshift survey which represents order of magnitude advances over surveys completed in the last millennium. The Sloan Digital Sky Survey (York et al. 2000) and the Twodegree Field Galaxy Redshift Survey (2dFGRS, Colless et al. 2001) have provided maps of the clustering pattern of galaxies with unprecedented detail. Analysis of the 2dFGRS clustering has suggested that the fluxlimited sample could be an essentially unbiased tracer of the dark matter in the Universe (Lahav et al. 2002; Verde et al. 2002). These results confirmed previous deductions about galaxy bias (e.g. Gaztañaga 1994; Frieman & Gaztañaga 1999; Gaztañaga & Juszkiewicz 2001) reached using the parent angular catalogue of the 2dFGRS, the APM Galaxy Survey (Maddox et al. 1990; Maddox, Efstathiou & Sutherland 1996). The 2dFGRS covers a volume that is an appreciable fraction of that sampled by the APM Survey, with full redshift coverage (modulo the relatively small redshift incompleteness that still remains). This means that for the first time, a measurement of the higherorder moments is possible in three dimensions with comparable accuracy to that attainable in two dimensions, but without the added complication of the effects of projection (Gaztañaga & Bernardeau 1998; Szapudi & Gaztañaga 1998).
The sheer number of galaxies in the 2dFGRS allows it to be subdivided in order to probe the dependence of the clustering signal on intrinsic galaxy properties in more detail. Norberg et al. (2001) found that the amplitude of the projected twopoint correlation function scales with luminosity, and characterized this trend using a relative bias factor with a linear dependence on luminosity. In this paper we extend the work of Norberg et al. to study the higherorder clustering of galaxies in the 2dFGRS and its dependence on luminosity. Our approach is the same as that followed in Baugh et al. (2004), who measured the higherorder correlation functions of a sample of L_{*} galaxies and found that they follow a hierarchical scaling.
We provide a brief review of the measurement of the moments of the CPDF in Section 2. In Section 3, we discuss the specific application of this method to the 2dFGRS; an important feature of our analysis is the use of mock catalogues to estimate the errors on our measurements (see Section 3.3). Our results for the higherorder correlation functions and the hierarchical amplitudes are given in Section 4. We quantify the variation of the higherorder moments with luminosity in Section 5, and discuss the interpretation of these results in terms of a simple relative bias model. Our conclusions are set out in Section 6. Throughout, we adopt standard present day values of the cosmological parameters to compute comoving distance from redshift: a density parameter Ω_{m}= 0.3 and a cosmological constant Ω_{Λ}= 0.7.
CountinCells Statistics
The CPDF and its moments have been used extensively to quantify the clustering pattern of galaxies (e.g. White 1979; Peebles 1980). In this section we give an outline of the countsincells approach, explaining how the volumeaveraged ppoint correlations are derived from the CPDF and give a brief theoretical background. A more comprehensive discussion of the countsincells approach can be found in Bernardeau et al. (2002).
Estimating the ppoint volumeaveraged correlation functions
The ppoint moment, or (unreduced) correlation function, m_{3}(r_{1}, r_{2}, r_{3}) ≡〈δ(r_{1})…δ(r_{p})〉, can be used to fully characterize the clustering of a fluctuating field δ(r). The reduced ppoint correlation function, ξ_{p}(r_{1}, …, r_{p}), is defined as the connected part of the above ppoint correlation in such a way that for p > 2 : ξ_{p}= 0 for a Gaussian field (see Bernardeau et al. 2002, for more details). Following the standard convention, for the remainder of this paper when we talk about correlations we will always assume they are ‘reduced’ correlations.
The ppoint volumeaveraged galaxy correlation function, , can be written as the integral of the ppoint correlation function, ξ_{p}, over the sampling volume, V (Peebles 1980), i.e.
A practical way in which to estimate is to randomly throw cells down within the galaxy distribution, recording the number of times a cell contains N galaxies so as to build up the galaxy CPDF, P_{N}(V). As we adopt spherical cells, the CPDF is a function of the sphere radius, R, where N_{N} is the number of cells that contain N galaxies out of a total number of cells thrown down, N_{T}. The volumeaveraged correlation functions are then related to the moments of the CPDF, m_{p}, where N̄ is the mean number of galaxies in a cell of volume V and is calculated directly from the CPDF For the case of a continuous distribution, is related to the corresponding cumulant, μ_{p}, through , where the cumulants are defined as (see Gaztañaga 1994, for details) If instead we are dealing with a discrete distribution, these relations must be corrected. A Poisson shot noise model is adopted (see Baugh et al. 1995, for a discussion of this point), to give corrected estimates of the moments, k_{p}, i.e. The volumeaveraged correlation functions, calculated from the galaxy CPDF, follow directly from the relation .Scaling of the higherorder moments
In the hierarchical model of clustering, all higherorder correlations can be expressed in terms of the twopoint function, , and dimensionless scaling coefficients, S_{p},
Traditionally, is referred to as the skewness of the distribution and as the kurtosis. The hierarchical scaling of the higherorder moments arises from the evolution due to gravitational instability of an initially Gaussian distribution of density fluctuations (see Bernardeau et al. 2002, and references therein).Systematic effects: biased estimators
In addition to sampling errors (see Section 3.3), the estimation of the hierarchical amplitudes can be compromised by systematic effects, as discussed in some detail by Hui & Gaztañaga (1999). These authors identified two sources of error that could lead to a systematic bias in the inferred values of S_{p}. The first effect arises from biases in the estimates of the higherorder correlation functions themselves, known as the ‘integral constraint bias’ (see e.g. Bernstein 1994). The second effect originates in the nonlinear combination of and to form S_{p}; this is called the ‘ratio bias’. The latter effect dominates on large scales and tends to cause the inferred values of the S_{p} to be biased low. Hui & Gaztañaga wrote down expressions for these biases which accurately reproduce the systematic effects seen upon estimating the hierarchical amplitudes from subvolumes extracted from Nbody simulations.
As mentioned above, we will use different volumelimited samples to study the luminosity dependence of the hierarchical amplitudes, S_{p}. As the luminosity that defines a sample is made brighter, the volume of the sample increases. Thus the estimation biases tend to cause the S_{p} to increase with sample luminosity. This spurious tendency has already been reported in the literature (see Hui & Gaztañaga 1999). For volumes of the size used in our analysis, it turns out that the predicted biases are smaller than the corresponding sampling errors (e.g. see fig. 3 in Hui & Gaztañaga 1999). This is the first time that a redshift survey has been available which is large enough to overcome such systematic biases.
Galaxy biasing
Galaxy samples constructed using different selection criteria display different clustering patterns. This leads one to the conclusion that distinct samples of galaxies must trace the underlying mass distribution in different ways, a phenomenon that is generally known as galaxy bias.
A simple, heuristic scheme describing the impact of a local bias on the scaling of the higherorder moments was proposed by Fry & Gaztañaga (1993). These authors demonstrated that in this case, the scaling of the higherorder moments of the galaxy distribution should mirror that of the dark matter, though possibly with different values for the hierarchical amplitudes S_{p}. Fry & Gaztañaga made the assumption that the density contrast in the galaxy distribution, δ^{G}, i.e. the fractional fluctuation around the mean density, could be written as a Taylor expansion of the density contrast of the dark matter, δ^{DM}, i.e.
On scales where the variance, , is small, the leading order contribution to the twopoint volumeaveraged correlation function of galaxies has the form where b_{1} is the ubiquitous linear bias b. The leading order forms for the hierarchical amplitudes, S_{p}, for the cases p= 3 and p= 4 are where we use the notation c_{k}=b_{k}/b_{1}. Expressions for the hierarchical amplitudes are given up to p= 7 in Fry & Gaztañaga (1993).Mo, Jing & White (1997) give theoretical predictions for the coefficients b_{k} using the Press & Schechter (1974) formalism and exploiting the framework developed by Cole & Kaiser (1989) and Mo & White (1996). For haloes of mass M, the first two bias factors (k= 1 and 2) are given by
and where ν≡δ_{c}/σ(M), δ_{c} is the linear theory overdensity at the time of collapse (δ_{c}= 1.686 for Ω= 1) and σ (M) is the linear rms fluctuation on the mass scale of the haloes. This is a simple model but nevertheless it shows some tendencies that are correct. For example, a typical mass halo corresponding to ν= 1 displays an unbiased variance with b_{1}= 1, but introduces a bias in the skewness, since c_{2}=b_{2}=−0.7. As a further illustration, consider massive haloes defined by ν^{2} > 3; in this case the Mo, Jing & White theory predicts that c_{2} > 0, while less massive haloes could produce c_{2} < 0. To get more realistic values of b_{k} for galaxy bias, a prescription has to be adopted for populating dark matter haloes of a given mass with galaxies of a given luminosity (Benson et al. 2000; Scoccimarro et al. 2001; Berlind et al. 2003).In the interpretation of the higherorder moments presented in this paper we will make use of a {relative} bias, which describes the change in clustering compared with that measured for a reference sample (Norberg et al. 2001, 2002a). Using equation (9) as a guide, we define the relative bias, b_{r}=b_{1}/b*_{1}, of a sample as the square root of the ratio of the twopoint correlation function measured for the sample over that measured for the reference sample, denoted by an asterisk (the reference sample will be defined explicitly in Section 5):
Thus, we can obtain an estimate of the relative bias from the ratio of the variances.When the linear bias is a good approximation (c_{k}≃ 0 for k > 1), we can relate S^{G}_{p} in different galaxy samples regardless of the underlying DM value of S_{p}
More generally, one can manipulate equation (10) to write down an expression comparing S^{G}_{p} for two galaxy samples, eliminating S^{DM}_{3} for the underlying dark matter (e.g. see equation 9 in Fry & Gaztañaga 1993). For the skewness where an asterisk denotes a quantity describing the reference sample and b_{r}=b_{1}/b*_{1} is the relative bias defined above. Any secondorder relative bias effects are thus given by As a special case, if the reference sample is unbiased (i.e. b*_{1}= 1 and c*_{p}= 0), we then have c′_{2}=c_{2}.Application to the 2DFGRS
In this section we describe the construction of volumelimited samples from the 2dFGRS (Section 3.1) and outline how we deal with the small, remaining incompleteness of the survey when we measure the CPDF (Section 3.2). The estimation of errors on the measured higherorder moments is described in Section 3.3. We use the full 2dFGRS as our starting point. The final spectra were taken in 2002 April, giving a total of 221 414 galaxies with highquality redshifts (i.e. with quality flag Q ≥ 3; see Colless et al. 2001). The median depth of the full survey, to a nominal magnitude limit of b_{J}∼ 19.45, is z∼ 0.11. We consider the two large contiguous survey regions, one near the South Galactic Pole (SGP) and the other towards the North Galactic Pole (NGP). After restricting attention to the highredshift completeness parts of the survey (see Colless et al. 2001; Norberg et al. 2002b), the effective solid angle covered by the NGP region is 469 deg^{2} and that of the SGP is 670 deg^{2}. Full details of the 2dFGRS and the construction and use of the mask quantifying the completeness of the survey can be found in Colless et al. (2001, 2003).
We make use of mock 2dFGRS catalogues to test our algorithm for dealing with the spectroscopic incompleteness of the survey and to estimate errors on the measured higherorder correlation functions. The construction of the mocks is described in Norberg et al. (2002b). In short, catalogues are extracted from the Virgo Consortium's ΛCDM Hubble Volume simulation which covers a volume of 27 Gpc^{3} (Evrard et al. 2002). A heuristic bias scheme is applied to the smoothed distribution of dark matter in the simulation to select ‘galaxies’ with a specified clustering pattern (Cole et al. 1998). The parameters of the biasing scheme are adjusted so that the extracted galaxies have the same projected correlation function as measured for the fluxlimited 2dFGRS by Hawkins et al. (2003). Observers are placed within the Hubble Volume simulation according to the criteria set out in Norberg et al. (2002b). Mock catalogues are then extracted by applying the radial and angular selection functions of the 2dFGRS. Finally, the mock is degraded from uniform coverage within the angular mask of the survey by applying the spectroscopic completeness mask of the 2dFGRS.
Construction of volumelimited catalogues
In a fluxlimited sample the density of galaxies is a strong function of radial distance. This effect needs to be taken into account in clustering analyses (for an example of a technique appropriate to a countsincells analysis, see Efstathiou et al. 1990). Alternatively, one may construct volumelimited samples in which the radial selection function is constant and any variations in the density of galaxies are due only to largescale structure. This greatly simplifies the analysis at the expense of using a subset of the survey galaxies. The 2dFGRS contains enough galaxies and covers sufficient volume to permit the construction of volumelimited samples corresponding to a wide baseline in luminosity from which robust measurements of the higherorder correlation functions can be obtained.
We follow the approach taken by Norberg et al. (2001, 2002a) who measured the projected twopoint correlation function of 2dFGRS galaxies in volumelimited samples corresponding to bins in absolute magnitude. The samples are defined by a specified absolute magnitude range, with absolute magnitudes corrected to zero redshift (this correction is made using the k+e correction given by Norberg et al. 2002b). As any survey has, in practice, a bright as well as a faint flux limit, this implies that a selected galaxy should fall between a minimum (z_{min}) and a maximum (z_{max}) redshift. This then guarantees that all sample galaxies are visible within the flux limits of the survey when displaced to any depth within the volume of the sample. The properties of the combined NGP and SGP volumelimited samples examined in this paper are given in Table 1.
Correcting for incompleteness
There are two possible sources of incompleteness that need to be considered when estimating the galaxy count within a cell. The first is volume incompleteness, which can arise when some fraction of the cell volume samples a region of space that is not part of the 2dFGRS. This situation can arise because the survey has a complicated boundary and also because it contains holes excised around bright stars and other interlopers in the parent APM galaxy catalogue (Maddox et al. 1990, 1996). The second source of incompleteness is spectroscopic incompleteness. The final 2dFGRS catalogue is much more homogeneous than the 100k release (contrast the completeness mask of the final survey shown in fig. 1 of Hawkins et al. 2003 with the equivalent mask depicted in fig. 15 of Colless et al. 2001). However, the spectroscopic completeness still varies with position on the sky and needs to be incorporated into the countsincells analysis.
It is therefore necessary to devise a strategy to compensate for the fact that a cell will sample regions that have varying spectroscopic completeness and which may even straddle the survey boundary or a hole. We project the volume enclosed by the cell on to the sky and estimate, using the survey masks, the mean combined spectroscopic and volume completeness, f, within the sphere. Rather than view the consequence of this incompleteness as missed galaxies, we instead consider it as missed volume. We compute a new radius for the sphere given by : such a sphere with radius R′ will have an incomplete volume equivalent to that of a fully complete sphere of radius R. Spheres for which f is less than 50 per cent are discarded. The galaxy count within the sphere of radius R′ then contributes to the CPDF at the effective radius R. Each sphere thrown down is individually scaled in this way according to its local incompleteness, as given by the survey masks. We note that, due to our chosen acceptable minimum completeness of 50 per cent, the rescaling of the cell radius is always less than the width of the radial bins we use to plot the higherorder correlation functions. Our results are insensitive to the precise choice of completeness threshold.
An alternative method to correct cell counts is described in Efstathiou et al. (1990). In this commonly used approach it is the galaxy counts which are scaled up in proportion to the degree of incompleteness in the cell, as opposed to the cell volume as we have done. We have tried both correction methods when calculating the higherorder moments and find the results are essentially identical (see Croton et al. 2004, for further discussion of the relative strengths and weaknesses of both methods).
A test of our method for dealing with incompleteness is shown in Fig. 1. This plot shows the S_{p} estimated from the higherorder correlation functions measured in mock 2dFGRS catalogues (Norberg et al. 2002b). The dotted lines show the results for complete mocks, with uniform sampling of the galaxy distribution within the full angular boundary of the 2dFGRS. The dashed lines show how these results change once the mocks are degraded to mimic the spectroscopic incompleteness and irregular geometry of the 2dFGRS, without applying any correction to compensate for this incompleteness. The circles show the values of S_{p} recovered on application of the correction for incompleteness described above. These results are in excellent agreement with those from the fully sampled, ‘perfect’ mocks.
We have carried out two independent countsincells analyses, using different algorithms to place cells within the survey volume. The results are insensitive to the details of the countsincells algorithm. The CPDF is measured using 2.5 × 10^{7} cells for each cell radius. We have further checked the countsincells analysis by comparing the measured twopoint volumeaveraged correlation function with the integral of the measured spatial twopoint correlation function, given by equation (1); the integral of the spatial correlation function is in very good agreement with the direct estimate of the volumeaveraged correlation function.
Error estimation
We estimate the error on the higherorder correlation functions and hierarchical amplitudes using the set of 22 mock 2dFGRS surveys described by Norberg et al. (2002b). The 1σ errors that we show on plots correspond to the rms scatter over the ensemble of mocks (see Norberg et al. 2001). To recap, we consider one of the mocks as the ‘data’ and compute the variance around this ‘mean’ using the remaining mock catalogues. This process is repeated for each mock in turn, and the rms scatter is taken as the mean variance. We illustrate this approach in Fig. 2 for the case of p= 3, for a volume defined by the magnitude range −19 > M− 5log_{10}h > −20. In the upper panel, the skewness or S_{3} measured in each mock is shown by the dotted lines. The points show the mean skewness averaged over the ensemble of 22 mocks. The error bars show the rms scatter on these measurements. The lower panel shows the fractional error that we expect on the measurement of S_{3} for this particular volumelimited sample. Beyond 20 h^{−1} Mpc, the fractional error increases rapidly. Our estimate of the fractional error automatically includes the contribution from sampling variance due to largescale structure (sometimes referred to as ‘cosmic variance’). To estimate the error on a measured correlation function, we simply compute the fractional rms scatter for the equivalent volumelimited sample using the ensemble of mocks, and multiply the measured quantity by the fractional error.
We have compared the estimate of the rms scatter from the ensemble of mocks with an internal estimate using a jackknife technique (see e.g. Zehavi et al. 2002). In the jackknife approach, the survey is split into subsamples. The error is then the scatter between the measurements when each subsample is omitted in turn from the analysis. The jackknife gives comparable errors to the mock ensemble for loworder moments. On large scales, the higherorder moments are particularly sensitive to sample variance and, in these cases, the jackknife approach can only provide a lower bound to the true scatter.
A more formal error estimation procedure is adopted when computing the bestfitting values for the hierarchical amplitudes, S_{p}. In this case, we employ a principal component analysis to explicitly take into account the correlations between the S_{p} inferred at different cell radii (see e.g. Porciani & Giavalisco 2002 and section 6 of Bernardeau et al. 2002). The mock catalogues are used to compute the full covariance matrix of the S_{p} data points to be fitted. Next, the eigenvalues and eigenvectors of the covariance matrix are determined. We find that, typically, the first few eigenvectors are responsible for over 90 per cent of the variance. Given the number of data points that we consider in the fits, this means that we have around a factor of two to three times fewer independent points than data points fitted. (Details of the range of scales used in the fits will be given in Section 4.2.) We note that in most previous work, the S_{p} measured at different cell radii were simply averaged together ignoring any correlations between bins, resulting in unrealistically small errors in the fitted values.
Results
Volumeaveraged correlation functions
The volumeaveraged correlation functions estimated from the CPDF constructed from the combined NGP and SGP cell counts are plotted in Fig. 3. The symbols show the correlation functions for the L_{*} sample with −19 > M>−5 log_{10}h > −20. The lines show the measurements made for galaxies in magnitude bins adjacent to L_{*} (the dashed lines correspond to a sample that is one magnitude fainter and the dotted lines to a sample that is one magnitude brighter). The correlation functions steepen dramatically on small scales as the order p increases.
To better quantify the dependence of the higherorder correlation functions on luminosity, we plot the ratio of the to the results for the L_{*} reference sample in Fig. 4, for the cases p= 2 and p= 3. The variance in the distribution of countsincells on a given smoothing scale increases with the luminosity of the volumelimited sample (see the bottom panel of Fig. 4). This effect is similar to that reported by Norberg et al. (2001, 2002a), who measured the dependence of the strength of galaxy clustering on luminosity in real space, whereas our results are in redshift space. This behaviour is broadly seen to extend to the higherorder clustering, however, the ranking of the amplitude of the higherorder correlation functions with luminosity is not always preserved on large scales. This issue is investigated further in Section 4.3.
Hierarchical clustering
We use the measured volumeaveraged correlation functions from Fig. 3 to test the hierarchical clustering model set out in Section 2.2 and equation (7). In Fig. 5, we plot the p= 3 –6 point volumeaveraged correlation functions as a function of the variance (or twopoint function) measured on the same scale. Small values of the moments correspond to large cells. The thick grey lines show the higherorder moments expected in the hierarchical model. (From equation 7, the offsets of these lines are the hierarchical amplitudes S_{p}. We have used the bestfitting values of S_{p} that we obtain later on in this section. However, the width of the lines does not indicate the error on the fit: the lines are intended merely to guide the eye.) On small scales (large variances), hierarchical scaling is followed. On intermediate and large scales, for which the variance drops below ∼1.3, the measured moments depart somewhat from the hierarchical scaling behaviour, particularly in the case of the higherorders.
The hierarchical scaling of the higherorder correlation functions is exploited to plot the hierarchical amplitudes as a function of cell radius in Fig. 6. Each panel corresponds to a different volumelimited sample, where the lines and points correspond to S_{3}, S_{4} and S_{5} in order of increasing amplitude. The hierarchical amplitudes measured from the two brightest volumelimited samples systematically show an increase around 10 h^{−1} Mpc. This effect is particularly significant in the −19 > M−5 log_{10}h > −20 sample, with the S_{p} increasing by a factor of 2 to 5 depending on p. On smaller scales the hierarchical amplitudes are essentially independent of the cell radius for all magnitude ranges considered. It should be noted that the S_{p} measured in real space vary more strongly in amplitude with scale than in redshift space, particularly at small cell radii (Gaztañaga 1994; Szapudi et al. 1995; Szapudi & Gaztañaga 1998).
We have fitted constant values to the measured S_{p}, using the principal component analysis outlined in Section 3.3. This approach takes into account the correlations between the measurements on different scales. The range of scales used to fit S_{p} is held fixed for each volumelimited sample and is quoted in Table 2. Typically, there are ten values of S_{p} in the range considered in the fittings. The principal component analysis reveals that just two to four linear combinations of these points account for more than 90 per cent of the variance; this gives a fairer impression of the number of independent data points. The principal eigenvector is in all cases almost independent of scale, i.e. its effect is to move all the points coherently up and down (driven by largescale variation in the mean density estimated from the survey). Therefore, the bestfitting constant tends to favour a fit either slightly above or below each set of data points. This is exactly what is seen in the various panels of Fig. 6. The bestfitting constants to the measured S_{p} are given in Table 2, along with an error from the principal component analysis. The fits to S_{3} and S_{4} for the −19 > M−5 log_{10}h > −20 sample are poor in terms of the reduced χ^{2}. There some dependence of the S_{p} with increasing luminosity. This behaviour is explored in Section 5.
Systematic effects: the influence of superclusters
The higherorder moments of the CPDF are sensitive to the presence of massive structures that contribute to the extreme event tail of the count distribution. It is therefore important to examine the 2dFGRS to look for any rare largescale structures that could exert a significant influence on the form of the CPDF. The projected density of galaxies in the right ascension redshift plane for a volumelimited catalogue (VLC) defined by the magnitude range −19 > M−5 log_{10}h > −20 is plotted in fig. 1 of Baugh et al. (2004). There are two clear hotspots or superstructures apparent in this figure, one in the NGP at a redshift of z= 0.08 and a right ascension of 3.4 h, and the other in the SGP at z= 0.11 at a right ascension of 0.2 h. These structures are confirmed as superclusters of galaxies in the group catalogue constructed from the fluxlimited 2dFGRS (Eke et al. 2004); of the 94 groups in the full fluxlimited survey out to z∼ 0.15 with nine or more members and estimated masses above 5 × 10^{14}h^{−1} M_{⊙}, 20 per cent reside in these superclusters. As a result of the redshift at which these superclusters lie, these structures are only influential in volumelimited samples brighter than M−5 log_{10}h=−18.
The results presented earlier in this section show features that could be due to the presence of these superclusters. For example, the volumeaveraged correlation functions for the −19 > M−5 log_{10}h > −20 sample plotted in Fig. 3 appear to have more power on large scales than those measured from the other volumelimited samples. This is consistent with the theoretical expectations for measurements that are strongly affected by the presence of a supercluster: a boost in the clustering amplitude on large scales, due to a structure with a larger bias, and a reduction in the clustering amplitude on small scales arising from the large velocity dispersions within the clusters making up the structure.
To investigate this hypothesis, we have carried out the test of removing the two superclusters from the sample and recomputing the volumeaveraged correlation functions. The goal of this exercise is not to ‘correct’ the measured correlation functions but rather to illustrate the impact of the superclusters on our results. We remove the superclusters by masking out their central densest regions, corresponding to prohibiting the placement of cells within a sphere of radius 25 h^{−1} Mpc from each supercluster centre (for a different approach on how to take this type of effect into account see Colombi, Bouchet & Schaeffer 1994; Fry & Gaztañaga 1994).
Fig. 7 shows the effect of the supercluster removal on the tail of the CPDF for 10 h^{−1} Mpc radius cells, calculated for three VLCs centred on L*. The mean number of galaxies in a cell for each galaxy sample is roughly 40, 24 and 6 going from faintest to brightest. The presence of the two superclusters makes a clear difference to the high N counts for galaxy samples brighter than M−5 log_{10}h=−19. The maximum redshift of the faint VLC in this figure only marginally includes the NGP supercluster, and so P_{N} remains essentially unaffected in this case.
Fig. 8 shows volumeaveraged correlation functions of order p= 2, 3, 4 for three VLCs from Table 1, where each panel corresponds to a fixed absolute magnitude range. The lines correspond to different orders of clustering, starting with the lowest in amplitude, the twopoint volumeaveraged correlation function, and moving through to the fourpoint function, at which we stop plotting the results for clarity although the trends shown continue up to sixth order. The solid curves show the correlation functions measured from the full volumelimited samples, as shown previously in Fig. 6, and the dashed lines show the results when the regions containing the superclusters are excluded from the CPDF. The higherorder correlation functions are systematically boosted on intermediate and large scales when the superclusters are included in the analysis. The precise scale on which the correlation functions become sensitive to the presence of the superclusters depends upon the order; for the fourpoint function, the two estimates of the correlation function typically deviate for cells of radius 3 h^{−1} Mpc and larger.
The impact on the hierarchical amplitudes, S_{p}, of removing the superclusters is shown in Fig. 9, in which we plot the results for the volumelimited sample defined by −19 > M−5 log_{10}h > −20. In Fig. 9, the open points show the hierarchical amplitudes measured from the full volumelimited sample. The filled symbols show the results obtained from the same volume but with the supercluster regions masked out. The S_{p} obtained when the two superclusters are removed from the analysis are much closer to being independent of cell size. The sensitivity of higherorders to rare peaks has been noticed in earlier analyses of galaxy surveys (Groth & Peebles 1977; Gaztañaga 1992; Bouchet et al. 1993; Lahav et al. 1993; Gaztañaga 1994; Hoyle et al. 2000).
Interpretation and the Implications for Galaxy Bias
In this section we quantify how the hierarchical amplitudes scale with galaxy luminosity and discuss the implications of our results for simple models of galaxy bias. We first test the hypothesis set out in Section 2.4 that the variation in clustering with luminosity apparent in Fig. 3 can be described by a single, relative bias factor, as defined by equation (14). The relative bias factors, b_{r}, computed from the variance and the deviation from the linear bias model, as quantified by c′_{2} (equation 17), are listed in Table 2; here the mean value is given by the best χ^{2} fit over all cell radii. The change in the amplitude of the relative bias with sample luminosity, shown in Fig. 4, is in excellent agreement with the trend found by Norberg et al. (2001), who analysed the projected spatial clustering of 2dFGRS galaxies. This agreement is remarkable given the different approaches used to measure the twopoint correlations and the fact that the analysis in this paper is in redshift space, whereas the study carried out by Norberg et al. was unaffected by peculiar motions.
The coefficients c′_{2} are different from zero at a 1σ level. These findings are consistent with a small deviation from the linear biasing model (at a 2σ level for the brighter samples). This is in qualitative agreement with the estimation of c_{2} using the the bispectrum (Scoccimarro 2000; Verde et al. 2002) or the threepoint function measured from the parent APM galaxy survey (Frieman & Gaztañaga 1999).
The variation of the hierarchical amplitudes with luminosity is plotted in Fig. 10. Each panel corresponds to a different order p. The filled points show the hierarchical amplitudes averaged over the different cell radii employed (these values and the associated errors are given in Table 2). The dotted line shows the hierarchical amplitudes predicted by the linear relative bias model (equation 15), using the bestfitting bias factors stated in Table 2. This model gives a rough approximation to the data. However, the observed variation of S_{p} with luminosity is somewhat better described by a linear fit in the logarithm of luminosity, as shown by the solid lines. This implies that the dependence of the hierarchical amplitudes on luminosity is more complicated than expected in the simple relative bias model of equation (15) (as does the fact that we find some evidence for nonzero values for c′_{2}). The solid lines show the best linear fitting to the hierarchical amplitudes as a function of the logarithm of the median luminosity of the samples, i.e.
We find a greater than 2σ (Δχ^{2} > 7.2 for two parameters) detection of a nonzero value for B_{3}. However, for p > 3, the constraints on B_{p} are much weaker and there is no clear evidence for a luminosity dependence in the S_{p} values in these cases. For completeness, the bestfitting values for each order are (A_{3}, B_{3}) = (2.07, −0.40), (A_{4}, B_{4}) = (6.15, −2.51), (A_{5}, B_{5}) = (21.3, −13.5) and (A_{6}, B_{6}) = (58, −39).Conclusions
In this paper we have measured the higherorder correlation functions of galaxies in volumelimited samples drawn from the 2dFGRS. The most recent comparable work is the analysis of the StromloAPM and United Kingdom Schmidt Telescope (UKST) redshift surveys by Hoyle et al. (2000). These authors also considered volumelimited subsamples drawn from the fluxlimited redshift survey. The largest UKST sample considered by Hoyle et al. contained 500 galaxies and covered a volume of 9 × 10^{5}h^{−3} Mpc^{3}; the reference sample used in our work contains 90 times this number of galaxies and covers ten times the volume. In our analysis, we can follow the variation of clustering over more than a decade in luminosity, whereas Hoyle et al. had to focus their attention around L_{*}.
The measurement of the higherorder galaxy correlation functions is still challenging, however. In spite of the order of magnitude increase in size that the 2dFGRS represents over previously completed surveys, we have found that the higherorder moments that we measure are somewhat sensitive to the presence of large structures. In particular, there are two superclusters that influence our measurements, one in the SGP region and the other in the NGP. These structures contain a sizeable fraction of the cluster mass groups in the 2dFGRS (Eke et al. 2004). The inclusion of these structures has an impact on our estimates of the threepoint and higherorder volumeaveraged correlation functions on scales around 4–10 h^{−1} Mpc and above, depending on the order of the correlation function. For this reason, we have presented measurements of the higherorder correlation functions both with and without these structures. We stress that the removal of these superclusters should not be considered a correction to the full catalogue results, but rather as an indication of the impact of rare structures on our results for the higherorder moments. On the other hand, the upturn that we find in the values of the hierarchical amplitudes on large scales is predicted by some structure formation models; for example models with nonGaussian initial density fields predict a similar form for the S_{p} values as we measure from the full volumelimited samples (Gaztañaga & Maehoenen 1996; Gaztañaga & Bernardeau 1998; Bernardeau et al. 2002).
The difficulties in estimating S_{p} values on large, quasilinear scales (>10 h^{−1} Mpc), prevent a direct comparison with perturbation theory (see Bernardeau et al. 2002). The current best estimates on these scales are still those measured from the angular APM Galaxy Survey (Gaztañaga 1994; Szapudi et al. 1995; Szapudi & Gaztañaga 1998). At the time of writing, the results from the Sloan Digital Sky Survey (SDSS) Early Data Release are still limited to small scales (Gaztañaga 2002; Szapudi et al. 2002). Despite being unable to make a robust measurement of the higherorder correlation functions on the very large scales for which weakly nonlinear perturbation theory is applicable, we are still able to reach a number of interesting conclusions as follows.

(i)
We have demonstrated that the higherorder galaxy correlation functions measured from the 2dFGRS follow a hierarchical scaling. Baugh et al. (2004) showed that L_{*} galaxies display higherorder correlation functions that scale in a hierarchical fashion; we have extended these authors' analysis to cover a wide range of galaxy luminosity. The higherorder moments of the galaxy count distribution are proportional to the variance raised to a power that depends upon the order of the correlation function under consideration. This behaviour holds on physical scales ranging from those on which we expect the underlying density fluctuations to be strongly nonlinear all the way through to quasilinear scales. This scaling has been tested up to the sixpoint correlation function for the first time using a redshift survey. This confirms the conclusions of a complementary analysis carried out by Croton et al. (2004), who found hierarchical scaling when measuring the reduced void probability function of the 2dFGRS.

(ii)
We have estimated values of the hierarchical amplitudes, , for cells of different radii. The hierarchical amplitudes are approximately constant on small to medium scales (depending on the order considered), while for the larger volumes, S_{p} seem to increase with radius at large scales. Although this could in principle result from a boundary or mask effect (e.g. see Szapudi & Gaztañaga 1998; Bernardeau et al. 2002), we have shown with mock catalogues that this is not the case here (e.g. see Fig. 1). If the two most massive superclusters in the survey are removed from the analysis, the hierarchical amplitudes are remarkably independent over all scales. That the S_{p} are roughly constant on small scales, with smaller amplitudes than in real space (e.g. Gaztañaga 1994), has been noted before for measurements in redshift space. It arises due to a cancellation of the enhanced signal on small scales in real space by a damping of clustering in redshift space due to peculiar motions (Lahav et al. 1993; Fry & Gaztañaga 1994; Hivon et al. 1995; Hoyle et al. 2000; Bernardeau et al. 2002).

(iii)
We find that the amplitude of the higherorder correlation functions scales with luminosity. The magnitude of the luminosity segregation increases with the order of the correlation (see Fig. 4). For the variance, , the strength of the trend is in very good agreement with that reported by Norberg et al. (2001), but note that these authors measured the luminosity segregation in real space, whereas our results are in redshift space. The strength of the luminosity segregation for higherorders can be mostly explained as the result of hierarchical scaling , so that most of the effect can be attributed to luminosity segregation in the variance. This can be seen in Fig. 5 where data from different luminosities trace out the same hierarchical curve with little scatter.

(iv)
We find some evidence for a residual dependence of S_{p} on luminosity, although the effect is only significant within the errors for the skewness p= 3 (greater than 2σ level). It is not clear whether this is driven by a pure luminosity dependence of the higherorder clustering or by a change in the galaxy mix with luminosity, with different galaxy types having different S_{p} values or by a combination of the two effects: see Norberg et al. (2002a) for an investigation of this point for the twopoint correlation function. A simple linear relative bias model (dotted line in Fig. 10) does not reproduce the dependence of the S_{p} on luminosity.
We have interpreted our results in terms of a simple, local bias model, and we have quantified trends in clustering amplitude with luminosity by estimating relative bias factors. These measurements, summarized in Table 2, extend the constraints upon models of galaxy formation derived from the twopoint correlation function, quantifying the shape of the tails of the count probability distribution as well as its width.
Acknowledgments
The 2dFGRS was undertaken using the twodegree field spectrograph on the AngloAustralian Telescope. We acknowledge the efforts of all those responsible for the smooth running of this facility during the course of the survey, and also the indulgence of the time allocation committees. Many thanks go to Saleem Zaroubi and Simon White for useful discussions and theoretical insight. DC acknowledges the financial support of the International Max Planck Research School in Astrophysics Ph.D. fellowship, under which this work was carried out. EG acknowledges support from the Spanish Ministerio de Ciencia y Tecnologia, project No. AYA200200850 and EC FEDER funding. CMB is supported by a Royal Society University Research Fellowship. PN acknowledges the financial support through a Zwicky Fellowship at ETH, Zurich.