## Abstract

We present a detailed analysis of the two-point correlation function, ξ(σ, π), from the 2dF Galaxy Redshift Survey (2dFGRS). The large size of the catalogue, which contains ∼220 000 redshifts, allows us to make high-precision measurements of various properties of the galaxy clustering pattern. The effective redshift at which our estimates are made is *z*_{s}≈ 0.15, and the effective luminosity is *L*_{s}≈ 1.4*L**. We estimate the redshift-space correlation function, ξ(*s*), from which we measure the redshift-space clustering length, *s*_{0}= 6.82 ± 0.28 *h*^{−1} Mpc. We also estimate the projected correlation function, Ξ(σ), and the real-space correlation function, ξ(*r*), which can be fit by a power law (*r*/*r*_{0})^{−γ_{r}}, with *r*_{0}= 5.05 ± 0.26 *h*^{−1} Mpc, γ_{r}= 1.67 ± 0.03. For *r*≳ 20 *h*^{−1} Mpc, ξ drops below a power law as, for instance, is expected in the popular Λ cold dark matter model. The ratio of amplitudes of the real- and redshift-space correlation functions on scales of 8–30 *h*^{−1} Mpc gives an estimate of the redshift-space distortion parameter β. The quadrupole moment of ξ(σ, π) on scales 30–40 *h*^{−1} Mpc provides another estimate of β. We also estimate the distribution function of pairwise peculiar velocities, *ƒ*(*v*), including rigorously the significant effect due to the infall velocities, and we find that the distribution is well fit by an exponential form. The accuracy of our ξ(σ, π) measurement is sufficient to constrain a model, which simultaneously fits the shape and amplitude of ξ(*r*) and the two redshift-space distortion effects parametrized by β and velocity dispersion, *a*. We find β= 0.49 ± 0.09 and *a*= 506 ± 52 km s^{−1}, although the best-fitting values are strongly correlated. We measure the variation of the peculiar velocity dispersion with projected separation, *a*(σ), and find that the shape is consistent with models and simulations.
This is the first time that β and *ƒ*(*v*) have been estimated from a self-consistent model of galaxy velocities. Using the constraints on bias from recent estimates, and taking account of redshift evolution, we conclude that β (*L*=*L**, *z*= 0) = 0.47 ± 0.08, and that the present-day matter density of the Universe, Ω_{m}≈ 0.3, consistent with other 2dFGRS estimates and independent analyses.

## Introduction

The galaxy two-point correlation function, ξ, is a fundamental statistic of the galaxy distribution, and is relatively straightforward to calculate from observational data. Because the clustering of galaxies is determined by the initial mass fluctuations and their evolution, measurements of ξ set constraints on the initial mass fluctuations and their evolution. The astrophysics of galaxy formation introduces uncertainties, but there is now good evidence that galaxies do trace the underlying mass distribution on large scales.

In this paper we analyse the distribution of ∼220 000 galaxies in the two-degree Field Galaxy Redshift Survey (2dFGRS; Colless et al. 2001). A brief summary of the data is presented in Section 2. Much of our error analysis makes use of mock galaxy catalogues generated from *N*-body simulations which are also discussed in Section 2.

The two-dimensional (2D) measurement ξ(σ, π), where σ is the pair separation perpendicular to the line of sight and π is the pair separation parallel to the line of sight, provides information about the real-space correlation function, the small-scale velocity distribution, and the systematic gravitational infall into overdense regions. The spherical average of ξ(σ, π) gives an estimate of the redshift-space correlation function, ξ(*s*), where $s = \sqrt{\pi^2 + \sigma^2}$, so the galaxy separations are calculated assuming that redshift gives a direct measure of distance, ignoring the effects of peculiar velocities. Integrating ξ(σ, π) along the line of sight sums over any peculiar velocity distributions, and so is unaffected by any redshift-space effects. The resulting projected correlation function, Ξ(σ), is directly related to the real-space correlation function. Our estimates of ξ(σ, π), ξ(*s*), Ξ(σ) and ξ(*r*) are presented in Section 3. These statistics have been measured from many smaller redshift surveys (e.g. Davis & Peebles 1983; Loveday et al. 1992; Jing, Mo & Börner 1998; Hawkins et al. 2001; Zehavi et al. 2002), but because they sample smaller volumes, there is a large cosmic variance on the results. The large volume sampled by the 2dFGRS leads to significantly more reliable estimates. A preliminary analysis was performed on the 2dFGRS by Peacock et al. (2001) but we now have a far more uniform sample and twice as many galaxies. Madgwick et al. (2003) have measured these statistics for spectral-type subsamples of the 2dFGRS.

Peculiar velocities of galaxies lead to systematic differences between redshift-space and real-space measurements, and we can consider the effects in terms of a combination of large-scale coherent flows induced by the gravity of large-scale structures, and a small-scale random peculiar velocity of each galaxy (e.g. Marzke et al. 1995; Jing et al. 1998). The large-scale flows compress the contours of ξ(σ, π) along the π direction, as described by Kaiser (1987) and Hamilton (1992). The amplitude of the distortion depends on the mean density of the Universe, Ω_{m}, and on how the mass distribution is clustered relative to galaxies, which can be parametrized in terms of a linear bias *b*, defined so that δ_{g}=*b*δ_{m}, where δ represents fluctuations in the density field. The random component of peculiar velocity for each galaxy means that the observed ξ(σ, π) is convolved in the π coordinate with the pairwise distribution of random velocities. In Section 4 we describe the construction of a model ξ(σ, π) from these assumptions about redshift-space distortions and also the shape of the correlation function.

In Section 5 we use the *Q* statistic (Hamilton 1992) based on the quadrupole moment of ξ(σ, π) to estimate the parameter β≈Ω^{0.6}_{m}/*b*. In the absence of the small-scale random velocities, the shape of ξ(σ, π) contours on large scales is directly related to the parameter β. A similar estimate of β is provided by the ratio of amplitudes of ξ(*s*) to ξ(*r*) and this is also presented in Section 5.

In Section 6, we use the method of Landy, Szalay & Broadhurst (1998, hereafter LSB98) to estimate the distribution of peculiar velocities. This technique ignores the effect of large-scale distortions and uses the Fourier transform of ξ(σ, π) to estimate the distribution of peculiar velocities, *ƒ*(*v*). The large sample volume of the 2dFGRS makes our measurements more reliable than previous estimates in the same way as for the correlation functions mentioned earlier.

These two approaches provide reasonable estimates of β and *ƒ*(*v*) so long as the distortions at small and large scales are completely decoupled. This is not the case for real data, and so we have fitted models which simultaneously include the effects of both β and *ƒ*(*v*). The resulting best-fitting parameters are the most self-consistent estimates. Previous data sets have lacked the signal-to-noise to allow a reliable multiparameter fit in this way. Our fitting procedure and results are described in Section 7.

In Section 8, we examine the luminosity and redshift dependence of β and combine our results with estimates of *b* (Verde et al. 2002; Lahav et al. 2002) to estimate Ω_{m}, and compare this with other recent analyses.

In Section 9, we summarize our main conclusions. When converting from redshift to distance we assume the Universe has a flat geometry with Ω_{Λ}= 0.7, Ω_{m}= 0.3 and *H*_{0}= 100 *h* km s^{−1} Mpc^{−1}, so that all scales are in units of *h*^{−1} Mpc.

## The Data

### The 2dFGRS data

The 2dFGRS is selected in the photometric *b*_{J} band from the Automated Plate Measuring (APM) galaxy survey (Maddox, Efstathiou & Sutherland 1990) and its subsequent extensions (Maddox et al., in preparation). The bulk of the solid angle of the survey is made up of two broad strips, one in the South Galactic Pole (SGP) region and the other in the direction of the North Galactic Pole (NGP). In addition to these contiguous regions, there are a number of circular two-degree fields scattered randomly over the full extent of the low extinction regions of the southern APM galaxy survey.

The magnitude limit at the start of the survey was set at *b*_{J}= 19.45 but both the photometry of the input catalogue and the dust extinction map have been revised since and so there are small variations in magnitude limit as a function of position over the sky. The effective median magnitude limit, over the area of the survey, is *b*_{J}≈ 19.3 (Colless et al. 2001).

The completeness of the survey data varies according to the position on the sky because of unobserved fields (mostly around the survey edges), unfibred objects in observed fields (due to collision constraints or broken fibres) and observed objects with poor spectra. The variation in completeness is mapped out using a completeness mask (Colless et al. 2001; Norberg et al. 2002a) which is shown in Fig. 1 for the data used in this paper.

We use the data obtained prior to 2002 May, which is virtually the completed survey. This includes 221 283 unique, reliable galaxy redshifts (quality flag ≥3; Colless et al. 2001). We analyse a magnitude-limited sample with redshift limits *z*_{min}= 0.01 and *z*_{max}= 0.20, and no redshifts are used from a field with <70 per cent completeness. The median redshift is *z*_{med}≈ 0.11. The random fields, which contain nearly 25 000 reliable redshifts are not included in this analysis. After the cuts for redshift, completeness and quality we are left with 165 659 galaxies in total, 95 929 in the SGP and 69 730 in the NGP. These data cover an area, weighted by the completeness shown in Fig. 1, of 647 deg^{2} in the SGP and 446 deg^{2} in the NGP, to the magnitude limit of the survey.

In all of the following analysis we consider the NGP and SGP as independent data sets. Treating the NGP and SGP as two independent regions of the sky gives two estimates for each statistic, and so provides a good test of the error bars we derive from mock catalogues (see below). We have also combined the two measurements to produce our overall best estimate by simply adding the pair counts from the NGP and SGP. The optimal weighting of the two estimates depends on the relative volumes surveyed in the NGP and SGP, but because these are comparable, a simple sum is close to the optimal combination.

It is important to estimate the effective redshift at which all our statistics are calculated. As ξ is based on counting pairs of galaxies the effective redshift is not the median, but a pair-weighted measure. The tail of high redshift galaxies pushes this effective redshift to *z*_{s}≈ 0.15. Similarly the effective magnitude of the sample we analyse is *M*_{s}− 5 log *h*≈−20.0, corresponding to *L*_{s}≈ 1.4*L** (using *M**− 5 log *h*=−19.66; Norberg et al. 2002a).
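As a quick check of the quoted effective luminosity, the standard magnitude-luminosity relation can be applied to the two absolute magnitudes given above. This is a worked illustration only, and the rounding is ours:

$$\frac{L_s}{L^*} = 10^{\,0.4\,[(M^* - 5\log h) - (M_s - 5\log h)]} = 10^{\,0.4\,(-19.66 + 20.0)} = 10^{\,0.136} \approx 1.37,$$

which rounds to the quoted *L*_{s}≈ 1.4*L**.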

### Mock catalogues

For each of the NGP and SGP regions, 22 mock catalogues were generated from the Λ cold dark matter (CDM) Hubble Volume simulation (Evrard et al. 2002) using the techniques described in Cole et al. (1998), and are designed to have a similar clustering signal to the 2dFGRS. A summary of the construction methods is presented here but for more details see Norberg et al. (2002a) and Baugh et al. (in preparation).

These simulations used an initial dark matter power spectrum appropriate to a flat ΛCDM model with Ω_{m}= 0.3 and Ω_{Λ}= 0.7. The dark-matter evolution was followed up to the present day and then a bias scheme (model 2 of Cole et al. 1998, with a smoothing length, *R*_{S}= 2 *h*^{−1} Mpc) was used to identify galaxies from the dark-matter haloes. The bias scheme used has two free parameters which are adjusted to match the mean slope and amplitude of the correlation function on scales greater than a few megaparsec. On scales smaller than the smoothing length there is little control over the form of the clustering, but in reality the methods employed work reasonably well (see later sections).

The resulting catalogues have a bias scheme which asymptotes to a constant on large scales, giving β= 0.47, but is scale-dependent on small scales. Apparent magnitudes were assigned to the galaxies consistent with their redshift, the assumed Schechter luminosity function and the magnitude limit of the survey. The Schechter function has essentially the same parameters as in the real data (see Norberg et al. 2002a). Then the completeness mask and variable apparent magnitude limits were applied to the mock catalogues to reproduce catalogues similar to the real data.

In the analysis which follows, we make use of the real- and redshift-space correlation functions from the full Hubble Volume simulation. These correlation functions are determined from a Fourier transform of the power spectrum of the full Hubble Volume cube using the real- and redshift-space positions of the mass particles respectively, along with the bias scheme outlined above. This allows us to compare our mock catalogue results with those of the simulation from which they are drawn to ensure we can reproduce the correct parameters. It also allows us to compare and contrast the results from the real Universe with a large numerical simulation.

### Error estimates

We analyse each of the mock catalogues in the same way as the real data, so that we have 22 mock measurements for every measurement that we make on the real data. The standard deviation between the 22 mock measurements gives a robust estimate of the uncertainty on the real data. We use this approach to estimate the uncertainties for direct measurements from the data, such as the individual points in the correlation function, and for best-fitting parameters such as *s*_{0}.

When fitting parameters we use this standard deviation as a weight for each data point and perform a minimum χ^{2} analysis to obtain the best-fitting parameter. The errors that we quote for any particular parameter are the rms spread between the 22 best-fitting parameters obtained in the same way from the mock catalogues. This simple way of estimating the uncertainties avoids the complications of dealing directly with correlated errors in measured data points, while still providing an unbiased estimate of the real uncertainties in the data, including the effects of correlated errors.
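The procedure described above can be sketched in a few lines. This is a minimal illustration on synthetic numbers, not the survey pipeline: the one-parameter amplitude model, the function names `mock_scatter` and `fit_amplitude`, and the use of the sample standard deviation (`ddof=1`) are all assumptions made for the example.

```python
import numpy as np

def mock_scatter(mock_values):
    """Standard deviation across mocks, one value per data point."""
    return np.asarray(mock_values, dtype=float).std(axis=0, ddof=1)

def fit_amplitude(data, template, sigma):
    """Minimum-chi^2 amplitude A of the model A * template (diagonal weights only)."""
    w = 1.0 / np.asarray(sigma, dtype=float) ** 2
    return float(np.sum(w * data * template) / np.sum(w * template ** 2))

# Synthetic stand-ins: 22 "mocks" of a 5-point measurement (not survey data).
rng = np.random.default_rng(0)
template = np.array([8.0, 4.0, 2.0, 1.0, 0.5])
mocks = template + rng.normal(0.0, 0.1 * template, size=(22, 5))
data = template + rng.normal(0.0, 0.1 * template)

sigma = mock_scatter(mocks)                    # per-point uncertainty from the mocks
best = fit_amplitude(data, template, sigma)    # best fit for the "data"
mock_best = [fit_amplitude(m, template, sigma) for m in mocks]
best_err = float(np.std(mock_best, ddof=1))    # spread of mock best fits = quoted error
print(f"A = {best:.3f} +/- {best_err:.3f}")
```

The point of the design is that the covariance between data points never has to be modelled explicitly: it is carried implicitly by the spread of the 22 mock best fits.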

Although this approach gives reliable estimates of the uncertainties, the simple weighting scheme is not necessarily optimal in the presence of correlated errors. Nevertheless, for all statistics that we consider, we find that the means of the mock estimates agree well with the values input to the parent simulations. Also, we have applied the technique described by Madgwick et al. (2003) to decorrelate the errors for the projected correlation function, using the covariance matrix estimated from the mock catalogues. We found a 0.1σ difference between the best-fitting values using the two methods. So, we are confident that our measurements and uncertainty estimates are robust and unbiased.

## Estimates of The Correlation Function

The two-point correlation function, ξ, is measured by comparing the actual galaxy distribution to a catalogue of randomly distributed galaxies. These randomly distributed galaxies are subject to the same redshift, magnitude and mask constraints as the real data and we modulate the surface density of points in the random catalogue to follow the completeness variations. We count the pairs in bins of separation along the line of sight, π, and across the line of sight, σ, to estimate ξ(σ, π). Spherically averaging these pair counts provides the redshift-space correlation function ξ(*s*). Finally, we estimate the projected function Ξ(σ) by integrating over all velocity separations along the line of sight and invert it to obtain ξ(*r*).

### Constructing a random catalogue

To reduce shot noise we compare the data with a random catalogue containing 10 times as many points as the real catalogue. This random catalogue needs to have a smooth selection function matching the *N*(*z*) of the real data. We use the 2dFGRS luminosity function (Norberg et al. 2002a) with *M**− 5 log *h*=−19.66 and α=−1.21 to generate the selection function, following the change in the survey magnitude limit across the sky. When analysing the mock catalogues, we use the input luminosity function to generate the selection function, and hence random catalogues.

As an alternative method, we also fitted an analytical form for the selection function (Baugh & Efstathiou 1993) to the data, and generated random catalogues using that selection function. We have calculated all of our statistics using both approaches, and found that they gave essentially identical results for the data. When analysing the mock catalogues, we found that the luminosity function method was more robust to the presence of large-scale features in the *N*(*z*) data. Thus, all of our quoted results are based on random catalogues generated using the luminosity function.

### Fibre collisions

The design of the 2dF instrument means that fibres cannot be placed closer than approximately 30 arcsec (Lewis et al. 2002), and so both members of a close pair of galaxies cannot be targeted in a single fibre configuration. Fortunately, the arrangement of 2dFGRS tiles means that not all close pairs are lost from the survey. Neighbouring tiles have significant areas of overlap, and so much of the sky is targeted more than once. This allows us to target both galaxies in some close pairs. Nevertheless, the survey misses a large fraction of close pairs. It is important to assess the impact of this omission on the measurement of galaxy clustering and to investigate schemes that can compensate for the loss of close pairs.

To quantify the effect of these so-called ‘fibre collisions’ we have calculated the angular correlation function for galaxies in the 2dFGRS parent catalogue, *w*_{p}(θ), and for galaxies with redshifts used in our ξ analysis, *w*_{z}(θ). We used the same mask to determine the angular selection and apparent magnitude limit for each sample as in Fig. 1. Note that the mask is used only to define the area of analysis, and the actual redshift completeness values are not used in the calculation of *w*. In our ξ analyses we impose redshift limits 0.01 < *z* < 0.2, which means that the mean redshift of the redshift sample is lower than that of the parent sample. We used the equation of Limber (1954) to calculate the scalefactors in amplitude and angular scale needed to account for the different redshift distributions. The solid line in Fig. 2 shows *w*_{p}, and the filled circles show *w*_{z} after applying the Limber scalefactors. The error bars in Fig. 2 show *w*(θ) from the full APM survey (Maddox, Efstathiou & Sutherland 1996), also scaled to the magnitude limit of the 2dFGRS parent sample. On large angular scales all three measurements are consistent. On smaller scales *w*_{z} is clearly much lower than *w*_{p}, showing that the fibre-collision effect becomes significant and cannot be neglected.

The ratio of galaxy pairs counted in the parent and redshift samples is given by (1 +*w*_{p})/(1 +*w _{z}*), which is shown by the filled circles in the lower panel of Fig. 2. As discussed in the next section, we use this ratio to correct the pair counts in the ξ analysis.
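A minimal sketch of how such a ratio could be turned into a pair weight is given below. The tabulated values are invented for illustration and are not the measurements of Fig. 2; the function name `collision_weight` and the choice to interpolate linearly in log θ are assumptions.

```python
import numpy as np

def collision_weight(theta, theta_grid, w_parent, w_redshift):
    """Pair weight (1 + w_p) / (1 + w_z), interpolated in log(theta)."""
    ratio = (1.0 + np.asarray(w_parent, dtype=float)) / (
        1.0 + np.asarray(w_redshift, dtype=float)
    )
    return np.interp(np.log(theta), np.log(theta_grid), ratio)

# Made-up tabulated correlation functions (theta in degrees), for illustration only.
theta_grid = np.array([0.001, 0.01, 0.1, 1.0])
w_p = np.array([3.0, 1.0, 0.2, 0.02])   # hypothetical parent-catalogue w(theta)
w_z = np.array([1.0, 0.6, 0.2, 0.02])   # hypothetical redshift-sample w(theta)

print(collision_weight(np.array([0.001, 0.01, 1.0]), theta_grid, w_p, w_z))
```

Where the two angular correlation functions agree the weight is unity, so only close pairs are up-weighted.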

### Weighting

Each galaxy and random galaxy is given a weighting factor depending on its redshift and position on the sky. The redshift-dependent part of the weight is designed to minimize the variance on the estimated ξ (Efstathiou 1988; Loveday et al. 1995), and is given by 1/(1 + 4π*n*(*z*_{i})*J*_{3}(*s*)), where *n*(*z*) is the density distribution and $J_3(s) = \int_0^s \xi(s')\,s'^2\,\mathrm{d}s'$. We use *n*(*z*) from the random catalogue to ensure that the weights vary smoothly with redshift. We find that our results are insensitive to the precise form of *J*_{3}, but we derived it using a power law ξ with *s*_{0}= 13.0 and γ_{s}= 0.75 and a maximum value of *J*_{3}= 400. This corresponds to the best-fitting power law over the range 0.1 < *s* < 3 *h*^{−1} Mpc with a cut-off at larger scales.
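The weight can be sketched as follows, assuming the standard definition J3(s) = ∫ ξ(s′) s′² ds′ from 0 to s, which for a power law ξ = (s/s0)^(−γ) integrates to s0^γ s^(3−γ)/(3−γ). Treating the quoted maximum as a hard cap is our reading of the text, and the function names are hypothetical.

```python
import numpy as np

S0 = 13.0        # power-law scale quoted in the text (h^-1 Mpc)
GAMMA_S = 0.75   # power-law slope quoted in the text
J3_MAX = 400.0   # maximum J3 quoted in the text (applied here as a hard cap)

def j3(s):
    """J3(s) for xi = (s / S0)**(-GAMMA_S), capped at J3_MAX."""
    s = np.asarray(s, dtype=float)
    value = S0 ** GAMMA_S * s ** (3.0 - GAMMA_S) / (3.0 - GAMMA_S)
    return np.minimum(value, J3_MAX)

def redshift_weight(n_z, s):
    """Minimum-variance weight 1 / (1 + 4 pi n(z) J3(s))."""
    return 1.0 / (1.0 + 4.0 * np.pi * np.asarray(n_z, dtype=float) * j3(s))

# Example: weights fall as the local density n(z) rises (n in h^3 Mpc^-3, illustrative values).
print(redshift_weight(np.array([1e-4, 1e-3, 1e-2]), s=5.0))
```

Galaxies in dense, nearby parts of the sample are down-weighted so that they do not dominate the pair counts.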

We also use the weighting scheme to correct for the galaxies that are not observed due to the fibre collisions. Each galaxy–galaxy pair is weighted by the ratio *w*_{f}= (1 +*w*_{p})/(1 +*w*_{z}) at the relevant angular separation according to the curve plotted in the bottom panel of Fig. 2. This corrects the observed pair count to what would have been counted in the parent catalogue. The open points in Fig. 2, which have the collision correction applied, show that this method can correctly recover the parent catalogue result and hence overcome the fibre-collision problem. Because the random catalogues do not have any close-pair constraints, only the galaxy–galaxy pair count needs correcting in this way. We also tried an alternative approach to the fibre-collision correction that we used previously in Norberg et al. (2001, 2002b) where the weight for each unobserved galaxy was assigned equally to its 10 nearest neighbours. This produced similar results on large angular scales, but did not help on smaller scales. All of our results are presented using the *w*_{f} weighting scheme. Hence each galaxy, *i*, is weighted by the factor

$$w_i = \frac{1}{1 + 4\pi n(z_i)\,J_3(s)}, \tag{1}$$

and each galaxy–galaxy pair *i*, *j* is given a weight *w*_{f}*w*_{i}*w*_{j}, whereas each galaxy–random and random–random pair is given a weight *w*_{i}*w*_{j}.

### The two-point correlation function, *ξ(σ, π)*

We use the ξ estimator of Landy & Szalay (1993),

$$\xi(\sigma, \pi) = \frac{DD - 2DR + RR}{RR}, \tag{2}$$

where *DD* is the normalized sum of weights of galaxy–galaxy pairs with particular (σ, π) separation, *RR* is the normalized sum of weights of random–random pairs with the same separation in the random catalogue and *DR* is the normalized sum of weights of galaxy–random pairs with the same separation. To normalize the pair counts we ensure that the sum of weights of the random catalogue equals the sum of weights of the real galaxy catalogue, as a function of scale. We find that other estimators (e.g. Hamilton 1993) give similar results.
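The Landy & Szalay estimator, ξ = (DD − 2DR + RR)/RR, can be sketched on pre-binned weighted pair counts as below. The normalization here simply rescales the random weights by a single factor so that their total matches the data total; this is a simplification of the scale-dependent normalization described above, and the function name and inputs are assumptions.

```python
import numpy as np

def landy_szalay(dd, dr, rr, sum_w_data, sum_w_random):
    """xi = (DD - 2 DR + RR) / RR from weighted pair counts in (sigma, pi) bins.

    dd, dr, rr are raw weighted pair sums; the random weights are rescaled
    so that the total random weight equals the total data weight.
    """
    dd = np.asarray(dd, dtype=float)
    f = sum_w_data / sum_w_random          # rescaling of each random weight
    dr_n = f * np.asarray(dr, dtype=float)
    rr_n = f ** 2 * np.asarray(rr, dtype=float)
    with np.errstate(divide="ignore", invalid="ignore"):
        xi = (dd - 2.0 * dr_n + rr_n) / rr_n
    return np.where(rr_n > 0.0, xi, np.nan)  # empty bins are undefined

# Toy counts with a random catalogue carrying 10 times the data weight.
print(landy_szalay([2.0, 4.0], [10.0, 40.0], [100.0, 400.0], 1.0, 10.0))
```

Bins with no random–random pairs are returned as NaN rather than silently set to zero.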

The *N*(*z*) distributions for the data and random catalogues (scaled so that the area under the curve is the same as for the observed data) are shown in Fig. 3. It is clear that *N*(*z*) for the random catalogues are a reasonably smooth fit to *N*(*z*) for the data. Norberg et al. (2002a) have shown that large ‘spikes’ in the *N*(*z*) are common in the mock catalogues, and so similar features in the data redshift distributions indicate normal structure.

The resulting estimates of ξ(σ, π) calculated separately for the SGP and NGP catalogues are shown in Fig. 4, along with the combined result. The velocity distortions are clear at both small and large scales, and the signal-to-noise ratio is in general very high for σ and π values less than 20 *h*^{−1} Mpc; it is ≈6 in each 1 *h*^{−1} Mpc bin at *s*= 20 *h*^{−1} Mpc. At very large separations ξ(σ, π) becomes very close to zero, showing no evidence for features that could be attributed to systematic photometric errors.

We used an earlier version of the 2dFGRS catalogue to carry out a less detailed analysis of ξ(σ, π) (Peacock et al. 2001). The current redshift sample has about 1.4 times as many galaxies, although more importantly it is more contiguous, and the revised photometry has improved the uniformity of the sample. Nevertheless, our new results are very similar to our earlier analysis, demonstrating the robustness of our results. The current larger sample allows us to trace ξ out to larger scales with smaller uncertainties. Also, in our present analysis we analyse mock catalogues to obtain error estimates which are more precise than the previous error approximation (see Section 7.3).

### The redshift-space correlation function, ξ(*s*)

Averaging ξ(σ, π) at constant *s* gives the redshift-space correlation function, and our results for the NGP and SGP are plotted in Fig. 5 on both log and linear scales. The NGP and SGP measurements differ by about 2σ between 20 and 50 *h*^{−1} Mpc; we find one mock whose NGP and SGP measurements disagree by this much, and so it is probably not significant. We tried shifting *M** by 0.1 mag to better fit *N*(*z*) at *z* > 0.15 in the SGP, and this moved the data points by ∼0.2σ for 20 < *s* < 50 *h*^{−1} Mpc.

The redshift-space correlation function for the combined data is plotted in Fig. 6 in the top panel. It is clear that the measured ξ(*s*) is not at all well represented by a universal power law on all scales, but we do make an estimate of the true value of the redshift-space correlation length, *s*_{0}, by fitting a localized power law of the form

$$\xi(s) = \left(\frac{s}{s_0}\right)^{-\gamma_s}, \tag{3}$$

using a least-squares fit to log ξ as a function of log *s*, with two points either side of ξ(*s*) = 1. This also gives a value for the local redshift-space slope, γ_{s}. The best-fitting parameters for the separate poles and combined estimates are listed in Table 1. In the inset of Fig. 6 we can see, at a low amplitude, that ξ(*s*) becomes negative between 50 ≲ *s* ≲ 90 *h*^{−1} Mpc.
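A local power-law fit around ξ(*s*) = 1 can be sketched as a straight-line fit in log-log space. The example below uses an exact power law as input, so it only demonstrates the mechanics; the function name, the unweighted fit and the illustrative slope of 1.5 are assumptions, not values from Table 1.

```python
import numpy as np

def local_power_law(s, xi, n_side=2):
    """Fit xi = (s / s0)**(-gamma) to n_side points either side of xi = 1.

    Assumes xi is positive and decreasing through unity over the fitted bins.
    """
    s = np.asarray(s, dtype=float)
    xi = np.asarray(xi, dtype=float)
    below = np.nonzero(xi < 1.0)[0]
    if below.size == 0 or below[0] < n_side:
        raise ValueError("xi = 1 crossing not bracketed by enough points")
    k = below[0]                               # first bin with xi < 1
    sel = slice(k - n_side, k + n_side)
    slope, intercept = np.polyfit(np.log10(s[sel]), np.log10(xi[sel]), 1)
    gamma = -slope
    s0 = 10.0 ** (intercept / gamma)           # log xi = gamma * (log s0 - log s)
    return s0, gamma

# Exact power-law input: s0 from the abstract, slope chosen for illustration.
s = np.logspace(-1.0, 1.5, 20)
xi = (s / 6.82) ** -1.5
s0_fit, gamma_fit = local_power_law(s, xi)
print(s0_fit, gamma_fit)
```

Because only the bins adjacent to the crossing are used, the fitted slope is a local quantity and need not match a fit over a wider range of scales.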

In the bottom panel of Fig. 6 we examine the shape of ξ(*s*) more carefully. The points are the data divided by a small-scale power law fitted on scales 0.1 < *s* < 3 *h*^{−1} Mpc (dashed line). The data are remarkably close to the power-law fit for this limited range of scales, and follow a smooth break towards zero for 3 < *s* < 60 *h*^{−1} Mpc. The measurements from the Hubble Volume simulation are shown by the solid line, and it matches the data extremely well on scales *s* > 4 *h*^{−1} Mpc. On smaller scales, where the algorithm for placing galaxies in the simulation has little control over the clustering amplitude (as discussed in Section 2.2), there are discrepancies of the order of 50 per cent.

The mean ξ(*s*) determined from the mock catalogues agrees well with the true redshift-space correlation function from the full Hubble Volume. This provides a good check that our weighting scheme and random catalogues have not introduced any biases in the analysis.

### Redshift-space comparisons

Redshift-space correlation functions have been measured from many redshift surveys, but direct comparisons between different surveys are not straightforward because galaxy clustering depends on the spectral type and luminosity of galaxies (e.g. Guzzo et al. 2000; Norberg et al. 2002b; Madgwick et al. 2003). Direct comparisons can be made only between surveys that are based on similar galaxy selection criteria. The 2dFGRS is selected using pseudo-total magnitudes in the *b*_{J} band, and the three most similar surveys are the Stromlo-APM survey (SAPM; Loveday et al. 1992), the Durham United Kingdom Schmidt Telescope (UKST) survey (Ratcliffe et al. 1998) and the European Southern Observatory (ESO) Slice Project (ESP; Guzzo et al. 2000). The Las Campanas Redshift Survey (LCRS; Tucker et al. 1997; Jing et al. 1998) and the Sloan Digital Sky Survey (SDSS; Zehavi et al. 2002) are selected in the *R* band, but have a very large number of galaxies, and so are also interesting for comparisons.

The non-power-law shape of ξ(*s*) makes it difficult to compare different measurements of *s*_{0} and γ_{s}, because the values depend sensitively on the range of *s* used in the fitting procedure. In Fig. 7(a) we compare the ξ(*s*) measurements directly for the 2dFGRS, SAPM, Durham UKST and ESP surveys. Our estimate of ξ(*s*) is close to the mean of previous measurements, but the uncertainties are much smaller. Although we quote uncertainties that are similar in size to previous measurements, we have used the scatter between mock catalogues to estimate them, rather than the Poisson or boot-strap estimates that have been used before and which seriously underestimate the true uncertainties.

Fig. 7(b) shows the 2dFGRS measurements together with the LCRS and SDSS measurements. On scales *s*≳ 4 *h*^{−1} Mpc there appear to be no significant differences between the surveys, but for *s*≲ 2 *h*^{−1} Mpc the LCRS and SDSS have a higher amplitude than the 2dFGRS. This difference is likely to be caused by the different galaxy selection for the surveys, although the SDSS results shown are for the Early Data Release (EDR) and have larger errors than the 2dFGRS points. The 2dFGRS is selected using *b*_{J}, whereas the SDSS and LCRS are selected in red bands. Because the red (early-type) galaxies are more strongly clustered than blue (late-type) galaxies (e.g. Zehavi et al. 2002; and via spectral type, Norberg et al. 2002b), we should expect that ξ will be higher for red selected surveys than a blue selected survey. This issue is examined further in Madgwick et al. (2003).

### The projected correlation function, Ξ(σ)

The redshift-space correlation function differs significantly from the real-space correlation function because of redshift-space distortions (see Section 4). We can estimate the real-space correlation length, *r*_{0}, by first calculating the projected correlation function, Ξ(σ). This is related to ξ(σ, π) via the equation

$$\Xi(\sigma) = 2\int_0^{\infty} \xi(\sigma, \pi)\,\mathrm{d}\pi. \tag{4}$$

In practice we set the upper limit of the integral to π_{max}= 70 *h*^{−1} Mpc. The result is insensitive to this choice for π_{max}> 60 *h*^{−1} Mpc for our data. Because redshift-space distortions move galaxy pairs only in the π direction, and the integral represents a sum of pairs over all π values, Ξ(σ) is independent of redshift-space distortions. It is simple to show that Ξ(σ) is directly related to the real-space correlation function (Davis & Peebles 1983):

$$\frac{\Xi(\sigma)}{\sigma} = \frac{2}{\sigma}\int_{\sigma}^{\infty} \frac{r\,\xi(r)}{(r^2 - \sigma^2)^{1/2}}\,\mathrm{d}r. \tag{5}$$

If the real-space correlation function is a power law, this can be integrated analytically. We write ξ(*r*) = (*r*/*r*^{P}_{0})^{−γ^{P}_{r}}, where the ‘P’ superscripts refer to the ‘projected’ values, rather than the ‘inverted’ values which are calculated in Section 3.8 and denoted by ‘I’. With this notation we obtain

$$\frac{\Xi(\sigma)}{\sigma} = \left(\frac{r_0^{P}}{\sigma}\right)^{\gamma_r^{P}} \frac{\Gamma(1/2)\,\Gamma[(\gamma_r^{P} - 1)/2]}{\Gamma(\gamma_r^{P}/2)}. \tag{6}$$

The parameters γ^{P}_{r} and *r*^{P}_{0} can then be estimated from the measured Ξ(σ), giving an estimate of the real-space clustering independent of any peculiar motions.
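The analytic power-law result can be checked numerically. The sketch below compares the closed form Ξ(σ)/σ = (r0/σ)^γ Γ(1/2)Γ[(γ−1)/2]/Γ(γ/2) with a direct evaluation of the Davis & Peebles integral, using the substitution r = σ cosh t to remove the endpoint singularity. The function names and the integration grid are choices made for this example; the power-law parameters are the real-space values quoted in the abstract.

```python
import math
import numpy as np

def projected_power_law(sigma, r0, gamma):
    """Closed-form Xi(sigma)/sigma for xi(r) = (r / r0)**(-gamma)."""
    amp = math.gamma(0.5) * math.gamma(0.5 * (gamma - 1.0)) / math.gamma(0.5 * gamma)
    return (r0 / sigma) ** gamma * amp

def projected_numeric(sigma, xi_func, t_max=80.0, n=200001):
    """Xi(sigma)/sigma = 2 * int_0^inf cosh(t) xi(sigma cosh t) dt (trapezoid rule)."""
    t = np.linspace(0.0, t_max, n)
    f = np.cosh(t) * xi_func(sigma * np.cosh(t))
    dt = t[1] - t[0]
    return 2.0 * dt * (f.sum() - 0.5 * (f[0] + f[-1]))

r0, gamma = 5.05, 1.67      # real-space values quoted in the abstract
analytic = projected_power_law(2.0, r0, gamma)
numeric = projected_numeric(2.0, lambda r: (r / r0) ** -gamma)
print(analytic, numeric)
```

The same `projected_numeric` routine accepts any ξ(*r*), so it can also be used to see how a break from the power law changes Ξ(σ)/σ.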

The projected correlation functions for the NGP and SGP are shown in Fig. 8 and the combined data result is shown in Fig. 9. The best-fitting values of γ^{P}_{r} and *r*^{P}_{0} for 0.1 < σ < 12 *h*^{−1} Mpc are shown in Table 1. Over this range Ξ(σ)/σ is an accurate power law, but it steepens for σ > 12 *h*^{−1} Mpc. This deviation from power-law behaviour limits the scales that can be probed using this approach.

### The real-space correlation function, ξ(*r*)

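It is possible to estimate ξ(*r*) by directly inverting Ξ(σ) without making the assumption that it is a power law (Saunders, Rowan-Robinson & Lawrence 1992, hereafter S92). They recast equation (5) into the form

$$\xi(r) = -\frac{1}{\pi}\int_{r}^{\infty} \frac{\mathrm{d}\Xi(\sigma)/\mathrm{d}\sigma}{(\sigma^2 - r^2)^{1/2}}\,\mathrm{d}\sigma. \tag{7}$$

Assuming a step function for Ξ(σ) = Ξ_{i} in bins centred on σ_{i}, and interpolating between values,

$$\xi(\sigma_i) = -\frac{1}{\pi}\sum_{j \ge i} \frac{\Xi_{j+1} - \Xi_j}{\sigma_{j+1} - \sigma_j}\,\ln\!\left(\frac{\sigma_{j+1} + \sqrt{\sigma_{j+1}^2 - \sigma_i^2}}{\sigma_j + \sqrt{\sigma_j^2 - \sigma_i^2}}\right) \tag{8}$$

for *r* = σ_{i}. S92 suggest that their method is only good for scales *r* ≲ 30 *h*^{−1} Mpc in the QDOT survey because *r* becomes comparable to the maximum scale out to which they can estimate Ξ. We can test the reliability of our inversion of the 2dFGRS data using the mock catalogues.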
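The discrete inversion can be sketched as a sum over bins in which Ξ is linearly interpolated, so that its slope is constant within each bin and the kernel integrates to a logarithm. The test input below is an exact power-law Ξ(σ) on a fine grid extending to very large σ, which is an idealization; real data are truncated at a maximum scale, which is why the recovered ξ degrades at large *r*. The function name is hypothetical.

```python
import math
import numpy as np

def invert_projected(sigma, Xi):
    """Invert Xi(sigma) to xi(r = sigma_i) with piecewise-linear Xi.

    xi_i = -(1/pi) * sum_{j>=i} slope_j *
           ln[(s_{j+1} + sqrt(s_{j+1}^2 - s_i^2)) / (s_j + sqrt(s_j^2 - s_i^2))]
    Returns one value for every bin except the last.
    """
    sigma = np.asarray(sigma, dtype=float)
    Xi = np.asarray(Xi, dtype=float)
    slope = np.diff(Xi) / np.diff(sigma)
    xi = np.empty(len(sigma) - 1)
    for i in range(len(sigma) - 1):
        lo, hi = sigma[i:-1], sigma[i + 1:]
        log_term = np.log((hi + np.sqrt(hi ** 2 - sigma[i] ** 2))
                          / (lo + np.sqrt(lo ** 2 - sigma[i] ** 2)))
        xi[i] = -np.sum(slope[i:] * log_term) / np.pi
    return xi

# Idealized test: power-law Xi(sigma) built from the analytic projection.
r0, gamma = 5.05, 1.67
amp = math.gamma(0.5) * math.gamma(0.5 * (gamma - 1.0)) / math.gamma(0.5 * gamma)
sigma = np.logspace(-1.0, 3.0, 400)
Xi = sigma * (r0 / sigma) ** gamma * amp
xi_rec = invert_projected(sigma, Xi)
xi_true = (sigma[:-1] / r0) ** -gamma
print(xi_rec[:3] / xi_true[:3])
```

On this idealized input the ratio of recovered to true ξ stays close to unity at scales well below the largest σ, and departs from it as *r* approaches the end of the grid.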

In Fig. 10 we show the mean ξ(*r*) as determined from the mock catalogues using the method of S92. We compare this to the real-space correlation function determined directly from the Hubble Volume simulation, from which the mock catalogues are drawn. The agreement is excellent and shows that the method works and that we can recover the real-space correlation function out to 30 *h*^{−1} Mpc. Like S92, we find that beyond this scale the method begins to fail and the true ξ(*r*) is not recovered.

We have applied this technique to the combined 2dFGRS data and obtain the real-space correlation function shown in Fig. 11. The data are plotted out to only 30 *h*^{−1} Mpc due to the limitations in the method described above. On small scales ξ(*r*) is well represented by a power law, and a best fit over the range 0.1 < *r* < 12 *h*^{−1} Mpc gives the results for *r*^{I}_{0} and γ^{I}_{r} shown in Table 1.

The points in the bottom panel of Fig. 11 show the 2dFGRS data divided by the best-fitting power law. It can be seen that at scales 0.1 < *r* < 20 *h*^{−1} Mpc the data ξ(*r*) are close to the best-fitting power law but do show hints of non-power-law behaviour (see also discussion below).

### Real-space comparisons

In the inverted ξ(*r*) (and possibly Ξ[σ]) there is a weak excess of clustering over the power law for 5 < *r* < 20 *h*^{−1} Mpc. This has previously been called a ‘shoulder’ in ξ (see, for example, Ratcliffe et al. 1998). Although the amplitude of the feature in our data is rather low, it has been consistently seen in different surveys, and is probably a real feature. After submission of this work, Zehavi et al. (2003) also saw this effect in the SDSS projected correlation function and explained the inflection point as the transition scale between a regime dominated by galaxy pairs in the same halo and a regime dominated by pairs in separate haloes. Magliocchetti & Porciani (2003) have found the same effect when examining correlation functions of different types of 2dFGRS galaxy.

The dotted line in the bottom panel of Fig. 11 shows the Hubble Volume simulation which agrees well with the data for *r* > 1 *h*^{−1} Mpc. On smaller scales, the Hubble Volume ξ shows significant deviations from a power law. On these scales, the galaxy clustering amplitude in the simulation is incorrectly modelled because the assignment of galaxies to particles is based on the mass distribution smoothed on a scale of 2 *h*^{−1} Mpc (as discussed in Section 2.2). The solid line in the bottom panel of Fig. 11 is the deprojected APM result (Padilla & Baugh 2003), scaled down by a factor (1 +*z*_{s})^{α}, with α= 1.7, suitable for evolution in a ΛCDM cosmology. There is good agreement between the 2dFGRS and APM results which are obtained using quite different methods.

We have estimated *r*_{0} and γ_{r} by fitting to the projected correlation function Ξ(σ)/σ, and also by inverting Ξ(σ)/σ and then fitting to ξ(*r*). The best-fitting values from the two methods are shown in Table 1, and it is clear that they lead to very similar estimates of *r*_{0} and γ_{r}. This confirms that the power-law assumption in Section 3.7 is a good approximation over the scales we consider.

Table 2 lists *r*_{0} and γ_{r} for the 2dFGRS and other surveys estimated using power-law fits to the projected correlation function Ξ(σ). As mentioned in Section 3.5, the SAPM, Durham UKST and ESP surveys are *b*_{J} selected surveys, and so should be directly comparable to the 2dFGRS. The values of *r*_{0} and γ_{r} for these surveys all agree to within one standard deviation, except *r*_{0} for the ESP, which appears to be significantly lower. It is likely that the quoted uncertainties for the ESP and Durham UKST parameters are underestimated because they did not include the effect of cosmic variance. Because they each sample relatively small volumes, this will be a large effect. The sparse sampling strategy used in the SAPM survey means that it has a large effective volume, and so the cosmic variance is small.

As in Section 3.5, the red-selected surveys, LCRS and SDSS, are significantly different from the other surveys. The discrepancies are most likely due to the fact that the amplitude of galaxy clustering depends on galaxy type, and that red-selected surveys have a different mix of galaxy types. We can make a very rough approximation of the expected change in ξ by considering how the mean colour difference of early and late populations changes the relative fraction of the two populations when a magnitude-limited sample is selected in different pass bands. Zehavi et al. (2002) split their *r*-selected SDSS sample into 19 603 early-type galaxies and 9532 late-type galaxies. The mean (*g*–*r*) colours are 0.5 and 0.9, respectively. The 2dFGRS is selected using *b*_{J}, which is close to *g*, and so, compared to the *r* selection, the median depth for blue galaxies will be larger than that for red galaxies. The number of early and late types will roughly scale in proportion to the volumes sampled, and so the ratio of early-to-late galaxies in the 2dFGRS will be roughly (19 603/9532) × 10^{0.6(0.5−0.9)}= 1.18. Note that this colour split leads to a very different ratio of early-to-late galaxies compared to the η split used by Madgwick et al. (2003). Assuming the early and late correlation functions trace the same underlying field, the combined correlation function will be

From the power-law fits of Zehavi et al., the ratio of bias values at 1 Mpc is *b*_{early}/*b*_{late}= 4.95. Inserting the different ratios *n*_{early}/*n*_{late} appropriate to the red- and blue-selected samples we find that the expected ratio of ξ for a red-selected sample compared to a blue-selected sample is roughly 1.36. Scaling the 2dFGRS values of *r*_{0}= 5.05 and γ_{r}= 1.67 leads to an SDSS value of *r*_{0}= 5.95 for γ_{r}= 1.75, within 1σ of the actual SDSS value. This simple argument indicates that the observed difference in ξ between the red- and blue-selected surveys is consistent with the different population mixes expected in the surveys. The extra surface brightness selection applied to the LCRS may also introduce significant biases.
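The arithmetic of this argument can be checked with a few lines. The sketch assumes that the combined clustering amplitude scales as the square of the number-weighted mean bias, (*n*_{early}*b*_{early} + *n*_{late}*b*_{late})/(*n*_{early} + *n*_{late}); this combination rule and the function name are assumptions made here for illustration, chosen because they reproduce the quoted factors when the value 4.95 is used as the bias ratio.

```python
# Numbers quoted in the text (Zehavi et al. 2002 split, colours, bias ratio).
n_early, n_late = 19603, 9532
colour_term = 10 ** (0.6 * (0.5 - 0.9))
bias_ratio = 4.95                      # b_early / b_late as quoted

mix_red = n_early / n_late             # early-to-late ratio, r-selected sample
mix_blue = mix_red * colour_term       # early-to-late ratio, b_J-selected sample

def mean_bias(mix, b_ratio):
    """Number-weighted mean bias in units of b_late (assumed combination rule)."""
    return (mix * b_ratio + 1.0) / (mix + 1.0)

# Expected ratio of xi for a red-selected sample relative to a blue-selected one.
xi_ratio = (mean_bias(mix_red, bias_ratio) / mean_bias(mix_blue, bias_ratio)) ** 2
```

Under these assumptions `mix_blue` comes out close to 1.18 and `xi_ratio` close to 1.36, matching the values in the text.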

Each survey is also likely to have a different effective luminosity and, as has been shown by Norberg et al. (2001), this will cause clustering measurements to differ. The relation for 2dFGRS galaxies found by Norberg et al. (2001) was

which gives, for *L* = 1.4 *L** (see Section 2.1), *r**_{0} = 4.71 ± 0.24, which will allow direct comparisons with other surveys.
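The quoted value can be reproduced with a short calculation. The sketch assumes the Norberg et al. (2001) relation in the form *b*/*b** = 0.85 + 0.15 *L*/*L** and that, at fixed slope, the amplitude of a power-law ξ scales as *b*², so that *r*_{0} scales as *b*^{2/γ}. Both the exact form of the relation and the function names are assumptions here, since the displayed equation is not reproduced above.

```python
def relative_bias(L_over_Lstar):
    """Assumed Norberg et al. (2001) form: b/b* = 0.85 + 0.15 L/L*."""
    return 0.85 + 0.15 * L_over_Lstar

def r0_at_Lstar(r0, gamma, L_over_Lstar):
    """xi scales as b^2 and xi = (r/r0)^-gamma, so r0 scales as b^(2/gamma)."""
    return r0 / relative_bias(L_over_Lstar) ** (2.0 / gamma)

# Survey values quoted in the paper: r0 = 5.05 +/- 0.26, gamma = 1.67, L = 1.4 L*.
r0_star = r0_at_Lstar(5.05, 1.67, 1.4)
r0_star_err = 0.26 / relative_bias(1.4) ** (2.0 / 1.67)
```

With these inputs the scaling gives a value close to 4.71 with an error close to 0.24, in agreement with the text.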

## Redshift-Space Distortions

When analysing redshift surveys it must be remembered that the distance to each galaxy is estimated from its redshift and is not the true distance. Each galaxy has, superimposed on its Hubble motion, a peculiar velocity due to the gravitational potential in its local environment. These peculiar velocities can be in any direction and, because this effect distorts the correlation function, it can be used to measure two important parameters.

The peculiar velocities are caused by two effects. On small scales, random motions of the galaxies within groups cause a radial smearing known as the ‘Finger of God’. On large scales, gravitational instability leads to coherent infall into overdense regions and outflow from underdense regions. We analyse the observed redshift-space distortions by modelling ξ(σ, π). We start with a model of the real-space correlation function, ξ(*r*), and include the effects of large-scale coherent infall, which is parametrized by β≈Ω^{0.6}_{m}/*b*, where *b* is the linear bias parameter. We then convolve this with the form of the random pairwise motions.

### Constructing the model

Kaiser (1987) pointed out that, in the linear regime, the coherent infall velocities take a simple form in Fourier space. Hamilton (1992) translated these results into real space

where *P*_{ℓ}(μ) are Legendre polynomials, μ = cos(θ) and θ is the angle between *r* and π. The relations between ξ_{ℓ}, ξ(*r*) and β for a simple power law ξ(*r*) = (*r*/*r*_{0})^{−γ_{r}} are (Hamilton 1992)

The Appendix has more details of this derivation and gives the equations for the case of non-power-law forms of ξ.
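For orientation, the power-law case can be written out in a few lines. The sketch assumes the commonly quoted linear-theory results for a power-law ξ(*r*): a monopole factor 1 + 2β/3 + β²/5, a quadrupole factor (4β/3 + 4β²/7) γ/(γ − 3) and a hexadecapole factor (8β²/35) γ(2 + γ)/[(3 − γ)(5 − γ)], combined with the Legendre polynomials *P*_{0}, *P*_{2} and *P*_{4}. The function names and default parameter values are illustrative.

```python
import numpy as np

def multipole_coefficients(beta, gamma):
    """Factors multiplying xi(r) in xi_0, xi_2 and xi_4 for a power-law xi(r)."""
    c0 = 1.0 + 2.0 * beta / 3.0 + beta**2 / 5.0
    c2 = (4.0 * beta / 3.0 + 4.0 * beta**2 / 7.0) * gamma / (gamma - 3.0)
    c4 = (8.0 * beta**2 / 35.0) * gamma * (2.0 + gamma) / ((3.0 - gamma) * (5.0 - gamma))
    return c0, c2, c4

def xi_prime(sigma, pi, beta, r0=5.05, gamma=1.67):
    """Model with coherent infall only (no random pairwise motions)."""
    s = np.hypot(sigma, pi)
    mu = pi / s                                  # cosine of angle to the line of sight
    xi_r = (s / r0) ** (-gamma)
    c0, c2, c4 = multipole_coefficients(beta, gamma)
    P2 = 0.5 * (3.0 * mu**2 - 1.0)
    P4 = 0.125 * (35.0 * mu**4 - 30.0 * mu**2 + 3.0)
    return xi_r * (c0 + c2 * P2 + c4 * P4)

# Same separation, along and across the line of sight.
xi_los = xi_prime(0.0, 10.0, beta=0.49)
xi_perp = xi_prime(10.0, 0.0, beta=0.49)
```

Because the quadrupole factor is negative for γ < 3, the amplitude at fixed separation is lower along the line of sight than across it, which is the flattening of the contours produced by coherent infall.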

We use these relations to create a model ξ′(σ, π) which we then convolve with the distribution function of random pairwise motions, *ƒ*(*v*), to give the final model ξ(σ, π) (Peebles 1980):

We choose to represent the random motions by an exponential form

where *a* is the pairwise peculiar velocity dispersion (often known as σ_{12}). An exponential form for the random motions has been found to fit the observed data better than other functional forms (e.g. Ratcliffe et al. 1998; Landy 2002, see also Section 6).
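A minimal numerical sketch of the convolution step is given below. It assumes the exponential distribution in the form *ƒ*(*v*) = (*a*√2)^{−1} exp(−√2|*v*|/*a*), which is normalized to unity and has rms *a*, and it assumes that the convolution shifts the line-of-sight separation by *v*/*H*_{0}. The isotropic power law used as input, the integration limits and the function names are illustrative choices only.

```python
import numpy as np

H0 = 100.0  # km s^-1 per h^-1 Mpc, so v / H0 is in h^-1 Mpc

def f_exp(v, a):
    """Exponential pairwise velocity distribution with rms dispersion a."""
    return np.exp(-np.sqrt(2.0) * np.abs(v) / a) / (a * np.sqrt(2.0))

def convolve_los(xi_prime, sigma, pi, a, vmax=5000.0, nv=4001):
    """xi(sigma, pi) = integral of xi'(sigma, pi - v/H0) f(v) dv (simple sum)."""
    v = np.linspace(-vmax, vmax, nv)
    dv = v[1] - v[0]
    return np.sum(xi_prime(sigma, pi - v / H0) * f_exp(v, a)) * dv

def xi_power_law(sigma, pi, r0=5.05, gamma=1.67):
    """Toy undistorted input: isotropic power law."""
    return (np.hypot(sigma, pi) / r0) ** (-gamma)

a = 506.0
v = np.linspace(-5000.0, 5000.0, 4001)
dv = v[1] - v[0]
norm = np.sum(f_exp(v, a)) * dv                   # should be close to 1
rms = np.sqrt(np.sum(v**2 * f_exp(v, a)) * dv)    # should be close to a

peak_real = xi_power_law(0.5, 0.0)
peak_smeared = convolve_los(xi_power_law, 0.5, 0.0, a)
tail_real = xi_power_law(0.5, 10.0)
tail_smeared = convolve_los(xi_power_law, 0.5, 10.0, a)
```

At small σ the smearing lowers the amplitude near π = 0 and raises it at large π, which is the elongation of the contours along the line of sight.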

### Model assumptions

In this model we make several assumptions. First, we assume a power law for the correlation function. The power-law approximation is a good fit on scales <20 *h*^{−1} Mpc but is not so good at larger scales. This limits the scales which we can probe using this method. In Section 7, we consider non-power-law models for ξ(*r*), and recalculate equations (12)–(14) using numerical integrals (see Appendix), allowing us to reliably use scales >20 *h*^{−1} Mpc. Secondly, we assume that the linear theory model described above holds on scales ≲8 *h*^{−1} Mpc, which is almost certainly not true. We also consider this in Section 7. Finally, we assume an exponential distribution of peculiar velocities with a constant velocity dispersion, *a*, (equation 16) and this is discussed and justified in Sections 6 and 7.4.

### Model plots

To illustrate the effect of redshift-space distortions on the ξ(σ, π) plot we show four model ξ(σ, π) in Fig. 12. If there were no distortions, then the contours shown would be circular, as in the top left-hand panel, due to the isotropy of the real-space correlation function. On small σ scales, the random peculiar velocities cause an elongation of the contours in the π direction (the bottom left-hand panel). On larger scales, there is the flattening of the contours (top right-hand panel) due to the coherent infall. The bottom right-hand panel is a model with both distortion effects included. Comparing the models of ξ(σ, π) to the 2dFGRS measurements in Fig. 4 it is clear that the data show the two distortion effects included in the models. In Section 7 we use the data to constrain the model directly, and to deduce the best-fitting model parameters.

## Estimating β

Before using the model described above to measure the parameters simultaneously, we first use methods that have been used in previous studies. This allows a direct comparison between our results and previous work.

### Ratio of ξ

The ratio of the redshift-space correlation function, ξ(*s*), to the real-space correlation function, ξ(*r*), in the linear regime gives an estimate of the redshift distortion parameter, β (see equation 12):

Our results for the combined 2dFGRS data, using the inverted form of ξ(*r*), are shown in Fig. 13 by the solid points. The mean of the mock catalogue results is shown by the white line, with the rms errors shaded and the estimate from the Hubble Volume is shown by the solid line. The data are consistent with a constant value, and hence linear theory, on scales ≳4 *h*^{−1} Mpc.

The mock catalogues and Hubble Volume results asymptote to β= 0.47, the true value of β in the mocks. The 2dFGRS data in the range 8–30 *h*^{−1} Mpc are best fit by a ratio of 1.34 ± 0.13, corresponding to β= 0.45 ± 0.14. The maximum scale that we can use in this analysis is determined by the uncertainty on ξ(*r*) from the inversion method of S92 discussed in Section 3.8.
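The conversion from the measured ratio to β can be checked directly, assuming the linear-theory relation ξ(*s*)/ξ(*r*) = 1 + 2β/3 + β²/5 and taking the positive root of the resulting quadratic. The function names are illustrative.

```python
import math

def xi_ratio(beta):
    """Linear-theory ratio xi(s)/xi(r)."""
    return 1.0 + 2.0 * beta / 3.0 + beta**2 / 5.0

def beta_from_ratio(ratio):
    """Positive root of beta^2/5 + 2*beta/3 + (1 - ratio) = 0."""
    return (-2.0 / 3.0 + math.sqrt(4.0 / 9.0 + 0.8 * (ratio - 1.0))) / 0.4

beta = beta_from_ratio(1.34)
```

For a ratio of 1.34 this returns β close to 0.45, as quoted above.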

### The quadrupole moment of ξ

We now measure β using the quadrupole moment of the correlation function (Hamilton 1992)

where ξ_{ℓ} is given by

These equations assume that the random peculiar velocities are negligible and hence measuring *Q* gives an estimate of β. The random uncertainties in this method are small enough that we obtain reliable estimates on scales <40 *h*^{−1} Mpc, as shown by the mock catalogues (see below), but the data are noisy beyond these scales.

Fig. 14 shows *Q* estimates for the combined 2dFGRS data with the inset showing the NGP and SGP separately. The effect of the random peculiar velocities can be clearly seen at small scales, causing *Q* to be negative. The best-fitting value to the combined data for 30–40 *h*^{−1} Mpc is *Q*= 0.55 ± 0.18, which gives a value for β= 0.47^{+0.19}_{−0.16}, where the error is from the rms spread in the mock catalogue results. The solid line represents a model with β= 0.49 and *a*= 506 km s^{−1}, which matches the data well (see Section 7.1). Although asymptoting to a constant, the value of *Q* in the model is still increasing at 40 *h*^{−1} Mpc. This shows that non-linear effects do introduce a small systematic error even at these scales, although this bias is small compared to the random error.

To check whether this method can correctly determine β we use the mock catalogues. The data points in Fig. 15 are the mean values of *Q* from the mock catalogues, with error bars on the mean, and the dashed line is the true value of β= 0.47. The data points seem to converge on large scales to the correct value of *Q*. Fitting to each mock catalogue in turn for 30–40 *h*^{−1} Mpc gives a mean *Q*= 0.51 ± 0.18, corresponding to β= 0.43^{+0.18}_{−0.16}. As the models showed, the random velocities will lead to an underestimate of β even at 40 *h*^{−1} Mpc, causing the difference between the measured and true values. This all shows that we can determine β with a slight bias but the error bars are large compared to the bias.
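The mapping between *Q* and β used in these comparisons can be sketched as follows, assuming the linear-theory form *Q* = (4β/3 + 4β²/7)/(1 + 2β/3 + β²/5) from Hamilton (1992). The bisection solver and the function names are illustrative choices.

```python
def Q_linear(beta):
    """Assumed linear-theory quadrupole ratio as a function of beta."""
    return (4.0 * beta / 3.0 + 4.0 * beta**2 / 7.0) / (
        1.0 + 2.0 * beta / 3.0 + beta**2 / 5.0)

def beta_from_Q(Q, lo=0.0, hi=2.0, tol=1e-10):
    """Invert Q_linear by bisection; Q_linear increases with beta on [0, 2]."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if Q_linear(mid) < Q:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

beta_data = beta_from_Q(0.55)   # combined 2dFGRS value of Q
beta_mock = beta_from_Q(0.51)   # mean Q from the mock catalogues
```

Under this relation *Q* = 0.55 and *Q* = 0.51 map to β close to 0.47 and 0.43, respectively, consistent with the values in the text.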

The *Q* estimates from the individual mock catalogues show a high degree of correlation between points on varying scales and so the overall uncertainty in *Q* from averaging over all scales >30 *h*^{−1} Mpc is not much smaller than the uncertainty from a single point. It is this fact which makes the spread in results from the mock catalogues vital in the estimation of the errors on our result (also see Section 7.3).

## The Peculiar Velocity Distribution

To this point we have assumed that the random peculiar velocity distribution has an exponential form (equation 16). This form has been used by many authors in the past and has been found to fit the data better than other forms (e.g. Ratcliffe et al. 1998). We test this for the 2dFGRS data by following a method similar to that of LSB98. To extract the peculiar velocity distribution, we need to deconvolve the real-space correlation function from the peculiar velocity distribution.

### The method

We first take the 2D Fourier transform of the ξ(σ, π) grid and then take cuts along the *k*_{σ} and *k*_{π} axes, which we denote by Σ(*k*) and Π(*k*), respectively. By the slicing-projection theorem (see LSB98) these cuts are equivalent to the Fourier transforms of the real-space projections of ξ(σ, π) on to the σ and π axes. The projection of ξ(σ, π) on to the σ axis is a distortion-free measurement of Ξ but the projection on to the π axis gives us Ξ convolved with the peculiar velocity distribution, ignoring the effects of large-scale bulk flows. Because a convolution in real space is a multiplication in Fourier space, Π(*k*) is the product of Σ(*k*) and the Fourier transform of the velocity distribution, so the ratio Π(*k*)/Σ(*k*) is the Fourier transform of the velocity distribution that we want to estimate. All that is left is to inverse Fourier transform this ratio to obtain the peculiar velocity distribution, *ƒ*(*v*). LSB98 cut their data set at 32 *h*^{−1} Mpc and applied a Hann smoothing window; we use all the raw data. Landy (2002, hereafter L02) used the LSB98 method on the 100-k 2dFGRS Public Release data and his results are discussed below.
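The steps of this method can be illustrated on a toy grid. The sketch below builds an isotropic ξ, smears it along π with an exponential *ƒ*(*v*) and no coherent infall (the idealized case in which the method is exact), and then recovers *ƒ*(*v*) by dividing the cut along the *k*_{π} axis by the cut along the *k*_{σ} axis. The grid size, the toy form of ξ, the use of distance units (*v*/*H*_{0}) and all variable names are assumptions made for illustration; this is not the LSB98 code.

```python
import numpy as np

n, half_width = 256, 64.0                       # grid points per axis, h^-1 Mpc
x = (np.arange(n) - n // 2) * (2.0 * half_width / n)
dx = x[1] - x[0]
sigma, pi = np.meshgrid(x, x, indexing="ij")

xi_real = np.exp(-np.hypot(sigma, pi) / 5.0)    # toy isotropic xi(r), not a fit
a = 5.0                                         # dispersion in h^-1 Mpc (500 km/s / H0)
f_true = np.exp(-np.sqrt(2.0) * np.abs(x) / a) / (a * np.sqrt(2.0))

# Smear along pi only (circular convolution), with no coherent infall.
f_hat = np.fft.fft(np.fft.ifftshift(f_true)) * dx
xi_s = np.fft.ifft(np.fft.fft(xi_real, axis=1) * f_hat[None, :], axis=1).real

xi_hat = np.fft.fft2(xi_s)
Sigma_k = xi_hat[:, 0]      # cut along the k_sigma axis (k_pi = 0)
Pi_k = xi_hat[0, :]         # cut along the k_pi axis (k_sigma = 0)

# The pi projection is the sigma projection convolved with f, so divide.
f_est = np.fft.fftshift(np.fft.ifft(Pi_k / Sigma_k)).real / dx
a_est = np.sqrt(np.sum(x**2 * f_est) * dx)
```

In this infall-free toy case the recovered distribution matches the input (normalized to unit area on the grid). Adding a coherent infall term to the input would break the assumption that all of the anisotropy comes from random motions.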

Fitting an exponential to the resulting *ƒ*(*v*) curve gives a value for *a* assuming that the infall contribution to the velocity distribution is negligible. LSB98 and L02 claim that their method is not sensitive to the infall velocities. We show here that this is not the case. The additional structure in the Fourier transform of the velocity distribution found by L02 is a direct consequence of the infall velocities.

### Testing the models

To test the LSB98 method we apply the technique to our models, described in Section 4, with and without a β= 0.4 infall factor, using various scales, and with and without a Hann window. In Fig. 16 we show the Fourier transform of the peculiar velocity distribution and in Fig. 17 we show the distribution function itself.

It is clear from Fig. 16 that the shape of the Fourier transform at small *k* is quite badly distorted by the infall velocities. This leads to a systematic error in the actual velocity distribution as seen in Fig. 17, where the measured peculiar velocity dispersions are biased low, especially in the case where a smoothing window and a limited range of scales are used. In particular, the peak of the Fourier transform is not at *k*= 0, and the inferred *ƒ*(*v*) becomes negative for a range of velocities (dashed lines in the lower panels of Fig. 17). This clearly cannot be interpreted as a physical velocity distribution; the method infers negative values because the input model ξ(σ, π) is not consistent with the initial assumption of the method, which is that all of the distortion in ξ(σ, π) is due to random peculiar velocities. We conclude that both types of peculiar velocity need to be considered when making these measurements, and so our preferred results come from directly fitting to ξ(σ, π).

A further complication with the real data is that *ƒ*(*v*) may depend on the pair separation (see discussion in Section 7.3). The solid line in Fig. 18 shows the Fourier transform of *ƒ*(*v*) for a model where *a* varies from 500 km s^{−1} at σ= 0 to 300 km s^{−1} at σ= 20 *h*^{−1} Mpc. This is compared to a model with *a*= 500 km s^{−1} (dashed line), a model with *a*= 300 km s^{−1} (dotted line) and a model where *a* varies from 300 km s^{−1} at σ= 0 to 500 km s^{−1} at σ= 20 *h*^{−1} Mpc (dot-dashed line). The models with varying *a* are very close to their respective constant *a* models at all *k* values, showing that this method leads to an estimate of *ƒ*(*v*) determined mainly by the value of *a* at small σ.

### The mock catalogues

The mean of the peculiar velocity distributions for the mock catalogues is shown in Fig. 19. The distribution is compared to a model, shown as the solid line, with β= 0.47 and an exponential *ƒ*(*v*), with dispersion, *a*= 575 km s^{−1}. The exact form of the peculiar velocities in the Hubble Volume, and hence mock catalogues, is not explicitly specified and it should not be expected to conform to this model exactly.

### The 2dFGRS data

The Fourier transform of the peculiar velocity distribution for the combined 2dFGRS data is shown in Fig. 20, compared to a best-fitting model with β= 0.49 ± 0.05 and *a*= 570 ± 25 km s^{−1}. Fig. 21 shows the peculiar velocity distribution itself compared to the same model. We showed in Section 6.2 (with Fig. 18) that this was likely to be the value of *a* at small σ. The distribution of random pairwise velocities does appear to have an exponential form, with a β influence. Sheth (1996) and Diaferio & Geller (1996) have shown that an exponential peculiar velocity distribution is a result of gravitational processes.

Ignoring the infall, L02 found *a*= 331 km s^{−1}, using the smaller, publicly available, sample of 2dFGRS galaxies. We have made the same approximations and we have repeated his procedure on our larger sample, finding *a*= 370 km s^{−1}. Using our data grid out to 70 *h*^{−1} Mpc, with no smoothing and ignoring β, gives *a*= 457 km s^{−1}. We have shown that the result in L02 is biased low by ignoring β and that the infall must be properly considered in these analyses. As shown in Fig. 21, our data are reasonably well described by an exponential model with β= 0.49 and *a*= 570 km s^{−1}.

## Fitting To The ξ(σ, π) Grid

### Results

We now fit our ξ(σ, π) data grid to the models described in Section 4, assuming a power-law form for the real-space correlation function. This model has four free parameters, β, *r*_{0}, γ_{r} and *a*. The fits to the data are done by minimizing

for *s* < 20 *h*^{−1} Mpc, where δξ is the rms of ξ from the mock catalogues for a particular σ and π. This is like a simple χ^{2} minimization, but the points are not independent. We have tried a fit to ξ directly but found that it gave too much weight to the central regions and so instead we fit to log [1 +ξ] so that the overall shape of the contours has an increased influence on the fit. The best-fitting model parameters are listed in Table 3. The errors we quote are the rms spread in errors from fitting each mock catalogue in the same way.
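One plausible form of such a statistic, consistent with the description above, is sketched below: the difference between model and data in log[1 + ξ], divided by the width of the data error bar in the same logarithmic variable, squared and summed over grid cells inside the chosen range of *s*. The exact expression, the function name and the toy inputs are assumptions made for illustration; they are not the paper's equation or data.

```python
import numpy as np

def log_fit_statistic(xi_model, xi_data, dxi, sigma, pi, s_min=0.0, s_max=20.0):
    """Sum over cells with s_min < s < s_max of
    ((log[1+xi]_model - log[1+xi]_data)
     / (log[1+xi+dxi]_data - log[1+xi-dxi]_data))^2."""
    s = np.hypot(sigma, pi)
    use = (s > s_min) & (s < s_max)
    num = np.log10(1.0 + xi_model[use]) - np.log10(1.0 + xi_data[use])
    den = (np.log10(1.0 + xi_data[use] + dxi[use])
           - np.log10(1.0 + xi_data[use] - dxi[use]))
    return np.sum((num / den) ** 2)

# Toy grid and toy "data" (isotropic power law with 20 per cent errors).
axis = np.arange(0.5, 30.0, 1.0)
sigma, pi = np.meshgrid(axis, axis, indexing="ij")
xi_data = (np.hypot(sigma, pi) / 5.05) ** (-1.67)
dxi = 0.2 * xi_data

stat_same = log_fit_statistic(xi_data, xi_data, dxi, sigma, pi)
stat_off = log_fit_statistic(1.1 * xi_data, xi_data, dxi, sigma, pi)
```

Because the cells are correlated, the minimum of such a statistic cannot be read as a formal χ², which is why the parameter errors are taken from the scatter between mock catalogues.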

There are two key assumptions made in the construction of these models. First, although the contours match well at small scales, there are good reasons to believe that our linear theory model will not hold in the non-linear regime for *s*≲ 8 *h*^{−1} Mpc. Secondly, we have assumed the power-law model for ξ(*r*) and we have seen evidence that this is not completely realistic. Using non-power-law forms will also allow us to probe to larger scales.

To test whether our result is robust to these assumptions we first reject the non-linear regime corresponding to *s* < 8 *h*^{−1} Mpc. Then, we use the shape of the Hubble Volume ξ(*r*) instead of a power law, and finally we extend the maximum scale to *s*= 30 *h*^{−1} Mpc. We have shown in Section 3.8 that the Hubble Volume shape gives a good match to the data over the range 8 < *s* < 30 *h*^{−1} Mpc (the Appendix gives the relevant equations for performing the β infall calculation without a power-law assumption).

We find that the best-fitting parameters change very little with these changes but when using the Hubble Volume ξ(*r*), the quality of the fit improves significantly. The best-fitting model is compared to the data in Fig. 22. Notice the excellent agreement on small scales even though they are ignored in the fitting process. The best-fitting parameters are listed in Table 3, and we adopt these results as our final best estimates finding β= 0.49 ± 0.09.

If we repeat our analysis on the mock catalogues we find a mean value of β= 0.475 ± 0.090 (cf. the expected value of β= 0.47, Section 2.2), showing that we can correctly determine β using this type of fit. When fitting the mock catalogues it has become clear that β and *a* are correlated in this fitting procedure, as we have seen already with other methods. We use the mock catalogues to measure the linear correlation coefficient, *r* (Press et al. 1992), which quantifies this correlation, and find that, between β and *a*, *r*= 0.66. If we knew either parameter exactly, the error on the other would be smaller than quoted.
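The size of this effect can be estimated with a one-line formula if the joint distribution of the two fitted parameters is approximated as a bivariate Gaussian, in which case fixing one parameter shrinks the error on the other by a factor (1 − *r*²)^{1/2}. The Gaussian approximation and the function name are assumptions made here for illustration.

```python
import math

def conditional_error(sigma_x, r):
    """1-sigma error on x when the correlated parameter is held fixed
    (bivariate Gaussian approximation)."""
    return sigma_x * math.sqrt(1.0 - r**2)

r = 0.66
beta_err_fixed_a = conditional_error(0.09, r)    # beta error if a were known
a_err_fixed_beta = conditional_error(52.0, r)    # a error (km/s) if beta were known
```

With *r* = 0.66 the errors would shrink by about a quarter, to roughly 0.07 on β and 39 km s^{−1} on *a*.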

We have also tried other analytical forms for the correlation function and also different scale limits and we have found that some combinations shift the results by ∼1σ.

### Comparison of methods

We have now estimated the real-space clustering parameters using three different methods. In Section 3.9, we saw that the projection and inversion methods gave essentially identical results for *r*_{0} and γ_{r} whereas using 2D fits we obtain slightly higher values for *r*_{0}.

If ξ(*r*) was a perfect power law, the different methods would give unbiased results for the parameters, but we have seen evidence that this assumption is not true. The methods, therefore, give different answers as a result of the different scales and weighting schemes used, as well as the vastly different treatments of the redshift-space distortions.

### Previous 2dFGRS results

It is worth contrasting our present results with those obtained in a previous 2dFGRS analysis (Peacock et al. 2001). This was based on the data available up to the end of 2000: a total of 141 402 redshifts. The chosen redshift limit was *z*_{max}= 0.25, yielding 127 081 galaxies for the analysis of ξ(σ, π). The present analysis uses 165 659 galaxies, but to a maximum redshift of 0.2. Because galaxies are given a redshift-dependent weight, this difference in redshift limit has a substantial effect on the volume sampled. For a given area of sky, changing the redshift limit from *z*_{max}= 0.2 to *z*_{max}= 0.25 changes the total number of galaxies by a factor of only 1.08, whereas the total comoving volume within *z*_{max} increases by a factor of 2. Allowing for the redshift-dependent weight used in practice, the difference in effective comoving volume for a given area of sky due to the variation in redshift limits becomes a factor of 1.6. Because the effective area covered by the present data is greater by a factor of 165 659/(127 081/1.08) = 1.4, the total effective comoving volume probed in the current analysis is in fact 15 per cent smaller than in the 2001 analysis; this would suggest random errors on clustering statistics about 7 per cent larger than previously. Of course, the lower redshift limit has several important advantages: uncertainties in the selection function in the tail of the luminosity function are not an issue (see Norberg et al. 2002a); also, the mean epoch of measurement is closer to *z*= 0. Given that the sky coverage is now more uniform, and that the survey mask and selection function have been studied in greater detail, the present results should be much more robust.

The other main difference between the present work and that of Peacock et al. (2001) lies in the method of analysis. The earlier work quantified the flattening of the contours of ξ(σ, π) via the quadrupole-to-monopole ratio, ξ_{2}(*s*)/ξ_{0}(*s*). This is not to be confused with the quantity *Q*(*s*) from Section 5.2, which uses an integrated clustering measure instead of ξ_{0}(*s*). This is inevitably more noisy, as reflected in the error bar, δβ= 0.17, resulting from that method. The disadvantage of using ξ_{2}(*s*)/ξ_{0}(*s*) directly, however, is that the ratio depends on the true shape of ξ(*r*). In Peacock et al. (2001), this was assumed to be known from the deprojection of angular clustering in the APM survey (Baugh & Efstathiou 1993); in the present paper, we have made a detailed internal estimate of ξ(*r*), and considered the effect of uncertainties in this quantity. Apart from this difference, the previous method of fitting to ξ_{2}(*s*)/ξ_{0}(*s*) should, in principle, give results that are similar to our full fit to ξ(σ, π) in Section 7.1. The key issue in both cases is the treatment of the errors, which are estimated in a fully realistic fashion in the present paper using mock samples. The previous analysis used two simpler methods: an empirical error on ξ_{2}(*s*)/ξ_{0}(*s*) was deduced from the NGP–SGP difference, and correlated data were allowed for by estimating the true number of degrees of freedom from the value of χ^{2} for the best-fitting model. This estimate was compared with a covariance matrix built from multiple realizations of ξ(σ, π) using Gaussian fields; consistent errors were obtained. We applied the simple method of Peacock et al. (2001) to the current data, keeping the assumed APM ξ(*r*), and we obtained the marginalized result β= 0.55 ± 0.075. 
The comparison with our best estimate of β= 0.49 ± 0.09 indicates that the systematic errors in the previous analysis (from, for example, the assumed ξ[*r*]) were not important, but that the previous error bars were optimistic by about 20 per cent.

### Peculiar velocities as a function of scale

There has been much discussion in the literature on whether or not the pairwise peculiar velocity dispersion, *a*, is a function of projected separation, σ. Many authors have used *N*-body simulations to make predictions for what might be observed. Davis et al. (1985) found that the pairwise velocity dispersion of CDM remains approximately constant on small scales, decreases by about 20–30 per cent on intermediate scales and is approximately constant again on large scales. Cen, Bahcall & Gramann (1994) found a similar overall behaviour, as did Jenkins et al. (1998) whose results are plotted in the left-hand panel of Fig. 23 as the solid line for a ΛCDM cosmology. The dashed line is from Peacock & Smith (2000), who used the halo model to predict the peculiar velocities for the galaxy distribution. Kauffmann et al. (1999) and Benson et al. (2000) used the GIF simulations combined with semi-analytical models of galaxy formation, and the galaxy predictions of Benson et al. (2000) are shown by the dotted line. These predictions generally assume σ_{8}= 0.9, but there is evidence that σ_{8} could be 10 per cent lower than this (Spergel et al. 2003) and so the pairwise velocity dispersions implied would also be lower.

Observationally, Jing et al. (1998) measured the pairwise velocity dispersion in the Las Campanas Redshift Survey and found no significant variation with scale. We note again that the errors for the LCRS ignore the effects of cosmic variance and are likely to be underestimates. Zehavi et al. (2002) used the SDSS data and found that *a* decreased with scale for σ≳ 5 *h*^{−1} Mpc. These observations are plotted in the right-hand panel of Fig. 23. All these observations have assumed a functional form for the infall velocities (or ‘streaming’) and have not used β directly. We have already shown that proper consideration of the infall parameter is vital in such studies. Indeed, Zehavi et al. (2002) say that their estimates of *a* for σ > 3 *h*^{−1} Mpc depend significantly on their choice of streaming model. This factor, along with a dependence of *a* on luminosity and galaxy type, may help to explain the differences between the 2dFGRS and SDSS results.

The difference in results from Section 6.2, which measured the value of *a* at small σ (570 km s^{−1}), and from using the ξ(σ, π) grid (506 km s^{−1}), which measures an average value, hints that there may be such a dependence of *a* on σ in the 2dFGRS data. We test for variations in *a* by repeating the fits described in Section 7.1 using a global β, *r*_{0} and γ_{r} but allowing *a* to vary in each σ slice. The results are shown in Fig. 23, compared with the results from other surveys, and numerical simulations as discussed above. The value of 506 km s^{−1} obtained from the 2D fit for scales >8 *h*^{−1} Mpc is close to the value at 8 *h*^{−1} Mpc where most of the signal is coming from. The value of 570 km s^{−1} obtained from the Fourier transform technique agrees well with the results found for σ < 1 *h*^{−1} Mpc. The values of β, *r*_{0} and γ_{r} are essentially unchanged when fitting in this way. We note again that the effects of the infall must be properly taken into account in these measurements. We also note that we used our linear, power-law model on all scales, but we have seen that this is a reasonable approximation on non-linear scales.

We see that the overall shape of the 2dFGRS results is fairly consistent with, although slightly flatter than, the semi-analytical predictions, but the amplitude is certainly a little different, which could be due to the value of σ_{8} used in the models, as discussed above. We also plot the mean of the mock catalogue results (solid line), and the results of a simulated catalogue (dashed line) of Yang et al. (2003, with σ_{8}= 0.75) and these match the real data well.

## Constraining *Ω*_{M}

We take the value of β measured from the multiparameter best fit to ξ(σ, π), β = 0.49 ± 0.09 (Section 7.1), which is measured at the effective luminosity, *L*_{s}, and redshift, *z*_{s}, of our survey sample. In Section 2.1 we quoted these values, which are the applicable mean values when using the *J*_{3} weighting and redshift cuts employed, as *L*_{s} ≈ 1.4 *L** and *z*_{s} ≈ 0.15. We also note here that, if we adopt an Ω_{m} = 1 geometry, we find that β = 0.55, within the quoted 1σ errors.

### Redshift effects

The redshift distortion parameter can be written as

where *ƒ* = d ln *D*/d ln *a*, *D* is the linear fluctuation growth factor and *a* is the expansion factor. A good approximation for *ƒ*, at all *z*, in a flat universe, was given by Lahav et al. (1991):

So, to constrain Ω_{m} from these results we need an estimate of *b*. There have been two recent papers describing such measurements.
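To make the dependence on Ω_{m} concrete, the sketch below evaluates the growth rate for a flat universe with a cosmological constant. It assumes the Lahav et al. (1991) approximation in the form *ƒ* ≈ Ω_{m}(*z*)^{0.6} + [2 − Ω_{m}(*z*) − Ω_{m}(*z*)²]/140 and the standard evolution of Ω_{m} with redshift. The function names, and the choice Ω_{m} = 0.3 used in the example, are illustrative assumptions.

```python
def omega_m_at_z(omega_m0, z):
    """Omega_m(z) for a flat universe with a cosmological constant."""
    matter = omega_m0 * (1.0 + z) ** 3
    return matter / (matter + 1.0 - omega_m0)

def growth_rate(omega_m):
    """Assumed flat-universe approximation: f = Om^0.6 + (2 - Om - Om^2)/140."""
    return omega_m**0.6 + (2.0 - omega_m - omega_m**2) / 140.0

# Growth rate at the effective survey redshift for an assumed Omega_m(z=0) = 0.3.
f_zs = growth_rate(omega_m_at_z(0.3, 0.15))

# Bias that beta = f / b would imply for the measured beta = 0.49 (illustrative).
implied_bias = f_zs / 0.49
```

The calculation can be run in the other direction as well: given an independent estimate of *b*, the measured β fixes *ƒ* and hence Ω_{m}.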

Verde et al. (2002) measured *b*(*L*_{s}, *z*_{s}) from an analysis of the bispectrum of 2dFGRS galaxies. Their results depend strongly on the pairwise peculiar velocity dispersion, *a*, assumed in their analysis. They used the result of Peacock et al. (2001), who found *a*= 385 km s^{−1}, lower than our new value of ≈500 km s^{−1}. To derive Ω_{m} using these results would not therefore be consistent and so a new bispectrum analysis is in preparation.

Lahav et al. (2002) combined the estimate of the 2dF power spectrum, *P*(*k*) (Percival et al. 2001), with results previous to the *Wilkinson Microwave Anisotropy Probe* (*WMAP*) from the cosmic microwave background (CMB) to obtain an estimate of *b*, but this value is also dependent on Ω_{m}. Their likelihood contours are reproduced in Fig. 24, as the dashed lines. They also introduced a ‘constant galaxy clustering’ model for the evolution of *b* with *z*. Following these equations we can evolve our measured β to the present day and estimate

### Luminosity effects

We note that the above analysis is independent of luminosity as we examine everything at the effective luminosity of the survey, *L*_{s}. From the correlation functions in different volume-limited samples of 2dFGRS galaxies, Norberg et al. (2001) found a luminosity dependence of clustering of the form (cf. equation 10)

$$\frac{b}{b^{*}} = 0.85 + 0.15\,\frac{L}{L^{*}},$$

which gives *b*_{s}= 1.06 *b** (using *L*= 1.4 *L**), where *b** is the bias of *L** galaxies. If this bias relation holds on the scales considered in this paper then β will be increased by the same factor of 1.06, and evolving β in a ‘constant galaxy clustering’ model (Lahav et al. 2002) then gives

$$\beta(L^{*}, z = 0) = 0.47 \pm 0.08,$$

which we choose as a fiducial point to allow comparisons with other surveys with different effective luminosities and redshifts. Lahav et al. (2002) obtained β(*L**, *z*= 0) = 0.50 ± 0.06, in their combined 2dFGRS and CMB analysis, completely consistent with our result.
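The luminosity correction is simple arithmetic and can be checked directly. The sketch below assumes the linear relation of Norberg et al. (2001), *b*/*b** = 0.85 + 0.15 *L*/*L**; the function names are hypothetical and the snippet is illustrative only.

```python
def relative_bias(l_over_lstar):
    """b / b* as a function of L / L* (linear relation of Norberg et al. 2001)."""
    return 0.85 + 0.15 * l_over_lstar


def beta_at_lstar(beta_ls, ls_over_lstar=1.4):
    """Rescale beta from the survey's effective luminosity L_s to L*.

    beta = f / b, so replacing b_s by the smaller b* raises beta by b_s / b*.
    """
    return beta_ls * relative_bias(ls_over_lstar)


factor = relative_bias(1.4)
print(f"b_s / b* = {factor:.2f}")  # prints "b_s / b* = 1.06"
```

Because β is inversely proportional to *b*, the same factor applies whether the rescaling is done before or after the evolution to *z*= 0.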

### Comparisons

Percival et al. (2002) combined the 2dFGRS power spectrum with the pre-*WMAP* CMB data, assuming a flat cosmology, and found Ω_{m}(*z*= 0) = 0.31 ± 0.06. These measurements of Ω_{m} are also consistent with a different estimation from the 2dFGRS and CMB (Efstathiou et al. 2002) and from combining the 2dFGRS with cosmic shear measurements (Brown et al. 2003).

Also plotted in Fig. 24 is the recent result from the analysis of the *WMAP* satellite data. Spergel et al. (2003) found Ω_{m}= 0.29 ± 0.07 using *WMAP* data alone, although there are degeneracies with other parameters. It is clear that this is completely consistent with the other plotted contours. Spergel et al. (2003) also found an optical depth to reionization of τ= 0.17, which would reduce the value of *b* found by Lahav et al. (2002) by about 16 per cent, still in good agreement with the results in this paper.

## Summary

In this paper we have measured the correlation function and various related quantities using 2dFGRS galaxies. Our main results are summarized as follows.

- (i) The spherical average of ξ(σ, π) gives the redshift-space correlation function, ξ(*s*), from which we measure the redshift-space clustering length, *s*_{0}= 6.82 ± 0.28 *h*^{−1} Mpc. At large and small scales, ξ(*s*) drops below a power law as expected, for instance, in the ΛCDM model.
- (ii) The projection of ξ(σ, π) along the π axis gives an estimate of the real-space correlation function, ξ(*r*), which on scales 0.1 < *r* < 12 *h*^{−1} Mpc can be fit by a power law (*r*/*r*_{0})^{−γ_{r}} with *r*_{0}= 5.05 ± 0.26 *h*^{−1} Mpc, γ_{r}= 1.67 ± 0.03. At large scales, ξ(*r*) drops below a power law as expected, for instance, in the ΛCDM model.
- (iii) The ratio of real- and redshift-space correlation functions on scales of 8–30 *h*^{−1} Mpc reflects systematic infall velocities and leads to an estimate of β= 0.45 ± 0.14. The quadrupole moment of ξ(σ, π) on large scales gives β= 0.47^{+0.19}_{−0.16}.
- (iv) Comparing the projections of ξ(σ, π) along the π and σ axes gives an estimate of the distribution of random pairwise peculiar velocities, *ƒ*(*v*). We find that large-scale infall velocities affect the measurement of the distribution significantly and cannot be neglected. Using β= 0.49, we find that *ƒ*(*v*) is well fit by an exponential with pairwise velocity dispersion, *a*= 570 ± 25 km s^{−1}, at small σ.
- (v) A multiparameter fit to ξ(σ, π) simultaneously constrains the shape and amplitude of ξ(*r*) and both the velocity distortion effects parametrized by β and *a*. We find β= 0.49 ± 0.09 and *a*= 506 ± 52 km s^{−1}, using the Hubble Volume ξ(*r*) as input to the model. These results apply to galaxies with effective luminosity, *L*_{s}≈ 1.4 *L**, and at an effective redshift, *z*_{s}≈ 0.15. We also find that the best-fitting values of β and *a* are strongly correlated.
- (vi) We evolve our value for the infall parameter to the present day and critical luminosity and find β(*L*=*L**, *z*= 0) = 0.47 ± 0.08. Our derived constraints on Ω_{m} and *b* are consistent with a range of other recent analyses.

Our results show that the clustering of 2dFGRS galaxies as a whole is well matched by a low-density ΛCDM simulation with a non-linear local bias scheme based on the smoothed dark-matter density field. Nevertheless, there are features of the galaxy distribution which require more sophisticated models, for example the distribution of pairwise velocities and the dependence of galaxy clustering on luminosity or spectral type. The methods presented have also been used on subsamples of the 2dFGRS, split by their spectral type (Madgwick et al. 2003).

### Acknowledgments

The 2dFGRS has been made possible through the dedicated efforts of the staff of the Anglo-Australian Observatory, both in creating the 2dF instrument and in supporting it on the telescope. We thank Nelson Padilla, Andrew Benson, Sarah Bridle, Yipeng Jing, Xiaohu Yang and Robert Smith for providing their results in electronic form. We also thank the referee for valuable comments.

## References

### Appendix

#### Appendix A: Coherent infall equations

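Kaiser (1987) pointed out that the coherent infall velocities take a simple form in Fourier space:

$$P_{s}(\mathbf{k}) = \left(1 + \beta \mu_{k}^{2}\right)^{2} P_{r}(k),$$

where μ_{k} is the cosine of the angle between **k** and the line of sight.

Hamilton (1992) completed the translation of these results into real space,

$$\xi'(\sigma, \pi) = \xi_{0}(s) P_{0}(\mu) + \xi_{2}(s) P_{2}(\mu) + \xi_{4}(s) P_{4}(\mu),$$

where *P*_{ℓ}(μ) are Legendre polynomials, which reduces to

$$\xi_{0}(s) = \left(1 + \frac{2\beta}{3} + \frac{\beta^{2}}{5}\right) \xi(r),$$

$$\xi_{2}(s) = \left(\frac{4\beta}{3} + \frac{4\beta^{2}}{7}\right) \left[\xi(r) - \bar{\xi}(r)\right],$$

$$\xi_{4}(s) = \frac{8\beta^{2}}{35} \left[\xi(r) + \frac{5}{2}\bar{\xi}(r) - \frac{7}{2}\bar{\bar{\xi}}(r)\right],$$

where in general

$$\bar{\xi}(r) = \frac{3}{r^{3}} \int_{0}^{r} \xi(r')\, r'^{2}\, {\rm d}r'$$

and

$$\bar{\bar{\xi}}(r) = \frac{5}{r^{5}} \int_{0}^{r} \xi(r')\, r'^{4}\, {\rm d}r'.$$

In the case of a power-law form for ξ(*r*) these equations reduce to the form shown in equations (12)–(14). In the case of non-power-law forms for the real-space correlation function these integrals must be performed numerically.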
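As an illustration of the numerical route, the sketch below evaluates the two volume integrals of Hamilton (1992) for an arbitrary ξ(*r*) with a simple midpoint rule and forms the three multipoles of the linear-theory infall model. It is a minimal example under stated assumptions, not the implementation used in this paper: the function names, the integration scheme and the number of steps are all hypothetical choices, and the power-law inputs (*r*_{0}= 5.05 *h*^{−1} Mpc, γ_{r}= 1.67, β= 0.49) are used only because the power-law case has a closed form to check against.

```python
def _integrate(func, lo, hi, steps=4000):
    """Midpoint-rule integral of func over [lo, hi] (never evaluates at lo)."""
    h = (hi - lo) / steps
    return h * sum(func(lo + (i + 0.5) * h) for i in range(steps))


def xi_bar(xi, r):
    """(3 / r^3) * integral_0^r xi(r') r'^2 dr'."""
    return 3.0 / r ** 3 * _integrate(lambda x: xi(x) * x ** 2, 0.0, r)


def xi_double_bar(xi, r):
    """(5 / r^5) * integral_0^r xi(r') r'^4 dr'."""
    return 5.0 / r ** 5 * _integrate(lambda x: xi(x) * x ** 4, 0.0, r)


def multipoles(xi, r, beta):
    """Return (xi_0, xi_2, xi_4) of the linear infall model at separation r."""
    x = xi(r)
    xb = xi_bar(xi, r)
    xbb = xi_double_bar(xi, r)
    xi0 = (1.0 + 2.0 * beta / 3.0 + beta ** 2 / 5.0) * x
    xi2 = (4.0 * beta / 3.0 + 4.0 * beta ** 2 / 7.0) * (x - xb)
    xi4 = (8.0 * beta ** 2 / 35.0) * (x + 2.5 * xb - 3.5 * xbb)
    return xi0, xi2, xi4


def power_law(r, r0=5.05, gamma=1.67):
    """Power-law xi(r) = (r / r0)^(-gamma), used here only as a test input."""
    return (r / r0) ** (-gamma)


xi0, xi2, xi4 = multipoles(power_law, r=10.0, beta=0.49)
print(xi0, xi2, xi4)
```

For a power law the integrals have the closed forms 3ξ(*r*)/(3 − γ_{r}) and 5ξ(*r*)/(5 − γ_{r}), which gives a convenient check on the numerical routine before it is applied to a tabulated, non-power-law ξ(*r*).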