Testing cosmological structure formation using redshift-space distortions

Observations of redshift-space distortions in spectroscopic galaxy surveys offer an attractive method for observing the build-up of cosmological structure. In this paper we develop and test a new statistic based on anisotropies in the measured galaxy power spectrum, which is independent of galaxy bias and matches the matter power spectrum shape on large scales. The amplitude provides a constraint on the derivative of the linear growth rate through f.sigma_8. This demonstrates that spectroscopic galaxy surveys offer many of the same advantages as weak lensing surveys, in that they both use galaxies as test particles to probe all matter in the Universe. They are complementary as redshift-space distortions probe non-relativistic velocities and therefore the temporal metric perturbations, while weak lensing tests the sum of the temporal and spatial metric perturbations. The degree to which our estimator can be pushed into the non-linear regime is considered and we show that a simple Gaussian damping model, similar to that previously used to model the behaviour of the power spectrum on very small scales, can also model the quasi-linear behaviour of our estimator. This enhances the information that can be extracted from surveys for LCDM models.


INTRODUCTION
It is now widely accepted that the large-scale structure we see in the distribution of galaxies arises through a process of gravitational instability, which amplifies primordial fluctuations laid down in the very early Universe. The rate at which structure grows from these small perturbations offers a key discriminant between cosmological models. For instance, dark energy models in which general relativity is unmodified predict different Large-Scale Structure formation compared with Modified Gravity models with the same background expansion (e.g. Dvali et al. 2000;Carroll et al. 2004;Brans 2005;Nesseris & Perivolaropoulos 2008).
Structure growth is driven by the motion of matter (and inhibited by the cosmological expansion). Galaxies are expected to act as test particles within this matter flow, so the motion of galaxies carries an imprint of the rate of growth of large-scale structure. Because of this, many previous analyses have shown that observations of these galaxy peculiar velocities can distinguish between classes of models (e.g. Jain & Zhang 2007;Song & Koyama 2008;Song & Percival 2008). A key technique to statistically measure the growth of the velocity field uses redshift-space distortions seen in galaxy surveys (Kaiser 1987). Galaxy maps, produced by estimating distances from redshifts obtained in spectroscopic galaxy surveys, reveal an anisotropic galaxy distribution. The anisotropies ⋆ E-mail: will.percival@port.ac.uk arise because galaxy recession velocities, from which distances are inferred, include components from both the Hubble flow and peculiar velocities from the comoving motions of galaxies. These distortions encode information about the build-up of structure.
Many previous surveys have been analysed to measure β ≈ Ω 0.6 m /b, where b is the deterministic, local, linear bias of the galaxies. The latest generation of large surveys have provided ever tighter constraints. Analyses using the 2-degree Field Galaxy Redshift Survey (2dFGRS; Colless et al. 2003) have measured redshiftspace distortions in both the correlation function (Peacock et al. 2001;Hawkins et al. 2003) and power spectrum (Percival et al. 2004). Using the Sloan Digital Sky Survey (SDSS; York et al. 2000), redshift-space distortions have also been measured in the correlation function (Zehavi et al. 2005;Okumura et al. 2008;Cabré & Gaztañaga 2008), and using an Eigenmode decomposition to separate real and redshift-space effects (Tegmark et al. 2004(Tegmark et al. , 2006. These studies were recently extended to z ≃ 1 (Guzzo et al. 2007) using the VIMOS-VLT Deep Survey (VVDS; Le Fevre et al. 2005;Garilli et al. 2008) and the 2SLAQ survey (da Angela et al. 2008). In addition to measuring β at z = 0.8, Le Fevre et al. (2005) emphasised the importance of using large-scale peculiar velocities for constraining models of cosmic acceleration.
On linear scales the theory behind the observed redshift-space distortions is well developed (Kaiser 1987;Hamilton 1998), and previous analyses have made use of this theory to measure β. On quasi-linear and non-linear scales we are instead reduced to mak-ing approximations, or using fitting formulae based on numerical simulations. The standard approach is to use a 'streaming' model, where linear theory is spliced together with an approximation for random motion of particles in collapsed objects (see section 2). Tinker, Weinberg & Zheng (2006) and Tinker (2007), which are developments of work in Hatton & Cole (1999), discuss a number of possible improvements to the streaming model based on fits to numerical simulation results. These approaches aim to allow us to extract information on β from scales where the power spectrum does not match its linear form. One concern is that the theoretical dependence on β might change in the quasi-linear regime, leading to complicated dependencies and that fits to simulations will only be correct in a subset of cosmologies and galaxy formation models similar to those from which the fits were derived 1 .
A better approach would be to analyse the physics behind redshift-space distortions, and to try to use this to devise the best estimator with which to extract cosmological information. Because galaxy velocities only depend on the distribution of matter, we can devise an estimatorP(k) that is not affected by galaxy bias: instead it measures the large-scale shape of the matter power spectrum. Because the estimator is based on the velocity power spectrum, the large-scale amplitude is f 2 times that of matter fluctuations, where f is the logarithmic derivative of the linear growth rate with respect to the scale factor, f ≡ d ln D/d ln a. From this estimator, we can measure f σ 8, mass , which is proportional to dD/d ln a, and provides a good discriminant between modified gravity and dark energy models (Song & Percival 2008).P(k) is expected to match the linear matter power spectrum shape on large scales, so small-scale differences will help us to determine the scales on which the theory is breaking down. For ΛCDM models, we show that a simple Gaussian smoothing of the redshift-space power along the line-ofsight is able to match most of this break-down. This places redshiftspace distortion measurements on the same footing as weak lensing measurements in the sense that they both allow us to test the matter distribution directly. They provide complementary information, as non-relativistic velocity measurements only depend on temporal metric perturbations, while weak lensing tests the sum of the temporal and spatial metric perturbations.
The layout of our paper is as follows. We first review the theory behind redshift-space distortions (Section 2) in the linear, quasi-linear and non-linear regimes. In Section 3, we introduce a new estimator, based on the monopole and quadrupole from a Legendre decomposition of the redshift-space power spectrum, which is designed to recover a power spectrum given by f 2 times the matter power spectrum on large scales. Section 4 introduces the Nbody simulation used to test this theory and our new estimator. We use spherically averaged power spectra where we include distortions along multiple axes to analyse this simulation, an approach described in Section 5. Our analytic theory is compared with results from the numerical simulation in Section 6. The paper ends with a discussion of our results.

MODELLING GALAXY CLUSTERING IN REDSHIFT-SPACE
The redshift-space position of a galaxy differs from its real-space position due to its peculiar velocity, 1 The fits are quite sensitive to the satellite fraction, for example.
where u z (x) is the line-of-sight component of the galaxy velocity (assumed non-relativistic) in units of the Hubble velocity, and we have taken the line-of-sight to be the z-axis. We shall adopt the "plane-parallel" approximation, so this direction is fixed for all galaxies.
The galaxy overdensity field in redshift-space can be obtained by imposing mass conservation, (1 + δ s g )d 3 s = (1 + δ g )d 3 r, and the exact Jacobian for the real-space to redshift-space transformation is In the limit where we are looking at scales much smaller than the mean distance to the pair, u z /z is small and it is only the second term that is important (Kaiser 1987; but see Papai & Szapudi 2008), If we assume an irrotational velocity field we can write u z = ∂/∂z ∇ −2 θ, where θ ≡ ∇ · u, and ∇ −2 is the inverse Laplacian operator. In Fourier space, (∂/∂z) 2 ∇ −2 = (k z /k) 2 = µ 2 , where µ is the cosine of the line-of-sight angle, so we have that including second order convolutions in θ(k) and δ g (k), while neglecting third and higher order terms.

The linear regime
If θ and δ g are small, then we can drop the second and higher order terms from Eq. (4), and Often it is further assumed that the velocity field comes from linear perturbation theory. Then where f ≡ d ln D/d ln a ≈ Ω 0.6 m (Peebles 1980). For a population of galaxies, which we denote with a subscript g, the linear redshift-space power spectrum can be written where P gg (k) ≡ |δ g (k)| 2 , P gθ (k) ≡ δ g (k)θ(k) , P θθ (k) ≡ |θ(k)| 2 , are the galaxy-galaxy, galaxy-θ and θ-θ power spectra respectively for modes k. A subscript θ shows that a variable is determined from the velocity field of the galaxies, which we have assumed is irrotational and small compared with the real-space distance to the galaxies (see Fig. 1). In the following we often drop explicitly showing the k dependence of these power spectra, for convenience.

Galaxy bias
It has long been known that galaxies do not trace the mass, a phenomenon known as 'bias'. There are good theoretical reasons to believe that on large scales the shape of the power spectra of galaxies and mass are similar (e.g. Peebles 1980;Peacock 1999), which can be phrased as the assumption of scale-independent bias. Under this approximation, Dekel & Lahav (1999) introduced the bias relation, Figure 1. Mass density and velocity power spectra recovered from three different simulations. Solid lines correspond to the primary simulation used in this paper, and for this simulation we show power spectra of the overdensity (solid squares), velocity divergence (solid circles) and the curl of the velocity (solid triangles). For comparison, the dashed line is the corresponding power spectra from a simulation with the same initial conditions and cosmology, run with a PM code (so lower force resolution), and the dotted lines corresponds to a simulation with a box half as big (so twice the force resolution, 8 times the mass resolution but 1/8 the volume).
where b is the bias factor, and r is the dimensionless correlation coefficient between the distributions of mass and galaxies. If r = 1, we have a fully deterministic local linear bias relation, If biasing is not a purely Poisson process, but has a random stochastic element, then we could have r < 1. Measurements of r, such as summarized in Figure 11 of Swanson et al. (2008), indicate that r is close to 1 on large scales. If b ≃ 1, or δ(x) and bδ(x) are small, then galaxy velocities randomly sample those of the matter in the Universe and we can write, where P mass (k) ≡ |δ(k)| 2 . This is not true for some models of galaxy bias, such as the peaks model, which predicts a velocitybias where the distribution of galaxy velocities does not match that of the mass (Regos & Szalay 1995;Percival et al. 2008). Under the assumptions that lead to Eqns. 11, the standard way of writing the redshift-space galaxy power spectrum is where β ≡ f /b. By writing the power spectrum in this form we are hiding a key feature of linear redshift-space distortions: that they depend on the mass overdensity not the galaxy overdensity. To see this, we now consider the normalisation of the power spectrum.

Power spectrum normalisation
The normalisation of power spectra is usually measured using the rms fluctuation amplitude, in spheres of radius 8 h −1 Mpc, which would have arisen if linear evolution had proceeded to the time of observation. For the anisotropic redshift-space power spectrum we can define a slightly non-standard, anisotropic normalisation where W 8 (k) is the Fourier transform of the top-hat window function of width 8 h Mpc −1 . Here we have decomposed the standard 3D integral into a 1D integral assuming statistical isotropy of the density field, and have performed this integral along particular directions to the line-of-sight. For each µ, this is equivalent to the standard definition of σ 8 for an isotropic power with the same amplitude as P s g (k, µ). Substituting Eq. (8) into this expression, and using the fact that P gg , P gθ & P θθ are isotropic, we have that σ 2 8, g (µ) = (bσ 8, mass ) 2 +2µ 2 (brσ 8, mass )( f σ 8, mass )+µ 4 ( f σ 8, mass ) 2 .(14) Because of the µ dependence, large-scale peculiar velocity observations provide a measurement of f σ 8, mass that is independent of a local linear bias. From observations of the large-scale amplitude of P s g (k, µ) as a function of µ, we can hope to measure f σ 8, mass , bσ 8, mass , and brσ 8, mass . Eq. (14) shows in compact form that we cannot break these parameter combinations and constrain f , b or σ 8, mass independently solely using P s g (k, µ), although we can measure r.

The quasi-linear regime
In addition to removing higher order terms, the two assumptions leading to Eq. (4) are that we are dealing with an irrotational velocity field, and that we are in the distant observer limit (u z ≪ z) both of which we assume to remain true in the quasi-linear regime. Such assumptions (among others) are also made in standard Eulerian perturbation theory (Juszkiewicz 1981;Vishniac 1983;Fry 1984;Goroff et al. 1986;Makino et al. 1992;Jain & Bertschinger 1994;Bernardeau et al. 2002), Lagrangian perturbation theory (Buchert 1989;Hivon et al. 1995), renormalised perturbation theory (Scoccimarro 2001) and resummed Lagrangian perturbation theory (Matsubara 2007).
Going beyond the linear assumption leads to additional terms in µ 2 , µ 4 & µ 6 in the redshift-space power spectrum compared with Eq. (8). So, where A 1 P gθ , A 2 P θθ , and A 3 0. The importance of the extra higher order terms was emphasised most recently by Scoccimarro (2004). In addition to these redshift-space effects, the linear theory relation between δ and θ, given in Eq. (6), will break down in the quasi-linear regime, so we should expect the shapes of P gg , P gθ and P θθ to be different. The relationship between Eqns. (8 & 15) can be written where G(k, µ 2 ) has the property lim k→0 G(k, µ 2 ) = 1.

The non-linear regime
The standard model for redshift-space distortions includes a component caused by an uncorrelated velocity dispersion that grows on small scales. Such a model is motivated by the idea of "thermal motion" of particles in collapsed structures, which causes the Fingers-Of-God (FOG) observed in redshift surveys (Jackson 1972). An additional component comprising uncorrelated particle motions will dilute both the galaxy overdensity δ g , and the "extra" overdensity term caused by the linear distortions, θ. Motivated by numerical simulations (Sheth & Diaferio 2001;Huff et al. 2007) and the halo model (White 2001;Seljak 2001), we can assume that the centre of mass of a halo, around which galaxies orbit, still moves according to (quasi-)linear motion. Such a model leads to the much-used 'streaming' models (e.g. Hamilton 1998) where with F(k, µ 2 ) a function that depends on the distribution of random pair velocities in collapsed objects, which is often written as a function of y = kσ, where σ is the rms velocity dispersion. In order to match behaviour on large scales, we require lim k→0 F(k, µ 2 ) = 1.
We have written the equation in this form to highlight the similarity with Eq. (16). Note that this model is constructed by a rather adhoc splicing of linear, quasi-linear and non-linear behaviour which ignores the scale-dependence of the mapping between real and redshift space separations (Fisher 1995;Scoccimarro 2004, and references therein), while Eq. (16) was based on the analysis of the redshift-space distortions in the quasi-linear limit.
In general FOG are difficult to model well, and their amplitude is strongly dependent on the mean halo mass and satellite fraction of the population under consideration (White 2001;Seljak 2001). Previous work has concentrated on models with Gaussian or Exponential distributions (e.g. Cole et al. 1995;Peacock & Dodds 1996) for the pairwise velocity dispersion in configuration space. For an Exponential model for the pairwise velocity dispersion in configuration space, we expect a Lorentz damping factor for the power spectrum, while the Gaussian dispersion translates to a Gaussian damping of the power spectrum These terms have the same behaviour to first order. The exact form of F(k, µ 2 ), and the value of σ is strongly dependent on the galaxy population (Jing & Borner 2004;Li et al. 2007). An alternative approach would be to try to "eliminate" the FOG by applying a halo finding algorithm to the sample and manually moving galaxies either to halo centres, or to a spherically symmetric distribution around these centres (e.g. Tegmark et al. 2004). Such approaches tend to mask the fact that the radial "compression" is still model dependent, and requires a similar free parameter to σ in Eq. (20). This "parameter" controls the probability density function for the distortion of any galaxy in redshift space (Reid & Spergel 2008). However, FOG compression does have the advantage of including extra information in the analysis from the phases, which are used to locate the halos.
Dealing directly with the FOG is not the same as extending the linear model, and P gg , P gθ & P θθ into the non-linear regime, as discussed in the previous section. The real-space effect of random thermal motion of galaxies on small scales would lead P gθ to decrease in amplitude, because of the decoherence of density and velocity divergence, while P θθ increases. We showed in Eq. (16) that a similar function to F(k, µ 2 ) would be required to include quasi-linear behaviour in the redshift-space power spectrum. In this case we would not expect that these simple damping models can simultaneously match both the quasi-linear and fully non-linear behaviour (Fisher 1995;Scoccimarro 2004).
Clearly, if F(k, µ 2 ) has two roles, we must reconsider any implied or assumed physical meaning in the function. So, for example, the physical arguments that the Exponential form should be a better match to the average velocity dispersion in halos (Sheth 1996;Peacock & Dodds 1996;White 2001) are not applicable. For consistency with the standard 'streaming model', we continue to refer to the function F(k, µ 2 ) as a small-scale velocity dispersion (SSVD) model.

Combined model
Following our review of the theory behind redshift-space distortions presented in Sections 2.1-2.5, we are left with a model of the redshift-space galaxy power spectrum To first order in k, both the Gaussian and Exponential forms for F(k, µ 2 ) considered in Section 2.5 act as an extra µ 6 component with amplitude determined by σ. This fitting function can act as an approximate correction for both quasi-linear and non-linear terms in the mapping between linear real-space and observed redshiftspace. P θθ is independent of galaxy density bias, and is directly related to the matter velocity power spectrum, provided there is no velocity bias, caused by a mis-match between the distribution of galaxy velocities and the distribution of velocities in all matter. Eq. (20) could be fitted to data directly using a Likelihood approach (Hamilton 2000;Tegmark et al. 2004Tegmark et al. , 2006, resulting in observational constraints on P gg , P gθ , and P θθ . Linking P gg , P gθ , and P θθ to the matter power spectrum requires further assumptions. Following the deterministic, linear local galaxy bias model presented in Eq. (10), and the assumption of no velocity bias, the three power spectra are related to the matter power spectrum according to Eqns. 11, which leads to a model This allows observations to constrain the matter power spectrum multiplied by f 2 . We show that this model matches simulation results on large-scales in Section 6. If Eq. (21) holds, Likelihood fits could be used to measure f σ 8 (mass) and bσ 8 (mass) from the measured power spectra. Alternatively, we show in the next Section that estimators of the matter power spectrum multiplied by f 2 can be constructed, based on a Legendre polynomial decomposition of the redshift-space power spectrum.

REVISED ESTIMATORS FOR COSMOLOGICAL INFORMATION
We argued in Section 2.1 that large-scale linear redshift-space distortions should enable us to measure f σ 8 (mass) independently of linear bias. We will now show how we can do so, based on a Legendre decompositon of the redshift-space power spectrum. P s g (k, µ) can be decomposed into Legendre polynomials L ℓ (µ) to give multipole moments Redshift-space distortions 5 The first three even Legendre polynomials are L 0 (µ) = 1, L 2 (µ) = (3µ 2 − 1)/2 and L 4 (µ) = (35µ 4 − 30µ 2 + 3)/8. For redshift-space distortions along one axis, the expansion of the linear model for P s g (k, µ) (Eq. 8) into the first three multipole moments gives so we see that the monopole, quadrupole and hexadecapole completely characterise P s g (k) in the linear limit. If we assume that P gg , P gθ and P θθ have different forms, then we can invert Eq. (23) to recover them individually from P s 0 P s 2 , and P s 4 , In terms of µ, P gg , P gθ , and P θθ can be considered as revised moments by substituting the expressions for P s 0 , P s 2 , & P s 4 into this expression.
The inclusion of a SSVD model given by Eq. (18) or Eq. (19) in the analysis of this statistic is considered in Cole et al. (1995): the integrals in Eq. (22) need to be recalculated as in Section 5.2. This would enable us to use the monopole, quadrupole and hexadecapole to extract the same information from simulations as using the spherically averaged power spectra described in Section 5.
Following Eq. (12), measurements of P s 2 /P s 0 have been previously used to measure β through (Cole et al. 1994) This only involves the "lowest order" moments in µ (P s 0 and P s 2 ) and it has been argued that this makes it less sensitive to noise. Expanding in multipoles can also simplify the definition of simple survey limits. The expansion of this formulae to include a SSVD model has also been considered by Cole et al. (1995). The quadrupole to monopole ratio is expected to have this limiting behaviour on largescales, and can therefore be used to measure β.
We now show that we can use P s 0 and P s 2 to derive an estimator of the matter power spectrum multiplied by f 2 , which leads to cosmological constraints that are independent of a local galaxy bias. For a local linear bias, in the absence of velocity bias, P θθ and (P gθ ) 2 /P gg both have normalisation f 2 σ 2 8 (mass) and shape matching that of the matter density power spectrum in the linear limit. Results from simulations, presented in Section 6 are consistent with this claim. W can therefore define a new estimatorP, such that P = (P gθ ) 2 /P gg , andP = P θθ on large scales.
In terms of the Legendre decomposition, we can write two functions of P s 0 , P s 2 and P s 4 , which should have the same large-scale normalisation that is independent of a local linear bias. In the linear regime, from Eq. (24) these functions arê Because we have two equations forP, we can eliminate P s 4 , leaving a quadratic equation inP Solving this equation givesP which offers a mechanism for removing a local bias dependence from quadrupole and monopole measurements in the distant observer limit. The result is a power spectrum whose large-scale shape should match that of the mass, and normalisation should be f 2 times that of the mass density power spectrum.
We can extend this model to the quasi-linear regime by including the model for small-scale velocity dispersion. As we argued previously, in the quasi-linear regime, this should be considered to be a fitting formula for quasi-linear distortions, and is not physically motivated. In the following we only consider a Gaussian SSVD model, because this is favoured by simulation results presented in the next section. A similar formula could be calculated for the Exponential SSVD model. In the linear-Gaussian model, we can simplify the equations by defining Γ n (y) ≡ γ([n + 1]/2, y 2 )/(2y n+1 ) so that where we have defined P × ≡ P ggP for notational convenience. This equation reduces to the relevant expression in Eq. (29) in the limit as y → 0. As outlined above for the linear case, these two equations can be manipulated to eliminate P gg and calculate an expression forP, which is a quadratic in P s 0 and P s 2 , as in Eq. (29) and can be easily solved, although it now depends on Γ n (y) in a complicated way. The linear and damped versions of this estimator are compared with the results from our numerical simulation in Section 6.4.

SIMULATION
To investigate redshift-space clustering further we have used a large, high-resolution N-body simulation which is well suited to probing P s g (k, µ) in the quasi-linear regime. The cosmology was of the ΛCDM family with Ω m = 0.25 = 1 − Ω Λ , Ω b = 0.043, h = 0.72, n s = 0.97 and σ 8 = 0.8. The linear theory power spectrum for the initial conditions was computed by evolution of the coupled Einstein, fluid and Boltzmann equations using the code described in White & Scott (1995). Seljak et al. (2003) find that this code agrees well with CMBfast (Seljak & Zaldarriaga 1996). The simulation employed 1024 3 particles of mass 10 11 h −1 M ⊙ in a periodic cube of side 1 h −1 Gpc using a TreePM code (White 2002) with a Plummer-equivalent softening length of 35 h −1 kpc (comoving). A detailed comparison of this TreePM code with other codes can be found in Heitmann et al. (2008) and Evrard et al. (2008).
In order to test the convergence of this simulation, we compare present-day density and velocity power spectra against alternative simulations in Fig. 1. We compare with a simulation run from the same initial conditions, but using a particle-mesh code which has lower force resolution. We also compare against a simulation with half the box size, i.e. double the force resolution over 1/8 of the volume. The density power spectra recovered from the simulations do not reveal any significant trends with force resolution. There is weak evidence for smaller ∇×u and velocity divergence with higher force resolution, but this is not significant for our purposes.
The offset between density and velocity divergence power spectra seen in Fig. 1 on large scales is caused by the factor f 2 as described in Section 2.1. The difference on smaller scales is anal-ysed in detail later. Our assumption that the velocity field is curlfree is validated by these simulations, which show that |∇ × u| k ≪ |∇ · u| k for the scales of interest: k < 0.2 h Mpc −1 .
From the z = 0 output we generate a halo catalog using the Friends-of-Friends (FoF) algorithm (Davis et al. 1985) with a linking length of 0.168 times the mean inter-particle spacing. This procedure partitions the particles into equivalence classes, by linking together all particle pairs separated by less than a distance b. The halos correspond roughly to particles with ρ > 3/(2πb 3 ) ≃ 100 times the background density. We will present results calculated from all particles and from catalogues containing all particles in halos more massive than log 10 M = 12.5, 13 and 13.5 where M is in units of h −1 M ⊙ . The small-scale velocity dispersions of the particles will match the distribution of mass in each halo. We do not include further small-scale effects caused by the galaxies inside each halo not Poisson sampling the mass. Our intention in this paper is not to try to model the non-linear regime perfectly, but to test the effect of halo mass selection on the linear and quasi-linear regimes. We therefore adopt this simple halo occupation distribution in order to reduce the shot noise: by using all particles in each halo we optimise our determination of the (quasi-)linear halo velocity.
We cannot easily replicate Fig. 1 for our halo-particle catalogues because of the difficulty of interpolating the velocity field between the sparsely sampled galaxies. Instead, in the rest of this paper, we directly analyse redshift-space distortions through P s g (k, µ). We will consider two ways of analysing P s g (k, µ), either using spherically averaged power spectra with different redshiftspace distortion components, or by decomposing P s g (k, µ) into Legendre polynomials. In the simulations we determine P s ℓ in each kbin both by directly integrating against µ as in Eq. (22) and by a least-squares fit of P s g (k, µ) to a sum of Legendre polynomials. The results are consistent. Before presenting the results from simulations, we first consider the theory behind decomposing simulation results into spherically averaged power spectra.

SPHERICALLY AVERAGED POWER SPECTRA FROM SIMULATIONS
In this section we introduce the concept of spherically averaged power spectra calculated from simulations where we propagate redshift-space distortions along multiple axes. Clearly, such power spectra will not match observational results, where redshift-space distortions are only radial. However, they do provide a convenient and simple way to explore the components of Eq. (20) using simulations. For a sample of galaxies where we have applied redshiftspace distortions along multiple axes, for each k, Eq. (20) will still hold for some value of µ. The spherically averaged power spectra act as µ-dependent moments of P s g (k, µ), and this method does not alter the physics that we are testing. We are simply summing these modes over different combinations of µ. When analysing simulations, we apply the plane-parallel decomposition to both the simulation data and the analysis method. Consequently, this should not be considered an approximation as we do not relate the results to actual surveys, but only use simulations to investigate the physics.

Linear distortions
For redshift-space distortions along one axis, in the linear regime where Eq. (5) holds, we have For two axes with redshift-space distortions, we can take ν to be the cosine of the angle to the direction where there are no distortions and integrate so that, in the linear limit, For three axes, whatever direction we take our k-vector, we see redshift-space distortion of the overdensity is as in Eq. (5) with µ = 1. Solving the integrals in Eqns.(31) & (32), and substituting in the relations P gg (k) ≡ |δ g (k)| 2 , P gθ (k) ≡ δ g (k)θ(k) & P θθ (k) ≡ |θ(k)| 2 , leads to the following spherically averaged power spectra, Here P s i axes (k) includes redshift-space distortions along i axes. Using any three of P s i axes (k), we can reconstruct P gg , P gθ , and P θθ by solving the corresponding linear equations.

Including small-scale velocity dispersion
If we include a SSVD damping model then, for a power spectrum with redshift-space distortions along one axis, For redshift-space distortions along two axes, we take ν to be the cosine of the angle to the direction where there are no distortions, as in Eq. (32). We can consider that the power spectrum is damped by the factor F(k, 1 − ν 2 ) so that, For redshift-space distortions along three axes, whatever direction k/k we choose, we see redshift-space distortions as if we were looking along the line of sight. So If we assume an Gaussian model for the SSVD damping, then Eq. (34) can be solved using the γ function, and the factors 1, 2/3 and 1/5 in the expansion of P s 1 axis (k) in Eq. (33) become γ(1/2, y 2 )/(2y), γ(3/2, y 2 )/y 3 and γ(5/2, y 2 )/(2y 5 ) respectively. The original factors are recovered in the limit y → 0. Alternative expressions can be derived using error functions (Cole et al. 1995;Peacock & Dodds 1996). Eq. (35) can be solved similarly using the imaginary error function.
If instead we assume an Exponential model for the small-scale velocity dispersion, then the integral in Eq (34) can be analytically solved using the tan −1 function (Cole et al. 1995), while the integral in Eq. (35) is trivial.

Fits to the spherically averaged power
The first question we wish to address is "How well does the linear model with or without SSVD model (Eq. 21) recover the monopole Figure 2. Recovered power spectra from our simulation, with redshift-space distortions along one axis of the simulation box (solid circles). Power spectra are shown for all mass (lower data), and for the three halo catalogues described in the text (upper data, with the more biased power spectra corresponding to larger mass thresholds). These data were fitted for k < 0.05 h Mpc −1 using the model given by Eq. (21). Ratios between model and data are shown in the lower panels, and panels from left to right are for different SSVD models. See Section 5.2 for details. power P s 0 = P s 1 axis ?" This comparison is shown in Fig. 2. The two possible free parameters, f /b and σ, were fitted on very large scales k < 0.05 h Mpc −1 . The amplitude of the power spectrum on these scales is close to the expected value for all data, following linear theory (Section 2.1) and given the simulation value of β. Where no SSVD model is present, we see that the shape of the model power spectrum does not match the data, even on these large scales. Including either the Gaussian or Exponential SSVD model allows a better match of the shape. The goodness-of-fit is approximately the same for both of these models, which is not surprising as they have the same asymptotic behaviour in the limit y → 0. High values of σ ∼ 500-600 kms −1 are required to enable this match, which fails for k ∼ > 0.1 h Mpc −1 . While the SSVD model was introduced to correct the high-k behaviour of redshift-space power spectra, it is clearly having an effect on scales 0.05 < k < 0.1 h Mpc −1 , where it is attempting to correct both the quasi-linear redshift-space effects and the deterministic linear bias assumption.
The second question that we wish to answer is "How well does the model of Eq. (20) where P gg , P gθ & P θθ are arbitrary, with or without a SSVD model, fit the simulation results?". The power spectra, P s 1 axis are shown by the solid circles in the upper panels of Fig. 3. On large-scales, k < 0.1 h Mpc −1 , the data are well fitted by the model of Eq. (20) without the inclusion of a SSVD component. As we move to scales k > 0.1 h Mpc −1 , we need to include a SSVD model in order to match the differences between P s 1 axis , P s 2 axes , and P s 3 axes , which result from the µ dependence of P s g (k, µ). Ratios between model and data for P s 1 axis , P s 2 axes , and P s 3 axes are shown in the lower panels of this plot. Although the SSVD models significantly help with the fits to the observed power spectra, it is clear that the shapes cannot be matched and that a compromise is being reached in the fits, where the model crosses the data at k ∼ 0.6 h Mpc −1 . The best-fit values of σ vary with the range of scales fitted confirming this picture. On these quasi-linear scales, the Gaussian SSVD model allows a slightly better fit to the data, although it is not clear that there is a physical motivation behind this. The Exponential model is a better fit on smaller scales, which is physically motivated (Peacock & Dodds 1996;White 2001).

Extracting density and velocity power spectra
The third question that we want to use the simulations to answer is "How are the power spectra P gg , P gθ & P θθ related to each other and to the matter power spectrum?". To fit for P gg , P gθ & P θθ we use spherically averaged power spectra calculated assuming that we have redshift-space distortions along 0, 1, 2 or 3 axes. Without including a SSVD model, we could use any three of P s 0 axes , P s 1 axis , P s 2 axes , and P s 3 axes to solve Eq. (33) for P gg , P gθ and P θθ . In order to simplify the fit, we assumed P gg = P s 0 axes , and used P s 1 axis and P s 2 axes to solve for P gθ & P θθ . For fits including a SSVD model, we used all four power spectra to solve for σ, P gg , P gθ and P θθ .
The power spectra, P gg , P gθ / f , P θθ / f 2 , recovered from the fits to P s i axis are plotted in the upper panels in Fig. 4. The power spectra divided by P mass are shown in the lower panels to highlight variations in shape of these different power spectra. The residuals between model and simulation data for P s i axis were shown in Fig. 3. On the largest scales k < 0.05 h Mpc −1 , the data are consistent with the hypothesis that P θθ ≃ f 2 P mass , and P 2 gθ ≃ f 2 P gg P mass . Where we do not include a SSVD model, there is weak evidence that the amplitude of P θθ / f 2 is lower than the amplitude of the matter power spectrum on scales k < 0.05 h Mpc −1 , consistent with a 10% velocity bias, although it should be noted that the data are noisy here.
It is clear that the shapes of the velocity and overdensity power spectra do not match on scales k > 0.05 h Mpc −1 , even when either Gaussian or Exponential SSVD models are included. Deviations between the shapes of these power spectra were also noted in Figure 6 of Scoccimarro (2004), and were physically motivated by  Fig. 2. The lowest amplitude power spectrum corresponds to the mass, while the other three are from halo catalogues, as described in the text, where the amplitude is an increasing function of mass limit. For comparison we also plot the model given by Eqns. 34, where the free parameters fitted are P gg , P gθ , P θθ , and σ where a SSVD model is included (solid lines). To determine these models, we have fitted P s 1 axis , P s 2 axes , and P s 3 axes for k < 0.7 h Mpc −1 , where the data were equally weighted in log k. In order to highlight differences, we plot the model for P s 1 axis divided by the data in the lower panels (solid lines). Similar lines are also plotted for P s 2 axes (dashed), and P s 3 axes (dotted).

Figure 4.
Upper panels: P gg (dotted lines), P gθ / f (dashed lines), P θθ / f 2 (solid lines) recovered by fitting P s 0 axes , P s 1 axis , P s 2 axes , and P s 3 axes for k < 0.7 h Mpc −1 . Four power spectra are plotted, corresponding to the mass and the halo catalogues. The amplitude of P gg and P gθ is lowest for the mass, and is an increasing function of the halo mass limit. For P θθ , the deviation from the shape of P gg is weakest for the mass, and is an increasing function of halo mass limit. The lower panels show these power spectra divided by P gg for the mass to highlight deviations between the shapes of the power spectra.

Figure 5.
For P gθ , we can remove a deterministic linear galaxy density bias dependence by dividing by P gg . If the galaxy density bias is well described by a local linear model and there is no velocity bias, then we expect P 2 gθ /P gg = P mass , so we divide P gθ by a further factor of √ P mass to highlight deviations from this model. The solid line was calculated from a fit with no small-scale dispersion correction, the dotted line assumed an Exponential model, and the dashed line a Gaussian model, both with single scale-independent variance. the arguments in Section 2.4. The turn-off from linear behaviour is stronger for P θθ than P gg for any of the catalogues, which is consistent with the quasi-linear particle velocities leading the displacements, which is an integral over the velocities.
As we discuss in Section 3, most of the cosmological signal from observations comes from the cross correlation between overdensity and velocity fields. So, the forth question that we wish to use simulations to answer is "Is P 2 gθ /(P gg P mass ) = 1 a valid model on large scales?". To answer this, we plot the power spectrum combination P 2 gθ /(P gg P mass ) in Fig. 5. On large scales, this statistic → 1, validating our hypothesis. Comparison with Fig. 4 suggests that it is less biased than P θθ as a tracer of f 2 σ 2 8 (mass) on these scales, as the deviations from P 2 gθ /(P gg P mass ) = 1 are less than 10%. The large-scale "plateau" can be extended by assuming a Gaussian model for the small-scale velocity dispersion, provided the fit is not extended to large k. For the plot, we only fitted to k < 0.4 h Mpc −1 . The recovered values of σ for Gaussian model of scale-independent velocity dispersion correction are 310 kms −1 , 360 kms −1 , 390 kms −1 , 470 kms −1 , for the mass, and the three halo Figure 6. As Fig. 4, but now allowing the variance of the assumed SSVD model, σ to vary with scale. The resulting models of P s 1 axis , P s 2 axes , and P s 3 axes are an almost perfect fit to the data. This should be expected as we are fitting 4 data points with a model with 4 parameters at each k-value.
catalogues. For the Exponential model, the corresponding numbers are 280 kms −1 , 300 kms −1 , 340 kms −1 , 410 kms −1 . If we extend the fit to smaller scales, then σ increases for both models and the results from Exponential and Gaussian models become more similar. In addition, the plateau is reduced in size.
The sharp decrease in P gθ seen at k ∼ 0.1 h Mpc −1 is consistent with a model that includes both the effect of FOGs and the infall model. The inflection marks the point where the highest galaxies velocities change from being associated with the largest scales (due to infall), to instead be linked to the smallest scales (due to FOG). The quasi-linear nature of infall on small scales will also contribute to this transition. The scale at which the inflection occurs is affected by the SSVD model included. The Gaussian SSVD model acts as a slightly more accurate fitting function compared with the exponential model, extending the plateau of where the overdensity-velocity cross power spectrum matches the shape of P gg to k < 0.1 h Mpc −1 . This might reflect the contribution of quasi-linear effects to the FOG, where the Exponential model was physically motivated. We now investigate the SSVD model further.

Investigating SSVD fits
The fifth question that we wish to address using simulations is "How well do the Gaussian and Exponential SSVD models in Eq. (20) extend the fit to small scales?". This is clearly a difficult question to answer, but we have tried to quantify the effect, and determine the scales where the SSVD model is important, by allowing the free parameter, σ in the Gaussian and Exponential SSVD models to vary as a function of scale. The power spectra P gg , P gθ / f & P θθ / f 2 , calculated assuming the Exponential SSVD model are shown in Fig. 6, which has the same format as Fig. 4. Allowing σ to vary allows a virtually exact fit to P s 1 axis , P s 2 axes , and P s 3 axes . For k < 0.1 h Mpc −1 , Fig. 3 shows that no SSVD model is required to fit the observed power spectra, and there is no constraint on σ. For k > 0.1 h Mpc −1 the best fit value of σ rises from 0 at k = 0.1 h Mpc −1 to 295 kms −1 at k = 1 h Mpc −1 for the matter, and to 330 kms −1 , 390 kms −1 , and 470 kms −1 for the halo catalogues with mass > 10 12.5 , 10 13 & 10 13.5 h −1 M ⊙ respectively. A similar trend is seen for the Gaussian model, although there is more noise, perhaps indicating that the parameters P gg , P gθ , & P θθ and σ fitted for each k are more degenerate in this model. This would also explain why the Gaussian model with a single value of σ for all scales can provide a slightly better fit to the data shown in Fig. 3: the solution can move along the Likelihood degeneracy.

Testing our new estimator
The final question that we wish to answer from simulations is "How well do the power spectra constructed from our new estimators given in Eqns. (24) & (30), recover f 2 P mass (k)?". The power spectra are compared in Fig. 7. Even with only P s 0 and P s 2 , we see that we can recover an unbiased estimate of f 2 P mass (k) on large-scales. The inclusion of a Gaussian SSVD model is able to correctly model the quasi-linear behaviour of this estimator for k < 0.2 h Mpc −1 for the mass, a limit that decreases to k < 0.1 h Mpc −1 for the most massive halo catalogue. The value of σ can therefore be treated as a free "nuisance" parameter, and could be fitted to the estimator. In order to demonstrate consistency we actually plot the estimator as in Eq. (30), with σ fitted to P s 1 axis , P s 2 axes , and P s 3 axes for k < 0.4 h Mpc −1 . In practice, for a given linear matter power spectrum, one could calculate the estimator for a series of values of σ and marginalise over this parameter when fitting to the model.
Comparing the estimator of Eq. (30) to the non-linear matter power spectrum would result in a better fit to the data, because the behaviour ofP and the non-linear matter power spectrum are similar. We consider that this would be artificial, because we do not expect such a match from the theory presented in Section 2. With the Gaussian damping term, our new estimator is clearly a complicated function of P s 0 and P s 2 , and it would be difficult to accurately propagate standard analytic error estimates for power spectrum modes to this function. Given the precision to which future surveys would be able to measure this statistic, it would be better to base error analyses on the results from mock catalogues.

DISCUSSION
Redshift-space distortions encode key information about the buildup of cosmological structure. In the linear regime, the theory behind the density enhancement was developed over 20 years ago (Kaiser 1987). Following this development, most observational studies have focused on measuring β. One method to do this is to measure the quadrupole-to-monopole ratio, which depends on β in a known way (Eq. 25). By reviewing the theory of redshift-space distortions in Section 2, we argued that we should be able to obtain a bias independent cosmological constraint on f σ 8 (mass) from the large-scale redshift-space distortions. This statistic, proportional to  30), where we include a Gaussian component equivalent to a SSVD model, with variance fitted to P s 1 axis , P s 2 axes , and P s 3 axes (see text for details). This demonstrates that a Gaussian SSVD model on scales k < 0.4 h Mpc −1 can be treated as a fitting formulae for the quasi-linear turn-off from the expected linear behaviour. dD/d ln a with a proportionality constant depending on the amplitude of fluctuations at early times, provides an excellent test of DE models (e.g. Song & Percival 2008).
In Section 2.4 & 2.5, we set out to extend the theory into the quasi-and non-linear regimes, to investigate the smallest scales on which we can easily recover cosmological information using this physical process. In the linear regime, the power spectrum can be decomposed into terms dependent on µ 0 , µ 2 and µ 4 . As we push to quasi-linear scales we should expect this behaviour to become more complicated with extra µ 6 and higher order terms becoming important. We should also expect the simple relations bP gθ = f P gg and b 2 P θθ = f 2 P gg to break down. Previous work has introduced a "streaming" model for non-linear redshift-space distortions, including a damping term spliced together with linear theory. In the limit k → 0, the standard Gaussian or Exponential forms given in Eqns. (18) & (19) for this damping function both reduce to providing an extra µ 6 term with amplitude dependent on σ. We might therefore expect these terms can also be used to match the expected quasi-linear µ dependence of the power spectrum. In this case, the same function might not match all quasi-linear and nonlinear scales.
Using a large numerical simulation we have set out to test this theory. To facilitate this, we have introduced a novel approach to the analysis of the redshift-space power spectrum measured from simulations by allowing redshift-space distortions to be included along multiple axes. Power spectra calculated from density fields including redshift-space distortions along multiple axes act as moments of P s g (k, µ), allowing us to decompose into the expected µ 0 , µ 2 and µ 4 components. On large scales we see no evidence for strong velocity bias, and the velocity-velocity power spectrum has an amplitude f 2 times that of the mass to within 10% (right panel of Fig. 4). The density-velocity power spectrum, from which most information is obtained from surveys is less affected by this velocity bias. Fig. 2 shows that the relations bP gθ = f P gg and b 2 P θθ = f 2 P gg break down on scales k > 0.1 h Mpc −1 . The right panel of Fig. 3 shows that µ 6 and higher order terms need to be included for k > 0.2 h Mpc −1 . The Exponential and Gaussian models enable a better fit for 0.2 < k < 0.4 h Mpc −1 , but neither is perfect on all of these scales, although the Gaussian model does slightly better than the Exponential model in the cases we illustrate. Alternative models which reduce to µ 6 behaviour in the limit k → 0 might perform better, but this is not necessary for our purposes: we have shown that the Gaussian model extends the range of the fit.
Based on this analysis, we have devised a new estimator of the matter power spectrumP using the monopole and quadrupole power spectra. This matches the large-scale shape of the matter power spectrum with amplitude multiplied by f 2 , so we can measure f σ 8 (mass). Including the Gaussian damping model allows us to extend the match with the simulation data for k ∼ < 0.2 h Mpc −1 .
The match between the shape of the expected linear matter power spectrum andP should allow a test of the scales over which we can measure f σ 8 (mass). Rather than use such an estimator, it would also be possible to perform a maximum likelihood fit of Eq. (21) to the measured redshift-space power spectrum, with f σ 8 (mass) and bσ 8 (mass) as the free parameters. Our analysis has assumed the distant observer limit, a standard approximation used in many analyses. It would be straightforward to extend this work to methods that incorporate a proper radial-angular split allowing for standard survey geometries. An additional extension would be to perform such an analysis in configuration space: again, it would also be possible the extend the ideas behind our estimator to analyse the correlation function.

ACKNOWLEDGMENTS
We thank the referee Andrew J.S. Hamilton for carefully reading our manuscript and his insightful and concrete suggestions for improvements. We thank Jeremy Tinker and Luigi Guzzo, for helpful comments on an earlier draft of this paper. WJP is supported by STFC, the Leverhulme Trust and the European Research Council. WJP thanks Lawrence Berkeley National Laboratory for their hospitality during a visit in May 2008, when this work was initiated. MW is supported by NASA. The simulations used in this paper were analyzed at the National Energy Research Scientific Computing Center.