Forecasting Cosmological Constraints from Redshift Surveys

Observations of redshift-space distortions in spectroscopic galaxy surveys offer an attractive method for observing the build-up of cosmological structure, which depends both on the expansion rate of the Universe and our theory of gravity. In this paper we present a formalism for forecasting the constraints on the growth of structure which would arise in an idealized survey. This Fisher matrix based formalism can be used to study the power and aid in the design of future surveys.


INTRODUCTION
The growth of large-scale structure, as revealed in the clustering of galaxies observed in large redshift surveys, has historically been one of our most important cosmological probes. This growth is driven by a competition between gravitational attraction and the expansion of space-time, allowing us to test our model of gravity and the expansion history of the Universe. Despite the fact that galaxy light doesn't faithfully trace the mass, even on large scales, galaxies are expected to act nearly as test particles within the cosmological matter flow. Thus the motions of galaxies carry an imprint of the rate of growth of large-scale structure and allow us both to probe dark energy and to test General Relativity (e.g. Jain & Zhang 2008; Song & Koyama 2008; Song & Percival 2008; Percival & White 2008; McDonald & Seljak 2008, for recent studies).
This measurement of the growth of structure relies on redshift-space distortions seen in galaxy surveys (Kaiser 1987). Even though we expect the clustering of galaxies in real space to have no preferred direction, galaxy maps produced by estimating distances from redshifts obtained in spectroscopic surveys reveal an anisotropic galaxy distribution. The anisotropies arise because galaxy recession velocities, from which distances are inferred, include components from both the Hubble flow and peculiar velocities driven by the clustering of matter (see Hamilton 1998, for a review). Measurements of the anisotropies allow constraints to be placed on the rate of growth of clustering.
Ever larger surveys have provided ever tighter constraints. Analyses using the 2-degree Field Galaxy Redshift Survey (2dFGRS; Colless et al. 2003) have measured redshift-space distortions in both the correlation function (Peacock et al. 2001; Hawkins et al. 2003) and power spectrum. Using the Sloan Digital Sky Survey (SDSS; York et al. 2000), redshift-space distortions have also been measured in the correlation function (Zehavi et al. 2005; Okumura et al. 2008; Cabré & Gaztañaga 2008), and using an Eigenmode decomposition to separate real and redshift-space effects (Tegmark et al. 2004, 2006). These studies were recently extended to z ≃ 1 (Guzzo et al. 2007) using the VIMOS-VLT Deep Survey (VVDS; Le Fevre et al. 2005; Garilli et al. 2008). In addition to measuring clustering growth at z = 0.8, this work has emphasized the importance of using large-scale peculiar velocities for constraining models of cosmic acceleration. Current constraints on the growth rate are at the several tens of percent level (e.g. Nesseris & Perivolaropoulos 2008; Song & Percival 2008), but observational progress is rapid.
In the next section we shall outline the formalism for forecasting constraints on cosmological quantities from measurements of redshift space distortions, and compare it with previous forecasts. We begin with the simplest model and then investigate various refinements. We finish in §4 with a discussion of future directions. For illustration we shall assume a fiducial ΛCDM cosmology with $\Omega_m = 0.25$, $h = 0.72$, $n = 0.97$ and $\sigma_8 = 0.8$ (in good agreement with a variety of observations) when computing specific predictions for future surveys.

THE FISHER MATRIX
The Fisher matrix provides a method for determining the sensitivity of a particular measurement to a set of parameters and has been extensively used in cosmological forecasting and optimization. Here we adapt this methodology to our particular problem.

The simplest case
Under the assumption that the density field has Gaussian statistics and uncorrelated Fourier modes, the Fisher matrix for a set of parameters $\{p_i\}$ is (e.g. Tegmark et al. 1998)
$$F_{ij} = \frac{1}{2}\int\frac{d^3k}{(2\pi)^3}\,\frac{\partial\ln P}{\partial p_i}\,\frac{\partial\ln P}{\partial p_j}\,V_{\rm eff}(\mathbf{k}) \tag{1}$$
where $P$ is the power spectrum and the mode counting is determined by the effective volume (Feldman et al. 1994)
$$V_{\rm eff}(\mathbf{k}) \equiv V_0\left[\frac{\bar{n}P(\mathbf{k})}{1+\bar{n}P(\mathbf{k})}\right]^2$$
which depends on the geometric volume of the survey, $V_0$, and the number density, $\bar{n}$, of the tracer. If $\bar{n}$ is high enough then $V_{\rm eff} \simeq V_0$. The constraints are dominated by regions where $\bar{n}P \gtrsim 1$, so it is safe to neglect the higher order (in $\bar{n}^{-1}$) terms which arise assuming that galaxies are a Poisson sample of the underlying density fluctuations (Meiksin & White 1999).
The simplest model for the observed galaxy distribution is a linear, deterministic, and scale-independent galaxy bias, with redshift space distortions due to super-cluster infall (Kaiser 1987) and no observational non-idealities. In this case
$$P_{\rm obs}(k,\mu) \propto \left(b + f\mu^2\right)^2 P_{\rm lin}(k)$$
where $P_{\rm lin}$ is the linear theory mass power spectrum in real space, $b$ is the bias and $\mu$ the cosine of the angle to the line-of-sight. The quantity of most interest here is $f \equiv d\ln D/d\ln a$, the logarithmic derivative of the linear growth rate, $D(z)$, with respect to the scale factor $a = (1+z)^{-1}$. In general relativity $f \approx \Omega_{\rm mat}(z)^{0.6}$ (e.g. Peebles 1980), while in modified gravity models it can be smaller by tens of percent (e.g. Song & Percival 2008, figure 1). Redshift space distortions allow us to constrain $f$ times the normalization of the power spectrum (e.g. $f(z)\sigma_8(z)$), or $dD/d\ln a$. The derivatives in Eq. (1) are particularly simple
$$\frac{\partial\ln P}{\partial b} = \frac{2}{b+f\mu^2}, \qquad \frac{\partial\ln P}{\partial f} = \frac{2\mu^2}{b+f\mu^2}$$
independent of the shape of the linear theory power spectrum, and hence of the spectral index and transfer function. Since we hold the normalization of the power spectrum fixed for these derivatives, the fractional error on $f(z)\sigma_8(z)$ is equal to that on $f$ in our formalism. The errors on $b$ and $f$ depend sensitively on the maximum $k$ in the integral of Eq. (1). Since we are using linear theory we choose to cut the integral off at $k \simeq 0.1\,h\,{\rm Mpc}^{-1}$ for our fiducial cosmology. This is close to the value at which Percival & White (2008) saw departures from linear theory.
The bias and $f$ turn out to be anti-correlated, with a correlation coefficient of 70-75%, depending on the precise sample. We marginalize over $b$ by first inverting the Fisher matrix to find the covariance matrix and hence the error on $f$, specifically $\delta f = [(F^{-1})_{ff}]^{1/2}$.
Conversely, increasing $k_{\rm max}$ to $0.2\,h\,{\rm Mpc}^{-1}$ reduces the error for the $b = 1$, $\bar{n}P \gg 1$ case to 0.6%. Moving to higher redshift, keeping the bias fixed, makes the constraint stronger as $f/b$ is increased. By z = 1, $f$ has increased to 0.83 from 0.44 and $\delta f/f \simeq 1\%$ for our fiducial $b = 1$, $\bar{n}P \gg 1$ example. Note, however, that at higher z the effects of shot noise would typically be larger.
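The marginalization just described can be sketched numerically. The following is a minimal illustration, not the authors' code: it assumes the limit $\bar{n}P \gg 1$ (so $V_{\rm eff} = V_0$), a lower integration limit of $k = 0$, and a survey volume of $10\,(h^{-1}\,{\rm Gpc})^3$; the function names `kaiser_fisher` and `marginalized_errors` are hypothetical. In this limit the Fisher matrix does not depend on the shape of $P_{\rm lin}$, so no power spectrum is needed.

```python
import numpy as np

def kaiser_fisher(b, f, volume, k_max, k_min=0.0, n_mu=64):
    """2x2 Fisher matrix for (b, f), assuming nP >> 1 so that V_eff = V_0."""
    # Gauss-Legendre nodes and weights for the integral over mu in [-1, 1].
    mu, w = np.polynomial.legendre.leggauss(n_mu)
    # Logarithmic derivatives of P_obs = (b + f mu^2)^2 P_lin.
    dlnp = np.array([2.0 / (b + f * mu**2),           # d ln P / d b
                     2.0 * mu**2 / (b + f * mu**2)])  # d ln P / d f
    mu_integral = np.einsum("im,jm,m->ij", dlnp, dlnp, w)
    # d^3k / (2 pi)^3 = k^2 dk dmu / (2 pi)^2; the k-integral is analytic here.
    k_integral = (k_max**3 - k_min**3) / 3.0
    return 0.5 * volume * k_integral * mu_integral / (2.0 * np.pi) ** 2

def marginalized_errors(fisher):
    """Return 1-sigma marginalized errors and the parameter covariance."""
    cov = np.linalg.inv(fisher)
    return np.sqrt(np.diag(cov)), cov

V0 = 10.0e9  # assumed volume: 10 (Gpc/h)^3 expressed in (Mpc/h)^3
F = kaiser_fisher(b=1.0, f=0.44, volume=V0, k_max=0.1)
sigma, cov = marginalized_errors(F)
frac_err_f = sigma[1] / 0.44
corr_bf = cov[0, 1] / (sigma[0] * sigma[1])
print(f"delta f / f = {frac_err_f:.2%}, corr(b, f) = {corr_bf:.2f}")
```

Under these assumptions the sketch gives a fractional error on $f$ of about 1.6 per cent and a correlation coefficient near -0.7, and the error scales as $k_{\rm max}^{-3/2}$ because the Fisher matrix is proportional to the number of modes.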
These constraints can be compared to the forecasts in Guzzo et al. (2007), who present a fitting function for the relative error, where $\bar{n}$ is measured in $h^3\,{\rm Mpc}^{-3}$ and $V$ in $h^{-3}\,{\rm Mpc}^3$. Note that both forecasts agree on the scaling with volume, but the fitting function scales approximately as the inverse square root of the total number of galaxies in the survey and is independent of the bias. Taking into account the correlation between the constraints on $b$ and $f$ when comparing our forecasts to this scaling, we find relatively good agreement for $b \simeq 1$ and $\bar{n} \simeq 10^{-4}\,h^3\,{\rm Mpc}^{-3}$, but the Guzzo et al. (2007) scaling predicts much better constraints for higher number density or more biased samples.

Beyond linear theory
Of course we do not expect the simple linear theory result with super-cluster infall to be a perfect description of redshift space distortions on all scales. Comparison with N-body simulations suggests that halos do closely follow the matter velocity field and the major modification to the simple model at low $k$ is in the quadrupole, with an additional effect coming from the generation of multipoles higher than 4. By introducing more freedom into the model we will increase our ability to describe the extra physics acting, and simultaneously begin to degrade our sensitivity to $f$. In Percival & White (2008) it was shown that a streaming model with a Gaussian small-scale velocity provided an adequate fit to N-body simulations to $k \simeq 0.1\,h\,{\rm Mpc}^{-1}$. Under these assumptions one model for the redshift space, galaxy power spectrum could be
$$P_{\rm obs}(k,\mu) = \left(b + f\mu^2\right)^2 P_0(k)\,\exp\left(-k^2\mu^2\sigma_z^2\right)$$
where $P_0$ represents the mass power spectrum in real space and $\sigma_z$ is to be regarded as a fit parameter which encompasses a variety of violations of the traditional analysis. We can additionally model inaccuracies in the observed redshifts by a line-of-sight smearing of the structure. In the limit that this smearing is Gaussian it can be absorbed into $\sigma_z$. In this situation, the logarithmic derivatives with respect to $b$ and $f$ are unchanged and the new derivative required is simply $\partial\ln P/\partial\sigma_z^2 = -k^2\mu^2$. We now marginalize over $\sigma_z$ in addition to $b$ before reporting the constraints on $f$.
Many of the trends with $b$ and $\bar{n}$ in our simple model also hold for this extended model. For our fiducial $10\,(h^{-1}\,{\rm Gpc})^3$ volume the constraint from our z = 0, unbiased tracers with $\bar{n}P \gg 1$ weakens from $\delta f/f = 1.6\%$ to 3.2% when marginalizing over $\sigma_z$. The fiducial value of $\sigma_z$ has little impact for $\bar{n}P \gg 1$. However when $\bar{n}P \simeq 1$ the error is increased from 3% to 4% to 20% as $\sigma_z$ is increased from 0 to $10\,h^{-1}$ Mpc to $100\,h^{-1}$ Mpc for a sample with $b = 1$.
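The effect of marginalizing over $\sigma_z$ can be illustrated with a three-parameter version of the same calculation. This is a sketch under stated assumptions rather than the authors' implementation: it takes $\bar{n}P \gg 1$ (so $V_{\rm eff} = V_0$ and the fiducial $\sigma_z$ drops out of the logarithmic derivatives), integrates from $k = 0$, and uses the hypothetical function name `fisher_with_sigma_z`.

```python
import numpy as np

def fisher_with_sigma_z(b, f, volume, k_max, n_k=64, n_mu=64):
    """3x3 Fisher matrix for (b, f, sigma_z^2), assuming nP >> 1 (V_eff = V_0)."""
    x, wx = np.polynomial.legendre.leggauss(n_k)
    k = 0.5 * k_max * (x + 1.0)   # map nodes from [-1, 1] to [0, k_max]
    wk = 0.5 * k_max * wx
    mu, wmu = np.polynomial.legendre.leggauss(n_mu)
    K, MU = np.meshgrid(k, mu, indexing="ij")
    kaiser = b + f * MU**2
    dlnp = np.array([2.0 / kaiser,          # d ln P / d b
                     2.0 * MU**2 / kaiser,  # d ln P / d f
                     -(K * MU) ** 2])       # d ln P / d sigma_z^2
    # Integration measure k^2 dk dmu / (2 pi)^2 on the quadrature grid.
    weight = np.outer(wk, wmu) * K**2 / (2.0 * np.pi) ** 2
    return 0.5 * volume * np.einsum("ikm,jkm,km->ij", dlnp, dlnp, weight)

F3 = fisher_with_sigma_z(b=1.0, f=0.44, volume=10.0e9, k_max=0.1)
err_marg = np.sqrt(np.linalg.inv(F3)[1, 1]) / 0.44           # sigma_z free
err_fixed = np.sqrt(np.linalg.inv(F3[:2, :2])[1, 1]) / 0.44  # sigma_z held fixed
print(f"fixed: {err_fixed:.2%}, marginalized: {err_marg:.2%}")
```

With these inputs the error on $f$ roughly doubles once $\sigma_z$ is allowed to vary, in line with the weakening quoted above; small differences from the quoted value can arise from the choice of lower $k$ limit.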
Another alternative is to model the small-scale suppression with a Lorentzian, which provides a better fit at higher $k$ and is a good match to the superposition of Gaussians of different widths from halos of different masses (White 2001). The two agree to lowest order in $k\sigma_z$, which is the regime of most interest here, with the Gaussian matching the results of N-body simulations at small $k$ slightly better than the exponential (Percival & White 2008). Changing the form from Gaussian to exponential makes a negligible change in our forecasts.
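To make the lowest-order agreement explicit, the two damping factors can be expanded in $x \equiv k^2\mu^2\sigma_z^2$. The Lorentzian normalization used here, $(1+x)^{-1}$, is an assumption made for illustration; the text does not specify it.
$$e^{-x} = 1 - x + \tfrac{1}{2}x^2 - \ldots, \qquad \frac{1}{1+x} = 1 - x + x^2 - \ldots$$
Under that assumption the two forms first differ at order $x^2$. For example, an illustrative $\sigma_z = 3\,h^{-1}$ Mpc at $k = 0.1\,h\,{\rm Mpc}^{-1}$ and $\mu = 1$ gives $x = 0.09$, for which the two damping factors differ by about 0.0035.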

Mode by mode
The forecasts above all made quite strong assumptions about the relationship between the velocity and density power spectra, assumptions which are only known to be true for quasi-linear scales within the context of General Relativity. The parameters {p i } in our Fisher matrix (Eq. 1) don't have to be cosmological parameters however. We can fit directly for the three independent power spectra (the density-density, velocity-velocity and density-velocity spectra) rather than assuming that they are related by a specific functional form (e.g. Tegmark et al. 2004). Such constraints would be applicable to a wide range of theories including e.g., interacting dark energy, clustered dark energy or f (R) gravity.

Correlation of δ and Θ
N-body simulations of ΛCDM cosmologies show that the density and velocity divergence are highly correlated for $k < 0.1\,h\,{\rm Mpc}^{-1}$ (see Fig. 1) so we will begin by making the assumption that the density and velocities are perfectly correlated (to be relaxed in §2.3.2). Then the density-velocity cross-spectrum becomes the geometric mean of the two auto-spectra and we have only two free functions. If we write Θ for the velocity divergence in units of $aH$ the power spectrum becomes
$$P_{\rm obs}(k,\mu,z) = P_{gg}(k,z) + 2\mu^2\left[P_{gg}(k,z)\,P_{\Theta\Theta}(k,z)\right]^{1/2} + \mu^4 P_{\Theta\Theta}(k,z)$$
where $P_{gg}$ denotes the usual galaxy density auto-spectrum and we have assumed that small-scale ("finger of god"; Jackson 1972) effects have been cleanly removed by e.g. finger of god compression. The parameters in the Fisher matrix, Eq. (1), are now the values of the two spectra themselves, in bins of $k$ and $z$:
$$p_\alpha = \left\{P_{gg}(k_i,z_j),\ P_{\Theta\Theta}(k_i,z_j)\right\}$$
and the variance of $P_{\Theta\Theta}(k_i,z_j)$ is given by
$$\sigma^2\left[P_{\Theta\Theta}(k_i,z_j)\right] = F^{-1}_{22}(k_i,z_j)$$
where $F^{-1}_{22}(k_i,z_j)$ is the 22-component of the inverse matrix of $F_{\alpha\beta}$. The constraint on $P_{\Theta\Theta}(k_i,z_j)$ in 2 redshift bins each of ∆z = 0.2 is plotted in Fig. 2 for a half-sky survey with $\bar{n} = 5\times10^{-3}\,h^3\,{\rm Mpc}^{-3}$ and $b = 1.5$.
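A single-bin version of this mode-by-mode forecast can be sketched as follows. The thin-shell treatment of the $k$-bin, the function name `bin_fisher` and all numerical inputs (band powers, volume, bin width) are hypothetical choices made for illustration; only the form of $P_{\rm obs}$ and of $V_{\rm eff}$ is taken from the text.

```python
import numpy as np

def bin_fisher(p_gg, p_tt, nbar, volume, k, dk, n_mu=64):
    """2x2 Fisher matrix for (P_gg, P_ThetaTheta) in one k-bin, assuming r = 1."""
    mu, w = np.polynomial.legendre.leggauss(n_mu)
    p_obs = (np.sqrt(p_gg) + mu**2 * np.sqrt(p_tt)) ** 2
    # Logarithmic derivatives of P_obs with respect to the two band powers.
    dlnp = np.array([(1.0 + mu**2 * np.sqrt(p_tt / p_gg)) / p_obs,
                     (mu**2 * np.sqrt(p_gg / p_tt) + mu**4) / p_obs])
    v_eff = volume * (nbar * p_obs / (1.0 + nbar * p_obs)) ** 2
    # Thin-shell approximation: d^3k / (2 pi)^3 -> k^2 dk dmu / (2 pi)^2.
    shell = k**2 * dk / (2.0 * np.pi) ** 2
    return 0.5 * shell * np.einsum("im,jm,m->ij", dlnp, dlnp, w * v_eff)

# Hypothetical fiducial band powers, built from a linear-theory amplitude.
b, f, p_lin = 1.5, 0.7, 5000.0
F_bin = bin_fisher(p_gg=b**2 * p_lin, p_tt=f**2 * p_lin,
                   nbar=5e-3, volume=1.0e9, k=0.1, dk=0.01)
frac_err_ptt = np.sqrt(np.linalg.inv(F_bin)[1, 1]) / (f**2 * p_lin)
print(f"fractional error on P_ThetaTheta in this bin: {frac_err_ptt:.1%}")
```

Because $P_{gg} = b^2 P_{\rm lin}$ and $P_{\Theta\Theta} = f^2 P_{\rm lin}$ is a one-to-one change of variables at fixed $P_{\rm lin}$, the fractional error on $P_{\Theta\Theta}$ from this sketch is twice the fractional error on $f$ that the $(b, f)$ parametrization would give from the same bin.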

Decorrelation of δ and Θ
The assumption of tight correlation between δ and Θ is a reasonable one (densities grow where flows converge and velocities are high where mass concentrations cause a large gravitational potential) but is not required. We can extend the formalism above by allowing the cross-correlation coefficient,
$$r(k) \equiv \frac{P_{g\Theta}(k)}{\sqrt{P_{gg}(k)\,P_{\Theta\Theta}(k)}},$$
to differ from unity. The power spectrum can now be written in terms of 3 free functions ($P_{gg}$, $P_{\Theta\Theta}$, $r$) as
$$P_{\rm obs} = \left[P_{gg} + 2\mu^2 r(k)\sqrt{P_{gg}P_{\Theta\Theta}} + \mu^4 P_{\Theta\Theta}\right]G_{\rm FoG}(k,\mu;\sigma_z) \tag{11}$$
where $G_{\rm FoG}$ is a Gaussian describing the decrease in power due to virial motions and the derivatives are given by
$$\frac{\partial\ln P_{\rm obs}}{\partial P_{gg}} = \frac{G_{\rm FoG}}{P_{\rm obs}}\left[1 + \mu^2 r\sqrt{\frac{P_{\Theta\Theta}}{P_{gg}}}\right], \quad \frac{\partial\ln P_{\rm obs}}{\partial P_{\Theta\Theta}} = \frac{G_{\rm FoG}}{P_{\rm obs}}\left[\mu^2 r\sqrt{\frac{P_{gg}}{P_{\Theta\Theta}}} + \mu^4\right], \quad \frac{\partial\ln P_{\rm obs}}{\partial r} = \frac{G_{\rm FoG}}{P_{\rm obs}}\,2\mu^2\sqrt{P_{gg}P_{\Theta\Theta}} \tag{12}$$
We find that allowing $r(k)$ to be completely free degrades the constraint on $P_{\Theta\Theta}$, and eventually $f$, significantly, until it is equivalent to simply measuring the $\mu^4$ component in Eq. (11). To strengthen the constraint requires prior information about $r(k)$, which can in principle be obtained from simulations or perturbation theory calculations of structure formation in modified gravity models. As an illustrative example, if we assume a prior where the error on $r$ is equal to its deviation from unity (using the fiducial model with $r$ measured from N-body simulations as in Fig. 1) we find the constraint on $P_{\Theta\Theta}$ is almost the same as we obtained before. Also note that in this analysis we can mitigate our uncertainty in the form of the small-scale redshift space distortion by downweighting modes for which the fingers-of-god (Jackson 1972) are expected to be large. The residual uncertainty after the weighting depends on a weight function $w_{\rm FoG}(k,\mu)$, which could, for example, be built from $G_{\rm FoG}$, the finger-of-god suppression factor, and $\sigma_{\rm th}$, a threshold value indicating our confidence in the accuracy of the FoG model.
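The role of a prior on $r$ can be illustrated with a three-parameter, single-bin Fisher matrix. This is a sketch only: it sets $G_{\rm FoG} = 1$, works with $(\ln P_{gg}, \ln P_{\Theta\Theta}, r)$ so that all entries are of comparable size, and adds a Gaussian prior of width `sigma_r_prior` to the $rr$ element. The function name and every numerical input are hypothetical.

```python
import numpy as np

def bin_fisher_r(p_gg, p_tt, r, nbar, volume, k, dk, sigma_r_prior=None, n_mu=64):
    """3x3 Fisher matrix for (ln P_gg, ln P_ThetaTheta, r) in one k-bin, G_FoG = 1."""
    mu, w = np.polynomial.legendre.leggauss(n_mu)
    cross = np.sqrt(p_gg * p_tt)
    p_obs = p_gg + 2.0 * mu**2 * r * cross + mu**4 * p_tt
    # d ln P_obs / d ln P_gg, d ln P_obs / d ln P_ThetaTheta, d ln P_obs / d r.
    dlnp = np.array([(p_gg + mu**2 * r * cross) / p_obs,
                     (mu**2 * r * cross + mu**4 * p_tt) / p_obs,
                     2.0 * mu**2 * cross / p_obs])
    v_eff = volume * (nbar * p_obs / (1.0 + nbar * p_obs)) ** 2
    shell = k**2 * dk / (2.0 * np.pi) ** 2   # thin-shell k-bin
    fisher = 0.5 * shell * np.einsum("im,jm,m->ij", dlnp, dlnp, w * v_eff)
    if sigma_r_prior is not None:
        fisher[2, 2] += 1.0 / sigma_r_prior**2   # Gaussian prior on r
    return fisher

def frac_err_ptt(fisher):
    """Marginalized fractional error on P_ThetaTheta (error on its logarithm)."""
    return np.sqrt(np.linalg.inv(fisher)[1, 1])

args = dict(p_gg=11250.0, p_tt=2450.0, r=0.98, nbar=5e-3, volume=1.0e9, k=0.1, dk=0.01)
err_free = frac_err_ptt(bin_fisher_r(**args))                       # r unconstrained
err_prior = frac_err_ptt(bin_fisher_r(**args, sigma_r_prior=0.02))  # prior = 1 - r
err_fixed = frac_err_ptt(bin_fisher_r(**args)[:2, :2])              # r held fixed
print(err_free, err_prior, err_fixed)
```

The three numbers are ordered as the text describes: leaving $r$ free gives the weakest constraint, fixing it gives the strongest, and a prior interpolates between the two.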

Multiple populations
Until now we have implicitly assumed that we are dealing with a single population of objects. However galaxies come in a variety of sizes, luminosities, masses and types which exhibit different clustering patterns but all of which are expected to respond to the same large-scale velocity field. McDonald & Seljak (2008) pointed out recently that this allows, in principle, for significant gains in determination of the growth of structure. In fact, in the limit of Gaussian statistics, perfectly deterministic bias and infinitely dense tracers, one can measure the velocity power spectrum limited only by the total number of modes in the survey in all directions.
To include multiple populations in the Fisher matrix approach, there are two obvious ways of proceeding. McDonald & Seljak (2008) assumed that the densities, $\delta_i$ for $1 \le i \le N$, are the measured quantities and built a covariance matrix in terms of the power spectra, $\langle\hat\delta_i\hat\delta_j\rangle$, where a hat denotes a measured quantity that includes a noise term. An alternative and complementary approach is to extend the analysis presented in §2.1 assuming that the power spectra are the measured quantities. For Gaussian fluctuations, in which all of the cosmological information is encoded in the power spectrum, these approaches turn out to be equivalent. We develop this second approach here.
Figure 3. The fractional error on $f(z)\sigma_8(z)$ arising from a $10\,(h^{-1}\,{\rm Gpc})^3$ survey at z = 0 populated with two types of galaxies. The first population is held fixed with $b_1 = 1$ and $\bar{n}_1 = 10^{-2}\,h^3\,{\rm Mpc}^{-3}$, i.e. $\bar{n}P \gg 1$. The second population has $b_2 = 1.4$ (solid), $b_2 = 2$ (dashed) or $b_2 = 4$ (dotted) and the constraint is plotted vs. $\bar{n}_2$. All else being equal, the fractional constraints would be tighter at higher z where $f$ is larger.

The Fisher matrix
The Fisher matrix for this problem is a simple generalization of Eq. (1) but now there are $N(N+1)/2$ power spectra for $N$ galaxy populations. For example, in the simplest case of two populations there are 3 measured power spectra, which on large scales are estimates of
$$P_{ij}(k,\mu) = \left(b_i + f\mu^2\right)\left(b_j + f\mu^2\right)P_{\rm lin}(k)$$
where $i$ and $j$ run over 1 and 2 and $P_{12} = P_{21}$.
To calculate the Fisher matrix for multiple samples we need to sum over $[N(N+1)/2]^2$ elements of the inverse covariance matrix
$$F_{ij} = \sum_{XY}\int\frac{V_0\,d^3k}{(2\pi)^3}\,\frac{\partial P_X}{\partial p_i}\,C^{-1}_{XY}\,\frac{\partial P_Y}{\partial p_j} \tag{17}$$
where we denote a pair of galaxy indices by $X$ or $Y$. Note that this reduces to Eq. (1) for a single population. In order to calculate the Fisher matrix, we need to determine the covariance matrix and the derivatives of the power spectra with respect to the cosmological parameters of choice.

Calculating the covariance matrix
If we assume that the bias is deterministic and that the shot noise can be treated as an (uncorrelated) Gaussian noise the covariance matrix for the power spectra is straightforward to compute. Similar results for the covariance matrix of quadratic combinations of Gaussian fields have been determined previously when considering CMB temperature and polarization power spectra (e.g. Zaldarriaga & Seljak 1997; Kamionkowski et al. 1997) or the problem of combining density and velocity power spectra (e.g. Burkey & Taylor 2004). Our problem is slightly more general, in that we need to consider additional combinations of power spectra, but similar in spirit.
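If we define $N_a \equiv [1 + 1/(\bar{n}_a P_{aa})]$, the diagonal terms in the covariance matrix are
$$C_{aaaa} = 2P_{aa}^2N_a^2, \qquad C_{abab} = P_{ab}^2 + P_{aa}P_{bb}N_aN_b$$
where $a \neq b$, and the off-diagonal terms are calculated from
$$C_{aaab} = 2P_{aa}P_{ab}N_a, \qquad C_{aabb} = 2P_{ab}^2$$
$$C_{aabc} = 2P_{ab}P_{ac} \tag{21}$$
$$C_{abac} = P_{ab}P_{ac} + P_{aa}P_{bc}N_a \tag{22}$$
$$C_{abcd} = P_{ac}P_{bd} + P_{ad}P_{bc}$$
where $a \neq b \neq c \neq d$. These formulae are complete with the relations $P_{ab} = P_{ba}$ and $C_{XY} = C_{YX}$. The off-diagonal terms given in Eq. (22) do not occur in the CMB example as there is only a single non-zero cross power there. However, all covariance matrix elements can be calculated using the same standard procedure (see Appendix A).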

Calculating the derivatives
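If the parameters that we wish to constrain are $b_a$ and $f$ then
$$\frac{\partial P_{ab}}{\partial b_c} = \left[\delta^K_{ac}\left(b_b + f\mu^2\right) + \delta^K_{bc}\left(b_a + f\mu^2\right)\right]P_{\rm lin}, \qquad \frac{\partial P_{ab}}{\partial f} = \mu^2\left(b_a + b_b + 2f\mu^2\right)P_{\rm lin}$$
where $\delta^K_{ac}$ and $\delta^K_{bc}$ are Kronecker δs. This completes the input that we need for the Fisher matrix, Eq. (17).
We can include a parametrized line-of-sight smearing by multiplying the power spectra by e.g. $\exp[-(1/2)k^2\mu^2(\sigma_a^2 + \sigma_b^2)]$. The derivatives are multiplied by the same factor and there is an additional set
$$\frac{\partial P_{ab}}{\partial\sigma_c^2} = -\frac{1}{2}k^2\mu^2\left(\delta^K_{ac} + \delta^K_{bc}\right)P_{ab}$$
For the mode-by-mode parametrization developed in §2.3 we can use the logarithmic derivatives in Eq. (12), multiplied by $P_{\rm obs}(k_a, z_b)$.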
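The ingredients of this section (covariance, derivatives, sum over pairs of spectra) can be assembled in a short sketch. It is an illustration under stated assumptions, not the authors' code: the covariance is written in the compact Gaussian form $C_{XY} = \tilde{P}_{ac}\tilde{P}_{bd} + \tilde{P}_{ad}\tilde{P}_{bc}$ with $\tilde{P}_{ab} = P_{ab} + \delta^K_{ab}/\bar{n}_a$, which reproduces the elements quoted above; line-of-sight smearing is omitted; and `toy_p_lin` is a made-up smooth spectrum standing in for a real linear power spectrum.

```python
import itertools
import numpy as np

def toy_p_lin(k):
    """Hypothetical smooth spectrum in (Mpc/h)^3; a stand-in, not a LambdaCDM P(k)."""
    return 4.0e4 * (k / 0.02) / (1.0 + (k / 0.02) ** 2) ** 1.2

def multitracer_fisher(biases, f, nbars, volume, k_max, p_lin, n_k=48, n_mu=48):
    """Fisher matrix for (b_1, ..., b_N, f) from all auto- and cross-spectra."""
    biases, nbars = np.asarray(biases, float), np.asarray(nbars, float)
    n_pop = len(biases)
    pairs = list(itertools.combinations_with_replacement(range(n_pop), 2))
    x, wx = np.polynomial.legendre.leggauss(n_k)
    ks, wk = 0.5 * k_max * (x + 1.0), 0.5 * k_max * wx
    mus, wmu = np.polynomial.legendre.leggauss(n_mu)
    fisher = np.zeros((n_pop + 1, n_pop + 1))
    for k, w_k in zip(ks, wk):
        for mu, w_mu in zip(mus, wmu):
            amp = biases + f * mu**2                # Kaiser factor per population
            signal = np.outer(amp, amp) * p_lin(k)  # P_ab
            total = signal + np.diag(1.0 / nbars)   # shot noise on auto-spectra
            # Gaussian covariance between spectra X = (a, b) and Y = (c, d).
            cov = np.array([[total[a, c] * total[b, d] + total[a, d] * total[b, c]
                             for (c, d) in pairs] for (a, b) in pairs])
            # Rows: parameters (b_1..b_N, f); columns: spectra in `pairs`.
            deriv = np.zeros((n_pop + 1, len(pairs)))
            for col, (a, b) in enumerate(pairs):
                deriv[a, col] += amp[b] * p_lin(k)
                deriv[b, col] += amp[a] * p_lin(k)
                deriv[n_pop, col] = mu**2 * (amp[a] + amp[b]) * p_lin(k)
            measure = volume * k**2 * w_k * w_mu / (2.0 * np.pi) ** 2
            fisher += measure * deriv @ np.linalg.solve(cov, deriv.T)
    return fisher

one = multitracer_fisher([1.0], 0.44, [1e-2], 1.0e10, 0.1, toy_p_lin)
two = multitracer_fisher([1.0, 2.0], 0.44, [1e-2, 1e-3], 1.0e10, 0.1, toy_p_lin)
err_one = np.sqrt(np.linalg.inv(one)[-1, -1]) / 0.44
err_two = np.sqrt(np.linalg.inv(two)[-1, -1]) / 0.44
print(f"one population: {err_one:.2%}, two populations: {err_two:.2%}")
```

For a single population the sketch reduces to Eq. (1) with $V_{\rm eff} = V_0/N_a^2$, which is a useful consistency check; the size of the two-population gain depends on the assumed spectrum and number densities, so the printed values are illustrative only.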

Results
We confirm the finding of McDonald & Seljak (2008) that using multiple populations can result in significant gains in constraining power. For example Fig. 3 shows the fractional error on $f$ decreases by a factor of 2-3 if a second population is simultaneously used to provide constraints. At fixed number density the gain is higher the more biased is the second sample, and the constraint is weakened as the bias of the first sample is increased. Thus we would like to find two samples with very different clustering properties but reasonable number densities.
As long as the line-of-sight dispersion, $\sigma_i$, is not large the marginalization has little effect on the total error. One does, however, prefer slightly higher $\bar{n}P$ when marginalizing over $\sigma_i$ than when keeping it fixed.
The gains saturate quickly when using more than two samples. In fact if the total number of objects observed is to be held fixed, it is better to increase the number densities of the lowest and highest biased sets rather than include an intermediately biased sample at the expense of lower number densities for all samples.
Within the deterministic bias model, splitting into multiple populations does not affect constraints on the overall large-scale power spectrum shape: here we are always limited by the total number of modes in the sample. A bias model can be used to weight galaxies of different bias, allowing for their different clustering strengths, in order to optimally calculate the overall power spectrum shape. Any cosmological benefit from splitting into multiple samples will therefore arise through better constraints on $f(z)\sigma_8(z)$.
Survey         n̄     z                N_gal
BOSS           3     0.1 < z < 0.7    1.5
WFMOS (1)      5     0.5 < z < 1.3    2.0
WFMOS (2)      5     2.3 < z < 3.3    0.6
EUCLID/JDEM    50    0.1 < z < 2.0    500
Table 1. Fiducial parameters adopted as indicative of various planned or ongoing surveys. $N_{\rm gal}$ is given in units of $10^6$, e.g. BOSS has 1.5 million galaxies, and $\bar{n}$ in units of $10^{-4}\,h^3\,{\rm Mpc}^{-3}$. For the survey with the proposed WFMOS instrument, we assume that this is split into low (1) and high (2) redshift components as proposed in Glazebrook et al. (2005). We assume that each survey covers a fixed fraction of the sky, so the volume within any redshift interval is completely determined by these parameters.
Figure 4. Predicted errors on $f(z)\sigma_8(z)$ for the surveys of Table 1. We consider all galaxies in a single bin (solid lines), and split into 4 bins as a function of bias (dashed lines). The existing constraints, as collected in Song & Percival (2008) and with the addition of da Angela et al. (2008), are shown as solid squares (see text).

PREDICTIONS FOR FUTURE SURVEYS
In this section we apply our Fisher matrix formalism to 3 concepts for future spectroscopic surveys, with fiducial parameters given in Table 1. We assume a tight prior on small-scale velocity dispersion. The galaxy bias is one of the hardest parameters to predict for future surveys, so we have adopted a conservative approach here. We assume that redshift zero galaxy bias is sampled from a uniform distribution with $1 < b < 2$. The bias evolves with redshift such that the galaxy clustering amplitude is constant. For all surveys, we assume that we can use all modes with $k < 0.075\,h\,{\rm Mpc}^{-1}$ at z = 0, and that this limit evolves with redshift according to the Smith et al. (2003) prescription for $k_{\rm nl}$. This assumption is deserving of further investigation in N-body simulations.
Figure 5. Predicted constraints on $P_{\Theta\Theta}$ for the surveys of Table 1. We show only a single, representative bin in redshift for each experiment, to avoid clutter.
Predictions for the error on f (z)σ 8 (z) are shown in Fig. 4. Results are presented either assuming that all galaxies are analyzed in a single bin, or are split according to galaxy bias. For surveys with a large number density, such as that proposed for the EUCLID/JDEM concept, splitting into bins with different galaxy bias can significantly reduce the expected errors, as we saw in §2.4. The galaxy sampling density proposed for the BOSS and WFMOS surveys is lower, and we gain less from the multiple-sample approach.
It is useful to ask how these forecasts depend on the input assumptions. As an illustration, if we decrease the number density by a factor of 1.5 the BOSS, WFMOS(1) and unsplit Euclid/JDEM results are largely unchanged. The error on WFMOS(2) increases by 30% while the split Euclid/JDEM limit increases by 10%. If we decrease the maximum $k$ from $0.075\,h\,{\rm Mpc}^{-1}$ to $0.05\,h\,{\rm Mpc}^{-1}$ at z = 0 (with the same scaling to higher z), all of the constraints become weaker. The limit from BOSS increases by ∼ 40%, WFMOS(1) by ∼ 75%, WFMOS(2) by ∼ 15% and Euclid/JDEM by ∼ 85% for the unsplit case and 60% for the split case.
The direct constraint on $P_{\Theta\Theta}$ for our futuristic surveys is shown in Fig. 5, for some representative bins in redshift. We expect to be able to place tight constraints on the velocity power spectrum near $k \simeq 0.1\,h\,{\rm Mpc}^{-1}$ with future experiments.

CONCLUSIONS
Observations of redshift-space distortions in spectroscopic galaxy surveys offer a powerful way to measure the large-scale velocity field, which in turn provides a sensitive test of both the expansion rate of the Universe and our theory of gravity. We have developed a Fisher matrix formalism which allows forecasting of the constraints future, idealized, surveys would be able to place on the linear growth rate, $f(z)\sigma_8(z) \propto dD/d\ln a$, and shown that they are potentially highly constraining, though not as constraining as the scaling of Guzzo et al. (2007) predicts.
We have developed the Fisher matrix exposition in multiple levels of sophistication and realism, assuming strict functional forms for the power spectra or allowing them to float freely. As expected the constraints are tightest when theoretical investigations can provide good priors for the form and range of parameters, but even relatively conservative assumptions suggest that percent level measurements of f should be possible with future surveys. Further work on understanding the correlation of velocity and density fields, scale-dependent bias and non-linear effects could pay big dividends.
As pointed out by McDonald & Seljak (2008), using multiple populations of galaxies can tighten the constraint on $f(z)\sigma_8(z)$ (though it does not improve measurement of the shape of $P_{\Theta\Theta}$). We show that this can be naturally incorporated into our formalism. The largest improvement comes when disjoint samples with a large difference in bias, both having a high number density [$\bar{n}P(k \simeq 0.1\,h\,{\rm Mpc}^{-1}) \gg 1$], are used. The ultimate limit to this method will come from stochasticity in the biasing of galaxies, and which types of galaxies minimize this effect on which scales is an important avenue for further investigation.
We have made a number of simplifications in this analysis which it will be important to address in future work. First we have assumed that the large-scale velocity field of the galaxies is that of the matter. Ultimately our ability to model any velocity bias will set a lower limit on what can be achieved. It is important to note that we are limited by how accurately the halo centers follow the mass velocity field "on large scales", rather than a bias within the halos. For the former, simulations suggest that halos do tend to trace the mass very well (Huff et al. 2007;Percival & White 2008). Current observational constraints on the latter from modeling clusters are consistent with no velocity bias at the 10% level (e.g. Sodré et al. 1989). Simulations suggest little or no velocity bias for the majority of "galaxies" in dark matter (Springel et al. 2001;Faltenbacher et al. 2006) and hydrodynamic (Berlind et al. 2003) simulations at the same level. Investigations of these phenomena in simulations of the standard cosmology and with alternative theories of gravity will be very valuable.
A code to compute the Fisher matrix given survey parameters is available at http://mwhite.berkeley.edu/Redshift.

ACKNOWLEDGMENTS
MW thanks Uros Seljak and YS thanks Olivier Dore for discussions on redshift space distortions. MW is supported by NASA. YS is supported by STFC. WJP is supported by STFC, the Leverhulme Trust and the European Research Council. The simulations used in this paper were analyzed at the National Energy Research Scientific Computing Center.

APPENDIX A: THE OFF-DIAGONAL POWER SPECTRUM COMPONENTS
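In this section, we derive formulae for $C_{aabc}$ and $C_{abac}$, as given in Eqns. 21 & 22. Similar derivations for CMB power spectra were presented in Kamionkowski et al. (1997).
Suppose that, in a particular experiment, we have $M$ independent complex samples $\delta^m$, with $1 \le m \le M$, drawn from a multivariate Gaussian distribution, with $\hat{P}_{ab} = (1/M)\sum_m\delta^{m*}_a\delta^m_b$. Our estimate of the covariance between power spectrum measurements from this experiment is
$$\langle\hat{P}_{ab}\hat{P}_{cd}\rangle = \frac{1}{M^2}\sum_{m,m'}\left\langle\delta^{m*}_a\delta^m_b\,\delta^{m'*}_c\delta^{m'}_d\right\rangle \tag{A1}$$
To proceed, we split this sum into terms with $m = m'$ and $m \neq m'$. Where $m = m'$, we can use the standard result for the fourth-order moments of multivariate Gaussian random variables that, if $x_a$ are real and Gaussian distributed, the expectation
$$\langle x_ax_bx_cx_d\rangle = \langle x_ax_b\rangle\langle x_cx_d\rangle + \langle x_ax_c\rangle\langle x_bx_d\rangle + \langle x_ax_d\rangle\langle x_bx_c\rangle$$
For the component of Eq. (A1) where $m \neq m'$, we can easily decompose into second-order moments. Writing $\bar{P}_{ab} \equiv \langle\hat{P}_{ab}\rangle$ for the expectation of the measured spectrum, including the noise term, for $\hat{P}_{aa}\hat{P}_{bc}$ this procedure gives
$$\langle\hat{P}_{aa}\hat{P}_{bc}\rangle = \frac{1}{M}\left[2\bar{P}_{ab}\bar{P}_{ac}\right] + \bar{P}_{aa}\bar{P}_{bc}$$
so $C_{aabc} = 2\bar{P}_{ab}\bar{P}_{ac}$ divided by the number of modes.
To calculate $C_{abac}$, note that this procedure also gives that
$$\langle\hat{P}_{ab}\hat{P}_{ac}\rangle = \frac{1}{M}\left[\bar{P}_{ab}\bar{P}_{ac} + \bar{P}_{aa}\bar{P}_{bc}\right] + \bar{P}_{ab}\bar{P}_{ac}$$
so $C_{abac} = \bar{P}_{aa}\bar{P}_{bc} + \bar{P}_{ab}\bar{P}_{ac}$, divided by the number of modes. The other terms in the covariance matrix can be calculated using the same methodology.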
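These two results can be checked with a small Monte Carlo experiment. The sketch below is illustrative only: it uses real Gaussian variables (the case for which the fourth-moment identity is quoted above), a made-up $3\times3$ matrix of noise-inclusive spectra, and arbitrary values for the number of modes $M$ and the number of mock experiments $R$.

```python
import numpy as np

rng = np.random.default_rng(0)
P = np.array([[1.0, 0.6, 0.5],
              [0.6, 1.0, 0.4],
              [0.5, 0.4, 1.0]])   # hypothetical noise-inclusive spectra
M, R = 50, 20000                   # modes per experiment, number of mock experiments

# Draw R mock experiments of M modes each; x has shape (R, M, 3).
x = rng.multivariate_normal(np.zeros(3), P, size=(R, M))
# Power spectrum estimator for every pair of fields in every experiment.
P_hat = np.einsum("rma,rmb->rab", x, x) / M

def sample_cov(u, v):
    """Covariance of two estimators across the mock experiments."""
    return np.mean((u - u.mean()) * (v - v.mean()))

a, b, c = 0, 1, 2
C_aabc_mc = sample_cov(P_hat[:, a, a], P_hat[:, b, c])
C_abac_mc = sample_cov(P_hat[:, a, b], P_hat[:, a, c])
C_aabc_th = 2.0 * P[a, b] * P[a, c] / M
C_abac_th = (P[a, a] * P[b, c] + P[a, b] * P[a, c]) / M
print(C_aabc_mc / C_aabc_th, C_abac_mc / C_abac_th)
```

Both ratios should scatter around unity, with a spread set by the finite number of mock experiments.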