Interpreting the clustering of radio sources

We develop the formalism required to interpret, within a CDM framework, the angular clustering of sources in a deep radio survey. The effect of non-linear evolution of density perturbations is discussed, as is the effect of the assumed redshift distribution of sources. We also investigate what redshift ranges contribute to the clustering signal at different angular scales. Application of the formalism is focused on the clustering detected in the FIRST survey, but measurements made for other radio surveys are also investigated. We comment on the implications for the evolution of clustering.


INTRODUCTION
The canonical cold dark matter (CDM) model for the origin of structure in the Universe provides a concise qualitative description of large-scale-structure data spanning several orders of magnitude in distance scales, but on closer inspection there seem to be anomalies. For example, optical and infrared surveys seem to indicate more power on large scales relative to that on small scales (Peacock & Dodds 1994). A number of variations on the standard CDM model have been suggested which improve the agreement between some data and model predictions. These include (but are not limited to) the introduction of a cosmological constant (Efstathiou, Sutherland & Maddox 1990; Kofman, Gnedin & Bahcall 1993; Krauss & Turner 1995; Ostriker & Steinhardt 1995) or a lower Hubble constant (Bartlett et al. 1995). However, a problem with discriminating between these various models from large-scale-structure measurements is that luminous matter may be 'biased' relative to the mass distribution (Kaiser 1984), and this bias is likely to evolve with time and scale (see, e.g., Matarrese et al. 1997). Further investigation of the bias of luminous objects is thus important in the recovery of the primordial power spectrum from large-scale-structure data.
In this paper, we investigate the implications of new data obtained from deep radio surveys: in particular, the VLA FIRST (Faint Images of the Radio Sky at Twenty centimetres) survey (Becker, White & Helfand 1995; White et al. 1997).
The goal of FIRST is to survey the 10 000 deg$^2$ scheduled for inclusion in the Sloan Digital Sky Survey down to a flux-density threshold of 1 mJy. At present, the survey covers almost 3000 deg$^2$ of sky in the region $07^{\rm h}16^{\rm m} < \alpha < 17^{\rm h}40^{\rm m}$ and $22^\circ < \delta < 42^\circ$. This yields a catalogue of $\approx 250\,000$ sources, about one-third of which are in double-lobed and multicomponent sources. The survey has been shown to be 95 per cent complete down to 2 mJy, and 80 per cent complete to 1 mJy. The mean redshift of the radio sources in the survey is $z \approx 1$, as opposed to the redshifts $z \lesssim 0.1$ characteristic of sources in optical or infrared surveys, so the typical physical distances for fixed angular separations are larger. Clustering of FIRST radio sources therefore has the potential to probe the power spectrum on large scales and at earlier epochs.
The first high-significance detection of an angular correlation function (CF) for radio sources was presented in Cress et al. (1996). Using the CF estimator proposed by Landy & Szalay (1993, hereafter LS) on the first 1550 deg$^2$ of the survey, it was found that the CF between $0.02^\circ$ and $2^\circ$ is well fitted by a power law of the form $(\theta/\theta_0)^{1-\gamma}$, where $\theta_0 = 6.2 \times 10^{-3}$ and $\gamma = 2.2$. Results using a simpler estimator were also presented, but when the survey was later expanded to include a new $\approx 1500$ deg$^2$ it became evident that the LS estimator was the more robust (Cress et al. 1997). CF measurements shown here are determined using the LS estimator in the expanded survey area.
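As an illustration of the estimator referred to above, the following minimal Python sketch evaluates the LS form $w = (DD - 2DR + RR)/RR$ in a single angular bin from raw pair counts. The function names and the pair-count normalization (unique pairs for DD and RR, all cross pairs for DR) are assumptions made for this sketch; it is not the pipeline used for the FIRST measurement.

```python
def normalise_pair_counts(n_dd, n_dr, n_rr, n_data, n_random):
    """Convert raw pair counts in one angular bin to normalised counts.

    Assumed convention: DD and RR are divided by the number of unique
    pairs, DR by the number of data-random cross pairs.
    """
    dd = n_dd / (n_data * (n_data - 1) / 2.0)
    dr = n_dr / (n_data * n_random)
    rr = n_rr / (n_random * (n_random - 1) / 2.0)
    return dd, dr, rr


def landy_szalay(dd, dr, rr):
    """Landy & Szalay (1993) estimator w = (DD - 2 DR + RR) / RR."""
    if rr <= 0:
        raise ValueError("RR must be positive in every bin")
    return (dd - 2.0 * dr + rr) / rr
```

For an unclustered sample the normalised counts agree and the estimator returns zero; an excess of data-data pairs gives a positive $w$.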
The purpose of the work presented here is to develop the formalism required to interpret, within a CDM framework, angular CF measurements of deep surveys, and to apply this formalism to the observational results of Cress et al. (1996). We also discuss applications to measurements made from other radio surveys. We restrict our investigation to spatially flat models (allowing for the possibility of a non-zero $\Lambda$).
Given a power spectrum, calculation of the angular CF is straightforward and similar to that for optical or infrared surveys. However, there are two important issues which we address here. (1) We investigate the uncertainties in the predictions which arise from imprecise knowledge of the survey redshift distribution. (2) Since the typical redshifts of sources in the FIRST survey are of order unity, the evolution of the power spectrum must be considered in the analysis, or else there may be errors of order unity in the deduced primordial power spectrum. We include estimates of the non-linear evolution of the power spectrum.
The plan of the paper is as follows. In Section 2 we discuss the redshift distribution inferred for the sample. In Section 3 we discuss the time evolution of the power spectrum, and in Section 4 we discuss the calculation of the angular CF. Results are presented and discussed in Section 5. We make some concluding remarks in Section 6.

THE REDSHIFT DISTRIBUTION
We use redshift-distribution estimates derived from two sources: (i) the 1.4-GHz radio luminosity function (RLF) presented by Condon (1984), and (ii) a set of 2.7-GHz LF estimates presented by Dunlop & Peacock (1990, hereafter DP).
In estimating the RLF, Condon used faint radio sources with optical counterparts in the UGC catalogue to determine a local RLF for spiral galaxies. He combined this with Auriemma's (1977) estimate of the local RLF for elliptical galaxies, and then found a model for the RLF at higher redshifts by allowing the local population to evolve in density and luminosity. The functions of redshift that describe the evolution were constrained using number counts (down to sub-mJy thresholds) at 1.4 GHz and at other frequencies, spectral-index measurements and redshift measurements for a few very bright sources. DP estimated the 2.7-GHz RLFs of steep- and flat-spectrum sources separately using constraints similar to those used by Condon, as well as some new data which included redshifts estimated from K-band photometry. To demonstrate the possible errors in the RLFs resulting from these imprecise redshift estimates, they analysed two distributions: one where assigned redshifts are the mean values for galaxies of a given K-band luminosity (the mean-z distribution), and one (the high-z distribution) where redshifts are larger than the average (they give reasons why one might expect this kind of bias). Seven models were presented, and parameters for all seven models were determined for both the mean- and high-z distributions.
Here we obtain the redshift distribution of FIRST sources by integrating the RLFs over all luminosities that produce flux densities greater than 1 mJy. The DP models were first shifted to 1.4 GHz, assuming a spectral index of $\alpha = 0.85$ for steep-spectrum sources and a spectral index of $\alpha = 0$ for flat-spectrum sources (where the luminosity is related to the frequency by $L \propto \nu^{-\alpha}$). Fig. 1 shows the redshift distribution $dN/dz$ for Condon's LF and for a selection of the models given in DP for a flux-density limit of 1 mJy. Mean- and high-z distributions produce similar results, and the differences between DP's models 1 to 5 are smaller than the differences shown in the figure. In Condon's model, the sharp increase in the number of sources near $z = 0$ is due to starbursting galaxies. For larger flux-density limits, this low-z peak is reduced considerably, and we have checked that our redshift distributions for larger flux-density limits agree with those shown in fig. 6 of Loan, Wall & Lahav (1997).
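The frequency shift applied to the DP models can be sketched in a few lines of Python. This is only an illustration of the power-law scaling $L \propto \nu^{-\alpha}$ quoted above; the function name and the choice of unit input luminosity are assumptions for the example.

```python
def shift_luminosity(l_nu, nu_from_ghz, nu_to_ghz, alpha):
    """Scale a monochromatic luminosity between two frequencies,
    assuming a pure power-law spectrum L ~ nu**(-alpha)."""
    return l_nu * (nu_to_ghz / nu_from_ghz) ** (-alpha)


# Scaling a unit luminosity from 2.7 GHz to 1.4 GHz.
steep = shift_luminosity(1.0, 2.7, 1.4, alpha=0.85)  # steep-spectrum source
flat = shift_luminosity(1.0, 2.7, 1.4, alpha=0.0)    # flat-spectrum source
```

Under this scaling a steep-spectrum source is brighter at 1.4 GHz than at 2.7 GHz by a factor of about 1.75, while a flat-spectrum source is unchanged.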

THE EVOLVED POWER SPECTRUM
In CDM models, one starts with a scale-free primordial power spectrum, $P(k) \propto k^n$ (where $k$ is the comoving wavenumber). This $P(k)$ is then 'processed' according to a transfer function for CDM. At the epoch at which structure begins to grow,
$$P(k) = P_0\,k^n\,T^2(k), \tag{1}$$
where $T(k)$ for CDM is given by Bond & Efstathiou (1984), and $P_0$ is the normalization. We normalize to COBE 4-yr data using Liddle et al.'s (1996) formula, which is valid for spatially flat models. We will refer to this processed spectrum as the primordial power spectrum.
In the linear regime, the time evolution of the power spectrum can be calculated analytically and depends only on the expansion of the Universe. It is given by [equation (2)], where the growth factor, $G$, is given in Carroll, Press & Turner (1992), and $G(\Omega_0 = 1, \Omega_\Lambda = 0) = 1$.
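For orientation, a growth factor with the normalization quoted above can be sketched using the fitting formula commonly attributed to Carroll, Press & Turner (1992). That the calculation here used exactly this approximation is an assumption, as are the function names; the sketch is restricted to spatially flat models.

```python
def omega_m_z(omega_0, omega_lambda, z):
    """Matter density parameter at redshift z (flat models only)."""
    a3 = (1.0 + z) ** 3
    return omega_0 * a3 / (omega_0 * a3 + omega_lambda)


def omega_lambda_z(omega_0, omega_lambda, z):
    """Cosmological-constant density parameter at redshift z (flat models only)."""
    a3 = (1.0 + z) ** 3
    return omega_lambda / (omega_0 * a3 + omega_lambda)


def growth_suppression(om, ol):
    """Fitting formula for the growth suppression factor; equals 1 for om=1, ol=0."""
    return 2.5 * om / (om ** (4.0 / 7.0) - ol + (1.0 + om / 2.0) * (1.0 + ol / 70.0))


def linear_growth(omega_0, omega_lambda, z):
    """Linear density growth D(z)/D(0); the power spectrum scales as its square."""
    g_z = growth_suppression(omega_m_z(omega_0, omega_lambda, z),
                             omega_lambda_z(omega_0, omega_lambda, z))
    g_0 = growth_suppression(omega_0, omega_lambda)
    return g_z / (g_0 * (1.0 + z))
```

In the $\Omega_0 = 1$ case this reduces to $D \propto 1/(1+z)$, while in a $\Lambda$-dominated model the growth between $z = 1$ and today is smaller, so fluctuations at $z = 1$ are a larger fraction of their present amplitude.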
In the highly non-linear regime, the CF obtained from a scale-free primordial spectrum obeys a simple scaling relation (Groth & Peebles 1977). One can interpolate between the linear and highly non-linear regimes to produce semi-analytic models for clustering in the quasi-linear regime, which can then be accurately fitted to results from N-body simulations (Hamilton et al. 1991; Jain, Mo & White 1995; Peacock & Dodds 1996, hereafter PD). PD use the dimensionless power spectrum $\Delta^2(k, z) = (2\pi^2)^{-1} k^3 P(k, z)$. In our notation, the non-linear power spectrum is then given by
$$P_{\rm NL}(k_{\rm NL}, z) = \frac{2\pi^2}{k_{\rm NL}^3}\, f_{\rm NL}\!\left[\Delta^2_{\rm L}(k_{\rm L}, z)\right], \tag{3}$$
where the linear and non-linear scales are related by $k_{\rm L} = [1 + \Delta^2_{\rm NL}(k_{\rm NL})]^{-1/3}\, k_{\rm NL}$. The form of the function $f_{\rm NL}$ is given in PD for various values of $\Omega_0$. We take into account the time evolution of the power spectrum by evaluating PD's formulae using the redshift-dependent matter density $\Omega_0(z)$ and cosmological constant $\Omega_\Lambda(z)$.

Figure 1. The redshift distribution $dN/dz$ derived from Condon's LF and from a selection of the models given in DP for a flux-density limit of 1 mJy.

© 1998 RAS, MNRAS 297, 486-492

PREDICTING THE ANGULAR CORRELATION FUNCTION
As is well known, the angular CF, $w(\theta)$, for a given population is related to the spatial CF, $\xi(r, t)$, by Limber's equation (Limber 1954; Rubin 1954; Groth & Peebles 1977; Phillips et al. 1978; Peebles 1980; Baugh & Efstathiou 1993). If we assume that clustering is independent of luminosity (or if we consider an averaged CF for the entire sample), and that clustering is negligible on scales comparable with the depth of the survey, then the angular CF in a flat universe is given by [equation (4)], where $a(t)$ is the scale factor as a function of time $t$, and $p(x)$ is the selection function (the probability that a source at a distance $x$ is detected in the survey). The physical (not comoving) separation between two sources separated by an angle $\theta$ is given by [equation (5)], which depends on the angle through $2\sin(\theta/2)$. The total number of sources, $N$, in a survey of solid angle $\Omega_s$ is given by [equation (6)], where $dN/dz$ is the redshift distribution of sources in the survey. The spatial CF is the Fourier transform of the spatial power spectrum $P(k, z)$ [equation (7)], where $k$ is the comoving wavenumber. Substituting (5), (6) and (7) into (4), and integrating over $u$, one obtains [equation (8)].

To include non-linear effects, we use equation (3) for the power spectrum. The integral in equation (8) is then over $k_{\rm NL}$, and to determine the $k_{\rm L}$ corresponding to a given $k_{\rm NL}$ we need $\Delta^2_{\rm L}(k_{\rm L})$. Before calculating the CF, we therefore set up a table of $k_{\rm NL}$ values for a range of $k_{\rm L}$ values at various redshifts.
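The look-up table of non-linear wavenumbers described above can be sketched as follows for a single redshift. The mapping $k_{\rm NL} = [1 + \Delta^2_{\rm NL}]^{1/3} k_{\rm L}$ follows from the relation between linear and non-linear scales; the function `toy_f_nl` and the toy linear spectrum are hypothetical stand-ins used only to make the example runnable, and are not PD's fitting function or a CDM spectrum.

```python
import numpy as np


def build_scale_table(k_lin, delta2_lin, f_nl):
    """Tabulate k_NL and Delta^2_NL for a grid of linear wavenumbers k_L."""
    delta2_nl = f_nl(delta2_lin)
    k_nl = (1.0 + delta2_nl) ** (1.0 / 3.0) * k_lin
    return k_nl, delta2_nl


def linear_scale_for(k_nl_query, k_nl_table, k_lin_table):
    """Invert the table by interpolation: k_L for a requested k_NL.
    Requires k_nl_table to be monotonically increasing."""
    return np.interp(k_nl_query, k_nl_table, k_lin_table)


def toy_f_nl(x):
    """Hypothetical placeholder: linear at small x, enhanced at large x."""
    return x + x ** 2


k_lin = np.logspace(-3, 1, 200)          # illustrative wavenumber grid
delta2_lin = (k_lin / 0.2) ** 1.5        # illustrative linear Delta^2
k_nl, delta2_nl = build_scale_table(k_lin, delta2_lin, toy_f_nl)
```

Building the table in the forward direction (from $k_{\rm L}$ to $k_{\rm NL}$) avoids solving an implicit equation at every step of the $k_{\rm NL}$ integral; the inversion is then a one-dimensional interpolation.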
The form of the $k$ integrand in the CF estimate at $\theta = 1^\circ$ is shown in Fig. 2 for $\Omega_0 = 1$ and $h = 0.5$. One sees that there are positive contributions to the CF from $k$ in the range 0.01 to 0.5 Mpc$^{-1}$, with a large contribution coming from the larger $k$ (smaller scales). As one decreases the angle $\theta$, the peak contribution shifts towards even larger $k$.
In Fig. 3 we have plotted $dw/dz$, the redshift distribution of the clustering signal, for a range of angles (in an $\Omega_0 = 1$, $h = 0.5$ universe). One sees that the large-angle CF estimates are dominated by contributions from nearby objects, and even at $0.1^\circ$ one expects about half the signal to come from objects with $z < 0.1$. This illustrates that the angular CF probes clustering at redshifts significantly smaller than those characteristic of the survey population. More quantitatively, the models used here predict that the mean redshift probed by the CF at $\theta = 0.01^\circ$ ($0.1^\circ$, $1^\circ$) is $\bar{z} = 0.4$ (0.18, 0.08). Together, Figs 2 and 3 highlight the fact that the CF measured for FIRST is likely to be rather sensitive to small-scale correlations among fairly nearby sources.

Fig. 4 shows how the predictions for $w(\theta)$ vary for different estimates of the redshift distribution, and how they change when non-linear evolution is considered. There appears to be a fair amount of uncertainty in the predicted result, depending on which redshift distribution is chosen. It should be noted, however, that Condon's LF is more carefully constructed to fit the low-z population, and it was pointed out above that these sources are expected to make large contributions to the clustering signal. In addition, Peacock has indicated that DP's model 7 is the best model to use (the difference between the 'high-z' and 'medium-z' versions of model 7 is not significant). If one limits oneself to DP7 and Condon's model, then uncertainties in the redshift distribution do not contribute very significantly to uncertainties in the CF.

C. M. Cress and M. Kamionkowski
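A mean redshift of the clustering signal can be computed from a tabulated $dw/dz$ curve as a signal-weighted average. The short Python sketch below assumes that definition, $\bar{z} = \int z\,(dw/dz)\,dz / \int (dw/dz)\,dz$, which may differ in detail from the one adopted for the numbers quoted above; the exponential $dw/dz$ used in the example is purely illustrative.

```python
import numpy as np


def trapezoid(y, x):
    """Trapezoidal-rule integral of samples y on grid x."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    return float(np.sum(0.5 * (y[1:] + y[:-1]) * np.diff(x)))


def mean_signal_redshift(z, dw_dz):
    """Signal-weighted mean redshift of a tabulated dw/dz curve."""
    z = np.asarray(z, dtype=float)
    dw_dz = np.asarray(dw_dz, dtype=float)
    return trapezoid(z * dw_dz, z) / trapezoid(dw_dz, z)


z_grid = np.linspace(0.0, 5.0, 5001)
toy_dw_dz = np.exp(-z_grid / 0.2)        # illustrative, strongly low-z weighted
z_bar = mean_signal_redshift(z_grid, toy_dw_dz)
```

A signal concentrated at low redshift, as in this toy case, yields a mean far below the upper limit of the redshift grid.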

RESULTS
For the $\Omega_0 = 1$, $h = 0.5$ model shown in Fig. 4, it is clear that the effect of non-linear evolution is significant on scales less than $\approx 30$ arcmin. For a $\Lambda$-dominated model the non-linear contributions are significant out to scales of 60 arcmin. The precise scale at which non-linearities become important also depends on the redshift distribution. Note that in the standard-CDM model, $\sigma_8$ for the linear-evolution spectrum is 1.22, while $\sigma_8$ for the non-linear-evolution spectrum is 1.17.

Fig. 5 compares the measured CF in the FIRST survey (1-mJy flux threshold) with the CFs predicted by various cosmological models: the three solid lines are for $\Omega_0 = 1$ and $\Omega_\Lambda = 0$, with $h$ taking on values of 0.5 (lowest curve), 0.65 and 0.9. The dashed lines show results for $\Omega_0 = 0.4$, $\Omega_\Lambda = 0.6$ and $h$ taking values 0.5 and 0.65. Error bars in the plot are determined using a 'partition bootstrap method' in which the CF is calculated for 10 subdivisions of the survey region and the standard deviation of these measurements at each angle is used as a measure of the error. Note that the spike in the CF at 6 arcmin and possibly the dip at 9 arcmin are associated with the sidelobe contamination discussed in Cress et al. (1996). At first glance it would appear that the data are best fitted by an $\Omega_0 = 1$ model with a high Hubble constant, but this is misleading. It is likely that at low fluxes, a significant fraction of the source counts come from fairly nearby 'starbursting' spiral galaxies as opposed to the AGN-powered sources that one normally associates with radio sources (Condon 1994). Since spiral galaxies have been shown to have different clustering properties from ellipticals (see, e.g., Hermit et al. 1996) and from AGN in general (for an estimate of the quasar CF see Boyle et al. 1994), it seems likely that the CF measured here will have contributions from populations with different clustering properties.
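The error estimate described above, the scatter of the CF among subdivisions of the survey, can be sketched as follows. The array layout (one row per subdivision, one column per angular bin) and the use of the sample standard deviation (`ddof=1`) are assumptions of this sketch; the numbers in the example are invented for illustration and are not FIRST measurements.

```python
import numpy as np


def partition_errors(w_subregions, ddof=1):
    """Mean and scatter of the CF across survey subdivisions.

    w_subregions: array of shape (n_regions, n_angles), one CF per region.
    Returns (mean, standard deviation) per angular bin.
    """
    w = np.asarray(w_subregions, dtype=float)
    return w.mean(axis=0), w.std(axis=0, ddof=ddof)


# Two illustrative subdivisions, two angular bins.
w_mean, w_err = partition_errors([[1.0, 2.0], [3.0, 2.0]])
```

With only two subdivisions that agree in the second bin, the estimated error there is zero, which shows why a reasonable number of subdivisions (10 in the text) is needed for the scatter to be meaningful.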
In Section 3 we showed that on large angular scales ($\theta \sim 1^\circ$), we expect the signal to be dominated by fairly local sources, and so it is probably more reflective of the 'starburst' clustering. On small angular scales our results are probably more sensitive to the clustering of more distant AGN. With this in mind, we do not attach special significance to the curve which best matches the slope of the data. Instead, we calculate the bias $b$ separately on small and large scales (where $w_{\rm matter-matter} = b^2\, w_{\rm radio-radio}$), and use this as an indicator of the relative bias of different populations in the survey. Inferred biases for various cosmological models are shown in Table 1. Estimates of the bias calculated when the power spectrum is normalized to the observed cluster abundance are shown in brackets. These were estimated by $b({\rm cluster}) = b({\rm COBE})\,\sigma_8({\rm cluster})/\sigma_8({\rm COBE})$, where $\sigma_8({\rm COBE})$ is given by Bunn & White (1997) and $\sigma_8({\rm cluster}) = 0.6\,\Omega_0^{-0.53}$ for $\Omega_0 = 0.4$ (White, Efstathiou & Frenk 1993).

Table 1 also shows the values of $r_0$ calculated for various models, assuming that the spatial CF of sources is approximated by a power law and that density fluctuations evolve linearly; i.e., $\xi(r, z) = (r/r_0)^{-\gamma}(1+z)^{-2}$ (in comoving coordinates). We used $\gamma = 2.2$ for the FIRST measurement and $\gamma = 1.8$ for the other surveys, but the $r_0$ results do not depend heavily on the chosen value of $\gamma$.

Peacock & Nicholson (1991, hereafter PN) measured the spatial CF for bright ($S_{2.4\,{\rm GHz}} > 1$ Jy) radio galaxies with redshifts between 0.01 and 0.1, and found it to be well fitted by $r_0 = 11\,h^{-1}$ Mpc and $\gamma = 1.8$. After correcting for redshift-space distortions, one would most likely obtain a real-space $r_0$ slightly less than this. This result indicates that bright radio galaxies trace the matter distribution in a way similar to low-richness Abell clusters. Assuming linear evolution, estimates for $r_0$ in the FIRST sample range from 6 to 8 $h^{-1}$ Mpc, depending on what redshift distribution is used.
Note that in Cress et al. (1996), the $r_0$ obtained was based on a different estimate of the angular CF. The value obtained here is consistent with the fainter sources in the sample being less clustered than the bright sources investigated by PN. This picture is supported by Ledlow & Owen's (1995) observation that in a sensitive radio survey, field ellipticals and cluster ellipticals have a similar probability of being detected. One might thus expect sources to be clustered more like ellipticals than like clusters.
Another consistency check is supplied by the measurement of the angular cross-correlation of Abell clusters and FIRST sources presented in Cress et al. (1996). Using this result in a cross-correlation analogue of Limber's equation (see Seldner & Peebles 1978), we inferred the amplitude of the spatial cross-correlation of Abell clusters and FIRST sources. DP's model 7 was used for the redshift distribution of FIRST sources, and the distribution of Abell clusters was obtained by fitting a function of the form $\log N(z) = -9.1z - 3.5$ to data from Huchra et al. (1990). By combining the autocorrelation and the cross-correlation we found the ratio of cluster bias to radio-source bias ($b_{\rm c}/b_{\rm r}$) to be about 1.9. A similar estimate of $b_{\rm c}/b_{\rm r}$ was also obtained by combining the measured cluster-cluster spatial CF and the radio-radio spatial CF inferred above. Mo, Peacock & Xia (1993) compared bright radio galaxies with Abell clusters, and obtained a value of 1.7 for this ratio. This is also consistent with the idea that the faint sources in FIRST are somewhat less clustered than the bright sources investigated in their work (the same sample as in PN).
Angular CFs of radio sources have also been measured for the Green Bank (GB) and the Parkes-MIT-NRAO (PMN) surveys done at 4.85 GHz (Kooiman, Burns & Klypin 1995; Sicotte 1995; Loan et al. 1997) and for WENSS, the 325-MHz Westerbork Northern Sky Survey (Rengelink et al. 1997). We have investigated CDM predictions for these surveys using DP7 to estimate the redshift distribution of sources with $S_{4.85} > 50$ mJy in the GB/PMN surveys and of sources with $S_{325} > 35$ mJy in WENSS. Predicted CFs (assuming no bias) are shown in Fig. 6.
Taking $A_{\rm WENSS} = 0.0025$, $A_{\rm GB/PMN} = 0.005$ and $\gamma = 1.8$ [where $w(\theta) = A\theta^{1-\gamma}$], we have calculated the bias of sources in these surveys relative to CDM predictions and listed them in Table 1. The source composition in these two surveys is probably more similar to the sample investigated by PN, although WENSS is likely to include fewer quasars. The biases inferred for the PN sample are also shown in the table.
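The power-law form $w(\theta) = A\theta^{1-\gamma}$ with the amplitudes quoted above can be evaluated directly, as in the minimal Python sketch below. Measuring $\theta$ in degrees is an assumption of the sketch (the unit convention is not restated here), and the function name is illustrative.

```python
def w_power_law(theta_deg, amplitude, gamma):
    """Power-law angular CF, w(theta) = A * theta**(1 - gamma).
    theta_deg is assumed to be in degrees."""
    return amplitude * theta_deg ** (1.0 - gamma)


# Amplitudes and slope quoted in the text for the two comparison samples.
w_wenss = w_power_law(1.0, amplitude=0.0025, gamma=1.8)
w_gb_pmn = w_power_law(1.0, amplitude=0.005, gamma=1.8)
```

Because both fits share the same slope, the ratio of the two CFs is the ratio of the amplitudes (a factor of 2) at every angle.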
While a large amount of the variation in inferred clustering strength could be attributed to uncertainties in the CF estimates and to the different clustering of different populations, the data summarized in Table 1 show some indication that as one probes deeper, the inferred bias increases. Assuming that FIRST (at small angles) and the other surveys probe a similar population of objects, we have, with cluster normalization, $b_{\rm radio\,AGN} \approx (2.8, 3.4, 3.1, 5.1)$ at $z \approx (0.05, 0.27, 0.33, 0.42)$ for the standard model, and $b_{\rm radio\,AGN} \approx (1.7, 2.0, 1.7, 2.8)$ at $z \approx (0.05, 0.30, 0.42, 0.48)$. One interpretation of this is that the clustering is evolving more slowly than CDM models predict. However, Matarrese et al. (1997), drawing on work by Fry (1996) and Mo & White (1996), have recently argued that in CDM models, the bias of a given population should decrease with time; that is, the fluctuations in density inferred from luminous objects should tend towards the real fluctuations in the mass density as time goes by. Thus the observations in Table 1 could still be consistent with CDM predictions if bias evolution takes place. Interestingly enough, this apparent bias evolution is stronger than that seen in optical surveys (Matarrese et al. 1996). This would be consistent with the general picture that the bias evolution is stronger for objects which tend to form at higher density mass peaks. If one writes $b(z) = 1 + b_0(1+z)^{\beta}$, the value of $\beta$ which best fits the data is $\approx 2$, but this can vary significantly when uncertainties in the measurements are considered.

CONCLUSIONS
We have studied the implications of angular clustering of radio sources in deep surveys for standard-CDM-like models for structure formation. We have examined the effect of uncertainties in the redshift distribution and considered non-linear evolution of the power spectrum. We have also studied what may be learned about the biasing of such sources and the evolution of such a bias.
We have shown that uncertainties in the redshift distribution of FIRST sources can contribute a significant amount to the uncertainties in the CDM predictions of the angular correlation function (CF) even when bias is not considered. However, limiting oneself to Condon's model and DP7 reduces the uncertainties significantly. It should also be kept in mind that the uncertainties in the predicted angular CF which arise from uncertainties in the redshift distribution are still not much larger than the statistical uncertainty in the measured angular CF. Uncertainties in the redshift distribution therefore do not currently affect strongly the implications of radio-source clustering for the origin of structure.
The effect of non-linear evolution on the predictions has also been explored. It appears that non-linear contributions become important on scales of 10 to 60 arcmin, depending on what model is used. Therefore, as data improve, the standard approximation of a power-law evolution of the clustering with redshift will break down. We also found that although the mean redshift of sources in deep radio surveys may be of order unity, the clustering signal comes primarily from smaller redshifts. That is, the angular CF is more sensitive to nearby clustering than one might expect from the $dN/dz$ distribution. Furthermore, the redshift distribution of the clustering signal varies significantly with the angular scale probed. The angular CF therefore reflects the evolution of the three-dimensional power spectrum as well as the power spectrum at some fixed epoch.
We have shown that spatially flat CDM models can produce angular CFs similar to those observed in the FIRST survey as long as some kind of biasing is invoked. The needed biasing seems to be a bit larger than that needed for optical sources, but less than that needed for brighter radio sources. This is probably due to the fact that FIRST sources are a heterogeneous sample which contains numerous nearby starbursting galaxies as well as more distant brighter radio sources. We have also begun to explore what may be learned about the evolution of bias from FIRST and other radio surveys. At this stage, it is difficult to make very quantitative statements about the evolution of clustering because the CF measurements that probe higher-z ranges have large uncertainties associated with them. In addition, bias evolution will have to be understood better if we are to learn about the evolution of mass clustering. This point is particularly significant for a heterogeneous sample of objects such as FIRST sources.
One way to improve our ability to make quantitative statements about clustering evolution is to make more precise measurements of the clustering of sources that probe the higher-z range. As the FIRST survey coverage increases, we will be able to obtain significant clustering signals for samples with higher flux thresholds. Increasing the flux threshold should increase the average redshift probed by the CF, and thus improve one's chances of probing larger scales. Another way of increasing the redshift probed by the CF might be to remove all sources that have optical counterparts in the APM survey. This is currently being investigated. Improving our estimates of the redshift distribution of faint sources is also important, and there is some hope that this will be done in the near future.

ACKNOWLEDGMENTS
We thank John Peacock, Jim Condon, Ofer Lahav, David Helfand, Alexandre Refregier and Jacqueline van Gorkom for useful comments. This work was supported by the US Department of Energy under contract DE-FG02-92ER40699, NASA grant NAG5-3091 and the Alfred P. Sloan Foundation. The FIRST project is supported by grants from the National Geographic Society, NSF grant AST94-19906, NATO, IGPP, Columbia University, and Sun Microsystems.

Figure 6. Comparison of the measured $w(\theta)$ with that predicted by standard CDM (assuming no bias) for three different surveys. DP7 was used to model the redshift distribution. Power-law fits to the data are shown as faint lines.