J. M. Diego, P. Vielva, E. Martínez-González, J. Silk, J. L. Sanz, "A Bayesian non-parametric method to detect clusters in Planck data", Monthly Notices of the Royal Astronomical Society, Volume 336, Issue 4, 11 November 2002, Pages 1351–1363, https://doi.org/10.1046/j.1365-8711.2002.05874.x
Abstract
We show how one may expect a significant number of Sunyaev–Zel'dovich (SZ) detections in future Planck data without any of the typical assumptions needed in present component separation methods, such as priors on the power spectrum or the frequency dependence of the components, circular symmetry, or a typical scale for the clusters. We reduce the background by subtracting an estimate of the point sources, dust and cosmic microwave background (CMB). The final SZ effect map is estimated in Fourier space. The catalogue of returned clusters is complete above a flux of ≈200 mJy (353 GHz), while the lowest flux reached by our method is ≈70 mJy (353 GHz). We predict a large number of detections (∼9000) over four-fifths of the sky. This large number of SZ detections will allow a robust and consistent analysis of the evolution of the cluster population with redshift and will have important implications for determining the best cosmological model.
1 Introduction
The distortion in the radiation intensity of cosmic microwave background (CMB) photons produced when they traverse the hot intracluster plasma in the direction of a galaxy cluster [the Sunyaev–Zel'dovich effect (SZE), see Sunyaev & Zel'dovich 1972] is one of the most promising effects being studied for exploring cosmological models. In recent years, several groups have been working on the detection of the SZE and many detections have been reported (e.g. Birkinshaw, Gull & Hardebeck 1984; Carlstrom, Joy & Grego 1996; Pointecouteau et al. 2001). The SZE is growing in interest as the number of experiments and their quality increase. The number of experiments devoted to these kinds of observations, as well as their unprecedented quality, will allow a variety of analyses which, using the SZ data alone or combined with other data sets, could be applied to the study of the intracluster medium, its origin and evolution; the abundance of galaxy clusters and its important cosmological implications; the determination of the cosmological distances to the most distant galaxy clusters, etc. One of these experiments is the approved Planck satellite (scheduled launch in 2007). This satellite will observe the full sky at millimetre frequencies (30 < ν < 857 GHz) and with resolutions ranging from FWHM = 30 to 5 arcmin (see Fig. 1). Previous studies have shown that this satellite will produce a full-sky cluster catalogue with about 5000–50 000 clusters, the final number depending on the cosmological model, the Planck effective sensitivity and the method used to identify the different component contributions. This paper will focus on this last point.
Figure 1. Initial simulated maps at the Planck frequency channels. They contain: CMB, thermal Sunyaev–Zel'dovich effect, thermal dust emission, free–free, synchrotron, spinning dust, point source emission and instrumental Gaussian white noise. In this plot and in the following ones, the colour table is in units of ΔT/T.
Recently proposed component separation methods, such as the Wiener filter (WF, Tegmark & Efstathiou 1996; Bouchet et al. 1997), maximum entropy (MEM, Hobson et al. 1998, 1999), fast independent component analysis (fastICA, Maino et al. 2002), Mexican-hat wavelet analysis (MHW, Cayón et al. 2000; Vielva et al. 2001a) and adaptive filter analysis (AFA, Tegmark & de Oliveira-Costa 1998; Sanz, Herranz & Martínez-González 2001), are being tested in order to define a well-established method to perform the component separation on the Planck data. However, it will be extremely difficult to define the best method, since some methods will work better than others under certain circumstances, and it will not be surprising if, in the end, the final component separation method turns out to be a combination of several methods (e.g. MEM + MHW, Vielva et al. 2001b).
Some methods try to separate all the components simultaneously. To do this in an effective way, some a priori information is needed. Commonly, the power spectrum of several (if not all) components and the frequency dependence of the components must be given (WF, MEM). Another typical assumption is that all the components are independent and non-Gaussian except possibly one, the CMB (fastICA).
If the assumed information is close enough to reality, all of these methods work very well. On the other hand, if the a priori information is wrong, the result of the component separation will be biased with respect to the underlying real signal. This could have important consequences for the analysis of the final data maps. One of the risks of the simultaneous all-component separation methods is therefore that an error in the estimation of one of the components must be compensated by an error in one (if not all) of the other components, owing to the constraint that the sum of all the recovered components must equal the data.
This problem can be partially reduced by using single-component separation methods such as the MHW or AFA, which have been successfully applied to the separation of point sources (PSs) (Cayón et al. 2000; Vielva et al. 2001a) and the SZ effect (Herranz et al. 2002a,b). These methods have the advantage, over the previous ones, that they do not need to assume anything concerning the Galactic components or the CMB. The information they need is taken directly from the data (the power spectrum of the background and the beam shape). The only thing they have to assume is a scale and the circular symmetry of the source. In the MHW technique applied to the detection of point sources, the optimal scale can be obtained from the background and the beam scale. In this sense, the analysis of the point sources based on the MHW technique is very robust since all the assumptions are taken from the data. When the AFA is applied to the detection of the SZ effect, prior knowledge of the scale and shape (asymmetry) of the clusters is needed (Sanz et al. 2001). This problem can be overcome by applying the filter at different scales (Herranz et al. 2002a,b). The problem of asymmetry in the resolved clusters can only be solved by rotating an axis-asymmetric filter, which will reduce the speed of the algorithm significantly.
In this work we propose an alternative method that can be applied to the detection of the SZ signal on the future Planck data. The main points of our method are the following.
(i) We consider a Bayesian and non-parametric method without prior knowledge of the power spectrum of any of the components. We will show how the method allows us to include information on the power spectrum of the SZE component. However, the final result will not depend significantly on the particular choice of this power spectrum, and arbitrary power spectra can be considered provided they obey some general rules.
(ii) The method is easy to implement and fast, since it is non-iterative and all the equations are solved in Fourier space mode by mode.
(iii) We do not need any prior knowledge of the frequency dependence of the components other than the SZ and, obviously, the CMB. We only require at least one channel that is clearly dominated by dust; results from BOOMERANG and IRAS suggest that the 857-GHz channel in Planck will indeed be dominated by dust.
(iv) The method works for any shape and size of the galaxy clusters.
(v) The method uses all the information available (all the channels).
As mentioned in the third point, we will assume only knowledge of the frequency dependence of the SZ effect (see Fig. 2). This is a well-established assumption as the physics of the SZ effect is very well known. We would like to remark that, although in this work we will assume only the non-relativistic corrections, our conclusions could be extended to include the relativistic corrections (provided the temperature of the cluster is known). We also did not consider the kinetic contribution to the SZ effect since it is of the order of 30 times smaller than the thermal part. However, this component could be estimated (in some clusters) as the residuals after the thermal contribution has been determined.
Figure 2. The f(ν) factor appearing in equation (1). The vertical lines are proportional to the signal-to-noise ratio for the SZE expected in each of the Planck channels. Planck frequencies are centred at 30, 44, 70, 100 (LFI and HFI), 143, 217, 353, 545 and 857 GHz. The planned bandwidths for Planck (not considered in this work) are δν/ν = 0.2 for the LFI (30–100 GHz) and δν/ν = 0.25 for the HFI (100–857 GHz). The channels at 217 and 857 GHz show a tiny S/N ratio. At 100 GHz there are two channels: the LFI channel (solid line) and the HFI channel (dotted line).
The structure of the paper is the following. In Section 2 we present the Planck simulations that have been used to test the method. We describe the method and the way it is implemented in Section 3. We apply the method to the realistic Planck simulations and show the results in Section 4. Finally, we compare briefly with other methods in Section 5. The possibilities of the recovered SZE map are also highlighted in this section.
2 Data set: realistic Planck simulations
In order to check the power of the method, we have performed realistic Planck simulations. The simulations are realistic in the sense that they include all the main features of the Planck satellite such as the corresponding noise level in each channel, pixel size and antenna beam (see Table 1). They are also realistic in the sense that all the components (CMB, Galactic components, extragalactic point sources and SZ) were simulated, including the latest information we have concerning these components. The simulations were performed in patches of the sky of 12.8 × 12.8 deg2, although the method can be easily extended to include the entire sphere. For the sake of simplicity, we will not include the effect of the bandwidths in our simulations, although there is no problem if the bandwidths have to be included. Therefore, we will simulate the different maps only at the central frequency of the Planck channels.
Table 1. Experimental constraints and simulation characteristics at the 10 Planck channels. The antenna FWHM is given in column 2 for the different frequencies (a Gaussian pattern is assumed in the HFI and LFI channels). Characteristic pixel sizes are shown in column 3. The fourth column contains information on the instrumental noise level, in ΔT/T per pixel. In columns 5–11 we show the dispersion of the simulated components (CMB, thermal dust emission, free–free, synchrotron, spinning dust, point sources and Sunyaev–Zel'dovich effect, respectively) in ΔT/T per pixel, after beam convolution.
The CMB simulation has been done for a spatially flat Λ-cold dark matter universe with Ωm= 0.3 and ΩΛ= 0.7, using the Cl generated with the cmbfast code (Seljak & Zaldarriaga 1996). It is a Gaussian realization.
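As an illustration of this step, a Gaussian realization on a flat-sky patch can be drawn by colouring white Gaussian Fourier modes with the square root of a Cl spectrum. This is a minimal sketch, not the authors' pipeline: the toy spectrum, grid size and normalization convention below are assumptions for illustration only.

```python
import numpy as np

def gaussian_cmb_patch(cl_func, npix=256, patch_deg=12.8, seed=0):
    """Draw a Gaussian realization on a flat-sky patch.

    cl_func : callable returning C_l for a (possibly non-integer) multipole.
    In the flat-sky approximation the 2D mode number k maps to the
    multipole l = 2*pi*k/L, with L the patch size in radians.
    """
    rng = np.random.default_rng(seed)
    L = np.deg2rad(patch_deg)                  # patch size in radians
    kx = np.fft.fftfreq(npix) * npix           # integer mode numbers
    k = np.hypot(*np.meshgrid(kx, kx))
    ell = 2.0 * np.pi * k / L                  # flat-sky multipole
    power = np.asarray(cl_func(ell), dtype=float)
    power[0, 0] = 0.0                          # remove the monopole
    # white Gaussian modes coloured by sqrt(C_l); taking the real part
    # of the inverse transform enforces a real-valued map
    modes = rng.normal(size=(npix, npix)) + 1j * rng.normal(size=(npix, npix))
    fmap = modes * np.sqrt(power / 2.0) * npix / L
    return np.fft.ifft2(fmap).real

# toy damped spectrum, purely illustrative (not a cmbfast output)
cmb = gaussian_cmb_patch(lambda l: 1.0 / (1.0 + l) ** 2)
```

In practice the Cl would come from a Boltzmann code such as cmbfast, as in the text.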
The thermal Sunyaev–Zel'dovich (SZ) effect simulation was made for the same cosmological model. The cluster population was modelled using the results of Press–Schechter (Press & Schechter 1974) with a Poissonian distribution in the angular coordinates of the 2D map, θ and φ. The model was selected by fitting the cluster population as a function of z to several X-ray and optical cluster data sets. In that fit we obtained certain values for the cosmological parameters as well as an estimate for the parameters involved in the cluster scaling relations T–M and Lx–T (see Diego et al. 2001 for a discussion).
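The Poissonian angular placement described above can be sketched as follows; the assignment of masses and fluxes from the Press–Schechter counts, and the cluster profiles themselves, are omitted, and the mean count used here is an arbitrary stand-in.

```python
import numpy as np

def place_clusters(n_mean, patch_deg=12.8, seed=0):
    """Toy sketch of a Poissonian angular cluster placement: the number
    of clusters in the patch is Poisson-distributed and the positions
    are uniform in the two angular coordinates of the 2D map."""
    rng = np.random.default_rng(seed)
    n = rng.poisson(n_mean)                        # Poisson number count
    xy = rng.uniform(0.0, patch_deg, size=(n, 2))  # uniform angular positions
    return xy

positions = place_clusters(500)  # 500 is an illustrative mean count
```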
The extragalactic point source simulation was performed following the model of Toffolatti et al. (1998), assuming the cosmological model indicated above. The simulation includes radio flat-spectrum and infrared sources. VLA and IRAS catalogues were used to fix the model. The predictions obtained with this model are compatible with ISO and SCUBA data (see the above paper for more details).
We have simulated four different Galactic emission sources: thermal dust, free–free, synchrotron and spinning dust.
The thermal dust emission was simulated using the data and the model provided by Finkbeiner, Davis & Schlegel (1999). This model assumes that dust emission is caused by two grey bodies: a hot one with a dust temperature of TDhot≃ 16.2 K and an emissivity αhot≃ 2.70, and a cold one with a TDcold≃ 9.4 K and αcold≃ 1.67. These quantities are mean values. The temperatures and emissivities change from point to point. An exhaustive description of the model is given in their paper, where the authors combined data from DIRBE, IRAS and FIRAS to fit the dust emission at high frequencies (500–3000 GHz). We will use their best-fitting model in this work.
The distribution of free–free emission is poorly known. Present experiments such as the Hα Sky Survey1 and the WHAM project2 will provide maps of Hα emission that could be used as a template. In this work, we have created the free–free template correlated with the dust emission as proposed in Bouchet, Gispert & Puget (1996). The frequency dependence of the free–free emission is assumed to change as Iν∝ν−0.16 and is normalized to give an rms temperature fluctuation of 6.2 μK at 53 GHz.
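A minimal sketch of this recipe, assuming for simplicity that the free–free template directly traces the dust morphology (a simplification of the Bouchet et al. prescription, which the text follows):

```python
import numpy as np

def freefree_map(dust_template, nu_ghz, rms_53ghz_uK=6.2, beta=-0.16, seed=0):
    """Free-free template correlated with the dust morphology, scaled as
    I_nu ~ nu^-0.16 and normalized to a 6.2 uK rms at 53 GHz, as in the
    text. Equating the template with the dust morphology itself is a
    simplifying assumption of this sketch."""
    t = dust_template - dust_template.mean()
    t *= rms_53ghz_uK / t.std()            # normalize the rms at 53 GHz
    return t * (nu_ghz / 53.0) ** beta     # power-law frequency scaling

rng = np.random.default_rng(1)
dust = rng.lognormal(size=(64, 64))        # stand-in dust morphology
ff30 = freefree_map(dust, 30.0)            # brighter than at 53 GHz
```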
Synchrotron emission simulations have been done using the all-sky template provided by P. Fosalba and G. Giardino (available from their FTP site). This map is an extrapolation of the 408-MHz radio map of Haslam et al. (1982), from the original 1° resolution to a resolution of about 5 arcmin. The additional small-scale structure is assumed to have a power-law power spectrum with an exponent of −3. We have made an additional extrapolation to the smallest scale (1.5 arcmin) with the same power law. We also include in our simulations information on the changes of spectral index as a function of electron density in the Galaxy. This template has been made by combining the Haslam map with the maps of Jonas, Baart & Nicolson (1998) at 2326 MHz and Reich & Reich (1986) at 1420 MHz, and can be found on the FTP site mentioned previously.
We have also taken into account possible Galactic emission caused by spinning dust grains, as proposed by Draine & Lazarian (1998). This component could be important at the lowest Planck frequencies (30 and 44 GHz) in the outskirts of the Galactic plane.
3 A method in two steps
In this section, we describe our proposed new method to estimate the thermal SZ contribution to the millimetre data in the 10 Planck channels. A detailed description of the mission can be found at the official Planck web site.
From Fig. 2 it can be seen that the best channels are those between 100 and 353 GHz. Although the channel at 217 GHz does not seem to be relevant, it is in fact one of the most important to detect the SZ effect since at this frequency the thermal SZ effect is expected to be negligible. The other channel that does not seem to be relevant is the highest one at 857 GHz, which is expected to be completely dominated by the dust emission coming from our Galaxy (see Table 1). However, as we will see later, both channels will play a crucial role in our method.
The method is in fact divided into two main steps.
In the first step (map cleaning) we reduce the contribution of certain components (point sources, dust and CMB) under the following assumptions: point sources are unresolved; the thermal dust emission has, at different frequencies, the same spatial pattern times a frequency-dependent factor; and the CMB is frequency-independent. This process will increase the noise level of the maps, but it will also increase the signal-to-noise (S/N) ratio of the SZE signal.
In the second step (Bayesian approach) we develop a method to search, in each pixel, for the Compton parameter responsible for the SZ signature in our clean maps. We will define our approach in terms of Bayes' theorem.
3.1 Map cleaning
In equation (3), ΔT/T(ν, x) denotes the measured temperature of the sky minus the temperature of the CMB (TCMB ≈ 2.73 K), divided by TCMB. The term ‘A⊗’ denotes the convolution with the antenna. There should be an additional term in that equation to account for the frequency response of the experiment, which is not a delta function at the frequency ν. Therefore, a real experiment will measure ΔT/T(ν1−ν2, x); this is just another convolution of equation (3) with the frequency response of the instrument centred at the frequency ν. For simplicity we will not consider the bandwidth in our calculations, although it could easily be included.
By looking at equation (3), it is easy to understand the complexity of the component separation problem. In this work we are only interested in one of these components, the SZE. The complexity of estimating that component could be reduced if we can subtract first, or at least reduce significantly, the contribution of some of the other components in equation (3). By reducing the contribution of some of the dominant components, the S/N ratio of the SZ signal can be increased, since the smaller the background the better our determination of the signal. However, one should be careful in the process of subtracting some of the other components since we do not want to remove any SZE signal.
There are several components that can be easily subtracted from the Planck data (or at least reduce their contribution to the background) without subtracting any significant thermal SZE signal.
Point sources
The point source contribution is expected to be especially relevant at the highest Planck frequency channels. The point source emission at these high frequencies is caused by infrared sources. At the lowest Planck frequencies, the point source emission is mainly a result of radio flat-spectrum active galactic nuclei (AGNs). Knowledge of the point source emission at the intermediate Planck channels is really poor. In fact, the determination of the point source emission at these frequencies is one of the challenges of the Planck mission.
The detection of the point source emission is a special issue for the component separation problem. There are two main differences between this emission and the other foregrounds. First, the frequency behaviour can change significantly from one point source to another. Secondly, the point source emission has a typical scale: the beamwidth. These properties suggest that common component separation methods such as MEM, WF or neural networks are not the best techniques to detect the point source emission.
We have applied the MHW technique first described in Cayón et al. (2000) and later extended in Vielva et al. (2001a). Wavelets are a powerful tool for detecting point sources. When a signal with a characteristic scale is analysed with a wavelet adapted to its shape, its wavelet coefficients are amplified with respect to the background coefficients. This amplification occurs at scales around the characteristic scale (Cayón et al. 2000). We can increase the number of detections by looking at the scale, which maximizes the amplification (Vielva et al. 2001a). This optimal scale can be determined directly from the data. Therefore, there is no need for assumptions either concerning the nature or characteristics of the underlying signals [e.g. spectral behaviour, power spectra, probability density function (PDF), etc.]. The optimal pseudo-filter for the detection of point sources that are convolved with a Gaussian beam (at least for 2D images with a power spectrum described by Cl∼l−2 around the point source scale) is the MHW (Sanz et al. 2001). A detailed description of the point source detection algorithm can be found in Vielva et al. (2001a). Here we will just describe the main steps.
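As a sketch of this filtering step, the standard 2D MHW in Fourier space, which takes the form ψ̂(kR) ∝ (kR)² exp[−(kR)²/2] (following Cayón et al. 2000), can be applied to a map mode by mode. The grid conventions below are assumptions of this sketch, not the authors' implementation.

```python
import numpy as np

def mhw_fourier(k, R):
    """Fourier transform of the 2D Mexican-hat wavelet at scale R
    (up to normalization): (kR)^2 * exp(-(kR)^2 / 2)."""
    kR = k * R
    return kR ** 2 * np.exp(-0.5 * kR ** 2)

def filter_at_scale(image, R):
    """Convolve a square map with the MHW at scale R (in pixel units),
    performed mode by mode in Fourier space."""
    n = image.shape[0]
    kx = np.fft.fftfreq(n) * 2.0 * np.pi       # wavenumber in rad/pixel
    k = np.hypot(*np.meshgrid(kx, kx))
    return np.fft.ifft2(np.fft.fft2(image) * mhw_fourier(k, R)).real
```

Because the MHW has zero mean, the filtered map has no monopole; in practice one would scan R around the beam scale to maximize the amplification of the point source candidates, as described in the text.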
The quantity involved is the Fourier transform of the MHW, where R is the MHW scale. Once the optimal scale is determined, we select certain point source candidates (those maxima in the MHW optimal-scale map that are above a certain level). The next step is to obtain the amplitude of those point source candidates. The amplitude estimation is done by fitting the theoretical dependence of the point source wavelet coefficient on the MHW scale. We convolve the map with three additional MHWs (at scales adjacent to the optimal one) and fit the obtained coefficients to the theoretical curve, where w(R) is the wavelet coefficient at the scale R, σa is the beam dispersion and B is the PS amplitude (the parameter to be determined). After this, a detection criterion is applied to select the real point sources (see Vielva et al. 2001a) and those point sources are subtracted from the original data.

Dust
The second contribution that can be easily subtracted is the thermal dust emission. This emission can be important at frequencies above ≈100 GHz. At frequencies of ≈350 GHz, dust starts to dominate over the other components, and its contribution is even more important at higher frequencies. The Planck channel at 857 GHz is expected to be completely dominated by dust (as suggested by the BOOMERANG and IRAS results). This map can therefore be used as a spatial template of the distribution of dust in our Galaxy. The only thing we need to know to subtract the dust from the other channels is the frequency dependence of the dust. However, since we do not want to assume any frequency dependence (and this is unknown in the range of frequencies of Planck), we need to look for other alternatives to subtract the dust.
The average in the expression defining α(ν) refers to an integral in real space over all the pixels of the map. At the end, what we obtain is a value of α(ν) that is different for each frequency. The values of α(ν) should approach the real frequency dependence of the dust at these frequencies. It is important to remark that this method does not subtract the dust perfectly: it assumes that the dust has the same frequency dependence across the analysed sky patch, which is a good assumption for small regions of the sky. As will be noted in Section 4, this approach has proved successful with our simulations, where we have considered a varying spectral index and grain temperature for the dust emission. We have computed the spatial distribution of the spectral index on our simulated dust maps and found that it is nearly constant, with a dispersion of about 3 per cent. We would like to note that our method for subtracting the dust is nothing other than a Wiener filter, since we are looking for a linear operation on the maps such that the variance of the residual is minimal. This is just the general definition of a Wiener filter (e.g. Zaroubi et al. 1995).
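For a single patch, minimizing the variance of (map at ν) − α(ν) × (857-GHz map) reduces to an ordinary least-squares fit of each channel against the 857-GHz template. A sketch with synthetic stand-in maps (the amplitudes and noise level below are invented for illustration):

```python
import numpy as np

def dust_coefficient(map_nu, map_857):
    """Least-squares alpha(nu) minimizing Var(map_nu - alpha * map_857),
    i.e. the linear (Wiener-like) fit described in the text; means are
    removed so the fit is insensitive to map offsets."""
    a = map_857 - map_857.mean()
    b = map_nu - map_nu.mean()
    return np.sum(a * b) / np.sum(a * a)

# synthetic example: a channel containing 5 per cent of the dust
# template plus a small amount of noise (illustrative numbers)
rng = np.random.default_rng(2)
dust857 = rng.normal(size=(128, 128))
alpha_true = 0.05
m353 = alpha_true * dust857 + 0.01 * rng.normal(size=(128, 128))

alpha = dust_coefficient(m353, dust857)
cleaned = m353 - alpha * dust857           # dust-subtracted channel
```

The residual variance of the cleaned map drops to roughly the noise level, which is the sense in which this is a Wiener-type linear estimate.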
CMB
The CMB is the last component that can be subtracted easily. Since the ΔT/T distortion of the CMB is independent of frequency, a good approximation to the contribution of the CMB in each of the channels can be obtained by filtering a given CMB-dominated channel to the resolution of the other channels. The selection of the optimal channel to be filtered is easy if we want to increase the S/N ratio of the SZ effect. The channel at 217 GHz is expected to have a negligible contribution from the thermal SZ effect, while the CMB component dominates over the others. Furthermore, this channel has a resolution (FWHM = 5.5 arcmin) very close to the best Planck resolution (FWHM = 5 arcmin). Moreover, this channel has a low noise level (although it is not the best channel in terms of noise).
Therefore, our first step in cleaning the maps of the CMB contribution is to filter the high-frequency channels with FWHM = 5 arcmin to the resolution of the channel at 217 GHz (FWHM = 5.5 arcmin), where we have assumed that the beam can be well described by a Gaussian. Then, the channel at 217 GHz is subtracted from the filtered channels. We repeat the process with the channels at frequencies below 217 GHz, but in this case we have to filter the 217-GHz channel to the resolution of the other channels, since the channels below 217 GHz have a FWHM larger than 5.5 arcmin.
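The cross-filtering and subtraction just described can be sketched in Fourier space as follows, assuming Gaussian beams throughout. FWHM values are in pixel units and the grid conventions are assumptions of this sketch.

```python
import numpy as np

def gaussian_beam_ft(k, fwhm_pix):
    """Fourier transform of a Gaussian beam of the given FWHM (pixels)."""
    sigma = fwhm_pix / np.sqrt(8.0 * np.log(2.0))
    return np.exp(-0.5 * (k * sigma) ** 2)

def subtract_cmb(map_nu, map_217, fwhm_nu, fwhm_217):
    """Bring the sharper of the two maps to the coarser resolution with
    an effective Gaussian of FWHM = sqrt(|fwhm_nu^2 - fwhm_217^2|),
    then subtract the CMB-dominated 217-GHz channel, as in the text."""
    n = map_nu.shape[0]
    kx = np.fft.fftfreq(n) * 2.0 * np.pi
    k = np.hypot(*np.meshgrid(kx, kx))
    fwhm_eff = np.sqrt(abs(fwhm_nu ** 2 - fwhm_217 ** 2))
    filt = gaussian_beam_ft(k, fwhm_eff)
    if fwhm_nu < fwhm_217:
        # high-frequency channel is sharper: degrade it to 217-GHz resolution
        smoothed = np.fft.ifft2(np.fft.fft2(map_nu) * filt).real
        return smoothed - map_217
    else:
        # below 217 GHz: degrade the 217-GHz map instead
        smoothed = np.fft.ifft2(np.fft.fft2(map_217) * filt).real
        return map_nu - smoothed
```

When the two FWHMs coincide the effective filter is unity and the operation reduces to a direct subtraction.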
It is important to keep the previous order (PS, dust and CMB). Point sources should be extracted first (at least in the channel at 857 GHz), since the frequency dependence of the point sources will, in general, be different from the frequency dependence of the dust in our Galaxy. Therefore, if we subtracted the 857-GHz map with the point sources still in it, we would be assuming that these point sources have the frequency dependence of our Galaxy, which is wrong. Then we should subtract the dust prior to the subtraction of the CMB (217-GHz map), since we need to reduce the dust contribution in the 217-GHz map before subtracting this channel from the others.
We have to point out that, in the process of subtracting the components, we have increased the S/N ratio for the SZE but we have also increased the noise level. First, when subtracting the dust, we are adding the noise level of the 857-GHz map times the constant α(ν) to the other maps. This process does not increase the noise level very much in the channels below 300 GHz. At these frequencies the CMB contribution starts to dominate over the thermal dust and the value of α(ν) is, therefore, very small. Hence, the noise contribution is small as well.
Since the S/N ratio of the dust in the 857-GHz map is high and the value α(ν) is small at these low frequencies, then, the dust is subtracted without increasing the noise level too much in the low-frequency channels. Furthermore, since the 857-GHz channel must be filtered, this process decreases the noise level even more.
The increase in the noise level owing to the CMB subtraction is more important in those channels whose resolution is closer to that at 217 GHz (the 143-, 353- and 545-GHz channels) and less important in the others. This is because we have to filter the maps to the resolution of the lowest-resolution channel. Hence, if the filter has a large FWHM [FWHM_filter = (FWHM_ν² − FWHM_217²)^(1/2)], the noise is smoothed over the large scale of the filter. As a result, some channels (around 217 GHz) have their noise level increased significantly, while for others the noise level does not change much.
3.2 The Bayesian approach
The prior
The prior is defined in terms of the power spectrum of the SZ map at each k-mode. We assume that the probability depends on yc through its modulus. This can be done owing to the almost symmetric shape of the PDF of yc (see Fig. 3). The real PDF is not exactly symmetric, as can be appreciated in the figure. This means that the real PDF (in Fourier space) is not exactly a Gaussian, which is in accordance with the non-Gaussian nature of the SZE in real space. However, the PDF of the modes of the SZE map in Fourier space (although not exactly Gaussian) can be approximated by a Gaussian much better than the PDF of the SZE map in real space.
Figure 3. Probability distribution of the yc Fourier coefficients (coefficients on the ring with |k| = 60) for a simulated SZE map. The histogram is made in the real–imaginary plane [the X-axis is the real part of the Fourier coefficient, the Y-axis is the imaginary part and the Z-axis is the number of Fourier coefficients with real and imaginary parts falling in the bin (X, Y)]. Coefficients with real and imaginary parts equal to 0 are at the centre of the plane. The grid shows the PDF given by expression (13) with the power spectrum of the simulated map at k = 60.
We found that the previous probability (equation 13) makes a reasonably good fit to the probability distribution of the yc at intermediate and large k-modes (see Fig. 3). At small k-modes, equation (13) is still a reasonable approximation, although the small number of yc coefficients at these low k-modes makes it difficult to build a (statistically) good enough probability distribution for yc. Nevertheless, the interesting information containing the cluster contribution to the SZ effect is at intermediate k-modes, where equation (13) provides a reasonable approximation for the prior. The drawback of this approach is that we need to assume a given power spectrum for the SZ effect in order to define the prior.
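Expression (13) itself is not reproduced in this excerpt. A per-mode Gaussian prior with the properties the text describes (dependence on yc only through its modulus, with variance fixed by the power spectrum) would take, schematically, the form

```latex
p\bigl(y_c(\mathbf{k})\bigr) \;\propto\;
\frac{1}{P_{y_c}(k)}\,
\exp\!\left[-\,\frac{\lvert y_c(\mathbf{k})\rvert^{2}}{P_{y_c}(k)}\right],
```

which is a sketch consistent with the stated properties rather than the authors' exact expression; the normalization convention for $P_{y_c}(k)$ is an assumption.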
As shown in previous work (e.g. Komatsu & Seljak 2002), the power spectrum of the SZE is particularly sensitive to the value of σ8. Our simulations also show this strong dependence. In Fig. 4 we plot the power spectra for various simulations in which we change only the value of σ8 and the internal structure of the clusters. The three solid lines are for σ8 = 0.8, and the dotted lines are for σ8 = 0.9 (upper) and σ8 = 0.7 (lower). The three solid lines show the change in the power spectra when we change the internal structure of the clusters. This is parametrized by a β-model with one free parameter, rv/rc, the ratio of the virial radius to the core radius. The different solid lines are for the cases rv/rc = 5, 10, 15 and, as can be seen, the dependence on the internal structure is very weak. In the simulation of the SZE used to test the method we have assumed σ8 = 0.8 and rv/rc = 10 (central solid line). Hereafter, we will refer to this model as the ‘true power spectrum’.
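For reference, a projected β-model Compton profile truncated at the virial radius can be written as below. The value β = 2/3 is an assumption of this sketch (the text fixes only the ratio rv/rc), and the normalization is arbitrary.

```python
import numpy as np

def beta_profile(theta, theta_c, rv_over_rc, beta=2.0 / 3.0):
    """Projected beta-model Compton profile (up to normalization),
    y(theta) = [1 + (theta/theta_c)^2]^((1 - 3*beta)/2), truncated at
    the virial radius theta_v = (rv/rc) * theta_c. beta = 2/3 is an
    assumed value, not taken from the text."""
    y = (1.0 + (theta / theta_c) ** 2) ** ((1.0 - 3.0 * beta) / 2.0)
    return np.where(theta <= rv_over_rc * theta_c, y, 0.0)

theta = np.linspace(0.0, 20.0, 200)              # angle in core-radius units
prof = beta_profile(theta, theta_c=1.0, rv_over_rc=10.0)
```

Varying rv_over_rc between 5 and 15 changes the profile only in its outskirts, consistent with the weak dependence of the power spectrum on the internal structure noted above.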
Figure 4. The power spectrum, Pyc(k), for various simulations of the SZE. The three solid lines are the power spectra of three simulations with rv/rc = 5, 10, 15 and σ8 = 0.8. Dotted lines are two simulations with rv/rc = 10 and σ8 = 0.7 (lower) and σ8 = 0.9 (upper). The two dashed lines are the power spectra assumed in the prior in this work: the upper dashed line corresponds to model A and the bottom dashed line to model B.
Although the power spectrum of the SZE depends on the cosmological model (population) as well as on the internal structure of the individual clusters, a qualitative behaviour can be deduced straightforwardly: if we assume that the clusters are randomly distributed (following a Poissonian distribution) in the sky, the spectrum at low k-modes must be flat; on the other hand, at high k-modes the power spectrum must go down, following the β-model profile.
Hence, the shape of the power spectrum is rather stable, but its amplitude depends strongly on the value of σ8. In order to check the sensitivity of our method to the particular choice of the power spectrum, we will compare the results using different power spectra. The main results will be presented using the power spectrum shown as the upper dashed line in Fig. 4 (model A). These results will be compared with those obtained using the true power spectrum (middle solid line) and the lower dashed line (model B).
The assumed power spectrum resembles the true one but is not exactly the same. We do not take the real power spectrum of our simulations because in a real situation this is not going to be known. The choice of an exponential form for models A and B is a compromise between taking the real power spectrum of the simulated SZE and a different one that could have been obtained from a specific model. The particular cases shown in Fig. 4 correspond to the parameters C = 4.0 × 10−15 (model A), C = 3.0 × 10−16 (model B) and r = 0.3. The range in the normalization (C ∈ [3.0 × 10−16, 4.0 × 10−15]) corresponds roughly to a range σ8 ∈ [0.7, 0.9], which is consistent with the most recent constraints on this parameter (see, e.g., Lahav 2002 and references therein). The specific shape of the power spectrum is not critical, but it is important that the chosen one does not have much power at high k, in order to suppress the deconvolution of the noise. This behaviour is easily achieved when the power spectrum is of exponential form.
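The exact functional form used for models A and B is not reproduced in this excerpt; purely as an illustration, a decaying exponential spectrum with the quoted normalizations C and scale parameter r (the function name and the specific form C·exp(−kr) are our assumptions) could be sketched as:

```python
import numpy as np

def prior_power_spectrum(k, C, r):
    """Illustrative exponential prior power spectrum for the SZE modes.

    NOTE: hypothetical form C * exp(-k * r) -- the paper only states that
    models A and B are exponential with normalizations C and parameter r.
    The key property is qualitative: little power at high k, so that the
    prior suppresses noise deconvolution there.
    """
    return C * np.exp(-k * r)

k = np.arange(1, 101)
P_A = prior_power_spectrum(k, C=4.0e-15, r=0.3)  # model A normalization
P_B = prior_power_spectrum(k, C=3.0e-16, r=0.3)  # model B normalization
```

The order-of-magnitude gap between the two normalizations brackets the σ8 ∈ [0.7, 0.9] range discussed in the text.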
The likelihood
If we assume as our hypothesis equation (11), then the residual χν(x) can be expressed as
Planck channels after point source, dust and CMB subtraction. Low-frequency channels are dominated by the point source residuals. Some clusters can be seen by eye as black spots in the 70-, 100- and 143-GHz channels and as bright sources in the 353-GHz channel.
We are looking for the Fourier coefficients of the Compton parameter map, yc, which minimizes the residuals in equation (16). Since the terms in C−1 are independent of yc, the minimum of the residuals gives the maximum of the probability in equation (18). Therefore, equation (18) can be considered as the likelihood of the data.
Then, after returning to real space, we end up with a map of Compton parameters. The previous equation is our particular search engine. We remark that all the terms in that equation can be computed directly from the data (see below). The only term for which some assumptions need to be made is the power spectrum of the SZE, Pyc(k), although the particular choice of one or another shape for this power spectrum does not have any significant effect on the final result, provided that it obeys some basic rules that will be discussed later.
4 Applying the method to simulated data: results
We have applied the method explained in the previous section to the simulated Planck data described in Section 2. In Fig. 1 we show the 10 simulated channel maps. These simulations include all the relevant components (Galactic and extragalactic) as well as the noise (we have assumed the noise to be uncorrelated). The high-frequency channels, above ν ≈ 300 GHz, are clearly dominated by dust emission. Some point sources can be clearly observed in these high-frequency maps. Below ν ≈ 300 GHz, the CMB starts to dominate over the other components. The synchrotron, free–free and spinning dust emissions contribute only to the lowest-frequency channels, and their contribution is expected to be small. First, we perform the map cleaning (partial subtraction of point sources, thermal dust and CMB). Then we apply our Bayesian approach in order to estimate the emission owing to the SZE.
4.1 Map cleaning
Point sources have been extracted using the mexican-hat wavelet as explained in Section 3.1. Applying the method described in Vielva et al. (2001a), we are able to detect (and subtract) 215 point sources in the 857-GHz channel, 25 at 545 GHz, 27 at 353 GHz, 18 at 143 GHz, 18 in the HFI 100-GHz channel, 15 in the LFI 100-GHz channel, 12 at 70 GHz, nine at 44 GHz and five at 30 GHz. The number of spurious detections is lower than 5 per cent, and the mean error in the amplitude estimation is lower than 18 per cent for all the channels.
None of the clusters was identified as a point source in our simulation, since the MHW only detects sources above a certain flux limit, which is above the flux of the clusters in our simulation. In a much larger area of the sky (e.g. the full sky) there could be some bright clusters with a large flux. However, such high-flux clusters are extended sources, which can easily be distinguished from point sources.
The dust subtraction has been performed following the procedure explained in Section 3.1. The 857-GHz channel was filtered with an appropriate Gaussian beam in order to degrade its resolution to that of the other channels. We show the result in Fig. 5. As can be seen from the figure, the dust subtraction works very well at 545 GHz, where no appreciable Galactic structure is seen. The quality of the dust subtraction can also be observed in the 353-GHz channel. This channel was, prior to dust subtraction, basically a mixture of CMB and dust. After dust subtraction, the CMB component clearly dominates over the other components.
Planck channels after point source and dust subtraction. Note the effect in the 353-GHz channel. It was, basically, a mixture of CMB and dust emission (see Fig. 1) and now it is dominated by the CMB contribution.
The last component we subtract is the CMB itself. To do this, we simply subtract the 217-GHz channel, where the expected thermal SZ signal is negligible. Again, we have to degrade this map to the resolution of the other maps; in cases where another map has better resolution than the 217-GHz channel, we instead degrade that map to the resolution of the 217-GHz channel. The result can be seen in Fig. 6.
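In outline, the CMB-removal step amounts to beam matching followed by a straight difference. A minimal sketch, assuming Gaussian beams (the function name and the pixel-size parameter are ours):

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def subtract_cmb(channel_map, fwhm_ch, cmb_map_217, fwhm_217, pix_arcmin):
    """Subtract the 217-GHz map (CMB-dominated, negligible thermal SZ)
    from another channel, after matching the two resolutions.

    Sketch only: beams are modelled as Gaussians, and the map with the
    larger FWHM sets the common resolution.  To degrade from FWHM_a to
    FWHM_b > FWHM_a we convolve with a Gaussian of
    FWHM = sqrt(FWHM_b**2 - FWHM_a**2).
    """
    to_sigma = 1.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))  # FWHM -> sigma
    if fwhm_ch < fwhm_217:
        extra = np.sqrt(fwhm_217**2 - fwhm_ch**2)
        channel_map = gaussian_filter(channel_map, extra * to_sigma / pix_arcmin)
    elif fwhm_217 < fwhm_ch:
        extra = np.sqrt(fwhm_ch**2 - fwhm_217**2)
        cmb_map_217 = gaussian_filter(cmb_map_217, extra * to_sigma / pix_arcmin)
    return channel_map - cmb_map_217
```

Because the CMB has the same spectrum (in thermodynamic units) in every channel, a perfect beam match would remove it exactly; in practice a small residual remains and is absorbed into the residual correlation matrix.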
The main features that can be appreciated in these maps are the residuals of the point source subtraction. This suggests that the point source subtraction could be improved by first reducing the background level, i.e. by subtracting the dust and then the CMB (after a prior point source subtraction in the 857- and 217-GHz channels). Some clusters can now also be seen by eye, as black spots in the channels between 100 and 143 GHz and as bright sources in the 353-GHz channel.
It is important to note that before point source subtraction, the point source contribution was more important in the high-frequency channels. After point source subtraction, however, the residual of the point sources is more important at low frequencies as shown in Fig. 6.
The point source residuals at low frequencies are not a problem for our method since at those frequencies the method searches for negative wells. In contrast, point sources are bright peaks which, therefore, cannot be confused with galaxy clusters. The situation is different if a bright point source lies in the position of a cluster. In this particular case the point source is a serious contaminant for the detection of the SZE signature.
The dust and CMB subtraction will also leave a residual in the maps, although both residuals are small. However, the search engine will account for these contributions to the residual since the correlation matrix, C, will be computed from the clean data maps shown in Fig. 6.
4.2 Bayesian search engine
The Fourier transforms of the maps shown in Fig. 6 are what we consider as our data, d(k), in the Bayesian approach of equation (20). In order to solve that equation for the parameters yc, we need to compute the inverse of the correlation matrix, C−1. This is the correlation matrix of the residuals, χ(k), which is an 8 × 8 symmetric matrix, since we have used eight different channels (we did not include the maps at 217 and 857 GHz in the analysis since they were not clean, i.e. no dust and/or CMB subtraction had been performed on those channels). However, we do not know the residuals until we know the Compton parameter (see equation 16). We solve this by running the code a first time, taking the correlation matrix of the data vectors, d(k), in place of the correlation matrix of the residuals, and obtaining a first guess of the Compton parameters, yc, from equation (20). With this guess we can then compute the residuals (equation 16), their correlation matrix and, finally, its inverse, C−1.
Now we are ready to solve equation (20) with a good estimate of the correlation matrix, C−1. After solving equation (20) mode by mode in Fourier space, we can go back to real space by computing the inverse Fourier transform.
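The two-pass scheme just described can be sketched as follows; `solve_mode` stands in for the per-mode solver of equation (20) and is a hypothetical callable, not part of the paper:

```python
import numpy as np

def two_pass_yc(d_modes, R, P_k, solve_mode):
    """Two-pass estimation of the residual correlation matrix (sketch).

    Pass 1: use the correlation matrix of the data vectors d(k) as a
            stand-in for the (unknown) residual correlation matrix.
    Pass 2: form the residuals chi = d - R * y_c (equation 16), recompute
            the correlation matrix from them, and solve again.

    d_modes : (N_modes, N_channels) complex array of cleaned-map modes
    R       : (N_channels,) frequency response of the SZE
    P_k     : (N_modes,) assumed prior power spectrum per mode
    """
    C_data = np.real(np.cov(d_modes, rowvar=False))   # pass-1 matrix
    Cinv = np.linalg.inv(C_data)
    yc = np.array([solve_mode(d, R, Cinv, p) for d, p in zip(d_modes, P_k)])

    chi = d_modes - yc[:, None] * R                   # residuals
    C_res = np.real(np.cov(chi, rowvar=False))        # pass-2 matrix
    Cinv = np.linalg.inv(C_res)
    return np.array([solve_mode(d, R, Cinv, p) for d, p in zip(d_modes, P_k)])
```

The first pass overestimates the residual covariance (it still contains the SZ signal), but that only makes the first guess slightly conservative; the second pass corrects it.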
The recovered yc map is shown in Fig. 7, where we compare the recovered and the input maps. The method returns clusters at small and large scales simultaneously, although the largest scales are not very well recovered owing mainly to the low surface brightness of those clusters. The method also recovers shapes that are non-symmetric since we did not use any symmetrical filter to increase the S/N ratio (as is done in the mexican-hat wavelet analysis or the multifilter method, see Herranz et al. 2002a, for instance).
Recovered map of Compton parameters (right) versus input map (left). Both maps have been filtered with a Gaussian antenna of FWHM = 7 arcmin . This result corresponds to model A. The other two cases, model B and true power spectrum, look similar.
Since we assume a nearly arbitrary shape for the power spectrum in the prior (models A and B), we will in general recover a biased estimate of the Compton parameter map. However, part of this bias (that related to the incorrect normalization) can be corrected by rescaling the recovered map in a convenient way.
By correcting the bias in this way, the final Compton parameter map shows a very weak dependence on the power spectrum assumed in the prior, as we will see below. In Fig. 8 we show a pixel-by-pixel plot of the recovered map versus the true map. For this plot, both maps have been filtered with a Gaussian beam (FWHM = 7 arcmin), since the recovered map does not contain information below a certain scale while the true map does. The choice of the FWHM of the filter (7 arcmin) is based on the fact that the recovered map is built up basically from the contribution of the channels between 100 and 353 GHz, which have resolutions between 5.5 and 10 arcmin (the initial resolution of 5 arcmin in the 353-GHz channel was degraded to 5.5 arcmin during dust subtraction).
This figure shows the recovered map (power spectrum model A) versus the true one (convolved with a 7-arcmin FWHM beam and mean value subtracted), pixel by pixel. The cross-correlation between the true and recovered maps gives a coefficient r = 0.58. We also show the best-fitting linear model, y = sx (dashed line), which has a slope s = 0.58. For comparison we also show the case of a perfect recovery, s = 1 (solid line).
The recovered map contains true clusters as well as spurious detections. The spurious detections form a Gaussian distribution with zero mean. This 'background' of spurious detections is caused by our incorrect (although approximate) assumption of Gaussianity in the PDF of the Fourier modes. Obviously, those pixels in the recovered map with a Compton parameter smaller than approximately 0 (it is not exactly 0 because the analysed map has zero mean, see Fig. 11 in Section 5) can be interpreted as spurious, since this parameter is, by definition, positive. However, there will still be many pixels in the recovered map with positive values that do not correspond to real clusters. The true clusters have to be detected in this map, which has an approximately Gaussian noisy background. We have used the image package sextractor (Bertin & Arnouts 1996) to discriminate between the true clusters and the spurious background. Basically, sextractor selects regions of connected pixels above a given threshold, expressed in terms of the dispersion of the map (σ). After applying sextractor to the recovered image, it returns 45 detections at the 3σ level (regions with more than 15 connected pixels above the 3σ threshold). Of those detections, 44 were real clusters and one was spurious. If the detection limit is decreased to 2σ, sextractor returns 169 detections, of which 90 are spurious and 79 real. In the other two models (model B and the true power spectrum), we found the same numbers of detections: 44 real and one spurious (3σ).
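The sextractor criterion used here (regions of at least 15 connected pixels above the 3σ threshold) can be mimicked with standard connected-component labelling. A simplified stand-in, not sextractor itself:

```python
import numpy as np
from scipy import ndimage

def detect_clusters(yc_map, nsigma=3.0, min_pixels=15):
    """Select regions of connected pixels above nsigma times the map
    dispersion -- a simplified stand-in for the sextractor criterion
    described in the text.  Returns (label, n_pixels, peak_value)
    tuples, one per surviving detection.
    """
    threshold = nsigma * np.std(yc_map)
    labels, n = ndimage.label(yc_map > threshold)   # 4-connectivity by default
    detections = []
    for lab in range(1, n + 1):
        mask = labels == lab
        npix = int(mask.sum())
        if npix >= min_pixels:
            detections.append((lab, npix, float(yc_map[mask].max())))
    return detections
```

Lowering `nsigma` to 2 admits more real clusters but, as in the text, the spurious fraction grows quickly because isolated noise excursions start to form qualifying regions.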
The figure shows the PDF of the recovered map (solid line, model A). The dot-dashed line is a Gaussian with σ equal to the dispersion of the recovered map. The dotted line is the PDF of the true map (mean value subtracted). Both maps, true and recovered, have been convolved for this plot with a Gaussian beam of FWHM = 7 arcmin. The convolution is done because the recovered map does not contain any information below 5 arcmin; we chose 7 arcmin since the most relevant channels for the SZE have FWHMs ranging from 5 to 10 arcmin. The clusters returned by sextractor are in the positive tail of the PDF.
A large number of detections is important for cosmological studies. However, these detections would be of limited use without a good estimate of the flux of each cluster, since this quantity is required to build the cluster number counts, N(> S). In order to check for a possible bias in the recovered total flux in the SZE map, we have compared the true and the recovered total fluxes as returned by sextractor. The result can be seen in Fig. 9. Our method recovers the real flux with no significant bias. The plot also shows the good agreement among the results independent of the assumption made about the power spectrum of the SZE: changing the power spectrum by one order of magnitude does not have a significant effect on the recovered map and fluxes.
True total flux versus recovered total flux for the clusters returned by sextractor. The dotted line corresponds to the ideal situation, true flux = recovered flux. Circles are for model A, crosses for model B and asterisks for the true power spectrum.
Another important quantity for cosmological studies is the completeness of the recovered catalogue. The completeness level of the method is 100 per cent above fluxes of ≈200 mJy and drops quickly towards fluxes of ≈70 mJy (true flux), below which no cluster is detected (the minimum flux detected was 77 mJy for model A, 67 mJy for model B and 77 mJy for the true power spectrum). This is illustrated in Fig. 10, where we plot the detected clusters as a function of their mass and redshift. Also plotted are the undetected clusters with a flux above 70 mJy (all fluxes are given at 353 GHz).
Each point in this plot represents a cluster present in the simulation. Clusters with a flux below 70 mJy (at 353 GHz) are not represented. The clusters detected (15 pixels, 3σ) by sextractor are marked with a large open circle around the small dots. The two dotted lines are the selection functions for a survey with limiting fluxes of 70 mJy (bottom) and 200 mJy (top) at 353 GHz. Note that above 200 mJy all the clusters in the simulation have been detected by sextractor. This implies that the Planck catalogue should be complete above this flux.
In the other two cases for the power spectrum (model B and the true power spectrum), the completeness level and selection functions are very similar to the previous one. Therefore, the results are not very sensitive to the particular choice of the power spectrum in the prior, thus, allowing a certain degree of freedom in its choice. However, in order to obtain satisfactory results, this power spectrum should satisfy a set of quite generic conditions that will be explained in the next section.
5 Conclusions and discussion
We have presented a Bayesian and non-parametric method to detect the SZE signature in the Planck data. Our method only requires the knowledge of the frequency dependence of the SZE (which is well known) and the frequency dependence of the CMB (which is known to be constant). We also require that one of the channels must be completely dominated by dust emission. This is a well-founded assumption for Planck where the 857-GHz channel is expected to be completely dominated by dust emission. We also need to assume a specific form for the power spectrum of the SZE component, but that form is clearly determined by the typical β-profile of the clusters. However, we have seen that the final result does not depend critically on the assumed SZE power spectrum in the prior. We therefore have a certain degree of freedom in the choice of the power spectrum in the prior. However, there are several conditions that this power spectrum should obey in order to obtain satisfactory results.
If the clusters are distributed randomly over the sky (following a Poissonian distribution), the spectrum must be flat at low k-modes. Because clusters have a finite extension, the power spectrum must go down at large k-modes. The specific fall-off depends on the cluster profile; for a β-model, the power spectrum can be described by equation (14).
An exponential power spectrum, as assumed in this work, obeys both conditions. At high k-modes it suppresses the noise deconvolution very effectively, since it appears in the denominator of equation (20) as the inverse of an exponential. This behaviour is also achieved by any power spectrum that is small at high k-modes. The normalization condition is satisfied when the inverse of the power spectrum in the low k-mode regime is not much larger than the term RC−1R†. This is a normalization constraint that can be satisfied by most cosmological models (with σ8 within reasonable limits).
The effect of the prior can be considered as a Fourier filter that suppresses the noise level while retaining the useful information at intermediate and low k-modes.
The reader should note that equation (13) is nothing but the expression for a Gaussian; that is, we are assuming that the Fourier coefficients of the SZE map are Gaussian. The inverse Fourier transform of a set of normally distributed coefficients is a map in real space that is also normally distributed. However, we know that the SZE is clearly non-Gaussian. In Fig. 11 we show the PDF of the recovered Compton parameter map (real space) with a Gaussian distribution overlaid (dot-dashed line). As can be seen, the recovered map is very close to a Gaussian distribution but has a long positive non-Gaussian tail. When we apply sextractor, all the detections are in this non-Gaussian positive tail. Therefore, we can consider our detections as the non-Gaussian part of an almost Gaussian PDF. The disagreement in the positive tail of the PDF when compared with the real PDF of the simulation can be explained by the bias in the pixel-by-pixel recovery (see Fig. 8). However, this does not contradict the fact that the recovered flux is unbiased: our method recovers the clusters with a larger apparent size but with a lower Compton parameter (see Fig. 7), and both effects compensate (after correcting for the bias, b) so that the recovered flux shows no significant bias. Although in the method presented here we have assumed Gaussian distributions for the likelihood and the prior, the method can be applied in a more general (and possibly more realistic) way, considering that both the likelihood and the prior are non-Gaussian.
We have seen that, after applying sextractor to our final recovered map, we can detect ≈45 clusters. This number of detections in our small sky patch, (12.8°)², means that we expect ∼11 000 detections over the entire sky (∼9000 in four-fifths of the sky, i.e. outside the Galactic plane). Such a large number of detections will allow detailed studies of the evolution of the cluster population, which will have important cosmological consequences.
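The extrapolation from the simulated patch to the full sky is straightforward arithmetic; as a sanity check (taking the full-sky area as ≈41 253 deg²):

```python
# Scaling the ~45 detections in the simulated (12.8 deg)^2 patch to the sky.
patch_area = 12.8 ** 2      # deg^2, area of the simulated patch
full_sky = 41253.0          # deg^2, area of the whole celestial sphere
n_patch = 45                # sextractor detections (3-sigma) in the patch

n_full = n_patch * full_sky / patch_area   # expected full-sky detections
n_usable = 0.8 * n_full                    # four-fifths of the sky, i.e.
                                           # outside the Galactic plane
```

This reproduces the quoted ∼11 000 full-sky and ∼9000 usable detections.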
We have seen that the method can reach fluxes as low as ≈70 mJy (at 353 GHz), although with a very low completeness level. In a previous paper (Diego et al. 2002), we estimated the flux limit for the MEM method (Hobson et al. 1999). We found that limit to be ≈30 mJy (353 GHz), although we do not know whether MEM returns an unbiased estimate of the total flux (at least the pixel-by-pixel recovery is biased, as shown by the authors in Hobson et al. 1998), and we do not know the completeness level of MEM at this flux. An unbiased estimate of the flux is essential in order to build the N(S) curve (the number of clusters with fluxes in the interval [S, S + dS]). Such curves allow an independent determination of the cosmological parameters, which should agree with the conclusions obtained from the CMB alone. Understanding the selection function and the completeness level of the survey is important, since this information is needed to model the cluster number counts and their evolution.
Our method provides an estimate of the total flux of the clusters that is consistent with no bias and is almost independent of the assumed power spectrum. We have also seen that the catalogue of detected clusters is complete for those clusters with fluxes above 200 mJy. This flux defines the selection function of the catalogue that is needed in order to study the cluster number counts.
As was shown in Diego et al. (2002), the study of the number counts as a function of flux (and/or redshift) could place strong constraints on the cosmological parameters (Ω, σ8, Γ). That work was based on a Planck cluster catalogue with a limiting flux of 30 mJy (353 GHz, MEM). For this flux limit we found that we should expect ≈30 000 clusters over two-thirds of the sky. If, in fact, Planck is able to reach this limiting flux, the cosmological implications of such a large cluster catalogue would be very relevant. To reach this limit with a given separation method, one should be very cautious about the particular choice of the priors. In any case, our results are very robust, and the information provided by this technique could be used to choose the best prior in other methods. For instance, the power spectrum of the SZE (in particular, its normalization) is assumed to be known in most component separation algorithms; our method could produce an estimate of it to be used in those methods. As we show in Fig. 12, our method produces a rather accurate description of the real power spectrum up to k ≈ 30 (l ≈ 900). The three power spectra shown in Fig. 12 correspond to the recovered power spectra for models A, B and the true power spectrum. Our estimate of the power spectrum is very robust in the sense that it does not show any strong dependence on the power spectrum assumed in the prior (at least up to k ≈ 30), and could be used, for instance, to define the normalization of the power spectrum needed in other methods. As we suggested in the Introduction, a successful component separation method should combine several techniques; the one presented in this paper could form part of the final method.
Power spectrum of the recovered Compton parameter map (solid lines) compared with the real power spectrum (dotted line). The three solid lines correspond to the recovered power spectrum for the three models (true power, model A and model B). The estimate of the power spectrum given by our method could be used to define the prior of the SZE in other methods.
Acknowledgments
This research has been supported by a Marie Curie Fellowship of the European Community programme Improving the Human Research Potential and Socio-Economic knowledge under contract number HPMF-CT-2000-00967. This work has been supported by the Spanish DGESIC Project PB98-0531-C02-01, FEDER Project 1FD97-1769-C04-01, the EU project INTAS-OPEN-97-1192, and the RTN of the EU project HPRN-CT-2000-00124. PV acknowledges support from a fellowship of Universidad de Cantabria.
Probability distribution of the yc Fourier coefficients (coefficients on the ring with |k| = 60) for a simulated SZE map. The histogram is made in the real–imaginary plane [the x-axis is the real part of the Fourier coefficient, the y-axis is the imaginary part and the z-axis is the number of Fourier coefficients with real and imaginary parts falling in the bin (x, y)]. Coefficients with real and imaginary parts equal to 0 are at the centre of the plane. The grid shows the PDF given by expression (13) with the power spectrum of the simulated map at k = 60.