The cross-power spectrum between 21cm emission and galaxies in hierarchical galaxy formation models

The correlation between 21cm fluctuations and galaxies is sensitive to the astrophysical properties of the galaxies that drove reionization. Thus, detailed measurements of the cross-power spectrum and its evolution could provide a powerful measurement both of the properties of early galaxies and the process of reionization. In this paper, we study the evolution of the cross-power spectrum between 21cm emission and galaxies using a model which combines the hierarchical galaxy formation model GALFORM implemented within the Millennium-II dark matter simulation, with a semi-numerical scheme to describe the resulting ionization structure. We find that inclusion of different feedback processes changes the cross-power spectrum shape and amplitude. In particular, the feature in the cross-power spectrum corresponding to the size of ionized regions is significantly affected by supernovae feedback. We calculate predicted observational uncertainties of the cross-correlation coefficient based on specifications of the Murchison Widefield Array (MWA) combined with galaxy surveys of varying area and depth. We find that the cross-power spectrum could be detected over several square degrees of galaxy survey with galaxy redshift errors less than 0.1.


INTRODUCTION
The prospect of measuring the 21 cm power spectrum from the epoch of reionization is a focus of modern theoretical cosmology (e.g. Morales & Wyithe 2010). A very successful technique has been to employ an N-body code to generate a distribution of halos, and then apply radiative transfer methods in post-processing to model the generation of ionized structure on large scales using various models for the ionizing sources (e.g. Ciardi et al. 2003;Sokasian et al. 2003;Iliev et al. 2007;Zahn et al. 2007;Trac & Cen 2007;Shin et al. 2008;Iliev et al. 2008;Trac et al. 2008). However, when constructing models to assign ionizing luminosities to dark matter halos, most studies have used a constant mass-to-luminosity relation. On the other hand, the degree to which the important astrophysics governing formation and evolution of high redshift galaxies will influence observations of the 21cm power spectrum is not well known. To improve on the source modelling for calculation of the ionizing photon budget in reionization simulations, several studies ⋆ jaehongp@student.unimelb.edu.au † hansikk@unimelb.edu.au Raičević et al. 2011;Lacey et al. 2011) have used GALFORM (Cole et al. 2000;Baugh et al. 2005;Bower et al. 2006) combined with Monte-Carlo merger trees. However these studies calculated only the global evolution of reionization, and are not able to address the reionization structure. Most recently, Kim et al. (2013) have combined GALFORM implemented within the Millennium-II dark matter simulation (Boylan-Kolchin et al. 2009), with a semi-numerical scheme to describe the resulting ionization structure. Kim et al. (2013) demonstrated the sensitivity of the ionization structure to the astrophysics of galaxy formation, and found that the strength of supernovae (SNe) feedback is the most important quantity.
In addition to the 21 cm power spectrum, several studies have previously analysed the cross-power spectrum (correlation) between redshifted 21cm observations and galaxy surveys Lidz et al. 2009Lidz et al. , 2011Wiersma et al. 2013). These models showed that the crosspower spectrum should be observable, but do not provide a self consistent link between the astrophysics of galaxy properties and the reionization structure. For example  and Lidz et al. (2009Lidz et al. ( , 2011) used a simple one-to-one relation between luminosity and dark matter halo mass. Conversely, in Wiersma et al. (2013), the cross-power spectrum was predicted using a semi-numerical code for 21cm emission based on dark matter overdensity cross-correlated with a semi-analytic model for galaxies. As a result, the calculation did not include the direct relation between galaxies and ionization structure. In this paper our aim is to determine whether the cross-power spectrum can be used to infer the properties of high redshift galaxy formation. We present predictions for the cross-power spectrum between 21cm emission and galaxies using the model of Kim et al. (2013) who directly combined detailed models of high redshift galaxy formation using GALFORM with a semi-numerical description, and predict the resulting redshifted 21cm power spectrum of different reionization histories. This model provides self-consistent results because the ionizing sources and observed galaxies are the same. These galaxies include both the observed luminous galaxies and the low mass (∼ 10 8 M⊙) galaxies thought to drive reionization.
We begin in § 2 and § 3 by describing the implementation of GALFORM, our method for modelling the ionization structure, the cross-power spectrum and cross-correlation function, and the cross-correlation coefficient. The crosspower spectra from our method, and the effect of feedback processes on the cross-power spectra are presented in § 4. In § 5 we describe the observational uncertainty. We finish with some conclusions in § 6.

THE GALFORM GALAXY FORMATION MODEL
In this section we summarise the theoretical galaxy formation modelling based on Kim et al. (2013) that is used in our analysis in order to describe the new features for this paper.
We implement the GALFORM (Cole et al. 2000) model, within the the Millennium-II dark matter simulation (Boylan-Kolchin et al. 2009). In this study, we specifically use the Lagos implementation of GALFORM (Lagos et al. 2012) model described in Kim et al. (2013). The simulation has a cosmology including fractional mass and dark energy densities with values of Ωm = 0.25 , Ω b = 0.045 and ΩΛ=0.75, a dimensionless Hubble constant of h=0.73, and a power spectrum normalisation of σ8=0.9. The particle mass of the simulation is 6.89×10 6 h −1 M⊙ and we detect haloes down to 20 particles (the minimum halo mass corresponds to ∼ 1.4 × 10 8 h −1 M⊙) in the simulation box of side length L = 100h −1 Mpc. Figure 1 shows the relation between the UV magnitude (the rest-frame 1500Å AB magnitude) including the effects of dust extinction of galaxies and the host halo mass (top), and between the total Lyman continuum luminosity (ṄLyc) of each galaxies and the host halo mass (bottom) from the GALFORM model. Of particular note is that the luminosity of an ionizing source is not simply proportional to the host halo mass as is often assumed in reionization models (Iliev et al. 2011;Lidz et al. 2009Lidz et al. , 2011. In part this is because of the distribution of satellite galaxies. The broad scatter of the relation indicates that physically motivated modelling for ionizing sources during the reionization should be included to understand the epoch of reionization. We note that this magnitude is not the same as ionizing luminos- Figure 1. The relation between the UV magnitude (the restframe 1500Å AB magnitude) and the host halo mass (top), and between the total Lyman continuum luminosity (Ṅ Lyc ) of each galaxies (bottom) for galaxies and the host halo mass at z = 7.272 from GALFORM. In each panel, black and blue dots represent central and satellite galaxies. Red, orange, and yellow colors represent 1 (68.3%), 2 (95.4%) and 3-sigma (99.7%) levels.
ity. However, as shown in Figure 1, the UV magnitude (the rest-frame 1500Å AB magnitude) is closely related to the ionizing luminosity. Furthermore, Figure 1 shows that the predicted ionizing luminosity to mass ratio from the model is not a simple one-to-one relation between luminosity and dark matter halo mass.

THE IONIZATION MODEL
In this section we summarize the calculation of the ionized structure ( § 3.1) and describe calculation of cross-power spectrum and cross-correlation function ( § 3.2 and § 3.3).
3.1 Semi-Numerical scheme to calculate the evolution of ionizationed structure Mesinger & Furlanetto (2007) introduced an approximate but efficient method for simulating the reionization process, referred to as a semi-numerical technique. In this paper we apply a semi-numerical technique to find the ionization structure resulting from GALFORM galaxies within the Millennium-II dark matter simulation. The simulation box is divided into cells. We calculate the number of photons produced by galaxies in each cell that enter the IGM and participate in reionization to be where fesc is the escape fraction of ionizing photons produced by galaxies. HereṄ Lyc,cell (t) is the total Lyman continuum luminosity of the N cell galaxies within the cell expressed as the emission rate of ionizing photons (i.e. units of photons/s). The ionization fraction within each cell is calculated as where Fc denotes the mean number of recombinations per hydrogen atom and N HI,cell is the number of neutral hydrogen atoms within a cell. We assume that the overdensity of neutral hydrogen follows the dark matter and selfreionization of a cell occurs when Q cell 1. It is complicated to theoretically predict the values of Fc and fesc, and the values are not known. In this paper, we use the values of (1 + Fc)/fesc in table 2 of Kim et al. (2013). These parameters provide a reionization history with a mass-averaged ionization fraction of xi = 0.55 at z = 7.272 and xi = 0.75 at z = 6.712. We divide the Millennium-II simulation box into 256 3 cells, yielding cell side lengths of 0.3906h −1 Mpc and comoving volumes of 0.0596h −3 Mpc 3 .
Based on equation (2), individual cells can have Q cell 1. On the other hand, cells with Q cell < 1 may be ionized by photons produced in a neighbouring cell. In order to find the extent of ionized regions we therefore filter the Q cell field using a sequence of real space top hat filters of radius R (with 0.3906 < R < 100h −1 Mpc), producing one smoothed ion-ization field QR per radius. At each point in the simulation box we find the largest R for which the filtered ionization field is greater than unity (i.e. ionized with QR 1). All points within the radius R around this point are considered ionized. Ionization cells with 0 < Q cell < 1 which are not part of an ionized QR 1 region retain their values.

The cross-power spectrum
The 21cm brightness temperature contrast may be written asδ where (Zaldarriaga et al. 2004). For convenience, we define δ21(r) ≡δ21(r)/T0(z), so that δ21(r) is a dimensionless quantity. Galaxy overdensity is given by where ρ gal (r) is a galaxy density field andρ gal is mean density. Definingδ21(k) to be the Fourier transform of δ21(k), the cross-power spectrum is given by where δD(k) is the Dirac delta function. The dimensionless cross-power spectrum is

The cross-correlation function
The cross-correlation function is defined as We calculate the cross-correlation function using the Fourier transform, We also calculate the cross-correlation coefficient,

THE CORRELATION BETWEEN 21CM EMISSION AND GALAXIES
In this section we present predictions for the cross-power spectrum, cross-correlation function and cross-correlation coefficient between 21cm emission and galaxies as a function of redshift, luminosity, and host halo mass ( § 4.1). We also discuss the effect of feedback processes on the cross-power spectrum, cross-correlation function and cross-correlation coefficient ( § 4.2).  (Bouwens et al. 2011;Finkelstein et al. 2012). At each redshift, we calculate a mass-averaged ionization fraction, xi . From the correlation function, galaxies and 21cm emission are anti-correlated at small separations while at large separations we find a weak correlation. These regions are separated by a transition wavenumber at which the cross-correlation coefficient and cross-correlation function change from negative to positive. Galaxies are correlated with 21cm emission on scales larger than the ionized regions, but anti-correlated on smaller scales. The size of ionization regions therefore corresponds to this transition wavenumber. We find that the transition wavenumber from negative to positive cross-correlation coefficient increases as redshift decreases since the size of ionized regions generated by galaxies increases as the Universe evolves. Figure 3 shows the comparison of cross-power spectra and cross-correlation functions between different host halo mass thresholds at z = 7.272 ( xi = 0.55). We find that more massive halos exhibit stronger anti-correlation as expected (Lidz et al. 2009;Wiersma et al. 2013). The same trend is also shown in Figure 4 where we compare the results from calculations with different UV magnitude (MAB(1500Å)−5log(h)) thresholds. Figure 4 shows that the transition wavenumber is similar for galaxy samples selected at different luminosity thresholds, since this scale is primarily set by the size of HII regions.

The effect of feedback processes
In order to investigate the effect on the power spectrum of different feedback processes in galaxy formation, we follow a similar method to Kim et al. (2013). We use the Lagos et al. (2012) galaxy formation model as our fiducial case, and then consider two variants of this (hereafter called NOSN models) which have supernovae feedback turned off. We use two variants of the NOSN model. First, we consider the inclusion of photoionization feedback using Vcut = 30 km/s, where Vcut is a threshold value of the host halo's circular velocity (Kim et al. 2013). Second, we removed both supernovae feedback and photoionization feedback by setting Vcut = 0 km/s. We refer to this second model as NOSN (no suppression) in this paper. Since turning off supernovae feedback in the Lagos et al. (2012) model changes the bright end of the UV luminosity function, we have changed some other parameters so that the NOSN models still match the observed UV luminosity functions at z = 7.272. Specifically, we introduce a stellar initial mass function dominated by brown dwarfs, with Υ = 4, and also reduce the star formation timescale in bursts by setting f dyn = 2 and τ * burst,min = 0.005 Gyr (see Cole et al. (2000) and Lacey et al. (2011) for more details of these parameters).
In Figure 5 we show the resulting comparison of crosspower spectra and cross-correlation functions at z=7.272. The locations of transition wave numbers between the Lagos et al. (2012) model and the two NOSN models are significantly different (see also ionization structure for these models in Kim et al. (2013)). In particular, the Lagos et al. (2012) model has a larger transition scale. On small scales, the cross-correlation function of the Lagos et al. (2012) model shows stronger anti-correlation than the two NOSN models between 21cm emissions and galaxies. Furthermore, we find that while photoionization feedback also suppresses low luminosity galaxies, the effect is smaller than the effect of the supernovae feedback.

DETECTABILITY
In this section we describe the error estimation of the crosscorrelation coefficient ( § 5.1) and discuss observational requirements for future galaxy surveys ( § 5.2). Our examples are based on the MWA-like observations of the 21cm signal combined with various hypothetical galaxy redshift surveys.

Error estimate in the cross-correlation coefficient
In order to estimate the sensitivity of future surveys, we calculate the error on the cross-correlation coefficient Lidz et al. 2009). For convenience we use the notation of Lidz et al. (2009) for the crosscorrelation coefficient,  The error on the cross-correlation coefficient can be written as This equation has variances of the cross-power spectrum between 21cm and galaxy, and the auto-power spectra of both the 21cm emission and galaxies. It also has the covariance between different pairs of power spectra. The components of equation (11) are given by and where µ is the cosine of the angle between k and the line of sight. To introduce large scale redshift space distortions we use the relation, P (k, µ) = (1 + βµ 2 ) 2 P (k), where β = Ω 0.6 m (z)/b and b is a bias factor, between the redshift space power spectrum and the real space (Kaiser 1987). We use b 2 gal (k) = P gal (k)/PDM(k) which is scale dependent and assume b21 = 1 for 21cm power spectrum.
The first term in equation (13) comes from a sample variance within the finite volume of the survey and the sec-ond term comes from the thermal noise of the 21cm telescope. We have assumed specifications of the MWA for the calculation of thermal noise. In the thermal noise term, Tsys ∼ 250[(1 + z)/7] 2.6 K denotes the system temperature of the telescope; B = 8MHz is the survey bandpass; tint is the integration observing time. We use 1000 hours total observing time in this calculation; D and ∆D are the comoving distance to the survey volume and the comoving survey depth, ∆D = 1.7 B 0.1MHz 1+z 10 Ωmh 2 0.15 −1/2 (Furlanetto et al. 2006), respectively; n(k ⊥ ) denotes the number density of baselines in observing the transverse component of the wave vector, where k ⊥ = 1 − µ 2 k. Observing the signal of k ⊥ in each Fourier cell is related to the length of baseline and the antenna configuration. Here, we follow the method of Morales (2005), Bowman et al. (2006) and Datta et al. (2007) for calculation of n(k ⊥ ). The maximum value of the transverse component of the wave vector is k ⊥,max = 2πLmax/(Dλ), where Lmax = 750m is the maximum baseline distance in the antenna array. This limit is due to the maximum angular resolution of the telescope related to Lmax. On the other hand, the minimum line-of-sight wavenumber is set by the bandpass kmin = 2π/∆D; The observed wavelength is λ = 0.21m×(1+z), and Ae is the effective collecting area of each antenna. We use Ae ∼ N dip λ 2 /4 (Bowman et al. 2006), where N dip = 16 is the number of dipoles. We have assumed 500 antennae elements. 1 From equations (12 -19), we compute the errors of the power spectra averaged over a spherical shell of the logarithmic width ǫ = dlnk for individual k-modes. For example, the Figure 6. The 21cm power spectrum with estimated errors, based on an 800 deg 2 survey area, at z = 7.272. We assume 1000 hours total observing time, and, based on the assumption of 8MHz bandwidth, the survey depth is about 0.2 redshift units. Red represents the power spectrum from our model including supernovae feedback with Vcut = 30km/s. The light grey and dark grey lines represent power spectrum from the NOSN models with Vcut = 30km/s and no suppression, respectively. error of the cross-power spectrum is given by where Vsurvey is the effective survey volume for a radio telescope, Vsurvey = D 2 ∆D(λ 2 /Ae). The value of λ 2 /Ae corresponds to the solid angle of the survey, which for the MWA corresponds to ∼ 800deg 2 . Note that if the galaxy survey volume is less than the 21cm survey volume, then the variance is increased by a factor of Vsurvey,21/V survey,gal . The MWA is designed to operate at frequencies between 80 and 300MHz in order to observe the 21cm signal at 6 < z < 30. When the 21cm signal is observed at z ∼ 7, the wavelength is ∼ 1.7m corresponding to ∼ 200MHz. Figure 6 shows the 21cm power spectrum with errors estimated based on equation (13) for cases including different feedback processes. The 21cm power spectra show obvious differences between the models for supernovae feedback, especially at large scales. Figure 6 reinforces the importance of detailed modeling of galaxy formation during reionization (Kim et al. 2013).
The error on the galaxy power spectrum is expressed in equation (14). The galaxy shot-noise is dependent on the number density of galaxies observable (n gal ), k = µk, and σχ = cσz/H(z), where σz is the galaxy redshift error. Here, we assume a Gaussian distribution of redshift errors.

Observational requirements for future galaxy surveys
Following Lidz et al. (2009), we begin by considering a galaxy number density of n gal = 1.6 × 10 −4 h 3 Mpc −3 for a survey in combination with 21cm observations from the MWA. To match this number density in our galaxy catalogue we use a magnitude threshold, with a value of -19.4 at z = 7.272 and -19.8 at z = 6.712, in UV magnitude (MAB(1500Å) − 5log(h)). We also match the number density of NOSN models in the same way. To find the general requirements for detection of the cross-correlation, in Figure 7 we show the signal to noise (S/N) for the cross-correlation coefficient as a function of survey area (Asurvey) and redshift error (σz). In our calculations, we assume 1000 hours total observing time for the MWA. The left panel of Figure 7 shows the total S/N, which is calculated by summing up the S/N in each k bin, where i represent ith bin and ∆k is the bin size. We assume a redshift error (σz) of 0.05 as an example value from narrowband survey for Lyman-α emitters (Ouchi et al. 2008(Ouchi et al. , 2010. The S/N is calculated with a survey area of 800 deg 2 , and then scaled the S/N with the relation, S/N ∝ 1/ Asurvey. Our default model that includes SNe feedback with Vcut = 30km/s shows increased S/N compared with the results of the NOSN models. The default model predicts a 3-σ detection of cross-correlation with a 2 deg 2 survey area. Survey areas greater than 10 deg 2 will provide detailed high S/N measurements. As a specific example, we also calculate the total S/N as a function of σz by assuming the survey area of Asurvey = 5 deg 2 . The total S/N in the central panel of Figure 7 shows that measurements will require redshift uncertainties less than 0.1. The NOSN models show a similar shape to the default model, but have lower S/N. Lower accuracy redshifts (σz > 0.1) wash out the cross-correlation signal. An error of σz ∼ 0.1 provides measurement only on larger scales (k [hMpc −1 ] < 0.2) (right panel in Figure 7). To measure the cross-correlation over a broad range of k, redshift uncertainties, σz, less than 0.01 will be required. Figure 7 illustrates the conditions required for measurement of the 21cm-galaxy cross-correlation. Before concluding we discuss these requirements with respect to real galaxy surveys. In this study, we have used UV magnitude cuts to select galaxy samples which relate to observed Lyman-break galaxies. However, Lyman-break galaxies, which are photometrically selected, have σz 0.5 at z ∼ 6.5 (Beckwith et al. 2006) much longer than the σz < 0.1 requirement. As a result, Lyman-break surveys will not be sufficient to detect the cross-correlation (Wiersma et al. 2013). On the other hand, Ly-α emitters selected from narrow-band surveys have σz ∼ 0.05−0.1 (Ouchi et al. 2008(Ouchi et al. , 2010. Thus, a detection could be made based on the precision and volume of current Ly-α surveys. Our semi-numerical model does not predict Ly-α luminosity (see Orsi et al. (2008)). However, the difference between simulated populations of Ly-α emitters and Lyman-break galaxies is found not to be significant (Dayal & Ferrara 2012). For the purpose of our calculation, we therefore use star forming galaxies with UV magnitudes  . The error bars are calculated for spherical bins of logarithmic width ǫ ≡ dlnk = 0.5. We assume a 5 deg 2 galaxy survey field, 1000 hours total observing time, the redshift error of 0.05 and galaxy number density of the Subaru Deep Field survey. The solid(orange and red) lines represent the power spectrum from the default model (Lagos et al. 2012) including supernovae feedback with Vcut = 30km/s. The long dashed(light grey) and dot-dashed (dark grey) lines represent power spectrum from the NOSN models with Vcut = 30km/s and no suppression, respectively.
The largest Ly-α survey (Ouchi et al. 2010) covered 1 deg 2 at z ∼ 6.6, and used a narrowband filter with a central wavelength of 9196Å and a full width at half maximum of 132Å. These values correspond to a survey depth of ∆z ∼ 0.11 at z = 6.6. This is smaller than, but comparable to the survey depth of MWA observation which is ∆z ∼ 0.3 corresponding to the bandwidth of 8 MHz assumed for this paper. For the survey at z ∼ 7.3, Shibuya et al. (2012) has a survey depth of ∆z ∼ 0.18, which used the central wavelength of 10052Å and a full width at half maximum of 214Å. This is also smaller than, but comparable to the MWA observation of the survey depth of ∆z ∼ 0.38.
While 1 deg 2 represents the largest high redshift survey at the current time, future surveys will be larger. For example, in the next 5 years Hyper Suprime-Cam on the Subaru telescope will observe 10 5 galaxies at z ∼ 5.7 and 6.5 in a survey area of ∼ 30 deg 2 , and 100s of galaxies at z ∼ 7 in a survey area of 3 deg 2 (M. Ouchi private communication). As shown in Figure 7, this increased survey area will improve the S/N, so that the cross-power spectra signal could be detected with high significance. 2 Based on the requirement from § 5.2, we assume a 5 deg 2 galaxy survey field and a redshift error of 0.05 for a future galaxy survey. We also assume 1000 hours total observing time. Figure 8 shows the predicted errors for the crosscorrelation coefficient within spherical bins of logarithmic width ǫ = 0.5 at z = 6.712 and 7.272 for such a galaxy survey combined with the MWA. The estimated errors are exponentially increased near the wave number of 1 hMpc −1 , because of the limit of k ⊥,max for a 21cm survey. We compare the result with the cross-correlation coefficient from the NOSN models. The result shows that we could observationally distinguish our default model from two different NOSN models.
Ly-α observations at z 7 over a large area are very challenging. The latest Ly-α survey at z ∼ 7.3 (Shibuya et al. 2012) has a galaxy number density of ∼ 6.7 × 10 −6 covering a survey area of 0.48 deg 2 . This value is smaller than the value we assume for our error estimation. Computing the cross-power spectrum corresponding to this number density is not possible owing to the limited box size of our simulation. However, we have checked that the estimated error would approximately increase by a factor of 2, when using this number density.

SUMMARY AND CONCLUSIONS
In this study we have investigated evolution of the cross-power spectrum, cross-correlation function and crosscorrelation coefficient between 21cm emission and galaxies using the model of Kim et al. (2013). This model combines the hierarchical galaxy formation model GALFORM implemented within the Millennium-II dark matter simulation, with a semi-numerical scheme to describe the resulting ionization structure. We find that there is a transition wave number, k, at which the cross-correlation coefficient changes from negative to positive (Lidz et al. 2009). This transition wave number is associated with the size of the ionized regions generated by galaxies, and increases with decreasing redshift. We also find the same trend in the crosspower spectrum and cross-correlation function. We calculated the cross-power spectrum as a function of UV luminosity (MAB(1500Å) − 5log(h)) and host halo mass. These calculations reveal that bright galaxies and galaxies residing in massive halos have stronger anti-correlation, but a similar transition wavenumber.
We have studied observational uncertainties in measurement of the cross-correlation coefficient based on the specifications of an upgraded (512 tile) Murchison Widefield Array (MWA) combined with galaxy surveys. The results show that the cross-power spectrum signal could be detected when combined with more than 3 square degrees of a galaxy survey at the depth of the future galaxy survey having redshift error < 0.1. We have also investigated the dependance