21cm Intensity Mapping cross-correlation with galaxy surveys: current and forecasted cosmological parameters estimation for the SKAO

We present a comprehensive set of forecasts for the cross-correlation signal between 21cm intensity mapping and galaxy redshift surveys. We focus on the data sets that will be provided by the SKAO for the 21cm signal, DESI and Euclid for galaxy clustering. We build a likelihood which takes into account the effect of the beam for the radio observations, the Alcock-Paczynski effect, a simple parameterization of astrophysical nuisances, and fully exploit the tomographic power of such observations in the range $z=0.7-1.8$ at linear and mildly non-linear scales ($k<0.25 h/$Mpc). The forecasted constraints, obtained with Monte Carlo Markov Chains techniques in a Bayesian framework, in terms of the six base parameters of the standard $\Lambda$CDM model, are promising. The predicted signal-to-noise ratio for the cross-correlation can reach $\sim 50$ for $z\sim 1$ and $k\sim 0.1 h/$ Mpc. When the cross-correlation signal is combined with current Cosmic Microwave Background (CMB) data from Planck, the error bar on $\Omega_{\rm c}\,h^2$ and $H_0$ is reduced by a factor 3 and 6, respectively, compared to CMB only data, due to the measurement of matter clustering provided by the two observables. The cross-correlation signal has a constraining power that is comparable to the auto-correlation one and combining all the clustering measurements a sub-percent error bar of 0.33% on $H_0$ can be achieved, which is about a factor 2 better than CMB only measurement. Finally, as a proof-of-concept, we test the full pipeline on the real data measured by the MeerKat collaboration (Cunnington et al. 2022) presenting some (weak) constraints on cosmological parameters.


INTRODUCTION
Efforts to seek new physics beyond the standard cosmological model, which is grounded in cold dark matter and the cosmological constant (ΛCDM), are currently centred on addressing the  0 and  8 tensions.These involve modifications of the theoretical framework and the exploration of new observables with the potential to be sensitive to various scales and redshifts while being subject to distinct systematics and statistical errors compared to well-established probes.
Neutral hydrogen (HI) has recently emerged as a new quantitative tracer of the large-scale structure (LSS) of the Universe (e.g.Pritchard & Loeb 2012;Ansari et al. 2012;Santos et al. 2015) via the intensity mapping (IM) technique.The 21cm IM line is produced by the hyperfine structure spin-flip transition of the electron in atomic hydrogen (Furlanetto et al. 2006) and, compared to other observables, has the great advantage of probing large volumes in an efficient way at the expense of a relatively poor angular resolution.
The SKA Observatory (SKAO)1 , consists of SKA-Low and SKA-Mid telescopes, which will be located in Australia and South Africa, respectively.Using the SKA-Mid telescope array as a collection of single-dishes (Battye et al. 2013;Santos et al. 2015) it will be possible to perform 21cm IM at cosmological scales up to redshift  ∼ 3 (SKA Cosmology SWG 2020).The SKAO is currently under construction and MeerKAT, the SKA-Mid precursor, has been conducting IM survey for cosmology (Santos et al. 2017, MeerKLASS).Preliminary data analyses have provided interesting results on the potential of the MeerKAT telescope intensity mapping surveys (Wang et al. 2021;Irfan et al. 2022) along with a first MeerKLASS detection of the HI signal in cross-correlation with WiggleZ galaxies (Cunnington et al. 2022).More recently, another breakthrough has been reached with the detection at cosmological scales of HI with IM in the auto powerspectrum (Paul et al. 2023).However, mitigating the foregrounds and their impact on the extracted signal, both in cross and autocorrelation, is challenging and several foreground removal techniques have been proposed (Alonso et al. 2015;Wolz et al. 2016;Carucci et al. 2020;Matshawule et al. 2020;Irfan & Bull 2021;Cunnington et al. 2021;Soares et al. 2021Soares et al. , 2022;;Spinelli et al. 2021).
From the theoretical point of view, it is of great importance to refine the forecasts for the 21cm IM, both alone and in combination with other probes, to optimise the survey design in order to enhance the signal-to-noise ratio (Villaescusa-Navarro et al. 2015;Bull et al. 2016;Jiang et al. 2023), to address the non-linear scales modelling in the context of the MeerKat detection (Padmanabhan et al. 2023), to investigate the cross-correlation in the connection with galaxy formation models (Spinelli et al. 2020), and to estimate the impact of fundamental new physics on the observables, like non-gaussianities or dark energy (Jolicoeur et al. 2023;Wu et al. 2023).
In Berti et al. (2023) we built on the formalism of Blake (2019); Cunnington et al. (2020); Soares et al. (2021); Cunnington et al. (2022) and study the redshift-space 21cm power spectrum monopole and quadrupole, forecasting the constraining power of SKAO observations within multiple redshift bins.In this work, we extend this analysis by including in our pipeline the modelling of the 21cm and galaxy clustering cross-correlation signal.As before, we focus on the parameters of the ΛCDM model and exploit the exquisite tomographic nature of the 21cm IM signal.
For the 21cm IM, we mimic SKA-Mid observations following the SKAO Redbook prescriptions (SKA Cosmology SWG 2020).Regarding galaxies, we rely on the mocking of data sets for the galaxy clustering signal which could be provided soon by Dark Energy Spectroscopic Survey (DESI) (Vargas-Magaña et al. 2018;Aghamousa et al. 2016) and the Euclid mission (Blanchard et al. 2020).We will build a full likelihood integrated within the CosmoMC code (Lewis & Bridle 2002;Lewis 2013) and compute constraints through Markov-Chain Monte-Carlo (MCMC) techniques.We assess the constraining power of our mock 21cm data set in cross-correlation with galaxy clustering alone and combined with CMB data.
We note that forecasts for future IM observations based on the Fisher matrix formalism have also been presented in Obuljen et al. (2018); Karagiannis et al. (2022); Viljoen et al. (2020).Our work expands on previous studies in the following aspects: ) we constrain the complete set of cosmological parameters, ) we cross-correlate the signal with the most recent forecasts for state-of-the-art galaxy surveys, ) we combine the 21cm data within multiple redshift bins with the Planck latest available results, ) we conduct complete MCMC analyses, to estimate the full posterior distribution.
The structure of the paper is as follows.The methodology, discussed more extensively in Berti et al. (2023), is briefly reviewed in section 2. The building of the mock observations is detailed in section 3. Results are discussed in section 4, where we present the forecasted constraints obtained from 21cm and galaxy clustering cross-correlation alone (section 4.1) and in combination with the latest Planck 2018 CMB data (section 4.2).Results obtained with data measured in Cunnington et al. (2022) are shown in subsection 4.3.A summary of the results and our conclusions is outlined in section 5. We also provide two useful appendixes that discuss the Alcock-Paczynski effect's impact and a further cross-check on the attained signal on present data sets.

MODELING THE CROSS-CORRELATION SIGNAL
The analysis presented in this work for the 21cm × galaxy clustering power spectrum is an extension of the one discussed in Berti et al. (2023).Since we adopt analogous formalism and framework of the previous study, in the following, we review only the essential information.Having defined the cosmological model we consider in section 2.1, we discuss the 21cm auto-power spectrum and the galaxy power spectra in section 2.2 and section 2.3, which enter the error estimation.The model for the cross-correlation power spectrum and the power spectrum multipole expansion are presented in section 2.4 and section 2.5.

Fiducial cosmological model
We work within the standard cosmological model framework, i.e. the ΛCDM model.We perform our analysis using the following six parameters to define the fiducial cosmology: Ω  ℎ 2 and Ω  ℎ 2 , which describes the density of the baryonic and cold dark matter, respectively, the scalar spectral index   , the normalization of the primordial power spectrum   , the Thomson scattering optical depth due to reionization , and   , that is connected to the angular scale of the sound horizon at decoupling.Moreover, we will focus on the derived parameters  0 , i.e. the current expansion rate in km s −1 Mpc −1 and  8 , the root mean square matter fluctuations today in linear theory.

Model for the observed 21cm signal power spectrum
The 21cm non-linear power spectrum can be modeled as (Kaiser 1987;Villaescusa-Navarro et al. 2018; SKA Cosmology SWG 2020) where Tb is the HI mean brightness temperature,  HI is the HI bias,  is the growth rate,  = k • ẑ is the cosine of the angle between the wave number and the line-of-sight,  m (, ) is the non-linear matter power spectrum and  SN is the shot noise term.
For the evolution in redshift of the brightness temperature, we use the parametrization defined in Battye et al. (2013).Given that we lack an analytical model,  HI () and  SN () at a given redshift are computed by interpolating numerical results from hydrodynamical simulations (Villaescusa-Navarro et al. 2015;Villaescusa-Navarro et al. 2018).The growth rate  () and the non-linear matter power spectrum  m (, ) are, instead, computed numerically by means of the Einstein-Boltzmann solver CAMB2 (Lewis et al. 2000).
To mimic a realistic observation, we introduce the effect of a Gaussian telescope beam, as a suppression of the power spectrum on scales smaller than the beam's full width at half maximum (Battye et al. 2013;Villaescusa-Navarro et al. 2017;Soares et al. 2021;Cunnington et al. 2020;Cunnington 2022).The corresponding damping factor B(, , ) can be written in terms of the beam's physical dimension In a real-world scenario, one must consider the possibility of having chosen the wrong fiducial cosmology.This can be taken into account with the Alcock-Paczynski (AP) modifications (Alcock & Paczynski 1979).Anisotropies along the radial and transverse direction can be modelled as3 . (3) Here,  fid  () and  fid () are the values of the angular diameter distance and the Hubble parameter at redshift  predicted by the fiducial cosmology.The AP parameters  ⊥ and  ∥ modify the overall amplitude of the power spectrum and the wave vectors.The wave vector components along and transverse to the line of sight are then distorted as and where  and  are the assumed fiducial values of the wave vectors.
The observed 21cm power spectrum, marked with the symbol ˆ, including the beam smoothing, the AP effects and the instrumental noise, is then where  21 (, , ) is defined in Equation 1, but computed on the new variables  and .The SKAO-like instrumental noise  N can be modelled using the instrument specifications as in Equation 9of Berti et al. (2023).We note that, in this work, we expand the modelling of Berti et al. (2023), where we neglected the AP contribution in the first approximation.A discussion on the effect of the inclusion of the AP distortions on the cosmological parameter constraints is presented in appendix A.

Model for the galaxy power spectrum
The simplest parametrization of the galaxy power spectrum at a given redshift can be written as where  g is the galaxy bias and ng is the galaxy number density.
The term 1/ ng is the shot noise term for the galaxy power spectrum.
In this work, we use values of  g and ng in agreement with the official expected values for the planned galaxy surveys, as discussed in section 3.2.Due to the different observing techniques, the galaxy power spectrum is not affected by the beam correction.The AP distortions, instead, are the ones described in the previous section for  21 .Therefore, the observed galaxy power spectrum we consider is

The cross-correlation signal power spectrum
To predict the cross-correlation power spectrum between the 21cm signal and galaxy clustering, we use the following model (see e.g.Cunnington et al. (2022); Casas et al. (2023)) where all the quantities appearing here are defined in the previous sections.In the expression above we do not make explicit the redshift dependence of the brightness temperature, the bias, and the growth rate for the sake of notation easiness.Moreover, it can be shown that the shot noise contribution for the cross-correlation power spectrum is negligible (Castorina & Villaescusa-Navarro 2017;Villaescusa-Navarro et al. 2018) Taking into account the intensity mapping beam effect and the AP distortions, the observed cross-correlation signal becomes P21,g (, , ) = 1 with  being a cross-correlation coefficient that accounts for unknown effects that may modify the theoretical estimate of the correlation degree.4

Multipole expansion
The non-isotropic power spectrum can be decomposed using Legendre polynomials L ℓ ().The coefficients of the expansion, i.e. the multipoles of the power spectrum, are given by PX,ℓ (, ) = (2ℓ + 1) 2 with X being either the 21cm intensity mapping (X = 21), the galaxy clustering (X = g) or their cross-term (X = 21, g).In this work, we use the auto-power spectrum and cross-correlation monopoles, for which ℓ = 0 and L 0 () = 1.In particular, we focus on forecasting the cross-correlation power spectrum monopole P21,g,0 (, ).In the following, for clarity of notation, we drop the subscript 0 and simply refer to the monopoles as P21,g (, ), Pg (, ) and P21 (, ).

CONSTRUCTING THE MOCK CROSS-CORRELATION DATA
In this section, we construct mock data sets of cross-correlation measurements from future surveys.The 21cm and galaxy surveys we take into account are presented in section 3.1.Details on the construction of the synthetic data set and the analysis framework are given in section 3.2 and section 3.3.

21cm intensity mapping
The main focus of our analysis is the 21cm intensity mapping signal that can be measured with the SKAO telescope.We consider, in particular, a cosmological survey with the SKA-Mid telescope in single-dish mode, following SKA Cosmology SWG (2020).We assume a Wide Band 1 survey that covers a sky area of 20 000 deg 2 in the frequency range 0.35−1.05GHz (i.e. the redshift range 0.35−3).
The used SKAO specifications are summarized in Table 1.

Galaxy surveys
We assume a Euclid-like and a DESI-like spectroscopic galaxy survey.For Euclid, following what has been proposed in Blanchard et al. (2020), we consider observations within four different redshift bins in the range of 0.9 -1.8.The assumed values of the galaxy bias and number density computed at each effective redshift are presented in Table 1.
Mock data set for SKAO × Euclid (upper panel) and SKAO × DESI (lower panel) observations.The considered redshift bins are different for the two galaxy surveys.We refer to section 3 for the extended discussion on how the signal and the errors are computed.
To obtain a cross-correlation signal, one must take into account observations of the same portion of the sky.In agreement with other studies in the literature, we assume a 10 000 deg 2 overlapping patch of the sky observed by the SKAO and a Euclid-like survey.
Hereafter, we simply refer to the Euclid survey, where is understood that a Euclid-like survey as the one described above is intended.
To construct cross-correlation measurements between the SKAO and DESI, we follow Casas et al. (2023).We focus on the DESI Emission Line Galaxies (ELG), as they probe a redshift range similar to the one covered by Euclid, i.e. 0.7 -1.7, making easier a direct comparison between the two experiments.In Table 1, we report the assumed values of the galaxy bias and number density at each effective redshift and we consider an overlapping area between DESI ELG and SKAO of 5 000 deg 2 .The smaller area overlap with respect to a Euclid-like survey is forced by the different hemisphere locations of the two telescopes.

Mock data sets
We construct two different mock data sets for the 21cm and galaxy clustering cross-correlation power spectrum.One mimics an SKAO × Euclid analysis and the other a SKAO × DESI one, for the redshift bins and survey specifications described in section 3.1 and Table 1.
The scales that are accessible with the observations are fixed by the volume probed with the surveys in a given redshift bin.In Fourier space, the largest scale available at each effective redshift is  min () = 2/ 3 √︁  bin (), where  bin () is the volume of each redshift bin, which we compute as with  () being the comoving distance and Ω sur the survey are in steradians (see Table 1).The smallest scale is, instead, imposed by the size of the SKAO telescope beam, due to the damping effect.
It can be estimated as  max () = 2/ beam ().At scales smaller than  max , the signal is dominated by the beam providing no relevant information on cosmology.Finally, we choose a fixed k-bin width as a function of redshift Δ () in order to be large enough for modes to be independent, i.e.Δ () ∼ 2 min ().
Assuming a set of  measurements at redshift  of the crosscorrelation power spectrum P21,g () at scales { 1 , . . .,   }, we estimate the error on at each point as (see e.g.(Smith 2009;Cunnington et where P21,g is the cross-correlation power spectrum defined in Equation 10, P21 and Pg are the 21cm and the galaxy power spectrum introduced in Equation 6and Equation 8 respectively.Here,  modes is the number of modes per  and  bin, computed as At each central redshift  and data point  we compute the crosscorrelation power spectrum for SKAO × Euclid observations, labeled as PEuclid 21,g (, ), the one for SKAO × DESI, PDESI 21,g (, ), and the corresponding errors, as discussed above.In table Table 1, we gather some of the used redshift-dependent quantities.The resulting mock data sets are shown in Figure 1.
We highlight that in Equation 13 the 21cm intensity mapping instrumental noise enters the estimate of the errors through P21 .However, we find that the contribution of SKAO-like instrumental noise is minimal.This is not the case, instead, for a MeerKATlike noise, which induces a non-negligible contribution to the error estimate, as in the analysis of subsection 4.3.

Numerical analysis
In order to exploit the constraining power of the mock data set presented in section 3.2, we define a likelihood function and then set up the framework to constrain the cosmological parameters by adopting a Bayesian approach.Given a set of observations and a theory that depends on given parameters, the Bayes theorem links the posterior distribution to the likelihood function.The high-dimensional posterior can then be sampled using MCMC methods (Gilks et al. 1995, see e.g.).Following Berti et al. (2023), we build a working pipeline to conduct full MCMC analyses on 21cm and cross-correlation observations.We test this pipeline by forecasting the constraining power of the datasets described above.
Figure 2 shows the signal-to-noise ratios as a function of  in each redshift bin for both the constructed mock data sets.We observe that the signal-to-noise decreases at higher redshifts.The behaviour and orders of magnitude found here are compatible with the results for the 21cm power spectrum multipoles in Soares et al. (2021); Berti et al. (2023).
We conduct an MCMC analysis varying the six parameters describing the ΛCDM model, i.e. we vary {Ω  ℎ 2 , Ω  ℎ 2 ,   , ln(10 10   ), , 100 MC , Σ  ,  SN, } assuming wide flat priors on each of the parameters.Results on other parameters, such as  0 and  8 , are derived from results on this set.To perform the study, we develop a likelihood code integrated with the MCMC sampler CosmoMC5 (Lewis & Bridle 2002;Lewis 2013).We further expand on the code we implemented and used in Berti et al. (2022Berti et al. ( , 2023) ) including the computation of the theoretical expectations for the 21cm and galaxy clustering cross-correlation power spectrum and the relative likelihood function at different redshift.We recall that each redshift bin is considered independent, thus we consider a diagonal covariance matrix constructed with the forecasted errors.

Nuisance parameters
As in Berti et al. (2022Berti et al. ( , 2023)), along with the cosmological parameters we implement different nuisances.Indeed, the access to the matter clustering is not direct as it appears in Equation 10 in combination with the brightness temperature and the HI bias and the galaxy bias.These quantities, although the scientific community hopes to obtain external measurements (e.g. the total neutral hydrogen density as a function of redshift, key unknown for the brightness temperature, is one of the scientific goals of the MeerKAT survey Laduma), may need to be treated as unconstrained quantities in a pessimistic scenario.To take into account this lack of knowledge, we allow for combinations of these parameters to vary in the MCMC run, thus leaving free the overall amplitude of the power spectrum.The contribution from the nuisances is then marginalised out in the final analysis.To be completely agnostic, for the cross-correlation power spectrum we include in the nuisances also the correlation coefficient  and the galaxy bias.Thus, we consider the following three combinations of parameters √︁  T  HI  8 , √︁  T  g  8 , and √︁  T   8 , where we renormalized the matter power spectrum as  m / 2 8 .
Given that all the parameters are redshift-dependent quantities, the actual number of nuisances is three times the number of redshift bins.This translates into 4 × 3 nuisance parameters for SKAO × Euclid and 10 × 3 for SKAO × DESI.Especially in the latter case, the high number of parameters to vary can impact the numerical efficiency of the MCMC computations.Following what is already done in Berti et al. (2023), for SKAO × DESI only we reduce the number of nuisances by constraining their redshift evolution through a polynomial parametrization.Rewriting  () =  3 +  2 +  +  for √︁  T   8 , we implement as nuisances the coefficient of the polynomial , , , and , reducing the number of nuisance parameters from 30 to 12.
In the following, with the label "nuisances" or "nuis."we refer to the parameters described above.For each nuisance, we assume a wide flat prior in the range [−1, 1].

CMB likelihoods and data sets
In this study, we combine our mock 21cm and galaxy clustering crosscorrelation data sets with Planck 2018 (Planck Collaboration VI 2020).The CMB likelihood includes the high-ℓ TT, TE, EE lite likelihood in the interval of multipoles 30 ≤ ℓ ≤ 2508 for TT and 30 ≤ ℓ ≤ 19696 for TE, EE.Lite likelihoods are calculated with the Plik lite likelihood (Planck Collaboration V 2020).Instead for the low-ℓ TT power spectrum, we use data from the Commander component-separation algorithm in the range 2 ≤ ℓ ≤ 29.We also adopt the Planck CMB lensing likelihood and the low EE polarization power spectrum, referred to as lowE, in the range 2 ≤ ℓ ≤ 29, calculated from the likelihood code SimAll (Planck Collaboration III 2020).In the rest of the paper with the label "Planck 2018" we refer to the combination TT, TE, EE + low-ℓ + lowE + lensing.

RESULTS
We present in this section the results of our analysis.We first explore the constraining power of the mock cross-correlation data, with and without nuisances (section 4.1).We then combine the mock data sets with Planck CMB data (section 4.2).Finally, in section 4.3 we present the constraints we obtain on the cosmological parameters for the published measurement of the MeerKAT × WiggleZ crosscorrelation power spectrum presented in Cunnington et al. (2022).
As a reference, throughout this analysis, we compare results from the cross-correlation forecast with the best results obtained with the 21cm multiples in Berti et al. (2023), i.e. the fully non-linear monopole and quadrupole data set that we label as " P0 + P2 ".Note that we expand on the results of Berti et al. (2023) by introducing the AP, in order to be consistent with the modelling of the crosscorrelation used in this work.Thus, " P0 + P2 " here include AP effects.For a discussion on the impact of AP distortions, we refer to appendix A.
We stress that when combining auto-power spectrum and crosscorrelation forecasts, we neglect the covariance between the two data sets in the first approximation.
We show the marginalised 1D and 2D posteriors for the studied set of parameters.Note that 68% confidence level constraints are presented as percentages with respect to the marginalised mean value.

Probing the constraining power of future 21cm × galaxy clustering data
In Figure 3 and Figure 4 we present the forecasted posterior distributions we obtain for the SKAO × DESI and SKAO × Euclid mock data sets we construct in this work.We show only some of the model parameters for brevity.We obtain comparable results for both the SKAO × Euclid and the SKAO × DESI analysis.Looking at the 2D contours, we observe that the correlations between the cosmological parameters are similar and in line with the results obtained with the 21cm multipoles.The marked degeneracy in the  0 − Ω  ℎ 2 plane, found in previous works (Berti et al. 2022(Berti et al. , 2023)), is present also for the crosscorrelation power spectrum case.As discussed in Bardeen et al. (1986), measuring cosmological observables that strongly depend on Table 2. Marginalised percentage constraints on cosmological parameters at the 68% confidence level.We show the results obtained using different combinations of forecasted data sets.The label " P0 + P2 " stands for the forecasted 21cm power spectrum monopole and quadrupole observations (see appendix A). " PEuclid 21,g " and " PDESI 21,g " refer to the mock cross-correlation power spectrum data sets constructed above.The label "nuis."indicates that we vary the nuisance parameters along with the cosmological ones.Figure 3. Joint constraints (68% and 95% confidence regions) and marginalised posterior distributions on a subset of the cosmological parameters.The label " P0 + P2 " (dashed lines) stands for the forecasted 21cm power spectrum monopole and quadrupole observations (see appendix A). " PDESI 21,g " refers to the mock cross-correlation power spectrum data set constructed above.The label "nuis."(dashed-dotted lines) indicates that we vary the nuisance parameters along with the cosmological ones.The relative constraints are listed in Table 2.

Parameter
the matter power spectrum, as P21,g does, fixes the shape of  m .This translates into fixing the quantity Ω m ℎ, which, in turn, induces the strong correlation Ω  ℎ 2 ∝  0 .This feature is particularly relevant when combining 21cm observations with CMB data, as discussed in the next section.
As expected from the signal-to-noise estimates of Figure 2, better constraints are obtained for the SKAO × Euclid data set (Table 2).Although DESI probes the same redshift range of Euclid and even with a higher number of redshift bins, Euclid will have a larger sky area of overlap with SKAO, suggesting that a larger sky coverage increases the constraining power more than the number of redshift bins.As expected, the best constraints are the ones obtained with the Figure 4. Joint (68% and 95% confidence regions) and marginalised posterior distributions on a subset of the cosmological parameters.The label " P0 + P2 " (dashed lines) stands for the forecasted 21cm power spectrum monopole and quadrupole observations (see appendix A). " PEuclid 21,g " refers to the mock cross-correlation power spectrum data set constructed above.The label "nuis."(dashed-dotted lines) indicates that we vary the nuisance parameters along with the cosmological ones.The relative constraints are listed in Table 2. 21cm multipoles, in particular for Ω  ℎ 2 and  0 .Indeed, the P0 + P2 is constructed to sample a wider redshift ( = 0 − 3) and scales range (up to  ∼ 1 ℎ/Mpc at low redshifts).It is interesting, however, to see that, despite these differences, cross-correlation results are still able to deliver competitive constraints.This makes a strong case for cross-correlation studies, especially in light of the reduced challenges in terms of residual systematics from the 21cm intensity mapping observations.Adding the nuisance parameters, i.e. assuming no prior knowledge of the astrophysics at play, has the effect of varying the overall amplitude of the cross-correlation power spectrum.This translates into a broadening of the constraints, in particular on   .Moreover, the  Figure Joint constraints (68% and 95% confidence regions) and marginalised posterior distributions on a subset of the cosmological parameters.The label " P0 + P2 " (dashed lines) stands for the forecasted 21cm power spectrum monopole and quadrupole observations (see appendix A). " PEuclid 21,g " and " PDESI 21,g " refer to the mock cross-correlation power spectrum data sets constructed above.The label "nuis."(dashed-dotted lines) indicates that we vary the nuisance parameters along with the cosmological ones.The relative constraints are listed in Table 2.
2D contours are generally broader and show less clear correlations, except for the  0 − Ω  ℎ 2 and  0 − Ω  ℎ 2 planes.Although the shape is stretched, the  0 − Ω  ℎ 2 degeneracy is still marked.
In Figure 5 we show the results on the full set of cosmological parameters for the combination of the two cross-correlation data sets and the 21cm multipoles one.Note that we do not report the constraints on , due to the fact that the considered probes are not sensitive to this parameter.We compare the results with the ones for the 21cm multipoles and the PDESI 21,g data set as a reference.In order to explore a more realistic scenario, we include the nuisance parameters.We observe that combining SKAO × DESI and SKAO × Euclid improves the constraints obtained with the two data sets separately.Including also the 21cm multipoles lead to the best result.With observations from 21cm probes only in the pessimistic case of including the nuisances, we are able to achieve constraints on the cosmological parameters comparable with Planck CMB observations.We conclude that 21cm IM observations in cross-correlation with galaxy clustering seem to have a reduced constraining power with respect to 21cm auto-power spectrum measurements.However, when combined with the latter, they improve the constraints, showing that the cross-correlation signal carries complementary cosmological information.

Combining 21cm × galaxy clustering with CMB observations
Most recent forecast analyses find 21cm IM future observations to be a pivotal cosmological probe, highly complementary to CMB observations (SKA Cosmology SWG 2020).Indeed, in Berti et al. (2023) we found that observations of the 21cm power spectrum multipoles contribute significantly to improving the constraints and reducing the degeneracies on the cosmological parameters.In this section, we investigate the effects of combining 21cm and galaxy clustering cross-correlations with CMB measurements.
For consistency, we first run the Planck likelihood in our frame-work and reproduce constraints in agreement with the Planck 2018 results.We then study the effect of adding the PEuclid 21,g and PDESI 21,g data sets and the two combined.As in section 4.1, we compare the results we obtain with the constraints from the 21cm power spectrum multipoles.
Table 3 shows the percentage constraints for this analysis.We observe that adding PEuclid 21,g or PDESI 21,g to Planck 2018 data reduces the estimated constraints with respect to the Planck alone results.The effect is prominent for Ω  ℎ 2 and  0 , for which the error is reduced by a factor of ∼ 3, and   , with a factor of ∼ 4 decrease.As one can see from Figure 6, in the  0 − Ω  ℎ 2 plane the effect is ascribable to the correlation directions.Indeed, with Planck observations  0 and Ω  ℎ 2 are anti-correlated, while they are positively correlated with P21,g .Combining the two removes the degeneracy and reduces errors.The effect is also particularly evident for  s since the CMB probes the quantity  s exp(−2) and the matter power spectrum, which is constrained by 21cm data is sensitive to  8 , which is in turn degenerate with the optical depth to reionization as measured from the CMB.Therefore adding 21cm data effectively removes the degeneracies.
When nuisance parameters are taken into account, as expected the improvement on the constraints is softened.In particular for   , and consequently  8 , the constraining power is lost when the parameter space is open to the nuisances.Varying the nuisances corresponds to effectively changing the amplitude of the power spectrum and, thus, it results in worsened constraints on   .
The effects observed for the cross-correlation data sets combined with CMB are qualitatively comparable to the results obtained for the 21cm power spectrum multipoles.This confirms that when combining different kinds of 21cm observations with CMB data the improvement in the constraints is always driven mainly by the breaking of the degeneracy in the Ω  ℎ 2 −  0 plane.Indeed, our analysis reveals that even a less constraining measurement, such as the 21cm and galaxy cross-correlation, is effective in improving the errors on Ω  ℎ 2 and  0 if it presents a sufficiently marked correlation among these parameters.
To better prove this point, in Figure 6 we compare the effect of combining Planck data with the 21cm multipoles, PDESI 21,g and PEuclid 21,g , and the three combined.We find that even with the nuisances, results from PDESI 21,g + PEuclid 21,g (orange contours) are similar to the constraints from the multipoles, for which, instead, the nuisances are kept fixed as a best-result reference (pink contours).The main difference resides in the loss of constraining power on   and the related parameters, which is however ascribable to the inclusion of the nuisances.Further adding the 21cm multipoles to PDESI 21,g + PEuclid 21,g (green contours), 6 does not impact the constraining power or the shape of the correlations.This confirms that the 21cm probe is pivotal in breaking the CMB degeneracy in the Ω  ℎ 2 −  0 plane and the effect is relevant already at the level of cross-correlation or with SKAO precursor power spectrum measurements (see also Berti et al. (2022)).
We conclude that cross-correlations measurements of 21cm IM and galaxy clustering are a key cosmological probe complementary to CMB observations and, in combination with Planck, their forecasted constraining power is compatible with the one from 21cm power spectrum multipole measurements with the SKAO.

State-of-the-art cosmological parameters constraints from the MeerKAT × WiggleZ detection
Cosmological 21cm observations with the SKAO will be possible in the upcoming years when the Observatory will be fully operational.However, the SKA-Mid pathfinder, MeerKAT is already taking data and its first cosmological surveys are promising.Recently, a power spectrum detection with the MeerKLSS survey, the intensity mapping survey with MeerKAT, in cross-correlation with clustering data has been made at the 7.7 level (Cunnington et al. 2022).The analysis pipeline we develop in this work is constructed to be ready to use with real cross-correlation power spectrum measurements.Therefore, we decide to test our methodology on the published results available for MeerKLASS.In the following, we present the result we obtain on the cosmological parameters constraints.We refer the interested reader to appendix B for the technical consistency checks we run on the adopted power spectrum model and the predicted signal-to-noise ratio.
We tune the parameters of the likelihood function to match the settings of the observed data.Instead of the SKAO specifications, we use the MeerKLASS survey parameters, i.e. we consider a 200 dg 2 survey area and dishes of a diameter of  dish = 13.5 m.The observed effective redshift is  = 0.43 with a bin width of Δ = 0.059.The signal is observed in cross-correlation with the WiggleZ 11h galaxy survey (Drinkwater et al. 2010;Drinkwater et al. 2018).When we do not include the nuisance parameters, we use the measured galaxy bias value  g = 0.911 (Blake et al. 2011) and cross-correlation factor  = 0.9 (Khandai et al. 2011).Other parameters and theoretical predictions are left unchanged.
We present the cosmological parameters constraints resulting from our MCMC analysis in Figure 7 and Table 4.We observe that the state-of-the-art constraining power is limited with respect to the results forecasted for SKAO × Euclid and SKAO × DESI, as expected due to the wider redshift ranges, probed scales, and survey area.Single bin MeerKAT observations are not yet able to constrain the complete set of cosmological parameters.However, the degeneracies between the parameters match the ones expected from our forecasts.In particular, the  0 − Ω  ℎ 2 correlation is clearly visible, although much less prominent.From these real measurements, we can infer new information on the marginalised mean value of the cosmological parameters.We find that all the constraints are compatible with the Planck results.
The most constrained parameters are Ω  ℎ 2 and  0 , proving that 21cm observations will be most useful to constrain them and their derived parameters.When fixing the nuisances, we find a high central value for  0 , although with a large error.We believe that this is not a physical effect, but is rather coming from a mismatch between the assumed brightness temperature value in our analysis and the one that seems to describe the observed data (see appendix B for a more in-depth discussion).The conservative result is then the one in which nuisances are taken into account.In this case, we find that MeerKAT × WiggleZ data prefer a lower value of  0 , although the significance is not high enough to draw firm conclusions.
From the constraints on the nuisances, one could estimate the value of Ω HI .With our analysis we find the constraints on the nuisance parameters from real data to be too wide to infer a meaningful result.
Lastly, although we do not show here the results, we test the effect of combining MeerKAT × WiggleZ data with Planck 2018 observations.We find that the measured cross-correlation power spectrum does not increase significantly the constraining power of CMB ob-Table 3. Marginalised percentage constraints on cosmological parameters at the 68% confidence level.We show the results obtained using different combinations of forecasted data sets in combination with Planck.The label "Planck 2018" stands for TT, TE, EE + lowE + lensing, while the label " P0 + P2 " refers to the forecasted 21cm power spectrum monopole and quadrupole observations (see appendix A). " PEuclid 21,g " and " PDESI 21,g " refer to the mock cross-correlation power spectrum data set constructed above.The label "nuis."indicates that we vary the nuisance parameters along with the cosmological ones.Figure 6.Joint constraints (68% and 95% confidence regions) and marginalised posterior distributions on a subset of the cosmological parameters.The label "Planck 2018" stands for TT, TE, EE + lowE + lensing, while the label " P0 + P2 " (dashed lines) stands for the forecasted 21cm power spectrum monopole and quadrupole observations (see appendix A). " PEuclid 21,g " and " PDESI 21,g " refer to the mock cross-correlation power spectrum data sets constructed above.The label "nuis."(dashed-dotted lines) indicates that we vary the nuisance parameters along with the cosmological ones.The relative constraints are listed in Table 3.  refers to the cross-correlation power spectrum detection.The label "nuis."(dashed-dotted lines) indicates that we vary the nuisance parameters along with the cosmological ones.The relative constraints are listed in Table 4. Table 4. Marginalised percentage constraints on cosmological parameters at the 68% confidence level.We show the constraints on the cosmological parameters obtained with the MeerKAT × WiggleZ cross-correlation power spectrum detection.The label "nuis."indicates that we vary the nuisance parameters along with the cosmological ones.The symbol "-" stands for unconstrained.

Parameter
Parameter PMeerKAT×WiggleZ servations, leaving the constraints and the 2D contours mostly unchanged.
Although the constraining power of real detection is not yet competitive with other probes, the quality of the current 21cm IM observations in cross-correlation with galaxy clustering will improve sharply in the upcoming years and will soon become a useful independent cosmological probe.Moreover, the forecasted results for future surveys are very promising.

CONCLUSIONS
In this work, we forecast the constraints on the ΛCDM cosmological parameters for power spectrum measurements of 21cm intensity mapping in cross-correlation with galaxy clustering.Modelling the cross-power spectrum as in Cunnington et al. (2022), we forecast mock observations of the SKAO cross-correlated with DESI and Euclid-like surveys.We test the constraining power of such probes alone and combined with the latest Planck CMB observations.Note that our modelling does not include possible residual foreground and systematics contamination.
We follow the SKAO Red Book (SKA Cosmology SWG 2020) proposal and simulate single-dish observations with the SKA-Mid telescope in Band 1 (frequency range 0.35 − 1.05 GHz).We crosscorrelate this signal with a Euclid-like spectroscopic survey (Blanchard et al. 2020) and the DESI Emission Line Galaxies one (Raichoor et al. 2023;Casas et al. 2023) in the redshift range 0.7 -1.7.Assuming a Planck 2018 fiducial cosmology, we construct two data sets of observations within multipole redshift bins.To test the constraining power on the cosmological parameters of our mock observations, we implement a likelihood function for the cross-correlation power spectrum, fully integrated with the MCMC sampler CosmoMC.We include a discussion on the impact of our lack of knowledge on the baryonic physics involved in the computation of the 21cm power spectrum as nuisance parameters in the analysis.
We first focus on assessing the constraining power of crosscorrelation observations alone, compared to the results we obtain for the 21cm multipoles.We, then, combine the two to investigate if they carry complementary information.The results of our analysis can be summarized as follows.
We find that SKAO power spectrum measurements in crosscorrelation with galaxy clustering have a constraining power comparable to the 21cm auto-power spectrum.The SKAO × DESI and SKAO × Euclid data sets we construct are able to constrain the cosmological parameters up to the sub-percent level.They seem to be particularly effective on  0 , on which we obtain constraints between the 0.49% and the 1.96% from 21cm and galaxy clustering crosscorrelation alone.The tightest constraints are achieved when we combine 21cm power spectrum multipoles with the cross-correlation mock observations, for which we obtain a 0.33% constraint on  0 , a value that is competitive with Planck.
When combining the cross-correlation mock measurements with CMB data, we find that they are pivotal to reducing the errors on the cosmological parameters.The effect is particularly prominent for Ω  ℎ 2 and  0 , for which the errors are reduced by a factor between 2.5 -1.8 and 3.8 -2 respectively.Again, the best result is obtained by combining all the 21cm probes together.In this case, the error with respect to Planck alone results is reduced by a factor of 3.2 for Ω  ℎ 2 and 5.6 for  0 , with the nuisance parameters taken into account.
Lastly, we test our analysis pipeline on the recent data for the cross-power spectrum between MeerKAT, the SKA-Mid pathfinder, and WiggleZ galaxy clustering Cunnington et al. (2022).We find that state-of-the-art observations have limited constraining power on the complete set of cosmological parameters.However, the main features of the marginalised constraints are compatible with the forecasted results of this work.
To conclude, our analysis supports the case of 21cm and galaxy clustering cross-correlation measurements.In combination with CMB observations, cross-correlations will be able to provide competitive constraints on the cosmological parameters comparable to the ones obtained with the auto-power spectrum.The working pipeline presented in this work is found to be compatible and easily employable with real observations.The analysis we carry out can be straightforwardly adapted to forecast constraints on the neutrino mass and beyond ΛCDM models.These extensions are currently under study.
. Joint constraints (68% and 95% confidence regions) and marginalised posterior distributions on a subset of the cosmological parameters.We show results for the 21cm multipoles alone (upper panel) and combined with CMB observations (lower panel).The label "Planck 2018" (dashed lines) stands for TT, TE, EE + lowE + lensing, while the label " P0 + P2 " stands for the forecasted 21cm non-linear power spectrum monopole and quadrupole observations, with and without ("no AP") AP effects taken into account." P0 + P2 − Ω  ℎ 2 = 0.13" labels the results obtained for a mock data set with a value of Ω  ℎ 2 different from the assumed fiducial cosmology.The label "nuis."indicates that we vary the nuisance parameters along with the cosmological ones.
centred around one.When, instead, the true cosmology is not the assumed one, although still compatible with one the constraints, both 1D and 2D, are clearly shifted.Thus, one can test their assumptions by looking at the AP parameters constraints.The lower panel of Figure A1 shows the results for the same exercise, but combining 21cm observations with Planck data.
We conclude that even when the whole set of cosmological parameters is used, the AP distortions are crucial not only to take into account our lack of knowledge of the true underlying cosmology but also to increase the constraining power on the cosmological parameters in matter power spectrum dependent probes as the 21cm IM one.

APPENDIX B: TEST ON THE MOCK DATA CONSTRUCTION PROCEDURE
To test the procedure we follow to construct the mock data sets, we compare our predictions to the measured cross-correlation data published in Cunnington et al. (2022).
As in section 4.3, we adjust the parameters of our formalism to mimic MeerKAT observations in the redshift bin centred at  = 0.43 and with Δ = 0.059.We find that with our pipeline we predict fewer -bin in a wider scale range and a slightly different value for the brightness temperature, due to a small difference in the theorypredicted and measured value of Ω HI .Correcting for the brightness temperature results in cross-correlation power spectrum values more in agreement with observations (Figure B2).However, this is not enough to reproduce the observed signal-to-noise ratio.Indeed, we find that varying the brightness temperature changes the power spectrum and the errors consistently, not impacting the signal-to-noise.Instead, adjusting the -bin width to match the one in Cunnington et al. (2022) is enough to better reproduce the observed signal-tonoise ratio, as shown in Figure B1.
We conclude that, compared to the state of the art, the pipeline we adopt in this work is consistent with real observational data.We are, however, more optimistic about the accessible scales and bin widths.Differences in the predicted power spectrum amplitude, i.e. the brightness temperature, are taken into account when opening the parameter space to the nuisance parameters.

APPENDIX C: ON THE IMPACT OF FOREGROUND CONTAMINATION
The aim of the work presented in this paper is the development and testing of a pipeline for future 21cm and galaxy clustering crosscorrelation assuming the ideal scenario where all the systematics in the data have been already treated.It is however interesting to explore the effect of the key challenge of intensity mapping i.e. foreground contamination.
We follow e.g.Karagiannis et al. (2020) and mimic the effect of the foreground as a loss of observed large scales by cutting our mock dataset at  min = 0.05 ℎ/Mpc.Figure C1 shows the comparison between the result presented above and the scale-cut data set for a reference case.As expected, foreground removal worsens the constraints on the cosmological parameter.However, it is worth noticing that the correlation between Ω  ℎ 2 and  0 is left unchanged.
As a consequence, when combining the scale-cut cross-correlation mock data set with measured Planck data, we observe that the same results are achieved with or without accounting for foreground effects.We can then conclude that, although real data could be affected by more complex systematics, the loss of some of the large scale in the cross-power spectrum does not prevent the increase in constraining power obtained combined with Planck results.

APPENDIX D: COMPARISON WITH GALAXY CLUSTERING
In section 4.1, we compare results on the cosmological parameter constraints from the cross-correlation of 21cm and galaxy clustering observations with the ones obtained from 21cm auto-power spectrum measurements.For completeness, in what follows we also forecast the constraining power of galaxy clustering auto-power spectrum measurements.
The pipeline we built for the 21cm observables can be extended  to include the analysis of galaxy clustering auto-power spectrum monopole  g (, ) by implementing an additional likelihood function.For this purpose, we construct a toy data set of future DESI and Euclid-like observations keeping the same framework of observed redshifts, scales and volumes presented in section 2. We highlight that a more realistic forecast that includes the full constraining power of  g (, ) measurements goes beyond the goal of this exercise.As in section 2, our galaxy clustering data set is constructed from a theory predictions for the galaxy clustering auto-power spectrum Pg (, ) (Equation 8) and using the survey specifications for values of the bias and the shot noise (subsection 3.1).Following the formalism used for the 21cm auto-power spectrum (Berti et al. 2022) We stress that to be consistent with our settings, we consider limited sky volumes and do not investigate fully non-linear scales.We thus expect to obtain with our mock data set pessimistic constraints with respect to the full potential of stage IV surveys.
Our results for the mock Euclid and DESI-like observations are compared with results from cross-correlations in Figure D1.We find that, in our framework, galaxy clustering provides constraints qualitatively comparable with cross-correlation.The main differences are noticeable in the shape of the 2D contours: we find stronger degeneracies in the galaxy clustering case between the cosmological parameters, while with cross-correlation some of these degeneracies are partially removed.As expected, the marked Ω  ℎ 2 −  0 degeneracy is present also for galaxy clustering alone, due to the fact that it derives from a measure of the matter power spectrum shape (see Berti et al. (2023)).

Figure 2 .
Figure 2. Predicted signal-to-noise ratio as a function of  for SKAO × Euclid (upper panel) and SKAO × DESI (lower panel) mock observations.

Figure B1 .Figure B2 .
Figure B1.Signal-to-noise ratio as a function of .We compare real data observations ("Cunnington et al. (2022)"), with the signal-to-noise predicted by the formalism adopted in this work.The gray shaded area shows the 2 region for the observed signal-to-noise.

Figure C1 .
Figure C1.Joint constraints (68% and 95% confidence regions) and marginalised posterior distributions on a subset of the cosmological parameters.We compare the results presented in subsection 4.1 (green, continuous lines) with what is obtained mimicking the effect of the foregrounds by cutting the data set at  min = 0.05 ℎ/Mpc (yellow, dashed lines).The label "nuis."indicates that we vary the nuisance parameters along with the cosmological ones.

Figure D1 .
Figure D1.Joint constraints (68% and 95% confidence regions) and marginalised posterior distributions on a subset of the cosmological parameters.We compare the results presented in section 4.1 from measurements of the 21cm and galaxy clustering cross-correlation power spectrum P21,g (continuous lines) with the ones from galaxy clustering auto-power spectrum Pg (dashed lines).The label "nuis."indicates that we vary the nuisance parameters along with the cosmological ones.