Generating extremely large-volume reionisation simulations

Preparing for the first detection of the cosmic 21-cm signal from large-scale interferometer experiments requires rigorous testing of the data analysis and reduction pipelines. To validate that these pipelines do not erroneously remove or add features that can mimic the cosmic signal (e.g. from side-lobes or large-scale power leakage), we require reionisation simulations larger than the experiments primary field of view. For an experiment such as the MWA, with a field of view of $\sim25^{2}$ deg.$^{2}$, this would require a simulation of several Gpcs, which is currently infeasible. To overcome this, we developed a simplified version of the semi-numerical reionisation simulation code 21CMFAST preferencing large volumes over some physical accuracy by assuming linear theory for structure formation. With this, we constructed a 7.5 Gpc comoving volume with voxel resolution of $\sim1.17$ cMpc tailored specifically to the binned spectral resolution of the MWA. This simulation was used for validating the pipelines for the 2020 MWA 21-cm power spectrum (PS) upper limits (Trott et al.). We then use this large-volume simulation to explore: (i) whether smaller volume simulations are biased by the missing large-scale modes, (ii) non-Gaussianity in estimates of the cosmic variance, (iii) biases in the recovered 21-cm PS following foreground wedge removal and (iv) the impact of tiling smaller volume simulations to achieve extremely large volumes. In summary, we find: (i) no biases from missing large-scale power, (ii) significant contribution from non-Gaussianity in the cosmic variance as expected following Mondal et al. (iii) an over-estimate of the 21-cm PS of 10-20 per cent following wedge mode excision for our particular model and (iv) tiling smaller volume simulations under-estimates the large-scale power and also the estimated cosmic variance.


INTRODUCTION
Directly observing the first stars and primordial galaxies is rendered nigh on impossible by the inescapable neutral hydrogen fog that enshrouds the early Universe. This fog is gradually lifted in the intergalactic medium (IGM) as the neutral hydrogen is ionised by the cumulative output of ultra-violet (UV) photons from stars and galaxies. This phase transition is referred to as the Epoch of Reionisation (EoR).
Our most promising method to observe the EoR is E-mail: greigb@unimelb.edu.au through detecting the 21-cm hyperfine spin-flip transition of neutral hydrogen. This faint signal, measured relative to the Cosmic Microwave Background (CMB), is in either emission or absorption depending on both the ionisation and thermal state of the IGM (see e.g. Gnedin & Ostriker 1997;Madau et al. 1997;Shaver et al. 1999;Tozzi et al. 2000;Gnedin & Shaver 2004;Furlanetto et al. 2006;Morales & Wyithe 2010;Pritchard & Loeb 2012). Crucially, by measuring the 21-cm signal we gain a 2D picture of the spatial distribution of the neutral hydrogen in the IGM. Being a line transition, repeating this over a broad frequency range builds up full 3D time-dependent movie of IGM in the early Universe. From this, we can infer the properties of the galaxies responsible for driving the EoR.
To detect this 3D signal we require large-scale interferometer experiments, which are specifically designed to be sensitive to the spatial fluctuations. The first generation of these experiments include the Low-Frequency Array (LO-FAR;van Haarlem et al. 2013), the Murchison Wide Field Array (MWA; Tingay et al. 2013;Wayth et al. 2018), the Precision Array for Probing the Epoch of Reionisation (PA-PER; Parsons et al. 2010), the Owens Valley Radio Observatory Long Wavelength Array (OVRO-LWA; Eastwood et al. 2019) and the upgraded Giant Metrewave Radio Telescope (uGMRT; Gupta et al. 2017). Next generation experiments, offering larger collecting areas and lower noise, are currently underway or are planned such as the Hydrogen Epoch of Reionization Array (HERA; DeBoer et al. 2017), NenuFAR (New extension in Nançay Upgrading loFAR; Zarka et al. 2012) and the Square Kilometre Array (SKA; Mellema et al. 2013;Koopmans et al. 2015).
Although these interferometer experiments are yet to detect the 21-cm signal 1 , recent improvements in flagging of cleaned data and the data analysis and reduction pipelines have yielded the strongest yet upper-limits on the 21-cm power spectrum (PS) from LOFAR (Mertens et al. 2020), the MWA ) and HERA (The HERA Collaboration et al. 2022b).
Crucially, 21-cm signal detection is a precision science. The signal is extremely faint, hidden behind astrophysical foregrounds roughly five orders of magnitude brighter than the cosmic signal. Thus, when developing data analysis and reduction pipelines it is vital to test all components that could erroneously inject or remove signal. One vital piece towards this is having sufficiently large simulation volumes that extend well beyond the primary beam and into the side lobes. For example, for an experiment such as the MWA, with a field-of-view of ∼ 25 2 deg. 2 at 150 MHz (z ∼ 8.5), this corresponds to a transverse size of ∼ 4 Gpc. Simulations of this scale are currently infeasible.
In this work, we introduce a modified version of the semi-numerical simulation code 21CMFAST (Mesinger & Furlanetto 2007;Mesinger et al. 2011) specifically tailored towards generating sufficiently large reionisation volumes. To achieve this, we make a few simplifying assumptions such as adopting linear structure formation and that the IGM spin temperature is in excess of the CMB temperature (TS TCMB). Further, we follow a similar approach to Greig et al. (2011) and only simulate a restricted volume via considering a frequency dependent depth.
With these large volume simulations, redshift-space distortions, signal evolution and sky curvature effects can be correctly simulated to provide realistic 21-cm datasets to apply to data analysis pipelines. The simulations introduced in this work have been added to the software package, WODEN (Line 2022), which produce realistic end-to-end 1 A detection of excess absorption has been reported by the Experiment to Detect the Global EoR Signature (EDGES; Bowman et al. 2018a) global signal experiment, however, its cosmological origins have been disputed in the literature (see e.g. Hills et al. 2018;Draine & Miralda-Escudé 2018;Bowman et al. 2018b;Bradley et al. 2019;Singh & Subrahmanyan 2019;Singh et al. 2022) simulations for the MWA Telescope, where they provide a reference dataset for testing power spectrum amplitude and slope, as well as a comparison against which different analysis techniques can be made (e.g., if the data cleaning and treatment removes sufficient systematic power to reach the cosmological signal).
With such a large simulation volume we can also test and verify several assumptions and approximations that are typically made when being computationally limited to smaller simulation volumes. For example, the impact on the 21-cm PS when tiling simulations to achieve sufficiently large volumes or whether the 21-cm PS is impacted by large-scale modes longer than the simulation volume. Further, we statistically explore the 21-cm PS by breaking up our large volume simulation into smaller sub-volumes. We explore the impact of the non-Gaussianity in the 21-cm signal on the PS cosmic variance uncertainty following Mondal et al. (2015Mondal et al. ( , 2016 and the potential bias in the 21-cm PS when avoiding the contaminated foreground 'wedge' modes relative to the true 21-cm PS from the full simulation as highlighted by Jensen et al. (2016).
The outline of this paper is as follows. In Section 2 we detail our modified version of 21CMFAST and describe the adopted ionising source prescription. Next, in Section 3 we outline the specifically tailored large-volume simulation for the MWA before investigating various properties and assumptions related to the 21-cm PS. Finally, in Section 4 we conclude with our closing remarks. Unless stated otherwise, quoted quantities are in co-moving units and we adopt the cosmological parameters: (ΩΛ, ΩM, Ω b , n, σ8, H0) = (0.69, 0.31, 0.048, 0.97, 0.81, 68 km s −1 Mpc −1 ), consistent with recent results from the Planck mission (Planck Collaboration et al. 2020).

METHOD
In this work we use a heavily modified version of the seminumerical simulation code 21CMFAST 2 (Mesinger & Furlanetto 2007;Mesinger et al. 2011). 21CMFAST employs approximate but efficient methods to describe the astrophysics of the EoR which compare favourably to computationally expensive radiative transfer (RT) simulations on scales ( 1 cMpc) relevant to 21-cm interferometer experiments (Zahn et al. 2011). This efficiency is achieved by computing the ionisation field using the excursion-set approach (Furlanetto et al. 2004a) which compares the time-integrated number of ionising photons (following a source prescription) to the number of baryons within spherical regions of decreasing radius, R. Under this method, a simulation voxel at the co- Here, nion is the cumulative number of IGM ionising photons per baryon inside a spherical region of size, R and corresponding smoothed overdensity, δR. Explicitly, in this work we ignore the contributions from inhomogenous recombinations (e.g. Sobacchi & Mesinger 2014) and partial ionisations by X-rays to boost the overall computational efficiency.
2 In particular, the older C-only version, https://github.com/andreimesinger/21cmFAST For our ionising sources, we adopt the Park et al. (2019) astrophysical parameterisation for high-z galaxies which directly connects the star-formation rate and ionising escape fraction to the host dark matter halo mass. The advantage of such an approach is that following some simple conversions, UV luminosity functions (LFs) can be produced and compared against high-z observations. For specific details we defer the interested reader to the aforementioned work and provide a brief summary of the parameterisation below.
First, it is assumed that the typical stellar mass of a galaxy, M * , can be related to its host halo mass via, and that f * , the fraction of galactic gas in stars, can be expressed as a power-law in halo mass with index α * and normalised at a dark matter halo of mass 10 10 M through f * ,10, (3) Following this, we obtain an estimate for the starformation rate (SFR) by dividing the stellar mass by a characteristic time-scale, t * , corresponding to a fraction of the Hubble time, H −1 (z), The escape fraction of UV ionising photons, fesc, is also assumed to be related to the halo mass through a powerlaw of index, αesc, and is again normalised at 10 10 M via fesc,10, fesc = fesc,10 M h 10 10 M αesc . (5) Finally, to account for feedback and inefficient cooling processes which limit small mass haloes from hosting active, star-forming galaxies a duty-cycle, f duty , is included which describes the fraction, (1−f duty ), of dark matter haloes that cannot host star-forming galaxies, with Mturn corresponding to this suppression scale. In summary, this ionising source prescription results in six free parameters: f * ,10, α * , fesc,10, αesc, t * and Mturn. The number of ionising photons, nion inside a spherical region of size, R is then obtained via: whereρ b is the mean baryon density, dn dM h is the conditional halo mass function (HMF) and N γ/b is the number of ionising photons per stellar baryon. Here, we use the Press-Schecter HMF (Lacey & Cole 1993) normalised to the mean of the Sheth-Tormen HMF (Sheth & Tormen 1999), and N γ/b = 5000 representative of a Salpeter initial mass function (Salpeter 1955).
The cosmic 21-cm signal is computed via its brightness temperature contrast against the Cosmic Microwave Background (CMB) temperature, TCMB (e.g. Furlanetto et al. 2006): where τν 0 is the optical depth of the 21-cm line, ν0, TS is the gas spin temperature, δ nl (x, z) is the evolved (Eularian) overdensity, xH I is the ionisation fraction, H(z) is the Hubble parameter, dvr/dr is the gradient of the LOS component of the velocity and all quantities are evaluated at redshift z = ν0/ν − 1. To aid computational efficiency, in this work we ignore the effects of peculiar velocities (except in Section 3.5) while also assuming that the gas spin temperature is in excess of the CMB temperature (TS TCMB). Thus, we compute the brightness temperature contrast as, Typically, 21CMFAST first generates a high-resolution 3D realisation of the linear density before applying secondorder Lagrange perturbation theory (e.g Scoccimarro 1998) to obtain the requisite velocity and evolved density fields. It then smooths the evolved fields onto a lower resolution grid to mitigate numerical artefacts. However, to achieve extremely large-volume reionisation simulations, we bypass this step and consider only the linear density field.
Even following this, our largest achievable simulation size is limited to the maximum available memory on our computer network. This limitation is driven by the necessity of performing 3D Fast-Fourier Transforms (FFTs). To bypass this, we follow the approach of Greig et al. (2011) and take a considerable hit to computational efficiency by storing the 3D data on hard-disk and performing successive 1D and 2D FFTs rather than the expensive 3D FFTs in memory. In effect, this limits our memory requirements to N 2 rather than N 3 and takes advantage of the fact that harddisk space exceeds total memory (since it is cheaper, though considerably slower to access). Importantly, by adopting linear theory (initial conditions are generated in Fourier space) and highlighting that the excursion-set algorithm is applied in Fourier space, we can also minimise the number of FFTs between real and Fourier space. Finally, at no point do we ever generate a full 3D real-space cubic volume, rather we instead truncate the depth to a desired frequency bandwidth. That is, we consider a much smaller line-of-sight depth to our volumes than the transverse direction. By doing this, we only need to perform a limited number of 1D and 2D FFTs to achieve our requisite volume. However, at all times the statistics of the field remain correct as we always fully sample the largest scale modes.
Importantly, although we have made several assumptions to boost computational efficiency at the expense of some physical accuracy, our overall goal is to have a method to generate extremely large 21-cm simulations for testing data analysis and reduction pipelines. For these, physical accuracy is not necessarily the most important requirement, instead, we simply require that the statistics and correlations in our large volume simulations be representative of those expected of the true signal over an extremely large range of spatial scales and survey footprint. Additionally, for these extremely large volume simulations we do not include the effects of sky curvature at present. In future we will look into the impact of sky-curvature on large-volume simulations.

Tailored MWA simulation
The Trott et al. (2020) MWA upper-limits were determined from observations spanning 137-197 MHz in two slightly overlapping 30.72 MHz bands. Over these bands, the MWA frequency resolution is ∼ 80 kHz, which sets the minimum spatial resolution for our large-volume simulations and the maximal line-of-sight length required (384 channels). At 150 MHz (z ∼ 8.5), the MWA field of view is ∼ 25 2 deg. 2 , corresponding to a comoving side-length of ∼ 4 Gpc. To exceed these requirements, we generate a single simulation volume with a comoving transverse side-length of 7.5 Gpc over 6400 voxels and 600 voxels (703.1 Mpc) along the lineof-sight direction 3 . This simulation corresponds to a cell resolution of ∼ 1.17 cMpc. To our knowledge, this is the largest volume 21-cm simulation in the literature 4 . Using this simulation, we construct two 21-cm light-cones spanning the observing bands (z = 6.2 − 7.5 and z = 7.5 − 9.4) by stitching together 2D slices of the 21-cm signal from our simulation volume by linearly interpolating in cosmic time (e,g, Datta et al. 2012Datta et al. , 2014La Plante et al. 2014;Ghara et al. 2015;Mondal et al. 2018).
For this simulation, we adopt f * ,10 = 0.05, α * = 0.5, fesc,10 = 0.08, αesc = −0.5, t * = 0.5 and Mturn = 10 8.7 , consistent with the fiducial model adopted in Park et al. (2019), which was shown to match all recent observational constraints on the reionisation epoch. Additionally, in neglecting inhomogeneous recombinations we are required to set an effective mean free path for the ionising photons inside the ionised regions, R mfp . We adopt R mfp = 15 comoving Mpc motivated by the fiducial model in Greig & Mesinger (2015).
In Figure 1 we present a 2D slice of the IGM ionisation fraction, xH I, at z = 8 for which we obtain an IGM neutral fraction ofxH I ∼ 0.55. As a visual demonstration of the physical extent of these simulations, we overlay the respective fields of view for each of the MWA (red), HERA (blue), LOFAR (purple) and SKA (teal). Note that unlike the MWA, LOFAR and SKA which use phase tracking to follow a single patch of the sky, HERA is a drift-scan experiment (sky rotates over the fixed zenith pointing dishes), thus, here we show its instantaneous field of view. In Table 1 we provide the collecting area of a single station/element and the corresponding field of view for each 21-cm interferometer experiment.
3 Such a simulation required ∼ 6 days, using 32 CPUs and ∼ 10 GB of memory. The majority of this time is spent reading/writing to file, which could be considerably improved using solid state hard-drives. 4 As a verification of our method we also performed a simulation with a transverse side-length of 9.6 Gpc and 8092 voxels.

Exploring large-scale modes
Typically, simulations tailored for a 21-cm interferometer experiment will be designed to match the physical extent of the instruments primary beam. Although problematic for full tests of end-to-end pipelines, for most usage cases this assumption should be sufficient. However, by only simulating a volume equivalent to the primary beam, these simulations forgo modes on much longer scales which would otherwise be present in the observed 21-cm signal. In principle, the absence of these large-scale modes could result in an underestimate of the true large-scale power.
We can test the impact of neglecting modes beyond the physical extent of the primary beam by using our 7.5 Gpc simulation as a reference simulation to represent the true Universe. We then compare the 3D spherically averaged 21cm power spectrum (PS) obtained from our 7.5 Gpc simulation to the PS obtained from the smaller, independent realisations of the 21-cm signal tailored to match each interferometer experiment. Throughout, we define the 21-cm PS as ∆ 2 Further, when calculating the 21-cm PS we neglect all k ⊥ = 0 modes. That is, we remove all pure line-of-sight modes as these are not visible to an interferometer 5 (e.g. Datta et al. 2012).
In Figure 2 we compare the 21-cm PS obtained from our 7.5 Gpc simulation (black curve) to the mean and 68th percentile scatter obtained from 10 independent realisations of the 21-cm signal. In each panel we compare the 21-cm PS obtained from the primary field of view of the MWA (red), HERA (blue), LOFAR (purple) and SKA (teal). In all cases, we compute the 21-cm PS over a simulation volume with the same observing bandwidth of ∼ 30MHz (transverse distance is equivalent to the field of view).
Immediately from Figure 2 we see no obvious deviations at large scales. The mean 21-cm PS determined over the 10 independent realisations for each survey footprint overlaps with the 21-cm PS from the 7.5 Gpc simulation. On the largest scales for each individual instrument, we observe some scatter in the 21-cm PS owing to cosmic variance, where the binned 21-cm power has few modes (i.e. Poisson sampling error). Thus, provided the simulation volume exceeds the field of view, this should not be a concern. We can then conclude that omitting large-scales modes beyond the primary field of view does not alter the recovered statistics. This is consistent with the hybrid, large-volume (> 1 Gpc) reionisation simulations generated by Kim et al. (2016), whereby the amplitude of the large-scale 21-cm PS converges for increasing simulation volumes beyond 500 Mpc.
Note, this investigation slightly differs from previous works by Iliev et al. (2014), La Plante et al. (2014) and Kaur et al. (2020). Here, in each case the authors explore subvolumes from their largest reionisation simulation, which in the case of Iliev et al. (2014) and Kaur et al. (2020) is used to determine a minimum volume for reionisation simulations to  Table 1  achieve convergence of the large-scale radiative effects. Since this is a sub-division of a larger volume, these smaller simulations contain modes beyond the scale of the sub-volume. In our work, we generate new independent realisations of our smaller volume simulations that do not include modes beyond the scale of the simulation.

Exploring cosmic variance
The 3D spherically averaged 21-cm PS is calculated by averaging over all Fourier modes that fall within spherical shells. Naturally, the variance of this statistic is then given by the Poisson sampling error based on the total number of Fourier modes in each spherical shell, Here, ∆ 2 21 (k, z) is the dimensional 21-cm PS measured within a volume, V , with spherical shells separated by ∆k.
However, this assumes that the 21-cm PS fully encodes all the information about the 21-cm signal (i.e. that the 21cm signal is Gaussian). Instead it is well known that the 21-cm signal is highly non-Gaussian (e.g. Furlanetto et al. 2004a;Furlanetto et al. 2004b;Morales & Hewitt 2004;Cooray 2005), in which case the higher-order moments of the 21-cm signal will be non-zero. As a result of this Mondal et al. (2016) have shown that the cosmic variance uncertainty on the 21-cm PS is instead, where T (k, k) is the Trispectrum component which arises from the four-point correlation function of the brightness temperature fluctuations.
To explore this non-Gaussianity in the 21-cm PS cosmic variance, we extract a large number of sub-volumes from our large 7.5 Gpc simulation. In particular, we extract subvolumes equivalent in spatial extent to the field-of-view for each 21-cm interferometer experiment (as listed in Table 1), each with the same 30 MHz bandwidth along the line-ofsight. To create these sub-volumes, we first split the full 7.5 × 7.5 Gpc 2 sky-area into our requisite smaller volumes without overlapping regions. Then, to increase our number statistics, we generate an inset sky-area offset from the simulation edge by half the size of the respective instrument field-of-view. We then additionally split this inset area into non-overlapping sub-volumes. In doing so, we obtain a total of 5, 41, 221 and 365 different sub-volumes from our single, large volume simulation for the MWA, HERA, LOFAR and the SKA, respectively. By considering sub-volumes equivalent to each instruments field-of-view, we consider volumes ∼ 2 − 10× larger than those considered by Mondal et al. (2016) and Mondal et al. (2017). Further, by splitting up our large volume simulation we can obtain up to ∼ 7 times as many realisations, increasing our statistical sampling of the numerical cosmic variance uncertainty, albeit at the expense of notably less physics rich simulations.
In Figure 3 we show the resultant 1σ uncertainty on the 21-cm PS obtained from our distribution of sub-volumes (solid curves) for each 21-cm interferometer experiment. In contrast, we also show the expected theoretical estimate of the uncertainty from Equation 10 if we assumed Gaussian statistics (dashed curves). The dotted curve corresponds to the non-Gaussian (Trispectrum) component of the cosmicvariance as highlighted in Equation 11.
On relatively large scales (k < 0.5 Mpc −1 ), we find that the true cosmic variance uncertainty is well approximated by a Gaussian component for the theoretically expected error. However, progressing to intermediate and smaller scales, the amplitude of the non-Gaussianity increases significantly, with the Trispectrum component dominating by ≥ 10× that of the Gaussian term. This transition scale of k > 0.5 Mpc −1 for a non-Gaussian dominated cosmic variance uncertainty on the 21-cm PS is consistent with that of Mondal et al. (2016) and Mondal et al. (2017) for a similar stage of reionisation (xH I ∼ 0.55). On these scales, as the EoR progresses, the growth and percolation of the ionised regions drives the non-Gaussianity in the cosmic 21-cm signal. Importantly, we find the relative amplitude (the amplitude simply scales as ∝ V −1 ) and shape of the full cosmic variance uncertainty to be consistent across the differently sized sub-volumes. This indicates no additional sources of cosmic variance uncertainty due to the finite-size of the simulation volume considered. Note however, that while the non-Gaussian term for the 21-cm PS cosmic variance begins to dominate on scales larger than k > 0.5 Mpc −1 , this need not be too concerning for 21-cm experiments. Typically, on these scales, the 21-cm PS will be dominated by the instrument thermal noise (see figure 2 of Greig et al. 2020) making these two terms comparable. Nevertheless, care must be taken when estimating the 21-cm PS cosmic variance error.

Impact of foreground 'wedge' avoidance on cosmic variance
The cosmic 21-cm signal observed by radio interferometer experiments is measured in visibility space (uv coverage), which is the 2D Fourier transform of the brightness signal in the image (sky) plane. Measuring the power spectrum then requires gridding these visibilities. However, these visibilities are frequency dependent. Line-of-sight power in Fourier space (frequency dependent) can then leak into the transverse (frequency independent) Fourier modes. This results in the well known 'wedge' feature in cylindrical 2D Fourier space, where measured power within this regime is contaminated ( it is typically easier to simply avoid this contaminated region entirely, measuring the power spectrum in the relatively pristine region above the 'wedge'. The spatial location of this 'wedge' in 2D cylindrical space is determined by, where k and k ⊥ are the line-of-sight and transverse Fourier modes, respectively. The constant b corresponds to an additive buffer region above the 'wedge' of ∆k = 0.1 h Mpc −1 extending beyond the horizon limit while, Here, DC is the comoving distance, H0 is the Hubble constant, E(z) = Ωm(1 + z) 3 + ΩΛ and θ is the angular radius of the field of view which we conservatively take as θ = π/2 (i.e. observing down to the horizon). Previously, we numerically estimated the cosmic variance in the 21-cm PS measured over the full 30 MHz bandwidth simulation volume for each interferometer experiment. To highlight the impact of 'wedge' avoidance on estimating the 21-cm PS, in particular on the amplitude of the statistical uncertainty, we perform the same analysis as in Section 3.3 instead calculating the 21-cm PS using only Fourier modes above the 'wedge'.
In Figure 4 we compare the 1σ cosmic variance uncertainty after 'wedge' avoidance (dashed curves) relative to the 21-cm PS estimates from the full simulation volume (solid curves). As expected, in avoiding the 'wedge' region, we limit the total number of Fourier modes available in each spherical shell to measure the 21-cm PS. Thus, quite simply, our Poisson uncertainty increases owing to the smaller number of available modes driving the increasing uncertainty error. Further, as the number of available modes quickly diminishes as we move to larger spatial scales (smaller k) the cosmic variance uncertainty rapidly increases. For smaller scales, the cosmic variance uncertainty still becomes dominated by the Trispectrum component (as indicated by similar upticks in amplitude at k > 1 Mpc −1 as highlighted in Figure 3), though on slightly smaller scales than found previously. Nevertheless, the non-Gaussianity in the cosmic variance uncertainty is still important when performing 'wedge' avoidance. Importantly, the differences between the two cases are amplified by our survey geometry. In general, our simulation volumes are considerably larger along the transverse direction relative to the line-of-sight. Thus, previously, when measuring the 21-cm PS over the full volume, our number statistics for the k ⊥ modes were considerably larger than for the k modes. Now, when measuring the 21-cm PS using foreground avoidance, most of these k ⊥ modes are excluded since they fall within the 'wedge'.
In following this avoidance strategy, we also significantly limit the largest scale modes accessible to our 21-cm PS measurement. In all cases, the 21-cm PS is only measurable down to k ∼ 0.1 Mpc −1 , with no measurable modes beyond this limit. This limit is simply set by the adopted size of our buffer (b = 0.1 h Mpc −1 ) from Equation 12 which is somewhat arbitrary. In principle, it can be reduced by considering either larger bandwidths or with an improved understanding of our systematics. Note however, that the smallest accessible scale for each instrument is not exactly k ∼ 0.1 Mpc −1 , but rather slightly larger. As we additionally ignore all k ⊥ = 0 modes, from Equation 12 the minimum scale is simply b plus m times the smallest non-zero k ⊥ in our simulation volume. Thus, the minimum k for the MWA is smaller than the SKA due to the smaller k ⊥ (larger transverse scale). In Figure 5 we present the fractional increase in cosmic variance when removing the foreground contaminated wedge modes. We find that the relative increase is comparable across all scales irrespective of the total simulation volume, highlighting that the increase in the cosmic variance is primarily driven by the fraction of excised Fourier modes (i.e. increased Poisson noise in each bin due to having fewer modes free of foregrounds).

Biased estimates of the 21-cm PS following 'wedge' avoidance
Previously, we explored the impact of foreground avoidance on the resultant cosmic variance uncertainty on a 21-cm PS measurement. Next, we consider whether the 21-cm PS measured following foreground avoidance can be biased relative to the 21-cm PS measured from a full simulation volume. This is particularly pertinent given that recent upper limits on the 21-cm PS measured by LOFAR (Mertens et al. 2020), MWA ) and HERA (The HERA Collaboration et al. 2022b) have for the first time allowed Bayesian astrophysical parameter inference to be employed to understand the physics of reionisation (Ghara et al. 2020(Ghara et al. , 2021Greig et al. 2021a,b;Mondal et al. 2020;The HERA Collaboration et al. 2022a). However, when these inference pipelines are applied to these upper-limits, the simulated 21-cm PS is generated from the full simulation volume, rather than the characteristics of the observation, such as foreground avoidance. While likely irrelevant for the present upper-limits since these are not very aggressive and only disfavour relatively extreme models of reionisation. Nevertheless, as we approach a first detection, these inference pipelines will be vital to building our understanding of the galactic physics driving reionisation. Thus, it is important to explore the relative impact of any biases on the modelled 21-cm PS when not accounting for the characteristics of the observation.
Firstly, the observed 21-cm PS is asymmetric owing to redshift-space distortions from the peculiar motions of the gas along the line-of-sight. Thus unlike in the earlier sections, we must include the effects of redshift-space distortions on our estimate of the 21-cm brightness temperature signal (note, the inference pipelines above include these effects). These redshift-space distortions increase the relative power along the line-of-sight direction, which could amplify any biases when measuring the 21-cm PS following foreground avoidance. This is because modes are preferentially excised transverse to the line-of-sight, resulting in a higher amplitude signal after binning over all possible modes within spherical shells. Thus here, we compute the 21-cm brightness temperature signal for the full simulation following the multiplication of Equation 9 by the pre-factor 1 + dvr Hdr −1 and computing the anisotropic 21-cm PS. In Figure 6 we present the mean (solid curve) and the 68th and 95th percentiles (dark and light, respectively) for the ratio of the 21-cm PS with the 'wedge' removed (avoidance) relative to the 21-cm PS from the full simulation volume. Here, these are determined using the same sub-volumes as obtained previously for each individual experiment from the full 7.5 Gpc simulation. It is clearly evident that for all survey areas, the 21-cm PS measured following foreground avoidance returns a biased estimate of the true 21-cm PS obtained from the full simulation volume. On average, we find the relative amplitude of this bias to be relatively modest, at ∼ 10 − 20 per cent over all measured k-scales. On the largest scales (small k), when foreground avoidance has limited sampling of the Fourier modes (see Figure 4) this can result in increases/decreases in the 21-cm PS amplitude of larger than 40 per cent.
Although fairly modest in amplitude, it is important to note that these results are model dependent. That is, this bias corresponds specifically to our particular source parameterisation and frequency band (z ∼ 7.5 − 9.4). Therefore, it is plausible that the amplitude of this bias could increase under different astrophysical models or stages of reionisation. Hence, this could have important consequences for parameter inference studies if the statistics of the simulated models are not consistent with the characteristics of the observation, potentially biasing the inferred astrophysical parameters. Although the model dependency of the work is important to consider here, in previous sections it is less relevant. When investigating the cosmic variance uncertainty the relative amplitude of the uncertainty is dependent on the number of modes within each spherical shell when estimating the 21-cm PS, not the 21-cm PS amplitude itself.
To more clearly illustrate the relative increase in amplitude in the 21-cm PS following foreground avoidance, in Figure 7 we show in the same panel the mean ratio of the 21-cm PS with/without foreground avoidance for all interferometer experiments. This clearly demonstrates that the relative increase in the 21-cm PS amplitude following foreground avoidance is consistent at 10 − 20 per cent across all the different survey footprints.
This 10 − 20 per cent bias is broadly consistent with a similar investigation performed by Jensen et al. (2016). Here, this bias was determined for two astrophysical models over the full reionisation history. For their fiducial model, which more closely matches our model, at a similar stage of reionisation,x H I ∼ 0.5, they find a similar bias of 10 − 20 per cent, however, as a decrement rather than an amplification. Likely, these differences arise due to the specific modelling of the ionising sources, but more likely due to the simplifications present in our modelling (e.g. linear theory) relative to their N -body simulations. Importantly, though, they only consider a single realisation for this exploration. For sub-volumes equivalent in size to their single simulation (corresponding to the LOFAR field-of-view) in the lower left panel of Figure 6 we can accommodate individual models with a negative bias. Factoring in the modelling differences mentioned above, we are confident that these two works are broadly consistent.
Note, there also appears to be a slight gradient in the relative bias for the different experiments. The SKA, with the smallest survey footprint has the lowest amplitude whereas the MWA, with the largest footprint has the highest amplitude. However, this is simply due to the geometry of the survey volume. Foreground avoidance with the MWA removes on average more Fourier modes in each spherical shell below the 'wedge' than that of the SKA relative to the 21-cm PS computed from the full simulation volume. This is because the MWA has a considerably larger transverse size (higher sampling of transverse modes) which results in a higher fraction of modes cut at the same fixed k than the SKA. Equally, the uptick at the largest scales occurs earliest for the SKA compared to the other, larger area surveys. This occurs for the same reason as discussed earlier, the removal of the k ⊥ = 0 modes coupled with the k needing to satisfy Equation 12.

Impact of tiling reionisation simulations
Typically, when it is infeasible to generate a single, large volume simulation, the requisite size and volume can be achieved by stitching together many smaller simulations generated using periodic boxes (see e.g. Nasirudin et al. 2020).
Although sufficient volumes are achieved in this manner, the statistics of the simulations will only be accurate on the scales modelled by the smaller simulation. That is, it is impossible to inject information on scales beyond the largest scale of the smallest simulation. Using our 7.5 Gpc simulation, we can explore the relative impact of tiling simulations on the large-scale information in the 21-cm PS.
The tiling of periodic simulations additionally leads to repeating structures on scales characteristic of the simu-lation volume. When considering discrete objects such as galaxies in a redshift survey, these repeating structures can be minimised through rotating the smaller simulations prior to tiling. Unfortunately, since the 21-cm signal arises from the neutral gas, it is a continuous rather than discrete quantity preventing the usage of rotations. Rotations would lead to edge effects and disjoint ionised regions on the boundaries of the smaller simulations leading to strange artefacts in the statistics. Thus, for 21-cm studies, we can only consider tiled simulations.
To investigate the impact on the large-scale power due to tiling, we consider three smaller volumes and tile each to match our single 7.5 Gpc, 6400 voxel simulation volume. We generate 50 independent realisations of the smaller simulations with 200, 400 and 600 voxels per side length, corresponding to lengths of ∼ 234.4, 468.8 and 703.1 Mpc. We tile the same individual simulation to the large volume footprint, thus we obtain 50 independent large volume tiled simulations. For each, we generate the full 21-cm light-cone and only consider a line-of-sight depth of 30 MHz, consistent with large volume simulation.
In Figure 8 we compare the 21-cm PS obtained from our 7.5 Gpc simulation (black curve) to the 21-cm PS obtained from an equivalent tiled 7.5 Gpc simulation using the 200 (red), 400 (blue) and 600 (purple) voxel smaller simulations. In each, the vertical dashed line corresponds to the largest scale (fundamental frequency) sampled by the smaller simulations (k f = 2π/L small ). The shaded region corresponds to the 68th percentile scatter obtained from our 50 independent realisations. As expected, the 21-cm power drops away on scales below the fundamental frequency. This highlights that information on modes not originally simulated by the smaller volume simulations cannot be artificially created to match the full large volume simulation. Thus, a tiled simulation volume has a limited regime of validity restricted by the scale of the smaller simulations. On scales just above the fundamental frequency, we find the 21-cm power in the tiled box underestimates the power from the full simulation. In some instances this can be as much as a factor of ∼ 2 − 3. Although the physical scale of these spherically averaged Fourier modes are sampled by the original smaller volume simulation, on these scales we are still missing the contribution from Fourier modes which contain at least one component mode longer than the small volume simulation (e.g. kx, ky or kz < k f ). These individual modes contain no power, but contribute to the total number of modes within the bin, resulting in an overall reduction to the averaged power in that bin. Thus, the 21-cm power from the tiled box will underestimate the full box power on all scales until the fraction of Fourier modes with individual components larger than the smaller box is sufficiently small. Therefore, care must be taken to adequately produce sufficiently sized smaller volumes prior to tiling to minimise the loss of power on the largest scales.
In addition to underestimating the 21-cm power following the tiling of smaller periodic simulations, it is also important to note that we will underestimate the cosmic variance uncertainty in our tiled simulation volume. Since in the act of tiling we produce repeated copies of the spatial information from our smaller simulations we cannot artificially increase the variance within a sampled Fourier bin. That is, the measured sample variance is fixed to the variance (i.e. Poisson fluctuations) measured from the smaller volume simulation. However, since we have scaled up our total volume, following from Equation 11 the sample variance decreases as ∝ 1/V . Thus our estimated cosmic variance from our tiled, large-volume simulation must be scaled by the increased volume. In effect, the tiled simulation underestimates the cosmic variance error by ∝ N 2 ⊥ × N , where N ⊥ is the number of copies required to achieve the desired transverse scale and N is the number of copies required along the line-of-sight.
In Figure 9 we highlight this scaling by showing the ratio of the cosmic variance uncertainty from the full sim-ulated volume compared to that obtained from tiling the smaller volume simulations to achieve the same volume. In each, the tiled simulation cosmic variance underestimates the true cosmic variance, as indicated by the ratio sitting well above unity. The dashed horizontal lines correspond to the expected scaling (∝ N 2 ⊥ × N ) by which we underestimate the true cosmic variance in each tiled simulation volume. That is, the cosmic variance in the tiled simulation must be decreased by this amount to match the true cosmic variance from the full simulation volume. For larger k (i.e. k > 0.5 Mpc −1 ) the impact of the non-Gaussian (Trispectrum) contribution to the cosmic variance becomes increasingly important, modifying this simple scaling relation.
Importantly, here we have only considered the power extracted from the full simulation volume. However, if one also considers the instrumental response of the beam (i.e. beam multiplied by the tiled signal), then additional artefacts in the estimated power could be expected due to mode mixing and/or power leaking in from the side-lobes. We leave an investigation into these effects to future work.

CONCLUSION
In preparation for the wealth of data expected from largescale interferometer experiments such as the MWA, LOFAR, HERA and the SKA, we must rigorously test and validate the full data analysis and reduction pipelines. For experiments such as the MWA, with a field-of-view of ∼ 25 2 deg. 2 at 150 MHz (z ∼ 8.5) this requires reionisation simulations in excess of ∼ 4 Gpc. To explore the effects of side-lobes away from the primary beam and the potential for largescale power leakage, even larger volumes are required.
We introduced a modified version of the semi-numerical reionisation simulation code 21CMFAST (Mesinger & Furlanetto 2007;Mesinger et al. 2011) specifically tailored towards generating these extremely large volume simulations. To achieve this, we limit the line-of-sight depth of the simulations to specific observing bandwidths (though fully sample the large-scale modes), allowing the transverse scales to be extremely large. Further, we forego some level of accuracy by limiting structure formation to linear theory. Tiled boxes (DIM = 600) Expected scaling ( N 2 ⊥ N ) Figure 9. The ratio of the cosmic variance estimated from the full 7500 × 7500 × 500 Mpc simulation to the cosmic variance estimated from tiling smaller volume simulations to achieve comparable volumes. Again, we consider tiling a single simulation box containing 200 (red), 400 (blue) and 600 (purple) voxels per side length. The cosmic variance of the tiled simulations underestimates the true cosmic variance by an amount proportional to the total number of copies required to be tiled (∝ N 2 ⊥ × N ) as estimated by the horizontal dashed curves (see text for further details).
However, for the purposes of testing and validation, provided the simulated 21-cm signal behaves in a similar manner (i.e. correct statistics and correlated behaviour) to the expected signal, these assumptions are sufficient. Using this, we generate a 30 MHz band-width (∼ 500 Mpc) simulation with 6400 voxels over a 7.5 Gpc transverse side-length (∼ 1.17 cMpc resolution). This simulation was specifically tailored for the recent 2020 MWA upper-limits  With such a large volume simulation in hand, we then explored the statistical behaviour of the 21-cm power spectrum (PS) and assumptions made in the literature when computational limitations restrict the maximum achievable simulation volumes. We performed these using simulation volumes equivalent in size to the field-of-view for each of the 21-cm interferometer experiments, MWA, LOFAR, HERA and the SKA. We found: • finite simulation volumes do not exhibit any loss or increase in large-scale power from missing modes longer than the finite simulation volume.
• the cosmic variance uncertainty on a 21-cm PS measurement at k 0.5 Mpc −1 is dominated by the non-Gaussianity in the 21-cm signal by 10× over that expected under the typical Gaussian assumption. This is consistent with that found earlier by Mondal et al. (2016).
• the excision of contaminated foreground wedge modes significantly increases the amplitude of cosmic variance due to the reduction in the number of available Fourier modes. Further, it restricts the largest scale modes accessible.
• measuring the 21-cm PS following wedge mode excision results in a modestly biased estimate (increase of ∼ 10 − 20 per cent) of the true underlying 21-cm PS. Though, the relative amplitude will be heavily model dependent. This bias is broadly consistent with that found by Jensen et al. (2016) and should be an important consideration for reionisation astrophysical parameter inference studies.
• tiling smaller simulations to achieve extremely large volumes results in an underestimate of the 21-cm power on scales smaller than the small box fundamental frequency k f 2π/L small . Therefore, care must be taken when con-structing smaller volume simulations for tiling to ensure the relevant scales of interest are sufficiently sampled.
• tiling smaller simulations additionally results in underestimating the true cosmic variance of the desired large volume simulation. The underestimation can be compensated for by scaling the uncertainty by ∝ N 2 ⊥ × N which accounts for the number of repeated copies of the small volume simulation required to match the transverse and line-of-sight scales, respectively.