-
PDF
- Split View
-
Views
-
Cite
Cite
Julian B Muñoz, An effective model for the cosmic-dawn 21-cm signal, Monthly Notices of the Royal Astronomical Society, Volume 523, Issue 2, August 2023, Pages 2587–2607, https://doi.org/10.1093/mnras/stad1512
- Share Icon Share
ABSTRACT
The 21-cm signal holds the key to understanding the first structure formation during cosmic dawn. Theoretical progress over the last decade has focused on simulations of this signal, given the non-linear and non-local relation between initial conditions and observables (21 cm or reionization maps). Here, instead, we propose an effective and fully analytical model for the 21-cm signal during cosmic dawn. We take advantage of the exponential-like behaviour of the local star-formation rate density (SFRD) against densities at early times to analytically find its correlation functions including non-linearities. The SFRD acts as the building block to obtain the statistics of radiative fields (X-ray and Lyman α fluxes), and therefore the 21-cm signal. We implement this model as the public python package Zeus21. This code can fully predict the 21-cm global signal and power spectrum in ∼1 s, with negligible memory requirements. When comparing against state-of-the-art semi-numerical simulations from 21CMFAST we find agreement to |$\sim 10~{{\ \rm per\ cent}}$| precision in both the 21-cm global signal and power spectra, after accounting for a (previously missed) underestimation of adiabatic fluctuations in 21CMFAST. Zeus21 is modular, allowing the user to vary the astrophysical model for the first galaxies, and interfaces with the cosmological code CLASS, which enables searches for beyond standard-model cosmology in 21-cm data. This represents a step towards bringing 21-cm to the era of precision cosmology.
1 INTRODUCTION
The cosmic-dawn era, which saw the formation of the first galaxies, is quickly becoming the next frontier of cosmology. In addition to direct observations from telescopes such as Hubble and James Webb (e.g. Bouwens et al. 2021; Finkelstein et al. 2022; Robertson et al. 2023; Treu et al. 2022), different 21-cm experiments are targeting neutral hydrogen through its spin–flip transition (van Haarlem et al. 2013; Voytek et al. 2014; Beardsley et al. 2016; Dewdney et al. 2016; DeBoer et al. 2017). These observatories are expected to provide us with tomographic information on the evolution of cosmic hydrogen from the beginning of cosmic dawn until the end of reionization (z ∼ 5 − 25). As such, they have the potential to change our understanding of early universe galaxy formation and cosmology.
Extracting physical insights from the upcoming 21-cm data is, however, challenging. Mapping initial conditions (matter densities and velocities) into observable 21-cm fields is a non-linear and non-local process, one that is most often computed through simulations. These ought to account for the (non-linear) physics of structure formation as well as the (non-local) propagation of the radiative fields. Efforts in hydrodinamical simulations (e.g. Gnedin 2014; Mutch et al. 2016; Ocvirk et al. 2020; Kannan et al. 2021) have improved the modeling of the latter epoch of reionization (z ≲ 10). Nevertheless, simulating the earlier cosmic-dawn era is more expensive, given the broad dynamical range spanned between the small first galaxies and the large mean-free path of photons. Moreover, the parameter space to explore is vast, further increasing the cost of comparing a plethora of detailed simulations against data. As an alternative, analytical models for the cosmic dawn were proposed in Barkana & Loeb (2005) and Pritchard & Furlanetto (2007) based on linear perturbation theory. These were, however, eventually abandoned in favour of semi-analytical simulations such as Mesinger & Furlanetto (2007), Thomas et al. (2009), Santos et al. (2010), Visbal et al. (2012), Battaglia et al. (2013), Fialkov et al. (2013), Ghara, Choudhury & Datta (2015), and the well-known 21CMFAST (Mesinger, Furlanetto & Cen 2011; Murray et al. 2020, see however work based on the halo model Holzbauer & Furlanetto 2012; Schneider, Giri & Mirocha 2021; Schneider, Schaeffer & Giri 2023). Despite the semi-analytical nature of these codes, computing the 21-cm signal still requires a simulation box with gas cells whose properties are evolved individually. For instance, a typical 21CMFAST box still takes ∼1 h to evolve through cosmic dawn, and sets stringent memory constraints.1 This problem is compounded by cosmic-variance considerations, as current 21-cm telescopes can only observe wavenumbers near the line of sight (with cosines |μ| ∼ 1, Pober 2015), a subset of all modes that are simulated, which is thus plagued by larger sample noise. Leaving aside computational cost, a fully analytical approach to cosmic dawn can provide new insights into early universe astrophysics by laying bare the impact of each process, and is thus complementary to numerical simulations.
Here, we present a new approach to analytically compute the 21-cm signal during cosmic dawn, building upon the linear approach of Barkana & Loeb (2005) and Pritchard & Furlanetto (2007), jointly referred to as BLPF hereafter but including non-linearities as well as non-localities. Since the 21-cm signal depends on the radiative (X-ray and Lyman α) fields sourced by the first galaxies, our building block will be the local star-formation-rate density (SFRD). Historically, the main hurdle to obtain the 21-cm signal has been the complex behaviour of the SFRD against density. Our key insight is that the SFRD on a region of density δR (averaged over a radius R) scales as SFRD ∝ exp(γRδR) at high redshifts z, given the effective bias γR. This behaviour is caused by the paucity of early universe galaxies, which makes their abundance exponentially sensitive to overdensities and underdensities. Under this assumption, and Gaussian initial conditions, the SFRD is a lognormal variable, for which correlation functions are analytically known (Coles & Jones 1991). The 21-cm signal will depend upon a sum of SFRDs averaged over different radii R. As such, we can analytically compute the power spectrum of the 21-cm signal non-linearly and non-locally.
In this paper, we present our effective theory for cosmic dawn, as well as the public package Zeus21.2 where it is fully implemented. Zeus21 encodes the astrophysics described through this paper and it is built modularly in python. Moreover, Zeus21 interfaces with the cosmic Boltzmann solver CLASS3 (Blas, Lesgourgues & Tram 2011), so it allows the user to vary the underlying cosmology as easily as the astrophysics. We compare Zeus21 against 21CMFAST4, which has become the standard of semi-numerical 21-cm simulations. When using the exact same inputs, we recover the same 21-cm global signal and power spectrum to within 20 per cent (which is the goal of current semi-numerical codes Zahn et al. 2011; Majumdar et al. 2014; Ghara et al. 2018; Hutter 2018). Yet, Zeus21 takes 3 s to run in a laptop (down to z = 10, or 5 s down to z = 5), as opposed to ∼1 h for 21CMFAST, and has negligible memory requirements. Fundamentally, semi-numerical simulations like 21CMFAST compute the same non-linear correlation function of weighted SFRDs, though by numerically placing it in a grid rather than analytically. As such, it is not surprising to find remarkable agreement between both approaches.
The rest of this paper is structured as follows. In Section 2, we cover the basics of the 21-cm signal, and in Section 3 our effective model for cosmic dawn. We use this model in Sections 4 and 5 to compute the evolution of the 21-cm signal and related quantities such as the IGM neutral fraction and kinetic temperature. We compare Zeus21 against 21CMFAST in Section 6, and conclude with further discussion in Section 7. All units here will be comoving unless specified.
2 BASICS OF THE 21-CM SIGNAL
We begin with a basic introduction to the 21-cm signal during cosmic dawn and reionization. The interested reader is encouraged to visit Furlanetto, Oh & Briggs (2006b), Pritchard & Loeb (2012), or Liu & Shaw (2020) for in-depth reviews.
Observationally, the 21-cm brightness temperature T21 measures the deviation (i.e. absorption or emission) from the cosmic-microwave background (CMB) backlight. Physically, this deviation is sourced by intervening neutral hydrogen. As such, T21 at a redshift z and position x is determined by the thermal and ionization state of the intergalactic medium (IGM). In particular, given the temperature TCMB of the CMB,
where TS is the spin temperature, determined by the population ratio of the triplet and singlet states of neutral hydrogen. The 21-cm optical depth is calculated through (Barkana & Loeb 2001)
where xH i is the neutral hydrogen fraction, δ its density5, H(z) is the Hubble expansion rate, and |$\partial _r v_r$| is the line-of-sight gradient of the velocity. We have additionally defined a normalization factor
Through this paper we will work with a Planck 2018 cosmology (Planck Collaboration VI 2020) unless otherwise specified, which fixes the (reduced) Hubble parameter h, as well as the baryon and matter densities Ωb and Ωm. Furthermore, we will drop x and z dependencies unless non-obvious.
Detecting a 21-cm signal requires hydrogen to be out of equilibrium with the photon background, as otherwise TS = TCMB in equation (1), which yields T21 = 0. At early times, during the dark ages, collisions between hydrogen atoms couple the spin TS and kinetic Tk temperatures (Loeb & Zaldarriaga 2004).6 Since the gas is colder than the CMB after the two fluids thermally decouple at z ∼ 200, this coupling produces 21-cm absorption from then until z ∼ 30, when collisions become too rare to keep the kinetic and spin temperatures coupled (so TS rises to TCMB). That would be the end of the story, were it not for astrophysical sources. When the first galaxies form they generate a UV background that permeates the universe, exciting hydrogen through the Wouthuysen–Field (WF; Wouthuysen 1952; Field 1959; Hirata 2006) effect. This again couples TS and Tk and produces 21-cm absorption. Later on, X-rays from the first galaxies will heat up the hydrogen in the IGM, driving Tk (and thus TS) above the CMB temperature TCMB, and thus bringing 21-cm into emission. Eventually, reionization will drive xH i towards zero, and with it the 21-cm signal.
We will focus on the cosmic-dawn epoch, where collisional excitations can be ignored. In that case, the spin temperature can be obtained through
where xα is the dimensionless WF coupling parameter (given by the local Lyman α flux, as we will detail in Section 4), and Tc is the colour temperature, closely associated with the gas kinetic temperature Tk and numerically approximated by (Hirata 2006)
given gcol ≈ 0.4055 K. In practice, the difference between Tc and Tk during cosmic dawn is always below 5 per cent for the models we study, so one can keep Tk in mind for intuition, though we do use the correct Tc through this work.
By taking the approximation τ21 ≪ 1, and linear-order redshift-space distortions (δv, Barkana & Loeb 2006; Mao et al. 2012), we can write the 21-cm temperature as
This equation neatly separates the four different quantities that determine the 21-cm signal, namely: (i) the large-scale structure, through the local density and velocity; (ii) the WF effect, which measures the Lyman α flux emitted by the first galaxies; (iii) the gas kinetic/colour temperature, determined by the competition between adiabatic cooling and X-ray heating due to the first stars; and (iv) reionization. Through these four terms the 21-cm signal is a powerful tracer of the first stellar formation and the growing cosmic web at high redshifts.
The evolution of the 21-cm line is intimately linked to the intensity of the Lyman α (through xα) and X-ray (through Tc) backgrounds sourced by the first galaxies. Both radiative fields can travel significant distances before being absorbed, so their fluxes at a point x depend on the emission over the past light-cone. Schematically, they are given by integrals of the type (Pritchard & Furlanetto 2007)
where R is the light-cone (comoving) radius, over which the SFRD is averaged, and cα/X are coefficients that account for photon propagation, and depend on the specific astrophysics and cosmological parameters. As such, one can think of Jα/X (and thus T21) as non-local tracers of the SFRD. The SFRD is, in turn, a non-linear function of the matter density field δ. Translating from initial conditions in δ to T21 is, thus, a non-linear and non-local process, as expected.
Rather than constructing simulation boxes, here we will account for these non-linearities and non-localities with a fully analytical model. We show a diagram of our model in Fig. 1. This model uses the SFRD as a building block, which we find with a lognormal approximation to the non-linear process of structure formation (Coles & Jones 1991). The X-ray and Lyman α fluxes are then obtained as weighted sums of the SFRD averaged over different radii R. The 21-cm signal is straightforwardly computed from these fluxes, and thus accounts for nonlocalities as well as non-linearities. Note that we have shown a (simulation-like) realization of this algorithm in Fig. 1, but our effective model in Zeus21 does not need to draw from a realization. We will find the 21-cm signal, both global (|$\overline{T}_{21}$|) and fluctuations (through the power spectrum |$\Delta ^2_{21}$|) fully analytically, i.e. without simulation boxes. This will provide a sizeable computational advantage, both in terms of speed and memory usage, and a new way to understand the different processes at play during cosmic dawn. Yet, our model can reproduce the results of more complicated semi-numerical simulations. This approach relies on our effective lognormal model for the SFRD, which we now describe.

Schematic diagram of how Zeus21 computes the 21-cm signal (global and fluctuations). Starting from cosmology (over/underdensities δ, top left), Zeus21 calculates the SFRD as a function of the density δR (smoothed over different radii R, bottom left) with our lognormal model, given the calculated effective biases γR (which depend on the halo–galaxy connection). The Lyman α and X-ray fluxes Jα/X (bottom right) are computed as a weighted sum of those SFRDs (with coefficients that depend on stellar parameters like the SEDs). These fluxes finally determine the 21-cm signal (global |$\overline{T}_{21}$| and power spectrum |$\Delta ^2_{21}$|). The slices shown here are for visualization purposes only, as Zeus21 does not require simulation boxes. The δ realization is generated with the aid of powerbox (Murray 2018), the SFRD(δR) slices follow our exponential effective model, and the JX/α fluxes are sums of those SFRDs, given the weights defined in equations (28) and (42).
3 AN EFFECTIVE MODEL FOR THE SFRD
The building block that determines the 21-cm signal is the SFRD. Before delving into its density dependence, let us begin by computing its spatially averaged value (Madau et al. 1996),
where dn/dMh is the halo mass function (HMF, which depends only on cosmology), and |$\dot{M}_* (M_h,z)$| is the star formation rate (SFR) of a galaxy hosted in a halo of mass Mh, which thus also holds information about the halo–galaxy connection. The main assumption taken so far is that stars form in galaxies, which are hosted in (dark matter) haloes. As such, this formula is very generic.
To compute the SFRD, we will take a Sheth, Mo & Tormen (2001) fit for the HMF,
where σ2 is the variance of matter fluctuations on the scale of |${Mh,\,{\rm v}}=\sqrt{q{\rm ST}}{\rm \delta_{\rm crit}}/{\sigma}$| is a dimensionless variable, with δcrit = 1.686 as the usual critical barrier for collapse, and qST = 0.85 found to be a good fit in high-z simulations (Schneider 2018). Moreover, pST = 0.3 corrects the abundance of small-mass objects (Sheth et al. 2001), and |$A_{\rm ST} = 0.3222\, \sqrt{{2}/{\pi }}$| is a normalization factor.7
The SFR of high-z galaxies is highly uncertain. As such, we will make progress through flexible models that can capture different behaviours. In particular, we implement two approaches to link |$\dot{M}_*$| to Mh based on recent analyses of galaxy data from the Hubble (HST) and James Webb (JWST) Space Telescopes. Our main model, based on Sabti, Muñoz & Blas (2022, see also Mason, Trenti & Treu 2015; Furlanetto et al. 2017; Mirocha, Furlanetto & Sun 2017; Tacchella et al. 2018; Schneider et al. 2021), assumes that some fraction of the gas that is accreted by a galaxy is converted into stars, i.e.
where fb = Ωb/Ωm is the baryon fraction (which we take to be mass independent) and the mass accretion rate |$\dot{M}_h$| is found from the extended Press–Schechter formalism fitted in Neistein & van den Bosch (2006) (we also include an exponential model, where Mh(z) ∝ eαz with α = 0.5 Schneider et al. 2021 as an alternative). In both cases we assume a functional form for the efficiency
where
is a duty fraction, with Mturn the turn-over mass below which gas does not cool into stars efficiently. We will assume that galaxies can form down to the atomic-cooling threshold, so Mturn = Matom(z) (corresponding to a virial temperature Tvir = 104 K, Oh & Haiman 2002). We will study the effect of molecular-cooling haloes and their feedback (Johnson, Greif & Bromm 2007) in future work. Our star-formation efficiency f* has four free parameters: two power-law indices α* and β* for the small- and large-mass regimes, a normalization ϵ*, and the pivot mass Mpivot. For our fiducial case we will take ϵ* = 0.1, α* = −β* = 0.5, and |$M_{\rm pivot} = 3 \times 10^{11}\, {\rm M}_{\odot }$|, which broadly fits the UV luminosity functions (UVLFs) from HST in the range z = 4–10 (Sabti, Muñoz & Blas 2021, 2022). We show our functional form for |$\dot{M}_*$| in Fig. 2, where we also illustrate how for the cosmic-dawn epoch (z ≥ 10) the HMF is exponentially suppressed at high masses, so faint galaxies will dominate the emission.

Star-formation efficiency f* in our parametric model from equation (11) at z = 10. The power-law indices α* and β* control the steepness of the faint and bright ends, respectively; Mpivot sets the pivot point; and the amplitude ϵ* rescales the entire curve. Values for these parameters have been chosen to broadly reproduce the HST UVLFs. Also shown is the minimum mass Matom that a halo needs to cool gas through atomic transitions (blue vertical line), as well as a curve proportional to the HMF at z = 10 (red dashed), which shows that most haloes are in the faint end during cosmic dawn.
For ease of comparison with 21CMFAST in Section 6, we also implement their model (from Park et al. 2019), which takes
with t* = 0.5 as a dimensionless constant. Here, |$\tilde{f}_*(M_h)$| is a single power-law in mass (with a ceiling at unity). Both of these models can be made z dependent, and enhanced by adding scattering (Zahn et al. 2006; Whitler et al. 2020), which we will explore in future work.
We note that these are two example models inspired by seminumerical simulations. The effective approach we will present here is agnostic about the SFR parametrization, and can be extended to other models.
3.1 An effective lognormal model for the SFRD
The ‘effective’ nature of our model consists of approximating the dependence of the SFRD on density in such a way that arbitrary two-point functions – i.e. power spectra – can be computed analytically. Let us describe how. For notational clarity, we define the fluctuation on a quantity q to be |$\delta q(\mathbf {x}) = q(\mathbf {x})-\overline{q} = \overline{q} \delta _q (\mathbf {x})$|, given its global average |$\overline{q}$|. Through this work will use the reduced power spectrum
of different quantities q, as customary, and refer to it as power spectrum unless confusion can arise. Moreover, we will define δR to be the linear matter overdensity averaged over a radius R with a spherical top-hat window (unless otherwise specified), and δ without a subscript is the usual (unwindowed) density.
We build upon our model for the average value of the SFRD, from equation (8), to find the SFRD of a region of comoving radius R overdense by δR as (Barkana & Loeb 2005)
where the extra factor of (1 + δR) in front with respect to Barkana & Loeb (2005) accounts for the conversion from Lagrangian to Eulerian space (see e.g. Mesinger et al. 2011), and we have implictly assumed that the local density only modulates the cosmology (HMF) but not the astrophysics (|$\dot{M}_*$|), i.e. we neglect assembly bias (Wechsler et al. 2002). One can find the density-modulated HMF by using the extended PS (EPS) formalism (Press & Schechter 1974; Bond et al. 1991). Here, we follow Barkana & Loeb (2005) and take
which has been shown to match well numerical simulations (Schneider et al. 2021), and is constructed to return the correct average HMF from equation (9). The density behaviour follows (Lacey & Cole 1993)
where |$\tilde{\nu }= \tilde{\delta }_{\rm crit}/\tilde{\sigma }$|, ν0 = δcrit/σ, aEPS is a numerical constant (well fitted by aEPS = qST, Schneider et al. 2021), and where we have defined
for a region of overdensity δR and variance |$\sigma _R^2$|. We will find the value of the amplitude |$\mathcal {C}$| numerically. This EPS formalism is used in public 21-cm seminumerical codes, such as 21CMFAST, though with a slight variation as we will explain in Section 6.
We show the behaviour of the SFRD as a function of density in a relevant cosmic-dawn scenario (smoothed with a Gaussian kernel of RG = 3 Mpc at z = 15 and 10, chosen for illustration purposes) in Fig. 3. The strong dependence of this quantity on the overdensity owes to the extreme rarity of the first galaxies during early structure formation. They reside in haloes with exponentially small abundances, which are therefore exponentially sensitive to the local matter density [through the threshold in equation (18)]. This poses an obstacle to the usual perturbative methods.8

Behaviour of the SFRD against density δR (smoothed with a Gaussian kernel of radius RG = 3 Mpc) at z = 15 (thick coloured lines) and z = 10 (thin grey lines). Blue dotted–dashed line represents the full result, compared to the red lognormal approximation we follow in this work. The right-hand side shows the histograms of both SFRDs from a simulation box (as detailed below in Fig. 4). Likewise, the top shows histograms for the density (red line), and the logarithm of the SFRD (divided by SFRD(0) and the exponent γR) at z = 15, which also are in remarkable agreement.
There is, however, a better alternative. Encouraged by the SFRD shown in Fig. 3, we will take a simplifying approximation, which will allow us to make progress analytically. We will posit that
for δR ≪ δcrit, which occupies the entirety of the cosmic-dawn epoch. For notational convenience we have defined
which is renormalized to recover |$\left\langle \exp (\gamma _R \tilde{\delta }_R) \right\rangle = 1$| for any value of γR (Xavier, Abdalla & Joachimi 2016, see also our Appendix A for details). This approximation works very well.9 in Fig. 3 for RG = 3 Mpc, and becomes increasingly accurate for larger R, as there we will have δR ∼ σR ≪ δcrit. At very small δR (or large R) it recovers the usual linear bias, as originally used in BLPF. As such, our lognormal effective model allows us to analytically compute the SFRD fluctuations accurately including non-linearities.
To sharpen our intuition, and to test that the fit in Fig. 3 is a representative test case for cosmic dawn, we show in Fig. 4 a simulation slice of the SFRD at z = 15 given a linear density field, along with the result from our lognormal approximation. The two slices are indistinguishable by eye. Both the full SFRD and our approximation from equation (19) predict that the densest regions will dominate the SFRD. We also plot the ratio of the two predictions, which even at the small scales shown (RG = 3 Mpc) deviates by less than 10 per cent. Note that the ratio skews smaller than one, as we have not normalized those SFRD boxes. As we will see in Section 6 when comparing with 21CMFAST, these differences shrink to the per cent level for R ≳ 10 Mpc, which are the scales most readily observable by 21-cm interferometers.

Slice of different quantities at z = 15 across a simulated box 300 Mpc in side. Top left is the density δR, averaged with a 3 Mpc Gaussian kernel. The bottom shows both our approximation to the SFRD (left, a lognormal), and the full SFRD (right, from equation 15), which are remarkably similar. The top right showcases their ratio, which deviates from unity by |$\mathcal {O}(10~{{\ \rm per\ cent}})$| only at extreme δR values, in line with the expectation from Fig. 3.
This exponential approximation will allow us to analytically calculate correlation functions of the SFRD given those of δR, which are a well-known cosmological output. As such, all the non-linearities are encoded into the effective biases γR. In practice, Zeus21 numerically calculates the SFRD(z|δR) at each z and R using the formalism outlined in this Section, and simply fit for the γR parameters on the fly. These γR coefficients form the base parameters of our effective model of the cosmic dawn. We show a prediction for γR as a function of R in Fig. 5, both at z = 10 and 20. We plot xR = γR × σR to include the relevant scale of the δR term that multiplies the effective bias γR. Small values of this product indicate linear-like behaviour (as |$e^{x_R}\sim 1 + x_R$| for xR ≪ 1). This will only be the case for large R, whereas the lower R will have non-linear corrections, also estimated in Fig. 5. These corrections reach |$\mathcal {O}(1)$| for R ∼ 10 Mpc. As we will explore in Section 4, this will give rise to a factor of 2 larger power spectra of relevant quantities at scales |$k=0.1-1\, {\rm Mpc}^{-1}$|.

A prediction of our effective bias parameter γR as a function of the smoothing scale R. We multiply it by σR as a proxy for the δR factor that it always accompanies. We show results at z = 10 (solid) and 20 (dashed), though they are largely overlapping for our chosen parameters. We also show as a red dashed–dotted curve an estimate of the correction due to non-linearities, defined as |$(e^{x_R^2}-1)/x_R^2$| for xR = γRσR. We thus expect our lognormal model to predict |$\sim 10-100~{{\ \rm per\ cent}}$| larger power than the linear prediction for R ≲ 20 Mpc, which will increase the k ≳ 0.1Mpc−1 power spectrum of different quantities as we will show in Section 4.
While here the effective biases γR are calculated under a specific model for the SFR (equation 10) and assuming the HMF is modulated by the EPS formalism (equation 16), these biases could be used as free parameters and fit to high-z clustering data, simulations, or directly to the 21-cm signal, bypassing the SFR modelling.
3.2 Correlation functions
The benefit of a lognormal approximation for the SFRD is that we can analytically find its correlation function. In real space, the 2-pt function of two lognormal variables is (Coles & Jones 1991; Xavier et al. 2016)
where |$\xi ^{R_1,R_2}$|is the real-space correlation function of the density field smoothed over radii R1 and R2, i.e.
where FT denotes Fourier transform, Pm is the matter power spectrum, and WR(k) is a window function of radius R. The EPS formalism is usually calibrated with a (3D) top-hat:
For computational convenience we will assume linear growth in the correlation functions, so |$\xi ^{R_1,R_2}$| always refers to the z = 0 result (and the biases γR will be multiplied by the corresponding growth factor).
In order to build intuition, we show the diagonal elements (i.e. R1 = R2 = R for four values of R) of the |$\xi ^{R_1,R_2}$| matrix in Fig. 6 as a function of separation r, all linearly extrapolated to z = 0 using CLASS and mcfit10. Larger smothing scales R suppress the peak of the correlation function at low separations r. For reference, |$\xi ^{R,R}(r=0) \equiv \sigma _R^2$|, showing that scales with R ≲ 10 Mpc are expected to be non-linear at z = 0 (though less so during cosmic dawn owing to the smaller growth factor). In practice, we expect our exponential model for the SFRD to not hold for δR ∼ δcrit, so we will not consider correlation functions with σR ≳ 1, or R ≲ 2 Mpc.

Correlation function of (matter) density fluctuations as a function of separation r, linearly extrapolated to z = 0. In black dashed line we show the standard (unwindowed) result and the three colours represent different smoothing scales relevant to cosmic-dawn observations (R = 5, 20, 50 Mpc), also shown as vertical lines. The correlation function becomes negative for large r, beyond the sound horizon, shown as dotted points.
While EPS is calibrated with a 3D window function, in the model of BLPF the fluxes (Jα/X) are obtained with a 1D window function, |$W_R^{(\rm 1D)}(k) = \sin (k R)/(k R)$| (see also Dalal, Pen & Seljak 2010). We will follow their model and compute the linear part of the power spectra with a 1D window, and simply add the non-linear corrections on top (computed with non-linear EPS, and thus a 3D top-hat). This allows us to keep the success of the linear BLPF model, to which we add the non-linear corrections calibrated from the EPS formalism. The semi-numeric model of 21CMFAST assumes a 3D window throughout, which we have also implemented in Zeus21, as we will show in Section 6.
4 THE IGM DURING COSMIC DAWN
So far we have discussed the SFRD and our effective model for it. Physically, the 21-cm signal is sensitive to the thermal and excitation state of the IGM, and thus to the radiative fields. The X-ray and Lyman α fluxes at a point x are built from integrating the SFRD over R, the past light-cone, as outlined schematically in equation (7). What we have to determine are the coefficients cα/X that act as weights of the contribution from each R. While the SFRD depends on cosmology (through the HMF) and the halo–galaxy connection (through |$\dot{M}_*$|), these coefficients will also depend on the stellar properties of the first sources of light. In particular, how much each radius contributes to the Lyman α and X-ray flux is determined by photon propagation, and thus by the spectral energy distribution (SED) of the first galaxies. We now compute these weights, and show in more technical detail how we find the 21-cm signal. In both cases, we will start calculating the average (or global) fluxes before finding their fluctuations.
4.1 WF coupling
We begin with the process that likely started first: the WF coupling of the gas and spin temperatures. This coupling depends on the local flux of Lyman α photons, which we can compute as Barkana & Loeb (2005)
given the SFRD, where |$\nu ^{\prime } = \nu \, [1+z^{\prime }(R)]/(1+z)$| is the redshifted frequency.11, and compared to previous literature we have phrased the integral in terms of the comoving radius R over which photons contribute, rather than their corresponding redshift12z′(R). The reason will become apparent when we compute fluctuations below. Comparing equations (24) and (7) we see that the weights cα(R) that determine the contribution of the SFRD at each R depend on how far the photons, and thus on the SED |$\epsilon _\alpha ^{\rm tot}$| of the first galaxies (in the Lyman α to continuum regime, as lower energy photons cannot excite hydrogen, and higher energy photons get absorbed locally through ionizations).
We have defined the ‘total’ SED as a sum over possible transitions that eventually redshift into the Lyman α line,
where frec(n) are the recycled fractions from Pritchard & Furlanetto (2006), and wα(n) are weights equal to unity for z < zmax(n), and zero above13, with [1 + zmax(n)]/(1 + z) = [1 − (1 + n)−2]/(1 − n−2). As an intrisic spectrum we simply take two power laws as
for frequencies between Lyman α and the Lyman limit, with a break at Lyman β and power-law indices αi = {0.14, −8.0} below and above that break. We set the amplitudes |$\mathcal {A}_i$| so that 68 per cent of the flux is between Lyman α and β, and 32 per cent above it, and take a total number Nα = 9690 (Barkana & Loeb 2005) of photons emitted in this band per star-forming baryon (as μb is the mean baryonic mass). This can be easily modified by reading more complicated stellar spectra both for PopII and III stars (e.g. Bromm & Larson 2004), which we defer to future work.
As advanced in Section 2, we will compute the spin temperature TS in the IGM in terms of the (dimensionless) coupling coefficient (Furlanetto et al. 2006b)
with |$1/J_\alpha ^c = 1.811 \times 10^{11}/(1+z)\, (2.725\, {\rm K}/T_{\rm CMB}^{(0)})$| cm2 s Hz sr is a numerical constant (for which we have set the CMB temperature today |$T_{\rm CMB}^{(0)}$| to the Planck 2018 value). We find the correction factor Sα iteratively as detailed in Hirata (2006). This factor, which reduces the expected WF coupling by |$\sim 10-20~{{\ \rm per\ cent}}$| during cosmic dawn, requires finding the kinetic temperature Tk and free-electron fraction xe as well, which we will detail in the next subsection. Note that for now we only use the average correction for Sα in Zeus21, though in principle this quantity can be perturbed in our formalism too.
We show the evolution of xα in Fig. 7, along with our average SFRD (rescaled). Clearly these two variables trace each other, so in principle a clean measurement of xα can be invaluable to find the SFRD at high redshifts (cf. Madau et al. 1996). The 21-cm line allows us to infer xα, as the gas kinetic and spin temperatures will be coupled when |$J_\alpha \sim J_\alpha ^c$|, i.e. when xα ∼ 1. This requires modelling the rest of ingredients in the 21-cm signal, which we will do in turn.

Evolution of the average SFRD (dashed, multiplied by 103) and Lyman α coupling parameter xα (solid, defined in equation 27) as a function of z. In order to have 21-cm absorption due to efficient WF coupling one needs xα ≳ 0.1, which for our chosen parameters occurs for z ≲ 20.
4.1.1 Fluctuations
The integral form of equation (24) immediately makes it clear how to connect the fluctuations of the SFRD, which we computed above, to those of xα in the IGM. By converting the integral over R into a sum we can write
where
is an R-independent coefficient, and
includes all the R-dependent factors in equation (24), as well as the step14 ΔR. Finally, γR are the exponents of the SFRD against δR that we found in Section 3.
With this simple formula we can compute the (auto)correlation function of xα as
by taking advantage of our lognormal building blocks. This expression may seem daunting to evaluate, given the double sum. However, all the coefficients have been stored when computing the global evolution. As such, for standard precision in Zeus21 it takes ≈1 s to do all the correlation-function sums down to z = 5 (compared to ≈3 s for running the entire 21-cm and SFRD evolution, or ≈5 s for running the CLASS cosmology to the necessary high |$k\approx 500\, {\rm Mpc}^{-1}$|.).
The coefficients c2,α capture the nonlocality of the Lyman α flux. To illustrate their behaviour, we show the logarithmic derivative of Jα with respect to radius R in Fig. 8 (which is proportional to c2,α), at z = 15. One can think of this quantity as the Fourier transform of a window function, as each point carries the weight of modes at that radius R. We see that the weight is spread broadly, growing towards larger radii until R ∼ 350 Mpc, where it drops. This corresponds to the comoving distance from z = 15 to zmax = 17.9, over which photons from Lyman β can redshift into Lyman α. We also show in Fig. 8 the same quantity for X-rays, which we will describe in the next section.

Logarithmic derivative of the integrand in the Lyman α flux (Y = Jα in black dashed, equation (24)) and X-ray heating term (Y = ΓX in purple, obtained integrating the X-ray flux JX over energies as in equation (37)) versus comoving radius R. The Lyman α flux receives its largest contributions from large radii R, up to the maximum distance for photons to redshift from Lyman β into Lyman α. The X-rays tend to come from closer distances, though they have a tail at large R. Both curves depend sensitively on our choice of UV and X-ray SEDs. The kinks in the Jα curve appear whenever an atomic transition crosses one of the radii we sum over (and could be eliminated with interpolation as in 21CMFAST).
We now Fourier transform the ξα correlation function to obtain the power spectrum of xα fluctuations, which we show for our fiducial parameters in Fig. 9. The fluctuations are rather large, reaching |$\Delta _{\alpha }^2\sim 10^{-2} \, \overline{x}_\alpha$| at small scales for z = 12. We compare our result with a linear calculation as originally proposed in BLPF, though adapted to our astrophysical model (which has a non-constant f*(Mh)). For low wavenumbers k ≲ 0.1 Mpc−1 the linear and full calculations agree fairly well. However, they deviate by |$\mathcal {O}(1)$| in the k = 0.1–1 Mpc−1 range, where the non-linear calculation predicts significantly more power. This is precisely the range of scales that is relevant for 21-cm observations, highlighting the need for a non-linear approach for cosmic dawn.

Power spectrum of the WF coupling parameter xα (as a tracer of the Lyman α flux Jα). Solid lines show our calculation with Zeus21, which includes non-linearities; versus the linear estimate in dashed, which severely underpredicts the power at k ≳ 0.1 Mpc−1, the scales most relevant for 21-cm experiments.
One may wonder why the non-linear calculation provides additional power (rather than decrease it) in Fig. 9. While we will compare against 21CMFAST simulations in Section 6, which show the same trend (see also Santos et al. 2008), let us also provide an intuitive argument here. The SFRD grows faster than linear with δ (e.g. Fig. 3). As such, its correlation function will receive beyond-linear corrections, which tend to grow towards high k where the δ fluctuations are larger. Another way to see this is through the SFRD slices in Fig. 4. These show significant small-scale fluctuations in the SFRD, which are less pronounced in the densities themselves, corresponding to more high-k power in the former than the latter. Our lognormal model successfully predicts this behaviour.
Finally, we note that the impact of non-linearities depends on the stellar parameters (through the ci coefficients) and cosmology + halo–galaxy connection (through the exponents γR). Yet, the SFRD will remain as the building block, and our lognormal approximation will allow us to calculate its power spectrum quickly and accurately.
4.2 X-rays and the temperature of the IGM
We now turn our attention to the second component of the 21-cm signal during cosmic dawn: the gas kinetic temperature Tk.
During cosmic dawn Tk is determined by the competition between adiabatic cooling (due to cosmic expansion), and heating. The latter is due to both coupling to the CMB (which is however fairly inefficient after z ∼ 150) and X-rays from the first galaxies. We evolve the temperature through
where ΓC is the standard Compton heating rate to the CMB (Ma & Bertschinger 1995), and ΓX is the X-ray term, which will be given by the X-ray flux JX integrated over frequencies as we will detail below. We can divide the temperature evolution in two parts (see a similar discussion in Schneider et al. 2021)
where the ‘cosmology’ term Tcosmo takes into account the Compton coupling
and is obtained self-consistently using CLASS, whereas the X-ray term follows
where we have taken the approximation that ρ ∝ (1 + z)3 regardless of δ (which we will, however, lift when accounting for adiabatic fluctuations below). Given the simple z dependence of equation (35) we can rewrite
The next step is, then, to define ΓX.
4.2.1 The heating term and X-ray flux
The X-ray heating rate at a point x (and redshift z′) is given by an integral over all frequencies of the X-ray flux JX at that point (Pritchard & Furlanetto 2007),
weighted by the energy injected per photon and the ionization cross-sections σi(ν) for i = H i and He i (as we ignore the small correction from He ii), which we take from Verner et al. (1996). Here fi are the (number) fractions of both species, and |$\nu _{\rm ion}^{(i)}$| their ionization energies/frequencies. We approximate the deposition efficiency by |$f_{\rm heat} = \overline{x}_e^{0.225}$| (Schneider et al. 2021), which forces us to compute the average free-electron fraction |$\overline{x}_e$|. For this we simply take the ansatz
where |$\overline{x}_e^{\rm cosmo}$| is the result from CLASS (using Hyrec, Ali-Haimoud & Hirata 2011), and the X-ray term we estimate by |$\overline{x}_e^X = \int dz \Gamma _{X,\rm ion}$| (Mirocha 2014), for |$\Gamma _{X,\rm ion} = (3/2) f_{\rm ion} \Gamma _X/\left\langle \nu _{\rm ion} \right\rangle$|, with νion the averaged ionization frequency of H i and He i, and where we take the approximation |$f_{\rm ion}=0.4 \exp (-\overline{x}_e/0.2)$| for the fraction of energy that goes into ionization, from Furlanetto & Stoever (2010). This is a simplified treatment, but it will suffice for this first work.
The X-ray flux is given by
similarly to equation (24) for Jα, but with an opacity term determined by the optical depth at each energy ν (Pritchard & Furlanetto 2007)
where as before ν′ is the redshifted energy, and we will take the average nb here for computational simplicity (as done for instance in 21CMFAST). Moreover, we will keep track of the τX numerically computed, though Zeus21 has a flag to make the opacity |$e^{-\tau _X}$| either 0 or 1 as done in 21CMFAST, for ease of comparison to their results in Section 6.
The heating rate in equation (37) depends sensitively on the X-ray SED ϵX(ν), as lower energy photons travel shorter distances R and are more likely to heat up the IGM. We will assume an SED
where IX(ν) is the X-ray spectrum, normalized to integrate to unity over the band considered, which we take to cover from ν0 = 0.5 keV up to νmax = 2 keV, as for energies above the mean free path is too large so they barely contribute to heating (Greig & Mesinger 2015). This allows us to define the luminosity L40 (in units of 1040 erg/s/SFR) as a free parameter. In practice we will use a power-law SED, |$I_X(\nu) \propto \nu ^\alpha _X$| with αX = −1, for ν > ν0, which reasonably fits the spectra of high-mass X-ray binaries (Fragos et al. 2013). Both the power-law parameter αX and the cut-off frequency ν0 are free parameters in Zeus21, and the spectrum IX can be enhanced for arbitrary SEDs.
Given all this, we can calculate the X-ray heating rate ΓX. We show the contributions to ΓX coming from different radii R in Fig. 8, as we did for the Lyman α flux. The X-ray term has broader support over R, and it peaks at slightly lower R than its Lyman α counterpart (R ∼ 30 Mpc, rather than ∼300 Mpc). As such, the X-ray power will remain unsuppressed until smaller scales (larger k). This is, however, very dependent on the X-ray SED, which determines how locally the IGM heating proceeds during cosmic dawn (Pritchard & Furlanetto 2007; Fialkov, Barkana & Visbal 2014; Pacucci et al. 2014). We take a closer look at the effect of the X-ray SED in Zeus21, and compare it against that of Lyman α photons in Appendix C.
In this first work we will neglect several small corrections in the name of simplicity. We will ignore fluctuations on the free-electron fraction xe, which could affect the distribution of energy deposited. We note, though, that fheat is a shallow function of xe, and even by the end of our simulations (at z = 10 where heating is saturated) the free-electron fraction is still fairly small, xe ≈ 10−3. We also ignore the excitations from X-rays, which would produce a small amount of WF coupling, as well as energy transfer with Lyman α photons (Venumadhav et al. 2018) which would heat the gas modestly. These can be straightforwardly introduced into Zeus21.
With all these caveats, we can finally compute the evolution of the gas temperature, and we show its spatial average in Fig. 10. For our fiducial parameters the X-ray heating begins in earnest at z ∼ 15, and fully saturates (|$\overline{T}_k\gg T_{\rm CMB}$|) by the end of our calculation at z ∼ 10). We also show the resulting spin temperature |$\overline{T}_S$| in Fig. 10, which accounts for the kinetic temperature as well as the WF coupling. It is this quantity that determines the 21-cm temperature: if |$\overline{T}_S \lt T_{\rm CMB}$| we will have absorption (which will occur at z ∼ 10–20 for our parameters), and for |$\overline{T}_S \gt T_{\rm CMB}$| emission. Note that we are ignoring reionization for now, so we stop our calculations at z ∼ 10, since below that z the 21-cm signal will be saturated.

Evolution of the different relevant temperatures during cosmic dawn, all spatially averaged. The kinetic temperature |$\overline{T}_k$| originally follows the adiabatic prediction at high z (which includes Compton coupling to photons TCMB, through CLASS), and Zeus21 computes the X-ray heating |$\overline{T}_X$| component, see equation (32). The spin temperature |$\overline{T}_S$| additionally depends on the WF coupling xα, turning on when the first galaxies form at z ∼ 20. There will be 21-cm signal whenever TS ≠ TCMB.
4.2.2 Fluctuations
Let us now compute the fluctuations on the gas kinetic temperature.
We begin with the X-rays, for which we will mirror the formalism that we followed for xα. We re-write the X-ray temperature (i.e. the heating due to X-rays) as
where the ci, X coefficients are determined from the equations above, and γR are the same SFRD effective biases as we had in equation (28). This equation has, however, an additional nonlocality in time than its xα counterpart. The kinetic temperature of a gas parcel at x, z depends on the heating rate ΓX at all previous times, so equation (42) is integerated over z′ (as a consequence c1, X will include a Δz′ step, much like the ΔR in c2, α/X).
The building block for the X-ray-heating term is still the SFRD, and thus we can compute its correlation functions in terms of the same γR that we calculated in Section 3 to be
As was the case for xα, these expressions appear computationally expensive, even more so given the z′ integrals involved (due to the non-locality in time of X-ray heating). Nevertheless, only one z′ sum has to be carried over, as we cumulatively track ξX(z) from high to low z, which highly speeds up the calculation.
We show the power spectrum of X-ray fluctuations at z = 12 in Fig. 11. Notice that it is divided by the average kinetic temperature |$\overline{T}_k$|, rather than the X-ray only component |$\overline{T}_X$|. As in Fig. 9, the non-linearities in the SFRD change the power spectrum significantly for |$k\gtrsim 0.1\, {\rm Mpc}^{-1}$|. Fig. 11 also shows the effect of anisotropic adiabatic cooling, which is important at early times. Let us now describe it.

Same as Fig. 9 but for the kinetic temperature Tk. We only show the result at z = 12, both with full non-linear (black solid) and linear (blue dashed) fluctuations, including X-rays and adiabatic. We also separate the X-ray term (purple dot-dashed, nonlinear) and its cross term with density (the TX − δ adiabatic part, teal dotted) to showcase each contribution.
4.2.3 Adiabatic fluctuations
The adiabatic cooling of the IGM determines its kinetic temperature prior to the X-ray epoch. The adiabatic cooling rate depends on the local matter density δ, and as such Tk has fluctuations sourced by it. Looking at equation (34), one can see that if we ignored the coupling to the CMB (i.e. if we set ΓC = 0), the adiabatic temperature would follow |$T_k \propto (1+\delta)^{c_T}$|, with cT = 2/3 the adiabatic index. This well-known result is, however, complicated by the non-trivial thermal evolution of gas. Electrons keep scattering off of the photon bath after recombination, so their thermal evolution (and thus that of the IGM) retains some memory of the photon temperature down to the cosmic-dawn epoch. Rather than assume a value of cT, we follow Muñoz, Ali-Haïmoud & Kamionkowski (2015) and calculate the adiabatic index by solving for the evolution of Tk with z including the Compton coupling term ΓC. This calculation will turn out to uncover a missing element in the standard setup of 21CMFAST, so we now take a brief detour to describe it.
Assuming a global kinetic temperature |$\overline{T}_k$|, we can find its adiabatic-cooling fluctuations to linear order in δ by expanding15 equation (32),
mirroring equation (36), where as before primes denote derivatives with respect to redshift. We define the adiabatic index through the linearized relation
In that case, we can find the index to be
assuming that D(z) is the (scale-independent) growth factor. One can see that by setting Tk = T0(1 + z)2, and D(z) = D0/(1 + z) at all z we would recover cT = 2/3. Even in the most standard cosmology, |$\overline{T}_k$| does not follow that exact scaling, which changes the value of cT. We use the background thermal evolution from CLASS/HyREC (which accurately includes the electron scattering in ΓC after recombination), to find the adiabatic index cT, which we show in Fig. 12 as a function of redshift. At early times the gas and CMB temperatures are coupled, which erases adiabatic fluctuations and drives cT towards zero. However, as we approach cosmic dawn the index tends to its value of 2/3, though never reaching it. For reference, in Muñoz et al. (2015) we suggested the approximation
with c(0) = 0.58 and c(1) = 6 × 10−3, which is accurate to better than 3 per cent for our standard cosmology (in the absence of heating) in the range z = 6 − 50, as we also show in Fig. 12. At lower z these curves diverge as the gas is heated by X-rays, which drives |$\overline{T}_k$| up but does not generate further adiabatic fluctuations, lowering cT.

Evolution of the adiabatic index cT, as defined in equation (46), through cosmic time. Our prediction in black accounts for the Compton coupling of the IGM to the CMB, which lowers cT from its idealized value of 2/3. We also show the fit from Muñoz et al. (2015) in teal dot-dashed (reproduced here in equation 48), which ignores X-ray heating; and the default 21CMFAST assumption in red dashed, which fixes Tk to be homogeneous at z = 35 and underpredicts adiabatic fluctuations during cosmic dawn.
We note that it is critical to integrate up to very early times (as high as z ∼ 200) in equation (47) to find cT, as the integral retains memory of the high-z temperature. Cutting off the high-z part of the integral is equivalent to neglecting fluctuations (i.e. setting |$T_k=\overline{T}_k$|). This is an issue for current 21CMFAST simulations, for which Tk is assumed to be homogeneous at their initial z = 35. This assumption leads to an underprediction of the adiabatic fluctuations by a |$\mathcal {O}(1)$| factor even at z = 15, as illustrated in Fig. 12 by setting the z > 35 part of the integral to zero. This will affect the predictions of the 21-cm power spectrum by a similar amount, as we will explore in Section 6.
In order to evaluate the impact of adiabatic fluctuations during cosmic dawn we show the Tk power spectrum (again divided by its average) as a function of redshift in Fig. 13. We have separated the X-ray and adiabatic terms, and also show the total, which includes their cross term. Before X-ray heating is in full swing (z ≳ 15 for our parameters) adiabatic fluctuations reign, giving rise to a sizable power spectrum (|$\Delta ^2_{T_k}/T_k^2\sim 10^{-3}$|, or ∼ few% rms fluctuations). After X-rays turn on, the temperature largely will follow the SFRD; adiabatic fluctuations will slowly fade away as the total power grows. For reference, the power spectrum that we showed in Fig. 11 was at z = 12, where the adiabatic fluctuations are small (though they still contribute to the Tk power spectrum by |$\sim 10~{{\ \rm per\ cent}}$| through their cross term). We also show in Fig. 13 the prediction if one incorrectly set the temperature to be homogeneous at z = 35 (following the red line in Fig. 12). This ‘wrong’ cT would underpredict the Tk power spectrum by a factor of 3–10 before X-ray heating.

Dimensionless power spectrum of Tk as a function of redshift, at |$k=0.1\, {\rm Mpc}^{-1}$|. We show the contribution from X-rays (non-linear, in purple dot–dashed line), from adiabatic fluctuations (teal dotted), as well as the total (which includes their cross term, in black). At early times (z ≳ 15) the temperature fluctuations are dominated by adiabatic expansion and contraction, whereas later on X-rays are more important. We also show, for illustration purposes, the adiabatic prediction if one set the initial Tk to be homogeneous at z = 35 (in red dashed, as in 21CMFAST, see Fig. 12). This would underpredict temperature fluctuations by nearly an order of magnitude during cosmic dawn.
We conclude that adiabatic fluctuations ought to be properly included up to high z (see Fig. 12) in order to recover the full Tk fluctuations during cosmic dawn, and thus to predict the correct 21-cm signal. In practice we will group the adiabatic term with the ‘large-scale structure’ term (1 + δ) in Zeus21, as it depends on the local density δ rather than the SFRD.
4.3 Large-scale Structure
We now move to study the contribution from the density and RSD terms in equation (6), which we group under the ‘large-scale structure’ (LSS) label.
There is a fundamental difference between these and the terms that give rise to xα and TX. One can think of the latter two terms to live ‘in Lagrangian space’, as the SFRD depends on the initial overdensities linearly extrapolated; whereas the LSS terms are ‘Eulerian’, as they are given by local densities and velocities. In principle these LSS terms also suffer from non-linearities. At the redshifts and scales of interest (z ≳ 10 and |$k\lesssim 1\, {\rm Mpc}^{-1}$|) we expect the density field to be rather linear, so we will assume a simple model for non-linearities, which we aim to refine in future work.
Let us define Δ = (1 + δNL) as the (non-linear) density term. For notational consistency we will assume a lognormal model as we did for the SFRD, based on the work of Coles & Jones (1991). In that case the auto- and cross-correlations can be found trivially to be
For the redshifts and scales of interest to cosmic dawn we have tested that this correction to the auto and cross-spectrum of Δ is a few %. We have included a flag to allow the user to toggle this correction on and off. We will improve this model in future work, for instance through perturbation theory in Eulerian (Shaw & Lewis 2008) or Lagrangian (Crocce, Pueblas & Scoccimarro 2006) space.
Our model for redshift-space distortions will be likewise simple, as these are not the focus of our paper (for more thorough studies see Mao et al. 2012; Ross et al. 2021; Shaw et al. 2023 for instance). We will take the linear relation δv = −μ2δ, where μ = k||/k is the line-of-sight cosine. We have implemented three different options for RSDs in Zeus21. These are (i) no RSDs (or real-space, μ = 0), (ii) spherical RSDs (as standard for simulations, which we calculate by setting μ = 0.6, so that (1 + μ)2 ≈ 1.87), and (iii) foreground-avoided RSD (as observed by interferometers outside the wedge, which we obtain by setting μ = 1). Each of these will be revisited and compared to simulations in future work. Unless otherwise specified, we will show results for the spherical RSD mode for the 21-cm signal in order to better compare to other results in the literature, but real space for the rest of quantities like SFRD, TX, and xα.
4.4 Reionization
The final ingredient for computing T21 in equation (6) is the neutral hydrogen fraction xHI. The epoch of cosmic reionization will see xHI evolve from near unity during cosmic dawn to zero by z = 5 (Becker, Bolton & Lidz 2015). This is a complicated process, as originally small and isolated bubbles of HII will grow and merge, eventually percolating to reionize the entire universe (Furlanetto & Oh 2016). A successful analytic approach to model this era is that pioneered by Furlanetto, Zaldarriaga & Hernquist (2004), where the universe is populated with reionization bubbles built on top of galaxies. Rather than reproduce their model, or recent work based on perturbative reionization McQuinn & D’Aloisio 2018; Qin et al. 2022; Sailer, Chen & White 2022, we will focus on the cosmic-dawn epoch at z ≳ 10, and simply compute the average evolution of the neutral fraction xHI for reference. We will tackle models of the reionization fluctuations (or bubbles) in future work16
We will calculate the evolution of reionization through the ionizing emissivity |$\dot{N}_{\rm ion}$|, which is computed in terms of the SFRD as (Mason et al. 2019)
by accounting for the fraction fesc of ionizing photons that can escape the galaxy where they were produced. We model this fraction as a simple power-law in mass (e.g. Park et al. 2019),
and for our fiducial we set |$f_{\rm esc}^{(0)}=0.1$| and αesc = 0 for simplicity, which enforce full reionization by z = 5.3, as suggested by Lyman α data (Bosman et al. 2022), though both of these are free parameters in Zeus21. For instance, one can set a positive index αesc > 0, which implies a faster epoch of reionization (EoR), in which heavier/brighter galaxies dominate reionization, as suggested by recent results from simulations (Yeh et al. 2023) and observations (Naidu et al. 2019).
Rather than working with xHI, we define the filling factor Q = 1 − xHI of ionized hydrogen, whose evolution is given by (Madau, Haardt & Rees 1999)
where nH is the number density of hydrogen, and trec is its recombination time. We will follow Mason et al. (2019) and find the recombination time through
with a constant C = 3 and αB evaluated at Tk = 104 K for simplicity. Equation (52) can be suggestively rewritten as
where as before primes indicate derivative with respect to z, and we have defined a function g that satisfies
For matter domination (the epoch of interest) and a constant C we can analytically solve this function to be
where |$\tau _0 = t_{\rm rec}(0) H_0 \sqrt{\Omega _m}$|. Then, we can find the filling factor as
which accounts for recombinations (though only homogeneously, cf. Madau et al. 1999 for a similar solution in terms of t rather than z). This equation can be generalized to clumping factors C that evolve with z, though it is technically difficult to include imhonogeneities, so we defer those to future work.
We can now integrate equation (57) to find the global evolution of the neutral fraction, which we show in Fig. 14. Indeed, reionization is over by z ≈ 5.3 for our chosen parameters, and begins in earnest below z ≈ 10. This is in broad agreement with current reionization data (also shown in the Figure, summarized in Mason et al. 2019), as well as the CMB (our model gives rise to τreio = 0.05, well within the 1-σ Planck mesurement Planck Collaboration et al. 2020). The evolution of xH i is fairly fast in this model, as the bright galaxies dominate the SFRD. We focus on the cosmic-dawn era proper (z > 10) in this work, so we will set xH i = 1 through the rest of this paper unless otherwise specified.

Average reionization history for our fiducial model, along with a subset of current constraints. We parametrize the escape fraction as in equation (51) and set a fixed clumping factor C = 3 for simplicity.
5 A FULL ANALYTICAL CALCULATION OF THE 21-CM SIGNAL
In the previous sections, we have explained how we compute each of the ingredients of the 21-cm temperature T21 (equation 6). We now show how we combine them to find the 21-cm global signal and its fluctuations in Zeus21.
So far we have not made any assumptions on the size of xα, TX, or their fluctuations. However, in order to keep an analytically closed form we will now expand their contributions to the 21-cm signal as (Barkana & Loeb 2005; Pritchard & Furlanetto 2007)
where as before an overline means spatial average (and we assume a homogeneous Tc–Tk connection in this work). We remind the reader that T21 depends multiplicatively on each of these terms. These relations are linear in δxα and δTk, but not on the matter density δ. The SFRD at each R shell will behave as an exponential of δR, which gets added over all R to find the total power on xα and Tk (and thus on T21). One can then infer that T21 will be a sum of lognormal variables, with well-known statistical properties close to a lognormal itself (Mitchell 1968). While we are working to linear order in the xα, Tk fluctuations (and as such the TX and Tcosmo terms can be computed independently as argued in Section 4), we will show the full expressions in Appendix B, where we demonstrate that the corrections from higher order terms are sub 10 per cent for the models and redshifts of interest, making our approximations sufficient.
5.1 An example global signal with Zeus21
We begin by computing the 21-cm global signal. To first order, it can be found by simply using the average of each of the quantities at play (the temperature |$\overline{T}_k$|, Lyman α coupling |$\overline{x}_{\alpha }$|, and density/RSD δ = δv = 0).17
We show the Zeus21 prediction for the 21-cm global signal in Fig. 15. As expected of our fiducial (which contains only galaxies above the atomic-cooling threshold), the cosmic-dawn signal turns on at z ∼ 20, peaks at z ∼ 15, and turns into emission by z ∼ 10 (cf. Mirocha et al. 2017; Muñoz 2019; Park et al. 2019). For our choice of X-ray efficiency L40 the signal reaches an ∼−90 mK depth, far from the ∼−250 mK allowed by the adiabatic temperature of gas (assuming full WF coupling), and farther even from the ∼−500 mK required by the EDGES claimed detection (Bowman et al. 2018, though see of course Hills et al. 2018; Singh et al. 2021 for criticisms). In upcoming work, we will implement dark matter models (such as millicharged particles Muñoz, Dvorkin & Loeb 2018; Driskell et al. 2022) into Zeus21, which will allow the user to self-consistently account for the gas cooling from the CMB epoch to cosmic dawn and beyond (as Zeus21 interfaces with CLASS, allowing for joint CMB + 21-cm analyses).

Cosmic evolution of the 21-cm global signal (top) and fluctuations (bottom) calculated with Zeus21. We have set a Planck 2018 Lambda cold dark matter cosmology, and ‘minimal’ astrophysical parameters as summarized in the main text (which do not include Pop III stars in minihaloes). Solid lines show the spherically averaged prediction, whereas dashed lines have |μ| = 1, as expected of the modes that survive the foreground wedge, showing significantly larger power during cosmic dawn.
We now explore the 21-cm fluctuations in our full non-linear and non-local model.
5.2 An example power spectrum with Zeus21
We use the auto- and cross-power spectra for each of the 21-cm components (LSS, xα, and Tk) derived in Section 4, along with the weights we show in equation (58), to find the 21-cm power spectrum analytically in Zeus21. The full calculation at all z takes a mere 3–5 s in a single-core laptop. Further, the separation of components will allow us to test each of their contributions during cosmic dawn.
We show our prediction for the 21-cm power spectrum as a function of z at two wavenumbers in Fig. 15. Guided by observational constraints, we will focus on wavenumbers outside the ‘foreground wedge’ (Parsons et al. 2012; Liu, Parsons & Trott 2014a), as those within it are deemed unusable for cosmology. As such, we choose wavenumbers roughly at the low-k edge of the foreground wedge (|$k=0.1\, {\rm Mpc}^{-1}$|), and before thermal noise spikes up (|$k=0.5\, {\rm Mpc}^{-1}$|). Moreover, we show results assuming either spherical RSDs (as commonly done in 21-cm simulations) or foreground-avoiding RSDs (μ = 1), which evade the wedge, and in which case the power is larger by a factor of a few. In both cases the 21-cm power shows the characteristic growth from high to low z as the 21-cm signal grows in absorption (|$\overline{T}_{21}$| becomes more negative), until the trough at z ∼ 15, after which the power begins to decrease. The two power spectra shown reach different amplitudes at their peak, which occurs at slightly different z. This showcases the power of the 21-cm fluctuations to unearth the astrophysics of cosmic dawn, holding more information than the global signal alone. We do not run a full detectability study with interferometers such as HERA or the SKA here, as that is not our goal. However, we note that a similar 21CMFAST model in Muñoz et al. (2022) had overall lower power, but still boasted a signal-to-noise ratio of ≈100–200 for both HERA and the SKA, so we would expect a high-significance detection of our predicted 21-cm signal.
Our approach in Zeus21 diverges from previous analytical approaches (e.g. BLPF or ARES, see also Schneider et al. 2021) in that it accounts for non-linearities in the SFRD. This is key to obtaining accurate estimates of the 21-cm power spectrum. We showed in Figs 9 and 11 that non-linearities change the power spectrum of the xα and Tk components by |$\mathcal {O}(1)$| in the |$k=0.1-1\, {\rm Mpc}^{-1}$| range, where the 21-cm signal is most readily observable. As such, we expect the 21-cm power spectrum to be affected by these non-linearities that we are modelling.
We showcase this point in Fig. 16, where we plot the 21-cm power spectrum against wavenumbers k at three redshifts, chosen roughly to be near the beginning and end of cosmic dawn (z = 12 and 17), and midway (at z = 15). In all cases, the linear calculation is accurate at very large scales (|$k\lesssim 0.05\, {\rm Mpc}^{-1}$|), which are however inaccessible with current interferometers. At the observationally relevant scales (|$k=0.1-1\, {\rm Mpc}^{-1}$|), we see that the non-linear calculation in Zeus21 can change the 21-cm power spectrum significantly. At the highest z we show in Fig. 16 the non-linear corrections increase the power by roughly 50 per cent, as the Lyman α term dominates the fluctuations. This term follows the SFRD, and as we saw Fig. 9 the non-linear behaviour of this quantity increases the xα power significantly. Near the trough (z = 15) the Lyman α and X-ray fluctuations are expected to roughly cancel at large scales (Pritchard & Furlanetto 2007; Muñoz & Cyr-Racine 2021), and the 21-cm power rises steeply towards higher k, roughly tracking the density. As such, the non-linear SFRD corrections are small in Fig. 9. At the lowest z = 12 that we show in that figure we see that the linear prediction would cancel at |$k\sim 0.2\, {\rm Mpc}^{-1}$|, whereas our full calculation is radically different. At this low z the Lyman α fluctuations are small, so the 21-cm power is determined by a competition between the X-ray and LSS terms. These two terms are anticorrelated here, as larger densities increase T21 through the (1 + δ) term, but also provide more heating, which would decrease the contrast with the CMB and thus T21. This correlation will flip to positive at z ≲ 11, when our global signal turns into emission. Our non-linear modeling of the X-ray term is thus key to infer the correct 21-cm power at |$k\sim 0.1\, {\rm Mpc}^{-1}$| scales.
The three cases in Fig. 16 are examples of the richness of information encoded in the 21-cm fluctuations. They also showcase the level of detail that has to be modelled to extract said information from upcoming 21-cm data. We note that, unlike simulated power spectra, our curves correspond to a specific k, rather than a broad bin of wavenumbers. One can average over k in Zeus21 to emulate the output of simulations, if desired. We will compare our results with semi-numerical simulations from 21CMFAST in the following section, but we note in passing that when comparing with the approach in Schneider et al. (2021) we also find good agreement (in the linear case specially, as their non-linear model relies on the halo model, rather than the Eulerian picture we follow here).
6 COMPARISON WITH 21CMFAST
We now move on to compare our analytical results from Zeus21 to the well-known semi-numeric simulations from 21CMFAST. The excursion-set approach that we use for the heating and Lyman α terms is very similar to 21CMFAST, as in both cases it is based on the work of Barkana & Loeb (2005) and Pritchard & Furlanetto (2007). The parametrization of the sources themselves is, nevertheless, fairly different. In order to fairly compare the two, we will implement an astrophysical model within Zeus21 that closely resembles 21CMFAST, as we now detail. The busy reader may want to skip ahead to Figs 20 and 21 to see a comparison of the 21-cm global signal and fluctuations between the two codes, given the same inputs.
6.1 The SFRD in 21CMFAST
We begin by modelling the SFRD (and hence the halo–galaxy connection) as in 21CMFAST. Throughout this paper, we have followed the EPS formalism to perturb the SFRD on regions of density δR. In this formalism one takes the halo abundance to depend on density as in equation (16), which feeds into the SFRD. 21CMFAST, instead, follows an approach where the SFRD is itself modulated by densities using a Press–Schechter HMF. That is, in 21CMFAST (Mesinger et al. 2011)
which is then renormalized by a factor
where |$\overline{{\rm SFRD}}$| is the ‘true’ SFRD [calculated as in equation (8) with the spatially averaged Sheth–Tormen HMF], but SFRDPS(z|δR), and its spatial average 〈SFRDPS(z|δR)〉, are computed with the Press–Schechter HMF. We will not dwell here on the validity of this formula against its counterpart in equation (15), and instead simply show that it can also be fit by an exponential of δR, as it fundamentally relies on the same principle: in the early universe the abundance of haloes – and thus galaxies – is exponentially sensitive to the density field.
We showcase the agreement of the SFRD predicted by our lognormal model and 21CMFAST in Fig. 17. We show a slice of densities from 21CMFAST, as well as our lognormal approximation to the SFRD and its ratio to the direct output of the 21CMFAST simulation (from their Fcoll output box). Our analytical model agrees with the 21CMFAST output to within |$\lesssim 3~{{\ \rm per\ cent}}$|. This figure shows a smoothing scale of R = 10 Mpc, comparable to the scales that determine the power spectrum at the observable wavenumbers k ≲ 1 Mpc−1, as we will confirm below when comparing power spectra of Tk and Jα. We have implemented this alternative EPS algorithm in Zeus21, which can be turned on with the FLAG_EMULATE_21CMFAST.

Slice with the density from a 21CMFAST simulation (on top), and our lognormal model for the SFRD (in the middle). We also show the ratio with respect to the SFRD output from 21CMFAST in the bottom, which only deviates from unity at the few per cent level. The density field has been smoothed on R = 10 Mpc scales, and is extracted from a simulation with 1203 cells and 180 Mpc side.
6.2 Astrophysical and simulation parameters
In addition to the SFRD, the 21CMFAST model for astrophysical sources differs from our baseline in a few key elements, which we now describe.
On the astrophysics side, the SFR (|$\dot{M}_*$|) for each halo is computed from equation (13), which relies on a time-scale t* rather than EPS or exponential accretion. We have also implemented this model in Zeus21, and through this section we will choose the same astrophysical parameters as the PopII-only model of Muñoz et al. (2022), which fit current reionization data (xH i, HST UVLFs, and the Lyman α forest, see Qin et al. 2021). We will take a number Nα = 11 000 of Lyman α photons per baryon, as is inferred from the stellar spectra on 21CMFAST (though we assume a simpler spectrum as described in Section 4). As for X-ray propagation, we will force the 21CMFAST approximation that the X-ray opacity (e−τ) be a Heaviside theta function, either 0 or 1 (only for this section, the user can switch this feature on or off in Zeus21 using the flag XRAY_OPACITY_MODEL). We additionally change the constant conversion factor from xα to Jα to the 21CMFAST-coded value of 1.66 × 1011 [rather than our 1.81, both in the appropriate units, see equation (27)], and we do not include the Hydrogen fraction fH ∼ 0.9 in the optical depth τGP to imitate their code. As introduced in Section 4, 21CMFAST ignores adiabatic fluctuations before its starting redshift z = 35 by default. We will mimic this (erroneous) behaviour by setting the high-z part of the cT integral to zero (see Figs 12 and 13 for comparison).
On the cosmology side, we have modified the HMF parameters to match 21CMFAST (from Jenkins et al. 2001), and use their approximate ‘Dicke’ growth factor rather than the CLASS output. We additionally take a 3D top-hat window function in equation (22) for both the linear and non-linear correlation functions, as that is what the 21CMFAST algorithm assumes. We will cut scales smaller than the resolution of the 21CMFAST simulations, which will be Rmin = 0.93 Mpc (corresponding to 1.5 Mpc resolution in a cubic cell as recommended in Kaur, Gillet & Mesinger 2020); as well as larger than 500 Mpc, as 21CMFAST does not sum beyond that scale.
For the 21CMFAST simulations we have evolved the density field linearly, since that is supposed to be the input of the EPS algorithm. We have also turned off photoheating feedback (Sobacchi & Mesinger 2014, and in fact we set a fixed Mturn = 107.5 M⊙ in both codes), as we have not implemented that feature in Zeus21 yet. Moreover, we have set to zero the excitation contribution from X-rays (which is a small correction), and have fixed a homogeneous value of xe = 2 × 10−4 for the free electron fraction (though it only makes a |$\lt 10~{{\ \rm per\ cent}}$| difference). This serves a double purpose, as 21CMFAST uses RECFAST.18 with a clumping factor C = 2 for evolving xe, which leads to discrepancies when compared to HyREC and CLASS, and we are currently not tracking spatial fluctuations in xe in Zeus21. In all cases, we calculate real-space power spectra and focus on the fesc = 0 (no reionization) limit, which will allow us to make crisp comparisons between the two codes.
6.3 The IGM properties
We begin by comparing the properties of the IGM (chiefly the X-ray heating and Lyman α flux) between the 21CMFAST simulation boxes and our analytic predictions from Zeus21. We choose two indicative redshifts during cosmic dawn, z = 17 (when the signal will be turning down, as the first galaxies start to excite the Hydrogen and slowly heat it), and at z = 10, when we finish our simulations and the gas is fully heated.
We show in Fig. 18 the power spectrum of the IGM temperature fluctuations (|$\delta T_k/\overline{T}_k$|) due to X-rays, both for our Zeus21 run (with the modifications outlined above), and for the average of three 21CMFAST boxes (with 1000, 500, and 300 Mpc in side, all with 1003 cells), to which we assign an error given by their root mean square difference (or 20 per cent, whichever is larger) to test k convergence. The two calculations agree very well at both redshifts. Interestingly, a linear-only calculation (following BLPF) significantly underpredicts the power for k ≳ 0.1 Mpc−1 in Fig. 18. This shows the need for a non-linear approach as ours, and further validates its output against simulated Tk boxes. At k ≳ 0.8 Mpc−1, however, even the non-linear calculation deviates from the 21CMFAST result. That is partly due to our minimum radius Rmin at the cell size, as well as to possible aliasing on the 21CMFAST boxes (as the power spectrum spikes up at high k in a resolution-dependent manner, which is more obvious in Fig. 19).

Power spectrum of the Tk fluctuations (dimensionless) sourced by X-rays only (i.e. without adiabatic). Top curves correspond to z = 17, and lower for z = 10. Black lines are our Zeus21 predictions, and the red region is from 21CMFAST simulations with different Lbox, from 300 to 1000 Mpc in side (all with 1003 cells). The black dashed line is a linear prediction, which falls short of the true power spectrum at the relevant scales for interferometers (|$k\gtrsim 0.1\, \rm Mpc^{-1}$|).

Same as Fig. 18 but for the fluctuations of Jα. Here, the results from 21CMFAST simulations do not appear converged between simulation runs, but the Zeus21 result agrees with the highest resolution curves towards high k.
We show a similar analysis for the Jα fluctuations in Fig. 19. There, the 21CMFAST boxes are less converged, as larger boxes tend to miss some of the high-k power, whereas smaller boxes lack the low-k support (the photons that produce the WF effect can travel a distance comparable to H−1(z) before entering a resonance, see Fig. 8). This translates into a wider 21CMFAST band. Nevertheless, we see the same trends as for Tk, with good agreement between the semi-numeric simulations of 21CMFAST and our analytic power from Zeus21. As before, the addition of non-linearities to Zeus21 is critical for inferring the correct power spectrum in the relevant scales of k ∼ 0.1–1 Mpc−1.
These tests show that our approach is able to capture the relevant physics at cosmic dawn, including non-linearities. We now move to compare the 21-cm signal between both approaches.
6.4 The 21-cm signal
We show our predicted 21-cm global signal in Fig. 20, along with the 21CMFAST result. These runs have been performed on a 1203 box, with 180 Mpc on the side (to yield a standard 1.5 Mpc resolution, which we have tested to be converged at the scales of interest, see Appendix D). Both cases show the same broad features, namely a cosmic-dawn trough peaking at z ∼ 14 (as expected with atomic-cooling haloes only, cf. Fig. 15), with Lyman α taking the 21-cm signal into absorption at z ∼ 20, and the gas being fully heated by z ∼ 10. The two global signals agree remarkably well across the entire cosmic-dawn. We remind the reader we ignore reionization (setting fesc = 0), and thus also photoionization feedback (fixing Mturn = 107.5 M⊙).

Global signal from our Zeus21 (black) versus 21CMFAST (red dashed), given the same input parameters.
The true test of Zeus21 is in the 21-cm fluctuations, though. We show the 21-cm power spectrum at two wavenumbers, k = 0.3 and 0.5 Mpc−1, in Fig. 21. The output of the two codes agrees within |$\sim 10~{{\ \rm per\ cent}}$| through most of cosmic history, though they can deviate from each other at particular z slices. For instance, the |$k=0.3\, {\rm Mpc}^{-1}$| result from Zeus21 shows a small bump at z ∼ 12 that is negligible in 21CMFAST, and our |$k=0.5\, {\rm Mpc}^{-1}$| power is a bit larger. This may be due to the k binning in simulations, or to some remaining assumption in 21CMFAST that we have not ported to Zeus21, including the treatment of adiabatic fluctuations (as our approach to match the behavior of cT in 21CMFAST is only approximate, and linear) or our simplified Lyman α SED. We have also plotted in Fig. 21 our prediction when correctly modeling the adiabatic evolution, as opposed to the 21CMFAST assumption that Tk starts homogeneous at z = 35. A full treatment of adiabatic fluctuations results in a 50 per cent downward revision of the 21-cm power spectrum during cosmic dawn, which ought to be corrected when doing inference with current and upcoming data.19

Evolution of the the 21-cm power spectrum across cosmic dawn at two wavenumbers, |$k=0.3\, \rm Mpc^{-1}$| (top) and |$k=0.5\, \rm Mpc^{-1}$| (bottom). Grey thin lines represent the result from assuming the full correct evolution of the adiabatic index cT (see Fig. 12), rather than the 21CMFAST assumption of Tk homogeneous at z = 35, which we emulate in our black lines.
This agreement between semi-numerical simulations and our fully analytic approach is promising, especially since we have not ‘tweaked’ any of the astrophysical parameters. That is, the two calculations have the same set of inputs. The fact that the fluctuations of each component (X-ray and Lyman α) agree with 21CMFAST (Figs 18 and 19) builds confidence in our treatment of non-linearities and non-localities. Moreover, the 21-cm signal, both global and in the power spectrum, agrees to a similar ∼10 per cent accuracy. Agreement beyond 10 per cent is technically difficult, and would require a dedicated code-comparison project, beyond the scope of this work. Regardless, the semi-numeric treatment of codes like 21CMFAST is likely accurate only to the 10 per cent level (Zahn et al. 2011; Park et al. 2019).
We have stopped our calculations at z = 10. Below that the 21-cm signal will begin to saturate (TS ≫ TCMB), so the 21-cm fluctuations will just trace the LSS (given we are not yet modeling reionization fluctuations in Zeus21). We show this in Appendix D, where we compare against 21CMFAST and find agreement down to z = 5. Note that we have focused on our fiducial astrophysical parameters here, but also invite the reader to visit the same Appendix D to see similar accuracy for other parameter sets.
We note that there may be lingering numerical issues in 21CMFAST near the cell size (e.g. Mao et al. 2012; Georgiev et al. 2022, as the power artificially rises near the resolution limit in Figs 18 and 19), as well as missing features on the Zeus21 side (as for instance we are only considering linear adiabatic fluctuations, and are ignoring inhomogeneities on the WF correction from Hirata 2006). Further, 21CMFAST takes some approximations on the cosmology side, including the growth factor. Nevertheless, our approach in Zeus21 can reproduce simulation-based results with a much reduced computational cost (<5 s in a laptop), and negligible memory requirements. This meets the benchmark required for next-generation 21-cm interferometers, making Zeus21 an ideal 21-cm analysis tool.
7 DISCUSSION AND CONCLUSIONS
In this paper, we have presented an effective model for the formation of the first structures, and thus for the 21-cm line of neutral hydrogen during cosmic dawn. The effective nature of our approach consists of an approximation to the SFRD, which we show traces the over/underdensities roughly as an exponential. As such, it is a lognormal variable, and we can calculate its fluctuations analytically. The power spectra of quantities derived from the SFRD, such as the Lyman α (Jα) and X-ray (JX) fluxes, follows straightforwardly, and so does the 21-cm power spectrum. We have implemented our lognormal model in the public package Zeus21, which can predict the 21-cm global signal and power spectrum including non-linearities and non-localities from photon propagation during cosmic dawn. We have shown remarkable agreement when comparing against 21CMFAST semi-numerical simulations. Unlike 21CMFAST, however, a run of Zeus21 takes <5 s in a single core. We want to emphasize that Zeus21 does not have the same purposes as simulations, like 21CMFAST. We do not aim to produce 3D maps of T21 so we are currently not computing higher order correlations (e.g. bispectra), and we rely on our effective lognormal model for the SFRD (which for instance 21CMFAST does not need). Nevertheless, Zeus21 is built to produce a cheap – but fully non-linear – computation of the 21-cm global signal and fluctuations. Zeus21 is fully built in python and interfaces with cosmological codes like CLASS. We hope this work encourages community development and usage of the public Zeus21 for modelling the 21-cm signal.
The most obvious use code for Zeus21 is running inference on upcoming 21-cm data. Usual Metropolis–Hastings MCMCs will become possible, given the speed of Zeus21(comparable to CLASS). This will allow us to analyse upcoming data from 21-cm telescopes such as HERA (e.g. Abdurashidova et al. 2022) at a much reduced computational cost. In addition, we want to highlight here a few other possible applications.
Beyond standard-model cosmology. Adding new parameters self-consistently in Zeus21 comes at a relatively low cost, especially if their effects are already encoded in CLASS, as Zeus21 reads the matter transfer functions and adiabatic gas temperatures from its output. As a consequence, models of non-cold dark matter, such as ETHOS (Cyr-Racine et al. 2016), warm (Bode, Ostriker & Turok 2001), and fuzzy dark matter (Marsh 2016) can be implemented during cosmic dawn with ease (see e.g. Schneider 2018; Jones et al. 2021; Muñoz et al. 2021 for some previous work in this direction). Likewise, models that heat or cool the gas (such as millicharged particles Muñoz & Loeb 2018, and annihilating or decaying dark matter Furlanetto, Oh & Pierpaoli 2006a; Lopez-Honorez et al. 2016) are directly included as long as they are in CLASS.
Testing new astrophysical models. Zeus21 is built to be modular, so one can modify each key astrophysical assumption independently. As an example, we have implemented two different models for the halo–galaxy connection at high z (see Section 3), and adding new ones is straightforward. Likewise about new effects like Lyman α scattering or heating (Venumadhav et al. 2018).
Find an effective description of the first galaxies. Our lognormal model relies on the effective biases γR. Rather than predict them from a given halo–galaxy connection and cosmology, one can use them directly as free parameters and find them from the data. Our Fig. 5 hints at them being roughly z-independent, and thus a possibly simplified model for cosmic dawn.
Constrain the SEDs of the first galaxies. Due to computational constraints, standard 21-cm tools such as 21CMFAST often assume that the X-ray spectra of the first galaxies is a power-law in energy (see, however, Das et al. 2017), which we have followed in this work. With Zeus21, however, we can implement any arbitrary X-ray (or UV) SED for the first sources, and find their effects in the 21-cm signal. While this question has been explored for the 21-cm global signal in Mirocha (2014), ours is to our knowledge the first public code that can compute the 21-cm fluctuations for arbitrary X-ray SEDs.
Studies of cross terms. We can easily find the effect of each term on the final 21-cm signal, enabling us to study their correlations. For instance, this has allowed us to find a missing assumption on the standard adiabatic cooling in 21CMFAST (see Fig. 12).
Predict fluctuations on other tracers. While we have focused on the 21-cm line as a tracer of the SFRD, there are other tracers of this quantity that could benefit from our effective approach. For instance, line-intensity mapping of other high-z lines is also expected to trace the SFR of each galaxy (Kovetz et al. 2017). Likewise for galaxy luminosity functions at high z from space telescopes like HST, JWST, and Roman, or Earth-based like Subaru. Cross-correlations to other observables ought to be modelled within our formalism, which we will examine in future work.
Joint 21-cm and CMB analyses. As Zeus21 interfaces with CLASS in python it is straightforward to jointly consider 21-cm data with CMB and other cosmological data sets, including the large-scale structure and supernovae.
To summarize, we have presented an effective lognormal model for cosmic dawn, which allows us to find the evolution of the 21-cm signal including non-linearities and non-localities fully analytically. Our model is available as the open-source Zeus21 software package. This is a new tool to model the star-formation rate density and the 21-cm signal during cosmic dawn, and it sets the basis upon which to build an efficient 21-cm pipeline. As such, Zeus21 represents a key step towards unearthing astrophysical and cosmological information from the deluge of cosmic-dawn data ahead of us.
ACKNOWLEDGEMENTS
JBM acknowledges partial support by the University of Texas at Austin and by a Clay Fellowship at the Smithsonian Astrophysical Observatory. We are thankful to Yacine Ali-Haïmoud, Cari Cesarotti, Daniel Eisenstein, and Jordan Mirocha for enlightening discussions; as well as Jordan Flitter, Ely Kovetz, Adrian Liu, Andrei Mesinger, Jordan Mirocha, and the anonymous referee for insightful comments on a previous version of this manuscript.
DATA AVAILABILITY
Our code is publicly available on GitHub. The data underlying this article will be shared on reasonable request to the author.
Software: numpy (van der Walt, Colbert & Varoquaux 2011), scipy (Jones et al. 01), powerbox (Murray 2018), mcfit, classy (Blas et al. 2011), hyrec (Ali-Haimoud & Hirata 2011), and 21cmfast (Mesinger et al. 2011).
Footnotes
In principle hydrogen may not trace matter fluctuations perfectly, even at large scales, so one should distinguish δH from δ. We will ignore this subtlety here for ease of comparison with past literature, and will revisit it in future work.
In this z > 30 pre-galaxy era there are fast and precise analytical predictions, e.g. Lewis & Challinor (2007).
We reproduce the numerical factors here for completeness, but encourage the reader to visit the Zeus21 site for the implementation.
This is not surprising, since for highly biased objects the exponential term of the EPS correction dominates, in which case the contribution from each mass is multiplied by an exponent |$\propto \tilde{\delta }_c^2$| - |$\delta _{\rm crit}^2$|, which is linear in δR to first order.
https://github.com/eelregit/mcfit. Note that Zeus21 does this transformation on the fly for each cosmology.
We have chosen to use the same symbol ν for spectral frequency and normalized density in equation (9), as both are deeply ingrained symbols in cosmology, and confusion is unlikely to arise.
We define z′(R) at the edge of a shell of radius R by default in Zeus21. When comparing to 21CMFAST, however, we will follow their prescription and take it halfway to the previous shell.
In order to reduce the impact of ‘kinks’ whenever one of these shells turns on we account for a last partial shell with a weight wn ∝ [z − zmax(n)].
We keep the ΔR explicit to use sums rather than integrals, which keeps the notation and computations tidy. We also remind the reader that |$\overline{\rm SFRD}$| is evaluated at the redshift z′(R) relevant for each z and R.
Note that here we do not separate Tk into ‘cosmology’ and X-ray terms.
In the meantime, the interested user could plug Zeus21 into the reionization-only code from Mirocha et al. (2022), taking the |$\overline{T}_S$| and |$\overline{x}_{\rm HI}$| output from Zeus21 as inputs and modeling the cross terms.
This is the approach followed in e.g. ARES (Mirocha 2014).
Because of this their average prediction for Tk is off before X-rays, which we account for by lowering our Tcosmo by 5 per cent.
We suggest a solution for 21CMFAST, which consists of initializing the Tk box with linear adiabatic fluctuations with our fitted index cT from equation (48), and then evolving it normally.
References
APPENDIX A: NORMALIZATION OF THE LOGNORMAL SFRD
Through this work, we have assumed a functional form
for the SFRD at each R. Since δR is a Gaussian variable with norm zero, this makes the SFRD a log-normal variable. However, its average will not necessarily be the δR = 0 result, which raises a potential issue when normalizing the SFRD against simulations. In the main text, we defined the |$\tilde{\delta }_R$| variable in equation (20), so that |$\left\langle e^{\gamma _R \tilde{\delta }_R} \right\rangle = 1$| and the factor in front of equation (19) is simply |$\rm \overline{SFRD}(z)$|, the SFRD obtained from the spatially averaged Sheth–Tormen HMF (see equation 8).
This would be the end of the story if we worked in Lagrangian space [as that is where the Sheth–Tormen fit we use in equation (8) is calibrated]. However, the SFRD is evaluated in Eulerian space, which gives rise to an extra factor of (1 + δ) (see equation 15). As such, we ought to correct the pre-factor of |$\rm \overline{SFRD}$| to provide the true Eulerian-space average.
This correction will have the form
where we have used equations (15) and (19), from Section 3 and the L superscript implies Lagrangian space.For a linear model of the SFRD this correction will vanish. Of course our model is non-linear, but its analytical nature allows us to calculate this correction in a closed form to approximately be
which grows towards smaller R, where the SFRD is more nonlinear against δR. Added over all R, this correction is ∼5% for the Lyman α and X-ray power spectra in our fiducial case. While it is below our 10 per cent precision goal, we include it in Zeus21 and allow the user to turn it on or off with a flag.
APPENDIX B: NON-LINEAR EXPANSIONS
In this appendix, we expand the different terms of the 21-cm signal to higher order, and study how well our approximations work in describing both the 21-cm global signal and the power spectrum.
We begin with the expansion of temperature term,
where, as in the main text, we ignore fluctuations on the Tc–Tk relation (from Hirata 2006), in which case |$\delta T_c/\overline{T}_c = \delta T_k/\overline{T}_k$|. In practice, equation (B1) may not converge as one sums over larger n, as this is an asymptotic series (since there is a pole at Tk = 0, so the radius of convergence of this series around that point is formally zero in the complex plane). This is, nevertheless, physically irrelevant, as Tk does not vanish during cosmic evolution, and when expanding around |$T_k=T_k^{\rm ad}$| the radius of convergence will be finite (but still we expect the series to break at large δTk). Likewise, we expand the Lyman α term as
This case has two interesting regimes. For xα ≪ 1 the linear term suffices, even for large fluctuations, due to the 1 + term on the denominator. Later on, however, we expect xα ≳ 1, in which case higher order terms can matter. As such, we expect that corrections from non-linearities will grow at lower z. We have worked to n = 1 in the text, and we will study the corrections from higher n to both terms here.
Corrections to the power spectrum
Let us estimate the size of the correction to the 21-cm power spectrum from the terms ignored. For that, we compare the 2nd- and 1st-order terms in the above equations to each other. In this estimate, we will simply take δq = σ(q) for both q = {xα, Tk}, with σ averaged over R = 10 Mpc in order to represent the typical scales observed in 21-cm experiments. We will work in real space, and ignore adiabatic fluctuations in Tk, both for simplicity and to home in on the impact of our nonlinear SFRD model. We show the ratios for both xα and Tk in Fig. B1, which are safely below 10 per cent for z > 12, during cosmic dawn. They rise to roughly 10 per cent at lower z ∼ 10, though this error is still comparable to the expected uncertainties in fully non-linear semi-numerical simulations (e.g. 21CMFAST analyses often include a 20 per cent ‘systematic’ uncertainty, Park et al. 2019). As such, we have ensured that our Taylor expansion of the Tk and xα terms within T21 is accurate enough for determining the 21-cm power spectrum during cosmic dawn. At z ≲ 12 one may need to model quadratic terms in T21 to go below 10 per cent error. We note, however, that precisely at those lower z the 21-cm fluctuations from reionization will start to grow, and they are expected to overshadow the Tk and xα fluctuations in Fig. B1, so this is largely an academic issue.

Corrections to the 21-cm power spectrum and the global signal from higher order terms that we ignore in this work, as a function of z. We show our estimate of the relative contribution to |$\Delta ^2_{21}$| at observables scales k from non-linear terms in the kinetic temperature Tk (purple solid), and in the Lyman α coupling (black dot–dashed). We also show a similar estimate for the 21-cm global signal |$\overline{T}_{21}$| (teal dashed, from all contributions). It blows up at z ∼ 11 where the global signal vanishes, but the correction is under 1 mK at all z for this model.
Corrections to the global signal
Let us now calculate the correction |$\Delta \overline{T}_{21}$| to the 21-cm global signal. We expand equation (6) to second order in fluctuations as
where the sum runs over all cross terms (with qi = {δ, xα, Tk}), and the last two terms in this equation include the higher order Taylor expansion (note there is not one for q = δ in real space). We define the second-order |$\beta _i^{(2)}$| parameters to be |$\beta _T^{(2)} = \beta _T/\overline{T}_k$| and |$\beta _\alpha ^{(2)} = \beta _\alpha /(1 + \overline{x}_\alpha$|) in terms of the first-order βi [extracted from equations (58) in the main text, see also BLPF]. We show this correction in Fig. B1, which is safely below 10 per cent during the entire cosmic dawn, barring a spike at z ∼ 11 where the global signal vanishes. The absolute correction is below 1 mK for the entire cosmic-dawn range we study.
We thus conclude that corrections from quadratic (and higher) terms in the definition of T21 are largely unimportant during cosmic dawn, both for the global signal and the power spectrum. This is in strike contrast with the corrections from the non-linearities arising from the SFRD (see e.g. Figs 9 and 11).
APPENDIX C: IMPACT OF THE X-RAY SED
Here, we illustrate the large impact that the X-ray SED has in cosmic dawn, by showcasing a case with a smaller energy cutoff E0 in the SED.
Rather than show the entire 21-cm evolution, let us focus on the window function in Fourier space in order to isolate the effect of X-ray propagation (as both are easy to calculate with Zeus21, and the latter gives us direct access to the astrophysics of cosmic dawn). The linear window functions Wi are defined in terms of the (linear) power spectra as
for any quantity i, and as such they can be understood as describing how much power at each k is lost due to effects like photon propagation.
We show the linear windows Wi in Fig. B2 for three different cases, all at z = 10. First, for our fiducial X-ray case with E0 = 0.5 keV the window stays near unity until k ∼ 1/λmfp, where λmfp is the mean-free path of X-ray photons around the threshold (λmfp ∼ 35 Mpc for this fiducial). This is similar to the Lyman α case, which drops at slightly larger scales (lower k). In that case the drop is due to the distance that a photon can travel between two Lyman resonances, which is typically ∼100 Mpc comoving, rather than the X-ray cross section. A smaller cutoff E0 = 0.2 keV on the X-ray SED makes a large impact on the 21-cm fluctuations, as the last window function we show in Fig. B2 illustrates. In that case λmfp ∼ 2 Mpc, and as thus fluctuations are retained to larger k. These windows are useful to build intuition, though of course they will be modified by our nonlinear SFRD apporach in Zeus21.

Window function for (linear) fluctuations in the Lyman α flux (black dot-dashed), and in the X-ray heating term ΓX (purple) for our fiducial case of E0 = 0.5 keV. If we set a lower E0 = 0.2 keV (red dashed) the fluctuations survive undamped to higher k, as lower-energy X-rays are absorbed more locally. All cases are at z = 10.
APPENDIX D: BROADER COMPARISON TO 21CMFAST
In the main text, we compared the output of Zeus21 to 21CMFAST under one fiducial parameter set, finding agreement down to z = 10. In this appendix, we show how that agreement generalizes to other astrophysical situations, to lower redshifts, and check that our choice of box-size is appropriate.
Rather than perform an entire parameter scan, which is computationally expensive in 21CMFAST and beyond the scope of this work, we will simply vary two of the main input parameters. We run two simulations, one with a power law α* = 0.7 for the faint end of the halo–galaxy connection (steeper than the fiducial α* = 0.5), and one with a different X-ray efficiency L40 = 10−0.5, equivalent to |$\log _{10}(L_X/\rm SFR) = 39.5$| (a factor of 10 lower than usual). We show the output of both codes (global signal and power spectra) for these parameters in Fig. C1. The case with lower L40 produces a much deeper 21-cm global signal, and consequently a stronger 21-cm power spectrum during cosmic dawn, which Zeus21 captures (the spikes in the 21CMFAST output are likely due to numerical noise). Likewise, the case with a steeper index α* delays structure formation by reducing the brightness of small-mass objects, which shows up as a lower-z 21-cm trough.

Comparison of the 21-cm global signal (top) and power spectrum (bottom, at |$k=0.3\, {\rm Mpc}^{-1}$|) between Zeus21 (solid) and 21CMFAST (dashed). The black/red lines correspond to our fiducial parameter set chosen through the paper, whereas the blue lines have a factor of 10 lower X-ray luminosity per unit SFR (LX, which produces some numerical noise on the 21CMFAST simulations), and the green lines a steeper faint-end of the luminosity function (α*). In all cases, our effective approach in Zeus21 reproduces the 21CMFAST outputs accurately.
As for lower z, we show in Fig. D1 the result of running both codes down to z = 5, under our assumption of fesc = 0 (i.e. no reionization). This shows the same level of agreement at z < 10 as we found during cosmic dawn. This is to be expected, as for z < 10 our 21-cm signal is fully saturated (TS ≫ TCMB), in which case the 21-cm fluctuations will simply trace densities. We show this is true in Fig. D1, as we also show a curve proportional to the matter power spectrum.

Same as Fig. C1 but all the way down to z = 5. We have set our fiducial parameters as in the main text, including fesc = 0 (as we are not yet modeling reionization bubbles). Below z = 10 the spin temperature begins to saturate (TS ≫ TCMB), so in the absence of reionization the 21-cm signal will just trace densities. The green dashed–dotted line denotes |$\overline{T}_{21}^2\Delta _m^2$|, which indeed matches the 21-cm power at low z under our assumptions. Finally, the blue dotted–dashed curves correspond to a different set of initial conditions on 21CMFAST, which shows convergence at the scales and z of interest.
Finally, through this work we have compared Zeus21 with a 21CMFAST simulation box that is 180 Mpc on a side, with 1.5-Mpc resolution. While this is smaller than recommended in Kaur et al. (2020), it more than suffices to capture the 21-cm signal at the scales that we study. To show this, we run a 21CMFAST box with different initial conditions and show the results in Fig. D1. It is clear that any sample variance from the box size is small during cosmic dawn (see Giri et al. 2023 for a similar discussion during reionization).
To summarize, in all cases we see remarkable agreement between Zeus21 and 21CMFAST, building confidence in our effective model for the cosmic-dawn 21-cm signal.