Modelling a Hot Horizon in Global 21 cm Experimental Foregrounds

The 21 cm signal from cosmic hydrogen is one of the most promising probes of the early Universe. The detection of this signal would reveal key information about the first stars, the nature of dark matter, and early structure formation. We explore the impact of an emissive and reflective, or 'hot', horizon on the recovery of this signal for global 21 cm experiments. We demonstrate that, when using physically motivated foreground models to recover the sky-averaged 21 cm signal, one must accurately describe the horizon around the radiometer. We show that not accounting for the horizon leads to a signal recovery with residuals an order of magnitude larger than the injected signal, with a log Bayesian evidence almost 1600 lower than when one does account for the horizon. Signal recovery is sensitive to incorrect values of soil temperature and reflection coefficient in describing the horizon, with even a 10% error in reflectance causing twofold increases in the RMSE of a given fit. We also show that these parameters may be fitted using Bayesian inference to mitigate these issues without overfitting and mischaracterising a non-detection. We further demonstrate that signal recovery is sensitive to errors in measurements of the horizon projection onto the sky, but that fitting for soil temperature and reflection coefficients with priors that extend beyond physical expectation can resolve these problems. We show that using an expanded prior range can reliably recover the signal even when the height of the horizon is mismeasured by up to 20%, decreasing the RMSE by a factor of 9 relative to the model that does not perform this fitting.


INTRODUCTION
Creating a timeline of the universe between the period of recombination and the end of reionisation is a necessary step for cosmologists to describe the makeup of the early universe. Direct observation of cosmic neutral hydrogen remains the most promising tool to understand the universe between z ≈ 1100 and z ≈ 10. The hyperfine spin flip of neutral hydrogen produces an emission line at a rest-frame wavelength of 21 cm (van de Hulst 1945). The power of this feature with respect to the radio background as it redshifts through cosmic epochs will provide crucial information about structure formation up to the Epoch of Reionisation. The depth, position and width of the 21 cm absorption feature allow us to probe the early star formation rate (e.g. Schauer et al. 2019), the initial mass functions of population III stars (e.g. Gessey-Jones et al. 2022), the nature of X-ray binaries (Das et al. 2017), and more exotic physics such as dark matter distribution and properties (e.g. Barkana et al. 2018).
We measure the 21 cm signal using the brightness temperature; that is, the temperature at which a blackbody in thermal equilibrium with an object would have to be to produce the same level of thermal excitation. The brightness temperature of this signal is several orders of magnitude lower than the radio foreground, making direct observation of the redshifted 21 cm signal extremely difficult (Shaver et al. 1999). Despite this, a speculative detection of the globally averaged 21 cm absorption trough was made by the 'Experiment to Detect the Global Epoch of Reionization Signal' (EDGES) (Bowman et al. 2018). This detection describes the global signal as a flattened Gaussian centred at 78 MHz with a depth of 0.5 K. The flattened nature of the signal, as well as its depth, did not appear to match any existing theory at the time (e.g. Cohen et al. 2017). The depth of the signal demanded either an enhanced radio background (Fialkov & Barkana 2019; Mittal & Kulkarni 2022), or a way of cooling the Universe more rapidly than expected due to interactions with dark matter (Liu et al. 2019); the flatness may be explained by two competing heating mechanisms, such as Lyman-α photons and cosmic rays, becoming dominant at different times (Gessey-Jones et al. 2023).
Questions, however, have been raised as to the robustness of the data analysis performed in the experiment (Hills et al. 2018). Issues may have arisen from nonphysical electron temperatures, a damped sinusoidal systematic (Singh & Subrahmanyan 2019; Sims & Pober 2019), and other residuals (Bevins et al. 2021) which may come from beam effects or other distortions such as the ionosphere (Shen et al. 2021).
The 'Radio Experiment for the Analysis of Cosmic Hydrogen' (REACH) (de Lera Acedo et al. 2022) aims to perform an independent measurement of the sky-averaged 21 cm signal to either confirm or disprove the EDGES detection (Bowman et al. 2018). Utilizing a fully Bayesian data analysis pipeline, it aims to model systematics and foregrounds in a more physically motivated manner than has been done previously.
The global 21 cm absorption trough is predicted to sit between 70 and 200 MHz (e.g. Pritchard & Loeb 2008; Cohen et al. 2017), which lies in the ranges used by FM radio stations and digital TV. This introduces a large amount of possible radio frequency interference (RFI) at a much higher intensity than the 21 cm signal, which poses a large problem for signal recovery. This RFI may be mitigated using data analysis tools (Leeney et al. 2022), but to further minimise this risk, the radiometer is set up in the Karoo Radio Astronomy Reserve in South Africa, surrounded by mountains on all sides. While the mountains accomplish the goal of greatly decreasing RFI around the antenna, they create a new issue that must be overcome. It was shown in Bassett et al. (2021) that a horizon has a significant effect on the detection of the 21 cm signal.
For an experiment like EDGES a horizon may be unnecessary to describe, as the polynomials used to fit for their data may have been able to encompass a horizon without describing it specifically.
Any experiment using physically motivated foreground models, for example REACH, will however require a description of the horizon itself. This paper focuses on the REACH radiometer (Cumner et al. 2021) and pipeline (Anstey et al. 2020), as this is the collaboration to which the authors belong, but this analysis is applicable to all global 21 cm experiments using physically motivated foreground models for signal recovery.
Section 2 deals with the expansion of the REACH pipeline to accommodate a horizon in data generation and foreground modelling. Section 3 details the impact of this horizon on signal recovery, the reliance of signal recovery on correct soil parameter estimation, how fitting for soil parameters may liberate one from this reliance, and how this parameter fitting may increase the tolerated error in horizon profile mapping. In Section 4 we outline our key conclusions and discuss future work to expand the models discussed in this paper.

METHODS
In this section we describe Bayesian inference (2.1), how we simulate a mock data set describing the power given off by a horizon surrounding the REACH antenna (2.2), and the methods used to account for this power in the physically motivated foreground models needed to recover the redshifted 21 cm signal (2.3).

Bayesian Fitting
The REACH pipeline (Anstey et al. 2020) relies on Bayesian inference for parameter estimation and model comparison in the recovery of the redshifted 21 cm signal. Bayesian inference relies on Bayes' theorem, which states:
$$P(\theta_M \mid D, M) = \frac{P(D \mid \theta_M, M)\, P(\theta_M \mid M)}{P(D \mid M)}, \qquad (1)$$
where one can infer the probability of having a set of parameters, $\theta_M$, given a set of data, $D$, and a proposed model, $M$. This equation may be written more simply as:
$$\mathcal{P} = \frac{\mathcal{L}\,\pi}{\mathcal{Z}}, \qquad (2)$$
where $\pi$ is the prior, describing our assumptions on the initial probability distribution of the parameters we are estimating, which is updated to the posterior $\mathcal{P}$. $\mathcal{L}$, the likelihood, may be read as the probability of observing the data given a model and a set of parameters using that model. $\mathcal{Z}$ is the evidence, representing the probability of observing the data given a model, integrated over all possible parameters that the model could use. Provided the input data is the same, we can compare the Bayesian evidences of two models to determine which model fits the data with the highest probability, following:
$$\frac{P(M_1 \mid D)}{P(M_2 \mid D)} = \frac{\mathcal{Z}_1\, P(M_1)}{\mathcal{Z}_2\, P(M_2)}, \qquad (3)$$
where the ratio of the evidences, weighted by the prior probabilities of the models (which we treat as a uniform distribution), gives the ratio of the probability of each model fitting the data.
The REACH pipeline uses the Nested Sampling (Skilling 2006) algorithm PolyChord (Handley et al. 2015a,b) to perform this parameter estimation and model comparison. This algorithm randomly draws a number of parameter sets from the given prior, for each of which a likelihood is calculated. The lowest likelihood points are discarded and the volume of the parameter space shrinks accordingly, with a new sample being drawn from the newly constrained prior. This iterative shrinking of the prior space continues until a termination criterion has been met and the parameters have been determined. The Bayesian evidences are generated as a byproduct of this and may be further used for model comparison. For more information on Nested Sampling see Skilling (2006).
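As a toy illustration of the evidence comparison described above, the sketch below computes ln Z for a one-parameter model by brute-force integration over a uniform prior and compares it with a parameter-free alternative. It is not nested sampling and does not use the REACH pipeline's sampler; the data, noise level, prior range and function names are all assumptions made for this example.

```python
import numpy as np

# Toy data (assumed for illustration): a constant offset of 0.5 plus Gaussian noise.
rng = np.random.default_rng(0)
sigma = 0.1
data = 0.5 + sigma * rng.normal(size=50)

def log_like(offset):
    """Gaussian log-likelihood of the data for one or more trial offsets."""
    offset = np.atleast_1d(np.asarray(offset, dtype=float))
    resid = data[None, :] - offset[:, None]
    norm = -0.5 * data.size * np.log(2.0 * np.pi * sigma**2)
    return norm - 0.5 * np.sum(resid**2, axis=1) / sigma**2

def log_evidence_uniform(lo, hi, n=20001):
    """ln Z for a uniform prior on [lo, hi], estimated on a regular grid."""
    theta = np.linspace(lo, hi, n)
    ll = log_like(theta)
    peak = ll.max()
    # With a uniform prior, Z is the prior-averaged likelihood.
    return peak + np.log(np.mean(np.exp(ll - peak)))

ln_z_offset = log_evidence_uniform(-1.0, 1.0)  # model with a free offset
ln_z_null = log_like(0.0)[0]                   # model with no free parameters
delta_ln_z = ln_z_offset - ln_z_null
print(f"ln Z (offset) - ln Z (null) = {delta_ln_z:.1f}")
```

With equal prior model probabilities, the difference in ln Z is the log of the posterior odds between the two models. A grid integral of this kind is only practical in very low dimensions, which is one reason samplers are used for realistic models.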

Data Simulation
Using the SHAPES algorithm developed by Bassett et al. (2021) we generate a profile of the horizon around the REACH antenna at −30.8387° N, 21.37492° E with data from Google Maps based on the Landsat/Copernicus surveys. We define a horizon mask such that all pixels with an altitude angle (θ) above that of the horizon for a given azimuthal angle (φ) are assigned a value of 0, and any pixel on or under the horizon is assigned a value of 1 (the horizon mask is shown in Figure 1).
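The mask definition above can be sketched in a few lines. The example below is a minimal illustration and not the algorithm of Bassett et al. (2021): the function name, the four-point horizon profile and the example pixels are hypothetical, and the profile is linearly interpolated in azimuth.

```python
import numpy as np

def horizon_mask(pix_alt_deg, pix_az_deg, profile_az_deg, profile_alt_deg):
    """Return 1 for pixels on or below the horizon and 0 for pixels above it."""
    # Interpolate the horizon altitude to each pixel's azimuth; period=360
    # makes the profile wrap around the full circle.
    horizon_alt = np.interp(pix_az_deg, profile_az_deg, profile_alt_deg, period=360.0)
    return (np.asarray(pix_alt_deg) <= horizon_alt).astype(int)

# Hypothetical horizon profile: altitude (deg) sampled at four azimuths (deg).
profile_az = np.array([0.0, 90.0, 180.0, 270.0])
profile_alt = np.array([5.0, 10.0, 5.0, 2.0])

# Three example pixels given as altitude and azimuth in degrees.
pix_alt = np.array([3.0, 12.0, 1.0])
pix_az = np.array([0.0, 90.0, 270.0])

mask = horizon_mask(pix_alt, pix_az, profile_az, profile_alt)
print(mask)
```

In practice the pixel coordinates would come from the sky map pixelisation, and the profile from the measured horizon.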
Maps of the variation of the spectral indices across the sky are generated by calculating the spectral index required to map each pixel of the 2008 Global Sky Model (GSM) (de Oliveira-Costa et al. 2008) from one frequency to another. Here we choose a GSM at 230 MHz as the base map to avoid contamination of the cosmological redshifted 21 cm signal, which at the extremes of theoretical predictions should be extinguished between 200 and 250 MHz.
We then take the 230 MHz base map and subtract a flat value of 2.725 K from each pixel to account for the temperature of the CMB.
This map is then scaled to each frequency according to the previously calculated spectral indices and rotated according to the time and date of a given observation. Further detail of sky map generation in the REACH pipeline is found in Anstey et al. (2020).
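The per-pixel spectral index and the frequency scaling can be sketched as follows, assuming a simple power law in the CMB-subtracted temperature. The function names, the toy pixel values and the second frequency of 100 MHz are assumptions made purely for illustration; the rotation of the map with time is not shown.

```python
import numpy as np

T_CMB = 2.725  # K

def spectral_index_map(t_base, t_other, nu_base, nu_other):
    """Per-pixel beta such that (T - T_CMB) scales as nu**(-beta)."""
    ratio = (t_other - T_CMB) / (t_base - T_CMB)
    return -np.log(ratio) / np.log(nu_other / nu_base)

def scale_map(t_base, beta, nu, nu_base=230.0):
    """Scale the CMB-subtracted base map to frequency nu (MHz)."""
    return (t_base - T_CMB) * (nu / nu_base) ** (-beta)

# Toy three-pixel base map at 230 MHz with known spectral indices.
t_230 = np.array([150.0, 300.0, 900.0])
beta_true = np.array([2.4, 2.6, 2.9])

# Build a second map at a hypothetical frequency, then recover beta from the pair.
nu_other = 100.0
t_other = scale_map(t_230, beta_true, nu_other) + T_CMB
beta_recovered = spectral_index_map(t_230, t_other, 230.0, nu_other)
print(beta_recovered)
```

In this sketch the scaled map is returned with the CMB subtracted, so a flat CMB temperature would be added back separately.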
Once these maps have been rotated we mask them by the physical horizon surrounding the radiometer. In this process we make the assumption that the sky and the horizon sit on the edge of an infinite, flat plane and that we may ignore all near-field effects below an altitude angle of 0°. This assumption is nonphysical and may cause some issues (Bradley et al. 2019); the impacts of near-field soil effects on REACH will be explored in a later work.
Our horizon mask may then be multiplied by some estimate for soil temperature in Kelvin, $T_{\rm soil}$, which will describe simple emission from the soil on the horizon, but will fail to replicate any additional power arising from radio waves reflected from the surface of the soil.
This mask makes the assumption that all light behind the horizon is fully attenuated by the mountains surrounding REACH, and that the temperature of the horizon is constant. The former is justified as the vegetation surrounding REACH is minimal: there are no large trees in the close foreground that must be relied on to block parts of the sky, so the attenuation from the mountains can be assumed to be complete.
The latter is a more troublesome approximation, as $T_{\rm soil}$ will decrease or increase at various rates depending on the composition of the horizon and the location of that part of the horizon with respect to the sun. However, for the purposes of this model we believe this is an unnecessary complexity in demonstrating the importance of horizon modelling.
This model also assumes a lack of diffraction. Light is treated as only moving in straight lines, so any emission from behind the horizon is ignored entirely. We make this assumption as the ratio of light that is diffracted around the horizon to the amount of light that is blocked by the horizon is of order λ/h, where λ, the wavelength, is of order 1 m, and h, the height of the mountains around REACH, is of order 1000 m. For much smaller horizons than that of REACH, diffraction may be a greater challenge, but we address a way of approaching this in Section 4.
In describing reflection we must make a number of assumptions about the soil, namely its makeup and moisture content, both of which will heavily impact the dielectric permittivity of the soil itself. As a proxy for the soil in the Karoo reserve we look to examples of better studied soil in other non-sandy desert environments. The relative dielectric permittivities of soil in the Avra Valley in Arizona were analysed and discussed in Sternberg & Levitskaya (2001). Using this as an approximation of what we expect to find around the REACH antenna, we use a relative dielectric permittivity for Very High Frequency radio waves of order 10.
The reflection coefficient (Γ) of an electromagnetic wave passing from a vacuum into a different medium follows Equation 5, where we approximate the air to have a relative permittivity of 1. We model the reflection as diffuse, as it will come from all areas of the sky and the ground, so we can average the angle of incidence to zero, allowing us to model the reflection coefficient as:
$$\Gamma = \frac{1 - \sqrt{\epsilon_r}}{1 + \sqrt{\epsilon_r}}. \qquad (6)$$
Here $\epsilon_r$ is the relative permittivity of the soil. Approximating the soil as having an $\epsilon_r$ of between roughly 5 and 15, depending on soil moisture levels, we find a value of Γ between -0.4 and -0.6.
Therefore, to account for reflection we take an average of the power per pixel across the entire sky (including blackbody emission from the horizon), add this power onto each pixel of the horizon and multiply it by the magnitude of the reflection coefficient. This would account for all reflection from the sky and from thermal emission only if the soil were able to see every part of the sky, which is unphysical. Thus we also account for the individual parts of the horizon being unable to view the entire sky, i.e. a rock lying on one of the mountains will be unable to see any of the sky that the mountain it is lying on obscures. We assume the mountains around the telescope to have an incline of 45°, an approximation we make from topographical maps of the area. We can then multiply the power we have mapped onto the horizon by a factor of 135/180 to account for the regions that remain unseen by that part of the horizon. This number is specific to the topography of REACH, and the slope of the basin surrounding it.
The value of the incline used has a tolerance of up to ±15° before accurate signal recovery becomes challenging and the depth of the recovered signal is poorly estimated. While accurate incline estimates are important for accurate signal recovery in our base horizon model, the tolerance for precise estimation is loosened greatly once we begin fitting for the reflection coefficient, as in Section 3.3.
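The quoted range of Γ and the 135/180 factor can be checked numerically. The sketch below assumes the standard normal-incidence form Γ = (1 − √ε_r)/(1 + √ε_r) for a wave passing from air into soil, which reproduces the range quoted above, and reads the 135/180 factor as (180° − incline)/180°. The function names are hypothetical.

```python
import math

def reflection_coefficient(eps_r):
    """Normal-incidence reflection coefficient for air (permittivity 1) into soil."""
    n = math.sqrt(eps_r)
    return (1.0 - n) / (1.0 + n)

def visible_sky_fraction(incline_deg):
    """Fraction of the 180 degree sky arc seen from a slope of the given incline."""
    return (180.0 - incline_deg) / 180.0

for eps_r in (5.0, 10.0, 15.0):
    print(f"eps_r = {eps_r:4.1f}  ->  Gamma = {reflection_coefficient(eps_r):+.3f}")
print(f"Visible-sky factor for a 45 degree incline: {visible_sky_fraction(45.0):.2f}")
```

Relative permittivities of 5, 10 and 15 give Γ of roughly -0.38, -0.52 and -0.59, and a 45° incline gives a visible-sky factor of 0.75.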
We combine this term, which we deem $T_{\rm reflection}$, with our map describing emission to create a snapshot of the sky, which we may integrate over time and solid angle to give a full time-averaged sky map. This is then convolved with the gain of the beam to give a simulated antenna temperature ($T_{\rm data}$) for an all-sky observation at the REACH site. It is important to note that we assign a value of zero to any part of the antenna beam that has an altitude angle below zero. Reflections of the beam below this angle will be dealt with in a separate paper.
Our data model combines these terms, where $D(\Omega, \nu)$ is the beam directivity at a given frequency, the horizon mask is a function of Ω, Ω describes solid angle, ν is frequency, $T_{\rm soil}$ is the soil temperature in Kelvin, $T_{\rm CMB}$ is the CMB temperature, set to 2.725 K, and σ is experimental noise, which in this case we assume to be Gaussian white noise set at 25 mK.
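The way emission and reflection are combined in the mock data can be illustrated on a toy map. The sketch below follows the prose description for a single frequency and time step: soil emission replaces the sky in horizon pixels, and the map-averaged power, scaled by |Γ| and the visible-sky factor, is added to the horizon pixels. It is an assumption-laden illustration rather than the pipeline's implementation: pixels are taken to have equal area, the function names and the four-pixel map are hypothetical, and the time integration, CMB term and noise are omitted.

```python
import numpy as np

def hot_horizon_snapshot(t_sky, mask, t_soil=300.0, gamma_mag=0.6,
                         seen_fraction=135.0 / 180.0):
    """Combine sky power, soil emission and reflection for one frequency."""
    mask = np.asarray(mask)
    # Sky power above the horizon, soil blackbody emission on or below it.
    emission = np.where(mask == 1, t_soil, np.asarray(t_sky, dtype=float))
    # Average power per pixel over the whole map, horizon emission included.
    mean_power = emission.mean()
    # Reflected power is added only to horizon pixels.
    reflection = gamma_mag * seen_fraction * mean_power * mask
    return emission + reflection

def antenna_temperature(t_map, beam):
    """Beam-weighted average of the map (equal-area pixels assumed)."""
    return float(np.sum(beam * t_map) / np.sum(beam))

t_sky = np.array([1000.0, 2000.0, 3000.0, 4000.0])  # toy sky temperatures in K
mask = np.array([0, 0, 1, 1])                        # last two pixels are horizon
beam = np.ones(4)                                    # uniform toy beam

snapshot = hot_horizon_snapshot(t_sky, mask)
t_ant = antenna_temperature(snapshot, beam)
print(snapshot, t_ant)
```

Setting `t_soil` and `gamma_mag` to zero in this sketch gives a horizon that only blocks sky power, which is the 'cold' limit.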

Physically Motivated Foreground Model
We follow the framework of Anstey et al. (2020) to generate our foreground model. The sky power is modelled by dividing the sky into N regions in which we approximate β to be equal across a given region, based on the spectral index map β(Ω) described in Section 2.2. This is shown in Figure 1. An N of 1 assumes the entire sky has a constant spectral index, and as N tends towards the number of pixels in the map each pixel will have its own unique spectral index. This coarse-graining approach allows for greater control of the complexity of the model, with each additional sky region demanding that another parameter be fitted in the model.
Maximising the efficiency of the fitting process demands that we calculate the sky 'chromaticity functions', $K_i(\nu)$, outside of the likelihood. This means we must approach modelling the horizon in the foreground differently to the data generation process.
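One simple way to coarse-grain a spectral index map into N regions is to bin the pixels by spectral index. The sketch below uses equal-width bins between the minimum and maximum index; this binning choice, the function name and the five-pixel map are assumptions for illustration.

```python
import numpy as np

def region_masks(beta_map, n_regions):
    """Split pixels into n_regions equal-width spectral index bins.

    Returns a list of 0/1 masks, one per region."""
    edges = np.linspace(beta_map.min(), beta_map.max(), n_regions + 1)
    # Digitising against the interior edges gives labels 0 .. n_regions - 1.
    labels = np.digitize(beta_map, edges[1:-1])
    return [(labels == i).astype(int) for i in range(n_regions)]

beta_map = np.array([2.40, 2.45, 2.70, 2.95, 3.00])  # toy spectral index map
masks = region_masks(beta_map, 3)
for i, m in enumerate(masks):
    print(f"region {i}: {m}")
```

Each pixel belongs to exactly one region, so the masks sum to one everywhere. With a single region the mask covers the whole map, matching the constant spectral index limit described above.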
The coarse-grained spectral index map is rotated according to the date and time of observation, and much the same as in the data generation process we mask out the horizon. Once the horizon has been masked out we map the average power per frequency of each spectral region onto the horizon mask, scaled by the fraction of the sky that it takes up. Multiplying by |Γ| and the amount of the sky that a given part of the horizon can 'see' gives the reflected power per frequency per sky region. This is a power that we then scale according to the frequency scaling of the spectral region it arises from. The sky 'chromaticity functions', when multiplied by this sky term, which we denote $T_{\rm sky}$, will account for all power which arises originally from the sky.
Here we follow the conventions of Anstey et al. (2020): $\beta_i$ is the spectral index of a given sky region, and $M_i(\Omega)$ refers to a mask over a given sky region.
We deal with the blackbody emission and self-reflection of the soil separately, as it does not scale with frequency. We take the horizon mask and multiply it by a given $T_{\rm soil}$ to give the blackbody emission term. To model reflection of the blackbody emission from the soil, we map the emission back onto the horizon mask, accounting for the total fraction of the sky it takes up, and multiply by |Γ| and the amount of sky that the soil can 'see'. Together these account for our blackbody terms: we multiply our blackbody term, $T_{\rm BB}$, by the horizon 'chromaticity function'. The sky and blackbody terms are then added together with a flat CMB temperature to give our total model.
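The benefit of precomputing chromaticity functions is that the likelihood only has to apply a power law per region and a sum. The sketch below shows one possible arrangement of the terms described in prose above (frequency-scaled sky terms, a frequency-independent blackbody term, and a flat CMB temperature). It is an assumed structure for illustration, not a transcription of the paper's equations, and all names, array shapes and numbers are hypothetical.

```python
import numpy as np

T_CMB = 2.725  # K

def foreground_model(nu, k_sky, betas, k_horizon, t_bb, nu_base=230.0):
    """Evaluate a foreground model from precomputed chromaticity functions.

    nu        : (n_freq,) frequencies in MHz
    k_sky     : (n_regions, n_freq) precomputed sky terms, one row per region
    betas     : (n_regions,) spectral index of each region (fitted parameters)
    k_horizon : (n_freq,) precomputed horizon chromaticity function
    t_bb      : blackbody term for the horizon in K
    """
    # Each region is scaled by its own power law in frequency.
    scaling = (nu[None, :] / nu_base) ** (-betas[:, None])
    sky_term = np.sum(k_sky * scaling, axis=0)
    # The blackbody term carries no power-law frequency scaling.
    horizon_term = k_horizon * t_bb
    return sky_term + horizon_term + T_CMB

nu = np.array([230.0, 115.0])
k_sky = np.array([[100.0, 100.0], [50.0, 50.0]])
betas = np.array([2.0, 3.0])
k_horizon = np.array([0.1, 0.1])

model = foreground_model(nu, k_sky, betas, k_horizon, t_bb=300.0)
print(model)
```

Only `betas` and `t_bb` change between likelihood calls in this arrangement, so the expensive beam and sky integrals behind `k_sky` and `k_horizon` are computed once.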

RESULTS
In this section we discuss the results and implications of a 'hot' horizon on the recovery of the 21 cm signal. In subsection 3.1 we discuss the issues that arise when we do not account for, or incorrectly account for, the horizon in our foreground models. In subsection 3.2 we discuss the implications of incorrectly assuming the temperatures and reflection coefficients of the soil on the horizon. Subsection 3.3 deals with the utility of allowing the temperature and reflection coefficient of the soil to be fitted as parameters, and subsection 3.4 with the use of this fitting in correcting for the mismeasurement of the height of a horizon.

Recovery of the 21 cm Signal
In previous works recovering theoretical 21 cm signals using physically motivated foreground models the horizon is either ignored, or simply treated as something that blocks out sky power without emitting anything itself (Kim et al. 2022; Hibbard et al. 2023).
Here we investigate the impact of including a horizon in our mock data set, but failing to properly account for it in our foreground models. We inject a Gaussian signal with a central frequency of 85 MHz, bandwidth (here treated as the standard deviation of the Gaussian) of 15 MHz, and a depth of 0.155 K into our mock data set and use our pipeline with 5 distinct models describing the horizon to attempt to recover it. We detail the prior ranges for a 'realistic' redshifted Gaussian 21 cm absorption feature in Table 1, derived from Cohen et al. (2017).
For each of these models we fit for a Gaussian signal, detailing the central frequency, bandwidth and depth parameters recovered by the model, allowing us to compare these to the true values of the injected signal. This comparison gives some idea of how accurate the model is. We use the root mean square error (RMSE) between the injected signal and a Gaussian signal made from the recovered posterior parameters to show how accurately the recovered signal mirrors the injected one, a lower RMSE indicating a better approximation of the 'True' value. For each of the models we also perform a fit for a non-detection, in which the foregrounds are assumed to be the only source of power. We take the difference between the Log(Z) of fitting our foreground model with a Gaussian signal and the Log(Z) of fitting for just the foregrounds with no signal to give us the ΔLog(Z). This is a measurement of how probable the model believes a detection of a Gaussian signal is when compared to a non-detection. A ΔLog(Z) of 1 would indicate that our model favours a detection with a probability ten times higher than that for a non-detection, with a ΔLog(Z) of -1 indicating that the non-detection is favoured by the same amount. Every increase by 1 in ΔLog(Z) corresponds to another order of magnitude by which the detection would be favoured over a non-detection. Thus, for us to claim that a detection has been made this number cannot be below zero.
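The two comparison statistics can be written down directly. In the sketch below the Gaussian absorption profile uses the injected parameters quoted above, while the frequency grid, the 'recovered' parameter values and the function names are assumptions chosen only to make the example runnable.

```python
import numpy as np

def gaussian_signal(nu, centre, width, depth):
    """Gaussian absorption trough in K; width is the standard deviation in MHz."""
    return -depth * np.exp(-0.5 * ((nu - centre) / width) ** 2)

def rmse(a, b):
    """Root mean square error between two spectra."""
    return float(np.sqrt(np.mean((a - b) ** 2)))

def delta_log_z(log_z_signal, log_z_no_signal):
    """Evidence difference between the signal and no-signal fits."""
    return log_z_signal - log_z_no_signal

nu = np.linspace(50.0, 200.0, 151)                   # assumed 1 MHz grid
injected = gaussian_signal(nu, 85.0, 15.0, 0.155)    # injected signal from the text
recovered = gaussian_signal(nu, 86.0, 14.0, 0.160)   # hypothetical posterior means

error = rmse(injected, recovered)
print(f"RMSE = {error:.4f} K")
```

A positive `delta_log_z` favours the model containing a signal, and a negative value favours the non-detection.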
We show these results in Table 2 in which we compare the ability of these 5 different horizon models to recover the 21 cm absorption trough from Cosmic Dawn.
• The 'No Horizon' model. This model fails to account for the horizon entirely, letting the horizon mask be a null matrix in Equations 10, 11, and 12. As shown in Figure 2, this saturates our prior for the depth of the signal, and recovers a very biased estimate of the centre frequency with a signal model that has unacceptably large residuals.
• The 'Cold Horizon' model. Here we do include a horizon in our foreground model, but set $T_{\rm soil}$ and |Γ| to zero. This is an approximation of the approach that previous works have used to describe the horizon (Bassett et al. 2021; Kim et al. 2022; Hibbard et al. 2023), treating the horizon as something that attenuates radio waves, instead of being an emitter of any kind. We achieve no improvement on the model that ignores the horizon entirely, as shown in Figure 3. Looking at Table 2 we see that the 'Cold Horizon' model struggles to recover the redshifted 21 cm signal even more than the model that ignores the horizon entirely, with a Log(Z) 0.5 lower than the 'No Horizon' model and an RMSE 0.0002 higher. This is not a surprising result. The horizon radiates a large amount of power, be it through thermal emission or reflection. The signal is buried in the radio foregrounds, so failing to account for the horizon in any way means that the foreground model used will change to account for some emission from the sky in the horizon region. This in essence will mimic some of the $T_{\rm reflection}$ term in our data model. When one just masks out the horizon and gives it no power we find ourselves even further away from a true description of the foregrounds, which makes signal recovery much more difficult.
• The 'No Emission' model. Allowing the horizon to reflect the sky, but not providing it with a description of its own thermal emission ($T_{\rm soil}$ = 0, |Γ| = 0.6), does help signal recovery, with a Log(Z) increase of almost 600. This, however, still saturates the depth parameter to the prior limit, as seen in Figure 4, due to a large amount of power being unaccounted for in the foreground.
• The 'No Reflection' model. In Figure 5 we account for the thermal emission of the horizon in our foreground models, letting $T_{\rm soil}$ be equal to 300 K, but do not account for reflections, keeping |Γ| at 0. Here we come much closer to recovering the signal. Once again, the signal is biased, and the depth parameter is saturated. However, the residuals are greatly reduced.
Including thermal emission and ignoring the reflected power gives a much closer approximation to the correct foreground models than when we only consider sky reflection, which, while a large improvement on our 'cold' horizon model also entirely saturates the depth priors during signal recovery.
• The 'All' model. As shown in Figure 6, when we account for both emission and reflection we find the signal with a Log(Z) of 288.3. This is a Log(Z) approximately 1600 higher than for the model that does not account for the horizon. The model that accounts for reflection and emission also has an RMSE of less than half that of the no horizon case. These values indicate that not only is this model the most favoured in a probabilistic sense, but it is also the model that most accurately recovers our 'True' signal. This implies that recovery of the 21 cm absorption trough using physically motivated foreground models demands realistic horizon modelling.

Investigation into Effects of Temperature and Reflection Coefficient
Until this point we have assumed we are able to perfectly predict both the temperature and reflection coefficient of the soil on the REACH horizon. Practically this is difficult. The temperature of the soil will vary with time, and while it is possible to set up temperature probes around the horizon of the antenna, this is difficult both fiscally and in terms of the human resources it would require. The Karoo is a radio quiet reserve, so temperature probes cannot be operated remotely, and their readings must be collected from around the reserve manually for each observation. |Γ| is even more difficult to determine. It is heavily dependent on the moisture levels of the soil, how deep the moisture penetrates the soil, and the specific soil makeup across the mountains. This will all vary with weather and location around the mountain, which makes precise estimates of |Γ| year round unfeasible. Using the same values throughout for temperature and reflection coefficient in our model of data generation (300 K and 0.6 respectively), we examine the impact of incorrectly predicting the soil temperature and reflection coefficient in our foreground models.
It may be observed from Figure 7 (for full details see Table A1) that misjudging the values of $T_{\rm soil}$ and |Γ| in the foreground models will cause problems for signal recovery. It does, however, demonstrate the expected link between the values chosen for $T_{\rm soil}$ and |Γ|. Both describe the amount of power emitted from the horizon; increasing $T_{\rm soil}$, leading to an overestimation of emission power from the horizon, may be compensated for by decreasing the value of |Γ| and vice versa. This is not a perfect fix. |Γ| is contained in Equation 10, while $T_{\rm soil}$ is not. This is an issue as this sky term in our model is the only one that gets scaled by frequency, so neither term can entirely correct for the other. As a result we will need very accurate estimates of both $T_{\rm soil}$ and |Γ| if we are to correctly recover the 21 cm signal.

Soil Temperature and Reflection Coefficient Fitting
As shown in Section 3.2, by fixing values of $T_{\rm soil}$ and |Γ| as reasonable estimates of the correct parameters which are slightly mismatched from the mock data we may yield a non-detection, or a detection of a signal with very different parameters to the true one. We can potentially mitigate these issues by fitting for both $T_{\rm soil}$ and |Γ| as additional parameters in our model. Setting the priors of our model to be uniform for $T_{\rm soil}$ and for |Γ|, the latter between 0.4 and 0.7, we explore the ability of this model to recover a range of signals.
The model is able to accurately recover Gaussian and flattened Gaussian signals, as in Figures 8 and 9, with a low RMSE, but with much higher residuals at lower frequencies than we saw for models in which we fixed $T_{\rm soil}$ and |Γ| to specific, correct, values.
When we fit for the reflection coefficient, however, we see very large residuals at low frequencies. These residuals arise in Equation 10, where |Γ| is multiplied by $\nu^{-\beta}$, preferentially increasing the residuals for lower frequencies. While these residuals are large they are non-degenerate with the signal we recover, so do not create any issues for the model itself.
In Table 3 we compare the ability of our new model, fitting for parameters, to a 'perfect' model, where $T_{\rm soil}$ and |Γ| are fixed to match the input values in our mock data. (It is important to note that for this comparison we can only compare models of a given signal; the evidences from one signal to another are not comparable, as the mock data we are fitting to will be different.) To test this we generate 6 Gaussian signals with pseudo-random parameters designed to stretch the model beyond the simple 85 MHz, 15 MHz, 0.155 K signal we have been using up to this point.
The model in which we fit for $T_{\rm soil}$ and |Γ| performs consistently as well as the model in which we have fixed parameters. The RMSE values of the two models are similar, and the ΔLog(Z) of the two models indicate the same ability of the models to recover a signal.
In the cases where the fixed parameter version of our model finds a signal, the fitted one will too. However, when one fails the other does the same. A signal that cannot be found with the fixed parameter model will not be found with the fitted model. This is most notable when we attempt to recover signals centred at 53 and 72 MHz, falling at the very low end of the REACH observation band where the Gaussian signal does not fit entirely within the band. The 53 MHz signal is not consistently detected, the pipeline giving a higher evidence for a non-detection. The 72 MHz signal is found, but the pipeline does not recover the parameters to within reasonable error in either the fitted or hard coded cases. This detection has a very high RMSE, meaning we must be very careful in assuming the validity of any signals recovered around this range.
The pipeline being unable to properly recover signals that do not entirely fall within the observing band of REACH is not unexpected. A signal that does not sit fully within the band will be more degenerate with the foregrounds and will be fitted poorly.
Once the signal sits more comfortably in the observation band, both the traditional fixed model and the model where we fit for $T_{\rm soil}$ and |Γ| accurately recover the signal. Crucially, when we fit for a model that assumes no signal we still see a much greater evidence for the model that contains a signal, which is to say that fitting $T_{\rm soil}$ and |Γ| will not arbitrarily increase the evidence of a fit to the point where it becomes impossible to distinguish detection from non-detection.
To confirm this we also perform a test in which we inject no signal, and fit this to a Gaussian. It can be seen that even when we are fitting the additional parameters of $T_{\rm soil}$ and |Γ| we do not artificially recover a Gaussian detection, with the Bayesian evidence correctly favouring a non-detection.

Soil Temperature and Reflection Coefficient Fitting as a Way of Correcting for Horizon Measurement Errors
We have allowed ourselves to deal with any error in the measurement of  soil and |Γ| using the fitting process defined in Section 3.3.However, all models to this point assume that the physical height of the horizon and its projection onto the sky is measured without error.This is not a reasonable assumption.
The model being able to fit for both  soil and |Γ| may be able to compensate for errors in the measurement of the horizon.One might assume that artificially increasing  soil would cause the model to perceive the horizon to be higher than it actually is, or by increasing |Γ| the model will see more of the sky than it really does.We examine this naïve assumption in this section.
To explore the functionality of  soil and |Γ| fitting as a way to mitigate error in horizon height we expand the prior range beyond physical expectation.We allow  soil to sit between 200 and 400 K.We allow |Γ| to have any value between 0.3 and 0.9.We will then systematically create an artificial error in the horizon we use for our foreground modelling.We multiply the altitude angle, , in the horizon mask of the foreground model,  (Ω), by some scaling factor, , to increase or decrease the height of the horizon in the foreground model with respect to the mock data.
We examine the results of this in Figure 10 (for full details see Table A2), where we compare how the pipeline deals with incorrect horizon height estimates when we input the same values of T_soil and |Γ| as were used to create our mock data set versus when we allow those parameters to be fitted with an increased prior range. Here we try to recover a signal with a central frequency of 85 MHz, a bandwidth of 15 MHz and a depth of 0.155 K, with soil at 300 K and a |Γ| of 0.6. We set the scaling factor to 0.8, 0.9, 1, 1.1 and 1.2 to give a deviation in the horizon height of the foreground models of up to 20% from the mock data.
We show that by fitting for T_soil and |Γ| with this unphysical prior range we are able to consistently find the signal where the model that fixes the values of T_soil and |Γ| is unable to. This is exemplified in the case where we set the scaling factor to 1.2, simulating a 20% overestimation of the horizon in the foreground models. As detailed in Figure 11, we move from being entirely unable to recover the 'True' signal, with an RMSE of 0.0786 when we fix the parameters to those used in the mock data, to a very accurate signal recovery. Our fitted version has an RMSE approximately 9 times lower and a Log(Z) 25 units higher.
Table 3. An examination of how the pipeline copes with fitting soil temperature and reflection coefficient parameters with a range of varying signal parameters and soil properties generated using a pseudo random number generator. 'Fixed Parameters' refers to T_soil and |Γ| being hard coded into the foreground model to have the same values as the data model; 'Fitted' allows these to be additional parameters that we fit for in the foreground model. Z_Gaussian is the Bayesian evidence of trying to fit the injected signal to a Gaussian, and Z_No 21 is the Bayesian evidence when we try to model our data as having no 21 cm signal. ΔLog(Z) is the difference in evidence between these models. RMSE refers to the root mean squared error when comparing the injected mock signal to one that we generate using the posterior averages that our Gaussian model suggests.
By analysing the values of T_soil and |Γ| in Table A2 we see how these parameters correct for horizon height error. If the horizon projected in our foreground model is higher than the actual horizon, the fitting process compensates by dragging down the soil temperature and increasing the magnitude of the reflection coefficient. This by proxy increases the amount of sky that the telescope is 'seeing' in comparison to the blackbody emission from the horizon. This correction is useful but not perfect, as a projected horizon that is higher than the true horizon will mask out specific information on the sky power, obscuring parts of the spectral index map in our foreground maps. This is especially problematic when the galaxy is directly on the horizon. If the model horizon is lower than the actual horizon the fitting method does the opposite. Here the model wants to maximise the amount of blackbody radiation coming from the horizon in order to compensate for the poor foreground modelling, and to decrease the amount of sky reflection as much as possible to deal with the overestimation of the amount of sky we see.
While the Log(Z) of our fits is higher when the horizon is underestimated than when its height is systematically overestimated, we must be wary as the RMSE is also higher. This indicates that an underestimation of the horizon, even with fitting, yields a worse recovery of the true signal.
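Because the RMSE is used alongside Log(Z) to judge each fit, it may help to make the metric explicit. The sketch below assumes a simple Gaussian absorption trough in which the quoted width is treated as the standard deviation, and an assumed frequency grid of 50 to 200 MHz. Both choices, along with the function names and the 'recovered' parameters, are illustrative assumptions rather than the definitions used in the pipeline.

```python
import numpy as np


def gaussian_signal(freq_mhz, f0_mhz, width_mhz, depth_k):
    """Gaussian absorption trough in kelvin (negative at the centre).

    Treating `width_mhz` as the standard deviation is an assumption.
    """
    freq_mhz = np.asarray(freq_mhz, dtype=float)
    return -depth_k * np.exp(-0.5 * ((freq_mhz - f0_mhz) / width_mhz) ** 2)


def rmse(true_signal, recovered_signal):
    """Root mean squared error between two signals on the same grid."""
    diff = np.asarray(true_signal, dtype=float) - np.asarray(recovered_signal, dtype=float)
    return float(np.sqrt(np.mean(diff ** 2)))


# Assumed frequency grid; the band edges are not taken from this section.
freq = np.linspace(50.0, 200.0, 151)

# Injected signal parameters quoted in the text.
true = gaussian_signal(freq, f0_mhz=85.0, width_mhz=15.0, depth_k=0.155)

# Hypothetical posterior-average parameters, for illustration only.
recovered = gaussian_signal(freq, f0_mhz=86.0, width_mhz=14.0, depth_k=0.150)

print(f"RMSE = {rmse(true, recovered):.4f} K")
```

A fit can have a high evidence and still return a large RMSE if the foreground parameters absorb part of the signal, which is why both quantities are reported.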
Figure 10. An exploration into how fitting for the temperature and reflection coefficient of the soil surrounding the REACH antenna allows for the incorrect measurement of the height of the horizon in the foreground models. We inject a signal with an 85 MHz Central Frequency, 15 MHz Width, 0.155 K Depth and our data model surrounds the telescope with soil at 300 K with a |Γ| of 0.6. Log(Z) is the Bayesian evidence of the model fitting the injected signal to a Gaussian. RMSE refers to the root mean squared error when comparing the injected mock signal to one that we generate using the posterior averages that our Gaussian model suggests. The 'Unfitted' model fixes values for T_soil and |Γ| to be 300 K and 0.6 respectively. The 'Fitted' model fits for T_soil and |Γ| as additional parameters with priors between 200-400 K and 0.3-0.9 respectively.

CONCLUSIONS AND FUTURE WORK
This work demonstrates that a physically motivated foreground model demands an accurate description of the horizon for the recovery of the global 21 cm signal. We show that failing to account for the emissive and reflective properties of the soil on the horizon will lead to a non-detection of this signal. This analysis was focused on REACH but is applicable to all global 21 cm experiments that wish to use physically motivated foreground models in signal recovery.
This paper describes an easy-to-implement model that should greatly increase the ability of the REACH radiometer to recover the redshifted 21 cm absorption trough in spite of the large mountainous horizon surrounding the antenna.
This model, while a great improvement in describing the horizon, has a number of shortcomings that may be addressed in future work. These are as follows:
(i) The treatment of the horizon as composed entirely of one material that is entirely opaque to radio waves of all frequencies coming from behind is a bold one that must be addressed. In a general context this is difficult, but a specific investigation of the REACH horizon may allow for discussion and mapping of vegetation, rocks and different kinds of soil, which will all have different dielectric permittivities, attenuating and reflecting radio waves with different strengths.
(ii) This model also only discusses light rays, assuming that diffraction is negligible. Further studies may need to analyse this issue in more depth. A proposed workaround to deal with diffraction would involve treating each spectral region as having its own reflection coefficient when reflected by the horizon, which may be fitted as an additional parameter. This would allow any region of the sky obscured by the model to be artificially scaled up again to compensate for the lack of explicit diffraction in the model.
(iii) This model treats the soil as having a constant temperature around the horizon. An improvement would involve dividing the horizon into a number of regions based on cardinal direction. Giving each region its own temperature allows one to account for the movement of the sun, so that the eastern side of the valley can remain hotter than the western side after the sun sets.
(iv) This model assumes an infinite, flat ground, describing no reflection or emission from the soil that falls below an altitude angle of 0°. These near-field effects may have a very strong impact on signal recovery and will be explored in a later work.
This work represents a large step forward in horizon modelling, demonstrating a twofold decrease in RMSE from previous approaches to horizon modelling, with an increase in Log(Z) of ∼ 1600. We show that including a 'hot' horizon is a necessity when trying to recover the 21 cm signal using physically motivated foreground models.
We show that signal recovery depends on the accurate estimation of soil parameters. While there is some tolerance in the estimation of T_soil, which requires a precision of ≈ 10 K, |Γ| must be accurate to within 0.1 for accurate signal recovery.
To mitigate error in soil parameter estimation we show that these parameters may be fitted for without compromising the integrity of signal recovery. This fitting process will consistently perform as well as a horizon model in which the soil is described using fixed parameters that perfectly match those used in data generation.
We also successfully demonstrate that allowing for these parameters to have priors that reach values beyond what is strictly expected physically will allow for a tolerance in horizon height measurement of up to ∼ 20%.

APPENDIX A: ADDITIONAL TABLES

Figure 1. Coarse-grained map of the sky with altitude angle above 0° in equatorial coordinates. The map is divided into 9 regions where each pixel in a given region is said to have the same spectral index as all other pixels in that region. The horizon here is masked out and assigned spectral region '0'.

Figure 2. Recovery of a redshifted 21 cm signal when a horizon is described in the data, but not the foreground models; 'No Horizon' model. Injected 'True' signal shown in green, with 85 MHz Central Frequency, 15 MHz Bandwidth, 0.155 K Depth. This foreground model fails to accurately recover the injected signal.

Figure 5. Recovery of a redshifted 21 cm signal when a horizon is described in the data with foreground models that only account for the blackbody emission from the soil; 'No Reflection' model. Injected 'True' signal shown in green, with 85 MHz Central Frequency, 15 MHz Bandwidth, 0.155 K Depth. This model shows a large step forward in signal recovery, with much smaller residuals than the previous models; it does however still struggle to accurately recover the injected signal.

Figure 6. Recovery of a redshifted 21 cm signal when a horizon is described in the data with foreground models accounting for both blackbody emission from soil and reflected power from the soil and sky; 'All' model. Injected 'True' signal shown in green, with 85 MHz Central Frequency, 15 MHz Bandwidth, 0.155 K Depth.

Figure 7. Comparison of foreground models with varying soil temperatures and reflection coefficients recovering a redshifted 21 cm signal using the REACH pipeline and a log spiral antenna. In the data model T_soil and |Γ| are set to 300 K and 0.6 respectively. (a) shows the log Bayesian evidence as a function of changing foreground parameters, and (b) shows the corresponding change in root mean squared error of the recovered signal compared to the 'True' signal.

Figure 8. Recovery of a theoretical redshifted Gaussian 21 cm signal using a log spiral antenna and the REACH pipeline in which we fit for soil temperature and reflection coefficient as additional parameters. Injected 'True' signal is shown in green, with 85 MHz Central Frequency, 15 MHz Bandwidth, 0.155 K Depth.

Figure 11. A comparison of recovery of a theoretical redshifted 21 cm signal (F_0 = 85 MHz, Bandwidth = 15 MHz, Depth = 0.155 K) using the REACH pipeline with a log spiral antenna and a foreground model that assumes the horizon to be 20% higher than the model used in data generation. 'True' signal is shown in green. (a) uses a foreground model that fixes the soil temperature at 300 K and the reflection coefficient at 0.6, and (b) uses a model that fits the temperature and reflection coefficient as additional parameters with priors between 200-400 K and 0.3-0.9 respectively.

Table 2. Comparison of models fitting a redshifted 21 cm signal when the horizon surrounding the REACH telescope is left unaccounted for, has parts of its power unaccounted for, or is accounted for correctly. The No Horizon model describes a foreground model which does not account for the horizon. The 'Cold' Horizon model masks the horizon out of the sky model, but sets T_soil and |Γ| to be 0. The No Emission model describes a model in which the horizon is able to reflect sky power, but does not thermally emit any power, setting T_soil to 0 K and |Γ| to 0.6. The No Reflection model describes a model that does not reflect any sky or horizon power, but is able to thermally emit power, with T_soil at 300 K and |Γ| set to 0. The 'All' model describes the horizon both emitting power thermally and reflecting both sky power and the power of the horizon itself; here T_soil is 300 K and |Γ| is 0.6. The inserted mock signal has an 85 MHz Central Frequency, 15 MHz Bandwidth, 0.155 K Depth. In the mock data the soil temperature is set to 300 K and the magnitude of the reflection coefficient is set to 0.6. T and |Γ| refer to the soil temperature and magnitude of the reflection coefficient respectively in our foreground model. Z_Gaussian is the Bayesian evidence of trying to fit the injected signal to a Gaussian, and Z_No 21 is the Bayesian evidence when we try to model our data as having no 21 cm signal. ΔLog(Z) is the difference in evidence between these models. RMSE refers to the root mean squared error when comparing the injected mock signal to one that we generate using the posterior averages that our Gaussian model suggests.

Recovery of a redshifted 21 cm signal when a horizon is described in the data with foreground models that only account for the reflection of the sky off of the soil; 'No Emission' model. Injected 'True' signal shown in green, with 85 MHz Central Frequency, 15 MHz Bandwidth, 0.155 K Depth. This model shows a decrease in the residuals during signal recovery, but still cannot accurately model the injected signal.