A Bayesian Method to Mitigate the Effects of Unmodelled Time-Varying Systematics for 21-cm Cosmology Experiments

Radio observations of the neutral hydrogen signal from the Cosmic Dawn and Epoch of Reionisation have helped to provide constraints on the properties of the first stars and galaxies. Since this global 21-cm cosmological signal from the Cosmic Dawn is effectively constant on observing timescales and since effects resulting from systematics will vary with time, the effects of these systematics can be mitigated without the need for a model of the systematic. We present a method to account for unmodelled time-varying systematics in 21-cm radio cosmology experiments using a squared-exponential Gaussian process kernel to account for correlations between time bins in a fully Bayesian way. We find by varying the model parameters of a simulated systematic that the Gaussian process method improves our ability to recover the signal parameters by widening the posterior in the presence of a systematic and reducing the bias in the mean fit parameters. When varying the amplitude of a model sinusoidal systematic between 0.25 and 2.00 times the 21-cm signal amplitude and the period between 0.5 and 4.0 times the signal width, we find on average a 5% improvement in the root mean squared error of the fitted signal. We can use the fitted Gaussian process hyperparameters to identify the presence of a systematic in the data, demonstrating the method's utility as a diagnostic tool. Furthermore, we can use Gaussian process regression to calculate a mean fit to the residuals over time, providing a basis for producing a model of the time-varying systematic.


INTRODUCTION
One of the most promising probes of the physics of the Comsic Dark Ages, Cosmic Dawn and the Epoch of Reionisation is 21-cm cosmology (Furlanetto et al. 2006).Approximately 379,000 years after the Big Bang the universe underwent a phase change known as 'recombination' whereby the ionised electrons and protons combined to fill the universe with neutral hydrogen (HI), the redshifted afterglow of which can be observed as the Cosmic Microwave Background (CMB) radiation (Planck Collaboration et al. 2020).This HI gas has a hyperfine transition which emits and absorbs radiation at a wavelength of  = 21 cm or a frequency of  = 1420 MHz in the rest frame.We can define a statistical 'spin temperature' which is related to the occupancy of the excited and neutral states of the HI gas.This signal is measured relative to the CMB temperature and is either in absorption or emission based on the coupling between the gas temperature, the background radiation and the spin temperature (Wouthuysen 1952;Field 1959;Furlanetto et al. 2006).
There are several low-frequency radio experiments which are attempting to detect the cosmological 21-cm signal.Interferometers such as HERA (DeBoer et al. 2017), LOFAR (van Haarlem et al. ★ E-mail: cjk55@cam.ac.uk † E-mail: da401@cam.ac.uk ‡ E-mail: ed330@cam.ac.uk 2013), the MWA (Tingay et al. 2013) and the the future SKA Observatory (Dewdney et al. 2009) use arrays of telescopes to measure the spatial power spectrum of fluctations in the early universe.
Experiments which measure the global sky-averaged 21-cm signal such as EDGES (Bowman et al. 2008), SARAS (Singh et al. 2018(Singh et al. , 2022)), LEDA (Price et al. 2018), PRIZM (Philip et al. 2019), MIST (Monsalve et al. 2023) and REACH (de Lera Acedo et al. 2022) are working to place constraints on the physics of the Cosmic Dawn and the Epoch of Reionisation.
The only detection of the cosmological global 21-cm signal claimed so far is by the EDGES experiment (Bowman et al. 2018).This experiment is made up of two low-band dipole antennae located in the Murchison Radio Astronomy Observatory (MRO) in Western Australia which operate between 50 and 100 MHz (Bowman et al. 2008).Since the detected signal had an abnormally flat and deep profile -at least two times deeper than previous predictions (Reis et al. 2021) -concerns were raised regarding the validity of the analysis methods and the effect of systematics (Hills et al. 2018;Singh & Subrahmanyan 2019;Sims & Pober 2020).Another 21-cm global signal experiment, SARAS3 (Nambissan et al. 2021), recently placed constraints on the 21-cm signal, rejecting the EDGES detection with 95.3% confidence (Singh et al. 2022).Hills et al. (2018) found issue with the foreground modelling method used by the EDGES team.By comparing the EDGES foreground model with a physically motivated non-linear expression, they found that the optical depth of the ionosphere and the electron temperature are both negative indicating that the foreground fit is unphysical.They suggest that these unphysical values result from unaccounted for systematics in the data.Removing a 12.5 MHz sine wave from the data allows good fit to a broad Gaussian absorption profile, obtained with five foreground parameters.It is proposed that a sinusoid in the data -which can be explained by any number of instrumental systematics (see Section 1.1) -is what resulted in the Bowman et al. (2018) absorption profile with a flattened bottom which is consistent with the results of Sims & Pober (2020).
Others have interpreted the EDGES signal as a need to introduce new exotic physics, as current theories cannot explain why there would be such a large contrast between the CMB temperature and the gas kinetic temperature.One such theory proposes dark matter of which a small fraction is millicharged e.g.Barkana (2018), which would scatter of the baryonic matter, providing an additional cooling mechanism.An alternative theory does not invoke new physics but rather suggests that there is an unaccounted for radio background (Feng & Holder 2018).This suggestion may be supported by measurements by ARCADE-2 and LWA (Fixsen et al. 2011;Dowell & Taylor 2018).
In this work we will investigate the effects of time-varying systematics and test a method to help identify systematics and mitigate their effects using Gaussian processes.In particular, we will perform this investigation in the context of the REACH global experiment (de Lera Acedo et al. 2022).In Section 2 we will present the standard REACH pipeline and likelihood, introduce the Gaussian process likelihood and demonstrate how they could be used to perform time regression.In Section 3 we present the results of introducing simulated systematics to the data and compare the signal recovery of both the standard and Gaussian process methods.In Section 4 we present the conclusions of the investigation.

Systematics in the REACH System
While all best efforts have been made to calibrate the REACH instrument (Roque et al. 2021;Anstey et al. 2021Anstey et al. , 2022;;Bevins et al. 2021;Scheutwinkel et al. 2022;Razavi-Ghods et al. 2023), it might be inevitable that some systematics will end up in the final data so it is important that we understand and attempt to mitigate the effects of these unknown systematics.As was indicated in Hills et al. (2018) and Sims & Pober (2020), not accounting for systematics can potentially have a large impact on your final fit.In particular for this investigation we will have to consider the effects of the galactic foreground moving over the sky and the beam, and the temperature changing over time as both will potentially introduce systematics in the data that will vary with time.
Figure 1 shows a schematic of the REACH antenna and receiver system.Here,  (, Ω) refers to the directivity of the antenna, Γ A (, ) is the impedence of the antenna, Γ RX (, ) is the impedence of the receiver,  RX (, ) is the gain of the Low Noise Amplifier (LNA), (, ) is the radiation efficiency of the antenna,  A () is the antenna temperature and  system () refers to system temperature that is produced by the receiver.These components combine to make the time-dependent antenna temperature (Cumner et al. 2022), which can be integrated to give the time-averaged antenna temperature, Including impedence reflections and the noise term,  (, , Γ A ), gives the system temperature, (3) The 1 − |Γ A | 2 term arises from unmatched impedence between the antenna and the receiver.Each cable of the antenna has its own impedence value and if there is a difference in impedence at the interface between the two cables or devices then electrical signals will be partially reflected off this interface.As a result, a standing wave may form in the cables producing a sinusoidal systematic in the data.
Cable reflections can also produce sinusoidal systematics from noise sources such as the LNA noise or sky temperature noise.This is problematic since systematics will remain in the averaged data while noise can usually be integrated down, with the amplitude of the noise scaling as 1/

√
int where  int is the integration time of the instrument (Kraus et al. 1986).Similarly, radio-frequency interference (RFI) -which can usually be flagged (Leeney et al. 2022) and the relevant frequency bin excised -can be made sinusoidal when it passes through a system with unmatched impedences.
Other systematics can arise from incorrectly modelling the directivity pattern of the antenna or the sky temperature, the residuals of which could be approximately sinusoidal in form (Price et al. 2018).The REACH pipeline uses the foreground fitting procedure to correct for this somewhat, although limitations -such as the antenna having such a high chromaticity that it exceeds the corrective abilities of the foreground fitting -mean that this cannot be done perfectly accurately (Anstey et al. 2021).Reflections from the soil due to its dielectric properties can also result in sinusoidal systematics as standing waves form between the antenna and the ground (Bevins et al. 2021(Bevins et al. , 2022)).This is particularly problematic as the exact dielectric constant of the soil is unknown and, as such is difficult to be modelled accurately (Pattison et al. 2023).Soil reflections are somewhat mitigated by the inclusion of a 25 m by 25 m square ground plane underneath the antenna, pictured in figure 2, although finite ground planes can introduce new standing waves (Bolli et al. 2020).
Of particular interest to this investigation is the question of what happens if the systematic changes with time.An example of this effect was found by the LEDA team who discovered that the pattern of oscillations changed after rainfall (Price et al. 2018), potentially resulting from the soil's moisture content changing the dielectric properties of the ground.Changing systematics could also arise from cable reflections which reflect the foreground power and, as such, results in a sinusoidal systematic whose amplitude is modulated over time as the Earth rotates.As impedence depends on temperature and components may flex as they warm or cool, the environmental conditions can also affect systematics introduced by the receiver (Cumner et al. 2022).

Bayesian Inference
Bayesian inference is a statistical method which can be used to infer the probability distribution of an unknown variable from some given dataset and as such is a useful tool for parameter estimation.The method relies on applying Bayes' theorem for inverting conditional probabilities which can be expressed as LNA T 21 (ν) T system (ν) T sky (t, ν, Ω)  where  are the parameters of the model, M, that we're trying to fit and D is the vector of data points (Sivia 2006).(|M), or (), is known as the 'prior distribution' and represents our prior knowledge of the parameter probability distribution.(D|, M), or L(), is known as the 'likelihood' and represents the probability of observing the dataset given the chosen model and parameters are true.(|D, M), or P (), is known as the 'posterior distribution' and is the probability of the parameters given the data and the model, and is inferred from the prior and likelihood distributions.Finally, (D|M) is called the 'Bayesian evidence', sometimes given as Z, and can be used as a goodness-of-fit measure for model comparison.
The likelihood function is an expression of how likely the data is given the model and its form depends on the probability distribution of the data.If the data is randomly distributed according to a multivariate Gaussian distribution, we use a Gaussian likelihood function of the form where M() is the model function,  is the length of the data, D and C is the covariance matrix.
The Bayesian evidence can then be calculated by integrating over the parameter space, a technique known as 'marginalising', as To calculate the evidence and sample the posterior we use the nested sampler PolyChord (Handley et al. 2015a,b).

REACH Pipeline
The REACH pipeline uses a framework for jointly modelling galactic foregrounds and correcting for chromaticity (Anstey et al. 2021).As the hexagonal dipole is not an achromatic beam, the antenna has a directivity pattern,  (Ω, ) which depends on the direction of the observation, Ω and the radio frequency, .This is then convolved with the time-dependent sky temperature,  sky (Ω, , ), at time of observation, , to get the observed antenna temperature, where σ is noise, assumed here to be uncorrelated Gaussian noise.
The observed sky temperature model used in the pipeline is made up three main components, where  fg (Ω, , ) is the galactic foreground temperature,  sg () is the temperature of the global 21-cm signal and  CMB = 2.73 K is the CMB temperature.Due to synchrotron radiation emitted by hot gas in the galaxy, the galactic foreground emission must be modelled as it is ∼ 10 4 times larger in magnitude than the 21-cm signal (Shaver et al. 1999).Since there are no foreground emission maps in the REACH band, the pipeline uses a global sky map (GSM) of antenna temperature at 230 MHz (De Oliveira-Costa et al. 2008).While the full REACH pipeline decomposes the sky into regions of uniform spectral index, to reduce computational time we simulate the sky as having a single spectral index,  = −2.55.The resulting sky map in the REACH band is then found as where  is fitted for as free parameter.We have defined the map using the 230 MHz GSM, although the 408 MHz map is an equally appropriate choice and produces very little difference in results.
The Bayesian evidence and posterior samples of the foreground and signal models are found using PolyChord, with uniform priors in the range 2.45844 <  < 3.14556, the full range of a spectral index map derived using the 230 MHz and 408 MHz GSMs (De Oliveira-Costa et al. 2008;Anstey et al. 2021).Testing of the pipeline is done using simulated data which is calculated using the 230 MHz GSM and the spectral index map.A Gaussian mock 21-cm global signal of the form where  21 is the signal amplitude,   , the centre frequency and  21 , the signal width, was added to the simulated data.As this is a similar shape to the physical 21-cm signal, this is a suitable analogue for testing purposes and is vastly less computationally expensive to model than other more physically motivated models.The signal parameters have uniform priors in the ranges 50 <   < 200 MHz, 10 <  21 < 20 MHz and 0 <  21 < 0.25 K.When testing the pipelines, they will be run with a 21cm signal with parameters  21 = 0.155 K,  21 = 15 MHz and   = 80 MHz.Gaussian noise with standard deviation σ = 0.1 K is added to the data to simulate the uncorrelated noise of the system.The REACH pipeline's method of fitting of the foreground parameter can make use of time dependent data in the pipeline (Anstey et al. 2022).The observation is split into   consecutive integrations, or time bins, which are measured in local sidereal time (LST) to match the observation to the stage of Earth's rotation.We hence define the likelihood like so, where  refers to the th frequency bin, and  to the th time bin.This likelihood will be referred as the 'standard pipeline' from here on.

Systematic Model
As discussed in section 1.1, we might expect to find systematics in the REACH system which are sinusoidal, generated by standing waves in the receiver or on the antenna.As a result we can model the general systematic we may expect to see as a damped sinusoid of the form where  0,sys = 50 MHz is the fiducial radio frequency of the systematic,  sys is the amplitude of the systematic,  sys is the period of the systematic,  sys is the phase of the systematic, and  sys is the dampening of the systematic (Scheutwinkel et al. 2022).In this paper the value of the dampening is fixed at  sys = 1.4.The model of the time-varying systematic we will consider in this paper is the case where a systematic is constant in phase, frequency and dampening but modulates its amplitude according to the incoming power from the galactic foreground.Here we define the amplitude of the systematic at time bin  as where  0 determines the radio frequency from which the foreground power is taken.Here, we will take  0 = 50 MHz.Currently the modulation is done monochromatically although it may be better to model the systematic by modulating each frequency bin separately in future.
Figure 4 shows this foreground-modulated systematic for 24 time bins of length 15 minutes.Here, the systematic shows a slight shift in amplitude as the foreground power increases over time.As there is no change in the phase or frequency of the systematic over time, the signal does not average down by any significant amount and as such its amplitude cannot be affected by increasing integration times.

Gaussian Processes
In order to account for the covariance in the model residuals introduced by the presence of a systematic in the data, we will use Gaussian processes (GPs) to build upon the standard REACH likelihood.Gaussian processes are non-parametric probabilistic methods of performing regression and forecasting that have found particular use for Bayesian time series regression (MacKay 2002;Roberts et al. 2013).
We define a GP as a collection of random variables which have consistent joint Gaussian distributions (Rasmussen 2004).These Gaussian distributions are defined by the covariance function, or kernel, of the Gaussian process where the choice of kernel is arbitrary, depending on the data you are trying to model.There is a wealth of literature into the variety of structures of kernel that can be used for GP regression, where their applicability depends on the problem that is trying to be solved.
The kernel, which we will use in this paper is the squared exponential, where ℓ is known as the characteristic length scale of the Gaussian process and  2 SE is the scale factor of the squared exponential kernel (Rasmussen & Williams 2006).
We choose this kernel as it has a simple form and describes a family of smooth functions, as seen in figure 3, of the form which we expect the systematic to take -particularly for the smoothly modulated systematics simulated here.There are many other kernel choices which are valid, for example a periodic kernel (Rasmussen & Williams 2006) which will be useful which the systematic is modulated by the periodic galactic foreground power.We could also introduce a 2D Gaussian process kernel to incorporate correlations between frequency bins.
The covariance matrix we construct is hence where  0,GP is the Gaussian signal noise and is equivalent to adding a white noise kernel to the squared exponential kernel.We set the prior on  0,GP to be a log uniform prior in the range 10 −4 ≤  0,GP ≤ 0.5 K, on  SE to be a log uniform prior in the range 0.01 ≤  SE ≤ 0.5 K and on ℓ to be a uniform prior in the range 100 ≤ ℓ ≤ 1000 minutes.The prior on the uncorrelated noise is taken from the standard REACH pipeline (Anstey et al. 2021) while the scale factor and characteristic length priors are informed by the amplitude and time variance respectively of the foreground-modulated systematic we insert into the data.This covariance matrix is then combined with equation 5, which will be referred to as the likelihood for the 'Gaussian process pipeline', L GP .Once the weighted mean hyperparameters, { 0,GP ,  SE , ℓ}, have been found using PolyChord and Anesthetic (Handley 2019), the mean regression line for a set of predicted times, t pred , using the observed data, {t data , T data }, can be found as and the covariance matrix of the predicted data given by

RESULTS
In this section we will demonstrate the results of injecting a simulated sinusoidal systematic into both the standard and Gaussian process pipelines and compare the results of the two.In section 3.1 we will demonstrate the improvements to recovery of the 21cm signal made by the GP pipeline, in section 3.2 we will see how systematics with different parameters affect the standard and GP pipelines, and in section 3.3 we will demonstrate a potential secondary use of the GP pipeline for regression of the time variation of the model residuals.

Signal Recovery
We first added a foreground modulated systematic over 24 time bins of length 15 minutes with an initial amplitude of  sys =  21 = 0.155 K, period of  sys = 0.5 21 = 7.5 MHz and phase of  sys = .
For the chosen time period the foreground modulated systematic amplitude varies monotonically, increasing to a final amplitude of ∼ 1.7 sys .The effects of adding a systematic to the data on the ability of the standard and GP pipelines to recover the signal can be seen in figure 5. Plotted in green is the true signal and the blue contours show the 1, 2 and 3 contours of the signal posterior plotted with fgivenx (Handley 2018).It can be seen that while the standard pipeline misses the signal by greater than 3, the GP pipeline has a much wider posterior, resulting in the pipeline capturing the signal within 1.
For the standard pipeline the mean noise parameter was  0,std = 0.0637 ± 0.0008 K.In the case of the GP pipeline, the mean hyperparameters were  0,GP = 0.0234 ± 0.0003 K,  SE = 0.064 ± 0.004 K and ℓ = 640 ± 60 minutes.This shows the ability of the GP pipeline to separate the uncorrelated Gaussian noise from the correlated systematic, something which could not be done with a standard Gaussian likelihood which absorbs both noise and systematic in the  0,std parameter.
When no systematic is added to the data both the standard and GP pipelines recovered the signal parameters to within 1.For the standard pipeline the mean noise parameter was  0,std = 0.0245 ± 0.0003 K.In the case of the GP pipeline, the mean hyperparameters were  0,GP = 0.0245 ± 0.0003 K,  SE = 0.0101 ± 0.0009 K and ℓ = 500±200 minutes.In this case the posterior of the  SE parameter has saturated at the lower end of its prior and can be assumed to either be very small or zero.Combining this with the fact that  0,std =  0,GP , this shows that the GP pipeline can be an important diagnostic tool to identify unmodelled systematics in the data, with the amplitude of the time-correlated noise parameter falling to zero in the absence of a systematic.

Varying Systematic Parameters
To test the limits of the robustness of the standard and GP pipelines to systematics we varied the parameters of the simulated systematic to see how the goodness-of-fit and parameter biases changed.We repeated the pipeline runs with systematics with initial amplitudes of  sys / 21 = {0.25,0.50, 0.75, 1.00, 1.25, 1.50, 1.75, 2.00}, periods of  sys / 21 = {0.5, 1.0, 1.5, 2.0, 2.5, 3.0, 3.5, 4.0} and phases of  sys = {0, 0.5, , 1.5}.For each of these parameter combinations, both the standard and GP pipelines were run with time-separated data with 24 time bins of length 15 minutes -equivalent to a single nights' observation.
We judge the goodness-of-fit of the pipeline fit using a root-mean-  (Handley 2018) for the standard pipeline (equation 11) and the Gaussian process pipeline (equations 5 and 15).In green is plotted the 'true' signal which was added into the simulated data.There is a foreground modulated systematic with an initial amplitude of  sys =  21 = 0.155 K, period of  sys = 0.5 21 = 7.5 MHz and phase of  sys =  present in the data.square error (RMSE) value.This is calculated as where  * sg,  and   are the signal posterior samples and their weights respectively,   is the number of frequency bins and  true, sg are the true signal parameters.
Figure 6 shows the root mean square error for the standard pipeline (lower row) and the Gaussian process pipeline (upper row).It can be seen in most cases that the GP pipeline improves the goodnessof-fit (lower RMSE).This can be attributed to a combination of the GP pipeline widening the signal posterior whilst also reducing the biasing on the fitted signal parameters when a systematic is present in the data.
To compare how the confidence in the ability of the pipeline to detect the global signal, we use the Bayes factor where Z GP is the Bayesian evidence given when the GP pipeline is run, and Z std is when the standard pipeline is run.This factor gives us the odds, 1 : K, that the data prefers GP pipeline over the standard pipeline (MacKay 1991).We see in figure 7, for all phases and most periods and amplitudes of the sinusoidal systematic, the Bayes factor is equal to or exceeds 20 indicating that the Gaussian process pipeline is highly favoured over the standard pipeline.In particular we find a minimum log K value of 4.8, corresponding to minimum betting odds of around 1 : 120 in favour of the GP pipeline's.We consider a Bayes factor over 2.5 to be a significant favouring and a Bayes factor over 5 to be a decisive favouring in line with the guidelines given by Jeffreys (1998).In the absence of a simulated systematic log K = −64.0,indicating a decisive preference for the standard pipeline over the GP pipeline -a difference which may be attributed to the Occam penalty of the Bayesian evidence penalising the extra two GP hyperparameters (Hergt et al. 2021).
We now look at the individual signal parameters to more clearly see how the widening posterior alongside the reduction in bias affects the error in the parameter estimation.Figures 8, 9 and 10 show the error in the recovery of the signal centre frequency,   , signal amplitude,  21 , and signal width  21 , respectively for different systematics.It can be seen that there is a marked improvement in the recovery of the signal parameters when using the GP pipeline.In particular, when the systematic period is smaller than twice the signal width the GP pipeline is able to recover the parameters to within 2 in most cases although has difficulty recovering  21 when  sys = 0.5 as the first trough of the systematic is close to the true signal centre frequency.For larger systematic periods there is a slight improvement in the RMSE but still miss the signal parameters by over 2, an effect which is likely caused by the troughs of the long-period sinusoids more closely representing the true Gaussian signal.In general varying the systematic amplitude has only a slight effect on the Gaussian process pipeline when the systematic period is low.
In order to discover the limits of the Gaussian process pipeline when run with large systematics, we also repeat the same parameter sweep with much greater systematic amplitudes of  sys / 21 = {4.0,8.0, 12.0, 16.0} and periods of  sys / 21 = {1.0,2.0, 3.0, 4.0}.The RMSE values of these systematics are shown in figure 11.It can be seen that when the systematic amplitude four times the signal amplitude or greater, the GP pipeline no longer provides an improvement and in some cases worsens the RMSE value.Analysis of the posterior distributions of the hyperparameters shows that for the largest amplitude systematics the correlated noise hyperparameter,  SE , saturates at the higher end of its prior, demonstrating the need to adjust the original priors should there be a very large systematic in the data.
Furthermore we can test the pipelines in the case where there is no global signal in the data but a Gaussian signal model is still being fitted for.Here we test the pipelines for systematics with amplitudes  sys /(0.155K) = {1.00,2.00, 3.00} and periods  sys /(15 MHz) = {1.0,2.0, 3.0}.Figure 12 shows the error in the fitted signal amplitude, where the true amplitude is 0 K.While the fit worsens somewhat at a phase of 0.5 for low periods, in general there is an improvement in the ability to identify a lack of global signal in the data to within 2 by the Gaussian process pipeline.This shows that by taking into account the time correlation of the systematics, the Gaussian pipeline is less likely to fit a trough of the sinusoidal systematic.

Gaussian Process Regression
Gaussian processes also enable investigation of systematic structure over time by allowing the calculation of a regression line for the model residuals.Using equations 16 and 17 we get a mean temperature of the model residuals with time, indicating trends in the amplitude of the systematic.
Figure 13 shows the Gaussian process time regression for the 85 MHz frequency bin for data with 24 time bins of length 15 minutes.A foreground-modulated systematic with an initial amplitude of  sys = 0.209 K was added to the data.The black points are the residual temperatures after the mean foreground and signal models had been subtracted.The red dot-dashed line in the true signal amplitude and given by equation 13.The blue line is the GP regression line determined using equation 16 and the blue shaded area is the ±1 error determined using equation 17.
While the regression line doesn't capture the same shape of the true systematic, the true line is within error of the regression and the GP captures the behaviour that the systematic is increasing with time.As the data is noisy it is unlikely that the GP will be able to capture the exact details of the systematic modulation.
The method outlined in this paper is general and only assumes that the systematic varies with time, we can use the GP regression line to infer how we would expect the systematic to change with time.In future work where we may model and fit for time-varying systematics, continuing on from the time-averaged work of Scheutwinkel et al. (2022), the GP regression is a useful base to inform the choice of model.

CONCLUSIONS
In this paper we presented a new method of mitigating the effects of unmodelled systematics when fitting for the global 21-cm signal in radio cosmology experiment data.By using a squared exponential Gaussian process kernel to fit for the correlations between time bins in the model residuals we are able to identify and mitigate the effects of time-varying residual systematics in the data.
We found that the Gaussian process pipeline was able to account for the presence of residual systematics in the data by widening the signal posteriors, reflecting our increased uncertainty in the signal parameters.We can use the squared-exponential kernel scale,  SE , to identify the presence of systematics in the data as its value will be non-zero should there be a systematic.This demonstrates the method's power as a diagnostic tool.
Furthermore we saw that the Gaussian process pipeline generally improved the goodness-of-fit, as measured using a root mean square value, with a 5% improvement in RMSE values on average for the systematics we tested.Comparing the fitted signal parameters with the true values demonstrated that the GP pipeline reduces the biasing for systematics with a period less than twice the signal width.In many cases we found that the GP pipeline can recover parameters to within 1 despite the standard pipeline parameter fits being further than 2 from the true value.This can mainly be attributed to the widening of the posterior.Further work will including using the Bayesian evidence to compare Gaussian Process kernels, as comparisons to other kernels could be beneficial to our understanding of the systematics.For example, a periodic kernel (Rasmussen & Williams 2006) could be useful when the systematic is modulated by the galactic foreground power as it is expected to vary periodically on a 24 hour timescale.A 2D Gaussian process could also be used to introduce correlations between frequency bins as well as the time bins.

Figure 1 .
Figure 1.A schematic of the REACH antenna and receiver.Adapted from Cumner et al. (2022).

Figure 2 .
Figure 2. The REACH antenna and ground plane in the Karoo desert in South Africa.The size of the ground plane is 25m by 25m with a 5m serrations and is built 1m above the ground.

Figure 3 .
Figure 3. Example of eight functions drawn from the squared exponential Gaussian Process kernel given by equation 14, with a scale factor of 0.01 K and a covariance length of 600 minutes.

Figure 4 .
Figure 4. Example of a damped sinusoidal systematic given by equations 12 and 13 which is modulated by the incoming power from the galactic foreground over 24 time bins of length 15 minutes.The initial amplitude of the systematic was set to  sys = 0.209 K.

Figure 5 .
Figure 5.Comparison of the recovered signal posteriors plotted with fgivenx(Handley 2018) for the standard pipeline (equation 11) and the Gaussian process pipeline (equations 5 and 15).In green is plotted the 'true' signal which was added into the simulated data.There is a foreground modulated systematic with an initial amplitude of  sys =  21 = 0.155 K, period of  sys = 0.5 21 = 7.5 MHz and phase of  sys =  present in the data.

Figure 6 .Figure 7 .
Figure 6.Weighted root mean square error (equation 18) of the pipeline fits for the standard pipeline (lower row) and the Gaussian processes pipeline (upper row) for different systematic parameters.

Figure 8 .Figure 9 .
Figure 8. Error in the fitted value of the centre frequency of the Gaussian signal model for the standard pipeline (lower row) and the Gaussian processes pipeline (upper row) for different systematic parameters.

Figure 10 .Figure 11 .
Figure 10.Error in the fitted value of the full-width half maximum of the Gaussian signal model for the standard pipeline (lower row) and the Gaussian processes pipeline (upper row) for different systematic parameters.

Figure 12 .
Figure12.Distance from zero of the fitted value of the amplitude of the Gaussian signal model when there is no global signal in the data for the standard pipeline (lower row) and the Gaussian processes pipeline (upper row) for different systematic parameters.

Figure 13 .
Figure 13.Example of Gaussian process time regression (equations 16 and 17) for the 85 MHz frequency bin for data with a systematic added.The blue line is the mean Gaussian process and the shaded area is the ±1 error.The red dot-dashed line is the true systematic amplitude over time.The initial amplitude of the systematic was set to  sys = 0.209 K.