Extracting the Global 21-cm signal from Cosmic Dawn and Epoch of Reionization in the presence of Foreground and Ionosphere

Detection of redshifted \ion{H}{i} 21-cm emission is a potential probe for investigating the Universe's first billion years. However, given the significantly brighter foregrounds, detecting the 21-cm signal is observationally difficult. The Earth's ionosphere considerably distorts the signal at low frequencies by introducing direction-dependent effects. Here, for the first time, we report the use of Artificial Neural Networks (ANNs) to extract the global 21-cm signal characteristics from the composite all-sky averaged signal, including foreground and ionospheric effects such as refraction, absorption, and thermal emission from the ionosphere's F and D layers. We assume a 'perfect' instrument and neglect instrumental calibration and beam effects. To model the ionospheric effect, we consider static and time-varying ionospheric conditions for the mid-latitude region where LOFAR is situated. We train the ANN model for various situations using a synthetic set of global 21-cm signals created by varying the parameters of the $\rm \tanh$ parameterized model with the Accelerated Reionization Era Simulations (ARES) code. The results show that the ANN model can extract the global signal parameters with an accuracy of $\ge 96 \%$ in the final study, in which we include foreground and ionospheric effects. Likewise, a similar ANN model can extract the signal parameters from the final prediction data set with an accuracy ranging from $97 \%$ to $98 \%$ when considering more realistic sets of global 21-cm signals based on physical models.


INTRODUCTION
The period from the beginning of star and galaxy formation [Cosmic Dawn (CD)] until the Universe changed from a fully neutral to a fully ionized state, i.e., the Epoch of Reionization (EoR), is still observationally unexplored. Detection of the redshifted H i 21-cm line is regarded as one of the most promising future probes of the Universe at these redshifts (z ≈ 7 − 30) (Furlanetto et al. 2006a; Pritchard & Loeb 2012). The H i 21-cm line arises from the hyperfine splitting of the 1S ground state. Studying these epochs can answer many essential cosmological questions, such as the properties of the early galaxies, the physics of mini-quasars, the formation of very metal-poor stars, and other major research topics on the origin and evolution of the Universe. A primary science goal of upcoming radio telescopes like the SKA is to study these extended epochs of the Universe's structure formation history. In past decades, significant progress has been made in the theoretical modelling of the expected redshifted 21-cm signal. There are two different experimental techniques for observing this signal (Pritchard & Loeb 2012; Harker et al. 2010; Shaver et al. 1999): (a) using several dishes and large interferometric arrays at very low radio frequencies to obtain statistical power spectra of the H i 21-cm fluctuations, for example, the Giant Metrewave Radio Telescope (GMRT; Swarup et al. 1991), the Hydrogen Epoch of Reionization Array (HERA; DeBoer et al. 2017), the Low Frequency Array (LOFAR; van Haarlem et al. 2013), the Murchison Widefield Array (MWA; Tingay et al. 2013), the Square Kilometre Array (SKA; Koopmans et al. 2015), etc.; (b) using a single radiometer to observe the sky-averaged signature of the redshifted H i 21-cm line, for example, the Broadband Instrument for Global Hydrogen Reionization Signal (BIGHORNS; Sokolowski et al. 2015), the Large-Aperture Experiment to Detect the Dark Ages (LEDA; Greenhill & Bernardi 2012), the Experiment to Detect the Global EoR Signature (EDGES; Bowman et al. 2018), the Shaped Antenna measurement of the background Radio Spectrum (SARAS; Patra et al. 2013; Singh et al. 2017), etc.
★ E-mail: anshumantripathi85@gmail.com
Recently, the EDGES team announced a probable detection of the sky-averaged H i 21-cm global signal from Cosmic Dawn. The measured signal had an absorption trough about twice as deep as expected in the standard cosmological model (Bowman et al. 2018). However, this claimed detection has been challenged by an independent experiment, SARAS (Singh et al. 2022). This disagreement between independent ground-based measurements of the global 21-cm signal further highlights the challenges involved. One reason the signal is so difficult to detect is that it is very faint. It is buried beneath bright galactic and extragalactic foregrounds, which are approximately $10^4$ to $10^6$ times brighter than the signal. Furthermore, human-made radio frequency interference (RFI), mainly in the FM band, and the Earth's ionosphere also affect ground-based observations. The ionosphere significantly distorts low-frequency signals passing through it.
The ionosphere is the uppermost layer of the atmosphere, extending from ∼ 50 to ∼ 600 km above the Earth's surface. Solar activity significantly affects the electron density in the ionosphere. The ionosphere introduces three significant effects when detecting the redshifted global 21-cm signal with a ground-based antenna: it refracts all radio waves, including galactic and extragalactic foregrounds, attenuates any trans-ionospheric signal, and emits thermal radiation (Pawsey et al. 1951; Steiger & Warwick 1961). Further, because the ionosphere is driven by solar activity, these effects are inherently time variable (Evans & Hagfors 1968; Davies 1990). These ionospheric effects scale as $\nu^{-2}$, where $\nu$ is the observing frequency, so they grow stronger as the observing frequency decreases. Consequently, ionospheric effects have a stronger influence on the global signal from the Cosmic Dawn and the Dark Ages ($z \ge 15$) than on the signal from the Epoch of Reionization ($15 \ge z \ge 6$) (Datta et al. 2016).
The effects of static ionospheric refraction and absorption on ground-based observations between 30 and 100 MHz were previously examined by Vedantham et al. (2014). Datta et al. (2016) presented dynamic ionospheric effects such as refraction, absorption, and thermal emission, and demonstrated how these combined effects affect the global 21-cm signal from the Epoch of Reionization and Cosmic Dawn when observed from the ground. Shen et al. (2021, 2022) recently investigated the chromatic ionospheric effects on global 21-cm data by modelling the two principal ionospheric layers, the F and D layers, with a simplified spatial model with temporal variance. Their investigation focuses on the chromatic distortions induced by the ionosphere.
Several studies in recent years have used machine learning (ML) techniques for signal parameter estimation or signal modelling. Shimabukuro & Semelin (2017) and Jennings et al. (2019) employed machine learning techniques to predict 21-cm power spectrum parameters. Similarly, Choudhury et al. (2022) extended the ANN to extract the 21-cm power spectrum and the corresponding EoR parameters from synthetic observations for different telescope models. Schmit & Pritchard (2018) used an Artificial Neural Network (ANN) to emulate the 21-cm power spectrum for a wide range of parameters. Similarly, Tiwari et al. (2022) developed an ANN-based emulator for the signal bispectrum, which they further used to estimate signal parameters via a Bayesian inference pipeline. The global 21-cm signal from Cosmic Dawn and EoR has also been emulated using ANNs by Cohen et al. (2020), Bevins et al. (2021b), and Bye et al. (2022). Convolutional Neural Networks (CNNs) have been used to identify reionization sources from 21-cm maps (Hassan et al. 2019). Chardin et al. (2019) and Korber et al. (2023) used deep learning models to emulate 21-cm maps directly from the dark matter distribution. Gillet et al. (2019) used deep learning with CNNs to predict astrophysical parameters directly from 21-cm maps. Zhao et al. (2022) used CNNs to estimate parameters and infer posteriors from 3D tomographic 21-cm images. An ANN model that can extract the astrophysical parameters of the 21-cm signal from mock observation data sets, including the effects of foregrounds, instruments, and noise, was developed and presented by Choudhury et al. (2020, 2021). The relevance of non-parametric techniques for this purpose has already been demonstrated in several previous studies (Harker et al. 2009; Tauscher et al. 2018). These studies have shown that using a simple parametric technique for signal and foreground subtraction can result in over-subtraction, leading to loss of the signal.
In this paper, we use ANNs to extract the global 21-cm signal parameters, along with foreground and ionospheric parameters, from the composite all-sky averaged signal containing foreground and ionospheric effects. This study assumes a perfect instrument, an ideal scenario in which the instrument does not modify the signal. In the first case study, we use the tanh parametrized model and the Accelerated Reionization Era Simulations (ARES) code (Mirocha et al. 2012) to construct the cosmological signal. We use the log(T) − log($\nu$) polynomial model to describe the bright, dominant foregrounds. According to Harker et al. (2016), a 3rd- or 4th-order polynomial is sufficient to describe the sky spectrum, whereas Bernardi et al. (2015) demonstrated that a 7th-order polynomial is required once the chromatic primary beam of the antenna is included. We follow Datta et al. (2016) to add the ionospheric effects to the simulated signal and foreground. We consider three main effects: refraction, absorption, and thermal emission, all of which depend on the total electron content (TEC) and the electron temperature ($T_e$) of the ionospheric layers. These ionospheric effects introduce two more parameters into our training data sets. To check the robustness and reliability of the model, we apply a small variation to TEC and $T_e$ when generating our final training data set. To test the robustness further, we use a more realistic set of global 21-cm signals presented in Choudhury et al. (2021) instead of parametrized global 21-cm signals. These signals have different parameters from the tanh parametrized global signal.
In Section 2 of this paper, we briefly review the 21-cm signal. Section 3 describes the foreground model that we use for the galactic and extragalactic sources. Section 4 discusses the ionospheric effects and their impact on global 21-cm signal observations. Section 5 gives a brief overview of ANNs and the metrics we use to evaluate the performance of our ANN model. Section 6 describes the methodology and procedures used to simulate the global 21-cm signal, foreground, and ionospheric effects for training and testing the ANN model. We present the results obtained by our model for all the cases in Section 7. In the last section, we summarize our work and discuss the implications of our predictions.

GLOBAL 21-CM SIGNAL
The 21-cm line of neutral hydrogen arises from the hyperfine splitting of the 1S ground state caused by the interaction of the magnetic moments of the proton and electron. The measurable quantity is the differential brightness temperature, $\delta T_b$, which is measured relative to the Cosmic Microwave Background (CMB) (Furlanetto et al. 2006a):

$$\delta T_b \approx 27\, x_{\rm HI}\,(1+\delta_b)\left(\frac{\Omega_b h^2}{0.023}\right)\left(\frac{0.15}{\Omega_m h^2}\,\frac{1+z}{10}\right)^{1/2}\left(1-\frac{T_\gamma(z)}{T_s}\right)\left[\frac{\partial_r v_r}{(1+z)H(z)}\right]^{-1}\ {\rm mK}, \tag{2}$$

where $x_{\rm HI}$ is the hydrogen neutral fraction, $\delta_b$ is the fractional over-density in baryons, $\Omega_m$ and $\Omega_b$ are the total matter density and baryon density, respectively, $H(z)$ is the Hubble parameter, $T_\gamma(z)$ is the CMB temperature at redshift $z$, $T_s$ is the spin temperature, and $\partial_r v_r$ is the velocity gradient along the line of sight.
The 21-cm global signal is a sky-averaged signal that carries information on global cosmic events. It traces the history of the radiation backgrounds: UV radiation, which ionizes neutral hydrogen; X-rays, which heat the gas and raise $T_K$; and Ly$\alpha$ photons, which are responsible for the Wouthuysen-Field coupling (Wouthuysen 1952). In this study, the peculiar velocity and density fluctuation terms in Eq. (2) are neglected for the global signal, since they average out to linear order and amount to a minor correction. As a result, the density, neutral fraction, and spin temperature determine the form of the global signal (Mirocha 2014).
To construct the global signal, we primarily use Eq. (3). The spin temperature influences the detectability of the 21-cm signal. Three main processes determine the spin temperature: (i) absorption and emission of 21-cm photons from the CMB radiation; (ii) collisions with other hydrogen atoms, free electrons, and protons; (iii) scattering of Ly$\alpha$ photons, which causes a spin flip through an intermediate excited state. In this limit, the sky-averaged spin temperature is defined as (Field 1959):

$$T_s = \frac{T_\gamma + y_\alpha T_\alpha + y_k T_K}{1 + y_\alpha + y_k}, \tag{4}$$

where $T_\gamma$ is the temperature of the radio background, primarily the CMB, $T_\alpha$ is the color temperature of the ambient Lyman-alpha photons, and $T_K$ is the kinetic gas temperature. $y_\alpha$ and $y_k$ are the coupling coefficients for Lyman-alpha scattering and atomic collisions, respectively.
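As a rough illustration of how the spin temperature sets the sign and depth of the sky-averaged signal, the sketch below evaluates the Field (1959) weighted mean, $T_s = (T_\gamma + y_\alpha T_\alpha + y_k T_K)/(1 + y_\alpha + y_k)$, and the standard sky-averaged brightness temperature expression of Furlanetto et al. (2006a) with the density and velocity terms dropped. The function names, the default density parameters (chosen so that the prefactor reduces to 27 mK at z = 9), and all input values are illustrative assumptions, not values used in this work.

```python
import math

def spin_temperature(t_gamma, t_alpha, t_k, y_alpha, y_k):
    """Field (1959) weighted mean of the radio-background, Ly-alpha colour
    and kinetic temperatures (all in K)."""
    return (t_gamma + y_alpha * t_alpha + y_k * t_k) / (1.0 + y_alpha + y_k)

def global_dtb_mk(z, x_hi, t_s, t_gamma, omega_b_h2=0.023, omega_m_h2=0.15):
    """Sky-averaged differential brightness temperature in mK, neglecting
    density and peculiar-velocity fluctuations."""
    prefactor = 27.0 * (omega_b_h2 / 0.023) * math.sqrt(
        (0.15 / omega_m_h2) * (1.0 + z) / 10.0)
    return prefactor * x_hi * (1.0 - t_gamma / t_s)

z = 17.0
t_cmb = 2.725 * (1.0 + z)   # CMB temperature at redshift z
t_gas = 10.0                # hypothetical cold-gas temperature in K
# Strong Ly-alpha coupling (hypothetical y_alpha) pulls T_s towards the gas temperature.
t_s = spin_temperature(t_cmb, t_alpha=t_gas, t_k=t_gas, y_alpha=50.0, y_k=0.0)
print(f"T_s = {t_s:.2f} K, dT_b = {global_dtb_mk(z, 1.0, t_s, t_cmb):.1f} mK")
```

With $T_s$ below the CMB temperature the signal appears in absorption (negative $\delta T_b$); it vanishes when $T_s = T_\gamma$.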

FOREGROUND
The bright foregrounds, mostly caused by galactic and extragalactic sources, are the greatest obstacle to observing the global 21-cm signal from the CD-EoR. The radio emission from these sources is substantially brighter than the global 21-cm signal. We use a simple log-polynomial model, log(T) − log($\nu$), to describe the foreground, based on Pritchard & Loeb (2012) and Bernardi et al. (2015). In our study, we restrict the foreground model to a 3rd-order polynomial in log(T) − log($\nu$), following Harker et al. (2016) and Choudhury et al. (2020), which describes the diffuse foregrounds:

$$\log T_{\rm FG}(\nu) = \sum_{i=0}^{n} a_i \left[\log\left(\frac{\nu}{\nu_0}\right)\right]^{i}, \tag{5}$$

where $a_0, a_1, a_2, \ldots, a_n$ denote the foreground model parameters and $\nu_0$ is an arbitrary reference frequency. In this study, the foreground parameter values $(a_0, a_1, a_2, a_3) = (3.3094, −2.42096, −0.08062, 0.02898)$ are taken from Harker (2015) and de Oliveira-Costa et al. (2008), and the reference frequency is $\nu_0 = 80$ MHz, following Choudhury et al. (2020). We vary these foreground parameters to construct different realizations of the foreground (see Tab. 3).
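A minimal sketch of the log-polynomial foreground, using the fiducial coefficients and the 80 MHz reference frequency quoted above. The function name is hypothetical, and the use of base-10 logarithms is an assumption of this sketch (with base 10, the fiducial $a_0$ corresponds to roughly 2000 K at 80 MHz); the text does not state the base.

```python
import math

# Fiducial coefficients and reference frequency quoted in the text.
A_FID = (3.3094, -2.42096, -0.08062, 0.02898)
NU0_MHZ = 80.0

def foreground_temperature(nu_mhz, coeffs=A_FID, nu0_mhz=NU0_MHZ):
    """Evaluate log10(T_FG) = sum_i a_i * [log10(nu / nu0)]**i and return T_FG in K.
    Base-10 logarithms are an assumption of this sketch."""
    x = math.log10(nu_mhz / nu0_mhz)
    log_t = sum(a * x**i for i, a in enumerate(coeffs))
    return 10.0 ** log_t

for nu in (50.0, 80.0, 120.0):
    print(f"{nu:6.1f} MHz -> {foreground_temperature(nu):9.1f} K")
```

Different foreground realizations can be generated by passing perturbed coefficient tuples through the `coeffs` argument.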

IONOSPHERIC EFFECTS
The ionosphere is a region of the Earth's atmosphere with a high concentration of electrically charged atoms and molecules. The Sun is one of the most powerful energy sources in the Solar System. Its intense ultraviolet (UV) and X-ray radiation interacts with the Earth's atmosphere and creates the ionosphere through photo-ionization. The electron density and temperature change significantly with solar activity. Any electromagnetic signal travelling through an optically thin medium, such as the ionosphere, obeys the radiative transfer equation. To understand these effects, the ionosphere is divided into layers: the D-layer (60 − 90 km), the E-layer, and a composite F-layer (160 − 600 km) (Datta et al. 2016).

F-layer refraction
The F-layer of the ionosphere, located between ∼ 200 and ∼ 400 km above the Earth's surface, accounts for most of the ionospheric electron column density. Outside this layer, the electron density is very low in comparison. Although the electron density varies within the F-layer, we treat it as a homogeneous shell between 200 and 400 km in height with a constant electron density of ∼ $5 \times 10^{11}$ electrons/m$^3$, which results in a column density of 10 TEC units (Vedantham et al. 2014; Datta et al. 2016). Due to the density difference between the layers, any incoming ray undergoes Snell refraction at the boundaries of the F-layer. The ionosphere acts like a spherical lens, deflecting incoming rays towards the zenith (Vedantham et al. 2014; Datta et al. 2016). As a result of this refraction, a radio antenna on the ground captures the signal from a wider area of the sky, resulting in a higher antenna temperature.
The angular deviation experienced by an incoming ray at angle $\theta$ to the horizon in the parabolic layer, which is surrounded by free space with refractive index $\eta = 1$, can be calculated as follows (Bailey 1948; Vedantham et al. 2014; Datta et al. 2016):

$$\delta\theta(\nu, \theta) = \frac{2d}{3R_E}\left(\frac{\nu_p}{\nu}\right)^{2}\left(1+\frac{h_m}{R_E}\right)\left(\sin^2\theta + \frac{2h_m}{R_E}\right)^{-3/2}\cos\theta, \tag{6}$$

where $R_E = 6378$ km is the Earth's radius, $h$ is the altitude, $h_m = 300$ km is the altitude at which the electron density in the F-layer is maximum, $d$ is the change in altitude with respect to $h_m$ at which the electron density falls to zero, which is 200 km in our simulation, and $\nu_p$ is the plasma frequency (Thompson et al. 2001).
As seen from Eq. (6), the ionospheric refraction is proportional to $\nu^{-2}$, with the greatest deviation occurring for the horizon ray, which has an incidence angle of 0. As a result of this refraction, the field of view at a given observing frequency is larger than the primary beam of the antenna. The angular deviation caused by ionospheric refraction is shown in Fig. (1a), and the resulting increase in the field of view (FoV) is plotted against frequency ($\nu$) in Fig. (1b). The resultant antenna temperature, which includes ionospheric refraction, is described by Vedantham et al. (2014) as:

$$T^{\rm iono}_{\rm sky}(\nu, t; \Theta_0, \Phi_0) = \int {\rm d}\Omega\; B'(\nu, t; \Theta-\Theta_0-\delta\theta(t), \Phi)\; T_{\rm sky}(\nu, \Theta, \Phi), \tag{7}$$

where $T^{\rm iono}_{\rm sky}$ is the antenna temperature that accounts for ionospheric refraction, $(\Theta_0, \Phi_0)$ is the pointing centre, $B'(\nu, t; \Theta-\Theta_0-\delta\theta(t), \Phi)$ describes the modified field of view caused by the ionosphere's refractive effect, and $T_{\rm sky}(\nu, \Theta, \Phi)$ denotes the actual sky temperature, which includes the signal and the foreground.
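To give a feel for the size of the refraction, the sketch below evaluates the Bailey (1948) deviation formula as quoted by Vedantham et al. (2014), $\delta\theta = \frac{2d}{3R_E}(\nu_p/\nu)^2 (1 + h_m/R_E)(\sin^2\theta + 2h_m/R_E)^{-3/2}\cos\theta$, with the layer geometry and electron density given in the text. The function names are hypothetical, and the plasma frequency is obtained from the standard approximation $\nu_p \approx 8.98\,{\rm Hz}\,\sqrt{n_e/{\rm m^{-3}}}$, which is an addition of this sketch rather than a step described in the text.

```python
import math

R_E_KM = 6378.0   # Earth radius (from the text)
H_M_KM = 300.0    # altitude of peak F-layer electron density (from the text)
D_KM = 200.0      # distance from h_m at which the density falls to zero (from the text)

def plasma_frequency_mhz(n_e_per_m3):
    """Electron plasma frequency, nu_p ~ 8.98 Hz * sqrt(n_e / m^-3), in MHz."""
    return 8.98e-6 * math.sqrt(n_e_per_m3)

def refraction_angle_rad(nu_mhz, theta_rad, nu_p_mhz):
    """Angular deviation of a ray arriving at elevation theta above the horizon."""
    geometry = (1.0 + H_M_KM / R_E_KM) * math.cos(theta_rad) / (
        math.sin(theta_rad) ** 2 + 2.0 * H_M_KM / R_E_KM) ** 1.5
    return (2.0 * D_KM / (3.0 * R_E_KM)) * (nu_p_mhz / nu_mhz) ** 2 * geometry

nu_p = plasma_frequency_mhz(5e11)   # F-layer density quoted in the text
for nu in (40.0, 80.0, 120.0):
    dtheta = math.degrees(refraction_angle_rad(nu, 0.0, nu_p))
    print(f"{nu:5.1f} MHz: horizon-ray deviation = {dtheta:.3f} deg")
```

Halving the observing frequency quadruples the deviation, and the deviation vanishes for a ray arriving from the zenith.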

D-layer absorption and thermal emission
The D-layer is the lowest layer of the ionosphere, extending from ∼ 60 to ∼ 90 km above the Earth's surface (Vedantham et al. 2014). Due to solar insolation, high electron concentrations in the D-layer are expected only during the daytime. At night-time, residual electron concentrations are of the order of ∼ $10^8$ electrons/m$^3$.
The high concentration of atmospheric gas in the D-layer at these heights results in significant electron collision frequencies, which cause radio wave attenuation (Evans & Hagfors 1968; Davies 1990). The absorption by the D-layer is given by Eq. (8), following Evans & Hagfors (1968) and Datta et al. (2016), where ${\rm TEC}_D$ denotes the electron column density of the D-layer and $\tau$ the optical depth.
The D-layer is also responsible for thermal emission (Pawsey et al. 1951; Steiger & Warwick 1961; Hsieh 1966; Datta et al. 2016), which enters the final expression as the term $\tau(\nu, {\rm TEC}_D(t))\,\langle T_e \rangle$ [see Eq. (9)]. Here $\tau(\nu, {\rm TEC}_D(t))$ is the optical depth of the corresponding ionosphere, and $\langle T_e \rangle$ is the average electron temperature responsible for the thermal radiation. In our simulations, we consider the mid-latitude ionosphere and take a D-layer electron temperature of $T_e = 800$ K (Zhang et al. 2004). We calculated the attenuation factor and thermal emission for the mid-latitude ionosphere and plotted them against frequency ($\nu$) in Fig. (2a) and Fig. (2b). Both the attenuation and the thermal emission increase towards lower frequencies.
Finally, the brightness temperature of the radio signal recorded by a ground-based radio antenna in the presence of all three ionospheric effects is given by Eq. (9), following Datta et al. (2016), where $T^{\rm iono}_{\rm Ant}$ is the effective brightness temperature captured by a ground-based antenna, $T^{\rm iono}_{\rm sky}$ denotes the sky brightness temperature modified by ionospheric refraction, and $(\Theta_0, \Phi_0)$ is the pointing centre.
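The interplay of D-layer absorption and thermal emission can be illustrated with the first-order form of radiative transfer through an optically thin layer, $T_{\rm out} \approx T_{\rm in}(1 - \tau) + \tau \langle T_e \rangle$. This is a generic sketch under that assumption, not a reproduction of the exact attenuation term of this work; the function name and the numerical values of the optical depth and sky temperature are hypothetical.

```python
def antenna_temperature(t_sky_iono_k, tau, t_e_k=800.0):
    """Optically thin radiative transfer through the D-layer:
    attenuated sky term plus thermal emission tau * <T_e>.
    The default T_e = 800 K is the D-layer value quoted in the text."""
    if not 0.0 <= tau < 1.0:
        raise ValueError("this first-order form assumes 0 <= tau < 1")
    return t_sky_iono_k * (1.0 - tau) + tau * t_e_k

# Hypothetical numbers chosen only to show the two competing terms.
t_sky = 2000.0   # K, refracted sky temperature
for tau in (0.0, 0.005, 0.02):
    print(f"tau = {tau:5.3f}: T_ant = {antenna_temperature(t_sky, tau):8.2f} K")
```

Because the sky is much hotter than the D-layer electrons at these frequencies, the net effect in this sketch is a small reduction of the recorded temperature that grows with the optical depth.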

BASIC OVERVIEW OF ARTIFICIAL NEURAL NETWORK
An ANN is a computational algorithm inspired by biological neural networks. A basic neural network architecture is made up of three kinds of layers: an input layer, one or more hidden layers, and an output layer. The number of hidden layers defines the network depth, and the number of neurons in those layers determines the network width. In a feed-forward network, each neuron in a layer is connected to every neuron in the next layer, and the information flow is unidirectional. The connections between neurons carry weights and biases (Agatonovic-Kustrin & Beresford 2000).
To describe the structure of the ANN, we follow Shimabukuro & Semelin (2017) and Choudhury et al. (2020). Consider a training input with $n$ elements $(x_1, x_2, \ldots, x_n)$. Each element is fed to a particular neuron in the input layer: for example, $x_j$ is provided to the $j$th input neuron, which is connected to the $i$th neuron of the next (hidden) layer with an associated weight $w^{(1)}_{ij}$ and bias $b^{(1)}_i$. In general, this can be described as:

$$z_i = \sum_{j=1}^{n} w^{(1)}_{ij}\, x_j + b^{(1)}_i. \tag{10}$$

In the hidden layer, the output from Eq. (10) is activated by a non-linear activation function $h$, such that $y_i = h(z_i)$. The final output $Y'_i$, which is a linear combination of the activated outputs of the hidden-layer neurons with weights $w^{(2)}_{ij}$ and biases $b^{(2)}_i$, can be described as (Shimabukuro & Semelin 2017):

$$Y'_i = \sum_{j} w^{(2)}_{ij}\, y_j + b^{(2)}_i. \tag{11}$$

After each forward pass, a cost or error function is computed at the output layer. This cost function is minimized during training by iteratively back-propagating the errors. We define the total loss (cost) function of the network as follows:

$$E = \frac{1}{2}\sum_{k=1}^{N_t}\sum_{i=1}^{N}\left(Y'_{i,k} - Y_{i,k}\right)^2, \tag{12}$$

where $N_t$ represents the number of training samples, $N$ represents the number of output data elements, $Y'$ denotes the prediction by the ANN, and $Y$ denotes the actual output feature. The weights ($w$) and biases ($b$) are updated at the end of each training epoch by gradient descent:

$$w_{ij} = w^{0}_{ij} - \eta\,\frac{\partial E}{\partial w_{ij}}, \qquad b_{i} = b^{0}_{i} - \eta\,\frac{\partial E}{\partial b_{i}}, \tag{13}$$

where $w^{0}_{ij}$ and $b^{0}_{i}$ represent the initial weights and biases, respectively, and $\eta$ is the learning rate. We implemented our feed-forward network in Python using the Sequential model from the Keras API, together with standard scikit-learn (Pedregosa et al. 2011) and Keras modules. We pick the number of hidden layers and the number of neurons so as to obtain optimum network performance. The number of neurons in the output layer equals the number of output parameters we want to predict. The ANN architectures employed in our study are discussed in detail in the following sections.
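The forward pass and the gradient-descent update described above can be sketched for a toy network with a single hidden layer, using only the Python standard library. The layer sizes, the sigmoid activation, the learning rate, the squared-error loss $E = \frac{1}{2}\sum_i (Y'_i - Y_i)^2$, and all names are assumptions made for illustration; this is not the Keras model used in this work.

```python
import math
import random

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w1, b1, w2, b2):
    """One hidden layer: z = W1 x + b1, y = h(z), Y' = W2 y + b2."""
    y = [sigmoid(sum(w * xj for w, xj in zip(row, x)) + b) for row, b in zip(w1, b1)]
    out = [sum(w * yj for w, yj in zip(row, y)) + b for row, b in zip(w2, b2)]
    return y, out

def train_step(x, target, w1, b1, w2, b2, lr=0.1):
    """One gradient-descent update on E = 0.5 * sum_i (Y'_i - Y_i)^2.
    Returns the loss evaluated before the update."""
    y, out = forward(x, w1, b1, w2, b2)
    err = [o - t for o, t in zip(out, target)]   # dE/dY'
    # Back-propagate to the hidden layer before the output weights change.
    delta = [yk * (1.0 - yk) * sum(err[i] * w2[i][k] for i in range(len(err)))
             for k, yk in enumerate(y)]
    for i, e in enumerate(err):
        for k, yk in enumerate(y):
            w2[i][k] -= lr * e * yk
        b2[i] -= lr * e
    for k, d in enumerate(delta):
        for j, xj in enumerate(x):
            w1[k][j] -= lr * d * xj
        b1[k] -= lr * d
    return 0.5 * sum(e * e for e in err)

random.seed(0)
n_in, n_hid, n_out = 4, 3, 2   # toy sizes, not the architecture of this work
w1 = [[random.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_hid)]
w2 = [[random.uniform(-0.5, 0.5) for _ in range(n_hid)] for _ in range(n_out)]
b1, b2 = [0.0] * n_hid, [0.0] * n_out
x, target = [0.2, -0.1, 0.4, 0.3], [0.5, -0.2]
losses = [train_step(x, target, w1, b1, w2, b2) for _ in range(200)]
print(f"loss: {losses[0]:.4f} -> {losses[-1]:.2e}")
```

In practice the same updates are handled by the optimizer of a library such as Keras; the explicit loops here only make the chain rule visible.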

R 2 and RMSE Scores
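We choose the $R^2$ and root mean square error (RMSE) scores as metrics to evaluate network performance. The $R^2$ and RMSE scores are obtained for each parameter from the predictions on the test set. The $R^2$ score is defined as:

$$R^2 = 1 - \frac{\sum \left(y_{\rm pred} - y_{\rm ori}\right)^2}{\sum \left(y_{\rm ori} - \bar{y}\right)^2}, \tag{14}$$

where $y_{\rm ori}$ and $y_{\rm pred}$ are the original and predicted values of the parameter, $\bar{y}$ is the average of the original parameter, and the sum runs over the whole test set. A score of $R^2 = 1$ denotes a flawless inference of the parameters across the whole test set, whereas $R^2$ generally ranges between 0 and 1.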
We follow Shimabukuro & Semelin (2017) to calculate the normalized RMSE score for the predictions:

$${\rm RMSE} = \sqrt{\frac{1}{N_{\rm pred}}\sum \left(\frac{y_{\rm ori} - y_{\rm pred}}{y_{\rm ori}}\right)^2}, \tag{15}$$

where $N_{\rm pred}$ represents the total number of samples in the prediction data sets. A lower RMSE value indicates a more accurate parameter prediction.
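Both metrics are simple to compute by hand. The sketch below assumes the usual definition $R^2 = 1 - \sum (y_{\rm pred} - y_{\rm ori})^2 / \sum (y_{\rm ori} - \bar{y})^2$ and a normalized RMSE taken over the fractional error $(y_{\rm ori} - y_{\rm pred})/y_{\rm ori}$; the function names and the sample values are hypothetical.

```python
import math

def r2_score(y_true, y_pred):
    """Coefficient of determination over a set of test samples."""
    mean = sum(y_true) / len(y_true)
    ss_res = sum((p - t) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1.0 - ss_res / ss_tot

def normalized_rmse(y_true, y_pred):
    """RMSE of the fractional error (y_true - y_pred) / y_true."""
    n = len(y_true)
    return math.sqrt(sum(((t - p) / t) ** 2 for t, p in zip(y_true, y_pred)) / n)

# Hypothetical original and predicted values of a single parameter.
y_true = [10.0, 12.0, 14.0, 16.0]
y_pred = [10.5, 11.5, 14.5, 15.5]
print(f"R2 = {r2_score(y_true, y_pred):.3f}, RMSE = {normalized_rmse(y_true, y_pred):.4f}")
```

Note that the normalized RMSE is undefined for parameters whose true value is zero, so such parameters would need a different normalization.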

BUILDING OF TRAINING AND TEST DATA SETS
We follow the steps below to construct the data sets for all the different realizations and combine them into the final training data sets. We created 360 data sets for each of the different realizations and for both types of signals, parametrized and physical. Each parameter value is sampled randomly and uniformly from the parameter ranges given in Tab. (1), Tab. (2), and Tab. (3). We then split the constructed data sets into three parts for training, validation, and testing of the model. To the test set, we add thermal noise for the corresponding observation time using the radiometer equation described in Section 6.4.

Simulation methods for the global 21-cm signal Case 1: parametrized model
In the first case study, we used the tanh parameterization to model the global 21-cm signal across the redshift range $6 < z < 40$, as suggested by Mirocha et al. (2015). This approach uses simple tanh functions to describe the Ly$\alpha$ background ($J_\alpha$), the IGM temperature ($T$), and the ionized fraction ($\bar{x}_i$), where the Ly$\alpha$ background determines the strength of the Wouthuysen-Field coupling (Harker et al. 2016). Each quantity evolves as a tanh function (Mirocha et al. 2015):

$$P(z) = \frac{P_{\rm ref}}{2}\left\{1 + \tanh\left[\frac{z_0 - z}{\Delta z}\right]\right\}, \tag{16}$$

where $P(z)$ denotes any of the model quantities, $P_{\rm ref}$ is the step height, $z_0$ is the pivot redshift, and $\Delta z$ is the duration. These are free parameters, and they are directly linked to IGM properties but not to source properties. The model is therefore intermediate between physical models and phenomenological models such as cubic spline (Pritchard & Loeb 2010) or Gaussian (Bernardi et al. 2015; Bernardi et al. 2016) models. Inserting the quantities $J_\alpha(z)$ (Ly$\alpha$ background), $T(z)$ (IGM temperature), and $\bar{x}_i(z)$ (ionized fraction) into Eq. (16) gives:

$$J_\alpha(z) = \frac{J_{\rm ref}}{2}\left\{1 + \tanh\left[\frac{J_{z0} - z}{J_{dz}}\right]\right\}, \quad T(z) = \frac{T_{\rm ref}}{2}\left\{1 + \tanh\left[\frac{T_{z0} - z}{T_{dz}}\right]\right\}, \quad \bar{x}_i(z) = \frac{X_{\rm ref}}{2}\left\{1 + \tanh\left[\frac{X_{z0} - z}{X_{dz}}\right]\right\}, \tag{17}$$

where $J_{\rm ref}$ is the step height of the Ly$\alpha$ flux (in units of $10^{-21}$ erg s$^{-1}$ cm$^{-2}$ Hz$^{-1}$ sr$^{-1}$), $J_{dz}$ and $J_{z0}$ are the duration $\Delta z$ and the pivot redshift $z_0$ of the Ly$\alpha$ background, $T_{dz}$ and $T_{z0}$ are the corresponding quantities for X-ray heating, and $T_{\rm ref}$ is the step height of $T(z)$, which is fixed at 1000 K. The exact height of this step is not essential because the signal saturates at low redshifts. $X_{\rm ref}$ is the step height of the ionized fraction, whose duration and pivot redshift are $X_{dz}$ and $X_{z0}$. In total, we have seven signal parameters and two fixed parameters ($X_{\rm ref} = 1.0$, $T_{\rm ref} = 1000$ K) to simulate the global 21-cm signal using the tanh parametrization (Choudhury et al. 2020). To generate a simulated global 21-cm signal, we use ARES to determine the coupling coefficients and insert the parameter values into Eq. (2). We refer to this simulated signal as the parametrized global 21-cm signal.
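The tanh step of Mirocha et al. (2015), $P(z) = \frac{P_{\rm ref}}{2}\{1 + \tanh[(z_0 - z)/\Delta z]\}$, is easy to evaluate directly. In the sketch below the function name is hypothetical, and the pivot redshifts, durations, and the Ly$\alpha$ step height are illustrative placeholders rather than values from this work; only $T_{\rm ref} = 1000$ K and $X_{\rm ref} = 1.0$ are taken from the text. Turning these histories into a brightness temperature additionally requires the coupling coefficients, which the text obtains from ARES and which are not computed here.

```python
import math

def tanh_step(z, ref, z0, dz):
    """P(z) = (P_ref / 2) * (1 + tanh((z0 - z) / dz))."""
    return 0.5 * ref * (1.0 + math.tanh((z0 - z) / dz))

# Hypothetical parameter values for illustration only.
params = {
    "J": dict(ref=11.69, z0=18.5, dz=3.1),   # Ly-alpha background, 1e-21 erg/s/cm^2/Hz/sr
    "T": dict(ref=1000.0, z0=9.8, dz=2.9),   # IGM temperature in K (T_ref fixed in the text)
    "X": dict(ref=1.0, z0=8.6, dz=1.6),      # ionized fraction (X_ref fixed in the text)
}

for z in (6.0, 10.0, 20.0, 40.0):
    j, t, x = (tanh_step(z, **params[key]) for key in ("J", "T", "X"))
    print(f"z = {z:4.1f}: J_alpha = {j:7.3f}, T = {t:8.2f} K, x_i = {x:.4f}")
```

Each quantity rises from zero at high redshift to its step height at low redshift, passing through half the step height at the pivot redshift.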

Case 2: Physical Model
In the second case study, we used the same signal training data set as Choudhury et al. (2021) to construct the training data set for the global 21-cm signals. They used physical models based on a semi-numerical algorithm to produce various realizations of the global 21-cm signal across the redshift range $6 < z < 50$. The calculation used in the signal construction closely follows Furlanetto et al. (2006b) and Pritchard & Loeb (2012), and the astrophysical parameters given as input to the model include the number of Lyman-alpha photons produced per baryon in the frequency range of interest, $N_\alpha$. They vary these astrophysical parameters within the ranges shown in Tab. (2) to construct the training data set of global 21-cm signals shown in Fig. (4a); the detailed calculation and procedure are described in Chatterjee et al. (2019) and Choudhury et al. (2021). Choudhury et al. (2021) noted that the parameters $f_x$ and $f_{xh}$ are highly correlated, so in our case study we combine them into a single parameter to improve network performance.

Simulation of foreground
To add the foreground to the global 21-cm signal in both cases, parametrized and physical, we follow the log(T) − log($\nu$) polynomial model described in Section 3. We simulate the foreground by varying its parameters $(a_0, a_1, a_2, a_3)$ by (±15%, ±10%, ±1%, ±1%), respectively, from their fiducial values to build the training data for all the scenarios (see Tab. 3 for details). Each sample in the training data set (see Fig. 3b) is given by:

$$T_{\rm sky}(\nu) = T_{\rm 21,parametrized}(\nu) + T_{\rm FG}(\nu), \tag{18}$$

where $T_{\rm sky}(\nu)$ is the total sky temperature without ionospheric effects, $T_{\rm 21,parametrized}(\nu)$ is the global 21-cm signal temperature constructed using the parametrized model, and $T_{\rm FG}(\nu)$ is the foreground temperature constructed using the log-log polynomial model. For the second case, each sample in the training data set (see Fig. 4b) is defined as:

$$T_{\rm sky}(\nu) = T_{\rm 21,Physical}(\nu) + T_{\rm FG}(\nu), \tag{19}$$

where $T_{\rm 21,Physical}(\nu)$ represents the global 21-cm signal temperature constructed using a semi-numerical physical model.

Simulation of ionospheric effects
To simulate the ionospheric effects, we chose two different scenarios. In the first scenario, we add only the ionospheric refraction effect for a fixed TEC value of 10 TECU. In the second scenario, we include all three ionospheric effects (refraction, absorption, and thermal emission) with varying TEC and $T_e$. These ionospheric effects introduce two more parameters into the parameter set: TEC (total electron content) and $T_e$, the electron temperature of the D-layer. In our simulation, we use an F-layer total electron content of TEC = 10 TECU and a D-layer electron temperature of $T_e = 800$ K for the mid-latitude ionosphere. We use the International Reference Ionosphere (IRI) model to obtain the TEC value for the D-layer (Bilitza 2003). According to this model, the typical ratio of electron column densities in the D-layer and F-layer is $8.0 \times 10^{-4}$ (Datta et al. 2016).
For both sets of signals, the parametrized and the physical, we build the final training data sets by varying the ionospheric parameters (TEC, $T_e$) by (±1%, ±1%), respectively, from their original defined values. The detailed variation used in our simulation to construct the training data set for each case study in this paper is summarized in Tab. 3.

Parameter                                    Signal and     With ionospheric        With all
                                             foreground     refraction (fixed TEC)  ionospheric effects
Zeroth order foreground coefficient (a_0)    ±15%           ±15%                    ±15%
First order foreground coefficient (a_1)     ±10%           ±10%                    ±10%
Second order foreground coefficient (a_2)    ±1%            ±1%                     ±1%
Third order foreground coefficient (a_3)     ±1%            ±1%                     ±1%
Total electron content (TEC)                 -              Fixed                   ±1%
Thermal electron temperature (T_e)           -              -                       ±1%

Table 3. The percentage variation of each foreground and ionospheric parameter from its fiducial value, used to set the upper and lower bounds and construct the training data set for each scenario, for Case 1 (parametrized model) and Case 2 (physical model).
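The parameter sampling described in this section can be sketched as uniform draws within a fixed percentage of each fiducial value. The fiducial values and percentage ranges below are the ones quoted in the text for the scenario with all ionospheric effects; the dictionary keys, the function name, and the random seed are hypothetical choices for this sketch.

```python
import random

# Fiducial values and fractional variations quoted in the text.
FIDUCIAL = {"a0": 3.3094, "a1": -2.42096, "a2": -0.08062, "a3": 0.02898,
            "TEC": 10.0, "Te": 800.0}
VARIATION = {"a0": 0.15, "a1": 0.10, "a2": 0.01, "a3": 0.01,
             "TEC": 0.01, "Te": 0.01}

def sample_parameters(rng):
    """Draw each parameter uniformly within +/- its fractional variation."""
    sample = {}
    for name, value in FIDUCIAL.items():
        half_width = abs(value) * VARIATION[name]
        sample[name] = rng.uniform(value - half_width, value + half_width)
    return sample

rng = random.Random(42)
realizations = [sample_parameters(rng) for _ in range(360)]
print(realizations[0])
```

Using `abs(value)` for the half-width keeps the bounds correctly ordered for the negative coefficients $a_1$ and $a_2$.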

Thermal Noise
The thermal noise, $\sigma(\nu)$, in the measured spectrum can be represented using the ideal radiometer equation:

$$\sigma(\nu) \approx \frac{T_{\rm sys}(\nu)}{\sqrt{\Delta\nu\; t_{\rm obs}}}, \tag{20}$$

where $T_{\rm sys}(\nu)$ is the system temperature, $\Delta\nu$ is the observational bandwidth, and $t_{\rm obs}$ is the observation time. We are working with simulated observations created under a set of assumptions about the signal, foreground, and ionospheric effects. In the future, a similar network will be used to predict the redshifted global 21-cm signal from actual measurements, with actual observational data replacing the test data sets.
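A short numerical illustration of the ideal radiometer equation, $\sigma = T_{\rm sys}/\sqrt{\Delta\nu\, t_{\rm obs}}$. The 1000 hours of observation is the value used for the test sets in the Results section; the system temperature, the channel width, the function name, and the random seed are hypothetical values chosen for this sketch.

```python
import math
import random

def thermal_noise_rms(t_sys_k, bandwidth_hz, t_obs_s):
    """Ideal radiometer equation: sigma = T_sys / sqrt(bandwidth * time)."""
    return t_sys_k / math.sqrt(bandwidth_hz * t_obs_s)

t_obs = 1000 * 3600.0    # 1000 hours of observation, in seconds
channel_width = 100e3    # hypothetical channel width in Hz
t_sys = 2000.0           # hypothetical system temperature in K
sigma = thermal_noise_rms(t_sys, channel_width, t_obs)
print(f"sigma = {sigma * 1e3:.2f} mK")

# Add one Gaussian noise realization to a few (constant) spectral channels.
rng = random.Random(1)
noisy = [t_sys + rng.gauss(0.0, sigma) for _ in range(5)]
```

Because the noise falls only as the square root of the integration time, quadrupling the observation time halves the noise level.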

RESULTS
In this section, we discuss results from simulations representing different signal extraction scenarios: signal only, signal with foreground, signal and foreground with ionospheric refraction for a fixed TEC value, and signal and foreground with all three ionospheric effects with variable TEC and $T_e$ values. We constructed 360 samples for training, validation, and testing of the ANN model for each of the following cases. We use 240 (67%) samples for training and validation, and the remaining 120 (33%) samples to test the trained ANN model. The validation set mainly guides the tuning of the model's hyperparameters, for example, the number of hidden layers, the number of neurons in each hidden layer, the activation function, and the learning rate. It also helps identify overfitting and underfitting by comparing the training and validation losses. To the test set, we add thermal noise, $\sigma(\nu)$, corresponding to 1000 hours of observation, following the radiometer equation (20), to construct the final test data set for each case.
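The 240/120 partition described above can be sketched as a shuffled index split. The function name and the seed are hypothetical, and how the 240 samples are further divided between training and validation is not specified in the text, so it is left out here.

```python
import random

def split_indices(n_samples, n_test, seed=0):
    """Shuffle sample indices and return (train_val, test) index lists."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return idx[n_test:], idx[:n_test]

train_val, test = split_indices(360, 120)
print(len(train_val), len(test))
```

Thermal noise would then be added only to the samples indexed by `test`, leaving the training and validation samples noise-free.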

Case 1a: Signal only (parametrized model)
[...] Choudhury et al. (2020) have already shown a similar implementation. It is a proof of concept to see how effectively our model extracts parameters when the foreground is added. The model architecture we used is different from that of the first case. The model has 4 layers built using the sequential API from Keras. The input layer has 1024 neurons that correspond to the 1024 frequency channels, while the hidden layers have 32 and 16 neurons that are activated by the 'sigmoid' activation function. The output layer has 11 output neurons to predict the global 21-cm signal and foreground parameters. We use the same model architecture, optimizer, and normalization method for the other cases. The only difference is the number of neurons in the output layer, which depends on the number of output parameters. We calculate the $R^2$ score and RMSE for each parameter from the test set predictions to determine how well the network predicts the parameters. The plots of the original versus the predicted values of the parameters for the test data set are shown in Fig. (7), and the $R^2$ score and RMSE for the corresponding parameters are listed in Tab. 4 and Tab. 5. The $R^2$ score for this case ranges from 0.97 to 0.98 for the predicted signal parameters, which is significantly lower than in the previous case (Case 1a). Among the predicted foreground parameters, $a_0$ has the highest $R^2$ score of 0.99.
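To make the layer sizes concrete, the sketch below runs a forward pass through a 1024-32-16-11 fully connected network in plain NumPy. It is a stand-in for illustration, not the Keras model used in this work: the weights are random and untrained, and the linear output layer is an assumption, since the output activation is not stated in the text.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_mlp(layer_sizes, rng):
    """Create one (weights, bias) pair per layer transition, with small random weights."""
    return [(rng.normal(0.0, 0.05, size=(n_in, n_out)), np.zeros(n_out))
            for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]

def forward(weights, x):
    """Sigmoid hidden layers followed by a linear output layer (assumed)."""
    h = x
    for w, b in weights[:-1]:
        h = sigmoid(h @ w + b)
    w, b = weights[-1]
    return h @ w + b

layer_sizes = [1024, 32, 16, 11]   # input, two hidden layers, 11 output parameters
rng = np.random.default_rng(0)
weights = init_mlp(layer_sizes, rng)

batch = rng.normal(size=(5, 1024))          # five placeholder spectra
predictions = forward(weights, batch)       # shape (5, 11)

# Number of trainable weights and biases in this architecture.
n_trainable = sum(w.size + b.size for w, b in weights)
```

Changing only the last entry of `layer_sizes` reproduces the variation between cases described in the text, where the output layer grows with the number of parameters to be predicted.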

Case 1c: Signal and foreground with ionospheric refraction
The ionosphere of the Earth severely distorts low-frequency radio measurements in ground-based observations. To check this effect, we add ionospheric refraction to the signal and foreground while constructing the data sets, shown in Fig. (3c). We followed the same ANN model structure that was used in Case 1b and trained the ANN model using training data sets constructed by adding the effects of ionospheric refraction. We tested the trained model with the test data sets and calculated the $R^2$ score and RMSE for the corresponding parameters to evaluate the model performance; the detailed results are listed in Tab. 4 and Tab. 5. The values of the parameters predicted by the network are plotted in Fig. (8). In this case, the obtained $R^2$ score for the signal parameters ranges from 0.94 to 0.96.
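The two metrics used to evaluate the network can be sketched in a few lines of NumPy. These are the standard textbook definitions, evaluated per parameter (per column); any normalization applied to the RMSE values in the tables is not assumed here, and the small arrays are made-up numbers for illustration.

```python
import numpy as np

def r2_score(y_true, y_pred):
    """Coefficient of determination, 1 - SS_res / SS_tot, per column."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2, axis=0)
    ss_tot = np.sum((y_true - y_true.mean(axis=0)) ** 2, axis=0)
    return 1.0 - ss_res / ss_tot

def rmse(y_true, y_pred):
    """Root-mean-square error per column."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return np.sqrt(np.mean((y_true - y_pred) ** 2, axis=0))

# Made-up example: two parameters, four test samples.
y_true = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0], [4.0, 40.0]])
y_pred = np.array([[1.1, 10.0], [1.9, 21.0], [3.0, 29.0], [4.2, 40.0]])

r2_per_param = r2_score(y_true, y_pred)
rmse_per_param = rmse(y_true, y_pred)
```

An $R^2$ score of 1 corresponds to a perfect prediction, so values such as 0.94 to 0.96 mean that a few percent of the variance of the true parameter is left unexplained by the network.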

Case 1d: Signal and foreground with all three ionospheric effects-refraction, absorption and thermal emission
In this case, we add the other ionospheric effects, absorption and thermal emission, and construct the training data set, which we name the "final training data set", shown in Fig. (3e). We used the same architecture as in the previous cases (Cases 1b and 1c) to build the ANN model; the only difference is that the output layer of the model has 13 output neurons corresponding to the 7 signal parameters, 4 foreground parameters, and 2 ionospheric parameters. We test this model with the test data and calculate the $R^2$ score and RMSE for the corresponding parameters to evaluate the model performance. The $R^2$ and RMSE scores for each parameter are listed in Tab. 4 and Tab. 5, and the values of the parameters predicted by the network are plotted in Fig. (9). From Tab. 4 and Tab. 5, we can see that the $R^2$ values decrease slightly and the RMSE values increase slightly when we introduce foreground and ionospheric effects into the signals, compared to Case 1a, where we take the signal only. When we add more complexity to the training data set, we have to train our network sufficiently well to maintain high accuracy levels.

Case 2a : Signal only (physical model)
To check the robustness and reliability of the network, we trained the ANN model for all the scenarios studied previously with an entirely new set of global 21-cm signals, taken from Choudhury et al. (2021). They constructed the global 21-cm signals using a semi-numerical code following Chatterjee et al. (2019). [...] Once the network is trained and validated, we save the model. We used the same optimizer and normalization method for the other cases. We test the model using the test data set and obtain the $R^2$ score and RMSE for each parameter from the test set predictions to see how well the network predicts the parameters. The predicted parameters are plotted in Fig. (10), and the corresponding $R^2$ score and RMSE for each parameter are listed in Tab. 6 and Tab. 7. From Tab. 6, we can see that the parameter $N_\alpha$ has the highest $R^2$ score of 0.99 and the parameter $f_{\rm esc}$ has the lowest $R^2$ score of 0.98. To check for overfitting in all the cases, we have plotted the training loss and validation loss, similar to the parametrized case [see Fig.

Case 2b : Signal with foreground
In this case, we train our model with a training data set constructed by combining the foreground with different realizations of the global 21-cm signal, shown in Fig. (4b). We employed a five-layer model architecture for training, with 1024 input neurons matching the 1024 frequency channels and three hidden layers with 32, 16, and 16 neurons activated by the 'elu' activation function. The output layer of the model has 8 output neurons corresponding to the 4 astrophysical parameters of the global 21-cm signal and the 4 foreground parameters. We test our trained model with the test data set and calculate the $R^2$ and RMSE scores for the corresponding parameters, shown in Tab. 6 and Tab. 7; the plot of the predicted values of the parameters against the original values is shown in Fig. (12). From Tab. 6, we can see that the $R^2$ scores of the foreground parameters are much higher than those of the signal parameters, which means that the network predicts the foreground parameters more accurately than the signal parameters. Among the signal parameters, $f_{x,h} * f_x$ has the highest $R^2$ score of 0.98, and $N_\alpha$ has the lowest $R^2$ score of 0.96.

Case 2c : Signal and foreground with ionospheric refraction
In the third case, we added foreground and ionospheric refraction effects for the corresponding fixed TEC value into the global 21-cm [...] we used previously for the signal-with-foreground case to build the model for this case. We then train this model and save it for further validation and testing. We test the saved model with the test data sets and calculate the $R^2$ score and RMSE for the corresponding parameters. Details of the results are listed in Tab. 6 and Tab. 7, and a plot of the predicted values of the parameters against the original values is shown in Fig. (13). The $R^2$ score for the foreground parameters improves over the previous case, but the $R^2$ score for the signal parameters decreases in comparison; this means that adding more complexity to the training data makes signal extraction more difficult. We obtained $R^2$ scores for the signal parameters of around 0.96 to 0.98 [see Tab. 6].

Case 2d : Signal and foreground with all three ionospheric effects-refraction, absorption, and thermal emission
In the last case, we constructed the training data sets by combining the global 21-cm signal, foreground, and ionospheric effects (refraction, absorption, and thermal emission), shown in Fig. (4e). To build the model for this case, we use the same architecture as in the previous models except for the output layer. Here, the output layer of the model has 10 output neurons corresponding to the 4 signal parameters, 4 foreground parameters, and 2 ionospheric parameters. Once the models were trained and validated, we saved them for further testing. We tested the saved trained model with the test data set and calculated the $R^2$ score and RMSE for the corresponding parameters to check the network performance. [...] (17, 18, 19 and 20), and the remaining parameters, such as those of the signal and foreground, are also consistent with the actual ones [see Tab. 8 and 9]. This clearly indicates the ANN model's accuracy and its ability to capture complex temporal variations in ionospheric phenomena.

SUMMARY AND DISCUSSIONS
In this study, we presented an ANN model to extract the global 21-cm signal by estimating its parameters from a combined spectrum that included signal, foreground, ionospheric effects, and thermal noise.
We trained our ANN model on the four different scenarios described in detail in Section 7. To check the robustness of the ANN, we also used two different ways of modelling the global 21-cm signal: one is based on a functional parametrization (Mirocha et al. 2012, 2015), and the other is a physically motivated approach (Chatterjee et al. 2019; Choudhury et al. 2021). The parameter space in the two cases is entirely different; the parametrized model parameters are more directly related to IGM properties, whereas the physical model includes both IGM and source properties. In the physical model, the parameter $f_R$ plays the most crucial role in defining the form of the reconstructed signal in the semi-numerical code (Choudhury et al. 2021). A high $f_R$ value implies a strong radio background, resulting in a substantial absorption trough in the signal. In contrast, $f_R = 0$ means that the excess radio background is turned off, resulting in a conventional signal. In our study, we have taken the traditional data set of global 21-cm signals from Choudhury et al. (2021).
For both the parametrized and physical models, in the final case studies, Case 1d and Case 2d, the trained ANN model predicted the signal parameters from the test data set with an accuracy of ≥ 96%. This clearly demonstrates how a basic ANN model can easily manage up to 13 parameters (7 signal parameters, 4 foreground parameters, and 2 ionospheric parameters). We have estimated the uncertainty of the parameters by calculating the RMSE of the individual parameters for all the cases [see Tab. 5 and 7]. We found that the error in the parameter estimation increases when we increase the number of training parameters; e.g., in the first case of the study, when we used the signal only (Case 1a), the ANN estimated the parameters with a maximum error of ≈ 4%, but in the final case (Case 1d), the maximum error was ≈ 6%. Similarly, in the second case of the study, the maximum error was ≈ 3% for Case 2a, but ≈ 5% in the final case, Case 2d. This clearly indicates that when [...] The accuracy levels will remain high if the network has been trained well enough. We also demonstrated that for the dynamic ionospheric model, where TEC and $T_e$ vary randomly, the ANN prediction accuracy is still consistent [see Tab. 8 and 9].
To further emphasise the utility of the ANN method, we conducted an analysis attempting to fit the signal, foreground, and ionospheric effects using a simple analytical approach.This method failed to extract the signal parameters from both scenarios.Conversely, our trained ANN model demonstrated remarkable accuracy in predicting parameters for the same data set, as elaborated in Appendix A. Additionally, we evaluated the ANN's performance in terms of accuracy by comparing it with existing prior signal extraction methods.To assess this, we compared the predictions of these prior models with the true parameters, calculating the Mean Absolute Percentage Error (MAPE), as detailed in Appendix B. Our findings revealed that the predicted parameters by the ANN model are significantly more accurate than these traditional methods.
The other benefit of using ANN is that it can efficiently extract the observed sky signal's characteristics without modelling and eliminating foreground and ionospheric effects.Compared to the other existing parameter estimation techniques, ANN can extract features from data by building functions that connect the input and output parameters.The ANN model, unlike Bayesian approaches, does not require a defined prior; instead, we must provide training data sets, which may be seen as playing a similar role as the prior in Bayesian techniques.We may avoid computing the likelihood function a large number of times by using ANN to arrive at inferred parameter values.As a result, even when dealing with a larger dimensional parameter space, ANN is computationally faster and more efficient.
In this work, the ANN model is trained for static as well as time-varying ionospheric conditions, and it is robust and sensitive to all the input parameters within the parameter ranges used in preparing the data sets. By incorporating effects such as beam chromaticity and other systematics, we hope to create a more reliable ANN model in the future. Depending on the telescope design and its geomagnetic location, various systematics corrupt observations, such as standing waves from cable lengths internal to the system, chromaticity caused by environmental factors like antenna ground planes (Singh & Subrahmanyan 2019; Hills et al. 2018), the ionosphere, and RFI. These non-astronomical signals need to be modelled for accurate signal extraction. We plan to include these effects in our future study.

APPENDIX A: EXTRACTION OF SIGNAL, FOREGROUND AND IONOSPHERIC EFFECT PARAMETERS USING ANALYTICAL METHOD
We attempted to analytically fit two scenarios, one with signal and foreground and another with signal, foreground, and ionospheric effects, using the least-squares fit from the SciPy library in Python. For the first scenario, the input data sets are simulated based on equation 18, and the corresponding true input parameters are listed in Tab. A1. Similarly, for the second scenario, the simulation relied on equation 9, with the true input parameters listed in Tab. A2.

We followed two approaches to fit the signal and foreground. In the first approach, we attempted to fit both the signal and foreground simultaneously for the given sky signal simulated using equation 18, but encountered significant instability in the fit. The residual left after subtracting the best-fit signal and foreground is shown in Fig. A1, and the best-fit parameters are listed in Tab. A1. In the second approach, we attempted individual fitting of the foreground and signal from the sky signal, initially fitting the foreground following equation 5 and subtracting its best-fit model from the total observed sky signal. The remaining 21-cm signal was then fitted separately using the signal's parametrized model, described in Section 6.1. However, the fitting function failed to accurately capture the foreground and global signal parameters. This is evident from the distinct residual signal, noticeably different from the input signal, indicating inaccurate foreground fitting (Fig. A1). The small uncertainty in the best-fit foreground parameters indicates under-constraint. The fitting function's performance for signal fitting was notably inadequate. The summarized results are in Tab. A1.

We attempted to fit the total sky signal, incorporating the global 21-cm signal, foreground, and ionospheric effects, similarly to the previous case. However, the fitting function failed to accurately capture the foreground and ionospheric parameters, as indicated by the statistically significant magnitude of the residual (Fig. A2). We also applied the ANN to the same data sets for both scenarios, resulting in ANN-predicted parameters that closely approximate the true values of the parameters; see Tab. A1 and Tab. A2, and the reconstructed 21-cm signals in Fig. A1 and Fig. A2, which are accompanied by the residuals from the true input signals.
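The foreground-only step of the second approach can be sketched as below. This is a simplified illustration under several assumptions: the foreground is taken to be a third-order polynomial in log-log space (suggested by the four coefficients $a_0$ to $a_3$, but the exact form of equation 5 is not reproduced here), the coefficient values, reference frequency, and band are hypothetical, and `numpy.polyfit` is used in place of the SciPy least-squares routine mentioned above.

```python
import numpy as np

def log_poly_foreground(freqs_mhz, coeffs, nu0_mhz=80.0):
    """Foreground T(nu) = exp(sum_i a_i * ln(nu/nu0)^i), coeffs ordered a_0..a_n."""
    x = np.log(np.asarray(freqs_mhz, dtype=float) / nu0_mhz)
    # np.polyval expects the highest-order coefficient first.
    return np.exp(np.polyval(np.asarray(coeffs)[::-1], x))

def fit_log_poly(freqs_mhz, temps_k, order=3, nu0_mhz=80.0):
    """Least-squares polynomial fit in log-log space; returns a_0..a_order."""
    x = np.log(np.asarray(freqs_mhz, dtype=float) / nu0_mhz)
    return np.polyfit(x, np.log(temps_k), order)[::-1]

# Hypothetical coefficients and band, chosen only for illustration.
true_coeffs = np.array([7.5, -2.5, -0.1, 0.05])
freqs_mhz = np.linspace(50.0, 200.0, 1024)
sky = log_poly_foreground(freqs_mhz, true_coeffs)   # noiseless, foreground only

fitted_coeffs = fit_log_poly(freqs_mhz, sky)
residual = sky - log_poly_foreground(freqs_mhz, fitted_coeffs)
```

With a noiseless, foreground-only spectrum the coefficients are recovered essentially exactly. Once a 21-cm signal that is orders of magnitude fainter is added, the same smooth polynomial can absorb part of it, which is consistent with the difficulty of the sequential fit reported above.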

APPENDIX B: COMPARISON WITH OTHER EXISTING TECHNIQUES
We evaluated the accuracy of our ANN predictions in comparison to other methods by calculating the Mean Absolute Percentage Error (MAPE), detailed in Tab. B1. Our study demonstrated that while some prior techniques performed well with simpler signal models (those with fewer free parameters), they faltered when dealing with more complex signal models requiring additional parameters. In contrast, the ANN model consistently outperformed these methods. Tab. B1 shows that our ANN predictions exhibited less than 5% error across all parameters, regardless of the signal scenario, including foreground with ionospheric effects. For the physical signals, except for one parameter with more than 10% error, all other parameters are accurately constrained, with most errors falling within or below the 1% to 5% range.
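For reference, the MAPE used in this comparison can be computed as in the following minimal sketch, which uses the standard definition. The numbers in the example are made up and are not taken from Tab. B1.

```python
import numpy as np

def mape(y_true, y_pred):
    """Mean Absolute Percentage Error in percent, averaged over samples (axis 0)."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return 100.0 * np.mean(np.abs((y_true - y_pred) / y_true), axis=0)

# Made-up true and predicted values of a single parameter.
true_values = np.array([10.0, 20.0, 40.0])
predicted_values = np.array([11.0, 18.0, 40.0])

error_percent = mape(true_values, predicted_values)
```

Because the error is divided by the true value, the MAPE is undefined for parameters whose true value is zero and becomes large for values close to zero.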

Figure 1. (a) The deviation angle is plotted as a function of frequency for a typical mid-latitude daytime TEC value (TEC = 10 TECU). (b) The percentage increase in the field of view as a function of frequency for the same TEC value.
[...] their study, they used two different kinds of global signals: a traditional set of signals, in which there is no excess radio background ($f_R = 0$), and an exotic set of signals, in which the excess radio background is present ($f_R$ non-zero). For this study, we have considered the traditional set of global 21-cm signals for constructing the training data sets.
[Figure panel titles: global 21-cm signals and foreground with ionospheric refraction; global 21-cm signals and foreground with all three ionospheric effects; contribution of all three ionospheric effects.]

Figure 3. (a) The training data sets for the global 21-cm signals generated using the parametrized model by varying the signal parameters. (b) The training data set constructed by adding the foreground to the global 21-cm signals; the foreground clearly dominates over the signals. (c) The training data sets for the case in which we included the ionospheric refraction effect in the signal and foreground for the corresponding fixed TEC value (TEC = 10 TECU). (d) The amount of excess temperature recorded by the antenna due to ionospheric refraction. (e) Samples of the training data set constructed by adding all three ionospheric effects-refraction, absorption, and thermal emission-to the signal and foreground for variable TEC and $T_e$ values. (f) Contribution of all ionospheric effects to the training data sets. In each subplot, a subset of the training data sets is shown in color, while the remaining training data sets are plotted in the background in light gray.
into the foreground-added signal to construct the training data sets shown in Fig. (3c) and Fig. (4c). Each sample of the training data set is constructed following equation (7). In the second case study, we have added the ionospheric effects, mainly refraction, absorption, and thermal emission, while building the final training data sets shown in Fig. (3e) and Fig. (4e). In the final training data set, all the samples are constructed following equation (9).

Figure 4. (a) The training data set of the global 21-cm signal generated using the physical model (semi-numerical approach). (b) The training data set after adding the foreground to the signal. (c) The training data set constructed by including the ionospheric refraction effect in the signal and foreground for the corresponding fixed TEC value of 10 TECU. (d) The excess temperature caused by ionospheric refraction, as recorded by the antenna, in the training data sets. (e) Samples of the training data set constructed by adding all three ionospheric effects-refraction, absorption, and thermal emission-to the signal and foreground for variable TEC and $T_e$ values. (f) Contribution of all ionospheric effects to the training data sets. In each subplot, a subset of the training data sets is shown in color, while the remaining training data sets are plotted in the background in light gray.

Figure 7. Case 1b: parametrized global 21-cm signal with foreground. The original values of the parameters are shown by the solid straight line in each plot, while the dots indicate the values predicted by the ANN.
(11)]. In Fig. (11), we can see that the training loss and validation loss closely follow each other, and both converge after 20 epochs for all the cases.

Figure 9. Case 1d: parametrized global 21-cm signal and foreground with all three ionospheric effects-refraction, absorption, and thermal emission.The original values of the parameters are shown by the solid straight line in each plot, while the dots indicate the predicted values by ANN.

Figure 11. Evolution of the network's loss function when the signal is generated by the physical model. In all cases, the training loss is shown as a solid line as a function of epoch, whereas the validation loss is shown as a dashed line. The validation loss closely matches the training loss in this case.

Figure 13. Case 2c: Signal and foreground with ionospheric refraction for a fixed TEC value. In each plot, the original values of the parameters are shown by the solid straight line, while the dots indicate the values predicted by the ANN. $f_{\rm star}$ and $N_\alpha$ are plotted on a logarithmic scale.

Figure 14. Case 2d: Signal and foreground with all three ionospheric effects-refraction, absorption, and thermal emission. In each plot, the original values of the parameters are shown by the solid straight line, while the dots indicate the values predicted by the ANN. $f_{\rm star}$ and $N_\alpha$ are plotted on a logarithmic scale.

Figure 16.The blue lines in this graph depict the variation of the D-layer electron temperature (Te) across a 1000-hour observation period.The red dashed line represents the calculated mean Te value derived from these fluctuations.

Figure 18. Histogram of the distribution of $T_e$ values over the entire observation duration for the parametrized signal scenario. The blue dashed line denotes the average value of the actual $T_e$, while the orange dashed line corresponds to the mean of the $T_e$ values predicted by the ANN model.

Figure 19. Histogram of the distribution of TEC values over the entire observation duration for the physical signal scenario. The blue dashed line denotes the average value of the actual TEC, while the orange dashed line corresponds to the mean of the TEC values predicted by the ANN model.

Table 1. Parameters used to build the training data set for the parametrized case of the global 21-cm signals.

The derived values of the parameters are taken from Harker et al. (2016): $J_{\rm ref}$ = 11.69, $J_{dz}$ = 3.31, $J_{z0}$ = 18.54, $X_{z0}$ = 8.68, $X_{dz}$ = 2.83, $T_{z0}$ = 9.77, $T_{dz}$ = 2.82. To produce our training sets, we varied these values by 50% [see Tab. 1]. The number of parameters explored is sufficient to cover a wide spectrum of signal morphologies. Figure (3a) depicts a typical collection of the generated signals that we employ. The tanh model was chosen because it can mimic the shape of the global 21-cm signal very well and is closely tied to the physical characteristics of the IGM. The tanh parameters are closely related to the IGM characteristics, although they do not provide knowledge about the source properties. As a result, the model lies between the phenomenological turning-point model and fully physical models.
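The construction of the parameter sets can be illustrated with the sketch below, which draws 360 parameter vectors within ±50% of the fiducial values quoted above and evaluates one tanh step. Uniform sampling, the function names, and the reference amplitude of 1 for the step are assumptions made for this example; the step uses the usual tanh form $A(z) = \frac{A_{\rm ref}}{2}\left[1 + \tanh\left(\frac{z_0 - z}{\Delta z}\right)\right]$, and computing the full brightness temperature from these quantities is left to a code such as ARES.

```python
import numpy as np

# Fiducial tanh-model values quoted in the text (Harker et al. 2016).
FIDUCIAL = {
    "J_ref": 11.69, "J_dz": 3.31, "J_z0": 18.54,
    "X_z0": 8.68, "X_dz": 2.83,
    "T_z0": 9.77, "T_dz": 2.82,
}

def sample_parameters(n_samples, fraction=0.5, seed=0):
    """Draw parameter sets uniformly (assumed) within +/- fraction of the fiducial values."""
    rng = np.random.default_rng(seed)
    centre = np.array(list(FIDUCIAL.values()))
    low, high = centre * (1.0 - fraction), centre * (1.0 + fraction)
    return rng.uniform(low, high, size=(n_samples, centre.size))

def tanh_step(z, ref, z0, dz):
    """Smooth step that rises from 0 at high redshift to ref at low redshift."""
    return 0.5 * ref * (1.0 + np.tanh((z0 - z) / dz))

samples = sample_parameters(360)            # one row per training signal

# Example: a step with the fiducial X_z0 and X_dz and an assumed amplitude of 1.
z = np.linspace(6.0, 30.0, 200)
x_step = tanh_step(z, 1.0, FIDUCIAL["X_z0"], FIDUCIAL["X_dz"])
```

Each row of `samples` would then be passed to the signal model to generate one training spectrum, with the row itself serving as the target output of the network.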

Table 4. The computed $R^2$ scores for all signal, foreground, and ionosphere parameters for each case studied. We used the parametrized model to construct the global 21-cm signal.

Table 5. The calculated RMSE values for all the signal, foreground, and ionospheric parameters for each case studied.
It contains a more realistic and diverse group of 21-cm signals than the parametrized model signals. The signal parameters also differ from those of the parametrized model; here we use the astrophysical parameters ($f_{x,h} * f_x$, $f_{\rm star}$, $f_{\rm esc}$, $N_\alpha$). In the first case of the study, we take the global 21-cm signals alone for training, validation, and testing of the ANN model, shown in Fig. (4a). The model has a four-layer structure built with Keras' sequential API, with 1024 input neurons matching the 1024 frequency channels and two hidden layers with 12 and 11 neurons activated by the 'elu' activation function. The output layer contains four output neurons that predict the astrophysical parameters of the global 21-cm signal ($f_{x,h} * f_x$, $f_{\rm star}$, $f_{\rm esc}$, $N_\alpha$). We used the StandardScaler function available in 'sklearn' to preprocess the input signals, and MinMaxScaler to scale the signal parameters before passing them to the model. We use 'adam' as the optimizer and 'mean squared error' as the loss function.
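The two preprocessing steps can be written out explicitly as in the following NumPy sketch. It mirrors what standard scaling (zero mean and unit variance per input channel) and min-max scaling (mapping each parameter to the range 0 to 1) do, but it is a stand-in for illustration rather than the 'sklearn' classes themselves, and the arrays are random placeholders.

```python
import numpy as np

def fit_standard(x):
    """Per-column mean and standard deviation; constant columns keep std = 1."""
    mean = x.mean(axis=0)
    std = x.std(axis=0)
    std = np.where(std == 0.0, 1.0, std)
    return mean, std

def fit_minmax(y):
    """Per-column minimum and maximum of the target parameters."""
    return y.min(axis=0), y.max(axis=0)

rng = np.random.default_rng(1)
spectra = rng.normal(5.0, 2.0, size=(240, 1024))       # placeholder input spectra
params = rng.uniform(0.1, 50.0, size=(240, 4))         # placeholder signal parameters

# Inputs: zero mean and unit variance in every frequency channel.
mean, std = fit_standard(spectra)
spectra_scaled = (spectra - mean) / std

# Targets: each parameter mapped to [0, 1].
lo, hi = fit_minmax(params)
params_scaled = (params - lo) / (hi - lo)

# Network outputs are mapped back to physical units with the inverse transform.
params_recovered = params_scaled * (hi - lo) + lo
```

The scaling constants are computed from the training set only and then reused for the test set, so that no information from the test data leaks into the training.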

The values of the $R^2$ score and RMSE of each parameter are tabulated in Tab. 6 and Tab. 7. The plots of the actual versus predicted values of the parameters for the test data set are shown in Fig. (14). The $R^2$ score for the foreground and ionospheric parameters is much higher than that for the signal parameters. The reason is simple: the foreground and ionospheric effects dominate over the signal in the final training data sets that we feed to the network. The $R^2$ score we obtained for the signal parameters is around 0.96 to 0.98 [see Tab. 6].

Table 6. The computed $R^2$ scores for all signal, foreground, and ionosphere parameters for each case studied. We used the physical model (semi-numerical model) to construct the global 21-cm signal.

Table 7. The calculated RMSE values for all the signal, foreground, and ionospheric parameters for each case studied.

Time varying ionospheric effects-refraction, absorption and thermal emission
We conducted further assessments to evaluate the robustness of the ANN model. We used a dynamic ionospheric model with temporal variations. This model assumes random fluctuations in Total Elec-

Table 8. The actual parameter values and the values predicted by the ANN, along with the corresponding percentage errors, for the time-varying ionospheric model. The parameters include those related to the parametrized signal, the foreground, and the mean values of TEC and $T_e$.

Table 9. The actual parameter values and the values predicted by the ANN, along with the corresponding percentage errors, for the time-varying ionospheric model. The parameters include those related to the physical signal, the foreground, and the mean values of TEC and $T_e$.

Table A1. The true signal and foreground parameter values, together with their best-fit estimates and uncertainties obtained through a simple analytical method. The table also includes the parameter values predicted by the ANN model.

Table A2. The true signal, foreground, and ionospheric parameter values, together with their best-fit estimates and uncertainties obtained through a simple analytical method. The table also includes the parameter values predicted by the ANN model.