A consistent and robust measurement of the thermal state of the IGM at $2 \leq z \leq 4$ from a large sample of Ly$\alpha$ forest spectra: Evidence for late and rapid HeII reionization

We characterise the thermal state of the intergalactic medium (IGM) in ten redshift bins in the redshift range $2 \leq z \leq 4$ with a sample of 103 high resolution, high S/N Ly$\alpha$ forest spectra using four different flux distribution statistics. Our measurements are calibrated with mock spectra from a large suite of hydrodynamical simulations post-processed with our thermal IGM evolution code CITE, finely sampling amplitude and slope of the expected temperature-density relation. The thermal parameters inferred from our measurements of the flux power spectrum, Doppler parameter distribution, as well as wavelet and curvature statistics agree well within their respective errors and all clearly show the peak in temperature and minimum in slope of the temperature density relation expected from HeII reionization. Combining our measurements from the different flux statistics gives $T_0=(14750 \pm 1322)$K for the peak temperature at mean density and a corresponding minimum slope $\gamma = 1.225 \pm 0.120$. The peak in the temperature evolution occurs at $z \approx 3$, in agreement with previous measurements that had suggested the presence of such a peak, albeit with a large scatter. Using CITE, we also calculate the thermal state of the IGM predicted by five widely used (spatially homogeneous) UV-background models. The rather rapid thermal evolution inferred by our measurements is well reproduced by two of the models, if we assume (physically well motivated) non-equilibrium evolution with photo-heating rates that are reduced by a moderate factor of $\sim 0.7-0.8$. The other three models predict HeII reionization to be more extended with a somewhat earlier as well as higher temperature peak than our measurements suggest.


INTRODUCTION
Lyα absorption seen in the spectra of distant bright QSOs (Quasi Stellar Objects) allow one to probe the thermal and ionization history of the Intergalactic Medium (IGM) in addition to constraining cosmological parameters. The thermal state of the IGM is often characterized by normalization (T0) and slope (γ) of the temperature-density relation (TDR, T = T0∆ γ−1 ; Hui & Gnedin 1997), while the ionization state of the IGM is characterized by H i and He ii photo-ionization rates (e.g. Haardt & Madau 1996). E-mail: pgaikwad@ast.cam.ac.uk These parameters have been measured using high-resolution, high signal-to-noise spectroscopic observations in conjunction with high-resolution hydrodynamical simulations.
He ii reionization is expected to be completed in the redshift range 2 ≤ z ≤ 4. He ii reionization injects energy into the IGM and thereby broadens the absorption features , decreases the neutral fraction of hydrogen (Rauch et al. 1997;Bolton & Haehnelt 2007) and leads to additional pressure smoothing of the density and flux fields (Gnedin & Hui 1998;Peeples et al. 2010). Accurate measurements of thermal parameters in this redshift range can therefore provide important constraints on the models of He ii reionization.
The thermal parameters of the IGM have been measured by (i) decomposing absorption features into multicomponent Voigt profiles and identifying the lower envelope of the line width (b-parameter) vs H i column density ( NHI; Schaye et al. 1999Schaye et al. , 2000Bolton et al. 2014;Hiss et al. 2018;Telikova et al. 2019), (ii) measuring the small scale suppression in the transmitted flux power spectrum McDonald 2003;Kim et al. 2004;Walther et al. 2019;Boera et al. 2019), (iii) using curvature (defined as, κ = F /[1 + (F ) 2 ] 3/2 where F and F are the first and second derivatives of the normalised flux with respect to the wavelength) statistics Boera et al. 2014;Padmanabhan et al. 2015) or (iv) characterising Fourier modes of the flux distribution in the range sensitive to thermal parameters using wavelet analysis (Zaldarriaga 2002;Theuns et al. 2002a;Lidz et al. 2010;Garzilli et al. 2012). Even if the same observational data is used, the sensitivity to small changes in thermal parameters and associated systematic uncertainties are found to be different for different flux statistics (see §5 for a detailed discussion of this). Ideally, one would thus like to calculate all these different flux statistics simultaneously and consistently for the same data set to obtain joint best fit values of the thermal parameters and the associated errors. In order to obtain robust and consistent results it is also important to perform such an analysis using a consistent set of model parameters.
As simulated data are an integral part of the parameter estimation from Lyα forest data, the reliability and accuracy of the extracted parameters depend on the assumptions involved in the simulations and the ability to perform a wide range of simulations that span a sufficiently wide parameter space with good i.e. not too coarse sampling. Generating such a set of simulations accounting for the relevant physical effects is challenging. At any given epoch the pressure smoothing of the density field depends on the past thermal history (Gnedin & Hui 1998;Peeples et al. 2010;Kulkarni et al. 2015;Nasir et al. 2016;Rorai et al. 2017a) and one needs self-consistent high-resolution cosmological hydrody-namical simulations of the IGM (Springel 2005;Almgren et al. 2013;Bolton et al. 2017). Practically one achieves a range of T0 and γ in simulations by varying photo-heating rates of H i and He ii as a function of redshift (see for example, Schaye et al. 2000;Becker et al. 2011). Such simulations are computationally expensive which limits the thermal parameter space that can be probed at any given redshift.
As noted before, most simulations used for this purpose are performed assuming a uniform ionizing background and ionization equilibrium. While these are good approximations after He ii reionization is complete such simulations do not capture all relevant physics before and during He ii reionization, i.e., 2.7 ≤ z ≤ 4.0 (see Puchwein et al. 2015Puchwein et al. , 2019Gaikwad et al. 2019). For this one needs simulations that also account for the non-equilibrium effects during He ii reionization. Ideally one would also like simulations to incorporate the spatially inhomogeneous nature of He ii reionization and the corresponding fluctuations of the UV background. However, such simulations will not only require to include radiative transfer, but also large box sizes as well as high resolution and thus very high dynamic range to capture the relevant physical processes accurately.
The main aim of this work is to obtain a consistent measurement of thermal parameters using the larger data sets that have become publicly available recently. Thanks to compilations like kodiaq dr2 (O'Meara et al. 2015(O'Meara et al. , 2017, from KECK/HIRES archival data) and UVES squad dr1 (Murphy et al. 2019, from VLT/UVES archival data), large samples of QSO spectra with high resolution (∼ 6 km s −1 adequate to resolve the thermally broadened Lyα absorption lines) and high S/N are now available for analysis. These samples have dramatically increased the number of available QSO spectra that have been reduced and continuum normalised using uniform techniques.
Additional impetus to perform a consistent measurement of thermal parameters for these large samples comes from two tools we have developed over the past few years: (i) the Code for Ionization and Temperature Evolution (cite, Gaikwad et al. 2017a) and (ii) the VoIgt profile Parameter Estimation Routine (viper, Gaikwad et al. 2017b), an automatic Voigt profile fitting code that decomposes Lyα absorption spectra into Voigt profile components. cite not only allows us to construct models with a wide range of thermal and ionization histories efficiently without running full hydrodynamical simulations, but also enables us to calculate the non-equilibrium evolution of the thermal and ionization state of the gas (see Gaikwad et al. 2019, for details). cite has been shown to reproduce the results of full hydrosimulations to well within 10 percent accuracy (Gaikwad et al. 2018). viper runs on parallel architectures and allows us to perform a Voigt profile analysis for large samples of observed and simulated spectra consistently (see Maitra et al. 2019Maitra et al. , 2020, for a clustering analysis using Voigt profile components). Thanks to these two codes, we are in a position to measure the physical parameters of the IGM from Lyα forest data by simultaneously using the different statistics mentioned above.
We present here the measurements of thermal parameters (T0 and γ) over the redshift range 2 ≤ z ≤ 4 in 10 redshift bins of width ∆z = 0.2 using 103 high resolution, high S/N QSO spectra drawn from the kodiaq dr2 sample. The paper is organized as follows, in §2 we present the details of the observational data used in this work and compare our new measurements of the mean flux as a function of redshift, H i column density distribution, flux PDF and power-spectrum with measurements in the literature. We describe our simulations and explain how we generate simulated spectra for a wide range of finely spaced thermal parameters in §3. We provide details of four different flux statistics of Lyα forest used to measure thermal parameters in §4 and in online supplementary appendix E. In §5 we show our measurements of thermal parameters, present our error analysis and compare our measurements with measurements in the literature. In §6 we show predictions for the thermal parameter evolution for different UVB models. Finally we summarize in §7. Note that we make all the measurements from observational data available to the community as online supplementary material. The default cosmological parameters used here are (ΩΛ, Ωm, Ω b , σ8, ns, h, Y ) = (0.69, 0.31, 0.0486, 0.83, 0.96, 0.674, 0.24), consistent with a flat ΛCDM cosmology (Planck Collaboration et al. 2014). All distances are given in comoving units unless specified. We have expressed ΓHI in units of 10 −12 s −1 denoted by Γ12. We often refer to T0, γ as thermal parameters in this work.

OBSERVATIONS
We use observed spectra from the second data release of the Keck Observatory Database of Ionized Absorption toward Quasars (kodiaq dr2) survey (O'Meara et al. 2015(O'Meara et al. , 2017 1 . The sample consists of 300 QSO spectra with emission redshifts z ≤ 5.3. All available spectra are continuum normalised and the data product provides normalised flux and the associated error as a function of wavelength. Many of these QSOs were observed more than once with different exposure times. We co-added all the spectra using the procedure described in online supplementary appendix A. In total there are 214 QSO spectra that cover the Lyα forest in the range 1.9 ≤ z ≤ 4. We manually checked all the coadded spectra and excluded spectra if one or more of the following criteria are satisfied: (i) sightline does not (partially or fully) contain Lyα forest in the redshift range 1.9 ≤ z ≤ 4, (ii) sightline contains Damped Lyα (DLA) or Figure 1. Each horizontal line shows the Lyα redshift range corresponding to the wavelength range between Lyα and Lyβ emission for individual QSOs in our sample. We exclude QSO proximity regions of 10 pMpc from the QSO towards us. The spectra are divided in ten redshift bins, each of size ∆ = 0.2 centered on z = 2.0, 2.2, · · · , 3.8 as shown by the vertical dashed lines. The number of lines of sight in a given redshift bin are shown in the boxes.
sub-DLA systems, (iii) the sightline contains large spectral gaps (see online supplementary appendix A for details), or (iv) the median S/N per pixel along the sightlines is smaller than 5. After excluding the QSO spectra with above criteria, the resulting sample consists of 103 QSO spectra. Fig.  1 shows the Lyα redshift range corresponding to the wavelength range between the Lyα and Lyβ emission lines of the 103 QSOs in our sample. The Lyα absorption close to the QSOs is expected to be influenced by the enhanced ionizing flux due to the QSOs (Carswell et al. 1982;Kulkarni & Fall 1993;Srianand & Khare 1996;Lidz et al. 2007;Calverley et al. 2011;Bolton et al. 2012). We exclude such biased QSO proximity regions (i.e 10 pMpc from the QSO towards us) in our subsequent analysis. This choice of QSO proximity region size corresponds to the quasar rest frame wavelength of ∼ 1166Å at z = 3. To avoid possible contamination from intrinsic O iv absorption we consider only regions with rest wavelength greater than 1050Å at the quasar emission redshift.
In order to measure the evolution of thermal parameters, we divide our sample into ten redshift bins centered on z = 2.0, 2.2, 2.4, · · · , 3.6 and 3.8 with a bin width of ∆z = 0.2. Fig. 1 shows the number of QSO spectra contributing to our sample in each of these ten redshift bins.
The properties of the observed spectra in these redshift bins are summarized in Table 1. The median S/N of the observed sample is > 10 in all redshift bins. The observed Lyα forest regions are usually contaminated by metal lines that produce narrow absorption features and potentially could bias our measurements of T0 and γ. To account for this we manually identified contaminating metal line absorption using the list of known intervening metal line systems along these sightlines. It is not always possible to identify all the metal lines contaminating the Lyα forest. However, as we fit all the observed spectra with our automated Voigt profile fitting routine viper (Gaikwad et al. 2017b) we can mitigate this. We treat all lines with b ≤ 8 km s −1 as metal lines. Finally, we replace all metal lines by continuum and add noise to the replaced regions. We have checked the effect of residual metal line contamination on our T0 − γ measurements and found the effect to be marginal (see online supplementary appendix H).
There could be additional errors in the observed flux due to continuum fitting uncertainties. Continuum fitting uncertainties depend not only on the number of unabsorbed spectral regions used in the fit but also on the observed S/N per pixel in these regions and the QSO spectral energy distribution (SED). Since the true QSO continuum is unknown, an exact quantification of the contribution of continuum fitting uncertainty to the error in the normalised flux is not available for the kodiaq dr2 sample. The continuum fitting uncertainty is expected to be of the order of a few percent for moderate S/N data (O'Meara et al. 2015). We thus allow for the possibility of a systematic error of ±5 per cent in the normalised flux for our final T0 and γ measurements to account for continuum uncertainties.

Comparison of observed statistics with previous measurements
In this section, we derive the statistics of the transmitted Lyα flux for our sample and compare them with results in the literature. In particular, we compare the evolution of mean flux, flux probability distribution function (FPDF), flux power-spectrum (FPS) and H i column density distribution function (CDDF) from our sample with other measurements in the literature. In §4 we describe in detail our method of calculating these flux statistics (along with other statistics) from observations and simulations. The last column in Table 1 provides the mean observed transmitted flux measured in each redshift bin. The error on the mean flux given also accounts for the systematic uncertainty due to continuum placement. As expected, the mean flux decreases monotonically with redshift due to the increase in the opacity of the IGM at higher redshifts. In Fig.  2, we compare the evolution of the mean flux F of our sample (see Table 1  . The presence or absence of this possible excess in F has only a marginal effect on our measured T 0 − γ but may be interesting in its own right as we will discuss later (see §2 for details).   with some notable differences. F at z = 2.2, 3.0 and 3.2 in our sample is smaller by ≤ 2σ than that of Kirkman et al. (2005);Kim et al. (2007). One possible reason could be the number of QSO lines of sight used in their work, ≤ 10 at those redshifts. Our F measurement at 2.9 ≤ z ≤ 3.5 ( 2.1 ≤ z ≤ 2.5) is systematically larger (smaller) than that of . The F evolution in our sample is in good agreement with that from Faucher-Giguère et al. (2008) over the full redshift range. We see a slight enhancement in F at 2.9 ≤ z ≤ 3.5 in the kodiaq dr2 sample (see online supplementary appendix B for more details). We have also analyzed the squad dr1 QSO sample for the purpose of calculating the F evolution (Murphy et al. 2019). The F evolution from two samples are in good agreement (within 1σ) with each other (except at z = 3.0, 3.2. We find that the enhancement in F at z = 3.2 is less prominent in squad dr1, but the F evolution shows a change in slope at z > 3.2. It is noteworthy that the statistics sensitive to thermal parameters are in good agreement with each other for the kodiaq dr2 and squad dr1 samples. Furthermore, note again that we rescale simulated optical depths to match the observed F which reduces the effect of differences in F on thermal parameters. The presence or absence of the slight excess of F discussed above has thus only a marginal effect on measurements of T0 − γ presented in this work. We compare the FPDF at 2.71 ≤ z ≤ 3.21 from our sample with those from Kim et al. (2007); Calura et al. (2012); Rollinde et al. (2013) in Fig. 3 (panel A1 and A2). The number of spectra used by Calura et al. (2012); Rollinde et al. (2013) and Kim et al. (2007) is 2, 5 and 8 respectively, while our sample contains ∼ 25 spectra. Despite this, our FPDF in the range 0.1 ≤ F ≤ 0.9 is within ∼ 13 percent of that from Kim et al. (2007); Calura et al. (2012); Rollinde et al. (2013). The FPDF statistics appears reasonably well converged even for small number of spectra. In this work, we focus on the FPDF in the flux range 0.1 ≤ F ≤ 0.9 because the flux at F < 0.1 (F > 0.9) could be dominated by sky subtraction (continuum placement) uncertainty.
In panels B1 and B2 of Fig. 3, we show a comparison of the FPS from our sample with that from Walther et al. (2018). Even though the sample used in Walther et al. (2018) and our work here is the same (i.e, the kodiaq DR2 sample), the number of QSOs per redshift bin is different because of our selection criteria (see §2). Our method of calculating the FPS is also different. Since our sample is selected such that the spectra do not contain spectral gaps, we compute the power spectrum using FFT. We also forward model the simulated Lyα forest spectra to mimic the noise and instrumental broadening properties of the observed spectra. Hence when calculating the FPS, we neither subtract the noise power nor deconvolve the instrumental broadening. We also replace metal lines with continuum and add noise in the replaced regions, while Walther et al. (2018) mask metal line regions. Despite these differences, when we compare our FPS at 0.01 ≤ k (s km −1 ) ≤ 0.1 with that from Walther et al. (2018), we find good agreement. Note that we use the FPS only at 0.01 ≤ k (s km −1 ) ≤ 0.1 to measure T0 and γ, as these scales are most sensitive to the variation of thermal parameters.
In Fig. 3 (panel C1-C2), we compare the CDDF from Kim et al. (2013) with that obtained from our sample using viper. For a fair comparison, we restrict the observed spectra of our sample to the redshift range 1.9 ≤ z ≤ 3.2 and we use the same definition of the CDDF. Kim et al. (2013) present the CDDF for the range 13 ≤ log NHI ≤ 18 since their sample is incomplete at log NHI < 13. We account for the incompleteness in our measurement by calculating the sensitivity curve hence our CDDF measurements are shown from log NHI = 12.6 to 15.8. Our observed CDDF is in good agreement (∼ 3.75 percent) with that of Kim et al. (2013). At log NHI > 15, we find more systems compared to Kim et al. (2013). This may be due to the fact that viper fits only Lyα absorption while Kim et al. (2013) fit Lyα and Lyβ absorption simultaneously. Despite this, the two CDDF in the range 13 ≤ log NHI ≤ 15 are consistent with each other. We refer the reader to online supplementary appendix C for a discussion of the associated errors in these statistics for our measurements and those in the literature.

SIMULATIONS
We evolve the cosmological density, velocity and temperature field using the smooth particle hydrodynamic (SPH) gadget-3 2 code. The initial conditions are generated at z = 99 using 2lpt (Scoccimarro et al. 2012) 3 . Our fiducial simulations have a box size of 10 h −1 cMpc and have 512 3 baryon particles and an equal number of dark matter particles corresponding to a gas particle mass of ∼ 10 5 M . The gravitational softening length is set to ∼ 0.65 h −1 ckpc. We found this to be the best compromise in terms of resolution, dynamic range and ability to run a sufficiently fine grid of thermal parameters. We have performed a resolution study using the Sherwood simulation suite 4 and show that our results are sufficiently well converged for our choice of simulation (see §4.4, Bolton et al. 2017). We also use the Sherwood simulation suite to demonstrate that our measurements are not significantly affected by the somewhat limited box-size and thus spatial dynamical range.
gadget-3 is a modified version of the publicly available gadget-2 code (Springel 2005) that incorporates the radiative heating and cooling by a time varying, spatially uniform UV background. For our fiducial simulations we incorporate the KS19 UVB model (QSO SED index α = −1.4) by modifying the TREECOOL file in gadget-3. We store the output of the gadget-3 code at equal redshift intervals z = 6.0, 5.9, · · · , 2.1, 2.0. We employ a simplified star formation criteria that converts particles with ∆ > 1000 and T < 10 5 into stars (the so called quick Lyα option, Viel et al. 2004a) and do not include AGN or stellar feedback.
To generate physically motivated thermal histories one needs to perform computationally expensive cosmological simulations for a range of UVB models Walther et al. 2019). In this work, we follow the approach laid out by Gaikwad et al. (2018) and explore the thermal parameter space efficiently. Our procedure to simulate the thermal history is: (i) We perform a gadget-3 simulation with the KS19 UVB model using equilibrium ionization evolution equation. The typical thermal parameters of such a simulation at 2.0 < z < 4.0 are T0 ∼ 9500 K and γ ∼ 1.5. (ii) To obtain the variation in thermal parameters, we modify the photo-heating rates in the UVB code (see online supplementary appendix D). We then solve the ionization and thermal evolution equation for each particle on gadget-3 outputs using our thermal evolution code Code for Ionization and Temperature Evolution (cite, Gaikwad et al. 2017a) (iii) We apply pressure smoothing corrections while extracting the nHI, T and v fields along a sightline by convolving the SPH kernel with the pressure smoothing Gaussian kernel (Gaikwad et al. 2018). (iv) Finally, we compute the Lyα optical depth accounting for thermal broadening, natural line width broadening and peculiar velocity effects. In Gaikwad et al. (2017aGaikwad et al. ( , 2019, we showed that thermal parameters can be constrained within 1σ uncertainty provided the mean flux is matched between simulation and observations. However unlike Gaikwad et al. (2019), in this work, we solve for the ionization evolution assuming equilibrium while solving the thermal evolution using the non-equilibrium equations.
We vary the thermal parameter evolution by scaling the H i, He i and He ii photo-heating rates of the KS19 UVB (see Becker et al. 2011, for a similar approach). We generate T0 − γ combinations for a finely sampled grid such that T0 is varied from ∼ 6000 K to ∼ 24000 K in steps of 500 K while γ is varied from ∼ 0.7 to ∼ 2.0 in steps of 0.05 at z = 3. The corresponding T0 and γ values at other redshifts differ by ≤ 7 percent. We have thus simulated the Lyα forest for 37×27 = 999 different thermal histories. The computational time required to run and extract Lyα forest spectra for 999 UVB models is around ∼ 2.25 million cpu hours. One can perform ∼ 50 self-consistent L10N512 simulations in similar time. Note that the number of thermal parameters probed in this work is larger by a factor of ∼ 13 and ∼ 50 than those by Walther et al. (2019, N thermal = 76) and Becker et al. (2011, N thermal = 18), respectively. We refer the reader to online supplementary appendix D for more details.
We generate the overdensity (∆), neutral fraction ( fHI), temperature (T ) and peculiar velocity (v) fields along 20000 random skewers through our simulation box at z = 1.9, 2.0, 2.1, · · · , 3.7, 3.8 and 3.9. In each redshift bin we concatenate the fields to match the observed redshift path length (∆z = 0.2). However, while computing the FPS, we do not concatenate spectra as there is no correlation beyond the length of the simulation box. The Lyα optical depth is generated from the ∆, fHI, T and v fields accounting for Doppler broadening, natural line broadening and peculiar velocity effects (Choudhury et al. 2001;Padmanabhan et al. 2014). For each redshift bin, we construct a single simulated mock sample with the same number of spectra as observed in that redshift bin (see Gaikwad et al. 2017a, for a similar method at z < 0.5). The simulated spectra in a mock sample mimic the observed spectra by (i) convolving the flux field with Gaussian instrumental broadening (FWHM ∼ 6 km s −1 ) (iii) resampling the pixel distribution similar to the observed data, and (iii) by adding Gaussian random noise generated from the observed S/N per pixel array. We generate 100 such mock samples to constitute a mock suite for each redshift bin. For example, in the redshift range 2.9 ≤ z ≤ 3.1, each mock sample consists of 26 simulated spectra and a mock suite consists of 26 × 100 = 2600 simulated spectra. We use these 100 mock samples to estimate the errors on different flux statistics. In order to compare the simulations with observations, we calculate the flux statistics that we will show to be sensitive to thermal parameters in the next section.

METHOD
The increase in temperature of the IGM due to He ii reionization has the following main effects on the H i Lyα forest (i) broadening of the absorption features Lidz et al. 2010;Becker et al. 2011), (ii) decrease in neutral fraction due to the temperature dependence of the recombination rate (Rauch et al. 1997 In what follows, we describe the statistics used in this work to measure T0 and γ.

Lyα forest statistics
In the literature, T0 and γ have been measured by two different kind of statistics, namely, those derived from the transmitted Lyα flux directly and those derived by fitting the transmitted Lyα flux with multi-component Voigt profiles.
We use here four statistics, namely, the flux power spectrum (FPS), the wavelet statistics (WS), the curvature statistics (CS) and the line width (b) distribution (hereafter BPDF) of Voigt profile components to measure the thermal parameters.

Flux power spectrum (FPS)
The FPS is sensitive to a wide range of parameters: thermal parameters, ΓHI, cosmological parameters (Ω b , Ωm, ns), the free-streaming length of dark matter particles and neutrinos etc (Viel et al. 2004b;Garzilli et al. 2017;Iršič et al. 2017a,b;Gaikwad et al. 2017a;Khaire et al. 2019;Gaikwad et al. 2020). For a given set of cosmological parameters in the ΛCDM framework, the FPS is sensitive to small scale smoothing of the Lyα flux arising from heating of the IGM. The FPS is systematically lower for models with higher T 0 at 0.02 k ( km −1 s) 0.2, while the FPS is systematically higher for models with smaller γ at 0.02 k ( km −1 s) 0.2. The FPS at scales k 0.02 is similar in all models because the underlying dark matter density field is the same in all the models, while the FPS at 0.2 ≥ k is dominated by observational systematics such as noise and instrumental broadening.
brevity, let us call it the hot model) is somewhat smoother and shows less peaked absorption as compared to the lower-T0 (the cold model) model. To calculate the FPS, we take the Fourier transform F of the flux contrast δ f = (F/ F ) − 1 where F is the mean flux in a given redshift bin. We bin the Fourier power |F | 2 in logarithmic bins centered on k ( km −1 s) = 0.005, · · · , 10.0 and with bin width δ log k = 0.125. This probes scales from ∼ 1250 km s −1 to ∼ 0.628 km s −1 . The lower and upper limits of k are motivated by the size of our simulation box and the resolution of the observed spectra, respectively. While computing the FPS, we do not concatenate spectra as there is no correlation beyond the length of the simulation box. Fig. 5 shows the sensitivity of the FPS to the thermal parameters T0 and γ. The power at 0.01 ≤ k (s km −1 ) ≤ 0.1 for a model with higher T0 (smaller γ) is smaller due to more smoothing of the flux. The model predicted FPS at scales k < 0.01 km −1 s are nearly same as the dark matter density field is the same for all models. On the other hand, the FPS at k > 0.3 km −1 s is dominated by observational systematic effects such as instrumental broadening and finite S/N per pixel. The FPS at 0.02 ≤ k (s km −1 ) ≤ 0.3 ( 20 ≤ v (km s −1 ) ≤ 300) is clearly sensitive to thermal parameters.

Wavelet Statistics (WS)
Wavelets are wave-like oscillations that are localized in both the real space and frequency domains. One can use a suitable wavelet to extract the scales in transmitted Lyα flux that are sensitive to T0 and γ (Zaldarriaga 2002; Theuns et al. 2002a;Lidz et al. 2010;Garzilli et al. 2012). Mathematically, this is equivalent to a convolution of the transmitted Lyα flux with the wavelets. Throughout this work, we use Morlet wavelets that look like a Gaussian in Fourier space 5 . The oscillation frequency of the wavelet depends on the center of the Gaussian in Fourier space. While the width of the wavelet is related to the width of the Gaussian in Fourier space. The Morlet wavelets have the functional form, where v is velocity, sn is wavelet scale, C is a normalization constant obtained by demanding that wavelets are square integrable i.e., |Ψ(v, sn)| 2 dv = 1. The wavelet transform of the flux (hereafter the wavelet field ASn) is the convolution of the wavelet Ψ(v, sn) with the transmitted Lyα flux (F ), Fig. 4 (panel B) demonstrates the sensitivity of the wavelet amplitude (for the wavelet scale sn = 50 km s −1 ) to the temperature of the IGM 6 . The wavelet amplitudes (log A50) in the hot model are systematically smaller than those from the cold model. It is also evident that the wavelet amplitude picks out small scale variations in the flux field. To quantify the effect of T0 and γ on the wavelet amplitude, we focus on the probability distribution function of the wavelet field. Usually, the wavelet field needs to be smoothed (e.g., using a boxcar filter of ∼ 1000 km s −1 , see Lidz et al. 2010) to reduce the effect of flux noise on the wavelet PDF. However, we find that such a boxcar smoothing reduces the sensitivity of the wavelet PDF to T0 and γ. Instead we use kernel density estimation to calculate the wavelet PDF. This reduces the effect of noise while preserving the sensitivity of the wavelet PDF to T0 and γ (see online supplementary appendix E1 for details). Fig. 6 shows the comparison of the wavelet PDF for 3 models (i) T0 = 10000 K, γ = 1.5, (ii) T0 = 20000 K, γ = 1.5, and (iii) T0 = 10000 K, γ = 1.1. The wavelet PDF is systematically shifting to lower values for higher T0 at all the three wavelet scales sn = 30, 50, 100 km s −1 . Fig. 6 also shows that the wavelet scale sn = 50 km s −1 is most sensitive to variation in T0. At smaller wavelet scales sn = 30 km s −1 it is less sensitive to T0 due to observational effects (noise and resolution) while larger wavelet scales sn = 100 km s −1 trace the underlying dark matter density field which is the same for the three models. The wavelet amplitudes are systematically lower for higher γ values. This is expected because for the same normalization T0, a higher γ value corresponds to larger temperature at ∆ > 1. Since a significant fraction of the Lyα absorption at z ∼ 3 comes from ∆ > 1, the Lyα absorption lines in models with higher γ are expected to be broader. But it is interesting to note that the wavelet scale sn = 100 km s −1 is slightly more sensitive to T 0 = 10000 K, γ = 1. 50 T 0 = 20000 K, γ = 1. 50 T 0 = 10000 K, γ = 1. 10 Figure 6. The left, middle and right panels show the sensitivity of the wavelet PDF to T 0 and γ for a wavelet scale of sn = 30, 50 and 100 km s −1 , respectively (for 2.9 ≤ z ≤ 3.1). Each panel shows the wavelet PDF for three models (i) T 0 = 10000 K, γ = 1.50 (blue solid line), (ii) T 0 = 20000 K, γ = 1.50 (green dashed line) and (iii) T 0 = 10000 K, γ = 1.10 (red dotted line). The wavelet PDF is systematically shifted to higher log A Sn for models with smaller T 0 , while the wavelet PDF is systematically shifted to smaller log A Sn for model with larger γ. The effect of T 0 and γ variation is smaller at small wavelet scales, i.e., sn = 30 km s −1 (panel A). The effect of T 0 on the wavelet PDF is larger at intermediate wavelet scale sn = 50 km s −1 , whereas the effect of γ is larger at higher wavelet scales sn = 100 km s −1 . We show in Fig. 11 that the simultaneous constraints using wavelet PDFs at scales 30 ≤ sn ( km s −1 ) ≤ 100 yield better constraints on T 0 and γ. All the wavelet PDFs are calculated using KDE (see online supplementary appendix E1).
γ than to T0. Thus, if we simultaneously use wavelet PDFs at scales 30 ≤ sn ( km s −1 ) ≤ 100, the constrains on T0 and γ should be better than using the wavelet PDF for the individual scales.

Curvature Statistics
The thermal broadening of the absorption lines can also be characterized by the curvature of the absorption features . The curvature (κ) is essentially a measure of the rate of change of direction of a point that moves on a curve and is defined as where F = dF/dv and F = d 2 F/dv 2 are the first and second derivatives of the flux field (with respect to velocity), respectively. Similar to wavelet amplitudes, the curvature can also be affected by the finite S/N of the spectra. To circumvent this difficulty, Becker et al. (2011) fitted a spline curve to the transmitted Lyα flux field and used that to compute the curvature. We, however, use a simpler approach of smoothing the flux field by a Gaussian filter of FWHM ∼ 10 km s −1 (for a similar approach, see Padmanabhan et al. 2015). Fig. 4 (panel C) shows the comparison of the curvature field for the hot and cold models. Similar to the wavelet field, the curvature amplitude is higher for the cold model as compared to the hot model. In this work, we use the curvature PDF to measure T0 and γ. Fig.  7 shows the sensitivity of the curvature PDF to T0 and γ. The curvature PDF is shifted to lower values of log |κ| for hot models as compared to cold models. The curvature PDF is also sensitive to variations in γ. However, note that the curvature PDF shows less sensitivity to T0 and γ than the wavelet PDF. . The figure shows the sensitivity of the curvature PDF to T 0 and γ at 2.9 ≤ z ≤ 3.1. The curvature PDF is systematically shifted to higher (smaller) log |κ| for smaller T 0 (larger γ).
The curvature PDFs are calculated using KDE (see online supplementary appendix E1). The curvature PDFs show a similar behaviour as the wavelet PDF except that it is less sensitive to thermal parameters than the wavelet PDF (at sn = 50 km s −1 ). For higher values of γ, the b − log N HI distribution becomes steeper. The above distributions are calculated from ∼ 1300 simulated spectra at 2.9 ≤ z ≤ 3.1. Due to the limited number of observed spectra in any given redshift bins, the b − log N HI plane is not sampled densely enough to be directly used for constraining the thermal parameters. To circumvent this problem, we calculate 1D b parameter probability distribution functions (BPDFs) in different log N HI bins which are individually sensitive to the thermal parameters (see Fig. 9). Vertical lines demarcate the log N HI bins used in this work to compute 1D BPDFs. spectra. The lower envelope of the b − log NHI distribution has been used to measure the thermal parameters (Schaye et al. 1999;Bolton et al. 2014;Gaikwad et al. 2017b;Rorai et al. 2018;Hiss et al. 2018Hiss et al. , 2019Telikova et al. 2019). Fitting Voigt profiles to Lyα forest spectra is challenging and somewhat subjective. Furthermore the complexity of Lyα absorption increases with increasing redshift as the opacity of the IGM increases. We further need to fit a large number of simulated spectra when measuring T0 − γ model in each redshift bin. To facilitate this, we use our VoIgt profile Parameter Estimation Routine (viper) that automatically fits Lyα forest spectra with multi-component Voigt profiles (see Gaikwad et al. 2017b, for details). The output of viper consists of best fit values (along with 1σ uncertainty) of line center (λc), H i column density ( NHI), b parameter and significance level of detection for each component. We used viper to fit N los × 50 (see Table 1) mock Lyα forest spectra in each redshift bin. The total time taken by viper to fit all the spectra in all redshift bins is ∼ 1.5 million cpu hours. Panels D and E in Fig. 4 show examples of the viper fits (along with residuals) to the cold and hot model, respectively.

The
Neither the gas temperature nor the over-density are directly observable quantities. However, the b-parameter of an absorption line is related to the gas temperature while the measured column density( NHI) is related to the overdensity. The b − log NHI distribution can thus nevertheless be used to measure the thermal parameters. The lower envelope in the b − log NHI distribution is shown to be sensitive to thermal parameters (Webb & Carswell 1991;Schaye et al. 2000;Bolton et al. 2014;Gaikwad et al. 2017b;Rorai et al. 2018). However, the lower envelope method utilizes fewer points that define the lower cutoff in b values at given log NHI ( To circumvent this difficulty, we have used a modified approach to fit the b − log NHI 2D distribution as shown in Fig. 8 and 9. We divide the b − log NHI plane in different log NHI bins centered on 12.9, 13.1, · · · and 14.5 with bin width dlog NHI = 0.2. We then calculate the 1D distribution of b parameters in each log NHI bin using KDE. Fig. 9 shows the sensitivity of the b distribution to T0 and γ in the three log NHI bins 13.0 ≤ log NHI ≤ 13.2, 13.4 ≤ log NHI ≤ 13.6 and 13.8 ≤ log NHI ≤ 14.0. The b distribution shows a systematic shift towards higher b values for higher T0 models. The b distribution is less sensitive to γ in the range 13.0 ≤ log NHI ≤ 13.2, but it is more sensitive in the range 13.8 ≤ log NHI ≤ 14.0. Thus, the b distribution at small column density is rather sensitive to T0 but less sensitive to γ. This systematic variation in the b distribution can change with redshift as the Lyα forest is sensitive to different overdensities at different redshifts. We simultaneously use the b distribution in all the log NHI bins to constrain the thermal parameters. Our method is similar to comparing the full 2D b − log NHI distribution from simulations with observations in the sense that we still use all the data points in the b − log NHI distribution. The only difference is that the size of the log NHI bin is larger than that used in the conventional 2D b − log NHI distribution. This effectively reduces the statistical fluctuations due to binning.

Error estimation of Lyα statistics
The best fit values and the associated uncertainties on T0 −γ crucially depend on how accurately we can compute χ 2 between data and model for a given T0 − γ. The accuracy of the χ 2 estimation, in turn, depends on (i) the estimated errors (covariance matrix) of the flux statistic and (ii) how finely we sample the T0 − γ plane. Since we use the same method/code to derive the flux statistics from observations and simulations, numerical or computational systematics affecting the flux statistics should cancel out in the χ 2 (and hence T0 − γ) estimation.
We have considered two possible ways to calculate the error on each flux statistics: (i) using bootstrap errors estimated from observed data and (ii) deriving the flux statistics and computing the error from 100 simulated mock samples (Rollinde et al. 2013). The second method is not suitable in our case as these errors depend on T0 and γ. This is especially true for the flux statistics WS, CS and BPDF because variation of thermal parameters shifts the PDF to the right or left i.e., along x−axis. We also estimate the covariance matrix using the bootstrap method. The bootstrap errors are found to be slowly converging as the diagonal elements of the covariance matrix converge faster than the off-diagonal ones. We find that the diagonal elements computed from the bootstrap method are smaller by ∼ 25 percent than those estimated from the mock samples. This is expected as the bootstrap method usually tends to underestimate the errors (Press et al. 1992). To obtain the covariance matrix, we first inflate the boostrap error by ∼ 25 percent. We then calculate the correlation matrix from simulated mocks and rescale it using inflated bootstrap errors. We use this covariance matrix when calculating χ 2 .
Equally important for the T0 − γ measurements and associated uncertainties is the number of T0 − γ grid points for which χ 2 is calculated (i.e., how finely we sample the parameter space). In this work, T0 is varied from ∼ 6000 K to ∼ 24000 K in steps of δT0 ∼ 500 K while γ is varied from ∼ 0.7 to ∼ 2.0 in steps of δγ = 0.05 (at z = 3). The Lyα statistics are thus derived for 37 × 27 = 999 different T0 − γ models. However, we find that the sampling needs to be finer (δT0 ≤ 200 K and δγ ≤ 0.02) if we want the 1σ contours for T0 and γ to be converged. This is because the T0 −γ uncertainty from joint constraints can be smaller than that from individual statistics. Due to limited computational resources, we populate the χ 2 field by interpolating all the model statistics on to a finer T0 − γ grid with δT0 = 100 K and δγ = 0.01. We find that such a linear interpolation of statistics between different T0 − γ model is accurate to 1.5 percent. Our approach is similar to that in Walther et al. (2019), except that we use simple linear interpolation instead of an emulator. Emulators are useful when the initial T0 and γ grid is sparse. In our case, the initial T0 − γ grid is more densely sampled than previous works, hence the linear interpolation of statistics between different T0 − γ model is sufficient.

Method of constraining thermal parameters
In this work, we measure the thermal parameters by minimizing χ 2 between the observed PDF/PS (D) and model PDF/PS (M (T0, γ)), where C is the covariance matrix for each statistics is calculated as described in previous section. We compute χ 2 for our T0 − γ grid and find the best fit model that corresponds to the minimum χ 2 for each statistics. We then calculate the 1σ uncertainty on the parameters by marginalizing over χ 2 min ± ∆χ 2 contours (∆χ 2 = 3.50 for 2 parameters + F normalization, Avni 1976). For the wavelet statistics, the observed and model PDFs depend on an additional parameter, the wavelet scale sn. To perform simultaneous measurements of T0 and γ using the wavelet statistics, we add the χ 2 from different wavelet scales sn = 30, 40, · · · , 90 and 100 km s −1 i.e., χ 2 WS = χ 2 30 + χ 2 40 + · · · + χ 2 90 + χ 2 100 . Similarly, the observed and model BPDF depends on log NHI bins. In this case, we add χ 2 from different log NHI bins ( 12.8 ≤ log NHI ≤ 13.0, · · · , 14.4 ≤ log NHI ≤ 14.6). When performing the measurements from combined statistics, we add the χ 2 from all the statistics under consideration, In this work, we ignore the correlation among different statistics for simplicity. Hence the statistical uncertainty on T0−γ in joint constraints may be somewhat underestimated.

Validation of our approach using mock data
Before measuring the thermal parameters from observations, we demonstrate the accuracy of our method in recovering the thermal parameters with an end-to-end test on mock data created from our fiducial hydro-simulation(s) and from the much larger dynamic range simulation from the Sherwood simulation suite (computed with cosmological parameters similar to what we use in our models). This is with the aim to (i) test the sensitivity of statistics to thermal parameters, (ii) study the degeneracy between thermal parameters, (iii) quantify the accuracy of the method, (iv) check if there are any systematic effects between true and recovered thermal parameters, (v) check if the simulations are sufficiently converged and (vi) test for the effect of Jeans smoothing.
The effect of box size and resolution on statistics of the Lyα forest using a range of Sherwood simulations are discussed in online supplementary appendix F. As shown in Fig. F1 to F6, the L10N512 simulation used in this work is sufficiently converged for the corresponding Sherwood model. To test how well our models can recover the thermal parameters, we generate Lyα forest statistics in three redshift bins from mock data based on the L40N2048 simulations in the Sherwood suite which has the same resolution and four times larger box size than our fiducial hydrosimulation The flux statistics are generated from ∼ 20000 skewers as explained in §3. The mock data and our model spectra both closely mimic the properties of the observed sample e.g. number of QSOs, S/N, resolution and redshift path length in the corresponding redshift bin. The errors for each statistics are calculated as discussed in §4.2. When using FPS, BPDF, wavelet and curvature PDF to characterise the thermal parameters we compute the χ 2 between model and mock data (see §4.2 for details). Fig. 10 shows the recovery of thermal parameters using our models and the mock data generated from the L40N2048 Sherwood simulation. For all four statistics, the true thermal parameters of the mock data are within the 1σ contours and are thus close to the best fit measured thermal parameters. The true thermal parameters are also well within the 1σ contours of the measured values if the four statistics are combined (shown by the black contours and the shaded region). As expected the joint analysis contours are narrower than those from individual statistics. The best fit thermal parameters do not show any systematic deviation from the true thermal parameters indicating that any uncertainty in recovery is likely to be statistical in nature.

The effect of varying Jeans smoothing
The width of absorption features depends on the instantaneous thermal state as well as on the entire past thermal history due to the smoothing effect of the thermal pressure on the spatial distribution of the gas, an effect that has been dubbed Jeans smoothing (Gnedin & Hui 1998;Peeples et al. 2010;Kulkarni et al. 2015). At the redshifts considered here, well past H i reionization and with the additional heat input from He ii reionization ongoing, the memory of different possible hydrogen reionization histories is expected to be modest. We have verified this by repeating our end-to-end test with mock data from an otherwise identical simulation of the Sherwood suite which has the same instantaneous temperatures, but where hydrogen reionization occurs considerably later (the zr9 Sherwood simulation). The difference in the measured thermal parameters was hardly noticeable (< 0.8 percent). The range of possible He ii reionization histories consistent with the He ii opacity data is also rather small (Worseck et al. 2019). Hence the effect of the differences in Jeans smoothing for the same instantaneous temperature and different He ii reionization histories is again expected to have a small effect on our measurements of thermal parameters.

4.6
The effect of spatial fluctuation of the temperature-density relation due to inhomogeneous He ii reionization In reality, He ii reionization will be spatially inhomogeneous and instead of a well defined temperature density relation there will be a range of temperature density relations depending on when the He ii in a particular region was reionized (Rorai et al. 2017b;Upton Sanderbeck & Bird 2020). In Gaikwad et al. (2020) we have shown that for the equivalent situation during H i reionization assuming a homogeneous UVB model nevertheless recovers the median thermal state reasonably well. We have verified that this should also be the case for He ii reionization by producing flux statistics for mock spectra obtained with a wide range of temperature density relations. As the box size of our simulations is much smaller than the expected size of a region where He ii is reionized simultaneously this should mimic the effect of inhomogeneous He ii reionization reasonably well. The thermal parameters measured from these samples of simulated mock spectra with a range of temperature-density relations are indeed within 1σ of the median values. We are not sure if this can be interpreted as a systematic bias, but we note that at and after the peak in temperature the inferred T0 values were about 1σ lower than the median while at z > 3 the difference was hardly noticeable.

RESULTS
In this section, we present our measurements of T0 and γ from the kodiaq DR2 sample. In order to make a fair comparison, we (i) derive the statistics from simulations and ob- T 0 ( × 10 3 K) Recovery of T 0 − γ for L40N2048 (SHERWOOD) 6 7 8 9 10 11 12 13 14 T 0 ( × 10 3 K) 3. 7 z 3. 9 Recovery of T 0 − γ for L40N2048 (SHERWOOD) Wavelet Curvature Joint Constraints FPS BPDF Best Fit Value True Value Figure 10. The left, middle and right panels show the recovery of thermal parameters using FPS, BPDF, wavelet and curvature statistics at z = 2, 3.2 and 3.8 respectively. The 1σ joint constraints (using all statistics simultaneously) on the thermal parameters are shown by black contours and the shaded regions. We here treat the Lyα forest generated from the large dynamic range L40N2048 Sherwood simulation as mock data. We post-process our fiducial L10N512 gadget-3 simulation with cite and generate model Lyα forest spectra for 999 different thermal parameters by scaling the KS19 photo-heating rates. The initial conditions of the L10N512 simulation used in this work are different from any of the ICs used in the Sherwood simulation suite. The mock and model Lyα forest data both mimic the observational properties at the respective redshifts. The true values (cyan stars) of the Sherwood simulation used to create the mock data sample lies within the gray shaded region at all three redshifts. Thus our method successfully recovers the thermal parameters despite the smaller spatial dynamic range of the simulations used and the approximate nature of our modeling of the thermal history.
servations in the same way, (ii) calculate the error on each statistics from simulations and/or observations, (iii) minimize the χ 2 between data and model and (iv) find the best fit model and the associated 1σ uncertainty on the parameters (see §4.3) 5.1 T0 − γ constraints in the redshift range 1.9 ≤ z ≤ 3.9 Fig. 11 shows measurements of T0 and γ in redshift bin 2.9 ≤ z ≤ 3.1. Panel A1 shows the measurements of thermal parameters from individual as well as the combined statistics. The constraints from all four statistics agree with each other within the 1σ uncertainties. The constraints on γ from the curvature statistics are poor compared to those from the other statistics. This is because the curvature PDF is less sensitive to γ compared to T0. In previous measurements using curvature statistics, T0 is measured at a characteristic density ∆ for an assumed value of γ. This does not allow for an independent measurement of γ. It is also evident that T0 and γ are anti-correlated for the wavelet statistics, BPDF and FPS, while there is little or no correlation between T0 and γ for the curvature statistics. Panels A2 and A3 show measurements of γ and T0 marginalizing over T0 and γ, respectively. The marginalized distributions also show that the T0 and γ measurements using different statistics are consistent with each other within 1σ. Note that the data and model Lyα forest spectra are the same in all cases only the method/statistics used to measure T0 − γ is different. In order to get tighter and more robust measurements of T0 and γ, we also plot the joint constraints for T0 and γ from all the statistics in panels A1-A3 of Fig. 11. We calculate the joint constraints by adding the χ 2 between model and data for all the four statistics. For simplicity, we ignore the correlation among different statistics. Hence the statistical uncertainty in joint constraints may be somewhat underestimated. However, as shown in Fig. 10 when combining the statistics, the fiducial T0 − γ is recovered well within 1σ. This suggests that the statistical uncertainty for the measurement from the combined statistics is realistic. The joint constrains for T0 and γ are indeed tighter than the corresponding constraints from any individual statistics. The best fit value (T0 = 14750 K and γ = 1.23) for the joint constraints from all the four statistics is shown as the cyan star in panel A1 of Fig. 11 (also see Fig. H1). Fig. 12 shows the comparison of FPDF, BPDF, wavelet PDF and curvature PDF from observations with the best fit model at 2.9 ≤ z ≤ 3.1. The corresponding residuals are shown in the lower panels. The best fit models for all the statistics are in 1σ agreement with the corresponding observations. The χ 2 per degree of freedom for all the statistics varies in the range 0.7 ≤ χ 2 dof ≤ 1.35 (see Fig. H1 and Table  G1). The predictions of our best fit model obtained from the joint analysis are thus in reasonably good agreement with observations for all the statistics. Fig. 13 shows measurements of T0 and γ in the other redshift bins spanning 1.9 ≤ z ≤ 3.9. In all the redshift bins, the T0 and γ measurements from the individual statistics are in good agreement with each other. The γ measurements from the curvature statistics are less tight than those from the other statistics. The uncertainty in T0 − γ using the joint constraints is significantly smaller than that from the individual statistics at all redshifts. The joint T0 − γ constraints are thereby in good agreement with the overlapping regions of all the individual statistics. The variation of the T0 and γ measurements with redshift is also evident from Fig. 13. As expected the T0 and γ constraints are anticorrelated in the lower redshift bins where the absorption . Panel A1 shows 1σ constraints for T 0 and γ from the wavelet PDF (blue dashed contour), curvature PDF (magenta dotted contour), BPDF (red solid contour) and FPS (green solid contour) statistics individually at 2.9 ≤ z ≤ 3.1. The joint constraints from all the four statistics is shown by the gray shaded region. The constraints are obtained by minimizing the χ 2 between data and model. The wavelet constraints correspond to joint constraints from wavelet PDFs for wavelet scales sn = 30, 40, · · · , 100 km s −1 . The BPDF constraints correspond to joint constraints from the BPDF calculated in 9 different log N HI bins [13.0,13.2], [13.2,13.4], · · · , [14.4,14.6]. The curvature PDF gives tighter constraints for T 0 as compared to γ. The 1σ contours from all the statistics are in good agreement with each other. Panels A2 and A3 show the marginalized γ and T 0 distribution from all four statistics and our joint analysis. The best fit T 0 and γ values (cyan star in panel A) lie in the overlapping region of constraints from the the four flux statistics. We show the 1D reduced χ 2 distribution in Fig. H1 (also see Table G1).
features probe densities above the mean. With increase in redshift the T0 and γ correlation becomes weaker, similar to the finding of Walther et al. (2019). This is because at z ≥ 3.5 the Lyα forest is mostly sensitive to approximately mean cosmic density ∆ ∼ 1. In Fig. 14, we compare the evolution of T0 and γ from the individual statistics and joint analysis. The T0 (γ) measurements are marginalized over γ (T0). We can clearly see an evolution of T0 and γ with redshift with a clearly identifiable peak in T0 that coincides with a minimum in γ at z ∼ 3. The T0 and γ measurements from each individual statistics as well as our joint constraints are consistent within 1σ in all the redshift bins. As expected, the T0 −γ uncertainty for the joint analysis is smaller than the corresponding uncertainty from the individual flux statistics. The best fit T0 and γ values show some scatter between neighboring redshift bins for the individual statistics due to the differences in sensitivity Table 2. Measurements of T 0 , γ with total uncertainty (see Table  H1 for error budget).

T0 − γ Error Budget
We consider the following important uncertainties when measuring T0 and γ: (i) modeling uncertainty, (ii) continuum placement uncertainty, (iii) uncertainty due to metal contamination and (iv) cosmological parameter uncertainty. Fig. 15 shows the effect of continuum placement uncertainty and metal contamination uncertainty on our measurements of T0 and γ from our joint analysis of the four different flux statistics (for 2.9 ≤ z ≤ 3.1). The uncertainty in T0 and γ due to continuum placement uncertainty (assumed to be ±5 percent, see O'Meara et al. 2017) is less than 3 percent. The continuum placement uncertainty mainly affects the mean flux of the observed sample. However, when measuring T0 and γ, we rescale the optical depth in the simulations to match the observed mean flux. Contamination by narrow metal absorption lines in the observed Lyα forest can also potentially bias our T0 and γ measurements. We thus remove the metal lines from the Lyα forest to minimize their effect. However, it is still possible that some metal lines are not identified due to insufficient wavelength coverage in the higher wavelength side of the Lyα emission and/or blending effects. By applying a cutoff in the fitted b parameters, we show in Fig. 15 that the contribution of such metal lines to the T0 − γ uncertainty is not larger than ∼ 2 percent. Similarly, we find that the uncertainty in cosmological parameters can lead to 0.5 percent uncertainty in T0 and γ.
One of the main uncertainties in the T0 and γ measurements comes from our modeling of the Lyα forest. Since we vary the thermal state of the gas by post-processing of gadget-3 simulations, the dynamical impact of pressure smoothing may not have been captured accurately. We find that the effect of pressure smoothing on the T0 and γ uncertainty is below 2.5 percent. We refer the reader to online supplementary appendix H for a detailed discussion of the effect of all the above uncertainties on our T0 − γ measurements. Table 2 provides our T0 and γ measurements at different redshifts. In Table H1 we give T0 and γ errors contributed by various above listed uncertainties. The continuum placement uncertainty is systematic, while the other uncertainties are statistical in nature. We add the statistical uncertainties in quadrature while the systematic uncertainty is additive in nature. In the rest of the paper, we show and discuss T 0 = 14500 K, γ = 1.25 (2.9 z 3.1) Observation Figure 12. Panel A1 shows a comparison of the FPS from the observations (blue squares with dashed line) with that of the best fit model [T 0 = 14500K and γ = 1.25] (red solid line) at 2.9 ≤ z ≤ 3.1 generated by cite. The measurement errors are calculated from the observed data using the bootstrap method (gray shaded region). Panel A2 shows the residuals between the observed distribution and the best fit model. Panel B1, C1, D1, E1 and F1 are similar to panel A1 except the observations and predictions of the best fit models are shown for wavelet PDF (sn = 50 km s −1 ) , BPDF (13.4 ≤ log N HI ≤ 13.6), curvature PDF, wavelet PDF (sn = 70 km s −1 ) and BPDF (13.8 ≤ log N HI ≤ 14.0), respectively. Panel B2 to F2 are similar to panel A2 and show again the residuals between observations and best fit model. For all the statistics, the best fit model is in good agreement (∼ 1σ) with observations. Note that the best fit T 0 , γ given in Table 2 and Table H1 corresponds to interpolated best fit values.
constraints on T0 and γ with total uncertainties as given in Table 2.

Evolution of T0 and γ
Fig . 16 shows the evolution of T0 and γ (as given in Table  2 and Table H1) from the joint analysis with total uncertainty in the redshift range 2 ≤ z ≤ 4. Initially T0 (γ) is small (large) at 3.7 ≤ z ≤ 3.9 then it increases (decreases) T 0 ( × 10 3 K) T 0 ( × 10 3 K) 3. 5 z 3. 7 8 10 12 14 16 T 0 ( × 10 3 K) 3. 7 z 3. 9 Wavelet Curvature BPDF FPS Joint Constraints Figure 13. Each panel is same as panel A1 in Fig. 11 except that the 1σ contours for T 0 and γ from all the statistics are shown for different redshift bins. The 1σ contours for different statistics are in good agreement with each other at all redshifts. The joint constraints (the gray shaded regions) are more robust and tighter than the constraints from the individual statistics. The redshift evolution of T 0 and γ is clearly evident from this figure. T 0 and γ are anti-correlated as expected. The degree of correlation decreases with increasing redshift as most of the Lyα forest probes gas typically at cosmic mean density (∆ ∼ 1) at z ∼ 3.8.
attaining a maximum (minimum) at 2.7 ≤ z ≤ 3.1. Subsequently T0 (γ) decreases (increases) at z ≤ 2.7. As we will discuss in §6, the peak in the evolution of T0 and γ is due to the additional heating from He ii reionization. Such a peak in T0 evolution has been suggested based on other published measurements, but note the rather large scatter between different measurements (see Fig. 16 and Fig. I4). Our measurements show for the first time a well defined peak and the expected smooth evolution of T0 and γ in the redshift range 2.0 ≤ z ≤ 3.8, consistently for all four different flux statistics. We attribute this to the following main differences of our study: (i) our observed sample contains a larger number of QSO sightlines than previous studies, (ii) the simulated Lyα forest in our analysis is generated for a finely sampled T0 − γ grid than in previous studies, (iii) we treat the data and simulations on an exact equal footing i.e., our simulated mock spectra are mimicked to match the observed sample, (iv) we apply the same procedure/algorithm to calculate the Lyα flux and Voigt profile statistics to observed and simulated spectra (v) by combining the T0 − γ measurements from different statistics, we mitigate possible biases of the individual flux statistics. All T 0 and γ measurements are consistent with each other in all redshift bins. Note, however, that the errors for T 0 and γ are different for different methods due to different sensitivities of these methods to variations of these parameters. The joint constraints minimize the scatter of the best fit values and the resultant 1σ uncertainty on T 0 and γ is smaller than that for the corresponding individual statistics.

Comparison with other measurements
We now compare the T0 and γ measurements from this work with that from the literature. Fig. 16 shows a compilation of T0 and γ measurements from various papers. T0 and γ have been measured in the past (with variants) of the same four different statistics we have used here. In order to do a fair comparison, we compare our measured T0 and γ from the joint analysis and the individual statistics with the corresponding statistics from the literature. T 0 ( × 10 3 K)  Figure 15. The effect of continuum placement uncertainty and metal contamination on T 0 and γ constraints from the joint analysis of data for 2.9 ≤ z ≤ 3.1. Comparison of the green contours (with metals) with the red contour (without metals) shows that T 0 and γ are underpredicted by ∼ 2 percent if metal contamination is not accounted for. The effect of continuum placement uncertainty (assumed to be ±5 percent) biases the measurements such that a low continuum predicts lower T 0 − γ measurements by 3% or less. We have accounted for both continuum placement and metal contamination uncertainties in our final measurements (see Table 2, Table H1 and online supplementary appendix H for details).

Comparison with measurments from b − log NHI
In panel A1 and A2 of Fig. 16, we compare our thermal parameter evolution with that from Schaye (2001) Telikova et al. (2019) are in good agreement with our γ measurements. However, their T0 measurements are significantly higher than our measurements at 2.6 ≤ z ≤ 3.6. Fitting a lower envelope to the b − log NHI distribution although well motivated, is somewhat subjective. Finding a lower envelope in an objective way is difficult (but see Telikova et al. 2019;Hiss et al. 2019). The lower envelope is found by iteratively rejecting the lines with low b parameter until convergence is achieved. More (less) rejection of low b parameter lines can lead to systematically higher (lower) inferred temperatures. . The uncertainty of our measurements for individual statistics appears to be consistent with being statistical in nature and is slightly bigger than that from the joint analysis. For the BPDF statistics, our T 0 measurements are typically smaller than that obtained from the literature (panel A1) while the γ measurements are similar to that in the literature. For the curvature statistics we show the T 0 measurements obtained using the characteristic overdensity method ( ∆ see §5.4.2, Becker et al. 2011). We use our γ measurements obtained from the joint analysis as a prior to calculate the T 0 uncertainty for the ∆ method in panel B1. The T 0 measurements using the ∆ method are systematically larger than that from the PDF method. However, the T 0 in both measurements are in good agreement (maximum difference ∼ 1.1σ) with each other. Our T 0 measurements at z ≥ 2.6 using the ∆ method are within 1σ of that from Becker et al. (2011); Boera et al. (2014). Panel C1 and C2 shows that the T 0 (γ) measurements for the FPS statistics are systematically larger (smaller) than those from Walther et al. plane is sampled poorly. To avoid these systematic biases, we use the 1D b distribution (BPDF) computed in different log NHI bins. The BPDF does not rely on rejecting low b parameters to find a lower envelope. At the same time the BPDF statistics is sensitive to both T0 and γ (see online supplementary appendix 4.1.4 for details). Our method of measuring T0 and γ using Voigt profile parameters is thus significantly different from that used in the literature. Unlike previous work, our T0 and γ measurements using the BPDF statistics are consistent with our measurements using the other statistics as shown in Fig. 14.

Comparison with measurments from curvature statistics
In panel B1 and B2 of Fig. 16, we compare our thermal parameter evolution with that from Becker et al. (2011); Boera et al. (2014) obtained using the curvature statistics. These authors measured T0 by measuring the temperature at a characteristic density ∆ ( T (∆)) that is a one-to-one empirical function of the mean of the absolute curvature irrespective of γ. To map T (∆) to the temperature at cosmic mean density T0, one needs to however assume a value of γ.
In Fig. 14 we show the measurement of thermal parameters using the PDF of the curvature statistics. To make a more direct comparison between our measurement with that from Becker et al. (2011);Boera et al. (2014), we also calculate the T0 using the characteristic (over-)density method. Fig. 17 and Table 3 shows the ∆ probed by the curvature statistics in our simulations. The evolution of ∆ from this work is in good agreement with that from Becker et al. (2011). We obtain the one-to-one empirical relation between mean absolute curvature and temperature at ∆ (as shown by the red curve in Fig. 17). For a given observed mean absolute curvature, we measure the temperature at ∆. We convert the temperature at ∆ into T0 using this empirical relation assuming a range of γ. For this we use the γ values (with 1σ uncertainty) from our joint constraints. The green squares in panel B1 of Fig. 16 show our T0 measurements using the ∆ method. The errors on the T0 measurements for the ∆ method account for the uncertainty in γ only.
In Fig. 16, we also show the T0 measurements using the ∆ method and that using the joint analysis. The best fit T0 values from the ∆ method are systematically higher than that from the joint analysis as well as those obtained using the curvature PDF (see Fig. 14) at z < 3.4. The ∆ method assumes that most of the absorption in the Lyα forest at any given redshift is associated with densities ∼ ∆. When converting temperature at ∆ to T0, one assumes a single ∆ value. However, in a realistic scenario the Lyα absorption will arise from a range in densities. Unlike the ∆ method, the T0 measured in our joint analysis (using PDFs and PS) is contributed by Lyα absorption coming from all densities. We suspect this to be the most likely reason for the systematically larger T0 obtained with the ∆ method. Another interesting feature of the T0 evolution obtained using the ∆ method is that the uncertainty of the T0 measurements increases with decreasing redshift. This is mainly because of the uncertainty in γ and the increase of ∆ at lower redshifts.
At z > 3.4, our T0 measurements from the ∆ method are consistent with that from Becker et al. (2011) irrespective of the assumed value of γ. This is because the Lyα forest at these redshifts is sensitive to densities close to the cosmic mean density ∆ ∼ 1. This is reflected in the smaller uncertainty of the T0 measurements at z > 3. As γ decreases (1.2 < γ < 1.5) in our joint measurements at 2.6 ≤ z ≤ 3.4, T0 increases and is in good agreement with that from Becker et al. (2011);Boera et al. (2014) for γ = 1.3. At z < 2.6, γ increases from 1.2 to 1.5, and our T0 measurement using the ∆ method is consistent with that from Becker et al. (2011);Boera et al. (2014) for γ = 1.5.  Fig. 16 shows that our T0 measurements are also in good agreement (within 1σ uncertainty) with their measurements. Note, however, that our T0 measurements are systematically larger (albeit within the errors) than their measurements at 2 ≤ z ≤ 3. The difference in the γ evolution is more pronounced. Our γ evolution shows a minimum at z = 3 while their measurement shows a nearly flat γ ∼ 1.7 at 2 ≤ z ≤ 3.5.

Comparison with measurements from the FPS
Since the observed FPS are in good agreement with each other, the moderate differences in the T0 − γ measurements are likely to come from differences in the simulations/analysis. Walther et al. (2019) vary T0, γ and the pressure smoothing scale λp as a free parameter using the OH17 UVB. They derive a single pressure smoothing scale at a given redshift for the entire simulation box using a cutoff in the 3D real space flux power spectrum that is calculated without accounting for peculiar velocity and thermal broadening. In reality, the pressure smoothing scale is not an independent parameter. This is because the pressure smoothing scale is set by temperature and density of the gas particles i.e., λp ∝ (T /∆) 1/2 ∝ [T0 ∆ γ−2 ] 1/2 . Simultaneous fitting of T0, γ and λp can result in degeneracies of λp − T0 (see Fig.  6, Fig. 6 in Rorai et al. 2018;Walther et al. 2019, respectively). An overestimation of λp can result in a systematic decrease and increase in T0 and γ, respectively. In our simulations, the pressure smoothing scale is set by the density and temperature of each SPH particle which varies for each particle. Hence effectively we fit T0 − γ as two free parameters to match the simulations with observations. It is also important to note that all UVB models predict γ < 1.55 at 2 ≤ z ≤ 4 (for equilibrium as well as non-equilibrium evolution of the ionization). As we show in §6 the γ evolution in Walther et al. (2019) is very difficult to reproduce with any UVB model.

Comparison of measurements from the wavelet statistics
We compare our T0 and γ measurements using the wavelet statistics with those from Lidz et al. (2010)  In the actual analysis, we use all the simulated γ values of our finely sampled T 0 − γ grids. The red curve in each panel shows the one-to-one relation between mean curvature (or wavelet) amplitude and temperature at ∆. The one-to-one relation is obtained by fitting a power-law. The higher wavelet scale corresponds to slightly higher characteristic density indicating that ∆ depends on the scales probed by the statistics. The ∆ probed by the curvature statistics is similar to that probed by the wavelet statistics for sn = 50 km s −1 . The redshift evolution of ∆ from this work (see Table 3) is in good agreement with that obtained by Becker et al. (2011) for the curvature statistics. We find the characteristic over-density ( ∆) probed by the wavelet statistics for sn = 50 km s −1 and sn = 100 km s −1 , respectively (see Fig. 17). Table 3 also summarizes the redshift evolution of ∆ for these two wavelet scales. The characteristic density varies similarly with other wavelet scales. We find that the scatter in the empirical relation increases with increasing wavelet scales reducing the sensitivity to T0. This is expected as on large scales the baryons trace the dark matter density field and large scales are less sensitive to variations in T0. We also measure T0 using the ∆ method for the wavelet statistics assuming the γ evolution from our joint analysis. The T0 values for all the wavelet scales are very similar. In Fig. 16, we show the T0 values obtained using the ∆ method (green squares, for sn = 50 km s −1 ) and that obtained using the joint analysis (black stars). The T0 values from the two methods are in good agreement with each other (within 1σ).

DISCUSSION: IMPLICATION FOR UVB MODELS AND HEII REIONIZATION
In this section, we compare our measurements of T0 and γ with predictions from published spatially homogeneous UVB models for the effect of He ii reionization on the thermal evolution of the IGM. In Fig. 18, we compare the observed T0 − γ evolution with that predicted by the UVB models of HM12; OH17; KS19; P19 and FG20. We use cite to obtain the evolution of T0 and γ for these UVB models, post-processing gadget-3 outputs.

Equilibrium vs Non-equilibrium UVB models
Panels A1 and A2 in Fig. 18 show that the different UVB models with equilibrium ionization evolution show a very similar T0 − γ evolution in the redshift range 2 ≤ z ≤ 4 irrespective of the temperature after H i reionization. The observed T0 (γ) is systematically larger (smaller) than that predicted from equilibrium models at 2.2 ≤ z ≤ 3.4 ( 2.2 ≤ z ≤ 3.6). It appears that it is not possible to reproduce the higher T0 (smaller γ) seen in the observed data with UVB models assuming equilibrium ionization evolution. With equilibrium ionization evolution, one assumes that the gas is ionized instantaneously as the ionizing radiation field is turned on. Because of this, the He ii fraction in equilibrium model is systematically underestimated (Puchwein et al. 2015;Gaikwad et al. 2019). Since the photoheating is proportional to the He ii fraction, the temperature is underestimated. Similarly, the He ii fraction depends on density in photo-ionization equilibrium and as a result the photo-heating of the gas is density dependent. As shown in Gaikwad et al. (2019) the density dependent photo-heating rates result in γ ∼ 1.5 in equilibrium ionization models. The He ii reionization will in reality also be spatially inhomogeneous, and spatially homogeneous UVB models can This Work: Observation (Gaikwad+2020b) Faucher-Giguere+2020 Puchwein+2019 Khaire+2019 Haardt+2012 Onorbe+2017 Figure 18. The figure shows the evolution of thermal parameters obtained from our joint analysis (black star points) and the predictions of the uniform UVB models of HM12; OH17; P19; KS19; FG20 (different curves). Panels A1 and A2 show the T 0 , γ evolution obtained for UVB models assuming equilibrium ionization evolution. The observed T 0 and γ evolution is significantly different from those predicted by the uniform UVB models assuming ionization equilibrium evolution. Panels B1 and B2 show the predicted T 0 , γ evolution of the UVB models assuming (physically very well motivated) non-equilibrium ionization evolution. The T 0 evolution for the HM12; KS19; P19 non-equilibrium models at z < 3 are consistent with that from observations. As we will discuss below the mismatch between observed and non-equilibrium UVB models at z > 3 is likely due to a somewhat wrong evolution of the assumed He ii ionizing emissivity and/or (effective) He ii photo-heating rates. Note also that on the rising side of the temperature peak He ii reionization will not yet be complete and the UVB is not expected to be spatially homogeneous.The γ evolution of the OH17; FG20 non-equilibrium UVB models is in good agreement with the observed values. However T 0 is systematically higher in these models. We have thus run the models of OH17; FG20 also with reduced photo-heating rates. The observed T 0 evolution is well reproduced if the photo-heating rate in these two models is lowered by factors of 0.8 and 0.7, respectively (see Fig. I1 and I2). The good match between observation and non-equilibrium OH17; FG20 UVB models with rescaled photo-heating rates is due to the rapid evolution of the He ii ion fraction in these two models.
only give an average evolution of the thermal history. However, in any given place the He iii ionization fronts will move too fast for equilibrium ionization evolution to be a good approximation. We have thus also implemented physically well motivated non-equilibrium ionization evolution in cite. Panels B1 and B2 in Fig. 18 show the comparison of our T0 and γ measurements with the prediction from UVB models assuming non-equilibrium evolution. The T0 (γ) values are systematically larger (smaller) than in the corresponding equilibrium models and the differences between the different UVB models are larger. The non-equilibrium predictions by all the UVB models also show a more pronounced temperature peak. This peak is generally somewhat higher than that in our measurements from the observations and occurs in some of the models also somewhat earlier.

The temperature peak/rise due to HeII reionization predicted by spatially homogeneous UVB models
The basic concept of the different UVB models is similar, but there are considerable differences in the detailed assumptions. OH17; FG20 compute photo-ionization and photoheating rates during ( H i and He ii) reionization differently than HM12; KS19; P19. To calculate photo-ionization and photo-heating rates during He ii reionization, OH17; FG20 assume the He ii reionization history ( fHeII evolution) to be a free parametric function. The reionization history is calibrated to match the observed τ eff,HeII evolution (Worseck et al. 2019). To compute the photo-heating rates during reionization, OH17; FG20 then use another free parameter namely the total heat injected during He ii reionization (∆THeII). HM12; KS19; P19 on the other hand try to predict photo-ionisation and photo-heating rates based on observed source luminosity functions and SEDs and a model of the . Panel A1 and A2 are similar to panel B1 and B2 in Fig. 18 except that the photo-heating rates of the OH17; P19; FG20 UVB models are scaled by a factors of 0.8, 0.9 and 0.7, respectively. Note that such a scaling with a constant factor does not affect γ. Panel B1 and B2 shows the evolution of the volume averaged He ii and H i fraction in our simulations for all the UVB models. f HeII evolves rather rapidly at 2.9 ≤ z ≤ 3.5 in the OH17; FG20 models (shaded region in panel B1 and B2). Correspondingly f HI in the same redshift range is similar for all UVB models (except P19). a The redshift of reionization is defined as z mid corresponding to epoch when f HeII = 0.5 b The extent of reionization is defined as ∆z = zstart − z end . where zstart, z end corresponds to the redshift when f HeII = 0.75, 0.02 respectively. c Factor by which photo-heating rates are scaled to better match the observed T 0 evolution at z < 3. Errors represent the error in the scaling factor that corresponds to the 1σ uncertainty for T 0 (see Fig. I1 and Fig. I2). d Cumulative energy deposited (in eV m −1 p ) into the IGM at z = 3 for non-equilibrium UVB models without scaling the photo-heating rates (see online supplementary appendix J).
He ii opacity based on observations. There is thus considerable variety in the assumptions of spatially homogeneous UVB models. As a result the exact time and height of the temperature peak predicted by these models should be considered somewhat uncertain. As can be seen in panel B1/B2 in Fig. 18, the timing of the temperature peak in the OH17; FG20 UVB models agrees very well with the timing of the peak in our measurements. The evolution of γ in these models is also in better agreement with our measurements than those from HM12; KS19; P19. However, T0 is systematically larger than our measured values at all redshifts. Given the uncertainty in the amplitude of the photo-heating rates in all the UVB models, we have rescaled the H i, He i and He ii photo-heating rates in order to better match our measured evolution of the thermal parameters. Such a rescaling changes T0 systematically, while the γ evolution remains largely unaffected. Panels A1 and A2 in Fig. 19 show the evolution of T0 and γ for scaled photo-heating rates in the different UVB models. In Table 4, we summarize the photo-heating rate scale factor required in different UVB models to match the observed T0 evolution at z < 3 (see also Fig. I1 and Fig. I2 for details). As expected, all the rescaled UVB models show a better agreement with the T0 evolution at z < 3. However, the T0 evolution at z > 3 in the rescaled OH17; FG20 UVB models is in much better agreement with our measured values than the HM12; KS19; P19 UVB models. The shape of the T0 − γ evolution is closely related to the He ii reionization history ( fHeII evolution Gaikwad et al. 2019). It is noteworthy here that the reionization history used in the OH17; FG20 UVB models is calibrated to match the observed τ eff,HeII evolution from Worseck et al. (2019). The good match between the shape of the observed T0 − γ evo-lution and that predicted by the UVB models with scaled photo-heating rates at z > 3 suggests that the T0 − γ evolution from our work is in good agreement with the observed τ eff,HeII evolution from Worseck et al. (2019). Such consistency is important because the evolution of T0 and γ at 2 ≤ z ≤ 4 is mainly driven by the fHeII evolution (see Fig.  5 in Gaikwad et al. 2019). We should here emphasize that the observed dataset, methodology and parameters that we used here are completely complementary to those used to measure the τ eff,HeII evolution in Worseck et al. (2019).
Despite the good match between observations and the predictions by non-equilibrium OH17; FG20 UVB models, we would like to caution the reader that these UVB models do not account for inhomogeneous He ii reionization and the spatial fluctuations in the He ii ionizing background which are expected to be present. To capture the effects of such spatial fluctuations, will, however require rather challenging high dynamic range radiative transfer simulations (ideally coupled to the hydrodynamics). We hope to do this in future work but caution that it will be very expensive to calibrate such simulations to match the observed data (see Gaikwad et al. 2020).

Late vs Early HeII reionization
In Fig. 19, we compare the evolution of the He ii and H i fraction predicted by the different UVB models with nonequilibrium ionisation evolution. As already discussed the amount of heat injected into the IGM is coupled to the fHeII evolution and we have scaled the photo-heating rates to match the observed T0 evolution. To quantify the differences between the various UVB models, we define the mid point of He ii reionization as the redshift, where fHeII = 0.5. We also define the extent of reionization (∆z) as corresponding to the difference between the redshifts when fHeII = 0.75 and fHeII = 0.02. These definitions facilitate the comparison of the UVB models quantitatively, but note that they may differ from definitions used in the literature (see Table 4).
The midpoint of He ii reionization in the OH17; FG20 models (z mid ∼ 3.2) is at lower redshift than that for the models by HM12; KS19; P19. The observed T0 − γ evolution from our measurements is consistent with the relatively late He ii reionization in the OH17; FG20 models. Fig. 19 also shows that the evolution of fHeII in the HM12; KS19; P19 models is more gradual than that in the OH17; FG20 UVB models. The extent of He ii reionization ∆z in the OH17; FG20 UVB models (< 0.4) is smaller than that in the HM12; KS19; P19 UVB models (see Table 4). We should note here that the actual duration and heating due to He ii reionization may be different if the spatially inhomogeneous nature of He ii reionization and the resulting fluctuations in the He ii ionizing background are taken into account. With this caveat in mind we note that for the spatially uniform UVB models considered in this work, our measured T0 − γ evolution is consistent with late and rapid He ii reionization.
We refer readers to online supplementary appendix I and J for more details on the effect of the UVB on the evolution of thermal parameters and cumulative energy deposited in to the IGM. In particular, we show the effect of observational and modeling uncertainty in the UVB model on thermal parameters in online supplementary appendices I1 and I2. We discuss the likely modification needed for the physi-cally motivated non-equilibrium UVB models to reproduce our measured thermal parameter evolution in online supplementary appendices I3 and I4. We discuss the evolution of the cumulative energy deposited in IGM by various UVB models in online supplementary appendix J.
Finally, we emphasize again that even though we use equilibrium KS19 UVB models to vary the thermal parameters in the simulations, our final measured thermal parameter evolution is consistent with the predictions by the physically motivated non-equilibrium evolution for the FG20; OH17 UVB models with moderately reduced photo-heating rates. This gives us additional confidence that our measurements of thermal parameters are robust with regard to observational and modeling systematics.

SUMMARY
We have measured the thermal parameters T0 and γ of the IGM at 2 ≤ z ≤ 4 using 103 high-resolution, high signal-tonoise QSO absorption spectra drawn from the kodiaq DR2 sample using four different Lyα flux statistics. The measurements are calibrated with a suite of simulations based on a high-resolution smoothed particle hydrodynamics simulation post-processed with our IGM thermal history code "Code for Ionization and Temperature Evolution" (cite) that finely samples the grid in T0 and γ. The salient points of our work are: • We use a sample of 103 QSO spectra satisfying the following criteria; (i) spectral S/N > 5, (ii) no large spectral gaps and (iii) no identified DLAs or sub-DLAs to avoid systematic biases. We have divided our sample in redshift bins having ∆z = 0.2 with centers at z = 2.0, 2.2, · · · , 3.6 and 3.8. The typical number of lines of sight used per redshift bin is larger than what has been used in previous studies. We have identified metal lines contaminating the Lyα forest and replaced them with continuum and added noise. Our simulations, based on L10N512 gadget-3 SPH simulation, sample T0 from ∼ 6000 K to 24000 K in bins of ∆T0 = 500 K and γ from ∼ 0.7 to ∼ 2.0 in bins in of δγ = 0.05 creating a densely sampled T0 − γ grids with 37 × 27 = 999 thermal parameter variations. The mock spectra created from these simulations account for the redshift path length, finite S/N and instrumental broadening of the spectrograph of the observed sample.
• Using our code "VoIgt profile Parameter Estimation Routine" (viper), we fit the observed and simulated Lyα forest spectra with multi-component Voigt profiles. We derive 6 flux statistics for observed and simulated spectra: (i) flux power spectrum (FPS), (ii) wavelet statistics (iii) curvature statistics, (iv) b distribution function (BPDF), (v) transmitted flux probability distribution function (FPDF) and (iv) H i column density distribution function (CDDF). The observed FPS, FPDF, CDDF and mean flux evolution from our sample is in good agreement with measurements in the literature. By varying one parameter at a time, we show the sensitivity of these statistics to thermal parameters of the IGM.
• We measure the thermal parameters T0 and γ in the redshift range 2.0 ≤ z ≤ 3.8 by quantitatively comparing four of the six flux statistic, namely FPS, wavelet, curvature and BPDF statistics. The T0 and γ measurements us-ing different statistics are consistent with each other within 1σ at all redshifts. By combining the four statistics we thus find joint constraints on T0 and γ that are more robust and tightly constrained than the measured values from the individual statistics. We estimate the total uncertainty on T0 and γ by accounting for the uncertainty due to continuum fitting, metal line contamination, cosmological parameters and modeling uncertainties. In this work, we ignore the correlation among different statistics for simplicity. Hence the statistical uncertainty of T0 − γ in joint constraints may be somewhat underestimated. However, our recovery using the Sherwood simulation suite suggests that the T0 − γ uncertainty is not seviorly underestimated. We find that our measured T0 varies from 9250 K at z = 3.8 to 14750 K at z = 3 to 9500 K at z = 2, while γ evolves from 1.525 at z = 3.8 to 1.225 at z = 3 to 1.500 at z = 2. We robustly detect a maximum in T0 and minimum in γ that occur simultaneously at z 3.
• We compare our measurements to the evolution of T0 and γ predicted by spatially homogeneous UVB background models of HM12; OH17; KS19; P19; FG20 assuming equilibrium and non-equilibrium ionization evolution. Our temperature measurements are difficult to reproduce assuming equilibrium ionization evolution. Our measurements are, however, in good agreement with the thermal evolution predicted by the UVB models that have been calibrated to match the observed He ii opacity evolution (OH17; FG20) if we assume non-equilibrium ionization evolution and reduce the photo-heating rates in these models by a moderate factor 0.7-0.8. The timing of the peak in our measured temperatures due to the extra heat input during He ii reionization appears thus to be nicely consistent with the timing of He ii reionization inferred from the He ii opacity evolution and appears to be rather rapid. Our measured temperatures are also (to varying degree) systematically lower than those in the other three models. These three models predict He ii reionization to be more extended and either an onset of the temperature rise (P19) or a timing of the peak that is too early (HM12; KS19) to be consistent with our temperature measurements.
The simulations used to calibrate our temperature measurements assume time varying but spatially uniform UVB models. While spatial fluctuations in the He ii ionizing background are clearly important during He ii reionization (z > 3), simulating these self-consistently is beyond the scope of the current work. Large suites of high dynamic range, large scale, multi-frequency cosmological radiative transfer simulations will be required to capture this effect accurately and to sample the observational uncertainties regarding relevant properties of the ionizing sources. We hope to perform such simulations in future. The tests we have performed and our experience with hydrogen reionization suggest that modeling the thermal evolution with homogeneous UVB models should nevertheless recover the evolution of the (median) thermal state of the IGM not only after but also before overlap of He iii regions reasonably well.

ACKNOWLEDGMENTS
We thank the DiRAC@Durham facility managed by the Institute for Computational Cosmology on behalf of the STFC DiRAC HPC Facility (www.dirac.ac.uk). The equipment was funded by BEIS capital funding via STFC capital grants ST/P002293/1, ST/R002371/1 and ST/S002502/1, Durham University and STFC operations grant ST/R000832/1. DiRAC is part of the National e-Infrastructure. Computations in this work were also performed using the Perseus cluster at IUCAA, the HPC cluster at NCRA and the CALX machines at IoA. Support by ERC Advanced Grant 320596 'The Emergence of Structure During the Epoch of reionization' is gratefully acknowledged. PG and MGH acknowledge the support of the UK Science and Technology Facilities Council (STFC). TRC acknowledges support of the Department of Atomic Energy, Government of India, under project no. 12-R&D-TFR-5.02-0700. We thank J. Bolton for making the Sherwood simulation suite available to our work. The Sherwood simulations were performed using the Curie supercomputer at the Tre Grand Centre de Calcul (TGCC), and the DiRAC Data Analytic system at the University of Cambridge, operated by the University of Cambridge High Performance Computing Service on behalf of the STFC DiRAC HPC Facility (www.dirac.ac.uk). This equipment was funded by BIS National E-infrastructure capital grant (ST/K001590/1), STFC capital grants ST/H008861/1 and ST/H00887X/1, and STFC DiRAC Operations grant ST/K00333X/1. DiRAC is part of the National E-Infrastructure. We thank Vid Irsic, Ewald Puchwein and Jamie Bolton for useful comments on the manuscript.

DATA AVAILABILITY
The data underlying this article are available in the article and in its online supplementary material. Upton Sanderbeck P. R., D'Aloisio A., McQuinn M. J., 2016, MNRAS, 460, 1885Viel M., Haehnelt M. G., Springel V., 2004a, MNRAS, 354, 684 Viel M., Weller J., Haehnelt M. G., 2004b Most of the QSOs in the kodiaq dr2 sample were observed more than once with different exposure times. In this section, we discuss our procedure to coadd the spectra. Let a QSO be observed N times with exposure times t1,t2,· · · ,tN . Let Fi(λ), σi(λ) be the value of the normalized flux and associated error at wavelength λ in the i th observation. The coadded flux F coadd (λ) and coadded variance σ 2 coadd of the flux is given by, where wi is weight for the i th pixel. wi = 0 for bad pixels and wi = 1 for good pixels. For the kodiaq dr2 sample, we define a pixel as a bad pixel if it satisfies any of the following criteria, (i) pixels with negative error i.e., σi(λ) ≤ 0, (ii) pixels with σi(λ) 1 (i.e., SNR per pixel 1) (iii) Fi(λ) = 0 (due to sky background Fi(λ) = 0 for good pixel), (iv) negative outliers i.e., pixels with Fi(λ) ≤ −3σi(λ) and (v) positive outliers i.e., pixels with Fi(λ) ≥ 1 + 3σi(λ), Fig. A1 shows an example of coadded spectrum obtained using our method. Spectrum obtained in individual exposures show many bad pixels due to observational systematics. Our coaddition method accounts for these effects as shown in panel D of Fig. A1.
In our analysis, we also exclude sightlines that contain large spectral gaps. We decide whether a sightline contains a large spectral gap or not in the following way. For a given emission redshift of QSO, we identify redshift bins that are spanned by QSO absorption spectra excluding proximity regions and Lyβ, O vi emission lines. The choice of QSO proximity region size corresponds to ∼ 1166Å at z = 3. To avoid possible contamination from intrinsic O iv absorption we consider only regions with rest wavelength greater than 1050Å at the quasar emission redshift. If the spectral gap is more than 25 percent of the redshift bin we mark it as a large gap and exclude the sightline from the analysis for that redshift bin. We note that the number of such sightlines are small (≤ 5).

APPENDIX B: EXCESS MEAN FLUX
In this section we test if the observed excess in mean flux at 3.1 ≤ z ≤ 3.3 in the kodiaq dr2 sample is due to the presence of any unusual QSO sightlines. To check if there are any outliers in this particular redshift bin, we plot the mean flux along individual sightlines in Fig. B1 for four redshift bins. Fig. B1 illustrates that the distribution of mean flux along sightlines is random around the mean flux (dashed line in panel A to D) of the entire sample for all redshift bins. We also do not see any outlier at 3.1 ≤ z ≤ 3.3. Since the mean flux along a sightline is an average quantity, we also check if there is any significant deviation in the observed FPDF evolution in panel E of Fig. B1. The observed FPDF at 3.1 ≤ z ≤ 3.3 also does not show any unusual deviation from other redshift bins. The excess in F at 3.1 ≤ z ≤ 3.3 may thus be real in the kodiaq dr2 sample. Another reason for the excess F could be that the number of sightlines at 3.1 ≤ z ≤ 3.3 are limited in our sample. Our preliminary analysis using UVES squad dr1 (Murphy et al. 2019) while showing distinct change in the slope at z∼ 3.2 does not show any prominent excess. We plan to investigate this further in our future work. However, we emphasize that the effect of excess in F at 3.1 ≤ z ≤ 3.3 has a marginal effect on our measurements of thermal parameters. This is because for any thermal parameter variation we rescale the optical depth in our simulations to match the mean flux. Thus, even if there are any systematics in the F evolution in the kodiaq dr2 sample, this does not significantly change the main results of this work.

APPENDIX C: COMPARISON OF OBSERVED FLUX STATISTICS ERRORS
In §2.1 we discuss the consistency of observed flux statistics from this work with that in the literature. In this section, we compare the errors of the observed FPDF, FPS and CDDF from this work with measurements in the literature. The errors for observed statistics are normally determined using the bootstrap method (except in Rollinde et al. 2013). For a fair comparison, we also compute the errors using the bootstrap method. Fig. C1 shows a comparison of the errors for different statistics. In general, the errors in different statistics from this work are either consistent or slightly smaller than those in the literature.
Panel A1 in Fig. C1 shows that the error in our observed FPDF is smaller than that in Kim et al. (2007); Calura et al. (2012); Rollinde et al. (2013). The smaller errors are due to the large number of QSOs in our sample. The large error in the FPDF of Rollinde et al. (2013) is due to the difference in their method of calculating the errors. Rollinde et al. (2013) argue that the errors computed using the bootstrap method are underestimated and may not account for the cosmic variance accurately. Hence they derive the errors from large number of simulated mocks. Furthermore, their sample size was limited to < 5 at the redshift of interest. We have calculated the errors on our other observed statistics using both methods and accounted for cosmic variance (see §4.2 for details). Panel B1 in Fig. C1 shows a comparison of the error of the FPS from this work with that from Walther et al. (2018). The errors (for the no metal case) are similar in the two works. This is expected because the number of QSOs per redshift bin are similar in both cases. Similarly, the data quality is the same because the subsets of QSOs are both drawn from the kodiaq dr2 survey. The consistency in FPS and its error of this work with that from Walther et al. (2018) reflects the consistency of both the data and our method of calculating the FPS. This consistency is important because the FPS is one of the statistics used in this work to measure thermal parameters.
In Fig. C2, we compare the observed FPS measurements from this work with those from Croft et al. (2002)  We find that our FPS measurements are generally in good agreement (maximum deviation is ∼ 1.8σ) with those from the literature at k ≤ 0.1 s km −1 . The large differences at k > 0.1 s km −1 are due to differences in the method of computing the FPS. In particular Kim et al. (2004);Walther et al. (2018) subtract power due to white noise in the spectra, whereas we do not subtract the noise power. Furthermore the number of observed spectra, quality of spectra (S/N) and total observed redshift path length is quite different in Croft et al. (2002); Kim et al. (2004) from that in our analysis. Even with such differences the FPS in these work seem to be in good agreement with our measurements. We also like to emphasize that any systematic differences in the method of computing the FPS does not significantly affect the measurements of thermal parameters. This is because we use the same method to compute the FPS statistics in simulations and observations. In panel C1 of Fig. C1, we compare the errors of our CDDF with that from Kim et al. (2013). Kim et al. (2013) measured the CDDF from 18 high resolution high S/N (usually > 50) Lyα forest spectra. On the other hand we measure the CDDF from a sample of ∼ 35 high resolution but moderate S/N spectra. Thus the data size and data quality is different in both the works. These differences of the observed dataset are reflected in the CDDF errors. For example the errors in our CDDF at log NHI < 13.6 is larger than that from Kim et al. (2013). This is because low NHI systems are affected more by the S/N of the spectra. Since S/N of our sample is smaller than in Kim et al. (2013), our CDDF error at small log NHI is correspondingly large. At high log NHI > 13.6, S/N is no longer an issue since lines are detected with relatively high significance level. The errors in our CDDF at log NHI > 13.6 are smaller than that from Kim et al. (2013) because the number of spectra in our sample are nearly twice that in Kim et al. (2013). Even though there are differences in data quality and sample size, the CDDF and the associated errors from the two works are in good agreement with each other. The agreement between the CDDF demonstrates the consistency of viper in fitting the Lyα forest with Voigt profiles. This consistency is important because we use the b parameter of Voigt profile fits to constrain thermal parameters. In summary, the observed statistics with associated errors from this work are in good agreement with that from the literature.

APPENDIX D: VARIATION IN THERMAL PARAMETERS
In this section, we explain our method of generating different thermal histories for a given UVB model. We vary the thermal parameter evolution by modifying the H i, He i and He ii photo-heating rates of the KS19 UVB. Following the approach of Becker et al. (2011), we scale the photo-heating rates ( i) by two free parameters a, b such that i = a∆ b KS19 i where i ≡ [ H i, He i, He ii ]. A change in factor a (b) leads to variation in T0 (γ) while keeping the evolution of γ (T0) relatively unchanged. It is important to note that the variation in thermal parameters obtained by scaling the photoheating rate is not physical. Ideally, one would like to vary the properties of the ionizing sources such as, luminosity function, QSO SED index, redshift of reionization etc., and generate the UVB model from a cosmological 1D radiative transfer code. However, such an approach cannot produce a large variation in T0 and γ (but see Oñorbe et al. 2019). Our approach, even though not physical, is useful in practice to probe a large range of the parameters T0 and γ on a finely sampled grid. Fig. D1 shows examples of the evolution in T0 and γ obtained by varying a and b for several thermal histories. The case a = 1 and b = 0 shows the T0 and γ evolution obtained using the default photo-heating rates from the KS19 UVB. Note that the shock heated gas (T > 10 5 K) is mostly unaffected by the photo-heating rate scaling, while the low density gas responds to this scaling differently and changes normalization (T0) and slope (γ) of the TDR.
When measuring the thermal parameters, it is important to sample the T0 − γ plane densely enough. A coarse sampling of T0 − γ plane can lead to incorrect best fit values and excessive smoothing of the χ 2 field which, in turn, can result in underestimation of the uncertainties of the thermal parameters. On the other hand a very dense sampling of T0 − γ would be computationally very expensive. We found it to be the best compromise to choose the T0 − γ grids such that T0 is varied from ∼ 6000 K to ∼ 24000 K in steps of 500 K while γ is varied from ∼ 0.7 to ∼ 2.0 in steps of 0.05 at z = 3 7 . We have thus simulated the Lyα forest for 37×27 = 999 different thermal histories. The computational time required to run the thermal histories and extract mock spectra for 999 UVB models is ∼ 2.25 million cpu hours.
APPENDIX E: THE SENSITIVITY OF Lyα FOREST STATISTICS TO T0, γ

E1 Kernel Density Estimation
Kernel density estimation is a non-parametric method to estimate the probability distribution function. KDEs are a generalization of histograms with the advantage that the loss of information due to binning of the data is minimal with the KDE method. Unlike histograms, KDE is not sensitive to the choice of bins. The PDF calculated using the KDE method converges faster than the simple histogram method.
We use KDE to estimate the PDF of wavelet amplitudes, curvature and b parameters. There are 3 main reasons for using KDE instead of histograms (i) the number of observed sightlines per redshift bins are usually limited (ii) the observed spectra have finite S/N and (iii) characterizing the shape of the PDF is important as it is sensitive to thermal and ionization parameters.
When calculating PDFs with KDE, we chose a Gaussian function as a kernel. The bandwidth (h) of the kernel is set by Scott's rule, h = 3.5 σ n −1/3 , where σ is the variance of the quantity whose PDF is to be estimated and n is number of samples. Fig. E1 illustrates the advantage of using KDE over histograms to estimate the b distribution (BPDF) from one of our models. First we generate a true BPDF by fitting Voigt profiles to 500 sightlines. We then randomly chose a fraction of sightlines and estimate the BPDF using histograms and KDE. The left, middle and True BPDF (100% Sample) BPDF (KDE) BPDF (Histogram) Figure E1. The figure illustrates the convergence of the BPDF using the histogram method (black solid curve) and kernel density estimation (KDE, red dashed line). The true BPDF (blue solid line) is calculated from Voigt profile fits to 500 sight-lines at 2.9 ≤ z ≤ 3.1. The left, middle and right panel show the BPDF calculated using the histogram method and KDE for 1%, 5% and 10% of sightlines. The PDF calculated using KDE converges faster with a smaller sample than the histogram method. For the KDE method, we use a Gaussian kernel with bandwidth given by Scott's rule. The choice of bins and bin width is the same for all cases.
right panel in Fig. E1 show the comparison of the BPDF using the histogram and KDE methods for 1%, 5% and 10% of the total sample. Fig. E1 clearly demonstrates that with KDE the estimated BPDF reaches a good approximation of the the true BPDF with a smaller sample than with the histogram method. PDFs estimated using KDE are smoother and converge faster than PDFs estimated with the histogram method 8 .

APPENDIX F: RESOLUTION AND CONVERGENCE TESTS
The size of the simulation box used in this work (L10N512) is 10 Mpc/h and is likely to be smaller than the typical size of He iii bubbles. As discussed in §3 the choice of our simulation box is motivated by the trade-off between dynamic range needed to resolve the Lyα forest at 2 ≤ z ≤ 4 and the large thermal parameter space to probe. However, it is important to demonstrate the sufficient convergence of our simulation by comparing the properties of the Lyα forest from this work with that from a range of simulations in which box sizes and particle numbers are varied. We illustrate the convergence of our simulation in Fig. F1 to F6. We compare the FPDF, FPS, curvature PDF, wavelet PDF, BPDF and CDDF statistics from our model with that from the Sherwood simulation suite Bolton et al. (2017) at 2.9 ≤ z ≤ 3.1. For a fair comparison we chose thermal parameter T0, γ consistent with that from the Sherwood simulations. For the models shown in Fig. F1 and F2, we post-process the simulated spectra to match the observed properties of our sample at 2.9 ≤ z ≤ 3.1. Note that the initial conditions of the density field used for the L10N512 simulation in this work is different from those used in the Sherwood simulation suite. When we rescale the optical depth to match the mean flux, we find that the amount of rescaling required is very similar in all the models. Fig. F1 to F6 shows a good agreement between our model statistics with that from the L10N512 Sherwood simulation suggesting that our method of varying the thermal history is consistent with the selfconsistent gadget-3 simulation. The slight mismatch between our Lyα forest statistics and that from the Sherwood simulation suite are primarily due to cite and not due to differences in initial conditions or the UVB models used. In Gaikwad et al. (2018), we have shown this explictly by comparing the Lyα forest statistics from spectra post-processed with cite and those from self-consistent simulations for the same initial condition and UVB model. However, these differences are smaller than the observational uncertainties and are compensated by the ability of our method to probe the large thermal parameter space which would be computationally expensive for simulations with larger dynamic range/particle numbers. Fig. F2, F4 and F6 also illustrate that the statistics are converged with regards to box sizes. We compare the statistics from the L10N512, L20N512, L40N1024 and L80N2048 Sherwood simulations. The good agreement among the Sherwood models L40N512, L40N1024 and L40N2048 (Sherwood) indicates that the statistics are converged with regard to resolution/ particle mass.

APPENDIX G: COMPARISON OF BEST FIT STATISTICS WITH OBSERVATIONS
Similarly to Fig. 12, we show the comparison of FPS, wavelet, BPDF and curvature statistics of our best fit models with those obtained from observations at different redshift bins in Fig. G1 to G9. In general, the predictions of the best fit models are in good agreement with the observed statistics. We summarize the reduced χ 2 between best fit model and observations in Table G1. The reduced χ 2 is close to 1 in most cases. We also plot a comparison of the observed b − log NHI contours (that includes 90 percent of points) with that from 5 mock samples (out of 100) in Fig. G10. The mock b − log NHI distribution corresponds to the best fit thermal parameters model. Traditionally, the lower b envelope in the b − log NHI distribution is used to measure the thermal parameters. The limitations of this method are described in §4.1.4. Fig. G10 shows that the 2D observed b − log NHI distribution is similar to that from the mock samples at all redshifts. However, in general we see a mismatch between observed b − log NHI and best fit b − log NHI distribution at low and high log NHI. The low log NHI systems are usually contaminated due to the finite S/N of the spectra. Note here that the Voigt decomposition of high log NHI systems may not be unique. This effect is more prominent at higher (z > 3.3) redshifts. Due to combination of these effects the mock b−log NHI distributions further show some differences compared to observations. It is noteworthy that the shape of the contours for lower b values from the simulations are in good agreement with that from observations. The variations in the b − log NHI distribution in different mock samples is also small indicating that the distribution is converged with respect to cosmic variance. In summary, the b − log NHI distribution from our best fit models is in general good agreement with that from observations. In Fig. 16 (panel B1), we show a comparison of T0 from our method using the characteristic density with the results from Becker et al. (2011);Boera et al. (2014). To get the T0 evolution in our model we assume the best fit γ evolution (with errors). To perform a fair comparison with the Becker et al. (2011);Boera et al. (2014) method, we show the temperature at the characteristic density from our work and Becker et al. (2011);Boera et al. (2014) in Fig. G11. Our temperature measurements from the curvature and wavelet statistics are in good agreement with that from the literature at z > 2.4. However, our measurements at z < 2.4 are systematically lower than that from the literature, but are still consistent within 1.5σ. It is interesting to note that the characteristic density predicted from our model ( ∆ ∼ 4.85) is also smaller at z ∼ 2 compared to that from Becker et al. (2011, ∆ ∼ 6).  Figure F1. Here we show the effect of resolution on Lyα statistics by comparing these statistics for a range of simulations from the Sherwood simulation suite. All other parameters such as T 0 , γ, Γ HI and box size (40 h −1 cMpc) are the same for all the models. The flux statistics of our default simulation are sufficiently converged with regard to resolution. For comparison, we also show the flux statistics using our gadget-3 + cite method. The flux statistics calculated for our post-processed simulation is consistent with that for the corresponding self-consistent simulation from the Sherwood simulation suite. The difference between the gadget-3 + cite model and the self-consistent Sherwood simulation suite are within the observational uncertainty for each statistics (see Fig. 12) All statistics are shown at 2.9 ≤ z ≤ 3.1. The shaded region shows the typical observational uncertainty for each statistics.

APPENDIX H: SOURCES OF UNCERTAINTY IN T0, γ MEASUREMENTS
In addition to statistical uncertainty, the T0, γ measurements are also affected by the modeling and observational uncertainties. We have considered four main sources of uncertainties contributing to the total uncertainty on the mea-sured parameters. These are uncertainties due to (i) continuum placement, (ii) contamination by unidentified metal lines, (iii) (numerical) modeling and (iv) cosmological parameters. The first two uncertainties are observational uncertainties while the later two uncertainties are modeling  Figure F2. Same as Fig. F1 except that the effect of box size (resolution is same) on Lyα statistics is shown by comparing these statistics for a range of simulations from the Sherwood simulation suite. The flux statistics of our default simulation are sufficiently converged with regard to box size. For comparison, we also show the flux statistics using our gadget-3 + cite method. In this figure the resolution of the Sherwood model is same as that of the gadget-3 + cite model.
uncertainties. We discuss below the contribution of each to the measurement uncertainty in detail.

H1 Continuum placement uncertainty
When comparing simulated Lyα forest spectra with observations, it is customary to use normalized observed spectra (Fnorm = Funnorm/Fcont). The continuum fitting of spectra is usually based on high order polynomial fitting connecting points in the spectrum believed to have no absorption. The uncertainty in continuum placement (δFcont/Fcont) affects the observed flux statistics of Lyα forest data and influences the inferred physical parameters. In the kodiaq DR2 sample, all the spectra are normalized by the procedure described in O'Meara et al. (2015described in O'Meara et al. ( , 2017. The continuum placement uncertainty in the kodiaq DR2 sample is of the order of a few percent (see O'Meara et al. 2015). We assume a conservative value for the continuum placement uncertainty Table G1. Reduced χ 2 (χ 2 per degree of freedom) between data and best fit T 0 and γ measurements for various statistics. The reduced χ 2 is calculated by adding the χ 2 between model and data b distribution in different log N HI bins. a The reduced χ 2 is calculated by adding the χ 2 between model and data wavelet PDF for different wavelet scales (sn). a Statistical uncertainty corresponds to uncertainty from joint constraints of respective parameters. b Model uncertainty corresponds to uncertainty in modeling of the Lyα forest in our simulation. c Cosmological parameter uncertainty is calculated using the Gunn-Peterson approximation. Note that the GP approximation does not account for the uncertainty in σ 8 . d Contribution of unidentified metal lines to the uncertainty in derived quantity. e Continuum placement uncertainty (σcont) is assumed to be ±5 percent. For T 0 and γ measurements, σcont includes the uncertainty due to the mean transmitted flux for a given continuum. For log Γ 12 measurements, σcont include the uncertainty due to variation of T 0 and γ parameters. f Total uncertainty is obtained by σtot = σ 2 stat + σ 2 model + σ 2 cosmo + σ 2 metal + σcont of ±5 percent. If F default is the flux corresponding to the default continuum and δFcont/Fcont is the uncertainty in continuum placement then we define a low /high continuum as F default /[1 ± (δFcont/Fcont)].
We quantify the effect of continuum placement uncertainty by rescaling the observed flux, deriving all the statistics from the rescaled spectra and constraining the T0, γ parameters using our simulations. Fig. 15 shows the effect of continuum placement on the (joint) T0 and γ constraints. For low (high) continuum placement, T0 and γ are systematically underpredicted (overpredicted) relative to measurements with the default continuum. One can qualitatively understand this systematic effect of continuum placement on flux. Since dividing a flux by a low (high) continuum stretches (compresses) the flux in the vertical direction, this artificially reduces (increases) the line widths and hence one gets systematically smaller (higher) temperature. Fig. H2 shows a similar effect of continuum placement uncertainty in the other redshift bins. The uncertainty in T0 and γ measurement due to continuum placement is systematic in nature hence we add this uncertainty in the total error budget (see Table H1).

H2 Metal line contamination uncertainty
The observed Lyα forest is usually contaminated by absorption lines arising from metals. For a given temperature, the metal lines are typically narrower than H i lines because for thermally broadened lines, the line width b ∝ m −1/2 and m is larger for metals than hydrogen. The presence of such narrower metal lines can potentially bias the temperature measurement to lower values. In this work, we do not model the intergalactic metal absorption systems as our simulations neither follow chemical evolution of metals nor include feedback processes. To mitigate this, we chose QSO sightlines that do not contain DLAs or sub-DLAs. Thus our sample selection criteria minimizes the number of metal lines contaminating the Lyα forest. Furthermore, we identify metal lines and replace them with continuum and add noise. We use manual and automated approaches to identify metal lines. We find that both approaches give consistent results.
In the manual approach, we first find the redshift of absorption systems using part of the spectra redward of the H i emission line. Corresponding to this redshift, we look for other transitions contaminating Lyα forest. Due to insufficient wavelength coverage, it is possible that we may not identify all the metal lines. Hence we manually look at each spectrum and look for obvious metal line systems. We then replace metal lines with continuum and add noise 9 . The manual method is tedious and somewhat subjective. Hence, we have also used an automatic method to identify the metal lines based on line widths. Since metal lines are narrower than corresponding H i lines, we fit all the spectra with multi-component Voigt profiles using viper. 9 We account for the effect of metal lines blended with H i absorption lines.
We treat all the lines with b ≤ 8 km s −1 as metal lines 10 . We replace these metal lines with continuum and add noise accounting for any blending effects. Fig. 15 and Fig. H2 shows the effect of metal contamination on T0 and γ constraints in all the redshift bins. As expected, when we use flux statistics for the observed spectra without subtracting metal contamination, we get systemati- cally smaller T0 and γ. Even though we identify and remove the metal lines, it is still possible that some of the metal lines are not identified by our method. Since the metal lines contamination introduces an uncertainty of ∼ 2 percent, we add this uncertainty in the total error budget of constrained parameters (see Table H1).

H3 Cosmological parameter uncertainty
All the results presented in this work assume a flat ΛCDM cosmology consistent with Planck Collaboration et al.
(2014). However the Lyα optical depth depends on cosmological parameters Ωm, h and Ω b as where we assume that recombination rate scales with temperature as T −0.7 . It would be computationally expensive to perform all the analysis with varying cosmological parameters. Hence we use a simplified approach of propagating the error using the above equation. are correlated with each other. Some of these correlation may lead to a cancellation of errors. However, while performing the error analysis, we always add errors due to different cosmological parameters. Thus the assumed uncertainty due to cosmological parameters is conservative. We do not account for the uncertainty in σ8 and ns as this would require to perform self-consistent gadget-3 simulations. However it is important to note that, we find that the uncertainty in T0 and γ measurements due to cosmological parameters uncertainty is small, ≤ 0.5 percent (see Table H1).

H4 Modeling uncertainty due to Jeans smoothing
In the simulations, we vary the thermal state of the IGM by post-processing gadget-3 simulations using cite. Pressure smoothing effects due to the hydrodynamical response of the gas are important during reionization (Park et al. 2016;D'Aloisio et al. 2020). As shown in Gaikwad et al.
(2018), we account for the pressure smoothing of the gas by convolving the SPH kernel with a Gaussian kernel that depends on the pressure smoothing scale (Gnedin & Hui 1998;Kulkarni et al. 2015). This approach is based on the ansatz that the Lyα absorbers are in local hydrostatistic equilib- rium (Schaye 2001). However, this assumption may not be valid during the process of reionization as the photo-heating time scales are typically shorter than the dynamical response time of the gas. In addition to this, the Lyα forest traces the neutral gas in large scale filaments which are undergoing gravitational collapse onto nearby over-dense objects. The hydrostatic equilibrium approximation may thus not be correct. To quantify the effect of this ansatz, we perform a self-consistent gadget-3 simulation with enhanced photoheating rates where T0 is doubled. The dynamical effect of pressure smoothing is thus captured accurately in this selfconsistent simulation. We compute the Lyα forest statistics for this self-consistent model and assume it as the fiducial model. We constrain T0, γ and ΓHI using our model, i.e. gadget-3 (at lower temperature) + cite (with convolution of SPH kernel with Gaussian). We find that the fiducial T0, γ and ΓHI are recovered within 2.5, 1.3 and 0.6 percent. Such a good level of agreement is partly expected because the smoothing of transmitted flux at these redshifts is dominated by temperature, while the pressure smoothing effects are sub-dominant. We account for this modeling uncertainty in the total error budget of derived parameters (see Table  H1). We also refer the reader to §3 for a related discussion.

APPENDIX I: THERMAL PARAMETER EVOLUTION IN UVB MODELS
In this section we discuss the effect of uncertainty in observed and modeling parameters in the UVB models on the thermal parameter evolution during He ii reionization.
I1 Effect of uncertainty in T0 − γ at z = 6 on thermal parameter evolution at 2 ≤ z ≤ 4 In this section we discuss the effect of varying the thermal parameters T0 and γ at z = 6 (the initial redshift chosen for cite in this work) on the thermal evolution at 2 ≤ z ≤ 4. Recently Gaikwad et al. (2020) measured T0 (γ) at z = 6 and found it to be larger than predicted by the HM12; OH17; KS19; FG20 UVB models. Since we run cite from z = 6 to z = 2, it is possible that an uncertainty in T0 and γ at z = 6 affects the thermal parameter evolution at 2 ≤ z ≤ 4. Fig. I1 shows that the effect of even a large uncertainty in thermal parameters at z = 6 is subdominant at lower redshifts. This is because the lower redshift thermal evolution is mainly affected by the He ii reionization prescriptions used in these models. Thus the effect of uncertainty in T0, γ at z = 6 is less than 4 percent (maximum) at 2 ≤ z ≤ 4. Note that Observation Figure G4. Same as Fig. 12   T 0 = 14500 K, γ = 1.25 (2.7 z 2.9) Observation Figure G5. Same as Fig. 12 except for 2.7 ≤ z ≤ 2.9.
at slightly higher redshift and is slightly higher for flatter (small α) QSO SED. The earlier temperature rise is mainly because of the fHeII evolution (see panel B1 Fig. I3) in these models which is also different such that He ii reionization occurs slightly earlier for flatter QSO SED (Gaikwad et al. 2019). However, even after changing the QSO SED spectral index in the observationally allowed range, the thermal parameter evolution in the KS19 UVB model is significantly different from our measurements. This suggests that additional modifications (such as η evolution) may be needed for the KS19 UVB models to become consistent with our temperature measurements.

I3 Modifications to UVB models
Recently FG20   proportional to ∆THI and ∆THeII respectively. Since the timing and amount of photo-heating during He ii reionization is poorly constrained at present, they parameterize this ignorance in ∆THI and ∆THeII. OH17; FG20 assumed ∆THI = 20000 K and ∆THeII = 15000 K and obtain a T0 and γ evolution similar to previous measurements. However, both groups assumed that photo-ionization equilibrium is valid at all times. But as shown by Puchwein et al. (2015); Gaikwad et al. (2018); Puchwein et al. (2019), this is not a good assumption during He ii reionization as the photo-ionization time scales are shorter than the recombination time scales. Thus non-equilibrium ionization evolution is more physically motivated. Puchwein et al. (2015); Gaikwad et al. (2018) shows that the T0 of the IGM in non-equilibrium is significantly larger than in equilibrium calculations. Thus when we use non-equilibrium ionization evolution for OH17; FG20, which are calibrated to match observation assuming photoionization equilibrium , we get significantly larger temperature predictions for their UVB models. In order to match the observations with non-equilibrium ionization evolution, we need to reduce ∆THI and ∆THeII. The reduction of ∆THI and ∆THeII by a factor of f scale is equivalent to reducing H i, He i and He ii photo-heating rates by a factor f scale . Fig. I1 shows the variation in T0 and γ evolution with the variation in ∆THI and ∆THeII predicted by the FG20 model using gadget-3 + cite. We show results in Fig. I1 for four f scale = 1.0, 0.8, 0.7, 0.6 for H i, He i and He ii photoheating rates. The main effect of changing this parameter is to reduce T0 systematically at all redshifts while γ changes very little. This is expected as scaling the parameters ∆THI, ∆THeII is equivalent to scaling H i and He ii photo-heating rates and such scaling changes T0 (keeping γ relatively same see Fig. D1). Fig. I1 shows that the T0 and γ evolution is consistent with our measurements if we use non-equilibrium ionization evolution with f scale = 0.7 parameters in FG20. For completeness, we summarize the best fit ∆THI and ∆THeII parameters used in OH17; FG20 in Table I1 that are consistent with our temperature measurements. In order to do a fair comparison with other UVB models, we show the T0 and γ evolution for varying the photo-heating rates in Fig.  I2. We find that other UVB models match the T0 evolution in some redshift bins but are systematically above or below in other redshift bins. The evolution of γ is similar to that shown in Fig. 18. Finally, we compare the evolution of T0 and γ over the full redshift range 2 ≤ z ≤ 6 and all the different flux statistics in Fig. I4. We emphasize that a considerable amount of work has also been done to measure the T0 and γ evolution . These measurements broadly agree with the theoretical T0 and γ evolution predicted by the different UVB models. Table I1. Heat injection parameters (∆T HI , ∆T HeII ) in UVB models consistent with the measured T 0 and γ evolution from this work. T 0 = 9500 K, γ = 1.55 (3.7 z 3.9) Observation Figure G9. Same as Fig. 12 except for 3.7 ≤ z ≤ 3.9.

I4 The way forward for UVB models
We have rescaled the photo-heating rates in §6.2 to better match our temperature measurements. Even though this is a simple modification to the UVB models, this is not necessarily physical because photo-ionization and photo-heating rates are ultimately linked to the population of ionising sources (QSOs and /or galaxies) and IGM properties. In this section we discuss the possible modifications that may be needed in the various UVB models to match our measurements. Note, however, that implementing these in the UVB models is beyond the scope of present work hence we discuss qualitative features of a physically motivated UVB model that could match our measurements. From Fig. 18, it is clear that the maximum in T0 occurs at systematically lower redshift than that from HM12; KS19 UVB models. However, the maximum value of T0 from these models seem to agree well with that from observations. This suggests that the normalization of the QSO emissivity in these models is roughly consistent with observations. However, the He ii reionization in these models starts early and is more gradual than suggested by observations (see panel B1 in Fig. 19). Thus, in HM12; KS19 the redshift evolution of fHeII needs to be modified. In KS19, this could be achieved Mock-1 Mock-2 Mock-3 Mock-4 Mock-5 Observation Figure G10. Each panel shows the contour enclosing 90 percent of the points in the b − log N HI plane. Black dashed contours shows the distribution of b − log N HI points from observed samples in this work. The solid contours (blue, green, red, magenta and cyan) show the regions of the b − log N HI distribution for our mock samples. Each mock sample matches the total observed redshift path length in a given redshift bin. For visual purposes, we show contours for 5 mock samples. However, we have computed b distribution for 100 such mock samples. The thermal parameters of the mock samples in each redshift bin correspond to the best fit thermal parameters obtained in this work. The b − log N HI distributions from our mock samples is similar to the observed b − log N HI distribution. The shape of the contours for lower b values from the simulations are in good agreement with that from observations. Traditionally the lower envelope of b values is used to measure the thermal parameters. The variations in the b − log N HI distribution in different mock samples are small indicating that the distribution is converged with respect to cosmic variance. Our temperature measurements from the curvature (black stars) and wavelet (magenta circles) statistics are in good agreement with those from the literature at z > 2.4. However, our measurements at z < 2.4 are systematically lower than those in the literature, but are consistent within 1.5σ.  Figure H1. Each panel is same as Fig. 11 except in panel A2 and A3, we show reduced χ 2 (χ 2 dof ). The minimum χ 2 dof is close to ∼ 1 suggesting that the errors estimated on statistics are not overestimated/underestimated. by changing the ratio of He ii and H i fractions (η). Since there are only observational limits limits on fHeII and the fluctuations in He ii ionizing background are important at z > 3, there is scope for modification in the fHeII evolution of the KS19 UVB. The fHeII evolution can also be modified by varying QSO SED index (α, fν ∝ ν −α Gaikwad et al. 2019). However, we show in online supplementary appendix I3 that the observed thermal parameter evolution can not be matched by simply varying the QSO SED index within the observationally allowed range of α = 1.4 − 2.0 (Lusso et al. 2014;Stevans et al. 2014;Shull & Danforth 2020). For the HM12 UVB model both the QSO luminosity functions and QSO SED index need to be updated to be compatible with the most recent compilations of the relevant observation(see e.g. Kulkarni et al. 2019).
In summary, the observed thermal parameter evolution from this work suggests a combination of changes to existing UVB models. For HM12; KS19 UVB models, modification in fHeII is necessary. For the OH17; P19; FG20 UVB models modification of the ionizing emissivity from QSOs is needed.

APPENDIX J: CUMULATIVE ENERGY PARAMETER
The default photo-heating rates of all the UVB models predict systematically higher T0 than we measure at z > 3 (see Fig. 18). Since we have changed the photo-heating rates to better match the observed T0 evolution, it is also interesting to have a look at the evolution of the cumulative energy (u0) deposited into IGM by He ii reionization. We calculate the cumulative energy per unit mass (u0) deposited into IGM as where H = nXI XI (with XI= H i, He i, He ii), ρ, H(z), nXI, XI are mean baryon density, Hubble constant, number density of species XI and photo-heating rate of specie XI (Nasir et al. 2016). To calculate u0 for a simulation box with given UVB and at a given redshift, we first calculate the volume average neutral fraction fXI. We then use above expression to compute u0. Given that there could be uncertainty in u0 after H i reionization, we chose u0(z = 6) = 0 as a reference point. The u0 defined in this work is mainly sensitive to heat deposited during He ii reionization.
In Fig. J1 we show the evolution of u0 for all the UVB models with non-equilibrium ionization evolution. The observed u0 evolution is difficult to measure directly from observations. To nevertheless facilitate a comparison, we show the u0 evolution in Fig. J1 for the FG20 non-equilibrium UVB with scaled photo-heating rates that matches the observed T0 − γ evolution from this work. We treat the u0 evolution corresponding to this model as that inferred from our temperature measurements (black dashed line with shaded region). Fig. J1 also shows the u0 evolution in all the UVB models assuming non-equilibrium ionization evolution. The photo-heating rates here are not scaled for the UVB models. At z = 3 the u0 predicted by all the UVB models is sys-

B1
This Work Gaikwad+2020 T 0 = 11500, γ = 1. 10 T 0 = 15000, γ = 1. 30 T 0 = 8000, γ = 0. 80 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 z Effect of T 0 , γ at z = 6 for FG20 UVB (Non-Eqbm) Figure I1. Each panel here is the same as Fig. 18 except the sensitivity of thermal parameters to the photo-heating rates and initial T 0 and γ at z = 6 is illustrated for the FG20 non-equilibrium model. Since the T 0 evolution for the default FG20 model is systematically higher than our measurements, we rescale the H i, He i and He ii photo-heating rates in panel A1 and A2. The observed thermal parameter evolution (along with 1σ uncertainty) from this work is in good agreement with the non-equilibrium prediction of the FG20 UVB model if photo-heating rates are reduced by a factor of 0.7±0.1. A simple scaling of photo-heating rates does not affect the γ evolution which is consistent with Fig. D1. Given that the uncertainty in thermal parameters after H i reionization is large, panel B1,B2 show the effect of initial T 0 and γ at z ∼ 6 on the T 0 , γ evolution at late times 2 ≤ z ≤ 4 (Gaikwad et al. 2020). The uncertainty in T 0 , γ at z ∼ 6 has only a small (less than 3 percent at z ∼ 3) effect on the T 0 and γ evolution at late times. This suggests that the thermal effect of H i reionization at 2 ≤ z ≤ 4 is sub-dominant relative to that of He ii reionization.

B2
sured T0 − γ evolution. The u0 evolution in HM12; KS19; P19 is more gradual compared to the other two models. This Work consistent with T 0 − γ measurements ( XI, FG20 × 0. 7) Figure J1. The evolution of the cumulative energy deposited into the IGM per unit mass (u 0 ) for different non-equilibrium UVB models is shown. We chose u 0 (z = 6) = 0 as reference point to minimize the effect of uncertainty in u 0 due to H i reionization.
The u 0 described in this work is primarily due to He ii reionization. The black dashed line and the shaded region show the u 0 evolution consistent with the measured T 0 − γ evolution from this work. Without re-scaling the photo-heating rate of all the UVB models predict systematically larger u 0 than consistent with our temperature measurements. Not that the u 0 evolution accounts for the heating but not the cooling of the IGM. Hence to match the T 0 − γ evolution, the whole of the u 0 evolution is important and not just matching u 0 at some redshift. A sharp increase in the u 0 inferred from our temperature measurements is evident at 3.0 ≤ z ≤ 3.6.