The Lyman-α forest catalog from the Dark Energy Spectroscopic Instrument Early Data Release

We present and validate the catalog of Lyman-α forest fluctuations for 3D analyses using the Early Data Release (EDR) from the Dark Energy Spectroscopic Instrument (DESI) survey. We used 88,511 quasars collected from DESI Survey Validation (SV) data and the first two months of the main survey (M2). We present several improvements to the method used to extract the Lyman-α absorption fluctuations performed in previous analyses from the Sloan Digital Sky Survey (SDSS). In particular, we modify the weighting scheme and show that it can improve the precision of the correlation function measurement by more than 20%. This catalog can be downloaded from https://data.desi.lbl.gov/public/edr/vac/edr/lya/fuji/v0.3, and it will be used in the near future for the first DESI measurements of the 3D correlations in the Lyman-α forest.


INTRODUCTION
The Lyman-α forest is a pattern of absorption features caused by neutral hydrogen in the Intergalactic Medium (IGM), typically observed in the spectra of distant quasars (z > 2).Measurements of 1D correlations in a handful of high-resolution quasar spectra emerged as a powerful tool to study the largescale distribution of matter (Croft et al. 1998;McDonald et al. 2000), opening a new field in the analysis of the high redshift universe and helping to constrain cosmological parameters.
Using data from the Baryon Oscillation Spectroscopic Survey (BOSS, Dawson et al. 2013), the three-dimensional correlation function of absorption in the Lyman-α forest was measured for the first time in Slosar et al. (2011).Shortly after that, the first measurement of the Baryonic Acoustic Os-cillations (BAO) peak in the Lyman-α forest was presented (Busca et al. 2013;Slosar et al. 2013;Kirkby et al. 2013), using data from BOSS DR9 (Lee et al. 2013).These were followed by other BAO analyses using increasingly larger Lyman-α forest datasets from BOSS (Delubac et al. 2015;Bautista et al. 2017) and from the extended Baryon Oscillation Spectroscopic Survey (eBOSS, Dawson et al. 2016) (de Sainte Agathe et al. 2019).The precision of these BAO measurements was significantly improved with the measurement of the cross-correlation of quasars and the Lyman-α forest (Font-Ribera et al. 2014;du Mas des Bourboux et al. 2017;Blomqvist et al. 2019), and the final Lyman-α BAO measurement combining BOSS and eBOSS was presented in du Mas des Bourboux et al. (2020).
The Dark Energy Spectroscopic Instrument (DESI) is currently undergoing a five-year campaign to obtain close to a million quasar spectra with z > 2 (DESI Collaboration et al. 2023a;Chaussidon et al. 2023).This dataset will be four times larger than the state-of-the-art (eBOSS DR16 quasar sample Lyke et al. 2020), and will enable sub-percent BAO measurements with the Lyman-α forest (Levi et al. 2013;DESI Collaboration et al. 2016a).
In this publication we present the first catalog of Lyman-α forest fluctuations in DESI, including data from the Early Data Release (EDR, DESI Collaboration et al. (2023b)) and from the first two months of the main survey (M2).This dataset is used in a companion paper to measure the first 3D correlations in the Lyman-α forest from DESI (Gordon et al. 2023), and a comparison with synthetic datasets is presented in Herrera et al. (2023).
The methodology used here is similar to the one developed for eBOSS analyses, especially the most recent analysis by du Mas des Bourboux et al. (2020).This served as the basis for developing the data analysis pipeline of the DESI Lyman-α forest working group.In this publication, we provide a detailed description of our new pipeline, focusing on the changes with respect to the one used in eBOSS analyses.Some of these changes are motivated by changes in the input data: for instance, while SDSS spectra had pixels equispaced in the logarithm of the wavelength, DESI uses linearly spaced pixels.Other changes are motivated by studies that appeared after du Mas des Bourboux et al. (2020).For instance, following Ennesser et al. (2022) we now include in our analysis the spectra of Broad Absorption Line (BAL) quasars, after we mask the most contaminated regions.Finally, we also revisit the weighting scheme used to compute correlations in the Lyman-α forest, resulting in an improvement of more than 20% in our precision.
All of the process followed in this paper, along with the updated changes, is executed using the publicly available code picca1 and can be reproduced by the user using public DESI data.The picca package also includes modules for the computations of both auto-and cross-correlation with quasars, cosmological fits, and multiple useful tools for Lyman-α forest studies.
The catalog described here is aimed at studies of 3D correlations in the Lyman-α forest.A similar dataset, however, is also used in two companion publications that present the first measurements of 1D correlations with DESI data (Ravoux et al. 2023;Göksel Karaçaylı et al. 2023).Although analogous estimations of the unabsorbed quasar continuum are also needed in these studies, their methodology is somewhat different and these publications focus on studies of systematics that primarily affect the 1D correlations.
This paper is structured as follows.In Section 2, we present the data used for this work, including spectra and quasar catalogs.In Section 3, we explain how we obtain the fluxtransmission field from spectra through the estimation of the expected flux of quasars in the continuum fitting process.This includes masking some wavelengths and applying corrections to the flux calibration and the reported uncertainties by the pipeline.In Section 4, we discuss aspects that require further clarification, such as modified weights and wavelength grid choices.Finally, in Section 5, we provide a summary of our findings and conclusions.

DATA
Data used in this publication comes from two different DESI datasets.On the one hand, we have the Early Data Release (EDR), which includes all Commissioning, Survey Validation (SV), and special survey data.On the other hand, we have EDR+M2, which includes all data from EDR as well as the first two months of the main survey.
Although this two samples are qualitatively similar, we are performing the analysis separately for each of them to achieve the two purposes of this paper: • Describe the Lyman-α forest Value Added Catalog (VAC) in the context of EDR (DESI Collaboration et al. 2023b): VACs for a wide variety of tracers are being released using DESI early data.The present work provides the Lyman-α forest fluctuations catalog.
• Describe the Lyman-α forest catalog used in the context of early DESI data publications: Multiple related publications are being published in this context within the Lyman-α working group.The objective of Gordon et al. (2023) is to obtain the first Lyman-α correlation measurements from DESI early data, testing the current pipeline and data quality, and compare its performance to previous eBOSS DR16 analyses.Herrera et al. (2023) provides details on the current status of the different procedures used to build mocks for Lyman-α forest analyses.Gontcho et al. (2023) characterizes the systematics caused by the DESI instrument on the 3D correlations of the Lyman-α forest.Bault et al. (2023) studies the impact of redshift errors on the 3D cross-correlation of quasars with the Lyman-α forest.P1D analyses are performed in two different papers: Ravoux et al. (2023) presents the 1 dimensional measurement using Fast Fourier Transform (FFT), while Göksel Karaçaylı et al. (2023) makes use of the Quadratic Maximum Likelihood Estimator (QMLE).
Finally, the current publication provides the Lyman-α fluctuations catalog, as well as its validation.The decision to use EDR+M2 data for these analyses was motivated by the need for a larger volume of data than the EDR could offer, which leads to better measurements of the correlation function, constraining power and estimation of systematics.
EDR data, including Lyman-α forest fluctuations is available now, including this VAC describing Lyman-α fluctuations.However, EDR+M2 will not be released as a separate piece of data and M2 will be released alongside Year 1 (Y1) data.

DESI spectroscopic data
DESI is the largest ongoing spectroscopic survey, operating on the Mayall 4-meter telescope at Kitt Peak National Observatory (DESI Collaboration et al. 2022).DESI consists of 5000 fibers placed in the focal plane and distributed across 10 petals.Each fiber is controlled by a robotic positioner, allowing for quick and automatic positioning.The large number of fibers and the fast cadence provided by the automatic positioners allows DESI to measure up to 5000 spectra every 20 minutes over a ∼ 3 • field (DESI Collaboration et al. 2016b;Silber et al. 2023;Miller et al. 2023).Fibers carry light from the telescope into a separate room, where it is dispersed by ten spectrographs.Each of them with three cameras each (B, R, Z), covering different wavelength ranges.
While most of the fibers are assigned to science targets, some are used for calibration.Fibers assigned to the sky are essential for sky subtraction, as they enable the removal of the contribution to spectra caused by light from the background sky, particularly emission lines.Other fibers are assigned to standard stars in order to properly calibrate fluxes.This process is performed by comparing the measured counts in the CCD and the expected fluxes from the standard stars, allowing to estimate fluxes for all the other targets through a process called spectroperfectionism (described in Bolton & Schlegel 2010).Spectroperfectionism provides spectral fluxes and their variances.For the full description of the DESI spectroscopic reduction pipeline see Guy et al. (2023).
To help data processing, other pipelines are developed for multiple purposes: optimize the observation strategy (Schlafly et al. 2023), calculate the exposure time Kirkby et al. (2023), assign fibers to targets Raichoor et al. (2023a), and target selection (Myers et al. 2023).
Quasars at redshift higher than 2.1 are identified by the DESI pipeline as high redshift quasars suitable for Lymanα forest studies.These will be re-observed, allowing for a better measurement of their spectra.Re-observations of the same quasar are then coadded and stored in files grouped by healpix pixels (see Górski et al. 2005).The coaddition consists in a simple weighted average of fluxes for each wavelength in the spectra, using as weights their inverse-variance.Coadded files alongside the quasar catalog are used by our pipeline to build the Lyman-α forest catalog.
Each coadded file contains per-spectra information for the three arms (B, Z, R), as well as various metadata, including fiber positions, instrument configuration during observation and atmospheric conditions.The spectroscopic data include fluxes, estimated inverse-variance and a mask identifying invalid pixels.
Each of the three arms of the spectrograph covers a different wavelength region: with some overlap between each of the arms.The Lyman-α forest is predominantly observed in the blue arm (B arm) of the spectrograph, and only partially in the red arm (R arm).DESI follows a linear wavelength grid with 0.8 Å steps, picca is adapted to work with this resolution.
In Figure 1, we show the distribution of wavelength and pipeline-reported error on the flux measurements for the EDR+M2 dataset, where σpip the variance reported by the DESI pipeline based on photon statistics and readout noise.The image highlights the main differences between EDR and M2 samples.EDR data corresponds to longer observations, with fewer quasars observed.M2 data, on the other hand, includes a large number of objects, but with shorter observations, leading to a larger value for σpip.The same panel shows  2. The solid lines mark the mean value of the error for a given wavelength, black for the EDR sample and white for the EDR+M2.Apart from the clear distinction in these two lines, a subtle division into two bands at larger wavelengths can also be observed.This is caused by the better signal-to-noise in the EDR sample, thanks to multiple re-observations of the same targets.We can also observe how the variance relatively increases in the two ends of the spectrograph and in the area affected by the collimator mirror reflectivity around 4400 Å.
To match the accuracy of our most reliable mock data (Herrera et al. 2023) and due to the limited amount of Lyman-α forest data available at longer wavelengths, we have limited our analysis to fluctuations fulfilling z < 3.79.Given that Lyman-α fluctuations can be related to a specific location in the wavelength grid (z = λ/λLyα − 1), this redshift cut corresponds to a maximum wavelength of 5772 Å entering our analysis.

Description of the quasar catalog
Objects in the spectroscopic data are identified using the template-fitting code Redrock (Bailey et al. 2023, and for the special case of QSOs, Brodzeller et al. 2023).Redrock employs a set of templates as representatives of the main object classes observed by DESI: quasars, galaxies and stars.For each observed spectrum, Redrock determines the bestfitting redshift and template by comparing it to the set of templates.This process allows Redrock to accurately identify the objects in the DESI spectroscopic data.
However, Redrock sometimes misidentifies quasars as galaxies.To address this issue, a set of independent quasaridentification approaches, referred to as "afterburners" are also run.These afterburners can identify features missed by Redrock and improve the accuracy of quasar identification.In our case, the most relevant afterburners are QuasarNet (Busca & Balland 2018;Farr et al. 2020), the Mg ii after-burner, and SQUEzE (Pérez-Ràfols et al. 2020).While only the first two methods were used to build the final catalog, all of them were tested during the SV phase (Alexander et al. (2023), also see Lan et al. (2023) for SV results on galaxies).
For the case of QuasarNET, its main role is to correct for cases where Redrock identifies an object as a low redshift galaxy when it is actually a high redshift quasar.In these cases, QuasarNet is used to re-identify the object as a quasar based on its spectral features using Machine Learning techniques and trained using visually inspected datasets.If QuasarNet identifies an object as a high redshift quasar, Redrock is run again with a high redshift prior.This will likely change the object identification to a high redshift quasar, and if confirmed, the object would be included in the final catalog.
The other relevant afterburner is the Mg ii afterburner.Its main role is to correct for cases where Redrock misidentifies a quasar as a galaxy.For every object identified as a galaxy by Redrock, the Mg ii afterburner checks the width of the Mg ii emission line.If the line is wide enough, the object is re-identified as a quasar and included in the final catalog.
For this release, only objects targeted and confirmed as quasars are used.The resulting catalogs include a total of 68,750 quasars for the EDR sample and 318,691 quasars for the EDR+M2 (see Table 1).This value includes all quasars in the input sample, being the actual value of quasars used for each region detailed in Table 2.The redshift and spatial distributions of the quasars is shown in Figure 2, compared to the larger catalog used in eBOSS DR16 analysis (du Mas des Bourboux et al. 2020).The number of quasars in the EDR+M2 is clearly dominated by the first two months of main survey (M2), containing almost half of the objects in the eBOSS sample.

BAL and DLA information
Damped Lyman-α Absorption (DLA) systems caused by H irich galaxies affect the Lyman-α forest by generating absorption features that can interfere with the continuum fitting process.DLA information is provided using a DLA finder based on a convolutional neural network and a gaussian process, for a full description of the DLA catalog see Zou et al. (2023).We selected from the full sample the objects where both the convolutional neural network and the gaussian process detected a DLA with a confidence larger than 50%.
Apart from DLAs, the Lyman-α forest can also be affected by broad absorption lines believed to be caused by the existence of ionized plasma outflows from the accretion disk.These BAL quasars can be added to the analysis, although they have to be appropriately masked (see Section 3.1).The algorithm for BAL identification was presented in Guo & Martini (2019), and the detailed description of the BAL catalog for DESI data is presented in Filbert et al. (2023).
The number of DLAs and BALs for both EDR and EDR+M2 samples is shown in Table 1.Given the better signal-to-noise in the EDR sample, the identification of DLA and BAL objects in this sample is higher in this smaller dataset.

SPECTRAL REDUCTION
The continuum fitting procedure estimates the expected flux for each of the quasars in the catalog, this process is essential for computing Lyman-α fluctuations.The flux-transmission field can be defined as: where F is the mean transmitted flux at a specific wavelength, Cq the unknown unabsorbed quasar continuum for quasar q and fq its observed flux.The combination F Cq(λ) is the mean expected flux of the quasars, and is the quantity that we fit for the spectra.
Continuum fitting is the procedure to compute the fluxtransmission field.It can be split in an initial clean-up phase and a second phase where the quasar continuum is actually fitted.The clean-up phase involves two sequential procedures: masking multiple unmodelled features and re-calibrating the spectra to eliminate residuals from the DESI calibration procedure.
In this section we first explain these two procedures, and then provide details on the core of the continuum fitting process.

Masks (sky lines, galactic absorption, DLA, BAL)
There are multiple features in the spectra that are not caused by Lyman-α fluctuations but are still present in our spectra due to contamination.In order to simplify the cosmological modeling when using Lyman-α fluctuations for correlation measurements, we mask these features and remove them from our analysis.
The three type of masks applied in our analysis are: DESI pipeline; BAL and DLA, applied to absorption features in the forest2 region; and galactic absorption and sky emission lines that appear in the spectrograph when observing Lyman-α quasars.We note that the DESI pipeline mask is applied before the individual observations are co-added (see Section 2.1).The other masks are applied after the coaddition.

DESI pipeline masking
A masking process is applied within the DESI pipeline to identify bad pixels in the spectra.These are typically caused by CCD defects or cosmic rays hitting the spectrograph CCD.This mask is found to only affect about 0.1% of the DESI pixels used for the Lyman-α analysis.Given the small fraction of pixels discarded, we decided to remove these pixels from the analysis.

DLA and BAL masking
DLA absorption affects our spectra by imprinting itself in the spectra of quasars, resulting in zero flux around the true redshift of the DLA and damping wings further away.A secondary effect of this is the reduction in the mean flux of the affected spectra, affecting the overall mean absorption.This biases our estimate of F Cq, potentially affecting all the quasars in the sample (see Section 3.3).
Although it is possible to include the absorption features from DLAs into our models for the correlation functions, for simplicity, and following the prescription in previous analyses (du Mas des Bourboux et al. 2020), we mask the regions of the Lyman-α forest that are affected by identified DLAs when the DLA reduces the transmission by more than 20%.We then correct the absorption in the wings using a Voigt profile as suggested by Noterdaeme et al. (2012), being able to include the pixels affected by these wings without affecting the mean absorption.The top panel of Figure 3, shows the fraction of pixels that have been masked due to DLAs as a function of observed wavelength.Although there is an increase in the number of detected DLAs with redshift, the number of masked pixels is always smaller than 5%.
Regarding BALs, in previous analyses quasars exhibiting BAL features were directly excluded from the analysis, although they were later used as tracers for cross-correlations between quasars and the Lyman-α forest.However, Ennesser et al. (2022) showed that quasars with BAL features could be safely included in the analysis if all expected locations of these absorption features are masked.Here, we follow their proposed approach.
In the bottom panel of Figure 3, we observe the fraction of pixels masked due to BALs as a function of rest-frame wavelength.In this case, we can see a pattern of absorption that matches the expected absorption features described in Ennesser et al. (2022).Since not all the BAL quasars suffer from the same absorption profile, we expect the BAL-masked pixel fraction to be maximal around the central absorption and decrease as we go away from it.It is worth noting that for the quasars used in the Lyman-α region continuum fitting, the percentage of BAL quasars is 16% for the EDR+M2 sample and 23% for the EDR sample.

Galactic absorption and sky emission masking
There are certain absorption and emission features in our spectra that differ from Lyman-α fluctuations or other ab- sorbers in the IGM.These absorption and emission features can be easily identified because they affect specific wavelengths, resulting in sharp features when the spectra of multiple quasars is combined.We can remove them by simply masking out the corresponding wavelengths from our analysis.
The two most significant features in this regard are galactic absorption and sky emission.Galactic absorption is caused by material in the Interstellar Medium (ISM) absorbing at specific wavelengths.The ISM absorption for some of these lines cannot be easily separated from the intrinsic absorption in the atmospheres of the stars used for flux calibration, and thus they are not properly accounted for.Here, we are affected by the Ca K and H transitions, and we mask the corresponding wavelengths (see Figure 4).
Sky emission comprises emission lines generated by atmospheric effects and are mostly corrected in the modelling of sky lines.However, inaccuracies in this modelling result in spurious features in our spectra.Here, we also mask the affected wavelengths (see figure 4).
In Figure 4, we present the measurement of the estimated 1 + δq(λ) in the C iii region (see Table 2).Two kind of features can be observed in this plot: first, we observe the sharp features corresponding to the mentioned galactic absorption and sky emission; then we observe smooth features caused by inaccuracies in the DESI calibration process.To correct for the former, we mask the pixels in the wavelength intervals shown in the plot.The later can be corrected through re-calibration (see Section 3.2).
Here we use the C iii region to build our masks as it lacks the Lyman-α absorption features.This makes it easier to estimate absorption and emission effects.

Re-calibration
DESI calibrates fluxes using standard stars (Guy et al. 2023).However, inaccuracies in the modelling of these calibration stars introduce features in the measured spectra when fluxes are predicted for other objects (such as quasars).These features can be observed by examining the mean δq(λ) in any region of the spectra, particularly in the regions without Lyman-α absorption, where the variance of the measured flux is expected to be lower.This is possible because the features caused by calibration defects are function of observed wavelength, while the continuum estimate is a function of restframe wavelength.Looking again at Figure 4, we can observe smooth features in the stack of fluctuations (1 + δq(λ)) alongside the masked sharp features.
These features are correlated between different forests, and could therefore bias the measurement of the correlation function.To avoid this, on top of the DESI pipeline calibration, we re-calibrate our spectra.As stated above, these features affect the same region of observed wavelength regardless of the restframe position where they act.Therefore, one could in principle use the entire quasar spectra for calibration purposes.However, the larger spectral diversity near quasar emission lines can unnecessarily complicate this.In consequence, we search for a featureless region redwards of the Lyman-α emission line.We will refer to all the candidate regions as recalibration regions.
In the absence of Lyman-α fluctuations, flux fluctuations are ideally only caused by noise.In any of the re-calibration regions, the measurement of 1 + δq(λ) is expected to be consistent with 1; and its measurement can be considered a null test that is not fulfilled in general given the issues described above.
In the bottom panel of Figure 5, we show the measurement of 1 + δq(λ) for multiple different candidate re-calibration regions.The fact that all of them show similar features, in spite of being at different regions of the spectra, suggests that all fluxes can be corrected consistently.Furthermore, we see comparable features in the White Dwarf average residuals in the top panel of the same Figure.This similarity can be seen in some of the regions of the spectrum, especially for λ ∈ [3600, 4200] Å, and it further justifies the re-calibration process.
In previous analyses, a region to the right of the Mg ii emission line (Mg ii-R) was selected to re-calibrate fluxes, partly because it is located further to the right of the spectra, reducing potential contamination from other absorption lines.However, in this work we selected the C iii region (λ ∈ [1600, 1850] Å) due to the larger number of pixels available and given the similar behavior compared to the other regions.Figure 6 shows the number of pixels at each wavelength bin of size ∆λ = 55.58Å for the different regions, and the number of forests available for each region can be found in Table 2.In both cases, we see that C iii has the largest number of pixels available for the analysis.
The choice of a re-calibration region at a larger wavelength than the Lyman-α forest also leads to an increase in the number of quasars that can be used for this process.The Lymanα forest analysis includes quasars with redshifts in the range z ∈ (2.1, 3.7), while for the C iii region, quasars in the range z ∈ (0.9, 2.6) are included.By looking at the quasar distribution (Figure 2), we see that the number of objects in the  4, sharp features are not present here because these samples have already been masked.The smooth features in the spectra are similar for all the measured regions, being the Mg ii-R an outlier in this tendency, likely to be caused due to the reduced number of pixels available for this region at low wavelengths (see Figure 6).Results are computed from the full EDR+M2 sample.Top: Average residuals to White Dwarf (WD) spectra in the blue arm of the DESI spectra, as seen in Guy et al. (2023).A similar trend can also be observed here between these residuals and our measured average of the flux-transmission field, especially at smaller wavelengths, further justifying our recalibration process.
second range will be larger, and hence a larger number of pixels available in the re-calibration region.
We use a similar procedure as du Mas des Bourboux et al. ( 2020) for this re-calibration, correcting all the fluxes by using the mean flux in the re-calibration region: (2) The error reported by the pipeline for these fluxes also needs to be corrected through: After this correction, smooth features in the spectra will be alleviated, which prevents biased results.In previous analyses, a second correction was applied to correct the estimation of the flux variance provided by the pipeline (see Appendix A).However, the better performance of DESI in this aspect allows us to skip this step.Evidence in this direction is the better behavior of the estimated correction to the pipeline error (see Figure 7).

Continuum fitting
The core of the continuum fitting process consists of obtaining the expected flux F Cq for each quasar in the sample.This allows for the determination of the flux-transmission field (Eq.1).The process is performed iteratively, with several quantities fitted simultaneously, and performed on each of the quasar regions independently: in the case of a re-calibrated Lyman-α analysis, it is firstly performed in the re-calibration region (C iii in this case), and afterwards in the Lyman-α region.
In order to simplify the process, the expected flux F (λ)Cq(λ) is assumed to be a universal function of rest-frame wavelength, C(λRF), corrected by a first degree polynomial in log λ.: where Λ ≡ log λ and Λmin,max identify its minimum and maximum values inside the region to be fitted.That is, the spectra of quasars in our sample are assumed to have the same underlying shape or continuum for all objects, allowing for variations in the amplitude and tilt.In the analysis, the transmitted flux F and the quasar continuum Cq cannot be fitted independently, although they are not needed separately in our analysis.For this reason, we choose to directly estimate the expected flux of quasars.
The parameters (aq, bq) are fitted by maximizing the likelihood function (5) where σ 2 q (λ) is the variance of the flux fi.This value has to be estimated, and since it depends on (aq, bq), we include this dependence in the likelihood function.
The full variance of the flux, σ 2 q (λ), includes not only the obvious contribution from the noise estimated by the DESI pipeline but also the intrinsic variance of the Lyman-α forest3 .We account for the intrinsic variance in the following way: where σ 2 LSS is the mentioned intrinsic variance of the Lymanα forest, and σpip,q = σpip,q(λ)/F Cq(λ) where σpip,q is flux variance as estimated by the DESI pipeline.In this expression we also include a correction η(λ) to account for inaccuracies in the pipeline noise estimation.We discard here an extra term in this expression accounting for quasar variability at high SNR, which was previously used in du Mas des Bourboux et al. (2020) (see A2 for details).It is worth noting that σ 2 q here is the variance of the flux and therefore has flux units, whether all quantities at the right hand side are dimensionless.
In the iterative continuum fitting process, we estimate the quantities η, σ 2 LSS , aq, bq, C(λRF).The sequential steps of the process are: (i) Assume wavelength-independent values for unknown quantities: A flat assumption is assumed for all the quantities to be fitted and measured.In practice, this means setting C(λRF) = 1, η(λ) = 1 and σ 2 LSS (λ) = 0.1.(ii) Fit the per-quasar parameters (aq, bq): Using the current values of C(λRF), η(λ) and σ 2 LSS (λ), the pair of values (aq, bq) is estimated for each individual forest through the minimization of the likelihood defined in Eq. 5.
(iii) Estimate the flux-transmission field δq(λ): Using the current value C(λRF) and the best-fit parameters for (aq, bq), we compute the expected flux for each quasar following Eq. 4. This allows for the computation of the fluctuations around the expected flux (δq(λ)) as defined in Eq. 1.
(iv) Fit the variance functions: The functions η(λ) and σ 2 LSS (λ), defined in Eq. 6, are now fitted using the estimated δq(λ) from the previous step.In order to do so, different values of the flux-transmission field are grouped by wavelength and σpip.We take 20 bins for the wavelength split and 100 for the split on σpip, generating a grid of 2000 points.We compute the variance σ 2 (λ, σpip) = δ 2 q (λ, σpip) for each point in the grid4 .The functions η(λ) and σ 2 LSS are fitted for each of the 20 wavelengths independently using the following likelihood function: where the sum in σpip include all the valid bins in σpip (at most 100).The points in the grid computed using less than 100 pixels are considered unreliable and discarded from the fits.We note that in those cases where the fit does not converge (for example, if there are too few quasars), then the default values from step (ii) are kept.
(v) Recompute the mean expected flux: At this stage we update the value of C(λRF).This is performed by computing the weighted average of all quasars expected flux sharing the same λRF value.Here, we use the optimal weights as defined in Eq. 8 (see section 4.1).
(vi) Compute and save relevant statistics: Relevant statistics are computed at each iteration and stored in delta attributes iteration{i}.fits.gzfiles (where i is the iteration number).The saved statistics include the stack of fluctuations (1 + δq(λ)), the fitted variance functions (η, σ 2 LSS ), the mean continuum (C(λRF)) and fit metadata including the tilt and slope values (aq, bq), the number of pixels used for the fit and the χ 2 of the fit.
(vii) Continue next iteration starting in step (ii): The next iteration starts using the updated values of C(λRF), η and σ 2 LSS .
This process is performed 5 times, resulting in stable estimates of all the defined quantities.
In Figure 7, we show the final estimates for the two fitted function η and σ 2 LSS for the two regions considered in this analysis (Lyman-α and C iii) and for the two different samples (EDR and EDR+M2).The correction to the pipeline estimated variance, η, shows a ∼2% deviation from the ideal value of 1 in the case of EDR+M2 and up to ∼7.5% deviation in the case of EDR.This difference is due to the worse estimation of the pipeline-reported variance for the first part of the SV program (see Ravoux et al. 2023 for details).For σ 2 LSS , its estimation is similar for both samples, showing a clear dependence on wavelength (and hence redshift) for the Lyman-α region and an expected (due to the lack of Lymanα fluctuations) nearly zero value for the C iii region.The fit fails at the higher wavelength values in the case of the EDR .Wavelength evolution of the fitted parameters η and σ 2 LSS , measured in both the C iii and Lyman-α regions for the EDR (dashed) and EDR+M2 (solid) samples.Top: The pipeline error correction η is found to be larger for the EDR sample, caused by the worse estimation of the pipeline-reported variance for the first part of the SV program.Bottom: σ 2 LSS , in this case it is consistent between the two samples.As expected the C iii region shows a value close to 0 for all wavelengths, while for the Lyman-α region follows the expected increase in its intrinsic variance with redshift.In both η and σ 2 LSS , measurements for the EDR sample, the fitted parameters could not be obtained for the larger wavelength value due to the reduced number of pixels available, falling to the default values η = 0.1 and σ 2 LSS = 0.1.
sample due to the small number of pixels available at these wavelengths, and then the initial value is kept.
Figure 8 shows the mean flux-transmitted field for the same two regions and samples.In this case, differences between the two samples are smaller, only showing a higher variance in the case of the smaller (EDR) sample.The variances are particularly worse at the largest wavelengths in the Lymanα region due to the reduced number of pixels available.The C iii region, for which no re-calibration has been performed, still shows the smooth features already discussed in Section 3.2.They do not appear in the already re-calibrated Lymanα region, although there is a higher variance caused by the less number of pixels (as shown in Figure 6).

Example of quasar fit
An example of a high SNR quasar spectrum is shown in Figure 9.We can see the Lyman-α forest at the left side of the Lyman-α emission line (λRF = 1215.7 Å), with the characteristic absorption features.The solid lines show the mean expected flux for that particular forest at different spectral regions.Metal absorption features can be observed in this example for the S iv and C iv re-calibration regions. .Measurement of the weighted mean of the fluxtransmission field for both the Lyman-α and C iii regions.The Lyman-α region shows an expected higher variance at all wavelengths compared to the C iii region, although it lacks smooth features thanks to the re-calibration process.Similarly, the EDR sample shows a higher variance than EDR+M2.In both cases, this is caused by the decrease in number of pixels available.Given the low number of pixels available at larger wavelengths for the EDR sample for the Lyman-α region, it departs from the expected unity behavior.

DISCUSSION
The main science driver of the Lyman-α catalog presented in this publication is the measurement of 3D correlations that allows us to constraint the BAO scale (Gordon et al. 2023).In this section, we discuss two methodological novelties in the construction of the Lyman-α catalog with respect to previous eBOSS analyses (du Mas des Bourboux et al. 2020), and we do this by looking at the impact of these changes to the precision with which we will be able to measure the 3D correlations.In particular, in Section 4.1 we discuss the optimization of the weights assigned to each Lyman-α fluctuation, while in Section 4.2 we will chosse the rest-frame wavelength range based on the precision of the correlation function measurements.
We conclude this section in 4.3 with a discussion on the distribution of per-quasar parameters that capture the diversity of quasar continua in the dataset.

Optimal weights
Correlations in the Lyman-α forest are estimated as weighted averages of products of fluctuations δq(λ) in pixel pairs at a given separation.An optimal quadratic estimator would use the inverse of the pixel covariance as the weight matrix, but the large number of pixels in current Lyman-α datasets makes this inversion not feasible.
Ignoring the small correlation between pixels in different quasar spectra, one can approximate the covariance as blockdiagonal, and make the inversion tractable.This approximation has been used in several measurements of the 1D power spectrum (McDonald et al. 2006;Karaçaylı et al. 2020Karaçaylı et al. , 2022;;Göksel Karaçaylı et al. 2023), it has been proposed to measure the 3D power spectrum (Font-Ribera et al. 2018) and it was used in one of the first BAO measurements with the Lyman-α forest (Slosar et al. 2013).
Recent measurements of 3D correlations in the Lyman-α forest (Delubac et al. 2015;Bautista et al. 2017;de Sainte Agathe et al. 2019;du Mas des Bourboux et al. 2020), on the other hand, have ignored all correlations between pixels and have used instead a diagonal weight matrix.These studies weighted each pixel with the inverse of its variance, including the instrumental noise and the intrinsic fluctuations (see Equation 6), effectively approximating the inverse covariance matrix with the inverse of its diagonal elements.This approximation is simple and easy to implement, but it is not the optimal diagonal weight matrix.
In this section we study a simple modification of our diagonal weight matrix, where we add an extra free parameter σ 2 mod that modulates the contribution of the intrinsic fluctuations σLSS to the weights: A small value of σ 2 mod is equivalent to weighting the pixels based solely on the noise variance, while a large value gives the same weight to all pixels at a given redshift, regardless of instrumental noise.
In Figure 10 we show that a value of σ 2 mod around 7-8 can improve the precision of the auto-correlation measurement by 25%, at no additional cost.As expected, the gain in precision in the cross-correlation is roughly half of that, and we find a 10% improvement for values of σ 2 mod around 6-7.It is important to note that the actual gain will depend on the properties of the dataset.For instance, σpip and σLSS change differently with the width of the pixels used, and we have tested that when using pixels of 2.4 Å (similar to the ones used in the BAO measurement of du Mas des Bourboux et al. 2020) the optimal value is smaller, σ 2 mod = 3.1, and the gain in the auto-correlation is only 8%.(dashed) correlation errorbars for different choices of the σ 2 mod parameter, scaled to a reference value at σ 2 mod = 1.The value of the errorbars was averaged over all scales, given the small scale dependency.The σ 2 mod parameter modified the inverse-variance weighting scheme as defined in Eq. 8, the use of a σ 2 mod ̸ = 1 allows us for the optimization of the weighting scheme.In both auto-and cross-correlations, we observe a reduction in the size of the errorbars when we approach the optimal value of the σ 2 mod parameter.This optimal value is slightly different, but in both cases is found around 7-8.In the optimal value, the improvement in the measurement of the auto-correlation is about 20%, and 10% for the case of the cross-correlation.Both measurements have been performed with the full EDR+M2 dataset.
For this publication and the associated Value Added Catalog, we decide to use a value of σ 2 mod = 7.5, and we leave for future work the implementation of a block-diagonal weighting.

Selection of rest-frame wavelength range
Due to diversity in quasar spectra near the emission lines, the rest-frame wavelength range that can be used for analyzing the Lyman-α forest is limited.This happens at the red end of the spectra due to the Lyman-α emission line (λRF = 1215.67Å) and at the blue end limited due to the Lyman-β line (λRF = 1025.72Å).
Extending the analysis towards longer wavelengths closer to the Lyman-α emission line allows for the incorporation of more data.By including these extra pixels in our analysis, we can improve our measurements of the correlation function if the improvement due to the large number of pixels is not counterbalanced by of the larger pixel variance.
In Figure 11, the blue points show the size of the errorbars in the auto-correlation measurement when we extend the analysis to higher λRF,max.The value at λRF,max = 1205 Å yields the smallest errorbars.To separate the two effects of increased number of pixels and increased pixel variance, we also plot σ 2 ξ N , where N is the number of pairs in the correlation measurement, since we expect σ 2 ∝ 1/N .This will show how valuable the added points are when extending the wavelength range.Its evolution in Figure 11 shows that extending λRF,max to higher wavelengths adds less valuable information, and therefore the decrease in the size of the errorbars is only driven by the increase in the sample size.
Given this result, we decided to set λRF = 1205 Å because we found that further increasing the limit did not add constraining power.The increased variance near the emission line is likely the reason behind this, as it is not accounted for in our calculations and could potentially affect our measurements if we approached it too closely.A more detailed study of quasar continuum variance, using a larger dataset than the one presented here, is necessary to fully understand its effects.We leave this for future releases of DESI data.
The same exercise was performed for the blue side of the quasar spectrum, and the results are shown in Figure 12.When compared to the prescription of previous analyses (λRF,min = 1040 Å in du Mas des Bourboux et al. 2020), we observe a degradation of the errorbars either going to smaller λRF,min due to the decrease of pixel count; or to larger λRF,min approaching the emission line.For this reason we retain this prescription of λRF,min = 1040 Å.

Per-quasar parameters (a,b)
As mentioned in Section 3.3, the quasar continua is assumed to follow a universal function of rest-frame wavelength, with a correction by a first degree polynomial parametrized by aq and bq.Quasar variability can be larger at both ends of the Lyman-α forest, due to the Lyman-β and Lyman-α emission lines, but variability in the weak quasar emission lines at 1017 Å and 1123 Å (Suzuki 2006) could also impact the results.
In Figure 13, we examine the distribution of these two parameters to test this assumption.The plot does not show special features apart from the long tail at large values of |bq/aq|.This feature is caused by spectra where only a small part of the forest appears in the spectrograph.In these cases, fitting the real continuum of the quasar is difficult, leading to poor fits.For this reason, we require forests to have a minimum length of 150 pixels of 0.8 Å pixels.

SUMMARY AND CONCLUSIONS
In this publication, we present the first measurement of Lyman-α fluctuations using DESI data.We use two data sam- show the errorbars size after averaging over all scales (given the small scale dependence of this quantity), and scaled to a reference value at λ RF,max = 1205 Å.The orange points show the equivalent measurement but in this case for the product of errorbar sizes times the number of pixel pairs, removing the dependence on the number of pairs used in the measurement.Following the blue points, we observe an improvement in the precision of our auto-correlation measurements when increasing the λ RF,max parameter, having the optimal value at 1205 Å.The orange points show that the quality of these added points actually decrease for higher wavelengths, revealing that the improved performance is only driven by the inclusion of more information in our sample.We selected 1205 Å as our default value as a compromise between these two features.We used data from the whole EDR+M2 dataset for these measurements.11, in this case for values of λ RF,min .The optimal wavelength is found at λ RF,min = 1040 Å, while having similar features for the orange points: a decrease in the quality of the points added when approaching the emission line.Following the same arguments as in the case of λ RF,max , we decided to keep the limit at 1040 Å.Again, these measurements were performed using the EDR+M2 sample.ples: EDR, which mainly consists of SV data; and EDR+M2, which also includes the first two months of the main survey (M2) data.The EDR sample contains 68,750 quasars, of which 20,281 have valid forests for Lyman-α studies, i.e., with the Lyman-α region visible and identifiable.The EDR+M2 sample contains 318,691 quasars, of which 88,511 have valid forests.We release the Lyman-α fluctuations catalog for the EDR sample as part of the first DESI data release.
To achieve this measurement, we adapted the methodology proposed in previous analyses, especially the one outlined in du Mas des Bourboux et al. (2020), to suit the unique characteristics of the DESI early data sample.We discussed and applied several important modifications.
• We adjusted our pipeline to work with linear-spaced bins in wavelength to match the DESI scheme, preserving its format and precision.
• We adapted the re-calibration process by simplifying it, removing unneeded steps that did not improve the performance of the whole re-calibration.We also switched the recalibration region to allow for more pixels to be used in this process.
• We simplified the weighting scheme by removing terms that added unnecessary complexity and did not contribute to the accuracy of the results, specially for this small data release.
• By optimizing our diagonal weight matrix, we improved measurements of the auto-correlation by about 20%, and of the cross-correlation by about 10% The presented flux-transmission field catalog could be used for other studies apart from the standard BAO analysis.As in Font-Ribera et al. (2012);Pérez-Ràfols et al. (2018a,b), its cross-correlation with DLAs can be measured.A similar analysis could be performed with Strong Blended Lymanα (SBLA) absorption systems, as in Pérez-Ràfols et al. (2023).The full-shape analysis of the Lyman-α forest threedimensional correlation function would allow for the measurement of the Alcock-Paczyński effect, as performed in Cuceu et al. (2023) using eBOSS DR16 data.Furthermore, IGM tomography is also possible with this sample.
As datasets get larger and larger, we will need to be more careful with the analysis.Weights used to measure clustering could be improved in two ways.As discussed in Section 4.1, the implementation of a block-diagonal weighting scheme will account for correlations between pixels within the same forest.Alternatively, taking into account that quasar diversity makes the estimation of the continua more difficult, one possible way of improving the weights is adding to the weighting scheme proposed in Eq. 6 an extra term accounting for errors in the estimation in the continuum.This term would be dependent on λRF and would be larger around the emission lines, where quasar diversity is expected to be higher.Finally, the analysis could be extended into the Lyman-β region.
This study can serve as a solid foundation for future research using DESI data.We are excited about the potential for further investigations in this area, as more comprehensive analyses become possible with the availability of future DESI releases.
istry of Science and Innovation of Spain (MICINN), and by the DESI Member Institutions: https://www.desi.lbl.gov/collaborating-institutions.Any opinions, findings, and conclusions or recommendations expressed in this material are those of the author(s) and do not necessarily reflect the views of the U. S. National Science Foundation, the U. S. Department of Energy, or any of the listed funding agencies.
The authors are honored to be permitted to conduct scientific research on Iolkam Du'ag (Kitt Peak), a mountain with particular significance to the Tohono O'odham Nation.
There are three relevant changes in the way the analysis is performed.First, the re-calibration has been updated to account for the improvements offered by the new instrument.Second, the variance estimation process has been simplified.Third, the re-calibration region has been switched to C iii.We have already discussed the third change in Section 3.2, now we will describe the first two changes in detail.

A1 Re-calibration of spectra in SDSS
The improved estimation of pipeline noise (σpip) in DESI has eliminated the need for multiple re-calibration steps as it was performed in the SDSS analysis.
For SDSS analyses, re-calibration was performed in two steps, both of which were conducted in the same recalibration region.The first step was equivalent to the one used in this analysis (Section 3.2).After the first step, a second re-calibration step was added to further correct the reported variance of the pipeline.In this second step, the continuum fitting process was run again to obtain a new estimation of η.
Then, this new value of η is applied in the main Lyman-α fluctuations run to correct for the values of σpip: (A1) The extra re-calibration step in SDSS aimed to make the final estimation of η closer to 1, but it also added an unnecessary layer of complexity to the process with the sole purpose of doing so.However, it did not modify the product η • σ 2 pip,q , and as a result, the overall variance assigned to pipeline noise was unchanged.

A2 Variance estimator
In Eq 6, we have defined the model for our variance of the flux σ 2 q (λ).Its expression has been simplified from previous SDSS analysis, where this variance was modeled as: σ 2 q (λ) F Cq(λ) 2 = η(λ)σ 2 pip,q (λ) + σ 2 LSS (λ) + ϵ(λ) σ2 pip,q (λ) . (A2) The additional term was added to account for the observed increase in variance at high SNR, which is likely caused by the diversity of quasar spectra.
In Figure A1, we present the weight of each term in Eq.A2 contributing to the total variance, selecting only the highest SNR quasars.This shows that the effect of this added term is small, and for simplicity we decided to not include it in our analysis.The plot displays only the top 5 % higher SNR quasars since this subset is expected to have the largest contribution (the third term in Eq.A2 becomes larger).It is worth noting that the contribution is too small to be discernible when all the objects in the sample are considered.
However, it is important to take into account quasar diversity in future analyses.We plan to incorporate a term in Eq. 6 that considers the variance in the quasar's continuum.This term will be dependent on rest-frame wavelength, σ 2 C (λRF), and will also help us select the region of the quasar suitable for Lyman-α analyses, showing us how close the emission lines we can get.
Nevertheless, for this DESI EDR analysis, we decided to / 2 pip, q Figure A1.Contribution of each term in Eq.A2 to the full variance, for multiple wavelength bins.For this measurement we only used the 5% of quasars with the highest SNR in the EDR+M2 sample.This analysis reveals that the effect of the ϵ term in Eq.A2 is minimal, even for the data subset where it is supposed to be most significant.
omit this modification since its inclusion lacked sufficient justification, and its influence on the analysis is negligible.

Figure 1 .
Figure1.Distribution of wavelength and pipeline-reported error on the flux measurement for the larger EDR+M2 sample.This measurement is performed in the C iii region of the spectra, as defined in Table2.The solid lines mark the mean value of the error for a given wavelength, black for the EDR sample and white for the EDR+M2.Apart from the clear distinction in these two lines, a subtle division into two bands at larger wavelengths can also be observed.This is caused by the better signal-to-noise in the EDR sample, thanks to multiple re-observations of the same targets.We can also observe how the variance relatively increases in the two ends of the spectrograph and in the area affected by the collimator mirror reflectivity around 4400 Å.

Figure 2 .
Figure 2. Distribution of objects in the two samples used in this publication, compared with the final eBOSS DR16 sample presented in du Mas des Bourboux et al. (2020).Top: Redshift distribution of the three catalogs.Bottom: Sky distribution of the same catalogs.We observe that M2 makes a significant portion of the EDR+M2 sample.When compared to the eBOSS data, the EDR+M2 sample is approximately halfway to matching the number of objects in the former.

Figure 3 .
Figure3.Fraction of pixels masked due to DLAs (top) and BAL (bottom) features.As expected, the number of detected DLAs increase with redshift (and therefore with λ), yielding a fraction of masked pixels always below the 5%.For the BAL case, we show the masked fraction as a function of λ RF , which allow us to observe the strong wavelength dependence of the masking, associated with emission lines for different elements.This Figure used the full EDR+M2 dataset.

Figure 4 .
Figure 4. Weighted average of the flux-transmission field measured in the C iii region.This measurement has been performed without masking sharp features, and they can be clearly observed at different positions in the spectrograph.The three masked regions are specified in the plot, showing the wavelength range affected by the mask.Smooth features in this measurement can also be observed.Their impact can be corrected through the flux re-calibration (see Section 3.2).For this Figure we used the full EDR+M2 dataset.

Figure 6 .
Figure6.Number of pixels available for the different regions measured in bins of ∆λ = 55.58Å.Given the location of the different regions at different positions in the quasar spectra, the number of available pixels is different for each of them.This is because our spectral coverage lies in the range (3600, 5772) Å. Thanks to the larger number of pixels available for the C iii region, it was selected as the region to be used for the re-calibration process.This Figure used the full EDR+M2 sample.
Figure 7. Wavelength evolution of the fitted parameters η and σ 2LSS , measured in both the C iii and Lyman-α regions for the EDR (dashed) and EDR+M2 (solid) samples.Top: The pipeline error correction η is found to be larger for the EDR sample, caused by the worse estimation of the pipeline-reported variance for the first part of the SV program.Bottom: σ 2 LSS , in this case it is consistent between the two samples.As expected the C iii region shows a value close to 0 for all wavelengths, while for the Lyman-α region follows the expected increase in its intrinsic variance with redshift.In both η and σ 2 LSS , measurements for the EDR sample, the fitted parameters could not be obtained for the larger wavelength value due to the reduced number of pixels available, falling to the default values η = 0.1 and σ 2 LSS = 0.1.
Figure8.Measurement of the weighted mean of the fluxtransmission field for both the Lyman-α and C iii regions.The Lyman-α region shows an expected higher variance at all wavelengths compared to the C iii region, although it lacks smooth features thanks to the re-calibration process.Similarly, the EDR sample shows a higher variance than EDR+M2.In both cases, this is caused by the decrease in number of pixels available.Given the low number of pixels available at larger wavelengths for the EDR sample for the Lyman-α region, it departs from the expected unity behavior.

Figure 9 .
Figure9.Example of a high signal-to-noise quasar spectrum.The mean expected flux C(λ RF ) as in Eq. 4 is shown for the Lyman-α, S iv, C iv and C iii regions.The quasar has a redshift zq = 2.495 and is identified as a DESI object with TARGETID=39628443918272474.As in this example, we occasionally observe metal absorption in the calibration regions.

Figure 10 .
Figure10.Measurement of the Lyman-α auto-(solid) and cross-(dashed) correlation errorbars for different choices of the σ 2 mod parameter, scaled to a reference value at σ 2 mod = 1.The value of the errorbars was averaged over all scales, given the small scale dependency.The σ 2 mod parameter modified the inverse-variance weighting scheme as defined in Eq. 8, the use of a σ 2 mod ̸ = 1 allows us for the optimization of the weighting scheme.In both auto-and cross-correlations, we observe a reduction in the size of the errorbars when we approach the optimal value of the σ 2 mod parameter.This optimal value is slightly different, but in both cases is found around 7-8.In the optimal value, the improvement in the measurement of the auto-correlation is about 20%, and 10% for the case of the cross-correlation.Both measurements have been performed with the full EDR+M2 dataset.

Figure 11 .
Figure11.Comparison of the size of the errorbars in the Lymanα auto-correlation for different choices of λ RF,max .The blue points show the errorbars size after averaging over all scales (given the small scale dependence of this quantity), and scaled to a reference value at λ RF,max = 1205 Å.The orange points show the equivalent measurement but in this case for the product of errorbar sizes times the number of pixel pairs, removing the dependence on the number of pairs used in the measurement.Following the blue points, we observe an improvement in the precision of our auto-correlation measurements when increasing the λ RF,max parameter, having the optimal value at 1205 Å.The orange points show that the quality of these added points actually decrease for higher wavelengths, revealing that the improved performance is only driven by the inclusion of more information in our sample.We selected 1205 Å as our default value as a compromise between these two features.We used data from the whole EDR+M2 dataset for these measurements.

Figure 12 .
Figure12.Identical measurements as the ones displayed in in  this case for values of λ RF,min .The optimal wavelength is found at λ RF,min = 1040 Å, while having similar features for the orange points: a decrease in the quality of the points added when approaching the emission line.Following the same arguments as in the case of λ RF,max , we decided to keep the limit at 1040 Å.Again, these measurements were performed using the EDR+M2 sample.

Figure 13 .
Figure13.Distribution of aq and bq/aq parameters defined in Eq. 4 for the EDR+M2 sample.The aq parameter modifies the amplitude of the mean continuum C(λ RF ), while the bq/aq term introduces a modification that tilts it as a function of rest-frame wavelength.This last parameter relates to the intrinsic width of the spectral index.The faint long tail at large values of |bq/aq| is caused by short forests in the sample, where the continuum fit can be problematic.

Table 1 .
Number of Lyman-α quasars in the two samples, number of quasars showing BAL features and number of DLAs affecting forests from the Lyman-α quasars.The better signal-to-noise in the EDR sample allows for the detection of a higher number of DLA and BAL features.

Table 2 .
Statistics for the regions considered during the analysis, the region span is defined in the quasar rest-frame wavelength (λ RF ).The number of forests corresponds to the number of quasars whose spectra can be observed in the spectrograph.A minor number of forests are rejected due to low signal-to-noise ratio (SNR) or due to them being too short.
Figure 5. Bottom: Weighted average of the flux-transmission field measured at different regions of the spectra.Its value has been shifted to better distinguish the different regions.As opposed to the results shown in Figure