Time Variation of Fine-Structure Constant Constrained by [O III] Emission-Lines at 1.1

[O III]$\lambda\lambda$4960,5008 doublet are often the strongest narrow emission lines in starburst galaxies and quasi-stellar objects (QSOs), and thus are a promising probe to possible variation of the fine-structure constant $\alpha$ over cosmic time. Previous such studies using QSOs optical spectra were limited to $z<1$. In this work, we constructed a sample of 40 spectra of Ly$\alpha$ emitting galaxies (LAEs) and a sample of 46 spectra of QSOs at $1.09<z<3.73$ using the VLT/X-Shooter near-infrared spectra publicly available. We measured the wavelength ratios of the two components of the spin-orbit doublet and accordingly calculated $\alpha(z)$ using two methods. Analysis on all of the 86 spectra yielded $\Delta\alpha/\alpha=(-3\pm6)\times10^{-5}$ with respect to the laboratory $\alpha$ measurements, consistent with no variation over the explored time interval. If assuming a uniform variation rate, we obtained $\alpha^{-1}{\rm d}\alpha/{\rm d}t = (-3\pm6)\times10^{-15}$ yr$^{-1}$ within the last 12 Gyrs. Extensive tests indicate that $\alpha$ variation could be better constrained using starburst galaxies' spectra than using QSO spectra in future studies.


INTRODUCTION
proposed that fundamental physical constants may change with the evolution of the universe.The fine structure constant α is defined as: where e is the electron charge, ε0 is the permittivity of free space, ℏ is the Planck constant, and c is the speed of light.It is a fundamental physical constant charactering the strength of the electromagnetic interaction between elementary charged particles.It also quantifies the gap in the fine structure of the spectral lines of atom.This gap is proportional to the energy of main level by a factor of α 2 (e.g., Dzuba et al. 1999;Uzan 2003).Thus variation of α over time could be directly constrained by comparing the wavelengths of fine-structure splitting of atomic lines from another epoch with that from today.Possible variation of α over time has been studied with the help of measurements from laboratories, geology and astronomy (see review by Uzan 2003Uzan , 2011)).Among these, the studies involving astronomical spectra can provide possible variation of α with the longest lookback time and the greatest spatial spans, in which case trivial variation of α with spacetime might accumulate to a measurable level.
At z > 1, when the [O III] doublet is redshifted beyond the wavelength ranges of optical spectrometers, the studies on the variation of α using astronomical spectroscopy can be carried out using alkali doublet absorption lines in the ultraviolet spectra of QSOs (e.g.Bahcall et al. 1967;Murphy et al. 2001).With a principle similar to the [O III] doublet method, measurements of ∆α/α involving this alkali-doublet method have reached a precision of ∼ 10 −5 , slightly better than the [O III] doublet method and did not detect variation in α either.
The most precise astronomical limits on α variation till now were achieved using many-multiplet method (e.g., Dzuba et al. 1999;Webb et al. 2001).In this method, the wave number of a spectra line, in producing which the fine-structure splitting of an energy level is involved, can be expressed as ωz = ω0 +q1x+q2y (e.g., Dzuba et al. 1999), where ωz and ω0 are the wave numbers in vacuum at redshift z and at today, respectively, and x = ( αz α 0 ) 2 − 1, y = ( αz α 0 ) 4 − 1.The parameters q1 and q2 can be computed using relativistic many-body perturbation and experimental data, while their values for light elements and heavy elements can differ by a factor of tens to thousands.This great diversity of the values of q1 and q2 can amplify possible variation of α and then reveal it in the differences between the wave numbers of the spectral lines of different atoms and ions.Multiple absorption lines in the damped Lyα absorption systems in QSOs' spectra, including Mg II, Fe II, and others, were used.In practice, the absorption lines generally have complex profiles, and therefore, a fitting technique involving many Voigt profiles is used.All absorption lines in an absorption system are fitted simultaneously, in which every line is decomposed into multiple Voigt components, and the number of free parameters is minimized by linking the physically related absorption components of different lines.Some early works claimed to have found evidence of significant differences between the past and present values of α (e.g., Webb et al. 2001;Murphy et al. 2003).However, subsequent works (e.g., Murphy et al. 2007;Griest et al. 2010;Whitmore & Murphy 2015) indicated that the early results were affected by wavelength distortion.After taking the wavelength distortion into account, later works, including those using QSO spectra with resolutions of 40, 000 ∼ 60, 000 aiming at several to hundreds of absorbers at 0.2 < z < 4 (e.g., King et al. 2012;Evans et al. 2014;Songaila & Cowie 2014;Murphy & Cooksey 2017) and those using spectra of bright QSO HE 0515-4414 with resolutions up to 145,000 (e.g., Kotuš et al. 2017;Milaković et al. 2021;Murphy et al. 2022a), measured ∆α/α with precisions of (0.8 ∼ 4) × 10 −6 and did not detect variation in α over time.Recently, Murphy et al. (2022b) applied the many-multiple method to the absorption spectra of neighbouring stars within 50 pc, finding ∆α/α consistent with 0 with a precision of 1.2 × 10 −8 .
Though having archived great precision, possible problems lurking in works using the many-multiplet method have been pointed out (see a review by Webb et al. 2022), which may emerge from the techniques for correcting wavelength distortion (e.g., Dumont & Webb 2017), the physically assumed linking that the Voigt fitting technique relies on (e.g., Levshakov 2004), the technical details of the fitting process (e.g., Bainbridge & Webb 2017;Lee et al. 2023), the unconsidered systematic errors (e.g., Lee et al. 2021), and other.These problems can lead to biases in the best-fitting ∆α/α values or to underestimations of their errors, and some of the problems lack clear solutions despite much effort.Anyhow, the results of the many-multiplet method need to be tested with entirely different methods.The [O III] doublet method relies on fewer assumptions and suffers from fewer systematic errors.No assumptions on chemical composition, ionization state, and distribution of energy levels are required, because the [O III] doublet lines originate in the downward transitions from the same upper level of the same ion.Also, there is no need to decompose one [O III] emission line into multiple components by assuming that it originates from multiple clouds, as was usually the routine in the study of absorption lines, for the [O III] doublet lines must have the same profile.The [O III] doublet method is more tolerant of the wavelength distortion because of the small wavelength range used in the measurement (doublet line interval 47.9(1+z) Å).At present, although the precision of the ∆α/α measured by the [O III] doublet method is much lower than that measured by the many-multiplet method, improvement in the future can be achieved by using more spectra of QSOs and starburst galaxies with better quality.In addition, current measurements based on absorption lines were limited to z < 7.1 (Wilczynska et al. 2020).Studying the α variation using absorption lines at higher redshifts is challenging because QSOs are extremely rare in the early universe, and the normal galaxies' continua are too weak.Fortunately, many starburst galaxies have been discovered in the early universe, and their [O III] doublet or other doublet emission lines can be used.
In studies of α variation using the [O III] doublet method, works with large sample sizes (N > 20) have used only spectra of QSOs, not starburst galaxies.This may be because QSOs have higher [O III] luminosities, or it may be related to the observational strategy of the SDSS project.In reality, starburst galaxies far outnumber QSOs in the universe.They have narrower [O III] emission lines than QSOs, which is advantageous for improving the accuracy of ∆α/α measurement and may compensate for their disadvantage of lower [O III] luminosity.If this can be proved, the number of available spectra could be significantly increased by including starburst galaxies in studies of α variation, and the precision of the final results improved.In addition, previous works only used optical spectra (wavelength < 1 µm), and hence the ∆α/α measurements were limited to z < 1. Applying this method to z > 1 demands infrared spectroscopy.
The XShooter (Vernet et al. 2011) is an intermediateresolution echelle spectrometer mounted on UT2 of the Very Large Telescope (VLT) since 2009.Entering celestial radiation is split into three arms optimized for the UVB,VIS and NIR wavelength ranges by dichroic mirrors.Each arm has an echelle and a detector, and the wavelength ranges of the UVB, VIS and NIR arms are 299-556, 534-1020 and 994-2478 nm, respectively.By splicing the spectra from the three arms, a continuous wavelength coverage from 300 to 2470 nm can be achieved.The XShooter can work under the long-slit or the integral field unit modes.For the long-slit mode, slits with widths of 0.4 to 1.5 ′′ are available in the NIR arm.The slits with widths of 0.6, 0.9 and 1.2 ′′ are often used, producing NIR spectra in resolutions of 8030, 5570 and 4290, respectively.After an observation finishes on the Xshooter, the raw data will be processed automatically by the pipeline (Modigliani et al. 2010), giving the extracted spectra of the targets.Overall, the XShooter has a unique wavelength coverage of up to 2.47µm with a moderate spectra resolution, and it is highly sensitive due to the large aperture of the VLT and the high efficiency of the spectrometer, not to mention the pipeline makes accurate and reliable wavelength calibration.Thus, spectra taken by the XShooter are suitable for studies of the variation of α via the [O III] doublet method at z > 1.
In this work, we measured the α variation by the [O III] doublet method using XShooter spectra of a sample of Lyman-α emitters (LAEs) and QSOs.LAEs are distant galaxies selected by their strong Lyman-α emission lines, among which most are starburst galaxies, and few are obscured active galactic nuclei.We achieved a measurement at 1.1 < z < 3.7 for the first time with this method.In addition, we found that α variation can be better constrained by using LAE spectra than QSO spectra.Throughout the paper, we assume a ΛCDM cosmology and use the cosmological parameters obtained by Bennett et al. (2014), which are H0 = 69.6 km s −1 Mpc −1 , ΩM = 0.286, and ΩΛ = 0.714.

Data reduction
We collected information on the normal, the large, the small, the ToO and the DDT XShooter programs in period 84 − period 104 between 2009 and 2020.Projects related to LAEs and QSOs at 1 < z < 4 and observed in the long-slit mode were selected.The redshifts of the included targets were then determined roughly by inspecting their optical/NIR spectra visually.Although these rough redshifts were only used for selecting targets, their accuracy was guaranteed later by the more precise redshifts obtained in further analysis by fitting the spectra.The uncertainties of the rough redshifts are on the 0.01 level, which is sufficient for selecting reliable targets at this stage.Targets with redshifts between 1.07 and 3.77 were selected, ensuring the NIR spectra cover the wavelength range of 4800-5200 Å in the rest frame.
We obtained the two-dimensional (2D) and onedimensional (1D) NIR spectral data of the selected targets from the ESO database 1 , which had been processed by the XShooter pipeline (version is related to the observation time).These data are referred to as pip-2D and pip-1D hereafter.The pip-2D data underwent processes including wavelength calibration, CCD cleaning, target tracking and straightening, combining different exposures, removing cosmic rays, and merging spectra from different orders.They had also been interpolated to a wavelength grid with a fixed interval of 0.6 Å.Based on the pip-2D data, the pip-1D data underwent spectral extraction and flux calibration.Instead of using the pip-1D data directly, we used the pip-2D data to extract the final 1D spectrum.In this section, we briefly introduce the typical extraction approach and display more details and the treatment of some exceptional cases in Appendix A.
Dither mode was used in most observations, and under this mode, a target will experience several consecutive exposures.The pipeline combines the data from the group of multiple exposures of a target, leaving three parallel images of the target in its pip-2D data, in which one image in the middle has positive flux, and the other two have negative fluxes.The extraction of such pip-2D data of a target is as follows.First, we determined the aperture for extraction by fitting the brightness profiles of the three images at the spatial direction of the 2D data.Second, we extracted the spectra of the three images and combined them.Third, we made flux calibration and telluric correction for the combined spectrum.Finally, we identified the unreliable pixels and assigned marking masks for them.Pixels seriously affected by sky emission lines (SELs) or telluric absorption lines (TALs) were masked in this step, and Figure 1 shows the examples.In addition, if a target has more than one group of exposures in a project  and hence several spectra were extracted, we would combine these extracted spectra into one.We extracted 95 LAE and 601 QSO spectra (including obscured QSOs).Note that if one target was observed in different projects, we extracted a spectrum for each project.

Spectral Analysis
Aimed to build a sample with spectra that have strong and clean [O III] doublet lines, we fit the [O III] doublet lines and their adjacent regions in each spectrum for the present.The fitting model includes one component for the continuum and one for the [O III] doublet.The former component is a linear function for an LAE and a second-order polynomial for a QSO.We did not set a separate component for the Hβ broad emission line in a QSO spectrum because it has been included in the second-order polynomial.For the latter component, a sum of multiple Gaussian components was used for each [O III] line.The λ4960 and λ5008 lines contain the same number of Gaussian components, and for each Gaussian component, we set: where wi,5008, σi,5008 and fi,5008 are the central wavelength, standard deviation and flux of the ith Gaussian component of the λ5008 line, and those with a subscript 4960 are the corresponding parameters of the λ4960 line.The definitions of η and A followed Bahcall et al. (2004), where 1 + η is the ratio of the wavelengths of the doublet, and A is the ratio of the fluxes of the doublet.We fixed η as the theoretical value η0 in this stage to distinguish the doublet lines when blended or avoid misidentification of the λ4960 line when weak.We adopted the vacuum wavelengths of 4960.295 and 5008.240Å for the doublet, which were taken from the NIST Atomic Spectra Database 2 , and hence η0 = 0.00966576.For spectra with > 10σ detections of both doublet lines, fixing η to η0 generally leads to an increase of the signal-to-noise ratios (SNR) of the λ4960 by <20%, and an increase of that of the λ5008 by <5%.Here an SNR was calculated as the flux of a line divided by its error.
The rest-wavelength ranges for fitting are different for LAEs and QSOs.In most cases, the rest-wavelength range for LAEs is 4900-5100 Å and 4900-5140 Å for QSOs.When fitting some QSOs' spectra, the result contains Gaussian components with FWHMs of several 10 4 km s −1 , which lack astronomical interpretations and are actually caused by a miscalculation of the continuum.In these cases, we adjusted the red end of the range to 5160, 5180 or 5200 Å to define the continuum component better.For QSOs with broad and blueshifted [O III] emission lines, such as the obscured QSOs with strong [O III] outflows in the sample of Zakamska et al. (2016), we adjusted the blue end of the range to 4880 Å to decompose the continuum and the λ4960 line effectively.For any spectrum, pixels with masks were excluded from the fitting.
We fit the spectra by minimizing χ 2 .First we tried one Gaussian component for each line, and then gradually added more Gaussian components.One component more will lower the χ 2 and the degree of freedom by 3 at the same time.Supposing the fitting were improved after adding a component, we calculated the confidence probability using the F test: where χ 2 and dof are the chi-square and the degree of freedom after adding the component, ∆χ 2 and ∆dof are the decreased amounts, and ∆dof is 3.We calculated the probability p(F, dof, ∆dof ) corresponding to this F value.If p is greater than 99.99%, we would add this component, and vice versa.This threshold of p corresponds to ∆χ 2 of 20-30 for most spectra (dof is 1000-1500).Figure 2 shows an example of how we gradually added the Gaussian components.All the doublets in the spectra can be fitted with no more than 4 components, while when 4 components have been used, adding one more only improves the fitting slightly but raise difficulty in finding the best solution.Bahcall et al. (2004) indicated that the SDSS pipeline overestimated the errors for lower flux levels(see their section 4.4).In each panel, we display the observed spectrum (black), the continuum model (green), the sum of continuum and emission line models (red), the residual (blue), and the observed errors (cyan).Note that we only show spectra and model in 4900-5100 Å while the wavelength range of fitting is 4900 to 5140 Å.
This situation also happened to the Xshooter pipeline.Therefore we corrected the error following Bahcall et al. (2004).If the fitting's reduced chi-square (χ 2 /dof ) is less than 1 when the original error is used, we multiplied the error of all pixels by a correction factor (equal to the square root of the reduced chi-square) and redid the fit.We corrected the error for 43% of the spectra, and the correction factors in our final sample are 0.6-1.0.This correction decreases the errors of model parameters and increases the SNRs of [O III] doublet lines.After correction, the SNRs are more consistent with visual estimates (e.g., 3-5 for a weak detection or > 7 for a clear detection).

Sample Selection
Our criteria for sample selection include the following aspects.
First, we required the SNRs of both the λ4960 and λ5008 lines to be above 10.
Second, we required that the [O III] doublet in a spectrum hardly bears any SELs, TALs, cosmic rays or bad CCD pixels.Although the pixels affected by these factors had been masked in Section 2.1 and were excluded from spectral analysis, these factors can still cause problems because automated programs may fail to mask all affected pixels when the effects are severe.In addition, for an emission line with half the pixels masked, including those near the peak, setting η as a free parameter (as we will do in Section 3) may lead to mistaken identification of the peak.We calculated a mask index I mask for each line to describe these effects quantitatively.As shown in Figure 1, this index was calculated as the sum of the fluxes of the masked pixels (grey region) divided by the total flux of the line.After visual inspections of the effects mentioned above, we required that the sum of I mask of λ4960 and λ5008 lines be below 0.5.The LAE spectra excluded by this criterion were mainly due to SELs, while the QSO spectra excluded were mainly due to TALs.
Third, we excluded QSOs with [O III] doublet lines severely blended.If so, the decomposition of the doublet may be wrong by setting A as free parameters if that by fixing A at theoretical value treated as correct.A blending index I blend was calculated to quantify the blending, as is shown in Figure 3(b).On the multiple Gaussian models of the [O III] doublet lines, we measured the flux of the highest point near 4960 Å as f1 and the flux of the lowest point in the range between 4960-5008 Å as f2.I blend was set to be the ratio between f2 and f1.According to visual inspections, to ensure the doublet lines be properly decomposed, we required that I blend is below 0.2.The QSO spectra excluded by this criterion all have broad and blueshifted [O III] emission lines.The decomposition by fixing A at theoretical value show that the [O III] FWHMs of these excluded QSOs are 800-4500 km s −1 , significantly greater than the FWHMs of the final sample, which has a median value of 610 km s −1 .
Finally, we excluded QSOs with strong Fe II bumps.The Fe II bump between 4900 and 5050 Å may lead to a severe systematic error when fitting the continuum with a simple second-order polynomial.To measure the intensity of the Fe II bump, we calculated a Fe II index IFeII as follows.As shown in Figure 3(a), we measured I4590 and I5250, which represent the intensity of the Fe II bumps around 4590 Å (Fe II λ4590) and around 5100-5400 Å (Fe II λ5250).We inferred the continuum below the Fe II bumps using the spectra in the green shade in Figure 3(a): the wavelength ranges for fitting the continuum are 4420-4460 Å and 4720-4760 Å for Fe II λ4590, and 5060-5100 Å and 5400-5440 Å for Fe II λ5250; the model used was a power law; the best-fitting model for the continuum is expressed as fcon(λ), as is shown by the blue line in Figure 3(a).We fit the continuum-subtracted spectra in 4460-4720 Å and 5100-5400 Å using cubic spline functions whose nodes have a uniform interval of 20 Å, and the fitting result is expressed as fFeII(λ).Hence we calculated: reliably measured because of the telluric absorption bands.
For uniformity, if I5250 can be reliably measured, then we set IFeII = I5250.And if not, we set IFeII = 1.3×I4590.The coefficient 1.3 here was chosen to be the ratio between I5250 (16 Å) and I4590 (12 Å) measured on the Xshooter QSO composite spectrum (Selsing et al. 2016).We excluded QSOs that meets the following criteria: IFeII > 10 Å and where The final sample passing these criteria contains 86 spectra, including 40 spectra of 32 LAEs and 46 spectra of 45 QSOs.The observing information of these 86 spectra is listed in Table 1 (the LAE subsample) and Table 2 (the QSO subsample).Six of the LAE spectra come from a galaxy at z = 2.37 named "Sunburst arc" (SA, Dahle et al. 2016;Rivera-Thorsen et al. 2017), which is highly magnified by a gravitational lens (magnification > 20).We refer to these six spectra as SA1 to SA6.LAE J0332-2746 has three spectra, J0217-0502 has two spectra, and each of the other 29 LAEs has only one spectrum.QSO J1313-2716 has two spectra, and each of the other 44 QSOs has one spectrum.
For each spectrum, we measured the peak wavelength of the [O III] λ5008 line and calculated a [O III] redshift as the approximation of the systematic redshift.For typical QSOs, this would cause an error of <0.001 (<300 km s −1 ) except for a small fraction of QSOs called "blue outliers" (Marziani et al. 2016), but those blue outliers generally have blended [O III] doublet and would not meet our criteria and hence would not be included in our sample.We also measured the luminosity of [O III] λ5008 in each spectrum.The statistical errors of the luminosities are <6% , corresponding to <0.03 dex in logarithm.Changes in the slit width, the seeing condition and the atmospheric transmittance can cause systematic errors in the luminosities, and we estimated them to be 0.1-0.2dex based on the multiple observations of some targets.These redshifts and [O III] luminosities of the spectra are listed in Table 1 and Table 2 and are displayed in Figure 4. We found that all the targets are distributed in three redshift intervals, 1.09 < z < 1.66, 2.01 < z < 2.57, and 3.21 < z < 3.74, corresponding to the observed wavelengths of the J, H and K bands respectively.Most of the LAEs are in the redshift range corresponding to the H-band, and this may be caused by the methods with which the LAEs were selected.The [O III] luminosities of the QSOs are in the range between 10 42.8 and 10 44.8 erg s −1 , and have a median value of 10 43.7 erg s −1 .The [O III] luminosities of the SA spectra are in the range of 10 43.6−44.4erg s −1 , and those of other LAEs are in the range between 10 42.1 and 10 43.2 erg s −1 , and have a median value of 10 42.7 erg s −1 , which is one order of magnitude lower than that of QSOs.
We measured the FWHMs of the [O III] doublet lines in

MEASUREMENTS OF EMISSION-LINE WAVELENGTHS
Following Uzan (2003), we calculated the variation in the fine structure constant as: where λ1 and λ2 are the wavelengths of λ4960 and λ5008 lines, respectively.Those with a subscript of 0 are the wavelength values at z = 0 from laboratories, and those with a subscript of z are the observed values of the doublet from a target's spectrum at a redshift of z.Equation ( 5) can be rewritten as: where η ≡ (λ2/λ1 − 1)z, η0 ≡ (λ2/λ1 − 1)0 = 0.00966576.The Taylor expansion of equation ( 6) around η0 gives: We will show later that for our sample, |η −η0| is less than 3× 10 −4 .In this case, ignoring the quadratic and the high-order terms only results in a relative error of ∆α/α no more than ∼ 10 −4 .Thus, we calculated the change in the fine structure constant and estimated its error with: In this way, the measurement of ∆α/α can be converted into the measurement of η.

Two Methods to Measure η
We measured η using the following two methods.The first is called the Multiple-Gaussian (MG) method.The λ4960, λ5008 lines were fitted simultaneously with the model expressed as equation ( 2).The fitting is basically the same as that described in section 2.2, and the only difference is that η is now a free parameter instead of the fixed value of η0.The wavelength ranges and the number of Gaussian functions are consistent with those used in section 2.2.We fit the spectra using a Levenberg-Marquardt technique with the MPFIT package.We set the initial parameters as those obtained in section 2.2.We took the values of η and A yielding the minimum χ 2 (expressed as χ 2 min ) as the measurement result and calculated the statistical errors σstat(η) and σstat(A) according to the range of η and A yielding χ 2 (η, A) < χ 2 min + 1.An example of measuring η and A using the MG method is shown in the upper row of Figure 6.
The second is called the Profile-Matching (PM) method.This method is based on the fact that the [O III] doublet originates from the same upper energy level and should have identical line profiles.The analysis approach is similar to that adopted by Bahcall et al. (2004).After subtracting the bestfitting continuum model from the observed spectrum, we obtained the profiles of λ5008 and λ4960 emission lines.We moved the λ5008 line leftward (λ divided by 1 + η) and decreased the amplitude (f λ divided by A 1+η ).Theoretically, at some specific values of η and A, the adjusted λ5008 line should be the same with the λ4960 line, in which case they can be regarded as two samplings from one profile beneath.So for a pair of values of η and A, we fit the adjusted λ5008 line and the λ4960 line simultaneously with a cubic spline function, from which a χ 2 value was obtained.With different values of η and A yielding different values of χ 2 , we obtained a χ 2 surface in the 2D-parameter space.The η and A yielding the minimum χ 2 were taken as the result, and their statistical errors were obtained using χ 2 (η, A) < χ 2 min +1.In the matching, we used the pixels in the following wavelength ranges: the range for the λ5008 is where the flux exceeds 20% of the peak flux (we use the flux obtained from the best-fitting Gaussian models to avoid random fluctuations), while the range for the λ4960 is the one of λ5008 divided by 1 + η.The pixels with masks were excluded.When η changes, the wavelength range of the λ4960 line changes accordingly, and the degree of free-dom may change when the λ4960 line contains masked pixels.If so, we slightly adjusted the wavelength range to keep the degree of freedom constant.An example of measuring η and A using the PM method is shown in the bottom row of Figure 6.
For all the 86 spectra, the χ 2 (η, A) surfaces obtained by the MG method look good: they are smooth, and their contours are close to ellipses in the vicinity of the minimum χ 2 , as shown in Figure 6(c) and 6(d).This holds in the range of χ 2 < χ 2 min +1 for all spectra and in the range of χ 2 < χ 2 min +10 for most spectra.Therefore, the measurements on η, A and the corresponding statistical error are reliable.However, only a small fraction of χ 2 surfaces obtained by the PM method look good, and we showed two bad examples in Appendix B. The PM method works well only for spectra with narrow [O III] doublet lines that have high SNR and almost bear no bad pixels.We visually checked the χ 2 surfaces and selected 16 LAE spectra and 1 QSO spectrum with which η can be reliably measured.
The consistency between the two methods was checked by the measurements of η from the 17 spectra.The η values are showed in Figure 7(a), and their statistical errors σstat(η) in Figure 7(b).The values of σstat(η) obtained from the two methods differ by less than 12%.To check whether the η values were consistent, we calculated deviation ∆ as: The distribution of ∆ is shown in Figure 7(c).The average value is close to 0, and their absolute values are no more than 1.12.These indicate that the results from the two methods agree with each other.As the MG method works well for all the spectra in our sample, we adopted the measurements using it as the final result.The best-fitting Multiple-Gaussian models are displayed in Appendix C. The number of Gaussian functions, the best estimates and the statistical errors of η and A are listed in Table 3, 4 and 5.As the measurements of the six spectra of SA enormously contributed to the final results, we also list the estimates of η and A using the PM method in Table 3 for comparison.

Uncertainties from the Wavelength Calibration
The wavelength calibration for the XShooter NIR Arm has a systematic error of ∆λ = 0.04 Å (recorded in the Xshooter pipeline, Modigliani et al. 2010).As η ≡ (λ2/λ1 − 1), assuming that the wavelengths of λ4960 and λ5008 lines in the observer's frame both have a systematic error of ∆λ = 0.04 Å, we obtained: The total error σ(η) can be calculated as: The added systematic errors will be on the order of 10 −6 ∼ 10 −5 .We checked whether or not these systematic errors should be considered using 17 spectra yielding σstat(η) < 10 −5 .If η values from the 17 spectra were not to suffer from the systematic errors and their current statistical errors were enough to be responsible for their scatter around η0, the values of (η−η0)/σstat(η) would obey the standard normal distribution with the standard deviation around 1.However, using the actual measured η, we obtained the standard deviation of (η − η0)/σstat(η) to be 1.28, indicating that the probability of the current statistical errors being account for the scatter of η around η0 to be only 4.7%.Therefore, the systematic errors caused by the uncertainties of raising from the wavelength calibration should be considered.After considering the systematic errors, the standard deviation becomes 1.04, consistent with the prediction of a standard normal distribution.Therefore, involving the above systematic errors is reasonable, and we list the systematic errors of η for all spectra in Tables 3, 4 and 5.

Robustness of The Measurements
Results from the final sample were checked statistically.We first checked if η correlated with A or not.The distribution of η and A measured from the 86 spectra is shown in Figure 8.The Pearson correlation coefficients of the LAE subsample, the QSO subsample and the whole sample are −0.13,−0.15 and −0.11, respectively, indicating no correlation.This is consistent with theoretical expectations, as Bahcall et al. (2004) discussed.Using a bootstrap method described in Section 4.1, we calculated the average values of A of the LAE and the QSO subsamples, which are 3.03 ± 0.03 and 3.05 ± 0.06, respectively.These values are consistent with 2.99±0.02measured by Bahcall et al. (2004) and 2.96 ± 0.02 measured by Albareti et al. (2015) within 3σ, and are also consistent with the theoretical value of 2.98 (Storey & Zeippen 2000).
We then checked whether the measured σstat(η) is related to the SNR and FWHM of the [O III] lines as expected.
As can be seen in Figure 9, σstat(η) is inversely related to SNR4960 for similar FWHM and is positively related to FWHM for similar SNR4960.We verified this finding through numerical simulations, as detailed in Appendix D. Numerical simulations give quantitative correlations, as expressed in formulas D1 and D2.In brief, σstat(η) is roughly inversely proportional to SNR4960 and positively proportional to FWHM.
In Figure 9, we show the relation with FWHMs of 130 and 610 km s −1 , which are the median FWHM values of LAE and QSO subsamples, respectively.For both the subsamples, the measurements are in good agreement with the theoretical expectations with scatters less than 0.4 dex.We finally checked whether χ = (η − η0)/σ(η) obeys the standard normal distribution or not, which should be the case if the true value of all η were η0 and σ(η) could be account for the uncertainty in the measurement of η. Figure 10 shows the distribution of χ.The mean value and standard deviation of χ and the mean value of χ 2 are 0.11, 1.02 and 1.04, respectively.These values are consistent with the theoretical values with a sample size 86, which are 0 ± 0.11, 1 ± 0.08 and 1 ± 0.15 (68.3% confidence level), respectively.We made a Kolmogorov-Smirnov (KS) test.We calculated a KS statistic value of 0.11, corresponding to a probability of 0.21, >0.05.Thus, we concluded that the distribution of χ obeys the standard normal distribution.

Averages
We calculated average values for the final sample and different subsamples using two methods.The first is the weighted average method (WM).For values xi = (∆α/α)i and errors σi, we calculated the weight as: Thus, the weighted average is: The size of the dots represents the error of η (larger for greater error).We did not show the error of A, but those with greater error of η generally have greater error of A.  and (−3±5)×10 −5 using WM and BS methods, respectively.The two values are consistent and indicate no deviation between ∆α/α and 0 in the 6 × 10 −5 error level.We calculated the averages of the LAE and QSO sub-samples and further divided the former into SA and other LAE sub-samples and calculated the averages.For each subsample, the averages by both two methods agree with 0. We found that for the SA subsample, the error of average by the BS and WM methods have a great difference, while for other subsamples, the difference is slight.This may be because the SA subsample has a small size N .Because we have shown that the errors σ(∆α/α) are estimated reasonably, we accept the results by the WM method as the final results.
The above results are based on η measured by the MG method.As described in section 3.1, there are 17 spectra with which η can be measured by the PM method, including 6 spectra of SA.The averages of ∆α/α measured by the PM method are shown in Table 6, which are also consistent with  0. This indicates that the coincidence of ∆α/α and 0 does not depend on how we measured η.
Due to telluric absorption bands, the whole sample can be naturally divided into three subsamples in three redshift ranges.The averages of ∆α/α of the three subsamples, listed in Table 6, are all consistent with 0. The errors of these averages are 3 × 10 −4 , 6 × 10 −5 , and 5 × 10 −4 , respectively.Because the results obtained from the spectra of SA (z = 2.37) have high precision, one may worry that the average value in the redshift range of 2.01 < z < 2.57 is greatly affected by SA and cannot be treated as the representative value of this redshift range.Thus, we calculated the average of ∆α/α obtained from other 54 spectra in this redshift range, and the result agrees with 0 with an error of 9 × 10 −5 .Figure 12.Tests of whether α changes over time.The upper panel shows the result using only our sample, and the bottom panel shows the result using both our sample and the measurements using SDSS QSO spectra by Albareti et al. (2015).The best-fitting model assuming that α varies at a uniform rate over time is shown in red, and the 1σ confidence interval is shown in green shade.

Variation over Time
We do not find any variance of ∆α/α over time because it agrees with 0 in all the redshift ranges.We limited the rate at which α may vary, which can be used to constrain cosmological models further.The redshifts of LAEs and QSOs in our sample are in the range of 1.097-3.735,corresponding to look-back time (tLB) of 8.2-12.0 billion years, or the age of the universe of 1.7-5.5 billion years.We assumed that α varies uniformly over time from then to now: The k here means the same as α −1 dα/dt in Bahcall et al. (2004).We fit the ∆α/α measurements of all the 86 spectra, and obtained k = (−3 ± 6) × 10 −15 yr −1 .
We then included ∆α/α measured by Albareti et al. (2015) using SDSS QSO spectra (see their Table 3) into the rate limitation.Note that we did not use the data in the redshift bin 0.580-0.625 in their results because it may be seriously affected by the SELs.If only using measurements at z < 1, the result is k = (2 ± 5) × 10 −15 yr −1 .And by combining the measurements at z < 1 and our measurements at 1.0 < z < 3.8, we obtained k = (0 ± 4) × 10 −15 yr −1 .This is by far the most precise limitation of the rate using the [O III] doublet method.Moreover, our results extend the limitation to an era when the universe is only 2 to 5 billion years old.

Considerations on Selecting LAE and QSO for α Measurement
We discuss the reliability and necessity of our final sample selection criteria.First, we only selected spectra where SNRs of both the [O III] doublet lines are greater than 10.An SNR of >10 guarantees that the contours of the χ 2 (η, A) surface near the minimum value are close to ellipses, and hence, the probability density distributions of η and A are close to Gaussian, which is convenient for subsequent analysis.In addition, the η measured from spectra with low [O III] SNR have large errors and hence have small weights when calculating average and rate.Therefore, discarding them hardly affects the precision of the final results.Some previous work studying α variation with optical QSO spectra used the SNR of λ5008 as a criterion for sample selection.We suggest the SNR of λ4960 line to be included in the criteria when using NIR spectra, because it is not strictly correlated with the SNR of λ5008 line (Figure D1) due to the influence of telluric lines.
Second, we added masks to pixels affected by SELs, TALs, cosmic rays and bad CCD pixels, and excluded these pixels when measuring line wavelengths.In addition, we excluded spectra where many pixels near [O III] doublet are affected, using a mask index I mask for quantification.Note that this approach differs from previous works using optical spectra, where pixels affected by SELs and TALs were included in the measurements but were given lower weights due to larger errors.In analysis concerning NIR spectra, systematic errors caused by corrections for SELs and TALs must be addressed, and therefore, the errors of the pixels where strong SELs and TALs are located may be significantly underestimated.Fortunately, the VLT/XShooter spectra have high spectral resolutions, so the impacting ranges of single SEL or TAL are narrow.For a typical XShooter spectrum, the proportion of pixels strongly affected by SELs or TALs is generally between 10% and 20%, except for wavelength ranges where telluric absorption lines are dense.Most spectra can pass the criterion that the sum of the I mask of the doublet lines is less than 0.5.Even if the affected pixels are directly masked for these spectra, the remaining pixels are sufficient to achieve reliable wavelength measurements.
Finally, we had two additional requirements for QSO spectra compared with previous works.One is that there should be no strong Fe II bump, and the other is that the doublet lines should not blend severely.For these two requirements, we measured Fe II index IFeII and blending index I blend as the criteria.We tested whether the two criteria are reasonable by examining the spectra they excluded.For a continuumsubtracted spectrum, we obtained the profiles of the doublet emission lines in wavelength ranges corresponding to a velocity range of −600 to 600 km s −1 .We made linear interpolation for the profile of λ5008 to align it with the profile of λ4960 in velocity space, and then calculated the Pearson correlation coefficient between them.This coefficient represents the consistency of the shapes of the doublet lines.In Figure 13, we show the coefficients of QSO spectra excluded by the Fe II or the blending criterion, those of QSO spectra in the final sample, and the A values measured using these spectra.We only show the results of spectra with SNR5008 > 50 to ensure the accuracy of the parameters.We found that the doublet profiles have a good correlation in the final sample as the coefficients have a median value of 0.95 and are all greater than 0.87, and the measurements of A are all around the theoretical value of 3.However, the correlation between the doublet profiles is not as good in QSO spectra excluded by the Fe II criterion, as the coefficients have a median value 0.83.Also, a large A value of >4 is measured for more than half of these excluded spectra.The likely reason is that the Fe II bump under the [O III] doublet causes the continuum to be fit improperly.The correlation can be poor in QSO spectra excluded by the blending criterion as more than half of coefficients are <0.5, and the measured values of A spread over a wide range.The likely reason is the mistaken decomposition of the very broad [O III] emission components.The anomalies in the correlation coefficients and measurements of A indicate that these spectra are unsuitable for the measurements of doublet wavelengths.Hence, our criteria are necessary and practical.According to the initial redshift limit of 1.07 < z < 3.77, 95 LAE spectra and 601 QSO spectra were selected.However, the final sample only contains 40 LAE and 46 QSO spectra.The passing rate of LAE and QSO subsamples are 42% and 8%, respectively.About 20% of the LAE spectra are excluded because of SELs: in some cases, the mask index is too large, and in other cases, the SELs result in insufficient SNR of [O III] .This fraction is similar to the fraction of pixels strongly affected by SELs, which is 10%-20% as we previously estimated.
QSO spectra have a lower pass rate.A few spectra are excluded due to observational factors, such as short exposure time causing insufficient [O III] SNR or inappropriate redshift causing [O III] to fall into the telluric absorption bands.These factors are similar for QSOs and LAEs.However, more than half of the spectra are excluded because of the QSOs' intrinsic properties, such as the weak [O III] relative to the continuum, the blending [O III] doublet, or a strong Fe II bump.The composite of the 46 QSO spectra in the final sample, displayed in Figure 14, differs from that of general QSOs.To demonstrate this difference, we also present other QSO composite spectra for comparison, including that of Selsing et al. (2016) using XShooter data of 1 < z < 2 high-luminosity QSOs, and that of Vanden Berk et al. (2001)  Table 7.Comparison between our QSO sample and QSO composite lines in their composite spectrum can reach 31.9Å, much higher than those in the other two composite spectra (Table 7).Also, they have weaker Fe II bumps as the Fe II indexes are smaller.Thus, QSOs suitable for α measurement may belong to a particular type, accounting for a small proportion of the total number of QSOs.This may be the main reason for the low pass rate of QSO spectra.Assuming that the proportions of LAEs and QSOs excluded due to observational factors are similar, with a pass rate of ∼ 40%, we estimated that only 20% of the QSO spectra are suitable for α measurement.

Implications for Future Works
We suggest that LAE spectra have more advantages than QSO spectra studying of α variation using the [O III] doublet method.The reasons are as follows.
First, the [O III] doublet lines are narrower in LAE spectra than those in QSOs' spectra, as the median values of the [O III] FWHM in the LAE and QSO subsamples are 130 and 610 km s −1 , respectively, with a difference of 4.7 times.The value of the FWHM affects the statistical error of η, as narrated in Appendix D, where σstat(η) ∝ FWHM 3/2 for the same flux.Therefore, a difference of 4.7 times in the values of FWHM would lead to a 10 times difference in σstat(η) if other conditions were the same.In our sample, the resultant precision of ∆α/α from the LAE and QSO subsamples are on the same level, for the higher luminosities of the [O III] lines in the QSOs' spectra compensate for their disadvantage in the FWHM.The median [O III] luminosity of QSOs (10 43.7 erg s −1 ) in our sample is 10 times that of LAEs (10 42.7 erg s −1 ), as can be seen in Figure 4.
Second, the volume density of LAEs suitable for studying α variation is much higher than that of QSOs at z > 1.We calculated the volume density of LAEs with Lyα luminosity greater than the mean value of our LAEs, and that of QSOs with continuum luminosity greater than the mean value of our QSOs, both at z ∼ 2. We extracted the spectra on the UVB arm of z ∼ 2 LAEs in our sample and measured the Lyα luminosities, which have a median value of 10 43.1 erg s −1 .Using the Lyα luminosity function of z = 2.2 LAEs from Konno et al. (2016), we estimated the volume density of LAEs with LLyα > 10 43.1 erg s −1 is ∼ 5 × 10 −5 Mpc −3 .We measured the G-band absolute magnitudes of the QSOs in our sample, with a median value of −26.2.Using the continuum luminosity function of 2.2 < z < 2.6 QSOs from Croom et al. (2009), we estimated that the number density of QSOs with g < −26.2 is only ∼ 5 × 10 −7 Mpc −3 .In addition, only a small fraction of QSOs have strong and narrow [O III] lines and weak Fe II bumps.We estimated this fraction at 20% in section 5.1.As a result, the number density of LAEs suitable for α measurement is 2-3 orders of magnitude higher than QSOs.
Finally, systematic errors may be larger and more challenging to analyze when measuring α with QSO spectra.In this work, we considered two possible sources of systematic error for QSOs, the blending of [O III] doublet and Fe II bump.We selected QSOs with [O III] not severely blended and with weak Fe II.The resultant η/σ(η) obeys the standard normal distribution, implying that the possible systematic errors from these two sources hardly affect the results of our work.However, if the method is applied to future α measurements with higher precision, these systematic errors may become significant as other errors decrease.In addition, theoretically, there are some other sources of systematic errors, such as the uncertainty of the shape of Hβ broad emission lines, weak QSO emission lines, and others.These errors may also affect future works using QSO spectra for accurate α measurement.In LAE spectra, the continuum is weak and mainly comes from starlight.Thus, the modelling of the LAE continuum is more reliable than the modelling of the QSO continuum.
We discuss the possible precision of future α measurements using spectroscopy of LAE and other starburst galaxies at z < 2. This work obtained an error of ∆α/α of 7 × 10 −5 using 40 XShooter spectra of LAEs.The error is 9 × 10 −5 if only using 6 spectra of SA and is 1.3 × 10 −4 if only using the other 34 spectra.To obtain an error of ∆α/α below 10 −6 to test the measurements using the many-multiplet method, N > 50, 000 spectra similar to the SA's or N > 600, 000 spectra similar to other LAEs' are required, assuming that the error is inversely proportional to √ N .To achieve the required sample size, observing multiple spectra of starburst galaxies simultaneously using multi-object spectrometers is necessary.The ongoing project Dark Energy Spectroscopic Instrument (DESI, DESI Collaboration et al. 2016) may meet the sample size demand.The DESI project is expected to observe about a dozen million emission line galaxies, of which ∼7 million at z < 0.95, suitable for the study of α variation using [O III] doublet as the wavelength range is 3600-9800 Å. Spectra with sufficiently high [O III] SNRs may be on the order of 10 5 ∼ 10 6 .Because the sensitivity, resolution and sample size of DESI are all better than SDSS, the final accuracy of α measurement must be better than the existing results using SDSS spectra (∼ 2 × 10 −5 ), and might be comparable with the results using the many-multiplet method at present.So far, the highest redshift at which α variation has been measured is 7.1 (Wilczynska et al. 2020).Their work used the absorption lines in the VLT/XShooter spectra of QSO J1120+0641 at z = 7.085.Measuring α variation at higher redshift using QSO absorption lines is challenging because QSOs at z > 7 are extremely rare.Fortunately, a number of z > 7 LAEs have been spectroscopically identified (e.g., Vanzella et al. 2011;Stark et al. 2017;Jung et al. 2020;Endsley et al. 2021).Their spectral energy distributions suggest they have strong [O III] emission lines.These LAEs can be used to constrain the variation of α at z > 7 as long as deep mid-infrared spectroscopic observations are obtained because the [O III] wavelengths are redshifted to > 4 µm.
The Near InfraRed Spectrograph (NIRSpec) mounted on James Webb Space Telescope (JWST, Gardner et al. 2006) can take spectrum in a wavelength range of 0.6-5.3µm, and hence can observe [O III] emission lines of LAEs at z < 9.4.Depending on the observational mode, the spectral resolution R can be 100, 1000 or 2700.Assuming that the LAEs have an intrinsic [O III] FWHM of 130 km s −1 and are observed with JWST/NIRSpec with R of 1000 or 2700, we estimated the observed [O III] FWHM to be 300 or 170 km s −1 using equation D3 in Appendix D. Assuming an SNR of λ4960 line of 10, we further estimated that the precision of ∆α/α measurement could reach (2 ∼ 3) × 10 −3 .Higher precision could be obtained if the SNR is higher or more than one LAE is observed.This measurement will, for the first time, provide a constraint on the variation of α at 7.1 < z < 9.4, corresponding to an age of the universe of only 500-700 million years.

CONCLUSIONS
We constructed a sample of 86 public NIR spectra observed by VLT/XShooter of LAE (40) and QSOs (46) at 1.09 < z < 3.73.The selection criteria of the sample concern the SNRs of the [O III] doublets and three indexes: the first describes to what extent the data is affected by SELs, TALs, cosmic rays and bad CCD pixels, the second describes the strength of Fe II bump in the QSO spectrum, and the third describes the blending of the [O III] doublet.We measured the parameter η describing the ratio of the wavelengths of the [O III] doublet.We further inferred ∆α/α, the difference between α in the distant universe and that in the laboratory.When measuring η, we tried a Multiple-Gaussian method and a Profile-Matching method, in which the former applies to all spectra (86) while the latter only applies to a part (17), and we adopted the results using the former method.Reassuringly, the two methods yield consistent results for spectra that are applied to both.Our main results are as follows.
1.The ∆α/α measured using the 86 spectra are all consistent with 0 by < 3σ.The ratio of ∆α/α to its error obeys the standard normal distribution, which supports the assumptions that the truth value is 0 and that our estimates of errors are reliable.
In addition, our results suggest that starburst galaxies' spectra have a better application prospect than QSO spectra in future studies of α variation.

APPENDIX A: DETAILS OF DATA REDUCTION
We briefly introduce the pip-2D data generated by the XShooter pipeline.The data is contained in fits files with three header data units (HDU).As shown in Figure A1(a), the first HDU stores the 2D spectrum of a target (in the unit of ADU); the second stores the flux error of spectrum; the third stores the bad CCD map, marking those pixels affected seriously by bad CCD pixels, cosmic rays, and other factors.The 2D spectrum had been straightened and aligned, with the wavelength direction going horizontally at a fixed interval of 0.6 Åand the spatial direction going vertically.This kind of 2D spectrum was produced by combining the 4 frames taken in 4 consecutive exposures under the "ABBA" dither mode.Using an A−B−B+A algorithm, the pipeline eliminated the sky emission and left 3 images of the target on the combined frame, among which the bright image originated from the two exposures labelled as A and have positive fluxes and the other two dark images originated from the two exposures labelled as B and have negative fluxes.
We extracted 1-D spectrum based on the pip-2D data as follows.
First, the aperture for extracting the spectra from the three images was determined, as shown in Figure A1(b).Almost all the targets are point-like sources, or close to point-like sources, and their brightness profiles along the spatial direction, cutting at the non-emission wavelengths for QSOs and at the [O III] λ5008 emission lines for LAEs, were well fitted by Gaussian functions.The best-fitting Gaussian models give positions of the three images and the FWHMs of their brightness profiles, in which the latter were used to determine the width of the aperture for extracting the 1-D spectra.The only exceptional target whose brightness distribution cannot be approximated by a Gaussian function is SA, and we will describe its extraction in another paragraph.
Second, spectra were extracted from the three images and merged into one spectrum, as shown in Figure A1(c).Using the previously determined aperture width, we extracted the 1-D spectra from the three images, to which the first HDU contributed the flux and the second HDU the error.With reference to the third HDU, we prepared a boolean value named "mask" for each wavelength of the 1-D spectra.The value was set to 0 if the data quality of all the pixels inside the aperture at this wavelength is 0, and to 1 otherwise.The three spectra were then rescaled to eliminate the small differences between their fluxes caused by weather changes, and the negative fluxes were turned positive.After the rescaling, new contaminated pixels were identified, at which the flux of one spectrum deviates from those of the other two spectra by more than 4.5σ.These pixels were affected by cosmic rays or other adverse factors but had not been identified by the pipeline, so we changed their mask values to 1.We calculated the weighted average of the three spectra using pixels with mask values of 0 and obtained a 1-D spectrum.This averaged spectrum was also assigned a mask value at each wavelength: it was set to 1 if the mask values in the three spectra are all 1, and to 0 otherwise.Besides, a few observations were not taken under the dither mode and each had only one image in its pip-2D data.In these cases, the rescaling and the averaging were skipped.
Third, for each target, we calibrated the flux of its 1-D spectra and corrected the telluric absorption, as shown in Figure A1(d).The flux of the 1-D spectra of a target was calibrated using the sensitivity function (the conversion from ADU to flux) stored in its flux-calibrated pip-1D data.As telluric absorptions appeared in the targets' spectra, they must have also appeared in the spectra of standard stars observed close in time.These standard stars' spectra, which belong to project 60.A-9022(C), can be obtained from the ESO archive.So, to correct the telluric absorption in each target's spectrum, we built a telluric absorption model using the spectrum of the standard star, which was observed closest in time and of whose continuum the SNR is >300 in the wavelength range of the NIR arm.For the spectrum of each standard star, we searched for a similar stellar spectrum in the XShooter Spectral Library (Gonneau et al. 2020) and considered it as the template, and a telluric absorption model can be calculated by dividing the observed spectrum by the template.Corrected QSO spectra using such telluric absorption models were examined in the continuum.The correction was successful at most of the wavelengths.At some wavelengths with severe telluric absorption, the correction was less successful, which may be caused by the inhomogeneity or the variability in the properties of the atmosphere.
Finally, for each spectrum pixel, once it meets any of the following situations, we changed its mask value to 1.The situations are: 1.The telluric absorption model of the pixel is below 0.5.2. A strong SEL affects the pixel.We adopted the list of SELs from Oliva et al. (2015), and selected 184 strong lines with flux rates > 1000.With the information on these lines and with that revealed by the 1-D error spectrum of each target, we identified the sky emission lines in each spectrum and calculated their width for masking, in which an algorithm was involved to give greater widths for stronger lines.3. The pixel lies near the [O III] doublet and is affected by visually identified cosmic rays that the pipeline had missed.The final mask values label pixels affected by strong SELs, TALs, cosmic rays and CCD bad pixels.
We extracted a 1-D spectrum for each pip-2D data.For a target, if there were more than one group of exposures taken under the dither mode and hence multiple pip-2D data in the ESO archive, we combined the extracted spectra, as we had done in combining spectra of the three images in the same pip-2D data.
SA is a lensed galaxy with multiple clumps, which could be star clusters.The VLT/XShooter observation of SA has been described in detail by Vanzella et al. (2020).This observation contained 3 exposures, during which the slit was placed differently but covered at least two clumps each time.The 2-D spectra of the three exposures are displayed in Figure A2.We extracted 6 spectra using 6 apertures shown in the red line in the figure, denoted as SA1 to SA6.

APPENDIX B: APPLICABILITY OF PROFILE-MATCHING METHOD
As we have demonstrated in section 3.1, the χ 2 (η, A) surfaces obtained by the MG method are smooth, and the contours are close to ellipses, and the best estimates and statistical errors of η and A can be reliably obtained.However, the results obtained by the PM method do not guarantee these properties.We found two types of problems.We illustrate each type with an example (Figure B1) and analyze the possible origin of the problem.
We show the first type of problem with LAE J0332-2746 as an example.When the PM method was used, a feature similar to a geological fault occurs on the χ 2 surface.Accordingly, jumps can be seen on the χ 2 (η) curve which are roughly evenly spaced with an interval of ∆η = 0.6 Å pixels of the λ4960 line included in the matching.When η increases, the rightmost pixel is excluded, and at the opposite end, a new pixel joins in to become the new leftmost pixel.The jumps are not obvious in the range of χ 2 (η) < χ 2 min + 1 when the λ4960 line has high SNR and small FWHM, but they can be very prominent when the λ4960 line suffers from great noises, which will make the measurement of η unreliable.
We then show the second type with LAE J1001+0206 as an example.When measured by the PM method, wavy structures appear on the χ 2 surface.The origin of this structure is as follows.wavy structures.The broader the [O III] lines, the more likely the wavy structures will appear.
By visually examining the χ 2 surfaces, we selected 17 spectra with which η can be reliably measured by the PM method, including 16 LAE and 1 QSO spectra.The [O III] emission lines in these spectra are narrow, have high SNR and are little affected by masks.Notably, the FWHMs of these spectra's [O III] emission lines span within 16 pixels.The previous work of (e.g., Bahcall et al. 2004) used QSOs' SDSS spectra, in which the typical [O III] emission lines' FWMHs of 400-1000 km s −1 span for 6-14 pixels.These are informative to the applicability of the PM method.We suggest that for spectra taken by Xshooter, SDSS or other spectrometers, as long as the [O III] emission lines' FWMHs span no more than 14-16 pixels, the application of the PM method can be considered.

APPENDIX C: THE MULTI-GAUSSIAN FITTING RESULTS
We present all 86 spectra and the best-fitting results using the multiple-Gaussian model in Figures C1, C2, C3 and C4.

APPENDIX D: FACTORS AFFECTING MEASUREMENT PRECISION OF η
Factors that affect the precision of η include the width and the SNRs of the [O III] doublet, and we explored their influence using simulated spectra.Assuming that both [O III] lines have the same Gaussian profile, we generate spectra in which the FWHMs and SNRs of the [O III] doublet vary independently.In a simulated spectrum, the FWMHs of the doublet were set to be equal, and the SNRs of the doublet, which were adjusted by assigning varied flux errors, are uncorrelated and can be different.Measurements on η using these simulated spectra revealed a relation between σstat(η) .Two examples of spectra with which η can be reliably measured by the MG method but not by the PM method.The top row shows the spectra, the middle two rows show the χ 2 (η, A) surface and χ 2 (η) curve obtained by the MG method, and the bottom two rows show those by the PM method.
and the FWHM and SNRs of the doublet, which can be expressed as: This relation holds when the FWHM is 100-1200 km s −1 and SNRs are 10-500.
We plot the observed [O III] SNRs of the final sample in Figure D1. Figure D1 also shows the contour of the values of σstat(η) with the FWHM of the doublet fixed at 300 km s −1 .In our sample, the values of SNR5008 are greater than those of SNR4960 for all the spectra, and their average ratio is 2.06.By assuming that SNR5008 = 2.06SNR4960, we can simplify Equation D1 as: In the simulation, when we were adjusting the SNRs of the doublets by assigning varied flux errors, we found that the values of SNR follow such relation with the value of FWHMs and the flux errors: SNR ∝ FWHM −1/2 Err.Hence if considering the influence of the FWHM alone, σstat(η) will be proportional to FWHM 3/2 .The spectral resolution R was generally considered to be one of the factors affecting the measurement precision.Higher resolution leads to more accurate wavelength calibration because the calibrating emission lines of arc lamps and SELs will be narrower and more resolved, offering more information for finding the wavelength solution.It then helps reduce the systematic error σsys(η) originating from the wavelength calibration.Higher resolution also helps reduce the statistical error σstat(η), because the SELs and TALs will be narrower and affect fewer pixels, leading to an increase in the SNRs of [O III] lines and hence reducing σstat(η).In addition, when the resolution is poor, the observed lines will appear broader.Assuming that the intrinsic profile of a [O III] emission line is a Gaussian with a FWHM of FWHMintr, and assuming that the broadening function of spectrometry is also a Gaussian, the observed FWHM (FWHM obs ) of this line is roughly: where c is the speed of light.Higher resolution leads to smaller FWHM obs , and then smaller σstat(η).Nevertheless, to what extent can σstat(η) be reduced by improving the resolution is limited by FWHMintr.For Xshooter, the resolution R is 4200-8100 (Tables 1 and 2), corresponding to c/R of 37-70 km s −1 .To a [O III] emission line with FWHMintr of 100 -150 km s −1 , the instrumental broadening from the Xshooter will bring an increase in FWHM obs of about 10%-30%.When the [O III] emission line's FWHMintr is above 150 km s −1 , the increase will be no more than 10%.For our sample, the observed FWHMs of the LAE subsample are affected by <30%, while those of the QSO subsample are almost unaffected.In summary, increasing resolution benefits on improving the precision of η measurement.In future studies, if the LAE spectra are used for constraining α variation, we suggest that the resolution R should be no less than 4000.Otherwise, the observed width of the [O III] lines would be significantly larger than the intrinsic width, and the measurement precision would be unpromising.MNRAS 000, 1-?? ()

Figure 1 .
Figure1.Left: Examples of masks caused by SELs.We identified strong lines in the sky emission spectrum and labelled the pixels strongly affected by them in grey.Right: Examples of masks caused by TALs.We show the normalized telluric absorption spectrum (1 for unabsorbed, 0 for fully absorbed).Pixels with a value less than 0.5 (red dotted line) on the normalized absorption spectrum are masked (grey).

2Figure 2 .
Figure2.Example of the multiple Gaussian fitting.From top to bottom panels, we show the models containing 1-4 Gaussian components for [O III] doublet.In each panel, we display the observed spectrum (black), the continuum model (green), the sum of continuum and emission line models (red), the residual (blue), and the observed errors (cyan).Note that we only show spectra and model in 4900-5100 Å while the wavelength range of fitting is 4900 to 5140 Å.

Figure 3 .
Figure 3. (a): An example of measuring I FeII .The green shade shows the continuum.The continuum model is blue, and the wavelength ranges used for fitting it are green shades.The red line shows the sum of the continuum and Fe II models.(b): An example of measuring I blend .The black line in the upper panel is the sum of λ5008 and λ4960 models.

Figure 4 .
Figure 4. Redshift versus [O III] luminosity for our final sample.

Figure 5 .
Figure 5.The distributions of SNR 4960 and [O III] FWHM for the final sample.

Figure 6 .
Figure 6.Examples of measuring η and A using the two methods.The MG method is shown in the upper row, and the PM method is shown in the bottom row.(a): Observed spectrum (black) and best-fitting multiple-Gaussian model (red).(b): The profile of λ4960 line (blue point), and that of λ5008 line which has been moved leftward and downsized (red point), and the best-fitting cubic spline function (black line).(c): The χ 2 (η, A) surfaces obtained by the two methods.(d): The contours that yielding χ 2 = χ 2 min + 1, and the resultant 1σ confidence intervals of η (green shade) and A (red shade).(e): The minimum χ 2 value for different η (black line) and the 1σ confidence interval of η (green shade) corresponding to χ 2 < χ 2 min + 1 (blue shade).

Figure 7 .Figure 8 .
Figure 7.Comparison of the η values measured using the two methods for the 17 spectra.(a): Comparison of η values measured using the two methods.(b): Comparison of the statistical errors.(c): Distribution of ∆ parameter, which characterizes the degree of deviation.

Figure 9 .
Figure 9. Relation between σstat(η) and SNR 4960 .The pentagrams represent LAEs, and the dots represent QSOs.The colour represents the [O III] FWHM (see the colour bar below).The two black dashed lines show theoretical expectations for FWHMs of 130 and 610 km s −1 , respectively.

Figure 10 .
Figure 10.The distribution of χ and a comparison with the standard normal distribution (red line).

Figure 11 .
Figure11.The upper panel shows the measurements of ∆α/α using all 86 spectra.The lower panel shows those using 37 spectra with errors < 10 −3 .

Figure 13 .
Figure13.We present the correlation coefficients of the doublet line profiles and the measurements of A for QSO spectra in the final sample (black pentagram), excluded by the Fe II criterion (red box) and excluded by the blending criterion (blue origin).

Figure 14 .
Figure 14.The composite spectrum of QSOs in our final sample (black).We show XShooter (blue) and SDSS (red) QSO composite spectra for comparison.All three spectra are normalized at 5100 Å.The insert panel shows the partial enlargement near the [O III] doublet.

Figure A1 .
Figure A1.Example of a typical approach of extracting 1-D spectrum.(a): Display of pip-2D data.(b): Example of brightness distribution (black line) of QSO continuum and determination of apertures for extracting (red shade).(c): Example of combining the spectra of the three images.(d): Example of correction for telluric absorption.
Figure A2.The apertures for extracting the spectra of SA.
Figure B1.Two examples of spectra with which η can be reliably measured by the MG method but not by the PM method.The top row shows the spectra, the middle two rows show the χ 2 (η, A) surface and χ 2 (η) curve obtained by the MG method, and the bottom two rows show those by the PM method.

Figure D1 .
Figure D1.The contours show the simulated relation between σstat(η) and SNRs of the doublet lines when FWHM= 300 km s −1 is assumed.The data points show the observed [O III] SNRs in the final sample, red for LAEs and blue for QSOs.The dashed line shows the relation assuming SNR 5008 = 2.06SNR 4960 .

Table 1 .
Information of the 40 LAE spectra

Table 2 .
Information of the 46 QSO spectra dian value of 130 km s −1 , and that of the QSOs is 270-1260 km s −1 with a median value of 610 km s −1 .The [O III] emission lines of the LAEs are narrower and less luminous than those of the QSOs, while the SNRs of the two subsamples are similar.

Table 3 .
Measurements of η and A for the 6 spectra of SA (z=2.37).

Table 4 .
Measurements of η and A for 34 other LAE spectra