Calibration of metallicity of LAMOST M dwarf stars using FGK+M wide binaries

Estimating precise metallicities of M dwarfs is a well-known difficult problem because of their complex spectra. In this work, we empirically calibrate the metallicity using wide binaries that consist of an F, G, or K dwarf and an M dwarf companion. With 1308 FGK+M wide binaries well observed by LAMOST, we calibrated the [Fe/H] of M dwarfs using the Stellar LAbel Machine (SLAM), a data-driven method based on support vector regression (SVR). The [Fe/H] labels of the training data are taken from the FGK companions and span [-1, 0.5] dex. The Teff labels are taken from Li et al. (2021) and span [3100, 4400] K. The uncertainties of the SLAM estimates of [Fe/H] and Teff are ~0.15 dex and ~40 K, respectively, at snri > 100, where snri is the signal-to-noise ratio (SNR) in the i band of the M dwarf spectra. We applied the trained SLAM model to determine [Fe/H] and Teff for ~630,000 M dwarfs with low-resolution spectra in LAMOST DR9. Compared to other studies that also use FGK+M wide binaries for calibration, our [Fe/H] estimates show no bias and a scatter of ~0.14-0.18 dex. However, the comparison with APOGEE shows a systematic [Fe/H] difference of ~0.10-0.15 dex with a scatter of ~0.15-0.20 dex. While our Teff has a bias of 3 K and a scatter of 62 K relative to APOGEE, it is systematically higher by 180 K than other calibrations based on the bolometric temperature. Finally, we calculated the zeta index for the 1308 M dwarf secondaries and present a moderate correlation between zeta and [Fe/H].


INTRODUCTION
M dwarfs are low-mass stars located at the bottom of the main sequence in the Hertzsprung-Russell diagram (HRD). They account for 70-75% of all stars in the solar neighbourhood and dominate the nearby stellar population (Henry et al. 2006; Bochanski et al. 2010; Winters et al. 2019). The main-sequence lifetime of M dwarfs exceeds the current age of the universe (Baraffe et al. 1998; Henry et al. 2006). They are therefore considered excellent tracers of the evolutionary history of the Galaxy (e.g., West et al. 2011; Woolf & West 2012; Hejazi et al. 2015). They also provide an important laboratory for exploring the structure and evolution of the thin and thick disks of the Milky Way (e.g., Ferguson et al. 2017; Montes et al. 2018). In addition, M dwarfs have become key targets for planet hunting (e.g., Mann et al. 2011; Gillon et al. 2017; Ribas et al. 2018), as their radii and masses are smaller than those of their solar-type counterparts. A large number of low-mass planets are expected to orbit M-type stars within their habitable zones, allowing for easier detection of low-mass exoplanets (e.g., Gaidos et al. 2007; Dressing & Charbonneau 2013; Tuomi et al. 2014; Kochukhov 2021). The James Webb Space Telescope (JWST) has recently started searching for potential life on Earth-like planets orbiting M dwarfs (e.g., Lustig-Yaeger et al. 2023). Therefore, it is particularly important to obtain accurate atmospheric parameters of M dwarfs, including effective temperature and chemical abundances.
⋆ E-mail: liuchao@nao.cas.cn
Due to the intrinsic faintness of M dwarfs, obtaining high-resolution, high signal-to-noise ratio (SNR) spectra requires a large telescope and long integration times. Therefore, the currently available high-resolution spectra of M dwarfs are limited to small samples of nearby stars, mainly from the Galactic disk. For example, Rajpurohit et al. (2014) compared 21 high-resolution spectra (R∼40,000) of very low mass objects, observed with the optical spectrometer UVES (Dekker et al. 2000) on the Very Large Telescope (VLT), with synthetic spectra computed from the BT-Settl model (Allard et al. 2011, 2012b) to determine the physical parameters of these stars. They also analyzed the molecular (TiO, VO, CaH) and atomic (Fe I, Ti I, Na I, K I) features in these high-resolution spectra for different subtypes of very low mass stars. Lindgren & Heiter (2017) used the upgraded software package SME (Piskunov & Valenti 2017; see footnote 1) to infer the effective temperature and metallicity of 28 M dwarfs from high-resolution (R∼50,000) spectra obtained with the CRIRES spectrograph at ESO-VLT (Kaeufl et al. 2004). The precisions of Teff and [Fe/H] are 100 K and 0.05 dex, respectively. Veyette et al. (2017) determined the temperature and Fe abundance of 29 M dwarfs from high-resolution spectra (R∼25,000) using the PHOENIX stellar atmosphere model (Allard et al. 2012a; Baraffe et al. 2015; Allard 2016). They achieved precisions of 60 K and 0.1 dex for Teff and [Fe/H], respectively. Rajpurohit et al. (2018) determined the stellar parameters of 292 M dwarfs by comparing high-resolution spectra (R∼90,000) observed by CARMENES (Quirrenbach et al. 2014) with synthetic spectra from the BT-Settl atmospheric model. They found that the prominent narrow atomic lines (K I, Na I, Ca I, Ti I, Fe I, Mg I, and Al I) and molecular features (TiO, VO, OH, and FeH) of the objects can be well fitted by the BT-Settl model. Woolf & Wallerstein (2020) determined the Fe and Ti abundances of 106 M stars with spectral resolution ∼33,000 and SNR > 70 using the spectral analysis routine MOOG (Sneden 1973). Cristofari et al. (2022) compared 12 near-IR high-resolution (R∼70,000) spectra of M dwarfs acquired with the SpectroPolarimètre Infra-Rouge (SPIRou; Donati et al. 2020) with two grids of synthetic spectra based on the PHOENIX and MARCS (Gustafsson et al. 2008) model atmospheres. The internal errors of Teff, log g, and [M/H] in their work are about 30 K, 0.05 dex, and 0.1 dex, respectively.
On the other hand, low-resolution spectra of M dwarfs can be collected in large numbers with much more efficient observations. For example, the Baryon Oscillation Spectroscopic Survey of the Sloan Digital Sky Survey (SDSS/BOSS; Dawson et al. 2013) and the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST; Cui et al. 2012; Deng et al. 2012; Zhao et al. 2012) have provided a large number of low-resolution spectra of M dwarfs. Consequently, developing an automatic technique to derive precise atmospheric parameters of M dwarfs from low-resolution spectra is essential for Galactic studies. Galgano et al. (2020) determined the effective temperatures of 29,678 LAMOST M dwarfs with low-resolution (R∼1800) spectra using the supervised machine-learning code The Cannon (Ness et al. 2015). The training stellar labels they used were taken from the TESS Cool Dwarfs Catalog provided by Muirhead et al. (2018). Du et al. (2021) determined the atmospheric parameters of M dwarfs with low-resolution spectra by fitting the BT-Settl model grids (Allard et al. 2011, 2012b). The intrinsic precisions are 118 K and 0.29 dex for Teff and [M/H], respectively. Li et al. (2021) used the stellar labels from APOGEE as the standard to calibrate Teff and [M/H] for ∼300,000 M dwarfs, with biases of 50 K and 0.12 dex compared to other literature. Ding et al. (2022) fitted the LAMOST low-resolution spectra of M-type stars with the MILES spectral library (Falcón-Barroso et al. 2011) to determine Teff, log g, and [Fe/H] using the ULySS package (Koleva et al. 2009). The typical precisions of Teff, log g, and [Fe/H] are 45 K, 0.25 dex, and 0.22 dex, respectively.
In recent years, machine learning techniques have been used to efficiently process large amounts of spectral data and derive stellar parameters (e.g., Howard 2017; Ting et al. 2019; Antoniadis-Karnavas et al. 2020). Data-driven methods have been shown to be promising solutions for cool star parameterization (Jofré et al. 2019). The known information of a training data set can be transferred to the entire data set through data-driven methods. The high performance of data-driven methods in predicting stellar labels from low-resolution spectra is desirable (Ho et al. 2017). The Cannon (Ness et al. 2015) is a widely used data-driven approach for determining stellar labels from spectroscopic data. Huang et al. (2020) used The Cannon to determine the stellar parameters of K and M giant stars from low-resolution spectra. Compared to cool dwarfs, the atmospheric models of F, G, or K dwarfs agree well with observations, so their metallicities can be derived from low-resolution spectra by comparison with synthetic spectra. The two members of a wide binary system are usually formed from the same molecular cloud and should therefore have the same metallicity. That is, if an M dwarf has an F, G, or K dwarf companion with known metallicity, the metallicity of the M dwarf can be assumed to be the same as that of the hotter companion (e.g., Bonfils et al. 2005; Neves et al. 2012; Montes et al. 2018). Therefore, using F, G, or K dwarf companions with known metallicity to calibrate the metallicity of M dwarfs is a feasible technique (Rojas-Ayala et al. 2012; Mann et al. 2013; Newton et al. 2014). For example, Rojas-Ayala et al. (2010) used 17 F, G, or K dwarf companions with known metallicity to construct an empirical metallicity indicator applicable to M dwarfs with an accuracy of ∼0.15 dex. Mann et al. (2014) calibrated the metallicity of mid- to late-M dwarfs with F, G, K, or early-M dwarf primaries to an accuracy of 0.07 dex. Porto de Mello et al. (2017) derived the atmospheric parameters of M dwarfs from a detailed analysis of F, G, or K binary companions. The internal errors are 70 K and 0.1 dex for Teff and [Fe/H], respectively, calibrated with the Principal Component Analysis (PCA) method. Montes et al. (2018) established a sample of 192 physically bound FGK+M systems. The atmospheric parameters of the M dwarf companions were calibrated with the F, G, or K-type primaries. Birky et al. (2020) used a training sample that includes 87 M dwarfs with [Fe/H] labels from F, G, or K companions to derive Teff and [Fe/H] for 5875 M dwarfs. The prediction accuracy reaches 77 K and 0.09 dex, respectively. All the above works indicate that using F-, G-, or K-type companions with known metallicity to calibrate the metallicity of M dwarf secondaries is a common and effective approach.
1 http://www.stsci.edu/valenti/sme.html
In this work, we identified 1308 FGK+M wide binaries from LAMOST to calibrate the metallicity of M dwarfs with low-resolution optical spectra. This is the largest wide binary sample used so far to calibrate the metallicity of M dwarfs. We trained the data-driven model SLAM on these FGK+M wide binaries and applied the trained model to derive the atmospheric parameters of all LAMOST DR9 M dwarfs. This paper is organized as follows. Section 2 presents the selection of FGK+M and M+M wide binary systems. In Section 3, we describe the data-driven method SLAM for deriving the atmospheric parameters of stars from spectra. In Section 4, we analyze the training results and the validation of the trained model. The application of the trained model to the LAMOST DR9 M dwarfs is presented in Section 5. A discussion is given in Section 6. Finally, we summarize the results in Section 7.

DATA
We started with the wide binaries with a distance less than 1 kpc selected by El-Badry et al. (2021) 2020) is adopted. That is, the projected separation between the two stars is limited to less than 1 parsec, and the parallaxes of the two components are required to be consistent within 3 (or 6) sigma. In addition, the proper motion of stars is a significant factor in distinguishing genuine binaries from the much more numerous pairs of unassociated stars in the field (Chanamé & Gould 2004). El-Badry et al. (2021) required the proper motions of the two stars to be similar and consistent with a Keplerian orbit. They built an initial catalogue of wide binary candidates from the above constraints. Then, by counting the number of phase-space neighbours of each source in the candidate catalogue, they removed the candidate pairs in which either component had more than 30 neighbours; these objects may be clusters, background pairs, or triples. Details of the selection criteria for wide binaries can be found in Section 2 of El-Badry et al. (2021). Finally, 1,871,594 wide binary candidates remain in El-Badry et al. (2021), including main sequence (MS)+MS, white dwarf (WD)+MS, and WD+WD wide binaries with a small fraction of contamination. The components of a wide binary system with the brighter and fainter Gaia G magnitudes are defined as the primary and secondary star, respectively.

FGK+M and M+M wide binaries
We cross-matched the 1,871,594 wide binary candidates with LAMOST DR9 (Yan et al. 2022). We acquired 2453 wide binary candidates with an FGK-type main sequence star and an M dwarf companion (FGK+M). It is worth mentioning that there are some chance alignments among these binary candidates. In practice, we further purified the FGK+M wide binaries and ensured reliable astrometry of both components by applying the following five criteria.
(i) snrg1 > 15, where snrg1 is the g-band signal-to-noise ratio (SNR) of the F, G, or K dwarf spectrum.
(ii) snri2 > 15, where snri2 is the i-band SNR of the M dwarf spectrum. These two criteria ensure that the [Fe/H] of the primary stars and the low-resolution spectra of the M dwarfs are of reasonable quality.
(iii) R_chance_align < 0.1, where R_chance_align represents the probability that a wide binary is a chance alignment. High-probability binaries are expected to have low R_chance_align values. R_chance_align < 0.1 corresponds approximately to a wide binary with > 90% probability of being bound.
(iv) ∆rv (= |rv1 − rv2|) < 20 km s−1, where ∆rv is the radial velocity difference between the F, G, or K primary (rv1) and the M secondary (rv2). The radial velocities of all stars are from LAMOST. ∆rv < 20 km s−1 indicates that the wide binary has a high probability of being bound.
(v) ruwe2 < 1.4, where ruwe2 is the Renormalized Unit Weight Error (ruwe) of the M dwarf companion in the wide binary system. It is a quality indicator specified by the Gaia survey (Fabricius et al. 2021). ruwe2 < 1.4 indicates that the M dwarf in the FGK+M binary system does not have another closer companion and has a favourable astrometric solution.
Objects that fail any one of the above five criteria were removed. Finally, 1308 FGK+M wide binaries remain.
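The five cuts listed above can be written compactly as a single filter. The following is a minimal sketch only: the class `BinaryCandidate` and its field names are hypothetical stand-ins for the corresponding catalogue columns, and the thresholds are the ones quoted in criteria (i)-(v).

```python
from dataclasses import dataclass


@dataclass
class BinaryCandidate:
    """One FGK+M candidate pair (hypothetical field names, not catalogue columns)."""
    snrg1: float           # g-band SNR of the FGK primary spectrum
    snri2: float           # i-band SNR of the M dwarf spectrum
    r_chance_align: float  # chance-alignment indicator from El-Badry et al. (2021)
    rv1: float             # LAMOST radial velocity of the primary (km/s)
    rv2: float             # LAMOST radial velocity of the secondary (km/s)
    ruwe2: float           # Gaia RUWE of the M dwarf


def passes_cuts(c: BinaryCandidate) -> bool:
    """Return True only if the pair satisfies all five criteria (i)-(v)."""
    return (
        c.snrg1 > 15                      # (i) primary spectrum quality
        and c.snri2 > 15                  # (ii) secondary spectrum quality
        and c.r_chance_align < 0.1        # (iii) low chance-alignment probability
        and abs(c.rv1 - c.rv2) < 20       # (iv) consistent radial velocities
        and c.ruwe2 < 1.4                 # (v) well-behaved astrometry
    )


def select_binaries(candidates):
    """Keep the candidates that pass every cut, preserving input order."""
    return [c for c in candidates if passes_cuts(c)]
```

A pair is rejected as soon as one condition fails, which matches the statement that objects failing any criterion are removed.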
We also identified wide binaries in which both components are M dwarfs (M+M) from El-Badry et al. (2021), with both components observed by LAMOST. They can be used to verify the self-consistency of the M dwarf metallicities determined from the SLAM model, as described in subsection 4.3.2. The selection criteria for the M+M wide binaries are as follows.
(i) snri1 > 15 and snri2 > 15, where snri1 and snri2 are the SNR at i-band of the primary and secondary M dwarfs, respectively.
These criteria are the same as those used to select the FGK+M wide binaries; they ensure that both M dwarf components have reliable spectra and that the pairs are likely genuine wide binaries. Finally, we obtained 606 reliable M+M wide binaries.

Properties of FGK+M and M+M wide binaries
We obtained the reddening value E(B − V) for each star from the three-dimensional dust map of Green et al. (2019). The extinction is obtained from AV = 3.1 × E(B − V). We then converted AV into the extinctions in the three Gaia bands, G, GBP, and GRP. The extinction coefficients of the three bands are taken from Gaia Collaboration et al.

Here MG0 = G + 5 log(ϖ/mas) − 10 − AG is the extinction-corrected absolute magnitude in the G band, where ϖ is the Gaia parallax and AG is the extinction in the G band. The blue and red dots in the top panel represent the 1308 F, G, or K dwarf primaries and the 1308 M dwarf secondaries, respectively. Among the blue dots, a narrow branch of stars with 5 < MG0 < 7 lies ∼0.7 mag above the main sequence. These should be unresolved binaries in triple systems, in which the F, G, or K component is itself a binary. Unresolved binary primaries would not significantly affect the precision of the metallicity derived from spectra (El-Badry & Rix 2018).

In this work, we aim to calibrate the metallicity of M dwarfs from their F, G, or K dwarf companions. The [Fe/H] values of the 1308 primaries (hereafter [Fe/H]FGK) are derived from the LAMOST Stellar Parameter pipeline (LASP; Wu et al. 2011, 2014). We also derived the Teff of the 1308 M dwarf secondaries from the trained model of Li et al. (2021) (hereafter Teff,Li). The model of Li et al. (2021) was obtained by training SLAM with LAMOST M dwarf spectra and ASPCAP stellar labels ([M/H] and Teff).
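As a worked illustration of the extinction correction described in this subsection, the sketch below evaluates AV = 3.1 × E(B − V) and MG0 = G + 5 log(ϖ/mas) − 10 − AG. The function names are hypothetical, the logarithm is assumed to be base 10, and the G-band extinction AG is taken as an input because the conversion coefficients from AV are not reproduced here.

```python
import math

R_V = 3.1  # ratio A_V / E(B-V) quoted in the text


def a_v(ebv: float) -> float:
    """Visual extinction from the reddening E(B-V)."""
    return R_V * ebv


def abs_mag_g0(g_mag: float, parallax_mas: float, a_g: float) -> float:
    """Extinction-corrected absolute G magnitude.

    g_mag        : apparent Gaia G magnitude
    parallax_mas : Gaia parallax in milliarcseconds (must be positive)
    a_g          : extinction in the G band (supplied by the caller)
    """
    if parallax_mas <= 0:
        raise ValueError("parallax must be positive to compute a distance modulus")
    return g_mag + 5.0 * math.log10(parallax_mas) - 10.0 - a_g
```

For a star at 100 pc (parallax of 10 mas) the term 5 log(ϖ/mas) − 10 equals −5, i.e. the usual distance modulus of 5 mag is subtracted from the apparent magnitude.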
Figure 2 illustrates the distribution of Teff and [Fe/H] of the 1308 M dwarf secondaries in the CMD. In the left panel, the colors encode Teff,Li, which spans the range 3100 < Teff,Li < 4400 K. It can be seen that Teff,Li is clearly correlated with the color index. The right panel of Figure 2 displays the CMD of the 1308 M dwarfs color-coded by [Fe/H]FGK, which ranges from -1 to 0.5 dex. A color gradient is clearly visible in the CMD, from iron-poor M dwarfs (lower left) to iron-rich M dwarfs (upper right). The four dashed lines are theoretical isochrones from the PAdova and TRieste Stellar Evolution Code (PARSEC; Bressan et al. 2012) with [Fe/H] = 0.3, 0, -0.3, and -0.6 dex at an age of 10 Gyr, marked in red, yellow, green, and blue, respectively. The isochrones also place low-metallicity M dwarfs at the lower left and high-metallicity M dwarfs at the upper right, which is a trend similar to that of the observed samples (Xiong et al. 2023; Qiu et al. 2023).

METHOD
In this work, we used the Stellar LAbel Machine (SLAM; Zhang et al. 2020a), a data-driven method based on support vector regression (SVR), to derive the Teff and [Fe/H] of M dwarfs from low-resolution spectra. SVR is a robust nonlinear regression method widely used in spectral data analysis (Liu et al. 2012, 2014; Lu & Li 2015). In previous studies, SLAM has shown good performance in determining stellar parameters from spectra (Zhang et al. 2020a; Li et al. 2021; Qiu et al. 2023). In the following subsections, we describe how we trained a SLAM model with the 1308 FGK+M wide binaries and then predicted the stellar atmospheric parameters of all LAMOST M dwarfs with the trained model.

Training of SLAM
The training set consists of the low-resolution spectra of the low-mass companions in the FGK+M wide binary systems and their corresponding stellar labels (Teff,Li and [Fe/H]FGK). The Teff,Li label is derived from the trained model of Li et al. (2021), and [Fe/H]FGK is the LAMOST estimate for the F, G, or K companion. We normalized the spectra and standardized the training set before training, following the preprocessing procedure of SLAM in Zhang et al. (2020a). First, we shifted the spectra back to the rest frame by correcting for the radial velocity. We then used a smoothing spline (de Boor 1978) to smooth the entire spectrum; pixels that deviate from the smoothed spectrum by more than a threshold, e.g., 2 times the standard deviation of the residuals in the wavelength bin, are excluded. By smoothing the remaining pixels of the spectrum, we obtained the pseudo-continuum. Each observed spectrum in the training set is normalized by dividing it by its pseudo-continuum. Finally, the stellar labels and normalized spectral fluxes were rescaled so that their means and variances are 0 and 1, respectively.
Assume that there are $m$ spectra in the training data set and that each spectrum has $n$ pixels. $F_{i,j}$ is the flux at the $j$th pixel of the $i$th normalized spectrum, and the mean and standard deviation of the flux at the $j$th pixel are

$$\mu_j = \frac{1}{m}\sum_{i=1}^{m} F_{i,j}, \tag{1}$$

$$s_j = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(F_{i,j}-\mu_j\right)^2}. \tag{2}$$

The normalized spectrum can be standardized via

$$f_{i,j} = \frac{F_{i,j}-\mu_j}{s_j}. \tag{3}$$

Define $\vec{\theta}_i$ as the stellar label vector of the $i$th star in the training set, and $f_j(\vec{\theta}_i)$ as the $j$th pixel of the SLAM model output spectrum for the input stellar label vector $\vec{\theta}_i$. The mean squared error (MSE) and median deviation (MD) can be measured by training the SLAM model with a specific set of hyperparameters. Three hyperparameters of SLAM, $C$, $\epsilon$, and $\gamma$, need to be determined. They represent the penalty level, the tube radius, and the width of the radial basis function (RBF) kernel, respectively. SLAM adopts the RBF as its kernel. The MSE and MD of the $j$th pixel are defined as

$$\mathrm{MSE}_j = \frac{1}{m}\sum_{i=1}^{m}\left[f_j(\vec{\theta}_i)-f_{i,j}\right]^2, \tag{4}$$

$$\mathrm{MD}_j = \frac{1}{m}\sum_{i=1}^{m}\left[f_j(\vec{\theta}_i)-f_{i,j}\right]. \tag{5}$$

In principle, the smaller the MSE, the better the fit. To avoid obtaining an overfitted model by training on the entire training data set to seek the minimum MSE, Zhang et al. (2020a) used the k-fold cross-validated MSE (CV-MSE) and k-fold cross-validated MD (CV-MD) to evaluate $\mathrm{MSE}_j$ and $\mathrm{MD}_j$. That is, the training sample is randomly divided into $k$ subsets, and $f_j(\vec{\theta}_i)$ for one subset is predicted by the model trained on the other $k-1$ subsets. We evaluated $\mathrm{MSE}_j$ and $\mathrm{MD}_j$ via 5-fold cross-validation in this work. After looping over all subsets, $\mathrm{MSE}_j$ and $\mathrm{MD}_j$ can be calculated from the fluxes predicted in cross-validation and the corresponding standardized spectral fluxes in the training set. The best hyperparameters for the $j$th pixel are then determined by searching for the lowest $\mathrm{MSE}_j$ among all tested sets of hyperparameters. By doing so pixel by pixel, we obtained the best trained SLAM model. It is worth mentioning that the training sample includes late-type K dwarfs and M dwarfs. Consequently, the trained SLAM model can be applied to both late-type K dwarfs and M dwarfs.
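The standardization and the k-fold cross-validated MSE and MD described above can be sketched in a few lines. This is not the SLAM implementation: the per-pixel regressor is passed in as a hypothetical callable `fit` (SLAM itself uses SVR with an RBF kernel), MD is taken here as the mean signed residual, and pixels are assumed to have non-zero variance.

```python
import random
from statistics import fmean, pstdev


def standardize(fluxes):
    """Rescale every pixel (column) to zero mean and unit variance.

    fluxes: list of m normalized spectra, each a list of n pixel fluxes.
    Returns the standardized spectra plus the per-pixel mean and std.
    """
    columns = list(zip(*fluxes))
    mu = [fmean(col) for col in columns]
    s = [pstdev(col) for col in columns]  # population std, assumed non-zero
    scaled = [[(f - m_j) / s_j for f, m_j, s_j in zip(row, mu, s)]
              for row in fluxes]
    return scaled, mu, s


def kfold_indices(m, k, seed=0):
    """Randomly split the indices 0..m-1 into k disjoint folds."""
    idx = list(range(m))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]


def cv_mse_md(labels, pixel_flux, fit, k=5, seed=0):
    """Cross-validated MSE and MD for a single pixel.

    fit(train_labels, train_flux) is a hypothetical trainer that must
    return a callable predict(label) -> flux for that pixel.
    """
    residuals = []
    for fold in kfold_indices(len(labels), k, seed):
        held_out = set(fold)
        train = [i for i in range(len(labels)) if i not in held_out]
        predict = fit([labels[i] for i in train],
                      [pixel_flux[i] for i in train])
        # Each star is predicted by a model that never saw it in training.
        residuals += [predict(labels[i]) - pixel_flux[i] for i in fold]
    mse = fmean(r * r for r in residuals)
    md = fmean(residuals)
    return mse, md
```

In a hyperparameter search, `cv_mse_md` would be called once per candidate set of (C, ϵ, γ) for each pixel, and the set with the lowest cross-validated MSE would be kept for that pixel.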

Prediction
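According to Bayes' theorem, given an observed spectrum, the posterior probability density function of its stellar labels is written as

$$p\left(\vec{\theta}\,|\,\vec{f}_{\rm obs}\right) \propto p\left(\vec{\theta}\right)\prod_{j=1}^{n} p\left(f_{j,\rm obs}\,|\,\vec{\theta}\right), \tag{6}$$

where $\vec{\theta}$ is the stellar label vector and $\vec{f}_{\rm obs}$ is the observed spectrum flux vector, in which $f_{j,\rm obs}$ is the flux of the $j$th pixel. $p(\vec{\theta})$ is the prior of the stellar label vector, and $p(f_{j,\rm obs}\,|\,\vec{\theta})$ is the likelihood of the observed flux of the $j$th pixel ($f_{j,\rm obs}$) given the stellar label vector ($\vec{\theta}$). The stellar labels can be estimated by maximizing the posterior probability $p(\vec{\theta}\,|\,\vec{f}_{\rm obs})$. With a Gaussian likelihood, the logarithmic form of Eq. (6) becomes

$$\ln p\left(\vec{\theta}\,|\,\vec{f}_{\rm obs}\right) = \ln p\left(\vec{\theta}\right) - \frac{1}{2}\sum_{j=1}^{n}\frac{\left[f_{j,\rm obs}-f_j(\vec{\theta})\right]^2}{\sigma_{j,\rm obs}^2+\sigma_j(\vec{\theta})^2} - \frac{1}{2}\sum_{j=1}^{n}\ln\left[2\pi\left(\sigma_{j,\rm obs}^2+\sigma_j(\vec{\theta})^2\right)\right] + \mathrm{const}, \tag{7}$$

where $f_{j,\rm obs}$ and $\sigma_{j,\rm obs}$ are the flux and uncertainty of the $j$th pixel of the observed spectrum, respectively, and $f_j(\vec{\theta})$ and $\sigma_j(\vec{\theta})$ are the model output spectral flux and uncertainty of the $j$th pixel corresponding to the stellar label vector $\vec{\theta}$. The cross-validated scatter (CV-scatter) and cross-validated bias (CV-bias) of the stellar labels are defined by equations (8) and (9), respectively:

$$\mathrm{CV\text{-}scatter} = \sqrt{\frac{1}{m}\sum_{i=1}^{m}\left(\vec{\theta}_{i,\rm SLAM}-\vec{\theta}_i\right)^2}, \tag{8}$$

$$\mathrm{CV\text{-}bias} = \frac{1}{m}\sum_{i=1}^{m}\left(\vec{\theta}_{i,\rm SLAM}-\vec{\theta}_i\right), \tag{9}$$

where $\vec{\theta}_{i,\rm SLAM}$ and $\vec{\theta}_i$ are the stellar label vector predicted by the SLAM model and the ground truth, respectively. According to these two equations, the CV-scatter and CV-bias can be calculated only if the predicted stellar label vector is known. The CV-scatter and CV-bias can therefore be regarded as the standard deviation and mean deviation, respectively, between the true and model output values of the stellar labels. In principle, the smaller the CV-bias and CV-scatter, the better the predictions of the model. It should be noted that the CV-scatter and CV-bias are statistics of the stellar labels, while the CV-MSE and CV-MD described in Section 3.1 are statistics of the stellar spectra.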
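The Gaussian log-likelihood that is maximized in the prediction step, and the two label statistics, can be illustrated as follows. This is a sketch under stated assumptions: the per-pixel variance is taken as the sum of the observed and model variances, the prior is omitted (i.e. assumed flat), and the CV-scatter is computed as the root mean square of the label differences for a single label; all function names are hypothetical.

```python
import math


def log_likelihood(f_obs, sigma_obs, f_model, sigma_model):
    """Gaussian log-likelihood of an observed spectrum given a model spectrum.

    All four arguments are equal-length sequences over pixels. The variance
    of each pixel is sigma_obs**2 + sigma_model**2 (assumption of this sketch).
    """
    total = 0.0
    for fo, so, fm, sm in zip(f_obs, sigma_obs, f_model, sigma_model):
        var = so * so + sm * sm
        total += -0.5 * (fo - fm) ** 2 / var - 0.5 * math.log(2.0 * math.pi * var)
    return total


def cv_bias_scatter(predicted, truth):
    """Bias (mean difference) and scatter (RMS difference) of one stellar label."""
    diffs = [p - t for p, t in zip(predicted, truth)]
    m = len(diffs)
    bias = sum(diffs) / m
    scatter = math.sqrt(sum(d * d for d in diffs) / m)
    return bias, scatter
```

In practice the label vector that maximizes `log_likelihood` (plus the log-prior, if one is used) would be found with a numerical optimizer, with `f_model` and `sigma_model` supplied by the trained model at each trial label vector.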

RESULTS
Table 1 lists the notation of the stellar labels used in this work. We trained the SLAM model with low-resolution optical spectra and two corresponding stellar parameters, Teff,Li and [Fe/H]FGK. The analysis of the trained SLAM model is presented in the following subsections.

Training Results
The percentage of variance explained (PVE) can be used to indicate the information content of the signal in noisy data. It is defined as

$$\mathrm{PVE}_j = 1 - \frac{\mathrm{MSE}_j}{s_j^2}, \tag{10}$$

where $s_j^2 = 1$ in our work. The more information about the training stellar labels is contained in the $j$th pixel, the larger the value of $\mathrm{PVE}_j$, i.e., the smaller the value of $\mathrm{MSE}_j$ (Zhang et al. 2020b).

Validation
The cross-validated scatter (CV-scatter) introduced in subsection 3.2 is usually used to quantify the precision of the predicted stellar parameters. Figure 4 shows how the CV-bias and CV-scatter of the stellar labels change with the signal-to-noise ratio in the i band (snri). In both panels, the red and blue dotted lines represent the CV-scatter and CV-bias, respectively. The left panel of Figure 4 displays the CV-bias and CV-scatter of Teff versus snri. The CV-scatter decreases gradually with increasing snri, while the CV-bias fluctuates around 5 K. The CV-scatter of Teff reaches ∼40 K at snri ∼ 100, and the mean CV-bias is about 5 K. The variation of the CV-bias and CV-scatter of [Fe/H] with snri is shown in the right panel and follows a trend similar to that of the effective temperature. The CV-scatter of [Fe/H] is smaller than 0.15 dex at snri > 100 and reaches around 0.1 dex at snri > 160. The CV-bias of [Fe/H] is almost equal to 0 at any snri.

Performance
We adopted two methods to verify the self-consistency of the stellar parameters of M dwarfs determined by SLAM, as detailed in subsections 4.3.1 and 4.3.2. We also analyzed the stability of SLAM in predicting the stellar parameters of M dwarfs, as described in subsection 4.3.3.

Self consistency in FGK+M wide binaries
We first checked the self-consistency of the parameterization using the 1308 FGK+M wide binaries. They are randomly split into two sets, a training set of 1000 FGK+M wide binary systems and a test set of 308 FGK+M wide binary systems. We trained the SLAM model with the 1000 M dwarf low-resolution spectra and their stellar labels ([Fe/H]FGK, Teff,Li

of the star at the i-th observation, where i runs from 1 to 29 or 30. According to the metallicity results of the above four stars, we find that the trained SLAM model is robust in predicting the stellar parameters of M dwarfs, especially for stars with high signal-to-noise ratios.

PREDICTION OF STELLAR LABELS FOR THE LAMOST DR9 M DWARFS
The LAMOST survey has collected 11,226,252 low-resolution (R ∼ 1800) optical spectra in its ninth data release, which contains more than 830,000 M-type spectra. We used the trained SLAM model to derive the atmospheric parameters from the M dwarf spectra, as presented in the following subsections.

The M dwarfs
The initial catalogue of M-type stars comes from LAMOST DR9. We obtained the parallaxes and Gaia three-band photometry of these stars by cross-matching with Gaia eDR3. We further purified the M dwarf sample according to the positions of the stars in the CMD. Figure 8 displays the CMD of the ∼830,000 M-type stars. The extinction correction for each star is the same as that described in subsection 2.2. Clearly, the initial M-type star catalogue contains some giants and other types of stars. We set the following four criteria to identify M dwarf candidates.

The first three criteria follow Li et al. (2021) and Birky et al. (2020). Criteria (i) and (ii) remove giants and white dwarfs. Criterion (iii) removes stars above the main sequence, which may be pre-main-sequence stars, heavily reddened K dwarfs with wrong extinction corrections, or multiple stars. Criterion (iv) removes stars with high astrometric noise or unresolved binaries. The first three criteria are marked by the three black dashed lines in Figure 8. The green dots are stars that do not meet the above criteria. Finally, more than 650,000 M dwarf candidates remain, as marked by the red dots.

Prediction Results
We derived the [Fe/H] and Teff of the ∼650,000 M dwarfs with the trained SLAM model. Note that the SLAM model cannot extrapolate stellar parameters beyond the range of the training sample. We obtained reliable stellar parameters for ∼630,000 M dwarfs after excluding stars whose predicted stellar labels lie outside the range of the training labels. The parameter catalog is described in Table 2. The top two panels of Figure 9 show the CMDs of the ∼630,000 M dwarfs. As shown in the top-right panel, the ∼4500 stars are divided into two sub-samples. The histograms of ∆[Fe/H] for stars with Teff,SLAM > 3800 K and stars with Teff,SLAM < 3800 K are drawn in green and red, respectively. It is seen that [Fe/H]SLAM is systematically higher than [Fe/H]AP by ∼0.1-0.15 dex, as marked by the green and red dashed lines. The scatter of ∆[Fe/H] for stars with Teff,SLAM > 3800 K is lower than that for stars with Teff,SLAM < 3800 K. This is likely because the iron abundance of cooler stars is more difficult to determine from their complicated absorption lines. The systematic difference between [Fe/H]SLAM and [Fe/H]AP may be related to many factors, such as the spectral resolution, wavelength ranges, and methods used in this work being different from those used in APOGEE. In addition, the opacity and incomplete atomic or molecular lines in the atmospheric model

Gizis & Reid (1997) introduced a classification system for subdwarfs based on measurements of four spectroscopic indices, CaH1, CaH2, CaH3, and TiO5. These indices were originally defined by Reid et al. (1995). Lépine et al. (2003) pointed out that they are able to discriminate among dwarfs, subdwarfs, and extreme subdwarfs of M stars (Reid & Gizis 2005; Lépine et al. 2007; Hejazi et al. 2020; Zhang et al. 2021; Hejazi et al. 2022). Lépine et al. (2007) proposed the ζ TiO/CaH metallicity index to redefine the metallicity subclasses based on the calibration of the TiO to CaH ratio for stars at solar metallicity. ζ is defined as

$$\zeta_{\rm TiO/CaH} = \frac{1-\mathrm{TiO5}}{1-\mathrm{TiO5}_{M_\odot}}, \tag{11}$$

where $\mathrm{TiO5}_{M_\odot}$ is a cubic polynomial fit of the TiO5 spectral index as a function of the CaH2+CaH3 index. It effectively provides the calibration of the TiO band strength relative to the CaH (CaH2+CaH3) band strength, as shown below:

$$\mathrm{TiO5}_{M_\odot} = a\,(\mathrm{CaH})^3 + b\,(\mathrm{CaH})^2 + c\,(\mathrm{CaH}) + d, \qquad \mathrm{CaH} = \mathrm{CaH2}+\mathrm{CaH3}. \tag{12}$$

In this work, we take the coefficients provided by Lépine et al. (2013), where a, b, c, and d are -0.588, 2.211, -1.906, and

Mann et al. (2013) proposed that ζ is correlated with [Fe/H] only for stars at super-solar metallicity, but not at low metallicity. In Figure 12, we show the trend between ζ TiO/CaH and [Fe/H]FGK, which is similar to Figure 16 of Lépine et al. (2013). The Pearson correlation coefficient between ζ TiO/CaH and [Fe/H]FGK and its associated two-tailed p-value are 0.12 and 1.2e-05, respectively. This demonstrates a moderate linear correlation between ζ TiO/CaH and [Fe/H] for the ∼1300 M dwarfs in the range -1 < [Fe/H] < 0.5 in this work. This is expected, as the TiO and CaH features depend not only on the metallicity but also on other parameters such as the α-element enhancement [α/Fe].

However, it cannot be ignored that there are still differences between the PARSEC model and the predicted metallicity in the range [Fe/H] < −0.6 dex. Considering that stars with low metallicity are mostly old, stellar activity does not have a great impact on the photometric measurements. We point out that the PARSEC model of M dwarfs with low metallicity is inconsistent with the results of our work, which may be due to an insufficient understanding of the continuous opacities of M dwarfs. Alternatively, other factors such as the method, the low-resolution spectra, and the wavelength ranges used in our work may also contribute to the difference.
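The ζ index and the Pearson correlation coefficient discussed in this section can be computed as sketched below. The function names are hypothetical, and the polynomial ordering a·x³ + b·x² + c·x + d (with x = CaH2 + CaH3) is an assumption of this sketch; the four coefficients must be supplied by the caller from Lépine et al. (2013), since they are not hard-coded here.

```python
import math


def tio5_solar(cah, a, b, c, d):
    """Cubic fit of TiO5 at solar metallicity as a function of cah = CaH2 + CaH3.

    Assumes the ordering a*cah**3 + b*cah**2 + c*cah + d.
    """
    return a * cah ** 3 + b * cah ** 2 + c * cah + d


def zeta(tio5, cah2, cah3, a, b, c, d):
    """zeta_TiO/CaH = (1 - TiO5) / (1 - TiO5 expected at solar metallicity)."""
    return (1.0 - tio5) / (1.0 - tio5_solar(cah2 + cah3, a, b, c, d))


def pearson_r(x, y):
    """Pearson linear correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)
```

By construction ζ equals 1 for a star whose measured TiO5 matches the solar-metallicity fit, and it decreases as the TiO band becomes weaker (larger TiO5 index) at fixed CaH.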

CONCLUSION
In this work, we identified 1308 FGK+M wide binary systems. We calibrated the [Fe/H] of the M dwarfs from their F, G, or K companions. The [Fe/H] of the M dwarfs ranges from -1 to 0.5 dex, and the effective temperature spans 3100 < Teff < 4400 K. By training the data-driven model SLAM on the 1308 M dwarf secondaries, we derived the stellar parameters ([Fe/H] and Teff) of ∼630,000 LAMOST M dwarfs with low-resolution optical spectra. The precisions of [Fe/H] and Teff are 0.15 dex and 40 K at snri = 100, respectively.
We used two methods to verify the self-consistency of the stellar parameters determined from the SLAM model. The first is to divide the 1308 FGK+M wide binaries into a training set and a test set. The bias and scatter of the metallicity and temperature of the test data are 0.01±0.19 dex and 2±54 K, respectively. In the second

We compared our resulting [Fe/H] and Teff values with the literature. For [Fe/H], there is near-zero bias with a scatter of ∼0.14-0.18 dex compared to Birky et al. (2020), who also calibrated [Fe/H] using F, G, and K companions. However, [Fe/H]SLAM is systematically higher than [Fe/H]AP, with an offset of 0.10±0.15 dex to 0.15±0.20 dex. This systematic difference may be caused by differences in the atomic or molecular lines, or by uncertainties in the continuous opacity of the stellar atmospheric model used by the APOGEE pipeline. Other factors, such as the spectral resolution, wavelength ranges, and methods used in our work being different from those used in APOGEE, may also contribute to the systematic difference. Compared to temperatures calibrated with the bolometric temperature, Teff,SLAM is higher by 180 K. However, there is good consistency between our temperature and that of APOGEE.
We calculated ζ for the 1308 M dwarf secondaries; this index was originally defined for the classification of M stars. The Pearson correlation coefficient between ζ and [Fe/H]FGK is only 0.12, which indicates that there is a moderate correlation between these two parameters.
Compared to LAMOST, the upcoming SDSS-V (Kollmeier et al. 2017; Almeida et al. 2023) has the capability to detect fainter stars, making it a promising tool to supplement the lack of metal-poor wide binaries in this study. Our method and catalog will serve as valuable references for deriving fundamental parameters and metallicity within the framework of SDSS-V.

APPENDIX
In order to ensure that the stars in the training and prediction sets in this work are indeed M dwarfs, we derived the photometric temperature (Teff,Mann) and surface gravity (log gMann) for the stars in our work. The Teff,Mann of the samples was calculated from the metallicity-independent empirical relationship provided by Mann et al. (2015, 2016) between Teff and magnitudes from the Two Micron All Sky Survey (2MASS; Skrutskie et al. 2006) and Gaia, i.e., the J, H, BP, and RP bands. This relationship is valid for stars with 0.1 R⊙ < R* < 0.7 R⊙ and -0.6 dex < [Fe/H] < +0.5 dex, where R* is the stellar radius. The coefficients of the relationship we used are those given in Table 2 of Mann et al. (2016), corresponding to the relation that shows a scatter of 49 K. The left panel of Figure 14
where R⊙ and M⊙ are the solar radius and mass, respectively, and M* is the stellar mass. We inferred the R*/R⊙ of the M dwarfs by adopting the relationship between R*/R⊙ and the absolute magnitude in the K band described in Table 1 of Mann et al. (2016), the one that shows a σ of 2.89%. The stellar mass was determined from the relationship between M*/M⊙ and the absolute magnitude in the K band provided by Mann et al. (2019). We used the coefficients of the relationship that shows a Bayesian Information Criterion (BIC) of 86 in Table 6 of Mann et al. (2019) to derive the masses of the M dwarfs in this work. Then log gMann can be determined according to equation (13).
(2018). The color-(absolute) magnitude diagrams (CMDs) of the 1308 FGK+M and 606 M+M wide binaries are shown in Figure 1. The interstellar extinction is corrected for each star.

Figure 1. The CMDs, i.e., Gaia G-band absolute magnitude (MG0) versus color index GBP0 − GRP0, of the 1308 FGK+M and the 606 M+M wide binaries. Each star is corrected for interstellar extinction. MG0 is the extinction-corrected absolute magnitude, and GBP0 and GRP0 are the extinction-corrected apparent magnitudes. The top panel shows the CMD of the 1308 FGK+M binary systems. The blue dots are the F, G, and K dwarf primaries and the red dots indicate the M dwarf secondaries. The CMD of the 606 M+M binary systems is displayed in the bottom panel. The blue and red dots represent the M dwarf primaries (M1) and the M dwarf secondaries (M2), respectively.

Figure 1, the blue dots indicate the 606 M dwarf primaries (M1) and the red dots indicate the 606 M dwarf secondaries (M2). In this work, we aim to calibrate the metallicity of M dwarfs from their F, G, or K dwarf companions. The [Fe/H] of the 1308 primaries (hereafter [Fe/H]FGK) is derived from the LAMOST Stellar Parameter pipeline (LASP; Wu et al. 2011, 2014). In addition, we calibrated the Teff of the 1308 M dwarf secondaries using the trained model of Li et al. (2021) (hereafter Teff,Li). The model of Li et al. (2021) was obtained by training SLAM with LAMOST M dwarf spectra and ASPCAP stellar labels ([M/H] and Teff). Figure 2 illustrates the distribution of Teff and [Fe/H] of the 1308 M dwarf secondaries in the CMD. In the left panel, the colors encode Teff,Li, which spans 3100 < Teff,Li < 4400 K. Teff,Li is clearly correlated with the color index. The right panel of Figure 2 displays the CMD of the 1308 M dwarfs color-coded by [Fe/H]FGK, which ranges from -1 to 0.5 dex. A color gradient is clearly visible in the CMD, from iron-poor M dwarfs (lower left) to iron-rich M dwarfs (upper right). The four dashed lines are the the-

Figure 2. The training sample. The two panels display the distribution of Teff and [Fe/H] of the 1308 M dwarfs in the CMD. The extinction is corrected for each star. The left panel shows the CMD color-coded by the effective temperature derived with the model of Li et al. (2021). The right panel shows the CMD of the 1308 M dwarfs color-coded by the iron abundance of their F, G, or K dwarf primaries. Four isochrones with different metallicities ([Fe/H] = -0.6, -0.3, 0, and 0.3 dex) at an age of 10 Gyr from the PARSEC model are shown as the blue, green, yellow, and red lines, respectively.

Figure 3. The first panel shows MSE-Teff (blue) and 12 spectra with the same [Fe/H] ∼ 0 dex but different Teff, varying from 3471 K to 4210 K (from red to green). The second panel shows MSE-[Fe/H] (red) and 4 spectra with the same Teff ∼ 3710 K but different [Fe/H], spanning -0.4 < [Fe/H] < 0.22 dex (from red to green). The subplots in the two panels are enlarged views of the flux of each spectrum at the corresponding wavelengths.

Figure 5. The top-left panel compares Teff,SLAM with Teff,Li for the 308 M dwarfs. The top-right panel is the histogram of ∆Teff (= Teff,SLAM − Teff,Li). The comparison between [Fe/H]FGK and the model-predicted [Fe/H]SLAM of the 308 M dwarfs is displayed in the bottom-left panel. The bottom-right panel shows the histogram of ∆[Fe/H] (= [Fe/H]SLAM − [Fe/H]FGK). The dashed black line in the top-left and bottom-left panels is the one-to-one line. The mean values of ∆Teff and ∆[Fe/H] are 2 ± 54 K and 0.01 ± 0.19 dex, respectively, as marked by the black dashed vertical lines in the top-right and bottom-right panels.

Figure 6. The left panel compares the SLAM [Fe/H] of the two components of the 606 M+M wide binaries. [Fe/H]SLAM1 and [Fe/H]SLAM2 represent the SLAM-estimated [Fe/H] of the M dwarf primaries and secondaries, respectively, in the M+M wide binaries. The histogram of ∆[Fe/H] (= [Fe/H]SLAM1 − [Fe/H]SLAM2) of the 606 M+M binaries is presented in the right panel. The median and scatter of ∆[Fe/H], as marked by the black dashed line, are 0.02 and 0.15 dex, respectively.

Figure 8. The CMD of M-type stars from LAMOST DR9. The extinction is corrected for each star. The red dots represent ∼ 650,000 M dwarf candidates. The three black dashed lines correspond to the first three criteria in subsection 5.1.
About 630,000 M dwarfs are divided into five metallicity bins. The MBP0 − MRP0 vs. MG0 diagrams of these five subsets are shown in the five top panels of Figure 13. Each star has been corrected for extinction as described in subsection 2.2. The colors encode the logarithm of the number of stars in each (MBP0 − MRP0, MG0) bin. The M dwarfs with [Fe/H] > -0.3 dex are consistent with the isochrones (Bressan et al. 2012). However, the predicted metallicity deviates from the corresponding isochrones in the range -1 < [Fe/H] < -0.3 dex. The five bottom panels are similar to the five top panels, but display the J0 − K0 vs. MKs0 diagrams of the five subsets. The extinction coefficients of the J and K bands are from Wang & Chen (2019), where AJ = 0.243 AV and AK = 0.078 AV. The five [Fe/H] bins are roughly consistent with the corresponding isochrones in the near-infrared bands, especially for M dwarfs with [Fe/H] > -0.3 dex.
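The near-infrared extinction correction described above can be sketched as follows. The coefficients AJ/AV = 0.243 and AK/AV = 0.078 are the values quoted in the text; the function names, the use of AV and a distance in parsecs as inputs, and the standard distance-modulus step are assumptions made for illustration, not the procedure of subsection 2.2.

```python
import math

# Extinction ratios quoted in the text from Wang & Chen (2019).
A_J_OVER_A_V = 0.243
A_K_OVER_A_V = 0.078


def dereddened_color_jk(j_mag, k_mag, a_v):
    """Return the extinction-corrected color J0 - K0."""
    j0 = j_mag - A_J_OVER_A_V * a_v
    k0 = k_mag - A_K_OVER_A_V * a_v
    return j0 - k0


def absolute_k0(k_mag, a_v, distance_pc):
    """Extinction-corrected absolute K magnitude from the distance modulus.

    distance_pc is a hypothetical input (e.g. from a parallax), in parsecs.
    """
    if distance_pc <= 0:
        raise ValueError("distance must be positive")
    return k_mag - A_K_OVER_A_V * a_v - 5.0 * math.log10(distance_pc / 10.0)


# Illustrative values only: J = 10.0, K = 9.2, AV = 1.0 mag, d = 100 pc.
color = dereddened_color_jk(10.0, 9.2, 1.0)
abs_k = absolute_k0(9.2, 1.0, 100.0)
```

Because AJ is larger than AK, the correction always makes the J0 − K0 color bluer than the observed J − K.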

Figure 9. The top two panels display the CMD of ∼ 630,000 M stars, color-coded by the predicted [Fe/H] and Teff, respectively. The top-left panel shows the CMD of the M dwarfs with [Fe/H]SLAM ranging from blue ([Fe/H]SLAM < -0.6 dex) to red ([Fe/H]SLAM > 0.3 dex). The four dashed lines in the top-left panel are the same as those described in Figure 2. The CMD of the ∼ 630,000 M dwarfs color-coded by Teff,SLAM is displayed in the top-right panel. The bottom two panels are the same as the top panels but for M dwarfs with snri > 50.

Figure 10. Comparison of our metallicity with [Fe/H]AP and [Fe/H]Birky. The top two panels display ∆[Fe/H] (= [Fe/H]SLAM − [Fe/H]AP) versus Teff,SLAM and the histogram of ∆[Fe/H] for the ∼ 4500 common M dwarfs, respectively. The red dots and bars in the top-left panel represent the mean and dispersion of ∆[Fe/H] in different Teff,SLAM bins. ∆[Fe/H] = 0 is marked by the red dotted line. The green and red histograms in the top-right panel show the ∆[Fe/H] of stars with Teff,SLAM > 3800 K and stars with Teff,SLAM < 3800 K, respectively. The mean values of ∆[Fe/H] for the two sub-samples are indicated by the green and red vertical dashed lines. Similar to the top two panels, the bottom two panels show the comparison with [Fe/H]Birky using the ∼ 3300 common stars, where ∆[Fe/H] = [Fe/H]SLAM − [Fe/H]Birky.
shows the comparison of Teff,Mann and Teff,SLAM for stars with [Fe/H]SLAM > -0.6 dex. It indicates a systematic bias of 60 K with a scatter of 64 K, as shown in the corresponding distribution of ∆Teff (= Teff,SLAM − Teff,Mann) in the right panel. The photometric surface gravity of the M dwarfs in this work was computed using the following relation:
$$\log g = 4.438 + \log_{10}(M_*/M_\odot) - 2\log_{10}(R_*/R_\odot) \qquad (13)$$
Figure 15 displays the distribution of Teff,SLAM (Teff,Li) and log gMann of the training sample (red) and the M dwarfs in the prediction set (black). It demonstrates that the Teff and log g of all the samples are consistent with those of late-type K and M dwarf stars.
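Equation (13) can be evaluated directly once the mass and radius are expressed in solar units. The following is a minimal sketch; the function name `photometric_logg` and the example inputs are illustrative assumptions, and the mass and radius would in practice come from the Mann et al. relations described in this appendix.

```python
import math

LOG_G_CONSTANT = 4.438  # constant term of equation (13)


def photometric_logg(mass_msun, radius_rsun):
    """Evaluate equation (13): log g from M*/Msun and R*/Rsun."""
    if mass_msun <= 0 or radius_rsun <= 0:
        raise ValueError("mass and radius must be positive")
    return LOG_G_CONSTANT + math.log10(mass_msun) - 2.0 * math.log10(radius_rsun)


# Illustrative values only: a star with M* = 0.5 Msun and R* = 0.5 Rsun.
logg_example = photometric_logg(0.5, 0.5)
```

For a star with solar mass and radius the relation returns the constant term, 4.438, and at fixed mass a smaller radius gives a higher log g, as expected for compact M dwarfs.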

Figure 13. The top five panels display the MBP0 − MRP0 vs. MG0 diagrams of M dwarfs in different metallicity bins. The colors encode the logarithm of the number of M dwarfs in each (MBP0 − MRP0, MG0) bin. The four dashed lines in each panel are the same as those described in Figure 2. The bottom five panels are similar to the top panels, but in the J0 − K0 vs. MKs0 plane. Each star has been corrected for extinction.
(Zhang et al. 2020a)eff , log g, [Fe/H], and [α/M] of spectra with SNR > 50 are 70 K, 0.1 dex, 0.1 dex and 0.04 dex, respectively. The Stellar LAbel Machine (SLAM; Zhang et al. 2020a), which is a data-driven method based on support vector regression (SVR), also shows high performance in deriving stellar parameters from low-resolution spectra.
Zhang et al. (2020a) determined Teff, log g, and [Fe/H] for ∼ 1 million LAMOST DR5 K giants with low-resolution spectra using SLAM. The random uncertainties of these three parameters are 50 K, 0.09 dex and 0.07 dex, respectively. Li et al. (2021) measured the Teff and [M/H] of LAMOST M dwarfs with low-resolution spectra using SLAM and demonstrated that the Teff and [M/H] agree with the APOGEE results to within 50 K and 0.12 dex. Qiu et al. (2023) trained the SLAM model with LAMOST low-resolution spectra of M giant stars and the corresponding stellar labels from APOGEE to obtain Teff, log g, [M/H], and [α/M]. The uncertainties of Teff, log g, [M/H], and [α/M] are 57 K, 0.25 dex, 0.16 dex and 0.06 dex, respectively, at signal-to-noise ratio (SNR) > 100.

Table 1. The notation of stellar labels.
In order to assess the contributions of effective temperature and metallicity to the trained model, we trained SLAM separately with Teff,Li and [Fe/H]FGK. The cross-validated MSE of [Fe/H] (MSE-[Fe/H]) is obtained by training solely with [Fe/H]FGK, and that of Teff (MSE-Teff) separately with Teff,Li. As Figure 3 shows, the blue and red lines in the first and second panels represent the distributions of MSE-Teff and MSE-[Fe/H], respectively, over the wavelength range used in this work, i.e., from 6000 Å to 9000 Å. According to the value of MSE-Teff at each pixel, we infer that some molecular bands and atomic lines are sensitive to effective temperature, such as the CaH and TiO bands as well as the Ca and Na lines, as emphasized by the gray bands and black dotted lines. This is consistent with some atmospheric models, e.g., the BT-Settl model (Hejazi et al. 2020). In addition, 12 training spectra with [Fe/H] ∼ 0 dex, colored by Teff spanning 3471 < Teff < 4210 K, are displayed in the first panel. The zoom-in subplots show the flux changes in certain lines such as Na and Ca. The fluxes at many wavelengths with low MSE-Teff values vary regularly from lower to higher temperature, especially at some wavelengths in the TiO band. This also demonstrates that these wavelengths are sensitive to Teff. In the bottom panel, the red line shows the distribution of MSE-[Fe/H]. It indicates that although the information about Fe abundance is weaker than that about temperature in most molecular bands and lines, some weak iron and other metal lines are sensitive to [Fe/H]. Similar to the first panel, the flux changes in the spectra with the same Teff ∼ 3710 K but different [Fe/H] ranging from -0.4 to 0.22 dex (from red to green) are displayed in the second panel. As shown in some zoom-in subplots, certain lines such as Fe I and Fe II are sensitive to [Fe/H] in M dwarf spectra. The values of MSE-[Fe/H] at these wavelengths are slightly smaller than those at most other wavelengths. In the following section, we analyze the parameter results derived from the SLAM model trained with both Teff,Li and [Fe/H]FGK simultaneously.
Then, we determined the metallicity ([Fe/H]SLAM) and effective temperature (Teff,SLAM) for the 308 test M dwarfs with the trained model. The comparison between Teff,SLAM and Teff,Li of the 308 test M dwarfs is illustrated in the top-left panel of Figure 5. The mean and scatter of ∆Teff (= Teff,SLAM − Teff,Li) are 2 and 54 K, respectively, as shown in the histogram of ∆Teff in the top-right panel.
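The bias and scatter quoted for the test set can be illustrated with a minimal sketch. Here the bias is taken as the mean of the differences between predicted and reference labels, and the scatter as their sample standard deviation; this choice of estimator, the function name `bias_and_scatter`, and the sample values are assumptions made for illustration, not the exact statistics used in this work.

```python
import statistics


def bias_and_scatter(predicted, reference):
    """Return (mean, sample standard deviation) of predicted - reference."""
    if len(predicted) != len(reference) or len(predicted) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    diffs = [p - r for p, r in zip(predicted, reference)]
    return statistics.mean(diffs), statistics.stdev(diffs)


# Hypothetical temperatures in K, for illustration only.
teff_slam = [3510.0, 3720.0, 3905.0, 4110.0]
teff_li = [3500.0, 3750.0, 3900.0, 4100.0]
bias, scatter = bias_and_scatter(teff_slam, teff_li)
```

The same function applies unchanged to ∆[Fe/H] by passing the predicted and reference metallicities instead of the temperatures.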
4.3.2 Self-consistency in M+M wide binaries
We then selected 606 M+M wide binaries, as described in Section 2.1, to further check the self-consistency. The [Fe/H] of each component of the 606 M+M wide binaries is separately derived from the SLAM model. The left panel of Figure 6 displays the comparison of [Fe/H]SLAM1 and [Fe/H]SLAM2, where [Fe/H]SLAM1 is the [Fe/H] of the M dwarf primaries and [Fe/H]SLAM2 is the [Fe/H] of the secondaries.
We selected four stars with 29 to 30 observations each, with different snri, in LAMOST DR9. The [Fe/H] of these four stars in each observation is independently measured by the SLAM model. [Fe/H]SLAM vs. snri for these four stars is shown in Figure 7. It illustrates that the [Fe/H] of each observation is close to the median value of [Fe/H] ([Fe/H]median) over multiple observations, especially at high snri. The histogram of ∆[Fe/H] (= [Fe/H]SLAMi − [Fe/H]median) in each panel shows that the median value of ∆[Fe/H] is around 0 with a scatter of ∼ 0.02-0.03 dex, where [Fe/H]SLAMi is the [Fe/H] of the i-th observation.
The colors encode [Fe/H]SLAM and Teff,SLAM in the top-left and top-right panels, respectively. The top-left panel shows that the [Fe/H] values lie in the range -1 < [Fe/H]SLAM < +0.5 dex; 90% of the M dwarfs have [Fe/H]SLAM > -0.6 dex. The four dashed lines in this panel are the same as described in Figure 2. The colors show an obvious gradient from bottom left (blue) to top right (red) in the CMD. The effective temperatures of the ∼ 630,000 M dwarfs are in the range 3100 < Teff,SLAM < 4400 K, as displayed in the top-right panel. ∼70% of the M dwarfs lie between 3200 and 4000 K. There is a significant correlation between temperature and color index, as expected. The bottom two panels are the same as the top two panels but for M dwarfs with snri > 50. The distributions of [Fe/H]SLAM and Teff,SLAM of the M dwarfs in the CMD are similar to those of the training sample shown in Figure 2. This indicates that the stellar parameters of M dwarfs derived from the SLAM model are reliable. In addition, the stellar parameters predicted by the SLAM model are consistent with the stellar evolution model, especially for stars with [Fe/H] > -0.3 dex.
5.3 Comparison of Metallicity
As a check of the reliability of the [Fe/H] determined by the SLAM model, we compared the predicted [Fe/H] in this work with two other studies. We first cross-matched the M dwarfs with APOGEE DR16 and obtained ∼ 4500 common stars. In Figure 10, the top-left panel illustrates the distribution of ∆[Fe/H] (= [Fe/H]SLAM − [Fe/H]AP) versus Teff,SLAM, where [Fe/H]AP is from APOGEE. The red dots and bars indicate the mean and scatter of ∆[Fe/H] in different temperature bins, respectively. Both values gradually increase with decreasing temperature; that is, the systematic difference between [Fe/H]SLAM and [Fe/H]AP is larger for low-temperature stars than for high-temperature stars.
Figure 12 shows the distribution of [Fe/H] and ζ TiO/CaH for the 1308 M dwarfs selected in subsection 2.1. Except for 10 stars with low spectral signal-to-noise ratio whose ζ TiO/CaH index is less than 0.825, the training samples in this work have ζ TiO/CaH > 0.825, which indicates that most of the training samples in this work are M dwarfs. Woolf et al. (2009) demonstrated that there is a linear correlation between ζ TiO/CaH and [Fe/H] in M stars. Lépine et al. (2013) also found a weak correlation between ζ and [Fe/H] for 0.9 < ζ TiO/CaH < 1.2.
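As a minimal sketch of how a ζ TiO/CaH value could be computed from spectral indices, the code below assumes the commonly used definition introduced by Lépine et al. (2007): ζ = (1 − TiO5) / (1 − [TiO5]Z⊙), where [TiO5]Z⊙ is a polynomial in CaH = CaH2 + CaH3 calibrated on solar-metallicity stars. The defining equation is not reproduced in this section, so this form, the function names, and the ascending-power ordering of `coeffs` are assumptions; the coefficients in the example are placeholders, not the calibration adopted in this work.

```python
def tio5_solar(cah, coeffs):
    """Polynomial reference TiO5 at solar metallicity.

    coeffs are assumed to be in ascending power order:
    coeffs[0] + coeffs[1] * cah + coeffs[2] * cah**2 + ...
    """
    return sum(c * cah ** k for k, c in enumerate(coeffs))


def zeta_index(tio5, cah2, cah3, coeffs):
    """Return zeta = (1 - TiO5) / (1 - TiO5_solar(CaH2 + CaH3))."""
    reference = tio5_solar(cah2 + cah3, coeffs)
    denominator = 1.0 - reference
    if denominator == 0:
        raise ValueError("reference TiO5 equals 1; zeta is undefined")
    return (1.0 - tio5) / denominator


# Placeholder coefficients for illustration only (reference TiO5 = 0.5 * CaH).
placeholder_coeffs = (0.0, 0.5, 0.0, 0.0)
zeta_example = zeta_index(0.5, 0.5, 0.5, placeholder_coeffs)
```

With this definition, a star whose TiO5 index equals the solar-metallicity reference at its CaH value has ζ = 1, and weaker TiO absorption (larger TiO5) at fixed CaH gives ζ < 1.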

Table 2. Catalog description of ∼ 750,000 M dwarfs