The role of carbon in red giant spectro-seismology

Although red clump stars function as reliable standard candles, their surface characteristics (i.e. $T_\text{eff}$, $\log g$, and [Fe/H]) overlap with those of red giant branch stars, which are not standard candles. Recent results have revealed that spectral features containing carbon (e.g. CN molecular bands) carry information correlating with the "gold-standard" asteroseismic classifiers that distinguish red clump from red giant branch stars. However, the underlying astrophysical processes driving the correlation between these spectroscopic and asteroseismic quantities in red giants remain inadequately explored. This study aims to enhance our understanding of this "spectro-seismic" effect by refining the list of key spectral features predicting red giant evolutionary state. In addition, we investigate those key spectral features further to probe the astrophysical processes driving this connection. We employ the data-driven algorithm The Cannon to analyse high-resolution ($R\sim80,000$) Veloce Rosso spectra from the Anglo-Australian Telescope for 301 red giant stars (asteroseismic classifications from the TESS mission are known for 123 of them). The results highlight molecular spectroscopic features, particularly those containing carbon (e.g. CN), as the primary indicators of the evolutionary states of red giant stars. Furthermore, by investigating CN isotopic pairs (that is, $^{12}$C$^{14}$N and $^{13}$C$^{14}$N) we find indications of statistically significant differences in the reduced equivalent widths of such lines, suggesting that physical processes that change the surface abundances and isotopic ratios in red giant stars, such as deep mixing, are the driving forces of the "spectro-seismic" connection of red giants.


INTRODUCTION
Red clump (RC) stars are effective standard candles. The RC represents a stage of stellar evolution experienced by all low-mass ($0.8 - 2.0\,M_\odot$) stars. Following the red giant branch (RGB) stage of evolution, in which an inert helium core is surrounded by a shell of hydrogen fusion, the RC phase has both core helium burning and shell hydrogen burning active. The ignition of this core helium fusion happens at the same core mass ($0.47\,M_\odot$; Girardi 2016) regardless of the original mass of the star. As a result, RC stars possess well-confined luminosities, with $\log(L_\text{RC}/L_\odot) = 1.95$, and have a mean dispersion of ∼ 0.17 ± 0.03 mag in all colour bands (Hawkins et al. 2017).
One limitation of RC stars as standard candles is that they have very similar surface features, i.e. $T_\text{eff}$, $\log g$, [Fe/H], and luminosity, to lower RGB stars, which are not standard candles. This makes RC stars difficult to classify accurately using photometry and stellar parameters. However, the disambiguation of the RC from the RGB can be achieved effectively using asteroseismology (Montalbán et al. 2010; Bedding et al. 2011; Chaplin & Miglio 2013; Stello et al. 2013; Mosser et al. 2014). RC and RGB stars exhibit distinct asteroseismic oscillations due to their different internal structures. At the onset of helium fusion, convection is present in the helium core, accompanied by an increase in the size of the core. This decreases the density contrast between the core and the surrounding hydrogen-burning shell (Montalbán et al. 2010, 2013), resulting in significant differences in the Brunt-Väisälä frequency. Therefore, RC and RGB stars exhibit different ranges of large frequency separation, $\Delta\nu$, and period spacing, $\Delta\Pi$, and are as a result easily distinguishable in $\Delta\nu - \Delta\Pi$ space (Chaplin & Miglio 2013).
★ Based on observations collected with the high-resolution ($R \sim 80,000$) Veloce spectrograph at the Anglo-Australian Telescope.
† E-mail: k.banks@unsw.edu.au
The oscillations present in red giants are measurable with sufficiently long time-series observations of photometry or radial velocity (e.g. 82 days for K2; Hon et al. 2018). However, it has been shown recently that this asteroseismic information is imprinted into the spectra of RC and RGB stars, beyond the radial velocity signal driven by Solar-like oscillations (e.g., Hawkins et al. 2018; Casey et al. 2019; Banks et al. 2023). This allows for the efficient classification of many more RC stars, because spectroscopic surveys (e.g. APOGEE, GALAH; Majewski et al. 2017; Buder et al. 2021, respectively) capture more stars across a larger volume of the Galaxy than asteroseismic surveys (e.g. CoRoT, Kepler, K2, TESS; Baglin et al. 2006; Gilliland et al. 2010; Howell et al. 2014; Ricker et al. 2015, respectively).
Hawkins et al. (2018) used the data-driven algorithm The Cannon (Ness et al. 2015; Casey et al. 2016) to determine whether red giant spectra could effectively predict asteroseismic parameters such as the large frequency separation and period spacing (i.e. $\Delta\nu$ and $\Delta\Pi$), or classify RC and RGB stars. They trained The Cannon on 1,676 red giants with single-epoch infrared spectra from the APOGEE survey ($1.5 - 1.7\,\mu$m; Majewski et al. 2017). They found that the spectra of RC and RGB stars of similar temperatures, surface gravities and metallicities are overall quite similar except around molecular CN and CO line features. Hawkins et al. (2018) suggest that the differences in these molecular features are to be expected from mixing along the RGB causing RC stars to have a lower [C/N] ratio compared to RGB stars with similar $T_\text{eff}$, $\log g$, and [Fe/H].
In Banks et al. (2023) we performed a broader investigation into this spectro-seismic connection. We used moderate-resolution X-Shooter (Vernet et al. 2011) spectra of 49 red giant stars covering a tenfold larger wavelength range (i.e. $0.33 - 2.5\,\mu$m) compared to Hawkins et al. (2018). We followed a similar strategy to that used by Hawkins et al. (2018), and used The Cannon to generate a model trained on the spectra of the 49 stars. This model predicts the flux at each wavelength pixel as a quadratic function of stellar and asteroseismic labels, specifically $T_\text{eff}$, $\log g$, [Fe/H], $\Delta\nu$, and RC_Prob (the seismic probability class prediction determined through the neural network classifier detailed in Hon et al. 2018). From this investigation, similarly to Hawkins et al. (2018), we also found that molecular features, particularly CN and CO, are the most useful in distinguishing RC stars from RGB stars.
In addition to the CN, CO and CH features, we found that some atomic features also hold some significance, including iron, titanium and the iron-peak elements vanadium and nickel. However, these features are covariant with other stellar labels, particularly $\log g$ and $T_\text{eff}$. Therefore, those atomic features may not be as important in predicting the evolutionary state of red giants as The Cannon model suggests, relative to other stellar parameters.
In this study we investigate these features further, utilising a larger sample of red giant stars with known classifications from TESS asteroseismology ($N = 123$) and high-resolution spectra obtained with the Veloce Rosso spectrograph at the Anglo-Australian Telescope (AAT) (Gilbert et al. 2018). Utilising a higher spectral resolution allows us to explore in more detail the spectral features identified in Banks et al. (2023), and potentially identify additional classification features not resolved in the moderate-resolution ($R \sim 10,000$) spectra of the previous study. We discuss this in Section 3.2.
In addition, we analyse the spectral features that hold the most significance in predicting RC/RGB evolutionary state, thus providing insight into the astrophysical processes that drive the spectro-seismic connection of red giants. In particular, we focus on the photospheric abundances of carbon-bearing molecules and the $^{12}$C/$^{13}$C isotopic ratio, which are influenced by the first dredge-up, deep mixing along the RGB, and the helium flash. This is explored in Section 4.

DATA
We made an initial selection of stars from a list of red giants collated from the K2- and TESS-HERMES surveys (Sharma et al. 2019, 2018, respectively) with effective temperatures $4300 < T_\text{eff} < 5100$ K, surface gravity $2.2 < \log g < 2.5$, and metallicity $-0.7 <$ [Fe/H] $< 0.3$. These stars have also been explored with asteroseismology data from both the K2 mission (Howell et al. 2014) and the TESS mission (Ricker et al. 2015) and their evolutionary states have been determined by Hon et al. (2018, 2022). We made observations for a total of 301 stars (see Fig. 1) with the Rosso camera of the Veloce spectrograph (5800 − 9500 Å, $R \sim 80,000$; Gilbert et al. 2018) in conjunction with the 4 m Anglo-Australian Telescope (AAT) at Siding Spring Observatory over the course of two observing semesters, 2020B and 2021B. Table 1 is a representative list of Gaia IDs, on-sky coordinates, observation date, exposure time,  2 apparent magnitude, stellar parameters ($T_\text{eff}$, $\log g$, [Fe/H]) and asteroseismic labels ($\Delta\nu$ and evolutionary state, 0 = RGB and 1 = RC) for the stars in our data set. A complete table detailing all red giants observed in our sample is included in the supplementary online material of this paper.
Veloce collects light from a 2.5″ diameter aperture using a 19-element hexagonal integral field unit (IFU), and uses optical fibres to reformat it into a 19-fibre linear pseudo-slit at the entrance of the echelle spectrograph. These 19 "star" fibres are supplemented by a ThXe simultaneous calibration fibre at one end of the pseudo-slit, a single-mode laser comb simultaneous calibration fibre at the other end of the slit, and 5 "sky" fibres offset significantly on-sky from the target star fibres.
Each star was observed in a similar manner to the observations of Palumbo et al. (2022), in a series of multiple 600 s exposures, such that the total exposure time (given the conditions) delivered an S/N per pixel of >50 at ∼ 6670 Å. For example, a star with an apparent -band magnitude of 9.5 mag in typical AAT 1.5″ seeing required 3×600 s exposures, while a  = 11 star required 6×600 s exposures. At the middle of each 600 s exposure, a 0.5 s laser comb exposure was taken as part of the standard Veloce calibration process. The laser comb inserts ∼ 10,000 diffraction-limited calibration lines into a single-mode fibre at the end of the Veloce pseudo-slit. Periodic ThXe exposures are obtained as well, and the observed 'distance' on the detector between the resulting laser comb and ThXe spots is used to calibrate the (slow) time-variation of the apparent length of the pseudo-slit.
The extraction of spectra from the Veloce Rosso echellogram uses a parametric spectrograph model, the parameters of which (the effective echelle grating spacing, the effective cross-disperser grating spacing, the pseudo-slit linear plate-scale, the detector X,Y zero-point position and detector rotation) are determined from each laser comb exposure, by doing astrometry using DAOPHOT on the individual laser comb lines and treating them as "point sources". These X,Y positions determine the spectrograph parameters for each exposure. Those parameters are then smoothed and interpolated as a function of time, to enable a spectrograph model to be derived for any intervening exposure (even if a laser comb was not obtained, as unfortunately does sometimes happen).
Table 1 (caption fragment): ... et al. (2021), asteroseismic parameter $\Delta\nu$ from Reyes et al. (2022), and evolutionary state (0 = RGB; 1 = RC) from Hon et al. (2018, 2022) for the stars in our data set.
The spectrograph model for each observation then predicts the fibre tracks on the detector for each pseudo-slit fibre in each echellogram order, driving an optimal extraction for all the fibres in all the orders. The optimal extraction requires model fibre profiles for each fibre in each order as a function of Y (i.e. the high-resolution dispersion direction) position, and these are obtained by fits to flat-field observations in 60 × 100 pixel chunks. The environmental stabilisation of the spectrograph ensures that these profiles do not change with time. This final spectrograph model is then used to extract the spectra from the CCD images and write them into 3D data cubes (Y × fibre number × echelle order). The spectrograph model also provides a wavelength for each pixel in each order and in each fibre, and these are used to calibrate the wavelength of the optimally extracted spectra.
Because the spectrograph model is based on grating equations and a model for the relevant angles within the spectrograph, a first-order blaze function can be constructed based on the $\mathrm{sinc}^2$ function for the relevant echelle grating diffraction angle. A polynomial correction to this "ideal" $\mathrm{sinc}^2$ blaze function was constructed from hundreds of nights of flat-field data, to produce a single "master blaze" that can be applied to all data.
The data are then scrunched to a common constant-velocity-spacing grid, and pixels associated with telluric absorption deeper than 2 per cent (using the telluric map from the NASA Planetary Spectrum Generator for a typical AAT exposure) are masked as bad.
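The resampling and masking step can be illustrated with a minimal sketch. This is not the Veloce pipeline: the function names, the 1 km/s pixel step and the toy transmission values are assumptions chosen only to show what a constant-velocity-spacing grid and a 2 per cent telluric depth cut look like in practice.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def constant_velocity_grid(wave_min, wave_max, dv_kms):
    """Wavelengths in which every pixel step equals the same velocity step."""
    ratio = 1.0 + dv_kms / C_KMS  # wavelength ratio of adjacent pixels
    n_pix = int(np.floor(np.log(wave_max / wave_min) / np.log(ratio))) + 1
    return wave_min * ratio ** np.arange(n_pix)

def telluric_mask(transmission, max_depth=0.02):
    """True for pixels where telluric absorption is deeper than max_depth."""
    return (1.0 - np.asarray(transmission, dtype=float)) > max_depth

# Illustrative use with made-up numbers (dv_kms = 1.0 is hypothetical).
grid = constant_velocity_grid(5800.0, 9500.0, dv_kms=1.0)
native_wave = np.linspace(5800.0, 9500.0, 50_000)
native_flux = np.ones_like(native_wave)
resampled_flux = np.interp(grid, native_wave, native_flux)

# Transmission of 1.00, 0.99 and 0.97: only the last pixel is masked.
bad = telluric_mask([1.00, 0.99, 0.97])
```

A grid of this kind is uniform in $\ln\lambda$, so a Doppler shift moves every feature by the same number of pixels regardless of wavelength.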
These processing steps result in extracted exposure files with individual spectra from the 19 star fibres, five sky fibres, one ThXe calibration lamp fibre and one laser comb fibre, for each of the 40 spectral orders covering the wavelength range 5800 − 9500 Å.
Following data extraction, each fibre is normalised by relative fibre throughput. Then an average sky spectrum is determined for each order by averaging the flux from the sky fibres, and this is subtracted from the flux in each star fibre. The sky-subtracted star flux in each fibre is then normalised and combined into an average stellar spectrum for each order, after which the orders are merged. Merging the orders of echelle spectra often results in a distinct ripple shape in the merged spectra (e.g., Cretignier et al. 2020; Różański et al. 2022), so estimating a pseudo-continuum using a low-order polynomial to normalise the flux is non-trivial. Therefore, we made use of the neural network normalisation tool SUPPNet (Różański et al. 2022). This tool maps from the domain of spectrum measurements to the domain of possible pseudo-continua using a fully convolutional neural network based on the semantic segmentation problem. Utilising this tool in our data reduction process successfully removed the distinct ripple shape and resulted in continua normalised to ∼ 1.
The final step of data reduction, before the spectra are in a suitable state for use and analysis in The Cannon, is performing velocity corrections. First is a heliocentric correction, followed by correcting for the stars' radial velocity via a Doppler correction. We achieve this with cross-correlation via the crosscorrRV function from the PyAstronomy Python package (Czesla et al. 2019). We choose one star in the data set with an estimated radial velocity close to zero from the GALAH DR3 catalogue (star_id: 05185082-5707494, RV = 0.0187 km/s) to be the template spectrum. The remaining spectra, i.e. the observation spectra, are masked to a spectral window between 8480 Å and 8680 Å where there are numerous strong features present in all spectra, such as the Ca II infrared triplet. The template spectrum is then Doppler shifted across a range of possible relative velocities (−150 < RV < 250 km/s in steps of 0.1 km/s) and linearly interpolated to the wavelength points of each observation spectrum to calculate the cross-correlation function. This is evaluated for $N$ data points $f_i$ at wavelengths $\lambda_i$, as a function of the velocity $v_j$ and with weights $w_i$, where the Doppler shift is $\Delta\lambda_{i,j} = \lambda_i (v_j/c)$. The weights used in this calculation are simply the inverse square of the flux uncertainties.
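A minimal NumPy sketch of a weighted cross-correlation of this kind follows. It is not the PyAstronomy implementation: the function name `weighted_ccf`, the use of line depths (one minus the normalised flux) in the sum, and the synthetic single-line spectra are all assumptions made for illustration.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def weighted_ccf(wave, flux, ivar, t_wave, t_flux, rv_grid):
    """Weighted cross-correlation of an observed spectrum with a template.

    For each trial velocity the template wavelength axis is Doppler shifted,
    the template is linearly interpolated onto the observed wavelengths, and
    the weighted product of the two line-depth spectra is summed.
    """
    ccf = np.empty(len(rv_grid))
    for j, v in enumerate(rv_grid):
        shifted = t_wave * (1.0 + v / C_KMS)
        t_interp = np.interp(wave, shifted, t_flux)
        # Line depths are used so that aligned absorption lines give a peak
        # (an illustrative choice, not taken from the paper).
        ccf[j] = np.sum(ivar * (1.0 - flux) * (1.0 - t_interp))
    return ccf

# Synthetic example: one Gaussian line, observed spectrum shifted by 30 km/s.
wave = np.linspace(8480.0, 8680.0, 4000)
line = 8542.0  # hypothetical rest wavelength in Angstrom

def gaussian_spectrum(centre):
    return 1.0 - 0.6 * np.exp(-0.5 * ((wave - centre) / 0.5) ** 2)

t_flux = gaussian_spectrum(line)
flux = gaussian_spectrum(line * (1.0 + 30.0 / C_KMS))
ivar = np.full_like(wave, 1.0 / 0.01**2)  # inverse-variance weights

rv_grid = np.arange(-150.0, 250.0, 0.1)
ccf = weighted_ccf(wave, flux, ivar, wave, t_flux, rv_grid)
best_rv = rv_grid[np.argmax(ccf)]  # velocity at the global maximum
```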
The relative velocity between the template spectrum and an observation spectrum is the velocity at which the cross-correlation function reaches its global maximum. We then Doppler shift the unmasked observation spectra by the determined relative velocity, which results in different wavelength arrays for each spectrum. The Cannon requires all spectra to be on the same wavelength grid, so we perform a linear interpolation of all spectra onto a common wavelength grid. Following Doppler correction, we mask common regions across all spectra that exhibit telluric features.
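As a sketch of this shift-and-resample step, the snippet below removes a known velocity and interpolates the result onto a shared grid. The function name `to_rest_frame`, the non-relativistic Doppler formula and the toy line at 6563 Å are assumptions for illustration only.

```python
import numpy as np

C_KMS = 299792.458  # speed of light in km/s

def to_rest_frame(wave, flux, rv_kms, common_wave):
    """Remove a radial velocity shift and resample onto a shared grid."""
    rest_wave = wave / (1.0 + rv_kms / C_KMS)  # non-relativistic Doppler shift
    return np.interp(common_wave, rest_wave, flux)

# Synthetic example: a line with rest wavelength 6563 A observed at +50 km/s.
wave = np.linspace(6545.0, 6580.0, 3501)
observed_centre = 6563.0 * (1.0 + 50.0 / C_KMS)
flux = 1.0 - 0.5 * np.exp(-0.5 * ((wave - observed_centre) / 0.3) ** 2)

common_wave = np.linspace(6550.0, 6575.0, 2501)
rest_flux = to_rest_frame(wave, flux, 50.0, common_wave)
line_position = common_wave[np.argmin(rest_flux)]  # close to 6563 A
```

Applying the same `common_wave` to every star yields the rectangular flux array (stars by pixels) that a per-pixel model requires.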
Following these steps, the spectra are then in a suitable state for use and analysis in The Cannon as well as for direct comparison of spectral features for the investigation explored in Section 4.
In addition to normalised spectra on a common wavelength grid, The Cannon also requires a precise estimate of the flux variance for each pixel of the spectra. We assume the flux uncertainty to be Gaussian and hence use it to weight the influence of certain pixels when computing the best set of labels via minimum $\chi^2$ estimation. For computational speed, The Cannon is fed with the inverse variance, i.e. the inverse of the squared flux uncertainty. Another necessary input to establish a spectral model with The Cannon is the set of labels that describe the spectra used to train the model, for example, stellar parameters such as $T_\text{eff}$, $\log g$, and [Fe/H]. In this investigation, we utilise those stellar parameters from GALAH DR3 (Buder et al. 2021). GALAH $T_\text{eff}$ is estimated from the spectra rather than via photometry, with a precision of 49 K, and $\log g$ is estimated via bolometric relations. Buder et al. (2021) find excellent agreement with the values of $\log g$ for Gaia Benchmark Stars as well as those that are cross-validated with asteroseismology. Finally, the [Fe/H] abundances reported in GALAH DR3 are determined from a global abundance of [Fe/H] from all Fe lines in the GALAH DR3 line list, resulting in a precision of 0.055 dex.
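The conversion from flux uncertainty to inverse variance is simple, but masked or non-physical pixels need care. A minimal sketch follows; the function name and the convention of assigning zero weight to bad pixels are assumptions (a common choice, not one stated in the text).

```python
import numpy as np

def inverse_variance(flux_err, bad_pixel_mask=None):
    """Inverse variance 1/sigma^2, with zero weight for unusable pixels."""
    flux_err = np.asarray(flux_err, dtype=float)
    ivar = np.zeros_like(flux_err)
    good = np.isfinite(flux_err) & (flux_err > 0)
    if bad_pixel_mask is not None:
        good &= ~np.asarray(bad_pixel_mask, dtype=bool)
    ivar[good] = 1.0 / flux_err[good] ** 2
    return ivar

# Four pixels: one valid, one with zero error, one NaN, one flagged as bad.
ivar = inverse_variance([0.1, 0.0, np.nan, 0.5],
                        bad_pixel_mask=[False, False, False, True])
```

A weight of zero means the pixel contributes nothing to the $\chi^2$ sum, which is how masked telluric regions can be ignored without changing the array shape.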
We also incorporate asteroseismic information derived from TESS asteroseismology. A total of 123 stars in our data set (shown in Figure 2) have reliable asteroseismic parameters available for our analysis. These include the large frequency separation, $\Delta\nu$, and their evolutionary state classification. We use the $\Delta\nu$ values presented in Reyes et al. (2022). These values are determined for red giants with at least six months of TESS observations via the SYD asteroseismic pipeline (Huber et al. 2009, 2011; Yu et al. 2018). Reyes et al. (2022) vet those values and determine a reliability score using a neural network classifier. This ensures that $\Delta\nu$ estimates are reliable regardless of the method used for their estimation. The evolutionary states of these red giants have been identified in Hon et al. (2018) and Hon et al. (2022).

ANALYSIS WITH The Cannon
The Cannon (Ness et al. 2015; Ho et al. 2016; Casey et al. 2016) is a powerful tool that allows for the prediction of stellar parameters from observed stellar spectra. It is a data-driven algorithm that predicts stellar parameters without relying on a grid of synthetic spectra, unlike other tools that employ a physics-based approach by fitting an observed spectrum to a synthetic spectrum. One particular advantage of The Cannon is its ability to predict stellar labels from lower signal-to-noise data with comparable accuracy to current physics-based approaches.
The Cannon works in essentially two steps: the training step and the test step. The training step involves creating a generative spectral model from a set of input spectra and the stellar labels that describe those spectra (i.e. the training objects and training labels respectively). This model describes a probability density function (pdf) for the flux at every pixel in the spectrum as a function of input labels. Following the training step, the test step assumes this generative model holds for all other objects in the data set (i.e. the test objects). The spectra of the test objects are each compared to the generative model, which allows for the labels of the test objects to be solved, providing, in effect, a label transfer from the training objects to the test objects. See Ness et al. (2015) for a more in-depth explanation.

Validating The Cannon Model
First, we conduct a bootstrap analysis to assess the reliability of The Cannon and to ensure consistent results in identifying significant spectral features for classifying red giants, irrespective of the selection of stars used to build the spectral model. We generated 1,000 training sets, each made of the spectra of a random selection of 18 RGB stars and 54 RC stars from the stars in the data set with known asteroseismic classifications. This results in a training set size of 72, i.e. ∼ 60 per cent of the total number of red giants with known asteroseismic classifications, and is representative of the ratio of RGB and RC stars in that sample. These training sets were used to train 1,000 The Cannon models with the asteroseismic labels $\Delta\nu$ and evolutionary state, as well as the stellar parameters $T_\text{eff}$, $\log g$, and [Fe/H], to predict the flux at each pixel as a quadratic function of these labels.
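Drawing such training sets could look like the sketch below. The function name is hypothetical, and sampling without replacement within each class is an assumption, since the text only specifies random selections of 18 RGB and 54 RC stars.

```python
import numpy as np

def draw_training_set(ev_state, n_rgb=18, n_rc=54, rng=None):
    """Indices of a random training set with fixed class counts.

    ev_state uses 0 = RGB and 1 = RC. Stars are drawn without replacement
    within each class (an assumption made for this sketch).
    """
    rng = np.random.default_rng(rng)
    ev_state = np.asarray(ev_state)
    rgb = rng.choice(np.flatnonzero(ev_state == 0), size=n_rgb, replace=False)
    rc = rng.choice(np.flatnonzero(ev_state == 1), size=n_rc, replace=False)
    return np.concatenate([rgb, rc])

# 28 RGB and 95 RC stars, matching the counts quoted for the classified sample.
ev_state = np.array([0] * 28 + [1] * 95)
training_sets = [draw_training_set(ev_state, rng=seed) for seed in range(1000)]
```

Each entry of `training_sets` would then be used to train one model, giving 1,000 sets of coefficients to compare.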
We investigate the variability of the individual model coefficients ($\theta$), with a particular focus on the asteroseismic labels. This is achieved by calculating the coefficient of variation (CV; Koopmans et al. 1964) of each pixel for each $\theta$ across all models. The CV measures the variability of each model coefficient at each pixel across all models, normalised by the mean, providing insight into the impact of the selection of the training set. We choose to focus on the asteroseismic labels in our validation because it has previously been shown that The Cannon can predict stellar labels (i.e., $T_\text{eff}$, $\log g$, and [Fe/H]) with high accuracy (e.g., Ness et al. 2015; Hawkins et al. 2017). The top panel of Figure 3 represents the CV for $\theta_0$, which is the "base spectrum" of The Cannon model (Ness et al. 2015). The following panels represent the CV for the remaining model coefficients pertaining to the asteroseismic labels: $\Delta\nu$, ev_state, $\Delta\nu$ × ev_state, $\Delta\nu^2$, and ev_state$^2$ respectively. The distribution of the CV for each pixel across all bootstrap models peaks below 1, indicating low overall variation (i.e. the standard deviation of the distribution of the model coefficients is less than the mean). Therefore, we can confidently expect consistent results in identifying significant spectral features for the classification of red giants, irrespective of the selection of RC and RGB stars for the training set. In addition, we are confident that the coefficients of the model we use in Section 3.2 are meaningful, and that the spectral features we identify as significant based on the coefficients do truly correspond to whether a star belongs to the RC or RGB. In Appendix A we provide more detail on the RC/RGB classifications produced by The Cannon and how those are affected by the selection of the training set. We further compare the reliability of these classifications to alternative methods in the literature such as those based on photometry and stellar parameters.
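A sketch of the CV calculation for a stack of bootstrap coefficients is given below. The array layout (models, pixels, coefficients), the use of the absolute mean in the denominator and the synthetic coefficients are assumptions for illustration.

```python
import numpy as np

def coefficient_of_variation(theta_boot):
    """CV = standard deviation / |mean| across bootstrap models.

    theta_boot has shape (n_models, n_pixels, n_coefficients); the result
    has shape (n_pixels, n_coefficients). Taking the absolute mean keeps
    the CV positive for negative coefficients (an assumption of this sketch).
    """
    mean = theta_boot.mean(axis=0)
    std = theta_boot.std(axis=0, ddof=1)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(mean != 0, std / np.abs(mean), np.nan)

# Synthetic coefficients with mean 1.0 and scatter 0.1, so CV is about 0.1.
rng = np.random.default_rng(1)
theta_boot = rng.normal(loc=1.0, scale=0.1, size=(1000, 5, 3))
cv = coefficient_of_variation(theta_boot)
```

A CV below 1 at a given pixel means the scatter of that coefficient between bootstrap models is smaller than its typical size.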

Identifying Significant Spectral Features
We initially identify prominent spectral features in the spectra of our data set using the find_lines_derivative function from the specutils Python package (Astropy-Specutils Development Team 2019). This method identifies emission and absorption features in a spectrum based on finding zero crossings in its derivative, thus indicating the bottom of an absorption feature or the top of an emission feature.
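The zero-crossing idea can be shown with a short NumPy stand-in. This is not the specutils implementation: the function name, the depth threshold and the two synthetic lines are assumptions, and only absorption features in a normalised spectrum are handled.

```python
import numpy as np

def find_absorption_minima(wave, flux, depth_threshold=0.05):
    """Wavelengths where the flux derivative crosses zero from below.

    A change in slope from negative to positive marks the bottom of an
    absorption feature. Shallow minima are rejected with a depth cut.
    """
    slope = np.diff(flux)
    idx = np.flatnonzero((slope[:-1] < 0) & (slope[1:] > 0)) + 1
    keep = (1.0 - flux[idx]) > depth_threshold
    return wave[idx[keep]]

# Synthetic normalised spectrum with two Gaussian absorption lines.
wave = np.linspace(6295.0, 6305.0, 1001)
flux = (1.0
        - 0.4 * np.exp(-0.5 * ((wave - 6300.0) / 0.1) ** 2)
        - 0.3 * np.exp(-0.5 * ((wave - 6302.5) / 0.1) ** 2))
found_lines = find_absorption_minima(wave, flux)
```

On real spectra the depth cut (or a noise-based threshold) matters, because noise alone produces many spurious zero crossings.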
Once spectral features are identified within the spectra, we then cross-reference those wavelengths with a line list for the wavelength range of the Rosso camera of the Veloce spectrograph (5800 − 9500 Å), created with the atomic and molecular line list generator linemake (Placco et al. 2021). In the high-resolution spectra of the red giants in our data set, we identify a total of 1,823 spectral absorption features. The majority of these features are CN (56 per cent), Fe (12 per cent), CH (4 per cent) and Ti (4 per cent).
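One way to cross-reference detected wavelengths against a line list is a nearest-neighbour match within a tolerance, sketched below. The function name, the 0.05 Å tolerance and the three-entry toy line list (built from approximate wavelengths quoted elsewhere in the text) are assumptions; the matching rule actually used is not described.

```python
import numpy as np

def match_to_line_list(found_wave, list_wave, list_species, tol=0.05):
    """Nearest line-list entry for each detected wavelength, within tol (A).

    Returns a list of (wavelength, species) pairs, with None for the species
    when no entry lies within the tolerance.
    """
    found_wave = np.asarray(found_wave, dtype=float)
    order = np.argsort(list_wave)
    lw = np.asarray(list_wave, dtype=float)[order]
    sp = np.asarray(list_species)[order]
    pos = np.clip(np.searchsorted(lw, found_wave), 1, len(lw) - 1)
    nearest = np.where(found_wave - lw[pos - 1] <= lw[pos] - found_wave,
                       pos - 1, pos)
    ok = np.abs(lw[nearest] - found_wave) <= tol
    return [(float(w), str(sp[i]) if good else None)
            for w, i, good in zip(found_wave, nearest, ok)]

# Toy line list with approximate, illustrative wavelengths.
list_wave = [6316.7, 6317.5, 6614.0]
list_species = ["CH", "CN", "Fe I"]
matches = match_to_line_list([6316.72, 6317.48, 6320.0], list_wave, list_species)
species_found = [s for _, s in matches]
```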
To determine which spectral features are significant in the prediction of red giant evolutionary state, we train The Cannon with the spectra of all stars with asteroseismic information (95 RC; 28 RGB). We establish a spectral model with The Cannon that predicts the flux of red giants as a quadratic function of $T_\text{eff}$, $\log g$, [Fe/H], $\Delta\nu$ and RC/RGB evolutionary state. This allows us to investigate the values of the coefficients for the linear and quadratic terms pertaining to the evolutionary state of red giants and determine which pixels are most useful in their classification.
We initially define a spectral feature as significant to the prediction of red giant evolutionary state following a similar method to that in Banks et al. (2023). A spectral feature is significant if any of the pixels that are part of the line, and not the surrounding continuum, exhibit an ev_state model coefficient whose magnitude exceeds the 90th percentile of the distribution (±0.0047). This is effective as a first identification since the linear term model coefficients represent the lowest order dependence of the flux on the respective label (Buder et al. 2021). We further refine our selection to require that significant spectral features must also exhibit an ev_state$^2$ model coefficient whose magnitude exceeds the 90th percentile of the distribution of the ev_state$^2$ coefficients (±0.03). This results in a total of 504 identified significant features.
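The selection rule can be sketched as follows. Several details are assumptions: the percentile is taken over the absolute coefficient values, the linear and quadratic criteria may be met by different pixels of the same line, and the line names and pixel ranges are hypothetical.

```python
import numpy as np

def significant_features(line_pixels, lin_coeff, quad_coeff, pct=90):
    """Names of lines with a strong linear and quadratic ev_state response.

    line_pixels maps a line name to the pixel indices belonging to that line.
    A line is kept if any of its pixels exceeds the percentile cut in
    |linear coefficient| and any exceeds the cut in |quadratic coefficient|.
    """
    lin_cut = np.percentile(np.abs(lin_coeff), pct)
    quad_cut = np.percentile(np.abs(quad_coeff), pct)
    keep = []
    for name, pix in line_pixels.items():
        strong_lin = (np.abs(lin_coeff[pix]) > lin_cut).any()
        strong_quad = (np.abs(quad_coeff[pix]) > quad_cut).any()
        if strong_lin and strong_quad:
            keep.append(name)
    return keep

# Toy coefficients for a 100-pixel spectrum.
lin = np.full(100, 0.001)
lin[[10, 11, 12]] = -0.01
lin[[40, 41]] = 0.008
quad = np.full(100, 0.002)
quad[[10, 11]] = 0.05
quad[70] = 0.04

lines = {"CN_a": np.arange(9, 14),   # strong in both terms
         "Fe_b": np.arange(39, 43),  # strong in the linear term only
         "CH_c": np.arange(69, 72)}  # strong in the quadratic term only
selected = significant_features(lines, lin, quad)
```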
Figure 4 illustrates a representative selection of significant features that hold the most information regarding the evolutionary state of red giants according to their ev_state and ev_state$^2$ coefficients across two spectral windows (∼ 6256 ± 5 Å and ∼ 8087 ± 6 Å). In the upper panels, we show spectra for one RC star in red (Gaia DR3 ID: 4661172395405681920, $T_\text{eff} = 4511$ K, $\log g = 2.35$, [Fe/H] = 0.05) and one RGB star in black (Gaia DR3 ID: 4766063055901392128, $T_\text{eff} = 4514$ K, $\log g = 2.32$, [Fe/H] = −0.02). The solid line in the lower panels shows the linear ev_state model coefficient value for each wavelength pixel, while the dashed lines at ±0.0047 illustrate the 90th percentile of the distribution of the ev_state linear model coefficient. Spectral features that are identified as significant to the prediction of red giant evolutionary state are highlighted in grey.
The majority of spectral features that we identify as significant are CN molecular features, followed by Fe and CH. A total of 42 spectral features overlap with those found in the moderate-resolution VLT/X-Shooter spectra from our previous broad wavelength investigation (Banks et al. 2023). On visual inspection of the Veloce Rosso spectra we can see that 18 of those 42 overlapping spectral features are actually blended in the VLT/X-Shooter spectra. With the benefit of higher resolution, we can see that the significant feature is the one we identified in Banks et al. (2023) in 5 of the 18 cases, but in the other 13 cases the significant feature is due to a different transition (one example is shown in Figure 5). Lines with new, corrected identifications in our final line list of significant spectral features are noted in Appendix B.
We demonstrate this difference in Figure 5. Here we show a spectral window centred on the Fe I line at ∼ 6614 Å for an RC star with a moderate-resolution X-Shooter spectrum in magenta and a different RC star with a high-resolution Veloce Rosso spectrum in black. These stars have similar stellar parameters: $T_\text{eff} \sim 4715$ K, $\log g \sim 2.4$, [Fe/H] ∼ −0.2. It is evident from the Veloce Rosso spectrum that the Fe I line in the X-Shooter spectrum is blended with the highlighted CN line. The higher-resolution Veloce Rosso spectra enable us to distinguish the previously blended lines and identify that the significance is actually in the CN feature.
In our previous investigation (Banks et al. 2023), we did not probe potential covariances these significant features may have with other stellar labels. For example, a spectral feature identified as significant in the prediction of red giant evolutionary state may also be significant in the prediction of other stellar labels. The reason to include stellar parameters in The Cannon model is to detect and factor out their effects on the spectra of red giants so that the influence of red giant evolutionary state can be extracted as cleanly as possible. However, the quadratic model in The Cannon is not a perfect representation of stellar spectra, and we cannot completely separate the effects of abundance from the effects of $T_\text{eff}$, $\log g$, and [Fe/H] on line strengths.
To be as certain as possible that we are focusing on features that are significant for the prediction of red giant evolutionary state, we further restrict our list to only consider the lines that are significant for ev_state but are not significant for the stellar parameters. Of the initial 504 spectral features we found to be significant in the prediction of red giant evolutionary state, 293 (∼ 58 per cent) are also significant in predicting $T_\text{eff}$, 197 (∼ 39 per cent) are also significant in predicting $\log g$, and 244 (∼ 48 per cent) are also significant in predicting [Fe/H].
The top panel of Figure 6 shows the base spectrum of The Cannon model, i.e. a representative spectrum for a star with $T_\text{eff} = 4330$ K, $\log g = 2.2$, [Fe/H] = −0.95, $\Delta\nu = 2.80$ and ev_state = 0, in the spectral window ∼ 6318 ± 4 Å with prominent spectral features labelled. The lower panels show the model coefficients for the linear terms for the labels $T_\text{eff}$ in blue, $\log g$ in orange, [Fe/H] in green and evolutionary state in red. Spectral features that are significant to the prediction of each of these labels are highlighted in their respective colour in the upper panel. For example, the CH line at ∼ 6316.7 Å holds high significance in the prediction of red giant evolutionary state as well as $\log g$ and [Fe/H]. Requiring that significant lines are only significant for ev_state reduces our list to a total of 66 spectral features, which are again dominated by CN, including the CN lines at ∼ 6317.5 Å. For a full line list see Table B1.
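The final restriction amounts to a set difference, sketched below together with a check of the overlap percentages quoted above. The function name and the dictionary layout are hypothetical, and the third line in the toy example is a made-up placeholder.

```python
def only_ev_state(significant):
    """Lines significant for ev_state and for no other label.

    significant maps a label name to the set of line identifiers that are
    significant for that label.
    """
    others = set().union(*(lines for label, lines in significant.items()
                           if label != "ev_state"))
    return significant["ev_state"] - others

# Toy example based on the window around 6318 A described in the text.
significant = {
    "ev_state": {"CN 6317.5", "CH 6316.7", "placeholder line"},
    "teff": {"placeholder line"},
    "logg": {"CH 6316.7"},
    "feh": {"CH 6316.7"},
}
final_lines = only_ev_state(significant)

# Overlap percentages among the 504 ev_state-significant features.
overlap_counts = {"teff": 293, "logg": 197, "feh": 244}
overlap_per_cent = {label: round(100 * n / 504)
                    for label, n in overlap_counts.items()}
```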

SPECTRAL FEATURE ANALYSIS
Abundance determination for the stars in this data set is the topic of upcoming work, but the equivalent widths of significant features and The Cannon coefficients considered in this study already provide useful indicators of abundance, isotopic ratio, and the physical mechanisms responsible for the spectro-seismic connection. Our RGB stars were selected to have similar stellar parameters to our RC stars, which are fairly confined by nature. As a result, the relative strengths of absorption lines (as measured by their equivalent widths), which are the important factor for The Cannon, reflect abundances more directly than they would for a more heterogeneous set of stars, and we can use them as a rough proxy to compare the abundances between the RC and RGB stars. However, our data set does cover a range of 1 dex in [Fe/H], 800 K in $T_\text{eff}$ and 0.3 in $\log g$, and this will add some scatter to the mapping from abundance to line strength. We expect that any trends found in the spectral feature analysis will become clearer when viewed as abundance trends.
Among the 66 spectral features we find to be significant for predicting evolutionary state, the linear coefficient for the ev_state label in The Cannon is negative for the majority of CN lines and positive for the majority of CH lines. This indicates that the CN lines are stronger and the CH lines are weaker in RC stars than in RGB stars, implying a higher nitrogen abundance and lower carbon abundance in the more evolved stars. This is the direction we expect for those abundances to evolve as a result of deep mixing during the RGB phase (e.g., Gratton et al. 2000; Martell et al. 2008).
Figure 5. The top panel shows a comparison between a moderate-resolution X-Shooter spectrum (magenta) and high-resolution Veloce Rosso spectrum (black) for two RC stars with similar stellar parameters ($T_\text{eff} \sim 4715$ K, $\log g \sim 2.4$ and [Fe/H] ∼ −0.2). In the moderate-resolution X-Shooter spectrum, only the Fe I line can be resolved; however, in the high-resolution Veloce Rosso spectrum, this blended line is separated into the Fe I line and a CN molecular line. In the bottom panel, we show the model coefficients for the linear terms pertaining to the ev_state label (black) and the RC_Prob label (magenta) for The Cannon models trained on Veloce Rosso and VLT/X-Shooter spectra respectively. Here we also show the 90th percentile of the distribution of each model label (i.e. the threshold for significance used in both this study and Banks et al. 2023) in each respective colour as dotted lines. In this analysis, it is clear that the significant feature pertaining to the prediction of red giant evolutionary phase is the highlighted CN line and not the Fe I line.
The $^{12}$C/$^{13}$C ratio for carbon isotopes is also known to change as a result of deep mixing during the RGB phase (e.g., Charbonnel et al. 1998). Its typical value drops from around 20 to around 8, and remains low in RC and horizontal branch stars. To test whether this effect can be seen in our data set and whether the $^{12}$C/$^{13}$C ratio is meaningful for the spectro-seismic connection, we compared the ratio of line strength in $^{12}$C$^{14}$N and $^{13}$C$^{14}$N features in RC versus RGB stars. As a result of $^{12}$C$^{14}$N depletion and $^{13}$C$^{14}$N enrichment during deep mixing, a lower $^{12}$C/$^{13}$C ratio in RC stars should, therefore, result in a lower line strength ratio in those stars (modulo the effects of varying temperatures and metallicity within our data set). We identified three pairs of $^{12}$C$^{14}$N and $^{13}$C$^{14}$N features within our spectra (that is, three transitions with a clear isotope shift) where at least one of the pair has been identified as significant in our The Cannon model for the prediction of red giant evolutionary state (see Table 2). We use the REvIEW code (Routine for Evaluating and Inspecting Equivalent Widths; McKenzie et al. 2022), an automated tool that fits individual absorption lines using the scipy.optimize function curve_fit, to determine equivalent widths for those lines in the 123 stars for which we have asteroseismic classifications.
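As a rough illustration of what an equivalent width measures, the sketch below integrates the line depth of a continuum-normalised spectrum with the trapezoidal rule. It does not reproduce the REvIEW profile fit with curve_fit; direct integration is swapped in to keep the example dependency-free. The function names, the synthetic Gaussian line, and its depth and width are all hypothetical.

```python
import math

def equivalent_width(wavelength, flux, continuum=1.0):
    """Trapezoidal integral of (1 - F/F_c) over wavelength.

    The result has the same unit as the wavelength grid.
    """
    ew = 0.0
    for k in range(len(wavelength) - 1):
        depth_left = 1.0 - flux[k] / continuum
        depth_right = 1.0 - flux[k + 1] / continuum
        ew += 0.5 * (depth_left + depth_right) * (wavelength[k + 1] - wavelength[k])
    return ew

def gaussian_line(wl, centre, depth, sigma):
    """Continuum-normalised Gaussian absorption profile (illustrative only)."""
    return 1.0 - depth * math.exp(-0.5 * ((wl - centre) / sigma) ** 2)

# Hypothetical line near 8043 Angstrom; depth and sigma are made-up values.
grid = [8042.0 + 0.01 * k for k in range(201)]
flux = [gaussian_line(w, 8043.0, 0.3, 0.08) for w in grid]
ew = equivalent_width(grid, flux)

# For a Gaussian profile the integral is depth * sigma * sqrt(2 * pi).
ew_analytic = 0.3 * 0.08 * math.sqrt(2.0 * math.pi)
```

For an isolated Gaussian line the numerical and analytic values should agree closely; blended lines, as discussed for the X-Shooter comparison, are where profile fitting becomes necessary.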
To test for a difference in the $^{12}$C/$^{13}$C ratio between RC and RGB stars in our data set, we consider $\Delta \log \mathrm{EW}$ ($^{12}$C$^{14}$N − $^{13}$C$^{14}$N), the difference between the reduced equivalent widths for these isotopic line pairs, in the sense of $^{12}$C$^{14}$N minus $^{13}$C$^{14}$N. Due to the depletion of $^{12}$C$^{14}$N and the enrichment of $^{13}$C$^{14}$N along the RGB, we expect the values for the population of RC stars to be slightly less negative than those for the RGB population. Figure 7 shows this difference for the $^{12}$C/$^{13}$C line pair near 8043 Å across all 123 stars with asteroseismic classifications.
Table 2. The locations of the three $^{12}$C$^{14}$N and $^{13}$C$^{14}$N pairs within our spectra that we used to draw inferences about the $^{12}$C/$^{13}$C ratio in the RC and RGB stars in our sample. The isotopic transition from each pair that is identified as significant in our analysis is marked with an asterisk.
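The difference in reduced equivalent width for an isotopic line pair can be sketched as follows. This assumes the conventional definition of the reduced equivalent width, log10(EW/λ), which the text does not spell out; the function names and the numerical values are hypothetical and serve only to show the arithmetic.

```python
import math

def reduced_ew(ew, wavelength):
    """Reduced equivalent width, log10(EW / lambda).

    EW and wavelength must be in the same unit (e.g. Angstrom).
    """
    if ew <= 0 or wavelength <= 0:
        raise ValueError("EW and wavelength must be positive")
    return math.log10(ew / wavelength)

def delta_log_ew(ew_12cn, wl_12cn, ew_13cn, wl_13cn):
    """Reduced-EW difference in the sense 12C14N minus 13C14N."""
    return reduced_ew(ew_12cn, wl_12cn) - reduced_ew(ew_13cn, wl_13cn)

# Made-up equivalent widths (Angstrom) for a pair of lines near 8043 Angstrom.
example = delta_log_ew(0.020, 8043.0, 0.040, 8045.0)
```

Computing this quantity for every star and grouping by asteroseismic class gives the two distributions that are compared in Figure 7.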
There is a slight deviation in the distributions of the RC and RGB samples, suggesting a potential difference in the $^{12}$C/$^{13}$C ratio between the two populations due to carbon depletion from deep mixing along the RGB. We performed a two-sample Kolmogorov-Smirnov (KS) test to investigate whether the distributions of the difference in reduced EW for each line for RC and RGB stars are consistent with being drawn from the same parent distribution. This test returned a $p$-value of $2.3\times10^{-25}$, indicating that the two distributions are more likely to have been drawn from different source populations. However, additional abundance analysis is necessary to confirm whether the $^{12}$C/$^{13}$C ratio follows the anticipated trends resulting from carbon depletion along the RGB.
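For readers unfamiliar with the test, a minimal sketch of the two-sample KS statistic is given below, together with the standard large-sample approximation for its $p$-value. The text does not state which implementation was used; in practice a library routine such as scipy.stats.ks_2samp would be the usual choice. The function names here are hypothetical, and the asymptotic $p$-value is only approximate for small samples.

```python
import math

def ks_statistic(sample_a, sample_b):
    """Maximum absolute difference between the two empirical CDFs."""
    a, b = sorted(sample_a), sorted(sample_b)
    n, m = len(a), len(b)
    i = j = 0
    d = 0.0
    while i < n and j < m:
        x = min(a[i], b[j])
        # Step both ECDFs past every value equal to x before comparing.
        while i < n and a[i] <= x:
            i += 1
        while j < m and b[j] <= x:
            j += 1
        d = max(d, abs(i / n - j / m))
    return d

def ks_pvalue_asymptotic(d, n, m, terms=100):
    """Large-sample approximation: 2 * sum_k (-1)^(k-1) exp(-2 k^2 lam^2)."""
    lam = math.sqrt(n * m / (n + m)) * d
    if lam < 0.2:
        # The series converges slowly here and the p-value is essentially 1.
        return 1.0
    total = sum((-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2)
                for k in range(1, terms + 1))
    return min(1.0, max(0.0, 2.0 * total))
```

A small $p$-value means the null hypothesis of a common parent distribution can be rejected; it does not by itself identify the physical cause of the difference.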
Star-to-star differences in stellar parameters will add scatter to the line strengths even at fixed abundance, broadening both the RC and RGB distributions of reduced equivalent width difference. While this look at reduced equivalent widths is suggestive of a difference in the $^{12}$C/$^{13}$C ratio between our RC and RGB stars, isotopic abundances determined through spectrum synthesis are needed to be sure.

SUMMARY
This investigation is the first high-resolution study of the spectro-seismic connection of red giant stars. Here we have used Veloce Rosso spectra of 123 red giants to train a generative model with the data-driven algorithm The Cannon. We trained a model with 95 RC and 28 RGB stars with known asteroseismic classifications from TESS asteroseismology (Hon et al. 2018, 2022). With this The Cannon model, we were able to investigate which spectral features hold the most significance in predicting the evolutionary state of red giant stars, as well as probe these spectral features further to reveal more detail about the astrophysical processes that drive this spectro-seismic connection.
To identify significant spectral features for the prediction of red giant evolutionary state, we followed a method similar to that introduced in Banks et al. (2023). A spectral feature was determined to be significant in the prediction of red giant evolutionary state if the model coefficients pertaining to both the linear and quadratic terms for the evolutionary state label were greater than the 90th percentile of the distribution of those coefficients across all pixels. Following the same method as Banks et al. (2023), we found 504 such features that are significant in the prediction of red giant evolutionary state.
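The significance criterion can be sketched as a percentile cut on the per-pixel coefficients. The version below is a minimal illustration under two assumptions that are not stated explicitly in the text: the cut is applied to the absolute value of each coefficient, and the linear and quadratic terms each have their own threshold. The function names are hypothetical.

```python
def percentile(values, q):
    """Linearly interpolated q-th percentile (0 to 100) of a sequence."""
    ordered = sorted(values)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    return ordered[lower] + (ordered[upper] - ordered[lower]) * (pos - lower)

def significant_pixels(linear, quadratic, q=90.0):
    """Indices of pixels whose linear AND quadratic ev_state coefficients
    both exceed the q-th percentile of their own distributions.

    Assumption: thresholds are applied to absolute coefficient values.
    """
    abs_lin = [abs(c) for c in linear]
    abs_quad = [abs(c) for c in quadratic]
    thr_lin = percentile(abs_lin, q)
    thr_quad = percentile(abs_quad, q)
    return [k for k, (lin, quad) in enumerate(zip(abs_lin, abs_quad))
            if lin > thr_lin and quad > thr_quad]
```

Adjacent significant pixels would then be grouped into spectral features and matched against a line list; that step is not shown here.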
However, as noted in Banks et al. (2023), many of these significant features also hold some significance in the prediction of the other stellar labels ($T_\text{eff}$, $\log g$, and [Fe/H]). To be as certain as possible that we are focusing on features that are truly influenced by the evolutionary state of red giants, we refine our list of 504 significant features to exclude those that are also found to be significant in the prediction of other stellar labels. This reduces our list of significant features to 66, which are dominated by CN molecular features (∼ 53 per cent). This is in agreement with previous studies (e.g., Hawkins et al. 2018; Banks et al. 2023) that also find CN molecular features to be the dominant tracer of the spectro-seismic connection of red giant stars. In this process, we found that 42 of the significant features were also significant in the X-Shooter data we used for analysis in Banks et al. (2023). With the higher spectral resolution of Veloce Rosso we found that 18 of those were actually blended lines, and for 13 of the 18 the feature with the most significance for evolutionary state was not the one we originally identified. In Appendix B we have compiled a list of the 66 significant features that are only significant for predicting the evolutionary state of red giants, and we have noted the features for which our identification has changed since the previous study.
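The refinement from 504 to 66 features amounts to a set difference: keep what is significant for the evolutionary state and drop anything that is also significant for another label. A minimal sketch, with hypothetical label keys and feature identifiers, might look like this.

```python
def evolutionary_state_only(significant_by_label, target="ev_state"):
    """Features significant for `target` and for no other label.

    `significant_by_label` maps a label name to an iterable of feature
    identifiers (e.g. pixel indices or line wavelengths).
    """
    shared_with_others = set()
    for label, features in significant_by_label.items():
        if label != target:
            shared_with_others |= set(features)
    return sorted(set(significant_by_label[target]) - shared_with_others)

# Hypothetical feature identifiers, for illustration only.
example = evolutionary_state_only({
    "ev_state": [1, 2, 3, 4],
    "teff": [2],
    "logg": [4, 5],
    "feh": [],
})
```

The cut is deliberately conservative: a feature that carries evolutionary-state information but also responds to, say, $T_\text{eff}$ is discarded.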
Focusing on the features we find to be significant in the prediction of red giant evolutionary state, we found that, for the majority of significant CN and CH features, The Cannon coefficient for ev_state corresponds to stronger CN absorption and weaker CH absorption in RC stars. This suggests that the typically higher [N/Fe] and lower [C/Fe] abundance in RC stars relative to RGB stars is an important contributor to the spectro-seismic connection.
We also considered the relative strengths of absorption in CN isotopic pairs, i.e., the same transitions for $^{12}$C$^{14}$N and $^{13}$C$^{14}$N. There are three such pairs of lines in our red giant spectra where at least one member of the pair is identified as significant only to the prediction of red giant evolutionary state. We measured the reduced equivalent widths for these lines across all our red giants with asteroseismic classifications using the automated EW estimator REvIEW. We then performed a two-sample KS test to investigate whether the distributions of the difference in reduced EWs between $^{12}$C$^{14}$N and $^{13}$C$^{14}$N were significantly different between the RC and RGB populations. For the line pair near 8043 Å, this test returned a $p$-value of $2.3\times10^{-25}$, indicating that we can reject the null hypothesis that the two distributions were drawn from the same parent population.
These tendencies in line strength behave as we would expect if the physical processes that change the surface abundances and isotopic ratios in red giant stars, such as deep mixing and the helium flash, are the driving forces of the spectro-seismic connection in red giants. Follow-up work including elemental and isotopic abundance determination will help to solidify this connection and quantify whether red giants with similar $T_\text{eff}$ and $\log g$ can be distinguished by quantifying the C and N abundances or measuring $^{12}$C/$^{13}$C without data-driven analyses such as those performed with The Cannon.

DISCUSSION AND FUTURE WORK
This work builds on previous investigations, first by Hawkins et al. (2018) and Ting et al. (2018) using APOGEE (1.5 − 1.7 μm, R ∼ 28 000) and LAMOST (3600 − 9000 Å, R ∼ 1800) spectra for large samples of red giants, followed by a broad wavelength investigation by Banks et al. (2023) with red giant spectra from the VLT/X-Shooter spectrograph (0.33 − 2.5 μm, R ∼ 10 000). Molecular features involving carbon (i.e. CN, CO and CH) were found to be significant in the prediction of red giant evolutionary state in these studies; however, further investigation was required to understand which astrophysical processes could be responsible for the spectro-seismic connection.
Using the coefficients of The Cannon model for our 123 red giant stars with asteroseismic classifications, we find that the spectral features in our Veloce Rosso sample that are significant for separating RC and RGB stars are likely to correspond to the surface abundance changes caused by deep mixing on the RGB. Specifically, the depths of CN and CH absorption lines, along with the line depth ratios for the same transitions in $^{12}$C$^{14}$N versus $^{13}$C$^{14}$N, correlate with The Cannon coefficient for ev_state. This suggests that RC stars are expected to display lower [C/Fe], higher [N/Fe], and a lower $^{12}$C/$^{13}$C ratio compared to RGB stars.
The accumulated result of ongoing abundance changes caused by the first dredge-up and deep mixing on the RGB must be visible in the abundances of RC stars, which have already experienced their full RGB lifetime plus the helium flash at the tip of the RGB. However, a detailed abundance analysis based on these same spectra is necessary to be sure that the differences we find in absorption line strength between RC and RGB stars are truly representative of the expected changes in abundance during red giant evolution. A study on this topic is currently in progress.
Building from the idea that red giant evolution is the main driver of the spectro-seismic connection, we can make a number of testable predictions for the observable properties of RGB versus RC stars:
• The spectroscopic differentiation of upper RGB stars from RC stars will be more difficult than for lower RGB stars due to less distinct abundances. This does not present a significant difficulty, since these stars are more easily differentiated in $T_\text{eff}$-$\log g$-luminosity space.
• The spectro-seismic connection should be more pronounced in metal-poor red giants because deep mixing is much more efficient in red giants with lower metallicity. For example, Martell et al. (2008) found that the carbon depletion rate from deep mixing in red giants is doubled at a metallicity of [Fe/H] = −2.3 as compared to [Fe/H] = −1.3.
• Oxygen may also be a spectro-seismic indicator because it is also slightly depleted during phases of deep mixing (Weiss et al. 2000; Johnson & Pilachowski 2012). We do not find a significant correlation between atomic oxygen spectral features and the evolutionary state of red giants in our data. In addition, there are no CO lines present in the wavelength coverage of Veloce Rosso. Other moderate- to high-resolution spectroscopic surveys that derive oxygen abundance, e.g. APOGEE (Majewski et al. 2017), may be able to investigate this further.
• Lithium is often of high interest in studies of red giants; however, it will not be useful for this kind of study.Lithium is almost completely depleted by the first dredge-up and the strong mixing at the RGB bump (Lind et al. 2009), leaving little behind to allow for the detection of a significant difference between the lower RGB and RC.
• Previous studies (e.g., Hawkins et al. 2018;Ting et al. 2018;Banks et al. 2023) have discussed whether the helium flash may also be a driver of the spectro-seismic connection seen in red giants.Comparing the abundances of secondary RC stars to primary RC stars with the same metallicity might be a way to test this, since the secondary RC stars ignite helium fusion smoothly, without a flash.
In the pursuit of spectroscopically identifying RC stars, we have found that stars falling within the specific ranges of $T_\text{eff}$, $\log g$ and [Fe/H] explored in this study (4330 < $T_\text{eff}$ < 5030 K, 2.2 < $\log g$ < 2.5, and −0.95 < [Fe/H] < 0.31) have spectral features identifiable with high-resolution spectra that are effective for RC classification. Leveraging these features is particularly valuable for Galactic archaeology, as RC stars serve as standard candles. While the limiting magnitude of Veloce Rosso, ∼ 12 mag, confines observations to a relatively limited volume of the Galaxy compared to those of TESS and Kepler (∼ 14 mag; Hekker et al. 2011; Stello et al. 2022), several other spectroscopic instruments such as VLT/UVES, Magellan/MIKE, Keck/HIRES, and Gemini/GHOST (Dekker et al. 2000; Bernstein et al. 2003; Vogt et al. 1994; Rantakyro et al. 2024, respectively) provide high-resolution capabilities comparable to those of Veloce Rosso while offering the advantage of fainter limiting magnitudes, ∼ 19 mag. This enables the accurate mapping of stellar population and abundance trends over a significantly larger volume of the Galaxy than is accessible with Gaia parallaxes. The reach of RC star samples in spectroscopic surveys will expand from the 6 kpc covered by current-generation surveys like GALAH (Buder et al. 2021) and APOGEE (Majewski et al. 2017) to 16 kpc in upcoming projects such as 4MOST (de Jong et al. 2019) and WEAVE (Dalton et al. 2012), and potentially up to 100 kpc in proposed future projects like WST (Pasquini et al. 2018) and MSE (The MSE Science Team et al. 2019).

• Recall: the ratio of true positives to the sum of true positives and false negatives. For RC stars, this is defined as the number of correctly predicted RC stars divided by the total number of known RC stars
• $F_1$ score: the harmonic mean of precision and recall
• Accuracy: the fraction of correct predictions out of all predictions
Table A1 shows the precision, recall and $F_1$ score of all predictions made with The Cannon in the bootstrap models with RC:RGB = 3:1 training sets. The total accuracy of the predictions made by these The Cannon models is 0.87. We also show the confusion matrix of these predictions in the upper panel of Figure A1.
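The classification scores reported in the tables follow directly from the four entries of a confusion matrix. The sketch below uses the standard definitions, with RC treated as the positive class; the function name and the counts in the example are hypothetical and are not the values behind Table A1.

```python
def classification_scores(tp, fp, fn, tn):
    """Precision, recall, F1 and accuracy from confusion-matrix counts.

    tp: RC predicted as RC      fp: RGB predicted as RC
    fn: RC predicted as RGB     tn: RGB predicted as RGB
    """
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall:
        f1 = 2.0 * precision * recall / (precision + recall)
    else:
        f1 = 0.0
    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total else 0.0
    return {"precision": precision, "recall": recall,
            "f1": f1, "accuracy": accuracy}

# Made-up counts, only to show how the scores are obtained.
scores = classification_scores(tp=8, fp=2, fn=2, tn=8)
```

Precision measures contamination of the predicted RC sample by RGB stars, whereas recall measures how many true RC stars are recovered, which is why the two training-set ratios discussed here trade one against the other.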
From these bootstrap models, true positives, i.e. RC stars accurately predicted as RC by The Cannon, occur in 93 per cent of the total population of RC stars in the test sets of all models. True negatives, i.e. RGB stars accurately predicted as RGB by The Cannon, occur in 81 per cent of the total population of RGB stars in the test sets of all models. Therefore, some misclassifications occur within these predictions, with RGB stars more likely to be misclassified as RC stars.
We also present the precision, recall and $F_1$ scores for the red giant evolutionary state predictions made by The Cannon from the additional 1,000 models trained on balanced training sets in Table A2. The total accuracy of the predictions made by these The Cannon models is 0.86. We also show the confusion matrix of these predictions in the lower panel of Figure A1.
From these additional bootstrap models trained on a 1:1 split between RC and RGB stars, true positives occur in 77 per cent of the total population of RC stars in the test sets of all models. This reduction is anticipated, given the less comprehensive training set of RC spectra, comprising 25 RC stars, as opposed to the 54 in the models featuring a 3:1 RC to RGB ratio. Conversely, true negatives have improved to 96 per cent of the total population of RGB stars in the test sets of all models. This enhancement is attributed to a more comprehensive training set of RGB stars, with 25 stars compared to 18 in the models presented in Section 3.1. In these models, RC stars are more likely to be misclassified as RGB stars.
The purpose of red giant classification is to identify RC stars such that their unique property as a standard candle can be used in studies of Galactic archaeology. From our investigation, we find that the overall classification accuracy is similar for both sets of models. However, each set of models is better suited to a different goal. For example, training The Cannon on sets of RC and RGB stars that are representative of the total population of red giants with known evolutionary states available for training (i.e. in our case RC:RGB = 3:1) maximises the number of expected RC star predictions; however, there can be up to 20 per cent contamination from misclassified RGB stars. Conversely, if the goal is to accurately identify RGB stars such that there is little contamination present in RC selections, we find the balanced training set is most suitable. However, we note that this does result in fewer correct RC star classifications.
It is important to note that the goal of The Cannon is spectral modelling and label transfer rather than discrete classification. In our analysis, we attribute the improved classifications for each evolutionary state to a more comprehensive training set. Specifically, there is a larger selection of RC and RGB stars spanning a broader range of the stellar labels explored in The Cannon models (e.g. $T_\text{eff}$, $\log g$, and [Fe/H]). The models established with The Cannon contain a total of 55,402 pixels, where each pixel is described by a quadratic function of five labels (i.e. $T_\text{eff}$, $\log g$, [Fe/H], Δ and ev_state), which results in 21 total parameters. Hence, larger training sets with broad coverage of label space in each evolutionary state will better constrain these factors and effectively contribute to a refined predictive capability of The Cannon.
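The count of 21 parameters per pixel follows from the form of a full quadratic model in five labels: one constant, five linear terms, and fifteen second-order terms (five squares plus ten cross terms). The short sketch below enumerates them; the label strings are hypothetical stand-ins for the five labels named in the text.

```python
from itertools import combinations_with_replacement

def quadratic_terms(labels):
    """All terms of a full quadratic model: constant, linear, second order."""
    constant = [()]
    linear = [(name,) for name in labels]
    # Pairs with replacement give both squares and cross terms.
    second_order = list(combinations_with_replacement(labels, 2))
    return constant + linear + second_order

# Placeholder names for the five labels used in the model.
LABELS = ["teff", "logg", "feh", "delta", "ev_state"]
terms = quadratic_terms(LABELS)
n_per_pixel = len(terms)          # 1 + 5 + 15
n_total = 55402 * n_per_pixel     # coefficients over all pixels
```

With 21 coefficients to constrain at each of the 55,402 pixels from 123 training spectra, sparse coverage of any label leaves the corresponding terms poorly determined, which is the motivation for the larger training sets recommended here.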
In conclusion, when using The Cannon as a classification tool for red giant stars, we recommend employing a sufficiently large sample of both RC and RGB stars in the training set such that each sub-set of RC and RGB stars adequately covers a more complete distribution of other stellar labels that describe their spectra (i.e. $T_\text{eff}$, $\log g$, and [Fe/H]). As a result, we are confident in the spectral features we have found as significant in the prediction of red giant evolutionary state. This is because we determined these from a model generated by The Cannon using the full suite of RC and RGB stars available for establishing and training the model (i.e. 28 RGB and 95 RC stars).

A1 Comparing evolutionary state predictions to other methods
Before the rise of asteroseismology as the "gold standard" of red giant classification, RC stars were distinguished from the RGB either through selection cuts in photometry and surface gravity (e.g. $(J - K)$ and $\log g$ in Williams et al. 2013), or spectroscopically on the basis of stellar parameters ($T_\text{eff}$, $\log g$, [Fe/H]) and photometry, when compared to theoretical expectation from isochrones (e.g., Bovy et al. 2014; Sharma et al. 2018). However, these methods result in significant RGB misclassification and therefore contamination. For example, Williams et al. (2013) select RC stars as those within 0.55 ≤ $(J - K)$ ≤ 0.8 and 1.8 ≤ $\log g$ ≤ 3.0 as determined from RAVE spectra (Steinmetz et al. 2006) and estimated that ∼ 60 per cent of stars in their RC selection actually belonged to the RGB. Selections of RC stars from isochrones provide an improved selection with less contamination from RGB stars (e.g., ∼ 20 per cent in Wan et al. 2015).
Next, we compare the red giant evolutionary state predictions made by our The Cannon model to the RC/RGB classification scheme explored in Martell et al. (2021). They classify red giants using two parameters. The first is the probability of a star belonging to the RC. This probability, called is_redclump_bstep, is determined using the Bayesian Stellar Parameters Estimator (bstep; see Sharma et al. 2018), which provides a probabilistic estimate of intrinsic stellar parameters from observed stellar properties via the use of theoretical stellar isochrones. The second parameter used to separate the RC from the RGB in Martell et al. (2021) is the WISE $W_2$ absolute magnitude. They separate the RC from the RGB with the following selection:
• RC stars: bstep RC probability is_redclump_bstep ≥ 0.5 and absolute magnitude |$W_2$ + 1.63| ≤ 0.80
• RGB stars: bstep RC probability is_redclump_bstep < 0.5 or absolute magnitude |$W_2$ + 1.63| > 0.80
We establish an additional model that is trained on all stars with asteroseismic classifications in our data set. We perform the test step of The Cannon on the spectra of 149 stars that satisfy both selection criteria of Martell et al. (2021).
Figure A2 illustrates the evolutionary states predicted by The Cannon compared to their is_redclump_bstep score.The red bars represent stars that are predicted to belong to the RC by The Cannon model and the blue bars represent stars that are predicted to belong to the RGB.
The classifications made using the scheme employed by Martell et al. (2021) agree with the predictions made by The Cannon for 98 per cent of RGB stars in our test set and 70 per cent of RC stars. To get a clear RC sample with minimal RGB contamination, from this result, we suggest using the following selection:
• The Cannon evolutionary state prediction is RC (predicted ev_state label ≥ 0.5), and
• bstep is_redclump_bstep probability score is ≥ 0.5.
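Both selections can be written as simple boolean cuts. The sketch below encodes the Martell et al. (2021) criteria as quoted in this appendix and the combined selection suggested above; the function and argument names are hypothetical, and the absolute magnitude is assumed to be the WISE absolute magnitude used in that scheme.

```python
def martell_rc_selection(is_redclump_bstep, abs_mag):
    """True for RC, False for RGB, following the cuts quoted in the text.

    abs_mag is assumed to be the WISE absolute magnitude used in the
    Martell et al. (2021) scheme.
    """
    return is_redclump_bstep >= 0.5 and abs(abs_mag + 1.63) <= 0.80

def clean_rc_selection(cannon_ev_state, is_redclump_bstep):
    """Suggested low-contamination RC selection: both indicators agree."""
    return cannon_ev_state >= 0.5 and is_redclump_bstep >= 0.5

# Made-up values for three stars: (ev_state, bstep probability, abs. mag).
stars = [(0.9, 0.8, -1.60), (0.7, 0.3, -1.50), (0.2, 0.1, 0.40)]
clean_sample = [s for s in stars if clean_rc_selection(s[0], s[1])]
```

Requiring agreement between the two indicators trades completeness for purity: RC stars for which either method is uncertain are dropped from the sample.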

APPENDIX B: SIGNIFICANT FEATURES LINE LIST
Table B1. Line list of the 75 spectral features within the Veloce wavelength range found to be significant in the prediction of red giant evolutionary state and not other stellar parameters, e.g. $T_\text{eff}$, $\log g$, and [Fe/H]. Here we detail the wavelength and species of the significant features, as well as their $\log gf$ and lower excitation value in eV ( low ), and whether this identification has changed from the investigation in Banks et al. (2023) (these are identified with a checkmark).
This paper has been typeset from a TEX/LATEX file prepared by the author.
MNRAS 000, 1-12 (2024)

Figure 1. Kiel diagram of the stars in this data set overlaid on the GALAH DR3 sample (Buder et al. 2021).

Figure 2. Kiel diagram of the stars identified as RC in the top panel and RGB in the bottom panel. The stellar parameters $T_\text{eff}$ and $\log g$ are sourced from GALAH DR3 (Buder et al. 2021) and the RC/RGB classifications are determined using the neural network classifier detailed in Hon et al. (2017).

Figure 3. The distribution of the coefficient of variation (CV) for each model coefficient across all models. The top panel represents the base spectrum of the models and the following panels represent the Δ, ev_state, Δ × ev_state, Δ$^2$, and ev_state$^2$ terms of the bootstrap models respectively. The CV for each model coefficient peaks below 1, denoting a low variation across the different bootstrap models.

Figure 4. A representative selection of spectral features The Cannon identifies as the most significant in determining the evolutionary state of red giant stars in two spectral windows (i.e. ∼ 6256 ± 5 Å and ∼ 8087 ± 6 Å). The upper panels show the flux of red giants in our data set with labelled spectral features. The red line represents the flux of one RC star in the data set (Gaia DR3 ID: 4661172395405681920, $T_\text{eff}$ = 4511, $\log g$ = 2.35, and [Fe/H] = 0.05). The black line represents the flux of one RGB star in the data set (Gaia DR3 ID: 4766063055901392128, $T_\text{eff}$ = 4514, $\log g$ = 2.32, and [Fe/H] = −0.02). The solid black line in the lower panels represents the model coefficients of the evolutionary state label (ev_state) for each pixel. The dotted lines at about ±0.0047 represent the 90th percentile of the distribution of the ev_state model coefficient. We have highlighted in grey the spectral features in these spectral windows that we initially identified as significant (without regard to any covariances with other labels).

Figure 6. A representative selection of absorption features in the spectral window near ∼ 6318 ± 4 Å and the corresponding model coefficients for the linear terms ev_state, $T_\text{eff}$, $\log g$, and [Fe/H] in the lower panels (red, blue, orange, and green lines respectively). The top panel illustrates the base spectrum of The Cannon model with the major features labelled. Coloured bars indicate whether a spectral feature is defined as significant for predicting a particular label. Those that are highlighted in red are significant in predicting red giant evolutionary state, blue for $T_\text{eff}$, orange for $\log g$, and green for [Fe/H]. In our final selection of significant spectral features that hold information about the evolutionary state of red giants, we select those that are only significant in the prediction of the evolutionary state and not any other stellar label, such as the CN lines at ∼ 6318 Å (whose labels are highlighted in red).

Figure 7. Histogram of the difference in reduced equivalent width between the $^{12}$C$^{14}$N and $^{13}$C$^{14}$N lines near 8043 Å, corresponding to the ratio of the two line strengths. RC stars are shown in red, and RGB stars are shown in blue. The red and blue solid lines are kernel density estimates for the same data. While both distributions are broad, there is a slight preference for RC stars to have a less negative value of $\Delta \log \mathrm{EW}$, indicating a difference in the $^{12}$C/$^{13}$C ratio between RC and RGB stars. This supports the notion that the surface abundance changes caused by deep mixing on the RGB are the main drivers of the spectro-seismic connection of red giants.

Figure A1. Confusion matrices of evolutionary state predictions made across all bootstrap models described in Section 3.1 with RC : RGB = 3 : 1 (top panel) and models with RC : RGB = 1 : 1 (bottom panel). The known evolutionary states from TESS asteroseismology are on the y-axes and the predicted evolutionary states made by The Cannon are on the x-axes. True positives, i.e. RC stars correctly classified by The Cannon as belonging to the RC, account for 93 per cent of the RC population with a 3:1 training set split and 77 per cent of the population with a 1:1 training set split. True negatives, i.e. RGB stars correctly classified by The Cannon as belonging to the RGB, account for 81 per cent of the RGB population with a 3:1 training set split and 96 per cent of the population with a 1:1 training set split.

Figure A2. Results of The Cannon model's prediction of red giant evolutionary state compared to is_redclump_bstep score. Stars that are predicted to be from the RC by The Cannon are coloured in red and RGB stars in blue. The majority of stars classified by The Cannon as belonging to the RGB agree with their is_redclump_bstep scores (where RGB stars have scores < 0.5). However, approximately 33 per cent of stars classified as RC stars by The Cannon also exhibit RGB bstep scores.

Table 1. Gaia ID numbers, on-sky coordinates, observation date, exposure time, SNR, $W_2$ apparent magnitude, stellar parameters $T_\text{eff}$, $\log g$, [Fe/H] from Buder The full table is available in the online version of this article; an abbreviated version is included here to demonstrate its form and content.

Table A1. Classification scores: precision, recall and $F_1$ scores for classifications made in the bootstrap models with RC : RGB = 3 : 1 training sets. The overall accuracy of the classifications made with these models is 0.87.

Table A2. Classification scores: precision, recall and $F_1$ scores for classifications made in the bootstrap models with training sets comprising an equal split between RC and RGB stars. The overall accuracy of the classifications made with these models is 0.86.