Modelling stellar variability in archival HARPS data: I - Rotation and activity properties with multi-dimensional Gaussian Processes

Although instruments for measuring the radial velocities (RVs) of stars now routinely reach sub-meter per second accuracy, the detection of low-mass planets is still very challenging. The rotational modulation and evolution of spots and/or faculae can induce variations in the RVs at the level of a few m/s in Sun-like stars. To overcome this, a multi-dimensional Gaussian Process framework has been developed to model the stellar activity signal using spectroscopic activity indicators together with the RVs. A recently published computationally efficient implementation of this framework, S+LEAF 2 , enables the rapid analysis of large samples of targets with sizeable data sets. In this work, we apply this framework to HARPS observations of 268 well-observed targets with precisely determined stellar parameters. Our long-term goal is to quantify the effectiveness of this framework to model and mitigate activity signals for stars of different spectral types and activity levels. In this first paper in the series, we initially focus on the activity indicators (S-index and Bisector Inverse Slope), and use them to a) measure rotation periods for 49 slow rotators in our sample, b) explore the impact of these results on the spin-down of middle-aged late F, G & K stars, and c) explore indirectly how the spot to facular ratio varies across our sample. Our results should provide valuable clues for planning future RV planet surveys such as the Terra Hunting Experiment or the PLATO ground-based follow-up observations program, and help fine-tune current stellar structure and evolution models.


INTRODUCTION
Since the groundbreaking detection of the first exoplanet orbiting a sun-like star, 51 Peg b (Mayor & Queloz 1995), using the radial velocity (RV) method, numerous techniques have been developed and campaigns initiated to discover exoplanets with the ultimate goal of finding another habitable world.The RV method continues to be one of the most promising techniques for detecting Earth-like planets, thanks to the development of optical ultra-stable high-resolution échelle spectrographs such as High Accuracy Radial Velocity Planet Searcher (HARPS; Mayor et al. 2003), HARPS-N (Cosentino et al. 2012), ESPRESSO (Pepe et al. 2021), EXPRES (Jurgenson et al. 2016), NEID (Schwab et al. 2018) and the upcoming KPF (Gibson et al. 2016) and HARPS-3 (Thompson et al. 2016).These new instruments can achieve extreme precision down to 0.3 ms −1 (e.g., Suárez Mascareño et al. 2020;Faria et al. 2022) and even beyond.
However, despite the sub-meter-per-second accuracy achieved by RV instruments, the detection of Earth analogues remains a significant challenge, primarily attributed to stellar activity.For example, in our own solar system, the amplitude of Earth's RV signal is a mere 0.1 m/s, yet RV fluctuations caused by solar activity can exceed several m/s (e.g., Meunier et al. 2010a;Haywood et al. 2022).As ★ haochuan.yu@physics.ox.ac.uk a result, signals from small, moderately distant planets, similar to Earth, could easily be obscured by the activity of their host star.Accurate modelling of these activity signals is crucial for the successful detection of such planets.
Variations in the RV time-series can arise from a variety of stellar activity processes, often manifesting across different timescales.Predominantly, these fluctuations are the consequence of the magnetic field of the star interplay with the convection.In the following, we discuss several mechanisms believed to be most crucial in inducing variations in the RVs.
On short timescales, the surface of an FGKM star consists of cells where local convection occurs, a phenomenon known as granulation.In each cell, material heated up rises to the surface, causing a blueshift in the lines, while cooler material sinks, causing a redshift.Since the hotter material is generally brighter than the cooler material, averaging the line profiles over the stellar surface produces an asymmetric line profile, resulting in a net blueshift when measuring the line centers.As these cells, or granules, evolve stochastically on the stellar surface, this effect can induce RV variations at the m/s level, with timescales ranging from a few minutes to several hours (e.g., Dumusque et al. 2011a;Meunier et al. 2015;Meunier & Lagrange 2019;Cegla et al. 2013Cegla et al. , 2018Cegla et al. , 2019)).On even shorter timescales, surface granulation and magnetic events can excite oscillations of the star at characteristic frequencies (acoustic modes), known as pressure-mode (p-mode) oscillations (e.g., Kjeldsen & Bedding 1995;Arentoft et al. 2008).For sun-like stars, p-mode oscillations can induce RV variations with a period of several minutes and an amplitude of around 0.1 to 1 m/s (Strassmeier et al. 2018;Cegla 2019).Given that the timescales of these effects are normally short compared to the orbital period of the planet of interest, such effects can be filtered out by either extending the exposure time (e.g., Chaplin et al. 2019), or binning the data to a lower time resolution.
On moderate timescales, effects caused by active regions on the stellar surface become significant.These regions are shaped by magnetic fields, which can interfere with the convection on the stellar surface in two ways.First, when the magnetic field suppresses the local heat transport, it results in dark regions known as spots.Second, if the magnetic field is not strong enough to entirely suppress the heat transport, it alters local opacity which produces brighter regions, often referred to as faculae or plages.These active regions can induce RV variations through two primary mechanisms: the photometric effect, marked by a localized flux alteration, and the inhibition of the convective blueshift effect, characterized by a reduced blueshift on the local line profile.Therefore, disk-integrated line profiles exhibit distortion due to these localized changes, manifesting as shifts in the RV.As these active regions rotate with the surface of the star, the induced RV signal is modulated by such rotation, resulting in a signal with quasi-periodic variations.Meunier et al. (2010b) found that, for the Sun, the amplitude of such variation in the RV is around 0.4 to 1.4 m/s and is dominated by the convective blueshift effect.The challenge lies in the fact that the rotation periods of Sun-like stars can be of the same order of magnitudes as the orbital period of exoplanets.This similarity led to a few early claims being subsequently refuted (e.g., Rajpaul et al. 2016).
Several techniques have been developed to mitigate such activity signals.One straightforward approach is to compare the Lomb-Scargle periodograms of RV and spectroscopic activity indicators to visually distinguish between stellar and planetary periods.This method is only effective when the data is relatively regular-sampled and the stellar signal remains less evolved.An alternative approach is to employ Stacked Bayesian General Lomb-Scargle periodograms (Mortier & Collier Cameron 2017), which provides a way to assess the stability of signals over time.Stellar signals can be identified as they are unstable and incoherent, in contrast to planetary signals which are consistent and stable.A more advanced approach is to use a multi-dimensional Gaussian Process framework, which models the RV together with the activity indicators (e.g., Aigrain et al. 2012;Rajpaul et al. 2015;Barragán et al. 2022).We will delve deeper into this approach in section 3. Besides, there are techniques that conduct activity mitigation at earlier stages.For instance, some start from cross-correlation function (CCF) (e.g., Donati et al. 2014;Collier Cameron et al. 2021;John et al. 2022;Zhao et al. 2022b;de Beurs et al. 2022), while others focus on the spectrum level mitigation (e.g., Davis et al. 2017;Jones et al. 2017;Cretignier et al. 2021Cretignier et al. , 2022)).For more details on existing activity mitigation techniques, we refer readers to Zhao et al. (2022a) for an overview.
On longer timescales of several years, the influence of magnetic activity cycles becomes prominent.Solar-type stars exhibit cycles of magnetic activity spanning several years, which can potentially affect all the above activity signals.Such cycles have been linked to long-term RV fluctuations over the years (Dumusque et al. 2011b).This means the RV signals attributed to these cycles can be easily confounded with the RV signals from long-period planets.Meunier et al. (2019) demonstrated that such effects can be substantially reduced by decorrelating the RV from chromospheric emissions.In addition to the long-term baseline variations, magnetic activity cy-cles can alter the characteristic period of signals induced by active regions over time.This is because the distribution of active regions across latitudes changes throughout the magnetic activity cycle (e.g., the butterfly diagram of the Sun) (e.g., Foing 1988).If the surface rotation rate varies from equator to pole (differential rotation), the rotation period inferred from the modulation of surface heterogeneities (whether in photometry or spectroscopic indicators) will vary along the magnetic cycle.For a Sun-like differential rotation and butterfly pattern, the active regions move from higher to lower latitudes as the cycle progresses, and their rotation period decreases.
Since 2003, the high-resolution spectrograph HARPS has been observing thousands of stars.The rich archival, high-quality data from HARPS provides a unique opportunity to test and apply stateof-the-art methods for activity modelling and mitigation.In this paper, we apply the multi-dimensional Gaussian Process framework to HARPS observations of 268 well-observed targets.We discuss sample selection, data reduction and pre-processing, as well as the basic properties of the sample in section 2. We then briefly introduce the multi-dimensional GP framework and detail its implementation in section 3. We discuss results regarding stellar rotation in section 4, and facular to spot ratio in section 5. We summarize our key conclusions and outline future work in section 6.

Sample selection
We utilized archived data from the HARPS, accessible via the ESO science archive1 .
We retrieved all publicly available HARPS spectra using the ESO's astroquery2 (Ginsburg et al. 2019) package.For every observation and each target, we cross-matched the stellar name and position with their respective counterparts in the simbad database (Wenger et al. 2000).This step ensured the removal of potential misclassifications from the ESO archive and close-in binary systems from our data set.Polarimetric data, which only constitute a minor portion of the HARPS archive, have been discarded as they were not reduced automatically by the standard pipeline.We did not implement cuts based on the signal-to-noise ratio (SNR) of the spectra, or cuts based on whether the simultaneous calibration has been applied.We anticipated that these factors would manifest themselves through the uncertainties of the pipeline products, which would be taken into account statistically by our Bayesian framework introduced later.Our preliminary target set included a total of 1438 targets with spectral types spanning from F to M. Their effective temperatures ( eff ) approximately ranged between 2800 K and 7500 K.
To guarantee a sufficient observation duration for our targets, we excluded those that had less than 40 daily-binned observations.In pursuit of homogeneity in the determination of the stellar parameters, e.g., effective temperature  eff , surface gravity log g, we restricted our sample to stars listed in the catalogue published in Gomes da Silva et al. (2021), and used the parameters listed in that catalogue throughout our paper.Our final sample contains 268 targets, with spectral types spanning from F to K. Figure 1 shows our sample selection in an Hertzsprung-Russell (HR) diagram, with log g versus  eff from GAIA DR2 (Gaia Collaboration et al. 2016Collaboration et al. , 2018)).The grey points show the 1438 targets on our initial list, while the selected 268 targets in our final sample are highlighted in orange.Gaia Collaboration et al. 2016, 2018).The grey points show the 1438 targets on our initial list.The selected 268 targets in our final sample are highlighted in orange.

Data reduction and pre-processing
The raw data underwent automatic reduction using version 3.5 and/or 3.8 of the HARPS Data Reduction Software (DRS).For each target, cross-correlation functions (CCFs) were computed by correlating the reduced spectra with a line mask that matches the spectral type of the star.CCF proxies, namely RV and bisector inverse slopes (BIS), were then measured from the CCFs within the DRS.Additionally, we computed a chromospheric emission metric, -index, from the Ca II H and K lines (resp. 3968.47 and 3933.66 Å) in the S1D spectra, employing the ACTIN3 (Gomes da Silva et al. 2018) package.
For each time-series, we calculated the median absolute deviation (MAD) metric, which is considered a more reliable measure of statistical dispersion than the often-employed standard deviation, especially in the presence of outliers.Data points exceeding three times the MAD value were identified as outliers and subsequently excluded.Any time-series extending beyond BJD=2457161.5 was partitioned into two sections for the purpose of outlier removal, given the expectation of two separate baselines resulting from the HARPS fibre upgrade.Finally, we binned each time-series to a maximum of one data point per day.

Basic properties of the sample
We showcase the basic properties of the 268 targets in our sample using HR diagrams in Figure 2. The colours of the points in the left, middle, and right panels represent the number of data points in the pre-processed time-series, the mean SNR per pixel (in échelle order 50) of the acquired spectra, and the mean log  ′ HK of the star, using values from the catalogue provided by Gomes da Silva et al. (2021), respectively.We can see that the numbers of data points in our sample vary, ranging from 40 to over 400, though the majority fall between 40 and 100.The mean SNR spans from 50 to above 300.The mean log  ′ HK for the majority of the targets are below -4.4,suggesting that they are predominantly inactive stars.This aligns with expectations, given that the primary goal of HARPS is exoplanet detection, leading the survey to prioritize low-activity stars.
In Figure 3, we show the normalized root mean square (RMS) of the time-series for both -index and BIS, plotted against the mean log  ′ HK .The RMS values serve as direct metrics of variability in the activity indicators, and as such, they are anticipated to correlate more intimately with the challenges of activity mitigation.We find, however, no clear correlation between the overall RMS values and the mean log  ′ HK , even though there appears to be a tentative positive correlation between the RMS's upper envelope and the mean log  ′ HK .For instance, at a mean log  ′ HK value of approximately -4.9, the corresponding RMS for either -index or BIS can fluctuate by an order of magnitude.Therefore, we deduce that relying solely on the mean log  ′ HK is inadequate as an activity metric within the context of activity mitigation.We recommend that future survey target selections consider additional metrics alongside mean log  ′ HK .

MODEL FOR STELLAR ACTIVITY
Different processes of stellar activity, e.g., p-mode oscillations, granulation, active regions, and magnetic activity cycles, can induce variations in the time-series of RV and spectroscopic activity indicators at various timescales.In this section, we focus on modelling the variations induced by the active regions -notably the effects of faculae/plages and spots.This is primarily because such activity signal is modulated by the rotation of the star, and the associated timescales (i.e., tens of days) are closest to the orbital periods of the planets.
In the following, we introduce the multi-dimensional GP framework to model such activity-induced signals in section 3.1, with the aim of measuring the rotation period of the stars.We then detail the implementation of the framework to activity indicators in section 3.2.

Multi-dimensional Gaussian Processes framework
GPs are commonly used in recent years as a tool to model stellar activity, given their ability to model the data by parameterising its covariance matrix to constrain the characteristics of the data, e.g., period, evolution timescale.This avoids the necessity of knowing the exact deterministic form of the underlying physical processes, which can be extremely hard in the case of modelling stellar activity as it is expected to be stochastic.For comprehensive descriptions of GPs, we refer readers to specialised literature, e.g., Rasmussen & Williams (2006), Roberts et al. (2017) and Aigrain & Foreman-Mackey (2023).At its core, a GP model is characterized by a mean function and a kernel function, the latter parameterizing the covariance matrix.Parameters within these kernel functions, termed hyper-parameters, shape the characteristics of the modelled data rather than its exact form.For example, a frequent choice for modelling stellar activity is the Quasi-Periodic (QP) kernel, where  is the amplitude representing the overall scale of the variation from the mean function,  is the characteristic period of the variation, Γ is the harmonic complexity, which in simpler terms indicates how much the variation strays from a pure sine oscillation within a given period. is the 'evolution timescale', which scales with the maximum distance of two data points that are strongly correlated.
Aiming to maximise the usage of information from the spectroscopic time-series in modelling the stellar activity, we want to model the activity signals from the RVs together with a set of selected spectroscopic activity indicators, i.e., -index and BIS.Following Aigrain et al. (2012); Rajpaul et al. (2015); Barragán et al. (2022); Delisle et al. (2022), we assume a set of N observable time-series,  =1,...,  (), which can be time-series of RV and activity indicators, and each time-series   follows where   () is the deterministic part of the model.For RV, it is where the mean function and planet-induced RV are incorporated.
For activity indicators, it is simply the mean function.  () is the measurement 'white' noise, which normally includes photon noise, calibration noise, etc.Every   () has a shared entity,  ().We interpret this as a latent GP variable, approximating the proportion of the visible stellar disc blanketed by active regions.Commonly, a GP with a quasi-periodic (QP) kernel is the modelling choice for this variable. () is the first temporal derivative of  () and remains a GP, roughly representing how the active regions evolve in time on the stellar disc.Coefficients   and   are free parameters that harmonise the interplay between the latent GP variables,  and , and various observable time-series,   .Building on this framework, a fully parameterised covariance matrix K for the N time-series  =1,...,  () can be established.With a known mean function m and covariance matrix K, the likelihood function can be written in analytic expression as where y encompasses the vector of N observed time-series   ().Full Bayesian frameworks, such as Monte Carlo Markov Chain (MCMC) or nested sampling, can be implemented with L to explore the posterior distributions of free parameters in the model.We direct readers to Rajpaul et al. (2015) and Barragán et al. (2022) for details.
For  (), the traditional approach involving a quasi-periodic (QP) kernel, as outlined in Equation 1, faces the drawback of computational costs that rise cubically in relation to the size of the dataset.An alternative option is to use a fast approximated kernel, such as a Matern 3/2 exponential periodic (MEP) kernel facilitated in S+LEAF 24 (Delisle et al. 2022), of which the computational cost only scales linearly.Such efficiency becomes indispensable when tasked with analysing expansive datasets for a large volume of targets without straining computational resources.We refer readers to Delisle et al. (2022) for details on the MEP kernel.

Implementation
We implemented the multi-dimensional GP framework on every target in our sample.As we initially focused on the activity indicators in the present work, the exact form of the framework we used can be outlined as follows where and  () are white noise models containing measurement noise and jitter terms.BIS is proved to be interlinked with both  () and its derivative  (), while S-index is exclusively related to  () (e.g., Isaacson & Fischer 2010;Dumusque et al. 2014;Thompson et al. 2017).Hence we exclude  () term for -index.In terms of choices of kernels for  (), we opted for MEP kernels available in S+LEAF introduced earlier, which is a faster approximation of the QP kernel.To explore the posterior of free parameters, we used nested sampling through PolyChord5 (Handley et al. 2015a,b), with all priors on the free parameters set as uninformative.This nested sampling approach is favoured here as it demonstrates better performance when compared to MCMC approaches, especially in datasets exhibiting multi-modality.
We show an example of implementing the framework using the * 61 Vir dataset.In Table 1, we show the inferred values of the free parameters in the model based on the posterior distributions, alongside the priors designated for the parameters during the sampling.The inferred values are the median of the posterior distributions, and the statistical uncertainties are given by the 16th to 84th percentile range.
Figure 4 shows a multi-dimensional GP fits to the time-series of * 61 Vir (HD 115617) dataset.In the left panel, the black markers show the time-series of -index and BIS, and their associated uncertainties.The coloured (blue and coral) lines show the GP predictions over the observations, and the corresponding shaded areas of the lines show one  uncertainty of predicted distributions.The horizontal dashed grey lines show the means of the time-series.The grey extensions on the errorbars show the jitter terms.The right panel show posterior distributions of selected model parameters, including period, evolution timescale and harmonic complexity.

Rotation period measurement
The variations induced by active regions in activity indicators, modulated by the rotation of the star, are believed to closely represent the star's true rotation period (e.g., Angus et al. 2018;Nicholson & Aigrain 2022).As such, the rotation period is captured by the hyper-parameter 'period' in our multi-dimensional GP model when the activity-induced signal is well-modelled.
We applied the multi-dimensional GP framework with a MEP kernel to each star in our sample.To determine whether the period is securely detected, on a separate analysis, we applied the same framework but used a Matern 3/2 kernel instead with GP, representing an aperiodic model in comparison to the above periodic model (with the MEP kernel).The evidence  of each model is estimated through nested sampling to calculate Δln Z = ln Z periodic − ln Z aperiodic .A positive value indicated that the periodic model is preferred.We then used Δln Z as our starting criterion to select targets where the periodic model is preferred.Additionally, we rejected cases through visual inspection where the Δln Z was unreliable because of two possible scenarios.Firstly, in cases where neither model fits well, the Δln Z criterion could be biased as the Bayesian model comparison is only meaningful when both models adequately fit the data.Secondly, the MEP kernel (or the quasi-periodic kernel in general) we employed has regions in the parameter space that could behave aperiodically.In such cases, even if the periodic model appears preferable based on evidence difference, secure detection of the period is not guaranteed.Upon thorough vetting, we identified 49 targets for which we believe the model adequately captured the activity-induced variations, resulting in an accurate recovery of their true rotation periods.
We also want to emphasize that in comparison to conventional methods, such as periodograms or the autocorrelation function (ACF), the GP approach displays greater resilience against harmonic misdetection when determining rotation periods (e.g., Angus et al. 2018).This robustness arises from the utilization of the MEP or QP kernel, where potential harmonic components are inherently considered, i.e., the 'harmonic complexity' hyperparameter serves as an approximate gauge of the harmonic contribution.This ensures the algorithm finds the true rotation period which is associated with the lowest base frequency.
The selected 49 targets are shown in an HR diagram in Figure 5, with the colour of the markers representing their recovered rotation periods.All other targets of which the periods have not been successfully recovered are denoted with grey markers.The dotted line at log g = 4.1 approximately separates main-sequence stars from evolved stars, i.e., sub-giants.Notably, we find that the rotation period broadly decreases as effective temperature rises (or for earlier spectral types), although there is a considerable amount of scatter at any given temperature.This is broadly consistent with both theoretical expectations (e.g., Skumanich 1972a) and preceding rotation period surveys (e.g., McQuillan et al. 2014).Further insights on age-activity-rotation relations are discussed in section 4.3.
Regarding the detection rate, there are approximately 23% of mainsequence G and K stars of which the rotation period is well recovered.However, this rate decreases significantly for earlier type stars such as F stars, and for sub-giants.The rapid rotation of F stars, typically with periods under 10 days, necessitates denser time sampling than what is currently available to accurately determine their rotation periods.In the case of sub-giants, their evolving nature results in diminished or altered activity patterns, deviating from the assumption of our activity model and thus reducing the detection rate.It is important to note that the detection rate for the rotation period can be considered as an indirect measure of the model's efficacy at mitigating activity signals across different stellar types.In this context, the model demonstrates the best performance for G and K stars.For F stars, the model's efficacy could be enhanced with optimized sampling, tailored to capture their rapid rotation.However, it is essential to recognize that effective activity modelling does not necessarily equate  to a target's appropriateness for planet detection.For example, challenges in modelling activity could arise due to a lack of pronounced active regions on the stellar surface.In such cases, the amplitude of the activity signal remains minimal and as such the star could still possibly be an attractive candidate for exoplanet detection.
Last but not least, it is important to highlight that compared to traditional methods like the Lomb-Scargle periodogram, the GP method holds an advantage in determining the signal's period when the rotational modulation signal evolves over time (Angus et al. 2018).Additionally, with the multi-GP method used in this work, we search for and identify common periodic signatures among multiple timeseries, thereby increasing the confidence in the measured rotation period.Moreover, the multi-GP framework offers insights beyond just the rotation periods of the stars.Specifically, it provides a quantitative understanding of the covariance structure of the activity signals.Such information can be integrated into statistical frameworks to disentangle activity from planetary signals in the RVs.This can be achieved through either simple linear regression models, or a more complex framework that explicitly takes the covariance structure into account, such as the L1 periodogram of Hara et al. (2017).We defer further exploration of this potential to future work.

Rossby number
The Rossby number Ro, which is given by the ratio of the rotation period  rot to the convective turnover timescale  c : is a key parameter of the dynamo process that gives rise to large-scale magnetic fields among the stars we are considering in this study.
Numerous empirical studies have shown that Ro is a key parameter for the angular momentum evolution, as well as for magnetic field strength and topography of late-type stars (see e.g.Noyes et al. 1984).
To first order, it is an indicator of the activity level of the star: the lower the Rossby number, the more active the star.We evaluated the Rossby number for the stars in our sample with rotation period measurements, using the empirical relations from Cranmer & Saar (2011) to calculate  c : .
The Rossby number of the sample is shown in the right panel of Figure 6.The coloured points show the selected 49 targets from the visual vetting.Most of the stars in our sample have a relatively high Rossby number, i.e., Ro > 1.0, and are thus inactive stars.This is unsurprising, as most archival HARPS observations were taken as part of planet surveys, which tend to avoid active stars.Unlike log  ′ HK , the correlation between the Rossby number and the effective temperature is relatively weak, and can be explained mostly by the fact that the hotter stars in our sample are starting to evolve off the main sequence.

Implications for spin-down of middle-aged stars
Sun-like stars are expected to undergo spin-down following a simple power law (e.g., Skumanich 1972b) to the first order during their main sequence stage.This spin-down emerges from the continuous loss of angular momentum, a consequence of torques generated from interactions between the star's surface magnetic field and its stellar wind.However, subsequent observations from open clusters spanning a wide age range challenged the Skumanich-like spin-down.Specifically, these results have brought to light two phases in a star's evolution where the spin-down process appears to 'stall' (Meibom et al. 2009;Agüeros et al. 2018;Douglas et al. 2019).
To account for these stalling effects, several scenarios have been proposed.The two main hypotheses are the core-envelope coupling (e.g., Spada et al. 2011;Lanzafame & Spada 2015;Spada & Lanzafame 2020;Johnstone et al. 2021) happening at a few Myrs to a Gyr, and the weakened magnetic breaking (e.g., van Saders et al. 2016) happening at a few Gyrs.The former suggests angular momentum can be transferred from the radiative core to the convective envelope due to internal differential rotation, attenuating the spin-down of the envelope.The latter indicates that magnetic braking is substantially weakened once the star achieves a critical Rossby number, possibly due to a change in the magnetic field's morphology.This alteration -shifting from a dipole field to a quadrupole field or an even more intricate configuration -makes it less effective at shedding angular momentum (Réville et al. 2015).
Gaps in the rotation period versus effective temperature diagram have been found with Kepler stars (e.g., McQuillan et al. 2013McQuillan et al. , 2014;;Gordon et al. 2021).These gaps can potentially be explained by the aforementioned stalling effects.David et al. (2022) provide stronger evidence by combining the Kepler rotation periods with more accurate spectroscopic effective temperatures from, e.g., the Large Sky Area Multi-Object Fiber Spectroscopic Telescope (LAMOST) (Cui et al. 2012;Zhao et al. 2012).Their findings indicate two prominent pile-ups in the period-effective temperature diagram.The shortperiod pile-up is attributed to the core-envelope coupling mechanism, while the long-period pile-up is linked to the phenomenon of weakened magnetic braking.This latter association is primarily because it aligns seamlessly with a consistent Rossby number curve.
We constructed a similar diagram, by combining rotation periods measured in Kepler data by McQuillan et al. (2014) with effective temperatures from LAMOST.This resulted in a blue Kernel Density Estimation (KDE) plot, displayed in the background of Figure 7, generated with the codes6 provided by David et al. (2022).Empirical cluster sequences of different ages are shown in grey lines, derived from Curtis et al. (2020).The purple lines correspond to constant Rossby numbers, following with convective turnover timescales from Cranmer & Saar (2011) as Equation 7.
In the KDE, the long-period pile-up is visible below the Ro = 1.45 curve, while the short-period pile-up appears above the Ro = 0.4 curve.The Sun is shown as a reference in red, with  eff,⊙ = 5772 K (Prša et al. 2016) and  rot,⊙ = 27 ± 2 d estimated from Snodgrass & Ulrich (1990).We have overlaid the asteroseismic sample from Hall et al. (2021), depicted in grey, and our HARPS sample of 49 well-determined rotation periods (detailed in section 4.1) in orange.Both samples clearly occupy a distinct period-temperature space to the much larger Kepler, predominantly located above the long-period pile-up.We interpret this primarily as a consequence of the different selection effects for the three samples.The Kepler sample, while much larger and benefiting from tight, regular time sampling, relies on detecting the rotational modulation of active regions in broadband optical light curves.This becomes increasingly difficult for less active stars as the fraction of the stellar disk covered by active regions drops, and for slower rotators as the rotation period becomes comparable to both the lifetime of the active regions and the duration of individual Kepler 'quarters'.The asteroseismic sample uses modes to measure internal and surface rotation rates, and is strongly biased towards hotter and lower gravity (evolved) stars, for which the amplitude of the -modes is larger and their frequencies lower than for later type main-sequence stars.The HARPS sample spans a longer baseline and uses spectroscopic activity indicators rather than broad-band photometry to detect the rotational modulation of active regions.It is reasonable to expect that targetted indicators should be more sensitive to active regions than photometry, particularly as the active regions decrease in size and number and the rotation slows.A possible additional contributing factor is the fact that, while dark spots dominate the activity patterns for active stars, less active stars tend to be more faculae-dominated (Meunier et al. 2010a).The photometric signatures of faculae are rather subtle, whereas they have strong chromospheric signatures, which are probed by spectroscopic indicators such as log  ′ HK and also have a strong signature in lineshape indicators such as the BIS via their local dampening effect on convective flows.A more in-depth discussion on this topic can be found in section 5.
Considering the combined HARPS and asteroseismic samples, we see that two distinct populations emerge.The first is composed of hotter stars (with  eff ≥ 5700 K), mostly from the asteroseismic sample, though with a handful of HARPS measurements.This population follows a very steep period-temperature relation, consistent with the rapid spin-down expected as stars leave the main sequence and start to evolve towards the red giant branch.
The second population, predominantly from our sample, consists of cooler (mid-G to mid-K) main-sequence stars.It follows a shallower period-temperature relation, and its upper envelope corresponds to the Ro = 2 curve.This second population could possibly be the extension of the long-period pile-up observed in the Kepler sample, but skewed towards cooler stars.In the Kepler sample, the long-period pile-up is less pronounced at cooler temperatures, but this could be a selection effect: at lower temperatures, the pile-up moves to longer periods, which correspond to lower photometric amplitudes, and the stars also become intrinsically fainter.In Figure 8, we show the Rossby numbers computed for our sample compared to the KDE of the Kepler sample.They fall in the range of 1.5-2.5, consistent with the critical Rossby number inferred from the photometric long-period pile-up.A slight discrepancy between this tail and the anticipated long-period pile-up extension might arise from biases in effective temperature measurements from different sources, as argued in David et al. (2022) in the context of the asteroseismic sample.

FACULAE-TO-SPOTS RATIO
Spots and faculae (or their chromospheric counterpart, plages) are the two main types of active regions that are expected to induce variations in the RVs and activity indicators used in planet searches.These variations are induced by two simultaneous processes.The first, known as the photometric effect, is thought to be mainly induced by dark spots, which locally reduce (or even suppress entirely) the local flux emerging from the stellar surface.The other is due to the fact that convective up-flows are suppressed in regions of enhanced magnetic flux density, which includes both spots and faculae.This results in a local reduction in the net blue-shift caused by convective flows relative to the "clear" photosphere, known as the Inhibition of Convective Blueshift (ICB) effect.Both processes cause distortions to the disk-integrated line profiles, resulting in variations the RV, FWHM and BIS, while chromospheric activity indicators such as the -index or log  ′ HK are sensitive to the fractional coverage of the active regions themselves.
The differentiating factor between spots and faculae lies in the processes through which they induce variations.While spots cause variations via both these processes, faculae primarily influence the ICB effect (e.g., Dumusque et al. 2014).Within the context of our multi-dimensional GP framework, the terms  () and   () distinctively represent these two effects (e.g., Aigrain et al. 2012;Rajpaul et al. 2015).Hence, the ratio of the RMS values of these two terms, RMS[ ()] / RMS[  ()], which we denote as / in the following text, can be interpreted as an approximate representation of the relative faculae-to-spots ratio.This potentially offers an interesting way to measure the faculae-to-spots ratio in the sample of stars where we consider the results of the GP modelling to be sufficiently robust.
In turn, doing this would allow us, in principle, to test the widely held paradigm that more magnetically active stars are more spotdominated, while quieter stars have fewer spots and more faculae (e.g., Meunier et al. 2010b;Amazo-Gómez et al. 2020).Of course, direct measurements of the star's magnetic field necessitate polarimetric observations, and few stars in our sample have such observations.However, the Rossby number Ro derived in Section 4.2 is known to correlate strongly with the strength of the magnetic field < B > (e.g., Reiners et al. 2014;Vidotto et al. 2014;Reiners et al. 2022), with an exception for very inactive stars (i.e., Ro > 2).
For the targets of which we have detected their rotation periods securely, we examined the relationship between the RMS ratio / and the derived Rossby number Ro in the left panel of Figure 9.The error bars show the propagated statistical uncertainties, and the colour scheme indicates the effective temperature of the stars.We excluded samples exhibiting uncertainties in the / exceeding 200%.
To evaluate the monotonic relationship between the / and the Ro, we calculated the Spearman correlation coefficient, considering an upper cut-off value for Ro in the range of 1.6 to 4.0.The result is shown in the right panel of Figure 9.We found that the Spearman coefficient peaks between 1.8 < Ro < 2.0 with a value of 0.62, after which the value drops rapidly.In addition, we conducted linear fits to log(A/B) as a function of Ro, and showed the slope of the fit versus the Ro cut-off value in the same panel.We found that the slope also peaks between 1.8 < Ro < 2.0, behaving similarly to the Spearman coefficient.Both proxies indicate a relatively strong positive correlation between / and Ro until Ro ≈ 1.9, after which the correlation significantly weakens.
Additionally, we conducted analyses to assess the impact of the data point at Ro ≈ 0.5 by excluding it from the correlation analysis.The results are represented as thinner lines in the same panel, labelled as Ro > 0.6.Both the Spearman coefficient and the slope showed behaviors remarkably similar to those observed in the full analysis.This similarity suggests that the overall results are not predominantly influenced by this single data point.Despite this, we assert that the detection at Ro ≈ 0.5 is robust.The lack of data in the 0.5 < Ro < 1.0 range is likely attributed to the HARPS sample's bias towards inactive stars.
The relationship between / and Ro is aligned with theoretical predictions: a lower Ro implies a stronger magnetic field and consequently, a star that is more spot-dominated.This leads to a decreased facular-to-spot ratio, mirrored by a lower /, which is exactly the trend we have seen in the Ro < 1.9 regime.However, when Ro > 1.9, the correlation disappears, and there is no longer any obvious relationship between the / and the Rossby number.We speculate that the disappearance of the correlation might arise from a morphological transition in the stellar magnetic field beyond Ro ≈ 1.9, away from a predominantly dipolar configuration.If the -field structure becomes more complex, the correlation between -field strength and Rossby number would not be expected to persist.
Intriguingly, the Rossby number at which the correlation with the / disappears, Ro ≈ 1.9, is close to the Ro that marks the upper envelope of the Kepler main-sequence rotation period-temperature distribution in Figures 7 and 8.It also corresponds approximately to the critical Rossby number the 1.4 ≤ Ro ≤ 2.0 postulated in stellar spin-down theories (e.g., van Saders et al. 2016van Saders et al. , 2019;;David et al. 2022), where the efficiency of stellar angular momentum dissipation is expected to drop significantly due to the same alterations in magnetic field morphology.
A more detailed investigation of the faculae-to-spot ratio and its dependence on magnetic field strength would certainly be valuable, but would likely require more sophisticated activity indicators that can help disentangle the contributions of the different types of active regions (see e.g.Crétignier et al. 2023) as well as a larger sample of stars with direct, polarimetric magnetic field measurements, and is therefore beyond the scope of this work.

CONCLUSIONS
This paper presents the first results from an ongoing study of the activity properties of F, G and K stars using archival HARPS data.We applied the multi-dimensional GP framework developed by Rajpaul et al. (2015) and Barragán et al. (2022) for activity mitigation in RV planet searches to activity indicators extracted from HARPS spectra of 268 well-observed targets with precisely determined stellar parameters.Applying the GP framework to such a large dataset was made possible by using the efficient implementation of our framework provided by Delisle et al. (2022).While we do plan to perform a joint analysis of the activity indicators and the RVs in the future, this is made challenging by the presence in the RV time-series of residual systematic effects and an unknown number of planetary signals.A new version of the HARPS Data Reduction Software is under development (Dumusque et al. priv. comm.), which is expected to lead to significantly reduced systematics and has already allowed new planet candidates to be identified (with further HARPS observations being gathered to confirm them, ESO program 1110.C-4043, PI Hara).In the mean time, applying the multi-dimensional GP framework to the activity indicators alone already reveals a wealth of useful information.Our key findings are as follows.
We successfully recovered rotation periods for 49 slow rotators in our sample.We found that the rotation period decreases as effective temperature rises, i.e. for earlier spectral types, which is broadly consistent with both theoretical expectations (e.g., Skumanich 1972a) and preceding rotation period surveys (e.g., McQuillan et al. 2014).One limitation of our approach is that we do not consider the effects of differential rotation, primarily due to the sparse sampling of the majority of the targets in our sample.A possible extension of this work would be to analyse the data season by season and investigate seasonal variations in the period as well as in the other parameters of the model.We placed our results in the context of existing, photometric estimates of field star rotation periods from Kepler, and discussed their implications for the age-activity-rotation relations of F, G & K stars.Our samples typically have longer periods than those derived from photometry, with the exception of the asteroseismic sample.We ascribe this to two factors.First, spectroscopic surveys like HARPS have a much longer baseline compared to photometric surveys.This allows for multi-seasonal monitoring, increasing sensitivity to slow rotators.Second, spectroscopic indicators are more sensitive to active regions, particularly the faculae that tend to dominate in slowlyrotating, magnetically inactive stars, than broad-band optical photometry.
Taken together with the pre-existing asteroseismic sample, our new sample of slowly-rotating ( >≈ 30 d) FGK stars consists of two, distinct sub-populations.One is a potential extension of the long-period pileup towards cooler stars, with Rossby numbers in the range 1.4 < Ro < 2.0.The other consists of 'hot slow rotators' with  eff > 5700 K which we interpret as sub-giants which are rapidly spinning down as they start to expand.
Overall, our results broadly agree with the findings of David et al. (2022), but this is the first time that the population of late G and Ktype slow rotators is observed so clearly, above the 'critical' Rossby number of Ro ≈ 1.5 that marks the upper envelope of the periodtemperature distributions for Kepler stars with photometrically derived surface rotation periods.We infer that angular momentum loss via a magnetised wind continues beyond this critical value, which instead marks the point where the active region covering fraction becomes too low to allow for photometric rotation period estimates.
We also explored indirectly how the ratio of faculae to spots in active regions varies across our sample, using the RMS ratio of  () to   () as a proxy.We find that this ratio is positively correlated with the Rossby number up to Ro ≈ 1.9, in accordance with theoretical expectations, since a lower Ro implies a stronger magnetic field and consequently, a more spot-dominated star.The correlation disappears for larger values of Ro, indicating a possible transition in the stars' magnetic field morphology away from large-scale, predominantly dipolar fields that give rise to clearly detectable rotational modulation in the activity indicators.
Thus, both our rotation period measurements and the relative contribution of higher-order versus lower-order behaviour in the activity indicator time-series are consistent with the idea that a significant transition in magnetic field morphology occurs around a critical Rossby number in the range 1.4 ≤ Ro ≤ 2.0.Notably, this transition aligns well with the critical Ro values suggested in stellar spin-down theories.
This paper demonstrates that rich information can be extracted from spectroscopic activity indicators, which sheds light on the structure and evolution of stars.Our intention is to extend this methodology to the RV once there is existing progress on the systemic correction and strategy for a thorough search of planetary signals.We note that the covariance structure of the activity signals learnt from this work can be effectively used to separate stellar and planetary components in the RVs.
The longer-term behaviour of the activity indicators analysed in this work can also be used to search for activity cycles in our targets.This will be the topic of the next paper in our series (Crétignier et al. in prep.).In the longer term, we also plan to incorporate the RV time-series, as well as other activity indicators, into our modelling, in order to improve our understanding of the strengths and limitations of the multi-dimensional GP framework for activity mitigation in RV planet searches, as a function of the stellar properties.

Figure 1 .
Figure 1.Our sample selection in an HR diagram, with log g versus  eff from GAIA DR2 (GaiaCollaboration et al. 2016Collaboration et al. , 2018)).The grey points show the 1438 targets on our initial list.The selected 268 targets in our final sample are highlighted in orange.

Figure 2 .Figure 3 .
Figure 2. The figure shows the basic properties of the 268 targets in our sample.The targets are displayed in HR diagrams, and the colours of the points in the left, middle, and right panels represent the number of data points in the pre-processed time-series, the mean signal-to-noise ratio (SNR) of the acquired spectra, and the mean log  ′ HK of the star, respectively.

Figure 4 .
Figure 4. Demonstration of multi-dimensional GP fit on the * 61 Vir (HD 115617) dataset.In the left panel, the black markers show the time-series of -index and BIS, and their associated uncertainties.The coloured (blue and coral) lines show the GP predictions over the observations, and the corresponding shaded areas of the lines show one  uncertainty of predicted distributions.The horizontal dashed grey lines show the means of the time-series.The grey extensions on the errorbars show the jitter terms.The right panel show posterior distributions of selected model parameters, including period, evolution timescale and harmonic complexity.

Figure 5 .Figure 6 .
Figure 5. HR diagram of our sample with the coloured points showing the measured rotation period for the 49 targets which passed our vetting procedure.The grey markers denote the other targets for which the rotation periods were not recovered or were not considered sufficiently robust to include in subsequent analysis.The dotted line at log g = 4.1 approximately separates main-sequence stars from evolved stars, i.e., sub-giants.

Figure 7 .Figure 8 .
Figure 7. Rotation period versus effective temperature for our sample, compared to previous results from the literature.The blue shading in the background represents a kernel density estimate of the Kepler/LAMOST period-temperature results.Empirical cluster sequences of different ages are shown in grey lines, derived from Curtis et al. (2020).Lines of constant Rossby number are shown in purple.The asteroseismic sample of Hall et al. (2021) is plotted in grey and our new results based on HARPS data are plotted in orange.

Figure 9 .
Figure 9.The left panel shows the RMS ratio / (a proxy for the faculae-to-spot ratio) versus the Rossby number Ro (more magnetically active stars have smaller Ro).The error bars show the derived statistical uncertainties, and the colour scheme indicates the effective temperature.The dotted grey line indicates Ro = 1.9, and the orange line shows the linear fit to the data within the range of Ro < 1.9.The right panel shows the estimated Spearman correlation coefficient between / and Ro, as well as the slope of the linear fit to log(A/B) as a function of Ro.The two proxies are plotted against the cut-off value of Ro ranging from 1.6 to 4.1.The blue and orange lines show the analysis of the full data, while the thinner purple and coral lines show the analysis excluding the data point at Ro ≈ 0.5 for comparison.

Table B1 :
Confirmed rotation period measurements of 49 targets.

Table B2 :
Basic properties of all 268 targets in the sample.Simbad name  eff [K]  eff err [K] log g log g err Mean log  ′