Shallower radius valley around low-mass hosts: Evidence for icy planets, collisions or high-energy radiation scatter

The radius valley, i.e., a dearth of planets with radii between 1.5 and 2 Earth radii, provides insights into planetary formation and evolution. Using homogenously revised planetary parameters from Kepler 1-minute short cadence light curves, we remodel transits of 72 small planets mostly orbiting low-mass stars, improving the precision and accuracy of planet parameters. By combining this sample with a similar sample of planets around higher-mass stars, we determine the depth of the radius valley as a function of stellar mass. We find that the radius valley is shallower for low-mass stars compared to their higher mass counterparts. Upon comparison, we find that theoretical models of photoevaporation under-predict the number of planets observed inside the radius valley for low-mass stars: with decreasing stellar mass, the predicted fraction of planets inside the valley remains approximately constant whereas the observed fraction increases. We argue that this provides evidence for the presence of icy planets around low-mass stars. Alternatively, planets orbiting low-mass stars undergo more frequent collisions and scatter in the stars' high-energy output may also cause planets to fill the valley. We predict that more precise mass measurements for planets orbiting low mass stars would be able to distinguish between these scenarios.


INTRODUCTION
The 'radius valley' is an observed paucity of planets between about 1.5 and 2 Earth radii.Its properties provide insights into planetary formation and evolution mechanisms.One group of theories suggests that small, close-in planets undergo thermally-driven mass loss scenarios, including photoevaporation (e.g.Owen & Wu 2013;Lopez & Fortney 2013;Chen & Rogers 2016;Owen & Wu 2017), and core-powered mass loss (e.g.Ginzburg et al. 2018;Gupta & Schlichting 2019, 2020), resulting in a separation between planets without substantial atmospheres, and planets that have retained them.Alternatively, it is suggested that under the late gas-poor formation scenario, planets below the valley have formed after most of the gas has dissipated and hence do not have an atmosphere (e.g.Lopez & Rice 2018).
The radius valley around FGK stars has been extensively observed (e.g.Fulton et al. 2017;Van Eylen et al. 2018;Fulton & Petigura 2018).These observations all show a separation between planets above the valley, which are thought to possess a substantial hydrogenhelium (H-He) atmosphere, and planets below the valley, which are expected to be bare rocky cores.These studies have also found that the radius valley shifts to a larger radius for shorter orbital periods, which is consistent with thermally-driven mass loss models, as larger planets could lose their atmospheres closer to their hosts.The radius ★ E-mail: sze.ho.20@ucl.ac.uk valley is also found to shift to a larger radius for more massive host stars, which again agrees with both photoevaporation and corepowered mass loss models (e.g.Berger et al. 2020b;Petigura et al. 2022;Ho & Van Eylen 2023).
However, the picture is different when it comes to M-dwarf host stars.Cloutier & Menou (2020) studied the radius valley around lowmass stars, and found that the valley location tends to larger planet radii for larger planet-star separation, opposite of the dependence for FGK stars.They suggested that this is evidence for such planets to undergo gas-poor formation (e.g.Lopez & Rice 2018;Cloutier & Menou 2020).Yet, Van Eylen et al. (2021) analysed small planets orbiting M-dwarf stars and found the valley location tending to smaller planet radii for larger planet-star separation, similar to that of FGK stars, contradicting the findings of Cloutier & Menou (2020).Bonfanti et al. (2024) also similarly analysed the orbital period dependence for the M-dwarf radius valley, and found a similar, but much weaker dependence, than Van Eylen et al. (2021).Recently, Luque & Pallé (2022) investigated the distribution of 43 small transiting planets around M dwarfs with mass measurements and argued that there is no orbital period dependence of the radius valley, but rather planets can be clearly separated into three distinct groups in bulk density space: rocky planets, 'water worlds' with 50% water, and 'puffy' sub-Neptunes with sizeable atmospheres.Similarly, studies from Venturini et al. (2020); Mousis et al. (2020); Venturini et al. (2024) have suggested that the valley emerges due to a dichotomy in density between supercritical water worlds and rocky planets.How-ever, Rogers et al. (2023) counter-argued that thermally-driven mass loss mechanisms can still explain the distribution of small planets around low-mass stars.If a 'boil-off' scenario existed, i.e. a significant amount of H-He atmosphere is removed during disk dispersal, the resulting planet distribution matches well with that of the 'water worlds' identified in Luque & Pallé (2022).Even if the 'boil-off' process does not happen, and the initial atmospheric loss processes are not known, the resulting distribution is consistent with the overall arrangement noted in Luque & Pallé (2022).This is because the atmospheric mass fraction scales with the planet mass, when atmospheric mass-loss is taken into account (Rogers et al. 2023).Hence, it is uncertain whether the difference in the valley's dependence with orbital period can be explained by the existence of such 'water worlds'.These discrepancies in the characteristics of the radius valley for host stars of different masses, as well as the various views and interpretations of the radius valley for low-mass stars, suggest the possibility that stellar mass plays a role in determining the evolutionary pathways of planets.Ho & Van Eylen (2023) found that careful fitting of short-cadence transit light curves to obtain precise planetary radii caused some inferred planet properties to change (see also Hernandez Camero et al. 2023), revealing a deeper, more sharply-determined radius valley.Ho & Van Eylen (2023) showed that the radius valley properties depend on stellar mass, but focused on planets orbiting FGK stars.In this work, we extend the sample by conducting a similar, careful and homogeneous analysis of Kepler short cadence transit observations to study the radius valley for planets orbiting low-mass M-type stars.We compare the observed radius valley for stars of different masses and attempt to reproduce the observed radius valley characteristics using photoevaporation models.
In Section 2, we describe the planet sample and the methods used to fit the Kepler transit light curves to obtain new planetary parameters.In Section 3, we analyse the observed radius valley's change with stellar mass and compare them with theoretical models.The physical implications of the results are discussed in Section 4. Finally, we provide conclusions and outlook in Section 5.

Sample of new planets orbiting low-mass stars
In this work, we study planets around low-mass stars, using planets in the sample from the California-Kepler Survey (CKS) 'extended mass' (CXM) sample (Petigura et al. 2022), which has a stellar mass range between 0.4 and 1.51 ⊙ .The CXM is a filtered sample of the union of the Kepler Data Release 25 stellar properties catalogue (Mathur et al. 2017), the Gaia DR2 catalogue (Gaia Collaboration et al. 2018), and the Gaia-Kepler Stellar Properties Catalogue (Berger et al. 2020a).
We select planets that meet the criteria defined in Ho & Van Eylen (2023), i.e.
(i) 1 ≤  p / ⊕ ≤ 4, where  p is the planetary radius, (ii) 1 ≤ /d ≤ 100, where  is the orbital period, (iii) at least 6 months of Kepler 1-minute short cadence data available, and (iv) without transit timing variations (TTV) measurements, as reported in Holczer et al. (2016).This gives us 72 planets not previously refitted by Ho & Van Eylen (2023) for which we homogeneously refit the planetary transits to update the planetary parameters.While the same orbital period does not correspond to the same irradiation received by the planet around stars of different masses, the radius valley is orbital-period dependent (e.g.Van Eylen et al. 2018;Ho & Van Eylen 2023), hence we expect that our analysis, which accounts for this dependency, remains robust across various stellar mass ranges.In our subsequent analysis, we will divide our sample into multiple stellar mass bins, therefore minimising the difference in irradiation received by the planet for the same orbital period.Furthermore, the radius valley lies well within 1 ≤  p / ⊕ ≤ 4 for the stellar mass ranges studied in this work (e.g.Ho & Van Eylen 2023), therefore we do not expect the radius cuts implemented during sample filtering to significantly impact our analysis of the radius valley.

Transit Fitting
Since Kepler 1-minute short cadence light curves produce superior planetary radius precision due to their higher sensitivity towards changes in the transit shape (e.g.Petigura 2020;Ho & Van Eylen 2023;Hernandez Camero et al. 2023), we refit the 72 planets that have not been studied in Ho & Van Eylen (2023) using Kepler 1minute cadence light curves, to obtain updated planetary parameters.
We follow the transit fitting method used in Ho & Van Eylen (2023) and summarise its main characteristics here.We start by performing data reduction and light curve detrending, then correcting the light curves with the 'radius correction factor' (Furlan et al. 2017) to account for stellar multiplicities.We then use exoplanet (Foreman-Mackey et al. 2021) to generate a transit light curve model with quadratic stellar limb darkening, and run a Hamiltonian Monte Carlo (HMC) algorithm implemented in PyMC3 (Salvatier et al. 2016) with a Gaussian Process (GP) model (Rasmussen & Williams 2006) included to account for undetected noise in the light curves.We initialise the HMC chains in the same way and apply the same priors as in Ho & Van Eylen (2023).Similarly, we fit for the following parameters: orbital period (), transit mid-time ( 0 ), planet-to-star radius ratio ( p / ★ ), impact parameter (), eccentricity (), argument of periapsis (), stellar density ( ★ ), and two quadratic stellar limb darkening parameters ( 0 and  1 ).We use the values from Petigura et al. (2022) as the initial guesses for  ★ , as these stars are not included in the Fulton & Petigura (2018) sample.The initial guesses for  0 and  1 are estimated using the Exoplanet Characterization ToolKit (Bourque et al. 2021), with the input effective temperature ( eff ), surface gravity (log ), and metallicity ([Fe/H]) taken from the Kepler Q1-16 dataset (Mullally et al. 2015).

Revised planetary parameters
We report the revised planetary parameters of all 72 planets in Table 1.For subsequent analysis of the radius valley, we only consider those planets whose properties are well-known.Following Ho & Van Eylen (2023), we, therefore, apply the following parameter constraints: (i) measurement uncertainty on the planet radius (  p ) ≤ 10%, (ii) radius correction factor (RCF) ≤ 5%, as reported in Furlan et al. (2017); this is to account for stellar flux from nearby stars contaminating the light curves (Furlan et al. 2017), and (iii) at least 3 transits in the Kepler short cadence data.
This results in a sample of 69 planets whose properties are homogeneously determined and precisely and accurately known.We then combine these planets with planets with similar characteristics, orbiting higher-mass stars, and include 375 such planets with parameters taken from Ho & Van Eylen (2023).This total sample of 444 planets is shown in Fig. 1.
The sample of planets with properties newly determined in this work provides an important complement to the sample studied by Ho & Van Eylen (2023).In Fig. 2, we show both samples as a function of stellar mass, highlighting how these 69 new planets significantly extend the homogeneous sample around low-mass M dwarfs.In Fig. 3, we compare the noise properties of both samples, showing that the uncertainties in planet-to-star radius ratio, planet radius, stellar mass and radius, are all broadly similar.

Shallower observed radius valley for lower mass stars
We now investigate the properties of the radius valley as a function of stellar mass.As a starting point, we divide the sample into four mass bins:  ★ / ⊙ < 0.7, 0.7 ≤  ★ / ⊙ < 0.9, 0.9 ≤  ★ / ⊙ < 1.1, and  ★ / ⊙ ≥ 1.1.The resulting sample in those four stellar mass bins is shown as a function of planet radius and orbital period in Fig. 4. On the same figure, we also show the boundaries of the radius valley, according to the 3-dimensional equation determined in Ho & Van Eylen (2023), which is defined by a support-vector machine (SVM) separating the planet population into two groups: with  = −0.09+0.02 −0.03 ,  = 0.21 +0.06 −0.07 ,  = 0.35 +0.02 −0.03 , and setting  ★ to the median of the host stars masses within each mass range.We take  upper = 0.42 and  lower = 0.29 from Ho & Van Eylen (2023), which are determined by the parallel lines passing through the supporting vectors.Here, we have plotted the valley as determined by Ho & Van Eylen (2023) -without the contribution from the lower-mass stars.When including the latter and refitting equation 1, we find  = −0.10+0.02 −0.03 ,  = 0.17 +0.14 −0.09 ,  = 0.36 +0.02 −0.02 , which are in agreement with Ho & Van Eylen (2023) well within 1.We also test whether the radius valley slope differs for individual stellar mass bins.We fit a 2-dimensional SVM in radius-period space for each stellar mass range.We find a slope of −0.15 +0.08  −0.18 , −0.10 +0.02 −0.04 , −0.10 +0.04 −0.10 , and −0.11 +0.05  −0.05 for  ★ / ⊙ < 0.7, 0.7 ≤  ★ / ⊙ < 0.9, 0.9 ≤  ★ / ⊙ < 1.1, and  ★ / ⊙ ≥ 1.1 respectively.These slopes are also in agreement with each other well within 1.As one of the main goals of this work is precisely to test whether the radius valley has the same physical origin around low-mass stars as around their higher-mass counterparts, we choose here not to use these updated radius valley parameters.
Visually inspecting the distribution of planet radii and orbital periods in Fig. 4, we see that for higher mass stars, there are fewer planets lying within the boundaries of the radius valley.Since the radius valley is orbital-period dependent (e.g.Van Eylen et al. 2018;Ho & Van Eylen 2023), we divide the plot into multiple tilted bins, as shown in Fig. 4, and following the procedure in Ho & Van Eylen (2023), shift the planets along the slope of the valley to an equivalent radii at  = 10 d ( p,10 ).We then plot a histogram of these equivalent radii as shown in Fig. 5, separated by different stellar mass ranges.Again, following the procedure in Ho & Van Eylen (2023), we fit the histogram with a curve determined using a Gaussian mixture model with two components, which is independent of the width and boundaries of the histogram bins, and fixes the bimodal distribution of small planets.This equation is given by with  1 +  2 = 1.To account for measurement uncertainties, we follow the method used in Rogers & Owen (2021) and fit the Gaussian mixture model as an inhomogeneous Poisson point process.
The likelihood function of observing an ensemble of planets at   = { p,10,1 , ...,  p,10,  pl } is given by where Λ =  ∫   p,10 d p,10 is the underlying occurrence rate, with  being an arbitrary constant which we also fit for,  pl is the number of observed planets, and   p,10, /Λ is the probability density of observing planet  at  p,10, .We infer the posteriors for ,  1 ,  2 ,  1 ,  2 , and  1 by inputting the log-likelihood, ln (), into a Markov Chain Monte Carlo (MCMC) algorithm, using the Python package emcee (Foreman-Mackey et al. 2013) with 100 walkers and 5000 steps.The median and ±1 uncertainties are shown on the 1 3 10 30 100 M /M < 0.7 1 3 10 30 100 0.7 M /M < 0.9 1 3 10 30 100 0.9 M /M < 1.1 1 3 10 30 100 M /M 1.1 fitted curves.The posterior estimates of ,  1 ,  2 ,  1 ,  2 ,  1 , and  2 are reported in Table 2.We observe two features in this histogram.The first one is that the radius valley gets deeper for higher stellar mass.We use the metric defined in Ho & Van Eylen (2023): to compare the depth of the radius valley, with and where  super-Earth, peak and  sub-Neptune, peak refer to the number of planets at the super-Earth and sub-Neptune Gaussian peaks, taken from the median values of the fitted Gaussian mixture model (equation 2), respectively, and  valley is the lowest number of planets between the two Gaussian peaks.The resulting  values for different stellar mass ranges are reported in Table 3 and plotted in Fig. 6.From Fig. 6, we see that both  avg and  SN strictly increases with stellar mass, whereas  SE generally increases with stellar mass except at the transition at around 0.9 ⊙ ; this may however be due to the decreasing detection efficiency of small planets for higher mass stars.
We perform a linear fit to the  values as a function of stellar mass.
The resulting slopes show a clear increasing trend with a slope of 5.36 +3.19  −3.23 for  avg , 4.79 +3.44  −2.19 for  SE , and 4.91 +3.05 −3.37 for  SN .The p-values from the Wald Test with t-distribution of the test statistic, as implemented in scipy.stats.linregress(Virtanen et al. 2020), for the linear fit are 0.02, 0.09 and 0.01, respectively, indicating a strong rejection of the null hypothesis of no linear correlation between the two parameters.We do not expect that observational biases could reproduce these trends.The planet detection efficiency should decrease with increasing stellar radius (and therefore mass), due to the transit depth scaling inversely with the square of stellar radius, i.e.
Although Kepler is less sensitive to cooler M-dwarfs, their detection efficiency is comparable to their FGK counterparts (∼ 92% compared to ∼ 96%, Christiansen et al. 2020).Hence, the deeper radius valley for higher-mass stars should not be caused by undetected planets.The relative uncertainty on stellar radius is also  similar for low-mass stars as for their higher-mass counterparts (see Figure 3).
The second feature we observe is that the sub-Neptune peaks shift to higher  p for higher mass stars, whereas the super-Earth peaks do not increase as much.Fig. 7 shows the radius of the super-Earth and sub-Neptune peaks as a function of stellar mass.Again, we perform a linear fit to the planet radii at the two peaks as a function of stellar mass, and find  p,peak = 0.13 +0.16  −0.19  ★ + 1.25   the no linear correlation hypothesis for sub-Neptunes, but not for the case of super-Earths.This effect is also confirmed by other observational studies (e.g.Petigura et al. 2022), which analysed a similar sample of planets, albeit using Kepler long-cadence observations and different transit modelling than the ones used here.

Photoevaporation models predict a narrower radius valley for lower mass stars
Owen & Schlichting (2024) suggested that although the observed super-Earth population likely contains significant fractions of planets that were stripped by both photoevaporation and core-powered mass-loss, photoevaporation was likely responsible for the final atmospheric stripping of planets located at the edge of the radius valley today, determining the final features observed today.Therefore, we only perform our subsequent comparison between observations and photoevaporation models.We model planets undergoing thermal cooling and photoevaporative mass loss of their hydrogen-dominated atmospheres with the semi-analytic framework of Rogers & Owen (2021) and Rogers et al. (2023).In doing so, we also adopt the underlying planet distributions inferred in Rogers & Owen (2021) for planetary core masses, initial atmospheric mass fractions, orbital periods, as well as an Earth-like core density for all planets of 5.5 g cm −3 .Changing these distributions, which come from statistical inference between the photoevaporation model and Kepler data, will inevitably change the population of planets that might be observed under such conditions.This highlights an important approach in this work, in that, we wish to compare Kepler data with a realisation of the photoevaporation model that fits the observations as well as reasonably possible.Inevitably, there are multiple parameters within our evolution model that can be tuned within reasonable limits to fit the observations more accurately.However, if inconsistencies still exist between the model and observations, even after tuning, the robustness of the result increases.One such example is that, under the photoevaporation model, the predicted slope of the radius valley as a function of the orbital period is altered with different assumptions in the escape efficiency  (Owen & Wu 2017), which allows one to calculate mass loss rates: where  XUV is the incident stellar XUV flux (following Rogers et al. 2021),  is the gravitational constant, and  p and  p are the planet's radius and mass.Typically, the efficiencies are parameterised as a function of the planet's escape velocity: To begin, we find a pair of  0 and   that approximately match the slope of the observed valley.In practice, we achieve this by finding the pair of parameters that minimise the number of observed planets inside the model's valley.We do this only in the highest stellar mass bin (1.1 ≤  ★ / ⊙ < 1.3) since the photoevaporation models have only ever been fit to FGK stellar hosts, for which an accurate fit has been shown to exist (Rogers & Owen 2021).Therefore, we fix the models to attain this consistent fit at high stellar mass, which then yields an extrapolation of the model to lower stellar masses.The specific values were found to be  0 = 0.24 and   = −1.11,consistent with the inference of Rogers & Owen (2021) and hydrodynamic simulations of atmospheric escape.We plot 10 6 modelled planets together with the observed sample in Fig. 8. Building on methods discussed in Rogers et al. (2021) and Petigura et al. (2022), we locate the boundaries of the radius valley region as the area enclosing 1% of the modelled planets, while maximising the valley area.Table 4 lists the upper and lower boundaries of the radius valley in the photoevaporation model for each stellar mass range.We compute the area of the radius valley of the photoevaporation model in the  p −  diagram for different stellar mass bins, and report this in Table 4.We see that the area of the radius valley increases with increasing stellar mass, meaning that higher-mass stars produce a wider radius valley.To first order, the width of the valley is controlled by the difference in size between a stripped core of a given mass and the same core if it were to host a small hydrogen-dominated atmospheric mass fraction ∼ 0.1%.The cause of the increase in valley width is that planets orbiting higher-mass stars experience a higher equilibrium temperature at a given orbital separation.Under these circumstances and all else held fixed, the transit radius of a given sub-Neptune is more inflated when compared to those around smaller-mass stars.On a population level, this increases the difference in size between a stripped core and sub-Neptune, thus widening the valley.The model's widening valley with increasing stellar mass may seem to qualitatively match the observations, where a narrower valley could appear shallower due to the uncertainty of the observed planet radius.However, further steps are required to robustly compare our observations to models, as observational biases exist that are absent in the models, meaning the observed planet population has a different distribution in radius-period space compared to the models.Therefore, we perform a more rigorous comparison in Section 3.3.

Comparison with theoretical models of atmospheric mass loss
As discussed in Section 3.2, we wish to compare Kepler data with a realisation of the photoevaporation model that fits the observations as well as reasonably possible.In Section 3.2, we constructed models that minimised the number of observed planets inside the valley, thereby matching the approximate valley slope.However, this did not precisely match the rest of the distribution across the entire range in stellar mass, such as the specific occurrence of super-Earths and sub-Neptunes in various regions of parameter space.In this Section, we introduce a methodology to fit the planet distribution, which avoids the extreme computational costs of Rogers & Owen (2021).
Our goal is to produce a model that is, in fact, perfectly matched to the observations everywhere except inside the valley.This perfectly fit model is then used to predict the number of planets inside the radius valley for each stellar mass bin, accounting for measurement uncertainty, which inevitably scatters some planets into the valley.
In this way, we investigate whether the photoevaporation framework is able to match the observed radius valley trend with stellar mass, as a consequence of the model producing a narrower radius valley for lower-mass stars and observational uncertainty.If the model still cannot reproduce the observed increase of planets inside the valley at lower stellar masses, then the inconsistency is robust.It would imply that photoevaporation models -even after exhausting the free parameters available within these models -are insufficient in their current form to explain the trend of the observed radius valley with  ★ .
For each stellar mass bin, we divide the radius-period plot into multiple cells, parallel with the valley, as shown in Fig. 8.We construct the cells such that the valley is contained within its own cell for each period bin, with boundaries defined from Table 4.The goal is to find a numerical weight for each cell outside of the valley such that the number of planets in the model is weighted to exactly match that of the observations.As discussed, this perfectly weighted model is then used to predict the number of planets inside the radius valley for each stellar mass bin, due to measurement uncertainties in planetary radii.
To begin, we account for measurement uncertainty in planetary radius by redrawing each modelled planet from N  p, model ,   p,model where   p,model is a random selection in the set of   p,obs of planets within the same period range.Afterwards, for each cell, we calculate the number of planets that have shifted to another cell as a result of the new draw.We represent this in matrix form: where  ,  represents the number of planets that were in cell  originally but are now in cell  after redrawing.Here, ,  ∈ {0, 1, .., , .., } and  represents the valley cell index.At this stage, we approximate to zeroth order that the model's radius valley is completely empty, and remove all elements from F that associate with the valley cell index , e.g. , and  ,  being the cells for which planets have shifted into, or out of the valley.Since the measured uncertainties in planetary orbital periods are very small, we calculate F independently for each period bin since planets do not shift across horizontal cells as a result of the new draws.We can then calculate the weights for each cell by solving the matrix equation where is the array of the number of observed planets in each cell.This is achieved by finding F −1 (the inverse of F), i.e., which is numerically trivial since F is a square matrix with dominant diagonal elements.To prevent a singular matrix, we assign a value of 1 for the diagonal terms  , if cell  has no planets inside.Our only requirement for the weight inside the valley,   , is that the weights vary smoothly across the valley for increasing planet size.As such, we determine   by taking the average of the two adjacent weights,  −1 and  +1 , i.e.
Our perfectly weighted model is now used to calculate the expected number of planets in the model's radius valley with where is the vector containing the number of planets that have shifted from different cells into the radius valley from the new draw.In essence,  model,valley represents the number of planets the photoevaporation model would predict to exist inside the valley, given that it reproduces the observations everywhere else.We perform a bootstrap of 1000 samples for each stellar mass bin, each time randomly redrawing a new set of  p,obs from  4. The black dotted lines show the cells defined in Section 3.3.
N  p,obs ,   p,obs , and solving equation 11 to get a new  model,valley , to obtain the median and ±1 uncertainties of  model,valley and  obs,valley .To investigate the effect of bin sizes in this analysis, we tested this method with multiple bin sizes in period and radius.We repeat the above-mentioned steps for different equally-spaced logarithmic bins in periods between 1 and 100 d, and varying bin sizes in radius.We find that the resulting values for  model,valley do not change substantially, and all values are within 1 of each other.We therefore use 5 equally-spaced bins in log 10  between 1 and 100 d, and a bin width of 0.2 in log 10  p , for reporting the final result.Fig. 9 shows the histogram comparing the number of planets in the lowest stellar mass bin (0.5 ≤  ★ / ⊙ < 0.7) between the observations and the photoevaporation model after weighting; here we see that the two match everywhere outside the valley, however there is an excess of planets inside the valley for the observations.Fig. 10 shows the fraction of planets inside the radius valley as a function of stellar mass, comparing the photoevaporation model with observations.The results show that for the observations, this fraction largely decreases with increasing stellar mass, with a bestfitting slope of  = −0.038+0.044 −0.045 , whereas in the theoretical photo-evaporation scenario, the fraction stays approximately constant, with  = 0.001 +0.002 −0.002 .Since the area of the radius valley also decreases with decreasing stellar mass (see Table 4), the stellar mass dependence of the fraction is less significant compared to the  metrics computed in Section 3.1 (see Fig. 6), which is independent of the radius valley area.However, even after taking this effect into account, we continue to see the decrease in fraction with increasing stellar mass, meaning that for lower-mass stars, there are more planets inside a narrower valley for the observations.The measured fractions between the model and observations are more than 1 apart for the 3 lowest stellar mass data points, with the fraction of planets inside the radius valley much higher for observations compared to the photoevaporation case.We therefore conclude that current photoevaporation models under-predict the number of planets observed inside the radius valley for low-mass stars.This suggests that these models need to be modified or that further physical mechanisms need to be considered to explain the characteristics of small planets orbiting low-mass stars.4, after weighting the model according to the method discussed in Section 3.3.Here, the number of planets match everywhere except inside the valley.

CAUSES FOR DISCREPANCY BETWEEN OBSERVATION AND MODEL
In Section 3, we noted the increased shallowness of the radius valley for low-mass stars compared to their high-mass counterparts, as well as the discrepancy in the emptiness of the valley between theoretical models and observations.Given the overall success of photoevaporation models in explaining the observed properties of close-in small planets, we investigate how these models can be extended to reconcile them with observations.In particular, we suggest the following possible explanations for the observed discrepancy: X-ray and UV (XUV) scattering (Section 4.1), dispersion in planetary core composition (Section 4.2), and planetary collisions (Section 4.3).Hypothetical plots of how each scenario may present in the radius-period plot are shown in Fig. 11.

Scattering in the stars' XUV output
A star's high-energy output is known to be tightly correlated with its rotation period (e.g.Wright et al. 2011;Johnstone et al. 2021).It is also known that at a given age and stellar mass, there is a spread in rotation periods, where this spread in rotation periods increases at younger stellar ages (e.g.Reinhold & Hekker 2020).Therefore, empirically, it is anticipated that stars with identical masses could have different histories of their XUV output.Studies (e.g.Tu et al. 2015;Kubyshkina et al. 2019) have shown that plausible different histories of a star's high energy outputs could lead to different planetary evolutionary scenarios, where planets with identical initial properties are able to retain or lose an atmosphere.The position of the radius valley directly depends on the XUV exposure (X xuv , i.e. the total lifetime integrated XUV flux incident upon the planet, Owen & Wu 2017).Thus, as the XUV exposure will vary from star-to-star with the same stellar mass, the radius valley will become blurred.Following Owen & Wu (2017) with a mass-loss efficiency scaling of  ∝  −2 esc , with  esc the planet's escape velocity, if a given star's XUV exposure is a fraction  X of a typical star with the same stellar mass, then the position of the radius valley should scale as: This implies a factor of 5 spread in the XUV histories is enough to move the position of the radius valley up or down for an individual star by ±10%.This is possible as Tu et al. (2015) has shown that for planets with the same initial planetary and atmospheric masses, the planet retains 45% of its initial atmosphere at 5 Gyr when orbiting a slowly rotating star with minimal XUV output, whereas the planet would lose all of its atmosphere at ∼ 100 Myr if it orbits a fast-rotating star.Thus, variation in a star's XUV history will fill the radius valley; planets around stars with a higher-than-average XUV output will have a radius valley at larger planetary radii.Therefore, more massive cores can be stripped at a given orbital period, polluting the radius valley with rocky planets.However, stars with a lower-than-average XUV output will allow lower mass planets to retain their atmospheres at a given orbital period, polluting the radius valley with planets with a H-He dominated atmosphere and thus appear lower density than a planet with an Earth-like bulk density.Therefore, if XUV scatter is the cause of the excess of planets in the radius valley, planet's in the radius valley should have a range of densities covering Earth-like bulk densities to more volatile-rich densities.
There are two reasons why this should lead to an increasing pollution factor with decreasing stellar mass.Firstly, as discussed in Section 3.2, the width of the radius valley decreases with decreasing stellar mass.Thus, for a fixed spread in XUV histories, this will naturally pollute a narrower valley more, as any planet is more likely to be scattered into the valley.Secondly, there is some empirical evidence that the scatter in possible XUV histories is larger around lower mass stars, as evidenced by the larger spread in rotation rates measured at a given age (e.g.Newton et al. 2016;Reinhold & Hekker 2020).Photoevaporation models could be constructed to take XUV scattering into account in order to investigate this effect further; however, since the rotation evolution of M-dwarfs remains poorly constrained, we leave such work for future studies.

Dispersion in planetary core composition
As in the case of scatter in a planet's XUV history, the bulk density of the planet's core also causes variation in the position of the radius valley.A planet with a lower-density core, but the same mass, has a weaker gravitational potential.Thus, this planet has its atmosphere more easily removed than a planet with a denser core.As such, the radius valley, at a fixed period, appears at a larger radius for lowerdensity cores.As discussed by Owen & Wu (2017); Jin & Mordasini (2018); Mordasini (2020), any spread in core-composition will result in a shallower radius valley (Fig. 8 in Owen & Wu 2017).Since the position of the radius valley appears to be in agreement with predictions for an Earth-like bulk density, but with an excess of planets in the valley, if a spread in core composition caused this, this would imply that the distribution of core densities has a larger tail of lower bulk density cores around lower mass stars.
The ice line of stars is the distance from the star beyond which the temperature is low enough for volatiles to condense.Hence, the ice line is closer to the star for lower-mass stars; planets with a larger ice fraction in their cores are more likely to form closer to the star, within the observed orbital period range.Therefore, if there was an increasing population of planets with lower bulk density cores around lower mass stars, this could also explain the excess pollution of the radius valley that we observe (e.g.Venturini et al. 2020).For planets with higher equilibrium temperatures (⪆ 400K) around low-mass stars, simulations have shown that water in planets exists wholly in the form of steam rather than in liquid or solid phases, and such water mixes with the planetary envelope (e.g.Mousis et al. 2020;Venturini et al. 2024), hence such steam worlds may populate the radius valley for low-mass stars.

Collisions
Collisions among planetary bodies have previously been proposed to be an integral part of Super-Earth and Sub-Neptune formation (e.g.Inamdar & Schlichting 2015;Izidoro et al. 2017).Such collisions among planets in the super-Earth and Sub-Neptune populations can also lead to a more filled-in radius valley.For example, when two sub-Neptunes collide, this giant impact imparts so much kinetic energy that is subsequently converted into heat that the hydrogen envelope is entirely lost while their cores merge (e.g.Liu et al. 2015;Biersteker & Schlichting 2019;Matsumoto et al. 2021;Izidoro et al. 2022).This, therefore, leads to a more massive but smaller planet (due to the lack of the hydrogen envelope) and hence would populate the radius valley.Similarly, a merger of sufficiently large super-Earths could also start to populate the radius valley from the lower edge.Regardless of whether the collisions appear between super-Earths and sub-Neptunes or within each of these populations, the net result is generally the loss of any hydrogen-envelope, leaving behind mostly rocky-cores.Therefore, if collisions are responsible for populating the radius valley, then the population of exoplanets residing in it is expected to be predominantly rocky in their composition (right panel of Fig. 11).
For the same planet mass ( p ) and semi-major axis (a), planetary scale collisions are indeed expected to be more common around lower-mass stars because the Hill radius, which estimates the scale over which neighbouring planets can significantly perturb each other gravitationally while orbiting their host star ( ★ ), is larger for lower mass stars.Additionally, at fixed planet mass, encounters can result in perturbations approaching the planets' escape velocity (e.g.Goldreich et al. 2004), around lower-mass stars (with shallower potentials), these planet-planet scatterings result in higher eccentricities.This, therefore, leads to more eccentricity excitation, eventual orbit crossings and collisions around lower mass stars than higher mass ones, assuming the planet population is identical otherwise.In fact, planets with high bulk densities around low-mass stars, which are thought to be products of collision events (e.g.Naponiello et al. 2023) have been discovered.

Mass measurements of planets point to a mixture of scenarios
As discussed in Sections 4.1-4.3,stellar XUV scatter, dispersion in core compositions, and collisions all result in similar views of the radius valley in radius-period space.To disentangle these degenerate views, one could measure the masses of planets inside the radius valley to infer their compositions.
In our sample of 444 planets analysed in this work, 203 orbit around host stars with 0.5 ≤  ★ / ⊙ < 0.9, of which only 33 have archival mass measurements, and only 11 of which have mass precision better than 30%.This is because the sample studied here originates from Kepler observations, which are suited for highly precise and homogeneous transit observations, but are less suited for follow-up mass observations due to the relative faintness of Kepler host stars.We increase the sample size by turning to the NASA Exoplanet Archive (Akeson et al. 2013) for the sample of all confirmed exoplanets with mass measurements, within the period range of 1 ≤ /d ≤ 100 and with planetary radii in the range 1 ≤  p / ⊕ ≤ 4. For planets with multiple reported mass measurements, we select the value published most recently.We also utilise the recent measurements from Leleu et al. (2023), which are not yet available on the NASA Exoplanet Archive at the time of writing.Although the most recent mass measurements may not have the best precision, we hold the assumption that the instruments and methodology used, and the data available for retrieving planetary masses from observations, should improve over time.We then only retain planets with mass precision better than 30%.For radius measurements, we acknowledge the importance of homogeneity and precise radius measurements (e.g.Ho & Van Eylen 2023), hence for planets not in the sample of Ho & Van Eylen (2023) or this work, we use the planetary radii and stellar mass from Berger et al. (2023) if available, otherwise we use the corresponding planetary radius and stellar mass measurement reported in the NASA Exoplanet Archive for the same mass entry.
We calculate an expected envelope fraction for the planets by following the method in Chen & Rogers (2016), which applies the Modules for Experiments in Stellar Astrophysics (MESA) models (Paxton et al. 2011(Paxton et al. , 2013(Paxton et al. , 2015) ) to sub-Neptune atmospheres.This method takes the planetary radius ( p ), planetary mass ( p ), planetary incident bolometric flux ( p ), and age as inputs.We use  p from Berger et al. (2023) where available, otherwise, we calculate  p from the stellar radius and effective temperature, and the orbital semi-major axis.We use the stellar ages from Berger et al. (2023) if available, else we find the value with the lowest percentage uncertainty from the NASA Exoplanet Archive.We compute  core using the mass-radius relationship from Fortney et al. (2007) for a given  p , assuming an Earth-like rocky/iron core with an iron fraction of 1/3, and that all of the planet's mass reside in its core.We then solve for the  env required to produce a planet with the given mass and radius using Brent's method of root finding.Some planets have mean densities consistent with Earth-like, atmosphere-free compositions, as is expected from photoevaporation models for planets below the radius valley.Where we cannot find an atmospheric solution, we set a default envelope fraction to 10 −4 .
We plot the resulting  env for planets with mass measurements in a  p −  plot in Fig. 12.A table listing all the planetary masses and envelope fractions is provided in the Appendix (Table A4).We observe that, in general, planets with bare rocky cores lie towards the lower part of the radius-period plot, while planets with significant atmospheric fractions (≳ 10 −2 ) populate the upper part.We notice a few planets with a bare rocky core move across the boundary into the region with an abundance of planets with large envelope fractions, however, the masses of these planets mostly come from transit timing variations (TTV) rather than radial velocity (RV) measurements.
By visual inspection, our results do not wholly match the case for strong XUV scatter, as we do not observe a significant mixing of bare rocky planets and planets with predicted substantial atmospheres inside the valley as in the left panel of Fig. 11.On the other hand, the presence of planets with predicted small, but non-zero, envelope fractions (≲ 10 −2 ) inside the valley would be consistent with a low ice fraction in their cores, or planets with water-rich atmospheres.Finally, if we consider the TTV masses to be true, the existence of some high-mass bare rocky planets with radii just above the radius valley suggests these are extremely dense planets which may be byproducts of collision events.Therefore, we suggest that a mixture of these scenarios is the probable scenario around low-mass stars.
However, one cannot draw a firm conclusion for the following reasons.Most importantly, we suffer from small number statistics since most planets do not have precise mass measurements available.Another possible concern is that the calculation of the envelope fraction requires information of the stellar age, which are highly uncertain (with a mean uncertainty of ±90% for 0.5 ≤  ★ / ⊙ < 0.9 in our sample).However, for a sample of main sequence stars, this is unlikely to significantly affect our conclusions.Nevertheless, a more comprehensive study is required to enhance our understanding around low-mass stars, such as obtaining more precise mass measurements of small planets around low-mass stars, with a focus towards those inside the predicted radius valley, as well as more precise age estimates of such stars, or thorough modelling of XUV scattering or collision events.

CONCLUSIONS
In this work, we refitted the light curves of 72 Kepler planets around low-mass stars, using Kepler 1-minute short cadence data.We present their revised planetary parameters, and combining with the parameters in Ho & Van Eylen (2023), we statistically analyse the change in the depth of the radius valley as a function of host star mass.
We find that the radius valley becomes shallower around lower mass host stars.Although the photoevaporation model also predicts a narrower radius valley for lower mass hosts, this filling-in of the observed valley could not be explained by theoretical models of photoevaporation, even when observation biases and dispersion in initial conditions are taken into account.This suggests that additional mechanisms are involved in shaping the radius valley for planets around low-mass stars.
We suggest three possible theoretical scenarios that could explain the shallower radius valley for low-mass stars: scattering in the stars' XUV output, dispersion in planetary core composition, and collision events.Combining our sample with other planets around low-mass stars on the NASA Exoplanet Archive, we calculate the expected envelope fraction of planets under the assumption of a rocky core.We note that the resulting population distribution with reference to the radius valley location does not favour either scenario exclusively, however, it provides evidence for the presence of planets with a low ice/water fraction, or planetary collisions.

Figure 1 .Figure 2 .
Figure 1.Planetary radius-orbital period-stellar mass plot of the 444 small planets analysed in this work, which includes 375 with parameters from Ho & Van Eylen (2023) (red) and 69 newly refitted (blue) in this work.The radius valley boundary determined in Ho & Van Eylen (2023) is indicated by the grey planes.

Figure 4 .
Figure 4. Radius-orbital period plot of planets analysed in this work, separated by host star mass.The black dashed lines indicate the radius valley boundary as defined by the parameters in Ho & Van Eylen (2023).The grey dashed lines divide the plot into multiple orbital-period dependent bins, which are then used to plot the adjusted histogram in Fig. 5.

Figure 5 .
Figure 5. Histogram of  p , adjusted to an equivalent radii at P = 10 d according to the valley slope determined in Ho & Van Eylen (2023) ( = −0.09),for planets of different host star masses.The histograms are normalised such that the peaks of the sub-Neptune curves have a relative density of 1.The lines and shaded regions show the median and 1 confidence interval of the fitted curve respectively.

Figure 7 .
Figure 7. Plot of the planetary radius at the super-Earth and sub-Neptune histogram peaks from Fig. 5 as a function of host star mass.The shaded regions represent the ±1 confidence intervals of the linear fits.

Figure 8 .
Figure 8. Radius-orbital period plot of planets for different host star masses.Grey dots indicate planets simulated with the XUV photoevaporation models of Rogers et al. (2021), and coloured circles indicate the observed planets.The black solid lines show the boundaries of the model's radius valley, as defined by the parameters in Table4.The black dotted lines show the cells defined in Section 3.3.

Figure 9 .
Figure 9. Histogram of  p , adjusted to an equivalent radii at  = 10 d, according to the slope of the model's radius valley as listed in Table4, after weighting the model according to the method discussed in Section 3.3.Here, the number of planets match everywhere except inside the valley.

Figure 11 .
Figure 11.Hypothetical  p - plots of XUV scatter (left panel), dispersion in planetary core compositions (middle panel), and collisions (right panel).Planets of different types are represented by different colours and shapes.The radius valley is bounded by the black dotted lines.

Table 1 .
Petigura et al. (2022)ameter estimates of orbital period(), planetary-to-stellar-radius ratio ( p / ★ ), planetary radius ( p ), stellar radius ( ★ ), stellar mass ( ★ ), number of transits in the fitted light curve ( tr ), of the 72 planets refitted in this work.We convert  p / ★ into  p using  ★ values fromPetigura et al. (2022).★values are also taken fromPetigura et al. (2022).'Flag' refers to whether the planet passes the filter checks and is included in the smaller subset for further analyses (1 for true and 0 for false).Only the first 5 planets are shown here; the full table is available online in a machine-readable format.

Table 2 .
Posterior estimates of parameters from fitting the Gaussian mixture model as an inhomogeneous Poisson point process.

Table 3 .
values of the radius valley for planets with different host star masses.The  metrics are calculated from equations 4-6.Plot of the  metric to measure the depth of the radius valley from equations 4-6, as a function of host star mass.The shaded regions represent the ±1 confidence intervals of the linear fits.

Table 4 .
Slopes and intercepts of lower and upper radius valley boundaries of the photoevaporation model used in this work, given by log 10  p / ⊕ =  log 10 ( /d) + .The area of the radius valley, incorporating 1% of the modelled planets, is also reported.

Table A2 .
Petigura et al. (2022)f the additional planets fitted in this work, that are not in the sample ofHo & Van Eylen (2023).Orbital period (), transit ephermeris time ( 0 ), planet-to-star radius ratio ( p / ★ ), impact parameter (), eccentricity () and argument of periapsis () are direct results from the transit fitting, while the planetary radius ( p ), ratio between the semi-major-axis and the stellar radius (/ ★ ) and incident bolometric flux ( p ) are indirectly calculated.The  p / ★ and  p values fromPetigura et al. (2022)(P22) are also listed here for comparison.Only the first 10 rows are shown here; the full table is available online in a machine-readable format.

Table A3 .
Transit jitter and Gaussian Process (GP) parameters for the transit fitting of planetary systems performed in this work.Only the first 10 rows are shown here; the full table is available online in a machine-readable format.