Occurrence Rates of Planets Orbiting M Stars: Applying ABC to Kepler DR25, Gaia DR2, and 2MASS Data

We present robust planet occurrence rates for Kepler planet candidates around M stars for planet radii $R_p = 0.5-4~\textrm{R}_\oplus$ and orbital periods $P = 0.5-256$ days using the approximate Bayesian computation (ABC) technique. This work incorporates the final Kepler DR25 planet candidate catalog and data products and augment them with updated stellar properties using Gaia DR2 and 2MASS PSC. We analyze a clean sample of $1,530$ Kepler targets that host $89$ associated planet candidates. These early M-dwarfs and late K-dwarfs were selected from cross-referenced targets using several photometric quality flags from Gaia DR2 and color-magnitude cuts using 2MASS magnitudes. We identify a habitable zone occurrence rate of $f_{\textrm{HZ}} = 0.38^{+0.04}_{-0.05}$ for planets with $0.75-1.5$ R$_\oplus$ size. We caution that occurrence rate estimates for Kepler's M stars are sensitive to the choice of prior due to the small sample of target stars and planet candidates. For example, we find an occurrence rate of $8.9^{+1.2}_{-0.9}$ or $4.8^{+0.7}_{-0.6}$ planets per M-dwarf (integrating over $R_p = 0.5-4~\textrm{R}_\oplus$ and $P = 0.5-256$ days) for our two choices of prior. These occurrence rates are greater than those for FGK-dwarf target when compared at the same range of orbital periods, but similar to occurrence rates when computed as a function of equivalent stellar insolation. This suggests that stellar irradiance has a significant and possibly dominant role in planet formation processes regardless of spectral type. Combining our result with recent studies of exoplanet architectures indicates that most, and potentially all, early-M dwarfs harbor planetary systems.


INTRODUCTION
M stars represent an interesting stellar type for occurrence rate studies. Transiting exoplanets are more easily detectable around M stars than sun-like stars because transit depth δ ∝ R −2 * and M stars are smaller than FGK stars. The habitable zone (HZ) around M stars is also closer to the star due to their lower flux. Therefore, HZ planets around M stars have smaller periods in comparison to HZ planets around FGK stars, increasing both their geometric transit probability and their transit SNR for a transit survey with fixed lifetime. Both these considerations make small planets around M stars relatively favorable targets for atmospheric characterization with follow-up transit observations.
Contrasting the M-dwarf exoplanet population to the exoplanet population around FGK-dwarfs can provide significant insights to planet formation theories. Since the stellar types have different accretion and disk dissipation timescales, observing clear differences in their exoplanet populations can give insight into the relative importance of the two timescales (Ercolano & Pascucci 2017). M-dwarfs also have significantly less stellar irradiance compared to FGK-dwarfs, so differences in the exoplanet populations could point to the relative significance of stellar irradiance and disk viscosity for planetary accretion and migration. For these reasons, many projects have and/or will search for exoplanets orbiting M-dwarfs, including the ground-based MEarth (Irwin et al. 2015) and SPECULOOS (Delrez et al. 2018) transit surveys focusing on M-dwarfs. Additionally, space-based surveys by the K2 (Howell et al. 2014) and TESS (Ricker et al. 2015) missions have significant increased the number of nearby M-dwarfs surveyed for transiting planets with small orbital periods. Ground-based radial velocity (RV) surveys using instruments such as HPF (Mahadevan et al. 2012), SPIRou (Artigau et al. 2014), and CARMENES (Quirrenbach et al. 2018) also will target M-dwarfs to better characterize their exoplanet population, extending to larger orbital periods.
Several studies have attempted to quantify occurrence rates for M stars using Kepler data. An early study by Dressing & Charbonneau (2013) used the Inverse Detection Efficiency Method (IDEM) to determine the occurrence rate of planets with radii < 4R ⊕ for stars believed to be cooler than 4000 K from Q1-Q6 Kepler data. Kopparapu et al. (2013); Gaidos (2013) used similar methods, but arrived at larger estimates for the HZ rate, primarily due to different HZ limits. Morton & Swift (2014) took a slightly different approach and modeled each planet using a weighted kernel density estimator, arriving at the conclusion that previous rates were underestimated by a factor of 1.6 due to incompleteness. Gaidos et al. (2014) made use of an iterative simulation model to characterize the dependence of occurrence rate on planet radius, concluding that the distribution of rates peaks at planet radius R p ∼ 0.8 R ⊕ and that on average M-dwarfs host approximately 2 planets with planet radii R p = 0.5 − 6 R ⊕ and orbital period P < 180 days.
A particularly influential study is that of Dressing & Charbonneau (2015), which characterized occurrence rates for cool stars with T eff < 4000 K and log g > 3 using the Q1-Q17 DR24 Kepler data and adding archival spectroscopic and photometric observations to improve their stellar parameters. With a custom fit for pipeline completeness and a false positive correction to their sample, they produced smoothed maps of planet detection and completeness to arrive at binned occurrence rates for P = 0.5 − 200 days and R p = 0.5 − 4 R ⊕ . They conclude that on average there are 2.5 ± 0.2 planets with R p = 1 − 4 R ⊕ and P < 200 days. A study by Gaidos et al. (2016) also using the DR24 Kepler data found occurrence rates of f = 2.2±0.3 over R p = 1−4 R ⊕ and P = 1.2 − 180 days, consistent with those of Dressing & Charbonneau (2015). A study by Mulders et al. (2015) which selected M-dwarfs having T eff = 2400 − 3865 K found systematically lower occurrence rates than Dressing & Charbonneau (2015) over R p = 0.5 − 4 R ⊕ and P = 0.5 − 50 days, which is attributed to Dressing & Charbonneau (2015) having a better detection efficiency estimate for their pipeline.
M stars span a large range of stellar masses and temperatures. Given Kepler's aperture and target list, most occurrence rate studies of low-mass stars based on Kepler data have been based on early-type M dwarfs and likely had significant contamination from late K dwarfs. While Kepler only observed a small number of mid-type M dwarfs, two studies focused on occurrence rates for this challenging sample. First, Hardegree-Ullman et al. (2019) used Kepler DR25 data and enhanced it with Gaia DR2 (Gaia Collaboration et al. 2018) and spectroscopic measurements to achieve better stellar properties. They found a significantly enhanced occurrence rate for planets with R p = 0.5−2.5 R ⊕ and P < 10 days compared to previous studies which computed occurrence rates based on samples dominated by of earlier stars (e.g., Dressing & Charbonneau (2015) and Mulders et al. (2015)). (Gaia Collaboration et al. 2018) concluded that occurrence rates decreased as a function of stellar type.
A small number of studies have begun to characterize the M-dwarf exoplanet population via their planetary system architectures. Muirhead et al. (2015) calculated the occurrence rate of compact multiple systems (i.e. systems with multiple planets at P < 10 days) for mid-type M stars using DR24 Kepler data, concluding that 21 +7 −5 % of mid-M dwarfs host such systems. A separate study by Ballard & Johnson (2016) investigated planet multiplicity for a sample of primarily early M-dwarfs and concluded that the Kepler dichotomy also exists in the exoplanet population around M-dwarf. The architectures inferred for M-dwarf hosts can be compared to the architectures of FGK host stars Mulders et al. (e.g., 2018); He et al. (e.g., 2019).
This study aims to provide the most up-to-date occurrence rate estimates using the final Kepler DR25 planet catalog and associated data products and an enhanced set of stellar properties using Gaia DR2 measurements. In §2 we briefly review how we adapt our previous methodology from Hsu et al. (2019) to this study. In §3 we cover the selection of our M-dwarf catalog using Kepler DR25, Gaia DR2, and 2MASS 2MASS Point Source Catalog (PSC) measurements as well as present our new occurrence rate estimates. Lastly, in §4 we discuss the implications of our results, focusing on comparisons with our previously estimated FGK rates and other M-dwarf occurrence rate studies.

METHODOLOGY
Our study follows most of the same methodology as described in Hsu et al. (2019). We make use of Approximate Bayesian Computation (ABC) to perform Bayesian inference because writing a rigorous likelihood for a real world, complex transit survey like Kepler is impractical, if not intractable. We implement a Population Monte Carlo (PMC) sampler to improve sampling efficiency and more rapidly converge on the ABC posterior for the planet occurrence rates (see (Hsu et al. 2018) for a full description).
The hyperparameters are the planet occurrence rates for each bin in a 2-d grid over orbital period and planet radius. The bin edges for this study are: P = {0. 5, 1, 2, 4, 8, 16, 32, 64, 128, 256} days & R p = {0.5, 1, 1.5, 2, 2.5, 3, 4} R ⊕ . Note that in comparison to the grid used in Hsu et al. (2019) the longest period bin (P = {256, 500}) and the largest radii bins (R p = {4, 6, 8, 12, 16}) have been removed. This choice was made due to the paucity of planet candidates within those ranges. Radius bins of 0.25R ⊕ size were also merged due to the smaller number of targets and planet candidates in the M dwarf sample.
When applying ABC-PMC to the M dwarf catalog, we chose to set the number of "particles" to 500 and to generate with simulated catalogs of the same size as the M-dwarf selected sample catalog. We set τ = 2 for all generations. Once the stopping criteria from Hsu et al. (2018) were met, the final population is used to approximate the posterior distribution for the planet occurrence rates.
Next, we detail aspects of the forward model we considered and changed to account for the different stellar type being analyzed.

Stellar Properties
In Hsu et al. (2019), we make use of the second Gaia data release (DR2) (Gaia Collaboration et al. 2018) to provide improve stellar radii for our selection of FGK target stars. In the case of our M-dwarf sample, a significant fraction of the target stars do not have stellar radii provided by the Gaia catalog.
In order to get accurate stellar parameters for our entire target catalog, we make use of empirical relations from Mann et al. (2015) between the 2MASS K s absolute magnitude and stellar radius/mass. These relations were fit using 183 K7-M7 single stars, with the relations showing scatter of 2-3%. We cross-matched the Kepler target list with Gaia DR2 and use the Gaia to 2MASS PSC cross-match results of Marrese et al. (2019). We estimate absolute K s by taking the apparent K s from 2MASS PSC and applying the distance modulus using the Gaia DR2 measured parallax. Uncertainties from 2MASS K s and Gaia DR2 parallaxes were propagated to estimate uncertainties on absolute K s. We also make use of 2MASS J and K s as a color threshold to identify our target sample of M dwarfs (and some late K dwarfs). We discuss our use of the 2MASS PSC to select our M-dwarf sample in more detail in §3.1.
Stellar radii estimates from Mann et al. (2015) are typically larger than radii estimates from Kepler DR25. This discrepancy between model and empirically derived stellar radii has been previously attributed to spots and magnetic activity of M-dwarfs. The underinflation of model M-dwarf radii is also noted more recently in a study that investigated eight nearby M-dwarf systems with confirmed exoplanets discovered by Kepler (Mann et al. 2017). They conclude that because Kepler-42 (an inactive, metal-poor star) was also inflated relative to model predictions, activity alone may not explain the discrepancy.

Planet Detection Efficiency:
Similar to Hsu et al. (2019), we use a combined model for the probability of a transiting planet being detected and labeled as a planet candidate by the robovetter using the pixel-level transit injection tests (Christiansen 2017) and the associated robovetter results (Coughlin 2017).
where N tr is the number of "valid" transits observed by Kepler, SN R is the estimated signal-to-noise ratio for the transit detection, and values for α, β, and c are given in Table 1 of Hsu et al. (2019).
A more accurate characterization of planet detection for M-dwarfs would require a new fit of the pixel level injection tests over the M-dwarf target sample. Unfortunately, the sample size is too small to accurately fit a new model for most ranges of N tr . However, comparing the injection test results for our M-dwarf sample to the FGK detection model curve shows the two are consistent with one another. Given this result, we have chosen to apply the FGK detection model directly to our M-dwarf catalog.

Model Parameterization
For our simulations, we perform five independent runs for each period bin that simultaneously fit seven radius bins over the ranges: R p = {0.25, 0.5, 1, 1.5, 2, 2.5, 3, 4} R ⊕ . We have previously verified that seven bins simultaneously fit can still produce reasonably precise results (see Fig. 1 of Hsu et al. (2019)). We discard the smallest radius bin due to a known bias that overestimates the rate in edge bins when incorporating stellar parameter uncertainties (see §2.5 from (Hsu et al. 2019)). We perform five ABC-PMC calculations and report the median rate of the relevant percentile (e.g., 50% for bins with 2 planet candidates). Upper (lower) uncertainties indicate the difference between the median of the 15.87th (84.13th) percentiles and the median 50th percentile from the five simulations. In bins with a zero or one planet candidates, we report the median 84.13% percentile from the five simulations.
In Hsu et al. (2019) we showed that some occurrence rates (particularly those for small, long-period planets) were sensitive to the choice of prior. Therefore, we analyze the M dwarf sample using two different priors. The two priors use two different model parameterizations. For the first prior and model parameterization, we assign each bin in period and radius its own occurrence rate, f i, j . We adopt a uniform prior for f i, j over [0, f max,i,j ). The upper limit was set to f max,i,j = C×log 2 (P max, j /(2P min, j ))×log 2 (R max,i /(2R min,i )), with C = 2 in most cases. 1 This prior assumes that all bins are independent of each other and any correlation in the observed rates of exoplanets between neighboring bins comes from imprecise determinations of planet radius that can result in a planet candidate being assigned to the incorrect radius bin.
For the second prior and model parameterization, we assume that each period ranges has a total occurrence rate drawn from a uniform prior, and assign a fraction of those planets planets to each radius bin. For each bin in orbital period (indexed by j), we infer: 1) f tot, j , the total occurrence rate for all planets within the jth range of orbital periods and the full range of planet sizes being considered, and 2) a vector f rel,i, j : the fraction of planets in the jth orbital period bin that have sizes falling within the range of the ith radius bin. Thus, the occurrence rate for planets in the ith radius bin and jth period bin is given by f i, j = f tot,j f rel,i, j . The upper limit for the uniform prior over f tot,j is given by f max,tot,j = 3 × log 2 (P max, j /(2P min, j )) which is motivated by long-term stability criteria. For each f rel,i, j (a vector with j fixed), we then adopt a Dirichlet prior with concentration parameters proportional to log(R max,i /R min,i ) with the parameters normalized so that the smallest parameter is unity.

Catalog Selection
In Hsu et al. (2019), we made use of the second Gaia data release (DR2) (Gaia Collaboration et al. 2018) to acquire significantly improved stellar parameters for the overwhelming majority of the Kepler target stars. In the case of Mdwarfs, however, we lose a significant fraction of stars if we require the Gaia DR2 to have an estimated stellar radius for every target. To that end, we incorporate J and K s magnitude measurements from the 2MASS Point Source Catalog (Skrutskie et al. 2006) adopting the best neighbor cross match between Gaia DR2 and 2MASS performed by the Gaia mission team (Marrese et al. 2019). We use Gaia's precise parallax measurements and the 2MASS J and K s magnitudes to accurately identify main-sequence M-dwarfs based on their position in the color-luminosity diagram.
Our stellar catalog stars begins with a foundation of Kepler target stars. Then, the Kepler DR25 stellar properties catalog is augmented with data from Gaia DR2 (Gaia Collaboration et al. 2018) and 2MASS PSC (Skrutskie et al. 2006). We use many similar selection criteria to select mainsequence M stars in this study as we did in Hsu et al. (2019). For completeness, we list the full selection criteria below (in order of application), along with explanations for additions and changes from our previous study: (i) Require that the Kepler magnitude and the Gaia G magnitude are consistent.
(ii) Require Gaia Priam processing flags that indicate the parallax value is strictly positive and both colors are close to the standard locus for mainsequence stars. This rejects Kepler targets which are unlikely main sequence stars or are so distant that Gaia does not have a good parallax measurement and the stellar radius will be highly uncertain.
(iii) Require Gaia parallax error is less than 30% the parallax value. We relax our error tolerance for the Gaia parallax relative to Hsu et al. (2019), so as to increase the completeness of our M-dwarf sample Most M-dwarf targets are faint in the Gaia bands, so their parallax uncertainties are expected to be larger than for similar distance FGK stars.
(iv) Require that Kepler DR 25 provide a valid Kepler data span, duty cycle, and limb-darkening coefficients.
(v) Require that the Kepler target was observed for > 4 quarters and must have been on the Exoplanet target list for at least one quarter.
(vi) Require the target to have a color J −K s 0.779 from 2MASS photometry. This color cut results in selecting cool stars (M stars and red giants) and is more accurate than using the temperature from the Kepler Input Catalog or the Gaia color B p − R p . The color threshold corresponds to the K7 stellar type color for the main sequence fit described by Mamajek (2019); Pecaut & Mamajek (2013).
(vii) Require the target star to have an absolute magnitude, M K s 4.8. We compute M K s by taking the 2MASS measured apparent magnitude K s and applying the distance modulus using the Gaia DR2 parallax. Given that all M-dwarfs in the Kepler target list must be nearby, we expect dust to have a negligible effect on the magnitude measurement and neglect extinction. The M K s threshold corresponds to the K7 stellar type for the main sequence fit described by Mamajek (2019); Pecaut & Mamajek (2013).
(viii) Require the estimated stellar radius to be R * < 2R ⊕ and estimated stellar mass to be M * > 0.5M ⊕ . New stellar radii and masses are estimated from M K s for each target using the empirical relations from Mann et al. (2015). Since these relations provide stellar parameter estimates for main sequence stars, any targets with new parameters outside of these ranges are expected to not be main sequence and should be discarded. Uncertainties on stellar mass and radii are estimated using error propagation first for M K s calculation involving the distance modulus, then for the Mann et al. (2015) empirical relations.
Using these cuts we arrive at a final M-dwarf sample of 1,530 targets with 89 associated planet candidates within our period-radius grid. In Fig. 1 we compare our M-dwarf population to the full Kepler population and find that based on the color-luminosity diagram our catalog should have high completeness of the Kepler M-dwarf targets. In comparison with the Dressing & Charbonneau (2015) catalog of 2,543 presumed M-dwarf targets, we find a overlap of approximately 1,000 targets. We note that Dressing & Charbonneau (2015) initially uses T e f f and log g measurements from the obsolete Q1-Q16 Kepler stellar catalog, which are less accurate than 2MASS measurements in combination with Gaia parallaxes.

Planet Occurrence Rates
In Table 1 we list the derived occurrence rates using our Mdwarf sample for two different choices of prior as described in §2.3. We also show the derived occurrence rates in Fig. 2.
A bin's rate estimated using the uniform prior parameterization is generally larger than the same bin's rate estimated using the Dirichlet prior parameterization. We attribute this difference primarily to the Dirichlet prior causing a correlation in occurrence rates for bins with the same period range. Since every period column over our M-dwarf occurrence rate grid has at least one bin with < 2 planet candidates, these upper limits are informed by the occurrence rate estimates of other bins in their respective period columns when using the Dirichlet prior. (As stated in §2.3, the concentration parameters are scaled on the log-width of the bins along the radius axis. This choice incorporates the prior assumption that larger bins should have a larger occurrence rate.) In contrast, the uniform prior assumes that the true rate of planets is independent between all bins. (Note that the observed rate of planets is still correlated for bins with the same period, due to the possibility of assigning planets to the incorrect radius bin given observational uncertainties.) Therefore, the presence of a bin with a small upper limit on its occurrence rate does not provide any information about the rate of a different bin within the same period range under this prior.

Comparison to FGK Occurrence Rates
In Table 1 we report ratios of M-dwarf planet occurrence rates from this study with our previously estimated planet occurrence rates for FGK stars from Hsu et al. (2019) Table  2 under column "Combined Detection & Vetting Efficiency." We find that using the uniform prior M rates and only including occurrence rate bins with 2 DR25 planet candidates, the M rate on a bin-to-bin basis is a factor of 5.8 +6.4 −3.8 larger than the associated FGK rate for the same period and radius (as can be seen in the top panel of Fig. 3). In contrast, when using the M rates estimated with the Dirichlet prior and again filtering on bins with 2 DR25 planet candidates we instead find the M rate is 2.9 +3.6 −1.7 times as large as the FGK rate over those bins (see bottom panel of Fig. 3). Integrating over the entire period-radius grid (P = 0.5 − 256 days and R p = 0.5 − 4 R ⊕ ), we find f = 8.9 +1.2 −0.9 and 4.8 +0.7 −0.6 for planet candidates around M-dwarfs with the uniform and Dirichlet priors respectively compared to the integrated rate of f FGK = 3.5 +0.7 −0.6 for planet candidates around sun-like stars. For a more robust comparison over portions of the period-radius grid where most bins have median reported rates, we also integrate over P = 2 − 32 days and R p = 1 − 2.5 R ⊕ . For the M-dwarf integrated rate we find f = 1.8 ± 0.3 and 1.1 ± 0.2 given a uniform and Dirichlet prior respectively, while the FGK-dwarf integrated rate is f FGK = 0.37 +0.04 −0.03 . Regardless of prior choice, if occurrence rates are compared in orbital period space then M-dwarf occurrence rates are elevated compared to FGK-dwarf occurrence rates, with the uniform prior M-dwarf results suggesting a larger factor. We also compare our estimated M rates to FGK rates by scaling the two grids to cover equivalent insolation ranges along the period axis. For this calculation, the period limits for the FGK period-radius occurrence rate grid of Hsu et al. (2019), P = {2, 4, 8, 16, 32, 64, 128, 256, 500} days are scaled to the insolation of a "typical" M-dwarf in our sample 2 . The resulting period limits for a typical M2.5 target after scaling for equivalent insolation are P = {0. 88, 1.8, 3.5, 7.1, 14, 28, 57, 113, 221} days. Here we have defined the typical FGK-dwarf and typical M-dwarf in our samples using Mamajek (2019)  While the insolation at a given period varies between targets, we find it valuable to compare occurrence rates using these scaled period bins which are representative of the typical M dwarf target.
When comparing the M-dwarf occurrence rates to FGKdwarf occurrence rates with bins chosen to have similar insolation, the median ratios of M-dwarf occurrence rates in bins with planet candidates to associated FGK-dwarf binned occurrence rates are 2.1 +1.9 −0.9 and 1.2 +1.6 −0.4 for the uniform and Dirichlet prior M-dwarf estimates respectively. Integrating over this M-dwarf period-radius grid, we find f = 7.6 +0.8 −0.9 and 4.3 +0.6 −0.5 for planet candidates around M-dwarfs with the uniform and Dirichlet priors respectively. While the integrated rate estimated using a uniform prior is still elevated compared to the associated integrated rate of f FGK = 4.9 +0.8 −0.7 for planet candidates around FGK dwarfs, the integrated occurrence rates estimated using a Dirichlet prior for M-dwarfs and FGK-dwarfs is now consistent, once we make the comparison at similar insolation values. Thus, the median ratio of the Dirichlet prior M-dwarf occurrence rate to FGK-dwarf occurrence is consistent with a value of unity.
The host star plays an important role in the evolution of its protoplanetary disc for multiple reasons, including determining planetary compositions, affecting the location where grains can condense from the disk, the clearing of the disc via stellar winds (Ercolano & Pascucci 2017), and the timescale for the end of planetesimal accretion. The planet radius valley (Fulton et al. 2017;Van Eylen et al. 2018;Hsu et al. 2019) for planets around FGK-dwarfs is often cited as evidence for the influence of stellar irradiance on planet formation. While photoevaporation (e.g. Owen & Wu 2017; Lopez & Rice 2018) is often cited as a potential mechanism for the planet radius valley, other physical mechanisms have been proposed including core-powered mass loss (Ginzburg et al. 2018;Gupta & Schlichting 2019a,b), impact erosion via planetesimals (Shuvalov 2009;Schlichting et al. 2015;Wyatt et al. 2020), and the formation of two distant exoplanet populations due to a gas-poor environment for the rocky planets   Figure 2. Inferred occurrence rates for Kepler's DR25 planet candidates associated with high-quality M target stars with two choices of prior: independent uniform priors per bin (top panel); and uniform prior on total rate over each period range and Dirichlet prior on fraction of total rate assigned to each radius bin (bottom panel). These rates are based on a combined detection and vetting efficiency model that was fit to flux-level planet injection tests. The numerical values of the occurrence rates are stated as percentage (i.e. 10 −2 ). The color coding of each cell is based on (d 2 f )/[d(ln R p ) d(ln P)], which provides an occurrence rate normalized to the width of the bin and therefore is not dependent on choice of grid density. The uncertainties shown are the differences between the median and the 15.87th or 84.13th percentile. Cells colored gray have estimated upper limits for the occurrence rate. Note that the bin sizes are not constant.
between M-dwarf and FGK-dwarf occurrence rates found in this study when using an equivalent insolation flux periodradius grid strongly suggests that stellar irradiance plays a significant, if not dominant role, in planet formation processes.

Occurrence Rates of Planets near the Habitable Zone
Among M stars, the location of the habitable zone is a strong function of each star's luminosity. Thus, we cannot perform the same analysis as Hsu et al. (2019) wherein we estimate the occurrence rate over a fixed radius and period range for all stars in the catalog. Instead, we take the estimated rates for M stars over the period-radius grid and draw 500 sample planetary systems for each star and estimate the average number of planets in the habitable zone (defined as between the runaway greenhouse and maximum greenhouse limits from (Kopparapu et al. 2013)) on a star-to-star basis. To report a single HZ occurrence rates, we marginalize over all stars in our target M star catalog. We find an estimated habitable planet occurrence rate for M stars of f HZ = 1.6 +0.2 −0.2 and 1.2 +0.1 −0.1 for planet radii R p = 0.5 − 4R ⊕ when simulating catalogs with uniform and Dirichlet prior estimated rates from Table 1 respectively. If one places stricter constraints on planet radii requiring R p = 0.75 − 1.5R ⊕ , then the habitable planet occurrence rate for M stars is reduced to f HZ = 0.48 +0.23 −0.08 and 0.38 +0.04 −0.05 using the uniform and Dirichlet prior rates from Table 1 respectively.

Comparison to Previous Studies
Dressing & Charbonneau (2015) found an integrated occurrence rate over their period-radius grid (P = 0.5 − 200 days and R p = 1 − 4 R ⊕ ) of 2.5 ± 0.2 planets. Gaidos et al. (2016) found a rate of f = 2.2 ± 0.3 over R p = 1 − 4 R ⊕ and P = 1.2 − 180 days, consistent with that of Dressing & Charbonneau (2015). Over the full period-radius grid in this study (P = 0.5 − 256 days and R p = 0.5 − 4 R ⊕ ) we find occurrence rate estimates of 8.9 +1.2 −0.9 and 4.8 +0.7 −0.6 using the uniform and Dirichlet priors respectively. These estimated rates are significantly larger than the respective estimates from Dressing & Charbonneau (2015); Gaidos et al. (2016). Part of the larger uncertainties is due to our cleaned stellar sample resulting in fewer target stars in our sample. Another important factor is our rigorous accounting of uncertainties due to finite sample size.
For a comparison where our study can report a median estimate, we choose to compare against occurrence rate estimates from Dressing & Charbonneau (2015); Mulders et al. (2015) over the ranges P < 50 days and R p = 1 − 2.5 R ⊕ . For these period-radius ranges, Dressing & Charbonneau (2015) finds an occurrence rate of 1.38 +0.11 −0.09 (taken from Table 5) while Mulders et al. (2015) finds a rate of ∼ 1.2 ± 0.1 (estimated from Fig. 5). For this study, integrating over the same radii range and P = 0.5 − 64 days results in estimates of 2.49 +0.46 −0.42 and 1.32 +0.23 −0.21 respectively using the uniform and Dirichlet priors. Our preferred estimates based on the Dirichlet priors are consistent with the rates from Dressing & Charbonneau (2015); Mulders et al. (2015) while the uniform prior estimate is significantly higher. Regardless of the best estimate, our results suggest that true uncertainties in the M dwarf occurrence rate from Kepler are significantly larger than suggested by previous studies.
Most previous studies chose to focus on occurrence rates in period-space rather than insolation-space, making comparisons over a radius-insolation grid difficult. Dressing & Charbonneau (2015) is one study, however, that also infers occurrence rates over a radius-insolation grid. However, the bin limits in insolation from Dressing & Charbonneau (2015) are not congruous with the period limits for equivalent insolation selected in this study. The closest comparisons that can be done between Dressing & Charbonneau (2015) and our study use the Dressing & Charbonneau (2015) cumulative binned rates over 10 − 200 F ⊕ in Table 7 and our binned rates over P M = 3.5 − 28 days (equivalent insolation to approximately 10.2 − 163 F ⊕ for an M2.5 dwarf). Our study's rates are systematically larger than the Dressing & Charbonneau (2015) rates by factors > 2 regardless of prior choice. This systematic difference likely reflects the approximation of our study to treat all stars in our M-dwarf sample as having an irradiance equivalent to a Mamajek (2019) M2.5 dwarf when in truth M-dwarfs cover a wide range of irradiance.
Additionally, Dressing & Charbonneau (2015) reported an occurrence rate of f = 0.56 +0.06 −0.05 for Earth-size (R p = 1−1.5R ⊕ ) planets with periods < 50 days. For our occurrence rate estimates, integrating over the same radii range and P = 0.5−64 days produces f = 0.50 +0.15 −0.11 using the Dirichlet prior, consistent with Dressing & Charbonneau (2015). However, we caution that using a uniform prior for each rate results in a higher estimate of f = 0.89 +0.29 −0.22 , just outside "1 − σ" agreement with the estimate of Dressing & Charbonneau (2015). Dressing & Charbonneau (2015) also gives an estimated habitable zone occurrence rate using the Kopparapu et al. (2013) limits that we apply in §4.2. For Dressing & Charbonneau (2015) the habitable zone occurrence rate for Earth-size (R p = 1 − 1.5R ⊕ ) planets was f HZ = 0.16 +0.17 −0.07 , whereas we find a slightly higher rate of f HZ = 0.23 +0.22 −0.05 or 0.21 +0.05 −0.04 for the same radius range (for the uniform and Dirichlet priors, respectively). Given the associated uncertainties, the HZ occurrence rate estimates are consistent with the Dressing & Charbonneau (2015) estimate regardless of prior choice.
Hardegree-Ullman et al. (2019) finds an occurrence rate for mid-type M stars of 1.19 +0.70 −0.49 with a limited range of P = 0.5 − 10 days and R p = 0.5 − 2.5R ⊕ . For the same range, we find a consistent rate of 0.94 +0.58 −0.46 with the uniform prior and an inconsistent rate of 0.48 +0.29 −0.21 with the Dirichlet prior. The latter estimate from this study is more consistent with Dressing & Charbonneau (2015); Mulders et al. (2015). It is important to remember that Hardegree-Ullman et al. (2019) investigated mid-type M stars, so there is reason to anticipate differences in these measurement. However, we caution that the Hardegree-Ullman et al. (2019) estimates are based on a small catalog of only seven stars with 13 planets, so we anticipate large uncertainties and cannot make a robust comparison of occurrence rates for these two stellar samples.

Future Prospects for Occurrence Rate Studies
While this work has estimated robust occurrence rates for M-dwarf exoplanets, the limited number of M-dwarfs in the Kepler target catalog restricts the precision of the rates. Future work should consider applying ABC-PMC or alternative Bayesian methodology with larger catalogs, both in number of stars and number of planet candidates. Recent work in formulating a pipeline for identifying and vetting of exoplanet candidates from K2 datasets (Kostov et al. 2019;Zink et al. 2020) represents an important step in this direction. Creating a homogeneous exoplanet catalog from the K2 observations could provide basis for future future M-dwarf occurrence rate calculations. In the near future, several ground-based RV (e.g. HPF, SPIRou, CARMENES) and transit surveys (e.g. MEarth, SPECULOOS), as well as the space-based TESS mission, will increase the catalog of exoplanet candidates associated with M-dwarfs.
If the sample of exoplanets from any single catalog is too small for precise occurrence rate estimates, then another possible direction for future work is the development of a model that accounts for the unique detection pipelines for each survey. Such a model would enable multiple catalogs as inputs to the same occurrence rate inference process, boosting the sample size and therefore increasing the precision of any estimated rates.
The model used in this study could also be improved further to account for additional effects and reduce the number of assumptions made. In addition to the limitations covered in §4.3 of Hsu et al. (2019), we summarize additional improvements future studies can make upon our model specifically for M star occurrence rates.
Detection Model: In this study, we performed tests to verify that the detection efficiency curve fit for FGK stars was consistent with the M star sample of this study based on the available Kepler DR25 data products. Future research could perform more transit injection and recovery tests for M dwarf targets to enable a more detailed characterization the detection efficiency specifically for M dwarfs. Additionally, our current detection efficiency model uses the expected effective SNR and the number of transits observed to determine the probability that a planet is detected. As suggested in Hsu et al. (2019), additional information (e.g., stellar properties, sky group) could be incorporated to improve the model.
Contamination by Non-MS M Stars: Obtaining precise planet occurrence rates requires having a large sample of target stars. Given the Kepler targets, applying strict cuts to avoid contamination of late-K stars into our M dwarf sample would have significantly reduced the number of targets and the resulting precision. Therefore, we we chose cuts that balanced the desire for a sufficiently large target star catalog with the desire to limit the contamination of late Kdwarfs. While we expect that the occurrence rate for these stars should be similar to those of main-sequence M stars, we encourage future studies incorporating additional data sources to identify and analyze planet occurrence rates for target star samples that have even lower contamination from late-K stars.
Intrinsic Stellar Activity: Due to M stars having larger convective zones (and later types being fully convective), many M stars have significant amounts of stellar activity which may contribute to false positives for planet detection. In this study we make the assumption that all planet candidates that pass the robovetter are true planets. However, the robovetter parameters were tuned based on FGK stars, rather than M stars. Therefore, a future study could perform more detailed analyses to better vet false positives accounting for the properties of M dwarf targets.
Tidal Locking: The original Kopparapu et al. (2013) study defining HZ limits as a function of stellar temperature did not consider the effects of tidal locking. The habitable zone around M stars is much closer to the host star (even in units of stellar radii) than for FGK stars. Therefore, planets at the inner edge of the Kopparapu et al. (2013) HZ could become tidally locked, which has implications on the habitability of the planet. For our study we make use of the Kopparapu et al. (2013) HZ limits, but a future study may consider an updated set of HZ limits that accounts for tidal locking (e.g., Kopparapu et al. 2016).

Conclusions
We have estimated occurrence rates for Earth to sub-Neptune sized planets over periods ranging from half a day to ∼ 70% a year using a cleaned sample of M-dwarfs targeted by Kepler with stellar parameters updated using Gaia DR2 and 2MASS PSC. These planet occurrence rates were estimated using ABC with a forward model that incorporates the final Kepler DR25 data products to accurately reproduce the Kepler planet candidate identification and vetting process.
We find significant differences between estimated rates using two different priors, emphasizing the importance of prior choice in the inference of occurrence rate estimates for M-dwarfs in the Kepler sample. Over the orbital period range of P = 0.5 − 256 days and planet radius range of R p = 0.5 − 4 R ⊕ , we find an occurrence rate of f = 8.9 +1.2 −0.9 and 4.8 +0.7 −0.6 when applying the uniform or Dirichlet prior respectively. Combining the integrated planet occurrence rates above with an the average multiplicity (i.e., planets per star with at least one planet), we can estimate the fraction of Mdwarfs with planets. If one adopts a multiplicity of 6.1 ± 1.9 from Ballard & Johnson (2016), then the fraction of Mdwarfs with planets (in the above range of sizes and periods) is estimated to be 1.5 ± 0.5 or 0.8 ± 0.3, when assuming the uniform and Dirichlet priors, respectively. Clearly, the fraction of stars with planets can not exceed unity. It is important to note that the integrated planet occurrence rate includes bins with only upper limits. In our Bayesian approach, we marginalize over the posterior distribution of occurrence rates, extending to rates too high to be consistent with the assumed multiplicity. We conclude that the observed number of M dwarf planet candidates is consistent with all M dwarfs hosting a planetary system with either of our priors. When using the Dirichlet prior, the observed number of M dwarf planet candidates is also consistent with as few as half of M dwarfs hosting planetary systems. Recently, He et al. (2019) investigated the architectures of planetary systems around Sun-like (FGK) stars with a clustered point process model built in the same SysSim framework as this study. He et al. (2019) report a 68.3% credible interval for the multiplicity of 2.28 +0.94 0.53 . Since this is less than the in-tegrated rate of M dwarf planets (for both out priors), even assigning every M dwarf such a planetary system would not be sufficient to explain the number of M dwarf planets detected by Kepler. We caution that a substantial fraction of the integrated M dwarf planet occurrence rate comes from bins with upper limits. Therefore, the observations do not provide evidence the necessitates every M dwarf host a planetary system. Collectively these three studies suggest that most M-dwarf hosts at least one Earth to sub-Neptune size planet, if not several. We find that the planet occurrence rate around Mdwarfs is substantially elevated relative to the planet occurrence rate around FGK-dwarfs at similar orbital periods orbital period. However, the planet occurrence rates are nearly consistent (to within statistical uncertainties) when comparing at similar insolation values. This suggests that stellar irradiance has a significant and possibly dominant role in planet formation processes regardless of spectral type.
Finally, we recommend that mission design concepts with the goal of characterizing M star habitable zones select science programs robust to a true rate of f HZ = 0.38 +0.04 −0.05 for planets with 0.75 − 1.5 R ⊕ size.