Maximum likelihood method for estimating the mass and period distributions of extra-solar planets

We investigate the distribution of mass M and orbital period P of extra-solar planets, taking account of selection effects due to the limited velocity precision and duration of existing surveys. We fit the data on 63 planets to a power-law distribution of the form dn=CM^{-alpha}P^{-beta}(dM/M)(dP/P), and find alpha=0.12+-0.10, beta=-0.26+-0.06 for M<10M_J, where M_J is the Jupiter mass. The correlation coefficient between these two exponents is -0.32, indicating that uncertainties in the two distributions are coupled. We estimate that 3% of solar-type stars have companions in the range 1M_J


Introduction
As of May 2001, radial-velocity surveys have discovered over sixty planets orbiting nearby stars. This sample should be large enough to provide reliable estimates of the distribution of planetary mass and orbital elements, at least in the range to which the radial-velocity surveys are sensitive.
At least two important selection effects must be included in any statistical analysis of this kind: (i) each survey has a detection limit K D , such that the orbits of companions that induce reflex motions in their host star of amplitude < K D cannot be reliably characterized; (ii) orbits of companions with periods much longer than the duration of the survey cannot be reliably characterized. In any survey limited by its velocity precision, uncertainties in the distribution of planetary masses M are coupled to uncertainties in the distribution of orbital periods P , because the velocity amplitude induced by a companion is ∝ M P −1/3 (eq. 2). Thus it is necessary to determine both distributions simultaneously.
The aim of this paper is to describe a maximum-likelihood method of estimating these distributions using data from a variety of surveys, while accounting for survey-dependent selection effects. We have chosen to fit the data to simple power-law models of the distribution of masses and periods; such distributions are simple to interpret and common in nature, and it is straightforward to generalize our approach to non-parametric models as the data improve.
We shall work with data from eight radial-velocity surveys of nearby stars, which have detected between 2 and 23 extra-solar planets each (see Table 1). Together these surveys have detected 90 planets as of June 1 2001, although several planets appear in more than one survey so we have only 63 distinct planets.
We compare our method and results with other determinations of the mass distribution of extra-solar planets in §3.

Maximum Likelihood Method
We focus initially on the simple case of a single survey that examines N * stars for radial velocity variations due to orbiting companions. The velocity amplitude K due to a companion of mass M with orbital period P is where e is the orbital eccentricity, M ⋆ is the stellar mass, and i is the inclination between the orbit plane and the sky plane. For simplicity we assume that e ≪ 1 so that the factor (1 − e 2 ) −1/2 is unity. We do not attempt to account more accurately for the eccentricity dependence because the detectability limit for eccentric orbits depends on both the amplitude and shape of the radialvelocity curve. At the upper quartile of the eccentricities in our sample, e = 0.46, the error in K caused by setting e = 0 in equation (1) is only 11%. We also assume that M ≪ M * , so that equation (1) simplifies to where m ≡ M sin i is the minimum companion mass, corresponding to an orbit viewed edge-on.
Throughout this paper we shall assume that all of the stars in the survey have mass equal to the Sun's, M ⋆ = M ⊙ . This is not a bad approximation since most radial-velocity surveys for low-mass companions have focused on solar-type stars. In fact, our method is easily generalized to the case where the survey stars have different masses, but to do so we need to know the masses of all the stars in the survey (not just the ones that have detected companions)-and this extra complication did not seem worthwhile in this preliminary analysis. 1 .
We restrict our attention to companions with minimum mass m ≤ m max ≡ 10M J , where M J is the Jupiter mass; this cutoff hopefully minimizes the contamination of our sample by brown dwarfs and is below the deuterium-burning threshold which sometimes is taken to define the boundary between planets and stars. We also restrict our attention to orbital periods P > P min = 2 days, corresponding to a semimajor axis of 6.7R ⊙ = 0.031 AU; this limit is small enough to include all the known planets.
We assume that the probability that a single star has a companion with mass and orbital period in the range [M, M + dM ], [P, P + dP ] is given by a power law, where C, α and β are constants to be determined, and M 0 and P 0 are a fiducial mass and period, which we choose to be M 0 = 1.5M J and P 0 = 80 d (the reasons for this choice are outlined in the following subsection). If the distribution of companion orbits is isotropic, the distribution of minimum mass m = M sin i and period is given by where Γ(·) is the gamma function, and α > −2.
Initially we assume that the survey detects a companion if and only if (i) the velocity amplitude K exceeds a survey-dependent detectability limit K D ; and (ii) its orbital period is shorter than a survey-dependent upper limit P max . We expect that P max will be proportional to the duration of the survey, since typically at least two orbits are required for a reliable detection. More realistic smooth cutoffs to the detection efficiency are discussed in §2.2. Although the detection limits K D and P max can be estimated from descriptions of the survey, we adopt the more objective approach of determining them directly from the maximum-likelihood analysis. Let x i = ln(m i /M 0 ) and y i = ln(P i /P 0 ), i = 1, . . . , N , where m i and P i are the minimum mass and period of the companions detected in the survey. Then the velocity amplitude K i (eq. 2) exceeds the detection limit K D if Similarly, the orbital period is less than the maximum detectable period if The other constraints are x i < x max ≡ ln(m max /M 0 ) = ln 10 + ln(M J /M 0 ), y i > y min ≡ ln(P min /P 0 ) = −5.207 + ln(1 yr/P 0 ).
The constants x max and y min are fixed, while the variables u and v are to be determined by the maximum-likelihood analysis.
From equation (4) the expected number of companions in the interval dx dy in a survey of N * stars is n(x, y)dx dy = N * p(x, y)dx dy where p(x, y) = ce −αx−βy .
The likelihood function L is the product of (i) the probability of detecting N companions with minimum masses x i and periods y i ; and (ii) the probability of observing none elsewhere in the domain D of (x, y) space in which companions are detectable. Thus and zero otherwise.
We now substitute equation (9) into equation (10) and take the log of the result, Here where The integral yields if v < x max − 1 3 y min and u > y min , and zero otherwise. The best estimates for the fitted variables c, α, β, u and v correspond to the global maximum of ln L. First, the constant c can be evaluated from Substituting this result into equation (12) yields To determine the best estimates for u and v we note that ln L depends on these parameters only through f (α, β, u, v), and that ln L is maximized when f is minimized. According to equation (13), f depends on u and v only through the limits of integration, that is, only through the shape of the domain D. Since the integrand is non-negative, we minimize f by making D as small as possible, so long as it still contains all the data points (x i , y i ). This can be done by setting v equal to the smallest value of x i − 1 3 y i in the sample, and u equal to the largest value of y i in the sample.
-5 - The best estimates for the remaining parameters are given by which can easily be solved numerically. Table 2 lists the best estimates of the parameters for each survey. The value of the normalization parameters C is based on the fiducial mass and period M 0 = 1.5M J and P 0 = 80 d, which are chosen for reasons outlined in the following subsection. Figure 1 shows the correlation between the duration of each survey and the fitted value of P max for that survey, as well as the stated velocity precision K S for each survey and the fitted detection limit K D for that survey (values taken from Tables 1 and 2). The detection limit is generally a factor of three or so higher than the stated precision, presumably because determining a reliable orbit is more difficult than simply detecting the presence of a companion. For most of the surveys there is an good correlation between duration and P max , with the slope of the correlation indicating that approximately two orbital periods of data are needed for a reliable detection.

Generalization to multiple surveys
It is straightforward to expand the analysis of the previous Section to multiple surveys, which we label by j = 1, . . . , J. The three parameters describing the companion distribution, α, β and c, are now derived from the entire sample of known companions from all surveys, while the parameters u j and v j that describe the period and radial-velocity thresholds are different for each survey. Thus the expected number of companions to be discovered in the interval dx dy in survey j is where p(x, y) is defined in equation (9) and N * j is the number of stars in survey j. Equation (10) becomes and the likelihood is given by Again, the integration constant can be eliminated from equation (22) ∂ and the likelihood becomes As before, v j is set equal to the smallest value of x i,j − 1 3 y i,j in survey j, and u j is set equal to the largest value of y i,j in survey j.
The surveys listed in Table 1 have discovered 90 companions, although several appear in more than one survey so there are only 63 distinct companions. Companions discovered in multiple surveys are counted in each survey where they appear; this approach leads us to underestimate the statistical uncertainties in our parameters somewhat (probably by about a factor of (90/63) 1/2 = 1.2), but avoids the systematic bias that would be created by counting the companions only once and discarding them from the other surveys. A conservative alternative approach is to use only the Coralie survey for parameter estimation (top lines of Tables 1 and 2).
The values of the normalization parameters c and C quoted in this paper are based on the fiducial mass and period M 0 = 1.5M J and P 0 = 80 d. These values are chosen to minimize the uncertainty in ln c. If the uncertainties are small, this requirement is equivalent to choosing M 0 and P 0 so that the covariances between c and the exponents α and β vanish.
The likelihood analysis yields α = 0.12 ± 0.10, where the confidence limits correspond to ln L = (ln L) max −0.5. With these estimators at hand, it is straightforward to plot the likelihood as a function of the exponents α and β (Figs. 2 and 3). Figure  2 indicates that there is a significant covariance between the exponents α and β that characterize the mass and period distributions (correlation coefficient r = −0.32). This correlation arises because of the selection effects on velocity amplitude, and demonstrates that both distributions should be fitted simultaneously. Table 3 shows the difference between the number of predicted companions and the number of actual detections; these are generally in good agreement except for the McDonald survey, which is discussed further in §3.2.
This table also lists the number of predicted brown dwarfs (10M J < M < 80M J ) in each sample, assuming that the planetary mass function extends to 80M J , and the number of brown dwarfs actually discovered. As many authors have pointed out, the small number of brown dwarf discoveries strongly suggests that the mass function we have derived cannot be extrapolated to brown dwarf masses.

Generalization to a smooth cutoff
A sharp cutoff in the detectability of planets at radial velocity K D and period P max is not very realistic. A better approximation is to replace the sharp cutoffs in equations (12) and (13) with smooth functions. We can do this by replacing f (α, β, u, v) with where g(x, y) is defined by equation (14).
The functions h u (·) and h v (·) are measures of the detection efficiency of the survey as a function of orbital period and velocity amplitude. The function h u (s) approaches 0 as s → −∞ and 1 as s → ∞; we shall assume that h u (s) − 1 2 is an odd function of s so that h u (0) = 1 2 , with similar assumptions for h v . In the limit where h u and h v are step functions we recover equation (13). Thus v and u are still defined by equations (6) and (7), except that K D and P max are interpreted as the velocity amplitude and orbital period at which the detection efficiency falls to 50%.

Let
then equation (29) can be rewritten as where the second line follows from equation (13), and we have dropped the primes on the dummy variables. These integrals are easy to evaluate numerically.
with a similar choice for h v (t). We call δ u and δ v the threshold widths. Figure 4 shows the effect of non-zero threshold widths on the slope estimators α and β. In the arbitrary but plausible case where the detection efficiencies h drop from 3 4 to 1 4 over a factor of two in period or velocity amplitude, we have δ u , δ v = 0.51. In this case the best-fit value of α is shifted downward by 0.04 and the best fit for β is shifted upwards by about 0.020. These changes are significant but relatively modest; since the appropriate values of the threshold widths are difficult to estimate, we shall not attempt to correct for this effect.

Comparison to other estimates of the mass distribution
We have found that the distribution of companion masses below 10M J is approximately flat in log M , or slightly rising towards small masses, i.e., the exponent α is small and positive (α = 0.12 ± 0.1, eq. 25). The distribution of companion masses has already been examined by a number of authors, most of whom have reached similar conclusions (Mazeh et al. 1998, Marcy & Butler 1998, Mazeh 1999, Stepinski & Black 2000, Stepinski & Black 2001, Jorissen et al. 2001, Zucker & Mazeh 2001. Our approach offers several advantages over the variety of methods used in these papers: (i) we correct for selection effects in period and velocity amplitude, (ii) we account for the coupling between the orbital period distribution and mass distribution, (iii) we estimate the sensitivity in radial velocity or maximum period (our parameters K D and P max ) self-consistently from the data, and (iv) we determine the normalization of the mass distribution, not just its shape.

Extrapolations
It is interesting to investigate the implications of extrapolating the mass and period distributions that we have derived. If we assume that our maximum-likelihood distribution (eqs. 25-28) applies in the mass range 10M J < m < 80M J that is usually associated with brown dwarfs, we predict that the Coralie and Keck surveys should have discovered, respectively, 13 and 8 companions in this range (Table 3); in fact these two surveys found only one companion each. This result confirms the finding of several authors (Basri & Marcy 1997, Mayor et al. 1998, Mazeh et al. 1998, Mazeh 1999, Jorissen et al. 2001) that there is a cutoff in the power-law distribution of companion masses at m ∼ > 10M J , and a "brown-dwarf desert" between ∼ 10M J and ∼ 100M J in which few companions exist at semi-major axes less than a few AU.
The average number of planets per star with masses between M 1 and M 2 and periods between P 1 and P 2 is given by equation (3): Thus, for example, in our best-fit model (eqs. 25-28), the expected number of planets per star with periods between 2 days and 10 yr and masses between M and 10M J is For M = M J , N = 0.033; thus about three percent of solar-type stars have a planet of Jupiter mass or larger in this period range. If we make the large extrapolation to Earth-mass planets (M = 0.003M J ) we find N = 0.172; in this case, more than 15% of stars would have an Earth-mass or larger companion.
If we extrapolate to larger orbital periods, we find that the number of companions in a given mass range with periods between 2 days and 5 yr would be about 0.28 times the number with periods between 5 yr and 1000 yr; in this case a significant fraction of Jupiter-mass planets would have orbital periods short enough to be detected in existing radial-velocity surveys.

Comparison with the solar nebula
The mass and period distribution (3) can be used to determine the total surface density in planets less massive than M max , assuming that the central star has mass 1M ⊙ : For our best-fit model, this can be compared to the gas and dust densities required in the minimum solar nebula (e.g. Hayashi 1981) Σ gas (a) = 1.7 × 10 3 g cm −2 1 AU a 1.5 ; Σ dust (a) = 7.1 g cm −2 1 AU a 1.5 .
The agreement of the exponents in equations (36) and (37) is striking and perhaps surprising, given that many theorists believe that the extrasolar giant planets must have formed at much larger radii and migrated to their present locations, while the planets in our solar system have suffered little or no migration. Figure 5 shows the minimum-mass and period distributions of all of the substellar companions found in the surveys in Table 1, along with the simple power-law models that we have used to fit these distributions. The effects of the selection effects at small mass and large period, and the evidence for a cutoff above ∼ 10M J , are evident in the bottom panels.

Summary
We have described a simple maximum-likelihood method that determines the mass and period distributions of extrasolar planets discovered in multiple surveys. Our method determines and accounts for selection effects on velocity amplitude and period from the data themselves, without relying on the nominal survey parameters. Our best-fit model is defined by equations (3) and equations (25)-(28).
We are indebted to W. Cochran, G. Marcy and M. Mayor for communicating some of the details of their surveys that were used in this statistical analysis. We are particularly grateful to David Weinberg for pointing out a normalization error in an early version of this paper. This research was supported in part by NASA grant NAG5-10456, and by a European Space Agency fellowship.
Vogt, S., Marcy, G., & Butler, R. P. 2000, ApJ, 536, 902 Zucker, S., & Mazeh, T. 2001 This preprint was prepared with the AAS L A T E X macros v4.0. 6.9 + 1.0 12.9 − 1.6 Table 2: For each survey, best estimates for the exponents α and β and the normalizing constant C in equation (3), the maximum detectable period P max and the velocity precision K D in m s −1 . The fiducial mass and period are M 0 = 1.5M J , P 0 = 80 d. The errors on P max and K D are onesided, since the maximum likelihood is achieved when these parameters equal the largest period and smallest velocity amplitude found in the survey.    Table  1) and the longest period in the sample. Note that Lick, ESO and McDonald programs have all detected the same long-period planet: ǫ Eridani, which has an inferred orbital period of P = 2502.1 days. All of the points lie below the dashed line (x = y), indicating that a reliable detection requires following the star for more than one orbital period. The right panel shows the correlation between the stated velocity precision of each survey (from Table 1) and the smallest velocity amplitude of any of their detected planets. Almost all of the points lie well above the dashed line, indicating that determining a reliable orbit requires a velocity amplitude that is significantly larger than the stated velocity precision. The exception is the derived limit for McDonald; in this case K D is set by their detection of a planet in the ǫ Eri system, for which our approximations of a circular orbit and solar-mass star yield an estimate for the velocity amplitude that is 46% too low (see footnote 1). Fig. 2.-Contours of constant likelihood for the combination of all eight surveys. The maximum of L is located at the filled circle, α = 0.12 ± 0.10, β = −0.26 ± 0.06. The contour levels represent "n-σ" confidence regions, in which the likelihood function is smaller than its maximum value by exp(−n 2 /2).    Table 1 (67 objects). The unshaded histograms include duplicate discoveries of the same object in different surveys, as discussed in §2.1 (94 objects). The curves show the predictions of our best-fit model, given by equations (25)-28) and (33). As discussed in the paper, the histograms are subject to selection effects at small mass and large period, and there is evidence for a cutoff to the mass distribution above ∼ 10M J .