A meta-analysis of neutron lifetime measurements

We calculate the median as well as the weighted mean central estimates for the neutron lifetime from a subset of the measurements compiled in the 2019 update of the Particle Data Group (PDG). We then reconstruct the error distributions of the residuals using three different central estimates and check them for consistency with a Gaussian distribution. We find that although the error distributions using the weighted mean as well as the median estimate are consistent with a Gaussian distribution, the Student's t and Cauchy distributions provide a better fit. The median statistics estimate of the neutron lifetime from these measurements is 881.5 ± 0.47 seconds, which can be used as an alternate estimate of the neutron lifetime. We also note that the discrepancy between beam and bottle-based measurements of the neutron lifetime persists when median statistics is used, with a significance between 4σ and 8σ, depending on which combination of measurements is used.


I. INTRODUCTION
The precise measurement and theoretical estimate of the neutron lifetime is of tremendous importance for both particle physics and astrophysics [1,2]. The current weighted average of all neutron lifetime measurements according to the 2018 edition of the Particle Data Group [3] (PDG, hereafter), using the seven best measurements, is 880.2 ± 1.0 seconds. This error includes a scale factor of 1.8. The theoretical estimate of the neutron lifetime is between 875.3 and 891.2 seconds within 3σ [4].
Neutron lifetime measurement techniques can be broadly classified into two types: 'bottle' and 'beam' based measurements. In the bottle method, ultracold neutrons are stored in a container (either a bottle or a trap), and the neutron lifetime is measured by fitting the number of surviving neutrons to a decaying exponential. In the beam method, the number of neutrons and the number of protons produced from β-decay are counted, and the lifetime is obtained from the neutron decay rate. More details about these techniques can be found in Refs. [1,2].
However, there is a long-standing discrepancy between these two methods used to measure the neutron lifetime [5]. As of 2018, the current value from two beam experiments [6,7] included in the PDG is 888 ± 2.0 seconds [4] and from five bottle experiments [8][9][10][11][12] is 879.6 ± 0.6 seconds [4]. This is formally a 4σ discrepancy and, as pointed out in Fornal and Grinstein [4] (F18 hereafter), could be evidence of uncontrolled systematics or new physics. Another possibility, not mentioned in the above works, is that the measurements could contain non-Gaussian errors.
The central estimate of the neutron lifetime quoted in PDG as well as in other works has been obtained from a weighted average of all the measurements. One key premise for this is that the error bars are Gaussian [13,14]. In the last decade, Ratra and collaborators have shown that the error distributions for a whole slew of astrophysical and cosmological datasets are inconsistent with a Gaussian distribution [14][15][16][17][18][19][20][21][22][23][24]. The datasets they explored for this purpose include measurements of $H_0$ [17], Lithium-7 measurements [18] (see also [25]), the distance to the LMC [19], the distance to the Galactic center [26], the Deuterium abundance [23], etc. For each of these datasets, they fitted the data to a variety of probability distributions. From all these studies, they found that the error distribution is non-Gaussian. Consequently, they have argued that median statistics should be used for the central estimates of these parameters instead of the weighted mean [14,21]. Median statistics does not incorporate the individual measurement errors and hence is a more robust estimate than the weighted mean. To the best of our knowledge, no one has investigated the Gaussianity of the neutron lifetime measurements (or, for that matter, of any other dataset in PDG). The importance of doing such tests has been stressed in a number of works [13,14,20,22]. Due to the non-Gaussianity of the error residuals for the aforementioned astrophysical datasets, median statistics has been used to obtain central estimates of some of these quantities, such as $H_0$ [14,16,21], Newton's gravitational constant [21], the mean matter density [15], and other cosmological parameters [20].
Given the importance of the physics implications of the discrepancies in neutron lifetime measurements, and in order to get a more robust estimate that can be easily compared with the theoretical value, we revisit the issue of checking for non-Gaussianity of the errors and obtaining a central estimate from the vetted measurements in PDG as well as a few recent measurements not included in PDG. The outline of this paper is as follows. The dataset used for our analysis is described in Sect. II. Our analysis procedure and results are described in Sect. III. The discrepancy between beam and bottle-based measurements is quantified in Sect. IV. We conclude in Sect. V.

II. NEUTRON LIFETIME DATA
We briefly review the various neutron lifetime measurements used for this analysis. The 2018 edition of PDG lists a total of 24 measurements from 1972 to the present. Of these measurements, only seven have been used by PDG to obtain the central estimate. Using these seven measurements, a weighted mean central value of 880.2 ± 1.0 s was estimated [3]. Five of these are bottle-based measurements and two are beam-based. The remaining measurements were ignored because the error bars of some of the pre-1980 measurements were large, because the results from old measurements were reanalyzed, or because some of the measurements were withdrawn. However, a few measurements have also been excluded without any stated reason. For our analysis, we also include these older measurements, except if they were reanalyzed or withdrawn. We also include four additional measurements [27][28][29][30], which are not included in PDG. In all, we have a total of 19 measurements for our analysis, which are tabulated in Table I. We note that in addition to these direct experimental measurements of the neutron lifetime, there are also cosmological constraints on the neutron lifetime [31]. We do not include them in this analysis, as they are model-dependent and not direct estimates.

III. ANALYSIS
The first step in analyzing the Gaussianity of the error measurements of a dataset is to get a central estimate using the available data. For this analysis, we use all 19 measurements tabulated in Table I. We do not check for Gaussianity separately for beam and bottle-based measurements, as the number of data points in each category is too small to test this reliably. However, once the number of measurements in each category grows, this should also be tested. Similar to the works by Ratra et al. (e.g. Ref. [23], P18 hereafter), we consider two central estimates: the weighted mean and the median. We note that in P18, a similar analysis was done using 15 deuterium abundance measurements.
The median value ($\tau_{med}$) corresponds to the 50th percentile value, for which half of the data points are below and half above. The standard deviation in the median value can also be estimated and its analytic expression can be found in P18. The weighted mean central value ($\tau_{wm}$) for multiple neutron lifetime measurements ($\tau_i$) is given by [32]:

$$\tau_{wm} = \frac{\sum_{i=1}^{N} \tau_i/\sigma_i^2}{\sum_{i=1}^{N} 1/\sigma_i^2}, \tag{1}$$

where $\sigma_i$ indicates the total error in each measurement. The total weighted mean error is given by

$$\sigma_{wm}^2 = \frac{1}{\sum_{i=1}^{N} 1/\sigma_i^2}. \tag{2}$$

The weighted mean estimate is found to be $\tau_{wm}$ = 879.97 ± 0.39 seconds and the median estimate is calculated to be $\tau_{med}$ = 881.5 ± 0.47 seconds.
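As an illustration of the two central estimates, the following minimal sketch computes the standard inverse-variance weighted mean with its error, and the median. The input numbers are hypothetical placeholders chosen for easy arithmetic, not the Table I measurements, and the function name `weighted_mean` is our own. The median uncertainty is not computed here, since its expression is given in P18 and is not reproduced in this paper.

```python
import math
import statistics


def weighted_mean(values, errors):
    """Inverse-variance weighted mean and its 1-sigma error."""
    weights = [1.0 / e**2 for e in errors]
    total = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / total
    return mean, 1.0 / math.sqrt(total)


# Hypothetical lifetimes and total errors in seconds (NOT the Table I data).
taus = [878.0, 880.0, 881.0, 887.0]
sigmas = [1.0, 1.0, 2.0, 2.0]

tau_wm, sigma_wm = weighted_mean(taus, sigmas)
tau_med = statistics.median(taus)  # ignores the individual errors entirely

print(f"weighted mean = {tau_wm:.2f} +/- {sigma_wm:.2f} s, median = {tau_med:.2f} s")
```

The sketch makes the contrast between the two estimates explicit: the weighted mean is pulled towards the measurements with the smallest quoted errors, whereas the median depends only on the ordering of the central values.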

A. Error Distribution
Once we have a central estimate for the neutron lifetime ($\tau_{CE}$) using one of the above methods, we construct an error distribution by using [23,26]

$$N_{\sigma_i} = \frac{\tau_i - \tau_{CE}}{\sqrt{\sigma_i^2 + \sigma_{CE}^2}}. \tag{3}$$

In the above equation, $\sigma_{CE}$ is the error in the central estimate and $\sigma_i$ is the error in the individual measurement. Similar to Refs. [23,24,26], we denote our error distributions for the median ($\tau_{med}$) and the weighted mean ($\tau_{wm}$) calculated from Eq. 3 by $N^{med}_{\sigma_i}$ and $N^{wm+}_{\sigma_i}$, respectively. If the central estimate is determined by the weighted mean, one must also account for correlations, and the modified version of the error distribution, which accounts for these correlations, is given by [26]

$$N^{wm-}_{\sigma_i} = \frac{\tau_i - \tau_{wm}}{\sqrt{\sigma_i^2 - \sigma_{wm}^2}}. \tag{4}$$

The absolute values $|N_{\sigma_i}|$ from each of these three sets are then symmetrized around zero to build a histogram. We now fit the symmetrized histogram of $|N_{\sigma_i}|$ to various probability distributions, as described in the next section.
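The construction of the error distribution can be sketched as follows. This is a minimal illustration, not the analysis code used for this work: the function names are our own, the inputs are hypothetical, and the minus sign used in the correlated weighted-mean case is an assumption based on the convention of the cited median-statistics literature, which should be checked against Ref. [26].

```python
import math


def n_sigma(values, errors, central, central_error, correlated=False):
    """Number of standard deviations of each measurement from the central estimate.

    correlated=False: errors added in quadrature (median or weighted mean).
    correlated=True:  central error subtracted in quadrature (assumed form for a
                      weighted mean that is correlated with the data it is built from).
    """
    sign = -1.0 if correlated else 1.0
    return [
        (v - central) / math.sqrt(e**2 + sign * central_error**2)
        for v, e in zip(values, errors)
    ]


def symmetrize(n_sig):
    """Mirror the absolute values about zero, giving a sample of twice the size."""
    abs_vals = [abs(x) for x in n_sig]
    return sorted([-a for a in abs_vals] + abs_vals)


# Hypothetical inputs in seconds, chosen so that the arithmetic is easy to follow.
n_plus = n_sigma([884.0, 877.0], [3.0, 3.0], central=881.0, central_error=4.0)
n_minus = n_sigma([884.0], [5.0], central=881.0, central_error=3.0, correlated=True)
sym = symmetrize(n_plus)
```

Note that the correlated form is only defined when each individual error exceeds the error of the central estimate, which always holds for a weighted mean built from more than one measurement.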

B. Fits to probability distributions
We fit the symmetrized histogram for each of the different $|N_\sigma|$ sets to a Gaussian distribution as well as to heavier-tailed alternatives, namely the Cauchy, Laplacian and Student's t distributions, to see which of these distributions is most compatible, similar to recent works by Ratra et al. (P18). We briefly review this procedure. More details can be found in P18.
The comparison is done using the one-sample unbinned Kolmogorov-Smirnov (K-S) test [41]. The K-S test is based on the D-statistic, which measures the maximum distance between two cumulative distributions. In this case, the two distributions are the error histogram and the parent PDF to which it is compared. From the D-statistic, the K-S test also provides a p-value, whose analytic formula can be found in any statistics text [23,41].
The higher the p-value, the more similar the two distributions are, whereas a low p-value indicates an inconsistency with the distribution. Our results can be found in Table II.
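A minimal sketch of the one-sample K-S comparison is shown below, using only the Python standard library. It is meant to illustrate the D-statistic and is not the code used for Table II. The p-value uses the asymptotic Kolmogorov series with a commonly used small-sample correction, which is an assumption on our part; in practice a library routine such as `scipy.stats.kstest` would normally be preferred. The Student's t distribution is omitted because its CDF is not available in the standard library, and the three-point sample is hypothetical.

```python
import math


def gaussian_cdf(x, scale=1.0):
    return 0.5 * (1.0 + math.erf(x / (scale * math.sqrt(2.0))))


def cauchy_cdf(x, scale=1.0):
    return 0.5 + math.atan(x / scale) / math.pi


def laplace_cdf(x, scale=1.0):
    z = x / scale
    return 0.5 * math.exp(z) if z < 0 else 1.0 - 0.5 * math.exp(-z)


def ks_statistic(sample, cdf):
    """Maximum distance between the empirical CDF of the sample and the model CDF."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        f = cdf(x)
        # The empirical CDF jumps from (i - 1)/n to i/n at x; check both sides.
        d = max(d, i / n - f, f - (i - 1) / n)
    return d


def ks_pvalue(d, n, terms=100):
    """Approximate p-value from the asymptotic Kolmogorov distribution."""
    lam = (math.sqrt(n) + 0.12 + 0.11 / math.sqrt(n)) * d
    if lam < 0.2:
        return 1.0  # the series converges poorly here, and p is 1 to high accuracy
    total = sum(
        (-1) ** (k - 1) * math.exp(-2.0 * (k * lam) ** 2) for k in range(1, terms + 1)
    )
    return min(1.0, max(0.0, 2.0 * total))


# Hypothetical symmetrized sample, compared to a Cauchy PDF with unit scale factor.
sample = [-1.0, 0.0, 1.0]
d_cauchy = ks_statistic(sample, cauchy_cdf)
p_cauchy = ks_pvalue(d_cauchy, len(sample))
```

A scan over the scale factor can be done by passing, for example, `lambda x: gaussian_cdf(x, scale=s)` for a grid of values `s` and keeping the one with the largest p-value.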
We find that for all three estimates, the Gaussian distribution is not a good fit, unless the scale factor is different from unity. The data are much more consistent with a Cauchy or Student's t distribution. Therefore, as argued in P18, it is best to use the median statistics estimate as the central value.

Notes to Table II: (a) the scale factor (other than 1) that maximizes p; (b) the p-value that the data are derived from the PDF; (c) the value of n in the Student's t distribution.

IV. DISCREPANCY BETWEEN BEAM AND BOTTLE MEASUREMENTS
We now quantify the significance of the discrepancy between beam and bottle-based experiments using central estimates based on median statistics. We do this analysis using three different combinations of datasets for beam and bottle-based experiments. A summary of these comparisons can be found in Table III. We first use the same data points used by Fornal and Grinstein [4], who argued for a 4.4σ discrepancy. We obtain a median estimate using the same bottle-based experiments considered in F18 [8][9][10][11][12] and compare it with the beam-based experiments therein [6,7]. The median lifetime of the five bottle-based experiments along with the 1σ median error bar is given by 880.7 ± 1.3 seconds. The corresponding lifetime for the two beam-based experiments considered in F18 is 888.45 seconds. Since it is not possible to obtain a median error estimate with just two measurements, we do not quote a 1σ median uncertainty. The results do not change even after including the two additional bottle measurements [28,29] not used for their average. Therefore, considering the median statistics estimates, the discrepancy is about 6σ. If we do this comparison by including all the measurements in Table I, the median lifetime for all bottle-based experiments is equal to 880.7 ± 1.2 seconds. The corresponding number for all beam-based experiments is 888.45 ± 1.65 seconds. Therefore, comparing the median estimates of both sets of measurements amounts to a 3.79σ discrepancy.
If we then redo this for the subset of measurements in Table I with total error less than 10 seconds, the median central estimate for all bottle-based experiments is 880.2 ± 1.1 seconds. Since the total number of beam-based measurements in this subset is very small (three), we can only obtain a central estimate, which is equal to 889.2 seconds. Therefore, the total discrepancy is about 8.2σ.
Hence, we infer that the discrepancy between beam and bottle-based measurements persists, even when median statistics is used.
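The quoted significances can be checked with a short calculation. The sketch below assumes that the discrepancy is the difference of the two median estimates divided by their errors added in quadrature, with the beam error set to zero whenever no median uncertainty is quoted. This convention is inferred from the numbers above rather than stated explicitly, and the function name is our own.

```python
import math


def discrepancy_sigma(bottle, bottle_err, beam, beam_err=0.0):
    """Difference of two central estimates in units of their combined error."""
    return abs(beam - bottle) / math.hypot(bottle_err, beam_err)


# Median estimates in seconds, taken from the text of this section.
f18 = discrepancy_sigma(880.7, 1.3, 888.45)             # no beam error quoted
all_data = discrepancy_sigma(880.7, 1.2, 888.45, 1.65)  # all of Table I
small_err = discrepancy_sigma(880.2, 1.1, 889.2)        # errors < 10 s, no beam error quoted

print(f"F18: {f18:.2f} sigma, all: {all_data:.2f} sigma, errors < 10 s: {small_err:.2f} sigma")
```

Under this assumption the three cases come out close to 6σ, 3.8σ and 8.2σ. The two larger values rely on neglecting the beam uncertainty, so they are best read as upper bounds on the significance.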

V. CONCLUSIONS
There has been a long-standing discrepancy in the neutron lifetime measurements between the two different methods used for the measurements, viz. bottle and beam-based methods. As of 2018, the current discrepancy is about 4σ [4]. To get some insight into these issues, we carried out an extensive meta-analysis of the vetted neutron lifetime measurements compiled in the literature. We first use a compilation of 19 measurements of the neutron lifetime and their corresponding errors listed in the 2018 edition of PDG [3], as well as a few additional results not in PDG (cf. Table I), in order to test for non-Gaussianity of the residuals and to obtain a central estimate. The error distributions were analyzed in the same way as done for a plethora of astrophysical measurements by Ratra et al. [23,24,26] and were obtained using both the weighted mean (with and without correlations) and the median value as the central estimate. The median estimate does not incorporate the errors in the neutron lifetime. We then fit these residuals to four distributions, viz. the Gaussian, Laplace, Cauchy, and Student's t distributions. The resulting fits are tabulated in Table II. We conclude from these observations that the error distribution for the neutron lifetime measurements using both the median and weighted mean estimates is inherently non-Gaussian and is more suitably fitted using a Student's t or Cauchy distribution. We therefore argue that the median value should be used as the central estimate for the neutron lifetime. This median value along with 1σ error bars using the 19 measurements is given by 881.5 ± 0.47 seconds.
We then used the median estimate to evaluate the statistical significance of the discrepancy between beam and bottle-based measurements. When we use the same measurements as in F18, the discrepancy increases to 6σ. If we consider all the measurements in Table I, the discrepancy becomes 3.8σ (8.2σ), depending on whether we include (exclude) measurements with total error greater than 10 seconds.

Dataset                                      τ_N (bottle-based)   τ_N (beam-based)   Discrepancy
                                             (secs)               (secs)
F18 [4]                                      880.7 ± 1.3          888.45             6σ
Data from Table I                            880.7 ± 1.2          888.45 ± 1.65      3.79σ
Data from Table I having errors < 10 secs    880.2 ± 1.1          889.2              8.2σ

TABLE III: Summary of the significance of the discrepancies between beam and bottle-based measurements using median statistics. The first column refers to the datasets used. The second and third columns contain the median statistics estimates of the neutron lifetime (τ_N) using bottle and beam-based measurements respectively, with 1σ median error bars obtained using the procedure in Ref. [14]. The last column indicates the statistical significance of the discrepancy.