An independent search for annual modulation and its significance in ANAIS-112 data

We perform an independent search for a sinusoidal annual modulation, which could be induced by dark matter scatterings, in the recently released ANAIS-112 data. We then evaluate this hypothesis against the null hypothesis that the data contain no annual modulation, using four different model comparison techniques: frequentist, Bayesian, and two information theory-based criteria (AIC and BIC). We find that according to the BIC test, the null hypothesis of no modulation is decisively favored over a cosine-based annual modulation. None of the other model comparison tests decisively favors either hypothesis. This is the first proof-of-principle application of Bayesian and information theory techniques to test the annual modulation hypothesis in ANAIS-112 data.


I. INTRODUCTION
The ANAIS collaboration [1] (A19, hereafter) recently released their first scientific results, related to testing the long-standing DAMA claim of annual modulation caused by dark matter scatterings [2] (and references therein). Their target material consists of 112.5 kg of NaI (hence the experiment has been named ANAIS-112) and the total livetime of the released data was 1.5 years. The released data were found to be consistent with the null hypothesis of no modulation, with p-values of 0.67 and 0.18 in the 2-6 and 1-6 keV energy intervals, respectively.
In two recent works [3,4], we performed an independent search for annual modulation, using data from two direct dark matter detection experiments, namely DAMA [2] and COSINE-100 [5]. In these works, we evaluated the significance of the annual modulation using four independent model comparison techniques: frequentist, Bayesian, and two information theory-based criteria (AIC and BIC). We now carry out the same exercise on the recently released ANAIS-112 data (which have been kindly made available to us by the collaboration). Our analysis is therefore complementary to the model comparison tests carried out in A19.
A brief summary of the ANAIS-112 results can be found in Sect. II. Our analysis of the same data and its results are described in Sect. III. We conclude in Sect. IV. We have made our analysis codes publicly available and they can be found at https://github.com/aditikrishak/ANAIS112_analysis
* E-mail: aditi16@iiserb.ac.in
† E-mail: shntn05@gmail.com

II. ANAIS-112 RESULTS
We now recap the ANAIS-112 results from A19, wherein more details can be found. The ANAIS-112 experiment, consisting of 112.5 kg of NaI as target, is located at the Canfranc Underground Laboratory (LSC) in Spain, under 800 m of rock overburden. The experiment uses nine NaI modules. It started taking data in August 2017 and has released about 1.5 years of data, up to February 2019. The background rates in the modules were fit to a superposition of constant and exponential terms. The annual modulation search was done in two different energy bins, viz. [1-6] and [2-6] keV. The fit is done by binning the data in 10-day intervals. The function used for fitting both the potential signal and background is given by (we use the same notation as in A19):
$$R(t) = R_0 + R_1 \exp(-t/\tau) + A \cos\left(\omega (t + \phi)\right), \tag{1}$$
where $R_0$, $R_1$, and $\tau$ are the parameters of the background model, and $A$, $\omega$, and $\phi$ correspond to the amplitude, angular frequency and phase of the expected signal, respectively. While doing the fits, the period and phase were fixed to 1 year and -62.2 days, respectively, where the phase corresponds to the expected maximum around June 2nd. For a background-only fit, $A$ is fixed to 0, and for fitting the data to the signal, $A$ is a free parameter. An independent search for a signal with the phase as a free parameter was also done using this data.
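As an illustration of the fitting function described above, the following minimal Python sketch evaluates a constant-plus-exponential background with a cosine modulation, assuming the form $R(t) = R_0 + R_1 \exp(-t/\tau) + A \cos(\omega (t + \phi))$. The function names, the 365.25-day year, the bin centres and all numerical parameter values are our own illustrative assumptions; they are not taken from A19.

```python
import numpy as np

# Assumption: a 1-year period is taken as 365.25 days
OMEGA = 2.0 * np.pi / 365.25  # angular frequency in rad/day
PHASE = -62.2                 # phase in days, the fixed value quoted above


def background(t, r0, r1, tau):
    """Constant plus exponentially decaying background rate."""
    return r0 + r1 * np.exp(-t / tau)


def total_rate(t, r0, r1, tau, amp, omega=OMEGA, phase=PHASE):
    """Background plus cosine modulation with amplitude `amp`."""
    return background(t, r0, r1, tau) + amp * np.cos(omega * (t + phase))


# Hypothetical bin centres for 10-day bins over roughly 1.5 years
t = np.arange(5.0, 550.0, 10.0)

# Made-up parameter values, for illustration only
null_rate = total_rate(t, r0=3.0, r1=0.5, tau=200.0, amp=0.0)     # background only
signal_rate = total_rate(t, r0=3.0, r1=0.5, tau=200.0, amp=0.01)  # with modulation
```

Setting `amp=0.0` reproduces the background-only case, so the same function can serve both hypotheses in a fit.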
The data were found to be consistent with the null hypothesis in both the 2-6 keV and 1-6 keV energy intervals, with p-values of 0.67 and 0.18, respectively. The corresponding p-values for the annual modulation hypothesis are 0.65 and 0.16, respectively. The best fits were inconsistent with the DAMA/LIBRA best fits at 2.5σ and 1.9σ in the two energy intervals. More details of these results are available in A19.
arXiv:1910.05096v1 [astro-ph.CO] 11 Oct 2019

III. OUR ANALYSIS
We first do an annual modulation fit using the same background parameters as those of the ANAIS collaboration, and perform hypothesis testing on the residual rates only. We then allow the background parameters to vary, and do a combined fit to both the signal and background. For the fit to the signal hypothesis, we also vary the phase and period, in addition to the amplitude. Once parameter estimation is done for both cases, we carry out model comparison against the null hypothesis of no modulation.

A. Parameter estimation
We denote the dark matter induced cosine modulation as hypothesis $H_1$ and the background-only hypothesis as $H_0$. We first do a fit to the residual, where the best-fit background parameters have been obtained from A19.
For $H_1$, the residual $y(t)$ is modeled as:
$$y(t) = R(t) - R_0 - R_1 \exp(-t/\tau) = A \cos\left(\omega (t + \phi)\right), \tag{2}$$
where the best-fit values for $R_0$, $R_1$ and $\tau$ have been obtained from A19. The residual $y(t)$ has been plotted in Fig. 2 of A19 for both the 1-6 and 2-6 keV energy intervals. We also do a fit directly to the total event rate given in Eq. 1, by finding the best-fit parameters for both the signal and background.
To find the best-fit parameters, we construct the $\chi^2$ functional, which quantifies the differences between the model and the data for both the residual and total event rates, as follows:
$$\chi^2 = \sum_{i=1}^{N} \left( \frac{d_i - f(t_i)}{\sigma_{\mathrm{total},i}} \right)^2, \tag{3}$$
where $\sigma_{\mathrm{total}}$ is the total error, evaluated as prescribed in Ref. [6]. In Eq. 3, $f(t)$ is a placeholder for both $y(t)$ (Eq. 2) and $R(t)$ (Eq. 1), when we fit the background-subtracted and the total event rates, respectively. Similarly, $d_i$ denotes the observed background-subtracted and total event rates in the two cases. The null hypothesis corresponds to a constant value for $y(t)$ when we fit the residuals, and to $A = 0$ when we fit for $R(t)$.
For each of the two models in both the energy intervals, we obtain the best-fit parameters by $\chi^2$ minimization, and then carry out model comparison using multiple methods. The best-fit values we get for both the hypotheses, along with the residual data, are shown in Figs. 1a and 1b. The results from each of these sets of tests are outlined below. For brevity, we skip the theory behind the model comparison tests, which can be found in our earlier companion works [3,4] and references therein.
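The $\chi^2$ minimization for the residual rates can be sketched as follows. This is a minimal example on synthetic data, not the analysis code used for the results in this work: the data, the exaggerated signal amplitude, the starting values and the function names are all made up for illustration, and the errors are taken along the rate axis only (any additional contribution to the total error is ignored here).

```python
import numpy as np
from scipy.optimize import curve_fit


def cosine(t, amp, omega, phase):
    """H1 model for the residual rate: amp * cos(omega * (t + phase))."""
    return amp * np.cos(omega * (t + phase))


def chi_square(data, model, sigma):
    """Sum of squared, error-normalised differences between data and model."""
    return float(np.sum(((data - model) / sigma) ** 2))


# Synthetic residuals (illustrative only, with an exaggerated signal)
rng = np.random.default_rng(0)
t = np.arange(5.0, 550.0, 10.0)          # hypothetical 10-day bin centres
sigma = np.full(t.size, 0.1)             # made-up uniform errors
true_params = (1.0, 2.0 * np.pi / 365.25, -62.2)
y = cosine(t, *true_params) + rng.normal(0.0, sigma)

# H1: amplitude, angular frequency and phase are all free.
# Passing sigma makes curve_fit minimise the chi-square defined above;
# absolute_sigma=True keeps the returned covariance in absolute units.
p0 = (0.8, 2.0 * np.pi / 365.25, -50.0)  # starting guess near the truth
best_h1, cov_h1 = curve_fit(cosine, t, y, p0=p0, sigma=sigma, absolute_sigma=True)
chi2_h1 = chi_square(y, cosine(t, *best_h1), sigma)

# H0: a constant residual; its chi-square minimum is the
# inverse-variance weighted mean of the data
best_h0 = np.sum(y / sigma**2) / np.sum(1.0 / sigma**2)
chi2_h0 = chi_square(y, best_h0, sigma)
```

The two minimum values `chi2_h0` and `chi2_h1`, together with the number of free parameters in each model, are the inputs needed by the model comparison tests.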
• Frequentist model comparison: When comparing two models in this test, the model with the larger value of the $\chi^2$ p.d.f. is considered the favored one [7]. The $\chi^2$ p.d.f. (or likelihood) for both $H_0$ and $H_1$ for the residual rate can be found in Table I. We note that in the 1-6 keV range, the $\chi^2$ likelihood for $H_0$ is marginally greater than that for $H_1$, whereas the opposite is true for the 2-6 keV energy interval. Making use of the fact that the two models are nested, we quantify the p-value of the cosine model as compared to the background model using Wilks' theorem [8]. For our example, the difference in $\chi^2$ between the two models follows a $\chi^2$ distribution with three degrees of freedom. The p-value can be evaluated from the $\chi^2$ c.d.f., as discussed in Ref. [7]. The corresponding significance or Z-score is calculated by finding the number of standard deviations by which a Gaussian variable would fluctuate in one direction to give the corresponding p-value [7,9]. The $\chi^2$ values per degree of freedom and the likelihood of the model, given by the $\chi^2$ p.d.f., can be found in Table I, along with the p-value and Z-score. As we can see, $H_1$ (background + cosine modulation) is very marginally favored, with a significance of only 1.39σ for 1-6 keV and 0.59σ for 2-6 keV.
The corresponding results when the background parameters are allowed to vary can be found in Table II. For the 1-6 keV interval, $H_1$ is very marginally favored over $H_0$, whereas the converse is true in the 2-6 keV interval. However, the resulting p-value we obtain is greater than 0.5, so one cannot calculate a Z-score or significance. This implies that neither of the two hypotheses can be robustly distinguished from the other at the moment.
• AIC and BIC: The Akaike and Bayesian information criteria are two information theory-based criteria used for model comparison [10], in which additional terms are added to $\chi^2$ to penalize additional free parameters:
$$\mathrm{AIC} = \chi^2 + 2p,$$
$$\mathrm{BIC} = \chi^2 + p \ln N,$$
where $p$ is the total number of free parameters and $N$ is the total number of data points.
When comparing two models, the one with the smaller value of AIC and BIC is preferred. The significance can be evaluated using the qualitative strength of evidence rules given in Ref. [11]. The ∆AIC and ∆BIC values are tabulated in Table I and Table II for the residual and total rates, respectively. For the residual rates, ∆BIC gives "positive" evidence against cosine modulation, while ∆AIC gives "substantial" support for $H_0$ in the 1-6 keV range and for $H_1$ in the 2-6 keV range. When we fit for the total rates, we infer that the ∆BIC values (which are greater than 10) point to "very strong" evidence against cosine modulation. For the same data, ∆AIC points to "considerably less" evidence in support of cosine modulation, using the same strength of evidence rules.
• Bayesian Model Comparison: We carry out a Bayesian model comparison by calculating the Bayesian odds ratio, which in this case is equal to the Bayes factor $B_{21}$ for the $M_2$ model in comparison to the $M_1$ hypothesis. Here, we consider the null hypothesis ($H_0$) to be $M_1$ and the cosine model ($H_1$) to be $M_2$. $B_{21}$ is given by
$$B_{21} = \frac{P(M_2|D)}{P(M_1|D)},$$
where $P(M_2|D)$ and $P(M_1|D)$ are the posterior probabilities for $M_2$ and $M_1$, respectively, given the observed data $D$. Similar to Refs. [3,4], the Bayesian evidences for both $H_0$ and $H_1$ have been evaluated using the Nestle package in Python.
We assume a Gaussian likelihood and uniform priors for all the parameters. The resulting Bayes factors can be found in Tables I and II for the residual and total event rates, respectively. For both energy intervals in Table I and for the 1-6 keV interval in Table II, the Bayes factor is greater than one, indicating that the modulation hypothesis is favored. However, the Bayes factor lies between 1 and 3, which implies that according to Jeffreys' scale [12], the difference between the two models is negligible. For the 2-6 keV case in Table II, the Bayes factor is less than one, favoring the null hypothesis, but again the evidence is not substantial according to Jeffreys' scale.
TABLE II (caption): Here the fit is done to the total count rate, including background and signal. Note that the p-values are greater than 0.5, so the significance using the definition in Ref. [9] cannot be calculated.
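The three families of model comparison tests outlined above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the code used to produce Tables I and II. The function names are hypothetical. The Bayes factor is computed here by direct numerical integration over a single amplitude parameter, which is a simple stand-in for the nested sampling done with Nestle, and its null model is simplified to a residual of exactly zero with no free parameter.

```python
import numpy as np
from scipy import stats


def wilks_pvalue(chi2_h0, chi2_h1, extra_params):
    """p-value of the chi-square improvement of H1 over the nested H0."""
    return float(stats.chi2.sf(chi2_h0 - chi2_h1, df=extra_params))


def z_score(p_value):
    """One-sided Gaussian significance; nan when the p-value exceeds 0.5."""
    return float(stats.norm.isf(p_value)) if p_value <= 0.5 else float("nan")


def aic(chi2, n_params):
    """Akaike information criterion for a Gaussian likelihood."""
    return chi2 + 2.0 * n_params


def bic(chi2, n_params, n_points):
    """Bayesian information criterion for a Gaussian likelihood."""
    return chi2 + n_params * np.log(n_points)


def bayes_factor_grid(t, y, sigma, omega, phase, amp_max, n_grid=2001):
    """Bayes factor of a cosine with free amplitude against y = 0.

    Assumes a uniform prior on the amplitude in [-amp_max, amp_max],
    with the angular frequency and phase held fixed.
    """
    amps = np.linspace(-amp_max, amp_max, n_grid)
    template = np.cos(omega * (t + phase))
    chi2_h1 = np.sum(((y - amps[:, None] * template) / sigma) ** 2, axis=1)
    chi2_h0 = np.sum((y / sigma) ** 2)
    # Gaussian likelihood ratio L(A) / L(0); normalisation constants cancel
    ratio = np.exp(-0.5 * (chi2_h1 - chi2_h0))
    # Trapezoidal integral over the amplitude, times the prior density
    integral = np.sum(0.5 * (ratio[1:] + ratio[:-1]) * np.diff(amps))
    return float(integral / (2.0 * amp_max))
```

With the minimum $\chi^2$ values of both hypotheses in hand, a difference such as `bic(chi2_h1, p1, n) - bic(chi2_h0, p0, n)` is positive when the null hypothesis is preferred. The grid integration is only practical for one or two parameters and moderate $\chi^2$ differences, which is one reason a nested sampler is the more natural choice for the full fits.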

IV. CONCLUSIONS
The ANAIS-112 dark matter direct detection experiment, consisting of 112.5 kg of NaI and designed to test the long-standing DAMA annual modulation claim, recently released its first results from 1.5 years of data, with a total exposure of 157.55 kg year [1]. These data are consistent with the null hypothesis of no annual modulation.
As a follow-up to our previous work with DAMA and COSINE-100 data [3,4], we carried out an independent search for annual modulation using the ANAIS-112 data in two different energy intervals: 1-6 and 2-6 keV. For each of these energy intervals, we fitted both the total event rate and the background-subtracted residual event rate for a cosine modulation, which could be induced by dark matter interactions. In the latter case, we used the background-subtracted data provided by the ANAIS collaboration.
We then did a model comparison on these different data sets to test if the current ANAIS-112 data are compatible with annual modulation. For this purpose, we used four different model comparison techniques: frequentist, Bayesian, AIC and BIC. A tabular summary of our results can be found in Tables I and II. From the frequentist and Bayesian model comparison tests, we find that both the hypotheses are equally compatible with the data. When we analyze the total rate, we find that BIC decisively favors the null hypothesis of no modulation over a cosine-based modulation. This is the first proof-of-principle application of Bayesian and information theory-based techniques to the ANAIS-112 data and is complementary to the model comparison techniques carried out by the ANAIS collaboration. To improve transparency in data analysis, we have made our analysis codes publicly available and they can be found at https://github.com/aditikrishak/ANAIS112_analysis.