Windowing Artifacts Likely Account for Recent Claimed Detection of Oscillating Cosmic Scale Factor

Using the Pantheon data set of Type Ia supernovae, a recent publication (R20 in this work) reports a $\sim 2\sigma$ detection of oscillations in the expansion history of the universe. Applying the R20 methodology to simulated Pantheon data, we determine that these oscillations likely arise from analysis artifacts. The uneven spacing of Type Ia supernovae in redshift space and the complicated analysis method of R20 impose a structured throughput function. When analyzed with the R20 prescription, about $11\%$ of artificial $\Lambda$CDM data sets produce a stronger oscillatory signal than the actual Pantheon data. The study conducted by R20 is a wholly worthwhile endeavor. However, we believe that the detected oscillations are not due to an oscillating cosmic scale factor and are instead artifacts of the data processing. Our results underscore the importance of understanding the false 'signals' that can be introduced by complicated data analyses.


INTRODUCTION AND BACKGROUND
Since the initial discovery of dark energy (DE) by Riess et al. (1998) and Perlmutter et al. (1999), observations of Type Ia supernovae (SNe Ia) have been integral in establishing the canonical ΛCDM cosmological model. In the ΛCDM model, the present energy density of our flat universe is dominated by cosmologically constant DE (Λ) and non-relativistic, collisionless ('cold') dark matter (CDM). Though some tensions between predictions and observations exist (Weinberg et al. 2015; Verde et al. 2019), this simple model has successfully predicted many cosmological and astrophysical signals (Peter 2012; Mortonson et al. 2013). However, despite the success of the ΛCDM model, the physical identities of Λ and CDM remain unsettled. Researchers continue to search for deviations from the predictions of ΛCDM in many data sets, including an ever-increasing archive of SNe Ia. In particular, some noncanonical cosmological models, such as those discussed by Barenboim et al. (2005), Xia et al. (2005), Feng and Li (2006), Lazkoz et al. (2010), and Wang et al. (2017), predict that the true expansion of the universe might oscillate around the predictions of ΛCDM. Using, respectively, the Gold (Riess et al. 2007), Union (Kowalski et al. 2008), Constitution (Hicken et al. 2009), and Pantheon (Scolnic et al. 2018) data sets of SNe Ia, Jain et al. (2007), Liu and Li (2009), Lazkoz et al. (2010), and Brownsberger et al. (2019) search for evidence of such oscillations in cosmic expansion. Though they utilize a diversity of data sets and statistical methods, those analyses universally report no evidence of oscillations in the rate of cosmic expansion.
Contrary to those previous findings, Ringermacher and Mead (2015) and Ringermacher and Mead (2020) (R15 and R20 henceforth) claim to identify damped oscillations in the universe's recent expansion history. Combining data of radio galaxies and SNe Ia (Conley et al. 2011; Daly and Djorgovski 2004; Riess et al. 2004) into a 'CDR' data set, R15 claim to detect cosmic oscillations in the universe's scale factor. Using the Pantheon data set of Type Ia supernovae, R20 build on R15 to claim a detection of an oscillating scale factor with a total statistical significance of at least 2σ.
Such a detection of oscillatory cosmic expansion would mark an enormous paradigm shift in our understanding of the physics of the universe, changing the canonical model that has held since the first identification of DE. The work of R20 is entirely worthwhile. Their results should be seriously considered and appropriately scrutinized.
Replicating the analysis method of R20 and applying it to simulated data, we find that there is an 11% chance that the Pantheon data observed in a ΛCDM universe would produce a stronger oscillatory signal than that which R20 detect. Our measurement does not include a statistical 'trials factor' penalization for the various tunable parameters in the R20 analysis, and the significance of the detected oscillations is therefore less than our reported metric. The oscillations noted by R20 are likely data analysis artifacts: the signature of a throughput function that consists of the uneven spacing of the Pantheon SNe in redshift and their sequencing of filtering and differentiation analysis steps.
In Section 2 below, we describe our replication of the R20 analysis. In Section 3, we describe our generation of the artificial data and the assessment of the consistency of the real data with ΛCDM. We detail our conclusions in Section 4.

REPLICATING THE R20 RESULTS
In this Section, we describe our replication of the R20 analysis and the R20 results.

Inferring Cosmic Time and Residual Scale Factor Derivative for SNe Ia
R20 search for oscillations by transforming the standard Hubble diagram (brightness vs. redshift) into plots of scale factor vs. time. They claim that such a plot enables a model-independent study of the universe's expansion history.
The Pantheon data set of Type Ia supernovae consists of measured redshifts, $z_i$, distance moduli, $\mu_i$, and distance modulus uncertainties, $\sigma_{\mu,i}$. The subscripts identify each SN in order of increasing $z_i$. These are the directly measured quantities from which cosmologists infer cosmic expansion.
From these measured quantities, R20 note oscillations in non-standard inferred quantities: the normalized cosmic time since the end of inflation and the residual in the time derivative of the scale factor. They infer measurements of cosmic time, which we denote $t_i$, by approximating an integral in luminosity distances with a discrete sum. They approximate residual time derivatives in scale factor by integrating in luminosity distance to determine $t_i$, subtracting the predictions of ΛCDM to measure residuals, binning these residuals, taking a discrete derivative of these binned values, smoothing these binned derivatives with distinct smoothing kernels, and subtracting the two smoothings.
We make this series of operations explicit in our notation of the inferred residual time derivative of the scale factor, which we denote $\Delta G(d\langle\Delta a_i\rangle/dt)$. Here, the inner $\Delta$ indicates a residual, $\langle\cdot\rangle$ indicates binning, $d(\cdot)/dt$ indicates discrete time differentiation, and $\Delta G$ indicates the difference between two Gaussian smoothings.
Throughout the rest of this Section, we describe how we inferred $t_i$ and $\Delta G(d\langle\Delta a_i\rangle/dt)$ from $z_i$ and $\mu_i$. We measured the cosmological scale factors, $a_i$, and luminosity distances, $d_{L,i}$, using the standard relations:
$$a_i = \frac{1}{1+z_i} \tag{1}$$
and
$$d_{L,i} = 10^{(\mu_i - 25)/5}\ \mathrm{Mpc}. \tag{2}$$
The scaled luminosity distances, $Y_i$, were defined by
[Equation 3 not preserved]
where $D_H = c/H_0$, $c$ is the speed of light, and $H_0$ is the Hubble constant. We determined the scaled $Y$ separations between measured SNe, $\Delta Y_i$, via the relation:
[Equation 4 not preserved]
We calculated the normalized cosmological times, $t_{i,\mathrm{raw}}$, by approximating an integral over cosmic time via a discrete sum:
[Equation 5 not preserved]
We corrected the raw cosmological times using the relation:
[Equation 6 not preserved]
where $t_i$ are the corrected cosmological times. According to R20, $t_{\mathrm{corr}}$ corrects for the fact that the first measurement of $t$ is at the first SN where $a$ is not equal to its present day value, and $\alpha_{\mathrm{Pan\ to\ CDR}}$ is a scaling to match the Pantheon $t$ range to the $t$ range of the CDR data set studied in R15. Following R20, we used $t_{\mathrm{corr}} = 0.009579$ and $\alpha_{\mathrm{Pan\ to\ CDR}} = 1.041$.
We calculated the residual scale factors, $\Delta a_i$, by subtracting from the measured $a_i$ values the canonical values of $a_i$ determined from $t_i$:
$$\Delta a_i = a_i - a_{\Lambda\mathrm{CDM}}(t_i). \tag{7}$$
We defined the canonical scale factor, $a_{\Lambda\mathrm{CDM}}$, for a given cosmic time, $t$, by the integral relation
[Equation 8 not preserved]
Copying R20, we set $\Omega_M = 0.27$ and $\Omega_\Lambda = 0.73$. We cal-
To bin the inferred scale factor residuals, we divided the $t$ space (0 to 1) into $N_{\mathrm{bin}} = 128$ bins of equal size and calculated the mean $\Delta a_i$ value in each $t$ bin, $\langle\Delta a_i\rangle$. We computed the wide baseline derivative of $\langle\Delta a_i\rangle$, $d\langle\Delta a_i\rangle/dt$, following Equation (1) of R20:
$$\left.\frac{d\langle\Delta a_i\rangle}{dt}\right|_j = \frac{\langle\Delta a_i\rangle_{j+n/2} - \langle\Delta a_i\rangle_{j-n/2}}{n\,\Delta t}, \tag{9}$$
where $j$ indexes the $t$ bins. As in R20, $n = 8$ and $\Delta t = 1/128$. For the first [last] $n/2$ bins, the lower [upper] $\langle\Delta a_i\rangle$ value was set to the first [last] bin and the $n$ in the denominator was set equal to the number of bins over which the derivative was measured. We smoothed the $d\langle\Delta a_i\rangle/dt$ values using a Gaussian kernel.
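We denote these smoothed derivatives as $G_k(d\langle\Delta a_i\rangle/dt)$, where the $k$ index denotes the width of the Gaussian kernel in $t$:
$$\left.G_k\!\left(\frac{d\langle\Delta a_i\rangle}{dt}\right)\right|_j = \frac{\sum_l K\!\left(\frac{t_j - t_l}{k}\right) \left.\frac{d\langle\Delta a_i\rangle}{dt}\right|_l}{\sum_l K\!\left(\frac{t_j - t_l}{k}\right)}, \tag{10}$$
where
$$K(x) = \frac{1}{\sqrt{2\pi}\,(0.37)} \exp\!\left(-\frac{x^2}{2\,(0.37)^2}\right), \tag{11}$$
and $t_j$ and $t_l$ are the times of the $t$ bins indexed by $j$ and $l$. We based Equations 10 and 11 on the definition of the ksmooth function of the Mathcad software, as that is the smoothing function used by R20. We believe the value of 0.37 is an approximation of one e-folding, $1/e$.
Our final result, $\Delta G(d\langle\Delta a_i\rangle/dt)$, is the difference between $G_k(d\langle\Delta a_i\rangle/dt)$ with two kernels:
$$\Delta G\!\left(\frac{d\langle\Delta a_i\rangle}{dt}\right) = G_{k_1}\!\left(\frac{d\langle\Delta a_i\rangle}{dt}\right) - G_{k_2}\!\left(\frac{d\langle\Delta a_i\rangle}{dt}\right). \tag{12}$$
Here, as in R20, we set $k_1 = 0.05$ and $k_2 = 0.13$. We emphasize that $\Delta G(d\langle\Delta a_i\rangle/dt)$ is not a true measurement of the residual of the time derivative of the scale factor, $\Delta\dot{a}$. Rather, $\Delta G(d\langle\Delta a_i\rangle/dt)$ represents an attempt to infer $\Delta\dot{a}$ through a series of data processing steps. The relation between $\Delta G(d\langle\Delta a_i\rangle/dt)$ and $t_i$ describes the cosmic relation between $\Delta\dot{a}$ and $t$ viewed through a structured windowing function.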
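The processing chain described in this Section can be sketched in a few lines of NumPy. This is a minimal illustration under stated assumptions, not the code used by R20 or in this work: all function names are hypothetical, the kernel smoother is written as a normalized Gaussian-weighted average with an internal width of 0.37 (our reading of the Mathcad ksmooth definition), and empty $t$ bins are simply left as NaN because the text does not say how they are treated. The steps that convert $\mu_i$ into $t_i$ are omitted because they depend on relations not reproduced here.

```python
import numpy as np

def scale_factor(z):
    """Standard relation a = 1 / (1 + z)."""
    return 1.0 / (1.0 + np.asarray(z, dtype=float))

def luminosity_distance_mpc(mu):
    """Standard relation d_L = 10**((mu - 25) / 5), in Mpc."""
    return 10.0 ** ((np.asarray(mu, dtype=float) - 25.0) / 5.0)

def bin_means(t, values, n_bins=128):
    """Mean of `values` in n_bins equal-width bins spanning t in [0, 1].

    Assumption: bins with no data are returned as NaN.
    """
    t = np.asarray(t, dtype=float)
    values = np.asarray(values, dtype=float)
    idx = np.clip((t * n_bins).astype(int), 0, n_bins - 1)
    sums = np.bincount(idx, weights=values, minlength=n_bins)
    counts = np.bincount(idx, minlength=n_bins)
    means = np.full(n_bins, np.nan)
    filled = counts > 0
    means[filled] = sums[filled] / counts[filled]
    return means

def wide_baseline_derivative(binned, n=8, dt=1.0 / 128):
    """Difference across n bins; near the edges the baseline is truncated
    and the denominator uses the number of bins actually spanned."""
    binned = np.asarray(binned, dtype=float)
    j = np.arange(binned.size)
    lo = np.clip(j - n // 2, 0, binned.size - 1)
    hi = np.clip(j + n // 2, 0, binned.size - 1)
    return (binned[hi] - binned[lo]) / ((hi - lo) * dt)

def ksmooth(x, y, bandwidth, sigma=0.37):
    """Gaussian-weighted running average (assumed form of Mathcad ksmooth).

    The kernel's constant prefactor cancels in the ratio, so it is omitted.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    u = (x[:, None] - x[None, :]) / bandwidth
    weights = np.exp(-0.5 * (u / sigma) ** 2)
    return (weights @ y) / weights.sum(axis=1)

def difference_of_smoothings(t, deriv, k1=0.05, k2=0.13):
    """Narrow-kernel smoothing minus wide-kernel smoothing."""
    return ksmooth(t, deriv, k1) - ksmooth(t, deriv, k2)
```

Subtracting a wide smoothing from a narrow one acts as a band-pass filter: structure much slower than the wide kernel or much faster than the narrow kernel is suppressed, so even featureless noise emerges with a preferred range of apparent frequencies.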

Results from the True Pantheon Data
In Figure 1, we show the intermediate results of the data processing steps described in Section 2.1. To best replicate Figure 2 of R20, we measured only those bins with $0.46 \le t \le 1$. The bottom panel, showing our calculation of $\Delta G(d\langle\Delta a_i\rangle/dt)$ vs $t_i$, represents our best replication of Figure 2 in R20. We find a damped oscillatory relation between $\Delta G(d\langle\Delta a_i\rangle/dt)$ and $t_i$. The oscillation amplitude decreases in $t$, with the largest excursion of $|\Delta G(d\langle\Delta a_i\rangle/dt)| \approx 0.12$ occurring around $t \approx 0.5$. We found that such oscillations in $\Delta\dot{a}$ with $t$ would correspond to similar oscillations in the distance modulus residual, $\Delta\mu$, with redshift, $z$. We estimate that, for oscillations of the magnitude shown in Figure 1, there would be an oscillatory signal in $\Delta\mu$ vs $z$ with a peak amplitude of about 10 millimags. Comparing this prediction to the constraints shown in Figure 6 of Brownsberger et al. (2019), the Pantheon data are unable to rule out such a small oscillatory signal in $\Delta\mu$ vs $z$.
Having demonstrated the consistency of our analysis with that of R20, we next show how similar oscillatory signals can arise from the above analysis applied to Pantheonlike artificial data obtained in a canonical ΛCDM universe.

DEMONSTRATING HOW THE CLAIMED OSCILLATORY SIGNAL IS GENERIC, AND NOT AN INDICATION OF COSMIC OSCILLATIONS
The final data shown in the lower plot of Figure 1 are not direct measurements of the residual time derivative of the scale factor, $\Delta\dot{a}$. Rather, they represent an inference of $\Delta\dot{a}$ acquired through a series of operations. Those operations conspire with the sampling of observed SNe in redshift to produce a complicated windowing function which itself carries structure.
In this section, we describe how we produced simulated Pantheon-like data sets and demonstrate that random data realizations distributed around the ΛCDM cosmology can generate the same sorts of oscillations identified in the bottom plot of Figure 1.

Generating Randomized Pantheon-Like Data
We applied the same procedure discussed in Section 2 to randomized versions of the Pantheon data set. For the $i$th Pantheon SN, we determined a randomized $\mu_i$ by drawing from a normal distribution with mean equal to the background ΛCDM predicted $\mu_i$ and standard deviation equal to the reported $\sigma_{\mu,i}$. The $z_i$ of each SN is well determined, and was left unchanged in the randomization. This preserves the window function in redshift. Each randomization produced a new set of 1048 $\mu_i$ values at the same $z_i$ positions.
Any fundamental cosmic oscillation buried in the true Pantheon data does not exist in these artificial Pantheon-like data sets. By pushing each randomized data set through the same processing steps described in Section 2, we generated a plot of $\Delta G(d\langle\Delta a_i\rangle/dt)$ vs $t_i$ for Pantheon-like data from which any non-ΛCDM cosmic structure (oscillatory or otherwise) has been removed. We repeated this randomization $N_R = 10^4$ times.
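The randomization step lends itself to a short sketch. The following assumes NumPy and takes the ΛCDM-predicted distance moduli as an input array rather than computing them; the function name and arguments are hypothetical, and the draws are independent per supernova, as the description above implies.

```python
import numpy as np

def randomized_moduli(mu_model, sigma_mu, n_realizations, seed=None):
    """Draw mock distance moduli around a model prediction.

    mu_model : model-predicted distance modulus at each (fixed) redshift
    sigma_mu : reported distance modulus uncertainty of each supernova
    Returns an array of shape (n_realizations, n_supernovae); the
    redshifts are untouched, so the sampling in z is preserved.
    """
    rng = np.random.default_rng(seed)
    mu_model = np.asarray(mu_model, dtype=float)
    sigma_mu = np.asarray(sigma_mu, dtype=float)
    shape = (n_realizations, mu_model.size)
    return rng.normal(loc=mu_model, scale=sigma_mu, size=shape)
```

Each row of the returned array would then be passed through the same processing chain as the real data, with the original $z_i$ values reused unchanged.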
In Figure 2, we show a representative subset of the $\Delta G(d\langle\Delta a_i\rangle/dt)$ vs $t_i$ plots of the randomized Pantheon-like data sets. Many such plots (those shown and not shown in Figure 2) display oscillations similar to those of the real Pantheon data. We identify the real Pantheon data in the Figure's caption.

Figure 2. Thirty-four plots of artificial Pantheon-like data sets and the plot of the real Pantheon data, all processed using the analysis in Section 2.1. We identify which panel displays the true data at the end of this caption. Because we randomized the distance moduli of these Pantheon-like data sets around ΛCDM, they are definitionally bereft of non-canonical cosmic structure. The oscillations exhibited in all but one of the above plots are nothing more than artifacts of the data processing. Because the oscillations of the true Pantheon data are not clearly distinct from the oscillations in the artificial Pantheon-like data, we argue that the oscillations identified by R20 could reasonably result from the Pantheon data observed in a canonical ΛCDM cosmology. In this figure, the plot of the true Pantheon data is displayed in the fifth row of the fourth column.

The Frequency Spectra of Real and Artificial Pantheon Data
Examining the $\Delta G(d\langle\Delta a_i\rangle/dt)$ vs $t_i$ plots in Figure 2, we cannot confidently distinguish the plot of the true Pantheon data from those of the randomized data. The oscillations in the real data appear consistent with apparent oscillations that could result from random fluctuations around the canonical ΛCDM cosmology. We computed power spectra of $\Delta G(d\langle\Delta a_i\rangle/dt)$ in $t_i$. The data were binned in $t_i$ bins of equal size, and the data to be Fourier transformed were thus evenly spaced in time. We measured the power spectrum, $P_f$, of $\Delta G(d\langle\Delta a_i\rangle/dt)$ in $t_i$ via a standard discrete Fourier transform:
[Equation 13 not preserved]
To avoid aliased modes, we measured the Fourier power in frequencies, $f$, from 0 HHz to $N_{\mathrm{bin}}/4 = 32$ HHz. Replicating R20, 1 HHz ('one Hubble Hertz') $= 0.1023\,h_{100}\,\mathrm{Gyr}^{-1}$. Using Equation 13, we computed the power spectrum of $\Delta G(d\langle\Delta a_i\rangle/dt)$ in $t_i$ for the true Pantheon data set and for the $N_R$ artificial Pantheon-like data sets. We show the true Pantheon power spectrum and the distribution of artificial Pantheon-like power spectra in Figure 3. At its peak, the power spectrum of the true Pantheon data (black line) lies below the 90% contour of the artificial power spectra (blue shading). R20 focus primarily on the frequency peak at $f = 7.5$ HHz.

Figure 3. The power spectrum of the true Pantheon data (black line) and the distribution of power spectra of the randomized Pantheon-like data. The $N_R = 10^4$ randomized Pantheon-like data sets produce, at every frequency, a distribution of $N_R$ measurements of the power spectra that could result from random deviations around the ΛCDM cosmology. At each frequency, the noted percentage of randomizations lies below the labeled contour. For example, at each frequency, 50% of randomizations have power below the green contour and 90% of randomizations have power below the blue contour.
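One way to evaluate Fourier power at arbitrary frequencies, including off-grid values such as 7.5, is a direct sum over the evenly spaced samples. The sketch below assumes NumPy; the function name, the $1/N$ normalization, and the sign convention are our assumptions and need not match the normalization used for Figure 3. Frequencies are taken in cycles per unit of normalized time $t$, which we assume corresponds to the HHz unit used above. The overall normalization does not affect a comparison between real and randomized data processed identically.

```python
import numpy as np

def power_spectrum(t, signal, freqs):
    """Power of `signal` sampled at times `t`, at each frequency in `freqs`.

    Uses a direct discrete Fourier sum so that `freqs` need not lie on
    the FFT grid. Normalization by the sample count is an assumption.
    """
    t = np.asarray(t, dtype=float)
    signal = np.asarray(signal, dtype=float)
    freqs = np.asarray(freqs, dtype=float)
    phases = np.exp(-2j * np.pi * freqs[:, None] * t[None, :])
    amplitudes = phases @ signal / signal.size
    return np.abs(amplitudes) ** 2
```

For example, a unit-amplitude cosine at 8 cycles per unit time, sampled on 128 evenly spaced points across one unit of time, carries power 0.25 at frequency 8 under this normalization and essentially none at frequencies 7 and 9.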
In Figure 4, we display a histogram showing the Fourier power at $f = 7.5$ HHz of the $N_R$ Pantheon-like artificial data sets and show where the power of the real Pantheon data lies in this histogram (black line). About 11% of randomized Pantheon-like ΛCDM data sets have more power at the chosen frequency than the real Pantheon data.
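The quoted percentage is an empirical exceedance fraction: the share of randomized data sets whose power at the chosen frequency exceeds that of the real data. A minimal sketch, assuming NumPy and hypothetical function names, follows; the binomial standard error is included only to indicate how precisely a finite number of randomizations pins down such a fraction.

```python
import numpy as np

def exceedance_fraction(mock_powers, real_power):
    """Fraction of mock data sets with more power than the real data."""
    mock_powers = np.asarray(mock_powers, dtype=float)
    return float(np.mean(mock_powers > real_power))

def binomial_standard_error(fraction, n_trials):
    """Sampling uncertainty of a fraction estimated from n_trials draws."""
    return float(np.sqrt(fraction * (1.0 - fraction) / n_trials))
```

For a fraction of 0.11 estimated from $10^4$ randomizations, the binomial standard error is about 0.003, so the Monte Carlo noise on the 11% figure is small compared with the figure itself.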

CONCLUSIONS
We replicated the analysis of the Pantheon data set of SNe Ia described by R20 and we found a similar result: the inferred $\dot{a}$ residuals oscillate in the inferred cosmic time. We show these results in Figure 1.
We repeated this analysis on artificial Pantheon-like data sets with unchanged redshifts and with distance moduli randomly drawn from normal distributions centered at the canonical ΛCDM cosmology. By definition, this randomization erased any cosmic oscillation signature that exists in the true Pantheon data. Many plots of the R20 analysis applied to these randomized distributions (see Figure 2 for a representative subsample) display oscillations similar in amplitude and frequency to those identified in the real Pantheon data.
To make this qualitative observation quantitative, we measured the power spectra of the real Pantheon data and of the artificial Pantheon-like data sets. We showed these power spectra in Figure 3. R20 focus on the power spectrum peak at $f = 7.5$ HHz. In Figure 4, we showed the distribution of the artificial data sets' powers at this chosen frequency and where the true Pantheon data lie in this histogram. About 11% of the Pantheon-like ΛCDM data sets have more power at this chosen frequency than the real Pantheon data when analyzed according to the prescription of R20. Our analysis used the same choice of tuned analysis parameters that R20 report, including the widths of the smoothed Gaussian kernels, the chosen frequency, and the number of time bins. A robust measurement of the statistical significance of this 11% effect would also include a statistical penalization for these adjustable analysis parameters.
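As an illustration of what such a penalization could look like for one of the adjustable choices, the frequency, the sketch below compares the most extreme per-frequency exceedance fraction of the real data with the most extreme value found in each randomized data set. This is not a calculation performed in this work or by R20; it assumes NumPy, the function names are hypothetical, and it accounts only for the freedom to choose a frequency, not for kernel widths or bin counts.

```python
import numpy as np

def local_fractions(mock_spectra, spectrum):
    """Per-frequency fraction of mocks with more power than `spectrum`."""
    mock_spectra = np.asarray(mock_spectra, dtype=float)
    spectrum = np.asarray(spectrum, dtype=float)
    return np.mean(mock_spectra > spectrum[None, :], axis=0)

def global_fraction(mock_spectra, real_spectrum):
    """Fraction of mocks whose most extreme frequency is at least as
    extreme as the most extreme frequency of the real data.

    mock_spectra has shape (n_mocks, n_frequencies).
    """
    mock_spectra = np.asarray(mock_spectra, dtype=float)
    n_mocks = mock_spectra.shape[0]
    real_min = local_fractions(mock_spectra, real_spectrum).min()
    # Rank each mock within each frequency column (0 = least power).
    ranks = mock_spectra.argsort(axis=0).argsort(axis=0)
    # Fraction of mocks with more power than this mock, per frequency.
    mock_local = (n_mocks - 1 - ranks) / n_mocks
    mock_min = mock_local.min(axis=1)
    return float(np.mean(mock_min <= real_min))
```

Because the comparison is made on per-frequency fractions rather than raw power, it is not biased toward frequencies where the processing chain passes more power. The resulting global fraction can only be larger than or comparable to the single-frequency value.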
There are potential sources of systematic error that neither we nor R20 consider. Particularly, the Pantheon data set is a combination of distinct supernova survey projects, each of which carries its own imperfectly characterized systematic errors. These inter-survey systematics inherit each individual survey's uneven distributions in redshift and on the sky. If the oscillations noted by R20 appeared to be more than data analysis artifacts, we would analyze the signal's robustness against these inter-survey systematics.
There is at least a one-in-ten chance that statistical fluctuations around the canonical ΛCDM cosmology would conspire with the windowing function of the R20 data analysis to produce a larger oscillatory signal than that which R20 report. The apparent oscillatory signal is consistent with data processing artifacts that masquerade as an oscillating signal in a truly ΛCDM cosmology.

ACKNOWLEDGMENTS
We found the work of VanderPlas (2018) particularly helpful in understanding the importance of being cautious about the potential impact of processing artifacts. SB and CS are supported by Harvard University and the US Department of Energy under grant DE-SC0007881. DS is supported by DOE grant DE-SC0010007 and the David and Lucile Packard Foundation. DS is supported in part by NASA under Contract No. NNG17PX03C issued through the WFIRST Science Investigation Teams Programme.