A Bayesian Fermi-GBM Short GRB Spectral Catalog

With the confirmed detection of a short gamma-ray burst (GRB) in association with a gravitational wave signal, we present the first fully Bayesian Fermi-GBM short GRB spectral catalog. Both peak flux and time-resolved spectral results are presented. Additionally, we release the full posterior distributions and reduced data from our sample. Following our previous study, we introduce three variability classes based on the observed light curve structure.


INTRODUCTION
The Fermi Gamma-ray Burst Monitor (GBM; Meegan et al. 2009) is the most prolific detector of short gamma-ray bursts (GRBs). Over its 10 yr mission beginning in 2008 July, GBM has detected over 300 short GRBs. Long believed to be the by-product of binary neutron star mergers, short GRBs have become especially pertinent to study since the recent association of GW 170817 (Abbott et al. 2017a,b) with the short GRB 170817A (Abbott et al. 2017c; Goldstein et al. 2017). First, the low luminosity combined with otherwise typical spectral properties of short GRBs demands an explanation of the physical emission mechanism (e.g. Kasliwal et al. 2017). Next, given the detection rate inferred from the Laser Interferometer Gravitational-Wave Observatory (LIGO) second observing run (O2) of R = 1540^{+3200}_{−1220} Gpc^{−3} yr^{−1} (Abbott et al. 2017a) and consistent predictions from population studies (e.g. Chruslinska et al. 2018), it is natural to search the GBM archival data for similar events that remained unrecognized as nearby mergers (e.g. Burgess et al. 2017). This includes the question of whether nearby binary neutron star mergers can be identified from the gamma-ray data alone, together with potential optical/near-infrared (NIR) follow-up. Finally, questions such as a clear separation from long-duration GRBs (based on hardness-duration, lags, and temporal properties), the physical interpretation of soft tails, or the relation to magnetars (e.g. Rowlinson et al. 2014) all require input from a homogeneously deduced sample of spectral parameters.

E-mail: jburgess@mpe.mpg.de

Past GBM spectral catalogues (e.g. Goldstein et al. 2012; Gruber et al. 2014; Yu et al. 2016) have utilized maximum likelihood methods to provide spectral properties of GRBs to the community. Herein, we invoke Bayesian analysis to extract both the temporal and spectral properties of short GRBs.
This allows for the injection of our prior beliefs about the properties of short GRBs that results in the ability to uniformly model the data across various photon functions and properly select between spectral models. Additionally, we provide the results of our analysis and data reduction to the community to encourage follow-up studies.
This paper is organized as follows. First, we describe the sample selection and data reduction. Next, we detail our spectral fitting procedure and the catalogue distributions. Finally, we briefly discuss the implications of our results.

SAMPLE SELECTION
The Fermi Science Support Center 1 (FSSC) provides public data from the Fermi mission including GBM burst data. 2 Additionally, the GBM public GRB burst catalogue 3 provides up-to-date durations and background selections for all triggers classified as GRBs since the beginning of the mission.
Using these data bases, we selected all GRBs with a T90 duration less than 7 s and retrieved the time-tagged event (TTE) data, response matrices, and background selections for detectors with a viewing angle less than 60° from the reported source location.
The time span of this selection is from 2008 July 14 to 2017 August 18. While we will eventually use GRBs with a duration less than 2 s, we obtain those with a longer T90 because we will use a different duration measure, as detailed in Section 3. GBM releases response matrices in two forms. Some GRBs have responses for a single time interval (RSP) and others have responses determined for multiple time intervals to account for the slewing of the spacecraft (RSP2). If RSP2 files are available, we utilize these over the RSP files. However, the short durations of these GRBs mean that there should be little difference between the use of RSPs and RSP2s.
The original sample includes 543 GRBs before data reduction. Because of issues with background selections and lack of significant signal, some GRBs were removed from the sample resulting in 321 short GRBs and 525 time-resolved spectra. The details of the data reduction are discussed in the following section.

METHODOLOGY
Following our work in Burgess et al. (2017) and Greiner et al. (2016), we apply a uniform methodology for background fitting, temporal binning, and time-resolved source selection. Each step is detailed in the following paragraphs.

Data reduction
For each GRB in our sample, an off-source background interval is selected using the intervals identified in the GBM online catalogue. Using these intervals, a series of four polynomials of increasing degree (from 0 to 3) is fit to the total rate in time. The fit likelihood is unbinned Poisson and, since the polynomials are nested models, a likelihood ratio test (LRT) is applied using the likelihoods of each polynomial fit to find the more parsimonious degree of polynomial. A threshold of 10 in the delta log-likelihood is used to determine whether an increase in polynomial degree is significant. With the degree determined, a polynomial of the same degree is fit to each of the 128 pulse height amplitude (PHA) channels to estimate the background model for the rate in that channel. The background is then integrated in time over each source interval. As the background estimation is the result of a maximum likelihood fit, the errors are assumed to be Gaussian distributed. Thus, when calculating the statistical error on the background for each channel, the covariance matrix of the background fit is propagated into the temporal integration, resulting in our background error, σ_b.
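The degree-selection step above can be sketched as follows. This is a simplified, binned stand-in for the catalogue's unbinned Poisson fit, intended only to illustrate the LRT-with-threshold logic; all function names are ours, not the pipeline's.

```python
import numpy as np
from scipy.optimize import minimize

def neg_poisson_loglike(coeffs, t, counts, dt):
    """Negative Poisson log-likelihood for a polynomial rate model,
    dropping the constant log(counts!) term."""
    expected = np.polyval(coeffs, t) * dt  # expected counts per bin
    if np.any(expected <= 0):
        return np.inf
    return np.sum(expected - counts * np.log(expected))

def select_background_polynomial(t, counts, dt, max_degree=3, threshold=10.0):
    """Accept a higher polynomial degree only if it improves the
    log-likelihood by more than `threshold` (LRT-style selection)."""
    best_deg, best_loglike, best_fit = 0, None, None
    for deg in range(max_degree + 1):
        start = np.polyfit(t, counts / dt, deg)  # least-squares starting point
        res = minimize(neg_poisson_loglike, start, args=(t, counts, dt),
                       method="Nelder-Mead")
        loglike = -res.fun
        if best_loglike is None or loglike - best_loglike > threshold:
            best_deg, best_loglike, best_fit = deg, loglike, res.x
        elif deg > best_deg:
            break  # degrees are accepted in order; stop once the gain is small
    return best_deg, best_fit
```

For a flat background, the delta log-likelihood gained by a linear trend is typically far below 10, so degree 0 is retained.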
With the background fitted, we apply Bayesian blocks (Scargle et al. 2013) to the temporally unbinned source interval for each detector (T0 − 5 s to T0 + 10 s) with a chance probability parameter of p0 = 0.01. The background model is used to shift the background from a non-homogeneous Poisson process to a homogeneous one. If no change points (significant changes of the Poisson rate) are inferred, the GRB is discarded from the sample. The detector light curve with the highest rate significance over the background is selected and its inferred change points are mapped to all other detectors. We note that the appropriate significance measure to use is via a likelihood ratio similar to that derived by Li & Ma (1983). However, that likelihood is derived for the significance of one Poisson rate over another. Since our determined background model possesses Gaussian errors, the appropriate likelihood ratio is that where we seek an excess over a Gaussian background rate. Thus, we determine significance via the method of Vianello (2018).
The intervals for spectral analysis are then selected by retaining all bins with a significance greater than 3σ. For each bin, the background model is integrated over the bin's time bounds, and the source and background are exported to PHA files. Similarly, if the GRB has an RSP2 file, a weighted response matrix is calculated and exported.

Spectral analysis
For each temporal bin, we fit both a Band function (Band, Matteson & Ford 1993; Greiner et al. 1995) and a cut-off power law (CPL) function. Both functions are parametrized in terms of a cut-off energy (E_cut) rather than a νFν-peak energy in order to reduce correlations between the peak and the low-energy spectral index (α). However, we compute the νFν-peak energy (E_p) and report it for comparison to previous results. We do not fit a simple power-law model to the data because we expect a spectral peak somewhere within the GBM energy range, as higher spectral peaks have never been observed. When power-law models are fit to short GRBs, it is typically found that the photon spectral index is ≳ −1.5, which would imply a peak outside the GBM spectral window (Gruber et al. 2014). As we discuss below, we mitigate this with our prior choices. For all parameters except the photon model normalization, we adopt weakly informative priors. Our priors are guided by past GRB catalogues, including GBM's, but are not derived from fits to those distributions. Rather, our priors encode the assumption that a spectral peak lies within the GBM energy window; thus, they do not come from the data used herein. Additionally, as we have essentially created new data via our temporal selections and proper background fitting, we are not in danger of violating any principles of Bayesian analysis. We choose a normal prior on the cut-off energy of both the Band and CPL functions centred at 200 keV and truncated below at 10 keV. Thus, unless the data are more informative than the prior, we impose that a spectral peak exists in the GBM spectral window.
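The CPL parametrization and the E_cut prior can be sketched as below. The CPL form and the relation E_p = (2 + α) E_cut are standard; the prior width (σ = 300 keV) is an illustrative guess, since the catalogue's actual value is not restated in this section.

```python
import numpy as np
from scipy.stats import truncnorm

def cutoff_power_law(energy, norm, alpha, e_cut, e_piv=100.0):
    """CPL photon spectrum N(E) = K (E / E_piv)^alpha exp(-E / E_cut)."""
    return norm * (energy / e_piv) ** alpha * np.exp(-energy / e_cut)

def vfv_peak(alpha, e_cut):
    """nuFnu peak energy of the CPL: E_p = (2 + alpha) * E_cut,
    defined only for alpha > -2 (otherwise there is no peak)."""
    return (2.0 + alpha) * e_cut

# Prior on E_cut as described in the text: a normal centred at 200 keV,
# truncated below at 10 keV. The scale (300 keV) is illustrative only.
mu, sigma, lower = 200.0, 300.0, 10.0
e_cut_prior = truncnorm((lower - mu) / sigma, np.inf, loc=mu, scale=sigma)
samples = e_cut_prior.rvs(size=10_000, random_state=1)
```

Working in E_cut rather than E_p decorrelates the low-energy index from the peak in the sampler; E_p is then a derived quantity computed per posterior sample.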
The following prior choices were used. The spectra are fit via a Poisson-Gaussian likelihood to account for the Poisson-distributed total counts and the Gaussian-distributed background estimate 4 (Arnaud, Smith & Siemiginowska 2011; Greiner et al. 2016). In this statistic, N is the number of spectral bins; M_i, S_i, and B_i are the predicted, detected, and background counts in the ith bin, respectively; σ_b,i is the Gaussian uncertainty on the background; t_s and t_b are the source and background durations; and f_i is a function of the background counts resulting from the profiling of the statistic. This profile likelihood removes the need for a background spectral model by essentially assuming a parameter in each spectral bin and profiling it out, i.e. maximizing the likelihood over the background model. This requires at least one background count in each spectral bin, and thus we bin the spectra to achieve this goal. However, this rarely reduces the number of bins by more than one or two.
To account for systematics in the GBM responses, we scale all responses except one by a normalization constant, a so-called effective area correction. A similar procedure was used in Yu et al. (2016). The GBM responses are claimed accurate to within 10 per cent (Bissaldi et al. 2009), and therefore we place a Cauchy prior centred at unity with a 10 per cent scale on these normalization constants. The Cauchy prior is informative in its tails but allows for some lack of certainty around the mean. These corrections are marginalized over in our spectral parameter posteriors. A constant, energy-independent correction likely does not account for all of the complex misspecification of the GBM photon-to-count conversion process present in the responses. Nevertheless, without access to the full spacecraft model 5 or a comprehensive GBM detector in-flight calibration paper, we must assume that a linear correction to the effective area will suffice.
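The effective-area correction scheme can be sketched as follows; the function names are ours, and this simply illustrates the role of the Cauchy prior and the fixed reference detector, not the catalogue's actual implementation.

```python
import numpy as np
from scipy.stats import cauchy

# Cauchy prior centred at unity with a 10 per cent scale, reflecting the
# quoted ~10 per cent response accuracy.
eff_area_prior = cauchy(loc=1.0, scale=0.1)

def apply_effective_area(model_counts, constants, fixed_index=0):
    """Scale each detector's predicted counts by its correction constant,
    leaving one reference detector fixed at unity to anchor the fit."""
    constants = np.asarray(constants, dtype=float).copy()
    constants[fixed_index] = 1.0
    return [c * m for c, m in zip(constants, model_counts)]

def log_prior(constants, fixed_index=0):
    """Log-prior contribution of the sampled constants (reference excluded)."""
    cs = np.delete(np.asarray(constants, dtype=float), fixed_index)
    return np.sum(eff_area_prior.logpdf(cs))
```

Because the constants are sampled jointly with the spectral parameters, their uncertainty is marginalized into the reported posteriors rather than fixed at a best-fitting value.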
Finally, to perform the spectral fit, we sample the posterior with the MULTINEST algorithm (Feroz, Hobson & Bridges 2009). For each fit, 600 live points are used. 6 MULTINEST ceases to sample when a tolerance on the marginal likelihood integral has been achieved; hence, we record the value of the marginal likelihood (Z) for each fit. Because of our use of informative priors (Bartlett 1957; Strachan & Dijk 2003), we can employ model selection between the Band and CPL functions via marginal likelihood ratios. This equates to choosing the more parsimonious model given the data and parameter values. 7 Model selection is performed in each time interval, allowing the spectral model to evolve within a burst.
All temporal and spectral analysis is carried out with the Multi-Mission Maximum Likelihood framework (3ML; Vianello et al. 2015). The results are stored in Flexible Image Transport System (FITS) 'analysis results' files that are readable by 3ML or any normal FITS reader. They contain information regarding the spectral model and the full posterior of all parameters. Each file can be used to fully set up the analysis for replication. Additionally, we propagate the spectral fit into an energy flux (F E ) calculation integrated over the 10 keV-1 MeV energy range resulting in a marginal distribution for the energy flux. Note that via the analysis result FITS files, the fluxes can be recomputed over any energy range.
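The propagation of the spectral fit into an energy flux can be sketched as below: the flux is evaluated per posterior sample, giving a marginal flux distribution rather than a single value with symmetric errors. The posterior arrays here are stand-ins, not catalogue values.

```python
import numpy as np

KEV_TO_ERG = 1.602176634e-9  # 1 keV in erg

def cpl_photon_flux(energy, norm, alpha, e_cut, e_piv=100.0):
    """CPL photon spectrum in photons / (cm^2 s keV)."""
    return norm * (energy / e_piv) ** alpha * np.exp(-energy / e_cut)

def energy_flux(norm, alpha, e_cut, e_lo=10.0, e_hi=1000.0, n_grid=500):
    """Energy flux F_E in erg cm^-2 s^-1 over [e_lo, e_hi] keV via
    trapezoidal integration of E * N(E) on a logarithmic grid."""
    e = np.geomspace(e_lo, e_hi, n_grid)
    integrand = e * cpl_photon_flux(e, norm, alpha, e_cut)
    trap = 0.5 * np.sum((integrand[1:] + integrand[:-1]) * np.diff(e))
    return trap * KEV_TO_ERG

# Evaluate the flux for every (stand-in) posterior sample to obtain the
# marginal distribution of F_E over 10 keV - 1 MeV.
rng = np.random.default_rng(3)
post = zip(rng.normal(0.01, 0.001, 1000),   # normalization
           rng.normal(-0.8, 0.1, 1000),     # alpha
           rng.normal(300.0, 30.0, 1000))   # E_cut in keV
flux_samples = np.array([energy_flux(n, a, c) for n, a, c in post])
```

Because the full posterior is stored in the analysis result files, the same integration can be redone over any energy range after the fact.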
In previous work, we have argued that a more complicated spectral fitting algorithm should be invoked to account for systematics in the locations of GRBs (see the BAyesian Location Reconstruction of GRBs, BALROG; Burgess et al. 2018) and that physical photon models provide better insight into the emission mechanism of GRBs (Burgess, Ryde & Yu 2015). However, work on the location systematics is ongoing and, as we note in Bégué et al. (2017), the emission mechanism of short GRBs requires further modelling. Therefore, with the confirmation of binary neutron star mergers as at least one progenitor of short GRBs, we find it pertinent to proceed with the classical photon models in order to provide the modelling community with an empirical view of the spectra. Additionally, we utilize the publicly available data products such as the standard responses.

5 The GBM mass model falls under the International Traffic in Arms Regulations (ITAR).
6 We tested using 200, 300, and 400 live points on several simulated spectra. 400 live points are sufficient for both functions used, but we increased to 600 to have robust results.
7 It is important to note that the use of the marginal likelihood in these settings has become increasingly scrutinized in the statistical literature (Vehtari & Ojanen 2012).

RESULTS
From our temporal binning and spectral fitting results, we provide the joint and individual parameter distributions for our sample. We consider the total duration of the emission to be the interval from the beginning of the first to the end of the last Bayesian block from our temporal analysis. We note that this is quite different from the typical T90 (Koshut et al. 1995) used for GRB durations, and thus we simply call this the duration of the GRB. Our duration does not account for the detector response, as it is calculated in count space; however, we do not compare our results to previous duration measures for any interpretation. Moreover, duration measures are somewhat arbitrarily performed in different energy ranges and differ across instruments. What is perhaps most important is how relative durations vary within a sample. A comparison of T90 and our duration is shown in Fig. 1. The deviation at short durations between the two measures is the result of many of the GBM T90 values being quantized to multiples of the temporal binning (0.064 s) of the CTIME data (Meegan et al. 2009). Our measure is computed on the unbinned TTE data and thus is limited only by the 2 μs clock of the GBM DPU. A detailed study of different duration measurement techniques is warranted but beyond the scope of this work.
In Burgess et al. (2017), we classified light curves into three classes: simple (consisting of only one significant bin), pulse like (consisting of several significant contiguous bins), and complex (consisting of non-contiguous significant bins). We will examine the parameter distributions of both the combined and individual classes below.
The Band function was selected over the CPL if 2 ln K > 10, where K = Z_Band / Z_CPL. With this criterion, only 12 of the time-resolved spectra were best fit by the Band function and 513 were best fit by the CPL function. We note that the threshold of 10 is a decision made by the authors and can easily be modified. Indeed, the Band function could be used for all fits, as the inability to identify a high-energy power law will result in a marginal distribution that integrates naturally into the other parameters.
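The selection rule is a one-liner on the log marginal likelihoods (ln Z) reported by the nested sampler; the function name below is ours.

```python
def select_model(log_z_band, log_z_cpl, threshold=10.0):
    """Return the preferred model name and 2 ln K, with K = Z_Band / Z_CPL.

    Inputs are log marginal likelihoods; the Band function is chosen only
    when 2 ln K exceeds the (adjustable) threshold of 10.
    """
    two_ln_k = 2.0 * (log_z_band - log_z_cpl)
    return ("Band" if two_ln_k > threshold else "CPL"), two_ln_k
```

With a different threshold the Band/CPL split of 12 versus 513 would shift accordingly, which is why the threshold is reported explicitly.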

Model checking
Before assessing the parameter distributions, it is important that we validate our fits to the data. Classically, GRB spectral catalogues have used various goodness-of-fit methods to quantify the validity of model fits to data. These typically involve measures of the fit maximum likelihood compared to the degrees of freedom in the model, i.e. an equivalent reduced χ². These measures suffer from two major statistical flaws (Andrae, Schulze-Hartung & Melchior 2010). First, reduced χ² measures are asymptotic. More importantly, they only apply to models linear in their parameters. Thus, it is not possible to invoke these methods to assess whether the fitted model can adequately predict the data. While quantitative fit assessment is preferred, when models are non-linear there are no well-established methods that give a single number from which we can be sure that our model is working. An alternative Bayesian method is that of posterior predictive checking (PPC; Gelman 2007; Betancourt 2015, 2018). The method uses the posterior of the model fit to generate replicated data from the likelihood. Therefore, one can assess the probability of future data predicted from the model given the observed data. Mathematically, π(y^rep | y) = ∫ dθ π(θ | y) π(y^rep | θ), where θ are the model parameters, y^rep are replicated data, and y are the observed data.
In order to check our fits, PPCs were generated for each time interval and each detector using 500 data replications from the posterior. Both the source and background probabilities were taken into account. The procedure selects a set of parameters from the posterior that defines a model spectrum; this model spectrum is then set for each detector, and simulated source and background counts are generated from the likelihood. An example is displayed in Fig. 2. We note that this is quite different from examining the residuals between data and model: residual examination investigates only a single point of the posterior and a single realization of the data variability, whereas PPCs integrate over both statistical processes.
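The replication step can be sketched as follows, with a deliberately toy model: every replicated spectrum is drawn from a different posterior sample, so the replications carry both parameter and counting uncertainty. All names and the two-parameter model are illustrative.

```python
import numpy as np

def posterior_predictive_counts(posterior_samples, expected_counts, n_rep=500,
                                rng=None):
    """Generate replicated count spectra: for each replication, draw a
    parameter vector from the posterior, compute the expected counts per
    channel, and sample Poisson counts from them."""
    if rng is None:
        rng = np.random.default_rng()
    idx = rng.integers(0, len(posterior_samples), size=n_rep)
    return np.array([rng.poisson(expected_counts(posterior_samples[i]))
                     for i in idx])

# Hypothetical two-parameter model mapped on to 8 detector channels.
rng = np.random.default_rng(7)
posterior = rng.normal([5.0, 1.0], 0.1, size=(2000, 2))
channels = np.arange(1.0, 9.0)

def expected_counts(theta):
    amplitude, index = theta
    return 100.0 * amplitude * channels ** (-index)

reps = posterior_predictive_counts(posterior, expected_counts, n_rep=500,
                                   rng=rng)
# The observed spectrum is then compared against, e.g., the central
# 95 per cent band of the replications in each channel.
lower = np.percentile(reps, 2.5, axis=0)
upper = np.percentile(reps, 97.5, axis=0)
```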
Quantitatively assessing multidimensional PPCs is not a well-defined procedure. For our purposes, we examined each fit and counted the number of times the observed count rate deviated from the 95th percentile of the predicted rates. If this occurred more than 10 times in several detectors, we considered the fit a poor representation of the data. We found that all spectral data were well modelled by the empirical functions used herein. All PPCs are included as online material.
The particularly interesting GRB 120323A (bn120323507) was modelled in the past with many different spectral components and combinations thereof (Guiriec et al. 2013). In our analysis, we have used objectively identified time intervals 8 and the proper data likelihood, and our PPCs show that in most intervals of this GRB the data are well modelled with a Band function (see online figures). The remaining intervals were well modelled with a cut-off power law. Thus, there is no need to introduce more spectral components in this analysis.

Parameter distributions
The parameter distributions 9 are displayed in Fig. 3 and are listed in Appendix A. The distributions have asymmetric tails and deviate from our assumed, weakly informative priors (see Fig. 4). This is not unexpected, as our data reduction pipeline, likelihood, and general methodology differ from past approaches, resulting in different measured spectra and backgrounds. Different choices of priors and interval selections will likely affect the results reported here. However, there is no motivation to adopt an improper likelihood, which would bias intervals with fewer source counts, when the proper likelihood is known. The asymmetric tails arise from the priors influencing weaker data and thus could disappear with more sensitive measurements by future instruments.
We have additionally emphasized the parameters of GRB 170817A to demonstrate its displacement from the mean of the parameter distributions. This GRB's signal is weak and its parameters are more difficult to determine, as demonstrated in Fig. 5, which displays the conditional distributions of the spectral fit.
We also examine interburst parameter correlations for both the full sample (see Fig. 6) and all light-curve structure subclasses (see Fig. 7). For each distribution, a colour scale indicating the duration of each GRB is included. Note that we combine both GRBs best modelled by the Band function and the CPL in these distributions and indicate with a triangle those parameters coming from a Band function. We do not attempt a rigorous analysis of correlations in this work because it is beyond our current scope. Such studies require proper accounting of selection effects.

SUMMARY
We have presented the first Bayesian Fermi-GBM short GRB catalogue. With the advent of the multimessenger era of astronomy, modern Bayesian methodology provides a path to rigorous and sophisticated analyses. Using the posterior distributions from our catalogue allows for non-linear error propagation of our results into further population and emission modelling studies. Compared to the asymmetric errors listed in previous catalogues, the propagation of these posterior samples is mathematically well founded. Our choice of priors is subjective, but mainly allows us to incorporate our knowledge of high signal-to-noise ratio spectra into weaker spectra and to enforce our belief that spectra have a cut-off in the GBM energy range. We have checked that this does not bias our results for bright spectra, where the data become more informative than the prior. Nevertheless, as detailed in the following section, we provide our spectral data so that our results can be replicated and prior choices modified as seen fit.
The models used herein are empirical and cannot be interpreted beyond their general shape. They serve to predict the data as a function of energy. Physical modelling is the only way to obtain deeper insight into the mechanisms generating the spectra. We have not attempted this in the current work. Thus, we caution against over interpretation of the results. Detailed physical modelling or modelling with multiple components (e.g. Guiriec et al. 2010) may indeed produce results that do not fall under the assumptions we have made. However, through our use of PPCs, we have verified that the assumptions model the data adequately.
With the confirmed association of short GRBs with neutron star mergers, this catalogue is aimed at providing context for the observed spectral parameters of GRB 170817A. Its emission possesses a comparatively soft spectral slope relative to other GRBs. While we do not speculate on the physical implications of this finding, we simply note that these spectral differences may have implications for future follow-up of short GRBs, such as which spectral parameters are interesting. Further investigations of archival data are needed to understand the full context of this GRB.

Methodology differences
Our results differ from past catalogues, and thus we find it important to detail the differences in our methodology relative to the past GBM catalogues. Our background fitting uses a Poisson maximum likelihood method, unlike the standard GBM fitting procedure in RMFIT 10 that performs a weighted least-squares minimization under an assumption of Gaussianity of the counts. As noted in Greiner et al. (2016), the method in RMFIT is improper and can lead to biased results when the number of counts in a single PHA channel is near zero. Another key difference in our analysis is the likelihood used for spectral fitting. Our use of a Poisson likelihood conditional on a Gaussian background estimate is self-consistently motivated. However, in past GBM catalogues (Goldstein et al. 2012; Gruber et al. 2014; Yu et al. 2016) and current spectral analysis (Goldstein et al. 2017), a simple Poisson likelihood is applied to the background-subtracted data. As demonstrated in Greiner et al. (2016), this is improper: while the total counts in GBM are Poisson distributed, background-subtracted data are not, and this is simply not the appropriate likelihood for the data. Finally, our model selection criteria are vastly different from past catalogues. Setting aside the improper likelihood used in past catalogues, consider how the maximum likelihood value is applied. Typically, the ratio of these likelihoods between different spectral models is used to decide which model best fits the data. This is an invocation of Wilks' theorem (Wilks 1938). However, Wilks' theorem does not apply to non-nested or non-linear models. Therefore, simulations of the null distribution (or null hypothesis) must be performed to assess the confidence level at which the simpler or null model can be rejected. This must be done for all spectral fits when the regularity conditions of Wilks' theorem are not met (Protassov et al. 2002).
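The simulation-based calibration that Wilks' theorem would otherwise replace can be sketched as follows. The "null" here is a constant Poisson rate and the "alternative" adds one free slope parameter; both models and all names are illustrative, not the spectral models of the catalogue.

```python
import numpy as np
from scipy.stats import poisson

def loglike_const(counts):
    """Maximum Poisson log-likelihood for a constant-rate (null) model."""
    return poisson.logpmf(counts, counts.mean()).sum()

def loglike_linear(counts):
    """Approximate maximum log-likelihood for a linear-trend alternative,
    via a crude grid search over the slope (slope = 0 recovers the null)."""
    x = np.arange(counts.size, dtype=float)
    best = -np.inf
    for slope in np.linspace(-0.5, 0.5, 51):
        lam = counts.mean() + slope * (x - x.mean())
        if np.all(lam > 0):
            best = max(best, poisson.logpmf(counts, lam).sum())
    return best

def simulate_null(lam0, n_bins, n_sims=200, rng=None):
    """Distribution of the LRT statistic 2*Delta(log L) under the null,
    built by fitting both models to data simulated from the null."""
    if rng is None:
        rng = np.random.default_rng()
    sims = rng.poisson(lam0, size=(n_sims, n_bins))
    return np.array([2.0 * (loglike_linear(s) - loglike_const(s))
                     for s in sims])

null = simulate_null(lam0=20.0, n_bins=32, rng=np.random.default_rng(11))
critical = np.percentile(null, 99.0)  # reject the null above this value
```

The observed statistic is compared against this simulated distribution rather than against a χ² quantile, which is only valid under the regularity conditions discussed above.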
We have avoided this issue via the use of the marginal likelihood as a model selection tool, though we note that there is great debate in statistical literature about the use of the marginal likelihood to select models.

Data availability
To encourage replication and follow-up studies, we provide a variety of data products from this study to the community. The raw spectral and background bins are provided as PHA files readable by both 3ML and XSPEC. 11 The spectral results are included and can be read using 3ML's load_analysis_results function. Additionally, we include the pre-computed F_E marginal distributions. Finally, machine-readable summary tables for the time-resolved and peak-flux spectral results are released. 12

SUPPORTING INFORMATION
Supplementary data are available at MNRAS online.

ppc.pdf
Please note: Oxford University Press is not responsible for the content or functionality of any supporting materials supplied by the authors. Any queries (other than missing material) should be directed to the corresponding author for the article.

APPENDIX A: PARAMETERS
The following table (Table A1) lists the best-fitting model and