-
PDF
- Split View
-
Views
-
Cite
Cite
E A K Cohen, A J Gibberd, Wavelet spectra for multivariate point processes, Biometrika, Volume 109, Issue 3, September 2022, Pages 837–851, https://doi.org/10.1093/biomet/asab054
- Share Icon Share
Summary
Wavelets provide the flexibility for analysing stochastic processes at different scales. In this article we apply them to multivariate point processes as a means of detecting and analysing unknown nonstationarity, both within and across component processes. To provide statistical tractability, a temporally smoothed wavelet periodogram is developed and shown to be equivalent to a multi-wavelet periodogram. Under a stationarity assumption, the distribution of the temporally smoothed wavelet periodogram is demonstrated to be asymptotically Wishart, with the centrality matrix and degrees of freedom readily computable from the multi-wavelet formulation. Distributional results extend to wavelet coherence, a time-scale measure of inter-process correlation. This statistical framework is used to construct a test for stationarity in multivariate point processes. The methods are applied to neural spike-train data, where it is shown to detect and characterize time-varying dependency patterns.
1. Introduction
The process |$N(t)$| is second-order stationary, henceforth referred to simply as stationary, if |$\lambda(t)$| is constant for all |$t$| and |$\Gamma(t,s)$| depends only on |$\tau = s-t$|. In this setting we will denote the covariance density matrix by |$\Gamma(\tau)$|.
This provides a normalized measure on |$[0,1]$| of the correlation structure between the processes in the frequency domain. For time-series data, coherence has been used extensively in several disciplines, including climatology, oceanography and medicine. For event data, it has served as an important tool in neuroscience for the analysis of neuron spike-train data.
Estimation of the coherence can be achieved by substituting smoothed spectral estimators into (1). Failure to smooth, i.e., simply using the periodogram, will result in a coherence estimate of 1 for all frequencies, irrespective of whether correlation exists between the pair of processes or not. Tractability of the coherence estimator’s distribution is crucial for principled statistical testing and is dependent on the smoothing procedure used (Walden, 2000).
Often, stochastic processes do not conform to the assumptions of stationarity. This may be due to simple first-order trends in the underlying data-generating process or, more typically, complex changes in its second- or higher-order structure. For point processes, the key objective is to analyse how correlations within and across component processes change over time. One approach is to model the process and its associated time-varying spectrum, with the locally stationary Hawkes process (Roueff et al., 2016; Roueff & Von Sachs, 2019) representing an important recent advancement. The present paper complements this by developing a versatile nonparametric approach for exploring second-order structures in a time-localized way. Wavelets form a natural basis with which to do this because of their inherent ability to trade off time and frequency resolution. Wavelet methods designed for time-series analysis (e.g., Nason et al., 2000) could be deployed on discrete-time count sequences. However, should precise timestamps for the events be available, the necessary binning procedure required to implement them would discard information. This motivates the development of continuous-time methods that preserve all temporal information.
There are a wide range of ways in which nonstationarity can occur. Hence, rather than assume a specific model of nonstationarity, we propose to study the properties of the temporally smoothed wavelet periodogram and coherence for stationary point processes. This provides a statistical framework in which we deploy methods for exploratory data analysis, and construct a test for second-order stationarity, complementing stationarity tests in time-series analysis (e.g., Von Sachs & Neumann, 2000; Nason, 2013; Preuss et al., 2013).
2. Temporally smoothed wavelet periodogram
2.1. Formulation
The wavelet |$\psi(t)$| is a real- or complex-valued continuous function that satisfies (i) |$\int_{-\infty}^{\infty}\psi(t)\,{\mathrm d} t =0$|, (ii) |$\|\psi\|_2 = 1$|, and (iii) the admissibility condition |$\int_{-\infty}^\infty f^{-1}|\Psi(f)|^2 \,{\mathrm d} f<\infty$| where |$\Psi(f)$| is the Fourier transform of |$\psi(t)$|.
The smoothing function |$h(t)$| is a nonnegative symmetric function supported and continuous on |$(-1/2,1/2)$| and normalized such that |$\int_{-\infty}^{\infty}h(t)\,{\mathrm d} t=1$|.
Given a choice for |$h(t)$| and |$\kappa$|, the form of |$K(s,t)$| will depend on |$\psi(t)$|. Throughout this paper, we use the complex-valued Morlet wavelet and the real-valued Mexican hat wavelet as examples, since they are wavelets for which |$K(s,t)$| is analytically tractable.
2.2. Practical implementation
For example, the Morlet wavelet |$\psi(t) = \pi^{-1/4}\exp({-t^2/2})\exp({{\mathrm i}\, 2\pi t})$| shown in Fig. 1 has infinite support, but can be well approximated by |$\bar\psi(t)$| for |$\alpha=8$|. In practice, to speed up computation, it can make sense to use the approximating wavelet, as only a subset of the data is required to compute the wavelet transform. Hereafter, to simplify notation, we will use |$\psi(t)$| to represent both the original and the approximating wavelets, assuming that |$\alpha$| is chosen appropriately.

(a) The Morlet wavelet, showing its real (solid) and imaginary (dashed) parts; (b) the valid region for analysis, |$\mathcal{T}_{\alpha,\kappa,T}$|, plotted with time |$b$| on the horizontal axis and scale |$a$| on the vertical axis, as is convention.
In a finite data setting, we are restricted to regions of the time-scale space in which we can fairly evaluate (2) without the consequences of edge effects at either end of the data. These issues are compounded when smoothing across time; for a smoothing window |$h_\kappa(t)$| with |${\mathrm{supp}}(h) = (-\kappa/2,\kappa/2)$|, the effective size of the support for |$K(s,t)$| is |$\alpha+\kappa$|, so we restrict ourselves to values of |$a$| and |$b$| for which |${\mathrm{supp}}(K_{a,b})= (b-a(\alpha+\kappa)/2,\,b+a(\alpha+\kappa)/2)\times(b-a(\alpha+\kappa)/2,\, b+a(\alpha+\kappa)/2)\subseteq (0,T]\times (0,T]$|. This defines an isosceles triangle |$\mathcal{T}_{\alpha,\,\kappa,\,T}\subset\mathbb{R}^2$| with vertices |$(0,0)$|, |$(0,T)$| and |$(a_{\mathrm{max}}(T),T/2)$|, where |$a_{\mathrm{max}}(T) = T/(\alpha+\kappa)$|. This is an adaptation of the cone of influence of Mallat & Peyré (2008, p. 215) that also mitigates for smoothing distances. In practice, a positive minimum value of |$a$| should be imposed to ensure that a reasonable amount of event data exists in the smoothing range.
3. Multi-wavelet representation
3.1. Formulation
Since the kernel |$K(s,t)$| is continuous and nonnegative definite by construction, we can associate with |$K(s,t)$| a Hermitian linear operator |$T_K$|, defined as |$ [T_K g](s) = \int_{-\infty}^{\infty}K(s,t)g(t)\,{\mathrm d} t. $| It follows from Mercer’s theorem (Mercer, 1909) that |$ K(s,t) = \sum_{l=0}^{\infty}\eta_{l} \varphi_{l}(s)\varphi^{*}_{l}(t) $|, where |$\{\varphi_{l}(t),\, l=0,1,\ldots\}$| are the orthonormal eigenfunctions of |$T_K$| with nonzero eigenvalues |$\{\eta_{l},\, l=0,1,\ldots\}$| ordered by decreasing size. Because |${\mathrm{tr}} (T_K)= \int_{-\infty}^{\infty}K(t,t)\,{\mathrm d} t = 1$|, it follows that |$\sum_{l=0}^{\infty} \eta_{l} = 1$|. From here on, we refer to |$\{\varphi_{l}(t),\, l=0,1, \ldots\}$| as the eigenfunctions of |$K(s,t)$|. The following proposition shows that these orthonormal eigenfunctions are themselves wavelets.
Suppose that |$\psi(t)$| satisfies Assumption 1, |$h(t)$| satisfies Assumption 2, and for |$\kappa>0$| the corresponding nonnegative-definite kernel |$K(s,t)$| has eigenfunctions |$\{\varphi_{l}(t),\, l=0,1,\dots \}$|. Every eigenfunction |$\varphi_{l}(t)$| with a nonzero eigenvalue is a wavelet that satisfies the conditions of Assumption 1.
We call the functions |$\{\varphi_l(t), \, l=0,1,\ldots\}$| eigen-wavelets.
The representation in (5) will be crucial for deriving the distributional results in § 4, as well as affording computational speed-up. In particular, we will make use of the following proposition, which shows that the effective frequency response of the eigen-wavelet system is equal to the frequency response of the generating wavelet |$\psi(t)$|.
Suppose that |$\psi(t)$| satisfies Assumption 1, |$h(t)$| satisfies Assumption 2, and for |$\kappa>0$| the corresponding nonnegative-definite kernel |$K(s,t)$| has eigenfunctions |$\{\varphi_{l}(t),\, l=0,1, \ldots\}$| and eigenvalues |$\{\eta_l,\,l=0,1,\ldots\}$|. Then |$\sum_{l}\eta_{l}|\Phi_{l}(f)|^{2}=|\Psi(f)|^{2}$|, where |$\Phi_l(f)$| and |$\Psi(f)$| are the Fourier transforms of |$\varphi_l(t)$| and |$\psi(t)$|, respectively.
In general, closed-form expressions for the eigen-wavelets |$\{\varphi_l(t),\,l=0,1,\ldots\}$| will be unobtainable, and numerical procedures need to be used to find the solutions of |$ \int_{-\infty}^\infty K(s,t)\varphi(t)\,{\mathrm d} t = \eta\varphi(s). $| Details of an implementation of the Nystrom method for doing just this can be found in the Appendix.
3.2. Worked example

The kernel and first five eigen-wavelets for the Morlet wavelet in (a) and (b), and for the Mexican hat wavelet in (c) and (d), subject to a rectangular smoothing window of width |$\kappa=10$|. In panel (b) the solid and dashed lines represent the real and imaginary components, respectively.
The function |$k(s,t)$| is itself a real-valued nonnegative kernel with its own set of real-valued orthonormal eigenfunctions |$\{\phi_l(t),\,l=0,1,\ldots\}$| and associated eigenvalues |$\{\eta_l,\,l=0,1,\ldots\}$|. It follows that |$\varphi_l(t) = \exp({{\rm i}\, 2\pi t})\phi_l(t)$| is an eigenfunction of |$K(s,t)$| with corresponding eigenvalue |$\eta_l$|, and hence |$\{\varphi_l(t)=\exp({{\rm i}\, 2\pi t})\phi_l(t),\, l=0,1,\ldots\}$| is the eigen-wavelet system emitted by the Morlet wavelet with a rectangular smoothing function. The first five of these eigen-wavelets for |$\kappa=10$| are shown in Fig. 2(b). This eigen-wavelet system is in the same spirit as the generating Morlet wavelet, with the eigen-wavelets themselves being complex sinusoids enveloped by a taper. Thus, performing a continuous wavelet transform with one of the eigen-wavelets is equivalent to a time-localized tapered Fourier transform evaluated at frequency |$1/a$|, and the temporally smoothed wavelet periodogram as represented in (5) is equivalent to a time-localized multi-taper spectral estimator. For comparison, the kernel and associated eigen-wavelets of the Mexican hat wavelet using a rectangular smoothing function are shown in Fig. 2 panels (c) and (d), respectively.
4. Statistical properties under stationarity
4.1. Preliminaries
The following mixing condition (Assumption 2.2 in Brillinger, 1972) is sufficient for the asymptotic results that follow. It ensures that the dependency structure in the point process decays at a sufficient rate for central limit arguments to be invoked.
The distributional results that follow differ slightly depending on whether a real-valued wavelet, such as the Mexican hat wavelet, or a complex-valued wavelet, such as the Morlet wavelet, is chosen. We present the results for a complex-valued wavelet and relegate the derivation for a real-valued wavelet to the Supplementary Material. For wavelet |$\psi(t)$| with Fourier transform |$\Psi(f)$|, its central frequency is defined as the first moment of |$|\Psi(f)|^2$|, namely |$f_{0}=\int_{0}^{\infty}f|\Psi(f)|^{2}\,{\mathrm d} f$| (Cohen & Walden, 2010a). The central frequency of |$\psi_{a,b}$| is therefore |$f_0/a$|, and can be interpreted as the central analysing frequency of the wavelet at scale |$a$|. For example, the Morlet wavelet has a central frequency of |$f_0 = 1$|, and the Mexican hat wavelet has a central frequency of approximately |$f_0 = 0.21$|. We further define the frequency concentration of |$\psi(t)$| to be the second central moment of |$|\Psi(f)|^2$|, namely |$\sigma_\psi = \|(f-f_0)\Psi(f)\|_2$|. It immediately follows from Proposition 2 that the central frequency and frequency concentration of the eigen-wavelet system are |$f_0$| and |$\sigma_\psi$|, respectively.
The wavelet |$\psi(t)$| is complex-valued, satisfies Assumption 1, has approximating support |$(-\alpha/2,\alpha/2)$| for some finite |$\alpha>0$|, and is such that |$\sigma_\psi<\infty$|. Furthermore, there exists a finite |$C$| such that |$\int|\psi(t+u)-\psi(t)|\,{\mathrm d} t<C|u|$| for all real |$u$|, and |$\psi$| is orthogonal to its complex conjugate, i.e., |$\int_{-\infty}^{\infty}\psi(t)\psi(t)\,{\mathrm d} t = 0$|.
The Morlet wavelet is an example of a complex-valued wavelet that satisfies Assumption 4.
The smoothing function |$h(t)$| satisfies Assumption 2 and, moreover, there exists a finite |$C'$| such that |$\int|h(t+u)-h(t)|\,{\mathrm d} t<C'|u|$|.
4.2. Asymptotic distributional results
For any |$T$|, the valid region of analysis is normalized to |$\tilde{\mathcal{T}}_{\alpha,\,\kappa}$|, an isosceles triangle with vertices |$(0,0)$|, |$(0,1)$| and |$(1,1/2)$|, whose interior contains all valid pairs of |$(\tilde a,\tilde b)$|. Asymptotic results are presented for any fixed point |$(\tilde a,\tilde b)\in \tilde{\mathcal{T}}_{\alpha,\kappa}$| as |$T\rightarrow\infty$|. In doing so we define the frequency |$f_{\tilde a} = f_0^{\scriptscriptstyle{(T)}}/\tilde a$|, where |$f_0^{\scriptscriptstyle{(T)}}=\int_{0}^{\infty}f|\Psi^{\scriptscriptstyle{(T)}}(f)|^{2}\,{\mathrm d} f$| with |$\Psi^{\scriptscriptstyle{(T)}}(f)$| denoting the Fourier transform of |$\psi^{\scriptscriptstyle{(T)}}(t)$|.
In the following theorem, |$\mathcal{N}^{\mathcal{C}}_p(\mu,\Sigma)$| denotes the circular |$p$|-dimensional complex normal distribution with mean |$\mu$| and covariance matrix |$\Sigma$|.
Let |$N(t)$| be a |$p$|-dimensional stationary process satisfying Assumption 3 with spectral density matrix |$S(f)$|, and let |$\psi(t)$| be a wavelet with central frequency |$f_0$| satisfying Assumption 4. The continuous wavelet transform |$w^{\scriptscriptstyle{(T)}}(\tilde a,\tilde b)$| is asymptotically |$\mathcal{N}^{\mathcal{C}}_p\{0,S(f_{\tilde a})\}$| as |$T\rightarrow\infty$|, for all |$(\tilde a,\tilde b)\in\tilde{\mathcal{T}}_{\alpha,\kappa}$|.
Let |$\mathcal{W}^{\mathcal{C}}_p(n,\Sigma)$| denote the |$p$|-dimensional complex Wishart distribution with |$n$| degrees of freedom and centrality matrix |$\Sigma$|.
Let |$N(t)$| be a |$p$|-dimensional stationary process satisfying Assumption 3 with spectral density matrix |$S(f)$|. Let |$\psi(t)$| be a wavelet with central frequency |$f_0$| satisfying Assumption 4, let |$h(t)$| be a smoothing function satisfying Assumption 5, and for |$\kappa>0$| let |$\{\eta_l,\,l=0,1,\ldots\}$| be the eigenvalues of the kernel |$K(s,t)$| defined in (4). The temporally smoothed wavelet periodogram |$\Omega^{\scriptscriptstyle{(T)}}(\tilde a,\tilde b)$| is asymptotically |$(1/n)\mathcal{W}^{\mathcal{C}}_p\{n,S(f_{\tilde a})\}$| as |$T\rightarrow\infty$| for all |$(\tilde a,\tilde b)\in \tilde{\mathcal{T}}_{\alpha,\kappa}$|, where |$n=1/(\sum_{l=1}^\infty\eta_l^2)$|.
The following distributional result for the wavelet coherence is now immediate from Theorem 2 and Goodman (1963). Let |$_2 F_1(\alpha_1,\alpha_2;\beta_1;z)$| denote the hypergeometric function with 2 and 1 parameters |$\alpha_1$|, |$\alpha_2$|, |$\beta_1$| and scalar argument |$z$|.
In the case of the rectangular smoothing function given in (6), the effective degrees of freedom |$n$| scales linearly with |$\kappa$| according to the following proposition.
Let |$\psi(t)$| satisfy Assumption 4, let |$h(t)$| be the rectangular smoothing function given in (6), and for |$\kappa>0$| let the corresponding kernel |$K(s,t)$| have ordered eigenvalues |$\{\eta_{l},\, l=0,1,\dots \}$|. If |$\kappa>\alpha$|, then |$ n = (\sum_{l=0}^{\infty}\eta_l^2)^{-1} = \kappa \{\int^{\infty}_{-\infty} |\mathcal{P}(x)|^2\,{\mathrm d} x\}^{-1} $|, where |$\mathcal{P}(x) = \int_{-\infty}^\infty\psi(t)\psi^*(t-x)\,{\mathrm d} t$|.
5. Test for stationarity
5.1. Formulation
We construct a likelihood ratio test based on the asymptotic distribution of |$\Omega^{\scriptscriptstyle{(T)}}(\tilde a,\tilde b)$| stated in Theorem 1. To obtain asymptotic results we require the smoothing parameter |$\kappa$|, and hence the degrees of freedom |$n$| given in Proposition 4, to grow with |$T$|. Furthermore, we permit the maximum resolution of analysis |$J$| to also grow.
We have |$\kappa = O(T^c)$| and |$2^{J} = o(T^{1/2-c})$| for some |$c\in(0,1/2)$|.
Furthermore, when |$H$| is true, |$-2\log \{U(K)\}$| is asymptotically |$\chi^2_{f}$| as |$n\rightarrow\infty$|, where |$f = (K-1)p^2$|.
In the following proposition, we let |$\tilde U_j$| be a random variable that is equal in distribution to |$U(2^j)$| under the null hypothesis.
Suppose that Assumptions 3–6 hold. For a fixed |$j\in\{1,\ldots,J\}$|, under |$H_j$| we have |$\mathrm{pr}\{-2\log (V_j)\leq x\} = \mathrm{pr}(\chi^2_{\nu_j}\leq x) + o(1)$| as |$T\rightarrow\infty$|, where |$\nu_j = (2^j-1)p^2$|.
With the following assumption, we can combine the test statistics for |$H_1,\dots ,H_J$| to form |$V= \prod_{j=1}^J V_j$| with which to test |$H_0$|.
The wavelet |$\psi(t)$| satisfies |$\int_{-\infty}^{\infty}\psi_{j,k}^{\scriptscriptstyle{(T)}}(t)\psi_{l,m}^{*\scriptscriptstyle{(T)}}(t)\,{\mathrm d} t = 0$| for all |$(j,k)\neq(l,m)$|, where |$\psi^{\scriptscriptstyle{(T)}}_{j,k}(t)$| denotes the wavelet at the |$j$|th scale and |$k$|th translation (|$j=1,\ldots,J$|; |$k=1,\ldots,K$|).
We remark that this is only an approximation for the Morlet wavelet.
Suppose that Assumptions 3–7 hold. Under |$H_0$| we have |$\mathrm{pr}\{-2\log (V)\leq x\} = \mathrm{pr}(\chi^2_{\nu}\leq x) + o(1)$| as |$T\rightarrow\infty$|, where |$\nu = \sum_{j=1}^{J}\nu_j = p^2(2^{J+1}-2-J)$|.
5.2. Simulations
The following simulations all use a Morlet wavelet, and tests are conducted on |$H_1$|, |$H_2$|, |$H_3$| and the combined hypothesis |$H_0$| at the |$\alpha = 0.05$| level.
To demonstrate that the test asymptotically attains the nominal level, we consider a pair of independent homogeneous Poisson processes, each with intensity |$\lambda=1$|, and increase |$T$|. We let the smoothing parameter |$\kappa$| grow as |$\kappa = 10\, T^{1/4}$|. The Type I error rate of the tests as a function of |$T$| is shown in Fig. 3(a). The effective support of a smoothed wavelet periodogram at scale |$j$| is |$2^{-j}T$|; hence, convergence is fastest at |$j=1$|.

Results for the simulation studies in § 5.2: (a) Type I error rate for a pair of independent homogeneous Poisson processes; (b) power as a function of |$\theta$|, the slope of the intensity function for the pair of independent inhomogeneous Poisson processes; (c) power as a function of |$\alpha_{12}$|, and |$\alpha_{21}$|, the mutual excitation parameter in the second segment of the piecewise-stationary bivariate Hawkes process; (d) power as a function of |$\delta_0$|, the magnitude of the excitation delay effect in the locally stationary Hawkes process. Tests are performed at the |$\alpha=0.05$| level (grey solid line) on hypotheses |$H_1$| (dashed), |$H_2$| (dotted) and |$H_3$| (dot-dash), and on the combined hypothesis |$H_0$| (solid). In all cases, 1000 realizations of the processes were used.
To study the power with respect to first-order nonstationarity, we consider a pair of independent inhomogeneous Poisson processes. The first has a linearly increasing intensity function |$\lambda_1(t) = \tan(\theta) t + 1 - (T/2)\tan(\theta)$|; the second is linearly decreasing with intensity function |$\lambda_2(t) = \tan(-\theta) t + 1 - (T/2)\tan(-\theta)$|. We vary |$\theta$|, the slope of the intensity functions, between |$-\!\arctan(2/T)$| and |$\arctan(2/T)$|; the process is stationary if and only if |$\theta = 0$|. This model ensures that the expected number of events in |$(0,T]$| is the same for all |$\theta$|. We set |$T = 1024$| and again take |$\kappa = 10\, T^{1/4}$|, resulting in 22 degrees of freedom. The power is plotted in Fig. 3(b) as a function of |$\theta$|. As expected, the power increases with |$|\theta|$|, with the nominal level attained for small |$j$| at |$\theta =0$|.
To study the power with respect to second-order nonstationarity, we consider a bivariate piecewise-stationary Hawkes process made up of three segments; see the Appendix for details. The first and third segments are the time intervals |$(0,256]$| and |$(768,1024]$|, respectively. On these segments there exist a bivariate process with baseline intensities |$\nu_1=\nu_2 = 1$|, and an exponential excitation kernel with excitation parameter |$\alpha_{11} = \alpha_{22} = 1$| and decay parameter |$\beta_{11} = \beta_{22} = 2$|; there is no mutual-excitation between the processes, i.e., |$\alpha_{12} = \alpha_{21} = 0$|. On the second segment, namely the time interval |$(256,768]$|, there exists a bivariate Hawkes process with self-excitation parameters identical to those for the first and third segments, but now with mutual excitation between the individual processes. The decay parameters are set at |$\beta_{12} = \beta_{21} = 2$|, and we examine the power as |$\alpha_{12}$| and |$\alpha_{21}$| are increased together to introduce increasing levels of mutual excitation. This leads to increasing deviation from the distribution of the process in the first and third segments. The baseline intensity on |$(256,768]$| is decreased accordingly to ensure that the expected number of events is constant across the entirety of the time interval |$(0,1024]$|. Fig. 3(c) shows how the power increases with the value of |$\alpha_{12}$| in the second segment, i.e., as the distribution of the process in the second segment deviates further from that of the first and third segments.
This excitation kernel only allows an offspring event to be generated after a delay of |$\delta(\cdot)$|, which varies as a function of time to introduce nonstationary second-order structure. In this example we set |$\delta(s)=1+\delta_0|1-2s|$|; when |$\delta_0=0$| the process is stationary, and we can introduce time-varying second-order behaviour of increasing magnitude by increasing |$\delta_0$|. The power as a function of |$\delta_0$| is plotted in Fig. 3(d). As expected, the power increases with |$\delta_0$|, and the nominal level is attained at |$\delta_0 =0$|.
5.3. Practical implementation
In practice, one must make a choice of |$\kappa$| and |$J$| that is suitable for both the dataset under analysis and the time scales on which one wishes to detect deviations from stationarity, while accounting for traditional trade-offs. Assumption 6 requires the growth rates of |$\kappa$| and |$J$| to be balanced such that |$2^J\kappa = o(T^{1/2})$|. If analysing variation on small time scales, one benefits from lowering |$\kappa$| and increasing |$J$|, at the cost of asymptotic limits being less appropriate. Conversely, asymptotic limits can be practically attained at the cost of losing temporal resolution. Therefore, a universal heuristic for choosing |$\kappa$| and |$J$| does not exist. If attaining the nominal level of the test is important, experiments with independent Poisson processes over a duration and intensity indicative of the data can be used to select values of |$\kappa$| and |$J$|. Example results for experiments of this type are presented in the Supplementary Material.
6. Application to neural signalling
To illustrate the methods in practice, we analyse signalling regions within the lateral geniculate nucleus of a mouse. Specifically, we consider a set of neurons studied by Tang et al. (2015), who were primarily concerned with analysing firing properties in order to understand how visual signals are encoded and transmitted throughout the brain. To demonstrate the ability of our smoothed coherence estimator and stationarity test to operate with a single trial, we consider only a single firing sequence from Tang et al. (2015). In this case, the mouse was given a visual stimulus in the form of a liquid crystal display screen, showing a sinusoidal monochromatic drifting grating with spatial stimulus at a frequency of 0.04 cycles per second and a temporal flicker of 1 Hz. The firing pattern is 7 s in length and represents data for cells 108 and 117, which we selected for our example because they demonstrate relatively high firing rates. We use the Morlet wavelet with temporal smoothing parameter |$\kappa=10$| and approximating support parameter |$\alpha=8$|. For completeness, this example was performed using exact kernel sampling; however, an approximate computation based on the Nystrom method gave visually indistinguishable results and |$p$|-values; see the Appendix.
The results of the analysis are displayed in Fig. 4. Tests for stationarity were performed at scale levels |$j=1$| and |$2$|, informed by the analysis in the Supplementary Material, with dyadic sampling points marked by the crosses. With a |$p$|-value of 0.032, there is strong evidence that the process demonstrates nonstationary behaviour at the coarsest scale |$j=1$|, corresponding to a scale of 0.25 s and a frequency of 4 Hz through the |$f=1/a$| relationship for a Morlet wavelet. However, there is not enough evidence to reject the hypothesis that the wavelet spectrum stays constant at the finer scale |$j=2$|. With the parameters specified, the 95th percentile of the distribution for zero coherence is 0.593 and is represented by the black contour line. This indicates that the nonstationarity at |$j=1$| involves a change in the correlation between the two processes half-way through the experiment, with significant coherent signalling becoming present in the latter half.

Wavelet coherence estimated from the mouse neuron firing data with the Morlet wavelet. Black lines represent contours drawn at the 95th percentile of the distribution for zero coherence; see Corollary 1. Crosses mark the dyadic sampling points for the stationarity test, with arrows indicating the scale-specific |$p$|-values.
Acknowledgement
This work was funded by the U.K. Engineering and Physical Sciences Research Council. The authors thank the reviewers for their insightful comments and suggestions. Special thanks go to Leigh Shlomovich of Imperial College London for developing the Hawkes process simulation code, and to Heather Battey, Dean Bodenham and Andrew Walden for stimulating conversations.
Supplementary material
Supplementary Material includes proofs, results for real-valued wavelets, a guide to practical implementation, and a link to a Matlab package for implementing the presented methods.
Appendix
Computing eigen-wavelets and eigenvalues
The sum in (5) is over an infinite set of eigen-wavelet periodograms. However, in practice, the size of the eigenvalues drops away rapidly, indicating that the kernel can be accurately reconstructed using only a small number of its eigen-wavelets; hence (5) can be approximated with only a small number of terms. For example, in the case of |$\kappa = 10$|, the first nine eigenvalues contain 99.9|$\%$| of the overall energy.
Hawkes process
For simulation we use the thinning algorithm of Ogata (1981).