Debiasing Welch’s Method for Spectral Density Estimation

Welch’s method provides an estimator of the power spectral density that is statistically consistent. This is achieved by averaging over periodograms calculated from overlapping segments of a time series. For a finite length time series, while the variance of the estimator decreases as the number of segments increase, the magnitude of the estimator’s bias increases: a bias-variance trade-off ensues when setting the segment number. We address this issue by providing a novel method for debiasing Welch’s method which maintains the computational complexity and asymptotic consistency, and leads to improved finite-sample performance. Theoretical results are given for fourth-order stationary processes with finite fourth-order moments and absolutely convergent fourth-order cumulant function. The significant bias reduction is demonstrated with numerical simulation and an application to real-world data. Our estimator also permits irregular spacing over frequency and we demonstrate how this may be employed for signal compression and further variance reduction. Code accompanying this work is available in R and python .


Introduction
The periodogram is the fundamental non-parametric estimator of the power spectral density; however, it suffers from two deficiencies: the periodogram is statistically inconsistent and is biased with finite samples.By averaging multiple periodograms calculated from partitioned or overlapping segments of a time series, Welch (1967) overcomes the issue of inconsistency to provide a statistically consistent estimator of the power spectral density, eponymously known as Welch's estimator.Welch's estimator is simple to implement and easy to understand: all reasons that has led to its ubiquity in the applied sciences.However, for finite length time series, the issue of bias remains.Each segment's periodogram is biased and Welch's estimator inherits this bias.Further, as the magnitude of the bias is inversely related to the length of the segments, there is the classical inverse relationship between bias and variance of the estimator; this leads to the undesirable property that we become increasingly more certain in an estimator that is increasingly more biased.The bias is due to a phenomenon called blurring, which is observed most prominently in low power regions, and so it is often ignored as an unimportant aspect of the spectrum; although, such bias may severely impact inference and scientific understanding (e.g.Thomson et al., 1995).In this paper, we introduce a debiased analogue to Welch's estimator that we entitle the debiased Welch estimator.Welch (1967) popularised estimating a spectral density by averaging tapered periodograms calculated from shifted overlapping windows of a time series, thus generalising Bartlett's estimator (Bartlett, 1950).Statistical properties of Welch's estimator, and its behaviour under varying tapers, parametrisations, and smoothness assumptions of the underlying density, have been established; see, for instance, Nuttall (1971), Zhurbenko (1980Zhurbenko ( , 1986)), Dahlhaus (1985), and (Politis, 2005).Welch's estimator may further been seen as a special case of modern methodologies such as scalable subbaging (Politis, 2024).In many cases, these developments improve upon the standard statistical properties of Welch's estimator; however, none consider the explicit debiasing under general conditions.Alternative non-parametric methodologies have been developed to mitigate the effects of bias, most prominently lag-windowing (Blackman and Tukey, 1958) and multi-tapering (Thomson, 1982); although, typically these reduce correlation between distant frequencies at the cost of increasing correlation between nearby frequencies, and so bias remains.In this paper, we extend the results in Sykulski et al. (2019), who provided methodology to debias parametric spectral density estimates, to non-parametric estimation by constructing a natural basis representation to the power spectral density.From this basis representation we intentionally bias the bases, fit the biased bases to a biased estimate of the power spectral density, and recover debiased coefficients that are used to form our debiased non-parametric estimator.
Novel insight is provided across two theorems.The first presents theoretical asymptotic properties of Welch's estimator, making the finite sample bias explicit and motivating the construction of a new debiased alternative which we propose via a basis representation.To obtain a good representation of the power spectral density, we must specify a large number of basis functions and so estimation of the coefficients via direct optimisation of the Whittle likelihood, as in Sykulski et al. (2019), becomes computationally infeasible.The major contribution of this work is therefore housed in the second theorem, where we introduce methodology that leverages results established in the first theorem, to provide asymptotic results that enable fast and direct computation of the debiased Welch estimator.We build theory for time series requiring fourth-order stationarity with finite fourth-order moments and absolutely convergent fourth-order cumulant function.The theory is established for all tapers, subject to the standard conditions of squared summability of the tapering sequence, thereby extending many classical results that use the raw periodogram, for example, those in Anderson (1971), Brillinger (1975) and Katznelson (1968).
The theoretical developments are also demonstrated by simulation.To evidence the asymptotics, we use a canonical auto-regressive model of order four studied throughout the spectral estimation literature (see for example Percival et al., 1993); then, we present results for shorter-length time series generated from a Matérn covariance function.Next, we demonstrate the debiased Welch estimator in a realworld case study of coastal wave measurements.This provides an example of where parametric methods struggle to capture the complexity of the process, thus necessitating the use of non-parametric methods.Finally, unlike Welch's estimator, the debiased Welch estimator does not require even spacing over frequency; we use this to demonstrate how we may simultaneously debias and compress the signal.Code accompanies our results; R and python implementations are available at github.com/TIDE-ITRH.

Fundamentals and definitions
Define by {X t } a discrete real-valued stochastic process, sampled at interval ∆ and indexed by t ∈ Z, where Z is the set of integers.The process {X t } may be either thought of as a true discrete-time stochastic process, such as an auto-regressive model, or as uniformly sampled observations of a continuous-time stochastic process {X(t)} in which case X t = X(∆t).Define the auto-covariance sequence of {X t } as γ(τ ) = E[X t X t−τ ] for τ ∈ Z.Following Wiener-Khinchin's theorem, a necessary and sufficient condition for γ(τ ) to be the auto-covariance function of {X t } is that there exists a power spectral distribution function F (ω) such that for ω ∈ [−π/∆, π/∆] defined in radians, where (1) is a Riemann-Stieltjes integral.We assume fourthorder stationarity, with finite fourth-order moments and fourth-order cumulant function too is absolutely convergent.Consequently, the power spectral distribution is absolutely continuous (see Section 4.2 Shumway and Stoffer, 2000), and so there exists a power spectral density f (ω) that satisfies f (ω)dω = dF (ω), and f (ω) forms a Fourier pair with γ(τ ) so that As f (ω) is uniformly continuous over this interval, and according to the Vitali convergence theorem, f (ω) is Riemann integrable in ω.
Say we observe from {X t } an n-dimensional observation X n = (x 0 , . . ., x n−1 ).The most common estimator of f (ω) from X n is the periodogram, I n (ω), defined as the squared modulus of the discrete Fourier transform of X n , x t e −iωt∆ . (3) Alternatively, we may calculate the periodogram via the discrete Fourier transform of γ(τ ), a biased estimator to γ(τ ), The periodogram is a biased and inconsistent estimator of f (ω).It has expectation where is the Fejér kernel, and variance var[I n (ω)] ≈ f 2 (ω) which does not reduce with n, thus causing the inconsistency.The periodogram is asymptotically unbiased as F n (ω) tends to a Dirac delta function as n → ∞.Often, practitioners will appeal to this to argue that the periodogram is a suitable estimator; however, the effects due to bias are related not only to n, but also to the dynamic range of f (ω).Even with massive n, for common f (ω) encountered in the sciences, the periodogram can be too biased to be useful (e.g.see the discussion in Thomson, 1982).
A standard technique to alleviate the bias in the periodogram is to apply a taper, or window, to the data.Tapering pre-multiplies the data with a taper to form the product h t X t for t = 0, . . ., n − 1 where n−1 t=0 h 2 t = 1.The sequence {h t } is called the data taper.The tapered periodogram, otherwise known as the modified periodogram, is defined as The expectation of the tapered periodogram is E[I n (ω; h)] = f (ω) * H n (ω) where H n (ω) is the spectral window for {h t } defined by Tapering can be effective in removing bias; however, it is not a panacea.The data taper is generally chosen so as to reduce bias in the form of broadband blurring in I n (ω; h); this comes with the trade-off that bias from narrowband blurring is increased.Further, tapering, in effect, reduces the amount of data and so the tapered periodogram has a higher variance as compared to the standard periodogram.This variance can be reduced by applying multiple tapers (multi-tapering), but this then reintroduces bias such that the level of the bias-variance trade-off must be in practice controlled by the practitioner.Setting h t = n −1/2 recovers the periodogram as H n (ω) = F n (ω).For this reason all periodograms may be considered to be tapered, and so we will use the notation I n (ω; h) to refer to both tapered and standard periodograms throughout.

Welch's method for spectral density estimation
Welch's method segments a time series x 0 , . . ., x n−1 into M partitioned or overlapping blocks of length L, and applies a length L data taper to each block.We define the periodogram of the mth block, for m = 0, . . ., M − 1, as where S is an integer-valued shift factor satisfying 0 < S ≤ L, so that p = (L − S)/L, where 0 ≤ p < 1, is the percentage overlap of the blocks, and the relationship between M , L and p is such that setting two of {M, L, p} fixes the third for a given n.Welch's estimator is defined as The degree of overlap, p, and the choice of data taper, {h t }, are modelling choices often informed by either desired properties of the analysis or preferences within scientific disciplines.For instance, it is standard in the engineering sciences to assume a 50% overlap (p = 0.5) with a Hamming taper: see the MATLAB documentation of the function pwelch.Welch's estimator has expectation When n >> M , Welch's estimator is approximately equal to a lag-window estimator where the lagwindow is given by the convolution of h t with itself.This is most simple to see for Bartlett's estimator (p = 0 and h t = L −1/2 ) where the lag-window is a triangular function of width 2L, which results from the discrete convolution of h t = L −1/2 .The primary focus of this manuscript is to study the debiasing of Welch's estimator; although, we note, that the presented results may also be readily extended to certain lag-window estimators.We now establish some important properties of Welch's estimator in Theorem 1.
Theorem 1. Assume a fourth-order stationary stochastic process {X t } with finite fourth-order moments and absolutely convergent fourth-order cumulant function.Welch's estimator, defined in (9), of the observed process X n = (x 0 , . . ., x n−1 ), with data taper {h t } that satisfies h 2 t = 1, has the following properties.
Proof.Proofs are provided in the Supplementary Material.
Remark 1.As n increases, then it is not possible to increase both L, the block length, and M , the number of blocks, linearly with n for a fixed overlap p, see (8); a trade-off is required.For fixed L and linearly increasing M , Theorem 1a establishes that Welch's estimate has the optimal convergence rate of ), but the estimator is inconsistent for all processes except white noise due to the bias inherent in f (ω) * H L (ω) for fixed L as discussed.On the other extreme, for fixed M and linearly increasing L, the bias asymptotically vanishes but the variance is O p (1) (Theorem 1a) and therefore the Welch estimate suffers the same inconsistency as the periodogram (Theorem 1b).Consistency is only achieved by increasing both L and M with n at reduced rates, say, L = O(n α ) and M = O(n 1−α ) for 0 < α < 1, to satisfy (8).In practice, regularity assumptions may be employed to aid the selection of L and M .For example, if we assume h t = L −1/2 and f (ω) to be Lipschitz continuous then, as shown in Fejér (1910), and so the optimal value of α, with respect to mean squared error, is approximately α = 1/3.Further regularity conditions and specification of tapers may yield even faster convergence of the bias: assumptions on the order of differentiability of f (ω) can further reduce the optimal choice of α (see, for example, Zhurbenko, 1980), and tapers may be chosen with associated properties, such as the family of Slepian tapers that maximally concentrate bandwidth energy (Slepian, 1978).

Constructing biased bases
To construct a debiased and consistent estimate to f (ω) based on Welch's method, we use concepts proposed by Sykulski et al. (2019) but extend them to a non-parametric setting.Specifically, here we construct from a family of bases B k (ω), a family of biased bases Bk (ω).We assume the functional form 6) to provide debiased inference for the parameters ϑ = (a 1 , . . ., a K ).Here, the B k (ω) are basis functions specified from some family of bases chosen to represent some desirable property of f (ω).We set where each ρ k (τ ) are the auto-correlation functions calculated from B k (ω) as Due to the distributivity of the convolution operator the expected periodogram is where each Bk (ω; h) is calculated exactly using a fast Fourier transform by substituting ρ k (τ ) for γ(τ ; ϑ) in (7).

A Riemann approximation to the spectral density
Section 3.1 describes a general approach to bias any basis family.Here, we provide specific details with respect to a Riemannian approximation of f (ω).As f (ω) is Riemann integrable then we may approximate f (ω) by summating contiguous rectangular functions; this approximation becomes exact as the width of the rectangular functions tends to zero.Define the rectangular function by and the symmetric rectangular function with centre ω c and width δ as for ω c ≥ δ/2.As we are concerned with real-valued time series, with centres The approximation in (13) converges to equality as K → ∞ and all δ k → 0. This approach does not require us to space the bases regularly; any partition can be defined, allowing for irregularly spaced bases.For now, so as to mimic the behaviour of Welch's estimate, we assume the bases to be regularly spaced in ω, and so δ k = δ for all k.We define the centres and widths of the symmetric rectangular functions so that for a maximum value of K = ⌈(L − 1)/2⌉, ω c k = kδ − δ/2 and δ = π/(∆K).Setting B k (ω) = symrect (ω; ω c k , δ), and following the results of Tobar (2019), we calculate via (11) where sinc(τ ) = (πτ ) −1 sin(πτ ) and sinc(0) = 1.To calculate Bk (ω; h) we substitute ρ k (τ ; ω c k , δ) in (7).

Model fitting
For a given choice of bases B k (ω) the final step is to estimate ϑ = (a 1 , ..., a k ) to recover the spectral estimate f (ω; ϑ) = K k=1 a k B k (ω).The preferred debiased approach would be to specify f (ω; ϑ) = K k=1 a k Bk (ω; h) and solve for the a k by maximising the debiased Whittle likelihood defined in (6).For the approximation to f (ω) in (13) to be reasonable, however, we are required to define a potentially large number of bases, K, and so it is likely that maximising (6) is computationally infeasible using naive optimisation routines.
To proceed we observe from Theorem 1c that ĪL (ω; h) converges, with M , to be Gaussian, and so we model ĪL (ω; h) = K k=1 a k Bk (ω; h) + ϵ(ω), for heteroskedastic and Gaussian ϵ(ω) with var[ϵ(ω)] = var[ ĪL (ω; h)].Define the vector B(ω; h) = ( B1 (ω; h), . . ., BK (ω; h)) as the column vector of all the biased bases at ω. Weighted least squares can be used to estimate a solution of the form arg min As var[ ĪL (ω; h)] is dependent on f (ω), the true power spectral density we are trying to estimate, the solution to ( 15) is not analytical.As an approximation to ( 15), we substitute ĪL (ω; h) 2 for var[ ĪL (ω; h)] and instead solve The motivation for ( 16) is as follows.From Theorem 1b, as lim L→∞ var ĪL (ω; h) = c var [I L (ω; h)], over all ω, we substitute var [I L (ω; h)] for var ĪL (ω; h) in (15).Further, as we establish in Theorem 2 below, ĪL (ω; h) 2 converges in M and L to var[I L (ω; h)], and so we approximate var[I L (ω; h)] by ĪL (ω; h) 2 , resulting in ( 16).From θ = (â 1 , . . ., âK ), we define a debiased functional estimate of f (ω) For the symmetric rectangular basis function of Section 3.2, we define the debiased Welch estimator at a discrete set of frequencies as this is analogous to the frequencies for which Welch's estimator is generally defined.Due to finite-sample error, the debiased Welch estimator is not guaranteed to yield a non-negative solution.We address this in Section 3.4 by constraining the solution space of θ so that θ ∈ R + where R + denotes the non-negative space of reals.As by definition, f (ω) ∈ R + , and following similar arguments to Theorem 2 of Politis and Romano (1995), we can establish that constraining θ ∈ R + yields a better estimator with respect to mean squared error.
Theorem 2. Assume a fourth-order stationary process, {X t }, with finite fourth-order moments and absolutely convergent fourth-order cumulant function, K(τ i , τ j , τ k ).The variance of the periodogram of any observed length-L segment of X n = (x 0 , . . ., x n−1 ), for all ω ̸ = 0, ω N , and with data taper {h t } that satisfies Proof.The proof is provided in the Supplementary Material.
The weighted least squares routine in (16) follows asymptotic arguments in Theorems 1 and 2 that result in an estimator that is asymptotically unbiased and O p (M −1/2 ) as long as L and M both increase with n.As C(K, L, h) is not bounded above by 1, the debiased Welch estimator is not guaranteed to have a smaller mean squared error as compared to the Welch estimator given the same parameterisation.However, as the debiased Welch estimator corrects for the effects of bias, in practice we may apportion between L = O(n α ) and M = O(n 1−α ) with a lower α than would be chosen for Welch's estimator to minimise mean squared error (see also Remark 1), allowing for extra variance reduction, as we shall demonstrate for finite samples in Section 4.2.This is a similar result to Politis and Romano (1995) who study bias-correction of lag-window estimators.

Computation
We summarize the computation of the debiased Welch estimator in Algorithm 1 in the Supplementary Material, with two variants provided.The first variant, shown in lines 3-8 of Algorithm 1, provides the option to space the K bases evenly or unevenly over frequency.Even spacing of the bases provides a closer analogue to Welch's estimator; here, the choice of integer valued K ≤ ⌈(L − 1)/2⌉ is the only additional user decision required by the debiased Welch estimator.When there is large bias present in Welch's estimator, choosing K close to ⌈(L − 1)/2⌉ can lead to an ill-defined solution.In general, we recommend K = ⌈(L − 1)/4⌉ as a good initial estimate which down-samples the Fourier frequencies by an order of two.For uneven spacing, the centres and widths are all user-specified.We discuss uneven spacing further in Section 6 with an example provided for high-frequency slope estimation.The second variant of the model, shown in lines 12-15 of Algorithm 1, provides an option to impose non-negativity on ϑ in ( 16), constraining the solution space to θ ∈ R + .As discussed, the debiased Welch estimate may result in small negative values.This is particularly so when the choice of K is too high and so the solution to the linear system in ( 16) is unstable.Efficiently computing non-negative least-squares estimates has been long understood; in the accompanying code we implement results from Lawson and Hanson (1974).Note, that the non-negative least-squares solution is guaranteed to have a lower mean squared error, see proof of Theorem 2 in Politis and Romano (1995).The computational order of the debiased Welch estimator, for both even and uneven spacing, is O(max{n log L, K 3 }).Here, the first term is the computational order of Welch's estimator, and the second term is the computational order to do the debiasing in ( 16).The number of bases K scales with L and so O(K 3 ) ∝ O(L 3 ) for even spacing.
From Theorem 2, we observe that the optimal rate with which to scale L with n is at maximum O(n 1/3 ), and so the debiasing in ( 16) scales at most O(n).Thus, the debiased Welch estimator retains the same computational order of the standard Welch estimator, O(n log L).
4 Simulations studies

Autoregressive processes
To test empirical performance of the debiased Welch estimator, we run a simulation study using a canonical model for biased spectral density estimation: the AR(4) model of Percival et al. (1993).The model is specified as X t = 4 j=1 ϕ j X t−j +ϵ t where ϵ t ∼ N (0, σ 2 ) and {ϕ 1 , ϕ 2 , ϕ 3 , ϕ 4 , σ} = {2.7607,−3.8106, 2.6535, −0.9238, 1}.To plot units in the simulation studies in Sections 4 and 6, we measure {X t } in metres, t in seconds and assume ∆ = 1.Throughout, we calculate standard periodograms, corresponding to h t = L −1/2 .Similar results with other tapers were found, with the bias in Welch's estimator being reduced but not eliminated.First, we plot in Figure 1 the standard (left) and debiased (right) Welch estimates with black solid lines, calculated from a single realisation from the AR(4) model.We generate a time series of length 2 15 (32, 768) and set M = 64 (2 6 ) and L = 512 (2 9 ) with zero overlap for both estimates.In both plots, the true spectral density is shown by the dashed grey line, and the expected value of Welch's estimator, calculated from (7) using the auto-covariance function of the AR(4) model, is shown by the solid grey line.Visually, the debiased Welch estimate provides a closer estimate of the true spectral density, especially in regions of low power where we know the bias to be significant.We now explore how the bias, standard deviation and root-mean-square error of the Welch estimators behave as a function of M for fixed L. For each M ∈ {8, 16, 32, 64, 128, 256} and p ∈ {0, 0.5} we generate 1000 random time series from the AR(4) model with fixed segment length L = 1024 (2 10 ).For each M and p, we calculate, over the ensemble, the empirical mean absolute bias, standard deviation, and root-mean-square error, at each frequency.We aggregate the scores by computing the mean log value across frequency and for each metric; this makes the error process of the metrics independent of the expected power at each frequency and hence aggregates the frequencies with equal importance.Results are shown in Figure 2. The thick dashed lines show the results from the Welch estimator, the thick solid lines from the debiased Welch estimator, the grey lines with an overlap of p = 0.5 and the black lines with an overlap of p = 0 (i.e. Bartlett's estimator).As a point of comparison to other nonparametric spectral density estimation methodologies, we also include the results from a lag-window estimator with modified Daniell window, shown by the thin solid line, and a multitaper estimator calculated with Slepian sequence tapers, shown by the thin dashed line.We choose the window-width and number of tapers to provide similar estimator bandwidth for each value of M .The results of this simulation study support a number of our theoretical claims.The black dot-dashed line shows the M −1/2 convergence rate which is consistent with the theory established in Theorem 1a for the Welch estimator, and is maintained for the debiased Welch estimator.The lag-window, multitaper estimator, and debiased Welch estimators all exhibit similar asymptotic behaviour in absolute bias.The absolute bias tends to zero for the debiased Welch estimator as the approximation in Theorem 2 improves with increasing M , justifying the computation of the debiased Welch estimates using ( 16).The absolute bias does not converge to zero for the standard Welch estimator due to the finite-sample bias of periodograms of fixed length L as already established in Theorem 1a.Finally, we show computation times of the standard and debiased Welch estimates.As expected, the debiased Welch estimator is approximately constant over M .The discrepancy in time is due to the additional operations on top of calculating Welch's estimator and is negligible in practice.We further include in the supplementary material a similar simulation study that examines the performance of the estimators as a function of L for fixed M .We see similar benefits from using the debiased Welch estimator as demonstrated above for fixed L.

Matérn processes
Section 4.1 examines performance of the debiased Welch estimator as a function of M , for medium to long lengths of time-series and for a discrete time process.We now analyse the debiased Welch estimator for shorter time series for a continuous time process with fixed n and varying α, defined by L = n α .We  simulate from a Matérn process parameterised as with σ = 1, λ = 0.1 and α = 4/3, where Γ(•) denotes the gamma function and κ ν (•) the order ν modified Bessel function of the second kind.The Matérn process is asymptotically log-linear in terms of its spectral decay and so is useful for modelling many observed processes; for example, background ocean spectra and turbulent dissipation in oceanography (Garrett and Munk, 1972;Lilly et al., 2017), stellar rotation periods and red-shift in astronomy (Foreman-Mackey et al., 2017) and volatility processes in finance and economics (Heinrich et al., 2019).Figure 3 shows the integrated mean squared error averaged over 1000 random simulations of the Welch (dashed) and debiased Welch (solid) estimates as a function of α and for n ∈ {2 8 , . . ., 2 13 }.For the shorter time-series, shown in the top row, the debiased Welch estimator performs better over all values of α.There is insufficient resolution to observe the expected bias-variance trade-off in the Welch estimator; although, with increasing n, shown in the bottom row, this behaviour becomes apparent.For small values of α, bias dominates the the error; conversely, for large values of α variance dominates.The debiased Welch estimator removes the majority of the error in the bias dominated regions.For certain high values of α the debiased Welch estimator does not perform as well, although these values correspond to small M , a parameterisation that we do not recommend for the debiased Welch estimator.As discussed in Remark 2 following Theorem 2, we recommend scaling α < 1/3 to minimise mean squared error: the debiased Welch estimator gives better performance across all n in this region thus supporting the theoretical results.

Application to coastal wave monitoring
We now demonstrate the use of the debiased Welch estimator applied to a measured data-set of coastal sea-surface heights.The data are measured from two locations with differing mean water depths on the Southern coast of Western Australia: Ocean Beach and Torbay.Contemporaneous measurements of seasurface height are taken from acoustic and pressure sensors.Acoustic sensors provide measurements of the true process via acoustic ranging and pressure sensors measure pressure variation on the sea-bed and translate this to a measurement of sea surface height.As compared to pressure sensors, acoustic sensors are more expensive, and due to larger power requirements, do not allow for long field deployments.Shallow water wave theory states that pressure is attenuated through depth as where d is the undisturbed depth, −z is depth from the sea-surface to the sensor and, here, k is wavenumber.We can also define (17) as a function of frequency using the dispersion relation (Dean and Dalrymple, 1991).We show two 5-minute chunks of the measured time series from both sites in Figure 4; data are captured at 1 Hz such that ∆ = 1, and the solid grey and dashed black lines are measurements from the acoustic and pressure sensors, respectively.On the right of Figure 4 we show pressure attenuation over frequency.The high frequency signal in the Torbay measurements see larger attenuation than in the Ocean Beach measurements due to the deeper mean water level.Due to the attenuation of the high frequency signal, in both records, we expect a high dynamic range in the pressure signal, and hence non-ignorable spectral bias due to blurring.We calculate power spectral density estimates of a 100 minute record (n = 6144) of sea-surface heights at each location; the results are shown in Figure 5. First, we calculate the Welch estimates, with parameters L = 256, M = 47 and p = 0.5, shown in the left hand panels.Here, the grey and black lines are from the acoustic measurements, respectively.As expected, there is good agreement between the estimators at low frequencies, and the pressure signal is attenuated at higher frequencies.The spectral shape of the Ocean Beach record is interesting: there are two spectral peaks around π/8 rad s −1 and π/4 rad s −1 , implying the presence of both a well formed swell (low-frequency) and wind-sea (highfrequency), likely due to a local storm.This behaviour is hard to capture parametrically and is a good example to motivate the use of non-parametric methods.The black lines in the middle plots are the debiased Welch estimates, with additional parameter K = 64, calculated from the pressure signal, and the grey lines are the Welch estimates, for reference.For comparison, we also show the lag-windowed and multitaper estimates, described in Section 4.1.In this application, these estimates show very little bias, and so provide good validation of the performance of the debiased Welch estimator.We do not display the debiased estimate for the acoustic signal as the bias correction is very small here due to the small dynamic range.Finally, the right hand plot shows the Welch estimates of the pressure measurements inverted through the attenuation function to give an estimate of the true wave power spectra; the thick grey and black lines show the inverted Welch and debiased Welch estimates, respectively.These estimates are not shown above 3π/8 rad s −1 and 5π/8 rad s −1 for Torbay and Ocean Beach, respectively, as this is where the noise floor of the pressure sensor dominates the signal.The thin grey line is the acoustic Welch estimate, also shown in the left hand plots, and is used for a notion of the ground truth.Due to in-situ pre-processing of the pressure signal, whereby the signal is demeaned of tidal effects, there is some bias between the two signals at low-frequencies below π/8 rad s −1 across the entire data-set.Overall, the debiased Welch estimator gives a better estimate to the true process over a broader range of frequencies.
Capturing swell dissipation, that is, the power transition from laminar/low to turbulent/high frequencies, is vital for our understanding of swell propagation.Although there is relatively low variance captured in the biased low-power regions, it is known that incorrect description of swell dissipation has large effects in forecasting models (e.g.Ardhuin et al., 2009).Here, use of the debiased Welch estimator, resolves almost twice the amount of high-frequency signal as the standard Welch estimate and thus provides a more accurate measure of swell dissipation.

Unevenly spaced bases for signal compression
To be analogous to Welch's estimator, in Section 3 we define the debiased Welch estimator with even basis spacing over frequency.As noted therein, we do not require this, and can alternatively define the bases by an irregular partition over frequency.For spectra whose value varies slowly over certain neighbourhoods of frequency, we argue that a wider basis spacing would lead to little reduction of accuracy in estimating the true process.Allowing for wider, and irregularly spaced bases serves two purposes: first we achieve a degree of signal compression, as compared to the Welch estimator; and second, we reduce the variance of the estimator as there are more data informing the coefficient of each basis.The code provided alongside this manuscript accepts uneven bases by providing the ω c k and δ k , as defined in (13), as user defined inputs for all k.To demonstrate this concept, we return to the Matérn model simulated in Section 4.2.We sample a length 32, 768 (2 15 ) time series and calculate the Welch estimate, with parameters M = 32 (2 5 ), L = 1024 (2 10 ) and no overlap, shown in the left plots of Figure 6 in black.The true power spectral density and the expected value of Welch's estimator for the Matérn process are shown in the plots in Figure 6 by the dashed and solid grey lines, respectively.The Matérn power spectral density is asymptotically, in ω, log-linear and so we space the bases of the debiased Welch estimator evenly in

Figure 1 :Figure 2 :
Figure 1: Welch estimates of an AR(4) random time series.The AR(4) model is defined formally in the text; the true spectral density is shown by the dashed grey line and the expected Welch estimate is shown by the solid grey line.The Welch estimates are shown by solid black lines with the standard Welch estimate shown left, and the debiased Welch estimate shown right.

Figure 4 :
Figure 4: Contemporaneous measurements of sea-surface height from acoustic (grey) and pressure (black-dashed) sensors, measured in Western Australia at Torbay (top) and Ocean Beach (bottom).The right hand plots show the attenuation of the pressure signal over frequency and are dependent on the mean water depths.

Figure 5 :
Figure 5: Welch estimates of sea-surface height spectra from acoustic and pressure sensors located at Torbay (top) and Ocean Beach (bottom).The left plots show the Welch estimates of the acoustic (grey) and pressure (black) measurements.The middle plots show the debiased Welch estimates of the pressure data (black) and, for comparison, the Welch estimate (grey).The thin lines are the lagwindow and multitaper estimates described in Section 4.1.The right plots show the standard (grey) and debiased (black) Welch estimates of the pressure measurements accounting for attenuation.The thin grey line is the acoustic signal used as a notion of ground truth.

Figure 6 :
Figure 6: Welch estimates of a Matérn power spectrum with uneven basis spacing.The top plots show frequency on a linear scale and the bottom plots on a log scale.The standard Welch estimate is shown left and debiased Welch estimate, with uneven basis spacing, is shown right with black solid lines.Locations of the Fourier frequencies and basis centres are shown by the rug plots on the frequency axes.In all plots, the true power spectral distribution and expected Welch estimate are shown by the dashed and solid grey lines, respectively.