A multivariate pseudo-likelihood approach to estimating directional ocean wave models

Ocean buoy data in the form of high frequency multivariate time series are routinely recorded at many locations in the world's oceans. Such data can be used to characterise the ocean wavefield, which is important for numerous socio-economic and scientific reasons. This characterisation is typically achieved by modelling the frequency-direction spectrum, which decomposes spatiotemporal variability by both frequency and direction. State-of-the-art methods for estimating the parameters of such models do not make use of the full spatiotemporal content of the buoy observations due to unnecessary assumptions and smoothing steps. We explain how the multivariate debiased Whittle likelihood can be used to jointly estimate all parameters of such frequency-direction spectra directly from the recorded time series. When applied to North Sea buoy data, debiased Whittle likelihood inference reveals smooth evolution of spectral parameters over time. We discuss challenging practical issues including model misspecification, and provide guidelines for future application of the method.


Introduction
Applications of multivariate time series and spatiotemporal statistics are ubiquitous, for example using the affordable and widespread availability of GPS and accelerometer technology to track individuals and objects in three spatial dimensions over time. Applications include clinical studies of human rest/activity cycles (actigraphy) (Geraci, 2019), player activity in sports (Tierney et al., 2016), motor vehicle tracking (telematics) (Verbelen et al., 2018), animal and wildlife tracking (Rivest et al., 2016), the tracking of large-scale currents and drifting objects in oceanography (Sykulski et al., 2016;O'Malley et al., 2021), as well as measuring ocean surface waves using buoys-the final of which is the focus of this paper. The raw time series obtained from such devices are high frequency, but often noisy, and current practices throw away or over-smooth data without utilising their full potential. In this paper we present a likelihood-based stochastic modelling approach that can meaningfully extract and estimate more spatiotemporal features from ocean wave observations than current methods-but we present our methodology in such a way that it can be applied more broadly to spatiotemporal data, including handling model misspecification, anisotropy, high-and low-frequency noise, aliasing, non-stationarity, and uncertainty quantification.
Wind-generated surface gravity waves are an important feature of the ocean environment. Understanding the behaviour of such waves is of great scientific and engineering interest, with applications ranging from the design of ships and marine structures to modelling coastal flood risk. As such, large quantities of high frequency time series data are routinely recorded in order to help improve our understanding of the waves, generating a variety of statistical challenges. Characterisation of the wave environment needs to reflect the evolving nature of multiple weather systems, and the presence of measurement uncertainty.
From a modelling perspective, we typically seek to model the vertical displacement of the ocean surface over two-dimensional horizontal space and time. The second-order characteristics of this spatiotemporal process are usually summarised by the frequency-direction spectrum, which "is the fundamental quantity of wave modelling and the quantity that allows us to calculate the consequences of interactions between waves and other matter" (Barstow et al., 2005). Heuristically, the frequency-direction spectrum quantifies the contribution to the variance of the wave process from multiple sinusoidal components with different frequencies travelling in different directions. 1 This description assumes that the wave process is stationary; however, in reality this is not generally the case. To address this, wave records are usually partitioned into shorter intervals of time series (referred to as sea states), each of which can be treated as stationary (Holthuijsen, 2007).
High resolution measurements of the ocean surface in space and time are rarely available. However, recordings of the vertical displacement of the ocean's surface at a single location (e.g. using a wave staff or downward-facing radar) or of the motions of floating devices (e.g. buoys) are common (Jensen et al., 2011). In particular, modern buoy measurements provide time series of the buoy's full three-dimensional displacement. Such measurements are then used to estimate the frequency-direction spectrum, though in general this estimation is not trivial to perform.
Parametric estimation of the frequency-direction spectrum usually uses either a method-of-moments or least squares approach. In general, neither approach is optimal statistically. Furthermore, these techniques rely on non-parametric estimates of the frequency-direction spectrum, which exhibit substantial bias. As a result, these approaches perform poorly and can only reliably estimate simple location parameters such as the peak frequency or mean direction of the observed wave process. We propose using likelihood inference directly on the buoy data, avoiding both the poor performance of method-of-moments and least squares; and the issues generated by the non-parametric estimation. Ideally, parametric inference would be made using maximum likelihood estimation with the full sample likelihood; however, the full likelihood is expensive to compute. Fortunately, adoption of the Whittle likelihood (Whittle, 1953) provides a computationally efficient alternative to full maximum likelihood inference, which produces consistent parameter estimates and is optimally convergent. Furthermore, debiased Whittle likelihood inference  removes the finite sample bias present in Whittle likelihood inference, without compromising standard error or computational efficiency. Grainger et al. (2021) demonstrated in a univariate setting that debiased Whittle likelihood inference yields significant improvements over competitors, when estimating parameters of the spectral density function of ocean waves recorded only over time. The paper we present here seeks to generalise this methodology to incorporate directional characteristics of the wavefield. However, this extension is nontrivial, since the full spatiotemporal process, which constitutes the wavefield, is not recorded, and hence the spatial debiased Whittle likelihood of Guillaumin et al. (2019) cannot be applied directly. Instead we describe computationally efficient parametric estimation of a frequency-direction spectrum fitted directly to multivariate time series buoy data. Using a multivariate extension of the debiased Whittle likelihood we are able to obtain parameter estimates with lower bias and variance than competitor techniques. Our real-world data analysis reveals robust parameter estimates and captures their evolution over time during a storm; in contrast, such an analysis using existing techniques results in estimates that evolve erratically over time.
The structure of the paper is as follows. Section 2 gives some background on ocean waves, introduces an example data set and describes a model for the frequency-direction spectrum of wind-sea waves. Section 3 describes the debiased Whittle likelihood inference, and demonstrates its performance by simulation. In Section 4, we then apply the debiased Whittle inference to the example data set introduced in Section 2.2, discussing a number of important practicalities of the analysis. Finally, Section 5 provides a discussion and conclusions.
1 See Appendix for a more formal definition.

Ocean waves and frequency-direction spectra
Much of the interest in ocean waves relates to the surface displacement of the water over space and time, which is treated as a stochastic random field. Usually, the waves are assumed to be stationary and mean-zero within a given time window (often 30 minutes), referred to as a sea state. The covariance structure of the random field for this sea state is then described by the frequency-direction spectrum, S(ω, φ), the frequency-domain equivalent of the spatio-temporal autocovariance (see Appendix A.1 for more details).
Examples of frequency-direction spectra are shown in Figure 1, corresponding to: left, wind sea only; middle, wind sea and swell; and right, wind sea with two swells. Heuristically, if we think about the spectral representation of a process as decomposing that process into a "sum of sinusoids", then S(ω, φ) can be thought of as describing the contribution to the variance from a wave of a given angular frequency, ω (measured in rad Hz), travelling from a given direction, φ (measured in radians). For example, the left hand panel of Figure 1 describes a process where most of the variance is generated by sinusoids travelling from a southwards direction (π radians) with angular frequencies around 0.8 rad Hz. In contrast, the right hand example describes a process with major contributions to the variance from sinusoids with three different directions and frequencies. Notice that the direction is measured clockwise from North in radians and is the direction a wave is travelling from and not towards. 2 Figure 1: Example frequency-direction spectra. The left hand panel shows the frequency-direction spectra corresponding to a single wind-sea, the middle shows a wind-sea and single swell and the right shows a wind-sea and two swells. Direction here is the direction the waves are travelling from. The polar plots have direction from north (rad) on the angular axis and angular frequency (rad Hz) on the radial axis.
Direct characterisation of the wavefield would require measurements of surface displacement over space and time. Outside of laboratory wave tanks (Forristall, 2015;Schubert et al., 2020), shallow lakes (Young et al., 1996) or coastal regions (Long and Oltman-Shay, 1991;Eastoe et al., 2013), this is very difficult to achieve with current technology. However it is relatively straightforward to measure some characteristics of the wavefield, and to use these measurements to infer properties of the latent spatiotemporal process. For example, we can use measurements of the motion of a floating buoy to approximate the Lagrangian motion of a particle on the water's surface, recording time series Z(t), X(t) and Y(t), of the vertical, northwards and eastwards displacements of the buoy respectively.
We may also describe the covariance structure of the vector-valued stochastic process P(t) = Z(t) X(t) Y(t) (which is assumed to be stationary and mean-zero) by the spectral density matrix function provided certain technical conditions are satisfied (see Brockwell and Davis, 2006, for example), where c(τ) = E P(τ)P(0) T . Under linear wave theory (see Holthuijsen, 2007, for example), there is a transfer function Isobe et al., 1984), such that where A H denotes the conjugate transpose of a matrix A. This directly relates the frequency-direction spectrum, S(ω, φ), to the spectral density matrix function, f (ω), which is a feature we shall exploit to perform inference. From (2.2) we can immediately see that, for all ω ∈ R, f (ω) is non-negative definite for any non-negative choice of S(ω, φ) (and indeed for any choice of G(ω, φ), which may be required for other measurement systems, such as a heave-pitch-roll buoy). Therefore, if we specify a model for S(ω, φ) then we can obtain a model for f (ω). However, the relation in (2.2) is not invertible in general. Figure 2 shows an example of the relationship between S(ω, φ) and f (ω) for four different sea states, differing only by mean direction (indicated by the four colours). The difference in mean direction is obvious in S(ω, φ) in the left hand panel, and can still be identified from f (ω) in the right hand panel, even though f (ω) does not provide a complete description of S(ω, φ).

Example data
For the purpose of illustration, we consider a z, x, y time series recorded using a Datawell Waverider MkIII buoy (Datawell, 2006) in the southern North Sea, at a sampling rate of 1.28Hz. This particular five-day period is chosen to provide an illustration of various physical phenomena often present in the ocean. Within the period, 30-minute sea states (assumed stationary) range from being straightforward to being difficult to model, allowing us to explore the practical applicability of the technique we propose. Figure 3 shows a summary of the five-day period in question. The first panel of Figure 3 shows significant wave height, H s = 4 var (Z(t)), for each of the sea states, quantifying the roughness of the ocean's surface. The second panel shows wind speed recorded at a nearby platform. The third panel shows a spectrogram plotted on the decibel scale, computed using multitapering (Thomson, 1982) with half overlapped 30-minute windows, describing the time-frequency characteristics of the record. The fourth panel shows the mean direction of the waves at different frequencies over time, as defined by Kuik et al. (1988), again computed using half overlapped 30-minute windows. The final panel shows the wind direction over time at a nearby platform.
The record is made up of a variety of component weather systems, which are most easily identified from the mean wave direction (fourth panel). At the start of the record there are two components present, a highfrequency wind sea and lower-frequency swell. These components fade out throughout day 0, as can be seen from H s (first panel). A new high-frequency wind sea develops from the start of day 1, with a clear change in mean wave direction (fourth panel). Throughout day 1, this new component increases in magnitude and transitions to lower frequencies (see third panel), peaking at the start of day 2. Half way through day 2, the wind drops off and then increases (second panel) and changes direction (fifth panel). In response, we see another wind-sea component develop with a different direction to the previous component (fourth panel). Towards the end of day 3, a similar event occurs (though to a lesser degree) and we again see a change in direction. Meanwhile, the swell persists in a low frequency band throughout (third and fourth panels), though it is small in magnitude and narrow-banded in frequency compared to the wind sea (as can be seen from the spectrogram).

Models for wind-sea
When modelling the frequency-direction spectrum, the spectrum is decomposed as where f (ω) is the marginal spectral density function of the vertical displacement and D(ω, φ) is the so called spreading function. The marginal spectral density function, f (ω), can be thought of as describing the contribution to the variance from waves of a given frequency regardless of direction. Whereas the spreading function, D(ω, φ), describes the distribution of wave variance for waves of a given frequency over direction. The spreading function satisfies 2π 0 D(ω, φ)dφ = 1 and D(ω, 0) = D(ω, 2π) for all ω ∈ R. Figure 4 shows an example of the decomposition given in (2.3) for the model described in the remainder of this section.
For the purpose of this paper, we use the JONSWAP spectral density function first described by Hasselmann et al. (1973), which we denote f (ω; θ), where θ is the vector of parameters. The JONSWAP spectral density function is widely used for modelling the univariate vertical surface displacement resulting from wind-sea waves. Based on physical observations, Hasselmann et al. (1973) developed the JONSWAP spectral density function with an asymmetric peak and a polynomial decay in the high frequency tail. There is debate about the rate of this tail decay (e.g. Hasselmann et al., 1973;Toba, 1973;Phillips, 1985;Hwang et al., 2017), and so we treat the tail decay index as a free parameter in our analysis. The form of the JONSWAP spectral density function for ocean surface gravity waves can also be motivated from basic physical considerations. Wind waves are generated by the wind blowing over the ocean's surface, by a combination of three physical processes. Wind field turbulence disturbs the water's surface, creating high frequency surface water waves. Then, wind-wave interaction causes these surface waves to grow in amplitude. Thereafter, wave-wave interactions propagate wave energy from higher to lower frequencies. This produces a wave spectral density function with a single spectral peak and long highfrequency tail, with peak frequency evolving from higher to lower frequency during an ocean storm of limited duration. Figure 3: Summary of the storm data, showing significant wave height, wind speed, a spectrogram on the decibel scale, mean wave direction over time and frequency (direction the waves are travelling from, in radians clockwise from North) and the wind direction over time (direction the wind is travelling from, in radians clockwise from North) recorded at a nearby platform. The x-axis labels are at the start of the day, e.g. day 1 denotes the start of day 1. Figure 4: Example of the decomposition of a frequency-direction spectrum, showing the frequency-direction spectrum (left), marginal spectral density function (middle) and spreading function (right). Plots are given using Cartesian coordinates as this makes the arms of the spreading function easier to visualise.
Various models have been proposed for the directional spreading of wind-sea waves. A large number of experimental studies (e.g. Young et al., 1995;Ewans, 1998;Wang and Hwang, 2001) indicate that the spreading function is bimodal with direction, for frequencies exceeding the peak frequency. This finding is supported by theoretical arguments involving directional energy transfer through wave-wave interactions (Banner and Young, 1994;Young et al., 1995;Toffoli et al., 2010). For this reason, we adopt the bimodal wrapped Gaussian model of Ewans (1998) in this work. At each frequency, the spreading function over direction is assumed to be a bimodal wrapped Gaussian with means φ m1 (ω; θ) and φ m2 (ω; θ), but the same standard deviation 3 σ(ω; θ). In other words, (2.4) Table 1 gives a description of the parameters of the model, as well as the equations to which they relate, while Figure 5 shows the effect of varying these parameters on the relevant functions. A complete description of the model is given in Appendix A. Note that the inference approach described in this paper is applicable for other models, but the model described here has been chosen for definiteness.

Parameter Description
Parameter space Equation Table 1: Parameter descriptions and relevant equations. The first four rows describe the parameters for f (ω; θ) whilst the others describe the parameters for D(ω, φ; θ) as described in (2.4).

Modelling process
We aim to jointly estimate all the parameters of Table 1, both marginal and spreading, given a sample of threedimensional displacement. In this section, we describe the proposed inference technique, and demonstrate in simulations that it yields significant improvements in performance over the existing least squares and momentsmatching approaches, described in Appendix B. For brevity, we shall refer to such techniques as competitor techniques for the remainder of this paper. In contrast to competitor techniques, we convert the model for the frequency-direction spectrum to a model for the spectral density matrix function of the data we actually observe, and then fit the model directly to the data. This is statistically more appealing as we fully exploit the degrees of freedom in the observational data, rather than performing unnecessary smoothing transformations before model fitting, and is the key reason our method performs better.

Model fitting
Due to the quantity of available data, computationally efficient inference techniques are desirable. For a Gaussian process, full maximum likelihood would require the inversion of a 3n × 3n matrix. This is expensive when n = 2304 as in our case, especially given that we have a different time series every half an hour. Furthermore, we may wish to only model a certain frequency range (see e.g. Section 4.1 for our application study), which is hard to achieve with full maximum likelihood. Frequency domain psuedo-likelihoods such as the debiased Whittle likelihood  provide a nice alternative to full maximum likelihood inference. Debiased Whittle likelihood inference has been shown to perform well in a variety of applications, including for planetary topography , ocean drifters (Sykulski et al., 2016) and univariate recordings of ocean waves (Grainger et al., 2021). For these reasons, we use a multivariate extension of the debiased Whittle likelihood due to Guillaumin et al. (2019).
Let P t∆ = P(t∆) for t ∈ Z be the discrete time process arising from sampling {P(t)} every ∆ seconds. Assume we have a sample of length n, the periodogram, I n (ω), is defined as usually evaluated at the Fourier frequencies Ω n = {2πj/n | j ∈ {− n/2 + 1, . . . , n/2 }} using the Fast Fourier Transform (Cooley and Tukey, 1965). The multivariate Whittle likelihood (Whittle, 1953), in its discrete form, is given by where Ω ⊆ Ω n and f (ω; θ) denotes a parametric spectral density matrix function with parameter vector θ. The multivariate Whittle likelihood suffers from finite sample bias, especially as the dimension grows, so a debiased version may be used to improve estimates, accounting for finite sampling effects such as aliasing and blurring. The multivariate debiased Whittle likelihood  is where the expected periodogram can be efficiently computed using the relation In our case, the autocovariance, c(τ; θ), is not known analytically, and instead must be approximated numerically from the spectral density matrix function. Since models are specified for the continuous time process, the most efficient way to approximate the autocovariance is to first approximate the spectral density of the discrete time process, then approximate the autocovariance (Grainger et al., 2021). The first step requires aliasing the spectral density of the continuous time process by wrapping in contributions from infinitely many frequencies above the Nyquist frequency. To do this numerically, we have to use a truncated version of this sum. In practice, the instrument may not respond to waves with frequencies above a certain threshold, or the data may have been filtered in preprocessing. Therefore, the recorded process may not be aliased to the same extent as the theoretical sampled process. In our case, we treat the buoys as if no aliasing has occurred (due to the aforementioned preprocessing); however, we note that this technique is able to account for aliasing, should it be felt that aliasing is present.
In both (3.1) and (3.2), summation is over a set Ω. Usually Ω = Ω n ; however, we may wish to remove some frequencies to avoid model misspecification (see Section 4.1) or because at some frequencies in the periodogram the ordinates can be highly correlated for finite samples, which harms Whittle estimation. We then maximise this likelihood function using numerical methods, detailed further in Appendix C.

Simulation study
We now present a simulation study comparing the debiased Whittle likelihood inference proposed in Section 3.1 with the least squares and moments-matching techniques described in Appendix B. We have chosen three different scenarios that represent possible conditions seen in practice, including cases where certain parameters are on the boundary of the parameter space (as this is likely to cause problems for estimation techniques). The parameters for each scenario are given in Table 2, and the corresponding frequency-direction spectra are given in Figure 6.
Scenario 1 is a classic example of a fetch-limited wind sea, with directional shape parameters fixed to the standard values from Ewans (1998), and γ = 3.3 from Hasselmann et al. (1973). Scenario 2 is almost identical, except that σ r is set to 0, meaning that σ(ω; θ) is constant over frequency. Heuristically, this corresponds to a frequency-direction spectrum where the width of each arm in the spreading function is constant over frequency (see Figure 4 for the notion of an arm). This scenario is included because we often see this parameter tending towards the boundary of the parameter space in practice (as in Section 4.2) and it is useful to explore the effect of this on other parameter estimates (though we cannot say anything about the impact of model misspecification from this). Finally, scenario 3 is a Pierson-Moskowitz spectrum for a fully developed sea (Pierson and Moskowitz, 1964), also using the standard spreading parameters from Ewans (1998). This is a special case of the JONSWAP spectrum with γ = 1, and so is of particular interest as it lies on the boundary of the parameter space.
We simulate 1000 time series from each of the scenarios and estimate the model parameters using each of the techniques from Appendix B alongside the debiased Whittle likelihood inference from Section 3.1. 4 In particular, we use the least squares technique described in Appendix B.1 with both MLM and MEM based estimation of frequency-direction spectrum and the moments-matching approach described in Appendix B.2. Whilst there are three different methods from the existing literature in our comparison, they all use the same technique to estimate the parameters of the marginal spectral density function. As such, Figure 7 shows the marginal parameters estimated using least squares (the marginal technique for the competitor techniques), univariate debiased Whittle on only the vertical displacement, and multivariate debiased Whittle on all three time series. Figure 8, shows the spreading parameters where the least squares technique is now split into the three directional variants: least squares with MLM, least squares with MEM, and the moments-matching approach; and the univariate debiased Whittle is not included (as it cannot be used to estimate the spreading parameters).
From Figure 7, we see a clear improvement in the debiased Whittle when comparing to least squares, especially in terms of variance, as has already been reported by Grainger et al. (2021). Additionally to the results already seen in Grainger et al. (2021), there is also a benefit to estimating the parameters of the marginal spectral density function from all three series (as opposed to from the vertical displacement alone). Traditionally, estimating the marginal parameters has been treated as a separate problem from estimating the spreading parameters, with only the vertical displacements used to estimate the marginal parameters. However, this clearly throws away useful information about the marginal parameters which is present in the x and y time series. Furthermore, in Scenario 3, debiased Whittle likelihood recovers γ well, despite the true value being on the boundary of the parameter space (though clearly the estimates are not normally distributed).
Similarly, Figure 8 demonstrates a stark difference between competitor techniques and debiased Whittle likelihood inference. Other than the mean direction φ m , we see substantial bias in all the other parameter estimates from each of the three existing techniques which is not present in the debiased Whittle likelihood estimates. We also see that the debiased Whittle estimates exhibit significantly less variance across all parameters and scenarios. From Scenario 2, we see that debiased Whittle likelihood inference still performs well when σ r is on the boundary of the parameter space (though again estimates are not normally distributed). Additionally, when estimating β, we see that least squares with MLM in scenario 1 and moments-matching in scenarios 2 and 3 the majority of the estimates are on the upper boundary of the parameter space, an issue which debiased Whittle likelihood inference does not have.

Modelling the example data set
We now apply debiased Whittle inference for S(ω, φ; θ) (Table 1) to the data set introduced in Section 2.2. Both wind sea and swell are present in our example record. However, we have chosen to model only the wind sea as the purpose of this paper is to introduce a new inference technique, and this is easiest to illustrate and scrutinise with a simple wind-sea only model. The debiased Whittle procedure could naturally be used on a swell-only model (or indeed a joint wind-sea and swell model) should the swell characteristics be of further interest, but this is reserved for future work.
Due to issues with the measurement device and other contaminating processes, certain frequency regions do not reflect the process we are interested in modelling. Therefore, careful selection of the frequencies included in the objective function must be performed prior to inference. Selecting these frequencies is difficult, but there are principled ways to choose them. In particular, the buoy data does not accurately represent the data which we are interested in modelling at the lowest and highest frequencies (van Essen et al., 2018). As such, we select a low-and high-frequency threshold and use only the frequency interval between the thresholds in our analysis, as we shall now detail in Section 4.1.

Specification of low-and high-frequency thresholds for inference
Model misspecification presents a significant challenge for the fitting techniques discussed in this paper. Such misspecification can be generated in a variety of ways. Firstly, other component weather systems that we do not want to (or cannot) model may be present. Secondly, there may be noise due to the buoy not following the true motion of a particle on the water's surface. Finally, the approximations made by linear wave theory that justify the transfer function in (2.2) may not be valid. All of the aforementioned problems affect some frequencies more than others. Therefore, we shall remove frequencies that are heavily contaminated before fitting models to the data. Because we are using a frequency domain pseudo-likelihood, this is easy to do, and essentially just involves omitting the appropriate Fourier frequencies from the likelihood (as discussed in Section 3.1).
However, choosing which frequencies to remove is not trivial. One useful guide comes from (2.2), which implies that f zz (ω) = f xx (ω) + f yy (ω) under linear wave theory. Motivated by this, we define the error function R(ω) = log( f xx (ω) + f yy (ω)) − log( f zz (ω)). 5 An estimate,R(ω), of the error function can be obtained by first estimating the spectral density functions, then plugging them into the above formula for R(ω). Clearly we would expect R(ω) ≈ 0 for all ω ∈ [0, π/∆], so deviations from zero may indicate that there is a problem with a certain frequency range. Figure 9 shows a plot ofR(ω) for each half hour period from our example data set introduced in Section 2.2 using multitapering. Figure 9: Heatmap ofR(ω) for each half hour period in the example data set, computed using multitapering.
From Figure 9, we see a blue band in the very lowest frequency range with a red band sitting in the frequency range just above this, where the absolute value of the error function is significantly larger than zero. Therefore, in low frequencies the transfer function mentioned above is not valid, and as a result these frequencies are removed when fitting the model. Additionally,R(ω) is slightly negative in the highest frequencies. In other words, the spectral density of the {X(t)} and {Y(t)} processes decays more rapidly than that of the {Z(t)} process in the high frequency tail. This is possibly because the accelerometers for measuring the horizontal displacement of the buoy are mounted in a different way to the accelerometer measuring the vertical displacement, though more investigation is needed to ascertain the source of this discrepancy. Regardless, it is the general consensus that these instruments are more reliable for the middle of the frequency range than they are at the highest and lowest frequencies, and standard quality control of buoy data include checks for excessive level of low and high frequency spectral density (Christou and Ewans, 2014, for example).
Additionally, an old wind sea and a swell are present in the early sea states with the swell persisting, albeit with little energy, for most of the record. Since models for such conditions are beyond the scope of this paper, we only begin modelling when the new wind sea has become dominant, and remove frequencies in which the swell is large orR is sufficiently far from zero. The specific details are in the code provided on GitHub https://github. com/JakeGrainger/NorthSeaBuoyDisplacementAnalysis.git. Figure 10 shows this choice of modelling period and low frequency threshold with the modelling period delimited by dashed vertical lines and the threshold shown by dotted lines, indicating that only frequencies between these lines are included. 6

Parameter estimates
Here we estimate model parameters for S(ω, φ; θ) using debiased Whittle inference, for the frequency intervals specified in Section 4.1. Most of the parameters are initialised from standard values, with the only exceptions being ω p and φ m which are initialised by picking the frequency corresponding to the maximum of a non-parametric estimate of the marginal spectral density function, and the mean direction corresponding to this frequency respectively. Figure 11 shows the parameter estimates, with 95% approximate confidence intervals, calculated using the expected Hessian matrix and assuming parameters are Gaussian distributed. The location parameters ω p (the peak frequency) and φ m (the mean direction) behave as expected, following the spectral mode and reacting to changes in wind-direction respectively. They also evolve smoothly in time, despite fits being performed independently on non-overlapping sea states. The shape parameters for the marginal spectral density function (γ and r), clearly have time varying behaviour. The peak enhancement factor, γ, increases as each component wind sea evolves, then decreases as the component wind sea dies out. Similarly, from the estimates for r, the tail decay becomes less steep between components. It is likely that this is due to model misspecification, as we really have two wind-seas present, but are only modelling one of them. Furthermore, the shape parameters for the directional spreading (β, ν, σ l and σ r ) also show anomalous behaviour during these overlaps. In particular, we see large values of β (hitting the upper bound of the parameter space). Large values of β correspond to a wide spreading over direction, which likely occurs because there is another component present with different directional properties. However, outside these overlap periods we see stability in the parameters estimates, which is encouraging. Additionally, some of the estimates of σ r drop off to zero, because the low frequency threshold can make σ r unidentifiable as much of the information about σ r resides in frequencies below the peak frequency. As a result, we have an identifiability-bias trade off as lowering the threshold frequency introduces more of the noise processes, which tends to result in biasing of β, but raising the threshold makes σ r unidentifiable. This is a difficult problem, and is an important area of further research which we discuss more thoroughly in Section 5.
In summary, the parameter estimates converge to sensible values in the majority of sea states where a single wind-sea is present. Furthermore, looking at sea states where parameter estimates go to boundaries or unrealistic values helps to extract time periods of interest where the model fails and separate investigation is warranted. Figure 11: Parameter estimates using debiased Whittle likelihood inference over the period in question with approximate 95% confidence intervals. The two panels in the second row also include the spectrogram and wind direction for context. In order left to right then down, the panels show:Ĥ s ;α;ω p over the spectrogram on the decibel scale;φ m and wind direction;γ;β andν;r; andσ l andσ r , with approximate 95% confidence intervals.

Discussion and conclusions
This paper describes estimation of the parameters of frequency-direction spectra for ocean surface gravity waves from three-dimensional buoy displacement time series, using debiased Whittle likelihood inference. In simulation studies, debiased Whittle inference is shown to outperform inference using competitor techniques . Debiased Whittle inference for a sequence of sea states provides a means to characterise the joint evolution of spectral parameters in time, and allows uncertainties in parameter estimates to be quantified in a principled manner. The observed smooth nature of parameter evolution estimated from North Sea data, and the dependencies evident between parameters, are consistent with physical intuition.
Typically, the wave environment at a location is the product of different physical drivers, including swell and local wind forcing. In the current work, we focus on sea states corresponding to wind-sea conditions only, for clarity of description. More generally, debiased Whittle inference for mixed seas consisting of wind-sea and one or more swells is possible; in simulation studies of data for mixed seas (not shown), debiased Whittle inference again performs well. In simulation studies on samples of 30-minute records corresponding to wind-sea conditions, the debiasing procedure makes a small but marginal improvement over standard multivariate Whittle estimation. However, when fitting the joint wind-sea and swell model to mixed sea states, estimates using standard Whittle inference exhibit substantially greater bias than those from debiased Whittle inference.
In-situ measurement of the ocean environment is invariably problematic. In the present study, buoy displacement time series are contaminated by additional low-frequency processes, leading to spurious low-frequency spectral features not represented in the assumed parametric spectral form to be estimated. At very high frequencies, buoy displacement time series are further subject to on-board low-pass filtering effects not represented in the assumed spectral form. We adjust the inference procedure for these sources of model misspecification by only considering a central band of frequencies in the likelihood, set using low-frequency and high-frequency thresholds. In general, the low-frequency threshold in particular should be chosen carefully, to achieve a good balance between model misspecification (when the threshold is too low) and identifiability (when the threshold is set so high that aspects of the spectral form cannot be resolved). We have explored extending the spectral form to accommodate an additional low-frequency noise feature, but found that achieving this reliably required a large number of extra parameters, and resulted in greater loss of efficiency in estimating the wind-sea (and swell) components of interest compared to frequency thresholding.
Spectral estimates in the current work are based on data for the ocean's surface displacement only. In general, it would be advantageous to incorporate the effects of covariates such as the evolving wind field on the spectral form, particularly for characterisation of mixed seas. For example, the direction associated with a wind-sea component at a location is dependent on local wind speed and direction, whereas the characteristics of a swell component do not vary substantially with the local wind field. These covariate dependencies are often exploited by physical oceanographers to partition the frequency-direction domain into sub-domains corresponding to wind-sea and swell components (Hanson and Phillips, 2001, for example).
The spectral characteristics of ocean waves evolve smoothly in time. In this paper, as is common practice, we accommodate temporal non-stationarity by partitioning time series into consecutive 30-minute sea states which are considered stationary for purposes of spectral inference. Improved bias-variance properties of parameter estimates from debiased Whittle inference suggest that spectral estimation using sea states of shorter duration is feasible for more-rapidly evolving ocean environments; initial simulation studies (not shown) support this finding. More generally, simultaneous spectral estimation for a sequence of consecutive sea states exploiting smooth time-varying basis representations for spectral parameters (e.g. using splines), or adaptive estimation of evolving spectral forms (e.g. using dynamic linear models) are obvious research avenues.
The methodology for spectral inference described in this paper is generally applicable, provided that an appropriate model for the spectral density matrix function can be obtained by applying a suitable transfer function to the model for the frequency-direction spectrum. Thus, in addition to three-dimensional buoy displacement time series, debiased Whittle inference is applicable to heave-pitch-roll buoy data, for example. A collection of useful transfer functions for commonly used oceanographic devices is given by Benoit et al. (1997). The methodology can be modified for similar applications involving observations of a process viewed as a linear time-invariant filter of some latent process of interest. Practical issues encountered in the current work, relating to time series aliasing, unusual sources of measurement noise and complex likelihood functions are common across many applications (e.g. involving accelerometers and GPS tracking). Hence, we hope that the methodology presented and the ideas it incorporates will prove useful to the practicing oceanographer, ocean engineer and applied statistician.

A.1 Definition and relation to autocovariance
We are interested in modelling the spatiotemporal process {η(x, y, t)} (x,y,t)∈R 3 , which constitutes the surface displacement of the water over time and space. Let τ be the temporal lag, and l = [l x , l y ] be a vector of spatial lags. Then, assuming stationarity, c η (l, τ) = E η(l x , l y , τ)η(0, 0, 0) is defined to be the autocovariance of the spatiotemporal process. Denote the angular frequency by ω and wave-vector by κ = [k x , k y ]. The spectral density function of the spatiotemporal process is then Note that this is different to the definition given in some oceanography texts (Barstow et al., 2005, for example), this is because the angle φ = arg(k x + ik y ) is defined as the direction the wave is coming from, and so we need ωτ + κ · l as opposed to ωτ − κ · l in the exponential function (which is present in definitions when the author wants direction to be the direction the waves are propagating towards). Of course, this is only a convention, but it is worth noting the difference when applying these techniques. Now, whilst in general a process such as {η(x, y, t)} (x,y,t)∈R 3 requires a spatial spectral density function expressed in terms of a frequency and two wavenumbers (a wave-vector), under a simplification of the governing equations for wave dynamics known as linear wave theory, the absolute value of the wave-vector is specified uniquely by the absolute value of the frequency, using a dispersion relation. For this reason, the covariance structure of the process is usually specified by a frequency-direction spectrum, which we denote S(ω, φ), with the relation where h denotes the water depth when the water is still (which is assumed constant). More details can be found in Barstow et al. (2005) and references therein, for example.

A.2 Models for a wind-sea frequency-direction spectrum
One of the most widely used spectral density functions for modelling the univariate vertical surface displacement resulting from wind-sea waves is the JONSWAP (Hasselmann et al., 1973) spectral density function, given by with parameters α, ω p , γ, r.
For the spreading function, we use the bimodal wrapped Gaussian model proposed by Ewans (1998). The bimodal wrapped Gaussian spreading function is defined as for ω ∈ R and φ ∈ [0, 2π]; where φ m1 (ω; θ) and φ m2 (ω; θ) are the mean direction functions and σ(ω; θ) is the angular width function. These functions are themselves parameterised. Ewans (1998) gives a parameterisation with fixed values based on observed data, with a single location parameter to determine the mean direction. We shall use a less restrictive description by adding parameters for the shape and scale of the spreading (a similar parametrisation was used by van Zutphen et al. (2008), but we use slightly fewer parameters as some of the parameters in van Zutphen et al. (2008) have little effect on the frequency-direction spectrum). In particular, we write This adds an additional 5 parameters, namely φ m , β, ν, σ l , σ r .

A.4 Finite water depth correction
The relation given in (2.2) is for waves in deep water. For finite water depths a slightly different relation is required. In particular, we have where h is the water depth (in metres) 7 and ω 2 = k tanh(kh). Consequently we have meaning that the correct definition for R(ω) is R(ω) = log( f xx (ω) + f yy (ω)) + 2 log(tanh(kh)) − log( f zz (ω)).
It is this definition for R(ω) we use in Figure 9.

B Alternate inference techniques
In this appendix, we describe competitor techniques for estimating the parameters of model frequency-direction spectrum from observed buoy data. Typically, a two stage approach is taken. Firstly, the parameters of the spectral density function of the vertical displacement are estimated using least squares 8 and then the parameters of the spreading function are estimated separately. The latter is usually done in one of two ways: a momentsmatching approach (Ewans, 1998, for example); or by producing a non-parametric estimate of the spreading function, then fitting using least squares. Both of these techniques start from writing the spreading function as a Fourier series (which is possible from the periodicity of the spreading function): D(ω, φ) = 1 π 1 2 + ∞ ∑ n=1 a n (ω) cos(nφ) + b n (ω) sin(nφ) .
From (2.2) we can see that The remaining coefficients cannot be obtained from the cross-spectra in general. The approach to solving this problem has typically been to guess at the remaining Fourier frequencies either based on the Fourier coefficients of a model or by making some other assumptions about the behaviour of the spreading function. Of course, this assumes we know the cross-spectra, but we must estimate them. This distinction is not trivial.

B.1 Least squares fitting to estimates of the spreading function
A commonly used technique involves fitting the model spreading function to a non-parametric estimate of the spreading function using least squares. In other words, givenD(ω, φ), an estimate of the spreading function 9 , the parameters are obtained by solving argmin θ ∑ ω∈Ω ∑ φ∈Φ D(ω, φ; θ) −D(ω, φ) 2 .
The problem with this is that such a technique assumes that the estimator used for the spreading function is unbiased, normally distributed, homoscedastic and that at different pairs of frequency and direction estimates are uncorrelated; however, none of these are satisfied in practice. In particular, correlation across frequency is high for both MLM and MEM methods, and bias is substantial. As a result, estimation of anything other than location parameters using this technique performs poorly.

B.2 Moments-matching approach
Early approaches to fitting parametric spreading functions to data from buoys, such as Mitsuyasu et al. (1975), tended to match the Fourier coefficients estimated from the buoy to the theoretical Fourier coefficients from the model (under the relevant transfer function). These approaches actually work by first estimating the parameters of the spreading function at each frequency (which are different from the model parameters), and then essentially doing regression analysis to work out the parameters of the model for the behaviour of the spreading function over frequency. In our case, following Ewans (1998), at each frequency estimate θ ω = [φ m1 (ω), φ m2 (ω), σ(ω)] usingθ ω = argmin θ ω |c 1 (ω; θ ω ) −ĉ 1 (ω)| 2 + |c 2 (ω; θ ω ) −ĉ 2 (ω)| 2 where c j (ω; θ) = a j (ω; θ) + ib j (ω; θ) andĉ j is an estimate for c j (ω; θ) obtained by plugging in estimates for the relevant cross-spectra (typically estimated using some variation on Welch's Method), for j = 1, 2.
The parameters of interest (β, ν, σ l , σ r ) are then estimated bŷ Such a technique is usually not applied to a single sea state, but instead is applied to multiple sea states with the view to fixing the parameters of the spreading function (bar the mean direction). You can apply this to a single sea state but, as we show in Section 3.2, this performs poorly. However, it should be remembered that this technique can still be useful for getting a general idea of the shape different aspects of the spreading function can take, but it is not useful for estimating the parameters of a single sea state.

C Optimisation and gradient calculation
Parameters are estimated jointly by optimising the debiased Whittle likelihood using the interior point Newton method as implemented in Optim.jl (Mogensen and Riseth, 2018). We use Fisher scoring as the expected Hessian of the debiased Whittle likelihood can be computed analytically from the first derivatives of the autocovariance (whilst the Hessian would require the second derivatives as well). This results in very fast optimisation compared to other approaches. For further details, see the Julia package WhittleLikelihoodInference.jl (https://github.com/JakeGrainger/WhittleLikelihoodInference.jl.git).