## Summary

Traveltime inversion focuses on the geometrical features of the waveform (traveltimes), which is generally smooth, and thus, tends to provide averaged (smoothed) information of the model. On other hand, general waveform inversion uses additional elements of the wavefield including amplitudes to extract higher resolution information, but this comes at the cost of introducing non-linearity to the inversion operator, complicating the convergence process. We use unwrapped phase-based objective functions in waveform inversion as a link between the two general types of inversions in a domain in which such contributions to the inversion process can be easily identified and controlled. The instantaneous traveltime is a measure of the average traveltime of the energy in a trace as a function of frequency. It unwraps the phase of wavefields yielding far less non-linearity in the objective function than that experienced with conventional wavefields, yet it still holds most of the critical wavefield information in its frequency dependency. However, it suffers from non-linearity introduced by the model (or reflectivity), as reflections from independent events in our model interact with each other. Unwrapping the phase of such a model can mitigate this non-linearity as well. Specifically, a simple modification to the inverted domain (or model), can reduce the effect of the model-induced non-linearity and, thus, make the inversion more convergent. Simple numerical examples demonstrate these assertions.

## 1 Introduction

Despite its potential, full waveform inversion (FWI) has long suffered from the complex non-linearity of the objective function caused by the sinusoidal nature of wavefields, as well as, from the complex nature of the Earth's reflectivity. Nevertheless, we have come a long way to make waveform inversion more practical. For one, the cost in evaluating the gradient and Hessian was significantly reduced when Tarantola (1984a,b) and Lially (1984) suggested to represent the gradient function with the adjoint state. To tackle the issue of the lack of data sensitivity to parts of our model, especially in the elastic case, regularization, geared to making the inversion problem well posed by adding indirect constraints on the estimated model (Engl et al. 1996; Zhdanov 2002), gained traction. Originally developed by Tikhonov (1963) and others, regularization has become an indispensable part of the inverse problem. Lately, we saw development in implementation domains, for example, in the frequency and Laplace domains with recipes to navigate the complex non-linearity of the seismic inversion problem (Pratt 1999; Shin & Ha 2008). Despite such progress, waveform inversion is still not real-data friendly as it requires very low-frequency content in the data and/or an initial model for the inversion that produces synthetic data that is at most a half cycle away (everywhere in the wavefield) from that given by the true model. Despite the need for low frequencies in FWI, we almost always use the high-frequency behaviour of wavefields to develop our initial model. This is a disconnect that will cause an unnatural start to waveform inversion (Ellefsen 2009). Using a wavefield attribute that is inclusive of the full waveform content, but represented in a form that can be related to the high-frequency nature of the development of the initial model, is beneficial.

Tampering with the objective function is one way to reduce the effect of the non-linearity of the inversion problem, as well as solve other problems encountered in FWI. As an alternative to the conventional least-squares misfit function, Shin & Min (2006) suggested using logarithmic wavefields to isolate the amplitude and phase contributions. Additionally, Choi & Alkhalifah (2011b) used the convolved wavefields to eliminate the need to estimate the source. In fact, utilizing the phase information of our data in the objective function in what is referred to as phase inversion (Kim & Shin 2005) emphasizes the importance of the phase aspect of our wavefields in the inversion process. Nevertheless, most of the phase inversion implementations are based on extracting the phase without eliminating the wrapping phenomena, and thus, do not contribute much to reducing the non-linearity (cyclical nature of wavefields) of the inversion. Recently, Choi & Alkhalifah (2011a) used an unwrapped phase-based inversion; it was useful to obtain a convergent inversion even for high frequencies. It, however, required aggressive Laplace-based damping to the data to avoid the non-linearity induced by the complex reflectivity of the Earth, and thus resulted in relatively smooth models. Shah *et al.* (2012) unwrapped the phase in stages, but also had to focus on refracted data as the reflection non-linearity was still a source of strong non-linearity.

In this paper, we analyse the unwrapped phase-based inversion and compare it with conventional inversion methods. We introduce modifications to the unwrapped phase inversion that helps reduce the need for an aggressive damping. We specifically focus our analysis on the objective function and the non-linearity issue as it is at the heart of the success of the gradient method for model updates. We illustrate the benefits of this new unwrapped phase objective function and reflectivity model with simple numerical experiments.

## 2 Waveform Inversion

The full waveform inverse problem can be formulated mathematically as an optimization problem (Beylkin 1985; Symes 2008). Let us denote by *m* the subsurface parameter that we aim to model, by *d* the data collected by various experiments and observations, or attributes of such data, and by *L* the modelling process using the model *m* to obtain data to fit *d*. As the experiments typically involve acoustic, elastic and possibly electromagnetic waves, the operator *L* is modelled by various wave type equations. At this level, the ultimate goal is to find the parameter *m*^{★} that fits the data *d*, that is, that optimizes the objective function

*m*(here ||·|| is a certain norm). The FWI approximates the solution

*m*

^{★}via an iterative process from an initial guess

*m*

_{0}. At each stage of the process, given the current approximation

*m*

_{n}, we first perform a linear approximation of

*L*(

*m*) near

*m*

_{n}:

Let us define *F*_{n} :=*L*′(*m*_{n}) and it is clear that *F*_{n}: δ→*F*_{n}δ measures how much data, *d*, are affected by changes in the parameter *m*. Assuming that the above approximation is adequate and plugging it into the minimization process for an *l*_{2} norm, we obtain an approximate optimization problem

_{n}is supposed to approximate

*m*

^{★}−

*m*

_{n}. This problem can be solved explicitly by Once δ

_{n}is known, we set

*m*

_{n+ 1}=

*m*

_{n}+δ

_{n}and repeat the iterative procedure until δ

_{n}is sufficiently small.

The most time-consuming steps of the FWI are the application of *F*_{n} and . The most popular formulation of the FWI is the time-domain approach, in which the application of *F*_{n} and are performed by propagating the source forward in time and the data residuals backward in time, followed by a cross-correlation step. This approach typically requires storing and accessing the whole time history of the forward and backward wave evolutions, which can be quite costly in terms of computational time and storage space. An alternative approach is to work with the frequency domain formulation, that is, the acoustic Helmholtz wave equation

**x**is a vector describing a location in the 3-D Cartesian space, and ω is the angular frequency. This approach is attractive because there is no need to store the whole time history and often one works with only a small number of dominant frequencies.

## 3 Analysis of the Objective Function

In eq. (1) for full waveform seismic inversion, and in exploration applications, we conventionally use the *l*_{2}-norm with *d* given by the measured and assumed acoustic seismic data for all sources and receivers, and *m* given by the seismic *P*-wave velocities. To test this objective function, we consider a plane wave propagating vertically in a medium with a constant vertical gradient of the squared slowness , where , and *v* is the velocity. In this medium, the Helmholtz eq. (2) reduces, with a change of variables, to the Bessel equation and has a known exact analytical solution Pekeris (1946). The wave extrapolated from depth *z*_{0} to depth *z*_{1} is

For the tests below, we consider the true model to have a = 2 km s^{−1} and *g* = −0.139 s^{2}*et al.* km^{−3}, which results in a velocity of 3 km s^{−1} at a depth of 1 km. Thus, we place our source at the surface (*z*_{0} = 0) and measure the wavefield at depth *z* = 1 km. Fig. 1 shows an objective function based on the conventional *l*_{2} norm, where *g* is set to the true value and we vary only *v*_{0}. Clearly, the objective function for this single event trace is highly non-linear over the velocity range, with non-linearity reducing as frequency decreases. This type of non-linearity is caused solely by the nature of wavefields as phase has a periodic behaviour, and thus it wraps around the 2π range. Another source of non-linearity arises, as we will see later, when having more than one event in the seismic trace in which events interact with each other (something similar to crosstalk).

## 4 The Instantaneous Traveltime

In the spirit of the instantaneous frequency, the instantaneous traveltime can be extracted using the following formula:

where*u*is the wavefield in the frequency domain and ℑ stands for the imaginary part. Considering that then and in the high-frequency limit φ(ω) =ω

*t*(

**x**), and thus, τ=

*t*(

**x**), which is typically extracted from the eikonal equation. Therefore, the instantaneous traveltime provides us with a frequency-dependent time of a single event in the seismic trace. For more than one event in the trace, it provides an average traveltime of the events weighted by the amplitude of each event (Saragiotis

*et al.*2011) at that particular frequency. The detail information of the events are stored in the frequency dependency of the traveltime. This feature will be utilized in the inversion process.

Accordingly, an objective function based on this attribute is defined as the misfit between the instantaneous travetime of the observed and modelled data. Choi & Alkhalifah (2011a) showed that this approach provides convergent inversions of smooth velocity models provided we use reasonably high Laplace damping of the wavefield with time. This was necessary to avoid the non-linearity introduced by the model (or the reflectivity) when events talk to each other. We will stick with the *l*_{2}-norm as a measure of misfit throughout.

Fig. 2 compares objective functions using various attributes to represent the wavefield for the single event test described in Fig. 1. Fig. 2(a) displays the conventional objective function shown in Fig. 1 in a density plot format for comparison, and we clearly observe the high degree of non-linearity for such a simple model induced by the cyclical nature of the wavefield. The logarithmic wavefield-based objective function given by Fig. 2(b) does not fair better as it includes the same non-linearity. On the other hand, the instantaneous traveltime objective function, shown in Fig. 2(a), clearly eliminates such non-linearity at all frequencies.

However, this is not the whole story as we rarely have one event in our data, and thus, we will next evaluate the objective function for two nearby events recorded on a single trace.

## 5 Multiple Events

We simulate multiple events for the linear sloth (slowness squared) model described earlier by including two sources. Considering the linear nature of wavefields the result is a simple summation of the two wavefields each given by eq. (3). The sources are 200 m apart, which will induce zero spectral responses at certain frequencies (due to destructive interference).

The two events add non-linear (cyclical) behaviour to the inversion process resulting from their interaction, and thus, Fig. 3 shows such non-linearity, which tends to be in the direction of the frequency axis, almost normal to the non-linearity (cyclical behaviour) caused by the wrapped phase phenomena. For the instantaneous traveltime (eq. 4), which relies on a division over the wavefield, the zero-wavefield values as a result of having two events causes instabilities in the objective function (Fig. 3b). The instantaneous traveltime can be easily stabilized by multiplying the numerator and denominator of eq. (4) by the complex conjugate of *u* and adding a small positive number, ε, to the denominator, as follows:

_{s}, the gradient evaluation process is very complicated.

To obtain an objective function that has the features of the instantaneous traveltime including the unwrapped phase, without complicating possible computations of the gradient of the objective function, we consider the following attribute function

Note that if we set*A*to be frequency independent (like in a delta function), then

*q*is nothing but τ, given by eq. (4) weighted by the amplitude to remove the effect of dividing over the wavefield, yet it has the unwrapped phase feature and we can easily compute the gradient of this objective function. Fig. 4 shows the objective function based on this new measure of the wavefield for three different distances between the sources. The objective function has non-linearity mostly in the frequency-axis direction, but it is generally smoother than the other objective functions.

The cyclical behaviour in the objective function stemming from the multi-event interaction has a direction near orthogonal to the wavefield based non-linearity, as it stems from the model, not the sinusoidal nature of wavefields, specifically the reflectivity model. For band-limited signals, the reflectivity model has its own phase wrapping issues, as the phases of events with cyclical shifts beyond the wavelength interact. This happens even if our source is given by a Gaussian function, as it is related to the reflectivity. Next we analyse such non-linearity.

## 6 Model-Induced Non-Linearity

To analyse the model-induced non-linearity, we continue with the 1-D problem, and even reduce it to a simple case of two horizontal reflectors with responses given by the following:

which consists of direct reflections from two reflectors, no multiples. Furthermore, for simplification, we consider the amplitude to be frequency independent, and take the high frequency limit, φ_{1}= ω

*t*

_{1}and φ

_{2}= ω

*t*

_{2}, and, thus, obtain Clearly the source of non-linearity related to phase wrapping in this wavefield attribute is in the last term and specifically the cosine function, as traveltimes tend to be smooth. For a fixed Δ

*t*=

*t*

_{1}−

*t*

_{2}, which depends on the earth model, this function is ω dependent in a sinusoidal way. For large Δ

*t*, Fig. 4(c) demonstrates that the non-linearity cycles become shorter. By considering the real case of a frequency-dependent amplitude and phase, such non-linearity is more complicated, but should have the same general behaviour.

If we approach this problem from the model perspective, or from its representation via reflectivity (or alternatively velocity), which is given by

we obtain a similar formula when evaluating , with ψ_{1}=

*k*

_{z}

*z*

_{1}and ψ

_{2}=

*k*

_{z}

*z*

_{2}, where

*k*

_{z}is the depth wavenumber and

*z*

_{1}and

*z*

_{2}are the depths of the reflectors. Recognizing, that

*z*

_{1}=

*vt*

_{1}and

*z*

_{2}=

*v*

*t*

_{2}, allows us to connect the two attributes in the inversion for each of the model and data, and thus, remove this source of non-linearity. In other words, the cosine function in eq. (11) is coherent in both the data and model, as the wavenumber is linearly proportional to the frequency (ignoring dispersion) resulting in a smoother objective function. In the reflectivity representation of the model, the phase contains information related to the velocity and the depth of the reflector, often with a trade-off between the two, while the amplitude depends mainly on the velocity change (contrast) information. Also, clear from this exercise is that the source of non-linearity (the cosine function) is dependent on the difference between reflectors (thickness of layers). If we keep the thickness constant by focusing the description of the model on average depths we should mitigate the source of non-linearity. A more elegant way of doing this is to unwrap the model.

## 7 Unwrapping the Model

Like in the case of our data, we apply phase unwrapping to the model by considering the instantaneous depth, which is for the 1-D case

Here we use reflectivity, but the concept is equally applicable to velocity. We define the band-limited reflectivity for a constant density model (, where * stands for the convolution process and*s*is the source function) in the wavenumber domain as where

*V*is the 1-D velocity model and

*S*is the source function in the wavenumber domain. Since modelling requires the ability to transform the instantaneous depth to reflectivity () and also to velocity, we define a complex version of it as follows: with the imaginary part given by the instantaneous depth, and the real part governed mostly by the source wavelet and the velocity contrasts.

The unwrapped phase reflectivity can be extracted from this attribute by solving the following ordinary differential equation:

with and*R*

_{u}(0), for a piecewise constant velocity model, is given by the sum of velocity contrasts in the model, as suggested by eq. (11).

If we constrain the real part of eq. (14), then the phase part of our reflectively function is given by

which is referred to as the unwrapped phase and is guaranteed to be continuous (Stoffa*et al.*1974). Thus, we are effectively representing the model by the unwrapped phase of the reflectivity function. In other words, we are using the unwrapped phase of the data to invert for the unwrapped phase of the model. Note that in the zero-wavenumber limit this attribute is given by the average depth weighted by the amplitude part (velocity contrasts). We will focus on this feature next.

### 7.1 Analysis of the unwrapped model

Now we can expand this unwrapped phase representation of the reflectivity in Taylor's series around the zero wavenumber to obtain:

where for a piecewise constant velocity model the wrapped phase reflectivity is given by: and Δ*v*

_{i}is the velocity contrast along interface

*i*at depth

*z*

_{i}for

*N*interfaces. Thus, and Note that the coefficients of this expansion reduce to the velocity-contrast-weighted moments of the depths of the interfaces. In fact, note that ζ(0) is nothing but the velocity-contrast-weighted average of the depths of interfaces given by For invariant velocity contrasts, it is exactly the average depth.

### 7.2 Average depth

To simulate reflection data recorded on the surface in a simple analysis-friendly form, we again use the multisource model described earlier. However, now we place the sources at depths 1 and 1.2 km in the constant-gradient sloth model described in Fig. 1, and the receiver at the surface. In this setting, the sources act as reflectors in the exploding reflector sense and the critical information in this case is the depth of these sources (reflectors). For the same smooth (constant gradient of sloth) background velocity model used for the observed and calculated wavefields, inverting for the source depth includes the potential contribution of a velocity misfit or a reflector depth misfit considering the trade-off between the two in this 1-D problem. Fig. 5 shows the non-linearity in the objective function induced by this multireflector setting as a function of varying the depth of one of the sources (reflectors) in the modelled data. The depth of that source for the observed data is 1 km, and thus, we obtain zero misfit at that depth. The level of non-linearity in the objective function using the wavefield derivative amplitude (eq. 8) is large. Note in the objective function that the model-induced non-linearity (cyclical behaviour) is nearly orthogonal to the waveform based non-linearity especially near the true solution. However, if we decide to invert for the average depth of the two sources (reflectors), which is equivalent to the DC component of the instantaneous depth described in eq. (12), we obtain very mild non-linearity for this objective function as shown in Fig. 6c. This demonstrates the power of inverting for the unwrapped phase reflectivity model using the unwrapped phase wavefield. The additional details of the reflectors including the thickness between the two reflectors are embedded in the unwrapped phase representation of the reflectivity model as a function of wavenumber, which is somewhat synonymous to the Fourier representation of the reflectivity field.

## 8 The Velocity Gradient

Unlike traveltimes, wavefields contain information about the smooth velocity variation even with a single measurement. Such information is stored in the frequency dependency of the wavefield. Using the new wavefield measure based on the absolute value of the wavefield derivative with respect to angular frequency, we evaluate the objective as a function of surface velocity *v*_{0} and gradient *g*. Fig. 7 focuses on the objective function near the solution given by the target symbol for three different frequencies. Though there is a trade-off between the gradient and the velocity per frequency, eq. (8), which is well known in the high-frequency traveltime limit, at low frequencies we see variations in the objective function per frequency. Such variations are also observed with the conventional objective function at almost the same scale (Fig. 8). In both figures we focus on the objective function near the solution so non-linearity is not an issue at these frequencies. This feature emphasizes that despite the smooth objective function obtained with the unwrapped phase, which is a feature shared by traveltime tomography, these objective functions contain also waveform inversion information. Though the variation seems subtle in both the conventional and the unwrapped phase inversions, we expect these differences to become more apparent in more complex velocity models and at higher frequencies.

## 9 Discussion

The FWI includes complexities far beyond those analysed in this paper, however, our objective here is to provide insights into the non-linearity (cycle skipping) associated with the inversion process and suggest credible solutions to these complex non-linearities. Unwrapping the phase of the wavefield, in its data or model forms, mitigates the periodic nature of its phase in time and depth, respectively.

The unwrapping of phases in the data and the reflectivity model is an alternative to implementing the combination of the frequency and Laplace inversion schemes used to navigate the non-linearity introduced by the data and model, respectively. To do so the Laplace and frequency approaches rely heavily, at the initial stages, on good quality shallow and low-frequency data. The alternative approach proposed here allows us to keep vital waveform inversion information intact while tackling the source of non-linearity, the phase wrapping of the model and waveform. The process does not require initial elimination of high-frequency and later data.

The representation of the velocity model based on the resultant reflectivity is an invertible process provided we know the velocity at the surface (initial velocity). An inversion process requires the development of an update procedure given typically by the gradient. Thus, the derivative of the objective function with respect to the unwrapped phase of the reflectivity can be extracted from the conventional Jacobian, the first-order partial derivatives of the residuals with respect to the model parameters, using the chain rule. Nevertheless, we can equally unwrap the phase of the velocity model directly, though velocity model discontinuities may pose a challenge, we leave this subject to another study.

## 10 Conclusions

An analysis of various objective functions for seismic waveform inversion demonstrates that the type of attribute used in measuring the misfit between the observed and modelled data plays a crucial role in reducing the non-linearity in the inversion process. While the wavefield's cyclical nature induces non-linearity in the conventional least-squares objective function along the inverted parameter direction, the reflectivity model introduces its own non-linearity mostly in the frequency direction. Both non-linearities can be mitigated by using instantaneous attributes on the observed data and the reflectivity (or velocity) model to unwrap their phases.

## Acknowledgments

We thank Sergey Fomel and Christos Saragiotis for many useful discussions. We thank KAUST for its support. We also thank the editor and two anonymous reviewers for their comments and suggestions that helped improve the paper.