Eigenvector models for solving the seismic inverse problem for the Helmholtz equation

We study the seismic inverse problem for the recovery of subsurface properties in acoustic media. In order to reduce the ill-posedness of the problem, the heterogeneous wave speed parameter is represented using a limited number of coefficients associated with a basis of eigenvectors of a diffusion equation, following the regularization by discretization approach. We compare several choices for the diffusion coefficient in the partial differential equations, which are extracted from the field of image processing. We first investigate their efficiency for image decomposition (accuracy of the representation with respect to the number of variables). Next, we implement the method in the quantitative reconstruction procedure for seismic imaging, following the full waveform inversion method, where the difficulty resides in the fact that the basis is defined from an initial model in which none of the actual structures is known. In particular, we demonstrate that the method may be relevant for the reconstruction of media with salt domes. We use the method in 2-D and 3-D experiments, and show that the eigenvector representation compensates for the lack of low-frequency information; it eventually serves us to extract guidelines for the implementation of the method.

the background velocity (the low-frequency profile) is not correctly anticipated (Gauthier et al. 1986; Luo & Schuster 1991; Fichtner et al. 2008; Barucq et al. 2019b). For this reason, the phase information is included in the traveltime inversion by Luo & Schuster (1991) by using a cross-correlation function between measurements and simulations, where the relative phase shift is given by the maximum of the correlation. The method is further generalized by Gee & Jordan (1992), while Van Leeuwen & Mulder (2010) propose to select the phase shift using a weighted norm. The choice of misfit has encountered a growing interest in the past decade: in application to global-scale seismology, Fichtner et al. (2008) compare the phase, correlation-based and envelope misfit functionals, the latter being also studied by Bozdag et al. (2011). In exploration seismic, comparisons of phase and amplitude inversion are performed by Bednar et al. (2007) and Pyun et al. (2007). The L1 norm is studied by Brossier et al. (2010), while approaches based upon optimal transport are considered by Métivier et al. (2016) and Yang et al. (2018). In the context where different fields are measured, Alessandrini et al. (2019) and Faucher et al. (2019) advocate for a reciprocity-based functional, which further connects to the correlation-based formulas. In the case of accurate knowledge of the background velocity, the inverse problem is close to linear or quasi-linear as the Born approximation holds; then, alternative methods of linear inverse problems can be applied, such as the Backus-Gilbert method (Backus & Gilbert 1967, 1968). The difficulty to recover the background velocity variation has also motivated alternative parametrizations of the inverse problem: for instance, the MBTT (migration-based traveltime) reformulation of FWI (Clément et al. 2001; Barucq et al. 2019b), which represents the wave speed from a background model and a data-space reflectivity model.
In order to diminish the ill-posedness of the inverse problem, a regularization criterion can be incorporated. It introduces an additional constraint (in addition to the fidelity between observations and simulations), which, however, may be complicated to select a priori and is problem dependent (with 'tuning' parameters). For instance, we refer to the body of work of Kirsch (1996), Isakov (2006), Kern (2016), Kaltenbacher (2018) and the references therein. In the regularization by discretization approach, the model representation plays the role of regularizing the functional, by controlling (and limiting) the number of unknowns in the problem, and possibly defining (i.e. constraining) the shape of the unknown (e.g. to force smoothness). Controlling the number of unknowns influences the resolution of the outcome, but also the stability and convergence of the procedure. The use of piecewise constant coefficients appears natural for numerical applications, and is also motivated by stability results (Alessandrini & Vessella 2005; Beretta et al. 2016). However, such a decomposition can lead to an artificial 'block' representation (cf. Beretta et al. 2016; Faucher 2017), which would not be appropriate in terms of resolution. For this reason, a piecewise linear model representation is explored by Alessandrini et al. (2018, 2019), still motivated by the stability properties. We also mention the wavelet-based model reductions, which offer a flexible framework and are used for the purpose of regularization in seismic tomography by Loris et al. (2007, 2010). In the work of Yuan et al. (2014, 2015), FWI is carried out in the time-domain with a model represented from a wavelet-based decomposition. In our work, we use a model decomposition based upon the eigenvectors of a chosen diffusion operator, as introduced by De Buhan & Kray (2013), Grote et al. (2017) and Grote & Nahum (2019).
Note that this decomposition is shown (with the right choice of operator) to be related to the more standard total variation (TV) or Tikhonov regularizations. The main difference in our work is that we study several alternatives for the choice of the operator following image-processing techniques, which traditionally also rely on such diffusion PDEs (e.g. Weickert 1998). We first investigate the performance of the decomposition depending on the choice of PDE, and, next, the performance of such a model decomposition as a parametrization of the reconstruction procedure in seismic FWI. It appears that the efficient choice of PDE changes depending on the situation. In addition, we provide a series of experiments to extract robust guidelines for the implementation of the method in seismic imaging.
We specifically target the reconstruction of subsurface salt domes (i.e. media with high-contrast objects), which is particularly challenging because of the change of kinematics involved and the lack of low-frequency data, in addition to the usual restrictive data coverage (Farmer et al. 1996; Chironi et al. 2006; Faucher et al. 2019a). In such cases, the use of the total variation regularization (Rudin et al. 1992) with FWI is becoming more and more prominent, and consists in incorporating an additional constraint on the model in the minimization problem. Its efficiency is shown in the context of acoustic media with salt-dome contrasts by, for example, Brandsberg-Dahl et al. (2017), Esser et al. (2018), Kalita et al. (2019) and Aghamiry et al. (2019). In our work, we study several alternatives and demonstrate that the model representation with the criterion extracted from Geman & Reynolds (1992) appears the most appropriate for the eigenvector decomposition method in the presence of salt domes. We also show the limitations of the method; in particular, it appears that the decomposition fails to represent models which are composed of several thin structures.
In Section 2, we define the inverse problem associated with the Helmholtz equation and introduce the iterative method for the reconstruction of the wave speed. In Section 3, we review several possibilities for the model decomposition using the eigenvectors of the diffusion operators. The numerical implementation of the method and the subsequent computational cost are given in Appendix A. The process of model (image) decomposition is illustrated in Section 4. Then, in Section 5, we carry out the iterative reconstruction with FWI experiments in two and three dimensions. Here, the model decomposition is based upon the initial model, which does not contain a priori information on the target, hence increasing the differences of performance depending on the selection of the basis. It allows us to identify the best candidate for the recovery of salt domes, and to extract some guidelines for applications in quantitative reconstruction.

Forward problem
We focus on acoustic media where, for simplicity, the density is taken as a constant, leading us to the identification of a single heterogeneous parameter: the wave speed c. We consider a domain Ω in two or three dimensions, with Ω ⊂ R² or Ω ⊂ R³. The boundary of Ω, ∂Ω, is separated into two parts: the upper (top) boundary, Γ₁, represents the interface between the Earth and the air, with a free-surface boundary condition imposing the pressure to be zero. The other part of the boundary, Γ₂, corresponds to the numerical restriction of the Earth for the computations, and we use absorbing boundary conditions (ABC, cf. Engquist & Majda 1977) to ensure that the waves are not reflected back into the computational domain. For a source f and an angular frequency ω, the forward problem writes in terms of the pressure field p, solution to the Helmholtz equation,

(−ω² c⁻²(x) − Δ) p(x) = f(x)  in Ω,
p = 0  on Γ₁,    (1)
∂_ν p − i (ω/c) p = 0  on Γ₂,

where ∂_ν is the normal derivative and c denotes the wave speed. The subsurface imaging works with partial surface measurements, which correspond to observations of the wave propagation at the (discrete) device/receiver locations. We refer to this set of positions as Σ and we define the forward map at frequency ω, F_ω (which links the model to the data), such that

F_ω(m) = {p(x)}_{x∈Σ}.    (2)

We have introduced m(x) := c⁻²(x), which is our choice of parameter for the reconstruction. In seismic, the data are further generated from several point sources (excited one by one) and all devices (sources and receivers) remain near the surface.
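As an illustration of the forward problem (1), the following is a minimal 1-D sketch: a finite-difference Helmholtz solve along a depth line, with the free surface (p = 0) at the top and a first-order absorbing condition at the bottom. This is an assumed discretization for illustration only; the paper's actual solver (detailed in Appendix A) may differ.

```python
import numpy as np
import scipy.sparse as sp
import scipy.sparse.linalg as spla

def helmholtz_1d(c, omega, src_idx, dz):
    """Solve -omega^2/c^2 p - p'' = f on a 1-D depth line.

    Free surface p = 0 at the top node, first-order absorbing
    condition dp/dn - i*(omega/c)*p = 0 at the bottom node.
    """
    n = c.size
    k2 = (omega / c) ** 2
    A = sp.lil_matrix((n, n), dtype=complex)
    A.setdiag(2.0 / dz**2 - k2)          # -p'' ~ (2p_j - p_{j-1} - p_{j+1})/dz^2
    A.setdiag(-1.0 / dz**2, 1)
    A.setdiag(-1.0 / dz**2, -1)
    # free surface: p[0] = 0
    A[0, :] = 0.0
    A[0, 0] = 1.0
    # absorbing bottom, one-sided derivative: (p_n - p_{n-1})/dz = i*(omega/c)*p_n
    A[-1, :] = 0.0
    A[-1, -2] = -1.0 / dz
    A[-1, -1] = 1.0 / dz - 1j * omega / c[-1]
    f = np.zeros(n, dtype=complex)
    f[src_idx] = 1.0 / dz                # point source
    return spla.spsolve(A.tocsr(), f)
```

The same pattern extends to 2-D/3-D with a sparse Helmholtz matrix per frequency.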

Quantitative reconstruction using iterative minimization
The inverse problem aims at the reconstruction of the unknown medium squared slowness m (i.e. the wave speed) from data y_ω that connect to the forward map F_ω(m†) for a reference (target) model m†, with

y_ω = F_ω(m†) + E_ω,    (3)

where E_ω represents the noise in the measurements (from the inaccuracy of the devices, model error, etc.), and is possibly frequency dependent. Our choice to invert the squared slowness (instead of the velocity or slowness) is motivated by the Helmholtz eq. (1) and is further discussed by Tarantola (1986), Brossier (2011), Köhn et al. (2012) and Faucher (2017, section 5.4).
For the reconstruction, we follow the FWI method (Tarantola 1984; Pratt et al. 1998), which relies on an iterative minimization of a misfit functional defined as the difference between the observed data and the computational simulations:

min_m J(m) := (1/2) ‖F_ω(m) − y_ω‖²,    (4)

where we use the standard least-squares minimization, but alternatives have also been studied, as indicated in the introduction. A sequence of increasing frequencies is usually taken for the minimization problem (4) (Bunks et al. 1995; Pratt & Worthington 1990; Sirgue & Pratt 2004; Brossier et al. 2009; Barucq et al. 2018; Faucher 2017), and we use a sequential progression in our experiments (instead of bands/windows of frequencies), as advocated by Faucher et al. (2019a), to increase the convergence radius. Then, an iterative minimization algorithm is used for the resolution of problem (4) in the framework of Newton methods. Starting with an initial guess m⁽⁰⁾, the model is updated at each iteration k, using a search direction s⁽ᵏ⁾, such that

m⁽ᵏ⁺¹⁾ = m⁽ᵏ⁾ + μ s⁽ᵏ⁾.    (5)

Several possibilities exist for the search direction (e.g. Newton, Gauss-Newton, BFGS, gradient descent, etc.) and we refer to Nocedal & Wright (2006) for an extensive review of the methods. The scalar coefficient μ is approximated using a line search algorithm (Nocedal & Wright 2006). In our implementation, we rely on a gradient-based optimization, with the non-linear conjugate gradient method, and a backtracking line search (Nocedal & Wright 2006). A review of the performance of first-order minimization algorithms and of the influence of the line search step selection is carried out by Barucq et al. (2018) in the context of inverse scattering. The computation of the gradient of the misfit functional is carried out using the adjoint-state method (Chavent 1974; Plessix 2006), whose specific steps for complex-valued fields can be found in (Barucq et al.
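The optimization machinery just described (nonlinear conjugate gradient with a backtracking line search) can be sketched generically as below. This is a minimal Fletcher-Reeves variant on abstract callables `J` and `gradJ`, not the paper's implementation; the restart safeguard is a standard practical addition.

```python
import numpy as np

def backtracking(J, gradJ, m, s, mu0=1.0, rho=0.5, c1=1e-4, max_iter=30):
    """Armijo backtracking: shrink mu until sufficient decrease holds."""
    J0, g0 = J(m), gradJ(m)
    mu = mu0
    for _ in range(max_iter):
        if J(m + mu * s) <= J0 + c1 * mu * g0.dot(s):
            return mu
        mu *= rho
    return mu

def nonlinear_cg(J, gradJ, m0, n_iter=50):
    """Fletcher-Reeves nonlinear conjugate gradient with backtracking."""
    m = m0.copy()
    g = gradJ(m)
    s = -g
    for _ in range(n_iter):
        mu = backtracking(J, gradJ, m, s)
        m = m + mu * s                        # update (5)
        g_new = gradJ(m)
        beta = g_new.dot(g_new) / max(g.dot(g), 1e-30)
        s = -g_new + beta * s
        if s.dot(g_new) >= 0:                 # restart if not a descent direction
            s = -g_new
        g = g_new
    return m
```

In FWI, `J` involves a Helmholtz solve per source and `gradJ` an adjoint-state solve.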
2019b, appendix A).

In this section, we introduce a representation of the unknown, that is, the model decomposition, based upon the eigenvectors of a diffusion equation. The objective is to reduce the dimension of the unknown to mitigate the ill-posedness of the inverse problem. We provide several possibilities for the choice of the eigenvectors, following the literature in image processing.

Regularization and diffusion operators
The resolution of the inverse problem using a quantitative method introduces an optimization problem (4) where the misfit functional J accounts for a fidelity term. The resolution of such a problem is also common in the context of image processing (e.g. for denoising or edge enhancement), where the fidelity term corresponds to the matching between the original and processed images. It is relatively common (for both the quantitative reconstruction methods and in image processing) to incorporate an additional term in the minimization, for the purpose of regularization. The primary function of this additional term is to reduce the ill-posedness of the problem, by adding a constraint. It has been the topic of several studies; we refer to, for example, Kirsch (1996), Isakov (2006), Charbonnier et al. (1994), Robert & Deriche (1996), Rudin et al. (1992), Vogel & Oman (1996), Lobel et al. (1997), Kern (2016), Qiu et al. (2016) and Kaltenbacher (2018). The regularized minimization problem writes as

min_m J_r(m) := J(m) + I(m),    (6)

where I stands for the regularization term.
In many applications such as image processing, I is usually defined to depend only on the gradient of the variable (image), such that

I(m) = ∫_Ω φ(|∇m(x)|) dx,    (7)

where φ ∈ L²(Ω). In particular, the minimum of I with respect to m verifies the Euler-Lagrange equations (Evans 2010; Dubrovin et al. 1992). In one dimension, it is given by (Dubrovin et al. 1992, theorem 31.1.2), and it is extended to higher dimensions with (Dubrovin et al. 1992, theorem 37.1.2) (further simplified in our case because I only depends on the gradient). It states that the minimizer of I is the solution of the diffusion equation:

div(η(|∇m|) ∇m) = 0,  with η(|∇m|) := φ′(|∇m|)/|∇m|.    (8)

The minimization of J_r can be performed using Newton-type algorithms or, when rewriting with the Euler-Lagrange formulation, be recast as a time-dependent evolution problem, e.g. in image processing, Weickert (1998), Catté et al. (1992), Rudin et al. (1992) and Alvarez et al. (1992). For the sake of clarity, we introduce the following notation for the associated linear differential operator:

A(m, η) u := −div(η(|∇m|) ∇u).    (9)

Diffusion coefficients from image processing
There exist several possibilities for the choice of the diffusion coefficient η (also referred to as the weighting function) in (9), inherited from image-processing theory and applications. In the following, we investigate the most common formulations, see Table 1, for which we have mainly followed the ones that are reviewed by Blanc-Féraud et al. (1995, table 1) and Robert & Deriche (1996). Furthermore, we incorporate a scaling coefficient β > 0 for the diffusion coefficient, which impacts its magnitude. For consistency between the different models, the norms that we use are scaled with the maximal values so that they remain between 0 and 1. We define, where d is the space dimension (d = 2 or d = 3 in our experiments) and x the space coordinates,

|∇m(x)| := ( Σ_{i=1}^{d} |∂_{x_i} m(x)|² )^{1/2},    (10)

which is further normalized by its maximal value over Ω. In order to simplify the formulas, we omit the space dependency in the following. Note that in the numerical experiments, we calculate the eigenvalues and eigenvectors from the linear differential operator A(m, η) defined in (9), where the diffusion coefficient η is taken from the non-linear PDE model.
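A minimal sketch of assembling the operator A(m, η) of (9) as a sparse matrix on a 2-D grid, assuming a standard 5-point finite-difference stencil with face-averaged η and homogeneous Neumann boundaries; the paper's actual discretization (Appendix A) may differ.

```python
import numpy as np
import scipy.sparse as sp

def diffusion_operator(eta, h):
    """Assemble A u = -div(eta grad u) on a 2-D grid (5-point stencil),
    homogeneous Neumann boundaries; eta is given at the grid nodes."""
    nz, nx = eta.shape
    N = nz * nx
    idx = lambda i, j: i * nx + j
    A = sp.lil_matrix((N, N))
    for i in range(nz):
        for j in range(nx):
            for di, dj in ((1, 0), (-1, 0), (0, 1), (0, -1)):
                ii, jj = i + di, j + dj
                if 0 <= ii < nz and 0 <= jj < nx:
                    # simple average of eta on the face between the two nodes
                    w = 0.5 * (eta[i, j] + eta[ii, jj]) / h**2
                    A[idx(i, j), idx(i, j)] += w
                    A[idx(i, j), idx(ii, jj)] -= w
    return A.tocsr()
```

With η ≡ 1 this reduces to the (negative) Neumann Laplacian, whose eigenvectors give a cosine-like basis.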

Remark 1.
Regarding the diffusion coefficients that are introduced in Table 1, we note that the PDE (8) using the Tikhonov diffusion coefficient η₉ coincides with the Laplace equation, while the formulation with η₈ corresponds to the total variation (TV) regularization (Rudin et al. 1992).

Table 1. List of formulas for the coefficient of the diffusion operator which is used to decompose the image.
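Since Table 1 itself is not reproduced in this excerpt, the following gives classical weighting functions from the image-processing literature for illustration only; the exact formulas η₁-η₉ of Table 1 may differ, and the correspondence to the table's numbering is an assumption except for the TV and Tikhonov cases named in Remark 1.

```python
import numpy as np

# Illustrative weighting functions eta(s), with s the scaled gradient
# norm in [0, 1] and beta > 0 the scaling coefficient of the text.
def eta_tikhonov(s, beta=1.0):
    """Tikhonov: eta = 1, PDE (8) becomes the Laplace equation."""
    return np.ones_like(s)

def eta_tv(s, beta=1.0, eps=1e-8):
    """Total variation (Rudin et al. 1992): eta = 1/s (regularized)."""
    return 1.0 / (s + eps)

def eta_perona_malik(s, beta=1.0):
    """Rational Perona-Malik coefficient: stops diffusion across edges."""
    return 1.0 / (1.0 + (s / beta) ** 2)

def eta_geman_reynolds(s, beta=1.0):
    """Coefficient derived from the Geman & Reynolds (1992) potential."""
    return 1.0 / (1.0 + (s / beta) ** 2) ** 2
```

All of these are near 1 in smooth regions (small s) and decay where the gradient is large, which is what preserves edges such as salt-dome boundaries.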

Eigenvector model decomposition in FWI
In our work, we use the regularization by discretization approach: instead of adding the regularization term I in the minimization problem, we remain with problem (4), and use a specific representation for the model (unknown). We follow the work of De Buhan & Osses (2010).

Eigenvector decomposition: (i) given an initial model m(x), a selected integer value N > 0, and the selected diffusion coefficient η, form the operator A(m, η) of (9).
(ii) Compute the subset of N eigenfunctions {ψ_k}_{k=1,...,N} associated with the N smallest eigenvalues {λ_k}_{k=1,...,N}, such that, for all k,

A(m, η) ψ_k = λ_k ψ_k,    (11)
λ₁ ≤ λ₂ ≤ · · · ≤ λ_N.    (12)

(iii) Compute the model decomposition using the N eigenvectors:

m̃(x) = Σ_{k=1}^{N} α_k ψ_k(x),    (13)

where α_k is a scalar and ψ_k a vector. Here, the set of α is chosen to minimize ‖m − m̃‖₂; and the ψ_k, k = 1, . . . , N, are the eigenvectors associated with the N smallest eigenvalues λ_k, computed in Step (ii).
Algorithm 1: The model decomposition using the eigenvectors of the diffusion equation associated with the smallest eigenvalues. We refer to the model decomposition as m̃(m, N, ψ), where ψ(m, η, N) is the set of eigenvectors, see (14) and (15). The computational details for implementation are given in Appendix A.
Following Algorithm 1, we introduce the notation:

ψ(m, η, N) := {ψ_k}_{k=1,...,N}, the set of N eigenvectors associated with the model m and the diffusion coefficient η, computed from (11) and (12);    (14)

m̃(m, N, ψ), the decomposition of the model m using N vectors from ψ, following (13).    (15)
Therefore, the model is represented via the N coefficients α of (13), in the basis given by the diffusion operator. The reconstruction procedure follows an iterative minimization of (4), and performs successive updates of the coefficients α. The key is that N is much smaller than the dimension of the original representation of m, but still allows an accurate resolution, as we illustrate in Sections 4 and 5. Algorithm 2 details the procedure.
Inputs: the measurements y_ω; the initial model m⁽⁰⁾; the selected number of iterations n_iter; the list of frequencies ω_i, i = 1, . . . , n_ω; the decomposition dimension associated with each frequency: N_i, for i = 1, . . . , n_ω.
Using Algorithm 1, (i) compute the eigenvector basis for m⁽⁰⁾ using the selected η and the highest dimension N_{n_ω}; (ii) decompose the initial model using the initial decomposition dimension N₁.
for i = 1, . . . , n_ω do
  for k = 1, . . . , n_iter do
    Solve the Helmholtz eq. (1) at frequency ω_i with model m⁽ᵏ⁾.
    Compute the misfit functional J in (4).
    Compute the gradient of the misfit functional ∇_α J using the adjoint-state method.
    Compute the search direction s⁽ᵏ⁾.
    Compute the descent step μ using the line search algorithm.
    Update the coefficients α with α⁽ᵏ⁺¹⁾ = α⁽ᵏ⁾ − μ s⁽ᵏ⁾.
    Update the model: m⁽ᵏ⁺¹⁾ = m̃(m, N_i, ψ) with the coefficients α⁽ᵏ⁺¹⁾.
  end
end
Algorithm 2: The iterative minimization algorithm (FWI) using the model decomposition to control the number of variables. The model is represented in the eigenvector basis, for which the weights are updated along with the iterations.
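The structure of Algorithm 2 can be sketched as below. Here `misfit_grad` is a hypothetical user-supplied callback standing in for the Helmholtz solve and adjoint-state gradient, and plain gradient descent with a fixed step `mu` replaces the nonlinear CG and line search of the text; only the loop organization (fixed basis, frequency continuation, growing N_i) follows the algorithm.

```python
import numpy as np

def fwi_eigen(misfit_grad, psi, m0, freqs, N_sched, n_iter, mu=1e-2):
    """Sketch of Algorithm 2: the basis psi is fixed, only the number
    of active eigenvectors N_i and the weights alpha evolve.

    misfit_grad(omega, m) -> (J, dJ/dm) is user supplied."""
    # decompose the initial model (least-squares fit of the weights)
    alpha = np.linalg.lstsq(psi, m0, rcond=None)[0]
    for omega, N in zip(freqs, N_sched):       # frequency continuation
        for _ in range(n_iter):                # minimization iterations
            m = psi[:, :N] @ alpha[:N]         # current model, N vectors
            J, g_m = misfit_grad(omega, m)
            g_alpha = psi[:, :N].T @ g_m       # chain rule: dm/dalpha_l = psi_l
            alpha[:N] -= mu * g_alpha          # update the weights only
    return psi @ alpha
```

Note that moving from N₁ to N₂ > N₁ keeps the first N₁ eigenvectors in the representation, as stated in Remark 2.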

Gradient computation
In our implementation, the gradient is first computed with respect to the original (nodal) representation, and we use the chain rule to retrieve the gradient with respect to the decomposition coefficients α:

(∇_α J)_l = ⟨∇_m J, ∂_{α_l} m⟩,  l = 1, . . . , N.    (16)

It is straightforward, from (13), that the derivation for a chosen coefficient α_l gives ∂_{α_l} m = ψ_l. Therefore, it is computationally easy to introduce the formulation with respect to the eigenvector decomposition from an already existing 'classical' formulation (i.e. where the derivative is taken with respect to the model): it only necessitates one additional step with the eigenvectors.
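Since ∂_{α_l} m = ψ_l, the chain rule reduces to a single matrix-vector product with the (nodal × N) matrix of eigenvectors, as in this one-line sketch:

```python
import numpy as np

def grad_alpha_from_grad_m(psi, grad_m):
    """Chain rule of the text: with dm/dalpha_l = psi_l, the gradient
    w.r.t. the coefficients is grad_alpha = Psi^T grad_m, where Psi
    stacks the eigenvectors column-wise."""
    return psi.T @ grad_m
```

So any existing adjoint-state code producing ∇_m J is reused unchanged; only this projection is appended.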
Remark 2. In Algorithm 2, the basis of eigenvectors remains the same for the complete set of iterations, and is extracted from the initial model. Only the number of vectors taken for the representation, N i , changes. Namely, from N 1 to N 2 > N 1 , the decomposition using N 2 still has the same N 1 first eigenvectors in its representation (with different weights α), and additional (N 2 − N 1 ) eigenvectors.
Figure 1. Seismic velocity models m† used to illustrate the eigenvector decomposition. They are both represented with 921 × 301 nodal coefficients.

Table 2. Minimal relative error obtained and associated scaling coefficient, E(β), for the decomposition of the Marmousi model of Fig. 1(a). The definition of η is given in Table 1.

I L L U S T R A T I O N   O F   M O D E L   D E C O M P O S I T I O N
First, we illustrate the eigenvector model decomposition with two geophysical media in two dimensions: (i) the Marmousi velocity model, which consists of structures and faults, of size 9.2 × 3 km, see Fig. 1(a); (ii) a velocity model extracted from the SEAM (SEG Advanced Modeling Program) Phase I benchmark, 1 which consists of a salt dome (an object of high velocity contrast) and subsalt layer structures, of size 17.5 × 3.75 km, see Fig. 1(b).
In order to produce consistent comparisons, both models are represented on a structured nodal grid of size n_x × n_z = 921 × 301, that is, with 277 221 coefficients. We perform the decomposition of the models by application of Algorithm 1, and steps (11), (12) and (13) (and we recall that we use the linear PDE problem). We study the main parameters of the decomposition: the number of eigenvectors N, and the choice of η and β (see Table 1). The accuracy of the decomposition is estimated using the L² norm of the relative difference between the decomposition and the original representation, such that

E = ‖m̃ − m‖ / ‖m‖,    (19)

where m is the original model (Fig. 1) and m̃ the decomposition using the basis of eigenvectors from Algorithm 1. We decompose the SEAM and Marmousi models using the nine possibilities for η that are given in Table 1. For the choice of the scaling coefficient β (which does not affect η₈ and η₉), we roughly cover an interval from 10⁻⁷ up to 10⁶, namely: {10⁻⁷, 10⁻⁶, 10⁻⁵, 10⁻⁴, 10⁻³, 10⁻², 5·10⁻², 10⁻¹, 5·10⁻¹, 1, 5, 10, 10², 10³, 10⁴, 10⁵, 10⁶}. In Tables 2 and 3, we show the best relative error (i.e. minimal value) obtained for the Marmousi model of Fig. 1(a) and the SEAM model of Fig. 1(b). We test all choices of η and values of N between 10 (coarse) and 500 (refined). The corresponding values of the scaling parameter β which give the best (i.e. the minimal) error are also given in parentheses.

Figure 2. Relative error (19) for two selected formulations of η (see Table 1) and different scaling parameters β.

As expected, we observe that the more eigenvectors are chosen (higher N), the better the decomposition. When using 500 eigenvectors, which represents about 2 per cent of the original number of coefficients (921 × 301 in Fig. 1), the error is of a few per cent only. This can be explained by the redundancy of information provided by the original fine grid where the model is represented.
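The error criterion (19) and the scan over the scaling coefficients β can be sketched as follows; `decompose` is a hypothetical callback wrapping Algorithm 1 for a given β.

```python
import numpy as np

def relative_error(m_tilde, m):
    """Relative L2 error (19) between the decomposition and the model."""
    return np.linalg.norm(m_tilde - m) / np.linalg.norm(m)

def best_beta(decompose, m, betas):
    """Scan the scaling coefficients and keep the minimizer of (19);
    decompose(m, beta) -> m_tilde is user supplied (e.g. Algorithm 1)."""
    errs = [relative_error(decompose(m, b), m) for b in betas]
    i = int(np.argmin(errs))
    return betas[i], errs[i]
```

This is exactly the grid search reported in Tables 2 and 3, with the best β kept in parentheses.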
Comparing the methods and models, we see that (i) the Marmousi model (Table 2) is harder to retrieve than the SEAM model (Table 3), as it gives higher errors. In particular, for low N, the SEAM model can be accurately decomposed (within possibly 4 per cent error with N = 10).
(iii) The scaling coefficient that minimizes the error may vary with respect to N, but remains of a similar amplitude. When changing the model, it may also require a modification of β.
To investigate the last point further, we show the evolution of the relative error E with respect to the scaling coefficient β for the decomposition of the Marmousi and SEAM models in Fig. 2, where we compare two selected formulations for η. We observe some flexibility in the choice of β that gives an accurate decomposition, but it still appears relatively model dependent.
We then picture the resulting images obtained after the decomposition of both models. For the sake of clarity, we only show two formulations, in Fig. 3. The pictures illustrate that the decomposition of the SEAM model necessitates fewer eigenvectors than the Marmousi one which, in addition, is not accurately represented depending on the choice of η, while the upper boundary of the SEAM salt dome is always well captured.

E X P E R I M E N T S   O F   R E C O N S T R U C T I O N   W I T H   F W I
In this section, we perform seismic imaging following the FWI Algorithm 2 for the identification of the subsurface physical parameters. We focus on media encompassing salt domes, as we have shown that the decomposition is more appropriate, and it also gives a more challenging configuration in seismic applications. We follow a seismic context, where the forward problem is given by (1): (i) the recovery of salt domes is particularly difficult due to the change of kinematics (Farmer et al. 1996; Faucher et al. 2019a); (ii) we consider an initial guess that has no information on the subsurface structures and where the background velocity amplitude is incorrect; and (iii) we avoid the use of the (unrealistically) low frequencies (below 2 Hz in exploration seismic).

Reconstruction of 2-D salt model
We first consider a 2-D salt model of size 9.2 × 3 km, which consists of three domes, see Fig. 5(a). We generate the data using 91 sources and 183 receivers (i.e. data points) per source. Both types of devices are located near the surface: the sources are positioned on a line at 20 m depth and the receivers at 80 m depth. In order to establish a realistic situation despite having a synthetic experiment, the data are generated in the time-domain and we incorporate white noise in the measurements. The signal-to-noise ratio in the seismic trace is of 10 dB; the noise is generated independently for every receiver record associated with each source. Then we proceed with the discrete Fourier transform to obtain the signals to be used in the reconstruction algorithm. In Fig. 4, we show the time-domain data with noise and the corresponding frequency data for one source, located at 20 m depth, in the middle of the x-axis. For the reconstruction of the salt dome model, the starting and true models are given in Fig. 5. We do not assume any a priori knowledge of the contrasting object in the subsurface, and start with a 1-D variation, which has a drastically lower amplitude (i.e. the background velocity is unknown).
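The data-generation step above (white noise at a prescribed SNR, then extraction of time-harmonic data by discrete Fourier transform) can be sketched as follows; this is an assumed, simplified rendition of the processing chain, one trace at a time.

```python
import numpy as np

def add_white_noise(trace, snr_db, rng):
    """Add white Gaussian noise at a prescribed signal-to-noise ratio
    (in dB), independently per receiver trace."""
    p_signal = np.mean(trace ** 2)
    p_noise = p_signal / 10 ** (snr_db / 10.0)
    return trace + rng.normal(0.0, np.sqrt(p_noise), trace.shape)

def harmonic_data(trace, dt, freq_hz):
    """Discrete Fourier transform of a time trace at one frequency,
    giving the time-harmonic datum used by the frequency-domain inversion."""
    t = np.arange(trace.size) * dt
    return np.sum(trace * np.exp(-2j * np.pi * freq_hz * t)) * dt
```

Repeating `harmonic_data` per receiver and per source assembles the y_ω of (3) at each inversion frequency.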

Fixed decomposition, single frequency reconstruction
We first use only the 2 Hz frequency data, and perform 180 iterations of the minimization. In Fig. 6, we show the reconstruction where the decomposition has not been used, that is, the model representation follows the original piecewise constant decomposition of the model (one value per node). In Fig. 7, we compare the reconstructions using Algorithm 2 for the different formulations of η given in Table 1, using a fixed N = 50; we use N = 100 for Fig. 8. For the sake of clarity, we focus on η₃ (the most effective formulation) and η₈ (which relates to the total variation regularization).
We observe that (i) the traditional FWI algorithm (without decomposition), see Fig. 6, fails to recover any dome. It only shows some thin layers of increasing velocity, with amplitudes much lower than the original ones. (ii) The decomposition using N = 50 is able to discover the largest object with formulations η₁, η₃, η₅, η₆ and η₈, see Fig. 7. The best result is given by η₃, which recovers the three domes, while η₁, η₅ and η₈ show artifacts in the lower right corner. The other decompositions fail. We note that, due to the lack of background velocity information, the positions of the domes are slightly above the correct ones, to compensate for the low traveltimes.
(iii) In this experiment, the method behaves much better with a restrictive number of eigenvectors. With N = 100 (Fig. 8), the iterative reconstruction only results in artifacts. The restrictive number of eigenvectors provides a regularization of the problem by reducing the number of parameters, which is crucial. For instance, the stability is known to deteriorate exponentially with the number of parameters in the piecewise constant case (Beretta et al. 2016).
Contrary to the decomposition of images (Section 4), the quantitative reconstruction using a model represented with a basis of eigenvectors from a diffusion PDE shows drastic differences between the formulations, where the procedure can fail depending on the choice of η. In addition, the number of eigenvectors for the representation has to be carefully selected, see Section 5.3.

Table 4. List of experiments for the reconstruction of the 2-D salt dome model (Fig. 5b). For each combination of frequency and associated number of eigenvectors N in the decomposition, 30 iterations are performed (n_iter in Algorithm 2).

Figure 9. Reconstructions following the experiments of Table 4 for the salt velocity model of Fig. 5.

Experiments with increasing N and multiple frequencies
We investigate the performance of the eigenvector decomposition for multiple frequency data, and with a progressive evolution of the number of eigenvectors N in the representation. We have a total of four different experiments, which are summarized in Table 4. The reconstructions, for η₃ and η₈, are shown in Fig. 9. These experiments using multiple N and/or frequency contents illustrate the performance of the method. The best results are obtained using a single 2 Hz frequency, with either a progression of increasing N or a constant N: Experiments 2 and 1. The progression of N, Experiment 2, appears the most robust.
In this experiment, using multiple frequencies does not improve the results. It is probably due to the lack of knowledge of the velocity background, which prevents us from recovering the finer scales (i.e. kinematic error, Bunks et al. 1995). In particular, local minima in the misfit functional become more and more prominent in the high-frequency regime (Bunks et al. 1995; Barucq et al. 2019b). We illustrate the performance of the reconstruction with the results in the data-space: in Fig. 10, we show the time-domain seismograms for the true, starting and reconstructed velocity models. We observe that when we filter out frequencies above 2 Hz (first line of Fig. 10), the trace from the reconstructed model is indeed very similar to the measured one. However, when encompassing all frequency contents (bottom line of Fig. 10), important differences arise; in particular, one can see that the traveltime of the first reflection is earlier with the recovered model. This indicates that the location of the salt in the reconstructed velocity is above its 'true' position. Then, while we incorporate the higher frequencies in the minimization procedure, the FWI is not able to improve the results (see Fig. 9), most likely because of the missing velocity background, which is not improved during the first iterations and remains missing. In Fig. 11, we show the frequency-domain data at 2 and 4 Hz: the observed data at 2 Hz are accurately obtained with the model reconstructed with the decomposition in eigenvectors, which confirms the pertinence of the method. Interestingly, at 4 Hz, while this frequency is not even used in the inversion scheme (we only use 2 Hz for Fig. 9a), we already have a good correspondence near the source, and only the parts further away show a shift. In order to overcome the issue of recovering the background velocity, one would need lower frequency content, or one could try alternative strategies, such as the MBTT (migration-based traveltime) method, which parametrizes the inverse problem with a background velocity model and a reflectivity (Clément et al. 2001). Here, the decomposition in eigenvectors appropriately recovers the reflectivity part (better than traditional FWI), but the background model remains missing.

Figure 11. Comparisons of the frequency-domain data (pressure field) at the receiver locations for a central source. We compare the data obtained from the target wave speed with salt of Fig. 5(a), the starting wave speed of Fig. 5(b) and the reconstruction of Fig. 9(a). These three models are respectively denoted c†, c_start and c_rec.

On the choice of the number of eigenvectors
We have shown in Figs 7 and 8 that one should take a relatively low initial N for the reconstruction algorithm to succeed. It remains to verify whether an appropriate N can be selected a priori, or based upon minimal experiments. In Fig. 12(a), we show the evolution of the misfit functional over 30 iterations, for different values of N, from 10 to 250. We compare, in Fig. 12(b), with the progression of N that follows Experiment 2 of Table 4. From Fig. 12(a), we see that all choices of N show the same pattern: first a decrease of the functional, and then its stagnation. We notice that the good choice for N is not reflected by the misfit functional. Indeed, it shows a lower error for larger N, while those are shown to result in erroneous reconstructions (Fig. 8 compared to Fig. 7). It is most likely that using larger N leads to local minima and/or deteriorates the stability (see, for the piecewise constant case, Beretta et al. 2016). It results in the false impression (from the misfit functional point of view) that larger N would improve the efficiency of the method. Using a progression of N, Fig. 12(b), eventually gives the same misfit functional value as the large N, but it needs more iterations. This increase in iterations and 'slow' convergence is actually required, because it leads to an appropriate reconstruction (see Fig. 9). Therefore, we cannot anticipate a good choice for N a priori (with a few evaluations of the misfit functional). The guideline we propose, as a safe and robust approach, is the progression of increasing N, from low to high: it costs more in terms of iterations, but it converges properly.

Reconstruction of the SEAM Phase I model
We now consider the recovery of the SEAM Phase I model, which is expected to be more challenging as it contains both a salt-dome and subsalt layers. The starting model for the reconstruction is shown in Fig. 13, where, for the sake of clarity, we also picture the reference model which was previously used for the decomposition. This medium is of size 17.5 × 3.75 km and the starting model we use is a smooth version of the reference one, where the contrasting objects and layers are missing.
We follow the same configuration as in the previous experiment: the data are generated in the time-domain, and noise is incorporated into the seismograms using a signal-to-noise ratio of 10 dB. Next, we apply the discrete Fourier transform to use the iterative procedure with time-harmonic waves. In this experiment, the smallest available frequency is 2 Hz.
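The noise incorporation and the extraction of a time-harmonic datum can be sketched as follows; the seismogram, sampling and source signature here are toy placeholders, not the experiment's actual data.

```python
import numpy as np

rng = np.random.default_rng(0)
dt, nt = 2e-3, 2000                       # 4 s record, 2 ms sampling
t = np.arange(nt) * dt
trace = np.exp(-((t - 1.0) ** 2) / 0.01) * np.sin(2 * np.pi * 8 * t)

# Scale additive noise to the prescribed 10 dB signal-to-noise ratio.
snr_db = 10.0
sig_pow = np.mean(trace ** 2)
noise = rng.standard_normal(nt)
noise *= np.sqrt(sig_pow / (10 ** (snr_db / 10) * np.mean(noise ** 2)))
noisy = trace + noise

# Discrete Fourier transform evaluated at a single frequency (2 Hz),
# giving the time-harmonic datum used by the frequency-domain inversion.
f = 2.0
harmonic = np.sum(noisy * np.exp(-2j * np.pi * f * t)) * dt
```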
For the sake of conciseness, we only present the reconstruction results with the representation using η 3 (which was the most efficient). We follow a slow increase of N in a fixed basis (analogous to Experiment 2 of Table 4), which was the most stable approach. The reconstruction using 2 Hz is shown in Fig. 14. We observe that the standard FWI algorithm produces artifacts over the medium, with oscillatory patterns, in particular on the sides. On the other hand, the reconstruction using the representation based upon the eigenvector decomposition is stable, and is able to capture accurately the upper boundary of the salt dome. Because of the frequency used, only the long wavelengths are recovered at this stage.
Figure 14. Reconstruction of the SEAM Phase I model from an initially smooth model, see Fig. 13, using 2 Hz frequency data.
Figure 15. Comparisons of the frequency-domain data (pressure field) at the receivers location for a centrally located source at 2 Hz frequency for the SEAM Phase I wave speed of Fig. 13(a), the starting wave speed of Fig. 13(b) and the reconstruction using the eigenvector decomposition, Fig. 14(b). These three models are respectively denoted c † , c start and c rec .
Figure 16. Reconstruction of the SEAM Phase I model starting from the reconstruction obtained with the eigenvector decomposition of Fig. 14(b), using data of frequencies from 2 to 10 Hz, with 30 iterations per (sequential) frequency.
In Fig. 15, we further illustrate the recovery by showing the frequency-domain data at 2 Hz using the different wave speed models. We can see that the data from the starting model are out of phase as soon as we move away from the source, with cycle-skipping effects. However, the reconstruction using the eigenvector decomposition is able to retrieve this information and accurately capture the oscillations of the signal; only the amplitude is inaccurate. We now continue the procedure using increasing frequencies. Because of the smoothing effect of the decomposition, we use the algorithm without the eigenvector representation (alternatively, one could use the decomposition but with a large N). Therefore, the velocity model obtained from the FWI with eigenvector decomposition of Fig. 14(b) is used as an initial model for restarted FWI with multiple frequencies, from 2 to 10 Hz [we use the sequential progression advocated by Faucher et al. (2019a)]. Eventually, the reconstruction after the 10 Hz iterations is pictured in Fig. 16.
The reconstruction is able to capture the finer details of the velocity model: the salt dome is clearly defined, and the subsalt layer starts to appear. Therefore, the eigenvector decomposition method can also serve to build initial models for the FWI algorithm, where it appears as an interesting alternative to overcome the lack of low frequencies. To illustrate the different steps of the reconstruction, we show the time-domain seismograms at the different stages in Fig. 17 (while the 2 Hz frequency-domain data are given in Fig. 15). We compare the traces resulting from the initial and true wave speed models, from the partial reconstruction obtained with the eigenvector decomposition (Fig. 14b), and from the final reconstruction (Fig. 16). The trace that uses the starting model mainly shows the first arrivals, with some minor reflections coming from the smoothing of the salt dome. The one using the partial reconstruction with eigenvector decomposition and only the 2 Hz data accurately resolves the main multiple reflections between the salt upper boundary and the surface (the thick line in the center of the red ellipses in Fig. 17), but it misses the events of smaller importance. Eventually, the final reconstruction, which uses data contents up to 10 Hz, is able to reproduce some of the events of smaller amplitude. We also note that the amplitude of the trace for the final reconstruction is slightly too high, which indicates that the recovered contrasts are slightly too strong.

Implementation of the method
The eigenvector decomposition for the model representation depends on different choices of parameters, and it is not trivial to efficiently implement the method in the iterative reconstruction procedure. From the experiments we have carried out, we propose the following strategy.
(i) Regarding the choice of weighting coefficient, η 3 from Geman & Reynolds (1992) is the most efficient in these applications, and supersedes the standard Total Variation approach (i.e. η 8 ).
(ii) The number of eigenvectors N for the decomposition should initially take a low value, and progressively evolve to higher values with the iterations. This may not give the fastest convergence, but it is the most robust approach. Indeed, the difficulty resides in the choice of N and, more importantly, the misfit functional does not provide us with a good indication (see Fig. 12).
(iii) Finally, the (partial) reconstruction with the eigenvector decomposition is used as a starting model for the multifrequency algorithm. This allows the recovery of the finer details, which depend on the smaller wavelengths and for which the smoothing effect is misleading. Namely, the decomposition is particularly efficient to overcome the lack of low frequencies in the data.
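The strategy above can be sketched as the following skeleton, where `eigenbasis` stands in for the η 3-weighted diffusion eigenvectors and `fwi_update` for one batch of gradient iterations; both are illustrative placeholders, not the paper's implementation.

```python
import numpy as np

def eigenbasis(n, N):
    # Placeholder basis: eigenvectors of the 1-D Laplacian (eta = 1).
    j = np.arange(1, n + 1)[:, None]
    k = np.arange(1, N + 1)[None, :]
    V = np.sin(np.pi * j * k / (n + 1))
    return V / np.linalg.norm(V, axis=0)

def fwi_update(unknowns, freq, n_iter=30):
    # Placeholder for n_iter misfit-gradient iterations at one frequency.
    return unknowns

n, N_max = 200, 100
V = eigenbasis(n, N_max)                  # (i) basis computed once
coeffs = np.zeros(N_max)

# (ii) fixed basis, progression of increasing N at the lowest frequency.
for N in (10, 25, 50, N_max):
    coeffs[:N] = fwi_update(coeffs[:N], freq=2.0)
model = V @ coeffs

# (iii) the smooth reconstruction then seeds classical multifrequency
# FWI (sequential frequencies) without the eigenvector representation.
for freq in range(3, 11):
    model = fwi_update(model, freq=float(freq))
```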

3-D experiment
The method extends readily to 3-D model reconstruction, simply incurring a larger computational cost (as larger matrices are involved for the eigenvector decomposition and the forward problem discretization). We proceed with a 3-D experiment, where we consider a subsurface medium of size 2.46 × 1.56 × 1.2 km, encompassing several salt domes, illustrated in Fig. 18. The seismic acquisition consists of 96 sources, positioned on a 2-D plane at 10 m depth; one thousand receivers are positioned at 100 m depth. Similar to the previous experiments, the data are first generated in the time-domain and we incorporate noise before we proceed to the Fourier transform. Fig. 20 shows the time-domain data associated with a centrally located source, and the corresponding Fourier transform at 5 Hz frequency. For the reconstruction, we start with a 1-D variation, in depth only, where none of the objects is intuited, see Fig. 19, and the velocity background is incorrect. In Fig. 21, we show the reconstruction without using the eigenvector decomposition, where the wave speed has a piecewise constant representation on a 124 × 79 × 61 nodal grid. We only use 5 Hz frequency data and 30 iterations. Next, we use the eigenvector model representation with Algorithm 2. Following the discussion in Section 5.3, we select η 3 , which is the most robust, and a sequence of increasing N = {20, 30, 50, 75, 100}. We perform 30 iterations per N, that is, 180 iterations in total; the final reconstruction is in Fig. 22.
Figure 21. Velocity model reconstruction with vertical section at y = 670 m, from the starting medium of Fig. 19 using 5 Hz frequency data. The reconstruction does not apply the eigenvector decomposition. The model is parametrized following the domain discretization, using a piecewise constant representation with one value per node on a 124 × 79 × 61 grid.
For visualization, we focus on the 2-D vertical and horizontal sections, which illustrate the positions and shapes of the objects. This experiment is consistent with the 2-D results: the classical FWI reconstruction fails to discover the subsurface objects, recovering only a narrow, misplaced layer (see Fig. 21). However, the reconstruction with the eigenvector model representation is able to accurately capture the largest subsurface salt domes (see Fig. 22).

C O N C L U S I O N
In this paper, we have investigated the use of a representation based upon a basis of eigenvectors for image decomposition and quantitative reconstruction in the seismic inverse problem, using 2-D and 3-D experiments. We have implemented several diffusion coefficients, and compared their performance, depending on the target medium.
(i) In the context of image decomposition, media with contrasting objects (salt domes) are clearly better suited to the representation than layered media (such as Marmousi). All of the diffusion coefficients behave well and provide satisfactory results for the decomposition of salt-dome images, even in the presence of noise.
(ii) For the decomposition of images with layered patterns, only a few formulations perform well (η 1 , η 3 , η 6 ). It would be interesting to investigate further the performance of anisotropic or directional diffusion coefficients, mentioned by Weickert (1998) and Grote & Nahum (2019).
Next, we have considered the quantitative reconstruction procedure in seismic imaging, where only partial, backscattered (i.e. reflection) data are available, from one-side illumination. We have probed the performance of the method by considering initial guesses with minimal a priori information, and by avoiding the low-frequency data, which are not accessible in seismic applications. In this context, the FWI algorithm based upon the eigenvector model representation has shown promising results for the reconstruction of 3-D subsurface salt-dome media. The method only requires the preliminary computation of the basis of eigenvectors associated with the diffusion operator, and a trivial modification of the gradient computation. Namely, the overall additional cost of the method remains marginal compared to the cost of FWI. Our findings are the following.
(i) For reconstruction, the result depends on the choice of diffusion coefficient. We recommend η 3 , from Geman & Reynolds (1992), which was the most robust in our applications, even with a fixed N and a few iterations.
(ii) We have shown that the choice of N is not trivial, and one cannot rely on the misfit functional evaluation. Therefore, we have proposed a progression of increasing N, which appears to stabilize the reconstruction.
(iii) Because the method has a smoothing effect, it focuses on the long wavelength structures. Thus, the reconstruction using the decomposition can serve as an initial model to iterate with higher frequency contents, in order to improve the resolution.
Following these analyses, some difficulties remain regarding the optimal choice of parameters. For instance, it is possible that η has to be selected differently depending on the model (as illustrated with the Marmousi decomposition). Similarly, the scaling coefficient β affects the performance, but the acceptable range appears, fortunately, quite large.
For future work, it seems that the lack of background velocity information cannot be overcome by the decomposition alone, possibly resulting in artifacts. Therefore, we envision the use of multiple bases to parametrize the velocity (e.g. using the background/reflectivity decomposition idea of Clément et al. (2001) and Barucq et al. (2019b), with a dedicated smooth eigenvector basis to represent the background, and another to represent the reflectors).
Eventually, the method readily extends to multiparameter inversion (e.g. for elastic media), upon taking a separate basis per parameter (e.g. one for each of the Lamé parameters in linear elasticity). However, a more appropriate approach would be the use of a joint basis, for example, by considering a system of PDEs instead of the scalar diffusion operator. We have in mind strategies such as joint sparsity and tensor decomposition.

A C K N O W L E D G E M E N T S
The authors would like to thank Prof Marcus Grote for thoughtful comments and discussions. The authors also thank the editor and the reviewers, who have contributed to improving the quality of the paper. FF is funded by the Austrian Science Fund (FWF) under the Lise Meitner fellowship M 2791-N. The research of FF is also supported by the Inria-TOTAL strategic action DIP. OS is supported by the FWF, with SFB F68, project F6807-N36 (Tomography with Uncertainties). OS also acknowledges support from the FWF via the project I3661-N27 (Novel Error Measures and Source Conditions of Regularization Methods for Inverse Problems). HB acknowledges funding from the European Union's Horizon 2020 research and innovation program under the Marie Sklodowska-Curie grant agreement Number 777778 (Rise action Mathrocks) and with the E2S-UPPA CHICkPEA project.
Figure A1. Comparison of the computational times for the eigenvector decomposition for different η. The computation of the eigenvectors uses the discretized matrix of a diffusion operator of size 277 221 × 277 221. It corresponds to the decomposition of the model of Fig. 5(a). Other formulations of η from Table 1 are not shown for clarity, but follow the same pattern.

A2 Computation of the coefficients α
The last step is to retrieve the appropriate coefficients α k in (13) for the decomposition. It consists of solving a dense linear system (arising from the least-squares method). We use LAPACK, Anderson et al. (1999); contrary to MUMPS, LAPACK is adapted to dense linear systems. Note that, because we usually consider a few hundred coefficients for the decomposition, this operation remains relatively cheap compared to the eigenvector computation.
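The computation of the α can be sketched with NumPy, whose `lstsq` also relies on LAPACK (driver `gelsd`); the basis below is a toy stand-in for the diffusion eigenvectors, and the model a toy image with a sharp interface.

```python
import numpy as np

# Stand-in basis: orthonormalized 1-D Laplacian eigenvectors.
n, N = 500, 50
j = np.arange(1, n + 1)[:, None]
k = np.arange(1, N + 1)[None, :]
V = np.sin(np.pi * j * k / (n + 1))
V /= np.linalg.norm(V, axis=0)

m = np.tanh((j[:, 0] - n / 2) / 20.0)          # toy model, sharp interface
alpha, *_ = np.linalg.lstsq(V, m, rcond=None)  # dense least squares (LAPACK)
m_N = V @ alpha                                # N-term representation of m
```

Since the columns of V are orthonormal here, the least-squares solution reduces to the projection V.T @ m; the general dense solve covers non-orthonormal bases as well.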

A3 Numerical illustration
We compare the computational times for the eigenvector computation and the model decomposition in Fig. A1, for different values of N and η. We first note that the choice of η does not significantly modify the computational time. Then, we see that the two operations are mostly linear in N, and that the time to solve the least-squares problem with LAPACK is much smaller than the time to compute the eigenvectors with ARPACK, namely, about a hundred times smaller. For the largest case, N = 1000 for a square matrix of size 277 221, it takes about 30 min to retrieve the eigenvectors, and 10 s to compute the α (in our applications, we usually take N < 100).
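SciPy's `eigsh` wraps the same ARPACK routines; a small 1-D analogue of the computation behind Fig. A1 might look like the following, where η is a toy constant coefficient, not one of the η formulations of Table 1.

```python
import numpy as np
import scipy.sparse as sp
from scipy.sparse.linalg import eigsh

# Assemble a sparse 1-D diffusion operator -d/dx(eta d/dx) and extract
# its N smallest eigenpairs with ARPACK in shift-invert mode, as done
# (in 2-D/3-D and at much larger size) for Fig. A1.
n, N = 1000, 20
h = 1.0 / (n + 1)
eta = np.ones(n + 1)                      # diffusion coefficient at interfaces
main = (eta[:-1] + eta[1:]) / h ** 2
off = -eta[1:-1] / h ** 2
A = sp.diags([off, main, off], [-1, 0, 1], format="csc")

# which="LM" with sigma=0 targets the eigenvalues nearest zero.
vals, vecs = eigsh(A, k=N, sigma=0, which="LM")
```

For this constant-coefficient case the smallest eigenvalue is close to π², which provides a quick sanity check of the discretization.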