Inexact Augmented Lagrangian Method-Based Full-waveform Inversion with Randomized Singular Value Decomposition

Full Waveform Inversion (FWI) is a modeling algorithm used for seismic data processing and subsurface structure inversion. Theoretically, the main advantage of FWI is its ability to recover useful subsurface information, such as velocity and density, from complex seismic data through inversion simulation. However, under complex conditions it is difficult for FWI to achieve high-resolution imaging, most often because of random noise, a poor initial model, or ill-chosen inversion parameters. Therefore, we introduce an effective image processing and dimension reduction tool, randomized singular value decomposition (rSVD) with weighted truncated nuclear norm regularization (WTNNR), and embed it in FWI to achieve high-resolution imaging. The algorithm obtains a truncated matrix approximating the original matrix by reducing the rank of the velocity increment matrix, thus truncating noisy data, with the truncation range controlled by WTNNR. Subsequently, we employ an inexact augmented Lagrangian method (iALM) in the optimization to compress the solution space, relaxing the dependence of FWI and rSVD-WTNNR on the initial model and accelerating the convergence of the objective function. We tested our method on two sets of synthetic data; the results show that, compared with traditional FWI, it suppresses the impact of random noise more effectively and thus yields higher-resolution, more accurate subsurface models. Meanwhile, owing to the introduction of iALM, our method also converges significantly faster. This work indicates that the combination of rSVD-WTNNR and FWI is an effective imaging strategy that can help address the challenges faced by traditional FWI.


INTRODUCTION
Full Waveform Inversion (FWI) is a subsurface imaging technique in the field of seismic exploration; it inverts and fits field or artificially synthesized seismic data to obtain the physical properties of underground media, such as velocity and density (Virieux & Operto 2009). The conventional workflow of FWI involves comparing the actual measured seismic wavefield with the computed seismic wavefield, calculating the difference between them through least squares, obtaining the update direction with a gradient method and the velocity increment along that direction, and superimposing the velocity increment on the result of the previous iteration to complete one iteration (Zhou et al. 2015). Ideally, the FWI algorithm will produce an inversion result very close to the real model after sufficiently many iterations because, as a fitting algorithm, the inverted velocity model gradually converges to the real model under reasonable preconditions (Métivier et al. 2013). However, in reality, FWI still faces many challenges. For example, errors inherent in the data collection equipment, or the impact of external environmental factors on the propagation of seismic waves, may introduce random noise, which can significantly affect the effectiveness of FWI (Moghaddam & Herrmann 2010). In particular, high-level random noise can mask useful seismic information, making it impossible to capture detailed underground information, and can even slow the convergence of FWI through overfitting, resulting in erroneous inversion results (Aghamiry et al. 2021a). In addition, the unconstrained form of least squares, due to its overly loose solution space, often causes FWI to fall into local minima.
This problem is also often considered jointly with FWI's excessive dependence on the initial model, which follows the same underlying mathematical logic: FWI, as a fitting algorithm, finds it difficult to converge accurately on non-strictly convex functions (Chi et al. 2014). Therefore, to achieve high-resolution and high-precision FWI, optimization of the inversion algorithm is crucial.
In the context of FWI, numerous approaches have been proposed to mitigate the potential overfitting issues encountered during the inversion process, such as Tikhonov regularization or Total Variation (TV) regularization (Song & Alkhalifah 2020). The overfitting issue frequently occurs because real seismic data are inherently noisy, often polluted with random noise or artifacts. After numerous iterations of FWI, especially with suboptimal initial models, the inversion process can be significantly degraded by this noise. In other words, the algorithm may mistakenly interpret the noise as genuine seismic signals; incorporating such noise into the inversion leads to high sensitivity to noise and, ultimately, overfitting (Li et al. 2016).
Tikhonov regularization tackles this by adding a regularization term to the objective function, striving to make the model as smooth as possible, so that the inversion no longer merely minimizes the wavefield difference but also minimizes model complexity. This method seeks a balance between reducing error and preventing overfitting, suppressing the sensitivity of FWI to noise and enhancing the algorithm's generalization capability (Aghamiry et al. 2020). However, it is not ideal for highly discontinuous models or models with substantial velocity contrasts, because Tikhonov regularization generates smoother solutions; a clear manifestation of this is that it can blur the outline of salt bodies, reducing resolution (Zhu et al. 2021). Therefore, another regularization algorithm, TV regularization, has garnered attention owing to its introduction of sparse solutions. TV regularization allows the solution to have sharp boundaries and discontinuities, which is crucial for inverting complex models in Earth Science (Qu et al. 2019). However, TV regularization also has notable issues; for instance, it involves non-smooth optimization problems (Aghamiry et al. 2021b).
Further, in FWI, the construction of sparse dictionaries based on the Singular Value Decomposition (SVD) algorithm presents a novel approach to dealing with noise. SVD and sparse dictionary methods (Li & Harris 2018) can learn from a training dataset to build a highly adaptable dictionary capable of better handling complex, non-stationary, and non-fixed signals.
Sparse encoding methods (Guo et al. 2020) can represent signals as sparse linear combinations, which helps isolate noise from signals, as signals can typically be represented by elements in the dictionary while noise cannot (Liu et al. 2022). However, the conventional construction of sparse dictionaries and the solution of sparse encoding can require extensive computation, especially for large-scale seismic data, where both computational cost and memory consumption are crucial considerations. Additionally, the quality of the dictionary greatly affects the results: if the training data do not adequately represent the data to be processed, the effectiveness of the dictionary is diminished (Wang et al. 2021).
Based on the issues above, we propose to optimize the FWI iteration process with a combination of randomized singular value decomposition (rSVD) with weighted truncated nuclear norm regularization (WTNNR) and the inexact augmented Lagrangian method (iALM) to speed up the convergence rate of FWI. The most significant difference between the rSVD algorithm and conventional algorithms is that it first generates a matrix compressed relative to the original through Gaussian random matrices and economic QR decomposition. The scale of this compressed matrix is determined by the truncation coefficient k; the decomposition is embedded within the internal loop of FWI to achieve layered optimization of the velocity increment with each iteration, and the truncation parameter k increases as the number of iterations increases. Using the rSVD algorithm can not only significantly reduce the scale of seismic data but also reduce memory consumption and computing time by adjusting the truncation coefficient (Liu & Peter 2020). More importantly, we embed a WTNNR algorithm in rSVD to handle the Σ matrix (the rectangular diagonal matrix of singular values) after rSVD decomposition. WTNNR is a special low-rank matrix recovery method that improves upon the traditional nuclear norm minimization method (Deng et al. 2020). In WTNNR, the singular values in the Σ matrix are sorted, and each singular value decomposed by rSVD is assigned a weight according to its size. Subsequently, according to a predetermined threshold, the singular values are truncated, i.e., singular values less than the threshold are set to 0, thereby acting as singular value shrinkage. The weighted singular values then replace the original singular values in the Σ matrix. The new Σ matrix and the unitary matrices are used for reconstruction to obtain the recovered low-rank matrix.
The advantage of this method is that it can better balance the low-rank nature and data fidelity, as it can assign different weights to different singular values and remove smaller singular values through the threshold, thus achieving a better denoising effect (Gu et al. 2017).
In this paper, we innovatively introduce the rSVD-WTNNR algorithm to improve the robustness of FWI for data sets with high background random noise. The acceleration in computation is realized by adopting the iALM algorithm. The synthetic model (the 2004 BP model) is tested by comparing the inversion results of Tikhonov-regularized FWI with those of rSVD-WTNNR FWI to verify the superiority of our method.
The full waveform inversion (FWI) as a fitting algorithm can be represented in the frequency domain as

\min_{\chi} \; \| Y - F(\chi, s) \|_2^2, \qquad (1)

where Y is the input actual observed seismic data, \chi is the model parameters, and s is the source matrix. The non-linear function F is the seismic data obtained from the forward simulation of the wave equation in the frequency domain. In the Newton method, the objective function of equation 1 is minimized iteratively. Each iteration requires finding the inverse of the Hessian matrix, acting on the gradient, and then obtaining the velocity increment by choosing an appropriate iteration step:

\delta\chi = -H^{\dagger} g, \qquad (2)

\chi_{k+1} = \chi_k + \alpha \, \delta\chi, \qquad (3)

where H^{\dagger} is the approximation of the inverse of the Hessian matrix, g is the gradient, \alpha is the iteration step size, and \delta\chi is the model update. Equations 2 and 3 are the classical Gauss-Newton scheme for solving the FWI inverse problem. However, the conventional algorithm is very computationally intensive because each of its iterations requires solving the Gauss-Newton subproblem, and each sub-iteration of the subproblem requires solving the wave equation multiple times (Li et al. 2012). To achieve high-resolution FWI and reduce its computational burden, we optimize the model update by integrating dimensionality reduction techniques with image processing technologies.
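As a minimal numerical sketch of the update in equations 2 and 3, the following uses a random linear operator J as a stand-in for the (much more expensive) wave-equation forward map; the sizes, the damping term, and the step length α are illustrative assumptions, not values from this paper:

```python
import numpy as np

# Toy Gauss-Newton iteration: J plays the role of the Jacobian of the
# forward map F, so H = J^T J approximates the Hessian (assumption:
# the problem is linear here, unlike real FWI).
rng = np.random.default_rng(0)
n = 50                                   # number of model parameters
chi_true = rng.normal(size=n)            # "true" model
J = rng.normal(size=(80, n))             # stand-in forward operator
Y = J @ chi_true                         # observed data for this toy problem

chi = np.zeros(n)                        # initial model
alpha = 1.0                              # iteration step size
for _ in range(20):
    r = J @ chi - Y                      # data residual
    g = J.T @ r                          # gradient of the least-squares misfit
    H = J.T @ J                          # Gauss-Newton Hessian approximation
    dchi = -np.linalg.solve(H + 1e-6 * np.eye(n), g)  # damped H^dagger g
    chi = chi + alpha * dchi             # model update (equation 3)

print(np.linalg.norm(chi - chi_true) < 1e-6)  # converges on this linear toy
```

On this linear toy the iteration converges almost immediately; real FWI replaces J with a wave-equation solver, which is what makes each sub-iteration expensive.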
In FWI, we can reduce the rank of velocity increments through truncation, which helps eliminate high-frequency, small-scale structural noise, as this type of noise often corresponds to smaller singular values. Therefore, by eliminating these smaller singular values, we can obtain a better and more stable velocity increment. Specifically, we first employ the rSVD algorithm to compress the increment distribution matrix in each iteration and, through the truncation parameter k, obtain a smaller-scale compressed matrix, which we then recover to a similar matrix with the same size as the original. However, unlike the original matrix, this similar matrix has undergone truncation optimization, so its rank is restricted by the truncation parameter k, making it significantly smaller than the rank of the original matrix. Moreover, this truncated matrix contains only the most important features of the original matrix and discards smaller singular values, thus helping to remove noise and other unimportant features.
In this paper, we describe rSVD's corresponding steps in the context of FWI. We reshape the velocity increment \delta\chi \in \mathbb{R}^{m \times n} as a primitive matrix, where m and n are the numbers of rows and columns. We multiply it by a Gaussian random matrix \Omega \in \mathbb{R}^{n \times k} to sample its range,

Y = \delta\chi \, \Omega, \qquad (4)

and take the economic QR decomposition Y = QR. The compressed matrix associated with the truncation parameter k is then

\chi_c = Q^{*} \delta\chi, \qquad (5)

where Q^{*} is the transpose of the matrix Q, and \chi_c is the compression matrix we initially obtain; it is of size k \times n. Taking the SVD of this small matrix,

\chi_c = \tilde{\Psi} \Sigma_k V^{*}, \qquad (6)

and lifting back with

U_k = Q \tilde{\Psi}, \qquad (7)

where \tilde{\Psi} holds the left singular vectors, \Sigma_k contains the approximated first k singular values on its diagonal, and V^{*} holds the right singular vectors. The rank-k approximation of the velocity increment is then reconstructed as

\delta\chi_k = U_k \Sigma_k V^{*}. \qquad (8)
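The steps above can be sketched as follows (a Halko-style randomized SVD; the oversampling parameter, random seed, and test matrix are illustrative assumptions):

```python
import numpy as np

# Randomized SVD: compress d_chi with a Gaussian test matrix, QR-
# orthonormalize the sample, then take the exact SVD of the small
# compressed matrix and lift the left factor back.
def rsvd(d_chi, k, oversample=10):
    m, n = d_chi.shape
    rng = np.random.default_rng(0)
    omega = rng.normal(size=(n, k + oversample))   # Gaussian random matrix
    Y = d_chi @ omega                              # sample the range of d_chi
    Q, _ = np.linalg.qr(Y)                         # economic QR decomposition
    chi_c = Q.T @ d_chi                            # small compressed matrix
    psi, s, Vt = np.linalg.svd(chi_c, full_matrices=False)
    U = Q @ psi                                    # lift back: U_k = Q psi
    return U[:, :k], s[:k], Vt[:k, :]              # rank-k factors

# Usage: rank-k approximation of a velocity-increment-like matrix
# (a smooth rank-1 trend plus small noise, sizes illustrative).
rng = np.random.default_rng(1)
A = np.outer(np.arange(124.0), np.ones(124)) + 0.01 * rng.normal(size=(124, 124))
U, s, Vt = rsvd(A, k=5)
A_k = U @ np.diag(s) @ Vt
print(np.linalg.norm(A - A_k) / np.linalg.norm(A))  # small relative error
```

Because the SVD is taken of the small (k + oversample) × n matrix rather than the full m × n matrix, the cost drops sharply when k ≪ min(m, n), which is the regime exploited in the early FWI iterations.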
The above describes the decomposition of the reshaped velocity increment in FWI by rSVD. Through these steps, we can quickly obtain approximate singular values and singular vectors of the velocity increment. We provide a comparison of the computational speeds of rSVD and full SVD in Figure 1, based on the 2004 BP model. In Figure 1(a), the horizontal axis is the rank of the approximating matrix and the vertical axis is the computational time; owing to the smaller compressed matrix, the computational speed of rSVD is significantly superior to that of full SVD in the early stages of iteration, with a clear gap between their speed curves. In Figure 1(b), the horizontal axis is the singular value order and the vertical axis is the singular value; with k = 20, the curve of approximate singular values from rSVD also differs significantly from the singular value curve of full SVD. The smaller compressed matrix corresponds to a smaller number and range of singular values, carrying out an initial truncation while retaining the main information of the model. Moreover, selecting the truncation parameter is not difficult. We first set the initial value of k before the forward modeling and choose a step length ς so that the truncation parameter grows linearly by ς in each inner loop as the model increment is iterated. This procedure yields a series of gradually expanding truncation ranges. Figure 2 shows a schematic diagram of the update process for the truncation parameter k: as k increases, the singular value range of the approximate matrix gradually approaches that of the original matrix. Also, based on equations 1-8, we provide a visual flow chart of rSVD in FWI in Figure 3.
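The linear growth of the truncation parameter described above can be sketched in a few lines; the initial k, the step ς, and the number of inner iterations are illustrative (chosen to match the k values quoted for Figure 11):

```python
# Truncation-parameter schedule: k grows linearly by a step zeta with
# each inner iteration, producing gradually expanding truncation ranges.
k0, zeta, n_inner = 10, 10, 5
ks = [k0 + i * zeta for i in range(n_inner)]
print(ks)  # [10, 20, 30, 40, 50]
```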
Here, we use the velocity model of the 2004 BP model of size 124 × 124 as an example of the velocity increment, providing a visual reference for a single rSVD optimization pass, with the truncation parameter taken as k = 50.

WEIGHTED TRUNCATED NUCLEAR NORM REGULARIZATION: WTNNR
FWI uses gradients to update the velocity model, and the velocity increment from the conventional second-order Gauss-Newton step can be regularized with a weighted truncated nuclear norm:

\min_{\delta\chi} \; \| \delta\chi \|_{WT}, \qquad (9)

where \| \cdot \|_{WT} is one kind of truncated nuclear norm and \delta\chi is the optimized velocity increment. Following the standard truncated nuclear norm formulation, we can transform the WTNNR problem into the following form:

\| \delta\chi \|_{WT} = \| \delta\chi \|_{*} - \max_{A A^{*} = I, \; B B^{*} = I} \mathrm{Trace}(A \, \delta\chi \, B^{*}), \qquad (10)

where A \in \mathbb{R}^{k \times m} and B \in \mathbb{R}^{k \times n}, the truncation parameter k always satisfies k \leq \min(m, n), and \mathrm{Trace}(\cdot) represents the trace of a matrix. For the singular value matrix obtained after rSVD in the previous section, we perform windowing. After the windowing process, more non-zero singular values are removed, which dramatically reduces the size and complexity of the matrix for further truncation operations; the result is a new singular value matrix in which the weighted entries beyond the truncation point are taken to be 0. From the schematic comparison, we can intuitively see that all of the singular value matrices after singular value shrinkage have smaller magnitudes and fewer non-zero elements, implementing the singular value shrinkage optimization.
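As a quick numerical check of the trace term in the truncated nuclear norm above: choosing A and B from the top-k singular vectors makes Trace(A X B*) equal the sum of the k largest singular values, so the truncated nuclear norm reduces to the sum of the remaining ones (matrix size and k are arbitrary here):

```python
import numpy as np

# Verify that with A = U_k^T, B = V_k^T from the SVD of X, the trace
# term attains the sum of the k largest singular values.
rng = np.random.default_rng(0)
X = rng.normal(size=(8, 6))
k = 3
U, s, Vt = np.linalg.svd(X, full_matrices=False)
A, B = U[:, :k].T, Vt[:k, :]                      # rows are orthonormal
trace_term = np.trace(A @ X @ B.T)

print(np.isclose(trace_term, s[:k].sum()))            # True
print(np.isclose(s.sum() - trace_term, s[k:].sum()))  # True
```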
The choice of the weight vector is crucial. According to a priori knowledge, larger singular values are more important than smaller ones, especially in denoising applications; the larger the singular value, the less it needs to be shrunk (Chang et al. 2000). The idea is therefore that the weight assigned to the i-th singular value is inversely proportional to it:

w_i = \frac{\beta \sqrt{\gamma}}{\sigma_i + \varepsilon},

where \beta is a nonnegative constant, \sigma_i is the locally estimated i-th singular value of the matrix, \gamma is the number of similar patches, and \varepsilon = 2^{-52} avoids the divisor being zero. Similarly, in the truncated part the weighted singular values are taken to be 0. The four sets of singular value matrices shown in Figure 6 are derived from the original velocity increment, the singular value matrix of rSVD, the weighted shrinkage of WTNNR, and the optimized singular value matrix. We initially tested the effect of the treatment with and without WTNNR in Figure 7, where the difference can be seen from the results of a single rSVD treatment.
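One plausible reading of the weighting-plus-truncation step can be sketched as follows; the shrinkage rule and the values of β, γ, and the threshold τ are illustrative assumptions rather than this paper's exact scheme:

```python
import numpy as np

# Weighted singular-value shrinkage sketch: weights are inversely
# proportional to the singular values (w_i = beta*sqrt(gamma)/(sigma_i+eps)),
# shrunk values below a threshold tau are truncated to 0.
def wtnnr_shrink(sigma, beta=1.0, gamma=16, tau=1e-2, eps=2.0**-52):
    w = beta * np.sqrt(gamma) / (sigma + eps)   # larger sigma -> smaller weight
    shrunk = np.maximum(sigma - w, 0.0)         # weighted soft shrinkage
    shrunk[shrunk < tau] = 0.0                  # truncate small singular values
    return shrunk

sigma = np.array([10.0, 5.0, 1.0, 0.05, 0.01])
print(wtnnr_shrink(sigma))
```

With these illustrative parameters the two dominant singular values survive (lightly shrunk) while the small ones are zeroed, which is the qualitative behaviour the text describes.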
Especially in the case of k = 1, the result with WTNNR is better optimized, which highlights the advantage of the weighted shrinkage.

FWI can also be cast as a PDE-constrained problem

\min_{\chi, u} \; \| Y - P u \|_2^2 \quad \text{s.t.} \quad A(\chi) \, u = s, \qquad (14)

where s is the source term, the linear observation operator P \in \mathbb{R}^{Ma \times Na} samples the wavefield u at the receiver positions, \chi \in \mathbb{R}^{Na \times 1} is the model parameter vector, which contains preliminary information about the underground parameters, and A(\chi) \in \mathbb{R}^{Na \times Na} is the discretized PDE operator, which depends on \chi. Moreover, the acceleration parameter is updated as

t_{k+1} = \left(1 + \sqrt{1 + 4 t_k^2}\right) / 2,

where \hat{\lambda}_1 = \lambda_0 and t_1 = 1. iALM is a new iterative method for solving the linearly constrained convex minimization problem of equation 14. This iterative method is an inexact version of the augmented Lagrangian method (ALM). Since every sub-problem must be solved accurately in each outer iteration, the computational cost of the other ALM variants still has the potential to be reduced. More importantly, the sub-problems of the accelerated augmented Lagrangian do not have closed-form solutions; therefore, the algorithm we adopt allows inexact solutions to the sub-problems. This allows iALM to accelerate the convergence rate and reduce the computational cost, overcoming the expense of solving each sub-problem exactly. Equation 17 can also be simplified. Finally, as shown in Algorithm 1, we give the computational flow of FWI based on rSVD-WTNNR and iALM, and Figure 8 provides the full updated workflow.

SYNTHETIC EXAMPLES
To test the performance of the improved FWI proposed in this paper, we compare its inversion results with those of conventional FWI. The main objective of this paper is to propose a robust FWI algorithm; therefore, we add random noise to the synthetic data, thus increasing the inversion difficulty of FWI. We then test the ability of the improved FWI to achieve high-resolution imaging in the presence of random noise interference. We use the signal-to-noise ratio

\mathrm{SNR} = 10 \log_{10} \frac{\| s \|_2^2}{\| \eta \|_2^2},

to define the level of random noise, where \eta is the noise data and s is the signal data.
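The noise-level definition can be sketched as follows; the toy trace, the target SNR, and the scaling routine are illustrative, not the paper's data:

```python
import numpy as np

# Add Gaussian noise scaled so that the realized SNR (in dB) matches a
# target, using SNR = 10*log10(||signal||^2 / ||noise||^2).
def add_noise(signal, snr_db, seed=0):
    rng = np.random.default_rng(seed)
    noise = rng.normal(size=signal.shape)
    # scale so ||signal|| / ||scaled noise|| = 10^(snr_db/20)
    scale = np.linalg.norm(signal) / (np.linalg.norm(noise) * 10 ** (snr_db / 20))
    return signal + scale * noise

s = np.sin(np.linspace(0, 8 * np.pi, 1000))   # toy "seismic" trace
noisy = add_noise(s, snr_db=10.0)
eta = noisy - s
print(10 * np.log10(np.sum(s**2) / np.sum(eta**2)))  # ≈ 10.0
```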

BP MODEL
The famous 2004 BP model is divided into three parts, and we adopted the middle part of the model. We present the true velocity model in Figure 9(a) and the initial velocity model in Figure 9(b). In the synthetic data, we tested the inversion results under three different noise levels, so we first give the data sets for three different dBs of random noise, as shown in Figure 10. The inversion results under the three noise conditions all show that our proposed rSVD-WTNNR-based FWI delivers higher-precision inversion results, better noise suppression, and faster convergence, which validates the effectiveness and feasibility of the improved algorithm proposed in this paper.

DISCUSSIONS
FWI is a time-tested geophysical fitting algorithm. While its core framework has been widely accepted, there is still potential for improvement and refinement in its specific details, particularly when dealing with challenges found in practical applications. In pursuing a high-resolution, noise-resistant FWI, we have attempted to incorporate specific image processing techniques into the FWI framework. By combining well-established image processing algorithms with the overall framework of FWI, we aim to achieve higher resolution and better noise resistance.
Specifically, our approach originates from the treatment of the velocity increment in FWI. By resizing the reconstruction vector, we convert the original problem of FWI velocity increment processing into an image denoising problem. For example, as illustrated in Figure 19, we start with a velocity increment in Figure 19(a), which undergoes SVD to yield Figure 19(b). We then add the random noise shown in Figure 19(c) to Figure 19(a), resulting in a new singular value matrix in Figure 19(d). How we handle the singular value matrix of Figure 19(d) is analogous to how we treat the objective function in FWI; hence, we consider this optimization method a regularization strategy.
Back-calculating Figure 19(d) gives us the velocity increment seen in Figure 19(e). If we simply truncate Figure 19(d), removing all singular values within the red box, we can back-calculate to obtain Figure 19(f). Since the result in Figure 19(f) is significantly superior to that in Figure 19(e), we believe this innovative approach provides new tools, strategies, and perspectives for dealing with FWI problems. Additionally, our proposed method potentially holds substantial value in addressing the ill-posed nature of FWI, especially when handling data noise and uncertainty in the velocity increment.
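The truncation experiment described above can be reproduced in miniature; the rank, noise level, and matrix size are illustrative assumptions:

```python
import numpy as np

# Denoising-by-truncation sketch: add noise to a low-rank "velocity
# increment", truncate all but the top-r singular values, and compare
# reconstruction errors against the untreated noisy matrix.
rng = np.random.default_rng(0)
n, r = 124, 3
clean = rng.normal(size=(n, r)) @ rng.normal(size=(r, n))   # rank-3 increment
noisy = clean + 0.5 * rng.normal(size=(n, n))               # add random noise

U, s, Vt = np.linalg.svd(noisy, full_matrices=False)
s_trunc = np.where(np.arange(n) < r, s, 0.0)                # keep top-r only
denoised = U @ np.diag(s_trunc) @ Vt

err_noisy = np.linalg.norm(noisy - clean)
err_trunc = np.linalg.norm(denoised - clean)
print(err_trunc < err_noisy)  # truncation reduces error vs. the noisy matrix
```

Because the noise spreads its energy across all singular values while the signal concentrates in the first few, discarding the tail removes mostly noise, mirroring the Figure 19(e) vs. 19(f) comparison.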
Based on this, we consider employing an rSVD algorithm for preliminary truncation processing. rSVD is a potent tool for matrix decomposition, applicable to any real or complex matrix, and often used to address matrix decomposition problems in large-scale data. To elucidate more intuitively why we adopt rSVD, we interpret SVD from a geometrical perspective.

CONCLUSIONS
In response to the traditional FWI challenges of low resolution and outlier sensitivity, this paper introduces a novel optimization workflow to enhance FWI's inversion precision and robustness to noise under equivalent conditions. Within the FWI framework, we innovatively incorporate the rSVD algorithm into the inversion process and employ the WTNNR algorithm to optimize the singular value matrix of velocity increments. Subsequently, we utilize the iALM algorithm, which introduces variable Lagrange multipliers, thereby enhancing the stability and reliability of the FWI and accelerating its convergence rate. In the numerical experiment section, we conducted a comparative analysis of the inversion results derived from the improved FWI and traditional FWI for the 2004 BP model under three distinct low signal-to-noise ratio scenarios.
The experimental outcomes demonstrate that the methodology proposed in this paper effectively mitigates the interference of outliers on inversion by truncating the singular value range of velocity increments under various low signal-to-noise ratio conditions. This enhancement facilitates a more robust representation of model feature information and enables high-resolution imaging of deep, complex structural regions.

Figure 11. Conventional velocity increment model: (a1-a5) close to the 4.5 Hz frequency, with the truncation parameter k, which increases as the number of internal iterations increases, equal to 10, 20, 30, 40, and 50, respectively. Modified velocity increment model: (b1-b5) close to the 4.5 Hz frequency, with k equal to 10, 20, 30, 40, and 50, respectively.

ACKNOWLEDGEMENTS
(Figure caption fragments: inversion results at 3.9 Hz, 5.35 Hz, 7.70 Hz, and 9.24 Hz, respectively; velocity differences between the inversion results based on the modified FWI and the true velocity model; colour scale: Velocity (km/s).)