Multi-source multi-scale source-independent full waveform inversion

Multi-scale full waveform inversion (FWI) in the time domain requires an accurate source signature estimation, which is difficult to obtain from the field data. As an alternative, we present a source-independent inversion algorithm that modifies Wiener filter calculations. A low-pass Wiener filter can be computed by using information from a reference trace rather than the true source wavelet. AWiener filter was designed to handle observed and modeled wave fields separately, with a misfit function consisting of the filtered wave field. The source wavelets for the observed and modeled wave fields are both eliminated in the misfit function, thus offsetting any adverse effects from a directly observed or adjoint source wavelet. As the filtered data can be expressed as a convolution of the reflection coefficient and the target source wavelet, it is straightforward to use a low to high frequency selection strategy. Although this newmethod uses the same forward-modeling and inversion schemes as conventional FWI, it differs in how it calculates the misfit function and corresponding adjoint source. In order to achieve more faithful inversion results and promote computational efficiency, multi-scale source-independent inversion and multi-source inversion strategies were also adopted. In particular, we realized that a weighted average of the zero-offset traces in the super gather constitutes a better choice for reference traces. Aside from this, the noise resistance capability of the proposed method was tested and synthetic data evaluations of both the source-independent and multi-source inversion strategies were undertaken to assess the efficiency of the proposed method.


Introduction
Providing efficient inversions of velocity distribution is of significant importance for both the accurate imaging and characterization of reservoirs (Sun et al. 2017;Liao et al. 2009). However, conventional velocity analysis and tomography methods based on the travel time associated with picked events no longer meet the demands of exploration environments (Rao & Wang 2005;Tao & Sen 2015). Full waveform inversion (FWI), which focuses on minimizing misfit functions, measures the difference between synthetic and observed data to get the best possible data reproduction. This ensures more accurate velocity estimation (Tarantola 1984;Virieux & Operto 2009;Liao et al. 2017). As the computational ability of supercomputers and quality of data acquisition has improved, FWI has been used with increasing success for transmission data and wide-angle reflection/refraction seismic data (Luo & Schuster 1991;Yu et al. 2014;Operto et al. 2015).
Despite this success, the effective use of FWI is hindered by a number problems. First of all, gradient-based algorithms are the most popular choice for inversion. They search on the basis of an initial model that is already fairly accurate, then update the model with gradient information obtained by using an adjoint state method (Plessix 2006). Ultimately, this results in a minimized misfit function (Vireux & Operto 2009;Brossier 2011). However, when using this approach, the FWI misfit function is poorly defined and the calculations can produce numerous local minima, especially if there are low wavenumber components in the velocity model. Without a sufficiently good initial model and suitable observed data, the inversion process can easily become locked in the local minima and obtain results that are not in line with the actual situation. The nonlinearity of the misfit function is closely related to the frequency of the data and misfit functions associated with low-frequency data display better global convergence. A frequency continuation strategy, which inverts the low frequencies first by filtering the data according to the frequency or time domain, has been proposed as a way of avoiding the local minima issue by controlling the wavenumber admitted to the gradient in a sequence from low to high (Bunks et al. 1995;Boonyasiriwat et al. 2009). Sirgue & Pratt (2004) managed to significantly improve the computational efficiency of multi-scale FWI by defining a frequency discretization strategy that depended on the maximum effective offset. FWI in the frequency domain naturally follows a multi-scale approach because the data is already decomposed into separate frequency components (Pratt 1999). Boonyasiriwat et al. (2009) have also proposed a multi-scale method for time domain FWI where the frequency selection strategy is extended to time domain waveform inversion. Although multi-scale FWI in the time domain achieves higher computational efficiency and more linear convergence, it remains limited by the fact that the Wiener filter computation still needs information from a true source wavelet (Boonyasiriwat et al. 2009). In real cases, especially for land datasets, the source wavelet is not directly available and estimation of the source wavelet is complicated.
As FWI applies a wave equation to delineate the subsurface velocity structure according to the best fit between the synthetic and observed data, the source signature is one of the key factors that affects the nonlinearity of the misfit function (Hu et al. 2017). In real cases, the source signature of the observed data can be obtained by windowing a portion of the early arrivals (Yu et al. 2014(Yu et al. , 2016 or it can be estimated along with the velocity structure in the waveform inversion algorithm (Zhou et al. 1997). However, because of the poor knowledge of the coupling between the source and medium, it is difficult to invert for a good source wavelet without knowing the exact near-surface velocity distribution. The estimated source signature can differ substantially from the real source and the final tomogram may become severely polluted with unacceptable artifacts. Frazer & Sun (1998) have suggested a new objective function for the inversion of sonic waveforms with an unknown source where receiver functions can be used to interpret well-logged sonic waveform data. To avoid cumbersome source signature estimation, some researchers have proposed using source-independent FWI methods in both the frequency and time domains (Lee & Kim 2003;Choi et al. 2005;Choi & Alkhalifah 2011;Xie et al. 2018). These algorithms can be divided into two groups: convolution-based approaches and de-convolutionbased approaches. For convolution-based approaches, the effects of the source wavelets can be eliminated by convolving the gathered data with a reference trace from a modeled wavefield and the modeled data with a reference trace from an observed wave field (Choi et al. 2005;Choi & Alkhalifah 2011;Li et al. 2017). For de-convolution-based approaches, the seismic data is normalized with a reference trace selected from the seismic data in the frequency domain (Lee & Kim 2003). Out of the two, convolution-based methods are better-suited to FWI in the time domain because they avoid the additional Fourier transforms and division necessary in the frequency domain. Choi & Alkhalifah (2011) have applied convolution-based theory to encoded multi-source FWI in the time domain, where it was found to be a more efficient and stable inversion strategy.
Another key factor restricting the application of FWI is its high computational burden. With the advent of highdensity 3D survey systems, FWI has come to bear an even greater computational cost (Virieux & Operto 2009). Unlike FWI in the frequency domain, where the computational burden is somewhat independent of the number of shots (Pratt 1999), the computational burden of FWI in the time domain is proportionally related to the number of shots and length of recording over time. This makes it unsuitable for high-density or 3D data (Anagaw & Sacchi 2014;Kwon et al. 2015). Following on from the successful application of pre-stack migration schemes, simultaneous shooting techniques that generate super-shots by summing individual sources with a random encoding function have been turned to as a way to reduce the computational burden (Herrmann et al. 2009;Krebs et al. 2009;Ben-Hadj-Ali et al. 2011;Rao & Wang 2017;Rao et al. 2018). Here, only two forwardmodeling procedures are needed to calculate the gradient of the misfit function with respect to the model parameters. This eliminates the proportional relationship between the computational cost of FWI and the number of shots. However, the simulation of simultaneous sources introduces crosstalk artifacts, which negatively influence the inversion results. In order to suppress the crosstalk noise effectively, multiple sources can be randomly encoded with a phase encoding function or with a time-shift function (Romero et al. 480 2000; Berkhout & Blacquire 2011;Anagaw & Sacchi 2014). Other approaches to reducing crosstalk noise include generating new randomly encoded super-shots at every iteration (Kerbs et al. 2009) and smoothing the gradient in predefined directions (Guitton & Diaz 2012).
In this paper, we present a time-domain multi-scale inversion strategy where the Wiener filter is calculated using a reference trace rather than the exact source signature. This enables multi-scale source-independent FWI. The algorithm is combined with a multi-source waveform inversion method for higher computational efficiency. Over the course of the paper, we will first introduce the mathematical formalisms needed for conventional multi-scale FWI in the time domain. We will then present our multi-scale source-independent inversion algorithm and how it can be combined with a multisource inversion technique. Synthetic case studies, including independent source and multi-source experiments, will then be used to demonstrate the characteristic properties of the proposed method and its performance. Finally, we will look at noise resistance and the choice of the reference trace before providing our conclusions.

Theory
In this section, we first introduce conventional multi-scale FWI in the time domain. To avoid the source wavelet estimation issue, we develop a Wiener filter that uses a reference trace and provide an expression of the gradient of the new misfit function that uses an adjoint state method. We then adapt the inversion strategy for multi-source waveform inversion.

Multi-scale source-independent FWI
An L2-norm-based FWI misfit function can be defined as follows (Tarantola 1984): (1) where u obs and u cal represent the seismic field and modeling data recorded at time t for source point x s and receiver point x r , respectively. m stands for the velocity model parameters.
Using an adjoint state method (Plessix 2006;Fichtner & Trampert 2011), the gradient of the misfit function (1) with respect to velocity model parameters can be obtained as follows: where B is the forward-modeling operator. The source wave field u and adjoint wave fieldû , which make up the wave equation and adjoint wave equation, can be expressed as: where S and S adj represent the source and adjoint source, respectively, andB represents the adjoint forward-modeling operator. The adjoint source S adj of the L2-norm-based misfit function can be expressed as: Gradient-based algorithms are optimization algorithms that are commonly used to update the velocity during FWI. They start by searching on the basis of reasonably accurate initial input, then update the model using gradient information at a current point. The velocity is updated using the following: where k refers to the number of iterations and H k represents the modification of the gradient term, e.g. a conjugate gradient or a Limited Broyden-Fletcher-Goldfarb-Shanno (L-BFGS) algorithm. k is the step length, which can be determined by a quadratic line-search method (Köhn 2011). As mentioned above, conventional time-domain FWI tends to get stuck in the local minima of the misfit function when there is predominantly high-frequency data. Multiscale inversion, proceeding from low to high frequencies sequentially, improves the chances of reaching a global minimum for the misfit function, because low-frequency data is more linear with respect to model perturbations. Using a Wiener filter, Boonyasiriwat et al. (2009) developed an efficient multi-scale FWI strategy in the time domain. The Wiener filter in the frequency domain can be obtained as follows: where S origional refers to the spectrum of the original true source wavelet and S target represents the spectrum of the low-frequency target wavelet at the angular frequency w, adding the small amount to prevent the denominator from becoming 0. The Wiener filter can transform original high-frequency data into more desirable low-frequency data under the condition of knowing the true source wavelet. The transformed data can be obtained from: where U(w) is the spectrum of the original seismic data. A Wiener filter with extended frequency selection (Boonyasiriwat et al. 2009) can be used in combination 481 with multi-scale FWI in the time domain to obtain higher computational efficiency and better convergence performance than is typically found when using conventional FWI. However, low-pass Wiener filters need information from the original source wavelet and this is usually unavailable in real cases. As a result, complicated source wavelet estimation has to be used to get the source wavelet (Zhou et al. 1997;Yu et al. 2014Yu et al. , 2016. In order to avoid this cumbersome and costly source wavelet estimation issue, we have modified the low-pass Wiener filter so as to be able to use a reference trace based on a convolution model. A reference trace chosen from the seismic data can be expressed as a product of the source wavelet and a reflection coefficient in the frequency domain: where S represents the source wavelet and R ref represents the reflectivity coefficient of the reference trace. Similarly the seismic data can be written as follows: where R represents the reflection coefficient of the seismic data in the frequency domain. A Wiener filter using the reference trace from the seismic data can then be defined as: Using the Wiener filter seismic data yields the following: In this way, the source wavelet term is offset in the filtered data. We can compute the Wiener filter for the calculated and observed data separately. Thus, the filtered data becomes: where the subscripts cal and obs represent the calculated and observed data, respectively. The misfit function can be defined as the L2-norm of the distance between the filtered observed and synthetic data, i.e.: This removes the effect of the sources from the misfit function. Using an adjoint state method (Plessix 2006), the corresponding adjoint source of the misfit function can be expressed as: where W calref represents the modified Wiener filter for the calculated data. In this paper, the second term of the adjoint source in (9) can be dropped for the following reasons: (i) If we choose a single trace as the reference trace, the second term is back-propagated only at the reference trace point. The influence of this on the gradient is comparatively small and plays the role of a singular wave field. (ii) If an average trace reference is chosen in a multi-source inversion experiment, the effect of the second adjoint source term is much weaker than that of the first term because the super-shot is summed across the receiver and is divided by the number of receivers. In addition, as shown by Choi & Alkhalifah (2011), the choice of reference trace can have a significant effect upon the inversion result. For independent shot inversion, we suggest choosing the nearest offset trace as the reference (Choi & Alkhalifah 2011). The choice of the reference trace for multisource experiments can be more complex. We will be looking at this separately in the next section.

Encoded multi-source waveform inversion
A randomly encoded multi-source inversion strategy is an effective way of improving the computational efficiency of FWI. The objective function of an encoded multi-source waveform inversion is: where e n is the sequence of the coding matrices, ⊗ denotes convolution in the time domain and N s represents the number of shots per super-gather. The polarity coding sequence e n satisfies or approximately satisfies the following condition: ee T = I. With an effective cross-talk noise suppression strategy, the inversion result of multi-source FWI is comparable to that of independent source FWI, while the computational efficiency is significantly improved. However, it should be noted that source estimation is more difficult for multisource FWI for the following reasons: (i) The common shot data are summed up by a random encoding function, making the source signature of early arrivals more obscure. (ii) The back-propagated source is different at each iteration step and cross-talk noise is present during the process of random summation. This complicates the inversion of the source wavelet. In general, a few additional separate source inversions are needed for the source estimation. This is time-consuming when used for a multi-source inversion scheme. Therefore, in this paper, we adopt a multi-scale source independent strategy with encoded multi-source FWI to avoid complicated and time-consuming source estimation and to enhance the accuracy of the result.

Example
A modified Marmousi model (see figure 1a), which is complex and rich in reflectors, was used to test the effectiveness of the proposed method. The target area was 4.6 × 1.5 km with 10 m cells and the velocity varied from 1.5 to 5.5 km s −1 . Forty-six shots were spaced out at 100 m intervals to avoid aliasing artifacts and receivers were placed at every grid point on the surface. A constant gradient velocity model, increasing from 1.5 to 4 km s −1 (which is lower than a true model on average), was used as the initial model for the inversion (figure 1b). The velocity in the water layer was 1500 m s −1 during the inversion. A staggered grid and explicit finitedifference method, with eighth-order accuracy in the space and second-order accuracy in the time domain, was adopted for the forward modeling and adjoint calculation. In order to verify the effects of the source wavelet on the inversion result, observed and modeled data using different source wavelets were generated by different source wavelets. The observed data was generated with a Gaussian wavelet and true model. A Ricker wavelet with a peak frequency of 10 Hz was chosen as the source wavelet for the modeled data. A comparison of the two source wavelets is shown in figure 2.
To assess some of the features of the proposed source independent FWI strategy, we took the source term of the Ricker wavelet and compared it to the adjoint source term produced by a conventional strategy. Figure 3a shows the observed common shot-gather for the true model using the Gaussian wavelet and figure 3b shows the modeled common shot-gather for the initial model using the Ricker wavelet. Figure 3c shows the adjoint source for the conventional FWI, which contains a strong direct wave residual even though the velocity for the water layer in the initial model was the same as it was in the real model. This is because the source signature difference led to a direct wave residual in the adjoint source. As the adjoint source term in the proposed method (figure 3d) does not contain direct waves, just reflected and refracted waves, the difference between the source wavelets in the modeled and observed data does not matter. The zero offset trace was chosen as the reference trace and the target wavelet was the Ricker wavelet with the peak frequency of 10 Hz, which is the same as the source wavelet term for the modeled data. There was also some noise in the adjoint source because of the effect of the Wiener filter. However, the energy of the noise was too small to affect the gradient term. A comparison of the adjoint sources shows that the proposed method can eliminate the source signature effect, despite the absence of any knowledge about the source.
The inversion result for a simulation using conventional FWI with the Ricker wavelet as the source term is shown in figure 4a. There were some strong inversion noises in the shallower region, resulting from the strong direct waves in the adjoint source (see figure 3c). For the deeper region, only a few shadows of the structure were added to the background model because the energy of the direct waves in the adjoint source was much stronger than that of the reflections. To verify that our method was independent of the source wavelet term, we applied the inversion strategy outlined above. Once again, the zero offset trace was chosen as the reference trace and the target wavelet was the Ricker wavelet with the peak frequency of 10 Hz, as with the part (<0.5 km in depth and >2 km in the horizontal) closely matched the true model. We realized that the nonlinearity of the misfit function for the high-frequency data and the inaccuracy of the initial model were the causes of the failure of the inversion. To improve the inversion result, we therefore employed a time domain multi-scale inversion strategy. A Ricker wavelet with a peak frequency ranging from 3 to 11 Hz in 2-Hz intervals was chosen as the target wavelet for different stages. An inverted model of the different frequency bands was obtained after five iterations and the inversion result of the current stage was used as the input for the following stage. The difference between the conventional method and the proposed method lies how the Wiener filter is calculated, with a source wavelet being used for the conventional method, while the proposed method uses a reference trace.
First of all, it was assumed that the original wavelet was a Ricker wavelet with a peak frequency of 10 Hz. The observed data was low-pass filtered using a conventional Wiener filter and the model was updated using the proposed time domain multi-scale FWI method. Figure 5 parts a and b show the inverted models obtained by replacing the target source wavelet with 5 and 11 Hz Ricker wavelets, respectively. The result after the second stage was dominated by inversion noise in the shallower region, which was the result of a strong direct wave residual in the adjoint source. For the deeper region, virtually nothing was updated. As a consequence, the inversion result after the last stage still produced an unsatisfactory velocity estimation. Although there were some updates in the middle to the deeper region, many layers were clearly dislocated and the resolution was very poor. However, when both the observed and modeled data were filtered by the new Wiener filter using the zero offset trace as the reference trace, the source wavelet effect was effectively eliminated, as shown in figure 3. We then used the multi-scale source independent inversion strategy to invert the velocity model. Figure 6 parts a  ) revealed even more detailed information. Thus, the inverted model managed to converge smoothly to the right answer. Calculating the low-pass Wiener filter with the reference trace successfully avoided inversion noise. By recovering process more wave-number components of the velocity structure the multi-scale inversion strategy is able to refine the inversion result and slowly reveal all of the data. To further improve the computational efficiency of the proposed method, we combined the multi-scale source independent FWI with a multi-source strategy. Separate shotgathers were summed into a super-shot-gather by a random encoding function. The source functions were also gathered into a super-shot function by using the same encoding function. Following the approach advocated by Krebs et al. (2009), the encoding function was established as a matrix containing −1 or +1 randomly. The encoding function was randomly varied across each iteration to suppress cross-talk noise. The inversion parameters were the same as they were for the earlier multi-scale inversion, while the number of iterations for each frequency band became 20.
The adjoint source for the proposed method was then compared with the conventional one. The target source wavelet of the Wiener filter was a Ricker wavelet with a peak frequency of 3 Hz. Figure 7a shows the adjoint source corresponding to the L2-norm based misfit function where it was assumed that the source wavelet was known. It is clear that the adjoint source contained no direct wave residuals. Figure 7b shows the adjoint source corresponding to the L2-norm based misfit function when it was assumed that the wavelet was a Ricker wavelet with a peak frequency of 10 Hz. As the original source wavelet used by the Wiener filter was different from the real source term (which was a Gaussian wavelet), the adjoint source contained strong direct wave residuals. However, when a reference trace was used to calculate the Wiener filter, the obtained adjoint source (figure 7c) was similar to that produced by using the real source term (figure 7a). The adjoint source term for the proposed method did not contain direct waves, only reflected and refracted waves. The reference trace was chosen to be the weighted average of the traces corresponding to the source points, hereafter termed the weighted average of the zero offset traces. The weighting function was chosen to be the encoding function. Some noise still remained in the adjoint source after use of the Wiener filter, but its energy was relatively small so it could be ignored. The frequency component of the adjoint source calculated by equation (15) was also lower than it was in figure 7a.
The multi-source multi-scale FWI strategy was also used with the exact source term being set as the reference for comparison. The inversion result for this is shown in figure 8. Overall, the inversion result recovered most of the structure of the velocity model successfully, which is comparable to the inversion result obtained when using conventional independent source FWI. However, we found the inversion result was noisier than it was for the separate source FWI, with a poorer resolution. This was largely a result of the cross-talk between the different sources and an insufficient number of iterations. By using a more sophisticated encoding function or some kind of preconditioning such as smoothing, it would be possible to eliminate this noise effectively. As we are mainly concerned with the influence of the source term on the inversion result, a discussion of cross-talk noise suppression is outside the scope of this paper. After this, conventional multi-source multi-scale FWI was used to update the model. The original source used for the Wiener filter was a Ricker wavelet with a peak frequency of 10 Hz. The inversion result is shown in figure 9. Here, we can observe that a strong direct wave residual in the adjoint source resulted in inversion noise in the shallower region. There was also insufficient updating from the middle to the deeper region. By contrast, figure 10 shows an updated velocity model using the multi-source multi-scale source-independent FWI proposed in this paper. Here, instead of using the source wavelet, a reference trace was used to calculate the Wiener filter, thus avoiding the difficulty of source signature estimation. The reference trace was chosen to be the weighted average of the zero offset traces. Without the presence of a strong direct wave residual in the adjoint source term, the three main faults were largely resolved and the reflectors were closer to their real position. The middle to deeper region in the inversion result also appears to be smoother than it was for the conventional multi-source inversion using the exact source (figure 8). Overall the inversion result for the multi-source multi-scale FWI was significantly better than the source independent inversion strategy. Figure 11 shows the data errors for the different encoded multi-source FWI methods using the different source wavelets. It can be seen that the convergence rate for the multi-source encoded FWI using the true source wavelet was much faster than the others in the earlier stages, while the error curve begins to fluctuate for the later iterations. To synthesize the different super-shot-gathers, the encoding function was different for each iteration, which had a significant impact on the convergence curves. The error curve for the FWI using the wrong source wavelet did not converge at all. Although, the convergence speed for the proposed multisource multi-scale source independent FWI was slower than the method using true source wavelets, it finally converged to 20% of the initial error, illustrating its stability and effectiveness. As the initial error in the proposed source independent FWI was quite large, the final absolute error was larger than the FWI using a true source wavelet. Table 1 shows a comparison of the calculation time for various inversion methods (the CPU parameters were: an Intel(R) Xeon(R) CPU E5-2650 v.2 at 2.60 GHz; a double CPU with 32 threads; 64 Gb of RAM). The multi-source FWI displayed notably better computational efficiency than a traditional multi-scale FWI, with it also using fewer threads and less memory. The computational time for the source independent multi-source FWI was largely comparable with the multi-source FWI using a known wavelet.

Discussion
In the previous section, we illustrated the accuracy and efficiency of the proposed method when applied to synthetic data generated by a modified Marmousi model. However, we realize that there are many factors that can affect the quality of the final reconstructed model. As with conventional source independent inversion strategies, the choice of the reference trace can significantly influence the inversion result (Choi & Alkhalifah 2011). In this section, we will examine the influence of the choice of reference trace on the inversion result and the robustness of the proposed inversion strategy in relation to noise.
Apart from the weighted average of the zero offset traces, there are several other alternatives that might be used for reference. These include an average trace of the super-gather 487 Figure 11. The data error curves of the different encoded multi-source FWIs, using different source wavelets.  across the receivers or a single trace. Figure 12 parts a and b show the inversion results using a single trace corresponding to the first shot position and the average trace of the super-shot as references, respectively. Both of these inversion results are inferior to the result when using the weighted average of the zero offset traces as a reference. To further study the potential reasons for this, we compared the dif-ferent reference traces, the results of which can be seen in figure 13. Figure 13a shows the use of a single trace at the first source point. Note how there were fluctuations after the direct wave, resulting from the random summation of the super-shots. The reflectivity of the reference trace, which is used in the denominator of our proposed Wiener filter, is very complicated, which risks making the Wiener filter unstable. The average trace of the whole super-gather across receivers (figure 13b) is even more complicated and reveals little information about the source signature. The waveform during averaging can be canceled out, because all traces are summed up by random polarizations determined by the encoding function. However, because the weighting function can correct the polarization of the traces corresponding to the shot position, the weighted average of the zero offset traces (figure 13c) not only shows the character of the direct wave but also suppresses later noise. The choice of frequency has a large impact on the multiscale inversion effect. We used 6 Hz in the numerical experiments as the initial peak frequency. The inversion results for this are shown in figure 15. The independent source method had good applicability and was able to inverse the general structure, but a local extremum problem began to appear because the background field had an inaccurate inversion. The method proposed in this paper can eliminate the influence of the source wavelet, but it cannot inverse the background field effectively if there is high peak frequency.
In real cases, a zero offset trace of the different sources is not always available because the scope for acquiring it is limited. We therefore also investigated the inversion result when the reference trace was the weighted average of common nonzero offset traces. Figure 14 shows the inversion results when using the weighted average of the 50 m offset traces as a reference. Note that the inversion result is as accurate as it was when using the weighted average of the zero offset traces as a reference. Figure 13d shows the weighted average of the 50 m offset traces. Although the timing of the first break and the amplitude were different, the reference trace still represented the source signature accurately, making the inversion successful. Field data is usually contaminated by random noise. This is one of the main problems encountered during preprocessing. In order to examine the extent to which the multisource multi-scale source-independent inversion method was robust in the face of noise, some random Gaussian noise was added to the shot-gather. A super-shot-gather was then composed, as shown in figure 16a. Note that the reflections were buried by the strong random noise. The inversion parameters were set to the same when using the weighted average of the zero offset traces as a reference. The adjoint source for when the target source wavelet was a 3 Hz Ricker wavelet is shown in figure 16b. In this case, although strongly polluted by random noise, the adjoint source contained efficient reflection information and contained no direct wave residuals. The inversion result for the proposed method using noisy data is shown in figure 17. Although the resolution was decreased, the inversion result recovered the main structure accurately. This indicates that the inversion strategy proposed in this paper is robust to noise.

Conclusion
In order to avoid complex source signature estimation and improve inversion stability, we developed a multi-scale source-independent inversion method by modifying the Wiener filter and by using a reference trace. Here, the source wavelets for observed and modeled wave-fields are both eliminated in the misfit function. When combined with a multi-source inversion strategy, the proposed method was found to effectively decrease computational burden. Numerical tests based on a synthetic model that used data generated by a modified Marmousi model verified that the proposed method is capable of recovering the velocity structure without source wavelet information. When analyzing the influence of the reference trace on the inversion result, we realized that the weighted average of the common offset traces is a better choice for the reference trace than a single or average trace. We have also found that the proposed method is able to resist noise. As we mainly focused on the influence of the accuracy of the source signature on the inversion result, a relatively simple encoding strategy was adopted, leading to some cross-talk noise in the inversion result. To enhance the quality of the inversion result, a more complicated encoding strategy would need to be adopted.