Matched Field Processing accounting for complex Earth structure: method and review

5 Matched Field Processing (MFP) is a technique to locate the source of a recorded wave- 6 ﬁeld. It is the generalization of plane-wave beamforming, allowing for curved wavefronts. 7 In the standard approach to MFP, simple analytical Green’s functions are used as synthetic 8 waveﬁelds that the recorded waveﬁelds are matched against. We introduce an advancement 9 of MFP by utilizing Green’s functions computed numerically for Earth structure as synthetic 10 waveﬁelds. This allows in principle to incorporate the full complexity of elastic wave propaga- 11 tion without further manual considerations, and through that provide more precise estimates 12 of the recorded waveﬁeld’s origin. We call this approach numerical MFP (nMFP). To demon- 13 strate the applicability and potential of nMFP, we present two real data examples, one for an 14 earthquake in Southern California, and one for secondary microseism activity in the Northeast- 15 ern Atlantic and Mediterranean Sea. In addition, we explore and clarify connections between 16 localisation approaches for the ambient seismic ﬁeld, real world limitations, and identify key ar- 17 eas for future developments. To increase the adoption of MFP in the seismological community, 18 tutorial code is provided. 19


22
The ambient seismic field has become an attractive target of seismological studies over the last two between the oceans and the solid Earth, so-called microseisms. Understanding the exact mechanism 28 for this interaction has been a challenge for more than half a century (Longuet-Higgings, 1950;29 Hasselmann, 1963; Ardhuin et al., 2015) and some open questions remain, e.g., about the emergence  An inhomogeneous distribution of sources results in asymmetric cross-correlation functions, indicated 88 4 by the thickness of the wave fronts in Fig. 1c (Paul et al., 2005). In practise, this asymmetry is 89 usually quantified by comparing the causal and acausal part of each correlation function. In the 90 interferometry-based approach, synthetic cross-correlation functions are computed for a given source 91 distribution and compared against cross-correlation functions from real data. The mismatch between 92 the two is evaluated (e.g., by quantifying amplitude asymmetry), the source model perturbed, and a 93 best-fit source distribution is found via gradient descent in an iterative manner (Ermert et al., 2016). 94 Recent work has focused on improving efficiency (Igel et al., 2021b), the mismatch measure (Sager 95 et al., 2018), or expanding the method to multiple frequencies (Ermert et al., 2021). The advantages 96 of this approach are the stability of results, not as strict requirements on station geometry, and a 97 comprehensive theoretical foundation. Its disadvantages lie in computational cost, treatment of 98 recorded data and related introduction of assumptions, and loss of temporal resolution. 99 Another approach that has gained some popularity in seismology in recent years is Matched 100 Field Processing (MFP). MFP is the generalisation of beamforming to allow arbitrary wavefronts 101 (Baggeroer et al., 1993). This approach has been developed in ocean acoustics, where coherency  In this paper, we introduce an advancement of MFP to incorporate Earth structure and account 116 for the complexity of seismic wave propagation. In the following, we introduce the standard MFP 117 approach, demonstrate its shortcomings, and present our solution by incorporating more realistic 118 Green's functions. We discuss implications of our approach, strategies to cope with them, how 119 different disciplines and localisation approaches intersect, and finally demonstrate the applicability 120 5 of our approach on two real data examples. In line with the informative nature of this paper, 121 we provide broader context and discuss ideas as they become relevant instead of deferring such 122 considerations to a separate discussion section. The MFP algorithm is straight-forward: For a given potential source location, a synthetic wavefield 125 is computed and matched against the recorded wavefield, i.e., the seismograms, taking coherency 126 of the wavefields across stations into account. This match is evaluated and compared against other 127 potential source locations. The potential source location with the highest score or beampower 128 (representing the best-matching synthetic wavefield) is the resolved source location.

129
More precisely, spectra d(ω, x j ) are computed from the recorded seismograms at each receiver 130 position x j . The cross-spectral density matrix is computed as with * denoting the complex conjugate. K jk (ω) holds all information about the recorded wavefield 132 and encodes its coherency across stations; it contains the cross-correlations of the seismograms from 133 all station pairs.

134
The synthetic wavefield, i.e., the seismograms expected at each station from the candidate 135 source, is represented through synthetic spectra s(ω, x j , x s ), with x s the source position and x j the 136 receiver position. In principle, these could be estimated in the time domain, but MFP computations 137 are done in frequency domain for simplicity and computational speed. More on how these are 138 computed in practise in section 2.1 and onwards.

139
The match of the two wavefields represented through K jk (ω) and s(ω, x j , x s ) is then estimated 140 through a so-called beamformer or processor. The most straight-forward beamformer is the con-141 ventional beamformer, which in its most compact form in vector notation is often written as (e.g., with B the beampower score for a potential source location. In literature, this beamformer is 144 sometimes called Bartlett processor, although the origin of this name is unclear (e.g., Gal & Reading,145 2019), linear beamformer (e.g., Baggeroer et al., 1993), or frequency-domain beamformer (DeMuth, 146 6 1977). We express B more explicitly, for clarity as We exclude auto-correlations (k = j), as in Bucker (1976), because they carry "noise", i.e., the 148 incoherent parts of the wavefield (Soares & Jesus, 2003) and provide no useful additional information.

149
Auto-correlations scale the retrieved beampowers by recorded energy, which is not necessarily caused 150 by a higher signal-to-noise ratio (SNR). Here, signal refers to those parts of the ambient seismic 151 field that are (often weakly) coherent across stations.

154
Beamformers are often classified into conventional (eq. 3), adaptive (e.g., Capon, 1969;Cox et al., 155 1987; Cox, 2000) and sub-space beamformers (e.g., Schmidt, 1986). Adaptive beamformers aim to 156 increase resolution of the beampower distribution by increasing sensitivity to signal, but inherently 157 rely on high SNR. The increased resolution is also accompanied by increased computational cost, 158 e.g., the Capon beamformer involves computing the inverse of K jk (ω) (Capon, 1969). Sub-space 159 detectors such as MUSIC (Schmidt, 1986) are used, with t( x j , x s ) the travel time of the investigated wave between source and receiver ( Fig.   175 1d). In some seismological studies, the addition of an amplitude term A( x j , x s ) that accounts for   With Green's functions computed numerically for the same Earth structure, the location of the 234 source is resolved precisely for both broad-and narrowband sources (Fig. 2b, d). This is unsurprising, 235 given that we are essentially matching the synthetic wavefield against itself with some noise. But 236 this is also exactly the intent behind the approach: matching the recorded wavefield with a more localisation technique (Bucker, 1976) and has been adopted for broadband sources thereafter (e.g.,  appear reasonable to correct for them: correcting for amplitude decay (Fig. 3b), time-domain 306 normalisation (Fig. 3c), and spectral whitening (Fig. 3d). Without any treatment of amplitudes, 307 the beampower distribution is heavily biased by distance to stations (Fig. 3a). This effect is more  With this approach, we resolve the beampower peak close to the true source (Fig. 3c), but introduce This approach successfully retrieves the correct source location and does not appear to introduce 345 any unwanted biases (Fig. 3d).

346
From our tests, whitening the spectra of the synthetic wavefields (Fig. 3d)  to express the emergence of sidelobes for narrowband sources (Fig. 2c,d).  Understanding beamforming as convolution leads to another way of conceptualising MFP. We 381 rewrite the beampower score (eq. 3), omitting the variables (ω, x j , x s ) for readability, as Here, d * k d j is the correlation of the recorded wavefields and s * j s k the correlation of the synthetic This description and the one we introduce above result in the same realisation. Fundamentally, 400 only two approaches for locating sources of the ambient seismic field exist: polarisation analysis 401 (Fig. 1a) and approaches that exploit exactly wavefield-coherency across stations (Fig. 1b-d).

402
That beamforming, MFP, and interferometry-based localisation are essentially the same may not 403 be intuitive at first, especially considering the strikingly different sketches to illustrate them ( Fig.   404 1b-d), and the different language both communities use.

405
To retrieve "reliable" cross-correlation functions of the recorded data in ambient noise seismol- Above, we have already explored the advantages and limitations of using numerically computed 421 synthetic wavefields (Fig. 2) and amplitudes (Fig. 3) in MFP, as well as the emergence of striped 422 interference patterns for narrowband sources (Fig. 2). MFP shows further undesired behaviour 423 under certain conditions that we encounter in real-world applications. Some of these are more 424 straight-forward to understand in the conceptual framework of convolution introduced above.  Station density has direct impact on the retrieved beampower distribution that is worth pointing 444 out explicitly. In a synthetic test, we place additional stations on the right side (Fig. 4a). The from understanding MFP as the sum over convolutions of correlated wavefields, as described above.

450
Regions with higher station density are then inherently weighted higher and cause the observed 451 effect.

452
This goes beyond increased resolution due to better suppression of incoherent noise, and is an 453 effect that essentially all real-world applications of MFP will have to take into consideration. Single sources can cause prominent interference patterns, if they are narrowband (Fig. 2c,d), which 462 depend on station geometry and frequency band. This leads to even more complex, secondary 463 interference when multiple sources are active at the same time. In a synthetic test, we place two 464 narrowband sources that excite identical wavefields simultaneously (Fig. 4b). The second source is 465 placed such that it lies at the edge of a sidelobe of the first source (Fig. 2d). From the retrieved 466 distribution of beampowers it is not at all obvious that two and only two sources are active here, 467 and instead this may be misidentified as a single source close to the left array (Fig. 4b). The   identify and address in practice, especially if only a limited frequency band is available (Fig. 4b).

520
This aspect is also not considered in the array response for plane-wave beamforming. It is important  (Fig. 5). When applying the standard MFP approach, with an assumed 551 velocity v = 3.2 km/s (the best fit in the synthetic test in Fig. 2), we find a relatively good 552 location of the earthquake with 7.7 km distance to the location in the CI catalog (Fig. 5a, SCEDC, mechanism, we at first find a decrease in location accuracy (Fig. 5b). The retrieved location is 557 18.3 km away from the CI location. When we incorporate the moment tensor solution from the 558 CI catalog (SCEDC, 2013), straight-forward to do with nMFP, we find an improvement in location 559 accuracy with a distance of only 1.9 km to the CI location (Fig. 5c). This demonstrates one of the our approach may be more systematic and help give improved estimates of the source mechanism.

568
In the case of strong earthquakes, such as this example, the usefulness of MFP is limited. Other 569 approaches that rely on data abstraction are routinely applied and provide more precise results that  For the first snapshot (Fig. 6a,c), the results using the standard approach correlate well with 583 significant wave heights regardless of chosen velocity, and seismic sources are located West of the 584 British Isles. In the second example, however, we find considerable differences in the beampower 585 distribution depending on chosen velocity (Fig. 6d). The increased ocean activity to the North and 586 West of the Iberian Peninsula matches best with significant wave heights for velocities v = 3.0 or 587 v = 3.2 km/s (Fig. 6d,f). With v = 2.8 km/s an entirely different region, to the West of France 588 and South of the British Isles, is located as the dominant source (Fig. 6d). Similarly for the third 589 snapshot, we find a clear region of high beampowers in the Mediterranean Sea, West of Corsica, 590 that corresponds to significant wave heights only for v = 3.2 km/s (Fig. 6g,i).

591
This suggests that v = 3.2 km/s is a resonable choice of seismic wave velocity for the analysed 592 frequency band, reaffirming our synthetic analysis (Fig. 2) and our choice in the earthquake example 593 above (Fig. 5). We claim that seismic sources of the secondary microseism should roughly co-locate 594 with significant wave heights as an argument for the validity of this choice. This is resonable, because 595 the common explanation for the secondary microseism mechanism is that ocean gravity waves at the ble sources are made to simplify the problem, e.g., no distributed simultaneously acting sources.

605
Deviation from such assumptions can have significant impact on the retrieved beampowers and 606 complicate the decision (Fig. 4b). Our example for the secondary microseism demonstrates the 607 complexity of beampower distributions one encounters in a real world application that would need 608 to be interpreted when making a choice of v (Fig. 6, left). Without relying on the assumption of ourselves from assumptions about the study target. Similar considerations apply to the earthquake 627 example above. Importantly, this also means that nMFP likely performs worse when the real velocity 628 structure in the study area deviates significantly from the Earth model used, an effect that is more 629 pronounced for higher frequencies. Currently, we rely on an axisymmetric PREM model, which 630 is a severe limitation. In future works, heterogeneous 3D models of Earth structure should be 631 incorporated in the computation of Green's function databases utilised in nMFP.

632
The similarity between the standard approach (with v = 3.2 km/s) and nMFP is generally high to all stations, as e.g., for the Chino Hills earthquake (Fig. 5), improving the accuracy of the interpreting the observed patterns, as we have also demonstrated in synthetic tests (Fig. 4b).
If 652 nMFP will prove to be more precise also for microseisms, we may find that seismic waves are excited 653 in specific regions in the oceans and not distributed homogeneously beneath storm systems. It is 654 important to note here that for now we use an explosion source mechanism for the synthetic wave-655 fields to locate the microseism, which we have already shown to be inadequate for an earthquake 656 (Fig. 5). In the future, we require a strategy to describe and incorporate a source mechanisms ap- Matched Field Processing (MFP) is generalized beamforming for arbitrary wavefields, removing the 664 need for the plane-wave assumption. It is one of the current approaches to locating sources of 665 ambient seismic noise (Fig. 1). In this study, we advance MFP to better incorporate elastic wave 666 propagation in the Earth by using Green's functions numerically computed for a model of Earth 667 23 structure directly as the synthetic wavefield that the data is matched against. We call this approach 668 numerical MFP (nMFP).

669
When amplitudes are considered in MFP, results are biased by amplitude effects such as geo-670 metrical spreading and anelastic attenuation. In the standard approach, this is usually neglected 671 through spectral whitening of the synthetic wavefield. We find that this strategy performs best for 672 us as well, and that trying to correct for spreading and attenuation via an amplitude term, as has 673 been suggested before, may not be advisable (Fig. 3). This is especially the case for nMFP, where 674 multiple wave types can be considered simultaneously.

675
Two examples on real data showcase the potential of nMFP (Figs. 5, 6). In principle, we can 676 use it to search for the source mechanism of a seismic source, as suggested by the improved source 677 location after incoporating the earthquake's moment tensor (Fig. 5). This could be particularly useful 678 in the context of tremor activity, where source mechanism determination is challenging with classical 679 approaches. In a second example, we locate sources of the secondary microseism in the Northern

680
Atlantic and Mediterrenean Sea (Fig. 6). Results from nMFP match the standard approach's results 681 closely, likely due to source geometry, narrow frequency band, and our reliance on Green's functions 682 computed for an axisymmetric Earth. nMFP retains the advantage that is not biased by author 683 choice of a medium velocity, and potentially provides higher resolution. 684 We clarify conceptual approaches to MFP and its close connection to the interferometry-based 685 localisation. The striking similarity between them suggests that it may be a worthwhile endeavour 686 to unify them in the future, or at least provide a framework to let the different communities benefit 687 from each others' work. On a conceptual level, Beamforming, MFP, and the interferometry-based 688 localisation strategy all rely on quantifiyng the mismatch of correlation wavefields. MFP in particular 689 would benefit tremendously from a universally applicable approach for quantifying its resolution. The 690 lack of such a measure is currently its major disadvantage. Seismic data used in this study was provided by network operators of international, national, and The retrieved source location is sensitive to the chosen velocity. b) Our approach, with numerical Green's functions as synthetic wavefields (nMFP). Source is precisely located. Right: narrowband source (0.13-0.15Hz). c) Standard approach. Emergence of sidelobes due to interference. d) nMFP in the same narrow frequency band and the source is precisely located.  Accounting for the source mechanism of the earthquake improves the resolved location, performing better than standard MFP (1.9 km distance to the CI location).