Point source moment tensor inversion through a Bayesian hierarchical model

S U M M A R Y Characterization of seismic sources is an important aspect of seismology. Parameter uncertainties in such inversions are essential for estimating solution robustness, but are rarely available. We have developed a non-linear moment tensor inversion method in a probabilistic Bayesian framework that also accounts for noise in the data. The method is designed for point source inversion using waveform data of moderate-size earthquakes and explosions at regional distances. This probabilistic approach results in an ensemble of models, whose density is proportional to parameter probability distribution and quantifies parameter uncertainties. Furthermore, we invert for noise in the data, allowing it to determine the model complexity. We implement an empirical noise covariance matrix that accounts for interdependence of observational errors present in waveform data. After we demonstrate the feasibility of the approach on synthetic data, we apply it to a Long Valley Caldera, CA, earthquake with a well-documented anomalous (non-double-couple) radiation from previous studies. We confirm a statistically significant isotropic component in the source without a trade-off with the compensated linear vector dipoles component.


I N T RO D U C T I O N
Seismic moment tensors (MTs) are a basic tool to make inferences about earthquake sources.They can be used to map the fault structure at depth and infer the stress pattern.For these studies, as well as characterising the seismic source itself, uncertainties of the MT components are important.A number of agencies calculate MTs of global or regional earthquakes, but parameter uncertainties are rarely computed, even in more detailed studies.The MT is usually decomposed into a double-couple (DC, equivalent to shear faulting), compensated-linear-vector-dipole (CLVD) and isotropic (ISO) components.The uncertainties are particularly important when examining earthquakes with significant non-DC components because the amount of DC, CLVD and ISO components can significantly vary with small perturbations of parameters (Zahradník et al. 2008b).Recently, techniques based on probabilistic approaches are emerging (e.g.Šílený 1998;Wéber 2006;Ford et al. 2009;Duputel et al. 2012;Křížová et al. 2013;Stähler & Sigloch 2014) and frequently, they use an ensemble of solutions to estimate the uncertainties.
Departures from the pure DC model were initially assigned to data noise because they are found for nearly all earthquakes.However, with an ever-increasing number of seismometers and advances in data quality, non-DC components were better resolved and observed mostly in volcanic and geothermal areas, mines and deep subduction zones.The first earthquakes with well-constrained non-DC mechanism were found in Icelandic volcanic complexes (e.g.Klein et al. 1977;Foulger & Long 1984), Long Valley caldera, California (e.g.Barker & Langston 1983;Julian 1983;Ekström & Dziewonski 1985) and a number of volcanoes in Japan (e.g.Ukawa & Ohtake 1987;Shimizu et al. 1987Shimizu et al. , 1988) ) and their mechanisms are still intriguing seismologists (e.g.Nettles & Ekström 1998;Tkalčić et al. 2009).The high non-DC components can be a result of geometrically complex shear faulting, but are also attributed to tensile faulting (opening and closing of tensile cracks), heterogeneities or anisotropy in the medium and polymorphic phase transformations (Frohlich 1994;Julian et al. 1998).Data from such tectonic environments are often noisy, thus, it is crucial to account for the noise in the inversion.There have been methods proposed to simultaneously address the effects of noise, the source mislocation and the uncertainty in the velocity structure (e.g.Šílený & Panza 1991;Guidarelli & Panza 2006) using the non-linear modelling and turning parameter variances into confidence regions.Studies using the seismic potency tensor also focused on the non-DC components (Ross et al. 2015).
The source location is mostly determined by first motion data and gives the hypocentre, which does not have to coincide with the centroid.The MT solutions depend on centroid location, especially its depth (e.g.Dreger & Helmberger 1993).Zahradník et al. (2008a) proposed checking multiple locations with a grid-search method to stabilise the MT solution.Thus, it is beneficial to include the centroid as a parameter in the inversion.In traditional methods parameter uncertainties can be calculated in linearised inversions (e.g.Riedesel & Jordan 1989;Vasco 1990;Zahradnik & Custodio 2012), but inverting for the centroid location introduces non-linearity and parameter uncertainties cannot be estimated in a straightforward manner.
In order to estimate solution uncertainties and explore the model space more thoroughly, we solve the inverse problem in the probabilistic Bayesian framework (Box & Tiao 1973;Tarantola & Valette 1982;Christensen et al. 2011).Various flavours of Bayesian inversion were already successfully implemented for some geoacoustical and geophysical problems (Dettmer et al. 2007).Recently, Bodin et al. (2012) employed a hierarchical transdimensional Bayesian inversion in a tomographic study, jointly inverting receiver functions and surface wave dispersion data.They allowed the number of free parameters to vary throughout the inversion and inverted for noise parameters in different data sets.A similar technique was applied in a tomographic study of the lowermost mantle (Young et al. 2013).Most recently, Pachhai et al. (2014) utilised the rigorous treatment of data noise in a study of heterogeneities on top of the core-mantle boundary.For a more detailed description of Bayesian transdimensional method, including the hierarchical aspect, see a review by Sambridge et al. (2013).
Bayes' approach was first applied in seismic studies of earthquake sources by Ide et al. (1996) and Yagi & Fukahata (2008, 2011), who used Akaike Bayesian information criterion to determine optimal parameter values.In traditional inversions, the source parameters are sought on different spatial scales and using different datatypes.Similarly, the Bayesian inversion has been utilised to invert wave polarities (Walsh et al. 2009) and seismic waveforms of local (Wéber 2006), regional (Lee et al. 2011) and teleseismic (Duputel et al. 2012;Stähler & Sigloch 2014) events.Bayesian inversion was also applied by De ¸bski (2008) to compute the source time function using empirical Green's functions and not inverting for the mechanism.It has also been used to infer the finite fault models that employ seismic and geodetic data (Monelli et al. 2009;Minson et al. 2014), model the finiteness of a caldera (Fichtner & Tkalčić 2010) and investigate the uncertainties related to the source time function and variability in the Earth structure (Razafindrakoto & Mai 2014).
In this paper, we implement a hierarchical Bayesian inversion method for the seismic MT to study moderate-size earthquakes and explosions at regional distances.A hierarchical aspect of the inversion means that the noise is treated as a free parameter (a.k.a.hyperparameter) in the inversion.We let the data noise determine the level of fit (and, consequently, the model complexity) by making it a free parameter in the inversion, together with the MT parameters and the centroid location.This probabilistic approach gives an estimate of parameter uncertainties together with their average values.To avoid the trade-off between location and MT parameters, we use two Markov chains to sample the parameter space, one for the centroid location and another that samples noise and the MT parameters at each location.We examine the performance of the developed algorithm in synthetic experiments, with real station locations and noise added to synthetic seismograms.Finally, we analyse data from an earthquake in Long Valley caldera, California and compare the algorithm performance with previously published results.

H I E R A RC H I C A L B AY E S I A N I N V E R S I O N F O R T H E C E N T RO I D M O M E N T T E N S O R
In a Bayesian framework, the model parameters are treated as random variables, thus, the sampling yields an ensemble of models instead of only the best-fit solution.The ensemble can be used to estimate parameter uncertainties and gain more information about the model, in our case the seismic source.The Bayes' theorem gives p(m|d), the posterior distribution of model parameters m given the observed data d, based on the prior distribution p(m) and the likelihood function p(d|m), namely p(m|d) = c p(d|m) p(m). (1) The constant c normalises the posterior distribution so its integral over the model space equals to unity.
We have created an algorithm for a non-linear inversion of the seismic MT, where the model space consists of three parameters for the location (longitude, latitude and depth), six parameters for the MT and one hyperparameter for the noise.If we consider the noise to be the part of the data that we do not wish to explain (Scales & Snieder 1998), the theory errors are included in the noise together with the measurement errors.With the level of noise estimated, the rest of the information in the observed data is a meaningful signal.Thus, data noise determines the model complexity and the regularisation (smoothing and damping) is avoided.
The prior probabilities for all parameters are set to be noninformative, that is, a uniform prior is defined over a broad, physically reasonable range of values.The prior for location parameters is defined for a discrete set of values around a previously determined location with 1 km spacing in depth and 0.025 • (∼2.5 km) spacing for the epicentral coordinates.MT parameters have a uniform prior over a large set of values based on an existing value for the scalar moment M 0 (from −1.5 * M 0 to 1.5 * M 0 ) and the noise is defined as a per cent of data root mean square (rms) and can have values up to 5*rms.We express the noise in terms of the rms value because it better represents the whole signal than the maximum amplitude does (e.g. it would be lower for a signal consisting of a simple waveform and many zeros than for the same simple waveform surrounded by noise or a complicated and longer waveform with the same maximum amplitude).

Forward modelling
In the point source approximation, the displacement on Earth's surface is expressed as a convolution of spatial derivatives of the elastic Green's functions and the seismic MT.Displacement u i in the direction i can be expressed as where j and k are the remaining two Cartesian coordinates and * denotes the temporal convolution.To facilitate understanding of the results, the MT is usually decomposed into the DC, CLVD and ISO components (Jost & Herrmann 1989).Here, however, we follow the approach of Kikuchi & Kanamori (1991), with slight modifications implemented in the frequency-wavenumber code AXITRA used to create the Green's functions (Bouchon 1981; Cotton & Coutant 1997) and parametrize the MT using the following six tensors The first five tensors are DCs and determine the deviatoric part of the solution, while the sixth tensor determines the ISO part.Their focal mechanisms are shown in Fig. 1.The full MT is simply a linear combination of the six elementary tensors where a n are coefficients in the expansion.
It is reasonable to assume that all elementary tensors M n have the same time dependence, which can be convolved with the Green's functions derivatives to yield six elementary seismograms E n : Thus, we compute the synthetic seismograms as a linear combination of the six elementary seismograms with the same coefficients a n as in eq. ( 4): To increase the efficiency of the algorithm, we pre-compute the elementary seismograms for a given set of locations so they do not need to be calculated throughout the Monte Carlo search.

The likelihood and data noise covariance matrix
All information from the data is included in Bayesian inversion through the likelihood function, which makes it a key factor.It quantifies how well a particular set of parameters composing a given model m reproduces the observed data d.We define the misfit function using the data noise covariance matrix where G(m) are the modelled (synthetic) seismic waveforms.The likelihood function is based on a Gaussian distribution where |C e | is the determinant of the covariance matrix and N is the number of points.
We consider different parametrizations of the covariance matrix.In the simplest one, we do not take into account the noise correlation and the matrix is diagonal (proportional to the identity matrix) where σ 2 is the noise variance.In this case, the likelihood can be simplified to However, this formulation assumes uncorrelated noise, but noise in seismic waveforms is considered correlated due to its bandlimited nature and the inelastic attenuation of the Earth (Yagi & Fukahata 2008).Assuming uncorrelated noise can underestimate parameter uncertainties and alter their values.

Estimating the covariance matrix
In previous Bayesian studies, the covariance matrix has been estimated in a few different ways.Some studies assume an exponentially decaying or a Gaussian covariance matrix (e.g.Bodin et al. 2012;Duputel et al. 2012), others compute the matrix from data residuals (e.g.Dettmer et al. 2007) or synthetically generated noise seismograms (e.g.Gouveia & Scales 1998;Sambridge 1999;Piana Agostinetti & Malinverno 2010).We choose a combination of the first and last approach and empirically estimate the covariance matrix from a number of realisations of pre-event (ambient) noise recorded on the stations used in the inversion, similar to Kolb & Lekić (2014), and invert only for the noise variance.As the noise series used to compute the covariance matrix might be influenced by seasonal variations, we approximate it by an analytical expression and subsequently parametrize it in three ways: using an exponentially decaying function, a Gaussian function and a combination of two attenuated cosine functions.This approach avoids repeating the inversion, which is necessary if the matrix is directly computed from data residuals.We do not use the empirical data noise covariance matrix directly because it is not the same on all seismograms.Using different matrices for all waveforms would be difficult to implement.Parametrization makes it more general and less influenced by seasonal variations of noise.
The noise used to compute the covariance matrix is pre-processed in the same way as the data used in the inversion (bandpass filtered between 20 and 50 s).We approximate the empirical matrix with an exponentially decaying function and a Gaussian function because they are commonly used in inversion studies.However, we do not use a Gaussian covariance matrix in the inversion because of computational difficulties in calculating its inverse.Ababou et al. (1994) found it to be 'worst' conditioned compared to a number of other matrices.A matrix with exponentially decaying elements has the form The noise variance is again denoted by σ 2 , |i − j| is the time difference between samples i and j and r e is the decay factor.When we examine the average autocorrelation functions (i.e. the cross-diagonal terms of the covariance matrices) of the noise recorded on stations around the globe (Fig. 2), their side lobes are prominent.They are quasi-periodic, with oscillations decaying with the time lag.One attenuated cosine function could not adequately capture the complexity of the autocorrelations, therefore we parametrize them using two attenuated cosine functions where b denotes the amplitude of the first cosine function.Since σ 2 is the measure of noise level, we want the term inside the square brackets to have a maximum of 1, which makes the amplitude of the second cosine function equal to 1 − b.The periods of the two cosine functions are denoted by L 1 and L 2 and r e1 and r e2 are their exponential decay factors.This parametrization can adequately approximate the side lobes of the autocorrelations (e.g.Figs 3b and  c show synthetic noise autocorrelations and both the exponential and two cosine function fit).

Sampling of the parameter space
To estimate the posterior distribution, we use the Markov chain Monte Carlo (MCMC) method.The MCMC is a random walk through the parameter space guided by the values of likelihood and based on the Metropolis-Hastings algorithm (Metropolis et al. 1953;Hastings 1970).It results in an ensemble of models whose density reflects the posterior distribution.The first point in the chain (the first model) is randomly chosen, from which the algorithm proceeds by creating a new model m' as a perturbation of the last (m), based on a Gaussian probability distribution q(m |m), with a variance θ 2 i for a particular model parameter.The new model is accepted with a probability α, which depends on the ratio of posteriors of the new model and the previous one and the ratio of proposal distributions.Technically speaking, a uniform random number r with values between 0 and 1 is generated and the acceptance probability α is defined as Since the distributions q and p(m) are symmetric, the ratio in eq. ( 13) depends only on the likelihood ratio p(d|m )/p(d|m).If α ≥ r the new model m' gets accepted and the previous model m is replaced by it.This means that models with a higher likelihood are always accepted, but the walk can also proceed to a less likely model and adequately sample the posterior probability (Mosegaard & Tarantola 1995).
The location parameters are fundamentally different from the MT parameters because they dictate the shift from linearity to nonlinearity.Thus, we use two Markov chains: one to sample the location parameters and another one to sample the remaining parameters (noise and the MT parameters) for each point in the first (outer) chain.The first part of every inner Markov chain (called the burn-in period) is discarded and models are collected for the ensemble after it has reached convergence, that is, when the misfit, the MT parameters and the noise show no obvious trends and their fluctuation with iteration number is similar to a white noise process.Convergence was monitored in a number of tests prior to the inversion.It usually occurs after several hundred iterations so a conservative value of 20 000 iterations was chosen as the length of the burn-in period.Every 200th model from the post burn-in period is selected to eliminate dependent samples in the ensemble.To ensure the location parameters are exhaustively sampled, but in the same time collect a large number of samples from the maximum a posteriori (MAP) location, the proposal for location parameters in the outer chain is broad (θ 2 i = 2.5) in the first 500 iterations and changes to a smaller value (θ 2 i = 0.45) in the next 500 iterations.The ensemble of MT and noise parameters is taken from the MAP location.

S E N S I B I L I T Y T E S T S
Initially, we investigate performance of the algorithm with synthetic data.We use the configuration of an earthquake in Long Valley Caldera, California and station locations from the Berkeley Digital Seismic Network (BDSN; shown in Fig. 4) as a study case (Dreger et al. 2000;Minson & Dreger 2008) to create synthetic seismograms.The structure model used to create the elementary seismograms and the synthetics is the Southern California (SO-CAL) model, consisting of three horizontal layers over half-space (Dreger & Helmberger 1990).The synthetic input source mechanism has substantial non-DC components (∼35 per cent CLVD and ∼10 per cent ISO) and a Dirac source-time function; the synthetic data are filtered between 0.02 and 0.05 Hz using a one pass Butterworth filter.
Additionally, we add noise in the frequency domain, multiplying the complex spectra by k(1 + r 1 + ir 2 ), where k is a constant and r 1 and r 2 are random numbers between −1 and 1.This type of noise simulates the propagation effects on the wavefield because it introduces a phase shift together with an amplitude variation (Kennett 1985).Furthermore, its covariance matrix shows similar statistical properties as the real seismic noise.The autocorrelations of noise traces created this way (Fig. 3a) have the same behaviour as the real noise autocorrelations (Fig. 2).
The level of noise is calculated as a per cent of the data rms and varies by a few per cents on different components.We perform the inversion with three parametrizations of the noise covariance matrix (defined by eqs 9, 11 and 12) for data without noise, with low (k = 0.25, i.e. ∼16 per cent data rms) and high (k = 1, i.e. ∼65 per cent data rms) levels of noise, with σ 2 being the hyperparameter in the inversion.The coefficients necessary to parametrize the C e matrix (r e for the exponential matrix and b, r e1 , r e2 , L 1 and L 2 for the attenuated cosine matrix) were calculated prior to the inversion using the Hyper-sweep code (http://www.iearth.org.au/codes/Hyper-sweep/) that performs a grid search in a multidimensional space.

Data without noise
All three assumptions for the noise covariance matrix behaved similarly for data without noise and retrieved the correct location and MT.The location parameters converged in the first few hundred iterations in the outer Markov chain.After 500 iterations, the algorithm concentrates on a narrow area around the MAP location (Fig. 5).The acceptance ratio for the location parameters is low so we plot all the sampled locations (empty circles) together with the accepted ones (full circles).We also show cross-sections through the MAP source location to increase the visibility.The MT parameters are exhaustively sampled during the burn-in period of the inner Markov chains, but the final ensemble, collected after it, shows low uncertainty (Fig. 6 and Table 1).The full MT MAP solutions agree well with the input mechanism (Fig. 7a) and the seismograms are indistinguishable from the input data (Fig. 7b).Given that this was 'perfect' data, created with the same structural model we use to compute the elementary seismograms for the inversion, correct retrieval of the input mechanism and low uncertainties are not surprising.We wish to examine the non-DC components so we visualise them using the lune plots (Tape & Tape 2012), where longitude γ is determined by the amount of the CLVD component and latitude δ by the amount of the ISO component, placing the pure DC mechanism in the middle.For data without noise, the final ensemble (darker colours on all three plots) shows low uncertainty and coincides with the input mechanism in all cases (Fig. 7c).

Data with added noise
Adding noise to the synthetic data somewhat affected the algorithm convergence, the inner Markov chains converged slower.However, this is not a strict rule because the convergence is also affected by initial parameter values, which are random.When the initial values are close to the optimal values, the algorithm will converge faster.
Inversions of data 'contaminated with noise' showed larger uncertainties of the model parameters (Tables 2 and 3).As the wavelengths at frequencies used in the inversion are several tens of kilometres long, the spatial resolution of the centroid location is limited.Nevertheless, the retrieved location was only a few gridpoints away from the input location.The DC part of the mechanism was close to the input values, but there were more differences in the non-DC part.
When a low level of noise was added to the synthetic seismograms, the algorithm converged to a location 1 km deeper and 2.5 km south of the epicentre when the diagonal and exponential noise covariance matrices were used.The DC part of the MT was accurately retrieved (the MAP solutions had only a few degree difference in strike, dip and slip from the input solution), but the non-DC components were not (Fig. 7d).Seismograms obtained with all three assumptions are quite similar and fit the data (Fig. 7e).However, a more complex parametrization of the noise covariance matrix, the one using two attenuated cosines, resulted in a correct location and the correct non-DC components (Fig. 7f).
With more noise added to the data, the earthquake was mislocated with all three assumptions for C e .The DC part of the mechanism was close to the input values in all three cases, but the scalar moment was overestimated by 4-8 per cent (Fig. 8a-c).The non-DC parts in first two inversions are closer to the input source (Fig. 7i).However, the uncertainties are largest in the third case, reflecting the high data noise.Furthermore, the last inversion retrieved the correct epicentre location.

Improving the azimuthal coverage
Stations used in the previous experiments had a large azimuthal gap (more than 180 • ) so we test the algorithm performance using additional five (synthetic) stations distributed along the gap (blue triangles in Fig. 4).Although an MT inversion with waveform data is theoretically possible even using data from a single station because the number of data is larger than the number of parameters, the noise can cause spurious non-DC components.The performance of the inversion can be improved with good azimuthal coverage (e.g.Šílený 1998;Zahradnik & Custodio 2012).We add the higher level of noise to all data and compare the results of inversions with five stations.
The MAP location was closer to the input location when an exponential and an attenuated cosine matrices were assumed (∼2.5 km horizontally and 1 km in depth, respectively).The DC components were similar as before, but their uncertainty slightly decreased and the variance reduction increased when diagonal and exponential matrices were used (Fig. 8d-f and Table 4).Non-DC components from all locations show a smaller spread than before (light colours in Figs 9c and 7i) and an even stronger trade-off between the CLVD and ISO components.Non-DC components from the MAP location improved when using the attenuated cosine matrix.

A P P L I C AT I O N T O A L O N G VA L L E Y C A L D E R A E A RT H Q UA K E
After testing the algorithm on synthetic data, we apply it to an earthquake from a complex tectonic setting in Long Valley Caldera, California.The analysed earthquake occurred on 1997 November 11 and was one of the four M W > 4.5 earthquakes with anomalous radiation pattern from the November swarm (Dreger et al. 2000).We use waveform data from the same five stations (BKS, CMB, KCC, ORV and PKD, shown in Fig. 4) and filter it in the same way as synthetic data in the previous examples.The inversion is performed with all three assumptions for the covariance matrix.Since the noise autocorrelations on these BDSN stations are different on the horizontal and vertical components (they are in Group 1 in Fig. 2), we use one parametrization for the horizontal components and another for the vertical.The covariance matrices are computed fitting the average noise autocorrelation for the five stations shown in Fig. 2.
We centre the grid for the centroid location on epicentre reported in the Council of the National Seismic System (CNSS) catalogue at 37.634 • N latitude and −118.946• W longitude and include depths from 3 to 20 km.The hypocentral depth reported in the CNSS catalog is 7.1 km and the depth obtained by Dreger et al. (2000) is 5 km.Initially, we perform a linear inversion on a number of depths and search for a time-shift between the data and Green's functions to accommodate for an imperfect structure model.Time-shifts for    moment were highest when using an attenuated cosine covariance matrix.
All three inversions resulted in a high ISO component (∼45 per cent) and a CLVD component close to zero.The inversions with diagonal and exponential covariance matrices gave a slightly negative CLVD component, while the inversion with an attenuated cosine covariance matrix gave a slightly positive CLVD, but still lower than (Minson & Dreger 2008) solution (Fig. 12c).Interestingly, we do not observe a prominent trade-off between the ISO and CLVD components when looking at solutions from all sampled locations (light colours on the lune plots) as in the synthetic tests.

C O N C L U S I O N S
We have shown that the hierarchical Bayesian inversion is well suited for non-linear inversion of source parameters using noisy waveform data.To account for the interdependence of observational errors, we parametrize the noise covariance matrix bringing to bear noise recorded on stations used in the inversion.Inverting for the level of noise takes into account the lack of knowledge on data uncertainty.The procedure results in an ensemble of models showing parameter uncertainties.A disadvantage of the method is that it does not account for uncertainties in velocity models, inherited through Green's functions.
Synthetic tests show that the DC part of the solution is well determined even for data with a high level of noise.Although we are using long-period data, the location is close to the input value and improves by using a covariance matrix computed from empirical noise.The non-DC parts are resolved less well, but our approach gives a good estimate of their uncertainty.
We have applied the approach to data from a complex tectonic environment, the Long Valley Caldera.Solutions obtained using different parametrizations of the covariance matrix are similar.The centroid location and the DC component are well resolved and the DC part agrees well with a previous result of Minson & Dreger (2008).For this data set we do not observe a trade-off between the CLVD and ISO components, and the ISO component is large (∼45 per cent) in inversions with all parametrizations of the covariance matrix.

Figure 1 .
Figure 1.Mechanisms of elementary moment tensors used in the inversion.

Figure 2 .
Figure 2. Autocorrelations (for non-negative lags) of the three-component noise seismograms on different stations around the globe.Group one shows results for four stations in California (BKS, CMB, KCC and ORV), two stations in Australia (CTAO and ERAB), one in Japan (ERM) and one in north Russia (NRIL).Stations in Group two (BRVK in Kazakhstan, ESK in the United Kingdom, MBAR in Uganda and NNA in Peru) have more complex autocorrelations of the horizontal components.

Figure 3 .
Figure 3. (a) Autocorrelations of 15 noise series (three components of noise on five stations) and their average value.(b) The best-fit values of an exponential and two attenuated cosine functions for the average autocorrelation.(c) Covariance matrices corresponding to the average autocorrelation (empirical) and the best fit for the two parametrizations.

Figure 4 .
Figure 4. Map of the region showing the earthquake location in Long Valley Caldera (green star), location of real stations used in the inversions (red triangles) and synthetic stations added in the last synthetic experiment (blue triangles).The insert in the lower left corner shows the smaller caldera area.

Figure 5 .
Figure 5. Centroid locations for 1000 iterations in the outer Markov chain for the inversion of data without noise assuming a covariance matrix with two attenuated cosine functions.The average iteration number for each location determines its colour.Open circles show all proposed locations and full circles are the accepted ones.Symbol sizes are determined by the likelihood value (the 1 per cent MAP locations are plotted with the largest circle, the next 9 per cent with a smaller one, etc.).The black star represents the input location, while the green circle shows the MAP location.The first subplot shows locations in 3-D and the other three subplots are cross-sections through the MAP source location.

Figure 6 .
Figure 6.The DC part of the solutions for data without noise before they are collected for the ensemble (burn-in period) and afterwards.The solutions were collected at the MAP location from inversions with a (a) diagonal, (b) exponential and (c) attenuated cosine covariance matrix.The solutions are coloured based on the value of the variance reduction calculated with the L2 norm.The dashed green lines show the DC part of the input mechanism.

Table 1 .Figure 7 .
Figure 7. (a) The input mechanism (black) and MAP solutions obtained for data without noise assuming a diagonal (grey), exponential (blue) and attenuated cosine (red) covariance matrix.(b) Synthetic data and seismograms from three MAP solutions, coloured in the same way as beachballs.(c) Lune source-type plots with solutions from all locations shown in light colours and the final ensemble shown in darker colours for the three inversions.The colours are as in the previous two plots; the star shows the input value.(d), (e) and (f) are the same for data with a low level of noise and (g), (h) and (i) for data with a high level of noise.

Figure 9 .
Figure 9. (a) Input and recovered focal mechanisms for a synthetic experiment with 10 stations.(b) Three component seismograms.(c) The lune plots for all three assumptions of C e .For details see the caption of Fig. 7.

Figure 10 .Figure 11 .
Figure 10.Sampled centroid locations for real data from the Long Valley Caldera.For details see the caption of Fig. 5.

Figure 12 .
Figure 12.Inversion results for the LVC data: (a) focal mechanisms, (b) seismograms, (c) the lune plots.For details see the caption of Fig. 7.The green beachball in (a) and stars in (c) show the solution from Minson & Dreger (2008).

Table 4 .
Range of parameter values for an inversion that uses 10 stations with ∼65 per cent noise.For details see Table1.

Table 5 .
Range of parameter values for LVC data.For details see Table1.