-
PDF
- Split View
-
Views
-
Cite
Cite
Chak-Hau Michael Tso, Marco Iglesias, Paul Wilkinson, Oliver Kuras, Jonathan Chambers, Andrew Binley, Efficient multiscale imaging of subsurface resistivity with uncertainty quantification using ensemble Kalman inversion, Geophysical Journal International, Volume 225, Issue 2, May 2021, Pages 887–905, https://doi.org/10.1093/gji/ggab013
- Share Icon Share
SUMMARY
Electrical resistivity tomography (ERT) is widely used to image the Earth’s subsurface and has proven to be an extremely useful tool in application to hydrological problems. Conventional smoothness-constrained inversion of ERT data is efficient and robust, and consequently very popular. However, it does not resolve well sharp interfaces of a resistivity field and tends to reduce and smooth resistivity variations. These issues can be problematic in a range of hydrological or near-surface studies, for example mapping regolith-bedrock interfaces. While fully Bayesian approaches, such as those using Markov chain Monte Carlo sampling, can address the above issues, their very high computation cost makes them impractical for many applications. Ensemble Kalman inversion (EKI) offers a computationally efficient alternative by approximating the Bayesian posterior distribution in a derivative-free manner, which means only a relatively small number of ‘black-box’ model runs are required. Although common limitations for ensemble Kalman filter-type methods apply to EKI, it is both efficient and generally captures uncertainty patterns correctly. We propose the use of a new EKI-based framework for ERT which estimates a resistivity model and its uncertainty at a modest computational cost. Our EKI framework uses a level-set parametrization of the unknown resistivity to allow efficient estimation of discontinuous resistivity fields. Instead of estimating level-set parameters directly, we introduce a second step to characterize the spatial variability of the resistivity field and infer length scale hyperparameters directly. We demonstrate these features by applying the method to a series of synthetic and field examples. We also benchmark our results by comparing them to those obtained from standard smoothness-constrained inversion. Resultant resistivity images from EKI successfully capture arbitrarily shaped interfaces between resistivity zones and the inverted resistivities are close to the true values in synthetic cases. We highlight its readiness and applicability to similar problems in geophysics.
1 INTRODUCTION
Electrical resistivity imaging (ERI), or electrical resistivity tomography (ERT), is an effective method to reveal the subsurface structure of the Earth’s near-surface (e.g. Binley & Slater 2020). It can provide a proxy of properties of interest at spatial coverage and resolution that are not attainable by point-based sampling methods. When used in a time-lapse manner, ERT surveys can be repeated frequently to capture changes in the resistivity distribution, offering insight into subsurface processes. Recent advances in instrumentation allows the deployment of ERT surveys with a large number of electrodes and the collection of time-lapse data using autonomous systems. ERT is widely used in a large number of environmental and engineering applications, such as hydrological characterization (e.g. Binley et al. 2002; Johnson et al. 2012; Cheng et al. 2019; McLachlan et al. 2020), landslide monitoring (e.g. Uhlemann et al. 2017; Holmes et al. 2020), contaminant transport monitoring (e.g. Kuras et al. 2016; Tso et al. 2020), studying root water uptake (e.g. Whalley et al. 2017; Blanchy et al. 2020b), saltwater intrusion monitoring (e.g. Hermans & Paepen 2020; Ronczka et al. 2020) and archaeological exploration (e.g. Ullrich et al. 2008; Fernández-Álvarez et al. 2017). Many applications require the identification of boundaries between two or more subsurface zones and reliable estimates of resistivity within each zone (e.g. soil horizons, geological facies, engineered structures or wetting fronts). Moreover, some measures of uncertainty to these estimates are often desirable to facilitate interpretation and decision making.
Smoothness constrained inversion (e.g. Binley 2015) is the most commonly used method for ERT inversion. Although it is an efficient and robust method, it explicitly favors the smoothest resistivity field (by L2 measure) that honours the observed data, subject to prescribed constraints. In the presence of sharp changes of resistivity in the subsurface, the interfaces between regions are often not clearly resolved. Moreover, the inverted resistivity values tend to be in the region of the mean (linear or log) resistivity of the entire domain, rather than in the range of the actual resistivities in each zone. This then limits our ability to use the inverted resistivities to map directly to hydrological models via petrophysical transforms. While these issues can be alleviated partly via approaches such as the use of layered or lateral constraints (e.g. Auken & Christiansen 2004), minimum support functionals (e.g. Nguyen et al. 2016) or parameter disconnect if the boundaries are known (e.g. Slater & Binley 2003; Johnson et al. 2012), a general approach is lacking to image the subsurface with arbitrarily shaped zones. There exist approaches to derive zonal or facies membership from smooth inverted images, mainly post-possessing the images via clustering or edge detectors (Chambers et al. 2014; Ward et al. 2014), or deriving conditional probability maps from colocated measurements from direct sampling (Hermans & Irving 2017). However, their performance depends on the results of the smoothness-constrained inversion results. Smoothness constrained inversion is also not particularly well-suited for uncertainty quantification, especially when there are sharp boundaries. Conducting uncertainty analysis on the smoothly varying fields allows one to understand their variability, but provides little information on the location of potential discontinuities in the actual resistivity field.
Incorporation of geostatistical information can greatly improve inversions. Early attempts include the sequential successive linear estimator (SSLE, Yeh et al. 2002) [see also a comparison between smoothness-constrained and geostatistical inversion by Englert et al. (2016)]. More recent attempts include applying the principal component geostatistical inversion on ERT data (Kitanidis 2015). Geostatistical approaches have largely been based on variograms and they assume a stationary random field. This assumption is violated in systems where there are features of strikingly different orientation and abrupt changes or discontinuities such as those arising from stratified geological features. Recent methods have taken advantage of the non-stationary Matérn family of covariance functions, where variogram and spatial scales are estimated at each location in the model domain as a stochastic process. Such method has been used increasingly for spatial modeling in geophysics and precipitation modelling and was first used in 2-D Bayesian ERT inversion by Bouchedda et al. (2017).
With increasing interest in quantifying the uncertainty in ERT estimates and improving the identification of features (Andersen et al. 2003; Ramirez et al. 2005; Irving & Singha 2010; Linde et al. 2017), there has been significant interest in using fully Bayesian approaches based on Markov Chain Monte Carlo (MCMC) to approximate the posterior probability density function (pdf) of the unknown resistivity field (i.e. posterior of resistivity at each pixel). Accurate approximations of the full posterior pdf are useful for quantifying the uncertainty of all possible modelling outcomes emerging from ERT. Recent work includes the trans-dimensional inversion concept where the number and density of parameters are estimated alongside the parameter values (Galetti & Curtis 2018), which adaptively avoids over- or underparametrizing the given inverse problem. A related approach uses a fixed parameter mesh and in each MCMC iteration, the estimates of the interface between the two zones (i.e. a polyline) are first updated, followed by estimates of the parameter fields within each subdomain (de Pasquale et al. 2019). Similarly, to speed up the generation of MCMC proposals and hence convergence, an area-to-point kriging approach has been proposed recently to generate fine-scale multi-Gaussian realizations from smooth tomographic images obtained from smoothness-constrain inversions (Nussbaumer et al. 2019).
The theory of fully Bayesian methods such as MCMC ensures that sampling approximations converge to the target (posterior) distribution. In practice, however, it is widely known that these methods require hundreds of thousands or even millions of model runs to produce accurate approximations. A few thousands of MCMC samples can, indeed, be used to approximate regions of high-probability. However, regions of low probability under the posterior pdf curve can be undersampled since most MCMC proposals are based on local-moves which tend to reject samples that belong in those low probability regions. It comes as no surprise that most fully Bayesian methods in the context of ERT have been only applied to 2-D settings and/or focus on approximating only the high probability areas which could, perhaps, be approximated more cost-effectively with methods based on Gaussian approximations such as the one that we introduce in this work.
Data assimilation (DA) methods such as ensemble Kalman filter has gained popularity in the Earth sciences. Early attempts, for example in hydrogeology, uses the standard filtering approach to update both parameter and state variables (e.g. Zhou et al. 2014; Camporese et al. 2015) but recent work has focused on reformulating the problem to estimate model parameters only (e.g. Chen et al. 2013; Song et al. 2019). These methods are known as ensemble smoothers (ES) or ensemble Kalman inversion (EKI). ES methods are particularly popular with history matching of production data in hydrocarbon reservoirs (Emerick & Reynolds 2013; Le et al. 2016), and have been recently extended to include seismic (Emerick 2016) or land subsidence data (Iglesias et al. 2014; Baù et al.2015), as well as estimation of petroelastic properties (Liu & Grana 2018) and facies models (Canchumuni et al. 2019). Some groundwater applications of ES or EKI include the work of Ju et al. (2018), Kang et al. (2018), Lan et al. (2018), Kang et al. (2019) and Song et al. (2019), while Aalstad et al. (2018) used ES to estimate the snow distribution at Arctic sites from satellite-derived data. In near-surface geophysics, ES has been used for crosshole GPR traveltime tomography in conjunction with approximate forward solvers and model error correction (Köpke et al. 2019). Other examples of ES or EKI outside the Earth sciences includes monitoring manufacturing process of fiber-reinforced composite materials (Iglesias et al. 2018).
The ensemble or Monte Carlo nature of DA methods allows them to produce uncertainty estimates but unlike MCMC, the relatively small number of samples and the assumed multivariate Gaussianity in the posterior parameter space implies that they can only derive one peak for each parameter value and its spread. Since ES or EKI are derivative-free solvers, they are also sometimes seen as an alternative to classical inversion solvers, which require the derivation of sensitivity or Jacobian matrices. For many large and complex coupled problems, this is a non-trivial task (especially for coupled problems) to derive the sensitivity (Jacobian) matrix via the adjoint method analytically in order to solve the inverse problem efficiently. Note the speed-up using the adjoint method is not possible for proprietary codes since a thorough knowledge of the code is needed to derive the adjoint analytically. In contrast, the use of DA methods is much more straightforward since it only requires the evaluation of forward model runs (i.e. run the code as a ‘black box’) and they can be run in parallel simply by distributing the runs to different workers. Although DA or EKI methods are not widely used for solving applied geophysical problems, their ease of modelling and high flexibility allow them to be readily applied to a wide range of geophysical problems (e.g. Muir & Tsai 2020).
While classical geostatistical approaches treat the model domain as a continuous random field, recent methods aim to treat the model domain as discrete facies (i.e. zones). Note that these facies may not be geological but can also be ‘geochemical’ or ‘hydrological’ for modelling purposes (e.g. Sassen et al. 2012; Wainwright et al. 2014). Among them, the level-set method (Chan & Tai 2004) represents boundaries between zones or facies by using the zero contour (the levelset) of a scalar function One of its earliest. applications is to a simplistic electrical resistivity imaging problem (Chung et al. 2005). Notable applications of level-set parametrization of the model parameter field includes a Bayesian inversion method of aquifer test data that is extensible to geophysical data (Cardiff & Kitanidis 2009) and an ensemble DA framework for aquifer test and tracer data at the U.S. Hanford site (Song et al. 2019). In addition, it has been applied to study stream bed heterogeneity in a hyporheic exchange context (Chen & Zeng 2015). In geophysics, Zheglova et al. (2013, 2018) developed a 2-D level-set traveltime inversion and multiple level-set joint inversion of traveltime and gravity data, respectively. Finally, Chou et al. (2016) estimated saturated hydraulic conductivity in heterogeneous soil during infiltration tests by monitoring the movement of the wetting front with ERT. A simplified level-set method was used to ensure flow lines derived from ERT images at two consecutive times did not intersect.
Recognizing the advantages and limitations of current methods, we propose an ensemble Kalman inversion combined with level-set parametrization for electrical resistivity imaging. This approach builds on previous EKI with level-set parametrization work (Iglesias 2016; Chada et al. 2018; Iglesias et al. 2016, 2018) and is modified for ERT. Its approximate Bayesian and Monte Carlo nature provides estimates of model uncertainty at a fraction of the computation cost of MCMC inversion, allowing it to be applied readily to large 3-D and time-lapse surveys. Its level-set parametrization permits the estimation of spatially varying geostatistical parameters of the resistivity field within the inversion so that it can capture elongated features with different orientations. Its use of Matérn covariance functions allows estimation of spatially variable correlation lengths of the resistivity field. We describe the methodology in the next section and demonstrate its use with a series of synthetic and field hydrogeophysics examples, followed by a discussion and conclusions.
2 METHODOLOGY
2.1 Bayesian formulation of ERT
with appropriate boundary conditions (e.g. Binley 2015).
In order to fully determine the posterior |$\mathbb {P}(\sigma \, \vert d)$| from (5), we face two substantial challenges. The first one is to design a good prior |$\mathbb {P}(\sigma )$| that not only reflects our prior knowledge of σ, but is also capable of extracting key features of σ†. The second challenge is to compute Z. Indeed, if |$\mathbb {P}(\sigma )$| is specified and Z is known, then expression (5) fully determines the posterior that can be, in turn, used to compute point estimates (e.g. mean) as well as measures of uncertainty (i.e. variance and credible intervals) for σ(x).
The connection between smoothness constrained inversion and the MAP estimator can be further generalized to non-Gaussian priors by considering an appropriate penalty term, |$\frac{\beta }{2}\left\Vert W_{\sigma }\sigma \right\Vert ^2$|, in (2). However, the advantage of the Bayesian approach over smoothness constrained is that, with the former we have the full posterior |$\mathbb {P}(\sigma \, \vert d)$| that we can use to quantify uncertainty in the estimates of σ(x). In addition, a fully Bayesian approach has additional advantages including full account of non-linear effects, multimodality, near-arbitrary priors and likelihoods (e.g. Linde et al. 2017; Galetti & Curtis 2018). For a detailed discussion of the link between the deterministic and Bayesian inversions, the reader is referred to the work of Calvetti & Somersalo (2018).
2.2 Priors and parametrizations
In addition to quantifying uncertainty in the estimates that we produce, a key requirement for our ERT framework is to be able to capture sharp interfaces between zones of different conductivity. In the Bayesian setting this can be done either by (i) a careful selection of priors |$\mathbb {P}(\sigma )$| that produces realizations of conductivity fields with discontinuities or (ii) a parametrization of the conductivity field which incorporates discontinuous (or piecewise continuous) features. As discussed at the end of the preceding section, a Gaussian prior will enforce smoothness via the prior covariance. Hence, Gaussian priors are not suitable for characterizing discontinuous (non-smooth) fields. Other priors that can be used for this purpose are the so-called edge-preserving (i.e. Total Variation, Besov and Cauchy) priors (Arridge et al. 2019) commonly used in image processing. However, unlike Gaussian priors, characterizing (e.g. sampling from) these edge-preserving distributions can be computationally complex. For this reason we follow the second approach which consists of an adequate parametrization of |$\sigma =\mathcal {P}(u)$| that we construct so that (i) it allows us to characterize multiple (unknown) zones with different conductivity and (ii) the new level-set parameter u(x) can be easily characterized under the prior. Fig. 1 shows an illustration of the level-set methods to delineate sharp interfaces. In Section 2.3, we revisit the Bayesian ERT formulation from Section 2.1 in terms of the new parameter u(x).
For the sake of exposition we consider a 3-zone example where the conductivity σ(x) is defined for |$x\in \mathbb {R}^{d}$| (d = 2 or d = 3 for 2-D and 3-D problems, respectively), but the approach can be used for more zones. Constructing the parametrization involves two steps.
2.2.1 Level-set parametrization
Instead of estimating σ(x) we could now reformulate the inverse problem in terms of identifying the functions |$\lbrace \xi _{\iota }(x)\rbrace _{\iota =1}^{4}$|. Solving this new problem will amount to estimating the shape of the three zones [via ξ4(x)] of different conductivity, together with the possibly heterogeneous conductivities within each zone. Before we tackle this new inverse problem, we need to further parametrize |$\lbrace \xi _{\iota }(x)\rbrace _{\iota =1}^{4}$|.
2.2.2 Whittle–Matérn parametrization
While the characterization of GRFs via (16)–(17) may seem quite involved, we should note that the operator |$\nabla \, \cdot \text{diag}(\mathbf {L}^2)\nabla$| that appears in (17) is simply the anisotropic version of a Laplacian. Thus, a discretization of |$\mathcal {A}\equiv (\mathbb {I}-\nabla \, \cdot \text{diag}(\mathbf {L}^2)\nabla )$| can be relatively straightforward using standard finite element or finite difference methods. Given the discretization of |$\mathcal {A}$|, the SPDE Approach from Lindgren et al. (2011) provide us with straightforward steps for solving (17) for the case when (ν + d/2) is an integer.
Rather than approximating the posterior distribution for the (log) conductivity σ(x), our goal is to compute the posterior on u(x) and then transform back to a distribution on σ(x) via |$\mathcal {P}$|. The framework above gives us a natural choice for the priors |$\mathbb {P}(\omega _{i})=N(0,\mathbb {I})$| (i.e. Gaussian white noise). Priors on λi and |$\Theta _{i}=(\mathbf {L}_{i},\nu _{i},\tau _{i})$| must now be specified.
The parametrization that we use in (12) has some modelling limitations which are worth mentioning. First, we restrict ourselves to the case in which only three well-defined conductivity regions are present. Secondly, the definition in (12) does not allow regions Ω1 and Ω3 to intersect. Of course, if the truth cannot be accurately characterized under these modelling conditions, the inversion results can be inaccurate. However, our aim here is to show that EKI, combined with adequate parametrizations of the conductivity, can be an effective derivative-free tool for ERT with uncertainty quantification capabilities. Other parametrizations can be used to addressed the aforementioned limitations, including multiple level-sets (Dorn & Villegas 2008), as well as other characterizations of the unknown such as the pluri-Gaussian method (Sebacher et al. 2017).
2.3 EKI for the parametrized ERT problem
In this section we apply the Bayesian framework to the new parameter u(x) that we introduce earlier to characterize discontinuous σ(x) via the mapping |$\sigma =\mathcal {P}(u)$|. We re-formulate the Bayesian ERT problem in terms of the push-forward of |$\mathbb {P}(u\vert d)$| (i.e. the posterior on u(x)) under the parametrization map |$\mathcal {P}$|. In plain words, if we are able to produce samples from |$\mathbb {P}(u\, \vert d)$|, say |$\lbrace u{(j)}\rbrace _{j=1}^{J}$|, then the corresponding conductivities, |$\sigma ^{(j)}=\mathcal {P}(u^{(j)})$|, are samples of the ‘push-forward’ density that we denote by |$\mathbb {P}^{y}(\sigma )$|. This is analogous to the posterior on σ(x) introduce in the Section 2.1 and comprises the statistical information of σ(x) while enforcing the 3-zone assumption encoded in |$\mathcal {P}(u)$|.
2.3.1 EKI algorithm
In order to approximate the posterior |$\mathbb {P}(u\, \vert d)$| in (23), we consider the ensemble Kalman inversion (EKI) algorithm of Iglesias et al. (2018). EKI is based on the tempering scheme which consist of introducing a sequence of N intermediate densities between prior and posterior: |$\mathbb {P}_{0}(u)\rightarrow \mathbb {P}_{1}(u) \rightarrow \dots \mathbb {P}_{N}(u) \rightarrow \mathbb {P}_{N+1}(u) =\mathbb {P}(u\, \vert d)$|.
EKI is an iterative algorithm that, at each of the nth iteration, produces samples from a Gaussian approximation of |$\mathbb {P}_{n}(u)$|. The algorithm is initialized with an ensemble of inputs |$\lbrace u_0^{(j)} \rbrace _{j=1}^J$| drawn from the prior |$\mathbb {P}(u)$|, measurements d and error covariance Ξ. Note that the prior ranges of conductivity values of different zones should not overlap.
Set n = 0 and θ−1 = 0
At each iteration n,
Compute |$\mathcal {G}_{n}^{(j)}= \mathcal {G}(u_{n}^{(j)}),\qquad j\in \lbrace 1,\dots , J\rbrace$| by running the forward model.
- Compute αn viawhere M is the number of measurements (i.e. length of vector d) and set |$\theta _{n}\leftarrow \theta _{n-1}+\alpha _{n}^{-1}$|. The choice of regularization parameter αn is crucial to inversion performance; here we follow the approach proposed recently in Iglesias & Yang (2021).(29)\begin{eqnarray*} \alpha _{n}^{-1}\equiv \min \Bigg \lbrace \Bigg [\frac{1}{M}\frac{1}{J}\sum _{j=1}^{J}\left\Vert \Xi ^{-1/2}(d-\mathcal {G}_{n}^{(j)})\right\Vert ^2\Bigg ]^{-1},1-\theta _{n-1}\Bigg \rbrace , \nonumber \\ \end{eqnarray*}
- Update each particle (a realization of the ensemble) according to:(30)\begin{eqnarray*} u_{n+1}^{(j)}=u_{n}^{(j)}+\mathcal {C}_{n}^{u\mathcal {G}}(\mathcal {C}_{n}^{\mathcal {G}\mathcal {G}} +\alpha _{n}\Xi )^{-1}(d+\sqrt{\alpha _n}\eta _{n}^{(j)}-\mathcal {G}(u_{n}^{(j)})), \end{eqnarray*}where j ∈ {1, …, J} and |$\eta _n^{(j)} \sim N(0,\Xi )$| is artificial noise that we generate with the same assumed statistics of the measurement error. A great advantage of EKI is that the particles (i.e. forward models) can be evaluated independently, allowing a large part of the update to be computed in parallel. In (30), |$\mathcal {C}_n^{u\mathcal {G}}$| and |$\mathcal {C}_n^{\mathcal {GG}}$| are model-data covariance and data autocovariance matrices, respectively, which are formed empirically based on the evaluation of the forward models |$\lbrace \mathcal {G}(u_n^{(j)}) \rbrace _{j=1}^J$|. |$\mathcal {C}_{n}^{\mathcal {G}\mathcal {G}}$| is typical diagonal so the inverse in the above equation can be calculated efficiently. Specifically, |$\mathcal {C}_n^{u\mathcal {G}}$| and |$\mathcal {C}_n^{\mathcal {GG}}$| are given by:(31)\begin{eqnarray*} C_{n}^{\mathcal {G}\mathcal {G}}\equiv &\frac{1}{J-1}\sum _{j=1}^{J}(\mathcal {G}(u_{n}^{(j)})-\overline{\mathcal {G}}_{n})\otimes (\mathcal {G}(u_{n}^{(j)})-\overline{\mathcal {G}}_{n}) \end{eqnarray*}with |$\overline{u}_{n}\equiv \frac{1}{J}\sum _{j=1}^{J}u_{n}^{(j)}$| and |$\overline{\mathcal {G}}_{n}\equiv \frac{1}{J}\sum _{j=1}^{J}\mathcal {G}_{n}^{(j)}$|.(32)\begin{eqnarray*} C_{n}^{u\mathcal {G}} \equiv & \frac{1}{J-1}\sum _{j=1}^{J} (u_{n}^{(j)}-\overline{u}_{n})\otimes (\mathcal {G}(u_{n}^{(j)})-\overline{\mathcal {G}}_{n}), \end{eqnarray*}
Set n ← n + 1
The iterative loop continues until convergence which is controlled by the criterion in (28), which is equivalent to while θn − 1 < 1 is not true. The algorithm is converged whenever θn = 1 which ensures that (28) is satisfied and, hence, that the ensemble of particles |$\lbrace u_{n+1}^{(j)}\rbrace _{j=1}^{J}$| approximate the sought Bayesian posterior. To quantify convergence we define the weighted root-mean-squared (WRMS) data misfit based on eq. (29) as:(33)\begin{eqnarray*} WRMS=\frac{1}{M}\frac{1}{J}\sum _{j=1}^{J}\left\Vert \Xi ^{-1/2}(d-\mathcal {G}_{n}^{(j)})\right\Vert ^2. \end{eqnarray*}
2.4 Summary of methods
To summarize the above, the key point of our method is that we do not directly invert for a distribution of conductivity fields. Rather, we invert for parameters u and use a mapping |$\mathcal {P}$| to convert them to conductivity fields. The level-set function is a good choice for |$\mathcal {P}$| to handle sharp boundaries but we want to parametrize it using hyperparameters such as length scales and smoothness. So for each of the level-set functions, we rewrite them based on Matérn covariance functions and the SPDE approach (16–18). The end results is that we can parametrize the level-set functions ξi(x) and thus the conductivity fields σ(x) through the Whittle–Matérn hyperparameters for each of the ith zone |$\Theta _{i}=(\mathbf {L}_{i},\nu _{i},\tau _{i})$| via eq. (18). Recall that |$\mathbf {L}_{i}$|, νi and τi are the controls the length scale, smoothness and amplitude of the level-set functions, respectively, alongside the mean conductivity of each zone λi and the Gaussian white noise at each grid location wi(x). Prior conductivity fields are generated by specifying λi and Θi. For examples of prior conductivity fields, refer to Fig. 4.

An illustration of the level-set methods to delineate sharp interfaces. Using a higher dimension function allows much more straightforward description of arbitrary geometries in the original dimension. In this illustration, the two zones can be defined simply as above or below z = 0 level of the 3-D function.
The parameters are updated by EKI. The general concept is to update an ensemble of model realizations using ensemble Kalman filter-like updates. By computing the data misfit between the ensemble (obtained by running the forward model for each realization) and the observed, the mean of the ensemble is iteratively driven to the solution of the inverse problem. At each iteration, the regularization parameter αn is updated. It effectively allows more regularization when the data misfit is large, while letting it decreases gradually as the ensemble evolves closer to the solution.
To estimate a conductivity field assuming there is no within-zone heterogeneity, we update u(x) = (λi, Θi, wi(x)) via EKI. λi is the homogeneous conductivity value for each zone. Note that although λ and Θ are scalars, we are estimating their distributions (and hence their means and variances).
It is noteworthy that the flexibility of EKI means we may omit certain hyperparameters or include more if needed. In the deep Earth seismic EKI work of Muir & Tsai (2020), they are confident about their priors (i.e. the hyperparameters they set) and the assumed zonal seismic slownesses so these parameters are not updated. In one of their examples, they added a hyperparameter to account for the potential presence of a fault, which allows them to recover the fault geometry clearly. Since our examples also estimate the length scales, we call our approach ‘multiscale’, or ‘hierarchical EKI (Chada et al. 2018)’. Although EKI is a Bayesian method, Muir & Tsai (2020) used it pragmatically as a fast and flexible optimizer to replace the Jacobian and they did not report any uncertainty estimates.
Estimating a conductivity field with within-zone heterogeneity can be achieved by also estimating the spatially varying length scales of within-zone heterogeneity at each direction and at each grid cell. We demonstrate this in one of our synthetic examples.
2.5 Implementation notes
For the ERT forward modelling, we use a grid that extends laterally several times the dimension of the ERT imaging area to simulate an infinite earth in field studies. For convenience, here we discretize the parameter grid used for inversion is a grid consists of squares/cubes covering the entirety of the imaging area. For the field hillslope example (Section 3.2.1), the parameter grid is distorted to quadrilaterals based on surface elevations. At each iteration, the parameter grid is interpolated to the forward modelling grid to obtain simulated ERT data. In all the examples, the number of realizations used for each iteration is set to 300. For this paper, we implement the EKI method in MATLAB(R) and its statistics and machine learning toolbox as well as its built-in parallel tool. However, it can be implemented rather straightforwardly in other scripting languages.
Priors are given in log space and are assumed a uniform distribution. One of the zone is assigned to be the background zones such that its mean value is assigned for cells that are outside the parameter estimation grid.
Putting all of the above together, our approach can be summarized as a flowchart in Fig. 2. The unknown parameters are level-set parameters and they are converted to conductivity fields at the beginning of each iteration before being run through the ERT forward code to obtain simulated ERT data.

In all the examples, we compare the inversion results from EKI to smoothness-constrained inversion. The parameter grid used here is the same as the forward modelling grid. R2 and E4D are used for inversion of 2-D and 3-D data, respectively. To prevent very long run time, we treat the inversion as converged when the WRMS decrease between two iterations less than 1 per cent.
3 EXAMPLE APPLICATIONS
In this section, we report EKI inversion results from a series of synthetic and field examples. Unless otherwise specified, 2 per cent Gaussian noise is added to the data and a 2 per cent data error level is assumed in the inversions. For 2-D modelling, we use R2 (http://www.es.lancs.ac.uk/people/amb/Freeware/R2/R2.htm) for forward modelling because of its ease to setup the problem, particularly with the help of the the ResIPy interface (Blanchy et al. 2020a). For 3-D modelling, we use E4D (Johnson et al. 2010) as the forward solver because of its efficient parallel capabilities. For each problem, either a rectangular grid is used or a triangular (2-D)/tetrahedron (3-D) mesh is generated using tetgen (Si 2015).
3.1 Synthetic examples
3.1.1 2-D cross-borehole survey
In this example, two boreholes, 16 m apart are installed for cross-borehole ERT imaging (Fig. 3a). Along each borehole, 16 regularly separated electrodes are installed. In total, 204 ERT measurements (i.e. quadrupoles) are collected. The forward modelling grid includes 6336 rectangular cells while the parameter grid for inversion consists of 64 regularly spaced square cells in each direction (4096 total), spanning X = [ − 8, 8] m and Z = [ − 8, 8] m. The ground surface is at Z = 8 m. The background resistivity is 100 Ω m and that of the inclusion is 1 Ω m. The prior resistivity ranges for zone 1 (background) and 2 are [75,200] Ω m and [0.1,30] Ω m, respectively.

ERT inversion results of the 2-D cross-borehole example: (a) The true resistivity model. Electrodes are marked as black circles and the rectangular inclusion is highlighted. (b)The resistivity model estimated by smoothness-constrained inversion. The range of estimated resistivity is extremely small (c) The mean resistivity model estimated by EKI, where the rectangular bounding box shows the true location of the 1 Ωm inclusion (d) The prior estimated probability of zone 1 (i.e. the background) (e) The posterior estimated probability of zone 1. In each subfigure, the true boundary of the two zones are marked by a red line.

(a) True model for the 2-D surface example, which comprises of two layers and a vertical fault. The black circles denotes the electrode locations. (b–h) Example realizations of prior resistivity field and the corresponding realization number.
The resistivity map obtained from commonly used smoothness constrained inversion is extremely smooth (Fig. 3b) and the resistivity range (note the range of values) is very small. Users familiar with smoothness constrained inversion can appreciate this image may contain a sharp target but it is difficult to communicate this to a non-geophysicist. The mean resistivity estimates from EKI (Fig. 3c), in contrast, recovers the two zones very well (with the exception of the corners of the target) and obtains the zonal resistivity value perfectly. The estimated target appears to be shifted to the left slightly from the true rectangular interface—this is due to a rather coarse parameter grid being used. In contrast, the smoothness constrained inversion returns a smoothed images with variations of resistivity values across the image. It also does not give a correct estimate of the resistivity values in neither the background region nor the inclusion.
A significant feature of EKI is the ability to obtain zonal probability maps of across the estimated resistivity models. Here, since a 2-zone formulation is used, each cell belongs to either the background or the inclusion zone. As can be seen in Fig. 3(d), the probability map for zone 1 is highly variable at the first iteration and the values are all between 0.9 and 1.0, reflecting our assumption of a background region. The posterior map (Fig. 3e), however, shows very high zone 1 probability in most of the domain, very low probability at the centre of the inclusion, and a probability of between 0.4 and 0.7 around the interface of the two zones.
3.1.2 2-D surface survey
In this example, a surface ERT line is deployed to image a layered system with a vertical fault (the true model is shown in Fig. 4a). The layers can be conceptualized as a bedrock overlaid by a less resistive topsoil. 25 regularly separated electrodes (2 m spacing) are installed. In total, 117 dipole–dipole measurements are simulated. The forward modelling grid includes 10 752 rectangular cells while the parameter grid for inversion consists of 25 600 square cells spanning X = [ − 24, 56] m and Z = [ − 20, 10] m, with 320 and 80 regularly spaced cells in the X and Z directions, respectively. The background resistivity (zone 1) is 2500 Ωm and that of the topsoil (zone 2) is 250 Ωm. Fig. 4 ) also shows a few selected realizations of the prior resistivity fields. They are generated by the level-set functions given prior ranges of length scales and resistivity values of the two materials.
Fig. 5 shows the results from a smoothness constrained inversion and a pseudo-L1 (or blocky) inversion using iterative reweighting (Loke et al. 2003), respectively. The smoothness constrained inversion shows a more resistive region at depth greater than 5 m and the resistivity variation is gradual and smooth. For the pseudo-L1 inversion, a flat topsoil layer is recovered, while the bedrock to the right of the fault is found (mistakenly) to be less resistive than that to its left. In both inversions, the resistivity value of the bedrock is underestimated. Note that the range of resistivity values estimated both the smoothness-constrained inversion are highly dominated by that of the top soil.

The resistivity model estimated by (a) smoothness-constrained inversion and (b) a pseduo-L1 inversion. ERT inversion results of the 2-D fault example using a 2-zone formulation: (c) The mean resistivity model estimated by EKI (e) The posterior estimated standard deviation. (g) The posterior estimated zone 1 probability (d, f and h) shows the above by repeating EKI by assuming a 3-zone formulation. Note that in (h), a probability less than 1 denotes a non-zero probability for zone 2 or 3. In each subfigure, the true boundary of the two zones are marked by a red line.
In this example, EKI with both 2 zone and 3 zone formulations are compared. In layered systems such as this one (i.e. a laterally extended topsoil with different electrical signatures to the rest of the domain), it may be of interest to test the hypothesis of whether an intermediate third zone exists. Therefore, a formulation with a third zone with a resistivity ranges that is lower than zone 1 and higher than zone 2 is also tested. In the 2 zone example, the prior resistivity of zone 1 (background) and zone 2 are [2000,3000] Ω m and [200,300] Ω m, respectively. Fig. 5 shows some example realizations of prior resistivity fields.
The results for EKI (2 zone) is reported in (Figs 5c, e and g). It gives a mean estimates of background/bedrock(zone 1) and topsoil (zone 2) resistivity of 2362 and 243 Ωm, respectively, which are very close to the true values (2500 and 250 Ω m). It also captures the resistivity structure very well, although the recovered fault structure does not appear to be perfectly vertical, the estimate zone boundaries and resistivity values are close to the true ones. As shown by Muir & Tsai (2020), the EKI can be formulated so that fault parameters (e.g. location and dip angle) are estimated explicitly and return more explicit fault geometry (see also discussions in Sections 2.4 and 4.3). The standard deviation maps shows the uncertainty is the highest around the interface between the two zones, which is helpful for interpretation and for uncertainty propagation to subsequent analysis steps. The uncertainty is also lower for the topsoil, partly because it is a lower resistivity layer, but also because of its proximity to the surface electrodes. It is also noteworthy that the left half of the image is subject to higher uncertainty. Recall that the fault system is set up so that the topsoil to the left is 2 m thicker than that to the right. This observation highlights the attenuation effects of ERT signals due to topsoil thickness.
In the 3 zone formulation (Figs 5d, f and h), the prior resistivity ranges of zone 1, zone 2 (background) and zone 3 are [2000,3000] Ω m, [700,900] Ω m and [200,300] Ω m, respectively. The results from EKI (2 zone) and EKI (3 zones) share many common features. EKI (2 zone) gives mean estimates of bedrock, background and topsoil resistivity of 2372, 798 and 242 Ωm, respectively, which are very close to the true values and the background value is almost identical to the geometric mean of the other zones. It is estimated that the moderate zone (zone 2) appears as a smooth band along the layer boundaries (Fig. 5d). Such a pattern would suggest a third intermediate zone does not exist, although in practice some independent verification is needed to determine whether the transition between zones is abrupt. Note that although the standard deviation is again along the zonal boundaries, there is a great reduction the standard deviation in the bedrock (Fig. 5f), meaning the effect of the uncertain boundary location on estimating bedrock resistivity is mitigated. Adding an intermediate zone in EKI can be helpful in practice when propagating uncertainty. If the contrast between the two zones are too high, a wrong estimation of zone membership may have serious effects.
We repeated our analysis with wider parameter ranges and obtained very similar results. In all the examples, convergence was achieved between 10 and 12 iterations. The computation was performed on an Intel i7 laptop (quad core) laptop and the time required for the entire inversion is less than 2 hr. Note that most of the run time is used for forward ERT model run. So the computation time scales linearly with the number of cores (up to the size of the ensemble J). The smoothness constrained inversion run using R2 for the same problem takes less than a minute on the same machine. Finally, we compare our results with a smoothness-constrained inversion (Fig. 5a), which shows a very smooth interface. Similar to EKI, a 1 m × 1 m parameter grid was used.
We also considered the performance of EKI to estimate a rather heterogeneous resistivity field. The true field (Fig. 6a) has the same zone boundaries but there are within-zone heterogeneities. The EKI algorithm is set to allow within-zone variations in resistivity values, which is done by estimating additional parameters such as length scales and variations of resistivities within each zone. In general, the two zones in the true field (Figs 6b–d) are recovered, but not as accurately as the previous case. The within-zone heterogeneity are only partly resolved. Again, the uncertainty is found to be higher at the estimated interfaces between zones. The computation cost for EKI here is comparable to that in Fig. 5.

ERT inversion results of the 2-D fault example with heterogeneity within each zone: (a) the true resistivity field; (b) the mean resistivity model estimated by EKI; (c) the posterior estimated probability of zone 1 (i.e. the background zone) and (d) the posterior variance estimated by EKI. In each subfigure, the true boundary of the two zones are marked by a red line. The boundary of the two zones is marked with a red line.
3.1.3 3-D structure
In this 3-D example, the true model is a cylinder of 2 m radius and 8 m in height, which mimics structures such as an abandoned well or mine shaft that is backfilled with sediments (Fig. 7). Three surface ERT lines (5 m separation), consisting of 16 electrodes (1 m separation), are used for data collection. The background resistivity is 200 Ω m and that of the inclusion is 10 Ω m. The ERT grid consists of 62002 tetrahedron cells. For EKI, we set the initial guess vertical length scale hyperparameters (|$\mathbf {L}$|) to be three times larger than the horizontal ones than to speed up convergence, but note that all of them are updated by EKI. The computation was performed on a DELL R420 cluster with 2 × Intel Xeon E5-2450 8-core processors (32 logical cores total) and 96Gb RAM. The inversion took 81 min to complete and converged in 18 iterations.

ERT inversion results of the 3-D complex structure example: (a) the true resistivity model; (b) the resistivity model estimated by smoothness-constrained inversion; (c) the mean resistivity model estimated by EKI; (d) the prior; (e) posterior variance map obtained by EKI and (f) the posterior zone 2 probability map obtained by EKI. The location of the electrodes are marked as orange cubes.
The smoothness constrained inversion recovers a rather smoothed structure of which the interface between the two zones are not clearly shown. The cylinder appears as a conductive blob near the surface, making it difficult to interpret if the true resistivity model were not known.
In contrast, the extent of the inclusion is very well recovered by EKI. The mean estimated resistivity for each zone is 200 and 10 Ωm, which are very close to their true values. The variance maps shows the confidence of facies/zone estimation and allows propagation of uncertainty for subsequent analysis. The evolution from prior to posterior variance maps shows a reduction in uncertainty by conditional the potential resistivity models with data. The posterior variance is higher near the interface between the two zones. Similarly, the zone 2 probability allows intuitive visualization of zone membership and its uncertainty.
3.2 Field examples
3.2.1 Borth peat bog, Wales, UK
Comprised of 98 per cent organic matter, peatlands are one of the largest reservoirs of carbon in the carbon cycle and are a major source of atmospheric methane. The existence of water logging and anaerobic condition slows down decomposition of plant materials, which in turn leads to the accumulation of peat. Estimating the extent and thickness are important to understand the hydrological and biogeochemical processes that occur in peatlands and geophysics has proved to be extremely useful in this regard (Slater & Reeve 2002; Comas & Slater 2004; Comas et al. 2004).
A series of field hydrogeophysical surveys and laboratory was conducted at Cors Fohno, a peat bog located at Borth in Wales, UK (52°32′N 04°00′W) Asunbo (2007). GPR, ERT, IP and direct sampling measurements are taken at the site. In this example, we focus on a 2 m electrode spaced ERT survey conducted along a 94 m length of board walk at the site. The dataset consists of 397 dipole–dipole measurements with a dipole spacing of one electrode and up to 10 survey levels. The forward modelling grid comprises of 7904 rectangular cells while the EKI parameter grid consists of 300 and 60 cells (0.5 m spacing) in the X and Z directions, respectively. A 5 per cent measurement error is assumed for the inversions and a 2 zone formulation for EKI is used. The prior resistivity of zone 1 (background) and zone 2 are [70,125] Ω m and [140,200] Ω m, respectively.
Fig. 8 a shows the results from smoothness-constrained inversion, which shows the resistivity decreases gradually with depth. Figs 8(b)–(d) shows the results for EKI. Two almost perfectly horizontal zones are identified, which is expected from Fig. 8(a) and from other geophysical and core measurements taken at the site. The lower boundary is found to be about 4 m deep by EKI, which agrees with direct sampling results of Asunbo (2007), which shows the presence of peat down to 4–5 m, underlained by dark organic-rich electrical conductive sediment down to 6.5–6.9 m. This last boundary designates the peats base which consists of blue (marine) clay. Figs 8(c) and (d) shows that the high uncertainty areas (on zone membership) are at the boundaries of the zones and also at some locations at depth and on the right-hand-side of the panel. The higher complexity on the right-hand side is also reported by Asunbo (2007).

ERT inversion results of the Borth field example: (a) the resistivity model estimated by smoothness-constrained inversion reported in Asunbo (2007). The inset shows a photo of the ERT line. (b) The mean resistivity model estimated by EKI. (c) The probability of a cell being in zone 2 estimated by EKI. (d) The standard deviation map obtained by EKI. The location of the surface electrodes are marked as black circles in panels (b–d).
In this example, convergence was achieved in 10 iterations. The computation was performed on an Intel i7 workstation (quad core) laptop and the time required for the entire inversion was less than 3 hr.
3.2.2 Chenqi catchment hillslope, SW China
The example data used here is from a survey conducted at Chenqi catchment in Guizhou Province, China in April 2017 Cheng et al. (2019). An ERT line was deployed along a hillslope and the goal of the ERT survey was to identify the hydrostratigraphy of the hillslope in order to improve a hydrological conceptualization of runoff processes within the karstic catchment. The 2-D profile consists of 48 electrodes and separated by about 5 m. The exact position of the electrodes were surveyed and recorded. A dipole–dipole measurement configuration was used with dipole spacings of one, two, and three electrodes and up to 11 survey levels. A full set of reciprocal measurements of the data were taken for error analysis. The data set consists of 1569 reciprocal measurements. A 5 per cent measurement error is assumed for the inversions. Fig. 9(a) shows the resistivity map obtained from smoothness constrained inversion, which shows a thin layer of conductive topsoil and several very smooth resistive regions. The latter can be interpreted as either horizontal layers or local inclusions.

ERT inversion results of the Chenqi field example: (a) the resistivity model estimated by smoothness-constrained inversion. The inset shows a photo of the ERT line. (b) The mean resistivity model estimated by EKI. (c) The probability of a cell being in zone 2 estimated by EKI. (d) The standard deviation map obtained by EKI. (e) A histogram of the EKI-estimated resistivity values of each zone across realizations. The location of the surface electrodes are marked as black circles.
We ran an inversion using EKI as a 3 zone problem and its results inform the choice of prior ranges of resistivity values based on smoothness constrained inversion results. For EKI, we use a parameter grid consisting of 1 × 1 grid cells spanning an area of 250 m by 140 m is used for parameter estimation. The prior resistivities of zone 1, zone 2 (background) and zone 3 are [800,3000] Ω m, [150,650] Ω m and [10,100] Ω m, respectively. The posterior mean estimates of the three zones are 1717.4, 431.4 and 36.2 Ωm, respectively. Since we allow zonal resistivity values to vary between realizations, Fig. 9(e) shows the histogram of posterior estimates of resistivity values for each zone. They do not vary greatly, as the structure of resistivity field is the main control of the ERT response. Fig. 9(b) shows an image of the mean resistivity estimates from EKI, which shows clearly the interfaces between zones. The bottom resistive zone is more extensive than estimated by smoothness-constrained inversion, while the two resistive zones at the upper left corner of the domain appears to be localized inclusions. EKI also captures a localized conductive zone at X = 170 m, which does not appear in the smoothness-constrained inversion image. The resistivity maps obtained from EKI better captures the weathered features and soil infills that are expected in karstic systems like the ones at Chenqi, making the interpretation of the geometry of geological features from ERT images more intuitive. The high resistivity zone agrees with the occurrence of a very flat, slightly dipping mudstone in geological maps for the site Cheng et al. (2019).
EKI also return the uncertainty estimates. Fig. 9(c) shows the probability map of a given cell to be a member of zone 2. A low value means a high probability of being a member of either zone 1 and 3. In the cropped region displayed in Fig. 9, there are no cells that has a non-zero probability of belonging to both zones 1 and 3 (or all three zones), making the zone 2 probability map a helpful way to visualize uncertainty. It appears for most cells that the zone 2 probability is either 0 or 1. Alternatively, uncertainty can be visualized as a map of standard deviation across realizations of the posterior estimates (Fig. 9d). A high standard deviation is observed around interfaces, which is expected because in those cells there is significant probability for them to belong to more than one zone. This is expected in most ERT problems as the uncertainty within a structure is low and that along the interface of zones is high.
In this example, convergence was achieved in 29 iterations. The computation was performed on an Intel i7 workstation (quad core) laptop and the time required for the entire inversion was less than 5 hr, while a R2 inversion run takes less than two minutes.
3.2.3 Eggborough, Yorkshire, UK
At Eggborough (UK National Grid Reference SE 570 232), ERT and GPR surveys were conducted in 1999 (Binley et al. 2002; Cassiani & Binley 2005) to study the use of geophysical measurements for the parametrization of unsaturated hydraulic parameters. The data were later used to study the utility of joint inversion of ERT and GPR data (Linde et al. 2006; Bouchedda et al. 2012) and the influence of prior information on vadose zone parameters estimation in stochastic inversion Scholer et al. (2011). Cores were collected at Eggborough to study petrophysical models for direct current resistivity and induced polarization (Binley et al. 2005; Tso et al. 2019). In this example, we focus on the cross-borehole ERT survey for the panel R3–R4, where both surface and subsurface electrodes were used. The data set consists of 6690 reciprocal measurements. The forward modelling grid consists of 3596 rectangular cells. The EKI parameter grid is 40 m × 30 m, comprised of 19200 grid cells that are uniformly spaced at 0.25 m. The prior resistivity ranges of zone 1, zone 2 (background) and zone 3 are [10,80] Ω m, [120,180] Ωm and [200,400] Ω m, respectively. A 5 per cent data error level is assumed for the inversions.
Fig. 10(a) shows the resistivity estimates from a smoothness constrained inversion, using a 10:1 prescribed horizontal anisotropic ratio, which shows many horizontal features. The smoothness constrained inversion without such constraint is shown in Fig. 10(b) where some of the finer layers disappear. Fig. 10(c) shows the mean resistivity map estimated by EKI, and the mean estimates of resistivity values of the three zones are 15.66, 108.52 and 265.46 Ω m. Without any of the information of the resistivity structure pre-defined, EKI was able to estimate a highly layered resistivity field. Its estimates also appears to be more realistic by showing irregular edges while the smoothness-constrained inversion shows almost horizontal layers. EKI also returns uncertainty estimates in Figs 10(d) and (e). The zonal uncertainty appears to be particularly high at corners of zone boundaries (Fig. 10d).

ERT inversion results of the Eggborough field example: the resistivity model estimated by smoothness-constrained inversion (a) with a 10:1 vertical anisotropy constraint and (b) without anisotropy constraint. (c) The mean resistivity model estimated by EKI. (d) The probability of a cell being in zone 2 estimated by EKI. (e) The standard deviation map obtained by EKI. (f) The EKI mean resistivity from (b) is compared against gamma-ray logs and cross-borehole ground penetrating radar (GPR) results. The black circles in (a–e) denotes electrodes locations. Note that the surface electrodes extends for 5 m more in each direction.
Gamma logs and cross-hole radar measurements were also taken at Eggborough are compared against the EKI ERT results (Fig. 10f). Two shallow (4–5 m depth) high gamma counts region appear on R4 and E4 logs and appear to extend, albeit with a weaker signal, towards E3. This suggests a localized thickening of two clay rich (siltstone) units towards E4. Note that the elevated gamma counts are indicative of higher clay content, which lead to lower resistivity because of the impedance to vertical unsaturated flow. Elevated moisture content in this zone is also revealed by the low radar velocity between R3 and R4. These two shallow siltstone units align with the low resistivity zone in the inverted model; this feature does not extend to E3. The deeper high gamma counts region at 12 m also appears in the resistivity maps but is somewhat weaker because the moisture content is lower (as show in the radar profile). Finally, zone 1 at depths greater than 10 m correspond to a well-drained /coarser sandstone. The radar profiles indicate dry zones around 11 and 14 m, which corresponds to the zone 1 locations at R3 and R4. The comparison here illustrates the zones identified by EKI may not be the same geological unit throughout and care must be taken in their interpretation. Specifically, a EKI ERT inversion only define zones based on electrical signatures. It is expected that two or more facies of similar electrical signatures will be considered as the same zone by EKI. Binley et al. (2004) comment that shallow geology consists of a series of fining upward sequences of 1 m to 3 m thick, grading from medium-grained to fine-grained sandstone. The EKI ERT inversions appear to accurately identify the position of these sequences within the profile of the unsaturated sandstone.
In this example, convergence was achieved in 19 iterations. The computation was performed on an Intel i7 workstation (quad core) laptop and the time required for the entire inversion is less than 3 hr.
4 DISCUSSION
4.1 Interpreting EKI results
An ensemble Kalman inversion has been applied to ERT inversion, for the first time, to obtain resistivity images of the Earth’s subsurface. The method is highly efficient and Table 1 shows all the test cases converge within a small number of iterations. It has been shown in previous theoretical studies that EKI can be treated as both a Gaussian approximation in sequential Monte Carlo approaches Iglesias et al. (2018) and a regularizing, iterative optimizer that is an approximation to the Levenberg-Marquardt solution (Iglesias 2016; Muir & Tsai 2020). This shows that EKI has preserved some for the strengths for both MCMC-type inversions and smoothness-constrained inversion. The level-set parametrization improves the performance of EKI in terms of both improving the well-posedness of the inverse problem (Iglesias et al. 2014) and allowing better estimation at interfaces (Iglesias et al.2016).
Convergence statistics of all EKI test cases presented in this paper. WRMS stands for data error-weighted root-mean-squared errors defined in eq. (33). 300 forward model runs were used in all test cases.
Convergence statistics of all EKI test cases presented in this paper. WRMS stands for data error-weighted root-mean-squared errors defined in eq. (33). 300 forward model runs were used in all test cases.
In the presence of two or more materials with distinct resistivity ranges, it is well-known that the smoothness-constrained inversion will return a smoothed resistivity image and the range of resistivity values are much smaller than the true range. This is because a sharp change in resistivity can be seen as a violation of assumption of the smoothness constrained inversion so the algorithm tries to restrict it as much as possible. Such features can limit the use of inverted resistivity values for further analysis (e.g. via petrophysical transforms). We observe from the results reported here that our EKI with level-set parametrization approach tends not to suffer from these issues. The zone boundaries are well-recovered (even if they are arbitrarily shaped), the inverted resistivity values are in the same range as the actual ones, and the high uncertainty areas are not pre-dominated by their distance from the electrode array. Unlike smoothness constrained inversion, sharp interfaces are built into the prior distribution of prior models so the inverted resistivity image does not rapidly lose resolution when there is a sharp change in resistivity. EKI estimates is also affected by losing resolution away from electrodes, but less so than smoothness-constrained inversion. In our examples, we observe areas with pronounced artefacts in our EKI results, but they are located outside of the ERT imaging regions and are cropped out.
While classical inverse methods include a useful framework for image appraisal and uncertainty quantification, their interpretation may not be intuitive. For example, when repeating smoothness constrained inversion runs with perturbed data using Monte Carlo analysis, the uncertainty areas tend to be highest near the electrodes and decreases further away (Tso et al. 2017, 2019), which conveys more about the geometry of the electrode array than the uncertainty pattern of the ERT problem. This pattern is caused by low resolution in the smoothness-constrained inversion away from electrodes (as shown, for example, from a plot of the diagonal of the resolution matrix). These apparent low uncertainty zones can appear to be misleading for uncertainty analysis (Tso et al. 2019; Tso 2019). The above issue can be solved partially by bootstrap sampling (Yang et al. 2014), nevertheless it is an uncertainty analysis on smooth images. Our EKI approach provides additional useful estimates of uncertainty by means of posterior variance maps and zonal probability maps. Unlike smoothness constrained inversion, these maps appear to be more realistic and do not show the misleading correlation between proximity to electrodes and variance of inverted resistivity. Nevertheless, the posterior uncertainty tends to be very low away from the interfaces (or zonal membership very close to 0 or 1). It is important to note that EKI uses a small ensemble and each of the ensemble member move towards the misfit minima (and we have not specified anything to keep them apart). So the value of the uncertainty estimates should not be treated as the full parameter uncertainty. Such considerations should be taken account when interpreting EKI uncertainty estimates, especially if they are propagated to subsequent analysis (e.g. they may need to be inflated). Nevertheless, the uncertainty pattern returned by EKI is very helpful to guide interpretation and further analysis.
The prior resistivity ranges of the different zones do not seem to show a great impact to the EKI results. However, resistivity ranges between zones should not overlap in order for the level-set parametrization to work properly. Our method also works best when sharp interfaces are expected and there are known contrasts between the resistivity of the two zones. It is possible that it will mistake a gradual change in resistivity for an abrupt one. Precautions should also be taken when interpreting low-uncertainty regions returned by EKI. For example, when EKI estimates gives low variance and zonal probability equals 0 or 1. It only represents the uncertainty given the inverse problem setup (e.g. assumptions or number of zones and prior resistivity ranges). These factors needs to be taken into consideration when interpreting (and propagating) EKI uncertainty maps.
4.2 Comparison with MCMC methods
As mentioned in the introduction, smoothness-constrained inversions are not suitable for uncertainty quantification because it tends to show low sensitivity areas as low variance (i.e. uncertainty) areas. Therefore, such uncertainty estimates are not helpful when used to propagate uncertainties in subsequent analysis. While there have been efforts to mitigate such issues by accounting for such effects (e.g. Nussbaumer et al. 2019), using smoothness-constrained inversion results for uncertainty quantification remains challenging, especially for property fields with discontinuities.
In contrast, MCMC inversion has been considered as the gold standard for uncertainty modelling in many disciplines including hydrogeophysics. Many consider it as fully Bayesian and has advantages such as thoroughly sampling the parameter space and allowing estimation of posterior pdfs with multiple modes. However, its very high computing effort (i.e. normally requiring 103−106 model runs) prohibits its use in most practical ERT applications.
In the MCMC ERT inversions (e.g. Zahner et al. 2016; Galetti & Curtis 2018), high uncertainty bands can be observed near the interface between zones. For example, the trans-dimensional MCMC ERT inversion of Galetti & Curtis (2018) obtained a very detailed full posterior pdf at each location and they showed high uncertainty areas where the posterior pdf is flat and multiple modes are present. These bands are helpful to visualize uncertainty in parameter estimation and they have not been reported using other methods. Although the EKI cannot obtain the full posterior pdf as in MCMC methods, we observe regions of high uncertainty (i.e. variance) around the zonal interfaces in our synthetic examples. This shows that the EKI method can offer a decent approximation of location of the interfaces between zones.
4.3 Applicability and potential extensions
In the last few years, EKI or ensemble Kalman filter (EnKF) methods have been used increasingly in hydrogeophysics. For example, Tso et al. (2020) used ensemble smoother for multiple data assimilation (ES-MDA, Emerick & Reynolds 2013) to estimates leak parameters for field-scale leak events from ERT data. Kang et al. (2018, 2019) used EnKF to estimate DNAPL distribution from ERT data of sandbox experiments. Claes et al. (2020) used ES-MDA to calibrate zonal K values of a watershed model from ERT data. However, EKI/ES-MDA has not been used for ERT or geophysical inversion or imaging before (although derived ERT data such as traveltimes or spatial moments have been used). Without the level-set parametrization and the use of hyper-parameters as described in this work, many geophysical imaging problems would have too many unknowns to be practical. It would also be challenging to generate prior geophysical parameter fields. However, as highlighted above that EKI/ES-MDA are well suited for solving coupled inverse problems thanks to its derivative-free formulation. EKI with level-set formulation may open up future opportunities for coupled hydrogeophysical inversion of heterogeneous fields.
As proposed by Cardiff & Kitanidis (2009) in the groundwater/hydrogeophysics literature, the level-set parametrization is an extensible framework for facies detection that is suitable for joint or coupled inversion (e.g. hydraulic head and seismic velocity) and uncertainty quantification. However, the uptake has been slow and it has not been used in practical ERT problems [although they have been included in theoretical work (e.g. Chung et al. 2005; Aghasi et al. 2011; Chada et al. 2018)]. As shown here, EKI significantly reduces the computation cost and thus lower the barrier for Bayesian inversion and uncertainty quantification. The use of EKI as a parameter estimation method makes it computationally much less intensive than other level-set methods previously reported in the hydrogeophysics literature Cardiff & Kitanidis (2009). EKI is also much very flexible to incorporate data from different modalities due to the ‘black-box’ nature (i.e. derivative-free) of its inversion formulation. It has been increasingly recognized that one can constrain a smoothness constrained inversion by geostatistical data (by modifying the a priori model covariance matrix, e.g. Hermans et al. 2012). As shown in Muir & Tsai (2020) recently in their deep-earth seismic work, this can be done within the EKI but with much greater flexibility (see also comments in section 2.4). Their work and ours also underscores the great potential and readiness to apply EKI with level-set parametrizations to a wide variety of geophysical problems (specifically those where somewhat discrete interface(s) between zones are expected) and measurement methods. They work best when the geophysical parameter ranges between zones do not overlap. For example, they can be used to identify layer boundaries from ground penetrating radar (GPR) or electromagnetic induction (EMI) profiles by flexibly introducing 2-D or 3-D zonal structure as prior information.
A straightforward extension of this work is to consider its best formulation in time-lapse ERT problems. Difference inversion Labrecque & Yang (2001) has been a standard for smoothness constrained inversion of time-lapse ERT surveys. Future studies should verify whether the current EKI formulation is suitable for time-lapse ERT studies and whether a difference inversion can further improve the method for time-lapse ERT use. Likewise, the examples considered here contains extensive features (e.g. inclusions, layers). Given the flexibility of its parametrization, it will be worth considering its use in imaging discrete features (e.g. planar hydraulic fractures; Wu et al. 2019).
Smoothness constrained inversion will probably continue to be the most popular method for ERT inversion. It is a very helpful tool to evaluate ERT data in the first instance. However, the EKI method described here has significantly lower the barrier of entry for performing uncertainty quantification and reliably identify zonal interfaces for ERT inversion. Its use should be encouraged to improve the interpretability of ERT.
5 CONCLUSIONS
We have described a computationally efficient method, based on EKI with level-set parametrization, for ERT inversion that is suitable for uncertainty quantification and imaging (potentially discontinuous) multiscale features. In particular, our field examples show that it can handle arbitrarily shaped layers and inclusions (without specifying anisotropy). Our method is efficient enough for solving 2-D problems in a personal computer and 3-D problems in a small computer cluster. Its computational cost is a fraction of that of MCMC methods, while also circumventing apparent issues with uncertainty quantification when using smoothness-constrained inversions. Importantly, it produces uncertainty estimates that are helpful for interpretation and are fit-for-purpose for uncertainty propagation. Common limitations for ensemble Kalman filter-type methods apply to our approach, such as the use of a small number of samples may lead to a collapse of posterior samples (and thus an underestimation of posterior uncertainty), and the breakdown of assumption of Gaussian distributed posteriors in non-linear problems (given a Gaussian prior and a Gaussian error model). However, this can be seen an acceptable compromise for a great computational speedup in many applications (e.g. test cases herein). The level-set parametrization (i.e. through the Whittle–Matérn hyperparameters) we use provides a flexible framework to handle spatially varying changes in the resistivity field and it does not require strong assumptions on the length scales or structure of the resistivity fields. Such method contributes to the continuing effort to improve resistivity imaging of material interfaces and to broaden the use of uncertainty quantification in ERT applications. We have also highlighted its applicability and readiness to be applied in other geophysical problems or measurements. It is our intention that it will serve as a useful tool in a wide range of hydrogeophysics applications.
ACKNOWLEDGEMENTS
This paper is published with the permission of the executive director of the BGS. Michael Tso was supported by a Lancaster University PhD studentship and a Nuclear Decommissioning Authority (UK) bursary.
We thank Tim Johnson for help with using E4D, and University of Nottingham and BGS for access to computing resources. The field data used originated from Lancaster–NERC grants: NE/N007409/1 (Chenqi); NE/F004958/1 (Borth); NER/A/S/2001/01175 (Eggborough). Field assistance from Lancaster researchers and students Qinbo Cheng (Chenqi), Adeola Asunbo & Nick Kettridge (Borth) and Peter Winship (Eggborough) is acknowledged. We acknowledge the helpful comments from editor Alexis Maineult, Niklas Linde, and an anonymous reviewer.
Data availability statement:
Example codes to implement the method is available on GitHub (https://github.com/cmtso/EKI_geophysics_2020) (Tso & Iglesias 2021).
Footnotes
The Bayesian formulation on functional spaces has been rigorously formulated in Stuart (2010). We can alternatively think of σ(x) as a multivariate random variable consisting of nodal/cell values of σ(x) on a finite-dimensional domain (i.e. a finite element mesh).