Inversion of time domain three-dimensional electromagnetic data

We present a general formulation for inverting time domain electromagnetic data to recover a 3-D distribution of electrical conductivity. The forward problem is solved using finite volume methods in the spatial domain and an implicit method (backward Euler) in the time domain. A modified Gauss-Newton strategy is employed to solve the inverse problem. The modifications include the use of a quasi-Newton method to generate a pre-conditioner for the perturbed system. Data can be acquired in the air, on the ground, or in boreholes, and from an arbitrary grounded or ungrounded source. Three synthetic examples illustrate the basic functionality of the algorithm, and a larger-scale field example demonstrates its practical applicability.

A main impediment to inverting time domain electromagnetic data is the size of the problem and the number of computations to be carried out. To be more specific, assume that the forward problem (in continuous space) is written in the form

A(m) u = b,

where A(m) is some version of Maxwell's equations (including boundary conditions), m = log(σ) is the log conductivity, u stands for the fields and b represents sources and boundary values. We assume for simplicity of exposition that A is invertible for any relevant m, that is, there is a unique solution to the forward problem.
In the inverse problem we measure some function of the fields and desire to recover the model m. Let us write the measured data as

d = Q u + ε,

where Q is a measurement operator which projects the fields (or their derivatives or integrals) onto the measurement locations in 3-D space and time, and ε is the measurement noise. The data are finite in number and contaminated with noise, and therefore, there is no unique solution.
To obtain a single model which depends stably on the data we incorporate a priori information and formulate the inverse problem (in continuous space) as a constrained optimization problem of the form

min_{m,u} (1/2) ||Q u − d||² + β R(m)   subject to   A(m) u = b.   (3)

Here, β > 0 is the regularization parameter, and R(·) is a regularization operator reflecting our a priori information. Typically, we assume that m is a piecewise smooth function over the spatial domain Ω in 3-D, so we design an R(·) that involves some norm of ∇m over Ω, for example, weighted L2 or L1 or a Huber combination (Huber 1964; Farquharson & Oldenburg 1998)

ρ_θ(t) = t²/2 for |t| ≤ θ,   ρ_θ(t) = θ|t| − θ²/2 for |t| > θ,

where θ can be seen as a jump parameter. For gradients with values smaller than θ smooth solutions are assumed, but for large gradients discontinuities are allowed. This type of regularization can be written as

R(m) = ∫_Ω ρ(|∇m|) dv + γ ∫_Ω (m − m_ref)² dv,

where ρ is given by the Huber function, m_ref is some background reference model and γ is a user-specified constant that adjusts the relative influence of the two terms in the regularization functional. The choice of this functional ensures that the weak form has Dirichlet boundary conditions and that in regions far away from where we have data, the model converges to a known background. This choice also guarantees that the Hessian of the regularizer is invertible. Next, the problem (3) is discretized using some finite volume method over a finite grid representing the domain in space and time. This yields the finite dimensional optimization problem

min_{m,u} (1/2) ||Q u − d||² + β R(m)   subject to   A(m) u = b,   (5)

where u, m and b are grid functions ordered as vectors and correspond to their continuous counterparts above, and Q and A are large, sparse matrices. The matrix A depends on m and is non-singular. The discrete regularization function has the form

R(m) = e^T W ρ(∇_h m) + γ ||m − m_ref||²,

where ∇_h is the discrete gradient operator, e is a vector of ones and W is a matrix. In previous work, we have used an inexact, all-at-once methodology (Biros & Ghattas 2005; Ascher & Haber 2003) to solve (5), solving the forward problem and the inverse problem simultaneously in one iterative process.
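The Huber combination above can be made concrete with a short numerical sketch. The code below is an illustrative 1-D analogue, not the paper's 3-D implementation: `huber` switches from quadratic to linear growth at the jump parameter θ, and `regularizer` combines the Huber norm of the discrete gradient with a quadratic pull toward a reference model. The function names and the 1-D grid are our own assumptions.

```python
import numpy as np

def huber(t, theta):
    """Huber penalty: quadratic for |t| <= theta, linear beyond.

    Small gradients are penalized quadratically (smooth models);
    large gradients are penalized linearly (discontinuities allowed)."""
    t = np.asarray(t, dtype=float)
    return np.where(np.abs(t) <= theta,
                    0.5 * t**2,
                    theta * np.abs(t) - 0.5 * theta**2)

def regularizer(m, m_ref, h, theta, gamma):
    """1-D discrete analogue of R(m): Huber norm of the gradient plus a
    quadratic pull toward the reference model (grid spacing h)."""
    grad_m = np.diff(m) / h                       # discrete gradient
    return h * huber(grad_m, theta).sum() + gamma * h * ((m - m_ref)**2).sum()
```

A blocky model with one sharp jump is charged only linearly for that jump, whereas a quadratic (L2) penalty would charge it quadratically; this is why the Huber choice tolerates discontinuities.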
While this approach can be highly efficient, it requires storage of the time-space history of all of the fields. In practical field situations, where we are concerned with large volumes, small cells, and multiple transmitters, this approach is not yet feasible.
In this work, we aim to generate a practical time-domain inversion that is applicable to large-scale problems and multiple sources. For such problems one cannot store all the fields on computer hardware that is currently available to us, and we therefore turn to other avenues for the solution of the problem. A simple way to reduce storage is to use a reduced space method, which eliminates Maxwell's equations, that is, solves for the field u given the model m. This yields the unconstrained optimization problem

min_m φ(m) = (1/2) ||Q A(m)^{-1} b − d||² + β R(m).   (7)

This leads to the Euler-Lagrange system

g(m) = J(m)^T (Q A(m)^{-1} b − d) + β R_m(m) = 0,   (8)

where J(m) = ∂(Q A(m)^{-1} b)/∂m is the sensitivity matrix (see Haber et al. 2000, for derivation) and R_m(m) = ∂R/∂m.
We search for an m that solves (8) and yields a minimum of the function. The evaluation of g(m) involves a number of computations. To calculate R_m(m) we use the chain rule. Also, the forward problem must be solved to generate predicted data, and calculating the action of J(m)^T on a vector requires the solution of the adjoint problem. If a gradient descent type method is used to solve (8), then each iteration requires the solution of a forward and an adjoint problem, which means solving Maxwell's equations forward and backward in time. In addition, a line search involving forward modelling must be carried out. In this, and in all other descent methods for solving the optimization problem, a globalization strategy is necessary to guarantee convergence to a local minimum (rather than a saddle point or a maximum). Here we have used an Armijo line search strategy (see Nocedal & Wright 1999).
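The forward/adjoint bookkeeping described above can be illustrated on a toy problem. In the sketch below a hypothetical diagonal operator A(m) = diag(exp(m)) stands in for the Maxwell forward operator and a random matrix Q for the measurement operator; one forward solve plus one adjoint solve produces the gradient of the data misfit, which we verify against finite differences. All names here are illustrative, not the paper's solver.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 8
b = rng.normal(size=n)               # hypothetical source vector
Q = rng.normal(size=(3, n))          # hypothetical measurement operator
d = rng.normal(size=3)               # hypothetical observed data

def A(m):
    # toy stand-in for the forward operator: A(m) = diag(exp(m))
    return np.diag(np.exp(m))

def misfit(m):
    u = np.linalg.solve(A(m), b)     # forward solve
    r = Q @ u - d
    return 0.5 * r @ r

def gradient(m):
    u = np.linalg.solve(A(m), b)             # forward problem
    r = Q @ u - d
    lam = np.linalg.solve(A(m).T, -Q.T @ r)  # adjoint problem
    # G = d(A(m)u)/dm = diag(exp(m) * u); gradient of the misfit is G^T lam
    return np.exp(m) * u * lam

# verify the adjoint-state gradient against central finite differences
m = rng.normal(size=n)
g = gradient(m)
eps = 1e-6
g_fd = np.array([(misfit(m + eps * np.eye(n)[i])
                  - misfit(m - eps * np.eye(n)[i])) / (2 * eps)
                 for i in range(n)])
```

The point of the sketch is the cost structure: one forward solve and one adjoint solve give the full gradient, whereas finite differences would need one forward solve per model parameter.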
Newton-type methods are generally far superior to steepest descent methods when solving (8). If a Gauss-Newton method is used, then at each iteration we solve the linear (but dense) system

(J(m)^T J(m) + β R_mm) s = −g(m),   (9)

where s is a model perturbation. This system is never formed explicitly but is solved using conjugate gradient (CG) methods. At each CG iteration J(m) and J(m)^T are applied to a vector, and hence a forward and an adjoint problem must be solved. Even with reasonably good pre-conditioners, a few tens of CG iterations are likely required to obtain a good estimate of s. This results in many forward modellings, and hence the cost of a Newton-CG algorithm is also too high. We therefore turn to quasi-Newton techniques.
In a quasi-Newton method we solve a system

B s = −g(m),

where B is an approximation to the Hessian. One of the most successful methods for unconstrained optimization is the limited memory BFGS (L-BFGS) method (Nocedal & Wright 1999). In a BFGS method either B or its inverse, B^{-1}, is built up through successive iterations. The main computations at each iteration are construction of the gradient and performing a line search. The constituent vectors for constructing B need to be stored; the limited memory version of BFGS addresses this issue by keeping only a fixed number of them. While the method is highly attractive for inverse problems, naive implementation of L-BFGS usually fails (see Haber 2005, for details).
In this paper, we build on our prior work and improve our inversion algorithm by introducing new inversion strategies.
(1) We combine the strengths of Gauss-Newton and quasi-Newton approaches.
(2) We alter the usual Tikhonov methodology to use an iterated Tikhonov approach that partially overcomes difficulties in solving the optimization problem when the regularization parameter is small.
(3) We explore a particular implementation of L-BFGS and use the constructed approximate inverse Hessian as a pre-conditioner for the Gauss-Newton iteration. This substantially reduces the number of CG iterations in solving (9).
(4) We present new strategies for dealing with some of the practical issues mentioned earlier. In particular we introduce a source correction procedure that allows us to reduce the spatial volume of the region to be inverted and to ameliorate discretization errors. We also introduce a technique to work with problems that have an extended time source.
The paper is divided as follows. In Section 2, we describe the solution of the forward problem. In Section 3, we discuss how to modify the right-hand side of the forward problem so that the size of the discretized domain can be made smaller. In Section 4, we discuss the overall approach to the solution of the problem and motivate the use of quasi-Newton methods. In Section 5, we discuss efficient implementation of quasi-Newton methods. In Section 6, we discuss how to deal with extended source waveforms. In Section 7, we present three synthetic examples that illuminate the basic functionality of the inversion, and we also provide results from a larger-scale field example. The paper is summarized in Section 9.

S O L U T I O N O F T H E F O RWA R D P RO B L E M
In this section, we present our forward problem, Maxwell's equations, and discuss solution procedures suitable for the parameter regimes of interest. This section is a summary of our previous work (Haber et al. 2004) and we include it here for completeness.

Reformulation and discretization
The time-dependent Maxwell equations can be written as

∇ × E + μ ∂H/∂t = 0,
∇ × H − σ E − ε ∂E/∂t = s,   (11)

over a domain Ω × [0, t_f], where E and H are the electric and magnetic fields, μ is the permeability, σ is the conductivity, ε is the permittivity and s is a source. The equations are given with boundary and initial conditions, for example

n × H = 0 on ∂Ω,   E(x, 0) = E_0,   H(x, 0) = H_0,   (11c)

although other boundary and initial conditions could be used. It is possible to verify that the above system is very stiff (Ascher & Petzold 1998; Haber et al. 2004). This implies that typically many time scales exist and careful numerical treatment is needed in order to solve the problem. We therefore turn to implicit methods, and use a Backward Difference Formula (BDF) type method. For the work here we choose the simplest, lowest order member of this family of stiff integrators, namely, the backward Euler method. This simplifies the presentation; however, we have also implemented a BDF(2) integrator and the reader is referred to Haber et al. (1999). Semi-discretizing (11), (11c) over a time step [t_n, t_{n+1}] yields the equations

∇ × E^{n+1} + α_n μ (H^{n+1} − H^n) = 0,
∇ × H^{n+1} − σ E^{n+1} − α_n ε (E^{n+1} − E^n) = s^{n+1},   (12)

where α_n = (t_{n+1} − t_n)^{-1}. The superscripts in (12) denote the time level, with solution quantities at n + 1 being unknown while those at n are known. Rather than working with the fields E^{n+1} and H^{n+1}, we first eliminate H^{n+1} from (12),

H^{n+1} = H^n − (α_n μ)^{-1} ∇ × E^{n+1}.

Substituting, we obtain an equation for E^{n+1},

∇ × (α_n μ)^{-1} ∇ × E^{n+1} + (σ + α_n ε) E^{n+1} = ∇ × H^n + α_n ε E^n − s^{n+1}.   (13)

As discussed in Haber et al. (2004), the discretization of the system (13) is difficult to solve using iterative methods. Introducing the potentials A, φ to write E = A + ∇φ, we get

∇ × (α_n μ)^{-1} ∇ × A − ∇ (α_n μ)^{-1} ∇ · A + σ̄ (A + ∇φ) = ∇ × H^n + α_n ε E^n − s^{n+1},
∇ · (σ̄ (A + ∇φ)) = ∇ · (∇ × H^n + α_n ε E^n − s^{n+1}),   (14)

where we have subtracted the stabilizing term ∇ (α_n μ)^{-1} ∇ · A from the first equation. A discretization of the system (14) is better conditioned than (13) because the null space of the curl is eliminated by the stabilizing term. For ease of notation we have introduced σ̄ = σ + α_n ε. The continuous equations are discretized in space using a finite volume technique on a staggered grid (see details in Haber et al. 2004).
Variables A and E are defined at the centres of cell faces, H is defined on the edges, and φ in the cell centres. Upon discretization of (14) we obtain, at each time step, a linear system of the form

(L_μ + M_σ̄) A^{n+1} + M_σ̄ G φ^{n+1} = r^{n+1},
D M_σ̄ A^{n+1} + D M_σ̄ G φ^{n+1} = D r^{n+1},   (15)

where r^{n+1} collects the known right-hand side terms, L_μ = C_e M_μ C_f − G M^c_μ D, and the matrices C_e, C_f, G and D are, respectively, the discretizations of the curl on the edges and faces, the gradient and the divergence. The matrix M_σ̄ is a discretization of σ + α_n ε on the cell faces. It is composed of harmonic averages of σ̄ for the two cells in contact at a face. The matrix M_μ consists of arithmetic averages of the permeability for cells adjacent to an edge. Finally, the matrix M^c_μ is a discretization of μ in cell centres and it assumes that μ is a scalar quantity (rather than a tensor).
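The appeal of backward Euler for this stiff system can be seen on a small analogue. The sketch below time steps a 1-D diffusion equation, a hypothetical scalar stand-in for the EM diffusion operator: the implicit update (I − Δt L)u^{n+1} = u^n remains stable for a time step at which explicit forward Euler blows up.

```python
import numpy as np

n, dt, steps = 50, 1e-3, 50
h = 1.0 / (n - 1)
# 1-D Laplacian with (approximate) Dirichlet boundaries: a toy stand-in
# for the stiff spatial operator of the semi-discretized Maxwell system
L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
     + np.diag(np.ones(n - 1), -1)) / h**2

x = np.linspace(0.0, 1.0, n)
u0 = np.exp(-100.0 * (x - 0.5)**2)            # initial field, a narrow bump

u_be = u0.copy()
u_fe = u0.copy()
I = np.eye(n)
for _ in range(steps):
    # backward Euler: unconditionally stable, one linear solve per step
    u_be = np.linalg.solve(I - dt * L, u_be)
    # forward Euler: dt exceeds the explicit stability limit here and diverges
    u_fe = u_fe + dt * (L @ u_fe)
```

The implicit step costs a linear solve, which is exactly why efficient solvers and pre-conditioners for the per-step system (15) matter.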
This linear system can be solved using standard iterative methods (Saad 1993) and effective scaleable pre-conditioners can be designed for it (Aruliah & Ascher 2002;.

Summary-the forward problem
In many applications we are concerned with multiple sources and with multiple (s) time steps. For solving the inverse problem it is useful to formulate the forward problem as a single equation (see e.g. Section 1). We note that it is possible to rewrite the system as a lower block bidiagonal system for A, φ and H of the form

A(m) u = b,   A(m) = [ A_1(m); B_2, A_2(m); …; B_s, A_s(m) ],   (16)

where u^k collects the fields at the kth time step, the diagonal blocks A_k(m) are the implicit time stepping matrices, and the subdiagonal blocks B_k couple step k to step k − 1. In the case of multiple sources we obtain a block diagonal system where each block has the same structure as (16). Note that only the diagonal blocks in (16) depend on the conductivity. Also, once we have an efficient solver for one block A_k(m), solving the forward problem (16) is straightforward (with the cost of solution increasing by a factor of s compared to the cost of solving for one block).
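The lower block bidiagonal structure of (16) means that forward modelling is a block forward-substitution: one solve with A_k per time step. The sketch below demonstrates this with random placeholder blocks, not the actual Maxwell matrices, and checks the result against a monolithic solve.

```python
import numpy as np

def forward_march(A_blocks, B_blocks, b_blocks):
    """Solve a lower block-bidiagonal system by forward substitution.

    Diagonal blocks A_k come from the implicit time step; subdiagonal
    blocks B_k couple step k to step k-1. One linear solve per step;
    only the current field needs to be in memory at any time."""
    u = []
    u_prev = np.zeros(b_blocks[0].shape)
    for A_k, B_k, b_k in zip(A_blocks, B_blocks, b_blocks):
        u_prev = np.linalg.solve(A_k, b_k - B_k @ u_prev)
        u.append(u_prev)
    return u
```

Because only the A_k depend on the conductivity, a factorization or pre-conditioner built for one block can be reused across the march when the time step is constant.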

C O R R E C T I V E S O U RC E S
Our boundary condition requires the domain to be large enough such that the magnetic fields are dramatically reduced. This may require that a very large domain be modelled. This is especially true if the sources are far from the receivers, as they are in many EM surveys.
To make the problem tractable we introduce a correction procedure which has many elements of a primary/secondary field formulation. The methodology can be used to reduce the size of the problem and also to provide a first order correction for discretization errors. We first write the steps in a generic manner and then show how this is implemented in the time domain equations.
Let A denote a general operator which depends upon m, our physical property of interest (i.e. electrical conductivity), ũ a field, and q̃ a source; m, ũ and q̃ are all functions related through

A(m) ũ = q̃.   (17)

The above equations are also provided with boundary conditions. Analytic, or quasi-analytic, solutions are available only for simple situations, for example, finding the fields in a half-space due to an electric or magnetic source. For realistically complicated earth models we obtain a numerical solution by solving a discrete system

A_h(m) u_h = q_h,   (18)

where A_h is a matrix that results from the discretization of A, u_h is a vector of (discrete) field values, and q_h is a vector that contains the discretized source and boundary conditions. Let m_0 be a model for which we can generate an exact solution to eq. (17) and let the discrete field values be ũ(m_0). It is unlikely that ũ(m_0) is an exact solution of the discrete eq. (18). The residual is

r = A_h(m_0) ũ(m_0) − q_h.

This discrepancy can be regarded as a corrective source and added to the discrete equations. Since the source does not depend on m, we solve a corrected equation

A_h(m) u_h = q_h + r.

It is straightforward to show that when m = m_0 the solution of the discrete equations is u_h = ũ(m_0). So the numerical solution is equal to the 'exact' solution for our reference problem. The above correction can be used for the following two modelling problems.
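The correction procedure can be demonstrated on a 1-D toy problem. Below, a hypothetical coarse discretization A_h of −(e^m u′)′ (our own stand-in, not the Maxwell operator) is corrected with the residual of an analytic solution for m_0 = 0; the corrected discrete solve then reproduces that solution exactly at m = m_0, despite the discretization error.

```python
import numpy as np

n = 20
h = 1.0 / n
x = np.linspace(h, 1 - h, n - 1)             # interior nodes, Dirichlet BCs

def A_h(m):
    """Coarse 1-D discretization of -(d/dx)(exp(m) du/dx); m lives on the
    n cell faces, the solution on the n-1 interior nodes."""
    k = np.exp(m)                             # conductivity-like coefficient
    main = (k[:-1] + k[1:]) / h**2
    off = -k[1:-1] / h**2
    return np.diag(main) + np.diag(off, 1) + np.diag(off, -1)

m0 = np.zeros(n)                              # reference model
u_tilde = np.sin(np.pi * x)                   # "exact" solution for m0
q_h = np.pi**2 * np.sin(np.pi * x)            # discretized source

# corrective source: residual of the exact solution in the discrete equations
r = A_h(m0) @ u_tilde - q_h

# corrected discrete problem: A_h(m) u_h = q_h + r, evaluated at m = m0
u_h = np.linalg.solve(A_h(m0), q_h + r)
```

At m = m_0 the correction cancels the discretization error by construction; for m near m_0 it still provides a first-order correction, which is the sense in which it is used in the inversion.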
(1) Discretization errors: The corrective source is a set of currents that compensate for the differences between the discrete and continuous models for a specific conductivity. This is particularly useful for dealing with the current source. In geophysical field applications the source current is carried by a small wire, whereas in the discrete formulations the current is assumed to be distributed through the face of a cell. If a receiver is close to a transmitter, the difference in fields can be significant.
(2) Reducing the size of the problem: In CSAMT problems, or large loop UTEM surveys, the data are acquired in a region that is significantly removed from the source. Our data area, and region of interest, is much smaller than the volume that contains the transmitter and receivers. If the discretized volume excludes the transmitter, the corrective source will primarily consist of currents on the boundary of the volume.
The corrective sources provide the needed compensation for a specific conductivity. We have no mathematical procedure for evaluating how well the correction terms work when the conductivity changes significantly from m 0 . However, we are able to iteratively change m 0 in the course of the inversion process. We have proved in Haber (2006) that such a process is a convergent one.
We now apply the above methodology to our time domain problem. Assume we have calculated the current density J^n_0 and the magnetic field H^n_0, n = 1, …, N, that correspond to the conductivity σ_0 and the source s. This can be done by analytic formulas (Ward & Hohmann 1988) or by using 1-D codes that allow for the calculation of highly accurate fields given a simplified earth model. The corresponding (discrete) electric field is

E^n_0 = σ_0^{-1} J^n_0.

Our basic equation is (13), and thus the corrective source for time t^{n+1} is obtained by substituting E_0 and H_0 into (13) and computing the residual; eq. (22) denotes the equivalent expression in the A-φ formulation. Note that upon using s_0 in (13), and the equivalent A-φ formula (22) with the background conductivity σ_0, we obtain the fields E_0 and H_0. The right-hand side of the equation is zero (up to discretization errors) where there are no electric or magnetic sources. However, the right-hand side is different from zero on the boundary of our domain. This can be thought of as an artificial source which emulates the source influence on a finite grid.
Numerical experiments show that the formulation (22) allows us to dramatically reduce the volume under consideration without sacrificing accuracy.

I N V E R S I O N F R A M E W O R K A N D A L G O R I T H M S
As discussed in Section 1, we aim to solve the minimization problem (7). There is one further complication: in general, the regularization parameter is unknown a priori. Therefore, we need to solve the optimization problem a few times for different regularization parameters. The 'optimal' regularization parameter has to be chosen using an appropriate criterion such as the discrepancy principle (Parker 1994) or Generalized Cross Validation (GCV) (Hansen 1998; Wahba 1998; Haber & Oldenburg 2001). A general sketch of the algorithm that achieves this goal is given in Algorithm 1, which is a common implementation of Tikhonov regularization. However, solving the optimization problem (7) becomes increasingly harder as the regularization parameter approaches zero. One way to alleviate this difficulty is to use an iterated Tikhonov regularization (Engl et al. 1996).
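The cooling strategy of Algorithm 1 can be sketched as follows, using a hypothetical linear forward map G in place of the full Maxwell simulation: solve the Tikhonov problem at the current β, check the discrepancy principle, and cool β. All names and the cooling factor are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 30
G = rng.normal(size=(n, n)) / np.sqrt(n)     # hypothetical linear forward map
m_true = np.zeros(n); m_true[10:20] = 1.0
noise = 0.01 * rng.normal(size=n)
d = G @ m_true + noise
target = noise @ noise                       # discrepancy-principle target

beta = 1e2                                   # start over-regularized
m = np.zeros(n)
while True:
    # solve the Tikhonov problem for the current beta (normal equations here;
    # the paper solves (9) iteratively with a pre-conditioned CG instead)
    m = np.linalg.solve(G.T @ G + beta * np.eye(n), G.T @ d)
    misfit = np.sum((G @ m - d)**2)
    if misfit <= target or beta < 1e-12:     # discrepancy principle reached
        break
    beta *= 0.5                              # cool the regularization parameter
```

Starting over-regularized and cooling keeps each subproblem well conditioned, which is the property the iterated Tikhonov variant below tries to retain throughout.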
The use of iterated Tikhonov for non-linear problems has been analysed in Clarenz et al. (2002) and it is known to give similar results to Tikhonov regularization. Some background about the iterated Tikhonov approach is provided in Appendix A.
The iterated Tikhonov regularization is based on the observation that, for the simplest quadratic regularizer R(m) = (1/2)||m − m_0||², the non-linear eq. (8) can be rewritten as

τ^{-1} (m − m_0) = −J(m)^T (Q u(m) − d),   (23)

where m_0 = 0 and τ = β^{-1}. Eq. (23) is a single backward Euler step of the non-linear ODE

ṁ = −J(m)^T (Q u(m) − d),   m(0) = m_0.   (24)

This implies that methods other than a single backward Euler step can be used. For example, in the classical implementation of the iterated Tikhonov regularization method a few backward Euler steps are taken with the same regularization parameter (or time step). Thus we solve the sequence of problems

min_m (1/2) ||Q u(m) − d||² + β R(m; m_ref = m_k),   k = 0, 1, …,   (25)

with the same regularization parameter but with a changing reference model m_ref.
It can be shown under mild assumptions that if we set m_ref to the solution of the kth problem, then the algorithm reduces the data misfit at each iteration. A sketch of the iterated Tikhonov regularization is given in Algorithm 2. It is also possible to combine the two algorithms so that both the regularization parameter and the reference model change at each iteration. This leads to a faster reduction in the misfit.
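The misfit-reduction property can be checked on a linear toy problem. In the sketch below (hypothetical random forward map G, quadratic regularizer), β stays fixed while the reference model is moved to the previous solution; the data misfit then decreases monotonically across iterations.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 30
G = rng.normal(size=(n, n)) / np.sqrt(n)     # hypothetical forward map
m_true = np.sin(np.linspace(0.0, 3.0, n))
d = G @ m_true + 0.01 * rng.normal(size=n)

beta = 10.0                                  # fixed, fairly large
m_ref = np.zeros(n)
misfits = []
for k in range(8):
    # solve min ||G m - d||^2 + beta ||m - m_ref||^2 ...
    m = np.linalg.solve(G.T @ G + beta * np.eye(n),
                        G.T @ d + beta * m_ref)
    misfits.append(np.sum((G @ m - d)**2))
    m_ref = m                                # ... then move the reference model
```

In the linear case the residual is multiplied each iteration by a filter with spectrum β/(σ² + β) ∈ (0, 1], so the decrease is guaranteed; the large fixed β is what keeps each linear solve well conditioned.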
Using an iterative regularization has one main advantage: the regularization parameter β used at each iteration is larger than the equivalent one used for classical Tikhonov regularization. This can be advantageous because the convergence of the linear solver at each iteration depends strongly on the size of the regularization parameter. On the other hand, there are some disadvantages to the iterative regularization approach. The main one is that it is difficult to take advantage of solving problems on coarser grids. Our code, therefore, implements either the classical or the iterated Tikhonov solution.

E F F I C I E N T I M P L E M E N TAT I O N O F Q UA S I -N E W T O N M E T H O D S
Irrespective of whether we use a Tikhonov or an iterated Tikhonov approach, an essential computation is the solution of eq. (9), which we write again for the sake of completeness:

(J^T J + β R_mm) s = −g.   (26)

This system is solved with a CG solver and the efficacy of the solution depends strongly on the pre-conditioner. In previous work, we used the matrix M = (0.1 I + β W^T W) as a pre-conditioner. This worked well for large β but was less effective as β decreased. To improve computational efficiency we need a better approximation to the Hessian matrix (J^T J + β R_mm). We will use quasi-Newton techniques to construct an approximation to the inverse Hessian and use that matrix as a pre-conditioner. As a prelude we discuss quasi-Newton methods for the solution of the optimization problems (7) and (26). We start by introducing some notation, review the L-BFGS method and discuss important points in its implementation.
For a fixed regularization parameter β and reference model m_ref, given two approximate solutions m_j and m_{j+1} and the two corresponding gradients g_j and g_{j+1}, we define, as is usual in quasi-Newton methods,

s_j = m_{j+1} − m_j,   y_j = g_{j+1} − g_j.   (27)

The usual secant condition (Dennis & Schnabel 1996; Nocedal & Wright 1999) is

H_{j+1} s_j = y_j,

where H_j is an approximation to the Hessian. In our case the Hessian has the structure

H = J^T J + β R_mm,

where R_mm is a known, sparse, easy-to-evaluate matrix and J^T J is a dense, hard-to-evaluate matrix which corresponds to a compact operator (Engl et al. 1996). Common quasi-Newton methods do not make use of the known part and approximate the whole Hessian. However, we would like to use the known information about the Hessian.
To do that, we also note that for the applications we consider here, we have explicit knowledge of the gradients of the data objective function and the model objective function,

g_d(m) = J(m)^T (Q u(m) − d),   g_m(m) = R_m(m),

so that g = g_d + β g_m. Using the separability of the objective function and the gradient, we now discuss approximations to the Hessian.
In the BFGS method one uses eq. (27) to approximate the inverse of the Hessian directly. The standard BFGS update is (see Dennis & Schnabel 1996; Nocedal & Wright 1999)

H_{k+1}^{-1} = (I − γ_k s_k y_k^T) H_k^{-1} (I − γ_k y_k s_k^T) + γ_k s_k s_k^T,   (30)

where γ_k = (s_k^T y_k)^{-1}. In the L-BFGS method only a limited number of vectors are kept and the matrix H, or H^{-1}, is never generated explicitly. The vectors {s_j, y_j}, j = 1, …, are saved and an inverse-Hessian matrix-vector product can be calculated using a recursion (see Kelley 1999, for details).
It is possible to use the L-BFGS method directly; however, as discussed in Gilles et al. (2002), if the matrix R_mm is constant (as when the regularization is quadratic) it is crucial to initialize the Hessian with βR_mm. This implies that when calculating the inverse Hessian times a vector one needs to calculate the product v = R_mm^{-1} w. This can be done by solving the system R_mm v = w, either directly or using a multigrid method (Haber 2004).
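The two-loop recursion with this initialization can be sketched as follows: the middle step applies (βR_mm)^{-1}, here by a dense direct solve for illustration; in practice R_mm is sparse and the solve would be direct-sparse or multigrid. The function name is our own.

```python
import numpy as np

def lbfgs_apply_inverse(g, s_list, y_list, Rmm, beta):
    """Two-loop recursion for H^{-1} g, with the initial inverse Hessian
    taken as (beta * Rmm)^{-1} (the known, sparse part of the Hessian)
    rather than a scaled identity."""
    q = g.copy()
    alphas = []
    for s, y in zip(reversed(s_list), reversed(y_list)):
        rho = 1.0 / (y @ s)
        a = rho * (s @ q)
        alphas.append(a)
        q -= a * y
    # apply the initial inverse Hessian: solve (beta * Rmm) z = q
    z = np.linalg.solve(beta * Rmm, q)
    for (s, y), a in zip(zip(s_list, y_list), reversed(alphas)):
        rho = 1.0 / (y @ s)
        b = rho * (y @ z)
        z += (a - b) * s
    return z
```

With no stored pairs the recursion reduces to the (βR_mm)^{-1} solve, and with stored pairs it satisfies the secant condition for the most recent pair; both properties are easy to verify numerically.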
However, there are a few difficulties when using L-BFGS for the solution of our particular problem. Consider the L-BFGS solver embedded into Algorithms 1 and 2. When solving two adjacent problems, with a different regularization parameter or a different reference model, the gradients of the previous functions cannot be reused directly. This implies that a standard implementation of L-BFGS requires restarting the process, ignoring all (potentially useful) previous information. We therefore suggest a variant which makes use of previous approximations to the solution.
Assume we are at iteration ℓ of a BFGS algorithm with a reference model m_ref^k and a regularization parameter β_k, and that we have the vectors

{s_i, y_i},   i = 1, …, ℓ.

Now assume that the regularizer has changed, either because β_k or m_ref^k has changed to β_{k+1} and m_ref^{k+1}. Straightforward implementation of BFGS requires that we restart the process, ignoring information we have previously collected. However, we can circumvent this and define the following new set of vectors:

ĝ_i = g_i − β_k R_m(m_i; m_ref^k) + β_{k+1} R_m(m_i; m_ref^{k+1}),   i = 1, …, ℓ.

It is clear that the vectors ĝ_i, i = 1, …, ℓ, are the gradients of the modified objective function evaluated at the previous models m_1, …, m_ℓ. We can therefore use these vectors to generate a new BFGS approximation to the Hessian. Note that for the class of regularization we consider here, R_m(m_i; m_ref^{k+1}) can be calculated in linear time. When using the new BFGS vectors we must test the quasi-Newton curvature condition, y_j^T s_j > 0. If this condition does not hold for a new pair of quasi-Newton vectors, then we simply drop that pair.
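The vector-recycling idea can be sketched as follows, assuming we have stored the previous models and the data-misfit parts of their gradients; the `Rm` callable is a hypothetical stand-in for R_m(·; m_ref), and pairs failing the curvature test are dropped.

```python
import numpy as np

def recycle_pairs(models, grads_data, beta_new, m_ref_new, Rm):
    """Rebuild BFGS pairs after the regularizer changes.

    grads_data[i] holds the data-misfit part of the gradient at models[i];
    the new total gradient is grads_data[i] + beta_new * Rm(models[i], m_ref_new),
    so no extra forward or adjoint solves are needed. Pairs violating the
    curvature condition y^T s > 0 are dropped."""
    g_new = [gd + beta_new * Rm(m, m_ref_new)
             for m, gd in zip(models, grads_data)]
    s_list, y_list = [], []
    for i in range(len(models) - 1):
        s = models[i + 1] - models[i]
        y = g_new[i + 1] - g_new[i]
        if y @ s > 0:                        # quasi-Newton curvature condition
            s_list.append(s)
            y_list.append(y)
    return s_list, y_list
```

The key point is that only the cheap regularization part of each gradient is recomputed; the expensive data part, which required forward and adjoint solves, is reused unchanged.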
In the L-BFGS method one chooses a maximum number of vectors, l_max, to be used. The convergence rate increases with l_max, but so do the storage requirements. Thus l_max is problem and machine dependent.
Although the optimization problem could be solved directly with L-BFGS, we have found that this gives poor performance. We suspect that for small regularization parameters many BFGS vectors are needed to obtain a good approximation to the Hessian. Nevertheless, the BFGS approximation to the Hessian can still perform very well as a pre-conditioner for the Gauss-Newton system, and this is the role it plays here.

W O R K I N G W I T H E X T E N D E D WAV E F O R M S
Most transmitter waveforms have an on-time and an off-time, and the ideal scenario is that the electromagnetic fields have completely decayed away before the next waveform begins. In conductive environments, however, the decay time might extend over a couple of periods of the input waveform. Modelling the field data then requires an extended time source that consists of a few repetitions of the waveform. As an example, consider a conductive sphere buried in a low conductivity background. Suppose that the sphere has a late-time decay constant τ = 100 ms. The waveform is a half-sinusoid pulse followed by an off-time. The data are acquired in the off-time and stacked over many periods. The final data set is assumed to contain the steady state values. If the fundamental frequency of the transmitter is 10 Hz, then numerical experience shows that at least three cycles of the waveform need to be modelled before an approximate steady state solution is reached. Rather than carrying out this time stepping in all forward modellings for the inverse problem, we now suggest a new method to deal with this difficulty.
Assume that the conductivity structure of the earth is known and that we can run the full forward problem with this conductivity structure. Let u^e be the field that corresponds to time t_e, which is the time after a few waveforms, and assume that the magnetic field is measured only at times equal to, or larger than, t_e. We rewrite the system (16), split at time t_e, as

A(m) u = b,   u = (u^1, …, u^{e−1}, u^e, …, u^s),   (31)

where the second group of unknowns contains all the fields at times t_e and later. For the inversion we are interested only in fields at times t_e and later. However, to compute the field at t_e we need to know u^{e−1}, the field at t_{e−1}. If this were known, we could solve the much smaller forward problem

A_red(m) u_red = b_red − B_e u^{e−1},   (32)

which involves only the blocks from time t_e onward, with u^{e−1} entering through the right-hand side. The problem is that the field u^{e−1} is not known unless we have previously performed a full forward modelling and computed and stored it. Then, if forward modellings for times t_e and beyond are needed, we can avoid computing all previous fields and use the forward problem (32) instead of the full forward problem (31). This saves many of the computations.
In the course of the inversion process, we need to recompute the forward problem only after we update the model. Fortunately, since the PDE is stiff, if the change to the model is small then the change to the fields is exponentially smaller. Thus, for small changes in the model, we are able to obtain a good enough solution to our forward problem even with the field u^{e−1} computed from a near-by (previous) model. This observation motivates the following idea. Replace the forward modelling matrix A(m) with A_red and use the stored field at time t_{e−1}. This implies that not only is the forward problem easier to solve, but the gradients and the product of the sensitivity matrix with a vector are also cheaper to compute. However, we must keep track of u^{e−1} and update it as m changes. This idea can be summarized by the following algorithm.

while true do
  1. Compute the full forward problem and store u^{e−1}.
  2. Compute the gradient (eq. 8) using the reduced forward modelling A_red.
  3. Compute the Gauss-Newton correction s using eq. (9) with the reduced forward modelling A_red.
  4. Update the model, m_{k+1} ← m_k + αs, where α is a line search parameter that guarantees reduction in the objective function.
  if m_{k+1} fulfils the chosen stopping criteria then break
end while

An important observation can be made when analysing the algorithm. By assuming that the perturbation in m is small between iterations, we can perform step 1 only every second or third iteration. Furthermore, if we update u^{e−1} with a near-by model, then step 1 and steps 2-3 can be done in parallel. Thus, given a parallel architecture of two nodes, we are able to significantly accelerate our algorithm.
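The lazy-update loop can be sketched with a hypothetical scalar "forward modelling" (an implicit decay equation standing in for the Maxwell stepper): the expensive warm-up to t_{e−1} is refreshed only every other iteration, while the reduced modelling restarts from the stored field. All names and step counts are illustrative.

```python
import numpy as np

def step(m, u, n_steps):
    """Toy one-parameter 'forward modelling': n_steps of backward Euler
    applied to du/dt = -exp(m) u, with dt = 0.01."""
    for _ in range(n_steps):
        u = u / (1.0 + 0.01 * np.exp(m))
    return u

u0 = 1.0
n_warm, n_obs = 300, 50                      # steps before / after t_e
m = 0.0                                      # current model
u_prev = step(m, u0, n_warm)                 # full modelling up to t_{e-1}

for it in range(5):
    # reduced forward problem: start from the stored field u_{e-1}
    pred = step(m, u_prev, n_obs)
    # ... compare pred with data and compute a model update (omitted) ...
    m += 0.01                                # stand-in for a small model change
    if it % 2 == 0:                          # refresh u_{e-1} only occasionally;
        u_prev = step(m, u0, n_warm)         # in parallel with steps 2-3 in practice
```

Because the decay is fast, a stored field from a slightly stale model introduces only a small relative error in the reduced modelling, which is exactly the property the algorithm exploits.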

E X A M P L E S
In this section, we present three synthetic and one field example. The first example is a generic inversion where we invert data arising from a loop on the earth's surface; the goal is to illustrate the basic functionality of our algorithm. The second and third examples, respectively, demonstrate the use of the corrective source technique to reduce the volume, and show how extended waveforms can be handled more efficiently. We conclude with a larger-scale field example from a mineral deposit in Mexico.
In all of the examples we define the regularization functional R(m) to be the discretized equivalent of

R(m) = ∫_Ω |∇m|² dv + γ ∫_Ω (m − m_ref)² dv,   (33)

where γ = 10^{-5}.

A generic inversion
A square loop with dimensions of 130 × 130 m is located on the earth's surface. The transmitter current is a step-off at time zero and responses are measured at 32 logarithmically spaced times between 10^{-6} and 10^{-3} s. The earth model consists of a conductive block of 10 Ωm and a resistive block of 1000 Ωm buried in a half-space of 100 Ωm. There are 100 receivers on the surface inside the transmitter loop and five field components (H_x, H_y, H_z, E_x and E_y) are measured. Gaussian random noise has been added to the 16 000 data. The standard deviation of the noise was one per cent of the datum value plus a floor. Fitting the data to this small error makes the test relatively difficult. The problem is discretized using 42³ cells and the number of unknowns (A, φ and H for all space-time, and m) is roughly 16 million. The regularization functional is that used in eq. (33). The initial model and initial reference model were set equal to the true half-space.
To solve the optimization problem, we used the iterated Tikhonov algorithm along with the L-BFGS method to generate a pre-conditioner for the matrix system (9). We allowed four L-BFGS iterations per β before making a decision about whether to change β. The reference model was updated to be equal to the current model each time β changed. The true and recovered conductivity models are shown in Fig. 1. The conductive and resistive blocks are in the correct locations but, as is usual in smooth model inversions of blocky structures, the recovered conductivities of the target bodies are somewhat smaller than the true values and the edges are smoothed. Because of the small amount of noise added, and because the data have been fit so well, there is little visual difference between observed and predicted data. We have therefore not plotted these images.
The numerical progress of the inversion is summarized in Fig. 2. A misfit of 19 000 is achieved after 40 iterations and a cooling schedule that saw β decreased by five orders of magnitude in eight steps.
The inversion software is coded in Fortran 90 and was run on an Opteron 244 processor. It took roughly 64 h to complete.
In an inversion involving a cooling strategy there are trade-offs regarding the rate at which the regularization parameter β is reduced and how many iterations are carried out at each β. The goal is to proceed cautiously so that the solution trajectory stays close to the classical Tikhonov curve. Our situation is slightly more complicated because we are using an iterated Tikhonov approach in which the reference model is changing. To obtain some confidence that the solution in Fig. 1 is a good minimizer of the initial objective function (33), we repeat the inversion and evaluate (33) as a function of iteration. In this computation the reference model remains equal to the initial half-space value. The closeness of the two curves suggests that we are doing a reasonably good job of finding a solution that minimizes our initial objective function. This is supported by the fact that R(m), evaluated with the true model, is 3925, more than an order of magnitude larger than that achieved by our inversion results.

Corrective source for reducing the volume
The following example emulates a typical mineral exploration survey in which a large transmitter loop is deployed on the surface and the area of investigation lies outside the loop. The goal is to find a conducting target. The 1000 × 1000 m transmitter loop and the conductivity model are shown in Fig. 3. The transmitter signal is a step-off, and three-component magnetic field data are recorded at 31 logarithmically spaced times between 10⁻⁴ and 10⁻² s. The data were contaminated with 1 per cent Gaussian noise.
Forward modelling the data required a large volume that included the transmitter (thus we avoided the 'inverse crime' of inverting the data with the same method used to generate them). The resultant mesh was too large to be used in the inversion, and the extended volume served little purpose since the data and the volume of interest are quite confined. A smaller mesh encompassing the volume of interest was therefore generated, with 50 × 50 × 25 m cells in the core region that included the survey area. Padding cells then extended the final inversion volume such that the left boundary terminated inside the transmitter loop. The total number of cells in the inversion volume was 28 672. The background fields for a half-space, whose conductivity was equal to the true half-space value, were used to generate the corrective sources at each time.
This inversion was run on a 3 GHz Pentium 4 processor and took 84 h to complete. A vertical cross-section of the recovered model is shown in Fig. 3, along with the misfit and model regularization curves as a function of iteration. Again the inversion was run twice, with the maximum number of iterations per β set to 4 and to 10. In this case, allowing more iterations per regularization value resulted in a lower final misfit and better overall convergence.

Numerical experiments with extended waveforms
A surface loop survey is carried out over a conductive prism of 8 S m⁻¹. The earth model and source geometry are shown in Fig. 5. The waveform is a half-sinusoid followed by an off-time. The transmitter frequency is f = 1000 Hz, so the total waveform, comprising on-time and off-time segments of equal length, has period T = 0.001 s. Because of its high conductivity and size, the time constant of the target is approximately τ = 0.004 s, which is greater than T. Multiple waveforms are therefore needed, and in Fig. 4 we plot the first four of these along with the time sampling. Constant time steps were used in the on-time of the waveform, and logarithmic samples in the off-time; the total number of time samples was 193. Also in Fig. 4 we plot the decay curves of H_z for the first four cycles. There is a substantial difference between successive decays, but an approximate steady state is reached by the end of the fourth cycle. In Fig. 5 the misfit and model norm curves are shown, respectively, in panels (c) and (d); the solid lines correspond to an inversion in which the forward modelling began at time zero, and the dashed lines to starting the modelling at the beginning of the fourth pulse using the fields from a previous model. We next take the fourth decay and use it as data for the inverse problem. All three components of the noise-contaminated magnetic field are inverted. The inversion is carried out by setting t_e to the time of the onset of the fourth waveform. One processor runs the forward modelling over (0, t_{e−1}) and stores the field u_{e−1}; this field is updated every time the model changes. The second processor, which generates the model perturbation, begins each forward modelling with the stored field.
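The time sampling just described (uniform steps in the on-time, logarithmically spaced samples in the off-time, repeated over the cycles) can be sketched as follows; the sample counts here are illustrative and do not reproduce the 193 steps used in this example.

```python
import numpy as np

def waveform_times(f=1000.0, n_on=24, n_off=24, n_cycles=4, t_min=1e-6):
    """Illustrative time sampling for a half-sinusoid-plus-off-time
    waveform: uniform steps during the on-time, logarithmically spaced
    samples during the off-time, repeated over n_cycles."""
    T = 1.0 / f                          # full period: on-time + off-time
    t_on = np.linspace(0.0, T / 2, n_on, endpoint=False)
    # log-spaced off-time samples, offset from the end of the on-time ramp
    t_off = T / 2 + np.logspace(np.log10(t_min), np.log10(T / 2 - t_min), n_off)
    cycle = np.concatenate([t_on, t_off])
    return np.concatenate([k * T + cycle for k in range(n_cycles)])
```

The logarithmic off-time sampling concentrates time steps just after the ramp, where the fields decay most rapidly.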
The success of this approach requires that the fields at t_e from a previous model be close to the fields that would have been obtained had a complete modelling been done on the current model. We have compared the misfit obtained by starting from the fields at t_{e−1} with that obtained by carrying out a full modelling. The differences, as illustrated in Fig. 5, are minimal, yet the time savings were substantial. We ran the inversion on an Opteron 244 processor. By having the first three periods of the transmitter pulse modelled by the second processor, the inversion time was reduced by roughly a factor of 3, to 73.5 h, whereas the run on a single (same) processor took 231 h. For completeness we plot the regularization functional in Fig. 5, as well as a cross-section of the recovered model.
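The restart mechanics behind this two-processor scheme can be illustrated with a toy scalar time-stepping problem. Everything here (the scalar ODE, the `forward` function, the sample times) is a stand-in for the actual Maxwell time stepping, chosen only to show that resuming from a stored field reproduces the full modelling when the model is unchanged.

```python
import numpy as np

def forward(m, times, u0=0.0):
    """Stand-in for the time-stepping forward model: one backward Euler
    step per time sample of the toy scalar ODE du/dt = -m*u + 1,
    starting from the stored field u0."""
    u, out, t_prev = u0, [], 0.0
    for t in times:
        dt = t - t_prev
        u = (u + dt) / (1.0 + dt * m)   # backward Euler step
        out.append(u)
        t_prev = t
    return np.array(out)

times = np.linspace(0.01, 4.0, 400)
e = 300                                  # index of t_e
m = 2.0
u_full = forward(m, times)               # complete modelling from t = 0
u_stored = forward(m, times[:e])[-1]     # field stored at t_{e-1}
# restart the modelling at t_e from the stored field
u_warm = forward(m, times[e:] - times[e - 1], u0=u_stored)
```

For an unchanged model the warm-started run reproduces the tail of the full run exactly; in the inversion the stored field comes from the previous model, so the agreement is approximate but, as Fig. 5 shows, adequate.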

F I E L D E X A M P L E
In our final example, we invert data from a mineral deposit in Mexico. San Nicolas is a Cu-Zn massive sulphide deposit located in central Mexico in the state of Zacatecas. The sulphide body is highly conductive compared with most of the geologic units, but the overburden and some of the Tertiary volcanic breccias have conductivities in the range of the sulphide. Details about the geology, inversion and interpretation are presented elsewhere (Napier & Oldenburg, in preparation). The example is included here primarily to show the application of our algorithm to a larger-scale problem involving field data.
A UTEM survey was conducted over the San Nicolas deposit in December of 1998. The UTEM system, described by West et al. (1984), uses a continuous on-time loop transmitter with a sawtooth waveform and coil receivers which measure the time derivative of the magnetic field. At San Nicolas the UTEM survey consisted of three large loops as shown in Fig. 6. Vertical components of the field were recorded on north-south and east-west lines over the deposit.
We now invert the data from the eastern-most transmitter. The central volume of the mesh shown in Fig. 6(a) was discretized into 25 × 25 m cells with a variable thickness, from 15 m near the surface to 25 m at depth. The mesh then extended out from the core region several kilometres in each direction; the final volume consisted of 349 440 cells. The inversion used 38 time steps over 1 1/4 UTEM cycles. The first 3/4 of a cycle was uniformly sampled in time; the waveform was then densely sampled in log-time over the next half cycle, with sampling on that ramp beginning 1.5 decades before the first datum. The frequency of the transmitter was 30.974 Hz, and the earliest and latest time channels were, respectively, 47 μs and 12 ms. The inversion functional is that given in (33), and the initial and reference models were equal to a 200 Ωm half-space. Data errors were assigned as 5 per cent of each datum plus a floor equal to 3 per cent of the median value of each time channel. In total there were 1082 data. The initial and final β values were 1.0 and 0.0016, and the inversion required 17 iterations to achieve the desired misfit of 1082. It took 19 days on a single Opteron 244 processor. A vertical cross-section and a volume-rendered image of the recovered conductivity are shown in Fig. 6. Also plotted there are the observed and predicted data for the second time channel and a decay curve for a selected location; these illustrate the good agreement between observed and predicted data.
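The error assignment described above (5 per cent of each datum plus a per-channel floor of 3 per cent of the channel median) can be sketched as follows. The function name is ours, and taking the median of absolute values is our assumption about how the channel median is computed.

```python
import numpy as np

def assign_errors(data, channels, pct=0.05, floor_pct=0.03):
    """Standard deviation assigned to each datum: pct of its absolute
    value plus a floor equal to floor_pct of the median absolute value
    of its time channel. `channels` labels the channel of each datum."""
    data = np.asarray(data, dtype=float)
    channels = np.asarray(channels)
    sd = pct * np.abs(data)
    for c in np.unique(channels):
        idx = channels == c
        sd[idx] += floor_pct * np.median(np.abs(data[idx]))  # channel floor
    return sd
```

The per-channel floor keeps small late-time data from being assigned unrealistically tight errors.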
The field data from all three loops were also simultaneously inverted, but with a model that had only 90 000 cells. The results were quite comparable but, as might be expected with more transmitters, there are differences that might have significant geologic implications; these are being investigated by Napier & Oldenburg (in preparation).

A P P E N D I X A : R E G U L A R I Z AT I O N E F F E C T O F T I M E S T E P P I N G
Assume that we are given an ill-posed problem of the form $F[m] = d + \epsilon$ and we would like to obtain a solution $m$ that fits the data. Since the problem is ill-posed we need to use regularization, that is, to obtain a model that fits the data to some extent and is not sensitive to small changes in the data. To that end, we embed the solution $m$ in a family of models $m(t)$, where $t$ is a regularization parameter. As can be found in Vogel (2001), any regularization technique satisfies the following two criteria: C1, let $m(t_1)$ and $m(t_2)$ be two solutions with $t_1 < t_2$; then $\|m(t_1)\|_W \le \|m(t_2)\|_W$; and C2, $\|F[m(t_2)] - d\| \le \|F[m(t_1)] - d\|$, that is, the data fit improves as $t$ increases. It is well known that Tikhonov regularization fits these criteria. To see this, write Tikhonov regularization as the solution of the optimization problem $\min_m \phi(m) = \frac{1}{2}\|F[m] - d\|^2 + \frac{1}{2t}\|m\|_W^2$. Then, assuming $\phi$ is convex, we obtain a better data fit for larger $t$ and a perfect fit as $t \to \infty$. Furthermore, for $t_1 < t_2$ we have $\|m(t_1)\|_W \le \|m(t_2)\|_W$.
In the case that $F[m] = Am$ (i.e. the forward problem is linear) and a simple quadratic norm is used, writing the singular value decomposition of $A$ as $A = U \Lambda V^{\top}$, it is possible to write the solution as
$$m(t) = \sum_i f_T(\lambda_i; t)\, \frac{u_i^{\top} d}{\lambda_i}\, v_i,$$
where $f_T(\lambda; t)$ is referred to as the Tikhonov filter function,
$$f_T(\lambda, t) = \frac{\lambda^2}{\lambda^2 + t^{-1}}.$$
For $\lambda^2 \ll t^{-1}$ we have $f_T \approx 0$, and for $\lambda^2 \gg t^{-1}$ we have $f_T \approx 1$. Thus, the effect of the filter is to damp the vectors associated with small singular values while leaving the larger ones untouched.
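The filter-factor construction can be checked numerically for the linear case. This short sketch (the function name is ours) assembles the Tikhonov solution from the SVD and agrees with a direct normal-equations solve.

```python
import numpy as np

def tikhonov_filtered_solution(A, d, t):
    """Tikhonov solution of min ||A m - d||^2 / 2 + ||m||^2 / (2 t),
    assembled from the SVD A = U diag(lam) V^T with filter factors
    f_T(lam, t) = lam^2 / (lam^2 + 1/t)."""
    U, lam, Vt = np.linalg.svd(A, full_matrices=False)
    f = lam**2 / (lam**2 + 1.0 / t)     # Tikhonov filter factors
    return Vt.T @ (f * (U.T @ d) / lam)
```

For $\lambda^2 \gg t^{-1}$ the factors approach 1 and the corresponding singular vectors pass essentially unfiltered; for $\lambda^2 \ll t^{-1}$ they are damped.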
Tikhonov regularization may be the best-known regularization technique. However, a different regularization technique can be obtained by embedding the solution into a dynamical system. Consider first the case of linear forward modelling and define the regularized solution as the solution of the dynamical system described by the ordinary differential equation
$$\frac{dm}{dt} = -A^{\top}(Am - d), \qquad m(0) = 0.$$
Using the SVD of $A$ it is easily verified that the solution of this ODE is
$$m(t) = \sum_i \left(1 - e^{-\lambda_i^2 t}\right) \frac{u_i^{\top} d}{\lambda_i}\, v_i,$$
so this approach yields filters (and hence solutions) that are very similar to the Tikhonov filter functions. More details about this type of regularization can be found in Hanke & Groetsch (1998). For non-linear problems the dynamical system is non-linear and can be written as
$$\frac{dm}{dt} = -J[m]^{\top}\left(F[m] - d\right),$$
where $J[m]$ is the sensitivity of $F$. In this case there is no closed-form solution; however, as with Tikhonov regularization, assuming convexity it is possible to show that the two conditions above apply. Solving this non-linear dynamical system requires a numerical strategy; in this paper we have used the implicit backward Euler approach.
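A minimal sketch of the backward Euler discretization of this dynamical system for the linear case, assuming $m(0) = 0$: each step solves $(I + \Delta t\, A^{\top}A)\, m_{k+1} = m_k + \Delta t\, A^{\top} d$, and the model norm grows monotonically with $t$, as required by criterion C1.

```python
import numpy as np

def asymptotic_regularization(A, d, t_final, n_steps):
    """Backward Euler applied to dm/dt = -A^T (A m - d), m(0) = 0.
    Each step solves (I + dt A^T A) m_{k+1} = m_k + dt A^T d."""
    dt = t_final / n_steps
    n = A.shape[1]
    H = np.eye(n) + dt * (A.T @ A)      # backward Euler system matrix
    rhs_const = dt * (A.T @ d)
    m = np.zeros(n)
    for _ in range(n_steps):
        m = np.linalg.solve(H, m + rhs_const)
    return m
```

Small $t$ gives a heavily damped model; as $t$ grows the solution approaches the least-squares solution, mirroring the role of the regularization parameter in the Tikhonov filter.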