Symplectic structure of wave-equation imaging: a path-integral approach based on the double-square-root equation

SUMMARY We carry out high-frequency analyses of Claerbout’s double-square-root equation and its (nu-merical) solution procedures in heterogeneous media. We show that the double-square-root equation generates the adjoint of the single-scattering modelling operator upon substituting the leading term of the generalized Bremmer series for the background Green function. This adjoint operator yields the process of ‘wave-equation’ imaging. We ﬁnally decompose the wave-equation imaging process into common image point gathers in accordance with the characteristic strips in the wavefront set of the data.

interpret such gathers we develop a method based upon microlocalization to estimate the normal direction to the isochrones. ACIGs are used for amplitude inversion and in tomography (Brandsberg-Dahl et al. 2003b;De Hoop et al. 2001).
If the medium of the configuration were laterally homogeneous, the directional decomposition would become an algebraic operation in the lateral Fourier or wavenumber domain (see, e.g., Kennett 1985). In such a medium, the phase-shift method (Gazdag & Sguazzero 1984) is amongst the fastest (and accurate) one-way algorithms. In this method, in the lateral wavenumber domain, the phase shift is simply proportional to the vertical wavenumber; the vertical and lateral wavenumbers are connected through an algebraic dispersion relation. As soon as the medium becomes laterally heterogeneous, it is still advantageous to carry out the analysis in the lateral wavenumber domain, but without leaving the lateral space domain-an observation well established in the field of microlocal analysis (see, e.g., Treves 1980) and applied by Fishman & McCoy (1984a). The lateral space-wavenumber domain constitutes the (lateral) phase space. Through the Fourier transforms, the space and wavenumber domains are 'dual' to one another. In the directional (de)composition-and in the downward and upward propagation, and in the reflection and transmission caused by variations in medium properties in the preferred direction-we now encounter pseudodifferential operators the symbols of which are defined in phase space, and lead to a calculus that generalizes the algebraic manipulations in the case of a laterally homogeneous medium (De Hoop 1996, and Section 2 below).
The wave-equation approach to seismic imaging, analysed in this paper, is advantageous, in particular, if the complexity of the Earth's properties is such that caustics (and pseudocaustics) develop. Its geometrical-ray counterpart, the high-frequency approach, has been discussed in Burridge et al. (1998). The approach is applicable, not only in exploration seismology but also in crustal seismology, imaging teleseismic body waves for detailed lithospheric profiling (Bostock & Rondenay 1999). Examples of subsurface complexities are salt domes and (shallow) magma chambers.
The outline of the paper is as follows. In Section 2 we discuss the method of directional wavefield decomposition. Such a decomposition leads to a coupled system of one-way wave equations. It is the objective to image the symbol of the coupling operator occurring in the system. In Section 3 (and Appendices B and C) we develop a Trotter-product representation for the fundamental solution to the (uncoupled) one-way wave equation. The Trotter-product representation can be viewed as the composition of Fourier integral operators, each of them being identified with a 'thin-slab' propagator. We derive a condition under which such a composition exists, ensuring the proper propagation of singularities. This propagation occurs consistently along the bicharacteristics of the original wave equation. In the remote sensing problem (Section 4) one places sources and receivers at the surface. Sources generate acoustic waves, which scatter in the subsurface and are observed at the surface as a function of time. To image the observations, the measurements of the sources and the receivers are propagated into the subsurface in a presumed background medium back to the time when the scattering took place. This continuation is governed by the double-square-root equation. We develop a Trotter-product representation for the imaging operator. Such an operator can be identified with the fundamental solution of the double-square-root equation. We recover isochrones from the Trotter-product representation, and relate certain phase variables in the representation to scattering angle and azimuth. We then modify the imaging operator into an operator that generates multiple images, namely for each scattering angle and azimuth. Finally, in Section 5 we discuss a procedure to estimate the migration dip while applying the imaging operator, using a continuous wavelet transform. A numerical example of generating ACIGs with Trotter products is shown in Section 6. With the tools developed in the main text, we assess in Appendix D the extent to which the reduction to the computationally efficient 'common azimuth continuation' approach (Biondi & Palacharla 1996) can be applied. Throughout we will make use of microlocal techniques, the fundamental concepts of which are introduced in Appendix A.

O N E -W A Y W A V E E Q U A T I O N S
First, we discuss the method of directional wavefield decomposition. Such a decomposition leads to a coupled system of one-way wave equations. Sometimes, the one-way wave equations are referred to as single-square-root equations. Our final purpose is to image the symbol of the coupling operator occurring in the system.
For the details of the derivation of exact one-way wave equations, we refer the reader to De Hoop (1996). Here, we restrict ourselves to a summary. Let p be the acoustic pressure (Pa), v r the particle velocity (m s −1 ), ρ the volume density of mass (kg m −3 ), κ the compressibility (Pa −1 ), q the volume source density of the injection rate (s −1 ) and f k the volume source density of the force (N m −3 ). We assume that the coefficients κ and ρ are smooth and constant outside a compact domain. This provision enables us to formulate the acoustic wave propagation, when necessary, as a scattering problem in a homogeneous embedding. The smoothness entails that the singularities of the wavefield (in particular those in the neighbourhood of the wave arrival) arise from those in the signatures of the source distributions. The formation of caustics, associated with scattering in the lateral directions ('multipathing'), is captured in the approach developed in this paper.
To be able to make use of the standard calculus of pseudodifferential operators, we should carry out our analysis in the time-Laplace domain (Widder 1946). To show the notation, we give the expression for the acoustic pressure, Under this transformation, assuming zero initial conditions, we have ∂ t → s. In the Laplace domain, the acoustic wavefield satisfies the system of first-order equations sκp + ∂ rvr =q.
However, for the purpose of the microlocal analysis to follow, we will invoke the Fourier limit, lim s→iω . The evolution of the wavefield in the direction of preference will now be expressed in terms of the changes of the wavefield in the directions transverse to it. The direction of preference is taken along the x 3 -axis (or 'vertical' axis) pointing into the subsurface and the remaining ('lateral' or 'horizontal') coordinates are denoted by x µ , µ = 1, 2. Since we allow the medium to vary with all coordinates and hence also with the coordinate in the preferred direction, we are forced to carry out the wavefield decomposition from the system of first-order equations rather than the second-order scalar Helmholtz equation.

The reduced system of equations
The decomposition procedure requires a separate handling of the horizontal components of the particle velocity. From eqs (2) and (5) we obtain leaving, upon substitution, the matrix differential equation in which δ I J is the Kronecker delta, and the elements of the acoustic field matrix are given bŷ the elements of the operator matrix of the acoustic system bŷ and the elements of the notional source matrix bŷ It is observed that the right-hand side of eq. (4) andÂ I J contain the spatial derivatives D ν with respect to the horizontal coordinates only. D ν is interpreted as the horizontal slowness operator. Furthermore, it is noted thatÂ 12 is simply a multiplicative operator.

The coupled system of one-way wave equations
To distinguish up-and downgoing constituents in the wavefield, we shall construct an appropriate linear operatorL I J witĥ which, with the aid of the commutation relation (∂ 3L I J ) = [∂ 3 ,L I J ], transforms eq. (5) intô as to makeˆ J M , satisfyinĝ a diagonal operator matrix. We denoteL I J as the composition operator andŴ M as the wave matrix. The matrix expression in parentheses on the left-hand side of eq. (12) is diagonal and its diagonal entries are the two so-called one-way wave operators. The first term on the right-hand side of eq. (12) is representative for the scattering caused by variations of the medium properties in the vertical direction. The scattering caused by variations of the medium properties in the horizontal directions is contained inˆ J M and, implicitly, inL I J also. To investigate whether solutions of eq. (13) exist, we introduce the generalized eigenvector operatorsL (±) I columnwise according tô Upon writing the diagonal entries ofˆ J M , the generalized eigenvalue operators, aŝ eq. (13) decomposes into the two systems of equationŝ By analogy with the case where the medium is translationally invariant in the horizontal directions, we shall denoteˆ (±) as the vertical slowness operators. Note that the operatorsL (±) 1 compose the acoustic pressure and that the operatorsL (±) 2 compose the vertical particle velocity from the elements ofŴ M associated with the up-and downgoing constituents.
In De Hoop (1996) an ansatz procedure has been followed to solve the generalized eigenvalue-eigenvector problem eq. (16) in the operator sense: choosing the acoustic-pressure normalization analogue, we satisfy the commutation rule In this normalization, we find the vertical slowness operator or generalized eigenvalues to bê is the characteristic operator equation, while the generalized eigenvectors constitute the composition operator withŶ =Â −1 12ˆ denoting the vertical acoustic impedance operator. With respect to the normalization, note that we have decomposed the pressure, namely according toF 1 =Ŵ 1 +Ŵ 2 . In terms of the inverse vertical acoustic impedance operator, the decomposition operator then follows aŝ The (de)composition operators account for the radiation patterns of the different source mechanisms and receiver types. Using the decomposition operator, eq. (12) transforms into which can be interpreted as a coupled system of one-way wave equations. The coupling between the counter-propagating components,Ŵ 1 andŴ 2 , is apparent in the first source-like term on the right-hand side, which can be written as in whichT andR represent the transmission and reflection operators, respectively. (The goal of the double-square-root propagation, to be explained in Section 4, is to image the kernel of the coupling operator.) In the acoustic-pressure normalization analogue, we find The second, primary, source term on the right-hand side of eq. (21) is chosen in accordance with the Helmholtz Green function: a volume injection point source will result inN 1 = 0,N 2 = δ(x − x ) and hence the inhomogeneous term in eq. (21) will reduce to a vector with components whereŶ −1 indicates the Schwartz kernel ofŶ −1 .

T R O T T E R -P R O D U C T O N E -W A Y W A V E P R O P A G A T O R
We develop a Trotter-product representation for the fundamental solution to the (uncoupled) one-way wave equation. The propagation of singularities is governed by the principal symbol of the one-way wave operator, which defines the Hamiltonian. The Trotter-product representation can be viewed as the composition of Fourier integral operators, each of them being identified with a 'thin-slab' propagator. We derive a condition under which such a composition exists, ensuring the proper propagation of singularities.

The product integral
We note that the vertical slowness operators at different levels of x 3 do not necessarily commute with one another owing to the heterogeneity of the medium. Thus we arrive at a 'time'-ordered product integral representation (De Witte- Morette et al. 1979) of the one-sided propagators (cf. eq. 27) associated with the one-way wave equations (Fishman & McCoy 1984a,b;De Hoop 1996;Fishman et al. 1997) where 'time' here refers to the vertical coordinate x 3 , where H denotes the Heaviside function. In this expression, the operator ordering is initiated by exp[−iωˆ (·, x 3 ) dζ ] acting onû(·, x 3 ) followed by applying exp[−iωˆ (·, ζ ) dζ ] to the result, successively for increasing ζ . The functional integral serves, here, as a framework for the further analysis of propagation of singularities.

The 'square-root' or vertical slowness operator symbol, the Trotter product
If the medium in the interval [x 3 , x 3 ] were weakly varying in the vertical direction, the Trotter-product formula could be applied to the product integral in eq. (29). This results in the Hamiltonian path integral representations (De Witte- Morette et al. 1979) with 'measure' D for the Green functions, where P is a set of paths (x µ (ζ ), α ν (ζ )) in (horizontal) phase space satisfying Schlottmann (1999). In eq. (30),γ (±) is the left symbol * ofˆ (±) , i.e.
where α ν indicate the horizontal slownesses. The path integral in eq. (30) is to be interpreted as the lattice multi-variate integral wherē are the coordinates of a path at the discrete values ζ j of ζ as j = 1, . . . , M. The limit M → ∞ is a formal limit; the rigorous convergence of some path integral representations have been analysed by Ichinose (1997Ichinose ( , 1999Ichinose ( , 2000. To account for the source-radiation pattern (cf. eq. 24), we invoke the Volterra composition whereŷ −1 is the dual (or right) symbol ofŶ −1 . Then eq. (32) changes tô In preparation of the microlocal analysis of the Green functions, we now introduce the wavevectors for k = 1, . . . , M, for which components together at k = M will relate to the cotangent vectors appearing in the wavefront set of G (±) . Note that K Then, in the time domain, the Trotter-product formula can be written as the limit of a sequence of the form and similarly without the radiation pattern R. We have assumed that the source is operative at t = 0. In general, shifting this instance to t , yields As before, for simplicity of notation, we will restrict our analysis again to G = G (+) , K 3 = K (+) 3 . Eq. (38) is a simplified representation of the actual compositions of thin-slab propagators: we consider eq. (38) to be the composition of M Fourier integral operators (Le Rousseau et al. 2001). Each composition is in horizontal space and time. However, the integrals over the successive times result in Dirac measures in the difference of successive frequencies. Those measures have been integrated out in eqs (37) and (38), conveying the conservation of frequency.

The canonical relation
For operators with kernels of the form eq. (38), the propagation of singularities is controlled by the phase function. This is described by the canonical relation. With the canonical relation we recover the geometrical (ray) aspect of wave propagation. The canonical relation is a Lagrangian manifold (see also Kendall & Thomson 1993).
The phase function in eqs (37) and (38) is given by in accordance with the eikonal equation, and where x = (s , s 3 ) with s 3 = 0 (corresponding to k = 0), x corresponds to k = M, and where µ = 1, 2 and ν = 1, 2. We have x 3 = x 3 − s 3 = x 3 . We have also introduced the principal part K prin 3 of K 3 ; the remaining exponentials x 3 are contained in the 'amplitude' of G and account for phenomena not associated with the wavefront set. This remaining exponential factor is (a symbol) of order −1 and can therefore be considered as an amplitude. Then the phase is homogeneous in ( K (k) ν 1≤k≤M , ω) † (see also Duistermaat 1996). Note that the operator R in eq. (38) is pseudodifferential and does not move the wavefront set. Let us collect the phase variables according to The stationary point set, φ , is defined as (a 'prime' indicates taking the gradient) Thus, the defining equations of the stationary point set, φ , are for σ = 1, 2 and l = 1, . . . , M − 1, k = 1, . . . , M.
In Appendix B, it is shown that the phase is non-degenerate for M sufficiently large. Hence the phase is clean with excess 0, and the mapping is a diffeomorphism. The stationary point set, φ , is a submanifold of the manifold of points {(x, t, s , t , θ)}. φ is the associated Lagrangian submanifold in the cotangent bundle associated with the manifold of points {(x, t, s , t )}. In the mapping above, K (M+1) 1,2 is given by following eqs (43) while for M large enough, (cf. Appendix C) on the stationary point set.
Observe that the (finite) system of equations defining the stationary point set, φ , is derived from a complex-analytic square-root Hamiltonian, K prin 3 , see Subsection 3.5 below. The solution of this system can be interpreted as a set of bicharacteristics on a lattice as illustrated in Fig. 1. Omitting so-called turning rays, through a finite-difference approximation in the vertical (x 3 ) coordinate, this system yields in the limit M → ∞ the Hamilton system that defines the geometry of the asymptotic ray and Maslov high-frequency solutions (Kendall & Thomson 1993;De Hoop & Brandsberg-Dahl 2000) to the wave equation.

Propagation of singularities
Let G denote the time-domain counterpart of the previously definedĜ (cf. eq. 26). Then G(·, ·, ·; s 0 , 0, t 0 ) = G(δ(s − s 0 ) ⊗ δ(t − t 0 )). We denote the wavefront set of this Green function as WF(G). With the aid of the canonical relation C φ , we find that so that which means that the cotangent vector, ξ, is the cotangent vector at x associated with the bicharacteristic joining the source, s 0 , and the point x, with initial 'slope' or 'slant' given by ω −1 K (1) ν , and 'one-way' traveltime (t − t 0 ).

Square-root Hamiltonian system
Let H be the Hamiltonian associated with the (original wave equation's) propagation process: , ω) has the same degree of smoothness as H (x, K, ω). Also, the implicit function theorem states that We set Then, in the considered neighbourhood U, F and H have the same Hamiltonian flow: Because of eq. (54), the flow of F is parametrized by ζ = x 3 . Now, if τ is the parameter for the Hamiltonian flow associated with H, we have which implies which relates evolution in depth to evolution in time. (This relation is at the basis of the theory of one-way wave propagation.) Hence the equivalence of the two Hamilton systems and their solutions.
In the present case, and (H ) K 3 = 0 implies that K 3 = 0. In view of eqs (49)-(52), for the equivalence of Hamiltonian flows to apply, any bicharacteristic for which K 3 vanishes at some point, a so-called turning ray, must be excluded (see also Appendix A). In an open set U ⊂ R 3 × R 3 which closure is away from these bicharacteristics, F is then simply defined as In the present case, we therefore have H 3 = K prin 3 . In the process of discretizing the system (53)-(56) we take ζ = M −1 x 3 and subdivide the depth or vertical range into M steps. We recover system (43), hence the earlier interpretation of flow in terms of bicharacteristics on a lattice.

Amplitude of the propagator kernel
To analyse the amplitude behaviour of the Trotter-product representation for the one-way Green function, we consider eq. (38) to be the composition of M Fourier integral operators (Le . The kernel of each constituent operator is a distribution with an oscillatory integral representation. The phase of each such representation is of the form (cf. eq. 39) In the composition it follows that all ω (k) = ω; we have t (M) = t and t (0) = 0. We will first establish that, in the stationary phase approximation, the oscillatory integral representation of each constituent distribution yields a change of coordinates on the Lagrangian manifold φ (k) . The stationary point set, φ (k) , follows from and a similar equation for (φ (k) ) ω (k) = 0 determining the traveltime. The above equation can be solved for x (k) σ as a function of K (k) σ if the Hessian is non-singular (by virtue of the implicit function theorem (Dieudonné 1969)). Here, the speed c is evaluated at (x (k) 1,2 ,ζ k ). Having the functions x (k) σ implicitly defined, we apply the chain rule to arrive at  Since which confirms that in the stationary phase approximation (generating the determinant on the right-hand side) we retain a coordinate transformation (generating the determinant on the left-hand side) in accordance with the Maslov representation of the Green function. The condition of non-singularity is satisfied if we enforce (cf. eq. 63) where the first factor on the right-hand side is bounded in view of its homogeneity property (degree 0). Observe that if the medium were laterally homogeneous, M could simply be chosen to be equal to 1. The same condition appears in the proof that the compositions in eq. (38) preserve the geometrical aspects of the propagation of the singularities (Le . Note that the vertical stepsize is bounded in terms of a norm of horizontal medium derivatives. Secondly, we establish that the stationary phase approximation of each constituent oscillatory integral representation, in particular the initial one (k = 1), reproduces the geometrical spreading i.e. the solution of the transport equation. Using eq. (62), we find that The equation (φ (k) ) ω (k) = 0 leads to the relation so that Substituting this equation into eq. (66) then yields in which t (k) − t (k−1) represents the traveltime to cross the kth slab. We recognize the geometrical spreading-for x (k−1) ,ω (k) fixed-and the radiation pattern of a vertical force (ω −1 K prin 3 ) which is removed by the initial decomposition introduced in eqs (24) and (35)-the factor ω −1 accounts for the fact that we employ the particle velocity rather than displacement.

T R O T T E R -P R O D U C T D O U B L E -S Q U A R E -R O O T P R O P A G A T O R
In the remote sensing problem one places sources and receivers at the surface (x 3 = 0). Sources (at s 1,2 ) generate acoustic waves, which scatter at points x = (x 1,2 , z) in the subsurface and are observed at r 1,2 as a function of time t. (The x 3 -axis points into the subsurface.) To image the observations, the measurements of the sources and the receivers measurements are propagated into the subsurface in a presumed background medium back to the time when the scattering took place (Claerbout 1985). The operator thus obtained is composed of a propagator of the previous type associated with the source and a propagator of the previous type associated with the receiver.
We develop a Trotter-product representation for the imaging operator. Such an operator can be identified with the fundamental solution of the double-square-root equation. We recover isochrones from the Trotter-product representation, and relate certain phase variables in the representation to scattering angle and azimuth. We then modify the imaging operator into an operator that generates multiple images, namely for each scattering angle and azimuth. This yields the 'wave-equation' counterpart of asymptotic imaging with the generalized Radon transform. Finally, we deduce from the latter operator annihilators of the data.

The imaging kernel
Using, for example, arguments based on reciprocity , it follows that the process of seismic imaging based upon the first-order term of the generalized Bremmer series (De Hoop 1996) can be written in the operator form (IW 2 )(x 1,2 , z) = (1/2π) I(x 1,2 , z; s 1,2 , r 1,2 , t)W 2 (s 1,2 , r 1,2 , t) dt ds 1 ds 2 dr 1 dr 2 , where W 2 represents the (upgoing component of the) data. Here, I denotes the distributional imaging kernel that can be written as the Trotter-product representation forming a sequence, with x 3 = z and The imaging kernel follows the simultaneous downward continuation of sources (variables with.) and receivers (variables with. ) in accordance with the one-way Trotter product represented by eq. (37). In the absence of turning rays, in the high-frequency approximation, the kernel will approach δ(t − T ((r 1,2 , 0), (x 1,2 , z), (s 1,2 , 0))), where T denotes the two-way traveltime.

The canonical relation of the imaging operator (h im 1,2 = 0)
The phase function associated with distribution (73) is given by where µ = 1, 2 and ν = 1, 2 and x = (y (M) 1,2 , z). If we collect the phase variables, the stationary point set, , is defined as The defining equations of the stationary point set, , are then for σ = 1, 2 and l = 1, . . . , M − 1, k = 1, . . . , M. In analogy with the analysis of the one-way wave propagator, one can view the solution of the system of eqs (79)-(83) as an approximation on a lattice of the geometry of the Maslov high-frequency solution of the two-way scattering process, as represented in Fig. 2. An extension of the proof of Appendix B shows that the phase is non-degenerate. The associated Lagrangian submanifold of the cotangent bundle associated with the manifold of points {(x, y 1,2 , h 1,2 , t)}, is now given by 1,2 , k 1,2 , ω = x, ( ) x ; y 1,2 , h 1,2 , t,   x 1,2 , z, As before, the canonical relation, C , is obtained by twisting the Lagrangian submanifold , 1,2 , ω ∈ .
The so-called isochrone, the projection of the canonical relation on to the subsurface for fixed two-way traveltime t and source and receiver, i.e. fixed h 1,2 and y 1,2 , is shown in Fig. 3.
The polar angles associated with the covector Ξ are denoted by (θ Ξ , ψ Ξ ). We now introduce the parameters scattering angle and azimuth {ϑ, ψ} at (x 1,2 , z) according to and r 0 is given as the vertical vector pointing up (negative x 3 direction) in the plane orthogonal to Ξ as illustrated in Fig. 4. By analysing Fig. 4, we find that through the canonical relation ). On the other hand, we have Figure 4. Euler angles associated with the two-way scattering process. Left-hand side, the unit vector with the positive first component along the intersection of the vertical plane containing Ξ and the horizontal plane is denoted by r π/2 . We then denote r 0 = Ξ ∧ r π/2 . Right-hand side, call r ψ the unit vector in the directionξ −ξ. It lies along the intersection of the scattering plane (ξ,ξ) with the plane orthogonal to Ξ, i.e. (r 0 , r π/2 ). The azimuth angle is defined as the angle between r ψ and r 0 , while the scattering angle ϑ is defined as the angle betweenξ andξ.
We observe that the construction of 3 out ofξ,ξ is independent of azimuth in view of the isotropy of the medium here. In general, we will have four unknown angles (ϑ, ψ), (θ Ξ , ψ Ξ ) and four accessible spectral variables, (k ). They are a manifestation of offset and mid-point in the subsurface. The mentioned scattering angle and azimuth together form the set of parameters along the characteristic strips in the wavefront set of the single-scattered seismic data (see Stolk & De Hoop 2002). It is along these characteristic strips that we can verify the velocity model by differential semblance.

Angle-gather imaging condition
The process of imaging, through the path integral, induces a composite mapping To highlight the redundancy in the data, the redundancy being parametrized by (ϑ, ψ), we will create images for each (ϑ, ψ). To this end, we carry out a final Fourier transform of the kernel of the double-square-root propagator, J M , with respect to (h subject to the substitution We consider the outcome at y (M) 1,2 = x 1,2 as a function of (ϑ, ψ), which defines the angle common image point gather at depth z. The redundancy in the data is made apparent by applying an operator of the type eq. (91) with the final line replaced by Such an operator annihilates the data in accordance with the differential semblance (Symes 1991)criterion applied in (ϑ, ψ) (Stolk & De Hoop 2002). The angle gathers are controlled by dip (θ Ξ , ψ Ξ ). To create these gathers we therefore need to determine the direction of the isochrone cotangent vector, which is the subject of the next section.

M U L T I -R E S O L U T I O N A P P R O A C H T O D E T E R M I N I N G T H E M I G R A T I O N D I P ( I S O C H R O N E C O T A N G E N T D I R E C T I O N )
We discuss a procedure to estimate the migration dip while applying the imaging operator, using a continuous wavelet transform.
In the simplest situation, let ψ = ψ = 0 ('in-line' scattering). Then we obtain the angle ϑ out of the ratio and we just have to determine 3 . In general, we will have to determine both (θ , ψ ). We will do so by a careful windowed Fourier analysis. The estimation of the isochrone cotangent direction, known in seismology as the migration dip, can be carried out with the aid of a continuous wavelet transform. Let be a symmetric, smooth function (Gaussian for example), and define the spatial Fourier transform (indicated by.) P τ φ combined with a cut-off φ as with ψ x,τ (y) = φ τ (y − x).
If φ is Gaussian, its Fourier transformψ x,t is also Gaussian. For f we now substitute the distribution imaging kernel I, and employ the Fourier transform x 0 = (y Then (x 0 , Ξ 0 ) does not belong to the wavefront set of I if there is a conical neighbourhood of (x 0 , Ξ 0 ) such that on this neighbourhood for all a, N ≥ 1, to Folland, and Córdoba & Fefferman (Folland 1989). Taking the complement of such conical neighbourhoods yields the singular direction associated with the isochrone.

N U M E R I C A L E X A M P L E
We will briefly discuss an example of applying the method to synthetic data in the presence of caustics. The model (Brandsberg-Dahl et al. 2003a) (2-D) contains a low-velocity lens (Fig. 5). It further consists of two layers separated by a ramp-like interface that represent a target reflector with an adjacent fault.
The upper layer has constant gradients. The seismic data were generated by a time-domain finite-difference approach (the dominant frequency of the source signature was 35 Hz). The vertical component of a typical shot gather over the lens is shown in Fig. 6 (the direct arrival has been removed from the shot gathers by remodelling without the reflector being present). For the imaging we invoked the generalized screen approximation for the Trotter product Le Rousseau & De Hoop 2001a). We focus on a horizontal location where multipathing dominates. The ACIG generated using the true wave speed model is shown in Fig. 7(a). The seismic event is unfolded and the alignment ('flatness') confirms the redundancy in the data. Since we used the true model, applying the annihilator results in the disappearance of the seismic event, see Fig. 7 An incorrect model results in 'non-flat' events in the ACIG, see Fig. 7(c) where we removed the lens from the wave speed model; applying the annihilator results in Fig. 7(d). Clearly, the annihilator can be used as a mismatch criterion for tomographic inversion (Brandsberg-Dahl et al. 2003b;De Hoop et al. 2001).

D I S C U S S I O N
In this paper, we have presented a microlocal (high-frequency) analysis of the double-square-root equation and its possible solution schemes. The double-square-root equation generates the wave-equation seismic imaging procedure. In particular, we found that (Maslov-)Kirchhoff and wave-equation imaging procedures are microlocally compatible and hence should yield the same result on the singular support of the image. More precisely they are equivalent if all the data are used (and the data set is complete, i.e. 5-D). However, high-frequency methods (Brandsberg-Dahl et al. 2003a,b) show artefacts in ACIGs with a moveout (with angle) that are not visible in the results of Section 6. Such differences suggest that in the angle domain the two methods are not fully equivalent. This is analysed in detail in Stolk & De Hoop (2001).
As far as the solution schemes of the double-square-root equation are concerned, we found a fundamental condition for the step size in depth that plays an important role in complex (strongly heterogeneous) velocity models. Finally, we designed a wave-equation imaging scheme honouring the characteristic strips in the wavefront set of seismic data, thus generating scattering-angle/azimuth common image point gathers of the type introduced in Kirchhoff-style imaging procedures. ACIGs are used for amplitude inversion and in tomography.
With our approach, 'true-amplitude' aspects of the imaging procedure in inhomogeneous media can also be addressed, namely via the composition of the adjoint (solution of the double-square-root equation) and the forward scattering operator (the first term of the generalized Bremmer coupling series). In the case of laterally homogeneous media such an analysis has been carried out (Deregowski & Rocca 1981;Dubrulle 1983;Ekren & Ursin 1999). In the case of heterogenous media, see Stolk & De Hoop (2001).
We analysed the case of acoustic waves. We have applied the imaging procedure to real exploration data (Le Rousseau et al. 2002). The tools needed to extend the current theory to mode-coupled elastic waves may be found in Le .

A P P E N D I X A : S O M E C O N C E P T S F R O M M I C R O L O C A L A N A L Y S I S A N D S Y M P L E C T I C G E O M E T R Y
Microlocal analysis is the study of singularities (of a wavefield for instance) and their 'high-frequency' orientations (directions of propagation); the two combined form the wavefront set. In direct and inverse wave propagation problems, microlocal analysis provides mathematical tools to follow the propagation of singularities, even in rather complex media (in the presence of caustics). In a remote sensing experiment, most of the information is contained in the wavefront set of the data. Microlocal analysis thus offers a suitable framework for inverse scattering theory. The wavefront set of u is denoted by WF(u). For a basic reference on the theory of distributions we refer the reader to Schwartz (1966b). For more advanced treatments see Schwartz (1966a), Treves (1967 and Hörmander (1990). For basic references on manifolds and differential geometry see Spivak (1965) and Choquet-Bruhat et al. (1996). For more advanced sources we refer the reader to Dieudonné (1972). For a treatment of symplectic geometry we refer the reader to Abraham & Marsden (1978) and Hörmander (1985a, chapter 21).

A1 The tangent and cotangent bundles
Manifolds generalize the concept of surfaces. At every point x of a manifold X of finite dimension we can define the tangent space, T x X , for instance as the set of all vectors tangent, at x, to smooth curves on X passing through x. We then define the tangent bundle of X as The tangent space of X at x is a vector space of the same dimension as X , say n. One can therefore introduce its dual T * x X and then define the so-called cotangent bundle as For R n we have T * R n R n × R n . In the forward and inverse Fourier transforms we encounter the inner product x, ξ . Since on R n the tangent space is identified with the space itself, ξ is then considered as a dual variable of x, i.e. the cotangent variable. Since the directions of the singularities of distributions on R n are given by Fourier analysis (Hörmander 1990, Section 8.1)-and hence are cotangent directions-we are allowed to define the wavefront set as a subspace of T * R n . A subset of T * R n is said to be conical if it is not affected by positive scaling in the ξ direction, i.e. if (x, ξ ) ∈ then for all t > 0 (x, tξ ) ∈ .

A2 Symplectic and Lagrangian manifolds
The archetypical example of a symplectic manifold, S say, is given by T * R n along with the 2-form σ with the local representation (u, v) and (u , v ) are in the tangent plane, T (x,ξ ) T * R n , of T * R n at (x, ξ ), A submanifold L is Lagrangian if its dimension is n, i.e. half the dimension of S, and σ | L = 0, i.e. for all x ∈ L, T x L is its own orthogonal with respect to σ . An example of a Lagrangian submanifold of T * R n is the conormal bundle, NX , of a submanifold X of R n .
If (S 1 , σ 1 ) and (S 2 , σ 2 ) are two symplectic manifolds, a Lagrangian submanifold of S 1 × S 2 with respect to the symplectic form σ 1 − σ 2 is called a canonical relation from S 2 to S 1 .

A3 Rays, bicharacteristics
It is common to refer to the cotangent bundle T * X as phase space. When solving a Hamiltonian system, i.e. tracing rays in phase space, one actually builds a correspondence 'table' C between points in phase space. Two points (x 0 , ξ 0 ) and (x, ξ ) are related to each other if there is a ray (bicharacteristic) that connects them with traveltime t and frequency τ , then (x, t, ξ , τ ; x 0 , ξ 0 ) ∈ C. The 'table' C is a canonical relation C 2003 RAS, GJI, 153, 52-74 in T * (X × R) × T * X . With this simple example we see that canonical relations are therefore important geometrical objects that control the propagation of singularities.

A4 Phase functions
Let X be a manifold of dimension n, e.g. R n , and φ(x, θ) a smooth real-valued function in an open conical set ⊂ X × (R N \0) which is homogeneous of degree 1 in θ , i.e. φ(x,λθ ) = λφ(x, θ ) for all λ ∈ R. Then φ is called a clean phase function if dφ = 0 and the stationary point set is a smooth manifold with tangent space defined by the equations d((φ ) θ ) = 0. If the number of linearly independent equations d((φ ) θ ) = 0 is N − e we call e the excess of the phase. The phase is said to be non-degenerate if e = 0. The dimension of the stationary point set, S φ , is n + e. Furthermore, the set φ = {(x, ξ) | ξ = (φ ) x (x, θ), (φ ) θ (x, θ) = 0 for some θ ∈ R N \0} has dimension n and is a locally conical Lagrangian submanifold of T * X .
Conversely, given a conical Lagrangian submanifold in T * X , at every point of , (x, ξ ) say in local coordinates, we can locally parametrize by a non-degenerate phase function φ in a conical neighbourhood of (x, ξ ) (Hörmander 1985a, theorem 21.2.16). Hence a high-dimensional object such as a Lagrangian submanifold (or a canonical relation) can be locally represented by a single function.
For illustration, let us come back to the previous example of a canonical relation generated by bicharacteristics: where the functions x(x 0 , ξ 0 , t) and ξ (x 0 , ξ 0 , t) are the solutions of the Hamiltonian system. A convenient choice of a phase function is described in Maslov & Fedoriuk (1981). They state that one can always use a subset of the cotangent vector components as phase variables. Let us choose coordinates of the form (x I , x 0 , ξ J , τ ), where I ∪ J is a partition of {1, 2, . . . , n}. It follows from theorem 4.21 in Maslov & Fedoriuk (1981) that there exists a function S such that locally C is given by Such a function S is called a generating function for the canonical relation C. It is a generalization of traveltime in the presence of caustics. A non-degenerate phase function is then found to be φ(x, x 0 , t, ξ J , τ ) = S(x I , x 0 , ξ J , τ ) + ξ J , x J + τ t.

A5 Oscillatory integrals and Fourier integral operators
Oscillatory integrals (OIs) are (finite) distributions on an open set of R n with an integral representation that makes sense via regularization (see Hörmander 1990, Section 7.8, Hörmander 1971, Section 1.2 and see below). Let φ be a non-critical ‡ phase function on an open cone of × (R N \0) for some N, i.e. φ = φ(x, θ ) is real, § smooth, and homogeneous of degree 1 in θ. Let also a ∈ S m ( × R N ), i.e. let a be a symbol of order m: ∂ α ξ ∂ α x a(x, ξ) ≤ C α,β,K (1 + |ξ |) m−α , x ∈ K , ξ ∈ R n , for K ⊂ compact. Then for u ∈ C ∞ c ( ) the integral is well defined. It is clearly defined if, for instance, a is rapidly decreasing and from this I φ can be extended to symbols of any order in a unique manner. Such an integral is then called an OI. The symbol a is also referred to as the amplitude of the OI. A way to perform the regularization procedure is as follows: if χ is a given compactly supported smooth function on R N (∈ C ∞ c (R N )), we have as a distribution. The wavefront set of the OI can be estimated through the phase function as WF(I) ⊂ {(x, (φ ) x (x, θ)) | (x, θ) ∈ S φ ∩ supp(a)}.
When the phase is clean the stationary point set, S φ , of the phase function is a submanifold (Hörmander 1985a, definition 21.2.15). Then is locally a Lagrangian submanifold of T * ( ). In the case when = X × Y , with X and Y open subsets of R m and R p for some m and p (m + p = n), the OI defined by such a phase function, φ and a symbol, a, may be viewed as the kernel of an operator (Schwartz 1966a;Treves 1967;Hörmander 1990), A, from C ∞ c (Y ) (the space of compactly supported smooth functions on Y ) into D (X ) (the space of distributions on X ). Such an operator is called a (local) Fourier integral operator. If φ has no stationary points as a function of (y, θ) for all x ∈ X , A maps C ∞ c (Y ) into C ∞ (X ). If φ has no stationary points as a function of (x, θ) for all y ∈ Y , A can then be extended from E (Y ) (the space of compactly supported distributions on Y ) into D (X ). A phase function that satisfies both these properties is called an operator phase function (Hörmander 1971, definition 1.4.4).
To the Lagrangian submanifold, φ , one can associate the canonical relation, This canonical relation gives the proper geometric tool to follow the propagation of singularities through the application of the FIO by the algebraic composition of C φ and WF (u): WF(Au) ⊂ C φ • WF(u) = {(x, ξ) | ∃(y, η) ∈ WF(u) such that (x, ξ, y, η) ∈ C φ }.
In the case of the example of Section A3, where bicharacteristics build the canonical relation, one then sees that the composition of the canonical relation and the wavefront set just expresses that the singularities propagate along the bicharacteristics. Two properly supported FIOs associated with operator phases can be composed, and the resulting operator will be an FIO with clean phase, if their canonical relations compose cleanly (Treves 1980;Hörmander 1985b). Figure A1. Matrix of the differentials of (φ ) θ . Figure A2. Bicharacteristics satisfying the common azimuth assumption.