m Wavelet analysis of the coherence between Bouguer gravity and topography: application to the elastic thickness anisotropy in the Canadian Shield

We use a wavelet transform to compute the local and azimuthal variations of the coherence between Bouguer gravity and topography in eastern Canada. The isotropic coherence is calculated by averaging the wavelet spectra from optimally overlapping 2-D Morlet wavelets having an isotropic spectral envelope in adjacent directions within 180 ◦ , deﬁning the so-called ‘fan’ wavelet. The isotropic wavelet coherence spectrum is inverted to obtain local estimates of the elastic thickness ( T e ) of the lithosphere. We calculate the anisotropic coherence by restrict-ing the fan wavelet over an azimuthal range of 90 ◦ . The direction of maximum coherence is diagnostic of the direction of preferred isostatic compensation, or the direction where the lithosphere is weakest. The coherence is inverted using the theoretical response of a thin anisotropic plate model. We have carried out extensive tests on synthetic topography and Bouguer gravity data sets to verifythat:(1)thewaveletmethodcanrecover T e forsimplemodelswitheitherhomogeneousor spatially variable rigidity patterns; and that: (2) the method can determine azimuthal variations in the 2-D coherence for homogeneous models with anisotropic T e . We have used data from the eastern Canadian Shield to infer the variations in T e and the anisotropy of the coherence. The relative variations in T e show trends similar to those obtained in previous studies that used different spectral methods. The wavelet transform gives T e values between 30 and 120 km. T e is generally high ( > 80 km) throughout eastern Canada. Lower values (30–60 km) are found in the eastern Grenville Province, in the northern Appalachians, and in the Superior Province in the Great Lakes region. The high values found in Hudson Bay are consistent with previous studies of elastic thickness and models of basin subsidence. The direction of maximum coherence obtained from the wavelet method is also consistent with our previous results obtained with the multitaper method and shows that the weak mechanical axis is perpendicular to the fast seismic axis where seismic anisotropy has been detected.


I N T RO D U C T I O N
The correlations between the topography and gravity anomalies provide important information on the level of isostatic compensation of the lithosphere at the geological timescale, and reflect its thermomechanical state (Watts 2001). The response of the lithosphere to surface (e.g. mountain belts, sedimentary basins) and internal loadings (e.g. Moho undulations) is modelled by assuming that regional isostasy is achieved by the flexure of a thin elastic plate overlying an inviscid fluid. The effective elastic thickness T e of the lithosphere is defined as the thickness of an equivalent elastic plate that would give the same response under the known tectonic loading. It is obtained from the flexural rigidity parameter, D, used in the equation of flexure of a thin elastic plate where E is the Young modulus and ν Poisson's ratio (assumed to be 10 11 Pa and 0.25 throughout this work). Elastic thickness depends on many factors, including the density structure, the thermal and stress state, and the mechanical properties of the lithosphere (Burov & Diament 1995). In the oceans, the rheology is relatively simple and the estimates of the flexural rigidity correlate well with the depth to the 450-600 • C isotherm, calculated from a cooling plate model (Watts 2001). In the continents, where the rheological properties of the lithosphere are vertically and laterally heterogeneous, T e does not correspond to an isotherm or to a physical boundary. In general, T e is low in young and tectonically active regions and increases in the stable continent (Lowry & Smith 1994;Flück et al. 2003). An approximate correlation between the long-term strength of the lithosphere and its age has been suggested in a few regions, for example, in Europe (Poudjom-Djomani et al. 1999;Pérez-Gussinyé & Watts 2005), and Australia (Simons & van der Hilst 2002). This simple model does not apply everywhere: for example, little correlation of T e with heat flow or the geology is found in some shields, for example, in the Siberian craton (Poudjom-Djomani et al. 2003) and the Canadian Shield (Audet & Mareschal 2004a). There are two main approaches in the estimation of T e : the direct and inverse approaches. In the former, forward modelling of the gravity anomalies computed from the assumed tectonic loading can be compared with the observed gravity field to infer the mechanical properties of the lithosphere. This method is useful for certain geological settings such as mountain belts, seamounts and sedimentary basins, where the loading structure is well known (e.g. Karner & Watts 1983;Stewart & Watts 1997). More often, however, the estimates of T e are inverted from the spectral relationships between topography and gravity anomalies, assuming that the lithosphere behaves as a thin elastic plate. Following Forsyth (1985), the majority of researchers use the coherence between Bouguer gravity and topography to estimate T e because it allows the decomposition of the loads into surface and internal components and is less sensitive to short wavelength noise in the data than the admittance between free-air gravity and topography. The coherence between two fields F and G is defined in the Fourier transform domain as: where denotes some averaging in the 2-D wavevector space k, and the asterisk indicates complex conjugation. The coherence is a measure of the phase relationships between two fields. If no averaging were done, the coherence would always be 1. The most common averaging method consists of binning over different annuli of wavenumber bands, but this destroys the azimuthal information in the spectra. For a discussion on the different ways of averaging the spectra, see Simons et al. (2000). For uncorrelated fields, the phases of the cross-spectra at a given wavevector are randomly distributed and averaging cancels the coherence. For correlated fields, the phases of the cross-spectra interfere constructively and averaging yields a high coherence. When surface or internal loads are fully compensated by the deflection of the plate at long wavelengths, the Bouguer anomaly is negatively correlated with the topography, resulting in a high coherence. At shorter wavelengths, the loads are supported by the strength of the plate, and, if there is no correlation between initial surface and internal loading, the Bouguer anomaly is incoherent with the topography. The transition wavelength from low to high coherence depends both on the rigidity and the loading structure of the plate. For estimating T e , the observed coherence is compared with the coherence predicted for a thin elastic plate with some assumptions about the loading scheme (Forsyth 1985). Major differences in the estimation of T e result from the use of various spectral estimators when calculating the isotropic coherence. The fact that high flexural rigidity implies long transition wavelengths imposes a lower limit on the size of the windows used to calculate the spectra. Tests with synthetic data have shown that the T e estimates based on the modified (windowed or mirrored) Fourier periodogram and multitaper methods are highly sensitive to window size (Ojeda & Whitman 2002;Audet & Mareschal 2004a;Pérez-Gussinyé et al. 2004). The multitaper method calculates the spectra with multiple orthogonal windows used as data tapers to reduce the variance of the estimates, and averaging is done over different, (approximately) independent, subsets of the data (Simons et al. 2000). The resolution degrades with the number of tapers used. Pérez-Gussinyé et al. (2004) showed that serious discrepancies occur when comparing the multitaper coherence with theoretical curves, due to the large bias introduced near the transition wavelength by the tapering procedure. Parametric spectral estimators (e.g. maximum entropy) have been found to perform better on synthetic data (Lowry & Smith 1994;Audet & Mareschal 2004a). However, the basic assumption of the parametric estimation, that the data were produced by an autoregressive stochastic process, while likely to be met by numerically generated fractal surfaces, remains to be verified in the case of heterogeneous and anisotropic data, such as continental topography and gravity fields.
The flexural rigidity of the lithosphere is usually assumed to be isotropic. This is a convenient assumption because it allows the reduction of the problem to 1-D by averaging the azimuthal information in the spectra. It has been shown by several recent studies (Lowry & Smith 1995;Simons et al. 2000Rajesh et al. 2003;Swain & Kirby 2003b;Audet & Mareschal 2004b) that the coherence increases in one direction compared to the azimuthal average, and this anisotropy reflects the preferred direction of isostatic compensation where the lithosphere is weaker. The retrieval of anisotropy in the coherence is hampered by the lack of a suitable averaging method. Multiple windowing techniques are better suited to this task than either the modified periodogram or the maximum entropy method because the coherence can be calculated at each wavevector, enabling the detection of anisotropy. However, all the methods mentioned above return a single estimate of T e or its anisotropy at each window, thus limiting the spatial resolution.
In the course of quantifying the lateral and azimuthal variations of the coherence between topography and Bouguer anomaly, the knowledge of the local phase information of these fields is necessary. A multiple windowing technique that uses orthonormal Hermite polynomials in 2-D, providing the necessary condition for both spatial and spectral localizations and enabling the retrieval of anisotropy in the coherence, has already been developed ). However, the use of many windows decreases the spectral resolution and direct comparison of the coherence with predictions is inaccurate, unless the same bias is carried in the calculation of the predicted coherence (Pérez-Gussinyé et al. 2004).
In recent years, there have been new developments in the application of wavelet analysis to geophysical data. The main advantage of the wavelet analysis over the Fourier transform approach is that it is applicable to non-stationary time-series. We believe that the study of isostasy would benefit from a wavelet point of view since the wavelet transform uses optimally sized windows to obtain the spatial localization and does not require multiwindowing. So far, the majority of applications using the wavelet transform in isostatic studies have been restricted to the calculation of the local isotropic coherence. Kido et al. (2003) constructed an isotropic wavelet-like kernel by azimuthal averaging a 2-D Gabor function and correcting for spherical geometry. Stark et al. (2003) derived a tensor-like wavelet that makes use of the multiple derivatives of the real valued Gaussian function in different directions. Recently, Kirby (2005) showed that only the complex valued Morlet wavelet and its relatives are able to reproduce the Fourier spectrum accurately because of the similarity between the basis functions of the Fourier transform and the Morlet wavelet. This property is important for the comparison of the observed 1-D coherence curves with theoretical predictions to yield T e estimates. Moreover, the directional selectivity of the Morlet wavelet in the spatial domain naturally allows for the detection of azimuthal information in the spectra. Kirby (2005) developed a quasi-isotropic wavelet dubbed the 'fan' wavelet by superposing optimally spaced Morlet wavelets in a given azimuthal range. This method is further exploited in a recent paper by  where they use a restricted fan wavelet to detect anisotropy in the coherence. This paper will first describe how the wavelet transform based on this directional wavelet can be used in the study of lateral and azimuthal variations in the 2-D coherence. The method is then demonstrated on synthetic gravity/topography with homogeneous, heterogeneous isotropic, and with homogeneous anisotropic elastic thickness. We apply this method to data from the Canadian Shield and compare the results with relevant geological and geophysical information from this area.

Wavelet transform
The literature on wavelets being quite extensive, we will only briefly review here the basic theory necessary for our particular application and refer the interested reader to Foufoula-Georgiou & Kumar (1994), Torrence & Compo (1998), or Holschneider (1999. The 2-D wavelet transform is defined as the convolution of a signal with a scaled and rotated kernel ψ θ a,b called a wavelet (Foufoula-Georgiou & Kumar 1994). Manipulating the wavelet dilation parameter (or scale) a and azimuth θ has the effect of analysing a signal at location b at different scales and directions, where a can be related to an equivalent Fourier wavenumber. The wavelet transform is thus a multiresolution operation that allows the detection of non-stationarity and a localized characterization of the signal at different scales and azimuths. The shape of the wavelet determines its localization in both space and wavenumber domains. We shall use the more general terminology physical space and spectral space for the representation of a signal in the two reciprocal domains.
A wavelet must satisfy two conditions: (1) zero mean, to insure a wave-like behaviour and (2) compact support in both physical and spectral spaces. We construct the family of wavelets by translating and dilating the argument of a mother wavelet where r is the location in the 2-D physical plane (r x , r y ), and C θ is the rotation operator There are two classes of wavelet transform: (1) orthogonal and (2) non-orthogonal. The orthogonal wavelet transform allows the decomposition of a signal into a set of orthogonal basis functions, like a Fourier transform, and its validity is restricted to discrete signals. This property is appealing for the archiving of large data sets, digital image compression and filtering. It also provides a minimally redundant representation of a signal. The orthogonal wavelet kernels are usually complicated functions to express analytically in both spaces. The non-orthogonal wavelet transform is defined at arbitrary wavelengths, a useful property for the spectral analysis of continuous signals. In this study we use a discrete version of the non-orthogonal continuous wavelet transform (conventionally termed CWT) W f of f (r), defined as where the asterisk denotes complex conjugation. The kernel functions ψ θ a,b determine the resolution of the wavelet transform in both physical and spectral spaces. At large scales, the wavelet is spread in physical space and the wavelet transform picks up information at long wavelengths. At smaller scales, the wavelet is well localized in physical space and the wavelet transform picks up information at short wavelengths. The wavelet resolution is bounded by the uncertainty principle, which states that physical and spectral localizations cannot be simultaneously measured with arbitrarily high precision. Hence the spatial localization of the wavelet comes at the expense of the spectral localization.
In Fourier domain, the wavelet transformŴ F becomeŝ where k is the two dimensional wavevector (k x , k y ),F(k) is the 2-D Fourier transform of f (r) andψ * a (C θ (k)) is the complex conjugate of the Fourier transform of the wavelet ψ at scale a and azimuth θ . The multiplication of the spectra in the Fourier domain is faster to perform than the convolution in eq. (5), and one can take advantage of the fast Fourier transform (FFT) algorithm for transforming back and forth from space to Fourier domain.

Wavelets
The choice of a wavelet depends on several factors, including the need for a real or complex valued function, shape and resolution of the desired wavelet, and ultimately the capacity of resolving anisotropic features in 2-D fields. In our case where we want to calculate the coherence between two fields, it is important to use a complex valued wavelet to retrieve the phase information. To compare the coherence with Fourier-derived predictions, it is also natural to use a wavelet constructed from complex exponentials modulated by a smoothing function. The Morlet wavelet is thus an ideal candidate for this task and exhibits a natural property of directional selectivity which allows the detection of azimuthal information in 2-D fields.

Morlet wavelet
In 2-D, the Morlet wavelet is an oscillating function of wavevector k 0 = (k 0 x , k 0 y ) modulated by a Gaussian smoothing function where |k 0 | ≈ 5.336 in order to satisfy the zero mean condition, and A is the 2 × 2 anisotropic diagonal matrix where the first nonzero element is , with 0 < ≤ 1, and the second element is 1 (Antoine et al. 1993;Kumar 1995;Antoine et al. 1996). By setting = 1, the Morlet wavelet has an isotropic envelope in the spectral space. This property will be convenient in the construction of the fan wavelet to be described later. We refer to this wavelet as the isotropic Morlet wavelet. In the spectral space the Morlet wavelet is a Gaussian bandpass filter centred on the wavevector k 0 The Morlet wavelet in the spectral space is almost entirely supported in the positive domain, subject to the restriction |k | > |k 0 |.
The wavevector k 0 can be arbitrarily chosen and will filter information in the direction given by In Fourier domain, the peak representation of the Morlet wavelet is at wavenumber which is considered as the equivalent Fourier wavenumber.

'Fan' wavelet
In the context of finding a suitable isotropic wavelet that can be interpreted as a Fourier spectrum, Kirby (2005) showed that a controlled superposition of directionally adjacent isotropic Morlet wavelets, dubbed the 'fan' wavelet because of its shape in spectral space, provided better results than any of the commonly used wavelets (DoG (Derivative of Gaussian), Paul, Perrier and Poisson). The fan wavelet is expressed aŝ The geometry of the fan wavelet is achieved by averaging adjacent isotropic Morlet wavelets separated by an azimuthal sampling of δθ = 2 √ −2 ln p/|k 0 |, where the optimal value of p was found to be 0.75 (Kirby 2005). The fan wavelet is obtained by setting a single range of azimuthal increments N θ = int( θ/δθ ), where θ is the azimuthal extent. For θ = π, the fan wavelet includes 11 adjacent Morlet wavelets and is quasi-isotropic. For smaller values of θ, the fan wavelet is anisotropic.

Cross-spectral analysis
Eq. (5) is used to compute a wavelet scalogram, which represents the energy density of a function f (r) at scale a, location b and direction θ The wavelet cross-scalogram of any arbitrary 2-D functions f and g is calculated in the same way as (11) Eqs (11) and (12) are combined to estimate the local wavelet complex admittance and the complex impedance Q f g (a, θ, b) = Q g f (a, θ , b). The local wavelet coherence is the product of admittance with impedance: At this point, if no averaging is performed, the coherence is 1 at all scales and azimuths. In the wavelet implementation using the fan wavelet, the averaging is done over distinct azimuths where θ = θ ± θ. Details on the azimuthal averaging are discussed in the next section. One can also calculate a global estimate of the coherence from the wavelet scalograms by averaging the local wavelet spectra over the spatial coordinates b at each scale and azimuth.

Algorithm
We follow the standard practice and calculate the convolution in Fourier domain using the fast Fourier transform (FFT) algorithm to transform back and forth from space to spectral domains. We now describe in more details the steps involved in the calculation of the cross-spectral quantities (eqs 11 to 14). We start by Fourier transforming the Bouguer and topography fields. In the second step, we define (1) the range of scales to be computed, depending on the size of the data window and the size of the grid, and (2) the discrete angle increments for the azimuthal selectivity of the fan wavelet (≈16.3 • ). Second we loop over all scales and azimuths up to 180 • (i) Calculate the daughter wavelet in spectral space for this set of parameters.
(ii) Compute the Fourier representation of the wavelet transform by multiplying the daughter wavelet with the transformed fields (eq. 6).
(iii) Inverse Fourier transform the result to obtain the local wavelet transform in physical space (eq. 5).
(v) Average the scalograms and cross-scalogram depending on the type of analysis; and finally.
(vi) Combine the resulting quantities to obtain the local admittance and coherence (eqs 13 and 14).
Step (v) is the key procedure in the algorithm, as it controls the azimuthal resolution and the variance of the coherence. In the isotropic case, the averaging is performed over individual wavelet spectra calculated from a range of Morlet wavelets spanning 180 • in the spectral space. By averaging the local spectra over a restricted range of azimuths, the anisotropic coherence is obtained at the azimuth corresponding to the middling value of the range. The width of the range controls the variance by virtue of the number of selected spectra in the averaging, while the overlap between adjacent ranges controls the azimuthal resolution. Note that no averaging in wavenumber (or scale) is performed. We follow  and set the azimuthal range to 90 • in the anisotropic analysis. The azimuthal distance between overlapping ranges is set to 5 • , which is smaller than the azimuthal separation between adjacent Morlet wavelets. This introduces some redundancy and increases computational time but improves upon the azimuthal resolution in the inversion.

Limitations
While the use of the FFT to compute the convolution in the spectral domain speeds up the calculations, it can also bias the scalograms at the larger scales. Because we are dealing with finite size 2-D fields, errors will occur at the edges of the grids, as the FFT assumes that the data are periodic. This means that signals at one edge of the field will get wrapped around the other end (Torrence & Compo 1998). This effect is more pronounced at larger scales as the influence of each wavelet extends further in space. There is thus a cone of influence beyond which edge effects become important. In our application, it means that for gridpoints close to any edge of the grid which have a large transition wavelength in the coherence (large T e ), the estimated elastic thickness and the anisotropy are affected by data from the opposite side of the grid. This effect increases with increasing contrast in T e at opposing edges of the study area. There exists many pre-processing techniques aimed at removing the edge effect in spectral analysis. The most obvious one that does not require any data manipulation is to select a data window that is much wider than the region of interest with the smallest grid sample possible. This procedure is ideal but not applicable in all cases as the data availability and resolution are usually limited. Another possibility is to mirror the edges of the window, calculate the spectra on the larger data set, and retain only the initial quadrant in the final analysis. However it is well known that this procedure introduces artificially long wavelengths and can greatly affect the value of T e (Lowry & Smith 1994;Ojeda & Whitman 2002;. For example, small wavelength coherent signals at one edge of the initial data window will be mapped at longer wavelengths in the enlarged grid due to the mirroring, and the coherence curve will be shifted accordingly, resulting in the elastic thickness being over-estimated at the edges. The anisotropic coherence might suffer even more from the mirroring since the weak direction will also be mirrored at gridpoints close to any edge of the data window.
The preferred method used in this study pads the 2-D fields with zeros to the next power of 2 in each direction, applies the FFT on the larger grid, and retains only the relevant quadrant in the computation of the scalograms. Spectral leakage due to discontinuities in the data are dealt with by the application of a Hanning taper to all sides of the initial data set. This procedure down-weights the wavelet spectral power at the edges, but the phase of the cross-spectra is confidently recovered and no artificial information is introduced in the data.

Isotropic coherence inversion
In the inversion step, T e is estimated by minimizing the misfit between the observed coherence and predictions from an ideal elastic lithosphere, provided a certain amount of assumptions about the loading scheme, depth to the surface of compensation, and the elastic parameters. One approach compares the observed coherence with theoretical (analytical) curves, that is, with the solution of the elastic plate for an infinitely large window and a uniform load ratio f (Kirby & Swain 2004). However, as outlined by Pérez-Gussinyé et al. (2004) and Simons et al. (2000), there is a bias in the spectral estimates of the coherence. The comparison of biased estimates with theoretical (unbiased) predictions results in an inaccurate estimate for the elastic thickness. The wavelet transform does not escape this problem because the resolution in physical and spectral spaces is bounded by the uncertainty principle. Wavelets are bandpass functions, hence the transform at a single scale will include information from neighbouring wavenumbers. As the scales increase, the wavelets are more localized in the spectral space and the transform is closer to the Fourier spectrum, at the cost of spatial resolution. At wavenumbers near the transition from low to high coherence, the estimated coherence is biased towards higher values because the contribution to the coherence of the wavelet at neighbouring lower wavenumbers will smear out the long wavelength information. The fractal nature of the topography and Bouguer gravity fields will also put more weight on the long wavelengths and together these effects will shift the transition wavelength toward lower values, resulting in the elastic thickness being underestimated. The assumption of uniform load ratio f is also a crude approximation because it is a wavenumber dependent parameter calculated from the data that affects the shape of the coherence, albeit in a weaker manner than T e (Forsyth 1985). Pérez-Gussinyé et al. (2004) suggest to use the same spectral estimator for calculating the predicted and the observed coherence. In this case, the same biases affect both quantities and their effects balance each other in the inversion for the elastic thickness. For the wavelet spectral estimator, the Fourier-based deconvolution method of Forsyth (1985) can be adapted to fit the purpose of T e calculation by assuming that adjacent local spectra are decoupled, as was done by Stark et al. (2003) and .
The method proceeds as follows: (1) using the wavelet transforms of the Bouguer gravity W B and topography W H at each scale and azimuth, we select a particular spatial point and solve for the initial surface W H i and internal W B i loads for a given value of T e ; (2) the initial loads are decomposed into the four components of the surface (W Ht and W Bt ) and subsurface (W H b and W B b ) deflection that they cause; and (3) the auto and cross-spectra are averaged and combined to form the local predicted coherence, assuming that the initial loads are statistically uncorrelated: where the brackets indicate averaging overall azimuths. Thus by minimizing the misfit between the observed and predicted coherence, we are also able to calculate the local subsurface to surface loading ratio f , which is given by where W B i and W H i are the components of the best fit solution, and ρ c and ρ m are the crust and mantle density, respectively. Some authors have recently suggested to use the transition wavelength from low to high coherence (typically a coherence of 0.5) as a proxy for the mechanical strength, because it does not require making many uncertain assumptions as those needed for inverting T e , such as the depth and magnitude of the density contrast (Simons & van der Hilst 2002;Rajesh & Mishra 2004). The inversion of the observed coherence from synthetic data using theoretical curves should then be equivalent to calculating the transition wavelength since the density structure and the uniform load ratio are known. It is expected that both inversion methods should yield similar results on synthetic data when the transition wavelength is much smaller than the window size. At longer transition wavelengths (higher T e ) the limited wavelet resolution introduces bias in the estimated coherence near the transition wavelength and the Fourier deconvolution method should be more accurate than the inversion based on theoretical solutions.
In the inversion, the misfit is calculated by the L2 norm weighted by the variance of the observed coherence. In the next section we compare the ability of the different inversion methods to recover T e through the application of the wavelet method on synthetic grids.

Anisotropic coherence inversion
The full inversion of the 2-D coherence for anisotropic T e necessitates the modelling of the lithosphere as a thin plate with anisotropic mechanical properties. Although this method has already been developed and used in isostatic studies (Swain & Kirby 2003b;, its application in a laterally varying lithosphere remains ambiguous because of the complexity observed in the 2-D coherence. The method compares the 2-D wavelet coherence with theoretical solutions to the equation of the flexure of an orthotropic elastic plate given a large number of assumptions, such as intrinsic orthotropy, and isotropy in the subsurface to surface load ratio, which is kept constant at all wavelengths. The discussion about the bias in the estimates of the wavelet coherence and how it affects the recovery of T e also applies in this case. These arguments, coupled with the observed complexity in the observed wavelet coherence, render the interpretation in terms of the elastic thickness in two perpendicular directions debatable. However the response of the lithosphere should in theory be close to that of a thin elastic plate, and the orthotropic elastic plate model has the advantage of determining physical estimates of the anisotropy, whether the assumptions are valid or not. We thus choose to invert the 2-D coherence using the theoretical solutions as in . Relative comparisons can be made between the recovered isotropic and anisotropic elastic thicknesses, while the estimates of the weak direction are regarded as absolute.
In the 2-D coherence inversion, the misfit is weighted by the inverse of wavenumber to down-weight the contribution of short wavelength coherent features that are not associated with the response of an elastic plate .

N U M E R I C A L E X A M P L E S
Synthetic gravity and topography were generated as two non-correlated fractal fields with a spectral slope of β = −3 representing the initial surface and subsurface loads that are applied to a thin elastic plate with known elastic parameters and density structure (Macario et al. 1995). The elastic plate is assumed to consist of a crust of uniform thickness and density. The subsurface load is placed at the Moho 40 km deep, with a 500 kg m −3 density contrast at the interface and a crustal density of 2670 kg m −3 . The amplitudes of the loads were scaled by imposing a constant ratio f = 1 between surface and subsurface loads. The flexure produced by both loads was calculated by solving the thin elastic plate equilibrium equation in the spectral domain for uniform rigidity plates and with a finite-difference solution in the spatial domain for the case with a varying rigidity pattern.

Uniform rigidity pattern
We first verified the robustness of the wavelet transform method on synthetic data with spatially uniform flexural rigidity. We treat the isotropic and anisotropic cases independently and compare the ability of the wavelet transform and the different inversion methods in recovering the input T e . In all examples, synthetic data were generated on 256 × 256 grids with a sampling interval of 8 km. The spectra were calculated on a smaller subgrid (128 × 128) to avoid using periodic data sets.

Isotropic plate
We first tested the homogeneous isotropic plate. Applying the procedure outlined above, we generated 100 data sets for each isotropic elastic plate thickness of 10, 20, 40 and 80 km. Various tests were performed on these synthetic grids. First we compared the global wavelet coherence with theoretical predictions to evaluate the abil- ity of the wavelet transform to reproduce the Fourier coherence. The theoretical curves are calculated from eq. (10) of Pérez-Gussinyé et al. (2004), corrected for the numerator (variable φ not squared). We also compared the global wavelet coherence with the predicted global wavelet coherence as described in Section 2.6. The inversion for the best fitting elastic thickness is effectively done for each of the 100 runs and each input value of T e , a procedure that is now routinely carried out in studies of the recovery of T e with spectral methods (Macario et al. 1995;Swain & Kirby 2003a;Pérez-Gussinyé et al. 2004;Audet & Mareschal 2004a). In Fig. 1 the histograms show the distributions, mean values and standard deviations for each input T e for both inversion methods. The relative difference between the observed and predicted solutions and the standard deviations increase when T e is >40 km. When the observed wavelet coherence is compared with the predicted wavelet coherence rather than the theoretical function, the solutions are closer to the input values except for the lowest T e , where the estimates are similar. This could be because the theoretical solutions are as good as the deconvolution solutions for low T e 's, where the transition wavelength is much less than the area dimensions, as argued before.
We also calculated the local coherence at each spatial point for a single synthetic data set. best-fitting predicted and theoretical coherence. The T e recovered from the global wavelet coherence is 19.0 km when T e is inverted from the theoretical curves, and the mean and standard deviations of the local estimates on the whole grid is 19.5 ± 3.7 km; estimates for T e are, respectively, 22.0 km for the global coherence, and 24.5 ± 4.1 km for the mean and standard deviations when T e is inverted from the predicted coherence. The dominant wavelength in the spatial variations of the recovered T e is about 500 km, similar to what was found by Kirby & Swain (2004). With controlled synthetic data, the spatial variations in the estimated T e are on the order of 15 per cent the input value.

Anisotropic plate
Anisotropy in the flexural rigidity parameters D x and D y , where x and y are two perpendicular directions in a Cartesian coordinate system, produces a decrease in the transition wavelength in the direction where the rigidity is lowest. The rigidities are converted to equivalent elastic thicknesses in two perpendicular directions using constant and isotropic Poisson's ratio and Young's modulus. Following Swain & Kirby (2003b) method for the generation of synthetic gravity and topography with homogeneous anisotropic elastic thickness, and using the same procedure as before we generated synthetic data sets with uniform elastic thickness pairs of (T ex , T ey , φ) = (5, 10, 0), (20, 10, 0), (50, 10, 0) km, where φ is the weak direction calculated anticlockwise from the x-axis. The global wavelet anisotropic coherence inversion gives (T ex , T ey , φ) = (5, 10, 0), (18, 10, 5), (40, 12, 0). Fig. 3 shows the wavelet coherence at each scale and azimuth and the best fit theoretical coherence at a randomly chosen point on the grid. The local coherence inversion results are sensibly the same as for the global results and show that we can recover the direction of highest coherence with a good accuracy (±10 • ). For large T e contrasts, the recovered T e values are inaccurate but the coherence anisotropy is well marked (Fig. 3c and f). We must also keep in mind that the inversion with theoretical solutions underestimates T e , particularly when T e > 40 km for this size of grid.

Spatially varying rigidity pattern
The motivation for choosing a wavelet method over traditional spectral methods is an improved spatial resolution for recovering variations in T e . To address the issue of resolution, we calculated the local wavelet coherence on synthetic Bouguer and topography grids with a T e structure consisting of a circular core with diameter of 700 km and T e = 50 km, with radially decreasing values towards the edges of the window to T e = 0 km (Pérez-Gussinyé et al. 2004). The data were generated with a uniform load ratio f = 1, and the depth to Moho = 40 km. The study area size is 2048 × 2048 km 2 , with a sampling interval of 8 km. In this case the whole grid is retained in the calculations and the zero-padding/tapering procedure is not applied since the data are periodic in both directions.
The results in Fig. 4 show the recovered T e pattern obtained with both inversion methods along with the input structure. Both inversion methods show a roughly circular T e structure with values increasing from ≈0 km at the edges to ≈50 km towards the centre. This figure should be compared with fig. 4(a)-(h) of Pérez-Gussinyé et al. (2004). While their coherence results show a similar pattern when they use a 800 × 800 km 2 window, the resolution is limited to a 1200 × 1200 km 2 square region at the centre of the grid. The wavelet method, on the other hand, is able to recover the low values at the edges with good accuracy. However this might not be the case for different T e models as discussed in Section 2.5.
The rms errors calculated on both grids are 9.1 and 8.4, respectively, for (a) and (b). The deconvolution method also produces higher estimates of T e in the central region of the study area. The larger rms error than in Kirby & Swain (2004) and Stark et al. (2003) can be partly explained by the sharpness in the structure of our model, and by the wide range of input T e values. This rather extreme example illustrates well the limits of spectral methods in recovering spatial variations in T e : structures with high values of T e are well estimated only if their wavelength is longer than the transition wavelength from low to high coherence. When this condition is not verified, the spatial variations of T e represent a weighted average of the true distribution.
In light of these results on synthetic data, it is clear that one must be careful when interpreting maps of T e values, since both the processing and the inversion steps rely on a large number of assumptions, and even with known input parameters the inversion is flawed. The accuracy of recovered T e values strongly depends on the size of the study area compared to the transition wavelength.

Previous studies
The Canadian Shield is an excellent target for the application of the wavelet method to the study of the lateral and azimuthal variations    Figure 4. Recovered T e pattern from the input model shown in (c). (a) Is obtained by inverting the local coherence with theoretical curves. Estimates of (b) are calculated by inverting the local coherence with the adapted deconvolution method described in Section 2.6. of the coherence between Bouguer gravity and topography for a number of reasons. The eastern Shield comprises several Precambrian provinces around the Archean Superior craton; the different provinces of the Shield were welded together during the Proterozoic during orogenic events that are expressed in the topography and the gravity fields. The structure of the crust and mantle are well documented only in the southern part of the Shield in the regions where LITHOPROBE conducted active seismic studies on selected transects (e.g. Winardhi & Mereu 1997). Seismic and electrical conductivity anisotropies have been reported in some regions, but large parts of the Shield are not explored. In the southern part of the Shield, lower crustal and mantle temperatures are well constrained by heat flow studies . Some constraints on mantle temperatures have been added by inversion of seismic velocity profiles .
Recent T e studies in the Canadian Shield have shown that it varies over relatively short distances, but the correlation between T e and the heat flow or geology is weak (Wang & Mareschal 1999;Flück et al. 2003;Audet & Mareschal 2004a). All those studies used the maximum entropy spectral estimator (MEM) to compute the isotropic coherence because it performs well with synthetic data. Although the results of the T e studies are consistent with each other, there is still questions concerning the eastern part of the Shield because the effect of anisotropy in the coherence was neglected. Audet & Mareschal (2004b) used the multitaper method (MTM) to show strong anisotropy in the coherence at intermediate wavelengths and a sharp decrease of the coherence at long wavelengths which does not correspond to the response of a thin elastic plate. This flexural anisotropy is strongly correlated with surface geology and with seismic and electrical anisotropies in the lithospheric mantle.

Tectonic setting
The study area extends from west of Hudson Bay to the sea of Labrador and the Atlantic Ocean to the east (Fig. 5). The area covers most of the exposed Canadian Shield with the exception of the Archean Slave Province and the Proterozoic Wopmay Orogen to the northwest where the gravity data are very sparse. In the eastern Shield, the Archean (2.5 Ga) Superior (SUP) and Nain Provinces (NA) are separated by the southeastern Churchill Province (SECP) which consists of a core zone of reworked Archean rocks sandwiched between the New Quebec and the Torngat orogens (Wardle et al. 2002). The Nain Province was part of the North Atlantic craton from which it was separated by the opening of the Labrador Sea at 100 Ma. In the southeastern Churchill Province, the main tectonic features and shear zones are NE-SW to N-S oriented. In the northeastern part of the Superior Province, most of the faults trend NW-SE.
South of the Superior Province, the Grenville Province (GRE) is a major Mid Proterozoic (1.1 Ga) zone of collision extending from northern Mexico to Labrador. It is separated by the Atlantic ocean from its European counterpart in Scandinavia. The Grenville Front, separating the Grenville Province from the older structural provinces to the northwest, is a sharp northeast trending tectonic break that is well defined on the regional gravity and magnetic maps. The Grenville Province in Canada consists of an autochthonous, and of several allochthonous belts that were emplaced during a series of tectonic events culminating at 1.1 Ga. On the seismic reflection profiles conducted by LITHOPROBE, reflectors that can be identified as the Archean basement extend far south into the Grenville Province. The Canadian Shield disappears beneath the Appalachi- ans (APP) which were thrust over the Grenville basement during the Palaeozoic.
To the west, the Superior Province is separated from the Archean Hearne and Rae (HR) cratons by the Trans-Hudson Orogen (THO), a major Proterozoic (1.8 Ga) collision belt that extends from the northern tip of Quebec, across Hudson Bay and Manitoba, Saskatchewan into the Dakotas. The THO is exposed only in northern Manitoba and Saskatchewan. It is covered by the sediments of the Williston basin to the south and the Hudson Bay basin to the north.

Topography and gravity data
The Bouguer gravity data were obtained from the Geological Survey of Canada. Data points are very unevenly distributed with an average distance between points ≈10 km. We have used the free-air gravity data derived from satellite altimetry of Smith & Sandwell (1995) to complete the grids over the Ungava Bay where no standard gravity measurements are available. The spatial resolution of this data set is ≈20 km. The free-air gravity data were converted into Bouguer data using the bathymetry and a constant rock density (2640 kg m −3 ). The topography-bathymetry data come from the 2' grid digital elevation model of Smith & Sandwell (1997). The data set was projected on a transverse Mercator grid with 10 km sampling distance over an area of 3390 × 2770 km 2 (Fig. 5). The wavelet coherence was calculated at every point on the same grid with a sampling distance of 10 km.

Variations in T e
In previous studies, we have determined the variations in elastic thickness in the Canadian Shield with the maximum entropy method Figure 6. T e map of eastern Canada obtained with the wavelet method. (Audet & Mareschal 2004a) and the multitaper method (Audet & Mareschal 2004b). Fig. 6 shows the wavelet results using the deconvolution method of Section 2.6. The density of the crust and mantle are taken to be 2670 and 3200 kg m −3 , respectively, and the depth to the Moho is given by the LITH5.0 crustal model of Perry et al. (2002). The results are generally consistent with those in the previous studies. The T e values obtained with the wavelet method ranging between 30 to 140 km are in the same range than those obtained with the MEM (40-120 km) and show similar trends. T e is generally high (>90 km) throughout eastern Canada. Lower values (40-80 km) are found east of the Grenville Province, and in the eastern part of the Appalachians. High values are found in the Hudson Bay and in the Hearne and Rae Provinces, consistent with the previous studies. While T e was poorly estimated by both the maximum entropy and multitaper methods in the northern Appalachians (Audet & Mareschal 2004a,b), the wavelet method yields values ranging from 50 to 80 km. This is consistent with high T e values in the US Appalachians (Armstrong & Watts 2001). On a smaller scale, there are very low T e 'bull's eyes' in an otherwise very strong lithosphere in the northern Shield, which are due to short wavelength coherent features mapped as T e .
The region of low T e near the Great Lakes is the most major discrepancy between this and previous studies. Other differences include a ridge of lower T e (50-70 km) along 270E from 50-65N, and similar values in much of northern Quebec and Newfoundland found by the MEM and MTM, whereas the wavelet results show values >100 km throughout this region. We note that the MEM and MTM results are more similar to one another than to the wavelet results. This could be due to the fact that the MEM and MTM produce a single estimate of T e within a fixed window size (1024 × 1024 km 2 ), which is moved over the study area with some overlapping. As stated previously, T e is well estimated only if the transition wavelength is much shorter than the size of the study area. The ridge of low T e values in the Hudson Bay basin found with the MEM and MTM could be underestimated because the transition wavelength (≈1000 km) is on the order of the window size. The annular band averaging performed in the MEM and MTM can also decrease the maximum resolvable wavelength and bias the results for the largest values (Swain & Kirby 2003b). Hence the MEM and MTM might capture the shorter wavelength information of the coherence in these regions. In the Great Lakes region the situation is reversed, and the wavelet estimates are lower than the MEM and MTM results.
One should attach no importance on the elastic thickness values over the ocean since our simple flexure model does not consider the oceanic crustal structure nor does it take the load of water into account. These oceanic T e results are also affected by the fact that the bathymetry is derived from the free-air gravity data assuming a certain value of T e (Smith & Sandwell 1997).

Anisotropy
Anisotropy in the coherence typically appears at different wavelength regimes . The long wavelength anisotropy reflects the deep lithospheric structure and can be inverted to reveal the elastic thickness in two perpendicular directions. Short wavelength anisotropy is more likely to be related to isostatic adjustment along faults and involves the shallow part of the lithosphere. The approach we have taken here allows the identification of the short wavelength anisotropy by calculating the average of the observed coherence for all wavelengths shorter than the transition wavelength of the best fit solution in each azimuth. However only a few sparsely distributed points, mostly over the ocean, are statistically significant, that is, their value is within the 1σ uncertainty apart from the azimuthal average, and we choose not to interpret them. Also, because we invert the 2-D coherence with theoretical solutions, we prefer to discuss the relative variations in the magnitude of the anisotropic parameter, (T max − T min )/T max , and the variations of the weak direction , subject to an angular uncertainty of ±10 • . Fig. 7 shows the results in eastern Canada. Anisotropy is found almost throughout the study area and it varies in magnitude and direction within each province or subprovince. The anisotropic direction is consistent with the results obtained with the MTM The direction of the bars is that of the minimum strength (i.e. maximum coherence), the length is ∝ (T max − T min )/T max . The fast direction of seismic SKS anisotropy is also plotted as coloured bars. Red bars are from Rondenay et al. (2000), green bars are from Eaton et al. (2004), and purple bars are from Kay et al. (1999). (Audet & Mareschal 2004b) except for a few regions. In particular both studies show that the weak axis is quasi-perpendicular to the main tectonic breaks: the Grenville Front and the main belts of the Grenville Province, and the tectonic boundaries between the Southeastern Churchill Province and the rest of the Shield. Both studies show the absence of anisotropy near Hudson Bay, and weak anisotropy in the central eastern Superior Province. There are two important differences between this study and the one by Audet & Mareschal (2004b): the E-W direction of the weak axis south of Hudson Bay where the multitaper method found a N-S weak direction; and to the north of the study area where no anisotropy was detected by the multitaper method. The wavelet anisotropy south of Hudson Bay is inconsistent with the previous interpretation that the weak direction is perpendicular to the main belts of the Superior Province and to the fast seismic axis (Kay et al. 1999). It is possible that, in this region, the wavelet coherence detects anisotropy at a longer wavelength regime than the MTM. Concerning Hudson Bay, we must note that the elastic thickness is very high and poorly resolved with the MTM because of the size of the window. Another critical point is that the inversion of the elastic thickness does not resolve differences in T e when it is >80 km. In other words, for values of T e > 80 km, the difference between T max and T min becomes meaningless.
In the Appalachians, the direction of the weak axis suggested by the wavelet method is fairly consistent with the cross-strike direction obtained with the MTM. For the Appalachians, the MTM failed to determine the elastic thickness because the coherence decreases at long wavelengths. However, the MTM detected increased coherence in the cross-strike direction at short wavelengths suggesting that the weak axis is perpendicular to the strike of the Appalachians.
In the Grenville, the seismic fast directions vary from ENE to the north of the station array to ESE to the south (Rondenay et al. 2000). The northern results are interpreted to reflect E-W regional shear zone of Archean age, and the fast directions perpendicular to the main belts of the Grenville to the south suggest that the mantle fabric may have recorded the last extensional event. Our results are more sensitive to the crustal and upper-mantle fabric and are more consistent with the former interpretation, suggesting that the lithosphere is weaker in the direction perpendicular to the structure (Barruol et al. 1997).
We verified that our results do not change when the window is moved in any direction. The wavelet method is more time consuming than the MTM for determining the anisotropy and does not have the same azimuthal resolution as the MTM. It has the advantage that it yields an elastic thickness in regions where the MTM fails (Appalachians). Clearly there are good reasons to use and compare both methods whenever possible.

C O N C L U S I O N
We have tested and used a wavelet-based method to determine the coherence between Bouguer gravity and topography and to obtain estimates of the effective elastic thickness of the lithosphere and its anisotropy. Tests with synthetic data show that effective elastic thickness can be estimated within ±15 per cent with the wavelet coherence. Tests also show that the wavelet method can detect anisotropy and determine the directional variations in effective elastic thickness.
The wavelet method was applied to study variations in elastic thickness in the southeastern Canadian Shield and the results were compared with those of previous studies of T e based on different spectral estimates of the coherence. Within the relatively large errors associated with the inversion of T e , the results of different studies consistently show that elastic thickness is high (>90 km) over most of the eastern Canadian Shield. The present study also confirms that T e is high in the Appalachians. Regional trends also appear to be largely consistent between studies.
The wavelet based method shows anisotropy almost everywhere including Hudson Bay where the multitaper method did not detect any anisotropy and where there is no evidence of seismic anisotropy in the mantle. With the exception of the western Superior Province (i.e. south of Hudson Bay), the directions of the weak axis obtained by the wavelet method and those obtained by the multitaper method are consistent where the MTM detected anisotropy. These directions are perpendicular to the main tectonic breaks and to that of the fast seismic axis.

A C K N O W L E D G M E N T S
We are grateful to Marta Pérez-Gussinyé for kindly providing some of the synthetic data used in this study. This paper was much improved by thorough and constructive reviews by Jon Kirby and Marta Pérez-Gussinyé. This work is supported by NSERC through a postgraduate fellowship to PA and a discovery grant to JCM