Complex cratonic seismic structure from thermal models of the lithosphere: effects of variations in deep radiogenic heating

Thus the variations in major and minor element mantle lithosphere composition commonly seen in mantle samples can account for much of the variability in imaged seismic structure of cratonic lithosphere.


I N T RO D U C T I O N
Precambrian cratons comprise the ancient stable cores of the continents. Relative to Phanerozoic lithosphere, they are stiff and buoyant due to a melt-depleted, dry composition and low temperatures (e.g. Jordan 1979;Carlson et al. 2005;King 2005;Sleep 2005). It has long been recognized that within cratons, there is a first-order variation of average surface heat flow, lithospheric composition and seismic structure with formation age (inferred from the crust). The Archean parts of the cratons have low average surface heat flow values of about 41 mW m −2 , while Proterozoic regions display average heat flow around 48 mW m −2 (Nyblade & Pollack 1993), but with significant scatter (values ranging from 36 to 94 mW m −2 ; Jaupart & Mareschal 2007). On a large scale, these surface heat flow trends correlate with trends in lithospheric seismic structure (Röhm et al. 2000;Artemieva & Mooney 2001;Simons & Van der Hilst 2002). Clues on the cause of these differences may come from xenoliths and xenocryst data, which indicate that the Proterozoic lithosphere is generally less depleted in basaltic components than the Archean lithosphere (Griffin et al. 1999b). It has been proposed that the variations in the degree of melt-depletion with thermotectonic age represent a secular trend related to cooling of the mantle (Griffin et al. 1999b;. A more recent interpretation, however, explains the characteristics of Proterozoic lithosphere as a result of differing degrees of metasomatism of originally Archean lithosphere . Regardless of the mechanism of formation, observations show that there is actually a large degree of lateral and vertical variability both within the Archean and within the Proterozoic lithosphere. When analysed in detail, this is found in surface heat flow (McLaren et al. 2003;, xenolith and xenocryst data (Griffin et al. 1999bKopylova & Caro 2004;Carlson et al. 2005), as well as seismic structure (e.g. Fischer & Van der Hilst 1999;James et al. 2001;Bruneton et al. 2004a;Darbyshire et al. 2007). Several recent initiatives have combined geophysical, petrological, geochemical and geological information to image 1000 C. F. Hieronymus and S. Goes cratonic lithosphere structure (e.g. recent special issue on Kaapvaal and Slave craton experiments Jones et al. 2003, and results from a seismic deployment in the Baltic Shield Bruneton et al. 2004b;Sandoval et al. 2004). While the spatial scale of the variability is difficult to assess using the rather sparse set of xenolith and xenocryst data, seismic analyses indicate variations by several percent in both V P and V S over distances of 100 km or less (e.g. James et al. 2001;Bank et al. 2000;Bruneton et al. 2004a;Darbyshire et al. 2007). Much effort has been expended on linking these variations to structural, compositional, or thermal characteristics of the crust and lithosphere.
Cratonic seismic anomalies of several percent are surprising because, at lithospheric depth, seismic velocities are much more sensitive to variations in temperature (changes of 0.5-2 per cent per 100 K) than in composition (only 1-2 per cent for most lithospheric compositions commonly found in xenoliths or other mantle samples) (e.g. Goes et al. 2000;Lee 2003;James et al. 2004;Schutt & Lesher 2006). Hence, lithospheric seismic anomalies are usually interpreted as being mainly thermal in origin . In stable continents however, lateral variations in temperature are generally thought to be small. Arguments for this are: (1) long time elapsed since the last tectonic event which is comparable to the thermal equilibration time (about 2 Ga even for thick lithosphere Michaut et al. 2007), (2) relatively constant surface heat flow measurements (Nyblade & Pollack 1993;Artemieva & Mooney 2001) and (3) the relatively narrow range of xenolith-derived geotherms (Rudnick et al. 1998;Lee 2006;Eaton et al. 2009).
Below the Kaapvaal (James et al. 2001) and the Slave (Bank et al. 2000) cratons, models derived from teleseismic V P body waves display around ±1 per cent (i.e. total range of 2 per cent) velocity variations. Especially in Kaapvaal, these variations correlate with surface tectonics and the chemistry of xenoliths, xenocrysts and diamond inclusions (Shirey et al. 2002;Griffin et al. 2003a;James et al. 2004). Hence, these seismic anomalies are reasonably interpreted as due to variations in major-element composition of the mantle lithosphere. However, other tomographic models display a variability of as much as 4-6 per cent in V S and up to 4 per cent in V P [Baltic Shield (Bruneton et al. 2004a,b;Sandoval et al. 2004;Eken et al. 2008); Superior Province Frederiksen et al. 2007); African cratons (Priestley et al. 2008); South American cratons (Feng et al. 2004) and West Australian cratons (Simons & Van der Hilst 2002;Fishwick et al. 2005)]. Furthermore, where the distribution of receivers allows sufficient lateral resolution, the variations in lithospheric structure often do not correlate with surface geology (Bruneton et al. 2004b;Sandoval et al. 2004;). Thus, both in magnitude and in pattern these variations are difficult to explain as solely due to compositional effects on seismic velocity (Bruneton et al. 2004a). This variability is superimposed on a cratonic background velocity that is several percent higher than global reference models, reaching maximum velocities of 4.7-4.8 km s −1 in V S (Lebedev et al. 2009;Pedersen et al. 2009) and at least 8.1-8.2 km s −1 in isotropic V P (Ryberg et al. 1998;Nielsen et al. 1999;Saltzer 2002;Fernández Viejo & Clowes 2003;Simon et al. 2003).
The aim of this paper is to investigate what maximum thermal contributions to seismic structure may be expected in cratonic lithosphere. We show how simple models of tectonically plausible structures in the continental lithosphere may lead to non-uniform distributions of radioactive elements in the continental mantle (and crust), which may in turn lead to significant horizontal thermal gradients which are essentially permanent (that is, they decay only on the long timescales corresponding to the half-lives of the radionuclides). We also show that variable thermal structures at depth generally yield similar surface heat fluxes. Within the uncertainties of surface heat flow measurements and pressure and temperature estimates for xenoliths (Rudnick et al. 1998;, this leaves room for several hundreds of degrees Celsius of lateral variations in temperature that could significantly contribute to the cratonic seismic signal.

T H E R M A L S T RU C T U R E A N D D I S T R I B U T I O N O F H E AT P RO D U C T I O N
In studies of lithospheric temperature and composition, it is usually assumed that the crustal component of radiogenic heating is most important because this is where the concentrations of radioactive elements are highest. Typical values of radiogenic heat production in the upper continental crust are 0.9-3.25 μW m −3 , decreasing to about 0.12-0.5 μW m −3 in the lower crust (Weaver & Tarney 1984;Fountain et al. 1987;Rudnick & Fountain 1995;Joeleht & Kukkonen 1998;Jaupart & Mareschal 2007). The continental mantle lithosphere, on the other hand, is depleted in radioactive elements. Given the low concentration of radioactive elements, the mantle-lithospheric heat production is sometimes set to zero in dynamic models, but more realistic estimates range from about 0.01 μW m −3 for strongly depleted lithosphere up to 0.1 μW m −3 for metasomatized lithosphere (Pollack & Chapman 1977;Rudnick et al. 1998). Much higher average lithospheric values are not possible as they would yield a cratonic geotherm that does not intersect the mantle adiabat (Rudnick et al. 1998), and Monte Carlo inversions of xenolith data yield even lower maximum values of 0.04 μW m −3 for mantle heat production (Michaut et al. 2007).
Nonetheless, mantle xenoliths do display widely varying degrees of metasomatism, accompanied by variations of 2-3 orders of magnitude in concentrations of radioactive elements (Rudnick et al. 1998). The high end of the range is markedly above the permitted average, implying that significant regional variability in heat production may exist in the continental mantle lithosphere.
Despite the relatively low concentrations of radioactive elements in the mantle lithosphere and their accordingly subordinate contribution to the surface heat flux, deep-seated radiogenic heating has a significant effect on the lithospheric geotherm: Fourier's law of heat conduction (for purely vertical heat flow, q = −k ∂ T /∂z, where q is heat flow and k is thermal conductivity) implies that a certain temperature gradient is required to conduct a given amount of heat out of the lithosphere. If that heat is generated at depth, its contribution to the vertical temperature gradient will extend all the way from the surface to that depth, resulting in a large temperature difference at depth. For shallow heat production, the same temperature gradient extends only over a narrow depth range and the effect on temperature is much lower.
This effect is illustrated in Fig. 1 for a few simple examples with different vertical distributions of heat production. Three models with identical total heat production (M1-M3) display significantly different geotherms. Generally, the overall temperature is higher if more of the heat producing elements are concentrated at depth. The magnitude of the temperature differences depends on the concentration of heat producing elements, their vertical distribution, and on the thickness of the lithosphere. It has been much debated what the relative importance is of variations in crustal heat production and lithospheric thickness for explaining observed variations in cratonic surface heat flux (Lenardic 1997;Nyblade 1999;Cooper et al. 2006). We find that for a lithosphere with constant thickness of 250 km, variations in heat production in the mantle lithosphere can lead to lateral temperature contrasts of several hundred degrees. Fig. 1 also shows that the deep-seated heat generation has a relatively minor influence on surface heat flow which is strongly dominated by the crustal component. Vertical redistribution of heat production in these examples results in surface heat flow variations of 5 mW m −2 superimposed on a mean of about 40 mW m −2 . As the total heat production in the first three models is equal, the small differences are evidently due to variations in the thermal blanketing effect: high temperature at the base of the lithosphere allows less conductive heat flow from the convecting mantle. This effect is further investigated with model M4 (Fig. 1). Here, the same realistic (e.g. Rudnick et al. 1998) crustal model was used as in model M3, and the mantle-lithospheric heat production of 0.115 μW m −3 was chosen to match the surface heat flux of M2. The lithospheric heat production is at the high end of what is believed to be realistic, and the maximum temperature difference relative to M2 is almost 250 • C, occurring at 120 km depth. The surface heat flow of 43 mW m −2 is only slightly higher than that of M3 with the same crustal heat production, and it is exactly the same as M2 with a similar crustal model. The surface heat flux is thus rather insensitive to variations in heat production at depth. This is because the total deep heat production generally is low, but also because elevated temperatures at depth insulate against asthenospheric heat flux. In the above examples, we used a constant thermal conductivity of 2.5 W(m K) −1 ; in the more realistic calculations below, we use the temperature-dependent formulation of McKenzie et al. (2005).

T E C T O N I C T H E R M A L M O D E L S
Cratonic lithosphere, although it has been preserved over several billions of years, shows ample evidence of a complex tectonic history. Most cratonic cores consist of blocks of different ages and provenance and show evidence of phases of compression, subduction, rifting, strike-slip faulting and/or igneous activity (Bleeker 2003), and in many cases several generations of tectonic events. It is generally assumed that the original cratonic lithospheric cores consisted of highly melt-depleted (up to 30-50 per cent of partial melting) mantle material (Jordan 1979;Griffin et al. 1999b;Lee 2006;Bernstein et al. 2007). However, old and younger tectonic and mantle convective activity resulted in highly variable degrees of metasomatic refertilization with basaltic components, fluids and heat producing and other minor elements (e.g. Simon et al. 2007;Griffin et al. 2008). This would have affected seismic as well as thermal properties of the lithosphere.
Two types of models for craton formation have been proposed (see review by Lee 2006): superplume activity accompanied by extensive melting, or more gradual accumulation through accretion of subducting oceanic or arc lithosphere. Geochemical data that require that melting occurred at relatively low pressures, and the absence of extensive amounts of kimberlitic material, which should have been generated in plume melting, would indicate a preference for accretion models. But it is also possible that both mechanisms played a role. While the model of gradual accretion inherently would be associated with the type of heterogeneity observed in xenoliths and xenocrysts and possibly also that seen in seismic images, it is also possible that individual cratonic blocks with differing degrees of melt depletion and/or metasomatism were later placed next to each other by tectonic motions. Irrespective of the formation mechanism of the cratonic blocks, the observed heterogeneity is probably related to concurrent or subsequent plate motions. Motivated by this, we set up several simple tectonic models that juxtapose lithospheres with different amounts of heat producing elements and major element compositions consistent with xenolith data. We test the following four end-member models, which may also be combined to make more complicated models.
(1) Lithospheric juxtaposition model: Two types of lithosphere with equal thickness are placed next to each other. (Variations in lithospheric thickness in a convecting mantle have also been shown to have large and long-term effects on temperature and seismic velocities (Lenardic & Moresi 2000;Hieronymus et al. 2007), but here we choose to concentrate on effects originating from within the lithosphere.) The contact is a vertical boundary. Such a configuration might form tectonically as the result of a long-lived strike slip fault, or by suturing of two differently depleted or refertilized lithospheres. The differences between the lithospheres are the deep major element concentrations (which directly affect seismic velocities) and the coincident distributions of radioactive elements (which influence seismic velocity via temperature). The crust is taken as uniform in order to focus on the lithospheric effect.
(2) Crustal juxtaposition model: Two blocks are placed next to each other with different crustal thickness and heat production, but with the same mantle lithosphere (to focus on effect of crust alone). The mantle lithosphere model is the same as that of Pollack & Chapman (1977), in which the top layer is highly depleted (H = 0.01 × 10 −6 W m −3 ), but refertilized below 120 km depth (H = 0.084 × 10 −6 W m −3 ). Such deep lithospheric structure is observed in Kaapvaal and below the Slave craton (seen in xenocryst chemical tomography sections of (Griffin et al. 1999b. In Kaapvaal, this configuration is proposed to be due to refertilization by metasomatism from below the lithosphere (Menzies et al. 1987;Simon et al. 2007). Below the Slave craton, this is proposed to be due to lithospheric stacking (Griffin et al. 1999a).
(3) Stalled oceanic slab in lower lithosphere as the result of continental collision. Essentially, this is a juxtaposition of two equal continents (note that in reality they might also be different), but at the contact between the two, there is an old, dipping oceanic slab with relatively high concentrations of radioactive elements in the crust and low concentrations in the mantle part of the slab. This scenario is motivated by the dipping reflector found below the Slave craton (Bostock 1998), which is interpreted as the top of an ancient oceanic slab. The simple model focuses on the effect of the slab.
(4) Stalled continental slab in the lithosphere. This model is the same as the oceanic slab model above, but when the continents collided, one continent started to be subducted beneath the other. The continental slab persists down to the base of the lithosphere. This model is an end-member scenario in terms of heat generation and compositional contrasts. The heat generated at the depth of the lower lithosphere is enormous, and due to the insulating effect of the surrounding material leads to melting. Thus, we assume that partial melt has been removed and all continental crust of the stalled slab is more like typical, granulitic lower continental crust. There is strong geological evidence that continental crust can be subducted to deep lithospheric levels (Dobrzhinetskaya et al. 1995;Ye et al. 2000), although it is not known how commonly this occurs or for how long crustal material can remain at that depth.
For each model, we calculate the steady-state 2-D thermal structure. In the absence of time-dependent terms, the diffusion equation can be written as where T is temperature, H is the rate of heat production per unit volume (values for each model given in Table 1)  temperature is determined by a mantle potential temperature of 1300 • C and a constant adiabatic gradient of 0.4 • C. By using the lithosphere thickness as a model input, we implicitly make some assumptions about the mechanical behaviour of the mantle. If the lithosphere-asthenosphere boundary were entirely thermal, then the lithospheric thickness would be determined by the temperature dependence of the viscosity, by the dynamics of the mantle, and by the heat production within the lithosphere. However, it has been shown that thermal effects alone cannot stabilize continental lithosphere over billions of years (King 2005;Sleep 2005), and at least to some degree the boundary must be compositional. The details of how the continental lithosphere remains stable are not known. By prescribing the lithospheric thickness in our models, we thus assume the position of the compositional boundary. Furthermore, we require of a realistic solution that the temperature in the lithosphere is below the mantle adiabat. These assumptions appear to be in good agreement with dynamic models of continental stability (Cooper et al. 2006), which show relatively constant temperatures at the base of the compositional lithosphere, although lateral differences in lithospheric heat production were not taken into account. The horizontal extent of the model domain is 1000 km, the vertical size is equal to the thickness of the lithosphere. We solve the diffusion equation using a Fourier pseudo-spectral code (Boyd 2003) in both the horizontal and the vertical directions. The model resolution is taken as 256 nodes in the horizontal and 128 nodes in the vertical. The top and bottom boundary conditions are satisfied by subtracting out a linear temperature drop from the base of the lithosphere to the surface (thus solving only for the perturbations relative to this background temperature), and by adding an equal but opposite model domain in the vertical (i.e. a mirror image with negative temperatures and negative heat sources). The antisymmetry of the solution in the vertical assures that the temperature perturbation at the top and bottom edges of the lithosphere are zero, which results in the desired boundary condition. The non-linearity due to variable heat conduction is accounted for by iterating until the solution converges. We ran models with lithospheric thicknesses of 150, 200 and 250 km. However, the effects are most pronounced and most clearly demonstrated with those of 250 km thickness. Compositional and resulting steady-state thermal structures are shown in Fig. 2.

S E I S M I C V E L O C I T Y C A L C U L AT I O N S
Seismic velocities in continental lithosphere are the result of thermal structure, major element chemistry and presence of water. Melt is not an issue in the cold settings we investigate here. Furthermore, we will concentrate our forward calculations on isotropic velocities, although anisotropy can be considerable in stable continents (e.g. Oreshin et al. 2002;Frederiksen et al. 2007;Snyder & Bruneton 2007;Eken et al. 2008;Lebedev et al. 2009). We compute seismic velocities as a function of only temperature and major element composition, as the effect of hydration on mantle lithospheric velocities is poorly constrained, although it may significantly lower seismic velocities Bruneton et al. 2004a). In previous studies, we have extensively assessed uncertainties in the conversion of thermochemical structure to seismic structure due to uncertainties in elastic, anelastic and density parameters (e.g. Goes et al. 2000;Cammarano et al. 2003). For peridotitic compositions, 1σ uncertainties are about 0.02-0.045 km s −1 in both absolute V P and V S for the P-T range of the models considered here. The largest uncertainties are at temperatures above 1100-1200 • C where anelasticity plays a significant role. Uncertainties in V S anomalies range from 0.05 per cent/100 K at low temperatures to about 0.5 per cent/100 K at the highest temperatures of the models. Uncertainties in basaltic/eclogitic seismic properties are significantly larger, but it is clear that these will lead to strongly anomalous velocities compared to those of peridotitic compositions. For this study, we use two different approaches that give a good illustration of the uncertainties using the most recent mineral physics data sets and somewhat different ways of calculating seismic velocity from these parameters.
(1) This method will be referred to as fin-sfo05. The phase composition and elastic and anelastic contributions to the velocities are calculated separately. The phase composition for a given composition in terms of oxides is calculated as a function of pressure and temperature with the program PerPle X (Connolly 1990(Connolly , 2005 from the data compilation by Stixrude & Lithgow-Bertelloni (2005), complemented by Khan et al. (2006) and applies to mantle mineralogies in the CFMAS system. Elastic velocities are calculated with a 3rd order adiabatic Birch-Murnaghan equation of state with a Grüneisen thermal correction. A single anelasticity model is used, as the effects of anelasticity are small at cratonic lithospheric temperatures. Parameters and the procedure are described in detail in (van Wijk et al. 2008) and are a slight update to what was used by (Goes et al. 2005).
(2) The second method will be referred to as perp-stx08. This is a joint calculation, done using PerPle X (Connolly 2005), of the phase composition, elastic properties and density using the most recent data base and equation of state from Xu et al. (2008) for the NCFMAS system. This formulation uses an internally selfconsistent database, and because of the inclusion of Na 2 O allows a more proper assessment of the properties of basaltic compositions. The third order Birch-Murnaghan-Mie-Grüneisen equation of state results in slightly different thermal sensitivity than the fin-sfo05 approach. We apply the same anelasticity model as for approach 1.
(3) In addition, for the calculation of velocities for lower continental crust (which is poorly approximated by the NCFMAS system) at high P, T we follow an integrated approach similar to approach 2, again using PerPle X, but using the data base from Holland & Powell (1998) complemented by shear modulus parameters by Connolly & Kerrick (2002), which covers the KNCFMAS-OH system. We apply a constant shear attenuation value of 600 (from PREM Dziewonski & Anderson 1981). As the Holland & Powell (1998) EoS is less applicable to deep lithospheric conditions, these results should be taken as indicative only.
The compositions we used are listed in Table 2. The most depleted parts of the lithosphere are believed to be dunitic (Beyer et al. 2006;Bernstein et al. 2007;Griffin et al. 2008), which we model using the xenolith-derived composition Arc-9 from Griffin et al. (2008). For the refertilized lithosphere, we use Griffin et al.'s (2008) Proterozoic composition Pr-4. For tests of extreme compositional effects, we used an undepleted (pyrolitic) mantle composition as given in (Xu et al. 2008). The oceanic lithosphere that is part of the stalled, subducted oceanic slab is taken to be harzburgite (Xu et al. 2008). This composition may not be exactly the same as Archean oceanic lithosphere, but many harzburgitic samples are found in xenoliths. The oceanic crust, which is part of the stalled slab is assumed to be a basalt which is complementary to the harzburgite (Xu et al. 2008) (again this should be taken as indicative for Archean crust), while the subducted continental crust has the composition of lower continental crust as given by Rudnick & Fountain (1995).
The seismic anomalies will be plotted relative to the depthdependent average of the model. How positive or negative the anomalies thus depends on the reference model. The model average seems most appropriate as most cratonic seismic models are displayed relative to a regional background (especially those from teleseismic traveltimes which have little sensitivity to the regional average). The total anomaly range is relatively insensitive to the reference model chosen. Hence, we will focus the discussion on the anomaly range. In addition, we show absolute velocity values which can be compared with 1-D regional seismic models.

Thermal structure and heat flow
Overall, we find that in a lithosphere of uniform thickness, the distribution of heat producing elements within realistic bounds has a significant influence on the lithospheric thermal structure. The location of the temperature differences broadly coincides with the heat production anomalies. Maximum lateral temperature variations in the models range from 200 to 300 • C. This range is small enough not to be resolvable within the uncertainties in surface heat flow and xenolith data, yet large enough to have a seismic expression. The crustal juxtaposition model (Juxt Crust) generates maximum thermal anomalies of 100 • C near the Moho. Anomalies of up to 50 • C persist down to 170 km depth into the mantle lithosphere. For thinner lithospheres of 150 and 200 km (results not shown), the near-Moho anomaly is approximately the same, which is not surprising as the specified heat production is independent of lithosphere thickness for these models. The deep thermal anomalies, however, are lower due to the bottom boundary condition which prescribes a constant temperature at the base of the lithosphere. The crustal model shown is characterized not only by different distributions of heat sources in the left-and right-hand halves of the model domain, but also by different values of total heat production. This difference in crustal heat production is directly reflected in the surface heat flow results (Fig. 2). Virtually all the additional heat generated in the right-hand half of the model is conducted out through the surface, indicating that the heat influx into the base of the lithosphere is nearly identical for both halves of the model.
The relatively moderate differences in lithospheric heat production in the lithospheric juxtaposition model (Juxt Lith) generate anomalies throughout the lithospheric mantle of up to 210 • C. For a model where the full right-hand lithosphere is refertilized (i.e. not just the portion below 120 km; results not shown), this difference increases further to almost 300 • C . The anomaly is centred significantly deeper than in the crustal juxtaposition model (Fig. 2). Despite the difference in total integrated heat production per unit area (29.5 μW m −2 in the fully depleted column versus 46.8 μW m −2 in the refertilized column; Fig. 2), the resulting surface heat flows are almost identical due to the stronger thermal blanketing effect of the hotter lithosphere. The additional heat sources in the lower lithosphere are thus largely compensated by a decrease in asthenospheric heat flow.
The subduction models produce the most pronounced temperature anomalies because of the strongly localized anomalies in heat production they introduce in the mantle lithosphere. Crustal heat production in the oceanic crust produces a local high-temperature anomaly of 60 • C near the base of the continental Moho. In the mantle lithosphere, the dominant thermal effect is a 120 • C lowtemperature anomaly (relative to a refertilized background lithosphere) due to the harzburgitic part of the subducted slab, which was assumed to be completely depleted of radiogenic elements. Although this may be an end-member case, it does illustrate the kind of thermal complexity that may be introduced by stacking of lithosphere with different amounts of heat production over long timescales. The continental subduction leaves a slab-like hightemperature anomaly of up to 260 • C due to the high heat production in the subducted continental crust. This should be seen as an upper bound, as it is probably difficult to underthrust a complete continental crustal section, even under present conditions which are cooler than during the Archean. The models shown here are meant to bracket the possible effects rather than represent the most realistic scenarios. Note that the subduction models yield thermal anomalies that are of similar magnitude as the juxtaposition models.

Seismic anomalies
Seismic velocity anomalies for the different tectonic models are shown in Figs 3 and 4. In Fig. 3, we illustrate the effect of the different methods of computing seismic velocity described in Section 4 using only the model Juxt Lith, as well as the individual contributions of temperature and composition. In Fig. 4, we use only the perp-stx08 technique to calculate the seismic velocities for all of the thermomechanical models shown in Fig. 2.
For the example of the lithospheric juxtaposition model (Juxt Lith), we find seismic anomalies due to the temperature effect alone which reach magnitudes of up to 2-2.5 per cent in V S and up to 1.5-2 per cent in V P (Figs 3a-d). The fin-sfo05 approach yields a somewhat stronger temperature sensitivity than the more recent perp-stx08 data set. Both fall within the published range Lee 2003;James et al. 2004). Deep thermal anomalies have a more pronounced seismic signature than shallow lithospheric anomalies because anelasticity effects start to play a role at the higher absolute temperatures present in the deep lithosphere. For example, in the crustal juxtaposition model (Juxt Crust), a maximum thermal contrast of 100 • C located just below the Moho yields a contrast in V S of 0.8 per cent. A thermal anomaly twice as large in the lithospheric juxtaposition model (Juxt Lith) at around 150 km depth gives rise to a V S anomaly that is 2.7 times larger than what is found in Juxt Crust.
Compositional effects generally enhance the thermal anomalies. For the full range of lithospheric compositions from highly depleted dunite to fertile peridotite, the purely compositional effect on seismic velocity has been shown to be as large as 1-2 per cent Lee 2003;James et al. 2004). For the more moderate Figure 3. Seismic and density structure calculated for thermal structure from model Juxt Lith. Top two panels illustrate absolute V S and V S anomalies (relative to horizontal average of the model) for only thermal effects calculated with two different approaches (composition is dunitic everywhere, crust has been blanked out). Bottom three panels illustrate V S , V P and density structure calculated with the perp-st08 approach, when both thermal and compositional effects are taken into account.
variations such as those shown in Juxt Lith, the compositional seismic anomalies in V S and V P are only 0.8-0.9 per cent (Figs 3c-f). It is well known that melt depletion leads to higher seismic velocities (Jordan 1979;Goes et al. 2000;Lee 2003;James et al. 2004;Schutt & Lesher 2006), but generally not considered are the thermal effects of the associated depletion in radiogenic elements which, over long timescales, also make the depleted lithosphere colder. With the combined effects of temperature and major element chemistry, varying amounts of depletion can lead to anomalies of up to 3-4.5 per cent in V S and up to 2.5-3.5 per cent in V P , when a full range of compositions from fertile to highly depleted is considered. The oceanic subduction model illustrates a different scenario in which thermal and compositional effects reinforce each other. The crust and the lithosphere of the stalled slab are affected differently by the variations in pressure and temperature, and the combined result may display relatively complex anomaly patterns which may or may not look slab-like if recovered in a tomographic inversion. Around the unperturbed Moho, the thermal effect of the slab is a temperature high in the oceanic crust and a temperature low in the oceanic lithosphere. The low seismic velocities in basalt enhance the crustal high-temperature effect, while the effect of depletion in the oceanic lithosphere is strengthened by low temperatures there.  The seismic signature deeper in the continental lithosphere depends strongly on the degree of depletion or refertilization of the background material. In a depleted continental lithosphere, the slab has a negligible influence on temperature and the seismic signal is dominated by the high velocities of the eclogitic basalt. To what extent anomalies due to the thin oceanic crust can be imaged seismically depends on the frequency content of the seismic data used and on the way in which they sample the image volume. If the depleted oceanic lithosphere is instead embedded in a refertilized continental background, then the slab is characterized by a negative temperature anomaly of up to 120 • C (Fig. 2c). An ancient, stalled slab may then look somewhat similar to an actively subducting slab. The reason for this, however, is not the thermal inertia of the slab, but instead the low concentration of radiogenic elements due to melt depletion. If instead the subducted plate is continental, seismic anomalies are (not surprisingly) dominated by the (not well constrained, but definitely very large) compositional effects of the subducted continental crust. As with the other models, thermal effects will further enhance these effects.
Thus, the simple models investigated here already show a remarkable degree of complexity, and one can imagine that a superposition of subduction and lithospheric juxtaposition effects, which is likely to occur in real cratons, can lead to even more complex seismic anomalies that are not easily interpreted in terms of tectonics. Interestingly, thermal and compositional effects generally act in unison making total seismic anomalies larger than usually expected for these stable regions.

Absolute seismic velocities
For many cratons, the best-resolved seismic information consists of average 1-D velocity profiles Pedersen et al. 2009). These show that cratonic velocities are several percent higher than global reference models, reaching maximum velocities of 4.7-4.8 km s −1 in V S and at least 8.1-8.2 km s −1 in V P . It has been noted previously that such high velocities require extremely depleted compositions . Our calculations confirm that only the dunitic compositions reach such high values, although uncertainties in the forward calculations of absolute Thick grey line segments represent an estimate of two times the standard deviation in peridotite absolute velocities as obtained from previous analyses of a large range of plausible elastic and anelastic mineral properties Cammarano et al. 2003). velocities are greater than those of seismic anomalies Cammarano et al. 2003), hence somewhat less depleted compositions can not be fully ruled out. In our calculated seismic velocity profiles for constant composition (Fig. 5), V S below the Moho always clearly decreases with depth, although it does so slightly less with the newest data bases than with the previous compilations used by Goes et al. (2000) and Bruneton et al. (2004b). A low velocity zone in V S is thus expected in the transition region from conductive lithosphere to the convective mantle. On the other hand, V P has a gradient that is much closer to zero and it can be slightly positive, negative, or switch sign over the depth range of a 200-250-km-thick cratonic thermal lithosphere (depending on the details of the thermal structure and composition). James et al. (2004) and Afonso et al. (2008) also found that for Archean geotherms, V P can increase with depth while V S decreases. This result can be understood in terms of the higher temperature-sensitivity of V S than of V P . The decrease in V S with depth is due to a dominant sensitivity to the increasing temperature. For V P , the competing effects of increasing pressure and increasing temperature nearly cancel each other. Interestingly, this implies that for V P , a low-velocity zone near the base of the lithosphere is expected to be weak or absent.
Our predictions of differences between the vertical gradients of V S and V P as well as the presence or absence of a low velocity zone are consistent with a number of published seismic profiles. Simon et al. (2003) constructed average V P and V S profiles below southern Africa, and found a low velocity zone in V S near the base of the lithosphere, but none in V P . Other studies also found no low velocity zone in V P in southern Africa (James et al. 2004), below the Slave craton (Bank et al. 2000), and below the Baltic region (Sandoval et al. 2004), although it has to be noted that teleseismic data may not be optimal to vertically resolve such a zone. In contrast, a low velocity zone in V S has been detected below cratonic Africa (Priestley et al. 2006), a moderate one below the Slave craton (Chen et al. 2007), in a number of the Superior province profiles studied by Darbyshire et al. (2007), and in various cratonic regions (Lebedev et al. 2009). Pedersen et al. (2009), on the other hand, found no such low velocity zone in V S below the Baltic Shield, Kaapvaal, Slave or Yilgarn cratons.
In the most fertile compositions, V P and V S are relatively low in the spinel stability field from just below the Moho down to 60-70 km. Highly depleted compositions contain too little aluminium to form enough spinel to influence seismic velocities.
The interface between layers of different degrees of depletion produces jumps of a few percent in seismic velocity due to the joint thermal and chemical effects (Fig. 5). Interestingly, in the Slave craton some of the tomographically imaged anomalies appear to correlate with discontinuities observed with receiver functions (Snyder et al. 2004). Our models show that this does not mean that both seismic signatures are purely non-thermal in origin. While the reflections would only sense the sharp chemical velocity jump, tomography would be affected by the broader total thermal plus chemical contrast.

Thermal anomalies
The results show that lateral differences in the concentrations of radioactive elements can lead to lateral temperature contrasts of up to 200-300 • C, especially in thick lithosphere. The magnitude of the thermal effect is not limited by observations of radiogenic heat production (xenolith samples often display significantly higher radiogenic heating than the values used in our models Rudnick et al. 1998), but instead it is limited by the dynamic requirement that the lithosphere must be cooler than the asthenosphere. This limitation is particularly stringent for radiogenic anomalies of large horizontal extent (such as our juxtaposition models) since all the heat generated must be conducted vertically upward. For anomalies of smaller horizontal extent, greater local concentrations of radioactive elements are possible which lead to hotter thermal anomalies. Especially in the upper or middle lithosphere such smaller but hotter anomalies would be stable because temperatures here are well below the mantle adiabat and below the melting temperature.
In our models, we use boundary conditions of constant temperature at the base of the lithosphere. High heat production in the mantle lithosphere thus leads to elevated temperatures in the lower lithosphere, which in turn result in reduced heat flux from the underlying asthenosphere due to the effect of thermal blanketing. If boundary conditions of constant heat flux were employed instead, deep lithospheric regions with high heat production would heat up even more, and lateral temperature anomalies would become more pronounced and extend to the base of the lithosphere. The appropriate type of boundary condition is intrinsically linked to the question of stability and equilibrium thickness of the lithosphere. This problem can only be resolved by dynamical models that include the convecting mantle as well as lateral variations of heat production within the continental lithosphere.
Models Juxt Lith and Juxt Crust illustrate that the surface heat flow is dominated by crustal heat production, while deep lithospheric heat production and the associated large thermal anomalies at depth have only a minor effect. An important conclusion is therefore that our models with variable degrees of deep heating (i.e. Juxt Lith) are realistic in the sense that they match surface observations of nearly constant heat flow even in cratonic regions of strongly varying deep seismic velocities. Conversely, the results also show that surface heat flow is not an effective tool in determining deep lithospheric temperatures. Relatively small uncertainties in radiogenic element concentrations in the mantle lithosphere can lead to large errors in calculated temperatures. While the correlation between surface heat flow and crustal heat production is causal (that is, high crustal heat production directly causes high heat flow), we suggest that the observed general correlation between low surface heat flow in Archean terranes and concurrent high seismic velocities in the mantle lithosphere is essentially coincidental. Whatever process formed Archean lithosphere and crust, it produced mantle lithosphere which is seismically fast (probably due to depletion in major and in radiogenic elements), but it also produced crust with relatively low total heat production. However, there is no direct physical mechanism which links surface heat flow and deep seismic anomalies. In fact, as we show, it is possible to have low surface heat flow even in regions of low seismic velocity at depth. This conclusion is directly supported by the observation of large regional variability in seismic velocities in cratonic regions with approximately constant surface heat flow (e.g. comparing heat flow in the Baltic Shield (Balling 1995) with corresponding seismic structure Bruneton et al. 2004b).

Lateral seismic variations
Under simple but realistic assumptions about the tectonic structure of cratonic lithosphere, the effect of temperature due to variable degrees of depletion in radiogenic elements combined with the effect of composition due to varying degrees of depletion in major elements leads to significant lateral contrasts in seismic velocity. The modelled seismic velocity variations reach values of up to 4.5 per cent in V S and 3.5 per cent in V P if the composition covers the full range from fertile lherzolite to highly depleted dunite. The magnitudes of the seismic anomalies in our models are similar to the anomalies typically observed under cratons. Only the largest observed variations (up to 6 per cent in V S and 4 per cent in V P have been resolved under the Baltic shield Bruneton et al. 2004b;Sandoval et al. 2004) remain difficult to explain with our models. Although anomalies of crustal material embedded in the lithosphere can reach such magnitudes, most of the seismic imaging has been done with long-wavelength data, which would most likely not recover such small-scale perturbations. As seismic tomography tends to underestimate rather than overestimate anomaly amplitudes, the largest seismic anomalies imaged require less straightforward explanations. We will return to this in the next section.
The range in major element composition and in radiogenic element concentrations is seen to have a dominant impact on the model results. The chemical composition of the typical lithospheric components (i.e. excluding subducted oceanic or continental crust) on which we base most of our results is within the confines of cratonic lithosphere sampled by xenoliths and xenocrysts. The chemical tomography profiles of Griffin et al. (2003b) document both a large amount of vertical and lateral variability in composition, and a wide range of depletion signatures from fertile lherzolites to depleted harzburgites. Dunites are not commonly sampled, but it is argued that these are nonetheless a common component of Archean lithosphere (Beyer et al. 2006;Bernstein et al. 2007;Griffin et al. 2008).
Measurements of radiogenic elements in xenoliths also indicate a great amount of variability as well as maximum values that are well in excess of the assumptions used in our models. The data compilation of Rudnick et al. (1998) reports mean and median values of heat production in cratonic peridotite xenoliths of 0.093 and 0.044 W μm −3 , respectively, and extreme values of over 0.15-0.2 μW m −3 . While there may be statistical biases due to the choice of samples collected, and possible refertilization by the volcanism which transported the xenoliths to the surface, the range in heat production in the cratonic lithosphere appears to be greater than assumed in our models. The higher observed values of heat production are often discounted as non-representative because they do not yield plausible geotherms for thick lithosphere (Rudnick et al. 1998). However, for a strongly heterogeneous lithosphere, high heat production may occur locally and cause significant thermal and seismic perturbations of relatively small spatial extent.

Vertical seismic variations
The seismically imaged anomalies are 3-D, implying that the variations are not only lateral, but also vertical, with faster material overlying slower material in one region while it is the other way around in another. Substantial vertical variations in seismic velocities are documented in the regional V S profiles from surface waves (Bruneton et al. 2004b) which have better vertical resolution than the teleseismic body waves used for most V P models. Also some other 1-D V S profiles display more complexity than expected from our models for mantle lithosphere of constant composition (Chen et al. 2007;Lebedev et al. 2009). Models like our lithospheric juxtaposition model yield V S between 4.7 and 4.8 km s −1 from the Moho down to about 120-150 km for the entire range of common lithospheric compositions. This range is compatible, for example, with profiles that Bruneton et al. (2004b) recover in the northwestern, dominantly Archean, part of their study area in the Baltic Shield, and would indicate a more depleted composition like dunite above 120 km and a less depleted composition like our refertilized composition below. The more constant and higher velocities below the Proterozoic, Svecofennian, southeastern part of the model are compatible with a dunitic composition down to about 200 km depth.
Layering of more buoyant depleted lithosphere over less buoyant fertile/refertilized lithosphere seems dynamically most logical, and is in agreement with a number of xenocryst-derived chemical depth sections of the lithosphere (Griffin et al. 2003b). However, the observations of deeper high velocities, if attributed to more depleted compositions, would require that dynamically less stable configurations also exist. If indeed, as commonly argued based on numerical experiments, high viscosity is more important in stabilizing cratonic lithosphere than buoyancy, gravitationally unstable layering may not be a problem (King 2005;Sleep 2005). Modelling of isostasy and geoid indicates that purely dunitic lithosphere would be too buoyant and that significant quantities of more fertile material need to be present, but this gives no information on how the added density would be distributed within the lithosphere .
Significantly lower seismic velocities in the range between 4.6 and 4.7 km s −1 are found between 50 and 100 km depth in the northwestern Svecofennian part of the SVEKALAPKO study region (Bruneton et al. 2004b), as well as many of the cratonic paths studied by Lebedev et al. (2009), and between 70 and 120 km depth below the Slave craton (Chen et al. 2007). Although our synthetic absolute velocity carry substantial uncertainties (Fig. 5), we cannot generate the whole range from 4.6 to 4.8 km s −1 just with the thermochemical signature of the depletion series from peridotite to dunite. For the same reason, we cannot generate the maximum range of observed velocity anomalies. At least part of this range requires additional complexity. Lebedev et al. (2009) propose that the spinel-garnet transition (widened and shifted by high Cr content) is responsible for their low sub-Moho velocities and for the positive velocity gradients down to about 100 km. However, this (1) requires a very fertile composition just below the Moho and (2) results in a velocity variation that is lower than that observed. When we use a mechanical mixture of a depleted composition with 10-20 per cent basaltic component (Xu et al. 2008), this results in relatively low velocities below the Moho, possibly down as far as 70-80 km, depending on where the eclogite transition is complete. However, these velocities are probably still not low enough. Bruneton et al. (2004a) suggested that clinopyroxene-rich cumulates might contribute. These would be expected to have very low seismic velocities. All of these proposed mechanisms would require that a melt/volatile-enriched composition exists in some cratonic regions at shallow depths in the lithosphere. Vertical layering other than bottom-up metasomatism by the underlying mantle may result from accretion mechanisms of cratonic formation. For example, if two parts of lithosphere are assembled at a subduction zone, the overlying lithosphere may be hydrated and thus end up more metasomatized than the lower lithosphere. The observations appear to require such composite thermochemical signatures and a diversity of scenarios. Models such as ours may help quantitatively interpret seismic models.

C O N C L U S I O N S
Where it has been imaged in detail, the seismic structure of the cratonic lithosphere exhibits considerable lateral and vertical variability. This variability is probably mostly related to different degrees of metasomatism. As Griffin et al. (2003a) write, 'The trace-element patterns of many garnets reflect the metasomatic refertilization of originally highly depleted harzburgites and lherzolites, much of the lateral and vertical heterogeneity observed in the subcontinental lithospheric mantle within the craton is the product of such metasomatism.' We show that such compositional variations due to depletion and refertilization have thermal and chemical seismic signatures that typically enhance each other and can explain much of the relatively large variability in seismic structure for stable continental lithosphere. The variations can produce gradients in seismic velocity that may be strong enough to generate reflections that can be imaged for example with receiver functions. At the same time, the variations in surface heat flow associated with deep thermal anomalies are small because the corresponding differences in total heat production are minor and because regions of increased deep heat production coincide with decreased heat flow from the asthenosphere.
Observed seismic structures are most compatible with a formation history that juxtaposed lithospheric blocks of different depletion/refertilization in various configurations, including some where dense blocks are stacked above lighter ones. Such complexity appears more compatible with tectonic accretion than with singleevent superplume scenarios of craton formation.

A C K N O W L E D G M E N T S
We would like to thank Jean-Claude Mareschal and an anonymous reviewer for thoughtful comments which helped to improve the manuscript.