On the modelling of self-gravitation for full 3-D global seismic wave propagation

SUMMARY We present a new approach to the solution of the Poisson equation present in the coupled gravito-elastic equations of motion for global seismic wave propagation in time domain aiming at the inclusion of the full gravitational response into spectral element solvers. We leverage the Salvus meshing software to include the external domain using adaptive mesh reﬁnement and high order shape mapping. Together with Neumann boundary conditions based on a multipole expansion of the right-hand side this minimizes the number of additional elements needed. Initial conditions for the iterative solution of the Poisson equation based on temporal extrapolation from previous time steps together with a polynomial multigrid method reduce the number of iterations needed for convergence. In summary, this approach reduces the extra cost for simulating full gravity to a similar order as the elastic forces. We demonstrate the efﬁcacy of the proposed method using the displacement from an elastic global wave propagation simulation (decoupled from the Poisson equation) at 200 s dominant period to compute a realistic right-hand side for the Poisson equation.


I N T RO D U C T I O N
The last decade has witnessed an unprecedented increase in highquality long-period seismic data. This is because of the occurrence of several very large Earthquakes that were recorded on an exponentially growing number of broad-band seismic stations that are installed in very dense networks such as the USArray. These new data have led to an ever increasing resolution in tomography for seismic velocities. Density however, even though it is a key parameter in models of the geodynamical evolution of Earth's mantle as density differences drive mantle flow, remains poorly constrained.
While tomography based on full 3-D simulations of the seismic wavefield has become a standard tool in local to continental scale seismology (e.g. Fichtner et al. 2009;Tape et al. 2009;Virieux & Operto 2009;Zhu et al. 2012;Warner et al. 2013;Virieux et al. 2017), this is not yet the case for whole planet models. Early attempts (Lekić & Romanowicz 2011;French & Romanowicz 2014) used approximations for the gradients and only recently the first global model fully based on the adjoint method became available Lei et al. 2020). To the best of our knowledge, no such study exists for the normal mode frequency range. However, it is the long-period range in which density is more likely to be accessible than for other seismic observables (Ishii & Tromp 1999;Koelemeijer et al. 2017). The main reason for this discrepancy is the fact that there is no established method to model the full physics of long-period seismology in 3-D and compute gradients with respect to material properties at reasonable computational cost.
Because density information is contained in the normal mode spectra at relatively small amplitudes compared to the seismic velocity structure, an accurate implementation of the underlying physics is required. The widely used splitting functions are based on the self-or group-coupling approximation, which introduces an error on the same order of magnitude as the effect of density perturbations itself (Akbarashrafi et al. 2018). Another difficulty in normal mode coupling theory arises from the linearization of the effects of boundary perturbations such as topography, ellipticity and crustal thickness, though recent work by Al-Attar et al. (2018) addresses this issue by means of a particle relabling transform. In a related work, Maitra & Al-Attar (2019) apply the same spatial mapping to the Poisson equation to derive an efficient and numerically exact solution in aspherical planets.
In contrast, time domain methods such as the spectral element method, usually avoid the full implementation of gravity and use the Cowling approximation instead (e.g. Komatitsch & Tromp 2002a). This approximation assumes that the gravitational potential is constant in time and ignores gravity perturbations caused by the seismic displacement itself. The theory describing the complete gravity physics is however well established and demonstrated to work with the spectral element method (Chaljub & Valette 2004).
On the modelling of self-gravitation 633 Fig. 1 shows the relative error caused by the Cowling approximation in eigenfrequency or, equivalently, phase velocity for all spheroidal modes up to 15 mHz in PREM (Dziewonski & Anderson 1981) in a dispersion diagram (a) and as a function of frequency (b). As expected, a clear tendency of a decreased error with increased frequency is apparent and the error is largest for the lowest frequency modes. However, especially at higher frequencies the error also depends on the mode type: the lowest error is seen for the ScS equivalent modes with very low group velocity at low angular degree. A slightly higher error level is exposed by the fundamental mode Rayleigh waves and their overtones. Most sensitive are the CMB and ICB Stoneley modes as well as the core modes. This pattern can be explained by the displacement eigenfunctions: while vertical displacement across density contrasts directly perturbs the gravitational potential (e.g. Rayleigh and Stoneley modes), the perturbation is smaller for horizontal displacements where the density change is only due to the normal strain (e.g. ScS modes). That said, the ICB Stoneley and core modes are of lower importance in many applications focused on the mantle, as they cannot be easily excited directly by an earthquake in the crust or upper mantle. The Cowling approximation can be considered sufficiently accurate for frequencies above 5-10 mHz in most applications.
While not yet common, it is in principle possible to compute long-period spectra using time domain methods (e.g. Nissen-Meyer et al. 2014;. Fig. 2 shows amplitude spectra for the vertical component seismogram recorded at the black forest observatory in Germany for the Tohoku Oki event. The spectra are computed from 32 hr of elastic 3-D time domain spectral element wave propagation and ignore the effects of gravity, which we are addressing in the following. The incremental change from addition of surface and Moho topography, 3-D velocities and density from S20RTS (Ritsema et al. 1999) is computed as the absolute value of the complex difference of the spectra. Large phase shifts thus cause differences to exceed the 1-D reference spectrum in amplitude. It is apparent that the strongest effect is from surface and Moho topography (including ellipticity) and the effect of lateral velocity variations is only marginally smaller. The effect of density however is an order of magnitude smaller, confirming the necessity for accurate modelling. As the dispersion error in time domain simulations accumulates over long propagation distances, correction for the dispersion error caused by the time stepping scheme (e.g. Koene et al. 2018) is an attractive alternative to more accurate time integration (e.g. Nissen-Meyer et al. 2008).
Two main challenges arise in including full self-gravitation in the 3-D spectral element method: while the purely elastic simulations can be formulated in a fully explicit way and only require the computation of matrix-vector products, self-gravitation couples a Poisson equation into the system that needs to be solved in each time step. The cost of solving this system exceeds the cost of the elastic terms by far (Chaljub et al. 2007), rendering this unpractical, in particular for inverse problems. Secondly, the spatial domain of the equation extends to the full space, with a boundary condition at infinity that cannot be treated directly by a standard space-discretized method.
We address both these issues in an effort towards the inclusion of the full physics for long-period seismology in an efficient spectral element method. In Section 2, we introduce the problem in quantitative terms and detail the steps towards an efficient solution. In Section 3, we then apply these methods to the Poisson problem using the wavefield from a purely elastic simulation as a test case.

Problem statement
The Poisson equation defining the perturbed gravitational potential is given by (e.g. Chaljub & Valette 2004) where G is the gravitational constant, ρ is the mass density, u is the seismic displacement and the Earth is denoted by e . ψ vanishes at infinite distance from the Earth and is continuous everywhere including at Earth's discontinuities and surface. The normal derivative of ψ is discontinuous at these interfaces and the jump is controlled by In the fluid parts, the seismic displacement can be computed from the displacement potential(s). The right-hand side (RHS) of eq. (1) has two contributions: (1) the change of density due to compression or dilation of the medium and (2) the displacement of material with heterogeneous density, including motion perpendicular to internal discontinuities and the free surface.

Boundary conditions
One difficulty for the numerical solution of Poisson's equation is that the domain is the full R 3 with a homogeneous Dirichlet boundary condition that needs to be applied at infinity. Chaljub & Valette (2004) approach this problem using a Dirichlet-to-Neumann operator to derive a Robin-type boundary condition at a spherical boundary large enough to compensate for Earth's non-spherical shape. This method requires computation of the spherical harmonic transform of the potential at this boundary in each iteration to couple it to the analytical solution in the outer domain. As an alternative, Gharti & Tromp (2017); Gharti et al. (2018) propose to use infinite elements to virtually extent the domain by mapping one face of the finite elements to infinity. The efficacy of this method is based on the fact that the potential decays with distance and this can be accommodated in the space of test functions. To maintain the convergence order, a Gauss-Radau quadrature rule is then used in the radial direction which avoids the evaluation of the basis polynomials on the outer boundary. In contrast, we use the same quadrature in all elements both in the interior and exterior domain. Here we argue for a different approach: it is the low-order terms of the solution that are most sensitive to the outer boundary condition, as they decay with low powers of 1/r. Higher-order terms decay quickly with distance from Earth's surface. We thus use the analytical solution given by the multipole expansion to derive the Neumann boundary condition at a finite radius. This is similar to the approach by Chaljub & Valette (2004) but instead of computing the spherical harmonic expansion over the surface of the domain in each iteration of the conjugate gradient solver, it requires computation of the expansion of the RHS of eq. (1) over the volume of the Earth only once per time step. The Neumann boundary condition only determines the potential up to an additive constant in contrast to a Dirichlet condition at infinity, however, the resulting forces in the wave equation only require the gradient of the potential and are insensitive to such a constant. While in theory a Dirichlet condition may be preferred, it requires the splitting of the solution into a homogeneous and inhomogeneous solution, elimination of the degrees of freedom on the surface from the linear system or introduction of (A) (B)  a penalty term on the boundary. As we found no convergence issues with the Neumann condition, it is the preferred solution due to its simplicity. The Neumann boundary condition can be written as where with e ⊂ ⊂ R 3 denotes the finite computational domain. As the RHS of eq. (1) is compactly supported within the Earth, the exterior field can be expanded into the multipole series in powers of 1/r where Y lm are the complex valued spherical harmonic functions and q lm are the spherical multipole moments, The radial derivative needed to obtain the Neumann boundary condition according to eq. (3) if we assume to be spherical can then be calculated directly: For the problem to be well posed, the Neumann boundary condition needs to fulfil the compatibility condition, that is we need to demonstrate that As the surface integral over spherical harmonics vanishes for all terms other than the monopole term with l = 0, it can be shown with the help of eqs (5) and (6) that g indeed satisfies the compatibility condition and In practice, the multipole expansion needs to be truncated to a maximum degree l max . Fig. 3 shows the maximum angular degree of the exterior solution as a function of the radius based on a threshold ε of the multipole expansion (4): This relation can be used to determine both the l max for the boundary condition given the domain size and the lateral element size in the exterior domain. Similar to the truncation of the series we employ here, infinite elements also incorporate terms up to a maximum power of 1/r only and the exponent is equal to the polynomial order of the Lagrange basis in radial direction in the layer of infinite elements (eq. 25-27, Gharti et al. 2018), and this order can be chosen independently of the polynomial order in the volume elements. With a polynomial order of 2, however, as used in the numerical examples by Gharti & Tromp (2017) and Gharti et al. (2018), a significant buffer layer of normal spectral elements between the Earth's surface and the infinite elements is hence needed to achieve the required accuracy for the perturbed gravitational potential at all angular degrees present in the solution. Unfortunately, the authors are not aware of a quantitative test of infinite elements for high angular degrees, but assume that the required size of the domain and hence the total number of elements needed in the mesh will be similar to the one needed with the truncation of the Neumann boundary condition at the same angular degree.

Discretization
In order to apply the spectral-element method (SEM, Patera 1984;Chaljub et al. 2007) to the Poisson equation, it needs to be written in the weak form. This is achieved by multiplication of eq. (1) with a test function ϕ and integration by parts: The surface integral over ∂ on the RHS and surface integrals over internal interfaces vanish due to the jump condition eq. (2). The jump condition is hence readily and implicitly included in the weak form, which is equivalent to the strong form of the equation provided that it holds for all test functions ϕ from an appropriately chosen set. The last term can be identified with the Neumann boundary condition: We then subdivide the domain into hexahedral elements and use the standard SEM with Gauss quadrature on the Gauss-Lobatto-Legendre (GLL) points with Lagrangian interpolation functions to write the eq. (10) in matrix form: Here, K is the stiffness matrix, ψ is the vector of degrees of freedom and the RHS f contains both the density perturbation and the Neumann boundary condition. For this study, we explicitly compute K and assemble it as a sparse matrix. While most codes that aim at optimal performance avoid this assembly and use matrix-free implementations, we chose the matrix-based approach for its simplicity, allowing for an implementation in Python and relying on available libraries to achieve acceptable performance in the solver for the purpose of this paper. For a production code that would couple directly to the seismic simulation, a large part of the implementation is equivalent to acoustic wave propagation that can hence be reused. Here we only discuss performance in terms of number of iterations and not in terms of actual runtime, so that this less efficient implementation can be ignored.

Meshing
Cubed sphere meshing (Ronchi et al. 1996) in combination with deformed regularly gridded cube and lateral refinements by doubling or tripling layers has been established as the standard method in numerical global seismic wave propagation (e.g. Komatitsch & Tromp 2002b;Chaljub & Valette 2004). This approach can naturally be extended to also include the outer domain as shown in Fig. 4. While doubling layers in theory allow to better approximate the desired element size based on the S-wave length, we find tripling layers to be easier to locate. This is because fewer refinements are need to achieve the same change in element size and tripling layers only span one element in radial direction compared to three elements for the doubling on the full sphere due to the continuity conditions at the boundaries of the cubed sphere chunks (see fig. 4 of Komatitsch & Tromp 2002b). The resulting meshes have very similar numbers of elements for long-period meshes. Furthermore, the smallest elements that determine the time step due to the Courant-Friedrichs-Lewy (CFL) condition are always located in the crust in this application, as the radial element size is constrained by the crustal thickness to have an element boundary conforming with the Moho. Also note that we employ elements that approximate the spherical shape with polynomials at the same order as the test functions so that much fewer elements are required to achieve acceptable accuracy (van Driel et al. 2020, figs 8 and 9).
While the element size within Earth is constrained by the local S-wave length and we assume this to be a conservative choice for the perturbed potential, there is no such constraint on the element size in the exterior domain. As can be seen from Fig. 3, the lateral complexity of the perturbed potential decays rapidly with increasing distance from Earth's surface and this can be accounted for by coarsening the mesh using a tripling layer in the same way as in the mantle. Furthermore, the potential decreases monotonically as a function of the radius in the exterior domain. While the decay can be on short distances just above the free surface, the complexity is extremely low at larger radii. This suggests that the radial element size needs to be smaller close to Earth's surface and can then increase rapidly with distance. To accommodate these constraints, we extend the mesh with a first layer of approximately isotropic elements, where the size is given by the S-wave length in the crust. This buffer layer is followed by a coarsening layer that increases the lateral element size by a factor of three. Finally, we add a configurable number of elements that increase in size with distance, where the radial nodes are computed as r i = r 0 + h 0 · dr i . Here r 0 and h 0 are the radius and lateral element size of the preceding layer of elements and dr is a parameter to tune the element growth rate to the complexity of the solution. For the range of number of elements in radial direction (up to about 10) and values for dr between 1.3 and 3, this keeps the aspect ratio of the elements in an acceptable range to avoid potential ill-conditioning of the system.
We will determine the necessary domain size as well as an appropriate element shape empirically in Section 3.1. With this meshing approach the number of elements in the outer domain is only a fraction of the total number of elements (21 per cent for the mesh shown in Fig. 4 where the exterior domain has a radius 14.5 times the Earth radius). One third of these are located in the first layer above Earth's surface that is needed in any case to include topography in the approach by Chaljub & Valette (2004) or the infinite elements in the approach by Gharti & Tromp (2017). This shows that the potential gains from using the infinite element method instead of adaptive mesh refinement and the Neumann boundary discussion as discussed here are relatively small, even if a higher polynomial order in the radial direction was used in the infinite elements to avoid the additional buffer layers discussed above.

Initial solution
The number of iterations needed to solve the Poisson equation can be significantly reduced if a suitable initial solution is known. In the case of the self-gravitation, the RHS is computed from the timedependent displacement, which only changes marginally between time steps. For global simulations at long period, this is even more pronounced because the crust is much thinner than the wavelength. The explicit time-stepping used in the spectral element method is then limited by the CFL condition to values smaller than approximately 0.5 s, with the exact value depending on the crustal velocity and thickness model as well as the surface topography resolution. On the other hand, solutions from preceding time steps are only known to finite numerical accuracy and this may render extrapolation unstable for high order schemes.
Here, we compare three extrapolation methods: First, we use the solution of the previous time step as initial solution, where the index indicates the time step. Second, linear extrapolation can be written as and finally a quadratic extrapolation is given by We evaluate these in a numerical experiment in Section 3.1.

Multigrid solver
The convergence rate of iterative solvers depends on the scale length of the solution, with a higher convergence rate typically associated to the shorter wavelength component of the solution (e.g. Wesseling 1992). Combining multiple discretizations that vary in their spatial resolution, this can be exploited to speed up the overall convergence. In the case discussed here, the element size is dictated by the S-wave velocity and the solution to the Poisson equation is dominated by longer scale components, which suggests that a multigrid approach may improve the convergence significantly. As we work with fully unstructured meshes, no straightforward coarsening of the mesh exists, in contrast to the hierarchical meshes often uses (e.g. Bank et al. 1988;May & Knepley 2011). However, the polynomial basis we employ typically has a polynomial degree p = 4. Thus, bases with lower polynomial orders on the same mesh can then be used to create a coarser spatial discretization (e.g. Craig & Zienkiewicz 1985;Foresti et al. 1989;Helenbrook et al. 2003; Bello-Maldonado & Fischer 2019). For two polynomial spaces P m and P n with different orders m and n, the projection of an element ϕ m from P m into P n is defined as the solution to ϕ n = arg min This is a strictly convex problem with a unique solution satisfying (ϕ n , ϕ) = (ϕ m , ϕ) for all ϕ ∈ P n , where On the modelling of self-gravitation 637 denotes the L 2 inner product on . Because the lower-order polynomial space is a subset of the higher-order space, the mapping to higher orders is exact and we simply obtain ϕ n = ϕ m for n ≥ m. However, in case n < m, the projection requires the solution of the linear system defined by eq. (17) with the Gram matrix for basis vectors ϕ i n , ϕ j n ∈ P n and an RHS b defined by Here, it is important to use exact integrals and not the Gauss-Lobatto quadrature rule as used by the SEM because the latter is only exact up to order 2n − 1 and thus not suitable for computing eq. (18). In the following we require the projection to lower orders only for the RHS of eq. (1). The RHS is allowed to be discontinuous at element boundaries, so we can use the locally optimal projection in each element. Furthermore, because the 3-D SEM basis is formed by the tensor product of the 1-D basis, the 3-D mapping matrices are obtained by solving the linear system mentioned above in 1-D and applying the resulting projection to the three dimensions subsequently. In the following, we refer to the mapping of the 3-D SEM basis from order m to n as P mn .
In the application discussed here, a good initial solution is available from the extrapolation from previous time steps, see previous section. Hence, the residual only needs to be reduced by a small factor and this allows to apply a simplified multigrid approach with N stages, going through each stage exactly once. We denote the full resolution as stage 0 and indicate the stage with an upper index on all variables. The Poisson equation to be solved can then be written in the form Assuming that a good initial solution ψ 0 0 is available, the residual is defined as Due to the linearity of the equation, the residual can be used as a RHS to solve for the correction to the initial solution. To improve the convergence rate, we first solve this equation at lower polynomial order and use the result as an initial solution in the next higher order stage. This process is iterated until reaching the highest resolution stage. In each stage n of the multigrid starting at the coarsest discretization we hence first compute the RHS by restricting the residual to the polynomial order of this stage: Note that the RHS does not need to be continuous across element boundaries so the restriction to the lower order is local to each element. Then, the solution is smoothed using several conjugate gradient iterations to reduce the error on where the subscript on ψ indicates on which stage and corresponding RHS it was computed and the superscript indicates the discretization stage. While on the coarsest stage we use a zero initial solution, the initial solution for the other stages is computed by interpolation: Ultimately, the final solution at the full resolution is obtained by correcting the initial solution ψ 0 0 accordingly:

Solver verification
To verify the correct implementation of our numerical solver, we compute the gravitational potential for the 1-D PREM density and compare it to the semi-analytic solution obtained by numerical integration over the radius of the planet (e.g. Dahlen & Tromp 1998). While the homogeneous Dirichlet condition at infinity was assumed to require a large computational domain in previous work (Gharti & Tromp 2017), we apply the Neumann boundary condition for the monopole term (the only non-zero term in the multipole expansion in this 1-D example) directly at Earth's surface. While this offsets the potential by a constant relative to a solution with homogeneous Dirichlet condition at infinity, the absolute value of the potential has no physical significance. The resulting gravitational force is given by the gradient of the potential, rendering the force invariant under the addition of a constant to the potential. To measure the quality of the numerical solution, we hence subtract the mean value from the difference to the analytical solution, see Fig. 5. For a finite element based method, the more difficult challenge in this test case is the accurate representation of the spherical shape. While analytical mappings between the reference coordinates in each element and the physical coordinates could be used in this exactly spherical case (compare e.g. Chaljub & Valette 2004;Nissen-Meyer et al. 2007), we prefer a more generic approach using polynomial approximations to be able to include topography at a later stage. The accuracy of approximating the sphere by polynomials on the GLL points with relatively few elements is demonstrated by van Driel et al. (2020). Here, we use the same polynomial order for the shape representation as for the test functions, that is we use isoparametric elements. This choice is the reason that for polynomial orders n ≥ 2 the convergence rate seen in Fig. 5(d) is approximately n + 2: not only is the solution represented more accurately, but additionally the accuracy in representing the spherical shape improves with increased order and decreased element size.
In the first-order shape representation n = 1 that is commonly used in seismology, when placing the nodes of the elements on the planet's surface, the volume of the sphere is systematically underestimated. This explains the particularly slow convergence.
On the other hand, the meshes in the Poisson problem are designed for seismic wave propagation primarily, and assuming a crustal Swave velocity of 3.2 km s −1 , the lateral element size at the surface is h = 1600 km when using two elements per wavelength at 1 mHz. Using n = 4 for the shape approximation is hence a conservative choice to avoid errors in the potential and n = 2 is likely sufficient for meshes designed for shorter periods.

A P P L I C AT I O N T O S E I S M I C WAV E S
To verify our approach, we consider the Poisson problem where the RHS is computed from a purely elastic seismic wave propagation ignoring the coupling between the gravitational potential and the seismic displacement. Although self-gravitation has a significant effect on the longest period modes, we consider this to be a realistic test case to evaluate the performance of the Poisson solver separately. A snapshot from such a simulation is shown in Fig. 6.
The seismic simulation is based on the mesh shown in Fig. 4, designed to resolve S waves at 200 s with two elements per wavelength in the PREM velocity model and uses the Salvus wave-propagation software package (Afanasiev et al. 2019). This results in a total of 105K elements, 21 per cent of which are in the exterior domain. The source is the centroid solution of the Tohoku Oki earthquake with a (A) (B) (C) (D)  half-duration of 100 s and the snapshot is taken 900 s after the quake. With a time step of 0.34 s governed by the stability condition for elements in the crust, the computational time was 16 s on two Nvidia Titan X GPUs. In the fluid part of the core, the displacement shown in A is computed as the gradient of the displacement potential times the density. From the displacement, we compute the RHS according to eq. (1) and then solve the discrete system (eq. 12) using the conjugate gradient method with a diagonal Jacobi pre-conditioner and homogeneous Neumann conditions for simplicity. We use the same mesh (within the volume of the Earth) and polynomial order as in the elastic simulation, resulting in a total of approximately 800 iterations to achieve a residual of 10 −5 , which was determined by Chaljub & Valette (2004) as sufficiently accurate. Fig. 6 shows both the fields as well as their time derivatives. While the potential is dominated by the static displacement close to the source, the time derivative also shows significant contributions from Rayleigh and P waves. Love and S waves have a very small contribution to the RHS of eq. (1) as they have no associated divergence and only contribute by translating material with a density gradient. However, at the free surface and the core mantle boundary, P to S converted phases are clearly visible both in the RHS and the resulting potential. These observations confirm the expectations discussed in Fig. 1.
In the following subsections, we use several time steps around this snapshot to evaluate the efficiency of the numerical approach described in Section 2.

Boundary condition verification
To verify the Neumann boundary conditions introduced in Section 2.2 and choose appropriate values for the exterior domain radius and maximum degree l max in the multipole expansion of the RHS, we perform a convergence test, see Fig. 7. The reference solution was calculated with l max = 16 and a radius of r = 38.1 · 6371 km, so that the error due to the finite domain size can be neglected. The element growth parameter is constant dr = 1.4 for all cases (see the following subsection and Fig. 8).
This test shows empirically, that the L 2 error within the volume of the Earth computed relative to the reference solution converges depending on the radius of the computational domain r as r −(2lmax+3) .
Due to this fast convergence, relatively low values for both r and l max lead to sufficient accuracy of the solution. A final choice depends on the trade-off between using more elements or more expansion coefficients, but r ≈ 3r Earth and l max = 4 appear to be reasonable values.
Higher values of l max were suggested by Chaljub & Valette (2004), presumably because in their numerical tests they assumed a spherical Earth and applied the boundary condition directly on the free surface. As can be seen from Fig. 7, it is likely more efficient to extent the domain to some degree rather than just increasing l max : the cost of the multipole expansion scales with l 2 max and the number of elements scales subproportionally with r due to the increasing radial element size even without a coarsening layer.  Fig. 7 also gives an indication for the required lateral resolution as a function of the radius, confirming our assumption that lateral coarsening can be applied relatively close to Earth's surface. The remaining question particularly concerns the radial element size in the exterior domain. Fig. 8 shows the L 2 error computed within Earth relative to a reference solution computed with dr = 1.2. In all cases, the exterior domain had 8 elements in radial direction and l max = 16, to ensure that the boundary condition does not contribute to the error.

Exterior mesh verification
The result suggests that dr should be chosen slightly below a value of 2 to achieve an accuracy of 10 −5 , that is a bit less aggressive than what was used to generate the mesh in Fig. 4

Accuracy of initial solutions
To evaluate the accuracy of the three different extrapolation schemes discussed in Section 9, we compute the perturbed potential corresponding to four consecutive time steps to a residual of 10 −5 and then compare the extrapolation from the preceding one, two or three steps, respectively, to the numerical solution of the last one. The time between two steps in this case is 0.34 s using a single crustal layer with a thickness of 25 km. For applications with crustal thickness variations the CFL condition will dictate a smaller time step, which will improve the extrapolation relative to the results discussed here.
The error level of the extrapolated potential is about three orders of magnitude below the field itself for the constant extrapolation and one order of magnitude lower for linear and quadratic extrapolation. Quadratic extrapolation is most accurate for body waves at depth and surface waves, but less so for body waves at the free surface. The near source region is likely dominated by numerical noise introduced by the point source approximation at this accuracy level and hence does not behave physically in the extrapolation.
To quantify this visual impression and evaluate the quality of the extrapolated field as initial solution, we compute the residual for 20 consecutive time steps preceding the one discussed above. The cumulative distribution of these residuals is shown in Fig. 10 for the linear and quadratic scheme. The residuals for the constant extrapolation are beyond the scale of the figure and take values of approximately 1.2 × 10 −3 , almost two orders of magnitude above the other two schemes. The figure shows the result both for a single discretization at fourth order (p = 4) and the multigrid scheme (mg). The quadratic extrapolation appears to perform slightly better  . Cumulative distribution of the initial residual from linear and quadratic extrapolation. For constant extrapolation, the value is consistently around 1.2 × 10 −3 and not shown here. Importantly, the efficacy of the extrapolation also depends on the spatial scheme. than the linear scheme in both cases, however, probably less than expected from the visual impression in Fig. 9. Additionally, the performance of the linear scheme is much more predictable for the multigrid approach with significantly less variance of the residual over the iterations.

Efficiency of MG
The final crucial component of our approach is the polynomial multigrid method introduced in Section 2.6. Fig. 11 shows how the four different stages contribute to the final solution starting from a linearly extrapolated initial solution. In the stages using polynomial orders p = 1 to p = 3, we empirically chose the convergence criterion to be the relative residual of 0.01, 0.05 and 0.05 solving for the update of the initial solution using the residual as the RHS. In the last step, we converge to a residual of 10 −5 in terms of the full potential. While all the long-scale structure of the solution in particular in the exterior domain is readily present in the first stage, the higher orders are needed for a detailed representation of the body and surface waves. Also, the amplitude of the solution in each stage is significantly reduced for the higher-order stages by up to an order of magnitude.
To quantify the performance gained by using multiple stages, we compare the number of iterations required using different time extrapolation schemes as well as the multigrid and the fourth order system. For the multigrid scheme, we consider both the number of iterations at the highest resolution as it dominates the computational cost, as well as a weighted sum of the iterations in all stages. The weighting is estimated from the leading order scaling in the number of FLOPs in a matrix free implementation of the stiffness terms as would be used in a production code, that is p 4 . Fig. 12 shows cumulative distributions of these numbers of iterations; in all cases, the linear extrapolation performs better than both the quadratic and the constant one. Additionally, as already seen in the residual, the quadratic extrapolation again exhibits the largest variance, making the linear extrapolation the best choice. The median value for the multigrid method with linear extrapolation is 5 iterations in the highest order and 8.2 iterations in the weighted sum. This confirms about a factor 5 improvement in the performance by using multigrid in comparison to the same extrapolation used with fourth order only, which is comparable to previous results (e.g. Barker & Kolev 2021).
The linear extrapolation itself leads to a factor 3 reduction in cost in comparison to the constant extrapolation for the fourth order approach and to a factor 10 for the multigrid approach. This however mostly suggests that the simplified multigrid approach we use here with a single cycle through the different orders is not a good choice if the initial solution has a higher residual. With higher initial residuals, we expect cycling through the stages multiple times to be more efficient.
As a reference, for a zero initial solution and using fourth order, the number of iterations are approximately 700 to 900. Chaljub et al. (2007) report a range of 50-100 iterations, though using wavefields with significantly lower frequency (dominant frequency of 1 mHz versus 5 mHz used here, corresponding to 443K versus 6.8M degrees of freedom for the potential). They also use a spatially variable polynomial order from 2 to 10 to improve the condition number of the matrix and do not specify the initial solution, so that a comparison to their numbers requires careful interpretation.  Cumulative distribution of the number of iterations needed to reach a residual of 10 −5 for constant, linear and quadratic time extrapolation and using the four-stage multigrid method as well as just the highest order (4). For the multigrid method, both the number of iterations at the highest order and a weighted sum over all stages based on a FLOP count estimate are shown.

C O N C L U S I O N S A N D O U T L O O K
In summary, linear extrapolation together with the simplified multigrid approach reduces the cost of solving the Poisson equation for the perturbed gravitational potential significantly. As the elastic stiffness terms in the wave propagation are about an order of magnitude more expensive than the stiffness term in the Poisson equation, the reduction to a cost equivalent to less than 10 iterations means that the cost for the Poisson solver is on the same order of magnitude as the elastic terms and no longer dominates the numerical cost. All computations in this paper both for the wave propagation and for the Poisson equation were run on a workstation.
Future work includes the implementation of the presented method into a production software with direct coupling between the gravitational and elastic forces as well as its verification against established solutions. In order to apply this method to full waveform based tomography at normal mode frequencies, classical methods of extracting information from the seismograms (Laske & Widmer-Schnidrig 2015) need to be revised and adapted for this framework and the corresponding adjoint sources need to be derived. The check pointing approach used for the computation of gradients in the adjoint method  needs to be verified for simulations with very high numbers of time steps and potentially extended to use multiple levels (Walther & Griewank 2004). In any case, the work presented here constitutes an important step towards the inclusion of full self-gravitation in routine calculations of long-period seismograms.

A C K N O W L E D G E M E N T S
We would like to thank editor Andrew Valentine, reviewer David Al-Attar and one anonymous reviewer for their thoughtful and constructive comments. This work was supported by grants from the Swiss National Supercomputing Centre (CSCS) under project ID s922, the European Research Council (ERC) under the EU's Horizon 2020 Framework Programme (grant No. 714069) and the Swiss National Science Foundation (SNF projects 172508 'Mapping the internal structure of Mars' and 197369 'Towards a self-consistent Earth model from multi-scale joint inversion: Revealing Earth's mantle elasticity and density with seismic full-waveform inversion, tidal tomography and homogenization').

DATA AVA I L A B I L I T Y
There are no new data associated with this paper.