Numerical noise suppression for wave propagation with finite elements in first-order form by an extended source term

Finite elements can, in some cases, outperform ﬁnite-difference methods for modelling wave propagation in complex geological models with topography. In the weak form of the ﬁnite-element method, the delta function is a natural way to represent a point source. If, instead of the usual second-order form, the ﬁrst-order form of the wave equation is considered, this is no longer true. Fourier analysis for a simple case shows that the spatial operator corresponding to the ﬁrst-order form has short-wavelength null-vectors. Once excited, these modes are not seen by the spatial operator but only by the time- stepping scheme and show up as noise. A source with a larger spatial extent, for instance a Gaussian or a tapered sinc, can avoid the excitation of problematic short wavelengths. A series of numerical experiments on a 2-D problem with an exact solution provides a suggestion for the best choice of parameters for these source term distributions. The tapered sinc provided the best results and the resulting accuracy can be better than that of the second-order form. The higher operation count of the former, however, does not make it more efﬁcient in terms of accuracy for a given computational effort, at least not for the 2-D examples considered here.


I N T RO D U C T I O N
Modelling of seismic data requires substantial computational resources. The finite-difference method is widely used in the oil industry because it is relatively easy to code up and optimize. The finite-element method is computationally more demanding but may offer better accuracy at a given cost in the presence of topography and large impedance contrast, but only if the mesh follows the interfaces between different rock types (Kononov et al. 2012;Zhebel et al. 2014).
A typical finite-element discretization of the wave equation in its second-order form involves a stiffness matrix, related to the spatial derivatives, and a mass matrix, related to the second derivatives in time. Because inverting the large sparse mass matrix at each time step is costly, it is replaced by its mass-lumped version, a diagonal matrix obtained by taking its row sums. The resulting weights are equivalent to those of a numerical quadrature rule. For rectangular types of elements, quadrangles in 2D and hexahedra in 3D, Legendre-Gauss-Lobatto quadrature produces the well-known spectral elements (Komatitsch & Tromp 1999). The accuracy of the inexact mass-lumped version is amply sufficient, exceeding that of the exactly evaluated stiffness matrix by one order.
Spectral elements for simplicial elements, triangles in 2D and tetrahedra in 3D, are more difficult to construct. One or more orders in spatial accuracy are lost by mass lumping, but can be recovered by augmenting the basis function with higher-degree polynomials that are the product of a bubble function and a polynomial (Fried & Malkus 1975). A bubble function is a polynomial that vanishes on all the edges of the triangle. At present, triangular elements are known up to degree nine (Fried & Malkus 1975;Cohen et al. 1995Cohen et al. , 2001Mulder 1996Mulder , 2013Chin-Joe-Kong et al. 1999;Cui et al. 2017;Liu et al. 2017). In 3D on tetrahedra, two kinds of bubble functions are required: face bubbles that vanish on the edges of the faces and interior bubbles that are zero on all edges and faces of the tetrahedron (Mulder 1996). Tetrahedral elements are known up to degree three (Chin-Joe-Kong et al. 1999).  considered their performance for elastic wave propagation. Geevers et al. (2018a,b) recently showed that the accuracy requirements for the construction of these elements is too strong. A modified accuracy condition led to a series of new tetrahedral elements up to degree four.
Discontinuous Galerkin (DG) methods offer an alternative to diagonal mass lumping by giving up conformity and restoring it by penalty terms leading to additional fluxes in the discretization (Rivière & Wheeler 2003;Grote et al. 2006;Käser & Dumbser 2006;De Basabe & Sen 2007;Diaz & Grote 2009). The resulting mass matrix is block diagonal and easy to invert. Note that more accurate schemes for the stiffness matrix, formulated on Fekete (Chen & Babuška 1995;Taylor et al. 2000;Komatitsch et al. 2001;Mercerat et al. 2006) or other optimized sets of nodes (Hesthaven & Warburton 2002, 2006Hesthaven & Warburton 2007;Modave et al. 2015) are well suited for DG methods but not for diagonal mass lumping without loss of accuracy.
Finite-element schemes for the acoustic and elastic wave equation are commonly based on the second-order form of the partial differential equations. Dispersion analysis by Ainsworth (2014) showed that the first-order form provides an accuracy that is better by two orders for the odd-degree Legendre-Gauss-Lobatto elements. However, if the error in the eigenvectors is included, as it should (Mulder 1999), the full error only shows this improvement for the lowest degree elements ): a first-order formulation with linear elements in 1D has fourth-order spatial accuracy, but this requires a consistent or full, mass matrix. With mass lumping, required to avoid the inversion of the mass matrix, the accuracy drops to second order. However, by invoking the defect-correction principle (Stetter 1978), we could show that iterative inversion of the mass matrix requires only one iteration when using the lumped mass matrix as preconditioner, at least on equidistant grids. This result motivated us to consider the first-order formulation of the wave equation with continuous linear elements in 2D, because it might lead to a gain in efficiency over the second-order formulation. Note that the first-order formulation with the DG method is less uncommon (Hesthaven & Warburton 2002Chung & Engquist 2009;Delcourte et al. 2009;Etienne et al. 2010;Wilcox et al. 2010;Modave et al. 2015, e.g.).
Unfortunately, the first-order form with linear elements led to noisy results. We will show that this can be traced back to the oddeven or checker-board decoupling that may occur for some discrete schemes in first-order form. The scheme of Brossier et al. (2008) shows a similar behaviour when used on a regular structured mesh. This decoupling is related to the shortest wavelengths that happen to lie in the null-space of the discrete spatial operator. The firstorder form considered here even has additional null-vectors. Once excited, these will not disappear if the scheme is not dissipative. In this paper, we investigate how to suppress the excitation of these waves by spatially extending the source. This lets the source term act as a high-cut filter.
In seismic simulations, the source term is typically much smaller in size than a wavelength and can therefore be represented by a delta function. In the finite-element formulation of the wave equation, be it in second-or first-order form, integration of the spatial delta function against the basis functions offers a natural way to obtain its discrete representation. To further reduce the wavenumber spectrum of the excited waves, we can extend the size of the source beyond a single element. One option is to replace the delta function by a spatial Gaussian with a small standard deviation. An alternative to a Gaussian is the tapered sinc that Hicks (2002) proposed for finite-difference schemes. A sinc function is the spatial equivalent of a band-limited delta function and the tapering keeps it localized. Waldén (1999) presented piecewise polynomial approximations of the delta function, both for finite-element and for finite-difference schemes. Petersson et al. (2016) applied these ideas to finite-difference scheme for the wave equation in first-order form.
We will examine the performance of four source representations, delta function, Gaussian, tapered sinc and a polynomial approximation of the delta function, with linear mass-lumped finite elements in first-order form in combination with defect correction. For reference, we will also consider the standard second-order mass-lumped finite-element discretization of the acoustic wave equation with linear elements on a triangular mesh.
Section 2 contains a description of the first-order formulation of the acoustic wave equation, the source term representations and Fourier analysis of the schemes for a simple structured 2-D periodic mesh, offering some insight in what to expect. Section 3 presents results for a series of numerical experiments that assess the performance of the various schemes. It ends with a non-trivial example. Section 4 summarizes the main conclusions.

Finite elements in second-order form
Before turning to the first-order form of the 2-D acoustic wave equation, we first review the more common second-order form, which we will use for reference: Here, p(t, x) is the pressure as a function of time t and position is the source term with wavelet w(t) and spatial distribution s(x), typically taken as a delta function s(x) = δ(x − x s , z − z s ) for a source position (x s , z s ). The sound speed c(x) and density ρ(x) characterize the material through which the waves propagate.
For the finite-element discretization, we consider a triangular mesh with N nodes and expand the pressure as where the basis functions φ j (x) are piecewise linear on those triangles that have x j as one of their vertices and φ j (x k ) = δ j,k for all vertices x k . The mass matrix M and stiffness matrix K on the computational domain have elements respectively. They are both of size N × N and sparse symmetric. The lumped mass matrix L is obtained from the row sum of M: L j,k = δ j,k N k=1 M j,k . In practical computations, the material properties are often taken as piecewise constant per element, so that they can be moved outside the integrals in eq. (3). With mass lumping, the mass-lumped values that go into the assembly of the full mass matrix are proportional to numerical quadrature weights of second-order accuracy. We then may represent 1/(ρc 2 ) as piecewise linear per element and evaluate them at the vertices, or just inside the triangle if there is a discontinuity from one element to the next, to carry out the numerical integration.
The discrete scheme becomes where p n contains the pressures p j on the nodes at time t n = t 0 + n t. The size of the time step t should not exceed 2/ λ max (L −1 K), where λ max ( · ) denotes the spectral radius or maximum eigenvalue (Mulder et al. 2014, e.g.). We will discuss the source term vector f n later on.
The pressure should be zero at the free surface. This condition can be imposed on the mass matrix by simply eliminating the entries that correspond to the free-surface boundary. Alternatively, one can set those entries to zero in the inverse lumped mass matrix L −1 . For the other boundaries, zero pressure is imposed as well together with sponge boundary conditions (Cerjan et al. 1985), or without in one specific test problem.

Finite elements in first-order form
The acoustic wave equation in first-order form is given by the system where the particle velocity in the x-direction is denoted by u and in the z-direction by v.
At the level of the partial differential equations, the first-and second-form are equivalent. After discretization, they may yield solutions with different numerical errors.
An intermediate representation, more easily amenable to higherorder time stepping, is with particle accelerations a and b in the x-and z-directions, respectively.
For the first-order formulation, we expand u and v into the same basis function as the pressure p and define derivative matrices D (x) and D (z) with elements They are sparse and of size N × N. Now there are three mass matrices, M ( p) , M (u) and M (v) . The first is the same as in eq. (3). The other two mass matrices have entries Zero-pressure boundary values can be eliminated from M ( p) . Doing the same for the differentiation matrices, we obtain non-square matrices. More precisely, the divergence theorem provides where ∂ denotes the boundary of the domain and n the outward normal on that boundary. Here, the scalar field φ = φ ( p) (x) and the vector ψ = φ (u) T . If we set φ = 0 everywhere on the boundary ∂ , the first term on the right-hand side vanishes. In the discrete representation, we can let the earlier matrix D (x) act on p and drop the columns that correspond to zero-pressure values on the boundary and do the same with D (z) , making these matrices non-square. For the velocities, minus the transpose matrices can then be used: In this way, the condition of zero transverse velocity is not explicitly imposed, but the pressure at the boundary remains zero.
With a leap-frog time-stepping scheme, the discrete system be- The superscripts with n denote the solution at time t n = t 0 + t. The time step t should not exceed 2λ −1/2 max (B) where the spatial operator follows from eq. (10) by using M ( p) p n+1 − p n − M ( p) p n − p n−1 to eliminate the velocities u and v.
In 2D, the inversion of the mass matrices can be accomplished by a sparse Cholesky decomposition, but is costly. One or two iterations preconditioned by the lumped mass matrix should suffice. As the lumped mass matrix provides second-order accuracy and the consistent one fourth-order, at least in 1D on a uniform mesh, the defect-correction principle (Stetter 1978) states that one extra iteration on top of the initial step should suffice. Given the results of the Fourier analysis in Section 2.4, we do not expect fourth-order convergence but still hope for some improvement in accuracy.
To describe the method, we define iteration matrices and let and The N i iterations proceed as Likewise, following the same pattern but written in a concise form The factor t may be absorbed into A (x) , A (z) , B (x) , B (z) and g for efficiency.
Higher-order time stepping for eq. (4) can be accomplished by the Cauchy-Kowalevsky or Lax-Wendroff or Dablain or modified equation method (von Kowalevsky 1875; Lax & Wendroff 1960;Dablain 1986;Shubin & Bell 1987), which are all the same. Higherorder time stepping for eqs (16)-(18) is easier to implement for the discrete form of the intermediate representation (eq. 6). The secondorder time-stepping discretization of the latter is Higher-order time stepping requires repeated evaluation of the spatial discretization. Note that in practical computations, higherorder time stepping can be avoided altogether by suitable postprocessing of the recorded time-series at the receivers, using Stork's dispersion correction method (Stork 2013;Anderson et al. 2015;Wang & Xu 2015;Qin et al. 2017). It completely removes the timestepping error of a second-order time-stepping scheme, thereby reducing the overall computational cost.
Some elements next to the surface topography may end up with zero pressures on all three vertices. If a receiver happens to be located inside that element, linear interpolation to its position will result in a dead trace. Higher-order mass-lumped or DG finite elements do not suffer from that limitation. Alternatively, an interpolation scheme that involve more than one element (Putti et al. 1990) may be used.

Source term representations
We consider three different ways to discretize the source term. The delta function is the most straightforward, leading to a discrete source term vector Almost all entries are zero except for s j = φ j (x s ) on the three vertices x j of the triangle that contains x s . If the source position coincides with a vertex, there is only one non-zero value at that point. For a Gaussian with standard deviation σ , we have The normalization constant C σ ensures that N j=1 s j = 1 when summed over all vertices, similar to integration of the delta function over the domain. Although the Gaussian has infinite support, we truncate it in practice to a region where its amplitude exceeds a small tolerance. For the numerical convergence tests, we have chosen this region rather large.
The tapered-sinc function in 2D reads s(x, z) = 1 2 1 + cos πζ n w + 1 sin πζ πζ , and zero otherwise. The integer n w , typically 2 or 3, controls the length of the taper in terms of a number of extra loops of the sinc function and (1 + n w )r s defines its actual radius. The corresponding source term vector is with normalization constant C s . The fourth-order polynomial approximation of the delta function in eq. (A2) can be treated in the same way.

Fourier analysis
To obtain some insight in the properties of the chain of first-order operators that leads to the discrete Laplace operator B, we consider its Fourier representation on a simple periodic mesh (cf. Shamasundar & Mulder 2016). The mesh is assumed to consist of squares with sides of length h, each one divided in two triangles as sketched in Fig. 1. The pressure p i, j is defined on the vertices (ih, jh). Shift operators are defined by T x p i, j = p i + 1, j and T z p i, j = p i, j + 1 . One row of the mass matrix M and derivative operators D x and D z can then be expressed as with Fourier symbolŝ Here, we have used the fact thatT x = exp(iξ ) andT y = exp(iη). The scaled wavenumbers in x and z are ξ = k x h x and η = k z h z , where h x = h z = h denote the lengths of the sides of the squares and k x and k z the wavenumbers. The Fourier symbol of the corresponding spatial operator B in eq. (11) iŝ Near the origin of the wavenumber domain, a Taylor series expansion for small ξ and η provideŝ B 1 h 2 (ξ 2 + η 2 ) − 1 180h 2 2(ξ 6 + η 6 ) −5ξη(ξ − η) 2 (ξ 2 + ξη + η 2 ) .  Fig. 1. Near the centre, the operator follows the exact one, ξ 2 + η 2 . Further away, the error is of order four, but still further away, the operator becomes zero in a number of points, marked by crosses. The zero at the centre, indicated by a circle, should be present, but the others correspond to unwanted null-vectors. (b) Fourier symbol of the second-order operator, shown for reference, only has a zero at the origin.
The first term represents the exact operator, k 2 x + k 2 z , the second the discretization error, which is of order four relative to the exact operator. Fig. 2(a) displaysB over the whole wavenumber domain. The bowl near the origin shows the quadratic term ξ 2 + η 2 . At higher wavenumbers, the deviation from ξ 2 + η 2 increases and hence the numerical errors grow. The symbolB should vanish only at the origin, but it is also zero at the points (ξ , η) = (m 1 π , m 2 π ), with integer m 1 and m 2 , and at η = −ξ = ± 2 3 π . Viewed as an elliptic operator, obtained by dropping the second time derivative from the wave equation, B has a number of non-trivial null-vectors and is therefore unstable. For the wave equation, it implies that the spatial operator does not see certain waves but the time-stepping scheme does. Once excited, these waves will start to live a life of their own and not disappear, except perhaps at an absorbing boundary. The net effect will be a noisy pressure wavefield.
For an actual problem that is not too large, we can numerically evaluate the operator B and compute its singular-value decomposition to find the eigenvalues and corresponding eigenvectors. This shows that zero Dirichlet boundary conditions can suppress some of the unwanted modes and also that the number of zero eigenvalues depends on the number of vertices in the mesh. On unstructured meshes with zero Dirichlet boundary conditions, the zero eigenvalues disappear altogether. Still, some of the smallest eigenvalues have eigenvectors with high wavenumbers that will still show up as noise patterns once excited.

Initial guess for source term parameters
To avoid noisy solutions, we either have to abandon the first-order formulation altogether or ensure that such waves are not excited. A sufficiently band-limited source function can accomplish that. Here, we will use Fourier analysis to obtain reasonable values for the source term parameters. In the next section, we will refine these values by numerical experiments on a homogeneous problem.
For the chosen structured periodic mesh, Fig. 2(a) suggests that wavenumbers with ξ 2 + η 2 1 2 π or k 2 x + k 2 z 1 2 π h should not be excited. We can, for instance, replace the delta function source by a Gaussian. In the 1-D case, we may require the Gaussian to have half its maximum amplitude in the wavenumber domain halfway the spectrum. This leads to a standard deviation σ/ h = (2/π ) 2 log 2 = 0.75. In the weak form of eq. (23), with s(x − x s ) replaced by a Gaussian, and after application of the inverse mass matrix, this is not expected to be very different.
A similar consideration can guide the choice of parameters for the tapered sinc. Fig. 3 shows a number of dispersion curves for the first-order formulation in the 1-D case, taken from . Shown is the spectrum of the 1-D first-derivative operator, M −1 D as a function of the normalized wavenumber ξ = k x h x , scaled by π . Note that |ξ | ≤ π and that only the positive axis is displayed as the rest follows by symmetry. The exact result should be κ = iξ , implying that κ/i = ξ appears as a straight line. The deviation from that line represents the numerical dispersion error. For the first-derivative operator with a consistent mass matrix, the dispersion curve is drawn in blue. It is given by 3 sin(ξ )/(2 + cos ξ ) ξ (1 − 1 180 ξ 4 ). As the scaled exact operator should be ξ = k x h x , this shows that the relative error is proportional to ξ 4 or h 4 x and therefore, the discretization has fourth-order accuracy. With mass lumping, shown in red and given by sin ξ ξ (1 − 1 6 ξ 2 ), the accuracy reduces to second order. One iteration produces the green . Dispersion curves for the first derivative and two tapered-sinc source window functions with n w = 3 and r s = 2h or 3h that should suppress high wavenumbers towards the right, where the dispersion curves deviate strongly from the exact κ = iξ until it becomes zero at the highest wavenumber, where it has a non-trivial eigenvector, proportional to a +1, −1, +1, −1, ... mode. curve, which is described by 1 3 (4 − cos ξ ) sin ξ ξ (1 − 1 30 ξ 4 ), and therefore restores fourth-order accuracy.
The tapered sinc should behave as a high-cut filter that removes wavenumbers above |ξ |/π around 0.4 or 0.5. Fig. 3 shows the spectra of the window function provided by the tapered sinc for two parameter choices. To obtain these spectra, we choose a 1-D uniform periodic grid with element size h, placed a source at 0.2h from a vertex, evaluated eq. (23), applied the inverse mass matrix and performed a Fourier transform. The precise position of the source inside an element does not seem to matter for the results shown in Fig. 3, obtained for n w = 3 and r s = 2h or r s = 3h. Larger values of n w will make the transition from 1 to 0 steeper, at the expense of increasing the spatial source size, which will complicate matters when close to the free surface. We expect that parameters in this range will be close to optimal in the 2-D case.
In the weak form, the spatial part of the source function will be multiplied by the basis functions, after which it will be multiplied by the inverse of the mass matrix or its iterative approximation. The effect of this will be another second-order error, as can be seen easily by considering the same Fourier analysis as before. If the integration against the basis functions is denoted by a linear operator , its Fourier symbol on the earlier structured mesh becomeŝ For small ξ , showing its second-order error. The inverse mass matrix does not compensate for that: This suggests that we cannot obtain fourth-order convergence with the first-order formulation. One may wonder if the second-order error term can be removed by adjusting the spatial source distribution. An attempt to recover fourth-order accuracy is presented in the appendix, for an equidistant mesh in 1D. The idea is to compensate the second-order impact of the discretization,M −1ˆ , in the source function. In the 1-D equidistant case, that can be accomplished easily. However, it is not obvious how to generalize this idea to an unstructured 2-D mesh.

R E S U LT S
We examine the performance of the two discretizations, in first-or second-order form, and four source terms, a delta function and three extended ones: Gaussian, tapered sinc and a fourth-order polynomial approximation of the delta function. The test problem is homogeneous with a constant sound speed c = 1.5 km s −1 and constant density ρ = 1 g cm −3 . The rectangular domain has a size of 3 × 1.5 km. A point source is located at x s = 1.5 km and z s = 0.5 km. The compactly supported wavelet is The length of the wavelet, T w , is related to peak frequency by T w = 0.934129/f peak and we chose f peak = 3 Hz. The simulations run from a time − 1 2 T w = −0.156 to t max = 0.45 s. At that time the direct wave has been reflected once by the free surface and another part has reached the opposite boundary. We take all boundaries as zero Dirichlet and use mirror sources with opposite sign when computing the exact solution. The error in the pressure at t max is measured at all vertices. The coordinates and velocities were rotated by 30 • for testing purposes. Fig. 4(a) illustrates what happens with a delta function as source in the first-order formulation. It generates short wavelengths that dominate the solution. With the tapered sinc, we obtain the result in Fig. 4(b). For these examples, we used the consistent mass matrix, fourth-order time stepping with a time step close to the stability limit and an unstructured mesh.
To find good parameters for the extended sources, we computed the root-mean-square (RMS) error for the above problem over a range of source sizes, using the consistent mass matrix and a fourthorder time-stepping method based on eq. (19). The time step was chosen close to the stability limit. For the latter, we use t ≤ C min j (d inner, j /c j ) where d inner /c is the ratio of the diameter of the inscribed circle over the sound speed and the minimum is taken over all triangles j. The constant C is estimated to be 1.36 with the consistent mass matrix in eq. (19), C = 2.41 with mass lumping, C = 1.76 with one iteration and C = 1.56 with two. With fourth-order time stepping, these constants can be increased by a factor √ 3. Fig. 5 plots the RMS error as a function of the source size r scaled by the element size h, defined by the longest edge of the element that contains the source. The consistent mass matrix was used, despite its higher cost. For a Gaussian, r is its standard deviation scaled by element size and the smallest error is obtained at r/h = σ /h = 1.04. The graphs for the tapered sinc are less smooth. The smallest error occurs for n w = 3 and r/h = r s /h = 1.91. The result is better than for a Gaussian source. Here, h is the maximum edge length of the element that contains the source position.
We also examined a fourth-order polynomial approximations of the delta function. The results were considerably less accurate than for the tapered sinc and are therefore not shown.
Next, we study convergence on a range of meshes, from coarse to fine, both structured and unstructured. We use fourth-order time stepping and a source based on the tapered sinc with n w = 3 and r/h = r s /h = 2. Fig. 6 shows the RMS error as a function of N λ , the number of elements per wavelength, defined as N λ = h mean /λ peak , where the   Figure 6. RMS error as a function of N λ , the number of elements per wavelength at the peak frequency, for the tapered sinc with n w = 3 and r/h = r s /h = 2 on structured (a) and unstructured (b) meshes. Results are shown for the consistent or full mass matrix (F) and for mass lumping with no, one or two additional iterations. The blue line corresponds to the second-order form with mass lumping. The triangles indicate the slopes for second-and fourth-order convergence. mean mesh size h mean = √ A /N t , A is the total area of the mesh, N t the number of elements and λ peak = c/f peak the wavelength at a peak frequency f peak for a constant velocity c. For the structured mesh in Fig. 6(a), N λ / √ N 0.331 whereas for the unstructured mesh in Fig. 6(b), N λ / √ N 0.234, with N the number of vertices. The errors for an unstructured mesh in Fig. 6(a) start out with fourth-order behaviour in h mean or N λ on the coarser grids but degrade to second-order on finer ones. Given the results of the 2-D Fourier analysis in the previous section, we cannot expect to do better than second-order. The results for the consistent or full mass matrix are included as a reference but are costly to compute, taking an order of magnitude longer, depending on problem size. With mass lumping, the accuracy drops to second order but one iteration provides a significant improvement in accuracy, as expected. With unstructured meshes, the errors are more erratic. Nevertheless, the defect-correction approach still appears to pay off.
We repeated the exercise for the second-order formulation in eq. (4), but now with second-order time stepping and mass lumping without iterations. The time step was close to the stability limit. With a Gaussian source distribution, the smallest RMS error was obtained at r/h = σ /h = 0.31, but was only 4 per cent smaller than with a delta function source. For the tapered sinc, the best result was found for r/h = r s /h = 0.92 and also only 4 per cent smaller than with a delta function source. Given the simplicity of latter, there seems to be no reason to replace it.
This leaves the question if the first-order is better than the secondorder form, meaning that it requires less floating-point operations for a given accuracy. The answer, unfortunately, is no. The blue lines in Figs 6(a) and (b) plot RMS errors obtained on the same meshes for the second-order form with mass lumping and second-order time stepping. They lie well below those for the first-order form without defect correction, drawn as dashed lines. This is not unexpected, given the result of Fourier analysis at the end of Section 2.4. The iterations reduce the errors for the first-order form. On the structured mesh, the final result is not better than that for the second-order form. On the unstructured mesh, there is a significantly smaller error in a small range of N λ . The first-order form, however, requires seven matrix-vector multiplications per time step for second-order and double that for fourth-order time stepping. Although the maximum allowable time step is larger by a factor of about 2.5 for the latter, compared to the second-order scheme, this still results in a higher computational cost for a given accuracy. Although we observed this in our Matlab implementations of both schemes, which is not really suited for measuring performance, we believe this will carry over to implementations in a compiled language like C or C++. On top of that, there exist higher-order mass-lumped schemes in second-order form that are more efficient than the second-order one considered here (Mulder 1996;Chin-Joe-Kong et al. 1999), making the first-order form considered here even less competitive.
To demonstrate that the first-order form can actually be used on non-trivial problems, we have applied it to the inhomogeneous sound speed model shown in Fig. 7. A source at x s = 2468.36 and z s = 410.351 m with a 12-Hz Ricker wavelet generated the wavefield displayed in Fig. 7(b), using the first-order formulation with fourthorder time stepping. The tapered-sinc source had n w = 3 and r s /h = 3. The mesh contained 800466 elements and 401764 vertices. The computations started at −0.17 s to let the Ricker wavelet peak at zero time.

C O N C L U S I O N S
We have examined the performance of four source distributions, the delta function, a Gaussian, a tapered sinc and a polynomial approximation of the delta function, in a finite-element formulation of the acoustic wave equation. The spatial operator in the discrete first-order form of the wave equation may have short-wavelength null-vectors. The corresponding waves are therefore not seen by the spatial operator and persist on their own once excited. The result is a noisy solution that can be avoided by suppressing the short wavelengths. One approach is to replace the weak form of the delta source function by a source of wider extent. We have performed numerical experiments to find suitable parameters for the Gaussian, for the tapered sinc and for a polynomial approximation of the delta function. The tapered sinc provided the most accurate results.
In the standard second-order form, the Gaussian and tapered sinc hardly improve the accuracy and a delta function appears to be the most attractive choice, given its simplicity. The first-order form with one iteration may have a better accuracy than the second-order form, but that does not appear sufficient to compensate for its higher cost, at least not in our 2-D Matlab implementations. The second-order form and in particular its higherorder versions appear to be more attractive.

A C K N O W L E D G E M E N T S
This work was partly funded by the Industrial Partnership Programme (IPP) 'Computational sciences for energy research' of the Foundation for Fundamental Research on Matter (FOM), which is part of the Netherlands Organisation for Scientific Research (NWO). This research programme is co-financed by Shell Global Solutions International B.V. The authors are grateful for the helpful comments of the two reviewers, Vincent Etienne and Nobuaki Fuji.