Abstract
We analyze Galerkin discretizations of a new well-posed mixed space–time variational formulation of parabolic partial differential equations. For suitable pairs of finite element trial spaces, the resulting Galerkin operators are shown to be uniformly stable. The method is compared to two related space–time discretization methods introduced by Andreev (2013, Stability of sparse space-time finite element discretizations of linear parabolic evolution equations. IMA J. Numer. Anal., 33, 242–260) and by Steinbach (2015, Space-time finite element methods for parabolic problems. Comput. Methods Appl. Math., 15, 551–566).
1. Introduction
In recent years, one has witnessed a rapidly growing interest in simultaneous space–time methods for solving parabolic evolution equations originally introduced in Babuška & Janik (1989, 1990); see, e.g., Gunzburger & Kunoth (2011), Andreev (2013), Urban & Patera (2014), Steinbach (2015), Gander & Neumüller (2016), Langer et al. (2016), Schwab & Stevenson (2017), Devaud & Schwab (2018), Rekatsinas & Stevenson (2019), Steinbach & Zank (2018), Voulis & Reusken (2018), Führer & Karkulik (2019), Neumüller & Smears (2019). Compared to classical time marching methods, space–time methods are much better suited for a massively parallel implementation and have the potential to drive adaptivity simultaneously in space and time.
Apart from the first-order system least squares formulation recently introduced in Führer & Karkulik (2019), the known well-posed simultaneous space–time variational formulations of parabolic equations in terms of partial differential operators only, so not involving nonlocal operators, are not coercive. As a consequence, it is nontrivial to find families of pairs of discrete trial and test spaces for which the resulting Petrov–Galerkin discretizations are uniformly stable. Uniform stability is a sufficient and, as we will see, necessary condition for the Petrov–Galerkin approximations to be quasi-optimal, i.e., to yield a best approximation to the solution from the trial space up to a constant factor. This concept has to be contrasted with rate optimality which, for quasi-uniform temporal and spatial partitions, has been shown for any reasonable numerical scheme under the assumption of sufficient regularity of the solution.
If one allows different spatial meshes at different times, then for the classical time marching schemes quasi-optimality of the numerical approximations is known not to be guaranteed, as demonstrated in Dupont (1982, Section 4).
In view of the difficulty in constructing stable pairs of trial and test spaces, Andreev (2013) considered minimal residual Petrov–Galerkin discretizations. They have an equivalent interpretation as Galerkin discretizations of an extended self-adjoint mixed system, with the Riesz lift of the residual of the primal variable being the secondary variable. This is the point of view we will take.
A different path was followed by Steinbach (2015). Assuming a homogeneous initial condition, for equal test and trial finite element spaces w.r.t. fully general finite element meshes, stability was shown w.r.t. a weaker mesh-dependent norm on the trial space. As we will see, however, this has the consequence that for some solutions of the parabolic problem these Galerkin approximations are far from being quasi-optimal w.r.t. the natural mesh-independent norm on the trial space.
In the current work, we modify Andreev’s approach by considering an equivalent but simpler mixed system that we construct from a space–time variational formulation that follows from applying the Brézis–Ekeland–Nayroles principle (Brézis & Ekeland, 1976; Nayroles, 1976). With the same trial space for the primal variable, we show stability of the Galerkin discretization of this mixed system whilst utilizing a smaller trial space for the secondary variable. In addition, the stiffness matrix resulting from this mixed system is more sparse. In our numerical experiments the errors in the Galerkin solutions are nevertheless very comparable.
1.1 Organization
In Section 2 we derive the two self-adjoint mixed system formulations of the parabolic problem that are central in this work. In Section 3 we give sufficient conditions for stability of Galerkin discretizations for both systems. We provide an a priori error bound for the Galerkin discretization of the newly introduced system and improved a priori error bounds for the methods from Andreev (2013) and Steinbach (2015). In Section 4, we show that the crucial condition for stability (being the only condition for the newly introduced mixed system) is satisfied for prismatic space–time finite elements whenever the generally nonuniform partition in time is independent of the spatial location, and the generally nonuniform spatial mesh in each time slab is such that the corresponding |$L_2$|-orthogonal projection is uniformly |$H^1$|-stable. In Section 5 we present some first simple numerical experiments for a one-dimensional spatial domain and uniform meshes. Conclusions are presented in Section 6.
1.2 Notation
In this work, by |$C \lesssim D$| we mean that |$C$| can be bounded by a multiple of |$D$|, independently of parameters that |$C$| and |$D$| may depend on. Obviously, |$C \gtrsim D$| is defined as |$D \lesssim C$| and |$C\eqsim D$| as |$C\lesssim D$| and |$C \gtrsim D$|.
For normed linear spaces |$E$| and |$F$|, by |$\mathcal L(E,F)$| we denote the normed linear space of bounded linear mappings |$E \rightarrow F$|, and by |${\mathcal{L}}\textrm{is}(E,F)$| its subset of boundedly invertible linear mappings |$E \rightarrow F$|. We write |$E \hookrightarrow F$| to denote that |$E$| is continuously embedded into |$F$|. For simplicity only, we exclusively consider linear spaces over the scalar field |$\mathbb R$|.
For linear spaces |$E$| and |$F$|, sequences |$\varPhi =(\phi _j)_{j \in J} \subset E$|, |$\varPsi =(\psi _i)_{i \in I} \subset F$|, |$f \in F^{\ast }$| and a linear |$A\colon E \rightarrow F^{\ast }$|, we define the column vector |$f(\varPsi ):=[f(\psi _i)]_{i \in I}$| and matrix |$(A \varPhi )(\varPsi ):=[(A \phi _j)(\psi _i)]_{i \in I,\,j \in J}$|. If |$E=F$| is an inner product space, then with |$R\colon E \rightarrow E^{\prime }$| denoting the Riesz map, we set |$\langle \varPsi ,\varPhi \rangle :=(R \varPhi )(\varPsi )=[(R \phi _j)(\psi _i)]_{i \in I,\,j \in J}= [\langle \psi _i,\phi _j\rangle ]_{i \in I,\,j \in J}$|.
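To make the index conventions of this notation concrete, the following is a minimal sketch in a finite-dimensional toy setting. It assumes |$E=F=\mathbb R^3$| with the Euclidean inner product, so that functionals are represented by vectors; the names `Phi`, `Psi`, `A`, `f` and the helper functions are illustrative only and do not stem from the analysis.

```python
import numpy as np

# Hypothetical toy setting: E = F = R^3 with the Euclidean inner product,
# so that a functional on F is represented by a vector.
Phi = [np.array([1.0, 0.0, 0.0]), np.array([1.0, 1.0, 0.0])]   # (phi_j)_j in E
Psi = [np.array([0.0, 1.0, 0.0]), np.array([0.0, 1.0, 1.0]),
       np.array([1.0, 0.0, 2.0])]                                # (psi_i)_i in F

A = np.diag([1.0, 2.0, 3.0])     # linear map E -> F*, (A e)(v) = v^T A e
f = np.array([1.0, -1.0, 0.5])   # functional f in F*, f(v) = f^T v


def load_vector(f, Psi):
    """Column vector f(Psi) = [f(psi_i)]_i."""
    return np.array([f @ psi for psi in Psi])


def stiffness_matrix(A, Phi, Psi):
    """Matrix (A Phi)(Psi) = [(A phi_j)(psi_i)]_{i,j}; rows follow Psi, columns follow Phi."""
    return np.array([[psi @ (A @ phi) for phi in Phi] for psi in Psi])


def gram_matrix(Psi, Phi):
    """<Psi, Phi> = [<psi_i, phi_j>]_{i,j}; the Riesz map is the identity in this toy setting."""
    return stiffness_matrix(np.eye(len(Phi[0])), Phi, Psi)
```

Note that the row index always belongs to the test sequence |$\varPsi$| and the column index to the trial sequence |$\varPhi$|, so `stiffness_matrix(A, Phi, Psi)` has shape `(len(Psi), len(Phi))`.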
2. Space–time formulations of the parabolic evolution problem
Let |$V,H$| be separable Hilbert spaces of functions on some ‘spatial domain’ such that |$V \hookrightarrow H$| with dense and compact embedding. Identifying |$H$| with its dual, we obtain the Gelfand triple |$V \hookrightarrow H \simeq H^{\prime } \hookrightarrow V^{\prime }$|.
We use the notation |$\langle \cdot ,\cdot \rangle $| to denote both the scalar product on |$H \times H$| and its unique extension by continuity to the duality pairing on |$V^{\prime } \times V$|. Correspondingly, the norm on |$H$| is denoted by |$\|\,\|$|.
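For a.e. |$t \in I:=(0,T)$|, let |$a(t;\cdot ,\cdot )$| denote a bilinear form on |$V \times V$| such that for any |$\eta ,\zeta \in V$|, |$t \mapsto a(t;\eta ,\zeta )$| is measurable on |$I$|, and such that for a.e. |$t\in I$|,
$$|a(t;\eta ,\zeta )| \lesssim \|\eta \|_{V} \|\zeta \|_{V} \quad (\eta ,\zeta \in V), \qquad (2.1)$$
$$a(t;\eta ,\eta ) \gtrsim \|\eta \|_{V}^2 \quad (\eta \in V). \qquad (2.2)$$
With |$A(t) \in{\mathcal{L}}\textrm{is}({V},V^{\prime })$| being defined by |$ (A(t) \eta )(\zeta )=a(t;\eta ,\zeta )$|, we are interested in solving the parabolic initial value problem to find |$u$| such that
$$\frac{\textrm{d} u}{\textrm{d} t}(t) +A(t) u(t)= g(t) \quad (t \in I), \qquad u(0) = u_0. \qquad (2.3)$$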
Remark 2.1 With |$\tilde{u}(t):=u(t) e^{-\varrho t}$|, (2.3) is equivalent to |$ \frac{\textrm{d} \tilde{u}}{\textrm{d} t}(t) +(A(t)+\varrho \textrm{Id}) \tilde{u}(t)= g(t)e^{-\varrho t}$| (|$t \in I$|), |$\tilde{u}(0) = u_0$|. So if initially |$a(t;\eta ,\eta )$| is not coercive but only satisfies a Gårding inequality |$ a(t;\eta ,\eta ) + \varrho \langle \eta ,\eta \rangle \gtrsim \|\eta \|_{V}^2$| (|$\eta \in{V}$|), then one can consider a transformed problem such that (2.2) is valid.
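The transformation in Remark 2.1 can be checked numerically on a scalar model problem. The sketch below assumes |$V=H=\mathbb R$| and a constant coefficient `a < 0`, so that the original form is not coercive but `a + rho > 0`; the function names and parameter values are hypothetical and chosen only for illustration.

```python
import numpy as np


def u_exact(t, a, g, u0):
    """Solution of the scalar model problem u' + a*u = g, u(0) = u0 (a != 0)."""
    return g / a + (u0 - g / a) * np.exp(-a * t)


def transformed_residual(t, a, g, u0, rho, dt=1e-5):
    """Residual of u~' + (a + rho) u~ = g e^{-rho t} for u~(t) = u(t) e^{-rho t}.

    The derivative u~' is approximated by a central difference, so the
    residual vanishes only up to discretization and rounding errors.
    """
    ut = lambda s: u_exact(s, a, g, u0) * np.exp(-rho * s)
    dut = (ut(t + dt) - ut(t - dt)) / (2 * dt)
    return dut + (a + rho) * ut(t) - g * np.exp(-rho * t)


# a < 0: no coercivity; a + rho > 0: coercivity after the exponential shift
a, g, u0, rho = -1.0, 2.0, 0.5, 3.0
ts = np.linspace(0.1, 1.0, 10)
max_res = max(abs(transformed_residual(t, a, g, u0, rho)) for t in ts)
```

In this setting `max_res` should be at the level of the finite-difference error, which is consistent with the claimed equivalence of the two initial value problems.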
In a simultaneous space–time variational formulation, the parabolic partial differential equation (PDE) reads as follows: find |$u$| from a suitable space of functions of time and space such that
for all |$v$| from another suitable space of functions of time and space. One possibility to enforce the initial condition is by testing it against additional test functions. A proof of the following result can be found in Schwab & Stevenson (2009); cf. Dautray & Lions (1992, Ch. XVIII, Section 3) and Wloka (1982, Ch. IV, Section 26) for slightly different statements.
Theorem 2.2 With |$X:=L_2(I;{V}) \cap H^1(I;V^{\prime })$|, |$Y:=L_2(I;{V})$|, under conditions (2.1) and (2.2) it holds that
where for |$t \in \bar{I}$|, |$\gamma _t\colon u \mapsto u(t,\cdot )$| denotes the trace map. That is, assuming |$g \in Y^{\prime }$| and |$u_0 \in H$|, find |$u \in X$| such that
is a well-posed variational formulation of (2.3).
One ingredient of the proof of this theorem is the continuity of the embedding |$X \hookrightarrow C(\bar{I},H)$|, in particular implying that for any |$t \in \bar{I}$|, |$\gamma _t \in \mathcal L(X,H)$|.
Defining |$A, A_s \in{\mathcal{L}}\textrm{is} (Y,Y^{\prime })$| (here (2.2) is used), |$A_a \in \mathcal L(Y,Y^{\prime })$| and |$C, \partial _t \in \mathcal L(X,Y^{\prime })$| by
an equivalent well-posed variational formulation of the parabolic PDE is obtained by applying the so-called Brézis–Ekeland–Nayroles variational principle (Brézis & Ekeland, 1976; Nayroles, 1976); cf. also Andreev (2012, Section 3.2.4). It reads
where the operator on the left-hand side is in |${\mathcal{L}}\textrm{is}(X,X^{\prime })$|, is self-adjoint and coercive.
We provide a direct proof of these facts. Since $$\left [\begin{smallmatrix} A_s & 0 \\ 0 & \textrm{Id} \end{smallmatrix}\right ] \in{\mathcal{L}}\textrm{is}(Y\times H,Y^{\prime }\times H)$$, an equivalent formulation of (2.5) as a self-adjoint saddle point equation reads as follows: find |$(\mu ,\sigma ,u) \in Y\times H\times X$| (where |$\mu $| and |$\sigma $| will be zero) such that
or
Thanks to (2.5), this Schur complement |$B^{\prime } A_s^{-1} B+\gamma _0^{\prime}\gamma _0$| is in |${\mathcal{L}} \textrm{is} (X,X^{\prime })$|, is self-adjoint and coercive.
We show that (2.9) and (2.7) are equal. Recalling the definitions of |$C$| and |$\partial _t$|, note that the right-hand sides of both equations are the same, and that
thanks to |$A_a^{\prime }=-A_a$|. The proof of our claim is completed by noting that for |$w,v \in X$|,
As (2.9) was obtained as the Schur complement equation of (2.8), in its form (2.7) it is naturally obtained as the Schur complement of the problem of finding |$(\lambda ,u)\in Y \times X$| such that
Knowing that its Schur complement is in |${\mathcal{L}} \textrm{is} (X,X^{\prime })$|, |$A_s \in{\mathcal{L}} \textrm{is} (Y,Y^{\prime })$| and |$C \in \mathcal L(X,Y^{\prime })$|, we infer that the self-adjoint operator on the left-hand side of (2.10) is in |${\mathcal{L}} \textrm{is} (Y\times X,Y^{\prime }\times X^{\prime })$|.
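The algebra behind the equality of the two Schur complement operators can be sketched as follows. This is a worked computation under the assumptions that |$B=C+A_s$| with |$A_s=A_s^{\prime }$|, that |$C=\partial _t+A_a$| with |$A_a^{\prime }=-A_a$|, and that integration by parts in time is valid for |$w,v \in X$|; it is meant as a reading aid rather than a restatement of the displayed equations of this section.

```latex
\begin{align*}
B' A_s^{-1} B
  &= (C' + A_s)\, A_s^{-1}\, (C + A_s)
   = C' A_s^{-1} C + A_s + C + C', \\
C + C'
  &= \partial_t + \partial_t' + A_a + A_a'
   = \partial_t + \partial_t', \\
\big((\partial_t + \partial_t') w\big)(v)
  &= \int_I \frac{\mathrm{d}}{\mathrm{d}t} \langle w(t), v(t) \rangle \,\mathrm{d}t
   = \langle \gamma_T w, \gamma_T v \rangle - \langle \gamma_0 w, \gamma_0 v \rangle ,
\end{align*}
so that
\[
B' A_s^{-1} B + \gamma_0' \gamma_0
  = C' A_s^{-1} C + A_s + \gamma_T' \gamma_T .
\]
```

In particular, the term |$\gamma _0^{\prime }\gamma _0$| cancels against the contribution of the initial time in |$\partial _t+\partial _t^{\prime }$|, and only the trace at the final time |$T$| remains.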
Substituting |$C=B-A_s$| and |$Bu=g$|, we find that the secondary variable satisfies
Remark 2.3 When reading |$\gamma _T^{\prime } \gamma _T$| as |$\partial _t+\partial _t^{\prime }+\gamma _0^{\prime}\gamma _0$|, system (2.10) has remarkable similarities to a certain preconditioned version presented in Neumüller & Smears (2019) of a discretized parabolic PDE using the implicit Euler method in time. Ideas concerning optimal preconditioning developed in that paper, as well as those in Andreev (2016), can be expected to be applicable to Galerkin discretizations of (2.10).
Remark 2.4 In equations (2.8) and (2.9), the operator |$A_s$| can be replaced by a general self-adjoint |$\tilde{A}_s \in{\mathcal{L}} \textrm{is} (Y,Y^{\prime })$|. With |$\tilde{C}:=B-\tilde{A}_s$|, the equivalent equation (2.7) then reads
and (2.10) as
with solution |$\lambda =u$|.
In the next section, we study Galerkin discretizations of equations (2.8) and (2.10), which then are no longer equivalent.
Since the secondary variables |$\mu $| and |$\sigma $| in (2.8) are zero, the subspaces for their approximation do not have to satisfy any approximation properties. Since the secondary variable |$\lambda $| in (2.10) is nonzero, the subspace of |$Y$| for its approximation has to satisfy approximation properties, and the error in its best approximation enters the upper bound for the error in the primal variable |$u$|.
On the other hand, (uniform) stability will be easier to realize with equation (2.10) and will also be proven to hold true for |$A_a \neq 0$|; the system matrix will be more sparse and the number of unknowns will be smaller.
In order to facilitate the derivation of some quantitative results, we equip the spaces |$Y$| and |$X$| with the ‘energy norms’ defined by
which are equivalent to the standard norms on these spaces. Correspondingly, orthogonality in |$Y$| is interpreted w.r.t. the ‘energy scalar product’ |$(A_s\cdot )(\cdot )$|.
3. Stable discretizations of the parabolic problem
3.1 Uniformly stable (Petrov–) Galerkin discretizations and quasi-optimal approximations
This subsection is devoted to proving the following theorem.
Theorem 3.1 Let |$W$| and |$Z$| be Hilbert spaces, and |$F \in{\mathcal{L}} \textrm{is} (Z,W^{\prime })$|. Let |$(W^{\delta },Z^{\delta })_{\delta \in \varDelta }$| be a family of closed subspaces of |$W \times Z$| such that for each |$\delta \in \varDelta $| it holds that |${E_W^{\delta }}^{\prime } F E_Z^{\delta } \in{\mathcal{L}}\textrm{is} (Z^{\delta },{W^{\delta }}^{\prime })$|, where |$E_W^{\delta }\colon W^{\delta } \rightarrow W$|, |$E_Z^{\delta }\colon Z^{\delta } \rightarrow Z$| denote the trivial embeddings. Then the collection |$(z^{\delta })_{\delta \in \varDelta }$| of Petrov–Galerkin approximations to |$z \in Z$|, determined by |${E_W^{\delta }}^{\prime } F E_Z^{\delta } z^{\delta }={E_W^{\delta }}^{\prime } F z$|, is quasi-optimal, i.e., |$\|z-z^{\delta }\|_Z \lesssim \inf _{0 \neq \bar{z}^{\delta } \in Z^{\delta }} \|z-\bar{z}^{\delta } \|_Z$|, uniformly in |$z \in Z$| and |$\delta \in \varDelta $|, if and only if
Proof. The mapping |$P^{\delta }:=z \mapsto z^{\delta }=E_Z^{\delta } ({E_W^{\delta }}^{\prime } F E_Z^{\delta })^{-1} {E_W^{\delta }}^{\prime } F z$| is a projector. For |$\{0\} \subsetneq Z^{\delta } \subsetneq Z$|, it holds that |$P^{\delta } \not \in \{0,\textrm{Id}\}$|, and consequently, |$\|\textrm{Id} -P^{\delta }\|_{\mathcal L(Z,Z)}=\|P^{\delta }\|_{\mathcal L(Z,Z)}$| (see Kato, 1960; Xu & Zikatanov, 2003). We obtain
It remains to show uniform boundedness of |$\|P^{\delta }\|_{\mathcal L(Z,Z)}$| if and only if uniform stability is valid.
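The identity |$\|\textrm{Id} -P^{\delta }\|=\|P^{\delta }\|$| for nontrivial projectors on a Hilbert space, used above, can be observed numerically. The following sketch assumes |$Z=W=\mathbb R^6$| with the Euclidean norm and randomly generated stand-ins `Ez`, `Ew`, `F` for the embeddings and the operator; none of these matrices stems from an actual discretization.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 6, 2
Ez = rng.standard_normal((n, k))                    # columns: basis of a trial space Z^delta
Ew = rng.standard_normal((n, k))                    # columns: basis of a test space W^delta
F = rng.standard_normal((n, n)) + n * np.eye(n)     # stand-in for the operator F

# Petrov-Galerkin projector P = E_Z (E_W' F E_Z)^{-1} E_W' F (oblique in general)
P = Ez @ np.linalg.solve(Ew.T @ F @ Ez, Ew.T @ F)

norm_P = np.linalg.norm(P, 2)                       # spectral norm of P
norm_I_minus_P = np.linalg.norm(np.eye(n) - P, 2)   # spectral norm of Id - P
```

Up to rounding, `norm_P` and `norm_I_minus_P` coincide although `P` is not an orthogonal projector, which is why bounding |$\|P^{\delta }\|_{\mathcal L(Z,Z)}$| suffices for quasi-optimality.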
The definition of |$P^{\delta }$| shows that
Further, we have
where the last equality follows from |$\|{E_W^{\delta }}^{\prime }\|_{\mathcal L(W^{\prime },{W^{\delta }}^{\prime })} \leq 1$| and, for the other direction, from the fact that for given |$f^{\delta } \in{W^{\delta }}^{\prime }$| the function |$f \in W^{\prime }$| defined by |$f|_{W^{\delta }}:=f^{\delta }$| and |$f|_{(W^{\delta })^{\perp }}:=0$| satisfies |$\|\,f\|_{W^{\prime }}=\|\,f^{\delta }\|_{{W^{\delta }}^{\prime }}$| and |$f^{\delta }={E_W^{\delta }}^{\prime } f$|.
The proof is completed by
Remark 3.2 In particular, the above analysis provides a short self-contained proof of the quantitative results
that were established earlier in Tantardini & Veeser (2016, Section 2.1, in particular (2.12)).
3.2 Uniformly stable Galerkin discretizations of (2.10)
Let |$Y^{\delta } \times X^{\delta }$| be a closed subspace of |$Y \times X$|, and let |$E_Y^{\delta }\colon Y^{\delta } \rightarrow Y$| and |$E_X^{\delta }\colon X^{\delta } \rightarrow X$| denote the trivial embeddings. Since |${E^{\delta }_Y}^{\prime }A_s E^{\delta }_Y \in{\mathcal{L}} \textrm{is} (Y^{\delta },{Y^{\delta }}^{\prime })$| (as well as being an isometry), the Galerkin operator resulting from (2.10) can be factorized as
We conclude that this Galerkin operator is invertible if and only if the Schur complement
is invertible, which holds true for any |$X^{\delta } \neq \{0\}$|.
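The role of the Schur complement can be illustrated with a generic block factorization. The sketch below assumes a symmetric saddle-point matrix with a symmetric positive definite (1,1) block and a negative definite (2,2) block; the matrices `A`, `C`, `G` are random stand-ins with illustrative dimensions and are not assembled from finite element spaces.

```python
import numpy as np

rng = np.random.default_rng(1)
m, n = 5, 3                              # illustrative dimensions of Y^delta and X^delta
R = rng.standard_normal((m, m))
A = R @ R.T + m * np.eye(m)              # SPD stand-in for the (1,1) block
C = rng.standard_normal((m, n))          # stand-in for the off-diagonal block
Q = rng.standard_normal((n, n))
G = Q @ Q.T + np.eye(n)                  # SPD; the (2,2) block is -G

M = np.block([[A, C], [C.T, -G]])        # symmetric indefinite saddle-point matrix

AinvC = np.linalg.solve(A, C)
S = G + C.T @ AinvC                      # Schur complement, SPD in this setting

# Block LDU factorization M = L D U with unit triangular L and U = L^T
L = np.block([[np.eye(m), np.zeros((m, n))], [AinvC.T, np.eye(n)]])
D = np.block([[A, np.zeros((m, n))], [np.zeros((n, m)), -S]])
U = L.T


def solve_via_schur(f, g):
    """Solve M [lam; u] = [f; g] by first solving the Schur complement system for u."""
    u = np.linalg.solve(S, C.T @ np.linalg.solve(A, f) - g)
    lam = np.linalg.solve(A, f - C @ u)
    return lam, u
```

Since the triangular factors are always invertible, invertibility of `M` is equivalent to that of `S`, and `solve_via_schur` only requires solves with the two symmetric positive definite matrices `A` and `S`.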
Theorem 3.3 Let |$(Y^{\delta },X^{\delta })_{\delta \in \varDelta }$| be a family of closed subspaces of |$Y \times X$| such that
Let |$\rho =\rho _{\varDelta }$| be the root in |$[0,1)$| of
and let
so that |$C_{\varDelta }=3 \sqrt{3} \,\gamma _{\varDelta }^{-2}$| when |$\|A_a\|_{\mathcal L(Y,Y^{\prime })}=0$|, and |$\lim _{\|A_a\|_{\mathcal L(Y,Y^{\prime })} \rightarrow \infty } C_{\varDelta }=\infty $|. Then with |$\lambda =u$| and |$(\lambda ^{\delta },u^{\delta })$| denoting the solutions of (2.10) and its Galerkin discretization, respectively, it holds that
Proof. In view of the second inequality presented in Remark 3.2, we start with bounding the norm of the continuous operator. Using Young’s inequality, for |$(\lambda ,u) \in Y \times X$| we have
Together with |$\|A_a u\|_{Y^{\prime }}^2+\|A_a^{\prime } \lambda \|_{X^{\prime }}^2\leq \|A_a\|^2_{\mathcal L(Y,Y^{\prime })}(\|\lambda \|_Y^2+\|u\|_X^2)$|, it shows that
To bound, in view of (3.2), the norm of the inverse of the Galerkin operator, we use the block-LDU factorization (3.3). With |$r:=(1+\|A_a\|_{\mathcal L(Y,Y^{\prime })}^2)$|, for |$u \in X$| it holds that
Together with the fact that |${E_Y^{\delta }}^{\prime } A_s E_Y^{\delta } \in{\mathcal{L}}\textrm{is}(Y^{\delta },{Y^{\delta }}^{\prime })$| is an isometry and again Young’s inequality, it shows that for |$(\lambda ,u) \in Y^{\delta }\times X^{\delta }$|,
or
Obviously, the |$\mathcal L({Y^{\delta }}^{\prime } \times{X^{\delta }}^{\prime },{Y^{\delta }}^{\prime } \times{X^{\delta }}^{\prime })$|-norm of the inverse of the first factor on the right-hand side of (3.3) satisfies the same bound.
Moving to the second factor, we consider the Schur complement operator. From |$({E_Y^{\delta }}^{\prime } A_s E_Y^{\delta } \lambda )(\lambda ) =\|\lambda \|^2_Y$| for |$\lambda \in Y^{\delta }$|, we have for |$f \in{Y^{\delta }}^{\prime }$|, |$f(({E_Y^{\delta }}^{\prime } A_s E_Y^{\delta })^{-1}f) =\|({E_Y^{\delta }}^{\prime } A_s E_Y^{\delta })^{-1} f\|^2_Y=\|\,f\|_{{Y^{\delta }}^{\prime }}^2$|, and so for |$u \in X^{\delta }$|,
Using that for |$u \in X^{\delta }$|,
and
Young’s inequality shows that
where we have assumed that |$\rho _{\varDelta }>0$|, i.e., |$A_a \neq 0$|. It follows that
where we have used that |$1+(1-\rho _{\varDelta }^{-1}) \|A_a\|_{\mathcal L(Y,Y^{\prime })}^2=(1-\rho _{\varDelta }) \gamma _{\varDelta }^2$| by definition of |$\rho _{\varDelta }$|. One easily verifies (3.7) also in the case that |$A_a=0$|, i.e., |$\rho _{\varDelta }=0$|.
Since |${E_Y^{\delta }}^{\prime } A_s E_Y^{\delta } \in{\mathcal{L}}\textrm{is}(Y^{\delta },{Y^{\delta }}^{\prime })$| is an isometry, and |$0<(1-\rho _{\varDelta })\gamma _{\varDelta }^2 \leq \gamma _{\varDelta }^2 \leq 1$|, we conclude that the |$\mathcal L({Y^{\delta }}^{\prime }\times{X^{\delta }}^{\prime }, Y^{\delta }\times X^{\delta })$|-norm of the inverse of the second factor is bounded by |$(1-\rho _{\varDelta })^{-1}\gamma _{\varDelta }^{-2}$|.
In view of the second inequality presented in Remark 3.2 in combination with (3.2), the proof is completed by collecting the bounds that were derived.
3.3 Galerkin discretizations of (2.8)
Although it is likely to be possible to generalize results to the case of |$A_a \neq 0$|, as in Andreev (2013) and Steinbach (2015), in this section we operate under the condition that
Following Steinbach (2015), for a given closed subspace |$Y^{\delta } \subseteq Y$| we define the ‘mesh-dependent’ norm on |$X$| by
Note that |$\|\,\|_{X,Y}=\|\,\|_{X}$|.
The following result generalizes the ‘inf-sup identity’, known for |$Y^{\delta }=Y$| (see, e.g., Ern et al., 2017), to mesh-dependent norms.
Lemma 3.4 Assuming (3.8), for |$u \in Y^{\delta } \cap X$|,
If additionally |$\gamma _0 u\in H^{\delta }$|, then
Proof. Let |$y \in Y^{\delta }$| be defined by |$(A_s y)(v)=(\partial _t u)(v)$| (|$v \in Y^{\delta }$|). Then |$(A_s y)(y)=\sup _{0 \neq v \in Y^{\delta }} \frac{(\partial _t u)(v)^2}{\|v\|_{Y}^2}$|. Furthermore, for |$v \in Y^{\delta }$|, |$(Bu)(v)=(A_s(y+u))(v)$| and so, thanks to |$u \in Y^{\delta }$|,
where we used that |$2\int _I \langle \partial _t u(t),u(t)\rangle \,\textrm{d}t=\|u(T)\|^2-\|u(0)\|^2$|.
The second statement follows from
thanks to |$u(0) \in H^{\delta }$|.
The next theorem gives sufficient conditions for existence and uniqueness of solutions of the Galerkin discretization of (2.8) and provides a suboptimal error estimate.
Theorem 3.5 Assuming (3.8), for closed subspaces |$Y^{\delta } \times H^{\delta } \times X^{\delta } \subset Y \times H \times X$| with |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}} \gamma _0|_{X^{\delta }}\subseteq H^{\delta }$|, the Galerkin discretization of (2.8) has a unique solution |$(\mu ^{\delta },\sigma ^{\delta },u^{\delta }) \in Y^{\delta } \times H^{\delta } \times X^{\delta } $|, and with |$u$| denoting the solution of (2.6),
Proof. Thanks to the assumptions |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}} \gamma _0|_{X^{\delta }}\subseteq H^{\delta }$|, the inf-sup identity (3.9) guarantees the unique solvability of the Galerkin system.
For any |$u \in X^{\delta }$|, there exist unique |$y_u \in Y^{\delta }$|, |$h_u \in H^{\delta }$| such that
We decompose |$Y^{\delta } \times H^{\delta }$| into |$Z^{\delta }:={\operatorname{clos}}\{(y_u,h_u)\colon u \in X^{\delta }\}$| and its orthogonal complement |$W^{\delta }$|. Using that for any |$u \in X^{\delta }$| and |$(v_1,v_2) \in W^{\delta }$|, |$(B u)(v_1)+\langle u(0),v_2\rangle =0$|, one infers that for any |$u \in X^{\delta }$|, the inf-sup identity (3.9) remains valid when the supremum is restricted to |$0 \neq (v_1,v_2) \in Z^{\delta }$|. Furthermore, since for any |$(v_1,v_2) \in Z^{\delta }$| there exists a |$z \in X^{\delta }$| with |$(B z)(v_1)+\langle z(0),v_2\rangle \neq 0$|, we infer that |$u^{\delta }$| is the unique solution of the Petrov–Galerkin discretization of finding |$u^{\delta } \in X^{\delta }$| such that
By applying both these observations consecutively, we infer that for any |$\bar{u}^{\delta } \in X^{\delta }$|,
where again we have applied (3.9), now for |$Y^{\delta }=Y$|. A triangle inequality completes the proof.
Theorem 3.5 can be used to demonstrate optimal rates for the error in |$u^{\delta }$| in the |$\|\,\|_{X,Y^{\delta }}$|-norm, and hence also in the |$Y$|-norm. Yet, for doing so one needs to control the error of best approximation in the generally strictly stronger |$\|\,\|_X$|-norm, which requires regularity conditions on the solution |$u$| that exceed those that are needed to guarantee optimal rates of the best approximation in the |$\|\,\|_{X,Y^{\delta }}$|-norm. In other words, this theorem does not show that |$u^{\delta }$| is a quasi-best approximation to |$u$| from |$X^{\delta }$| in the |$\|\,\|_{X,Y^{\delta }}$|-norm, or in any other norm.
Remark 3.6 Theorem 3.5 provides a generalization, with an improved constant, of Steinbach’s result (Steinbach, 2015, Theorem 3.2). There the case was considered that the initial value |$u_0=0$|, |${\operatorname{ran}} \gamma _0|_{X^{\delta }}=\{0\}$|, |$H^{\delta }=\{0\}$| and |$Y^{\delta }=X^{\delta }$|. In that case the Galerkin discretization of (2.8) means solving |$u^{\delta } \in X^{\delta }$| from |$(B u^{\delta })(v)=g(v)$| (|$v \in X^{\delta }$|) (indeed, |$Z^{\delta }$| in the proof of Theorem 3.5 is |$X^{\delta } \times \{0\}$|). So with this approach the forming of ‘normal equations’ as in (2.9) is avoided.
In the case of an inhomogeneous initial value |$u_0 \in H$|, one may approximate the solution as |$\bar{u}+w^{\delta }$|, where |$\bar{u} \in X$| is such that |$\gamma _0 \bar{u}=u_0$|, and |$w^{\delta } \in X^{\delta }$| solves |$(B w^{\delta })(v)=g(v)-(B \bar{u})(v)$| (|$v \in X^{\delta }$|). Although such a |$\bar{u} \in X$| always exists, its practical construction becomes inconvenient for |$u_0 \not \in V$|. For |$u_0 \in V$|, |$\bar{u}$| can be taken as its constant extension in time.
To investigate in the setting of Steinbach (2015) the relation between the |$\|\,\|_{X,X^{\delta }}$|- and |$\|\,\|_X$|-norms, we consider |$X^{\delta }$| of the form |$X^{\delta }_t \otimes X^{\delta }_x$|, where |$X^{\delta }_t$| is the space of continuous piecewise linears, zero at |$t=0$|, w.r.t. a uniform partition of |$I$| with mesh size |$h_{\delta }=\frac{T}{2 N_{\delta }}$| for some |$N_{\delta } \in \mathbb N$|, and |$X^{\delta }_x \subset V$| with |$\cap _{\delta \in \varDelta } X_x^{\delta } \neq \{0\}$|. Given |$z^{\delta } \in X^{\delta }$|, Lemma 3.4 shows that
For some arbitrary, fixed |$0 \neq z_x \in \cap _{\delta \in \varDelta } X_x^{\delta }$|, we take |$z^{\delta }=z^{\delta }_t \otimes z_x \in X^{\delta }$|, where |$z^{\delta }_t \in X^{\delta }_t$| is defined by |$\frac{\mathrm{d}}{\mathrm{d} t} z^{\delta }_t=(-1)^{i-1}$| on |$[(i-1) h_{\delta },i h_{\delta }]$|. Since |$z^{\delta }_t(0)=0$| and the number |$2 N_{\delta }$| of subintervals is even, also |$z^{\delta }_t(T)=0$|. We have |$\|z^{\delta }_t\|_{L_2(I)}\eqsim h_{\delta }$|, |$\|\frac{\textrm{d} z^{\delta }_t}{\textrm{d} t}\|_{L_2(I)}\eqsim 1$|, |$\sup _{0 \neq v \in Y}\frac{(\partial _t z^{\delta })(v)}{\|v\|_Y}=\|\frac{\textrm{d} z^{\delta }_t}{\textrm{d} t}\|_{L_2(I)}\|z_x\|_{V^{\prime }} \eqsim 1$|, |$\|z^{\delta }\|_Y =\|z^{\delta }_t\|_{L_2(I)} \|z_x\|_{V} \eqsim h_{\delta }$| and
Let us equip the space of piecewise constants w.r.t. the aforementioned uniform partition with the |$L_2(I)$|-normalized basis |$\{\chi _i^{\delta }\}$| of characteristic functions of the subintervals, and |$X^{\delta }_t$| with the set of nodal basis functions |$\{\phi _i^{\delta }\}$| normalized such that their maximal value is |$h_{\delta }^{-\frac 12}$|. Then with $$G:=[\langle \chi _j,\phi _i\rangle _{L_2(I)}]_{i j} =\frac 12 {\small \left [\begin{smallmatrix} 1 & 1 & & \\ &\ddots & \ddots &\\ & & 1& 1\\ & & & 1 \end{smallmatrix} \right ]}$$, and |$\vec{x}:=\sqrt{h_{\delta }}\, [(-1)^{i-1}]_{1 \leq i \leq 2 N_{\delta }}$|, from the uniform |$L_2(I)$|-stability of |$\{\phi _i^{\delta }\}$| one infers that
By substituting these estimates in the right-hand side of (3.12), we find that its value is |$\eqsim \sqrt{h_{\delta }}$|, so that |$\inf _{0 \neq z^{\delta } \in X^{\delta }} \sup _{0 \neq v \in X^{\delta }} \frac{|(B z^{\delta })(v)|}{\|z^{\delta }\|_X\|v\|_Y} \lesssim \sqrt{h_{\delta }}$|. As follows from the first inequality in Remark 3.2, this means that there exist solutions |$u \in X$| of the parabolic problem for which the errors in the |$X$|-norm in these Galerkin approximations from |$X^{\delta }$| are a factor |$\gtrsim h_{\delta }^{-\frac 12}$| larger than these errors in the best approximations from |$X^{\delta }$|.
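The temporal quantities entering this counterexample can be reproduced with a few lines of code. The sketch below builds the zigzag function |$z^{\delta }_t$| on |$2N$| uniform cells, evaluates its norms exactly cell by cell, and forms the matrix |$G$| and vector |$\vec{x}$| as described above; the function name `zigzag_quantities` and its return convention are illustrative choices.

```python
import numpy as np


def zigzag_quantities(N, T=1.0):
    """Quantities for the zigzag z_t with slope (-1)^(i-1) on the i-th of 2N uniform cells.

    Returns (h, z_t(T), ||z_t||_{L2}, ||z_t'||_{L2}, ||G x||_2, ||x||_2).
    """
    n = 2 * N
    h = T / n
    slopes = np.array([(-1.0) ** i for i in range(n)])        # index 0 is the first cell, slope +1
    nodal = np.concatenate(([0.0], np.cumsum(slopes) * h))    # nodal values, starting from z_t(0) = 0
    a, b = nodal[:-1], nodal[1:]
    # exact L2 norm of a continuous piecewise linear function, cell by cell
    l2 = np.sqrt(np.sum(h / 3.0 * (a * a + a * b + b * b)))
    d_l2 = np.sqrt(np.sum(h * slopes ** 2))                   # L2 norm of the derivative
    G = 0.5 * (np.eye(n) + np.eye(n, k=1))                    # 1/2 times upper bidiagonal of ones
    x = np.sqrt(h) * slopes                                   # coefficients of z_t' w.r.t. the chi basis
    return h, nodal[-1], l2, d_l2, np.linalg.norm(G @ x), np.linalg.norm(x)
```

In exact arithmetic one finds |$\|z^{\delta }_t\|_{L_2(I)}=h_{\delta }\sqrt{T/3}$|, |$\|\frac{\textrm{d} z^{\delta }_t}{\textrm{d} t}\|_{L_2(I)}=\sqrt{T}$|, |$\|G\vec{x}\|=\frac 12 \sqrt{h_{\delta }}$| and |$\|\vec{x}\|=\sqrt{T}$|: all rows of |$G\vec{x}$| cancel except the last one, which appears to be the mechanism behind the factor |$\sqrt{h_{\delta }}$|.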
Numerical evidence provided by Steinbach (2015, Table 6) indicates that in general these Galerkin approximations are not quasi-optimal in the |$Y$|-norm either.
Returning to the general setting of Theorem 3.5, in the following theorem it will be shown that under an additional assumption, quasi-optimal error estimates are valid.
Theorem 3.7 Assuming (3.8), let |$(Y^{\delta },H^{\delta },X^{\delta })_{\delta \in \varDelta }$| be a family of closed subspaces of |$Y \times H \times X$| such that in addition to |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}} \gamma _0|_{X^{\delta }}\subseteq H^{\delta }$|, also (3.5) is valid. Then for the Galerkin solutions |$(\mu ^{\delta },\sigma ^{\delta },u^{\delta }) \in Y^{\delta } \times H^{\delta } \times X^{\delta } $| of (2.8) it holds that
Proof. As we saw in the proof of Theorem 3.5, thanks to the assumptions |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}} \gamma _0|_{X^{\delta }}\subseteq H^{\delta }$|, the component |$u^{\delta } \in X^{\delta }$| of the Galerkin solution of (2.8) is the Petrov–Galerkin solution of (2.6) with test space |$Z^{\delta } \subset Y^{\delta } \times H^{\delta }$|.
Equation (3.11) shows that the projector |$P^{\delta }\colon u \mapsto u^{\delta }$| satisfies |$\|P^{\delta } u\|_{X,Y^{\delta }} \leq \|u\|_X$|. The proof is completed by |$\|\,\|_X \leq \gamma _{\varDelta }^{-1} \|\,\|_{X,Y^{\delta }}$| on |$X^{\delta }$| by assumption (3.5), in combination with (3.1).
Andreev (2013) studied minimal residual Petrov–Galerkin discretizations of $$\left [\begin{smallmatrix} B \\ \gamma _0\end{smallmatrix}\right ]u=\left [\begin{smallmatrix} g \\ \gamma _0^{\prime} u_0\end{smallmatrix}\right ]$$. They can equivalently be interpreted as Galerkin discretizations of (2.8) (cf. Cohen et al., 2012; Broersen & Stevenson, 2014, Proposition 2.2). In view of this, Theorem 3.7 reproduces, though here with a clear-cut constant, the results from Andreev (2013, Theorems 3.1 & 4.1).
Remark 3.8 As was pointed out earlier in Andreev (2013), for practical computations it can be attractive to modify the Galerkin discretization of (2.8) by replacing |${E_Y^{\delta }}^{\prime } A_s E_Y^{\delta }$| by some |$\tilde{A}_s^{\delta }={\tilde{A}_s^{\delta }}{}^{\prime } \in{\mathcal{L}} \textrm{is} (Y^{\delta },{Y^{\delta }}^{\prime })$| whose inverse can be determined cheaply (a preconditioner), such that for some constants |$0<c_{\mathcal N} \leq C_{\mathcal N}<\infty $|,
Indeed, in that case one can solve the then explicitly available Schur complement equation with preconditioned CG, instead of applying the preconditioned MINRES iteration. By redefining $$Z^{\delta }:={\operatorname{clos}}_{Y^{\delta } \times H^{\delta }} {\operatorname{ran}}\left [\begin{smallmatrix} (\tilde{A}_s^{\delta })^{-1} {E_Y^{\delta }}^{\prime } B \\ \gamma _0 \end{smallmatrix}\right ]\Big |_{X^{\delta }}$$ in the proof of Theorem 3.5, and by taking |$W^{\delta }$| to be its orthogonal complement in |$Y^{\delta } \times H^{\delta }$| with |$Y^{\delta }$| now being equipped with inner product |$(\tilde{A}_s^{\delta } \cdot )(\cdot )$|, instead of (3.11) we now estimate for any |$\bar{u}^{\delta } \in X^{\delta }$|,
Consequently, a generalization of the statement of Theorem 3.5 reads as
and that of Theorem 3.7 reads
Remark 3.9 As we saw in the previous section, under the condition that (3.5) is valid, Galerkin discretizations of (2.10) yield quasi-optimal approximations. Assuming |$A=A^{\prime }$|, in the current section we have seen that the same holds true for Galerkin discretizations of (2.8) when in addition |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}}\ \gamma _0|_{X^{\delta }}\subseteq H^{\delta }$|. For the latter discretization, however, still a suboptimal error bound is valid without assuming (3.5). This raises the question whether this is also true for Galerkin discretizations of (2.10).
As we saw earlier, the Galerkin operator resulting from (2.10) is invertible whenever |$X^{\delta }\neq \{0\}$|. Moreover, when equipping |$X^{\delta }$| with the ‘mesh-dependent’ norm |$\|\,\|_{X,Y^{\delta }}$|, by adapting the proof of Theorem 3.3 one can show that the Galerkin operator is in |${\mathcal{L}}\textrm{is}(Y^{\delta } \times X^{\delta }, {Y^{\delta }}^{\prime } \times{X^{\delta }}^{\prime })$| with both the operator and its inverse having a uniformly bounded norm. Despite this result, however, we could not establish a suboptimal error estimate similar to Theorem 3.5.
Finally in this section, we comment on the implementation of the Galerkin discretization of (2.8). This system reads
By eliminating |$\sigma ^{\delta }$|, it is equivalent to
The operator |$E_H^{\delta } \big ({E_H^{\delta }}^{\prime } E_H^{\delta }\big )^{-1} {E_H^{\delta }}^{\prime }$| is the |$H$|-orthogonal projector onto |$H^{\delta }$|. So under the assumption that
which was made in Theorem 3.7, it can be omitted, or equivalently, it can be pretended that |$H^{\delta }=H$|, without changing the solution |$(\mu ^{\delta },u^{\delta })$|. The implementation of the resulting system
is easier, and it runs more efficiently than (3.13).
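Why the projector can be dropped is easy to see in a finite-dimensional toy setting. The sketch below assumes |$H=\mathbb R^7$| with the Euclidean inner product and a random matrix `E` whose columns span a stand-in for |$H^{\delta }$|; the vectors `w_in` and `w_out` are hypothetical examples of an element of |$H^{\delta }$| (such as an initial trace of a function in |$X^{\delta }$| under the assumption of Theorem 3.7) and of a generic element of |$H$|.

```python
import numpy as np

rng = np.random.default_rng(2)
n, k = 7, 3
E = rng.standard_normal((n, k))             # columns: basis of a subspace H^delta of H = R^n
Q = E @ np.linalg.solve(E.T @ E, E.T)       # E (E' E)^{-1} E', the orthogonal projector onto ran E

w_in = E @ rng.standard_normal(k)           # element of H^delta: left unchanged by Q
w_out = rng.standard_normal(n)              # generic element of H: Q shortens it in general
```

Here `Q @ w_in` equals `w_in`, so whenever all initial traces of the discrete trial functions lie in the range of `E`, replacing `Q` by the identity changes neither the matrix entries built from these traces nor the solution.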
Remark 3.10 System (3.15) can be viewed as a Galerkin discretization of
but for the analysis of the discretization error in |$(\mu ^{\delta },u^{\delta })$| it is still useful to view (3.15), before elimination of |$\sigma ^{\delta }$|, as a Galerkin discretization of (2.8), which yielded the sharp bound on this error presented in Theorem 3.7.
4. Realization of the uniform inf-sup stability (3.5)
In Theorem 3.3 it was shown that Galerkin discretizations of (2.10) are quasi-optimal when (3.5) is valid, and in Theorem 3.7 the same was shown for Galerkin discretizations of (2.8) when in addition |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}} \gamma _0|_{X^{\delta }} \subseteq H^{\delta }$| (and |$A=A_s$|) are valid.
In this section we realize condition (3.5) for finite element spaces w.r.t. partitions of the space-time domain into prismatic elements. In Section 4.1 generally nonuniform partitions are considered for which the partition in time is independent of the spatial location, and the spatial mesh in each time slab is such that the corresponding |$H$|-orthogonal projection is uniformly |$V$|-stable. In Section 4.2 we revisit the special case, already studied in Andreev (2013), of trial spaces that are tensor products of temporal and spatial trial spaces.
4.1 Nonuniform approximation in space local in time, nonuniform approximation in time global in space
Theorem 4.1. Let |${\mathcal O}$| be a collection of closed subspaces |$X_x$| of |$V$| such that the |$H$|-orthogonal projector |$Q_{X_x}$| onto |$X_x$| is in |$\mathcal L(V,V)$|, with |$\mu _{\mathcal O}:= \inf _{X_x \in{\mathcal O}} \|Q_{X_x}\|_{\mathcal L(V,V)}^{-1}>0$|. For any |$N \in \mathbb N$|, |$0=t_0<t_1<\cdots <t_N=T$|, |$q_0,\ldots ,q_{N-1} \in \mathbb N$|, |$X_x^0,\ldots ,X_x^{N-1} \in{\mathcal O}$|, let
Then with |$\varDelta $| being the collection of all |$\delta =\delta (N,(t_i)_i, (q_i)_i, (X_x^i)_i)$|, it holds that
i.e., (3.5) is valid.
Proof. In Andreev (2013, Lemma 6.2) it was shown that |$\inf _{0 \neq u \in X_x} \sup _{0 \neq v \in X_x} \frac{\langle u,v\rangle }{\|u\|_{V^{\prime }}\|v\|_{V}}=\|Q_{X_x}\|_{\mathcal L(V,V)}^{-1}$|.
With |$P_n$| denoting the Legendre polynomial of degree |$n$|, extended with zero outside |$(-1,1)$|, for any |$u \in X^{\delta } $|, |$\partial _t u$| can be written as the |$L_2(I;H)$|-orthogonal expansion |$(t,x) \mapsto \sum _{i=0}^{N-1} \sum _{n=0}^{q_i-1} P_n\big (\frac{2t-(t_{i+1}+t_i)}{t_{i+1}-t_i}\big ) u_{i,n}(x)$| for some |$u_{i,n} \in X_x^i$|. Fixing |$\varepsilon \in (0,\mu _{\mathcal O} )$|, for each |$(i,n)$| there is a |$v_{i,n} \in X_x^i$| with |$\|v_{i,n}\|_{V}=\|u_{i,n}\|_{V^{\prime }}$| and |$\langle u_{i,n},v_{i,n}\rangle \geq (\mu _{\mathcal O} -\varepsilon ) \|u_{i,n}\|_{V^{\prime }} \|v_{i,n}\|_{V}$|. Taking |$v:=(t,x) \mapsto \sum _{i=0}^{N-1} \sum _{n=0}^{q_i-1} P_n\big (\frac{2t-(t_{i+1}+t_i)}{t_{i+1}-t_i}\big ) v_{i,n}(x)$|, we conclude that
which implies the result.
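The orthogonality of the Legendre expansion used in this proof can be checked in exact arithmetic. The following sketch uses only the Python standard library; the function names are illustrative and the slab |$(\frac 14,1)$| is an arbitrary choice. It computes the |$L_2$|-inner products of |$P_m$| and |$P_n$| transplanted to a time slab |$(t_i,t_{i+1})$| and confirms that they vanish for |$m \neq n$| and equal |$(t_{i+1}-t_i)/(2n+1)$| for |$m=n$|.

```python
from fractions import Fraction

def legendre_coeffs(n):
    """Coefficients (lowest degree first) of the Legendre polynomial P_n,
    from the recurrence (k+1) P_{k+1} = (2k+1) x P_k - k P_{k-1}."""
    p_prev, p = [Fraction(1)], [Fraction(0), Fraction(1)]
    if n == 0:
        return p_prev
    for k in range(1, n):
        x_p = [Fraction(0)] + p                                   # x * P_k
        padded = p_prev + [Fraction(0)] * (len(x_p) - len(p_prev))
        p_next = [((2 * k + 1) * a - k * b) / (k + 1) for a, b in zip(x_p, padded)]
        p_prev, p = p, p_next
    return p

def integrate_product(p, q):
    """Exact integral over (-1, 1) of the product of two polynomials."""
    total = Fraction(0)
    for i, a in enumerate(p):
        for j, b in enumerate(q):
            if (i + j) % 2 == 0:                                  # odd powers integrate to zero
                total += a * b * Fraction(2, i + j + 1)
    return total

def slab_inner_product(m, n, t_left, t_right):
    """L2(t_left, t_right) inner product of P_m and P_n composed with the
    affine map t -> (2t - (t_right + t_left)) / (t_right - t_left)."""
    scale = Fraction(t_right - t_left) / 2                        # Jacobian of the affine map
    return scale * integrate_product(legendre_coeffs(m), legendre_coeffs(n))

t_left, t_right = Fraction(1, 4), Fraction(1)
gram = [[slab_inner_product(m, n, t_left, t_right) for n in range(4)]
        for m in range(4)]
```

Since polynomials living on different slabs have disjoint supports, the diagonal Gram matrix on each slab is all that is needed for the expansion to be orthogonal on the whole time interval.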
Remark 4.2. In view of Theorem 3.7, note that both |$X^{\delta } \subset Y^{\delta }$| and (3.5) are valid by taking |$Y^{\delta }:=\{v \in L_2(I;V) \colon v|_{(t_i,t_{i+1})} \in P_{q_i} \otimes X_x^i\}$|.
Concerning the condition on the collection |${\mathcal O}$| of spatial trial spaces |$X_x$|, let us consider the typical situation that |$H=L_2(\varOmega )$|, |$V=H^1_{0,\gamma }(\varOmega )=\{u \in H^1(\varOmega )\colon u=0 \textrm{ on } \gamma \}$| where |$\varOmega \subset \mathbb R^d$| is a bounded polytopal domain and |$\gamma $| is a measurable, closed, possibly empty subset of |$\partial \varOmega $|. We consider |$X_x \subset V$| to be finite element spaces of some degree w.r.t. a family of uniformly shape regular, and, say, conforming partitions |${\mathcal T}$| of |$\varOmega $| into, say, |$d$|-simplices, where |$\gamma $| is the union of some |$(d-1)$|-faces of |$S \in{\mathcal T}$|. When the partitions in this family are quasi-uniform, then using, e.g., the Scott–Zhang quasi-interpolator (Scott & Zhang, 1990), it is easy to demonstrate the so-called (uniform) simultaneous approximation property
Writing for |$u \in V$| and any |$v \in X_x$|, |$Qu=v+Q(u-v)$|, one easily infers that |$\sup _{X_x \in{\mathcal O}} \|Q_x\|_{\mathcal L(V,V)}<\infty $|.
The uniform boundedness of |$\|Q_x\|_{\mathcal L(V,V)}$| is, however, by no means restricted to families of finite element spaces w.r.t. quasi-uniform partitions, and it has been demonstrated for families of locally refined partitions, for |$d=2$| including those that are generated by the newest vertex bisection algorithm. We refer to Carstensen (2002) and Gaspoz et al. (2016).
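For a concrete impression of this stability constant, the following sketch estimates the |$V$|-norm of the |$L_2$|-orthogonal projector onto continuous piecewise linears on a uniform mesh of |$(0,1)$| with homogeneous Dirichlet conditions. It is only a proxy under stated assumptions: the supremum over |$V=H^1_0(0,1)$| is replaced by a supremum over piecewise linears on a refined mesh, so the computed value is a lower bound for the true norm; NumPy is assumed, and all function names and the refinement factor are illustrative choices.

```python
import numpy as np

def p1_matrices(n):
    """Mass and stiffness matrices of P1 elements with zero boundary values
    on a uniform mesh of (0, 1) with n cells."""
    h, m = 1.0 / n, n - 1
    off = np.eye(m, k=1) + np.eye(m, k=-1)
    M = h / 6.0 * (4.0 * np.eye(m) + off)
    A = 1.0 / h * (2.0 * np.eye(m) - off)
    return M, A

def prolongation(n_coarse, refine):
    """Values of the coarse hat functions at the interior nodes of the refined mesh."""
    n_fine = n_coarse * refine
    x_fine = np.arange(1, n_fine) / n_fine
    x_coarse = np.arange(1, n_coarse) / n_coarse
    h = 1.0 / n_coarse
    return np.maximum(0.0, 1.0 - np.abs(x_fine[:, None] - x_coarse[None, :]) / h)

def h1_norm_of_l2_projector(n_coarse, refine=8):
    """max |Q v|_{H^1} / |v|_{H^1} over the refined P1 space (a lower bound for ||Q||)."""
    M, A = p1_matrices(n_coarse * refine)
    P = prolongation(n_coarse, refine)
    K = np.linalg.solve(P.T @ M @ P, P.T @ M)     # coarse coefficients of the L2-projection
    L = np.linalg.cholesky(P.T @ A @ P)           # coarse stiffness matrix = L L^T
    C = L.T @ K @ np.linalg.solve(A, K.T) @ L     # same nonzero spectrum as A^{-1} K^T A_c K
    return np.sqrt(np.linalg.eigvalsh((C + C.T) / 2).max())

norms = [h1_norm_of_l2_projector(n) for n in (4, 8, 16)]
```

In this one-dimensional uniform setting the splitting |$Qu=v+Q(u-v)$| with |$v$| the nodal interpolant of |$u$|, combined with the inverse inequality and a Poincaré-Friedrichs inequality on each cell, gives the bound |$1+\sqrt{12}/\pi $| for the norm, so the computed values should lie between |$1$| and roughly |$2.11$|.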
4.2 Nonuniform approximation in space global in time, nonuniform approximation in time global in space
If in Theorem 4.1, the spatial trial spaces |$X_x^i$| are independent of the temporal interval |$(t_i,t_{i+1})$|, then |$X^{\delta }$| is a tensor product of trial spaces in space and time. In that case, one shows inf-sup stability for general temporal trial spaces, e.g., spline spaces with more global smoothness than continuity.
Theorem 4.3. Let |${\mathcal O}$| be as in Theorem 4.1. Given closed subspaces |$X_t \subset H^1(I)$|, |$\frac{\mathrm{d}}{\mathrm{d}t} X_t \subseteq Y_t \subset L_2(I)$| and |$X_x \in{\mathcal O}$|, let |$X^{\delta }:=X_t \otimes X_x$|, |$Y^{\delta }:=Y_t \otimes X_x$|. Then with |$\varDelta $| being the collection of all |$\delta =\delta (X_t,Y_t,X_x)$|, (4.1) is valid.
The proof of this result follows from the fact that thanks to the Kronecker product structure of |$\partial _t \in \mathcal L(X,Y^{\prime })$|, for such trial spaces we have
(To see this, one may use that for Hilbert spaces |$U$| and |$V$|, |$T \in \mathcal L(U,V^{\prime })$|, and Riesz mappings |$R_U\colon U \rightarrow U^{\prime }$|, |$R_V\colon V \rightarrow V^{\prime }$|, it holds that |$\inf _{0 \neq u \in U}\sup _{0 \neq v \in V}\frac{(Tu)(v)}{\|u\|_U\|v\|_V}=\sqrt{\min \sigma (R_U^{-1} T^{\prime } R_V^{-1} T)}$|, with |$R_U^{-1} T^{\prime } R_V^{-1} T\in \mathcal L(U,U)$| being self-adjoint and non-negative. In the above setting, it is a Kronecker product of corresponding operators acting in the ‘time’ and ‘space’ directions, respectively.)
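The spectral characterization in this parenthetical remark can be tried out on matrices. In the sketch that follows, which assumes NumPy and uses random matrices as hypothetical stand-ins for the ‘time’ and ‘space’ factors (they are not the operators of this paper), the inf-sup constant is obtained as the square root of the smallest eigenvalue of |$G_U^{-1} T^{\top } G_V^{-1} T$|, where |$G_U$| and |$G_V$| are Gram matrices representing the Riesz mappings. For Kronecker products of such data the constant is then seen to be the product of the constants of the factors.

```python
import numpy as np

def inf_sup_constant(T, G_U, G_V):
    """inf_u sup_v (v^T T u) / (|u|_{G_U} |v|_{G_V}) for Gram matrices G_U, G_V.

    The squared constant is the smallest eigenvalue of G_U^{-1} T^T G_V^{-1} T,
    computed from the similar symmetric matrix L^{-1} S L^{-T},
    where S = T^T G_V^{-1} T and G_U = L L^T.
    """
    S = T.T @ np.linalg.solve(G_V, T)
    L = np.linalg.cholesky(G_U)
    sym = np.linalg.solve(L, np.linalg.solve(L, S).T)    # L^{-1} S L^{-T}
    lam_min = np.linalg.eigvalsh((sym + sym.T) / 2).min()
    return np.sqrt(max(lam_min, 0.0))

def random_spd(n, rng):
    """Hypothetical Gram matrix: a random symmetric positive definite matrix."""
    A = rng.standard_normal((n, n))
    return A @ A.T + n * np.eye(n)

rng = np.random.default_rng(1)
# 'Time' and 'space' factors, random stand-ins chosen only for illustration.
T_t, T_x = rng.standard_normal((5, 3)), rng.standard_normal((6, 4))
GU_t, GU_x = random_spd(3, rng), random_spd(4, rng)
GV_t, GV_x = random_spd(5, rng), random_spd(6, rng)

gamma_t = inf_sup_constant(T_t, GU_t, GV_t)
gamma_x = inf_sup_constant(T_x, GU_x, GV_x)
gamma_kron = inf_sup_constant(np.kron(T_t, T_x),
                              np.kron(GU_t, GU_x),
                              np.kron(GV_t, GV_x))
```

The product rule holds because the eigenvalues of a Kronecker product of two matrices are the pairwise products of their eigenvalues, and both factors here have non-negative spectra.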
Remark 4.4 (Sparse tensor products). Instead of considering the ‘full’ tensor product trial spaces from Theorem 4.3, more efficient approximations can be found by the application of ‘sparse’ tensor products. Let |$X_x^{(0)}\subset X_x^{(1)} \subset \cdots $| be a sequence of spaces from |${\mathcal O}$|, |$X_t^{(0)}\subset X_t^{(1)}\subset \cdots \subset H^1(I)$|, and |$Y_t^{(0)}\subset Y_t^{(1)}\subset \cdots \subset L_2(I)$| such that |$Y_t^{(k)} \supseteq \frac{\textrm{d}}{\textrm{d} t}X_t^{(k)}$|. Then for |$X^{(\ell )}:=\sum _{k=0}^{\ell } X_t^{(k)} \otimes X_x^{(\ell -k)}$|, |$Y^{(\ell )}:=\sum _{k=0}^{\ell } Y_t^{(k)} \otimes X_x^{(\ell -k)}$| inf-sup stability holds true uniformly in |$\ell $| with inf-sup constant |$\mu _{\mathcal O}$|.
Although this result follows as a special case from the analysis given in Andreev (2013), for convenience we include the argument. Defining |$W_t^{(k)}:=Y_t^{(k)} \cap (Y_t^{(k-1)})^{\perp _{L_2(I)}}$| for |$k>0$|, and |$W_t^{(0)}:=Y_t^{(0)}$|, from the nestings of |$(Y_t^{(i)})_i$| and |$(X_x^{(i)})_i$| one infers that |$Y^{(\ell )}=\oplus _{k=0}^{\ell } W_t^{(k)} \otimes X_x^{(\ell -k)}$| is an |$(L_2(I)\otimes H)$|-orthogonal decomposition. Given |$y \in Y^{(\ell )}$|, let |$y=\sum _{k=0}^{\ell } y_k$| be the corresponding expansion. Fixing |$\varepsilon \in (0,\mu _{\mathcal O} )$|, there exist |$\tilde{y}_k \in W_t^{(k)} \otimes X_x^{(\ell -k)}$| with |$\langle y_k,\tilde{y}_k\rangle _{L_2(I)\otimes H} \geq (\mu _{\mathcal O}-\varepsilon )\|y_k\|_Y\|\tilde{y}_k\|_{Y^{\prime }}$| and |$\|\tilde{y}_k\|_{Y^{\prime }}=\|y_k\|_Y$|, and so |$\langle \sum _{k=0}^{\ell } y_k,\sum _{k=0}^{\ell } \tilde{y}_k\rangle _{L_2(I)\otimes H} \geq (\mu _{\mathcal O}-\varepsilon ) \|\sum _{k=0}^{\ell } y_k\|_Y\|\sum _{k=0}^{\ell } \tilde{y}_k\|_{Y^{\prime }}$|. Thanks to |$\partial _t X^{(\ell )} \subseteq Y^{(\ell )}$|, the proof is completed.
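To make the efficiency gain of sparse tensor products concrete, the following sketch compares the dimensions of a full and a sparse tensor product space. The level-wise dimensions are hypothetical choices made only for this illustration (continuous piecewise linears on |$2^k$| temporal intervals, and piecewise linears with zero boundary values on |$2^{k+1}$| spatial intervals); the count of the sparse space uses the same splitting into increments of the nested temporal spaces as the decomposition of |$Y^{(\ell )}$| above.

```python
def dim_time(k):
    """Hypothetical temporal space: continuous piecewise linears on 2**k intervals."""
    return 2**k + 1

def dim_space(k):
    """Hypothetical spatial space: piecewise linears on 2**(k+1) intervals,
    with zero boundary values."""
    return 2**(k + 1) - 1

def dim_full(level):
    """Dimension of the full tensor product of the two level-`level` spaces."""
    return dim_time(level) * dim_space(level)

def dim_sparse(level):
    """Dimension of sum_k X_t^(k) (x) X_x^(level-k), counted via the increments
    of the nested temporal spaces."""
    total = 0
    for k in range(level + 1):
        increment = dim_time(k) - (dim_time(k - 1) if k > 0 else 0)
        total += increment * dim_space(level - k)
    return total

sparse_dims = [dim_sparse(level) for level in range(4)]
full_dims = [dim_full(level) for level in range(4)]
```

For level 3 these choices give a sparse space of dimension 47 against 135 for the full tensor product, and the gap widens as the level increases.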
Remark 4.5. In view of (4.2), it is obvious that Theorem 4.3 remains valid when the condition |$\frac{\textrm{d}}{\textrm{d}t} X_t \subseteq Y_t$| is relaxed to |$\inf _{\{u \in X_t\colon \frac{\textrm{d} u}{\textrm{d} t} \neq 0\}} \sup _{0\neq v \in Y_t} \frac{\int _I \frac{\textrm{d} u}{\textrm{d} t} v\,\textrm{d}t}{\|\frac{\textrm{d} u}{\textrm{d} t}\|_{L_2(I)}\|v\|_{L_2(I)}}>0$| uniformly in the pairs |$(X_t,Y_t)$| that are applied. As shown in Andreev (2013), the same holds true in the sparse tensor product case. For |$X_t$| the space of continuous piecewise linears w.r.t. some partition |${\mathcal T}$| of |$I$|, and |$Y_t$| the space of continuous piecewise linears w.r.t. the once dyadically refined partition, an easy computation shows that the inf-sup constant is not less than |$\sqrt{3/4}$|.
Since in our experiments with the method from Andreev (2013), with this alternative choice of |$Y_t$| the numerical results are slightly better than when taking |$Y_t$| to be the space of discontinuous piecewise linears w.r.t. |${\mathcal T}$|, we will report on results obtained with this alternative choice for |$Y_t$|.
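The lower bound |$\sqrt{3/4}$| for the pair of continuous piecewise linears on a partition and on its dyadic refinement can be observed numerically. The sketch that follows assumes NumPy and uses illustrative names. It exploits that the derivatives of continuous piecewise linears w.r.t. |${\mathcal T}$| are exactly the piecewise constants w.r.t. |${\mathcal T}$|, and it computes the inf-sup constant as the square root of the smallest eigenvalue of the associated generalized eigenvalue problem.

```python
import numpy as np

def inf_sup_refined_p1(nodes):
    """inf-sup constant between piecewise constants on `nodes` (the derivatives
    of continuous piecewise linears) and continuous piecewise linears on the
    once dyadically refined partition, both measured in L2."""
    nodes = np.asarray(nodes, dtype=float)
    n_coarse = len(nodes) - 1
    fine = np.empty(2 * n_coarse + 1)
    fine[0::2] = nodes
    fine[1::2] = 0.5 * (nodes[:-1] + nodes[1:])
    h_fine = np.diff(fine)

    M = np.zeros((len(fine), len(fine)))    # P1 mass matrix on the refined partition
    B = np.zeros((len(fine), n_coarse))     # B[j, k] = integral of hat j over coarse cell k
    for e, h in enumerate(h_fine):
        M[e:e + 2, e:e + 2] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        B[e:e + 2, e // 2] += h / 2.0

    h_coarse = np.diff(nodes)               # Gram matrix of the constants is diag(h_coarse)
    S = B.T @ np.linalg.solve(M, B)
    scaled = S / np.sqrt(np.outer(h_coarse, h_coarse))
    return np.sqrt(np.linalg.eigvalsh((scaled + scaled.T) / 2).min())

uniform = [inf_sup_refined_p1(np.linspace(0.0, 1.0, n + 1)) for n in (1, 2, 8, 32)]
nonuniform = inf_sup_refined_p1([0.0, 0.1, 0.15, 0.6, 1.0])
```

One way to see the bound by hand: for a piecewise constant |$w$|, test with the function in |$Y_t$| that equals |$w$| at the midpoints of the cells of |${\mathcal T}$| and vanishes at the nodes of |${\mathcal T}$|; a direct computation of the three integrals involved gives the ratio |$\sqrt{3}/2$| on any partition.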
5. Numerical experiments
For the simplest possible case of the heat equation in one space dimension, discretized using as ‘primal’ trial space |$X^{\delta }$| the space of continuous piecewise bilinears w.r.t. a uniform partition into squares, we compare the accuracy of approximations provided by the newly proposed method (i.e., the Galerkin discretization of (2.10) with trial space here denoted by |$Y_{\textrm{new}}^{\delta } \times X^{\delta }$|) with those obtained with the method from Andreev (2013) (i.e., the Galerkin discretization of (2.8)). We implement the latter method in the form (3.15), i.e., after eliminating |$\sigma ^{\delta }$|. The remaining trial space is denoted here by |$Y_{\textrm{Andr}}^{\delta } \times X^{\delta }$|. So we take |$T=1$|, i.e., |$I=(0,1)$|, and with |$\varOmega :=(0,1)$|, |$H:=L_2(\varOmega )$|, |$V:=H^1_0(\varOmega )$|, |$a(t;\eta ,\zeta ):=\int _{\varOmega } \eta ^{\prime } \zeta ^{\prime }\,\textrm{d}x$|. With |$\frac{1}{h_t}=\frac{1}{h_x} \in \mathbb N$|, we set
Note that |$\dim Y^{\delta }_{\textrm{new}} \approx \dim X^{\delta }$| and |$\dim Y^{\delta }_{\textrm{Andr}} \approx 2\dim X^{\delta }$|. The total number of nonzeros in the whole system matrix of the new method is asymptotically a factor 2 smaller than this number for Andreev’s method.
Prescribing both a smooth exact solution |$u(t,x)=e^{-2t} \sin \pi x$| and a singular one |$u(t,x)= e^{-2t} |t-x| \sin \pi x$|, Fig. 1 shows the errors |$e^{\delta }:= u - u^{\delta }$| in the |$X$|-norm as a function of |$\dim X^{\delta }$|.
The norms of the errors in the Galerkin solutions found by the two methods are nearly indistinguishable from one another. Furthermore, the observed convergence rates |$1/2$| and |$1/4$|, respectively, are the best possible ones that, in view of the polynomial degrees of |$X^{\delta }$| and |$Y^{\delta }$| (new method) or that of |$X^{\delta }$| (Andreev’s method) and the regularity of the solutions, can be expected with the application of uniform meshes. (For any |$\varepsilon>0$|, |$e^{-2t} |t-x| \sin \pi x \in H^{\frac 32-\varepsilon }(I \times \varOmega ) \setminus H^{\frac 32}(I \times \varOmega ).$|)
For both solutions and both numerical methods, the errors |$e^{\delta }(T,\cdot )$| measured in |$L_2(\varOmega )$| converge with the better rate |$1$|, i.e., these errors are asymptotically proportional to |$h_x^2=h_t^2$|; see the left-hand picture in Fig. 2. To illustrate that the two methods yield different Galerkin solutions, we show |$e^{\delta }(0, \cdot )$|, measured in the |$L_2(\varOmega )$|-norm, in the right-hand picture of Fig. 2.
The new method actually yields two approximations for |$u$|, viz. |$u^{\delta }$| and |$\lambda ^{\delta }$|. This secondary approximation is not in |$X$|, but it is in |$Y=L_2(I;V)$|. For both solutions, the errors in |$\lambda ^{\delta }$| measured in the |$Y$|-norm are slightly larger than those in |$u^{\delta }$|; see the left-hand picture in Fig. 3.
Finally, we replaced the symmetric spatial diffusion operator by a nonsymmetric convection-diffusion operator |$a(t;\eta ,\zeta ):=\int _{\varOmega } \eta ^{\prime } \zeta ^{\prime } + \beta \eta ^{\prime } \zeta \,\textrm{d}x$|. Letting |$\beta := 100$| and again taking the singular solution |$u(t,x)=e^{-2t} |t-x| \sin \pi x$|, the errors |$e^{\delta }$| in the |$X$|-norm of both Galerkin solutions vs. |$\dim X^{\delta }$| are given in Fig. 3. We once again see that the two methods show very comparable convergence behavior.
6. Conclusion
Three related (Petrov–) Galerkin discretizations of space–time variational formulations were analyzed. The Galerkin scheme introduced by Steinbach (2015) has the lowest computational cost and applies on general space–time meshes, but depending on the exact solution, the numerical solutions can be far from quasi-optimal in the natural mesh-independent norm. The minimal residual Petrov–Galerkin discretization introduced by Andreev (2013) yields for suitable trial and test pairs quasi-optimal approximations from the trial space. For suitable pairs of trial spaces, Galerkin discretizations of a newly introduced mixed space–time variational formulation also yield quasi-optimal approximations, but for the same accuracy at a lower computational cost than with the method from Andreev (2013).
Funding
NSF (grant DMS 172029 to R.S.); Netherlands Organization for Scientific Research (NWO) (under contract no. 613.001.652 to J.W.).
References
Andreev, R. (2012) Stability of space–time Petrov–Galerkin discretizations for parabolic evolution equations. Ph.D. Thesis. ETH, Zürich.
Andreev, R. (2013) Stability of sparse space–time finite element discretizations of linear parabolic evolution equations. IMA J. Numer. Anal., 33, 242–260.
Andreev, R. (2016) Wavelet-in-time multigrid-in-space preconditioning of parabolic evolution equations. SIAM J. Sci. Comput., 38, A216–A242.
Babuška, I. & Janik, T. (1989) The h-p version of the finite element method for parabolic equations I. The p version in time. Numer. Methods Partial Differ. Equ., 5, 363–399.
Babuška, I. & Janik, T. (1990) The h-p version of the finite element method for parabolic equations II. The h-p version in time. Numer. Methods Partial Differ. Equ., 6, 343–369.
Brézis, H. & Ekeland, I. (1976) Un principe variationnel associé à certaines équations paraboliques. Le cas dépendant du temps. C. R. Acad. Sci. Paris Sér. A-B, 282, Ai, A1197–A1198.
Broersen, D. & Stevenson, R. P. (2014) A robust Petrov–Galerkin discretisation of convection-diffusion equations. Comput. Math. Appl., 68, 1605–1618.
Carstensen, C. (2002) Merging the Bramble–Pasciak–Steinbach and the Crouzeix–Thomée criterion for |${H}^1$|-stability of the |${L}^2$|-projection onto finite element spaces. Math. Comp., 71, 157–163.
Cohen, A., Dahmen, W. & Welper, G. (2012) Adaptivity and variational stabilization for convection–diffusion equations. ESAIM Math. Model. Numer. Anal., 46, 1247–1273.
Dautray, R. & Lions, J.-L. (1992) Mathematical Analysis and Numerical Methods for Science and Technology. Berlin: Springer.
Devaud, D. & Schwab, C. (2018) Space–time |$hp$|-approximation of parabolic equations. Calcolo, 55, Art. 35, 23.
Dupont, T. (1982) Mesh modification for evolution equations. Math. Comp., 39, 85–107.
Ern, A., Smears, I. & Vohralík, M. (2017) Guaranteed, locally space-time efficient, and polynomial-degree robust a posteriori error estimates for high-order discretizations of parabolic problems. SIAM J. Numer. Anal., 55, 2811–2834.
Führer, T. & Karkulik, M. (2019) Space–time least-squares finite elements for parabolic equations. Technical Report.
Gander, M. J. & Neumüller, M. (2016) Analysis of a new space–time parallel multigrid algorithm for parabolic problems. SIAM J. Sci. Comput., 38, A2173–A2208.
Gaspoz, F. D., Heine, C.-J. & Siebert, K. G. (2016) Optimal grading of the newest vertex bisection and |${H}^1$|-stability of the |${L}_2$|-projection. IMA J. Numer. Anal., 36, 1217–1241.
Gunzburger, M. D. & Kunoth, A. (2011) Space–time adaptive wavelet methods for control problems constrained by parabolic evolution equations. SIAM J. Contr. Optim., 49, 1150–1170.
Kato, T. (1960) Estimation of iterated matrices, with application to the von Neumann condition. Numer. Math., 2, 22–29.
Langer, U., Moore, S. E. & Neumüller, M. (2016) Space-time isogeometric analysis of parabolic evolution problems. Comput. Methods Appl. Mech. Eng., 306, 342–363.
Nayroles, B. (1976) Deux théorèmes de minimum pour certains systèmes dissipatifs. C. R. Acad. Sci. Paris Sér. A-B, 282, Aiv, A1035–A1038.
Neumüller, M. & Smears, I. (2019) Time-parallel iterative solvers for parabolic evolution equations. Adv. Comput. Math., 45, 1031–1066.
Rekatsinas, N. & Stevenson, R. (2019) An optimal adaptive tensor product wavelet solver of a space-time FOSLS formulation of parabolic evolution problems. Adv. Comput. Math., 45, 1031–1066.
Schwab, C. & Stevenson, R. P. (2009) A space-time adaptive wavelet method for parabolic evolution problems. Math. Comp., 78, 1293–1318.
Schwab, C. & Stevenson, R. P. (2017) Fractional space–time variational formulations of (Navier)–Stokes equations. SIAM J. Math. Anal., 49, 2442–2467.
Steinbach, O. & Zank, M. (2018) Coercive Space–time Finite Element Methods for Initial Boundary Value Problems. Berichte aus dem Institut für Angewandte Mathematik, Bericht 2018/7. Technische Universität Graz.
Steinbach, O. (2015) Space–time finite element methods for parabolic problems. Comput. Methods Appl. Math., 15, 551–566.
Scott, L. R. & Zhang, S. (1990) Finite element interpolation of nonsmooth functions satisfying boundary conditions. Math. Comp., 54, 483–493.
Tantardini, F. & Veeser, A. (2016) The |${L}^2$|-projection and quasi-optimality of Galerkin methods for parabolic equations. SIAM J. Numer. Anal., 54, 317–340.
Urban, K. & Patera, A. T. (2014) An improved error bound for reduced basis approximation of linear parabolic problems. Math. Comp., 83, 1599–1615.
Voulis, I. & Reusken, A. (2018) A time dependent Stokes interface problem: well-posedness and space-time finite element discretization. ESAIM Math. Model. Numer. Anal., 52, 2187–2213.
Wloka, J. (1982) Partielle Differentialgleichungen: Sobolevräume und Randwertaufgaben. Stuttgart: B. G. Teubner.
Xu, J. & Zikatanov, L. (2003) Some observations on Babuška and Brezzi theories. Numer. Math., 94, 195–202.
© The Author(s) 2020. Published by Oxford University Press.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.