Abstract

We analyze Galerkin discretizations of a new well-posed mixed space–time variational formulation of parabolic partial differential equations. For suitable pairs of finite element trial spaces, the resulting Galerkin operators are shown to be uniformly stable. The method is compared to two related space–time discretization methods introduced by Andreev (2013, Stability of sparse space-time finite element discretizations of linear parabolic evolution equations. IMA J. Numer. Anal., 33, 242–260) and by Steinbach (2015, Space-time finite element methods for parabolic problems. Comput. Methods Appl. Math., 15, 551–566).

1. Introduction

In recent years, one has witnessed a rapidly growing interest in simultaneous space–time methods for solving parabolic evolution equations, originally introduced in Babuška & Janik (1989, 1990); see, e.g., Gunzburger & Kunoth (2011), Andreev (2013), Urban & Patera (2014), Steinbach (2015), Gander & Neumüller (2016), Langer et al. (2016), Schwab & Stevenson (2017), Devaud & Schwab (2018), Rekatsinas & Stevenson (2019), Steinbach & Zank (2018), Voulis & Reusken (2018), Führer & Karkulik (2019), Neumüller & Smears (2019). Compared to classical time marching methods, space–time methods are much better suited for a massively parallel implementation and have the potential to drive adaptivity simultaneously in space and time.

Apart from the first-order system least squares formulation recently introduced in Führer & Karkulik (2019), the known well-posed simultaneous space–time variational formulations of parabolic equations in terms of partial differential operators only, so not involving nonlocal operators, are not coercive. As a consequence, it is nontrivial to find families of pairs of discrete trial and test spaces for which the resulting Petrov–Galerkin discretizations are uniformly stable. The latter is a sufficient and, as we will see, necessary condition for the Petrov–Galerkin approximations to be quasi-optimal, i.e., to yield, up to a constant factor, a best approximation to the solution from the trial space. This concept has to be contrasted with rate optimality that, for quasi-uniform temporal and spatial partitions, has been shown for any reasonable numerical scheme under the assumption of sufficient regularity of the solution.

If one allows different spatial meshes at different times, then for the classical time marching schemes quasi-optimality of the numerical approximations is known not to be guaranteed, as demonstrated in Dupont (1982, Section 4).

In view of the difficulty in constructing stable pairs of trial and test spaces, Andreev (2013) considered minimal residual Petrov–Galerkin discretizations. They have an equivalent interpretation as Galerkin discretizations of an extended self-adjoint mixed system, with the Riesz lift of the residual of the primal variable being the secondary variable. This is the point of view we will take.

A different path was followed by Steinbach (2015). Assuming a homogeneous initial condition, for equal test and trial finite element spaces w.r.t. fully general finite element meshes, stability was shown w.r.t. a weaker mesh-dependent norm on the trial space. As we will see, however, this has the consequence that for some solutions of the parabolic problem these Galerkin approximations are far from being quasi-optimal w.r.t. the natural mesh-independent norm on the trial space.

In the current work, we modify Andreev’s approach by considering an equivalent but simpler mixed system that we construct from a space–time variational formulation that follows from applying the Brézis–Ekeland–Nayroles principle (Brézis & Ekeland, 1976; Nayroles, 1976). With the same trial space for the primal variable, we show stability of the Galerkin discretization of this mixed system whilst utilizing a smaller trial space for the secondary variable. In addition, the stiffness matrix resulting from this mixed system is more sparse. In our numerical experiments the errors in the Galerkin solutions are nevertheless very comparable.

1.1 Organization

In Section 2 we derive the two self-adjoint mixed system formulations of the parabolic problem that are central in this work. In Section 3 we give sufficient conditions for stability of Galerkin discretizations for both systems. We provide an a priori error bound for the Galerkin discretization of the newly introduced system and improved a priori error bounds for the methods from Andreev (2013) and Steinbach (2015). In Section 4, we show that the crucial condition for stability (being the only condition for the newly introduced mixed system) is satisfied for prismatic space–time finite elements whenever the generally nonuniform partition in time is independent of the spatial location, and the generally nonuniform spatial mesh in each time slab is such that the corresponding |$L_2$|-orthogonal projection is uniformly |$H^1$|-stable. In Section 5 we present some first simple numerical experiments for a one-dimensional spatial domain and uniform meshes. Conclusions are presented in Section 6.

1.2 Notation

In this work, by |$C \lesssim D$| we mean that |$C$| can be bounded by a multiple of |$D$|, independently of parameters that |$C$| and |$D$| may depend on. Obviously, |$C \gtrsim D$| is defined as |$D \lesssim C$| and |$C\eqsim D$| as |$C\lesssim D$| and |$C \gtrsim D$|.

For normed linear spaces |$E$| and |$F$|, by |$\mathcal L(E,F)$| we denote the normed linear space of bounded linear mappings |$E \rightarrow F$|, and by |${\mathcal{L}}\textrm{is}(E,F)$| its subset of boundedly invertible linear mappings |$E \rightarrow F$|. We write |$E \hookrightarrow F$| to denote that |$E$| is continuously embedded into |$F$|. For simplicity only, we exclusively consider linear spaces over the scalar field |$\mathbb R$|.

For linear spaces |$E$| and |$F$|⁠, sequences |$\varPhi =(\phi _j)_{j \in J} \subset E$|⁠, |$\varPsi =(\psi _i)_{i \in I} \subset F$|⁠, |$f \in F^{\ast }$| and a linear |$A\colon E \rightarrow F^{\ast }$|⁠, we define the column vector |$f(\varPsi ):=[f(\psi _i)]_{i \in I}$| and matrix |$(A \varPhi )(\varPsi ):=[(A \phi _j)(\psi _i)]_{i \in I,\,j \in J}$|⁠. If |$E=F$| is an inner product space, then with |$R\colon E \rightarrow E^{\prime }$| denoting the Riesz map, we set |$\langle \varPsi ,\varPhi \rangle :=(R \varPhi )(\varPsi )=[(R \phi _j)(\psi _i)]_{i \in I,\,j \in J}= [\langle \psi _i,\phi _j\rangle ]_{i \in I,\,j \in J}$|⁠.
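The index convention above (rows follow the test collection |$\varPsi$|, columns follow the trial collection |$\varPhi$|) can be illustrated in a finite-dimensional setting. The following is only a sketch under assumptions that are not part of the text: we take |$E=F=\mathbb R^3$|, represent functionals by vectors through the Euclidean inner product, and store collections as matrix columns. The helper names `functional_vector` and `operator_matrix` are hypothetical.

```python
import numpy as np

def functional_vector(f, Psi):
    """Return f(Psi) = [f(psi_i)]_i; f is a vector, psi_i are the columns of Psi."""
    return Psi.T @ f

def operator_matrix(A, Phi, Psi):
    """Return (A Phi)(Psi) = [(A phi_j)(psi_i)]_{i,j}: row i <-> psi_i, column j <-> phi_j."""
    return Psi.T @ A @ Phi

# Illustrative data (assumed, not from the text).
A = np.array([[2.0, 1.0, 0.0],
              [0.0, 3.0, 1.0],
              [0.0, 0.0, 4.0]])
Phi = np.array([[1.0, 0.0],
                [0.0, 1.0],
                [0.0, 0.0]])            # two trial vectors phi_1, phi_2 as columns
Psi = np.eye(3)                         # three test vectors psi_1, psi_2, psi_3
f = np.array([1.0, -1.0, 2.0])

mat = operator_matrix(A, Phi, Psi)      # shape (#Psi, #Phi)
vec = functional_vector(f, Psi)         # column vector f(Psi)
gram = operator_matrix(np.eye(3), Phi, Phi)  # <Phi, Phi> with the Riesz map R = identity
```

With this layout, a Galerkin system with trial basis |$\varPhi$| and test basis |$\varPsi$| takes the usual form "matrix times coefficient vector equals load vector".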

2. Space–time formulations of the parabolic evolution problem

Let |$V,H$| be separable Hilbert spaces of functions on some ‘spatial domain’ such that |$V \hookrightarrow H$| with dense and compact embedding. Identifying |$H$| with its dual, we obtain the Gelfand triple |$V \hookrightarrow H \simeq H^{\prime } \hookrightarrow V^{\prime }$|⁠.

We use the notation |$\langle \cdot ,\cdot \rangle $| to denote both the scalar product on |$H \times H$| and its unique extension by continuity to the duality pairing on |$V^{\prime } \times V$|. Correspondingly, the norm on |$H$| is denoted by |$\|\cdot \|$|.

For a.e.
$$\begin{equation*} t \in I:=(0,T), \end{equation*}$$
let |$a(t;\cdot ,\cdot )$| denote a bilinear form on |$V \times V$| such that for any |$\eta ,\zeta \in V$|⁠, |$t \mapsto a(t;\eta ,\zeta )$| is measurable on |$I$|⁠, and such that for a.e. |$t\in I$|⁠,
$$\begin{alignat}{3} |a(t;\eta,\zeta)| & \lesssim \|\eta\|_{V} \|\zeta\|_{V} \quad &&(\eta,\zeta \in V) \quad &&\text{(boundedness)}, \end{alignat}$$
(2.1)
$$\begin{alignat}{3} a(t;\eta,\eta) &\gtrsim \|\eta\|_{V}^2 \quad &&\;\;\quad(\eta \in{V}) &&\qquad\text{(coercivity)}. \end{alignat}$$
(2.2)
With |$A(t) \in{\mathcal{L}}\textrm{is}({V},V^{\prime })$| being defined by |$ (A(t) \eta )(\zeta )=a(t;\eta ,\zeta )$|⁠, we are interested in solving the parabolic initial value problem to find |$u$| such that
$$\begin{equation} \left\{\!\! \begin{array}{rl} \frac{\textrm{d} u}{\textrm{d} t}(t) +A(t) u(t)&\!\!\!= g(t) \quad(t \in I),\\ u(0) &\!\!\!= u_0. \end{array} \right. \end{equation}$$
(2.3)

 

Remark 2.1

With |$\tilde{u}(t):=u(t) e^{-\varrho t}$|, (2.3) is equivalent to |$ \frac{\textrm{d} \tilde{u}}{\textrm{d} t}(t) +(A(t)+\varrho \textrm{Id}) \tilde{u}(t)= g(t)e^{-\varrho t}$| (|$t \in I$|), |$\tilde{u}(0) = u_0$|. So if initially |$a(t;\eta ,\eta )$| is not coercive but only satisfies a Gårding inequality |$ a(t;\eta ,\eta ) + \varrho \langle \eta ,\eta \rangle \gtrsim \|\eta \|_{V}^2$| (|$\eta \in{V}$|), then one can consider a transformed problem such that (2.2) is valid.
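For completeness, the computation behind this remark can be written out as follows. It uses only the product rule and (2.3); the symbol |$\tilde{a}$| for the shifted bilinear form is introduced here for illustration.

```latex
\begin{align*}
\frac{\mathrm{d}\tilde{u}}{\mathrm{d}t}(t)
  &= e^{-\varrho t}\,\frac{\mathrm{d}u}{\mathrm{d}t}(t) - \varrho\, e^{-\varrho t} u(t) \\
  &= e^{-\varrho t}\big(g(t) - A(t)u(t)\big) - \varrho\,\tilde{u}(t) \\
  &= g(t)\,e^{-\varrho t} - \big(A(t) + \varrho\,\mathrm{Id}\big)\tilde{u}(t),
\\[1ex]
\tilde{a}(t;\eta,\eta)
  &:= a(t;\eta,\eta) + \varrho\,\langle \eta,\eta\rangle \gtrsim \|\eta\|_V^2
  \qquad (\eta \in V).
\end{align*}
```

So the transformed problem is governed by the bilinear form |$\tilde{a}$|, which is coercive exactly when the Gårding inequality holds, and the initial value is unchanged because |$e^{-\varrho \cdot 0}=1$|.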

In a simultaneous space–time variational formulation, the parabolic partial differential equation (PDE) reads as follows: find |$u$| from a suitable space of functions of time and space such that
$$\begin{equation} (Bw)(v):=\int_I \big\langle{ \textstyle \frac{\textrm{d} w}{\textrm{d}t}}(t), v(t)\big\rangle + a(t;w(t),v(t))\, \textrm{d}t = \int_I \langle g(t), v(t)\rangle\, \textrm{d}t =:g(v) \end{equation}$$
(2.4)
for all |$v$| from another suitable space of functions of time and space. One possibility to enforce the initial condition is by testing it against additional test functions. A proof of the following result can be found in Schwab & Stevenson (2009); cf. Dautray & Lions (1992, Ch. XVIII, Section 3) and Wloka (1982, Ch. IV, Section 26) for slightly different statements.

 

Theorem 2.2
With |$X:=L_2(I;{V}) \cap H^1(I;V^{\prime })$|⁠, |$Y:=L_2(I;{V})$|⁠, under conditions (2.1) and (2.2) it holds that
$$\begin{equation} \left[\begin{array}{@{}c@{}} B \\ \gamma_0\end{array} \right]\in{\mathcal{L}}\textrm{is} (X,Y^{\prime} \times H), \end{equation}$$
(2.5)
where for |$t \in \bar{I}$|, |$\gamma _t\colon u \mapsto u(t,\cdot )$| denotes the trace map. That is, assuming |$g \in Y^{\prime }$| and |$u_0 \in H$|, finding |$u \in X$| such that
$$\begin{equation} (Bu)(v_1)+\langle u(0,\cdot),v_2\rangle=g(v_1)+\langle u_0,v_2\rangle\quad ((v_1,v_2) \in Y \times H) \end{equation}$$
(2.6)
is a well-posed variational formulation of (2.3).

One ingredient of the proof of this theorem is the continuity of the embedding |$X \hookrightarrow C(\bar{I},H)$|⁠, in particular implying that for any |$t \in \bar{I}$|⁠, |$\gamma _t \in \mathcal L(X,H)$|⁠.

Defining |$A, A_s \in{\mathcal{L}}\textrm{is} (Y,Y^{\prime })$| (here (2.2) is used), |$A_a \in \mathcal L(Y,Y^{\prime })$| and |$C, \partial _t \in \mathcal L(X,Y^{\prime })$| by
$$\begin{align*} (Au)(v)&:=\int_I a(t;u(t),v(t))\,\textrm{d}t,\quad A_s:={\textstyle \frac12}(A+A^{\prime}), \quad A_a:={\textstyle \frac12}(A-A^{\prime}),\\ C&:=B-A_s,\quad \partial_t:=B-A, \end{align*}$$
an equivalent well-posed variational formulation of the parabolic PDE is obtained by applying the so-called Brézis–Ekeland–Nayroles variational principle (Brézis & Ekeland, 1976; Nayroles, 1976); cf. also Andreev (2012, Section 3.2.4). It reads
$$\begin{equation} (C^{\prime} A_s^{-1} C+A_s+\gamma_T^{\prime} \gamma_T)u=(\textrm{Id} +C^{\prime} A_s^{-1})g +\gamma_0^{\prime} u_0, \end{equation}$$
(2.7)
where the operator on the left-hand side is in |${\mathcal{L}}\textrm{is}(X,X^{\prime })$|⁠, is self-adjoint and coercive.
We provide a direct proof of these facts. Since
$$\left [\begin{smallmatrix} A_s & 0 \\ 0 & \textrm{Id} \end{smallmatrix}\right ] \in{\mathcal{L}}\textrm{is}(Y\times H,Y^{\prime }\times H),$$
an equivalent formulation of (2.5) as a self-adjoint saddle point equation reads as follows: find |$(\mu ,\sigma ,u) \in Y\times H\times X$| (where |$\mu $| and |$\sigma $| will be zero) such that
$$\begin{align} \left[\begin{array}{@{}ccc@{}} A_s & 0 & B\\ 0 & \textrm{Id} & \gamma_0\\ B^{\prime} & \gamma_0^{\prime} & 0\end{array}\right] \left[\begin{array}{@{}c@{}} \mu \\ \sigma \\ u \end{array}\right]&= \left[\begin{array}{@{}c@{}} g \\ u_0 \\ 0 \end{array}\right] \end{align}$$
(2.8)
or
$$\begin{align} (B^{\prime} A_s^{-1} B+\gamma_0^{\prime}\gamma_0)u&=B^{\prime} A_s^{-1}g+\gamma_0^{\prime} u_0. \end{align}$$
(2.9)
Thanks to (2.5), this Schur complement |$B^{\prime } A_s^{-1} B+\gamma _0^{\prime}\gamma _0$| is in |${\mathcal{L}} \textrm{is} (X,X^{\prime })$|⁠, is self-adjoint and coercive.
We show that (2.9) and (2.7) are equal. Recalling the definitions of |$C$| and |$\partial _t$|⁠, note that the right-hand sides of both equations are the same, and that
$$\begin{equation*} B^{\prime} A_s^{-1} B+\gamma_0^{\prime}\gamma_0=C^{\prime} A_s^{-1} C+A_s+C+C^{\prime}+\gamma_0^{\prime}\gamma_0=C^{\prime} A_s^{-1} C+A_s+\partial_t+\partial_t^{\prime}+\gamma_0^{\prime}\gamma_0 \end{equation*}$$
thanks to |$A_a^{\prime }=-A_a$|⁠. The proof of our claim is completed by noting that for |$w,v \in X$|⁠,
$$\begin{align*} \big(\big(\partial_t+\partial_t^{\prime}+\gamma_0^{\prime}\gamma_0\big)w\big)(v)&= \int_I \big\langle{ \textstyle \frac{\textrm{d} w}{\textrm{d}t}}(t), v(t)\big\rangle+\big\langle w(t), { \textstyle \frac{\textrm{d} v}{\textrm{d}t}}(t)\big\rangle\,\textrm{d}t+\langle w(0), v(0)\rangle\\ &= \int_I{\textstyle \frac{\textrm{d}}{\textrm{d}t}} \langle w(t),v(t)\rangle\,\textrm{d}t+\langle w(0),v(0)\rangle=(\gamma_T^{\prime}\gamma_Tw)(v). \end{align*}$$
As (2.9) was obtained as the Schur complement equation of (2.8), in its form (2.7) it is naturally obtained as the Schur complement of the problem of finding |$(\lambda ,u)\in Y \times X$| such that
$$\begin{equation} \left[\begin{array}{@{}cc@{}} A_s & C \\ C^{\prime} &\! - (A_s+\gamma_T^{\prime} \gamma_T) \end{array}\right] \left[\begin{array}{@{}c@{}} \lambda \\ u \end{array}\right]= \left[\begin{array}{@{}c@{}} g \\ -(g+\gamma_0^{\prime} u_0) \end{array}\right]. \end{equation}$$
(2.10)
Knowing that its Schur complement is in |${\mathcal{L}} \textrm{is} (X,X^{\prime })$|⁠, |$A_s \in{\mathcal{L}} \textrm{is} (Y,Y^{\prime })$| and |$C \in \mathcal L(X,Y^{\prime })$|⁠, we infer that the self-adjoint operator on the left-hand side of (2.10) is in |${\mathcal{L}} \textrm{is} (Y\times X,Y^{\prime }\times X^{\prime })$|⁠.
Substituting |$C=B-A_s$| and |$Bu=g$|⁠, we find that the secondary variable satisfies
$$\begin{equation*} \lambda=u. \end{equation*}$$
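The two identities used above, namely |$\partial _t+\partial _t^{\prime }+\gamma _0^{\prime }\gamma _0=\gamma _T^{\prime }\gamma _T$| and the equality of the operators in (2.7) and (2.9), can be checked on matrices. The sketch below is an illustration under assumptions that do not come from the text: the scalar model problem |$u^{\prime }+\alpha u=g$| (so |$V=H=\mathbb R$| and |$A_a=0$|), discretized with continuous piecewise linears on a uniform time grid for both variables. The function name `p1_time_matrices` and all parameter values are hypothetical.

```python
import numpy as np

def p1_time_matrices(T=1.0, N=8, alpha=2.0):
    """Assemble A_s and the time-derivative matrix for P1 elements on a uniform grid of (0, T).

    Assumed model problem (not from the text): u' + alpha*u = g with V = H = R,
    i.e. a(t; eta, zeta) = alpha * eta * zeta, hence A = A_s and A_a = 0.
    """
    h = T / N
    n = N + 1
    M = np.zeros((n, n))    # mass matrix:        int_I phi_j  phi_i dt
    D = np.zeros((n, n))    # derivative matrix:  int_I phi_j' phi_i dt
    for k in range(N):      # loop over the intervals [t_k, t_{k+1}]
        idx = [k, k + 1]
        M[np.ix_(idx, idx)] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        D[np.ix_(idx, idx)] += 0.5 * np.array([[-1.0, 1.0], [-1.0, 1.0]])
    return alpha * M, D

A_s, D = p1_time_matrices()
n = A_s.shape[0]
G0 = np.zeros((n, n)); G0[0, 0] = 1.0      # matrix of gamma_0' gamma_0 (nodal value at t = 0)
GT = np.zeros((n, n)); GT[-1, -1] = 1.0    # matrix of gamma_T' gamma_T (nodal value at t = T)

B = D + A_s                                # matrix of B = d/dt + A
C = B - A_s                                # matrix of C = B - A_s (equals D because A_a = 0)

lhs_29 = B.T @ np.linalg.solve(A_s, B) + G0          # operator in (2.9)
lhs_27 = C.T @ np.linalg.solve(A_s, C) + A_s + GT    # operator in (2.7)

integration_by_parts_ok = np.allclose(D + D.T + G0, GT)
schur_complements_agree = np.allclose(lhs_29, lhs_27)
```

Both flags evaluate to `True` in exact arithmetic for this model, since the integration-by-parts identity holds for any pair of continuous piecewise linear functions. The sketch says nothing about the general case |$A_a \neq 0$| or about different trial spaces for the two variables.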

 

Remark 2.3

When reading |$\gamma _T^{\prime } \gamma _T$| as |$\partial _t+\partial _t^{\prime }+\gamma _0^{\prime}\gamma _0$|⁠, system (2.10) has remarkable similarities to a certain preconditioned version presented in Neumüller & Smears (2019) of a discretized parabolic PDE using the implicit Euler method in time. Ideas concerning optimal preconditioning developed in that paper, as well as those in Andreev (2016), can be expected to be applicable to Galerkin discretizations of (2.10).

 

Remark 2.4
In equations (2.8) and (2.9), the operator |$A_s$| can be replaced by a general self-adjoint |$\tilde{A}_s \in{\mathcal{L}} \textrm{is} (Y,Y^{\prime })$|⁠. With |$\tilde{C}:=B-\tilde{A}_s$|⁠, the equivalent equation (2.7) then reads
$$\begin{equation*} (\tilde{C}^{\prime} \tilde{A}_s^{-1} \tilde{C}+2 A_s-\tilde{A}_s+\gamma_T^{\prime} \gamma_T)u=(\textrm{Id} +\tilde{C}^{\prime} \tilde{A}_s^{-1})g +\gamma_0^{\prime} u_0, \end{equation*}$$
and (2.10) as
$$\begin{equation*} \left[\begin{array}{@{}cc@{}} \tilde{A}_s & \tilde{C} \\ \tilde{C}^{\prime} & -(2 A_s-\tilde{A}_s+\gamma_T^{\prime} \gamma_T) \end{array}\right] \left[\begin{array}{@{}c@{}} \lambda \\ u \end{array}\right]= \left[\begin{array}{@{}c@{}} g \\ -(g+\gamma_0^{\prime} u_0) \end{array}\right], \end{equation*}$$
with solution |$\lambda =u$|⁠.
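The origin of the term |$2A_s-\tilde{A}_s$| in Remark 2.4 can be traced by repeating the computation that led from (2.9) to (2.7). The following derivation is a sketch that uses only |$B=\tilde{C}+\tilde{A}_s$|, |$B=\partial _t+A$|, the self-adjointness of |$\tilde{A}_s$| and the identity |$\partial _t+\partial _t^{\prime }+\gamma _0^{\prime }\gamma _0=\gamma _T^{\prime }\gamma _T$|.

```latex
\begin{align*}
B' \tilde{A}_s^{-1} B + \gamma_0'\gamma_0
  &= (\tilde{C}' + \tilde{A}_s)\,\tilde{A}_s^{-1}\,(\tilde{C} + \tilde{A}_s) + \gamma_0'\gamma_0 \\
  &= \tilde{C}' \tilde{A}_s^{-1} \tilde{C} + \tilde{C} + \tilde{C}' + \tilde{A}_s + \gamma_0'\gamma_0, \\
\tilde{C} + \tilde{C}'
  &= B + B' - 2\tilde{A}_s
   = \partial_t + \partial_t' + 2A_s - 2\tilde{A}_s, \\
B' \tilde{A}_s^{-1} B + \gamma_0'\gamma_0
  &= \tilde{C}' \tilde{A}_s^{-1} \tilde{C} + 2A_s - \tilde{A}_s + \gamma_T'\gamma_T, \\
B' \tilde{A}_s^{-1} g
  &= (\tilde{C}' + \tilde{A}_s)\,\tilde{A}_s^{-1} g
   = (\mathrm{Id} + \tilde{C}' \tilde{A}_s^{-1})\, g .
\end{align*}
```

For |$\tilde{A}_s=A_s$| the term |$2A_s-\tilde{A}_s$| reduces to |$A_s$|, which recovers (2.7).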

In the next section, we study Galerkin discretizations of equations (2.8) and (2.10), which then are no longer equivalent.

Since the secondary variables |$\mu $| and |$\sigma $| in (2.8) are zero, the subspaces for their approximation do not have to satisfy any approximation properties. Since the secondary variable |$\lambda $| in (2.10) is nonzero, the subspace of |$Y$| for its approximation has to satisfy approximation properties, and the error in its best approximation enters the upper bound for the error in the primal variable |$u$|⁠.

On the other hand, (uniform) stability will be easier to realize with equation (2.10) and will also be proven to hold true for |$A_a \neq 0$|⁠; the system matrix will be more sparse and the number of unknowns will be smaller.

In order to facilitate the derivation of some quantitative results, we equip the spaces |$Y$| and |$X$| with the ‘energy norms’ defined by
$$\begin{equation*} \|v\|^2_Y:=(A_s v)(v),\quad \|u\|_{X}^2:=\|u\|_{Y}^2 +\|\partial_t u\|_{Y^{\prime}}^2 + \|u(T)\|^2, \end{equation*}$$
which are equivalent to the standard norms on these spaces. Correspondingly, orthogonality in |$Y$| is interpreted w.r.t. the ‘energy scalar product’ |$(A_s\cdot )(\cdot )$|⁠.

3. Stable discretizations of the parabolic problem

3.1 Uniformly stable (Petrov–) Galerkin discretizations and quasi-optimal approximations

This subsection is devoted to proving the following theorem.

 

Theorem 3.1
Let |$W$| and |$Z$| be Hilbert spaces, and |$F \in{\mathcal{L}} \textrm{is} (Z,W^{\prime })$|. Let |$(W^{\delta },Z^{\delta })_{\delta \in \varDelta }$| be a family of closed subspaces of |$W \times Z$| such that for each |$\delta \in \varDelta $| it holds that |${E_W^{\delta }}^{\prime } F E_Z^{\delta } \in{\mathcal{L}}\textrm{is} (Z^{\delta },{W^{\delta }}^{\prime })$|, where |$E_W^{\delta }\colon W^{\delta } \rightarrow W$|, |$E_Z^{\delta }\colon Z^{\delta } \rightarrow Z$| denote the trivial embeddings. Then the collection |$(z^{\delta })_{\delta \in \varDelta }$| of Petrov–Galerkin approximations to |$z \in Z$|, determined by |${E_W^{\delta }}^{\prime } F E_Z^{\delta } z^{\delta }={E_W^{\delta }}^{\prime } F z$|, is quasi-optimal, i.e., |$\|z-z^{\delta }\|_Z \lesssim \inf _{\bar{z}^{\delta } \in Z^{\delta }} \|z-\bar{z}^{\delta } \|_Z$|, uniformly in |$z \in Z$| and |$\delta \in \varDelta $|, if and only if
$$\begin{equation*} \inf_{\delta \in \varDelta}\inf_{0 \neq z \in Z^{\delta}} \sup_{0 \neq w \in W^{\delta}} \frac{|(Fz)(w)|}{\|z\|_Z \|w\|_W}>0\qquad\textrm{(uniform stability)}. \end{equation*}$$

 

Proof.
The mapping |$P^{\delta }:=z \mapsto z^{\delta }=E_Z^{\delta } ({E_W^{\delta }}^{\prime } F E_Z^{\delta })^{-1} {E_W^{\delta }}^{\prime } F z$| is a projector. For |$\{0\} \subsetneq Z^{\delta } \subsetneq Z$|⁠, it holds that |$P^{\delta } \not \in \{0,\textrm{Id}\}$|⁠, and consequently, |$\|\textrm{Id} -P^{\delta }\|_{\mathcal L(Z,Z)}=\|P^{\delta }\|_{\mathcal L(Z,Z)}$| (see Kato, 1960; Xu & Zikatanov, 2003). We obtain
$$\begin{equation} \begin{split} \sup_{z \in Z\setminus Z^{\delta}}\frac{\|z-z^{\delta}\|_Z}{\inf_{\bar{z}^{\delta} \in Z^{\delta}}\|z-\bar{z}^{\delta}\|_Z}&= \sup_{z \in Z\setminus Z^{\delta}}\sup_{\bar{z}^{\delta} \in Z^{\delta}}\frac{\|(I-P^{\delta})z\|_Z}{\|z-\bar{z}^{\delta}\|_Z}\\ &= \sup_{0 \neq \bar{z} \in Z}\frac{\|(I-P^{\delta})\bar{z}\|_Z}{\|\bar{z}\|_Z}=\|P^{\delta}\|_{\mathcal L(Z,Z)}. \end{split} \end{equation}$$
(3.1)
It remains to show that |$\|P^{\delta }\|_{\mathcal L(Z,Z)}$| is uniformly bounded if and only if uniform stability is valid.
The definition of |$P^{\delta }$| shows that
$$\begin{equation*} \|F^{-1}\|_{\mathcal L(W^{\prime},Z)}^{-1} \leq \frac{\|P^{\delta}\|_{\mathcal L(Z,Z)}}{\|E_Z^{\delta} ({E_W^{\delta}}^{\prime} F E_Z^{\delta})^{-1} {E_W^{\delta}}^{\prime}\|_{\mathcal L(W^{\prime},Z)}} \leq \|F\|_{\mathcal L(Z,W^{\prime})}. \end{equation*}$$
Further, we have
$$\begin{equation*} \|E_Z^{\delta} ({E_W^{\delta}}^{\prime} F E_Z^{\delta})^{-1} {E_W^{\delta}}^{\prime}\|_{\mathcal L(W^{\prime},Z)}=\| ({E_W^{\delta}}^{\prime} F E_Z^{\delta})^{-1} {E_W^{\delta}}^{\prime}\|_{\mathcal L(W^{\prime},Z^{\delta})}=\| ({E_W^{\delta}}^{\prime} F E_Z^{\delta})^{-1}\|_{\mathcal L({W^{\delta}}^{\prime},Z^{\delta})}, \end{equation*}$$
where the last equality follows from |$\|{E_W^{\delta }}^{\prime }\|_{\mathcal L(W^{\prime },{W^{\delta }}^{\prime })} \leq 1$| and, for the other direction, from the fact that for given |$f^{\delta } \in{W^{\delta }}^{\prime }$| the function |$f \in W^{\prime }$| defined by |$f|_{W^{\delta }}:=f^{\delta }$| and |$f|_{(W^{\delta })^{\perp }}:=0$| satisfies |$\|\,f\|_{W^{\prime }}=\|\,f^{\delta }\|_{{W^{\delta }}^{\prime }}$| and |$f^{\delta }={E_W^{\delta }}^{\prime } f$|⁠.
The proof is completed by
$$\begin{equation} \|({E_W^{\delta}}^{\prime} F E_Z^{\delta})^{-1}\|_{\mathcal L({W^{\delta}}^{\prime},Z^{\delta})}^{-1}=\inf_{0 \neq z \in Z^{\delta}} \sup_{0 \neq w \in W^{\delta}} \frac{|(Fz)(w)|}{\|z\|_Z \|w\|_W}. \end{equation}$$
(3.2)

 

Remark 3.2
In particular, the above analysis provides a short self-contained proof of the quantitative results
$$\begin{equation*} \|F^{-1}\|_{\mathcal L(W^{\prime},Z)}^{-1} \leq \Big(\sup_{z \in Z\setminus Z^{\delta}}\frac{\|z-z^{\delta}\|_Z}{\inf_{\bar{z}^{\delta} \in Z^{\delta}}\|z-\bar{z}^{\delta}\|_Z}\Big)\, \Big(\inf_{0 \neq z \in Z^{\delta}} \sup_{0 \neq w \in W^{\delta}} \frac{|(Fz)(w)|}{\|z\|_Z \|w\|_W}\Big) \leq \|F\|_{\mathcal L(Z,W^{\prime})} \end{equation*}$$
that were established earlier in Tantardini & Veeser (2016, Section 2.1, in particular (2.12)).
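The chain of equalities and bounds in the proof of Theorem 3.1 can be observed numerically in a finite-dimensional setting. The sketch below is an illustration only: the spaces are |$\mathbb R^n$| with Euclidean norms, |$F$| is a randomly generated well-conditioned matrix, and the subspaces are spanned by orthonormal columns. All of these choices and the variable names are assumptions made here. It checks that |$P^{\delta }$| is a projector with |$\|\textrm{Id}-P^{\delta }\|=\|P^{\delta }\|$|, and that |$\|P^{\delta }\|$| times the inf-sup constant from (3.2) lies between |$\|F^{-1}\|^{-1}$| and |$\|F\|$|.

```python
import numpy as np

rng = np.random.default_rng(0)
n, k = 8, 3                      # dim Z = dim W = n, dim Z^delta = dim W^delta = k

# Assumed boundedly invertible F: Z -> W' (a well-conditioned nonsymmetric matrix).
F = 5.0 * np.eye(n) + 0.5 * rng.standard_normal((n, n))

# Orthonormal bases (as columns) of the trial space Z^delta and the test space W^delta.
Qz, _ = np.linalg.qr(rng.standard_normal((n, k)))
Qw, _ = np.linalg.qr(F @ Qz + 0.05 * rng.standard_normal((n, k)))

M = Qw.T @ F @ Qz                        # Galerkin matrix E_W' F E_Z
G = Qz @ np.linalg.solve(M, Qw.T)        # E_Z (E_W' F E_Z)^{-1} E_W'
P = G @ F                                # Petrov-Galerkin projector z -> z^delta

inf_sup = np.linalg.svd(M, compute_uv=False).min()   # right-hand side of (3.2)
norm_P = np.linalg.norm(P, 2)
norm_I_minus_P = np.linalg.norm(np.eye(n) - P, 2)

sing_F = np.linalg.svd(F, compute_uv=False)
lower, upper = sing_F.min(), sing_F.max()            # ||F^{-1}||^{-1} and ||F||
product = norm_P * inf_sup                           # ||P|| divided by ||G||
```

The test space here is chosen as a small perturbation of |$F Z^{\delta }$| so that the Galerkin matrix is safely invertible. Other choices of |$W^{\delta }$| change the size of the inf-sup constant but not the relations being checked.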

3.2 Uniformly stable Galerkin discretizations of (2.10)

Let |$Y^{\delta } \times X^{\delta }$| be a closed subspace of |$Y \times X$|⁠, and let |$E_Y^{\delta }\colon Y^{\delta } \rightarrow Y$| and |$E_X^{\delta }\colon X^{\delta } \rightarrow X$| denote the trivial embeddings. Since |${E^{\delta }_Y}^{\prime }A_s E^{\delta }_Y \in{\mathcal{L}} \textrm{is} (Y^{\delta },{Y^{\delta }}^{\prime })$| (as well as being an isometry), the Galerkin operator resulting from (2.10) can be factorized as
$$\begin{equation} \begin{split} &\left[\begin{array}{@{}cc@{}} {E^{\delta}_Y}^{\prime}A_s E^{\delta}_Y & {E^{\delta}_Y}^{\prime}C E^{\delta}_X \\ ({E^{\delta}_Y}^{\prime}C E^{\delta}_X)^{\prime} & -{E^{\delta}_X}^{\prime}(A_s+\gamma_T^{\prime} \gamma_T)E^{\delta}_X \end{array}\right] = \left[\begin{array}{@{}cc@{}} \textrm{Id} & 0 \\ ({E^{\delta}_Y}^{\prime}C E^{\delta}_X)^{\prime} ({E^{\delta}_Y}^{\prime}A_s E^{\delta}_Y)^{-1} & \textrm{Id}\end{array}\right] \\ &\quad\circ \left[\begin{array}{@{}cc@{}} {E^{\delta}_Y}^{\prime}A_s E^{\delta}_Y & 0 \\ 0 & -{E^{\delta}_X}^{\prime}(A_s+\gamma_T^{\prime} \gamma_T)E^{\delta}_X-({E^{\delta}_Y}^{\prime}C E^{\delta}_X)^{\prime} ({E^{\delta}_Y}^{\prime}A_s E^{\delta}_Y)^{-1}{E^{\delta}_Y}^{\prime}C E^{\delta}_X \end{array}\right] \\ &\quad\circ \left[\begin{array}{@{}cc@{}} \textrm{Id} & ({E^{\delta}_Y}^{\prime}A_s E^{\delta}_Y)^{-1}{E^{\delta}_Y}^{\prime}C E^{\delta}_X \\0 & \textrm{Id}\end{array}\right]. \end{split} \end{equation}$$
(3.3)
We conclude that this Galerkin operator is invertible if and only if the Schur complement
$$\begin{equation} {E^{\delta}_X}^{\prime}(A_s+\gamma_T^{\prime} \gamma_T)E^{\delta}_X+({E^{\delta}_Y}^{\prime}C E^{\delta}_X)^{\prime} ({E^{\delta}_Y}^{\prime}A_s E^{\delta}_Y)^{-1}{E^{\delta}_Y}^{\prime}C E^{\delta}_X \end{equation}$$
(3.4)
is invertible, which holds true for any |$X^{\delta } \neq \{0\}$|⁠.
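The block factorization (3.3) is the standard block-LDU factorization of a symmetric saddle point matrix, and it can be verified on small random matrices. In the sketch below the matrices `A`, `C` and `S` merely stand in for the Galerkin matrices of |$A_s$|, |$C$| and |$A_s+\gamma _T^{\prime }\gamma _T$|. They are random placeholders, not discretizations of an actual parabolic problem, and the sizes are arbitrary.

```python
import numpy as np

rng = np.random.default_rng(1)
n, m = 5, 3                      # n = dim Y^delta, m = dim X^delta (arbitrary sizes)

def random_spd(size):
    """Return a random symmetric positive definite matrix."""
    R = rng.standard_normal((size, size))
    return R @ R.T + size * np.eye(size)

A = random_spd(n)                # placeholder for E_Y' A_s E_Y
C = rng.standard_normal((n, m))  # placeholder for E_Y' C E_X
S = random_spd(m)                # placeholder for E_X' (A_s + gamma_T' gamma_T) E_X

K = np.block([[A, C], [C.T, -S]])        # matrix on the left-hand side of (3.3)
AinvC = np.linalg.solve(A, C)
schur = S + C.T @ AinvC                  # Schur complement (3.4)

# The three factors on the right-hand side of (3.3); A is symmetric, so (A^{-1}C)^T = C^T A^{-1}.
L = np.block([[np.eye(n), np.zeros((n, m))], [AinvC.T, np.eye(m)]])
Dg = np.block([[A, np.zeros((n, m))], [np.zeros((m, n)), -schur]])
U = np.block([[np.eye(n), AinvC], [np.zeros((m, n)), np.eye(m)]])

factorization_ok = np.allclose(L @ Dg @ U, K)
schur_min_eig = np.linalg.eigvalsh((schur + schur.T) / 2).min()
```

Because the two triangular factors are always invertible, invertibility of `K` reduces to that of the middle factor, and the Schur complement is positive definite as the sum of a positive definite and a positive semidefinite matrix.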

 

Theorem 3.3
Let |$(Y^{\delta },X^{\delta })_{\delta \in \varDelta }$| be a family of closed subspaces of |$Y \times X$| such that
$$\begin{equation} \gamma_{\varDelta}:=\inf_{\delta\in \varDelta}\inf_{\{u \in X^{\delta}\colon \partial_t u \neq 0\}} \sup_{0\neq v \in Y^{\delta}} \frac{(\partial_t u)(v)}{\|\partial_t u\|_{Y^{\prime}}\|v\|_Y}>0.\end{equation}$$
(3.5)
Let |$\rho =\rho _{\varDelta }$| be the root in |$[0,1)$| of
$$\begin{equation*} \gamma_{\varDelta}^2 (\rho^2-\rho)+\|A_a\|_{\mathcal L(Y,Y^{\prime})}^2(\rho-1)+\rho=0, \end{equation*}$$
and let
$$\begin{equation*} C_{\varDelta}:=\frac{(3+\|A_a\|_{\mathcal L(Y,Y^{\prime})}^2)(\sqrt{3}+\|A_a\|_{\mathcal L(Y,Y^{\prime})})}{(1-\rho_{\varDelta})\gamma_{\varDelta}^2}, \end{equation*}$$
so that |$C_{\varDelta }=3 \sqrt{3} \,\gamma _{\varDelta }^{-2}$| when |$\|A_a\|_{\mathcal L(Y,Y^{\prime })}=0$|⁠, and |$\lim _{\|A_a\|_{\mathcal L(Y,Y^{\prime })} \rightarrow \infty } C_{\varDelta }=\infty $|⁠. Then with |$\lambda =u$| and |$(\lambda ^{\delta },u^{\delta })$| denoting the solutions of (2.10) and its Galerkin discretization, respectively, it holds that
$$\begin{equation} \sqrt{\|\lambda -\lambda^{\delta}\|^2_Y+\|u -u^{\delta}\|^2_X} \leq C_{\varDelta} \inf_{(\bar{\lambda}^{\delta},\bar{u}^{\delta}) \in Y^{\delta}\times X^{\delta}} \sqrt{\|\lambda -\bar{\lambda}^{\delta}\|^2_Y+\|u -\bar{u}^{\delta}\|^2_X}. \end{equation}$$
(3.6)

 

Proof.
In view of the second inequality presented in Remark 3.2, we start with bounding the norm of the continuous operator. Using Young’s inequality, for |$(\lambda ,u) \in Y \times X$| we have
$$\begin{align*} &\|A_s\lambda +\partial_t u\|_{Y^{\prime}}^2+\|\partial_t^{\prime} \lambda -\big(A_s+\gamma_T^{\prime} \gamma_T\big)u\|^2_{X^{\prime}} \\ & \leq{\textstyle \frac{3}{2}} \|A_s\lambda\|_{Y^{\prime}}^2+3\|\partial_t u\|_{Y^{\prime}}^2+ {\textstyle \frac{3}{2}}\|\partial_t^{\prime} \lambda\|^2_{X^{\prime}}+3\|\big(A_s+\gamma_T^{\prime} \gamma_T\big)u\|^2_{X^{\prime}}\\ &\leq{\textstyle \frac{3}{2}}\big(\|\lambda\|_{Y}^2+\|\lambda\|_{Y}^2\big)+3 \big(\|\partial_t u\|_{Y^{\prime}}^2+\|u\|^2_Y+\|u(T)\|^2\big)=3\big(\|\lambda\|_Y^2+\|u\|_{X}^2\big). \end{align*}$$
Together with |$\|A_a u\|_{Y^{\prime }}^2+\|A_a^{\prime } \lambda \|_{X^{\prime }}^2\leq \|A_a\|^2_{\mathcal L(Y,Y^{\prime })}(\|\lambda \|_Y^2+\|u\|_X^2)$|⁠, it shows that
$$\begin{align*} &\Big\| \left[\begin{array}{@{}cc@{}} A_s & C \\ C^{\prime} & -(A_s+\gamma_T^{\prime} \gamma_T) \end{array}\right]\Big\|_{\mathcal L(Y \times X,Y^{\prime} \times X^{\prime})}\\ &\leq \Big\| \left[\begin{array}{@{}cc@{}} A_s & \partial_t \\ \partial_t ^{\prime} & -(A_s+\gamma_T^{\prime} \gamma_T) \end{array}\right]\Big\|_{\mathcal L(Y \times X,Y^{\prime} \times X^{\prime})}+ \Big\| \left[\begin{array}{@{}cc@{}} 0 & A_a \\ A_a^{\prime} & 0 \end{array}\right]\Big\|_{\mathcal L(Y \times X,Y^{\prime} \times X^{\prime})}\\ &\leq \sqrt{3}+\|A_a\|_{\mathcal L(Y,Y^{\prime})}. \end{align*}$$
To bound, in view of (3.2), the norm of the inverse of the Galerkin operator, we use the block-LDU factorization (3.3). With |$r:=(1+\|A_a\|_{\mathcal L(Y,Y^{\prime })}^2)$|⁠, for |$u \in X$| it holds that
$$\begin{equation*} \|C u\|_{Y^{\prime}} \leq \|\partial_t u\|_{Y^{\prime}}+\|A_a\|_{\mathcal L(Y,Y^{\prime})} \|u\|_{Y} \leq \sqrt{r}\, \|u\|_X. \end{equation*}$$
Together with the fact that |${E_Y^{\delta }}^{\prime } A_s E_Y^{\delta } \in{\mathcal{L}}\textrm{is}(Y^{\delta },{Y^{\delta }}^{\prime })$| is an isometry and again Young’s inequality, it shows that for |$(\lambda ,u) \in Y^{\delta }\times X^{\delta }$|⁠,
$$\begin{align*} \|\lambda-\big({E_Y^{\delta}}^{\prime} A_s E_Y^{\delta}\big)^{-1} {E_Y^{\delta}}^{\prime} C E_X^{\delta} u\|_Y^2+\|u\|_X^2 &\leq (1+r)\|\lambda\|_Y^2+(1+r^{-1}) r \|u\|_X^2+ \|u\|_X^2\\ &\leq (2+r)\big(\|\lambda\|_Y^2+\|u\|_X^2\big) \end{align*}$$
or
$$\begin{equation*} \Big\| \left[\begin{array}{@{}cc@{}} \textrm{Id} & ({E^{\delta}_Y}^{\prime}A_s E^{\delta}_Y)^{-1}{E^{\delta}_Y}^{\prime}C E^{\delta}_X \\0 & \textrm{Id}\end{array}\right]^{-1} \Big\|_{\mathcal L(Y^{\delta} \times X^{\delta},Y^{\delta} \times X^{\delta})} \leq \sqrt{3+\|A_a\|_{\mathcal L(Y,Y^{\prime})}^2}. \end{equation*}$$
Obviously, the |$\mathcal L({Y^{\delta }}^{\prime } \times{X^{\delta }}^{\prime },{Y^{\delta }}^{\prime } \times{X^{\delta }}^{\prime })$|-norm of the inverse of the first factor on the right-hand side of (3.3) satisfies the same bound.
Moving to the second factor, we consider the Schur complement operator. From |$({E_Y^{\delta }}^{\prime } A_s E_Y^{\delta } \lambda )(\lambda ) =\|\lambda \|^2_Y$| for |$\lambda \in Y^{\delta }$|⁠, we have for |$f \in{Y^{\delta }}^{\prime }$|⁠, |$f(({E_Y^{\delta }}^{\prime } A_s E_Y^{\delta })^{-1}f) =\|({E_Y^{\delta }}^{\prime } A_s E_Y^{\delta })^{-1} f\|^2_Y=\|\,f\|_{{Y^{\delta }}^{\prime }}^2$|⁠, and so for |$u \in X^{\delta }$|⁠,
$$\begin{equation*} \Big(\big({E^{\delta}_Y}^{\prime}C E^{\delta}_X\big)^{\prime} \big({E^{\delta}_Y}^{\prime}A_s E^{\delta}_Y\big)^{-1}{E^{\delta}_Y}^{\prime}C E^{\delta}_X u\Big)(u)=\|{E^{\delta}_Y}^{\prime}C E^{\delta}_X u\|_{{Y^{\delta}}^{\prime}}^2. \end{equation*}$$
Using that for |$u \in X^{\delta }$|⁠,
$$\begin{equation*} \big\|{E_Y^{\delta}}^{\prime} \partial_t E_X^{\delta} u\big\|_{{Y^{\delta}}^{\prime}}^2=\Big(\sup_{0 \neq v \in Y^{\delta}}\frac{(\partial_t u)(v)}{\|v\|_Y}\Big)^2 \geq \gamma_{\varDelta}^2 \|\partial_t u\|_{Y^{\prime}}^2 \end{equation*}$$
and
$$\begin{equation*} \big\|{E_Y^{\delta}}^{\prime} A_a E_X^{\delta} u\big\|_{{Y^{\delta}}^{\prime}}^2 \leq \|A_a\|_{\mathcal L(Y,Y^{\prime})}^2 \|u\|_{Y}^2, \end{equation*}$$
Young’s inequality shows that
$$\begin{align*} \big\|{E^{\delta}_Y}^{\prime}C E^{\delta}_X u\big\|_{{Y^{\delta}}^{\prime}}^2 \geq (1-\rho_{\varDelta}) \gamma_{\varDelta}^2 \|\partial_t u\|_{Y^{\prime}}^2+(1-\rho_{\varDelta}^{-1}) \|A_a\|_{\mathcal L(Y,Y^{\prime})}^2 \|u\|_{Y}^2, \end{align*}$$
where we have assumed that |$\rho _{\varDelta }>0$|⁠, i.e., |$A_a \neq 0$|⁠. It follows that
$$\begin{align}\nonumber ((A_s+&\gamma_T^{\prime} \gamma_T)u)(u)+\big\|{E^{\delta}_Y}^{\prime}C E^{\delta}_X u\big\|_{{Y^{\delta}}^{\prime}}^2 \\ \nonumber &\geq(1+(1-\rho_{\varDelta}^{-1}) \|A_a\|_{\mathcal L(Y,Y^{\prime})}^2)\|u\|_Y^2+\|u(T)\|^2+(1-\rho_{\varDelta}) \gamma_{\varDelta}^2 \|\partial_t u\|_{Y^{\prime}}^2\\ &\geq (1-\rho_{\varDelta})\gamma_{\varDelta}^2 \|u\|_X^2, \end{align}$$
(3.7)
where we have used that |$1+(1-\rho _{\varDelta }^{-1}) \|A_a\|_{\mathcal L(Y,Y^{\prime })}^2=(1-\rho _{\varDelta }) \gamma _{\varDelta }^2$| by definition of |$\rho _{\varDelta }$|⁠. One easily verifies (3.7) also in the case that |$A_a=0$|⁠, i.e., |$\rho _{\varDelta }=0$|⁠.

Since |${E_Y^{\delta }}^{\prime } A_s E_Y^{\delta } \in{\mathcal{L}}\textrm{is}(Y^{\delta },{Y^{\delta }}^{\prime })$| is an isometry, and |$0<(1-\rho _{\varDelta })\gamma _{\varDelta }^2 \leq \gamma _{\varDelta }^2 \leq 1$|⁠, we conclude that the |$\mathcal L({Y^{\delta }}^{\prime }\times{X^{\delta }}^{\prime }, Y^{\delta }\times X^{\delta })$|-norm of the inverse of the second factor is bounded by |$(1-\rho _{\varDelta })^{-1}\gamma _{\varDelta }^{-2}$|⁠.

In view of the second inequality presented in Remark 3.2 in combination with (3.2), the proof is completed by collecting the bounds that were derived.
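To get a feeling for the size of the constant in Theorem 3.3, |$\rho _{\varDelta }$| and |$C_{\varDelta }$| can be evaluated for given values of |$\gamma _{\varDelta }$| and |$\|A_a\|_{\mathcal L(Y,Y^{\prime })}$|. The sketch below solves the quadratic equation for |$\rho _{\varDelta }$| in the rationalized form |$\rho =2a^2/(b+\sqrt{b^2+4\gamma ^2a^2})$| with |$a=\|A_a\|$| and |$b=a^2+1-\gamma ^2$|, which is algebraically the same root but avoids cancellation. The function names and the sample values are hypothetical.

```python
import math

def rho_delta(gamma, norm_Aa):
    """Root in [0, 1) of gamma^2 (rho^2 - rho) + norm_Aa^2 (rho - 1) + rho = 0."""
    a2 = norm_Aa ** 2
    if a2 == 0.0:
        return 0.0                       # A_a = 0 gives rho = 0
    b = a2 + 1.0 - gamma ** 2            # nonnegative because gamma <= 1
    s = math.sqrt(b * b + 4.0 * gamma ** 2 * a2)
    return 2.0 * a2 / (b + s)

def c_delta(gamma, norm_Aa):
    """Constant C_Delta of Theorem 3.3 for an inf-sup constant gamma in (0, 1]."""
    if not 0.0 < gamma <= 1.0:
        raise ValueError("gamma must lie in (0, 1]")
    rho = rho_delta(gamma, norm_Aa)
    return (3.0 + norm_Aa ** 2) * (math.sqrt(3.0) + norm_Aa) / ((1.0 - rho) * gamma ** 2)

# Sample evaluation (illustrative values only).
gamma, norm_Aa = 0.5, 1.0
rho = rho_delta(gamma, norm_Aa)
residual = gamma ** 2 * (rho ** 2 - rho) + norm_Aa ** 2 * (rho - 1.0) + rho
constant = c_delta(gamma, norm_Aa)
```

For |$A_a=0$| the function `c_delta` returns |$3\sqrt{3}\,\gamma _{\varDelta }^{-2}$|, in agreement with the statement of the theorem.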

3.3 Galerkin discretizations of (2.8)

Although it is likely to be possible to generalize results to the case of |$A_a \neq 0$|⁠, as in Andreev (2013) and Steinbach (2015) in this section we operate under the condition that
$$\begin{equation} A=A_s. \end{equation}$$
(3.8)
Following Steinbach (2015), for a given closed subspace |$Y^{\delta } \subseteq Y$| we define the ‘mesh-dependent’ norm on |$X$| by
$$\begin{equation*} \|u\|_{X,Y^{\delta}}^2:=\|u\|_{Y}^2 +\sup_{0 \neq v \in Y^{\delta}} \frac{(\partial_t u)(v)^2}{\|v\|_{Y}^2} + \|u(T)\|^2. \end{equation*}$$
Note that |$\|\,\|_{X,Y}=\|\,\|_{X}$|⁠.

The following result generalizes the ‘inf-sup identity’, known for |$Y^{\delta }=Y$| (see, e.g., Ern et al., 2017), to mesh-dependent norms.

 

Lemma 3.4
Assuming (3.8), then for |$u \in Y^{\delta } {{\cap X}}$|⁠,
$$\begin{equation*} \|u\|_{X,Y^{\delta}}^2=\sup_{0 \neq v \in Y^{\delta}} \frac{(Bu)(v)^2}{\|v\|_{Y}^2}+\|u(0)\|^2. \end{equation*}$$
If additionally |$\gamma _0 u\in H^{\delta }$|⁠, then
$$\begin{equation} \|u\|_{X,Y^{\delta}}^2=\sup_{0 \neq (v_1,v_2) \in Y^{\delta} \times H^{\delta}} \frac{((Bu)(v_1)+\langle u(0),v_2\rangle)^2}{\|v_1\|_{Y}^2+\|v_2\|^2}. \end{equation}$$
(3.9)

 

Proof.
Let |$y \in Y^{\delta }$| be defined by |$(A_s y)(v)=(\partial _t u)(v)$| (⁠|$v \in Y^{\delta }$|⁠). Then |$(A_s y)(y)=\sup _{0 \neq v \in Y^{\delta }} \frac{(\partial _t u)(v)^2}{\|v\|_{Y}^2}$|⁠. Furthermore, for |$v \in Y^{\delta }$|⁠, |$(Bu)(v)=(A_s(y+u))(v)$| and so, thanks to |$u \in Y^{\delta }$|⁠,
$$\begin{align*} \sup_{0 \neq v \in Y^{\delta}} \frac{(Bu)(v)^2}{\|v\|_{Y}^2}&=(A_s(y+u))(y+u)= (A_s y)(y)+2(A_s y)(u)+ (A_s u)(u)\\ &= (A_s y)(y)+2(\partial_t u)(u)+ (A_s u)(u)= \|u\|_{X,Y^{\delta}}^2-\|u(0)\|^2, \end{align*}$$
where we used that |$2\int _I \langle \partial _t u(t),u(t)\rangle \,\textrm{d}t=\|u(T)\|^2-\|u(0)\|^2$|⁠.
The second statement follows from
$$\begin{equation*} \sup_{0 \neq (v_1,v_2) \in Y^{\delta} \times H^{\delta}} \frac{((A_s(y+u))(v_1)+\langle u(0),v_2\rangle)^2}{\|v_1\|_{Y}^2+\|v_2\|^2}= (A_s(y+u))(y+u)+\|u(0)\|^2, \end{equation*}$$
thanks to |$u(0) \in H^{\delta }$|⁠.
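The first identity of Lemma 3.4 can be checked numerically in a scalar toy setting. The following minimal sketch assumes |$V=H=\mathbb R$| and |$A_s=a>0$|, so that |$\|v\|_Y^2=a\int_I v^2\,\mathrm{d}t$| (the energy norm that the proof implicitly uses), and it takes |$Y^{\delta}$| to be the continuous piecewise linears on a uniform partition of |$I$|. The function name and all parameter values are illustrative assumptions and not part of the analysis.

```python
import numpy as np

def check_infsup_identity(n=16, T=1.0, a=3.0, seed=0):
    """Return both sides of the identity of Lemma 3.4 for V = H = R, A_s = a."""
    h = T / n
    M = np.zeros((n + 1, n + 1))  # mass matrix of the hat functions
    D = np.zeros((n + 1, n + 1))  # D[i, j] = integral of phi_j' * phi_i
    for e in range(n):
        M[e:e + 2, e:e + 2] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        D[e:e + 2, e:e + 2] += 0.5 * np.array([[-1.0, 1.0], [-1.0, 1.0]])

    # Nodal values of an arbitrary u in Y^delta (hypothetical test function).
    c = np.random.default_rng(seed).standard_normal(n + 1)

    def dual_norm_sq(ell):
        # sup over v in Y^delta of ell(v)^2 / ||v||_Y^2, with ||v||_Y^2 = a * int v^2
        return ell @ np.linalg.solve(a * M, ell)

    # ||u||_Y^2 + sup (d_t u)(v)^2 / ||v||_Y^2 + |u(T)|^2
    lhs = a * (c @ M @ c) + dual_norm_sq(D @ c) + c[-1] ** 2
    # sup (Bu)(v)^2 / ||v||_Y^2 + |u(0)|^2, with (Bu)(v) = int (u' + a u) v
    rhs = dual_norm_sq((D + a * M) @ c) + c[0] ** 2
    return lhs, rhs

lhs, rhs = check_infsup_identity()
print(lhs, rhs)
```

The two printed numbers should agree up to rounding, because |$2 c^\top D c = u(T)^2-u(0)^2$| is the discrete counterpart of the integration by parts used in the proof.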

The next theorem gives sufficient conditions for existence and uniqueness of solutions of the Galerkin discretization of (2.8) and provides a suboptimal error estimate.

 

Theorem 3.5
Assuming (3.8), for closed subspaces |$Y^{\delta } \times H^{\delta } \times X^{\delta } \subset Y \times H \times X$| with |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}} \gamma _0|_{X^{\delta }}\subseteq H^{\delta }$|⁠, the Galerkin discretization of (2.8) has a unique solution |$(\mu ^{\delta },\sigma ^{\delta },u^{\delta }) \in Y^{\delta } \times H^{\delta } \times X^{\delta } $|⁠, and with |$u$| denoting the solution of (2.6),
$$\begin{equation*} \|u-u^{\delta}\|_{X,Y^{\delta}} \leq 2 \inf_{\bar{u}^{\delta} \in X^{\delta}} \|u-\bar{u}^{\delta}\|_{X}. \end{equation*}$$

 

Proof.

Thanks to the assumptions |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}} \gamma _0|_{X^{\delta }}\subseteq H^{\delta }$|⁠, the inf-sup identity (3.9) guarantees the unique solvability of the Galerkin system.

For any |$u \in X^{\delta }$|⁠, there exist unique |$y_u \in Y^{\delta }$|⁠, |$h_u \in H^{\delta }$| such that
$$\begin{equation*} (A_s y_u)(v_1)+\langle h_u,v_2\rangle=(Bu)(v_1)+\langle \gamma_0 u,v_2\rangle\quad((v_1,v_2) \in Y^{\delta}\times H^{\delta}). \end{equation*}$$
We decompose |$Y^{\delta } \times H^{\delta }$| into |$Z^{\delta }:={\operatorname{clos}}\{(y_u,h_u)\colon u \in X^{\delta }\}$| and its orthogonal complement |$W^{\delta }$|. Using that for any |$u \in X^{\delta }$| and |$(v_1,v_2) \in W^{\delta }$|, |$(B u)(v_1)+\langle u(0),v_2\rangle =0$|, one infers that for any |$u \in X^{\delta }$|, the inf-sup identity (3.9) remains valid when the supremum is restricted to |$0 \neq (v_1,v_2) \in Z^{\delta }$|. Furthermore, since for any |$0 \neq (v_1,v_2) \in Z^{\delta }$| there exists a |$z \in X^{\delta }$| with |$(B z)(v_1)+\langle z(0),v_2\rangle \neq 0$|, we infer that |$u^{\delta }$| is the unique solution of the Petrov–Galerkin discretization of finding |$u^{\delta } \in X^{\delta }$| such that
$$\begin{equation} (B u^{\delta})(v_1)+\langle u^{\delta}(0),v_2\rangle=g(v_1)+\langle u_0,v_2\rangle \quad((v_1,v_2) \in Z^{\delta}). \end{equation}$$
(3.10)
By applying both these observations consecutively, we infer that for any |$\bar{u}^{\delta } \in X^{\delta }$|⁠,
$$\begin{equation} \begin{split} \|u^{\delta}-\bar{u}^{\delta}\|^2_{X,Y^{\delta}} &\!\!= \sup_{0 \neq (v_1,v_2) \in Z^{\delta}} \frac{((B(u^{\delta}-\bar{u}^{\delta}))(v_1)+\langle u^{\delta}(0)-\bar{u}^{\delta}(0),v_2\rangle)^2}{\|v_1\|_{Y}^2+\|v_2\|^2}\\[16pt] &= \sup_{0 \neq (v_1,v_2) \in Z^{\delta}} \frac{((B(u-\bar{u}^{\delta}))(v_1)+\langle u(0)-\bar{u}^{\delta}(0),v_2\rangle)^2}{\|v_1\|_{Y}^2+\|v_2\|^2} \leq \|u-\bar{u}^{\delta}\|^2_{X}, \end{split} \end{equation}$$
(3.11)
where again we have applied (3.9), now for |$Y^{\delta }=Y$|⁠. A triangle inequality completes the proof.

Theorem 3.5 can be used to demonstrate optimal rates for the error in |$u^{\delta }$| in the |$\|\,\|_{X,Y^{\delta }}$|-norm, and hence also in the |$Y$|-norm. Yet, for doing so one needs to control the error of best approximation in the generally strictly stronger |$\|\,\|_X$|-norm, which requires regularity conditions on the solution |$u$| that exceed those that are needed to guarantee optimal rates of the best approximation in the |$\|\,\|_{X,Y^{\delta }}$|-norm. In other words, this theorem does not show that |$u^{\delta }$| is a quasi-best approximation to |$u$| from |$X^{\delta }$| in the |$\|\,\|_{X,Y^{\delta }}$|-norm, or in any other norm.

 

Remark 3.6

Theorem 3.5 provides a generalization, with an improved constant, of Steinbach’s result (Steinbach, 2015, Theorem 3.2). There the case was considered that the initial value |$u_0=0$|⁠, |${\operatorname{ran}} \gamma _0|_{X^{\delta }}=\{0\}$|⁠, |$H^{\delta }=\{0\}$| and |$Y^{\delta }=X^{\delta }$|⁠. In that case the Galerkin discretization of (2.8) means solving |$u^{\delta } \in X^{\delta }$| from |$(B u^{\delta })(v)=g(v)$| (⁠|$v \in X^{\delta }$|⁠) (indeed, |$Z^{\delta }$| in the proof of Theorem 3.5 is |$X^{\delta } \times \{0\}$|⁠). So with this approach the forming of ‘normal equations’ as in (2.9) is avoided.

In the case of an inhomogeneous initial value |$u_0 \in H$|⁠, one may approximate the solution as |$\bar{u}+w^{\delta }$|⁠, where |$\bar{u} \in X$| is such that |$\gamma _0 \bar{u}=u_0$|⁠, and |$w^{\delta } \in X^{\delta }$| solves |$(B w^{\delta })(v)=g(v)-(B \bar{u})(v)$| (⁠|$v \in X^{\delta }$|⁠). Although such a |$\bar{u} \in X$| always exists, its practical construction becomes inconvenient for |$u_0 \not \in V$|⁠. For |$u_0 \in V$|⁠, |$\bar{u}$| can be taken as its constant extension in time.

To investigate in the setting of Steinbach (2015) the relation between the |$\|\,\|_{X,X^{\delta }}$|- and |$\|\,\|_X$|-norms, we consider |$X^{\delta }$| of the form |$X^{\delta }_t \otimes X^{\delta }_x$|⁠, where |$X^{\delta }_t$| is the space of continuous piecewise linears, zero at |$t=0$|⁠, w.r.t. a uniform partition of |$I$| with mesh size |$h_{\delta }=\frac{T}{2 N_{\delta }}$| for some |$N_{\delta } \in \mathbb N$|⁠, and |$X^{\delta }_x \subset V$| with |$\cap _{\delta \in \varDelta } X_x^{\delta } \neq \{0\}$|⁠. Given |$z^{\delta } \in X^{\delta }$|⁠, Lemma 3.4 shows that
$$\begin{equation} \sup_{0 \neq v \in X^{\delta}} \frac{|(B z^{\delta})(v)|}{\|z^{\delta}\|_X\|v\|_Y}= \frac{\|z^{\delta}\|_{X,X^{\delta}}}{\|z^{\delta}\|_X}. \end{equation}$$
(3.12)
For some arbitrary, fixed |$0 \neq z_x \in \cap _{\delta \in \varDelta } X_x^{\delta }$|⁠, we take |$z^{\delta }=z^{\delta }_t \otimes z_x \in X^{\delta }$|⁠, where |$z^{\delta }_t \in X^{\delta }_t$| is defined by |$\frac{\mathrm{d}}{\mathrm{d} t} z^{\delta }_t=(-1)^{i-1}$| on |$[(i-1) h_{\delta },i h_{\delta }]$|⁠. Since |$z^{\delta }_t(0)=0$|⁠, also |$z^{\delta }_t(T)=0$|⁠. We have |$\|z^{\delta }_t\|_{L_2(I)}\eqsim h_{\delta }$|⁠, |$\|\frac{\textrm{d} z^{\delta }_t}{\textrm{d} t}\|_{L_2(I)}\eqsim 1$|⁠, |$\sup _{0 \neq v \in Y}\frac{(\partial _t z^{\delta })(v)}{\|v\|_Y}=\|\frac{\textrm{d} z^{\delta }_t}{\textrm{d} t}\|_{L_2(I)}\|z_x\|_{V^{\prime }} \eqsim 1$|⁠, |$\|z^{\delta }\|_Y =\|z^{\delta }_t\|_{L_2(I)} \|z_x\|_{V} \eqsim h_{\delta }$| and
$$\begin{align*} \sup_{0 \neq v \in X^{\delta}}\frac{(\partial_t z^{\delta})(v)}{\|v\|_Y}&= \sup_{0 \neq v \in X_t^{\delta}}\frac{\big\langle\frac{\textrm{d}z^{\delta}_t} {\textrm{d}t}, v \big\rangle_{L_2(I)}}{\|v\|_{L_2(I)}} \sup_{0 \neq v \in X_x^{\delta}}\frac{\langle z_x, v \rangle}{\|v\|_{V}} \leq \sup_{0 \neq v \in X_t^{\delta}}\frac{\big\langle\frac{\textrm{d}z^{\delta}_t} {\textrm{d}t}, v \big\rangle_{L_2(I)}}{\|v\|_{L_2(I)}} \|z_x\|_{V^{\prime}}. \end{align*}$$
Let us equip the space of piecewise constants w.r.t. the aforementioned uniform partition with the |$L_2(I)$|-normalized basis |$\{\chi _i^{\delta }\}$| of characteristic functions of the subintervals, and |$X^{\delta }_t$| with the set of nodal basis functions |$\{\phi _i^{\delta }\}$| normalized such that their maximal value is |$h_{\delta }^{-\frac 12}$|⁠. Then with
$$G:=[\langle \chi _j,\phi _i\rangle _{L_2(I)}]_{i j} =\frac 12 {\small \left [\begin{smallmatrix} 1 & 1 & & \\ &\ddots & \ddots &\\ & & 1& 1\\ & & & 1 \end{smallmatrix} \right ]},$$
and |$\vec{x}:=\sqrt{h_{\delta }}\, [(-1)^{i-1}]_{1 \leq i \leq 2 N_{\delta }}$|, from the uniform |$L_2(I)$|-stability of |$\{\phi _i^{\delta }\}$| one infers that
$$\begin{equation*} \sup_{0 \neq v \in X_t^{\delta}}\frac{\big\langle\frac{\mathrm{d}z^{\delta}_t }{\mathrm{d}t}, v \big\rangle_{L_2(I)}}{\|v\|_{L_2(I)}} \eqsim \sup_{0 \neq \vec{y}}\frac{\langle G \vec{x},\vec{y}\rangle}{\|\vec{y}\|} = \|G \vec{x}\| =\frac12 \sqrt{h_{\delta}}. \end{equation*}$$
By substituting these estimates in the right-hand side of (3.12), we find that its value is |$\eqsim \sqrt{h_{\delta }}$|⁠, so that |$\inf _{0 \neq z^{\delta } \in X^{\delta }} \sup _{0 \neq v \in X^{\delta }} \frac{|(B z^{\delta })(v)|}{\|z^{\delta }\|_X\|v\|_Y} \lesssim \sqrt{h_{\delta }}$|⁠. As follows from the first inequality in Remark 3.2, this means that there exist solutions |$u \in X$| of the parabolic problem for which the errors in the |$X$|-norm in these Galerkin approximations from |$X^{\delta }$| are a factor |$\gtrsim h_{\delta }^{-\frac 12}$| larger than these errors in the best approximations from |$X^{\delta }$|⁠.
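The value |$\|G \vec{x}\|=\frac12 \sqrt{h_{\delta }}$| used in this argument is easy to confirm numerically. The sketch below builds the bidiagonal matrix |$G$| and the alternating vector |$\vec{x}$| exactly as defined above; the function name and the choices of |$N_{\delta}$| and |$T$| are illustrative assumptions.

```python
import numpy as np

def sawtooth_projection_norm(n_delta, T=1.0):
    """Return (h, ||G x||) for the alternating derivative on 2*n_delta intervals."""
    m = 2 * n_delta
    h = T / m
    # G = 1/2 * (identity + first superdiagonal), as in the definition of G.
    G = 0.5 * (np.eye(m) + np.eye(m, k=1))
    # x_i = sqrt(h) * (-1)^(i-1) for i = 1, ..., 2*n_delta.
    x = np.sqrt(h) * (-1.0) ** np.arange(m)
    return h, np.linalg.norm(G @ x)

for n_delta in (1, 4, 32):
    h, val = sawtooth_projection_norm(n_delta)
    print(n_delta, val, 0.5 * np.sqrt(h))
```

All rows of |$G\vec{x}$| except the last one cancel, which is why only a contribution of size |$\frac12\sqrt{h_{\delta}}$| survives.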

Numerical evidence provided by Steinbach (2015, Table 6) indicates that in general these Galerkin approximations are not quasi-optimal in the |$Y$|-norm either.

Returning to the general setting of Theorem 3.5, in the following theorem it will be shown that under an additional assumption, quasi-optimal error estimates are valid.

 

Theorem 3.7
Assuming (3.8), let |$(Y^{\delta },H^{\delta },X^{\delta })_{\delta \in \varDelta }$| be a family of closed subspaces of |$Y \times H \times X$| such that in addition to |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}} \gamma _0|_{X^{\delta }}\subseteq H^{\delta }$|⁠, also (3.5) is valid. Then for the Galerkin solutions |$(\mu ^{\delta },\sigma ^{\delta },u^{\delta }) \in Y^{\delta } \times H^{\delta } \times X^{\delta } $| of (2.8) it holds that
$$\begin{equation*} \|u-u^{\delta}\|_X \leq \gamma_{\varDelta}^{-1} \inf_{\bar{u}^{\delta} \in X^{\delta}}\|u-\bar{u}^{\delta}\|_X. \end{equation*}$$

 

Proof.

As we saw in the proof of Theorem 3.5, thanks to the assumptions |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}} \gamma _0|_{X^{\delta }}\subseteq H^{\delta }$|⁠, the component |$u^{\delta } \in X^{\delta }$| of the Galerkin solution of (2.8) is the Petrov–Galerkin solution of (2.6) with test space |$Z^{\delta } \subset Y^{{{\delta }}} \times H^{{{\delta }}}$|⁠.

Equation (3.11) shows that the projector |$P^{\delta }\colon u \mapsto u^{\delta }$| satisfies |$\|P^{\delta } u\|_{X,{{Y}}^{\delta }} \leq \|u\|_X$|⁠. The proof is completed by |$\|\,\|_X \leq \gamma _{\varDelta }^{-1} \|\,\|_{X,{{Y}}^{\delta }}$| on |$X^{\delta }$| by assumption (3.5), in combination with (3.1).

Andreev (2013) studied minimal residual Petrov–Galerkin discretizations of
$$\left [\begin{smallmatrix} B \\ \gamma _0\end{smallmatrix}\right ]u=\left [\begin{smallmatrix} g \\ u_0\end{smallmatrix}\right ].$$
They can equivalently be interpreted as Galerkin discretizations of (2.8) (cf. Cohen et al., 2012; Broersen & Stevenson, 2014, Proposition 2.2). In view of this, Theorem 3.7 reproduces, though here with a clear-cut constant, the results from Andreev (2013, Theorems 3.1 & 4.1).

 

Remark 3.8
As was pointed out earlier in Andreev (2013), for practical computations it can be attractive to modify the Galerkin discretization of (2.8) by replacing |${E_Y^{\delta }}^{\prime } A_s E_Y^{\delta }$| by some |$\tilde{A}_s^{\delta }={\tilde{A}_s^{\delta }}{^{\prime }} \in{\mathcal{L}} \textrm{is} (Y^{\delta },{Y^{\delta }}^{\prime })$| whose inverse can be applied cheaply (a preconditioner), such that for some constants |$0<c_{\mathcal N} \leq C_{\mathcal N}<\infty $|,
$$\begin{equation*} \frac{(\tilde{A}_s^{\delta} u)(u)}{(A_su)(u)} \in [c_{\mathcal N}^2,C_{\mathcal N}^2] \quad (\delta \in \varDelta,\,u \in Y^{\delta}). \end{equation*}$$
Indeed, in that case one can solve the then explicitly available Schur complement equation with preconditioned CG, instead of applying the preconditioned MINRES iteration. By redefining
$$Z^{\delta }:={{{\operatorname{clos}}_{Y^{\delta } \times H^{\delta }}}} {\operatorname{ran}}\left [\begin{smallmatrix} (\tilde{A}_s^{\delta })^{-1} {E_Y^{\delta }}^{\prime } B \\ \gamma _0 \end{smallmatrix}\right ]\Big |_{X^{\delta }}$$
in the proof of Theorem 3.5, and by taking |$W^{\delta }$| to be its orthogonal complement in |$Y^{\delta } \times H^{\delta }$| with |$Y^{\delta }$| now being equipped with inner product |$(\tilde{A}_s^{\delta } \cdot )(\cdot )$|⁠, instead of (3.11) we now estimate for any |$\bar{u}^{\delta } \in X^{\delta }$|⁠,
$$\begin{align*} \|u^{\delta}-\bar{u}^{\delta}\|^2_{X,Y^{\delta}} & = \sup_{0 \neq (v_1,v_2) \in Y^{\delta} \times H^{\delta}} \frac{((B(u^{\delta}-\bar{u}^{\delta}))(v_1)+\langle u^{\delta}(0)-\bar{u}^{\delta}(0),v_2\rangle)^2}{\|v_1\|_{Y}^2+\|v_2\|^2}\\ &\leq{\frac{1}{\min(c_{\mathcal N}^2,1)}} \sup_{0 \neq (v_1,v_2) \in Y^{\delta} \times H^{\delta}}\frac{((B(u^{\delta}-\bar{u}^{\delta}))(v_1)+\langle u^{\delta}(0)-\bar{u}^{\delta}(0),v_2\rangle)^2}{(\tilde{A}_s^{\delta} v_1)(v_1)+\|v_2\|^2}\\ &= {\frac{1}{\min(c_{\mathcal N}^2,1)}} \sup_{0 \neq (v_1,v_2) \in Z^{\delta}} \frac{((B(u^{\delta}-\bar{u}^{\delta}))(v_1)+\langle u^{\delta}(0)-\bar{u}^{\delta}(0),v_2\rangle)^2}{(\tilde{A}_s^{\delta} v_1)(v_1)+\|v_2\|^2}\\ &= {\frac{1}{\min(c_{\mathcal N}^2,1)}} \sup_{0 \neq (v_1,v_2) \in Z^{\delta}} \frac{((B(u-\bar{u}^{\delta}))(v_1)+\langle u(0)-\bar{u}^{\delta}(0),v_2\rangle)^2}{(\tilde{A}_s^{\delta} v_1)(v_1)+\|v_2\|^2}\\ &\leq{\frac{\max(C_{\mathcal N}^2,1)}{\min(c_{\mathcal N}^2,1)}} \sup_{0 \neq (v_1,v_2) \in Z^{\delta}} \frac{((B(u-\bar{u}^{\delta}))(v_1)+\langle u(0)-\bar{u}^{\delta}(0),v_2\rangle)^2}{\|v_1\|_{Y}^2+\|v_2\|^2}\\ &\leq{ \frac{\max(C_{\mathcal N}^2,1)}{\min(c_{\mathcal N}^2,1)}} \|u-\bar{u}^{\delta}\|^2_{X}. \end{align*}$$
Consequently, a generalization of the statement of Theorem 3.5 reads as
$$\begin{equation*} \|u-u^{\delta}\|_{X,Y^{\delta}} \leq \Big(1+\sqrt{{\textstyle \frac{\max(C_{\mathcal N}^2,1)}{\min(c_{\mathcal N}^2,1)}}}\,\Big) \inf_{\bar{u}^{\delta} \in X^{\delta}} \|u-\bar{u}^{\delta}\|_{X}, \end{equation*}$$
and that of Theorem 3.7 reads
$$\begin{equation*} \|u-u^{\delta}\|_{X} \leq \gamma_{\varDelta}^{-1} \sqrt{{\textstyle \frac{\max(C_{\mathcal N}^2,1)}{\min(c_{\mathcal N}^2,1)}}} \inf_{\bar{u}^{\delta} \in X^{\delta}} \|u-\bar{u}^{\delta}\|_{X}. \end{equation*}$$

 

Remark 3.9

As we saw in the previous section, under the condition that (3.5) is valid, Galerkin discretizations of (2.10) yield quasi-optimal approximations. Assuming |$A=A^{\prime }$|⁠, in the current section we have seen that the same holds true for Galerkin discretizations of (2.8) when in addition |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}}\ \gamma _0|_{X^{\delta }}\subseteq H^{\delta }$|⁠. For the latter discretization, however, still a suboptimal error bound is valid without assuming (3.5). This raises the question whether this is also true for Galerkin discretizations of (2.10).

As we saw earlier, the Galerkin operator resulting from (2.10) is invertible whenever |$X^{\delta }\neq \{0\}$|. Moreover, when equipping |$X^{\delta }$| with the ‘mesh-dependent’ norm |$\|\,\|_{X,Y^{\delta }}$|, by adapting the proof of Theorem 3.3 one can show that the Galerkin operator is in |${\mathcal{L}}\textrm{is}(Y^{\delta } \times X^{\delta }, {Y^{\delta }}^{\prime } \times{X^{\delta }}^{\prime })$| with both the operator and its inverse having a uniformly bounded norm. Despite this result, we could not establish a suboptimal error estimate similar to Theorem 3.5.

Finally in this section, we comment on the implementation of the Galerkin discretization of (2.8). This system reads
$$\begin{equation} \left[\begin{array}{@{}ccc@{}} {E_Y^{\delta}}^{\prime} A_s E_Y^{\delta}& 0 & {E_Y^{\delta}}^{\prime} B E_X^{\delta} \\ 0 & {E_H^{\delta}}^{\prime} E_H^{\delta}& {E_H^{\delta}}^{\prime}\gamma_0 E_X^{\delta}\\{E_X^{\delta}}^{\prime} B^{\prime} E_Y^{\delta}& {E_X^{\delta}}^{\prime} \gamma_0^{\prime} E_H^{\delta}& 0\end{array}\right] \left[\begin{array}{@{}c@{}} \mu^{\delta} \\ \sigma^{\delta} \\ u^{\delta} \end{array}\right]= \left[\begin{array}{@{}c@{}} {E_Y^{\delta}}^{\prime} g \\{E_H^{\delta}}^{\prime} u_0 \\ 0 \end{array}\right]. \end{equation}$$
(3.13)
By eliminating |$\sigma ^{\delta }$|⁠, it is equivalent to
$$\begin{equation} \left[\begin{array}{@{}cc@{}} {E_Y^{\delta}}^{\prime} A_s E_Y^{\delta} & {E_Y^{\delta}}^{\prime} B E_X^{\delta}\\{E_X^{\delta}}^{\prime} B^{\prime} E_Y^{\delta}& -{E_X^{\delta}}^{\prime} \gamma_0^{\prime} E_H^{\delta} \big({E_H^{\delta}}^{\prime} E_H^{\delta}\big)^{-1} {E_H^{\delta}}^{\prime}\gamma_0 E_X^{\delta}\end{array}\right] \left[\begin{array}{@{}c@{}} \mu^{\delta} \\ u^{\delta} \end{array}\right]= \left[\begin{array}{@{}c@{}} {E_Y^{\delta}}^{\prime} g \\ -{E_X^{\delta}}^{\prime} \gamma_0^{\prime} u_0 \end{array}\right]. \end{equation}$$
(3.14)
The operator |$E_H^{\delta } \big ({E_H^{\delta }}^{\prime } E_H^{\delta }\big )^{-1} {E_H^{\delta }}^{\prime }$| is the |$H$|-orthogonal projector onto |$H^{\delta }$|⁠. So under the assumption that
$$\begin{equation*} {\operatorname{ran}}\ \gamma_0|_{X^{\delta}}\subseteq H^{\delta}, \end{equation*}$$
which was made in Theorem 3.7, it can be omitted, or equivalently, it can be pretended that |$H^{\delta }=H$|⁠, without changing the solution |$(\mu ^{\delta },u^{\delta })$|⁠. The implementation of the resulting system
$$\begin{equation} \left[\begin{array}{@{}cc@{}} {E_Y^{\delta}}^{\prime} A_s E_Y^{\delta} & {E_Y^{\delta}}^{\prime} B E_X^{\delta}\\{E_X^{\delta}}^{\prime} B^{\prime} E_Y^{\delta}& -{E_X^{\delta}}^{\prime} \gamma_0^{\prime} \gamma_0 E_X^{\delta}\end{array}\right] \left[\begin{array}{@{}c@{}} \mu^{\delta} \\ u^{\delta} \end{array}\right]= \left[\begin{array}{@{}c@{}} {E_Y^{\delta}}^{\prime} g \\ -{E_X^{\delta}}^{\prime} \gamma_0^{\prime} u_0 \end{array}\right] \end{equation}$$
(3.15)
is easier, and it runs more efficiently than (3.13).
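The elimination of |$\sigma ^{\delta }$| that leads from (3.13) to (3.14) is plain block elimination, which the following sketch illustrates. It uses small random dense matrices as hypothetical stand-ins for the Galerkin blocks (no finite element assembly is performed), and all names are illustrative. The further step from (3.14) to (3.15) relies on the range condition on |$\gamma _0|_{X^{\delta }}$| and is not tested here.

```python
import numpy as np

def compare_full_and_reduced(ny=6, nh=3, nx=4, seed=0):
    """Solve the three-field and the reduced two-field system and compare."""
    rng = np.random.default_rng(seed)

    def spd(n):
        R = rng.standard_normal((n, n))
        return R @ R.T + n * np.eye(n)

    A = spd(ny)                          # stand-in for E_Y' A_s E_Y
    M = spd(nh)                          # stand-in for E_H' E_H
    B = rng.standard_normal((ny, nx))    # stand-in for E_Y' B E_X
    Gam = rng.standard_normal((nh, nx))  # stand-in for E_H' gamma_0 E_X
    g = rng.standard_normal(ny)          # stand-in for E_Y' g
    h0 = rng.standard_normal(nh)         # stand-in for E_H' u_0

    # Three-field system with unknowns (mu, sigma, u), structured as (3.13).
    K3 = np.block([
        [A, np.zeros((ny, nh)), B],
        [np.zeros((nh, ny)), M, Gam],
        [B.T, Gam.T, np.zeros((nx, nx))],
    ])
    s3 = np.linalg.solve(K3, np.concatenate([g, h0, np.zeros(nx)]))
    mu3, u3 = s3[:ny], s3[ny + nh:]

    # Two-field system after substituting sigma = M^{-1} (h0 - Gam u).
    K2 = np.block([[A, B], [B.T, -Gam.T @ np.linalg.solve(M, Gam)]])
    rhs2 = np.concatenate([g, -Gam.T @ np.linalg.solve(M, h0)])
    s2 = np.linalg.solve(K2, rhs2)
    mu2, u2 = s2[:ny], s2[ny:]

    return np.abs(mu3 - mu2).max(), np.abs(u3 - u2).max()

print(compare_full_and_reduced())
```

Both differences should be at the level of rounding errors. In an actual implementation the blocks would be sparse and the reduced system would be solved iteratively rather than with a dense direct solver.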

 

Remark 3.10
System (3.15) can be viewed as a Galerkin discretization of
$$\begin{equation} \left[\begin{array}{@{}cc@{}} A_s & B \\ B^{\prime} & - \gamma_0^{\prime} \gamma_0 \end{array}\right] \left[\begin{array}{@{}c@{}} \mu \\ u \end{array}\right]= \left[\begin{array}{@{}c@{}} g \\ - \gamma_0^{\prime} u_0 \end{array}\right], \end{equation}$$
(3.16)
but for the analysis of the discretization error in |$(\mu ^{\delta },u^{\delta })$| it is still useful to view (3.15) before elimination of |$\sigma ^{\delta }$|⁠, as a Galerkin discretization of (2.8) which yielded the sharp bound on this error presented in Theorem 3.7.

4. Realization of the uniform inf-sup stability (3.5)

In Theorem 3.3 it was shown that Galerkin discretizations of (2.10) are quasi-optimal when (3.5) is valid, and in Theorem 3.7 the same was shown for Galerkin discretizations of (2.8) when in addition |$X^{\delta } \subseteq Y^{\delta }$| and |${\operatorname{ran}} \gamma _0|_{X^{\delta }} \subseteq H^{\delta }$| (and |$A=A_s$|⁠) are valid.

In this section we realize condition (3.5) for finite element spaces w.r.t. partitions of the space-time domain into prismatic elements. In Section 4.1 generally nonuniform partitions are considered for which the partition in time is independent of the spatial location, and the spatial mesh in each time slab is such that the corresponding |$H$|-orthogonal projection is uniformly |$V$|-stable. In Section 4.2 we revisit the special case, already studied in Andreev (2013), of trial spaces that are tensor products of temporal and spatial trial spaces.

4.1 Nonuniform approximation in space local in time, nonuniform approximation in time global in space

 

Theorem 4.1
Let |${\mathcal O}$| be a collection of closed subspaces |$X_x$| of |$V$| such that the |$H$|-orthogonal projector |$Q_{X_x}$| onto |$X_x$| is in |$\mathcal L(V,V)$|⁠, with |$\mu _{\mathcal O}:= \inf _{X_x \in{\mathcal O}} \|Q_{X_x}\|_{\mathcal L(V,V)}^{-1}>0$|⁠. For any |$N \in \mathbb N$|⁠, |$0=t_0<t_1<\cdots <t_N=T$|⁠, |$q_0,\ldots ,q_{N-1} \in \mathbb N$|⁠, |$X_x^0,\ldots ,X_x^{N-1} \in{\mathcal O}$|⁠, let
$$\begin{align*} X^{\delta} &:=\{u \in C(\bar{I};V) \colon u|_{(t_i,t_{i+1})} \in P_{q_i} \otimes X_x^i\},\\ Y^{\delta} &:=\{v \in L_2(I;V) \colon v|_{(t_i,t_{i+1})} \in P_{q_i-1} \otimes X_x^i\}. \end{align*}$$
Then with |$\varDelta $| being the collection of all |$\delta =\delta (N,(t_i)_i, (q_i)_i, (X_x^i)_i)$|⁠, it holds that
$$\begin{equation} \inf_{\delta \in \varDelta}\inf_{\{u \in X^{\delta} \colon \partial_t u \neq 0\}} \sup_{0\neq v \in Y^{\delta}} \frac{(\partial_t u)(v)}{\|\partial_t u\|_{Y^{\prime}}\|v\|_Y}\geq \mu_{\mathcal O}, \end{equation}$$
(4.1)
i.e., (3.5) is valid.

 

Proof.

In Andreev (2013, Lemma 6.2) it was shown that |$\inf _{0 \neq u \in X_x} \sup _{0 \neq v \in X_x} \frac{\langle u,v\rangle }{\|u\|_{V^{\prime }}\|v\|_{V}}=\|Q_{X_x}\|_{\mathcal L(V,V)}^{-1}$|⁠.

With |$P_n$| denoting the Legendre polynomial of degree |$n$|⁠, extended with zero outside |$(-1,1)$|⁠, for any |$u \in X^{\delta } $|⁠, |$\partial _t u$| can be written as the |$L_2(I;H)$|-orthogonal expansion |$(t,x) \mapsto \sum _{i=0}^{N-1} \sum _{n=0}^{q_i-1} P_n\big (\frac{2t-(t_{i+1}+t_i)}{t_{i+1}-t_i}\big ) u_{i,n}(x)$| for some |$u_{i,n} \in X_x^i$|⁠. Fixing |$\varepsilon \in (0,\mu _{\mathcal O} )$|⁠, for each |$(i,n)$| there is a |$v_{i,n} \in X_x^i$| with |$\|v_{i,n}\|_{V}=\|u_{i,n}\|_{V^{\prime }}$| and |$\langle u_{i,n},v_{i,n}\rangle \geq (\mu _{\mathcal O} -\varepsilon ) \|u_{i,n}\|_{V^{\prime }} \|v_{i,n}\|_{V}$|⁠. Taking |$v:=(t,x) \mapsto \sum _{i=0}^{N-1} \sum _{n=0}^{q_i-1} P_n\big (\frac{2t-(t_{i+1}+t_i)}{t_{i+1}-t_i}\big ) v_{i,n}(x)$|⁠, we conclude that
$$\begin{equation*} (\partial_t u)(v) \geq (\mu_{\mathcal O} -\varepsilon) \sum_{i=0}^{N-1} \sum_{n=0}^{q_i-1} \big\|P_n\big({\textstyle \frac{2\cdot-(t_{i+1}+t_i)}{t_{i+1}-t_i}}\big)\big\|_{L_2(I)}^2 \|u_{i,n}\|_{V^{\prime}}^2=(\mu_{\mathcal O} -\varepsilon) \|\partial_t u\|_{Y^{\prime}} \|v\|_Y, \end{equation*}$$
which implies the result.

 
Remark 4.2

In view of Theorem 3.7, note that both |$X^{\delta } \subset Y^{\delta }$| and (3.5) are valid by taking |$Y^{\delta }:=\{v \in L_2(I;V) \colon v|_{(t_i,t_{i+1})} \in P_{q_i} \otimes X_x^i\}$|⁠.

Considering the condition on the collection |${\mathcal O}$| of spatial trial spaces |$X_x$|⁠, let us consider the typical situation that |$H=L_2(\varOmega )$|⁠, |$V=H^1_{0,\gamma }(\varOmega )=\{u \in H^1(\varOmega )\colon u=0 \textrm{ on } \gamma \}$| where |$\varOmega \subset \mathbb R^d$| is a bounded polytopal domain and |$\gamma $| is a measurable, closed, possibly empty subset of |$\partial \varOmega $|⁠. We consider |$X_x \subset V$| to be finite element spaces of some degree w.r.t. a family of uniformly shape regular, and, say, conforming partitions |${\mathcal T}$| of |$\varOmega $| into, say, |$d$|-simplices, where |$\gamma $| is the union of some |$(d-1)$|-faces of |$S \in{\mathcal T}$|⁠. When the partitions in this family are quasi-uniform, then using, e.g., the Scott–Zhang quasi-interpolator (Scott & Zhang, 1990), it is easy to demonstrate the so-called (uniform) simultaneous approximation property
$$\begin{equation*} \sup_{X_x \in{\mathcal O}} \sup_{0 \neq u \in V} \frac{\inf_{v \in X_x}\{\|v\|_V+\big(\sup_{0\neq w \in X_x} \frac{\|w\|_V}{\|w\|_H}\big) \|u-v\|_H\}}{\|u\|_V}<\infty. \end{equation*}$$
Writing for |$u \in V$| and any |$v \in X_x$|, |$Q_{X_x}u=v+Q_{X_x}(u-v)$|, one easily infers that |$\sup _{X_x \in{\mathcal O}} \|Q_{X_x}\|_{\mathcal L(V,V)}<\infty $|.

The uniform boundedness of |$\|Q_{X_x}\|_{\mathcal L(V,V)}$| is, however, by no means restricted to families of finite element spaces w.r.t. quasi-uniform partitions, and it has been demonstrated for families of locally refined partitions, for |$d=2$| including those that are generated by the newest vertex bisection algorithm. We refer to Carstensen (2002) and Gaspoz et al. (2016).

4.2 Nonuniform approximation in space global in time, nonuniform approximation in time global in space

If in Theorem 4.1, the spatial trial spaces |$X_x^i$| are independent of the temporal interval |$(t_i,t_{i+1})$|⁠, then |$X^{\delta }$| is a tensor product of trial spaces in space and time. In that case, one shows inf-sup stability for general temporal trial spaces, e.g., spline spaces with more global smoothness than continuity.

 

Theorem 4.3

Let |${\mathcal O}$| be as in Theorem 4.1. Given closed subspaces |$X_t \subset H^1(I)$|⁠, |$\frac{\mathrm{d}}{\mathrm{d}t} X_t \subseteq Y_t \subset L_2(I)$| and |$X_x \in{\mathcal O}$|⁠, let |$X^{\delta }:=X_t \otimes X_x$|⁠, |$Y^{\delta }:=Y_t \otimes X_x$|⁠. Then with |$\varDelta $| being the collection of all |$\delta =\delta (X_t,Y_t,X_x)$|⁠, (4.1) is valid.

The proof of this result follows from the fact that thanks to the Kronecker product structure of |$\partial _t \in \mathcal L(X,Y^{\prime })$|⁠, for such trial spaces we have
$$\begin{align} \nonumber &\inf_{\{u \in X^{\delta}\colon \partial_t u \neq 0\}} \sup_{0\neq v \in Y^{\delta}} \frac{(\partial_t u)(v)}{\|\partial_t u\|_{Y^{\prime}}\|v\|_Y} \\ &\quad=\inf_{\{u \in X_t\colon \frac{\mathrm{d} u}{\mathrm{d} t} \neq 0\}} \sup_{0\neq v \in Y_t} \frac{\int_I \frac{\textrm{d} u}{\textrm{d} t} v\,\textrm{d}t}{\|\frac{\mathrm{d} u}{\mathrm{d} t}\|_{L_2(I)}\|v\|_{L_2(I)}} \times \inf_{0\neq u \in X_x} \sup_{0\neq v \in X_x} \frac{\langle u,v\rangle}{\|u\|_{V^{\prime}}\|v\|_{V}}\\ \nonumber & \quad=\inf_{0\neq u \in X_x} \sup_{0\neq v \in X_x} \frac{\langle u,v\rangle}{\|u\|_{V^{\prime}}\|v\|_{V}}. \end{align}$$
(4.2)
(To see this, one may use that for Hilbert spaces |$U$| and |$V$|, |$T \in \mathcal L(U,V^{\prime })$|, and Riesz mappings |$R_U\colon U \rightarrow U^{\prime }$|, |$R_V\colon V \rightarrow V^{\prime }$|, it holds that |$\inf _{0 \neq u \in U}\sup _{0 \neq v \in V}\frac{(Tu)(v)}{\|u\|_U\|v\|_V}=\sqrt{\min \sigma (R_U^{-1} T^{\prime } R_V^{-1} T)}$|, with |$R_U^{-1} T^{\prime } R_V^{-1} T\in \mathcal L(U,U)$| being self-adjoint and non-negative. In the above setting, it is a Kronecker product of corresponding operators acting in the ‘time’ and ‘space’ directions, respectively.)
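The factorization in (4.2) rests on the fact that the smallest spectral value of a Kronecker product of self-adjoint non-negative operators is the product of the smallest spectral values of the factors. A minimal finite-dimensional illustration, with random symmetric positive semi-definite matrices as hypothetical stand-ins for the ‘time’ and ‘space’ operators, could look as follows.

```python
import numpy as np

def min_eig_kron_check(nt=4, nx=5, seed=0):
    """Return (min eig of kron(Kt, Kx), min eig of Kt * min eig of Kx)."""
    rng = np.random.default_rng(seed)
    Rt = rng.standard_normal((nt, nt))
    Rx = rng.standard_normal((nx, nx))
    Kt = Rt @ Rt.T  # symmetric, non-negative 'time' factor
    Kx = Rx @ Rx.T  # symmetric, non-negative 'space' factor
    lam_kron = np.linalg.eigvalsh(np.kron(Kt, Kx)).min()
    lam_prod = np.linalg.eigvalsh(Kt).min() * np.linalg.eigvalsh(Kx).min()
    return lam_kron, lam_prod

print(min_eig_kron_check())
```

The non-negativity of both factors matters here: for indefinite matrices the smallest eigenvalue of the Kronecker product is in general not the product of the smallest eigenvalues.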

 

Remark 4.4
(Sparse tensor products).

Instead of considering the ‘full’ tensor product trial spaces from Theorem 4.3, more efficient approximations can be found by the application of ‘sparse’ tensor products. Let |$X_x^{(0)}\subset X_x^{(1)} \subset \cdots $| be a sequence of spaces from |${\mathcal O}$|⁠, |$X_t^{(0)}\subset X_t^{(1)}\subset \cdots \subset H^1(I)$|⁠, and |$Y_t^{(0)}\subset Y_t^{(1)}\subset \cdots \subset L_2(I)$| such that |$Y_t^{(k)} \supseteq \frac{\textrm{d}}{\textrm{d} t}X_t^{(k)}$|⁠. Then for |$X^{(\ell )}:=\sum _{k=0}^{\ell } X_t^{(k)} \otimes X_x^{(\ell -k)}$|⁠, |$Y^{(\ell )}:=\sum _{k=0}^{\ell } Y_t^{(k)} \otimes X_x^{(\ell -k)}$| inf-sup stability holds true uniformly in |$\ell $| with inf-sup constant |$\mu _{\mathcal O}$|⁠.

Although this result follows as a special case from the analysis given in Andreev (2013) for convenience we include the argument. Defining |$W_t^{(k)}:=Y_t^{(k)} \cap (Y_t^{(k-1)})^{\perp _{L_2(I)}}$| for |$k>0$|⁠, and |$W_t^{(0)}:=Y_t^{(0)}$|⁠, from the nestings of |$(Y_t^{(i)})_i$| and |$(X_x^{(i)})_i$| one infers that |$Y^{(\ell )}=\oplus _{k=0}^{\ell } W_t^{(k)} \otimes X_x^{(\ell -k)}$| is an |$(L_2(I)\otimes H)$|-orthogonal decomposition. Given |$y \in Y^{(\ell )}$|⁠, let |$y=\sum _{k=0}^{\ell } y_k$| be the corresponding expansion. Fixing |$\varepsilon \in (0,\mu _{\mathcal O} )$|⁠, there exist |$\tilde{y}_k \in W_t^{(k)} \otimes X_x^{(\ell -k)}$| with |$\langle y_k,\tilde{y}_k\rangle _{L_2(I)\otimes H} \geq (\mu _{\mathcal O}-\varepsilon )\|y_k\|_Y\|\tilde{y}_k\|_{Y^{\prime }}$| and |$\|\tilde{y}_k\|_{Y^{\prime }}=\|y_k\|_Y$|⁠, and so |$\langle \sum _{k=0}^{\ell } y_k,\sum _{k=0}^{\ell } \tilde{y}_k\rangle _{L_2(I)\otimes H} \geq (\mu _{\mathcal O}-\varepsilon ) \|\sum _{k=0}^{\ell } y_k\|_Y\|\sum _{k=0}^{\ell } \tilde{y}_k\|_{Y^{\prime }}$|⁠. Thanks to |$\partial _t X^{(\ell )} \subseteq Y^{(\ell )}$|⁠, the proof is completed.

 

Remark 4.5

In view of (4.2), it is obvious that Theorem 4.3 remains valid when the condition |$\frac{\textrm{d}}{\textrm{d}t} X_t \subseteq Y_t$| is relaxed to |$\inf _{\{u \in X_t\colon \frac{\textrm{d} u}{\textrm{d} t} \neq 0\}} \sup _{0\neq v \in Y_t} \frac{\int _I \frac{\textrm{d} u}{\textrm{d} t} v\,\textrm{d}t}{\|\frac{\textrm{d} u}{\textrm{d} t}\|_{L_2(I)}\|v\|_{L_2(I)}}>0$| uniformly in the pairs |$(X_t,Y_t)$| that are applied. As shown in Andreev (2013), the same holds true in the sparse tensor product case. For |$X_t$| the space of continuous piecewise linears w.r.t. some partition |${\mathcal T}$| of |$I$|⁠, and |$Y_t$| the space of continuous piecewise linears w.r.t. the once dyadically refined partition, an easy computation shows that the inf-sup constant is not less than |$\sqrt{3/4}$|⁠.
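The ‘easy computation’ for the pair of continuous piecewise linears on |${\mathcal T}$| and on its dyadic refinement can also be carried out numerically. The sketch below is only an illustration under stated assumptions: it uses that |$\frac{\mathrm{d}}{\mathrm{d}t}X_t$| consists of the piecewise constants w.r.t. |${\mathcal T}$|, and that the supremum over |$v \in Y_t$| equals the |$L_2(I)$|-norm of the |$L_2(I)$|-orthogonal projection onto |$Y_t$|. The function name and the sample partitions are hypothetical.

```python
import numpy as np

def temporal_infsup_constant(nodes):
    """Inf-sup constant between piecewise constants on `nodes` and
    continuous piecewise linears on the once-refined partition."""
    nodes = np.asarray(nodes, dtype=float)
    mids = 0.5 * (nodes[:-1] + nodes[1:])
    fine = np.sort(np.concatenate([nodes, mids]))
    nf, nc = len(fine), len(nodes) - 1
    hf, hc = np.diff(fine), np.diff(nodes)

    M = np.zeros((nf, nf))  # mass matrix of the fine hat functions
    C = np.zeros((nf, nc))  # C[i, k] = integral of hat i over coarse interval k
    for e, h in enumerate(hf):
        M[e:e + 2, e:e + 2] += h / 6.0 * np.array([[2.0, 1.0], [1.0, 2.0]])
        k = e // 2          # fine element e lies in coarse interval e // 2
        C[e, k] += h / 2.0
        C[e + 1, k] += h / 2.0

    # ||P w||^2 = w^T S w and ||w||^2 = w^T diag(hc) w for piecewise constant w.
    S = C.T @ np.linalg.solve(M, C)
    scale = 1.0 / np.sqrt(hc)
    lam = np.linalg.eigvalsh(scale[:, None] * S * scale[None, :])
    return np.sqrt(lam.min())

print(temporal_infsup_constant(np.linspace(0.0, 1.0, 17)))
print(temporal_infsup_constant([0.0, 0.1, 0.4, 0.45, 1.0]))
```

Both printed values should lie between |$\sqrt{3/4}$| and |$1$|. The lower bound can be seen by testing a piecewise constant |$w$| with the combination of the midpoint hat functions weighted by the values of |$w$|.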

Since in our experiments with the method from Andreev (2013), with this alternative choice of |$Y_t$| the numerical results are slightly better than when taking |$Y_t$| to be the space of discontinuous piecewise linears w.r.t. |${\mathcal T}$|⁠, we will report on results obtained with this alternative choice for |$Y_t$|⁠.

5. Numerical experiments

For the simplest possible case of the heat equation in one space dimension discretized using as ‘primal’ trial space |$X^{\delta }$| the space of continuous piecewise bilinears w.r.t. a uniform partition into squares, we compare the accuracy of approximations provided by the newly proposed method (i.e., the Galerkin discretization of (2.10) with trial space here denoted by |$Y_{\textrm{new}}^{\delta } \times X^{\delta }$|⁠) with those obtained with the method from Andreev (2013) (i.e., the Galerkin discretization of (2.8)). We implement the latter method in the form (3.15), i.e., after eliminating |$\sigma ^{\delta }$|⁠. The remaining trial space is denoted here by |$Y_{\textrm{Andr}}^{\delta } \times X^{\delta }$|⁠. So we take |$T=1$|⁠, i.e., |$I=(0,1)$|⁠, and with |$\varOmega :=(0,1)$|⁠, |$H:=L_2(\varOmega )$|⁠, |$V:=H^1_0(\varOmega )$|⁠, |$a(t;\eta ,\zeta ):=\int _{\varOmega } \eta ^{\prime } \zeta ^{\prime }\,\textrm{d}x$|⁠. With |$\frac{1}{h_t}=\frac{1}{h_x} \in \mathbb N$|⁠, we set
$$\begin{alignat*}{3} X^{\delta}&:=&&\,\{v \in H^1(I)\colon v|_{(ih_t,(i+1)h_t)} \in P_1\} &&\otimes \{v \in H_0^1(\varOmega)\colon v|_{(ih_x,(i+1)h_x)} \in P_1\},\\ Y_{\textrm{new}}^{\delta}&:=&&\,\{v \in L_2(I)\colon v|_{(ih_t,(i+1)h_t)} \in P_0\} &&\otimes \{v \in H_0^1(\varOmega)\colon v|_{(ih_x,(i+1)h_x)} \in P_1\},\\ Y_{\textrm{Andr}}^{\delta}&:=&&\,\{v \in H^1(I)\colon v|_{(ih_t/2,(i+1)h_t/2)} \in P_1\} && \otimes\{v \in H_0^1(\varOmega)\colon v|_{(ih_x,(i+1)h_x)} \in P_1\}. \end{alignat*}$$
Note that |$\dim Y^{\delta }_{\textrm{new}} \approx \dim X^{\delta }$| and |$\dim Y^{\delta }_{\textrm{Andr}} \approx 2\dim X^{\delta }$|⁠. The total number of nonzeros in the whole system matrix of the new method is asymptotically a factor 2 smaller than this number for Andreev’s method.
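The dimension counts can be made explicit for the spaces just defined. The following small sketch assumes |$n:=\frac{1}{h_t}=\frac{1}{h_x}$| and simply counts degrees of freedom of the temporal and spatial factors; the function name is an illustrative assumption.

```python
def space_dimensions(n):
    """Dimensions of X, Y_new and Y_Andr for 1/h_t = 1/h_x = n."""
    dim_space = n - 1                      # P1 functions in H^1_0(0,1) on n cells
    dim_X = (n + 1) * dim_space            # continuous P1 in time on n cells
    dim_Y_new = n * dim_space              # P0 in time on n cells
    dim_Y_andr = (2 * n + 1) * dim_space   # continuous P1 in time on 2n cells
    return dim_X, dim_Y_new, dim_Y_andr

for n in (8, 64, 512):
    dim_X, dim_Y_new, dim_Y_andr = space_dimensions(n)
    print(n, dim_Y_new / dim_X, dim_Y_andr / dim_X)
```

The printed ratios tend to |$1$| and |$2$|, respectively, as |$n$| grows. The count concerns unknowns only and says nothing about the number of nonzeros in the system matrices.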

Prescribing both a smooth exact solution |$u(t,x)=e^{-2t} \sin \pi x$| and a singular one |$u(t,x)= e^{-2t} |t-x| \sin \pi x$|⁠, Fig. 1 shows the errors |$e^{\delta }:= u - u^{\delta }$| in the |$X$|-norm as a function of |$\dim X^{\delta }$|⁠.

Fig. 1. |$\|e^{\delta }\|_X$| vs. |$\dim X^{\delta }$| for both numerical methods. Left: |$u(t,x)=e^{-2t} \sin \pi x$|. Right: |$u(t,x)=e^{-2t} |t-x| \sin \pi x$|.

The norms of the errors in the Galerkin solutions found by the two methods are nearly indistinguishable from one another. Furthermore, the observed convergence rates |$1/2$| and |$1/4$|⁠, respectively, are the best possible ones, that in view of the polynomial degrees of |$X^{\delta }$| and |$Y^{\delta }$| (new method) or that of |$X^{\delta }$| (Andreev’s method) and the regularity of the solutions, can be expected with the application of uniform meshes. (For any |$\varepsilon>0$|⁠, |$e^{-2t} |t-x| \sin \pi x \in H^{\frac 32-\varepsilon }(I \times \varOmega ) \setminus H^{\frac 32}(I \times \varOmega ).$|⁠)

For both solutions and both numerical methods, the errors |$e^{\delta }(T,\cdot )$| measured in |$L_2(\varOmega )$| converge with the better rate |$1$|⁠, i.e., these errors are asymptotically proportional to |$h_x^2=h_t^2$|⁠; see the left-hand picture in Fig. 2. To illustrate that the two methods yield different Galerkin solutions, we show |$e^{\delta }(0, \cdot )$|⁠, measured in the |$L_2(\varOmega )$|-norm on the right-hand-side of Fig. 2.

Fig. 2. Singular solution |$u(t,x)=e^{-2t} |t-x| \sin \pi x$|. Left: |$\|e^{\delta }(T,\cdot )\|_{L_2(\varOmega )}$| vs. |$\dim X^{\delta }$|. Right: |$\|e^{\delta }(0,\cdot )\|_{L_2(\varOmega )}$| vs. |$\dim X^{\delta }$|.

The new method actually yields two approximations for |$u$|, viz. |$u^{\delta }$| and |$\lambda ^{\delta }$|. The secondary approximation |$\lambda ^{\delta }$| is not in |$X$|, but it is in |$Y=L_2(I;V)$|. For both solutions, the errors in |$\lambda ^{\delta }$| measured in the |$Y$|-norm are slightly larger than those in |$u^{\delta }$|; see the left-hand picture in Fig. 3.

Fig. 3. Singular solution |$u(t,x) = e^{-2t} |t-x| \sin \pi x$|. Left: |$\|e^{\delta }\|_Y$| and |$\|u-\lambda ^{\delta }\|_Y$| vs. |$\dim X^{\delta }$| for the symmetric problem. Right: |$\|e^{\delta }\|_X$| vs. |$\dim X^{\delta }$| for the nonsymmetric problem.

Finally, we replaced the symmetric spatial diffusion operator by a nonsymmetric convection-diffusion operator |$a(t;\eta ,\zeta ):=\int _{\varOmega } \bigl (\eta ^{\prime } \zeta ^{\prime } + \beta \eta ^{\prime } \zeta \bigr )\,\textrm{d}x$|. Letting |$\beta := 100$| and again taking the singular solution |$u(t,x)=e^{-2t} |t-x| \sin \pi x$|, the errors |$e^{\delta }$| in the |$X$|-norm of both Galerkin solutions vs. |$\dim X^{\delta }$| are given in the right-hand picture of Fig. 3. We once again see that the two methods show very comparable convergence behavior.
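To make the structure of this bilinear form concrete, the following sketch assembles its matrix with respect to the interior hat functions on a uniform spatial mesh. It is an illustration under stated assumptions, not the implementation used for the experiments: the spatial interval is taken to have length 1, the convention is `A[i, j] = a(phi_j, phi_i)`, and the function name is hypothetical.

```python
import numpy as np


def convection_diffusion_matrix(n_x: int, beta: float, length: float = 1.0):
    """Matrix of a(eta, zeta) = int eta' zeta' + beta eta' zeta dx for P1 hats.

    Returns (A, K, C) with A = K + beta * C, where K is the stiffness matrix
    and C[i, j] = int phi_j' phi_i dx is the convection matrix.
    """
    h = length / n_x
    m = n_x - 1  # number of interior nodes (homogeneous Dirichlet conditions)
    off = np.ones(m - 1)
    # Stiffness matrix: tridiagonal (-1, 2, -1) / h.
    K = (2.0 * np.eye(m) - np.diag(off, 1) - np.diag(off, -1)) / h
    # Convection matrix: +1/2 on the superdiagonal, -1/2 on the subdiagonal.
    C = 0.5 * (np.diag(off, 1) - np.diag(off, -1))
    return K + beta * C, K, C


if __name__ == "__main__":
    A, K, C = convection_diffusion_matrix(n_x=8, beta=100.0)
    print("A symmetric:      ", np.allclose(A, A.T))
    print("C skew-symmetric: ", np.allclose(C, -C.T))
    print("sym. part equals K:", np.allclose(0.5 * (A + A.T), K))
```

With homogeneous Dirichlet conditions the convection part is skew-symmetric, so the symmetric part of the matrix is still the stiffness matrix, while for |$\beta = 100$| the nonsymmetric part is large.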

6. Conclusion

Three related (Petrov–) Galerkin discretizations of space–time variational formulations were analyzed. The Galerkin scheme introduced by Steinbach (2015) has the lowest computational cost and applies on general space–time meshes, but, depending on the exact solution, the numerical solutions can be far from quasi-optimal in the natural mesh-independent norm. The minimal residual Petrov–Galerkin discretization introduced by Andreev (2013) yields, for suitable pairs of trial and test spaces, quasi-optimal approximations from the trial space. For suitable pairs of trial spaces, Galerkin discretizations of a newly introduced mixed space–time variational formulation also yield quasi-optimal approximations, and they reach the same accuracy at a lower computational cost than the method from Andreev (2013).

Funding

NSF (grant DMS 172029 to R.S.); Netherlands Organization for Scientific Research (NWO) (under contract no. 613.001.652 to J.W.).

Footnotes

1

Here and in the following, |$\inf _{\{u \in X^{\delta }\colon \partial _t u \neq 0\}} \sup _{0\neq v \in Y^{\delta }} \frac{(\partial _t u)(v)}{\|\partial _t u\|_{Y^{\prime }}\|v\|_Y}$| should be read as |$1$| in the case that |$\{u \in X^{\delta }\colon \partial _t u \neq 0\}=\emptyset $|⁠.

2

In the (discontinuous) Petrov–Galerkin community, |$Y^{\delta } \times H^{\delta }$| and |$Z^{\delta }$| are known under the names test search space (or search test space) and projected optimal test space (or approximate optimal test space), respectively.

3

For Galerkin discretizations of (2.10), such a replacement of |${E_Y^{\delta }}^{\prime } A_s E_Y^{\delta }$| by an equivalent operator will result in an inconsistent discretization.

References

Andreev, R. (2012) Stability of space–time Petrov–Galerkin discretizations for parabolic evolution equations. Ph.D. Thesis. ETH, Zürich.

Andreev, R. (2013) Stability of sparse space–time finite element discretizations of linear parabolic evolution equations. IMA J. Numer. Anal., 33, 242–260.

Andreev, R. (2016) Wavelet-in-time multigrid-in-space preconditioning of parabolic evolution equations. SIAM J. Sci. Comput., 38, A216–A242.

Babuška, I. & Janik, T. (1989) The h-p version of the finite element method for parabolic equations I. The p version in time. Numer. Methods Partial Differ. Equ., 5, 363–399.

Babuška, I. & Janik, T. (1990) The h-p version of the finite element method for parabolic equations II. The h-p version in time. Numer. Methods Partial Differ. Equ., 6, 343–369.

Brézis, H. & Ekeland, I. (1976) Un principe variationnel associé à certaines équations paraboliques. Le cas dépendant du temps. C. R. Acad. Sci. Paris Sér. A-B, 282, Ai, A1197–A1198.

Broersen, D. & Stevenson, R. P. (2014) A robust Petrov–Galerkin discretisation of convection-diffusion equations. Comput. Math. Appl., 68, 1605–1618.

Carstensen, C. (2002) Merging the Bramble–Pasciak–Steinbach and the Crouzeix–Thomée criterion for |${H}^1$|-stability of the |${L}^2$|-projection onto finite element spaces. Math. Comp., 71, 157–163.

Cohen, A., Dahmen, W. & Welper, G. (2012) Adaptivity and variational stabilization for convection–diffusion equations. ESAIM Math. Model. Numer. Anal., 46, 1247–1273.

Dautray, R. & Lions, J.-L. (1992) Mathematical Analysis and Numerical Methods for Science and Technology. Evolution Problems I, vol. 5. Berlin: Springer.

Devaud, D. & Schwab, C. (2018) Space–time |$hp$|-approximation of parabolic equations. Calcolo, 55, Art. 35, 23.

Dupont, T. (1982) Mesh modification for evolution equations. Math. Comp., 39, 85–107.

Ern, A., Smears, I. & Vohralík, M. (2017) Guaranteed, locally space-time efficient, and polynomial-degree robust a posteriori error estimates for high-order discretizations of parabolic problems. SIAM J. Numer. Anal., 55, 2811–2834.

Führer, T. & Karkulik, M. (2019) Space–time least-squares finite elements for parabolic equations. Technical Report, arXiv:1911.01942.

Gander, M. J. & Neumüller, M. (2016) Analysis of a new space–time parallel multigrid algorithm for parabolic problems. SIAM J. Sci. Comput., 38, A2173–A2208.

Gaspoz, F. D., Heine, C.-J. & Siebert, K. G. (2016) Optimal grading of the newest vertex bisection and |${H}^1$|-stability of the |${L}_2$|-projection. IMA J. Numer. Anal., 36, 1217–1241.

Gunzburger, M. D. & Kunoth, A. (2011) Space–time adaptive wavelet methods for control problems constrained by parabolic evolution equations. SIAM J. Contr. Optim., 49, 1150–1170.

Kato, T. (1960) Estimation of iterated matrices, with application to the von Neumann condition. Numer. Math., 2, 22–29.

Langer, U., Moore, S. E. & Neumüller, M. (2016) Space-time isogeometric analysis of parabolic evolution problems. Comput. Methods Appl. Mech. Eng., 306, 342–363.

Nayroles, B. (1976) Deux théorèmes de minimum pour certains systèmes dissipatifs. C. R. Acad. Sci. Paris Sér. A-B, 282, Aiv, A1035–A1038.

Neumüller, M. & Smears, I. (2019) Time-parallel iterative solvers for parabolic evolution equations. Adv. Comput. Math., 45, 1031–1066.

Rekatsinas, N. & Stevenson, R. (2019) An optimal adaptive tensor product wavelet solver of a space-time FOSLS formulation of parabolic evolution problems. Adv. Comput. Math., 45, 1031–1066.

Schwab, C. & Stevenson, R. P. (2009) A space-time adaptive wavelet method for parabolic evolution problems. Math. Comp., 78, 1293–1318.

Schwab, C. & Stevenson, R. P. (2017) Fractional space–time variational formulations of (Navier)–Stokes equations. SIAM J. Math. Anal., 49, 2442–2467.

Steinbach, O. & Zank, M. (2018) Coercive Space–time Finite Element Methods for Initial Boundary Value Problems. Berichte aus dem Institut für Angewandte Mathematik, Bericht 2018/7. Technische Universität Graz.

Steinbach, O. (2015) Space–time finite element methods for parabolic problems. Comput. Methods Appl. Math., 15, 551–566.

Scott, L. R. & Zhang, S. (1990) Finite element interpolation of nonsmooth functions satisfying boundary conditions. Math. Comp., 54, 483–493.

Tantardini, F. & Veeser, A. (2016) The |${L}^2$|-projection and quasi-optimality of Galerkin methods for parabolic equations. SIAM J. Numer. Anal., 54, 317–340.

Urban, K. & Patera, A. T. (2014) An improved error bound for reduced basis approximation of linear parabolic problems. Math. Comp., 83, 1599–1615.

Voulis, I. & Reusken, A. (2018) A time dependent Stokes interface problem: well-posedness and space-time finite element discretization. ESAIM Math. Model. Numer. Anal., 52, 2187–2213.

Wloka, J. (1982) Partielle Differentialgleichungen: Sobolevräume und Randwertaufgaben. Stuttgart: B. G. Teubner.

Xu, J. & Zikatanov, L. (2003) Some observations on Babuška and Brezzi theories. Numer. Math., 94, 195–202.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.