Stabilization of High Order Cut Finite Element Methods on Surfaces

We develop and analyze a stabilization term for cut finite element approximations of an elliptic second order partial differential equation on a surface embedded in $\mathbb{R}^d$. The new stabilization term combines properly scaled normal derivatives at the surface together with control of the jump in the normal derivatives across faces and provides control of the variation of the finite element solution on the active three dimensional elements that intersect the surface. We show that the condition number of the stiffness matrix is $O(h^{-2})$, where $h$ is the mesh parameter. The stabilization term works for linear as well as for higher-order elements and the derivation of its stabilizing properties is quite straightforward, which we illustrate by discussing the extension of the analysis to general $n$-dimensional smooth manifolds embedded in $\mathbb{R}^d$, with codimension $d-n$. We also formulate properties of a general stabilization term that are sufficient to prove optimal scaling of the condition number and optimal error estimates in energy- and $L^2$-norm. We finally present numerical studies confirming our theoretical results.


Introduction
CutFEM for Surface Partial Differential Equations. CutFEM (Cut Finite Element Method) provides a new high order finite element method for solution of partial differential equations on surfaces. The main approach is to embed the surface into a three dimensional mesh and to use the restriction of the finite element functions to the surface as trial and test functions. More precisely, let Γ be a smooth closed surface embedded in R 3 with exterior unit normal n and consider the Laplace-Beltrami problem: find u ∈ H 1 (Γ) such that a(u, v) = l(v) ∀v ∈ H 1 (Γ). (1.1) The forms are defined by where ∇ Γ is the surface gradient and f ∈ L 2 (Γ) a given function with average zero. Let now Γ be embedded into a polygonal domain Ω equipped with a quasi-uniform mesh T h,0 and let Γ h be a family of discrete surfaces that converges to Γ in an appropriate manner. Let the active mesh T h consist of all elements that intersect Γ h . Let V h be the space of continuous piecewise polynomial functions on T h with average zero on Γ h . The basic CutFEM takes the form: find u h ∈ V h such that where (1.4) and f h is a suitable representation of f on Γ h . In the context of surface partial differential equations this approach, also called trace finite elements, was first introduced in [26] and has then been developed in different directions including high order approximations in [29], discontinuous Galerkin methods [6], transport problems [28], embedded membranes [12], coupled bulk-surface problems [8] and [19], minimal surface problems [11], and time dependent problems on evolving surfaces [23], [25], [27], and [30]. We also refer to the overview article [2] and the references therein.
Previous Work on Preconditioning and Stabilization. The basic CutFEM method (1.3) manufactures a potentially ill conditioned linear system of equations and stabilization or preconditioning is therefore needed. In [24] it was shown that diagonal preconditioning works for piecewise linear approximation and in [29] it was shown that also quadratic elements can be preconditioned if the full gradient is used instead of the tangential gradient in the bilinear form a h , see also [13] where this formulation was proposed.
In [3] a stabilized version of (1.3) was introduced. The method takes the form where s h is a stabilization form which is added in order to ensure stability of the method. More precisely we show that there is a constant such that for all v ∈ V h , v 2 (1.6) Using (1.6) and the properties of a h we can show that the condition number κ of the stiffness matrix satisfies the optimal bound κ h −2 . (1.7) In [3] only piecewise linear finite elements were considered and the stabilization term was defined by where [D 1 n F v] is the jump in the normal derivative of v at the face F , F h is the set of internal faces (i.e. faces with two neighbors) in the active mesh T h , and c F > 0 is a stabilization parameter. Optimal order bounds on the error and condition number were established. Furthermore, for linear elements the so called full gradient stabilization s h (v, w) = ch(∇v, ∇w) T h (1.9) was proposed and analyzed in [7] as a simpler alternative, and for higher order elements the normal gradient stabilization (1.10) was proposed and analysed independently in [5] and [18]. The restrictions on α essentially comes from a lower bound which implies that optimal order a priori error estimates hold and an upper bound which implies that the desired stability estimate (1.6) holds. Note that stronger control than (1.6) is sometimes demanded, for instance in cut discontinuous Galerkin methods [6], which may lead to stronger upper restrictions on α.
The last two stabilization terms, (1.9) and (1.10), can however not be used in the discretization of bulk problems without destroying the convergence order while the ghost penalty stabilization term [10], which as (1.8) acts on the element faces, can be used. Thus, in many applications where both bulk and surface problems occur face stabilization is still needed when discretizing the problem with CutFEM. It should be noted that the stabilization term (1.8) has been used also for other reasons than controlling the condition number. It is used for example in [9] to stabilize for convection and in [22] to improve on the accuracy of computing the mean curvature vector of a surface. We study the problem of computing the mean curvature vector of a surface based on the Laplace-Beltrami operator in Section 9.3.
New Contributions. In this work we consider stabilization for linear as well as higherorder elements using a combination of face stabilization and normal derivative stabilization at the actual surface. More precisely the stabilization term takes the form where γ ∈ [0, 1] is a parameter, [D j n F v] is the jump in the normal derivative of order j across the face F , and c F,j > 0, c Γ,j > 0, are stabilization constants. This stabilization term with γ = 1 has also been used in a high order space-time CutFEM on evolving surfaces [30]. We note that also for this stabilization term there is a certain range of scaling with the mesh parameter h, and in fact the following stronger version of (1.6) holds where we get stronger control of the gradient for smaller γ ∈ [0, 1], which corresponds to stronger stabilization.
Using the combination of the face and surface stabilization terms the proof of (1.6) consists of two steps: • Using the face terms we can estimate the L 2 norm of a finite element function at an element in terms of the L 2 norm at a neighboring element which share a face and the stabilization term on the shared face. Repeating this procedure we can pass from any element to an element which has a large intersection, • Using the stabilization of normal derivatives at the surface we can control a finite element function on elements which have large intersection with Γ h .
Using shape regularity we can show that one can pass from any element in T h to an element with a large intersection in a uniformly bounded number of steps, see [3] and [15]. The analysis is quite straight forward and may in fact also be extended to the case on an n-dimensional smooth manifold embedded in R d for general codimension cd = d − n. An important special case is a curve embedded in R 3 , which may for instance be an intersection of two surfaces, see also [4] for generalizations to so called mixed dimensional problems. For clarity, we consider a two dimensional surface embedded in R 3 throughout the paper and discuss the minor modifications necessary in a separate section. We also refer to [5] for general background on the analysis of problems with higher codimension embeddings.
Outline. In Section 2 we introduce the model problem, in Section 3 we formulate the finite element method, in Section 4 we collect some basic preliminary results, in Section 5 we formulate the properties of the stabilization term, in Section 6 we establish an optimal order condition number estimate, in Section 7 we discuss the extension to problems on ndimensional smooth manifolds embedded in R d , in Section 8 we prove optimal order a priori error estimates, and in Section 9 we present numerical results that support our theoretical findings. In the last section, Section 10, we conclude and discuss the advantages with the proposed stabilization.

The Laplace-Beltrami Problem on a Surface
The Surface. Let Γ be a smooth, closed, simply connected surface in R 3 with exterior unit normal n embedded in a polygonal domain Ω ⊂ R 3 . Let U δ (Γ) be the open tubular neighborhood of Γ with thickness δ > 0, where d(x) is the signed distance function of Γ with d < 0 in the interior of Γ and d > 0 in the exterior. Note that n = ∇d is the outward-pointing unit normal on Γ. Then there is δ 0 > 0 such that for each x ∈ U δ 0 (Γ) the closest point projection: x to a unique closest point on Γ. In particular, we require where for x ∈ Γ, κ i are the principal curvatures. We also define the extension of a function u on Γ to U δ 0 (Γ) by u e (x) = u(p(x)), x ∈ U δ 0 (Γ). (2.4) The Problem. We consider the problem: Here ∇ Γ is the tangential gradient on Γ defined by where We used the extension of u which is constant in the normal direction to Γ to formally define ∇ Γ u. However, the tangential derivative depends only on the values of u on Γ and does not depend on the particular choice of extension. Using Lax-Milgram's Lemma we conclude that there is a unique solution to (2.5), and we also have the elliptic regularity estimate The Method

The Mesh and Finite Element Space
• The background mesh and space: Let T h,0 be a quasi-uniform partition of Ω with mesh parameter h ∈ (0, h 0 ] into shape regular tetrahedra. On T h,0 we define V p h,0 to be the space of continuous piecewise polynomials of degree p ≥ 1.
• The discrete surfaces: Let Γ h be a sequence of surfaces that converges to Γ. We specify the precise assumptions on the convergence below.
• The active mesh and the induced surface mesh: We denote by T h the set consisting of elements in the background mesh T h,0 that are cut by the discrete geometry Γ h . These elements form the so called active mesh. See Fig. 1 for an illustration. The restriction of the active mesh to the discrete surface Γ h manufactures an induced cut surface mesh that we denote by • The finite element space: The restriction of the background space to the active mesh together with the condition Γ h v ds h = 0 defines our finite element space V p h , Forms.

The Finite Element Method
• The forms a h and l h are defined by where the tangential gradient ∇ Γ h is defined by • The stabilization form s h is defined by where Here (D j a v)| x is the j-th directional derivative of v along the line defined by the unit vector a(x), [w]| F denotes the jump of w over the face F , F h is the set of internal faces (i.e. faces with two neighbors) in the active mesh T h , c Γ,j > 0, c F,j > 0 are stabilization parameters, and γ ∈ [0, 1] is a parameter.
Remark 3.1 In the special case p = 1 we obtain

10)
where γ ∈ [0, 1] which can be compared with the pure face penalty term developed in [3]. Note that the scalings with the mesh parameter h are different in the two terms and that for γ > 0 the face penalty term is weaker in the stabilization term we introduce here.

Assumptions
A1. We assume that the closed, simply connected approximation Γ h and its unit normal n h are such that for all h ∈ (0, h 0 ], that n · n h > 0 on Γ h , and that we have a uniform bound on the curvature κ h . A2. We assume that the mesh is sufficiently small so that there is a constant c independent of the mesh size h such that T h ⊂ U δ (Γ) with δ = ch ≤ δ 0 and that the closest point mapping p(x) : Γ h → Γ is a bijection.
A3. We assume that the data approximation f h is such that Here ds = |B|ds h (see Section 4 for details).

Norms
Given a positive semidefinite bilinear form b we let v 2 b = b(v, v) denote the associated seminorm. In particular, we will use the following norms on We let H s (ω), with ω ⊂ R d , denote the standard Sobolev spaces of order s. For ω ⊂ Γ or ω ⊂ Γ h H s (ω) denotes the surface Sobolev space which is based on tangential derivatives.

Some Inequalities
• Under A1 and A2 there is a constant such that the following Poincaré inequality holds for see Lemma 4.1 in [3].
• For T ∈ T h we have the trace inequalities v 2 where the first inequality is a standard trace inequality, see e.g. [1], and the second inequality is proven in [20] and the constant is independent of how Γ h intersects T and of h ∈ (0, h 0 ].
• For T ∈ T h we have the inverse inequality

Interpolation
Let π p h : where N h (T ) ⊂ T h is the union of the elements in T h which share a node with T . In particular, we have the stability estimate

Some Results for Extended and Lifted Functions
Extension. Recall that we define the extension of a function u on Γ to U δ 0 (Γ) by u e (x) = u(p(x)). For u ∈ H s (Γ), s ≥ 0 the following holds where 0 < δ ≤ δ 0 and for h small enough we may take δ ∼ h. See Lemma 3.1 in [29].
Using the definition of the extension of a function u on Γ, the definition of the closest point projection (2.2), the chain rule, and the identity n = ∇d we obtain where H(x) = D 2 d(x). For x ∈ U δ 0 (Γ), the eigenvalues of H(x) are κ 1 (x), κ 2 (x), and 0, with corresponding orthonormal eigenvectors a 1 (x), a 2 (x), and n e (x). We have the identities and H(x) may be expressed in the form see [17,Lemma 14.7]. Using the fact that H is tangential, P Γ H = HP Γ = H, in the identity (4.9) we obtain ∇u e = (I − dH)∇ Γ u. Thus, Lifting. Define the lifting v l of a function v on Γ h to Γ as Using equation (4.13) it follows that and thus we have Here, see [14], and using the fact that n and n h are unit normals it is easy to show that Estimates Related to B. It can be shown, see [3] for instance, that the following estimates hold Surface Measures. For the surface measure on Γ h , ds h (x), and the surface measure on Γ, ds(p(x)), we have the following identity where see Proposition A.1 in [14]. Using the identity n(x) − n h (x) 2 R 3 = 2(1 − n(x) · n h (x)) and our assumptions we obtain the following estimates: Norm Equivalences. Using the identities (4.13) and (4.17) and the bounds in equations (4.20) and (4.23) we obtain the following equivalences and We will also use the following bound The constant in (4.26) depends on the higher-order derivatives of the geometry.

Properties of the Stabilization Term
Here we formulate properties of a general stabilization term that are sufficient to prove that the resulting linear system of equations have an optimal scaling of the condition number with the mesh parameter and that the convergence of the method is of optimal order.
Properties. Let s h be a semi positive definite bilinear form such that Remark 5.1 The first three properties are used to establish the a priori error estimates: P1 is needed to show optimal interpolation error estimates, P2 is used to estimate the consistency error, and P3 is used to show optimal L 2 -error estimates. In P3 the assumption that v ∈ H 2 (Γ) reflects the fact that the solution to the dual problem resides in H 2 (Γ). The two last properties are used to show the condition number estimate: P4 is the inverse inequality and P5 is the Poincaré inequality. For the extension of these properties to the more general case of an n-dimensional smooth manifold embedded in R d , with codimension cd = d − n see Section 7.
Verification of P1-P5. We now show that the stabilization term defined in (3.7) satisfies P1-P5.

P1. Using a standard trace inequality on the faces (equation (4.3)) and the trace inequality (4.4) for the contributions on Γ h we obtain
Next, setting w = v e −π p h v e , using the interpolation error estimate (4.6) and the stability , and thus P1 follows.
P2. Note first that for v ∈ H p+1 (Γ) we have v e s h,F = 0 and thus v e s h = v e s h,Γ . Next, subtracting D j n v e = 0, using Assumption A1, and inequality (4.26) we obtain where we used the estimate n h − n L ∞ (K h ) h p . We conclude that P2 holds for Verification of (5.17). We start from the definition and estimate the two terms on the right hand side as follows. First where for each j = 2, . . . , p and each face F ∈ F h with neighboring elements T 1 and T 2 we used the inverse estimate Second, using a similar approach where we used the inverse estimate for j = 2, . . . , p. Thus (5.17) holds.
where we used the stability of the interpolation operator (4.7) and the stability of the extension operator (4.8). Next we have the following bounds.
where we used the trace inequality (4.3), the interpolation estimate (4.6), and finally the stability (4.8) of the extension operator. Thus we have Term II. Using the fact that D 1 n e v e = 0 we obtain and we arrive at Collecting the bounds we obtain and thus P3 holds.
Simplified Proof of P3 in the Case γ = 1. Using estimate (5.9) with w ∈ V p h and inverse inequality (4.5) we obtain Setting w = π p h v e and using the stability of the interpolation operator (4.7) and the stability of the extension operator (4.8) we obtain P4. Starting from (5.9) with w ∈ V p h and using the inverse inequality we get Using that h ∈ (0, h 0 ] we have h 2γ−3 h −3 for γ ≥ 0 and thus Property P4 holds.
P5. See the proof of Lemma 6.3 below.

Remark 5.2
We note that the normal gradient stabilization (1.10) satisfies P 1 , P 2 , P 4 , and P 5 , see [18] and [5]. To verify P 3 we use that n e · ∇v e = 0, interpolation error estimates, the H 1 -stability of the interpolant, assumption A2 (the inclusion T h ⊂ U δ (Γ) with δ ∼ h), the stability of the extension operator (4.8), and that n h − n e 2 We also used that p ≥ 1. Thus we conclude that P 3 holds for α ≥ −1.

Condition Number Estimate
We shall show that the spectral condition number of the resulting stiffness matrix scales as h −2 , independent of the position of the geometry relative to the background mesh. In particular, we show that the stabilization term defined in (3.7) has Property P5 and controls the condition number for linear as well as for higher-order elements.
Recall that functions v in V p h satisfy the condition Γ h v ds h = 0 and therefore R N is and recall that the condition number is defined by which in terms of the eigenvalues of the symmetric positive definite matrix A h is equal to Theorem 6.1 If there are constants, independent of the mesh size h and of how the surface cuts the background mesh, such that the following hold: Then, the spectral condition number κ(A h ) satisfies Proof.
For v, w ∈ V p h it follows from the continuity of the form A h , the inverse inequality, and the equivalence between the norms, equation (6.3), that From coercivity, the Poincaré inequality, and the equivalence between the norms, equation Using the definition (6.6) of the spectral condition number we get In the following Lemmas we show that all the conditions in Theorem 6.1 are satisfied.
Lemma 6.1 (Continuity and Inf-Sup Condition) There is a constant independent of the mesh size h and of how the surface cuts the background mesh, such that A h is continuous: and A h satisfies the inf-sup condition Proof. The continuity follows directly from the Cauchy-Schwarz inequality. The bilinear A h by definition and the inf-sup condition (6.12) follows from coercivity. Lemma 6.2 (Inverse Inequality) There are constants, independent of the mesh size h and of how the surface cuts the background mesh, such that the following inverse inequality holds Proof. Using the element wise trace inequality (4.4), the inverse inequality (4.5), and Property P4 of the stabilization term we obtain v 2 (6.14) Using the above estimates and recalling the definition of v A h the result follows.
Proof. To prove the Poincaré inequality we will proceed in two main steps: 1. We use the face penalty to reach an element which has a large intersection with Γ h . Here we employ a covering of T h , where each covering set contains elements that have a so called large intersection property, see [3]. 2. For elements with a large intersection the normal derivative control on Γ h is used to control the L 2 norm on the element. To establish the Poincaré inequality (6.15) we shall prove that v 2 Here we note that (6.15) follows from (6.16) since the right hand side of (6.16) satisfies the estimate where we used the definition (3.7) of s h , the Poincaré inequality (4.2), and the fact that In each set T h,x there is one element T x ∈ T h,x which has a large intersection with Γ h , where d is the dimension. (4) The number of sets T h,x to which an element T belongs is uniformly bounded for all T ∈ T h . See [3] for the construction of the covering. We prove (6.16) by considering a set T h,x in the covering and the following steps: Step 1. We shall show that v 2 where F h,x is the set of interior faces in T h,x . To prove (6.19) we recall that for two elements T 1 and T 2 that share a face F it holds v 2 where F j h,x is the set of interior faces in T j h,x . Iterating (6.21) we obtain (6.19).
Step 2. We shall show that there is a constant such that for all T x ∈ T h which have the large intersection property (6.18), To verify (6.22) we define the cylinder and using Taylor's formula for a polynomial v ∈ P p (Cyl δ (K x )), we obtain the bound v 2 Using the following estimate, which we verify below, there is a constant such that for all the desired estimate (6.22) follows for δ ∼ h.
Verification of (6.25). To employ a scaling argument we will construct two regular cylinders with circular cross section and the same center line that may be mapped to a reference configuration, one containing T x and one contained in Cyl δ (K x ). See Figure 6.
Let F x be a plane with unit normal n x , which is tangent to K x at an interior point of K x . Then we have the bound and with K x the closest point projection of K x onto F x we conclude using shape regularity and a uniform bound on the curvature κ h (Assumption A1) that there is a ball B r 1 ,x ⊂ K x ⊂ F x with radius r 1 ∼ h and a δ 1 ∼ h such that Next using shape regularity there is a larger ball B r 2 ,x ⊂ F x with the same center as B r 1 ,x such that where T x is the closest point projection of T x onto F x , and a δ 2 ∼ h such that Clearly, Cyl δ 1 (B r 1 ,x ) ⊂ Cyl δ 2 (B r 2 ,x ) and using a mapping to a reference configuration we conclude that there is a constant such that for all polynomials v ∈ P p (Cyl which in view of the inclusions (6.27) and (6.29) concludes the proof of (6.25).
Step 3. Combining equation (6.19) and (6.22) and using that {T h,x : x ∈ X h } is a cover of T h completes the proof of (6.16) and the lemma.
where γ ∈ [0, 1] is the scaling parameter in s h , see (3.8) and (3.9). The modifications are as follows. In Step 1 we show that .32) Figure 3: Schematic figure illustrating a typical configuration of an element T , the curved intersection with Γ h , the plane F x which is tangent to K x , the cylinders Cyl δ (K x ), Cyl δ 1 (B r 1 ), and Cyl δ 2 (B r 2 ). Note that we have the inclusions B r 1 , which after multiplication by h 2γ corresponds to To show (6.32) we use the estimate ∇v 2 In Step 2 we show that which after multiplication by h 2γ corresponds to h 2γ ∇v 2 Combining (6.33) and (6.36), summing over the cover, and using Property (4) of the cover we obtain (6.31).
Remark 6.1 Note the following: • From (5.43) we see that the scaling of the stabilization term corresponds to the mass matrix. This scaling is the weakest which guarantees the Poincaré inequality (6.15) (Property P4 of the stabilization term).
• For the operator L = α∆ Γ + β (6.37) where α and β are constants, we would have the inverse inequality instead of (6.13), and the resulting condition number estimate is then we note that the desired inverse inequality (6.38) holds for

Extension to Problems on Manifolds with General Codimension Embeddings
The stabilization term (3.7) may be extended to the more general case of an n-dimensional smooth manifold embedded in R d , with codimension cd = d − n > 1, by scaling the face penalty term (3.8) in such a way that the stabilization terms associated with the faces and surface scale in the same way, This is the same scaling as is used for the face stabilization term in the case of piecewise linear elements in general codimension, see Table 1 in [5]. With this definition P1-P3 remain the same and may be verified using the results in [5]. P4-P5 take the more general form P4 is verified as in equation (5.43) and below we comment on the minor modifications in the proof of Lemma 6.3 necessary to verify P5. We next extend the proof of Lemma 6.3, to the case of a general embedding of an ndimensional surface in R d . To that end we first note that Step 1 can be directly carried out for d-dimensional simplices and we obtain v 2 Step 2, the only difference is that we will have cd = d − n orthonormal normal directions {n h,i } cd i=1 , which we may choose to vary smoothly over K x , see [5] for details, and therefore the definition of the cylinder over K x takes the form Using Taylor's formula in several dimensions and integrating over the cylinder (7.6), we obtain the following generalization of (6.24), v 2 where we used the fact that, for a monomial Π cd i=1 We finally note that (6.25) holds in any dimension d and the desired estimate in Step 2 follows for δ ∼ h, and Step 3 is just a combination of Step 1 and 2. Thus the proof of Lemma 6.3 easily extends to the case of general codimension embeddings and we conclude that (7.4) holds.

A Priori Error Estimates
In this section we prove optimal error estimates in the energy norm and in the L 2 norm.

Strang Lemma
Using that A h is continuous and satisfies the inf-sup condition we first show a Strang Lemma which connects the error in the energy norm to the interpolation error and the consistency error.
Lemma 8.1 Let u ∈ H 1 0 (Γ) be the unique solution of (2.5) and u h ∈ V p h the finite element approximation defined by (3.3). Then the following discretization error bound holds Proof. Adding and subtracting an interpolant and using the triangle inequality we get Using the inf-sup condition (6.12) for A h we have Adding and subtracting A h (u e , v) and using the weak formulation (3.3) yields Finally, using the continuity of A h we obtain Collecting the estimates we end up with the desired bound

The Interpolation Error
Next we prove optimal interpolation error estimates using Property P1 of the stabilization term.

The Consistency Error
The approximation of the geometry Γ by Γ h and the approximation of the data f by f h leads to a consistency error that we estimate in the next lemma.
be the solution of (2.5) then, the following bound holds Proof. We have the identity Term I. Adding −a(u, v l ) + l(v l ) = 0, and using the triangle inequality

(8.16)
Term I I . Adding and subtracting Γ h ∇ Γ h u e · ∇ Γ h v|B| ds h , using that B T B −T = P Γ h , changing the domain of integration from Γ to Γ h , using the triangle inequality, norm equivalences in (4.25), estimates (4.20), and (4.23) we obtain

17)
Term I II . Changing the domain of integration we get where we at last used Assumption A3 on the data approximation f h . Together, the bounds of I I and I II , and the Poincaré inequality (4.2), imply Finally, using that u is the solution of (2.5) we have the stability estimate ∇ Γ u L 2 (Γ) f L 2 (Γ) , and we obtain Term II. Using the Cauchy-Schwarz inequality and Property P2 of the stabilization term, see Section 5, we obtain Combining the estimates of I and II we obtain the desired estimate

Error Estimates
We first prove optimal error estimates in the energy norm and then apply a duality argument to obtain L 2 -error estimates Theorem 8.1 Let u ∈ H p+1 (Γ) ∩ H 1 0 (Γ) be the solution of (2.5) and u h ∈ V p h the finite element approximation defined by (3.3). If assumptions A1-A3 hold and the stabilization term has properties P1-P5 then, there is a constant independent of the mesh size h such that the following error bound holds Proof. Using the Strang Lemma followed by the bounds on the interpolation error (8.8) and the consistency error (8.14) we get the desired bound.
be the solution of (2.5) and u h ∈ V p h the finite element approximation defined by (3.3). If assumptions A1-A3 hold and the stabilization term has properties P1-P5 then, there is a constant independent of the mesh size h such that the following error bound holds Proof. The proof is similar to the proof of the L 2 -error estimate in [3] for p = 1. Let e h = u e − u h | Γ h and its lift on Γ be e l h = u − u l h . Recall that Γ u ds = 0 and Γ h u h ds h = 0. We now defineũ h ∈ V p h asũ so that we have Γũ l h = 0. By adding and subtractingũ l h and using the triangle inequality we obtain Term I. Consider the dual problem: It follows from the Lax-Milgram lemma that there exists a unique solution in H 1 (Γ) \ R and we also have the elliptic regularity estimate Setting v = u −ũ l h in (8.31), adding and subtracting an interpolant, and using the weak formulations in (2.5) and (3.3) we obtain

(8.36)
Term I I . Using the Cauchy-Schwarz inequality, the norm equivalences (see Section 4), and the energy norm estimate (Theorem 8.1) we obtain The interpolation error estimate (8.10) with s = 1 yields and finally using the elliptic regularity estimate (8.32) we get Adding and subtracting an interpolant, using the interpolation error estimate (8.40) followed by the elliptic regularity estimate (8.32) yield Thus, and using Property P3 and the elliptic regularity estimate (8.32) we obtain Final Estimate of Term I. Collecting the estimates of Terms I I − I IV and taking ψ = u −ũ l h / u −ũ l h L 2 (Γ) we obtain Term II. Using the definition (8.29) ofũ h , adding Γ h u h ds h = 0, changing the domain of integration and using related bounds, employing the Poincaré inequality (4.2), and the stability estimate (8.42), we obtain Conclusion of the Proof. Using the estimates of Terms I and II in equation (8.30) and the equivalence e l h L 2 (Γ) ∼ e h L 2 (Γ h ) conclude the proof.

Numerical Examples
We consider three different problems. The mesh on the domain where the interface is embedded is always generated independently of the position of the interface Γ and we use Lagrange basis functions.

The Laplace-Beltrami Problem
is the exact solution. We discretize (2.5) using the proposed CutFEM with Lagrange basis functions of degree p=1, 2, and 3.
In Fig. 4 we show the error and the spectral condition number of the linear systems as a function of the mesh size h. The error measured in the L 2 -norm behaves as O(h p+1 ) and in the H 1 -norm as O(h p ), as expected. The condition number behaves as O(h −2 ) independent of the polynomial degree and how the interface cuts the background mesh. The magnitude of the error and also the condition number depends on the stabilization constants in (3.8)- (3.9) and also on the choice of basis functions. For the results in Fig. 4 we have used γ = 1, c F,j = c Γ,j = 2.5 · 10 −j , j = 1, · · · , p. We have not optimized this choice of parameters and other choices may give better results for example smaller condition numbers with almost the same error. We show this for example in Fig. 6 and 8. Other stabilization constants or basis functions can also give better scaling of the condition number with respect to the polynomial degree.
In Fig. 5 and 6 we show the effect of the stabilization on the error and the condition number for linear (p=1) and cubic elements (p=3), respectively. We show the convergence of the error with respect to the L 2 -norm. The results are similar when the error is measured in the H 1 -norm. We compare the proposed stabilization with the stabilization terms (1.8) and (1.10) and also show the condition number after diagonal scaling.
We see in Fig. 5 that for linear elements the errors, measured in the L 2 -norm, using the different stabilization terms and also no stabilization are very similar but we clearly see that when no stabilization is used the condition number can be extremely large depending on the position of the interface relative the background mesh. We have used c F,1 = c Γ,1 = 10 −1 and γ = 1 in the proposed stabilization. Note that the stabilization term (1.8) developed in [3] can be written in the form of the proposed stabilization, see (3.7)-(3.9), but with c Γ,j = 0 for all j and γ = 0. For this stabilization we have chosen c F,1 = 10 −1 but since γ = 0 we get slightly larger errors than the other alternatives. Decreasing the stabilization constant c F,1 will decrease the error. In the stabilization term (1.10) we have chosen c T = 10 −1 and α = 1. We see in Fig. 5 that an alternative to adding stabilization terms is to do a simple diagonal scaling.   For higher order elements than linear the resulting linear systems from unstabilized Cut-FEM are severely ill-conditioned and neither a simple diagonal scaling alone or a pure face stabilization term such as (1.8) improves the condition number significantly, see Fig. 6. When the condition number becomes to large the convergence fails, the error is dominated by roundoff errors and does not decrease with mesh refinement. In general, when the stabilization is not enough the behavior of the error depends on how the geometry cuts the background mesh. Therefore, if the approximation of the interface is slightly changed the magnitude of the error and the convergence shown in Fig. 6, where cubic elements are used, may change a lot for the unstabilized method and the method with only face stabilization. With the proposed stabilization and also with the stabilization term (1.10) we obtain as expected optimal convergence orders and the condition number of the linear systems scales as O(h −2 ) independent of how the interface cuts the background mesh. In Fig. 6 we also show results with different constants in the stabilization terms. We see that the magnitude of the error decreases as the stabilization term gets weaker. However, when the stabilization is too weak the condition number is large and roundoff errors dominate.
In Fig. 6 we see that the proposed stabilization and the stabilization term (1.10) resulted in very similar errors however the condition number is smaller when the stabilization term (1.10) is added to the weak form. The difference in the condition number decreases after a diagonal scaling of both matrices. We emphasize that all these results depend on the choice of parameters in the stabilization terms and the basis functions used. We have not optimized the parameters in the stabilization terms. For example, using cubic elements and the proposed stabilization term there are six constants and the parameter γ that one can modify to optimize the magnitude of the error and the condition number.
Next we vary both γ and the constants c F,j and c Γ,j in the proposed stabilization term and study the error and the condition number in Fig. 7 and 8 for linear and cubic elements, respectively. By decreasing the constants c F,j and c Γ,j and/or increasing γ we weaken the stabilization. In general, the error decreases while the condition number increases as the stabilization is weakened. The condition number also increases when the stabilization becomes too strong. We see this for the linear elements as well as the cubic elements. A strengthening of the stabilization sometimes decreases the condition number significantly but only increases the error slightly. Compare for example the black circles connected with a dashed dotted line, γ = 0 in Fig. 8 with the black circles connected with a solid line γ = 1 in the same figure. By lowering γ the error increases with a factor of 2.5 in this case but the condition number decreases with a factor 850. Comparing the results in Fig. 7 and 8 we see that the choice of parameters in the stabilization term have larger impact on the cubic elements than the linear elements. We also see that with other choices of parameters than the ones used for the results in Fig. 4 the condition number can scale much better with respect to the polynomial degree.

The Mass Matrix
We now consider the problem of finding u h ∈ V p h such that where s h is the stabilization term defined in (3.7). The interface Γ is a circle of radius 1 centered at the origin and we generate a uniform triangular mesh with h = h x 1 = h x 2 on the   . We let f h be We study the condition number of the mass matrix using different degrees of polynomials, p, and different stabilization terms. Fig. 9 shows that the stabilization term we propose results in optimal convergence rates in the L 2 -norm and optimal scaling of the condition number (O(1)) of the mass matrix as it did for the Laplace-Beltrami operator. As before, the parameters used in (3.7)-(3.9) influence the magnitude of the error and the condition number. For the results shown in Fig. 9 we have used c F,j = c Γ,j = 0.03 · 20 −j , j = 1, · · · , p. Other choices could give smaller condition numbers without increasing the error significantly and also give better scaling with respect to the polynomial degree of the condition number.
In Fig. 10 and 11 we compare different stabilization terms for linear and cubic elements, respectively. The proposed stabilization with γ = 1 is compared with using no stabilization i.e., c F,j = c Γ,j = 0, a pure face stabilization i.e. c Γ,j = 0 with γ = 1, a pure interface stabilization i.e. c F,j = 0 with γ = 1, and the stabilization term (1.10) with α = 1. In Fig. 10 we also show results with two different constants in the stabilization terms. We see that only the proposed stabilization and the stabilization term (1.10) yield optimal scaling of the condition number (O(1)) and both the error and the condition number are very similar for these two stabilization terms.
For cubic elements we show results with three different constants in the stabilization terms, see Fig. 11. Decreasing the constants in the stabilization decreases the magnitude of the error but the condition number increases. When the stabilization is too weak the convergence rate is not optimal.
Our observations are similar to what we observed for the Laplace-Beltrami operator. For higher order elements the linear systems are severely ill-conditioned without stabilization and the behavior of the error is oscillatory and the convergence rate will depend on how the geometry cuts the background mesh. We also see that neither the pure face stabilization nor a pure interface stabilization give enough control of the condition number and that we need, as we propose, the combination of them.

The Mean Curvature Vector
We consider the interfaces given by the zero contour of the following level set functions Deco-cube (9.6) (9.7) We generated meshes independently of the position of the given interface. We define the mesh size by h = 1/N 1 d where N denotes the total number of nodes. We construct an approximate level set function φ h using the nodal interpolant π 1 h φ on the background mesh and let the    interface be the zero level set of φ h . We use linear Lagrange basis functions. Given the discrete coordinate map x Γ h : Γ h x → x ∈ R d we want to find the stabilized discrete mean where s h is a stabilization term. The torus and the deco-cube are examples from [22]. In [22] we proved that using the ghost penalty stabilization (1.8) the method (9.8) is a convergent first order method with respect to the L 2 -norm. Therefore, we also expect to obtain first order accurate approximations of the mean curvature vector using the proposed stabilization term. We compare the approximation from (9.8) with the vector H = − (∇ · n) n, where n = ∇φ |∇φ| . In Fig. 12 we show the error in the L 2 -norm using the proposed stabilization with γ = 0 and the stabilization term (1.10) developed in [5] and [18] with α = −1. We emphasize that the analysis in [22] requires control over the jumps in the normal derivatives across edges on Γ h , which can be controlled using (1.8), and without appropriate stabilization we don't expect any convergence in L 2 . Note also that we do not expect the element normal gradient stabilization to provide enough stabilization to ensure convergence of the mean curvature vector since some additional tangential control is required and thus this is a situation where face stabilization comes in naturally. In practice, we note that depending on the stabilization parameter we see that we are able to obtain first order convergence with both stabilization terms. However, the magnitude of the errors are always smaller using the proposed stabilization compared to using the stabilization term (1.10). The magnitude of the error using the pure face stabilization is similar to the results with the proposed stabilization and not shown here. Furthermore, the condition number is better using the proposed stabilization term. We emphasize that when higher order elements than linear are used it is of vital importance to use the proposed stabilization term to control the condition number since the pure face stabilization does not control the condition number and therefore the error may be dominated by roundoff errors. Full details of high order approximations of the mean curvature vector using this approach are presented in [16].
In this example the face stabilization is of importance and in such cases the proposed stabilization is preferable compared to the stabilization term (1.10).

Discussion
A strategy to control the condition number of the system matrix resulting from cut finite element discretizations, independently of how elements in the background mesh are cut by the geometry, is to add a stabilization term to the weak formulation. For higher order elements than linear the linear systems resulting from cut finite element discretizations without stabilization are extremely ill-conditioned and it is not obvious how to precondition such linear systems. Therefore, the strategy of adding a stabilization term to the weak form has become a popular alternative to cure this ill-conditioning. The stabilization term most frequently used together with CutFEM is the ghost penalty stabilization [10] which acts only on the element faces. We have seen in this work that when higher order elements than linear are used in the cut finite element discretizations of surface PDEs the condition number can not be controlled by only adding such a face stabilization term. We have proposed a remedy where we keep the face stabilization but also add a stabilization term acting on the interface/surface and prove that this new stabilization term controls the condition number for both linear as well as higher order elements.
As discussed in the introduction and in Section 9 a different stabilization term, the normal gradient stabilization (1.10) which is evaluated on each cut element, has recently been proposed in [5] and [18]. For surface PDEs this stabilization also controls the condition number for linear as well as higher order elements. This term results in less non-zero entries in the stiffness matrix than the stabilization term we propose since it is local while the face stabilization term involves neighboring elements. The implementation of (1.10) is also easier than the stabilization we propose. However, there are a couple of reasons why we choose to keep the ghost penalty term and propose a stabilization term based on this term.
• The element stabilization (1.10) cannot be used in the discretization of bulk problems since it would destroy the convergence order while the ghost penalty stabilization could be used without destroying the convergence order and with control of the condition number of the linear systems. Many applications include coupled bulk-surface problems and for those problems the ghost penalty stabilization is needed. In such applications, the effect of the proposed stabilization on the number of non-zero elements is very small and since the face stabilization is needed for the bulk problem the implementation of the proposed stabilization is also easy as only the stabilization term acting on the surface has to be added.
• The ghost penalty stabilization has shown to be of importance also for other reasons than controlling the condition number. In [9], for example, we show that by adding such a face stabilization we get a stable discretization for convection dominated problems on surfaces and no other stabilization term such as for example a SUPG term is needed. In [22] we show that with the same face stabilization a method for computing the mean curvature vector for piecewise linear surfaces based on the Laplace-Beltrami operator enjoys first order convergence in the L 2 -norm while, in general, without the stabilization convergence cannot be expected. In Section 9.3 we compute the mean curvature vector of three surfaces and show that we obtain better accuracy with the proposed stabilization than with the stabilization (1.10). Our method for computing the mean curvature vector can also be extended to give high order approximations [16]. For time dependent problems on surfaces and coupled-bulk surface problems adding the ghost penalty stabilization enables to directly approximate space-time integrals in space-time cut finite element formulations using quadrature rules for the integrals over time [23,30]. This makes the implementation of space-time CutFEM convenient when higher order elements are used, in particular for coupled bulk-surface problems. The proposed stabilization thus also has all these advantages and in addition gives control of the condition number for linear as well as higher order elements.
• The proof of the condition number estimate is simple when the proposed stabilization term is used. The main step in the proof is to show that one covers a non empty set inside an element by going in the normal direction of the surface from the part of the surface that intersects the element. For the proposed stabilization one only needs to show that such a non-empty set exist for elements that have a large intersection with the surface. However, to obtain the desired Poincaré inequality with the stabilization term (1.10) one needs to show that such a set exist in each element and in particular that this set is large enough [5,18]. We use the simplified structure of the proof to extend our results to the case of general codimension embeddings, see Section 7 where we present the minor modifications of the proof to Lemma 6.3 to verify the generalized version (7.4) of the Poincaré inequality P5.
• We also note that the stabilization term (1.10) requires the computation of the normal vector inside an element. When the interface is implicitly represented by a higher dimensional function, for example a distance function, this normal vector can be taken to be the normal vector of the level sets of the function defining the interface implicitly. However, when an explicit representation is used for the surface a normal vector inside an element is not immediately available but normal vectors on the surface, used in the proposed stabilization, are available.