On the rate of convergence of Yosida approximation for the nonlocal Cahn-Hilliard equation

It is well known that one can construct solutions to the nonlocal Cahn-Hilliard equation with singular potentials via Yosida approximation with parameter $\lambda \to 0$. The usual method is based on compactness arguments and does not provide any rate of convergence. Here, we fill this gap and obtain an explicit convergence rate $\sqrt{\lambda}$. The proof is based on the theory of maximal monotone operators and on the observation that the nonlocal operator is of Hilbert-Schmidt type. Our estimate can provide convergence results for Galerkin methods in which the parameter $\lambda$ is linked to the discretization parameters, yielding appropriate error estimates.


Introduction
The nonlocal version of the Cahn-Hilliard equation was introduced in the early 1990s by G. Giacomin and J. Lebowitz [46]. They considered the hydrodynamic limit of a microscopic model describing a $d$-dimensional lattice gas evolving via a Poisson nearest-neighbor process and derived a nonlocal energy functional of the form
$$E(u) = \frac{1}{4}\int_\Omega\int_\Omega J(x-y)\,|u(x)-u(y)|^2\,dx\,dy + \int_\Omega F(u(x))\,dx, \tag{1.1}$$
where $\Omega$ is a smooth bounded domain in $\mathbb{R}^d$, $J$ is a nonnegative and symmetric convolution kernel, and $F$ is a double-well potential.
The associated evolution problem is related to the gradient flow (in the $H^{-1}$-metric) and provides a nonlocal variant of the Cahn-Hilliard equation, given by the following system of equations:
$$\partial_t u - \operatorname{div}\big(m(u)\nabla\mu\big) = 0, \qquad \mu = a\,u - J * u + F'(u), \tag{1.2}$$
equipped with appropriate boundary conditions, where $a(x) = (J * 1)(x) := \int_\Omega J(x-y)\,dy$ and $(J * u)(x) := \int_\Omega J(x-y)\,u(y)\,dy$ for $x \in \Omega$. This equation describes the evolution of the concentration of two components in a binary fluid. The local concentration of one of the two components is represented by a real-valued function $u = u(x)$, the function $\mu$ is the chemical potential, $F$ is a double-well potential, and $m$ is the mobility. In our case, we deal with the so-called nondegenerate Cahn-Hilliard equation, that is, the mobility $m(u)$ is bounded away from $0$, and without loss of generality we assume $m(u) = 1$. Moreover, the potential $F$ can be split into two parts,
$$F = \hat\gamma + \hat\Pi, \tag{1.3}$$
where $\hat\gamma$ is convex and may be singular, while $\hat\Pi$ is a regular, small perturbation.
In [46] the authors derived a free energy functional in nonlocal form, and proposed the corresponding gradient flow to model the phase change in binary alloys. The mathematical literature on the nonlocal Cahn-Hilliard equation is widely developed: among many others, we mention [1,7,41,43,53] and the references therein. Concerning the existence of solutions for the nonlocal Cahn-Hilliard equation, the first proof for singular kernels not falling within the $W^{1,1}$-existence theory and under possible degeneracy of the double-well potential was given in [27,29]. These were also the first contributions dealing with the case of non-regular interaction kernels. In [28] the authors provided a full characterization of existence in the case of singular double-well potentials and interaction kernels satisfying $W^{1,1}$ integrability assumptions. In [27-29], the authors worked under the assumption of constant mobility and considered a Yosida-type regularization of the nonlinearity.
As already mentioned, in this paper we focus on the nondegenerate Cahn-Hilliard equation. This assumption is a mathematical simplification that allows one to study the Cahn-Hilliard equation in greater detail. In particular, many more results are available for the nondegenerate equation, including well-posedness results [60] and the strict separation property [42,64]. Nevertheless, it seems that it is the degenerate equation that is physically relevant, as it results from several different limits, including the hydrodynamic limit of Vlasov-type equations [34], interacting particle systems [46] and the high-friction limit for the Euler-Korteweg equation [33,44,59]. We also remark that a few analytical results are available for the degenerate equation [15,16,26,35,36,38,40,62] and that these studies were initiated by the paper of Elliott and Garcke [39]. In this paper we focus on the case with constant mobility, since the nondegeneracy assumption is necessary to obtain an $L^2$ estimate on the chemical potential $\mu$, which is an essential part of our argument.
The standard methods of construction of solutions to (1.2) with singular potentials are based on compactness of solutions to an approximate system in which the potential is regularized via Yosida approximation (see Section 2.2 for the relevant definitions). Namely, the singular part $\gamma$ is replaced by its Yosida approximation $\gamma_\lambda$, and then one sends $\lambda \to 0$ using a priori estimates. In this paper, we obtain an explicit estimate in $L^2((0,T)\times\Omega)$ which quantifies this convergence, see Theorem 2.1 for more details. To our knowledge, this is the first result of this type. Such an estimate could provide convergence of Galerkin methods for (1.2) when one applies Yosida approximation to regularize the potential and a Galerkin discretization in space.
We recall that the Yosida approximation of the nonlinearity is a function $F_\lambda : \mathbb{R} \to \mathbb{R}$ with Lipschitz constant of order $1/\lambda$, and that we write $\hat F_\lambda(s) := \int_0^s F_\lambda(r)\,dr$ for every $s \in \mathbb{R}$. Our method consists in the following: we consider a sequence $\{u_\lambda\}$ satisfying (2.13), given by
$$\partial_t u_\lambda - \Delta\mu_\lambda = 0, \qquad \mu_\lambda = B(u_\lambda) + \gamma_\lambda(u_\lambda) + \Pi(u_\lambda), \tag{1.4}$$
where $B(u_\lambda) := a(x)\,u_\lambda - J * u_\lambda$, $\Pi = \hat\Pi'$, and $\{\gamma_\lambda\}$ is a sequence of Yosida regularisations (see Section 2.2 for the exact definition and properties). To obtain an explicit convergence rate we make two observations. The first one is the estimate (1.5) on the Yosida approximation coming from [19]. With (1.5), it is easy to conclude the convergence with the rate $\sqrt{\lambda}$ in the $H^{-1}(\Omega)$ norm, because (1.2) is the gradient flow in the $H^{-1}(\Omega)$ norm. The second observation concerns the estimate in $L^2(\Omega)$, for which we use the nonlocal kernel $J$. We assume that $J \in L^2(\Omega)$, so that the operator $u \mapsto J * u$ is a Hilbert-Schmidt operator. Such operators are compact, so that, up to a small error, their image is a finite-dimensional subspace of $L^2(\Omega)$ on which the two norms, $L^2(\Omega)$ and $H^{-1}(\Omega)$, are equivalent. This yields the information about the convergence in $L^2(\Omega)$. All these comments are made rigorous in Section 4. In particular, we state our result for the case with Neumann boundary conditions, but the theorem is also valid in the (simpler) case of Dirichlet boundary conditions or periodic domains.
Several papers have addressed the discretization of (1.2) with the potential approximated via Yosida approximation [12, 54-56]. While we do not consider any discretization in our paper, one can apply several results available in the literature for the discretization of (1.4) with the regularized potential $\hat F_\lambda$ (see [21-23, 30, 69]) and then consider the limits: first $\Delta x, \Delta t \to 0$ and then $\lambda \to 0$. While the aforementioned results have been obtained for an explicitly given potential, we expect that they still hold true for an arbitrary regular potential.
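One can even consider the joint limit $\Delta x, \Delta t, \lambda \to 0$: if $u^{\Delta x,\Delta t}_\lambda$ is a solution to the discretization of the regularized problem, we expect an error estimate between $u^{\Delta x,\Delta t}_\lambda$ and $u_\lambda$ with rates $\alpha$, $\beta$ in $\Delta x$, $\Delta t$ and a constant $C(\lambda)$, where $\alpha$, $\beta$ depend on the particular method and $C(\lambda) \to \infty$ when $\lambda \to 0$. Then, applying Theorem 2.1, one gets a bound on the distance between $u^{\Delta x,\Delta t}_\lambda$ and $u$, so that if $\Delta t$, $\Delta x$ are chosen to depend (appropriately) on $\lambda$, one finally obtains the convergence of $u^{\Delta x,\Delta t}_\lambda$ to $u$. The details depend on the rates $\alpha$, $\beta$ as well as on the constant $C(\lambda)$ (i.e. on how much the convergence result depends on the regularity of the potential). While the rates $\alpha$, $\beta$ are known for several methods, the constant $C(\lambda)$ has to be obtained by a careful inspection of the convergence proofs, and we leave it for future work. We also remark that other schemes have been proposed for the Cahn-Hilliard equation and some related problems in [3,5,6,8,20,37,50-52].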
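The coupling of $\Delta x$, $\Delta t$ to $\lambda$ described above can be illustrated with a small sketch. Everything quantitative here is an assumption made for illustration only: the discretization error is modelled as $C(\lambda)(\Delta x^\alpha + \Delta t^\beta)$, the constant as $C(\lambda) = \lambda^{-p}$ with a hypothetical exponent `p`, and all remaining constants are set to 1. Under these assumptions, choosing the step sizes as powers of $\lambda$ balances the three error contributions.

```python
import math


def coupled_steps(lam: float, alpha: float, beta: float, p: float):
    """Choose dx, dt so that C(lam)*dx**alpha and C(lam)*dt**beta both equal sqrt(lam).

    Hypothetical model: C(lam) = lam**(-p). The true dependence of the
    constant on lam is not known and must come from the convergence proofs.
    """
    dx = lam ** ((p + 0.5) / alpha)
    dt = lam ** ((p + 0.5) / beta)
    return dx, dt


def total_error_bound(lam, dx, dt, alpha, beta, p):
    """Assumed bound: C(lam)*(dx**alpha + dt**beta) + sqrt(lam), other constants set to 1."""
    return lam ** (-p) * (dx ** alpha + dt ** beta) + math.sqrt(lam)


if __name__ == "__main__":
    alpha, beta, p = 2.0, 1.0, 1.0  # illustrative values only
    for lam in (1e-1, 1e-2, 1e-3):
        dx, dt = coupled_steps(lam, alpha, beta, p)
        print(lam, dx, dt, total_error_bound(lam, dx, dt, alpha, beta, p))
```

With this choice each of the three terms equals $\sqrt{\lambda}$, so the assumed total bound decays at the rate of Theorem 2.1; a faster blow-up of $C(\lambda)$ simply forces smaller steps.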
The paper is structured as follows: in Section 2 we introduce the setting and recall the main tools used in the paper; in Section 3 we compute the uniform estimates and in Section 4 we provide the proof of the main theorem (Theorem 2.1).

Assumptions and the main result
In order to state our result we need several definitions and tools, which we recall in this section. In particular, we introduce the notation and hypotheses used in this paper and recall the theory of maximal monotone graphs as well as some basic facts on Hilbert-Schmidt integral operators.
2.1. Hypotheses.
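Assumption 1. Throughout the paper we assume the following:
(H1) $\Omega$ is a smooth bounded domain in $\mathbb{R}^d$, and $T > 0$ is a fixed final time.
(H2) The kernel $J : \mathbb{R}^d \to \mathbb{R}$ belongs to $W^{1,1}_{\mathrm{loc}}(\mathbb{R}^d) \cap L^2_{\mathrm{loc}}(\mathbb{R}^d)$ and satisfies $J(x) = J(-x) \ge 0$ for almost every $x$. For any measurable $v : \Omega \to \mathbb{R}$ we use the notation
$$(J * v)(x) := \int_\Omega J(x-y)\,v(y)\,dy$$
and set $a(x) = J * 1$. We also assume that for some $a_- > 0$
$$\inf_{x\in\Omega} a(x) > a_- > 0. \tag{2.1}$$
(H3) We assume that $F = \hat\gamma + \hat\Pi$, where $\hat\gamma$ is a convex proper function. We write $\gamma : \mathbb{R} \to 2^{\mathbb{R}}$ for a maximal monotone graph such that $0 \in \gamma(0)$ and $\gamma$ is the subdifferential of $\hat\gamma$ in the sense of convex analysis. Moreover, we let $\Pi = \hat\Pi'$ and we assume that $\Pi$ is a Lipschitz function such that
$$\|\Pi'\|_{\infty} < a_-. \tag{2.2}$$
Without loss of generality we can suppose that $F$ is nonnegative.
(H4) The initial condition $u_0$ is such that the energy is finite at $t = 0$. Moreover, we assume that the average of $u_0$ satisfies condition (2.3), which is formulated in terms of the domain $\operatorname{dom}(\gamma)$ defined in Section 2.2.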
Let us comment on the assumptions. Concerning (H2), the kernel $J$ is required to be defined on $\mathbb{R}^d$, but in fact it could be defined on the set $\Omega - \Omega$, i.e. the set containing all the points of the form $x - y$ where $x, y \in \Omega$. The $L^2$ regularity of the kernel is needed to guarantee that the nonlocal operator $v \mapsto J * v$ is a Hilbert-Schmidt operator from $L^2(\Omega)$ to $L^2(\Omega)$. We remark that condition (H2) implies $\nabla a \in L^\infty(\Omega)$.
Moreover, the lower bound on $a$ is a technical matter related to the geometry of $\Omega$ and the symmetry of $J$: it is easy to see that it holds true if $\Omega$ is a ball and $J$ is radially symmetric. Concerning (H3), the assumption on the potential is fairly standard: the main part $\hat\gamma$ is convex and possibly singular, while the term $\hat\Pi$ should be considered as a perturbation which cannot be too large. The reason for considering $\gamma$ as a maximal monotone graph is that the derivative of $\hat\gamma$ may not exist in the classical sense, but it exists as the subdifferential of a convex function, which is a maximal monotone graph, see Section 2.2 for the relevant definitions. Finally, the first part of (H4) guarantees that the energy related to (1.4) is finite at $t = 0$. The condition (2.3) says that the initial condition is not supported only at the points where the subdifferential $\gamma$ is empty.
Hence, the most demanding condition is (H3), which still allows us to consider all reasonable choices for the potential. The standard choice for $F$ is the fourth-order polynomial $F_{\mathrm{pol}}(x) := \frac{1}{4}(x^2 - 1)^2$, $x \in \mathbb{R}$, with minima, corresponding to the pure phases, at $\pm 1$. A more realistic description is given by the logarithmic double-well potential (2.4), with $0 < \theta < \Theta$ and $c > 0$. With this choice the potential is defined on the bounded domain $(-1, 1)$ and its minima lie within the open interval $(-1, 1)$. A third example, which is also included in our study, is the so-called double-obstacle potential (2.5) (see [9,61]). With this choice, $F'_{\mathrm{ob}}$ is not defined in the usual way and has to be interpreted as the subdifferential $\partial F_{\mathrm{ob}}$ in the sense of convex analysis.
The necessity of singular potentials as in (1.3) is motivated by one of the first derivations of the (nonlocal) Cahn-Hilliard equation, due to Giacomin and Lebowitz [46,47], who considered the logarithmic potential (2.4). Furthermore, the double-obstacle potential (2.5) was proposed by Oono and Puri [61] to model phase separation (see also [9,10]). One of its interesting applications is the inpainting of damaged images [11,45], where the double-obstacle potential leads to better visual results compared to smooth potentials. Last but not least, singular potentials appear in several applications, including tumor growth [2,25,67] and flows of binary mixtures [18,49]. Our work also covers several classes of singular kernels $J$, including Riesz, Newtonian, and Bessel kernels, which are used to model nonlocal interactions in multiple settings, including tumor growth [32,63], (Patlak-)Keller-Segel and aggregation-diffusion equations [17,57,58], and related applications in sampling problems [31]. Nevertheless, we point out that the focus of the current paper is on the singular potential rather than on the singular kernel.
We also highlight that our result could be generalized to anisotropic potentials possibly dependent on time, i.e. of the type $F = F(u, x, t)$, but additional hypotheses would then have to be specified.

2.2. Maximal monotone graphs theory. As specified in (H3), we assume that $F$ can be written as
$$F = \hat\gamma + \hat\Pi,$$
where $\hat\gamma : \mathbb{R} \to \mathbb{R} \cup \{\infty\}$ is convex and lower semicontinuous, and $\hat\Pi : \mathbb{R} \to \mathbb{R}$ is a function such that $\Pi := \hat\Pi'$ is globally Lipschitz continuous. For such $\hat\gamma$, we define its domain as
$$D(\hat\gamma) := \{x \in \mathbb{R} : \hat\gamma(x) < \infty\}.$$
The subdifferential $\gamma(x)$ is defined as a set-valued map satisfying
$$\hat\gamma(y) - \hat\gamma(x) \ge \gamma(x)\,(y - x) \quad \text{for all } y \in \mathbb{R}.$$
Here, we slightly abuse the notation and write $\gamma(x)$ for an arbitrary element of the set $\gamma(x)$. It may happen that $\gamma(x)$ is empty for some $x$; hence, we define its domain, which is the set of points where $\hat\gamma$ is subdifferentiable, as $\operatorname{dom}(\gamma) = \{x : \gamma(x) \neq \emptyset\}$. The subdifferential $\gamma$ is a maximal monotone graph due to the celebrated result of Rockafellar [68, Corollary 31.5.2]. This means that $\gamma$ is monotone and that there is no bigger (in the sense of inclusion of graphs) multi-valued map which is monotone. Below we briefly recall the most important facts, while for the complete theory we refer to [4, Chapter 3] and [13].
Concerning the relation between $D(\hat\gamma)$ and $\operatorname{dom}(\gamma)$, it is well known that if $\hat\gamma(x) < \infty$ and $\hat\gamma$ is continuous at $x$, then $\gamma(x)$ is nonempty, cf. [66, Proposition 3.29 (ii)]. The continuity assumption is not restrictive, since any convex function is continuous on the interior of its effective domain, see [66, Proposition 2.20]. On the other hand, if $\gamma(x)$ is nonempty, then $x \in D(\hat\gamma)$ (otherwise, $\gamma(x)$ has to be empty). Therefore, $\operatorname{dom}(\gamma) \subset D(\hat\gamma)$, and this inclusion can be strict, see [68, p. 218]. We note the following:
To introduce the Yosida approximation, it is important to recall that $\gamma$ being a maximal monotone graph is equivalent to saying that $(I + \lambda\gamma)^{-1}$ is a contraction, well-defined on $\mathbb{R}$, for all $\lambda > 0$, cf. [14, Proposition 2.2]. It means that
$$\big|(I + \lambda\gamma)^{-1}(x_1) - (I + \lambda\gamma)^{-1}(x_2)\big| \le |x_1 - x_2|$$
for all $x_1, x_2 \in \mathbb{R}$. More precisely, monotonicity always implies that $(I + \lambda\gamma)^{-1}$ is a contraction, but it is the maximal monotonicity which implies that the range of $I + \lambda\gamma$ is the whole space $\mathbb{R}$, so that $(I + \lambda\gamma)^{-1}$ is defined on $\mathbb{R}$.
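For the sake of the analysis, we need to approximate the multi-valued map $\gamma$ with a single-valued map via the so-called Yosida approximation. To this end, we let
$$J_\lambda := (I + \lambda\gamma)^{-1},$$
where $I$ is the identity map. Then, the Yosida approximation is defined with
$$\gamma_\lambda := \frac{1}{\lambda}\,(I - J_\lambda), \tag{2.6}$$
$$\hat\gamma_\lambda(x) := \min_{y \in \mathbb{R}} \left\{ \frac{1}{2\lambda}\,|x - y|^2 + \hat\gamma(y) \right\}. \tag{2.7}$$
Formula (2.7) immediately implies that $\hat\gamma_\lambda$ is convex.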
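To better understand the relation between $\hat\gamma_\lambda$ and $\gamma_\lambda$, we note that the minimum has to be attained at $y$ such that $\frac{x - y}{\lambda} \in \partial\hat\gamma(y) = \gamma(y)$, which means that $y = J_\lambda(x)$, and the formula simplifies to
$$\hat\gamma_\lambda(x) = \frac{\lambda}{2}\,|\gamma_\lambda(x)|^2 + \hat\gamma(J_\lambda(x)).$$
We also note that from this reasoning we have $\frac{x - J_\lambda(x)}{\lambda} \in \gamma(J_\lambda(x))$, so that
$$\gamma_\lambda(x) \in \gamma(J_\lambda(x)),$$
which will be useful in our reasoning. Finally, one can check that $\hat\gamma_\lambda$ is classically differentiable, $\hat\gamma_\lambda' = \gamma_\lambda$, and $\hat\gamma_\lambda \nearrow \hat\gamma$ as $\lambda \to 0$, cf. [14, Proposition 2.11].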
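The construction of the resolvent and of the Yosida approximation can be made concrete on a simple example. The sketch below is an illustration only, under the assumption that $\hat\gamma$ is the indicator function of $[-1, 1]$ (zero inside, $+\infty$ outside), which is the singular part of a double-obstacle type potential; for this choice the resolvent is the projection onto $[-1, 1]$. The function names are hypothetical.

```python
def resolvent(x: float, lam: float) -> float:
    """J_lam(x) = (I + lam*gamma)^{-1}(x) for gamma = subdifferential of the indicator of [-1, 1].

    For this particular gamma the resolvent is the projection onto [-1, 1]
    and therefore does not depend on lam.
    """
    return max(-1.0, min(1.0, x))


def yosida(x: float, lam: float) -> float:
    """gamma_lam(x) = (x - J_lam(x)) / lam, single-valued with Lipschitz constant 1/lam."""
    return (x - resolvent(x, lam)) / lam


def moreau_envelope(x: float, lam: float) -> float:
    """hat gamma_lam(x) = (lam/2)*gamma_lam(x)**2 + hat gamma(J_lam(x)).

    The second term vanishes here because J_lam(x) always lies in [-1, 1].
    """
    return 0.5 * lam * yosida(x, lam) ** 2


if __name__ == "__main__":
    for lam in (1.0, 0.1, 0.01):
        print(lam, yosida(1.5, lam), moreau_envelope(1.5, lam))
```

Inside $[-1, 1]$ the regularization vanishes, while outside it grows like $\operatorname{dist}(x, [-1,1])^2 / (2\lambda)$, which shows the monotone convergence $\hat\gamma_\lambda \nearrow \hat\gamma$ as $\lambda \to 0$ in this example.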

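2.3. The nonlocal operator. Let us consider the operator
$$B(u) := a(x)\,u - J * u.$$
First, it is worth considering only a part of $B$, namely the convolution operator $u \mapsto J * u$, which is a Hilbert-Schmidt operator on $L^2(\Omega)$ thanks to the $L^2$ regularity of the kernel. In particular, it is a compact operator (that is, the image of the unit ball is relatively compact) and for any orthonormal basis $\{e_i\}$ in $L^2(\Omega)$, we have (see [24, Appendix C]):
$$\sum_{i} \|J * e_i\|^2_{L^2(\Omega)} < \infty. \tag{2.11}$$
Concerning the operator $B$, we have the following properties, which are a simple consequence of the symmetry and the nonnegativity of $J$: for all $u, v \in L^2(\Omega)$,
(i) $\int_\Omega B(u)\,v\,dx = \int_\Omega u\,B(v)\,dx$,
(ii) $\int_\Omega B(u)\,dx = 0$,
(iii) $\int_\Omega B(u)\,u\,dx \ge 0$.
Proof. We first note that, since $J$ is symmetric,
$$\int_\Omega B(u)\,v\,dx = \frac{1}{2}\int_\Omega\int_\Omega J(x-y)\,\big(u(x)-u(y)\big)\big(v(x)-v(y)\big)\,dx\,dy.$$
Hence, (iii) follows immediately. For (i), we observe that the expression above does not change when $u$ and $v$ are interchanged. Finally, (ii) follows from (i) because $B(1) = 0$.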
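The structural properties of $B$ used in the proof above (symmetry of the bilinear form, zero mean of $B(u)$, and nonnegativity of $\int_\Omega B(u)\,u$) can be checked on a discretization. The following is a minimal sketch, assuming $\Omega = (0,1)$, a midpoint quadrature rule, and a Gaussian kernel chosen purely for illustration; the names `build_B`, `inner` and `gaussian_kernel` are hypothetical.

```python
import math


def gaussian_kernel(z: float) -> float:
    """Illustrative symmetric, nonnegative kernel J(z) = exp(-z^2)."""
    return math.exp(-z * z)


def build_B(n: int, kernel):
    """Midpoint-rule discretization of B(u) = a*u - J*u on (0, 1) with n cells."""
    h = 1.0 / n
    x = [(i + 0.5) * h for i in range(n)]
    K = [[kernel(x[i] - x[j]) * h for j in range(n)] for i in range(n)]
    a = [sum(row) for row in K]  # a(x_i) approximates (J * 1)(x_i)

    def B(u):
        return [a[i] * u[i] - sum(K[i][j] * u[j] for j in range(n)) for i in range(n)]

    return B, x, h


def inner(u, v, h: float) -> float:
    """Discrete L^2(0, 1) inner product."""
    return h * sum(ui * vi for ui, vi in zip(u, v))


if __name__ == "__main__":
    B, x, h = build_B(50, gaussian_kernel)
    u = [math.sin(3.0 * xi) for xi in x]
    v = [xi * xi for xi in x]
    one = [1.0] * len(x)
    print("symmetry defect:", inner(B(u), v, h) - inner(u, B(v), h))
    print("mean of B(u):   ", inner(B(u), one, h))
    print("(B(u), u):      ", inner(B(u), u, h))
```

Because the discrete kernel matrix is symmetric with nonnegative entries, the three properties hold for the discretization up to rounding, mirroring the continuous argument.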

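2.4. The main result. The nonlocal Cahn-Hilliard equation we are going to consider then has the following form:
$$\partial_t u - \Delta\mu = 0, \qquad \mu \in B(u) + \gamma(u) + \Pi(u) \quad \text{in } (0,T)\times\Omega, \tag{2.12}$$
with the Neumann boundary condition on $\mu$ and the initial condition $u(0) = u_0$. Using the Yosida approximation defined in Section 2.2, we consider solutions to
$$\partial_t u_\lambda - \Delta\mu_\lambda = 0, \qquad \mu_\lambda = B(u_\lambda) + \gamma_\lambda(u_\lambda) + \Pi(u_\lambda) \quad \text{in } (0,T)\times\Omega, \tag{2.13}$$
with the same boundary and initial conditions, where $\gamma_\lambda$ is defined in (2.6). Our main result reads:
Theorem 2.1. Suppose that Assumption 1 holds true, and let $u_\lambda$ and $u$ be solutions to (2.13) and (2.12), respectively. Then,
$$\|u_\lambda - u\|_{L^2((0,T)\times\Omega)} \le C\,\sqrt{\lambda},$$
where the constant $C$ depends on the model functions $\Pi$, $\gamma$ and on the norm of the initial condition in $L^2(\Omega)$.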
It will be clear from the proof that the constant depends mostly on the distance between $a_-$ and $\|\Pi'\|_\infty$ as in (2.2).

Uniform estimates
We recall that the existence of solutions to the nonlocal Cahn-Hilliard equation with a $W^{1,1}$ kernel has been proved in [28]. In this section, we consider solutions to (2.13) and prove the following uniform estimates:
Theorem 3.1. The following sequences are bounded:
The proof of Theorem 3.1 is split into several parts.
Proof. We multiply (2.13) with $u_\lambda$ to get
Now, by convexity of the Yosida approximation, we have
(in fact, it may happen that $\gamma_\lambda'$ does not exist, see Remark 3.1 below). Concerning the term with the bilinear form, we note that $B(u_\lambda) = a(x)\,u_\lambda - J * u_\lambda$ so that
The last two terms can be estimated via the Cauchy-Schwarz and Young convolution inequalities:
where $\varepsilon$ has to be chosen. Collecting (3.1)-(3.5) and recalling that $a(x) > a_-$ (see (2.1)), we conclude
This concludes the proofs of (A) and (B). Now, we multiply (2.13) with $\mu_\lambda$ and integrate in space, using the Neumann condition on $\mu_\lambda$, to obtain
where $\hat F_\lambda$ is a primitive function of $F_\lambda$. Recall that $B(u_\lambda) = a(x)\,u_\lambda - J * u_\lambda$, so that the symmetry of the kernel $J$ implies
Therefore, estimate (D) follows from (3.6) after integrating in time and using the assumption on the initial condition.
We multiply (2.13) with $(-\Delta)^{-1}(u_\lambda - \overline{u_0})$, where $\overline{u_0}$ is the average of $u_0$, to get
Thanks to (D) and equation (2.13), $\{\partial_t u_\lambda\}$ is uniformly bounded in
The last integral in (3.8) can be split into three terms:
which can be studied separately. Clearly,
For the term with $\Pi(u_\lambda)$ we estimate
so that $\int_\Omega \Pi(u_\lambda)\,(u_\lambda - \overline{u_0})$ is bounded in $L^\infty(0,T)$ due to (A). Finally, we have to estimate the term with $\gamma_\lambda(u_\lambda)$. For this, we prove that there exist constants $M_1$, $M_2$ depending only on $u_0$ such that (3.10) holds. If (3.10) is proved, then the proof of Lemma 3 is completed, because the rest of the terms are either nonnegative or bounded in $L^2(0,T)$.
Case $r \in (m_-, m_+)$. Note that, by Remark 2.1, the interval
where we used the monotone convergence of the Yosida approximations to the minimal value $\gamma^0$, see (2.9). Of course, by monotonicity, the supremum above can be estimated only in terms of $|\gamma^0(m_-)|$ and $|\gamma^0(m_+)|$, so the proof is concluded.
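Finally, we prove (E). For this, using the formula for $\mu_\lambda$, we get:
$$\gamma_\lambda(u_\lambda) = \mu_\lambda - (J * 1)\,u_\lambda + J * u_\lambda - \Pi(u_\lambda). \tag{3.11}$$
We want to prove that each term appearing on the right-hand side (RHS) of (3.11) is bounded in $L^2_t L^2_x$. For $\mu_\lambda$, this follows from (C), while for $(J * 1)\,u_\lambda$, this follows from (A). For the term $J * u_\lambda$ we apply Young's convolution inequality (note that $J$ does not depend on time) to deduce
$$\|J * u_\lambda\|_{L^2_t L^2_x} \le \|J\|_{L^1}\,\|u_\lambda\|_{L^2_t L^2_x},$$
which is bounded by (A). Finally, for $\Pi(u_\lambda)$ we apply (3.9) and (A) once again.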
Remark 3.1. In the proof of Lemma 2, we used the existence of $\gamma_\lambda'$, which may fail (we only know that $\gamma_\lambda$ is continuous for fixed $\lambda > 0$). However, in fact, we only used its sign, which can be deduced from convexity by a suitable approximation scheme. Namely, let $\gamma^\varepsilon_\lambda$ be a usual mollification of $\gamma_\lambda$ and let $u^\varepsilon_\lambda$ be the solution to (3.12), that is, to (2.13) with $\gamma_\lambda$ replaced by $\gamma^\varepsilon_\lambda$. Note that since $\gamma_\lambda$ is nondecreasing, the same holds for $\gamma^\varepsilon_\lambda$, so that $(\gamma^\varepsilon_\lambda)' \ge 0$. Arguing as in (3.1)-(3.2), we obtain the inequality
As in Lemma 2, we get the following uniform bounds:
It remains to pass to the limit $\varepsilon \to 0$, which is simple because (2.13) admits a unique solution (it is a nondegenerate Cahn-Hilliard equation with a regular potential), so that we can identify the limit.

Proof of the main result Theorem 2.1
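We consider $u_{\lambda_1}$, $u_{\lambda_2}$ to be two solutions of (2.13) with chemical potentials $\mu_{\lambda_1}$, $\mu_{\lambda_2}$, respectively. We want to prove
To this end, we write $u = u_{\lambda_1} - u_{\lambda_2}$, $\mu = \mu_{\lambda_1} - \mu_{\lambda_2}$. We have
$$\partial_t u - \Delta\mu = 0. \tag{4.1}$$
Since the average of $u$ is equal to $0$, we can test (4.1) with $(-\Delta)^{-1} u$ as in the proof of Lemma 3 to obtain (4.2). The last term can be written as
$$\int_0^t\!\!\int_\Omega \mu\,u = \int_0^t\!\!\int_\Omega B(u)\,u + \int_0^t\!\!\int_\Omega \big(\gamma_{\lambda_1}(u_{\lambda_1}) - \gamma_{\lambda_2}(u_{\lambda_2})\big)\,u + \int_0^t\!\!\int_\Omega \big(\Pi(u_{\lambda_1}) - \Pi(u_{\lambda_2})\big)\,u,$$
as $B$ is linear. We analyze the three terms appearing on the right-hand side separately.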
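Term $\int_0^t \int_\Omega B(u)\,u$. We note that $B$ is of the form
$$B(u) = a(x)\,u - I(u), \qquad I(u) := J * u, \tag{4.3}$$
where $I$ is a self-adjoint integral operator, which is Hilbert-Schmidt because we assume that the kernel $J \in L^2_{\mathrm{loc}}(\mathbb{R}^d)$. Therefore, $I$ is compact and has the representation
$$I f = \sum_{i} \langle I f, e_i \rangle\, e_i,$$
where the orthonormal basis $\{e_i\}$ of $L^2(\Omega)$ is chosen as the eigenfunctions of the $(-\Delta)$ operator as in (3.7) and $\{\lambda_i\}$ are the corresponding eigenvalues. Note that $\lambda_0 = 0$, $e_0 = \mathrm{const}$, and for all $i \ge 1$ we have $\int_\Omega e_i\,dx = 0$ and $(-\Delta)^{-1} e_i = \frac{1}{\lambda_i}\,e_i$.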
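As $I$ is Hilbert-Schmidt, $\sum_{i=1}^{\infty} \|I e_i\|^2_{L^2(\Omega)} < \infty$. Therefore, we conclude that there exists a sequence $\{I_k\}$ of finite-dimensional and self-adjoint operators defined by the formula
Moreover, $\|I_k - I\| \to 0$ in the operator norm, which is a simple consequence of the summability $\sum_{i=1}^{\infty} \|I e_i\|^2_{L^2(\Omega)} < \infty$. Now, we fix $k \in \mathbb{N}$. We introduce the space
$$L^2_0(\Omega) := \Big\{ u \in L^2(\Omega) : \int_\Omega u\,dx = 0 \Big\}$$
and we use the orthogonal decomposition
$$L^2_0(\Omega) = A_k \oplus B_k,$$
where $A_k = \operatorname{span}(e_1, \ldots, e_k)$ and $B_k = \overline{\operatorname{span}}(e_{k+1}, e_{k+2}, \ldots)$. We remark that we skip the vector $e_0$ because we restrict ourselves to functions in $L^2_0(\Omega)$.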
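The summability of $\|I e_i\|^2$ and the resulting smallness of the tail can be observed numerically. The sketch below is illustrative and rests on assumptions not taken from the paper: $\Omega = (0,1)$, a midpoint discretization, a Gaussian kernel, and the discrete cosine vectors standing in for the Neumann eigenfunctions $e_i$. It computes $\|I e_i\|^2$ for each basis vector and compares the sum with the squared Frobenius norm of the discretized kernel, which plays the role of $\|J(x-y)\|^2_{L^2(\Omega\times\Omega)}$.

```python
import math


def gaussian_kernel(z: float) -> float:
    """Illustrative symmetric, nonnegative kernel J(z) = exp(-z^2)."""
    return math.exp(-z * z)


def cosine_basis(n: int):
    """Orthonormal discrete cosine vectors (discrete analogue of Neumann eigenfunctions)."""
    basis = []
    for k in range(n):
        scale = math.sqrt(1.0 / n) if k == 0 else math.sqrt(2.0 / n)
        basis.append([scale * math.cos(math.pi * k * (j + 0.5) / n) for j in range(n)])
    return basis


def hs_terms(n: int, kernel):
    """Return ([||I e_i||^2 for each basis vector], squared Frobenius norm of the matrix)."""
    h = 1.0 / n
    x = [(i + 0.5) * h for i in range(n)]
    K = [[kernel(x[i] - x[j]) * h for j in range(n)] for i in range(n)]
    terms = []
    for e in cosine_basis(n):
        Ie = [sum(K[i][j] * e[j] for j in range(n)) for i in range(n)]
        terms.append(sum(val * val for val in Ie))
    frob = sum(K[i][j] ** 2 for i in range(n) for j in range(n))
    return terms, frob


if __name__ == "__main__":
    terms, frob = hs_terms(32, gaussian_kernel)
    print("sum of ||I e_i||^2:", sum(terms), " Frobenius norm squared:", frob)
    for k in (1, 2, 4, 8):
        # Tail sum controls the squared error of truncating I to the first k modes.
        print(k, sum(terms[k:]))
```

The tail sums bound the squared operator-norm error of restricting $I$ to the first $k$ basis vectors, which is the mechanism behind $\|I_k - I\| \to 0$; the specific truncation used here is a stand-in and need not coincide with the operators $I_k$ of the text.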
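We write
$$\int_0^t\!\!\int_\Omega I(u)\,u = \int_0^t\!\!\int_\Omega (I - I_k)(u)\,u + \int_0^t\!\!\int_\Omega I_k(u)\,u.$$
The first term is estimated by the triangle inequality
For the second term, we use the decomposition $u = u_A + u_B$ and observe that
because $\|I_k\| \le \|I\|$. Now, on the space $A_k$ there are two norms: the usual $L^2$ norm and the norm $\|(-\Delta)^{-1/2}\,\cdot\,\|_{L^2(\Omega)}$ (this is a norm due to the additional constraint that the average is $0$). As $A_k$ is of finite dimension, there exists a constant $C(k)$ (which depends on $k$ and blows up when $k \to \infty$) such that
$$\|u_A\|_{L^2(\Omega)} \le C(k)\,\|(-\Delta)^{-1/2} u_A\|_{L^2(\Omega)} \le C(k)\,\|(-\Delta)^{-1/2} u\|_{L^2(\Omega)},$$
where the last inequality follows from the fact that $\{e_i\}$ are the eigenvectors of the $-\Delta$ operator and $(-\Delta)^{-1/2}$ is self-adjoint, so that
$$\|(-\Delta)^{-1/2} u\|^2_{L^2(\Omega)} = \|(-\Delta)^{-1/2} u_A\|^2_{L^2(\Omega)} + \|(-\Delta)^{-1/2} u_B\|^2_{L^2(\Omega)}.$$
The conclusion is that
$$\int_0^t\!\!\int_\Omega I(u)\,u \le \|I - I_k\|\,\|u\|^2_{L^2(0,t;L^2(\Omega))} + \varepsilon\,\|I\|^2\,\|u\|^2_{L^2(0,t;L^2(\Omega))} + C(\varepsilon,k)\,\|(-\Delta)^{-1/2} u\|^2_{L^2(0,t;L^2(\Omega))}.$$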
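The equivalence of the two norms on $A_k$, and the blow-up of the constant as $k \to \infty$, can be made explicit in a model case. The sketch below assumes $\Omega = (0,1)$ with Neumann conditions, where the eigenvalues are $\lambda_i = (i\pi)^2$; a function in $A_k$ is represented by its coefficients with respect to $e_1, \ldots, e_k$. In this model case the equivalence constant can be taken as $C(k)^2 = \lambda_k$; this identification is an illustration, not a statement from the text.

```python
import math


def neumann_eigenvalue(i: int) -> float:
    """Eigenvalue of -d^2/dx^2 on (0, 1) with Neumann boundary conditions: (i*pi)^2."""
    return (i * math.pi) ** 2


def l2_norm_sq(coeffs) -> float:
    """||u||_{L^2}^2 for u = sum_i coeffs[i-1] * e_i, i = 1..k (orthonormal e_i)."""
    return sum(c * c for c in coeffs)


def hminus1_norm_sq(coeffs) -> float:
    """||(-Delta)^{-1/2} u||_{L^2}^2 = sum_i c_i^2 / lambda_i."""
    return sum(c * c / neumann_eigenvalue(i) for i, c in enumerate(coeffs, start=1))


def equivalence_constant_sq(k: int) -> float:
    """C(k)^2 = lambda_k, the best constant in ||u||^2 <= C(k)^2 ||(-Delta)^{-1/2} u||^2 on A_k."""
    return neumann_eigenvalue(k)


if __name__ == "__main__":
    coeffs = [1.0, -2.0, 0.5]
    k = len(coeffs)
    print(l2_norm_sq(coeffs), equivalence_constant_sq(k) * hminus1_norm_sq(coeffs))
    print([equivalence_constant_sq(j) for j in (1, 2, 4, 8)])  # grows like k^2
```

The inequality is sharp for $u = e_k$, which is why the constant cannot be chosen uniformly in $k$ and why $k$ must be fixed before $\varepsilon$ is tuned.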
Taking into account the whole form of $B$ as in (4.3) and that $a(x) > a_-$ as in (2.1), we conclude that
where $\varepsilon$ and $k$ have to be chosen appropriately.