ADER discontinuous Galerkin schemes for general-relativistic ideal magnetohydrodynamics

We present a new class of high-order accurate numerical algorithms for solving the equations of general-relativistic ideal magnetohydrodynamics in curved spacetimes. In this paper we assume the background spacetime to be given and static, i.e., we make use of the Cowling approximation. The governing partial differential equations are solved via a new family of fully-discrete and arbitrary high-order accurate path-conservative discontinuous Galerkin (DG) finite-element methods combined with adaptive mesh refinement and time accurate local timestepping. In order to deal with shock waves and other discontinuities, the highorder DG schemes are supplemented with a novel a-posteriori subcell finite-volume limiter, which makes the new algorithms as robust as classical second-order total-variation diminishing finite-volume methods at shocks and discontinuities, but also as accurate as unlimited high-order DG schemes in smooth regions of the flow. We show the advantages of this new approach by means of various classical two- and three-dimensional benchmark problems on fixed spacetimes. Finally, we present a performance and accuracy comparisons between Runge-Kutta DG schemes and ADER high-order finite-volume schemes, showing the higher efficiency of DG schemes.


INTRODUCTION
Electromagnetism plays an important role in many astrophysical processes such as compact objects and binaries consisting of black holes and neutron stars. The general-relativistic theory of magnetohydrodynamics (GRMHD) is a successful theory to describe these systems, combining the fluid description of matter with a simplified theory for electromagnetic fields in the absence of free charge carriers. Similar to general-relativistic hydrodynamics (GRHD), first successful (lower-dimensional) simulations of the GRMHD system date back to the pioneering work of Wilson (1975) more than 40 years ago (See Font 2008;Martí & Müller 2015, for recent reviews on progress in GRMHD simulations). In the past years, several groups started to recast the system of GRMHD equations into a conservative form to make use of conservative Godunovtype finite-volume schemes based on approximate Riemann solvers and high-resolution shock-capturing schemes (HRSC). Many GRHD and GRMHD codes have been developed over the last decade (for instance Baiotti et al. 2005;Duez et al. 2005;Anninos et al. 2005;Antón et al. 2006;Giacomazzo & Rezzolla 2007;Anderson et al. 2008;Kiuchi et al. 2009 Dionysopoulou et al. 2013;Radice et al. 2013;White et al. 2016;Porth et al. 2017) and applied to various topics in astrophysics. Some codes also evolve the spacetime by feeding back the fluid and magnetic energy-momentum tensor in the Einstein field equations, which govern the time evolution of the metric tensor; some codes even incorporate radiation transfer like the one proposed by Takahashi & Umemura (2017), or include the full Maxwell theory in a resistive relativistic MHD formulation (see e.g., Palenzuela et al. 2009;Dumbser & Zanotti 2009;Dionysopoulou et al. 2013;Bucciantini & Zanna 2013;and. L. Del Zanna & Bucciantini 2014;Aloy & Cordero-Carrión 2016).
In this work we propose a new family of little dissipative and little dispersive shock capturing schemes for the solution of the general-relativistic magnetohydrodynamics equations (GRMHD), based on high-order accurate explicit discontinuous Galerkin (DG) finite-element schemes on spacetime adaptive meshes (AMR) with time-accurate local time stepping (LTS) and supplemented by a high-order accurate a posteriori subcell finite-volume limiter in order to cope with shocks and discontinuities in the solution. To the best of our knowledge, this family of algorithms has never been applied to the GRMHD equations before.
An important and novel aspect of our approach is the interpretation of the source terms in the GRMHD equations that ac-count for the gravitational field in curved spacetimes as nonconservative products, instead of the usually employed algebraic source terms, since the gravity terms in general relativity are indeed functions of the spatial derivatives of the lapse, the shift vector and the spatial metric tensor. In other words, given a vector of conserved variables Q and the tensor of nonlinear conservative fluxes F = (F 1 , F 2 , F 3 ), the set of GRMHD equations which are normally written as (Rezzolla & Zanotti 2013) ∂tQ + ∇ · F (Q) = S(Q) , (1.1) where S is a generic source vector can, in our framework, be rewritten as ∂tQ + ∇ · F (Q) + B(Q) · ∇Q = 0 , (1.2) or in quasi-linear form, ∂tQ + A(Q) · ∇Q = 0 , (1.3) with the system matrix A(Q) := ∂F /∂Q + B(Q). Above, and throughout the paper, the nabla operator without subscript is simply defined as ∇ = (∂x, ∂y, ∂z), and thus does not indicate a covariant derivative. Here, B = (B1, B2, B3) is the matrix of the nonconservative product B(Q) · ∇Q := B1∂xQ + B2∂yQ + B3∂zQ.
The system (1.3) is called hyperbolic if the matrix A · n is diagonalizable for all normal vectors n = 0 with only real eigenvalues and a complete set of bounded linearly independent eigenvectors. The hyperbolicity of the GRMHD system has been studied in many works [see, for instance, Anile (1990); Komissarov (1999)]. This paper deals with the general-relativistic extension of the special relativistic case presented in . As it has been mentioned above, the background spacetime is introduced as a nonconservative product in the principal part of the system on the left hand side and is not treated as an algebraic source term, as it has been conventionally treated all along in the literature so far. The inspiration to use so-called path-conservative schemes for nonconservative products has been taken from successful developments in the context of so-called well-balanced numerical methods Bermúdez & Vázquez (1994) for the solution of the shallowwater equations [see Parés (2006); Castro et al. (2006Castro et al. ( , 2010 for details on path-conservative schemes], where the bottom-slope term (which is the gradient of a known function and accounts for gravity forces in shallow-water models) is discretized as a nonconservative product in the principal part of the system rather than as a classical algebraic source term. In the shallow water context, the family of path-conservative schemes allows to preserve certain stationary equilibrium solutions exactly up to machine precision also on the discrete level, including nontrivial equilibria, see Gaburro et al. (2017) and Gaburro et al. (2018) for recent examples. At this stage, the development of exactly well-balanced numerical methods for the GRMHD equations is still out of scope, but further developments in this direction would definitely deserve attention. We also would like to stress that the use of nonconservative products is not related to the ADER-DG scheme itself. It would have been equally possible to compute the metric derivatives analytically and discretize the gravity terms as conventional algebraic source terms, as done in other codes for the GRMHD system.
Discontinuous Galerkin methods belong to the family of finite-element methods which consider the numerical approximation of a weak formulation of the governing system of partial differential equations over a set of non-overlapping elements. The discrete solution space is restricted to the space of piecewise polynomials of maximum degree N ≥ 0 and the degrees of freedom (i.e., the expansion coefficients) of the chosen polynomial basis are directly evolved in time. Finite-element methods are known also under the name of variational-difference or projection-difference, (see the original formulations by Ritz 1909;Galerkin 1915). In particular, in the DG formulation the numerical solution is allowed to be discontinuous at element interfaces [see Reed & Hill (1973) for the integration of the neutron transport equation]. It has taken nearly two decades for the DG methods to be extended to general nonlinear hyperbolic systems, thanks to the groundbreaking works of Cockburn et al. (1989Cockburn et al. ( , 1990; Cockburn & Shu (1998b); see also Cockburn et al. (2000); Cockburn & Shu (2001); Shu (2016) for a review.
In the last twenty years, DG methods became increasingly popular mainly because of four attractive properties: i) nonlinear L2 stability has been proven for general nonlinear scalar conservation laws by Jiang & Shu (1994); ii) arbitrary high order of accuracy can be easily achieved for smooth solutions by simply increasing the polynomial degree N of the chosen basis functions; iii) high parallel scalability makes DG methods better suited for large-scale simulations even on general unstructured meshes when compared with high-order finite-difference or finite-volume methods; iv) high-order DG methods are only little dissipative and little dispersive, even when compared with high-order finitevolume and finite-difference schemes and are thus essential for accurate long-term simulations. The main drawback that afflicts explicit DG schemes is the rather severe CFL stability condition that constrains the timestep of the simulations to scale with approximately h/(2N + 1) for hyperbolic partial differential equations, where N is the degree of the nodal polynomial basis used within the element and h is the characteristic size of one DG element (not the distance between the individual nodal degrees of freedom). A way to alleviate the severe CFL timestep restriction is the use of efficient semi-implicit DG schemes, as those proposed, for instance by Tavelli & Dumbser (2016); Fambri & Dumbser (2016).
DG methods have attracted the interest of the computationalastrophysics community only over the last few years. In particular, the first DG-based method for general-relativistic hydrodynamics has been developed by Radice & Rezzolla (2011), but it was limited to spherically symmetric spacetimes. The first three dimensional implementation of a DG method for relativistic flows on curved but fixed background spacetimes has been recently presented by Bugner et al. (2016), but without considering the magnetic field interaction. Very recently, Miller & Schnetter (2017) formulated an operator-based DG method for the solution of the Einstein field equations, while in Dumbser et al. (2018) a high-order DG scheme for the solution of a first-order reduction of the conformal and covariant formulation (CCZ4) Alic et al. (2012) of the Z4 system of the Einstein equations has been proposed. Also rather recently, Kidder et al. (2017) provided a DG implementation within a taskbased parallelism model for GRMHD, while Anninos et al. (2017) presented also a DG code with hp-refinement, and both of them complemented the high resolution of the purely-spatial polynomials basis with multi-step high-order time integrator, e.g., Adams-Bashforth (AB3) or Runge-Kutta schemes. Indeed, total-variation diminishing (TVD) Runge-Kutta methods are typically used in order to reach a stable high-order time discretization of DG schemes, i.e., applying the method of lines (MOL) technique which leads to the so-called family of RK-DG schemes.
On the other hand, the time discretization proposed in this paper is different and is named ADER technique. The particular feature of the ADER approach introduced by Toro and Titarev in the finite-volume context Titarev & Toro (2002, 2005; Toro & Titarev (2006) is that it leads to arbitrary high-order accurate fully-discrete one-step schemes in space and time. ADER schemes have already been applied to the equations of relativistic MHD, both in the ideal case (see Dumbser et al. 2008b, Zanotti et al. 2015b) and in the resistive case (see Dumbser & Zanotti 2009) and to other nonlinear systems of partial differential equations (see Fambri et al. 2017). Moreover, the ADER strategy adopted in this paper, which goes back to Dumbser et al. (2008a), applies to general systems of balance laws with conservative fluxes, nonconservative products and stiff or non-stiff algebraic source terms. In particular, it is based on a local spacetime discontinuous Galerkin (LSTDG) predictor step, which solves a local Cauchy problem in the small, based on a weak formulation of the partial differential equations in spacetime.
Although DG methods are proven to be nonlinearly L2 stable, whenever steep gradients or discontinuities appear in the solution, the use of an unlimited high-order DG scheme inevitably leads to spurious oscillations known as Gibbs phenomenon. In order to cope with this problem, several attempts have been made, e.g., artificial viscosity (Hartmann & Houston 2002;Persson & Peraire 2006;Cesenek et al. 2013), filtering (Radice & Rezzolla 2011), hybridisation with finite-volume/finite-difference schemes for the selected "troubled cells" adopting some sort of high-order slopelimiting procedures (Cockburn & Shu 1998a;, 2004Balsara et al. 2007;Zhu et al. 2008;J. Zhu & Qiu 2013;H.Luo et al. 2007;Krivodonova 2007). Here, we employ to the socalled a-posteriori finite volume subcell limiter (SCL) technique recently proposed by Dumbser et al. (2014b), which is based on the so-called multi-dimensional optimal order detection (MOOD) of Clain et al. (2011) and Diot et al. (2012). The main advantage of this approach is that the high-resolution properties of unlimited DG methods are preserved thanks to the introduction of a subgrid level, which is used a-posteriori for integrating the partial differential equations in troubled cells by means of a more robust highorder accurate finite-volume scheme [for completeness see also the work of (Casoni et al. 2013;Sonntag & Munz 2014Fechter & Munz 2015;Meister & Ortleb 2016) for alternative subcell DG limiters]. The presented SCL has been applied to several systems of nonlinear partial differential equations with promising results in the work of 2015b) and Fambri et al. (2017).
The paper is organised as follows: In Sec. 2 we describe the system of governing partial differential equations, in Sec. 3 we describe the ADER-DG scheme with the finite-volume subcell limiter and the adaptive mesh refinement (AMR) technique. Section 4 presents the testbeds both in special and general relativity that the scheme has passed. In Section 5 we present strong MPI scaling results up to 16,000 MPI ranks and performance and accuracy comparisons between Runge-Kutta DG schemes and ADER-WENO finitevolume schemes. Finally, Sec. 6 contains a summary of the results and an outlook to future work. Finally, Sec. 6 contains a summary of the results and an outlook to future work.
Hereafter, Latin indexes run from 1 to 3, while Greek indices run from 0 to 3. The zeroth components refer to the timelike coordinate of the corresponding tensor or vector and the signature of the metric tensor is assumed to be (−, +, +, +) through all the text. We use the Einstein summation convention over repeated indexes. Wherever not specified, the index correspondence z ) is adopted. Moreover, bold symbols are used to indicate three-vectors (or tensors). We use units with speed of light c = 1 and gravitational constant G = 1.

MATHEMATICAL FORMULATION AND PHYSICAL ASSUMPTIONS
The governing equations of an (ideal) fluid coupled to an electromagnetic field and described in a curved spacetime are given by the general-relativistic magneto-hydrodynamics equations (GRMHD). Following the derivation and formalism developed by Del Zanna et al. (2007), the covariant Euler-Maxwell system reads and contains the conservation of the energy momentum tensor T µν , as well as the homogeneous Faraday law, with ∇µ being the covariant derivative operator. Since in most astrophysical phenomena the electrical conductivity of the plasma is very high, the ideal-MHD approximation (where the electrical conductivity is actually assumed to be divergent) is a reasonable one. In this case, the electrical field is completely determined by the fluid velocity and the magnetic field, that is, the magnetic flux φB = B · S over any surface S is conserved and is advected with the fluid movement. The magnetic contribution to the hydrodynamics equations, i.e., the MHD equations, is then just a conservation equation for the magnetic field, which we will describe in the next Sections.

The 3+1 split of spacetime
The 3 + 1 decomposition of spacetime is the most widely used framework to prepare general-relativistic theories such as the GRMHD for numerical discretization. The four-dimensional spacetime manifold is decomposed into three-dimensional (3D) spacelike hypersurfaces which are parametrized by a time coordinate t and described by the 3D objects (Thorne & Macdonald 1982;Baumgarte & Shapiro 2003;Rezzolla & Zanotti 2013): the lapse function α, the spatial metric tensor γ, the shift vector β, and the extrinsic curvature tensor K. The (smooth) foliation or slicing Σt defines a timelike normal vector to the 3-hypersurface which is the future-oriented unit (nµn µ = −1) vector and can be regarded as the four-velocity of the Eulerian observer, i.e., at rest in the 3D hypersurface Σt.
Any four vector V µ (or tensor) can be split into its temporal and spatial components, respectively where the relation between the purely spatial (3D) metric tensor γij and the spatial-projection operator γ is given by γµν := gµν + nµnν , γ µ ν := g µ ν + n µ nν , (2.7) with the obvious property that γ ·n = 0. In this formalism, the spatial metric γij is used for lowering/raising indexes of purely spatial vectors (or tensors). Given a coordinate system (or x) are the spatial coordinates, the line element on the foliation Σt can then be expressed by the 3 + 1 form of the metric, i.e., ds 2 = −α 2 dt 2 + γij dx i + β i dt dx j + β j dt . (2.8)

The GRMHD system
In order to write the system of the GRMHD equations in the 3 + 1 decomposition of the spacetime, we define the vector 1 V of the 19 primitive variables as V := ρ, vj, p, B j , Φ, α, β j ,γm , j = 1, 2, 3; m = 1, . . . , 6 , where ρ is the rest-mass density in the frame comoving with the fluid, v the three-velocity vector, p the fluid pressure, B the magnetic field vector in the comoving frame, Φ an artificial scalar introduced to ensure the divergence-free constraint of the magnetic field at the discrete level via the hyperbolic divergence-cleaning approach (Dedner et al. 2002), α the lapse function, β, the shift vector, and γ a vector whose components represent the six independent components of the three (spatial) metric γ, i.e., γ = (γ11, γ12, γ13, γ22, γ23, γ33) (2.10) The corresponding state vector Q of conserved variables is defined as Note that while ρ can be seen as the rest-mass density of the fluid evaluated by the Lagrangian comoving observer with four-velocity u µ , D and v µ are the rest-mass density and the velocity as measured by the Eulerian observer. As such, v µ is a purely spatial vector (nµv µ = 0) and its norm is the one appearing in the definition of the Lorentz factor W . Finally, the symbol γ denotes instead the determinant of γ, i.e., γ = det(γij). Also associated to the Eulerian frame is the (Eulerian) threemomentum density vector Sj, which is related to the Lagrangian velocity u µ through the following identities The conserved variables Q(V ) can be easily expressed in terms of the primitive variables via D := ρW , (2.12) Here, U is the conserved energy density and τ the corresponding quantity without the rest-mass energy energy density, h = 1 + + p/ρ is the specific enthalpy and is the specific internal energy (Rezzolla & Zanotti 2013). The electric field in the Eulerian frame is indicated as E and in the ideal-MHD limit (i.e., for diverging electrical conductivities) it is determined by the simple Ohm law (2.2), i.e., Ei = −˜ ijk v j B k . The cross product is given by the spatial three-Levi-Civita tensor density˜ .
The (covariant) Pointing vector E × B in the momentum density (2.13) can be written as Given all these definitions, the system of partial differential equations for ideal GRMHD can be written in the very compact nonconservative homogeneous form (1.2), where the conservative fluxes F and the nonconservative product B(Q) · ∇Q are given by and where T ij denotes the spatial stress-energy tensor with the total pressure comprising both the fluid and the magnetic pressure, i.e., Since we are here interested in static spacetimes (Cowling approximation), the system of equations does not contain explicitly the extrinsic-curvature tensor K, which can be expressed simply in terms of metric functions (Misner et al. 1973;York 1979;Gourgoulhon 2012) (2.20) As mentioned above, the divergence-free constraint of the magnetic field is here taken into account at the discrete level through the so-called hyperbolic Generalized Lagrangian Multiplier approach (GLM, and also known as "divergence-cleaning") proposed by Dedner et al. (2002), i.e., by augmenting the GRMHD system with an additional auxiliary equation for an artificial scalar field Φ, in order to propagate away numerical errors in the divergence-free constraint of the magnetic field ∂i √ γB i = 0 . (2.21) In order to achieve a more efficient divergence cleaning, we also allow the characteristic velocity of the divergence cleaning c h to be larger than the speed of light, i.e., c h ≥ 1 in (2.16). Typical values for the cleaning speeds are chosen in the range c h ∈ [1, 2].

Equation of state, primitive recovery, characteristic speeds
For the closure of the GRMHD equations, an equation of state p = p(ρ, ) has to be chosen. With the aim of simplicity, we here consider the ideal-fluid (or "Gamma-law") equation of state (Sec. 4) where Γ is the polytropic index. In the same spirit, for the recovery of primitive variables V (Q), we employ a standard approach corresponding to the third option reported in Sect. 3.2 of Del Zanna et al. (2007); possible alternatives for performing the inversion of system (2.13) are discussed by Noble et al. (2006). For the characteristic wave speeds in GRMHD we usually employ the standard magnetosonic approximation for the wave speeds (as in Gammie et al. 2003), but accounting also for the possibility c h > 1. Note that this choice of eigenvalues corresponds to the standard choice when c h = 1 and is also valid in the general relativistic hydrodynamics limit.

ADER discontinuous Galerkin schemes
As mentioned in the Introduction, the numerical scheme that we adopt is the ADER discontinuous Galerkin (DG) scheme supplemented with an a-posteriori finite-volume subcell limiter approach with AMR, presented in the series of papers 2015b) and Fambri et al. (2017) in the context of the Euler equations of compressible gas dynamics, ideal MHD, special relativistic RMHD, but also compressible Navier-Stokes and viscous and resistive MHD equations. A brief overview of the numerics is given in the following.
After choosing a mesh partition Ω h = {Ωi}, which is Cartesian and spacetime adaptive through a cell-by-cell approach (see Khokhlov 1998), with the property with Ω being the computational domain, NE the total number of spatial elements and " • " denoting the interior operator. The weak formulation of the governing equations (1.2) is then written in the form where φ k ∈ U N h is a generic basis element for the vector space U N h of piecewise polynomials of maximum degree N ≥ 0 defined over Ω and which are allowed to be discontinuous across the element interfaces ∂Ωi. In this work, the set of basis and test functions {φ k } has been chosen as the set of Lagrange interpolation polynomials of maximum degree N over Ωi with the property Stroud (1971) for a detailed discussion of multidimensional quadrature]. For this reason, the chosen polynomial basis is said to be a nodal basis with respect to the Gauss-Legendre quadrature points.
Since the chosen AMR grid is locally Cartesian, the spatial integrals of Eq. (3.5) can be evaluated in a dimension-by-dimension fashion in x, y and z direction, and the corresponding nodal test and basis functions are defined after rescaling the domain of integration Ωi to the unit element [0, 1] d . Therefore, we only need the tensor product of the GL quadrature points in the unit interval [0, 1], denoted by {ξ m GP }m=1,...,N+1 in the following. Note that the total number of GL quadrature points {x m GP } in Ωi, as well as the total number of basis elements {φ k }, is (N + 1) d .
After integration by parts of the flux-divergence term, Eq. (3.2), can be rewritten as After restricting the space of the solutions to the set of piece- the following higher order accurate path-conservative ADER-DG scheme is obtained for the so-called degrees of freedom of u h , or expansion coefficients,û n k :  where an element-local spacetime predictor solution q h (x, t) has been introduced and the details related to its computation are given in the next section. Due to the discontinuous character of the solution q h at the element interfaces ∂Ωi, the surface integral of the fluxes is computed by means of an approximate Riemann solver G depending on the boundary extrapolated data q − h and q + h evaluated at the left and right of an element interfaces, respectively. In this paper we mainly use the simple Rusanov flux (Rusanov 1961) where smax denotes the maximum signal speed computed in q − h and q + h . Any other monotone numerical flux function could be used equally well, see Toro (2009) for an overview of different Riemann solvers. On the other hand, the jump term of the nonconservative product has been approximated with a so-called path-conservative scheme (Parés 2006;Castro et al. 2006) of the form that is based on the theory of Dal Maso et al. (1995) on hyperbolic partial differential equations with nonconservative products and which must obey the generalized Rankine-Hugoniot or consistency condition . We here use the simplest possible path, i.e., the straight-line segment path Notice that the combination of (3.6) with (3.7) represents the extension of the Rusanov (or local Lax-Friedrichs) flux to the nonconservative case. Indeed, other more sophisticated schemes may be used with the aim of reducing the numerical dissipation [see e.g., the HLLEM-type version of Dumbser & Balsara (2016), which is an extension of the HLLEM flux of Einfeldt et al. (1991), or the path-conservative Osher schemes forwarded in Dumbser & Toro (2011)].
Note also that the choice made here to interpret the gravity terms as a nonconservative product makes them appear not only in the volume integral in Eq. (3.5), but also also in the Riemann solver via Eq. (3.7) above. This contribution to the Riemann solver is not present in classical discretisations as purely algebraic source term. However, the main advantage of path-conservative schemes is that they allow at least in principle the construction of well-balanced numerical schemes that are able to preserve particular steady-state solutions of the governing partial differential equations exactly. Although the development of exactly well-balanced schemes for the GRMHD equations is beyond the scope of this work, it represents an interesting extension of the formalism presented here.
As a concluding remark in this Section we note that the ADER-DG scheme (3.5) is (N + 1)-th order accurate for smooth solutions. Since the final algorithm is a purely explicit DG scheme, a CFL-type stability condition on the time step holds in the form where hmin is the minimum characteristic mesh-size, d is the number of spatial dimensions, λmax is the maximum signal velocity of the system of partial differential equations, and CFL is a constant coefficient such that 0 < CFL < 1. If not stated otherwise, the standard value for the tests presented in this paper is CFL = 0.9.
For the results of a numerical von Neumann stability analysis of ADER-DG schemes, see e.g., Dumbser (2005)

Spacetime discontinuous Galerkin predictor
First introduced in Eq. (3.5), the spacetime predictor q h is an "interior" solution of the partial differential equations within each element, based on the following weak formulation of (3.4) in spacetime: where the spatial domain of integration has been reduced to only the interior of the space elements Ω • i , i.e., without integration by parts of the space-integrals. As a result, one obtains a system of NE independent (element-local) equation systems.
Note the introduction here of the new basis set {θ k } for the vector space Q N h of piecewise spacetime polynomials of maximum degree N , and the discrete solution q h (x, t) is represented in terms of the basis functions θ k as (3.12) Also in this case, a nodal basis is used, based on the Gauss-Legendre quadrature points referring to the spacetime element Ωi × [t n , t n+1 ]. After integration in time by parts of the first term in (3.11) and after invoking the causality principle (upwinding in time) then the following NE independent systems of (N + 1) (d+1) nonlinear equations in the spacetime degrees of freedomq k are obtained: (3.13) The system of equations (3.13) can be solved via a simple discrete Picard iteration for each element Ωi, without needing any communication with neighbour elements (Dumbser et al. 2008b). We should stress that the choice of an appropriate initial guess q 0 h (x, t) for q h (x, t) is crucial to obtain a computationally efficient scheme. One can either use an extrapolation of q h from the previous time interval [t n−1 , t n ], as suggested in , or a second-order accurate MUSCL-Hancock method, as suggested in Hidalgo & Dumbser (2011). For the initial guess, one can write a Taylor series expansion in time and then simply needs to compute approximations to the time derivatives of q h at time t n , where is used as an abbreviation in the following. A second-order accurate MUSCL-type initial guess for for q h (x, t) is given by while a third-order accurate initial guess for q h (x, t) reads For an even higher-order accurate initial guess, one can employ the continuous extension Runge-Kutta (CERK) schemes proposed in Owren & Zennaro (1992). For the use of CERK schemes as time integrators of explicit discontinuous Galerkin schemes, see Gassner et al. (2011). If an initial guess of the order N is chosen, it is sufficient to use one single Picard iteration in order to solve (3.13). It is also useful to remark once again that one-step ADER schemes are particularly well suited for AMR with time-accurate local-timestepping (LTS) and allow a consistent reduction of MPI communications compared to classical Runge-Kutta time stepping schemes, see Dumbser et al. (2013Dumbser et al. ( , 2014a, and , , Fambri et al. (2017) for details. Furthermore, in ADER schemes for nonlinear hyperbolic PDE, limiters need to be applied only once per time step, while in Runge-Kutta based method of lines schemes, the limiter needs to be applied in each Runge-Kutta stage again.
For a detailed comparison of Runge-Kutta and ADER finitevolume schemes, see Dumbser et al. (2006) and Balsara et al. (2013), while Runge-Kutta DG and Lax-Wendroff DG schemes (the latter are very similar to ADER-DG schemes) have been compared in , also concerning computational performance. We also show a detailed computational performance comparison between ADER-DG schemes and RKDG schemes for GRMHD at the end of this paper in Section 5.

A-posteriori subcell finite-volume limiter
The ADER-DG scheme (3.5) is formally of order N +1 for smooth solutions, hence the method must be oscillatory for N > 0 in the presence of discontinuities, since the scheme is linear in the sense of Godunov (1959), thus inevitably generating spurious oscillations (this is also known as the "Gibbs phenomenon" in the context of signal analysis). In order to cope with this problem, a special treatment is needed wherever and whenever the solution is discontinuous, or the gradients in the discrete solution are sufficiently steep. In our specific implementation, this problem is handled as follows: After evaluating the predictor solution q h (x, t) via Eq. (3.13), a so-called candidate solution u * h (x, t n+1 ) is computed through the unlimited one-step ADER-DG scheme (3.5). Next, the candidate solution u * h (x, t n+1 ) is checked against mathematical and physical admissibility criteria, which are collectively referred as the relaxed discrete maximum principle (DMP). These criteria are: the absence of floating point errors (NaNs), the positivity of pressure and density of the fluid, the velocity being lower than the light speed and a possible (successful) conversion from conservative to primitive variables V = V (Q) [see Loubère et al. (2014); ]. Typical scenarios that may potentially violate the cited admissibility criteria are: the vicinity of steep-gradients or discontinuities, under-resolved flow features, as well as very low pressure and density conditions, e.g., atmospheres around compact objects or vacuum regions.
For the subcell finite-volume limiter we introduce the notation ( 3.17) In practice, the relaxed DMP used in this paper reads: where Vi is the set containing the space-element Ωi and its Voronoi neighbours that share a common node with Ωi. Here, the parameter δ in (3.18) is chosen as with δ0 = 10 −8 and = 10 −7 , which is more restrictive than what used in previous work [(Dumbser et al. 2014b), Zanotti et al.].
If the candidate solution u * h violates any of the criteria of the relaxed DMP (3.18), then it is locally rejected and the cell Ωi is flagged as a troubled cell and a limiter status flagβ n+1 i is set tõ β n+1 i = 1; conversely, it is set toβ n+1 i = 0 if all admissibility criteria are satisfied in cell Ωi at time t n+1 . For a troubled cell Ωi, the numerical solution is then recomputed, starting again from the old time level t n , but using now a more robust numerical scheme than the high-order ADER-DG scheme.
We have here selected as numerical scheme on the subgrid level a second-order accurate MUSCL-Hancock TVD finitevolume scheme with MinMod slope limiter (Toro 2009), mostly because of its proven robustness in the presence of shock waves and low density atmospheres. For cells which were unlimited at the old time (i.e., β n i = 0), it is easy to compute the necessary subcell averages via the projection (3.17), while for limited cells at time t n , the subcell averages are already available from the previous time step. As an alternative, a higher accurate ADER-WENO finite-volume schemes can be used [see Dumbser et al. (2014b); Dumbser et al. (2013)], bearing in mind that the WENO approach does not clip local extrema, in contrast to the chosen second-order TVD method. However, for the GRMHD system considered here we have found the subcell TVD limiter to be much more robust than the WENO scheme.
Formally, we can write both the second-order MUSCL-Hancock scheme, as well as a high-order ADER-WENO scheme, which is very similar to the ADER-DG scheme (3.20). High order in space, together with non-oscillatory properties, are achieved in Eq. (3.20) via a nonlinear reconstruction of piecewise polynomials from the known cell averagesv n i,s using either a TVD or a WENO reconstruction. Denoting by w h (x, t n ) the result of this reconstruction, it can then be used to compute the predictor q h (x, t), either via Eq. (3.13), where u h (x, t n ) is simply replaced by w h (x, t n ) and the control volume of the spacetime integration is replaced by Ωi,s×[t n , t n+1 ], or via the simple MUSCL-Hancock evolution step to the half-time level [see Toro (2009) where R is the reconstruction operator associated with the projector P, so that R • P = I, with I the identity operator [see Dumbser et al. (2014b) for details]. For the subcell finite-volume scheme a different CFL stability condition applies and takes the form with hmin the minimum cell size referred to the DG control volumes Ωi. Choosing Ns ≥ N +1 is a natural requirement that allows to reconstruct the of degrees of freedom of u h from the piecewise constant solution v h via R. Following Dumbser et al. (2014b) we choose Ns = 2N + 1 so that ∆tFV = ∆tDG. This choice allows us to maximise the resolution properties of the chosen subcell finite-volume scheme and to run it at its maximum possible CFL number. For alternative higher order ADER-WENO finite-volume schemes for the relativistic MHD equations with reconstruction in primitive variables, the reader is referred to Balsara & Kim (2016) and .

Adaptive Mesh Refinement
The ADER-DG algorithms with subcell finite-volume limiter described above has been here implemented on spacetime adaptive Cartesian meshes. Detains on our AMR algorithm have been described in Dumbser et al. (2013) and in Zanotti et al. ( , as well as 2015b and Fambri et al. (2017), and we refer the interested reader to these works. The AMR strategy adopted here is named "cell-by-cell" refinement and consists in providing a space-tree data structure (see Khokhlov 1998;Bungartz et al. 2010;Weinzierl & Mehl 2011;Dumbser et al. 2013, for details), whose "leaves" correspond to the spatial elements Ωi used by the numerical scheme described before. The main alternative to a space-tree data structure is the use of so-called "patches", (see Berger & Oliger 1984;Berger & Jameson 1985;Berger & Colella 1989), where a set of independent overlaying Cartesian sub-grid domains, or "patches", is introduced and activated when necessary. In our AMR approach the numerical solution is checked independently along every single space-element for an eventual recursive refining or recoarsening process. In practice, starting from an initial Cartesian grid of refinement level = 0 = 0, which is the basic mesh without refinement, the tree-type infrastructure of finer refinement levels is made accessible. The refinement levels > 0 are built according to the so called refinement factor R which is the number of smaller spaceelements per space-direction in which a coarser element is broken in a refinement process, or which are merged in a recoarsening stage. Note that choosing a refinement factor R = 2 would generate the well known "quadtrees" in two-dimensional (2D) meshes and "octrees" in 3D meshes. For an arbitrary refinement factor R, general space-trees are obtained [see also Bungartz et al. (2010); Weinzierl & Mehl (2011)].
For practical purposes, a finite number of refinement levels is provided, i.e., from the coarser = 0 to a finest possible refinement level = max ∈ IN + 0 . The refinement/recoarsening process is driven by a prescribed refinement-estimator function χ = χ (u h (x, t n )), which is a function of discrete gradients and second derivatives of a scalar indicator function ϕ, and by two thresholds χ + and χ − . Elements are marked for refinement whenever χ > χ + and for recoarsening whenever χ < χ − [for details on the definition of χ see Löhner (1987); ; Fambri et al. (2017)]. In general, the indicator function ϕ can be chosen to be any mathematical quantity of physical interest that varies in the computational domain and time, e.g., the local rest-mass density, the pressure, or a function of the state-variables and of their gradients, e.g., the Lorentz factor or the vorticity, but also the limiting-status β n i . Hereafter and unless stated otherwise, we have simply used the rest-mass density as indicator function, i.e., ϕ = ρ. An alternative choice of χ that deserves investigations in the future would consist in the evaluation of the numerical production of entropy as both error and smoothness indicator [see Puppo & Semplice (2011); Semplice et al. (2016); Cravero & Semplice (2016) for details], but has not yet been used in the present paper.
To simplify the AMR algorithm, two neighbour elements are allowed to belong either to the same level or to an adjacent refinement level ± 1. To each element in the tree we assign a basic element status which is for the so-called parent cells 0 , for active elements +1 , for the so-called virtual children where Ntot is the total number of space-elements present in the tree. Note that Ntot should be distinguished from the total number of active elements NE, which are the leaves of the tree that define the Ωi used in the numerical scheme, and for which Ntot > NE holds in general. The so-called parent cells (σi = −1) are those tree elements which contain active elements on a higher level and finally a virtual child cell (σi = +1) is a tree element which is contained within an active cell that belongs to a lower and adjacent refinement level − 1.
Apart from the storage of flux contributions from neighbour cells within our high-order time-accurate local time stepping (LTS) algorithm Dumbser et al. (2013), virtual cells are also needed for high-order finite-volume schemes to provide the necessary data for polynomial reconstructions (TVD, WENO) on a given refinement level if two adjacent active cells belong to different refinement levels; this is illustrated schematically in Fig. 1. This strategy produces a locally uniform grid around each cell and greatly simplifies reconstruction. Our strategy of generating a locally uniform grid around each cell is very different from the approach based on genuinely multidimensional CWENO reconstructions proposed by Semplice et al. (2016).
The dynamics of the numerical solution on virtual elements is given by standard L2 projection (for virtual children) or averaging (for parent cells), as depicted in Fig. 2, where the mapping between the chosen solution spaces, piecewise polynomial (unlimited) or piecewise constant (limited), and between two adjacent refinement levels and + 1 is depicted.
Finally, due to the possibility of handling a large range of spatial scales within the same domain, corresponding to very different CFL time restrictions, a time-accurate and fully conservative local time stepping (LTS) has been implemented in order to use the smallest admitted timestep only where necessary, and a large timestep where it is allowed (see Dumbser et al. 2013). A flow diagram illustrating the main stages of the final algorithm presented in this Section can be found in Dumbser et al. (2014b).

NUMERICAL VALIDATION
In the following sections, we will present a series of numerical validations of the numerical algorithms introduced so far and characterised by a path-conservative ADER-DG scheme supplemented by an a-posteriori finite-volume limiter applied to AMR grids. The tests have been performed both in special and in general relativity, employing either two or three spatial dimensions. Furthermore, the validations will be distinguished in "smooth flows" (Sec. 4.1), for which we will be able to measure the actual convergence order of the scheme, and "non-smooth flows" (Sec. 4.2 and 4.3), for which we will illustrate the ability of our approach to handle accurately shocks and large gradients.
All of these tests share a number of common properties which we list below and have been employed unless stated otherwise: i) the adiabatic index has been chosen equal to Γ = 4/3; ii) the refinement factor has been chosen as R = 3; iii) a second-order MUSCL-Hancock TVD finite-volume method with reconstruction in primitive variables on the subgrid-level has been employed as subcell finite-volume limiter for the ADER-DG PN method; iv) the Rusanov (or local Lax-Friedrichs) approximate Riemann solver has been used; v) problems in curved spacetimes have been solved employing Kerr-Schild (KS) coordinates, either spherical or Cartesian.

Smooth general-relativistic flows
We first test the high order of convergence of our ADER-DG schemes against three different scenarios in curved spacetimes given respectively by: i) the Michel accretion of gas onto a black hole in KS spherical (KSS) coordinates and in 2D); ii) a stationary non-selfgravitating fluid torus in equilibrium around a black hole, again in 2D; iii) the Michel accretion with a radial magnetic field in KS Cartesian (KSC) coordinates and in 3D.
To ensure that the flow is actually smooth, in the following tests we will restrict our computational domain to regions that are fully filled with fluid. In this way, after successively refining the mesh, we evaluate the L2 and L∞ error norms at different DG polynomial degrees and mesh resolutions so as to to measure the convergence order of our numerical implementation and compare it with the expected mathematical one. Anticipating what will be shown in more detail in the following sections, the numerical results confirm the high order of accuracy of the presented numerical scheme. Indeed, using the results shown in Tables 1-3 we can conclude that the ADER-DG PN method reaches its design accuracy N + 1 in most cases.
For all these convergence tests, since the reference solutions are stationary in time, we used the initial condition as the external state vector in the chosen approximate Riemann solver whenever evaluating the fluxes only at the boundary interfaces x ∈ ∂Ω.

2D Michel accretion onto a Schwarzschild black hole
As a first test of a smooth flow with an analytical solution we consider the spherical transonic accretion of an isentropic fluid onto a nonrotating black hole is known as Michel solution (Michel 1972). For the sake of completeness we give the explicit expressions of the lapse, the shift and the spatial metric of a Kerr black hole with mass M and spin a in Cartesian Kerr-Schild coordinates (x, y, z) with lx := rx + ay r 2 + a 2 , ly := ry − ax r 2 + a 2 , lz := z r , and r = x 2 + y 2 + z 2 − a 2 2 + x 2 + y 2 + z 2 − a 2 2 2 + z 2 a 2 .
After taking the metric (4.2) with a = 0 and defining the values of the free parameters of the problem, i.e., the mass of the black hole M = 1, the critical radius rc = 8 M and the critical density ρcM 2 = 1/16, the Michel solution can be determined analytically [see, e.g., Rezzolla & Zanotti (2013)].
We have performed this test in spherical KS coordinates with a spatial domain (r, θ) ∈ Ω = [1.5, 100] × [0.15, 3.0], discretized with a uniform mesh of 200 × 32 elements and solved with our ADER-DG P5 scheme. A graphical representation of the numerical results and their comparison with the analytic solution is shown in Fig. 3, while the results of the convergence study are provided in Table 1. Clearly, we can note an excellent agreement between analytical and numerical solution and that the latter converges at the expected and high order.

2D torus interior around a Schwarzschild black hole
Next, we consider the numerical convergence study of a stationary solution of a thick disk (or axisymmetric test-fluid torus) orbiting around a Schwarzschild black hole (a = 0) of mass M = 1 in 2D spherical KS coordinates. The theory of the equilibrium of these non-selfgravitating fluids in GRHD has been first proposed by Abramowicz et al. (1978); Kozlowski et al. (1978) and has been the subject of a vast literature. For completeness, we give in appendix A a brief description of the setup of the primitive variables of this test problem, referring the interested reader to Font & Daigne (2002)  Nx DG-P 4.56E-10 2.51E-10 7.81 7.21 4 5.38E-11 3.62E-11 7.43 6.72 5 1.04E-11 8.11E-12 7.37 6.71 Table 1. L 2 and L∞ errors and convergence rates for the 2D Michel accretion in spherical Kerr-Schild coordinates for the ADER-DG-P N scheme. We report the convergence results for the rest-mass density ρ at t = 10 up to N = 6, and contrast the results with the expected rate. The domain has been chosen different (enlarged) for the cases N = 5 and N = 6 in order to keep away the numerical error from the machine limit. Similar results have also been obtained for all other flow variables.
specific angular momentum of 0 = 3.8, a potential gap ∆W = −10 −3 (inside and nearly filling its Roche lobe). The polytropic constant and exponent have been chosen equal to K = 0.0496 and Γ = 4/3, respectively. Also in this case, for a rigorous testing of the convergence or- DG-P  Table 2. L 2 and L∞ errors and convergence rates for the 2D torusinterior problem in spherical Kerr-Schild coordinates for the ADER-DG-P N scheme. We report the convergence results for the rest-mass density ρ at t = 10 up to N = 4, and contrast the results with the expected rate. Similar results have also been obtained for all other flow variables.
der we have simulated only an inner portion of the torus which is fully filled by fluid, namely, the one covered by the coordinate patch (r, θ) ∈ Ω = [7, 10.5] × [1.47, 1.67]. The corresponding measured convergence order after evolving the set of the GRHD equations in spherical KS coordinates are reported in Table 2, once again showing the expected high order of convergence of our ADER-DG scheme. We conclude this test by remarking that torus simulations where the torus is fully contained in the computational domain, which therefore includes also a region set to atmosphere, will be presented in Sec. 4.3.1.

3D Michel accretion with radial magnetic field
This is the 3D version of the similar test presented in Sec. 4.1.2, with the addition of one spatial dimension (corresponding to the azimuthal Killing vector) and of a radial magnetic field. Although such a magnetic field is unphysical, since it leads to a nonzero divergence and hence to the presence of a magnetic monopole, it is nevertheless widely used for testing GRMHD codes [see, e.g., Etienne et al. (2010)]. Here, we use it to test the convergence order of our high-order method by considering also the magnetic component of the set of partial differential equations. In addition, to stress-test our numerical infrastructure, we have employed for this test 3D Cartesian KS coordinates, so that the magnetic field lines are not aligned with any of the coordinate axis.
The chosen contravariant components of the radial magnetic field takes the form where the black-hole mass is again set to M = 1 and b µ is the magnetic field measured by the Lagrangian observer comoving with the fluid, i.e., The spatial domain is in this case given by (x, y, z) ∈ Ω = [−5, +5] 3 and is partitioned with a uniform mesh of 30 3 elements, where we have employed a very simple cubic excision to avoid the singularities at the coordinates' origin location of the black hole as shown in the left panel of Fig. 4. At the excision boundary, we impose the exact solution of the problem as boundary condition in all variables.
After adopting a ratio (b 2 /ρ)hor = 4 at the horizon, the results of the convergence study are presented in Table 3, while graphical representation of the numerical results is offered in the right panel of Fig. 4, which reports the numerical solution interpolated along 200 points at z = 0 and y = x for the rest-mass density and the x-component of the velocity and of the magnetic field vectors as plotted against to the analytical solutions. Clearly, also in this case the numerical solution is shown to converge at the expected order of accuracy, confirming the validity of our implementation in the presence of a magnetic field and of a nontrivial coordinate mapping.

Non-smooth special-relativistic flows
The tests considered in this section are considerably different from those discussed so far in that they do not involve smooth flows and allow therefore for the presence of nonlinear waves, either in the form of shocks or of steep gradients as those present at the fluid interface with an atmosphere.

Riemann problems
We start by considering two standard Riemann (or shock-tube) problems, here referred to respectively as RP1 and RP2, and originally proposed in the context of special relativistic MHD by Balsara (2001). Although these tests are solved on flat spatial hypersurfaces, i.e., γij = δij, where δij is the identity threematrix, they employ different setups for the gauge variables, the lapse function and the shift vector. In particular, Table 4 Nx 1.34E-08 3.65E-08 6.18 5.99 Table 3. L 2 and L∞ errors and convergence rates for the 3D Michel accretion with radial magnetic field in Cartesian Kerr-Schild coordinates for the ADER-DG-P N scheme. We report the convergence results for the magnetic field component B x at t = 10 up to N = 6, and contrast the results with the expected rate. Similar results have also been obtained for all other flow variables. The adiabatic index for RP1 and RP2 has been set to be Γ = 2 and Γ = 5/3, respectively. For these tests, the HLL approximate Riemann solver has been used. Figure 6 offers a 3D view of the rest-mass density variable for the proposed shock-tube problems and the corresponding AMR grid and limiting status, for the case α = 2, obtained with our ADER-DG-P3 scheme using a level-zero mesh of 40 × 5 spaceelements onto with max = 2 maximum refinement levels are added, and an ADER-DG-P5 scheme on a level-zero grid of 120 × 5 elements with one single refinement level max = 1. The corresponding one-dimensional (1D) cuts relative to the P5 solutions are presented instead in Fig. 7 relatively to the test configurations listed in Table 4; shown with solid lines are the corresponding solutions from the exact Riemann solver of Giacomazzo & Rezzolla (2006). In the presence of moving discontinuities, the expected order of convergence of any shock capturing method is at most one. In Fig.  5 we show the results of a numerical convergence study for RP2, indicating that the numerical method converges indeed with the expected order of one for flows with shocks and discontinuities.
Overall, the results of these tests confirm the high-resolution shock-capturing capability, but also the robustness, of the new class of ADER-DG PN schemes. In addition, they show that the aposteriori finite-volume sub-grid limiter is activated only in very small portions of the domain and, in the case of genuine shocks, it is very narrowly concentrated near the discontinuity.

Advection of a 2D magnetic field loop
In this special-relativistic 2D problem we advect a loop of magnetic field which is at a magnetic pressure much smaller than the corresponding fluid pressure. The computational domain in Cartesian coordinates is given by is (x, y) ∈ Ω = [−1, +1] × [−0.5, 0.5] with periodic boundary conditions everywhere. Using unitary (dimensionless) rest-mass density and gas pressure, i.e., ρ = p = 1, the velocity field is set to be constant with and initialised as (vx, vy) = (2, 1)V0, where V0 = 1/5. The magnetic-field vector is derived from the magnetic vector potential, which is specified as where r is the radial coordinate, R = 0.3 is the radius of the advected loop and the parameter A0 = 10 −3 modules the magnetic field. The discontinuity at the loop boundaries has been initially slightly smoothed, e.g., by means of a standard linear smoothing in the form otherwise . (4.6) otherwise . (4.7) where s(r) = 1 − (r − R)/(r − R1) is the adopted linear taperfunction, with R1 chosen to be close to R, e.g., R1 = 0.315. Given the initial conditions and the periodic boundary conditions, the magnetic loop is advected across the computational domain and we have performed simulations using the lapse function set either to α = 1 or to α = 2, so that the corresponding simulation times to recover the initial configuration are t = 5 and t = 2.5, respectively; conversely, the shift vector β i is set to zero. This test has been solved using a level-zero mesh of 20 2 space elements with tho maximum refinement levels max = 2 via an ADER-DG-P4 scheme, supplemented with the a posteriori TVD subcell limiter and by adopting an HLL Riemann solver. At this point we would like to emphasize that instead of HLL or Rusanovtype Riemann solvers any other stable and monotone numerical flux could have been used equally well. The Riemann solver has to be understood as a building block of the DG scheme, exactly in the same way as it is in the finite-volume context. Figure 8 reports the numerical results, which show a good agreement between the advected solution and the reference one given by the initial condition (left panels). Furthermore, the limiter is only rarely activated, as expected for this test case (right panels). The solutions for the divergence cleaning scalar ψ are plotted in Fig. 9.

2D blast wave
Another standard test of the RMHD equations is represented by the cylindrical blast wave problem. In this benchmark, the plasma is initially at rest and subject to a constant magnetic field along the x-direction; we have therefore considered two different configurations strengths of the magnetic field, i.e., Bx = 0.1 and Bx = 0.5, representing the case of a moderately and of a highly magnetized plasma, respectively.
The initial conditions for the rest-mass density and pressure are given respectively by (4.8) and together with the magnetic-field strength are sufficient to fully specify the initial setup. Also in this case, and following see Balsara & Spicer (1999), a linear smoothing is used in order to avoid sharp discontinuities in the initial conditions. The computations have been carried out in 2D with a Cartesian coordinate system over a computational domain given by Ω = [−6, 6] 2 , with 40 2 elements on the coarsest mesh level, and a maximum refinement level max = 2. We have used the Rusanov Riemann solver with our ADER-DG-P3 scheme. The computed results for different physical quantities, the AMR grid and the limiter status are shown in Fig. 10 for the moderately magnetized case, and in Fig. 11 for the highly magnetized case. Note in the bottomright panels of figures the map of the "troubled cells" and how these are limited in extent and nicely map the dynamics of the discontinuities in the magnetic field. Clearly, the fraction of troubled cells in the case of the low-magnetisation setup represent only a very small fraction of the evolved cells (see Fig. 10); this is to be contrasted with what happens in the case of the much more challenging case of high magnetisation, where however the troubled cells still represent less than 50% of the evolved cells (see Fig. 11).
Lacking an analytic solution to compare with, the assessment of the results in this case is harder, but it is reassuring that the results match well those presented in other tests in the literature, e.g., by Del Zanna et al. (2007); Dionysopoulou et al. (2013); .

Orszag-Tang vortex
Our final special-relativistic test of non-smooth flows is another classic benchmark represented by the relativistic version of the Orszag-Tang vortex system Orszag & Tang (1979). This is a useful application of our numerical infrastructure as it involves the development of a complex and non-smooth magnetic-field structure and hence it explores geometries without trivial symmetries.
The initial conditions in this case are given by the vector of conserved variables (ρ, u, v, w, p, Bx, By, Bz with Γ = 4/3. The computational domain is Ω = [0, 2π] 2 , with 30 2 elements on the level-zero grid, a maximum refinement level of max = 2, periodic boundary conditions and a Rusanov Riemann solver for the subcell finite-volume limiter. Figure 12 shows the numerical results for the AMR grid with limiter status, the rest-mass density and the divergence-cleaning scalar ψ at different times, together with the corresponding numerical solution obtained with the same scheme on a fine uniform 270 2 mesh, corresponding to the finest mesh resolution at = max and which serves here as a reference. The figure, in particular, refers to simulations in which the P5-version of our ADER-DG has been adopted. Also for this test, a rigorous accuracy analysis is not trivial . Convergence study against Riemann problem RP2 of table 4. L 1 errors are plotted against the discretization step ∆x = L/Nx, with L = 1 being the length of the one-dimensional domain, Nx the discretization number, i.e., the number of high-order space-elements in the x-direction. These tests have been performed with the fourth-order accurate ADER-DG-P3 scheme supplemented by our second-order subcell finite-volume limiter. Figure 6. 3D view of the rest-mass density, the corresponding AMR grid and, on the horizontal plane, the corresponding limiting status, obtained with our ADER-DG P N with finite-volume subcell limiting. From the top panel to the bottom, from left to right: i) RP1 at t final = 0.2 with α = 2, P 3 , with a coarsest grid of 40 × 5 elements, max = 2; ii) RP1 at t final = 0.2 with α = 2, P 5 , with a coarsest grid of 120 × 5 elements, max = 1; iii) RP2 at t final = 0.275 with α = 2, P 3 , with a coarsest grid of 40 × 5 elements, max = 2; iv) RP2 at t final = 0.275 with α = 2, P 5 , with a coarsest grid of 120 × 5 elements, max = 1. The limited cells, using the subcell ADER-TVD finite-volume scheme, are highlighted in red along the horizontal plane below the 3D plot of the rest-mass density ρ, while unlimited DG-P N cells are highlighted in blue. but we note the very good agreement between the AMR simulations and the fine uniform-grid reference solution, as well as with the corresponding solutions that have been published elsewhere [see, e.g., ; Porth et al. (2017)]. Note also how the AMR grid structure and the troubled-cells patterns closely follow the development of steeper gradients and discontinuities.

Non-smooth general-relativistic flows
In the following two sections we discuss the use of our ADER-DG method in non-smooth general-relativistic flows, either in 2D and spherical coordinates or in 3D and Cartesian coordinates. The tests involve the evolution of non-selfgravitating tori as those presented in Sec. 4.2 with the important difference that the computational domain here fully contains the torus, whose exterior is therefore filled with a uniform atmosphere at a rest-mass density of ρ0 = 10 −9 that is five orders of magnitude smaller than the one at the torus centre.

2D torus around a Schwarzschild black hole
First, we consider a thick torus in equilibrium orbiting around a black-hole with the parameters previously described in Sec. 4.1.2 and using horizon-penetrating spherical KS coordinates in 2D. The   computational domain (r, θ) ∈ Ω = [2, 18] × [0.5, 2.5] is discretized with a uniform mesh of 50 2 elements using an ADER-DG-P3 scheme with TVD subcell finite-volume limiter (as a comparison, the torus has an inner radius rin = 5.5 and an outer radius rout = 13.8, so that the entire torus is resolved with only 26 elements in radial direction and 14 elements in angular direction). On the outer edge we impose the initial data as boundary condition in all variables. A 1D cut of the rest-mass density in the radial direction is shown in the left panel of Fig. 13 and is plotted over the analytic solution at t = 100 M . Note the excellent agreement between the numerical results and the exact solution, with differences in the central rest-mass density that are less than 0.7%.
It is useful to remark that the low-density atmosphere has been successfully simulated and robustly evolved in time with a highorder ADER-DG scheme and that inside the computational domain the limiter is activated only on the border of the torus, where spurious oscillations may generate possibly negative-valued densities and pressures in the high-order DG polynomials. However, the aposteriori subcell finite-volume limiter appears to be robust enough to accurately treat the atmosphere of the torus. Furthermore, we note that the fluid in this low-density region is treated so as to be evolved as a standard fluid, i.e., the velocity is not set to zero in a computational cell that is marked to host the atmosphere. As a result, during the simulations, the atmosphere the fluid in the atmosphere starts accreting onto the black hole; in practice the amount of matter accreted in this manner is minute and does not influence with the dynamics of the much denser matter lost from the torus.

3D torus around a Schwarzschild black hole
The final test considered in this battery is represented by a fully 3D evolution of the torus considered in the previous section, therefore adding the azimuthal spatial dimension.
For this, we use a horizon-penetrating Cartesian KS coordinates which cover a computational domain chosen to be (x, y, z) ∈ Ω = [−18, +18] × [2, 18] × [−8, +8]. The portion of the domain around the origin is excised following the same logic discussed in sec. 4.1.3. The solution has been computed using an ADER-DG-P3 scheme on a uniform mesh composed of 40 × 20 × 20 elements.
The 1D cut of the rest-mass density profile on the equatorial plane θ = π/2 and along different angular directions φ = π/4, π/2 and 3π/4 at t ∼ 30 M . The various numerical solutions are overlayed with the corresponding analytic solutions in the right panel of Fig. 13. Once again, we can observe an excellent agreement between numerical and exact solution, with differences in the central rest-mass density that are less than 1.5%.  . Advected magnetic field loop problem (SRMHD) obtained with the ADER-DG-P 4 scheme supplemented with the a posteriori TVD subcell limiter: results for the divergence cleaning scalar ψ at time t = 5.0 with lapse function α = 1.0 (left) and at time t = 2.5 with lapse function α = 2 (right).

STRONG MPI SCALING AND PERFORMANCE COMPARISON WITH OTHER SCHEMES
In this section we provide a detailed and quantitative performance analysis of the new ADER-DG schemes for the GRMHD equations proposed in this paper. We compare CPU times and MPI scaling results for ADER-DG in comparison with classical Runge-Kutta DG (RKDG) schemes. We furthermore provide CPU time comparisons between ADER-DG and ADER-WENO finite-volume (FV) methods.
As first test we run the Michel accretion problem again on the domain Ω = [3, 5.5] × [1, π − 1] in two space dimensions using a sequence of successively refined meshes of Nx × Nx DG elements and Nx(N +1)×Nx(N +1) finite-volume zones until a final time of t = 10. We use a third-order ADER-DG scheme (N = 2) and compare with a third-order ADER-WENO finite-volume scheme, see Dumbser et al. (2013); Dumbser et al. (2008b). In order to make the comparison fair, the mesh of the FV scheme is N + 1 times finer than the one of the DG scheme, since the DG method has N + 1 degrees of freedom per cell and per space dimension. The total number of degrees of freedom is therefore the same for both methods. We present the L2 and L∞ errors for the density ρ obtained with both methods. We also report the wall clock time (WCT) measured in seconds and the time needed by the scheme to update one single degree of freedom on one single CPU core (DTU), measured also in seconds. The inverse of this number represents the number of degrees of freedom that the scheme is able to update in one second on one CPU core and can be compared with other finite-volume and finite-difference methods. As computer hardware for this test we use one single CPU core of a workstation with an Intel i7-4770 CPU with 3.4 GHz clock speed and 16 GB of RAM. The results are shown in Table 5, from which it becomes clear that the ADER-DG scheme is faster and more accurate than the ADER finite-volume scheme using the same number of degrees of freedom. Similar results have already been reported in Dumbser et al. (2008b) and Dumbser (2010) for the Euler equations of hydrodynamics, the MHD equations and the compressible Navier-Stokes equations, using the unified framework of PN PM schemes.
As second test case we take the large amplitude Alfvén wave problem in flat Minkowski spacetime described in Del Zanna et al. (2007) and also used later in Dumbser et al. (2008b) and . We use the 3D computational domain Figure 10. Solution of the SRMHD blast wave with Bx = 0.1 at time t = 4.0, obtained with the ADER-DG P 3 scheme supplemented with the a posteriori second-order TVD subcell limiter. Top panels: rest-mass density (left), thermal pressure (center) and Lorentz factor (right). Bottom panels: magnetic pressure (left) with magnetic field lines reported, AMR grid (center) and limiter map (right) with troubled cells marked in red and regular unlimited cells marked in green.  Table 5. Comparison of L 2 and L∞ errors for the Michel accretion problem in 2D. Wall clock times (WCT) and CPU time per degree of freedom update (TDU) in seconds for a third-order ADER-DG scheme (N = 2) compared with a third-order ADER-WENO finite-volume (FV) scheme. Ω = [0, 2π] 3 , which is discretized with ADER-DG schemes of increasing order of accuracy in space and time and using a sequence of successively refined meshes of size Nx × Nx × Nx.

Nx
To provide a direct a comparison, we solve the same test problem also with high order Runge-Kutta DG schemes (Cockburn & Shu 1998b, 2001. Since ADER-DG schemes are uniformly high order accurate in space and time, for the RKDG method we use appropriate Runge-Kutta schemes in time whose temporal order of accuracy exactly matches the spatial one. In particular, we use the classical third and fourth-order RK schemes of Kutta (1901), the fifth order Runge-Kutta scheme of Fehlberg (1969) and the first one of the sixth order Runge-Kutta schemes proposed in Butcher (1964). Note that due to the well-known Butcher barriers that apply to high order RK schemes for nonlinear ODE systems, the fifth order RK scheme has six stages, and the sixth order RK scheme has seven stages. We run the test problem with both schemes without any limiter up to a final time of t = 1 and report the errors of the variable By measured in L2 norm. The computational results for ADER-DG and Runge-Kutta DG schemes are reported in Table 6, together with the measured wall clock times (WCT) in seconds and the time needed by each scheme to update one single degree of freedom (TDU) in microseconds. Again, the inverse of TDU in seconds represents the number of degrees of freedom that the scheme is able to update in one second on one single CPU core and can be directly compared with existing finite-volume and finite-difference codes. We observe that the CPU times and error norms are comparable for both schemes. For all mesh sizes Nx and polynomial approximation degrees N we have used 512 CPU cores of the Phase I system of the SuperMUC of the LRZ in Garching, Germany. This means that for the coarsest mesh with Nx = 8, each MPI rank has only one single element to update. The results of Table 6 clearly show that for a small number of elements per MPI rank our communication avoiding ADER-DG schemes are computationally less expensive than RKDG schemes of the same order, since RKDG requires MPI communication in each Runge-Kutta stage. We finally run this test problem on a fixed grid of 64,000 elements (Nx = 40) using fourth-order ADER-DG and RKDG schemes on an increasing number of CPUs, from 64 to 16,000. The parallel implementation is based on pure MPI and thus each CPU core corresponds to one MPI rank. The speedup graph and the parallel efficiency as measured on the Phase I system of the SuperMUC supercomputer of the LRZ in Garching, Germany, are Figure 11. Solution of the SRMHD blast wave with Bx = 0.5 at time t = 4.0, obtained with the ADER-DG P 3 scheme supplemented with the a posteriori second-order TVD subcell limiter. Top panels: rest-mass density (left), thermal pressure (center) and Lorentz factor (right). Bottom panels: magnetic pressure (left) with magnetic field lines reported, AMR grid (center) and limiter map (right) with troubled cells marked in red and regular unlimited cells marked in green.
presented in Fig. 14. It shows the better MPI scaling of the communication avoiding ADER-DG schemes compared to conventional RKDG methods.

DISCUSSION AND CONCLUSIONS
We have proposed a new high-order DG scheme for the numerical solution of the system of the GRMHD equations in the ideal-MHD limit using multiple spatial dimensions and on spacetime adaptive meshes. An important and novel aspect of our discretization is that we have made use of nonconservative products in order to account for the metric terms directly inside the Riemann solver at the element interfaces instead of considering them as purely algebraic source terms. While there is no development yet of exactly well-balanced numerical schemes for GRMHD for some relevant stationary equilibrium solutions, our approach here is motivated by the encouraging results already obtained in this respect by Parés (2006); Castro et al. (2006); , who have employed the framework of well-balanced path-conservative finitevolume and DG schemes for the successful solution of the shallowwater equations. One of the main feature of our ADER-DG scheme is its ability to reach arbitrary high order of accuracy in space and time for smooth parts of the solution, while it falls back to a robust finite-volume scheme at discontinuities such as shocks and material interfaces, without loosing the subcell resolution capabilities of the high-order DG scheme. We have validated the numerical implementation of the novel ADER-DG scheme with an a-posteriori subcell finite-volume limiter by solving the system of GRHD and GRMHD equations in the ideal-MHD limit for a number of classical benchmark tests. These tests have been performed both in 2D and in 3D with either spherical or Cartesian coordinate mappings. Furthermore, they have involved either smooth relativistic flows, for which we have been able to compute the convergence order and compare it with the expected one, or non-smooth relativistic flows, for which we have been able to compare our results with exact solutions or other reference solutions available in the literature. Overall, the benchmarks have shown a very good performance of the new scheme, exhibiting an excellent agreement between analytical and numerical solutions and that the latter converge at the expected and high order for smooth flows.
The developments presented here on the solution of the GRMHD equations is part of a long-term plan to develop a numerical infrastructure for the study of problems in relativistic astrophysics in general and to simulate the merger of binary systems of neutron stars in particular [see, e.g., Baiotti & Rezzolla (2017) for a recent review]. Indeed, another important development in this respect has been the successful development and testing of a first-order hyperbolic formulation of the Einstein equations given by the first-order reduction of the CCZ4 system (Alic et al. 2012), which was recently presented by Dumbser et al. (2018) (FO-CCZ4). These two independent but related developments naturally lead to the construction of a computational framework where the GRMHD equations are evolved together with the Einstein field equations in a fully coupled manner. This is one of the goals of the Figure 12. SRMHD Orszag-Tang vortex problem at times t = 3, t = 6, t = 7, from left to right, obtained through the ADER-DG-P 5 scheme supplemented with the a posteriori TVD subcell limiter on a 30 2 elements on the coarsest grid ( = 0), two maximum refinement levels and a refinement factor R = 3. From the top to the bottom row: 1 st ) P 5 -solution obtained on the AMR grid; 2 nd ) P 5 -solution obtained on the corresponding finer uniform grid, i.e., 270 2 space elements of the maximum refinement level max = 2; 3 rd ) AMR-grid, troubled cells (red) and unlimited cells (blue); 4 th ) divergence cleaning scalar ψ.   Table 6. Accuracy and cost comparison between ADER-DG and RKDG schemes of different orders for the GRMHD equations in three space dimensions. The test problem is the large amplitude Alfvén wave solved in the domain Ω = [0, 2π] 3 up to t = 1 on a sequence of successively refined Cartesian meshes with N 3 x elements. The errors refer to the variable By. The table also contains total wall clock times (WCT) measured in seconds and the time needed by the scheme to update one single degree of freedom on one single CPU core (TDU) measured in microseconds. All simulations have been performed in parallel on 512 MPI ranks of the SuperMUC phase I system at the LRZ in Garching, Germany. Note that for the coarsest grid with Nx = 8, each MPI rank has only one single element to update. ExaHyPE framework (Charrier & Weinzierl 2017;Köppel 2017;Charrier et al. 2018) and is part of our present and future research.
The key idea here is to use our new ADER-DG schemes to solve the GRMHD equations and the FO-CCZ4 formulation of the Einstein equations together, i.e., in a monolithically coupled way, simulating with the same numerical scheme one single evolution system for both matter and spacetime.
We also plan to carry out an extension to full general relativity of the first-order symmetric hyperbolic model of continuum mechanics recently proposed by ;  and by Dumbser et al. (2017), and which is based on the pioneering work of Godunov & Romenski (1972) on nonlinear hyperelasticity in the Newtonian limit. This new unified formulation of continuum mechanics allows one to deal with vis-cous fluids and elastic solids within one single and unified system of symmetric-hyperbolic partial differential equations and has bounded signal speeds for all involved physical processes, including dissipative effects. In addition, this mathematical development will be accompanied by a numerical one, with the implementation of a novel indicator for AMR and subcell limiting based on the definition of the numerical entropy density and relative fluxes as done, e.g., by Puppo & Semplice (2011); Semplice et al. (2016) and Cravero & Semplice (2016).  Figure 14. Strong scaling test for the 3D GRMHD equations and performance comparison between fourth-order ADER-DG and RKDG schemes (N = 3). The test case is the large amplitude Alfvén wave problem solved in 3D up to t = 1 on a uniform Cartesian mesh composed of 40 × 40 × 40 elements. The results were obtained with a pure MPI implementation on the SuperMUC phase I system at the LRZ in Garching, Germany, using 64 to 16,000 CPU cores. On 16k cores, each MPI rank has only 4 elements to update.