First Gaia dynamical model of the Milky Way disc with six phase space coordinates: a test for galaxy dynamics

We construct the ﬁrst comprehensive dynamical model for the high-quality subset of stellar kinematics of the Milky Way disc, with full 6D phase-space coordinates, provided by the Gaia Data Release 2. We adopt an axisymmetric approximation and use an updated Jeans Anisotropic Modelling (JAM) method, which allows for a generic shape and radial orientation of the velocity ellipsoid, as indicated by the Gaia data, to ﬁt the mean velocities and all three components of the intrinsic velocity dispersion tensor. The Milky Way is the ﬁrst galaxy for which all intrinsic phase space coordinates are available, and the kinematics are superior to the best integral-ﬁeld kinematics of external galaxies. This situation removes the long-standing dynamical degeneracies and makes this the ﬁrst dynamical model highly overconstrained by the kinematics. For these reasons, our ability to ﬁt the data provides a fundamental test for both galaxy dynamics and the mass distribution in the Milky Way disc. We tightly constrain the volume average total density logarithmic slope, in the radial range 3.6–12 kpc, to be α tot = − 2.149 ± 0.055 and ﬁnd that the dark halo slope must be signiﬁcantly steeper than α DM = − 1 (NFW). The dark halo shape is close to spherical and its density is ± 0.0020 M (cid:2) pc − 3 (0.437 ± 0.076 GeV cm − 3 ), in agreement with previous estimates. The circular velocity at the solar position v circ ( R (cid:2) ) = 236.5 ± 3.1 km s − 1 (including systematics) and its gently declining radial trends are also consistent with recent determinations.


I N T RO D U C T I O N
For decades astrophysicist have constructed dynamical models of external galaxies from observations of their unresolved stellar kinematics to study their masses as well as their orbital and density distributions (see review by Courteau et al. 2014). The models were initially constrained by long-slit spectroscopy (e.g. van der Marel 1991) and nowadays by integral-field spectroscopy (see review by Cappellari 2016).
When using only projected line-of-sight kinematics and the first two velocity moments, there are well-known fundamental degeneracies between the mass density and the orbital distribution, or anisotropy, for spherical galaxies. This mass-anisotropy degeneracy (Binney & Mamon 1982;Gerhard 1993) led to the development of techniques to extract the elusive shape of the stellar line-of-sight velocity distribution from the galaxy spectra (e.g . Bender 1990; E-mail: nitschai@mpia.de van der Marel & Franx 1993) together with dynamical modelling approaches that could use that information to break the degeneracy (e.g. Rix et al. 1997;van der Marel et al. 1998;Gebhardt et al. 2000;Valluri, Merritt & Emsellem 2004;Cappellari et al. 2006).
The mass-anisotropy degeneracy is less severe in axial symmetry as one can observe different views of the velocity ellipsoid along different polar angles on the sky. However, degeneracies must still be expected because the data cube is a three-dimensional observable which cannot uniquely constrain the three-dimensional distribution function in addition to the galaxy density distribution (see discussion in section 3 of Valluri et al. 2004). Moreover, the surface-brightness deprojection is also strongly degenerate, unless the galaxy is edgeon (Rybicki 1987). Dynamical degeneracies are indeed observed even with state-of-the-art dynamical models and data (Krajnović et al. 2005;de Lorenzi et al. 2009).
For external galaxies, heroic attempts were made to break the dynamical degeneracies using proper motion measurements in addition to line-of-sight kinematics (e.g. van de Ven et al. 2006;Watkins et al. 2015), but these useful proof-of-concept studies had to rely on data of relatively limited quality. Moreover, these studies are only possible for very few cases, where the full velocity information is available.
For the Milky Way there have been in the past years many surveys gathering kinematic information for stars all over the sky. For example the Geneva-Copenhagen Survey (Nordström et al. 2004), RAVE (Steinmetz et al. 2006), APOGEE (Allende Prieto et al. 2008;Majewski et al. 2017), Gaia-ESO (Gilmore et al. 2012), and many others, but the Gaia mission (Gaia Collaboration et al. 2016) is by far the largest one, measuring billions of sources. Gaia provides us with the largest sample of three-dimensional kinematic information for stars that cover a large area on the sky.
With this information, all these long-standing dynamical degeneracy issues disappear for the kinematics of the Milky Way from the Gaia mission, which are based on direct proper motion and radial velocities determinations for millions of individual stars: (i) one can obtain all six phase-space coordinates (three spatial coordinates and three velocities), making the dimension of the observable larger than that of the distribution function, thus allowing for extra parameters like the density to be uniquely constrained; (ii) one measures the true velocity moments, by direct summation, over many stars rather than having to infer them from integrated galaxy spectra; (iii) the stellar density is uniquely obtained without the need for a deprojection.
In this situation, unlike the case of every galactic dynamical model for other galaxies that has been constructed in the past half a century, dimensional arguments alone already imply that there is no guarantee that even general models will be able to fit simultaneously all the components of the kinematics unless the model assumptions are sufficiently accurate. These data sets that provide full velocity information then allow us to make a fundamental physical test, namely to verify whether a model based on the Newtonian equations of motion, 1 which was developed based on the motions on the scale of the Solar system, is able to accurately predict the average motions of the stars at the scale of our Galaxy, 10 8 times larger.
In recent years there has been good progress in dynamical modelling of the Milky Way and a summary of different dynamical methods is given by .  used the dynamical modelling based on action integrals suggested by Binney (2010Binney ( , 2012 and Binney & McMillan (2011) where they use six-dimensional dynamical fitting with three-action-based distribution functions to fit abundance selected stellar populations from SEGUE. This machinery was improved and implemented in the ROADMAPPING code (Trick, Bovy & Rix 2016), to recover the gravitational potential by fitting an orbit distribution function to stellar populations within the disc and it was applied to mock data investigating its capabilities. The local dark matter density was determined to be 0.0126 q −0.89 M pc −3 by Piffl et al. (2014), were q is the halo's axial ratio, using also an action based distribution function. They investigated the vertical mass density using kinematics from giant stars in RAVE and variations in z from the number density determined by Jurić et al. (2008), fitting a distribution function to the kinematics and computing the vertical density profile until it fits the observed profile. Kinematic models based on Gaussian and Shu distribution functions were used by Sharma et al. (2014) to constrain kinematic parameters of the Milky Way using RAVE data and full phase space data from the Geneva-Copenhagen survey. Portail et al. (2017) used the made-to-measure method to construct dynamical models of the bar region only, using kinematics from BRAVA, ARGOS, and OGLE and recovering the 1 Relativistic corrections are unimportant here. bar pattern speed, and the stellar and dark matter mass distribution in this region. Another work by Robin et al. (2017), investigated a wide solar neighbourhood field using the Besançon population synthesis model applied to RAVE and Gaia DR1 data to reproduce the velocities, constrain the thin and thick disc dynamical evolution and determining the solar motion. In Hagen & Helmi (2018) they combine TGAS and RAVE data to study the kinematics of red clump stars and derive the dark matter density in the solar neighbourhood. They apply axisymmetric Jeans equations and get ρ DM (R , 0) = 0.018 ± 0.002 M pc −3 . However, they also mention that the systematic errors are much larger and it is important to get accurate constrains on the stellar disc parameters.
Here, we attempt to construct a first axisymmetric dynamical model of the Milky Way kinematics given by Gaia DR2. We use the new spherically aligned Jeans Anisotropic Method (JAM sph ; Cappellari 2020), which allows for general axial ratios for all three components of the velocity ellipsoid and a spherical orientation, as indicated by the Gaia data (Everall et al. 2019;Hagen et al. 2019). We want to test to what accuracy a relatively simple model can capture the richness of the Gaia kinematics. We intentionally keep the model as simple as possible not to risk overfitting kinematic features that may be due to non-equilibrium or nonaxisymmetry (e.g. bar, spiral arms, and warps) rather than tracers of the gravitational potential. We additionally provide a description of the mass density distribution (and circular velocity) of the Milky Way at radii r ≈ 3.6-12 kpc.
The outline of this paper is as follows: In Section 2, we briefly present the Gaia DR2 data-set and introduce the Jeans model, including its required components in Section 3. We present the resulting JAM model and the Milky Way circular velocity curve in Section 4, before concluding in Section 5.

GAIA S T E L L A R K I N E M AT I C DATA
In 2018 April Gaia had its second data release (Gaia DR2; Gaia Collaboration 2018a), which contains the data collected during the first 22 months of its nominal mission lifetime. This gives for the first time a high-precision parallax and proper motion catalogue for ∼10 9 sources, supplemented by precise and homogeneous multiband allsky photometry and a large (a few times 10 6 ) radial velocity survey at the bright (G < 13) end (Gaia Collaboration 2018b).
Some early dynamical models used subsets of the DR2 Gaia data to infer the shape and mass of the dark matter halo Watkins et al. 2019;Wegg, Gerhard & Bieth 2019), the Galaxy's velocity curve (Eilers et al. 2019), or its non-equilibrium features (Antoja et al. 2018). However, no study has yet attempted to construct a comprehensive model of the bulk of the Gaia DR2, namely the few times 10 6 of stars with high-quality full sixdimensional phase space coordinates. This is the goal of this paper.
We use kinematics derived with Bayesian distance estimates of Schönrich, McMillan & Eyer (2019) . We assume as distance to the Galactic Centre R = 8.2 kpc (Gravity Collaboration 2019), a vertical displacement of the Sun from the mid-plane of z = 0.02 kpc (Joshi 2007) and as solar velocities in cylindrical Galactic coordinates (U , V , W ) = (−11.1, 247.4, 7.2) km s −1 (from Schönrich, Binney & Dehnen 2010; Gravity Collaboration 2019; Reid et al. 2009, respectively). The giant stars are the main contribution in Gaia at distances larger than 1 kpc from the Sun and can be measured out to large distance due to their brightness. For this reason they give the most homogeneous sub-sample over the area we want to probe. They are selected based on their absolute magnitude M G < 3.9, intrinsic colour (G BP − G RP ) 0 > 0.95 and positive parallaxes with relative uncertainty / > 5, no other quality cuts were performed. This sample contains about 1.98 × 10 6 stars and covers a volume with extreme cylindrical coordinates 3.65 < R < 12.02 kpc, −2.52 < z < 2.50 kpc, and −15 • < φ < 15 • , which is divided into (R, z) cells of 200 × 200 pc. For each cell with at least 30 stars we calculated the median velocity and the velocity dispersion. The median uncertainties over our full sub-sample are ( vr , v φ , vz ) = (2.2, 1.3, 1.4) km s −1 , while the median velocity error in each bin is always below 3 km s −1 , making the uncertainties negligible for both velocity and velocity dispersion.

Model for the Milky Way stellar luminosity density
The first component one needs to construct for a stellar dynamical model is the distribution of the tracer population, from which the kinematics was derived. Ideally, this would be extracted from the same Gaia data set we use to derive the kinematics, but this kind of model is not (yet) available.
Therefore, we use the Milky Way stellar distribution from Jurić et al. (2008) derived from the number density of main-sequence stars using data from the Sloan Digital Sky Survey (SDSS). However, we are ignoring the stellar halo component since it is quite uncertain and not as important since our data are mainly stars in the Milky Way disc rather than in the halo. The Jurić et al. (2008) photometric model was derived with data that only cover the disc plane near the solar position and include only a smaller part of the Southern sky. Hence in our work it is extrapolated to describe the whole disc area and used for giant stars rather than main-sequence stars from which it was derived. Nevertheless, it is still the best stellar distribution we have at the moment for the disc region and probably is still a good approximation for the giant stars too. In retrospect, the fact that our dynamical model, which has very little freedom, is able to fit the data well, support the idea that the adopted stellar model may be a reasonably good representation of our tracer distribution. In fact, any change in the tracer distribution directly translates into variations of the model kinematics.
The disc is decomposed into a sum of two exponential components, the thin and the thick disc. This allows for different scale lengths (L thin , L thick ) and heights (H thin , H thick ) for each component where ρ(R , 0) is the number density of stars in the solar neighbourhood, and the parameter f in equation (2) is the thick disc normalization relative to the thin disc. Our adopted values are the 'bias corrected' parameters (Jurić et al. 2008): f = 0.12; (L thin , H thin ) = (2.6, 0.3) kpc and (L thick , H thick ) = (3.6, 0.9) kpc. The parameter ρ(R , 0) was originally derived as a number density from star counts by Jurić et al. (2008), but since we want to obtain meaningful stellar mass-to-light values (M * /L) we normalize the density (Jurić et al. 2008) in such a way that the stellar luminosity density at the solar radius corresponds to the local luminosity density in V band ρ L = 0.056 L V pc −3 , which was derived by Flynn et al. (2006) using the local luminosity function and the vertical structure of the disc. We have chosen the V band since the SDSS data, used to fit the disc model, and the Gaia data are including the V band.
We further included the bulge, even though we have tested the sensitivity of our models to it and saw that it has a minimal effect on the disc area we are probing with the Gaia data and therefore we will keep it fixed for our model. Even an extreme model without a bulge would not affect our conclusions. We use a bulge model of McMillan (2017) which is an axisymmetric version of the one obtained from COBE/DIRBE photometry (Bissantz & Gerhard 2002) We adopt the values of McMillan (2017) a = 1.8, r 0 = 0.075 kpc, r cut = 2.1 kpc, and an axial ratio of q = 0.5. We normalize the bulge to our disc by ensuring it has the same bulge to disc ratio as the original model (McMillan 2017), obtaining ρ 0, b = 80.81 L V pc −3 . Within 5 kpc from the Galactic Centre the bar dominates the kinematics (Wegg, Gerhard & Portail 2015;Bovy et al. 2019) but since our data are mostly not covering this area we are ignoring the bar in our stellar model. The small fraction of stars that might belong to the bar with radii smaller than 5 kpc should not have a significant effect on our model. Previous studies showed that one can measure reliable quantities outside the bar region of barred galaxies (Lablanche et al. 2012), by symmetrizing the bar density as done here.
For use with our model, we approximate the intrinsic density of the disc + bulge stellar model with a multi-Gaussian expansion (MGE) (Emsellem, Monnet & Bacon 1994;Cappellari 2002), by fitting a synthetic two-dimensional image using the MGE fitting method (Cappellari 2002) and MGEFIT software package. 2 MGE is a general and easy way to find the potential and solve the Jeans equations without having to modify the formalism for every different adopted parametrization. Most importantly, the method is sufficiently flexible to describe the photometry of multicomponent galaxies in great detail. Cappellari (2020) in fig. 5 presents an example of the negligible ( 1 per cent) errors one can expect in the final Jeans kinematic predictions when approximating an analytic distribution with the MGE, as we do in this paper. This indicates that numerical approximation errors due to the limited terms in the expansion should be negligible with respect to systematic uncertainties in the data and model assumptions.

Model for the dark matter and gas contributions
The second component of a dynamical model is the mass density of the Galaxy, which includes not only the stellar component but also the gas distribution and the dark matter halo.
We describe the dark matter as a generalized Navarro, Frenk, and White profile (gNFW; Wyithe, Turner & Spergel 2001), with variable inner slope α DM and axial ratio q with m from equation (4) and where the scale radius, r s , is also allowed to vary between 10 and 26 kpc, consistent with values found with the prediction of the halo mass-concentration relation If α DM = −1, equation (5) is the classical Navaro, Frenk, and White (NFW) dark matter profile (Navarro et al. 1996). This onedimensional profile is fitted with Gaussians using the MGE FIT 1D procedure in the MGEFIT package (see footnote 2), and appropriately made oblate/prolate, for use with the model.
In addition, we include the gas density, even though it is quite uncertain, again from the previous Milky Way model by McMillan (2017). We include the mass density of both the H I and H 2 gas in the disc of the Milky Way (equation 4 in McMillan 2017). For simplicity of producing the MGE fit, we initially ignored the hole in the centre of this density profile, which is outside the range of our kinematics, by removing the term in the equation that describes it, so that the gas model only declines exponentially towards larger r. The gas density was modelled by creating an image which we fitted in 2-dim with the MGE FIT SECTORS routine in the MGEFIT package (Cappellari 2002) as we did for the stellar model. This gas MGE will be added to the potential density of the Milky Way but will be kept fixed to the quoted mass by McMillan (2017) during the model fit, since we just want it as an estimate of the mass contribution from the gas in the disc region.
Formally, removing the hole changes the force/potential inside the volume we are probing and may affect our results, however, we subsequently verified that the effect is negligible by also running a model including the central hole in the gas distribution. For an accurate MGE model for the gas with the hole, we had to proceed differently. We fitted equation (4) of McMillan 2017 in one dimension (R) along the equatorial plane (z = 0) using the MGE FIT 1D routine in the MGEFIT package (Cappellari 2002), while allowing for negative Gaussians (negative=True). We then described the sech 2 (0.5 × z/z d ) trend in z with its bestfitting Gaussian exp(−0.195 × z 2 /z 2 d ), which provides an excellent approximation. The resulting 2-dim MGE model for the gas was then obtained by multiplying the 1-dim R-coordinate MGE with the z-Gaussian. Using this alternative 2-dim MGE gas model with a hole, we found that the total density slope only change well within the quoted errors.

Bayesian JAM modelling
For the dynamical modelling we use a new solution of the axisymmetric Jeans equations under the assumption of a spherically aligned velocity ellipsoid (Cappellari 2020), which was shown to describing very accurately the dynamics of the Gaia data both in the outer halo (Wegg et al. 2019) and the disc region (Everall et al. 2019;Hagen et al. 2019). This JAM sph method was developed specifically with the Gaia data in mind. As an extreme test of the sensitivity of the model assumption on the orientation of the velocity ellipsoid, we also use for comparison the cylindrically aligned JAM cyl solution and the JAMPY package 3 (Cappellari 2008).
The JAM cyl method was applied to model the integral-field stellar kinematics of large numbers of external galaxies (see review in Cappellari 2016) and has been extensively tested against N-body simulations (Lablanche et al. 2012;Li et al. 2016) and in real galaxies against CO circular velocities (Leung et al. 2018), including many barred and non-perfectly-axisymmetric galaxies like the Milky Way. In both cases, it was found to recover unbiased density profiles, even more accurately than the more general Schwarzschild (Schwarzschild 1979) approach (see e.g. fig. 8 of Leung et al. 2018).
A higher accuracy of JAM cyl with respect to Schwarzschild was confirmed (see fig. 6 of by Jin et al. 2019) using the currently stateof-the-art Illustris cosmological N-body simulation (Vogelsberger et al. 2014). Given its spherical alignment, the JAM sph method should be even more accurate for the Milky Way.
JAM sph is described in detail in Cappellari (2020) and here we only summarize the key model assumptions. The Jeans equations with the velocity ellipsoid assumed to be aligned with the spherical coordinate system and the anisotropy defined as β = 1 − σ 2 θ /σ 2 r , become (e.g. Bacon, Simien & Monnet 1983, equations 1 and 2) ( These can be combined to give a linear first-order partial differential equation, for which standard textbook-like methods of solutions exist A detailed solution was given by Bacon et al. (1983) and Bacon (1985). Cappellari (2020) specialized the Jeans solution to the case where both the density and the tracer distributions are described by the MGE parametrization and described an efficient and accurate numerical implementation, which we used in this work. 4 The use of MGE allows one to spatially vary the anisotropy by assigning different anisotropies to the various Gaussian components of the MGE. JAM sph gives the solution of the Jeans equations for all three mean velocity components (v r , v θ , v φ ) and the velocity dispersion in the three directions (equations 52-54 of Cappellari 2020). We project the spherical velocities into cylindrical coordinates (v R , v φ , v z ). However, given that the model assumes a steady state, v R and v z are identically zero and do not need to be explicitly fitted. This assumption is consistent with good accuracy with the Gaia maps by Gaia Collaboration (2018b) which also show these velocities to be small for the purpose of this study (−15 v R 15 km s −1 , −10 v z 10 km s −1 ).
The formal statistical errors in the Gaia data are quite small and certainly much smaller e.g. than the actual deviations of the Milky Way kinematics (or most external galaxies) from the axisymmetry and steady-state assumptions. Moreover, the Gaia data set contains a very large number of values. In these situations, the formal statistical uncertainties become meaningless as the uncertainties become entirely dominated by systematics. This is a common issue also for the dynamical modelling of high-S/N integral-field kinematics. To approximately account for systematic errors in the data and approximations in the model assumptions we follow the approach of Section 3.2 of a previous modelling paper by van den Bosch & van de Ven (2009), as modified for Bayesian analysis in section 6.1 of Mitzkus, Cappellari & Walcher (2017). We stress the fact that the method is not statistically rigourous, as one should expect from the fact that it tries to deal with systematic and not statistical uncertainties. However, it was found to work well in practice, in a number of cases. The key idea of this approximate approach to deal with systematic uncertainties in the data is to assume that they produce a comparable contribution to the final uncertainty on the model parameters as the statistical uncertainties in the data. This implies that, instead of adopting the standard statistical confidence levels based on χ 2 (e.g. χ 2 = 1 for the 1σ uncertainty for one DOF) one should allow variations in the χ 2 of the order of its standard deviation χ 2 = √ 2(N − M) ≈ √ 2N , with N = 817 the number of data points fitted and M the number of model parameters.
In practice, we proceed as follows: First, after an initial fit, we adopt for the kinematics a constant error of = 5.7 km s −1 , for all kinematic components, to give χ 2 /DOF = 1 for the best-fitting model. To include this uncertainty in the Markov chain Monte Carlo (MCMC) sampling, the error needs to be increased in such a way that a 'misfit' with a χ 2 = √ 2N using the original errors results in a fit with χ 2 = 1 with the updated, scaled errors. Hence, we multiply the original error = 5.7 km s −1 by (2N) 0.25 , which is equivalent to changing the 1σ confidence level from χ 2 = 1 to χ 2 = √ 2N . JAM sph or JAM cyl uses as fixed input the density of the tracer population (Jurić et al. 2008) and a model for the gas density (McMillan 2017). Our standard model has nine free parameters: (i) the velocity dispersion ratios or anisotropies (σ θ /σ r 5 and σ φ /σ r 6 ) for both the flattest (q MGE < 0.2) Gaussian components (subscript 1) of the MGE and the rest (subscript 2), to allow for some of the observed clear spatial variation of the anisotropy, while keeping the model as simple as possible. Our results are only weakly sensitive to different choices for separating the Gaussians with different anisotropies (see Section 4.2); (ii) the inner logarithmic slope of the gNFW profile (α DM ); (iii) the dark matter fraction f DM within a sphere of radius R ; (iv) the mass-to-light ratio M * /L V of the stellar component in the V band; (v) the axial ratio q; and (vi) the scale radius (r s ) of the dark matter profile.
Additionally, we also consider for comparison a model where we allow each of the 18 Gaussians in the MGE model for the stellar luminosity-density distribution to have a different anisotropy σ θ /σ r and σ φ /σ r . This model has 41 free parameters.
The Bayesian modelling was performed using the ADAMET 7 package of Cappellari et al. (2013a), which implements the Adaptive Metropolis algorithm by Haario, Saksman & Tamminen (2001). This is used to estimate the posterior distribution, as in standard MCMC methods (Gelman et al. 2013), to get the confidence levels of the best-fitting parameters and to show the relations between the different parameters. We adopted constant priors on all parameters, in such a way that the probability of the model, given the data, is just proportional to the likelihood P (data|model) ∝ exp (− χ 2 /2).

JAM fit to the Gaia data
Formally, our standard model has nine free parameters, however, most of them are either directly constrained by the data or irrelevant and marginalized over. We include these 9 parameters just not to risk underestimating the model uncertainties. The four ratios between the different components of the velocity dispersion can be directly measured from the maps, while the M * /L has an almost one-toone correspondence to the dark matter fraction f DM . The halo scale length r s is totally unconstrained by the Gaia data and is degenerate with α DM : their combination simply defines the density total slope. While the halo axial ratio q is consistent with a spherical shape. This effectively leaves to the model the freedom to vary only the two parameters f DM and α DM , which describe the dark matter halo, to fit a set of four two-dimensional kinematic maps! The best-fitting standard gNFW model for the velocity maps is shown in Fig. 1. We also show for comparison the model without dark matter and the best-fitting model with a 'standard' Navarro, Frenk, and White (NFW) α DM = −1 profile (Navarro et al. 1996). Moreover, we also show the model with free anisotropy for each of the 18 Gaussians of the MGE. The median parameters from the posterior distribution for all four models are listed in Table 1 together with the 1σ uncertainties, defined as half of the intervals enclosing 68 per cent of the posterior values, marginalized over the other parameters.
In addition, in Fig. 1 we show for reference a version of the Gaia kinematic data that were symmetrized with respect to the equatorial plane and LOESS smoothed following Cappellari et al. (2015). In practice, first we use SYMMETRIZE VELFIELD 8 on our Gaia data to generate a symmetric ('axisymmetric') version with respect to z = 0 and then we use the LOESS 2D 9 routine (with frac=0.05) of Cappellari et al. (2013b), which implements the two-dimensional LOESS algorithm of Cleveland & Devlin (1988). This gives a LOESS smoothed estimate of the kinematics at the sets of coordinates (R, z) for the symmetrized data.
We use the symmetrized and smoothed Gaia data as benchmark to quantify the quality of our JAM fits in Table 1. For this, we list the quantity χ 2 JAM /χ 2 LOESS , following Cappellari et al. (2015), for each of the best-fitting models. This quantity has the advantage over the usual χ 2 that it approximates the χ 2 /DOF but is insensitive to the normalization of the kinematic uncertainties.
To partially estimate the effect on the model parameters of the non equilibrium in the kinematics of the Milky Way we also fitted two models to two independent subsets of the Gaia data extracted from two azimuthal sectors (−15 • < φ < 0 • and 0 • < φ < 15 • ) of our data. The results for the two sectors (Table 1) are consistent with each other and with the main model. As an extreme test of the sensitivity of our model to the assumptions on the orientation of the velocity ellipsoid, we also have fit a model (JAM cyl ) that assumes a cylindrically aligned velocity ellipsoid (Cappellari 2008). The JAM sph model gives a slightly better fit to the data than JAM cyl , consistently with the finding that the Milky Way velocity ellipsoid is nearly spherical aligned (Everall et al. 2019;Hagen et al. 2019). However, the difference between the parameters inferred by two, JAM sph and JAM cyl , models is minimal (Table 1) as the two solutions are not very different in the disc plane.
Three results can be inferred from Fig. 1: (i) The most striking and important is how well this simple equilibrium model is able to capture the average Gaia kinematics. Here, the main model residuals appear due to the known 15 per cent non-equilibrium and nonaxisymmetry features (Gaia Collaboration 2018b), which cannot be described by an equilibrium model. In fact, most of the model Figure 1. Data versus models. From left to right: (i) Gaia data, (ii) the best-fitting JAM sph model without dark matter, (iii) the best model with a 'standard' Navarro, Frenk, and White (NFW) dark matter profile, (iv) the best-fitting model with gNFW halo profile, with free inner logarithmic slope, and two free anisotropies (see text), (v) the best-fitting model with a different anisotropy for each MGE Gaussian, and (vi) a symmetrized and LOESS-smoothed version of the Gaia data. The latter is shown for reference and, given that it makes no other assumption than symmetry and small-scale smoothness, it essentially represents the best fit that one can expect with any axisymmetric model. Below each model there are the corresponding residuals (data − model). From top to bottom the rows show the mean azimuthal velocity (v φ ), the velocity dispersion in radial (σ R ), azimuthal (σ φ ), and vertical (σ z ) direction in Galactic cylindrical coordinates. residuals seem associated with the spiral-wave pattern visible in the face-on view of the kinematics of the Milky Way disc in fig. 10 of Gaia Collaboration (2018b). The fact that most of the residuals are due to non-axisymmetry is also clearly visible by comparing the residuals of the best-fitting model to those of the symmetrized data ( Fig. 1): the largest JAM residuals are in the V φ components, but most of those also stand out with respect to the symmetrized Gaia data. The same is true for the major deviations in the JAM fits to the other components. This good agreement is further quantified by the χ 2 JAM /χ 2 LOESS ratio in Table 1, which shows values quite close to one, especially when allowing for 'free anisotropies'. (ii) It is clear that a model without dark matter completely fails to describe the observations, (iii) moreover, a standard NFW dark matter profile does not provide an equally good fit as the best-fitting one, with our median dark halo slope α DM = −1.53 ± 0.12. In particular, an NFW halo systematically overestimates σ z and the radial gradient in v φ . The halo slope lies in the range expected from simple predictions for halo contractions (Gnedin et al. 2004) for samples of real galaxies (e.g. fig. 2 in Cappellari et al. 2013a). This is consistent with previous work indicating a steeper slope than the standard NFW is needed in the disc region of the Milky Way around the solar region (Cole & Binney 2017;Portail et al. 2017).
(1979) method one could fit every detail of the data, down to the noise. However, a better fit will not necessarily constitute an improvement in the recovered density, because there is a risk of interpreting non-equilibrium or non-axisymmetry features as tracers of an axisymmetric equilibrium model. Tests on real galaxies (Leung et al. 2018) and simulations (Jin et al. 2019) indicate that Schwarzschild models provides a less accurate recovery of the true density than JAM in this situation. Thus, we intentionally kept the standard model simple, not to risk over-interpreting features in the data that are not real or not tracer of an equilibrium potential.
Our model has a total density at the solar position of ρ tot (R ) = 0.0640 ± 0.0043 M pc −3 and a dark matter density of ρ DM (R ) = 0.0115 ± 0.0020 M pc −3 , corresponding to a dark matter energy density of 0.437 ± 0.076 GeV cm −3 where the uncertainties include systematics and are not purely statistical. These values are broadly consistent with previous determinations (McKee, Parravano & Hollenbach 2015;McMillan 2017).
The posterior distribution of the parameters for the gNFW model are shown in Fig. 2. This figure also shows the expected covariance between the mass-to-light ratio, the dark matter fraction, and the inner slope of the halo: models with steeper (more negative) α DM have lower (M * /L) V ratio and higher dark matter fraction f DM . This distribution also shows that an NFW profile (red-dashed line in Fig. 2) is outside the 3σ confidence level. In addition, as comparison the blue-shaded region indicates the stellar mass-to-light ratio from stellar counts 0.75 M /L V (15 per cent uncertainty at 1σ level from Flynn et al. 2006). This shows that our model (M * /L) V is consistent with the stellar-counts determination within the errors. Our model strongly excludes flattened haloes and favours a nearly round one, consistent with recent Gaia results (Wegg et al. 2019).

Total density profile
Having found our best-fitting model we can find the total density profile, using the MGE RADIAL DENSITY procedure in the JAMPY package (see footnote 3).
The total-density distribution is the quantity that the dynamical models directly measure. This is very tightly constrained by the Gaia data and it is shown in Fig. 3. From the posterior of the models we measure a total density mean logarithmic slope of α tot ≡ log ρ tot / log r = −2.149 ± 0.055 in the radial range 3.6-12 kpc of the data. These radii correspond to about 0.9-2.9 half-light radii R max e , for our measured 10 R max e ≈ 4.122 kpc. The measured total-density slope is consistent with the 'universal' slope α tot ≈ −2.19 ± 0.03 inferred at comparable radii on early-type disc galaxies (Cappellari et al. 2015) with effective velocity dispersion not smaller than the Milky Way's value (Kormendy & Ho 2013) σ e = 105 ± 20 km s −1 .
We also investigate how much the uncertainties of the parametrization of the Milky Way stellar tracer distribution affect our JAM sph results. For this, we vary the disc parameters of Jurić et al. (2008) within the quoted errors and then redo the JAM sph fit using the CAPFIT least-squares program. 11 This is repeated for 100 different disc models, for random values of the scale heights, scale lengths, and the fraction of the thick disc normalization, within the given 20 per cent 1σ uncertainties (Jurić et al. 2008). The result for the total density can be seen in panel (b) of Fig. 3. Here, the blue lines indicate the systematic error due to the Milky Way stellar tracer model. This gives us an alternative estimate of the value and a systematic error α tot ≡ log ρ tot / log r = −2.194 ± 0.044 and the logarithmic slope of the gNFW profile is α DM = −1.455 ± 0.055, which is comparable to the previous estimate.
During this testing of these systematic errors, we saw that higher values for the scale height and length of our disc models seem to give JAM models that agree even better with our data. We leave further exploration of this aspect to a subsequent study. Figure 2. gNFW parameters corner plot. Each panel shows the posterior probability distribution marginalized over two dimensions (contours) and one dimension (histograms). The parameters are (i) the dark matter fraction f DM inside a sphere of radius R = 8.2 kpc; (ii) the velocity dispersion ratios [(σ θ /σ r ) 1 , (σ φ /σ r ) 1 ] of the Gaussians flatter than q MGE = b/a = 0.2, and the rest [(σ θ /σ r ) 2 , (σ φ /σ r ) 2 ]; (iii) the inner dark matter halo logarithmic slope α DM ; (iv) the stellar mass-to-light ratio (M * /L) V ; (v) the axial ratio q of the dark matter halo; and (vi) the scale radius r s for the dark matter halo. The thick contours represent the 1σ , 2σ and 3σ confidence levels for one degree of freedom. The red dashed line marks the 'standard' α DM = −1 slope of the NFW (Navarro et al. 1996) dark matter profile and the blue shaded band indicates a mass-to-light ratio estimate from stellar counts (Flynn et al. 2006). The numbers with errors on top of each plot are the median (black dashed lines) and 16th and 84th percentiles of the posterior for each parameter, marginalized over the other parameters.
Our model results depend on some arbitrary choices, like the number of Gaussians used in the MGE fit, and the axial ratio q MGE at which we separate Gaussians with different anisotropies. We tested the sensitivity of the most robust quantity from the model, the total density slope, to these choices. If we force the MGE fit to have 35 Gaussians, instead of the 18 of our standard model, and perform a least-squares fit, we obtain a slope for the total density of α tot = −2.203, which agrees within the quoted errors with the leastsquares fit for our standard model which has α tot = −2.185 (this is slightly different from the median value from the model posterior quoted previously in this Section). Additionally, if we allow four possible anisotropies, separated both in radius (smaller or larger than one effective radii) and flattening at q MGE = 0.3 (instead of our standard q MGE = 0.2) we get α tot = −2.075. So even with this different choice to separate the anisotropies, the result is still consistent at the level of the quoted errors.

Circular velocity curve
Having found a model that describes our Gaia data we can also derive the circular velocity, using the MGE VCIRC procedure in the JAMPY package (see footnote 3). Density profiles obtained by randomly varying the assumed parametrization for the stellar disc (Jurić et al. 2008) within the quoted errors and re-running the whole model fitting procedure.
The circular velocity of the Milky Way disc region is plotted as a red line for the gNFW and as a dashed yellow line for the normal NFW profile in panel (a) of Fig. 4. The black points are derived with a subset (|z| < 0.5 kpc or tan (z/R) < 6 • ) from our data when using the same non-parametric method of Eilers et al. (2019), also based on the axisymmetric Jeans equations, to calculate the circular velocity and the green squares are the measurements of Eilers et al. (2019). The small offset between our values and the previous work by Eilers et al. (2019) is coming from our slightly higher value of the solar velocity because of the newest Gravity Collaboration (2019) results for the distance to the Galactic Centre. The difference between our values using the method of Eilers et al. (2019) (black dots) and our result from the JAM sph model (red line) is expected, since the data used for Eilers et al. (2019) method are only covering the equatorial plane of the Galaxy (|z| < 0.5 kpc or tan (z/R) < 6 • ) while for our model all the data reaching up to 2.5 kpc in z-direction were used. Moreover, also the different parametrization of the tracer distribution in Eilers et al. (2019) method and our makes a difference. We have investigated this further and can reproduce almost exactly their circular velocity curve if we use the same kinematics and the same tracer distribution of Pouliasis, Di Matteo & Haywood (2017) with a spherical (q DM = 1) NFW dark matter halo (Nitschai et al., in preparation). Reassuringly, the two circular velocities are consistent within the estimated quite small systematic uncertainties. The model with NFW halo would produce an inconsistent circular velocity, supporting the finding that this is excluded by the Gaia data.
The blue transparent lines are 100 random realizations for the circular velocity from the posterior distribution that indicates the formal uncertainty for our mean circular velocity but include our approximate treatment for systematics by scaling the input , where they use a quite different stellar sample, stellar distance estimates, and modelling assumptions, and with another recent work by Mróz et al. (2019) (233.6 ± 2.8 km s −1 ) in which they construct the rotation curve of the Milky Way using the proper motion and radial velocities from Gaia for classical Cepheids.
The result for the effect of the uncertainties of the parametrization of the Milky Way stellar tracer distribution on the circular velocity can be seen in panel (b) of Fig. 4. Here, the blue lines indicate the systematic error due to the Milky Way stellar tracer model. This gives us an alternative estimate of the value and systematic error v circ (R ) = 236.3 ± 3.1 km s −1 , which is comparable to the previous estimate.

C O N C L U S I O N
The model presented is the first stringent test of the Newtonian equations of motion on galactic scales. It demonstrates that we already have a remarkably accurate knowledge of the mass distribution in our Milky Way and we can concisely describe the main average characteristics of the observed stellar kinematics with a minimal set of assumptions. The model also shows that the average kinematics of the Milky Way, outside of the bar region, can be well described by an axisymmetric and equilibrium model. The fact that we can well describe the Galactic kinematics, regardless of relatively minor deviations from equilibrium and axisymmetry (Widrow et al. 2012;Antoja et al. 2018;Gaia Collaboration 2018b), is consistent with observations of external galaxies, where such deviations are widespread, but models can still capture the average kinematic properties (Cappellari et al. 2013a) and recover circular velocity profiles to 10 per cent accuracy (Leung et al. 2018), even from data of much inferior quality.
The Gaia data are going to improve in accuracy with subsequent data releases. Moreover, when not relying entirely on parallactic distances, one can significantly increase the extent of the region where kinematics can be measured (Wegg et al. 2019). Dynamical models will soon be able to study our Galaxy's density distribution over larger distances. These models are starting to provide a description of the dynamics of the Milky Way at a level that is impossible to achieve in external galaxies. This is providing a key benchmark for our knowledge of galactic dynamics that will complement much less detailed studies of much larger samples of external galaxies.
The kinematics used in this paper and the MGE components can be found as Supplementary data in the online version.

AC K N OW L E D G E M E N T S
M.S. Nitschai warmly thanks the two scholarships, Deutscher Akademischer Austauschdienst (German Academic Exchange Service) -Programm zur Steigerung der Mobilität von Studierenden deutscher Hochschulen ('DAAD-PROMOS-Stipendium') and 'Baden-Württemberg-STIPENDIUM', for the financial support during the stay abroad working on this project and the Astrophysics sub-department of the University of Oxford for the hospitality during that time. NN gratefully acknowledges support by the Deutsche Forschungsgemeinschaft (DFG, German Research Foundation) -Project-ID 138713538 -SFB 881 ('The Milky Way System', subproject B8). This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www. cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gai a/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.