Supermassive black hole mass in the massive elliptical galaxy M87 from integral-field stellar dynamics using OASIS and MUSE with adaptive optics: assessing systematic uncertainties

The massive elliptical galaxy M87 has been the subject of several supermassive black hole mass measurements from stellar dynamics, gas dynamics, and recently the black hole shadow by the Event Horizon Telescope (EHT). This uniquely positions M87 as a benchmark for alternative black hole mass determination methods. Here we use stellar kinematics extracted from integral-field spectroscopy observations with Adaptive Optics (AO) using MUSE and OASIS. We exploit our high-resolution integral field spectroscopy to spectrally decompose the central AGN from the stars. We derive an accurate inner stellar-density profile and find it is flatter than previously assumed. We also use the spectrally-extracted AGN as a reference to accurately determine the observed MUSE and OASIS AO PSF. We then perform Jeans Anisotropic Modelling (JAM), with a new flexible spatially-variable anisotropy, and measure the anisotropy profile, stellar mass-to-light variations, inner dark matter fraction, and black hole mass. Our preferred black hole mass is $M_{\rm BH}=(8.7\pm1.2 [\text{random}] \pm1.3 [\text{systematic}]) \times 10^9 \ M_\odot $. However, using the inner stellar density from previous studies, we find a preferred black hole mass of $M_{\rm BH} = (5.5^{+0.5}_{-0.3}) \times 10^9 \ M_\odot $, consistent with previous work. We find that this is the primary cause of the difference between our results and previous work, in addition to smaller contributions due to kinematics and modelling method. We conduct numerous systematic tests of the kinematics and model assumptions and conclude that uncertainties in the black hole mass of M87 from previous determinations may have been underestimated and further analyses are needed.


INTRODUCTION
Supermassive black holes play an important role in galaxy evolution.This is shown through empirical relations between black hole mass and luminosity (Kormendy & Richstone 1995;Magorrian et al. 1998), as well as black hole mass and stellar velocity dispersion (Ferrarese & Merritt 2000;Gebhardt et al. 2000b).The reliability of these relationships depends on accurate black hole mass measurements.
M87 is one of the most massive galaxies of the Virgo cluster and sits at the centre of the main sub-cluster (e.g.Cappellari et al. 2011b, fig. 7).It is a prototypical massive slow rotator early-type galaxy, with a large Sersic (Sérsic 1963) index and a core in the nuclear surface brightness profile (Kormendy et al. 2009), and fits all characteristics for having assembled most of its mass by dry mergers (see review by Cappellari 2016).Like other galaxies of its type, M87 contains a supermassive black hole at its centre (Kormendy & Ho 2013) whose sphere of influence has the largest angular size of any known black hole outside of the Milky Way, making it a valuable target for black hole studies.The measurement of the black hole shadow by the Event ★ E-mail: david.simon@physics.ox.ac.ukHorizon Telescope (EHT) (Event Horizon Telescope Collaboration et al. 2019) makes M87 the first galaxy for which direct imaging of the supermassive black hole has taken place.This has the potential to serve as a powerful test of the general theory of relativity: but only if we can confirm through independent measurements that the black hole mass recovered assuming general relativity is correct.The measurement of the black hole shadow from the EHT (Event Horizon Telescope Collaboration et al. 2019) assuming general relativity determined the black hole mass to be (6.5±0.7)×10 9 M ⊙ .In the case of M87 there are two other such classes of measurement.Gas dynamical measurements by Harms et al. (1994), Macchetto et al. (1997), and Walsh et al. (2013) have measured the masses (2.7 ± 0.8) × 10 9 M ⊙ , (3.6 ± 1.0) × 10 9 M ⊙ , and (3.3 +0.8  −0.7 ) × 10 9 M ⊙ , respectively.Stellar dynamical measurements (Gebhardt & Thomas 2009;Gebhardt et al. 2011;Liepold et al. 2023) using the orbital superposition method of Schwarzschild (Schwarzschild 1979) have been made which found a black hole masses of (6.0 ± 0.5) × 10 9 M ⊙ , (6.2 ± 0.4) × 10 9 M ⊙ , and (5.37 +0.37  −0.25 ) ×10 9 M ⊙ respectively.There is thus a discrepancy in the recovered black hole masses by a factor of two depending on the method used.Jeter et al. (2019); Jeter & Broderick (2021) propose that this may be due to unrealistic assumptions made in the gas mod-elling.They find that more detailed accounting of the radial motion of the gas as well as allowing for a thick gas disk can alleviate the discrepancy (though it should be noted that their models are not fit to data).Recently, Osorno et al. (2023) has produced detailed ionized gas maps of M87 using the same MUSE data as in this paper, revealing a complex multi-component gas structure.They suggest that the cause of the discrepancy is likely due to incorrect assumptions about the ionized gas disk inclination, though they comment that it is challenging to make an independent black hole mass measurement with this data.The agreement between stellar dynamical measurements and the measurement of the black hole shadow is reassuring, but it is still important to continue testing and independently verifying stellar dynamical models in order to fully understand possible systematics and to improve the robustness of the measurement.In this paper we derive new independent measurements of the black hole mass using stellar dynamics with two different high-resolution integral-field datasets from two different telescopes and a different dynamical modelling approach than previously used.
This paper is laid out as follows: in section 2 we introduce the integral field data and photometric data used in this study.In section 3 we describe our spectral fitting and discuss a number of tests we performed and several methods of extracting the kinematics that we use.In section 4 we use a combination of IFU data with photometry to accurately measure the stellar density profile for M87 down to the region dominated by the AGN.In section 5 and section 6 we describe the details of our Jeans modelling and present our black hole mass constraints.We compare these with previous observations and discuss a number of systematic uncertainties.Lastly, in section 7 we summarize our results and comment on the future landscape for studies of M87.
We take the distance to M87 to be 16.8 Mpc (Event Horizon Telescope Collaboration et al. 2019).All black hole masses quoted are scaled to this distance.This corresponds to a spatial scale of 81.1 pc per 1 arcsecond.

Integral Field Spectroscopy
We use integral field observations from the Optically Adaptive System for Imaging Spectroscopy (OASIS) spectrograph made on the Canada-France-Hawaii Telescope (CFHT) (McDermid et al. 2006) and the Multi Unit Spectroscopic Explorer (MUSE) (Bacon et al. 2010) in narrow field mode (NFM) on the Very Large Telescope (VLT) with adaptive optics (AO).This gives us two independent views of the central kinematics of M87.To add kinematic information at larger distances, we also use observations from SAURON Bacon et al. (2001).
The OASIS integral field spectrograph (IFS) has a 10 ′′ ×8 ′′ field of view with a 0. ′′ 27×0.′′ 27 pixel scale.For this measurement the spectrograph was configured to cover the wavelength range of 4760-5558 Å with a resolution of 5.4 Å FWHM (corresponding to an instrumental dispersion of  inst ≈ 134 km s −1 ) sampled at 1.95 Å per pixel.Three observations were made for 2700 seconds each, which were then combined to form the final image (see McDermid et al. (2006) for a detailed description of the observations and data reduction).
The MUSE observations were made as part of program 0103.B-0581 (PI: N. Nagar) in NFM.The NFM covers a field of view of 7.5"x7.5"with a pixel scale of 0.025"/pix and is used together with the the adaptive optics facility GALACSI (Arsenault et al. 2008;Ströbele et al. 2012), providing laser tomographic AO corrections.The observations were carried out on 20 February 2021 and consist of nine dithered exposures of 700s, resulting in a total exposure time of 6300s.The data were reduced with the standard MUSE data reduction pipeline (Weilbacher et al. 2020) in the ESO reflex environment (Freudling et al. 2013) using the dedicated offset sky exposures for sky subtraction with the standard pipeline parameters.The parameters for source detection and image alignment were optimized to align the individual exposures based on a combination of point sources and the knots of the jet of M87.We also tested whether the remaining sky residuals could be removed using the Zurich Atmospheric Purge (ZAP, Soto et al. (2016)) on sky residual cubes produced by the pipeline, but the improvement was not significant so we proceeded with the original cube.MUSE covers a wavelength range from 4650-9300 Å with a gap between 5780-6050 Å due to a Na notch filter blocking the light from the four laser guide stars facility (4LGSF).The spectrum is sampled at 1.25 Å per pixel with a resolution of about 2.6 Å FWHM, corresponding to an instrumental dispersion  inst ≈ 63 km s −1 .For our analysis, we restrict the spectral range to cover only 4800-5700 Å.The range greater than ∼7000 Å has significant sky residuals so we choose to exclude this range in our analysis.The remaining range to the right of the notch filter (6050 Å -7000 Å) does not have deep stellar features that provide information for stellar kinematics.Furthermore, this region has several gas emission features with multiple kinematic components.One way of dealing with this would be to mask all of the gas.However, this would conceal some kinematic features to the left of the notch filter and would leave only a few disconnected regions of unmasked spectrum to the right of the notch filter.We thus choose to proceed by omiting the region to the right of the notch filter and fit all of the gas to the left of the notch filter.We observe some slight flux calibration issues at the ends of the spectrum which we clip for our analysis.
The SAURON observation of M87 was first described in Emsellem et al. (2004) and later reanalysed as a part of Cappellari et al. (2011a).The field of view of SAURON is 33 ′′ × 41 ′′ with a pixel size of 0. ′′ 94 × 0. ′′ 94.The wavelength range covered is 4800-5380 Å at 4.2 Å spectral resolution (corresponding to an instrumental dispersion of  inst = 108 km s −1 ) sampled at 1.1 Å per pixel.Four observations were made for 1800 seconds each.The data reduction was performed with xsauron (Bacon et al. 2001).Further details are available in Emsellem et al. (2004); Cappellari et al. (2011a).

Photometry
We use HST imaging from Côté et al. (2004) (HST proposal 9401) to construct our stellar surface brightness model while carefully removing the AGN.This is an F850LP ACS/WFC observation covering a field of view of approximately 211 ′′ ×212 ′′ with a pixel scale of 0. ′′ 05.The exposure was made for 90 seconds, guaranteeing that the nucleus does not become saturated.We also use an r-band SDSS mosaic generated with the software Montage 1 to constrain the stellar surface brightness at larger radii.The SDSS image covers a spatial scale of approximately 713 ′′ ×713 ′′ with a pixel scale of 0. ′′ 396.

Spectral Fitting
We bin the galaxy spectra spatially using the Voronoi tesselation algorithm and VorBin software package2 described in Cappellari & Copin (2003).This algorithm takes the x and y coordinates of a set of data along with the assigned signal and noise and bins neighboring points to a target signal to noise ratio.We define the signal to be the median spectral flux and the noise to be the median error (this is done before logarithmically rebinning the spectra).For the OASIS data, we bin to a target signal to noise ratio of 50 per 1.95 Å spectral pixel.This leaves all of the spaxels in the innermost arcsecond unbinned, allowing for maximum spatial resolution.MUSE has a much higher spatial resolution, requiring some spatial binning in the innermost parts of the galaxy to have sufficient signal for kinematic fitting.We start by masking the innermost 0. ′′ 1 as these spectra are entirely dominated by the AGN3 .We tested binning with a target signal to noise ratio of 50,40,30,20,15, and 10 per 1.25 Å spectral pixel.The recovered black hole mass is consistent for 50, 40, 30, and 20, but increases sharply for lower signal to noise.This is due to the fact that with this level of noise, a flat fit to the spectra is allowed in the innermost regions, resulting in many central spectra having anomalously large dispersion.For the rest of this work, we use the case with a signal to noise ratio of 50 per 1.25 Å spectral pixel.For both datasets, we remove all data points for which the fraction of the flux due to stars is less than 50%.This is determined after running the fits by comparing the average flux in the Legendre polynomials to the stars (more on this later).
We logarithmically resample the spectra to a velocity scale Δ = Δ ln  of 105 and 66 km s −1 for OASIS and MUSE respectively and fit the binned spectra using the penalized Pixel Fitting method pPXF software package 4 of Cappellari & Emsellem (2004) and Cappellari (2017Cappellari ( , 2022)).This method allows for the simultaneous fitting of template stellar spectra, template gas spectra, and continuum contributions/template mismatch with the addition of additive polynomials.We can write an observed galaxy spectrum as 5 Here   represents the stellar templates,   the gas templates, L the corresponding line of sight velocity distribution, and P additive polynomials (in this case taken to be Legendre polynomials).
The strongest absorption feature in the OASIS and MUSE spectra that contributes to the fits of the stellar kinematics is due to Mgb around 5200 Å.However, in the innermost arcsecond of the galaxy this feature is contaminated by gas emission from [NI]5197,5200 (see Fig. 1, Fig. 2, and Fig. 3).When it comes to fitting the absorption feature from Mgb then, there is a degeneracy between the stellar dispersion and the gas kinematics.This is compounded by the fact that it is precisely in the innermost regions that the relative flux in each much smaller flux to the total spectrum and can be fit by additive Legendre polynomials 4 Available from https://pypi.org/project/ppxf/ 5 This ignores sky, attenuation, and multiplicative polynomials.See the full expression in eq. 13 of Cappellari (2022) 480 490 500 510 520 530 540 550 560 570 Wavelength (nm) Plot of the MUSE spectra as a function of radius.The black curve is the data and the red curve is the best pPXF fit to the stars and the orange curve is the best fit to the gas.The  = 0 ′′ spectra is the combined innermost 0. ′′ 1 of MUSE.This is not fit as we do not include it in the final analysis.
The double peaked structure of the gas appears intermittently at a variety of radii.The bottom spectra shows the single stellar template that we use to fit the stellar kinematics for all of the MUSE spectra, as well as the sum of the gas free spaxels that we use to fit for the single stellar template.The mean luminosity weighted radius of the gas free spectra is 3. ′′ 3.
spectra due to the AGN increases, further increasing the uncertainty in the kinematic extraction.In order to test how the treatment of this affects the extracted kinematics, we consider two separate scenarios.
In the first scenario we simply fit the full spectrum.This has the advantage that outside of the innermost arcsecond the kinematic extraction is very reliable with the disadvantage of the innermost arcsecond being less reliable.In the second scenario we restrict the spectra to begin at 5250 Å (these spectra are referred to as spectra to the right of [NI] or RNI spectra) so that we exclude the contaminated region.This has the benefit of avoiding any uncertainty from the fit to Mgb at the cost of further restricting the spectral range.This is done for both OASIS and MUSE.The details of the kinematic extraction for each of these scenarios is significantly different.Spectral fitting over the full wavelength range is challenging to perform due to the presence of strong gas emission lines from H, [OIII]4959,5007, and [NI]5197,5200.Furthermore, we observed multiple gas kinematic components for H and [OIII] in the inner most part of the galaxy (see Fig. 3).Additionally, the presence of the central AGN dilutes the stellar features of the central spectra making a reliable extraction of the kinematics without an assumption of the spectra shape difficult.Furthermore, since we fit the gas, there is the data and the red curve is the best pPXF fit to the stars and the orange curve is the best fit to the gas.The  = 0 ′′ spectra is the central spaxel for OASIS.This is not fit as we do not include it in the final analysis.The double peaked structure of the gas appears intermittently at a variety of radii.The bottom spectra shows the single stellar template that we use to fit the stellar kinematics for all of the OASIS spectra, as well as the sum of the gas free spaxels that we use to fit for the single stellar template.The mean luminosity weighted radius of the gas free spectra is 3. ′′ 7.
is a degeneracy between the stellar flux and the flux from the gas.In order to increase the robustness of the extracted kinematics, we allow for only one single stellar component in our fits (i.e.we fit one linear combination of template spectra to the data, as opposed to many variably weighted template spectra).We determine this single stellar template by co-adding the spectra with no gas emission lines and fitting this with stars from the MILES stellar library 6 (Sánchez- Blázquez et al. 2006;Falcón-Barroso et al. 2011).Note that one single stellar template does not mean one constant stellar population across the field of view since the Legendre polynomials allow for significant variations in the line strength of the spectral features of the stellar population (note, we know from Sarzi et al. (2018) that there are no sharp changes to the stellar population in the innermost parts of M87).We use the MILES stellar library consisting of nearly 1000 individual stellar spectra.We also tried this using SSPs from the MILES stellar library (Vazdekis et al. 2010) including synthetic spectra with alpha enhancement (Vazdekis et al. 2015)

OASIS RNI Spectrum
Figure 3.The top panels show a fit of the MILES stellar spectra to the M87 spectra at large radii without gas emissions.This is done separately for MUSE and OASIS.This fitted spectra is then used to fit the stellar kinematic component for each spaxel.The middle panels show a spectrum from the centre of M87 that has been fitted with the above stellar template, as well as with multiple gas templates.This shows strong gas emissions with a double peaked profile for H and [OIII].These must be carefully accounted for in order to produce a reliable kinematic fit.The bottom panels show a fit of the stellar template to a spectrum in the central parts of M87 where the spectra has been restricted to start at the right of [NI] (denoted RNI spectra).Here there is no gas, simplifying the fitting, but the noise in the spectrum increases.maximum possible variation in the parameters, as well as the MUSE stellar library (Ivanov et al. 2019), consisting of 35 individual stellar spectra, but we ultimately find the best fit using the MILES stars.
The choice to fit only a single stellar template as opposed to the full spectral library is very important for galaxies with an AGN.Silge et al. (2005) measured the supermassive black hole mass of Centaurus A using Gemini NIFS data, allowing for a varying stellar template and found a steeply increasing dispersion profile in the center of the galaxy and measured a black hole mass between (1.5-2.4)×10 8  ⊙ .This was a significant outlier of the  −  relation and did not agree with later measurements of the black hole mass using gas (Neumayer et al. 2007).Later, Cappellari et al. (2009) performed the same measurement with SINFONI data using a fixed single stellar template and found a black hole mass equal to 5.5 × 10 7  ⊙ .This was no longer a significant outlier of the  −  relationship and is in excellent agreement with the determination of the black hole mass from gas (Neumayer et al. 2007).The case where there is no AGN was tested in Westfall et al. (2019).There, the authors show in figure 13 that the difference in the extracted kinematics for high mass red galaxies between those using a single or varying template is at most 3 per cent.This is in contrast to the case of blue galaxies for which there is a much stronger deviation, likely due to the fact that these galaxies are undergoing star formation and thus have stronger population gradients.We also note that, while we independently determine the single stellar template for both MUSE and OASIS, the two agree quite closely (less than one percent RMS deviation over the relevant wavelength range).Using the OASIS template to fit the MUSE data and vice versa returns effectively the same kinematics as just using the OASIS template for OASIS and the MUSE template for MUSE.
The OASIS spectral resolution is much larger than that of the stars in the MILES stellar library which have an instrumental resolution of 2.51 Å FWHM corresponding to  instr ≈ 61 km s −1 at 520nm.Note that in each case the dispersion profile is consistent up to a constant scaling outside of one arcsecond.Within one arcsecond, however, there is a large deviation in the extracted profiles.This is due to the uncertainty in the kinematic extraction caused by contamination from the AGN and the degeneracy in the fits between the fit to [NI] and the Mgb absorption.The bottom panel shows the dispersion profiles used in the final analysis scaled to match the SAURON profile.Data shown in the same color are combined in the final analysis.
We account for this by degrading the resolution of the template spectra (before logarithmically rebinning) with a gaussian to a constant resolution per angstrom so that the spectral resolution for the two are the same.The spectral resolution for MUSE over the wavelength range we use is nearly the same as that of the MILES stars, so we do not apply a correction.We determine the single stellar template separately for both OASIS and MUSE.We then allow gas templates in the pPXF fits for  , [OIII] and [NI], but allow H and [OIII] to have two distinct kinematic components each.Given the number of templates (1 stellar spectrum + 2 H + 2 [OIII] + 2 [NI] spectra) and the fact that the gas could be challenging to fit due to the possibility of there being multiple local best fits, we experimented with a number of constraints, such as treating the gas components of H and [OIII] as a part of the same kinematic component by fixing their velocity and dispersion to be equal.Ultimately, we found that the most reliable fit to the stellar spectra is comes from allowing maximum freedom in the gas fit.That is, treating each gas template as having its own velocity and dispersion.This is because even slight offsets in the gas fits for central spectra result in large residuals that end up driving the stellar fit.This means having a total of six kinematic components: one for the stars, one for each set of H and [OIII], and one for [NI].
We assume that the line of sight distribution can be treated as a Gaussian without the inclusion of higher order Gauss-Hermite moments.We made this assumption because (i) Cappellari et al. (2007, sec.2) found using synthetic galaxy models that the sigma obtained from a Gaussian fit (moments=2) with pPXF provides a better approximation to the second velocity moments than computing of the second moment by integrating the LOSVD from a fit which includes higher Gauss-Hermite moments (e.g.moments=4); (ii) making this assumption one is able to accurately predict with JAM the observed  rms of hundreds of real galaxies (e.g.fig. 1 of (e.g.Cappellari et al. 2015, fig. 1); and (iii) in the specific case of the stellar kinematics of M87, previous studies have found that, within the range of our data, the lowest order Gauss-Hermite moments are either consistent with zero or small (Liepold et al. 2023, fig. D2), implying that the LOSVD is essentially Gaussian.To test this, we run pPXF for our MUSE data where we include Gauss-Hermite moments up to ℎ 4 .We find evidence for some offset in ℎ 3 (median between 0.015 and 0.018 depending on the choice of Legendre Polynomial) which suggests that there is some small template mismatch.We find that ℎ 4 is consistent with zero (see Fig. 5).This is consistent with the result of Liepold et al. (2023).This holds across the choice of Legendre Polynomials.Thus we feel confident not including the Gauss-Hermite moments in our analysis.
One issue that we faced was some of the fits returning large stellar velocities in a couple of the central spaxels of the galaxy.M87 is well known to be a slow rotator (Emsellem et al. 2007(Emsellem et al. , 2014)), so any large deviations in the velocity suggest an error in the fitting.We handle this by fixing the velocity across the field for both datasets to equal the recession velocity7 .This serves as a realistic prior that helps decrease the noise in our kinematic extraction.In order to confirm that this does not impact the extraction of the dispersion, we compared the dispersion measured before and after fixing the velocity.We found that, for both MUSE and OASIS, this did not meaningfully impact the dispersion over the field of view (less than one percent RMS deviation between the two) and the difference for the central spaxels is less than ∼2 per cent.It is worth noting that the SAURON data we use does not fix the velocity at any point over the field of view.Given that we exclude SAURON data from the central regions and we know that fixing the velocity in the center does not significantly impact the dispersion at large radii, we believe that it is consistent to do this.
Previous work has not run into this issue due to differences in the spectral range considered.Gebhardt et al. (2011) uses data covering the CO band head, which features several deep absorption structures ideal for determining stellar kinematics.Emsellem et al. (2014) uses MUSE data in wide field mode without AO which does not feature the notch filter designed to block light from laser guide stars used in AO and thus can see the absorption due to the sodium doublet around 5900 Å.Additionally, Liepold et al. (2023) uses much bluer data with spectral features such as CA II H, CA II K in addition to Mgb that add valuable kinematic information.
We calculate errors in the dispersion using wild bootstrapping (Davidson & Flachaire 2008) of the spectra residuals and repeating the pPXF fits 100 times on the bootstrapped spectra.Lastly, we perform this analysis three separate times using Legendre polynomials of degrees 4, 5, and 6.We find that the extracted dispersion in the centre most region depends on the choice of Legendre polynomial (see Fig. 4).This is likely due to the degeneracy between [NI] and Mgb absorption.Different degrees of Legendre polynomial, especially those with very high degree given the total wavelength range, can go beyond accounting for template mismatch and can start reproducing parts of the stellar spectrum.
Kinematic extraction redwards of [NI] is done using the same single stellar template as in the case for the full spectra, no gas templates, and additive Legendre polynomials of degree 1, 2, and 3.As in the previous case, we also fix the velocity of each spectra to the recession velocity, which we determined as the median over the field of a free fit.A dispersion map of the extracted kinematics in the case where we fit the full spectrum with additive Legendre polynomials of degree 4 is shown in Fig. 6.The map appears symmetric, as we expect since the core of M87 is highly spherical.As a result, we plot the dispersion profile for the remaining scenarios as a function of radius in Fig. 4.
The supermassive black hole in M87 has the largest angular size from Earth of any known black hole outside of the Milky Way.We define the sphere of influence  BH as i.e. the radius such that the black hole mass equals the mass in stars.Assuming the range of black hole masses and / values determined in this work, we determine the  BH is between ∼5 and 6 ′′ and possibly even larger.Thus the black hole sphere of influence dominates much of the field of view for both OASIS and MUSE.This makes measuring parameters such as the stellar mass to light ratio challenging as, within this field of view, there is a large degeneracy between the black hole mass and the stellar mass.We can break this degeneracy by adding larger field kinematic data that is more sensitive to the mass of the stars.We do this by including SAURON data (Emsellem et al. 2004) for M87 as reanalysed for the ATLAS 3D project8 by Cappellari et al. (2011a).Here the spaxels were binned to a target signal to noise ratio of 40 (as opposed to 60 in the original SAURON reduction).The gas was masked during the fits and the degree of additive Legendre polynomials was set to 4. Similar to this work, the stellar kinematics were fit using the MILES stellar library with a single optimal linear combination of templates.
When considering large field data, it is important to be conscious of the fact that any subsequent parameter studies will be heavily influenced by information provided at larger radii because there is a greater volume of data points at large radius than at small radius.To account for this we only consider SAURON data out to a radius of 15 ′′ .To avoid accounting for uncertainties in the SAURON PSF, we do not include any SAURON data within the innermost 2 ′′ of the galaxy.We observe an offset between our extracted dispersions and the SAURON dispersion.This is expected due to systematic uncertainties in the kinematic extraction, especially due to template mismatch.To account for this, we apply a multiplicative scale to the MUSE/OASIS dispersions so that they match the SAURON data between 2 ′′ and 4 ′′ .Scaling the velocity axis by a factor Υ is equivalent to changing the overall mass normalization of the model by a factor √ Υ.For MUSE, the factor required to scale the dispersion to match the SAURON data varies between 0.95 and 1 (given the choice of Legendre polynomial).For OASIS, the range is between 0.87 and 0.9.If one wanted to scale each data set to either the OASIS or MUSE data, this would correspond to an increase in the black hole mass of between ∼ 5 − 6 per cent for OASIS and up to ∼ 2 − 3 per cent for MUSE.
Given the large number of dispersion profiles generated, we have to make a choice of which ones to study.We rule out using the degree 2 and 3 RNI spectra as we expect that using such a high degree of Legendre polynomial over such a small wavelength range will lead to an unreliable kinematic extraction (indeed, for both MUSE and OASIS, we see the dispersion profile either completely flattening out or decreasing in the centre).For the degree 4,5,6 polynomials note that the degree 5 polynomial for MUSE closely resembles the degree 6, and for OASIS the degree 5 dispersion profile closely resembles the degree 4 dispersion profile.As such, we choose to throw out the degree 5 profile from each data set and are left with 6 dispersion profiles as shown in the bottom panel of Fig. 4. In the final analysis, we combine different data sets in order to draw a reliable result.We choose the combinations: (i) SAURON + OASIS degree 6 + MUSE degree 4 (ii) SAURON + OASIS degree 4 + MUSE degree 6 (iii) SAURON + OASIS RNI degree 1 + MUSE RNI degree 1 In the latter case we refer to the kinematics as the RNI spectra.

PSF from AGN Spectrum
The PSF from IFS data has often been determined by comparing convolved HST photometry to the flux from the IFS cube (e.g.Mc-Dermid et al. 2006;Krajnović et al. 2018, fig.B1).This method is most reliable when the centre of the galaxy has a cusp since the observed profile will be more strongly affected by the PSF.Since the nuclear region of M87 has a core, we expect this method to produce a less reliable PSF.However, we can circumvent this by noting that M87 has a bright AGN that can be assumed to be unresolved and can be used as a point source to infer the PSF directly.The profile of the PSF can thus be measured if we can extract out the component of each spectra due to the AGN.This can be done using the additive polynomials in our spectral fitting.The AGN likely has a flat nonthermal continuum that can be well approximated by polynomials while the underlying galaxy does not.
We extract the shape of the PSF by performing a pPXF fit to the unbinned central spaxels in each dataset over the largest possible wavelength range (4800-5700Å for MUSE and for 4760-5558Å OASIS) using the fixed stellar template determined before from the gas-free and AGN-free spectra with multiple gas components and additive Legendre polynomials.We also tried restricting the wavelength range to that of the RNI data but found that we are unable to obtain reliable results due to the spectral range lacking distinct features in the center to anchor the total weight of the stars compared to the AGN.Additionally, we experimented with fixing the kinematics though we found that this did not impact the results.We allow the degree of the additive Legendre polynomials to vary from 1 to 6.The FWHM of the PSF varies between 0. ′′ 042 to 0. ′′ 061 for MUSE.For OASIS this range is 0. ′′ 527 to 0. ′′ 586.In this case we do not mask any of the central spaxels.We also test the effects of masking the jet but find only small diferences to the extracted FWHM.We adopt the PSF in the degree 4 case for both the MUSE and OASIS data.
It is worth noting that, in the case of MUSE, we do not expect the uncertainty in the PSF to impact the our final results since we mask an inner region that is about the same size as the PSF.In Fig. 7, we show the results for MUSE and OASIS using degree 4 Legendre polynomials, as well as a plot of the one dimensional profile as a function of radius to the left or right of the origin.We parametrize the PSF using a multi-gaussian expansion in the form of with  the radius,   the standard deviation of gaussian , and   the normalization of each gaussian satisfying  =1   = 1.We model the observed AGN profile by integrating the PSF over the lenslet size of OASIS/MUSE.We do this by first noting that a PSF convolved observable  obs (, ) can be written as (e.g.Qian et al. 1995, appendix. D) Note that  is the analytic expression of a gaussian integrated over a lenslet of size   by   centred on the point (, ).As our observable is unresolved, we can treat it as a delta function.Substituting that into Equation 4 gives that the model of the PSF is simply  (, ), with the lenslet size substituted for that of MUSE or OASIS.
We then fit the PSF parameters by matching this model to the observed spectrally determined AGN by employing the optimize.least_squaresfunction of SciPy (Virtanen et al. 2020).We assume the errors are constant except in the centre, where we set them to be small so as to force a good fit at both large and small radii.We use the PSF extracted with degree 4 Legendre polynomials for the rest of the analysis.The measured parameters for each PSF is given in Table 1.The FWHM for OASIS is 0. ′′ 561, which is consistent with the seeing on the night of the observation (McDermid et al. 2006).The FWHM for MUSE is 0. ′′ 049, which is close to diffraction limited.The top panel shows a plot of the spectroscopic data of the AGN with the best fit multi-gaussian expansion of the PSF after being integrated over the MUSE/OASIS lenslet.This is presented as a one dimensional function where the x-axis is the radius of each data point, with those lying to the left of the origin having  < 0 and those lying to the right having  > 0. The colormap shows the 2D image of the spectrally extracted AGN on a log scale.The bottom panel shows this for OASIS.These results show the reliability of using the AGN contribution to the spectra as measurement of the PSF.

Stellar Tracer Distribution
Modelling the stellar tracer distribution of M87 is challenging due to non-stellar contributions in the photometric data, namely the AGN and jet.It is sufficient to mask the jet, but masking the AGN would mean masking the location where the supermassive black hole is.This is the most important region to model for black hole studies, implying that we must take another approach.The way that we circumvent this is by using the information from our spectral fits.Spectral fitting determines the contribution of stars, gas, and the AGN (through additive polynomials) such that one can extract out the pure stellar contribution.
To do this, we measure the flux due to the stars in the pPXF fit to the MUSE data from section 3.2.We do this by subtracting from each spectra the best fit gas lines and additive Legendre polynomials (which approximate the AGN spectrum).We test this using Legendre polynomial degrees between 1 and 6.The main difficulty is in the central spaxels where some of the fits of the stellar spectrum become so diluted that they are completely degenerate with the Legendre polynomials.In order to test this, we ran one set of fits for each polynomial degree and with the kinematics fixed to the largest value in the bottom panel of Fig. 4, as well as the smallest.We show the results of this in Fig. 8.Here you can see that, while some of the choices of polynomial degree lead to decreasing stellar densities in the center of the galaxy, they approach an upper bound which we take to be the true stellar density.We ignore those profiles with decreasing inner stellar density as this behavior is not observed in other core galaxies either observationally or in simulations.We fit the profile over the region of the MUSE data using a double power law and match it to the radial surface brightness from Hubble outside of the innermost 0.5 ′′ .From this we create a modified Hubble image which has the innermost arcsecond replaced with our fitted AGN-free profile.
We parametrize the galaxy surface brightness using the Multi-Gaussian Expansion method (Emsellem et al. 1994;Cappellari 2002).In order to model the full extent of the galaxy, we match an SDSS r-band mosiac of M87 to the Hubble image and fit them simultaneously using the robust MGE fitting algorithm and the Mg-eFit 9 software package of Cappellari (2002).This algorithm fits the projected surface brightness using a multi-gaussian expansion of the form We mask the jet and gap between the detectors in the HST image, and we mask a prominent star in the SDSS image.We also provide MgeFit with the Hubble ACS/WFC PSF in order to obtain the PSF-deconvolved stellar distribution as opposed to the observed distribution.We generate this Hubble PSF using the tool TinyTim (Krist et al. 2010).We record our MGE expansion for this in Table 1.
The MGE/galaxy contours are shown in Fig. 9 and the parameters we fit are shown in Table 2.We express the MGE in the AB photometric system and the F850LP band of HST where we have used the absolute magnitude of the sun  ⊙,F850LP =4.50 mag from Willmer (2018) and the galactic extinction  = 0.029 mag from Schlafly & Finkbeiner (2011).The zero point at the time of the observation was   = 24.873.Our spectral-decomposition approach allows for a much more reliable measurement of the stellar surface brightness near the black hole than was possible before.The most recent stellar dynamical determinations of the black hole mass in M87 (Gebhardt & Thomas 2009;Gebhardt et al. 2011;Liepold et al. 2023) have used the surface density profile provided in Kormendy et al. (2009).This work determines the surface density profile of M87 by combining observations across a number of different distance scales and photometric bands.The data determining the central profile of M87 comes from Lauer et al. (1992), in which the authors fit a power law starlight model, central nonthermal point source, and optical counterparts for the jet knots in order to determine the shape of the stellar distribution.Kormendy et al. (2009) notes that there are some gradients and discontinuities in the ellipticity and principal axis inside the core of M87 which may be due to issues with the AGN treatment in Lauer et al. (1992).We find a much flatter core in the innermost region which, holding everything else constant, will result in a larger black hole mass.This is discussed in more detail later.

Mass Modelling with M/L Gradients
In order to perform dynamical modelling, one has to parametrize the gravitational potential.Sarzi et al. (2018) used stellar population models to measure gradients in the stellar  * / in M87 by allowing for both ages, metallicity and stellar initial mass function (IMF) variations.Although these measurements are assumption dependent and quite uncertain, they provide an estimate for a possible stellar  * / variation within the innermost 30-40 ′′ of M87, implying that it is not sufficient to assume that mass follows light without testing the alternative.This is further corroborated by Oldham & Auger (2018) which also finds evidence for a radially decreasing  * /.We allow for variations away from mass follows light in this region by allowing for a radially dependent  * / profile given by 9 Available from https://pypi.org/project/mgefit/This functional form is motivated by the fact that, within the error bars, figure 11 of Sarzi et al. (2018) is well fit by this function.The observed nearly linear variation in  * / with lg() cannot represent the true  * / variation since it is unbounded at 0 and infinity, so we set it to be constant outside of the regions constrained by the data.We implement this within the MGE formalism in the following way: we evaluate ( * /)() at each   (where   is the standard deviation of each gaussian in the MGE) and multiply the surface brightness  850 ,  by this value.In principle, ( * /) 1 and ( * /) 2 are free parameters, but we can determine possible upper and lower bounds by studying the effect of varying the IMF.Sarzi et al. (2018) finds that the largest possible M/L ratio assuming a Kroupa IMF is close to 5.0 in the r-band.From figure 2 of Cappellari et al. (2012), we see that, empirically in the galaxy population, the largest  * / increase one can expect due to the IMF normalization is a factor 2.6 heavier than the value corresponding to a Kroupa IMF.Taking the r-band  * / with Kroupa IMF as reference, the heaviest  * / one can realistically expect for M87, allowing for extreme IMF gradients, correspond to  * /=13 in the r-band.Lastly, we can convert this to the SDSS z-band (which we take to be approximately the same as the ACS/WFC F850LP band) using the conversion formula Using M87 colors  = 9.92 and  = 10.70 from SDSS DR7, and solar magnitudes  ⊙ = 4.50 and  ⊙ = 4.65 from Willmer (2018), we find that the upper bound is ( * /)  < ∼ 7.23.This becomes important later in the analysis where we introduce this cut off when our modelling would otherwise prefer an unphysically large  * / ratio.
This choice of parametrization restricts the steepest possible  * / variation.One could further increase the freedom of this parametrization by allowing the innermost radius at which the  * / ratio becomes constant to vary.We discuss the impact of this in subsection 6.2.

Dark Matter
Several studies of the dark matter halo of M87 have been previously made using globular clusters Murphy et al. (2011); Agnello et al. (2014); Oldham & Auger (2016); Li et al. (2020).Oldham & Auger (2016) finds a preference for a somewhat cored halo, whereas Murphy et al. (2011); Li et al. (2020) does not find a preference for either a cored profile or a cuspy halo (though Murphy et al. (2011) has a slight preference for a cored halo).Agnello et al. (2014) finds a preference for a very steep cusp, but this may be due to choices in their modelling (See Oldham & Auger (2016) for an extensive discussion).Our data only covers up to 15 ′′ , which is about a fifth of a half light radius, so we cannot make any reliable inference about the dark matter contribution.We include dark matter and vary / in our study to account for the possible range of shapes of the total density profiles, which leads to a more conservative estimate of the black hole mass.As such, we proceed with a NFW dark matter halo (Navarro et al. 1997) The range of our data is much less than   so we set it arbitrarily to 20kpc because its precise value is irrelevant for the modelling results.We parametrize the overall magnitude as the fraction of matter within a sphere of one half light radius consisting of dark matter,  dm .We calculate the enclosed masses from the MGE analytically  using the routine mge_radial_mass from Jampy 10 .We calculate the half light radius from the MGE of Table 2 using the routine mge_half_light_radius in Jampy and find a value of 70 ′′ .

Jeans Modelling
The Jeans Anisotropic Modelling (JAM) method (Cappellari 2008(Cappellari , 2020) ) has been used to model the stellar dynamics of galaxies and study their stellar mass-to-light ratios and dark matter content in large integral-field spectroscopic surveys such as ATLAS 3D (Cappellari et al. 2013), SAMI (Scott et al. 2015), MaNGA (Li et al. 2018a) 10 Available from https://pypi.org/project/jampy/We mask a prominent star at the top of the field.You can see that the MGE fit covers the full shape of the galaxy and provides an excellent fit in the innermost region where our data is the most sensitive.The galaxy is oriented so that north is at the top and east is to the left.
as well as surveys at high redshift e.g. for the LEGA-C survey (van Houdt et al. 2021).It was applied to the study of galaxies' total density profiles out to large radii (Cappellari et al. 2015) and in several studies of smaller galaxy samples.More recently, JAM was employed to accurately predict Gaia kinematics of the Milky Way using all sixdimensional components of the stellar phase space (Nitschai et al. 2020(Nitschai et al. , 2021)).
Tests of JAM using high-resolution N-body simulations (Lablanche et al. 2012) and lower-resolution but more extensive cosmological hydrodynamical simulations (Li et al. 2016) have shown that, with high-S/N data, JAM recovers accurate total density pro-files with negligible bias.More recently, JAM was compared in detail against the Schwarzschild (1979) method using samples of both observed galaxies, with circular velocities from interferometric observations of the CO gas, and numerical simulations respectively.Both studies consistently found that the JAM method produces even more accurate (smaller scatter vs the true values) density profiles (Leung et al. 2018, fig. 8) and enclosed masses (Jin et al. 2019, fig. 4) than the more general Schwarzschild models.More quantitatively, between 0.8-1.6 effective radii, where the gas is well-resolved and the  c is better determined, Leung et al. (2018) reports a mean 1  error 1.7× smaller for JAM over the equivalent Schwarzschild model.Similarly, when considering all 45 model fits to the N-body simulations by Jin et al. (2019), the 68th percentile deviation (1  error) is 1.6× smaller for JAM than the equivalent Schwarzschild model.The increased accuracy of JAM in extracting density distributions may be due to JAM assumptions acting as an empirically-motivated prior and reducing the degeneracies of the dynamical inversion.
For supermassive black hole studies, JAM was found to accurately recover the "known" mass of the two most accurate benchmark black holes in NGC4258 (Drehmer et al. 2015) and the Milky Way (Feldmeier-Krause et al. 2017, sec.4.1.2).Moreover, extensive tests of a few tens of galaxies have found that JAM and Schwarzschild methods recover black hole masses that are generally consistent with one another (Cappellari et al. 2010;Seth et al. 2014;Thater et al. 2017Thater et al. , 2022;;Krajnović et al. 2018).However, not all BH measurements from different methods agree within the uncertainties and further comparisons between different approaches are still needed.
There are two implementations of JAM: one where the velocity ellipsoid is assumed to be aligned with the cylindrical-polar coordinate system (Cappellari 2008), and one where the velocity ellipsoid is assumed to be aligned with the spherical-polar coordinate system (Cappellari 2020).The choice of which implementation to use depends on the galaxy's intrinsic shape.M87 is a slow rotator early type galaxy (Emsellem et al. 2011) and slow rotators as a class are weakly triaxial, or nearly spherical, inside the half-light radius, becoming more triaxial at larger radii (see review by Cappellari (2016)).M87 has a specific angular momentum  R e ≈ 0 within the uncertainties (Emsellem et al. 2011).It showed some barely detectable misaligned stellar rotation from SAURON data (Emsellem et al. 2004), which became more clearly visible from high-S/N MUSE data (Emsellem et al. 2014).The ellipticity of M87 increases at large radii (see Fig. 9), where misaligned stellar rotation and triaxiality become evident (Liepold et al. 2023).Within the slow-rotators class, M87 was classified as a non-rotator (Krajnović et al. 2011).As a class, these are massive early type galaxies generally found at the centre of clusters, which tend to be rounder than  < ∼ 0.15 in projection (Emsellem et al. 2011, fig.6), indicating that, although triaxial, they must be intrinsically close to spherical with a ratio between the minor to major axes of the ellipsoidal density / > ∼ 0.85.This excludes the possibility that M87, which is nearly round in projection, may appear as such due to a special viewing angle.Instead, M87 must be intrinsically close to spherical in the region where it shows circular isophotes.JAM models, unlike the Schwarzschild models, do not require large-radii kinematics to constrain the models because they are nearly insensitive to the mass distribution at radii outside the region where kinematics are available.For this reason, one can expect axisymmetric JAM models with the velocity ellipsoid aligned with sphericalpolar coordinates, to provide an accurate description of the inner dynamics of M87, even though the galaxy, like all slow rotators, becomes more strongly triaxial at much larger radii than those we model.

Priors on the Anisotropy
JAM takes as parameters the galaxy inclination, anisotropy profile, stellar tracer distribution, and mass distribution, including the black hole mass.The kinematic axis is poorly constrained due to M87 being dominated by unordered motion in the central region, (Emsellem et al. 2014;Sarzi et al. 2018) so we fix the kinematic axis to align with the galaxy photometric axis as given in Krajnović et al. (2011).The value reported there is 151.3 degrees east or north, which is consistent with our value of 156.7 degrees east of north.The stellar tracer and mass distributions are described in subsection 4.1.As the inner regions of M87 are nearly spherically symmetric with little ordered motion, we fix the inclination to 90 degrees.Changing this has a negligible effect on the measured black hole masses, because a nearly spherical model appears spherical from any inclination.We can exclude M87 being an intrinsically flat but nearly face-on disk, because of the general shape distribution of slow rotators (Cappellari 2016;Li et al. 2018b).
The last thing to specify is the anisotropy profile.It is well known that for a spherical system, there is a degeneracy between the anisotropy and the density profile.This so-called mass-anisotropy degeneracy implies that for a range of assumed density profiles, one can adopt a corresponding anisotropy profile in such a way that the model reproduces the same profile of second velocity moments (Binney & Mamon 1982;Gerhard 1993).The degeneracy, however, is not complete, and the range of allowed profiles depends on the specific situation because the anisotropy is limited by the two extreme cases where the orbital distribution is fully radial or fully tangential respectively.For this reason, without further assumptions, one would generally expect large uncertainties in BH masses from spherical models based on the Jeans equations.
The situation, however, has improved dramatically from the days when the mass-anisotropy degeneracy was first discovered.Since then, many studies have modelled the inner dynamics of galaxies using general models that allow one to account for the full shape of the line-of-sight velocity distribution, rather than the moments alone.We think we now even have a good understanding of the underlying physics of the orbital distributions we have measured.In particular, we have found that massive slow-rotator galaxies with a core in their surface brightness profile, like M87, are consistently characterized by a nearly isotropic, or just slightly radially anisotropic orbital distribution outside the break radius, while orbits start becoming tangentially biased inside that radius, reaching the peak tangential anisotropy well inside the BH sphere of influence (Gebhardt et al. 2003 fig.10;Cappellari et al. 2008 fig.2;Thomas et al. 2014).The observations are quantitatively well reproduced by models in which both the cores in the surface brightness and the tangentially biased orbits are due to gas-free mergers of galaxies with supermassive black holes in their centres.The black holes sink towards the centre of the gravitational potential via dynamical friction, while ejecting stars on radial orbits (e.g.Milosavljević & Merritt 2001;Milosavljević et al. 2002;Rantala et al. 2019;Frigo et al. 2021).
In the case of M87, due to its very flat inner core, one can place constraints on its orbital anisotropy even from theoretical arguments alone.The cusp-slope vs central anisotropy theorem by An & Evans (2006) states that for a spherical power-law tracer population  ∝  − in a Keplerian potential, there is a relation between the anisotropy  = 1 −  2  / 2  with   and   the tangential and radial dispersion, respectively, and the logarithmic slope  of the tracer, such that  <  − 1/2.The inner slope of M87 varies from nearly flat in the centre ( ≈ 0 see Fig. 8) to  ≈ 0.27 between 1-5 ′′ (Lauer et al. 2007).We can thus conservatively conclude that the inner anisotropy has an upper limit of   /  < ∼ 0.9 and possibly even   /  < ∼ 0.8 for  = 0.The assumptions of the theorem are satisfied well inside the sphere of influence of the BH of M87 and for this reason the theorem provides additional support for the expected significant tangential anisotropy near the BH of M87.
For all these theoretical and empirical reasons, nowadays, it does not make sense to assume complete freedom in the orbital anisotropy of Jeans models as done in the past.Instead, the knowledge we accumulated on the galaxies anisotropy can be used as a Bayesian prior, which is easy to enforce to our JAM models.The ability to place priors is an important feature of stellar dynamical codes.All Schwarzschild codes use regularization to enforce smoothness in the orbital distribution (e.g.Richstone & Tremaine (1988); van der Marel et al. (1998); Gebhardt et al. (2000a); Cappellari et al. (2002); Valluri et al. (2004); Thomas et al. (2005)).A recent study suggested that the fine tuning of regularization is essential for accurate results (Neureiter et al. 2023).This regularisation is mathematically equivalent to a prior.
In the next section we describe a new way of specifying the anisotropy variations in JAM models, which is ideally suited to enforce anisotropy priors.

Fitting a given anisotropy profile with JAM
Let's consider for simplicity a spherical non-rotating JAM model.The intrinsic stellar dispersion of the model is given by the luminosity-weighted sum of the dispersion of the individual Gaussians making up the MGE Where  is the deprojected MGE oblate axisymmetric luminous density (see equation 13 of Cappellari (2008)).Fig. 10 shows the contribution of the individual [ 2  ]  for different anisotropies, for a Hernquist (1990) model with mass  * = 1011 M ⊙ containing a typical nuclear supermassive black hole of mass 0.5% that of the stellar mass (Kormendy & Ho 2013, eq. 11).When the Gaussians in a JAM model are isotropic or have tangential anisotropy, each Gaussian essentially contributes to the total  2  only near a radius close to its dispersion  ≈   (Fig. 10a).In these cases, one can construct a desired anisotropy profile () by simply assigning the anisotropy   = (  ) to the Gaussians with dispersion   , as pointed out in Cappellari (2008, sec. 3.2.2).Fig. 10 shows that this approximation works quite well in general.However, when the Gaussians are significantly tangential anisotropic, or near the central supermassive black hole, the total  2  rises steeply at small radii and a single Gaussian does not contribute to the total JAM model only around  ≈   (Fig. 10c,d).In those situations, there is no simple precise relation between the anisotropy of a given Gaussian and the total anisotropy of the JAM model at  ≈   .This is generally not a problem, if one is not interested in the fitted anisotropy.However, if one wants to quantitatively reproduce a specific total anisotropy profile one has to numerically fit for the anisotropies   of the different Gaussians.
For  = 1 it specializes to the homographic anisotropy function used by Bacon (1985).
In the top panel of Fig. 11 we show how one can reproduce our logistic anisotropy profile with JAM.In the figure, we adopted a rather extreme anisotropy variation, with inner tangential anisotropy   /  = 1/2 ( 0 = −3), outer radial anisotropy   /  = 2 ( ∞ = 0.75), anisotropy radius   = /2, where  is the break radius of the Hernquist (1990) profile, and the sharpness of the transition is  = 1.5.One can see that by fine-tuning the anisotropy of the different Gaussians, one can reproduce quite general anisotropy variations.The problem with this approach is that the anisotropy   of the individual Gaussians has to be fitted non-linearly in a least-square sense, while repeatedly computing the intrinsic velocity moments of the model, to obtain the total anisotropy for a given choice of   parameters.
An alternative is to solve the original Jeans equations for an axisymmetric model with spherically-aligned velocity ellipsoid, in Cappellari (2020, eq. 8) by relaxing the assumption of a constant anisotropy per Gaussian.This is only possible analytically for special choices of the anisotropy function (, ).We found that, adopting the parametrization of Equation 11for the anisotropy, the solution of Cappellari (2020, eq. 10), using the same notations, generalizes to In the special case  = 1, this solution reduces the one using homographic functions in Bacon (1985, eq. 3) 11 .The application of a generic varying anisotropy function to the axisymmetric cylindrically-aligned Jeans solution of Cappellari (2008, eqs. 8, 9), is straightforward if one make the cylindrical anisotropy a function   (||) of the modulus of the cylindrical coordinate .One simply has to replace the constant axial anisotropy   , with the corresponding varying expressions   () parametrized by the function of Equation 11.In the case of the tangential anisotropy , the situation is identical regardless of the alignment of the velocity ellipsoid, and one can just replace the constant with an arbitrary function of the coordinates (, ), without having to change anything else.
When applying the axisymmetric solution to a model with both the tracer distribution and the total density described by an MGE (Emsellem et al. 1994;Cappellari 2002), following the steps outlined in Cappellari (2020, sec.5.1), the resulting solution requires only minimal changes and the numerical algorithm can be left unchanged.One only needs to replace the following two expressions with the corresponding one indicated by the arrows, in all the expressions of Cappellari (2020, eqs. 46-53) In the spherical limit, the expression for the intrinsic radial velocity dispersion in Cappellari (2020, eq. B2) generalizes to Unlike the constant-anisotropy case, when projecting this model along the line-of-sight one cannot remove one of the two resulting integrals, except for some special cases (Mamon & Łokas 2005), which are not very useful for practical applications.
We implemented these changes in v7.0 of the Python JamPy package 12 , which now allows one to compute axisymmetric or spherical models with the logistic radial anisotropy variation, for both the spherically-aligned and cylindrically-aligned solutions.An application in the spherical limit is shown in the middle panel of Fig. 11.As expected the JAM model with the same variable-anisotropy for all Gaussians produces the same dispersion profile as the model with constant anisotropy for each individual Gaussian, as they both follow by design the same given anisotropy profile.

MCMC Analysis
For each of the models and dispersion profiles we perform an MCMC analysis to carefully assess the influence of the modelling and systematic effects of the kinematic extraction on the recovered BH mass.We define the  2 to be where  ℓ  is the extracted dispersion in binned spaxel ℓ,  ℓ  is the model dispersion, and Δ ℓ is the bootstrapped uncertainty in the extracted dispersion.We thus express the  2 for the combined data sets as From here, we perform an MCMC analysis using the code emcee of Foreman-Mackey et al. (2013).For each of the three combinations of different kinematic extractions described in subsection 3.1, we test four models, for a total of 12 different combinations of data and models.In each model we adopt as free parameters the four anisotropy parameters ( 0 ,  ∞ ,   , ) described in subsection 5.2, the black hole mass  BH , and make the following assumptions: • Constant stellar / without NFW dark matter.This adds an extra parameter (/) tot for a total of 6 model parameters.
• Constant stellar / with NFW dark matter.This add two extra free parameters, the stellar  * / and the normalization of the halo, quantified by the dark matter fraction within one effective radius  dm (<  e ), for a total of 7 model parameters.
• Varying M/L with NFW dark matter.This combines both a NFW dark matter halo, parametrized by the dark matter fraction  dm , along with the parameters (/) 1 and (/) 2 which parametrize the mass-to-light variation, for a total of 8 parameters.
These varying assumptions allow us to make contact with previous work which have made various assumptions on the parametrization of the gravitational potential, and also allow us to test the impact of each model assumption on the final result.
In order to best sample the space of possible parameters, we re- The center panel shows the same thing using the analytic implementation described in equations 12-14.The bottom panel shows the intrinsic anisotropy ratio and confirms that the result using the two methods is the same, the key difference being that the top one requires fitting the anisotropy for each gaussian while the center one requires a small modification to the analytic solution.
express the anisotropy parameters in terms that result in a more efficient and uniform sampling of the model posterior.Namely, we sample the anisotropy parameters defined such that Given what we know about the anisotropy profile in M87, we restrict these parameters to the ranges 0.5 ≤ (  /  ) 0 ≤ 1 and 1 ≤ (  /  ) ∞ ≤ 1.3.The bound at 1 comes from the enforced condition that the anisotropy becomes tangential in the center and radial at large radii.The bounds at 0.5 and 1.3 represent the largest range of anisotropy values reliably observed in early type galaxies Gebhardt et al. (2003); Cappellari et al. (2008Cappellari et al. ( , 2009)) (2021).This range also encompasses the range of anisotropy profiles previously determined for M87 (Cappellari & McDermid 2005;Gebhardt et al. 2011).In Table 3, we list all of the parameters along with the corresponding bounds on their values.We also restrict the parameter lg  to be between -0.3 and 0.6.This is due to the fact that small values of  correspond to no anisotropy transition, which we want to exclude, and large values of  give rise to an infinitely large parameter space where the spatial transition of the anisotropy takes place nearly instantaneously.In practice, the preferred range of parameter space almost always lies between -0.3 and 0.6 so this does not impact our final results.
Running JAM for the required number of steps necessary to generate reliable posteriors for all of these combinations of models and data is very computationally expensive.The final contours exhibit strong covariances that require a long burn in time for the walkers to sample and populate the posterior.The likelihood also has multiple local minima which further increases the run time if the chain is started further away from the global minimum.In order to speed up this process, we start by running emcee for 300000 steps with 100 walkers for the JAM model of a spherical galaxy using the routine jam_sph_proj in Jampy. 13This is a good approximation to the true Jeans solution as M87 is highly spherical, especially in the range of our data.Once this step is complete, we have a good approximation of the posterior.From there, we run emcee for 50000 steps with 100 walkers for the JAM model assuming axisymmetry and a spherically aligned velocity ellipsoid using the routine jam_axi_proj.These final 50000 steps are what is used for the final analysis.
We present the posteriors for this in Fig. 17.We also show the posteriors assuming the most general model for the individual MUSE RNI, OASIS RNI, and SAURON data sets in Fig. 12.Given our choice of physically motivated priors, some of the posteriors are not symmetric and even run into the boundary.As such, we report the posterior median and left and right 1 confidence interval for all of the combinations of models and combined data in Table 4.As we expect, the results for M87 change very little between using spherical JAM and axisymmetric JAM.We also show the model fits to the observed dispersion in Fig. 14.   4.

Black Hole Results
In Fig. 17 we show the full posteriors using the RNI spectra for each of the four models.In Fig. 13 we show 1 and 3 contours of the black hole mass and total / within one half light radius for the four different models using the RNI spectra.We choose to focus on the RNI spectra as this is the case where the OASIS and MUSE data have the closest agreement, suggesting that this choice of kinematics best reflects the true kinematics.The impact of the different choices of kinematics on the following results is some scatter, as opposed to single systemic shifts in the measured values.From Fig. 12 we can deduce the individual contributions of each data set to the final black hole masses.There we see that the OASIS and MUSE data are remarkably similar, with the SAURON data providing a unique contribution.
We find a large range of permissible black hole mass values, from nearly 6 × 10 9  ⊙ to greater than 11 × 10 9  ⊙ at one sigma uncertainty across different choices of kinematics and models.The values obtained using the most general model (the median of the posterior) varies between 8.3 − 8.9 × 10 9  ⊙ .The errors are asymmetric and range in magnitude between 1.2 to 1.6.Using the kinematics derived from RNI spectra gives  BH = (8.7 ± 1.2 ± 1.3) × 10 9  ⊙ where we calculate the second error to be half the difference between the largest and smallest black hole mass across all kinematics and models (i.e. the largest and smallest values in Table 4).Note that restricting this just to the most general model across each choice of kinematics decreases the systematic error to 0.3.The final allowed range of black hole masses depend strongly on the model assumptions.The most simple model with only constant  * / has the smallest range of permitted black hole masses.Including dark matter slightly shifts this range to the right.This is because the SAURON data strongly constrains the kinematics at large radii, so there is a tight correlation between  dm and  * / which allows one to interchange stars and dark matter.Decreasing stars at large radii in favor of dark matter, however, must be compensated at small radii with an increased black hole mass.Throughout this the anisotropy parameters remain fairly constant.This changes significantly with the introduction of varying  * /.In the model with a  * / variation and no dark matter, this results in a much larger range of viable black hole masses at the lower mass end.This is due to a correlation between (/) 1 and the black hole mass which effectively results in the mass of the black hole being exchanged for mass in stars.The results for the black hole mass are similar for the most general model featuring both varying / and DM.
The most interesting difference compared with previous studies is that the black hole mass we measure is more massive than that found in previous stellar dynamical studies.One key difference between our analysis and previous work is that we directly measure the stellar distribution within the influence of the AGN and find the stellar profile to be flatter than that used in all previous black hole studies of M87 using stellar dynamics.In order to determine if this could explain the increase in the black hole mass, we performed an MCMC run using the stellar profile of Kormendy et al. (2009) as this is used for the tracer distribution in previous studies (Gebhardt & Thomas 2009;Gebhardt et al. 2011;Liepold et al. 2023).We do our modelling with the RNI spectra and in the model with DM and constant M/L, as this most closely approximates the models used in Gebhardt & Thomas (2009) and Gebhardt et al. (2011).The 1 and 3 posteriors for this are shown in Fig. 13.We find a preferred black hole mass of  BH = (5.5 +0.5  −0.3 ) ×10 9  ⊙ , which agrees much more closely with previous measurements which range between 5.4 × 10 9  ⊙ (Liepold et al. 2023) to 6.2 × 10 9  ⊙ (Gebhardt et al. 2011).The equivalent model using our stellar distribution gives a black hole mass of  BH = (10.2+0.8  −1.0 ) × 10 9  ⊙ .Naively one might think that the difference in the black hole mass is purely due to the difference in the gravitational potential between the two models.On its own, our stellar profile should increase the measured black hole mass over previous determinations as it is exchanging the mass of the stars in the central regions of the galaxy with the mass of the black hole.However, one can calculate the decrease in stellar mass in the centre of the galaxy between our model and the model of Kormendy et al. (2009) and we find that, assuming a constant stellar / ratio of 3.4 without DM, that within 5 ′′ (approximately the sphere of influence) the decrease in stellar mass is 5.6 × 10 7  ⊙ , or around 1 per cent of the total stellar mass within 5 ′′ .This implies that the modification to the gravitational potential due to our model cannot be the sole cause of the difference between the two results.
One possible alternative is that this is a result of the very flat core.As pointed out by Kormendy & Ho (2013, sec. 3.1), measuring BHs with stellar dynamics in galaxies with flat cores is intrinsically less accurate than in galaxies with steep inner profiles, because of the increase importance of orbital anisotropy.Additionally, in centrally cuspy galaxies, the line of sight velocity distribution along the photometric centre of the galaxy receives its largest contribution from the three dimensional origin of the galaxy.In the case of a very flat core, the line of sight velocity distribution along the photometric centre receives an even contribution from a larger range of radii.This further increases the uncertainty in the kinematic modelling (note the much larger uncertainties using our profile over the Kormendy et al. (2009) profile) and has the potential to impact the extracted kinematics.Furthermore, a steeper profile implies more stars near the black hole, where the intrinsic sigma is higher.This leads to a higher observed sigma after projecting along the line-of-sight.In order to better demonstrate this effect, we show a plot of the model dispersion assuming our best fit parameters but using the stellar distribution from Kormendy et al. (2009) as opposed to our stellar distribution in Fig. 14.We find that, given the same set of parameters, using the old stellar distribution leads to a sharp rise in the dispersion within 1 ′′ .This suggests that in order to fit the data, previous studies required a smaller black hole mass with a more radially biased anisotropy closer to the black hole.This is exactly what we see in Fig. 15 where we plot our anisotropy and that of Gebhardt et al. (2011).Further study of this will be required in future work.

Mass to Light Ratio Constraints
One important feature in this work is the inclusion of stellar / variations.In Fig. 16 we show 1000 / profiles randomly sampled from the posterior of the most general model using the RNI data.We find a strong preference for an increasing / ratio in the central regions of the galaxy.Our best fit / variation agrees with the bottom end of figure 11 of Sarzi et al. (2018) assuming a Kroupa IMF.Previous studies have either failed to take / variations into account (Gebhardt & Thomas 2009;Gebhardt et al. 2011), or directly used the profile measured in Sarzi et al. (2018) (Liepold et al. 2023).Our result suggests that it is important to treat the profile as a free parameter.
One important limitation of these results is the fact that the / variation parametrization we use does not permit particularly steep variations due to it fixing the / ratio at 1 ′′ and 30 ′′ .To test the effect of this, we ran a model where the inner radius at which the / ratio becomes fixed is free to vary.We find a preference for a range of values between 1 ′′ and 4 ′′ but do not find significant changes in the distribution of / profiles in the range of the Sarzi et al. (2018) data.Our work thus establishes a preference for a / gradient at the lower end of what was reported in Sarzi et al. (2018).However, we caution that the recovered  * / profile is degenerate with the slope of the dark matter profile and the one we derive relies on a fixed NFW halo.Obviously no dynamical model can distinguish between a variation in the total density due to the stellar  * / or the dark matter without assumptions.
The total / ratio (defined as / within one  e ) that we measure exhibits some model dependence.In Fig. 13, we see that the total / ranges between 3.2 up to 3.7 depending on the model assumptions.This range of values is primarily due to the uncertainty in the anisotropy profile at large radius.The model with the largest total / is the model with varying / and DM.We see in Fig. 17 that the preferred value of (  /  ) ∞ is smaller than in the other models.This decrease in anisotropy at large radius must be made up by an increase in the mass.This further highlights the importance of the mass-anisotropy degeneracy when studying M87.
The total / ratio of M87 in the I-band has been previously measured in Cappellari et al. (2006) and in the r-band from Cappellari et al. ( 2013) (using mass follows light models in both cases).Converting these to the SDSS-z band (as a proxy for the ACS/WFC F850LP band) gives / ratios of 4.2/4.8 14, and 4.0.This is slightly larger than our range of /.In both previous determinations, the models were fit to the data out to R≈35 ′′ , while we only fit the data out to R=15 ′′ .The slightly larger total / of previous determinations may be explained as due to both their smaller adopted BH and an increase in the dark matter fraction between these radii.Gebhardt & Thomas (2009) and Murphy et al. (2011) also present measurements of the V-band stellar / ratio of 6.3 and 8.2.Converting these to SDSS-z band gives 2.8 and 3.7, respectively.These cannot be directly compared to the results in Fig. 13 as we present the total / rather than just the stellar /.However, we can still conclude that a stellar / of 3.7, after including dark matter, will lead to a total / that is slightly above the range of what we have determined.Likewise, a stellar / of 2.8, combined with a dark matter fraction of ∼ 15 will result in a total / of ∼ 3.2, which is at the lower end of what we measure.
In this work we measure the stellar distribution within the influence of the AGN and find the stellar profile to be flatter than in previous work.The difference to the enclosed mass within 5 ′′ is close to 1%, implying that this does not significantly modify the gravitational potential.However, M87 is a large galaxy where the AGN covers only a small fraction of the stellar distribution relative to the size of the black hole.For other galaxies, such as NGC 4151, one of the key uncertainties in the black hole mass determination is uncertainty on the cuspiness of the inner stellar distribution (Roberts et al. 2021).Applying this technique to that case or similar cases could significantly reduce the uncertainties in the final black hole mass measurement.2011) is shown in black.We find strong evidence for a radially increasing anisotropy ratio while varying strongly due to the mass anisotropy degeneracy.

Dark Matter
In this work we assume a NFW dark matter halo with break radius equal to 20kpc and find, in our most general models, a preference for a dark matter fraction within one effective radius of around  dm (<  e ) ≈ 0.2.This closely agrees with a result from Murphy et al. (2011) which determined the dark matter fraction within one effective radius to be approximately 17%.Our results, however, are strongly model and data dependent.In the less general models we consistently find a preference for no or very little dark matter (see Fig. 17).Additionally, on their own, the MUSE, OASIS, and SAURON data do not have a preference for a dark matter halo (Fig. 12).This data only goes out to 1.2 kpc, so we do not expect to very strongly constrain the dark halo.This highlights the importance of including large scale kinematic data for constraining information on the dark matter halo.

Anisotropy Profile Constraints
In Fig. 15 we show 1000 anisotropy profiles randomly chosen from the posterior of the NFW DM + Varying / model using the RNI spectra.We find a remarkable agreement with previous work.The profiles all display the radially increasing behavior expected of slow rotators.We also visually see that the profiles tend to transition from constant on the right hand side to lower values near 10 ′′ .This is close to the size of the core of M87 (5.′′ 66 according to Lauer et al. (2007)), and agrees with the results of previous studies demonstrating that the size of the core in slow rotators is close to the radius at which the velocity anisotropy ratio becomes tangential (Thomas et al. 2014).Another observation we should make is the strong correlation between the black hole mass and the anisotropy profile in Fig. 15.This clearly demonstrates the strong role that the mass-anisotropy degeneracy plays in this analysis.
One important comment is that these results depend on our use of physically motivated priors.We see in Fig. 17 that in many cases, the posteriors for (  /  ) 0 and (  /  ) ∞ run into the imposed boundaries and thus are unable to explore the full range of parameter space capable of reproducing the data.

CONCLUSION AND FUTURE PROSPECTS
We have studied the galaxy M87 using stellar kinematics from SAURON, OASIS, and MUSE using the code JamPy and our primary conclusions are as follows: • The stellar distribution of M87 can be measured directly within the influence of the AGN.This is done by directly measuring the fraction of the spectral flux due to stars during the kinematic extraction.The shape of the stellar distribution profile used in previous studies (Kormendy et al. 2009) overestimates the the stellar density in the central region of M87 by a factor of 2 (see Fig. 8).
• For galaxies with an AGN, the PSF can be accurately measured in integral field spectroscopy by measuring the continuum flux during the kinematic extraction (see Fig. 7 and Table 1).This is due to the fact that the AGN spectral contribution is thought to be smooth, and hence is well approximated by additive polynomials.We measure the FWHM of the OASIS PSF to be 0. ′′ 561 and the FWHM of the MUSE PSF to be 0. ′′ 049.These are consistent with what we expect from the seeing for OASIS and the AO capabilities of MUSE.
• We use JAM dynamical models of the kinematics in a Bayesian fashion.We find a preferred black hole mass of  BH = (8.7 ± 1.2 ± 1.3) × 10 9  ⊙ with the second error showing the modelling and kinematic systematic uncertainty.This range is consistent with the EHT measurement and previous stellar dynamical models of M87, though with a distinct preference for a larger black hole mass.Our analysis also highlights the fact that, even with excellent data, the derived black hole mass is sensitive to a variety of assumptions on both the kinematic extraction and the / variation.The resulting uncertainties, when accounting for these systematics, are significantly larger than usually adopted.
• We find a strong preference for a radially decreasing  * / ratio at the lower end of what is found in Sarzi et al. (2018).This has the effect of expanding the range of allowed black hole masses to lower values.
• We measure the anisotropy profile of M87 assuming a new flexible analytic parametrization for the anisotropy which is a logistic function of logarithmic radius and find a strong preference for a radial increase.This also clearly shows the mass-anisotropy degeneracy which strongly contributes to the uncertainty in the black hole mass.We conclude that, contrary to what is sometimes assumed, one can obtain stringent constraints on both black hole masses and on the anisotropy profile, using the Jeans equations, by combining priors on the anisotropy, which is now well-understood in galaxy centres, with realistic parametrizations for the total density.There remain many important questions about M87 that are well suited to be studied in the near future.Recent work has suggested a number of different ways that improved modelling of the gas disk is able to bring the supermassive black hole measurements from gas dynamics into agreement with those from stellar kinematics and the EHT (Jeter et al. 2019;Jeter & Broderick 2021;Osorno et al. 2023).This, however, relies on resolving details of the gas kinematics within the innermost arcsecond of the galaxy.New and improved datasets could be used to differentiate between these scenarios.
Furthermore, as this work shows, future studies of the supermassive black hole mass of M87 using stellar kinematics will also require detailed studying of systematic effects in order to produce reliable black hole mass results.High quality data for these tasks may not be far off.There is a cycle 1 JWST proposal (2228, PI: Jonelle Walsh) to measure the central supermassive black hole of M87 using NIRSpec.This will cover a wavelength range including the CO bandhead that is similar to the wavelength range covered in Gebhardt et al. (2011) though it will be unaffected by skylines and will have much higher spatial resolution and signal to noise.This will provide the most detailed view of M87's inner kinematics to date.

Figure 2 .
Figure2.Plots of the OASIS spectra as a function of radius.The black curve is the data and the red curve is the best pPXF fit to the stars and the orange curve is the best fit to the gas.The  = 0 ′′ spectra is the central spaxel for OASIS.This is not fit as we do not include it in the final analysis.The double peaked structure of the gas appears intermittently at a variety of radii.The bottom spectra shows the single stellar template that we use to fit the stellar kinematics for all of the OASIS spectra, as well as the sum of the gas free spaxels that we use to fit for the single stellar template.The mean luminosity weighted radius of the gas free spectra is 3. ′′ 7.

Figure 4 .
Figure 4.The top and centre panels show the extracted dispersion for M87 under a number of different assumptions for MUSE and OASIS, respectively.Note that in each case the dispersion profile is consistent up to a constant scaling outside of one arcsecond.Within one arcsecond, however, there is a large deviation in the extracted profiles.This is due to the uncertainty in the kinematic extraction caused by contamination from the AGN and the degeneracy in the fits between the fit to [NI] and the Mgb absorption.The bottom panel shows the dispersion profiles used in the final analysis scaled to match the SAURON profile.Data shown in the same color are combined in the final analysis.

Figure 5 .
Figure 5. Map of ℎ 4 measured by running pPXF on the Full MUSE spectra with Legendre polynomial degree of 4. The values are centered on zero with some scatter.The results are unchanged under different Legendre Polynomial assumptions.

Figure 6 .
Figure 6.Top panel: MUSE kinematics extracted over the full spectrum with additive Legendre polynomial degree of 4. The centremost 0.1 ′′ is masked (this is the circle in white at the origin).Bottom panel: OASIS kinematics extracted over the full spectrum with additive Legendre polynomial degree of 4. Both are oriented so that north is facing up and east is facing to the left.
Figure7.The top panel shows a plot of the spectroscopic data of the AGN with the best fit multi-gaussian expansion of the PSF after being integrated over the MUSE/OASIS lenslet.This is presented as a one dimensional function where the x-axis is the radius of each data point, with those lying to the left of the origin having  < 0 and those lying to the right having  > 0. The colormap shows the 2D image of the spectrally extracted AGN on a log scale.The bottom panel shows this for OASIS.These results show the reliability of using the AGN contribution to the spectra as measurement of the PSF.

Figure 8 .
Figure 8.Comparison of a scaled HST radial profile with a profile where the centre region is set by the stellar profile measured spectroscopically from MUSE.The purple and orange curves show the spectroscopically extracted stellar profile from the MUSE data under different assumptions.The orange curve fixes the stellar kinematics to the lowest values seen in the bottom panel of Fig. 4, and similarly for the purple curve with the largest values.The different lines of the same color show different choices of Legendre polynomial degree.The black dashed curve shows the profile we adopt.We also compare with the profile used in the previous stellar dynamical determinations of Gebhardt & Thomas (2009); Gebhardt et al. (2011); Liepold et al. (2023).Their profile is a factor of 2 larger than ours in the centre.

Figure 9 .
Figure 9. Top panel: best fit MGE contours in the innermost 100 arcseconds of the HST image.The regions in yellow are the masked jet and gap between the detectors.Bottom panel: best fit MGE contours over the full SDSS field.We mask a prominent star at the top of the field.You can see that the MGE fit covers the full shape of the galaxy and provides an excellent fit in the innermost region where our data is the most sensitive.The galaxy is oriented so that north is at the top and east is to the left.

Figure 10 .
Figure10.Top row: the contributions to the total luminosity weighted second moment from different MGE components.Middle row: intrinsic anisotropy ratio compared with the anisotropy assigned to each gaussian.Bottom row: projected velocity dispersion.The columns correspond to different choices of the total anisotropy, ranging from constant to radially varying.

Figure 11 .
Figure11.The top panel shows the contributions to the radial velocity dispersion where the desired anisotropy profile is reached by fitting the anisotropy for each gaussian in the MGE.The center panel shows the same thing using the analytic implementation described in equations 12-14.The bottom panel shows the intrinsic anisotropy ratio and confirms that the result using the two methods is the same, the key difference being that the top one requires fitting the anisotropy for each gaussian while the center one requires a small modification to the analytic solution.

Figure 12 .
Figure 12.Corner plots for the individual MUSE, OASIS, and SAURON data sets, respectively, using the RNI spectra.The MUSE and OASIS corner plots feature a secondary minimum corresponding to a near isotropic anisotropy profile in the centre of the galaxy with a lower black hole mass.Bestfit parameters and errors for MUSE and OASIS are shown in the last two rows of Table4.

Figure 13 .
Figure 13.Plot of the 1 and 3 confidence intervals for black hole mass and mass to light ratio within a sphere of one half light radius for each of the scenarios shown in Fig. 17.
Figure 13.Plot of the 1 and 3 confidence intervals for black hole mass and mass to light ratio within a sphere of one half light radius for each of the scenarios shown in Fig. 17.

Figure 14 .
Figure14.Plot of 1000 dispersion profiles randomly sampled from the MCMC chain with MUSE RNI + OASIS RNI + SAURON and colored according to their supermassive black hole mass.This is compared with a set of binned data points combining the MUSE RNI, OASIS RNI, and SAURON data.We also show a model with no black hole mass.It is technically possible to fit the data with no black hole mass if there is a highly radial anisotropy in the centre.To ensure this does not happen, we fix   = 1 and find the best fit model enforcing  BH = 0.The dotted and dashed purple line shows the dispersion after exchanging our stellar distribution with the one from Kormendy et al. (2009) but still using the parameters from the best fit model assuming our stellar distribution.You can see that the profile becomes much steeper in the center.The way to remedy this is by decreasing the black hole mass to the value determined in previous studies and adjusting the anisotropy correspondingly.

Figure 15 .
Figure 15.Plot of 1000 anisotropy profiles randomly sampled from the MCMC chain with MUSE RNI + OASIS RNI + SAURON and colored according to their supermassive black hole mass.The best fit anisotropy profile from Gebhardt et al. (2011) is shown in black.We find strong evidence for a radially increasing anisotropy ratio while varying strongly due to the mass anisotropy degeneracy.

Figure 16 .
Figure 16.Plot of 1000 M/L profiles randomly sampled from the MCMC chain with MUSE RNI + OASIS RNI + SAURON and colored according to their supermassive black hole mass.This is compared with range of M/L variations assuming Kroupa IMF from Sarzi et al. (2018) which is shown in gray and outlined with black lines.Our data strongly prefers an increasing M/L variation towards the centre of the galaxy consistent with that inSarzi et al. (2018).

Table 1 .
Table of fitted MGE parameters for OASIS, MUSE, and the HST F850LP TinyTim PSF, respectively.The FWHM for OASIS is 0. ′′ 561, for MUSE is 0. ′′ 049, and for HST is 0. ′′ 063.The HST PSF is larger than the MUSE PSF because the wavelength observed in the HST observation is longer than that in the MUSE observation.

Table 2 .
MGE parameters for the deconvolved ACS/WFC F850LP surface brightness.This corresponds to the orange profile in Fig.8and the red contours in Fig.9.The principal photometric axis measured is 156.7 degrees east of north.
Sarzi et al. (2018)eters and their permitted upper and lower bounds.The limits on (   /  ) 0 and (   /  ) ∞ come from the large literature of observations and simulations of core galaxies.The limits on / come from considering the heaviest possible IMF combined with the results fromSarzi et al. (2018) ; McConnell et al. (2012); Thomas et al. (2014); Krajnović et al. (2018), as well as in simulations of slow rotators Rantala et al. (2019); Frigo et al.

Table 4 .
Table of best fit parameters for each combination of instrument, spectral range, and legendre polynomial degree.The values presented are the median of the posteriors, with the upper and lower bounds corresponding to the 1 interval.The black hole masses are clustered between 8 and 10 billion solar masses.In the data column M stands for MUSE, O for OASIS, and S for SAURON.