MRK 1216 & NGC 1277 - An orbit-based dynamical analysis of compact, high velocity dispersion galaxies

We present a dynamical analysis to infer the structural parameters and properties of the two nearby, compact, high velocity dispersion galaxies MRK 1216 and NGC 1277. Combining deep HST imaging, wide-field IFU stellar kinematics, and complementary long-slit spectroscopic data out to 3 R_e, we construct orbit-based models to constrain their black hole masses, dark matter content and stellar mass-to-light ratios. We obtain a black hole mass of log(Mbh/Msun) = 10.1(+0.1/-0.2) for NGC 1277 and an upper limit of log(Mbh/Msun) = 10.0 for MRK 1216, within 99.7 per cent confidence. The stellar mass-to-light ratios span a range of Upsilon_V = 6.5(+1.5/-1.5) in NGC 1277 and Upsilon_H = 1.8(+0.5/-0.8) in MRK 1216 and are in good agreement with SSP models of a single power-law Salpeter IMF. Even though our models do not place strong constraints on the dark halo parameters, they suggest that dark matter is a necessary ingredient in MRK 1216, with a dark matter contribution of 22(+30/-20) per cent to the total mass budget within 1 R_e. NGC 1277, on the other hand, can be reproduced without the need for a dark halo, with a maximal dark matter fraction of 13 per cent within the same radial extent. In addition, we investigate the orbital structures of both galaxies, which are rotationally supported and consistent with photometric multi-Sérsic decompositions, indicating that these compact objects do not host classical, non-rotating bulges formed during recent (z <= 2) dissipative events or through violent relaxation. Finally, both MRK 1216 and NGC 1277 are anisotropic, with a global anisotropy parameter delta of 0.33 and 0.58, respectively. While MRK 1216 follows the trend of fast-rotating, oblate galaxies with a flattened velocity dispersion tensor in the meridional plane of the order of beta_z = delta, NGC 1277 is highly tangentially anisotropic and seems to belong kinematically to a distinct class of objects.

galaxies with very high velocity dispersions. In van den Bosch et al. (2012, hereafter vdB12), six such objects were highlighted with sizes smaller than Re = 3 kpc and central stellar velocity dispersions higher than σc = 300 km s⁻¹.
These features indicate extremely high dynamical mass densities for which there are two possible explanations, assuming reasonable stellar densities: over-massive black holes that weigh a significant fraction of the total baryonic galaxy mass, or high stellar mass-to-light ratios which would increase the stellar dynamical mass considerably but imply a stellar initial mass function (IMF) much more bottom-heavy than a Salpeter IMF.
According to the orbit-based dynamical models of vdB12, NGC 1277 hosts an over-massive SMBH and possesses a stellar IMF that is consistent with a Chabrier IMF, ruling out a Salpeter IMF at 3σ. Interestingly, Emsellem (2013, hereafter E13) showed a hand-picked alternative model with a smaller black hole, a Salpeter-like IMF and no dark matter, which produces a reasonable fit. Furthermore, spatially resolved spectroscopic data along NGC 1277's major axis have been obtained and investigated in Trujillo et al. (2014, hereafter T14) and Martín-Navarro et al. (2015b), which indicate a uniformly old stellar population, a high and constant α-abundance and a bottom-heavy IMF. Their reconstructed stellar mass-to-light ratio of Υ ≈ 7 is consistent with the values reported in vdB12, but much lower than the Υ = 10 adopted by E13. Clearly, all these differences call for a re-examination of NGC 1277's stellar and central dark component.
Dark matter is not expected to be an important contributor at kpc to sub-kpc scales, but is nonetheless a key ingredient in many early-type galaxies (Rix et al. 1997; Cappellari et al. 2006; Thomas et al. 2007; Cappellari et al. 2013) that needs to be taken into account in any dynamical analysis due to its degeneracy with the stellar mass-to-light ratio and hence with the black hole mass (M•) (Gebhardt & Thomas 2009). Most studies that aimed to constrain the halo contribution to the overall mass profile either used long-slit spectroscopic observations or spatially limited integral field unit (IFU) data that rarely go beyond ∼ 1-2 effective radii (Re). The effective radius is only a relative scale that neither guarantees nor excludes the coverage of a substantial amount of dark matter. However, the SAURON and ATLAS3D surveys found a mean dark matter contribution of about 30% inside 1 Re, which corresponds to a mean absolute scale of only 5 kpc. The compact galaxy sub-sample of the aforementioned HETMGS, though, should provide more interesting constraints in this regard. The apparent sizes of these galaxies are relatively small and thus allow us to obtain detailed two-dimensional stellar kinematics out to several effective radii (which at the same time also corresponds to a larger absolute coverage of up to 10 kpc, given their mean distance) to study their mass profiles and hence to probe the existence of dark matter, which is assumed to dominate the mass budget in these remote regions.
The aim of this paper is to set the stage for an investigation of compact, high-dispersion galaxies from the HETMGS by combining long-slit spectroscopy with the HET, high-spatial resolution imaging with the HST and large-field, medium- and low-resolution spectroscopic observations with the PPAK IFU at Calar Alto. In doing so, we want to tackle several issues:
(i) dynamically infer the black hole mass, mass-to-light ratio and dark matter content of each galaxy, (ii) identify the dynamically hot and cold components to see whether violent relaxation or dissipative events played an important role in the recent evolution of these objects, (iii) analyse the stellar populations to obtain and further constrain reliable stellar mass-to-light ratios and IMF slopes as well as to gain insight into their formation histories, (iv) compare our results with the current picture of how galaxies and their constituents scale and evolve.
In this paper, we focus on orbit-based dynamical models of only two objects, namely MRK 1216 and NGC 1277, with effective radii smaller than 2.5 kpc and exceptional central dispersion peaks of σc > 300 km s⁻¹ (Table 1). The PPAK observations of both were taken with the V1200 medium-resolution grating, covering a wavelength range of 3400-4840 Å. This restricted range makes a stellar population analysis poorly suited to answering IMF-related questions. In addition, the kinematics for these two galaxies are currently available only out to a radius of ∼ 15 arcsec, due to a much shorter exposure strategy compared to the rest of the sample. Nevertheless, the wide-field IFU data presented here still cover these objects out to ∼ 3 Re (i.e. 5 kpc), which should be sufficient for a dynamical examination.
The paper is organised as follows: In Section 2 we present the photometry. Section 3 covers the stellar kinematics. In Section 4 we carry out a dynamical analysis to constrain the black hole mass, stellar mass-to-light ratio and dark matter content of both galaxies. Section 5 rounds up and discusses the results. Section 6 highlights uncertainties and potential error sources, followed by a brief summary in Section 7.
Throughout this paper, we adopt 5th year results of the Wilkinson Microwave Anisotropy Probe (WMAP) (Hinshaw et al. 2009), with a Hubble constant of H0 = 70.5 km s⁻¹ Mpc⁻¹, a matter density of ΩM = 0.27 and a cosmological constant of ΩΛ = 0.73.

PHOTOMETRY
In this section, we present the photometric data, consisting of high-spatial resolution imaging with the HST. The first part of this section covers the reduction and combination of dithered HST exposures to a final, super-sampled image. The second part then describes the photometric analysis of MRK 1216 and NGC 1277.

HST Imaging
We obtained single orbit imaging of MRK 1216 with the HST WFC3 in I- (F814W) and H-band (F160W), as part of program GO: 13050 (PI: van den Bosch). The data set comprises three dithered images in I-band with a total integration time of 500 seconds and seven images in H-band with a total integration time of 1354 seconds. The H-band images consist of three dithered full- and four sub-array exposures. The 16 arcsec × 16 arcsec sub-array images are short 1.7 second exposures, to mitigate possible saturation in the 450 seconds full-frames of the high surface brightness nucleus.

Table 1. Photometric properties of MRK 1216 and NGC 1277. (1) Morphological classification according to the NED, (2) Hubble flow distance, (3) scale at this distance, (4) effective radius in arcsec or (5) kpc, measured by a circular aperture that contains half of the light, (6) extinction corrected total luminosity of the HST F160W (MRK 1216) and F814W (NGC 1277) exposures, (7) peak and effective velocity dispersion in the PPAK data, and (8)
For the photometric as well as the dynamical analysis, we give preference to the deep HST H -band exposures, mainly due to less dust susceptibility in the near-infrared (NIR) and the fact that the inferred stellar mass-to-light ratios become a weaker function of the underlying stellar populations (Bell & de Jong 2001;Cole et al. 2001).
The reduction and combination of the individual F160W exposures is performed via Astrodrizzle (Gonzaga et al. 2012) in two major steps. First, a bad pixel mask for each flat-field calibrated image is generated, which is then used during the combination of dithered exposures, while correcting for geometric and photometric distortions. Both the deep full- and sub-array exposures in F160W are dominated by galaxy light of the huge stellar halo and hence the standard sky subtraction routine in Astrodrizzle (consisting of iterative sigma-clipping of uniformly distributed pixels) overestimates the background flux. We therefore measure the background flux separately in all the frames, manually, before combining the images. For the full-frames, the background level is measured in the less contaminated corners of each image, while the background flux in the sub-frames is estimated by measuring the flux difference between the (already) sky subtracted full-frames and the different sub-frames. We do not reach the sky/noise dominated regions in the full-frames, but the surface brightness (SB) in the corners of each image (where the sky estimates have been performed) is more than 10 magnitudes below the central SB, and will thus have little impact on the accurate recovery of the central stellar light/mass. Fig. 1 (top panel) shows the match of the surface brightness profiles of MRK 1216 after subtraction of sky background in the full- and sub-array exposures. Once the sky values have been determined, we combine the frames via Astrodrizzle and obtain a super-sampled image with a resolution of 0.06 arcsec/pixel and a FOV of 1.5 arcmin² (Fig. 2).

Figure 1. Top: The sky of the full-frames was calculated by iterative sigma-clipping of non-contaminated regions. The sky of the sub-frames was inferred by matching their non-sky subtracted SB profile with the sky subtracted SB profile of the full-exposures. At radii beyond 5 arcsec, the SB of the sub-exposures is noise dominated. Bottom: Fits to the final, HST H-band image, showing the match with a single Sérsic, multi-Sérsic and multi-Gaussian expansion. The single Sérsic overestimates the SB measurement at both the inner- and outermost radii. The MGE accurately reproduces the SB in the inner parts but is more extended, whereas the multi-Sérsic fit represents a fair match at all radii.
For the photometric analysis, we adopt a CANDELS point-spread function (PSF) (van der Wel et al. 2012). In brief, the PSF has a size of 0.17 arcsec FWHM and has been generated with TinyTim (Krist 1995) for the F160W filter. The PSF is created in the centre of the WFC3 detector, to minimise distortion, and is 10× sub-sampled. Resampling it back to the original HST IR scale of 0.13 arcsec/pixel, and applying a kernel to replicate the effects of inter-pixel capacitance, creates a synthetic star in the centre of each frame. The final PSF is then obtained by "drizzling" the images and thus the PSFs. In this way, we produce a point-spread function at the same scale as our final science image.
High-resolution imaging of NGC 1277 is available in the Hubble Legacy Archive. Observations of this galaxy have been carried out in program GO: 10546 (PI: Fabian), resulting in three dithered exposures in R- (F625W) and V-band (F550M) with a total integration time of 1654 s and 2439 s, respectively. In contrast to the I- and H-band images of MRK 1216, the redder R-band does not have a significant advantage over the V-band. The leverage between the two filters is small and consequently both are equally subject to the effects of extinction (see Section 2.3). Here, we employ the V-band photometry because of its longer exposure time and for the sake of consistency with the modelling results of vdB12.
The F550M flat-field calibrated images have been sky subtracted, cosmic ray rejected and corrected for photometric and geometric distortions via Astrodrizzle before being combined into a final image with a resolution of 0.05 arcsec/pixel. The PSF of these observations was recovered with TinyTim, created in each of the three individual, dithered exposures and drizzled to match the resolution of the corresponding science frame.

MRK 1216
MRK 1216 is a sparsely investigated early-type galaxy (ETG) (R.A.: 08h 28m 4s, Decl.: −06° 56′ 22″) with strong excess of UV radiation in its centre (Markarian 1963). A few redshift measurements have been carried out for this object (Petrosian et al. 2007; Jones et al. 2009), which translate to a Hubble flow distance of 94 ± 2 Mpc. Given its distance, 1 arcsec is equivalent to 450 ± 10 pc. The final, combined HST image thus covers a field of view (FOV) of 65 kpc².
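As a rough cross-check of the quoted scale, the conversion between distance and physical size per arcsecond can be sketched as follows. The function name is illustrative, and the sketch assumes the small-angle approximation and ignores the small difference between Hubble flow distance and angular diameter distance for such a nearby object.

```python
import math

ARCSEC_IN_RAD = math.pi / (180.0 * 3600.0)  # one arcsecond in radians

def pc_per_arcsec(distance_mpc):
    """Physical size (pc) subtended by 1 arcsec at a given distance (Mpc).

    Small-angle approximation; cosmological corrections are ignored,
    which is an assumption adequate only for nearby objects.
    """
    return distance_mpc * 1.0e6 * ARCSEC_IN_RAD

scale_mrk1216 = pc_per_arcsec(94.0)  # adopted distance of MRK 1216
print(f"{scale_mrk1216:.0f} pc/arcsec")
```

For a distance of 94 Mpc this gives about 456 pc per arcsec, within the quoted 450 ± 10 pc.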
To examine its photometric properties, structure and morphology we decompose the galaxy into multiple Sérsic components using Galfit (Peng et al. 2002). The analysis is done for 3 different scenarios: First, a single Sérsic fit is carried out to obtain the single Sérsic index and thus the overall steepness of the light profile. Second, we perform a bulge-disk decomposition, if possible. Although such a decomposition is a matter of debate, we do this for comparison with literature studies, where similar procedures have been carried out to relate central black hole masses to bulge luminosities. Third, we execute a fit with multiple Sérsic components that best matches the light profile.
A single Sérsic fit to MRK 1216's H-band image has an apparent magnitude of m_H,Vega = 10.47, an effective radius of Re = 6.34 arcsec, a projected axis ratio (b/a) = 0.58 and a single Sérsic index of n = 4.93. Residuals of this fit are strong. In comparison to the SB measurement, the single Sérsic fit shows an excess of light in the very centre and also tends to overpredict the light of the large outer halo (Fig. 1, bottom panel).
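For orientation, a Sérsic profile with the parameters quoted above can be evaluated with the short sketch below. The approximation used for b_n (leading terms of the Ciotti & Bertin 1999 expansion), the function names and the arbitrary normalisation i_e = 1 are assumptions of this sketch and are not taken from the Galfit fits themselves.

```python
import math

def sersic_b(n):
    """Approximate b_n so that Re encloses half of the total light.

    Leading terms of an asymptotic expansion; an assumption of this
    sketch, reasonable for Sersic indices n of order unity or larger.
    """
    return 2.0 * n - 1.0 / 3.0 + 4.0 / (405.0 * n)

def sersic_profile(r, i_e, r_e, n):
    """Surface brightness I(r) of a Sersic profile in linear flux units."""
    b = sersic_b(n)
    return i_e * math.exp(-b * ((r / r_e) ** (1.0 / n) - 1.0))

# Single-Sersic parameters quoted for MRK 1216 (Re in arcsec);
# i_e = 1 is an arbitrary normalisation chosen for illustration.
for r in (0.5, 6.34, 30.0):
    print(r, sersic_profile(r, 1.0, 6.34, 4.93))
```

By construction the profile equals i_e at r = Re, and a larger n produces both a steeper centre and a more extended outer wing.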
We further investigate the stellar structure by gradually increasing the number of Sérsic components. A two-component model yields a very centrally concentrated (Re = 3.42 arcsec) "bulge" with a Sérsic index of n = 3.61, although remarkably flat (q = 0.56), which is embedded in a (close to) exponential, round and very extended (Re = 17.22 arcsec) stellar "disk"/envelope with a Sérsic index of n = 0.96. Pronounced residuals remain, hinting at a more complex stellar composition. Even so, the "bulge" of the two-component fit will serve as an upper limit to the bulge luminosity. According to this fitting scenario, we obtain a bulge-to-total luminosity ratio of B/T = 0.69. A decent fit is obtained with at least four Sérsic components, resulting in notably lower and less prominent residuals. In this case, the outer stellar "disk"/envelope persists, whereas indications of a "bulge"-like component totally disappear (Table 2). All components show rather low Sérsic indices, which complicates any attempt of a morphological interpretation. We therefore do not present a unique classification but rather stick to the conclusion that MRK 1216 is indeed a compact ETG, harbouring a complex, flat substructure, that is embedded in a round, extended stellar halo. As the innermost component is too small to be considered a bulge, we adopt the luminosity of the second innermost Sérsic as a conservative lower limit of a bulge luminosity, which accounts for 12 per cent of the total light and extends to 1.34 arcsec (or roughly 0.3 Re).

Table 2. Sérsic decomposition of MRK 1216's HST (F160W) H-band image. The columns represent the number of Sérsics for a given fitting scenario (1), their apparent total magnitude (extinction corrected) (2), their effective semi major axis radius (3), the corresponding Sérsic index (4) and their apparent flattening (5).
Our orbit-based dynamical models need a stellar mass model, from which we can infer the stellar gravitational potential. This is accomplished by deprojecting the surface brightness distribution of a galaxy, which however is a non-unique task, as has been convincingly illustrated by Rybicki (1987). Even the surface brightness distribution of an axisymmetric stellar system only provides information about its density outside a so-called "cone of ignorance". This means that in principle, and unless the galaxy is observed edge-on, there could be a family of "konus densities" (Gerhard & Binney 1996) which alter the intrinsic mass distribution but are invisible to the observer as they project to zero surface brightness. Making use of physically and observationally motivated criteria for the luminosity profile of axisymmetric galaxies, van den Bosch (1997) found that the addition of mass due to konus densities cannot be arbitrary and is most likely confined to be less than 10 per cent for (cusped) ellipticals, implying a marginal role in the dynamics of early-type galaxies. We employ a similar, empirically motivated approach by parameterising the surface brightness distribution of galaxies with a set of multiple, two-dimensional Gaussian functions (MGE: Monnet et al. 1992; Emsellem et al. 1994). Although a set of Gaussians does not form a complete set, the MGE method has been very successful in the recovery of the surface brightness profiles and features of realistic multi-component galaxies (Cappellari 2002). We obtain the intrinsic luminosity density by deprojecting the parameterised surface brightness distribution for a given/assumed set of viewing angles, adopting an absolute magnitude of 3.32 for the Sun in H-band (Binney & Merrifield 1998). In the case of an MGE, the deprojection can be performed analytically while the gravitational potential is then obtained by means of a simple, one-dimensional integral.
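The analytic nature of the MGE deprojection can be illustrated with a minimal sketch for the oblate axisymmetric case, following the standard MGE expressions (e.g. Cappellari 2002). The function names and the two-component toy MGE are hypothetical and do not correspond to the components listed in Tab. 3.

```python
import math

def mge_surface_brightness(x, y, gaussians):
    """Projected MGE surface brightness at sky position (x, y).

    gaussians: list of (L, sigma, q_proj) tuples with total luminosity L,
    dispersion sigma along the major axis and projected axis ratio q_proj.
    """
    total = 0.0
    for lum, sigma, q in gaussians:
        norm = lum / (2.0 * math.pi * sigma**2 * q)
        total += norm * math.exp(-(x**2 + (y / q) ** 2) / (2.0 * sigma**2))
    return total

def mge_luminosity_density(R, z, gaussians, inc_deg):
    """Deprojected oblate axisymmetric luminosity density at (R, z)."""
    inc = math.radians(inc_deg)
    total = 0.0
    for lum, sigma, q_proj in gaussians:
        # Intrinsic axis ratio of each Gaussian for the assumed inclination.
        q2 = (q_proj**2 - math.cos(inc) ** 2) / math.sin(inc) ** 2
        if q2 <= 0.0:
            raise ValueError("inclination too low for this Gaussian")
        q = math.sqrt(q2)
        norm = lum / ((math.sqrt(2.0 * math.pi) * sigma) ** 3 * q)
        total += norm * math.exp(-(R**2 + (z / q) ** 2) / (2.0 * sigma**2))
    return total

# Hypothetical two-component MGE (not the values of Tab. 3).
toy_mge = [(1.0, 0.5, 0.8), (3.0, 4.0, 0.6)]
print(mge_surface_brightness(0.0, 0.0, toy_mge))
print(mge_luminosity_density(1.0, 0.5, toy_mge, inc_deg=75.0))
```

Each Gaussian deprojects independently, which is why the choice of inclination is the only additional input in the oblate case.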
Our final MGE contains 10 components with a fixed position angle (PA) of 70.2° (measured counter-clockwise from the y-axis to the galaxy major axis, with the image aligned N.-E., i.e. north is up and east is left) and a common centre, as listed in Tab. 3. The flattest Gaussian has an axis ratio of q = 0.52, which forces the lower boundary of possible inclinations to be greater than 59° (with 90° being edge-on), assuming oblate axial symmetry (see Section 6). A dust disc would be helpful in further constraining the inclination of the galaxy, although it would also pose a major concern for the modelling of the stellar mass, but is not evident in either of the H- and I-band images of MRK 1216. Fig. 2 shows the combined, final H-band image of MRK 1216 (left) and its contour map (right). Over-plotted are contours of the MGE (black) as well as an excerpt of the central 15 arcsec² (∼ 3 Re). The MGE reproduces the SB profile within the central 40 arcsec², but tends to overpredict it at the largest radii. Note that the SB profile lacks any PA twists. The PA is almost constant within 30 arcsec from the centre (∆PA ≲ 2°) and changes only at larger radii where it is virtually unconstrained as the round outer halo has close to zero ellipticity.
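The inclination limit quoted above follows from requiring that the flattest Gaussian still deprojects to a physical oblate spheroid, i.e. cos(i) < q. A minimal sketch of this arithmetic, with illustrative function names and an inclination of 70° picked purely as an example:

```python
import math

def min_inclination_deg(q_proj_min):
    """Lowest inclination (deg) for which the flattest Gaussian can still be
    deprojected into an oblate spheroid, i.e. cos(i) < q_proj_min."""
    return math.degrees(math.acos(q_proj_min))

def intrinsic_axis_ratio(q_proj, inc_deg):
    """Intrinsic axis ratio of an oblate Gaussian seen at inclination inc_deg."""
    inc = math.radians(inc_deg)
    q2 = (q_proj**2 - math.cos(inc) ** 2) / math.sin(inc) ** 2
    if q2 <= 0.0:
        raise ValueError("inclination below the allowed minimum")
    return math.sqrt(q2)

print(round(min_inclination_deg(0.52), 1))
print(round(intrinsic_axis_ratio(0.52, 70.0), 2))
```

For q = 0.52 the limit evaluates to about 58.7°, consistent with the lower boundary of roughly 59° stated in the text.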

NGC 1277
Given several redshift measurements, NGC 1277 is located at a Hubble flow distance of 71 ± 1 Mpc and deeply embedded in the Perseus cluster. It is classified as a lenticular S0 galaxy (Marcum et al. 2001) without any noticeable substructures or prominent features besides the clearly visible central dust disk with a semi-major axis radius of 0.13 kpc and a flattening of q = 0.3. The presence of dust complicates the recovery of the central stellar mass and hence the black hole mass in NGC 1277 (see Section 6), but assuming that this nearly edge-on disk traces the PA of its host, we can pin down the inclination of the galaxy to 75°. As in the case of MRK 1216, photometry shows that a superposition of galaxies can be ruled out as an explanation for the observed high velocity dispersions.
The SB profile of NGC 1277 shows a flattened, regular structure with no significant changes in the PA with increasing distance from the centre (∆PA ≲ 2°). A single Sérsic fit to its V-band image reveals a moderate Sérsic index of n = 2.24, a small effective radius of Re = 3.9 arcsec and a projected axis ratio (b/a) of 0.53. The total V-band magnitude is m_V,Vega = 13.39. This rather simple fitting scenario is, of course, an under-representation of NGC 1277's stellar complexity, leading to strong residuals in the centre (where the luminosity profile shows an excess of light when compared to the Sérsic) and at larger radii.
A further decomposition with two components improves the fit significantly. Here, a flat (b/a = 0.52), inner (Re = 2.85 arcsec) bulge-like component (n = 2.25) is embedded in a rather flat (b/a = 0.50), outer (Re = 10.35 arcsec) disk-like component (n = 0.37). The fit has a bulge-to-total ratio of B/T = 0.75.
An acceptable fit can be obtained with (at least) 4 Sérsics, as listed in vdB12. An interpretation of the various components, however, is difficult, except for the outermost component which resembles a round stellar halo. None of the components has a high Sérsic index, making it difficult to find any photometric evidence for the presence of a "bulge". As the multi-Sérsic fit of NGC 1277 is devoid of a distinct spheroid component, we again adopt the "bulge" of the two-component decomposition as an upper limit to the bulge luminosity (B/T = 0.75), whereas the luminosity of the second innermost Sérsic in the four-component decomposition will serve as a lower limit (B/T = 0.24).
For the stellar mass model we make use of the V-band Multi-Gaussian Expansion of vdB12 with a fixed PA of 92.7°. E13 provided an alternative MGE, based on the R-band image of NGC 1277. The difference between the two parameterisations, though, is minor. Both MGEs reproduce the 2D surface brightness profile equally well, and we refer the reader to E13 for an illustration of the isophotes. There is basically not enough leverage between the wide R-band and medium V-band filter to obtain any colour information that would also minimise the effect of obscuration by the central dust disc. The MGE of E13 in R-band yields a lower total luminosity while increasing the central surface brightness only slightly. These values however do not change the inferred stellar dynamical masses substantially (Sec. 4.2.2).

STELLAR KINEMATICS
This section covers the HET long-slit and PPAK IFU observations. After sketching the reduction of the individual data sets, we extract and present the kinematics which in turn are used as input for our orbit-based dynamical models.

PPAK
Large-field, medium-resolution (V1200) observations of both galaxies have been carried out at the 3.5 m telescope at Calar Alto, with the Potsdam Multi Aperture Spectrograph (PMAS) (Roth et al. 2005) in the PPAK mode (Verheijen et al. 2004; Kelz et al. 2006). The observing details of this run are outlined in Sec. 3.3. The PPAK IFU consists of 382 fibers, which are bundled to a hexagonal shape. Each fiber has a diameter of 2.7 arcsec, resulting in a FOV of roughly 1.3 arcmin². Using a 3 dither-pointing strategy, the 331 science fibers have a 100% covering factor across the entire FOV. An additional 36 fibers are used to sample the sky, while the remaining 15 fibers are used for calibration purposes. The V1200 grating has a resolving power of R = 1650 at 4000 Å. The spectral resolution across the nominal 3400-4840 Å spectral range and FOV is homogenised to 2.3 Å FWHM based on measured line widths in the arc lamp exposure. This spectral resolution corresponds to an instrumental velocity dispersion of σ = 85 km s⁻¹. The low sensitivity at the blue end and vignetting at the red end reduce the useful spectral range to 3650-4620 Å.
The reduction of the PPAK data follows the reduction procedure of the Calar Alto Legacy Integral Field Spectroscopy Area (CALIFA) survey. An extensive overview of the reduction pipeline is given in Sánchez et al. (2012) and Husemann et al. (2013). The data reduction steps performed by the pipeline include bias subtraction, straylight subtraction, cosmic ray rejection with PyCosmic (Husemann et al. 2012), optimal fiber extraction, fiber flat-fielding, flexure correction, wavelength calibration and flux calibration. The sky subtraction is done by averaging the spectra of 36 dedicated sky fibers which are located 72 arcsec away from the PPAK FOV centre. Given our compact objects' sizes, the sky fibers should be free from any contamination of the galaxy itself. To exclude any potential contamination by field stars or low-surface brightness objects, the sky spectrum is constructed by taking the mean of only the 30 faintest sky fibers. The resulting sky spectrum is then subtracted from its associated science exposure. Finally, the 3 dither-pointings are resampled to the final data cube with a 1 arcsec sampling using a distance-weighted interpolation algorithm as described in Sánchez et al. (2012).
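The sky estimate from the faintest fibers can be sketched as below. This is only an illustration of the selection step and not the CALIFA pipeline itself; in particular, ranking fibers by their mean flux is an assumed definition of "faintest", and the function names and toy spectra are hypothetical.

```python
def sky_spectrum(sky_fibers, n_keep=30):
    """Mean spectrum of the n_keep faintest sky fibers.

    sky_fibers: list of spectra (lists of fluxes on a common wavelength grid).
    Fibers are ranked by their mean flux, which is one possible definition
    of "faintest" and an assumption of this sketch.
    """
    if n_keep > len(sky_fibers):
        raise ValueError("n_keep exceeds the number of sky fibers")
    ranked = sorted(sky_fibers, key=lambda spec: sum(spec) / len(spec))
    faintest = ranked[:n_keep]
    n_pix = len(faintest[0])
    return [sum(spec[i] for spec in faintest) / n_keep for i in range(n_pix)]

def subtract_sky(science, sky):
    """Subtract the sky spectrum from one science spectrum."""
    return [s - k for s, k in zip(science, sky)]

# Toy data: 36 flat sky spectra, six of them contaminated by extra flux.
fibers = [[1.0, 1.0, 1.0] for _ in range(30)] + [[5.0, 5.0, 5.0] for _ in range(6)]
sky = sky_spectrum(fibers)
print(sky)
print(subtract_sky([3.0, 4.0, 5.0], sky))
```

In the toy data the six contaminated fibers are excluded, so the sky level is set by the uncontaminated ones only.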
To measure reliable stellar kinematics, we first spatially bin the data with the adaptive Voronoi tessellation technique, as implemented by Cappellari & Copin (2003). At the cost of spatial resolution, we co-add spectra into (Voronoi) zones to reach a minimum S/N of 25 in each bin, after applying a minimum S/N cut of 4 for each spaxel. The binning process, however, is not straightforward. Spectra of grouped pixels are assumed to be uncorrelated, which is not true for our data. Due to the three-point dither-pattern of the PPAK observations, correlated errors appear during binning. In the most general case, each spaxel contains information from a number of different fibers (as each fiber contributes to more than just one pixel). Hence, the noise in adjacent spaxels is correlated and spatial covariances have to be taken into account during the S/N estimates of co-added spectra. A correction for the correlation of the S/N in the data has been applied by quantifying the ratio of the real error (directly estimated from residuals of full spectrum continuum fitting) to the analytically propagated error of binned spectra. The ratio can be characterised by a logarithmic function (Husemann et al. 2013):
$$\frac{\epsilon_{\rm real}}{\epsilon_{\rm bin}} = 1 + \alpha \log(n),$$
with α = 1.38 and n as the number of spaxels in each bin.
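A minimal sketch of how such a logarithmic correction would enter the S/N of a co-added bin is given below. The functional form 1 + α log10(n), the use of a base-10 logarithm and the function names are assumptions of this sketch; only α = 1.38 is taken from the text.

```python
import math

ALPHA = 1.38  # value quoted in the text

def noise_correction(n_spaxels, alpha=ALPHA):
    """Ratio of real to analytically propagated error for a bin of n spaxels.

    Assumes the form 1 + alpha * log10(n); both the functional form and the
    base-10 logarithm are assumptions of this sketch.
    """
    if n_spaxels < 1:
        raise ValueError("a bin needs at least one spaxel")
    return 1.0 + alpha * math.log10(n_spaxels)

def corrected_sn(signal, propagated_noise, n_spaxels):
    """S/N of a co-added bin after inflating the propagated noise."""
    return signal / (propagated_noise * noise_correction(n_spaxels))

for n in (1, 2, 10, 100):
    print(n, round(noise_correction(n), 2))
```

Under these assumptions a single spaxel needs no correction, whereas the propagated noise of a 10-spaxel bin would be underestimated by a factor of about 2.4.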
The ratio increases rapidly for small bins, indicating a high correlation between adjacent spaxels, and flattens out for spatially bigger bins, where the correlation between spaxels becomes weaker. Once the effect of noise correlation has been taken into account, we adopt the Indo-US stellar library with 328 spectral templates (Valdes et al. 2004). A non-negative linear combination of these templates is then convolved with a Gaussian line-of-sight velocity distribution (LOSVD) and fitted to each spectrum in the range of 3750-4550 Å (covering prominent stellar absorption line features such as the Balmer and Calcium H and K lines), while using additive Legendre polynomials of 15th order. In this way, we derive the mean line-of-sight velocity v, velocity dispersion σ and higher order Gauss-Hermite velocity moments h3 and h4 (which represent asymmetric and symmetric departures from a Gaussian LOSVD) per bin on the plane of the sky. Sky lines are masked beforehand and corresponding uncertainties of the kinematic moments are determined by means of 100 Monte Carlo simulations (see also Falcón-Barroso et al., in prep. for a full description of the CALIFA stellar kinematics pipeline).
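To make the role of h3 and h4 concrete, the following sketch evaluates a Gauss-Hermite LOSVD. It assumes the widely used parameterisation of van der Marel & Franx (1993); the function name and the numerical values are illustrative and are not measurements from the data cubes.

```python
import math

def gauss_hermite_losvd(v, v_mean, sigma, h3=0.0, h4=0.0):
    """Gauss-Hermite LOSVD evaluated at velocity v (same units as sigma)."""
    y = (v - v_mean) / sigma
    herm3 = (2.0 * math.sqrt(2.0) * y**3 - 3.0 * math.sqrt(2.0) * y) / math.sqrt(6.0)
    herm4 = (4.0 * y**4 - 12.0 * y**2 + 3.0) / math.sqrt(24.0)
    gauss = math.exp(-0.5 * y**2) / (sigma * math.sqrt(2.0 * math.pi))
    return gauss * (1.0 + h3 * herm3 + h4 * herm4)

# Illustrative values only (not measurements from the data cubes).
for v in (-300.0, 0.0, 300.0):
    print(v, gauss_hermite_losvd(v, 100.0, 250.0, h3=-0.05, h4=0.03))
```

With h3 = h4 = 0 the function reduces to a normalised Gaussian; a non-zero h3 skews the profile, while h4 changes the balance between its peak and wings symmetrically.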

HET
In addition to the medium-resolution wide-field PPAK data, HET long-slit kinematics along the major axis are in hand. Observations were carried out using the Marcario Low Resolution Spectrograph (LRS) (Hill et al. 1998). The LRS is a classical long-slit spectrograph with a slit length of 4 arcmin. We made use of the G2 grating and a slit width of 1 arcsec, covering a wavelength range of 4200-7400 Å. This configuration has a resolution of R = 1300, which corresponds to a spectral resolution of 4.8 Å FWHM (i.e. an instrumental velocity dispersion of σ = 108 km s⁻¹), at a pixel scale of 0.475 arcsec. Single exposures of 900 s have been taken for each galaxy, in good weather and seeing conditions of 1 arcsec, resulting in a total of 3 individual (apparent) major axis profiles each for MRK 1216 and NGC 1277.
The reduction of the HET data is accomplished by a dedicated and fully automated pipeline (van den Bosch et al. 2015), following a standard reduction practice of bad pixel and cosmic ray masking, overscan and bias subtraction, flat fielding correction and wavelength correction. From the reduced data, we then extract kinematic information by applying an updated version of the pPXF code (Cappellari & Emsellem 2004) with a set of 120 spectral templates from the MILES stellar library (Sánchez-Blázquez et al. 2006;Falcón-Barroso et al. 2011).
The PSF and positioning of our observations are crucial for an accurate determination of the modelling parameters, such as black hole mass and stellar mass-to-light ratio. We recover a reliable PSF in each data set of both galaxies by iteratively fitting a PSF convolved, reconstructed (slit) image to the MGE of the high-resolution HST data. The PSF in turn is expanded by multiple, round Gaussians and the (slit) images are well reproduced in all cases by a PSF with one or two components (Tab. 4).

MRK 1216
On December 5, 2011, we obtained PPAK data using the medium-resolution V1200 grating. The seeing was ∼ 1 arcsec. Two science exposures, 900 seconds each, were taken per pointing, resulting in a total exposure time of 1.5 hours on-source. Fig. 3 displays the two-dimensional line-of-sight kinematics. Reliable data are available out to a major-axis radius of 15 arcsec. The kinematic maps show fast and regular rotation with a maximum velocity of 182 km s⁻¹. The velocity dispersion has a very pronounced peak of about 338 km s⁻¹ in the centre, indicating a very high mass concentration and hinting at the presence of a SMBH. Superimposed are contours of constant surface brightness from the same data cube.
One of our three HET long-slit kinematic profiles along the apparent major axis is shown in Fig. 6. The velocity and velocity dispersion profiles are in agreement with the PPAK data, revealing a rotation with a maximum velocity of 219 km s⁻¹ and a peak in velocity dispersion of 345 km s⁻¹. Furthermore, we measure strong h3 moments that appear to be anti-correlated with v, as is commonly observed in axisymmetric galaxies.

NGC 1277
PPAK data of NGC 1277 have been obtained in the same run as data of MRK 1216. The observing strategy, setup as well as data processing and reduction are also identical, resulting in the kinematic maps in Fig. 4. Similar to MRK 1216, the kinematic data of NGC 1277 are limited to a radius of ∼ 15 arcsec. The maps reveal very fast rotation around the short axis, peaking at 276 km s⁻¹, and an extraordinarily flat rotation curve out to several effective radii. The peak in velocity dispersion is about 355 km s⁻¹ and hence considerably lower than the dispersion in the (three major axis) HET slits (415 km s⁻¹) (Fig. 7). Moreover, the central h4 measurements are also lower in the PPAK data cube. The difference, though, is largely attributable to the difference in spatial resolution between both data sets (see Section 6 for the reliability of the individual measurements and the recovery of the black hole mass). We observe the same anti-correlation between h3 and v, which is expected in the case of axial symmetry and reasonable given NGC 1277's apparent flatness and strong rotation around its short axis. We superimpose its contours of constant surface brightness from the same data cube, with slight irregularities due to extensive masking of nearby objects and the presence of faint fore- and background stars.

DYNAMICAL ANALYSIS
We introduce our orbit-based dynamical modelling technique, which simultaneously fits the observed line-of-sight kinematics and the photometry. We thereby constrain the intrinsic contributions of black hole, stars and dark matter to the overall mass budget and infer the orbital structure of both galaxies.

Schwarzschild's Method
Schwarzschild's orbit superposition method (Schwarzschild 1979) has proven to be a reliable technique to recover in great detail the dynamical and structural properties of galaxies. The basic idea behind this modelling approach is as simple as it is striking: the motion of astronomical objects, e.g. stars, is governed by the underlying gravitational potential, which in turn can be a sum of not only visible matter but also any non-visible components. This means that once a gravitational potential is assumed, a representative library of orbits can be calculated in that potential that thoroughly samples all integrals of motion (4 in the case of spherical systems and 3 in axisymmetric or triaxial configurations). By assigning weights to the orbits we can then compute their combined properties and compare them to present-day observables, which represent a snapshot of a certain gravitational and dynamical configuration. The implementation of Schwarzschild's method then probes a set of gravitational potentials and tests whether there is a steady-state superposition of orbits in that potential that matches the full LOSVD and the (intrinsic and projected) light/mass distribution.
A wealth of Schwarzschild codes exist, ranging from the modelling of spherical galaxies (e.g. Romanowsky et al. 2003) and axisymmetric galaxies (e.g. Cretton et al. 1999; Gebhardt et al. 2003; Valluri et al. 2004; Thomas et al. 2004; Chanamé et al. 2008) up to the modelling of triaxial systems (van den Bosch et al. 2008, vdB08 hereafter). In what follows, we make use of the triaxial implementation of Schwarzschild's method. This code represents a very flexible way not only to reproduce all available data but at the same time to recover the internal dynamical structure of galaxies, to constrain their intrinsic shapes (van den Bosch & van de Ven 2009), their SMBH masses (van den Bosch & de Zeeuw 2010), their (constant) mass-to-light ratios (Läsker et al. 2013), as well as their dark matter fractions and profiles (Weijmans et al. 2009). For a detailed overview of the working principles, we refer the reader to vdB08. Here, we confine ourselves to a brief description of the main steps:
(i) The implementation begins with a surface brightness distribution that has been parameterised with a set of Gaussians (see Section 2.2 and 2.3). Once a set of viewing angles is chosen, a de-projection of the surface brightness, corresponding to the surface mass density, can be carried out, which yields the intrinsic stellar mass distribution and hence the stellar gravitational potential of the galaxy. For a triaxial deprojection three viewing angles are needed to pin down the shape and orientation of the triaxial ellipsoid. In the axisymmetric case, on the other hand, the short- to long-axis ratio, i.e. the flattening (q), which is then directly related to the inclination (i), remains the only free parameter.
(ii) Within this potential, a representative library of orbits is calculated. In this work, the library consists of more than 7500 orbits (dithering excluded), given by the 9 starting points (in each of the radial and angular directions) at each of the 31 logarithmically sampled equipotential shells between 0.003 and 150 arcsec.
(iii) During orbit integration, intrinsic and projected quantities are stored and then PSF convolved for comparison with the data.
(iv) For a given potential, χ² statistics are used to find a non-negative, linear superposition of orbits that matches the set of kinematic and photometric observables. We recover the spatially binned 2D LOSVD in a least-squares sense by finding the optimal set of orbital contributions to the Gauss-Hermite moments (Gerhard 1993; van der Marel & Franx 1993; Rix et al. 1997). The orbital contributions, in turn, represent the mass in each orbit. To ensure self-consistency, they must be able to simultaneously reproduce the intrinsic and projected stellar masses, which are stored on a grid and given by the integration of the MGE model contribution over the respective area. The masses are constrained to an accuracy of 2 per cent, which reflects the usual uncertainties in the surface brightness parameterisation by the MGE, and are allowed to vary within these boundaries while finding the best-fitting kinematics.
(v) Steps (i) to (iv) are repeated for differing gravitational potentials, including the presence of a SMBH and dark matter.
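The non-negative superposition in step (iv) can be illustrated with a toy problem. The sketch below is not the solver used in the modelling code; it swaps in a plain projected gradient descent to minimise the squared residual under the constraint that all orbital weights stay non-negative. The function name, the toy matrix `A` (one column per orbit, one row per observable) and the data vector `b` are hypothetical.

```python
def nnls_projected_gradient(A, b, n_iter=5000):
    """Minimise ||A w - b||^2 subject to w >= 0 (illustrative solver only)."""
    n_obs, n_orb = len(A), len(A[0])
    # Step 1/L, where L = 2 * ||A||_F^2 bounds the Lipschitz constant of the gradient.
    step = 1.0 / (2.0 * sum(a * a for row in A for a in row))
    w = [0.0] * n_orb
    for _ in range(n_iter):
        resid = [sum(A[i][j] * w[j] for j in range(n_orb)) - b[i]
                 for i in range(n_obs)]
        grad = [2.0 * sum(A[i][j] * resid[i] for i in range(n_obs))
                for j in range(n_orb)]
        # Gradient step, then projection onto the non-negative orthant.
        w = [max(0.0, w[j] - step * grad[j]) for j in range(n_orb)]
    return w


# Hypothetical toy library: two orbits contributing to three observables.
A = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
b = [1.0, 2.0, 3.0]
weights = nnls_projected_gradient(A, b)

# A case where the unconstrained solution would need a negative weight.
clipped = nnls_projected_gradient([[1.0, 0.0], [0.0, 1.0]], [1.0, -1.0])
```

In the second case the weight of the second orbit is driven to zero rather than becoming negative, which is the behaviour the non-negativity constraint is meant to enforce.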
For multiple reasons we do not employ regularisation during the construction of the Schwarzschild models. First, we thereby make sure that our models are unbiased with respect to regularisation. Second and more importantly, it is not possible to accurately determine a priori the level of regularisation that is needed. Third, it has been shown that regularisation changes neither the values of the best-fitting parameters nor the orbital weights significantly, as long as reasonable values are chosen and an oversmoothing of the distribution function (DF) is prevented (Verolme et al. 2002; van den Bosch et al. 2008; van den Bosch & de Zeeuw 2010). And finally, while regularisation can be helpful in individual cases to find the set of orbital weights that best fits the velocity moments and to prevent the weights from varying rapidly, it decreases the degrees of freedom at the same time and leads to an artificial narrowing of the χ² contours and thus to smaller confidence intervals for the recovered parameters (but see Valluri et al. 2004, Thomas et al. 2005 and Morganti et al. 2013 for a more detailed discussion of the effects of regularisation in their individual models).
In the case of MRK 1216 and NGC 1277, the photometry and kinematics are consistent with oblate axial symmetry (see Section 6). In constructing dynamical models we will therefore restrict ourselves to an axisymmetric stellar system.

MRK 1216
We consider three gravitational sources: the central black hole mass M•, the stellar mass M⋆ (which is the deprojected, intrinsic luminosity density times the constant stellar mass-to-light ratio Υ), and a spherically symmetric dark matter component with an NFW profile (Navarro et al. 1996) with concentration cDM and total virial mass MDM = M200. In the case of MRK 1216, the final models thus probe a four-parameter space in log(M•/M⊙) ∈ [7, 11], ΥH ∈ [0.5, 3], cDM ∈ [5, 15] and log(fDM) = log(MDM/M⋆) ∈ [-9, 5]. The search in parameter space is mainly motivated by observational and theoretical constraints on: the stellar mass-to-light ratio for SSP models with a Kroupa and Salpeter IMF (Vazdekis et al. 1996); the black hole mass from predictions of the black hole scaling relations (Gültekin et al. 2009); and the dark halo parameters from investigations of Bullock et al. (2001), Moster et al. (2010) and Macciò et al. (2008) (see also Section 5.3).
By design, we do not explore the inclination space. As has been shown in Krajnović et al. (2005) and van den Bosch & van de Ven (2009), it is not possible to infer the inclination angle by means of two-dimensional line-of-sight stellar kinematics alone, unless kinematic features exist (e.g. kinematically decoupled cores) that put additional constraints on the intrinsic shape of galaxies. Even in the case of three-integral axisymmetric orbit-based models, different inclinations above the lower limit that is given by the photometry are able to fit the LOSVD almost equally well. However, we can further constrain the inclination by simple observational arguments. Although the flattest Gaussian in our MGE limits the minimum possible inclination for the projection in an oblate axisymmetric case (Section 2.2), the deprojection of Gaussians close to the lower inclination limit of 59° generates intrinsically flat galaxies with unphysical axis ratios of q = b/a ≲ 0.2. On the other hand, an edge-on configuration (even though possible) appears to be unlikely, too. MRK 1216 is much rounder than expected for a flat, oblate system that is observed at 90°, with an axis ratio that quickly converges to unity beyond 1 Re. We therefore choose an inclination of 70°, which lies between these two extreme cases. To assess the reliability and robustness of our results with respect to changes in the inclination, we also explore models with a close to edge-on configuration of 85° and find that our parameter constraints are affected by less than 5 per cent. The upper limit of the stellar mass-to-light ratio increases by ∼ 0.1 when increasing the inclination, which in turn leads to insignificant changes in the derived values for the black hole and dark halo parameters. In general, the results are very robust with respect to variations in the inclination.
Changes in the parameter estimation, and in particular in the stellar mass-to-light ratio, are only significant if a large range of inclinations is probed or the minimal observed axis ratio q is larger than 0.7, which translates to lower inclination limits of i ≲ 45° (see also Cappellari et al. 2006).
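The link between observed flattening and inclination can be made concrete with the standard oblate deprojection of a single Gaussian, q² = (q'² − cos² i) / sin² i, where q' is the observed and q the intrinsic axis ratio. The following is a minimal sketch under that assumption; the function names are hypothetical, and the observed axis ratio of the flattest Gaussian is not quoted here, so it is inferred from the 59° limit purely for illustration.

```python
import math


def min_inclination_deg(q_obs_min):
    """Lowest inclination (deg) that still allows an oblate deprojection: cos i = q'."""
    return math.degrees(math.acos(q_obs_min))


def intrinsic_axis_ratio(q_obs, incl_deg):
    """Intrinsic axis ratio q of an oblate Gaussian observed with axis ratio q_obs."""
    i = math.radians(incl_deg)
    q2 = (q_obs ** 2 - math.cos(i) ** 2) / math.sin(i) ** 2
    if q2 <= 0.0:
        raise ValueError("inclination too low for this observed axis ratio")
    return math.sqrt(q2)


# ASSUMPTION: flattest observed Gaussian implied by the quoted 59 deg limit.
q_flat = math.cos(math.radians(59.0))

q_near_limit = intrinsic_axis_ratio(q_flat, 60.0)  # very flat, close to the limit
q_adopted = intrinsic_axis_ratio(q_flat, 70.0)     # adopted inclination
q_edge_on = intrinsic_axis_ratio(q_flat, 85.0)     # close to edge-on test case
i_min_q07 = min_inclination_deg(0.7)               # limit for a minimal q' of 0.7
```

Under these assumptions the intrinsic flattening of the flattest Gaussian rises quickly once the inclination moves away from the lower limit, which is consistent with the argument for adopting an intermediate inclination.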
We bi-symmetrise the observed kinematics beforehand. Although the kinematics are already fairly symmetric, the symmetrisation reduces noise and systematic effects, which helps the recovery of the higher-order Gauss-Hermite moments in the models. The PA for the bi-symmetrisation is obtained from the weighted first and second moments of the intensity distribution in the PPAK data (PA_kin = 70.7°) and is in excellent agreement with the PA inferred from the high-resolution imaging (PA_phot = 70.2°).
We construct ∼ 200 000 models to constrain the best-fitting parameters as well as all parameters within a relative likelihood of three standard deviations. Figure 5 (top left) shows the enclosed mass distribution of MRK 1216, derived from our entire set of models. The solid lines represent the stellar mass (red), black hole mass (blue), dark matter content (green) and total mass (black) for the best fit.
The dashed lines indicate 3σ confidence intervals for one degree of freedom. Based on these models, we obtain a total stellar mass of log(M⋆/M⊙) = 11.3 (+0.1/-0.2), a black hole mass of log(M•/M⊙) = 9.4 (+0.6/-9.4) and a dark halo mass of log(MDM/M⊙) = 14.2 (+1.1/-2.2). Neither the black hole nor the DM halo parameters are very well constrained. The best-fitting dark halo dominates at radii larger than 15 arcsec (i.e. 7 kpc), which is at the edge of the extent of our kinematic data. Interestingly, models without any dark matter are not able to recover the observations and can be ruled out. Figure 5 (middle left) is a slice through the M• - ΥH plane, i.e. we plot every combination of black hole mass and stellar mass-to-light ratio, marginalised over the dark halo parameters cDM and fDM. As is already visible in the enclosed mass profile plot, the best-fitting black hole mass is log(M•/M⊙) = 9.4. While we obtain an upper limit of log(M•/M⊙) = 10.0, the black hole is unconstrained at the lower boundary of the grid, at log(M•/M⊙) = 7. We therefore carry out additional tests at the lower end of the parameter space (M•/M⊙ = 0), which show that the presence of a black hole is not required, as models with no black hole are able to match the data as well. The stellar mass-to-light ratio in the H-band spans a range of 1.0 - 2.3. The best-fitting model favours ΥH = 1.8. For comparison, stellar population synthesis (SPS) models with a single power-law Salpeter stellar initial mass function (IMF) (Vazdekis et al. 1996) predict a stellar mass-to-light ratio of 1.7 (assuming solar metallicity and an age of ∼ 13 Gyr).
The bottom left panel in Figure 5 is analogous to the middle panel and displays the goodness-of-fit contours for the dark matter fraction fDM and H-band stellar mass-to-light ratio ΥH, marginalised over all remaining parameters. We observe closed contours that clearly call for a non-negligible amount of dark matter. The halo concentration in the models is unconstrained and can adopt any value within the range that is probed. The best-fitting dark halo has a concentration of cDM = 10, a mass of log(MDM/M⊙) = 14.2 and a scale radius of rs = 110 kpc.
[Figure 5 caption, partially recovered] For NGC 1277, we also depict an earlier estimate of the black hole mass from vdB12. Middle: Confidence contours of black hole mass vs. stellar mass-to-light ratio. Bottom: Confidence contours of dark matter fraction vs. stellar mass-to-light ratio. The lines denote the 68.3 (white), 95 (grey) and 99.7 (black) per cent quantiles of a χ² distribution with two degrees of freedom. As a reference, we overplot stellar mass-to-light ratio predictions of SSP models with a (single) power-law Salpeter IMF (Vazdekis et al. 1996) in the respective bands.
The corresponding best-fitting Schwarzschild model kinematics of MRK 1216 are shown in Figure 6. The plots display fits to the first four kinematic moments of the PPAK data and of one of the three simultaneously fitted individual HET long-slits. Our models can accurately recover the kinematics of the PPAK and HET data, in particular the peak in the velocity dispersion profile and the flat and rapidly rotating velocity curve beyond 5 arcsec. For illustration purposes we add a "bad model" to the plots. The bad model was chosen by following the ridge of minimum χ² beyond the 3σ confidence level and displays the predicted kinematics for the best-fitting model without a dark halo. The best-fitting model (χ²_b) and the best-fitting model without dark matter (χ²_b, w/o DM) are separated by ∆χ² = 15, and the differences in the figures are barely distinguishable. Despite these similarities, we will show that dark matter is a necessary ingredient to successfully recover the observational constraints and is well in line with our current understanding of the stellar build-up and properties of elliptical galaxies (see Section 5.3).
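To put ∆χ² = 15 into context, the thresholds that correspond to the quoted confidence levels can be computed in closed form for one and two degrees of freedom. The sketch below uses only the standard library and assumes the usual ∆χ² interpretation of confidence regions; the function names are hypothetical.

```python
import math
from statistics import NormalDist


def delta_chi2_1dof(p):
    """Delta chi^2 enclosing probability p for one degree of freedom."""
    z = NormalDist().inv_cdf(0.5 * (1.0 + p))
    return z * z


def delta_chi2_2dof(p):
    """Delta chi^2 enclosing probability p for two degrees of freedom.

    For two degrees of freedom the chi^2 CDF is 1 - exp(-x / 2).
    """
    return -2.0 * math.log(1.0 - p)


levels = (0.683, 0.95, 0.997)
thresholds_1dof = [delta_chi2_1dof(p) for p in levels]
thresholds_2dof = [delta_chi2_2dof(p) for p in levels]
```

For both one and two degrees of freedom the 99.7 per cent threshold lies below 15, so a model offset by ∆χ² = 15 falls outside the outermost contour.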

NGC 1277
For the dynamical analysis of NGC 1277, our models explore the parameter ranges log(M•/M⊙) ∈ [7, 11], ΥV ∈ [2, 10], cDM ∈ [5, 15] and log(fDM) = log(MDM/M⋆) ∈ [-9, 5], at a fixed inclination of i = 75°. In contrast to vdB12, the freedom of the models is further restrained by fitting the PPAK and HET data at the same time. As in the case of MRK 1216, we present the mass distribution, the black hole mass vs. stellar mass-to-light ratio and the dark matter fraction vs. stellar mass-to-light ratio plots, which outline the limits for the individual parameters. Figure 5 (top right) shows the enclosed mass profile with a total stellar mass of log(M⋆/M⊙) = 11.1 (+0.1/-0.1), a black hole mass of log(M•/M⊙) = 10.1 (+0.1/-0.2) and a dark halo mass of log(MDM/M⊙) = 12.6 (+1.9/-12.6), at a significance of 3σ. The kinematic data of NGC 1277 show the same problems as the data of MRK 1216, leading to poor constraints on the dark halo parameters. The vertical dotted line displays the extent of our kinematic information and illustrates the inability to constrain the dark matter halo, which becomes dominant only at larger radius for the best-fitting model. Here, the presence of dark matter is not necessary to fit the observed velocity moments.
We present contours of χ² as a function of M• and ΥV in Figure 5 (middle right). Despite the low resolution data, the black hole mass is well constrained (see also Sec. 5.2.1 and 6). We obtain an upper limit of log(M•/M⊙) = 10.2 and a lower limit of log(M•/M⊙) = 9.9. The best-fitting V-band stellar mass-to-light ratio is ΥV = 6.5 (+1.5/-1.5) and consistent with the mass-to-light ratio that is predicted from spectral synthesis fits (Vazdekis et al. 2010; Ricciardelli et al. 2012; Vazdekis et al. 2012) of NGC 1277's deep, optical long-slit observations (T14), assuming a single power-law Salpeter IMF (see Section 6). Models with a black hole mass of log(M•/M⊙) ∼ 9 -as suggested by the scaling relation between black hole
The confidence intervals for the dark halo are shown in the bottom right panel of Figure 5. In contrast to MRK 1216, the 99.7 per cent contours cannot rule out models without dark matter. The best-fitting NFW profile has a concentration of cDM = 10, a dark halo mass of log(MDM/M⊙) = 12.6 and a scale radius of rs = 33 kpc.
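The quoted NFW scale radii follow from the halo mass and concentration once a cosmology is fixed. The sketch below assumes that M200 is defined with respect to 200 times the critical density and adopts a Hubble constant of 70 km s⁻¹ Mpc⁻¹; both are assumptions made for illustration, as is the function name.

```python
import math

G = 4.30091e-6      # gravitational constant in kpc (km/s)^2 / Msun
H0 = 70.0 / 1000.0  # ASSUMED Hubble constant (70 km/s/Mpc) in km/s/kpc


def nfw_scale_radius_kpc(log_m200, concentration, overdensity=200.0):
    """Scale radius r_s = r_200 / c of an NFW halo, in kpc."""
    rho_crit = 3.0 * H0 ** 2 / (8.0 * math.pi * G)  # critical density, Msun / kpc^3
    m200 = 10.0 ** log_m200
    r200 = (3.0 * m200 / (4.0 * math.pi * overdensity * rho_crit)) ** (1.0 / 3.0)
    return r200 / concentration


rs_mrk1216 = nfw_scale_radius_kpc(14.2, 10.0)  # best-fitting halo of MRK 1216
rs_ngc1277 = nfw_scale_radius_kpc(12.6, 10.0)  # best-fitting halo of NGC 1277
```

With these assumptions the two scale radii come out close to the values of 110 kpc and 33 kpc quoted for the best-fitting haloes; small differences are expected from the exact cosmology adopted.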
In Fig. 7, we show the bi-symmetrised PPAK kinematics, one of the three individual but simultaneously fitted HET long-slit kinematics, the best-fitting model and a "bad model" of NGC 1277. Note that one of the three long-slit kinematics is identical to the data presented in vdB12. In this case, the PPAK velocities and dispersions are fitted exceptionally well. Problems arise in fitting the HET velocity dispersion, as the peak in the HET data differs by ∼ 60 km s⁻¹ from the peak in the PPAK cube (Section 3.4). The best fit predicts a slightly lower central dispersion in NGC 1277, which matches the PPAK data but is in contrast to the HET observations (see Section 6). We also emphasise that even though the h4 values of the "good model" are slightly off along the minor axis, they are still within the measurement errors. For the illustration of a bad model we explicitly chose a model with a higher mass-to-light ratio and a black hole mass that is about a factor of 2 smaller than the lower limit. The ∆χ² of this fit is ∼ 50 and well beyond the 3σ boundary. The differences between observed and modelled kinematics are most pronounced in the second and fourth columns, where the bad model clearly fails to fit σ and h4 in the centre. The deviation in h4 along the minor axis is also much stronger. In contrast to the good model, the bad model fails to reproduce the kinematics within the measurement errors, by underestimating the number of stars with line-of-sight velocities close to the average velocity.
Overall, our models are in good agreement with the results of vdB12, with tighter constraints especially on the lower end of stellar mass-to-light ratios and hence a slightly decreased upper limit of the black hole mass. This effect is mainly driven by lower estimates of the dark halo content that is constrained by the wide-field IFU data and higher stellar mass-to-light ratios which then propagate towards the centre.

DISCUSSION
We summarise the findings and take a closer look at the results of our orbit-based dynamical models, their orbital structures and how they compare to the photometric analysis. In addition, we place the black hole masses back into the scaling relations, discuss the significance of our dark halo detections and finish with a concluding remark concerning the origin and evolutionary history of both galaxies.

Orbital Structure
Apart from inferring the mass distribution, orbit-based dynamical models also allow a detailed probe of the orbital structure of galaxies. We can not only inspect the amount of mass that is assigned to each particular orbit, or orbit family in general, but also quantify the system's degree of anisotropy, which holds important clues about the processes that shaped its evolution (Bender et al. 1992). The anisotropy profiles of early-type galaxies have been investigated extensively. While data and techniques differ, ranging from long-slit observations and spherical models (Kronawitter et al. 2000; Gerhard et al. 2001) to more general axisymmetric models (Gebhardt et al. 2003) that make use of the full 2D spectral information, there is a common agreement, namely that luminous, round and slowly rotating early-type galaxies are almost isotropic whereas oblate, fast-rotating galaxies span a large range of anisotropy profiles. The orbital structures in the dynamical models, though, have not been linked to the many and varied components that are observed via photometric decompositions of high resolution imaging, which also provide an independent record of a galaxy's evolutionary history. In the first part of this subsection we aim to provide this link by mapping the components in phase space to the multi-Sérsic components in Section 2.2 and 2.3. In the second part, we then present a direct comparison between the orbital distribution of the two compact objects in this work and a more general and representative sample of early-type galaxies.
In Figure 8 and 9, we show the orbital mass weights as a function of average radius (r̄) and spin (λz = J̄z/(r̄ × σ̄)), where J̄z is the average specific angular momentum of the orbits along the short z-axis and σ̄ their average dispersion. We further examine the orbital structures by inspecting the ratio of radial to tangential velocity dispersion (σr/σt = √[2σr²/(σφ² + σθ²)]) and the occupation fractions of the individual orbit families (de Zeeuw 1985; Statler 1987; de Zeeuw & Franx 1991). The averages are time averages per single orbit, which the Schwarzschild models keep track of.
Deciphering the mass distribution among the orbits as a function of angular momentum will provide the necessary link to the photometric components. Hitherto, only two other galaxies have been investigated in a similar manner. Walsh et al. (2012) presented the S0 galaxy NGC 3998, which showed a very clear non-rotating bulge and a non-maximal rotating disk. Lyubenova et al. (2013) presented the E5 galaxy FCC 277 with a nuclear star cluster, which showed both a pro- and retrograde disk and a non-rotating component. A direct comparison to the photometric structures, though, was not within the scope of those investigations and has therefore been omitted so far.
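The two orbit diagnostics can be written down compactly. The sketch below assumes that the tangential dispersion is defined through σt² = (σφ² + σθ²)/2, so that an isotropic orbit gives σr/σt = 1, and it uses the |λz| = 0.1 boundary from the figure captions to separate non-rotating from pro- and retrograde orbits. All function names and input values are hypothetical.

```python
import math


def spin_parameter(jz_mean, r_mean, sigma_mean):
    """lambda_z from time-averaged J_z (kpc km/s), radius (kpc) and dispersion (km/s)."""
    return jz_mean / (r_mean * sigma_mean)


def radial_to_tangential(sig_r, sig_phi, sig_theta):
    """sigma_r / sigma_t, ASSUMING sigma_t^2 = (sigma_phi^2 + sigma_theta^2) / 2."""
    return math.sqrt(2.0 * sig_r ** 2 / (sig_phi ** 2 + sig_theta ** 2))


def classify_orbit(lam_z, threshold=0.1):
    """Label an orbit by the sign and size of its spin parameter."""
    if lam_z > threshold:
        return "prograde"
    if lam_z < -threshold:
        return "retrograde"
    return "non-rotating"


# Hypothetical orbit: J_z = 200 kpc km/s at r = 2 kpc with sigma = 200 km/s.
lam = spin_parameter(200.0, 2.0, 200.0)
label = classify_orbit(lam)
ratio_isotropic = radial_to_tangential(100.0, 100.0, 100.0)
```

Values of σr/σt above unity indicate radial anisotropy and values below unity tangential anisotropy, which is how the profiles in the right-hand panels of the orbital structure figures are read.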
In what follows, we present the orbital configuration only for the best-fitting models, but the general trend is preserved for most models that are within the statistical 3σ uncertainties.

MRK 1216
MRK 1216 rotates rapidly (∼ 220 km s⁻¹) around the short axis. It is thus a fast-rotating, oblate early-type galaxy. The two-component photometric decomposition contains a small, flattened and massive bulge and an outer exponential envelope, with a bulge-to-total ratio of B/T = 0.69. However, the dynamical decomposition is not as straightforward as matching directly to these two Sérsic components: there is a large, extended, rotating structure (λz = 0.0 − 0.5) beyond 10 arcsec; two more centrally located, moderately rotating structures (λz = 0.1 − 0.5) between 1 and 5 arcsec; an inner (8 arcsec) rapidly rotating (λz ∼ 0.7) and an outer (20 arcsec) maximally rotating component (λz ∼ 1); as well as some mass in two counter-rotating structures and one outer (10 arcsec) non-rotating structure (Fig. 8, left panel). The orbital structure is devoid of a massive, non-rotating component that harbours a major fraction of the stellar mass, and hence we conclude that MRK 1216 does not contain a classical, non-rotating bulge. If anything, we identify the moderately rotating component at ∼ 1 arcsec as the bulge, with a dynamical B/T of 13 per cent, which however is in sharp contrast to the photometric B/T of the two-component decomposition.
Remarkably, the multiple component photometric fit of MRK 1216 with four Sérsics (Section 2.2) is in much better agreement with the dynamical decomposition. The outer three photometric components can be mapped onto the 3 mildly rotating distinct components in the orbital configuration at ∼ 1, 4 and 15 arcsec. All these photometric components have low (n ∼ 1) Sérsic indices that are normally associated with an exponential disk, except for the outermost component that resembles an envelope with close to zero ellipticity. Adopting Sérsic component number two as the bulge gives a photometric B/T of 12 per cent (Sec. 2.2), which is in very good agreement with the dynamical B/T of 13 per cent. Moreover, the component around 15 arcsec carries a mass of 30 per cent and is almost as massive as the large, outer component in the photometry with a contribution of 35 per cent to the total mass. The most notable difference is between the third and most massive Sérsic in the photometry and the dynamical substructure at 4 arcsec, which ought to contribute 45 per cent of the total stellar mass but only constitutes 20 per cent in the orbital configuration. We also note that the innermost photometric component is too small (0.4 arcsec) to be resolved by the dynamics, while the rapidly rotating structure (λz ∼ 0.7) at 8 arcsec has no photometric counterpart at all. Finally, the maximally rotating structure (λz ∼ 1) at 20 arcsec is beyond the reach of our kinematic data and merely a result of an extrapolation, but is expected to correspond to a disky component, whereas the photometry at those radii is dominated by the round outer halo.
[Figure 8 caption, partially recovered] Top left: Mass distribution along all orbits as a function of angular momentum along the short z-axis and radius. Mass located above (below) the red (blue) line is rotating strongly prograde (retrograde). Bottom left: Local mass fraction as a function of average radius, divided into a non-rotating bulge-like (|λz| < 0.1), prograde rotating disk-like (λz > 0.1) and retrograde rotating (λz < -0.1) component. This classification is based on the net angular momentum of the orbits along the z-axis. The long tick marks denote the effective radius of the photometric components in the multi-Sérsic fit. Top right: Profile of radial vs. tangential velocity dispersion. Bottom right: Mass fraction per orbit type as a function of radius. The vertical solid line denotes the predicted black hole sphere of influence. The orbital distribution below 0.4 arcsec is not resolved by the data and merely extrapolated by the models.
MRK 1216's radial anisotropy profile is simple and almost isotropic, with oscillations of only 30 per cent around σr/σt = 1 (Fig. 8, right panel). The largest deviation from isotropy appears within the central 1 arcsec, which is attributable to the strong gravitational perturbation of the axisymmetric potential due to the presence of a black hole and hence the requirement of a non-negligible amount of stars in box orbits. Most of the mass resides in short-axis tube orbits whereas long-axis tube orbits, which are vital orbit types for triaxial and prolate systems (

NGC 1277
NGC 1277 is also a flat, fast-rotating (∼ 300 km s⁻¹) early-type galaxy and, as expected, our best-fitting dynamical model reveals a simple structure where most of its stars (80 per cent) reside in strongly rotating orbits between 1 and 20 arcsec (Fig. 9, left panel). We distinguish at least three individual components in phase space: a highly rotating one (λz = 0.7 − 1) between 5 and 10 arcsec, containing 65 per cent of all stars; one moderately rotating and very extended component (λz = 0.1 − 0.5) between 0.5 and 3 arcsec, which contains 23 per cent of the stellar mass; and a centrally located non-rotating one at 2 arcsec, which harbours 3 per cent of all stars. The remaining 9 per cent are distributed among tiny substructures at various positions. The lack of a massive, non-rotating (λz = 0) component in the dynamical decomposition suggests that this galaxy does not contain a pressure supported bulge. Moreover, due to the absence of a distinct, central, non-rotating component, this result does not match what we would anticipate from the photometric two-component decomposition (Section 2.3), which has a bulge and an exponential disk with B/T = 0.75. The massive bulge in the two-component fit is located around ∼ 3 arcsec, where most of the mass in the orbital decomposition resides in rotating structures. While the massive bulge in the photometry could indeed be mildly rotating, similar to our identification of the dynamical bulge in the orbital structure of MRK 1216, the mass fractions at these radii are simply at odds and disfavour the simple two-component decomposition.
The orbital structure is also not consistent with the 1D photometric analysis of NGC 1277 from Kormendy & Ho (2013), as they connected their inner flattened bulge with the outer round halo (> 20 arcsec) as a single component, which is more luminous than the disk in their analysis (B/T = 0.55). In the dynamical decomposition, however, these two components (i.e. bulge and outer halo) do not appear to be connected and most of the mass resides in the extended flat and rapidly spinning component (λz ≳ 0.7).
In contrast, the overall dynamical structure has an intriguing resemblance to the multi-component Sérsic fit, which hints at an inner, exponential disk (n ∼ 1) with a small contribution to the overall stellar mass (24 per cent), and most of the mass (53 per cent) settled in a flat, outer disk-like component that resembles the extended highly rotating structure in our orbital decomposition. Even though the low Sérsic index of the second innermost component in the photometric fit is usually associated with a disky component, we cannot rule out the existence of a mildly rotating, flattened spheroidal component. Given the match in the orbital and photometric decomposition, we therefore adopt this component as a lower limit to the bulge, with B/T = 0.24 (see Sec. 2.3 and 5.2.2). The innermost photometric component (0.5 arcsec) is not resolved by our observations and, unfortunately, our stellar kinematics do not reach out to large enough radii to determine the dynamical structure of the outer halo that is expected at 15 arcsec and beyond. However, these substructures contain only a minor fraction (23 per cent) of the total stellar mass of this object.
Considering that the dynamical models of NGC 1277 are also axisymmetric, with deviations from axial symmetry close to the black hole, the resemblance between both galaxies in the mass weights of the different orbit types is no surprise. Unlike MRK 1216, however, NGC 1277 is mildly radially anisotropic in the immediate vicinity of the black hole and becomes strongly tangentially anisotropic beyond (Fig. 9, right panel).
Dynamical orbit-based decompositions are a good tool for unravelling components in phase space. In this work, we could trace back the spinning orbital components of MRK 1216 and NGC 1277 to the flattened, low-Sérsic components in the photometry, which are commonly associated with rotating structures. Moreover, our models show that the two compact galaxies are rotationally supported while both the orbital and photometric structures indicate the lack of a central pressure supported, massive spheroidal component. A small, mildly rotating bulge could be present, given the match in mass and location of the second component in the photometric multi-Sérsic fits and the dynamical substructures. However, taking into account their flattening, low Sérsic index, small mass fraction and rotational support, these structures more likely correspond to rotating disks (maybe even thick disks), and can only be considered as a modest lower limit of a putative bulge (see e.g. Sec. 5.2.2).
It is also worth noting that although the models rely on the more general MGE, which is completely independent from the Sérsic fits and devoid of any physical interpretation, the orbital substructures bear no resemblance to it. The location and mass weights of the orbital structures do not match the individual Gaussians, which is most pronounced in Fig. 9 where the overall structure is clearly comprised of less than ten components.
Nevertheless, more decompositions are needed to get a better understanding. For instance, drawing boundaries between the substructures in phase space is not trivial, while the photometric models are often plagued by strong degeneracies between the individual components in the fit. In our case, the issue of connecting photometric and dynamical structures is most prominent in MRK 1216, where the mismatch in mass between the most massive photometric component in the multi-Sérsic fit and the mildly rotating orbital component at 4 is worrisome, while some minor components in the dynamical structure do not seem to have a photometric counterpart at all. Given that our Schwarzschild method is capable of recovering the distribution function and hence the rich, internal dynamical structure of ETGs , this might hint at difficulties in a) recovering the stellar build-up by multi-component Sérsic fits to the photometry and/or in b) simply linking flat, (high-) low-Sérsic components with (non-)rotating dynamical structures. Currently, there is no a priori definition of what range of λz values can be associated with these structures. The connection between the individual dynamical and photometric components here is therefore driven by the agreement between both, and our intuitive understanding of associating low |λz| components with (non-)rotating bulges and high |λz| components with highly rotating disks.
Surely, modelling limitations, such as the assumption of axisymmetry (but see also Sec. 6), will also have a non-negligible effect on the recovery of the internal dynamics and their subsequent interpretation. More tests are therefore necessary, ideally on mock galaxy kinematics of purely rotationally or pressure-supported dynamical systems, to assess the robustness of our approach. This, in turn, will yield valuable information regarding the reliability and physical interpretation of photometric decompositions.

Classification And Comparison
In Emsellem et al. (2007, 2011), early-type galaxies were separated into two classes of systems based on their specific stellar angular momentum. Fast rotators reveal a high specific angular momentum, comprise the majority of early-type galaxies, are close to axisymmetric in most cases and span a large range of anisotropy profiles, in contrast to slow rotators which appear to be nearly isotropic. MRK 1216 and NGC 1277 are both fast rotators, as is expected from their rapid rotation around the apparent short axis, with a specific angular momentum λR of 0.34 (0.41) and 0.25 (0.53) within one (three) effective radii.
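As an illustration of how such a value is obtained, the following is a minimal sketch of the λR computation from spatially binned kinematics, assuming the definition of Emsellem et al. (2007), λR = Σ F R |V| / Σ F R (V² + σ²)^1/2, where F, R, V and σ are the flux, projected radius, mean velocity and velocity dispersion of each bin. The function name and the toy bin values are illustrative assumptions and are not taken from our PPAK data.

```python
import math

def lambda_r(flux, radius, velocity, dispersion):
    """Flux-weighted lambda_R summed over spatial bins (Emsellem et al. 2007 definition)."""
    numerator = 0.0
    denominator = 0.0
    for f, r, v, s in zip(flux, radius, velocity, dispersion):
        numerator += f * r * abs(v)
        denominator += f * r * math.sqrt(v * v + s * s)
    return numerator / denominator

# Toy bins (hypothetical values): flux, radius [arcsec], V and sigma [km/s]
flux = [10.0, 6.0, 3.0, 1.0]
radius = [0.5, 2.0, 5.0, 9.0]
velocity = [20.0, 120.0, 180.0, 200.0]
dispersion = [330.0, 250.0, 180.0, 150.0]

print(f"lambda_R = {lambda_r(flux, radius, velocity, dispersion):.2f}")
```

In practice the sums would be restricted to the bins inside the aperture of interest, e.g. one or three effective radii.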
To facilitate a comparison between the dynamical structure of the two compact galaxies in this work and a more general and representative sample of galaxies, such as presented within the SAURON framework, we follow the procedure and notation in Cappellari et al. (2007) and show the relation between the global anisotropy parameter δ = 1 − Πzz/Πxx (Binney & Tremaine 1987) and the anisotropy parameter βz = 1 − Πzz/ΠRR, which describes the shape of the velocity dispersion tensor in the meridional plane. The values in this work have been measured within 3 Re, i.e. ∼ 6 kpc and 3.5 kpc for MRK 1216 and NGC 1277 respectively. The measurements are based on a larger relative scale, in contrast to the SAURON sample which usually covers the kinematics only out to ∼ 1 Re, but corresponds much better to the SAURON measurements of typically larger ETGs in an absolute sense.
We confirm the picture of diverse anisotropy profiles of fast-rotating systems in Fig. 10. Here, MRK 1216 is located in a region that is populated by the bulk of fast-rotating galaxies in the SAURON sample. It is only slightly tangentially anisotropic in the φ − r plane, which leads to the conclusion that most of its anisotropy can be traced back to a flattening of the velocity dispersion tensor in the meridional plane. While MRK 1216 follows the trend presented in Cappellari et al. (2007), that fast-rotating early-type galaxies are mainly flattened oblate systems, NGC 1277 is an outlier in every aspect and appears to belong (kinematically) to a totally different class of objects. It is flattened in z-direction, but also shows a substantial amount of tangential anisotropy in the plane orthogonal to the symmetry axis, which is necessary to account for the high and extended amplitude in rotational velocity. In the SAURON sample only one galaxy, NGC 4550 (βz = 0.43 and δ = 0.60), is highly dominated by tangential dispersion. In contrast to NGC 1277, though, NGC 4550 consists of two massive counter-rotating disks.
Figure 10. Anisotropy in the meridional plane (βz) vs. global anisotropy (δ) of MRK 1216 (blue) and NGC 1277 (red), measured by our orbit-based dynamical models of the wide-field PPAK IFU and long-slit HET data within 3 Re. MRK 1216 follows the bulk of axisymmetric, fast-rotating ETGs in the SAURON sample (green), with a flattened velocity dispersion tensor in z-direction. NGC 1277 exhibits a distinct kinematic structure. Besides a flattening in the meridional plane, NGC 1277 is highly tangentially anisotropic in the plane orthogonal to the symmetry axis.
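The link between the two anisotropy parameters can be made explicit with a short numerical sketch. Assuming an axisymmetric system, so that Πxx = Πyy = (ΠRR + Πφφ)/2 with Πφφ containing only the random motions, and introducing the tangential anisotropy γ = 1 − Πφφ/ΠRR (a symbol not used elsewhere in this section), one obtains δ = (2βz − γ)/(2 − γ). Hence δ = βz for γ = 0, and δ > βz signals tangential anisotropy (γ < 0). The function names are illustrative; applied to the NGC 4550 values quoted above, the inversion gives γ ≈ −0.85.

```python
def delta_from_beta_gamma(beta_z, gamma):
    """Global anisotropy delta from beta_z and gamma for an axisymmetric system."""
    return (2.0 * beta_z - gamma) / (2.0 - gamma)

def gamma_from_beta_delta(beta_z, delta):
    """Invert the relation above to obtain the tangential anisotropy gamma."""
    return 2.0 * (beta_z - delta) / (1.0 - delta)

def anisotropies_from_tensor(pi_rr, pi_phiphi, pi_zz):
    """Return (beta_z, gamma, delta) from the diagonal tensor components."""
    pi_xx = 0.5 * (pi_rr + pi_phiphi)  # axisymmetry: Pi_xx = Pi_yy
    beta_z = 1.0 - pi_zz / pi_rr
    gamma = 1.0 - pi_phiphi / pi_rr
    delta = 1.0 - pi_zz / pi_xx
    return beta_z, gamma, delta

# NGC 4550 values quoted in the text: beta_z = 0.43, delta = 0.60
gamma_4550 = gamma_from_beta_delta(0.43, 0.60)
print(f"NGC 4550: gamma = {gamma_4550:.2f}")  # negative gamma = tangential anisotropy
```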
The difference between the two compact galaxies in our sample is not only a difference of orbital structure but also of sheer size (see Table 1), with MRK 1216 being almost twice as large as NGC 1277. Taking into account the similarity between MRK 1216 and the SAURON galaxies, this may indicate that MRK 1216 has already entered a path of becoming a "regular", fast-rotating elliptical whereas NGC 1277 is still in its infancy.

Masses
A credible determination of M• requires a very thorough analysis. Ideally, this is done by dynamical modelling of high spatial resolution data that can resolve the black hole sphere of influence (RSOI = GM•/σ²), i.e. the region where the gravitational pull of the black hole dominates. However, even state-of-the-art adaptive optics can resolve RSOI only for a limited number of galaxies, unless the black hole is either very nearby or very massive. The sphere of influence of NGC 1277 is about 1.6 arcsec, as measured from the best-fitting black hole mass of log(M•/M⊙) = 10.1 and the effective velocity dispersion in the PPAK data (Table 1), and hence at the edge of being resolved by the HET kinematics. Yet, this measurement of the sphere of influence is based on the assumption that the stellar density is well approximated by an isothermal sphere, and changes drastically if we adopt a more conservative estimate based on the region where the enclosed stellar mass equals the black hole mass, which yields RSOI = 0.9 arcsec (Fig. 5). Moreover, even if the sphere of influence is resolved, our measurements still rely on the seeing-limited dispersion peak and h4 values within the central 1 arcsec, and caution should be exercised regarding the black hole mass reliability in NGC 1277 (but see also Section 6 for a more in-depth discussion of the black hole mass).
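The isothermal estimate RSOI = GM•/σ² is straightforward to evaluate. The sketch below is only an illustration: the velocity dispersion and distance are assumed placeholder values (the measured dispersion is listed in Table 1, which is not reproduced here), and the function names are hypothetical. It shows the conversion from a black hole mass and a dispersion to a physical and an angular radius under the small-angle approximation.

```python
import math

G_KPC = 4.30091e-6  # gravitational constant in kpc (km/s)^2 / Msun

def r_soi_kpc(mbh_msun, sigma_kms):
    """Sphere-of-influence radius R_SOI = G M / sigma^2 in kpc."""
    return G_KPC * mbh_msun / sigma_kms ** 2

def kpc_to_arcsec(r_kpc, distance_mpc):
    """Small-angle conversion of a physical radius to arcsec."""
    kpc_per_arcsec = distance_mpc * 1.0e3 * math.pi / 648000.0
    return r_kpc / kpc_per_arcsec

mbh = 10 ** 10.1          # best-fitting black hole mass of NGC 1277 [Msun]
sigma_assumed = 330.0     # ASSUMED dispersion [km/s]; the measured value is in Table 1
distance_assumed = 71.0   # ASSUMED distance [Mpc]; not quoted in this section

r_kpc = r_soi_kpc(mbh, sigma_assumed)
r_arcsec = kpc_to_arcsec(r_kpc, distance_assumed)
print(f"R_SOI = {r_kpc:.2f} kpc = {r_arcsec:.1f} arcsec")
```

Because RSOI scales as σ⁻², a 10 per cent change in the adopted dispersion shifts the radius by about 20 per cent, which is one reason why the alternative, enclosed-mass definition can give a noticeably different answer.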
Nevertheless, the gravitational, and consequently the dynamical, influence of the black hole is clearly imprinted in the observed velocity moments. The rapid rise and distinct peak in the velocity dispersion profile as well as the positive values in h4 indicate a strong mass excess within the central arcseconds. The high h4 values imply an LOSVD with heavy tails and a considerable amount of rapidly rotating stars in the very centre, and these features are, as far as the models are concerned, best reproduced with a central black hole mass of log(M•/M⊙) = 10.1. Models with an "ordinary" SMBH of log(M•/M⊙) ∼ 9, as suggested by the M• − LBulge relation, are not able to recover the photometric and kinematic properties, as they fail to fit the dispersion profile and/or the fourth Gauss-Hermite moment. In particular, the robustness of the h4 measurement eliminates the possibility of a more moderate black hole mass in favour of a higher mass-to-light ratio, as illustrated in Fig. 7.
The same, however, cannot be said for MRK 1216. Although the best-fitting model favours an over-massive SMBH, the total absence of a black hole cannot be ruled out. Models with and without a black hole provide an almost equally good fit to the kinematics and thus are not able to discern between the various black hole mass scenarios, which is why (for the time being) our measurements can only be regarded as an upper limit. Upcoming high spatial resolution spectroscopic observations with NIFS (PI: Walsh) that resolve the sphere of influence will be able to tell the difference and show whether or not MRK 1216 follows the trend of NGC 1277.

Scaling Relations
We place both black hole masses back onto the M• − LBulge relation (Fig. 11). To this end, we utilise the compilation of Sani et al. (2011) with bulge-disk decompositions of 57 galaxies, based on Spitzer/IRAC 3.6 µm mid-infrared imaging. Mid-infrared data have the advantage not only of being less susceptible to dust extinction but also of being a better tracer of the underlying stellar mass. To this sample, we add 10 black hole masses with 2MASS K-band luminosities and bulge-to-total ratios if available: five from disk galaxies, as presented in Kuo et al. (2011) and Greene et al. (2010); two in brightest cluster galaxies (BCG), published in McConnell et al. (2011); one from a low-luminosity elliptical (Kormendy et al. 1997); one from a high velocity dispersion lenticular galaxy (Rusli et al. 2011) and one from a recent merger galaxy (Kormendy et al. 2009), investigated in Gültekin et al. (2011). The solid line in Figure 11 represents the black hole mass-bulge luminosity relation (M• − LBulge) derived in Sani et al. (2011), based on their bulge-disk decompositions of literature black hole host galaxies and a linear regression fit to the data. The blue and red error bars mark our findings for MRK 1216's and NGC 1277's black hole mass with a statistical uncertainty of 3σ. Their K-band bulge luminosities are based on the photometric decompositions in Section 2.2 and 2.3 of the HST H- and V-band images (using the second innermost component of the multi-Sérsic fit as a lower limit to the bulge luminosity while the bulge in the two-component Sérsic fit serves as an upper limit) and their total 2MASS K-band luminosities. The figure illustrates the exceptional position of NGC 1277. The best-fitting black hole mass remains a significant outlier from this relation and the 3σ lower bound, log(M•/M⊙) = 9.9, still overshoots the upper 99.7 per cent confidence envelope of the relation by at least one order of magnitude. Similarly, and given the difficulties in identifying a bulge in both the dynamical and photometric decompositions (Sec. 2.3 and 5.1.2), NGC 1277 strongly deviates from the black hole mass-total luminosity relation in K-band (Läsker et al. 2014), where the lower 3σ limit of the black hole mass is marginally consistent with the upper 3σ bound of this relation.
Figure 11. M• − L3.6,Bulge plot from Sani et al. (2011), with Spitzer/IRAC 3.6 µm bulge-disk decompositions and dynamical black hole mass measurements (including 1σ errors) for 57 galaxies. The red and blue error bars represent 3σ statistical uncertainties of NGC 1277's and MRK 1216's black hole mass. The lower limit for the black hole mass in MRK 1216 is consistent with no black hole. For the K-band bulge luminosities we adopt their total 2MASS K-band luminosities and the bulge-to-total ratios from our photometric multi-component decompositions of the HST V- and H-band images.
Interestingly, the black hole measurement in NGC 1277 is consistent with estimates of the scatter in the M• − LBulge and M• − σ relations in the optical (Gültekin et al. 2009; Kormendy & Ho 2013). The best-fitting black hole mass is an outlier by a factor of ∼ 8 (4) with respect to the mean predicted black hole mass in the M• − LBulge (M• − σ) relation, but still within a 3σ (2σ) confidence if the intrinsic/cosmic scatter of 0.44 (0.38) dex is taken into account.
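The statement above amounts to converting a multiplicative offset into dex and comparing it with the intrinsic scatter. A minimal sketch of this arithmetic, using only the numbers quoted in the text and ignoring measurement uncertainties (the function name is illustrative), gives offsets of roughly 2.1 and 1.6 times the intrinsic scatter for the two relations:

```python
import math

def offset_in_sigma(factor, intrinsic_scatter_dex):
    """Offset from the mean relation in dex and in units of the intrinsic scatter."""
    offset_dex = math.log10(factor)
    return offset_dex, offset_dex / intrinsic_scatter_dex

for name, factor, scatter in [("M-L_bulge", 8.0, 0.44), ("M-sigma", 4.0, 0.38)]:
    dex, nsig = offset_in_sigma(factor, scatter)
    print(f"{name}: {dex:.2f} dex = {nsig:.1f} x intrinsic scatter")
```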
The consistency between the black hole mass of NGC 1277 and the M• − LBulge relation in the optical is a result of a larger intrinsic scatter when compared to the relation in the mid-infrared, and expected if the black hole mass-bulge luminosity relation is just a tracer of a more fundamental link between black hole mass and bulge mass (Marconi & Hunt 2003; Häring & Rix 2004). As NGC 1277 becomes an outlier in the tighter relation in the mid-infrared, this could be interpreted as a hint for a different formation channel that lacks the physical interplay and causal link between the black hole and the spheroidal component of its host (Silk & Rees 1998). An alternative but speculative example for such a channel is presented by Shields & Bonning (2013). Based on gravitational radiation recoil during the final stages of the merger of two massive black holes and the accompanying velocity kicks, they argue that a massive black hole in a nearby galaxy could have been ejected and recaptured by NGC 1277. The probability of mergers which could produce kicks that exceed the escape velocity of the host is non-negligible (Lousto et al. 2010), but the ejected black hole would be accompanied by a hypercompact stellar system (HCSS) with a stellar mass of MHCSS 10^−2 × M• (Merritt et al. 2009). Consequently, we ought to observe a considerable number of these free-floating, compact stellar systems already in the Virgo cluster. The lack of any such observational evidence calls the likelihood of this scenario into question.
A different idea has been put forward by E13 to reconcile the black hole in NGC 1277 with predictions of the scaling relations. Here, individual dynamical models from N-body realisations have been chosen to match the HET and HST data in the very centre and over a wide radial range. A hand-picked model without dark matter and a black hole mass of log(M•/M⊙) = 9.7 shows an acceptable fit to the kinematics. However, no parameter search was done to find a best-fit model and confidence intervals. In particular, the models fail to fit all kinematic moments simultaneously and especially the fourth Gauss-Hermite moment, which seems to be the key discriminator between the various black hole mass scenarios.
The presence of a bar was also discussed briefly as an alternative explanation of the very distinctive kinematic moments. For instance, a model with no black hole but an edge-on bar was able to overcome the problem of fitting h4 while a model with an end-on bar was a good fit to the remaining data. The truth could lie somewhere in between these two opposing bar configurations, with a more moderate black hole mass in addition. However, it is worth noting that we were not able to find any evidence for the presence of a bar in any of the data sets. Although limited by the spatial resolution of our kinematic observations, we see a clear trend for an anti-correlation between h3 and v. The presence of a bar should break this trend over its projected length, as has been shown by N-body simulations of bar-unstable disks by Bureau & Athanassoula (2005) and observations of edge-on spiral galaxies (Chung & Bureau 2004). We also thoroughly inspected the high-resolution HST data and performed photometric decompositions with Galfit that included a bar. The decompositions, however, resulted in visually and statistically worse fits. Even though we do not rule out the possibility of a small (i.e. 1 ) end-on-bar, which would not be resolved by the currently available data, we raise concerns that a) this would be a special and unlikely case and b) that the same argument could be easily applied to a number of other dynamical black hole measurements. Finally, high-resolution spectroscopic observations with NIFS (PI: Richstone) have already been carried out for NGC 1277, which will shed light on this argument.

Dark Halo Detection
Based on our orbit-based models of the wide-field IFU data and the HET long-slit kinematics, we have no clear evidence for the presence of a dark halo in NGC 1277. In MRK 1216, on the other hand, the data can only be recovered with the addition of dark matter. We note, though, that the detection is only of a weak statistical nature. The best-fitting model without a dark halo deviates by ∆χ² = 15 from the overall best-fitting model, which is slightly beyond the 3σ confidence limit, as has been shown in Fig. 5. Given the four kinematic moments of the PPAK data and the three HET slits that are fitted simultaneously, the mean deviation per kinematic moment and bin is ∼ 0.04 between both models and the predicted velocity moments are barely distinguishable in the IFU maps as well as in the major axis profiles of the long slits (Fig. 6). The difference in the relative likelihoods of both models is mostly attributable to the IFU kinematics, which account for 2/3 of the ∆χ². This is in contrast to e.g. the statistically stronger black hole detection in NGC 1277, where the difference between our best-fitting model and a model with a black hole mass of log(M•/M⊙) = 9.7 is driven by a few central bins with a ∆χ² of 25 and is visible in the mismatch of the central velocity moments (Fig. 7).
In the case of MRK 1216, one would expect the outer bins to be the driver of the χ² difference, where the lack of the dark halo should lead to the most prominent deviation between a model with and without a dark halo. We show that this is not the case. In Fig. 12 we present the ∆χ² of the PPAK data between the best-fitting model without a halo and the overall best-fitting model as a function of radius. The plot reveals the central region ( 5 ) as the cause of the ∆χ² difference. In addition to the statistical claim, this is a clear indication for additional dark mass that can be explained as follows: The absence of a dark halo naturally leads to an increase in the stellar mass-to-light ratio which mitigates the effects of missing mass in the outer parts. This is illustrated in Fig. 12, where the simple mass-follows-light model presents an equally good fit to the outer kinematics as the overall best-fitting model with a halo. The rise in constant mass-to-light ratio however leads to a mismatch between data and model (or best-fitting model and best-fitting model w/o a halo) in the central regions.
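A radial ∆χ² profile of this kind can be built by evaluating the χ² contribution of every kinematic bin for both models and summing the differences in radial annuli. The following is a minimal sketch under that assumption; the function name, the annulus edges and the toy numbers are hypothetical and do not reproduce our actual data or pipeline.

```python
def delta_chi2_by_radius(radius, data, error, model_a, model_b, edges):
    """Sum chi2(model_a) - chi2(model_b) over the bins falling in each radial annulus."""
    totals = [0.0] * (len(edges) - 1)
    for r, d, e, ma, mb in zip(radius, data, error, model_a, model_b):
        dchi2 = ((d - ma) / e) ** 2 - ((d - mb) / e) ** 2
        for k in range(len(edges) - 1):
            if edges[k] <= r < edges[k + 1]:
                totals[k] += dchi2
                break
    return totals

# Toy example: model_a (e.g. no halo) misfits only the two innermost bins
radius = [1.0, 2.0, 6.0, 8.0]            # bin radii
data = [10.0, 10.0, 10.0, 10.0]          # observed moment
error = [1.0, 1.0, 1.0, 1.0]             # 1 sigma uncertainties
model_a = [12.0, 11.0, 10.0, 10.0]       # prediction of the model under test
model_b = [10.0, 10.0, 10.0, 10.0]       # prediction of the reference model
edges = [0.0, 5.0, 10.0]                 # inner and outer annulus

print(delta_chi2_by_radius(radius, data, error, model_a, model_b, edges))
```

In this toy case the whole difference accumulates in the inner annulus, which is the qualitative behaviour described above for MRK 1216.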
A natural way to resolve this issue and to make the mass-follows-light models fit the outer and inner data points would be a radially increasing mass-to-light ratio that adopts the best-fitting value for the central regions and steadily increases towards the outer parts to account for the outer bins. Indications for such a trend should be imprinted in the colour profiles of galaxies (Bell & de Jong 2001; Bruzual & Charlot 2003), attributable to variations in age and/or metallicity of the galaxy's stellar population as is expected if galaxies grow inside-out (Pérez et al. 2013; Patel et al. 2013). We have therefore inspected the colour profiles of NGC 1277 and MRK 1216 based on SDSS g − i and HST F814W-F160W imaging, but the lack of a significant trend with increasing distance from the centre in both does not support the use of a stellar mass-to-light-ratio gradient in our models. This is also in accordance with the spectroscopic results of T14 for NGC 1277, which suggest a uniformly old stellar population with almost constant metallicity and α/Fe values. Even though spatial gradients in the colours (e.g. Franx et al. 1989; Tamura & Ohta 2003) and stellar population properties (e.g. Greene et al. 2013) of individual ETGs have been observed, which would justify the assumption of a radially varying Υ, the analysis of a large sample of late- and early-type galaxies suggests that gradients for Υ are in general negative (Tortora et al. 2011), which in turn would further increase the dark mass and hence the discrepancy between our models with and without a dark halo. In principle, variations in the IMF could conceal a colour gradient in both compact objects while effectively increasing the stellar mass-to-light ratio. 
A recent study of radial trends in the IMF of individual, massive, high-dispersion galaxies, however, argues the converse and indicates that the observed trend of a bottom-heavy IMF is only a local property, confined to the central region of a galaxy, followed by a decrement of the IMF slope with increasing distance from the centre, and hence a radially decreasing stellar mass-to-light ratio (see Martín-Navarro et al. 2015a,b). Even if MRK 1216 and NGC 1277 did not assemble in the same way as the most massive ellipticals did and just evolved passively (see Section 5.4), there is currently no comprehensive theory of star formation that could explain the tendency of a more bottom-heavy IMF in the less dense outskirts of galaxies.

Dark Halos In Elliptical Galaxies
The results for the dark matter halos in our analyses are puzzling, in particular in the light of other orbit-based dynamical models with a similar extent in the stellar kinematic information of the full LOSVD (e.g. Rix et al. 1997; Thomas et al. 2007). While those investigations provided unambiguous evidence for the presence of dark matter in elliptical galaxies, we can only partially confirm this trend. For instance, Weijmans et al. (2009) examined the two early-type galaxies NGC 3379 and NGC 821. Based on SAURON data out to four effective radii, they obtained a dark matter contribution of at least 8 and 18 per cent to the total mass budget within one Re. They also predicted a dark matter fraction of 30-50 per cent within four Re and concluded that dark matter is necessary to explain the observed kinematics.
Figure 12. Total χ² difference of MRK 1216's PPAK data, between the best-fitting model without a dark halo and the overall best-fitting model, as a function of distance from the centre.
In particular NGC 3379, with a small effective radius of ∼ 2 kpc, is easily comparable to our compact objects, where we provide a similar relative and absolute coverage of the LOSVD. However, we can detect a dark halo only in MRK 1216. For NGC 1277, the reverse is true as the models are able to recover the observations without the need of any dark matter and predict a maximal dark matter fraction of only 13 per cent within one effective radius. Interestingly, the analysis of a larger sample of these compact, high central velocity dispersion galaxies (Yıldırım et al., in prep.) indicates a dominance of the stellar mass distribution within one effective radius, owing to the decrease of the effective radius which encompasses less of the dark volume. While the contribution of dark mass to the total mass budget within one effective radius is around 10 per cent for these objects, with stellar masses above log(M⋆/M⊙) = 11.1, and thus systematically lower than inferred for the population of massive (log(M⋆/M⊙) 11.1), local ETGs in e.g. the SAURON and ATLAS3D sample (Cappellari et al. 2006, 2013), it appears to be consistent with the dark matter content of compact, high central velocity dispersion galaxies at redshift z = 2 (van de Sande et al. 2013) (see also Sec. 5.4).

NFW Profiles
The aforementioned numbers are based on the assumption that the dark halo profile in both galaxies is well described by a spherically symmetric NFW profile. As a further check of this hypothesis, we compare our results with a semianalytic approach of Moster et al. (2010) that links the stellar mass of a galaxy to the mass of its dark matter halo.
By comparing the galaxy mass function with the halo mass function they obtained a well-defined stellar-to-halo mass (SHM) relation, which enables the determination of a halo mass for a given stellar mass and vice versa.
In Figure 13 we overplot all results of our orbit-based dynamical models that are enclosed by the 99.7 per cent confidence limit and thus immediately test the consistency of our models with the standard cold dark matter paradigm (ΛCDM), which is the underlying cosmological model that defines the halo mass function. There is a small range of overlap between the predictions of our models and the SHM relation that would imply consistency with ΛCDM, but we also see a wide coverage of allowed halo masses due to the inability of our models to constrain the parameter space in cDM and fDM effectively. A different quantification of the SHM relation in terms of late-and early-type galaxies (Dutton et al. 2010) does not change anything in this respect, as the halo masses of both MRK 1216 and NGC 1277 still overshoot the upper and lower bound of these relations by about one order of magnitude.
The difficulty in detecting a dark halo in both galaxies, in particular in NGC 1277, and in constraining the dark halo parameters cannot simply be attributed to the use of larger 3σ confidence intervals in our study. More probably, the obstacle can be traced back to their compactness and high stellar masses within the small spatial extent that is probed by the available kinematic data. Given our best-fitting results, stellar masses are of the order of log(M⋆/M⊙) = 11.1 within 7 and 5 kpc for MRK 1216 and NGC 1277 respectively. The contribution of an NFW halo to the total mass profile within the same range can be estimated to be of the order of log(M/M⊙) = 10.5 and 10.2 for MRK 1216 and NGC 1277, assuming that the mass-concentration relation (e.g. Bullock et al. 2001; Duffy et al. 2008; Macciò et al. 2008) and stellar-to-halo mass relation (e.g. Moster et al. 2010; Guo et al. 2010; Behroozi et al. 2010) hold. Accordingly, the dark halo would constitute ∼ 25 per cent of the total mass budget in MRK 1216 and only ∼ 13 per cent in NGC 1277, which would explain the statistically weak detection in the former and our struggle to verify the presence of dark matter in the latter, as the additional mass is easily compensated by a marginal increase in the stellar M/L.
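The order-of-magnitude estimate above can be retraced with a few lines of arithmetic. The sketch below takes the logarithmic masses quoted in the text, neglects the black hole, and returns both the halo-to-stellar mass ratio (about 0.25 and 0.13) and the halo share of the combined stellar plus halo mass (about 0.20 and 0.11). The function name is illustrative, and the difference between the two normalisations is comparable to the "of the order of" precision of the input masses.

```python
def halo_budget(log_mstar, log_mhalo):
    """Return (halo/stellar ratio, halo/(stellar + halo) fraction) from log10 masses."""
    mstar = 10.0 ** log_mstar
    mhalo = 10.0 ** log_mhalo
    return mhalo / mstar, mhalo / (mstar + mhalo)

# Enclosed log10 masses quoted in the text (stellar: 11.1 for both galaxies)
for name, log_mhalo in [("MRK 1216", 10.5), ("NGC 1277", 10.2)]:
    ratio, fraction = halo_budget(11.1, log_mhalo)
    print(f"{name}: halo/stellar = {ratio:.2f}, halo/(stellar+halo) = {fraction:.2f}")
```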

The Origin Of Compact, High Velocity Dispersion Galaxies
MRK 1216 and NGC 1277 are unusual and rare galaxies in the nearby universe. Their detection was a result of the selection criteria of the HETMGS (van den Bosch et al. 2015) which, based on the sphere of influence argument, naturally looked for dense, high-dispersion objects that could possibly host very massive SMBHs. Still, the number of objects that are similar to both, even in the HETMGS, is limited and questions regarding their origin and evolution arise. Typically, the stellar populations are the first resort for exploring the (stellar) evolutionary history of a galaxy, but this would only be feasible for the long-slit spectroscopic data due to the short wavelength coverage of our IFU observations. We therefore focus on the already available data and find hints for a rather unremarkable and quiet past in their photometric and structural properties.
MRK 1216 is an isolated galaxy in the field, which has only two other galaxies within a search radius of 1 Mpc at its distance. It has no tidal signatures or asymmetries and any recent galaxy-galaxy interaction can therefore be ruled out. Given its compact shape and rotationally supported dynamical structure, an active merging history seems to be unlikely, too. Violent relaxation due to collisionless 1:1 or 1:2 mergers, for instance, commonly yields boxy, slow-rotating ellipticals (e.g. Naab et al. 2006), which is at odds with the rapid rotation and dynamical characteristics of MRK 1216 and NGC 1277. Note, though, that this formation scenario also fails in reproducing the detailed dynamical properties of massive ETGs in general (Burkert et al. 2008; Naab et al. 2014). By contrast, violent relaxation due to "dry" unequal mass mergers has been shown to be able to recover the photometric and kinematic properties of disky, fast-rotating ellipticals (Naab et al. 2014). However, unequal mass mergers tend to increase the galaxy size drastically (Naab et al. 2009; Oser et al. 2010; Hilz et al. 2013), and are thus hard to reconcile with the sizes observed in both compact objects. On the other hand, dissipative equal mass mergers can reproduce fast-rotating ETGs, while also recovering the tilt in the FP (e.g. Robertson et al. 2006; Cox et al. 2006). However, both MRK 1216 and NGC 1277 are still outliers in the FP (Yıldırım et al., in prep.) and the non-negligible gas fractions involved in the merging process are expected to boost the star formation activity, which is in contrast to the uniformly old age and star formation history of NGC 1277 (T14 and Martín-Navarro et al. 2015b), unless the merging event has taken place more than 10 Gyr ago. As a result, the aforementioned simulated merger scenarios, which actually have been tailored to test and recover the formation and evolution mechanisms of today's population of ETGs, fail to fully explain the two compact galaxies in this work.
This raises the question of whether these two objects are representatives of a galaxy population that has (at some point) taken a significantly different path than the present-day massive galaxy population, which has grown in mass and size since z = 2 (van Dokkum et al. 2010) presumably through successive (minor and major) merging events. In fact, stellar age estimates of the present-day massive galaxy population (McDermid et al. 2015) are consistent with the inferred stellar ages of NGC 1277, and the range of allowed stellar mass-to-light ratios in our dynamical models cannot rule out the trend of a more bottom-heavy IMF with increasing stellar velocity dispersion, which is also commonly observed for the most massive ellipticals. Accordingly, both galaxies would represent unaltered and passively evolved analogues of the massive, quiescent galaxy population at much earlier times, which are thought to constitute the cores of today's massive ellipticals.
Indeed, the two galaxies are quantitatively similar to the quiescent galaxies at z = 2. Those are also found to be small (Daddi et al. 2005; Trujillo et al. 2006; Zirm et al. 2007; van Dokkum et al. 2008; van der Wel et al. 2008, 2014), possess extremely high dispersions and generally have a disk-like structure (van der Wel et al. 2011). T14 were able to go beyond a simple structural, photometric and kinematic comparison by carrying out a stellar population analysis of NGC 1277. Based on long-slit spectra out to ∼ 3 Re they found that NGC 1277 consists of a uniformly old stellar population ( 12 Gyr), formed during a very short-lived era at z 3 with an intense star formation rate. This again is in good agreement with spectroscopic investigations of Kriek et al. (2006, 2009) and Toft et al. (2012) for individual quiescent galaxies at z ∼ 2. Those also have very old stellar populations, with the bulk of their mass already assembled at z 3, and are devoid of any significant star formation. Recently, evidence has even mounted for a further evolutionary link with the sub-millimeter galaxies (SMGs) at z 3 (Toft et al. 2014). The SMGs not only provide the necessary ages and compact sizes, but also the intense star formation rates, which could have been triggered by gas-rich (major) mergers at high redshifts (Naab et al. 2007; Wuyts et al. 2010), to explain the old, compact stellar populations of the quiescent galaxies at z = 2 (but see also Williams et al. 2014; Dekel & Burkert 2014 and Barro et al. 2014 for an alternative formation channel).
The resemblance between NGC 1277 and MRK 1216 and the quiescent galaxies at higher redshifts is remarkable. Nevertheless, we need to go beyond single anecdotal examples if we want to underpin the claim that the compact galaxies, found in the HETMGS, are passively evolved descendants of the quiescent population at z 2. It is encouraging though that we have found 18 compact, high-dispersion, early-type galaxies in total, which will enable us to investigate in detail their photometric, structural, kinematic and stellar evolutionary properties.

UNCERTAINTIES
Our orbit-based dynamical analysis and its implications are afflicted by a number of moderate concerns, which we would like to highlight here.
• During the construction of our dynamical models we have assumed axisymmetric stellar systems. The models are robust with respect to changes in the inclination (Section 4.2.1) but the orbital structures can change rapidly when the assumption of axial symmetry is relaxed. Even mild triaxiality would alter the observed phase space structures in Fig. 8 and 9 noticeably, leading also to variations in the derived values of e.g. the black hole mass (van den Bosch & de Zeeuw 2010). In this respect, even the slightest twist in the PA can be interpreted as a deviation from axisymmetry. An MGE with a fixed PA for all Gaussians (Sec. 2.2) is a necessary but insufficient condition for the assumption of axial symmetry, as triaxial deprojections cannot be ruled out. However, the body of evidence that has been presented throughout this paper, namely the fast and regular rotation around the short axis, the anti-correlation between v and h3, the negligible misalignment between the kinematic and photometric PA and results from shape inversions of a large sample of fast-rotating early-type galaxies (Weijmans et al. 2014), shows that axial symmetry is a justified assumption for the intrinsic shape of both compact objects.
• Tightly linked to the black hole mass is the stellar mass-to-light ratio which in turn is degenerate with the dark matter halo (Gebhardt & Thomas 2009). The determination of Υ is therefore crucial in constraining the black hole, if the black hole sphere of influence is not resolved (Rusli et al. 2013). In our set of dynamical models, the stellar mass-to-light ratio is assumed to be constant throughout the observed range of kinematics. This is also supported by the lack of colour gradients in both galaxies and an only mild change in the stellar population properties of NGC 1277 within a radial extent of ∼ 3 effective radii (T14). We emphasise, though, that a change in the stellar M/L at smaller radii ( 1 ) might be present, which would neither be resolved by the SDSS photometry nor by our long-slit and wide-field IFU kinematics of NGC 1277. In fact, Martín-Navarro et al. (2015b) found a slight increase in the stellar M/L in NGC 1277, by tracing gravity sensitive features in their NIR, long-slit spectroscopic data. Limited by the spatial resolution of their data set, however, the stellar M/L increases only marginally from 7.0 to 7.5 between their outermost (∼ 6 ) and innermost (∼ 1 ) data points, which is still consistent with the stellar M/L inferred in our dynamical models.
Despite emerging evidence for strong systematic variations in the IMF of early-type galaxies (e.g. Auger et al. 2010; Dutton et al. 2011; Cappellari et al. 2012; Spiniello et al. 2012), predictions of SSP models with a single power-law Salpeter IMF (Vazdekis et al. 1996) are consistent with our orbit-based dynamical models of both NGC 1277 and MRK 1216. This was previously excluded at the 3σ level for NGC 1277 in vdB12. Whereas those conclusions were based on spectral synthesis fits of NGC 1277's single SDSS aperture (Cid Fernandes et al. 2005) with Bruzual & Charlot models (Bruzual & Charlot 2003), our values for the stellar mass-to-light ratios are derived from fits to the spatially resolved long-slit spectra (T14) based on MIUSCAT (Ricciardelli et al. 2012) SPS models. Both approaches show that NGC 1277 is comprised of a uniformly old stellar population. Since more standard variations of the inferred stellar population parameters (i.e. metallicity and α-abundance) do not seem to be able to explain the difference between the Υ values quoted in vdB12 and the values in this work, we assume that the difference might be attributable to the choice of a non-standard IMF in the former. We consider the values reported here as a conservative estimate. Hence the strong tendency towards a more bottom-heavy IMF in high dispersion galaxies (Ferreras et al. 2013; La Barbera et al. 2013; Spiniello et al. 2014) cannot be ruled out by our orbit-based dynamical analysis of MRK 1216 and NGC 1277, and would indeed favour the presence of a more moderate black hole mass. However, according to our Schwarzschild models, the upper range of possible mass-to-light ratios implies that the IMF in these two objects can only be more massive by ∼ 15 and 25 per cent at most in NGC 1277 and MRK 1216, respectively, with respect to a Salpeter IMF.
While this is largely consistent with the observed scatter in the relation between IMF slope and velocity dispersion in the aforementioned studies, more exotic variations of the IMF can be excluded. The best-fitting relation in Treu et al. (2010) and Spiniello et al. (2014), for example, implies a shift in Υ by ∼ 40 per cent for a stellar velocity dispersion of ∼ 300 km s⁻¹.
• A major concern in the modelling of NGC 1277 remains the nuclear dust ring. Although contaminated regions have been generously masked while constructing the luminous mass model, this is by no means an appropriate physical account of dust extinction. Since our orbit-based dynamical models measure the enclosed mass within a given radius, an underprediction of the stellar mass in the nucleus will bias the measurement towards higher black hole masses, although it would take a considerable amount of mass to be screened by dust (≳ 50% of the stellar mass within 1″) to bring the black hole in line with the scaling relations.
• The biggest concern in the recovery of the individual mass contributions and in particular for the black hole mass in NGC 1277, though, remains the accuracy of the kinematic measurements. E13 has shown that the detection of an overmassive SMBH entirely hinges on the dispersion peak and the positive h4 values in the centre. This is easily verified by our models, where most of the χ² difference between the best-fitting model and a model with a more moderate black hole mass of log(M•/M⊙) = 9.5 (Fig. 7) is attributable to the fits to σ and h4 that contribute 3/4 of the ∆χ². In particular, the seeing limited measurements within 1″ in the HET long-slits are the driver of this difference and question the reliability of the black hole mass estimate. Moreover, dust obscuration could also affect the measurement of the LOSVD, even though modest dust mass assumptions show that this is only significant for the large scale kinematics (Baes & Dejonghe 2001).
Figure 14. Comparison between one of the three HET long-slit (black) and PPAK IFU measurements (red) of the fourth Gauss-Hermite moment in NGC 1277. The HET measurements have been obtained along the apparent major axis. The PPAK measurements correspond to the values of the Voronoi-binned data, for which the bin centroids are located within a 1″ wide strip along the major axis.
The PPAK observations, although limited by their spatial resolution, provide an independent way to assess the accuracy of the HET measurements and the models in vdB12. As has been shown in Section 3.4, the central dispersion in the PPAK cube is considerably lower than the peak observed in the HET data. As a result, models that fit the combined data set predict a velocity dispersion that matches the PPAK data but slightly fails to do so for the peak in the HET data (Section 4.2.2). In addition, the central h4 moments in the PPAK data, while still positive, are slightly below the HET measurements (Fig. 14), which is of concern considering that those values have been key in discriminating between the various black hole mass scenarios (Sec. 5.2.1). In principle, we could try to reconcile both data sets and let them "meet in the middle". However, quantifying the offset between the long-slit and IFU kinematics is a non-trivial task, which is why we follow a different route and check the inter-consistency between both by fitting the PPAK data individually.
We display the results of these test models in Figure 15, and show ∆χ² = χ² − χ²_min as a function of black hole mass and dark halo mass (marginalising over all remaining parameters). The model predictions for NGC 1277 based on the PPAK data are shown on top, with the predictions based on both data sets (already illustrated in Fig. 5) plotted below. While the inferred values for the dark halo are identical, the PPAK only models of NGC 1277 yield a black hole mass of log(M•/M⊙) = 10.0 (+0.2/−0.4). This is largely in agreement with the values derived in Section 4.2.2 and vdB12, although the lower limit for the black hole mass is decreased by a factor of two and is now consistent with the black hole mass in E13, which was previously ruled out. Here again, the main contribution to the ∆χ² between our best-fitting model and models that are ruled out by the statistical 3σ uncertainties comes from the velocity dispersion and the fourth Gauss-Hermite moment. In contrast to our fiducial models, however, which fitted both data sets simultaneously, the main driver in the fits to σ and h4 cannot be traced back to the seeing limited innermost data points but is more uniformly distributed, as highlighted in the second and fourth row of Fig. 7.
Figure 15. Comparison of the inferred values for black hole mass and dark matter halo based on NGC 1277's orbit-based models of the PPAK data only (top) and the combined PPAK + HET data set (bottom). The red dot marks the best-fitting value. The horizontal line denotes a ∆χ² difference of 9, which corresponds to statistical 3σ uncertainties for one degree of freedom.
We thus ascribe the decreased lower limit to the coarser spatial resolution of the PPAK kinematics, which is not able to resolve the sphere of influence of the massive black hole. For our conclusions we choose to give no preference to either one of the measurements and stick to the fiducial models in Section 4.2.2. Due to the lingering issues between both data sets, though, a cautious reader can adopt a lower limit of log(M•/M⊙) = 9.6 (but see also the next two points).
In the case of MRK 1216, the difference between the PPAK and HET kinematics is marginal (Sec. 3.3). Although the best-fitting model in Fig. 6 seems to be slightly off the measured HET dispersion, it is still well within the measurement errors. Hence, fits to the PPAK data alone do not show any difference in the derived values for black hole mass, stellar mass-to-light ratio and dark halo mass. We therefore do not present a comparison similar to Fig. 15 but rather refer the reader to the detailed analysis and modelling results in Sections 4.2.1 and 5.3.2.
• In Morganti et al. (2013), Monte Carlo simulations of mock galaxy kinematics are utilised to estimate appropriate confidence intervals. Based on their made-to-measure particle code NMAGIC, they advocate the use of larger ∆χ² values to be able to recover their model galaxy parameters. While these findings are certainly interesting, there are significant differences in the modelling approach as well as in the kinematic data sets. The investigations of Morganti et al. (2013) are based on a single case study and more extensive tests are necessary to verify the reliability of their adopted confidence levels. The uncertainties in our parameter estimation are based on the commonly used ∆χ² values. After marginalising over the orbital weights as well as over e.g. Υ, c and f, we obtain formal 3σ errors of the black hole mass with a ∆χ² of 9. Alternatively, we can make use of the expected standard deviation of χ² for our parameter estimation, as promoted by e.g. van den Bosch & van de Ven (2009) in the case of IFU kinematics. The standard deviation in χ² is √(2(N − M)), where N is the number of kinematic constraints (i.e. the four kinematic moments v, σ, h3 and h4 in each bin) and M the number of free parameters in our models, namely the dark halo parameters c and f as well as the black hole mass M• and stellar mass-to-light ratio Υ. As a result, we obtain a lower limit for the black hole mass in NGC 1277 of log(M•/M⊙) = 9.6, which again is well in line with the estimate of the black hole mass based on models of the PPAK only data and the black hole mass that was put forward by E13. Applying the same argument to MRK 1216, however, would imply consistency with models which do not contain a dark halo.
• Finally, we point out that all uncertainties presented here are solely statistical errors. A large uncertainty factor in any measurement of the black hole mass remains the estimate of systematic errors. These are hard to quantify and can arise not only through the use of modelling assumptions such as a constant mass-to-light ratio, axial symmetry and the adoption of a spherical NFW halo, but also through technical limitations such as a stellar template mismatch and the influence of the wavelength range that is used to infer the LOSVD. It is beyond the scope of this paper to derive an assessment of each of these factors but we acknowledge that their total contribution most likely exceeds our statistical errors. Still, to provide a conservative estimate of the black hole mass that takes into account the effects of systematic uncertainties, we simply follow the practice of Kormendy & Ho (2013)
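The ∆χ² bookkeeping used in the discussion points above can be illustrated with a minimal Python sketch. It is not the modelling code used for this work: the χ² grid consists of made-up toy numbers, the function names are hypothetical, and N = 120 kinematic constraints is an assumed value chosen only for illustration. The sketch marginalises χ² over the nuisance parameters by taking the minimum at each black hole mass, and then compares the formal 3σ criterion (∆χ² = 9 for one parameter of interest) with the looser criterion based on the standard deviation of χ², √(2(N − M)).

```python
import math

def marginalised_delta_chi2(chi2_grid):
    """Minimise chi2 over all nuisance parameters at fixed log(M_bh).

    chi2_grid maps (log_mbh, nuisance_label) -> chi2 of that model.
    Returns {log_mbh: chi2 - global minimum chi2}.
    """
    best = {}
    for (log_mbh, _nuisance), chi2 in chi2_grid.items():
        best[log_mbh] = min(chi2, best.get(log_mbh, math.inf))
    chi2_min = min(best.values())
    return {m: c - chi2_min for m, c in best.items()}

def delta_chi2_threshold(n_sigma):
    """Delta chi2 for an n-sigma interval on ONE parameter of interest."""
    return n_sigma ** 2

def chi2_std(n_constraints, n_params):
    """Expected standard deviation of chi2: sqrt(2 (N - M))."""
    return math.sqrt(2 * (n_constraints - n_params))

def allowed_values(profile, threshold):
    """Grid values whose marginalised Delta chi2 lies within the threshold."""
    return sorted(m for m, d in profile.items() if d <= threshold)

# Toy grid (NOT taken from the models in this paper); "a"/"b" stand for
# two hypothetical combinations of the remaining parameters.
toy_grid = {
    (9.4, "a"): 130.0, (9.4, "b"): 128.0,
    (9.6, "a"): 112.0, (9.6, "b"): 115.0,
    (9.8, "a"): 106.0, (9.8, "b"): 104.0,
    (10.0, "a"): 101.0, (10.0, "b"): 100.5,
    (10.1, "a"): 100.0, (10.1, "b"): 102.0,
    (10.3, "a"): 111.0, (10.3, "b"): 108.0,
}

profile = marginalised_delta_chi2(toy_grid)

# Formal 3-sigma criterion for one degree of freedom.
formal = allowed_values(profile, delta_chi2_threshold(3))

# Looser criterion; N = 120 is assumed, M = 4 (c, f, M_bh, Upsilon).
loose = allowed_values(profile, chi2_std(120, 4))

print("formal 3-sigma range:", formal[0], "to", formal[-1])
print("sqrt(2(N-M)) range  :", loose[0], "to", loose[-1])
```

With these toy numbers the looser criterion admits one additional grid point at lower black hole mass, which is the qualitative behaviour described above; the actual limits depend on the real χ² surface and on the true number of kinematic constraints.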

SUMMARY
We have performed a detailed analysis of a suite of kinematic and photometric information of the two compact, nearby, high velocity dispersion galaxies MRK 1216 and NGC 1277. Our analysis combined three different but complementary data sets: high spatial resolution imaging with the HST, low-resolution, long-slit spectroscopic observations with the HET and medium-resolution spectroscopic observations with the PPAK IFU.
We first analysed the reduced and combined HST images with multiple Sérsic components to infer the structure and morphology of each galaxy. Both galaxies show a very compact, early-type structure without any noticeable substructures. By means of a multi-component decomposition, we obtained estimates for a bulge luminosity. A decent fit was obtained with at least four components in both cases. We further parameterised the observed light distribution with a set of multiple Gaussians, which in turn was used to build axisymmetric dynamical models.
Kinematic information was extracted by fitting the binned spectra with a set of stellar libraries. The observations revealed a distinct central peak in the velocity dispersion, hinting at a very high mass concentration in the nucleus of both galaxies, and fast and regular rotation around the short axis that is consistent with axial symmetry.
Our dynamical models rely on a triaxial implementation of Schwarzschild's orbit superposition method. Probing a wide range of parameters, we infer upper and lower limits for the individual gravitational contributions of black hole mass, stellar mass and dark matter halo. For NGC 1277 we obtained good constraints on the black hole mass of log(M•/M⊙) = 10.1 (+0.1/−0.2) for the best-fitting model, consistent with former measurements of vdB12. Even for high stellar mass-to-light ratios, the lower limit on the black hole is considerably higher than predictions of the M• − L_Bulge relation in the mid-infrared. In the case of MRK 1216, we only obtain an upper limit of log(M•/M⊙) = 10.0. High-resolution spectroscopic observations are thus needed to resolve the sphere of influence and to place firm constraints on its black hole mass.
Despite kinematic information out to 5 kpc, we were not able to constrain the dark halo parameters significantly. The models predict a dark matter contribution of up to 52 per cent in MRK 1216 and 13 per cent in NGC 1277 within one effective radius. Models without a dark halo are formally excluded at the 3σ level only in MRK 1216. The difference between the best-fitting model without a halo and the overall best-fitting model is mainly driven by data points within 5″. We show that this difference is due to an increase in the constant mass-to-light ratio in the dark-halo-free models to account for the outer kinematics, which then propagates towards the centre and leads to the observed mismatch. A radially increasing mass-to-light ratio could indeed recover the data without the need for a dark halo. But, if anything, recent investigations of massive early-type galaxies suggest a radially decreasing stellar mass-to-light ratio.
The stellar mass-to-light ratios span a range of 5.0–8.0 in V-band in NGC 1277 and 1.0–2.3 in H-band in MRK 1216. The best-fitting models are in good agreement with predictions of SSP models with a single power-law Salpeter IMF. Higher mass-to-light ratios, as have been observed in high dispersion galaxies, cannot be excluded. Nevertheless we place upper limits on possible deviations from the derived values.
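As a rough consistency check on these limits, the short Python sketch below back-computes the Salpeter reference values implied by the upper ends of the quoted ranges. It assumes that the maximum IMF excesses of ∼ 15 per cent (NGC 1277) and ∼ 25 per cent (MRK 1216) quoted in the discussion are defined as the ratio of the upper Υ limit to the Salpeter prediction. The function name is hypothetical, and the outputs are approximate, illustrative numbers rather than values measured in this work.

```python
def implied_salpeter_upsilon(upsilon_upper, max_excess):
    """Salpeter M/L implied if upsilon_upper exceeds it by the fraction max_excess."""
    return upsilon_upper / (1.0 + max_excess)

# Upper ends of the quoted M/L ranges and the approximate maximum excesses
# (assumed definition: upper limit = (1 + excess) * Salpeter value).
ngc1277_ref = implied_salpeter_upsilon(8.0, 0.15)  # V-band
mrk1216_ref = implied_salpeter_upsilon(2.3, 0.25)  # H-band

print(f"NGC 1277: implied Salpeter Upsilon_V ~ {ngc1277_ref:.1f}")
print(f"MRK 1216: implied Salpeter Upsilon_H ~ {mrk1216_ref:.1f}")
```

Under this assumed definition, both implied reference values fall inside the quoted ranges, as expected if the best-fitting models agree with a Salpeter IMF.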
The orbital structure is rotationally supported in both galaxies, which is consistent with the multi-component Sérsic decompositions of the deep HST images. This strongly indicates that MRK 1216 and NGC 1277 do not possess any pressure supported classical bulges that have formed through violent relaxation in late (i.e. z ≤ 2), equal mass mergers. Recent, successive minor and dissipative major merging events are unlikely too, as these tend to increase the galaxy size drastically and should yield more recent star formation activity, which is in contrast to their uniformly old stellar age estimates. Taking into account their compact, featureless and regular structures as well as their high dispersions and rapid rotation, these compact objects might well be unaltered descendants of the quiescent galaxy population at z = 2, which in turn are thought to be remnants of highly dissipative submillimeter mergers at even higher redshifts.