Modeling the orbital histories of satellites of Milky Way-mass galaxies: testing static host potentials against cosmological simulations

Understanding the evolution of satellite galaxies of the Milky Way (MW) and M31 requires modeling their orbital histories across cosmic time. Many works that model satellite orbits incorrectly assume or approximate that the host halo gravitational potential is fixed in time and is spherically symmetric or axisymmetric. We rigorously benchmark the accuracy of such models against the FIRE-2 cosmological baryonic simulations of MW/M31-mass halos. When a typical surviving satellite fell in ($3.4-9.7$ Gyr ago), the host halo mass and radius were typically $26-86$ per cent of their values today, respectively. Most of this mass growth of the host occurred at small distances, $r\lesssim50$ kpc, opposite to dark-matter-only simulations, which experience almost no growth at small radii. We fit a near-exact axisymmetric gravitational potential to each host at $z=0$ and backward integrate the orbits of satellites in this static potential, comparing against the true orbit histories in the simulations. Orbital energy and angular momentum are not well conserved throughout an orbital history, varying by 25 per cent from their current values already $1.6-4.7$ Gyr ago. Most orbital properties are minimally biased, $\lesssim10$ per cent, when averaged across the satellite population as a whole. However, for a single satellite, the uncertainties are large: recent orbital properties, like the most recent pericentre distance, typically are $\approx20$ per cent uncertain, while earlier events, like the minimum pericentre or the infall time, are $\approx40-80$ per cent uncertain. Furthermore, these biases and uncertainties are lower limits, given that we use near-exact host mass profiles at $z=0$.


INTRODUCTION
The satellite galaxies of the Milky Way (MW) and M31 are the most rigorously studied low-mass galaxies, given their proximity to us.The dynamics and evolution of these low-mass galaxies encode rich information about their past and the host halo environment in which they orbit.These low-mass galaxies also differ from the non-satellite galaxies within the Local Group (LG) given that their host galaxies, either the MW or M31, regulated their star formation, and they probe deep within their host potentials.Important questions about their orbital histories include: When did each satellite fall into the MW/M31 halo?How close have they orbited, and when did they experience pericentric passages?How has the mass of the MW/M31 changed over time, and how has it impacted satellite orbits?How well conserved are orbital properties such as energy or angular momentum?Given a near-perfect representation of the host potential, to what motions, for over 1 billion sources, including globular clusters and the satellite galaxies of the MW (Gaia Collaboration et al. 2018a).Numerous studies use Gaia's kinematic data to study group infall of satellites, including satellites of the LMC (for example Kallivayalil et al. 2018;Fritz et al. 2018a;Patel et al. 2020).Because proper motion measurements improve with multiple observations, many studies now use HST and Gaia data in conjunction to reduce uncertainties in satellite galaxy proper motions (such as del Pino et al. 2022;Bennet et al. 2022;Warfield et al. 2023).Current observational programs such as the Satellites Around Galactic Analogs (SAGA) survey (Geha et al. 2017;Mao et al. 2021) are observing satellite galaxies around other MW-mass galaxies.The James Webb Space Telescope (JWST) soon will be able to obtain proper motions of even more distant low-mass galaxies, beyond the LG (Weisz et al. 2023), and the Vera Rubin Telescope will catalog more than 10 billion stars within the low-mass galaxies around the MW.
Using the phase-space information of satellite galaxies, in tandem with a model of the Galactic potential, many studies investigate satellite infall and orbital histories.A galaxy becomes a satellite galaxy when it first crosses the virial radius of a more massive halo, which can quench the lower-mass galaxy's star formation (for example Gunn & Gott 1972;van den Bosch et al. 2008;Rodriguez Wimberly et al. 2019;Samuel et al. 2022b,a).As satellites reach their closest approach to the host galaxy at pericentre, they orbit in the denser host CGM and feel strong tidal forces ram pressure (for example McCarthy et al. 2008;Simons et al. 2020;Martín-Navarro et al. 2021;Samuel et al. 2022a).Many studies use different models for the MW/M31 potential and numerically integrate the orbits for their satellite galaxies to derive orbit properties.However, the results depend strongly on modeling the total mass profile of the MW and M31 (for example Gaia Collaboration et al. 2018b;Fritz et al. 2018a,b).Another study by Fillingham et al. (2019) jointly used Gaia data with the Phat ELVIS (pELVIS; Kelley et al. 2019) dark matter only (DMO) simulations, which include the gravitational effects of a central galaxy, to match simulated satellites to observed satellites in 6D phase space.They then used the distribution of infall times of the matched simulated satellites to infer the infall times for 37 satellites of the MW, which ranged from ≈ 1 − 11 Gyr ago, similar to other simulation-focused studies (for example Wetzel et al. 2015;Bakels et al. 2021;Santistevan et al. 2023).Deriving these infall and orbit history properties is generally difficult, given that we do not know precisely how the mass distribution of the MW or M31 has changed over time.
Stellar streams arise from the disruption of satellite galaxies or globular clusters.Therefore, studying the orbits of streams or globular clusters gives insight into the possible future orbits of satellites that will eventually merge into their host galaxy (for example Ibata et al. 1994;Majewski et al. 1996;Bullock & Johnston 2005;Price-Whelan et al. 2016, 2019;Panithanpaisal et al. 2021;Bonaca et al. 2021;Ishchenko et al. 2023).Several studies used both the Dark Energy Survey (DES; Dark Energy Survey Collaboration et al. 2016) and the Southern Stellar Stream Spectroscopic Survey ( 5 ; Li et al. 2019) to discover these small systems and measure their kinematics (for example Shipp et al. 2018Shipp et al. , 2019;;Li et al. 2021Li et al. , 2022)).Comparisons with cosmological simulations in Shipp et al. (2023) suggest that undetected stellar streams may exist around the MW, which the upcoming Vera Rubin Observatory could potentially discover.
Cosmological simulations of MW-mass galaxies allow us to study theoretically the orbital evolution of satellites.Many studies used DMO simulations to understand how subhalos respond to pericentric events (Robles & Bullock 2021), how subhalo orbits respond to various MW environments (Peñarrubia et al. 2002;Peñarrubia & Benson 2005;Ogiya et al. 2021) and pre-processing and group accretion (Rocha et al. 2012;Wetzel et al. 2015;Li et al. 2020;Bakels et al. 2021).However, many such studies did not account for the important effects of baryons (for example Brooks & Zolotov 2014;Bullock & Boylan-Kolchin 2017;Sales et al. 2022).
Of utmost importance in deriving the satellite orbit histories is understanding the mass distribution of the MW and M31.Studies such as Bovy & Rix (2013) and Bovy et al. (2016) focused on fitting and deriving parameters for the disc, such as the scale height and length, while other studies estimated the total mass of the MW or M31 (for example Eadie et al. 2017;Patel et al. 2018;Eadie & Jurić 2019;Patel & Mandel 2023).One method of defining the total virial mass of a galaxy is by summing the mass within a given radius, such as  200m , the radius that encompasses 200× the matter density of the Universe (Bryan & Norman 1998).Using constraints from globular cluster kinematics, Vasiliev (2019a) found that the virial mass of the MW is  200m = 1.2 × 10 12 M ⊙ , which is in line with the studies mentioned above and with the currently accepted virial mass in the literature of  200m = 1 − 2 × 10 12 M ⊙ (for example Bland-Hawthorn & Gerhard 2016).Many studies use the orbits of satellite galaxies to constrain the mass of the MW or M31 (for example Evans & Wilkinson 2000;van der Marel & Guhathakurta 2008;Watkins et al. 2010;Irrgang et al. 2013).Patel & Mandel (2023) suggested the mass of M31 to be more massive,  200m = 2.85 − 3.02 × 10 12 M ⊙ , from proper motions from HST and Gaia of satellite galaxies.
Many studies used numerical tools to backward integrate the orbits of satellites, stellar streams, or globular clusters, such as Galpy (Bovy 2015), AGAMA (Vasiliev 2019b), and Gala (Price-Whelan 2017).However, such orbit modeling often makes approximations by keeping the host mass profile fixed over time (for example Patel et al. 2017;Fritz et al. 2018a;Fillingham et al. 2019;Pace et al. 2022), sometimes varying the MW center of mass, including an LMC-mass satellite, or including dynamical friction (Weinberg 1986;Lux et al. 2010;Gómez et al. 2015;Patel et al. 2020;Garavito-Camargo et al. 2019, 2021;Correa Magnus & Vasiliev 2022;Lilleengen et al. 2023).Lux et al. (2010) compared various orbit history properties of subhalos in the dark matter-only Via Lactea I simulation (Diemand et al. 2007) to the orbits of MW satellites by using proper motion measurements from the literature and integrating their orbits in fixed potentials.In another study, Arora et al. (2022) compared 4 models with and without time dependence to investigate the effects of different mass models on stellar streams dynamics in simulations, and found that although most models conserve stream orbit stability, the only model that conserves stability over long periods of time is the time-evolving model.D'Souza & Bell (2022) used 2 MW-mass host halos from the ELVIS suite of dark-matter-only (DMO) simulations to test how well orbit modeling recovers the cosmological orbits of subhalos.Although the majority of dynamical models applied to the MW and M31 assume static host potentials, the fiducial model in D 'Souza & Bell (2022) accounted for the true mass growth of each MW-mass host.They compared results from host halos with and without LMC-mass satellites and showed that orbit modeling better recovers the more recent pericentres and apocentres when compared to the second or third-most recent.They also tested models in which they did not account for any mass growth of the MW-mass host, or the presence of an LMC-like satellite, and found varying degrees of uncertainty associated with each simple model.However, these simulations lacked baryonic physics, including the gravitational effects of a central galaxy, and various works noted the importance of modeling baryonic physics on these scales (for example Brooks & Zolotov 2014;El-Badry et al. 2016; Bullock & Boylan-Kolchin 2017; Sales et al. 2022).
Table 1.Properties at  = 0 of the 13 MW/M31-mass galaxies/halos in the FIRE-2 simulations that we analyse, ordered by decreasing stellar mass.Simulations with 'm12' names are isolated galaxies from the Latte suite, while the others are from the 'ELVIS on FIRE' suite of Local Grouplike pairs.Columns: host name;  star,90 is the host's stellar mass within  star,90 , the disc radius enclosing 90 per cent of the stellar mass within 20 kpc;  200m is the halo total mass;  200m is the halo radius; and  satellite is the number of satellite galaxies at  = 0 with  star > 3 × 10 4 M ⊙ that ever orbited within  200m , totalling 493 across the suite.In Santistevan et al. (2023), we studied the orbital dynamics and histories of satellite galaxies in the FIRE-2 cosmological zoom-in simulations of MW-mass galaxies.We investigated trends between the present-day dynamical properties, such as velocity, total energy, and specific angular momentum, with the satellite infall times, present-day distance from the MW-mass host, and satellite stellar mass.We also similarly checked for trends with properties at pericentre and found that the most recent pericentre was not the smallest, contrary to the expectation that satellite orbits only shrink over time.

Name
In this paper, we further study the infall and orbital histories of the same satellites.We model an axisymmetric mass profile for each simulated MW-mass host to within a few per cent at  = 0, and we backward integrate the orbits of satellites within each one.We then compare these results against the 'true' orbital histories of satellite galaxies in the simulations.Our goal is to quantify rigorously the strengths and limitations of modeling satellite orbits in a static host halo potential, a commonly used technique.Although we focus on satellite galaxies, our results are relevant for orbit models of stellar streams and globular clusters.
Key questions that we address are: (1) How much has the mass profile of a MW-mass host evolved over the orbital histories of typical satellites?(2) How well does orbit modeling in a static axisymmetric host potential recover key orbital properties in the history of a typical satellite?(3) How far back in time can one reliably model the orbital history of satellites in a static axisymmetric host potential?

FIRE-2 simulations
We use the cosmological zoom-in baryonic simulations of MWmass galaxies in both isolated and LG-like environments from the Feedback In Realistic Environments (FIRE) project1 (Hopkins et al. 2018).We ran these simulations using the hydrodynamic plus body code Gizmo (Hopkins 2015), with the mesh-free finite-mass (MFM) hydrodynamics method (Hopkins 2015), and the FIRE-2 physics model that includes several radiative heating and cooling processes such as Compton scattering, Bremsstrahlung emission, photoionization and recombination, photoelectric, metal-line, molecular, fine-structure, dust-collisional, and cosmic-ray heating across temperatures 10 − 10 10 K (Hopkins et al. 2018).The FIRE-2 physics model also includes the spatially uniform and redshift-dependent cosmic ultraviolet (UV) background from Faucher-Giguère et al. (2009), for which HI reionization occurs at  reion ≈ 10.Stars form in gas that is self-gravitating, Jeans unstable, molecular (following Krumholz & Gnedin 2011), and dense (  > 1000 cm −3 ), and represent single stellar populations, assuming a Kroupa (2001) initial mass function.Stars then evolve along stellar population models from STARBURST99 v7.0 (Leitherer et al. 1999), inheriting masses and elemental abundances from their progenitor gas cells.Other stellar feedback processes we implement in the FIRE-2 simulations include core-collapse and white-dwarf (Type Ia) supernovae, stellar winds, and radiation pressure.
We analyse a similar set of galaxies as Santistevan et al. (2023), only we omit 'm12r', because of its low stellar mass compared to the MW and because we are not able to fit its mass profile to sufficiently high precision (see Appendix A).We also include 'm12z', first introduced in Garrison-Kimmel et al. (2019a).Our sample is from both the Latte suite of isolated MW/M31-mass galaxies, introduced in Wetzel et al. (2016), and the 'ELVIS on FIRE' suite of LG-like MW+M31 pairs, introduced in Garrison- Kimmel et al. (2019a).Table 1 lists several of their properties at  = 0, such as stellar mass,  star,90 , halo mass,  200m , and radius,  200m , and the number of satellite galaxies at  = 0 with  star > 3 × 10 4 M ⊙ ,  satellite .
The Latte suite of isolated MW/M31-mass galaxies includes halos at  = 0 with  200m = 1 − 2 × 10 12 M ⊙ with no other halos of similar mass within 5 200m .We also chose m12w to have LMCmass satellite analogs near  ≈ 0, and m12z to have a smaller DM halo mass at  = 0 (Samuel et al. 2020).Star particles and gas cells are initialised with masses of 7100 M ⊙ , however, because of stellar mass loss, the typical is ≈ 5000 M ⊙ .The mass of dark-matter (DM) particles is 3.5×10 4 M ⊙ within the zoom-in region.The gravitational softening lengths for star and DM particles are fixed at 4 and 40 pc (Plummer equivalent), respectively, comoving at  > 9 and physical thereafter.The gas cells use adaptive force softening, consistent with their hydrodynamic kernel smoothing, down to 1 pc.
The selection criteria for each pair of halos in the 'ELVIS on FIRE' suite of LG-like galaxies is based on their individual masses ( 200m = 1 − 3 × 10 12 M ⊙ ), combined masses ( tot = 2 − 5 × 10 12 M ⊙ ), their relative separation (600 − 1000 kpc) and radial velocities ( rad < 0 km s −1 ) at  = 0.The mass resolution in the 'ELVIS on FIRE' suite is ≈ 2× better than the Latte suite, with initial masses of star particles and gas cells of ≈ 3500 − 4000 M ⊙ .
Our 13 simulated galaxies display broadly consistent properties as similar MW/M31-mass galaxies and exhibit comparable observational properties to the MW or M31, such as: MW/M31-like morphologies (Ma et al. 2017;Garrison-Kimmel et al. 2018;El-Badry et al. 2018;Sanderson et al. 2020) that follow stellar-to-halo mass relations (Hopkins et al. 2018), realistic stellar halos (Bonaca et al. 2017;Sanderson et al. 2018), and dynamics of metal-poor stars from early galaxy mergers (Santistevan et al. 2021).Each galaxy also hosts a satellite galaxy population with properties comparable to the satellites within the local Universe, such as: stellar masses and internal velocity dispersions (Wetzel et al. 2016;Garrison-Kimmel et al. 2019b), radial and 3-D spatial distributions (Samuel et al. 2020(Samuel et al. , 2021)), star-formation histories and quiescent fractions (Garrison-Kimmel et al. 2019b;Samuel et al. 2022b).

Halo/galaxy catalogs and merger trees
To generate the (sub)halo catalogs at each of the 600 snapshots, we use the ROCKSTAR 6D halo finder (Behroozi et al. 2013a) using DM particles only, and we use CONSISTENT-TREES (Behroozi et al. 2013b) to generate merger trees.None of the (sub)halos that we analyse have any low-resolution DM particle contamination, given the sufficiently large zoom-in volumes.
We briefly review how we implement star particle assignment in post-processing here, but we refer the reader to Samuel et al. (2020) for details.First, we select star particles within  < 0.8  halo , out to a maximum distance of 30 kpc, with velocities within  < 2  circ,max of the (sub)halo's center-of-mass (COM) velocity.We then keep the star particles within  < 1.5  star,90 of the (then) current member stellar population's COM and (sub)halo center position, where  star,90 is the radius that encloses 90 per cent of the stellar mass.Then, we select the star particles with velocities within  < 2  vel,star of the COM velocity of the member star particles, where  vel,star is the velocity dispersion of the current member star particles.Finally, we iterate on both the spatial and kinematic criteria until the (sub)halo's stellar mass converges to within 1 per cent.This also guarantees that the COM of the galaxy and its (sub)halo are consistent with one another.
We use two publicly available analysis packages: HaloAnalysis2 (Wetzel & Garrison-Kimmel 2020a) for assigning star particles to halos and for reading and analyzing halo catalogs/trees, and Giz-moAnalysis3 (Wetzel & Garrison-Kimmel 2020b) for reading and analyzing particles from Gizmo snapshots.

Selection of satellites
We select satellites in the same manner as Santistevan et al. (2023).To summarise, we include all satellites at  = 0 with  star > 3×104 M ⊙ that ever orbited within their MW-mass halo's virial radius,  200m .This stellar mass limit corresponds to roughly ≈ 6 star particles, which reasonably resolves the total stellar mass (Hopkins et al. 2018).
At our selection threshold of  star > 3 × 10 4 M ⊙ , the median peak halo mass is  halo,peak ≈ 9×10 8 M ⊙ which corresponds to ≳ 2×10 4 dark-matter particles.Thus, we resolve satellite subhalos well, to prevent significant numerical disruption according to the criteria in van den Bosch & Ogiya (2018); see Samuel et al. (2020) for more discussion on our satellite resolution convergence.
We include 'splashback' satellite galaxies that currently orbit outside of the MW-mass halo's  200m but are gravitationally bound to it (for example Wetzel et al. 2014).As Table 1 shows, the number of surviving satellites at  = 0 per host, including the splashback population, is 26 − 57, and our sample totals 493 satellites.
To avoid biasing our results to the hosts with more satellites, we oversample the satellites, so that each host contributes a near equal fraction of satellites to the total, to within 5 per cent; see Santistevan et al. (2023) for details.

Calculating orbit properties
Many dynamical modeling studies implement a static gravitational potential for the host, consisting of a sum of potentials for each component of the galaxy, such as the stellar/gaseous disc, the bulge, the stellar halo, and the dark-matter halo (see for example Kallivayalil et al. 2013;Gómez et al. 2015;Patel et al. 2017).To numerically integrate orbits through time, these studies then often use common numerical tools, such as Galpy, to solve the equations of motion at each timestep.
In our analysis, we backward integrate the orbits of satellite galaxies in mass profiles that we fit to each MW-mass host in the FIRE-2 simulations.In short, we model the mass profiles of the hosts at  = 0 with a generalized form of the spherical Navarro-Frenk-White (NFW, Navarro et al. 1996) density profile using dark matter and hot gas ( > 10 5 K) particles within  < 10 kpc, and all particles at  = 100 − 500 kpc.We model the disc with two double-exponential disc profiles, one for the inner disc (bulge) and one for the outer disc, using star particles and cold gas ( < 10 5 K).The median fit across all 13 MW-mass hosts is within ≈ 5 per cent of the enclosed mass profiles in the simulations at  > 10 kpc out to the virial radius.Thus, we test orbit modeling under a best-case scenario, at least for a static axisymmetric potential, with near perfect knowledge of mass profile at present day.We note that we do not model the additional gravitational potential for any given satellite galaxy.See Appendix A for more details on our fits to each MW-mass host.
Next, we select the cylindrical positions (, , ) and velocities ( R ,   ,  Z ) of the satellites at the  = 0 snapshot and use this 6D phase-space information to initialise their orbits.We then use the galactic dynamics python package Galpy 4 (Bovy 2015) to backward integrate the satellite orbits for 13.8 Gyr within each host's static axisymmetric potential.Because the MW-mass host is the only gravitational potential we account for, we do not include any movement of the MW-mass host throughout the satellite orbit integration.This paper focuses on understanding the base uncertainties in static potential orbit modeling, thus, we do not account for dynamical friction in our model.Including dynamical friction may improve the model orbits, however this is outside of the scope of this paper.
We explore numerous properties of satellite orbits, each of which provides insight into their orbit histories.For example, pericentres occur when the satellite is at its closest approach to the MW-mass host, when the satellite feels the strongest tidal forces and is deepest in the host CGM.Some studies use the first post-infall apocentre distance, also called the turn-around radius, as an alternate definition for the radius of the host galaxy (for example More et al. 2015;Diemer 2017).The orbital eccentricity of satellites describes the orbit shape, which can change over time in the simulations, but is fixed in the fixed potential models.The orbital energy is invariant in a time-independent potential, and in a spherically symmetric potential, total angular momentum is invariant.Thus, comparing the evolution of these properties with the simulations informs us to what extent this holds.We calculate the following properties for the satellites in our sample.
Pericentre distance, time, velocity, and number: We define the virial properties at  200m , the radius that encloses 200× the mean matter density of the Universe.We calculate pericentres in the same manner as Santistevan et al. (2023).First, we track the main progenitor of the satellites back in time through all 600 snapshots using the merger trees, and first confirm that the satellite is within the virial radius,  200m , of the MW-mass host halo at a given snapshot.Then, we find local minima in its galactocentric distance within a ±20 snapshot window, which corresponds to ≈ 1 Gyr in time.Given the ≈ 25 Myr time spacing between snapshots, we then fit a cubic spline to the distance, time, and velocity arrays in this snapshot window, and save the spline interpolated minimum distances, and the corresponding times and velocities at these pericentres.Finally, we assign the total number of pericentres to a satellite based on the number of times the above criteria are met.As mentioned in Santistevan et al. (2023), we checked our pericentre calculations using a snapshot window of ±4, 8, 10 snapshots and saw that our fiducial window of ±20 snapshots best eliminates 'false' pericentres, that is, cases where the pipeline finds a pericentre in either numerical noise or short-lived perturbations in the orbits.Additionally, whether we center the distances on the satellite or MW-mass host galaxies or DM (sub)halos does not affect our results.
Apocentre distance: We calculate apocentres similar to the way that we calculate pericentres.We first confirm that the satellite has orbited within  200m of the MW-mass host halo before a given snapshot, however, we do not require it to be within  200m at the snapshot of interest so that we may catch apocentres in the 'splashback' phase.We then find local maxima in the satellite's galactocentric distance within a similar window of ±20 snapshots.Finally, we similarly fit a cubic spline to the distance and save these values.
Infall time: To calculate infall time, we simply ensure that the satellite is within  200m at a given snapshot, and save the corresponding time that this first happens.In orbit modeling, we calculate infall time in two different ways depending on how we treat  200m .The first method involves using the evolving  200m from the simulations, and finding when the model orbit first crosses this distance, similar to how we calculate infall time for the satellites in the simulations.However, in our second method, we keep the present-day  200m as a fixed quantity over time, that is,  200m ( lb = 0), where  lb is lookback time.We find the instances in which the model orbit first crosses within this fixed distance.In the case that the model orbit was always within  200m ( lb = 0), we set the infall time equal to the beginning of the simulations, 13.8 Gyr ago.
Eccentricity: We approximate orbital eccentricity as: where  apo and  peri are the apocentre and pericentre distances, respectively.Defined this way, the eccentricity is a constant for a Keplerian orbit.However, it will vary within our model here, thus we choose adjacent pairs of apocentres and pericentres in the actual integrated orbit from the model to calculate it.We make no distinction in the ordering of pericentre/apocentre combinations, i.e. whether not to choose an apocentre only after pericentre, or pericentre only after an apocentre.
Orbital period: We approximate the orbital period by simply calculating the time between adjacent pericentres.We checked how this compares to the timing between adjacent apocentres and found consistent results.
Specific orbital energy: We take the specific orbital energy of a satellite to be the sum of the kinetic and potential energy per mass at each snapshot.The simulation snapshots store the gravitational potential at the location of each particle, which we use to compute at the location of a satellite.Following Santistevan et al. (2023), we select all star, gas, and dark matter particles within ±5 kpc of the satellite's virial radius, and use the median potential of these particles, to minimize the satellite's self-potential.
Because we track satellites across time, we must properly normalise the potentials at each snapshot.Our sample includes 3 LG-like pairs of MW/M31-mass hosts, thus, we cannot normalise the potentials at arbitrarily large distances.Therefore, we choose to normalise potentials with: where  sat,snap (,  lb ) is potential of a satellite at a given snapshot,  host,snap ( = 500 kpc,  lb ) is the potential for particles within a spherical shell at  = 500 ± 5 kpc around the MW-mass host, and the last two terms are the analytic gravitational potentials,  ×  (< )/, for the mass enclosed within 500 kpc at present-day and any given lookback time, respectively.We choose 500 kpc for several reasons: (i) the bound mass for a given MW-mass host does not change by more than a few per cent beyond this, (ii) satellites typically orbit as far as the 'splashback' radius, which we approximate as ≈ 1.5  200m ≈ 500 kpc from a spherical collapse model (for example Fillmore & Goldreich 1984;Bertschinger 1985), and (iii) we must choose distances smaller than the separation between the two MWmass hosts (≳ 840 kpc).
For the analytic potentials, we get and because the enclosed mass does not change significantly at 500 kpc, this integral results in Φ ≈ − ×  (< 500 kpc)/500 kpc.
The last three terms in Equation 2 ensure that the potential is properly normalised across different snapshots.
When we examine differences in total energy, we divide by the virial potential of the host halo,  200m,0 ≡  ×  ( <  200m ,  lb = 0)/ 200m , because each host has different  200m and  200m , so this ensures that we compare satellite evolution in a similar manner.However, the snapshots for 'm12z', 'Romulus', and 'Remus' do not have stored potential values, thus, we exclude them when we compare orbital energies.
Specific angular momentum: We calculate a satellite's specific angular momentum at each snapshot with ℓ =  × , where  is the total distance between the satellite and the center of the MW-mass host, and  is the total velocity of the satellite with respect to the center of the MW-mass host.
Tidal acceleration: Finally, we calculate the tidal acceleration a satellite feels by taking the derivative of  =  ×  (< )/ 2 with respect to , where  (< ) is the total enclosed mass of the host within a distance .We calculate this at every snapshot and save only the maximum |/ | that a satellite experienced after first infall, although this almost always coincides with when the satellite is at its closest approach to the MW-mass host.

Disc orientation
We use the 6D phase-space coordinates of satellites at  = 0 relative to the MW-mass disc, but the disc precesses over time.In Santistevan et al. (2021), we showed that after the disc stabilizes ≈ 5 − 11.5 Gyr ago (when the angular momentum vector of the disc stopped rapidly fluctuating in its orientation), the disc continued to precess between 5−130 • until  = 0. Satellites reach their closest approach to the MWmass disc when they are at pericentre, so pericentre properties are likely most sensitive to the host disc configuration.Thus, we explore different disc orientations and models to investigate how they affect the resultant satellite orbits in Appendix B. To summarise, in one model we rotate the disc by 90 • while keeping the same coordinates for satellites at  = 0.In another model, we use a point mass model for the disc.In both models, the median differences between all orbit properties we explore between our fiducial model and these different configurations is small, less than 1 per cent.Therefore the details of the geometric configuration of the galactic potential do not matter much to satellite galaxy orbits.For more discussion about this see Appendix B and Table B1.

RESULTS
Many studies integrate satellite orbits using a model of a static axisymmetric MW/M31 potential at  = 0, which is unphysical given that the MW evolves over time.Therefore, to provide context for our results on satellite orbits, we first quantify the mass evolution of MWmass hosts in our simulations, over both long and short timescales, including how this depends on distance.We then explore the extent to which satellites conserve energy and angular momentum.Finally, we explicitly compare results between the simulations and the idealized axisymmetric model.Because many of these distributions are non-Gaussian, throughout we present the median trends across the sample of host galaxies or satellites, as well as the half-width of the 68th or 95th percentile range, which for brevity we refer to as the 1 and 2 scatter, respectively.

Halo virial properties
We define virial properties at  200m , the radius that encloses 200× the mean matter density of the universe.Figure 1 shows the evolution of both the total (baryonic plus dark matter) virial mass of the hosts,  200m , and their virial radii,  200m .We show the median trend along with the 68th and 95th percentiles.The left-hand axes show the virial properties normalised to the present day ( lb = 0), while the righthand axes show these in physical units, scaling to the median.The grey line and shaded region also show the median and 68th percentile of the infall times,  lb infall,MW , of the satellites in our sample.The median MW-mass host  200m grew more quickly at earlier times, slowing around 4 Gyr ago.As Santistevan et al. (2020) showed, the fractional stellar mass growth of these MW-mass hosts is broadly consistent with studies based on abundance matching and dark-matter-only simulations (for example Behroozi et al. 2013b;Hill et al. 2017).Most satellites fell in 3.4 − 9.7 Gyr ago, when the median MW-mass host had ≈ 33 − 86 per cent of its mass at  = 0.
Figure 1 (right) shows the growth of  200m is nearly linear in time, with relatively small fractional scatter.When the typical surviving satellites fell in, the median MW-mass host had 26 − 73 per cent of its  200m ( = 0).Thus, the MW-mass hosts grew considerably in mass and radius, which affects the orbits of satellites.

Mass within fixed physical radii
Figure 1 showed the enclosed mass within  200m ().However, for satellites that already fell into the MW-mass halo, the additional growth ≳  200m , may not matter much, given that the orbits depend primarily on just the mass within an orbit.Therefore, Figure 2 shows the ratio of the enclosed mass within a given fixed physical radius, , at a given lookback time,  lb , relative to today,  (< ,  lb )/ (< ,  lb = 0).We show the median ratios of enclosed mass within  < 50, 100, and 150 kpc, along with the 68th and 95th percentiles for  < 50 kpc.Because the satellite orbits are sensitive to the enclosed mass, we choose to measure the enclosed mass within these distances because typical pericentre distances for the satellites in our sample are ∼ 50 kpc and typical apocentres are ∼ 200 kpc.
Because we show enclosed mass within fixed physical radii, the increase with lookback time ≳ 12 Gyr ago represents the Hubble expansion, prior to the collapse within that radius.The enclosed mass was ≈ 62 − 91 per cent of its present value during the typical infall times of surviving satellites; larger than for  200m in Figure 1 during the same time range.The spikes in the 95th percentile range likely arise from massive satellites.The enclosed mass fractions within  < 100 kpc and  < 150 kpc were larger at nearly all lookback times, meaning that mass has grown fractionally less at larger radii.In other words, the significant growth of the central galaxy, largely via gas cooling/accretion, leads to more significant mass grows at smaller radii.
Figure 3 (top) shows the evolution of the enclosed mass profile across time from  = 5 -500 kpc, where  is the satellite distance from the MW-mass host.We first calculate the median profile over all 13 hosts at each distance and snapshot with ≈ 1 Gyr time spacing; see the colorbar for the lookback time of a given mass ratio.Then we normalise the curves for each snapshot to the median profile of the hosts at present day.
In general, at each  the enclosed mass ratio increased over time, and likewise, at each time, the enclosed mass ratio increases with distance.Similar to Figure 2, the mass growth over time at large  was not as significant compared to small .For example, typical recent pericentre distances for satellites in the simulations are ≈ 50−60 kpc, and compared to present day, the enclosed mass was only 74 per cent during typical satellite infall times.The median present-day distance of the satellite galaxies in the simulations is around 175 kpc, and at  lb = 7.4 Gyr ago, the enclosed mass was ≈ 83 per cent of its mass at  = 0.However, near the virial radius and beyond, the enclosed mass was already 97 per cent during typical satellite infall times.

Short-term evolution of host mass
The enclosed mass at any given time is subject to the perturbations of satellite galaxies that are orbiting around and merging within the main host.To examine the variability in the enclosed masses, we calculate a time-averaged enclosed mass profile over the last 2 Gyr, and we normalise the enclosed mass at every snapshot between  lb = 0 − 2 Gyr to this time-averaged one.Figure 3 (bottom) shows the 1 and 2 scatter of these ratios at each distance in the solid and dashed blue lines, respectively, in the bottom panel of Figure 3.
Similar to the top panel, the 1 scatter shows larger variability at small , suggesting that the fractional growth in the inner regions of the MW-mass host is larger than at large .However, the 2 scatter does monotonically decrease with distance, but it is constant between  = 10 − 100 kpc.Halos and their galaxies grow hierarchically over time, and each figure in this section explicitly quantifies this idea in the evolving virial region (Figure 1), within fixed distance apertures (Figure 2), and at fixed time (Figure 3).Modeling the enclosed mass/potential of a host at  = 0 and holding that potential fixed across many Gyr does not account for the real, substantial growth of the host halo environment in which satellites orbit.

Orbital energy and angular momentum
In a time-independent potential, the specific energy of a satellite's orbit is conserved.Likewise, in a spherically symmetric potential, the specific angular momentum, ℓ =  × , is conserved, while the component of ℓ along the minor axis is conserved in an axisymmetric potential.In Santistevan et al. (2023), we showed that satellites that fell in ≲ 10 Gyr ago conserve their median ℓ across the population, but importantly with large scatter of ≈ 40 per cent.However, we did not examine trends as a function of lookback time across an orbit.We next examine how well orbital energy and angular momentum of a satellite's orbit are conserved.We stress that we show trends across the full population of satellites, including the full range of values of satellites with a particular infall time, distance, and  star , which gives a sense of conservation on a satellite-by-satellite basis.
Figure 4 (top row) shows the difference in the total orbital energy between present-day and infall into the MW-mass halo.To scale to the characteristic energy of a given halo, we divide these differences by  200m,0 =   200m / 200m , the virial gravitational potential of the host halo today.We show these fractional energy differences as a function of the lookback time of satellite infall into MW-mass host,  lb infall,MW , distance from the MW-mass host, , and satellite stellar mass,  star .
Generally, the median fractional difference in specific energy decreases with increasing  lb infall,MW from 0 to −2.5 (top left).Thus, satellites that fell into their MW-mass halo earlier lost more energy, because the MW-mass host grew in mass by ≳ 20 per cent (Figure 2).Next, the median fractional difference in total energy increases with  (top middle), from −2 to roughly −0.2 for satellites near  200m .Satellites that currently orbit at smaller distances have lost more energy since infall, compared to satellites that orbit at larger dis- 2).Similar to Figure 2, the total mass within a given physical distance, , relative to the value today, but now as a function of .We show the median across our 13 MW/M31-mass hosts, at various lookback times,  lb , back to 11 Gyr, which encompasses the infall times of > 95 per cent of surviving satellites.The enclosed mass increases over time at essentially all , with the inner halo near the galaxy experiencing the most fractional mass growth.Thus, the approximation of a static halo mass/potential is least accurate for satellites with the smallest pericentres.For context, the median recent pericentre of surviving satellites is ≈ 50 kpc.Bottom: Typical short-term fluctuations in the host halo mass profile (Section 3.1.3).The 1 and 2 standard deviation of the fractional change/fluctuation of the total mass of the host over the last 2 Gyr, which is the typical of the orbital timescale of satellites at  ≈ 30 kpc.These fluctuations in enclosed mass are weaker at larger distances.On average, the host mass growth/fluctuations over the last 2 Gyr are ≲ 5 per cent, though such fluctuations would be even higher for halos with a massive (LMC-or M33-mass) satellite.
tances, which only recently fell in.The fractional difference in total energy only weakly depends on  star (top right).Satellites below  star ≲ 10 7.5 M ⊙ have a median fractional energy difference of roughly −0.5, and the fractional energy difference for more massive satellites decreases to as low as −1.6, but we caution that there is only one satellite with  star > 10 9 M ⊙ .Next, we investigate the specific angular momentum of satellite orbits, ℓ.My studies implement a spherically symmetric component to the potential, such as an NFW profile for the DM halo (for example Kallivayalil et al. 2013;Patel et al. 2017;Besla et al. 2019), in which the total angular momentum is constant.We show in Appendix B that satellite orbits are insensitive to the direction/details of the axisymmetric component of the potential, the disc, so we show results in ℓ and not the component of ℓ along the minor axis of the potential.
In Santistevan et al. (2023), we discussed trends in the angular momentum difference today versus infall, normalised by the angular momentum at satellite infall; (ℓ 0 − ℓ infall )/ℓ infall .Figure 4 (bottom row) shows the same difference, only now normalised by the an-gular momentum at present-day, that is, (ℓ 0 − ℓ infall )/ℓ 0 .Similar to Santistevan et al. (2023), we find weak dependence in the median fractional change in ℓ across the population since infall.The median is as large as ≈ 45 per cent compared to the values at infall.These satellites make up roughly 20 per cent of the total sample.The fractional difference in ℓ shows virtually no dependence with  or  star .The average 1 scatter is ≈ 40 and 50 per cent versus  and  star .
Our results suggest that satellite populations do not show overall conservation of energy or angular momentum.If we focus on the 68th and 95th percentile ranges in each panel, we see cases in which the energies or specific angular momenta for some satellites at presentday are similar to their values at infall, for a large range of infall time, , and  star .However, this is not true for all satellites, and the uncertainties in  and ℓ suggest that one cannot determine the orbital energy or angular momentum of a given satellite at infall to within a factor of ≳ 2 from present-day measurements.Rather, satellites commonly lose some orbital energy since infall, likely because of the growth of the MW-mass host potential over time, the effects of dynamical friction on high-mass satellites, and also satellite-satellite interactions that may torque the satellite orbits.
We also investigate the extent to which the kinetic and potential energy components are conserved with respect to infall time, , and  star .Versus infall time, the fractional change in the kinetic energy of satellites that fell in recently is positive and the fractional change in the potential energy was negative, suggesting these satellites are likely on their first infall and nearing their first pericentre.Satellites with larger infall times often had negative fractional changes in kinetic and potential energies, because of both dynamical friction slowing the satellites and the growing host potential over time.Within  < 100 kpc, the circular velocity profile rises significantly, and satellites that orbit today have much larger kinetic energies compared to infall.We note no strong trends with  star .
Having quantified the change in orbital energy and angular momentum from first infall to today, Figure 5 quantifies the extent to which a satellite conserves  and ℓ as a function of lookback time.We show the median trend across the population of satellites in the dashed lines.For simplicity, we present the 1 scatter across the entire population, at a given snapshot in the solid lines.We include only galaxies that were still satellites at a given lookback time, including splashback satellites, so the size of the sample monotonically decreases with lookback time.
Figure 5 (top) shows the difference in the satellite orbital energy between a given lookback time and today,  () −  0 , normalised by the MW-mass halo potential today,  200m,0 .Over the last ≈ 3.5 Gyr, the median total energy is relatively unchanged, but at earlier times, this fractional difference increases with increasing lookback time from 0 to as large as ≈ 2 at 11.75 Gyr ago.The fractional change in  reaches 25 per cent at ≈ 4.8 Gyr ago, 50 per cent at 7.9 Gyr ago, and 100 per cent at 9.2 Gyr ago.
Although the median fractional energy change over the last 3.5 Gyr is small, this is only a statement about the population, and it does not imply energy conservation for a typical satellite, given the large scatter.The 1 scatter reaches 25 per cent already at ≈ 1.6 Gyr ago, 50 per cent at 6.1 Gyr ago, and 100 per cent 9.1 Gyr ago.
Figure 5 (bottom) shows the fractional change in the specific angular momentum of satellites, ℓ.Similar to energy, we compute the difference of ℓ at each lookback time with the value today, but now we normalise this difference by ℓ today.The median ℓ is constant for longer, over the last ≈ 6 Gyr, before which it decreases.This implies that early-infalling satellites gained angular momentum on average, as we showed in Santistevan et al. (2023).The median fractional difference reaches 25 per cent at a much later time of ≈ 11.9 Gyr The median decreases from 0 to −2.5 with increasing infall lookback time, meaning that satellites have lost energy and become more bound since infall, because the host has grown (Figures 1-3).Because  inversely correlates with infall time, the median change in energy increases from −2 to −0.2 with  (middle) and is constant with satellite mass at  star ≲ 10 7.5 M ⊙ , but it decreases at higher mass because of dynamical friction.Although  correlates with these properties, the widths of the 68th percentiles span ≈ 1.2 − 1.5, highlighting the significant variation (and thus uncertainty) for any given satellite.Bottom row: Change in specific orbtal angular momentum, ℓ, relative to the value today.The median fractional difference is generally zero for satellites that fell in ≲ 9.5 Gyr ago, but earlier infalling satellites have increased by up to ∼ 45 per cent.We find little to no change in the median ℓ with  or  star , except for satellites with  star ≳ 10 7 M ⊙ , which experienced stronger dynamical friction.The mean widths of the 68th percentiles in the fractional change in ℓ span 70 per cent versus infall time and , and 100 per cent versus  star .These uncertainties in both  and ℓ imply that we cannot infer the initial energy or angular momentum of a given satellite's orbit at its time of infall to better than ≳ 2. ago, and 50 per cent at 11.4 Gyr ago.Compared to the fractional change in , the 1 scatter reaches a given fraction later, 25 per cent at ≈ 4.2 Gyr ago, 50 per cent at 9.3 Gyr ago.We stress again that, although the population median is conserved longer, this does not imply that a given satellite's ℓ is conserved for this long.Rather, the 1 scatter represents the typical uncertainty for a given satellite.
Figures 4 and 5 show that neither  nor ℓ are conserved across time, which agrees with Figures 1-3, and results from the growth and general time dependence of the host halo potential.The 1 scatter represents the typical uncertainty for a given satellite, which is as large as ≈ 50 per cent for ℓ around 9.3 Gyr ago, and as large as a factor of 2 for  at 9.1 Gyr ago.

Orbit modeling in static axisymmetric host potential
We now compare orbit properties of satellites from our cosmological simulations to properties derived in an idealized, static, axisymmetric model.As we describe in detail in Appendix A, we fit the presentday host potential, keep it fixed over time, and initialise the satellite orbits at  = 0 using the same 6D phase-space coordinates as in the cosmological simulations.Thus, the orbital energy and angular momentum of satellites remain constant, and the satellites orbit periodically across 13.8 Gyr.
Figure 6 shows four representative satellites: each column shows varying degrees of how well orbit modeling reproduces the most recent pericentre distance.To quantify how well orbit modeling does in reproducing the recent pericentre distance, we measure Δ peri / peri = ( peri,model −  peri,sim )/ peri,sim .From top to bottom, we compare the host-centric distance, the total velocity, specific angular momentum, and specific energy of the orbit.
Orbit modeling agrees well with the simulations during the satellites' recent histories.In the left two columns, orbit modeling recovers the orbits well for one half to two orbits, the third column shows agreement with the timing of the orbit for two and a half orbits but less agreement for the distance and velocity, and the right column show agreement for less than half an orbit.
Even in the left two cases, in which the model does well at reproducing the most recent pericentre distance, orbit modeling does not accurately recover previous pericentres, especially the timing of the pericentres, which continues to become more out of phase with time, likely due to the lack of dynamical friction.The right column shows cases in which the timing of the most recent pericentre is within ≈ 0.5 Gyr but the pericentre distance is off by nearly a factor of 2.
Finally, the third and bottom rows of Figure 6 show the lack of conservation in specific angular momentum and specific energy for the satellites in the simulations.For each satellite, we show the fractional change in ℓ compared to present-day, that is, (ℓ() − ℓ 0 )/ℓ 0 .Even after a satellite falls into its MW-mass halo, its angular momentum can increase or decrease ≳ 50 per cent over time.The lack of conservation in ℓ is likely a combination of complex processes, including the growth of the MW-mass host, satellite-satellite interactions, mergers, and the non-symmetric potential.In the bottom row, we calculate the fractional change in energy compared to present-day, normalised by the host virial potential energy today, that is, ( () −  0 )/ R200m,0 .Similar to the results in Figures 4-5, the specific energy of a satellite orbit decreases over time, primarily because of the growth of the MW-mass host.
In the following subsections, we quantify differences in orbit properties across the entire satellite population.It is worth noting that dynamical friction acts more efficiently at higher masses to rob the  3).While the median, which represents the systematic bias across the population, reaches 25 per cent at 4.8 Gyr ago, the 1 scatter, which represents the uncertainty for a given satellite, reaches 25 per cent already at 1.6 Gyr ago.Bottom row: Specific angular momentum.The median is about zero over the last ≈ 6 Gyr, but at earlier times the angular momentum decreases, down to −60 per cent, meaning that the specific angular momentum systematically has increased since infall.Again, the 1 scatter reaches 25 per cent already 4.7 Gyr ago.Thus, while inferences of orbital energy and angular momentum, based on their conservation, are relatively unbiased (for the overall population) over the last ≈ 5 − 8 Gyr, their uncertainties for a given satellite are large already ≈ 1 Gyr ago.
satellites of their orbital energy and cause them to merge away (for example Boylan-Kolchin et al. 2008).We remind the reader that when interpreting the plots, we do not include a model for dynamical friction.

Orbital distance
First, Figure 7 shows the absolute fractional difference in host-centric distance from the simulations versus from orbit modeling, versus lookback time.We show the absolute value of the median to keep all values positive and visual clarity, however, the median is negative for lookback times of  lb ≲ 7.5 Gyr, and positive for  lb ≳ 7.5 Gyr.
Over the last 8.5 Gyr, the median fractional difference is relatively constant at ≲ 10 per cent.Before this, the median then increases to 25 and 50 per cent, around 9.7 and 10 Gyr ago.Prior to ≈ 11.7 Gyr ago, less than 1 per cent of these satellites today were still satellites.The 1 scatter reaches a 25, 50, and 100 per cent fractional difference at 4, 6.3, and 8.6 Gyr ago.

Virial infall time
Many studies focus on when low-mass galaxies first become satellites, and their properties, such as mass, during infall (for instance Boylan-Kolchin et al. 2011;Wetzel et al. 2015;Patel et al. 2017).
We investigate two ways of calculating the time of first infall into the host halo within orbit modeling,  lb,infall,model .First, when a satellite first crossed within the MW-mass halo, accounting for the growth of its  200m over time.Second, we record when a satellite's orbit first crossed within  200m at  = 0. Nearly 60 per cent of all satellites in orbit modeling always have orbited within  200m ( lb = 0), so we simply define these infall times to be 13.8 Gyr ago.We take the difference of each  lb,infall,model to the infall time in the simulations,  lb,infall,sim (calculated with an evolving  200m ). Figure 8 shows these differences, versus the infall times in the simulations (left) and versus satellite  star (right).
In the left panel, using an evolving  200m ( lb ), the median difference between orbit modeling and the simulations is generally within ≈ 2 Gyr.The peak in the curve at 1.5 Gyr is driven by 11 of the 20 satellites in this particular bin in infall time, where orbit modeling predicts that these galaxies fell in ≳ 5 Gyr earlier than in the simulations.Similarly, a slightly smaller peak at 7.5 Gyr is caused by the model over-predicting  lb,infall,model for nearly half of the satellites by ≳ 2 Gyr.
The 68th percentile range is largest for the most recently infalling satellites and decreases with increasing lookback time.Orbit modeling generally overestimates the infall time compared to the simulations, because the model orbits are periodic and more likely to cross the virial radius at earlier times.The model overpredicts the infall time for roughly 65 per cent of all satellites.Even when accounting for the evolving  200m , the 1 scatter spans ≳ 1 Gyr, which highlights the large uncertainty in orbit modeling.
When using a fixed  200m ( lb = 0), the median shows relatively good agreement for satellites that fell in within the last ≈ 4 Gyr.Beyond 4 Gyr ago, the median difference in infall time increases until ≈ 7 Gyr ago, where it decreases again.The orbits of these satellites were generally within  200m ( lb = 0) at all times, so the difference between orbit modeling and simulations follows the relation 13.8 Gyr −  lb,infall,sim .Of the subset of satellites that fell in between 1 − 2 Gyr ago, only 2 of the 17 satellites were always within  200m , which is why the 68th percentile dips down close to 0. However, the associated uncertainties in this method of calculating infall time are much worse, and the 1 scatter reaches as large as ≈ 5.7 Gyr.
Versus  star , both infall metrics follow the same general trends: better agreement for satellites with  star < 10 7 M ⊙ and larger offsets for higher-mass satellites.The offset between the medians, and 68th percentiles, is roughly 3 − 4 Gyr between the two infall metrics.The values associated with the fixed  200m ( lb = 0) method skew to larger values, because many of the model orbits always orbited within this distance.The 1 scatters each span roughly 2.5 − 3 Gyr, so one cannot accurately determine a satellite galaxy's infall time at any given mass to within ≈ 2.5 Gyr.

Pericentre properties
We next investigate various properties associated with pericentric passages relative to the host galaxy, when the tidal acceleration and ram pressure from the host CGM tend to be strongest.Lookback time [Gyr] 0.0 0.5 1.0 0 0.1 0.3 0.6 1 2 3 6 redshift 0 0.1 0.3 0.6 1 2 3 6 redshift 0 0.1 0.3 0.6 1 2 3 6 redshift 0 0.1 0.3 0.6 1 2 3 6 redshift Figure 6.Four case studies of satellite orbital histories.Orbital distance from the host galaxy,  (top row), total velocity (second row), specific angular momentum (third row), and specific total energy (bottom row).We show 4 satellites based on how well the most recent pericentre agrees between the simulations and orbit modeling, ( peri,model −  peri,sim )/ peri,sim , with increasing error from left to right (see legend).Black lines show the simulations, while blue lines show model-based orbits in a static axisymmetric host potential.Black vertical dashed lines show the first infall into the MW-mass halo in the simulation, and in the top row the grey line shows  200m ( ) of the host halo.Green vertical dotted lines show pericentres in the simulation.In the left case study, the orbit model and simulation agree for nearly two full orbits, while the right case study shows agreement for less than half an orbit.Orbit modeling tends to recover the timing of the most recent pericentres better than their distances.As the bottom rows show, specific angular momentum and total energy of the orbit are not generally conserved; see also Figures 4 and 5.
Most works assume that satellite orbits only shrink over time, because of dynamical friction and the time-dependent host potential (for example Weinberg 1986;Taylor & Babul 2001;Amorisco 2017), which implies that the most recent pericentre should be the smallest experienced.However, as we showed in Santistevan et al. (2023), for 67 per cent of our satellites with  peri ≥ 2 the most recent pericentre is not the smallest, because many satellite orbits have grown in pericentre distances over time.Patel et al. (2020) also saw cases in which the most recent pericentre was not the smallest, and suggest that the presence of a massive satellite alone can cause this effect.Therefore, we present trends for the most recent and the minimum pericentres.
Figure 9 compares pericentre distances,  peri , the velocity at pericentre, , the number of pericentric passages,  peri , and the timing of pericentres,  peri , versus the lookback time of infall into the MW-mass halo, present-day distance, , and satellite  star .For each pericentre property, we show the difference between orbit models and simulations, except for pericentre distance, for which we compare the fractional difference.
Figure 9 (top row) shows the trends for pericentre distance, for both the most recent,  peri,rec , and the minimum pericentre,  peri,min .For a given satellite, all of its pericentre distances are the same in our (static) orbit models.With respect to satellite infall time (left panel), the model recovers the median  peri,rec well, but this is across the entire population of satellites, not for a given satellite.The median fractional difference is within ≈ 20 per cent, and the average 1 scatter is roughly 19 per cent, smaller than many other properties we present here.Thus, although the median pericentre distance across the population looks reasonable, the prediction for any particular satellite is uncertain by ≈ 20 per cent.Orbit modeling recovers the median  peri,min well for satellites that fell in ≲ 5 Gyr ago, because all but one of these satellites experienced only one pericentre, so the minimum is the most recent.For satellites that fell in ≳ 5 Gyr ago, the median fractional offset in  peri,min diverges from that of  peri,rec .Roughly 60 per cent of satellites that fell in ≳ 5 Gyr ago experienced multiple pericentres, and because the orbit models only predict a single  peri for a given satellite, these positive values suggest that the satellites in the simulations orbit at closer distances than in our static model.The 1 scatter for  peri,min reaches 100 per cent around 9.5 Gyr ago, thus one cannot accurately predict the minimum pericentre to within ≳ 2 for satellites that fell in earlier than this.
In the middle panel, median  peri,min and  peri,rec between the orbit models and simulations show general consistency across all distances The fractional difference in  peri,rec is ≲ 5 per cent, and ≲ 25 per cent for  peri,min .As we will discuss below, ≈ 95 per cent of satellites that currently orbit beyond 300 kpc completed only one pericentre, so the median and 68th percentiles are the same between both  peri,min and  peri,rec .Conversely, 2/3 of satellites currently shows the absolute value of the median across all satellites and the solid line shows the 1 scatter of the sample.We choose to show the absolute value of the fractional difference to keep all values positive for visual clarity, however, the median is negative for lookback times ≲ 7.5 Gyr, and positive for earlier times.Similar to Figure 5, we only include the instantaneous population of satellites for a given lookback time.The median reaches 25 per cent at 9.7 Gyr ago, but the 1 scatter reaches 25 per cent already at 4 Gyr ago.
within 300 kpc have  peri ≥ 2 and, similar to the left panel, the median and scatter for  peri,min increases to positive values, indicating that orbit modeling overpredicts  peri,min for these satellites.For the satellites within 300 kpc, nearly 85 per cent fell into their MW-mass hosts over 5 Gyr ago.As with the left panel, the range in the 1 scatter is larger for  peri,min than in  peri,rec , with average values of 55 per cent and 24 per cent, respectively.
Finally, the median fractional difference in both pericentre metrics shows no dependence on stellar mass for  star < 10 8.75 M ⊙ (right panel).Lower-mass satellites typically fell in earlier, orbit at closer distances, and completed more pericentres, so their minimum and most recent pericentres are more likely to diverge.Only one satellite in our sample has  star > 10 9 M ⊙ .
Figure 9 (second row) shows trends in the total velocity of pericentre,  peri , and similar to  peri , we show trends in both the minimum,  peri,min , and most recent,  peri,rec .Again, satellites at small infall time, large , and high  star , only experienced one pericentre, so the median trends in both  peri,min and  peri,rec are the same.The model recovers both the median  peri,min and  peri,rec to within ≈ 30 km s −1 across all infall times and , and nearly all  star .Although the model recovers the median  peri,rec well, it over-estimates  peri,rec in all panels presumably because of the lack of dynamical friction and gravitational perturbations from other satellites.
Figure 9 (third row) compares the number of pericentric passages a satellite experienced,  peri .For the model orbits, we only count the number of pericentric passages a satellite experienced since first infall.We count  peri from the two infall metrics in Section 3.3.2:since infall into the MW-mass halo accounting for an evolving  200m ( lb ), and since infall while keeping a fixed  200m  200m ( lb = 0).We show the mean and standard deviation for  peri , because it is an integer for a given satellite.
These results for  peri are not particularly sensitive to the two ways in which we calculate first infall.In the left panel, the mean difference in  peri slightly increases with infall time, because orbits in the models are periodic, so longer integration times lead to larger  peri .The middle panel shows trends versus host distance, : for satellites at ≲ 250 kpc, orbit modeling overpredicts  peri , because these satellites typically fell into their MW-mass hosts earlier than satellites at larger .Finally, the mean difference in  peri is generally flat at  star < 10 7.5 M ⊙ , but the difference increases for more massive satellites, given the lack of dynamical friction in the orbit models.
Finally, Figure 9 (bottom row) compares the timing of just the most recent pericentre,  lb peri .The median difference is ≲ 0.2 Gyr across all three panels.The median is also consistently negative, indicating that orbit modeling predicts more recent pericentres, likely because in the orbit models the MW-mass host does not reduce in mass going back in time.
Typically maximized near a pericentric passage, satellites feel a tidal acceleration from the host, which strips their mass.We calculate the acceleration to be  =   (< )/ 2 , where  (< ) is the total enclosed mass of the host within a distance .We then compute the derivative with respect to  and save the maximum |/ | that a satellite experienced after first infall.
Figure 10 compares the maximum |/ | experienced between the simulation and model.Satellites that fell in  lb infall,MW = 2.5 − 7 Gyr ago typically have larger minimum pericentres in the simulations than in orbit modeling, so the model overpredicts |/ | for these satellites by up to 45 per cent.Conversely, satellites that fell in  lb infall,MW ≳ 7 Gyr ago show larger minimum pericentres in the orbit models, so underpredicts |/ | for the earliest satellites by up to 55 per cent.Because the simulations and orbit models agree in  peri,min for satellites that fell in  lb infall,MW < 2.5 Gyr ago, the median fractional difference is near zero.
Figure 10 (middle and right) shows that |/ | has little to no dependence on present-day satellite distance or  star .Although the median fractional difference is close to 0 in both panels, the 1 scatter increases from 0.39 − 2 with , and the mean scatter versus  star is 73 per cent.In all three panels, the 2 scatter spans 100 per cent or more, so while the median |/ | across the population from orbit modeling is relatively accurate, for any given satellite, orbit modeling over-or underpredicts |/ | typically by a factor of ≈ 2.
Finally, Appendix C compares both the timing and distance of pericentres between orbit modeling and the simulations at each previous lookback pericentric event.The bias in both the distance and timing of pericentres increases with increasing lookback pericentre events to roughly 20 per cent in distance, and ≈ 1.5 Gyr in time.The uncertainty in these measurements increases up to the 4th most recent pericentre to ≈ 50 per cent in distance and ≈ 1 Gyr in time.Beyond this, ≲ 8 per cent of satellites experienced 5 pericentres or more.
In summary, we compared various pericentre properties for both the minimum and most recent pericentre events.Across the full sample, the median fractional difference, or bias across the population, for the minimum and most recent pericentre distances are 2.5 − 6.6 per cent.The bias in the pericentre velocity is within ≈ 20 km s −1 , within 0.2 Gyr for the timing of the most recent pericentre, and < 2 for the number of pericentric events.Finally, the bias in the maximum tidal acceleration is typically 10's of per cent across infall time, , and  star .Just as importantly, the typical 1 scatter, which represents the uncertainty for a given satellite, is significant at ≈ 10 − 70 per cent.For orbit modeling, we measure infall time two ways: using a non-evolving host halo radius,  200m ( lb = 0) (green) and using  200m ( lb ) from the simulations (purple).Solid lines show the median and shaded regions show the 68th percentile ranges across our satellites.Left: The median infall time is generally accurate for orbit modeling if using an accurate  200m ( lb ), with a median offset of ≲ 2 Gyr and a typical scatter of 2.4 Gyr.The spike at 1.5 Gyr comes from orbit modeling overpredicting the infall times for most satellites by ≳ 5 Gyr, given errors in modeling recent apocentre distances.By contrast, orbit modeling using fixed  200m ( lb = 0) works well for recently infalling satellites but vastly overestimates the lookback time to infall for earlier-infalling satellites, with an average 1 scatter of 2.8 Gyr.Right: The median and 68th percentile for both infall time metrics show similar trends with satellite mass, offset by ≈ 3 Gyr.The typical 1 scatters range from roughly 3 Gyr and 2.5 Gyr when using either the fixed  200m ( lb = 0) or accurate  200m ( lb ), respectively.Orbit modeling fails most significantly for satellites with  star ≳ 10 7 M ⊙ , because dynamical friction has shrunk their orbits.

Apocentre, orbital period, and eccentricity
The apocentre measure how far a satellite orbits from its host, and an orbit spends most of its time near apocentre.Figure 11 (top) compares trends in the most recent apocentre distance,  apo,rec .We only measure an apocentre that occurs after infall into the MW-mass halo.About 68 per cent of our satellites experienced an apocentre; the rest are on first infall.
Versus infall time (top left), 8 satellites fell into the host between  lb infall,MW = 2 − 3 Gyr ago, and orbit modeling generally overpredicts  apo,rec by ≳ 75 per cent for half of them.The fractional difference in apocentre distance is smaller for earlier-infalling satellites, and the median is ≈ 0.015 with a mean 1 scatter of 0.06. Figure 11 (top middle) shows little dependence with .Satellites that currently orbit at smaller distances generally fell into the MW-mass host earlier, so orbit modeling somewhat underpredicts  apo,rec for satellites within ≲ 250 kpc, similar to how it underpredicted  apo,rec at large  lb infall,MW (top left).Overall, the mean 1 scatter is ≈ 0.08.Finally, the median fractional difference in apocentre distance decreases weakly with  star (top right).Lower-mass satellites typically fell into their MW-mass halo earlier, and they have smaller fractional differences.
Figure 11 (middle row) shows trends in the most recent orbital period,  orbit .We define the orbit period as the time difference between the two most recent pericentric passages, and we find nearly identical results using the times between apocentres.47 per cent of the satellites in our sample experienced 2 or more pericentres in both the simulations and orbit modeling.
In the left panel, satellites that fell in  lb infall,MW < 4.5 Gyr ago did not have enough time to undergo 2 pericentres.For earlier infalling satellites, the median difference in  orbit varies by as much as −0.4 Gyr, but the mean across all infall times is −0.15.The difference in  orbit is negligible versus .Orbit modeling does not account for dynamical friction, therefore, for satellites with  peri ≥ 2, if orbit modeling underpredicts  peri,rec compared to the simulations, it suggests more bound orbits and smaller  orbit values, which is what we see for these satellites with  star ≳ 10 6.5 M ⊙ .
Finally, Figure 11 (bottom row) compares the orbital eccentricities, . Figure 11 (bottom left) shows that, for satellites that fell in  lb infall,MW ≲ 4 Gyr ago, orbit modeling recovers  peri,rec well (Figure 9, top left) and overpredicts  apo,rec .Similarly, although orbit modeling recovers  apo,rec well for satellites up to  lb infall,MW = 7.5 Gyr ago, it also underpredicts  peri,rec , which drives  to be higher in orbit modeling.Orbit models and simulations show similar results for earlier-infalling satellites.The median difference in  is flat with  and  star < 10 8.25 M ⊙ .
Again, we compute all results in this subsection based on the most recent pericentre and apocentre, but as Figure 9 showed, orbit modeling performs worse for earlier properties of an orbit, so comparison of orbital period and eccentricity at earlier stages of these orbits would show even larger disagreements.

Recoverability of orbit properties
We compared 15 properties of satellite orbits in our cosmological simulations against orbit models using static axisymmetric potentials that we fit near-exactly to our hosts at  = 0. Table 2 lists the properties that we tested, as well as the median offsets and 1 and 2 scatters across our sample of satellites.We compare both the raw difference of a given orbit property, , defined as  model −  sim , as well as the fractional difference, ( model −  sim )/ sim .Additionally, we show the fractional change in orbital specific energy since infall relative to the MW-mass potential, ( 0 −  inf )/ 200m,0 and the fractional change in orbital specific angular momentum relative to today, (ℓ 0 − ℓ inf )/ℓ 0 .The orbit models conserve these quantities by definition.).This difference slightly increases from ≈ 0 to 1 with  lb (left) and decreases from 2 to 0 with  (middle), because satellites at small distance typically fell in earlier, which means that they orbited longer in the model.Fourth row: Difference between the lookback time of the most recent pericentre,  peri,model −  peri,sim , which shows weak trends with any properties.Although the median trends across the satellite population agree well in most cases, the substantial scatter in all panels implies significant uncertainty for a given satellite's orbit history.
We quantify the goodness of the orbit models in terms of their 'bias' (accuracy) and 'uncertainty' (precision) in the right-most columns of Table 2.The 'bias' describes how well orbit modeling accurately recovers the median orbital property across the satellite population: we define a property to be minimally, moderately, or highly biased if the median fractional offset of the satellite popula-tion between orbit modeling and the simulations is < 10 per cent, 10 − 25 per cent, or > 25 per cent, respectively.However, even in cases where this bias is small (accuracy is high), orbit modeling can have severe limitations if it cannot model the history of a given satellite to good precision.Thus, we also quantify the 'uncertainty' via how large the scatter in this difference between orbit models and  (right).Solid lines show the median and the dark and light shaded regions show the 68th and 95th percentiles across all satellites.These trends mirror those for the minimum pericentre distance,  peri,min , in Figure 9 (top).Because this orbit modeling does not account for the growth of the MW-mass host and the orbits are periodic, it increasingly over-predicts the pericentre distance, and underpredicts the tidal acceleration, with increasing  lb infall,MW ≳ 5 Gyr.The dependence on  and  star are weak.However, the 1 scatters span more than 50 per cent in each of the panels here, and up to a factor of 2, highlighting the large uncertainties.Comparing the most recent apocentre distance,  apo , orbital period,  orbit , and orbital eccentricity, , from orbit modeling versus the simulations, as a function of lookback time of infall in the MW-mass halo (left), current distance from MW-mass host,  (middle), and satellite  star (right).Solid lines show the median and the dark and light shaded regions show the 68th and 95th percentiles across all satellites.Top row: The most recent apocentre distance.Versus infall time, the median is roughly constant at −2 per cent for satellites that fell in ≳ 3.5 Gyr ago.The model recovers the median apocentre distance to within ±5 per cent versus , but the median decreases slightly with  star from ≈ 0 to 8 per cent at  star = 10 8.25 M ⊙ .Although the medians in each panel may only be within a few per cent, the 1 scatters span ≈ 5 − 10 per cent or more, highlighting the uncertainty for a given satellite.Middle row: Difference between the most recent orbital time,  orbit , defined as the difference in time between the two most recent pericentres.Because orbit modeling generally under-predicts the recent pericentre lookback times, and because the orbits are periodic, the median difference in  orbit is slightly negative across all panels, and even as low as 0.5 − 1.5 Gyr for satellites with  star ≳ 10 7.25 M ⊙ .Bottom row: Difference between most recent orbital eccentricity,  = ( apo −  peri )/( apo +  peri ).The difference in  varies by at most 0.06 versus  lb infall,MW and , but satellites with  star > 10 8.5 M ⊙ have differences > 0.15.In general, orbit modeling recovers the median properties here within ≈ 7 per cent (see Table 2), though with significant scatter.The 1 scatters span 8 − 15 per cent, likely because these properties all depend on pericentre and apocentre events that occur in the recent past.
simulations is across the satellite population.We define a property to be minimally, moderately, or highly uncertain if the 1 scatter is < 25 per cent, 25 − 50 per cent, or > 50 per cent.Because bias is more problematic (systematic) than uncertainty, we impose stricter criteria for it.
Figure 12 visually represents these summary results, via the median offsets, and 1 and 2 scatters, for the fractional differences between orbit modeling and simulations.We rank order each property independently in each panel.
The median fractional offset (representing the 'bias') of all properties ranges from −0.54 for the fractional change in specific energy to 0.44 for the lookback time of satellite infall using a fixed  200m ( lb = 0).The properties whose median agrees to within ±5 per cent across the population include: the most recent pericentre distance,  peri,rec , the lookback time of the most recent pericentre,  lb peri,rec , the maximum value of the derivative of the tidal acceleration, |/ |, the fractional change in the angular momentum relative to today, (ℓ 0 − ℓ infall )/ℓ 0 , the most recent apocentre distance,  apo,rec , the eccentricity of the most recent orbit,  rec , and the total satellite velocities at the minimum and most recent pericentres,  peri,min and  peri,rec , respectively.Not surprisingly, orbit modeling tends to model/recover recent properties of an orbit with the least bias across a population.
Properties that agree moderately, to within 5 − 10 per cent, include the distance of the minimum pericentre,  peri,min , the lookback time of infall into the MW-mass host,  lb infall,MW , and the most recent satellite orbit period,  orbit,rec .Both  lb infall,MW and  peri,min occurred further back in time than the properties that show the least bias.
Finally, the properties that are most systematically biased in orbit modeling are: the number of pericentric passages both with an evolving and fixed  200m ,  peri and  peri,fixed , the lookback time of infall when keeping a fixed  200m =  200m ( lb = 0), and the change in orbital energy since infall,  inf .
Just as important as examining the bias (offset in the median across the population) is the uncertainty for a given satellite, via the scatter across the population.This ranges across ≈ 0.06 − 0.82 at 1 and ≈ 0.38 − 4.2 at 2.Again, properties that occurred more recently generally have smaller 1 scatter (aside from  peri,min ).
At best, the uncertainty for a given satellite is 6 per cent in the apocentre distance, and ≳ 10 per cent for all other properties.Additionally, these uncertainties reach nearly a factor of ≈ 2 in energy, and the 2 scatters are ≳ 40 per cent.
Because we model the host potential to within a few per cent at  = 0, the uncertainties in Table 2 represent lower limits to the bias/uncertainty in orbit modeling in practice.

Summary of results
We compared 15 orbit properties for 493 satellite galaxies around 13 MW-mass hosts in the FIRE-2 suite of cosmological baryonic simulations against orbit histories derived from orbit modeling in a static, axisymmetric potential for the same hosts, to quantify rigorously the accuracy and precision of this orbit modeling technique.Specifically, we fit axisymmetric potentials to each MW-mass hosts at  = 0 to within a few percent, which also means that the uncertainties that we present are lower limits to a more realistic scenario applied to the MW/M31 with uncertainty in the underlying host potential.We now discuss the key questions we raised in the Introduction and our corresponding results.

How much has the mass profile of a MW-mass host evolved over the orbital histories of typical satellites?
• Most surviving satellites first fell into their MW-mass halo 3.4− 9.7 Gyr ago.During that time,  200m and  200m of the host were 33 − 86 per cent and 26 − 73 per cent of their values today (Figure 1), so they roughly doubled since then.
• Perhaps more relevant for satellite orbits, the total enclosed mass within a fixed physical distance increased meaningfully (Figures 2-3).Within 50 kpc, the typical recent pericentre distance of our satellites, the enclosed mass was only ≈ 74 per cent of its present-day value at typical satellite infall times (≈ 7.4 Gyr ago).
• The fractional increase in the enclosed mass of the host is larger at smaller distances (Figures 2-3).This is contrary to the expectations of 'inside-out' growth of a dark-matter halo from DMO simulations, where most halo growth occurs at larger radii (for example Diemand et al. 2007;Wetzel & Nagai 2015).With the inclusion of baryonic physics (most importantly gas cooling), more physical growth occurs at smaller distances, most relevant for satellites with smaller pericentres.
How well does orbit modeling in a static axisymmetric host potential recover key orbital properties in the history of a typical satellite?
• Calculating the infall time of a satellite in the model with a growing  200m () yields more consistent results with the simulations, ≲ 2 Gyr offset, compared to using the fixed  200m ( = 0) at present-day, but the 1 uncertainties in both metrics can be as high as ≈ 5 − 6 Gyr (Figure 8).
• Orbit history properties that occurred more recently have smaller fractional offsets and uncertainties than properties that occurred in the past.For instance, the timing and distance of the most recent pericentre have median fractional offsets (uncertainties) ≲ 3 (9 − 21) per cent, compared to the minimum pericentre distance, which occurred further back in time, which has a fractional offset (uncertainty) of 7 (53) per cent (Figures 8 and 9 and Table 2).
• The orbit properties that orbit modeling recovers best (with the smallest bias and uncertainty) include the distance, timing, and velocity of the recent pericentre, the velocity at the minimum pericentre, the most recent apocentre distance, and the orbit eccentricity and period.The properties that are not recovered well (largest bias and/or uncertainty) include the minimum pericentre distance, number of pericentric passages, the lookback time of infall into MW-mass host halo, maximum strength of the tidal field, and the change in total orbital energy (Figure 12 and Table 2).
• Even with near-perfect knowledge of the mass distribution/potential at  = 0 in the host galaxies, the typical uncertainties in these orbit properties range from 6 − 82 per cent.Furthermore, the satellite-to-satellite variations in each, that is, the 2 scatters, are ≳ 40 per cent.Thus, one cannot recover these orbit properties to within a factor of ≈ 2 or so, which cautions against overgeneralizing the results for a single satellite from the median trends (Figure 12 and Table 2).
• At fixed mass, the spatial extent or orientation of the host galaxy disc does not significantly affect the orbital properties of the satellites.Compared to a disc that is rotated by 90 degrees, or a point mass disc model, the median offsets between the fiducial disc model are within ≈ 10 −3 per cent, and the widths of the 68th percentiles are less than 6 × 10 −2 per cent (Table B1).

How far back in time can one reliably model the orbital history of satellites in a static axisymmetric host potential?
Table 2. Comparing the results of orbit modeling in a static axisymmetric host potential against cosmological baryonic simulations.Column list: property name; variable; median offset, 1 scatter, 2 scatter.The left columns compare the raw difference,  model −  sim , while the middle columns compare the fractional difference, ( model −  sim ) /  sim .Additionally, we describe the strength of the bias (median fractional offset) and uncertainty (1 scatter of the fractional offset) associated with each property.In the bottom two rows, we show the difference in the orbital energy between present day and the infall time, normalised by the host halo potential energy at present-day, ( 0 −  inf )/ 200m,0 , and the fractional change in the orbital specific angular momentum relative to present-day, (ℓ 0 − ℓ inf )/ℓ 0 .Given that energy and angular momentum are always conserved in the model, we place them below a horizontal line to distinguish them.

Raw Difference
Fractional • The specific energy and specific angular momentum of an orbit are generally not conserved: uncertainties are less than 25 per cent only back to ≈ 3.1 Gyr.Backward integrating orbits more than ≈ 9 Gyr results in energy uncertainties up to a factor of ≈ 2 or more (Figures 4,5 and 12,and Table 2).
• At the most recent orbit, the uncertainty in pericentre distance is already ∼ 20 per cent, and ∼ 200 Myr in pericentre timing (Figure C1).Subsequent orbits result in larger uncertainties.

Comparison to D'Souza & Bell
Our analysis is closest to that of D 'Souza & Bell (2022), who similarly fit symmetric models to MW-mass halos from the ELVIS suite of DMO simulations (Garrison-Kimmel et al. 2014) to study the uncertainties associated with orbit modeling.First, we note key differences in methods.D 'Souza & Bell (2022) used DMO simulations, which neglects the (tidal) acceleration from the central galaxy that modify these orbits and could strip/disrupt satellites that orbit nearby.The internal stellar feedback in a satellite also can reduce the inner density of dark matter and make the satellites more vulnerable to tidal disruption (Bullock & Boylan-Kolchin 2017), although this is a second-order effect (Garrison-Kimmel et al. 2017).Without these baryonic effects and processes, the surviving satellites in DMO simulations typically fell into their MW-mass halo earlier and were able to complete more pericentres, while also orbiting closer to the center of the host with smaller pericentric passages than we showed in Santistevan et al. (2023).However, D'Souza & Bell (2022) did account for the effects of dynamical friction in their model, similar to Patel et al. (2020), which acts to slow satellites down and ultimately merge within the host, which we do not.D'Souza & Bell (2022) also examined the gravitational effects from LMC-like analogs in MW-mass hosts.Because the LMC is so massive, it hosts its own satellite population, and studies suggest that it is on first infall into the MW and near its first pericentre (for example Kallivayalil et al. 2013;Deason et al. 2015;Kallivayalil et al. 2018;Patel et al. 2020), so accounting for its gravitational influence on the surrounding satellites is of great interest.Although some hosts in our simulations have LMC-like analogues at previous snapshots (see Samuel et al. 2021;Barry et al. 2023), we do not analyze them specifically.
Another significant difference between D'Souza & Bell (2022) and our analysis is that they account for the true mass growth of the  2 lists, based on the fractional level of agreement between orbit modeling in a static axisymmetric host potential and the cosmological baryonic simulations.For clarity, we shorten ( 0 −  inf )/ 200m,0 to  inf , and (ℓ 0 − ℓ inf )/ℓ 0 to ℓ inf , and we show the absolute change of  inf in the left panel to more easily compare with the other properties.The left panel shows the median offset, in order from negative to positive values, and the middle and right panels show the 1− and 2 scatters in order of agreement.Satellite orbit properties that occurred recently, such as the recent pericentre distances/times/velocities, show smaller median offsets, ≲ 3 per cent, compared to properties that occurred further in the past, such as the minimum pericentre distances or infall times, ≳ 5 per cent.The same is generally true for the 1 scatter, but not always for the 2 scatter, despite the strong correlation between the two.Thus, deriving orbit parameters in the recent past yields median results largely consistent with cosmological simulations, but almost always with significant scatter (uncertainty) always ≳ 10 per cent, and orbit parameters that occurs further in the past generally suffer from non-trivial bias and significant uncertainty.Furthermore, these results are best-case scenarios for static axisymmetric host potentials, because we fit them near-exact to the simulations at  = 0.
MW-mass host at every snapshot by updating their potentials while keeping the potential fixed between snapshots.We do not account for the mass growth of the MW-mass host, because the majority of orbit modeling studies in the literature implement a fixed mass/potential (for example Patel et al. 2017;Fritz et al. 2018a;Fillingham et al. 2019;Pace et al. 2022), because we do not know the full mass histories of the MW or M31, and also because the mass assembly history for each MW-mass host in our simulations is unique.
D'Souza & Bell (2022) define a recovered property as being when the absolute value of the fractional difference is less than 30 per cent, that is, | true −  model |/ true < 0.3, and report the fraction of satellites that do not meet this criterion as being 'outliers'.They center their results on two hosts, 'iDouglas' which is an isolated MW-mass galaxy and 'iOates' which has an LMC analog, but they test orbits in iOates with and without the gravitational contribution from this massive companion.Similar to our analysis, they focus on the timing and distance of various pericentre events, the apocentre distances, and the infall times of satellites, and find that more recent pericentres and apocentres have smaller outlier fractions than pericentres or apocentres that happened at earlier times.In particular, the outlier fractions for the most recent, and second-most recent pericentre distances in the hosts without the additional massive satellite are 31.2− 47 and 43.8 − 69.9 per cent, respectively.Although we do not look specifically at the second-most recent pericentre, we do find that the median fractional offsets and 1 uncertainties for the most recent pericentre distance, −2.5 and 21 per cent, are smaller compared to the same for the minimum pericentre distance which often occurred ≈ 6 Gyr earlier, 0.066 and 53 per cent.The authors also show that the timing of the most recent pericentre is often better recovered than the distance, with an outlier fraction of 13.8 − 23.2 per cent.Our results show that the median fractional offset is −2.8, which is comparable to the offset in recent pericentre distance, but with a smaller 1 uncertainty of 9 per cent.Thus, it is often easier to recover the timing of a pericentre than its distance.D'Souza & Bell (2022) similarly showed that the distance of the most recent apocentre has a smaller outlier fraction than the most recent pericentre distance, with a value of only 6.2 − 34.9 per cent.Our work also suggests that the apocentres are easier to recover, with a median fractional offset and 1 uncertainty of −1.3 and 6 per cent, respectively.They conclude that apocentres are easier to model because they only depend on the binding energy of a satellite galaxy, while the pericentres depend on the angular momentum of a satellite as well as its binding energy.Properties at apocentre also do not intricately depend on the details of the gravitational potential at small distances like pericentres do, and rather, what is more important is modeling the total enclosed mass precisely, as both studies have done.Finally, the authors also calculate the infall times of satellites and find good agreement in their simulations and model, with an outlier fraction of 11.2 − 28.9 per cent.Although we generally see small median offsets when calculating infall time with an evolving  200m (), ≈ 6.7 per cent, the associated uncertainty is high, ≈ 41 per cent, and using a fixed  200m ( = 0) is worse.D'Souza & Bell (2022) explored other models for the growth of the MW-mass host, and most comparable to our work is their results in which they keep the mass fixed over time.They report that, depending on the property, using the static model in some cases produces the best results as compared with the simulations.However, as one integrates longer in time, the static model becomes less representative of the MW-mass environment, and thus satellite orbit properties that occurred earlier are not modeled as well, which is what we similarly see in the minimum pericentre properties and infall times.The authors explored a model that accounts for the median mass growth of the 48 MW-mass halos in the ELVIS DMO simulation suite and found that both the static model and median mass growth model reproduce similar results at recent lookback times as well.Finally, they implemented a static model with 40 per cent more mass, and a static model with different halo concentrations, which both returned larger biases, scatters, and outlier fractions.Thus, modeling the mass and shape of the halos at present-day is important.
Finally, D'Souza & Bell (2022) compared uncertainties in the virial mass of the MW-mass host, and thus uncertainties in the potential, the uncertainty in the 6D phase-space coordinates of the satellites at  = 0, the uncertainty in modeling the recent LMC-like accretion, and the uncertainty in modeling the motion of the MWmass system as it moves throughout the Universe.They concluded that the uncertainties in the recovered orbit history properties when using simple parametric forms of the potential or ignoring the LMClike contribution to the potential are comparable to the uncertainty in results caused by a ≈ 30 per cent uncertainty in the virial mass of the MW-mass host.

Comparison to other studies
The results in Figures 2 and 3 (top) differ strikingly from those from DMO simulations, where the enclosed mass at smaller radii is set earlier than at larger radii.For example, using the Via Lactea dark matter-only simulation, Diemand et al. (2007) showed that the enclosed dark matter mass within 100 kpc assembled prior to  ≈ 1.7 ( lb ≈ 9.9 Gyr ago), and grew only by about 10 per cent since then.The enclosed mass at even smaller distances formed earlier.In a related analysis, Wetzel & Nagai (2015), showed that the enclosed dark-matter mass within a fixed physical  ≈ 50 − 100 kpc at  = 1 was already ≳ 85 − 95 per cent of its value at  = 0, compared to the mass in stars and gas, which were only ∼ 55 − 70 per cent of the mass at  = 0.Because dark matter is collisionless and dissipationless, and because the accretion radius grows over time, dark-matter growth occurs largely 'inside-out'.However, gas can cool over time, which drives the formation of the central galaxy, and leads to more physical mass growth at smaller radii compared with the dark matter.
To derive the orbit histories of satellites of the MW and M31, many studies commonly apply orbit modeling in a static host potential or use simple approximations of the growth of the host.For instance, Kallivayalil et al. (2006) used a fixed MW-mass potential, as well as an LMC-like potential, to determine that the SMC was gravitationally bound to the LMC.Kallivayalil et al. (2013) used 3 epochs of HST measurements to constrain the LMC's proper motions and suggest that the LMC is likely on its first infall, as previous studies have suggested (for example Besla et al. 2007).The authors compared different static models of the MW and models that account for the growth of the MW-mass halo over the last 10 Gyr, but in the evolving models, the authors did not account for the central galaxy.Patel et al. (2017) similarly sought to understand the orbit of the LMC around the MW, as well as the orbit of M33 around M31, and concluded that like the LMC, M33 is likely fell into the M31 halo less than ≲ 2 Gyr ago or so.Although these authors modeled the many components of the main galaxy, they did not account for any time dependence of the host potential.Patel et al. (2017) also compared the orbits of massive satellite analogues in the dark matter-only Illustris-1-Dark simulation to an NFW model of the MW-mass host halo with a dynamical friction model included.The authors used the  = 0 6D phase-space coordinates for the satellites and integrated their orbits, similar to our pipeline, but for 6 Gyr, and suggest that this type of orbit modeling technique shows good agreement between the two orbits for satellites on first infall and for satellites that recently completed their first pericentre.The orbits that we present in Figure 6 show good agreement between the simulations and model for recent lookback time as well, however, only the top-left panel shows good agreement for up to 6 Gyr.There are other satellites in our sample that show agreement for this time range, however, we choose to intentionally show test cases in which the model does not do well in this figure as well.
Recent work not only suggests that the LMC has only recently fallen into the MW's halo and is currently near its first pericentre (for example Kallivayalil et al. 2009Kallivayalil et al. , 2013)), but that it has a satellite population of its own (for example Deason et al. 2015;Kallivayalil et al. 2018;Patel et al. 2020).Other studies account for the gravitational contribution of the LMC on the other satellites of the MW.Kallivayalil et al. (2018) used Gaia data, in conjunction with the DMO Aquarius simulation, to determine which satellites might be gravitationally bound to the LMC.The simulation does have an LMC analogue, with a similar position and velocity at  = 0, but does not account for the gravitational effect of the central galaxy.Patel et al. (2020) used newer Gaia observations and numerically integrated the orbits of satellites in a model of the potential with a MW, LMC, and SMC component to determine the orbital histories of the LMC and its satellites.Although they keep the potential fixed over time, they only backward integrated the orbits of satellites for ≈ 6 Gyr, given that beyond this time, MW-mass halos typically have ≲ 80 per cent of their mass at  = 0.The authors concluded that the derived orbits for satellites of the LMC strongly depend on whether or not you include the LMC in the global potential, and in some cases, the contribution from the SMC is important as well.Finally, other studies aim to accurately model the LMC potential with basis function expansions fit to simulation data to understand how it affects the MW, and its satellites, as in Garavito-Camargo et al. (2019, 2021); Correa Magnus & Vasiliev (2022), but doing so while accounting for the growth of the MW remains challenging.
Studies such as Fritz et al. (2018a) and Fillingham et al. (2019) provide inferences of the infall times, pericentre and apocentre distances, and orbit eccentricities for all satellites in the MW.Fritz et al. (2018a) used data from Gaia DR2, and numerically integrated the orbits of all satellites in Galpy in two different models for the MW potential, which they kept fixed over time.Depending on the mass of the MW, the authors showed that some of the apocentres for the satellites can lie either inside or outside of the virial radius, which has implications for instance in studying how a satellite interacts with the hot gas in the MW halo.Furthermore, the authors suggested that some satellites have pericentres as close as ≈ 20 kpc, where the strong tidal acceleration from the central galaxy is important.As we showed in Figures 1-3, depending on when these pericentres take place, the mass of the host was only a fraction of its mass today.Fillingham et al. (2019) used the Phat ELVIS DMO simulations, which include an analytic disc potential, to statistically sample satellites with similar present-day positions and velocities to make cosmologically informed predictions for what the infall times of the MW's satellites were.The Phat ELVIS simulations account for the growth of the disc potential through abundance matching scaling relations, where each disc has a unique growth rate, and thus, tidally disrupt subhalos that orbit close to the disc (Kelley et al. 2019).
Most of the MW-mass simulations in the FIRE-2 suite do not have an LMC-like analogue at  = 0, but there are at least four analogues at earlier times, within the last 6 − 7 Gyr ago, and previous works have used these analogues to study both planar configurations of satellites and the effect of LMC-mass satellites on subhalo populations (Samuel et al. 2021;Barry et al. 2023).Given that the LMC recently fell into the MW's dark matter halo, ≲ 2 Gyr ago (for example Besla et al. 2007;Kallivayalil et al. 2013;Patel et al. 2017), previous orbit modeling studies that include the effects of the LMC to derive orbits of the MW's satellites work well in this recent regime.On the other hand, because we do not include the effects of an LMC in the host potential, our work informs how well static potential orbit modeling works in the regime before the LMC was a satellite, that is, ≳ 2 Gyr ago.However, the results suggesting that the LMC is on first infall and just passed its first pericentre are trustworthy, given that the mass of the MW and its dark matter halo have not significantly changed over the last ≲ 2 Gyr.As such, both kinds of studies, both with and without an LMC analogue, complement each other in understanding the full, complex MW formation history.

APPENDIX A: MODELING HOST MASS PROFILES
We integrate the orbits of satellite galaxies using the galactic dynamics Python package Galpy (Bovy 2015), which allows users to define custom host potentials.We thus fit the enclosed mass profiles of the MW-mass hosts in our simulations as a sum of disc and halo components, to supply to Galpy.

A1 Modeling the disc
We model the MW-mass disc as the sum of two double exponential discs, one for the inner disc/bulge, and one for the outer disc.The density profile of the disc is: (A1) where  inner disc and  outer disc are amplitudes of mass density ( ⊙ / kpc 3 ),  inner disc and  outer disc are disc scale radii ( kpc), and ℎ  is the disc scale height ( kpc).Given that satellite orbits are more sensitive to the enclosed mass, rather than the local density, we first integrate out the vertical component of Equation A1, then we integrate over cylindrical  to obtain the disc enclosed mass given by  (< ) = 4 ℎ   inner  inner −  −/ inner ( inner + ) , (A2) and a similar term for the outer disc component.We thus ensure that the total enclosed mass agrees to within a few percent at all relevant radii when integrated over all space.We fit and model the disc with star particles and cold gas ( < 10 5 K) within  = 0 − 20 kpc and | | < 3 kpc.Including mass within | | < 4, 5 did not significantly change the derived parameters.
The orbits of typical satellites are not sensitive to the details of the size, geometry, or orientation of the disc, see Appendix B.

A2 Modeling the halo
To fit the halo component, we use a generalized form of the spherical Navarro-Frenk-White (NFW, Navarro et al. 1996)  where  halo is the amplitude ( M ⊙ ),  halo is the scale radius ( kpc), and  and  are the slopes of the inner and outer density profile.We then integrate the analytic form of () to convert this to an enclosed mass profile: where 2  1 is the Gauss hypergeometric function.We model and fit this profile with dark-matter and hot gas ( > 10 5 K) within  < 10 kpc, and all particles (dark matter, all gas, and stars) at  = 10 − 500 kpc.We obtained similar parameters selecting particles within  < 300, 350, 400 kpc, but we use the fits out to 500 kpc, because many satellites orbit out to there (see Santistevan et al. 2023).We tried fitting the halos to a regular NFW profile, where  = 1 and  = 3, but we obtained notably better agreement with the generalized form above.

A3 Total mass profile
We use the parameters we derived from fitting the above analytic density/mass profiles in their respective potentials in Galpy.In Galpy, we specifically use the spherically symmetric 'TwoPower-SphericalPotential' for the DM halo, and axisymmetric 'DoubleEx-ponentialdiscPotential' for the inner and outer discs, and we input all three to define the total potential of a given host, to use to integrate the orbits of its satellites.Table A1 lists the fit parameters.
Figure A1 shows the ratio of the halo+disc analytic model using the parameters in Table A1 to the enclosed masses in the simulations; a 1:1 ratio is presented as the dotted horizontal line.At all distances we show, we model the median enclosed mass to within 3-4 per cent, and the 68th percentile range is within 5 per cent beyond 10 kpc.Within 10 kpc, the 68th and 95th percentiles of the enclosed mass ratios span both higher and lower than what we present in the y-axis, but only ≈ 5 per cent of surviving satellites orbited at such small distances.
Individually, we model the two-component disc to within ≈ 4 per cent at  = 2 − 20 kpc and the halo to within ≈ 10 − 20 per cent.Although we double counts the star particles and cold gas between 10 − 20 kpc, the median mass ratio in Figure A1 is ≲ 2 per cent  A1 lists all fit parameters.Black line shows the median and the shaded regions show the 68th and 95th percentile scatter across our 13 hosts.
The median agrees to within < 4 per cent at all radii and ≲ 1 per cent for the mass within  200m (344 − 472 kpc).The ratio shows larger host-to-host scatter at ≲ 10 kpc, but only ≈ 5 per cent of satellites in our sample orbit this close.This level of agreement ensures that modeling the total (axisymmetric) mass profile of each host at  = 0 is not a significant source of error for our orbit modeling.
within this distance and produces a good fit to the data.One galaxy, m12c, experienced a significant merger (≈ 3 : 1 mass ratio) ≈ 9 Gyr ago, thus modeling this host with symmetric analytic models does not capture the complexity in its mass distribution.Thus, the total enclosed mass in m12c is fit to within ≈ 15 per cent.Comparing the timing and distance of each pericentric passage between orbit modeling and simulations, which we measure at each 'pericentre event', where 1 is the most recent pericentre, 2 is the second most recent, and so forth.We select all satellites that experienced (at least) a given number of pericentres, and we show the median and 1 scatter across all satellites.Top: The median fractional difference in pericentre distance decreases from −2.5 to −18 per cent going back 6 pericentres while the 1 scatter increases from 18 to 48 per cent.Less than 3 per cent of satellites experienced more than 7 pericentres, in both the simulations and orbit models.Bottom: The median difference in the timing of the pericentric events decreases with lookback pericentre events from −30 Myr for the most recent pericentre to −1.2 Gyr.The 1 scatter ranges from ≈ 0.2 − 0.85 Gyr, so not only does orbit modeling increasingly systematically underpredict the timing of a pericentre, but also, orbit modeling becomes more uncertain, over longer lookback times.
cent experienced three or more.The most pericentres experienced is 10, but we cut the figure at 6 pericentres given that only 3.4 per cent experienced more than this.Focusing on trends with distance, the median fractional difference in the most recent pericentre is only −2.5 per cent, similar to Figure 9 (top row).As we compare pericentres further back in time, the median decreases to −18 per cent for satellites at their 6th-most recent pericentre.However, this is not to say that the model never overpredicts the pericentre distances.For the most recent pericentric passage, the model under-predicts the distance in ≈ 60 per cent of the sample, and over-predicts the distance in the other 40 per cent.
Likewise, the 1 scatter generally increases with increasing lookback pericentre events, where it reaches a max value of ≈ 48 per cent at the 5th-most recent pericentre.Thus, the median difference between the model and simulation increases with each orbit as we look back in time, but also the uncertainties associated with these pericentre distances increases.
Similar to trends in distance, Figure C1 (bottom) shows that the median difference in the timing of each pericentre decreases from −30 Myr at the most recent pericentre to −1.2 Gyr for the 6th most recent.Orbit modeling continuously under-predicts pericentre events, presumably because the host mass is unchanging as the satellites orbit.For a given satellite, the more massive host with a deeper gravitational potential will result in a more bound orbit compared to the same satellite in the same MW-mass host with less mass at earlier times.More bound satellites thus orbit deeper in the potential with shorter orbit timescales.Regarding the most recent pericentric passage, the model under-predicts the timing in ≈ 66 per cent of satellites and over-predicts the timing in 30 per cent; the remaining 4 per cent of satellites have nearly identical values.The 1 scatter spans ≳ 0.5 Gyr for the second-most recent pericentre and beyond.
These results highlight another aspect of how far back in time orbit modeling reliably works.If one is only interested in deriving the most recent pericentre distance or time, the model recovers the median trends of satellites to within a few per cent, but the uncertainties are non-negligible.For the satellites that orbited more than once, the most recent pericentre is not always the smallest (see Santistevan et al. 2023), therefore if one is interested in how close satellites orbited to their host, then the uncertainty in deriving these distances will be ≳ 35 per cent, and ≳ 0.5 Gyr in the timing.
This paper has been typeset from a T E X/L A T E X file prepared by the author.

Figure 1 .Figure 2 .
Figure 1.Left: Total (baryonic plus dark matter) mass of a MW/M31-mass halo,  200m , enclosed within  200m as a function of lookback time,  lb .The left axis shows  200m at  lb relative to  200m today, while the right axis converts this to the median  200m across our sample.The black line shows the median, and the dark and light shaded regions show the 68th and 95th percentiles across our 13 hosts.The vertical grey line and shaded region show the median and 68th percentile range of the lookback times of infall for surviving satellites.The typical host halo had 54 per cent of its final  200m when a typical satellite fell in ≈ 7.4 Gyr ago, so  200m nearly has doubled over that time.Right: Same, for the growth of the MW/M31-mass halo radius,  200m . 200m shows nearly linear growth with time over the last ≈ 12 Gyr.The typical host halo had 43 per cent of its final  200m when a typical satellite fell in ≈ 7.4 Gyr ago, again highlighting significant change over the orbital history of a typical satellite.

Figure 3 .
Figure 3. Top: Average long-term evolution of the mass profile (Section 3.1.2).Similar to Figure2, the total mass within a given physical distance, , relative to the value today, but now as a function of .We show the median across our 13 MW/M31-mass hosts, at various lookback times,  lb , back to 11 Gyr, which encompasses the infall times of > 95 per cent of surviving satellites.The enclosed mass increases over time at essentially all , with the inner halo near the galaxy experiencing the most fractional mass growth.Thus, the approximation of a static halo mass/potential is least accurate for satellites with the smallest pericentres.For context, the median recent pericentre of surviving satellites is ≈ 50 kpc.Bottom: Typical short-term fluctuations in the host halo mass profile (Section 3.1.3).The 1 and 2 standard deviation of the fractional change/fluctuation of the total mass of the host over the last 2 Gyr, which is the typical of the orbital timescale of satellites at  ≈ 30 kpc.These fluctuations in enclosed mass are weaker at larger distances.On average, the host mass growth/fluctuations over the last 2 Gyr are ≲ 5 per cent, though such fluctuations would be even higher for halos with a massive (LMC-or M33-mass) satellite.

Figure 4 .
Figure4.The difference in a satellite's specific orbital energy and specific orbital angular momentum between today and at first infall into the MW-mass halo, versus lookback time of infall (left), current distance from host,  (middle), and satellite  star (right).Orbit modeling nearly always assumes that these quantities are conserved.Solid lines show the median for all satellites within the 13 MW-mass hosts, and the dark and light shaded regions show the 68th and 95th percentiles.Top row: Change in specific orbital total energy, , relative to the virial energy of the MW-mass host halo today,  200m,z=0 =   200m / 200m .The median decreases from 0 to −2.5 with increasing infall lookback time, meaning that satellites have lost energy and become more bound since infall, because the host has grown (Figures1-3).Because  inversely correlates with infall time, the median change in energy increases from −2 to −0.2 with  (middle) and is constant with satellite mass at  star ≲ 10 7.5 M ⊙ , but it decreases at higher mass because of dynamical friction.Although  correlates with these properties, the widths of the 68th percentiles span ≈ 1.2 − 1.5, highlighting the significant variation (and thus uncertainty) for any given satellite.Bottom row: Change in specific orbtal angular momentum, ℓ, relative to the value today.The median fractional difference is generally zero for satellites that fell in ≲ 9.5 Gyr ago, but earlier infalling satellites have increased by up to ∼ 45 per cent.We find little to no change in the median ℓ with  or  star , except for satellites with  star ≳ 10 7 M ⊙ , which experienced stronger dynamical friction.The mean widths of the 68th percentiles in the fractional change in ℓ span 70 per cent versus infall time and , and 100 per cent versus  star .These uncertainties in both  and ℓ imply that we cannot infer the initial energy or angular momentum of a given satellite's orbit at its time of infall to better than ≳ 2.

Figure 5 .
Figure5.Fractional change in specific energy, , and specific angular momentum, ℓ, of satellite orbits versus lookback time.Solid line shows the median across all satellites and dashed line shows the 1 scatter.We only include galaxies that are satellites at a given lookback time, including splashback satellites.The vertical line and shaded region represent the median satellite infall time and 68th percentile range of values, respectively.Top: Specific energy.The median fractional change increases with lookback time, that is, they were less bound at infall than they are today, because the host has grown (Figures1-3).While the median, which represents the systematic bias across the population, reaches 25 per cent at 4.8 Gyr ago, the 1 scatter, which represents the uncertainty for a given satellite, reaches 25 per cent already at 1.6 Gyr ago.Bottom row: Specific angular momentum.The median is about zero over the last ≈ 6 Gyr, but at earlier times the angular momentum decreases, down to −60 per cent, meaning that the specific angular momentum systematically has increased since infall.Again, the 1 scatter reaches 25 per cent already 4.7 Gyr ago.Thus, while inferences of orbital energy and angular momentum, based on their conservation, are relatively unbiased (for the overall population) over the last ≈ 5 − 8 Gyr, their uncertainties for a given satellite are large already ≈ 1 Gyr ago.

Figure 7 .
Figure7.The fractional difference between orbit modeling and the simulations for the host-centric distance versus lookback time.The dashed line shows the absolute value of the median across all satellites and the solid line shows the 1 scatter of the sample.We choose to show the absolute value of the fractional difference to keep all values positive for visual clarity, however, the median is negative for lookback times ≲ 7.5 Gyr, and positive for earlier times.Similar to Figure5, we only include the instantaneous population of satellites for a given lookback time.The median reaches 25 per cent at 9.7 Gyr ago, but the 1 scatter reaches 25 per cent already at 4 Gyr ago.

Figure 8 .
Figure 8.The difference in infall time, from orbit modeling versus the simulations, as a function of infall time in the simulations (left) and satellite  star (right).For orbit modeling, we measure infall time two ways: using a non-evolving host halo radius,  200m ( lb = 0) (green) and using  200m ( lb ) from the simulations (purple).Solid lines show the median and shaded regions show the 68th percentile ranges across our satellites.Left: The median infall time is generally accurate for orbit modeling if using an accurate  200m ( lb ), with a median offset of ≲ 2 Gyr and a typical scatter of 2.4 Gyr.The spike at 1.5 Gyr comes from orbit modeling overpredicting the infall times for most satellites by ≳ 5 Gyr, given errors in modeling recent apocentre distances.By contrast, orbit modeling using fixed  200m ( lb = 0) works well for recently infalling satellites but vastly overestimates the lookback time to infall for earlier-infalling satellites, with an average 1 scatter of 2.8 Gyr.Right: The median and 68th percentile for both infall time metrics show similar trends with satellite mass, offset by ≈ 3 Gyr.The typical 1 scatters range from roughly 3 Gyr and 2.5 Gyr when using either the fixed  200m ( lb = 0) or accurate  200m ( lb ), respectively.Orbit modeling fails most significantly for satellites with  star ≳ 10 7 M ⊙ , because dynamical friction has shrunk their orbits.

Figure 9 .
Figure9.Comparing various properties of orbital pericentres, between the simulations and orbit modeling, for surviving satellites versus their lookback time of infall into the MW-mass halo (left), present-day distance from the MW-mass host,  (middle), and satellite  star (right).Solid lines show the median, and the shaded regions show the 68th and 95th percentiles, across all satellites.Top row: Fractional difference between pericentre distances, ( peri,model −  peri,sim )/ peri,sim , for both the most recent and the minimum pericentre.Orbit modeling predicts larger minimum pericentres, as high as 100 per cent in the median, for satellites that fell in ≳ 12 Gyr ago (left), and within 25 per cent for satellites at small  (middle), and for satellites ≲ 10 7 M ⊙ (right).Second row: Difference between the total velocity at pericentre,  peri,model −  model,sim , for both the most recent and the minimum pericentre.Orbit modeling generally overpredicts both pericentre velocities by ≈ 10 − 20 km s −1 .Third row: Difference between the mean number of pericentric passages about the MW-mass host, since first crossing the growing host  200m ( lb ) and since first crossing  200m ( lb = 0).This difference slightly increases from ≈ 0 to 1 with  lb (left) and decreases from 2 to 0 with  (middle), because satellites at small distance typically fell in earlier, which means that they orbited longer in the model.Fourth row: Difference between the lookback time of the most recent pericentre,  peri,model −  peri,sim , which shows weak trends with any properties.Although the median trends across the satellite population agree well in most cases, the substantial scatter in all panels implies significant uncertainty for a given satellite's orbit history.

Figure 10 .
Figure10.Comparing the maximum tidal acceleration, |/ |, from the MW-mass host that satellites experienced via orbit modeling versus in the simulations, as a function of lookback time of infall into the MW-mass halo (left), distance from the host,  (middle), and satellite  star (right).Solid lines show the median and the dark and light shaded regions show the 68th and 95th percentiles across all satellites.These trends mirror those for the minimum pericentre distance,  peri,min , in Figure9(top).Because this orbit modeling does not account for the growth of the MW-mass host and the orbits are periodic, it increasingly over-predicts the pericentre distance, and underpredicts the tidal acceleration, with increasing  lb infall,MW ≳ 5 Gyr.The dependence on  and  star are weak.However, the 1 scatters span more than 50 per cent in each of the panels here, and up to a factor of 2, highlighting the large uncertainties.
Figure 11.Comparing the most recent apocentre distance,  apo , orbital period,  orbit , and orbital eccentricity, , from orbit modeling versus the simulations, as a function of lookback time of infall in the MW-mass halo (left), current distance from MW-mass host,  (middle), and satellite  star (right).Solid lines show the median and the dark and light shaded regions show the 68th and 95th percentiles across all satellites.Top row: The most recent apocentre distance.Versus infall time, the median is roughly constant at −2 per cent for satellites that fell in ≳ 3.5 Gyr ago.The model recovers the median apocentre distance to within ±5 per cent versus , but the median decreases slightly with  star from ≈ 0 to 8 per cent at  star = 10 8.25 M ⊙ .Although the medians in each panel may only

Figure 12 .
Figure12.Rank ordering the 15 properties of the orbit histories of satellite galaxies, as Table2lists, based on the fractional level of agreement between orbit modeling in a static axisymmetric host potential and the cosmological baryonic simulations.For clarity, we shorten ( 0 −  inf )/ 200m,0 to  inf , and (ℓ 0 − ℓ inf )/ℓ 0 to ℓ inf , and we show the absolute change of  inf in the left panel to more easily compare with the other properties.The left panel shows the (, )=  inner disc  −/ inner disc − |  |/ℎ  +  outer disc  −/ outer disc − |  |/ℎ halo )  (1 + / halo ) −  (A3)

Figure A1 .
Figure A1.Ratio of the enclosed total mass in each best-fit model to that in each simulation, as a function of distance from the center of the MW-mass host, , at  = 0. We fit a double-exponential profile for the inner and outer disc (Equation A1-A2) and a generalized NFW profile for the halo (Equations A3-A4).Table A1 lists all fit parameters.Black line shows the median and the shaded regions show the 68th and 95th percentile scatter across our 13 hosts.The median agrees to within < 4 per cent at all radii and ≲ 1 per cent for the mass within  200m (344 − 472 kpc).The ratio shows larger host-to-host scatter at ≲ 10 kpc, but only ≈ 5 per cent of satellites in our sample orbit this close.This level of agreement ensures that modeling the total (axisymmetric) mass profile of each host at  = 0 is not a significant source of error for our orbit modeling.
Figure C1.Comparing the timing and distance of each pericentric passage between orbit modeling and simulations, which we measure at each 'pericentre event', where 1 is the most recent pericentre, 2 is the second most recent, and so forth.We select all satellites that experienced (at least) a given number of pericentres, and we show the median and 1 scatter across all satellites.Top: The median fractional difference in pericentre distance decreases from −2.5 to −18 per cent going back 6 pericentres while the 1 scatter increases