ABSTRACT

Machine learning can play a powerful role in inferring missing line-of-sight velocities from astrometry in surveys such as Gaia. In this paper, we apply a neural network to Gaia Early Data Release 3 (EDR3) and obtain line-of-sight velocities and associated uncertainties for ∼92 million stars. The network, which takes as input a star’s parallax, angular coordinates, and proper motions, is trained and validated on ∼6.4 million stars in Gaia with complete phase-space information. The network’s uncertainty on its velocity prediction is a key aspect of its design; by properly convolving these uncertainties with the inferred velocities, we obtain accurate stellar kinematic distributions. As a first science application, we use the new network-completed catalogue to identify candidate stars that belong to the Milky Way’s most recent major merger, Gaia-Sausage-Enceladus (GSE). We present the kinematic, energy, angular momentum, and spatial distributions of the ∼450 000 GSE candidates in this sample, and also study the chemical abundances of those with cross matches to GALAH and APOGEE. The network’s predictive power will only continue to improve with future Gaia data releases as the training set of stars with complete phase-space information grows. This work provides a first demonstration of how to use machine learning to exploit high-dimensional correlations on data to infer line-of-sight velocities, and offers a template for how to train, validate, and apply such a neural network when complete observational data is not available.

1 INTRODUCTION

The Milky Way galaxy formed through a process of hierarchical structure formation, capturing and dismantling smaller galaxies with its strong gravitational pull (White & Frenk 1991). Stars accreted from a disrupted galaxy may retain characteristic chemical and phase-space features, which can be harnessed to infer properties about the parent galaxy (Helmi, White & Springel 2003; Bullock & Johnston 2005; Robertson et al. 2005; Font et al. 2006; De Lucia & Helmi 2008). This effort in galactic archaeology is greatly aided by the availability of complete velocity information for the Milky Way’s stars. In this work, we use a neural network to infer missing line-of-sight velocities from available 5D astrometry in Gaia Early Data Release 3 (EDR3) (Gaia Collaboration 2021), resulting in ∼ 92 million more stars with estimated 6D phase-space coordinates. As a first science application, we use the catalogue of machine-learned line-of-sight velocities to study the properties of one of the most significant mergers in the Milky Way’s history, Gaia-Sausage-Enceladus (GSE) (Belokurov et al. 2018; Helmi et al. 2018).

The Gaia satellite has ushered in a new age in astrometry, with the goal of providing precise positions and velocities for an unprecedented number of stars in the Milky Way (Gaia Collaboration 2016, 2018). However, most stars in Gaia EDR3 have only 5D astrometry available (two angular coordinates, two proper motions, and parallax). Currently, less than 1 per cent of the stars in the Gaia catalogue have a measured line-of-sight velocity, but this will continue improving with the addition of ∼30 million more line-of-sight velocities in the next data release (Gaia Collaboration 2021). Spectroscopic surveys play an important role in complementing the Gaia data with high-precision line-of-sight velocities and chemical abundances. However, these surveys typically have limited sky coverage, and only a small fraction of stars are cross-matched with Gaia. As we will show, using a neural network to fill in missing phase-space information in astrometric data is a powerful alternative when complete observational data is not available.

Dropulic et al. (2021) (hereafter referred to as D21) developed a deep neural network design that accepted 5D astrometric coordinates (angular coordinates, proper motions, and parallax) and output the missing line-of-sight velocity, vlos, along with an uncertainty on the predicted velocity, σlos. D21 trained, tested, and validated the neural network on mock Milky Way stellar data. The use of simulations allowed for clear metrics of success to guide the network development because truth information was available. To assess the network’s predictive power, D21 compared the true line-of-sight velocities to the predicted ones. They found that 5D astrometry alone is insufficient to provide a reliable prediction of vlos for every individual star. However, incorporating the learned uncertainty, σlos, allowed them to obtain accurate distributions for the full population of stars, as well as the correlations between different velocity components in the Galactocentric frame. Thus, the uncertainty output of the network was a crucial feature of its design. Moreover, D21 showed that the neural network was successful in accurately reconstructing spatially diffuse kinematic substructure, like the GSE, that comprises a small fraction of the overall data. This general idea of inferring missing line-of-sight velocities based on information acquired from a smaller subset of stars with full 6D information has recently been applied to stars in the Kepler field (Angus et al. 2022), albeit using a Bayesian inference approach.

Motivated by the results of D21, we train a neural network on ∼6.4 million Gaia EDR3 stars with line-of-sight velocities and then use it to predict the line-of-sight velocities and associated uncertainties of an additional ∼92 million stars. The network is validated by comparing its predictions to the measured values for the subset of data with full phase-space information.1 This work provides a concrete demonstration of how machine learning can be used to fill in stellar phase-space in future Gaia data releases. Indeed, as the subset of measured Gaia line-of-sight velocities continues to grow in the years ahead, so too will the training set, and the prediction accuracy of the network should only continue to improve.

As the first application, we use the catalogue of stars with machine-learned line-of-sight velocities to identify candidates of the GSE merger. To date, most studies of GSE have been restricted to a small fraction of stars from Gaia with 6D phase-space or that have been cross-matched with spectroscopic surveys. Still, much has been learned about the GSE merger (Helmi 2020). GSE tidal debris is thought to comprise a significant part of the Galactic stellar halo. It can potentially explain the observed break in the halo density and anisotropy profiles at Galactocentric radii of ∼20–30 kpc; the anisotropic component is thought to be the tidal debris of GSE, which is mixed together with an isotropic, metal-poor component of the Galactic halo (Deason et al. 2018; Haywood et al. 2018; Bird et al. 2019; Lancaster et al. 2019; Iorio & Belokurov 2021). Close to the Solar position, estimates suggest that the GSE debris can comprise more than 50 per cent, and possibly up to 80 per cent, of the stellar halo (Lancaster et al. 2019; Necib et al. 2019b; Naidu et al. 2020; Iorio & Belokurov 2021). GSE stars are on highly eccentric, radial orbits, and the original satellite likely had a total stellar mass of M* ∼ 108−9 M at infall (Koppelman, Helmi & Veljanoski 2018; Mackereth et al. 2019; Myeong et al. 2019; Vincenzo et al. 2019; Koppelman, Bos & Helmi 2020; Kruijssen et al. 2020; Mackereth & Bovy 2020; Yuan et al. 2020; Carollo & Chiba 2021; Iorio & Belokurov 2021; Limberg et al. 2021). Moreover, the metallicity distribution of GSE stars is estimated to peak near [Fe/H] ∼ −1.4 to −1.2, consistent with the picture that the collision between GSE and the Milky Way occurred ∼8 to 10 Gyr ago (Myeong et al. 2018; Di Matteo et al. 2019; Gallart et al. 2019; Mackereth et al. 2019; Massari, Koppelman & Helmi 2019; Necib et al. 2019b; Vincenzo et al. 2019; Bonaca et al. 2020; Das, Hawkins & Jofré 2020; Feuillet et al. 2020, 2021; Forbes 2020; Mackereth & Bovy 2020; Naidu et al. 2020; Donlon et al. 2021; Gudin et al. 2021; Hasselquist et al. 2021; Montalbán et al. 2021; Buder et al. 2022; Horta et al. 2023).

Our network-completed catalogue allows us to identify nearly twenty-fold more GSE candidate stars than possible from the 6D Gaia data set alone – nearly 450 000 stars in total. We then use these stars to infer the metallicity distribution function (MDF) for stars that are cross-matched with spectroscopic surveys and the spatial distribution of GSE for all stars. In both cases, we find excellent agreement between the true and predicted distributions for stars with full 6D information. For stars with only 5D information, the predicted metallicity distribution as a function of [Fe/H] and [Mg/Fe] for candidate GSE stars are consistent with both the 6D results and existing studies in the literature. The spatial distribution of GSE candidate stars with only 5D information exhibits hints of anisotropy in Galactocentric ygalzgal coordinates, as compared to the mostly isotropic distribution recovered in the 6D data set.

This paper is organized as follows. In Section 2, we review the neural network developed in D21, overview the necessary changes for its application to Gaia EDR3 data, and discuss validation of the network output. In Section 3, we introduce the network-completed Gaia EDR3 catalogue. In Section 4, we present our GSE selection technique and discuss the dynamics, metallicity, and spatial distribution of the candidate stars. We conclude in Section 5.

The catalogue of machine-learned line-of-sight velocities and uncertainties for Gaia EDR3 together with their source IDs has been made publicly available https://zenodo.org/record/6558083.

2 NETWORK TESTING AND VALIDATION

The Gaia catalogue contains full phase-space information for several million EDR3 stars (Katz et al. 2019). These stars serve as a critical tool for training, validating, and testing the neural network architecture from D21 in a purely data-driven fashion. Moving forward, we will work with this subset of Gaia stars, further selecting those with parallax ϖ > 0 and fractional error δϖ/ϖ < 0.2, renormalized unit weight error ruwe < 2, as well as Gaia-measured line-of-sight velocity vlos ∈ [ −550, 550] km s−1. Additionally, we restrict to stars with G-band magnitude G ∈ [6, 17] for training.2 We also correct for the zero-point parallax offset in the Gaia data, which results at least in part from a degeneracy between the parallax and basic-angle variation on the Gaia satellite. Specifically, we use the parallax zeropoint for EDR3 published by the Gaia Collaboration (Lindegren et al. 2021), which is a function of G, the ecliptic latitude, and the effective wavenumber used in the astrometric solution; the zeropoint ranges from −0.08 to 0.01. We refer to this subset of Gaia, which comprises more than 6.4 million stars, as the Radial Velocity (RV) Catalogue. A summary of the number of stars in the RV Catalogue as a function of various selection cuts is provided in Table 1.

Table 1.

Number of stars in various subsets of Gaia EDR3 used in this work. The subsets include the RV Catalogue, the Test-Set Catalogue, and the Machine Learned-Radial Velocity (ML-RV) Catalogue, with the total number of stars in each catalogue indicated in bold. The first is simply the subset of Gaia EDR3 data with measured line-of-sight velocities (Katz et al. 2019), which also satisfy the base selection requirements in Row [0] of the table in addition to having vlos ∈ [ −550, 550] km s−1. The Test-Set Catalogue is similar to the RV Catalogue, except with line-of-sight velocities and associated uncertainties inferred by the neural network. The ML-RV Catalogue is the subset of Gaia EDR3 that also satisfies the selection requirements in Row [0] in addition to G ∈ [12, 17] (Row [1]), but with no measured line-of-sight velocities. Each star in the ML-RV Catalogue has a line-of-sight velocity and an associated uncertainty predicted by the neural network. Additional selection cuts include [2] cutting on orbital eccentricity, e, [3] cross-matching with GALAH + DR3 (Buder et al. 2021), and [4] cross-matching with APOGEE DR17 (Majewski et al. 2017). Note that e corresponds to the true eccentricity for the RV Catalogue and the predicted value for the Test-Set and ML-RV Catalogues. The number of stars with e > 0.8 for the Test-Set and ML-RV Catalogues are approximate, since this quantity changes between Monte Carlo draws during error sampling.

Selection cutsEDR3 RV catalogueTest-set catalogueML-RV catalogue
0ϖ > 0, δϖ/ϖ < 0.2, G ∈ [6, 17], ruwe < 26444 8716444 87192 371 404
1[0] + G ∈ [12, 17]4332 6574332 65791 840 346
2[1] + (e > 0.8)24 219∼18 000∼450 000
3[2] + GALAH244∼200∼400
4[2] + APOGEE624∼500∼800
Selection cutsEDR3 RV catalogueTest-set catalogueML-RV catalogue
0ϖ > 0, δϖ/ϖ < 0.2, G ∈ [6, 17], ruwe < 26444 8716444 87192 371 404
1[0] + G ∈ [12, 17]4332 6574332 65791 840 346
2[1] + (e > 0.8)24 219∼18 000∼450 000
3[2] + GALAH244∼200∼400
4[2] + APOGEE624∼500∼800
Table 1.

Number of stars in various subsets of Gaia EDR3 used in this work. The subsets include the RV Catalogue, the Test-Set Catalogue, and the Machine Learned-Radial Velocity (ML-RV) Catalogue, with the total number of stars in each catalogue indicated in bold. The first is simply the subset of Gaia EDR3 data with measured line-of-sight velocities (Katz et al. 2019), which also satisfy the base selection requirements in Row [0] of the table in addition to having vlos ∈ [ −550, 550] km s−1. The Test-Set Catalogue is similar to the RV Catalogue, except with line-of-sight velocities and associated uncertainties inferred by the neural network. The ML-RV Catalogue is the subset of Gaia EDR3 that also satisfies the selection requirements in Row [0] in addition to G ∈ [12, 17] (Row [1]), but with no measured line-of-sight velocities. Each star in the ML-RV Catalogue has a line-of-sight velocity and an associated uncertainty predicted by the neural network. Additional selection cuts include [2] cutting on orbital eccentricity, e, [3] cross-matching with GALAH + DR3 (Buder et al. 2021), and [4] cross-matching with APOGEE DR17 (Majewski et al. 2017). Note that e corresponds to the true eccentricity for the RV Catalogue and the predicted value for the Test-Set and ML-RV Catalogues. The number of stars with e > 0.8 for the Test-Set and ML-RV Catalogues are approximate, since this quantity changes between Monte Carlo draws during error sampling.

Selection cutsEDR3 RV catalogueTest-set catalogueML-RV catalogue
0ϖ > 0, δϖ/ϖ < 0.2, G ∈ [6, 17], ruwe < 26444 8716444 87192 371 404
1[0] + G ∈ [12, 17]4332 6574332 65791 840 346
2[1] + (e > 0.8)24 219∼18 000∼450 000
3[2] + GALAH244∼200∼400
4[2] + APOGEE624∼500∼800
Selection cutsEDR3 RV catalogueTest-set catalogueML-RV catalogue
0ϖ > 0, δϖ/ϖ < 0.2, G ∈ [6, 17], ruwe < 26444 8716444 87192 371 404
1[0] + G ∈ [12, 17]4332 6574332 65791 840 346
2[1] + (e > 0.8)24 219∼18 000∼450 000
3[2] + GALAH244∼200∼400
4[2] + APOGEE624∼500∼800

We train, validate, and test the network on the RV Catalogue and present the results in this section. Section 2.1 introduces the network architecture used in this study in detail and discusses the error-sampling procedure used in presenting results throughout the paper. Section 2.2 describes how well the network does in predicting the kinematic and spatial distributions of high-eccentricity stars in the RV Catalogue, which are potential GSE candidates. These results are critical for building confidence in the neural network behaviour on data, so that we can move forward and study its results on the Gaia data with only 5D astrometry available.

2.1 Network architecture and error sampling

The neural network developed in D21 consists of two halves that are structured identically except for the last layer. Each half of the network consists of six layers: the input, four hidden layers, and the output. The input consists of five quantities per star: Galactic longitude (ℓ), Galactic latitude (b), proper motion in right ascension (μα), proper motion in declination (μδ), and parallax (ϖ). The hidden layers comprise 30 nodes each and use a hyperbolic tangent activation function. The output layer from one half of the network, which we call the ‘velocity predictor,’ consists of a single node with linear activation to obtain a continuous value, the line-of-sight velocity. The output layer from the other half of the network, which we call the ‘uncertainty predictor,’ uses a ReLU activation function to constrain the uncertainty on the predicted line-of-sight velocity to positive values.

Comprehensive comparisons of different network architectures were performed in D21 to determine this optimal setup (detailed in Appendix A of D21). For example, we trained the network on only Galactocentric positions xyz and found suboptimal results. Also in D21, we compared the network’s performance to that of an alternative technique for reconstructing 6D phase-space information of stars from 5D information, which proposed setting vlos = 0 km s−1 for stars in a restricted region of the sky (Koppelman et al. 2019a). We found that the network was well-calibrated and performed as well as setting vlos = 0 in the region, where setting vlos = 0 was designed to excel. However, we found the network to be additionally well-calibrated beyond this narrow spatial region and had the benefit of predicted uncertainties.

To improve the predictive power of the network on Gaia EDR3, we make several modifications to the foundational neural network of D21:

  • We provide the network with a continuous representation of Galactic longitude, ℓ, as opposed to the discontinuous representation ℓ ∈ [0°, 360°] used in D21. Specifically, we use ℓ → [cos (ℓ), sin (ℓ)]T. This change in representation ameliorates a discontinuity in the neural network’s outputs at ℓ = 0°, which was not present in the outputs when the network was trained on the mock catalogue. How the topology of a data set affects the behaviour of the neural network is a novel field of research – see for example, Batson et al. (2021).

  • We account for measurement errors on the line-of-sight velocity in the Gaia EDR3 RV Catalogue by Monte Carlo (MC) sampling each star over this uncertainty during training. This ensures that the network’s predicted uncertainty, σlos, encompasses the Gaia measurement error. As will be demonstrated at the end of this Section, the network’s uncertainties are essentially Gaussian with the chosen technique. One could try to propagate uncertainties through the network explicitly – including the Gaia measurement error – by using more sophisticated loss functions and network architectures, for example Naik & Widmark (2022), but since our network gives errors that can naturally be interpreted as Gaussian, we find this unnecessary for this study.

In all other aspects, the neural network used in this work is the same as that developed in D21.

The RV Catalogue is divided into five random subsets. For each round of training, we choose three different subsets to be the training, validation, and test sets, respectively.3 For each round of training, we predict line-of-sight velocity values and uncertainties for all stars in one of the subsets (where truth-level vlos is withheld). After all rounds of training are complete, every star in the RV Catalogue has a network-predicted vlos and σlos. We refer to this as the Test-Set Catalogue, as it will allow us to compare network predictions for the data to their true values. By construction, it has the same number of stars as the RV Catalogue. Properties of the Test-Set Catalogue are also summarized in Table 1.

To fully exploit the network prediction of both vlos and σlos, we perform a procedure we term error-sampling to present results throughout this paper. For each star, we take the predicted vlos and σlos as the mean and standard deviation of a Gaussian distribution for the line-of-sight velocity and draw five MC trials from this distribution. Any kinematics cut that we impose is performed on each MC sample separately. The final predicted distribution is then taken to be the mean of all the MC trials and corresponds to what is presented in all of the plots to follow. Figure colourbars that are labelled as ‘Test-Set star count’ or ‘ML-RV star count’ represent error-sampled values. Colourbars labelled as ‘RV star count’ represent truth-level counts.

Fig. 1 demonstrates how well the network does in filling in missing phase-space information by comparing predicted velocities in the Test-Set Catalogue to their corresponding truth values in the RV Catalogue. While the network predicts vlos, we have transformed the results into Galactocentric spherical velocities (vr, vθ, vϕ), following the same procedure described in Appendix B of D21. The top row of Fig. 1 compares the predicted velocities to their truth values in the RV Catalogue. Perfect agreement between truth and predicted velocities occurs along the diagonal, with the prediction being an underestimate (overestimate) toward the left (right) of the plot. The bottom row shows how the uncertainties depend on the predicted velocities. The predicted uncertainty on the Galactocentric velocities is shown as a function of the predicted Galactocentric velocity for all stars in the Test-Set Catalogue. For each velocity bin, the box denotes the 50 per cent containment about the median uncertainty, while the whiskers denote the 5 per cent and 95 per cent containment. The network uncertainties are lowest for values of (vr, vθ, vϕ) corresponding to disc stars, since they dominate the catalogue in sheer number. Because non-disc stars form a comparatively smaller fraction of the data, the network uncertainties are typically larger. However, these uncertainties in the network output are taken into account through the error-sampling procedure.

(Top) Comparison of the inferred velocities, vpred, in the Test-Set Catalogue with their corresponding truth values, vtruth, from the RV Catalogue. While the network infers the line-of-sight velocities, we present the results here in Galactocentric spherical coordinates. The dotted-white diagonal line indicates exact agreement between the truth and inferred values. (Bottom) The network-predicted uncertainty as a function of velocity in each coordinate direction. In each bin, the box denotes 50 per cent containment about the median uncertainty, while the whiskers indicate 5–95 per cent containment. In this and all following figures, the Test-Set star counts for the background heatmaps are properly error-sampled, as described in the text.
Figure 1.

(Top) Comparison of the inferred velocities, vpred, in the Test-Set Catalogue with their corresponding truth values, vtruth, from the RV Catalogue. While the network infers the line-of-sight velocities, we present the results here in Galactocentric spherical coordinates. The dotted-white diagonal line indicates exact agreement between the truth and inferred values. (Bottom) The network-predicted uncertainty as a function of velocity in each coordinate direction. In each bin, the box denotes 50 per cent containment about the median uncertainty, while the whiskers indicate 5–95 per cent containment. In this and all following figures, the Test-Set star counts for the background heatmaps are properly error-sampled, as described in the text.

Fig. 2 explicitly demonstrates that the network’s predicted uncertainties, σlos, can be interpreted as Gaussian. Here, we histogram the residuals of the predicted line-of-sight velocities relative to their corresponding truth values, divided by the predicted uncertainty. The distribution is well-fit by a Gaussian with mean μ = 0.01 and standard deviation σ = 0.9. We conclude that approximately 68 per cent (95 per cent) of stars have true line-of-sight velocity |$v_\mathrm{los}^\mathrm{truth}$| within 1σlos (2σlos) of the network’s predicted central value |$v_{\rm los}^{\rm pred}$|⁠, and that σlos can be safely interpreted as a Gaussian error bar.

The distribution of the residual between the network-predicted line-of-sight velocity $v_{\rm los}^{\rm pred}$ and the true value $v_{\rm los}^{\rm truth}$, divided by the network-predicted uncertainty σlos. The solid-green line shows a best-fit normal distribution with mean μ = 0.01 and standard deviation σ = 0.9. This justifies interpreting σlos as a Gaussian error bar.
Figure 2.

The distribution of the residual between the network-predicted line-of-sight velocity |$v_{\rm los}^{\rm pred}$| and the true value |$v_{\rm los}^{\rm truth}$|⁠, divided by the network-predicted uncertainty σlos. The solid-green line shows a best-fit normal distribution with mean μ = 0.01 and standard deviation σ = 0.9. This justifies interpreting σlos as a Gaussian error bar.

2.2 High-eccentricity stars in the test-set catalogue

GSE stars are typically identified by some combination of the following properties: high velocity, retrograde motion, low metallicity, and highly eccentric orbits. Our goal is to identify candidate GSE stars from the subset of Gaia data with only 5D astrometry available. To do so, we select stars on high-eccentricity orbits, as determined using the network-inferred line-of-sight velocities. To this end, the network must be able to identify the low- and high-eccentricity stellar population accurately. In this section, we validate the performance of the network by comparing the low- and high-eccentricity populations in the RV Catalogue and the Test-Set Catalogue. We emphasize that the network does not require any knowledge of the Galactic potential to make the predictions that we detail below.

Moving forward, we will use a high-eccentricity cut, e > 0.8, to identify GSE candidates. This selection criterion is similar to that used by Naidu et al. (2020), which required that e > 0.7 and also explicitly excluded other pre-defined substructures. We place a tighter restriction on eccentricity to decrease contamination from stars with low eccentricities. We find that only |$0.15{{\ \rm per\ cent}}$| (⁠|$0.25{{\ \rm per\ cent}}$|⁠) of the Test-Set Catalogue stars have epred > 0.8 (0.7) when their corresponding truth values are less than 0.8 (0.7). It is important to note that, without chemical abundances, it is difficult for us to remove possible contamination from other substructures, as done by Naidu et al. (2020).

To determine stellar eccentricities, we use the gala package (M. Price-Whelan 2017), assuming the standard Milky Way potential given in Bovy (2015).4 The notation epred refers to eccentricities calculated from the Test-Set Catalogue and etruth to eccentricities calculated directly from truth velocities in the RV Catalogue. The correlation coefficient between epred using |$v_\mathrm{los}^\mathrm{pred}$| and etruth for all stars is 0.82, indicating excellent agreement. Fig. 3 shows the correlation between epred and etruth for high-eccentricity stars, including our error-sampling procedure, in the relevant parameter space for the GSE selection. Points along the diagonal indicate a completely accurate predicted eccentricity, while points above (below) the diagonal indicate an underestimated (overestimated) eccentricity. There are notably fewer stars in the bottom right-hand quadrant of the plot; this is important because we will need to cut on epred to obtain GSE candidates in the full Gaia EDR3 catalogue, and this tells us that stars which are assigned higher values of epred generally have as high or higher etruth values. Of the ∼60 000 stars shown in Fig. 3, |$\sim 28{{\ \rm per\ cent}}$| are in the upper-right quadrant, |$\sim 17{{\ \rm per\ cent}}$| are in the upper-left quadrant, and only |$\sim 12{{\ \rm per\ cent}}$| are in the bottom-right quadrant. Of the |$\sim 12{{\ \rm per\ cent}}$| of stars in the bottom-right quadrant of Fig. 3 (representing the contamination from low-eccentricity stars in our sample), |$\sim 65{{\ \rm per\ cent}}$| have etruth > 0.7, which would still qualify as high-eccentricity per the cut made by Naidu et al. (2020).

The network-predicted eccentricities for stars in the error-sampled Test-Set Catalogue compared to their true values from the RV Catalogue. Overall, the correspondence between truth and predicted values is excellent, especially at the highest eccentricities. Stars that fall in the bottom right (top left) quadrant would be mistakenly labelled (not labelled) as GSE candidates in the Test-Set Catalogue, assuming a selection cut of e > 0.8.
Figure 3.

The network-predicted eccentricities for stars in the error-sampled Test-Set Catalogue compared to their true values from the RV Catalogue. Overall, the correspondence between truth and predicted values is excellent, especially at the highest eccentricities. Stars that fall in the bottom right (top left) quadrant would be mistakenly labelled (not labelled) as GSE candidates in the Test-Set Catalogue, assuming a selection cut of e > 0.8.

Fig. 4 explores the extent to which the network captures correlations among the velocity components for the low and high-eccentricity samples. The background heatmaps correspond to the distribution of stars in the RV Catalogue. Here, we begin to use our spatially complete Catalogue; there are 4308 438 stars with etruth < 0.8 and 24 219 stars with etruth > 0.8 (Row [2] of Table 1). As expected, the low-eccentricity sample of stars is disc-like with azimuthal velocities strongly peaked at vϕ ∼ −200 km s−1. The high-eccentricity sample has the characteristic ‘sausage’-like shape in the RV distribution, which is expected for GSE stars (Belokurov et al. 2018). The vϕ distribution for the low-eccentricity stars is not smooth across vϕ ∼ 0 km s−1 because many of these stars are on highly radial orbits (i.e., have large vr) and thus appear in the top panel instead of the bottom panel of the figure.

Distributions of Galactocentric velocities for the high eccentricity sample (top) and all other stars (bottom). The background heatmaps correspond to truth-level distributions from the RV Catalogue. The corresponding 30 per cent, 60 per cent, and 90 per cent containment contours are shown by the solid-violet lines. For comparison, we also show the error-sampled containment regions for stars in the Test-Set Catalogue in dashed white. The correspondence between the true and machine-learned distributions is excellent. Note that e refers to the true (predicted) eccentricities for stars in the RV (Test-Set) Catalogue.
Figure 4.

Distributions of Galactocentric velocities for the high eccentricity sample (top) and all other stars (bottom). The background heatmaps correspond to truth-level distributions from the RV Catalogue. The corresponding 30 per cent, 60 per cent, and 90 per cent containment contours are shown by the solid-violet lines. For comparison, we also show the error-sampled containment regions for stars in the Test-Set Catalogue in dashed white. The correspondence between the true and machine-learned distributions is excellent. Note that e refers to the true (predicted) eccentricities for stars in the RV (Test-Set) Catalogue.

The violet contours in Fig. 4 correspond to 30 per cent, 60 per cent, and 90 per cent containment intervals for stars in the RV Catalogue. The contours for the corresponding error-sampled distributions of the Test-Set Catalogue are shown in dashed white. Because of our conservative error-sampling procedure, ∼3000 stars are excluded out of the typical sample size of 18000 stars in each of the MC’d sets with epred > 0.8; these stars have eccentricities that fall below the cut after the MC sampling. The error-sampled Test-Set Catalogue is an excellent approximation of the RV Catalogue for both the high-eccentricity and low-eccentricity samples at the 30 per cent and 60 per cent containment intervals. The network is marginally underconfident at the 90 per cent level, likely because stars on the distribution tails typically have higher network uncertainties. We stress that the correspondence between the predicted and true velocity distributions is highly non-trivial. The network has no direct access to information about the Galactic potential, which is essential in predicting its eccentricity; nevertheless, the network’s ability to predict vlos and σlos is sufficiently good to accurately predict the distributions for these categories.

Lastly, we explore the spatial distributions of the GSE stars in the Test-Set compared to the RV Catalogue. The heatmaps in the top row of Fig. 5 correspond to the Galactocentric spatial distribution of RV Catalogue stars with etruth > 0.8. The bottom row shows the corresponding heatmaps for the Test-Set Catalogue. To quantitatively compare the distribution of GSE candidates, we compute the 3D covariance ellipse of the Galactocentric positions of high-eccentricity stars in each catalogue, which has principal axes given by its eigenvectors. In addition, we project the GSE candidates into 2D Galactocentric planes and obtain the 2D covariance ellipse in each plane; these ellipses are shown for the RV and Test-Set Catalogue in Fig. 5, for stars within 5 kpc of the Sun.

Heatmaps corresponding to the Galactocentric spatial distributions for the high-eccentricity stars in the RV Catalogue (top) and Test-Set Catalogue (bottom). In this work, we use a right-handed coordinate system where the Sun is located at negative xgal. The 2D covariance ellipses for all stars within 5 kpc of the Sun are indicated by the violet (white) lines for the RV (Test-Set) Catalogue. There are five essentially indistinguishable lines shown for the fits to the Test-Set Catalogue; each corresponds to a different sample over the network uncertainty. A summary of the tilt angles and axis ratios of the covariance ellipses are provided in Table 2.
Figure 5.

Heatmaps corresponding to the Galactocentric spatial distributions for the high-eccentricity stars in the RV Catalogue (top) and Test-Set Catalogue (bottom). In this work, we use a right-handed coordinate system where the Sun is located at negative xgal. The 2D covariance ellipses for all stars within 5 kpc of the Sun are indicated by the violet (white) lines for the RV (Test-Set) Catalogue. There are five essentially indistinguishable lines shown for the fits to the Test-Set Catalogue; each corresponds to a different sample over the network uncertainty. A summary of the tilt angles and axis ratios of the covariance ellipses are provided in Table 2.

Table 2 summarizes two statistics that characterize the spatial properties in each of the catalogues. The axis ratio is the ratio of the lengths of the principal axes of the 3D covariance ellipse, in decreasing order of length, while the tilt angle is the angle between the major axis of the 2D covariance ellipse in the ygalzgal plane, with respect to the ygal-axis. For the Test-Set Catalogue, the distribution is almost isotropic in the ygalzgal plane across the five MC samples. The spatial distribution of Test-Set GSE candidates is similar to the true distribution, with both distributions being relatively isotropic about the Sun. We emphasize that the covariance ellipses shown here are used to compare the spatial distribution of GSE candidates between catalogues. They only apply to local stars within 5 kpc of the Sun and thus do not serve as a characterization of the global stellar halo.

Table 2.

Tilt angles and axis ratios for the covariance ellipses of the spatial distributions of all stars within 5 kpc of the Sun in the RV, Test-Set, and ML-RV Catalogue. The covariance ellipses are intended to quantitatively compare the spatial distribution of GSE candidates between the different catalogues; they apply to the local, not global, halo. However, the tilt in the ygalzgal plane may be an unbiased indicator of the global tilt of the stellar halo, as noted by Han et al. (2022).

RV catalogueTest-Set catalogueML-RV catalogue
Tilt angle∼ isotropic∼ isotropic59°
Axis ratio10.2 : 10.2 : 9.09.6 : 9.6 : 9.09.0 : 7.8 : 7.2
RV catalogueTest-Set catalogueML-RV catalogue
Tilt angle∼ isotropic∼ isotropic59°
Axis ratio10.2 : 10.2 : 9.09.6 : 9.6 : 9.09.0 : 7.8 : 7.2
Table 2.

Tilt angles and axis ratios for the covariance ellipses of the spatial distributions of all stars within 5 kpc of the Sun in the RV, Test-Set, and ML-RV Catalogue. The covariance ellipses are intended to quantitatively compare the spatial distribution of GSE candidates between the different catalogues; they apply to the local, not global, halo. However, the tilt in the ygalzgal plane may be an unbiased indicator of the global tilt of the stellar halo, as noted by Han et al. (2022).

RV catalogueTest-Set catalogueML-RV catalogue
Tilt angle∼ isotropic∼ isotropic59°
Axis ratio10.2 : 10.2 : 9.09.6 : 9.6 : 9.09.0 : 7.8 : 7.2
RV catalogueTest-Set catalogueML-RV catalogue
Tilt angle∼ isotropic∼ isotropic59°
Axis ratio10.2 : 10.2 : 9.09.6 : 9.6 : 9.09.0 : 7.8 : 7.2

3 INTRODUCING THE ML-RV CATALOGUE

Having validated the neural network’s performance in predicting the velocity and spatial distribution of stars with both low- and high-eccentricity orbits, we now apply it to Gaia EDR3 stars with only 5D information. We restrict ourselves to G ∈ [12, 17] stars to obtain a spatially complete data set. The predicted line-of-sight velocity and associated uncertainty form what we call the ML-RV Catalogue. This catalogue is spatially complete, encompassing a total of 91 840 346 stars, more than 20 times larger than the number of equivalent stars in the RV Catalogue. Table 1 shows how the number of stars in the ML-RV Catalogue varies as a function of selection cut. The majority of stars in the ML-RV Catalogue lie within 10 kpc of the Galactic Centre, but some are slightly further out, within 20 kpc.

Both the spatially complete Test-Set Catalogue and the ML-RV Catalogue are publicly available for download. For each catalogue, we provide the Gaia EDR3 source ID, predicted line-of-sight velocity vlos in km s−1, as well as the predicted network uncertainty σlos in km s−1https://zenodo.org/record/6558083.

4 FEATURES OF GSE IN THE ML-RV CATALOGUE

We next identify GSE candidates in the ML-RV Catalogue by looking for stars on high-eccentricity orbits, epred > 0.8. This is the first such study of GSE stars in the vast subset of Gaia data with no observed line-of-sight velocities. Our analysis yields a nearly twenty-fold increase in the number of GSE candidate stars as compared to the RV Catalogue. This section explores the dynamics, abundances, and spatial distribution of these newly identified candidates.

4.1 Dynamics

The Galactocentric spherical velocities of the ML-RV GSE stars are shown in Fig. 6. The background heatmap corresponds to the error-sampled high-eccentricity (epred > 0.8) subset of the ML-RV Catalogue. The dashed-yellow lines are the error-sampled 30 per cent, 60 per cent, and 90 per cent containment intervals for the ML-RV Catalogue; the solid-blue lines are the corresponding truth-level contours for the RV Catalogue. The inferred velocity distributions of the significantly more numerous ML-RV GSE candidates closely agree with those of the RV Catalogue’s GSE stars.

2D distributions of the Galactocentric velocity components of high-eccentricity stars. The background heatmap shows the network-predicted kinematic distributions of GSE candidate stars (epred > 0.8) in the ML-RV Catalogue. In this and all following figures, the ML-RV star counts are properly error-sampled, as described in the text. The yellow-dashed contours indicate the location of the 30 per cent, 60 per cent, and 90 per cent containment intervals. For comparison, the solid-blue contours indicate the same containment intervals for the GSE candidates (etrue > 0.8) in the RV Catalogue. The GSE’s classic ‘sausage’-like structure is apparent in the vr − vϕ plane for both subsets.
Figure 6.

2D distributions of the Galactocentric velocity components of high-eccentricity stars. The background heatmap shows the network-predicted kinematic distributions of GSE candidate stars (epred > 0.8) in the ML-RV Catalogue. In this and all following figures, the ML-RV star counts are properly error-sampled, as described in the text. The yellow-dashed contours indicate the location of the 30 per cent, 60 per cent, and 90 per cent containment intervals. For comparison, the solid-blue contours indicate the same containment intervals for the GSE candidates (etrue > 0.8) in the RV Catalogue. The GSE’s classic ‘sausage’-like structure is apparent in the vrvϕ plane for both subsets.

Next, we compare the integrals of motion of the stellar orbits in each catalogue, focusing on total orbital energy (E), angular momentum in the z-direction (Lz), and radial action (⁠|$\sqrt{J_r}$|⁠) for an axisymmetric potential. These quantities are evaluated by integrating orbits in the Milky Way potential given in Bovy (2015) using the gala package. Stars from a recent Milky Way merger such as the GSE are expected to show correlation in the integrals of motion space (see Helmi 2020 for example).

Fig. 7 shows the resulting distributions in the integrals of motion for the GSE stars in the error-sampled ML-RV Catalogue, shown as the pink-hued heatmap. The grey-scale background heatmap shows the truth values of the RV Catalogue, which is dominated by the disc, for comparison. The dashed-yellow lines indicate the error-sampled contours for the 30 per cent, 60 per cent, and 90 per cent containment intervals of the ML-RV Catalogue stars with epred > 0.8; the solid blue lines are the corresponding truth-level contours for the RV Catalogue stars with etruth > 0.8.

Distribution of stars in E, Lz, and $\sqrt{J_r}$. The pink-hued heatmaps correspond to the integrals of motion for the GSE candidates in the ML-RV Catalogue. The yellow-dashed contours indicate the location of the 30 per cent, 60 per cent, and 90 per cent containment intervals of these distributions. For comparison, the solid-blue contours show the same containment intervals for GSE candidates in the RV Catalogue. The grey-hued (log) density background heatmap shows the distribution for all stars in the RV Catalogue.
Figure 7.

Distribution of stars in ELz, and |$\sqrt{J_r}$|⁠. The pink-hued heatmaps correspond to the integrals of motion for the GSE candidates in the ML-RV Catalogue. The yellow-dashed contours indicate the location of the 30 per cent, 60 per cent, and 90 per cent containment intervals of these distributions. For comparison, the solid-blue contours show the same containment intervals for GSE candidates in the RV Catalogue. The grey-hued (log) density background heatmap shows the distribution for all stars in the RV Catalogue.

We can compare these distributions with those from the existing literature. Feuillet et al. (2021) found that stars with |$30 \le \sqrt{J_r}/[(\text{kpc km s}^{-1})^{1/2}] \le 50$| and −0.5 ≤ Lz/[103 kpc kms−1] ≤ 0.5 presented the least-contaminated sample of kinematically selected GSE stars based on the metallicity distribution. Koppelman et al. (2019b) clustered stars in ELz, e, and [Fe/H] to identify GSE stellar candidates. They identified the region occupied predominantly by GSE as −1.5 < E/[105 km2 s−2] < −1.1. Most of the GSE candidates in the ML-RV Catalogue fall reasonably within these energy and angular momentum regions, as seen in Fig. 7. At the 90 per cent containment interval, we find that the ML-RV GSE sample sits within the ranges: |$20 \le \sqrt{J_r}/[(\text{kpc km s}^{-1})^{1/2}] \le 44$|⁠, −0.5 ≤ Lz/[103 kpc kms−1] ≤ 0.5, and −1.67 < E/[105 km2 s−2] < −1.0. It is important to note that the consistency between our results and those of Koppelman et al. (2019b) and Feuillet et al. (2021) was not guaranteed a priori, given that we do not use metallicity information in selecting GSE candidates as they do.

4.2 Metallicity distribution

Using machine-learned line-of-sight velocities to identify GSE candidates in the subset of Gaia data with only 5D astrometry enables us to work with a much larger sample than otherwise possible by solely relying on cross-matches to spectroscopic surveys. This benefit does come at the cost of a potentially large network uncertainty on the inferred velocity, which must be properly accounted for when studying the resulting stellar distributions by error sampling. While the machine learning approach is no match for the gold standard of spectroscopic observations, it can serve as a complementary tool when complete data are not available. In this section, we focus on the GSE candidates in the ML-RV Catalogue that have cross-matches in GALAH + DR3 (Buder et al. 2021) and APOGEE DR17 (Majewski et al. 2017; Queiroz et al. 2020) to assess the robustness of the network-predicted line-of-sight velocities and to study their chemical abundance distributions.

There are a total of 689 431 (551 559) stars in the cross-match of GALAH + (APOGEE) and Gaia EDR3. The fractions that pass the high-eccentricity (e > 0.8) selection cut for GSE stars in the RV, Test-Set, and ML-RV Catalogues are summarized in Table 1. Fig. 8 compares the inferred velocities for the ML-RV high-eccentricity subset with the truth values obtained by combining Gaia with the line-of-sight velocities from GALAH + (top) and APOGEE (bottom). This serves as an additional cross-check of our machine-learned line-of-sight velocity, using actual measurements that are completely independent of Gaia. There is excellent agreement between the network-inferred and truth values, providing a strong test of the accuracy of the ML-RV velocities. We have verified that the correspondence is equally good for the low-eccentricity subset as well.

Comparison of the inferred Galactocentric spherical velocities, vpred, in the ML-RV Catalogue with their respective values obtained using line-of-sight velocities from a cross-match to GALAH + DR3 (top) and APOGEE DR17 (bottom). The velocities of the low eccentricity stars in the cross-matched ML-RV Catalogue (not shown here) demonstrate an equally good correspondence with the truth values.
Figure 8.

Comparison of the inferred Galactocentric spherical velocities, vpred, in the ML-RV Catalogue with their respective values obtained using line-of-sight velocities from a cross-match to GALAH + DR3 (top) and APOGEE DR17 (bottom). The velocities of the low eccentricity stars in the cross-matched ML-RV Catalogue (not shown here) demonstrate an equally good correspondence with the truth values.

The left-hand panel of Fig. 9 presents the metallicity distribution functions (MDFs) for the cross-match to GALAH+. The distributions for high-eccentricity (e > 0.8) stars in the RV, Test-Set, and ML-RV Catalogues are indicated by the cyan, yellow, and magenta histograms, respectively. The spread in the Test-Set and ML-RV Catalogues is due to error sampling. For comparison, the shaded grey histogram shows the distribution for all stars in the cross-matched sample. To quantitatively compare the results, we fit the MDFs to a Gaussian distribution. The best-fit mean and standard deviations for the RV and Test-Set distributions are highly consistent with each other with (μ, σ) = (−0.94, 0.50) and (1.03, 0.55), respectively.5 The corresponding MDF for the ML-RV GSE candidates is similar: (μ, σ) = (−1.00, 0.56). The right-hand panel of Fig. 9 shows the equivalent results for the APOGEE cross-match of high-eccentricity stars. In this case, the best-fit mean and standard deviations for the RV and Test-Set distributions are (μ, σ) = (−1.14, 0.45) and (−1.1, 0.43), respectively. The MDF for the ML-RV GSE candidates has (μ, σ) = (−1.06, 0.47). These results are consistent with each other, as well as with the GALAH + values presented above.

The metallicity distributions for GSE candidates in cross-matches of Gaia EDR3 with GALAH + DR3 (left) and APOGEE DR17 (right). The distributions for the RV, Test-Set, and ML-RV Catalogues are shown by the cyan, yellow, and magenta lines, respectively. The spread in the Test-Set and ML-RV histograms is due to error sampling. The shaded grey histogram corresponds to the distribution of all stars in the cross-matched sample.
Figure 9.

The metallicity distributions for GSE candidates in cross-matches of Gaia EDR3 with GALAH + DR3 (left) and APOGEE DR17 (right). The distributions for the RV, Test-Set, and ML-RV Catalogues are shown by the cyan, yellow, and magenta lines, respectively. The spread in the Test-Set and ML-RV histograms is due to error sampling. The shaded grey histogram corresponds to the distribution of all stars in the cross-matched sample.

In Fig. 10, we study the α-element abundance [Mg/Fe] of GSE candidate stars from the RV and Test-Set Catalogues (left-hand panel) and the ML-RV Catalogue (right-hand panel) cross-matched with APOGEE. The location of a star on the [Mg/Fe] − [Fe/H] plane is indicative of its origin in the stellar halo or thin/thick disc. For example, the grey-scale background heatmap in Fig. 10 shows all of the stars in the Gaia EDR3-APOGEE DR17 cross-match (Queiroz et al. 2020). Disc stars correspond to the concentration of stars at metallicities [Fe/H] ≳ −1.0, which separate into a high-α and low-α component. Older, accreted stars from mergers typically populate the plane at lower metallicities and extend to higher α-abundances – see e.g., Wheeler, Sneden & Truran (1989), McWilliam (1997), Tolstoy, Hill & Tosi (2009), Rix & Bovy (2013), Bland–Hawthorn & Gerhard (2016) for relevant reviews.

Scatter plot of stars in the Gaia EDR3-APOGEE cross-match as a function of α-element abundance [Mg/Fe] and metallicity [Fe/H]. The results are shown for the GSE candidates in the RV Catalogue (cyan), as well as a single MC iteration of the Test-Set (yellow) and ML-RV (magenta) Catalogues. The background grey-scale heatmap corresponds to all of the stars in the Gaia EDR3-APOGEE DR17 cross-match (Queiroz et al. 2020). The crosses in the bottom left-hand corner of each panel indicate the mean uncertainty in [Mg/Fe] and [Fe/H] for each respective sample.
Figure 10.

Scatter plot of stars in the Gaia EDR3-APOGEE cross-match as a function of α-element abundance [Mg/Fe] and metallicity [Fe/H]. The results are shown for the GSE candidates in the RV Catalogue (cyan), as well as a single MC iteration of the Test-Set (yellow) and ML-RV (magenta) Catalogues. The background grey-scale heatmap corresponds to all of the stars in the Gaia EDR3-APOGEE DR17 cross-match (Queiroz et al. 2020). The crosses in the bottom left-hand corner of each panel indicate the mean uncertainty in [Mg/Fe] and [Fe/H] for each respective sample.

The [Mg/Fe] distributions of the RV, Test-Set, and ML-RV GSE candidates are consistent with each other. In all three cases, a significant fraction of the stars are concentrated in a regime that extends from [Fe/H] ∈ [ −2.5, −0.75] and [Mg/Fe] ∈ [0.1, 0.4], corresponding to an accreted population. There is also a significant concentration of stars centred at [Fe/H] ∼ −0.5 and [Mg/Fe] ∼0.3, which would correspond to the high-α disc. These stars, which have vϕ ∼ 0, are consistent with the ‘Splash’ population of in situ stars that may have been perturbed during the GSE merger (Bonaca et al. 2017; Belokurov et al. 2020). Approximately 20 per cent (22 per cent) (24 per cent) of the GSE candidates in the RV (Test-Set) (ML-RV) × APOGEE catalogue are in this high-α disc component.

The [Fe/H] and [Mg/Fe] distributions that we recover for the GSE candidates in the ML-RV Catalogue are consistent with those previously published (Di Matteo et al. 2019; Mackereth et al. 2019; Necib et al. 2019b; Bonaca et al. 2020; Das et al. 2020; Feuillet et al. 2020; Mackereth & Bovy 2020; Carollo & Chiba 2021; Feuillet et al. 2021; Hasselquist et al. 2021; Buder et al. 2022; Horta et al. 2023). This clearly demonstrates that the use of network-inferred velocities – which feeds into the determination of eccentricities – does not bias the resulting abundance distributions. We note that there is also a larger fraction of ML-RV GSE candidates that populate the low-α disc regime, which may suggest some disc contamination in the selection.

4.3 Spatial distribution

The spatial distribution of GSE debris in the Milky Way is essential for inferring the initial conditions of the original merger. In literature, the spatial distribution is still a matter of some debate. By exploiting the excellent sky coverage of RR Lyrae stars, Iorio et al. (2017), Iorio & Belokurov (2018) found evidence for a tilted, triaxial stellar halo – likely dominated by GSE debris – within ∼30 kpc of the Galactic Centre. Because the halo is roughly aligned with the Hercules–Aquila Cloud (Belokurov et al. 2007; Simion et al. 2014) and the Virgo Overdensity (Vivas et al. 2001; Newberg et al. 2002; Duffau et al. 2006; Jurić et al. 2008; Bonaca et al. 2012), these authors argue that the overdensities may be associated with the GSE merger – see also Simion, Belokurov & Koposov (2019), Donlon et al. (2019). Naidu et al. (2021) was able to reproduce these observational results with an isolated N-body simulation of a major merger that occurred at infall redshift of z ∼ 2. Other studies, however, argue that the tilt in the global halo may be a result of selection biases in RR Lyrae stars (Helmi 2020). As a counterpoint, Balbinot & Helmi (2021) integrate orbits of GSE candidate stars within 2.5 kpc of the Sun and find that the resulting Galactocentric distribution is isotropic. Donlon et al. (2020) show that the Virgo Overdensity and Hercules–Aquila Cloud would only have survived until today if their progenitor fell in ∼2.7 Gyr ago, much more recently than the expectation for the GSE. However, the orbital integration in both these studies was performed in a spherically symmetric dark matter halo, which has been shown to artificially isotropize stellar distributions (Han et al. 2022).

The GSE candidate stars studied in this work are concentrated near the Solar position, and their spatial distribution is thus representative of the local, not global, stellar halo. Han et al. (2022) emphasized that the local halo is not a representative sample of the global halo, given that its centre is shifted relative to the Galactic Centre. However, as these authors noted, the distribution of the local halo in the ygalzgal plane may preserve features of the global halo as it is not off-centre in this projection. If so, then the tilt angle of our GSE candidates can potentially be used as an indicator of the global tilt of the stellar halo.

To this end, we now explore the spatial distribution of the GSE candidates in the ML-RV Catalogue, with relevant plots provided in Fig. 11. The background heatmap shows the error-sampled distribution for all stars in the sample. The yellow lines are the 2D covariance ellipses of the spatial coordinates of the candidates within 5 kpc of the Sun. For comparison, the blue lines show the corresponding 2D covariance ellipses for fits the RV Catalogue (same as in Fig. 5).

Heatmaps corresponding to the Galactocentric spatial distributions for the high-eccentricity stars in the ML-RV Catalogue. The 2D covariance ellipses for all stars within 5 kpc of the Sun are indicated by the blue (yellow) lines for the RV (ML-RV) Catalogue. There are five essentially indistinguishable lines shown for the fits to the ML-RV Catalogue; each corresponds to a different sample over the network uncertainty. See Table 2 for the best-fit axis ratios and tilt angles.
Figure 11.

Heatmaps corresponding to the Galactocentric spatial distributions for the high-eccentricity stars in the ML-RV Catalogue. The 2D covariance ellipses for all stars within 5 kpc of the Sun are indicated by the blue (yellow) lines for the RV (ML-RV) Catalogue. There are five essentially indistinguishable lines shown for the fits to the ML-RV Catalogue; each corresponds to a different sample over the network uncertainty. See Table 2 for the best-fit axis ratios and tilt angles.

As before, we summarize the spatial distribution using the principal axes ratio of the 3D covariance ellipse, as well as the tilt angle of the 2D covariance ellipse in the ygalzgal plane; these numbers are provided in Table 2. In our validation of the Test-Set Catalogue, we found excellent agreement between truth and predicted elliptical fits. In both of these cases, the best-fit distribution was essentially isotropic in the heliocentric frame, with no significant tilt in the ygalzgal plane. If the local tilt is indicative of the global value, then this result would be consistent with that of Balbinot & Helmi (2021). However, the spatial distribution of the ML-RV GSE candidates is very different from that of the RV candidates: we find a significant ∼59° tilt of these stars above the Galactic plane and away from the Sun.

We have confirmed that the tilt angles for the RV and Test-Set Catalogues (see Table 2) remain essentially unchanged when restricted to stars that lie within a heliocentric distance of 2.5 kpc rather than 5 kpc. In the ML-RV Catalogue, we have verified that the tilt is still present, at 69°, when considering stars within 3.5 kpc; however, it increases to 85° (i.e., the ellipse’s major axis is almost aligned with the y-axis) for stars within 2.5 kpc. The relative axis ratios remain the same when cutting on the distance to the Sun. Moreover, we have verified that the cut on the parallax error does not significantly bias the spatial distribution of stars where these ellipsoidal fits are performed. Further study will be needed to understand the origin of the observed tilt in the ML-RV GSE sample and whether it is representative of the global halo distribution in the ygalzgal plane. At the very least, simulations such as that of Naidu et al. (2021) must also be able to reproduce the local spatial properties of the GSE stars as elucidated by the ML-RV Catalogue.

5 CONCLUSION

The success of the machine learning approach presented in D21 motivated the application of the method to Gaia data, with the first target goal being the study of the GSE, the most recent major merger in the Milky Way’s history. The neural network from D21 takes as input the 5D astrometry of a star and outputs a predicted line-of-sight velocity and associated uncertainty. In this work, we trained the network on ∼6.4 million stars from Gaia EDR3 with complete phase-space information (our ‘RV Catalogue’) and ascertained excellent agreement between the true kinematic distributions of these stars and their inferred distributions (our ‘Test-Set Catalogue’). We then applied the trained network to ∼92 million stars in Gaia EDR3 without full phase-space information (our ‘ML-RV Catalogue’).

A critical feature of the neural network design is its ability to provide uncertainty, σlos, which can be thought of as the network’s overall confidence in its velocity prediction. We explicitly demonstrated that the network uncertainties are nearly Gaussian, so there is a clear understanding of how the predicted line-of-sight velocities deviate from their true values. Importantly, this Gaussian behaviour allows one to create error-sampled distributions on observables of interest, which properly include the network’s uncertainty.

The network’s predictive success is quite remarkable given that it is provided with minimal inputs – parallax, two proper motions, and two spatial coordinates. One can potentially generalize the network to take e.g., action-angle variables, which may enable the exploration of stars out to greater distances in the halo. However, this would come at the expense of assumptions regarding the Galactic potential that could potentially bias the network output. One could also explore the possibility of training the network on some combination of Gaia data as well as other spectroscopic surveys, although this would necessitate a careful treatment of potentially different selection functions between the surveys.

As a first science application of the ML-RV Catalogue, we focused on the identification and characterization of tidal debris of the GSE merger. GSE stars are typically fast moving, retrograde, metal-poor stars on highly eccentric orbits. In this work, we identified GSE candidates using an eccentricity cut off: e > 0.8. We confirmed a strong correlation between network-predicted and truth eccentricities, thus justifying the use of an eccentricity selection in the ML-RV Catalogue. We found ∼450 000 GSE candidates, about twenty times more than the number of stars found using the identical selection criteria in the RV Catalogue, where stars have full 6D information.

The network-predicted distributions of GSE candidates in the 5D Gaia catalogue have Galactocentric velocities, total orbital energy, radial action, z-direction angular momentum, and metallicity-distribution functions that are consistent with the current literature. In a cross-matched subset of the ML-RV Catalogue, we find that a fraction of the GSE candidates are consistent with the high-α disc and are likely part of the ‘Splash,’ or in situ halo, component.

Finally, we examined the spatial distribution of stars identified as GSE. The spatial distribution of GSE candidates within 5 kpc of the Sun is essentially isotropic in both the RV and Test-Set Catalogue. The good correspondence between these two catalogues demonstrates that the network does not bias the resulting spatial distributions through its effect on the predicted eccentricity of the stars. For ML-RV GSE candidates within a similar distance range, the distribution is anisotropic and exhibits a tilt of 59° out of the Galactic plane. Further study is needed to better understand the implications of these findings for the global stellar halo.

The results of this work provide the most extensive mapping of GSE candidates near the Solar position to date, which was made possible by filling in the phase-space of the 5D Gaia EDR3 data. Any numerical simulation of a GSE-like merger, similar to what was performed by Naidu et al. (2020), must be able to reproduce these results for the local halo. Through careful comparison of the simulation to the spatial and chemodynamic distributions of the GSE candidates in Gaia, one can use the simulation to further refine assumptions regarding the initial conditions of the GSE merger, such as its mass, orbital inclination, and velocity. Such simulations can be used to investigate, for example, the impact of GSE on the stellar disc, how GSE debris impacts the shape of the dark matter halo and its implication for the rotation of the disc (whether or not it is aligned with the disc). This last point is an especially exciting application as it is directly relevant to mapping the local dark matter velocity distribution in the Milky Way, which has important applications for uncovering the fundamental nature of the dark matter in direct dark matter detection experiments (Necib, Lisanti & Belokurov 2019a).

The ML-RV Catalogue developed in this work can be used for many other applications beyond the study of the GSE. To this end, we have made the catalogue publicly available https://zenodo.org/record/6558083. One potentially interesting application that has yet to be carefully studied is the use of network-inferred line-of-sight velocities to recover stellar streams. Identifying cold substructures with inferred kinematics may be a challenge, as the predicted uncertainty may be significantly larger than the dispersion, but the possibility of greatly increasing the statistics available for any analysis of these substructures makes such an approach highly attractive.

This paper establishes that machine learning can be used to successfully infer 6D phase-space distributions from 5D astrometry alone. We outline a systematic approach to train and validate the network in a purely data-driven fashion. Moreover, through careful treatment of network uncertainties, we explicitly show that network-predicted distributions are robust and interpretable. When Gaia DR3 is released, it will include RV measurements for ∼30 million more sources. This wealth of new information will provide an ever-larger training set to use towards improving the predictive power of the network. This work demonstrates that moving forward, machine learning can serve a critical role in supplementing astrometric and spectroscopic surveys in cases where the available kinematic data is not complete.

Acknowledgement

The authors would like to thank V. Belokurov, T. Cohen, J. Han, Y. Kahn, L. Necib, D. Roberts, and S. Yaida for fruitful conversations. ML gratefully acknowledges financial support from the Schmidt DataX Fund at Princeton University made possible through a major gift from the Schmidt Futures Foundation. BO was supported in part by the U.S. Department of Energy under contract DE-SC0013607 and DE-SC0020223. HL and ML are supported by the U.S. Department of Energy under Award Number DE-SC0007968. Additionally, HL is supported by National Science Foundation grant PHY-1915409 and the Simons Foundation. AD is supported by National Science FoundationGraduate Research Fellowship Program under grant number DGE-2039656. This work was supported by the National Science Foundation under cooperative agreement PHY-2019786 (The NSF AI Institute for Artificial Intelligence and Fundamental Interactions, http://iaifi.org/) and was performed in part at the Aspen Center for Physics, which is supported by National Science Foundation grant PHY-1607611. The work presented in this paper was performed on computational resources managed and supported by Princeton Research Computing. It made use of the astropy (Robitaille et al. 2013), gala (M. Price-Whelan 2017), corner (Foreman-Mackey 2016), h5py (Collette 2013), IPython (Perez & Granger 2007), Jupyter (Kluyver et al. 2016), matplotlib (Hunter 2007), NumPy (van der Walt, Colbert & Varoquaux 2011), pandas (Wes McKinney 2010), and SciPy (Jones et al. 2001) software packages.

This work has made use of data from the European Space Agency (ESA) mission Gaia (https://www.cosmos.esa.int/gaia), processed by the Gaia Data Processing and Analysis Consortium (DPAC, https://www.cosmos.esa.int/web/gaia/dpac/consortium). Funding for the DPAC has been provided by national institutions, in particular the institutions participating in the Gaia Multilateral Agreement.

DATA AVAILABILITY

The Gaia data are publicly available in the Gaia archive. The APOGEE data are publicly available on the Science Archive Server. The GALAH data are publicly available on the program’s website. The catalogue of machine-learned line-of-sight velocities and uncertainties for Gaia EDR3 together with their source IDs has been made publicly available on Zenodo at https://zenodo.org/record/6558083.

Footnotes

1

Note that the training, testing, and validation of the network are all performed on Gaia EDR3 data in this work. While mock data catalogues were used by D21 to develop the network architecture, they are not used at any stage of the analysis here.

2

Note that when studying GSE candidates later on, we will further restrict to stars with G ∈ [12, 17], where the EDR3 catalogue is essentially spatially complete (Fabricius et al. 2021).

3

The training, validating, and testing of the network occur in the same manner as described in D21.

4

We note that the derived eccentricities may differ if a triaxial halo is used for the orbit integration, as discussed in Han et al. (2022).

5

The quoted mean and standard deviation for the Test-Set and ML-RV Catalogues are averaged across the fits for the five MCs.

References

Angus
R.
,
Price-Whelan
A. M.
,
Zinn
J. C.
,
Bedell
M.
,
Yuxi
,
Lu Foreman-Mackey
D.
,
2022
,
AJ
,
164
,
11

Balbinot
E.
,
Helmi
A.
,
2021
,
A&A
,
654
,
A15

Batson
J.
,
Haaf
C. G.
,
Kahn
Y.
,
Roberts
D. A.
,
2021
,
J. High Energy Phys.
,
2021
,
280

Belokurov
V.
et al. ,
2007
,
ApJ
,
658
,
337

Belokurov
V.
,
Erkal
D.
,
Evans
N. W.
,
Koposov
S. E.
,
Deason
A. J.
,
2018
,
MNRAS
,
478
,
611

Belokurov
V.
,
Sanders
J. L.
,
Fattahi
A.
,
Smith
M. C.
,
Deason
A. J.
,
Evans
N. W.
,
Grand
R. J. J.
,
2020
,
MNRAS
,
494
,
3880

Bird
S. A.
,
Xue
X.-X.
,
Liu
C.
,
Shen
J.
,
Flynn
C.
,
Yang
C.
,
2019
,
AJ
,
157
,
104

Bland-Hawthorn
J.
,
Gerhard
O.
,
2016
,
ARA&A
,
54
,
529

Bonaca
A.
et al. ,
2012
,
AJ
,
143
,
105

Bonaca
A.
,
Conroy
C.
,
Wetzel
A.
,
Hopkins
P. F.
,
Kereš
D.
,
2017
,
ApJ
,
845
,
101

Bonaca
A.
et al. ,
2020
,
ApJ
,
897
,
L18

Bovy
J.
,
2015
,
A&AS
,
216
,
29

Buder
S.
et al. ,
2021
,
MNRAS
,
506
,
150

Buder
S.
et al. ,
2022
,
MNRAS
,
510
,
2407

Bullock
J. S.
,
Johnston
K. V.
,
2005
,
ApJ
,
635
,
931

Carollo
D.
,
Chiba
M.
,
2021
,
ApJ
,
908
,
191

Collette
A.
,
2013
,
Python and HDF5
.
O’Reilly Media, Inc

Das
P.
,
Hawkins
K.
,
Jofré
P.
,
2020
,
MNRAS
,
493
,
5195

Deason
A. J.
,
Belokurov
V.
,
Koposov
S. E.
,
Lancaster
L.
,
2018
,
ApJ
,
862
,
L1

De Lucia
G.
,
Helmi
A.
,
2008
,
MNRAS
,
391
,
14

Di Matteo
P.
,
Haywood
M.
,
Lehnert
M. D.
,
Katz
D.
,
Khoperskov
S.
,
Snaith
O. N.
,
Gómez
A.
,
Robichon
N.
,
2019
,
A&A
,
632
,
A4

Donlon Thomas
I.
,
Newberg
H. J.
,
Weiss
J.
,
Amy
P.
,
Thompson
J.
,
2019
,
ApJ
,
886
,
76

Donlon Thomas
I.
,
Newberg
H. J.
,
Sanderson
R.
,
Widrow
L. M.
,
2020
,
ApJ
,
902
,
119

Donlon Thomas
I.
,
Newberg
H. J.
,
Kim
B.
,
Lepine
S.
,
2021
,
ApJ
,
932
,
L16

Dropulic
A.
,
Ostdiek
B.
,
Chang
L. J.
,
Liu
H.
,
Cohen
T.
,
Lisanti
M.
,
2021
,
ApJ
,
915
,
L14

Dropulic
A.
,
Liu
H.
,
Ostdiek
B.
,
Lisanti
M.
,
2022
,
Gaia EDR3 Catalogs of Machine-Learned Radial Velocities
,
Zenodo
,

Duffau
S.
,
Zinn
R.
,
Vivas
A. K.
,
Carraro
G.
,
Méndez
R. A.
,
Winnick
R.
,
Gallart
C.
,
2006
,
ApJ
,
636
,
L97

Fabricius
C.
et al. ,
2021
,
A&A
,
649
,
A5

Feuillet
D. K.
,
Feltzing
S.
,
Sahlholdt
C. L.
,
Casagrande
L.
,
2020
,
MNRAS
,
497
,
109

Feuillet
D. K.
,
Sahlholdt
C. L.
,
Feltzing
S.
,
Casagrande
L.
,
2021
,
MNRAS
,
508
,
1489

Font
A. S.
,
Johnston
K. V.
,
Bullock
J. S.
,
Robertson
B. E.
,
2006
,
ApJ
,
638
,
585

Forbes
D. A.
,
2020
,
MNRAS
,
493
,
847

Foreman-Mackey
D.
,
2016
,
J. Open Source Softw.
,
1
,
24

Gaia Collaboration
,
2016
,
A&A
,
595
,
A1

Gaia Collaboration
,
2018
,
A&A
,
616
,
A11

Gaia Collaboration
,
2021
,
A&A
,
649
,
A1

Gallart
C.
,
Bernard
E. J.
,
Brook
C. B.
,
Ruiz-Lara
T.
,
Cassisi
S.
,
Hill
V.
,
Monelli
M.
,
2019
,
Nature Astron.
,
3
,
932

Gudin
D.
et al. ,
2021
,
ApJ
,
908
,
79

Han
J. J.
et al. ,
2022
,
AJ
,
164
,
249

Hasselquist
S.
et al. ,
2021
,
ApJ
,
923
,
172

Haywood
M.
,
Matteo
P. D.
,
Lehnert
M. D.
,
Snaith
O.
,
Khoperskov
S.
,
Gómez
A.
,
2018
,
ApJ
,
863
,
113

Helmi
A.
,
White
S. D. M.
,
Springel
V.
,
2003
,
MNRAS
,
339
,
834

Helmi
A.
,
Babusiaux
C.
,
Koppelman
H. H.
,
Massari
D.
,
Veljanoski
J.
,
Brown
A. G. A.
,
2018
,
Nature
,
563
,
85

Horta
D.
et al. ,
2023
,
MNRAS
,
in press

Hunter
J. D.
,
2007
,
Comput. Sci. Eng.
,
9
,
90

Iorio
G.
,
Belokurov
V.
,
2018
,
MNRAS
,
482
,
3868

Iorio
G.
,
Belokurov
V.
,
2021
,
MNRAS
,
502
,
5686

Iorio
G.
,
Belokurov
V.
,
Erkal
D.
,
Koposov
S. E.
,
Nipoti
C.
,
Fraternali
F.
,
2017
,
MNRAS
,
474
,
2142

Virtanen
P.
et al. ,
2020
,
Nature Methods
,
17
,
261

Jurić
M.
et al. ,
2008
,
ApJ
,
673
,
864

Katz
D.
et al. ,
2019
,
A&A
,
622
,
A205

Kluyver
T.
et al. ,
2016
, in
Fernando
L.
,
Birgit
S.
, eds,
Positioning and Power in Academic Publishing: Players, Agents and Agendas
.
IOS Press
, p.
87

Koppelman
H. H.
,
Helmi
A.
,
Veljanoski
J.
,
2018
,
ApJ
,
860
,
L11

Koppelman
H. H.
,
Helmi
A.
,
Massari
D.
,
Roelenga
S.
,
Bastian
U.
,
2019a
,
A&A
,
625
,
A5

Koppelman
H. H.
,
Helmi
A.
,
Massari
D.
,
Price-Whelan
A. M.
,
Starkenburg
T. K.
,
2019b
,
A&A
,
631
,
L9

Koppelman
H. H.
,
Bos
R. O. Y.
,
Helmi
A.
,
2020
,
A&A
,
642
,
L18

Kruijssen
J. M. D.
et al. ,
2020
,
MNRAS
,
498
,
2472

Lancaster
L.
,
Koposov
S. E.
,
Belokurov
V.
,
Evans
N. W.
,
Deason
A. J.
,
2019
,
MNRAS
,
486
,
378

Limberg
G.
et al. ,
2021
,
ApJ
,
913
,
11

Lindegren
L.
et al. ,
2021
,
A&A
,
649
,
A4

Price-Whelan
A. M.
,
2017
,
J. Open Source Softw.
,
2
,
388

Mackereth
J. T.
,
Bovy
J.
,
2020
,
MNRAS
,
492
,
3631

Mackereth
J. T.
et al. ,
2019
,
MNRAS
,
482
,
3426

McWilliam
A.
,
1997
,
ARA&A
,
35
,
503

Majewski
S. R.
et al. ,
2017
,
AJ
,
154
,
94

Massari
D.
,
Koppelman
H. H.
,
Helmi
A.
,
2019
,
A&A
,
630
,
L4

Montalbán
J.
et al. ,
2021
,
Nature Astron.
,
5
,
640

Myeong
G. C.
,
Evans
N. W.
,
Belokurov
V.
,
Sanders
J. L.
,
Koposov
S. E.
,
2018
,
ApJ
,
863
,
L28

Myeong
G. C.
,
Vasiliev
E.
,
Iorio
G.
,
Evans
N. W.
,
Belokurov
V.
,
2019
,
MNRAS
,
488
,
1235

Naidu
R. P.
,
Conroy
C.
,
Bonaca
A.
,
Johnson
B. D.
,
Ting
Y.-S.
,
Caldwell
N.
,
Zaritsky
D.
,
Cargile
P. A.
,
2020
,
ApJ
,
901
,
48

Naidu
R. P.
et al. ,
2021
,
ApJ
,
923
,
92

Naik
A.
,
Widmark
A.
,
2022
,
MNRAS
,
516
,
3398

Necib
L.
,
Lisanti
M.
,
Belokurov
V.
,
2019a
,
ApJ
,
874
,
3

Necib
L.
,
Lisanti
M.
,
Garrison-Kimmel
S.
,
Wetzel
A.
,
Sanderson
R.
,
Hopkins
P. F.
,
Faucher-Giguère
C.-A.
,
Kereš
D.
,
2019b
,
ApJ
,
883
,
27

Newberg
H. J.
et al. ,
2002
,
ApJ
,
569
,
245

Perez
F.
,
Granger
B. E.
,
2007
,
Comput. Sci. Eng.
,
9
,
21

Queiroz
A. B. A.
et al. ,
2020
,
A&A
,
638
,
A76

Rix
H.-W.
,
Bovy
J.
,
2013
,
A&A Rev.
,
21
,
61

Robertson
B.
,
Bullock
J. S.
,
Font
A. S.
,
Johnston
K. V.
,
Hernquist
L.
,
2005
,
ApJ
,
632
,
872

Robitaille
T. P.
et al. ,
2013
,
A&A
,
558
,
A33

Simion
I. T.
,
Belokurov
V.
,
Irwin
M.
,
Koposov
S. E.
,
2014
,
MNRAS
,
440
,
161

Simion
I. T.
,
Belokurov
V.
,
Koposov
S. E.
,
2019
,
MNRAS
,
482
,
921

Tolstoy
E.
,
Hill
V.
,
Tosi
M.
,
2009
,
ARA&A
,
47
,
371

van der Walt
S.
,
Colbert
S. C.
,
Varoquaux
G.
,
2011
,
Comput. Sci. Eng.
,
13
,
22

Vincenzo
F.
,
Spitoni
E.
,
Calura
F.
,
Matteucci
F.
,
Silva Aguirre
V.
,
Miglio
A.
,
Cescutti
G.
,
2019
,
MNRASL
,
487
,
L47

Vivas
A. K.
et al. ,
2001
,
ApJ
,
554
,
L33

Wes
McKinney
,
2010
, in
van der Walt
S.
,
Millman
J.
, eds,
Proc. 9th Python in Science Conference
. p.
56

Wheeler
J. C.
,
Sneden
C.
,
Truran James
W. J.
,
1989
,
ARA&A
,
27
,
279

White
S. D. M.
,
Frenk
C. S.
,
1991
,
ApJ
,
379
,
52

Yuan
Z.
,
Chang
J.
,
Beers
T. C.
,
Huang
Y.
,
2020
,
ApJ
,
898
,
L37

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)