Hidden deep in the halo: Selection of a reduced proper motion halo catalogue and mining retrograde streams in the velocity space

The Milky Way halo is one of the few galactic haloes that provides a unique insight into galaxy formation by resolved stellar populations. Here, we present a catalogue of $\sim$47 million halo stars selected independent of parallax and line-of-sight velocities, using a combination of Gaia DR3 proper motion and photometry by means of their reduced proper motion. We select high tangential velocity (halo) main sequence stars and fit distances to them using their simple colour-absolute-magnitude relation. This sample reaches out to $\sim$21 kpc with a median distance of $6.6$ kpc thereby probing much further out than would be possible using reliable Gaia parallaxes. The typical uncertainty in their distances is $0.57_{-0.26}^{+0.56}$ kpc. Using the colour range $0.45<(G_0-G_\mathrm{RP,0})<0.715$ where the main sequence is narrower, gives an even better accuracy down to $0.39_{-0.12}^{+0.18}$ kpc in distance. The median velocity uncertainty for stars within this colour range is 15.5 km/s. The distribution of these sources in the sky, together with their tangential component velocities, are very well-suited to study retrograde substructures. We explore the selection of two complex retrograde streams: GD-1 and Jhelum. For these streams, we resolve the gaps, wiggles and density breaks reported in the literature more clearly. We also illustrate the effect of the kinematic selection bias towards high proper motion stars and incompleteness at larger distances due to Gaia's scanning law. These examples showcase how the full RPM catalogue made available here can help us paint a more detailed picture of the build-up of the Milky Way halo.


INTRODUCTION
A large body of evidence shows that the assembly of the Milky Way is irrefutably hierarchical.The Galactic halo in particular has a non-linear structure with a vast number of chemical and dynamical stellar streams that allow us to study the formation history of our galaxy.It hosts a range of different substructures -from cold streams (e.g., Shipp et al. 2018;Malhan et al. 2018;Ibata et al. 2021;Martin et al. 2022;Li et al. 2022) to more diffuse merger events such as the recently discovered major merger Gaia-Enceladus (Helmi et al. 2018;Belokurov et al. 2018;Koppelman et al. 2018), the currently disrupting Sagittarius (Ibata et al. 1995), the Helmi streams (Helmi et al. 1999), Thamnos (Koppelman et al. 2019), Sequoia/I'itoi/Arjuna (Myeong et al. 2019;Naidu et al. 2020), LMS-1/Wukong (Yuan et al. 2020;Naidu et al. 2020), Cetus (Newberg et al. 2009;Yuan et al. 2022) and overdensities such as the Hercules-Aquila cloud, the Virgo Overdensity, and Eridanus-Phoenix (Belokurov et al. 2006;Balbinot et al. 2021).The cold stellar streams (see also Mateu 2022, for a recent compilation of streams discovered to date) are thought to be disrupting or disrupted dwarf galaxies or globular clusters that remain as coherent elongated structures for a long time after the * E-mail: astroakshara97@gmail.com (AV) progenitor is fully dissolved before getting phase-mixed (see e.g., Helmi 2008;Helmi 2020, and references therein).Stellar streams as well as more phase-mixed material that can be discovered due to their coherence in energy and angular momentum (e.g., Ruiz-Lara et al. 2022) are direct evidence that the Milky Way halo is assembled through several merger events in the past.These accreted streams allow us to understand the history, formation and evolution of the Milky Way and help probe the Galactic acceleration field and dark matter distribution (Koposov et al. 2010;Ibata et al. 2021).Therefore, the Galactic halo is an interesting playing field in nearfield cosmology with a vast number of chemical and dynamical stellar streams that allow us to study the formation history of our Galaxy.
With the release of Gaia early Data Release 3 (EDR3) and Data Release 3 (DR3) (Gaia Collaboration et al. 2021Collaboration et al. , 2022)), we have full astrometric solutions for 1.468 billion sources out of the total 1.811 billion sources that were mapped over a period of 34 months of data collection.Gaia DR3 also provides radial velocities for more than 33 million sources and BP/RP spectra for more than 220 million sources (Katz et al. 2022;Angeli et al. 2022;Andrae et al. 2022).This, in combination with large ground-based spectroscopic surveys such as APOGEE (Majewski et al. 2017;Ahumada et al. 2020), GALAH (Buder et al. 2021), SDSS SEGUE (Yanny et al. 2009), and LAMOST (Cui et al. 2012), has provided the community of Galactic archaeology with exciting new prospects for halo cartography and the study of substructures.Because spectroscopic studies are often limited in their apparent magnitude ranges, kinematic and dynamical studies of the more distant Galactic halo and its substructures often make use of tracers that are intrinsically bright, like red giants (e.g., Naidu et al. 2020;Chandra et al. 2022).Blue Horizontal Branch stars and other bright standard candles such as RR Lyrae are also often employed because they give additional distance information and thereby paint a 3D picture (e.g., Deason et al. 2018;Starkenburg et al. 2019;Wang et al. 2022).
Many of the research results produced in studying the halo focus on the Gaia data with three dimensional position and velocity spacecalled the Gaia 6D sample (see a recent application of Gaia 6D sample in Recio-Blanco et al. ( 2022)) where we can probe the Integrals of Motion space to study the assembly of the halo (Helmi & White 1999;Helmi et al. 2000).However, this 6D sample makes up only a small part of the entire Gaia sample without line-of-sight velocities -called the Gaia 5D sample.Only a sub-sample of 2.4% of the sources out of the 78% of the sources with astrometric solutions have Gaia lineof-sight velocities.Additionally, if one is seeking to add distance information, one is hampered by the fact that most of the stars in Gaia DR3 suffer from poor parallaxes.Only about 9.5% of the stars in Gaia DR3 have parallax_over_error > 5 (allowing 20% error in distances) i.e., only one-tenth of the humongous Gaia DR3 can be used to study sources with precise distance measurements derived from Gaia parallaxes.
Much of the wealth of Gaia DR3's 1.8 billion stars thus lies in the 5D sample with accurate photometry and astrometry, most of which is comprised of low luminosity low mass main sequence stars.Although it comes with clear limitations -such as the lack of lineof-sight properties, poor parallax as well as unconstrained parallax offset for distances beyond 2 kpc, and the fact that the volume (in terms of distances) of stars that can be probed is smaller than that of giants -there is much benefit in mining these stars effectively for Galactic Archaeology purposes.The STREAMFINDER algorithm is a clear example of successful mining of (part of) this large dataset, as it has been finding and characterising many coherent stellar streams in the halo (Ibata et al. 2021).
Main sequence stars are moreover very useful tracers to use in addition to brighter stars as they make up the majority of the stellar halo.The low-mass end of the main sequence is very long-lived and these stars preserve the imprint of their birth materials in the atmosphere better than more evolved stars.Most importantly, they vastly outnumber the brighter stars, making them useful to probe very small substructures, or faint surface brightness features.
In this work, we aim to select halo main sequences stars using the Gaia 5D sample.We explore the selection of Galactic halo main sequence stars out to ∼ 21 kpc using Gaia DR3 proper motions and photometry, which derives the reduced proper motion (see Jones 1972;Gould 2007;Smith et al. 2009 for more information about the reduced proper motion and its usage in various contexts).A similar selection method was recently used by Kim & Lépine (2021) to select half a million local halo main sequence stars, but they restricted their sample to stars out to 2 kpc using the parallax information from Gaia.Here, we instead follow the same method as explored on Gaia DR2 data by Koppelman & Helmi (2021a) and we include stars without good parallax information.As with the release of Gaia DR3, the limiting Gaia G-magnitude increased to ∼22 with the survey being complete up to G<19 (Fabricius et al. 2021a), we see that our selection method increases the catalogue size five times compared to Koppelman & Helmi (2021a).The simple colour-magnitude relation for the main sequence stars works in our favour to fit very .The high density region consists of mostly disc stars (of ∼99% of the stars).The area inside the polygon enclosed by grey dashed lines represents the tentatively selected halo sources at these high tangential velocities vertically bounded by the main sequence colour range.A -1.6 [M/H], 12 Gyr age isochrone (where absolute magnitude is converted to reduced proper motion parameter) at these higher tangential velocities is over-plotted in red and blue to show that the selections we make correctly picks up halo main sequence stars reliable photometric distances to these stars, allowing many avenues for exploration.
In this paper, we explore the study of fainter counterparts of retrograde stellar streams by using the binned velocity space of the catalogue.We furthermore improve distance information to candidate stellar stream stars by folding in available metallicity information.
This paper is laid out as follows: In Section 2, we introduce the reduced proper motion, explain the selection method, photometric distance fitting, catalogue description and distance validation.Binned velocity space is introduced in Section 3. Selection, validation and the improvement of photometric distances using metallicity information of retrograde stellar streams GD-1 and Jhelum are covered in subsections 3.1 and 3.2 respectively.In subsection 3.3, we show how a combination of Gaia's scanning law and our high proper motion selection provides us with a complementary view of the Sagittarius stream.Section 4 presents the summary of our results, future works, synergies with different datasets and potential science cases for the catalogue.

REDUCED PROPER MOTION SELECTED HALO SAMPLE
In this paper, we use the Gaia data release 3 (Gaia Collaboration et al. 2022) astrometry and photometry to select halo main sequence stars.

High tangential velocity stars in halo orbits
The process of selecting main sequence stars on halo-like orbits in Gaia DR2 (Gaia  Top panels: RPM selected halo using Gaia DR2; All sources (left), sources with heliocentric distances greater than 5 kpc (middle), sources with heliocentric distances greater than 10 kpc (right).Bottom panels: RPM selected halo using Gaia DR3; All sources (left), sources with heliocentric distances greater than 5 kpc (middle), sources with heliocentric distances greater than 10 kpc (right) K21) using just the Gaia proper motion (astrometry) and photometry information, the combination of which renders reduced proper motion.
The (Gaia G-band) reduced proper motion is given by the following equation(s): which is equivalent to: where  is the total reflex corrected proper motion in mas/yr defined as √︃ ( * RA cos(Dec)) 2 +  * Dec 2 provided by Gaia,  G,0 is the apparent Gaia G-magnitude after applying extinction correction,  G,0 is the absolute G-magnitude and  tan is the tangential velocity in km/s of the source that is proportional to the product of distance in kpc and proper motion in mas/yr by a factor of 4.74057 which is the result of unit conversions to obtain the velocity in terms of km/s.
Using the relation between equation ( 1) and ( 2), it can be seen that we are able to select high tangential velocity stars in the reduced proper motion versus colour diagram (hereafter, the RPM diagram).This is because, main sequence stars follow a simple colour-absolute magnitude relation and therefore, at a fixed  tan , a simple RPM colour relation.
Looking at the relation in equation ( 2), we can see that the RPM diagram simply mimics the colour-absolute magnitude diagram if we select stars with similar tangential velocities.Groups of stars that belong to the same substructure with small dispersion in tangential velocity will look similar in the RPM and HR diagram.The  tan for disc stars are small, which means that clean samples of halo stars can be easily selected by selecting high  tan stars.It should be noted that such kinematically selected halo samples disfavour sources with low tangential velocities (in turn, small proper motions) and high lineof-sight velocities.While such a selection of halo stars thus might be very clean, it is not complete, and it has a dynamical selection bias.
The dependence of   on both absolute magnitude and tangential velocity brings another useful property: we can safely assume that the stars living in regions of the RPM diagram where main sequence stars with high tangential velocities are selected are indeed main sequence stars.If they were on the (much more luminous) giant branch instead, their tangential velocities derived from these proper motions, but at a further distance, would have to be so high that they exceed the finite escape velocity of our Galaxy (Koppelman & Helmi 2021b) and to have many of these stars in the sample would be highly unlikely.The election of high tangential velocity (halo) main sequence stars in this RPM diagram allows us to create a sample with well-understood distances since main sequence stars have an approximately linear relationship between absolute magnitude and colour.
In summary, the RPM diagram can be constructed using a combination of Gaia observables that are reliably available for a very large sample of its stars (apparent magnitude and proper motions), as shown in Eqn.(1).It then returns valuable information on the absolute magnitude -and thereby distance and tangential velocityof the stars in selected areas of the RPM space where main sequence halo stars live.

Photometric Distance Calibration
Most of the established methods to select halo stars involve the use of distance and spectroscopy information.By selecting the halo stars using the reduced proper motion method, we end up having a halo catalogue that is independent of using distances as an input.In turn, we aim to calculate the distances to these stars by using the apparent Gaia magnitude in the G-band.This colour absolute magnitude relation is simply referred to as photometric parallax in literature (Jurić et al. 2008).Photometric distance in kpc as a function of Gaia photometry and colour is given by the following equation: The errors in absolute magnitude are computed as the width of the absolute magnitude range for all the stars with high tangential velocities and good parallax in the RPM diagram in each of the colour bins.The percentage error in  after propagating the errors in absolute magnitude (with the assumption that the errors in apparent magnitude are negligible) is given by where we assume an approximately linear relation between colour and absolute magnitude, which is a good approximation for the main sequence.In this work, we choose to use  0 −  RP,0 colour, as  BP,0 flux tends to be biased and overestimated for fainter sources.Using it would therefore be less ideal as it needs to be filtered creating selection effects (Riello et al. 2021a).

Extinction correction and quality cuts
For the reduced proper motion halo selection using the recent Gaia DR3 catalogue we perform several photometric as well as astrometric quality cuts, and we remove sources with high extinction.The photometry used for analyses in this paper is corrected for interstellar extinction using Schlegel et al. (2009) 2D dust maps and following the procedure of K21 with small modifications.We will briefly outline the steps below.The 2D dust maps account for the entire ISM dust along the line-ofsight.Following Binney et al. (2014) who use the relation by Sharma et al. (2011) for the dust density model, we are able to calculate a parameter called the "extinction fraction" that accounts for only the amount of foreground dust for each source based on its location in the sky.This fraction particularly makes a difference in extinction correction for sources that are within the solar neighbourhood.A simple analytical form of how we calculate the extinction fraction called the  ext (short for normalised extinction) in V-band is, where the denominator is the extinction value given directly from the 2D dust maps and the numerator is the amount of extinction for a star that is at a distance  in kpc from the sun with the direction vector ì  along the latitude and longitude (ℓ, ) in radians and the function represented by ( ì ( )) inside the integral is the dust density which is adopted from Eqn. (16) in Sharma et al. (2011).For sources with parallax < 0.1 mas (i.e., distant stars), we fix the  ext to be 1.0 to make the process computationally feasible and faster.
We also scale the 2D dust maps in regions where  ( −) > 0.15 because they are overestimated (Arce & Goodman 1999).For this, we use the equation from Binney et al. (2014).The scaling factor is estimated as follows: It is noteworthy that these corrections don't have a large impact on the selection of halo stars for the purposes of this study.However, the extinction correction process that we follow here can have an effect on sources in the solar neighborhood as well as more highly reddened regions.
We use an extinction curve with   = 3.1 that is computed using Padova model 1 which is originally based on O'Donnell (1994) and 1 http://stev.oapd.inaf.it/cgi-bin/cmd_3.7 Cardelli et al. (1989).Using this tool for Gaia DR3 photometric system, we apply     = 0.83627,     = 1.08337 and     = 0.63439 to obtain the extinction correction for each Gaia passband.Based on Section 8.3 from Riello et al. (2021b), we solve for a correction to the internally calibrated mean source G-band photometry to account for the systematic effect due to the use of default colour in the Image Parameter Detection (IPD).The correction is represented by a simple cubic polynomial as a function of BP-RP colour for different ranges of G-band, the coefficients of which are taken from Table 5 in Riello et al. (2021b).The corrected G-flux is a product of the given G-flux and this correction factor while the corrected apparent G-magnitude is  G − 2.5 × log(correction-factor).We calculate a correction for phot_bp_rp_excess_factor using values based on Table 2 in Riello et al. (2021b) which gives us the value of what we call excess_flux.We also calculate one sigma deviation for the excess_flux for a sample of sources with good quality Gaia photometry using equation ( 18) from Riello et al. (2021b).Using these two parameters, we place a photometric quality cut on the full Gaia DR3 sample that is defined as (excess_flux) < 5 × sigma_excess_flux.Additionally, we filter the sources with bad astrometric solutions by removing all the sources with ruwe > 1.4 (See more information on this in Lindegren et al. 2021 andFabricius et al. 2021b).We remove high extinction sources using the quality cut   < 2.0, which also acts as a way to remove disc contamination as most of the high extinction stars come from lower latitudes close to the disc.The mean values of  ( − ) and the extinction normalisation factor in the V-band  ext in the final catalogue are 0.19 and 0.92 respectively.

Fitting the main sequence sources
Producing a reliable fit for the absolute magnitude of main sequence stars for different values of Gaia colour is one of the important steps in generating photometric distances to the sources in this halo catalogue.Instead of relying on isochrones, which are known to not always describe the data well in this colour-ranges (Di Matteo et al. 2019), we use an empirical relation based on real data.We select stars with high tangential velocities ( tan > 200 km/s) that also have good quality parallaxes (parallax_over_error> 5) in the Gaia colour range 0.35 <  0 −  RP,0 < 1.1 and impose a three component linear fit in absolute G-magnitude ranges between 4 and 6, 5 and 8, and above 8.After applying a zero point offset of 17 as to the Gaia DR3 parallaxes (Lindegren et al. 2021), we invert them to use an estimate of distances to get the absolute magnitudes for this fit.We also perform a more precise running mean and standard deviation fit for the absolute magnitude by binning the colours into 128 components.We use the [M/H] = −0.5, 11 Gyr isochrone (obtained from Marigo et al. 2017) shifted by 0.01 mag in  0 −  RP,0 to describe the valley between blue and red sequence.The blue sequence is inherently described as the halo stars with almost no rotation while the red sequence is heavily populated by heated-up thick disc stars with slow rotation (Koppelman et al. 2018;Di Matteo et al. 2019).All the stars to the right of this shifted isochrone are removed as red sequence contamination.This contamination is further reduced by using a stricter  tan > 300 km/s cut for the fitting purpose.We also remove residual stars below the main sequence using the following empirical cuts:  0 −  RP,0 < 0.65 and  G,0 > 8 or  0 −  RP,0 < 0.8 and  G,0 > 10.

Selecting the final catalogue
We go back to the RPM diagram for the entire Gaia DR3 sample, now fully cleaned, and place a tangential velocity cut between 200 and 800 km/s in the main-sequence colour range.We then assign photometric distances to each of these stars based on the 3-component linear and running mean main sequence fit.The corresponding extinctioncorrected RPM diagram for the entire Gaia DR3 data that passes the photometry, astrometry and extinction cuts as described in subsubsection 2.3.1 is shown in Fig. 1.The grey polygon is drawn based on the 3-component fit to the main sequence at 200 and 800 km/s tangential velocities.The red and blue isochrones (converted to RPM sequence using equation ( 2)) are taken from Marigo et al. (2017) with [M/H] = −1.6 and 12 Gyr age corresponding to the average metallicity and age of the local halo (e.g., Youakim et al. 2020) to illustrate that the high  tan cut we impose on the main sequence picks up halo stars indeed.
White dwarfs are removed by excluding all the sources below the 3-component fit on the main sequence offset by 2 mag in  G,0 .Because the uncertainties in proper motions and photometry are not large enough for the intrinsically brightest giants on the higher Red Giant Branch (hereafter RGB) to be picked up in our sample, we claim that the giant contamination is negligible.As a final quality cut, we place an empirical cut on reduced proper motion uncertainty in order to remove contamination from the lower RGB stars bleeding into the main sequence selection we use.The quality cut we impose on Gaia DR3 reduced proper motion parameter over reduced proper motion uncertainty (computed by propagating the errors in apparent G-magnitude and proper motions) is to keep all the sources that satisfy the following condition: log     > 1.75.

Final catalogue
The final catalogue comprises 47,650,376 provisional halo main sequence sources from Gaia DR3 selected using the reduced proper motion property.This is approximately five times larger than the number of sources presented in K21 using Gaia DR2.A subsample of this final catalogue with reliable photometric distances that excludes turnoff and redder stars by considering only stars with 0.45 < ( 0 − RP,0 ) < 0.715 has a total of 24,647,379 stars, which is 3.5 times bigger than the one produced with Gaia DR2.Fig. 2 shows the on-sky density distribution of tentative halo main sequence stars from Gaia DR2 (K21) and Gaia DR3 (this work) for different distance bins (All,  > 5 kpc and  > 10 kpc) in Galactic coordinates colour-coded by the logarithm of the number of stars present in each pixel produced using a k=12 HEALPix pixel level.The bottom middle panel shows the distribution of sources that have a heliocentric distance of more than 5 kpc which equals to ∼ 75% of the total sample while the bottom right panel shows the distribution of sources that have a heliocentric distance of more than 10 kpc that makes up 15% of the total sample.The mean distance of the Gaia DR3 RPM selected halo sample is ∼6.6 kpc and goes out to 21 kpc which is much farther out than was possible using the catalogue derived for Gaia DR2 whose mean distance was 4.3 kpc.This is further illustrated in Fig. 2. The on-sky distribution of the new catalogue looks more complete at distances greater than 5 kpc.Pixels with low/no stars in all three panels (especially at higher distances) correspond to high-extinction regions according to the Gaia dust maps.The Gaia scanning pattern created as Gaia has been scanning some regions more frequently is prominently visible in all the panels, because we probe the fainter stars pushing the limits up to Gaia's limiting ap- parent magnitude of 22 (see Riello et al. 2021b for a clear view of Gaia's scanning pattern at fainter magnitudes).The resulting catalogue is available with this work.The first nine rows of the catalogue and their available parameters are shown in Table 1.

Photometric distance validation
One of the major advantages of using this sample to study the halo is that we can derive reliable photometric distances to the main sequence halo stars.Therefore, it is important that we analyse the quality of the photometric distances we calculate.The textbook sam-ple to compare with are the inverted parallaxes from Gaia for sources with good quality parallaxes, parallax_over_error > 20 within the RPM selected halo sample.We end up with ∼80,000 sources with good parallaxes which is less than 1% of the entire sample size as Gaia parallaxes are unreliable for distant and faint sources.We use these 80,000 sources as a representative sub-sample to validate the quality of the photometric distances computed using the main sequence fit.The product of the photometric distance and parallax (which should ideally be 1.0) and the percentage error in the photometric distances that are calculated by propagating the uncertainties in the absolute magnitude calculated based on the running mean fit on each of the 128 Gaia colour bins (see equation ( 4)) versus Gaia colour are shown in the top and middle panel of Fig. 3 respectively.It is clearly evident from these figures that the photometric distances are more reliable, relatively speaking, inside the colour range: 0.45 < ( 0 −  RP,0 ) < 0.715.The typical uncertainty within this colour range is down to 7.4% The sample within this colour range has a median distance error of 0.39 +0.18  −0.12 kpc and a median velocity error taking into account the error in the distance as well as that in the proper motions of 15.5 km/s.The whole sample has a median distance error of 0.57 +0.56  −0.26 kpc.Stars bluer and redder than this have typically larger relative uncertainties.On the one hand, the typical uncertainty in the bluer main sequence turnoff part with ( 0 −  RP,0 ) < 0.45 is 12%.This larger parallax-distance deviation near the main-sequence turnoff is caused by the almost vertical nature of this part of the HR diagram; this makes absolute G-magnitude as a function of colour more uncertain.Typically, our method tends to underestimate the distances for these stars which explains the overpopulation of stars at the MSTO colour range below 1.0 in the y-axis in the top panel of Fig. 3.One illustration of the possible consequences of this underestimation is shown in Section 3.3.On the other hand, for redder sources than  0 −  RP,0 > 0.715 the typical distance uncertainty inflates to 16% and we see from the parallax-distance comparison that the distances are typically overestimated (see the stars at the redder range above 1.0 in y-axis in the top panel of Fig. 3).The cause of this is a broadening of the main sequence at the faint end due to a lack of faint stars (Gaia's incompleteness from G > 20) increasing the uncertainty in the calibration of photometric distances in this range.
The bottom panel of Fig. 3 shows the distribution of heliocentric and galactocentric distances.The two local peaks in the galactocentric distance distribution are attributed to the overpopulation of stars around the solar neighbourhood as expected.Thus, the distances agree well with the parallaxes in the reliable colour range and the distribution agrees with the hypothesis that the Milky Way stellar halo has a steep density profile as presented in the literature so far (Deason et al. 2011).

SUBSTRUCTURE(S) IN THE BINNED VELOCITY SPACE
Hunting for spatially coherent halo substructures in the sky is possible using proper motion information while the structures are hidden in a maze of smooth background halo distribution.An even more reliable approach is to combine proper motion with the distance information and plot the mean tangential velocity components of each pixel in the sky to look for cold streams and/or over-densities.This method will work only if the velocity vectors in the sky are sufficiently different from the background velocity distribution and it is also sensitive to the velocity dispersion of the structure itself and uncertainties on the velocities.In this section we will use this approach on the RPM sample.It is important to remember here though that: (i) the RPM selected halo sample suffers from a kinematic selection bias and (ii) the distances are not very accurate close to the turnoff and for fainter stars with redder colours.
To find the on-sky velocity components, we use the following equation: where i can be the Galactic longitude or latitude in radians to derive their respective velocity components in km/s.It is important that these space velocities are corrected for solar reflex motion using the distances that we calculate.For this purpose, we calculate the solar velocities at each (l,b) using the following equation: where we use the solar motion constants ( ,  ,  ) = (11.1,12.24, 7.25) km/s and the local standard of rest motion  LSR = 232.8km/s from Schönrich et al. (2010) and McMillan (2016) respectively.Now we add the solar correction to the velocities we calculated in equation ( 7) to get the solar motion corrected Galactic space velocities in km/s.
where  can be (, ) in radians.In this halo main sequence catalogue, the mean uncertainty in velocities for the entire sample within the  0 − RP,0 colour range where the distances are more reliable are  *  ∼ 23.6 km/s and  *  ∼ 17.9 km/s.The mean errors for turnoff and fainter stars are comparatively larger:  *  ∼ 37.2 km/s and  *  ∼ 24.8 km/s.
Only 12,862 stars in our catalogue have radial velocity measurements from Gaia because of the intrinsically faint tracers of the sample.Cross-matches with spectroscopic surveys such as LAMOST, SDSS SEGUE, APOGEE and GALAH give us less than 100,000 stars in common altogether.As this would severely restrict our search, we compute pseudo-3D-velocities instead by assuming the radial velocity component to be zero i.e.,  * los = 0. Taking into account the local standard of rest, the motion of stars around the Galactic Centre follows a sin  cos  pattern which can be put to use without lineof-sight velocities in certain regions in the sky, such as near both Galactic poles, near Galactic centre and anti-centre (Kim & Lépine 2021).However, here we are trying to look for streams and substructures on the entire sky.Considering that, we choose to compute pseudo-velocities for the entire sample in (, , ) and (, , ) coordinates.The pseudo velocities in km/s in 3D cartesian coordinates are given by, The tilde symbol above the velocity representation is to indicate that these are not true 3D velocities.We can perform coordinate transformation on these values to get pseudo-3D-cylindrical-velocities namely ( ṽ , ṽ  , ṽ ).All these pseudo velocities are in galactocentric coordinates by placing the sun at ( ,  ,  ) = (−8.2,0.0, +0.014) kpc (GRAVITY Collaboration et al. 2018).
The sky distribution in Galactic coordinates (, ) showing the mean velocities of stars in pixels of 360/300 × 180/100 is shown in Fig. 4. We see several members of streams that move in the same direction with a velocity that is sufficiently different from the nearby background.The top panel is binned for velocity distribution in the longitude direction, the middle panel is binned for velocity distribution in the latitude direction and the bottom panel is binned for velocity distribution in the pseudo azimuth direction.All these subfigures have candidate stars selected with a photometric distance greater than 7 kpc (distant halo main sequence stars).We are mainly concerned about the velocity distribution at (comparatively) higher distances because streams, substructures and over-densities that are in the solar neighbourhood will not typically appear as cold and coherent structures in the sky.We choose a distance greater than 7 kpc because it is the mean distance of the sample.It is also important to note that we plot all the distant halo main sequence stars in our sample including the turnoff stars for which the uncertainties in the photometric distances can be relatively high.
In the following subsections, we study the two most obviously visible structures in the sky to the faintest Gaia G magnitudes and characterise them independent of any prior information about these structures from the literature.These selection methods, in principle, can be used for any streams picked up using the RPM sample.

GD-1 stellar stream
One of the most easily visible structures in the northern hemisphere in Fig. 4 is the very retrograde GD-1 stream discovered using Sloan Digital Sky Survey by Grillmair & Dionatos (2006).In the past 1.5 decades, this stream has been studied in unprecedented detail (for some most recent studies, see Ibata et al. 2020;Balbinot et al. 2021;Banik et al. 2021;Dillamore et al. 2022;Doke & Hattori 2022;Shih et al. 2022).
To pick up candidate stars that could potentially belong to this stream, we draw an empirical polygon around the pixels that distinctively vary in mean velocity along the longitude direction from the nearby background halo in the sky.We randomly select the same number of candidate stars in all directions around the stream structure in the nearby halo as the control sample.The sky space and the rough polygon selection in and around the stream are presented in the top panel of Fig. 5, where the black polygon on the top left shows the GD-1 candidate members selected and the black polygon (after removing the stars in the grey polygon belonging to GD-1) on the top right shows the control sample of candidate members around the stream.When we plot the non-solar motion corrected Gaia proper motions in latitude and longitude direction of these on-stream and off-stream track members selected from the top panel, we clearly see an arc of proper motion peaks belonging to the GD-1 main sequence candidates to the right of the background halo proper motion peaks.The 2D proper motion plot and the selection of proper motion peaks belonging to GD-1 candidate stars are shown in the bottom panel of Fig. 5.We can clearly see the kinematic selection effect (high tangential velocity selection) of the RPM sample as the void of halo stars with small proper motions (upper right part) in this subplot which is also the reason why we refrain from using complex statistical selection methods like Gaussian mixture models for picking up the proper motion peaks of the stream here and throughout the rest of this paper.
In order to validate the candidate members selected using proper motion peaks, we fit a third degree polynomial to the proper motion in right ascension and declination with respect to the stream coordinate  1 using the stream coordinate system and conversion defined by Koposov et al. (2010).Throughout the rest of this work, we only fit polynomial functions for the removal of contaminants and validation to account for the lack (or minimal availability) of line-of-sight information in these faintest magnitudes and to avoid biases and errors that may arise from using different Galactic potential models.In all the fitting polynomials,  1 and  2 are in units of radians.Because the stream selection we perform does not pick up members from the lower end of  1 , we place a cut on the lower limit of  1 and remove obvious contaminants at the edge of the grid before we fit an empirical polynomial to these candidates.These polynomial functions match very well (with slight variations that we believe to be due to the incompleteness in lower  1 ranges) with the ones defined by Ibata et al. (2020).We select the candidates that lie within 2 standard deviation from the mean error () in proper motion.This is expressed by the following equation(s): where  RA and  Dec are in mas/yr.The selection polynomial and the method are illustrated in the top two panels of Fig. 6.All the stream star candidates that satisfy the conditions described by the above equations are chosen to be the confident members and plotted on the stream sky coordinates ( 1 ,  2 ).A third degree polynomial fit to the end stream track is described by the following equation: This equation also matches well with the polynomial stream track derived in Ibata et al. (2020).These confident members are shown in the middle panel of Fig. 6.To evaluate the performance of this independent stream candidate selection using the velocity space of the reduced proper motion selected halo sample, we cross-match the final members with publicly available existing spectroscopy surveys and find 64 stars with line-of-sight velocities and metallicities from SEGUE (Sloan Extension for Galactic Understanding and Exploration).The line-of-sight velocities for these stars from SEGUE versus stream coordinate  1 are shown in panel 4 of Fig. 6.The empirical stream track is fitted using the polynomial described by equation (1) from Ibata et al. (2020).We select the confident radial velocity members using the following condition: where  los is in km/s.Out of the 64 cross-matches, 58 (91%) of them check to be confident members under the condition described by the above equation.The ones that do not satisfy this condition still look coherent in the position-velocity space.Confident members and radial velocity members (plotted as velocity quivers) are shown as smoothed density plots in  1 versus  2 =  2 −  ( 1 ) in the bottom panel of Fig. 6.We are able to see the spur and diffuse blob feature at ( 1 ,  2 ) ∼ (−30, 1) and ( 1 ,  2 ) ∼ (−20, −0.5) that was discovered by Price-Whelan & Bonaca (2018) which were proposed be caused by dark matter substructures in the Milky Way (Bonaca et al. 2020).The spur and blob are underdense by ∼3 significance compared to the highest density stream track component.We also see three gaps and/or density variations across the stream at  1 ∼ −38, −20, −3 with ∼1, 2, and 3 significance.These gaps were also confirmed by de Boer et al. (2018) andde Boer et al. (2020).The member candidates away from the stream track are contaminated due to the edge selections in proper motion that can also be seen as crowding in top panels of Fig. 6 at higher values of  1 .More information in 6D space and chemistry is needed to confirm their membership, or refute it.
To further improve the analysis of the fainter counterparts of stream star candidates, we make the photometric distances more accurate by folding in the (known) metallicity distribution of the stream into distance calibration.Again, we refrain from using existing publicly available isochrones such as MIST (Choi et al. 2016) and PARSEC (Marigo et al. 2017) for this purpose, as our analysis shows the isochrones mismatch with each other in Gaia DR3 colours at these faint magnitudes on the main sequence and for 4 <  G,O < 8 and moreover deviate from local halo data with good parallax information, the latter of which is also suggested in Kim & Lépine (2021) (hereafter KL21)2 .This can be explained as the current stellar evolution tracks do not reproduce the cold low-mass stars very well.Instead, we choose to use the empirical photometric metallicity grids from Table 3 in KL21 as mock isochrones to improve our photometric distances for the stream star candidates.KL21 creates this photometric metalicity grid by selecting high tangential velocity local halo population out to ∼2 kpc using the RPM diagram, cross-matching them to existing spectroscopic surveys such as SDSS SEGUE I/II (Yanny et al. 2009), SDSS APOGEE DR16 (Ahumada et al. 2020), LAMOST DR6 (Cui et al. 2012), GALAH DR3 (Buder et al. 2021) and nearby main sequence from Hejazi et al. (2020) to get ∼20000 stars with metallicity information and building a photometric metallicity grid based on the distribution of the stars in the dataset.As this is a very similar selection to our sample, we deem these grids very justified for this work.One major difference between our sample and KL21 sample is that it is not cleaned for thick disc contaminants, but as these will have metallicities significantly higher than the mean metallicity of GD-1 stream ([Fe/H] ∼ −2.29 based on SEGUE crossmatch of candidates confirmed in this paper) this will not create any issues.
Using these grids, we create a two dimensional interpolation grid based on metallicity and Gaia colour to give the absolute magnitude in G-band and thus photometric distances corrected for the metallicity of the stream.We find that the distance uncertainties are not affected in their magnitude by this change.The upper and lower sigma confidence intervals given in Table 1 is therefore applicable to both sets of distances, provided that the shift between the general distance and metallicity-dependent distance measurements is taken into account.The final distribution of the distance is shown in the top right panel of Fig. 7.The distance re-calibrated with metallicity information and distance calculated for the entire RPM sample (without any prior information on metallicity) are shown as continuous and dashed line distribution respectively on the histogram.It is important to note that the distance histogram cannot be used at face value because of the incompleteness of the main sequence sample due to Gaia's incompleteness for sources with G < 19.We can clearly see from the CMD in the top middle panel of Fig. 7 that we do not probe the entire magnitude range in every colour bin (as illustrated by the CMD, where we see that the range of magnitudes probed in the faint end of the Gaia colours -around ( 0 −  RP,0 ) = 0.6is much smaller than at ( 0 −  RP,0 ) = 0.4) and therefore, we are clearly missing a lot of distant main sequence in the sample due to Gaia's limiting magnitude.However, even an incomplete catalogue of faint main sequence stars belonging to the GD-1 stream is still an interesting probe into the low surface brightness components of this stream.We also show the sub-giant or turnoff stars, for which we have less accurate distances, in grey in the top middle panel of Fig. 7.These are still likely members of GD-1 but unlikely to be true main sequence stars (especially as the turnoff for such metal-poor systems is rather blue).
The final catalogue of main sequence stars associated to GD-1 is plotted as smoothed density distribution in the bottom panel of Fig. 7.We are able to see a break in track (kink) at  1 ∼ −40 as proposed by de Boer et al. (2018) but the shape of this kink is opposite to the shape seen in their data while it matches the shape seen by the simulated stream from the same work.The two tracks are -0.12• and -0.06 • away from the main stream track.However, their 16th and 84th quantiles overlap significantly, indicating that more spectroscopically confirmed members would be needed to make any statistically conclusive statements on these.Webb & Bovy (2019) modelled the location of GD-1's progenitor using N-body simulations and data from Gaia DR2 and concluded that the stream's progenitor could be located between −30 <  1 < −45 and this could be responsible for the observed gap at  1 ∼ −40.Following this argument,  1 ∼ −40 could be a probable place for the GD-1 progenitor which we can see as a kink in our data.However, also here, more spectroscopic follow-up will be necessary to draw any firm conclusions.Some very distant (| 2 | > 1) radial velocity members and small density peaks in our sample (see bottom panel of Fig. 6 and 7) further illustrate the high complexity of this stream.

Jhelum stellar stream
Another stream that we see from the second panel of Fig. 4 is the Jhelum stellar stream.The Jhelum stream was discovered by Shipp et al. (2018) using the first three years of multi-band optical imaging data from the Dark Energy Survey (DES).Since then, the stream has been extensively studied mostly due to its complex morphology and several sub-components (for most recent studies see, Woudenberg & Koop et al., 2022, Li et al. 2022).
To pick up candidates belonging to Jhelum stellar stream, we follow the same procedure as before and draw a rough polygon around the velocity peaks in the Gaussian smoothed binned velocity mo-  ments in the sky for on-stream candidates as shown in the top panels of Fig. 8.We randomly pick up the same amount of stars from all directions outside the stream polygon and dub them off-stream.Their respective proper motion density in (l,b) coordinates is shown in the bottom panels of Fig. 8.We can clearly see an overdensity of proper motion around the four-sided grey polygon drawn in Fig. 8.We select stars from this proper motion peak and use them as Jhelum stellar stream faint candidates for the rest of this subsection.
To validate our selection, we fit an empirical polynomial, in a similar fashion with GD-1, on the proper motions in (RA, Dec) with respect to the stream coordinate  1 defined by Bonaca et al. (2019).We select all the candidates that satisfy the following two conditions based on the second degree polynomial fit on proper motion in (RA, Dec) with respect to  1 : where  1 is in radians and  RA and  Dec are in mas/yr.We allow here a 3 deviation instead of 2 owing to the complex multiple components in the stream that are debated to have similar (Bonaca et al. 2019) or slightly different (Shipp et al. 2019) proper motion distribution.In the final distribution of stream star candidates, we find three components of Jhelum -the narrow and broad components almost parallel to each other reported by Bonaca et al. (2019) and the tertiary spur component reported after the advent of Gaia EDR3 recently in Woudenberg & Koop et al., 2022.We fit the following three second degree polynomials to these sub-components by dividing the stream sky space into three parts based on their overdensities: with respect to the stream coordinate  1 , confident members on stream sky coordinates as KDE smoothed distribution (middle bottom) and the stream coordinates density distribution in the sky with the confident members of the three sub-components of Jhelum (bottom).The thin dashed lines in the plots are second degree polynomial fits on the corresponding parameters.The thick dashed line in the bottom panel refers to the equivalent fit by Bonaca et al. (2019) for the broad component (described as 0.9 • below the narrow component).The dashed lines in plots one and two are second degree polynomial fits on the corresponding parameters.The counts(*) for the Gaussian smoothed density plot is simply the number of stars per 0.4×0.1 deg 2 .
where ( 1 ,  2 ) are in radians.Equation ( 20) fitted for the narrow component of Jhelum is similar to what is fitted by Woudenberg & Koop et al., 2022.The polynomial fitting and confident member selection are illustrated in Fig. 9.In the last panel of this figure, we also illustrate the fact that the Jhelum broad component is not just ∼ 0.9 • shifted from the narrow component (thick dashed line proposed by Bonaca et al. 2019) but is better described by equation ( 21).These Jhelum stream confident members have no cross-matches with existing publicly available spectroscopic surveys mostly because all our candidates are extremely faint (G > 18.5) and in the southern sky.Therefore, to improve the distances sensitive to the stream's metallicity, we adopt the distribution based on the mean and standard deviation proposed by Li et al. (2022).We fit a mock isochrone using KL21 photometric metallicity grid at [Fe/H]=-1.83and find the mean distance to be 10.44 kpc which is a bit closer than what is observed in the literature.This can be attributed to the fact that we only probe the nearby main sequence of the Jhelum stream due to Gaia's limiting magnitude.This distance fitting is illustrated in Fig. 10.In the bottom panel of this figure, we plot the stream density of the main sequence stars.Radial velocity members from Ji et al. (2020) are overplotted and we see that these members are consistent with our choice of stream track.We also plot the metallicities derived for these candidates in the top panel as a histogram, but we can see that all of these stars are more metal poor than the observed mean metallicity.This is because the stars observed with high-resolution spectroscopy in Ji et al. (2020) were preferentially selected to be metal-poor compared to all possible Jhelum members.It is important to note that we see a density break in the middle of the narrow component around  1 ∼ 15 with one side of the stream going downwards and the other side upwards for almost 2 • .These two tracks causing a density break are 0.15 • and -0.38 • away from the narrow stream track and the break in density is ∼1 below the density of the narrow stream.This was seen and simulated as the kink feature in Woudenberg & Koop et al., 2022 caused due to interactions with Sagittarius but not resolved as clearly as we see here.As such, this showcases an advantage of probing fainter counterparts of such complex stellar streams.

Lower proper motion tail of Sagittarius
We pick up another clearly visible structure in Fig. 4, top panel, around  ∼ 150 − 200 • and  < 0 • .Interestingly, we do not recognise this candidate stellar stream as a stream known in the literature.It overlaps with the Sagittarius broad stellar stream Additionally, it seems to be close to a great circle in the sky with the stellar streams Palca (Shipp et al. 2018;Li et al. 2022) and Cetus (Chang et al. 2020;Yuan et al. 2022) and overlaps significantly in the sky with the second structure almost parallel to the Cetus stream discovered by Thomas & Battaglia (2022), but it does not match these known streams and overdensities in proper motion space.To investigate this further, we perform a similar selection method drawing a rough polygon around the velocity peaks that are sufficiently different from the background halo and a control sample around the polygon on-stream selection.The density of proper motion in (l,b) shows a clear extension on the on-stream track at ( ℓ ,   ) = (0, −5).This is illustrated in Fig. 11.It bears resemblance to the Sagittarius stream in certain observables, but extends to lower proper motions as well as lower distances (at ∼10 kpc) compared to Sagittarius samples defined in the literature so far.In the bottom panels of this figure, we overplot Sagitarrius RGB stream star candidates selected by Vasiliev et al. (2021) within the on-stream and off-stream polygon.Sagittarius overlaps with the structure we see in this region and the proper motions, though do not entirely match.Rather, it looks like a lower end extension in the proper motion space of Sagittarius that is being picked up only at this region of the sky by the RPM sample.We notice that we lose the actual peak of Sagittarius proper motions because of the RPM sample's inherent selection bias against halo stars with small proper motions.
Similar to our procedure for GD-1 and Jhelum, we fit a second degree polynomial to the proper motion in (l,b) directions with respect to the longitude.The stream stars show a clear trend with increasing longitude.We also fit a second degree polynomial to the (, ) sky space.The fitted polynomial functions are as follows: =  2 (ℓ) = 1.39ℓ 2 − 6.17ℓ + 2.29 ( 24)  where ℓ and  are in radians and  ℓ and   are in mas/yr.The polynomial fit on proper motion and on sky density distribution along with Gaia's scanning pattern around it is shown in Fig. 12.
We see two sub-components in this structure selection that are of higher density than the rest.These two high density regions clearly overlap with two arcs in the background caused by Gaia's scanning pattern.Gaia scans this region more frequently than the surrounding regions which means that it probes deeper in these regions.This might explain why we seem to have picked up the lower proper motion part of Sagittarius only in this apparently stream-like feature.We do not see the same signal for the rest of the Sagittarius stream, because they are too far and deep for the sample to reach while Gaia's scanning pattern makes this part of Sagittarius pop out.Ten stars have metallicities and line-of-sight velocities after cross match with SDSS/SEGUE.Although these stars have poor signal-to-noise, it is good enough to validate that the metallicities fall into the range of metallicities of Sagittarius selected from Vasiliev et al. (2021) in this region as it can be seen in the bottom panels of Fig. 14.These stars are plotted in the sky as purple stars with velocity quivers in the third panel of Fig. 12.
Finally, we attempt to improve the distances to the stars using the metallicity information from SEGUE cross-matches.We realise that a large majority of the stars in the feature we pick up are actually bluer than our preferred colour selection with ( 0 −  RP,0 ) < 0.45, which means that the distances are less reliable.It is possible for the turnoff stars higher on the turnoff (from the sub-giant side of the CMD) to bleed into our sample.We create a three dimensional interpolation for MIST isochrones (as they do work well in the sub-giant branch) using age, metallicity and Gaia colour as inputs.By feeding the mean metallicity of -1.25 (based on the 10 SEGUE cross-matches) and an age of 12.5 Gyr, we find that the two distance solutions possible either give a distance of ∼10 kpc, or ∼20 kpc, to the feature.The latter distance would correspond to the predicted Sagittarius stream stars in that region of the sky Vasiliev et al. (2021) and would explain why the feature does not show any redder stars.The distance and metallicity distribution and CMD for these stars are shown in the top panels of Fig. 13.In the bottom panel, we show the latitude versus line-of-sight velocities for the radial velocity members and fit a second degree polynomial ignoring the star at ∼200 km/s as contamination.This polynomial fit is described by the following equation: where  is in radians and  los is in km/s.We compare the observable properties of these member stars with Sagittarius stream candidates selected by Vasiliev et al. (2021).These comparison figures are shown in Fig. 14.The structure we see agrees with Sagittarius stream candidates in that region in almost all the observables, and extends it in proper motions.We conclude that this feature is indeed a part of Sagittarius and that it looks like a thin stream crossing Sagittarius due to a combination of Gaia's scanning law -providing more fainter The full stream star candidate list will be made available online stars in this feature -and the RPM sample selection effect on small proper motion.

Summary of stream properties
The distance properties and the number of candidate members selected are summarised for all three discussed candidate streams in Table 2.The columns that will be provided as a part of the stream member candidate catalogue are shown as a part of Table 1.

CONCLUSIONS AND OUTLOOK
With the advent of the European Space Agency's Gaia mission and its recent data releases EDR3 and DR3, the astronomy community has obtained the largest-ever cartograph of our Milky Way galaxy with unprecedented astrometric parameters.In this paper, we have presented a catalogue of ∼47 million halo stars on the main sequence with high tangential velocity selected only using Gaia DR3 proper motions and photometry.This is made possible using the reduced proper motion that when plotted versus Gaia colours mimics the colour-magnitude diagram for populations with different tangential velocities.Most of the literature methods to select halo stars make use of distance and/or spectroscopy information One much used example of this, is to create a 'Toomre' diagram of velocities, where a cut can be made to separate stars that move fast with respect to the solar motion (e.g., | −  LSR | > 210 km/s) to isolate halo stars (see for implementations of such a cut for instance Bonaca et al. 2017;Koppelman et al. 2018).The disadvantage of such a method is however that it is limited to stars with good parallaxes and line-ofsight velocities.Instead, we build in this work a catalogue of inner halo stars out to ∼21 kpc.This sample is five times bigger with less systematics and much improved completeness beyond 5 kpc (thanks to Gaia DR3) than the original K21 RPM sample produced using Gaia DR2 and showcases the science potential of such a sample with the upcoming data releases of Gaia (for eg., Gaia DR4).The distance makes up an important aspect of the 6D information often used in the study of the dynamical evolution of the Milky Way halo.Here, we calculate photometric distances to these stars with simple linear colour-magnitude relation for these stars on the main sequence.The typical uncertainty on these derived photometric distances is ∼7% which is more reliable and probes farther away than would be possible using Gaia parallaxes.
We explore the possibility of using the binned velocity space of this dataset at relatively higher distances to independently pick up candidate members and main sequence stars belonging to three example streams.The use of main-sequence stars, rather than brighter giants, allows us to trace low surface brightness counterparts.Two cold streams with complex morphology: GD-1 and Jhelum are picked up by the sample and explored in detail.
We fit polynomial functions for the observable properties of the stream members with respect to the main stream coordinate  1 .We see gaps and overdensities similar to what is observed in the literature and often enhanced to a greater contrast.For GD-1, we see a density gap at  1 ∼ 40 • which could be a probable position for the now fully disrupted GD-1's progenitor (Webb & Bovy 2019).However, this is not conclusive, because the distribution of the tracks causing the density breaks overlap quite a lot.More member stars and radial velocities for these members should be helpful for further analysis.We also see a similar density gap in Jhelum at  1 ∼ 15 • resolved very well due to improved Gaia astrometry and photometry.The several components of the stream was recently explored using N-body simulations as due to a tentative close interaction with Sagittarius (Woudenberg & Koop et al., 2022) while some of these density breaks and variations still remain to be explained.We improve the photometric distances calculated by the entire RPM sample by folding in the metallicity distribution information for these stream members using photometric metallicity grids computed by Kim & Lépine (2021).Our selection method is validated using line-of-sight and metallicity information from cross-matches with spectroscopic surveys (for GD-1) or candidate members from the literature (for Jhelum) and returns very high success rates as ∼ 90% of the overlapping stars turn out to be true members.
We also see part of Sagittarius in the binned velocity space in a negative latitude direction with similar properties, but slightly more negative proper motions compared to literature samples for the stream.This structure is characterized better by fitting distances on the sub-giant branch using MIST isochrones and metallicity information from SEGUE cross-matches.Based on the available information, the structure is likely to belong to the Sagittarius stream which is visible in this catalogue only in that part of the sky due to a combination of the kinematic selection bias of this sample, its incompleteness at higher distances due to the intrinsic faintness of main sequence stars, and Gaia's scanning pattern.Future Gaia data releases may be less affected by Gaia's scanning pattern and more reliable to hunt for coherent streams on such binned velocity space.However, for this catalogue, this detection illustrates that care needs to be taken in the interpretation of substructures.
Aiding future follow-up studies, we provide candidate member catalogue with our photometric distances, further refined by using the metallicity distribution of these streams as input.A more systematic exploration of nearby stellar streams by picking up proper motion peaks would further enhance the potential of this catalogue.As such, we can push the substructure searches to Gaia's magnitude limits.Using this sample in pseudo 6D space by setting the line-of-sight velocity to zero in regions of the sky where the pseudo space mimics the real Galactic velocities (ie., near Galactic centre, anticentre and poles) combined with added metallicity information can be used in the detailed chemodynamical analysis of the inner stellar halo.

Synergies with other datasets
Such a main sequence stars sample with the largest halo cartograph in the era of Gaia can be used to provide spectroscopic targets to the next big spectroscopic surveys such as WEAVE (Dalton et al. 2012), 4MOST (de Jong et al. 2019), and SDSS-V (Kollmeier et al. 2019) because this catalogue complements the Gaia DR3 source spectra at the fainter end.Even low-resolution spectroscopic follow-up that can provide us with the missing line-of-sight velocities and/or metallicities can be extremely useful to disentangle the merger history of the inner stellar halo.In a subsequent paper, we intend to explore the cross-match of this sample with photometric metallicity information from the highly metallicity Pristine narrow-band survey (Starkenburg et al. 2017) in the Northern Hemisphere (∼1 million sources in overlap).While still lacking line-of-sight information, such a cross-match does provide us with valuable information about the metallicity structure of the halo at different distances and in various directions (see e.g., Youakim et al. 2020).Additionally, the metallicity information from Pristine allows us to improve on the derived distances, as we have showcased in this work by folding in spectroscopically known metallicities for the streams, thereby avoidingdistance biases.This effort will be made possible with this sub-sample.The sample presented in this work already allows a deeper (and fainter) dive into studying the structure of the Milky way inner halo, while holding even greater promise for unravelling the complex formation history of our Galaxy with future Gaia data releases.It is a golden age to do Galactic Archaeology.

DATA AVAILABILITY
All data used in this study are publicly available.The PYTHON codes used to create this catalogue is available here: https: //github.com/astroakshara/RPM-Catalogue-Gaia-DR3and the full RPM halo catalogue along with the stream member candidates are available in this article and in its online supplementary material.

Figure 1 .
Figure1.The RPM Diagram for all the sources in Gaia DR3 5D sample that pass astrometry and photometry quality cuts (see text).The high density region consists of mostly disc stars (of ∼99% of the stars).The area inside the polygon enclosed by grey dashed lines represents the tentatively selected halo sources at these high tangential velocities vertically bounded by the main sequence colour range.A -1.6 [M/H], 12 Gyr age isochrone (where absolute magnitude is converted to reduced proper motion parameter) at these higher tangential velocities is over-plotted in red and blue to show that the selections we make correctly picks up halo main sequence stars

Figure 2 .
Figure2.Mollweide map of RPM selected halo stars in Galactic coordinates.Top panels: RPM selected halo using Gaia DR2; All sources (left), sources with heliocentric distances greater than 5 kpc (middle), sources with heliocentric distances greater than 10 kpc (right).Bottom panels: RPM selected halo using Gaia DR3; All sources (left), sources with heliocentric distances greater than 5 kpc (middle), sources with heliocentric distances greater than 10 kpc (right)

Figure 3 .
Figure 3. Quality and distribution of the photometric distances.Top panel: Comparison of photometric distance derived in this work to the Gaia good quality parallaxes as a function of Gaia colour.Purple thick dashed lines indicate ±10% difference between photometric distances in this work and Gaia parallaxes.The colour range between which the sample produces reliable distances is marked with the label 'Reliable' and the subsample outside this range is marked as 'Turnoff' and 'Faint-end' respectively.Middle panel: Percentage error in photometric distances as a function of Gaia colour.Purple thin dashed line indicate the typical photometric distance uncertainty (∼ 7%) within the reliable Gaia colour range.Bottom panel: Distribution of heliocentric photometric distances (left) and galactocentric photometric distances (right) for the entire sample as derived in this work.Note that the y-axis is the logarithmic density in each bin.

Figure 4 .
Figure 4. Galactic star map binned with the mean value of solar motion corrected longitudinal (top), latitudinal (middle), and pseudo azimuth velocities (bottom) in each pixel for the heliocentric distances above 7 kpc.Several coherent retrograde streams and substructures pop up in each of the three panels.

Figure 5 .
Figure 5. Binned pseudo azimuth velocity moments at d>6 kpc in the sky with the (rough) polygon selection of the GD-1 stream (top) and Gaia proper motions in (l,b) (bottom) within the polygon selection of the stream -on stream track (left) and outside the polygon selection of the stream -off stream track (right).

Figure 6 .
Figure 6.Proper motion in RA direction (top), Dec direction (middle top) with respect to the stream coordinate  1 , proper motion fitted confident members in GD-1 stream coordinates (middle), confident radial velocity members from SEGUE (middle bottom) and the stream track variations with respect to the coordinate  1 with radial velocity members and confident candidates plotted as longitudinal velocity vector quivers (bottom).The dashed lines in plots one through four are third degree polynomial fits on the corresponding parameters.The counts(*) for the Gaussian smoothed density plot denote the number of stars per 1.2 × 0.2 deg 2 and 1.2 × 0.08 deg 2 in panel 3 and 5 respectively.

Figure 7 .
Figure 7. Metallicity distribution of radial velocity members (top left), colour-magnitude Diagram density distribution of confident main sequence with a -2.29 [Fe/H], 8.06 kpc (value taken from Malhan et al. 2022) mock isochrone in purple and non-main sequence in grey (top middle), metallicity sensitive photometric distance distribution as a continuous line and photometric distance distribution as a dashed line (top right) and the stream track variations with respect to the coordinate  1 with radial velocity main sequence members (bottom).Note that this is a subset of the stars used for the bottom panel of Fig. 6 where also turnoff stars were included.The counts(*) for the Gaussian smoothed density plot is simply the number of stars per 0.8 × 0.1 deg 2 .Note a discontinuity/kink clearly visible at  1 ∼ −40 • .

Figure 8 .
Figure 8. Binned longitudinal velocity moments in the sky at d>8 kpc with the (rough) polygon selection of the Jhelum stream (top) and Gaia proper motions in (l,b) (bottom) within the polygon selection of the stream -on stream track (left) and outside the polygon selection of the stream -off stream track (right).

Figure 9 .
Figure 9. Proper motion in RA direction (top), Dec direction (middle top) with respect to the stream coordinate  1 , confident members on stream sky coordinates as KDE smoothed distribution (middle bottom) and the stream coordinates density distribution in the sky with the confident members of the three sub-components of Jhelum (bottom).The thin dashed lines in the plots are second degree polynomial fits on the corresponding parameters.The thick dashed line in the bottom panel refers to the equivalent fit by Bonaca et al. (2019) for the broad component (described as 0.9 • below the narrow component).The dashed lines in plots one and two are second degree polynomial fits on the corresponding parameters.The counts(*) for the Gaussian smoothed density plot is simply the number of stars per 0.4×0.1 deg 2 .

Figure 10 .
Figure 10.Assumed Gaussian distribution of metallicity based on Li et al. (2022) (top left), Density plot of the colour-magnitude Diagram with confident main sequence stars overplotted with a -1.83 in [Fe/H], 11.35 kpc in distance (value taken from Malhan et al. 2022) mock isochrone in purple and non-main sequence stars in grey (top middle), metallicity sensitive photometric distance distribution as a continuous line and photometric distance distribution as a dashed line (top right) and density distribution of main sequence stars in stream coordinates with radial velocity members from Ji et al. (2020) and narrow component stream track (bottom).The counts(*) for the Gaussian smoothed density plot is simply the number of stars per 0.4 × 0.1 deg 2 .Note a discontinuity clearly visible at  1 ∼ 15 •

Figure 11 .
Figure 11.Binned longitudinal velocity moments in the sky at d>8 kpc with the (rough) polygon selection of a part of Sagittarius picked up by the RPM sample (top) and Gaia proper motions in (l,b) (bottom) within the polygon selection of the stream -on stream track (left) and outside the polygon selection of the stream -off stream track (right).This is overplotted with stars in the onstream and off-stream polygon of the Sagittarius RGB sample from Vasiliev et al. (2021).

Figure 12 .
Figure 12.Proper motion in longitude direction (top), latitude direction (middle top) with respect to the Galactic longitude ℓ and density of candidate members on Galactic coordinates with radial velocity members as purple stars with longitudinal velocity quivers (middle bottom).Candidate members overplotted on the Gaia's scanning pattern in this region created using astrometric_n_obs_al parameter (bottom).Dashed lines in plots one to three are second degree polynomial fits on the corresponding parameters.The counts(*) for the Gaussian smoothed density plot is simply the number of stars per 0.9 × 0.64 deg 2 .

Figure 13 .
Figure 13.Metallicity distribution of radial velocity members (top left), colour-magnitude Diagram density distribution of confident main sequence with a -1.25 [Fe/H], 19.5 kpc (mean distance on the subgiant branch fit) MIST isochrone in purple and turnoff stars in grey (top middle), metallicity sensitive sub-giant photometric distance distribution as a continuous line, photometric distance distribution as a dashed line, and metallicity sensitive main-sequence turnoff photometric distance distribution as a dotted line (top right) and latitude versus line-of-sight velocity with a second degree polynomial fit on the track ignoring the one stream star contamination at ∼200 km/s (bottom).

Figure 14 .
Figure 14.Observable properties of part of Sagittarius picked up by the RPM halo sample compared to Vasiliev et al. (2021) RGB candidates belonging to the Sagittarius stream.

Table 1 .
The first nine candidates and two stream member candidates in the reduced proper motion selected Galactic halo catalogue using Gaia DR3 A full electronic version of the catalogue will be made available online as one full catalogue and each stream member as a separate catalogue.

Table 2 .
The properties of stream members selected using the binned velocity space from the reduced proper motion catalogue