Characterising the Gaia Radial Velocity sample selection function in its native photometry

The Gaia DR2 radial velocity sample (GDR2_RVS) of 7.2 million stars, the only Gaia DR2 sub-sample to provide six-dimensional phase-space information, has been of paramount importance to infer properties of the Milky Way. Yet no quantitative and accurate modelling of this GDR2_RVS sample is possible without knowledge and inclusion of a well-characterized selection function. Here we show how the GDR2_RVS selection function or 'completeness' depends on basic observables, foremost on the apparent magnitude G_RVS and color G-G_RP, but also on the surrounding source density and on sky position, where the completeness exhibits distinct small-scale structure. We derive the selection function through estimates of the internal completeness, i.e. the ratio of GDR2_RVS sources compared to all Gaia DR2 sources (GDR2_all). A simple but approximate way to cast the selection function is through the recommendation of high-completeness magnitude and colour ranges: 2.95<G_RVS<12.05 and 0.35<G-G_RP<1.25. For a more rigorous and detailed description, we provide a python function to query our selection function, as well as tools and ADQL queries that produce custom selection functions with additional quality cuts.


INTRODUCTION
Gaia DR2 contains median radial velocities and their uncertainties for 7 224 631 stars (Katz et al. 2019). To fully exploit this data set, the astronomical community requires the GDR2 RVS selection function. A selection function quantifies the probability of an object entering a sample (here, the RVS sample) as a function of its observables, such as magnitudes, colours and position on the sky. It is required for essentially all ensemble modelling of such data, because any model that predicts observables must first be multiplied by the selection function before a meaningful comparison to data is possible. The characterisation of the GDR2 RVS selection function has been hampered because no G RVS photometry, covering the far-red optical region of the RVS spectra, has been published, as the Gaia spectroscopic pipeline was not yet fully calibrated (Sartoretti et al. 2018). Approximating G RVS from G and G RP magnitudes allows us to build the selection function: S (G RVS , G − G RP , (α, δ)). G RVS provides the brightness range over which the RVS instrument (Cropper et al. 2018) could collect enough signal over the Gaia DR2 time span and where the detector would not saturate. The G − G RP colour is a proxy for the effective temperature of the star, which governs whether a reliable radial velocity can be determined from the measured spectral window. The sky position can enter the selection function via the source density, which varies dramatically across the sky, and through the Gaia scanning law (Boubert et al. 2020; Boubert & Everall 2020). Sky position also enters the selection function through a mix of both the RVS spectral window assignment on neighbouring RVS sources and (in the case of Gaia DR2) a pre-selection of RVS sources based on other input catalogues (Smart & Nicastro 2014).
E-mail: rybizki@mpia.de
Our paper is structured as follows: in Section 2 we show how G RVS can be derived. Since our empirical approach derives the internal completeness of the GDR2 RVS sample with respect to the GDR2 all sample, we show in Section 3 how this can be generalised to the external completeness, i.e. the selection function. Section 4 looks at the completeness with colour and magnitude. In Section 5 we examine the completeness over the sky. Section 6 highlights correlations of the GDR2 RVS selection function with other parameters. We then explain the generation and usage of our GDR2 RVS selection function in Section 7 and conclude with a summary in Section 8.
Figure 1. Magnitude distribution of GDR2 all (blue) and GDR2 RVS in different photometric bands (G in green and G RVS in red). The mode of the GDR2 RVS distribution is indicated as a function of G and G RVS magnitude at 12.7 and 11.9, respectively.

USING THE NATIVE PHOTOMETRY OF THE RVS INSTRUMENT
The Radial Velocity Spectrometer (RVS) is an integral field spectrograph (Cropper et al. 2018) that observes in the near-infrared at λ = [845, 872] nm, which is redder than the mean wavelength of the G band (λ = [330, 1050] nm) and also slightly redder than the mean wavelength of the G RP band (λ = [630, 1050] nm) (Evans et al. 2018). G RVS can be approximated from G and G RP using Equations 2 & 3 from Gaia Collaboration et al. (2018), which give G RVS − G RP as a fourth order polynomial in G − G RP . These equations 1 are valid within 0.1 < G − G RP < 1.7 and approximate the true G RVS to ≈ 0.1 mag precision 2 . The advantage of switching to this approximated G RVS value, which is closer to where RVS observes, is that it reduces the impact of colour variations on the selection function. This results in a sharper cut-off in the magnitude distribution, which can be inspected in Figure 1. For G RVS , 5.3M sources are brighter than the mode whereas for G this only holds for 4.5M sources. For Gaia DR2, the processing of radial velocities was limited to sources with G RVS < 12 mag, although there are many sources with fainter G RVS magnitudes. This is because there are three different ways to determine G RVS (Sartoretti et al. 2018; Katz et al. 2019), which were used in different parts of the processing and affect the completeness of the GDR2 RVS sample:
• The "on-board G RVS " is derived on-board the satellite, is used for various on-board automated decisions, and is also transmitted to the ground 3 .
• The "external G RVS " was calculated by Smart & Nicastro (2014) from a collection of ground-based photometric catalogues. These G RVS are part of the so-called Initial Gaia Source List (IGSL).
• The "internal G RVS " is the magnitude derived by the Gaia spectroscopic pipeline using the flux recorded in the RVS spectra 4 .
1 These equations can be inspected in the queries of Appendix A.
2 We add 0.05 mag at each limit to the colour range, i.e. 0.05 < G − G RP < 1.75, in order for all bins to have the same width, i.e. 0.1 mag.
3 For sources with on-board G RVS > 16.2 mag RVS windows are assigned, sources with on-board G RVS < 7 get 2D windows assigned (Sartoretti et al. 2018).
Ideally we would like to use the "internal G RVS " in this work but since it was not published we are instead forced to use the approximated G RVS which was fit to the "internal G RVS " after processing was finished (Bailer-Jones et al. 2018).
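The approximation described above can be sketched in a few lines. This is a minimal illustration, not the code used for the paper: the function name `approx_g_rvs` is hypothetical, and the polynomial coefficients are deliberately left as an input that has to be taken from Equations 2 & 3 of Gaia Collaboration et al. (2018). Since the source refers to two equations, a piecewise version would presumably call the function once per colour interval with the matching coefficients and range.

```python
def approx_g_rvs(g, g_rp, coeffs, colour_range=(0.1, 1.7)):
    """Approximate G_RVS from G and G_RP (hypothetical helper).

    coeffs are (c0, c1, ..., cn) of the polynomial
        G_RVS - G_RP = c0 + c1*x + ... + cn*x**n,  x = G - G_RP.
    They are NOT hard-coded here and must be supplied by the caller.
    Returns None when the colour is outside the validity range.
    """
    x = g - g_rp
    lo, hi = colour_range
    if not (lo < x < hi):
        return None
    # Horner scheme: start from the highest-order coefficient.
    offset = 0.0
    for c in reversed(coeffs):
        offset = offset * x + c
    return g_rp + offset


# Usage with made-up coefficients (illustration only, not the published ones).
demo_coeffs = (1.0, 2.0, 3.0)
print(approx_g_rvs(10.5, 10.0, demo_coeffs))
```

Returning `None` outside the colour range mirrors the restriction of the analysis to sources for which G RVS can be approximated.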
To select the stars to be processed for Gaia DR2, the consortium used the external G RVS , when available. For the 8% of sources where it was not, the on-board G RVS was used instead (Sartoretti et al. 2018).
The external G RVS has been derived from multiple catalogues with varying zero-points and a multitude of photometric bands, resulting in multiple transformation formulae. The sharp selection in external G RVS therefore translates into a shallower one in the approximated G RVS . Throughout the paper we refer to the approximated G RVS as G RVS unless we add one of the prefixes from the three bullet points above 5 . The tail of the G RVS magnitude distribution comprises mostly stars with an external G RVS brighter than 12 mag, but which are fainter in the approximated G RVS .
For the G RVS distribution in Figure 1 we note that for the magnitude range
2.95 < G RVS < 12.05 (1)
the GDR2 RVS sample follows the GDR2 all sample distribution quite well. Stars with G RVS < 2.95 mag usually do not enter the GDR2 RVS sample owing to saturation of the core of RVS spectra at approximately G ∼ 4 mag (Katz et al. 2019). We limit our investigation to the colour range for which G RVS can be approximated. From Table 1 we deduce that this colour limit loses only 0.2 % of GDR2 RVS sources, most of them owing to a missing G RP measurement.
In Figure 2 we compare the GDR2 all sources with GDR2 RVS sources where G < 12 mag in order to assess the GDR2 RVS sample completeness with colour. As we can see, only a minute fraction of GDR2 RVS sources reside outside the G RVS approximation range (depicted with grey dashed lines). For a central colour range with
0.35 < G − G RP < 1.25 (2)
(depicted with red dashed lines) we see that the GDR2 RVS sample is almost complete with respect to the GDR2 all sample. The reason for the decrease in completeness for very blue and red sources is that hot stars have strong Paschen lines affecting the determination of radial velocities from the Calcium triplet. Similarly, spectra of cool stars are dominated by TiO molecular bands, which inhibit the determination of the pseudo-continuum at the shortest wavelengths (Katz et al. 2019). Therefore sources that needed RV templates 6 with an effective temperature outside the range 3550 < T eff [K] < 6900 were excluded in Gaia DR2. This translates into a relatively sharp colour dependence, which is blurred particularly at the red end owing to dust reddening, as this extends the radial velocity determination to redder sources. This can be seen in Figure 3, where G − G RP versus effective temperature is shown for the GDR2 RVS sample, colour-coded by the extinction estimate from GDR2. Interestingly, the extinction estimates also increase for sources bluer than 0.35 mag and therefore hotter than 6900 K. Figure 3 also illustrates that our defined high-completeness colour range from Equation 2 comes from the effective temperature range for radial velocity detection.
only 28 templates (Sartoretti et al. 2018;Katz et al. 2019). We did not find strong signatures in the selection function resulting from these two different ways of determining the template spectra.

FROM INTERNAL COMPLETENESS TO SELECTION FUNCTION
While in this paper we mainly investigate the internal completeness of the GDR2 RVS sample with respect to the GDR2 all sample, and specifically the GDR2 all sample with a G RP measurement (because we require the G − G RP colour), the evidence indicates that we can use this internal completeness to approximate the external completeness (i.e. the selection function). First we need to look at the internal completeness of GDR2 all sources that have a G RP measurement with respect to GDR2 all sources that at least have a G measurement, for G < 15 mag. As we see in Figure 4, this sample is virtually complete over the sky except for small patches with lower completeness; these are mainly towards the Galactic north pole, plus a tiny patch with missing colour information.
Furthermore, we need to verify that the GDR2 all sample is externally complete for sources brighter than G = 15 mag. Rybizki & Drimmel (2018) did this via a crossmatch to 2MASS (Skrutskie et al. 2006) sources. Figures for this can be inspected in tutorial [2] 7 . While they find GDR2 all to be virtually complete between 8 < G < 15 mag, Gaia seems to be losing sources in the Galactic disc and bulge for G > 15 mag. Their external completeness assessment relies on the assumption that Gaia and 2MASS make independent measurements of the true sources within their respective G magnitude bins per HEALpix (they marginalise over colour). Despite these crude assumptions and large magnitude bins, their findings indicate that the GDR2 all sample is close to complete for sources with G < 15 mag. Similarly, we expect no bias from spurious sources in GDR2 all , because for G < 15 mag these make a negligible contribution, cf. Appendix C of Lindegren et al. (2018).
For the bright end, Boubert & Everall (2020) have shown that for sources with G > 3 mag the detection probability (and therefore completeness) is high.
We provide the internal completeness of GDR2 all sources with a G RP measurement as a function of G and HEALpix at level 6, and integrate it into our RVS internal completeness function so that the result should reflect the overall selection function.
In the following, we will inspect in more detail the colour-magnitude dependence of the internal completeness of the GDR2 RVS sample. The ranges indicated by Equations 1 and 2 will guide us where to expect nearly full completeness.
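As a simple illustration, the high-completeness ranges of Equations 1 and 2 can be applied as a boolean cut. The sketch below is not part of the published tools; the function and constant names are chosen here for illustration, and `g_rvs` is assumed to be the approximated G RVS.

```python
# Limits taken from Equations 1 and 2 of the text.
G_RVS_RANGE = (2.95, 12.05)
COLOUR_RANGE = (0.35, 1.25)


def in_high_completeness_range(g_rvs, g, g_rp):
    """Return True if a source lies inside both high-completeness ranges."""
    colour = g - g_rp
    mag_ok = G_RVS_RANGE[0] < g_rvs < G_RVS_RANGE[1]
    colour_ok = COLOUR_RANGE[0] < colour < COLOUR_RANGE[1]
    return mag_ok and colour_ok


# Usage: a source with G_RVS = 10.0 and G - G_RP = 0.8 passes the cut.
print(in_high_completeness_range(10.0, 11.0, 10.2))
```

Such a cut is only the approximate form of the selection function; the dependence on sky position and source density is not captured by it.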

DEPENDENCE ON COLOUR AND MAGNITUDE
In order to get a more differentiated view of the GDR2 RVS sample completeness, we show the internal completeness, i.e. the ratio of GDR2 RVS to GDR2 all source counts, for colour-magnitude bins in the left panel of Figure 5. The red dashed lines show the colour and magnitude ranges of high completeness as specified in Equations 1 & 2. For the GDR2 RVS sample we have 7.21M sources displayed (all sources within the G RVS approximation range), of which 75 % are within the red dashed lines. For the GDR2 all sample, 6.57M sources are within the red dashed lines, meaning that 83 % of the sources inside the red dashed lines have an RVS measurement. The colour bins that are adjacent to but outside of the red dashed lines still have fairly high completeness, but beyond these the completeness decreases quickly. The only notable exception is the faint red corner at G RVS ∼ 11 mag and G − G RP ∼ 1.4 mag, which we attribute to dust-reddened sources with a well-defined Calcium triplet despite their red colour. The drop-off for objects with G RVS > 12 mag is quite sharp for all colour bins. There seems to be a decrease in completeness at G RVS ∼ 7 mag. This was the on-board G RVS magnitude limit, where the Gaia pipeline switched from 2D to 1D window assignment (Sartoretti et al. 2018). Sources with G RVS > 7 mag are still relatively bright and produce spurious sources, which also get 1D windows allocated by the sky-mapper. This potentially produces window conflicts which, because truncated windows were not processed, lead to a lowered completeness in the range 7-9 mag in G RVS (cf. Sec. 3.2 of Sartoretti et al. 2018), in contrast to 2D windows, which are processed even when blended. Curiously, the decrease in completeness at around G RVS = 7 mag happens at a somewhat brighter G RVS in the red than in the blue. This strongly indicates a colour dependence of the on-board G RVS estimate. The lowered completeness vanishes for fainter sources, as these produce fewer spurious sources.
In crowded regions of the sky (especially where the number of visits is still low, i.e. towards the Galactic center and the anti-center 8 ) the effect of lowered completeness at the beginning of the 1D window assignment is worst, cf. left panel of Figure 7.
In the right panel of Figure 5 the source density of the GDR2 RVS sample per colour-magnitude bin is shown. Here we see that in absolute numbers there are still a significant number of GDR2 RVS sources that have G RVS > 12 mag or G−G RP > 1.25 mag.
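The internal completeness per colour-magnitude bin discussed in this section is a ratio of source counts. The following sketch shows one way to compute it from a list of sources; it is an illustration under stated assumptions rather than the procedure behind Figure 5. The bin widths (0.2 mag in G RVS, 0.1 mag in colour) are borrowed from Section 7, and the input format, a list of `(g_rvs, colour, has_rv)` tuples, is hypothetical.

```python
import math
from collections import Counter


def internal_completeness(sources, mag_width=0.2, col_width=0.1):
    """Ratio N_RVS / N_all per colour-magnitude bin.

    sources: iterable of (g_rvs, colour, has_rv) tuples, where has_rv
    flags membership in the radial velocity sample.
    Returns a dict mapping (mag_bin, colour_bin) -> completeness.
    """
    n_all = Counter()
    n_rvs = Counter()
    for g_rvs, colour, has_rv in sources:
        key = (math.floor(g_rvs / mag_width), math.floor(colour / col_width))
        n_all[key] += 1
        if has_rv:
            n_rvs[key] += 1
    # Bins without any source are simply absent from the result.
    return {key: n_rvs[key] / n_all[key] for key in n_all}


# Usage with four made-up sources.
demo = [
    (10.05, 0.55, True),
    (10.10, 0.52, True),
    (10.15, 0.58, False),
    (12.50, 0.55, False),
]
print(internal_completeness(demo))
```

Values that fall exactly on a bin edge can land in either neighbouring bin because of floating-point rounding, which a production version would need to handle explicitly.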

DEPENDENCE ON SOURCE DENSITY AND SKY POSITION
Here we investigate the dependence of the selection function with respect to the position on the sky S ((α, δ)). As we will see in Section 5.1, this has imprints of the global source density, as well as the close RVS pairs which compete in the spectral window allocation. Furthermore the dependence on input catalogues is still visible in Gaia DR2 as we will see in Section 5.2. In the generation of our selection function, the sky position enters through HEALpix bins while the source density is only accounted for indirectly by lowering the HEALpix level until enough sources are in a respective selection function bin. See Section 7 for details.

Global
To first order, the completeness is driven by the projected stellar density (we neglect the sources from the second telescope in our work, but discuss its contribution in Appendix B), which results in truncated RVS windows that have not been processed in Gaia DR2 (Katz et al. 2019). In addition, the processing limit of the RVS instrument is around 36k sources per degree² (Cropper et al. 2018), though in the GDR2 RVS sample the highest density at HEALpix level 6 is 2681 sources per degree². We therefore split our sample into high- (> 100k sources per degree²), intermediate- and low- (< 10k sources per degree²) density regions on the sky (HEALpix level 5). See Figure 6 for their distribution. In Figure 7 the resulting CMDs for the high-, intermediate- and low-density samples are shown from left to right. As expected, the overall completeness increases with decreasing density, from 73 % through 87 % to 91 %, respectively. For the high-density sample (left panel) the drop-off in completeness for sources with G RVS < 7 mag is more pronounced and the faint blue corner has particularly low completeness, owing to the correlation between high-density and dust-reddened regions. For the intermediate-density sample (middle panel) only the faint blue corner has a somewhat lowered completeness. For the low-density sample (right panel) the faint red end has lowered completeness, which is owing to the correlation between low-density and unreddened sky areas.
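The density split can be reproduced from source counts per HEALpix pixel, since all pixels at a given level have the same area. The sketch below is illustrative only: the function names are hypothetical, the thresholds are the ones quoted above, and the counts are assumed to be total source counts per level-5 pixel.

```python
import math

# Area of the full sky in square degrees (about 41253).
FULL_SKY_DEG2 = 4.0 * math.pi * (180.0 / math.pi) ** 2


def healpix_pixel_area_deg2(level):
    """Area of one HEALPix pixel; a level has 12 * 4**level equal-area pixels."""
    return FULL_SKY_DEG2 / (12 * 4 ** level)


def density_class(n_sources_in_pixel, level=5, low=10_000, high=100_000):
    """Classify a pixel as 'high', 'intermediate' or 'low' density."""
    density = n_sources_in_pixel / healpix_pixel_area_deg2(level)
    if density > high:
        return "high"
    if density < low:
        return "low"
    return "intermediate"


# Usage: a level-5 pixel covers roughly 3.36 square degrees.
print(round(healpix_pixel_area_deg2(5), 2))
print(density_class(400_000), density_class(100_000), density_class(20_000))
```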

Close pairs
There are fundamental limitations of the Gaia satellite for the detection of nearby sources (in sky separation), the so-called 'contrast sensitivity', which is a function of projected distance and magnitude difference (de Bruijne et al. 2015). For GDR2 all sources with a G magnitude and a sky position measurement this has been characterised by Brandeker & Cataldi (2019); the limit is at approximately 0.4 arcsec for equal-brightness sources. When requiring a colour measurement this distance increases to approximately 2.0 arcsec (Fig. 9 of Arenou et al. 2018) 9 , because of truncated windows, which have not been included in Gaia DR2 photometry (Riello et al. 2018).
For the GDR2 RVS sample the situation is somewhat different owing to its extremely elongated windows 10 and relatively small source densities. The latter will change with future data releases, when going from a G RVS limit of 12 mag in GDR2 to 14 mag in GDR3, and perhaps as faint as 16 mag in GDR4. The maximum source densities per degree² will increase from 2.7k in GDR2 to (a theoretical maximum of) 50k in GDR3 or 300k in GDR4 (though the instrument limit is reached at 36k). Sources in denser areas are lost because deblending was not yet activated and truncated windows that were not rectangular were not processed (Sartoretti et al. 2018). This does not apply to 2D windows, which were assigned to sources with G RVS < 7 mag.
9 Except for closer equal-brightness binaries, which still entered the catalogue.
10 RVS spectral windows are 10 across-scan pixels times ∼1.3k along-scan pixels wide, which corresponds to 1.77 arcsec times ∼75 arcsec (Prusti 2012). This also means that a single RVS spectrum effectively takes up 135.4 arcseconds squared on the sky. If tightly packed this would allow for approximately 100k sources per degree², which is a factor of 3 higher than the instrument limit. In the GDR2 RVS sample the highest density per degree² is 2681. If isotropically distributed, the mean distance of neighbouring sources in the highest density HEALpix would be approximately 55 arcsec.
In Figure 8 11 we show log density plots of the sky separation versus G RVS difference for GDR2 RVS sources relative to GDR2 RVS sources, for high density (left panel) and low density (right panel) sky areas as defined by Figure 6. Owing to the low-number statistics and relatively low G RVS range of the sample the contrast sensitivity is not well sampled. In high density areas chance alignments, which increase with distance squared, dominate the close pairs. For low density areas these play less of a role and true binaries in very close vicinity (over-density up to 5 arcsec) are more abundant. The minimum separation of two GDR2 RVS sources 12 is 5 across-scan pixels (Sartoretti et al. 2018), corresponding to 0.85 arcsec (Prusti 2012). The very close true binaries (1-2 arcsec) are less well sampled in the high density regions, indicative of source loss because of window truncation.
11 The query for the close pair data can be inspected in Appendix A, where we also provide a link to the data, which includes all pairs between GDR2 RVS sources and GDR2 all sources.
We do not attempt to include the effect of close pairs in our selection function; still, one can be approximated from Figure 8 if needed for the treatment of binaries.

Sky position
It is important to recall for the following subsections that the external G RVS , on which the magnitude limit of 12 was applied for most of the sources, comes from the IGSL3 (Smart & Nicastro 2014). This used several catalogues and transformation formulae to generate a G RVS estimate, which we define as the external G RVS in this paper. IGSL3 has 15M sources with G RVS < 12 mag. The G RVS determination was prioritised in the order of the following input catalogue list (percentage of IGSL sources with external G RVS < 12 mag in brackets): SDSS (Strauss et al. 2002) (4), TYCHO2 (Høg et al. 2000) (16), GSC23 (Lasker et al. 2008) (80), with negligible fractions from other input catalogues.
In Figure 9 we can see that the priorities in choosing from different catalogues result in structure that we will recognise in the GDR2 RVS completeness function. The TYCHO set on the left has the highest priority and appears well-behaved. The SDSS footprint in the middle panel does not cover the whole sky. Areas and magnitude ranges that are left out are filled with the GSC23 data from the right-hand panel.
This results in the overall IGSL footprint seen in the left panel of Figure 10. For comparison we show the GDR2 RVS sample density on the right. Differences are mainly a result of overlapping windows in dense areas (and therefore lost sources in the GDR2 RVS sample on the right); in addition, for 8 % of GDR2 RVS sources the on-board G RVS has been used instead of the external G RVS . Patterns from the GSC23 and SDSS footprints can nevertheless be recognised in the GDR2 RVS sample.
Figure 8. Density plots (colour scale is in log and not the same for both panels) of the distance versus G RVS difference for GDR2 RVS sources relative to other GDR2 RVS sources. The left panel shows high density areas with more than 100k sources per degree². For the right panel these close pairs are shown for low density areas on the sky (less than 10k sources per degree²). The numbers of pairs for these subsets (not all are necessarily depicted) are 86k and 10k.
Figure 9. Source densities from different catalogues (Aitoff projection as in Figure 6 at HEALpix level 6). From left to right: TYCHO, SDSS and GSC23 (Guide Star Catalog, see text) footprint in the IGSL (Gaia Input Catalog, see text) for sources with G RVS < 12 mag. The colour scale is fixed to cover the range of all three panels.

Magnitude dependence of spatial completeness
In Figure 11 we show the completeness over the sky with G RVS magnitude when only using sources within the high-completeness colour range of Equation 2. A video version can be downloaded from here 13 .
13 https://keeper.mpdl.mpg.de/f/db4ca9ee9bc34513bed1/
Figure 10. Source densities for IGSL3 with G RVS < 12 mag on the left and the GDR2 RVS sample on the right (Aitoff projection, same as in Figure 6, at HEALpix level 6). The colour-bar range is fixed to the GDR2 RVS sample, which has limits of 17 and 2681 sources per degree²; the colour bar saturates for the IGSL3 catalogue.
At G RVS = 11 mag (left panel) the completeness shows little spatial structure, with only slightly lower completeness towards the Galactic plane. What can already be recognised is the SDSS footprint (stripes of lower completeness perpendicular to the Galactic plane), which can also be seen in the left hand panel of Figure 10, showing the source density of the IGSL catalogue with external G RVS < 12 mag. Since the SDSS transformation was preferentially used for the sources in these stripes (cf. Figure 9, middle panel), this likely means that the SDSS transformation results in a fainter external G RVS magnitude than the GSC23 transformation. This is in line with Figure 16 of Evans et al. (2018), where the G magnitude estimate from SDSS data varies over the sky and seems to be especially different in parts of the stripes.
At G RVS = 12 mag (middle panel of Figure 11), small patches of lower completeness start to emerge. These areas are point-like and distributed over the entire sky. It could be that these spots are correlated with brighter zero points in the respective GSC23 plates.
At G RVS = 12.5 mag there are still a few areas of high completeness visible. The circular glass-canopy pattern that can be seen is a result of the derivation of most of the external G RVS magnitudes from the GSC23 Bj and Rf magnitudes (Lasker et al. 2008). The magnitude zero-points of the plates seem to be a little fainter at the edges (possibly where they overlap each other, private communication David Katz). The canopy pattern delineates the borders of the plates. It is also clearly visible in both panels of Figure 10. There also seem to be plates for which the total area has a fainter zero-point.

Colour dependence of spatial completeness
In Figure 12 we show the completeness over sky with G−G RP when only using sources within the high-completeness magnitude range of Equation 1. A video version can be downloaded from here 14 .
In the left hand panel of Figure 12, where the completeness for the colour bin at G − G RP = 0.4 mag is shown, we can see notably lower completeness wherever dust reddening is in place. This is owing to hot stars, which are not treated by the RVS pipeline, being reddened to colours that would usually (in the absence of dust reddening) result in a radial velocity measurement; this can be seen from the good completeness out of the plane. A second-order effect can be witnessed in lower completeness areas that can also be seen, but are even less pronounced, in the left hand panel of Figure 11. An example is the stripe from the Galactic south pole left up to approximately l = 90° and b = 0°. This feature and similarly lower completeness areas in the top right, the Galactic anti-center, etc. are because of the scanning law and the necessity of at least 2 transits for a radial velocity measurement 15 , cf. The middle panel of Figure 12 is similar to the left panel of Figure 11, with the SDSS stripes of lower completeness and a lower completeness in the Bulge because of blends; interestingly, however, the Galactic plane overall suffers less from incompleteness. When going redder in the right panel, the completeness towards the plane improves the most. The reason for this counterintuitive effect is that sources in the good temperature regime (3550 < T eff [K] < 6900) for Ca triplet radial velocity determination only fall into this colour bin (i.e. G − G RP = 1.3 mag) when dust-reddened, and therefore in the Galactic plane.
14 https://keeper.mpdl.mpg.de/f/edbe4cc51d544b738ab5/

CORRELATION WITH OTHER QUANTITIES
We want to assess how well the GDR2 RVS sample represents the underlying GDR2 all stellar properties in the CMD. If stellar properties were similar, one could apply simple completeness correction by up-sampling the GDR2 RVS sources to the numbers of GDR2 all sources in their respective CMD bin.
In Figure 13 we colour-code the fractional difference of the mean parallax, ruwe 16 and phot_bp_rp_excess_factor 17 (from left to right) between the GDR2 RVS sample and the GDR2 all sample on the CMD. Comparing to the left hand panel of Figure 5 we see that lower values of internal completeness generally coincide with differences between the GDR2 RVS sample and the GDR2 all sample for the shown parameters, while the patterns generally differ between the parameters. For the parallaxes it seems that sources with G − G RP < 0.35 that are in the GDR2 RVS sample have generally higher parallaxes compared to the GDR2 all sample, whereas for sources with G − G RP > 1.05 the opposite applies. The reason for this is that blue sources which are dust-reddened are usually further away whilst simultaneously being too hot for a radial velocity determination (cf. the left hand panel of Figure 12, where for blue sources a low completeness in the Galactic plane areas can be seen).
Therefore, blue sources in GDR2 RVS are generally closer to us than the blue sources in GDR2 all . Vice versa, the red sources (1.05 < G − G RP < 1.35) in the GDR2 RVS sample are dust-reddened bluer stars for which an RVS measurement is possible (cf. right panel of Figure 12). Those are usually further away than unreddened sources of the same colour, for which molecular lines in the spectrum prohibit the RVS measurement. This of course also means that the proper motions are different between those two samples, potentially biasing kinematic selections.
For the ruwe and the phot_bp_rp_excess_factor it seems that generally the GDR2 RVS sample has better quality sources than the overall GDR2 all sample. There is one exception to this rule at the red end for G RVS ≈ 11 mag where quality indicators are better than for the GDR2 all sample. However, this only applies to a tiny fraction of the GDR2 RVS sources.
15 The GDR2 RVS sample has a hard lower limit of 2 transits, a median of 7 and a maximum of 201.
16 The Renormalised Unit Weight Error (RUWE) is expected to be around 1.0 for sources where the single-star model provides a good fit to the astrometric observations. A value significantly greater than 1.0 (e.g. > 1.4) could indicate that the source is non-single or otherwise problematic for the astrometric solution.
17 The BP/RP excess factor is the sum of the integrated BP and RP fluxes divided by the flux in the G band (BP and RP are dispersed and therefore more prone to light from nearby sources than G). This excess is believed to be caused by background and contamination issues affecting the BP and RP data. Therefore a large value of this factor for a given source indicates systematic errors in the BP and RP photometry.
Figure 11. Internal completeness of the GDR2 RVS sample over the entire sky (as a Mollweide projection as in Fig. 6 at HEALpix level 5) at different G RVS magnitudes, with yellow being 100 % complete and dark blue being 0 % complete.
Therefore we recommend not using bins of the completeness function with completeness below some threshold, e.g. 90 %. If a larger sample is needed because of a weak signal, we caution that the GDR2 RVS sample properties might not be representative of the GDR2 all sample.
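The simple completeness correction mentioned at the start of this section, combined with the recommended threshold, could look as follows. This is only a sketch: the function name is hypothetical, and using the inverse completeness as an up-sampling weight is an assumption made here to illustrate the idea. It is meaningful only where the RVS sample is representative of the underlying population.

```python
def completeness_weight(completeness, threshold=0.9):
    """Up-sampling weight for sources in one selection-function bin.

    Returns None for bins below the threshold, signalling that the
    bin should be excluded rather than corrected.
    """
    if completeness < threshold:
        return None
    return 1.0 / completeness


# Usage: a bin that is 95 % complete gets a weight slightly above 1,
# while a bin that is 50 % complete is dropped.
print(completeness_weight(0.95), completeness_weight(0.5))
```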

RVS selection in GeDR3mock with parallax bias
In order to assess the impact of the parallax bias on the spatial distribution of the GDR2 RVS sample we applied the completeness function [HEALpix level 5, 0.1 G RVS bins, 0.1 G − G RP bins] to the GeDR3mock catalogue (Rybizki et al. 2020). We randomly chose 1000 subsets in each bin and took the subset that had the mean parallax closest to the mean of the respective (a) GDR2 RVS sample and (b) GDR2 all sample. We also drew (c) a truly random subset, which we do not consider here, but publish as well.
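The subset selection described above, drawing many random subsets per bin and keeping the one whose mean parallax is closest to a target value, can be sketched as follows. The function name and arguments are hypothetical, and how the subset size follows from the completeness in a bin is not specified here; it is simply passed in.

```python
import random
from statistics import fmean


def closest_mean_subset(parallaxes, subset_size, target_mean,
                        n_trials=1000, seed=0):
    """Pick, out of n_trials random subsets, the one whose mean
    parallax is closest to target_mean."""
    rng = random.Random(seed)  # fixed seed for reproducibility
    best_subset, best_diff = None, float("inf")
    for _ in range(n_trials):
        subset = rng.sample(parallaxes, subset_size)
        diff = abs(fmean(subset) - target_mean)
        if diff < best_diff:
            best_subset, best_diff = subset, diff
    return best_subset


# Usage with made-up parallaxes (in mas) for the mock stars of one bin.
mock_parallaxes = [0.1 * i for i in range(1, 101)]
chosen = closest_mean_subset(mock_parallaxes, 10, target_mean=2.0)
print(len(chosen), round(fmean(chosen), 3))
```

Because only a finite number of random subsets is tried, the selected subset matches the target mean but not necessarily the full parallax distribution.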
In Figure 14 we display the spatial distribution of sample (a) as black solid lines and of sample (b) as blue dashed lines, both as log density contours. The left panel shows the Galactic XZ projection for the blue sources (0.15 mag < G − G RP < 0.35 mag), where we see that in the Galactic plane the GDR2 RVS sample (a) does not probe as deep as the GDR2 all sample (b), with almost 0.5 kpc difference at the outer density contour. Similarly, the right panel shows the Galactic XY projection for the red sources (1.05 mag < G − G RP < 1.35 mag), where we see that the GDR2 RVS sample probes approximately 1.5 kpc further than the GDR2 all sample on the far side of the Galaxy at the outer density contour.
The difference is less pronounced than it probably is with the real Gaia data because we only chose from random subsets of  sources from GeDR3mock; this choice does not perfectly represent the GDR2 all parallax distribution. Nevertheless, it should be beneficial to investigate those subsets and see how other selection effects can bias a sample, e.g. cuts on fractional parallax uncertainty.

ACCESSING GDR2 RVS COMPLETENESS FUNCTION
An example query illustrating how to download the data necessary to produce the GDR2 RVS selection function is given in Appendix A.
User-specific quality cuts should be included in the query.
The internal completeness function created for easy usage as a python function has been generated in the following way:
• fixed bin size of 0.2 mag in G RVS , covering a range from 2.9 to 14.1 mag.
• fixed bin size of 0.1 mag in G − G RP , covering 0.05 to 1.75 mag.
• HEALpix level 6 baseline, but with degradation down to level 0 until at least 5 GDR2 all sources are available in that bin. If the level 0 HEALpix has fewer than 5 sources, we use the whole sky, also including bins with fewer than 5 sources.
• In each level 6 HEALpix the numbers of GDR2 RVS and GDR2 all sources are saved according to the above scheme, as well as the respective HEALpix level from which the numbers were taken.
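The binning and the HEALpix degradation listed above can be sketched as follows. This is an illustrative sketch rather than the packaged implementation: all names are hypothetical, the star counts of one CMD bin are assumed to be stored in dictionaries keyed by nested level 6 HEALpix index, and the whole-sky fallback is flagged with level -1, following the caption of Figure 15.

```python
import math

# Grid definition taken from the list above.
G_RVS_START, G_RVS_STEP, N_MAG_BINS = 2.9, 0.2, 56
COLOUR_START, COLOUR_STEP, N_COLOUR_BINS = 0.05, 0.1, 17
BASE_LEVEL = 6
MIN_ALL_SOURCES = 5


def bin_index(value, start, step, n_bins):
    """Index of the fixed-width bin containing value, or None if outside."""
    idx = math.floor((value - start) / step)
    return idx if 0 <= idx < n_bins else None


def counts_at_level(counts_level6, pix6, level):
    """Sum the level 6 counts of all pixels sharing the parent of pix6 at
    the given coarser level (nested scheme: 2 bits per level)."""
    shift = 2 * (BASE_LEVEL - level)
    parent = pix6 >> shift
    return sum(n for pix, n in counts_level6.items() if pix >> shift == parent)


def smoothed_counts(rvs_level6, all_level6, pix6):
    """Degrade from level 6 towards level 0 until at least 5 GDR2_all
    sources are found; otherwise fall back to the whole sky (level -1)."""
    for level in range(BASE_LEVEL, -1, -1):
        n_all = counts_at_level(all_level6, pix6, level)
        if n_all >= MIN_ALL_SOURCES:
            return counts_at_level(rvs_level6, pix6, level), n_all, level
    return sum(rvs_level6.values()), sum(all_level6.values()), -1
```

For example, `bin_index(10.84, G_RVS_START, G_RVS_STEP, N_MAG_BINS)` places the magnitude used in Figure 15 in bin 39 of the 56 magnitude bins.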
The completeness function, as well as the upper and lower 1σ percentiles, were generated according to the following scheme:
• Whenever a bin had no GDR2 all or no GDR2 RVS entries, the completeness function was set to zero, as were the upper and lower percentiles.
• In all other cases the number of sources in the GDR2 RVS bin was divided by the number of sources in the GDR2 all bin, which yields the completeness function.
• For the upper and lower percentiles we assumed a Poisson distribution with an expected value equal to the number of sources in the GDR2 RVS bin. We took the values at the 16th and 84th percentiles of this distribution and divided them by the number of sources in the GDR2 all bin. We forced the upper value to be less than or equal to the number of GDR2 all sources.
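As a worked illustration of this scheme, the following sketch computes the completeness and its bounds for a single bin. The function names are hypothetical, and the percentile is taken here as the smallest count whose cumulative Poisson probability reaches the requested level, which is one possible reading of the description above.

```python
import math


def poisson_percentile(q, mu):
    """Smallest integer k with P(X <= k) >= q for X ~ Poisson(mu), mu > 0.
    Terms are evaluated in log space to stay stable for large mu."""
    k, cdf = 0, 0.0
    while True:
        cdf += math.exp(k * math.log(mu) - mu - math.lgamma(k + 1))
        if cdf >= q:
            return k
        k += 1


def completeness_with_bounds(n_rvs, n_all):
    """Return (completeness, lower, upper) for one bin."""
    if n_rvs == 0 or n_all == 0:
        return 0.0, 0.0, 0.0
    completeness = n_rvs / n_all
    lower = poisson_percentile(0.16, n_rvs) / n_all
    # The upper count is capped at the number of GDR2_all sources.
    upper = min(poisson_percentile(0.84, n_rvs), n_all) / n_all
    return completeness, lower, upper


def fractional_uncertainty(completeness, lower, upper):
    """((upper - lower) / 2) / completeness; None where completeness is 0."""
    if completeness == 0:
        return None
    return ((upper - lower) / 2) / completeness
```

For a bin with 4 GDR2 RVS sources out of 5 GDR2 all sources, the Poisson percentiles of the count are 2 and 6, so the sketch gives a completeness of 0.8 with bounds 0.4 and 1.0 (the upper value being capped) and a fractional uncertainty of 0.375.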
As a result, the completeness function is smoothed on the sky wherever source densities were too low. An illustration of this can be seen in the left panel of Figure 15, where the whole-sky selection function is depicted for G RVS = 10.84 mag and G − G RP = 1.2 mag. In the middle panel we show the corresponding fractional uncertainty, calculated as ((upper-lower)/2)/completeness. The right panel shows the respective HEALpix level over which the star counts in the CMD bin have been averaged. The GDR2 RVS completeness function can be queried with (l, b, mag, col) and returns the above quantities together with the number of GDR2 all and GDR2 RVS sources in that respective bin (adding all of those yields total star counts), as well as the sky-area-smoothed values (which are actually used to calculate the selection function). The output has a resolution of 49152 HEALpixes, 56 magnitude bins and 17 colour bins and can be accessed via the gdr2_completeness package (Rybizki & Drimmel 2018). Tutorial 5 illustrates its usage and shows some visualisations.
We also provide the colour transformations (G, G RP ) → G RVS as well as the internal completeness of the GDR2 all sample with G RP measurement vs the GDR2 all sample. Interactive web visualisations of the completeness maps over the CMD and the sky are available here 18

SUMMARY
The GDR2 RVS sample is an important subset of the Gaia DR2 catalogue. We have characterized the internal completeness of the GDR2 RVS sample with respect to the GDR2 all sample in three dimensions: sky position, G RVS magnitude and G − G RP colour. In the magnitude range 3 < G RVS [mag] < 14, evidence shows (Rybizki & Drimmel 2018) that this should be close to the external completeness (i.e. the selection function). We show that the internal completeness is well characterised in the native G RVS band, which can be approximated from G RP and G − G RP . Imprints in the GDR2 RVS sample internal completeness come from the IGSL3 input catalogues (mainly SDSS and GSC23) and their estimated external G RVS , on which the magnitude limit of 12th magnitude has been applied. Completeness is lowered in high-density areas, mainly owing to blended RVS spectral windows resulting in non-rectangular windows, which have not been processed in GDR2. Another important effect is the spectral template range that could be used to determine radial velocity measurements, which covered effective temperatures between 3550 and 6900 K. Together with dust reddening in the Galactic plane, this leads to the counter-intuitive effect that for red sources the completeness is best in the Galactic plane. A second-order effect coming from the scanning law and the requirement of at least 2 transits with radial velocity measurement is also visible.
We show that the GDR2 RVS sample can have partially different properties over the CMD than the corresponding GDR2 all sample, e.g. in parallax but also in proper motion, which might bias kinematic selections. We similarly show that GDR2 RVS sources usually have better quality indicators such as ruwe and phot_bp_rp_excess_noise compared to the GDR2 all sample which prohibits easy completeness corrections. We apply the completeness function to a mock stellar catalogue, GeDR3mock, and explore the impact of the parallax difference on the spatial extent of colour subsets of the GDR2 RVS sample.
We provide a completeness function in python that delivers the internal completeness as a function of HEALpix, magnitude and colour, together with quality and uncertainty indicators. It also offers the functionality to generate the external completeness, i.e. the selection function, by taking into account gaps in the Gaia DR2 G RP photometry. The necessary data are included but can also be generated for individual use cases by adapting our example ADQL query from Appendix A.
In Gaia DR3 (anticipated, at the time of writing, to be published at the end of 2021), a much improved RVS sample will be provided. It will be much deeper, with a magnitude limit of internal G RVS = 14 mag, removing the dependency on external catalogues. The treatment of blended spectra will also be included, and the effective temperature range for which radial velocities are determined might increase. This will greatly simplify future selection function determinations and should allow for more sophisticated modelling, e.g. including the nearby source contamination (as projected onto the sky) and taking into account the different scanning angles.

Figure 15. Example selection function (all-sky output without rp to g completeness correction) for G RVS = 10.84 mag and G − G RP = 1.2 mag. From left to right: the completeness, the fractional uncertainty ((upper-lower)/(2*completeness)) and the HEALpix level over which the star counts were averaged to calculate the completeness. The grey areas in the middle panel come from a division by zero where the completeness function is zero. For some values of G RVS and G − G RP with very few sources, the averaging goes over the whole sky, corresponding to a value of -1 (not the case here).

This query uses Common Table Expressions, which will be part of the upcoming ADQL 2.1 standard and which, at the time of writing, are available among the Gaia-carrying VO data centers only on GAVO's TAP service 19 .

WITH with_rvs AS ( --This is a subquery on which we will perform further queries below
  SELECT radial_velocity,
    source_id/140737488355328 AS hpx, phot_g_mean_mag-phot_rp_mean_mag AS grp, --Creates HEALpix of level 6 and abbreviates the G-GRP colour
    phot_rp_mean_mag + 0.042319
      - 0.65124 * (phot_g_mean_mag - phot_rp_mean_mag)
      + 1.0215 * POWER(phot_g_mean_mag - phot_rp_mean_mag,2)
      - 1.3947 * POWER(phot_g_mean_mag - phot_rp_mean_mag,3)
      + 0.53768 * POWER(phot_g_mean_mag - phot_rp_mean_mag,4) AS phot_rvs --GRVS approximation
  FROM gaia.dr2light --This is the GAVO Gaia DR2

The result of this query (which was subdivided into different HEALpix in order to be able to retrieve all sources) can be downloaded from here 20 in a 5, 10 and 20 version. It has also been cleaned of double entries, and a G RVS magnitude has been calculated for the second source.
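The two derived columns of the query can be reproduced outside ADQL with a short sketch. The polynomial coefficients and the divisor 140737488355328 (which equals 2^35 · 4^6) are copied from the query above; the function names are hypothetical, and the colour range over which the polynomial is valid is not stated in this section, so none is enforced here.

```python
def approx_g_rvs(g_mag, rp_mag):
    """G_RVS approximated from G and G_RP with the polynomial of the query."""
    c = g_mag - rp_mag  # G - G_RP colour
    return (rp_mag + 0.042319 - 0.65124 * c + 1.0215 * c**2
            - 1.3947 * c**3 + 0.53768 * c**4)


def healpix_from_source_id(source_id, level=6):
    """Nested HEALpix index encoded in a Gaia source_id.
    The level 12 index is source_id // 2**35; each coarser level drops
    two more bits, so level 6 divides by 2**35 * 4**6 = 140737488355328."""
    return source_id // (2**35 * 4 ** (12 - level))
```

For a source with G = 11.0 mag and G RP = 10.0 mag the polynomial evaluates to G RVS ≈ 9.556 mag.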

APPENDIX B: INCREASED DENSITIES FROM SECOND FIELD OF VIEW
Since the two telescopes share the focal plane, the sources competing for CCD window allocation are more than just the sources at a single position in the sky. We neglect this effect in our work because the viewing angle of 106.5 degrees is sufficiently large and because, owing to the scanning law, the position of the second telescope always changes for a specific position of the first telescope. For these reasons, and because most of the sky consists of low-density areas, we assume that only a small fraction of transits will have both fields in crowded regions.
To make an approximate but quantitative assessment of the increase in source density per transit coming from the second telescope, we use the public Gaia scanning law 21 (Gaia Collaboration et al. 2016), without accounting for the breaks (e.g. due to lost telemetry or the telescope going into safe mode), and try to answer the question: for each field of view of telescope 1 (FoV1) on the sky, how many more sources come from FoV2 on average? We take every 5th data point in the scanning law file (which has a time resolution of 10 seconds), such that the individual pointings are 50 arcmin apart (50 seconds), which corresponds approximately to the distance between two neighbouring HEALpix at level 6. Level 6 HEALpix also have roughly the size of the field of view (FoV) of one telescope. Instead of an exact solution, each pointing is moved to the nearest HEALpix of level 6. We assume that per transit in each FoV Gaia observes the same number of sources, which is simply taken from the HEALpix level 6 source count of the GDR2 RVS sample (as displayed in Figure 10).

Figure B1. Mollweide projection, at HEALpix level 6, of the mean per-transit fractional increase in source density for the GDR2 RVS sample owing to the second telescope, in Galactic coordinates. The Galactic Center is in the middle, with longitude increasing to the left.

20 https://keeper.mpdl.mpg.de/d/f2b841c75d7a42f6aad9/
21 https://www.cosmos.esa.int/web/gaia/scanning-law-pointings
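The estimate described above can be sketched as follows. This is a minimal illustration with hypothetical names and toy inputs: the scanning law is assumed to have already been converted into a list of (FoV1 pixel, FoV2 pixel) pairs at HEALpix level 6, and the source counts are assumed to be stored in a dictionary keyed by pixel index.

```python
from collections import defaultdict


def mean_fractional_increase(pointings, counts, step=5):
    """Mean per-transit fractional increase in source density per pixel.

    pointings: list of (hpx_fov1, hpx_fov2) level 6 pixel pairs, one per
               scanning-law time step (assumed pre-computed).
    counts:    dict mapping level 6 pixel index to its source count.
    step:      keep every step-th pointing (every 5th in the text).
    """
    ratio_sum = defaultdict(float)
    n_transits = defaultdict(int)
    for hpx1, hpx2 in pointings[::step]:
        n1 = counts.get(hpx1, 0)
        if n1 == 0:
            continue  # no sources in FoV1, the ratio is undefined
        ratio_sum[hpx1] += counts.get(hpx2, 0) / n1
        n_transits[hpx1] += 1
    return {hpx: ratio_sum[hpx] / n_transits[hpx] for hpx in ratio_sum}
```

Because the count in FoV1 is fixed for a given pixel, averaging the per-transit ratios is equivalent to dividing the average FoV2 count by the FoV1 count.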
As can be seen in Figure B1, the resulting mean per-transit fractional increase for the GDR2 RVS sample (average counts in FoV2 divided by counts in FoV1) is very low in high-density regions, cf. right panel of Figure 10. Therefore neglecting the sources from the second telescope in high-density regions seems to be a valid assumption. For the GDR2 all sample the situation is similar and will even improve in future data releases with longer observational baselines. The plot shown here and others can be recreated using tutorial 6 in Rybizki & Drimmel (2018).

This paper has been typeset from a TeX/LaTeX file prepared by the author.