DESI Complete Calibration of the Color-Redshift Relation (DC3R2): Results from early DESI data

We present initial results from the Dark Energy Spectroscopic Instrument (DESI) Complete Calibration of the Color-Redshift Relation (DC3R2) secondary target survey. Our analysis uses 230k galaxies that overlap with KiDS-VIKING $ugriZYJHK_s$ photometry to calibrate the color-redshift relation and to inform photometric redshift (photo-z) inference methods of future weak lensing surveys. Together with Emission Line Galaxies (ELGs), Luminous Red Galaxies (LRGs), and the Bright Galaxy Survey (BGS) that provide samples of complementary color, the DC3R2 targets help DESI to span 56% of the color space visible to Euclid and LSST with high confidence spectroscopic redshifts. The effects of spectroscopic completeness and quality are explored, as well as systematic uncertainties introduced with the use of common Self Organizing Maps trained on different photometry than the analysis sample. We further examine the dependence of redshift on magnitude at fixed color, important for the use of bright galaxy spectra to calibrate redshifts in a fainter photometric galaxy sample. We find that noise in the KiDS-VIKING photometry introduces a dominant, apparent magnitude dependence of redshift at fixed color, which indicates a need for carefully chosen deep drilling fields, and survey simulation to model this effect for future weak lensing surveys.


INTRODUCTION
Modern cosmology relies on our ability to observe galaxies as tracers of the structure formation and expansion of the Universe. To do this we must map their on-sky positions and, crucially, their positions in all three dimensions. Measuring the redshift, $z$, for galaxies outside of our Local Group is a good proxy for this third dimension, because the expansion of the Universe reddens the light from a galaxy in a way that is monotonic with distance. The most accurate way to measure a galaxy's redshift is via the detection of prominent emission or absorption features with sufficient signal-to-noise ratio in the spectral energy distribution (SED). The observed wavelengths of these features are then compared to their known rest-frame wavelengths to provide a redshift measurement. Spectroscopic surveys obtain these many-wavelength observations very successfully via slit masks on the telescope (e.g. on Keck, Oke et al. 1995), and more recently with integral field units (IFUs; e.g. the Hobby Eberly Telescope Dark Energy Experiment, Gebhardt et al. 2021) and massively multiplexed instruments with independent optical fibers capable of taking many spectra at once (e.g. DESI, Flaugher & Bebek 2014; DESI Collaboration et al. 2022). The majority of galaxies in our Universe are relatively faint and distant, and these faint galaxies are more feasibly observed photometrically (in optical, near-infrared, and infrared filters) rather than spectroscopically, due to limits in exposure time. Imaging surveys estimate the distances to galaxies from their colors, measured as the ratio of photometric fluxes in different bandpass filters. Although photometric redshifts are more readily attainable, these estimates are less accurate than spectroscopic redshifts.
★ E-mail: jmccull@stanford.edu
Cosmological measurements from wide imaging surveys, like weak gravitational lensing or galaxy clustering, rely on geometric information. The most recent imaging surveys, such as the Dark Energy Survey (DES, Sevilla-Noarbe et al. 2021), Subaru Hyper Suprime-Cam (HSC, Aihara et al. 2022) and the Kilo-Degree Survey (KiDS, Kuijken et al. 2019), span a significant fraction of the sky. Accurate estimates of the redshifts of these galaxy samples based on limited information (e.g. photometry rather than spectroscopy) are required to obtain unbiased cosmological constraints. Indeed, one of the foremost difficulties facing imaging surveys for cosmology lies in calibrating the redshift probability distribution (Myles et al. 2021). Typically, redshift distributions are estimated and calibrated for an ensemble of galaxies, $n(z)$, and the uncertainty is modelled as an error on the mean redshift of the distribution in the cosmological analysis, as cosmological parameters are most sensitive to shifts in the mean redshift (see e.g. Amon et al. 2022; Li et al. 2023a; Dalal et al. 2023; van den Busch et al. 2022). Calibrating the redshift distribution for an entire ensemble has unique difficulties compared to doing the same for individual galaxies, though individual redshifts have a multitude of other science applications. As ensemble redshift distributions are of the foremost interest to weak lensing cosmology, ensemble calibration is the focus of this paper.
With observations of fluxes in only a few broad bands, the underlying challenge for determining an accurate redshift is a degeneracy between a galaxy's spectral phenotype and redshift (see Newman & Gruen 2022 for a review). A variety of approaches have historically been used to determine galaxy redshifts. Quiescent elliptical galaxies have a consistent drop in light emission at 4000 Å. For this so-called red sequence of galaxies, the break brackets the redshift between different bandpasses and enables accurate photo-zs, which is particularly useful for identifying cluster members and lensing galaxy samples (e.g. the redMaPPer algorithm in Rykoff et al. 2014, 2016). While early-type, passive galaxies benefit from having very similar SEDs, this is not necessarily true for many other galaxy types. Template fitting methods can rely on either spectroscopically informed templates or semi-analytic models that are typically constructed with stellar populations (e.g. Brammer et al. 2008). An empirical variation of this is a Principal Component Analysis (PCA), but in essence both methods fit the data to a linear combination of templates (see Salvato et al. 2019 for a review). Regardless of the method for redshift estimation, redshift-type degeneracy is an irreducible problem when working with photometry (see discussion of methods that break age-mass-redshift degeneracy in e.g. Wang et al. 2023).
A correct model for the galaxy population (i.e. the mix of templates and their luminosity functions at each redshift), or a large, fully representative reference sample of galaxies with known spectroscopic redshift, would be required to determine the correct $n(z)$ for photometric samples despite redshift-type degeneracies. Both of these solutions at present appear unfeasible, though gains in forward modeling the distribution have been made (e.g. Alsing et al. 2023). The issue can be greatly reduced by observing in additional bands that break degeneracies, which motivates a filter set that goes beyond the standard optical broad bands (Buchs et al. 2019; Wright et al. 2019). With this increased wavelength coverage, however, the color space that the photometric observations occupy becomes high-dimensional. The challenge is to associate spectroscopic galaxies with like photometric galaxies efficiently across that high-dimensional space.
Self Organizing Maps (SOMs, Kohonen 2004) can serve as a useful tool to subdivide the high-dimensional color space into a set of SOM cells efficiently, tracing density and coherent galaxy types, as demonstrated in Masters et al. (2015) and utilized in Hildebrandt et al. (2020) and Myles et al. (2021), among others. The use of SOMs for redshift calibration demands spectroscopic galaxy samples that completely populate this space, thereby providing accurate galaxy redshifts for any combination of colors. Of critical note is that spectroscopic redshifts are typically obtained only for a specific and limited selection of galaxies and must be weighted to become representative of the photometric sample (Hartley et al. 2020). The resulting calibration problem is that, without a complete spectroscopic sample that fully populates the photometric color space, derived redshift distributions are subject to bias and uncertainty from incomplete or under-sampled spectroscopic observations. Some 5,000 spectroscopic redshifts have recently been determined by the Complete Calibration of the Colour-Redshift Relation (C3R2) project in order to populate this color space and thereby calibrate the color-redshift relation (Masters et al. 2017, 2019; Stanford et al. 2021). Beyond fully populating each SOM cell, a larger multiplicity of spectroscopic galaxies per SOM cell will be needed to meet requirements for future deep imaging surveys like Euclid (Amendola et al. 2013) and Rubin Observatory (Ivezić et al. 2019; Collaboration et al. 2009). This can be achieved via statistical characterization of broad or bimodal redshift distributions in SOM cells, though a well constructed SOM minimizes these features where possible. Additionally, as upcoming imaging surveys are deeper than most spectroscopic samples, it is essential that the magnitude dependence of the redshift at a fixed color, $dz/dm$, is understood. Previous examinations of this measurement have shown a small $dz/dm$ trend at fixed color (Masters et al. 2019).
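To make the role of SOM cells concrete, the following is a minimal sketch of how a galaxy could be assigned to a cell once a map has been trained: each cell carries a weight vector in color space, and the galaxy goes to the cell with the nearest weight vector. The toy map, the two-color space, and the function name `best_matching_cell` are illustrative assumptions, not the maps or code used in this work.

```python
import math

def best_matching_cell(colors, som_weights):
    """Return the index of the SOM cell whose weight vector is closest
    (in Euclidean distance) to a galaxy's color vector."""
    best_index, best_dist = -1, math.inf
    for index, weights in enumerate(som_weights):
        dist = math.dist(colors, weights)
        if dist < best_dist:
            best_index, best_dist = index, dist
    return best_index

# Hypothetical toy map: three cells in a two-color space.
toy_som = [(0.2, 0.1), (0.9, 0.5), (1.6, 1.2)]
toy_galaxies = [(0.25, 0.05), (1.5, 1.1), (0.8, 0.6)]
cells = [best_matching_cell(g, toy_som) for g in toy_galaxies]
```

Spectroscopic and photometric galaxies that land in the same cell can then be compared directly, which is the basis of the calibration discussed in the rest of the paper.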
In this paper, we present the DESI Complete Calibration of the Color-Redshift Relation (DC3R2) secondary target survey. This survey supplements the existing spectroscopic samples used for photometric redshift calibration by both populating SOM cells that were previously unfilled, and by increasing the multiplicity per cell. We use DC3R2 to revisit the magnitude dependence of the redshift at a fixed color with improved statistics, including apparently brighter galaxies, and study trends in $dz/dm$ as a function of color. Sec. 2 introduces the survey data used. In Sec. 3 we describe the construction of the DC3R2 sample. We explore how it calibrates the color-redshift relationship alongside DESI main survey targets in Sec. 4. Finally, using this new resource, we examine the magnitude dependence of redshift at fixed color in the presence of observational effects like photometric scatter (Sec. 5.2).

DATA
DC3R2 is a secondary target program on the Dark Energy Spectroscopic Instrument (DESI; Sec. 2.1) that obtained spectroscopic redshifts for galaxies that were targeted with KiDS-VIKING (KV) photometry (Sec. 2.2). We also analyze DESI main survey targets that fall within the KiDS-VIKING footprint, but were targeted on photometry from the DESI Legacy Imaging Surveys (Sec. 2.3), including the Dark Energy Camera Legacy Survey (DECaLS, Dey et al. 2019). Additional spectroscopic and imaging surveys were used to validate this analysis, as listed in Sec. 2.4. For reference, all magnitudes in this paper are absolutely calibrated in the AB system (see Oke & Gunn 1983).

DESI
The Dark Energy Spectroscopic Instrument (DESI) is a ground-based spectroscopic experiment installed at the 4m Mayall telescope (Collaboration et al. 2016a; DESI Collaboration et al. 2022). The DESI instrument is sensitive from 360-980 nm, with 5,000 robotically actuated fibers that are capable of taking spectra simultaneously. Over five years, it aims to measure spectra of 40 million galaxies and quasars that will aid in the examination of baryon acoustic oscillations (BAO), the growth of structure through redshift-space distortions, and dark energy (Collaboration et al. 2016b). While these are DESI's primary goals, the survey is uniquely capable of providing a multitude of spectroscopic redshifts that have far-reaching uses. Here we exploit its ability to calibrate photometric redshifts, in line with the needs of weak gravitational lensing experiments.
The DESI main survey targets relevant to this paper are divided into three galaxy types: Luminous Red Galaxies (LRGs; Zhou et al. 2023), Emission Line Galaxies (ELGs; Raichoor et al. 2023), and the Bright Galaxy Survey (BGS; Hahn et al. 2023). Notably, LRGs are selected on a bright $z$-band fiber magnitude alongside $grz$ and NIR photometry from WISE W1 to target red galaxies at $0.4 < z < 1.0$ with high signal-to-noise (Zhou et al. 2023). ELGs, on the other hand, are selected on a $g$-band magnitude alongside $grz$ color cuts to target star forming galaxies at $0.6 < z < 1.6$ (Raichoor et al. 2023). Lastly, BGS targets an $r$-band magnitude limited sample (split into a bright and a faint subset) that is significantly brighter than the other main surveys, at $z \lesssim 0.6$, and can be observed during suboptimal conditions (Hahn et al. 2023). Each of these main survey samples targets a complementary part of the color-redshift space, though their union does not span the full census of galaxy populations DESI is capable of measuring. For precise photometric cuts for each sample, we direct the reader to the respective target selection papers, which use the photometry of the DESI Legacy Imaging Surveys (Dey et al. 2019). For a brief overview of these selections, see Table 1.
The methodology for fitting models to the obtained spectra is explored in Bailey et al. (2023), and the validation of these techniques is performed in Lan et al. (2023). The observations that produced data for this analysis come from December 14, 2020 through July 9, 2021, which span a combination of the 'One-Percent Survey' (OPS), Survey Validation (SV), and the beginning of the main survey operations (Y1), corresponding to the internal Fuji and Guadalupe data releases, respectively (DESI Collaboration 2023). SV data has already been made available in the DESI Early Data Release (EDR) (DESI Collaboration et al. 2023). The main target selections were subject to minor changes between SV1 and the OPS as well as Y1 operations, detailed in the respective paper for each sample. The selection footprint for DC3R2 was modified after the OPS and before Y1 to make use of newly released photometry. Additionally, on May 12th and 13th, 2021 a small dedicated tile program was run for DC3R2 targets with high priority. The selection for DC3R2 targets is outlined in greater detail in Section 3.1, and the optimization for dedicated tile fibers is discussed in Appendix C.

KiDS-VIKING
The Kilo Degree Survey (KiDS) is a large scale optical $ugri$ imaging survey with OmegaCAM on the VLT Survey Telescope (VST) at the ESO Paranal Observatory (Arnaboldi et al. 2000; Kuijken et al. 2015). Its footprint is overlapped by a near-infrared $ZYJHK_s$ VIRCAM photometric survey with the 4m Visible and Infrared Survey Telescope for Astronomy (VISTA), the VISTA Kilo-degree Infrared Galaxy Public Survey (VIKING). The two surveys both span more than a thousand square degrees in the $ugriZYJHK_s$ bands, to a depth of  ≤ 25. Their complementary wavelength coverage has been processed jointly to create the 9-band KiDS-VIKING (KV) survey (Wright et al. 2019). This data set provides dereddened, multi-band color information for an overlapping patch of the DESI SV and Y1 footprint, crucially allowing us to associate DESI spectroscopic redshifts with a high-dimensional color space.
In this analysis, specifically, we make use of the KiDS-450 data release, with observations spanning the Galaxy And Mass Assembly (GAMA) fields (Driver et al. 2011), depicted as G09, G12, and G15 and the shaded green regions in Fig. 1 (de Jong et al. 2017). Through the beginning of main survey operations, we also used the KiDS-1000 release, shown in the blue footprint of Fig. 1, which jointly with KiDS-450 expands the overall footprint and provides a super-set of optical and near-infrared photometry (Wright et al. 2019; Kannawadi et al. 2019; Hildebrandt et al. 2020). The only selections we apply to this calibrated photometry are the provided photometric quality flags (e.g. FLAG_GAAP_u), which conservatively mask out (for all nine bands) pixels affected by diffraction spikes, over-saturation, neighboring bright stars, or other effects in the final coadded images. We require all bands to perform our analysis, so any galaxy with flagged poor photometry in a single band is eliminated from our targeting.
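The all-bands requirement can be sketched as a simple row filter. This is an illustration under stated assumptions: only `FLAG_GAAP_u` is named in the text, so the other column names are assumed to follow the same pattern, and a flag value of 0 is assumed to mean unflagged photometry.

```python
# Assumed band suffixes; only FLAG_GAAP_u is named explicitly in the text.
BANDS = ["u", "g", "r", "i", "Z", "Y", "J", "H", "Ks"]

def has_clean_photometry(row, bands=BANDS):
    """True only if every band's quality flag is zero (assumed to mean
    unflagged). A missing flag column is treated as flagged."""
    return all(row.get(f"FLAG_GAAP_{band}", 1) == 0 for band in bands)

# Hypothetical catalog rows represented as dictionaries.
clean_row = {f"FLAG_GAAP_{b}": 0 for b in BANDS}
bad_row = dict(clean_row, FLAG_GAAP_J=2)   # flagged in a single band
kept = [row for row in (clean_row, bad_row) if has_clean_photometry(row)]
```

A galaxy flagged in any one band is dropped, matching the conservative requirement described above.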

DESI Legacy Imaging Surveys
In order to target the main survey samples for DESI, $grz$ optical photometry was collected across three instruments to produce the DESI Legacy Imaging Surveys (Dey et al. 2019): the Mayall $z$-band Legacy Survey (MzLS), the Beijing-Arizona Sky Survey (BASS, Zou et al. 2017), and DECaLS (see Collaboration et al. 2016 for an overview), while MzLS and BASS supplied the same filters at complementary high declination (declination > +34°). The photometric catalog spans approximately 14,000 deg$^2$ in the northern hemisphere, relevant for DESI and this analysis. Additionally, the optical reach of these surveys was augmented by the Wide-field Infrared Survey Explorer (WISE) bandpasses (W1, W2, W3, W4; 3.4-22 μm) (Wright et al. 2010; Cutri et al. 2013). More extensive information on the depths and calibration of the Legacy Surveys can be found on the data release page.

Other Spectroscopic Surveys
A multitude of spectroscopic surveys have been undertaken in the COSMOS field. Several of these were utilized in this analysis for validation. Among these surveys are the original C3R2 effort (Masters et al. 2017, 2019; Stanford et al. 2021) and the master spectroscopic catalog from the COSMOS collaboration (M. Salvato, in prep). The latter includes observations from a variety of wavelength regimes and spectral resolutions across many instruments (VLT VIMOS, VUDS, Keck MOSFIRE, DEIMOS, Magellan IMACS, Subaru FMOS, and many others; Lilly et al. 2007; Le Fèvre et al. 2015; Casey et al. 2017; Hasinger et al. 2018; Kriek et al. 2015; Kartaltepe et al. 2010; Silverman et al. 2015; Trump et al. 2007; Balogh et al. 2014). For use in this project, these samples were limited to only confident redshifts. Furthermore, they were matched to the Masters et al. (2017) photometry and assigned to the original C3R2 SOM in order to validate our color-redshift relation.

METHODS AND OBSERVATIONS
From December 2020 through July 2021, DESI observed 328k main survey targets (ELGs, LRGs, BGS) in the KV footprint, 51,177 of which were also selected as DC3R2 targets, along with 1,216 that were exclusively DC3R2 targets. The following sections further break down these numbers into the DC3R2 secondary targets during SV (3.1), Y1 (3.1.2), and our dedicated tiles (3.1.1), and selected main survey targets that overlap. After completeness cuts we have a sample of 230.7k galaxies that occupy the color space from $0.0 < z < 1.55$. The aim of this program is to fill as much of the high-dimensional color-magnitude space for future weak lensing surveys as possible, with high multiplicity and as much depth as is feasible when limited to secondary targets. The sample reported here spans 56% of the population anticipated for Rubin and Euclid analyses, effectively most of the galaxies anticipated at $z < 1.55$. Future work can expand to higher redshift populations on NIR and IR instruments. Crucially, DESI is well suited to extending this analysis and examining deeper samples in this lower redshift regime with dedicated programs.
In the following sections, we describe our procedure for target selection (3.1) through the three major phases of the DC3R2 program: the dedicated tiles (3.1.1), and the SV and Y1 spare fibers (3.1.2). Additionally we detail the observations (3.2), redshift completeness (3.2.2), and the weighting scheme used to provide a representative sample of redshifts for calibration purposes (3.2.3).

Target Selection
The primary DC3R2 targets span the GAMA-9h, 12h, and 15h equatorial fields (Driver et al. 2011) for a total of 300 sq. degrees, where sufficiently deep $ugriZYJHK_s$ color information was available. We have matched the GAMA fields as reported by KiDS-VIKING (KV) DR3 (Wright et al. 2019) to the Dark Energy Camera Legacy Survey (DECaLS; Dey et al. 2019) Data Release 9 photometry in order to constrain color alongside DESI z-band fiber fluxes, using the closest match within 1". The fiber fluxes from DECaLS were used to ensure our program could be completed at low exposure cost. In order to calibrate the complete color-redshift relation, DC3R2 aimed to occupy, with multiplicity, as much of the color space described by the Masters et al. (2017) Self Organizing Map (SOM) as is reasonably attainable with DESI. This SOM is a useful map trained on narrow band COSMOS photometry that spans the approximate depth and breadth of color for future weak lensing surveys like Euclid and LSST. Appropriately, it is the subject of several redshift search programs that also aim to span this color space with spectroscopy (Saglia et al. 2022; Masters et al. 2015), which further encourages our use of it. This SOM is then corrected to KiDS-VIKING colors according to the procedure described in Section 4.1, to allow for assignment of our alternate photometric bands. The abundances of galaxies observed across the color space for these KiDS-450 fields (G09, G12, G15) can be found in Fig. 2a. In the neighboring panel, Fig. 2b depicts the distribution of galaxies observed from this catalog with successfully measured redshifts.
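A closest-match-within-1" cross-match of the kind described above can be sketched as follows. This is a brute-force illustration with hypothetical coordinates, not the matching code used for the catalog; a production match over large catalogs would normally use a spatial index (for example a k-d tree) rather than the double loop implied here.

```python
import math

def separation_arcsec(ra1, dec1, ra2, dec2):
    """Angular separation in arcseconds via the haversine formula.
    All inputs are in degrees."""
    ra1, dec1, ra2, dec2 = map(math.radians, (ra1, dec1, ra2, dec2))
    a = (math.sin((dec2 - dec1) / 2) ** 2
         + math.cos(dec1) * math.cos(dec2) * math.sin((ra2 - ra1) / 2) ** 2)
    return math.degrees(2 * math.asin(math.sqrt(a))) * 3600.0

def closest_match(target, catalog, max_sep_arcsec=1.0):
    """Index of the nearest catalog entry within max_sep_arcsec of the
    target (ra, dec), or None if no entry is that close."""
    best, best_sep = None, max_sep_arcsec
    for i, (ra, dec) in enumerate(catalog):
        sep = separation_arcsec(target[0], target[1], ra, dec)
        if sep <= best_sep:
            best, best_sep = i, sep
    return best

# Hypothetical positions (degrees): offsets of 0.72", 1.08" and 0.5 deg.
catalog = [(217.5, 0.0002), (217.5003, 0.0), (218.0, 0.0)]
match = closest_match((217.5, 0.0), catalog)
```

Targets with no counterpart inside the matching radius return `None` and would simply be left out of the joint catalog.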
Galaxies are selected by color, determined by their assignment to cells in this map, as well as by fiber magnitude. We select these colors on mean cell magnitude, to take into account visibility of the targets to the fiber, and on a high probability of redshift $z < 1.6$ (i.e. the [OII] feature lies within the DESI wavelength window, discussed further in the following paragraphs). This allows us to target 3,692 cells from the C3R2 SOM (≈33%) with a minimum of 3 galaxies per cell, spread in magnitude. These targets were chosen to enable a first quantification and rejection of faulty photometry and redshifts, and to examine the trend of redshift with magnitude at fixed color, which is sorely required for accurate calibration efforts in the future (Masters et al. 2019). From 1800 targets per sq. deg., DC3R2 required only a small random subset (36 per sq. deg., or 2%) for our initial request of 3 galaxies per cell, making it an ideal candidate for a secondary target program with very flexible fiber assignment across a high density of targets. While ideally DC3R2 would sample the full color space available to DESI, i.e. with no selection on $z < 1.6$, to fully probe the edge of the redshift range without a photometric redshift prior, these galaxies are realistically too faint on average for a spare fiber program to address in a few 20 minute exposures. We benefit from main survey target classes (ELGs especially) probing this redshift regime with higher fidelity. We summarize the diverse range of target selections in this paper in Table 1, where we find that DC3R2 is the broadest in terms of redshift, but is augmented by deeper main survey samples.
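As a worked check of the redshift limit, the arithmetic below assumes a rest-frame wavelength of roughly 3727 Å for the [OII] doublet (a standard value that is not quoted in this paper) together with the 360-980 nm sensitivity range given in Sec. 2.1:

```latex
\lambda_{\rm obs} = (1+z)\,\lambda_{\rm rest}, \qquad
\lambda_{\rm obs}\big([\mathrm{OII}],\ z = 1.6\big) \approx 2.6 \times 3727\ \text{\AA} \approx 9690\ \text{\AA} = 969\ \mathrm{nm} < 980\ \mathrm{nm},
\qquad
z_{\max} \approx \frac{9800\ \text{\AA}}{3727\ \text{\AA}} - 1 \approx 1.63 .
```

Under these assumptions the doublet leaves the red end of the spectrograph just above $z \approx 1.6$, consistent with the cut adopted above.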
Explicit choices for the DC3R2 target selection procedure follow from the joint KiDS-VIKING-DECaLS matched catalog:
• To boost the redshift completeness of our targets, individual galaxies that would conservatively be expected to take four visits or fewer were selected from the catalog as MAG_FIBER_Z < 22.10, to achieve a Balmer break SNR of approximately one in the optical, where passive galaxies are the conservative upper bound; see the discussion in Zhou et al. 2023 (their Sec. 4.4) for more information.
From this we prioritized the brightest galaxy in each cell (as a lever arm for the measurement in Sec. 5.2) and multiplicity, as described for each component of our survey in the respective section below.
The initial spare fiber catalog for the SV stage had approximately 970k targets available that met the above criteria. This target list was modified for Y1 to include the larger KiDS-1000 footprint (Kuijken et al. 2019); see 3.1.2 for more information. Our ability to prioritize targets is severely limited by the secondary nature of our program, and where we do have the ability to prioritize is discussed in more depth in App. C for the dedicated tiles. We are agnostic to pre-existing spectroscopic coverage in the field and simply cover any available target by chance target-fiber vacancies.

Dedicated Tiles
DC3R2 observed two dedicated tiles, i.e. instrument pointings, at the end of the One Percent Survey within the survey validation period. We chose the two pointings to be centered at 217.5 degrees and 221.0 degrees RA on the celestial equator, within the GAMA 15h field (Driver et al. 2011) and also within the Hyper Suprime-Cam Subaru Strategic Program (HSC) survey area (Aihara et al. 2018). Each tile was observed with two different fiber configurations, one for a single 30 minute exposure aiming at targets down to $z_{\rm fiber} < 21.5$, and one for a total of 90 minutes of exposure time with targets down to $z_{\rm fiber} < 22.10$. A description of how targets were chosen for each of these pointings can be found in Appendix C, but in summary we optimize priority for bright-faint pairs (a large span of MAG_GAAP_Z) in a given cell to maximize our lever arm on measuring $dz/dm$. The dedicated tiles are identified in Fig. 1 as black circles.
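Why a large magnitude span helps can be seen from a simple least-squares sketch of the redshift-magnitude slope within one cell. This is a generic illustration under the assumption of equal, independent redshift scatter per galaxy; the function names and toy numbers are hypothetical and this is not the estimator used in Sec. 5.2.

```python
import math

def _sum_sq_dev(magnitudes):
    """Sum of squared deviations of magnitude from its mean."""
    mean_m = sum(magnitudes) / len(magnitudes)
    return sum((m - mean_m) ** 2 for m in magnitudes)

def slope_dz_dm(magnitudes, redshifts):
    """Ordinary least-squares slope of redshift against magnitude."""
    if len(magnitudes) < 2 or len(magnitudes) != len(redshifts):
        raise ValueError("need at least two (magnitude, redshift) pairs")
    s_mm = _sum_sq_dev(magnitudes)
    if s_mm == 0:
        raise ValueError("all magnitudes are identical; slope undefined")
    mean_m = sum(magnitudes) / len(magnitudes)
    mean_z = sum(redshifts) / len(redshifts)
    s_mz = sum((m - mean_m) * (z - mean_z)
               for m, z in zip(magnitudes, redshifts))
    return s_mz / s_mm

def slope_uncertainty(magnitudes, sigma_z):
    """Standard error of the slope for equal, independent scatter sigma_z."""
    return sigma_z / math.sqrt(_sum_sq_dev(magnitudes))

# Toy cell with an exact trend of 0.02 in redshift per magnitude.
mags = [19.0, 20.0, 21.0, 22.0]
zs = [0.5 + 0.02 * (m - 19.0) for m in mags]
slope = slope_dz_dm(mags, zs)
# A bright-faint pair constrains the slope better than a faint-faint pair.
ratio = slope_uncertainty([21.0, 22.0], 0.05) / slope_uncertainty([19.0, 22.0], 0.05)
```

In this toy case the pair spanning three magnitudes constrains the slope three times better than the pair spanning one magnitude, which is the motivation for prioritizing bright-faint pairs.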

Spare Fibers
The DC3R2 targets outside of the dedicated tiles were observed in two phases within the SV and Y1 periods.
Table 1: Generalized target selections for each survey contributing to this analysis, their complementary redshift sensitivities, and the resulting observed spectra counts before quality cuts are applied. All magnitude cuts are reported in the optical Legacy Survey bands, $grz$, used for targeting. Note that the BGS fiber magnitude cuts are color dependent and simplified here, and that objects may be shared between different target classes. See Fig. A2 for a breakdown of the interplay of each sample in color-magnitude space.
The initial spare fiber, secondary target, observations are described by the selections made in 3.1 on the KV DR3, KiDS-450, photometry released at the time.
The spare fiber target selection and prioritization for the Y1 observing period was modified from the initial SV phase to draw targets from the then publicly available larger area KiDS-1000 (DR4) photometric catalog (Kuijken et al. 2019), as seen in the shaded blue regions of Fig. 1. The strategy was altered to prioritize the brightest galaxy in each cell alongside a randomly drawn target within a $Z$-band magnitude of 21.88. This lower magnitude cut ensures a redshift is obtainable from two visits, which differs from the maximum of four visits in SV, seldom available for spare targets. The random draw traces the magnitude distribution of the cells overall. This pair-wise selection per cell has the benefit of extending the magnitude leverage on $dz/dm$, as more bright galaxies are available in the larger footprint, and achieving high redshift success in survey spare fiber mode. The same criterion as in Sec. 3.1 allowed us to select from 5074 cells in the C3R2 SOM for the Y1 fiber proposal. The blue objects in Fig. 1 are successful matches back to this wider catalog during Y1, sometimes retroactively from main survey targets in the SV phase. The DC3R2 spare fiber program extended into Y1, and while some of that data is analyzed here, we expect an additional 315k targets within our footprint to become available with the Y1 data release, with around 28k of those objects being DC3R2 exclusive targets and the remainder coming from overlap with DESI main survey target classes.
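The pair-wise selection per cell can be sketched as below. This is an illustrative reading of the strategy, not the targeting code: the input layout, the function name `select_pair`, and the choice to apply the 21.88 limit to both members of the pair are assumptions.

```python
import random

def select_pair(cell_galaxies, z_mag_limit=21.88, rng=None):
    """Pick the brightest eligible galaxy in a cell plus one randomly
    drawn other eligible galaxy.

    cell_galaxies: list of (galaxy_id, z_mag) tuples for one SOM cell.
    Returns a list with zero, one, or two entries.
    """
    rng = rng or random.Random()
    eligible = [g for g in cell_galaxies if g[1] < z_mag_limit]
    if not eligible:
        return []
    brightest = min(eligible, key=lambda g: g[1])
    others = [g for g in eligible if g is not brightest]
    picks = [brightest]
    if others:
        # A uniform draw follows the magnitude distribution of the cell.
        picks.append(rng.choice(others))
    return picks

# Hypothetical cell: one galaxy is fainter than the limit and is skipped.
cell = [("a", 20.1), ("b", 21.5), ("c", 22.3)]
picks = select_pair(cell, rng=random.Random(0))
```

Because the second member is drawn uniformly from the eligible galaxies, its magnitude distribution follows that of the cell, while the brightest member supplies the lever arm.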

Observations
The positions of all observed DC3R2 targets on the sky, falling on the equatorial fields where 9-band photometry is available, are depicted in Fig. 1, including the dedicated tiles in black. An individual DESI pointing can capture up to 5k targets across 8 deg$^2$, which is visible as the overlapping circular rosettes in this figure.

Redshift determination
Following pixel level calibration and extraction of a one-dimensional spectrum, the DESI redshifting pipeline redrock forward models the observed data from a basis consisting of a set of template SEDs for each target class (Guy et al. 2023; Bailey et al. 2023). This decomposition is done at each point in a fine grid of redshift values $z$, and the $\chi^2$ of the difference between the observed data and the best linear combination of templates is determined. A minimum in $\chi^2(z)$ indicates a potentially optimal redshift-template solution. The $z$ that globally minimizes $\chi^2$ is taken as the redshift of the observed object.
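Written out, the procedure described above amounts to the following, where the symbols are introduced here for illustration and are not notation from this paper: $d_i$ and $\sigma_i$ denote the observed flux and its uncertainty in spectral pixel $i$, $T_{k,i}(z)$ is template $k$ shifted to redshift $z$ and evaluated in that pixel, and $a_k$ are the fitted linear coefficients.

```latex
\chi^2(z) = \min_{\{a_k\}} \sum_i \frac{\Big[\, d_i - \sum_k a_k\, T_{k,i}(z) \Big]^2}{\sigma_i^2},
\qquad
z_{\rm best} = \underset{z}{\arg\min}\ \chi^2(z)
```

Because the model is linear in the $a_k$, the inner minimization at each trial redshift is a weighted linear least-squares problem, which is what makes a scan over a fine redshift grid tractable.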

Completeness
The primary metric of redshift confidence in the DESI survey is $\Delta\chi^2$, the difference between the $\chi^2$ of the best fit redshift and template combination and the second lowest local minimum of $\chi^2$ as a function of redshift. The larger the $\Delta\chi^2$, the more confident one can be that the first redshift is correct (Bailey et al. 2023). As each main survey sample selection (Hahn et al. 2023, Zhou et al. 2023, and Raichoor et al. 2023) has different characteristic SED features, the ability of redrock to fit a given spectrum depends on the type of galaxy. The DESI visual inspection efforts have determined minimum $\Delta\chi^2$ selections and additional quality cuts that maximize purity and completeness for each main survey sample (Lan et al. 2023; Guy et al. 2023).
For this analysis, DC3R2 galaxies that we consider to have high confidence redshifts must pass the same completeness criteria as the BGS sample, i.e. $\Delta\chi^2 > 40$. This is the strictest among the $\Delta\chi^2$ cuts for DESI main survey samples and was chosen to account for the broad variation of SED types in our target selection, informed by the visual inspection done for the BGS sample in Lan et al. (2023) that optimizes good redshift purity and inclusion for our comparably bright targets. ELGs and LRGs have significantly lower $\Delta\chi^2$ requirements in the DESI main survey, which combined with further quality cuts give high fractions of confident redshifts, as redrock and its template sets have been optimized to obtain good fits to these SEDs. Galaxies with different SEDs may need more conservative metrics to minimize outliers, hence our adoption of the $\Delta\chi^2 > 40$ cut. Our analysis indicates this is more than sufficient in most regions of color space, but it ought to be kept in mind that $\Delta\chi^2$ is only one avenue of selecting robust redshifts and the threshold required depends strongly on the SED type of the galaxy. Other features, e.g. the signal-to-noise in strong features like [OII], may also be strong predictors of successful redshift measurements in redrock.
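The confidence metric can be sketched on a sampled $\chi^2(z)$ curve as follows. This is a simplified illustration of the definition given above, not redrock's implementation, which includes details not reproduced here; the function names and toy curves are hypothetical. The ZWARN == 0 requirement quoted with the success rates in this section is included as a second condition.

```python
def local_minima(chi2):
    """Indices of local minima of a sampled chi2(z) curve.
    Endpoints count if they lie below their single neighbor."""
    n = len(chi2)
    indices = []
    for i in range(n):
        left = chi2[i - 1] if i > 0 else float("inf")
        right = chi2[i + 1] if i < n - 1 else float("inf")
        if chi2[i] < left and chi2[i] <= right:
            indices.append(i)
    return indices

def delta_chi2(chi2):
    """Second-lowest local minimum minus the global minimum.
    Returns infinity when only one local minimum exists."""
    minima = sorted(chi2[i] for i in local_minima(chi2))
    if len(minima) < 2:
        return float("inf")
    return minima[1] - minima[0]

def is_confident(chi2, zwarn, threshold=40.0):
    """Apply the delta chi2 > threshold cut together with ZWARN == 0."""
    return zwarn == 0 and delta_chi2(chi2) > threshold

# Toy curves: one ambiguous solution and one well separated solution.
ambiguous = [900, 850, 870, 820, 780, 800, 860, 795, 830]
secure = [900, 700, 880, 760, 890]
```

In the toy example the first curve has competing minima only 15 apart and fails the cut, while the second passes with a separation of 60.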
We report the distribution of successful redshifts in the SOM from all target classes based upon their individual success criteria in Fig. 2b. While a cell may have successful redshift measurements, those measurements may be biased in certain regions of color space where obtaining good redshifts is more difficult. How we account for this bias is described in the following section. The redshifts measured from these successful spectra are visible in Fig. 3a after accounting for these incomplete regions. We can see in Fig. 3b the general agreement with previous spectroscopy in the COSMOS field. The color-redshift calibration is further discussed in Sec. 4.2.
For the sake of future survey efforts to calibrate the color-redshift relation, in Fig. 4 we report the exposure times required by DESI for targets of eight-band colors in the SOM to achieve $\Delta\chi^2 > 40$ for objects scaled to a magnitude of MAG_GAAP_Z = 21.0, following the discussion in Appendix D.
For the fiber configurations with 90-minute exposure time over the DC3R2 dedicated tiles, 98.1% of targets achieved the DC3R2 criterion for success with no flagged warnings (ZWARN == 0). For the galaxies in the two single 30-minute exposures, 93.4% meet this redshift success criterion. The overall redshift success rate for the dedicated tile targets is 95.8%. For the entire DC3R2 ancillary program that also made use of spare fibers (footprint depicted in Fig. 1), the success rate is 93.4% for unique DC3R2 targets, and 94.7% for shared targets, with a total of 13,270 targets observed during the dedicated program. While many of these overlap with main survey targets, the higher DC3R2 priority in fiber selection ensures that the selection of these objects is less biased towards the main survey selections in color than similar overlaps in the spare fiber fields.

Sample Weighting Scheme
One of the primary methods of photometric redshift calibration in weak lensing is to appropriately reweight a sample of known redshifts to accurately approximate the redshift distribution of a much larger source galaxy sample. This requires a full understanding of the selection acting on both samples. For the true redshift sample this selection can be intentional, in the definition of a spectroscopic target catalog, or unintentional, due to incomplete redshift recovery among the targeted galaxies. Only with this selection accounted for, and with the weak lensing source galaxy sample selection applied in addition, can the reweighted spectroscopic sample be representative and the estimated redshift distribution therefore be unbiased.
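The idea of reweighting can be illustrated with a generic cell-based scheme, in which each spectroscopic galaxy is weighted by the ratio of photometric to spectroscopic counts in its SOM cell. This is a textbook-style sketch with hypothetical names and toy numbers; it is not the specific weighting scheme developed for DC3R2, and it ignores photometric galaxies in cells without any spectra.

```python
from collections import Counter

def cell_weights(spec_cells, phot_cells):
    """Weight per occupied cell: photometric count / spectroscopic count."""
    n_spec = Counter(spec_cells)
    n_phot = Counter(phot_cells)
    return {cell: n_phot.get(cell, 0) / n for cell, n in n_spec.items()}

def weighted_mean_redshift(spec_cells, spec_z, phot_cells):
    """Mean redshift of the spectroscopic sample after reweighting it to
    the cell occupation of the photometric sample."""
    weights = cell_weights(spec_cells, phot_cells)
    w = [weights[cell] for cell in spec_cells]
    total = sum(w)
    if total == 0:
        raise ValueError("no overlap between spectroscopic and photometric cells")
    return sum(wi * zi for wi, zi in zip(w, spec_z)) / total

# Toy example: cell 0 is over-represented in the spectroscopic sample.
spec_cells = [0, 0, 1]
spec_z = [0.2, 0.4, 1.0]
phot_cells = [0, 1, 1, 1]
mean_z = weighted_mean_redshift(spec_cells, spec_z, phot_cells)
```

In the toy example the unweighted spectroscopic mean redshift is about 0.53, while the reweighted value of 0.825 reflects the larger share of the higher redshift cell in the photometric sample.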
For this survey, we must take into account the overlap of DC3R2 targets with DESI main survey targets. The selection of the latter is not based on KiDS-VIKING colors but on DESI Legacy Survey $g$, $r$, $z$, $W1$, $W2$ photometry (Myers et al. 2023). The observation prioritization of the DESI main survey targets may affect relative abundances of certain redshifts within a given SOM cell. This is accounted for by reweighting according to different subselections of our targets, namely:
(1) Dedicated tiles: Our dedicated tile sampling method over a small area will be unaffected by DESI main survey oversampling. It is our fiducial sample for this reason.
(2) DC3R2 exclusive targets: These are targets observed during regular DESI operations that are only observed as targets of our dedicated program, i.e. that do not meet DESI main survey target selection criteria. This sample will be potentially biased away from the types and redshifts of galaxies observed by DESI's main surveys in SOM cells that contain a mix of both.
(3) DESI main survey targets: DESI main survey targets that overlap with our selection will be more likely to be observed due to their high priority in comparison with DC3R2 exclusive targets in SOM cells that contain a mix of both. Each class of main survey targets has different magnitude and color cuts than the DC3R2 targets. SV and Y1 main survey targets have different color selections that have evolved over the course of the survey and are weighted accordingly.
We can re-weight these subselections to leverage the full DESI survey beyond our fiducial sample, (1). The basic ruleset for usage and weighting is described here, with the aim of providing a reliable and unbiased sample of spectroscopic galaxies over as many cells as possible, with a large magnitude span within available cells. We want each cell to have a collection of representative galaxies that are corrected for overabundances caused by preferential observations of galaxies of a particular SED type. For all DESI-observed galaxies that have KiDS-VIKING photometry in a given cell:
• We split the sample in the cell into categories (2) and (3) by the observation flags DESI_TARGET and SCND_TARGET, bitmasks available in the DESI catalog that indicate main survey and secondary programs respectively.
• For those in (3), we check how complete the targeting is for main survey targets in color space by comparing the full main survey target catalog occupation for the cell (with only color cuts applied) to the full KiDS-VIKING occupation of that cell. If the targeting completeness (fraction of photometric targets that are also main survey targets) is below 90 percent for a given color, we exclude the main targets from this cell and skip the following steps. This effectively ensures that the color cut on the main survey sample does not significantly bisect a SOM cell. As the period covered by the observations spanned several iterations of LRG/ELG cuts (see Myers et al. 2023), only cells where neither the stricter Y1 cut nor the looser SV cuts bisected the cell were retained.
• For those in (1), (2) and (3), we check the redshift completeness of the sample type in the given cell. This is done on the subset of galaxies in a cell that pass the specific survey target selection (e.g. the DECaLS based magnitude and color cuts, or the DC3R2 magnitude selection). If among this subset within a cell the redshift completeness is less than 90%, we exclude all of these galaxies, for both DC3R2 and main survey samples, from the weighting scheme due to the risk of a significant redshift dependent selection bias, and do not include low confidence targets in the next step. See a simple visual description of this in Fig. 2c. The large number of incomplete spectra seen in the center top left of this figure do not lack sufficient exposure time, but rather spectral features within the DESI sensitivity range. Galaxies likely to be at $z > 1.6$ predominantly lack spectral features in the optical.
• We weight the high confidence redshifts of samples (2) and (3) so that both samples contribute proportionally to their abundances in the targeting catalog, within a given cell and bin of fiber magnitude. If only one sample type exists in a given magnitude/cell bin, no such re-weighting is done.
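The two 90% checks in the list above can be sketched as a small per-cell decision function. This is a minimal illustration under stated assumptions: the function name, argument names, and returned keys are hypothetical, and the counts are made up.

```python
COMPLETENESS_MIN = 0.90  # threshold quoted in the text

def cell_passes(n_main_targets, n_photometric, n_good_z, n_selected):
    """Apply the targeting and redshift completeness checks to one SOM cell.

    n_main_targets / n_photometric : targeting completeness of main survey targets
    n_good_z / n_selected          : redshift completeness of the selected sample
    All names here are hypothetical and chosen for illustration.
    """
    targeting = n_main_targets / n_photometric if n_photometric else 0.0
    redshift = n_good_z / n_selected if n_selected else 0.0
    return {
        "use_main_targets": targeting >= COMPLETENESS_MIN,
        "use_cell_redshifts": redshift >= COMPLETENESS_MIN,
    }

# Made-up counts: targeting is 95% complete, redshift recovery only 80%.
example = cell_passes(n_main_targets=95, n_photometric=100,
                      n_good_z=80, n_selected=100)
print(example)
```

In this made-up cell the main survey color cut does not bisect the cell, but the redshifts would still be excluded because too many targeted galaxies lack a confident redshift.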
This scheme can be written explicitly as follows. For a given magnitude/color bin $b$ within the three-band magnitude cube of a given SOM cell $c$, let $a_b$ denote the number of actual (or observed) and $p_b$ the number of potential targets. Further, denote the unique sample (ELG, LRG, BGS (B), BGS (F), DC3R2 spare fibers, DC3R2 dedicated tiles) of a target by the index $s$. Where each class of galaxy lives in the SOM is delineated by color in Fig. 5, where we can see that the samples are highly complementary. The weight of the redshift of an object towards the estimated redshift distribution of a cell $c$ depends on $b$ and $s$ and can be factorized as
$w(c, b, s) = w_b(c, b) \times w_s(c, b, s)$ (1)
where $w_b(c, b)$ is the weight of the magnitude bin towards the full sample in the cell and $w_s(c, b, s)$ is the weight of the sample that the galaxy belongs to for the given cell and magnitude bin. Individual samples in magnitude-color bins require these weights to properly reproduce relevant target abundances in our final reported redshift distributions. An illustration of how samples populate a choice of magnitude bins is visible for the SV3 selections in Appendix A, Fig. A2, where a more thorough description of magnitude bins can be found. For objects that are in more than one sample (typically only true for some DC3R2 targets with main survey samples), the largest of its sample weights is used. In the magnitude-bin weight, $a$ and $p$ refer to the number of actual or potential targets in cell $c$ or in magnitude bin $b$ of that cell, respectively. Similarly, the sample weight $w_s(c, b, s)$ reweights samples within that bin and cell. No reweighting is done for objects in (1) aside from the redshift completeness check, as these targets are not expected to be biased within a cell. The distribution of weights normalized for each SOM cell, $\sum_{b \in c} \sum_{s \in b} w(c, b, s) = w(c)$, is depicted in Fig. 6 for the main survey samples. The purpose of these weights, to construct less biased $p(z|c)$ distributions, is demonstrated in Sec. 5.1, where they produce noticeable shifts in the inferred redshift distributions.
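To make the factorization concrete, the sketch below computes a bin weight and a sample weight from actual and potential target counts. The functional form used here (ratio of the potential fraction to the actual fraction within the parent group) is an assumption adopted for illustration, not a form stated in the text, and all counts are made up.

```python
def abundance_weight(n_actual, n_potential, n_actual_parent, n_potential_parent):
    """Assumed weight form: potential fraction divided by actual fraction.

    With this choice, weighted observed counts reproduce the relative
    abundances of the targeting catalog within the parent group.
    """
    if n_actual == 0 or n_actual_parent == 0 or n_potential_parent == 0:
        raise ValueError("weights are undefined for empty groups")
    return (n_potential / n_potential_parent) / (n_actual / n_actual_parent)

def object_weight(w_bin, sample_weights):
    """Combine the bin weight with the largest applicable sample weight."""
    return w_bin * max(sample_weights)

# Made-up counts for one cell (parent) and one magnitude bin inside it.
w_bin = abundance_weight(n_actual=50, n_potential=200,
                         n_actual_parent=100, n_potential_parent=1000)
# Two samples sharing that bin: 40 observed of 100 potential, 10 of 100.
w_elg = abundance_weight(40, 100, 50, 200)
w_dc3r2 = abundance_weight(10, 100, 50, 200)
# An object belonging to both samples takes the larger sample weight.
w_shared = object_weight(w_bin, [w_elg, w_dc3r2])
print(w_bin, w_elg, w_dc3r2, w_shared)
```

In this toy bin the two samples have equal potential abundance, and the weighted observed counts (40 times w_elg and 10 times w_dc3r2) come out equal, which is the behavior the weighting is meant to achieve.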

THE COLOR-REDSHIFT RELATION
In this section we describe the color-space used in this analysis (Sec. 4.1) and discuss the characterization of redshift distributions in that space given DESI observations (Sec. 4.2).

Description of Color-Space
This analysis makes use of the SOM first developed in Masters et al. (2015), which includes galaxies to Rubin and Euclid-like depth, $i \lesssim 24.5$ or approximately 98% complete at $i = 25.3$ from Laigle et al. (2016), in ugrizYJH, with the modifications and added $K_s$-band from Masters et al. (2017). Our map exists in 9-band KiDS-VIKING photometric colors analogous to the original training photometry. As KiDS-VIKING-like photometry is not available across the same fields at the time of this paper, our transfer function to shift the map is evaluated by measuring flux in the KiDS-VIKING bandpasses for the best model template fit to the narrow band photometry in the original SOM cell. The nature of the template fitting, and the templates used to produce these SEDs, are elaborated upon in Laigle et al. (2016). While KV is not the photometry that the Masters et al. (2015) SOM was trained on, making use of a transformed version of this SOM has the advantage that individual cells will be populated by galaxy populations of approximately the same true colors. Thus the deep samples collected by C3R2 and the DC3R2 survey will have similar redshifts for similar cells and can help inform future redshift searches. We can conclude that our map, though photometrically different, will span a similar color space to Masters et al. (2017), and therefore to Euclid and LSST, by construction. For more elaboration on the differences between the SOM in this analysis and that of Masters et al. (2017), see Appendix B, and for the exact colors of our map see Appendix A.

Characterization of Color-Space Using DESI
With DESI targets mapped to $ugriZYJHK_s$ color-space, we have a powerful statistical sample that constrains the high-dimensional color-redshift relation for future cosmological analyses. With more than 230k galaxies covering 56% of the map, we present redshifts that span $0 < z < 1.6$ with high multiplicity. DESI provides secure spectroscopic redshifts alongside redrock SED models fitted to each galaxy, which allows us to explore the evolution of galaxy type across the map, as seen in Fig. 5. The colors of the SOM broadly have smooth transitions, as do the redshifts, as seen in Fig. 3a, which depicts the median redshift across this map. Galaxies with similar colors tend to have similar redshifts, except where sharp delineating features express an innate degeneracy, where small shifts in color can have large consequences for redshift inference.
The gains that have been made in constraining $p(z|c)$ can be seen in the other panels of Fig. 3, where we examine the additional color coverage of bright, low-redshift cells that were under-observed in the COSMOS field in the past. The red and magenta regions in Fig. 3c depict regions where DESI dominates the redshift information in the map by spectroscopic count. Cosmic variance and simple undersampling can distort the broader color-redshift relation we measure by biasing populations of galaxies within a given SOM cell (see an extended discussion in App. B). DC3R2 and DESI together provide much higher confidence in the distributions of redshifts at these magnitudes and colors. This is exemplified in Fig. 7, which depicts the number of cells in the map at each redshift with spectroscopic coverage. We see a dominant DESI + KV contribution in the BGS and ELG redshift regimes (the two peaks), which sharply drops off at $z = 1.55$ (becoming subdominant to COSMOS at $z > 1.375$), where the [OII] line passes out of the optical range. DESI jointly with KiDS-VIKING calibrates 87% of the LSST / Euclid color-space (as a fraction of cells) at $z < 0.35$, 84% from $0.8 < z < 1.2$, and 77% overall for $z < 1.65$. While this is the overarching calibration extent of early DESI data in the KV fields, the higher coverage where DESI main survey targets supply galaxy spectra suggests that future efforts to span deeper magnitudes, and DC3R2-like efforts to bridge the spaces between main target categories, could make real gains in filling more of the space at approximately $0.4 < z < 0.8$. For the full color space independent of redshift, the DESI-KV sample, inclusive of DC3R2, calibrates 56% of the space.
Figure 5. Depiction of the SOM with four complementary samples, post completeness cuts: Emission Line Galaxies (green), Luminous Red Galaxies (red), the Bright Galaxy Survey (blue) and DC3R2 selected targets (orange), where the strength of each color channel is directly proportional to the fraction of galaxies in that cell that come from the sample noted. The distributions of normalized redrock SED fits are shown for three choices of SOM cell that span star forming, star burst, and passive galaxy types, with a colored envelope for the 68% quantile region about the median of all templates shown in black. The shaded regions denote rest-frame wavelengths not observed by DESI according to the best fit redshift, and individual galaxy spectrum fits are drawn in gray. Normalization is done at a point of featureless continuum, 3000 Angstroms for star forming galaxies and 8000 Angstroms for mostly quiescent galaxies.
Similarly, DESI will produce better constraints on $p(z|c)$ in the future, jointly with photometry of improved depth and with the full extent of the five year survey. With currently available KiDS-VIKING photometry, we can see in Fig. 8 that redshifts for a given color are broader than in COSMOS by a factor of 1.7, likely due to photometric scatter affecting cell assignment. The 68% region for the span of redshifts in a cell in aggregate, $(z_i - \bar{z})/(1 + \bar{z})$, with the DESI-KV sample is [-0.0392, 0.0361], yielding an approximate $\sigma_z = 0.0376$. In the bottom panel we can see how the distributions vary in individual cells, with the median of cells falling at approximately this aggregate (marked by the dashed lines). DESI provides a powerful data set, and if photometry similar to that of the COSMOS field were supplied for our spectra, we might anticipate our estimate of the variance in our overall redshift distributions decreasing to a fraction 0.317 of our currently reported, unweighted value (i.e. $\sigma^2_{z,{\rm COSMOS}} / \sigma^2_{z,{\rm DESI-KV}}$). With an already substantial increase in the number of spectra, deeper photometry would provide strong advantages. While the depth of the sample can be increased, the bounds on redshift range are fixed for the current instrument.
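The aggregate scatter quoted above can be illustrated with a short percentile calculation. The sketch below is a minimal example, assuming that the reference redshift of each cell is its median (the text does not specify this) and using made-up redshifts; it also checks the arithmetic linking the quoted 68% interval to the quoted width.

```python
import numpy as np

def aggregate_scatter(z, cell_id):
    """Central 68% interval of (z - z_ref) / (1 + z_ref), pooled over cells.

    z_ref is taken to be the cell median, which is an assumption.
    Returns (lower, upper, half_width).
    """
    z = np.asarray(z, dtype=float)
    cell_id = np.asarray(cell_id)
    dev = np.empty_like(z)
    for c in np.unique(cell_id):
        in_cell = cell_id == c
        z_ref = np.median(z[in_cell])
        dev[in_cell] = (z[in_cell] - z_ref) / (1.0 + z_ref)
    lower, upper = np.percentile(dev, [16.0, 84.0])
    return lower, upper, 0.5 * (upper - lower)

# Made-up redshifts in two cells.
z_demo = np.array([0.50, 0.52, 0.48, 1.00, 1.10, 0.90])
cells_demo = np.array([0, 0, 0, 1, 1, 1])
lower, upper, sigma_demo = aggregate_scatter(z_demo, cells_demo)

# Arithmetic check on the interval quoted in the text.
sigma_quoted = 0.5 * (0.0361 - (-0.0392))
print(sigma_demo, sigma_quoted)
```

The half-width of the quoted interval is 0.03765, consistent with the stated width of approximately 0.0376, and three times that width rounds to 0.113.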

CHARACTERIZATION OF SPECTROSCOPIC SELECTION BIASES ON REDSHIFT CALIBRATION
Ideal photometric redshift calibration for weak lensing surveys has to account for more than the dependence of redshift on observed color. Even at fixed color, magnitude and the explicit and implicit selections of the spectroscopic or the weak lensing source galaxy sample can have an impact on the resulting redshift distribution, and must be accounted for in redshift calibration. For future surveys, spanning the depth will no longer be possible across all colors on account of the required spectroscopic exposure times. A sub-cell calibration that accounts for magnitude variation may become necessary at this stage, and the accuracy of this calibration will heavily depend on the depth and multiplicity of the available spectroscopic redshifts.
Here we investigate the impact of these systematics on redshift calibration with the DC3R2 spectroscopic sample. Section 5.1 tests the effects of magnitude cuts and spectroscopic selections on the redshift distributions of realistic Stage-III lensing survey redshift bins. Sections 5.2 and 5.3 explicitly check for trends in redshift as a function of magnitude at fixed color, and the impact of noise in observed colors on those trends. Finally, Sections 5.4 and 5.5 compare the systematic effects found to the requirements on future surveys.

Impact on Calibration of Redshift Bins
Selection effects in spectroscopic samples will distort crucial estimates for weak lensing surveys, like that of the mean redshift per tomographic bin. To explore these selection effects we use the KiDS-450 galaxy abundances in our SOM cells to infer redshift distributions for five KiDS-like tomographic bins. For a given selection, sel, and for bins composed of SOM cells, this procedure sums the product of two terms over the cells in each bin. The first term amounts to the distribution of spectroscopic redshifts in a given cell and the second amounts to a weighting factor that relies on the abundances of galaxies in the calibrated sample (here, KiDS-450).
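One common way to write a two-term estimate of this kind is given below. The notation is an assumption introduced for illustration: $\hat{b}$ labels a tomographic bin, $c$ a SOM cell, and sel the spectroscopic selection; the first factor is the spectroscopic redshift distribution in a cell and the second is the fractional abundance of calibrated-sample galaxies in that cell.

```latex
p(z \mid \hat{b}, \mathrm{sel}) \approx \sum_{c \in \hat{b}} p(z \mid c, \mathrm{sel}) \, p(c \mid \hat{b}, \mathrm{sel})
```

Written this way, a spectroscopic selection changes the estimate only through the per-cell redshift distributions and through which cells retain spectroscopic coverage.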
The resulting $n(z)$s are depicted in Fig. 9a. We create these bins by sorting cells covered by DESI spectroscopy by their median redshift.
The lowest redshift cells are assigned to the first tomographic bin, until the number of KV galaxies with shapes occupying those cells reaches one fifth of the overall sample. We continue in this fashion, associating roughly equal numbers of calibrated galaxies with each bin. Note that, as seen in Fig. 9a, this binning does not place equal numbers of calibrating galaxies, i.e. spectroscopic redshifts, in each bin. We then perform a set of tests on the impact of selection biases by applying further selections to the spectroscopic sample and determining how that changes the estimated $n(z)$s. Table 2 summarizes the resulting change in mean redshift for the tomographic bins.
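The bin construction described above can be sketched as follows. This is a minimal illustration with hypothetical function names and made-up cell values; it assigns each cell by the midpoint of its interval in the cumulative count distribution, a small simplification of filling each bin until one fifth of the sample is reached.

```python
import numpy as np

def assign_tomographic_bins(cell_median_z, cell_counts, n_bins=5):
    """Sort cells by median redshift and fill bins with roughly equal counts."""
    cell_median_z = np.asarray(cell_median_z, dtype=float)
    cell_counts = np.asarray(cell_counts, dtype=float)
    order = np.argsort(cell_median_z)
    sorted_counts = cell_counts[order]
    # Midpoint of each cell's interval in the cumulative count distribution.
    midpoint = (np.cumsum(sorted_counts) - 0.5 * sorted_counts) / sorted_counts.sum()
    bin_sorted = np.minimum((midpoint * n_bins).astype(int), n_bins - 1)
    bins = np.empty(len(order), dtype=int)
    bins[order] = bin_sorted  # map back to the original cell ordering
    return bins

def bin_mean_redshift(cell_mean_z, cell_counts, bins, n_bins=5):
    """Abundance-weighted mean redshift per bin (NaN for an empty bin)."""
    cell_mean_z = np.asarray(cell_mean_z, dtype=float)
    cell_counts = np.asarray(cell_counts, dtype=float)
    num = np.bincount(bins, weights=cell_counts * cell_mean_z, minlength=n_bins)
    den = np.bincount(bins, weights=cell_counts, minlength=n_bins)
    return num / den

# Made-up cells: median redshifts and photometric (shape catalog) counts.
cell_z = np.array([0.9, 0.1, 0.5, 0.3, 0.7])
counts = np.array([10, 10, 10, 10, 10])
bins = assign_tomographic_bins(cell_z, counts)
mean_z = bin_mean_redshift(cell_z, counts, bins)
print(bins, mean_z)
```

To compare a selected spectroscopic sample against the fiducial one at fixed color, the same `bins` array would be reused and only the per-cell mean redshifts replaced.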
Every comparison ensures that the color selection for each bin does not change as the spectroscopic selection effect is applied, i.e. is made for bins defined by the same cell envelope both before and after the additional spectroscopic selection.This means that if the selection eliminates certain SOM cells from spectroscopic coverage, we use the estimated mean redshift of the same reduced set of cells even for the fiducial sample in order to isolate the effect of selection bias at fixed color.
The spectroscopic selection effects we examine are:
• Magnitude (MAG_GAAP_Z < 21.0): For $dz/dm = 0$ and a well populated color space, a magnitude cut would introduce no change in mean redshift. In the presence of significant $dz/dm > 0$, however, a magnitude cut would bias the inferred redshift towards a lower mean value. This operation and its result are demonstrated in Fig. 9b. We indeed find a small ($|\Delta z| < 0.01$) impact of the magnitude cut on the mean redshift inferred for the two lowest redshift bins, but an impact that exceeds current calibration requirements on the mean redshift inferred for the higher redshift bins, up to almost $\Delta \bar{z} = 0.1$. This cut changes the mean magnitude in each bin, and the spectroscopic redshift counts, according to Table 3. As this cut was chosen to be moderately bright for demonstration, we also explore how less severe selections affect this metric in Fig. 10, in half magnitude steps down from the KV limiting magnitude in the i-band. This demonstrates the importance of representative spectroscopic redshifts in our toy model, which is constructed in a way that fundamentally differs from DES, KiDS, and HSC, as our SOM is not trained on a magnitude in addition to its colors.
• Quality Flags ($\Delta\chi^2 > 40$): As a consequence of enforcing a more severe confidence in the fitted model, this will remove spectra in an SED-dependent way from the ELG and LRG samples where, by default, some objects with smaller $\Delta\chi^2$ are considered confident redshifts. When implemented, fewer than 2% of spectroscopic galaxies are removed. The impact of this stricter selection is small ($|\Delta z| < 0.005$) but coherent across all redshift bins, i.e. it biases all redshift distributions towards a lower mean by preferentially removing higher redshift galaxies from the sample within a given cell. In the highest redshift bin, where the fraction of ELG targets contributing to the calibration is large, the effect exceeds the redshift calibration requirements of future weak lensing experiments (cf. also Hartley et al. 2020 for the impact of selection based on conventional redshift confidence flags).
• Removal of Cell Outliers: Galaxies with large deviation from the median redshift of an ensemble of similar color are more likely to be outliers of various types, e.g. due to blending, AGN light, or redshift determination errors. We define a sample of outliers as those galaxies whose redshift deviates from the median of their cell by more than $3\sigma_z$, where $\sigma_z$ is the aggregate standard deviation in cells found in Sec. 4.2, from which $3\sigma_z = 0.113$. The selection corresponds to the central 99.7% of redshifts if these were to follow Gaussian distributions of color-independent width at any fixed color, and thus would reject only the most egregious outliers. In practice, these distributions are not Gaussian and this selection cuts 5.1% of objects. The selection is relative to the median rather than the mean to robustly deal with cells that are undersampled or heavily affected in their mean redshift by outliers. The impact of removing the outlier population defined this way is maximally of order $\Delta z \approx 0.02$, and is reduced in the highest redshift bins. While a number of these outliers can be attributed to photometric scatter across degenerate regions in the SOM (neighboring cells with large separation in redshift), others may be real examples of broad or bimodal cell distributions and ought to be examined with visual inspection.
• Applying Weights, $w$: Spectroscopic redshifts are weighted according to the scheme described in Section 3.2.3, to account for prioritization of spectroscopy for targets of certain morphologies, colors, and SED types that are not necessarily representative. As the weights can be somewhat noisy in regions of color space that are under-sampled in targeting, this comparison is additionally restricted to cells where the resulting shift in $\bar{z}$ from applying the weights is small. While this will result in an underestimation of the true weighting effect for all galaxies, these are the regions where weights can be applied confidently due to the spectroscopic and targeting counts. We apply a cell envelope selection where $|\bar{z} - \bar{z}_{w}| < 0.08$, which retains > 90% of cells and eliminates the sparsely occupied and noisiest cells. Note that the effect of this scheme in Table 2 is always to lower the mean redshift in a given bin. The impact is maximally $\Delta z \approx 0.01$ in the most sparsely populated spectroscopic bins. Additionally, if these weights are applied in the same way to the magnitude selection ($Z < 21$), we see in Table 2 that they mitigate the bright spectroscopic bias in high redshift bins, but do not eliminate it.
Figure 9b. The bias in mean redshift for each bin is noted in the legend, showing the substantial effect on the inferred redshifts of a photometric sample if the spectroscopic calibration sample is biased to brighter galaxies.
Table 3. Changes in properties of a given tomographic bin when its calibrating spectroscopic sample is subject to MAG_GAAP_Z < 21, describing the decrease in mean magnitude of the calibrating galaxies, and the fraction of spectroscopic calibrating galaxies cut from each bin. Higher redshift bins are dominated by fainter spectra and suffer the largest population shifts.
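The cell outlier removal in the list above can be sketched as a per-cell comparison against the median redshift. This is an illustrative sketch: normalizing the deviation by one plus the cell median is an assumption (chosen to match how the aggregate width is defined in Sec. 4.2), and the redshifts are made up.

```python
import numpy as np

def outlier_mask(z, cell_id, threshold=0.113):
    """Flag galaxies whose redshift is far from their cell's median.

    The deviation is normalized by (1 + cell median); this normalization
    is an assumption, not a form stated in the text.
    """
    z = np.asarray(z, dtype=float)
    cell_id = np.asarray(cell_id)
    is_outlier = np.zeros(z.shape, dtype=bool)
    for c in np.unique(cell_id):
        in_cell = cell_id == c
        z_med = np.median(z[in_cell])
        deviation = np.abs(z[in_cell] - z_med) / (1.0 + z_med)
        is_outlier[in_cell] = deviation > threshold
    return is_outlier

# Made-up cell with one clearly discrepant redshift.
z_demo = np.array([0.50, 0.52, 0.49, 1.20])
cells_demo = np.array([0, 0, 0, 0])
flags = outlier_mask(z_demo, cells_demo)
print(flags)
```

Using the median as the reference keeps a single discrepant redshift from shifting the comparison point, which is the motivation given for preferring it over the mean.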

Magnitude Dependence of Redshift at Fixed Color
Past analyses of photometry and spectroscopy in the COSMOS field have shown the magnitude dependence of redshift at fixed color cell to be small and well described by a linear behavior with $dz/dm \approx 3 \times 10^{-3}$ (Masters et al. 2017). The equivalent measurement made in the KV data with DESI redshifts is depicted in Fig. 11b. Here we examine the relation of differences in VIKING Z-band magnitude and redshift between pairs of galaxies that occupy the same color-cell in the SOM. A cell with $N$ galaxies contributes $N(N - 1)/2$ data points to Fig. 11b. We see a linear relationship, with a large amount of scatter. The raw slope measured, $dz/dm = 0.0250 \pm 0.0009$, is an order of magnitude larger than that reported in previous studies (see App. E for methodology). This ought not to be taken at face value, and Section 5.3 (Fig. 11a) explores photometric scatter as the largest source of bias in this measurement, as well as the chief difference between the data set used in this study and that of previous ones. Furthermore, comparisons of $dz/dm$ will be affected to lesser degrees by the other second order effects discussed in Section 4.1.
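The pairwise construction can be sketched as below. This is a simplified illustration on made-up data: it forms every unique same-cell pair and then fits a straight line by ordinary least squares, which is swapped in here for simplicity and is not the fitting procedure of App. E (which fits medians in magnitude slices).

```python
import numpy as np

def pairwise_differences(mag, z, cell_id):
    """Return (delta_mag, delta_z) for every unique galaxy pair sharing a cell."""
    mag = np.asarray(mag, dtype=float)
    z = np.asarray(z, dtype=float)
    cell_id = np.asarray(cell_id)
    dm, dz = [], []
    for c in np.unique(cell_id):
        idx = np.flatnonzero(cell_id == c)
        i, j = np.triu_indices(len(idx), k=1)  # N(N-1)/2 unique pairs
        dm.append(mag[idx[i]] - mag[idx[j]])
        dz.append(z[idx[i]] - z[idx[j]])
    return np.concatenate(dm), np.concatenate(dz)

def fit_slope(dm, dz):
    """Least-squares slope of delta_z against delta_mag."""
    slope, _intercept = np.polyfit(dm, dz, 1)
    return slope

# Made-up data: two cells with an exact slope of 0.025 per magnitude.
mag = np.array([20.0, 21.0, 22.0, 23.0, 19.0, 20.0, 21.0])
cell = np.array([0, 0, 0, 0, 1, 1, 1])
z = np.where(cell == 0, 0.3 + 0.025 * (mag - 20.0), 0.8 + 0.025 * (mag - 19.0))

dm, dz = pairwise_differences(mag, z, cell)
slope = fit_slope(dm, dz)
print(len(dm), slope)
```

Working with differences inside each cell removes the cell-to-cell offsets in redshift, so only the trend with magnitude at fixed color contributes to the slope.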

Systematic Uncertainty due to Photometric Scatter
With the depth of our SOM resembling that of future deep surveys, it could be problematic that we use relatively shallow photometry in KiDS-VIKING to assign our spectroscopic galaxies. Photometric scatter enters our calculation through perturbations in the cell assignment, which comprises our definition of fixed color. Galaxies with larger flux uncertainties tend to be fainter, which is also correlated with higher redshifts. While the direction of scatter will be impacted by the color topography of our individual SOM, we can broadly expect faint, high redshift galaxies to scatter more often into brighter, lower redshift cells and induce a strongly positive $dz/dm$. We can imagine photometric scatter as an asymmetric smearing of the redshift distributions in a given cell.
A potential systematic effect of this photometric scatter on the measured slope can be explored with the test described here. We can perform a direct measurement of the slope introduced by photometric noise by (1) removing any existing $dz/dm$ from our data by randomly shuffling all spectroscopic redshifts in a given SOM cell, thus nulling $dz/dm$ while preserving $p(z|c)$; (2) applying a random Gaussian draw of the width of the flux error reported in the catalog for each band, thus perturbing measured colors; and (3) reassigning galaxies to the SOM based on these perturbed fluxes and repeating our measurement of $dz/dm$. Additionally, we can (4) iteratively reshuffle and perturb many times to reduce the statistical uncertainty in the measurement from the limited sample size. The only contributor to this final, measured $dz/dm$ will be photometric scatter, making it a measurement of this systematic that can be compared across photometric surveys of different depth. Using (4) to quintuple the number of points available to us for this measurement, we observe in Fig. 11a that $(dz/dm)_{\rm sc} = 0.0132 \pm 0.0007$ for the reported KV errors associated with our DESI targets in this SOM. The slope induced this way can be measured across color space, i.e. across the SOM. This color-dependent effect is subtracted off in the second panel of Fig. 12, leaving an estimate of an intrinsic $dz/dm$ that is corrected at first order, at least under the assumption that the reported flux measurement errors are accurate. While individual cells carry significant intrinsic $(dz/dm)_{\rm int}$ (of order $\approx 0.01$), the average across all cells is much lower at $(dz/dm)_{\rm int,\,avg.} = -7 \times 10^{-4}$.
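Steps (1) to (3) of this test can be sketched as a single realization function. The sketch rests on several assumptions that are not taken from the text: cell assignment is approximated by the nearest SOM prototype in color space, colors are adjacent-band magnitude differences, and non-positive perturbed fluxes are clipped before taking logarithms. All names and values are hypothetical.

```python
import numpy as np

def fluxes_to_colors(flux):
    """Adjacent-band colors as magnitude differences; needs positive fluxes."""
    mag = -2.5 * np.log10(flux)
    return mag[:, :-1] - mag[:, 1:]

def assign_cells(colors, som_weights):
    """Nearest-prototype assignment in color space (an assumed simplification)."""
    d2 = ((colors[:, None, :] - som_weights[None, :, :]) ** 2).sum(axis=2)
    return d2.argmin(axis=1)

def scatter_realization(z, flux, flux_err, cell_id, som_weights, rng):
    """One shuffle-and-perturb realization; returns (shuffled z, new cells)."""
    # (1) Shuffle redshifts within each cell: nulls dz/dm, preserves p(z|c).
    z_shuffled = np.array(z, dtype=float)
    for c in np.unique(cell_id):
        idx = np.flatnonzero(cell_id == c)
        z_shuffled[idx] = rng.permutation(z_shuffled[idx])
    # (2) Perturb each band by a Gaussian draw scaled to the reported error.
    noisy = flux + rng.normal(0.0, 1.0, size=flux.shape) * flux_err
    noisy = np.clip(noisy, 1e-12, None)  # guard the logarithm
    # (3) Reassign galaxies to cells using the perturbed colors.
    new_cells = assign_cells(fluxes_to_colors(noisy), som_weights)
    return z_shuffled, new_cells

# Made-up three-band fluxes, two SOM prototypes in two-color space.
flux = np.array([[1.0, 2.0, 4.0], [1.0, 2.1, 4.2],
                 [4.0, 2.0, 1.0], [4.1, 2.0, 1.0]])
som_weights = np.array([[0.75, 0.75], [-0.75, -0.75]])
z = np.array([0.2, 0.3, 0.9, 1.0])
cell_id = assign_cells(fluxes_to_colors(flux), som_weights)

rng = np.random.default_rng(0)
# With zero flux errors the assignment must be unchanged (a sanity check).
z_null, cells_null = scatter_realization(z, flux, np.zeros_like(flux),
                                         cell_id, som_weights, rng)
# With non-zero errors, galaxies may move between cells.
z_sc, cells_sc = scatter_realization(z, flux, 0.3 * flux,
                                     cell_id, som_weights, rng)
```

Step (4) corresponds to calling `scatter_realization` repeatedly and pooling the same-cell pairs from every realization before measuring the slope.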

Impact of photometric noise levels
We can see in Fig. 11a that the systematic measured for KiDS-VIKING-like error in a SOM of Euclid-like resolution is significant, $(dz/dm)_{\rm sc} \approx 0.013$, and the likely dominant contributor to our overall measured slope. With this effect in mind, we revisit the original measurement performed in Masters et al. (2019) using the same field and similar photometric filters. Repeating the test of Section 5.3 used to measure the impact of photometric scatter, but with the lower noise levels of the COSMOS photometry, we find that the original measurement had a contributing noise bias of $(dz/dm)_{\rm sc} \approx 0.003 \pm 0.0004$. This is similar to the measured slope in Masters et al. (2019), and means that the intrinsic $dz/dm$ in that sample is roughly consistent with zero. Given the limitations of the coarse photometric scatter test we have applied, we do not claim to have a measurement of the intrinsic $dz/dm$ in COSMOS to better than 0.003.
Figure 11. (a) Systematic induced in $dz/dm$ by KiDS-VIKING photometric errors in conjunction with our given SOM resolution for DC3R2 spectroscopic redshifts at fixed color, with change in magnitude in KiDS-VIKING MAG_GAAP_Z. Sec. 5.3 describes the procedure of scattering galaxies by their reported errors to perform this measurement. If reported photometric errors are accurate, this systematic dominates our measurement. (b) Total measured change in redshift with magnitude (Z-band) at fixed color as observed in the shallow KiDS-VIKING photometry, which is dramatically affected by photometric scatter in contrast to previous studies on deeper COSMOS field photometry. Each data point is the difference in magnitude and redshift between a unique pair of galaxies occupying the same SOM color-cell. The best-fit slope (dashed red) is fit from a collection of medians and $\sigma$ (pink) in magnitude slices.
Also worthy of note is that doubling the photometric error for KV associated DESI redshifts produces $(dz/dm)_{{\rm sc},\,2\sigma} = 0.0256 \pm 0.0010$, which is larger than the observed raw slope in Sec. 5.3. Thus, if the photometric error in the KV catalogs is underestimated at levels that have been reported in other studies doing source injection into survey images (e.g. Everett et al. 2022), it could be that the average intrinsic slope of the DC3R2 sample is indeed consistent with zero as well (across the full sample, not simply cell-by-cell).
As noise in the KiDS-VIKING photometry introduces a large systematic effect on our desired measurement, it is worth exploring what photometry is needed to drive down the error and improve constraints on $dz/dm$. Given the uncertainty on our estimation of the systematic effect of photometric noise, the hypothesis that $dz/dm \approx 0$ seems to be consistent with current data. Yet improved photometry such as that from HSC (Aihara et al. 2018) could allow DESI spectra to be used to better effect. With a Gaussian-draw, background limited error model based upon the effective exposure times and limiting magnitudes of Aihara et al. (2018) (see their Table 8), we generate HSC-like flux errors for grizy to augment KiDS-VIKING uJHKs. We find that $(dz/dm)_{\rm sc} \approx 0.008$, which roughly halves the effect, though does not eliminate it. In the limit of noiseless optical (ugriz) photometry, we still find $(dz/dm)_{\rm sc} \approx 0.004$. We can imagine the LSST scatter with these objects would be comparable to this value, without improved NIR or IR follow up. Note that this is still larger than the slope found with the COSMOS photometry that includes the very deep UltraVISTA NIR data, demonstrating both that such small systematic errors are achievable in principle, and that they rely on deep NIR data that is difficult to achieve with currently operating instruments.

Future survey requirements
The tests presented above have shown limitations to how well we can constrain the trend of mean redshift with magnitude at fixed color in the presence of noisy photometry. Here, we connect this to requirements on how well we need to know $dz/dm$ for future, stage IV weak lensing surveys. We can approximate the redshift calibration error due to imperfectly known magnitude dependence in terms of $\Delta(dz/dm)$, our error in the magnitude dependence of redshift at fixed color; $\bar{z}$, the mean redshift of the sample being calibrated; and $(\bar{m}_{\rm wide} - \bar{m}_{\rm spec})$, the offset between the mean magnitudes of wide field observed galaxies and the spectroscopic sample used to calibrate them. The requirement for final Rubin analyses is expected to be $|\Delta z|/(1 + \bar{z}) \approx 0.001$ (LSST Dark Energy Science Collaboration et al. 2018). With the limiting $i$ band magnitude of final Rubin data and the mean magnitude of the current DESI sample for $\bar{m}_{\rm wide} = 26.3$ and $\bar{m}_{\rm spec} = 21.1$, respectively, we find that we need to determine $|\Delta(dz/dm)| \leq 0.0002$. If we assume the $\Delta(dz/dm) \approx 0.004$ that was determined in Section 5.3.1 to be the limit obtainable with current NIR photometry and deep Rubin observations, our needs and capabilities are at odds.
If instead of considering the faintest galaxies used by Rubin we estimate the sample's mean magnitude, using the slope of the luminosity function from Gruen & Brimioulle (2017), observed to the limiting i-band magnitude for LSST, we obtain $\bar{m}_{\rm wide} = \bar{i}_{\rm Rubin} \approx 24.90$. This provides a more realistic, less conservative requirement of $\Delta(dz/dm) \leq 0.0006$, which is still stricter than all our estimates for the effect of photometric scatter on this slope by almost an order of magnitude.

Interpretation
The previous sections studied how selections on the spectroscopic calibration sample provided here may impact the redshift distributions estimated using that sample. We performed these tests using a simple redshift calibration scheme that assumes that at a fixed observed color, the distributions of true redshifts of a photometric and a spectroscopically selected galaxy sample are identical. We note that none of the recent Stage-III analyses have relied on a redshift calibration scheme that was quite this simple, and hence the biases we identify are not expected to be present at the level found here in recent analyses. For example, Fig. 10 depicts a case where the calibrating spectra are up to two magnitudes brighter than the limiting magnitude of the sample, whereas the formal KiDS analysis made use of a variety of spectroscopic sources that spanned the depth of their photometric sample (van den Busch et al. 2022). Similarly, DES did not strictly apply a spectroscopic selection and made use of narrowband photometric redshifts where there were no representative spectroscopic redshifts (Myles et al. 2021). Despite this, extrapolation of redshift calibration samples to photometric objects of similar color but fainter magnitude is likely required for future, deeper photometric surveys, and hence it is useful to study the potential pathways for bias in such an application of our sample. Both the DES and KiDS SOMs (Myles et al. 2021; Wright et al. 2019) are trained on colors and a magnitude (or luptitude), and DESC will be unable to do this without throwing away a substantial part of their sample. Hence the SOM for this analysis matches spectroscopic redshifts to galaxies based on their colors alone.
A necessary condition for a non-zero bias in such a redshift calibration scheme is that the expectation value of redshift depends on properties relevant for the spectroscopic target selection and measurement success other than just a galaxy's observed color. The most salient such effect we identify is via a trend of mean redshift with observed magnitude at given observed color, $dz/dm$. We find this slope to be much steeper than reported in previous studies. The cause of this steepness is the larger photometric noise in the colors measured for our target sample, rather than a large dependence of redshift on magnitude at fixed true color. Most of the imaginable selection effects are related to a galaxy's magnitude, and hence the non-zero $dz/dm$ propagates into several of the bias tests we perform and merits further interpretation.
The bias imposed by photometric scatter on / varies with color, as seen in Fig. 12, and potentially also with other selection choices.The scatter itself is asymmetric as it does not shift objects into neighboring cells isotropically in the map, and will induce a larger / in areas with large color-redshift degeneracies (i.e. a small error in color could lead to a large offset in the mean local redshift).Our ability to understand the intrinsic / present in the data depends heavily on the quality of our photometry and/or our ability to correctly model photometric measurement errors.As seen in the right panel of the same figure, the intrinsic / is similarly color dependent, but just as likely to be negative as positive (indicating that it is likely a noisy measurement).
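A toy Monte Carlo can illustrate how photometric scatter alone produces an apparent $dz/dm$ at fixed observed color. The sketch below is purely illustrative: the one-dimensional "color", the linear color-redshift relation, and the noise model are all assumptions chosen for simplicity and are not fitted to KiDS-VIKING data. By construction, redshift depends only on true color, so the intrinsic slope is exactly zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n_gal = 200_000

# Assumed toy population: true colors peak at 0, redshift depends on color only.
true_color = rng.normal(0.0, 1.0, n_gal)
z = 1.0 + 0.2 * true_color
mag = rng.uniform(20.0, 24.0, n_gal)  # magnitude independent of true color

# Assumed noise model: color errors grow toward fainter magnitudes.
color_err = 0.1 * 10 ** (0.2 * (mag - 20.0))
obs_color = true_color + rng.normal(0.0, 1.0, n_gal) * color_err

def slope_in_cell(center, half_width=0.1):
    """Linear slope of z against magnitude for galaxies in one observed-color cell."""
    in_cell = np.abs(obs_color - center) < half_width
    slope, _ = np.polyfit(mag[in_cell], z[in_cell], 1)
    return slope

slope_red = slope_in_cell(+1.0)   # cell redward of the population peak
slope_blue = slope_in_cell(-1.0)  # cell blueward of the population peak
```

In this toy setup, faint galaxies in a cell are more likely to have scattered in from the more populated colors near the peak, so the apparent slope is negative on one side of the peak and positive on the other, even though no intrinsic magnitude dependence exists. This mirrors the color dependence and asymmetry of the scatter-induced slope described above.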
Our method to constrain this bias relies on the reported photometric errors. However, if these errors were misestimated by order unity, potentially in a color-dependent way, the entire measured slope could be the result of this systematic, as discussed in Sec. 5.3.1. Literature analyses have demonstrated that estimating the deviation between observed photometry and the true fluxes of a galaxy requires realistic image simulations processed by photometric pipelines, and that reported errors are frequently underestimated, at levels that indeed reach factors of two (e.g. Huang et al. 2017; Everett et al. 2022). No literature study exists on the accuracy of reported KV GAAP flux errors specifically, but the relative linearity of the algorithm and preliminary studies on image simulations (Li et al. 2023b) imply that the misestimation of the error in GAAP is non-zero but smaller than a factor of two. Consequently, Sec. 5.1 demonstrates that selection effects in magnitude suffer from the $dz/dm$ dependence significantly in higher redshift bins, where objects tend to be fainter. This emphasizes the role that deep photometry, in contrast to merely multi-band coverage, plays when making maximal use of spectroscopic redshifts. We can see with the cut on the narrowness of the redshift distribution for a given cell that potential outliers in a cell, arising either from photometric scatter or from a misattributed redshift, can have dramatic effects on the redshift distribution. Since this effect is largest in the lowest redshift bin, where objects are brightest, we might suspect that these outliers are dominated by cell-to-cell photometric scatter. Cutting these cells further limits the color space of a given weak lensing analysis, and ought to be avoided where possible. For this reason, visual inspection works like Lan et al. (2023) and careful selection of the redshift sample are crucial to future analyses.
It has been established (a) that LSST will require a very accurate measurement of $dz/dm$ if calibrated by DESI alone (Sec. 5.4), and (b) that even with perfect optical photometry the photometric scatter contribution to $dz/dm$ with existing DESI spectra is larger than the acceptable error by almost an order of magnitude (Sec. 5.3.1). The strategies to account for this in future weak lensing redshift calibration efforts can be threefold:
• Deeper Photometry: Despite (b), specifically improving NIR/IR measurements will decrease the systematic contribution to $dz/dm$ and potentially allow for a measurement that meets photo-z requirements.
• Deeper Spectroscopy: Deeper spectra will reduce a given survey calibration's dependence on $dz/dm$, relaxing how well this slope has to be known in (a). After its current survey, DESI will be uniquely situated to push for deeper spectra.
• Modeling: Perhaps the most viable approach is to take into account the effect of photometric scatter with appropriate modeling of the data. Future analyses could constrain the bias via forward modeling from a color space of true photometry. In this approach, the intrinsic and observed $dz/dm$ are folded into the inference alongside the systematic in a methodologically appropriate way, and the size of the systematic effect is no longer a limiting factor.

CONCLUSIONS
Photometric redshift calibration, the estimation of redshifts for galaxies that we only observe through a collection of filters, requires a thorough understanding of the color-redshift relation, which is very non-linear across the full galaxy population. This paper presents a catalog of spectra, the Dark Energy Spectroscopic Instrument (DESI) Complete Calibration of the Color-Redshift Relation (DC3R2) secondary target survey, designed to aid in the redshift calibration for a large fraction of the weak lensing source galaxy samples of future photometric surveys. The data include associated weights that turn the survey samples (ELGs, LRGs, BGS) into one that is representative at a given    and limiting magnitude of 23.68 (5, -band), using KiDS-VIKING as a test-bed. We chose to select targets on this 9-band photometry in order to break redshift-type degeneracies, and KiDS-VIKING provided the most constraining data over this area. With this unprecedented quantity of DESI spectroscopic redshifts, we examine how the sample benefits future surveys and what it reveals about spectroscopic selection effects.
• The DESI sample reported here calibrates the redshift distribution of roughly 56% of the galaxies in COSMOS via 230k spectroscopic redshifts. For the photometric color space that will be visible to DESC and Euclid (approx. 98% complete at  = 25.3), this sample corresponds to coverage of 6248 cells out of 11250. Approximately 41% of the full COSMOS color-space and galaxy population is calibrated from spectroscopic galaxies that are classed as DC3R2 targets (inclusive of overlap with main survey targets), and 4% from uniquely DC3R2 objects.
• However, even though this sample provides a very large quantity of high-quality spectra, we find that the combination of uncertain photometry and a variety of spectroscopic selection effects can produce substantial biases on the redshift inference for a given lensing redshift bin. We demonstrate that introducing a preference for brighter-magnitude calibration spec-zs, in the presence of photometric scatter effects or an intrinsically high $dz/dm$, biases the mean redshift of the bin at the order of $\Delta z \approx 0.01$, especially for higher redshift bins (see Fig. 10 for a breakdown). This effect is present even for the shallower testbed data used here and will be exacerbated for fainter samples, as it enters the inference as an induced magnitude dependence of redshift at fixed color. Fewer photometric bands further worsen the effect, as breaking degeneracies becomes more difficult (e.g. the uncertainty in mean redshift for the year 3 HSC grizy analysis, Rau et al. 2023, was larger than that of KiDS-450, Wright et al. 2020, by a factor of 1.25, even with KiDS having an additional bin).
• Results for this work are expressed in a color-space that is similar to past work on the Masters et al. (2017) SOM, as our map is a transformation of this space into KiDS-VIKING colors. As demonstrated in Fig. B1, we recover very similar galaxy SEDs and redshifts per cell, but comparisons like that of Fig. 3c ought to be taken with the knowledge that the colors and photometric noise levels between different surveys are not identical.
• Our analysis reveals a general agreement in the color-redshift relation with previous spectroscopic surveys that have explored this color-space.When accounting for effects induced by photometric noise, we also find agreement in the magnitude dependence of redshift at fixed color with the result of Masters et al. (2017).
• Photometry quality has an important role in redshift calibration. This study has found that photometric errors need to be well understood for modeling the color-redshift relation and especially magnitude-dependent effects ($dz/dm$). In order to constrain this slope for future surveys we require either better photometry than expected, deeper spectroscopy, or improved analysis methodology (discussed in Sec. 5.5). We find that the use of KV photometry in this map does not supply as strong a constraint on the slope $dz/dm$ as past data sets do. However, close examination of the systematic $dz/dm$ induced by photometric scatter in a high resolution SOM has strengthened the case for the null hypothesis in this work and in past survey data sets. While $dz/dm \approx 0$ is potentially consistent with our data, deeper photometry or survey simulation will be needed to constrain the slope sufficiently for future weak lensing efforts.
The spectroscopic redshifts measured by the complete 5-year DESI survey will provide unparalleled support for future redshift calibration in weak lensing surveys.To fully leverage the powerful quantity of data discussed in this paper, more accurate photometry and deeper spectroscopic redshifts will be necessary to constrain the magnitude dependence of redshift at fixed color.Exploration into the capability of massively multiplexed spectroscopic instruments, like DESI, to attain redshifts of fainter sources than currently targeted is important.
Looking ahead, future surveys will provide additional wavelength and sky area coverage that can improve photometric redshift calibration. Among these is a follow-on program to DC3R2 using the 4-metre Multi-Object Spectroscopic Telescope (4MOST; de Jong et al. 2019) that will observe targets across the same SOM used in this work, though more uniformly to redshift ∼1.55 (Gruen & McCullough 2023). Together with deeper spectroscopic campaigns (e.g. DESI-II, Schlegel et al. 2022) and campaigns including high-quality infrared spectroscopy, these data will form the basis for constraining a model of the galaxy population seen by deep photometric surveys, including its redshift distribution.
• Cosmic variance - The original C3R2 survey observes a different limited volume of the Universe compared to the overlap region of DESI and KiDS-VIKING used for DC3R2, with galaxy samples of limited number. This difference in cosmic variance can shift the redshift distributions seen in each cell. If each cell is well described as (), the impact of altered abundances on the calibrated color-redshift relation is minimal. This paper has found that, for the photometry used, the redshift distributions are comparably broader, in which case altered abundances may overrepresent certain contributors to the overall cell redshift. A small sample is also liable to miss cell contributors that are rare and may not accurately measure the tails of these SOM cell distributions. However, the DESI/KV overlapping footprint is sufficiently large that cosmic variance will be a minor concern.
• Photometric noise - KV photometry has a higher noise level than the COSMOS catalog that was used to train the original map. The effect of this is a scattering of galaxies between neighboring cells, effectively lowering the resolution of the map. This effect influences measurable quantities like $dz/dm$, as discussed further in Section 5.3, because this noise is asymmetric between cells.
• The SOM color transformation - As the filters covered by KiDS-VIKING are different from those used in Masters et al. (2017), a best fit SED for each cell was defined based on template fits to COSMOS narrow-band galaxies in Laigle et al. (2016) and evaluated in the relevant KiDS-VIKING bands to transform the color of that cell.
The median of all COSMOS galaxies in the original SOM produced the best fit SEDs for this evaluation. For a sample with overlap in COSMOS/KiDS, we ought to show that the same galaxies reliably get assigned to the same cell given sufficiently low photometric uncertainty. Before a joint analysis of the samples assigned to cells based on their original COSMOS photometry and based on KV photometry could be made, which we do not attempt here, it would first have to be shown that the color transformation does not introduce, e.g., a spurious magnitude dependence of redshift within color cells. We can see a broad recovery of COSMOS cell photo-z against the DESI spectroscopic median redshift per cell in Fig. B1, which indicates that this transformation produces a color map with very similar galaxies on a cell-by-cell basis. However, cell boundaries in color space could have minor distortions. Outliers at high $z$ are likely to result from spectroscopic incompleteness at $z \geq 1.6$. Photometric scatter is likely the largest contributor to disagreement at lower redshifts; we elaborate on its consequences more fully in Sec. 5.2 and take care to characterize it for future surveys.
• Differing redshift fitting algorithms in spectral modeling - Robustness tests on DESI's spectral modeling pipeline, redrock (Bailey et al. 2023), are expanded upon in Guy et al. (2023) and Lan et al. (2023). Additionally, for a small overlap of objects in common fields, we can directly compare the redrock modelled redshifts to those found by other surveys to test their reliability.
• Spectroscopic selection effects - With quality flags chief among them, we can cross check against other spectroscopic surveys to inform our redshift quality cuts that limit outliers. Perhaps even more importantly, we can look at our relevant findings as a function of redrock's $\Delta\chi^2$ to see if this selection affects quantities like $dz/dm$ or shifts our understanding of the (|). Certain galaxy types are more conducive to confident redshifts. Past studies have found that if color cuts that are not available to the weak lensing surveys are implemented on spectroscopic galaxies, redshift calibration errors can be as large as $\Delta z \approx 0.04$ (Hartley et al. 2020). This analysis mitigates this effect by removing detections that do not have sufficiently high completeness in a given cell, which accounts for these sample-specific color cuts (see Section 3.2.3).
• Cell assignment algorithm - The metric that is minimized for each galaxy to be assigned a cell can dramatically impact the distribution across the map. We attempt to imitate the cell assignment procedure from Masters et al. (2015), which minimizes contributions from flux measurements with larger error and from drop-outs. When running the assignment on the original COSMOS catalog, we recover an identical cell assignment in more than 99% of cases.
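As an illustration of how an assignment metric can down-weight noisy measurements and drop-outs, the sketch below uses an inverse-variance weighted distance in color space. This is an assumed, simplified stand-in and not the exact metric of Masters et al. (2015) or of this work; the function name, the array shapes, and the choice to give missing colors zero weight are hypothetical.

```python
import numpy as np

def assign_cells(colors, color_errs, cell_colors):
    """Return, for each galaxy, the index of the SOM cell with minimal chi2.

    colors, color_errs: arrays of shape (n_gal, n_col); NaN marks a drop-out.
    cell_colors: array of shape (n_cell, n_col) holding each cell's colors.
    """
    # A color contributes only if both its value and its error are usable.
    valid = np.isfinite(colors) & np.isfinite(color_errs) & (color_errs > 0)
    safe_colors = np.where(valid, colors, 0.0)
    safe_errs = np.where(valid, color_errs, 1.0)
    inv_var = np.where(valid, 1.0 / safe_errs ** 2, 0.0)  # zero weight for drop-outs

    # chi2[g, c] = sum over colors of (color_g - color_c)^2 / err_g^2
    resid = safe_colors[:, None, :] - cell_colors[None, :, :]
    chi2 = np.sum(resid ** 2 * inv_var[:, None, :], axis=2)
    return np.argmin(chi2, axis=1)

# Toy example (hypothetical numbers): two cells, two colors.
cell_colors = np.array([[0.0, 0.0], [1.0, 1.0]])
colors = np.array([[0.1, 0.9], [0.9, np.nan]])
color_errs = np.array([[0.05, 1.0], [0.1, np.nan]])
cells = assign_cells(colors, color_errs, cell_colors)
```

In the toy example the first galaxy is equidistant from both cells in an unweighted Euclidean sense, and the weighting resolves the tie in favor of the cell that matches its well-measured color. The second galaxy is assigned using only its one measured color.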
The slopes measured in Sec. 5.2 for $dz/dm$ were obtained in a way that is comparable to the procedure for the original measurement in Masters et al. (2017). The data points in Fig. 11 are drawn from unique pairs in a given SOM cell, producing of order $N^2$ points for a cell with $N$ spectroscopic galaxies. To fit the linear model, we take slices in ΔMAG_GAAP_Z and fit each histogram of points to a normal distribution with a median and width, $\sigma$. Tails in the distribution cause deviations from the fitted normal that become larger with increasing |Δ|, which may lead to a slight underestimation of the slope uncertainty. The fitting process is demonstrated visually in Fig. E1. The medians at the bin midpoints are then fit with a linear regression that produces our estimate of $dz/dm$ for a given sample.
For Fig. 12, where there are fewer data points for each cell than for the entire sample, only two slices in magnitude were used, chosen to jointly span [min(ΔM cell ), max(ΔM cell )].
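The pair-based slope measurement described above can be sketched as follows. This is a simplified illustration under stated assumptions: the function names are hypothetical, the per-slice normal fit is replaced by the sample median and a percentile-based width, and the synthetic catalog at the end (cell count, noise level, and input slope) is invented purely to exercise the code.

```python
import numpy as np

def pair_differences(cell_ids, mag, z):
    """Magnitude and redshift differences for all unique pairs within each cell."""
    d_mag, d_z = [], []
    for cell in np.unique(cell_ids):
        idx = np.flatnonzero(cell_ids == cell)
        if idx.size < 2:
            continue
        i, j = np.triu_indices(idx.size, k=1)  # each unique pair counted once
        d_mag.append(mag[idx[i]] - mag[idx[j]])
        d_z.append(z[idx[i]] - z[idx[j]])
    return np.concatenate(d_mag), np.concatenate(d_z)

def fit_dz_dm(d_mag, d_z, edges):
    """Median and width of delta-z in slices of delta-mag, then a linear fit."""
    mids, medians, widths = [], [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        in_slice = (d_mag >= lo) & (d_mag < hi)
        if in_slice.sum() < 2:
            continue
        p16, p50, p84 = np.percentile(d_z[in_slice], [16, 50, 84])
        mids.append(0.5 * (lo + hi))
        medians.append(p50)
        widths.append(0.5 * (p84 - p16))  # stand-in for the fitted normal width
    slope, _ = np.polyfit(mids, medians, 1)
    return slope, np.array(mids), np.array(medians), np.array(widths)

# Synthetic check with an assumed input slope of 0.03 per magnitude.
rng = np.random.default_rng(1)
n_cells, n_per_cell, true_slope = 50, 20, 0.03
cell_ids = np.repeat(np.arange(n_cells), n_per_cell)
mag = rng.uniform(20.0, 24.0, cell_ids.size)
cell_z = rng.uniform(0.2, 1.2, n_cells)
z = cell_z[cell_ids] + true_slope * (mag - 22.0) + rng.normal(0.0, 0.01, cell_ids.size)

d_mag, d_z = pair_differences(cell_ids, mag, z)
slope, mids, medians, widths = fit_dz_dm(d_mag, d_z, np.linspace(-3.0, 3.0, 13))
```

Because differences are taken only between galaxies in the same cell, the cell-to-cell variation in redshift cancels and only the within-cell trend with magnitude remains. Counting each pair once gives $N(N-1)/2$ points per cell.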

Figure 1. Footprint of spectroscopic redshifts used in this analysis, including main survey targets, depicting objects observed in SV (green points) and through the first 56 days of Y1 operations (blue points). The combined bright and faint footprint for the DC3R2 dedicated tiles is outlined in black. The broad KiDS-VIKING-N equatorial field provides the $ugriZYJHK_s$ photometry to match to DESI data, which is inclusive of KV DR4 (shaded blue) and DR3 (shaded green) and spans the overlap of KiDS-1000 with the DESI footprint (Kuijken et al. 2019). Particularly relevant for SV, prior to the KiDS-1000 release, the majority of our targets lie in the GAMA fields (red).
(a) Target abundance by $ugriZYJHK_s$ color self-organizing map cell in the relevant KiDS-VIKING 450 fields (G09, G12, G15), from which spectroscopic targets for DC3R2 were chosen for SV and the dedicated tiles. Objects included in this count meet either DC3R2 selections or any main survey (ELG, LRG, BGS) magnitude cuts in grz.
(b) Number of successful spectra taken per cell, across DC3R2 as well as all DESI main surveys (ELG/LRG/BGS), that pass redshift quality cuts.
(c) Fraction of spectra that meet our redshift quality selections at a given color, across all surveys, out of all objects observed. If the fraction of spectra in the cell with certain redshifts amounts to less than 90%, we consider the measured spectra incomplete and potentially biased and exclude the cell from the fiducial analysis.

Figure 2. The distributions of targeting, spectroscopy, and redshift completeness across the color-space.
(a) Median spectroscopic redshifts for a given cell for all DC3R2 and main survey targets that meet our quality cuts.
(b) Median spectroscopic redshifts for a given cell for the COSMOS field (as discussed in Sec. 2.4), from prior, non-DESI spectroscopy.
(c) The ratio of spectroscopic redshift counts between initial DESI efforts and prior COSMOS data for the same map, where solid red depicts color space exclusively added by DESI and dark blue does the same for COSMOS. The transition occurs where DESI doubled existing spec-z counts.

Figure 3. Median DESI redshifts compared to previously measured and curated spectroscopy in the COSMOS field, revealing substantial gains in the low redshift regions and in overall multiplicity. Note that these color-spaces are not identical and this comparison is a generalized one, as per the discussions in Sec. 4.1 and Appendix B.

Figure 4. Median DESI exposure times (in minutes) necessary for each SOM cell, for targets with a DECam fiber magnitude of 21, to reach a high redshift confidence of $\Delta\chi^2 = 40$, as per Appendix D. Full exposure time quantiles and median colors for each cell are reported in the attached data products. BGS targets are excluded from this plot, due to their typically much larger sky brightness. The color bar is chosen to approximately separate quiescent, passive galaxies (orange) from emission line galaxies (blue).

Figure 5. Depiction of the SOM with four complementary samples, post completeness cuts: Emission Line Galaxies (green), Luminous Red Galaxies (red), the Bright Galaxy Survey (blue) and DC3R2 selected targets (orange), where the strength of each color channel is directly proportional to the fraction of galaxies in that cell that come from the sample noted. The distributions of normalized redrock SED fits are shown for three choices of SOM cell that span star forming, star burst, and passive galaxy types, with a colored envelope for the 68% quantile region about the median of all templates shown in black. The shaded regions denote rest-frame wavelengths not observed by DESI according to the best fit redshift, and individual galaxy spectrum fits are drawn in gray. Normalization is done at a point of featureless continuum, 3000 Angstroms for star forming galaxies and 8000 Angstroms for mostly quiescent galaxies.

Figure 6. Distribution of the weights for main target classes. Over-represented spectroscopic targets will have lower weight by construction, in order to generate a more representative redshift distribution.

Figure 7. Existing spectroscopic coverage of the Rubin/Euclid color space, shown as a histogram of SOM cell counts against their median redshifts. Contributions from this work (DESI + KV photometry, blue) are combined with data taken in the COSMOS field for reference (black). The contribution of DC3R2 populated cells, weighted by the count of DESI spectra per combined spectroscopic count (red), demonstrates that DC3R2 jointly with DESI now provides a majority of spectroscopic calibration at redshifts below $z \approx 1.2$. The dashed line at $z = 1.7$ provides a coarse upper bound for the detection limit of the [OII] feature on silicon based detectors, and thus the upper redshift limit anticipated for these samples.

Figure 9. A simple redshift calibration scheme for the G09, G12, G15 fields of KiDS-450 using DESI + DC3R2 spectroscopy reveals that applying selection effects on the spectroscopic catalog can significantly underestimate the mean redshift of a photometric sample of the same observed color.

Figure 11. Plots depicting the systematic induced in $dz/dm$ by photometric scatter (left) and the raw measurement from the joint DC3R2-KV sample (right), where each data point is the difference in magnitude and redshift between a unique pair of galaxies occupying the same SOM color-cell. The best-fit slope (dashed red) is fit from a collection of medians and $\sigma$ (pink) in magnitude slices. If reported photometric errors are accurate, this systematic dominates our measurement.

Figure 12. Color dependence of $dz/dm$ in the raw data (left) and corrected for the contribution of photometric scatter to the overall slope (right).
(a) Raw measurement of the change in redshift with magnitude at fixed color, $dz/dm$, across the SOM, deresolved for statistical power into 5 x 5 super cells.
(b) Intrinsic change in redshift with magnitude at fixed color, $dz/dm$, as observed across the SOM, corrected to first order by subtracting off the contribution due to photometric scatter (i.e. $dz/dm_{\rm raw} - dz/dm_{\rm sc}$).

Figure A2. Depiction of the occupancy of the entire DESI-KV sample in the grz magnitude bins (3 x 6 x 4) used to produce the weight scheme in Sec. 3.2.3 that renders cell-level redshift distributions more representative.

Figure B1. Median DC3R2 redshifts of a SOM cell in the transformed KV mapping compared to the COSMOS photometric redshift of the same SOM cell prior to the transformation. This reveals the effect of both the SOM color transformation and the differences in sample depth. Broadly speaking, there is good agreement between redshifts for objects of the same colors.
at Kitt Peak National Observatory, and the Dark Energy Camera Legacy Survey (DECaLS) at the Cerro Tololo Inter-American Observatory. DECaLS extended the photometric catalog of the Dark Energy Survey (DES, e.g. see Sevilla-Noarbe et al. 2021 for photometry or Dark Energy Survey
<  < 20.175,  fiber ≲ 22.9 , + 1
During SV, DC3R2 took observations for 44,272 targets. In contrast, during the onset of Y1, 6,905 DC3R2 objects were observed.

Table 2: Consolidation of the shifts in mean redshift inferred for a tomographic bin with KiDS-VIKING-like cell abundances if different selections are made on the spectroscopic sample. $\Delta z$ is always the mean redshift of the new selection subtracted from the fiducial mean. The final row is a repeat of the first magnitude cut with the weights applied both before and after the selection. Error bars are produced by bootstrapping cells with replacement in the selection.
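A bootstrap over cells of the kind mentioned in the caption can be sketched as below. This is a minimal illustration and not the code used to produce Table 2: the function name is hypothetical, the sketch assumes the fiducial and selected samples share the same set of cells and cell weights, and the input numbers are invented.

```python
import numpy as np

def bootstrap_mean_z_shift(weights, z_fid, z_sel, n_boot=1000, seed=0):
    """Shift in weighted mean redshift (fiducial minus selection) and its
    bootstrap uncertainty, resampling SOM cells with replacement."""
    rng = np.random.default_rng(seed)
    n_cells = len(weights)
    shifts = np.empty(n_boot)
    for b in range(n_boot):
        pick = rng.integers(0, n_cells, n_cells)  # cells drawn with replacement
        w = weights[pick]
        shifts[b] = np.sum(w * (z_fid[pick] - z_sel[pick])) / np.sum(w)
    return shifts.mean(), shifts.std(ddof=1)

# Toy input (hypothetical): 100 cells, selection lowers mean z by about 0.01.
rng = np.random.default_rng(1)
weights = np.ones(100)
z_fid = np.linspace(0.1, 1.5, 100)
z_sel = z_fid - 0.01 + rng.normal(0.0, 0.02, 100)
shift, shift_err = bootstrap_mean_z_shift(weights, z_fid, z_sel)
```

Resampling whole cells, rather than individual galaxies, keeps the galaxies of a cell together, so the scatter of the resampled shifts reflects cell-to-cell variation in the effect of the selection.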