Beyond the 3rd moment: A practical study of using lensing convergence CDFs for cosmology with DES Y3

Widefield surveys of the sky probe many clustered scalar fields -- such as galaxy counts, lensing potential, gas pressure, etc. -- that are sensitive to different cosmological and astrophysical processes. Our ability to constrain such processes from these fields depends crucially on the statistics chosen to summarize the field. In this work, we explore the cumulative distribution function (CDF) at multiple scales as a summary of the galaxy lensing convergence field. Using a suite of N-body lightcone simulations, we show the CDFs' constraining power is modestly better than that of the 2nd and 3rd moments of the field, as they approximately capture the information from all moments of the field in a concise data vector. We then study the practical aspects of applying the CDFs to observational data, using the first three years of the Dark Energy Survey (DES Y3) data as an example, and compute the impact of different systematics on the CDFs. The contributions from the point spread function are 2-3 orders of magnitude below the cosmological signal, while those from reduced shear approximation contribute $\lesssim 1\%$ to the signal. Source clustering effects and baryon imprints contribute $1-10\%$. Enforcing scale cuts to limit systematics-driven biases in parameter constraints degrades these constraints a noticeable amount, and this degradation is similar for the CDFs and the moments. We also detect correlations between the observed convergence field and the shape noise field at $13\sigma$. We find that the non-Gaussian correlations in the noise field must be modeled accurately to use the CDFs, or other statistics sensitive to all moments, as a rigorous cosmology tool.


INTRODUCTION
The structure in the Universe -namely the distribution of matter -contains significant information on all kinds of physical processes; from the largest cosmological scales which probe the initial conditions of the Universe, to the galaxy and halo scales which probe both non-linear, gravitational evolution as well as baryonic imprints due to astrophysical processes, to the intra-galaxy scales where the gas and stellar phase space exhibit distinct structures from the rich physics of magneto-hydrodynamics.It is clear that the observed fields are abundant with information on both cosmology and astrophysics.It is then pertinent to question how best to extract the information from these fields, i.e. how best to maximize the constraints we can place on physical phenomena through measurements of these fields.
In the scenario where the field is a mean-zero Gaussian random field that is isotropic and homogeneous, the only degree of freedom for the field is the covariance between the pixels/voxels in real-space (or alternatively, the power spectra in Fourier-space).In such a scenario, it is clear that the maximal constraining power is obtained by measuring the power spectra, i.e. the only degree of freedom.For cosmological fields, the initial conditions seeding structure formation are Gaussian to a very good approximation, as has been verified by the cosmic microwave background (CMB) observations (Planck Collaboration 2016b, 2020), and a large part of the cosmological information in the resulting late time density field is still Gaussian, i.e. encoded in the variance of the field.Thus, the power spectra are a good way to extract information from the late-time fields as well.
However, there still remains significant, additional information beyond the power spectra.Even in the fiducial ΛCDM case -where ΛCDM is the cosmological model with cold dark matter (CDM) and the cosmological constant Λ -and the initial conditions contain no primordial non-Gaussianities, the presence of nonlinear, gravitational evolution generates signatures beyond the power spectra.This is commonly called "higher-order information" 1 and represents information in the field that is not captured by the power spectra.Such information still encodes signatures from cosmological and astrophysical processes, and is often highly complementary to the 2-point constraints; as a result, the combination of power spectra with higher-order information leads to constraints that are better than the trivial sum of the individual parts (e.g., Fluri et al. 2018Fluri et al. , 2019;;Gatti et al. 2020;Zürcher et al. 2021;Gatti et al. 2022;Fluri et al. 2022;Lanzieri et al. 2023) There exists a rich body of literature on different, complementary ways to extract this non-Gaussian information from continuous scalar fields like the density field or the weak lensing convergence field.The N-point correlation functions (or their Fourier equivalents, the poly-spectra) are the most well-known and widely used statistic, and measure the correlation of  points in space, where the points are separated by some distances.For  = 3, these statistics are computationally expensive to compute, and for  = 4 they are mostly prohibitive unless measured in specific limiting cases.Given this, many alternative methods have been explored to capture some/all of this information in a computationally inexpensive way.Some of the most commonly known/used methods include moments (Petri et al. 2015; Peel et al. 2018; Chang et al. 2018;   1 Power spectra are referred to as "2-point statistics" and they capture up to second-order information as they are fundamentally a variance measure and contain two orders of the field."Higher-order" here refers to higher than second-order information, which needs to be captured by beyond 2-point statistics, or sometimes referred to as "higher-order statistics".Gatti et al. 2020Gatti et al. , 2022)), Minkowski Functionals (Mecke et al. 1994;Blake et al. 2014;Petri et al. 2015;Parroni et al. 2020), densitysplit statistics (Gruen et al. 2018;Friedrich et al. 2018) and more.Similar statistics exist for the discrete fields, such as counts-incells (Baugh et al. 1995;Adelberger et al. 1998) and the k-Nearest Neighbor (kNN) distributions (Banerjee & Abel 2021a,b).For the weak lensing field, the 3-point information has been pursued either through the direct measurement or approximate summaries like the density-split statistics (Gruen et al. 2018;Friedrich et al. 2018), mass aperture moments (Secco et al. 2022b), field moments (Petri et al. 2015;Gatti et al. 2020Gatti et al. , 2022)), and integrated shear functions (Halder et al. 2021).Weak lensing peaks (Kratochvil et al. 2010;Shan et al. 2018;Martinet et al. 2018;Zürcher et al. 2022) probe a specific, fixed combination of N-point functions, as is the case with other statistics like cosmic void distribution functions (Davies et al. 2021) and persistent homology (Heydenreich et al. 2021(Heydenreich et al. , 2022)).Field-level inference tools are also employed (Fluri et al. 2018;Jeffrey et al. 2020;Fluri et al. 2019Fluri et al. , 2022)), while others explore machine learning-informed, but still interpretable, statistics such as scattering transforms (Cheng & Ménard 2021) and wavelet phase harmonics (Allys et al. 2020).
An outstanding question is identifying the "maximally" informative statistic for summarizing, and extracting constraints from, the fully nonlinear late-time density/convergence field.This is an unsolved problem given we do not a priori know the exact cosmological information contained in the different non-Gaussian signatures (including those beyond the 3-point function) across both linear and non-linear scales.Thus, to ensure we use all the available cosmological information in the field, it is desirable to consider statistics that capture all orders of statistical information (rather than just one order, or a specific combination of orders).The kNN distributions have been formally shown to be such a statistic for discrete tracers (Banerjee & Abel 2021a) as they capture volume integrals of all Npoint auto/cross-correlation functions of the field.While these kNN distributions are constructed for discrete tracer fields, Banerjee & Abel (2023) demonstrated that the analogous statistic for continuous fields are the CDFs of the field smoothed on different length scales.
The CDFs -or the probability distribution functions (PDFs), which are interchangeable ideas given they are connected by a linear integral transform -are the main statistic of focus in this work and have been theoretically known as a good non-Gaussian statistic for lensing fields since more than two decades ago (Jain et al. 1998;Kruse & Schneider 2000).The CDF is also an intuitive, visually informative statistic for non-Gaussian features and is often used to check and validate reconstructed lensing fields (White & Hu 2000;Chang et al. 2018;Jeffrey et al. 2021).Previous works have also shown that the lensing PDF significantly improves constraints in CDM compared to the standard 2-point functions (Giblin et al. 2023), while more works have shown the utility of the 3D matter density PDF in probing both CDM and other extended cosmologies (Uhlemann et al. 2020;Friedrich et al. 2020;Boyle et al. 2021;Gough & Uhlemann 2022;Cataneo et al. 2022).
While the benefits of using the CDF -namely the level of cosmological non-Gaussianity it can capture -have been explored in the past, this has mostly been in the more idealistic regime where some key observational factors were not included in the analysis.Thus, while we have had a prior understanding of the benefits of using PDFs/CDFs of the lensing field, we currently have an incomplete picture of the practical challenges in using this statistic to infer cosmological constraints.
In this work, we measure the CDFs of the lensing field from the first three years (Y3) of the Dark Energy Survey (DES) data and validate that the common lensing systematics -such as point spread function (PSF) contributions, reduced shear approximation, source clustering, and baryon imprints -have an impact on this statistic that is either negligible or can be adequately mitigated.Many of these tests have been extensively performed for 2-point statistics (Gatti & Sheldon et al., 2021) and have also been done for some 3-point statistics (Secco et al. 2022b;Gatti et al. 2022).The CDFs are sensitive to information at all orders, and validating the impact of these observational/modelling systematics on the CDFs also provides validation for higher-order information beyond the 3-point.This work is organized as follows: first, we introduce the formalism for the CDFs in §2.In §3 we describe the datasets and simulations used in this work, as well as the procedures used to forward-model the simulations to match the DES Y3 data.In §4 we define the data vector used for the rest of this work, and also demonstrate the Fisher constraining power of the CDFs for DES Y3-like data.In §5 we measure the CDFs on the DES Y3 weak lensing maps, and quantify the signal-to-noise of the measurements.We then validate the impact of different effects -PSF contributions, source clustering, reduced shear approximation, and baryonic imprintson this statistic and discuss any scale cuts required to mitigate these effects.Finally, we conclude in §6.

CDF FORMALISM
We begin in §2.1 by describing the formalism of the CDF statistics used in this work, including the exact measurement procedure.In §2.2 we briefly review the k-Nearest Neighbor distributions, which are a recently introduced statistic for discrete tracers that summarize all higher-order information, and we discuss how the analogous, continuous-field statistic is the CDF.Finally, in §2.3 we validate the CDFs using Gaussian fields.Note that the CDFs are closely related to other statistics in the literature and we will describe these later on in §6.

Cumulative Distribution Functions
The CDFs2 used in this work are defined as follows.Given a set of uniform/random points in a field, with spheres of radius  around each point, the CDFs summarize the fraction of spheres that have an enclosed density -i.e. the mean density within radius  -that exceeds a chosen threshold.In 2D, the density becomes a surface density, Σ, and the radius is a projected aperture, .The calculation of the fraction of points whose enclosed surface densities exceed a threshold can be formally written down using the following expression, where   ≡ (< ) is the average surface overdensity within an aperture .This measurement can also be trivially modified to use the surface density, rather than overdensity, just switching  → Σ(1 + ), where Σ is the mean surface density field.It can also be done with the surface mass, by simply multiplying the surface density with the aperture area associated with scale .
For a given map, the CDF measurement is performed as follows: First, we fill the map with a grid of points.Without loss of generality, we take these points to be located at the center of the HEALPix pixels (with NSIDE = 1024), as this greatly simplifies the calculations.Increasing the number of points in the grid (i.e. the number of pixels) will improve the precision of the measurement, as is the case with the traditional 2-point correlations.
Second, we pick a certain aperture scale, , and for each point we compute   , the convergence smoothed on scale .The smoothing is done in harmonic space using a harmonic tophat filter where  1 () is the Bessel function of the first order.The choice of tophat over a Gaussian filter is because the former allows for an easy interpretation of an enclosed quantity within a given physical scale.
Our computing procedure is the same for any other choice of filter as well.
Third, we measure what fraction of the grid points satisfy the inequality in equation 1, which is the probability, (  > ).The choice of thresholds is a degree of freedom in the measurement, and we describe our choices in Section 4.1.
Fourth and finally, steps 2 and 3 are repeated for a range of scales and thresholds to extract the distribution, (  > ), for different choices of .The exact choice of scales and thresholds used in this work is described in Section 4.1.
Figure 1 illustrates how the CDFs are constructed in a given field, and highlights some generic features of the CDFs.In the limit where the variance  2 → ∞, we expect (  > ) → 0.5, and where  2 → 0, then we expect (  > ) → 0 if  > 0, and (  > ) → 1 if  < 0. In Figure 1 we see that all curves are closer to  = 0.5 on small scales where the field's variance is high compared to the threshold values, and move towards  = 0 or  = 1 on large scales where the large smoothing scale suppresses the field's variance to values lower than the thresholds.Additionally, we see (  > 0) ≈ 0.4 at small-scales, where the distribution is log-normal (see top panels of Figure 1) and so the median of the distribution is not the same as the mean, ⟨⟩ = 0.At large-scales, we find (  > 0) ≈ 0.5 as the distribution becomes more Gaussian.
Thinking in 3D space, the CDFs extract (>  | ), the conditional distribution of the enclosed mean density given radius, as well as ( | > ), the conditional distribution of radii or volumes given a density threshold.These two distributions can be related using Bayes' theorem, . (3) Note that given the enclosed density  and spherical radius , we can easily obtain a mass  ≡ 4 3  3 .So the above can be rewritten as .
(4) Equation ( 3) better elucidates the connection between the CDFs and the ideas from halo collapse.The quantity (> 200  | ) is simply the fraction of volumes that contain a halo, where the halos are identified/defined as overdensities of at least  > 200  , with   being the critical density of the Universe.We can also generalize the CDF formalism to multi-field probes by computing the joint CDFs of multiple fields; this is simply, where   ,1 and   ,2 are two different fields (eg.different tomographic bins of a single type of field, or different types of fields).While we are allowed to choose different values for the thresholds  1 and  2 , we will enforce  =  1 =  2 henceforth for simplicity in the data vector.In this work, we will consider the cross-correlation between tomographic bins as part of our measurement.Note that the 2-field version of the CDFs formally contains all the 1-field information as well.This connection is identical to how 2D PDFs contain the marginal 1D distributions within them. 3We will use both 1-field and 2-field CDFs as part of our main data vector.The 3-field and 4-field CDFs will formally have additional information beyond the 1-field and 2-field CDFs, though our tests have shown there is only marginal improvement in cosmological constraints for the analysis choices described here (e.g., tomographic bin, angular scales, and thresholds).
For some tests, we will also postprocess the 2-field CDFs to isolate just the cross-covariance/correlation.This is done by performing the redefinitions described in Banerjee & Abel (2021b), which takes the joint probability to exceed in two different fields 3 A simple example is the 2D CDF,  (  ,1 >  1 ,   ,2 >  2 |  ) taken in the limit  2 = −∞.In this case,   ,2 is always above the threshold  2 and so the 2D CDF reduces to a 1D CDF, and removes the product of the individual probability to exceed for each field.The quantity  1,2 () is 0 if the fields are completely uncorrelated, and non-zero otherwise.The sign of  1,2 (), for any threshold , indicates the sign of the correlation between the two fields at that threshold.We can also extend this formalism to more than 2 fields (eg.an ABC correlation of three fields).While we do not consider such measurements in our analysis here, we note their potential utility both for cosmological information, but also as further compressions of the data vector.Note that there is no benefit to repeating a field twice (eg.AAB) if we also fix the threshold  for all the fields.The joint probability ( 1 > ,  1 > ,  2 > ) is exactly similar to ( 1 > ,  2 > ).
While we have discussed the CDFs in terms of lensing convergence, it is not necessary to be limited to this quantity.For example, one could consider the kinetic or thermal Sunyaev-Zeldovich fields (Sunyaev & Zeldovich 1972;Carlstrom et al. 2002), which are generated by baryons in halos and thus inherit the non-Gaussian features of the structure traced by these halos.

Connection to kNN distributions for discrete fields
The kNN distributions (Banerjee & Abel 2021a,b) are a novel way to summarize the clustering in a field of discrete tracers, such as galaxies or halos.They have been formally shown to capture volume integrals of all N-point functions of the tracer field, but can be computed in O ( log ) time, where  is the number of tracers.Thus, they have the same computational efficiency as a 2-point correlation function, but capture integrals of all the information held in the N-point functions (2-point, 3-point, 4-point, etc.).This statistic has already been measured in observational data, particularly to quantify the signal-to-noise of all correlations (both Gaussian and non-Gaussian) in a clustered field (Wang et al. 2022).
The kNNs are computed by taking a field of tracers with a known number density  tr , and then generating a large set of random points in this field as one would for computing an N-point clustering function (although a set of uniform points would be a sufficient choice as well).For each point, one computes the distance to the nearest tracer neighbor.The distribution of distances to the  th nearest neighbor forms a kNN distribution.This statistic is probing the distribution ( | >  tr ), i.e. the distribution of volumes that contain at least  tr tracers, where  tr takes integer values.Assuming spherical volumes, this can be reformulated as the distribution ( | >  tr ).Given kNNs depend on the counts of tracers enclosed within a volume, it is sensitive to volume integrals of all the correlation functions.However, the fact that the sensitivity is to a volume integral of the functions means signals from specific configurations of the N-point functions will be mixed together. 4n the limit of  tr → ∞, the number counts threshold >  tr becomes a density threshold > , and the conditional distribution becomes ( | > ) which can be related, using Bayes' theorem, to the distribution probed by the CDFs, (>  |).A detailed discussion on this connection between kNNs and CDFs be found in Banerjee & Abel (2023, see their Section 2.1).The analytic connection between the two statistics directly confirms that the CDFs can be formally shown to contain all volume integrals of higherorder functions, and this makes them better suited for summarizing a field, where we do not apriori know the exact cosmological information contained in all the non-Gaussian signatures of the field.In addition, this connection means the CDFs are the natural statistic to cross-correlate discrete and continuous fields while using the kNN formalism for the former (Banerjee & Abel 2023).

Consistency relations for Gaussian fields
In the Gaussian limit of (  ) = N (  ; , ) -where N is a normal distribution with mean  and variance  2 -there are three degrees of freedom for the CDF: the mean and variance of the map at each aperture scale, and the threshold .The threshold is an input parameter, and the mean of the map is taken to be  = 0 given  is derived from the overdensity field and so is defined as a perturbation field with the mean background subtracted.Thus, the variance is the only unconstrained parameter, and this variance can also be measured directly on the map.Formally, a Gaussian CDF is parameterized as, We can thus use the variance measured from the map smoothed on a given scale, , to predict the CDFs at that scale.For a purely Gaussian field, the measurements and predictions must agree.The correlations depend on just distance, .For N-point correlations of N > 3, the geometry connecting the N points will contain additional information, though the exact information contained in this geometry remains an open question.The 2-field CDFs averaged over an ensemble of 1000 noiseless, full-sky, Gaussian convergence maps.The () for the two fields corresponds to the third and fourth tomographic bins of DES Y3.The solid lines are converted to dotted styles when the CDF is outside the range [0.003, 0.997] (approximately corresponding to the 3 bounds).The black dashed lines show the predictions for the CDFs given the covariance of the two fields at a given smoothing scale, ; under the assumption the fields are Gaussian, the predictions must match the measurement.Bottom: The difference between the CDF measurement and Gaussian-field predictions, Δ =  meas −  theory normalized by the uncertainty in the CDFs -where the uncertainty is cosmic variance and is the observational limit for measurement uncertainty -estimated from the 1000 realizations.The gray band shows Δ/ < 0.1.In all cases the difference, Δ, is within this region and is completely negligible.
same exercise is trivially extended for the 2-field CDFs.In the Gaussian limit, the joint PDF of any set of fields is given by a multivariate normal distribution, where the column vector ì  = { 1 ,  2 , . . .  } are the kappa value in each field, and denote the point in multi-dimensional space where we evaluate the probability.The PDF in Equation ( 8) can be integrated, assuming some set of thresholds for each field, to obtain the CDF.Recall that in this work we set all thresholds to the same value .We also use  = 0.The unknown degrees of freedom for the distribution are then entirely in the covariance matrix.Thus if we know this covariance matrix, we can always predict the CDFs exactly.
We verify this in Figure 2 for our analysis setup.The top panel shows the 2-field CDF measured on noiseless, simulated maps whose signal mimics the DES Y3 data used in this work (see Section 3.2 for more details).In particular, the convergence map has the same redshift distribution as the 3rd and 4th tomographic bins.These are all Gaussian maps made by postprocessing N-body products, as detailed below in Section 3.1.4.The dashed lines (prediction) are consistent with the solid ones (measurement).The bottom panel shows the Gaussian model predictions are within 0.05 of the measurements, where the  of the data vector is just cosmic variance and thus represents the observational limit in precision.

DATA
We first describe in Section 3.1 the different simulations used in our analysis.We then detail the DES Y3 data in Section 3.2 and in Section 3.3 we describe how the simulated maps are forward modelled to imitate the DES Y3 data.
All maps used in this work are made with the Healpix convention of NSIDE = 1024.This corresponds to a pixel scale of 3.2 arcmin.The one exception are the products used from the Cosmogrid suite, described in Section 3.1.3,which are NSIDE = 512.

Simulations
While the CDF is a statistic that can be used to summarize any scalar field, in this work we are specifically interested in the lensing convergence, , which is a line-of-sight integral of the density field, where   is the redshift of the "source" plane/galaxies being lensed, n is the pointing direction on the sky,  is the overdensity field,  is the comoving distance from an observer to a given redshift,  is the scale factor,  0 is the Hubble constant, Ω m is the matter energy density fraction at  = 0, and  is the speed of light.We use the shorthand (  ) ≡   and (  ) ≡   .We model this convergence using full-sky density maps from different N-body simulations, with each simulation serving a different purpose in this work.We detail these different simulations below.Such simulations are uniquely suited for modeling these fields in the non-linear regime.For quasi-linear and linear regimes, analytic models can also be utilized (e.g., Barthelemy et al. 2023)

Anbajagane23 simulations (A23)
In this work, we use a suite of N-body simulations run with the Pkdgrav3 solver (Potter et al. 2017), where the suite has been specialized for performing Fisher forecasts for widefield surveys.This simulation suite will be formally introduced as part of a future work (Anbajagane et al., in prep.), and we describe here just the essential features of the runs and the relevant data products used in this work.The A23 simulations are run in 1 ℎ −1 Gpc boxes, starting at  = 127, with  = 512 3 dark matter particles.The initial conditions for all the A23 simulations are obtained from the Quijote suite (Villaescusa-Navarro et al. 2020), and so the simulations are essentially lightcone runs of the Quijote simulations specialized for widefield survey analyses.The original Quijote suite was designed for studying the Fisher information of the nonlinear structure, as well as building emulators sampling different cosmological parameters, but the data products are inadequate for producing mock lightcones of the lensing/density field.These products include snapshots and halo catalogs at only five redshifts, which is too coarse a redshift resolution for building lightcones.Hence we have rerun a subset of these simulations to create accurate full-sky lensing and density maps.
The A23 suite contains 100 simulations for computing the derivatives of the lensing/density field with respect to multiple cosmological parameters, of which three are of interest to us -Ω  ,  8 , and .The suite runs 100 simulations with parameter values slightly higher than the fiducial, and 100 simulations with values slight lower than the fiducial, and these two sets are used compute these derivatives.The fiducial cosmology is from Planck Collaboration (2016a), and the derivatives are computed over differences of ΔΩ  = 0.02, Δ 8 = 0.03 and Δ = 0.05, which are all the same settings as the Quijote suite.The A23 suite also has 2000 simulations at the fiducial cosmology which are used to compute the covariance matrix for our data vector.Since each all-sky map can have 4 completely independent DES footprints, we have a total of 8000 realizations to use for the covariance, and 400 realizations to compute the derivatives for the Fisher forecast.
While the original Quijote suite was run using Gadget3 (last described in Springel 2005), we use Pkdgrav3 which has already been employed extensively to perform both theoretical studies of the lensing field as well as simulation-based analyses of data from different weak lensing surveys (Fluri et al. 2019;Gatti et al. 2022;Zürcher et al. 2022).The Pkdgrav3 solver automatically builds lightcones as it solves the gravitational dynamics of the system forward in time, and so our final outputs are the lightcone shellsi.e.Healpix maps -of the density field at different redshifts.The simulation box is tiled/repeated as needed to construct large enough volumes to then build full-sky lightcones to a given redshift.This repetition will bias any large-scale correlations in the lightcone, but in this work we only consider scales much smaller than the box size.
The simulations have a total of 100 timesteps/shells, with 95 shells between 0 <  < 10.This gives us a high redshift resolution of between Δ ≈ 0.01 − 0.05 in that redshift range, with the exact value depending on the shell.The timesteps in this redshift range are spaced uniformly in proper time, , and this corresponds to different  and comoving distances depending on the cosmology.These density shells are then post-processed via Equation ( 9), with the integral over   replaced by a simple discrete sum, to create a lensing convergence field at each source plane redshift,   .This technique uses the Born approximation, which computes the total effective deflection due to lensing but along an undeflected ray path.A more precise calculation uses full rayracing, which calculates these deflections while constantly deflecting/updating the ray path.Petri et al. (2017) found the Born approximation leads to differences of ≲ 5% for the third moments statistic we will use in Section 4.2, but this is subdominant to the current uncertainties of ≈ 15%.
Note that we have not performed any resolution-convergence tests.The numerical requirements for this work are less stringent as we do not use the simulations for cosmological inference, but rather for (i) performing a Fisher analysis (Section 4.2), where the relevant quantities are relative and not absolute differences in the simulations as we vary cosmological parameters, and for (ii) computing covariance matrices for our systematic checks (Section 5).

Takahashi17 simulations (T17)
The Takahashi17 simulations (Takahashi et al. 2017) are a suite of N-body simulations run at a WMAP9 cosmology (Hinshaw et al. 2013), and have a higher particle resolution than the A23 suite, with 2048 3 particles.They, however, have lower redshift resolution than the A23 suite with 38 shells between 0 <  < 5.The shells are spaced equally in comoving distance, with widths of 150 Mpc ℎ −1 , and this leads to redshift spacing of roughly  ∼ 0.05 − 0.2.The T17 simulations have been used to model/test higher-order statistics in many works (Gatti et al. 2020;Secco et al. 2022b;Munshi et al. 2023;Gong et al. 2023;Heydenreich et al. 2023) for modelling, validation etc. and so we measure our statistics on these simulations for completeness.There are 108 independent full-sky maps, and that gives us a total of 432 DES Y3 cutouts.

Cosmogrid
Cosmogrid is a large suite of simulations that span the CDM parameter space, including the sum of the neutrino masses, and are designed for simulation-based modelling of widefield survey data (Kacprzak et al. 2023).They were run using Pkdgrav3, similar to the A23 simulations, and have a 900 Mpc/ℎ box size with 832 3 particles.The simulations are run at 2500 points spanning the parameter space, with 7 realizations at each point.They have 140 timesteps, with 70 equally spaced steps in proper time between 4 <  < 99, and another 70 equally spaced steps in proper time between 0 <  < 4. The spacing is different in each of the two regimes.
In this work, we use Cosmogrid to test the impact of baryons on the lensing CDF statistic.For this purpose, we use the fiducial runs which are 200 simulations run at fixed cosmology (Kacprzak et al. 2023, see their Table 2).We use both the default N-body run as well as the run postprocessed using the method of Schneider et al. ( 2019) so the density field looks like that of a hydrodynamic simulation with baryons.We discuss this more in Section 5.6.While the raw maps are available at NSIDE = 2048, the maps we used to study baryons have NSIDE = 512 -which is lower than the fiducial resolution of NSIDE = 1024 used in thus work -and we discuss the impact of this in Section 5.6 as well.

Gaussian maps
For the purpose of validating non-Gaussian statistics, it is useful to have maps that are purely Gaussian -i.e. are represented entirely by a power spectrum -rather than ones that contain a realistic level of non-linearity/non-Gaussianity.We use the power spectrum measured on the N-body maps, which contain the relevant nonlinearities, to then create consistent Gaussian maps.These maps will by construction have the same non-linear power spectra as the original maps.The method employed for doing this is the same as Giannantonio et al. (2008, see their Appendix A).It involves computing all auto-and cross-spectra between the relevant fields on the simulated maps, and then using these spectra with random phases to generate spherical harmonic modes  ℓ that are appropriately correlated to reproduce the input auto-and cross-power spectra.The  ℓ can then be transformed to obtain a real-space map.By definition, such maps will have no higher-order information and be described entirely by their power spectra.
If we have two maps  and  , and want to generate Gaussian maps that have the same auto and cross-power spectrum as  and  , we obtain the  ℓ via where  ℓ is a complex random normal variable with zero mean and unit variance, and    are coefficients derived from the power spectra, with a general form given by, and Equations ( 10) and ( 11) above have been reproduced from Omori (2022, see Appendix C).
For producing real maps, the  = 0 coefficients must be handled separately as they should have no imaginary component (see Appendix B in Sellentin et al. 2023, for an example).Thus, we explicitly remove their imaginary component, by setting Im( ℓ=0 ) = 0, and then rescale the coefficients as  ℓ=0 → √ 2 ℓ=0 .5 From these final  ℓ values we generate the Gaussian maps using the healpy routine, alm2map.
Note that when we post-process the Gaussian maps to mimic the DES year 3 observations (see Section 3.3), only the true convergence field is Gaussian.The procedures applied to the field to postprocess it -such as non-Gaussian noise, and survey masks of complicated geometries -will still induce a non-zero non-Gaussianity in the final simulated convergence field, but these non-Gaussianities will not be cosmological in origin.

Dark Energy Survey Year 3 (DES Y3)
The Dark Energy Survey (The Dark Energy Survey Collaboration 2005) is an optical imaging survey of 5000 deg 2 of the southern sky, and is currently the largest precision photometric dataset for cosmology.We use the data from the Year 3 data release (Sevilla-Noarbe et al. 2021), and in particular the galaxy shape catalogs.This is the same dataset used for the fiducial 2-point correlation function shear results (Secco & Samuroff et al., 2022a;Amon & Gruen et al., 2022) and harmonic power spectrum results (Doux et al. 2022), as well as the higher-order statistics such as the moments (Gatti et al. 2022), mass aperture (Secco et al. 2022b), and peaks (Zürcher et al. 2022).In this work, the Y3 Metacalibration galaxy shape catalog (Gatti & Sheldon et al., 2021) is used to make a map of the ellipticities, which is then converted into a convergence map via the Kaiser Squires method (Kaiser & Squires 1993).This is the same technique used in previous works on the mass map (Chang & Pujol et al., 2018;Jeffrey & Gatti et al., 2021).We perform all our measurements and tests on these maps.
We also use the DES Y3 PSF and reserved star shape catalogs from Jarvis et al. (2021) to estimate the impact of PSF contributions to the signal observed by our statistic.The shape catalogs are used to make a PSF "mass map" the same way the galaxy ellipticities are used, and this mass map is used to test the PSF contributions (see Section 5.3 for more detail).The same star shape catalog was used to test PSF contributions for both the shear 2-point function (Gatti & Sheldon et al., 2021) and the 3-point function (Secco et al. 2022b;Gatti et al. 2022).

Making simulated DES Y3-like Mass maps
We modify the simulated convergence/mass maps described in the above sections to include all the relevant observational effects of the DES Y3 data.Note that the main purpose of the maps is both to perform realistic forecasts of the cosmological constraints (Section 4), and to validate the contribution of different systematics to the CDFs data vector (Section 5).In this work, we do not use these simulations to get cosmology constraints from the DES Y3 data vector.
To make the mock maps, we start from the true convergence field, , and use an inverse Kaiser-Squires (KS) transform (Kaiser & Squires 1993) to obtain the two shear components,  1,2 .The shear is the true observable of a weak lensing survey given we measure galaxy shapes.The KS transform can be quickly performed in harmonic space as where the subscripts denote the E-mode and B-mode shear/convergence maps respectively.In the full-sky limit, where we have no survey masks, this is an exact expression.The technique has been validated for realistic data and found to have adequate accuracy (Chang & Pujol et al., 2018;Jeffrey & Gatti et al., 2021).
Redshift distribution/bins: We use four tomographic redshift bins with source galaxy () distributions matching DES Y3 (Myles & Alarcon et al., 2021); the mean redshifts of these bins are  mean ∈ {0.336, 0.521, 0.741, 0.935}.The true shear maps corresponding to each bin are obtained via a weighted sum of the shear maps in each redshift, where the weights are the () distributions.
Noise realization: The noise is obtained using the DES Y3 Metacalibration shape catalog from Gatti & Sheldon et al., (2021), using the same technique as Gatti et al. (2022).The galaxy shapes are randomly rotated to remove all spatial correlations of the galaxy ellipticities, thus removing any cosmological signal.We then place galaxies in pixels of a NSIDE = 1024 map, and compute the weighted average of the shear components in each pixel of the map,  noise 1,2 ( n), using the weights provided in the catalog.We add this noise to the true shear maps,  1,2 , separately for each tomographic redshift bin.This ensures the Y3 data and the simulated noise maps have the exact same variations in source/survey depth, and as we will show later, these variations create a strong non-Gaussian feature in the map (Section 5.2).
Multiplicative bias: The measured galaxy shapes have a bias of the order 1% that has been calibrated using large suites of image simulations of the DES Y3 survey (MacCrann et al. 2022).We include these bias terms, , in the maps by simply multiplying the shears as  1,2 → (1 + ) 1,2 .
Mask: We only use map pixels that have at least one DES Y3 galaxy in each of the four redshift bins.All pixels that do not fall into this category are discarded, and this defines the survey mask which is used in all further analyses, both for the simulations and for the DES Y3 data.
Kaiser-Squires reconstruction: Following the steps above, we obtain a spin-2 shear field,  1,2 , per DES Y3 tomographic redshift bin, that has noise, multiplicative bias, and a mask applied to it.We then convert this back to a convergence field using equation ( 12) to obtain a noisy convergence map for each redshift bin.We only use the E-mode convergence map in our analysis.This map is then used as our final DES Y3-like map.Other, more sophisticated map-making techniques have been explored in the Y3 data as a replacement to KS reconstruction.A detailed description can be found in Jeffrey & Gatti et al., (2021).The KS method remains the simplest method that is also quick and accurate.The simplicity in compute time is a particularly attractive feature here as we make O (10 4 ) mock DES Y3 maps in this work.Note that the mass maps we generate from DES Y3 data in Section 3.2 are also created by making the shear maps  1,2 and using the KS transformation to obtain the convergence field.
In Section 5, we will add other effects to the mock maps -such as PSFs, higher-order shear effects, and so on -to test their impact on the measured signal and quantify which effects can be safely ignored and which effects may require scale cuts on the data vector.We do not address the impact of intrinsic alignments in this work, as we treat it as a systematic that can be properly modelled, and thus marginalized over, in a full cosmological analysis as opposed to an effect that contaminates the data vector and requires scale cuts.

CDF ANALYSIS SETUP AND FISHER CONSTRAINTS
We define the CDF data vector for DES Y3 in §4.1 and show the Fisher information in this CDFs data vector, as well as data vectors of other closely-related statistics, in §4.2.

Defining CDFs data vector
In this work, we measure all possible 1-field and 2-field CDFs for the four tomographic bins of DES Y3.This results in four 1field "auto" CDFs, and six 2-field "cross" CDFs.We measure the CDFs across 10 smoothing scales, spaced logarithmically between 3.2 ′ and 200 ′ .The choice of scales matches the moments-based DES Y3 analysis of Gatti et al. (2022).For each scale, we use 7 thresholds  ∈ {−20, −6, −2, 0, 20, 6, 20} × 10 −3 .These were chosen by looking at the variance of the field at the smallest and largest smoothing scale, and ensuring at least two thresholds did not asymptote to 0 or 1 at each scale.Using the Fisher forecast below we have checked that these thresholds probe most of the relevant information while being practical to implement, and we do not perform a more methodic study of the optimal threshold choices.We have, however, verified that removing any one of the seven thresholds leads to a fractional change in the constraints of 5% to 10%.We did not test adding more thresholds as the longer data vector leads to poorer numerical convergence, which then makes it difficult to robustly identify the increase in constraining power provided by the additional thresholds.
For all CDF measurements, we only focus on the range of scales where 0.05 < CDF(, ) < 0.95, which excludes the ∼ 2 region of the distribution for each threshold  and smoothing scale .This removes measurements of the tails of the distribution where noise can cause spurious signals, and it also helps remove regions where the CDF has asymptoted to constant values of 0 or 1.We have confirmed that using different choices, such as 3 or 4 cuts, leads to a fractional difference of < 5% in the Fisher constraints.While the tails of the distribution are a sensitive probe of the non-Gaussian information, they are also much noisier and so the actual constraining power from this region of the distribution is not significant.The "bulk" of the distribution -for example, the 1 to 2 region -is still quite sensitive to non-Gaussian features while being less susceptible to noise (e.g., Friedrich et al. 2020;Uhlemann et al. 2020).
Our initial data vector has size  = 10 z-bins × 10 scales × 7 thresholds = 700 data points.The procedure above of focusing only on 0.05 < CDF < 0.95 removes more datapoints as multiple thresholds reach asymptotic behavior of CDF = 0 and CDF = 1 at large smoothing scales, especially for the lower redshift bins where the variance of the convergence field is lower. 6In practice, the data vector for DES Y3-like maps has  = 460 points.Note that different thresholds reach these asymptotic values at different scales.Figure 1 illustrates this behavior.Figure 3 presents the data vector measured on the DES Y3 data as well as different simulations described in Section 3.1.The 1-field (2-field) CDFs are shown in the diagonal (off-diagonal) panels.The colored lines show (  > ), the fraction of the map that exceeds a given threshold at a given smoothing scale, where each color is a different threshold.At a fixed threshold, the probability is driven to 0 or 1 with larger , and this behavior is discussed in Section 2.1.
The threshold  = 0 is special as it is the mean of the 1D marginal distributions, and so its probability for the 1-field CDFs is  ≈ 0.5 across all scales. 7In the 2-field case the probability for  = 0 is (  ,1 > 0,   ,2 > 0) ≈ 0.25 but has scale-dependent kernel has a lower amplitude for low-redshift sources and so the variance of the convergence field increases with redshift. 7Figure 1 shows the true convergence field is log-normal on small-scale, and thus has  (  > 0) ≠ 0.5.However, for noisy convergence fields, the noise dominates the cosmological signal on small scales and this noise is a symmetric distribution (the odd moments are zero, as discussed in Section 5.2).This restores the measurements to  (  > 0) ≈ 0.5 as mentioned.
deviations.This is because the correlation between the two fields alters this probability, and this correlation has a scale dependence, meaning the deviations from  ≈ 0.25 will also be scale-dependent as expected.
We can also see a clear visual difference between the CDFs of the shape noise field (dashed-dotted) and those of the observed convergence field.In particular, the 1-field CDFs of the (3, 3) bin show the clearest difference at larger scales.The shape noise field has a notably smaller variance than the observed convergence field, and this causes the CDFs to asymptote to 0 or 1 more quickly compared to the CDFs of the data.We also find that the T17 predictions are quite similar to those of A23, and that the simulations are generally a decent match to the data.

Fisher Information
We use the data vectors and covariance matrices constructed from the A23 simulations to perform a Fisher forecast for three CDM parameters that are the target of current and future lensing surveys -Ω  ,  8 , and  0 .We measure three broad types of summary statistics for this forecast:  , Ω m and  0 when using DES Y3-like data.The power spectra and 2nd moment probe only the Gaussian information and their contours overlap completely (the peach contour is hidden underneath the purple).Adding the 3rd moment significantly improves the constraints, and the CDF, which approximately contains all moments, improves upon that a non-negligible but diminishing amount.The degeneracy direction of 2+3 moments and the CDFs is also visibly different, and combining them leads to a further 20-30% improvement in constraints.The black dashed lines in the diagonal panels show the 1D constraints from CDFs measured on a purely Gaussian field, and these are consistent with those from the other Gaussian statistics.The constraints are tabulated in Table 1.
Gaussian Statistics, such as angular power spectra and the 2nd moments of the field are well-known for being sensitive to only the variance of the field, and the variance is often denoted the Gaussian part of the distribution.These statistics provide a good baseline for cosmological constraints obtained from current fiducial analyses, which primarily use such Gaussian statistics.The angular power spectra are measured in 20 bins in the range 10 < ℓ < 2048.The 2nd moments are measured on the maps smoothed with a tophat across 10 scales that are logarithmically spaced in the range 3.2 ′ <  < 200 ′ .
Higher-order moments are a natural extension to the 2nd moments where one averages higher powers of the fields, ⟨  ⟩.The most common one is the 3rd moment (or skewness), though the 4th moment (or kurtosis) has also been measured in lensing data before across a smaller range of angular scales 2 ′ <  < 8 ′ (Van Waerbeke et al. 2013).In this work we measure the 2nd and 3rd moments in the range 3.2 ′ <  < 200 ′ .
Finally, the CDF is the non-Gaussian statistic that is the focus of this work.The data vector definition is described in Section 4.1, and the measurement on DES Y3 data and some simulated mock maps is shown in Figure 3.
Note that the datavectors of these higher-order statistics tend to be long, and this is particularly an issue when computing the covariance numerically, as the number of realizations needed for the covariance increases with the data vector size.However, the A23 simulation suite contains 8000 DES Y3-like maps, and this number is far larger than the length of any data vector computed in this work.
We can now estimate the Fisher information with the standard approach, where       is the mean derivative of point  in data vector  with respect to parameter   , where the mean is computed using 400 DES Y3 realizations.C −1 is the inverse of the numerically estimated covariance matrix and this is computed while accounting for the Hartlap factor (Hartlap et al. 2007), The Hartlap factor for all data vectors in this work is ≳ 0.9.We have verified that the Fisher information -for all the statistics we present -changes by < 1% even if we halve the number of realizations used to compute the covariance matrix, from  = 8000 → 4000.
Similarly, halving the number of realizations used in computing the derivatives,  = 400 → 200, changes the Fisher information by < 1% for most statistics; the one exception is the CDFs, where the change in Fisher information is still at the 5-10% level.However, a numerical uncertainty of this level does not change our qualitative interpretations below.
Figure 4 shows the Fisher information of each statistic.The parameter constraints are obtained by inverting the Fisher matrix of Equation ( 13).First, we see that the angular power spectra and the 2nd moments have indistinguishable constraints, and this is the expected behavior as one is simply a transformation of the other; given the C ℓ , one can predict the 2nd moments exactly via an integral, and vice versa. 8We also see that the CDFs measured on a Gaussian version of the simulated Y3-like fields, shown by the gray dotted line in the diagonal panels, have constraints very consistent with those of the power spectrum and 2nd-moment.We show in Appendix B that the statistics used in this figure all follow a Gaussian likelihood even when measured on fully non-linear, non-Gaussian fields -which is not always the case for higher-order statistics as has been found in previous works (Park et al. 2022;Euclid Collaboration 2023).
Including the 3rd moment alongside the 2nd moment improves the constraints significantly for all parameters.This is primarily because of the different degeneracy directions for the different moments (Gatti et al. 2020(Gatti et al. , 2022)).
The CDFs improve the constraints compared to the combination of 2nd and 3rd moments.This confirms that there is still usable information beyond the 3rd moment in the convergence field, particularly in constraining Ω  .However, the modest improvement in going from the 2nd + 3rd moments to the CDFs (when compared to the increase from 2nd moments to 2nd + 3rd moments) shows that there is less information from the 4th moment and beyond.We explicitly check the information content of the 4th and 5th moments later in Figure 6.We have separately verified that the constraining power of the moments approach agrees better with that of the CDFs if we include the 4th and 5th moments in the former.(Zürcher et al. 2022).For KiDS 1000, we show results from cosmic shear (Asgari et al. 2021b) and a field-level analysis (Fluri et al. 2022).The DES analysis of the 2nd + 3rd Moments uses different, more conservative analysis choices (scale cuts, nuisance parameters etc.) compared to the Fisher forecast done here with the 2nd + 3rd Moments, resulting in its relatively looser constraints.
In general, we find that the CDFs do better than the combination of the 2nd and 3rd moments by around ≈ 20% in the three parameters we focus on (Table 1).They are also more compact, meaning the data vector for the CDFs (N = 460) is notably smaller than the data vectors for the higher-order moments -from progressively including the 4th Moment (N = 650) or 5th Moment (N = 1210) -while still providing constraints that are better than using up to the 5th moment.Combining the CDFs with the 2nd and 3rd moments leads to constraints that are 20-30% better than using just the 2nd and 3rd Moments.We have verified in Appendix B that the combined data vector also follows a Gaussian likelihood.
We also use the Figure of Merit (FoM), which is defined as the inverse of the area/volume of the ellipsoid formed by the parameter constraints, where  is the subset of parameters used to define the FoM and in our case is  ∈ {Ω m ,  8 ,  0 }.The FoM metric provides a concise way to summarize the constraining power in a multidimensional parameter space.We list the FoM values of our data vectors in Table 1.Including the 3rd moments improves the FoM, relative to the 2nd moments, by a factor of 3. Including the CDFs improves it by 15%, relative to the FoM of the combination of the 2nd and 3rd moments.Combining the CDFs with the 2nd and 3rd moments improves the latter's FoM by 65% and the former's FoM by 40%.

LENSING CDFS IN DES Y3 DATA
We now discuss measurements of the CDF on the DES Y3 data in §5.1, including the non-Gaussian aspect of the noise field in §5.2, and then detail the contributions from different effects that can impact the inference process: PSFs in §5.3, source clustering in §5.4,higher-order shear effects in §5.5 and baryonic effects in §5.6.Finally, we discuss scale cuts in §5.7.

CDF measurement & signal-to-noise
In Figure 3, we have already shown the DES Y3 measurements in solid lines, with the noise-only data vector in dotted gray lines and the A23 version of DES Y3-like map in the gray dashed lines.There is a clear cosmological signal as evidenced by the difference between the noise-only and DES Y3 measurements.Figure 5 now shows the signal-to-noise of the cosmological component for each datapoint in the data vector.This is computed as the residuals normalized by the uncertainty, S/N = |CDF Y3 − CDF N |/(CDF A23 ).We then also combine the statistical significance of the individual points, accounting for the covariance between them, and find a total signalto-noise of S/N = 45.3.If the difference between the signal+noise and noise-only fields is a difference in only their even moments (eg.variance and kurtosis) then for the 1-field CDFs (the "auto-correlation" part) in Figure 5, the S/N of a positive threshold should be similar to that of a negative threshold of the same amplitude.We see some indication of this via visual inspection of the 1-field CDF of the 3rd and 4th tomographic bin.We also see an asymmetry in the S/N, and this is a sign of an additional skewness caused by the signal field -for example, in the (0, 0) bin the amplitude of the yellow line ( = 0.006) is higher than the light blue one ( = −0.006).Thus, we can also visually see indications that this statistic captures non-Gaussian signatures.
Note that while we quote a signal-to-noise for the full set of residuals, we do not use it as a robust estimate of the amount of information.This is because the CDFs respond to noise and signal nonlinearly,9 so a  2 statistic is not the ideal way to quantify deviations if the deviations are large, which is the case between measurements of the noise-only maps and the noisy convergence maps.The interpretation of a  2 in the large-deviation regime is unclear.Note that this is not a problem for our Fisher forecast as the residuals are small given the shifts in the cosmology parameters, as needed for the derivatives, are also small.
Given the results of Figure 4, where we find the CDFs are a useful and complementary statistic for constraining cosmology, and Figure 5, where we find the CDFs in DES Y3 have a clear cosmological signal with signs of both the Gaussian and non-Gaussian part, we would like to now test the robustness of this statistic to the relevant observational effects in the Y3 weak lensing data.We will explore exactly this in the following subsections: • Naturally we would want to know how much of the cosmological information seen in Figure 5 is non-Gaussian -this requires a more precise understanding of the non-Gaussianity in the noise field ( §5.2).
• The measured shape of galaxies will have some contributions from the PSF, which can then lead to non-cosmological spatial correlations of the galaxy ellipticities -we find this is negligible ( §5.3).• Source galaxies, which trace the density field, will be clustered and this can impact the observed convergence field -this has a noticeable impact ( §5.4).
• The source clustering also leads to correlations between the shape noise field and the convergence field, as seen in the CDFswe can model this correlation effectively ( §5.4).
• The impact of ignoring higher-order shear effects when modelling the data vector -this is also negligible ( §5.5).
• The effect of baryonic physics on our statistics -as expected from previous works, this is important ( §5.6).
• Given the tests above, we detail the analysis choices one would need to make -under our current modelling ability -to robustly infer cosmology using the CDFs ( §5.7).
The impact from other common systematic factors, such as () uncertainties, multiplicative bias uncertainties, and intrinsic alignments, is not considered here.These effects can all be marginalized in the inference and modeling process when obtaining cosmological constraints via the CDFs data vector.Such marginalization has already been performed for multiple different analyses of higherorder statistics (e.g., Gatti et al. 2022;Zürcher et al. 2022).

Non-Gaussianity of shape noise fields
To quantify the level of cosmological non-Gaussianity observed by the CDFs, one first needs to understand the non-Gaussianity in the noise field.This is particularly relevant for us as the CDFs are sensitive to all moments of the field, meaning all moments of the cosmological signal but also all moments of the noise field.For this particular investigation, we will switch to using the fields' moments to summarize the noise field and cosmological field at different orders.We do this as the moments can easily isolate the signal from different orders, which helps disentangle the information contained in the CDFs.
Figure 6 shows the 2nd to 5th moments of DES Y3 mass map, as well as the shape noise map, for the 4th tomographic bin.We find that there is a significant non-Gaussianity in the noise, particularly in the 4th moment and on small scales.Such a feature is naturally expected if the field of source galaxy number counts is not uniform.In the limit that the galaxy counts are uniform across The 4th and 5th moments (bottom panels) have their disconnected components subtracted out.The bands show 1 uncertainties for the Noise-only and A23 maps from the O (10 4 ) realizations used in this work.The moments are re-scaled by  n as a visualization choice, where  n = /2 and  ∈ {2, 3, 4, 5} is the moment order.The 2nd and 3rd moments have significant information beyond the noise.The 4th moment is significant on the smallest scales, but this contribution is entirely from the noise field since the blue/orange and green lines are almost perfectly overlaid.On larger scales, there is a weak, cosmological signal.The 5th moment is fully consistent with no signal across all scales.
the whole DES Y3 footprint, then every pixel in the map has the same number of galaxies, and thus would have the same shape noise per pixel.In reality, the number of source galaxies per pixel varies across the footprint, either from survey observing conditions or from the intrinsic clustering of sources due to structure formation (see Section 5.4 or Figure 9).In this case, the noise variance per pixel varies across the footprint, and summing the individual Gaussian noise distributions within the pixels results in a Gaussian mixture model that is symmetric about the  = 0 mean, but can have a significant non-Gaussianity in its even cumulants/moments starting from the kurtosis/4th moment.This is also consistent with the fact that we detect no odd moments in the noise field.We also see in Figure 6 that for DES Y3-like data the cosmological signal exists only in the 3rd and 4th moments.At the 5th moment, the measurement is already consistent with no signal.The noise field has a 3rd moment that is consistent with 0 across the full range of scales.For the fourth moment, however, the noise has a larger fourth moment than the cosmological signal.We can infer this by seeing that the fourth moment of the observed field is very similar to that of the noise-only field.
The significance of the 4th moment in Figure 6 highlights the need to accurately model the noise field, since almost all the non-Gaussianity on small scales is coming from the shape noise field rather than the convergence field.Note that some previous works have also shown a strong detection of the 5th moment in the convergence field from data (Van Waerbeke et al. 2013), but they analyze the total fifth moment ⟨ 5 ⟩, whereas here we only consider the connected component, which is obtained as ⟨ 5 ⟩ − 10⟨ 2 ⟩⟨ 3 ⟩, where ⟨ 2 ⟩⟨ 3 ⟩ is the disconnected component.10Accounting for this disconnected component is important when isolating the signal in the higher orders.For example, Gaussian distributions have a nonzero fourth moment that must be accounted for -by subtracting out this "disconnected" piece -when measuring non-Gaussian features via the 4th moment.A similar scenario occurs for the fifth moment, where we subtract contributions from lower orders, namely the product of the 2nd and 3rd moments.

PSF contributions
So far we have assumed that spatial correlations between the measured galaxy shapes are a purely cosmological signal.However, this is not guaranteed to be the case as the ellipticities from the PSF can have spatial correlations as well.These correlations have been studied extensively for the 2-point functions (Jarvis et al. 2021), and the work from Gatti & Sheldon et al., (2021); Amon & Gruen et al., (2022) have explicitly shown their contributions to the cosmological signal/constraints from 2-point functions are negligible.This test has also been done at the 3-point function level (Secco et al. 2022b;Gatti et al. 2022) and found the contributions continue to be negligible.We now replicate this test at the CDF level, which will test the contribution of the PSFs to all higher-order moments.
First, we detail the different PSF contributors to the galaxy shapes.The lensing convergence is obtained from the lensing shear maps, which in turn are obtained from individual galaxy ellipticities.The measured ellipticity of a single galaxy can be separated into distinct components, e obs = e gal + e shear + e psf,true + Δe psf,err + Δe psf,true , ( 16) where e gal is the intrinsic ellipticity of a given galaxy, e shear is the ellipticity modification due to weak lensing from foreground structure, e psf,true is the PSF ellipticity, Δe psf,err is the PSF ellipticity error11 , and Δe psf,true is the PSF size error12 .The first quantity of Equation ( 16) is assumed to average to zero, ⟨e gal ⟩ = 0, while the PSF components can still make a non-zero mean contribution.The coefficients, , ,  connect the PSF components to their effective contributions on the measured shear.The values of these coefficients can be measured directly from the data, and we use the values reported in Gatti & Sheldon et al., (2021, see their Table 2) of  = 0.001,  = 1.09 and  = −0.5.These PSF-based ellipticities can then be used to make a "PSF mass map" in the same way galaxy ellipticities are used to make the DES Y3 mass map.In practice, we make three PSF maps for each of the three PSF components in equation ( 16) and sum them together in the end.
We test the impact of PSFs on the CDFs by comparing measurements between two types of maps.The first type of map is the sum of the cosmological signal from the A23 simulations, the Y3-like shape noise field, and a PSF mass map for each of the three individual PSF terms of equation ( 16).The second type of map contains the same signal and noise fields as the first, but the PSF mass map is now created after rotating all the PSF-related ellipticities in random directions.Thus the first map preserves any PSF-based spatial correlation signals, whereas the second map removes such correlations.Therefore, the residuals between the CDF measurements on these two maps quantify the significance of the PSF ellipticities being spatially correlated, which in turn quantifies how much this non-cosmological spatial correlation will contaminate our signal. 13Note that we add the same PSF mass map to all tomographic bins.
We show in Figure 7 the significance of the residuals between these two maps as measured by the CDFs, averaged over 8000 realizations.The results show that the significance of the PSF contribution is below 0.1 for all bins, scales, and thresholds.More importantly, we also show the cosmology signal seen by the CDFs -the same results from Figure 5 -and find the PSF contribution is multiple orders of magnitude below the cosmological signal, which has a significance of 3 − 10.This also confirms that the PSF contributions at the DES Y3 data quality are negligible even beyond the 3-point information.
Note that there are some dipping/valley features in both the dashed and solid lines, which are locations where the residuals switched between positive and negative values. 14This crossing implies there are scales where the residuals from the cosmological signal, at a given convergence threshold, are zero.This does not coincide with the scales where the same zero crossing occurs for the PSFs.So in principle, for a given threshold, there can be certain scales where the PSFs contribute more than the cosmological signal.However, this contribution would still be between 1 − 10% of the (1, 2) (1, 3) (2, 3) The difference in CDFs measured on two DES Y3-like simulated maps.One contains the Y3 PSF mass map, and the other contains a PSF mass map obtained after rotating all the PSF-based ellipticities.The contribution of any correlations from the PSF (solid lines) is below < 0.1 and is statistically negligible for all thresholds (different colors).It is also 2-3 orders of magnitude below the cosmological signal in DES Y3 (dotted lines).The total signal-to-noise of PSF-induced correlations is 0.3.
measurement uncertainty and thus is not a concern for cosmological constraints.

Source Clustering
Surveys observe the lensing field sampled at the location of source galaxies, and the ellipticities of these source galaxies are then used to infer the original lensing and convergence fields.The standard prediction for the convergence correlations has an additional correction because the source galaxies do not uniformly sample the lensing field and are themselves clustered given they trace the underlying, clustered density field.This clustering of source galaxies impacts the observed convergence as follows: the () of a survey details the weighting of the convergence field at different redshifts, and is computed across the whole survey footprint.However, the precise () varies across the sky.For example, at redshift  in direction â, we can have a significant overdensity of structure.This means the () in the â direction has more galaxies at redshift , and the () must be reweighted accordingly.We will refer to this effect henceforth as source clustering (SC), as was first denoted in Bernardeau (1998), though this effect has also been called source-lens clustering (Hamana et al. 2002).The effect of source clustering is not present in the fiducial postprocessing technique described in Section 3.3.However, it can be included through the prescription detailed in Gatti et al. (2023, see their Equation 5) and previously used in Gatti et al. (2020), where () is the source redshift distribution of the tomographic bin, averaged across the survey footprint, ( n, ) and ( n, ) are the density and true shear maps at a given direction/pixel and redshift, and   is the source galaxy bias.In simple terms, equation ( 17) modulates the () across the survey footprint by reweighting it in a direction-dependent way using the density fields.Note that Gatti et al. (2023) take   = 1, which we follow in this work as well, and this is a fair approximation for source galaxies which tend to be mostly blue galaxies.
In Figure 8, we show the difference in the CDF datavector measured on a convergence field with/without source clustering.Both sets of simulations have the same noise field, which is described in Section 3.3.Thus, Figure 8 presents the impact of source clustering on the cosmological signal.We find here that the impact on the CDFs is at most 0.1 − 0.5, and it is generally 1-10% that of the cosmological signal.Gatti et al. (2020Gatti et al. ( , 2023) ) show the impact of source clustering on the 2nd and 3rd moments is at the 1−10% level as well.Krause et al. (2021) show that the source clustering effect on cosmic shear 2-point functions leads to negligible bias (< 0.15) in cosmological parameter constraints, but this result is obtained after performing fiducial scale cuts which remove scales where the impact of source clustering is most prominent.Thus these findings are still consistent with our statement above that source clustering is a 0.5 effect on small scales.
We have thus far checked the impact of source clustering on the convergence field.However, source clustering will also induce a correlation between the true convergence field and the shape noise field.Both the convergence field and the source galaxy number density field depend on the density field, and are thus correlated with one another.Given the noise depends inversely on the source galaxy number density as   ∝ 1/ √  gal , the convergence field is correlated with the noise field.For example, consider two redshift bins A and B, with   >   .If there is an overdensity in bin B, it would simultaneously induce a large convergence in bin A and a (1, 2) (1, 3) (2, 3) suppressed noise in bin B, causing an anti-correlation between the convergence field of bin A and the noise field of bin B. Gatti et al. (2023) describe a simple modification of the noise field that models this correlation, where the definitions are the same as equation ( 17), with  noise ( n) as the shape noise field, which is obtained as described in Section 3.3; by using the DES Y3 galaxy shape catalog, and randomly rotating the galaxy orientations.The density factor in Equation ( 18) varies the number counts of source galaxies across the sky according to the underlying density field.This is the same source clustering effect discussed above but we now consider its effect on the shear noise field,  noise , rather than the true shear field, .As a consequence of the density-based reweighting, the even moments (variance, kurtosis etc.) of the modified noise field,  SC, noise ( n), are slightly inconsistent with those of the original noise field  noise ( n).The factor  ( n) is implemented as a correction for this inconsistency (see Section 3 of Gatti et al. (2023) for a more detailed discussion), and is modelled as where n) is the shear variance, summed over both components, in a given direction/pixel and for a given noise realization.The coefficients  and  are calibrated in Gatti et al. (2023) for the four DES Y3 bins using the Cosmogrid simulations, with values  ∈ {0.97, 0.985, 0.990, 0.995} and  ∈ {0.1, 0.05, 0.035, 0.035}.We have verified that the results of Figure 9 below are insensitive to the inclusion/exclusion of  ( n) in Equation ( 18), which is expected as they focus on the correlations between fields, rather than the covariance between them.
The correction to the noise field in Equation ( 18) is known to improve the modelling of the 3rd moments, which are sensitive to such convergence-shape noise correlations (Gatti et al. 2023).We post-process our simulations using Equations ( 17) and ( 18) to obtain convergence maps with such correlations.We then quantify the statistical significance of these correlations, as determined by the CDFs measured on these maps.The CDFs are a useful tool here as they inherit the properties of the kNN distributions, which are the discrete-field version of the CDFs and are a higher signal-to-noise estimator than the 2-point function for determining whether two fields are correlated (Banerjee & Abel 2021b, see their Figure 5).
Figure 9 shows the convergence-shape noise correlation as seen in the CDFs.Instead of the 2-field CDFs, we show the crosscomponent defined in Equation ( 6) and normalize it by the uncertainty in these correlations, estimated across 1000 DES Y3 realizations.Thus, the presented quantity can be interpreted as a significance of correlation.In the left panels are the results from DES Y3 and from the A23 simulations with source clustering.The DES Y3 result is the mean data vector from correlating the same DES Y3 mass map with 1000 different noise maps.The right panels show A23 simulations without source clustering, and finally the A23 simulations with purely Gaussian noise and no survey mask.
The exclusion of source clustering leads to a simulated model that is clearly different from what is observed in the data, and including source clustering brings the model and data into good agreement.The right panels of Figure 9 show that even if we do not include source clustering, there are correlations between the simulated mock maps.Such correlations are expected due to the survey observing properties.The first such cause is survey depth variations, which modulate the source galaxy number density across Figure 9.The correlation between two fields, which are the observed convergence field -either from DES Y3 data or forward modelled from simulationsand the simulated Y3-like shape noise fields.We find a significant detection of correlation.The panels show the index of the tomographic bin for the observed field (S) and the shape noise field (N).The left panels show the DES Y3 data and the A23 simulations with source clustering.The right panels show a subset of correlations for two other types of simulations -one with no source clustering, and one with Gaussian noise and no survey mask.The simulations with no source clustering show a clear difference from those with it included.However, even without source clustering, the observed field is correlated with the noise field, and this is due to performing KS reconstruction with a survey mask.We also measure the CDFs on full sky maps that use Gaussian noise and no survey mask.In this regime, the signal and noise fields are completely uncorrelated as expected.The total signal-to-noise of the convergence-shape noise correlation, computed as the difference between the "With SC" and "No SC" models, is 13.The "With SC" model is within 3.5 of the Y3 measurements.
the sky in the same way for all noise realizations and tomographic bins.The second is the presence of a common survey mask when we perform the KS reconstruction, which induces features in the map that are correlated across independent noise realizations given they all share the same mask.The black dashed lines in the right panels of Figure 9 confirm that a full-sky analysis with Gaussian shape noise and no survey mask -which by construction has removed the survey property-based effects discussed above -has no convergence-shape noise correlations.
Figure 9 also shows that convergence-shape noise correlations are statistically significant in the data vector and so are a necessary component in forward-modelling the CDFs.This is also true of other higher-order statistics.The analysis of Gatti et al. (2022) found correlations between the signal and noise field but was able to denoise the measurements to remove this effect.This was possible as they used the third moments of the field as their statistic, ⟨ 3 obs ⟩ = ⟨( signal +  noise ) 3 ⟩, and so the noise-dependent terms -such as ⟨ signal  2 noise ⟩ -that contributed to the measured moments, ⟨ 3 obs ⟩, could be subtracted exactly.This can be done for moments of any order.For statistics like the CDFs, however, the data vectors depend on the noise in a nonlinear way, and a simple subtraction will not remove all convergence-shape noise correlations.In this case, we are reliant on an accurate forward model of the shape noise field.15

Higher-order shear effects
In equation ( 16), the contribution to the measured ellipticity from the cosmological component is written as e shear .This is then connected to the shear field, , as e shear = /(1 − ).In the limit of  ≪ 1, this is approximated to leading order as /(1 − ) ≈ .Thus, the measured ellipticities are assumed to directly trace the shear , and we ignore higher-order terms, the first of which is . 16The effect of this approximation is generally known to be subdominant to the cosmological signal (Krause & Hirata 2010).The specific impact on the 2nd and 3rd moments measured in DES Y3 is also known to be negligible, especially when compared to the uncertainties in the Y3 measurements and to other effects such as baryon imprints (Gatti et al. 2020, see their Figure 4).
In Figure 10 we show the residuals between CDF measurements made on a mass map where the input true shear field is just  and a map where the input field is actually /(1 − ).Note that by using /(1 − ) rather than the approximation (1 +  + . ..) we test the impact of ignoring all higher-order terms in the reduced shear approximation, rather than just the leading order correction, .We then perform the full postprocessing pipeline with both map versions.The differences at the data vector level are within < 0.1 and are subdominant to the signal by multiple orders of magnitude.The impact of this approximation increases with redshift, which is expected as the variance of the  field increases for source galaxies This result also provides a validation for magnification effects, which at leading order in  modify the shear as  → (1 + ), where  is some O (1) constant.As was the case with the PSF contributions, these effects have been quantified up to the 3-point function for DES Y3 (Gatti et al. 2020), and we have now implicitly extended it to include higher-order moments through our focus on the CDFs.

Baryon imprints
Finally, we check the impact of baryon modeling on this statistic.Over the past decades, it has been well-established that galaxy formation processes like gas cooling and AGN (Active Galactic Nuclei) feedback can alter the distribution of total matter within and around halos (Blumenthal et al. 1986;Gnedin et al. 2004;Duffy et al. 2010), which consequently will impact the weak lensing signal (Chisari et al. 2018).These baryonic imprints have a strong mass/redshiftdependence (Lovell et al. 2018;Beltz-Mohrmann & Berlind 2021;Anbajagane et al. 2022a) and this mass/redshift-dependent impact on the halo potential can vary across simulation prescriptions (e.g., Anbajagane et al. 2022b;Shao et al. 2022).
Recently, Schneider et al. (2019) implemented a halo-based model that can alter N-body simulations -which are cheaper to run than full hydrodynamic simulations with galaxy formationto then model the baryon imprints on the density/convergence field.This technique provides a higher-level, approximate galaxy formation model that depends only on "macro" properties like the halo baryon fraction, the baryon density profiles, dark matter density profile etc. and the flexibility manifesting from the method's ap-proximate nature is particularly useful for matching the range of halo property scaling relations found in the latest hydro simulations (e.g., Anbajagane et al. 2020;Lim et al. 2021;Lee et al. 2022;Cui et al. 2022;Stiskalek et al. 2022;Anbajagane et al. 2022a).
In this section, we once again compute residuals between CDFs measured on maps from N-body simulations and maps that have been "baryonified".Both sets of maps used in this section come from the Cosmogrid suite, and the baryonification was performed with the same model as Schneider et al. (2019).The parameters of the baryonification model were all given their default values, except for some of the gas model parameters which we given values of   = 13.82 and  = 0.These parameters are part of a reparameterization done in Fluri et al. (2022) and control the gas density profiles' slopes.We take the true convergence fields from Cosmogrid and postprocess them using the same pipeline described in Section 3.3.
Figure 11 shows the residuals due to baryonic imprints on DES Y3-like mock maps.In all cases, the baryon impacts are below 1.However, note that the maps from Cosmogrid have a resolution of NSIDE = 512, and thus the pixel resolution is 6.4 ′ arcmin, instead of the 3.2 ′ arcmin minimum scale used in this work.Since the baryons' dominant contribution is on smaller scales, it is likely that the true residuals at 3 ′ <  < 6.4 ′ are actually larger than what is presented in Figure 11 but are currently suppressed due to the pixel resolution of the Cosmogrid maps.Nevertheless, we can state that the baryon imprints for  > 10 ′ have a significance that is approximately 1-2 orders of magnitude below the cosmological signal.
The impact is also highest for the extreme thresholds in the CDF -the  = −0.006and  = −0.020thresholds -and this has been seen in previous, theoretical works.Osato et al. (2021) compared hydrodynamic simulations with a dark matter-only counterpart and showed the lensing PDF can be impacted by more than 10% at the tails of the distribution (see their Figure 5).Sunseri et al.As expected, baryon imprints are a significant effect on the data vector.The gray band shows the scales below  < 6.4 ′ , which is the pixel resolution of the Cosmogrid DES Y3 maps, and is a factor of 2 larger than the other maps we consider in this work.Thus, the baryon effects we estimate below that scale are an underestimate of the true effect given the pixel resolution will suppress these effects.The total signal-to-noise of baryon imprints is 3.5, though this is a lower bound given the suppression due to map resolution.
(2023) used the same set of simulations to show that the impact of baryons on halos, filaments, and voids affects different parts of the matter PDF.

Scale cuts
In the above sections, we have determined the impact of different systematics and modelling approximations on the CDF data vector.Some systematics are negligible for the whole data vector, such as the PSFs (Section 5.3) and the reduced shear approximation (Section 5.5), while others are prominent at a subset of scales, such as baryon imprints (Section 5.6).Thus, using the CDFs to robustly infer cosmological constraints will require us to discard some parts of the fiducial datavector -namely the parts where the amplitude of the systematics is high -and obtain constraints using the remaining fraction of the data vector.Amongst all the systematic effects considered in this work, the most significant are the baryon imprints (Figure 11) and the source clustering effect (Figure 8).These will determine how the data vector is truncated.Our scale cuts are determined by requiring that the parameter bias due to unmodeled systematic effects is below a certain threshold.We compute this bias using the extended Fisher formalism of Amara & Réfrégier (2008); Asgari et al. (2021a), where both  and  are indexes over the cosmological parameters of interest.The average bias in the datavector,  biased −  fid , is a quantity we have already computed and presented in the above subsections.We then summarize this bias-per-parameter, Δ bias  , into a bias for the full N-D posterior as where  is the covariance of the parameters, and so  −1 is just the Fisher matrix, .Our procedure for scale cuts is simply removing datapoints until  < , where  is some chosen threshold.We will use  ∈ {0.3, 0.2, 0.1}.The choice  = 0.3 matches the tests done in the main methodology pipeline for DES Y3 (e.g., Krause & Fang et al., 2021;Secco & Samuroff et al., 2022a;Amon & Gruen et al., 2022) while the other values are chosen to explore more stringent cuts that could be reflective of Stage IV surveys.Note that this threshold, , is somewhat arbitrary, but that is not a concern as our goal is to see how the scale-cuts for the CDFs compare to those for the moments; as long as the same choices are applied across both statistics, the arbitrariness of the choices is not relevant.
The other component we must decide is how to determine and discard data points to achieve the condition  < , as there is significant freedom in doing so.We could throw away all datapoints for every bin/threshold corresponding to aperture scales below a certain chosen value.However, the choice of a fixed scale threshold is suboptimal as the impact of systematics at a chosen scale varies across bins and thresholds (as seen in any of the Figures above).Thus, our choice here is a scale cut done bin-by-bin (and thresholdby-threshold, in the case of CDFs) and follows the approach of Secco & Samuroff et al., (2022a); Amon & Gruen et al., (2022).We compute the chi-squared of a given effect in a specific tomographic bin combination (and also specific threshold, in the case of CDFs), and remove the datapoints corresponding to the smallest scales until The Fisher constraints from CDFs (left) and 2nd + 3rd Moments (right) measured on simulations.We present four cases, where we either have no scale cuts, or cut the data vector so the parameter bias in the Ω m −  8 −  0 contour is below a certain value; see Equation ( 21).The CDFs and the moments have comparable constraints, which are denoted in Table 3.
we satisfy the relation, where  sub,biased and  sub,fid are subsets of the datavectors used in Equation ( 20), where the subsets correspond to specific tomographic bin combination (and threshold, when using CDFs), C sub is the covariance matrix of the subset, and Δ 2 thresh is the maximum change in  2 we allow for the full datavector.In practice, we vary Δ 2 thresh until the parameter bias goes below our required threshold.The datapoints that have been removed to achieve this condition define the scale cuts.
Once the scale-cuts have been defined, we recompute the Fisher constraints using the truncated data vector; the results for the CDFs are tabulated in Table 2.The table also shows constraints from generic scale cuts, where we set a fixed minimum angular scale for all tomographic bins and all thresholds.Cutting all scales below 20 ′ causes a fractional change of ≈ 30% in the constraints.At these scale cuts, the CDFs are comparable to combining 2nd and 3rd Moments, and we have verified that combining the CDFs with the moments still leads to a 30% improvement in the constraints.The PSFs and reduced shear effect have no impact on scale cuts so we do not show them here.Note that, as was discussed in Section 5.6, the impact of baryonic effects is an underestimate given the baryonified Cosmogrid maps used to estimate the effect have a 6.4 ′ minimum resolution scale.Baryon effects are more impactful at smaller scales and will be more than 10% of the signal if the resolution limit is corrected.However, for our goal of consistently comparing the impacts on CDFs and Moments, this suppression is not a limiting factor.
Table 2 shows that baryon imprints and source clustering both cause notable differences in the parameter constraints, especially in  8 and  0 .The FoM in the 3D parameter space drops by a factor of nearly 10 after implementing these scale cuts, which highlights the growing need to improve modelling of these effects instead of robustly trimming the data vector to be insensitive to the effects.Note that while the impact of source clustering on determining the scale cuts is larger than that of the baryonic imprints -which is counter to the standard expectation -this is once again because of the suppression of baryon effects on the small scales due to the resolution scale of the Cosmogrid data products.
Figure 12 and Table 3 also show the results from defining scale cuts using both baryon imprints and source clustering, and doing so for CDFs and for the 2nd and 3rd moments.This provides a self-consistent reference to compare the two data vectors.The combination of scale cuts is done by looking at both baryonic effects and source clustering, and at each datapoint we pick the amplitude of the effect that is highest, i.e.  = max |Baryons, SC| for each datapoint.We find that the moments' constraints are comparable to the CDFs' after these scale cuts.Once we remove  0 from the analysis the scale cuts cause only a factor of 3 degradation of the FoM as opposed to the factor of 10 if we include  0 .
Generally, one may expect the CDFs to be less sensitive to these effects than the moments; reduced shear, source clustering, and baryon imprints are all effects that grow with the amplitude of the density field and/or the convergence field.This means they impact the tails of the density/lensing distribution the most and leave the "bulk" of the PDF -roughly the 68% or the 95% region centered around the median -relatively unaffected.The moments are defined as an integral over the whole distribution and so cannot isolate just parts of it.The CDFs on the other hand can perform such an isolation.They fundamentally only probe whether or not a pixel's convergence is above a given threshold; thus, if the convergence is well above/below the threshold, the measurement of the CDFs is unaffected by that pixel value shifting around due to various effects.For example, the negative thresholds  < 0 will be unaffected by the baryon imprints in massive halos, as massive halos exist in  > 0 regions and baryon imprints reduce the  value but always keep it positive, and so the convergence around halos will always be above the  < 0. Of course, if the  values of interest are near more non-Gaussian.It may also become relevant in constrainingand/or marginalizing over -the impact of baryons on the density field; these effects happen predominantly within halos, and so are localized around the most non-linear regions of the density field and thus will have non-Gaussian signatures.The CDFs might also be one of the few ways to probe the highest orders of information in the field.They are more robust given they can isolate specific parts of the distribution, and this is in contrast to the higher order moments which will be increasingly sensitive to noise/outliers in the tails of the distribution.Thus, if there is significant, usable higher-order information in the cosmological field (for example, in future surveys with different noise levels and sensitivities), the CDF may be one of the only ways to robustly access it.
While efforts have already been made to obtain cosmology from up to the 3rd moment, we show there remains some information beyond the 3rd moment that can likely be accessed in a robust manner, i.e. without worrying about systematics.Effects like reduced shear, source clustering, and baryons have some impact that is at the 0.1% − 10% level depending on the effect and the angular scale.After enacting scale cuts to reduce the bias on cosmological constraints to be within 0.3, the CDF data vector still provides constraints better than those of the 2nd and 3rd moment data vector.We have identified that accurate modelling of the noise field at higher orders is the current limiting factor in robustly inferring cosmology from statistics like the CDFs.Alternatively, an accurate way of denoising the CDFs -which effectively bypasses requirements in modeling the noise field by removing its contribution from the data vector -would enable robust cosmology constraints with the CDFs.
Finally, we note that even though this work has specifically focused on validating the CDF as a summary statistic, the validation results have significant implications for the broader range of lensing convergence statistics discussed in the literature.The key underlying information is the distribution of convergence as a function of scale, (  ), and the CDFs are a convenient and compact way of summarizing this distribution/information.Other statistics summarize this distribution in different ways, such as lensing-in-cells17 and Minkowski Functionals. 18As has been discussed above, another closely connected statistic is the moments of the field, ⟨   ⟩, which are a further summary of the distribution, (  | ), and computing moments to an arbitrarily high order is equivalent to computing the CDF to arbitrarily many thresholds.
As we move towards Stage IV surveys with wider survey areas and deeper observations -both leading to higher precision measurements -other systematics could become relevant.As a rough example, the LSST Year 10 dataset will have ∼ 3 times the survey area as DES Y3 and ∼ 5 times the source galaxy number density as DES Y3 (The LSST Dark Energy Science Collaboration 2018), which leads to a factor of 4 increase in precision of the data vector and in the significance of any systematic we discuss in this work.The reduced shear effect (Figure 10) -which can be safely ignored in Stage III surveys -will likely need to be included in the model for Stage IV, especially for LSST's highest redshift bins as the am-plitude of the effect grows with redshift.However, this component can be trivially included via simulation-based modeling using the same approach we used to include its effects in our simulations (Section 5.5).Source clustering will also be a necessary modelling ingredient for Stage IV surveys as its signal-to-noise will exceed 1 for LSST.While this modeling can also be done through simulationbased modeling, it requires some galaxy bias prescription (Equation 17) which would introduce a modeling uncertainty that has yet to be quantified.Additionally, we discussed that the Born approximation is adequate for modelling the weak lensing field under DES-like uncertainties.However, previous works have shown that for Stage IV data quality we will require ray tracing when using higher-order statistics (Petri et al. 2017).
These effects above -reduced shear, source clustering, and Born approximation -impact all statistical summaries of the lensing field, including the standard 2pt and 3pt functions.Systematics that will uniquely impact the CDFs are then effects that generate a fourth moment and beyond.We have already found in this work that the fourth moment of the noise field is a highly relevant modeling component for the CDFs.In DES Y3, this was primarily sourced by the survey depth fluctuations as well as the intrinsic, cosmological clustering of source galaxies.In general, however, any process that spatially modifies the shape noise per galaxy or the number of galaxies per pixel will generate the fourth moment.For Stage IV surveys, the precision will be high enough that effects such as spatially varying multiplicative bias -which impacts the measured variance of the shape distribution -could also be a required modeling component, but we must first quantify how much this bias will actually vary across the sky.
The validation steps performed in this work have implications for the statistics mentioned above -lensing-in-cells, Minkowski Functionals, field moments etc.For example, it is likely that PSF ellipticity correlations will be a few orders of magnitude below the cosmological signal for all of these statistics.A similar case can be made for the impact of source clustering and the reduced shear approximation.Of course, it is still ideal to perform a separate validation for those statistics to explicitly verify their robustness to these effects, but the results of this work indicate -given the statistics all summarize the same underlying distribution, (  | ) -that it is likely these other statistics will also be robust to these.By using the CDFs, which are approximately summarizing all higher order moments, we have tested these systematics at the map level and beyond the 3rd moment.We hope the methodologies for maplevel tests that we employed and/or introduced in this work enable more checks of the large library of higher-order statistics that are being developed for the convergence field, and thus enhance the trustworthiness of these newer statistics.
an accurate way to estimate potential constraints from the statistics considered in this work.Some additional techniques can also be used to quantify this Gaussianity of the likelihood (Park et al. 2022;Euclid Collaboration 2023), and they are roughly similar to the approach we have taken here.

Figure 1 .
Figure1.Bottom: The probability that   , the average convergence within circles of apertures , exceeds a chosen threshold .We use seven thresholds and show measurements for a noiseless convergence field corresponding to the fourth tomographic redshift bin in DES Y3.The solid lines are converted to dotted ones when the CDFs fall into the 99.7% (3) tail.The gray-blue line is always in the tail for this particular measurement.Top: The PDFs of   for different choices of aperture, .The three aperture scales that we show PDFs for are indicated by the vertical gray lines in the bottom panel.The PDFs are estimated from noiseless convergence fields and are smoothed with a Gaussian for visualization purposes.The vertical lines in these top three panels are the thresholds we use.The probability to exceed is the integral from each threshold up to  ( = ∞).For high thresholds, we have a lower probability to exceed and vice-versa for low thresholds.
Figure 2. Top: The 2-field CDFs averaged over an ensemble of 1000 noiseless, full-sky, Gaussian convergence maps.The () for the two fields corresponds to the third and fourth tomographic bins of DES Y3.The solid lines are converted to dotted styles when the CDF is outside the range [0.003, 0.997] (approximately corresponding to the 3 bounds).The black

Figure 3 .
Figure3.The fiducial datavector used in this work.Colored solid lines are measurements of the CDFs on DES Y3 mass maps, dark dashed lines are from the A23 suite, dotted lines are from T17, and the dashed-dotted are from just shape noise maps with no cosmological signal.All simulated maps have the same DES Y3 shape noise field, survey mask, n(z) distribution, and are put through the same convergence reconstruction method.The panels show 1-field or 2-field CDFs for different bin combinations, with the specific combination denoted in the corner of each panel.There are clear differences between the noise-only CDFs and the DES Y3 data CDFs, particularly on larger scales and in higher redshift bins, which are the expected imprints for a cosmological signal in the lensing convergence maps.The A23 and T17 simulation predictions are a decent match to the Y3 data.

Figure 4 .
Figure4.The Fisher information of different statistics for  8 , Ω m and  0 when using DES Y3-like data.The power spectra and 2nd moment probe only the Gaussian information and their contours overlap completely (the peach contour is hidden underneath the purple).Adding the 3rd moment significantly improves the constraints, and the CDF, which approximately contains all moments, improves upon that a non-negligible but diminishing amount.The degeneracy direction of 2+3 moments and the CDFs is also visibly different, and combining them leads to a further 20-30% improvement in constraints.The black dashed lines in the diagonal panels show the 1D constraints from CDFs measured on a purely Gaussian field, and these are consistent with those from the other Gaussian statistics.The constraints are tabulated in Table1.

Figure 5 .
Figure5.The S/N of the DES Y3 data vector.There is a clear signal observed in the CDFs with S/N = 45.3 which is slightly higher than, but generally consistent with, the S/N of the 2-point analyses in DES (S/N = 40.2,see Section IV ofSecco et al. 2022b).We show the S/N from individual bin combinations as text in the upper left panels.The upper right text in a panel denotes the bin combinations used in a certain CDF measurement.Note that the measurements are significantly correlated so one cannot trivially add the S/N of different bins together.

Figure 6 .
Figure 6.The moments of the 4th tomographic bin, as a function of smoothing scales, for the DES Y3 map, the noise-only maps, and the A23 maps.The 4th and 5th moments (bottom panels) have their disconnected components subtracted out.The bands show 1 uncertainties for the Noise-only and A23 maps from the O (10 4 ) realizations used in this work.The moments are re-scaled by  n as a visualization choice, where  n = /2 and  ∈ {2, 3, 4, 5} is the moment order.The 2nd and 3rd moments have significant information beyond the noise.The 4th moment is significant on the smallest scales, but this contribution is entirely from the noise field since the blue/orange and green lines are almost perfectly overlaid.On larger scales, there is a weak, cosmological signal.The 5th moment is fully consistent with no signal across all scales.

Figure 8 .
Figure 8.The difference in CDFs measured on two DES Y3-like simulated maps, where one map contains source clustering and the other does not.The signal from source clustering (solid lines) is at 0.1 − 0.5 and generally contributes ≈ 5 − 10% to the total signal.The total signal-to-noise of source clustering-induced residuals is 1.3.

Figure 10 .
Figure10.The difference in CDFs depending on whether or not we account for reduced shear effects, Δ =  RS −  fid .The high redshift bins, especially when looking at the 2-field CDFs, see the largest impact given source planes at high redshift have larger values of  and thus the 1/(1 −  ) term for the reduced shear is larger.The deviations are still within ≲ 0.1 in all cases and are 2-3 orders of magnitude below the cosmological signal.The total signal-to-noise of reduced shear-induced residuals is 0.3.

Figure 11 .
Figure11.The difference in CDFs measured on dark matter-only (DMO) simulations and "baryonified" DMO simulations.As expected, baryon imprints are a significant effect on the data vector.The gray band shows the scales below  < 6.4 ′ , which is the pixel resolution of the Cosmogrid DES Y3 maps, and is a factor of 2 larger than the other maps we consider in this work.Thus, the baryon effects we estimate below that scale are an underestimate of the true effect given the pixel resolution will suppress these effects.The total signal-to-noise of baryon imprints is 3.5, though this is a lower bound given the suppression due to map resolution.
Figure12.The Fisher constraints from CDFs (left) and 2nd + 3rd Moments (right) measured on simulations.We present four cases, where we either have no scale cuts, or cut the data vector so the parameter bias in the Ω m −  8 −  0 contour is below a certain value; see Equation (21).The CDFs and the moments have comparable constraints, which are denoted in Table3.

Table 1 .
The Fisher information constraints for a joint analysis of Ω m ,  8 , and  0 , the Figure of Merit (FoM), and the size of the data vectors.The FoM, defined in equation (15), is a metric for the constraining power in multidimensional parameter space.All presented FoM values are normalized by the FoM of the Power Spectra.We compare with the shear results from (Gatti et al. 2022)and non-Gaussian statistics used on the DES Y3 or KiDS 1000 data.For DES Y3 we show constraints from Cosmic Shear(Amon &  Gruen et al., 2022; Secco & Samuroff et al., 2022a), 2nd and 3rd Moments(Gatti et al. 2022), and Peaks