Joint inference of multiplicative and additive systematics in galaxy density fluctuations and clustering measurements

Galaxy clustering measurements are a key probe of the matter density field in the Universe. With the era of precision cosmology upon us, surveys rely on precise measurements of the clustering signal for meaningful cosmological analysis. However, the presence of systematic contaminants can bias the observed galaxy number density, and thereby bias the galaxy two-point statistics. As the statistical uncertainties get smaller, correcting for these systematic contaminants becomes increasingly important for unbiased cosmological analysis. We present and validate a new method for understanding and mitigating both additive and multiplicative systematics in galaxy clustering measurements (two-point function) by joint inference of contaminants in the galaxy overdensity field (one-point function) using a maximum-likelihood estimator (MLE). We test this methodology with KiDS-like mock galaxy catalogs and synthetic systematic template maps. We estimate the cosmological impact of such mitigation by quantifying uncertainties and possible biases in the inferred relationship between the observed and the true galaxy clustering signal. Our method robustly corrects the clustering signal to the sub-percent level and reduces numerous additive and multiplicative systematics from $1.5 \sigma$ to less than $0.1\sigma$ for the scenarios we tested. In addition, we provide an empirical approach to identifying the functional form (additive, multiplicative, or other) by which specific systematics contaminate the galaxy number density. Even though this approach is tested and geared towards systematics contaminating the galaxy number density, the methods can be extended to systematics mitigation for other two-point correlation measurements.


INTRODUCTION
Over the past few decades, groundbreaking observations and theoretical advances have allowed us to construct a remarkably detailed picture of the properties and evolution of the Universe (e.g., Weinberg et al. 2013).Early-Universe observations from the cosmic microwave background (CMB; e.g., Planck Collaboration et al. 2020) and distance measurements from Type Ia supernova (Riess et al. 1998;Perlmutter et al. 1999;Filippenko 2005) already provide strong constraints on the currently accepted cosmological model, ΛCDM.To further test this model and its predictions across the full span of cosmic time and with a range of observables, photometric galaxy surveys have emerged as a cornerstone of modern cosmological analyses.Recent large-scale structure (LSS) surveys such as the Dark Energy Survey (DES; The Dark Energy Survey Collaboration 2005), Hyper Suprime-Camera Survey (HSC; Aihara et al. 2018a), and Kilo-Degree Survey (KiDS; de Jong et al. 2013) have shown strong cosmological constraining power.Upcoming and newly-begun surveys such as the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST; Ivezić et al. 2019), Dark Energy ★ fberlfei@andrew.cmu.eduSpectroscopic Instrument (DESI; DESI Collaboration et al. 2022), Euclid (Euclid Collaboration et al. 2022), and the Nancy Grace Roman Space Telescope High Latitude Survey (Spergel et al. 2015) will enable even more precise tests of the ΛCDM model in the coming decade.However, with the increasing survey size enabling greater statistical precision, the challenge of identifying and mitigating systematic contamination to provide robust and accurate constraints will be more important than ever.
Galaxy clustering measurements play an important role in cosmological analyses, as they provide information about the large-scale distribution of matter of the Universe, which carries information about the cosmic expansion history.The relation between the galaxy and matter density field, commonly known as galaxy bias, is a key determinant in clustering (Desjacques et al. 2018) that connects the observed galaxy clustering to the underlying matter field at all physical scales.Alongside weak gravitational lensing, galaxy clustering represents one of the three observables used in the current stateof-the-art analysis method for cosmological inference from imaging surveys (3×2pt analysis; e.g., Krause et al. 2021;Abbott et al. 2022).Clustering measurements depend on the galaxy (over-)density field and have a very high signal-to-noise ratio in large-area surveys, so potential systematic effects can disproportionately affect the observed galaxy overdensity field and have an outsized impact on cosmological constraints (Elvin-Poole et al. 2018).Consequently, generating tools for understanding and mitigating the impact of systematics on the galaxy overdensity field is paramount in the effort to produce robust cosmological constraints from large-scale structure measurements.Systematic effects, in the context of imaging surveys, encompass a wide array of observational and instrumental effects (Sevilla-Noarbe et al. 2021) that can systematically perturb the observed galaxy overdensity field, and thereby bias the measured galaxy two-point statistics.These effects can arise from survey-dependent factors such as atmospheric and observing conditions (e.g., seeing, airmass) and detector effects, or survey-independent (astrophysical) factors such as stellar density or Galactic extinction.In addition, systematicsinduced modulation of galaxy redshifts across the sky can also bias our measurements of angular clustering (Baleato Lizancos & White 2023).Accounting for all contaminating systematics can often be difficult, resulting in many possible systematic contaminants to consider (e.g., Rodríguez-Monroy et al. 2022).Moreover, for surveyindependent systematics, cross-correlation of different surveys will not help detect and remove their impact, motivating a comprehensive and robust approach to their mitigation.
Multiple methods have been developed for mitigating galaxy overdensity systematics.We discuss these methods in more detail in Sec.2.3.However, despite significant progress in mitigating galaxy overdensity systematics, challenges remain.The DES Y3 clustering results (Rodríguez-Monroy et al. 2022) are a notable illustrative example.Comparison between results from the color-selected red-MaGiC sample (Rozo et al. 2016) and the flux-limited MAGLIM (Porredon et al. 2021) galaxy samples, designed for 3×2-pt analysis, revealed unanticipated inconsistencies.This sparked great interest in the area of systematics treatment even though extensive validation determined the cause of the inconsistencies was most likely systematics-dependant sample selection.One pressing difficulty lies in the empirical determination of the form of the systematics contamination to the overdensity field, particularly when multiple systematics effects combine in complex ways.While simulations offer valuable insights, developing empirical methods for modeling and learning about the functional form of systematics contamination is an important path forward for the field.
In this study, we use a new and more generalized modeling approach for systematics contamination of the galaxy overdensity field.Our approach is empirical: we provide a generalized functional form for the contamination, and a method for constraining its parameters using the real survey data.We apply this method to jointly infer and mitigate contaminants in the clustering signal and validate its performance using KiDS-like mock galaxy catalogs (Harnois-Deraps et al. 2018) with synthetic systematics incorporated into them.Our goal is to provide a method for empirically characterizing and mitigating more complex systematic contamination.
The structure of the paper is as follows.In Section 2 we provide background on methods for mitigating galaxy overdensity systematics and the connection to cosmology.In Section 3 we outline our new systematics characterization and mitigation method.Section 4 details the mock extragalactic galaxy catolog used for testing our method, along with approaches used in its analysis.In Section 5 we present the results of applying our method under a variety of conditions, building up to a multi-systematic case that emulates the real complexity of survey data.Section 6 discusses the implications of our results and possible applications to real data.Finally Section 7 summarizes our findings and presents the challenges and future work.

BACKGROUND
Here we outline the background for this work, starting with the importance of clustering in cosmology and how systematics can bias cosmological analysis.We then introduce the galaxy overdensity formalism and contamination model in Subsection 2.2 and discuss various existing methods to deal with systematic contamination in Subsections 2.3 and 2.4.

Connection to Cosmology
The matter distribution of the Universe is of fundamental interest in cosmology as it carries important information about the evolution and structure of the Universe over time.It is therefore used for testing theoretical models for cosmic acceleration and understanding the ΛCDM cosmological model.Current cosmological analyses, such as the 3×2pt analysis (Krause et al. 2021;Abbott et al. 2022), use the combined information from clustering, shear, and galaxy-galaxy lensing to extract cosmological information (e.g., matter energy density, amplitude of matter fluctuations) in a Bayesian inference framework.Combining these probes is advantageous because the probes are complementary and combining them enables us to marginalize over observational and theoretical uncertainties associated with single probes.We use galaxies to probe the underlying matter density field, but marginalizing over the galaxymatter relationship requires modeling both linear and non-linear behavior, with the linear relationship that holds on large scales described as  g =   →  gg =  2   , where  g/ represents the galaxy/matter overdensity field,  the correlation function and  the linear galaxy bias.
The galaxy bias (for a review, see Desjacques et al. 2018) represents the relation between the galaxy and matter distribution and depends on the mass of the galaxy's host dark matter halo, along with other properties such as luminosity and redshift.Galaxy bias parameters are fitted during the inference process and are crucial in connecting the galaxy clustering signal to the matter clustering.If a simple linear bias model is used, clustering is then sensitive to  2 while galaxy-galaxy lensing is sensitive to  ( g =   , where  represents the lensing convergence), so we can eliminate  by using both probes.Therefore, biased measurements of the galaxy angular correlation function will bias the cosmological inference analysis.It is explicitly shown in Elvin-Poole et al. (2018) that ignoring systematic contamination to the galaxy clustering signal can cause overestimation of the galaxy bias parameters and underestimation of Ω  , for example.For this reason, mitigation of systematic contamination in galaxy clustering plays a crucial role in producing an unbiased cosmological analysis.

Galaxy Overdensity and Contamination Formalism
In order to create a reliable contamination model for galaxy overdensity systematics, it is useful to understand what kind of contaminants can be present and how they can affect the galaxy number density.For example, spurious galaxy detections due to stars mistakenly identified as galaxies can have a purely additive effect on the observed galaxy number density: to lowest order, this effect should depend on the stellar density but not on the galaxy density.Spatial variations in the PSF size can modulate the selection function (detection probability) of galaxies, for example due to modifications in the measured signalto-noise ratio or modulation in the degree of unrecognized blending between galaxies.This effect results in a multiplicative effect on the galaxy number density, as its magnitude depends on the local density The regions where the measured and true galaxy distributions differ are circled on both panels, with the color of the circle indicating the type of systematic that is generated.An example of a multiplicative systematic is the unrecognized blending of two or more distinct galaxies in the observed distribution, resulting in a multiplicative effect to the observed number density.An example of an additive systematic is whenstars aremisidentified as galaxies in the measured distribution.This illustration shows how multiplicative systematics depend on the galaxy density field while additive ones are independent of the galaxy distribution. of galaxies.An example illustration of additive and multiplicative effects on observed images is depicted in Fig. 1.We emphasize that we have only mentioned examples here, and that subtleties in the image processing can give rise to numerous other additive or multiplicative systematics in the galaxy overdensity field.
With these examples in mind, we start by defining notation that describes how contaminants modulate the observed galaxy overdensity field.We model the observed galaxy number density, n (  ), in a given direction    on the sky, as the true galaxy number density   (  ) modulated by some systematic contamination function  syst (  ) and an additive term  add syst (  ): Since we are interested in the clustering signal quantified as a twopoint correlation function of the galaxy overdensity field, we must write equation (1) in terms of the galaxy overdensity field.We write the true number density in the usual way where   represents the overdensity of some galaxy density field and bars over quantities indicate ensemble averages.
If we assume1 n = n , equation (1) becomes: after dividing by n and subtracting 1.Here, too, quantities without vs. with hats represent the true vs. observed field.We notice two additive terms and one multiplicative bias term.The additive terms  syst and  add syst / n can be combined into a single term that describes the additive systematics contamination to the overdensity field.Note that the combined additive terms may differ from the multiplicative term -they are not, by definition, identical.
In practice, we can obtain direct estimates of the overdensities and of observable quantities that can produce systematics by pixelizing the observed survey footprint.That is, all observed galaxies can be binned into equal area pixels from which these quantities can be calculated.

Methods for Systematics Correction
Many methods have been developed to address the challenging problem of correcting for systematic contamination in clustering, in response to the need for robust tools to deal with contamination of potentially unknown form.For reference, Weaverdyck & Huterer (2021) provide an extensive analysis and a review of multiple decontamination methods, along with their own method using ElasticNet.For consistency we stick closely to the formulations and notation used in Weaverdyck & Huterer (2021).Here we provide a summary of some commonly used methods, with a goal of motivating our own.

Mode Projection and Template Subtraction
Mode projection was introduced by Rybicki & Press (1992) and is applied to directly correct the galaxy overdensity power spectra ( ℓ ) for the presence of systematic contaminants by assigning infinite variance to the systematic templates, hence modifying the signal covariance matrix.However, this method suffers from the difficulty of needing to invert the modified covariance matrix.Elsner et al. (2016) proposed an adaptation of this method that avoided this challenge.For a single contaminant template  and some contamination parameter , the contamination model assumes that the additive terms in equation ( 2) can be written as  syst (  ) +  add syst (  )/ n (  ) =  (  ).The mode projection formalism neglects the multiplicative term in equation (2) under the assumption that  syst   ≪ 1, so the overdensity equation becomes δ (  ) =   (  ) +  (  ). (3) The contamination parameter is estimated using the covariance of the template map and the galaxy field and the variance of the template map itself, where  2 tg here represents the covariance of the  and δ maps over the whole footprint.In the full-sky case the covariance is given by We can write the same expression for  2 tt by replacing the cross power spectrum with the auto power spectrum of the template maps.Note that in practice the covariances in equation ( 4) can be calculated at the pixel-level directly rather than by measuring the power spectra.The method can be straightforwardly extended to the case of multiple systematics, as well as multiple galaxy maps in the case of a tomographic analysis.Using this method, a cleaned version of the overdensity field can be obtained from equation (3), and the contamination-corrected galaxy power spectrum can be estimated from the resulting cleaned map.Note that calculating the power spectrum of the cleaned map will result in a biased estimator, and Elsner et al. (2016) details how to properly debias the estimated power spectrum of the systematics-corrected map.This method is currently implemented in the open-source galaxy clustering code NaMaster (Alonso et al. 2019).It is also worth noting that pseudo-Cl mode projection and ordinary least squares regression at the pixel level have been shown to be equivalent methods (Weaverdyck & Huterer 2021).
Template subtraction (Ross et al. 2011;Ho et al. 2012) takes a similar approach to mode projection, keeping the same model contamination of equation (3).However, the contamination parameters are calculated for each mode as and then applied directly to the observed power spectra to obtain a biased estimate of the power spectra: Elsner et al. (2015) details how to properly debias the estimated power spectra in both the full-sky and partial sky limit.

Iterative Regression and ElasticNet
A more recent method developed by Elvin-Poole et al. (2018) uses a simple linear regression approach to iteratively clean the overdensity maps.For a systematic labeled , with template   , pixels are placed in evenly-spaced bins based on the value of the template within that pixel.Using all of the evenly-spaced bins (indexed ), regression parameters for each systematic    ,   are fit from the equation: Mock catalogs are used to estimate the covariance matrix of the binned quantities needed for fitting.This method is applied iteratively by identifying the systematics that more highly correlate with the density field and cleaning the map of the more dominant systematics first.It does this by assigning weights to all galaxies based on their pixel template value:  = 1/(   +   ).It then iteratively proceeds to the next most dominant contaminant performing the same process, which in practice is equivalent to applying a product of weights to the density field.The process stops once a desired significance level (for the defined null tests) is achieved.A final set of weights (the product of each iteration) is then assigned to each galaxy when calculating the corrected estimate of the two-point correlation function.
Finally, Weaverdyck & Huterer (2021) introduced the use of ElasticNet, a linear regression model with L1 and L2 regularization, as a mitigation method.The method uses an additive contamination model to estimate the contamination parameters for each template initially.Then, an additional map-level correction for the templates assumed to be multiplicative is done to account for multiplicative systematics.They also derive the impact of multiplicative systematics on the overdensity covariance and incorporate it into the likelihood.The likelihood for fitting the templates uses a a contamination model from equation (2) with  add syst = 0 and  syst (  ) =   T(  ), where    consists of a vector of contamination parameters and T is a matrix containing the systematics templates for every pixel.The contamination parameters are fit for by minimizing the loss function where δ represents the observed galaxy overdensity, and ||•|| 1 and ||• || 2 are the L1-norm and L2-norm, respectively.These two norms are meant to incentivize sparsity in the estimated parameters (L1-norm) and address possible correlations (L2-norm) between template maps.The correction to the galaxy power spectra then follows a similar procedure to that of mode projection described earlier.However, as previously mentioned, the cleaned maps are estimated by taking into account the multiplicative bias for the templates assumed to have this effect, which was neglected in previous methods.The result is a cleaned version of the overdensity map, which we call here δclean When correcting for the impact of systematics on the angular correlation function, a set of weights for each pixel can be produced in a similar manner to iterative regression, where the weight map is given by     = (1 +   T) −1 , and   ,  is the weight of pixel .It is important to note that this method does not need mock catalogs, and finds the regularization parameters through a form of cross-validation and partition of the maps.While ElasticNet represents an advance in correcting for both additive and multiplicative systematics in the galaxy overdensity, it relies on a priori information regarding which templates are multiplicative without a means for empirically choosing between the two.This becomes a special case of equation ( 2), which shows that if there are additive and multiplicative systematics in the number density, the result is different additive and multiplicative systematics in the overdensity.Therefore, incorrect a priori choices of which templates are multiplicative can lead to biases in the corrected overdensity map.

Non-linear Methods
There are a variety of non-linear machine learning methods (Rezaie et al. 2020;Johnston et al. 2021, e.g.,) that attempt to correct for systematic contamination in galaxy clustering.For example, Rezaie et al. (2020) uses a similar contamination model as equation ( 2), but instead of assuming a specific linear form for the contamination, use a fully connected forward neural network to solve the regression problem.The learned contamination function can then be applied to weight each pixel and correct the power spectra for systematics contamination.Johnston et al. (2021) uses self-organizing maps (Kohonen 1990) to mimic the systematic-contaminated maps and subtract the spurious modes from the clustering signal.

Summary of Methods
The correction techniques discussed in the previous subsection have achieved notable success in reducing the impact of systematic contamination on clustering measurements and have been applied to a wide range of survey data and simulations.Most of these methods are built on several key assumptions, such as the ability to treat systematics contamination as fully additive, or fully multiplicative, rather than a combination of both.While methods with these assumptions have often been sufficient for existing datasets, it is by no means clear that they will be sufficient for datasets with smaller statistical uncertainties.For example, neglecting multiplicative terms (that is,   terms in equation ( 2)) in the correction, or assuming they are the same as the additive term, could result in uncorrected biases (Shafer & Huterer 2015;Weaverdyck & Huterer 2021) in cases where those terms exist and differ from the additive systematic term.
In addition, understanding the existence and level of both the additive and multiplicative systematics terms in the contamination model can potentially help us understand empirically how systematics contaminate the observed galaxy number density.This knowledge could be used to identify areas for improvements in image processing algorithms.
These challenges highlight the need for further advances in correction techniques and modeling that can address the more general model for systematics in galaxy overdensities shown in equation ( 2).This motivates us to develop a generalized and flexible approach that can accurately model and remove systematic contamination in clustering while providing some understanding of the functional dependence on   .This provides a level of interpretability to the results that non-linear ML methods, such as neural networks, often lack.

METHODS
This section describes the method introduced in this work.First, we introduce the specific contamination model used to quantify and correct for systematic contamination in the one and two-point functions.The likelihood function chosen to characterize the galaxy overdensity field is defined with the inclusion of possible contamination, along with statistical methods to compare different modeling choices.Next we highlight the impact of noise and how to remove biases it introduces in the correction of two-point correlation functions.Finally, we describe possible selection effects when dealing with real data and mocks.

Systematics Contamination Model
We will start from equation (2) and initially consider the case of one systematic, described by some template .In our formulation of the systematics model we choose to define the systematic functions in terms of template overdensities   (  ).Similar to other works, we choose  syst (  ) =   (  ), but now consider  add syst (  ) =   (  ).So equation (2) becomes as follows: where  and  are the contamination parameters, absorbing the previous constants:  =  + / n and  = .This allows us to construct a nested model, for which the cases of purely additive ( = 0) and purely multiplicative ( = ) number density contamination are special cases.Since a given systematic template might have both additive and multiplicative effects on the number density, generalization beyond these simple models is physically motivated and allows us to test our assumptions about what contamination models we use.Also note that we have chosen to consider only terms linear in   since we can "feature engineer" through nonlinear transforms of one or more base templates such that the dependence is linear (Elvin-Poole et al. 2018;Weaverdyck & Huterer 2021).This formalism can be generalized for the case of  sys systematics maps, for which we would have: From now on we will refer to the case where {, } are free as the "Combined" model, the form when  =  as the "Multiplicative" model, and the form when  = 0 as the "Additive" model, i.e., δ As explained in Sec.2.3, most current methods assume either additive or multiplicative contamination models in the galaxy number density.
If the template maps are uncorrelated with each other and with the underlying true galaxy overdensity distribution, the observed galaxy two-point function would then be: and our estimate for the corrected two-point correlation function would be So in order to correct the clustering signal for systematics, we must estimate the contamination parameters {  ,   } and measure the spatial correlation functions of the zero-mean template overdensities.Note that model misspecification can lead to biased corrections of the correlation function, as we will see in Sec. 5. Some of our main goals are to introduce a formalism with the flexibility to handle different types of systematics, to robustly correct the two-point function despite the increase in the number of parameters, and to gain an understanding of the functional contamination model empirically from data.

Correlated Templates
The above formalism assumes systematics templates are uncorrelated, but generalizes easily to correlated systematics.To handle the case of correlated systematics templates, we use the eigenvectors of the template maps' pixel covariance matrix to create an orthogonal set of template maps,  ′  .Equations ( 14) and ( 15) can then be rewritten in the rotated frame to mitigate the clustering signal: where { ′  ,  ′  } are the contamination parameters in this new space.The contamination parameters in the original template space can be retrieved, if needed, by applying the inverse rotation matrix on these new parameters.We want the original parameters back as a way of evaluating parameter estimation and possibly understanding the functional form of contamination from systematics.Details of the derivation of the correlated case are in Appendix A.

Likelihood Function
To estimate the contamination parameters, we work directly with galaxy and systematic overdensity maps.We use healpy (Zonca et al. 2019) and HEALPix2 (Górski et al. 2005) to pixelize the data and create such maps from galaxy catalogs.Hence, the notation   ,  refers to the average value of   from all galaxies in pixel .The size of the pixels depends on the sample: it should be chosen such that pixels have a non-negligible number of galaxies, but are small enough to retain information about clustering on the smallest scales of interest.In this work we used pixels of NSIDE = 512, which have an area of ∼ 47 arcmin 2 .We use a maximum likelihood estimation approach to estimate the parameters of interest.
The simplest approach is to model the probability density as a Gaussian distribution, (  ) ∼ N (0,   ).The conditional probability distribution ( δ | t ,   ,   ) for uncorrelated pixels is then: is the th column of the matrix  t , and   is the galaxy overdensity pixel-map variance.The summation goes over all  pixels in the map.The rows of the matrix  t represent each systematic, while the columns represent the systematic values at a particular pixel.The denominator term is a normalizing factor needed to account for the multiplicative form of rearranging equation (12).In order to parametrize the   distribution as a likelihood we need to understand what we expect its general form to be.By definition,   ≥ −1 and ⟨  ⟩ = 0, so the distribution is bounded from below but is allowed to take arbitrarily large positive values.This means our distribution will naturally be skewed.For this reason, we also explore modeling the conditional probability of each pixel as a skewed Gaussian distribution.The resulting log-likelihood is: , where   contains the in terms { δ ,  t ,   ,   } by rearranging equation ( 12),  is the skewness parameter, and is the center of the distribution, in this case defined by assuming that the mean value of δ is zero.The second term in the likelihood serves as a normalization factor.Note that this formula assumes pixels in our galaxy overdensity map are uncorrelated, even though we know this is not true.However, we show later in Section 5 that this approximation still enables us to accurately mitigate the systematics in the two-point correlation functions.For the case of the combined systematic model with  sys templates, we fit 2( sys + 1) parameters via the maximum likelihood.Both    and    consist of  sys parameters used in mitigation, while  and  are nuisance parameters describing the overdensity distribution.

Distinguishing between likelihood models
To summarize, we have created a nested likelihood model described by equation ( 18) that can reduce to a simple Gaussian in the case of  = 0, and that reduces to additive or multiplicative systematic terms for particular relationships between   and   for systematic .This formulation of our model allows us to easily compare models in terms of how well they describe the data using likelihood ratio tests.For shorthand we represent all model parameters by Θ = {  ,   , , }.We define this ratio as where L is the likelihood defined in equation ( 18), Θ and Θ 0 are the unrestricted and restricted parameter spaces, and the hatted variables represent their best fit parameter estimates.So for example, when we want to test whether the additive systematic model is sufficient to describe systematics due to template   , we can define Θ 0 = {Θ :   = 0}.Similarly when we want to test the sufficiency of a Gaussian likelihood to describe the data, we can define Θ 0 = {Θ :  = 0}.In a likelihood ratio test formulated this way, the null hypothesis is that the data can be adequately described by a simpler model, which usually translates to a model with fewer parameters or restrictions.
If the null hypothesis is true, the quantity  LR defined earlier is asymptotically  2 distributed.In order to reject the null hypothesis (thereby preferring a more complex model to describe the data), we need to define some critical value  such that we can reject the null hypothesis if  LR > .This critical value is often taken to be  ≈  −1 (1 − , ), where  is the cumulative distribution function (CDF) of a  2 distribution,  is the probability of incorrectly rejecting the null hypothesis, and  the number of degrees of freedom.The value of  is selected based on the desired risk of incorrectly rejecting a simpler model for a more complicated one.This test provides an approach to model selection, and hence a general understanding of the validity of the assumptions underlying our models, that takes as default the simpler (null) model, and hence requires strong evidence to reject that model in favor of the more complicated alternative.This reflects our general preference for avoiding overparameterized models.These tests are enabled by the nested structure of the models under consideration.However, ultimately we must draw conclusions about the success of our method based on how well they enable us to correct the two-point correlation functions.

Noise Debiasing
The correction for biases in the clustering signal in equation ( 16) uses the squares of the estimated contamination parameters ( 2 ,  2 ).Under the assumption that our systematics model correctly describes the systematics contamination, the estimated parameter â can be modeled as the true parameter   plus zero-mean noise: â =   +   .In this case, â is an unbiased estimate for   , but its square â2  is a biased estimator for  2  : The unbiased squared terms that should be used in the estimated correction in equation ( 16) are therefore where Var[ â ] here represents the noise variance of the parameter   .The same considerations and debiasing procedure applies to the  parameters.In the limit of purely additive systematics, the noise debiasing presented here is in essence equivalent to the debiasing methods developed in mode projection and template subtraction.Note that failure to debias the squared parameters can lead to significant overcorrection of the two-point function.This is shown quantitatively in Fig. B1 of Appendix B. This illustrates the necessity of removing the noise bias in the estimate of the squared parameters in order to obtain an unbiased estimate of the clustering signal.This particular debiasing method differs from that of methods like ElasticNet, which deal with the bias-variance tradeoff by using data-calibrated L1 and L2 regularization, as mentioned in Sec.2.3.2.Meanwhile, this approach balances this tradeoff using a direct debiasing technique similar to that of mode projection and template subtraction.
This approach is straightforward to implement if mock catalogs are available to estimate the variance in the systematics parameters { â , b }.However, in the absence of mock catalogs, we will discuss in Section 6 a bootstrapping approach to estimating the noise without the need for mock catalogs.It is important to note that the parameter noise is influenced by multiple factors: 1) the specific form of the likelihood for and power spectra of   and   , 2) the survey area coverage and pixelization, and 3) the number of parameters being jointly estimated.Different contaminants will in general have different noise levels, and being able to empirically estimate the noise in the systematics parameters is crucial.

Selection Effects
In both mock catalogs and real data, we need to account for selection effects that systematically modulate the observed number density.For example, these effects can be caused by observing conditions and survey design.Different completeness levels across the survey area can bias the selection of observed galaxies.We account for these effects using random catalogs to normalize the observed galaxy number density in every pixel.
where  tot  is the total number of galaxies in our sample and  tot ,rand is the total number of random galaxies.Their ratio serves as a normalizing factor given that our random catalogs have higher number density than our survey catalogs.In addition, we remove pixels in which our random catalog has very few galaxies to avoid extreme outliers and division by zero.Specifically, we calculate the maximum number of galaxies in a pixel for our random catalog, and discard all pixels that contain fewer than 10% of this quantity.In practice this resulted in removing less than 1% of the total area of the mocks used, which are described in Sec.4.1.Note that since we carry out this process using the random catalogs, the results simply remove any survey regions that fall into pixels with very little survey coverage.

DATA, SOFTWARE, AND ANALYSIS
This section describes the data and software choices used to test the methodology described in the previous section.We start by describing the mock extragalactic galaxy catalog used and the process of generating mock systematic template maps.Following this we explain the software used to fit for the systematics model using the one-point function and estimate the two-point function.Finally we describe the process of quantifying cosmological impact using the data and tools in this section.

Mock Extragalactic Galaxy Catalog
To The -body simulation produces a full light cone reconstruction, which requires simulating the non-linear evolution of the particles from a desired redshift, as well as generating halo catalogues.The generated halo catalogs are then used, along with a given HOD model, to assign a certain galaxy population to every host halo.The HOD model used to produce these mocks is described in Smith et al. (2017) and it is referred to as the GAMA HOD model in Harnois-Deraps et al. (2018).For the KiDS-HOD galaxy mocks, the HOD model is calibrated to reproduce key properties of the KiDS Survey (Hildebrandt et al. 2016).The galaxies are assigned up to redshift  spec = 2.0 and contain photometric redshifts and lensing information as well.
Note that we chose to implement our method directly on the twopoint function (in configuration space), which is why we chose to work directly with galaxy positions, rather than looking at contamination in the power spectra.We used a set of 119 publicly-available KiDS-HOD mock catalogs 3 with total sky coverage of 100 deg 2 , along with an associated random catalog.The random catalog has a density approximately 10 times higher than that of the mocks.We work with only one low redshift tomographic bin (0.2 <  spec < 0.4) as a proof of concept for the methodology presented in this work.Table 1 summarizes key numbers for the mocks used for this work.The catalogs are pixelized with NSIDE = 512 (pixel area ∼ 49 arcmin 2 ), meaning that a typical pixel has ∼ 40 galaxies.

Mock Systematic Maps
In order to develop and test our method in a realistic scenario, it is important to generate systematics template maps that emulate the structure and distribution of systematics encountered in real data that will bias the observed number density.To do so, we use the Point Spread Function (PSF) area size (a proxy for seeing) as our test case, given that this is a well known observational parameter that can bias the observed galaxy number density depending on how the detection process is carried out, especially for ground-based surveys where regions with worse seeing can exhibit more severe challenges in deblending overlapping galaxy light profiles.We start by measuring the PSF area size autocorrelation function from HSC Public Data Release 1 (Aihara et al. 2018a,b).From a pixelized map of seeing size, with NSIDE = 512, we can construct a map of the seeing size overdensity, which is what we use to calculate the autocorrelation function using the estimator described in Sec.4.3.We then fit the correlation function to a declining exponential, as it fits the data quite well in the angular scales between 1 and 500 arcmin, which fall within the scales of interest for measuring clustering.With this fit and the use of CAMB (Lewis et al. 2000), we go from the correlation function to the power spectrum (  →   ℓ ), where  here refers to the seeing size overdensity template we have constructed from the data.We then fit the power spectrum to an exponential function, as it once again provides a good visual fit to the data in the scales we are interested for clustering (40 < ℓ < 1500), and find the best fit to be   ℓ ∝ exp[−ℓ/6].However, since the scales probed by HSC are much larger than those we measure in the KiDS-HOD mocks, we adjust the scale of the exponential decay to 1/500, to emulate the overall behaviour of the correlation function in the range of scales for this analysis.Note that for this validation test, we are not looking to closely fit for the seeing power spectrum, but rather to have an approximate idea of its spatial correlations as a function of scale.We can then use HEALPY (Zonca et al. 2019) to generate full-sky template overdensity maps with NSIDE = 512, and only use the pixels in the same area as our KiDS catalogs.
The above description was for the case of a single systematic.However, we will be interested in generating multiple systematics from different power spectra to introduce a realistic level of variety in the spatial correlations of the systematics.We use 5 different power spectra to represent 5 "families" of systematics, where some of the forms are similar to those used in Weaverdyck & Huterer (2021): 3 See available SLICS products at: http://cuillin.roe.ac.uk/ ~jharno/SLICS/ (ii)   ℓ ∝  − (ℓ/250) 2 (iii)   ℓ ∝ (ℓ + 1) −2 (iv)   ℓ ∝ (ℓ + 1) −1 (v)   ℓ ∝ (ℓ + 1) 0 Fig. 2 shows the correlation function for synthetic systematic maps generated using the above power spectra, as well as the HSC PSF area size correlation function, calculated from HSC PDR1, for comparison.Note that the full-sky maps generated using these power spectra are cut to match the area of the KiDS-HOD mocks, resulting in smaller available scales than those measured for HSC.We see that the overall shape of the correlation function differs for each systematic family, while still being consistent with the expected behavior we see from real data.For example, the flattening of the correlation function at smaller scales and exponential decay at larger scales is seen both the real data and in the synthetic maps.The actual scale on which the correlation function decline differs in the HSC data and the mocks by design, as described above.
This figure shows that despite the limited area coverage of the mock catalogs that drove us to use smaller scales than we would in a realistic analysis, the shape of systematic correlations across the scales used is non-trivial and depicts the complexity we can expect for real systematics.Maps can then be generated using a choice from these families, and in practice all maps are normalized to have equal variance.We can see that for a fixed variance, there will be different patterns in the spatial correlations for these families of power spectra.For example, family (iii) has less power for higher ℓ modes or smaller spatial scales than most of the other families, while family (v) can be identified as white noise.

Measuring Galaxy Clustering and Contamination Parameters
We estimate the galaxy two-point correlation function using the Landy & Szalay estimator (Landy & Szalay 1993) implementation in TreeCorr4 (Jarvis et al. 2004): where D and R represent the galaxy and random catalogs respectively, and DD, RR and DR are the number of galaxy pairs.In our specific implementation, the two-point function is calculated in 15 logarithmically separated bins in angle  between 0.06 arcmin and 30 arcmin, a choice made given the small area of our mocks.The pixel size (∼ 7 arcmin) used to estimate the contamination parameters is much larger than the smallest scales used to measure clustering here.The pixel size was chosen in the context of the scales used for current survey science using galaxy clustering (Rodríguez-Monroy et al. 2022).The reason we include scales much smaller than the pixel size in our analysis is due to the limited size of the mock catalogs (100 deg 2 ).Capping our analysis to scales commonly used for measuring clustering in surveys (∼ 5 − 200 arcmin) would have strongly limited the range of scales we can use for this analysis, given that larger scales are greatly affected by low-statistics noise.In addition, the large pixel scale does not cause an issue in our tests since this pixel size was self-consistently used both to construct the template maps and to correct for their contamination.For this reason we include smaller scales than the pixel scale in our analysis, and comment on the fidelity of the corrections in Sec.5.2.
Covariance matrices for the estimated two point functions are calculated using all 119 mocks.That is, the covariance of two angular separation bins is given by where  realizations = 119 in our case, the subscripts ,  refers to the angular separation bins   and   ,   is the two-point function of the th mock catalog, and w(  ) is the mean correlation function of bin  over all mock catalog realizations.
We estimate the contamination parameters described in equation (12) at the level of the one-point function (i.e., the PDF of   values) by sampling the parameter space using a Monte Carlo Markov Chain (MCMC) algorithm with emcee (Foreman-Mackey et al. 2013) and choosing the parameters corresponding to the maximum likelihood point.Even though parameters can be more efficiently obtained by using an optimizer (such as gradient descent), we use MCMC to obtain estimates of the parameter noise from single mock realizations (without the need to assume a Gaussian likelihood) and compare them to those from multiple mock realizations.The algorithm needs a user-specified number of walkers to explore the parameter space for a certain number of iterations.The number of walkers should generally be at least twice the number of parameters, while the number of iterations needed to converge might depend on the number of pixels used.For this work, we used 250 walkers and 1500 iterations.We evaluate convergence heuristically by running the MCMC for 5000 iterations, on a single mock realization, using the maximum number of parameters considered in this work.We confirm that the mean and variance of each parameter, calculated across all chains, stabilizes to the 5% level on average after 1500 iterations.We also confirmed that increasing the number of walkers or iterations did not considerably decrease the integrated autocorrelation time of the MCMC chain, which was approximately 90 for a 52 parameter inference.

Quantifying Cosmological Impact
We quantify the possible cosmological impact of residual contamination in the clustering signal by modeling the corrected estimate of the clustering signal as a function of the true signal modulated by an amplitude : where Δ =  − 1 quantifies the bias in the corrected clustering signal.This can be understood in the context of 3×2pt analysis, where clustering and galaxy-galaxy lensing are used to constrain the galaxy bias, as explained in subsection 2.1.The most relevant dependence in this context is that of  8 , a measure of clustering strength, which in the linear regime is related to the galaxy bias, , and galaxy clustering,   , by the relationship:   ∝  2  2 8 .Measuring any biases on the amplitude of   will determine how accurately the  8 combination can be measured.In our validation case of 119 mocks, the value of A can be obtained for every individual mock catalog using the corrected and true estimates of the two-point function by minimizing the  2 : We use the full range of scales from the measured clustering (0.06 -30 arcmin) in this fitting process.Doing this fit for every mock catalog, we can compute the average over all realizations, Ā, to obtain a measure of the bias.The covariance matrix C corr here is taken from all realizations of the corrected clustering signal and is calculated using equation ( 24).We are interested in the estimated bias on the corrected two-point function, Δ Ā = Ā − 1, as the success metric to quantify the significance of residual bias in the corrected two-point function.Note that using the realization-specific  true and  corr to estimate  for every mock catalog largely removes the impact of cosmic variance on , reducing the uncertainties in our analysis.

Summary: Step-by-Step Procedure
We summarize the previous subsections with a step-by-step procedure for generating and analyzing the data of a single mock catalog.
(i) Measure the true correlation function of the extragalactic galaxy catalog described in Sec.4.1 and generate pixelized maps with NSIDE = 512.
(ii) Generate  sys full-sky template maps using HEALPY from the desired selection of the 5 systematics families described in Sec.4.2.Mask the full-sky map to only cover the same area as the galaxy mock catalog.Compute the correlation function for all systematics by assigning each galaxy the corresponding template values for the pixel in which the galaxy is located.
(iii) From the pixelized versions of the galaxy field (  ) and systematics maps (  ), we generate the contaminated galaxy field ( δ ) using equation ( 12) from a chosen set of contamination parameters.We then calculate the contaminated clustering signal using equation ( 14).The solid lines show the results of maximum likelihood estimation fits for the distribution with a skewed Gaussian (green) or Gaussian (purple) likelihood.All contamination parameters are fixed to zero while fitting, while  is free to vary in both likelihood forms.The parameter  = 0 for the Gaussian likelihood but is free to vary for the skewed Gaussian.The dotted lines show the same fit after removing pixels with   ≥ 1.6 as potential outliers.
(iv) With the observed galaxy overdensity field and systematics maps as inputs, the contamination parameters are jointly estimated as prescribed in Sec.4.3 by using a MCMC algorithm with the chosen likelihood model (Gaussian or skewed Gaussian) and contamination model (additive, multiplicative, or combined).
(v) The contaminated clustering signal is corrected using equation (15).This procedure is followed for all 119 mock realizations, yielding an ensemble of results used to quantify decontamination quality, parameter estimation and noise, and cosmological impact.

RESULTS
In this section, we show the results of applying our method to mock galaxy catalogs.We will first apply our method to mock catalogs without any systematics, to explore differences between our use of a Gaussian or skewed Gaussian distribution for   (Sec.5.1).We will then test our systematics mitigation method in different scenarios varying in complexity to evaluate its performance.The outcome of that exercise, which will be detailed in subsections below, may be summarized briefly as follows: • Sec.5.2 shows that systematics mitigation performance is especially sensitive to choice of systematic model.
• The use of a nested contamination model (here defined as "Combined") allows flexibility in defining a model that can describe the multi-systematic case in Sec.5.3.
• Sec.5.4 shows that our method can robustly mitigate systematic contamination in the clustering signal to the sub-percent level, reducing the bias on the two-point function to below 0.1.

No Systematics
Before investigating systematics mitigation, we first explore the underlying distribution we are trying to model.The likelihood function in equation ( 18) is based on certain assumptions about the true   distribution.For the case where we cannot assume |  | ≪ 1, we know that the underlying distribution cannot be Gaussian, as   is mathematically bounded to be greater than −1 but can take any positive value.As seen in Fig. 3, we use the true   field from our mock catalogs (filled histogram) and fit its distribution using both a Gaussian and a skewed Gaussian likelihood function.For both likelihoods we fix any contamination parameters {, } to zero and allow  to vary, while we fix  = 0 for the Gaussian likelihood and let it vary for the skewed Gaussian.The solid lines in the left-hand panel show that the skewed Gaussian likelihood provides a better fit to the data than the Gaussian likelihood, given the skewed distribution of   values.However, we found that this difference was mostly due to potential outliers, which once removed (shown as the dashed lines) provided better fits to the distribution and agreement in the estimated variance.Physically the outliers represent regions of very high galaxy density.The cut   < 1.6 was made to test this.Note that in a real cosmologi- shows the uncorrected two-point function, with contamination of at most 2 per cent.Solid lines show the results after correction using a Gaussian likelihood for the MLE, while dashed lines show the correction using a skewed Gaussian likelihood.The different colors show different contamination models used: additive (grey, left panels), multiplicative (orange, left panels), and combined (blue, right panels).The shaded contours show the ±1 regions for their respective curves based on multiple realizations of mock catalogs.In both rows, we see that using the incorrect contamination model leads to residual biases after the systematics correction, while correction with the combined model results in equivalent performance to using the true contamination model.For both the true and combined model, the residual systematics are below the 0.1 per cent level.There is no significant difference in performance between the skewed and Gaussian likelihoods when using the combined systematics model.cal analysis, larger scales will be used and the variance will therefore be smaller.
It is important to highlight that this cut was made in the true   distribution, which we do not have in real data.Therefore, this cut is purely for illustration purposes and is not applied at any point in the analyses later in this section.However, it is important to point out that the fits with the Gaussian distribution were especially sensitive to these large   values.As we can see from the right panel of Fig. 3, the outliers contributed a considerable portion of the overall estimated variance despite being a small fraction (∼ 1%) of the pixels.In contrast, the skewed Gaussian likelihood seems to be less sensitive to outliers.This consideration is relevant because real data will have outliers for a variety of reasons (e.g., survey window complexity not represented in these mock catalogs).Our results indicate that the skewed Gaussian likelihood may be able to more accurately estimate the true variance of the   distribution and possibly improve estimates of the systematic contamination parameters modulating the variance, more specifically the  parameter in equation ( 11).

One Systematic
We next test the method for a scenario with a single source of systematic contamination.We consider separately the cases of a single additive and a single multiplicative systematic.The true overdensity map is contaminated using a single systematic template map generated from the power spectrum (i) described in Subsection 4.2 to produce the observed galaxy overdensity map.The observed twopoint function is calculated using equation ( 23).In each realization, we estimate the contamination using the parameters from the maximum likelihood estimation approach described in Subsection 3.3 by fitting for a single systematic map, then correct the two-point function using equation ( 15).We use different versions of the likelihood model in equation ( 18) to test different assumptions about the distribution of   and its contamination.In Fig. 4, we show the fractional difference between the uncorrected, corrected and true clustering signals to evaluate performance.For context, the systematic corre-lation function is approximately 2 orders of magnitude smaller than the signal.
As expected, assuming the wrong model for the systematic contamination can result in large residual biases even after applying our systematics mitigation scheme, while choosing the correct model results in removal of the systematic to below the 0.1 percent level.The combined model performs almost identically to choosing the correct model, without substantially increased uncertainties despite the small increase in model complexity (one additional parameter).The similarities in performance between the true and combined model suggest that the combined model is a viable alternative for use in real data.The choice of the likelihood form (Gaussian vs. skewed Gaussian) seems to have no significant impact on performance, so choosing the simpler Gaussian model is preferred in this case.However, given the generalized contamination model chosen, we felt it was worth revisiting the question of likelihood choice and test its impact directly.The marginal differences between likelihoods found here reinforces previous works' conclusions regarding the Gaussian approximation for systematics mitigation in galaxy clustering (Weaverdyck & Huterer 2021;Rezaie et al. 2021).
As described in Subsection 3.5, we emphasize the importance of debiasing the estimated model parameters prior to estimating the corrected correlation function.We have confirmed that skipping this step results in substantial overcorrection, increasing the size of the residual biases by an order of magnitude in this case.The severity of the degradation will depend on the size of the parameter estimation noise.We also note that the estimated correction is not degraded below the pixel size used to estimate the contamination parameters, confirming our choice to use scales below the pixel size for this validation.
We used likelihood ratio tests to see whether we can identify any model preference from the data, focusing on the choice of combined versus the simpler additive and multiplicative models.We compare the use of both distributions in Table 2 for each systematic model.The columns show in what fraction of all mock catalog realizations the likelihood ratio test could not reject the correct (and simpler) model.In practice, the inability to reject a simpler model can be used as a form of model selection in favor of that model.In general this test correctly identifies the contamination model in the single systematic case; when using a skewed Gaussian likelihood, it correctly identified the additive or multiplicative model 100% and 99% of the time, respectively.The Gaussian likelihood model resulted in 97% and 93% accuracy.Here, too, there are small differences between using either likelihood model, but both produce strong results overall.This is a positive sign that likelihood ratio tests work well for simple cases such as the single systematic contamination and can potentially be used for model selection.In practice we do not know what the true model is, so this tool can be used to compare systematic models.For a single systematic, the fitting can be done using all three contamination models used here (additive, multiplicative, and combined).Since the combined model serves as a nested model, the likelihood ratio for the additive and multiplicative models can be calculated to evaluate if either form is preferred over the other.
Finally, we note that our mock catalogs include a realistic cosmological clustering signal, meaning the values of galaxy overdensity in each pixel are correlated.Our results suggest that even though our likelihood model for   omits these correlations, this model misspecification does not affect the fidelity of the systematic correction.The favorable results for our method with multiple systematics, as shown in subsequent subsections, further support this conclusion.

Multiple Systematics
We now move to the treatment of multiple systematics.When dealing with real survey data (e.g., DES), a multitude of systematic templates (Rodríguez-Monroy et al. 2022) are considered during the mitigation process, motivating our extention to the case of multiple systematics.We use a non-trivial number of uncorrelated contaminants, 25, to test our method.We generate 5 independent templates from each of the 5 systematic families described in Sec.4.2, resulting in 25 independent templates.The contamination model used is chosen with a random number generator such that each systematic has equal chance of being additive or multiplicative, resulting in 10 additive and 15 multiplicative systematics for this analysis.The nonzero contamination parameters are chosen from a Gaussian centered at zero and standard deviation of 0.15.This particular value of the standard deviation is motivated by real observations of seeing in HSC and chosen to result in an approximate 10% contamination level at larger scales when using the systematic families described in Subsection 4.2.This level of contamination is reasonable considering the results of Rodríguez-Monroy et al. (2022).Using these choices we produce the observed galaxy overdensity field for each realization of the mocks using equation (12).We then fit for the contamination parameters jointly for all systematics using the combined, additive, and multiplicative model, as well as the true contamination model for comparison.To clarify, in the additive and multiplicative case all systematics are assumed to follow that specific model, while in the true contamination model, each systematic follows their true assigned model when fitting.
We compare performance of the different contamination models in Fig. 5.We show only the results using the Gaussian likelihood for the fits, as we found negligible difference when using the skewed Gaussian form.Both panels show that the combined model produces sub-percent mitigation of the contaminated two-point function (to a very small fraction of the statistical uncertainty) and equivalent performance to using the true contamination model within the statistical uncertainties.On the other hand, using the additive and multiplicative models consistently across all systematics results in biased corrections of the clustering signal (model mis-specification bias).We conclude, therefore, that assuming a single contamination model in the presence of systematics with distinct types of contamination can lead to biased estimates of the correlation function.In this case, the safest choice when the form of systematics is not known is to use the combined model for all of them.We note that the increased degradation at larger scales is due to the small area of the KiDS mocks.This was confirmed using larger simulated mocks described Sec.6.2, where this degradation is not present at these scales.We also verified that the combined model can still recover the truth and outperforms the simpler models in the case of no contamination ( =  = 0) for a random selection of templates.
Finally, we consider whether we can learn anything about the form of our contaminants (e.g., are some purely additive or multiplicative?) using our methodology.We avoid using likelihood ratio tests for this, as with multiple contaminants, comparing forms for each one is not as trivial as in the single systematic case.To see if we can learn something about the underlying contamination model we use the estimated parameters directly, i.e., the best-fitting values for {  ,   } from using the combined model.For example, if we estimate the contamination parameters and uncertainties for systematic   to be: { â = 0.5 ± 0.03, b = 0.01 ± 0.05}, we may infer that the contamination from this particular template is additive, as  and  differ significantly from each other, and  is consistent with 0 within the uncertainties.Fig. 6 shows a corner plot of a representative sample of estimated parameters constructed from the fitting in all 119 mocks for 6 of the 25 total systematics.The blue (additive systematic) and red (multiplicative systematic) vertical and horizontal lines show the true parameter value.We see that on average, the combined model can recover the true parameters within the 1 contours.Therefore, we can use the parameter estimates directly to learn about the forms of different systematics for which we have templates.Since this is done over multiple realizations, we use the parameter estimates for each mock to do this.However, in real data, we could use a bootstrap approach to generate similar cornerplots and study the contamina- for different assumed contamination models used when correcting the signal with 25 different systematics, including a mix of additive and multiplicative systematics.The first and second columns show the mean and error on the mean of  over all mocks.The uncorrected measurement is shown as a baseline for what our method needs to accomplish given the extremely statistically significant bias.Using the true contamination model (the correct choices of which systematics are additive or multiplicative) or the combined model both reduce the bias considerably (to below 0.1), while incorrectly applying the additive and multiplicative models to all templates results in a biased correction.The bias-variance trade-off can be observed by comparing both columns.The increased complexity of the combined model results in the highest variance, but in a considerable decrease on the bias when comparing with the additive and multiplicative models.
The fitting results shown here use a Gaussian likelihood, but we confirm the differences with the skewed Gaussian form are small and within the errors from each other.
tion model from the data empirically.Using corner plots based on the MCMC parameters of a single realization is not equivalent to those produced from multiple realizations, as the full parameter variability is not captured by the MCMC error estimates.We also note the tighter constraints on the  parameters compared to the  parameters.This discrepancy is due to the difference in constraining the shifts in the mean of the overdensity distribution (determined by the  parameters) versus shifts in the variance (determined by the  parameters).This can be understood statistically, as estimating the variance of the distribution intrinsically depends on estimating the mean.Any noise in estimates of the mean will propagate to estimates of the variance.
As shown here, the introduction of the combined model allows for joint modeling and correction of a set of clustering systematics that have different functional forms.Similarly, by construction it will allow for modeling of systematics that are neither purely additive nor purely multiplicative in overdensity.

Cosmological Impact
To quantify the potential cosmological impact of how our method performs with multiple systematics, we fit the uncorrected and corrected clustering signals (for different modeling choices) to a constant times the true signal, as expressed in equation ( 25).Following the prescription detailed in Sec.4.4, we calculate the estimated bias, Δ Ā, for different systematics contamination models in the multiple systematics case described above and report the results in Table 3.Although not shown, the differences between the estimated biases from using a Gaussian or skewed Gaussian likelihood are negligible, leading to a preference for the simpler Gaussian model.The biasvariance trade-off can be evaluated by comparing both columns in this table.As shown, the contaminated signal exhibits a few-percent bias that is highly statistically significant.Applying a correction using the true and the combined contamination models both reduce the bias to well below 0.1, with the caveat that the error on the mean from the combined model is approximately 1.4 times larger than that of the true model (increased variance).As expected, using the true contamination model is the optimal approach when it comes to reducing the bias without much increase in the variance.However, the combined model still dramatically outperforms the other contamination models and produces a bias consistent with zero, indicating that it is a sufficient correction for a mixture of different types of systematic contaminants for cosmological analysis.
On the other hand, assuming the additive and multiplicative models in the presence of a mixture of different types of systematics produces a significantly biased correction.The correction using the additive model performs substantially worse than with the multiplicative model by a factor of ∼ 4. We can see why from Fig. 5, where the multiplicative model does better than the additive at small scales and much worse at large scales.However, due to the smaller errorbars at small scales, the fitting of  using equation (26) favors the scales at which the multiplicative model performs better, which explains the lower values of Δ Ā for that model.The increased complexity of the combined model results in the highest variance, but in a considerable decrease on the bias when compared with the additive and multiplicative models.Thus, the bias-variance tradeoff in this case favors the combined model.

Effect of Correlated Systematics
So far we have only showed the performance of this method for uncorrelated systematics.However, we know that in reality template maps are often correlated with each other.To test the performance of this method for the case of correlated templates, we introduce different levels of in-class correlation between systematics templates5 .That is, in the process of map production using HEALPY, we provide off-diagonal elements that describe the correlation of all maps with each other.We only add correlation between maps that are produced from the same power spectrum (what we mean by in-class correlation), where the off-diagonal elements are  corr      ℓ if the maps   and   belong to the same class, and 0 otherwise.We consider this scenario for a range of values of  corr .
In our specific scenario, we are dealing with 5 systematic classes with 5 maps each.For our test, every map belonging to the same class will have the same level of correlation,  corr , with each other and 0 with all other maps.We then follow the method described in Sec.3.1 to deal with such cases.Fig. 7 shows the average bias in the corrected correlation function amplitude as a function of the in-class correlation.We see that in the presence of correlated template maps, there is no loss in performance.All points are consistent within 2 with an unbiased correction of the two-point function.In addition, the results with the Gaussian and skewed Gaussian likelihoods are once again not statistically significant.

Comparison with other methods
Our methodology has some features in common with previous work described in Sec.2.3, but also some distinct differences.Unlike the methods of Mode Projection (Rybicki & Press 1992) and Template Subtraction (Ross et al. 2011), we explicitly account for multiplicative terms in the one-point function describing the observed galaxy overdensity in equation (11).It is common practice to neglect the third term in this equation and consider only the first order term.However, we show in this work how neglecting the presence of multiplicative bias can lead to non-negligible biases in the corrected correlation functions if they are present.
The method presented in this paper also considers distinct parameters contributing to the additive and multiplicative contamination to the galaxy overdensity, allowing for empirical determination of the systematics model, rather than fitting a specific model (e.g., additive or multiplicative).This not only allows for the possibility of multiple sources of contamination for a given template, but also allows us to shows the variance for the  parameters (which modify its variance).As we can see, there is little difference between the estimated noise using either likelihood on the  parameters.However, the Skewed Gaussian has lower parameter noise for the  parameters.We attribute this to the Skewed Gaussian's better recovery of the distribution's variance when we do not remove outliers.In addition, we see that the variance recovered from the MCMC posteriors vastly underestimates the parameter noise.Bottom: The effect of miscalculating the parameter noise on the two-point function correction.Possible parameters are drawn from a Gaussian centered at the true parameter value and with variance represented by the different colored lines.The correction to the observed correlation function is done using the estimated noise.The x-axis shows the ratio of the estimated noise to the true noise for both the  and  parameters, while the y-axis shows the bias in the corrected two-point function.We see that under-(or over-) estimating the parameter noise causes the two point function to be over-(or under-) corrected.However, the effect of the bias depends on the size of the noise itself.Having high parameter noise means the analysis is more sensitive to accurately estimating the noise, while the opposite is true at low parameter noise.
directly determine the functional form of the systematic contamination.The contamination parameters are also fitted for jointly, rather than independently for every systematic.This differs from iterative methods, which clean contaminants one by one using a form of linear regression.These methods rely on null hypothesis testing during the iterative process to avoid biased mitigation of contaminants.Other methods like ElasticNet incorporate the impact of multiplicative effects by using a two-step process to update the covariance for the assumed multiplicative parameters.In contrast, our method deals with this problem by taking into account all systematics simultaneously in the fitting process and self-consistently incorporating the multiplicative effects into the likelihood.This joint estimation enables the discrimination of additive and multiplicative templates directly.
In our current implementation of the method, we chose to mitigate the systematics directly at the two-point level, which differs from the iterative regression (Elvin-Poole et al. 2018) and ElasticNet methods (Weaverdyck & Huterer 2021) that mitigate at the map level either through map cleaning or pixel weights.In principle, our method allows for mitigation at the map level through cleaning of the observed overdensity map using equation ( 12).One common feature The different line colors represent different systematic classes.The triangle points show the variance from using 100 realizations of mock catalogs, while the circular points show the variance estimated using the block bootstrap on a single realization.The errorbars show the error on the mean, and the lines (varying styles) track the results from mocks.We see that the bootstrap approach can effectively estimate the parameter noise for higher sky area coverage and low noise parameters, while it fails for small surveys with high parameter noise.among corrections at these different levels is that all require a noise debiasing process for the reasons explained in Sec.3.5.Regarding the use of pixel weights, the last term in equation (1) when used with combined model prevents us from being able to cleanly express the contamination as a weighted correction for each galaxy.We acknowledge that mitigating at the two-point level requires calculation of the auto-correlation function for all systematics of interest, which is not as simple as using pixel weights.However, it is still possible to perform a pixel-weighted correction, similar to that of iterative methods or ElasticNet, if the parameter fits agree with an additive or multiplicative model.In that case, the weights can be calculated for all templates   with contamination parameter    for any given pixel  as: Finally, our method allows for fitting the   distribution using a Skewed Gaussian likelihood, which more closely resembles the real distribution.Even though we do not see any significant difference in systematics mitigation between methods after debiasing the parameters, we still provide an alternative fit to a distribution commonly assumed to be Gaussian for systematics treatment(see examples of non-Gaussinity in Rezaie et al. 2021Rezaie et al. , 2023)).We emphasize the non-Gaussianity of the   distribution in the fits for the true distribution in Sec.5.1.There, the parameter that describes skewness was estimated to be  ≈ 5.1 ± 0.2, which represents a highly non-Gaussian distribution.We note that the specific level of non-Gaussianity will heavily depend on the pixelazion scheme and galaxy density.Despite the high statistical significance with which the non-Gaussianity of the   distribution is detected, the effectiveness of systematics mitigation for the Gaussian likelihood indicates that for our scenario, model mis-specification of a non-Gaussian distribution as a Gaussian is not a problem when parameter noise is known or can be estimated accurately.

Practical applications on real data
Although our demonstration of the method used simulated data, the method is easily transferable to real LSS data.One open question is the choice of the pixel scale in relation to the characteristic scale of the systematics.More specifically, what pixelization is sufficient to properly characterize the systematic maps and mitigate contaminants?This work has produced and measured synthetic contamination at the same pixel scale; however, systematic template maps may be produced at finer scales in real data.The impact of pixel scale choices on the corrected clustering signal for systematics maps with varying characteristic scales was not explored here, but should be considered when working with real data.
We have also emphasized that it is important to quantify the noise in the estimated systematics parameters in order to accurately mitigate the contamination.This work used multiple realizations of mock catalogs for this purpose.For surveys that have mock catalogs with realistic clustering, such an approach would transfer over directly.However, not all surveys will have suitable mock catalogs.To apply the method without many realizations of mock catalogs, we propose the use of data-driven methods to estimate the noise in the systematics parameters.
Before applying data-driven methods to estimate the noise in the estimated systematics parameters, it is important to understand exactly what the noise depends on and how to evaluate its impact on systematics mitigation.The noise can depend on many factors, including the sky area coverage, the   distributions and power spectrum, and the form of the likelihood used to analyze the δ distribution (including the number of parameters being fitted).For example, the top panel of Fig. 8 illustrates how the noise level in the inferred systematics parameters can vary based on the form of the likelihood (blue and orange points represent results with a Gaussian and skewed Gaussian likelihood, respectively) and on the spatial power spectra used to generate the systematic maps, normalized to a fixed variance in   .As we can see, the noise can differ greatly depending on the power spectrum used, with some forms increasing the estimated noise variance by almost an order of magnitude.The choice of a Gaussian or skewed Gaussian likelihood does not significantly affect the noise in the  parameters, which control shifts in the mean of the distribution.However, the choice has a substantial impact on the noise in the estimated  parameters, which modulate its variance.Therefore, even though the choice of the likelihood (Gaussian or skewed Gaussian) does not significantly affect the systematics mitigation outcome, it is important to note that the noise estimates differ.This could be relevant in the scenario where the parameter variance cannot be accurately measured and corrected for, in which case it is helpful to adopt a model with smaller variance since the bias correction would be less sensitive to it.However, we found no correlation between the noise variance in the estimated systematics parameter and the true value of the parameter.
In the absence of mock catalogs, we propose the use of the block bootstrap method to estimate the parameter noise.Block bootstrap means that the pixelated maps are divided into approximately equal area patches, where each bootstrap sample represents a version of the maps made from a random selection (with replacement) of the patches.This allows us to create multiple realizations from a single map and estimate the parameter noise one would get from having multiple independent mocks.We wish to test this data-driven approach on higher and more realistic area coverage than our current KiDS mocks.To do so we use the same mechanism to generate systematic template maps from the set of 5 systematic classes described in the multiple systematic case in Sec.5.3.In order to obtain different area coverage   maps, we used version 2.3.0 of CCL6 (Chisari et al. 2019) to generate galaxy-galaxy power spectra using a vanilla ΛCDM cosmological model (Ω  = 0.3, Ω  = 0.05, Ω Λ = 0.7,  8 = 0.8,   = 0.96) with galaxy bias equal to 1 (  =   ) to produce Gaussian galaxy overdensity maps.This cosmological model is similar (while not identical) to that used in KiDS mocks, with the exception that our galaxy bias is fixed to 1.We use a Gaussian () with mean redshift   = 0.3 and   = 0.05 to resemble the redshift range used for the previous validation with the KiDS mocks.We choose 4 size configurations: 100 deg 2 , 400 deg 2 , 1000 deg 2 , and 2000 deg 2 , to show how this approach works at various sky area coverages.Although not identical, the produced mocks provide a realistic test of the block bootstrap method for estimating parameter noise.We wish to compare the parameter noise estimation using this bootstrap approach with estimates using multiple realizations.We transformed the Gaussian maps into log-normal maps to more accurately portray a realistic underlying distribution.This is done by taking some Gaussian map , shifting it using its variance:  =  − 0.5 * Var[], and transforming the shifted map to a log-normally distributed one:  = exp() − 1.For the tests described here the shift parameter is 0.5 * Var[] = 0.025.
In this specific example we use 100 bootstrap-resampled datasets (meaning 100 distinct versions of a map from one realization), with 10 block bootstrap patches.We confirm that this number of realizations leads to 5%-level convergence of the parameter noise.The number of resampled datasets and patches should be chosen based on the number of parameters estimated and scales at which corre-lations between pixels is small.In this particular test we estimate 51 parameters and correlations become negligible at ∼ 1.5 degrees, which motivates our chosen configuration.
In Fig. 9 we compare the accuracy of parameter noise estimation by comparing a bootstrap approach on a single mock catalog realization against our previous approach using multiple realizations of mock catalogs as a function of sky area coverage.As we can see from the figure, the noise is approximately proportional to the inverse of the survey area.The figure also shows that using the block bootstrap on a single dataset can be a good alternative to using multiple realizations of mock catalogs.The bootstrap estimates are quite good for higher area mocks, and for small area mocks with lower levels of parameter noise.However, the approach fails for small area surveys (∼100 deg 2 ) with high parameter noise associated with systematics maps that have a lot of small-scale power.
In addition, the bootstrap estimates can depend on of the number of patches chosen.Creating patches that are too small might fail to capture large spatial correlations between template maps.Conversely, choosing very few large patches can result in challenges in inferring a significant number of systematics parameters and their variances.Our results show, however, that it is possible to achieve a balance that reliably estimates the systematics parameter noise for many realistic survey scenarios.
Note that we cannot simply take the parameter noise variance directly from a single realization by looking at the posterior generated from the MCMC samples.Because of our simple adopted likelihood model, which neglects correlations in overdensity between pixels, the variance of the posterior coming from the MCMC samples does not reflect sources of uncertainty such as cosmic variance.Using the MCMC sample posterior noise will produce underestimates of the parameter noise, as seen in Fig. 8, and therefore lead to overcorrection of the two-point function.

CONCLUSIONS
In this paper, we presented a new method for characterizing and mitigating systematic contaminants in in galaxy overdensity fields.We used a generalized and flexible contamination model that allows for multiple systematics to contaminate the overdensity field in additive, multiplicative, or 'combined' forms.This modelling choice allows us to quantify contamination without assuming that all systematics follow one particular form, and in addition helps us empirically determine the functional dependence of systematics on the galaxy overdensity.Using the generalized model can reduce model biases, producing unbiased estimates of the two-point correlation function in the cases where different systematics take different forms captured by our model.The ability to determine the functional form of the contamination may provide insight into their origin, so they can be reduced or entirely avoided in future.
As part of our approach, we analyze the one-point function of galaxy overdensities within pixels to determine the systematics contamination model, then use that to correct the two-point functions.In addition, we introduced a skewed Gaussian likelihood function to represent the galaxy overdensity distribution in an attempt to more closely model the true underlying conditional probability function.However, there were no significant differences in the quality of systematics mitigation between the skewed Gaussian and Gaussian likelihood.Therefore, we recommend using the simpler Gaussian likelihood.
The method was tested and validated using KiDS-like mock galaxy catalogs with realistic clustering, and mock systematic templates maps.Template maps with different systematic 'families' were produced to mimic the level of complexity we can expect to find in real data.Key outcomes of our tests were as follows: (i) Contaminating the true   with one template map in two different ways (multiplicatively and additively), we found that applying the wrong model when mitigating systematics can result in significant residual biases after correcting the two-point function, while our proposed, more general model produced unbiased corrections and has comparable performance to the case where the true contamination model is known.We also showed that simple likelihood ratio tests can successfully identify the correct contamination model.Both the Gaussian and skewed Gaussian likelihood model for the overdensity field result in robust systematics mitigation, leading us to prefer the simpler Gaussian model for correction purposes even though it does not describe the overdensity field particularly well.
(ii) Multiple systematics: We contaminated the overdensity field with a non-trivial number of contaminants (25) with different power spectra, of which some were additive and some multiplicative systematics.We showed that the combined model once again produced unbiased estimates of the clustering signal, while assuming a single contamination model for all systematics produced biased corrections in this mixed case.In addition, we found that we can accurately determine the functional form of the systematic contamination for the individual systematics.
Finally, we considered the practical applications of the methods presented here in real data.The main challenge is to accurately estimate the systematics model parameter noise needed to obtain unbiased corrected clustering signals.Since this work used mock catalogs, parameter noise could be estimated well from multiple mock catalog realizations.However, this will pose a challenge when the method is applied to real data, as mock catalogs are not always available and are computationally expensive.To overcome this challenge, we proposed a block bootstrap approach for estimating the parameter noise from a single realization of both the galaxy density field and systematic maps.We showed that the estimated parameter noise using the block bootstrap is comparable to the estimates using multiple mock catalog realizations in all cases except for small area (∼100 deg 2 ) with significant small-scale power in the systematics template maps.This is therefore a promising approach in applying this method to real data in future work.
Applications of this method on current imaging surveys would allow us to directly compare the results with commonly used mitigation techniques on real data.In addition, empirical determination of the functional form of real systematics could help us learn how systematics bias the observed galaxy overdensity field.Recent analyses like the DES Y3 clustering analysis (Rodríguez-Monroy et al. 2022), where concerns emerged regarding large-scale structure systematics, are an interesting and highly relevant context for direct application of the method developed in this work, as a stepping stone towards future surveys.

Figure 1 .
Figure 1.Schematic illustrating examples of additive and multiplicative systematics on the galaxy density field.Yellow stars and blue ellipses represent stars and galaxies, respectively.The left panel shows the true galaxy and star locations, while the right panel represents the measured locations.The regions where the measured and true galaxy distributions differ are circled on both panels, with the color of the circle indicating the type of systematic that is generated.An example of a multiplicative systematic is the unrecognized blending of two or more distinct galaxies in the observed distribution, resulting in a multiplicative effect to the observed number density.An example of an additive systematic is whenstars aremisidentified as galaxies in the measured distribution.This illustration shows how multiplicative systematics depend on the galaxy density field while additive ones are independent of the galaxy distribution.
validate our methodology we use the KiDS-like Halo Occupation Distribution (HOD) -body simulations (KiDS-HOD hereafter) produced by Harnois-Deraps et al. (2018).This set of galaxy mock catalogs is produced from the wider set of SLICS (Scinet LIght Cone Simulations) described in Harnois-Déraps & van Waerbeke (2015).The 1025 -body simulations used in SLICS are produced by the gravity solver CUBEP 3 M (Harnois-Déraps et al. 2013), where each run follows 1536 3 particles inside a grid cube of comoving side length  box = 505 ℎ −1 Mpc.

Figure 2 .
Figure2.Example of the correlation function for systematics generated through synthetic map-making (solid lines) using one of the 5 systematics families (different colors) described in Subsection 4.2 and for the PSF area size residuals in HSC (black points).For visualization purposes, the correlation functions are normalized by their value at  = 1 arcmin.Note that the solid lines are measured on scales appropriate for the analysis done using KiDS-like mocks, which has considerably less area than HSC, resulting in the elimination of larger scales.As we can see, the overall form of the twopoint function from our synthetic maps varies amongst different systematic families, albeit with some similarities between certain families.Comparing the solid lines with the real data, we see that the overall exponential behavior of the correlation function and flattening at small scales is similar amongst the systematic families and real data, with the obvious difference that the exponential decay occurs at different scales by design.

Figure 3 .
Figure 3. Filled histogram shows the distribution of true galaxy overdensity   for one systematics-free mock catalog from the KiDS-HOD galaxy simulation.The solid lines show the results of maximum likelihood estimation fits for the distribution with a skewed Gaussian (green) or Gaussian (purple) likelihood.All contamination parameters are fixed to zero while fitting, while  is free to vary in both likelihood forms.The parameter  = 0 for the Gaussian likelihood but is free to vary for the skewed Gaussian.The dotted lines show the same fit after removing pixels with   ≥ 1.6 as potential outliers.

Figure 4 .
Figure 4. Fractional difference between the corrected and true two-point correlation functions in the presence of a single additive (top row) or multiplicative (bottom row) systematic contaminant.Right and left panels show the same success metric but are separated for visualization clarity.The black dash-dotted line shows the uncorrected two-point function, with contamination of at most 2 per cent.Solid lines show the results after correction using a Gaussian likelihood for the MLE, while dashed lines show the correction using a skewed Gaussian likelihood.The different colors show different contamination models used: additive (grey, left panels), multiplicative (orange, left panels), and combined (blue, right panels).The shaded contours show the ±1 regions for their respective curves based on multiple realizations of mock catalogs.In both rows, we see that using the incorrect contamination model leads to residual biases after the systematics correction, while correction with the combined model results in equivalent performance to using the true contamination model.For both the true and combined model, the residual systematics are below the 0.1 per cent level.There is no significant difference in performance between the skewed and Gaussian likelihoods when using the combined systematics model.

Figure 5 .
Figure 5. Top panel: Fractional difference between the average corrected and true two-point function in the presence of 25 systematic contaminants, 10 of which are additive and 15 multiplicative.The vertical symlog scale shows cases of possible over/under correction.The black dash-dotted line shows the uncorrected two-point function, with contamination reaching a maximum of ∼10 per cent.Solid lines show the estimated correction using a Gaussian likelihood for the MLE and the combined (blue), additive (grey), multiplicative (orange) or true (pink) contamination model.The shaded contours show the 1 regions of their respective lines.Bottom panel: The difference between the corrected and true two-point function divided by the standard deviation of the true clustering across all realizations for each separation bin.Both panels show that the combined model robustly mitigates contamination to the sub-percent level with respect to the clustering signal and well within 1 on all scales, and on average is equivalent to using the true contamination model.The performance of the true and combined models is consistent within the statistical uncertainties.

Figure 6 .
Figure 6.Each panel shows a density plot of the estimated contamination parameters, {  ,   }, using a Gaussian likelihood, from all 119 mock catalog realizations.In other words, every point on a particular panel represents the maximum likelihood estimate from all MCMC samples of a single mock catalog.The contour regions show the 1 (68%) and 2 (95%) regions.The 6 systematics shown (out of a total of 25 contaminants) form a representative sample of the different systematic families (shown by the scaling of their power spectra,   ℓ ) used to generate the systematic maps.The vertical and horizontal solid lines show the true parameters used for contamination, with blue (red) representing additive (multiplicative) systematics.The dashed blue line shows the zero point of the  parameter, where an additive systematic should live in parameter space.We see that across all panels the true parameters fall within 1 regions of the estimated parameters and we can use the inferred parameters to identify additive and multiplicative systematics.

Figure 7 .
Figure7.The average bias on the corrected clustering signal over all 119 mock catalog realizations as a function of the correlation coefficient between systematic templates within a given power spectrum class.The solid points show this metric for the corrected two-point function using the combined contamination model and either a Gaussian (black) or skewed Gaussian (green) likelihood in the MLE.The errorbars show the error on the mean.This figure shows that the excellent performance of the method remains unaltered even by relatively high levels of correlation between systematic template maps.

Figure 8 .
Figure8.Top: Average parameter estimation variance for each systematics class used to generate contaminating templates.The variance for each estimated parameter is calculated over all 119 realizations of the KiDS-HOD mocks for each class of systematic.Since we use 5 systematics per class, the points show the average noise from the 5 parameters corresponding to each systematic contaminant from the mocks.The crosses show the parameter variance taken from the MCMC posteriors for comparison.The different colors show the likelihood form used, either a Gaussian (blue) or a Skewed Gaussian (orange).The errorbars are the errors on the mean.The left panel shows the variance on the  parameters (which modify the mean of the observed   distribution), while the right panel shows the variance for the  parameters (which modify its variance).As we can see, there is little difference between the estimated noise using either likelihood on the  parameters.However, the Skewed Gaussian has lower parameter noise for the  parameters.We attribute this to the Skewed Gaussian's better recovery of the distribution's variance when we do not remove outliers.In addition, we see that the variance recovered from the MCMC posteriors vastly underestimates the parameter noise.Bottom: The effect of miscalculating the parameter noise on the two-point function correction.Possible parameters are drawn from a Gaussian centered at the true parameter value and with variance represented by the different colored lines.The correction to the observed correlation function is done using the estimated noise.The x-axis shows the ratio of the estimated noise to the true noise for both the  and  parameters, while the y-axis shows the bias in the corrected two-point function.We see that under-(or over-) estimating the parameter noise causes the two point function to be over-(or under-) corrected.However, the effect of the bias depends on the size of the noise itself.Having high parameter noise means the analysis is more sensitive to accurately estimating the noise, while the opposite is true at low parameter noise.

Figure 9 .
Figure 9.Comparison of the results of estimating systematics parameter noise using multiple realizations of mock catalogs versus a block bootstrap approach on a single mock catalog realization.The figure shows the variance in the estimated parameters (scaled by the area) as a function of the sky area coverage.The different line colors represent different systematic classes.The triangle points show the variance from using 100 realizations of mock catalogs, while the circular points show the variance estimated using the block bootstrap on a single realization.The errorbars show the error on the mean, and the lines (varying styles) track the results from mocks.We see that the bootstrap approach can effectively estimate the parameter noise for higher sky area coverage and low noise parameters, while it fails for small surveys with high parameter noise.

Table 1 .
KiDS-HOD mocks redshift binning used and galaxy number density information.Since we have 119 realizations and one random catalog, number and density information on the mock catalogs is given as an average over all realizations.

Table 2 .
Model comparison using likelihood ratio tests for single systematic cases.The columns show in what fraction of all mock catalog realizations the likelihood ratio test preferred the correct (and simpler) model over the combined model.

Table 3 .
Comparison of the estimated bias on the two-point function( corr