Magnification bias in galaxy surveys with complex sample selection functions

Gravitational lensing magnification modifies the observed spatial distribution of galaxies and can severely bias cosmological probes of large-scale structure if not accurately modelled. Standard approaches to modelling this magnification bias may not be applicable in practice as many galaxy samples have complex, often implicit, selection functions. We propose and test a procedure to quantify the magnification bias induced in clustering and galaxy-galaxy lensing (GGL) signals in galaxy samples subject to a selection function beyond a simple flux limit. The method employs realistic mock data to calibrate an effective luminosity function slope, $\alpha_{\rm{obs}}$, from observed galaxy counts, which can then be used with the standard formalism. We demonstrate this method for two galaxy samples derived from the Baryon Oscillation Spectroscopic Survey (BOSS) in the redshift ranges $0.2<z \leq 0.5$ and $0.5<z \leq 0.75$, complemented by mock data built from the MICE2 simulation. We obtain $\alpha_{\rm{obs}} = 1.93 \pm 0.05$ and $\alpha_{\rm{obs}} = 2.62 \pm 0.28$ for the two BOSS samples. For BOSS-like lenses, we forecast a contribution of the magnification bias to the GGL signal between the multipoles $\ell = 100$ and $\ell = 4600$ with a cumulative signal-to-noise ratio between $0.1$ and $1.1$ for sources from the Kilo-Degree Survey (KiDS), between $0.4$ and $2.0$ for sources from the Hyper Suprime-Cam survey (HSC), and between $0.3$ and $2.8$ for ESA Euclid-like source samples. These contributions are significant enough to require explicit modelling in future analyses of these and similar surveys.


INTRODUCTION
Over the last few decades, weak gravitational lensing has become a powerful tool to directly measure the matter distribution of the late Universe, while allowing for the inference of the cosmological parameters which govern it. Surveys such as the currently ongoing Kilo-Degree Survey (KiDS, Kuijken et al. 2015), the Dark Energy Survey (DES, Flaugher et al. 2015) and the Hyper Suprime-Cam Subaru Strategic Program (HSC SSP, Aihara et al. 2018) have become increasingly limited by systematics rather than statistics as ever-growing sample sizes reduce uncertainties. The impact of the systematics will become even more pronounced for the next generation of surveys, e.g. Euclid (Laureijs et al. 2011), the Vera C. Rubin Observatory Legacy Survey of Space and Time (LSST, Abell et al. 2009), and the Nancy Grace Roman Space Telescope (also known as WFIRST, Spergel et al. 2015). For this reason, recent efforts have focused on improving our physical understanding of often neglected phenomena which can influence cosmological parameter inference based on shear and clustering measurements. These effects include intrinsic galaxy alignments (Kiessling et al. 2015; Kirk et al. 2015; Troxel & Ishak 2015) and magnification (Hildebrandt et al. 2009; Duncan et al. 2014; Hildebrandt 2016; Unruh et al. 2019; Thiele et al. 2020). In this paper, we focus on the magnification effects.
While the magnification due to gravitational lensing partially manifests itself as a change in the angular diameter of an object, it also changes the observed solid angle of a field with respect to the intrinsic solid angle. This can affect the observed galaxy counts and their fluxes, leading to what is commonly known as a magnification bias, which has been detected in the past by Chiu et al. (2016) and Garcia-Fernandez et al. (2018). It is important to note that this affects the counts of both source galaxies and lens galaxies, such that the magnification due to large-scale structure can also change the shear-clustering cross-correlations (galaxy-galaxy lensing, GGL) and the clustering measurements (Hui et al. 2007).
We break down the magnification effect into two separate phenomena: flux magnification and lensing dilution. The first is caused by an increase or decrease in the flux observed from a source due to gravitational lensing, which can push otherwise unobserved galaxies over the flux limit or push galaxies that would otherwise pass the flux limit out of the observational window. At the same time, lensing dilution increases or decreases the number of observed sources within a certain area of the sky by (de-)magnifying the solid angle behind the gravitational lens. The magnification effect can be measured directly from changes in the apparent size and magnitude of lensed galaxies (Schmidt et al. 2011) or by comparing the observed galaxy effective radii to the intrinsic radii derived from their surface brightness and stellar velocity dispersion (Huff & Graves 2013). Nonetheless, it is most commonly measured through the bias in the observed number density of sources (Scranton et al. 2005). Since this bias directly contributes to the clustering and GGL signal, we rely on this approach in our analysis.
The constraining power of weak lensing samples is constantly growing (Troxel & Ishak 2015; Hikage et al. 2019; Asgari et al. 2020) by including additional measurements (Abbott et al. 2018, 2019b) and through joint analyses between different surveys, for example the recent joint analysis of KiDS-1000 with BOSS (the methodology is described in Joachimi et al. 2020 and the results are shown in Heymans et al. 2020 and in Tröster et al. 2020). In all these analyses, the understanding of the systematics is becoming a priority. One potential systematic could arise from unaccounted magnification biases in the clustering signal of non-flux-limited spectroscopic surveys such as BOSS (Dawson et al. 2012) or DESI (https://www.desi.lbl.gov; Aghamousa et al. 2016), or color-selected photometric samples such as DES redMaGiC (Rozo et al. 2016) or luminous red galaxy (LRG) samples (Vakili et al. 2020), thus also biasing the GGL correlations with the shear signal from weak lensing surveys. This paper aims to provide a method for estimating the magnification bias for surveys with complex sample selection functions which are not purely flux/magnitude-limited. We use the standard framework for estimating the magnification bias from observables in flux-limited surveys as a basis for the parametrisation of a semi-empirical model for non-flux-limited surveys. This model is then tested by comparing the estimates for the magnification bias in BOSS observations (Dawson et al. 2012) to the estimates from MICE2 cosmological simulations. We then use our results to forecast some of the potential biases which could be induced in a joint analysis of KiDS-1000 or HSC Wide with BOSS and a Euclid-like survey with a DESI-like survey.
This article is structured in the following manner. In section 2, the theoretical background is described. In section 3, we provide an outline and presentation of our methods and simulations. The magnification bias estimates from a BOSS-like galaxy population are presented in section 4. The forecasts for current and future joint analyses are found in section 5. Lastly, we conclude the paper and provide an outlook in section 6. Appendix A repeats the analysis shown in section 4 for a magnitude-limited galaxy sample.


THEORETICAL BACKGROUND

Magnification bias for flux-limited surveys
As described in the review by Bartelmann & Schneider (2001), a lensed population of galaxies with a cumulative galaxy count $n$ at redshift $z$, given a flux limit of $f$, can be described in terms of the unlensed population, $n_0$, as

$$ n(>f, z) = \frac{1}{\mu(z)}\, n_0\!\left(>\frac{f}{\mu(z)}, z\right), \tag{1} $$

where $\mu(z)$ is the magnification for a redshift $z$. Here, the $1/\mu(z)$ factor accounts for the dilution of galaxies due to magnification. The unlensed population has been observationally shown to be similar to a power law in flux (in particular, for faint galaxies) given by

$$ n_0(>f, z) = A f^{-\alpha}\, p_0(z; f), \tag{2} $$

where $A$ and $\alpha$ parametrise the power law and $p_0(z; f)$ is the redshift probability distribution of the galaxies. Taking the ratio of these two populations, assuming that we can approximate $\mu(z)$ with the magnification of a fiducial source at infinity (which should hold mainly at low redshifts, Bartelmann & Schneider 2001) and integrating over redshift, we get the following expression:

$$ \frac{n(>f)}{n_0(>f)} = \mu^{\alpha - 1}. \tag{3} $$

If $\alpha \approx 1$, we can see from equation (3) that the magnification bias would vanish (with slight deviations from this depending on the redshift range). The magnification can be related directly to the local surface density in the weak lensing limit ($|\kappa| \ll 1$, $|\gamma| \ll 1$) with $\mu \approx 1 + 2\kappa$ (Broadhurst & Lehár 1995). Therefore, one can relate $\kappa$ to the relative difference between the magnified and the unmagnified galaxy populations and the exponent of the flux power law with

$$ \frac{n(>f) - n_0(>f)}{n_0(>f)} \approx 2(\alpha - 1)\,\kappa, \tag{4} $$

where $\alpha$ is the same as the $\alpha$ in equation (3) in the weak lensing limit. When analysing samples with a complex selection function, equation (4) does not necessarily apply anymore. Nonetheless, we use the parameter $\alpha$ as an analogue to estimate the magnitude of the magnification bias in a given galaxy sample.
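As a quick numerical illustration of the weak-lensing linearisation, the following Python sketch compares the exact power-law count ratio $\mu^{\alpha-1}$ with the first-order excess $2(\alpha-1)\kappa$. The function names and the example values of $\kappa$ and $\alpha$ are illustrative assumptions, not quantities taken from this work.

```python
def count_ratio(mu, alpha):
    """Ratio n(>f)/n0(>f) = mu**(alpha - 1) for power-law cumulative counts."""
    return mu ** (alpha - 1.0)


def linearised_excess(kappa, alpha):
    """First-order relative count excess, (n - n0)/n0 ~ 2 (alpha - 1) kappa."""
    return 2.0 * (alpha - 1.0) * kappa


# Hypothetical example values (assumptions for illustration only).
kappa = 0.01
alpha = 2.5

mu = 1.0 + 2.0 * kappa                  # weak-lensing magnification
exact = count_ratio(mu, alpha) - 1.0    # exact relative excess for the power law
approx = linearised_excess(kappa, alpha)

print(f"exact excess  = {exact:.5f}")
print(f"linear excess = {approx:.5f}")
```

For a slope of exactly $\alpha = 1$ both expressions vanish, which is the cancellation between flux magnification and dilution noted in the text.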

Estimating the magnification bias in flux-limited surveys
By considering equations (1), (2), and the definition of magnitude as a function of flux, one can derive that $\alpha_{\rm obs}$ can be determined from the differential galaxy count $n(m)$ over a given band magnitude range from $m$ to $m + {\rm d}m$ as follows (Binggeli et al. 1988; Bartelmann & Schneider 2001; Hildebrandt et al. 2009),

$$ \alpha_{\rm obs}(m) = 2.5\, \frac{{\rm d}}{{\rm d}m} \left[ \log_{10} n(m) \right]. \tag{5} $$

This $\alpha_{\rm obs}$ near the faint end of the galaxy population is considered as an effective luminosity function slope if it is consistent with the value given by equation (4). Therefore, by estimating the luminosity function slope, $\alpha$, through the observed $\alpha_{\rm obs}$, one can estimate the systematic effects that may be introduced to galaxy number counts through the magnification bias, and therefore the systematics affecting the clustering and GGL signals derived from this observable.
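The slope estimate from number counts can be sketched in a few lines of Python. The example below evaluates $2.5\,{\rm d}\log_{10} n(m)/{\rm d}m$ with a central finite difference on binned counts; the binning, the finite-difference scheme and the synthetic power-law counts are assumptions made for illustration and do not reproduce the pipeline used in this work.

```python
import math


def alpha_obs_from_counts(bin_centres, counts):
    """Return (m, alpha_obs) pairs from binned differential counts n(m).

    Uses a central finite difference of log10 n(m); bins whose neighbours
    are empty are skipped because the logarithm is undefined there.
    """
    estimates = []
    for i in range(1, len(counts) - 1):
        if counts[i - 1] <= 0 or counts[i + 1] <= 0:
            continue
        dlog = math.log10(counts[i + 1]) - math.log10(counts[i - 1])
        dm = bin_centres[i + 1] - bin_centres[i - 1]
        estimates.append((bin_centres[i], 2.5 * dlog / dm))
    return estimates


# Synthetic counts following n(m) proportional to 10**(0.4 * alpha * m),
# i.e. a pure power law in flux with a known slope (assumed test input).
alpha_true = 2.0
centres = [18.0 + 0.1 * k for k in range(20)]
counts = [10.0 ** (0.4 * alpha_true * (m - 18.0)) for m in centres]

slopes = alpha_obs_from_counts(centres, counts)
print(slopes[0], slopes[-1])
```

On real data the counts flatten and turn over near the selection limit, so the recovered slope depends on the magnitude range over which it is averaged.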

Signal modelling
In accordance with the framework outlined in Section 2 of Joachimi et al. (2020) as the methodology for the inference of cosmological parameters from KiDS-1000, we opt to quantify the influence of the magnification bias on cosmology through its contribution to the GGL angular power spectra. These angular power spectra are line-of-sight projections of the three-dimensional matter power spectrum. We express the observable GGL angular power spectrum correlating galaxy positions, $n$, and galaxy shapes, $\epsilon$, as

$$ C^{(ij)}_{n\epsilon}(\ell) = C^{(ij)}_{\rm gG}(\ell) + C^{(ij)}_{\rm gI}(\ell) + C^{(ij)}_{\rm mG}(\ell), \tag{6} $$

where $i$ is the index for lens galaxy redshift bins, $j$ is the index of the source galaxy samples, gG stands for the cross-correlation between the lens galaxy distribution and the source gravitational shear, gI stands for the intrinsic alignment of source galaxies physically close to foreground lenses and mG stands for the correlation between gravitational shear and the lensing-induced magnification bias in the lens sample.
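$C^{(ij)}_{{\rm g}a}(\ell)$ for $a \in \{{\rm G}, {\rm I}\}$ are defined as Limber-approximated line-of-sight projections of the three-dimensional cross-power spectrum between the galaxy and matter distribution, $P_{\rm gm}$, given by (Kaiser 1992)

$$ C^{(ij)}_{{\rm g}a}(\ell) = \int_0^{\chi_{\rm hor}} {\rm d}\chi\, \frac{n^{(i)}_{\rm L}(\chi)\, W^{(j)}_a(\chi)}{f_K^2(\chi)}\, P_{\rm gm}\!\left( \frac{\ell + 1/2}{f_K(\chi)}, z(\chi) \right), \tag{7} $$

where $n^{(i)}_{\rm L}$ is the redshift distribution of lens sample $i$, $W^{(j)}_a$ is the kernel of source sample $j$ for the gravitational shear ($a = {\rm G}$) or the intrinsic alignment ($a = {\rm I}$) signal, $f_K(\chi)$ is the comoving angular diameter distance, $\chi$ is the comoving distance and $\chi_{\rm hor}$ is the comoving distance to the horizon. The intrinsic alignment kernel is normalised by $C_1 = 5 \times 10^{-14}\, (h^2 M_\odot / {\rm Mpc}^{-3})^{-2}$, in accordance with the value from Hirata & Seljak (2004) and Bridle & King (2007), which is set using the galaxy ellipticity measurements from SuperCOSMOS (Brown et al. 2002; Hambly et al. 2001).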
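The magnification term in equation (6) is modelled as

$$ C^{(ij)}_{\rm mG}(\ell) = 2 \left( \alpha^{(i)} - 1 \right) C^{(ij)}_{\rm GG}(\ell), \tag{10} $$

where $i$ again indexes lens galaxy samples, $j$ indexes source samples, $\alpha^{(i)}$ is the effective luminosity function slope of lens sample $i$, mG stands for the lensing-induced magnification bias in the lens sample and GG stands for the shear-shear correlation signal.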
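To make the scaling of the magnification term concrete, the short Python sketch below assumes the standard form in which the magnification-shear spectrum equals $2(\alpha - 1)$ times the corresponding shear-shear spectrum. The slope values are the $\alpha_{\rm obs}$ estimates quoted in the abstract; the $C_{\rm GG}$ values and function names are hypothetical placeholders, not outputs of any Boltzmann code.

```python
def magnification_prefactor(alpha):
    """Amplitude 2 (alpha - 1) multiplying the shear-shear spectrum."""
    return 2.0 * (alpha - 1.0)


def c_mg(alpha, c_gg):
    """Scale a list of C_GG(ell) values into C_mG(ell) values."""
    factor = magnification_prefactor(alpha)
    return [factor * value for value in c_gg]


# Effective slopes quoted in the abstract for the two BOSS samples.
alpha_zlow, alpha_zhigh = 1.93, 2.62

# Hypothetical shear-shear spectrum values (placeholders, arbitrary units).
c_gg_example = [1.0e-9, 5.0e-10, 2.0e-10]

print(magnification_prefactor(alpha_zlow))   # amplitude for the low-z lenses
print(magnification_prefactor(alpha_zhigh))  # amplitude for the high-z lenses
print(c_mg(alpha_zhigh, c_gg_example))
```

The prefactor changes sign at $\alpha = 1$, so a sample with a shallow slope would contribute a negative magnification term.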
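$C^{(ij)}_{\rm GG}(\ell)$ is defined as the cosmic shear angular power spectrum purely from gravitational lensing effects, i.e. without any intrinsic alignment signals, and is given by

$$ C^{(ij)}_{\rm GG}(\ell) = \int_0^{\chi_{\rm hor}} {\rm d}\chi\, \frac{W^{(i)}_{\rm G}(\chi)\, W^{(j)}_{\rm G}(\chi)}{f_K^2(\chi)}\, P_{\rm m,nl}\!\left( \frac{\ell + 1/2}{f_K(\chi)}, z(\chi) \right), \tag{11} $$

where $W^{(i)}_{\rm G}$ and $W^{(j)}_{\rm G}$ are the lensing kernels of samples $i$ and $j$, $f_K(\chi)$ is the comoving angular diameter distance and $P_{\rm m,nl}$ is the non-linear matter power spectrum. This power spectrum is computed with a non-perturbative model using HMCode (Mead et al. 2015, 2016) integrated within CAMB (Lewis et al. 2000; Lewis & Bridle 2002; Howlett et al. 2012). HMCode incorporates baryonic feedback in its halo modelling approach. We parametrise the baryonic feedback model using a single free parameter, $A_{\rm bary}$, in line with Hildebrandt et al. (2017). The non-linear matter power spectrum $P_{\rm m,nl}$ is also used to compute the cross-power spectrum between the galaxy and matter distribution, $P_{\rm gm}$, used in equation (7), as in the analysis shown in Joachimi et al. (2020).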

METHODOLOGY
The method outlined in this paper aims to provide an accurate estimate of the effective luminosity function slope, $\alpha$, of a galaxy sample with a complex sample selection. This estimate can be used to quantify the magnification bias in clustering and GGL analyses. To achieve this, we rely on realistic weak lensing simulations to calibrate the $\alpha_{\rm obs}$ estimate from observables, based on equation (5), such that it agrees with the value of $\alpha$ derived from unobservable quantities using equation (4). The procedure gives the magnitude range that yields the optimal $\alpha_{\rm obs}$ value in the simulations. This magnitude range is then used to estimate $\alpha_{\rm obs}$ from observations. If the simulations are accurate, $\alpha_{\rm obs}$ should agree with the underlying $\alpha$ even though the latter cannot be directly measured.

BOSS DR12 data
We develop our method using lens samples derived from the Sloan Digital Sky Survey (SDSS)-III BOSS (Eisenstein et al. 2011a; Dawson et al. 2012). BOSS is a spectroscopic survey with a complex sample selection function which is commonly used in cosmological analyses of galaxy clustering and GGL (Alam et al. 2017; Sánchez et al. 2017; Beutler et al. 2017; Tröster et al. 2019; Speagle et al. 2019; Heymans et al. 2020). For more details about the nature of the galaxy selection process, see Alam et al. (2015). A lens galaxy sample selected in such a way could introduce a substantial magnification bias in any analysis, while the complexity of the selection prevents the bias from being measured with standard methods. For the BOSS sample, the bias becomes even more important to model, because the sample is commonly used in GGL analyses with the source galaxy samples of weak lensing surveys whose footprints significantly overlap with the BOSS footprint.
For this work, we use the photometric data from the final data release of BOSS, DR12 (Alam et al. 2015), with the same target selection as in Sánchez et al. (2017). This sample combines the BOSS LOWZ and CMASS galaxy samples to produce a catalogue which covers approximately 9300 deg$^2$ (Reid et al. 2016). Its normalised redshift distribution can be seen in figure 1. The sample is then split into two redshift ranges: "zlow" ($0.2 < z \leq 0.5$) and "zhigh" ($0.5 < z \leq 0.75$). From this photometric data, we use SDSS composite model (cmodel) $i$-band magnitudes which are defined in Stoughton et al. (2002).

MICE2 simulations
For the analysis discussed in section 4 and in appendix A, we rely on datasets of simulated galaxies, selected from the MICE2 galaxy mock catalogue (Fosalba et al. 2015c,a; Carretero et al. 2015; Crocce et al. 2015; Hoffmann et al. 2015). This catalogue is based on the MICE dark matter-only simulation, generated from $7 \times 10^{10}$ particles in a box with a side length of 3 Gpc and assuming a $\Lambda$CDM cosmological model with $\Omega_{\rm m} = 0.25$, $\Omega_\Lambda = 0.75$, $\Omega_{\rm b} = 0.044$ and $h = 0.7$. A light cone, spanning $\sim$5000 deg$^2$, is constructed from this simulation box and populated with galaxies up to a redshift of $z = 1.4$ using a hybrid Halo Occupation Distribution (HOD) and Halo Abundance Matching (HAM) technique. Additionally, MICE2 embeds gravitational lensing by providing estimates of the shear components and the convergence as well as the true and lensed position for each galaxy.
Figure 2. Normalised differential galaxy count distribution, $n(m)$, with respect to the $i$-band magnitude. The BOSS sample is shown in black, the MICE2 sample in red, while the flux-limited MICE2 sample is shown in yellow. In the top plot, we see the population of galaxies with $0.2 < z \leq 0.5$ and at the bottom, the population of galaxies with $0.5 < z \leq 0.75$. The black cross indicates the turn-off magnitude determined for the BOSS sample. The red triangle indicates the same for the MICE2 mock sample.
We start from this MICE2 input catalogue and apply an evolutionary correction to the provided SDSS magnitudes and calculate an additional set of magnitudes,

$$ m_{\rm mag} = m_{\rm evo} - 2.5 \log_{10}(1 + 2\kappa), \tag{12} $$

which include the effect of magnification through the convergence, $\kappa$. The weak lensing approximation, $\mu \approx 1 + 2\kappa$, used to derive equation (4) and equation (12) might lead to biases when simulating magnified galaxy samples. Since 99.9% of the galaxies in the MICE2 simulations have $|\kappa| < 0.09$, the assumption should still hold. However, it should be investigated in the future whether this is really the case.
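The magnified magnitudes can be illustrated with a minimal Python sketch of the relation $m_{\rm mag} = m_{\rm evo} - 2.5\log_{10}(1 + 2\kappa)$. The function name and the example inputs are assumptions for illustration; the actual MICE2 catalogue columns are not reproduced here.

```python
import math


def magnified_magnitude(m_evo, kappa):
    """Apply weak-lensing flux magnification to an (unlensed) magnitude.

    Uses mu = 1 + 2 kappa, so positive convergence makes a galaxy brighter
    (smaller magnitude) and negative convergence makes it fainter.
    """
    mu = 1.0 + 2.0 * kappa
    if mu <= 0.0:
        raise ValueError("weak-lensing approximation requires 1 + 2 kappa > 0")
    return m_evo - 2.5 * math.log10(mu)


# Hypothetical galaxy just fainter than a 20.2 mag limit (assumed values).
m_unlensed = 20.25
m_lensed = magnified_magnitude(m_unlensed, kappa=0.03)
print(f"{m_unlensed:.3f} -> {m_lensed:.3f}")
```

In this example a galaxy that would miss a 20.2 mag cut when unlensed passes it once magnified, which is the flux magnification effect described in the introduction.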
Finally, we select two samples from this base catalogue, one with an arbitrary magnitude limit in the SDSS $i$-band at $m_{\rm mag} \leq 20.2$ (applied in appendix A) and one that resembles the SDSS BOSS survey, using a target selection similar to Eisenstein et al. The $i$-band number counts of these two samples and the original BOSS data are shown in figure 2. In figure 2, it becomes apparent how the BOSS selection function differs from a flux-limited sample. The cut-off of the galaxy population at the magnitude limit is not as pronounced, while $n(m)$ no longer increases monotonically, especially at low redshifts. The galaxy counts per unit area as a function of redshift of the three samples are shown in figure 1. Here we see how, with the BOSS selection function applied, the redshift distribution is altered in a highly non-linear manner, causing it to be multi-peaked with a main peak at $z \sim 0.5$. The magnitude-limited sample, on the other hand, follows a roughly single-peaked distribution dominated by low-redshift galaxies ($z \sim 0.3$).
Having knowledge of the underlying matter distribution allows us to compare estimates of the scale of flux magnification through $\alpha_{\rm obs}(m)$ from observables, as given by equation (5), with the estimate of $\alpha$ as given by equation (4). When analysing the MICE2 mock observations, we only consider the SDSS model $i$-band magnitude, due to a lack of available SDSS cmodel magnitudes from the simulations.
As a sanity check of our methods outlined in section 3.3, we estimate the magnification bias induced by a flux/magnitude-limited sample selection function on a galaxy survey over an eighth of the sky in appendix A. For this, we use the MICE2 simulations to obtain the positions and magnitudes of galaxies before and after magnification, while knowing the true underlying matter density. We set the magnitude limit in the $i$-band to a magnitude of 20.2 (similar to the magnitude limit in the $i$-band of the BOSS survey).
To conduct the analysis for the case where the target selection function is not flux or magnitude limited, we select a $\sim$5000 deg$^2$ area from the MICE2 simulations and apply the aforementioned sample selection function to it. The $i$-band magnitude distribution of the BOSS and MICE2 galaxies within each of the redshift bins is shown in figure 2. Here, we see that, although the overall shape of the population is similar between the BOSS and the MICE2 galaxies, the MICE2 objects are consistently shifted towards the fainter end of the distribution. This is at least partially caused by the fact that the BOSS magnitudes are $i$-band cmodel magnitudes and the MICE2 magnitudes are SDSS model $i$-band magnitudes. In addition, the MICE2 simulations with a BOSS-like selection function do not seem to capture the population of galaxies at the extremes of the magnitude distribution. Both of these biases might also be due to some assumptions in the galaxy formation and evolution models used in the MICE2 simulations. In addition, the fiducial cosmology assumed for the simulations might not agree with the cosmological parameters preferred by the BOSS data. However, by construction, the method of calibrating the $\alpha_{\rm obs}$ estimates from the observations with the simulations is sensitive neither to a constant shift in the distribution nor to the extremes of the magnitude distribution.

Calibration procedure on simulations
To calibrate the $\alpha_{\rm obs}$ obtained from observations, we first have to determine an accurate estimate of the underlying luminosity function slope, $\alpha$, in the MICE2 simulations as given by equation (4). As outlined in figure 3, we first spatially bin the lensed and unlensed galaxy positions using HEALPix at a resolution of nside = 64 (Górski et al. 2005). Within each bin/pixel, we evaluate the lensed and unlensed cumulative galaxy counts, $n$ and $n_0$ respectively, as well as the average convergence, $\kappa$. We then perform a least squares linear fit of the relative difference between lensed and unlensed galaxy counts over the convergence, $\kappa$, to estimate $\alpha$ (as shown in figure 4). This is a consequence of the linearity between these two quantities which emerges in the weak lensing limit as given by equation (4).
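The per-pixel fit described above can be sketched as follows. This is a minimal pure-Python illustration, assuming an unweighted ordinary least squares fit with a free intercept and synthetic, noise-free pixel values; the weighting scheme and the HEALPix binning of the actual analysis are not reproduced, and pixels with no unlensed galaxies are skipped because the relative difference is undefined there.

```python
import random


def fit_alpha(n_lensed, n_unlensed, kappa):
    """Estimate alpha from the slope of (n - n0)/n0 against kappa.

    In the weak-lensing limit the slope equals 2 (alpha - 1), so
    alpha = 1 + slope / 2. Pixels with n0 = 0 are excluded.
    """
    xs, ys = [], []
    for n, n0, k in zip(n_lensed, n_unlensed, kappa):
        if n0 == 0:
            continue  # relative difference would divide by zero
        xs.append(k)
        ys.append((n - n0) / n0)
    x_mean = sum(xs) / len(xs)
    y_mean = sum(ys) / len(ys)
    sxx = sum((x - x_mean) ** 2 for x in xs)
    sxy = sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, ys))
    return 1.0 + 0.5 * sxy / sxx


# Synthetic tile of 256 pixels with a known slope (assumed test input).
rng = random.Random(42)
alpha_true = 2.4
kappa = [rng.gauss(0.0, 0.01) for _ in range(256)]
n_unlensed = [1000.0] * 256
n_lensed = [n0 * (1.0 + 2.0 * (alpha_true - 1.0) * k)
            for n0, k in zip(n_unlensed, kappa)]

# One empty pixel, which the fit must ignore.
kappa.append(0.0)
n_unlensed.append(0.0)
n_lensed.append(1.0)

alpha_fit = fit_alpha(n_lensed, n_unlensed, kappa)
print(f"alpha_fit = {alpha_fit:.4f}")
```

With real catalogues the counts carry shot noise, so the recovered slope scatters from tile to tile; this is why the uncertainty is estimated from the spread between tiles.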
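In order to obtain better estimates of the uncertainties of $\alpha$, the HEALPix pixels are grouped into tiles (HEALPix pixels with a resolution of nside = 4) for which we repeat the analysis independently each time. The weighted mean of the values obtained from each tile gives the final estimate for $\alpha$, $\bar{\alpha}$, while the standard deviation between these values is used to estimate the uncertainty as given by

$$ \sigma_{\bar{\alpha}}^2 = \frac{1}{N - 1}\, \frac{\sum_{i=1}^{N} (\alpha_i - \bar{\alpha})^2 / \sigma_i^2}{\sum_{i=1}^{N} 1 / \sigma_i^2}, \tag{13} $$

where $\alpha_i$ are the estimates from each tile or bin, $\sigma_i$ is their associated uncertainty, $\bar{\alpha}$ is the weighted mean of the estimates and $N$ is the number of tiles over which the analysis is repeated. When $\sigma_i = \sigma$ for all $i$, equation (13) reduces to the equation for the error of the mean, i.e.

$$ \sigma_{\bar{\alpha}}^2 = \frac{1}{N (N - 1)} \sum_{i=1}^{N} (\alpha_i - \bar{\alpha})^2. $$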
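A minimal Python sketch of combining per-tile estimates is given below. It assumes inverse-variance weights and a weighted scatter between tiles divided by $N - 1$, which reduces to the standard error of the mean when all tiles have equal uncertainties; this particular form is an assumption chosen to match the description in the text, and the per-tile values are hypothetical.

```python
import math


def weighted_mean_and_error(values, sigmas):
    """Inverse-variance weighted mean and the scatter-based error on it.

    error^2 = sum_i w_i (x_i - mean)^2 / ((N - 1) * sum_i w_i),
    with w_i = 1 / sigma_i^2 (assumed weighting).
    """
    if len(values) < 2:
        raise ValueError("need at least two tiles to estimate the scatter")
    weights = [1.0 / s ** 2 for s in sigmas]
    w_sum = sum(weights)
    mean = sum(w * v for w, v in zip(weights, values)) / w_sum
    scatter = sum(w * (v - mean) ** 2 for w, v in zip(weights, values))
    error = math.sqrt(scatter / ((len(values) - 1) * w_sum))
    return mean, error


# Hypothetical per-tile slope estimates with equal uncertainties.
tile_alpha = [2.0, 2.2, 2.4, 2.6]
tile_sigma = [0.1, 0.1, 0.1, 0.1]

mean, err = weighted_mean_and_error(tile_alpha, tile_sigma)
print(f"alpha = {mean:.3f} +/- {err:.3f}")
```

Using the spread between tiles rather than the formal fit errors is the more conservative choice when local fluctuations in the sample are not captured by the per-pixel noise.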
As an alternative, one might think that it would be enough to assume that the uncertainty on the galaxy counts is given by a noise model which accounts for the correlation between the lensed and unlensed galaxy counts (this is shown by the error bars of the data points in figure 4). We find, however, that this approach leads to underestimates of the uncertainties. Sampling over many different areas in the sky gives a more conservative estimate of the uncertainty, while also accounting for the local fluctuations in the BOSS sample.
A possible cause for concern when comparing the magnified and unmagnified galaxy populations is the set of edge cases where, for a given bin or pixel, the unmagnified galaxy number count is $n_0 = 0$ while the magnified number count is $n = 1$, or vice versa. These cases cause divergences in the relative difference and unrealistic uncertainties, since they introduce null denominators. For this reason, they are excluded from the analysis. In any case, the frequency of these occurrences is found to be negligible for the HEALPix resolutions and redshift bins used in this work. When dividing the 5000 deg$^2$ MICE2 simulations into two redshift bins at a HEALPix nside = 64, there are none of these cases, while for 19 redshift bins at the same HEALPix resolution only $\sim$0.7% of the pixels have to be discarded.

Determining magnification bias from observations
After having determined the luminosity function slope, $\alpha$, from the simulations as described in section 3.3, we estimate the optimal magnitude range, $\Delta m$, to calibrate the estimate of $\alpha_{\rm obs}$ from mock observations using $\alpha$.
To do this, we first choose a magnitude band, $m$, that has been used to select (at least partially) the galaxy sample of interest. Other magnitude bands carry less information about flux magnification. Then, we determine the discrete differential galaxy count distribution, $n(m)$, over the chosen magnitude, $m$, for a given redshift range. Subsequently, we find the magnitude at which the faintest most dominant peak in $n(m)$ occurs. This value is considered to be the effective magnitude limit of the galaxy sample. From $n(m)$, we compute $\alpha_{\rm obs}(m)$ using equation (5). Thereafter, we calculate the weighted mean of $\alpha_{\rm obs}(m)$, $\bar{\alpha}_{\rm obs}$, over all possible magnitude ranges, $\Delta m$, below the effective magnitude limit determined before.
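The steps above can be sketched in Python as follows. The peak criterion (a local maximum reaching at least half of the global maximum) and the inverse-variance weighting are assumptions introduced for this illustration, since the text does not specify them; the counts and slope values are hypothetical.

```python
def effective_magnitude_limit(centres, counts, min_fraction=0.5):
    """Return the magnitude of the faintest dominant peak in n(m).

    A bin is treated as a dominant peak if it is a local maximum and reaches
    at least `min_fraction` of the global maximum (assumed criterion).
    """
    highest = max(counts)
    # Scan from the faint end (last bin) towards the bright end.
    for i in range(len(counts) - 2, 0, -1):
        is_local_max = counts[i] >= counts[i - 1] and counts[i] > counts[i + 1]
        if is_local_max and counts[i] >= min_fraction * highest:
            return centres[i]
    # Fall back to the global maximum if no interior peak qualifies.
    return centres[counts.index(highest)]


def weighted_alpha(points, m_limit, delta_m):
    """Inverse-variance weighted mean of alpha_obs(m) over the range
    m_limit - delta_m <= m < m_limit; `points` holds (m, alpha, sigma)."""
    selected = [(a, s) for m, a, s in points if m_limit - delta_m <= m < m_limit]
    if not selected:
        raise ValueError("no alpha_obs(m) values inside the requested range")
    weights = [1.0 / s ** 2 for _, s in selected]
    return sum(w * a for w, (a, _) in zip(weights, selected)) / sum(weights)


# Hypothetical double-peaked counts and slope estimates.
centres = [19.0, 19.2, 19.4, 19.6, 19.8, 20.0, 20.2]
counts = [10, 30, 80, 60, 100, 40, 5]
points = [(19.2, 2.0, 0.1), (19.4, 2.2, 0.1), (19.6, 2.4, 0.1), (19.9, 9.0, 0.1)]

m_limit = effective_magnitude_limit(centres, counts)
alpha_mean = weighted_alpha(points, m_limit, delta_m=0.5)
print(m_limit, alpha_mean)
```

Defining the range relative to the detected peak rather than in absolute magnitudes is what makes the estimate insensitive to a constant magnitude offset between mocks and data.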
In order to find the optimal $\Delta m$ which will be used for the calibration of $\alpha_{\rm obs}$ from the actual observations, we find the value of $\alpha_{\rm obs}(\Delta m)$ which is in best statistical agreement with the value of $\alpha$ determined previously for the same galaxy sample and redshift range. Therefore, the optimal $\alpha_{\rm obs}(\Delta m)$ value is the one which minimises the statistical discrepancy between $\alpha_{\rm obs}(\Delta m)$ and $\alpha$.
Figure 3. $n$ stands for the count of lensed galaxies, $n_0$ refers to the counts of unlensed galaxies, $\kappa$ to the convergence, $\alpha$ to the luminosity function slope determined from the known $\kappa$, $n(m)$ is the differential galaxy count distribution over magnitude, $m$, and $\alpha_{\rm obs}$ is the luminosity function slope as determined from $n(m)$.
The reason behind choosing a magnitude range, $\Delta m$, relative to the effective magnitude limit of the differential galaxy count distribution, $n(m)$, for calibration is to account for one of the simplest forms of disagreement between the observed $n(m)$ and the $n(m)$ from mock observations, namely a constant shift in the domain of $n(m)$. For instance, such a shift exists between the $n(m)$ from the BOSS and MICE2 samples, as discussed in section 3.2 and shown in figure 2. If we were to evaluate $\alpha^{\rm MICE2}_{\rm obs}$ and $\alpha^{\rm BOSS}_{\rm obs}$ over the same absolute magnitude range, while disregarding the difference between their $n(m)$ distributions, the $\alpha_{\rm obs}$ estimates would be biased. This happens because we would be probing regimes of $n(m)$ from the observed galaxy sample beyond or far below its magnitude limit when calculating $\alpha_{\rm obs}$. Other higher-order biases in the $n(m)$ from mock observations may exist which would require more complex parametrisations of the calibration procedure. Nevertheless, in such cases, it might be more efficient and physically motivated to adjust the models used to produce the mock galaxy samples such that the agreement in $n(m)$ improves up to a point where it can be mostly parametrised by a constant shift in the magnitude.
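The selection of the optimal magnitude range can be sketched as a simple minimisation. The agreement statistic used here, the absolute difference divided by the quadrature sum of the two uncertainties, is an assumption standing in for "best statistical agreement"; the candidate values are hypothetical, while the reference slope $2.43 \pm 0.09$ is the MICE2 zlow value quoted in section 4.

```python
import math


def best_delta_m(candidates, alpha_ref, sigma_ref):
    """Pick the (delta_m, alpha_obs, sigma_obs) candidate closest to alpha_ref.

    Closeness is measured as |alpha_obs - alpha_ref| in units of the combined
    uncertainty sqrt(sigma_obs^2 + sigma_ref^2) (assumed statistic).
    """
    def tension(candidate):
        _, alpha_obs, sigma_obs = candidate
        return abs(alpha_obs - alpha_ref) / math.sqrt(sigma_obs ** 2 + sigma_ref ** 2)

    return min(candidates, key=tension)


# Hypothetical slope estimates for three magnitude ranges below the limit.
candidates = [
    (0.2, 2.10, 0.05),
    (0.5, 2.40, 0.08),
    (1.0, 2.90, 0.20),
]

delta_m, alpha_cal, sigma_cal = best_delta_m(candidates, alpha_ref=2.43, sigma_ref=0.09)
print(f"optimal delta_m = {delta_m}, alpha_obs = {alpha_cal} +/- {sigma_cal}")
```

The chosen range would then be applied, relative to the effective magnitude limit of the observed sample, to obtain the calibrated estimate from the data.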
In any case, once the optimal $\Delta m$ to reconcile $\alpha_{\rm obs}$ and $\alpha$ from the mocks has been determined, it may be used to calibrate $\alpha_{\rm obs}$ from the observations. As summarised in the lower third of figure 3, we first compute $n(m)$ for the given redshift range. We again find the faintest most dominant peak in $n(m)$, set it as the effective magnitude limit and evaluate $\alpha_{\rm obs}(m)$ from $n(m)$. Lastly, we calculate the weighted mean of $\alpha_{\rm obs}(m)$ over the optimal magnitude range below the effective magnitude limit, $\Delta m$, determined before from the simulations over the same redshift range. Thus, we produce the final $\alpha_{\rm obs}$ estimate for that sample.
The method described here has been integrated into MagBEt. The module also includes the analysis conducted in section 4.

APPLICATIONS TO BOSS LENSES
We proceed to apply the method described in sections 3.3 and 3.4 to the BOSS lens galaxy sample introduced in section 3.1. The magnitude bands selected for this are cmodel magnitudes, since they are better indicators of the overall flux emitted by a galaxy. The specific magnitude band chosen is based on which band was used to select the dominant population within a sample. In other words, when working with LOWZ-dominated galaxy samples ($z < 0.36$), we use the $r$-band and when working with CMASS-dominated samples ($z > 0.36$), we use the $i$-band (Eisenstein et al. 2011a). To allow for accurate forecasting of the KiDS-1000+BOSS analysis, we choose the same convention for the redshift bins: $0.2 < z \leq 0.5$ and $0.5 < z \leq 0.75$. Consequently, both bins are dominated by CMASS galaxies, so we opt to use $i$-band magnitudes for the analysis of both samples.
As demonstrated in appendix A, for the flux-limited case, we can accurately and robustly estimate the magnitude of the magnification bias by determining the effective luminosity function slope through the weighted mean of $\alpha_{\rm obs}$ near the magnitude limit. In this section, we discuss whether the same can be said when applying a complex sample selection function which does not have a clear flux/magnitude limit, such as in the case of the BOSS survey.
Firstly, we directly estimate $\alpha$ from the MICE2 simulations following the approach outlined in section 3.3. An example of this is shown in figure 4, where we see the estimate within a single $\sim$200 deg$^2$ tile containing 256 pixels within the zhigh bin. This procedure is repeated for each tile and redshift bin. Then, we find the weighted mean between the $\alpha$ from each tile to determine the $\bar{\alpha}$ for each redshift bin and its uncertainty given by equation (13). This gives $\alpha^{\rm zlow} = 2.43 \pm 0.09$ and $\alpha^{\rm zhigh} = 3.26 \pm 0.07$.
Next, applying the procedure discussed in section 3.4 and using the differential galaxy count distributions for each redshift bin shown in figure 2, we can estimate $\alpha_{\rm obs}$: once for the simulated BOSS-MICE2 observations, and once for the actual BOSS observations. In figure 6, for zlow, we find that the estimate is optimal near the faint end of the count distribution, which is expected, since the assumed flux power law should be most accurate in the faint limit. However, this does not appear to be the case for the high-redshift sample, zhigh. For this range, the estimate is optimal when considering the whole magnitude range up to the turn-off magnitude. This might be due to incompleteness in the sample and/or the complex selection, which flattens the observed number counts (Hildebrandt 2016).
Taking the magnitude range from the optimal $\alpha^{\rm MICE2}_{\rm obs}$ estimate to calibrate $\alpha^{\rm BOSS}_{\rm obs}$ gives the estimates shown in figure 5. For the MICE2 mocks, we find that $\alpha^{\rm zlow} = 2.43 \pm 0.09$, while $\alpha^{\rm zlow}_{\rm obs} = 2.442 \pm 0.002$. In addition, $\alpha^{\rm zhigh} = 3.26 \pm 0.07$, while $\alpha^{\rm zhigh}_{\rm obs} = 3.08 \pm 0.32$. This indicates that the estimates obtained from observations using equation (5), when properly calibrated, are a good indicator of the scale of the magnification bias even when there is a complex sample selection function. For this reason, we may consider the $\alpha_{\rm obs}$ estimates given in table 1 from the actual BOSS observations as unbiased indicators of the scale of the magnification bias. Note that the value for zlow slightly deviates from the value of $\alpha^{\rm BOSS}_{\rm obs} = 1.80 \pm 0.15$ quoted in Joachimi et al. (2020), since there have been minor adjustments in the way peaks in $n(m)$ are detected. This leads to a 16% change in the amplitude of the magnification bias contribution, which has no effect on the KiDS-1000 analysis as the GGL contributions are marginal.
When comparing the $\alpha_{\rm obs}(m)$ curves for each bin in figure 5, one might notice that the turn-off near the effective magnitude limit is not as steep for zhigh as for zlow. This is due to the complex BOSS selection function, which deviates particularly strongly from a simple flux limit at high redshifts. Here is where the semi-empirical calibration of the magnitude range considered in order to determine the effective luminosity function slope $\alpha_{\rm obs}$ is especially relevant. As shown in figure 6, we find that for zhigh we get a more accurate estimate when considering the entire magnitude range $\Delta m$ available below the effective magnitude limit, which is in stark contrast with the results found for a flux-limited sample (see figure A2). The opposite is the case for zlow. As shown in figure 5, the double peak in the zlow bin combined with a clearer 'flux limit' near the peak magnitude means that the power law model for the luminosity function holds best within a small magnitude range near the peak.
We note that in figure 2 the simulated and the observed differential count distributions do not quite match. The $n(m)$ from MICE2 mock observations is shifted by $\Delta m \approx 0.2$ to the faint end with respect to the BOSS $n(m)$. This might be due to some limitations in the galaxy model of the MICE2 simulations. The fact that the $n(m)$ from the mocks and observations do not match perfectly seems to be driving the discrepancy between $\alpha^{\rm MICE2}_{\rm obs}$ and $\alpha^{\rm BOSS}_{\rm obs}$ shown in figure 5. However, since our calibration is based on a magnitude range of a fixed width relative to the effective magnitude limit for each sample, the estimates are not sensitive to this apparent shift in the domain of $n(m)$. The only thing which can bias our estimates is a disagreement in higher-order derivatives of $n(m)$ near the effective magnitude limit between observations and simulations. However, the uncertainties of $\alpha_{\rm obs}$ from equation (13) … is in a $\sim 2\sigma$ tension with $\alpha$ despite being calibrated to optimally overlap. Taking $\alpha$ as the underlying truth, we consider $\alpha^{\rm MICE2}_{\rm obs}$ and $\alpha^{\rm BOSS}_{\rm obs}$ for $0.2 < z \leq 0.25$ to be biased. This highlights the limitations of this method. A small change in the sample size can lead to radical changes in the gradient of the magnitude distribution $n(m)$ of these galaxies, causing substantial biases in the $\alpha_{\rm obs}$ estimates as discussed in Hildebrandt (2016).
Figure 5. The arrows indicate the constant magnitude shift applied to reconcile the differential galaxy count distribution, $n(m)$, from observations with the $n(m)$ from mocks. The dotted black horizontal line marks the $\alpha_{\rm obs}$ estimate from BOSS galaxies, the dashed red horizontal line marks the $\alpha_{\rm obs}$ estimate from MICE2 mock galaxies and the blue dot-dashed horizontal line marks the effective MICE2 $\alpha$ determined from the weak lensing convergence with equation (4) and used to calibrate $\alpha^{\rm MICE2}_{\rm obs}$.

MAGNIFICATION BIAS IN WEAK LENSING MEASUREMENTS
Having produced estimates of the effective luminosity function slope ($\alpha_{\rm obs}$) for the BOSS DR12 galaxy sample, we now forecast the importance of magnification bias in GGL signals. The forecasts are produced by cross-correlating source galaxies from weak lensing surveys with the BOSS lens samples considered in section 4. First, we produce forecasts of the GGL signals for a KiDS-1000+BOSS DR12 analysis as described in Joachimi et al. (2020) (Tröster et al. 2019). We use the halo and intrinsic alignment models described in section 2 and set $A_{\rm bary} = 3.13$ (upper limit of the KiDS-1000 prior) and $A_{\rm IA} = 0.8$ (best estimate from Tröster et al. 2019).

KiDS-1000 + BOSS forecasts
Following the approach outlined in section 2.3, we propagate the $\alpha_{\rm obs}$ measurements for zlow and zhigh shown in table 1 into angular power spectrum predictions for the galaxy-galaxy lensing signal. We then determine the ratio between the angular power spectrum of the magnification-shear signal, $C^{(i,j)}_{\rm mG}(\ell)$, and that of the GGL signal, $C^{(i,j)}_{\rm gG}(\ell)$. In order to put these contributions into perspective, we also estimate the statistical uncertainty in the GGL signal assuming shot and shape noise only (see for example Joachimi & Bridle 2010). We calculate this for 6 logarithmically spaced $\ell$ bins per dex, while assuming the footprint area of the full KiDS survey, $A = 1350\,{\rm deg}^2$. In figure 9, we then compare the relative magnification-shear signal to the relative GGL uncertainty for each $\ell$ bin. The magnification-shear correlation found between these bins constitutes a few-per-cent contribution to the galaxy-galaxy lensing signal correlated with the zlow bin. To compare that to the shape and shot noise, $\sigma_{\rm gG}$, we define the cumulative signal-to-noise ratio, SNR, within a range of angular scales, $\ell_{\rm min} < \ell < \ell_{\rm max}$, by combining the contributions of all $\ell$ bins in that range, where $i$ labels each $\ell$ bin, and $\ell_{{\rm min},i}$ and $\ell_{{\rm max},i}$ mark the lower and upper limits of each bin, respectively. For the correlations with the zlow bin, this implies a cumulative signal-to-noise ratio for $100 < \ell < 4600$ between 0.1 and 0.3. This contribution becomes larger for the high-redshift lens bin (zhigh), from ∼5% to ∼20% of the GGL signal, while the shot and shape noise is of a similar scale. Hence, the cumulative SNR($100 < \ell < 4600$) = 0.3 for the correlation between zhigh and the first KiDS redshift bin, while the cumulative SNR($100 < \ell < 4600$) = 1.1 between zhigh and the fifth KiDS bin. At the same time, these $\alpha_{\rm obs}$ values lead to a maximal contribution of the magnification bias to the clustering signal of ∼0.6% (Joachimi et al. 2020).
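The noise estimate and the cumulative SNR described above can be sketched numerically. The snippet below is only an illustration under stated assumptions: it takes the noise-only Gaussian variance of a cross-spectrum, $N_{\rm g} N_{\rm G} / [(2\ell+1)\,\Delta\ell\, f_{\rm sky}]$, and combines the per-bin ratios in quadrature, which is one common definition and not necessarily the exact expression used in this work. All function names and the number densities in the usage note are hypothetical.

```python
import numpy as np

ARCMIN2_PER_SR = (180.0 * 60.0 / np.pi) ** 2          # arcmin^2 in one steradian
FULL_SKY_DEG2 = 4.0 * np.pi * (180.0 / np.pi) ** 2    # about 41253 deg^2


def log_ell_bins(ell_min, ell_max, bins_per_dex=6):
    """Edges of logarithmically spaced ell bins, roughly `bins_per_dex` per decade."""
    n_bins = int(round(bins_per_dex * np.log10(ell_max / ell_min)))
    return np.logspace(np.log10(ell_min), np.log10(ell_max), n_bins + 1)


def noise_only_sigma(edges, area_deg2, n_lens_arcmin2, n_src_arcmin2, sigma_eps):
    """Shot- and shape-noise-only uncertainty on a GGL band power (assumed form).

    Uses sigma^2 = N_g * N_G / ((2 ell + 1) * delta_ell * f_sky), with
    N_g = 1 / n_lens and N_G = sigma_eps^2 / n_source (densities per steradian).
    """
    f_sky = area_deg2 / FULL_SKY_DEG2
    ell_mid = np.sqrt(edges[:-1] * edges[1:])          # geometric bin centres
    n_modes = (2.0 * ell_mid + 1.0) * np.diff(edges) * f_sky
    shot = 1.0 / (n_lens_arcmin2 * ARCMIN2_PER_SR)
    shape = sigma_eps ** 2 / (n_src_arcmin2 * ARCMIN2_PER_SR)
    return np.sqrt(shot * shape / n_modes)


def cumulative_snr(c_mg, sigma_gg):
    """Quadrature sum of per-bin signal-to-noise ratios (assumed definition)."""
    ratio = np.asarray(c_mg, dtype=float) / np.asarray(sigma_gg, dtype=float)
    return float(np.sqrt(np.sum(ratio ** 2)))
```

As a usage example, `edges = log_ell_bins(100, 4600)` yields ten bins, and `noise_only_sigma(edges, 1350.0, 0.01, 1.0, 0.27)` returns one uncertainty per bin for placeholder lens and source densities. Passing a magnification-shear band-power vector as `c_mg` then gives the cumulative SNR over the whole range.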
Even though we assume the area of the full $1350\,{\rm deg}^2$ KiDS footprint, these magnification contributions to the GGL signal are large enough that this systematic should be modelled explicitly in the KiDS-1000+BOSS analysis outlined in Joachimi et al. (2020). Nonetheless, since the analysis shown here already provides an accurate estimate of the magnitude of the magnification bias, the contribution to the GGL signal in each bin can simply be fixed and added to the overall GGL angular power spectrum without adding any free parameters to the astrophysical models.
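To make the "fixed and added" prescription concrete, the following expressions sketch how a calibrated slope would enter the model. They assume the standard weak-lensing limit, in which magnification perturbs the observed lens overdensity by $2(\alpha - 1)\kappa$, and they neglect other contributions such as intrinsic alignments. The symbol $C^{(i,j)}_{\kappa{\rm G}}$, for the cross-spectrum between the convergence at lens bin $i$ and the shear of source bin $j$, and the label "obs" on the total spectrum are notation introduced here for illustration and may differ from the formalism of section 2.3.

```latex
C^{(i,j)}_{\rm obs}(\ell) = C^{(i,j)}_{\rm gG}(\ell) + C^{(i,j)}_{\rm mG}(\ell),
\qquad
C^{(i,j)}_{\rm mG}(\ell) = 2\left(\alpha^{(i)}_{\rm obs} - 1\right) C^{(i,j)}_{\kappa{\rm G}}(\ell)

% Prefactors implied by the calibrated slopes quoted in this work:
2\left(\alpha_{\rm obs} - 1\right) = 2\,(1.93 - 1) = 1.86 \pm 0.10 \quad \text{(zlow)},
\qquad
2\left(\alpha_{\rm obs} - 1\right) = 2\,(2.62 - 1) = 3.24 \pm 0.56 \quad \text{(zhigh)}
```

With the slope fixed to its calibrated value, the magnification term is fully determined by the cosmological model and adds no free parameter.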

HSC Wide + BOSS forecasts
We repeat the analysis of section 5.1 for the HSC Wide source bins. Figure 10 shows the ratio between $C^{(i,j)}_{\rm mG}(\ell)$ and $C^{(i,j)}_{\rm gG}(\ell)$ together with the relative uncertainty in the GGL signal for each $\ell$ bin, assuming a full footprint area of $1400\,{\rm deg}^2$ (Aihara et al. 2018) as well as the galaxy sample properties shown in table 2. Similar to KiDS, we find that the magnification-shear signal contributes only ∼2% to the GGL signal correlated with the zlow lens bin (giving a cumulative SNR within $100 < \ell < 4600$ between 0.4 and 0.5). In correlations with the zhigh lens bin, the contribution of the magnification-shear signal is larger, between ∼5% and ∼20%, which is considerably above the shape and shot noise (with a cumulative SNR within $100 < \ell < 4600$ between 1.3 and 2.0). It is significant enough to warrant the consideration of this systematic in future GGL analyses which cross-correlate the HSC Wide sample with the BOSS DR12 or a similarly selected lens sample.

[Table 2, last row] $1.576 < z_{\rm phot} \leq 2.500$; 1.85; 3.0; 0.21.
Notes. $\langle z \rangle$ stands for the mean redshift in each tomographic bin, $z_{\rm med}$ for the median redshift, $n_{\rm gal}$ for the galaxy number density in arcmin$^{-2}$ following the definition from Heymans et al. (2012), and $\sigma_{\epsilon,i}$ for the dispersion per ellipticity component. zlow and zhigh are the lens bins based on the BOSS DR12 galaxy clustering data. The KiDS source bins have been defined in accordance with the methodology for the KiDS-1000 GGL analysis as given in Joachimi et al. (2020) and Heymans et al. (2020), based on the redshift calibration described in Hildebrandt et al. (2020) and Wright et al. (2020). The properties of the HSC source bins are based on the information provided in table 1 of the HSC Y1 cosmic shear analysis (Hikage et al. 2019).

Euclid-like survey + DESI-like survey forecasts
We produce forecasts for a GGL analysis with Stage-IV surveys (Albrecht et al. 2006), assuming lens and source samples akin to DESI (Aghamousa et al. 2016) and Euclid (Laureĳs et al. 2011), respectively. We repeat the analysis shown in sections 5.1 and 5.2 for the Euclid-like source bins described in table 2 and in figure 8. We consider a footprint overlap between our source and lens samples of $6000\,{\rm deg}^2$, which is roughly the expected overlap between Euclid and DESI (Levi et al. 2013; Aghamousa et al. 2016). Therefore, the fictitious BOSS/DESI-like galaxy sample we are considering here has all the properties of the BOSS lens sample, but has the planned DESI footprint. Although DESI will probe higher redshifts and fainter galaxies than BOSS, it will be similar to BOSS in that it will not be a purely flux-limited survey. Targets in DESI are selected using a combination of different band magnitudes depending on the galaxy type and redshift range being observed (for more details see Aghamousa et al. 2016). For this reason, the magnification bias in the DESI sample cannot be modelled analytically either, warranting an analysis similar to the one discussed here. The Euclid-like source sample used in this work is split into the same redshift bins as suggested by the Euclid Collaboration forecast choices (Blanchard et al. 2019). In addition, within each bin, the median redshift is chosen to agree with the one expected for the Euclid sources.
Considering 6 logarithmically spaced $\ell$ bins per dex (as in the previous sections), we obtain the magnification-shear signal forecasts shown in figure 11. We see that the magnification-shear signal constitutes a considerable systematic when correlating with the zlow bin, since the cumulative SNR on scales within $100 < \ell < 4600$ is between 0.3 and 0.7. For correlations with zhigh, the magnification bias signal becomes strong enough to be detectable (with the cumulative SNR within $100 < \ell < 4600$ ranging from 1.5 when correlating zhigh and Euclid1 to 2.8 when correlating zhigh and Euclid10). This might require any future GGL analysis of Euclid+BOSS or Euclid+DESI data to allow the $\alpha$ parameters to vary freely as nuisance parameters in order to properly account for this systematic. The method outlined in this paper could be used to set informative priors on the $\alpha$ values within each lens bin.

CONCLUSIONS
In this paper, we have introduced a novel method to estimate the effective luminosity function slope, $\alpha$, of galaxy samples defined by a complex selection function that is not simply flux- or magnitude-limited. The method calibrates the estimates from observables against accurate cosmological simulations with the same sample selection. This expands upon previous work where the flux magnification was only measured for flux-limited cases or found to be inaccurate in non-flux-limited cases (Hildebrandt 2016).
The new method determines the underlying slope of the luminosity function of the simulated galaxy sample ($\alpha$) from unobservable properties such as the convergence, $\kappa$, and the unlensed galaxy position. It then finds the magnitude range relative to the magnitude limit over which the resulting $\alpha_{\rm obs}$, as calculated from the observable differential galaxy count distribution, best agrees with $\alpha$. Finally, the same relative magnitude range is used to determine $\alpha_{\rm obs}$ from the observed galaxy sample.
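The three steps above can be sketched in a few lines. This is a minimal illustration rather than the pipeline used in this work: it assumes the standard relation $\alpha_{\rm obs} = 2.5\,{\rm d}\log_{10} n(<m)/{\rm d}m$ for the cumulative counts, it replaces the weighted average over the magnitude range with a straight-line fit, and it takes the convergence-based slope as a given input. The function names and the synthetic power-law sample are hypothetical.

```python
import numpy as np


def alpha_obs(mags, m_lim, delta, n_bins=10):
    """Slope estimate from cumulative counts in [m_lim - delta, m_lim].

    Fits log10 n(<m) linearly in m and returns 2.5 times the fitted slope
    (assumed standard relation; a stand-in for a weighted average).
    """
    edges = np.linspace(m_lim - delta, m_lim, n_bins + 1)
    cumulative = np.searchsorted(np.sort(mags), edges, side="right")
    good = cumulative > 0
    if good.sum() < 2:
        raise ValueError("not enough populated magnitude bins for a fit")
    slope, _ = np.polyfit(edges[good], np.log10(cumulative[good]), 1)
    return 2.5 * slope


def calibrate_range(mock_mags, m_lim, alpha_ref, deltas):
    """Pick the range width whose mock estimate best matches the reference slope."""
    diffs = [abs(alpha_obs(mock_mags, m_lim, d) - alpha_ref) for d in deltas]
    return deltas[int(np.argmin(diffs))]


def draw_power_law_mags(alpha, m_min, m_max, size, rng):
    """Synthetic magnitudes with n(<m) proportional to 10**(0.4 * alpha * m)."""
    s = 0.4 * alpha
    lo, hi = 10.0 ** (s * m_min), 10.0 ** (s * m_max)
    return np.log10(lo + rng.random(size) * (hi - lo)) / s


# Hypothetical usage: calibrate on a mock, then reuse the width on the data.
rng = np.random.default_rng(0)
mock = draw_power_law_mags(2.0, 15.0, 20.0, 200_000, rng)
best_delta = calibrate_range(mock, 20.0, alpha_ref=2.0, deltas=[0.25, 0.5, 1.0])
alpha_from_data = alpha_obs(mock, 20.0, best_delta)  # observed sample would go here
```

For a pure power law every width gives nearly the same answer, which mirrors the robustness reported for the flux-limited case. The calibration step only matters when the selection function bends the counts near the limit, as it does for BOSS.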
A few things should be considered when employing this method. We find that the magnitude ranges up to the effective magnitude limit that the simulations identify as optimal for calibrating $\alpha_{\rm obs}$ are only valid for a given redshift range, a given sample selection function and a given galaxy sample for which weak lensing simulations are available. Thus, it is important to note that this method cannot be generalised trivially, as it requires accurate cosmological simulations to ensure consistency between the two independent estimates, $\alpha_{\rm obs}$ and $\alpha$. Nonetheless, when simulations are available, it provides a robust estimate of the scale of the magnification bias for non-flux-limited surveys such as BOSS.
Applying our calibration method to the BOSS DR12 sample split into two redshift bins, we find $\alpha_{\rm obs} = 1.93 \pm 0.05$ for $0.2 < z \leq 0.5$ and $\alpha_{\rm obs} = 2.62 \pm 0.28$ for $0.5 < z \leq 0.75$, leading to a contribution to the galaxy-galaxy lensing signal of up to ∼2% for KiDS-1000 and HSC Wide sources correlated with the $0.2 < z \leq 0.5$ lens bin. The contribution can reach ∼20% when correlating KiDS-1000 and HSC Wide sources with the $0.5 < z \leq 0.75$ BOSS lens bin, where the magnification-shear signal can exceed the noise, with a cumulative SNR of up to 1.1 and 2.0 for KiDS-1000 and HSC Wide, respectively. Hence, for both KiDS-1000 and HSC Wide, the magnification-shear signal appears to be significant enough to warrant the modelling of this systematic in future GGL analyses involving BOSS lenses, as was already done in the recent KiDS-1000 analysis (Joachimi et al. 2020; Heymans et al. 2020). This necessity becomes even more evident in the forecasts for a GGL analysis of Euclid-like sources with DESI-like lenses. In this case, the magnification-shear signal is either a considerable systematic when correlating with the zlow bin (with a cumulative SNR of around 0.5), or it even becomes a detectable signal when correlating the source bins with zhigh, giving cumulative SNRs around 2 which can go up to 2.8. This might require any future GGL analysis incorporating Euclid and any highly selected lens sample (e.g. DESI or BOSS) to allow the effective luminosity function slope ($\alpha$) of each lens sample to vary freely within the model, using informative priors based on an analysis similar to the one conducted in this paper. These results are in line with Duncan et al. (2014) as well as the recent findings from Mahony et al. (in prep.), where it was determined that including the magnification bias in the modelling is necessary for the next generation of surveys to accurately infer cosmological parameters.
We expect similar conclusions for other surveys. It might be desirable to estimate the magnification bias using the methodology outlined in this paper in clustering and GGL analyses based on DES redMaGiC lens galaxies, such as the ones described in Clampitt et al. (2017), Elvin-Poole et al. (2018) and Prat et al. (2018), since this sample also follows a complex selection function (Rozo et al. 2016). The SNR should be comparable to HSC and KiDS, so the magnification bias will not have to be included as a free parameter. On the other hand, for surveys such as LSST (Ivezić et al. 2008; Abell et al. 2009) and the Nancy Grace Roman Space Telescope (formerly known as WFIRST, Spergel et al. 2015), it may become necessary to make the $\alpha$ of the lens galaxy samples a nuisance parameter in any clustering or GGL analysis, as we suggest for a Euclid+DESI-like analysis.

[...] between 0 and ∼2). This confirms that a power law is a good approximation for the luminosity function over a large magnitude range near the faint end of the distribution, which implies that, in a magnitude-limited survey, $\alpha_{\rm obs}$ estimates are robust and accurate even after substantial changes in the magnitude range considered. We find similarly good agreement between $\alpha$ and $\alpha_{\rm obs}$ in the zhigh bin, where $\alpha^{\rm zhigh}_{\rm obs} = 3.12 \pm 0.2$, while figure A2 shows that this estimate is robust at high redshifts.
Despite the consistency between $\alpha$ and $\alpha_{\rm obs}$ and the robustness of the estimate to small changes in the calibration magnitude range, it is surprising to see such a drastic increase in $\alpha_{\rm obs}$ between zlow and zhigh. This seems to be a consequence of the magnitude limit of 20.2 being bright enough to exclude a substantial fraction of faint galaxies at high redshifts, such that the power law in flux assumed in equation (2) no longer applies. If we model the luminosity function of the galaxies as a Schechter function (Schechter 1976), such a selection of bright galaxies places the sample in the regime where the exponential term of the Schechter function dominates, which leads to overestimates of $\alpha$. In general, this is not of much concern, since most magnitude-limited surveys operate within a regime where the power law approximation holds.

Figure A2. $\alpha_{\rm obs}$ estimates from the MICE2 simulations for the magnitude-limited case (magnitude $< 20.2$) over the magnitude ranges below the turn-off magnitude considered to calculate the weighted average. Two redshift bins are considered: $0.2 < z \leq 0.5$ (top) and $0.5 < z \leq 0.75$ (bottom). The red cross marks the $\alpha_{\rm obs}$ estimate which overlaps the most with the estimate from the weak lensing convergence (black line). Legend: $\alpha^{\rm MICE2}_{\rm obs}$ with max. overlap with $\alpha^{\rm MICE2}$; from convergence ($\alpha^{\rm MICE2}$); from magnitude distribution ($\alpha^{\rm MICE2}_{\rm obs}$).

This paper has been typeset from a TeX/LaTeX file prepared by the author.
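The Schechter-function argument made for the magnitude-limited zhigh bin can be illustrated numerically. The sketch below is an assumption-laden toy rather than part of the analysis: it treats flux as proportional to luminosity at fixed redshift, defines the effective slope as $-{\rm d}\ln n(>L)/{\rm d}\ln L$, and uses an illustrative faint-end slope of $-1$. The function names are hypothetical.

```python
import numpy as np


def schechter_counts_above(x_min, faint_slope=-1.0, x_max=50.0, n_grid=20000):
    """Counts above x = L / L_star for phi(x) = x**faint_slope * exp(-x).

    Integrated with a trapezoid rule on a logarithmic grid; the arbitrary
    normalisation cancels in the logarithmic slope.
    """
    x = np.logspace(np.log10(x_min), np.log10(x_max), n_grid)
    phi = x ** faint_slope * np.exp(-x)
    return float(np.sum(0.5 * (phi[1:] + phi[:-1]) * np.diff(x)))


def effective_alpha(x, faint_slope=-1.0, eps=1e-3):
    """Local slope -dln n(>x)/dln x from a centred finite difference."""
    lo, hi = x * (1.0 - eps), x * (1.0 + eps)
    n_lo = schechter_counts_above(lo, faint_slope)
    n_hi = schechter_counts_above(hi, faint_slope)
    return -(np.log(n_hi) - np.log(n_lo)) / (np.log(hi) - np.log(lo))


# A limit deep in the power-law regime versus one in the exponential tail.
slope_faint_limit = effective_alpha(0.1)
slope_bright_limit = effective_alpha(3.0)
```

In this toy model the effective slope rises steeply once the limit moves above $L_\ast$, which is the qualitative behaviour invoked to explain the large $\alpha_{\rm obs}$ found for zhigh in the magnitude-limited case.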