Intrinsic alignment from multiple shear estimates: A first application to data and forecasts for Stage IV

Without mitigation, the intrinsic alignment (IA) of galaxies poses a significant threat to achieving unbiased cosmological parameter constraints from precision weak lensing surveys. Here, we apply for the first time to data a method to extract the scale dependence of the IA contribution to galaxy-galaxy lensing, which takes advantage of the difference in alignment signal as measured by shear estimators with different sensitivities to galactic radii. Using data from Year 1 of the Dark Energy Survey, with shear estimators METACALIBRATION and IM3SHAPE, we investigate and address method systematics including non-trivial selection functions, differences in weighting between estimators, and multiplicative bias. We obtain a null detection of IA, which appears qualitatively consistent with existing work. We then forecast the application of this method to Rubin Observatory Legacy Survey of Space and Time (LSST) data and place requirements on a pair of shear estimators for detecting IA and constraining its 1-halo scale dependence. We find that for LSST Year 1, shear estimators should have at least a $40\%$ difference in IA amplitude, and the Pearson correlation coefficient of their shape noise should be at least $\rho=0.50$, to ensure a $1\sigma$ detection of IA and a constraint on its 1-halo scale dependence with a signal-to-noise ratio greater than $1$. For Year 10, a $1\sigma$ detection and constraint become possible for $20\%$ differences in alignment amplitude and $\rho=0.50$.


INTRODUCTION
Einstein's theory of General Relativity predicts that massive objects alter the geometry of surrounding space-time and consequently affect the path of light rays passing close to them, in an effect known as gravitational lensing. On large scales, the subtle lensing of light rays from background source galaxies by foreground lens masses is only detectable by correlating the shapes of many galaxies, as their intrinsic ellipticity is much larger than any observed change (shear) induced by large-scale lensing. This effect, referred to as weak gravitational lensing, has proven to be an important scientific tool for probing the matter distribution of the Universe, which in turn has allowed us to develop our understanding of dark matter and dark energy (e.g. Hu 2002; Baldauf et al. 2010; Weinberg et al. 2013; Abbott et al. 2018).
Weak lensing measurements typically consider auto-correlations between the shapes of source galaxies (cosmic shear; CS), or cross-correlate the shapes of source galaxies with the positions of foreground lens galaxies (galaxy-galaxy lensing; GGL); for a review on the theory of weak lensing, see Bartelmann & Schneider (2001). However, both cosmic shear and galaxy-galaxy lensing are susceptible to contamination by correlations known as intrinsic alignments (IA), resulting from tidal physics as well as galaxy formation and evolution effects. The origin of IA at different scales and for different galaxy types is an active area of research, but is generally understood to include tidal effects, as well as possible contributions from galaxy evolution history and environment (Croft & Metzler 2000; Heavens et al. 2000; Troxel & Ishak 2015). In GGL, which will be the focus of this work, correlations due to IA exist between lens and source galaxies before the weak lensing effect imprints on our observations. Consequently, if IA is not properly accounted for, it can result in biased lensing estimates and thus biased constraints on cosmological models.
★ E-mail: c.macmahon@ncl.ac.uk
Current Stage III surveys, such as the Dark Energy Survey (DES; Abbott et al. 2022), Kilo-Degree Survey (KiDS; Heymans et al. 2021), and Hyper Suprime-Cam survey (HSC; Hikage et al. 2019), and upcoming Stage IV surveys, such as the Rubin Observatory's Legacy Survey of Space and Time (LSST; Ivezić et al. 2019) and Euclid (Scaramella et al. 2022), are vastly decreasing the statistical uncertainty on weak lensing measurements. With such a wealth of modern surveys providing greater statistical power, IA is becoming a significant source of uncertainty (e.g. Samuroff et al. 2019; Secco et al. 2022) and is forecast to become even more so in the near future (Krause et al. 2016).
Direct detection of intrinsic alignment, using smaller samples of more luminous 'source' galaxies for which spectroscopic redshifts are available, has provided key insight into the physics of intrinsic alignment, by directly selecting only those 'source' and 'lens' galaxies which are truly physically associated (Mandelbaum et al. 2006; Hirata et al. 2007; Okumura et al. 2009; Singh et al. 2015). However, in order to achieve the statistical power needed to make an accurate weak lensing measurement, millions of galaxy images are required. Thus, many surveys measure galaxy redshifts using photometry with much broader spectral bands than spectroscopy, leading to larger associated uncertainties. Such surveys also do not always have a representative spectroscopic sub-sample available.
Other methods for measuring or mitigating the IA contamination to GGL have exploited the redshift dependence of the effect, for example by binning sources in photometric redshift (photo-z) to separate those which are closer or further in redshift from the lenses (Heymans et al. 2004; Hirata et al. 2004; Joachimi et al. 2011; Blazek et al. 2012). However, such methods could be impacted by potentially large photo-z uncertainties, which in the worst cases may even be incorrectly estimated (e.g. Bernstein & Huterer 2010).
Advances are being made in the measurement of photo-z (e.g. Bilicki et al. 2018), and it may soon be possible to obtain a large enough spectroscopic sample to carry out high-precision weak lensing studies (e.g. DESI Collaboration et al. 2016). Nonetheless, characterising the associated uncertainties remains an important consideration for upcoming photometric lensing surveys, such as LSST and Euclid. Novel methods to measure the IA contamination are also being proposed to address the issues associated with photo-z, such as the self-calibration methods (targeting cosmic shear; Zhang 2010; Troxel & Ishak 2012; Yao et al. 2017, 2019). However, these methods are still somewhat dependent on the photo-z uncertainty, although they do include parameters to account for this.
In Leonard & Mandelbaum (2018) (hereafter: L2018), a novel method for measuring and/or constraining the scale dependence of intrinsic alignment was proposed. This method attempts to use the dependence of the IA signal's amplitude on the radial scales within a galaxy, rather than its dependence on redshift. We expect the outer radial regions of galaxies to be more aligned with local structure than the inner radial regions, which results in twisting of the isophotes in a galaxy's light profile. Using observational data, Singh & Mandelbaum (2016) showed that shear estimators with sensitivity to different radial scales within galaxies had different levels of IA contamination, due to isophotal twisting theorised to result from IA. Further observational evidence was shown in Georgiou et al. (2019), where altering the radial weighting of a shape estimator resulted in different measured alignment amplitudes. Tenneti et al. (2014) also showed this effect in simulated data.
The method of L2018 (henceforth referred to as the multi-estimator method; MEM) therefore looks to compare weak lensing measurements from two different shear estimators, with sensitivity to different radial regions of galaxies. If the lensing contribution to shear can be shown to be the same in both estimators (as should ideally be the case), taking the difference of the two estimates would 'cancel out' the lensing signal (since the lensing effect does not depend on the radial region of the galaxy from which the light originated). This leaves a portion of the IA signal, determined by the difference in IA amplitude between the radial regions of the galaxy which each estimator probes. The cancellation of the lensing signal requires some assumptions, which will be discussed in greater detail later as a key subject of this work.
The advantages of such a method are twofold. Firstly, since the lensing signal is cancelled, it does not need to be measured and removed. This may make such a method particularly robust in the case of catastrophic photo-z error estimations, which could become more likely as upcoming surveys image further and fainter sources than ever before. Secondly, correlations in shape noise and cosmic variance between the two estimators could reduce the uncertainty in the measured IA signal. This would allow us to test IA at small scales within the 1-halo regime, which current models struggle to describe due to non-linear effects.
In this work, we focus on GGL and do not consider the MEM in the context of direct application to cosmic shear, as non-local gravitational-intrinsic correlation would significantly complicate the formalism of the estimator. Therefore, we consider the simplest scenario for the estimator (GGL) while the method is still in development, with the view of potentially extending to direct cosmic shear applications in future work.
This paper is structured as follows: in Section 2, we review the formalism of the MEM from L2018 and introduce a series of assumptions made in its most basic construction. We then go on to derive a new fundamental expression in the absence of one of these assumptions and discuss other potential complications. In Section 3, we present the first observational measurement with MEM, using the Dark Energy Survey Year 1 galaxy shape catalogues, and propose methods to address the complications to the basic formalism outlined in Section 2. In Section 4, we consider the method in the context of upcoming Stage IV lensing surveys and carry out forecasts to place requirements on shear estimators for use with the MEM. Finally, in Section 5, we conclude by placing these results in the context of classes of shear estimators planned for deployment in Stage IV surveys.

Basic method formalism
In this section, we briefly review the mathematical formalism of the MEM, as introduced in L2018.
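The key GGL observable considered here is tangential shear, which measures the level of alignment and ellipticity distortion tangential to a lens. We consider two different estimators for the tangential shear, as given by two different shear estimation methods, $\tilde{\gamma}_t$ and $\tilde{\gamma}'_t$,

$$\tilde{\gamma}_t(\theta) = B(\theta)\,(1+m)\,\frac{\sum_j^{\rm lens}\tilde{w}_j\,\gamma_{\rm L}^j}{\sum_j^{\rm lens}\tilde{w}_j} + B(\theta)\,(1+m)\,\frac{\sum_j^{\rm lens}\tilde{w}_j\,\gamma_{\rm IA}^j}{\sum_j^{\rm lens}\tilde{w}_j}, \tag{1}$$

$$\tilde{\gamma}'_t(\theta) = B'(\theta)\,(1+m')\,\frac{\sum_j^{\rm lens}\tilde{w}'_j\,\gamma_{\rm L}^j}{\sum_j^{\rm lens}\tilde{w}'_j} + a\,B'(\theta)\,(1+m')\,\frac{\sum_j^{\rm lens}\tilde{w}'_j\,\gamma_{\rm IA}^j}{\sum_j^{\rm lens}\tilde{w}'_j}. \tag{2}$$

Here, the label 'lens' indicates a sum over lens-source pairs and primes mark quantities belonging to the second estimator. $\gamma$ denotes shear, subscripts L and IA denote lensing and IA contributions respectively, and a tilde denotes an observed quantity. $\theta$ is the on-sky lens-source angular separation, $\tilde{w}_j$ are weights given to each lens-source pair, and $a$ is a factor which scales the IA amplitude of one estimator relative to the other. $m$ and $m'$ represent sample-level multiplicative bias, residual in the estimators post-calibration (note these are not the full multiplicative bias values, only a portion of the bias that remains due to uncertainty on the calibration values). The boost factor, $B(\theta)$, accounts for 'excess' galaxies, which are physically associated with the lens due to clustering, and as such not expected to be physically associated in a random sample. It is given by:

$$B(\theta) = \frac{N_{\rm rand}}{N_{\rm lens}}\,\frac{\sum_j^{\rm lens}\tilde{w}_j}{\sum_j^{\rm rand}\tilde{w}_j}, \tag{3}$$

where 'rand' indicates a sum over random-source pairs, i.e. sources paired with galaxies from a catalogue of random points drawn from the lens redshift distribution. $N_{\rm rand}$ is the number of randoms and $N_{\rm lens}$ is the number of lenses.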
We now apply the assumptions that the weights for the two methods are identical, and the multiplicative biases residual after calibration, $m$ and $m'$, are demonstrably subdominant. We will later revisit these assumptions in detail. Taking the difference of our estimators, by subtracting equation 2 from equation 1, now gives:

$$\tilde{\gamma}_t(\theta) - \tilde{\gamma}'_t(\theta) = (1-a)\,B(\theta)\,\frac{\sum_j^{\rm lens}\tilde{w}_j\,\gamma_{\rm IA}^j}{\sum_j^{\rm lens}\tilde{w}_j}. \tag{4}$$

Now consider a sample of lens-source pairs in which the lens and source have small enough line-of-sight separation that we would expect them to be intrinsically aligned. The line-of-sight scale within which it is conventionally assumed IA could be present is 100 Mpc/h (e.g. L2018) and we adopt this assumption here. The quantity of interest to us is the tangential shear due to IA per contributing lens-source pair. To account only for contributing pairs, we divide equation 4 by the sum of the weights of contributing pairs. To express this, we make the following definition:

$$\frac{1}{B(\theta)}\,\frac{\sum_j^{\rm lens}\tilde{w}_j}{\sum_j^{\rm excess}\tilde{w}_j + \sum_j^{\rm rand,\,close}\tilde{w}_j} = \frac{1}{B(\theta) - 1 + F}, \tag{5}$$

with $F$ defined by:

$$F = \frac{\sum_j^{\rm rand,\,close}\tilde{w}_j}{\sum_j^{\rm rand}\tilde{w}_j}, \tag{6}$$

where 'excess' denotes a sum over the excess lens-source pairs due to clustering, 'rand, close' denotes a sum over sources within $\Pi = 100$ Mpc/h line-of-sight separation of random points (L2018), and sums over random-source pairs in equation 5 are understood to be rescaled by $N_{\rm lens}/N_{\rm rand}$. The choice of 100 Mpc/h does warrant further investigation, which we touch on again in Section 4 below. Finally, multiplying equation 4 by equation 5 gives us an expression to extract a portion of the IA signal, per intrinsically aligned lens-source pair, $(1-a)\,\bar{\gamma}_{\rm IA}$:

$$(1-a)\,\bar{\gamma}_{\rm IA}(\theta) = \frac{\tilde{\gamma}_t(\theta) - \tilde{\gamma}'_t(\theta)}{B(\theta) - 1 + F}. \tag{7}$$

Equation 7 is the fundamental equation of this method. From it we can measure a portion of the IA contribution up to an amplitude determined by $a$. For example, a value of $a = 0.8$ indicates a 20% difference in the IA contamination of our two estimators. Even in the case where we only recover a small fraction of the IA contamination, provided the signal is above zero, this gives us the ability to extract information about the scale dependence of intrinsic alignment, potentially inside the non-linear 1-halo regime. If an estimate of $a$ could be obtained because the radial sensitivities of the estimators were known, then it would be possible to fully model the IA contribution to tangential shear for either of these estimators. However, even if $a$ could not be estimated, information on the scale dependence would allow for the amplitude of an IA model to be calibrated within cosmological parameter estimation pipelines. A more detailed discussion on the significance of $a$ is given in Section 4.3.1.

While equation 7 represents the ideal case of this method, as mentioned above, we have made several strong assumptions about the relative characteristics of the shear estimation methods to obtain it. We now go beyond the method as introduced in L2018, to explore the consequences of relaxing these assumptions.
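As an illustration of the basic estimator, the sketch below evaluates the boost factor, the close-pair fraction, and the ratio (difference of the two tangential shears) / (B − 1 + F) per angular bin. It assumes identical weights for both estimators and negligible residual multiplicative bias; the function names and all numbers are hypothetical toy values, not quantities from the DES analysis.

```python
def boost_factor(sum_w_lens, sum_w_rand, n_lens, n_rand):
    """B(theta): weighted lens-source pairs over weighted random-source pairs,
    rescaled by the ratio of the number of randoms to the number of lenses."""
    return (n_rand / n_lens) * (sum_w_lens / sum_w_rand)


def close_fraction(sum_w_rand_close, sum_w_rand):
    """F: weighted fraction of random-source pairs inside the line-of-sight cut."""
    return sum_w_rand_close / sum_w_rand


def mem_ia_signal(gamma_t, gamma_t_prime, boost, f_close):
    """(1 - a) * mean IA tangential shear per physically associated pair."""
    signal = []
    for g, g_prime, b in zip(gamma_t, gamma_t_prime, boost):
        denom = b - 1.0 + f_close
        if denom <= 0.0:
            raise ValueError("B - 1 + F must be positive in every bin")
        signal.append((g - g_prime) / denom)
    return signal


# Toy inputs for two angular bins (hypothetical values).
n_lens, n_rand = 100, 1000
sum_w_rand = 10000.0
boost = [
    boost_factor(1300.0, sum_w_rand, n_lens, n_rand),  # 1.30
    boost_factor(1050.0, sum_w_rand, n_lens, n_rand),  # 1.05
]
f_close = close_fraction(500.0, sum_w_rand)            # 0.05
gamma_t = [2.00e-3, 5.0e-4]        # first estimator
gamma_t_prime = [1.93e-3, 4.9e-4]  # second estimator
ia_signal = mem_ia_signal(gamma_t, gamma_t_prime, boost, f_close)
```

The guard on B − 1 + F reflects that the normalisation is only meaningful when some physically associated pairs are expected in the bin.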

The effect of residual multiplicative bias
The work of L2018 assumed residual multiplicative bias due to uncertainty in the multiplicative bias calibration to be subdominant and thus ignored. This is because L2018 was considering future shear estimation methods with demonstrably subdominant calibration uncertainty, such as the Bayesian Fourier Domain method proposed by Bernstein & Armstrong (2014). In the case where the uncertainty on multiplicative bias calibration cannot be shown to be subdominant, residual multiplicative bias can remain in the estimators, and equation 7 must be re-formulated to include terms accounting for the residual bias in each estimator. Subtracting equation 2 from equation 1 in this instance yields:

$$\tilde{\gamma}_t(\theta) - \tilde{\gamma}'_t(\theta) = (m - m')\,B(\theta)\,\frac{\sum_j^{\rm lens}\tilde{w}_j\,\gamma_{\rm L}^j}{\sum_j^{\rm lens}\tilde{w}_j} + \left[(1+m) - a\,(1+m')\right]B(\theta)\,\frac{\sum_j^{\rm lens}\tilde{w}_j\,\gamma_{\rm IA}^j}{\sum_j^{\rm lens}\tilde{w}_j}. \tag{8}$$

Normalising by the weighted number of physically associated pairs gives,

$$\frac{\tilde{\gamma}_t(\theta) - \tilde{\gamma}'_t(\theta)}{B(\theta) - 1 + F} = (m - m')\,\bar{\gamma}_{\rm L,PA}(\theta) + \left[(1+m) - a\,(1+m')\right]\bar{\gamma}_{\rm IA}(\theta), \tag{9}$$

where $\bar{\gamma}_{\rm L,PA}$ and $\bar{\gamma}_{\rm IA}$ respectively represent the average lensing and IA contributions to tangential shear, per lens-source pair. Here, the subscript PA denotes that the residual is normalised by the weighted number of physically associated pairs, not that only those pairs have contributed to this lensing signal. Essentially, this implies the lensing contribution to shear does not fully cancel in the case where residual multiplicative bias in the estimators is not subdominant, leaving a lensing residual, $(m - m')\,\bar{\gamma}_{\rm L,PA}$. Note that the term 'lensing residual' refers generally to any part of the lensing signal that was not cancelled by taking the difference of the two tangential shears, whereas residual multiplicative bias specifically refers to a bias remaining in the tangential shear estimates due to the uncertainty on the multiplicative bias calibration.
Due to the percent level contribution of the IA signal to the full tangential shear, even percent level uncertainty on bias calibration has the potential to leave a lensing residual which dominates the IA signal in the MEM. Accounting for multiplicative bias uncertainty in the case where it is not subdominant is therefore imperative to the success of this method.
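To make the size of this effect concrete, the following sketch compares the two terms that survive the subtraction when one estimator measures (1 + m)(γ_L + γ_IA) and the other (1 + m′)(γ_L + a γ_IA), with identical weights assumed. The input amplitudes are hypothetical, chosen only so that IA is a percent-level fraction of the lensing shear.

```python
def mem_terms(gamma_l_pa, gamma_ia, a, m, m_prime):
    """Split the normalised difference signal into its two contributions.

    Returns (lensing_residual, ia_term), where
      lensing_residual = (m - m') * gamma_L,PA
      ia_term          = [(1 + m) - a * (1 + m')] * gamma_IA
    """
    lensing_residual = (m - m_prime) * gamma_l_pa
    ia_term = ((1.0 + m) - a * (1.0 + m_prime)) * gamma_ia
    return lensing_residual, ia_term


# Hypothetical amplitudes: IA is 1% of the lensing shear.
residual, ia = mem_terms(gamma_l_pa=5.0e-3, gamma_ia=5.0e-5,
                         a=0.8, m=0.02, m_prime=-0.01)

# With perfectly calibrated estimators the lensing term vanishes and the
# IA term reduces to (1 - a) * gamma_IA.
residual_ideal, ia_ideal = mem_terms(5.0e-3, 5.0e-5, a=0.8, m=0.0, m_prime=0.0)
```

In this toy case a 3% difference in residual bias leaves a lensing term roughly an order of magnitude larger than the IA term.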

Weighted source redshift distributions
Another potential source of a lensing residual arises when our two estimators have different weighting schemes. It is clear from equations 1 and 2 that different weights would result in different values of tangential shear, even in the absence of IA and multiplicative bias uncertainty, as well as different boost (equation 3) and $F$ (equation 6) values.
To see how the different weighting schemes can manifest as different tangential shears, it is simplest to express $\gamma_t$ as a Fourier space integral over the matter power spectrum. Following, for example, Prat et al. (2018), the tangential shear is given by,

$$\gamma_t(\theta) = b\,\frac{3}{2}\,\Omega_{\rm m}\left(\frac{H_0}{c}\right)^2 \int\frac{{\rm d}\ell}{2\pi}\,\ell\,J_2(\theta\ell) \int{\rm d}z\left[\frac{g(z)\,n_{\rm L}(z)}{a(z)\,\chi(z)}\,P_{\delta\delta}\!\left(k=\frac{\ell}{\chi(z)},\,\chi(z)\right)\right]. \tag{10}$$

Here, we have assumed lens galaxies trace the underlying matter with a linear bias, such that $\delta_{\rm g} = b\,\delta_{\rm m}$. $J_2$ is the second order Bessel function, $\ell$ is the angular wavenumber, $k$ is the 3D wavenumber, $a$ is the scale factor (not the IA offset parameter defined previously), $\chi$ is comoving distance, and $n_{\rm L}(z)$ is the lens redshift distribution. The quantity of interest that changes depending on the weighted redshift distribution of the sources is $g(z)$, the lensing efficiency, given by:

$$g(z) = \int_z^{\infty}{\rm d}z'\,\tilde{w}(z')\,n_{\rm s}(z')\,\frac{\chi(z') - \chi(z)}{\chi(z')}, \tag{11}$$

where we have given the weights as a function of redshift as, typically, fainter and noisier galaxies are more likely to be observed at higher redshifts and have lower associated weights. From equation 11 it is apparent that a redshift distribution weighted by two different schemes for each estimator (i.e. $\tilde{w}(z)\,n_{\rm s}(z)$) will result in different lensing efficiencies and thus different tangential shears. This is therefore another potential source of a lensing residual, which could contaminate attempts to measure the IA amplitude offset between the two estimators, if not adequately corrected for.
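The dependence of the lensing efficiency on the weighting scheme can be checked numerically. The sketch below integrates w(z′) n_s(z′) [χ(z′) − χ(z)] / χ(z′) above the lens redshift for two weightings of the same source distribution. It assumes a flat ΛCDM background with illustrative parameters, normalises the weighted distribution to unit integral, and uses a toy source distribution and toy weights; none of these choices are taken from the DES analysis.

```python
import math

C_KM_S = 299792.458  # speed of light in km/s


def comoving_distance(z, omega_m=0.3, h=0.7, steps=200):
    """Comoving distance in Mpc for flat LCDM (trapezoidal integration)."""
    if z <= 0.0:
        return 0.0

    def inv_e(zz):
        return 1.0 / math.sqrt(omega_m * (1.0 + zz) ** 3 + 1.0 - omega_m)

    dz = z / steps
    total = 0.5 * (inv_e(0.0) + inv_e(z))
    total += sum(inv_e(i * dz) for i in range(1, steps))
    return (C_KM_S / (100.0 * h)) * total * dz


def trapezoid(y, x):
    return sum(0.5 * (y[i] + y[i + 1]) * (x[i + 1] - x[i])
               for i in range(len(x) - 1))


def lensing_efficiency(z_lens, z_grid, n_s, weights):
    """g(z_lens) for a weighted source distribution normalised to unity."""
    chi_l = comoving_distance(z_lens)
    wn = [w * n for w, n in zip(weights, n_s)]
    norm = trapezoid(wn, z_grid)
    integrand = []
    for z, f in zip(z_grid, wn):
        if z > z_lens:
            chi_s = comoving_distance(z)
            integrand.append(f * (chi_s - chi_l) / chi_s)
        else:
            integrand.append(0.0)  # sources in front of the lens are not lensed
    return trapezoid(integrand, z_grid) / norm


z_grid = [0.02 * i for i in range(1, 101)]                  # 0.02 ... 2.0
n_s = [z ** 2 * math.exp(-(z / 0.5) ** 1.5) for z in z_grid]  # toy n(z)
flat_weights = [1.0] * len(z_grid)
down_weights = [1.0 / (1.0 + z) ** 2 for z in z_grid]       # penalise high z

g_flat = lensing_efficiency(0.3, z_grid, n_s, flat_weights)
g_down = lensing_efficiency(0.3, z_grid, n_s, down_weights)
```

Because the geometric factor grows with source redshift, the weighting that suppresses high-redshift sources yields the smaller efficiency, which is the origin of the weight-induced lensing residual.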

Galaxy size and limiting resolution
In order for the different radial weightings of the two estimators to have a meaningful physical interpretation, therefore capturing a difference in the IA amplitude, it is required that the survey in question is able to resolve the physical scales of both radial weightings. An illustration of this issue is shown in Figure 1; the galaxy has to be larger than the point spread function (PSF) by a greater degree than would be traditionally required with a single estimator, so that the radial sensitivity of a second estimator can peak at a smaller radius while still probing scales larger than the PSF. This therefore implies the galaxy sample used with the MEM has stricter requirements on the effective size of a galaxy than a typical lensing sample. Following the formalism of Chang et al. (2013), the effective size is given by,

$$R = \frac{r_{\rm gal}^2}{r_{\rm PSF}^2}, \tag{12}$$

where $r_{\rm gal}$ is an estimate of the galaxy radius and $r_{\rm PSF}$ is the radius of the point spread function for the galaxy in question. To determine if a galaxy is suitable for shear estimation, the measurement noise, $\sigma_m$, is calculated from,

$$\sigma_m(\nu, R) = \frac{a}{\nu}\left[1 + \left(\frac{b}{R}\right)^{c}\right], \tag{13}$$

where $\nu$ is the signal-to-noise ratio of the galaxy image and $(a, b, c)$ are parameters specific to the shear estimator in consideration.
Clearly, a galaxy with a small effective size could still be selected for shear estimation based on high signal-to-noise. Therefore, a cut on the effective size is necessary before determining the measurement noise on the remaining galaxies. This requirement on the effective size is also dependent on the exact radial weighting schemes used. We will address this issue again in Section 4.1, in the context of LSST.
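A short sketch of this two-stage selection follows: the effective size R = r_gal² / r_PSF² is cut first, and the measurement noise σ_m = (a/ν)[1 + (b/R)^c] is only evaluated for galaxies that survive. The default (a, b, c) values, the size threshold, and the noise threshold are illustrative placeholders and should be replaced by values appropriate to the estimator and radial weightings in question.

```python
def effective_size(r_gal, r_psf):
    """Effective size R: squared ratio of galaxy radius to PSF radius."""
    return (r_gal / r_psf) ** 2


def measurement_noise(snr, eff_size, a=1.58, b=5.03, c=0.39):
    """Shape measurement noise for signal-to-noise snr and effective size R.
    The defaults for (a, b, c) are placeholders for estimator-specific fits."""
    return (a / snr) * (1.0 + (b / eff_size) ** c)


def select_for_mem(galaxies, min_eff_size, max_noise):
    """Apply the size cut first, then the measurement-noise cut."""
    selected = []
    for gal in galaxies:
        size = effective_size(gal["r_gal"], gal["r_psf"])
        if size < min_eff_size:
            continue  # too poorly resolved for two distinct radial weightings
        if measurement_noise(gal["snr"], size) <= max_noise:
            selected.append(gal["name"])
    return selected


galaxies = [
    {"name": "A", "r_gal": 0.3, "r_psf": 0.4, "snr": 200.0},  # small but bright
    {"name": "B", "r_gal": 0.8, "r_psf": 0.4, "snr": 50.0},   # resolved, bright
    {"name": "C", "r_gal": 0.8, "r_psf": 0.4, "snr": 10.0},   # resolved, faint
]
selected = select_for_mem(galaxies, min_eff_size=1.5, max_noise=0.1)
```

Galaxy A passes the noise threshold on signal-to-noise alone, which is why the size cut has to come first.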

OBSERVATIONAL CASE STUDY: DES Y1
We now carry out the first application of the MEM to observational data, to probe the effectiveness of the method in the context of a recent weak lensing shape sample. Our primary objectives here are to provide a baseline procedure for applying the method to observational data, and highlight complications that may arise when applying the MEM to real data.

Data and shear estimators
We choose to use the DES Y1 lens and source catalogues, as there are two different shear estimators applied to the source sample, alongside a multitude of published work that has used these catalogues. This avoids the need to match galaxies between different surveys which have used different estimators, and provides resources for validation and comparison of intermediate measurements (Prat et al. 2018).

Shape catalogues
We make use of the two public DES Y1 galaxy shape catalogues (Zuntz et al. 2018), which assign shears using the METACALIBRATION estimator (Sheldon & Huff 2017, 'MCAL' hereafter) and the IM3SHAPE estimator ('IM3' hereafter). The former contains $34.8 \times 10^6$ galaxies, and the latter $21.9 \times 10^6$ galaxies. As previously mentioned, the MEM requires the lensing signal in both estimators to be the same for the lensing shear to cancel. This can naively (neglecting for the moment the complications discussed in Sections 2.2 and 2.3) be achieved by simply selecting only those galaxies for which shear estimates exist from both MCAL and IM3, to create a matched catalogue of galaxies. This matched catalogue contains $17.8 \times 10^6$ source galaxies, with an average effective number density of $n_{\rm s} = 3.26$ arcmin$^{-2}$. Without significant additional analysis, the relative radial sensitivities for these two estimators, and therefore the difference in their IA contamination, is unknown. We therefore proceed with the objective of understanding the intricacies of the MEM in an observational context, and do not necessarily expect a detection of IA.
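Constructing the matched catalogue amounts to an inner join on a shared object identifier. A minimal sketch, assuming each catalogue is a list of rows carrying a common identifier; the field name `object_id` and the toy rows are hypothetical and do not reflect the actual DES column names:

```python
def match_catalogues(mcal_rows, im3_rows, id_key="object_id"):
    """Return (mcal_row, im3_row) pairs for objects present in both catalogues."""
    im3_by_id = {row[id_key]: row for row in im3_rows}
    return [(row, im3_by_id[row[id_key]])
            for row in mcal_rows
            if row[id_key] in im3_by_id]


# Toy catalogues: objects 2 and 4 have shear estimates from both methods.
mcal = [
    {"object_id": 1, "e1": 0.10},
    {"object_id": 2, "e1": -0.05},
    {"object_id": 4, "e1": 0.02},
]
im3 = [
    {"object_id": 2, "e1": -0.04},
    {"object_id": 3, "e1": 0.07},
    {"object_id": 4, "e1": 0.03},
]
matched = match_catalogues(mcal, im3)
matched_ids = [m["object_id"] for m, _ in matched]
```

Keeping the rows paired, rather than merging them, leaves each estimator's own ellipticities and weights available for the separate tangential shear measurements.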

Lens catalogue
The DES Y1 lens catalogue (Elvin-Poole et al. 2018) contains 660,000 luminous red galaxy lenses, with redshifts determined via photometry. The redMaGiC (Rozo et al. 2016) algorithm has been applied to these lenses to select them so as to reduce photo-z error to $\sigma_z/(1+z) < 0.02$. For comparative purposes, the photo-z error in the source sample used by Blazek et al. (2012) was

Application methodology
Application of the MEM begins with the calculation of the correlation functions required to estimate tangential shear and the boost factor. We make use of the well established TreeCorr (Jarvis 2015) package to estimate these quantities which, for DES Y1 shear estimators, are given by,

$$\tilde{\gamma}_{\rm mcal}(\theta) = \frac{B(\theta)}{\langle R_\gamma\rangle + \langle R_{\rm s}\rangle}\,\frac{\sum_j^{\rm lens} w_{{\rm L},j}\,e_t^j}{\sum_j^{\rm lens} w_{{\rm L},j}},$$

$$\tilde{\gamma}_{\rm im3}(\theta) = B(\theta)\,\frac{\sum_j^{\rm lens} w_{{\rm L},j}\,w_{{\rm s},j}\,e_t^j}{\sum_j^{\rm lens} w_{{\rm L},j}\,w_{{\rm s},j}},$$

where $w_{{\rm L},j}$ denotes the lens galaxy weights, $w_{{\rm s},j}$ denotes the source galaxy weights, and $e_t^j$ the galaxy ellipticity estimates for each estimator. $\langle R_\gamma\rangle$ is the average of the galaxies' responses to artificial shear, and $\langle R_{\rm s}\rangle$ is an additional response that accounts for selection bias when cuts are made to the catalogue. Notice here that MCAL tangential shear does not incorporate individual galaxy weights, as weighting is implicit in a galaxy's response to artificial shear, which is why the sum is normalised by the average response. We use 10 angular separation bins log-spaced between 2.5' and 250', matching the angular separation range used in Prat et al. (2018), but reducing the number of bins from 20 to 10 for greater statistical power in each bin, due to the lower effective source density of the matched catalogue and small magnitude of the IA signal.
To obtain $F$, we do not consider the separations of all random-source pairs individually (as this would be computationally prohibitive). Instead, we split sources and randoms into narrow, weighted redshift bins and count the number of random-source pairs in only those bin combinations within 100 Mpc/h line-of-sight separation, as well as the total number of random-source pairs in all bins (see equation 6).
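The binned pair count can be sketched as follows. The example assumes that the weighted number of random-source pairs in a bin combination is the product of the weighted counts in the two redshift bins, and that each bin is represented by the comoving distance of its centre; the function name, these simplifications, and the toy numbers are illustrative assumptions rather than the procedure used for the measurement.

```python
def close_pair_fraction(chi_src, w_src, chi_rand, w_rand, pi_max=100.0):
    """Weighted fraction of random-source pairs within pi_max (Mpc/h) along
    the line of sight.

    chi_src, chi_rand : comoving distances (Mpc/h) of the redshift-bin centres
    w_src, w_rand     : summed weights of sources / randoms in each bin
    """
    close = 0.0
    total = 0.0
    for cs, ws in zip(chi_src, w_src):
        for cr, wr in zip(chi_rand, w_rand):
            pair_weight = ws * wr
            total += pair_weight
            if abs(cs - cr) <= pi_max:
                close += pair_weight
    return close / total


# Toy example: three source bins and two random bins.
f_close = close_pair_fraction(
    chi_src=[500.0, 1000.0, 1500.0], w_src=[1.0, 2.0, 1.0],
    chi_rand=[950.0, 1450.0], w_rand=[3.0, 1.0],
)
```

Only the (1000, 950) and (1500, 1450) bin combinations fall inside the cut here, contributing 7 of the 16 weighted pairs.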
Finally, in order to obtain the covariance on the measurement, we use a jackknife method with 20 patches defined by a k-means algorithm (see Jarvis 2015 for more detail). The jackknife method can be mathematically expressed via,

$${\rm Cov}_{\rm JK}(x) = \frac{N-1}{N}\sum_{i=1}^{N}\left(x_i - \bar{x}\right)^{\rm T}\left(x_i - \bar{x}\right),$$

where the subscript JK denotes that this is a jackknife estimate, $N$ is the number of patches, $x_i$ represents an estimate obtained with patch $i$ excluded, and $\bar{x}$ represents the average of the $x_i$ values. In this case, $x = (1 - a)\,\bar{\gamma}_{\rm IA}$.
We use the entire lens sample to maximise statistical power and ensure a large overlap between the lens and source sample, therefore including as many intrinsically aligned pairs as possible. For future analyses, a narrow lens bin could be preferable to localise the measurement in redshift space for easier comparison to other measurements. However, for the purpose of this case study as a first application of the MEM, maximising the signal-to-noise ratio by including as many galaxy pairs as possible was deemed the appropriate choice.
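For reference, the delete-one jackknife covariance with prefactor (N − 1)/N can be written in a few lines. This sketch takes a list of N leave-one-patch-out estimates, each a list of values per angular bin; the three-patch toy input is purely illustrative.

```python
def jackknife_covariance(estimates):
    """Delete-one jackknife covariance matrix.

    estimates[i][k] is the measurement in angular bin k with patch i excluded.
    """
    n = len(estimates)
    n_bins = len(estimates[0])
    mean = [sum(e[k] for e in estimates) / n for k in range(n_bins)]
    factor = (n - 1) / n
    return [
        [factor * sum((e[p] - mean[p]) * (e[q] - mean[q]) for e in estimates)
         for q in range(n_bins)]
        for p in range(n_bins)
    ]


# Toy input: three patches, two angular bins.
estimates = [[1.0, 2.0], [2.0, 4.0], [3.0, 6.0]]
cov = jackknife_covariance(estimates)
errors = [cov[k][k] ** 0.5 for k in range(len(cov))]  # 1-sigma error bars
```

The square roots of the diagonal give the per-bin error bars, while the off-diagonal terms capture correlations between angular bins.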

Selection response in the MCAL catalogue
As discussed previously, if any selection is made to the full MCAL catalogue, a selection response must be calculated to re-calibrate the sample weighting in light of this selection. In the case where no change has been made to the source ellipticity distribution through this selection, the selection response is zero. In the application to DES Y1, the matched catalogue can be thought of as a selection in the MCAL catalogue, based on selection criteria from the IM3 catalogue and vice versa.
Such a selection is highly non-trivial to determine a posteriori, as it depends on relations between various parameters which contribute to deciding if a certain shear measurement method can be employed for a given galaxy. Furthermore, shear estimation may have failed on certain galaxies for no obvious reason.
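In Sheldon & Huff (2017), the selection response is given by

$$\langle R_{\rm s}\rangle_{ij} \approx \frac{\langle e_i\rangle^{+} - \langle e_i\rangle^{-}}{\Delta\gamma_j},$$

where $\langle e_i\rangle^{+}$ and $\langle e_i\rangle^{-}$ represent the mean $i$th component ellipticities measured from images without artificial shear, but with selections made on parameters measured from images with $j$th component shear $\Delta\gamma_j$ applied positively and negatively.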
Machine learning methods have the potential to address the challenges of post-hoc selection response estimation in this case, due to their ability to classify large data-sets based on complex relations between 'features' of those data-sets. We can, in theory, train a classifier to act as a proxy for the true selection function, by predicting whether or not a galaxy is present in the matched catalogue based on 'features' of that galaxy. To do this, we first assign all galaxies in the full MCAL catalogue a new flag, indicating their presence or absence in the matched catalogue. Using the scikit-learn (Pedregosa et al. 2011) and keras (Chollet et al. 2015) packages, we train a multilayer perceptron neural network (see, e.g. Popescu et al. 2009) to predict the probability of a galaxy being selected for the matched catalogue based on its size, i, r and z band flux, signal-to-noise, and $e_1$ and $e_2$ ellipticity components.
Table 1. ... 2018), off diagonal terms are negligible. Selection response values have been determined by using the machine learning classifier described in Section 3.2.1 and thus their magnitude should be considered a lower limit on the size of the true selection response.
In the case where we take the (overly-simplistic) stance that a galaxy with a selection probability greater than 0.5 is classed as present in the matched catalogue, the trained classifier is able to predict matched galaxies in unseen test data with an accuracy of 72%. However, in reality, the inclusion of a galaxy in the matched catalogue is not a deterministic process. Therefore, instead, to find $\langle e_i\rangle^{+}$ and $\langle e_i\rangle^{-}$ we calculate the weighted mean of $e_i$ over all galaxies in the full MCAL catalogue, with weights given by the classifier's predicted probability of selection for the matched catalogue based on features measured from artificially sheared images. We are then able to estimate the selection response.
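The probability-weighted estimate described above can be sketched as a finite difference of two weighted means. The example assumes the selection response takes the form (⟨e⟩⁺ − ⟨e⟩⁻)/Δγ, that Δγ is the total shear difference between the positively and negatively sheared images, and that the selection probabilities have already been produced by some classifier; the function names and all numbers are hypothetical.

```python
def weighted_mean(values, weights):
    total = sum(weights)
    if total == 0.0:
        raise ValueError("weights sum to zero")
    return sum(v * w for v, w in zip(values, weights)) / total


def selection_response(e_unsheared, p_plus, p_minus, delta_gamma):
    """Finite-difference selection response for one ellipticity component.

    e_unsheared : ellipticities measured on the images without artificial shear
    p_plus      : selection probabilities from features of the +shear images
    p_minus     : selection probabilities from features of the -shear images
    delta_gamma : assumed total shear difference between the two sheared images
    """
    mean_plus = weighted_mean(e_unsheared, p_plus)
    mean_minus = weighted_mean(e_unsheared, p_minus)
    return (mean_plus - mean_minus) / delta_gamma


e_unsheared = [0.10, -0.20, 0.30, 0.00]
p_plus = [0.9, 0.5, 0.6, 0.8]
p_minus = [0.9, 0.5, 0.5, 0.8]
r_s = selection_response(e_unsheared, p_plus, p_minus, delta_gamma=0.02)

# If the selection does not depend on the applied shear, the response is zero.
r_s_null = selection_response(e_unsheared, p_plus, p_plus, delta_gamma=0.02)
```

Setting the probabilities to 0 or 1 recovers the hard-cut selection, so the same function covers both cases.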
We are unable to place uncertainties on the estimated selection response using this method, as the maximal accuracy of 72% (in the hard-cutoff case) indicates that the features available to us in the catalogue do not fully capture the selection and therefore the estimated value is biased from the truth in some unknown way. However, we can consider this value as a lower bound on the magnitude of the selection response and we expect the sign to be correct. A more detailed discussion on the classifier and the interpretation of this value as a lower bound is given in Appendix A.
The resulting selection response values and uncertainties are shown in Table 1, alongside the shear response values for the matched catalogue. Figure 2 shows our measurement made using the MEM, both including and excluding the estimated selection response. Failing to include the selection response in this context severely biases the measurement to the point where we see a potential false detection of IA. Importantly, since the value of selection response we have obtained is a lower bound, the error bars on the corrected signal should be considered underestimated.
We thus present here the first application of the MEM to observational data. Initially, it appears to indicate a null detection of IA when some level of selection response is accounted for. However, as discussed in Section 2, there remain potential complications which must be considered before a final measurement is presented. In the figures of the following subsections (Figures 3 and 4), as each correction to the signal is applied, the data points and error bars from the previous figure will be represented by navy squares and the corrected signal as yellow circles.
Figure 2. Measurement corrected for selection response: We compare the signal in the cases where selection response is included (yellow circles) and excluded (blue squares). Inclusion of the selection response lower bound gives a measurement which is consistent with zero across all angular separation bins. However, failing to properly account for the selection response leads to a potential false detection across a majority of bins, as shown by the blue data points. Here, and in all following plots, error bars represent $1\sigma$ (68%) confidence intervals, but are likely underestimated due to the expectation that the true selection response is larger than the value used here. A small offset has been applied on the horizontal axis to improve visual clarity.

Galaxy weights and differences in effective source redshift distribution
In addition to the specific case of the MCAL selection response, more general selection and weighting differences between the MCAL and IM3 estimators must be considered. Although using the matched catalogue in the case of IM3 and MCAL tangential shear measurements guarantees we use the same literal galaxies for both, the effective contribution to shear from each of these galaxies is different for each estimator. As detailed in Hoyle et al. (2018), for the DES Y1 case, the effective redshift distribution (which governs the true expected tangential shear) from IM3 must account for per-galaxy explicit weights as well as effective weighting by $(1 + m_j)$. For MCAL, an effective weighting with respect to the per-galaxy response value is required. Note that the selection response does not factor into this effective weighting, as it is a catalogue level value rather than per-galaxy.
One might naively imagine re-weighting the version of the matched catalogue for one estimator by the per-galaxy effective weights of the other estimator, to achieve a unified weighting scheme. However, this is generally ill-advised due to correlations between per-galaxy weights and ellipticities even across different estimators, which could induce a severe bias to our measurement. Using the two effective weighted redshift distributions, we compute theoretical tangential shears, as seen in equation 10 (making use of the Core Cosmology Library, henceforth referred to as CCL; Chisari et al. 2019), and take their difference to estimate the potential lensing residual, which we will call $\bar{\gamma}_{\rm L,PA}$. Provided we trust our theoretical prediction is sufficiently accurate, we can then subtract this from our measured signal to correct for the difference in weighting schemes between the estimators.
In order to ensure we trust our theoretical lensing residual, we consider two primary effects which may affect the accuracy of our predicted $\bar{\gamma}_{\rm L,PA}$. Firstly, the fact that the true cosmology is unknown (which also impacts the estimate of galaxy bias used in our modelling) and secondly, un-modelled photometric redshift error.
Figure 3. The measured signal minus the weight-induced residual is shown by the yellow circles. Green triangles show the theoretical prediction for this residual, while the blue squares show the signal corrected for the selection response (yellow circles in previous figure), but not for the difference in weights. We see that while the weight induced residual is significant enough to noticeably shift the data points, we still see a null detection in most bins. A horizontal offset is applied to the data points for visual clarity.
To account for uncertainty in the true cosmology, we compute $\gamma_{L,PA}$ for six ΛCDM cosmologies (parameter sets 3-8 of Table II of Abbott et al. (2018)) and the $1\sigma$ upper and lower limits on galaxy bias for our sample (estimated from the measurements in Elvin-Poole et al. (2018) to be $b = 1.68 \pm 0.13$). The difference between the smallest and largest lensing residuals for different cosmologies, also incorporating the $1\sigma$ limits on galaxy bias, then gives us an estimate of the uncertainty arising from the true cosmology not being precisely known. To simulate un-modelled photometric error, we convolve the weighted source redshift distributions with a Gaussian photometric error model defined by $\sigma_z = 0.1(1 + z)$ (which we select to be representative of a worst-case un-modelled uncertainty in a Stage III dataset). Taking the DES 3x2pt best-fit parameters to be our fiducial cosmology, the difference between the $\gamma_{L,PA}$ residuals including and excluding additional photometric error gives an estimate of the uncertainty due to a potential un-modelled photometric error in the data.
We then add these two uncertainties in quadrature to estimate the overall uncertainty on the residual. Figure 3 shows the resulting weight-induced lensing residual, $\gamma_{L,PA}$, alongside the measured signal with and without $\gamma_{L,PA}$ subtracted. Error bars on the corrected measured signal have been adjusted to also include the uncertainty on $\gamma_{L,PA}$. Given that the lensing signal is well understood at these scales, it is possible to model and correct for this residual as we have done here. However, in an ideal scenario, a matched weighting scheme would be constructed for both estimators from the outset to avoid this issue entirely.
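The uncertainty estimate described above can be sketched in a few lines. This is a minimal illustration, assuming the residual has already been computed per separation bin for each cosmology and galaxy bias variation; the function name, argument names and the numbers in the example call are hypothetical and are not taken from the analysis.

```python
import math


def residual_uncertainty(residuals_by_model, fiducial, fiducial_with_photoz_error):
    """Per-bin uncertainty on the weight-induced lensing residual.

    residuals_by_model: list of per-bin residuals, one list per cosmology
        or galaxy bias variation (hypothetical inputs).
    fiducial: per-bin residual for the fiducial cosmology.
    fiducial_with_photoz_error: the same residual after smearing the source
        redshift distribution with the extra photometric error model.
    """
    total = []
    for i in range(len(fiducial)):
        values = [residual[i] for residual in residuals_by_model]
        # Spread between the largest and smallest residual across models.
        sigma_cosmology = max(values) - min(values)
        # Shift caused by the additional, un-modelled photo-z error.
        sigma_photoz = abs(fiducial_with_photoz_error[i] - fiducial[i])
        # Combine the two contributions in quadrature.
        total.append(math.hypot(sigma_cosmology, sigma_photoz))
    return total


# Toy numbers only, chosen so the arithmetic is easy to follow.
example = residual_uncertainty(
    residuals_by_model=[[1.0, 2.0], [4.0, 2.0]],
    fiducial=[2.0, 2.0],
    fiducial_with_photoz_error=[6.0, 2.0],
)
```

In the toy call, the first bin has a spread of 3 across models and a photo-z shift of 4, so the quadrature sum is 5; the second bin has no spread and no shift.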

Multiplicative bias residuals
The final potential contaminant we must address is the multiplicative bias induced lensing residual (detailed in Section 2.2). For DES Y1, IM3 has a multiplicative bias uncertainty of $\sigma_{\rm im3} = 0.025$ while MCAL has $\sigma_{\rm mcal} = 0.013$. Since we cannot determine the level of multiplicative bias residual in the estimators to greater precision than these associated uncertainties, we choose to treat it as a systematic uncertainty in the MEM and combine it with the estimated covariance matrix. We investigate the significance of the $(m_{\rm im3} - m_{\rm mcal})\gamma_{IA}$ term in equation 9 and find it to be subdominant compared to the lensing residual term. Since the other complications discussed in this section have been addressed, the measured signal can therefore be expressed as

$$\Delta\gamma_t = (1 - a)\,\gamma_{IA} + \delta m\,\gamma_{L,PA}, \tag{18}$$

where $\Delta\gamma_t$ represents the difference in tangential shear between the two shear estimators, which is our measured signal, and $\delta m = (m_{\rm im3} - m_{\rm mcal})$. If we consider the estimated covariance computed from the jackknife method to be the covariance on $\Delta\gamma_t$, then the covariance for the quantity of interest, $(1 - a)\gamma_{IA} = \Delta\gamma_t - \delta m\,\gamma_{L,PA}$, is given by

$$\mathrm{Cov}\left[(1 - a)\gamma_{IA}\right]_{ij} = \mathrm{Cov}\left[\Delta\gamma_t\right]_{ij} + \mathrm{Cov}\left[\delta m\,\gamma_{L,PA}\right]_{ij} - \mathrm{Cov}\left[\Delta\gamma_t,\, \delta m\,\gamma_{L,PA}\right]_{ij} - \mathrm{Cov}\left[\delta m\,\gamma_{L,PA},\, \Delta\gamma_t\right]_{ij}, \tag{19}$$

where $i$ and $j$ denote individual $\theta$ bins. To find the lensing residual covariance, we construct a Gaussian distribution for $\delta m$ with a mean $\mu = 0$ and standard deviation $\sigma = \sqrt{\sigma_{\rm im3}^2 + \sigma_{\rm mcal}^2}$. Using $N = 1000$ random draws from this distribution and multiplying them by our forecast lensing signal normalised over the boost and F, we calculate the lensing residual covariance using the re-sampling formula,

$$\mathrm{Cov}_{ij} = \frac{1}{N - 1} \sum_{k=1}^{N} \left(x_k - \bar{x}\right)_i \left(x_k - \bar{x}\right)_j^{T}, \tag{20}$$

where $x_k$ represent individual samples, $\bar{x}$ is the mean value of all samples, $i$ and $j$ again represent $\theta$ bins, and $T$ denotes the transpose. We estimate the cross-covariance between the lensing residual and the measured signal in a similar fashion, by drawing 1000 samples from a multivariate Gaussian defined by the measured signal and its jackknife covariance, then using equation 20 to determine the two cross-covariance terms.

Figure 4. The pink shaded region represents the contribution to the $1\sigma$ uncertainty from multiplicative bias uncertainty, while the blue represents the statistical uncertainty obtained from the jackknife method and the uncertainty on the weight-induced residual correction. Finally, the orange shows the total $1\sigma$ uncertainty from the combination of both the pink and blue regions. We see that the majority of the known uncertainty in this application of the MEM is statistical.
At this point we have, as far as feasibly possible within the scope of this work, corrected or accounted for all the aforementioned complications to the signal. We thus relabel our measurement as the IA component of the MEM, $(1 - a)\gamma_{IA}$. Figure 4 shows our MEM IA measurement data points along with coloured regions representing the various contributions to uncertainty. We see that while residual multiplicative bias does have some noticeable effect on our overall uncertainty, the majority still arises from statistical uncertainty.

Summary of findings from DES Y1 case study
To better contextualise our findings, we compute a theoretical IA signal using CCL and the DES Y1 GGL and clustering best-fit nonlinear alignment (NLA) model amplitude, $A_{IA} = 0.38$, found in Table 5 of Samuroff et al. (2019). Figure 5 shows our final MEM measurement alongside the predicted IA signal for four different hypothetical $a$ values. Given the order of the estimators in our measurement ($\gamma_t^{\rm IM3} - \gamma_t^{\rm MCAL}$), a positive $(1 - a)$ value implies IM3 is more sensitive to the outer radial regions of galaxies (because the IA signal is negative in terms of tangential shear), while a negative value, $-(1 - a)$, implies the opposite. A $\chi^2$ test shows the strongest agreement with the $-(1 - 0.6)\gamma_{IA}$ model. We do not consider $a$ values below 0.6, as this is at the limits of what was seen in Singh & Mandelbaum (2016), although from Figure 5 we can expect that lower $a$ values in the positive model would give better $\chi^2$ agreement. However, we emphasise again that the error bars are likely underestimated and, furthermore, that the true selection response is expected to shift the data points closer to zero. We therefore refrain from making any definitive comments on the value of $a$ and the relative radial weightings of the two estimators, and carry out these comparisons simply as an exploratory exercise.
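The model comparison described above can be illustrated with a short sketch. It assumes independent per-bin errors for simplicity, which may differ from the treatment used for Figure 5, and all input values are toy numbers; the function names are hypothetical.

```python
def chi_squared(data, model, sigma):
    """Chi-squared for independent per-bin errors (a simplifying assumption)."""
    return sum(((d - m) / s) ** 2 for d, m, s in zip(data, model, sigma))


def compare_offset_models(data, sigma, gamma_ia, a_values):
    """Evaluate +(1 - a) and -(1 - a) scalings of a theoretical IA signal."""
    results = {}
    for a in a_values:
        for sign in (+1, -1):
            model = [sign * (1.0 - a) * g for g in gamma_ia]
            results[(sign, a)] = chi_squared(data, model, sigma)
    # Key of the model with the smallest chi-squared.
    best = min(results, key=results.get)
    return best, results


# Toy inputs: a negative IA tangential shear and data built from the
# -(1 - 0.6) model, so that model should be preferred.
gamma_ia = [-1.0, -0.5]
data = [0.4, 0.2]
sigma = [0.1, 0.1]
best, results = compare_offset_models(data, sigma, gamma_ia, [0.6, 0.7, 0.8, 0.9])
```

With correlated bins, the sum over bins would be replaced by a quadratic form with the inverse covariance matrix.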
While here we have not been able to detect IA using the MEM, we have made important progress by demonstrating its feasibility for use with observational data.We do not attempt to place any constraints on IA models with this measurement, due to the large uncertainties on our signal, compounded by the expectation that these are underestimated.
To summarise, we will briefly outline the key lessons from this first application of the MEM to observational data, to take forward when considering application of the MEM to Stage IV surveys (the focus of Section 4 and the remainder of this paper).
• The matched catalogue must be constructed in a way that ensures selection bias can be minimised and / or well-characterised: Using similar shear estimators could be useful; for example, a modified MCAL estimator with a different radial weighting, used in conjunction with the standard MCAL estimator.
• Establishing a shared weighting scheme is highly beneficial to ensure differences in weighting (explicit or effective) do not manifest as differences in the lensing shear: While differences in weighting can be treated provided we trust our modelling of the lensing signal, future applications should make it a priority to construct a matched weighting scheme for both estimators a priori to avoid this process entirely. Alternatively, another approach could be to forgo weighting galaxies altogether if signal-to-noise allows (see e.g. Zhang et al. 2023).
• Multiplicative bias uncertainty must be demonstrably subdominant or accounted for within the overall measurement uncertainty: Future applications should select estimators with the lowest levels of calibration uncertainty possible.The uncertainty should also be accounted for in conjunction with the measurement covariance using the formalism introduced in Section 3.2.3.

FORECASTING FOR STAGE IV SURVEYS
Given our findings from the DES Y1 case study, we now present a forecast for the performance of the MEM with synthetic data sets representative of a Stage IV lensing survey.Specifically, we consider LSST Y1 and Y10.For comparison, the relative survey specifications for DES and LSST are given in Table 2. Using this forecast, we will seek primarily to address the issue of residual multiplicative bias as done in Section 3.2.3, by removing the assumption of sub-dominance and instead accounting for it within the measurement uncertainty.This will then allow us to place requirements on shear estimators in two cases: detection of IA with the MEM, and constraint of the IA scale dependence with the MEM.We choose to focus here on multiplicative bias because, as established in Section 3, issues with selection bias and differences of effective weighting schemes are more readily overcome, while multiplicative bias uncertainty must be included as a systematic of the MEM.
For a pair of hypothetical shear estimators, we will forecast MEM performance with respect to their amplitude offset parameter, $a$, and the Pearson correlation coefficient (Benesty et al. 2009) of their shape noise, $\rho$. It is important to note here that we do not seek to place strict limits or requirements on observational choices or shape estimators. Such limits would be specific to the analysis choices made here and the survey in question. Instead, we aim to provide more general guidelines and targets for the development of bespoke shape estimators to be used with the MEM.

Redshift distributions
In order to carry out the forecasting, we assume the prescriptions for lens and source galaxy samples given in the LSST DESC Science Requirements Document v1 (LSST DESC 2018; The LSST Dark Energy Science Collaboration et al. 2018).Where sample-dependent parameters are referenced, unless otherwise stated, it can be assumed the associated values are taken from LSST DESC 2018.
To address the issue of galaxy size discussed in Section 2.4, we impose a strict effective size cut, requiring an effective size of at least 3. Relating this to specific radial weightings and values of $(1 - a)$ would require analysis of galaxy images, which is beyond the scope of this work and which we defer to a future analysis. However, we anticipate that such a cut would likely be sufficient and potentially even excessive. Determining exact values for a given pair of estimators will be the subject of future analysis.
Given this cut, we re-compute the LSST DESC 2018 redshift distributions using WeakLensingDeblending (Sanchez et al. 2021) simulated galaxy catalogs for Y1 and Y10, where Y1 is defined as being 10% of the total 10-year exposure time. Figure 6 shows the raw distributions and best fit to the LSST DESC 2018 parametric distribution given by

$$\frac{dN}{dz} \propto z^2 \exp\left[-\left(\frac{z}{z_0}\right)^{\alpha}\right], \tag{21}$$

with constants $z_0$ and $\alpha$ defined in LSST DESC 2018 for the Y1 and Y10 lens galaxy samples, and in Figure 6 for the Y1 and Y10 source galaxy samples used here. Compared to the values in Table 2, the larger drop in $n_{\rm eff}$ for Y10 is due to the larger fraction of galaxies removed by the effective size cut in Y10, since Y1 will likely image the biggest and brightest galaxies first, with smaller, fainter ones resolved over the next 9 years.

Table 2. $f_{\rm sky}$ is the fraction of the sky that was or will be covered by the survey. We can see LSST promises a significant increase in statistical power over DES Y1. To ensure this greater sample size is utilised effectively, shear estimators will need to demonstrate significantly lower $1\sigma$ limits on multiplicative bias uncertainty (the quantity labelled 'max' in the table). The MEM will also benefit from these decreased levels of multiplicative bias uncertainty.
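As a concrete illustration of the parametric redshift distribution, the sketch below evaluates and normalises the form $dN/dz \propto z^2 \exp[-(z/z_0)^{\alpha}]$ on a grid. The values of `z0` and `alpha` are placeholders chosen for illustration, not the fitted values from Figure 6 or LSST DESC 2018, and the function names are hypothetical.

```python
import math


def dndz_unnormalised(z, z0, alpha):
    """Parametric form z^2 * exp(-(z / z0)^alpha), without normalisation."""
    return z ** 2 * math.exp(-((z / z0) ** alpha))


def normalised_dndz(z0, alpha, z_min=0.0, z_max=4.0, n_points=4001):
    """Evaluate the distribution on a uniform grid and normalise it to unit area."""
    dz = (z_max - z_min) / (n_points - 1)
    zs = [z_min + i * dz for i in range(n_points)]
    values = [dndz_unnormalised(z, z0, alpha) for z in zs]
    # Trapezoidal rule for the normalisation integral.
    norm = dz * (sum(values) - 0.5 * (values[0] + values[-1]))
    return zs, [v / norm for v in values]


# Placeholder parameters, for illustration only.
zs, nz = normalised_dndz(z0=0.2, alpha=0.9)
z_peak = zs[nz.index(max(nz))]
```

Setting the derivative of the parametric form to zero gives a peak at $z = z_0 (2/\alpha)^{1/\alpha}$, which provides a quick consistency check on any fitted pair of constants.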
We use only one source and one lens tomographic bin, as was done in the DES Y1 application. For lenses, we use a narrower bin compared to the DES Y1 case study, in the range $1.0 \leq z_l \leq 1.2$, which represents the highest redshift lens bin defined in LSST DESC 2018 for Y1. We choose to use this bin so as to consider a hypothetical scenario where we are attempting to probe IA in a narrow slice of redshift space. Choosing the highest redshift bin also results in a lower number of sources far behind the lenses, which will contribute to lensing residuals but not the IA signal. For the source bin, we use the full redshift range $0.05 \leq z_s \leq 3.5$. Placing some additional limit on the maximum source redshift would also prove beneficial to reducing lensing residuals; however, it would be detrimental to the statistical uncertainty on our measurements, which we expect to be a crucial limiting factor in our error budget, especially given the findings of Section 3. We also account for photo-z uncertainty where appropriate, by convolving equation 21 with a Gaussian uncertainty model from LSST DESC 2018,

$$p(z_{ph} \,|\, z_s) = \frac{1}{\sqrt{2\pi}\,\sigma_z} \exp\left[-\frac{(z_{ph} - z_s)^2}{2\sigma_z^2}\right], \tag{22}$$

where $z_s$ and $z_{ph}$ represent spectroscopic and photometric redshift respectively and $\sigma_z$ is defined as $0.05(1 + z_s)$ for sources and $0.03(1 + z_s)$ for lenses.
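The photo-z smearing described above can be sketched numerically. This is an illustrative implementation, assuming a Gaussian error model with width $\sigma_z = \sigma_0 (1 + z_s)$ and a simple top-hat true distribution standing in for the lens bin; the grid, the top-hat shape and the function names are assumptions made for the example.

```python
import math


def photoz_kernel(z_ph, z_s, sigma0):
    """Gaussian probability of observing z_ph given true redshift z_s."""
    sigma_z = sigma0 * (1.0 + z_s)
    norm = math.sqrt(2.0 * math.pi) * sigma_z
    return math.exp(-0.5 * ((z_ph - z_s) / sigma_z) ** 2) / norm


def smear_redshift_distribution(zs, nz, sigma0):
    """Integrate the true distribution against the photo-z kernel on a uniform grid."""
    dz = zs[1] - zs[0]
    return [
        dz * sum(n * photoz_kernel(z_ph, z_s, sigma0)
                 for z_s, n in zip(zs, nz) if n > 0.0)
        for z_ph in zs
    ]


# Toy lens bin: a normalised top-hat between z = 1.0 and z = 1.2.
dz = 0.01
zs = [i * dz for i in range(301)]
top_hat = [1.0 if 100 <= i <= 120 else 0.0 for i in range(301)]
norm = dz * sum(top_hat)
lens_true = [v / norm for v in top_hat]
lens_photo = smear_redshift_distribution(zs, lens_true, sigma0=0.03)
```

The smeared distribution keeps unit area but leaks outside the nominal bin edges, which is how photometric scatter broadens the effective lens and source samples.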
It is important to mention that including the full source sample behind the lenses has the potential to increase the magnitude of the lensing signal and thus any lensing residuals. In this work, we will only consider the full sample, to try to obtain limits on the acceptable values of $a$ and $\rho$ in the instance where statistical uncertainty requires that we use the full sample. However, in a real analysis, some benefit could be gained from placing a more restrictive upper limit on the source redshift bin, to reduce the number of sources behind the lenses which contribute to the lensing shear but not the IA shear. The exact limit will depend on the survey in question, as a balance would need to be struck between retaining an acceptable signal-to-noise ratio for the overall shear and minimising the lensing residual.

Halo Occupation Distributions
To calculate theoretical tangential shears for lensing and IA, as well as the boost, we first require power spectra for the quantities of interest. To obtain predictions within the 1-halo regime, where we are most interested in using the MEM, we use the halo model formalism (Seljak 2000; Peacock & Smith 2000; Cooray & Sheth 2002),

$$P_{uv}(k) = \int dM\, n(M)\, \langle u(k|M)\, v(k|M) \rangle + P_{\rm lin}(k) \left[\int dM\, n(M)\, b(M)\, u(k|M)\right] \left[\int dM\, n(M)\, b(M)\, v(k|M)\right]. \tag{23}$$

Here, $k$ is the wavenumber, $M$ is halo mass, $P_{\rm lin}(k)$ is the linear power spectrum, $n(M)$ is the number of halos of mass $M$, $b(M)$ is the halo bias as a function of halo mass (Tinker et al. 2010), and $u(k|M)$ and $v(k|M)$ are the Fourier space halo profiles of the tracers being correlated. The first and second terms of equation 23 represent the 1-halo and 2-halo contributions respectively.
In the context of this work, the profiles needed are: lens galaxy density, given by the halo occupation distribution (HOD) of Nicola et al. (2020); source galaxy density, given by the HOD of Zu & Mandelbaum (2015); matter density, given by the Navarro-Frenk-White profile (Navarro et al. 1996a; Navarro et al. 1997); and a satellite shear HOD for intrinsic alignment in the 1-halo regime (Schneider & Bridle 2010; Fortuna et al. 2021). The lens HOD was chosen as it is based upon Hyper Suprime-Cam data and is therefore able to model a sample with a number density somewhat representative of deep LSST observations. Similarly, the source HOD can be modified using the source number density for our LSST-like sample. The satellite shear HOD is capable of modelling 1-halo IA effects, making it an ideal candidate for constructing a theoretical IA signal in this context. In Appendix B, we verify that our implementations of the 2-point cumulants in equation 23 are correct.
From the halo model and the above HODs, we obtain the 1-halo and 2-halo terms for lensing shear and the boost. For IA, we only obtain the 1-halo term from the halo model; the 2-halo IA term instead comes from an NLA model (Bridle & King 2007). We note that this does mean there is some discrepancy between the modelling of our lensing and IA signals. In an ideal scenario, the full IA signal would have been modelled using the halo model. However, as discussed in Appendix B of Fortuna et al. (2021), the IA halo model is currently unable to accurately describe the transition scales, particularly in GGL, and determining the correct description is a significant study in itself, which is beyond the scope of this work. We therefore follow their approach by using the NLA model to compensate for the lack of power that results from this incomplete description. We do not attempt a similar treatment for the lensing signal (which may also lack power in these transition scales, as evidenced by Mead & Verde 2021; Mahony et al. 2022), as we do not expect it to meaningfully impact our forecasting results. Having introduced our fundamental modelling choices, we will now go on to describe the modelling procedure in greater detail.

Lensing and IA shears
For on-sky separation binning, we consider seven log-spaced bins in the projected separation range $0.1 \leq r_p \leq 10$ Mpc/h. Unlike in DES Y1, here we choose to consider $r_p$ instead of angular separation $\theta$, for easier comparison of this work to L2018 and Blazek et al. (2012). This choice means we are accessing the 1-halo regime, where we are most interested in using the MEM to study IA scale dependence.
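For reference, the binning can be reproduced with a short sketch. It assumes the seven bins span 0.1 to 10 Mpc/h with edges equally spaced in $\log_{10} r_p$; the use of geometric means as representative bin centres is an assumption of this example, and the function names are hypothetical.

```python
import math


def log_bin_edges(r_min, r_max, n_bins):
    """Edges of n_bins bins equally spaced in log10 between r_min and r_max."""
    log_min = math.log10(r_min)
    step = (math.log10(r_max) - log_min) / n_bins
    return [10.0 ** (log_min + i * step) for i in range(n_bins + 1)]


def log_bin_centres(edges):
    """Geometric mean of each pair of adjacent edges (an assumed convention)."""
    return [math.sqrt(lo * hi) for lo, hi in zip(edges[:-1], edges[1:])]


edges = log_bin_edges(0.1, 10.0, 7)
centres = log_bin_centres(edges)
```

Each bin is wider than the previous one by a constant factor of $100^{1/7}$, so the innermost bins, where the 1-halo term dominates, are finely sampled.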
As in Section 3 above, the theoretical lensing shear is obtained using CCL (Chisari et al. 2019), but this time using the halo model to compute both the 1-halo and 2-halo contributions to the galaxy-matter power spectrum, $P_{gM}(k)$. The power spectrum can then be used to obtain the angular power spectrum using the Limber approximation,

$$C_{gM}(\ell) = \int d\chi\, \frac{W_g(\chi)\, W_M(\chi)}{\chi^2}\, P_{gM}\left(k = \frac{\ell + 1/2}{\chi},\, z(\chi)\right),$$

where $\ell$ is the angular multipole at which the spectrum is defined, $\chi$ is comoving line-of-sight distance (which is a function of redshift), and g and M refer to galaxies and matter respectively. $W_g(\chi)$ is the galaxy density function given by

$$W_g(\chi) = \frac{dN_l}{dz}\, \frac{dz}{d\chi}.$$

For the purpose of this theoretical modelling, we will redefine the lensing efficiency function (previously given by equation 11) as $W_M(\chi)$,

$$W_M(\chi) = \frac{3 H_0^2\, \Omega_M}{2 c^2}\, \frac{\chi}{a(\chi)} \int_{\chi}^{\infty} d\chi'\, W_s(\chi')\, \frac{\chi' - \chi}{\chi'},$$

where $W_s$ is the analogous density function for the source sample. Here, $\Omega_M$ is the matter fraction of the Universe, $H_0$ is the Hubble constant, $a(\chi)$ is the scale factor and $c$ is the speed of light. Note that we take the fiducial cosmology to be the same as LSST DESC 2018 and assume a flat Universe ($\Omega_k = 0$). Assuming B-modes are zero, we can then find the projected correlation function in real space (which is analogous to the tangential shear in this context) using

$$\xi_{gM}(\theta) = \sum_{\ell} \frac{2\ell + 1}{4\pi}\, C_{gM}(\ell)\, d^{\ell}_{0,2}(\theta),$$

where $\theta$ is the angular separation of the tracers in question and $d^{\ell}_{0,2}$ are the Wigner-d matrices for tracers with spins 0 (galaxies) and 2 (shear). We solve this equation via a brute force sum to avoid instabilities that arise when considering the high $\ell$ values required to probe the 1-halo regime.
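A brute-force evaluation of the spin-0 by spin-2 harmonic sum can be sketched as follows. This is not the CCL implementation; it is a standalone illustration that assumes the convention $\xi(\theta) = \sum_\ell \frac{2\ell+1}{4\pi} C_\ell\, d^{\ell}_{2,0}(\theta)$, uses the identity $d^{\ell}_{2,0}(\theta) = \sqrt{(\ell-2)!/(\ell+2)!}\, P_\ell^2(\cos\theta)$, and builds the associated Legendre functions by upward recurrence in $\ell$. The function names are hypothetical.

```python
import math


def wigner_d_l20(ell_max, theta):
    """Return d^ell_{2,0}(theta) for ell = 0..ell_max (zero for ell < 2)."""
    x = math.cos(theta)
    d = [0.0] * (ell_max + 1)
    if ell_max < 2:
        return d
    p_prev = 0.0                    # P_1^2(x) vanishes identically
    p_curr = 3.0 * (1.0 - x * x)    # P_2^2(x)
    for ell in range(2, ell_max + 1):
        # sqrt((ell + 2)! / (ell - 2)!) written as a product of four terms.
        norm = math.sqrt((ell - 1) * ell * (ell + 1) * (ell + 2))
        d[ell] = p_curr / norm
        # (ell - 1) P_{ell+1}^2 = (2 ell + 1) x P_ell^2 - (ell + 2) P_{ell-1}^2
        p_next = ((2 * ell + 1) * x * p_curr - (ell + 2) * p_prev) / (ell - 1)
        p_prev, p_curr = p_curr, p_next
    return d


def real_space_from_cl(cl, theta):
    """Direct sum over multipoles; cl[ell] must cover ell = 0..ell_max."""
    d = wigner_d_l20(len(cl) - 1, theta)
    return sum(
        (2 * ell + 1) / (4.0 * math.pi) * cl[ell] * d[ell]
        for ell in range(2, len(cl))
    )


theta = 0.3
d_values = wigner_d_l20(3, theta)
single_mode = real_space_from_cl([0.0, 0.0, 1.0], theta)
```

The cost of the direct sum scales linearly with the maximum multipole per angle, which is affordable for a handful of separation bins even when very high $\ell$ is needed.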
To model the theoretical IA signal, we draw on several areas of the literature. We compute the lens position and 1-halo satellite alignment angular cross spectrum, $C^{1h}_{gI}(\ell)$, from the prescription of Fortuna et al. (2021) (hereafter F2021). To account for the different red and blue galaxy alignment amplitudes, we take a weighted average of the 1-halo amplitudes for red and blue galaxies given in F2021.
For LSST Y1, we use an approximate red fraction of $f_{\rm red} = 0.10$, which decreases to $f_{\rm red} = 0.05$ for Y10. We note that these values are not rigorously determined, but rather qualitative estimates. We vary the value of $f_{\rm red}$ within a reasonable range of $0.00 \leq f_{\rm red} \leq 0.30$ and find a negligible effect on the 1-halo signal, due to the small difference between the red and blue galaxy 1-halo amplitudes found in F2021. We also take the projected separation scale dependence parameter, $b = -2$, from F2021.

Table 3. Key parameters and their values used in our IA modelling. For the 1-halo term, $f_{\rm red}$ values of 0.10 and 0.05 were used for LSST Y1 and Y10 respectively. 1-halo values are taken from F2021, while the NLA 2-halo values are from Secco et al. (2022).
A more significant effect could arise from differences in the luminosity functions of F2021 and LSST. Here, a detailed analysis of these luminosity functions is not appropriate, so we instead determine lower limits on the average alignment amplitude, below which the maximum lensing residual dominates the 1-halo IA signal. These are $a_{1h} > 1.00 \times 10^{-4}$ and $a_{1h} > 5.00 \times 10^{-5}$ for Y1 and Y10 respectively.
We also compute a second angular cross spectrum for the 2-halo regime, $C^{\rm NLA}_{gI}(\ell)$, using a redshift-dependent NLA model (Hirata et al. 2007; Bridle & King 2007) following equation 24 of Secco et al. (2022), with best-fit parameters taken from row three of Table III of Secco et al. (2022). Lens bias is also accounted for using the prescription given in LSST DESC 2018.
It is important to note that our choice of fiducial IA model may not represent what is eventually seen in LSST data, due to the increased depth of LSST compared to the samples used in F2021 and DES Y3.However, it is nonetheless necessary for us to choose some fiducial IA model in the absence of any measurements truly representative of LSST.We refer the reader to Krause et al. (2016) for discussion on the complexities of forecasting the IA contamination to stage IV surveys.We caution that the findings of this study may not apply if the true signal is found to be vastly different to the models we have used here, but ultimately it is necessary for us to make some modelling choices.We defer a detailed comparison of different IA models to future work.
Having obtained the 1-halo and 2-halo spectra, we then combine them, truncating each via a window function to avoid double counting in the 1-halo to 2-halo transition, with $\ell_{1h} = 1.4 \times 10^4$ and $\ell_{2h} = 3 \times 10^4$ chosen to give as smooth a transition as possible, and subscript I denoting an intrinsic shape tracer. Table 3 contains a summary of the key IA model parameters used in this work. We then use this angular power spectrum to compute the IA tangential shear for our choice of projected and tomographic bins. Figure 7 shows the various tangential shears discussed here. All quantities shown have been normalised by the estimated number of physically associated pairs, which is why we see a flattening of the lensing signal on the scales where the boost factor strongly dominates and the contribution from $F$ is negligible, as the boost factor has a similar scale dependence to the lensing signal.
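To make the truncation concrete, one possible form of the combined spectrum is written out below. The Gaussian-type windows are an assumption, modelled on truncations commonly applied to halo-model power spectra, and are not necessarily the window functions used in this work; only the scales $\ell_{1h}$ and $\ell_{2h}$ are taken from the text.

```latex
% Illustrative sketch only: the functional form of the windows is assumed.
C_{gI}(\ell) \simeq C^{1h}_{gI}(\ell)\left[1 - e^{-(\ell/\ell_{1h})^{2}}\right]
             + C^{\rm NLA}_{gI}(\ell)\, e^{-(\ell/\ell_{2h})^{2}},
\qquad \ell_{1h} = 1.4\times10^{4}, \quad \ell_{2h} = 3\times10^{4}.
```

With windows of this kind, the 1-halo term is suppressed at low $\ell$ and the NLA term at high $\ell$, so each contribution dominates only in the regime where it is expected to be reliable.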

Galaxy weights and a new definition for F
In keeping with LSST DESC 2018, we adopt a simple weighting scheme for all source galaxies given by

$$w_i = \frac{1}{\sigma_\gamma^2 + \left(\sigma_e^i\right)^2},$$

with intrinsic shape noise $\sigma_\gamma = 0.12$ and per-component error on the measured ellipticity of a given galaxy $i$, $\sigma_e^i = 0.26$, such that all galaxies are weighted equally. We also adopt a new definition for $F$ (Blazek et al. 2015; Safari et al. 2023), by redefining the maximum line-of-sight separation at which we expect galaxies to be physically associated, and therefore contributing to the average IA signal, in a particular projected separation bin. This definition of $F$ extends the previous one to remove the contribution of pairs that are close in projected separation, but far apart in line-of-sight separation. The lower limit of 2 Mpc/h ensures that all galaxies within the same halo are included, regardless of how small their projected separation may be. We note that if this definition is used with large projected separations, some upper limit on the line-of-sight separation may be required to avoid artificially diluting the signal. However, this will not pose a problem in our analysis, as we do not consider $r_p$ beyond 10 Mpc/h. More detail on the calculation of $F$ and the boost factors is given in Appendix C.

Residual multiplicative bias in LSST-like estimators
Because we are not looking to cross correlate different source tomographic bins, we assume a constant multiplicative bias across all redshifts.In LSST DESC 2018, the allowed levels of multiplicative bias uncertainty given are ±0.013 and ±0.003 for Y1 and Y10 respectively.

Forecasting procedure
We use the TJPCov package to estimate the statistical covariance on our forecasts, and validate our approach against the LSST DESC 2018 forecast covariance. It is important to note that inclusion of a residual multiplicative bias alters the covariance expression given in L2018; we address this in Appendix D, but find it amounts to only percent-level corrections to the original expression, and therefore compute the statistical covariance in the same way as L2018.
The maximum allowed multiplicative bias calibration uncertainties from LSST DESC 2018 are 0.013 and 0.003 for Y1 and Y10 respectively. Using the same process detailed in Section 3.2.3, we estimate the IA-only covariance by re-sampling from a Gaussian distribution of multiplicative bias uncertainty values. In this case, to estimate $\Delta\gamma_t$ for the cross-covariance terms, we simply add our fiducial IA signal and the mean lensing residual and use the TJPCov statistical covariance to define the multivariate Gaussian. We again estimate the IA component of the residual multiplicative bias (the term proportional to $\gamma_{IA}$), but find it is only significant in the regime where the lensing residual would dominate the IA signal, a regime that any future application should (by the nature of the method) seek to avoid. We therefore assume this term to be negligible.
In this forecasting scenario, we are able to isolate the IA component and measure its signal-to-noise, and so for the following results we focus on this to better understand the region of the $a$-$\rho$ parameter space we should ideally target. This isolation is not possible in an observational scenario. However, because the mean of the residual multiplicative bias distribution should be zero, in most cases the sample mean of the lensing residuals is much smaller than the IA signal, and equation 18 is dominated by the IA component. Therefore, as shown in Section 3.2.3, following this procedure is still appropriate in an observational context, as it widens the uncertainty on the signal to account for residual bias larger than the expected value. Even if the measured signal were to include a significant contribution from a lensing residual, this would be captured by a large cross-covariance between the residual and the measured signal.

Forecasting results
The key variables of the method which can potentially be tuned and controlled are the amplitude offset parameter, $a$, and the shape noise correlation between the estimators, $\rho$. We therefore choose to forecast the signal-to-noise ratio (SNR) for different values of $a$ and $\rho$, to place requirements on these parameters for an IA signal to be detected in LSST data, whilst accounting for realistic levels of residual multiplicative bias.
We will consider three different definitions of the SNR to better contextualise our results: the SNR in a single $r_p$ bin, the combined SNR for all $r_p$ bins including the full covariance matrix, and the SNR for a 1-halo scale dependence parameter fit using a Markov-Chain Monte-Carlo (MCMC) method.
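The first two SNR definitions can be illustrated with a short sketch. It assumes the per-bin SNR is the signal divided by the square root of the corresponding diagonal covariance element, and the combined SNR is $\sqrt{d^{T} C^{-1} d}$ for a signal vector $d$ and covariance $C$; both are standard conventions adopted here as assumptions, and the inputs are toy values. A small Gaussian elimination routine stands in for a linear algebra library.

```python
import math


def per_bin_snr(signal, cov):
    """Signal over the 1-sigma error in each bin, ignoring bin correlations."""
    return [abs(s) / math.sqrt(cov[i][i]) for i, s in enumerate(signal)]


def solve(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    a = [row[:] + [rhs[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(a[r][col]))
        a[col], a[pivot] = a[pivot], a[col]
        for r in range(col + 1, n):
            factor = a[r][col] / a[col][col]
            for c in range(col, n + 1):
                a[r][c] -= factor * a[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (a[i][n] - sum(a[i][j] * x[j] for j in range(i + 1, n))) / a[i][i]
    return x


def combined_snr(signal, cov):
    """sqrt(d^T C^-1 d), using the full covariance matrix."""
    weighted = solve(cov, signal)
    return math.sqrt(sum(s * w for s, w in zip(signal, weighted)))


# Toy examples: uncorrelated bins, then two positively correlated bins.
snr_bins = per_bin_snr([2.0, 3.0], [[4.0, 0.0], [0.0, 1.0]])
snr_uncorrelated = combined_snr([2.0, 3.0], [[4.0, 0.0], [0.0, 1.0]])
snr_correlated = combined_snr([1.0, 1.0], [[1.0, 0.5], [0.5, 1.0]])
```

For uncorrelated bins the combined SNR is the quadrature sum of the per-bin values, while off-diagonal covariance elements can raise or lower it depending on their sign relative to the signal.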

The amplitude offset parameter, 𝑎
In order to better contextualise our results, here we will briefly discuss $a$, its range of expected values from observational results, and potential complications that may arise when attempting to determine its value.
In an observational context, without prior knowledge of the radial sensitivity of the two estimators, it is difficult to determine precisely the value of $a$. It could be possible to gain further information on $a$ through the inclusion of a third shear estimator, as this would allow for the ratio of different combinations of shear estimators to be compared. However, it is likely that the additional computational cost and necessary preparation for including a third shear estimator would be prohibitive, and therefore we do not carry out any investigation into this. Regardless, even if $a$ cannot be estimated, a constraint on the scale dependence alone would allow for the amplitude of an IA model to be left free and calibrated within cosmological parameter estimation pipelines. Therefore, it is not essential to the success of the MEM that $a$ be accurately estimated. Despite this, in the context of designing optimised MEM estimators, it is important to have a rough target value for $a$, to ensure constraining the scale dependence is possible. Radial sensitivity calibration for the estimators could be carried out on simulated images, where the dependence of the IA signal on galaxy scale can be precisely controlled. There is limited observational data available to model this dependence; however, Georgiou et al. (2019) found a roughly linear relationship between the IA amplitude and the radial sensitivity of their shear estimator. The difference in alignment amplitude between their smallest and largest radial weightings would correspond to $a \approx 0.3$. Singh & Mandelbaum (2016), when comparing Re-Gaussianisation, Isophotal, and de Vaucouleurs shapes, found 20% to 40% differences in alignment amplitude between various combinations of estimators, which corresponds to values of $a$ from 0.6 to 0.8, though these estimators were not optimised to result in a maximal difference in IA contamination.
It is therefore reasonable to expect an optimised estimator could achieve somewhere in the range $0.3 \leq a \leq 0.6$, though this is complicated by factors such as the variation in the distance between isophotes for different galaxy types and sizes. Such variation could mean values of $a$ differ for each galaxy within a sample and across different redshift bins (since the former properties are known to be correlated with redshift). Further investigation into this is therefore necessary in the design of optimised estimators, to ensure any variation is not large enough to seriously dilute a MEM signal.
At present, we suggest studies looking to optimise shear estimators for the MEM aim for the lowest value of $a$ possible in the design stage and, when applying these estimators to observational data, do not rely upon the assumption that the values of $a$ seen in simulations will carry through to the observational sample. A sensible first step for a successful measurement with the MEM is to seek only to constrain the scale dependence of the IA signal, which alone could provide valuable insight into the IA of galaxies.

Requirements for the detection of IA
The simplest question of whether or not we expect a detection of IA can be answered by looking at the SNR in a single $r_p$ bin. To claim a detection, we require (assuming all other systematics are controlled) that the SNR be greater than or equal to 1 for a given $r_p$ bin, else the $1\sigma$ error bars would be consistent with zero. Figure 8 shows the per-bin SNR as a function of $r_p$ for Y1 and Y10, visualised for a selection of $a$ and $\rho$ pairings which are 'borderline' in terms of detection.
From Figure 8 we can infer that, to obtain a detection of IA in all or most $r_p$ bins, a value of $a \leq 0.6$ is required. Higher shape noise correlation values would allow for a detection with slightly higher values, up to $a \leq 0.7$, but $a$ itself appears to have a greater impact on the per-bin SNR in all cases, as shown by the wider spread of points in the bottom two panels. We therefore expect from this that, given the estimators used meet this requirement, LSST Y1 levels of multiplicative bias should still allow for a $1\sigma$ detection of IA with the MEM, with Y10 allowing for potentially a $2\sigma$ or higher detection for the same estimators. It is important to note that the $a$ values considered here may be conservative. We have chosen these values to showcase clearly the boundary at which the MEM becomes unable to detect the IA signal; of course, pushing $a$ further below 0.6 would allow for an even stronger detection. For now, we expect that as long as the estimators have $a \leq 0.6$, the MEM has the potential to detect IA in LSST Y1. Such a value should be achievable given the differences in IA amplitude found in Singh & Mandelbaum (2016).

Figure 8. Values greater than 1 represent a detection of IA above statistical noise and uncertainty due to multiplicative bias, for the given $r_p$ bin in isolation. LSST Y1 is shown on the left and Y10 on the right. The top panels show different cases where $\rho$ is varied but $a$ kept fixed, while the bottom panels show the opposite. A much higher SNR is seen in the lowest projected separation bin, where the 1-halo term becomes highly dominant. Varying $a$ has a more significant effect on the SNR than varying $\rho$. However, in the lowest signal-to-noise bins $\rho$ can be the difference between a detection and a signal consistent with zero. In Y10 compared to Y1 we see a roughly factor of 2 increase in per-bin signal-to-noise.
In all cases, significantly higher signal to noise is seen in the lowest projected separation bin, as a result of the different scale dependencies of the IA and lensing signals (which propagate forward into the lensing residual and thus our covariance) inside the 1-halo regime. For example, Georgiou et al. (2019) find the 1-halo IA signal to be represented by a power-law in $r_p$ with a scale dependence index of $b = -2$ in Galaxy and Mass Assembly survey (GAMA) and KiDS data. This value is also used by F2021 in the construction of their IA halo model, and as such we have chosen to use the same value in our modelling here. On the other hand, the lensing signal scale dependence seen in similar samples used by Viola et al. (2015) and Dvornik et al. (2018) appears to follow roughly $b = -1$. It is therefore unsurprising that at projected separations of $r_p \approx 0.1$ Mpc/h we begin to see a much stronger alignment signal, resulting in a very high SNR.
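To illustrate why the two power-law indices favour the IA signal at small separations, the sketch below compares pure power laws with indices of $-2$ (IA) and $-1$ (lensing). It is a toy comparison only: normalising both signals to be equal at a pivot scale of 1 Mpc/h is an assumption made for illustration, and the function names are hypothetical.

```python
def power_law(r_p, amplitude, index):
    """Evaluate amplitude * r_p**index for a projected separation r_p > 0."""
    if r_p <= 0:
        raise ValueError("r_p must be positive")
    return amplitude * r_p ** index


def ia_to_lensing_ratio(r_p, pivot=1.0, index_ia=-2.0, index_lens=-1.0):
    """Ratio of the IA to lensing power laws, normalised to 1 at the pivot scale."""
    x = r_p / pivot
    return power_law(x, 1.0, index_ia) / power_law(x, 1.0, index_lens)


# The ratio scales as 1 / r_p, so it grows towards small separations.
ratio_small_scale = ia_to_lensing_ratio(0.1)
ratio_pivot_scale = ia_to_lensing_ratio(1.0)
```

Under these assumptions the ratio is roughly ten times larger at 0.1 Mpc/h than at the pivot scale, which is the qualitative behaviour described above.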

Full covariance signal-to-noise
The question of whether we expect a detection of IA is not the only one, however, particularly as the MEM is, by construction, unable to independently measure the amplitude of an IA signal (in isolation from precise external information on the value of $a$). We are thus motivated to consider the constraining power of the signal more broadly, with a key objective of the MEM being to place model independent constraints on the IA scale dependence. To move towards understanding this, we first consider the overall SNR, which can be obtained from the full covariance matrix in all $r_p$ bins. From the definition of this combined SNR, we can see how the covariance between $r_p$ bins affects the total SNR across all bins. As $\rho$ impacts the covariance matrix, computational limitations mean we cannot calculate a large quantity of covariance matrices to probe $\rho$ values. Instead, we calculated the covariance matrices for nine $a$ and $\rho$ values between 0.1 and 0.9, resulting in 81 combinations. We then interpolate to obtain a smooth picture of the SNR across the $a$-$\rho$ parameter space.
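As a rough illustration of how off-diagonal covariance enters a combined SNR, the sketch below uses the common definition $\sqrt{d^{T} C^{-1} d}$ for a data vector $d$ and covariance $C$. That this matches the exact expression used for the forecasts is an assumption, and the uniform correlation between bins and all numerical values are purely hypothetical.

```python
import math


def solve_linear(matrix, rhs):
    """Solve matrix @ x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [list(row) + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if aug[pivot][col] == 0.0:
            raise ValueError("covariance matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for row in range(col + 1, n):
            factor = aug[row][col] / aug[col][col]
            for k in range(col, n + 1):
                aug[row][k] -= factor * aug[col][k]
    x = [0.0] * n
    for row in range(n - 1, -1, -1):
        tail = sum(aug[row][k] * x[k] for k in range(row + 1, n))
        x[row] = (aug[row][n] - tail) / aug[row][row]
    return x


def full_covariance_snr(signal, cov):
    """Combined SNR across all bins, sqrt(d^T C^-1 d)."""
    weighted = solve_linear(cov, signal)
    return math.sqrt(sum(s * w for s, w in zip(signal, weighted)))


def covariance_from_correlation(sigma, corr):
    """Toy covariance with the same correlation coefficient between every pair of bins."""
    n = len(sigma)
    return [[(1.0 if i == j else corr) * sigma[i] * sigma[j] for j in range(n)]
            for i in range(n)]


signal = [8.0, 2.0, 1.0]   # arbitrary units, strongest in the smallest r_p bin
sigma = [2.0, 2.0, 2.0]
snr_uncorrelated = full_covariance_snr(signal, covariance_from_correlation(sigma, 0.0))
snr_correlated = full_covariance_snr(signal, covariance_from_correlation(sigma, 0.9))
```

For this toy data vector, whose amplitude differs strongly between bins, the highly correlated covariance gives a larger combined SNR than the diagonal one. The direction and size of the effect depend on the shape of the signal, so this should not be read as a general result.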
Figure 9 shows the full covariance SNR for Y1 and Y10. Across the entire parameter space, we see very high signal to noise, even in places where we would not expect to obtain a detection of IA. There are two reasons for this. First, as discussed in Section 4.3.2, even for poor values of $a$ and $\rho$, there is still a strong signal in the lowest separation bin. Second, the covariance matrix has highly correlated off-diagonal terms, particularly in lower $r_p$ bins. This is shown in Figure 10. This high correlation implies that, while $\rho$ has a less significant effect on whether we expect a detection of IA in a given $r_p$ bin or not, its effect on the constraining power of the overall measurement with respect to parameters of interest may be significant. We will now go on to explore if this is indeed the case.

1-halo scale dependence constraints
Evidently, the overall SNR does not, by itself, provide a full picture of the forecast constraining power of the MEM with respect to model parameters of interest, namely scale dependence. To explore this in greater detail, we carry out MCMC fits using the emcee (Foreman-Mackey et al. 2013) package. We fit the synthetic measurement to a 4-parameter truncated power-law model, designed to qualitatively approximate our fiducial IA model, while maintaining a realistic level of model agnosticism. Note that in this case, the amplitudes which are fit ($A_{1h}$ and $A_{2h}$) will be a fraction of the true amplitude, dictated by the value of $a$. Compared to equation 28, we have swapped the order of the truncation terms as $\ell$ and $r_p$ are inversely proportional. The truncation scales of 0.3 Mpc/h and 0.75 Mpc/h are chosen to give the best fit to the fiducial signal from maximum likelihood estimation.
To constrain the model parameter space, we adopt a set of uniform priors, $A_{1h} \in [0, 10]$, $b_{1h} \in [-10, 0]$, $b_{2h} \in [-10, 0]$, and $A_{2h} \in [0, 10]$. We run chains for each of the 81 combinations of $a$ and $\rho$, initialising 32 walkers in a small spread around the maximum likelihood estimates, and allowing the chains to run until all have achieved convergence, which we define as $N > 50\tau$, where $N$ is the total number of iterations and $\tau$ is the integrated auto-correlation time. We note that a stricter test of convergence should ideally be used when placing model constraints with real data, but for the purposes of probing the acceptable values of $a$ and $\rho$, this criterion is sufficient.
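The prior ranges and the convergence criterion quoted above can be written down compactly. The sketch below is illustrative only: the parameter names (`A_1h`, `b_1h`, `b_2h`, `A_2h`) and helper functions are hypothetical labels, and the auto-correlation times are assumed to be supplied externally (with emcee they could come from the sampler's `get_autocorr_time` method).

```python
import math

# Uniform prior bounds as quoted in the text; the dictionary keys are illustrative names.
PRIOR_BOUNDS = {
    "A_1h": (0.0, 10.0),
    "b_1h": (-10.0, 0.0),
    "b_2h": (-10.0, 0.0),
    "A_2h": (0.0, 10.0),
}


def log_prior(params):
    """Log of a uniform prior: 0 inside the bounds, -inf outside."""
    for name, (low, high) in PRIOR_BOUNDS.items():
        if not low <= params[name] <= high:
            return -math.inf
    return 0.0


def is_converged(n_steps, autocorr_times, factor=50.0):
    """Convergence test N > factor * tau, required for every parameter."""
    return all(n_steps > factor * tau for tau in autocorr_times)


inside = log_prior({"A_1h": 1.0, "b_1h": -2.0, "b_2h": -1.0, "A_2h": 0.5})
outside = log_prior({"A_1h": 1.0, "b_1h": 0.5, "b_2h": -1.0, "A_2h": 0.5})
long_chain_ok = is_converged(5000, [40.0, 60.0, 55.0, 80.0])
short_chain_ok = is_converged(3000, [40.0, 60.0, 55.0, 80.0])
```

Requiring the condition for every parameter is the conservative reading of "until all have achieved convergence"; a chain of 3000 steps fails here because the slowest parameter has $50\tau = 4000$.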
Marginalising over the other 3 parameters in the model, we estimate the forecast SNR for the 1-halo scale dependence, $b_{1h}$, by taking the 50th percentile (median) value as the best fit and the distance between the 16th and 84th percentiles as the $1\sigma$ (68%) confidence region. We choose to take the median rather than the maximum likelihood estimate, as we found it to be more robust to variations in walker initialisation and allowed more freedom of the model within the signal uncertainties. The resulting SNR from these fits is shown in Figure 11. Encouragingly, we see that even for certain $a$-$\rho$ combinations where we do not expect detection, a $1\sigma$ or greater constraint on the scale dependence is still possible with high enough values of $\rho$. The importance of high $\rho$ values is further emphasised here for ensuring the tightest possible constraints, with $\rho = 0.90$ resulting in twice the SNR compared to $\rho = 0.10$ for the same $a$ value. Similar to what was seen in the per-bin diagonal SNR, going from Y1 to Y10, we again see an approximately factor of 2 increase in the constraint SNR.
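A minimal sketch of this percentile-based constraint SNR is given below, operating on a flat list of posterior samples for a single parameter. The text does not specify whether the noise is the full 16th to 84th percentile width or half of it, so the `half_width` switch is an assumption, as are the function names and the synthetic samples.

```python
def percentile(samples, q):
    """Percentile with linear interpolation between order statistics, q in [0, 100]."""
    if not samples:
        raise ValueError("samples must not be empty")
    ordered = sorted(samples)
    pos = (len(ordered) - 1) * q / 100.0
    lower = int(pos)
    upper = min(lower + 1, len(ordered) - 1)
    frac = pos - lower
    return ordered[lower] * (1.0 - frac) + ordered[upper] * frac


def constraint_snr(samples, half_width=True):
    """|median| divided by the 68% interval (half or full width) of the samples."""
    median = percentile(samples, 50)
    width = percentile(samples, 84) - percentile(samples, 16)
    noise = width / 2.0 if half_width else width
    if noise == 0.0:
        raise ValueError("posterior has zero width")
    return abs(median) / noise


# Synthetic, evenly spaced "posterior" samples centred on an index of -2.
toy_samples = [-2.5 + 0.01 * i for i in range(101)]
toy_snr_half = constraint_snr(toy_samples)
toy_snr_full = constraint_snr(toy_samples, half_width=False)
```

The two conventions differ by exactly a factor of 2, so the choice matters when comparing against an SNR threshold of 1.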
An interesting question also arises when we consider the relation between $a$ and $\rho$ themselves. If we could expect that $\rho$ were to increase as $a$ decreased, it would be greatly beneficial to the development of bespoke estimators. However, the inverse could make designing an estimator suitable for Y1 in particular challenging. Answering this question would require the analysis of specific shape estimators, which is beyond the scope of this work, but we highlight this as a key consideration for future research.
As a heuristic guide in interpreting the results shown in Figure 11, we consider the analysis of Secco et al. (2022), which sought to select an appropriate model of IA for the DES Year 3 cosmic shear analysis. That work found that, in the scenario where the true IA was described by the Tidal Alignment - Tidal Torquing model (TATT, Blazek et al. 2019) with tidal torquing amplitude $A_2 = -1.36$, redshift index of this term $\eta_2 = -2.5$ and source density bias parameter $b_{\rm TA} = 1.0$, but the analysis incorrectly assumed the simpler NLA-$z$ model with $A_2 = \eta_2 = b_{\rm TA} = 0$, the result was a bias in the $S_8$-$\Omega_M$ plane of considerably more than $2\sigma$. We can see that for each of these assumed truth values, an external measurement with SNR of 1 would have excluded the incorrectly assumed value at the $1\sigma$ level, while a prior measurement with SNR of 2 would have decidedly ruled out the possibility of using the simpler presumed model. Thus, while a higher SNR is of course helpful for further pinning down model properties, a measurement of an IA model parameter of SNR even modestly $>1$ can be extremely powerful in ensuring robust cosmological constraints from a cosmic shear analysis, further emphasising the importance of improving our ability to measure and constrain IA in an observational context.

DISCUSSION AND CONCLUSIONS
In this work, we have carried out the first application of the Multi-Estimator Method (MEM) for measuring and/or constraining intrinsic alignment (IA) developed in Leonard & Mandelbaum (2018) (L2018). Using Dark Energy Survey Year 1 (DES Y1) shear estimators, we showed how the MEM could be applied, and investigated and corrected for systematic errors that may pose problems to attempts to use this method. We identified three key systematics that future applications must treat or account for in order for the technique to succeed:
• Selection biases induced when making catalogue cuts to match the lensing contribution to shear between shape samples.
• Differences in effective weighting schemes between the two samples, altering the effective redshift distribution of the samples, and thus the measured lensing signal.
• Residual multiplicative biases in the lensing signal, due to calibration uncertainty, resulting in a lensing residual when cancellation is performed.
Our investigation into these systematics highlighted the potential biases that can be introduced when they are not accounted for and we therefore attempted to develop techniques to treat them. Given we expected our uncertainties to be underestimated, the final result was a null detection of IA in DES Y1, which was qualitatively consistent with the findings of Samuroff et al. (2019). We found the dominant source of uncertainty in this context was statistical. Having developed the tools and knowledge necessary to apply the MEM in an observational context, we went on to forecast the significance of the multiplicative bias induced lensing residual in Stage IV data.
We additionally considered the requirement on galaxy size, such that a galaxy should be well enough resolved that differences in radial weighting between two shape estimators are physically meaningful.Using a strict cut on galaxy effective size of  ≥ 3, we constructed new samples from the original forecasts for the Legacy Survey of Space and Time (LSST) Y1 and Y10 source samples.We used halo occupation distribution models to theoretically determine the observed quantities necessary for the MEM, and thus forecast IA and lensing residual signals.
For a fiducial IA signal in our forecasts, we used a combination of the IA halo model (Schneider & Bridle 2010; F2021) and a redshift dependent non-linear alignment (NLA-$z$) model with parameters from the DES Y3 best fits (Secco et al. 2022). We stress again that this choice, while well motivated by observations and literature, may not necessarily be representative of the contamination in LSST, and therefore our forecasting results are guidelines for future applications of the MEM, rather than strict requirements. We plan in future work to determine how varying IA models could impact the signal obtained from the MEM via analysis of simulated galaxy images.
With a set of fiducial signals, we developed a scheme whereby uncertainty in the multiplicative bias can be accounted for as a systematic error and estimated its contribution to the covariance. We found that, even in the presence of a multiplicative bias uncertainty of $\pm 0.013$, there was the possibility of a $1\sigma$ or higher detection in LSST Y1, given the offset in alignment amplitude between the two estimators is greater than $40\%$ (represented by the MEM parameter $a \leq 0.6$). In this case, the impact of shape noise correlation between estimators (captured by the Pearson correlation coefficient, $\rho$) was limited. However, higher $\rho$ values became more important when the signal-to-noise on the signal was close to 1, although $a$ in all cases still remained the primary factor in determining if detection was possible. For LSST Y10, the drop in multiplicative bias uncertainty to $\pm 0.003$ resulted in a roughly factor of 2 increase in the SNR, and thus enabled detection at higher $a$ values. A general limit for Y10 could be taken as $a \leq 0.80$.
When addressing the ability of the MEM to constrain the 1-halo scale dependence of the IA signal, we found $\rho$ became more significant, due to high covariance in the forecasts. With sufficiently high $\rho$, constraints on the scale dependence could have a SNR greater than 1 even if the data itself was consistent with zero. We also found for low SNR values of $a$ and $\rho$ that the maximum likelihood estimate from the chains often over-fit the data points themselves, despite the presence of large uncertainties, whereas the median values of the posterior distribution allowed for more freedom of the model within the error-bars. For this reason, we chose to use the median as our best fit value when calculating the 1-halo scale dependence constraint SNR.
As a general guideline, for high values of $a$, achieving a value of $\rho = a$ should be sufficient to obtain a reasonable constraint on the 1-halo scale dependence. This requirement relaxes for lower $a$ values, allowing for lower values of $\rho$. Given the results here, we recommend that a realistic and achievable target for LSST Y1 is $a = 0.60$ and $\rho = 0.50$. Given the model used here, this would allow for both a detection and a greater than $1\sigma$ constraint on the scale dependence. Exceeding this baseline would of course result in even more favourable performance of the MEM.
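The headline targets can be collected into a simple check on a candidate pair of estimators. This is a deliberately crude sketch: it treats the guideline as two independent thresholds (an upper limit on $a$ and a lower limit on $\rho$), ignoring the trade-off between them described above. The Y1 values are the recommended target, the Y10 values combine the general limit of $a \leq 0.80$ with the $\rho = 0.50$ quoted in the abstract, and all names are hypothetical.

```python
# Guideline thresholds taken from the text; the dictionary layout is illustrative.
GUIDELINES = {
    "Y1": {"a_max": 0.60, "rho_min": 0.50},
    "Y10": {"a_max": 0.80, "rho_min": 0.50},
}


def amplitude_offset(a):
    """Fractional difference in IA amplitude between the two estimators, 1 - a."""
    return 1.0 - a


def meets_guideline(a, rho, a_max, rho_min):
    """True if the estimator pair satisfies both threshold conditions."""
    return a <= a_max and rho >= rho_min


# A hypothetical estimator pair with a 30% amplitude offset and rho = 0.7.
pair_a, pair_rho = 0.70, 0.70
pair_ok_y1 = meets_guideline(pair_a, pair_rho, **GUIDELINES["Y1"])
pair_ok_y10 = meets_guideline(pair_a, pair_rho, **GUIDELINES["Y10"])
```

In this example the pair falls short of the Y1 target but would satisfy the looser Y10 limit, mirroring the relaxation in the required amplitude offset from $40\%$ to $20\%$.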
In future, we will look to identify specific shear estimators and optimise them for use with the MEM by introducing custom radial weighting. As mentioned previously, it would be interesting to investigate how values of $a$ and $\rho$ are related for different pairs of estimators, to determine the feasibility of achieving both low $a$ and high $\rho$. This is a study that is most easily carried out in tandem with shear estimator optimisation. Furthermore, we will seek to minimise multiplicative bias uncertainty in the estimators as much as possible, in the hopes of lowering it well beyond the LSST Y1 requirements.
Promising shear estimators for optimisation include METADETECT (Sheldon et al. 2023), which builds upon the framework of METACALIBRATION to also perform galaxy detection, and Fourier Power Shapelets (Li et al. 2018, 2020, 2022), which has analytical correction for measurement bias, making it computationally efficient in the context of the MEM, where a second shear estimation pipeline needs to be run on images. Finally, Forklens (Zhang et al. 2023) is potentially also promising, due to its ability to measure shear from extremely noisy images. This may remove the need for galaxy weights entirely, therefore circumventing the requirement for a matched weighting scheme.
In conclusion, here we have built upon the work of L2018 to develop a greater understanding of the systematics present in the MEM, and how they can potentially be treated or accounted for. We have carried out the first application of this method to observational data using DES Y1 and measured an IA signal consistent with zero, which includes the DES Y1 best fit NLA model within its uncertainties. In the context of LSST Y1 and Y10, we have placed general requirements on the key parameters relating the shear estimators used, in order to ensure future attempts at measurement are robust to contamination by residual multiplicative bias. Our work here has shown that, while challenges lie ahead, the measurement of IA in LSST Y1 is possible and strong constraints in Y10 are highly likely, especially if development of the MEM continues for LSST Y1 and beyond.
To further develop the guidelines given here, work identifying and optimising shear estimators for the MEM is required and planned for future analysis. Tests on simulated galaxy images with these tailored estimators can be used to determine how the fiducial IA model affects the method, and allow specific requirements on galaxy size in the context of specific values of $a$ to be placed. With the work carried out here and proposed for the near future, the MEM has the potential to become another important tool in developing our understanding of the intrinsic alignment of galaxies.

APPENDIX B: 2-POINT CUMULANTS FOR HALO MODEL POWER SPECTRA
To calculate the boost factors, we require the galaxy-galaxy power spectrum $C_{gg}(\ell)$. Because we have different HODs for our source and lens samples, computing their cross-spectrum is non-trivial. Therefore, we opt to individually calculate $P_{ll}$ (lens-lens clustering) and $P_{ss}$ (source-source clustering), then approximate the lens-source clustering power spectrum as the geometric mean of the auto-spectra, $P_{ls} \approx \sqrt{P_{ll} P_{ss}}$. To verify this is a suitable approximation, we plot each of the three spectra mentioned above, as well as an additional estimate, $P^{\rm 2pt}_{ls}$, which is obtained by assuming the 2-point cumulant is simply the product of the means of each Fourier space profile. This plot is shown in Figure B1. We find the geometric mean closely follows this 2-point term up to high $k$, at which point the spectrum becomes dominated by a constant, non-physical central-central 1-halo term, resulting from the inappropriate 2-point cumulant. Subtracting the value where this constant is most dominant (taken to be the value of $P^{\rm 2pt}_{ls}$ at $k_{\max}$) from the spectrum recovers the behaviour of the geometric mean, confirming this is indeed a suitable approximation.
For the 1-halo galaxy-intrinsic correlation, which gives the tangential shear due to IA, we derive the 1-halo power spectrum to determine if using the product of the profile means is sufficient, or if a more complex cumulant is required. In the HOD for lens galaxies, $M$ is the mass of the halo, $n_L$ is a normalisation over the total number of lens galaxies, and $N_C$ and $N_S$ are respectively the number of central and satellite galaxies for a halo of mass $M$. The satellite profile is a truncated Navarro-Frenk-White (NFW) profile (Navarro et al. 1996b, 1997), which we assume the satellite galaxies approximately trace. The satellite intrinsic shear profile is taken from F2021. From the resulting 1-halo power spectrum we can infer that the product of the profile means is sufficient for the 2-point cumulant, as all terms in equation B5 are physically meaningful. The first term in the square brackets represents the correlation of central position with satellite shear, and the second term represents the satellite position - satellite shear correlation. We do not currently have a profile for the central shear; however, within the 1-halo regime (where this power spectrum is used in our IA model), we expect this term to be subdominant.
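The geometric-mean approximation and the constant subtraction used in the check against Figure B1 can be sketched as below. The spectra are assumed to be plain lists sampled on a common grid ordered by increasing $k$, so that the last element is the value at the maximum $k$. The function names and toy values are hypothetical and do not come from the halo model calculation.

```python
import math


def geometric_mean_spectrum(p_ll, p_ss):
    """Approximate the lens-source spectrum as sqrt(P_ll * P_ss) at each k."""
    if len(p_ll) != len(p_ss):
        raise ValueError("auto-spectra must be sampled on the same k grid")
    return [math.sqrt(a * b) for a, b in zip(p_ll, p_ss)]


def subtract_high_k_constant(p_2pt):
    """Remove the constant term, taken as the value at the highest k (last element)."""
    constant = p_2pt[-1]
    return [p - constant for p in p_2pt]


# Toy auto-spectra on three k values, in arbitrary units.
toy_p_ll = [400.0, 100.0, 25.0]
toy_p_ss = [100.0, 25.0, 4.0]
toy_p_ls = geometric_mean_spectrum(toy_p_ll, toy_p_ss)

# Toy 2-point estimate: a decaying spectrum plus a constant offset of 3.
toy_p_2pt = [203.0, 53.0, 3.0]
toy_p_2pt_corrected = subtract_high_k_constant(toy_p_2pt)
```

Note that subtracting the value at the highest $k$ forces the corrected spectrum to zero there, so the comparison with the geometric mean is only meaningful below that scale.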

APPENDIX C: BOOST FACTOR AND F CALCULATIONS
In order to calculate the boost factor and $F$ for the forecasting in Section 4, we use the theoretical expressions given in L2018. In these expressions, $z_+(z_l, r_p)$ represents the maximum redshift at which we expect a source to be aligned with a lens at $z_l$, given its 2D separation from that lens is $r_p$ (see Equation 30). Figure C1 shows the boost and $F$ values calculated using these expressions with the LSST-like data employed in Section 4.

APPENDIX D: DERIVATION OF COVARIANCE MATRIX FOR NON-NEGLIGIBLE MULTIPLICATIVE BIAS
Following the derivation in Appendix B of Jeong et al. (2009) and the equations given in Appendix B of L2018, we re-derive the expression for the statistical covariance on our measurement, in the case of non-negligible multiplicative bias.
For two different tangential shear estimates, $\gamma_t$ and $\gamma'_t$, with demonstrably subdominant post-calibration residual multiplicative bias, the statistical covariance on our measurement can be expressed in terms of the following quantities: $f_{\rm sky}$ is the fraction of the sky covered by the survey; $C_{gg}(\ell)$, $C_{gM}(\ell)$ and $C_{MM}(\ell)$ are the galaxy-galaxy, galaxy-matter, and matter-matter angular power spectra respectively; $n_L$ and $n_s$ are lens and source galaxy surface densities respectively; $\sigma$ and $\sigma'$ are the shape noise for our two estimators; and $\rho$ is the correlation in shape noise between these estimators. As such, when the estimators are the same, $\rho = 1$ and $\sigma = \sigma'$. Because the multiplicative bias residual is assumed to be small for any contemporary shear estimators, we will neglect terms of second order in $m$ and $m'$. Substituting D3 into D2 and multiplying through the brackets, we can see that the inclusion of multiplicative bias residuals amounts to percent level corrections in the covariance terms. Since we include the residual multiplicative bias as a systematic uncertainty in our measurement, we therefore use the original equation of L2018 in our forecasts.
The errors shown on the forecast data points are obtained from a covariance matrix produced using TJPCov. However, because TJPCov does not yet have the functionality to calculate the covariance for the specific quantity we are interested in here, we instead employed equation D1.
Four standard $\gamma_t$ covariance matrices were obtained from TJPCov, with the cross-estimator covariance terms calculated using $\sqrt{\sigma_1 \sigma_2}$ as the shape noise, varying $\rho$. This approach does not capture any potential differences in $n_s$ for the two estimators; however, we find it a close enough approximation for the purposes of our forecasting.
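As a small worked illustration of the cross-estimator shape noise, the sketch below forms the effective shape noise $\sqrt{\sigma_1 \sigma_2}$ and a correlated shape-noise variance $\rho \sigma_1 \sigma_2$. How $\rho$ enters the covariance code in practice is an assumption here, and the helper names and numerical values are hypothetical; none of this reproduces the TJPCov interface.

```python
import math


def cross_shape_noise(sigma_1, sigma_2):
    """Effective shape noise for the cross-estimator term, sqrt(sigma_1 * sigma_2)."""
    if sigma_1 < 0 or sigma_2 < 0:
        raise ValueError("shape noise must be non-negative")
    return math.sqrt(sigma_1 * sigma_2)


def cross_shape_noise_variance(sigma_1, sigma_2, rho):
    """Correlated shape-noise variance, rho * sigma_1 * sigma_2 (assumed form)."""
    if not -1.0 <= rho <= 1.0:
        raise ValueError("rho must lie in [-1, 1]")
    return rho * cross_shape_noise(sigma_1, sigma_2) ** 2


# Identical estimators: rho = 1 and equal shape noise recover the auto-term variance.
auto_variance = cross_shape_noise_variance(0.25, 0.25, 1.0)

# Two different estimators with partially correlated shape noise.
cross_variance = cross_shape_noise_variance(0.2, 0.3, 0.5)
```

The first case reproduces the limit noted in the derivation, where identical estimators have $\rho = 1$ and equal shape noise.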
This paper has been typeset from a TEX/LATEX file prepared by the author.

Figure 3. Measurement corrected for selection response and weighting differences: The measured signal minus the weight-induced residual is shown by the yellow circles. Green triangles show the theoretical prediction for this residual, while the blue squares show the signal corrected for the selection response (yellow circles in previous figure), but not for the difference in weights. We see that while the weight induced residual is significant enough to noticeably shift the data points, we still see a null detection in most bins. A horizontal offset is applied to the data points for visual clarity.

Figure 4. Final $(1-a)\gamma_{\rm IA}$ measurement corrected for selection response, weighting differences, and multiplicative bias uncertainty: Data points are represented by the yellow circles and are the same as in the previous figure. The pink shaded region represents the contribution to the $1\sigma$ uncertainty from multiplicative bias uncertainty, while the blue represents the statistical uncertainty obtained from the jackknife method and the uncertainty on the weight induced residual correction. Finally, the orange shows the total $1\sigma$ uncertainty from the combination of both the pink and blue regions. We see that the majority of the known uncertainty in this application of the MEM is statistical.

Figure 5. Final MEM measurement of $(1-a)\gamma_{\rm IA}$ compared to the signal expected based on DES Y1 NLA model best fits of $A_{\rm IA}$, for a few example values of $a$. We see our signal is most consistent with the $-(1-0.6)\gamma_{\rm IA}$ model. We expect the error bars to be underestimated due to the selection response correction representing a lower bound, and that the true selection response would shift the data points closer to zero.

Figure 6. LSST Y1 (left) and Y10 (right) source galaxy redshift distributions obtained after the effective size cut. The parametric model from equation 21 has been re-fit to obtain a new model. The $n_{\rm eff}$ values correspond to a loss of roughly 2 and 11 sources per square arcminute for the Y1 and Y10 cases respectively, when compared to the original values (see Table 2).

Figure 7. Comparison of tangential shears for the different IA models discussed. The windowed tangential shear (yellow squares) is the model used in our forecasting. It is obtained via a truncated combination of the 1-halo (black crosses) and NLA (red crosses) models. The maximum lensing residual (purple triangles) is shown for comparison. We show the absolute value of the IA signal for easier visual comparison to the lensing residual.

Figure 8. SNR in each $r_p$ bin, taken as the forecast IA signal divided by the $1\sigma$ uncertainty. Values greater than 1 represent a detection of IA above statistical noise and uncertainty due to multiplicative bias, for the given $r_p$ bin in isolation. LSST Y1 is shown on the left and Y10 on the right. The top panels show different cases where $\rho$ is varied but $a$ kept fixed, while the bottom panels show the opposite. A much higher SNR is seen in the lowest projected separation bin where the 1-halo term becomes highly dominant. Varying $a$ has a more significant effect on the SNR than varying $\rho$. However, in the lowest signal to noise bins $\rho$ can be the difference between a detection and a signal consistent with zero. In Y10 compared to Y1 we see a roughly factor of 2 increase in per-bin signal to noise.

Figure 9. Combined SNR in all $r_p$ bins across the $a$-$\rho$ parameter space for LSST Y1 (left) and LSST Y10 (right). We see high signal to noise in the entire explored region, even in areas where we expect the majority of the signal to have $1\sigma$ uncertainty consistent with zero. This is a result of the high SNR in the lowest separation bin and highly correlated off-diagonal elements of the covariance matrix.

Figure 10. Correlation matrices for LSST Y10 at $\rho = 0.10$ (left) and $\rho = 0.90$ (right). In the right panel we see very high correlation across the entire matrix, including in the larger $r_p$ bins. This is not the case for the lower $\rho$ value on the left; however, in the smaller $r_p$ bins we still see significant correlation.

Figure 11. Signal-to-noise ratio for the constraint on 1-halo scale dependence with LSST Y1 (left) and Y10 (right) like data. Different combinations of $a$ and $\rho$ are shown. We define the signal as the median of the posterior distribution of $b_{1h}$ values and noise as the region containing 68% of the posterior probability ($1\sigma$ uncertainty).

Figure B1. Comparison of galaxy-galaxy power spectra at $z = 0$, calculated using the halo model with different lens and source HODs. We see the simple 2-point cumulant follows the geometric mean at low $k$, but begins to diverge above $k = 10$. Subtracting the value at $k_{\max}$ from the rest of the spectrum, it once again follows the geometric mean.

Figure C1. Boost and $F$ values calculated using the Y1 forecasting data set detailed in Section 4.1, for lenses in the range $1.0 \leq z_l \leq 1.2$ and sources in the range $0.05 \leq z_s \leq 2.4$. The power law for the boost follows the expected trend given in Sheldon et al. (2004). We see that the inclusion of $F$ becomes important for $r_p > 2$ Mpc/h.

Table 1. Shear response and selection response for the MCAL matched catalogue. As in Prat et al. (

Table 2. Relevant survey parameters for DES Y1, and LSST Y1 and Y10.