Self-calibration of weak lensing systematic effects using combined two- and three-point statistics

We investigate the prospects for using the weak lensing bispectrum alongside the power spectrum to control systematic uncertainties in a Euclid-like survey. Three systematic effects are considered: the intrinsic alignment of galaxies, uncertainties in the means of tomographic redshift distributions, and multiplicative bias in the measurement of the shear signal. We find that the bispectrum is very effective in mitigating these systematic errors. Varying all three systematics simultaneously, a joint power spectrum and bispectrum analysis reduces the area of credible regions for the cosmological parameters $\Omega_\mathrm{m}$ and $\sigma_8$ by a factor of 90 and for the two parameters of a time-varying dark energy equation of state by a factor of almost 20, compared with the baseline approach of using the power spectrum alone and of imposing priors consistent with the accuracy requirements specified for Euclid. We also demonstrate that including the bispectrum self-calibrates all three systematic effects to the stringent levels required by the forthcoming generation of weak lensing surveys, thereby reducing the need for external calibration data.


INTRODUCTION
One of the primary aims of modern cosmology is to constrain cosmological parameters within the concordance cosmological model. An increasingly reliable tool for this purpose is weak gravitational lensing. Recent galaxy surveys including the Kilo-Degree Survey 1 (KiDS), the Dark Energy Survey 2 (DES) and the Hyper Suprime-Cam Subaru Strategic Survey 3 (HSC) have already produced strong constraints on parameters of structure growth (Troxel et al. 2018;Hikage et al. 2019;Asgari et al. 2020a). The next generation of surveys such as Euclid 4 (Laureijs et al. 2011) and the Vera C. Rubin Observatory Legacy Survey of Space and Time 5 (LSST) will represent a step change in the quantity and precision of weak lensing data and deliver even tighter parameter constraints. Moreover, the increased volume and accuracy of the data will make it possible to use methods and statistics which are not feasible with current surveys.
One possibility is to make more use of three-point weak lensing statistics. These are inherently more difficult to measure and analyse than two-point statistics but nevertheless a three-point weak lensing signal was first detected as early as 2003 (Bernardeau et al. 2003;Pen et al. 2003). Subsequently Semboloni et al. (2010) successfully used three-point aperture mass statistics from the Cosmic Evolution Survey (Scoville et al. 2007) to estimate cosmological parameters. E-mail: ucapsep@ucl.ac.uk This work was an important proof of concept. Although the survey was small, with an area of only 1.64 deg 2 , combining two-point and three-point statistics produced a modest improvement in parameter constraints. More recently the feasibility and usefulness of threepoint measures were confirmed by Fu et al. (2014) using the larger Canada France Hawaii Telescope Lensing Survey (CFHTLenS; Heymans et al. 2012).
Several theoretical studies have investigated the weak lensing bispectrum from the point of view of reducing statistical uncertainties (Takada & Jain 2004;Kayo et al. 2012;Kayo & Takada 2013;Rizzato et al. 2019). All these authors concluded that in principle the bispectrum can provide worthwhile additional information and thus improve cosmological parameter constraints. However these investigations did not take account of systematic uncertainties and their conclusions must be considered optimistic. If anything, the results reinforce the need to control systematic uncertainties. Currently systematic and statistical errors are of similar size but future surveys will drastically reduce statistical uncertainties, making control of systematic effects a priority. Accounting for systematic effects can also shed light on tension between recent results from weak lensing (Troxel et al. 2018;Abbott et al. 2019;Hikage et al. 2019;Asgari et al. 2020b;Heymans et al. 2020;Joudaki et al. 2020) and the latest Planck analyses of the cosmic microwave background (Aghanim et al. 2018). In particular there are discrepant results for the value of the structure growth parameter S 8 = σ 8 Ω m /0.3, derived from the matter fluctuation amplitude parameter σ 8 and the matter density parameter Ω m . The possibility that this apparent tension between results from different probes stems from uncontrolled systematic effects has not been ruled out.
In the light of this we investigate the feasibility of using threepoint statistics to control some of the major systematic uncertainties which beset weak lensing. Our work is partly motivated by existing evidence that some systematics affect two-point and three-point statistics in different ways, for example Semboloni et al. (2008) for weak lensing, Foreman et al. (2020) for the matter bispectrum. For tomographic weak lensing we might expect these differences to be substantial because the weak lensing power spectrum and bispectrum are differently-weighted projections of their matter counterparts. We also explore the potential for using the combined bispectrum and power spectrum to enable self-calibration -mitigating systematic effects using only data from the survey itself.
We focus on three major sources of systematic error: intrinsic alignments of galaxies, residual uncertainty in the shape of tomographic redshift distributions expressed through potential shifts in their means, and multiplicative bias in shear estimation. The effects of these (and other) systematic errors on two-point weak lensing statistics have been studied extensively and there is a significant literature discussing specific types of uncertainty and presenting general approaches to estimating and controlling systematics (Huterer & Takada 2005;Huterer et al. 2006;Ma et al. 2006;Bridle & King 2007;Kitching et al. 2008a;Bernstein 2009;Hearin et al. 2012;Kirk et al. 2012;Massey et al. 2012;Cropper et al. 2013;Joachimi et al. 2015;Kirk et al. 2015;Troxel & Ishak 2015;Mandelbaum 2018). The resulting methods have been implemented in the analysis of two-point statistics from recent weak lensing surveys (Hoyle et al. 2018;Zuntz et al. 2018;Hikage et al. 2019;Samuroff et al. 2019;Giblin et al. 2020;Joachimi et al. 2020).
In contrast, relatively little attention has been paid to the effect of systematics on three-point weak lensing statistics such as the bispectrum even though many of the concepts developed for the power spectrum can readily be adapted. Of the few studies which did consider systematics in three-point statistics, Huterer et al. (2006) investigated generic multiplicative and additive biases in future surveys such as LSST and found that using the bispectrum as well as the power spectrum could increase the scope for self-calibration without undue degradation of parameter constraints. For intrinsic alignments, Shi et al. (2010) extended a nulling method from twopoint to three-point statistics, which mitigated the effects of intrinsic alignments but at the expense of loss of constraining power, and Troxel & Ishak (2011 used the redshift dependency within single redshift bins to inform a self-calibration method. Theoretical explorations of three-point intrinsic alignment statistics have been presented by Semboloni et al. (2008) based on simulations and Merkel & Schäfer (2014) using a tidal alignment model.
In this work we provide a more complete assessment of the value of the bispectrum to mitigate weak lensing systematics through selfcalibration in a tomographic survey. We model the effect of each systematic on the weak lensing power spectrum and bispectrum and use Fisher matrix analysis to forecast the potential for self-calibration in a Euclid-like survey.
In Sect. 2 we summarise the tomographic weak lensing power spectrum and bispectrum and the structure of their covariance matrices. Section 3 records our survey and modelling assumptions. In Sect. 4 we describe our parameterization of the three systematic effects, for both the power spectrum and bispectrum, and in Sect. 5 we describe our inference methodology. In Sect. 6 we present our results. Our conclusions are in Sect. 7. Appendices A to C give details of our power spectrum and bispectrum covariance methodology. Appendix D contains supplementary plots demonstrating self-calibration.

Tomographic weak lensing power spectrum and bispectrum
Throughout this work we assume a flat universe. With this assumption the convergence field κ (i) for the i th tomographic bin at angular position θ is where χ lim is the maximum comoving distance of the survey, δ is the matter density contrast, and the weight q (i) ( χ) is defined as Here a( χ) is the scale factor, p (i) ( χ) is the line-of-sight distribution of galaxies in the i th tomographic bin, H 0 is the Hubble constant and Ω m is the matter density parameter. Assuming that the Limber and flat-sky approximations are valid, the tomographic weak lensing power spectrum at angular multipole between redshift bins i and j is where P δ is the matter power spectrum, k χ(z) = + 1/2, and we use the more accurate extended Limber approximation (LoVerde & Afshordi 2008) which includes higher-order terms from a series expansion of ( + 1/2) −1 .
where B δ is the matter bispectrum and we again use the extended Limber approximation (Munshi et al. 2011). The vectors k i form a triangle so that k 1 + k 2 + k 3 = 0. Thus the bispectrum has only three degrees of freedom, two from the triangle condition and one from the orientation of the triangle in space.

Summary statistics
In this work we treat the weak lensing power spectrum and bispectrum as observables, even though they are not directly measurable in practice because of complications such as incomplete sky coverage of surveys. Nevertheless we expect our results to be valid for alternative more practical Fourier-space summary statistics. In the case of twopoint analyses such alternatives include band powers (Van Uitert et al. 2018;Joachimi et al. 2020) and pseudo-C estimators (Hikage et al. 2011;Asgari et al. 2018;Alonso et al. 2019), both of which contain essentially the same information as the power spectrum. Three-point summary statistics are less well-developed and it is less clear how they relate to the underlying bispectrum. The most recent three-point analyses of survey data (Semboloni et al. 2010;Fu et al. 2014) have used aperture mass statistics, which can be estimated from correlation functions or modelled from the bispectrum (Schneider et al. 1998). One advantage is that third-order aperture mass statistics separate E-and B-modes of the shear signal well (Shi et al. 2014). This is desirable since the detection of B-modes can indicate the presence of systematics. Other three-point Fourierspace estimators have also been suggested, for example the integrated bispectrum which is sensitive mainly to squeezed triangles (Munshi et al. 2020b). This has recently been generalised to other bispectrum configurations through a 'pseudo' estimator (Munshi et al. 2020a). So far these new statistics have not been used to analyse survey data and their practicality and realism are unknown.

Weak lensing covariance
Both the matter and weak lensing covariance matrices have the general form where the subscripts denote 'Gaussian', 'in-survey non-Gaussian' and 'supersample covariance' (Takada & Jain 2009;Kayo et al. 2012;Takada & Hu 2013). Appendix A summarises the origin of these terms in the matter power spectrum and bispectrum covariance.
To calculate the matter covariance we use the halo model formalism, following Takada & Hu (2013) for the power spectrum covariance and Chan & Blot (2017) and Chan et al. (2018) for the bispectrum covariance. Appendix B gives further details of the bispectrum supersample covariance, since the full expression has not been widely used, and also discusses conflicting results in the literature, justifying our choice of model.
For ease of computation we consider only equilateral triangles when calculating the bispectrum and its covariance, but recognize that in doing this we are discarding potentially valuable information from other triangle shapes. For example Barreira (2019) found that squeezed triangles with two large and one small side provided useful information. This reduction in information means that our conclusions about the efficiency of self-calibration are likely to be conservative.
To estimate the weak lensing power spectrum and bispectrum covariances and their cross-covariance, we follow the methods in Takada & Jain (2004) and Kayo et al. (2012). In App. C we give expressions for all the components of the weak lensing power spectrum and bispectrum covariances for a single tomographic bin. Similar results, including for the power spectrum-bispectrum crosscovariance, can be found in Kayo et al. (2012) and Rizzato et al. (2019).
Appendix C also illustrates the relative sizes of terms in the weak lensing covariance matrices. The in-survey non-Gaussian terms of both the power spectrum and bispectrum covariance are subdominant, in agreement with results in Barreira (2019) and Rizzato et al. (2019). Consequently, to simplify calculation, in our analysis we include only the Gaussian and supersample terms. Over most of the relevant angular scales the power spectrum and bispectrum supersample covariance are both dominated by the one-halo terms, but we nevertheless retain all terms apart from small dilation terms.

COSMOLOGICAL PARAMETERS, SURVEY CHARACTERISTICS AND MODELLING ASSUMPTIONS
We assume a spatially flat wCDM model and consider six cosmological parameters with fiducial values as shown in Table 1. We model the evolving dark energy equation of state parameter by w(a) = w 0 + (1 − a) w a where a is the cosmological scale factor, which introduces two further parameters: w 0 , the value of w at the present day, with fiducial value -1, and w a which has fiducial value 0. We assume a Euclid-like survey with area 15 000 deg 2 , total galaxy density 30 arcmin −2 and redshift range 0.0 ≤ z ≤ 2.5. The assumed overall redshift probability distribution of source galaxies is with α = 2.0, β = 1.5, z 0 = z med / 2, z med = 0.8. We model statistical uncertainty in photometric redshift values by assuming that the redshift distribution within each tomographic bin is Gaussian with a dispersion σ ph . Thus the conditional probability of obtaining a photometric redshift z ph given the true redshift z has the form We take σ ph to be 0.05. With these assumptions, we divide the redshift distribution into five bins, each containing the same number of galaxies. . Future surveys such as Euclid will allow much finer bin division than this. However we find that, in the absence of systematic uncertainties, increasing the number of bins beyond five for either the power spectrum or the bispectrum provides little extra information. This is consistent with results in Ma et al. (2006), Joachimi & Bridle (2010) and Rizzato et al. (2019). In fact we find that, considering statistical uncertainties only, if five or more bins are used for the power spectrum, there is little to be gained from using more than two bins for the bispectrum. This will not necessarily be true if systematic uncertainties are considered. For example, using only the power spectrum, if intrinsic alignments are present the information content does not level off until up to 20 bins are used (Bridle & King 2007;Joachimi & Bridle 2010). Nevertheless we restrict the self-calibration analysis to five bins to reduce the complexity of the bispectrum and its covariance. It is reasonable to expect that the self-calibration power would increase if more than five bins were used.
We use 20 angular bins equally logarithmically spaced from min = 30 to max = 3000. This range avoids large scales where the Limber approximation breaks down and in any case little information is available, and also small scales where the modelling of non-linear effects on the matter distribution becomes very uncertain.
We employ the transfer function from Eisenstein & Hu (1998). We model the non-linear matter power spectrum with the fitting formula from Takahashi et al. (2012). For the three-dimensional matter bispectrum we use the well-established formula from Gil-Marín et al. (2012). Recently, Takahashi et al. (2020) derived a more accurate prescription for the matter bispectrum, especially at highly non-linear scales, which is also the first such formula to include the impact of baryon feedback. This opens up the possibility of additional self-calibration. This will be the subject of further work, including an assessment of the consistency of the new formula with established feedback approaches for the power spectrum (Mead et al. 2020).

MODELLING OF SYSTEMATICS
In this section we discuss our parameterization of the systematic effects, in each case starting from methods which have been shown to work well for the power spectrum and extending them to the bispectrum.

Intrinsic alignment of galaxies
The observed (lensed) ellipticity of a galaxy, obs , is related to its intrinsic ellipticity, I , by (Seitz & Schneider 1997) where g = γ/(1−κ) is the reduced shear. In this equation all variables are complex numbers and g * is the complex conjugate of g. In the weak lensing regime κ 1, γ 1, so g ≈ γ and obs ≈ γ G + I .
Assuming that intrinsic ellipticities are random so that I = 0, the average ellipiticity over a large number of galaxies provides an estimate of the true gravitational shear γ G . From Eq. (9) we can construct a correlator of the ellipticities of two galaxy samples, labelled i and j, as Note that the correlators on the right-hand side of Eq. (10) are illustrative and not true correlation functions because they do not explicitly take account of the fact that shear is a spin-2 quantity. The first term of the right-hand side of Eq. (10) is the lensing signal, GG, and the fourth represents intrinsic alignment auto-correlation, II. There are two GI terms representing cross-correlations between shear and intrinsic alignment. Although we model both of these, the first will be small if z i < z j unless the two redshift distributions overlap substantially, because intrinsically-aligned galaxies at higher redshift cannot affect the lensing of galaxies at lower redshift. In a tomographic analysis we associate the labels i and j with different redshift bins.
Analogues of Eq. (3) can be used to calculate the two intrinsic alignment power spectra in the extended Limber approximation: where, as in Sect. 2.1, q (i) ( χ) is defined by Eq.
This formalism can be extended to three-point statistics. In analogy to Eq. (10) we construct a three-point correlator where again i, j and k denote galaxy samples. The four terms on the right-hand side of this equation are given by As before these are simplified illustrative correlators.
In a similar way, we can split the observed bispectrum B obs into four terms The term B

(i jk)
GGG is the lensing bispectrum defined by whereκ G is the Fourier transform of the convergence and only unique combinations of i, j and k are included. The bispectrum B (i jk) The other three terms on the right-hand side of Eq. (21) can be defined similarly, replacingκ G byκ I as appropriate. For example for GGI κ (i) and, in analogy to Eq. ( 4), where again the constituents of the equation are defined in Sect. 2.1, with B δδδ I defined by With these ingredients we can evaluate the full observed lensing bispectrum B obs in terms of B δδδ , B δδδ I , B δδ I δ I and B δ I δ I δ I . Our method is similar to that in Troxel & Ishak (2012) and Merkel & Schäfer (2014). The matter bispectrum B δδδ is determined straightforwardly from the fitting function in Gil-Marín et al. (2012), which has the form where F eff 2 are modifications of the normal perturbation theory kernels F 2 (Bernardeau et al. 2002) and P NL (k) is the non-linear matter power spectrum.
To obtain expressions for B δδδ I , B δδ I δ I and B δ I δ I δ I we adapt the linear alignment model developed by Hirata & Seljak (2004) for intrinsic alignment power spectra. This model assumes that the ellipticity of a galaxy is linearly related to the local quadrupole of the gravitational potential at the time the galaxy formed. The model is well-established for two-point statistics (Bridle & King 2007;Kirk et al. 2012) and has been found to be a good fit to direct measurements of intrinsic alignments (Singh et al. 2015;Singh & Mandelbaum 2016;Johnston et al. 2019). We adopt the so-called non-linear alignment model introduced by Bridle & King (2007) which replaces the linear power spectrum used in Hirata & Seljak (2004) with the non-linear matter power spectrum P NL (k).
Based on this approach, we relateδ I to the Fourier transform of the matter density contrast,δ G , byδ I = f IAδG , where the factor f IA has the form Here ρ cr is the critical density and D(z) is the growth factor normalised to unity at the present day. The parameter C 1 is a normalisation factor which in principle can be determined from observations or simulations. We use the value derived by Bridle & King (2007) which is 5 × 10 −14 h −4 M −2 Mpc 6 , leading to C 1 ρ cr = 0.0134 (Joachimi et al. 2011).
Our parameterization of f IA allows for uncertainty in the intrinsic alignment amplitude and possible redshift dependence through the free parameters A IA and η IA respectively. We do not model luminosity dependence but η IA acts as a proxy for any indirect redshift dependence through luminosity (Troxel et al. 2018). We set the fiducial value of η IA to be zero and take the fiducial value of A IA to be 1, consistent with recent survey results. The quantity z 0 is an arbitrary pivot value which we set to 0.3 in line with previous work (Joachimi et al. 2011;Joudaki et al. 2016).
In the two-point case the two intrinsic alignment power spectra are given by We extend this to the three-point case using tree-level perturbation theory and the fitting function from Eq. (26) to get Then integrating as in Eq. (24) gives expressions for the weak lensing intrinsic alignment bispectra. Fig. 1 shows examples of resulting bispectra for some illustrative tomographic bin combinations. This figure shows equilateral triangle bispectra obtained with five redshift bins, assuming fiducial values of the intrinsic alignment parameters A IA and η IA . The GGI bispectrum is negative and its magnitude can be almost as large as the GGG signal. The other bispectra are positive. The GII bispectrum is generally several orders of magnitude less than the GGI bispectrum, but in some bin combinations it is as much as 20% of the GGI bispectrum. The III bispectrum is always sub-dominant, which is consistent with the findings in Semboloni et al. (2008) from simulations of a survey similar to CFHTLenS. In Fig. 2 we show the relative importance of the intrinsic alignment terms compared with the pure lensing signal, plotted at two representative angular scales, = 100 and = 1000, for both the power spectrum and bispectrum. For the power spectrum all redshift bin combinations are plotted whereas for the bispectrum we show the same subset as in Fig. 1. Noting the different vertical scales in these two figures, we find that intrinsic alignment affects the power spectrum more than the bispectrum. This contrasts with the findings from simulations in Semboloni et al. (2008). These authors measured three-point aperture mass statistics and concluded that the III/GGG ratio was generally higher than the II/GG ratio. They also found that the III signal is negative whereas we find it is positive. Despite these disagreements we see no obvious reason to discard the linear alignment model which is well-established and robust for two-point statistics. The key point for our analysis is that intrinsic alignments affect the power spectrum and bispectrum differently. Even if our model is not entirely accurate in detail, conclusions based on it will still hold. We plan to revisit and validate the modelling assumptions in future work.

Redshift uncertainties
Another source of systematic uncertainty is the calibration of tomographic redshift distributions. Here we consider a single source of uncertainty due to the use of photometric redshift measurements: bias in the mean redshift of each tomographic bin. Thus we consider the effect of shifting the whole distribution of galaxies in a bin to a higher or lower redshift, without changing the shape of the distribution. This has been found to be a good proxy for the uncertainty in the distribution within a bin (Hikage et al. 2019;Hildebrandt et al. 2020a). We allow for different uncertainty and hence different shifts in each bin so that the redshift distribution, p (i) , in bin i is modelled as where p (i) obs (z) is the observed redshift distribution. The shifts in the mean, ∆z i , are treated as free parameters. A similar method for forecasting redshift uncertainties was used by Huterer et al. (2006). It is also the standard approach used for current surveys (Joudaki et al. 2016;Abbott et al. 2018;Hoyle et al. 2018;Hikage et al. 2019;Hildebrandt et al. 2020b).

Multiplicative shear bias
The final type of systematic error which we consider is multiplicative shear bias which alters the amplitude of the weak lensing signal. We ignore additive bias which can be calibrated directly on the data.
Multiplicative biases can have several quite distinct origins (Massey et al. 2012;Cropper et al. 2013;Kitching et al. 2019). For example they can arise from incorrect modelling of the point spread function, especially of its size (Cropper et al. 2013;Mandelbaum 2018;Giblin et al. 2020), or from an inappropriate galaxy surface brightness model (Miller et al. 2013). A more pervasive source of multiplicative bias is noise bias. This is an unavoidable consequence of the non-linear transformation from image pixels to ellipticity measurements and would be present even if the galaxy profile was known perfectly (Melchior & Viola 2012;Viola et al. 2014).
From Eq. (34) we can construct the two-point correlator where θ is the angle on the sky between a pair of galaxies, and in the final line we have dropped terms of order m 2 i Massey et al. 2012). An analogous expression can also be defined for ξ (i j) − (θ). Note that these correlators are again simplifications which ignore the spin-2 nature of the shear. Taking the Fourier transform leads to a similar expression for the power spectrum Similarly for three-point statistics we can write a generic correlator aŝ where we use the facts that the multiplicative factors are real and the same for both shear components. Once again this expression is a simplification which ignores the fact that shear is a spin-2 quantity.
This method of calibrating multiplicative shear bias by treating the multiplicative factors as nuisance parameters was used in two-point analyses of data from CFHTLenS (Kilbinger et al. 2013;Miller et al. 2013) and DES (Abbott et al. 2018). Fu et al. (2014) extended the method in Kilbinger et al. (2013) to analysis of three-point aperture mass statistics.

Fisher matrices and figures of merit
To investigate the impact of systematics we use the Fisher matrix (Tegmark et al. 1997). In simplified notation the elements of the Fisher matrix are defined by where D is the data vector, Cov D is the corresponding covariance matrix, and p α and p β are parameters which may be the cosmological parameters which we want to estimate or nuisance parameters associated with systematic uncertainties. In detail the matrix multiplication in Eq. (40) is a sum over all combinations of angular frequencies and tomographic bins. Eq. (40) assumes a Gaussian likelihood and that the covariance is independent of the cosmological parameters. As discussed by Carron (2013), using a parameter-dependent covariance matrix with a Gaussian likelihood would introduce a spurious term into the Fisher matrix.
We consider two different data vectors, firstly the power spectrum and then the power spectrum and bispectrum combined. In the second case the covariance matrices, including their cross-covariance, are also combined (Kayo et al. 2012). We do not consider the bispectrum alone since if the bispectrum is available then we can assume that a two-point statistic has already been measured.
The diagonal element (F −1 ) αα of the inverse Fisher matrix provides a lower bound for the variance of parameter p α after marginalising over all other parameters. Thus higher values in the Fisher matrix, or equivalently lower values in its inverse, correspond to lower uncertainty. In this work we are interested in understanding how well we must constrain nuisance parameters in order to improve estimates of the cosmological parameters. To do this we consider the effect of imposing priors on the nuisance parameters. To add a Gaussian prior with width ∆p α to parameter p α we add 1/∆p 2 α to F αα . We then use the inverse of the updated Fisher matrix to determine revised constraints on the other parameters (Tegmark et al. 1997). We use the inverse of the area of the Fisher ellipse as a figure of merit (FoM), as defined by the Dark Energy Task Force (Albrecht et al. 2006). This provides a single figure which quantifies how tightly the parameters are constrained. In the plane of the parameters p α and p β the FoM is defined as We focus on FoMs in the Ω m -σ 8 and w 0 -w a planes which are most relevant for weak lensing. The Fisher matrices and FoMs take account of the cosmological parameters defined in Sect. 3 together with the nuisance parameters defined in Sect. 4: the parameters A IA and η IA from Eq. (27), five nuisance parameters ∆z i denoting the shift in the mean value of the redshift bin centered on z i , and five parameters m i representing the multiplicative bias in each tomographic bin.
To calculate the Fisher matrices we need the derivatives of the power spectrum and bispectrum with respect to the parameters. The derivatives with respect to the intrinsic alignment and multiplicative bias parameters can be evaluated analytically but the cosmological parameters and redshift shifts require numerical derivatives for which we use a standard five-point stencil. We confirmed the accuracy of the derivative calculations by verifying that, for each parameter p α , a Gaussian distribution centred on the fiducial parameter value with variance (F αα ) −1 matches the one-dimensional posterior for p α .

Interpretation of figures of merit
We use figures of merit in several ways. Firstly, FoMs in the presence of systematics can be compared to their values when there are no systematics. This quantifies the loss of information due to the systematic uncertainties. This is particularly useful for comparing the relative importance of two different systematic effects. Secondly, we can quantify the extra information provided by the bispectrum (with or without systematics) by comparing the FoMs obtained with the power spectrum only with those obtained with the combined power spectrum and bispectrum. Finally, we can consider how the FoMs change when we alter the priors on nuisance parameters. This gives insight into the self-calibration regime where a nuisance parameter can be constrained purely from information in the survey without the need for external information to set priors, although at the expense of some loss of overall constraining power. Figure 3 shows schematically how a figure of merit changes as the prior on a parameter is changed. The values in this figure are purely for illustration. The prior values considered are typical of those in our later analysis but the vertical axis values are simply illustrative of possible improvements in the FoM compared to the FoM with a wide prior of ten. If the prior value is small (blue region in Fig. 3), the parameter is tightly constrained by the prior and further tightening of the prior does not affect it. Conversely, in the orange region the parameter is independent of the prior; this is the selfcalibration regime where the FoM can be determined purely by data from the survey. Between these two regimes, in the white area, the FoM rises rapidly as the prior is tightened. We choose to define the  self-calibration regime as the region where the FoM is improved by less than five percent of the value it has with a wide prior, although this definition is somewhat arbitrary. This level is indicated by the horizontal grey line in Fig. 3. The size of the step between the orange and blue regions indicates how strongly the FoM relies on priors outside the self-calibration region. A small step is desirable.
In Sect. 6 we consistently present results in the format of Fig.  3. Throughout we show the percentage improvement in the FoMs compared with 'base' values obtained with wide priors. In each case the self-calibration regime, as we define it, can be read off as the region where the FoM is improved by less than five percent. The boundary of this region varies from case to case.

Default priors
For our analysis we define a set of default priors to represent the baseline accuracy possible with Euclid. For redshift bin means and multiplicative bias we take the default priors to be the accuracy requirements specified in the Euclid Definition Study Report (Laureijs et al. 2011). This sets a requirement that the mean redshift should be known to at least an accuracy of 0.002 (1 + z) for each redshift bin. The corresponding accuracy requirement for multiplicative bias is also 0.002, based on shear simulations in Kitching et al. (2008b). There is no specified Euclid accuracy requirement for intrinsic alignment parameters so we take as our default a conservative value of 0.1 for both parameters.

Overview
Our main results are summarised in Fig. 4. This shows the FoMs obtained in three situations: when all systematics are present but wide priors of ten are imposed on all nuisance parameters; when default priors are imposed on each type of nuisance parameter in turn, but wide priors are imposed on the remaining parameters; and when no systematics are present -this can be considered as a baseline which exemplifies the maximum attainable information content. Table 3 provides the numerical results behind Fig. 4. Using the bispectrum can be much more beneficial than the alternative of using the power spectrum alone and imposing tight priors on the nuisance parameters. When all systematics are taken together, combining the power spectrum and bispectrum produces a 90-fold increase in the Ω m -σ 8 FoM and a nearly 20-fold increase in the Ω m -σ 8 FoM, compared with using the power spectrum alone, even when priors on all nuisance parameters are wide. This improvement can be compared with the factor of 1.6 gain obtained from the bispectrum when only statistical uncertainties are considered.
The default prior values used in Fig. 4 are mainly in the selfcalibration regions where the FoMs are insensitive to the prior. This explains why the FoMs are similar regardless of which systematic we consider here. This is especially true for the combined power spectrum and bispectrum, and less so for the power spectrum alone. We discuss this further in Sect. 6.4.
One important caveat in interpreting our results is that we have undoubtedly underestimated the constraining and self-calibration power of the power spectrum because we use only five tomographic bins throughout, as discussed in Sect. 3. Thus our assessment of the extra constraining power of the bispectrum is likely to be overoptimistic.

Effect of the bispectrum -statistical errors
The first line of each panel in Table 2 shows FoMs (or ratios of FoMs) obtained when systematic uncertainties are ignored, so only statistical errors are present. This situation has been investigated by several other authors (Kayo et al. 2012;Kayo & Takada 2013;Rizzato et al. 2019). All found that the bispectrum could improve cosmological parameter constraints: Kayo et al. (2012) estimated a 20-40% improvement in the signal to noise ratio from using the bispectrum, Kayo & Takada (2013) forecast a 60% improvement in the dark energy figure of merit and Rizzato et al. (2019) forecast an improvement in the signal to noise ratio of around 10%. In comparison we find that including the bispectrum as well as the power spectrum increases the Ω m -σ 8 FoM by around 60% and the w 0 -w a FoM by around 40%. The differences in the results can be attributed at least partly to different survey specifications and tomographic set-ups.

Effect of the bispectrum -systematic errors
The remainder of Table 2 shows the impact of systematic uncertainties on the two FoMs, assuming wide priors on all the nuisance parameters. Intrinsic alignments have the most deleterious effect. With the power spectrum only, the presence of intrinsic alignment nuisance parameters reduces the Ω m -σ 8 FoM by a factor of more than 300, and the w 0 -w a FoM by a factor of 400. Multiplicative bias is relatively harmless, although certainly not negligible. Again considering the power spectrum only, multiplicative bias causes both FoMs to fall by around 20 − 25%. The effect of redshift uncertainty is intermediate, reducing the Ω m -σ 8 FoM by a factor of around ten and the w 0 -w a FoM by a factor of 35. Even with wide priors on all the nuisance parameters the bispectrum is hugely helpful in counteracting the effect systematic uncertainties. This is largely because the bispectrum reduces the impact of intrinsic alignments, which affect the power spectrum and bispectrum very differently as seen in Sect. 4.1. However the bispectrum also considerably offsets the effect of uncertainty in redshift bin means.

Self-calibration
We next investigate the effect on the FoMs of tightening or relaxing the priors on the nuisance parameters and hence the potential for self-calibration. Figure 5 shows the effect of varying priors on the intrinsic alignment parameters, with fixed priors equal to the Euclid requirements imposed on all other parameters. The horizontal grey lines in each panel indicate our definition of self-calibration discussed in Sect. 5.2. The self-calibration regime is the region to the right of the point where these lines cross the orange or blue lines.
When only the power spectrum is considered the self-calibration regime starts when the priors on A IA and η IA are about 0.5. Using the bispectrum as well, the self-calibration regime extends to a prior value of about 0.1 for A IA , but does not change for η IA . The bottom panel of Fig. 5 shows that if priors on both intrinsic alignment parameters are tightened simultaneously, the selfcalibration requirements are around 0.5 if only the power spectrum is considered. In contrast, when the power spectrum and bispectrum are combined, the self-calibration point is around 0.05, within our default prior of 0.1.
In all three panels of Fig. 5 the size of the step between the selfcalibration regime and the regime where the FoM is controlled by the priors is much smaller when the bispectrum and power spectrum are combined. This means that even outside the self-calibration regime the bispectrum massively reduces the requirement for tight external priors and the degradation of the FoM within the self-calibration is much less than for the power spectrum only.
For the power spectrum, the Ω m -σ 8 figure of merit is most sensitive to the amplitude parameter A IA and the w 0 -w a figure of merit is most sensitive to the the redshift exponent η IA . This is as expected: Ω m , σ 8 and A IA have confounding effects on the amplitude of the weak lensing signal, whereas the dark energy parameters are highly sensitive to redshift uncertainty.
We next consider redshift uncertainty and multiplicative bias, setting fixed priors of 0.1 on the two intrinsic alignment parameters. The prospects for self-calibration are shown in Fig.6. Once again we indicate our criterion for self-calibration by horizontal grey lines. Vertical dashed lines indicate the Euclid accuracy requirements. For redshift bin means, if the power spectrum only is used the selfcalibration regime starts at a prior of around 0.1 for both FoMs. This means that the only way to improve cosmological parameter constraints is through narrow external priors. In contrast, if the bispectrum is also used then the boundary of the self-calibration regime changes to about 0.001, within the Euclid requirement. A similar pattern is seen for the multiplicative bias parameters. When only the power spectrum is used the boundary of the self-calibration regime is at a value of about 0.3, again implying that tight external priors are needed to improve parameter constraints. If the bispectrum is used as well self-calibration starts at around 0.005, just outside the Euclid requirement, except in the case of the Ω m -σ 8 FoM where the self-calibration boundary is almost exactly at the Euclid requirement.
In Fig. 7 we explore the joint effect of priors on redshift and multiplicative bias parameters, together with the effect of using the bispectrum as well as the power spectrum. This figure shows the ratio between the FoM obtained with the combined power spectrum and bispectrum with the default prior of 0.1 imposed on the intrinsic alignment parameters but varying priors on the other parameters, and the FoM obtained with the power spectrum only and priors of 0.1 for the intrinsic alignment parameters and 0.002 for the redshift and multiplicative bias parameters. Thus the panels show, for each FoM, the improvement from using the bispectrum compared with the baseline Euclid scenario with the power spectrum only and default priors. The grey stars indicate the default values of the redshift and multiplicative bias priors. At these points the Ω m -σ 8 FoM is around 65 times greater than the baseline Euclid value and the w 0 -w a FoM is around 13 times greater. Thus including the bispectrum as well as the power spectrum produces a large gain compared with any   (Laureijs et al. 2011). The horizontal grey lines indicate a 5% improvement in the FoM. An improvement less than this is our criterion for self-calibration. Note different vertical scales in each panel. further tightening of the priors with the power spectrum only. This is true even when the redshift and multiplicative bias priors are greatly relaxed. Figure 7 also shows that there is little interaction between the redshift parameters on the one hand and the multiplicative bias parameters on the other. Thus there is only limited opportunity for trade-offs between the accuracy of the two sets of parameters. Appendix D contains further detail about the effect of varying priors on the redshift and multiplicative bias parameters.

CONCLUSIONS
In the context of a Euclid-like tomographic weak lensing survey we have considered three major sources of systematic uncertainty: contamination by intrinsic alignments which adds additional terms to the cosmic shear power spectrum and bispectrum; uncertainty in the mean redshifts of the tomographic bins due to the use of photometric redshift measurements; and multiplicative bias which affects the amplitude of the shear signal. We modelled the effects of these systematics on the weak lensing bispectrum by extending existing methods which are well-tested for the power spectrum and which have been used to analyse data from current weak lensing surveys.
We used figures of merit based on Fisher matrices to forecast the effect of these systematics on parameter constraints, focusing in particular on the large-scale structure parameters Ω m and σ 8 and the dark energy equation of state parameters w 0 and w a . Whether we consider the power spectrum only or the combined power spectrum and bispectrum, the presence of systematic uncertainties causes an We compared two strategies for combatting this loss of information. The first approach rests on using the power spectrum only and imposing tight priors on the systematic nuisance parameters, informed by external calibration data or simulations. This is what is normally done in weak lensing analysis. In our analysis we assume that the external calibration meets requirements in the Euclid Definition Study Report (Laureijs et al. 2011), where these exist. The second strategy involves analysing the bispectrum alongside the power spectrum. We find that this greatly reduces the impact of systematic uncertainties, especially intrinsic alignments which, with our modelling assumptions, contribute at different levels to the power spectrum and bispectrum.
Thus much more can be gained by using the bispectrum than by setting tight priors but using only the power spectrum. This is true even though our analysis is based on a 'cut down' bispectrum which depends only on equilateral triangles. Using more triangle configurations could be expected to produce even greater gains. Our results are also conservative because we used only a limited number of tomographic bins. Increasing the number of bins would increase the constraining power from both the power spectrum and the combined bispectrum and power spectrum. The relative gain from the bispectrum might be then smaller but we would still expect a substantial improvement.
In all the cases which we have considered, combining the bispectrum with the power spectrum improves self-calibration power: the self-calibration regime starts at a smaller prior value than with the power spectrum alone and there is less degradation in the figures of merit in the self-calibration regime. For redshift and multiplicative bias uncertainties, the self-calibration regime for a combined power spectrum-bispectrum analysis starts near or within Euclid accuracy requirements. For intrinsic alignment parameters, where there are no specified Euclid requirements, self-calibration starts close to our conservatively chosen default prior values. It is important to recognize, however, that the added constraining power due to the bispectrum will lead to tighter accuracy requirements on all nuisance parameters if systematic errors are to be kept well below statistical errors.
These results are encouraging and we plan to explore several aspects further. Firstly, we intend to validate our intrinsic alignment modelling by revisiting the analysis in Semboloni et al. (2008) using state-of-the-art simulations such as the Euclid Flagship Mock Galaxy Catalogue 6 , or survey data such as the Dark Energy Survey Instrument Bright Galaxy Survey 7 (Levi et al. 2019). Secondly, as discussed in Sect. 3, we will review the new formula for the matter bispectrum derived by Takahashi et al. (2020) to improve our bispectrum modelling and investigate the potential for further selfcalibration. Finally, we intend to explore the performance of weak lensing three-point statistics which are readily derived from real data, such as aperture mass statistics (Schneider et al. 1998). Extending our work in these ways will help to confirm the practical value of using three-point statistics to control systematics in Euclid and other next-generation weak lensing surveys. their respective bispectrum calculations, and Stephen Stopyra for advice on using M .

DATA AVAILABILITY
The data underlying this article will be shared on reasonable request to the corresponding author.

APPENDIX A: ORIGIN OF TERMS IN THE MATTER POWER SPECTRUM AND BISPECTRUM COVARIANCE
Ifδ is the Fourier transform of the underlying density contrast, then an estimator for the power spectrum will involve the product of two Fourier modesδδ. Similarly a bispectrum estimator involves δδδ. From this we can use Wick's theorem to understand the corresponding covariances. The power spectrum covariance has two terms. Schematically, one term involves δδ δδ , the product of two power spectra, and the other involves the connected four-point function or trispectrum, δδδδ c . The bispectrum covariance has terms involving δδ δδ δδ , δδδ c δδδ c , δδδδ c δδ and δδδδδδ c , and the power spectrum-bispectrum cross-covariance involves δδ δδδ c and δδδδδ c .
Terms which depend only on the power spectrum are referred to as Gaussian. If the underlying field was Gaussian then these would be the only non-zero parts of the covariance. The cross-covariance has no Gaussian terms. The remaining, non-Gaussian, terms generate non-zero offdiagonal elements. They arise from mode coupling either between small-scale modes within the survey window (in-survey covariance) or between in-survey modes and long-wavelength modes longer than the survey window dimension (supersample covariance). Supersample covariance is generated by the four-point correlator in the power spectrum covariance and the six-point correlator in the bispectrum covariance.

APPENDIX B: MATTER POWER SPECTRUM AND BISPECTRUM SUPERSAMPLE COVARIANCE
Background modes which cause supersample covariance are essentially constant across the survey footprint, so their effect can be equated to a change in the mean density within the survey region. Supersample covariance can thus be thought of as the response of the power spectrum and bispectrum to a long-wavelength mode δ b Takada & Hu 2013;Li et al. 2014a,b;Barreira et al. 2018;Chan et al. 2018;Lacasa 2018).
In this view the power spectrum supersample covariance is (Takada & Hu 2013) where σ 2 W is the variance of the long-wavelength background mode within the survey window, given by Here V W is the volume defined by the survey window,W is the Fourier transform of the survey window function and P L is specifically the linear power spectrum because the long-wavelength mode is in the linear regime. The power spectrum P may be in the linear or nonlinear regime. Similarly the bispectrum supersample covariance is (Chan et al. 2018) and the power spectrum-bispectrum supersample cross-covariance is Again, the bispectrum B can be in the linear or non-linear regime. The response functions in Eqs. (B1), (B3) and (B4) can be derived using the halo model (Cooray & Hu 2001;Cooray & Sheth 2002) together with perturbation theory (Bernardeau et al. 2002).
The halo model is based round the integrals Here M(z) is the halo mass, n(M, z) is the number density of halos, f (M, z) ≡ dn/dM is the halo mass function,ũ M (k) is the Fourier transform of the halo density profile, µ is the number of points being correlated, and b β (M) is the halo bias. We assume a Navarro-Frenk-White halo matter density profile (Navarro et al. 1997) and the mass-concentration relation given in Duffy et al. (2008). We use results from Tinker et al. (2008) to model the halo mass function and halo bias. The bias quantifies the β-th order response of the halo mass function to the long-wavelength mode δ b (Mo & White 1996;Schmidt et al. 2013 We assume linear bias so that b 0 = 1, b 1 = b(M) and b β = 0 for β > 1.
The halo model expression for the power spectrum is giving where we assume that the one-halo term I 1 1 (k) is not affected by the background mode δ b (Chiang et al. 2014). We further assume that in the presence of δ b the halo mass function changes from f (M, z) to (1 + δ b ) f (M, z) but the halo profile does not change (Schaan et al. 2014), so that Substituting into Eq. (B9) gives and Eq. (B8) becomes Thus we need the response of the modulated power spectrum P(k |δ b ) to the background fluctuation δ b . This has been derived in several ways, including from perturbation theory and consistency relation arguments (Takada & Hu 2013); a separate universe approach (Li et al. 2014a); and from the position-dependent power spectrum and integrated bispectrum (Chiang et al. 2014). The resulting expression is Substituting into Eq. (B13) gives the halo model power spectrum The bispectrum response can be derived in a similar way by expressing the bispectrum as the sum of 1-halo, 2-halo and 3-halo terms.
Here B PT is the tree-level matter bispectrum given by B PT (k 1 , k 2 , k 3 ) = 2 F 2 (k 1 , k 2 )P(k 1 )P(k 2 ) (B20) + F 2 (k 2 , k 3 )P(k 2 )P(k 3 ) where the symmetrised mode-coupling kernel F 2 is (Bernardeau et al. 2002) F 2 (k 1 , k 2 ) = 5 7 + 1 2 Thus we need the response of the tree-level bispectrum to the long mode. Chan et al. (2018) used perturbation theory to obtain where B G is identical to B PT but with the density coupling function F 2 replaced by its velocity counterpart G 2 : For equilateral triangles we use the computer algebra package M 8 and Eq. (B22) to obtain An alternative way to obtain the bispectrum response is to extend the concept of the position-dependent power spectrum developed by Chiang et al. (2014) to three-point statistics (Adhikari et al. 2016;Munshi & Coles 2017). We use this method, for equilateral triangles only, as a check on the validity of Eq. (B22).
We define the position dependent bispectrum as B(k 1 , k 2 , k 3 , r)δ(r) , the correlation between the bispectrum measured in a sub-volume of the survey V L , with length scale L and centred at position r, and the mean density contrast at r. It is equivalent to an integrated trispectrum and can be expressed as whereW L (q) is the Fourier transform of the sub-volume window function which we take to be 1 within the sub-volume and 0 otherwise. Following Adhikari et al. (2016) we make the assumption that the trispectrum is dominated by the squeezed limit in which one wavevector is much smaller than the other three. Eq. (B26) can then be simplified through the bispectrum triangle condition k 1 + k 2 + k 3 = 0 and a change of variables to get Averaging over the solid angles Ω 12 and Ω 13 between two pairs of wavevectors and fixing the direction of one k vector results in This derivation depends critically on the assumption that the squeezed-limit configuration dominates the trispectrum. Strictly the result is only valid for trispectra which depend only on four items, say four sides of a quadrilateral or three sides and a diagonal (see the discussion in Adhikari et al. 2016). Nevertheless we take the squeezed-limit assumption it to be a reasonable approximation. We proceed to show that in the equilateral case Eq. (B29) has the form for some function f (k), where σ 2 L is the variance of the density field on the sub-volume scale, given by It then follows that iT(k, k) measures how the equilateral bispectrum responds to a large-scale density fluctuation with variance σ 2 L . To evaluate the trispectrum on the right-hand side of Eq. (B29) we use perturbation theory. The general trispectrum T PT (k 1 , k 2 , k 3 , k 4 ) can be expressed as (Bernardeau et al. 2002;Pielorz et al. 2010) where T a is the sum of 12 terms like and T b is the sum of four terms of the form F 3 (k 1 , k 2 , k 3 )P(k 1 )P(k 2 )P(k 3 ). Here F 3 is the symmetrised third-order coupling kernel: with k i j = |k i + k j |. We use this formulation and the computer algebra package M to derive an approximation for the squeezed trispectrum with three equal short modes and one long mode q. In this configuration three terms of T a are zero in the limit q → 0 because F 2 (k, −k) = 0 and one term of We set |k i | = k for i = 1, 2, 3 and k i · k j = k 2 /2 for i j, write P (k) = ∂P(k)/∂k, and Taylor-expand all terms in T a and T b to first order in q/k. This leads to T equi a = 9P(k) 2 P(q) 98k 2 q 2 17(k · q) 2 + 42k 2 k · q + 32k 2 q 2 (B35) − 9P(k)P (k)P(q) 98k 3 q 2 (k · q) 3 + 14k 2 (k · q) 2 + 6k 2 q 2 , 84k 2 q 2 106(k · q) 2 − 216k 2 k · q + 53k 2 q 2 (B36) 126k 3 q 2 70(k · q) 3 − 216k 2 (k · q) 2 + 93k 2 q 2 , Although T equi a and T equi b include terms in k · q/q 2 which are divergent as q → 0, these cancel out in the final expression for T equi PT . Finally, ignoring terms which vanish as q → 0, we get T equi PT = P(q)P(k) ∂k .
We substitute this into Eq. (B29) and take the average over the solid angle Ω between k and q. The angular average of (k · q) 2 /k 2 q 2 is 1/3 and so we get Power spectrum response This is close to but not identical with the result for equilateral triangles obtained by Chan et al. (2018), Eq. (B24). We would not expect exact agreement given the approximations we have made and the fact that our results are only correct to first order. Nevertheless our final expression broadly confirms Eq. (B22) for equilateral bispectra. We therefore use Eq. (B22) in our work since it is more complete and general, substituting into the halo model expression Eq. (B19).
We note however that both results are quite different from the expression obtained by Adhikari et al. (2016) It is difficult to determine the source of the discrepancies since all three derivations rely on M (private communications) and intermediate steps are not transparent. Figure B1 illustrates our calculated matter response functions evaluated using modelling assumptions from Sect. 3. This figure shows the individual halo model terms and the total response but excludes the dilation terms -the −1/3 terms in Eqs. (B14) and (B22) -which are consistently small as also found by Li et al. (2014a) and Chan et al. (2018).

APPENDIX C: WEAK LENSING COVARIANCE
In this appendix we give expressions for the components of the convergence power spectrum and bispectrum covariance for a single redshift bin. The power spectrum-bispectrum cross-covariance can easily be derived in a similar way (Kayo & Takada 2013;Rizzato et al. 2019). Further details and derivations can be found in Takada & Jain (2009), Kayo & Takada (2013) and Sato & Nishimichi (2013). Rizzato et al. (2019) give all permutations of terms in these covariances for a tomographic survey.
We assume a survey with area Ω s in steradians and consider angular bins of width ∆ i centred on the values i . Thus l i − ∆ /2 ≤ | i | ≤ l i + ∆ /2. For simplicity we omit noise terms in this appendix but in a real survey the intrinsic ellipticity of galaxies will always induce shape noise. In the main part of this paper we always implicitly include shape noise in the Gaussian terms of the covariances (both the power spectrum and the bispectrum). We assume this noise is Gaussian, so the observed power spectrum between redshift bins i and j has the form where δ K i j is the Kronecker delta, σ 2 is the total intrinsic ellipticity dispersion, which we take to be 0.35, andn i is the galaxy number density in redshift bin i.

Gaussian covariance
The Gaussian part of the convergence power spectrum covariance is where δ K 1 2 is the Kronecker delta which is 1 if 1 = 2 and 1 is within the bin width ∆ 1 , and zero otherwise. N pairs ( 1 ) is the number of independent pairs of modes within the bin width.
The Gaussian part of the convergence bispectrum covariance is is the number of triplets of modes which form triangles of side lengths 1 , 2 and 3 within the specified bin widths ∆ i .

In-survey non-Gaussian covariance
The in-survey non-Gaussian part of the convergence power spectrum covariance is where T is the convergence trispectrum. The integrals are over all wavevectors which are within the bin width ∆ around or . The in-survey non-Gaussian part of the convergence bispectrum covariance involves similar integrals over the angular bins. However it can be considerably simplified by making use of triangle conditions and other reasonable assumptions such as that the trispectrum does not vary much within the bin width. Full details can be found in Appendix A of Kayo & Takada (2013). The resulting simplified expression is where P 6 is the six-point function or pentaspectrum. The triangle conditions 1 + 2 + 3 = 0 and 4 + 5 + 6 = 0 mean that the P 6 term depends automatically on two triangles. The only remaining freedom is the angle ψ between any two wavevectors, one in each of these triangles, so we can replace the integrals over i with integrals over ψ (Kayo & Takada 2013).

Supersample covariance
Using the standard Limber approximation and assuming a spatially flat universe, the weak lensing power spectrum supersample covariance for a single redshift bin is (Takada & Hu 2013) where χ lim is the maximum comoving distance of the survey, q( χ) is the lensing weight function given by Eq.
Similarly the weak lensing bispectrum supersample covariance is (Kayo & Takada 2013) Cov[B( 1 , 2 , 3 ), B( 4 , 5 , 6 )] SSC = 1 Ω s ∫ χ lim 0 d χ q 6 ( χ) χ −10 σ 2 W ( χ) ∂B δ ( 1 / χ, 2 / χ, 3 / χ; χ) ∂δ b × ∂B δ ( 4 / χ, 5 / χ, 6 / χ; χ) ∂δ b . Figure C1 shows the diagonal Gaussian, in-survey non-Gaussian and supersample terms of the weak lensing power spectrum and bispectrum covariance and their cross-covariance across a range of angular scales, calculated for a single redshift bin for a 15 000 deg 2 survey. For the bispectrum and cross-covariance we show results for equilateral configurations only. For the power spectrum and bispectrum Gaussian terms dominate except at small scales where the supersample terms are large. The supersample term dominates the cross-covariance, which has no Gaussian term. The results are consistent with other similar numerical calculations for Euclid-like surveys (Barreira et al. 2018;Rizzato et al. 2019). Figure C2 splits the supersample terms of the equilateral-triangle bispectrum covariance and the power spectrum-bispectrum crosscovariance into their one-halo, two-halo and three-halo components. The one-halo term is dominant for > 50 in the bispectrum covariance and at all scales in the cross-covariance. At larger scales the three-halo term dominates the bispectrum covariance. However this is inconsequential since the signal-to-noise ratio in this regime is very small. Figure D1 is a different presentation of the information in Fig. 6. It compares the FoMs obtained with the combined bispectrum and power spectrum with those obtained with the power spectrum only, in each case imposing default priors of 0.1 on the intrinsic alignment parameters and 0.002 on all other nuisance parameters. Figure D2 the effect of varying priors on redshift bin means individually, for the the power spectrum only and the combined power spectrum and bispectrum. The vertical dashed lines indicate the redshift accuracy requirement from the Euclid Definition Study Report (Laureijs et al. 2011). The horizontal grey lines indicate a 5% improvement in the FoM; an improvement less than this is our criterion for self-calibration. With only the power spectrum, the selfcalibration regime starts at a prior value of around 0.1 for every bin. The bispectrum is more sensitive to redshift and when it is also used the self-calibration point increases with redshift, from roughly 0.001 for the nearest bin to 0.1 for the furthest. Nevertheless self-calibration mainly starts at smaller values than for the power spectrum only. In all cases the height of the step between the self-calibration regime and the region where the FoM is fixed by the prior is less when the bispectrum is combined with the power spectrum, reducing reliance on external priors. This paper has been typeset from a T E X/L A T E X file prepared by the author.   The horizontal grey lines indicate a 5% improvement in the FoM. An improvement less than this is our criterion for self-calibration. Note different vertical scales in each panel.