Noise reduction for weak lensing mass mapping: an application of generative adversarial networks to Subaru Hyper Suprime-Cam first-year data

Masato Shirasaki, Kana Moriwaki, Taira Oogi, Naoki Yoshida, Shiro Ikeda, Takahiro Nishimichi

Monthly Notices of the Royal Astronomical Society, Volume 504, Issue 2, June 2021, Pages 1825–1839, https://doi.org/10.1093/mnras/stab982
ABSTRACT
We propose a deep-learning approach based on generative adversarial networks (GANs) to reduce noise in weak lensing mass maps under realistic conditions. We apply image-to-image translation using conditional GANs to the mass map obtained from the first-year data of the Subaru Hyper Suprime-Cam (HSC) Survey. We train the conditional GANs using 25 000 mock HSC catalogues that directly incorporate a variety of observational effects. We study the non-Gaussian information in denoised maps using one-point probability distribution functions (PDFs) and also perform a matching analysis for positive peaks and massive clusters. An ensemble learning technique with our GANs is successfully applied to reproduce the PDFs of the lensing convergence. About 60 per cent of the peaks in the denoised maps with height greater than 5σ have counterparts of massive clusters within a separation of 6 arcmin. We show that the PDFs in the denoised maps are not compromised by details of multiplicative biases and photometric redshift distributions, nor by shape measurement errors, and that the PDFs show a stronger cosmological dependence than the noisy counterpart. We apply our denoising method to a part of the first-year HSC data and show that the observed mass distribution is statistically consistent with the prediction of the standard ΛCDM model.
1 INTRODUCTION
Impressive progress has been made in observational cosmology over the past decades. An array of multi-wavelength astronomical data has established the standard model of our universe, referred to as the ΛCDM model, with precise determinations of the major cosmological parameters. The nature of the main energy contents of our universe remains unknown, however. An invisible mass component called dark matter is needed to explain the formation of large-scale structures in the universe (Clowe, Gonzalez & Markevitch 2004; Ade et al. 2016), and an exotic form of energy appears to be responsible for the accelerating expansion of the present-day universe (Huterer & Shafer 2018).
With the important aim of revealing the nature of dark matter and the late-time cosmic acceleration, a number of astronomical surveys are ongoing or planned. Accurate measurement of cosmic lensing shear signals is one of the primary goals of such galaxy surveys, including the Kilo-Degree Survey (KiDS), the Dark Energy Survey (DES), and the Subaru Hyper Suprime-Cam Survey (HSC), as well as of upcoming projects including the Nancy Grace Roman Space Telescope (Roman), the Legacy Survey of Space and Time at the Vera C. Rubin Observatory (LSST), and Euclid.
The large-scale matter distribution in the universe can be reconstructed through measurements of lensing shear signals by collecting and analysing a large set of galaxy images (Tyson, Wenk & Valdes 1990; Kaiser & Squires 1993; Schneider 1996). Although the image distortion of individual galaxies is typically very small, it is possible to infer the underlying matter density distribution in an unbiased way by averaging over many galaxies. However, there are well-known practical challenges in extracting rich cosmological information from the reconstructed matter distribution. The non-linear gravitational growth of the large-scale structure renders the statistical properties of the weak lensing signal complicated. Numerical simulations have shown that popular and powerful statistics for Gaussian random fields, such as the power spectrum, are not able to fully describe the cosmological information imprinted in weak lensing maps (Jain, Seljak & White 2000; Hamana & Mellier 2001; Sato et al. 2009). To extract and utilize this so-called non-Gaussian information, various approaches have been proposed (e.g. Matsubara & Jain 2001; Sato et al. 2001; Zaldarriaga & Scoccimarro 2003; Takada & Jain 2003; Pen et al. 2003; Jarvis, Bernstein & Jain 2004; Wang, Haiman & May 2009; Dietrich & Hartlap 2010; Kratochvil, Haiman & May 2010; Fan, Shan & Liu 2010; Shirasaki & Yoshida 2014; Lin & Kilbinger 2015; Petri et al. 2015; Coulton et al. 2019; Schmelzle et al. 2017; Gupta et al. 2018; Ribli, Pataki & Csabai 2019), but unfortunately no single statistic can capture the full information.
In practice, an observed weak lensing map is contaminated with noise arising from intrinsic galaxy properties and observational conditions. The former is commonly called shape noise, and it compromises the underlying physical signal. It is known that the noise can be robustly estimated and mitigated for a Gaussian field (Hu & White 2001; Schneider et al. 2002), but little is known about its overall impact on a non-Gaussian field. Non-Gaussian information can potentially be a powerful probe to test the ΛCDM model and variant cosmological models even in the presence of shape noise (Shirasaki et al. 2017a; Liu & Madhavacheril 2019; Marques et al. 2019). For instance, Zorrilla Matilla et al. (2020) show that cosmological inference based on convolutional neural networks relies on the information carried by high-density regions, where the noise is less important. Clearly, it is important to devise a noise reduction method in order to maximize the science return from ongoing and future wide-field lensing surveys.
A straightforward way of mitigating the shape noise is to smooth a weak lensing map over a large angular scale [e.g. ∼20–30 arcmin in Vikram et al. (2015) and Chang et al. (2018)], but the smoothing itself also erases the non-Gaussian information in the map (Jain et al. 2000; Taruya et al. 2002). A novel approach has been proposed to keep a high angular resolution of ∼1 arcmin while preserving non-Gaussian information (Shirasaki, Yoshida & Ikeda 2019b). The method is based on a deep-learning framework called conditional generative adversarial networks (GANs; Isola et al. 2016). Thanks to the expressive power of deep neural networks, conditional GANs can denoise a weak lensing map on a pixel-by-pixel basis (see also Jeffrey et al. 2020 and Remy et al. 2020 for similar studies with deep-learning methods). In exchange for their versatility, deep-learning methods need to be validated thoroughly before being applied to real observational data. However, the validation of deep learning for lensing analyses has not been fully explored so far. To examine and improve the capability of denoising with deep learning, we need to study the statistical properties of weak lensing maps denoised with conditional GANs.
In this paper, we construct and test conditional GANs and apply them to the real galaxy imaging data from the Subaru HSC survey (Aihara et al. 2018). We use a large set of realistic mock HSC catalogues (Shirasaki et al. 2019a) to train the GANs. We test the denoising method using 1000 test data sets and assess possible systematic errors in the denoising process. We investigate the non-Gaussian information in the denoised maps using realistic simulations of gravitational lensing. For the first time, we evaluate generalization errors in the denoising process by varying several characteristics of the mock HSC catalogues. After this stress testing, we apply our GANs to the real HSC data and study the cosmological implications of the reconstructed large-scale matter distribution.
The rest of the present paper is organized as follows. In Section 2, we summarize the basics of gravitational lensing. Section 3 describes the HSC data as well as our numerical simulations used for training and testing GANs. In Section 4, we explain the details of our training strategy of GANs. Before applying the GANs to the real HSC data, we perform thorough tests. We present the results in Section 5. In Section 6, we show the denoised map for the HSC data. Concluding remarks and discussions are given in Section 7.
2 WEAK GRAVITATIONAL LENSING
2.1 Basics
2.2 Smoothed lensing convergence map
3 DATA
3.1 Subaru HSC Survey
HSC is a wide-field imaging camera installed at the prime focus of the 8.2-m Subaru telescope (Miyazaki et al. 2015; Aihara et al. 2018; Furusawa et al. 2018; Komiyama et al. 2018; Miyazaki et al. 2018). The Wide layer of the HSC survey will cover 1400 deg² in five broad photometric bands (grizy) over its 5-yr operation, with superb image quality of sub-arcsec seeing. In this paper, we use a galaxy shape catalogue that was produced for cosmological weak lensing analysis in the first-year data release (HSC S16A hereafter). Details of the galaxy shape measurements and catalogue information are found in Mandelbaum et al. (2018a).
In brief, the HSC S16A galaxy shape catalogue is made from the HSC Wide-layer data taken from 2014 March to 2016 April over 90 nights. We use the same set of galaxies as in Mandelbaum et al. (2018a) to construct a 'secure' shape catalogue for weak lensing analysis. The sky areas around bright stars are masked (Coupon et al. 2018). The HSC S16A weak lensing shear catalogue covers 136.9 deg², consisting of six disjoint patches: XMM, GAMA09H, GAMA15H, HECTOMAP, VVDS, and WIDE12H. Among the six patches, we choose the XMM field as the main sample in this paper because publicly available catalogues of galaxy clusters exist there in the optical (Oguri et al. 2018) and X-ray (Adami et al. 2018) bands. We can use the cluster catalogues to examine the reliability of our denoising process by performing object-by-object matching.
In the HSC S16A shape catalogue, the galaxy shapes are estimated using the re-Gaussianization PSF correction method applied to the i-band coadded images (Hirata & Seljak 2003).
In the XMM region, the survey window is defined such that (1) the number of visits within HEALPix pixels with NSIDE=1024 is at least (4, 4, 4, 6, 6) in the (g, r, i, z, y) bands and the i-band limiting magnitude is greater than 25.6, (2) the PSF modelling is sufficiently good to meet our requirements on PSF model size residuals and residual shear correlation functions, (3) there are no disconnected HEALPix pixels after cuts (1) and (2), and (4) the galaxies do not lie within the bright object masks. For details of these masks, see Mandelbaum et al. (2018a).
The redshift distribution of the source galaxies is estimated from the HSC five-band photometry. Tanaka et al. (2018) measure photometric redshifts (photo-z's) of the galaxies in the HSC survey using several different methods. Among them, we choose the photo-z from a machine-learning code based on self-organizing maps (mlz) as our baseline. To study the impact of photo-z estimation with different methods, we consider two additional photo-z's estimated with a classical template-fitting code (mizuki) and a hybrid code combining machine learning with template fitting (frankenz). For our analysis, we select the source galaxies by the best estimates (see Tanaka et al. 2018) of their photo-z's (zbest) in the redshift range from 0.3 to 1.5, as done in the main cosmological analyses of the HSC S16A data (Hikage et al. 2019). For a given photo-z estimation method, each HSC galaxy is assigned a posterior probability distribution function (PDF) of redshift. Fig. 1 shows the stacked PDFs for the source galaxies in the XMM field. The mean source redshifts are 0.96, 1.01, and 1.01 for the estimates by mlz, mizuki, and frankenz, respectively.

Figure 1. The stacked photometric redshift distributions for the galaxies in the XMM field. The line with points shows the estimate by our baseline method (mlz), while the yellow solid and black dashed lines show the results for frankenz and mizuki, respectively.
We then reconstruct the smoothed convergence field from the HSC S16A data as described in Section 2.2. Adopting a flat-sky approximation, we first create a pixelized shear map for the XMM field on regular grids with a grid size of 1.5 arcmin. We then apply a fast Fourier transform (FFT) and perform the convolution in Fourier space to obtain the smoothed convergence field. Note that we limit the maximum number of grids on a side to 256 in our analysis; it is currently still computationally expensive to train GANs on larger images with decent computing resources [see Brock, Donahue & Simonyan (2018) for a recent attempt]. Since our aim here is to analyse lensing convergence maps at an arcmin resolution, the pixel size is set to ∼1 arcmin. We will analyse observational data for a larger region in future work. Our survey window covers the ranges [30.9, 37.3] deg and [−7.29, −0.89] deg in right ascension (RA) and declination (Dec.), respectively. There are 1 345 810 source galaxies available with the photo-z estimates by mlz, while we find 1 345 541 and 1 342 017 objects for the selections based on mizuki and frankenz, respectively. Note that the selection of source galaxies depends on how the photo-z is estimated.
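To make this reconstruction step concrete, the sketch below performs a flat-sky Kaiser-Squires inversion of a pixelized shear map and applies a Gaussian smoothing kernel in Fourier space. The function name, the default smoothing scale, and the kernel convention are our own illustrative assumptions, not the exact pipeline settings.

```python
import numpy as np

def smoothed_convergence(g1, g2, pix_arcmin=1.5, theta_s_arcmin=2.0):
    """Sketch of the flat-sky reconstruction: a Kaiser-Squires
    inversion of the pixelized shear map followed by Gaussian
    smoothing, both performed in Fourier space. The smoothing scale
    and the kernel form (exp(-theta^2/theta_s^2) in real space) are
    assumptions, and the map is taken to be square."""
    n = g1.shape[0]
    ell = 2 * np.pi * np.fft.fftfreq(n, d=np.deg2rad(pix_arcmin / 60.0))
    l1, l2 = np.meshgrid(ell, ell, indexing="ij")
    lsq = l1**2 + l2**2
    lsq[0, 0] = 1.0                               # avoid division by zero
    g1_f, g2_f = np.fft.fft2(g1), np.fft.fft2(g2)
    # E-mode Kaiser-Squires inversion: kappa(l) from the spin-2 shear.
    kappa_f = ((l1**2 - l2**2) * g1_f + 2 * l1 * l2 * g2_f) / lsq
    theta_s = np.deg2rad(theta_s_arcmin / 60.0)
    kappa_f *= np.exp(-0.25 * lsq * theta_s**2)   # Gaussian kernel in l
    kappa_f[0, 0] = 0.0                           # remove the mean mode
    return np.real(np.fft.ifft2(kappa_f))
```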
In actual observations, galaxy shear data are missing in regions covered by bright star masks, and the observed regions have complex geometry. Applying our method directly to such regions would likely generate additional noise (Shirasaki, Yoshida & Hamana 2013). We determine the mask regions for each convergence map by using the smoothed number density map of the input galaxies, computed with the same smoothing kernel as in equation (10). We then mask all pixels where the smoothed galaxy number density is less than 0.5 times the mean number density. After masking, the data region covers 21.4 deg².
3.2 Mock HSC observations
We use a large set of simulation data for training our conditional GANs. Table 1 summarizes our mock simulations.
Table 1. Summary of our mock catalogues for the Subaru Hyper Suprime-Cam Survey first-year data. For the Cosmology-varied run, we have 50 realizations of mock catalogues for each of the 100 cosmological models (parameter sets).

| Name | Number of realizations | Cosmology | Note | Reference |
|---|---|---|---|---|
| Fiducial | 2268 | WMAP9 cosmology (Hinshaw et al. 2013) | Photo-z info by mlz | Section 3.2.1 |
| Photo-z run 1 | 100 | – | Photo-z info by mizuki | Section 3.2.2 |
| Photo-z run 2 | 100 | – | Photo-z info by frankenz | – |
| Multiplicative-bias run 1 | 100 | – | Change 〈mb〉 by +0.01 | Section 3.2.3 |
| Multiplicative-bias run 2 | 100 | – | Change 〈mb〉 by −0.01 | – |
| Noise-varied run 1 | 100 | – | Change σmea by +10 per cent | Section 3.2.4 |
| Noise-varied run 2 | 100 | – | Change σmea by −10 per cent | – |
| Cosmology-varied run | 50 × 100 | 100 different models (Fig. 2) | Photo-z info by mlz | Section 3.2.5 |
3.2.1 Fiducial simulations
We first describe the mock shape catalogues for HSC S16A. The mock catalogues are generated from 108 full-sky lensing simulations presented in Takahashi et al. (2017).⁸ In Takahashi et al. (2017), the authors perform a suite of cosmological N-body simulations with 2048³ particles and generate lensing convergence maps and halo catalogues. The N-body simulations assume a standard ΛCDM cosmology consistent with the 9-yr WMAP cosmology (WMAP9; Hinshaw et al. 2013): the CDM density parameter Ωcdm = 0.233, the baryon density Ωb0 = 0.046, the matter density Ωm0 = Ωcdm + Ωb0 = 0.279, the cosmological constant ΩΛ = 0.721, the Hubble parameter h = 0.7, the amplitude of density fluctuations σ8 = 0.82, and the spectral index ns = 0.97. The gravitational lensing effect is simulated with a multiple-lens-plane algorithm on a curved sky (Becker 2013; Shirasaki et al. 2015). Light-ray deflection is followed directly using the projected matter density fields produced from the outputs of the N-body simulations. Each lensing simulation consists of 38 source planes at redshifts below 5.3. Realistic source redshift distributions are implemented following the curves in Fig. 1.
To generate mock shape catalogues, we employ essentially the same method as developed in Shirasaki & Yoshida (2014) and Shirasaki et al. (2017b). We use the full-sky simulations combined with the observed photometric redshifts and angular positions of real galaxies. Given the real catalogue of source galaxies, in which each galaxy carries information on its position (RA and Dec.), shape, redshift, and lensing weight, we perform the following four steps:
(i) Set the RA and Dec. of the survey window in the full-sky realization.
(ii) Populate source galaxies on the light cone using the original angular positions and redshifts of the observed galaxies.
(iii) Rotate the shape of each source galaxy at random to erase the real lensing signal.
(iv) Add the lensing shear to each source galaxy using the lensing simulations.
In step (ii), we draw the source redshift at random following the posterior distribution of photo-z estimates on an object-by-object basis. Hence, our mock catalogue can contain galaxies at z < 0.3 or z > 1.5. Note that our method maintains the observed properties of the source galaxies on the sky. We increase the number of realizations of the mock catalogues by extracting multiple separate regions from a single full-sky simulation. Finally, we obtain 2268 mock catalogues in total.⁹ A minimal sketch of steps (iii) and (iv) is given below.
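The following sketch illustrates steps (iii) and (iv) under simplifying assumptions (an ellipticity definition with unit shear response, no lensing weights or bias corrections); the actual pipeline handles these in full.

```python
import numpy as np

rng = np.random.default_rng(42)

def mock_shapes(e1, e2, gamma1, gamma2):
    """Minimal sketch of steps (iii) and (iv): rotate each observed
    ellipticity by a random angle to erase the real lensing signal,
    then add the simulated shear from the ray-tracing outputs."""
    phi = rng.uniform(0.0, 2.0 * np.pi, size=e1.shape)
    # Ellipticity is a spin-2 quantity, so rotate (e1, e2) by 2*phi.
    e1_rot = e1 * np.cos(2 * phi) - e2 * np.sin(2 * phi)
    e2_rot = e1 * np.sin(2 * phi) + e2 * np.cos(2 * phi)
    return e1_rot + gamma1, e2_rot + gamma2
```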
3.2.2 Photometric redshift uncertainties
In the fiducial mock catalogues, we utilize the photo-z information estimated by mlz. To examine possible systematic effects owing to photo-z uncertainties, we generate additional mock realizations adopting one of the two other redshift estimates, mizuki or frankenz. We produce 100 mock realizations of the HSC S16A catalogue for each alternative and use them to evaluate the impact of photo-z uncertainty on our denoising process.
3.2.3 Image calibration uncertainties
We use a single value of the multiplicative bias 〈mb〉 (defined in equation 8) when generating our fiducial mock catalogues. The estimate of 〈mb〉 is based on image simulations, and thus a 1 per cent-level uncertainty remains (Mandelbaum et al. 2018b). To account for possible systematic effects from a mis-estimation of the multiplicative bias, we make additional mock realizations by changing 〈mb〉 → 〈mb〉 + Δmb in the production process. We assume two values, Δmb = ±0.01. For each value of Δmb, we produce 100 mock realizations of the HSC S16A data.
3.2.4 Noise model uncertainties
Imperfect knowledge of the noise distribution in the data can be another source of systematic uncertainty in the denoising process. To assess possible model bias in the noise, we generate two different mock catalogues by varying the amplitude of the standard deviation (error) of the shape measurement, σmea. In the HSC S16A data, the value of σmea has been calibrated with a set of image simulations, and the estimate for individual objects may be subject to a 10 per cent-level uncertainty (Mandelbaum et al. 2018b). To test the impact of this uncertainty, we vary the amplitude of σmea by a factor of (1 + Δσmea) on an object-by-object basis when generating mock catalogues. We keep the lensing weights fixed even when varying σmea in the lensing analysis, because we assume we are unaware of the mis-estimate of σmea. We assume two values, Δσmea = 0.1 and Δσmea = −0.1. For each value of Δσmea, we produce 100 mock realizations of the HSC S16A data.
3.2.5 Varying cosmological models
To study the cosmological dependence of weak lensing maps, we also generate mock catalogues of the HSC S16A data for varying cosmological models. We design the cosmological models for the simulations so as to cover a much wider area in the two-parameter space (Ωm0, σ8) than is allowed by the constraints from current galaxy imaging surveys (Hildebrandt et al. 2017; Troxel et al. 2018b, a; Hikage et al. 2019). We choose a sample of cosmological models in the Ωm0–σ8 plane using a public R package that generates maximum-distance sliced Latin hypercube designs (LHDs; Ba, Myers & Brenneman 2015). We first generate 120 designs in a two-dimensional rectangle specified by 0.1 ≤ Ωm0 ≤ 0.7 and 0.4 ≤ σ8(Ωm0/0.3)^0.6 ≤ 1.1 using the code. We then restrict the designs to those with 0.4 ≤ σ8 ≤ 1.4, which leaves 100 designs. Fig. 2 shows the resulting 100 cosmological models adopted in our simulations. Note that we set ΩΛ = 1 − Ωm0, assuming a spatially flat universe. For the other parameters, we adopt Ωb0h² = 0.02225, h = 0.6727, and ns = 0.9645. These parameters are consistent with the results from Planck 2015 (Ade et al. 2016).
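The sampling-and-cut procedure can be sketched as follows. We use scipy's optimized Latin hypercube as a stand-in for the maximin sliced LHD of Ba et al. (2015), which is available only as an R package, so the resulting designs are illustrative rather than identical to ours.

```python
import numpy as np
from scipy.stats import qmc

# Draw 120 Latin-hypercube designs in the unit square (scipy >= 1.8).
sampler = qmc.LatinHypercube(d=2, optimization="random-cd", seed=1)
u = sampler.random(n=120)
# Map the unit square onto 0.1 <= Omega_m0 <= 0.7 and
# 0.4 <= S8' = sigma_8 (Omega_m0 / 0.3)^0.6 <= 1.1.
omega_m = 0.1 + 0.6 * u[:, 0]
s8_prime = 0.4 + 0.7 * u[:, 1]
sigma_8 = s8_prime * (omega_m / 0.3) ** -0.6
# Keep only designs with 0.4 <= sigma_8 <= 1.4 (100 survive in the paper).
keep = (sigma_8 >= 0.4) & (sigma_8 <= 1.4)
models = np.stack([omega_m[keep], sigma_8[keep]], axis=1)
```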

Figure 2. The 100 cosmological models used to study the cosmological dependence of weak lensing maps. At each point, we generate 50 mock realizations of the HSC S16A data.

Figure 3. The configuration of N-body boxes in our ray-tracing simulations for the cosmology with Ωm0 = 0.3.
For a given single N-body simulation volume, we produce two sets of projected density fields with a projection depth of Lbox/2 on 9600² grids using the triangular-shaped cloud assignment scheme (Press et al. 1992). By solving the discretized lens equation numerically, we obtain the lensing convergence κ and shear γ on 4096² grids with a grid size of 0.15 arcmin. A single realization of our ray-tracing data consists of 22 source planes in the range z ≲ 3. We perform 50 ray-tracing realizations of the underlying density field by randomly shifting the simulation boxes, assuming periodic boundary conditions. We finally produce the mock catalogues of the HSC S16A data as described in Section 3.2.1.
When running the cosmological N-body simulations, we use the parallel Tree-Particle Mesh code GADGET2 (Springel 2005). We generate the initial conditions using a parallel code developed by Nishimichi et al. (2009) and Valageas & Nishimichi (2011), which employs second-order Lagrangian perturbation theory (Crocce, Pueblas & Scoccimarro 2006). The number of N-body particles is set to 512³. We set the initial redshift by $1+z_{\rm init} = 36\,(512/L_{\rm box})$, and we compute the linear matter transfer function using CAMB (Lewis, Challinor & Lasenby 2000). Note that our choice of the initial redshift is motivated by the detailed study of Nishimichi et al. (2019).
4 DENOISING BY DEEP-LEARNING NETWORKS
4.1 Conditional GANs
To perform the mapping from a noisy lensing field κobs to its noiseless counterpart κWL, we use the conditional GAN model developed in Isola et al. (2016). The networks have two main components, a generator and a discriminator. We train the networks so that the generator applies a transformation to the input noisy field κobs and outputs a noise field κN.¹⁰ The discriminator compares the input image to an unknown image (either a target image from the data set or an output image from the generator) and tries to judge whether the latter was produced by the generator. To be specific, the input image for the discriminator is the noisy field κobs, while the target image is either the true noise counterpart of κobs or an output from the generator.
The structure of the generator and the discriminator in our networks is essentially the same as in Shirasaki et al. (2019b), apart from minor parameter tuning. The generator uses a U-Net structure (Ronneberger, Fischer & Brox 2015) with eight sets of convolution and deconvolution layers. Each convolution layer consists of a convolution with a kernel size of 5 × 5, batch normalization, and a leaky ReLU activation with a leak slope of 0.2. The deconvolution layers perform the inverse operation of the convolution layers. The generator also has skip connections between mirror layers to propagate the small-scale information that would otherwise be lost as the image size decreases through the convolution process. The discriminator produces a single value from a given input image for the decision of whether the input is real or fake. The final output of the discriminator is made after reducing the image through four convolution layers and averaging all the responses from the convolution layers. In the convolution layers of the discriminator, we remove batch normalization to balance the losses of the generator and the discriminator in a stable way. The resulting number of parameters in our networks is close to 400 000.
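The Keras sketch below illustrates a U-Net generator of the kind described above (eight 5 × 5 convolution and deconvolution stages, batch normalization, leaky ReLU with slope 0.2, skip connections). The filter counts, input size, and linear output are illustrative assumptions, not the authors' exact configuration.

```python
import tensorflow as tf
from tensorflow.keras import layers

def downsample(filters):
    return tf.keras.Sequential([
        layers.Conv2D(filters, 5, strides=2, padding="same", use_bias=False),
        layers.BatchNormalization(),
        layers.LeakyReLU(0.2),                    # leak slope 0.2
    ])

def upsample(filters):
    return tf.keras.Sequential([
        layers.Conv2DTranspose(filters, 5, strides=2, padding="same",
                               use_bias=False),
        layers.BatchNormalization(),
        layers.ReLU(),
    ])

def build_generator(size=256):
    """U-Net with eight 5x5 convolution and deconvolution stages plus
    skip connections between mirror layers; the input is a noisy
    convergence map and the output a predicted noise field."""
    inp = layers.Input(shape=(size, size, 1))
    skips, x = [], inp
    for f in (64, 128, 256, 512, 512, 512, 512, 512):
        x = downsample(f)(x)
        skips.append(x)
    for f, skip in zip((512, 512, 512, 256, 128, 64, 32),
                       reversed(skips[:-1])):
        x = upsample(f)(x)
        x = layers.Concatenate()([x, skip])       # U-Net skip connection
    return tf.keras.Model(inp, layers.Conv2DTranspose(
        1, 5, strides=2, padding="same")(x))      # linear output map
```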
4.2 Training the networks
When training the networks, we use minibatch stochastic gradient descent with the Adam solver (Kingma & Ba 2014), using a learning rate of 0.0002 and momentum parameters β1 = 0.5 and β2 = 0.9999. We set λ = 75 in equation (17); the parameter λ controls the strength of the regularization given by the L1 norm. All the networks in this paper are trained with a batch size of 1. We initialize the model parameters in the networks from a Gaussian distribution with mean 0 and standard deviation 0.02. We train our networks using a TensorFlow implementation¹¹ on a single NVIDIA Quadro P5000 GPU. During training, we randomly select training and validation data from the input data sets. Each network is validated every time it has learned 100 image pairs.
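The optimizer settings above, combined with a pix2pix-style objective, can be sketched as follows. We do not reproduce the exact form of equation (17) here; the cross-entropy adversarial term is a standard assumption for this architecture, while the λ-weighted L1 term follows the text.

```python
import tensorflow as tf

bce = tf.keras.losses.BinaryCrossentropy(from_logits=True)
LAMBDA = 75.0  # weight of the L1 term (lambda in equation 17)

def generator_loss(fake_logits, fake_noise, true_noise):
    # Adversarial term: reward the generator when the discriminator
    # labels its output as real.
    adv = bce(tf.ones_like(fake_logits), fake_logits)
    # L1 regularization: keep the predicted noise close to the truth.
    l1 = tf.reduce_mean(tf.abs(true_noise - fake_noise))
    return adv + LAMBDA * l1

def discriminator_loss(real_logits, fake_logits):
    return (bce(tf.ones_like(real_logits), real_logits) +
            bce(tf.zeros_like(fake_logits), fake_logits))

# Adam settings quoted above (note the non-default momentum parameters).
gen_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5, beta_2=0.9999)
disc_opt = tf.keras.optimizers.Adam(2e-4, beta_1=0.5, beta_2=0.9999)
```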
To prepare the training data set, we use 400 realizations of our mock HSC S16A catalogues (Section 3.2.1). Using the information of the noiseless lensing maps κ and γ in our survey window, we generate 60 000 noisy maps by injecting independent noise realizations at random. From the 60 000 image pairs of the noisy field κobs and the underlying noise κN, we select 25 000 image pairs by bootstrap sampling so that each bootstrap realization contains 167 realizations of noiseless lensing fields. In our previous study, we found that it is near-optimal to use ∼200 realizations of noiseless lensing fields and to set the number of training pairs to ∼30 000 for our networks (Shirasaki et al. 2019b).
To set the hyperparameter of our network, we examined training with λ = 25, 50, 75, 100, and 150. For a given λ, we varied the number of image pairs in the training process from 20 000 to 40 000 in intervals of 5000. We then applied the networks trained with different hyperparameters to the test sets. After these trials, we found that training with 25 000 image pairs and λ = 75 provides the best denoising performance on the test sets.
4.3 Production of the final denoised image
As reported in Shirasaki et al. (2019b), a single set of our networks trained with 25 000 image pairs shows a large scatter in the image-to-image translation. To reduce this dependence on the training data sets, we generate 10 bootstrap samples of 25 000 training pairs and obtain a total of 10 networks for denoising. Namely, we obtain 10 candidates for the underlying noise field κN for a given noisy field κobs. To form the best estimate of κN, we take the median over the 10 candidates on a pixel-by-pixel basis. Once the ensemble estimate of κN is determined in this manner, we evaluate the underlying noiseless field κWL by subtracting the best noise model from the observed field κobs.
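A minimal sketch of this ensemble step, assuming the 10 trained generators are available as map-to-map callables:

```python
import numpy as np

def denoise(kappa_obs, generators):
    """Ensemble denoising: each trained generator predicts a noise
    field for the input map, the pixel-wise median of the predictions
    is taken as the best noise estimate, and the noiseless convergence
    is the residual."""
    noise_preds = np.stack([g(kappa_obs) for g in generators], axis=0)
    kappa_noise = np.median(noise_preds, axis=0)  # pixel-by-pixel median
    return kappa_obs - kappa_noise                # estimate of kappa_WL
```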
The denoising process of our networks is tested with 1000 noisy maps from the fiducial mock catalogues. These test data are not used in the training process.
5 PROPERTIES OF DENOISED MAPS
In this section, we study the statistical properties of weak lensing maps denoised by our conditional GANs. We pay special attention to the non-Gaussian information in the maps. In this paper, we consider the one-point probability distribution function (PDF) to extract non-Gaussian information. Furthermore, we employ matching analyses between peaks in the maps and massive clusters in the N-body simulations, demonstrating that our GANs do not erase rich cosmological information in high-density regions. In the Appendix, we summarize additional tests of our GANs. These tests include a conventional two-point correlation analysis, a reliability check of our GANs' predictions, and the dependence of our results on a hyperparameter of our GANs.
Our training strategy for the conditional GANs is described in Section 4. Here, we show the validation results of the outputs from our networks using 1000 test data sets. These test sets are based on the fiducial mock catalogues as in Section 3.2.1 and are not used in the training process. In the following, each lensing map is normalized so as to have zero mean and unit variance.
5.1 Visual comparison
A quick visual comparison highlights how our GAN-based denoising works on noisy input images. Fig. 4 compares three maps for one of our test data sets. In the figure, the left and right panels show an input noisy map and the true noiseless counterpart, respectively. The middle panel shows the map denoised by our conditional GANs. In each panel, red spots indicate high-density regions, while bluer regions have lower densities. The denoised image retains patterns of density contrast over a few degrees similar to the ground truth. Note that Fig. 4 concentrates on pixel values in the range of −2.5σ to +2.5σ, i.e. a largely noise-dominated range. Although not perfect, our GANs recover small-scale information (e.g. positive peaks) close to the ground truth.

Figure 4. An example of image-to-image translation by our networks. The left-hand panel shows an input noisy lensing map, the right-hand panel the true (noiseless) counterpart, and the middle panel the map reconstructed by our conditional GANs. For the reconstructed map, we first obtain the underlying noise field from 10 bootstrap realizations of the generators in our GANs and then derive the convergence map as the residual between the input noisy map and the predicted noise. The hatched region shows the masked area due to the presence of bright stars and the inhomogeneous angular distribution of galaxies in our survey window. In the legend, μ and σ denote the spatial average and root-mean-square of the lensing fields, respectively.
5.2 One-point probability distribution
The one-point PDF is a simple summary statistic of a weak lensing map. Our previous study shows that a denoised image yields a PDF similar to that of the noiseless truth if the lensing field is properly normalized (Shirasaki et al. 2019b). Here, we repeat the previous analysis but include various observational effects such as the complex survey geometry, the inhomogeneous galaxy distribution on the sky, the wide redshift distribution of source galaxies, and the variation of the weights in the analysis. For a given lensing map, we measure the one-point PDF as a function of (κ − μ)/σ, where μ and σ are the spatial average and root-mean-square, respectively. We use linearly spaced bins in the range −15 < (κ − μ)/σ < 15 with a width of 0.3. Fig. 5 compares the PDFs averaged over 1000 realizations of lensing fields. The noiseless PDF is significantly skewed compared to the observed noisy counterpart. Our method reproduces the large skewness of the noiseless PDF from the noisy input images. The typical bias in the reconstruction is at the 0.5–1σ level over a wide range of pixel values, as shown in the bottom panel.
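A minimal sketch of this measurement, assuming an optionally masked convergence map as input:

```python
import numpy as np

def lensing_pdf(kappa, mask=None, lim=15.0, width=0.3):
    """One-point PDF measurement as described above: normalize the map
    to zero mean and unit variance, then histogram the unmasked pixels
    in bins of width 0.3 over -15 < (kappa - mu)/sigma < 15."""
    pix = kappa[mask] if mask is not None else kappa.ravel()
    nu = (pix - pix.mean()) / pix.std()
    bins = np.arange(-lim, lim + width, width)      # 100 bins of width 0.3
    pdf, edges = np.histogram(nu, bins=bins, density=True)
    return 0.5 * (edges[1:] + edges[:-1]), pdf      # bin centres, PDF
```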

Figure 5. Comparison of the lensing PDFs for noisy, noiseless, and denoised maps. The solid line in the top panel shows the PDF averaged over the 1000 noiseless lensing maps, while the dashed line shows the noisy (observed) counterpart. The red points in the top panel show the averaged PDF for the maps denoised by our GANs. In the bottom panel, we show the difference between the noiseless and denoised PDFs normalized by the sample variance of the noiseless PDFs. For reference, the magenta lines in the bottom panel highlight ±0.5σ-level differences.
5.3 Peak-halo matching
To study the small-scale structure of a denoised lensing field, we examine the correspondence between dark matter haloes and the local maxima in the lensing maps. Since our mock HSC catalogues are originally based on cosmological N-body simulations, we can generate light-cone halo catalogues with the same sky coverage as the lensing maps. The light-cone catalogues are produced from the inherent full-sky halo catalogues of Takahashi et al. (2017). The dark matter haloes in the full-sky catalogues are identified with the phase-space temporal halo finder Rockstar (Behroozi, Wechsler & Wu 2013).
In the following, we consider two mock cluster catalogues: one is a simple mass-limited sample, and the other takes into account a realistic mass selection effect for optically selected galaxy clusters. For the mass-limited sample, we use dark matter haloes with masses¹² greater than $10^{14}\, h^{-1}M_{\odot}$ at redshifts less than 1. This mass and redshift selection roughly corresponds to the real galaxy cluster catalogue based on the photometric data in HSC S16A (Oguri et al. 2018). To construct a more realistic sample, we generate mock galaxy clusters based on the multiband identification of red-sequence galaxies (the cluster-finding algorithm is referred to as CAMIRA; Oguri 2014). We adopt the mass-richness relations of the CAMIRA clusters identified in HSC S16A (Murata et al. 2019). Murata et al. (2019) assume a lognormal distribution of the cluster richness for given cluster masses and redshifts and constrain the mean and scatter of the relation between cluster mass and richness. Using their lognormal model, we assign a cluster richness to the dark matter haloes in the redshift range 0.1–1.0.
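The richness assignment can be sketched as below; the mean and scatter relations used here are purely illustrative placeholders for the fitted Murata et al. (2019) model.

```python
import numpy as np

rng = np.random.default_rng(0)

def assign_richness(mass, z, mean_relation, scatter_relation):
    """Draw a richness N for each halo from a lognormal whose mean and
    scatter depend on halo mass and redshift, in the spirit of the
    Murata et al. (2019) model. `mean_relation` and `scatter_relation`
    are placeholder callables returning ln<N> and sigma_lnN."""
    ln_n = rng.normal(mean_relation(mass, z), scatter_relation(mass, z))
    return np.exp(ln_n)

# Purely illustrative power-law stand-ins (not the fitted values):
mean_rel = lambda m, z: np.log(20.0) + 0.9 * np.log(m / 1e14)
scatter_rel = lambda m, z: 0.45
richness = assign_richness(np.array([1e14, 3e14]), 0.3, mean_rel, scatter_rel)
```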
In a given lensing map, we first identify local maxima with peak heights greater than 5σ. We then search for clusters around the peaks within a search radius of 6 arcmin. When several haloes are found within the search radius, we regard the cluster closest to the peak position as the best match. Over 1000 realizations of our survey window, we find 27 683 peaks in the noiseless fields, of which 89.5 per cent have a matched mass-limited cluster. After denoising, the number of peaks is 23 248 and the matching rate is 64.6 per cent. For the more realistic CAMIRA-like clusters, the matching rate is 85.1 per cent and 58.9 per cent for the noiseless and denoised fields, respectively. Without denoising, the number of peaks drops to 1669 over the 1000 realizations, although the matching rate is then 80.2 per cent and 75.2 per cent for the mass-limited and CAMIRA-like clusters, respectively. Hence, the analysis with denoising is highly complementary to the noisy counterpart.
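A minimal sketch of the matching step, assuming cluster positions projected on to the same flat-sky grid and ignoring masked regions for simplicity:

```python
import numpy as np
from scipy.ndimage import maximum_filter
from scipy.spatial import cKDTree

def match_peaks(kappa, cluster_xy, nu_min=5.0, r_match=6.0, pix=1.5):
    """Find local maxima above nu_min sigma on the normalized map,
    then pair each peak with the nearest cluster within r_match
    arcmin. `cluster_xy` holds projected cluster positions in arcmin;
    `pix` is the pixel size in arcmin."""
    nu = (kappa - kappa.mean()) / kappa.std()
    is_peak = (nu == maximum_filter(nu, size=3)) & (nu > nu_min)
    peak_xy = np.argwhere(is_peak)[:, ::-1] * pix   # (x, y) in arcmin
    dist, idx = cKDTree(cluster_xy).query(peak_xy)
    matched = dist < r_match
    return peak_xy, idx[matched], matched.mean()    # peaks, matches, rate
```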
Furthermore, to validate the halo-peak matching, we study the number density of the matched dark matter haloes as a function of halo mass and redshift. Figs 6 and 7 show the number densities of the clusters matched to the noiseless, denoised, and noisy peaks. As shown in the figures, the shape of the number density is similar for the noiseless and denoised peaks. This indicates that the peak-cluster matching for the denoised fields is not coincidental. Compared to the noisy peaks, the denoised peaks play an important role in searching for less massive clusters.

Figure 6. The number density of dark matter haloes matched to the lensing peaks. From left to right, we show the number density of dark matter haloes as a function of halo mass M200b in three redshift ranges: 0.2 ≤ z < 0.4, 0.4 ≤ z < 0.6, and 0.6 ≤ z < 0.8. The grey histogram shows the results for the true (noiseless) lensing fields, while the red points and green squares show the denoised and noisy fields, respectively. The red points broadly follow the grey histogram apart from the difference in amplitude. For reference, the dashed line in each panel shows the noiseless results multiplied by a factor of 0.7.

Figure 7. Similar to Fig. 6, but showing the peak-cluster matching results for the CAMIRA-like mock catalogues.
5.4 Cosmological dependence of lensing PDFs
We next examine whether our conditional GANs can reduce noise when the true cosmological model differs from that assumed in the training process. If our GANs had learned the noise properties alone, the networks should be able to denoise maps regardless of the underlying cosmological model. To study the cosmological dependence of the denoised lensing maps, we use the mock catalogues described in Section 3.2.5. We have 50 mock realizations of noisy lensing maps for each of 100 different cosmological models. For a given cosmology, we input a noisy map to our GANs, obtain the denoised map, and then compute the one-point PDF from the denoised map. We repeat this process for the 50 realizations of each cosmological model and estimate the average PDF.
Fig. 8 summarizes the cosmological dependence of the denoised PDF for the HSC S16A data. We find a clear dependence on the cosmological model, highlighting that our GANs do not overfit to the cosmology assumed in the training. The results in Fig. 8 can be compared with the PDFs of the noisy input maps. The cosmological dependence of the lensing PDFs without our denoising is shown in Fig. 9, which illustrates that, when one works on the original noisy lensing map, the cosmological dependence is weak compared to the statistical error of the PDF. Our results indicate that the denoised PDF could potentially constrain the cosmological parameters more tightly than the noisy counterpart does. However, the denoised PDFs are less sensitive to the cosmological parameters than the noiseless counterparts (see Fig. 10). In addition, the noiseless PDF is found to be most sensitive to the parameter combination $\sigma_8(\Omega_{\rm m0}/0.3)^{0.2-0.3}$, while the denoised and noisy PDFs are mainly determined by $\sigma_8(\Omega_{\rm m0}/0.3)^{0.5}$. Given these differences, the cosmological information in the denoised PDFs is not identical to that in the true PDFs. We need additional analyses to study the information content of the denoised PDFs and its relation to other statistics (e.g. Shirasaki 2017). We leave these for future study.

Figure 8. The cosmological dependence of the one-point PDF of the denoised maps. The top panel shows the PDF as a function of the pixel value (κ − μ)/σ, where μ and σ are the spatial average and root-mean-square of a lensing field κ, respectively. The inset in the top panel shows the 100 cosmological models considered in the present study; the star symbol indicates our fiducial model (the WMAP9 cosmology; Hinshaw et al. 2013). The dependence on the cosmological model is highlighted by the colour differences. In the bottom panel, we show the difference of the PDF from our fiducial cosmological model normalized by the statistical uncertainty.

Figure 9. Similar to Fig. 8, but for the lensing PDF without our denoising process.

Figure 10. Similar to Fig. 8, but for the lensing PDF in the absence of shape noise.
5.5 Accounting for systematic uncertainties
Before showing the results, we caution about the limitations of our approach. All the analyses in this paper assume that baryonic effects on the cosmic mass density are negligible. Osato, Shirasaki & Yoshida (2015) and Castro et al. (2018) examined the baryonic effects on the lensing PDF with hydrodynamical simulations and found that the most prominent effects appear in the high-σ tails of the PDF. This is because baryonic effects such as cooling, star formation, and feedback from active galactic nuclei commonly play a critical role in high-mass-density environments in the universe. Besides, we ignore possible correlations between the lensing shear and the shape noise. An example of the cause of such correlations is intrinsic alignment (IA; Troxel & Ishak 2014). Although the IA effect can potentially bias parameter estimation in future surveys (Krause, Eifler & Blazek 2016), we expect it to be less important for our analysis because we do not employ clustering analyses of galaxy shapes. Observationally, the IA effect is known to be more prominent for redder galaxies (e.g. Hirata et al. 2007). Since redder galaxies preferentially reside in denser environments such as galaxy clusters, we can mitigate the impact of the IA effect on our analysis by removing the high-σ information. To take these effects into account, we remove such high-density regions by restricting the range of pixel values to $\mathcal{P} \ge 0.01$ in equation (20). This leaves the lensing PDF at −2.1 < (κ − μ)/σ < 3.3 with 18 bins. In this set-up, we would argue that significant generalization errors exist in our denoising if we find $\Delta\chi^2 \ge \sqrt{2\times 18} = 6$.
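For concreteness, a hypothetical sketch of this criterion is given below. It assumes equation (20) is a diagonal chi-square between a test PDF and the fiducial prediction; the actual equation may treat bin covariances differently.

```python
import numpy as np

def delta_chi2(pdf_test, pdf_fid, sigma_fid, pdf_cut=0.01):
    """Hypothetical diagonal chi-square between a test PDF and the
    fiducial prediction, restricted to well-populated bins with
    P >= pdf_cut (18 bins in our set-up)."""
    use = pdf_fid >= pdf_cut
    return np.sum(((pdf_test[use] - pdf_fid[use]) / sigma_fid[use]) ** 2)
```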
5.5.1 Photometric redshifts
In our GANs, we assume the source redshift estimates from the specific mlz method, but other methods predict different redshift distributions (see Fig. 1). To assess the systematic uncertainty due to imperfect photo-z estimates, we compute equation (20), setting the term $\mathcal{P}(\mathrm{test})$ to the averaged PDF over 100 realizations of the mock catalogues with different photo-z information (Section 3.2.2). We find that photo-z estimates by different methods can induce a bias in the lensing PDFs at the ≲0.3σ level over a wide range of pixel values. These differences introduce Δχ² = 0.290 and 0.134 for Photo-z runs 1 and 2, respectively.
5.5.2 Multiplicative bias
We also assume that the multiplicative bias defined by equation (8) is perfectly calibrated, but it can be mis-estimated at the level of 0.01. To test this systematic effect, we input the average PDF obtained from the mock catalogues described in Section 3.2.3 when computing equation (20). We find that a 1 per cent-level error in the multiplicative bias induces errors of Δχ² = 0.238 and 0.294 for Δmb = 0.01 and −0.01, respectively.
5.5.3 Imperfect knowledge of noise
We assume that the noise distribution in our mock catalogues is the same as in the real data. However, the actual measurement error of galaxy shapes is subject to a 10 per cent-level uncertainty. To test the potential effect of this error, we input the average PDF obtained from the mock catalogues described in Section 3.2.4 into equation (20). We find that a 10 per cent-level mis-estimation of the shape measurement error induces errors of Δχ² = 0.213 and 0.272 for Δσmea = 0.10 and −0.10, respectively.
5.5.4 Total systematic uncertainties
Putting it all together, we confirm $\Delta\chi^2 \lesssim 1$ for the denoised PDFs. Hence, we conclude that possible systematic uncertainties in the measurement of galaxy shapes and redshifts are unimportant for the denoised PDF in our HSC data sets. We are now ready to apply our deep-learning denoising method to the real HSC data.
6 APPLICATION TO REAL DATA
6.1 Visual investigation and cluster matching
We apply our GAN-based denoising to the real weak lensing map obtained from the HSC S16A data. Fig. 11 compares the denoised images for the mock and real data sets. The top left panel shows a noisy lensing field in a mock observation taken from the 1000 realizations of the fiducial catalogues (Section 3.2.1), and the top right panel shows the denoised weak lensing field for the mock data. The bottom left-hand and right-hand panels are similar to the top ones, but for the real HSC S16A data. On the denoised fields, we mark the positions of the galaxy clusters matched to local maxima with peak heights greater than 5σ. For the mock data, we define galaxy clusters as dark matter haloes in the N-body simulations with masses greater than $10^{14}\, h^{-1}M_{\odot}$ and redshifts z < 1. For the real data, we select the optically selected CAMIRA clusters (Oguri et al. 2018) in HSC S16A with richness >15 and the X-ray selected clusters (Adami et al. 2018) in our survey window with X-ray temperatures >2.14 keV. Oguri et al. (2018) have shown that our selections in optical richness and X-ray temperature roughly correspond to a selection in cluster mass of $\gt 10^{14}\, h^{-1}M_{\odot}$. From the results in Section 5.3, we expect ∼64.6 per cent of the peaks in a denoised field with peak heights >5σ to have galaxy cluster counterparts. In our denoised map for the real HSC S16A data, we find 23 peaks, of which 13 have counterparts. This matching rate is in good agreement with the expectation from our experiments with 1000 mock observations. When limiting the analysis to the CAMIRA clusters alone, we find 10 matched clusters in the denoised field, which is still consistent with our expectation within a 1σ Poisson error. Note that three of the matched clusters are selected in both the optical and X-ray bands.

Figure 11. Performance of the denoising of the observed weak lensing map in the Subaru HSC first-year data. In the upper panels, the left-hand and right-hand panels show a noisy input map and its denoised counterpart for a mock observation among the 1000 realizations, respectively. The bottom panels show results similar to the upper ones, but for the real observational data. In the top right-hand panel, the grey points show the dark matter haloes matched to local maxima on the denoised map. In the bottom left-hand panel, the star and square symbols show the matched galaxy clusters selected in the optical and X-ray bands, respectively. Note that the hatched regions represent the masked areas due to missing data.
6.2 Statistics-level comparisons

Figure 12. Comparison of the lensing PDFs between the real HSC data and the WMAP9-cosmology prediction. In the upper panel, the grey squares and red circles show the observed PDF for the noisy and denoised fields, respectively. The lines show the corresponding predictions based on our mock observations under the WMAP9 cosmology. In the bottom panel, we show the difference between the observed PDF and the prediction in units of the sample variance. For reference, the dashed lines show ±1σ levels.
7 CONCLUSION AND DISCUSSION
We have devised a novel technique for noise reduction of cosmic mass density maps obtained from weak lensing surveys. We have improved over our previous analysis (Shirasaki et al. 2019b) by incorporating realistic properties in the training data set and by performing ensemble learning with conditional GANs. Our denoising method with GANs produces 10 estimates of the underlying noise field for a given input (observation). The multiple outputs allow us to reduce generalization errors in the denoising process by taking the median value over the 10 predictions by our networks. For the first time, we have performed a stress test of denoising with deep learning by using non-Gaussian statistics and by varying relevant parameters in mock lensing measurements.
Our findings through model validation are summarized as follows:
Our GANs can reproduce the one-point probability distribution functions (PDFs) of noiseless fields at the 0.5–1σ level. This holds even when we vary the multiplicative bias at the 1 per cent level, the photo-z distribution of source galaxies, and the error in galaxy shape measurements at the 10 per cent level.
After denoising, positive peaks with pixel values greater than 5σ in lensing mass maps have counterparts of massive galaxy clusters within a separation of 6 arcmin. The matching rate between peaks and clusters is found to be ∼60 per cent for the denoised field, compared to the true matching rate of ∼90 per cent. We also confirmed that the matching in the denoised field is not coincidental by studying the mass function of the matched clusters.
Even though we assumed a specific cosmological model in the training of our GANs, the denoised field shows a clear cosmological dependence. The cosmological dependence of the denoised lensing PDF differs from that of the noiseless counterpart, whereas the sensitivity of the denoised PDF to the parameters (Ωm0, σ8) is found to be greater than that of the noisy counterpart. This indicates that our GANs extract some cosmological information hidden by observational noise.
We have applied our method to real observational data, using a part of the Subaru Hyper Suprime-Cam (HSC) first-year shape catalogue (Mandelbaum et al. 2018a). By comparing the denoised field for the real HSC data with the prediction based on our mock observations, we conclude that our denoising provides a result consistent with the standard ΛCDM cosmological model.
The method developed in this paper can easily be generalized to other large-scale cosmological data sets (e.g. Moriwaki, Shirasaki & Yoshida 2020 for intensity mapping surveys). Since the observable information is limited by cosmic variance, future cosmology studies will need to extract information hidden behind observational noise within a limited data size. Sophisticated modelling of cosmic large-scale structure can open a new window for producing as many mock observable 'universes' as possible, which in turn allows us to redesign cosmological analyses beyond conventional methods. Conditional GANs provide an innovative approach for next-generation cosmological analyses. To gain the full benefits of machine-learning techniques in the future, we will need to solve the technical problems of large-scale computing for deep learning and of fast and accurate modelling of cosmic structure in a multidimensional parameter space. It will also be necessary to develop methodologies that improve our understanding of neural networks in a physically intuitive way. A fruitful combination of astrophysics with machine learning is required to confront these challenges. The results presented in this paper provide a prototype for deep-learning-assisted cosmology, for further enhancement of the science returns of future astronomical surveys.
ACKNOWLEDGEMENTS
This work was in part supported by the Grant-in-Aid for Scientific Research on Innovative Areas from the MEXT KAKENHI Grant Numbers 18H04358 and 19K14767, and by Japan Science and Technology Agency CREST Grant Number JPMJCR1414 and AIP Acceleration Research Grant Number JP20317829. This work was also supported by JSPS KAKENHI Grant Numbers JP17K14273 and JP19H00677. Numerical computations presented in this paper were in part carried out on the general-purpose PC farm at the Center for Computational Astrophysics (CfCA) of the National Astronomical Observatory of Japan.
The HSC collaboration includes the astronomical communities of Japan and Taiwan and Princeton University. The HSC instrumentation and software were developed by the National Astronomical Observatory of Japan (NAOJ), the Kavli Institute for the Physics and Mathematics of the Universe (Kavli IPMU), the University of Tokyo, the High Energy Accelerator Research Organization (KEK), the Academia Sinica Institute for Astronomy and Astrophysics in Taiwan (ASIAA), and Princeton University. Funding was contributed by the FIRST programme from the Japanese Cabinet Office, the Ministry of Education, Culture, Sports, Science and Technology (MEXT), the Japan Society for the Promotion of Science (JSPS), the Japan Science and Technology Agency (JST), the Toray Science Foundation, NAOJ, Kavli IPMU, KEK, ASIAA, and Princeton University.
This paper makes use of software developed for the Vera C. Rubin Observatory. We thank the LSST Project for making their code available as free software at http://dm.lsst.org.
The Pan-STARRS1 Surveys (PS1) have been made possible through contributions of the Institute for Astronomy, the University of Hawaii, the Pan-STARRS Project Office, the Max-Planck Society and its participating institutes, the Max Planck Institute for Astronomy, Heidelberg and the Max Planck Institute for Extraterrestrial Physics, Garching, The Johns Hopkins University, Durham University, the University of Edinburgh, Queen’s University Belfast, the Harvard-Smithsonian Center for Astrophysics, the Las Cumbres Observatory Global Telescope Network Incorporated, the National Central University of Taiwan, the Space Telescope Science Institute, the National Aeronautics and Space Administration under Grant Number NNX08AR22G issued through the Planetary Science Division of the NASA Science Mission Directorate, the National Science Foundation under Grant Number AST-1238877, the University of Maryland, and Eotvos Lorand University (ELTE) and the Los Alamos National Laboratory.
Based [in part] on data collected at the Subaru Telescope and retrieved from the HSC data archive system, which is operated by Subaru Telescope and Astronomy Data Center at National Astronomical Observatory of Japan.
DATA AVAILABILITY
The data underlying this article will be shared on reasonable request to the corresponding author.
Footnotes
7. The smoothing scale θs is commonly adopted to search for massive galaxy clusters in a smoothed lensing map (Hamana, Takada & Yoshida 2004). Using numerical simulations, Shirasaki, Hamana & Yoshida (2015) have found a one-to-one correspondence between the peaks on a map smoothed with the filter in equation (10) and massive galaxy clusters at z = 0.1−0.3 when imposing a peak height larger than ∼5σ.
8. The full-sky light-cone simulation data are freely available for download at http://cosmo.phys.hirosaki-u.ac.jp/takahasi/allsky_raytracing/.
9. The mock shape catalogues are publicly available at http://gfarm.ipmu.jp/~surhud/.
10. One may think that it would be more appropriate to directly generate a noiseless lensing field in the network. This possibility was examined in our previous work, and it does not work in an HSC-like imaging survey (Shirasaki et al. 2019b), mainly because the signal-to-noise ratio on a pixel-by-pixel basis is generally small.
11. We use a modified version of https://github.com/yenchenlin/pix2pix-tensorflow.
12. We define the halo mass as the spherical overdensity mass with respect to 200 times the mean overdensity.
REFERENCES
APPENDIX: ADDITIONAL VALIDATION TESTS OF OUR GANS
In this appendix, we show additional validation tests of our GANs' performance. The tests include two-point correlation analyses, a sanity check for overfitting, and the dependence of our results on a hyperparameter. Note that we normalize each lensing map so that it has zero mean and unit variance in these tests.
A1 Clustering amplitudes

Figure A1. The two-point correlation analysis of the noiseless and denoised fields. In the top panel, the points show the cross-correlation between the noiseless and denoised fields, while the solid and dashed lines show the auto-correlations of the noiseless and denoised fields, respectively. The bottom panel shows the cross-correlation coefficient of the two-point clustering.
A2 Statistical uncertainties in lensing PDFs
It is important to make sure that our conditional GANs do not overfit to our simulations. In general, overfitting manifests itself when the loss for test data sets becomes much larger than the loss for training sets. Because we are interested in the statistical properties of the lensing map, however, a comparison of losses is not always sufficient. Instead, we compare the statistical uncertainty of the denoised PDF with that of the true counterpart. We caution that the loss function of our GANs is not designed to reconstruct noiseless lensing PDFs over realizations; the lensing PDF after denoising should therefore be regarded as a prediction by our GANs. If the variance of the denoised PDFs were typically smaller than that of the true counterpart, the predictions by our GANs would be highly unreliable. Fig. A2 shows the standard deviation of the lensing PDFs over the 1000 test data sets. The black line in the upper panel shows the true underlying scatter, while the red points show the denoised counterparts. We find that the statistical uncertainty of the denoised PDFs is larger than the intrinsic value over a wide range of lensing field values. This simple statistic indicates that the lensing PDFs produced by our GANs remain reliable.

Figure A2. Comparison of the variance of the one-point lensing PDF. In the upper panel, the solid line shows the variance of the true noiseless PDFs, while the points show the counterpart for the denoised PDFs. The ratio between the two is shown in the lower panel.
A3 Varying a hyperparameter in the conditional GANs
Here we summarize the effect of the hyperparameter λ in our deep-learning networks on the denoising performance. To study the effect of λ, we consider two additional models with λ = 50 and 100. We follow the same training strategy as in Section 4 when varying λ. We built 10 GANs for each λ and then estimated the 'best' denoised field by taking the median over the 10 outputs of our GANs.
Fig. A3 compares the maps denoised by GANs with different values of λ. The figure highlights that the large-scale clustering pattern in the map is little affected by the choice of λ, while the differences in over- and under-dense regions are prominent. We also summarize the statistics-level comparisons in Figs A4 and A5, which support the visual impression.

Figure A3. The effect of the hyperparameter λ in our networks on image-to-image translation. In this figure, we work on the same realization of lensing data as in Fig. 4. From left to right, we show the denoised lensing maps for λ = 50, 75, and 100.

Figure A4. The effect of the hyperparameter λ in our networks on the denoised lensing PDFs. In the upper panel, the solid line shows the true noiseless lensing PDF, while the red circles represent the denoised counterpart with the default set-up of λ = 75. The green dashed and orange dotted lines show the denoised PDFs with λ = 50 and 100, respectively. In the bottom panel, we show the difference between the denoised and noiseless PDFs in units of the sample variance of the noiseless fields. For reference, the magenta lines in the bottom panel show ±0.5σ levels.

Figure A5. The effect of the hyperparameter λ in our networks on the cross-correlation coefficients among the denoised, noisy, and noiseless fields. We define the coefficient in equation (A2). The red circles with error bars show our fiducial results with λ = 75, while the green dashed and orange dotted lines show the cases with λ = 50 and 100, respectively.