LinKS: Discovering galaxy-scale strong lenses in the Kilo-Degree Survey using Convolutional Neural Networks

We present a new sample of galaxy-scale strong gravitational-lens candidates, selected from 904 square degrees of Data Release 4 of the Kilo-Degree Survey (KiDS), i.e., the "Lenses in the Kilo-Degree Survey" (LinKS) sample. We apply two Convolutional Neural Networks (ConvNets) to $\sim88\,000$ colour-magnitude selected luminous red galaxies, yielding a list of 3500 strong-lens candidates. This list is further down-selected via human inspection. The resulting LinKS sample is composed of 1983 rank-ordered targets classified as "potential lens candidates" by at least one inspector. Of these, a high-grade subsample of 89 targets is identified as potential strong lenses by all inspectors. Additionally, we present a collection of another 200 strong lens candidates discovered serendipitously from various previous ConvNet runs. A straightforward application of our procedure to future Euclid or LSST data can select a sample of $\sim3000$ lens candidates with less than 10 per cent expected false positives and requiring minimal human intervention.


INTRODUCTION
Strong gravitational lenses are composite systems where a massive foreground object (e.g., a galaxy or a cluster) creates multiple images of one or more higher-redshift sources (e.g., galaxies or quasars). Strong lenses are useful for a wide range of cosmological and astrophysical studies (Schneider et al. 1992; Schneider 2006; Treu 2010). For example, they can provide cosmological constraints on the dark energy equation of state (Collett & Auger 2014; Cao et al. 2015) and precision measurements of the Hubble constant (Schechter et al. 1997; combined with dynamical and stellar population synthesis analyses Ferreras et al. 2010; Spiniello et al. 2011; Brewer et al. 2012; Barnabè et al. 2013; Sonnenfeld et al. 2015; Posacki et al. 2015; Spiniello et al. 2015; Leier et al. 2016; Sonnenfeld et al. 2018b; Vernardos 2018). Finally, strong lenses can act as a "Cosmic Telescope", providing a magnified view of otherwise unresolved background sources (e.g., Impellizzeri et al. 2008; Swinbank et al. 2009; Richard et al. 2011; Deane et al. 2013; Treu et al. 2015; Mason et al. 2017; Salmon et al. 2017; Kelly et al. 2018).
The above-listed studies have typically been carried out using samples of tens to, at most, about a hundred massive lens galaxies ($M \gtrsim 10^{11}\,M_\odot$), and are often limited to redshifts $z \lesssim 0.5$ and/or are inhomogeneously selected. Current results are therefore often limited by sample size or cosmic variance. Creating more substantial, homogeneously selected samples of gravitational lenses, which extend to lower-mass galaxies and higher redshifts, will reduce the effects of "small-number statistics" and allow an improved study of lens galaxies as a function of galaxy properties and evolutionary state. In particular, Vegetti & Koopmans (2009) estimate that it is possible to compute sub-halo mass fractions of lens galaxies to a level of 0.1 per cent with only ∼ 50 lens systems. With the same number of lenses, it is possible to reach per cent level precision in estimating their mass density slopes (Barnabè et al. 2011). Therefore a much larger number of galaxy-scale lenses can improve the outcome of these analyses and enable a proper statistical comparison with the results obtained from lens simulations (e.g., Xu et al. 2016; Li et al. 2016; Mukherjee et al. 2018). Moreover, the precision of the value of $H_0$ can be improved to the level of a few per cent by studying a sample of about 40 strong lenses with measured time delays (Jee et al. 2016; Shajib et al. 2018). Collecting large samples of strong lenses, furthermore, gives us better access to the high-redshift universe and increases the probability of discovering double Einstein-ring and other "exotic" lenses (e.g., Tu et al. 2009; Cooray et al. 2011; Brammer et al. 2012; Tanaka et al. 2016). Finally, samples of homogeneously selected strong lenses are needed to characterize the selection function of a strong lens survey, allowing one to map measurements carried out on strong lenses back to the general population of galaxies. We refer the reader to the LSST Science Book (LSST Science Collaboration et al. 2009) and the Euclid Strong Lensing white paper (Euclid Strong Lensing team, 2018, in prep) for a more detailed discussion of future scientific applications of strong gravitational lenses.
The largest homogeneously-selected sample of confirmed strong lenses is the Sloan Lens ACS Survey (SLACS; Bolton et al. 2006, 2008), which yielded more than a hundred spectroscopically confirmed strong lenses with complete redshift information and high-resolution imaging follow-up (with e.g., the Hubble Space Telescope and Keck Observatory Adaptive Optics). In total, all lens surveys combined have produced up to a thousand highly-likely gravitational lens candidates (e.g., Browne et al. 2003; Faure et al. 2008; Treu et al. 2011; Inada et al. 2012; Brownstein et al. 2012; More et al. 2012; Stark et al. 2013; Sonnenfeld et al. 2013a; Gavazzi et al. 2014; More et al. 2016; Shu et al. 2016, 2017). Ongoing wide-field optical-IR surveys are expected to make the next giant step forward by yielding thousands of new lenses (Collett 2015; Petrillo et al. 2017). The first new lens candidates have already been discovered (Petrillo et al. 2017; Hartley et al. 2017; Diehl et al. 2017; Sonnenfeld et al. 2018a; Spiniello et al. 2018; Jacobs et al. 2018; Wong et al. 2018) in the Kilo-Degree Survey (KiDS; de Jong et al. 2013), in the Hyper Suprime-Cam Subaru Strategic Program (HSC; Miyazaki et al. 2012), and in the Dark Energy Survey (DES; The Dark Energy Survey Collaboration 2005). Similarly large samples are expected from deep sub-mm observations by e.g., the Herschel telescope (Negrello et al. 2010), the South Pole Telescope (SPT; Carlstrom et al. 2011), and the Atacama Large Millimeter/sub-millimeter Array (ALMA). These telescopes have already uncovered hundreds of new lens candidates (Vieira et al. 2013; Negrello et al. 2017). Within the next decade, $\sim 10^5$ strong lenses are expected to be found in future surveys (Oguri & Marshall 2010; Pawase et al. 2014; Collett 2015; McKean et al. 2015) utilising, e.g., ESA's Euclid mission (Laureijs et al. 2011), the Large Synoptic Survey Telescope (LSST Science Collaboration et al. 2009) and the Square Kilometer Array.
In particular, these surveys will allow lower-mass and higher-redshift lenses to be found, thanks to their deeper and higher angular resolution observations. Moreover, it will become possible to follow up promising targets at an even higher angular resolution with ALMA and the European Extremely Large Telescope (E-ELT). A future SKA-VLBI facility could, in addition, investigate milli-arcsecond angular scales of the lensed images for the effects of dark-matter line-of-sight and sub-halos (Spingola et al. 2018), enabling one to study small deviations from the smooth mass model of the lens.
Strong gravitational lenses are scarce objects within the total population of galaxies. In current surveys, of the order of one strong lens exists per few hundred to a thousand galaxies. This number strongly depends on galaxy mass and selection criteria, with the number of lenses peaking around $M_*$ galaxies for source-selected samples and at larger masses when lenses are selected as luminous red galaxies (LRGs). Their rarity makes it essential to develop robust lens-finder algorithms and deploy them in streamlined data-processing pipelines. This end-to-end automation will drastically reduce, and possibly prevent entirely, the need for future visual inspection of millions of potential lens candidates (e.g., Lenzen et al. 2004; Horesh et al. 2005; Alard 2006; Estrada et al. 2007; Seidel & Bartelmann 2007; Kubo & Dell'Antonio 2008; More et al. 2012; Maturi et al. 2014; Joseph et al. 2014; Gavazzi et al. 2014; Agnello et al. 2015; Brault & Gavazzi 2015; Chan et al. 2015; Stapelberg et al. 2019; Hartley et al. 2017; Petrillo et al. 2017, 2019; Jacobs et al. 2017; Sonnenfeld et al. 2018a; Spiniello et al. 2018).
In light of such an automation strategy, we recently developed (Petrillo et al. 2017), and more recently improved upon (Petrillo et al. 2019), a new convolutional neural network (ConvNet) lens-finder algorithm. The objective of this LinKS paper is to report on how we use ConvNets in an automated lens-search pipeline, and on the results of applying these networks to galaxies selected from ∼ 900 square degrees of KiDS Data Release 4. The core result that we present is an automatically selected sample of 3500 rank-ordered strong-lens candidates. From this ConvNet pre-selected sample, several subsamples of higher confidence candidates are distilled through human visual inspection.
In Section 2, we provide a brief introduction to KiDS, the imaging and catalogue data that are used in this paper. In Section 3, we explain how we select a subsample of intrinsically luminous (red) galaxies from the colour-magnitude diagram of KiDS galaxies, as well as the methodology used to identify gravitational lens candidates within that colour-magnitude selected subsample. In Section 4, we present the gravitational lens candidates found from the most conservative sample selection. In Section 5, we apply the networks to a wider selection of galaxies, limited only in their apparent brightness, to examine the efficiency of the algorithm in the extremely data-heavy regimes expected from future astronomical surveys such as Euclid and LSST, which may also have restricted colour information. In the same section we also present a "bonus sample" of inhomogeneously selected lens candidates that were identified serendipitously during various past experiments in the development of the final ConvNets. Lastly, in Section 6, we summarise our main conclusions.

DATA FROM THE KILO-DEGREE SURVEY
The Kilo-Degree Survey (KiDS; de Jong et al. 2013) is an ESO public survey carried out with the OmegaCAM wide-field imager (Kuijken 2011) mounted on the VLT Survey Telescope (VST; Capaccioli & Schipani 2011) at the Paranal Observatory in Chile. The telescope, camera, and survey have been designed to obtain images with sub-arcsecond seeing and homogeneous image quality both across the full field of view and throughout the survey execution. In this way the survey yields a large and homogeneous galaxy sample. The size and homogeneity of this sample are required for the survey's primary science drivers, which include placing strong constraints on both the distribution of matter across cosmic time and the cosmological parameters of the universe through weak-lensing measurements, i.e. of the subtle distortions introduced in galaxy shapes by cosmic shear (e.g., Hildebrandt et al. 2017). At the same time, the combined power of the survey's superb image quality and wide area makes KiDS optimal for strong-lensing studies (Napolitano et al. 2016; Petrillo et al. 2017; Spiniello et al. 2018). OmegaCAM has a one square degree field of view, with pixels that have an angular scale of 0.21 arcseconds, and KiDS will survey a total of ∼1350 square degrees in four optical bands (u, g, r, i). In this paper, we make use of 904 tiles that form a subset of the KiDS Data Release 4 (KiDS ESO-DR4, Kuijken et al. 2018, in prep.). The analysis performed uses imaging data, and derived products, produced within the Astro-WISE information system (Valentijn et al. 2007; McFarland et al. 2013). We make use of the single-band and multi-band catalogues of the KiDS-DR4.

The "full sample"
The target extraction and the associated photometry have been obtained using S-Extractor (Bertin & Arnouts 1996). To optimise the initial lens searches, we pre-select a sample of luminous galaxies with reliable photometric data. We proceed in the following way: (a) We select sources with an S-Extractor r-band FLAGS value < 4, thereby including only deblended sources and removing from the catalogue objects with incomplete or corrupted photometry, saturated pixels, or any other blending or extraction related problem. (b) We further reject galaxies in areas compromised by, e.g., stellar diffraction spikes and reflection halos, by selecting sources with the flag IMA_FLAGS set to zero for all four KiDS bands. (c) We select sources with a Kron-like magnitude MAG_AUTO in the r-band brighter than 20th magnitude, in order to maximize the lensing cross-section (Schneider et al. 1992). (d) Finally, we select sources with flag 2DPHOT equal to 1, as derived by the star-galaxy separator software 2DPHOT (La Barbera et al. 2008), in order to select secure galaxies.
To reduce the contamination by stars further, we select only objects with a FWHM in the r-band greater than the 90th percentile of the FWHM distribution of star-like objects within the same tile (those with 2DPHOT equal to zero). We adopt this strategy to reach a suitable compromise between filtering out stars and not excising too many galaxies from the sample. This selection procedure results in a sample of nearly one million (specifically 930 651) targets, which we will refer to as the "full sample" in the remainder of the paper.
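The cuts (a) to (d) and the per-tile FWHM cut described above can be sketched as a boolean mask over a catalogue. This is only an illustrative sketch: the catalogue is assumed to be a dictionary of NumPy arrays, all column names (`FLAGS_r`, `IMA_FLAGS_u`, `CLASS_2DPHOT`, `FWHM_r`, `TILE`, ...) are hypothetical, and the "90 percentile" threshold is read here as the 90th percentile of the stellar FWHM distribution in each tile.

```python
import numpy as np

BANDS = ("u", "g", "r", "i")  # the four KiDS optical bands


def select_full_sample(cat):
    """Return a boolean mask of targets passing cuts (a)-(d) and the FWHM cut.

    `cat` is a dict of equal-length NumPy arrays; every key is a
    hypothetical column name chosen for this illustration.
    """
    galaxy = cat["CLASS_2DPHOT"] == 1                 # (d) secure galaxies
    base = (cat["FLAGS_r"] < 4) & galaxy              # (a) clean r-band extraction
    base &= cat["MAG_AUTO_r"] < 20.0                  # (c) brighter than r = 20
    for band in BANDS:                                # (b) unmasked in all bands
        base &= cat[f"IMA_FLAGS_{band}"] == 0

    keep = np.zeros(base.shape, dtype=bool)
    for tile in np.unique(cat["TILE"]):
        in_tile = cat["TILE"] == tile
        stars = in_tile & (cat["CLASS_2DPHOT"] == 0)  # star-like objects in this tile
        if stars.any():
            fwhm_cut = np.percentile(cat["FWHM_r"][stars], 90)
        else:
            fwhm_cut = -np.inf                        # assumption: no stars, no FWHM cut
        keep |= in_tile & base & (cat["FWHM_r"] > fwhm_cut)
    return keep
```

Computing the FWHM threshold tile by tile, rather than once for the whole survey, lets the cut follow the seeing of each individual observation.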

The luminous red galaxy sample
Luminous Red Galaxies (LRGs; Eisenstein et al. 2001) are massive galaxies which, as a result, are more likely to exhibit strong lensing features than other classes of galaxies (see Turner et al. 1984; Fukugita et al. 1992; Kochanek 1996; Chae 2003; Oguri 2006; Möller et al. 2007). We select LRGs from the full sample, defined earlier, using the low-redshift (z < 0.4) LRG colour-magnitude selection of Eisenstein et al. (2001). We slightly adapt this selection to include fainter and bluer sources: The magnitudes are S-Extractor MAG_AUTO. In this section we chose to limit our analysis to the Astro-WISE single-band object detection catalogues. We determine the u, g, r, i photometry for each object using the individual S-Extractor MAG_AUTO measurements. As these measurements are made using slightly different centroids and the PSF varies significantly between bands, we do not expect this "first-look" LRG selection methodology to be uniform. As our aim is not to compile a complete sample of LRGs, however, we do not expect this decision to impact our conclusions. We note that after the analysis for this project began, Vakili et al. (2018) presented a sophisticated methodology to select LRGs for clustering studies in KiDS-DR3. Future LinKS analyses will investigate adopting this LRG sample. Our selection results in a sample of 88 327 sources, which we refer to as the "LRG sample" throughout the remainder of this paper. Note that our goal here is to select a reasonable number of massive (LRG) galaxies, without significant contamination by spiral galaxies, but that this sample need not consist strictly of LRGs. We find an average of 98 sources selected per tile with a standard deviation of ∼ 43. This standard deviation is high, but expected given the "first-look" methodology that we have adopted to compile this sample, in addition to the high levels of cosmic variance expected for this highly biased galaxy sample.

SEARCHING FOR LENSES
To find gravitational-lens candidates in KiDS imaging data, we use the ConvNets previously introduced by Petrillo et al. (2019). These networks are significantly improved variants of the original ConvNet presented by Petrillo et al. (2017). ConvNets (Fukushima 1980; LeCun et al. 1998) represent a state-of-the-art method of pattern recognition (Russakovsky et al. 2015). The networks learn how to classify a diverse set of images during the so-called training phase, whereby labelled images are provided to the ConvNet. Its weight parameters are changed to minimise a pre-defined loss function, which expresses the difference between the labels of the images and the output values p (one for each image) of the ConvNet. For a more detailed introduction to ConvNets for finding lenses we refer the interested reader to Petrillo et al. (2017), and to more general reviews by Schmidhuber (2015), LeCun et al. (2015) and Guo et al. (2016).
An international challenge was recently organised to evaluate methods for identifying images of simulated gravitational lenses, in preparation for the Euclid mission (Metcalf et al. 2018). The results of this challenge demonstrated that ConvNets, together with Support Vector Machines (SVMs), are among the most promising methods for finding lens candidates currently available. As a proof of concept, ConvNets have been used to find new gravitational lens candidates by Petrillo et al. (2017) in the KiDS DR3 and by Jacobs et al. (2017) in the Canada-France-Hawaii Telescope Legacy Survey (CFHTLS) and in DES (Jacobs et al. 2018).
In terms of methodology and target selection our analysis differs from the work done by Spiniello et al. (2018), who have focused their search exclusively on lensed quasar candidates in KiDS, by visually inspecting targets pre-selected using optical/infrared colours. Lens candidates have also been found in the KiDS DR3 data by Hartley et al. (2017), who trained a Gabor-SVM finder.

Training the Convolutional Neural Networks
We start by giving a brief synopsis of our ConvNets and the training procedure, as reported by Petrillo et al. (2019). Building on our experience, we choose to deploy two different ConvNets. One focusses on utilising the best morphological information by taking the best-seeing, i.e., r-band, images as input. The other ConvNet exploits colour information in addition to morphological information by taking 3-band RGB images as input. The RGB images are created with HumVI (Marshall et al. 2016) using the g, r and i bands. In both cases, the KiDS images have a size of 101 × 101 pixels (i.e. 20 × 20 arcseconds) with the central pixel corresponding to the centre of the galaxy of interest. The ConvNets take these images and transform them into a single value, p, which can vary between 0 and 1. This value represents, to some degree (see e.g., Saerens et al. 2002), the probability that the input image is a lens (see also Section 3.2). The input size of 20 × 20 arcseconds is chosen to be sufficiently large as to enclose most galaxy-scale lens systems, and sufficiently small as to both avoid contamination by unrelated field objects and allow for a ConvNet with a practical memory requirement.
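As an illustration of the input format, the following sketch extracts a 101 × 101 pixel cutout centred on a target pixel from a single-band image array. It is not the pipeline used in this work: the function name `cutout`, the array layout `(row, column)`, and the zero padding applied near tile edges are assumptions made for the example.

```python
import numpy as np

PIXEL_SCALE = 0.21   # arcsec per pixel (OmegaCAM)
CUTOUT_SIZE = 101    # pixels per side, odd so that a central pixel exists


def cutout(image, x, y, size=CUTOUT_SIZE):
    """Return a size x size stamp whose central pixel is image[y, x].

    Pixels falling outside the image are filled with zeros (an assumption
    of this sketch, not a statement about the survey pipeline).
    """
    half = size // 2
    padded = np.pad(image, half, mode="constant", constant_values=0.0)
    # image[y, x] sits at padded[y + half, x + half], so this slice is
    # centred on the requested pixel.
    return padded[y:y + size, x:x + size]


# Side length on the sky: 101 pixels * 0.21 arcsec = 21.21 arcsec,
# i.e. the roughly 20 arcsec quoted in the text.
side_arcsec = CUTOUT_SIZE * PIXEL_SCALE
```

An odd side length guarantees that the galaxy of interest sits exactly on the central pixel of the stamp.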
We use two classes of objects to train the ConvNets: (1) the lenses labelled with a 1.0, and (2) the non-lenses, labelled with a 0.0.
(1) For the lenses, we use a set of ∼ 6000 KiDS LRGs on which we superimpose simulated lensed images. The simulated lensed images ($\sim 10^6$ in number) are composed mostly of high-magnification rings, arcs and quads. The gravitational-lens mass distribution adopted in our simulations is assumed to be that of a Singular Isothermal Ellipsoid (SIE, Kormann et al. 1994) perturbed by additional Gaussian Random Field (GRF) fluctuations and an external shear. An elliptical Sérsic (1968) brightness profile is used to represent the lensed sources, to which we add several small internal structures (e.g., star-formation regions, satellite galaxies), described by circular Sérsic profiles. For each background source, we extract magnitudes from the "COSMOS" models provided by the code Le Phare (Arnouts et al. 1999; Ilbert et al. 2006) in order to simulate realistic gri-composite images. The lens and source parameters vary according to the values in Table 1 of Petrillo et al. (2019).
(2) The non-lenses are a collection of ∼ 12 000 galaxies from KiDS. This sample is the union of: (a) the same LRGs used for the lenses; (b) randomly selected galaxies from the survey with an r-band magnitude brighter than 21; (c) 'false positives' (e.g., mergers, ring galaxies, etc.) from earlier ConvNets; and (d) a sample of galaxies that were visually classified as spirals from an on-going Galaxy-Zoo project (Willett et al. 2013, Kelvin et al., in prep.).
A more detailed description of the training sample preparation, the results of the training phase, and a detailed discussion of the performance of the ConvNets are presented in Petrillo et al. (2019).

Application to the LRG sample
The ConvNets described in the previous subsection are both applied to the LRG sample, and only targets with p > 0.8 (returned from either of the ConvNets) are selected. This threshold is chosen to obtain a reasonable number of 'true positives' and, at the same time, not contaminate the sample with a large number of 'false positives'. Petrillo et al. (2019) present an extensive analysis of the performance of these ConvNets by choosing different p-value thresholds. With this threshold, the 3-band ConvNet picks 1689 candidates, while the one-band ConvNet picks 2510 candidates. These numbers correspond to fractions of ∼ 1.9 and ∼ 2.8 per cent of the LRG sample, respectively. We find a total of (exactly) 3500 unique candidates with p > 0.8 since 699 galaxies are common between both ConvNets. We refer to this sample of 3500 unique targets as the ConvNet sample.
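The counts quoted above follow from a simple union of the two ConvNet selections. The short sketch below reproduces the arithmetic and shows how the selection could be expressed on arrays of ConvNet outputs; the function name `convnet_sample` and its argument names are illustrative assumptions.

```python
import numpy as np

N_LRG = 88_327          # size of the LRG sample
N_THREE_BAND = 1689     # candidates with p > 0.8 from the 3-band ConvNet
N_ONE_BAND = 2510       # candidates with p > 0.8 from the one-band ConvNet
N_COMMON = 699          # candidates selected by both ConvNets

# Inclusion-exclusion: |A or B| = |A| + |B| - |A and B|
n_unique = N_THREE_BAND + N_ONE_BAND - N_COMMON

# Percentage of the LRG sample selected by each network
pct_three_band = 100.0 * N_THREE_BAND / N_LRG
pct_one_band = 100.0 * N_ONE_BAND / N_LRG


def convnet_sample(p_one_band, p_three_band, threshold=0.8):
    """Boolean mask of targets selected by either ConvNet (hypothetical helper)."""
    p_one_band = np.asarray(p_one_band)
    p_three_band = np.asarray(p_three_band)
    return (p_one_band > threshold) | (p_three_band > threshold)
```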
By setting the threshold value of p to 0.8, however, we still expect the presence of many false positives in the ConvNet sample (∼ 90 per cent; Petrillo et al. 2019). To validate the candidates selected by the ConvNets, we conduct a visual inspection: seven of the authors of this paper, referred to as "classifiers", examine the 101 × 101 pixel RGB composite image of each candidate, created with STIFF (Bertin 2012; http://www.astromatic.net/software/stiff). The classifiers have only three possible choices for each source: Sure lens, Maybe lens, and No lens. We translate each of these categories into a numerical value in the same way as was done by Petrillo et al. (2017): (A) Sure lens, 10 points; (B) Maybe lens, 4 points; (C) No lens, 0 points.
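The scoring scheme can be written down in a few lines. The sketch below simply sums the points awarded by the classifiers for one candidate; the vote labels `"sure"`, `"maybe"` and `"no"` and the function name are illustrative choices, not part of the original analysis.

```python
# Points per vote, as in the A/B/C scheme described above
POINTS = {"sure": 10, "maybe": 4, "no": 0}


def inspection_score(votes):
    """Total visual-inspection score of one candidate.

    `votes` is a sequence with one entry per classifier, each being
    "sure", "maybe" or "no" (hypothetical labels for this example).
    """
    return sum(POINTS[vote] for vote in votes)


# Seven classifiers all voting "Sure lens" give the maximum possible score.
max_score = inspection_score(["sure"] * 7)
```

With seven classifiers, seven "Maybe lens" votes give 28 points, the same total as two "Sure lens" votes plus two "Maybe lens" votes.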
As a result, the maximum score that any one galaxy candidate can obtain is 70, i.e. when all human classifiers think it is surely a lens. A histogram with the numerical results of the visual inspection is shown in Figure 1. About 57 per cent of the initial 3500 candidates selected by the ConvNets (i.e., 1983 candidates) have at least one classifier selecting them as a Sure lens or Maybe lens. Only four candidates achieve the maximum score. Figure 2 Figure 3). Naturally this means that none of these confirmed lenses were flagged as Sure lens by all classifiers. However, these sources are often confirmed as lensed through high angular resolution HST (Hubble Space Telescope) follow-up, which makes it unsurprising that they are not classified as secure lenses in ground-based KiDS data. In the LRG sample there are six other known gravitational lenses which have not been identified by our ConvNets. However, the KiDS images of these objects do not exhibit striking lensing features and, thus, they are hardly recognizable as strong lenses.
The visual classification appears to depend on the signal-to-noise ratio. For example, the candidate SCJ083726+015639, found in HSC data by Sonnenfeld et al. (2018a), is present in two adjacent KiDS tiles, and the ConvNets retrieve it from both tiles (the ConvNets select three more HSC candidates). Nevertheless, the human classifiers, in general, give very different scores to the same candidate depending on the quality of the images (Figure 4). Thus, it is fair to assume that many 'good' candidates are lost from our sample if we preferentially select only those candidates with a high visual-inspection score. On the other hand, there are also clearly cases where the ConvNets select candidates without any human-identifiable lensing feature being present.
To examine the other extreme of the classification, Figures 5 and 6 present the candidates that the ConvNets classify with values of p > 0.999, along with the scores from our visual inspection. For the 3-band ConvNet, some of these extremely high-confidence ConvNet candidates received low visual classification scores; there is even a case with a visual-inspection score of zero. It is clear that there remain significant disagreements between human and ConvNet classifications, and that both classification methods are prone to some level of bias and error. Nonetheless, Figure 7  inspection score. Hence, even if the classification schemes from humans and ConvNets differ, both tend to agree to a certain extent on what constitutes a 'good' lens candidate.
Even if there is no obvious inspection score below which the candidates are no longer reliable, we nonetheless observe an increase in the fraction of good candidates with increasing score. Therefore, by defining some fiducial threshold for the visual inspection score, above which one considers the targets as reliable candidates, we can investigate how the number of retrieved candidates (and the degree of contamination) varies as a function of the threshold set on the value of p. Figure 8 presents these correlations for all the ConvNet candidates and for a "bona fide" sub-sample composed of targets with a visual inspection score $\geq 28$. This is a fiducial value of the score, which corresponds to (a) Maybe lens given by all the classifiers, or (b) Sure lens given by two classifiers and Maybe lens by two others. In particular, in the left panel of Figure 8 we see how the number of retrieved candidates changes as a function of the value of p, greatly decreasing as p approaches 1. This change is more gentle in the case of the "bona fide" sample. The right panel shows that the fraction of "bona fide" systems increases with p, reaching the lowest degree of contamination when p is close to 1. This latter result confirms the correlation between the visual inspection score and p, previously shown in Figure 7.
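The kind of threshold sweep summarised in Figure 8 can be sketched as follows. For each threshold on p, the code counts the retrieved candidates and the fraction of them with a visual inspection score of at least 28. The function and variable names are hypothetical, and the sketch takes a single array of ConvNet outputs as input.

```python
import numpy as np


def sweep_threshold(p, score, thresholds, min_score=28):
    """Count retrieved and "bona fide" candidates for each threshold on p.

    Returns a list of tuples (threshold, n_retrieved, n_bona_fide, fraction),
    where fraction = n_bona_fide / n_retrieved (NaN if nothing is retrieved).
    """
    p = np.asarray(p, dtype=float)
    score = np.asarray(score)
    rows = []
    for t in thresholds:
        retrieved = p > t
        n_retrieved = int(retrieved.sum())
        n_bona_fide = int((retrieved & (score >= min_score)).sum())
        fraction = n_bona_fide / n_retrieved if n_retrieved else float("nan")
        rows.append((t, n_retrieved, n_bona_fide, fraction))
    return rows
```

One minus the returned fraction gives the contamination at each threshold, when only targets above the fiducial score are counted as reliable.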

THE LINKS SAMPLE CANDIDATES
We define the "LinKS (Lenses in the Kilo-Degree Survey) sample" as the full sample of 1983 gravitational lens candidates retrieved with p > 0.8 and a score from the visual inspection greater than zero. The sample contains five previously confirmed strong lenses (see Figure 3; Cabanac et al. 2007; Bolton et al. 2008; Christensen et al. 2010; More et al. 2017) and 12 lens candidates discovered in the HSC data (Sonnenfeld et al. 2018a; Wong et al. 2018). This sample also contains the "bona fide" subsample, composed of the 89 candidates which have a visual inspection score $\geq 28$, which we defined in Section 3.2. We note that by relaxing this inspection score requirement further, for example to 16 (i.e. the score corresponding to four Maybe lens votes), we are able to produce a subsample of 308 candidates. Nonetheless, we opt to define our "bona fide" sample using the more stringent requirement of 28. Information about the data products provided for the LinKS sample, along with images for each of the 89 "bona fide" candidates, is provided in Appendix A. Additional information is also provided at the LinKS webpage (http://www.astro.rug.nl/lensesinkids).

Candidate properties
In this section we summarise the main characteristics of the LinKS sample. To enable this analysis, we rely on candidates with known spectroscopic redshifts publicly available from SDSS DR14, GAMA DR3 and 2dFLenS (Abolfathi et al. 2018; Baldry et al. 2018; Blake et al. 2016). We also incorporate accurate multi-band colours as measured by the Gaussian Aperture and PSF (GAaP) code. Briefly, GAaP produces fluxes measured in Gaussian-weighted apertures, which are modified per-source and per-image, so as to produce seeing-independent flux estimates across different observations/bands. The aperture modification calculation requires that the PSF of the image be both homogeneous and Gaussian, and so prior to running GAaP each survey tile has its PSF Gaussianised over the full field of view. Importantly, GAaP magnitudes are not total, and preferentially weight the central, redder parts of our lens galaxies. This acts to reduce the contamination from the outer (blue) features of the lens candidates (i.e. the lensed arcs), and improves the fidelity of lens-candidate SED models. In this section we have chosen to limit our analysis to the LinKS sample in the KiDS-North patch. (The fourth KiDS data release consists of multi-band GAaP catalogues for both the Northern and Southern patches, but we chose to limit our analysis to a preliminary set of 497 tiles that were processed by Astro-WISE at the start of this analysis. We note that some improvements have been made to the GAaP catalogues during the course of this work; in particular, the calibration of the u-band zero-points has been refined. We do not expect these updates to significantly impact our conclusions.) This selection reduces our LinKS sample to 659 candidates, of which 41 (out of 89) are in the "bona fide" subsample. We show the observer-frame g − r colour as a function of redshift for these candidates in the left panel of Figure 9. Due to our initial selection criteria (see Section 2.2) all of our candidates exhibit red colours, with g − r ∼ 0.8 at z ∼ 0 and g − r ∼ 1.7 at the highest redshifts, z ∼ 0.5. Visually the "bona fide" candidates seem to sample the colour distribution of the entire sample without bias; they are otherwise unremarkable.
To further characterize the sample of candidates, and allow for a comparison with the literature, we then estimate stellar masses for the subsample of our sources with spectroscopic redshifts. (Robust stellar masses are available from the literature for those KiDS galaxies that are also contained in SDSS and GAMA. However, in order to have homogeneous results for all the candidates, we determine the masses for the whole sample using KiDS 4-band photometry.) Following Petrillo et al. (2017), we estimate stellar masses using the software Le Phare (Arnouts et al. 1999; Ilbert et al. 2006), which performs a $\chi^2$ fit between colours from stellar population synthesis (SPS) models and the observed colours. We employ single-burst SPS models from Bruzual & Charlot (2003, BC03) and a Chabrier (2001) IMF, allowing the stellar population age to vary and assuming metallicities in the range 0.005-2.5 $Z_\odot$. The maximum age is set by the age of the Universe at the redshift of the galaxy, with a maximum value at z = 0 of 13 Gyr. We do not consider internal extinction, and our models assume zero redshift uncertainty. We adopt the GAaP ugri magnitudes MAG_GAaP and related 1σ uncertainties (Kuijken et al. 2018, in prep.), corrected for Galactic extinction using the map by Schlafly & Finkbeiner [...] dex. Stellar masses are shown as a function of redshift in Figure 9, and compared with SLACS (Auger et al. 2009) and SL2S (Sonnenfeld et al. 2013b) data. Consistent with Petrillo et al. (2017), the selected candidates have redshifts in the window $0.1 \lesssim z \lesssim 0.5$, with a median value of 0.33, while the stellar masses are typically larger than $10^{11}\,M_\odot$, with an average value of $\sim 2 \times 10^{11}\,M_\odot$.
We note, of course, that the choice of IMF significantly influences the final mass estimates; using a Salpeter (1955) IMF instead of a Chabrier IMF causes inferred stellar masses to increase by a factor of ∼ 2 with no change to observed colours (Tortora et al. 2009). The "bona fide" candidates are shown in green in both panels. They span a similar range of redshifts and masses as the whole sample, with a marginal indication that they may preferentially sample higher stellar masses.

Predictions and Prospects: Euclid and LSST
Using the LensPop code presented in Collett (2015), Petrillo et al. (2017) estimated that the maximally retrievable number of strong lens candidates in a fully complete KiDS survey would be ∼ 2400. For a ∼ 900 square degree area such as that considered in this paper, ignoring the masked area of the survey, we would therefore expect to find ∼ 1700 possible strong lenses. If we further consider only those lenses that satisfy our LRG colour-magnitude cuts (Section 2.2), and which have an Einstein radius larger than one arcsecond (i.e. the range on which the ConvNets have been trained; see Table 1 in Petrillo et al. 2019), this number reduces further to ∼ 450 retrievable strong lenses. Their average distribution in redshift is consistent with the actual distribution of our retrieved candidates of the previous subsection, peaking at a value of z ∼ 0.3. Our samples therefore bracket the predicted ∼ 450 retrievable strong lenses from LensPop: the full sample of LinKS candidates contains ∼ 4× the predicted number, and the "bona fide" sample ∼ 5× fewer. We note again, though, that by relaxing the visual inspection score requirement to, e.g., 16 (the score corresponding to four Maybe lens votes) one can create a wider "bona fide" sample containing 308 candidates, i.e. ∼ 68 per cent of the retrievable lenses predicted by LensPop. Nonetheless, we continue to conservatively consider only the 89 sources in our "bona fide" subsample to be genuine lenses, and conclude that this sample is complete at the level of ∼ 20 per cent.
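The ratios quoted in this comparison follow directly from the sample sizes. The short sketch below reproduces them, with variable names chosen only for illustration.

```python
N_PREDICTED = 450    # retrievable lenses predicted with LensPop
N_LINKS = 1983       # full LinKS sample
N_BONA_FIDE = 89     # candidates with inspection score of at least 28
N_RELAXED = 308      # candidates with inspection score of at least 16

# Full LinKS sample relative to the prediction (about 4x)
ratio_full = N_LINKS / N_PREDICTED

# Prediction relative to the bona fide sample (about 5x)
ratio_bona_fide = N_PREDICTED / N_BONA_FIDE

# Completeness, assuming every selected candidate is a genuine lens
completeness_bona_fide = N_BONA_FIDE / N_PREDICTED   # about 20 per cent
completeness_relaxed = N_RELAXED / N_PREDICTED       # about 68 per cent
```

These completeness values are upper limits in the sense that they treat every candidate in the subsample as a true lens.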
In the following, we predict the number of lenses expected in future surveys utilising the depth and breadth of the future Euclid and LSST surveys, and the performance of our ConvNets in retrieving strong lenses within these future datasets.
Euclid. Collett (2015) predicts that there will be ∼ 170 000 potential lenses in Euclid. Petrillo et al. (2019) extended this analysis by estimating the number of lenses with an Einstein radius larger than 1 arcsecond and with a redshift z < 0.5, which roughly corresponds to our LRG colour cut selection. This reduces the number of potential strong lenses to ∼ 20 000 in the 15 000 square degrees of the completed survey. With the same strategy used in this paper, we conservatively estimate that between ∼ 5000 and ∼ 15 000 lenses will be retrievable with ConvNets from the completed Euclid survey. These numbers assume that the 1-band ConvNet performs at least as well on Euclid data as it does on KiDS data, in the same parameter-domain, and that it is possible to pre-select LRGs with the aid of ground-based multiband observations and the IR-bands from Euclid. We note, though, that Euclid data will have better image quality than KiDS, which will allow the training of more effective algorithms over a wider parameter space. Furthermore, it will allow improved recognition and rejection of false positives via visual inspection. These considerations all lead to our assessment that our estimate of the number of retrievable strong lenses is conservative.
LSST. The above forecast can also be performed for LSST, and moreover with greater accuracy, as LSST will observe in the same g, r, and i filters as does KiDS. We find that the number of potentially discoverable lenses in LSST, with an Einstein radius larger than one arcsecond and with our invoked LRG colour selection, is ∼ 20 000 over the 20 000 square degrees of the completed survey. Therefore, as in Euclid, we estimate that between ∼ 5000 and ∼ 15 000 lenses may be retrievable from the completed LSST survey data with our ConvNets.

THE FULL SAMPLE CANDIDATES
Visual inspection of strong lens candidates selected by the ConvNets is a time-consuming task, but investing such time to achieve increased purity and completeness of the recovered candidate sample is worth the effort. Lowering the p threshold above which lens candidates are defined, or significantly increasing the survey area (and thus the absolute number of p > 0.8 candidates), naturally exacerbates this task. As such, performing the visual inspections completed here for much larger target samples, such as those expected from Euclid and LSST, will likely be prohibitive. In these cases, one may want to reduce the number of candidates to visually inspect by increasing the p threshold required for candidacy. However, it is unclear how such an increase may influence the number of lens candidates returned. Furthermore, if the scientific aim is to establish a complete strong-lens sample that is unbiased in its lens properties, then such a high threshold may be counter-productive. The LRG sample used in this paper is a distinct sub-sample of massive Early Type Galaxies (ETGs) which lack (active) star formation and therefore have profiles which allow easier separation of foreground lenses from lensed images, which are often blue star-forming galaxies, as demonstrated in SLACS. In this work we use the LRG sample because we expect most of the lenses to be massive ETGs. However, selecting such a sample of galaxies is not always straightforward and can lead to the loss of potential lenses; LRGs do not represent the entire population of galaxies and hence the entire strong-lensing cross-section.
For this reason, it is interesting to see how the ConvNets perform on a less restricted and much larger sample of galaxies. We explore these issues in Section 5.1 by applying the ConvNets to the full sample, but with a higher threshold in p, in order to reduce the visual inspection effort. In Section 5.2, we then translate the outcome to the planned Euclid and LSST surveys and analyse the advantages and applicability of such a strategy. Finally, in Section 5.3 we present a composite sample of lens candidates collected from various ConvNets, applied to the full sample, that were run during the ConvNet optimisation process. Each of these runs was less efficient than the final ConvNets employed in the main body of this work, but sometimes yielded distinct lenses which we have subsequently collated.

A high-purity sample
We run the two ConvNets on the full sample (930 651 galaxies) rather than on the smaller but purer LRG sample (88 327 galaxies). To obtain a sample of lens candidates that is both pure and limited in size, and thereby reduce the visual inspection load, we average the predictions from both ConvNets into a single predictive parameter p. We select candidates with an average value of p larger than 0.999. With this selection we obtain just 30 strong lens candidates (Figure 10), i.e. 0.003 per cent of the full sample. When visually inspected, this sample proves to be extremely pure. In particular, it is composed of:
• 2 confirmed lenses (Cabanac et al. 2007; Bolton et al. 2008);
• 26 potential lenses;
• 2 possible contaminants.
This result attests to the capability of the ConvNets to find lens candidates in a sample slightly different from that on which they were trained. We note that 18 of the 30 candidates retrieved in this manner are not part of the LinKS sample because they did not satisfy the LRG cut in Section 2.2 (see Section 5.3 for more information on these candidates). We note further that it is entirely possible that some of these candidates fail our LRG colour-magnitude selection precisely because of contamination by the bright blue lensing features that we are attempting to locate; this is a clear drawback of such an LRG selection with imperfect photometry.
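The selection described in this subsection, averaging the two ConvNet outputs and keeping only targets above a high threshold, can be sketched as follows. This is a minimal illustration and not the code used for the paper: the function name is hypothetical and the scores are invented toy values.

```python
def select_high_purity(p_net1, p_net2, threshold=0.999):
    """Average two per-target ConvNet scores and apply a threshold.

    Returns a tuple (selected, p_mean): the indices of targets whose
    mean score exceeds `threshold`, and the list of mean scores.
    """
    if len(p_net1) != len(p_net2):
        raise ValueError("both score lists must have the same length")
    p_mean = [0.5 * (a + b) for a, b in zip(p_net1, p_net2)]
    selected = [i for i, p in enumerate(p_mean) if p > threshold]
    return selected, p_mean


# Toy scores for five hypothetical targets (not real KiDS predictions).
scores_net1 = [0.9995, 0.9999, 0.9980, 0.5000, 1.0000]
scores_net2 = [0.9999, 0.9970, 0.9999, 0.9999, 0.9990]

selected, p_mean = select_high_purity(scores_net1, scores_net2)
print(selected)
```

In this toy example a target is rejected when either network is only moderately confident, which illustrates why averaging with a threshold this close to unity yields a small sample.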

Small high-purity Euclid & LSST samples
Considering that, theoretically, the number of recoverable lenses in ∼ 900 square degrees of KiDS is at most ∼ 1700 (see Section 4.2), our recovery of only 30 candidates in Section 5.1 implies that a p > 0.999 setup will only recover ∼ 2 per cent of possibly retrievable lenses. If we turn this efficiency into a forecast for the 170 000 total retrievable lenses in the full Euclid survey as predicted by Collett (2015), we expect to find ∼ 3000 candidates with a > 90 per cent purity which are retrievable with minimal human intervention. Such a sample represents the so-called "low-hanging fruit" of strong lenses within Euclid, as these sources are expected to occupy a limited but easily accessible part of parameter space (i.e., large Einstein radii and low redshifts). Note again that we expect this number to be conservative, as with our other forecasts presented in Section 4.2, as Euclid lenses will be observed with a much higher angular resolution than KiDS lenses, and will be detected with ConvNets trained on higher fidelity data. Near-infrared colours will also help to down-select lens candidates since, being less sensitive to dust and mapping a wider wavelength baseline, they will provide a more efficient way to separate LRGs from star-forming galaxies. Nonetheless, even in this conservative case, the number of lenses forecast here would be one to two orders of magnitude larger than the number detected in any previous or ongoing strong lens survey. Finally, as in Section 4.2, a similar number of easy candidates may be expected from LSST surveys.
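The forecast above is a simple rescaling of the KiDS recovery fraction. A minimal sketch in Python, assuming (as the text does) that the same fraction applies to Euclid; the variable names are illustrative.

```python
# Numbers quoted in the text (illustrative variable names).
n_selected_kids = 30          # candidates with mean p > 0.999
n_retrievable_kids = 1700     # maximum recoverable lenses in ~900 sq. deg. of KiDS
n_lenses_euclid = 170_000     # potential Euclid lenses from Collett (2015)

# Fraction of retrievable lenses recovered by the high-threshold setup.
recovery_fraction = n_selected_kids / n_retrievable_kids

# Assumption: the same recovery fraction holds for Euclid data.
n_forecast_euclid = recovery_fraction * n_lenses_euclid

print(f"recovery fraction: {100 * recovery_fraction:.1f} per cent")
print(f"Euclid forecast:   {n_forecast_euclid:.0f} candidates")
```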

The "bonus sample"
The sample presented in this section includes 200 strong lens candidates discovered serendipitously during previous ConvNet runs that are not part of the LinKS sample. The candidates in this Bonus sample have not gone through the same rigorous visual inspection as those in the LinKS sample, and consequently cannot be considered to be as statistically well defined. However, if we apply the ConvNets to these candidates with a threshold p > 0.8, 160 candidates pass this threshold in at least one of the two ConvNets, i.e., 80 per cent of the sample. Detailed data related to this sample can be found online 14 (see the Appendix). This sample contains eight HSC survey lens candidates (Sonnenfeld et al. 2018a) and four confirmed lenses: J1452-0058, J142449-005322 (Tanaka et al. 2016), J010127-334319 (Bettinelli et al. 2016) and KiDS0239-3211 (Sergeyev et al. 2018).
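The criterion used here differs from that of Section 5.1: a candidate passes if either ConvNet alone exceeds the threshold, rather than the average of the two. A minimal sketch of this test, with a hypothetical function name and invented toy scores:

```python
def passes_either(p_net1, p_net2, threshold=0.8):
    """Flag targets whose score exceeds `threshold` in at least one ConvNet."""
    return [a > threshold or b > threshold for a, b in zip(p_net1, p_net2)]


# Toy scores for five hypothetical targets (not real KiDS predictions).
flags = passes_either([0.95, 0.40, 0.70, 0.85, 0.10],
                      [0.30, 0.90, 0.75, 0.99, 0.20])
fraction_passing = sum(flags) / len(flags)

print(flags, fraction_passing)
```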

DISCUSSION AND CONCLUSIONS
In this paper, we present several samples of lens candidates from the Kilo-Degree Survey (KiDS) which likely contain several hundred strong gravitational lenses. To generate these samples, we apply two new lens-finder algorithms, based on Convolutional Neural Networks (ConvNets), to a sample of 88 327 LRGs, selected via a colour-magnitude cut, from 904 one-square-degree tiles of KiDS data. We visually inspect the candidates selected by these ConvNets and conservatively select 1983 rank-ordered candidates, which we designate the LinKS sample (see Section 4). We further subset the data into subsamples of 219 more plausible candidates, and 89 highly likely candidates.
Figure 10. Images of the sample of candidates retrieved running the two ConvNets on the full sample, averaging the predictions and selecting those with p > 0.999. This sample is > 90 per cent pure and requires very little human intervention. The upper block of 12 galaxies is part of the LRG sample while the bottom 18 galaxies are exclusively part of the full sample. More information on the latter candidates can be found in the Appendix. Each image has dimensions 20 × 20 arcseconds.
We did not attempt to achieve a high level of statistical completeness in the samples of LRGs, nor in the samples of resulting lens candidates. We aimed instead to maximise the number of lens candidates while minimising the fraction of false positives. Our colour-magnitude selection (Section 2.2) aimed at choosing a large sample of massive (early-type) galaxies while specifically avoiding star-forming (e.g. spiral) galaxies and other contaminants. We note that Vakili et al. (2018) recently selected LRGs from KiDS data using the LRG colour-magnitude relation, and also computed their photometric redshifts; we anticipate that this sample could be utilised to compile a more statistically complete sample of KiDS LRG lens candidates in the future. In addition to the LinKS sample, in Section 5.3 we presented a Bonus Sample that consists of 200 lens candidates. These lenses were serendipitously discovered in KiDS data, e.g., during previous experiments with various ConvNets. While these sources have not been scrutinised as rigorously as our main LinKS sample, and we therefore do not consider the Bonus Sample to be as statistically well defined, it nonetheless contains a number of interesting strong lens candidates for future follow-up.
From our KiDS strong lens candidates (together with those found by Hartley et al. 2017 and Spiniello et al. 2018), the ∼ 600 galaxy-scale lens candidates found in DES (Diehl et al. 2017; Jacobs et al. 2018; Spiniello et al. 2019) and HSC data (Sonnenfeld et al. 2018a; Wong et al. 2018), it will soon be possible to select a sample of confirmed lenses similar in size to the total number of gravitational lenses known today. For example, the Masterlens database 15, which assembles information on all known gravitational lenses, contains a total of ∼ 600 gravitational lenses discovered up to 2016. It is possible that the total number could be, by now, up to ∼ 1000 confirmed lenses and lens candidates. We believe it likely that strong lens searches within the KiDS, DES, and HSC surveys could easily double this number, which was accumulated over many decades, within the next few years.
Despite the already considerable numbers of new lens candidates from KiDS, there are still many lens candidates to be discovered, especially in that part of parameter space that we have not, or rather not thoroughly, explored. In addition, the completed KiDS survey will cover an area of 1350 square degrees. We plan to apply our method to these completed KiDS data, together with that of Spiniello et al. (2018), to find lensed quasars. Applying other complementary methods, such as the SVM of Hartley et al. (2017), will aid in maximising the exploration of the parameter space.
Besides the LRG-selected sample, we have shown that it is possible to tune the ConvNets to yield a sample of lens candidates with considerable purity by using many more targets (i.e. about ten times more). In particular, we ran the lens-finders on a sample composed of 930 651 galaxies (not just LRGs) and retrieved a sample of 30 strong lens candidates with an expected purity of > 90 per cent. By selecting lens candidates in this way, we are able to considerably diminish the visual inspection load, although at the price of losing many genuine lenses. With a similar setup, though, it would be feasible to retrieve ∼ 3000 lens candidates from the future Euclid data set with minimal human intervention. A similar number would be found by LSST.
All these results can be enhanced further, especially by training the ConvNets with more complete training sets (Petrillo et al. 2019). In addition, a collection of genuine lens candidates, even in modest numbers, should allow one to fine-tune ConvNet lens-finders further to improve their classification capacity (Tuccillo et al. 2018; Domínguez Sánchez et al. 2018). New gravitational lenses can also be used as training sets for future crowdsourced searches (Marshall et al. 2016). Finally, the candidates identified in this paper could be used to build a benchmark against which different lens-finders can be tested and compared, similar to analyses done with simulated data (e.g., Metcalf et al. 2018).
Our results are very encouraging in light of future strong-lens surveys (e.g. those utilising Euclid and LSST) for which a naive strategy of visually inspecting galaxies to select lens candidates is entirely infeasible, given the enormous number of galaxies these new instruments will uncover. One can expect to compile samples of strong lenses from Euclid and LSST that are between one and two orders of magnitude ...
... (Taylor 2005) and STILTS (Taylor 2006) software have been extensively used in this project.
Author Contributions: All authors contributed to the development and writing of this paper. The authorship list is given in three groups: the lead authors (CEP, CT, GV, LVEK, GVK), followed by two alphabetical groups. The first alphabetical group includes those who are key contributors to both the scientific analysis and the data products. The second group covers those who have either made a significant contribution to the data products, or to the scientific analysis.