Is every strong lens model unhappy in its own way? Uniform modelling of a sample of 13 quadruply+ imaged quasars

Strong-gravitational lens systems with quadruply-imaged quasars (quads) are unique probes to address several fundamental problems in cosmology and astrophysics. Although they are intrinsically very rare, ongoing and planned wide-field deep-sky surveys are set to discover thousands of such systems in the next decade. It is thus paramount to devise a general framework to model strong-lens systems to cope with this large influx without being limited by expert investigator time. We propose such a general modelling framework (implemented with the publicly available software Lenstronomy) and apply it to uniformly model three-band Hubble Space Telescope Wide Field Camera 3 images of 13 quads. This is the largest uniformly modelled sample of quads to date and paves the way for a variety of studies. To illustrate the scientific content of the sample, we investigate the alignment between the mass and light distribution in the deflectors. The position angles of these distributions are well-aligned, except when there is strong external shear. However, we find no correlation between the ellipticity of the light and mass distributions. We also show that the observed flux-ratios between the images depart significantly from the predictions of simple smooth models. The departures are strongest in the bluest band, consistent with microlensing being the dominant cause in addition to millilensing. Future papers will exploit this rich dataset in combination with ground based spectroscopy and time delays to determine quantities such as the Hubble constant, the free streaming length of dark matter, and the normalization of the initial stellar mass function.


INTRODUCTION
Strong gravitational lensing is the effect where light from a background object is deflected by a foreground mass distribution (e.g. galaxy or galaxy cluster) and multiple images of the background object form. Strong gravitational lenses are powerful probes to answer a variety of astrophysical and cosmological questions (see, e.g., Treu 2010), as we discuss briefly below.
According to the concordance model in cosmology, our Universe consists of 5 per cent baryonic matter, 26 per cent dark matter, and 69 per cent dark energy accounting for a cosmological constant Λ (Planck Collaboration et al. 2018). This model is known as the Λ cold dark matter (ΛCDM) model. The predictions of the ΛCDM model have been extensively tested with good agreement to observations spanning from the largest scale up to the horizon down to ∼1 Mpc (e.g. Dawson et al. 2013;Shajib & Wright 2016;Planck Collaboration et al. 2018). However, there also have been observations that are in tension with the flat ΛCDM paradigm. For instance, there is a tension at the 3σ level between the local measurement of H 0 from Type Ia supernovae (Riess et al. , 2018aBernal et al. 2016) and that extrapolated from the Planck cosmic microwave background measurement for a flat ΛCDM cosmology. This tension may arise from unknown systematic uncertainties in one or both of the measurements, or might point to new physics, e.g. additional species of relativistic particles, a non-flat cosmology, or dynamic dark energy. Therefore, it is crucial to have precise and independent measurements of H 0 to settle this discrepancy.
In a gravitational lens, if the background source is timevariable (typically a quasar, but also a supernova as originally proposed), the delay between the arrival time of photons for the different images can be used to measure the socalled 'time-delay distance' (Refsdal 1964;Suyu et al. 2010). This distance is inversely proportional to H 0 , thus it can be used to constrain H 0 and other cosmological parameters (for a detailed review, see Treu & Marshall 2016). H 0 has been determined to 3.8 per cent precision using three lens systems in the flat ΛCDM cosmology (Suyu et al. 2010(Suyu et al. , 2013(Suyu et al. , 2017Sluse et al. 2017;Rusu et al. 2017;Wong et al. 2017;Bonvin et al. 2017;Tihhonova et al. 2018). With a large sample size of about 40 lenses, it is possible to measure H 0 with the per-cent precision (Jee et al. 2016;Shajib et al. 2018) necessary to resolve the H 0 tension and make the most of other dark energy probes (Linder 2011;Suyu et al. 2012;Weinberg et al. 2013).
One of the baryonic components in dark matter is lowmass star. Surprisingly, recent studies have shown that the low-mass star contribution in massive elliptical galaxies is significantly underestimated if the stellar initial mass function (IMF) of the Milky Way is assumed van Dokkum & Conroy 2010;Auger et al. 2010a;Cappellari et al. 2012;Schechter et al. 2014). Precise knowledge about the IMF is key in measuring almost any extragalactic quantity involving star and metal formation. Measuring the stellar mass-to-light ratio in the deflectors of quadruply imaged lensed quasars (henceforth quads) from microlensing statistics provides one of the most robust methods to constrain the IMF (e.g. Oguri et al. 2014;Schechter et al. 2014).
Until recently, all of these methods could only be applied to a small sample of known quads. However, such systems are currently being discovered at a rapidly increasing rate due to multiple strong-lens search efforts involving various large-area sky surveys (e.g. Williams et al. 2017Williams et al. , 2018Schechter et al. 2017;Agnello et al. 2017aAgnello et al. , 2018bSonnenfeld et al. 2018;Lemon et al. 2018;Anguita et al. 2018, Treu et al. 2018. With more deep wide-field surveys, e.g. Wide-Field Infrared Survey Telescope, Large Synoptic Survey Telescope, Euclid, etc., coming online within the next decade, the sample size of quads is expected to increase by two orders of magnitude or more (Oguri & Marshall 2010;Collett 2015).
Modelling such lens systems has so far been carried out for individual systems while fine-tuning the modelling approach on a case-by-case basis. However, with the rapidly increasing rate of discovery, it is essential to develop a modelling technique that is applicable to a wide variety of quads to efficiently reduce the time and human labour necessary in this endeavour. Given the large diversity in the morphology and complexity of quads, this is an interesting problem to pose: is every quad different or 'unhappy in its own way' that requires careful decision-making by a human in the modelling procedure, or are the quads similar or 'happy' to some extent so that a uniform modelling technique can be applied to generate acceptable models without much human intervention?
Recently, some initial strides have been undertaken along the lines of solving this problem. Nightingale et al. (2017) devised an automated lens modelling procedure using Bayesian model comparison. Hezaveh et al. (2017) and Perreault  applied machine learning techniques to automatically model strong gravitational lenses and constrain the model parameters. In this paper, we devise a general framework or decision-tree that can be applied to model-fitting of quads both in a single band and simultaneously in multiple bands. We implement this uniform modelling approach using the publicly available lens-modelling software Lenstronomy (Birrer & Amara 2018, based on Birrer et al. 2015) to a sample of 12 quads from the Hubble Space Telescope (HST ) data in three bands. Lenstronomy comes with the sufficient modelling tools and the architecture allows a build-up in complexity as presented in this work. We report the model parameters and other derived quantities for these lens systems.
To demonstrate the scientific capabilities of such a sam-ple of strong-lens systems, we study the properties of the deflector galaxy mass distribution, specifically the alignment of the mass and light distributions in them. The distribution of dark matter and baryons in galaxies can test predictions of ΛCDM and galaxy formation theories (e.g. Dubinski 1994;Ibata et al. 2001;Kazantzidis et al. 2004;Macciò et al. 2007;Debattista et al. 2008;Lux et al. 2012;Read 2014). N-body simulations with only dark matter particles predict nearly triaxial, prolate haloes (Dubinski & Carlberg 1991;Warren et al. 1992;Navarro et al. 1996;Jing & Suto 2002;Macciò et al. 2007). In the presence of baryons, the halos become rounder (Dubinski & Carlberg 1991;Dubinski 1994;Warren et al. 1992). For disk galaxies, the dark matter distribution is shown to be well-aligned with the light distribution (Katz & Gunn 1991;Dubinski & Carlberg 1991;Debattista et al. 2008).
As the lensing effect is generated by mass, strong gravitational lenses give independent estimates of the mass distribution that can be compared with the observed light distribution. A tight alignment within ±10 • between the major axes of the mass and light distribution has been observed for deflector galaxies with weak external shear, whereas galaxies with strong external shear can be highly misaligned (Keeton et al. 1998;Kochanek 2002;Treu et al. 2009;Sluse et al. 2012;Gavazzi et al. 2012;Bruderer et al. 2016). However, there has been some conflict on the correlation between the ellipticity of the mass and light distributions with reports of both strong correlation (Sluse et al. 2012;Gavazzi et al. 2012) and no correlation (Keeton et al. 1998;Ferreras et al. 2008;Rusu et al. 2016).
This paper is organized as follows. In Section 2, we describe the data used in this study. We describe our methodology in Section 3 and the results in Section 4. Finally, we summarize the paper followed by a discussion in Section 5. When necessary, we adopt a fiducial cosmology with H 0 = 70 km s −1 Mpc −1 , Ω m = 0.3, Ω Λ = 0.7, and Ω r = 0. All magnitudes are given in the AB system.

HST SAMPLE
Our sample consists of eleven quads and one five-image system. Some of these systems were discovered by the STRonglensing Insights into the Dark Energy Survey (STRIDES) 1 collaboration (Treu et al. 2018, submitted, Anguita et al. 2018, Ostrovski et al. 2018b, some are recent discoveries by independent searches outside of the Dark Energy Survey (DES), and some are selected from the literature. In this section, we first describe the high-resolution imaging data obtained through HST. We then briefly describe the lens systems in the sample.

Data
Images of the lenses were obtained using the HST Wide Field Camera 3 (WFC3) in three filters: F160W in the infrared (IR) channel, and F814W and F475X in the ultraviolet-visual (UVIS) channel (ID 15320, PI Treu). In the IR channel filter, we used a 4-point dither pattern and STEP100 readout sequence for the MULTIACCUM mode. This approach guarantees a sufficient dynamic range to expose both the bright lensed quasar images and the extended host galaxy. For the UVIS channel filters, we used a 2-point dither pattern. Two exposures at each position, one short and one long, were taken. Total exposure times for all the quads and the corresponding dates of observation are tabulated in Table 1.
The data were reduced using AstroDrizzle. The pixel size after drizzling is 0. 08 in the F160W band, and 0. 04 in the F814W and F475X bands.

Quads in the sample
In this subsection, we give a brief description of each quad in our sample (Fig. 1).

PS J0147+4630
This quad was serendipitously discovered from the Panoramic Survey Telescope and Rapid Response System (Pan-STARRS) survey (Berghea et al. 2017). The source redshift is z s = 2.341 ± 0.001 (Lee 2017) and the deflector redshift is z d = 0.5716 ± 0.0004 (Lee 2018). Initial models from the Pan-STARRS data suggests a relatively large external shear γ ext ∼ 0.13.

SDSS J0248+1913
This lens system was discovered in Sloan Digital Sky Survey (SDSS) imaging data using the morphology-independent Gaussian-mixture-model supervised-machine-learning technique described in Ostrovski et al. (2017) applied to SDSS u, g and i, and Wide-field Infrared Survey Explorer (WISE ) W1 and W2 catalogue level photometry (Ostrovski et al. 2018b, in preparation). The lensing nature was confirmed via optical spectroscopy with the Echellette Spectrograph and Imager (ESI) on the Keck telescope in 2016 December prior to the HST observations presented here and will be described in Ostrovski et al. (2018b, in preparation). Delchambre et al. (2018) report the independent discovery of this spectroscopically confirmed lensed system as a lensed quasar candidate using Gaia observations. The lens system resides in a dense environment with several other galaxies within close proximity. Part of the lensed arc from the extended source is noticeable in the F160W band in IR.

ATLAS J0259-1635
This lens system was discovered in VLT Survey Telescope (VST)-ATLAS survey from candidates selected with quasarlike WISE colours (Schechter et al. 2018). The source for this system is at redshift z s = 2.16 (Schechter et al. 2018).

DES J0405-3308
The discovery of this system is reported by Anguita et al. (2018). A complete or partial Einstein ring is noticeable in all the HST bands. The source redshift is z s = 1.713 ± 0.001 .   Figure 1. Comparison between the observed (first, third and fifth columns) and reconstructed (second, fourth and sixth columns) strong-lens systems. The three HST bands: F160W, F814W, and F475X are used in the red, green, and blue channels, respectively, to create the red-green-blue (RGB) images. Horizontal white lines for each system are rulers showing 1 arcsec. The relative intensities of the bands have been adjusted for each lens system for clear visualisation of the features in the system.

DES J0408-5354
This system was discovered from the DES Year 1 data (Lin et al. 2017;Diehl et al. 2017;Agnello et al. 2017b). The deflector redshift is z d = 0.597 and the quasar redshift is z s = 2.375 (Lin et al. 2017). This is a very complex lens system with multiple lensed arcs noticeable in addition to the quasar images. The sources of the lensed arcs can be components in the same source plane as the lensed quasar or they can be at different redshifts. This system has measured time-delays between the quasar images: ∆t AB = −112 ± 2.1 days, ∆t AD = −155.5 ± 12.8 days, and ∆t BD = −42.4 ± 17.6 days (Courbin et al. 2018).

DES J0420-4037
This lens system was discovered in DES imaging data using the morphology-independent Gaussian-mixture-model supervised-machine-learning technique described in Ostrovski et al. (2017) applied to DES g, r and i, Visible and Infrared Survey Telescope for Astronomy (VISTA) J and K, and WISE W1 and W2 catalogue level photometry (Ostrovski et al 2018b, in preparation). Several small knots are noticeable near the quasar images that are possibly multiple images of extra components in the source plane.

PS J0630-1201
This system is a five-image lensed quasar system . The discovery was the result of a lens search from Gaia data from a selection of lens candidates from Pan-STARRS and WISE. The source redshift is z s = 3.34 ).

SDSS J1251+2935
This quad was discovered from the SDSS Quasar Lens Search (SQLS; Oguri et al. 2006;Inada et al. 2012) (Kayo et al. 2007). The source redshift is z s = 0.802 and the deflector redshift is z d = 0.410 measured from the SDSS spectra (Kayo et al. 2007).

SDSS J1433+6007
This lens system was discovered from the SDSS data release 12 photometric catalogue (Agnello et al. 2018a). The redshifts of the source and deflector are z s = 2.737 ± 0.003 and z d = 0.407 ± 0.002, respectively (Agnello et al. 2018a).

PS J1606-2333
This quad was discovered from Gaia observations through a candidate search with quasar-like WISE colour over the Pan-STARRS footprint (Lemon et al. 2018). The main deflector has a noticeable companion near the South-most image.

DES J2038-4008
This lens system was discovered from a combined search in WISE and Gaia over the DES footprint (Agnello et al. 2017a). The deflector and the source redshifts are z d = 0.230 ± 0.002 and z s = 0.777 ± 0.001, respectively (Agnello et al. 2017a). This system has an intricate Einstein ring with complex features from the extended quasar host galaxy.

WISE J2344-3056
This lens system was discovered in the VST-ATLAS survey (Schechter et al. 2017). This is a very small-size quad with reported maximum image separation ∼ 1. 1. Several small and faint blobs are in close proximity, two of which are particularly noticeable near the North and East images.

LENS MODELLING
To devise a uniform approach that will suit a wide range of quads that vary in size, configuration, light profiles, etc., we need to choose from the most general models for the lens mass profile and the light distributions. It is often required to fine-tune the choice of models by adding complexities to the lens model in a case-by-case basis to suit the purpose of the specific science driver of an investigator. However, such detailed lens-modelling is outside of the scope of this paper. We only require our models to satisfactorily (χ 2 red ∼ 1) fit the data while being general enough to be applicable to a wide variety of lens systems.
We use the publicly available software package Lenstronomy 2 (Birrer & Amara 2018, based on Birrer et al. 2015) to model the quads in our sample. Prior to this work, Lenstronomy was used to measure the Hubble constant (Birrer et al. 2016) and to quantify lensing substructure (Birrer et al. 2017). We first adopt the simplest yet general set of profiles to model the deflector mass and light, and the source-light distributions (e.g. Section 3.1, 3.2). Then, we run a particle swarm optimization (PSO) routine through Lenstronomy to find the maximum of the likelihood function. After the PSO routine, we check for the goodness-of-fit of the best fit model. If the adopted profiles can not produce an acceptable fit to the data, we gradually add more mass or light profiles to account for extra complexities in the lens system, e.g, presence of satellites, complex structure near the Einstein ring, or extra lensed source components. We run the PSO routine after each addition of complexity until a set of adopted mass and light profiles can produce an acceptable model. Next, we obtain the posterior probability distribution functions (PDFs) of the model parameters using a Markov chain Monte Carlo (MCMC) routine. The PSO and MCMC routines in Lenstronomy utilize the cos-moHammer package (Akeret et al. 2013). cosmoHammer itself embeds emcee (Foreman-Mackey et al. 2013), which is an affine-invariant ensemble sampler for MCMC (Goodman & Weare 2010) written in Python.
In this section, we first describe the profiles used to parameterize the mass and light distributions. Then, we explain the decision-tree of the modelling procedure.

Mass profile parameterization
We adopt a power-law elliptical mass distribution (PEMD) for the lens mass profile. This profile is parameterized as where γ is the power-law slope, θ E is the Einstein radius, q is the axis ratio. The coordinates (θ 1 , θ 2 ) depend on position angle φ through a rotational transformation of the on-sky coordinates that aligns the coordinate axes along the major and minor axes. We also add an external shear profile parameterized by two parameters, γ 1 and γ 2 . The external-shear magnitude a. Initial Figure 2. Flowchart showing the decision-tree for uniform modelling of quads to simultaneously fit multi-band data. After the initial setup (node a), the fitting is first done only with one band (node b) to iteratively choose the necessary level of complexity in the mass and light profiles (nodes d, e, f, g, h, k, l). A proposed model is accepted, if the power-law slope γ does not diverge to a bound of the allowed range (nodes j) and the p-value 10 −8 for the fit (nodes c, j). After deciding upon a set of profiles to simultaneously model the multi-band data (node i), the uncertainties on the model parameters are obtained by running a MCMC routine (node m).
γ ext and angle φ ext are related to these parameters by If there is a secondary deflector or a satellite of the main deflector, we choose an isothermal elliptical mass distribution (IEMD), which is a PEMD with the power-law slope γ fixed to 2.

Light profile parameterization
We choose the elliptical Sérsic function (Sersic 1968) to model the deflector light profile. The Sérsic function is parameterized as Here I e is the amplitude, k is a constant that normalizes θ eff so that it is the half-light radius, q L is the axis ratio, and n Sersic is the Sérsic index. The coordinates (θ 1 , θ 2 ) also depend on the position angle φ L that rotationally transforms the on-sky coordinates to align the coordinate axes with the major and minor axes. We add a 'uniform' light profile parameterized by only one parameter, the amplitude, that can capture unaccounted flux from the lens by a single Sérsic profile.
The circular Sérsic function (with q L = 1, φ L = 0) is adopted to model the host-galaxy-light distribution. We limit θ eff > 0. 04 (which is the pixel size in the UVIS bands) on the source plane to prevent the Sérsic profile to be too pointy effectively mimicking a point source. For a typical source redshift z s = 2, 0. 04 corresponds to ∼ 0.33 kpc. This is a reasonable lower limit for the size of a lensed source hosting a supermassive black hole. If there are complex structures in the lensed arcs that can not be fully captured by a simple Sérsic profile, we add a basis set of shapelets (Refregier 2003;Birrer et al. 2015) on top of the Sérsic profile to reconstruct the source-light distribution. The basis set is parameterized by maximum order n max , and a characteristic scale β. The number of shapelets is given by (n max + 1)(n max + 2)/2.
The quasar images are modelled with point sources with a point spread function (PSF) on the image plane.

Modelling procedure
We model the quads in a general framework to simultaneously fit the data from all three HST bands. Fig. 2 illustrates the flow of the modelling procedure. We describe the nodes of this flow-chart below. Each node is marked with a lowercase letter. Some of the decision nodes in Fig.  2 are self-explanatory and need no further elaboration.
a. Initial setup: We first pre-process the data in each band. A cutout covering the lens and nearby environment from the whole image is chosen. The background flux estimated by SExtractor (Bertin & Arnouts 1996) from the whole image is subtracted from the cutout. We also select four or more stars from the HST images to estimate the initial PSF in each band. A circular mask with a suitable radius is chosen to only include the deflector-light distribution, and the lensed quasar-images and arcs. If there is a nearby galaxy or a star, we mask it out unless we specifically choose to model the light profile of a satellite or companion galaxy, e.g., for DES J0408-5354, PS J0630-1201, SDSS J1433+6007 and PS J1606-2333. As PS J0630-1201 is a five-image lens, we allow the model the flexibility to produce more than four images.
b. Fit the 'most informative' band: It is important to judiciously initiate any optimization routine, such as the PSO, to efficiently find the global extremum. Finding the global maximum of the joint likelihood from all the bands together from a random initial point is often very expensive in terms of time and computational resource. Therefore, we first only fit the 'most informative' band to iteratively select the light and mass profiles necessary to account for the lens complexity. In this study, we choose F814W as the 'most informative' band. It is easier to decompose the deflector and the source-light distributions in the F814W band than in the F160W band as the deflector does not have a large flux near or beyond the Einstein ring. The resolution in the F814W band is also twice as high as in the F160W band. Furthermore, the deflector flux in the F475X band is often too small to reliably model the deflector-light distribution without a good prior. At first, we fix the power-law slope for the lens mass model at γ = 2 (i.e. the isothermal case). With each consecutive PSO routine, we narrow down the search region in the parameter space around the maximum of the likelihood. After each PSO routine, we iteratively reconstruct the PSF with the modelled-extended-light subtracted quasar images themselves. This is performed iteratively such that the extended light model updates its model with the new PSF to avoid biases and over-constraints on the PSF model. Similar procedures have been used in Chen et al. (2016); Birrer et al. (2017); Wong et al. (2017). The details will be described in Birrer et al. (2018, in preparation) and the reconstruction routines are part of Lenstronomy.
c. Good fit? We check for the goodness of fit by calculating the p-value for the total χ 2 and degrees of freedom. We set p-value 10 −8 as a criterion to accept a model. This low p-value is enough to point out substantial inadequacies in the model while applicable to the wide variety of the lens systems in our sample. Implementing a higher p-value would require noise-level modelling which is hard to achieve in a uniform framework. The total χ 2 in this node is computed from the normalized residuals only in the F814W band.
e. Add satellite mass profile: We add an IEMD for the satellite or companion mass profile. The light distribution of the satellite is modelled with an elliptical Sérsic profile. The initial centroid of the satellite is chosen approximately at the center of the brightest pixel in the satellite.
g. Add extra source component: If there are extra lensed source components, e.g., blobs or arcs, that are not part of the primary source structure near the Einstein ring, we add extra light profiles in the same source plane of the lensed quasar. We only add one light profile for each set of conjugate components. It is easier to identify and constrain the positions of additional source components on the image plane. Among the identifiable conjugate components from visual inspection, if one component is a smaller blob, and the others form arcs, we choose the blob's position in the image plane as the initial guess. First, we only add one circular Sérsic profile for each additional source component. For the second visit to this node, i.e. there is unaccounted structure or extra light near the additional lensed source components, we add shapelets with n max = 3 on top of the Sérsic profile. For each subsequent visit, we increase n max by 2.
h. Add shapelets to source-light profile: If there are structures near the Einstein ring, we add a basis set of shapelets on top of the Sérsic function to the primary source-light profile. We first add shapelets with n max = 10 and increase n max by 5 for each future visit to this node. The characteristic scale β of the shapelets is initiated with the best fit θ eff of the Sérsic profile for the source.
i. Fit all bands simultaneously: Before fitting all the bands simultaneously it is important to check astrometric alignment between the bands and correct accordingly if there is a misalignment. We align the data from the IR channel (F160W) with those from the UVIS band (F814W and F475X) by matching the positions of the four lensed quasar images. After that, we run PSO routines to fit all the bands simultaneously. Each PSO routine is followed by one iterative PSF reconstruction routine.
j. Good fit? We check for the goodness of fit with the same criteria described in node c. In this node, the total χ 2 is summed from the normalized residuals in all the three bands. Moreover, we check that the power-law slope γ has not diverged to the bound of the allowed values when γ is relaxed in node i. This might happen if there is not enough complexity in the adopted model to reconstruct the observed fluxes. We also check if there is lens flux unaccounted by the single Sérsic profile. If the total flux in the 'uniform' light profile within the effective radius is more than one per cent of that for the elliptical Sérsic profile, we decide that there is unaccounted lens flux. This can particularly happen in the F160W band as the lens light is more extended in the IR than in the UVIS channels and two concentric Sérsic functions provide a better fit to the lens light (Claeskens et al. 2006;Suyu et al. 2013). If there is no unaccounted lens light, we discard the 'uniform' profile from the set of lens-light profiles before moving to node m.
l. Add second Sérsic function to lens-light profile: If there is unaccounted lens flux, we discard the 'uniform' light profile and add a second Sérsic function on top of the first one with the same centroid. We fix the Sérsic indices for the two Sérsic profiles to n Sersic = 4 (de Vaucouleurs profile) and n Sersic = 1 (exponential).
m. Run MCMC: If the PSO fitting sequence finds an acceptable model for the quad, we run a MCMC routine. The initial positions of the walkers are centered around the best fit found by the PSO fitting sequence.
n. Finish: After the MCMC routine, we check for the convergence of the chain. We accept the chain as converged, if the total number of steps is ∼ 10 times the autocorrelation length, and the median and variance of the walker positions at each step are stable for 1 autocorrelation length at the end of the chain. We then calculate the best-fit value for each model parameter from the median of the walker positions at the last step. Similarly, 1σ confidence levels are computed from the 16-and 84-th percentiles in the last step.

Systematics
We estimate the systematic uncertainties of the lens model parameters by marginalizing over several numerical settings. We performed the modelling technique described in Section 3.3 with eleven different numerical settings: varying the lens-mask size, varying the mask size for extra quasarimages for PSF reconstruction, varying the sampling resolution of the reconstructed HST image, without PSF reconstruction, and with different realisations of the reconstructed PSF. We checked for systematics for the lens system SDSS J0248+1913. This system was chosen for two reasons: (i) this system has relatively fainter arc compared to the point source and deflector brightness, thus providing a conservative estimate of the systematics, and (ii) the modelling procedure is one of the simplest ones that enables running the modelling procedure numerous times with different settings in relatively less time. We assume the systematics are the same order of magnitude for the other lens systems in the sample.

RESULTS
In this section, we first describe the lens models and report the model parameters along with some derived parameters for all the quads. Then, we investigate the alignment between the mass and light profiles and report our findings.
In Appendix A, B and C we report additional inferred lens model parameters that are not directly relevant for the scientific investigation carried out here but may be of interest to some readers, especially in planning future follow-up and observations.

Efficiency of the uniform framework
All the 12 quads are reliably (p-value ∼ 1, Table 2) modelled following the uniform approach described in Section 3. The framework was designed and tuned from the experience gained from uniformly modelling the first ten observed quads in the sample. The two quads, SDSS J1251+2935, SDSS J1433+6007, were observed after the design phase. We effectively modelled these two lenses implementing the general framework, which validates its effectiveness. The total investigator time spent for these two lenses is ∼ 3 hours per lens including data reduction, initial setup and quality control of the model outputs. The number of CPU hours (on state-of-the-art machines 3 ) per system ranges between 50 to 500 depending on the complexity of the model.

Lens models
The set of profiles chosen through the decision-tree for modelling the quads along with the corresponding p-values are listed in Table 2. We show a breakdown of the best-fit models in each band for the quads, SDSS J0248+1913, DES J0408-5354, SDSS J1251+2935, SDSS J1433+6007, as examples, in Fig. 3. Model breakdowns for the rest of lenses are provided in Appendix D. We show the red-green-blue (RGB) images produced from the HST data alongside the reconstructed RGB images for all the quads in Fig. 1.
We checked the robustness of the estimated lens model parameters with and without PSF reconstructions. We find the Einstein radius θ E , axis ratio q, mass position angle φ, external shear γ ext and shear angle φ ext to be robustly (within 1σ systematic+statistical uncertainty) estimated. However, the power-law slope γ is affected by 1σ system-atic+statistical uncertainty due to deviations of the reconstructed PSF. This is expected as γ depends on the thickness of the Einstein ring and this thickness in the reconstructed model in turn depends on the PSF.
We investigated if the lens model parameters are stable with increasing complexity in the model (Fig. 4). The stability of the Einstein radius θ E and the external shear γ ext improves if the mass profile of a satellite is explicitly modelled. For increasing complexity in modelling the source-light distribution, the power-law slope γ, the Einstein radius θ E and the external convergence γ ext are stable.
We report the lens model parameters: Einstein radius θ E , power-law slope γ, axis ratio q, position angle φ, external shear γ ext , and shear angle φ ext and deflector light parameters: effective radius θ eff , axis ratio q L , and position angle φ L in Table 3. For the deflectors fitted with double Sérsic profiles, the ellipticity and position angles are computed by fitting isophotes to the double Sérsic light distribution. We use the Photutils 4 package in Python for measuring the isophotes which implements an iterative method described by Jedrzejewski (1987). We tabulate the astrometric positions of the deflector galaxy and the quasar images in Table  4. The apparent magnitudes of the deflector galaxy and the quasar images in each of the three HST bands are given in Table 5.

Alignment between mass and light distributions
In this subsection, we report our results on the alignment between the mass and light distributions in our sample of quads (Fig. 5).

Centroid
The centers of the mass and light distributions match very well for most of the quads with a root-mean-square (RMS) of 0. 04 (Fig. 5a). Three clear outliers are PS J0147+4630, DES J0408-5354 and PS J0630-1201. In PS J0630-1201, there are two deflectors with comparable mass creating a total of five images. If the two deflectors are embedded in the same dark matter halo, the center of the luminous part of the deflector can have an offset from the center of the halo mass. The other two outliers also have nearby companions possibly biasing the centroid estimation.

Ellipticity
We find a weak correlation between the ellipticity parameters of the mass and light distribution for the whole sample (Fig. 5b). We calculate the Pearson correlation coefficient between the axis ratios q and q L of the mass and light distributions, respectively, in the following way. We sample 1000 points from a two-dimensional Gaussian distribution that is centered on the axis ratio pair (q, q L ) for each quad. We take the standard deviation for this Gaussian distribution along each axis equal to the 1σ systematic+statistical uncertainty. We take the covariance between the sampled points for each lens as zero as we observe no degeneracy in the posterior PDF of the axis ratios for individual lenses. The Pearson correlation coefficient for the distribution of the sampled points from all the quads is r = 0.09 (very weak correlation).

Position angle
The position angles of the elliptical mass and light distributions are well aligned for eight out of 12 quads. The standard deviation of the misalignment in position angles for these eight lenses is 12 • . (Fig. 5c). The systems with large misalignment also have large external shear. We find a strong correlation between the misalignment angle and the external shear magnitude (r = 0.75, Fig. 5d). We find very weak correlation between the misalignment angle and the mass axis ratio q (r = 0.15, Fig. 5e).

Deviation of flux ratios from macro-model
Stars or dark subhalos in the deflector can produce additional magnification or de-magnification of the quasar images through microlensing and millilensing, respectively (for detailed description, see Schneider et al. 2006). In that case, the flux ratios of the quasar images will be different than those predicted by the smooth macro-model. Deviation of the flux ratios can also be produced by baryonic structures (Gilman et al. 2017b) or disks (Hsueh et al. 2016(Hsueh et al. , 2017, quasar variability with a time delay, and dust extinction Anguita et al. 2008). We quantify this deviation of the flux ratios in the quasar images as a χ 2 -value by where f IJ = F I /F J is the flux ratio between the images I and J. We assume 20 per cent flux error giving σ f IJ = 0.28 f IJ . We set this error level considering the typical order of magnitude for intrinsic variability of quasars (e.g. Bonvin et al. 2017;Courbin et al. 2018). Although, many of the quads in our sample have short predicted time-delays (Table C1), where intrinsic variability is not a major source of deviation in fluxratios, we take 20 per cent as a conservative error estimate for these lenses.
If the flux ratios are consistent with the macro-model, χ 2 f is expected to follow the χ 2 (3) distribution, i.e. χ 2 f ∼ χ 2 (3), as only three out of the six flux ratios are independent producing three degrees of freedom. However, the χ 2 f -distribution is shifted toward a higher value than χ 2 (3) (Fig. 6). The mean of the combined distribution of log 10 χ 2 f from all the three HST bands is 2.10. A Kolmogorov-Smirnov test of whether the observed χ 2 f -distribution matches with the χ 2 (3)distribution yields a p-value of ∼ 0. The shift is higher in shorter wavelengths. The mean of the log 10 χ 2 f 's in the F160W, F814W, and F475X bands are 1.97, 2.09, and 2.24, respectively. This is expected, as the quasar size is smaller in shorter wavelengths making it more affected by microlensing, and as shorter wavelengths are also more affected by dust extinction.

SUMMARY AND DISCUSSION
We presented a general framework to uniformly model large samples of quads while attempting to minimize investigator time. We apply this framework to model a sample of 12 quads and simultaneously fit imaging data from three HST WFC3 bands. All the quads are satisfactorily (p-value 10 −8 ) modelled in our uniform framework. We choose the p-value threshold to be suitably low to be applicable to our quad sample with large morphological variation while being able to point out deficiencies in the modelling choice of profiles along the decision tree. In the end, most of the lens systems in our sample are modelled with p-value ∼ 1 (Table  2). Thus, we showed that a large variety of quads can be modelled with a basic set of mass and light profiles under our framework, i.e. all the quads in our sample are 'happy' (or, at least 'content').
Only one of the quads in our sample, DES J0408-5354, has measured time delays: ∆t observed AB = −112 ± 2.1 days, PEMD Double elliptical Sérsic Sérsic 1.0 abcijklbcijmn Point source (image plane) * The p-value is for the combined χ 2 from all three bands. * * Labels of nodes visited during the modelling procedure in the flow-chart shown in Fig. 2. † Satellite or extra source component separate from the central source. Table 3. Lens model parameters. The reported uncertainties are systematic and statistical uncertainties added in quadrature.  In order to make the problem computationally tractable for much larger samples we made some simplifying assumptions. Thus, whereas some of the lensing quantities, such as Einstein radius, deflector center of mass, position angle and ellipticity, and image flux ratios, are robustly determined, our models are not appropriate for all applications. In particular, science cases requiring high precision might require more sophisticated modelling for each individual lens system.

System name
The main simplifying assumptions in our work are: 1) we restricted our models to simple yet general profiles to  describe the mass and light distributions. 2) we assume no colour gradient in the deflector and source fluxes. Thus, we use the same scale-sizes and ellipticity in the deflector-and source-light profiles in different bands while fitting simultaneously. Some straightforward ways to further improve the lens modelling are to allow for colour dependency of the light distribution of the source and deflector, explicitly including mass distribution of more nearby companions or satellites, increasing the number of shapelets (n max ), and consider composite mass models consisting of both stellar and dark matter components.
We illustrate the information content of this large sample of quads by investigating the alignment between the light and mass distributions in the deflector galaxies, and the dis- tribution of so-called flux ratio anomalies. Our key results are as follows: (i) The centers of the mass and light distributions match very well (the RMS of the offsets is 0. 04).
(ii) We find the correlation between the ellipticity of the mass and light distributions to be very weak (Pearson correlation coefficient, r = 0.09).
(iii) The position angles of the major axes of the mass and light distributions are well-aligned within ±12 • for two-third of the lenses.
(iv) Systems with high (> 30 • ) misalignment angle between the light and mass also have large external shear (γ ext 0.1). The Pearson correlation coefficient between the misalignment angle and the external shear is r = 0.75.
(v) The measured flux ratios between the images depart significantly from those predicted by our simple mass models. These flux ratio anomalies are strongest in the bluest band, consistent with microlensing being the main physical driver, in addition to millilensing associated with unseen satellites.
Our finding of weak correlation between the light and mass ellipticity slightly agrees with Keeton (2001), Ferreras et al. (2008 and Rusu et al. (2016) who find no correlation. However, we do not find a strong correlation as Sluse et al. (2012) and Gavazzi et al. (2012) report. The weak correlation between the mass and light ellipticity in our study is consistent with the hierarchical formation scenario of elliptical galaxies where the remnants in the simulation of multiple mergers are shown to have no correlation between the halo and light ellipticity (Weil & Hernquist 1996). Moreover, some of the deflectors in our sample are disky galaxies. The projected ellipticity of disky galaxies will not be correlated with the halo ellipticity if viewed from arbitrary orientations.
Moreover, dark matter halos are expected to be rounder than the stellar distribution from simulation (Dubinski & Carlberg 1991;Warren et al. 1992;Dubinski 1994) with reported agreements to observations (Bruderer et al. 2016;Rusu et al. 2016). In our sample, the majority of the sys-tems follow this prediction. Only three systems have significantly flatter mass distribution than the light distribution (DES J0408-5354, PS J0630-1201 and WISE J2344-3056). All these systems have satellites or comparable-mass companions and thus are not the typically relaxed systems where we expect this to hold. In contrast, four systems in our sample are significantly rounder in mass than in light: AT-LAS J0259-1635, DES J0405-3308, DES J0420-4037 and PS J1606-2333. These are likely to be disky galaxies from visual inspection of their shapes. This explains the large difference in ellipticity between the mass and light.
To reliably compare the ellipticity of the light and mass distribution, the ellipticity needs to be estimated within the same aperture, or within an aperture large enough beyond which the ellipticity does not significantly evolve. From a strong-lens system, only the total (projected) mass within the Einstein radius can be estimated. If the Einstein radius is much smaller than the effective radius of the deflector galaxy, the comparison of ellipticity between light and mass may not be representative of the entire galaxy.
We find a strong alignment between the mass and light position angles, which agree very well with previous reports (Kochanek 2002;Ferreras et al. 2008;Treu et al. 2009;Gavazzi et al. 2012;Sluse et al. 2012;Bruderer et al. 2016). Our result is also in agreement with Bruderer et al. (2016) that the systems with high misalignment (> 30 • ) also have strong external shear (γ ext 0.1). Absence of systems with high misalignment but low external shear is in agreement with the prediction of galaxy formation models. Orbits that are highly misaligned in isolated galaxies (thus with low external shear) are shown to be rare and unstable (Heiligman & Schwarzschild 1979;Martinet & de Zeeuw 1988;Adams et al. 2007;Debattista et al. 2015). The misalignment in isolated galaxies can only be sustained by a constant gas-inflow in blue starburst galaxies (Debattista et al. 2015).
Furthermore, for systems with θ E /θ eff < 1, the lensingmass is likely to be dominated by the stellar mass. In that case, relatively stronger correlation between the mass and the light distributions is naturally expected. Comparison between the dark matter and luminous matter distribution would be more interesting in regard to directly testing ΛCDM and galaxy formation theories. Although, broadly speaking, large deviations in ellipticity and alignment in our sample have to be explained by the presence of dark matter. However, direct comparison between the dark and luminous mass distributions requires composite mass models with dark and luminous components as adopted by Bruderer et al. (2016). These kind of mass-models are beyond the scope of this paper and left for future studies.
The flux-ratio anomalies in the disky galaxies in our sample are not at the extreme of the χ 2 f -distribution. This further supports microlensing by foreground stars being the dominant source of the flux-ratio anomaly.
Detailed follow-up of this sample is under way, to measure redshift and velocity dispersion of the deflectors as well as the time delays between the quasars and the properties of the environment. Once follow-up is completed, we will use this sample to address fundamental questions such as the determination of the Hubble Constant (e.g. Bonvin et al. 2017), the nature of dark matter (e.g. Gilman et al. 2017a), and the normalization of the stellar initial mass function in massive galaxies (e.g. Schechter et al. 2014       This paper has been typeset from a T E X/L A T E X file prepared by the author.