Extrapolating the projected potential of gravitational lens models: property-preserving degeneracies

While gravitational lens inversion holds great promise to reveal the structure of the light-deflecting mass distribution, both light and dark, the existence of various kinds of degeneracies implies that care must be taken when interpreting the resulting lens models. This article illustrates how thinking in terms of the projected potential helps to gain insight into these matters. Additionally it is shown explicitly how, when starting from a discretised version of the projected potential of one particular lens model, the technique of quadratic programming can be used to create a multitude of equivalent lens models that preserve all or a subset of lens properties. This method is applied to a number of scenarios, showing the lack of grasp on the mass outside the strong lensing region, revisiting mass redistribution in between images and applying this to a recent model of the SDSS J1004+4112 cluster, as well as illustrating the generalised mass sheet degeneracy and source-position transformation. In the case of J1004 we show that this mass redistribution did not succeed at completely eliminating a dark mass clump recovered by GRALE near one of the quasar images.


INTRODUCTION
Apart from causing beautiful observations, the light deflection caused by the gravitational lens effect holds the promise of providing insight into the distribution of the matter responsible for this deflection, as well as for probing parameters of the cosmological model.To make this possible, one typically needs to try to invert the lens effect, e.g.try to reconstruct a model for the gravitational lens that is compatible with the observations.Over the years, several techniques for doing so have been developed, differing in the kinds of observations they use as input as well as in how the matter distribution it tries to reconstruct, is modeled.This ranges from statistical analyses of small deformations of background galaxies, i.e. weak lensing data, to the use of multiple, possibly highly deformed images, also referred to as a strong lensing scenario.The cause of the deflection, the matter distribution of the lens itself, can be modelled by a relavitely small number of density profiles, typically aligned with the visible matter (e.g.LensTool, Jullo et al. (2007)), by a large set of basis functions, intended to be capable of modelling a wide variety of distributions (e.g.PixeLens, Saha & Williams (2004), Coles (2008)), or even by both of these options, in a more hybrid approach (e.g.wslap+, Sendra et al. (2014)).Others still do not model the mass distribution directly, but instead model the lens' ⋆ Corresponding author: jori.liesenborgs@uhasselt.begravitational potential (e.g.relensing, Torres-Ballesteros & Castañeda (2023)).
Irrespective of the procedure and input data that are used, it is important to realize that a solution to the inversion problem is not uniquely determined, that there exist various kinds of degenerate solutions that are able to explain the observations equally well.Some of these are exact by nature, others differ in principle but only cause changes that still lie within the known observational uncertainties.Depending on how the mass distribution is modelled, these degeneracies can manifest themselves in different ways.It may even seem that there are no such degeneracies present, if the inversion technique used does not provide the freedom needed to describe equivalent solutions.The existence of multiple, equally compatible solutions is fundamental however, so care must be taken when interpreting inversion results.
In this article we illustrate how thinking about lens inversion not on the level of the mass distribution itself, but the potential causing it, can help gain additional insight into which properties can actually be constrained well.Assuming that one solution to the lens inversion problem is known, a tool is introduced that uses quadratic programming to search for lens models that are equally compatible with the observed data.
After briefly reiterating the gravitational lensing formalism in section 2, a toy model will be introduced in section 3 to study an effect that is often encountered when performing lens inversions with our own method grale, namely mass density peaks outside of the region covered by the multiple image systems.The idea behind the method and its practical implementation using quadratic programming is explained in sections 4 and 5. Application to the toy model for the outer density peaks, as well as revisiting known degeneracies using this method will be done in section 6, ending the article with a final discussion in section 7.

GRAVITATIONAL LENSING FORMALISM
Below, the formalism to describe gravitational lensing is briefly reviewed -for a complete account the interested reader is referred to Schneider et al. (1992).In the usual approximation, the mass density of the gravitational lens itself is modelled as being two-dimensional, lying in the so-called lens plane.This mass density Σ(θ), where θ describes the viewing direction, causes light rays from source to observer to become deflected.The lens equation, describes this mapping: when looking in direction θ, one receives the light that one would receive from direction β if the light deflection could be disabled somehow.The deflection angle α(θ) describes the way the light ray changes direction due to the lens effect, and is determined by the projected mass density Σ(θ) in its entirety.This deflection angle is rescaled by D ds and Ds, the angular diameter distances from gravitational lens to source and from observer to source respectively.Similarly, the angular diameter distance from observer to lens will be denoted by D d .To ease notation, one often uses the rescaled deflection angle α = D ds /Ds α.The equation can be interpreted as describing how a two-dimensional source shape, lying in the so-called source plane and described by the β-space, is transformed into possibly multiple images lying in the image plane, described by the θ-vectors.
It can be shown that in this thin lens approximation, the deflection angle arises from a two-dimensional version of the gravitational potential, usually referred to as the lens potential or projected potential ψ(θ): (2) This lens potential is also related to a scaled version κ(θ) of the mass distribution where κ(θ) = Σ(θ)/Σcrit is also called the convergence, and Σcrit = c 2 Ds/4πGD d D ds is known as the critical density.
If two image positions θi and θj correspond to the same source position β, there will be a time delay ∆tij = t(θi, β)− t(θj, β) between these images, which may be measurable for a source with intrinsic variability.Here, in which z d represents the redshift of the lens plane.
The case where multiple images arise from a same source is called the strong lensing regime, but even when there is only a single image this can still be a somewhat deformed one.Further away from the bulk of the lensing mass, one  2021), illustrating the effect that is often encountered when using the freeform grale inversion method, when the optimization procedure places mass near the boundaries where the mass density is not well enough constrained by the more centrally located images.
then encounters the weak lensing regime.The deformations are described by the shear components Statistical analyses of deformed background galaxies cannot reveal these values directly unfortunately, only a combination of them with the convergence can be estimated at a point.This is then called the reduced shear gi = γi/(1 − κ).
For the remainder of the article, the focus will be on the strong lensing regime; the same ideas and procedures apply to the weak lensing regime as well.

OUTER DENSITY PEAKS
In our free-form strong lens inversions using the grale software (Liesenborgs et al. 2006(Liesenborgs et al. , 2020)), one has to specify the region where mass is to be recovered.Typically, this region should not exceed the boundaries set by the multiple image systems by too much, but the amount by which this is done in practice can vary somewhat: depending on the complexity of the lensing scenario one may need to make this region somewhat larger than a first estimate to be able to obtain a reconstruction that can explain the observed images adequately.
In such cases the underlying optimization technique can place extra mass near the borders of the inversion region, where no multiple image systems enclose these structures.For a final solution several tens of optimization results are averaged, which does tend to smoothen, but not remove these structures.Fig. 1 shows an example of this effect.It is commonly understood that as such mass density features are not enclosed by strongly lensed images, their precise location and shape should not carry too much weight.Instead, they can be interpreted as properties of the mass distribution that the inversion algorithm introduces to mimic external shear, but the precise origin of this shear cannot be constrained.
To illustrate the ill-constrained nature of the outer regions of a strong lensing scenario, the toy model from Fig. 2 will be used.The shape of the mass distribution is inspired by the Ares simulated cluster (Meneghetti et al. 2017), but has an extra mass peak in the top-right corner.The lens itself is located at a redshift of z d = 0.5 in a flat ΛCDM cosmological model with H0 = 70 km s −1 Mpc −1 and Ωm = 0.3, and causes the four circular sources shown in the right panel of the figure to be transformed into the images that can be seen in the center panel.

LENS POTENTIAL EXTRAPOLATION
From the overview of the lensing formalism in section 2, one can see that all properties can be derived from the lens potential ψ(θ).More precisely, if two models have the same values of the lensing potential in the regions that contain all the images, from equation (2) they will have the same deflection angles at those locations, and through the lens equation (1) will map to the same positions in the source plane.With image locations, source location and lens potential values unchanged, equation (4) implies that the time delays between images will be unchanged as well.Being very local properties derived from the lens potential, γi values at the image locations will be the same (equation ( 5)), as well as the gi = γi/(1 − κ) values, since the convergence κ stays unchanged by equation (3).
The central idea to resolve the presence of the mass peak in the toy model, is therefore to create a new model that has the same values of the lens potential within the circular region indicated in the figure.As both models have the same 1 potential values inside the circular region, all lensing properties will be conserved there.Outside the circle, the new model will have different values however: we shall start from the lens potential values inside the circle, and extrapolate these outward.The goal is to do this in such a way that said mass peak is less prominent or even erased.
To be able to perform such calculations based on ψ(θ) numerically, it will be necessary to approximate this continuous scalar field by a discrete grid of values ψij.Within the modelling part of the grale software, it is possible to specify a lens model based on such a grid.To calculate values of the first and second order derivatives, not only at the grid points themselves, but in between these points as well, the bicubic interpolation routines from the GNU Scientific Library2 (GSL) are used.
In a first step, an approximation of the toy model is needed based on a grid of ψij values.To be able to calculate derivatives near the boundaries shown in the figure, a slightly larger region is used for this grid, in this case a 240 × 240 arcsec 2 one.For a 32 × 32 grid covering this region, Fig. 3 shows results for this approximate lens model.In this case, differences are still noticeable, but as can also be seen in In what follows, the model based on 128 × 128 values of ψij will be used as the starting point.This represents the true toy model with considerable accuracy, while yielding a manageable number of variables that will need to be handled in the extrapolation method described below.

QUADRATIC PROGRAMMING SOLUTION
For the simulated lensing scenario under consideration, the approach will be to keep the ψij values inside the circular region fixed, and to optimize for the other lens potential values.It shall turn out that this optimization can be formulated as a quadratic programming (QP) problem, for which several software packages exist to calculate the solution very efficiently.
In this type of problem one looks for a vector x of unknown values that mimimize the expression where P and q are a known matrix and vector respectively.
One is allowed to formulate linear constraints for these unknowns in x, Here, G and h are again a known matrix and vector, and the inequality sign is to be interpreted as a component-wise inequality.Similarly, one is allowed to impose a linear equality constraint as well as hard lower and upper bounds for the values in x.
For simplicity, let us first consider a one dimensional example.Suppose the lensing potential is defined on a grid of N points, ψi, i = 1, . . .N .Some of these values will be held fixed throughout the procedure, this will be denoted as ψi = ψi, while others are the values to be retrieved, described by ψi = xj.There will be M such unknown values xj, j = 1, . . .M .
Leading to a measure of the mass density through equation (3), the Laplacian will be of particular interest.For a discretized version of the lens potential values, this can be approximated by an R-point kernel L k , k = 1, . . .R, which through a convolution then yields an estimate of the local density.As an example, using e.g. the difference of differences × 512 values that are plotted, so also in between the ψ ij grid points), as well as RMS difference in the image plane are also mentioned (the circles show the image positions according to this approximation, the crosses are those for the toy model lens itself).As for higher grid resolutions the visual differences from Fig. 2 become quickly unnoticeable, Table 1 shows the maximum differences and RMS for these approximations.
to approximate the Laplacian in a one-dimensional scenario, the three-point kernel L = [1, −2, 1] could be used.
Seeking smooth density distributions, it is actually the gradient of the density that is part of the minimization process.To optimize for this gradient in a discrete setting, one would compare such convolutions around neighbouring points, leading to the minimization of the following cost function: where for simplicity the bounds of the summation over i are assumed to be such that the indices stay valid.One can easily regroup terms to yield for a now slightly larger kernel K.While this derivation was inspired by the gradient of the Laplacian, this general formula can be useful for many situations.For example, one could also use the Laplacian kernel directly to minimize for the density, a straightforward kernel [−1, 1] to minimize the gradient of the lens potential, or even use multiple kernels with different weights.The latter would merely add more terms to the outer shows the mass densities, the bottom row the re-calculated images using the adjusted model as well as the critical lines.For reference the original critical lines can be seen as thick dashed lines.In the left hand column, the lens potential values inside the circular region were kept fixed, and allowed to change freely outward.In the center column, the same settings were used, only adding the additional constraint that the lens potential values at the border of the ψ ij grid should be preserved as well.In the right hand column, the border constraint was kept, but the one from the circular region was replaced by the regions of the images themselves.
sum, optionally weighting them by a different factor, but the problem remains the same otherwise.This optimization problem is of the quadratic programming kind that was described above.To see this, let us focus on one particular term of the outer sum, say i = 7.Assuming some of the ψ values are fixed and some are to be optimized for, this term could look like the following: Working out the square and leaving out the constant value that does not play a role during the optimization process, this can be written as For the full number of M unknowns xj, one can easily imagine the matrices containing the K k factors to be padded with zeros.To account for the outer summation over i, the matrices for each of the terms simply need to be added together.One then ends up with the optimization problem as formulated in (6).Note that the resulting matrix P will be a sparse matrix, in this one-dimensional example a so-called band matrix.
There is also a constraint that needs to be taken into account: the mass density that is determined by the reconstructed potential, needs to be positive.This means that the convolution with the discrete Laplacian kernel L k should be positive for every grid point: where i is any value that leads to an actual constraint on some xj values (i.e.not all ψi values are fixed).The set of resulting constraints can easily be organized into the form of (7).Formulating additional constraints this way for the mass density can of course be done as well: perhaps one would like the mass in certain regions to lie within certain bounds, or even be equal to specific values.Similarly, one could add constraints for the gradient of ψi, i.e. the deflection angle, using the [−1, 1] kernel.These desired properties would merely add rows to equations ( 7) or (8).
For the two-dimensional case, of course a two-dimensional convolution will be needed, and the outer sum will need to be replaced by two sums, one for every grid dimension.This means that equation ( 11) will be modified into the following: When writing out just a single term that is squared, it will have a comparable form as equation ( 12), implying that a very similar organization into a quadratic programming problem can take place.The structure of the P matrix will no longer be that of a band matrix, but since the kernel is typically very small compared to the full grid size of the lens potential values, it will still be a sparse matrix.When using a kernel with e.g.values such as [−1, 1], the result is only an estimate of the gradient up to some scale factor, depending on the grid resolution.Appendix A describes a practical way in which this scale factor is determined in the code.

APPLICATIONS
To numerically solve the quadratic programming problem, we made use of the Python module qpsolvers (Caron et al. 2023).This does not provide an implementation for the QP optimization by itself, but instead standardizes the formulation of the problem and still allows one to select one of several supported solver implementations, both open source and commercial.The actual solvers that were used in the examples below, are the mosek (MOSEK ApS 2023) software and the Splitting Conic Solver (SCS) (O'Donoghue et al. 2022(O'Donoghue et al. , 2016;;O'Donoghue 2021).
For the convolution kernel that is used as the discretized Laplace operator, various kernels with different extents can be used.In practice, the commonly used did not work as well as the following 5 × 5 kernel from Burger & Burge (2009):

Outside density peak in toy model
The left and center columns of Fig. 4 show the results of the QP optimization when keeping the projected potenial values inside the circle fixed to those of the 128 × 128 approximation of the toy model lens.For both situations the weights for the different kernel contributions were the same, but each has a different boundary constraint for the ψij grid.In the left situation, there was the requirement that the density that is calculated from the potential should equal the original one at the border.In the center situation the potential values themselves at the border were fixed in addition to those in the circular region.The additional freedom for the situation from the left hand panel allowed the solver (mosek was used here) to erase the upper-right mass peak nearly completely.As the bottom part of the figure shows, the resulting critical line structure has become somewhat peculiar, clearly differing from the original one.This is much less the case for the center panel situation.The extra constraint on the border values of the lens potential did prevent the optimization to complete eliminate the topright mass peak, although it is clearly reduced significantly.
Note that because the lens potential values are unchanged within the circular region that contains the images, the image plane RMS is the same as in Table 1 for the 128 × 128 grid, on which these optimizations were based.

Monopole degeneracy revisited
While keeping lens potential values inside the circular region fixed will certainly preserve all properties in this strong lensing region, one could also keep only the potential values at the image locations fixed.That too will preserve all properties at those locations, but allows for changes in between the images as well.In this sense, it is quite similar to the monopole degeneracy from Liesenborgs et al. (2008b), where specific basis functions were used to manipulate the mass density in between the images but without affecting the properties at the image locations themselves.The right hand column of Fig. 4 shows the results for the toy model, where the settings were the same as in the center panel but the constraint of the circular region was replaced by constraints at the image locations.As appendix B shows, for CL0024+1654 the result is very similar to that of Liesenborgs et al. (2008b).
It is then interesting to apply this procedure to a mass model for SDSS J1004+4112 that was published recently (Perera et al. 2024), showing a peculiar mass concentration.For reference, the model is reproduced in the top panel of Fig. 5, where the mass density feature can be seen around (6, −1) arcsec.The extrapolation procedure described above, can then be used to redistribute mass in between the image locations.Because the projected potential values in a small area around each image need to be conserved (to conserve all relevant derivatives), the resolution of the potential grid ψij affects how much freedom remains to redistribute the mass clump in question.Here, a very fine grid of 1536 × 1536 points was used, leading to the equivalent lens model shown in the bottom part of the figure.The procedure clearly was not as successful in erasing the mass clump as in the case of CL0024+1654, although the feature has become less prominent.That it is more difficult to remove it altogether could be expected due to the proximity of the images: since the projected potential is preserved in a small region around each image, so is the density and even its gradient.Note that in trying to erase certain features, the application of the procedure has also introduced some less desirable ones: a pinching effect can be seen e.g.around images (8,10) and (7,3).

Mass sheet degeneracy and source-position transformation
A relatively straightforward degeneracy is typically referred to as the mass-sheet degeneracy (MSD) (Falco et al. 1985) or steepness degeneracy (Saha & Williams 2006).Replacing the density Σ(θ) of a lens model by yields a new model for which a source plane that is scaled in each dimension by the factor λ, corresponds to the same im-  2024).The feature around (6, −1) arcsec, also called the south-east mass clump in the article, was an interesting result that was reproduced quite consistently in the inversions.The bottom panel shows how the extrapolation method described here, can cause mass to be redistributed thereby making this clump considerably less prominent.The optimization was performed with the SCS solver.
age plane.Note that this particular construction can be done for only a single source redshift, as the sheet of mass Σcr depends on the angular diameter distances to this source.This degeneracy does not preserve all properties though: since it scales the source plane, the magnifications of the images are changed by a factor of λ 2 (note that the relative magnifications are not affected).It can also be shown that the time delays involved are scaled by this factor λ.
One generalization of this degeneracy, using the extrapolation procedure from before, would be to find an alternative to the sheet itself: in the regions of the images, one could determine the projected potential values for the mass sheet Σcr.Keeping these values fixed, the lens potential in other regions could be extrapolated, yielding a model Σcr,eq(θ) that has an equivalent effect as the mass sheet in this particular case.Using this in a similar way as in equation ( 18) again produces a new Σ ′ that has the same effect as the original mass sheet degeneracy.Since the extrapolation can be done in many ways, even for a single λ value this leads to a multitude of equivalent mass models.
It has been shown before that this degeneracy can be extended to multiple source redshifts, but with the same scale factor λ (Liesenborgs et al. 2008a), and even that different scale factors can be used (Liesenborgs & De Rijcke 2012).This more general scenario can also be inspected using the extrapolation procedure, although in this case the focus will be more on the constraints than on the potential values themselves.To illustrate this, consider the situation in the left hand column of Fig. 6, which uses the same settings as in the example from Liesenborgs & De Rijcke (2012): a non-singular isothermal ellipse (NSIE) mass distribution at a redshift of 0.5, transforms the two elliptical sources at redshifts of 1.2 and 1.8 from the bottom panel into the images of the center panel.
Focusing temporarily on the first source only, a new model can be created using equation ( 18) and factor λ1 that causes a scaled version of the source to correspond to the same images.Let us identify the regions of the images as {θreg,1}, and save the deflections angles for this new model in these regions, α′ 1 ({θreg,1}), for later use.Of course, for the second source the same procedure can be performed, with a different λ2, again producing deflection angles α′ 2 ({θreg,2}), now in the regions of the other images.
The goal is now to create a single lens model that has both α′ 1 ({θreg,1}) and α′ 2 ({θreg,2}) as deflection angles, and it is this which the extrapolation method can produce.There will be no3 values of the lens potential that are fixed, all will be calculated.Apart from the constraint that keeps the mass density positive, there are now also constraints for the gradients of the projected potential, as these correspond to the deflection angles α′ of the new model.For such gradients the kernel [−1, 1] can be used in x-and y-directions, yielding estimates of α′ .In principle, equation (8) could enforce these values in the regions of the images to be respectively α′ 1 ({θreg,1}) and α′ 2 ({θreg,2}), but we found that this exact constraint does not work well in practice.Instead, some small deviations are allowed, which can be formulated as inequality constraints using equation ( 7).The center column of Fig. 6 shows a model that was obtained using this procedure, for λ1 = 0.9 and λ2 = 0.8.The sources that were scaled by these factors are shown in the bottom panel, and cause the same images as before to be generated, as can be seen in the center panel.
The mass-sheet degeneracy is a special case of the sourceposition transformation (SPT) described in Schneider & Sluse (2014).In this more general version, the source plane does not undergo a mere rescaling, but can be transformed in a more general way.In case one would scale only the ydimension of the source plane by a factor λ, one can write where Based on the original deflection angles in the image regions, this way one can calculate the constraints for α′ 1 ({θreg,1}) and α′ 2 ({θreg,2}), which can subsequently be solved in the same way as before.The right hand column of Fig. 6 shows how the two source shapes that were rescaled in the ydirection by a factor λ = 0.85 produce the same images again.In practice, obtaining results that correspond to this type of SPT turned out to be quite difficult.In particular, when fur- ther decreasing λ one quickly needed to allow more and more deviations in the targeted α′ values for the QP procedure to still find a solution.Similar difficulties were also noted in Schneider & Sluse (2014), as more arbitrary changes of the source plane are no longer guaranteed to be compatible with deflection angles being the gradient of the lensing potential.
As with the MSD itself, while these variations still generate (nearly) the same images, other properties are no longer preserved.The time delays will change in a less predictable way, and magnifications themselves are changed as well.

DISCUSSION AND CONCLUSION
In this article we have looked at the problem of lensing degeneracies from the perspective of the projected potential.Starting from a working lens model and keeping the lens potential values fixed in the relevant regions, i.e. at least the regions of the images in the system, one can obtain a multitude of equivalent lens models that preserve all lensing properties.If one concentrates on only retaining the deflection angles, models compatible with the same observed images can still be obtained, however these will no longer necessarily preserve other properties.This also illustrates the different constraining power of different types of observations.Time delay measurements help probe the projected potential directly and are therefore of particular importance.Observed images provide information about the gradient of the potential, which illustrates why, without fixing the underlying shape of the lens model, many multiple images systems are required to build the lens potential from its gradients.Weak lensing measurement probe the curvature of the potential, providing the least detailed information.
A method using quadratic programming was described with which such degeneracies can be explored in practice.Different constraints or weights of the convolution kernels yield different results, and even different solvers with otherwise the same settings typically do no converge to the exact same solution.All solutions do preserve the desired lensing properties, further indicating that multiple solutions can explain the same observations.At least using this particular method, it is is not always easy to eliminate the light-free mass features that grale reconstructs.They can be partially smoothed by redistributing mass over the lens plane, but they do not necessarily go away completely.The work presented here shows that because of lensing degeneracies, one should not take the specific shapes of such mass features too seriously, but on the other hand one should not merely dismiss them as artifacts of a freeform reconstruction.This also lends support for the reality of the dark matter clumps in SDSS J1004+4112 as well as in Abell 1689 (Ghosh et al. 2023), which could be similar, but somewhat less massive than the ones found in the Coma cluster using weak lensing (Okabe et al. 2010).
The left hand panel of Fig. 4 showed how certain settings could remove the mass peak from the toy model completely.The price was the rather odd critical line structure, for which a more wide field view can be seen in the left panel of Fig. 7.The area shown borders the region on which the lens potential is defined, explaining the behaviour of the lines near the boundaries.The right hand panel of the same figure shows the corresponding caustic structure.While this particular lens model still generates the same images as the true lens, several extra, smaller images are predicted as well.Unfortunately this is not something that can be prevented directly using the quadratic programming method.These strange critical line and caustic structures identify a possible disadvantage of thinking in terms of the lens potential.While this does allow for a large amount of flexibility, it does not provide an overview of the mass distribution in its entirety.Similar to the notion of the external shear, it can easily encode lensing effects without having to identify any part of the mass den- sity as the origin of these effects.In this respect, explicitly modelling the mass density itself has the advantage, as there simply is no other mass needed that is not part of the model.Not always visible in the figures as they are shown here, are some rather unphysical fluctuations in the density near the images where the potential values are retained.Fig. 8 illustrates this effect: for the image in the SDSS J1004+4112 system that is indicated in the left panel, the center panel shows the density according to the model that is based on the extrapolated lens potential values.The intersections of the horizontal and vertical lines identify the grid where these potential values are defined.To conserve the lens properties near the images, some of these values were kept fixed; these result in the density region that resembles a rotated rectangle.Adjacent to this region, where the extrapolation starts, is where some fluctuations in the density can be seen.This seems to be explainable, at least in part, to a slight mismatch between the final model using bicubic interpolation of the grid values and the optimization procedure that estimates the density using a convolution kernel.The first one is used to have a lens model for which all properties can be calculated at any point, even between the grid points.The second is needed to be able to formulate the optimization as a QP problem.The right hand panel shows the estimated density from the convolution, implying that this is the situation that is considered by the QP procedure.While the densities are no longer defined everywhere, but only on the grid points, the result does seem to suffer much less from the fluctucations from the center panel.Even so, the area where the potential values are preserved can still be identified clearly, meaning that the extrapolation result is not as good as one would hope.
It is not yet clear how this effect should best be addressed.Perhaps a convolution kernel that is better suited than the one from equation ( 17) can be found, although initial attempts have not been successful.Another approach could be to no longer use the QP formulation of the problem, but another optimization strategy that does not require the approximation with the convolution kernel.Possibly such an alternative optimization procudure could prevent the prediction of unobserved images as well.

Figure 1 .
Figure 1.Reproduction of fig. 1 from Ghosh et al. (2021), illustrating the effect that is often encountered when using the freeform grale inversion method, when the optimization procedure places mass near the boundaries where the mass density is not well enough constrained by the more centrally located images.

Figure 2 .Figure 3 .
Figure2.This is the toy model mentioned in section 3. The left panel shows a mass distribution consisting of two centrally located non-singular isothermal ellipse (NSIE) profiles, where a mass peak has been added in the top-right corner of the region.The mass density is shown in units of κ = Σ/Σcr for a source at redshift z = 2.0, the contours show levels of κ with intervals of 0.2 where the κ = 1 level is indicated by a thicker line.The dashed circle marks a rough estimate of the strong lensing region, to be used in the procedure that shall attempt to erase the external features, i.e. the peak in the top-right corner.This mass distribution causes the the sources at redshifts of 1.0 (diamond symbol), 1.5 (circle), 2.0 (square) and 2.5 (pentagon) from the right panel, to be lensed into the images that are shown in the center panel of this figure (the left panel marks these positions with crosses).Also shown in center and right panels are respectively the critical lines and caustics for a source at redshift z = 2.0.

Figure 4 .
Figure4.Results obtained when trying to remove the top-right density peak of the toy model, all using the mosek solver.The top row shows the mass densities, the bottom row the re-calculated images using the adjusted model as well as the critical lines.For reference the original critical lines can be seen as thick dashed lines.In the left hand column, the lens potential values inside the circular region were kept fixed, and allowed to change freely outward.In the center column, the same settings were used, only adding the additional constraint that the lens potential values at the border of the ψ ij grid should be preserved as well.In the right hand column, the border constraint was kept, but the one from the circular region was replaced by the regions of the images themselves.

Figure 5 .
Figure 5.The upper panel shows the same model as the one in the top-left part of fig. 2 ofPerera et al. (2024).The feature around (6, −1) arcsec, also called the south-east mass clump in the article, was an interesting result that was reproduced quite consistently in the inversions.The bottom panel shows how the extrapolation method described here, can cause mass to be redistributed thereby making this clump considerably less prominent.The optimization was performed with the SCS solver.

Figure 6 .
Figure6.The left column shows the simulated lensing situation that is used to illustrate how the extrapolation procedure can be used to obtain generalized MSD and SPT variations of this lens.The top row shows the mass densities, where contours are again in units of κ (for a source at z = 1.2), spaced by 0.2, and the thick like corresponds to κ = 1.The next row shows the images and critical lines that correspond to the sources and caustics in the bottom row.The source and images enclosed by square correspond to a redshift of 1.2, the other source is at a redshift of 1.8.The center column shows the model, image plane and source plane when a generalized MSD was constructed (see text).The orignal source positions are shown as thin lines.Similarly, the right column shows these properties for lens that differs by the SPT.

Figure 7 .
Figure 7.The left hand panel is a wider view of the peculiar critical lines that were shown in the left of Fig. 4, the right hand panel shows the corresponding caustics.Apart from the original multiple image systems, several other, small images are now predicted near the border as well.

Figure 8 .
Figure 8.For the SDSS J1004+4112 images that are indicated by the arrow in the left panel, the center panel shows the density in their neighbourhood.The locations of the potential grid values are the intersections of horizontal and vertical lines.The right hand panel shows what the QP procedure calculates for the density, based on the convolution of these same potential values with the Laplacian kernel from equation (17).Comparing the two shows that the small scale fluctuations in the center panel can at least partly be explained by a mismatch between the different ways the density is calculated.

Figure B1 .
Figure B1.Applying the method from this article to the model from Liesenborgs et al. (2008b) (left panel), yields the result shown in the right panel.For comparison, the center panel shows the result that was obtained in the aforementioned article using the monopole degeneracy.

Table 1 .
Table 1 the approximation improves quite rapidly for higher N ψ × N ψ grid resolution.This table illustrates the properties mentioned in the caption of Fig. 3 for increasing N ψ × N ψ resolutions of the ψ ij grid.The ∆κ and ∆ αi numbers represent the maximum values of the differences.