The Performance of Photometric Reverberation Mapping at High Redshift and the Reliability of Damped Random Walk Models

Accurate methods for reverberation mapping using photometry are highly sought after since they are inherently less resource-intensive than spectroscopic techniques. However, the effectiveness of photometric reverberation mapping for estimating black hole masses is sparsely investigated at redshifts higher than $z\approx0.04$. Furthermore, photometric methods frequently assume a Damped Random Walk (DRW) model, which may not be universally applicable. We perform photometric reverberation mapping using the Javelin photometric DRW model for the QSO SDSSJ144645.44+625304.0 at $z=0.351$ and estimate an H$\beta$ lag of $65^{+6}_{-1}$ days and a black-hole mass of $10^{8.22^{+0.13}_{-0.15}}\,M_{\odot}$. An analysis of the reliability of photometric reverberation mapping, conducted using 50 000 simulated CARMA-process light-curves, shows that we can recover the input lag to within 6 per cent on average, given our target's observed signal-to-noise of $>20$ and average cadence of 14 days (even when the DRW model is not applicable). Furthermore, we use our suite of simulated light curves to deconvolve aliases and artefacts from our QSO's posterior probability distribution, increasing the signal-to-noise on the lag by a factor of $\sim2.2$. We exceed the signal-to-noise of the Sloan Digital Sky Survey Reverberation Mapping project (SDSS-RM) campaign with a quarter of the observing time per object, resulting in a $\sim200$ per cent increase in SNR efficiency over SDSS-RM.


INTRODUCTION
All active galactic nuclei (AGN) are believed to be powered by an accretion disk around a central super-massive black-hole (SMBH), which is itself surrounded by a broad-line region (BLR; Antonucci 1993; Urry & Padovani 1995; Ho 2008; Heckman & Best 2014). The mass of the SMBH has been observed to scale with the properties of its host galaxy (e.g. Magorrian et al. 1998; Silk & Rees 1998; Benson et al. 2003; Häring & Rix 2004; Croton et al. 2006; Guo et al. 2011; see Kormendy & Ho 2013 for a full review), and so it is essential that accurate SMBH masses can be derived in order to investigate the effect AGN feedback has on their host galaxies.
In the absence of a direct black-hole mass measurement, there exist scaling relations based on emission-line widths (e.g. Hβ: Wandel et al. 1999; Mg II: McLure & Jarvis 2002) and the luminosity at 5100 Å (e.g. Bentz et al. 2013). These relations are typically calibrated at low redshift, and their extension to high redshift remains uncertain (Vestergaard & Peterson 2006; Netzer et al. 2007; Runnoe et al. 2013; Feng et al. 2014; Hiner et al. 2015; Mejía-Restrepo et al. 2016; Barii et al. 2017). Therefore, it is also for the purposes of validating these scaling relations that more black-hole mass measurements at higher redshifts are needed.
Reverberation mapping (Blandford & McKee 1982; Gaskell & Sparke 1986; Gebhardt et al. 2000; Ferrarese & Merritt 2000; Peterson 2004) is a powerful technique for estimating black-hole masses. Assuming that the broad-line region is gravitationally dominated by the SMBH, it is possible to estimate the black-hole mass from the time delay between continuum emission from the accretion disk and the reprocessed emission from the broad-line region, also known as the "lag", via the Keplerian motion equation

$$M_{\rm BH} = f\,\frac{R_{\rm BLR}\,\sigma_{\rm disp}^{2}}{G} = f\,\frac{c\,t_{\rm lag}\,\sigma_{\rm disp}^{2}}{G},$$

where the virial parameter f describes the structure and orientation of a broad-line region with radius $R_{\rm BLR} = c\,t_{\rm lag}$ and velocity dispersion $\sigma_{\rm disp}$. Assuming that the virial factor f is generated entirely by the inclination, θ, of the disc, $f = 1/(4\sin^{2}\theta)$, and so at θ = 30°, f = 1 (McLure & Dunlop 2001; Liu et al. 2017). f can be determined on a case-by-case basis by modelling the BLR using spectroscopic measurements (Pancoast et al. 2011, 2014; Williams et al. 2018) or purely photometric means (Pozo Nuñez et al. 2014), through gravitational redshift measurements (Liu et al. 2017), or through combinations of independent black-hole mass estimators. However, it is common to use an aggregated average for use in large data sets. Grier et al. (2013a), Onken et al. (2004), Park et al. (2012), and Graham et al. (2011) have measured values of f = 4.3 ± 1.1, 5.5 ± 1.8, 5.1 ± 1.3, and 2.8 ± 0.6 respectively from independently measured stellar velocity dispersions. So far, about 100 black-hole masses have been measured using spectroscopic reverberation mapping techniques (Kaspi et al. 2000; Bentz et al. 2009a,b; Denney et al. 2010; Bentz et al. 2013; Barth et al. 2015; Grier et al. 2012; Shen et al. 2015b; Du et al. 2015, 2016a; Grier et al. 2017), which require long-term spectroscopic observations to recover their lags. Since BLR radii can span up to several hundred light days (Peterson 2004; Bentz et al. 2014; Fausnaugh et al. 2017; Williams et al. 2018), light-curve observations need to take place over several months or years to match features in the continuum to their echoes from the BLR, with 3 times the observed-frame lag being the recommended baseline (Shen et al. 2015a). Cosmological time dilation increases the time-scale of observed variability, and so high-redshift QSOs require much longer observational campaigns than low-redshift QSOs. To compound this effect, higher-redshift QSOs are intrinsically more luminous than lower-redshift QSOs, which implies that they have longer lag time-scales (given the lag-luminosity relation). Fine et al. (2013) and then Brewer & Elliott (2014) developed methods to recover lags from the stacked cross-correlations of photometric and spectroscopic observations, to be used when individual lags are poorly constrained but a large sample of AGN is available. This method allows the detection of emission-line lags for a population of AGN at very high redshift (Fine et al. 2013 use a sample of AGN with redshifts up to z ≈ 4.5) and provides convincing evidence for a decreasing BLR radius for emission lines with higher excitation energies. However, stacked reverberation mapping is a statistical technique and cannot provide more signal-to-noise for individual objects.
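For concreteness, the virial mass estimate above can be evaluated directly. The following minimal sketch uses astropy units; the lag, velocity dispersion, and virial factor passed in at the bottom are illustrative values, not fitted results:

```python
import numpy as np
from astropy import constants as const
from astropy import units as u

def virial_mass(t_lag_days, sigma_kms, f=4.3):
    """M_BH = f * R_BLR * sigma^2 / G, with R_BLR = c * t_lag."""
    r_blr = const.c * t_lag_days * u.day      # BLR radius from the lag
    sigma = sigma_kms * u.km / u.s            # line velocity dispersion
    return (f * r_blr * sigma**2 / const.G).to(u.Msun)

# Illustrative numbers only: a ~65-day rest-frame lag and a dispersion
# of ~1900 km/s give a mass of order 10^8.3 Msun.
print(np.log10(virial_mass(65.0, 1900.0).value))
```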
An extra source of inefficiency for spectroscopic campaigns is the need to disperse the light, with a consequent decrease in signal-to-noise, especially at high redshift. Observing emission lines spectroscopically for reverberation mapping is therefore expensive in overheads and restricted to bright or low-redshift sources, and so accurate photometric methods for reverberation mapping are highly sought after.
The variability of the BLR emission line can be captured within a redshifted narrow-band (or broad-band) photometric filter through the careful separation of the underlying, driving continuum (Haas et al. 2011; Chelouche & Daniel 2012; Pozo Nuñez et al. 2012; Zu et al. 2016). This can be done either by modelling the variability using a stochastic time-series model such as the Damped Random Walk (Zu et al. 2011, 2013, 2016) or by more empirical measures such as cross-correlation analysis, which are model-independent (White & Peterson 1994; Rybicki & Kleyna 1994; Peterson 2004; Chelouche & Daniel 2012; Shen et al. 2015a; Fausnaugh et al. 2017).
Javelin (Zu et al. 2011, 2013, 2016) is a parametric Bayesian tool which models the variability of the QSO itself rather than extracting peaks from empirical cross-correlation functions. Modelling the continuum emission as a DRW has some advantages over cross-correlation in that it allows the natural inclusion of Bayesian inference techniques for noisy data, from which parameter values and uncertainties can be estimated (Zu et al. 2011, 2013). Stochastic DRW models of the accretion disk continuum emission are based on physical assumptions that can be tested by observations. The physical mechanism supporting the use of DRW models is the stochastic heating of the accretion disk by the central source and its subsequent variability due to thermal fluctuations (Kelly et al. 2009). However, there is growing evidence that the DRW is not universally applicable and that more complex time-series models are necessary to explain the correlations at high frequency (e.g. Kelly et al. 2014; Kasliwal et al. 2015b, 2017; Guo et al. 2017; Smith et al. 2018). If this is the case, then assuming a DRW when interpolating light-curves (in order to estimate the lag) may introduce artificial peaks into the posterior distribution. Therefore, it may be beneficial to estimate the lag without interpolation, as with a Von Neumann estimator (Chelouche et al. 2017) or the ZDCF method (Alexander 2013). However, these methods have their own problems, such as binning with few data points and biases due to the combined continuum and line light-curve in the narrow-band photometric filter. Although the sample of reverberation-mapped QSOs is becoming more representative (in terms of luminosity and redshift) with time, the current sample is biased towards low-redshift QSOs and a narrow range of emission-line properties (Shen et al. 2015a; Grier et al. 2017). If photometric reverberation mapping can recover precise lag estimates for SMBHs, then fewer resources would have to be spent on spectroscopic campaigns in order to fill in the parameter space of black-hole mass, luminosity, and redshift.
Photometric reverberation mapping has been performed both on individual targets below z = 0.04 (Haas et al. 2011; Edri et al. 2012; Pozo Nuñez et al. 2012; Ramolla et al. 2014; Pozo Nuñez et al. 2014; Carroll & Joner 2015; Hood et al. 2015; Pozo Nuñez et al. 2015) and for a sub-sample of the SDSS-RM (Shen et al. 2015a) catalogue (Hernitschek et al. 2015; Zhang et al. 2018). However, the estimated uncertainties for these SDSS-RM sub-samples are typically larger than 100 per cent. Photometric reverberation mapping has also been applied to the continuum to measure the properties of the accretion disk (Mudd et al. 2018; Cackett et al. 2018), though not to estimate black-hole masses until recently (Pozo Nuñez et al. 2019).
This work sets out to demonstrate the efficacy and reliability of photometric reverberation mapping, even for higher-redshift targets, and to test its accuracy when the DRW assumption does not hold. We aim to produce the first robust photometrically reverberation-mapped black-hole mass at a redshift above z = 0.04.
In Section 2, we carefully pre-select targets to give us the best possible chance of recovering precise lags. We specify that candidates must have redshifts that allow the use of a redshifted Hα photometric filter, and expected observed lags (from the lag-luminosity relation; Bentz et al. 2013) such that they can be observed for $3\,t_{\rm lag}$ days over multiple semesters. We then detail our observations and the methods used to produce photometric light-curves for use with Javelin. Before fitting QSO variability models to our observations, we produce a suite of simulated light-curves in order to test how well Javelin can recover known lags for QSOs with the same cadence and signal-to-noise as our target observations. In Section 3 we present the fitted BLR lag and black-hole mass distributions for our observations. In order to test whether the slope is significantly affected by non-Gaussian errors, we also apply a rigorous statistical analysis to the fitting of the Hβ lag-luminosity relation, assuming Gaussian uncertainties neither for our targets nor for the Grier et al. (2017) catalogue. In Section 4 we compare the efficiencies, in terms of the signal-to-noise of the fitted lag, of the SDSS-RM campaign (Shen et al. 2015a; Grier et al. 2017) and our own. We also discuss potential future applications of photometric reverberation mapping in upcoming surveys where such techniques can easily be applied. Finally, we summarise our conclusions and outlook in Section 5.

METHODS
Our requirement for intermittent visits to a small sample of high-redshift targets makes our programme unsuited to classically scheduled, continuous observing campaigns. We therefore observed our QSOs robotically with the Liverpool Telescope (Steele et al. 2004), which can accommodate a discontinuous observation campaign. We make use of the optical component of the infrared-optical (IO:O) suite of instruments available on the Liverpool Telescope, since a range of Hα filters is available in addition to the SDSS ugriz filters. This allows us to observe the Hβ emission lines of a wide range of high-redshift QSOs, since their observed emission lines will fall within the bandpass of one of the available Hα filters.

Target selection
We select our targets to have $i_{\rm AB} < 18$, to be spectroscopically confirmed in SDSS DR12 (York 2000; Eisenstein et al. 2011) or BOSS (Dawson et al. 2013), and to have broad Hβ emission lines with equivalent widths > 50 Å. We only select those QSOs whose redshifted Hβ line will fall into one of the IO:O Hα photometric filters. Additionally, using the 5100 Å luminosities from Shen et al. (2011) and the $R-L_{5100}$ relation from Bentz et al. (2013), we pre-select targets that are likely to have observed lags $t_{\rm lag}(1+z) < 95$ days. Shen et al. (2015a) construct two metrics in order to determine which combination of properties of their simulated light-curves yields the most accurate lag detections. They find that the ratio of the number of data points contributing to the calculation of the cross-correlation function to the number of data points that contribute to resolving the true lag is typically ≈ 2 for detected lags. In the limit of $N_{\rm epoch} \gg 1$, this is equivalent to a requirement on the total observing run duration of 3 times the true observed lag, $t_{\rm span}/t_{\rm lag} = 3$. We therefore imposed an additional criterion that the QSOs be observable for at least 3 times the length of their expected lag within the 14 months of the Liverpool Telescope's extended 2015B and 2016A semesters. Applying these constraints yields 10 targets, which we submitted for observation.
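The selection cuts above can be expressed compactly. The sketch below is illustrative only: the column names and input file are hypothetical stand-ins for the SDSS DR12/BOSS products and Shen et al. (2011) luminosities actually used, while the R-L coefficients are the published Bentz et al. (2013) values:

```python
import pandas as pd

# Hypothetical catalogue with columns: i_AB, ew_hbeta [A], z,
# log_L5100 [erg/s], visible_days.
qsos = pd.read_csv("qso_candidates.csv")

# Bentz et al. (2013) R-L relation:
# log10(R_BLR / lt-day) = 1.527 + 0.533 * log10(L5100 / 1e44 erg/s).
t_rest = 10.0**(1.527 + 0.533 * (qsos["log_L5100"] - 44.0))
t_obs = t_rest * (1.0 + qsos["z"])            # observed-frame lag

targets = qsos[
    (qsos["i_AB"] < 18.0)
    & (qsos["ew_hbeta"] > 50.0)               # broad Hbeta, EW > 50 A
    & (t_obs < 95.0)                          # expected observed lag < 95 d
    & (qsos["visible_days"] >= 3.0 * t_obs)   # baseline >= 3 * t_lag
]
```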
Our targets, shown in Fig 1 as green points, are positioned between the redshift-luminosity locations of the high-redshift spectroscopic sample from Grier et al. (2017) and the low-redshift sample from Bentz et al. (2013).

Observations
Since the expected variability of QSOs is of order 10-70 per cent (Kaspi et al. 2007), we conservatively derive an i-band exposure time of 88 s, assuming a required SNR > 20 (e.g. Bentz et al. 2013; Shen et al. 2015a) and seeing < 2 arcsec. This exposure time was calculated for our faintest target, and so the SNR for the rest of our targets will be larger. Using the SDSS BOSS observations of our targets (shown in Fig 2), we detect no bright spectral features that would interfere with our ability to measure the continuum accurately. Accounting for the large equivalent widths of the Hβ lines, we use a 600 s integration time for broad-line (i.e. narrow-band) observations.
Our targets span a range of redshifts between 0.350 and 0.398. Therefore, for each source, we use the appropriate Hα photometric filter in which the redshifted Hβ line dominates. For Target-10, we use the Hα-6566Å narrow-band filter.

Table 1. Our targets, with expected lags from the Bentz et al. (2013) $t_{\rm rest}-L_{5100}$ relation, where $L_{5100}$ is estimated from the SDSS spectrum. We quote the length of time each target is visible and the baseline of the observations in units of days and of the expected observed-frame lag.
As seen in Table 1, we obtain the largest number of acceptable exposures for SDSSJ144645.44+625304.0 (referred to as Target-10 hereafter). Indeed, Target-10 is the only QSO for which we have obtained a baseline of observations longer than the recommended $3\,t_{{\rm H}\beta}(1+z)$ needed to recover a lag. Thus, in what follows, we only discuss the analysis of Target-10 and defer the rest to a future work.

Ensemble Photometry and Flux Calibration
In order to estimate lags between the broad-line region and the continuum-emitting region of the QSO, we must first calibrate the i-band and Hα photometric magnitudes to a common magnitude system. We calibrate our i-band photometry using the known SDSS DR12 AB magnitudes of sources in the observed field. We calibrate the Hα photometry by propagating the available SDSS spectra through the transmission curve of the same narrow-band Hα filter (6566 Å) used to observe the Hβ line in Target-10, accounting for the fibre aperture.
We perform aperture photometry using Source Extractor (Bertin & Arnouts 1996) to estimate Petrosian magnitudes (Petrosian 1976; Graham et al. 2005) for each detected source in the field, for both the i-band and Hα exposures. We use Petrosian magnitudes in order to calibrate each exposure to the SDSS catalogue and to easily avoid the effects of differing seeing between our observations without modelling the PSF. We consider only those sources which have SDSS CLEAN=TRUE and Source Extractor FLAGS=0 for use as reference sources. We then apply an ensemble photometry method, similar to that detailed by Honeycutt (1992), to the i-band exposures and calibrate those instrumental magnitudes to the SDSS absolute AB magnitude system. The details of our ensemble photometry method are described in Appendix A.

Light-curve Calibration
Fig 3 shows the calibrated light curve for Target-10 in the i-band, along with the deviation from the mean magnitude for its reference sources. The average uncertainty in the AB magnitudes for Target-10 is about 0.015 mag, with the largest being 0.040 mag. The i-band magnitudes for Target-10 therefore have signal-to-noise ratios of between 25 and 120, exceeding the SNR > 20 recommended by Bentz et al. (2013) and Shen et al. (2015a) to achieve reliable lags.
Figure 3. Top: The light curve for Target-10 is shown in red, with its calibrated i-band AB magnitudes labelled on the right axis. The deviation from the mean magnitude for each of the reference sources for Target-10 in the i-band is also shown, on the left axis. Bottom: The i-band AB zeropoint for each exposure, calibrated to SDSS magnitudes using the Petrosian aperture.

The SDSS DR12 catalogue lacks Hα photometry, and our observed fields contain few sources for which SDSS has spectra (only one of which is not a QSO). Therefore, it is necessary to calibrate our Hα exposures to the magnitudes obtained from propagating SDSS spectra through the IO:O Hα photometric filters. We derive zeropoints, relative to the "best" exposure (i.e. the exposure with the highest mean SNR for the spectroscopic reference sources), for each of the Hα exposures by using the same ensemble photometry method detailed above. We make use of the SDSS spectroscopic catalogue to identify potential reference sources, but find only one such source (α_J2000 = 14h 46m 37s, δ_J2000 = +62° 57′ 36″) observed by BOSS. Our calibration depends upon the accurate measurement of the reference's flux within the Hα filter. Given that we find the source to be resolved into two components, as shown in Fig 4, the effects of seeing and aperture corrections cannot be neglected. We first fit a model consisting of two Gaussians to our best Hα exposure, then transform the model to the same seeing as the BOSS observation, and finally extract the flux contained within the BOSS 2 arcsecond aperture.

Figure 4. The model convolved to the SDSS seeing of the spectrum observation using a difference-of-two-Gaussians kernel. Overplotted in red crosshairs is the location of the centre of the 2 arcsecond BOSS aperture; the aperture itself is shown in the bottom right panel.

The difference between the ensemble-calibrated instrumental magnitude we obtain for our best exposure and the magnitude from the propagated BOSS spectrum is taken as our zeropoint, accounting for the uncertainties in both magnitudes. Fig 5 shows the resultant light curve for Target-10 in the Hα waveband, along with the deviation from the mean magnitude for its reference sources. Due to the necessary intermediate step of calibrating differential magnitudes to the AB magnitude system via the spectral reference source, the signal-to-noise ratio of the Hα magnitudes is smaller than that of the i-band. We measure signal-to-noise ratios for the Hα fluxes of Target-10 ranging between 19.5 and 80.0.

Figure 5. Top: The light curve for Target-10 is shown in red, with its calibrated Hα AB magnitudes labelled on the right axis. The deviation from the mean magnitude for each of the reference sources for Target-10 in Hα is also shown, on the left axis. Bottom: The Hα AB zeropoint for each exposure, calibrated to SDSS magnitudes using the Petrosian aperture.
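A condensed sketch of this two-Gaussian step follows, assuming a `cutout` image of the blended reference source is in hand. The pixel scale, seeing values, initial component positions, and fibre position are placeholders, and the seeing transformation is approximated by adding Gaussian widths in quadrature rather than by the difference-of-two-Gaussians kernel used in the paper:

```python
import numpy as np
from astropy.modeling import models, fitting

pix = 0.30                                   # arcsec per pixel (placeholder)
yy, xx = np.mgrid[:cutout.shape[0], :cutout.shape[1]]

# Fit two Gaussian components to the blended reference source.
init = (models.Gaussian2D(amplitude=1.0, x_mean=24, y_mean=25) +
        models.Gaussian2D(amplitude=0.5, x_mean=28, y_mean=26))
best = fitting.LevMarLSQFitter()(init, xx, yy, cutout)

# Degrade the model to the (worse) BOSS seeing: convolving a Gaussian
# with a Gaussian kernel adds the widths in quadrature.
extra = np.sqrt(1.6**2 - 1.1**2) / 2.355 / pix   # kernel sigma in pixels
for g in (best[0], best[1]):
    g.x_stddev = np.hypot(g.x_stddev.value, extra)
    g.y_stddev = np.hypot(g.y_stddev.value, extra)

# Model flux inside the 2 arcsec (diameter) BOSS fibre aperture.
r = pix * np.hypot(xx - fibre_x, yy - fibre_y)
fibre_flux = best(xx, yy)[r <= 1.0].sum()
```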
The zeropoints for both the i-band and Hα exposures can change by about 0.4 mag, and the exposures where this occurs are the ones with the highest uncertainty in the QSO magnitude. Upon inspection, it is clear that these exposures have increased cloud cover or worse-than-normal seeing. Our ensemble calibration method takes into account the instantaneous deviation of reference sources from their inferred mean magnitudes and updates their weightings accordingly (see Appendix A). We therefore do not exclude these exposures from further analysis.

Reliability simulations
Javelin (Zu et al. 2013) can be used to model quasar variability with either spectroscopic (Zu et al. 2011) or photometric (Zu et al. 2016) emission-line light-curves, modelling the continuum with damped random walk covariance kernels which control the strength of the correlation between any two flux observations given the time between them. Zu et al. (2013) find that the exponential covariance kernel is appropriate on time-scales, τ, between months and years, and we therefore adopt their recommendation for fitting with Javelin. Below a time-scale of a few months, the correlation becomes stronger than can be accounted for by the exponential covariance kernel (Mushotzky et al. 2011; Zu et al. 2013), and the characteristics of stochastic behaviour at time-scales longer than a few years are not well known due to a lack of data. There is further evidence that the DRW is not sufficient to explain the high-frequency light-curve variance seen by Kepler and SDSS (Kasliwal et al. 2015a; Guo et al. 2017; Smith et al. 2018). The impact of fitting non-DRW light curves under the assumption of a DRW model is not well understood. Furthermore, current variability-modelling techniques are not physically motivated and attempt to interpolate gaps in the light curve by assuming some correlated time-series model. If this model is too inaccurate and the gaps in the light curve too long, we risk producing artificial peaks in the lag posterior distribution, which can be indistinguishable from peaks describing the physical lag.
To test whether reverberation mapping with interpolated models can be trusted in the presence of such model-dependent problems, we employ three techniques: (i) Use a non-parametric Von Neumann estimator of the narrow-band + continuum time-series, as demonstrated with spectroscopic measurements in Chelouche et al. (2017). This allows model-independent verification.
(ii) Generate a suite of simulated light-curves each with different generative parameters to test whether a given method can reliably retrieve a known lag under different models.
(iii) Use the newly reprocessed Kepler light-curves (Smith et al. 2018) as the basis for realising the simulated light-curves, by fitting the Kepler data with the Continuous Auto-Regressive Moving Average process (see below) of order p = 2, q = 1, hereafter CARMA(2,1), using KALI (Kasliwal et al. 2015a, 2017). This allows a test of the performance of DRW fitting procedures on non-DRW light-curves.

Ideally, we would generate these light-curves from a physically motivated, self-consistent hydrodynamic model of the BLR. However, this is beyond the scope of this work, and so we settle on a suite of light-curves informed only by the reprocessed Kepler database (Smith et al. 2018); a prior distribution of BLR window parameters; and the observed SNR, cadence, and spectrum of our target QSO. In this BLR model (which is the same model that Javelin uses), the continuum light-curve is first smoothed by a top-hat window of width w and lagged by $t_{\rm lag}$, then scaled by the line-scale s. To generate the emission seen through the Hα photometric filter, the contribution of the continuum over the Hα filter is added to the simulated emission-line flux. The Hα photometric light-curve, n(t), is therefore described by

$$n(t) = \frac{s}{w}\int_{t-t_{\rm lag}-w/2}^{\,t-t_{\rm lag}+w/2} c(t')\,\mathrm{d}t' + \alpha\,c(t),$$

where α is the ratio of the continuum measured in the i-band, c(t), relative to that in the Hα filter. We fit the CARMA(2,1) process to the Smith et al. (2018) light-curves using KALI (Kasliwal et al. 2017). A CARMA process is a stationary time-series model consisting of auto-regressive and moving-average components (Kelly et al. 2014). Following Kelly et al. (2014), a CARMA(p,q) process, y(t), is defined as a solution to the stochastic differential equation

$$\frac{\mathrm{d}^{p}y}{\mathrm{d}t^{p}} + \alpha_{p-1}\frac{\mathrm{d}^{p-1}y}{\mathrm{d}t^{p-1}} + \dots + \alpha_{0}\,y(t) = \beta_{q}\frac{\mathrm{d}^{q}\varepsilon}{\mathrm{d}t^{q}} + \dots + \beta_{1}\frac{\mathrm{d}\varepsilon}{\mathrm{d}t} + \varepsilon(t),$$

where p is the total number of auto-regressive time-scales, q is the number of moving-average time-scales, y(t) is a small flux perturbation from the mean at time t, ε(t) is a white-noise process drawn as ∼ N(µ = 0, σ²), and the α and β are constants. We define $\alpha_p = \beta_0 = 1$, and the CARMA process is only stationary around a mean if p > q. A DRW, or CARMA(1,0), process is therefore defined as a solution to

$$\frac{\mathrm{d}y}{\mathrm{d}t} + \frac{1}{\tau}\,y(t) = \varepsilon(t),$$

where τ is the time-scale of the variations, bringing the total number of parameters to two (τ and σ). Similarly, a CARMA(2,1) process is defined as a solution to

$$\frac{\mathrm{d}^{2}y}{\mathrm{d}t^{2}} + 2\zeta\omega\,\frac{\mathrm{d}y}{\mathrm{d}t} + \omega^{2}\,y(t) = \beta_{1}\frac{\mathrm{d}\varepsilon}{\mathrm{d}t} + \varepsilon(t),$$

which is equivalent to a damped harmonic oscillator, where ζ is the damping ratio, ω is the angular frequency of oscillation, and β₁ controls the frequency dependence, or "colour", of the noise (i.e. if β₁ ≠ 0, the noise power spectral density, PSD, is not flat).
The CARMA(2,1) process therefore has four parameters, inclusive of the amplitude of the variations, σ. The differences between different CARMA processes are shown in Fig 6, which also highlights that the DRW is a CARMA(1,0) process, i.e. one lacking a moving-average component. The second-order differential equation underlying the CARMA(2,1) process is familiar from many branches of physics and is therefore more easily interpretable than higher-order processes. Indeed, thermal motion within a fluid produces sound waves described by a PSD ∼ ν² (Mellen 1952), which suggests that CARMA(2,1) can be physically motivated by such distortions in the accretion disk. For more information concerning the statistics and physical applicability of CARMA processes to astronomical light-curves, see Kelly et al. (2014) and Kasliwal et al. (2017).
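While we rely on KALI to fit and draw CARMA(2,1) processes, the DRW special case can be sampled exactly at irregular epochs with a few lines of code. The sketch below assumes the Kelly et al. (2009) parametrisation, with τ the relaxation time-scale and σ the asymptotic standard deviation of the fluctuations; it illustrates the kind of continuum realisations underlying our simulations:

```python
import numpy as np

def simulate_drw(times, tau, sigma, seed=None):
    """Exact sampling of a DRW / CARMA(1,0) (Ornstein-Uhlenbeck) process
    at arbitrary, possibly irregular, times."""
    rng = np.random.default_rng(seed)
    x = np.empty(len(times))
    x[0] = rng.normal(0.0, sigma)          # draw from the stationary state
    for i in range(1, len(times)):
        dt = times[i] - times[i - 1]
        rho = np.exp(-dt / tau)            # correlation across the gap dt
        x[i] = rho * x[i - 1] + rng.normal(0.0, sigma * np.sqrt(1 - rho**2))
    return x

t = np.sort(np.random.default_rng(0).uniform(0, 500, 40))  # irregular epochs
continuum = simulate_drw(t, tau=120.0, sigma=0.1)           # mag-scale wiggles
```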
The CARMA process has been shown to match the PSDs of AGN which deviate from the DRW model more accurately (Kelly et al. 2014; Kasliwal et al. 2017), since it has more degrees of freedom and is therefore more flexible than its lower-order counterpart (the DRW). Here we fit CARMA(2,1), using KALI, to all 20 of the Smith et al. (2018) reprocessed Kepler light-curves that have spectroscopic redshifts. Sampling from the time-scale probability distributions of each of the fits, we can produce light-curves whose structure functions and power spectra resemble those of the Kepler light-curves. As a comparison, we also perform this analysis for simulated light-curves generated by a DRW process, still using the same template Kepler light-curves.
These simulations allow us to estimate the degree to which we can trust lag parameter estimates for a given QSO target and fitting method. They reveal the nature of any artefacts which can occur due to the cadence, generative model, or interpolation of the input light curve. Furthermore, they allow us to test whether the DRW model predicts lags that are too optimistic, and hence to estimate a more robust uncertainty for the lag. We perform this analysis with 50 000 simulated light curves constructed by sampling from the time-scale distributions fit to the Smith et al. (2018) light-curves. The simulated observations are then taken at the same cadence as that of Target-10 and assuming the same signal-to-noise (shown as dots in Fig 7). In order to test how dependent the lag estimate is upon the zeropoint obtained from the spectral reference calibration source, we also scale the resultant continuum+line light curve by a zeropoint offset, bringing the total number of explicit parameters to six (where the CARMA/DRW time-scales are implicitly drawn from fits to the Kepler light-curves). For Target-10, we draw these parameters from the distributions fit to the Kepler sample, or inform them by inspecting the spectrum of Target-10, as appropriate. These parameter distributions are detailed in Table 2.
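To make the narrow-band construction concrete, a minimal sketch follows: the continuum is top-hat smoothed over width w, delayed by the lag, scaled by s, and added to the continuum leakage α c(t), as in the equation above. The dense uniform time grid and the crude edge handling are simplifying assumptions; the real simulations sample the result at the Target-10 epochs and add noise matched to the measured SNR:

```python
import numpy as np

def narrow_band(t, c, lag, w, s, alpha):
    """n(t) = s * (lagged, top-hat smoothed continuum) + alpha * c(t).
    Assumes t is uniformly spaced and dense compared with w and lag."""
    dt = t[1] - t[0]
    half = max(int(round(0.5 * w / dt)), 1)
    kernel = np.ones(2 * half + 1) / (2 * half + 1)  # top hat of width ~w
    line = s * np.convolve(c, kernel, mode="same")
    shift = int(round(lag / dt))                     # delay the line flux
    line = np.roll(line, shift)
    line[:shift] = line[shift]                       # crude edge handling
    return line + alpha * c
```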

Fitting methods
For each of these light curves, we run the following analysis to derive the best estimate for the lag. For Javelin, we infer the DRW parameters (amplitude, σ, and time-scale, τ) of the i-band continuum with 200 walkers, whether the light curve was generated by a DRW or not. We use the output probability distributions as a prior for the lag estimation using both the i-band and Hβ light curves.

Figure 6. The difference between CARMA processes of different order (whose matched parameters are described in the legend). Top left: the power spectra of the CARMA processes. Top right: the structure functions of the CARMA processes. Lower: one realisation of each CARMA process, generated from the same random seed.

Parameter | Distribution | Source
Kepler light curve | Choice [n = 20] | Smith et al. (2018)

Table 2. 50 000 draws were taken from these parameter distributions to create the simulated light curves for Target-10. Each draw created a different continuum light-curve from the posterior distribution of a CARMA(2,1) fit to a randomly chosen Kepler light curve. The result was then propagated through a lagged smoothing window of width w days, scaled by the line-scale s, and added onto the continuum contribution at the position of the Hα photometric filter, α c(t), to create the narrow-band light curve.
We run Javelin with the default settings of a logarithmic prior, which begins to penalise lag values larger than a third of the observational baseline (the time between the first and last observations), and a hard limit on lags longer than the baseline itself. MCMC chains must have converged before any reliable parameter estimation can be performed. The model is run until convergence is achieved, whereby the MCMC is halted when the autocorrelation time for all parameters changes by less than 1 per cent and the number of iterations is larger than 50 times the largest autocorrelation-time estimate, as recommended by Foreman-Mackey et al. (2013). We find that simply using the i-band and Hα time-series directly with the Von Neumann estimator produces biased results. Indeed, for light-curves with α > 0, the Von Neumann estimator starts to underestimate the lag. Therefore, when estimating lags with the Von Neumann estimator, we subtract the i-band continuum photometry from the Hα narrow-band photometry within the estimator. We apply the Von Neumann algorithm detailed by Chelouche et al. (2017) for 5000 samples, where each iteration samples a different realisation of the Target-10 light-curve from its flux uncertainties (the FR/RSS scheme defined by Peterson 2004) and subtracts the continuum realisation from the narrow-band realisation.
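An illustrative reimplementation of the Von Neumann statistic follows (not the Chelouche et al. 2017 reference code). Here `t_c, f_c` are the continuum epochs and fluxes and `t_n, f_n` the narrow-band ones, with the continuum already subtracted from the narrow band as described above; normalising both series to zero mean and unit variance is an illustrative choice to put them on a common scale:

```python
import numpy as np

def von_neumann_curve(t_c, f_c, t_n, f_n, taus):
    """Mean square successive difference of the merged, time-ordered
    continuum + (lag-shifted) line series for each trial lag; the
    minimum over taus marks the most likely lag."""
    f_c = (f_c - f_c.mean()) / f_c.std()
    f_n = (f_n - f_n.mean()) / f_n.std()
    V = np.empty(len(taus))
    for k, tau in enumerate(taus):
        t = np.concatenate([t_c, t_n - tau])   # shift line epochs by trial lag
        f = np.concatenate([f_c, f_n])
        d = np.diff(f[np.argsort(t)])          # successive differences in time order
        V[k] = np.mean(d**2)
    return V

# taus = np.linspace(-200, 200, 801); best_lag = taus[np.argmin(V)]
```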
This results in a large hyper-volume of probability distributions which we can marginalise over to give us the accuracy of lag estimates as a function of known input lags, for each fitting method.
Figure 7. One of the 50 000 light curves generated from a grid of parameters based on the Kepler light-curves. The continuum, pure line, and line-with-continuum light curves are shown in blue, red, and green respectively. The lines depict the intrinsic light curve generated by the simulated QSO using the damped random walk covariance kernel. The noisy observations, with the same signal-to-noise ratio as the calibrated Target-10 light curves, are shown as points. The mean flux of each light curve is shown as a horizontal line.

Due to the presence of more than one strong peak in the lag probability distributions, taking the median of an MCMC chain may place the parameter estimate in an area of low probability between peaks, rather than near a region of high probability; any quoted estimate and its uncertainty could therefore be misleading. We choose not to identify the primary peak by eye, but instead use a mode-finding method to identify the most probable solution within the highest-posterior-density (HPD) credible interval. The HPD interval is the narrowest interval that is guaranteed to contain the mode of the distribution. We fit a kernel density estimate (KDE) using the FastKDE algorithm (O'Brien et al. 2014, 2016), which calculates the kernel's parameters objectively (i.e. the hyper-parameters are informed entirely by the data, so it does not require user specification of bin width or kernel bandwidth), and choose the maximum of the resultant KDE as our best estimate for the Javelin parameters.
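A compact sketch of the mode-plus-HPD step follows. scipy's `gaussian_kde` stands in here for FastKDE, which selects its bandwidth objectively; the HPD set is found by lowering a density threshold until the enclosed probability mass reaches the requested level:

```python
import numpy as np
from scipy.stats import gaussian_kde

def kde_mode_hpd(samples, level=0.68, n=2048):
    """Mode and ~68 per cent HPD region from MCMC samples of the lag."""
    grid = np.linspace(samples.min(), samples.max(), n)
    pdf = gaussian_kde(samples)(grid)
    mode = grid[np.argmax(pdf)]
    order = np.argsort(pdf)[::-1]                  # densest grid points first
    cut = np.searchsorted(np.cumsum(pdf[order]) / pdf.sum(), level)
    region = np.sort(grid[order[:cut + 1]])
    # For a multimodal PDF this envelope may span several peaks, which is
    # why the paper additionally restricts to the minima around one peak.
    return mode, region[0], region[-1]
```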

Reliability Simulations
Fig 8 shows the distributions of the KDE best estimate of the Hβ lag, based on the output lag probability distributions from Javelin (top left: DRW as input; top right: CARMA(2,1) as input) and the Von Neumann estimator (bottom left: CARMA(2,1) as input). The first observation we can make is that Javelin does indeed perform worse when the input light-curve is not a DRW process, as Javelin assumes (an average error of 5 per cent versus 1 per cent for input lags between 10 and 250 days). We also see that the model-independent Von Neumann estimator recovers lags with an accuracy very similar to that of Javelin (4 per cent), when not assuming a DRW. In addition, all methods start to fail, with lag-recovery errors greater than 50 per cent, above 170 days. Given that Javelin starts to penalise lag values larger than a third of the observational baseline, it is perhaps not surprising that lags approaching the total length of the baseline itself are not as reliably recovered as those below a third of that length. The Von Neumann estimator does not apply such a prior and still experiences a drastic loss in accuracy beyond 170 days, suggesting that this loss is due to the finite baseline of the light-curve.
We also observe that there are a number of hyperparameter combinations whose recovered lags are incorrect by > 100 days. This occurs for combinations at all input lags and fitting methods and so we should not be surprised by spurious peaks in the probability distribution for Target-10 at higher lags. At all input lags and methods, we find artificial (i.e. incorrect) peaks at negative lags and so we can be justified in disregarding the peaks below -100 days. In particular, the Von Neumann estimator routinely places a large probability mass into a peak at -200 days. We find that there is always a large peak for all fitting methods at around 0-14 days, which coincides with the average cadence of observations (14 days).
The KDE method allows us to assess the most likely peak without referring to the unstable maximum-likelihood point, but it also implies a large uncertainty on the lag, given that there are other regions of high probability which cannot be ruled out a priori. We can address this issue in four ways: (i) Use the output lag distributions from our reliability simulations to mitigate the effect of non-linear artefacts that arise from the fitting process.
(ii) Apply a prior to the lag distribution based on previous lag and luminosity measurements and established relations (e.g. Bentz et al. 2013).
(iii) Limit analysis to the range of lags bounded by the minima surrounding the tallest peak.
(iv) Combine estimations from each fitting method, thereby mitigating the biases which are not shared by both methods.
We perform only the first, third, and last of the steps detailed above, since we want our lag measurement to inform the $t_{\rm rest}-L_{5100}$ relation, which cannot be done independently if our measurement results from the application of a prior based on that same relation.

Lag estimation for Target-10
We perform the same fitting procedure for Target-10 as for our simulated light curves, using Javelin and the Von Neumann estimator. Fig 9 shows the Javelin posterior predictive distribution for the observed light curves of Target-10, based on the burnt-in chain (i.e. with the first 1000 steps of the MCMC chain removed). The Hα posterior predictive light curve is the linear combination of the continuum and emission-line light curves, where the emission-line flux is only a fraction of the continuum; manually identifying the time delay between them would be difficult. Fig 10 shows that the most likely positive peak from Javelin coincides with a peak from the Von Neumann estimator. Corroboration from a model-independent method increases the likelihood that our detection is real.
However, the distribution of Hβ lags contains more than one convincing (SNR > 3) peak for both methods. Fortunately, since we have constructed a large suite of simulated light curves over a large range of DRW parameters, we can estimate the distribution of lag artefacts that results only from the fitting process and the properties of our data. We can then use this distribution to inform us as to which peak is the "real" one. For both Javelin and the Von Neumann estimator, we take the median PDF over all simulated light-curves. This creates a distribution of lags without a peak corresponding to the true input lag, since taking the median at any point suppresses such a peak. We scale the artefact distribution, an approximation of $1 - P(t_{{\rm H}\beta})$, so that its median probability matches the median probability of the distribution for Target-10, $P(t_{{\rm H}\beta}\,|\,D)$. We then divide the Target-10 lag distribution by this artefact distribution, which has the effect of suppressing spurious peaks. The results for Javelin and the Von Neumann estimator are shown in Fig 11. We can see in Fig 11(a) that when the light-curves are DRW-generated, as Javelin assumes, the artefact distribution contains many peaks. The highest peak in the lag PDF for Target-10, at ∼ −100 days, is completely accounted for by DRW+Javelin effects. However, the much smoother distribution shown in Fig 11(b), from using CARMA(2,1) light-curves, perhaps resulting from the greater inaccuracy in lag estimation, does not account for this peak. The artefact PDF of the Von Neumann estimator, shown in Fig 11(c), contains many peaks, but the largest is centred around 0 days and does not account for the large probability mass found at −200 days.
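The deconvolution step described above reduces to a pointwise rescaling and division over a common lag grid. A minimal numpy sketch, assuming `target_pdf` and the per-simulation PDFs are already evaluated on `lag_grid`:

```python
import numpy as np

def suppress_artefacts(lag_grid, target_pdf, simulated_pdfs):
    """The median PDF over all simulated light-curves approximates the
    artefact distribution; scale its median to the target's and divide
    it out, suppressing peaks the simulations show to be spurious."""
    artefact = np.median(simulated_pdfs, axis=0)
    artefact *= np.median(target_pdf) / np.median(artefact)
    cleaned = target_pdf / artefact
    return cleaned / np.trapz(cleaned, lag_grid)   # renormalise

# Combined estimate: multiply the cleaned Javelin and Von Neumann PDFs,
# then renormalise over lag_grid again.
```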
The accuracy of the Javelin estimates on CARMA(2,1) light-curves, as a function of input lag and input variability amplitude, is shown in Fig 12. There is a clear region within which Javelin is able to recover lags: the lag must be smaller than 170 days to have the best chance of recovery, and the lower limit on the continuum variability amplitude coincides with the mean fractional noise in the continuum light-curve (0.01). Partially following the method of Grier et al. (2017), we select the region bounded by the minima around the tallest peak (dashed lines in Figs 11 and 13) in the distribution that still contains artefacts. We then estimate the region of 68 per cent probability for the cases with and without artefact deconvolution, as shown in Fig 11. In order to show that any detected lag is robust to the choice of model, and to make use of all the available data, we combine the PDFs of the deconvolved Von Neumann and Javelin lag estimates by multiplying them (shown in Fig 13). We do not include the PDF estimated from Javelin with the DRW-generated artefact distribution, since we have shown that this is too optimistic. Deconvolution and combination do not entirely remove all ambiguity from the lag PDF, but they do push much of the probability mass into three distinct peaks, at −105, −20, and +63 days. The lack of noise and the distinct peak heights make reporting the +63-day lag more trustworthy and robust to the assumed generative time-series model (the Von Neumann estimator assumes no model, and the CARMA(2,1) simulations test Javelin's resilience to model mismatch).
We recover an Hβ lag for Target-10 of $73^{+4}_{-13}$ days without attempting to remove the influence of artefacts or combining techniques, and an Hβ lag of $65^{+6}_{-1}$ days when we apply artefact deconvolution and method combination. The lags estimated before and after deconvolution for each method are shown in Table 3. The best KDE estimate of the lag of Target-10 is consistent between both distributions, but the uncertainty shrinks by a factor of 2.5 when we use the artefact-deconvolution method to simplify the posterior and combine the estimates from the different techniques.

Fits to the $t_{{\rm H}\beta}-L_{5100}$ Relation
Using our derived time lag, we fit a power-law, with intrinsic scatter, to the lag versus luminosity in linear space:

$$t'_{\rm rest} = K\left(\frac{L_{5100}}{10^{44}\,{\rm erg\,s^{-1}}}\right)^{\alpha}, \qquad t_{\rm rest} \sim \mathcal{N}\!\left(t'_{\rm rest},\,(\epsilon\,t'_{\rm rest})^{2}\right),$$

where $t'_{\rm rest}$ is the lag that would be observed without the effects of intrinsic scatter in the relation and $t_{\rm rest}$ is the observed lag including that intrinsic scatter; the normal distribution is indicated as $\mathcal{N}$. We fit for the slope α, the intercept K, and the scatter scale ε. We correct the luminosity of our target for a host contribution of 24 per cent, as in Bentz et al. (2013); the details of the correction can be found in Appendix B. We do not fit a straight line in log space, since the uncertainties in lag and luminosity, along with the scatter, are not strictly Gaussian in linear space and certainly not in log space. This subtlety may have a significant impact on the slope of the fitted relation and therefore on its interpretation. We use this opportunity to test whether the correct treatment of non-Gaussian uncertainties makes a difference to the resultant fit. We resample the uncertainty distributions of the lag estimates 1000 times per data point in order to fit the power law. In this way, we incorporate the probability distribution from Javelin naturally whilst also treating values from the literature correctly. We do not fit the power-law to the Grier et al. (2017) dataset, since they reason that large selection effects, due to limited monitoring cadence and duration, may bias their lag measurements to lower values more so than for the Bentz et al. (2013) sample. Instead, we use the Clean2+ExtCorr dataset from Bentz et al. (2013), which excludes two AGN due to potentially biased time lags and corrects for the influence of internal extinction in one other. We recover the parameters listed in Table 4. Fig 14 shows the fitted lag-luminosity relation for the Bentz et al. (2013) Clean2+ExtCorr sample. There is no significant difference between the fits with and without Target-10 included. However, fitting in linear space produces a shallower relation (by ∼ 0.013 in slope) than that of Bentz et al. (2013), and so, at extreme luminosities, our fit is significantly (3σ at $10^{41}\,{\rm erg\,s^{-1}}$) different from the log-log straight line. Additionally, the uncertainty in our fit parameters is much reduced compared to Bentz et al. (2013), and the scatter is larger (by about 0.5 dex). We also note that the impact of selection effects upon this, and any, fit of a t-L relation will depend on the cadence and duration of the observations.

Figure 11. The probability distributions for the rest-frame lag of Target-10, before and after artefact deconvolution, for Javelin and the Von Neumann estimator performed on CARMA(2,1) and DRW light-curves: (a) Javelin with the DRW artefact PDF from DRW-generated light-curves; (b) Javelin with the CARMA(2,1) artefact PDF from CARMA(2,1)-generated light-curves; (c) the Von Neumann estimated PDF with the CARMA(2,1) artefact PDF from CARMA(2,1)-generated light-curves. Top panels: the full probability distribution for the rest-frame lag (blue histogram), along with the artefact distribution (black) derived from simulated light-curves. Bottom panels: the cleaned distribution of rest-frame lags for Target-10, where the artefact distribution has been deconvolved from the output rest-frame lag distribution. The region marked by dashed lines indicates where we estimate the 68 per cent HPD interval (shaded red area) and the mode (red line), determined by the position of the minima around the highest peak in the top panel (following the method of Grier et al. 2017).
This may go some way towards explaining the seemingly excessive number of QSOs populating the space below the Bentz et al. (2013) data points. Furthermore, there may be an accretion-rate dependency, whereby the more fundamental relation is the plane of rest-frame lag, luminosity, and accretion rate, as outlined by Du et al. (2016a). However, the explanatory power of this model is small for sources with the low accretion rates seen in the Grier et al. (2017) sample.
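For illustration, a simplified version of the linear-space power-law fit described above is sketched below with emcee. It collapses each lag posterior to the moments of its resampled draws and uses a Gaussian likelihood with fractional intrinsic scatter, which approximates, rather than reproduces, the full 1000-draw resampling scheme of the paper:

```python
import numpy as np
import emcee

def log_prob(theta, L44, lag_draws):
    """theta = (alpha, K, eps): t' = K * L44**alpha with fractional
    intrinsic scatter eps*t', compared against posterior lag draws."""
    alpha, K, eps = theta
    if K <= 0.0 or eps <= 0.0:
        return -np.inf
    model = K * L44**alpha
    var = (eps * model)**2 + lag_draws.var(axis=1)   # scatter + lag width
    resid = lag_draws.mean(axis=1) - model
    return -0.5 * np.sum(resid**2 / var + np.log(2.0 * np.pi * var))

# L44: luminosities in units of 1e44 erg/s; lag_draws: an (n_obj, 1000)
# array of draws from each object's lag posterior (FR/RSS or MCMC).
# sampler = emcee.EnsembleSampler(32, 3, log_prob, args=(L44, lag_draws))
# sampler.run_mcmc(initial_state, 5000)
```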
Table 5. Mass fit parameters for datasets including Target-10, using the same power-law parametrisation as in Table 4.

Propagating the posterior lag distribution for Target-10 through the virial relation, we arrive at the distribution for black-hole mass shown in Fig 15. The best estimates for the black-hole mass, with and without deconvolution of artefacts, are separated by only 0.01 dex. Fig 16 shows the black-hole mass-luminosity relation for the Bentz et al. (2013) Hβ lags, with line widths from the AGN Mass Catalogue (Bentz & Katz 2015). The parameter fits for the mass-luminosity relation are detailed in Table 5. We find that Target-10 is in good agreement with the Bentz et al. (2013) Clean2+ExtCorr dataset.
We find that the scatter of the mass-luminosity relation (0.5 dex) is much larger than that of the lag-luminosity relation in log space. This is unsurprising, since the former combines uncertainty from the virial factor f as well as the scatter in line widths shown in Fig 17, which plots black-hole mass against broad-line velocity dispersion. However, it is still useful to note that a black-hole mass predicted from the $t-L_{5100}$ relation can be wrong by more than 0.3 dex 50 per cent of the time.

Table 6. The efficiencies, calculated with different selection criteria, for SDSS-RM (Shen et al. 2015b; Grier et al. 2017) and this work. The efficiencies are calculated using Equation 12. We compare the efficiencies on a per-object basis as well as over the whole campaign. We compare our Target-10 to the most similar QSO in the Grier et al. (2017) catalogue (based on $f_{5100}$) and to their most precise lag estimation (in terms of ${\rm SNR}_{\rm lag}$). In all cases, photometric reverberation mapping is more efficient than spectroscopic reverberation mapping.

Efficiency
This observing campaign totalled 17.4 hours (15.2 for Hα and 2.2 for the i-band), with 5.9 hours dedicated to Target-10. This is far shorter than the large majority of spectroscopic observing campaigns, such as Shen et al. (2015a), where a typical epoch consists of at least eight 15-minute sub-exposures rather than our single 10-minute exposure per epoch with the Liverpool Telescope. Grier et al. (2017) achieved an average lag uncertainty of 3 ± 2 days and a maximum SNR of 23.1, whereas Target-10 has an uncertainty of +6/−1 days (SNR = 18.6), with much of the uncertainty attributable to artificial peaks having been mitigated using our simulations (see Section 3.2).
We define the efficiency as the mean ${\rm SNR}_{\rm lag}$ achieved for a given observing campaign, divided by the total observing time required and the collecting area:

$$\varepsilon = \frac{1}{n\,t_{\rm total}\,D^{2}}\sum_{i=1}^{n}{\rm SNR}_{{\rm lag},i},$$

where n is the number of observed targets (whether a lag was detected or not), $t_{\rm total}$ is the total observing time of the campaign, and D is the primary mirror diameter. The mirror diameters are 2.5 m for SDSS-RM and 2 m for this work, which uses the Liverpool Telescope. This gives us the expected signal-to-noise for a given QSO per hour of observation per unit collecting area. In order to make a fair comparison, we also include in the total time the SDSS spectrum integration time required to estimate velocity dispersions for each of our targets.
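Evaluating the efficiency as reconstructed above is a one-liner. In the sketch below, the total-time figure of ~33 hours is an assumed, illustrative value (the quoted 17.4 campaign hours plus SDSS spectral integrations), chosen only to show that the formula reproduces the order of the quoted efficiency:

```python
def efficiency(snr_lags, n_targets, t_total_hr, d_mirror_m):
    """Mean lag SNR over all observed targets (non-detections contribute
    zero) per hour of observing time per m^2 of aperture (Equation 12)."""
    return sum(snr_lags) / n_targets / (t_total_hr * d_mirror_m**2)

# One detected lag (SNR ~ 18.6) from 10 targets with a 2 m telescope and
# an assumed ~33 hr of total time including SDSS integrations:
print(efficiency([18.6], 10, 33.0, 2.0))   # ~0.014 hr^-1 m^-2
```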
We have achieved an efficiency of ε = 14.0 × 10⁻³ hr⁻¹ m⁻², whereas with spectroscopic reverberation mapping SDSS-RM achieved ε = 4.4 × 10⁻³ hr⁻¹ m⁻², where our fraction of sources with detected lags (0.2) is the same as that of Grier et al. (2017). This is a 218 per cent increase in efficiency over the multiplexed SDSS-RM campaign.

Figure 13. The probability distribution for the rest-frame lag of Target-10 after combining the artefact-deconvolved distributions of the Von Neumann estimator and Javelin. The dashed lines mark the 68 per cent HPD region.

If we instead calculate the signal-to-noise per hour per square metre per object, ${\rm SNR}/t_{\rm obj}$, we find that on average we achieve 12 times more signal-to-noise per hour than Grier et al. (2017). Since the SNRs of the Grier et al. (2017) lags do not depend strongly on redshift, observed flux, or luminosity, this is a fair comparison. The efficiencies described above include targets that we observed but did not analyse, and consider the whole observing campaign at once. If we consider only Target-10 compared to the most precise lag measured by Grier et al. (2017), for SDSS J142103.53+515819.5, our efficiency rises to 18 times more signal-to-noise per hour per square metre than theirs. Furthermore, if we consider the target most similar to our Target-10 in terms of observed flux (SDSS J140759.07+534759.8), their efficiency drops to ε = 1.3 × 10⁻³ hr⁻¹ m⁻².

Future Applications
Having shown that reverberation mapping using photometric methods with minimal spectroscopy can be an effective means of measuring black-hole masses, we foresee a number of exciting applications for long-term studies, which would require little extra effort to instigate.
The Liverpool Telescope (Steele et al. 2004) will soon be superseded by a new robotic successor, the Liverpool Telescope 2 (Copperwheat et al. 2014), with first light after 2020. The Liverpool Telescope 2 will benefit from a 4-metre diameter mirror, as opposed to the current Liverpool Telescope's 2 metres. Given the efficiency of photometric reverberation mapping with the current Liverpool Telescope, the application of these methods to its successor would be an effective use of time when applied robotically, and would make higher-redshift measurements possible.
Photometric reverberation mapping lends itself well to large surveys, which often require that the instrument make repeated visits to the same field for calibration against standard stars. Selecting calibration fields to contain known QSOs would generate light curves with baselines as long as the survey's duration, with a regular high-frequency cadence, for little extra effort. The upcoming photometric surveys of the Javalambre Physics of the Accelerating Universe Astrophysical Survey (J-PAS; Benitez et al. 2014) and its companion calibration survey, the Javalambre-Photometric Local Universe Survey (J-PLUS), promise an opportunity for sustained long-term photometric reverberation mapping campaigns. J-PAS is designed to accurately measure photometric redshifts for galaxies up to z = 1 with its unprecedented 56 narrow-band filters, and J-PLUS could easily observe the continuum and a wide range of emission lines for a sample of QSOs observed during calibration exposures. In addition, instruments such as PAUCam (Castander et al. 2012; Padilla et al. 2016), providing 40 narrow-band filters in addition to the u, g, r, i, z, and y photometric filters, could also detect lags with higher SNR and over a larger range of redshifts than IO:O. These observations could provide a far more detailed map of the broad-line region, as inferred by Williams et al. (2018), and also provide a large enough dataset to perform continuum reverberation mapping (Mudd et al. 2018).

Figure 14. The lag-luminosity relation for the Bentz et al. (2013) Clean2+ExtCorr sample, the Grier et al. (2017) sample (triangles), and Target-10. All points are coloured by redshift. The best estimate for the lag of Target-10 is shown as a bold green circle, with and without the artefact deconvolution. The best-fitting line in log space to the Clean2+ExtCorr dataset of Bentz et al. (2013) is shown in grey; the best-fitting line in linear space to the same data is shown in red. The best fit in linear space to the Clean2+ExtCorr dataset plus Target-10 is shown in blue. The scatter estimated by MCMC for all best-fitting lines is indicated by dashed lines.
The Large Synoptic Survey Telescope (LSST) will repeatedly observe large regions of sky with a high frequency and a 3-day cadence, making pure photometric reverberation mapping (Zu et al. 2016) with large numbers of QSOs a realistic possibility (Chelouche et al. 2014). A QSO light-curve dataset from LSST would probe the extremes of time-scales where the damped random walk model for QSO variability is thought to break down (Zu et al. 2013), whilst also providing opportunities for continuum mapping (Mudd et al. 2018). However, it is currently not clear whether LSST will be able to estimate accurate lags, since Chelouche et al. (2014) do not account for photometric measurement errors, the dilution of light-curve variations by the host-galaxy contribution, or the seeing effects which affect the host/nucleus separation and luminosity determination. Indeed, their selection is restricted to objects with strong emission lines, which is not the case for narrow-band photometric reverberation mapping.
Given that we can measure lags with ~7 days of uncertainty with current instrumentation, for baselines longer than $3\,t_{\rm rest}(1+z)$, these surveys' long campaigns and high cadences, along with high-precision photometry, will likely provide more than enough signal-to-noise for lag estimation for hundreds of QSOs/AGN covering a large range of lags and luminosities. Indeed, the strategic application of photometric continuum mapping, and of multiple narrow-band filters probing multiple broad-line region radii, will yield much information regarding the geometry and mass of SMBHs.

CONCLUSIONS
We demonstrate an efficient method for purely photometric QSO reverberation mapping at high redshift (z = 0.351) using Javelin (Zu et al. 2013, 2016). (i) We observe 10 targets selected for their estimated signal-to-noise, observable time, and inferred Hβ emission-line lag (according to the $t_{\rm lag}-L_{5100}$ relation fit in Bentz et al. 2013).
(ii) Observing conditions ruled out the observation of 5 of our selected targets, and 4 of the observed targets did not have the baseline, recommended by Shen et al. (2015a), required to observe their expected lag given their luminosity. We therefore proceed to discuss only SDSSJ144645.44+625304.0 (referred to as Target-10).
(iii) We calibrate the Hα and i-band light curves to SDSS AB magnitudes using an ensemble photometry method. In order to achieve as accurate an Hα relative calibration zeropoint as possible, we use the only available SDSS-BOSS spectrum. The source of this spectrum is resolved into two components in both our i-band and Hα exposures, and in the SDSS i-band exposures. Therefore, we fit a two-component Gaussian model to the source in order to transform it to the same seeing as the BOSS observation before fitting a zeropoint.
(iv) Javelin and other tools assume the frequently-used DRW model, and the effect of this assumption on the accuracy of lag estimation when the light-curves are not DRW-generated is not known. To make our lag robust to the choice of model and to its interpolation, we generate 50 000 simulated CARMA(2,1)- and DRW-generated light-curves, based on CARMA(2,1) fits to the reprocessed Kepler light-curves (Smith et al. 2018), using the same cadence and signal-to-noise measured in our calibrated light curves for Target-10. We find that although the accuracy of Javelin decreases when its base assumption is violated, it can still recover the correct input lag. Indeed, a model-independent Von Neumann estimator corroborates the 63-day peak in the Javelin lag PDF.
(v) We find that the output lag probability distribution from photometric RM is frequently affected by multiple peaks, some at negative lag values. We find that the median estimate of the lag from the posterior probability distributions often yields inaccurate values and large uncertainties. We therefore use an HPD kernel method (Section 3.2) to identify the most probable peak objectively. Using the HPD kernel method, we report the reliability of the Javelin and Von Neumann estimated lags over the range 0 to 316 days. We are able to reliably recover the original input lag over all other nuisance-parameter ranges for the simulated light curves, with an average deviation of 6 per cent, when the input lag is less than 170 days. When simulating light curves based on the signal-to-noise and cadence of Target-10, we find that, even with an error of up to 0.4 mag in the Hα narrow-band zeropoint calibration, we are still able to recover the given input lag to within an average of 6 per cent.
(vi) Using the simulated light curves generated from the reprocessed Kepler light-curves (Smith et al. 2018), we compile a distribution of the artefacts in the lag distribution produced by the Javelin and Von Neumann fitting procedures. We deconvolve the artefact distribution from the lag distributions of Target-10 and combine the estimates from both Javelin and the Von Neumann estimator, measuring Hβ lags and black-hole masses with smaller uncertainties than without artefact deconvolution. We find that the best estimates of the Hβ lag and black-hole mass do not change beyond the 68 per cent HPD credible interval when the artefact deconvolution is applied, but their uncertainties shrink. We recover an Hβ lag for Target-10 of $73^{+4}_{-13}$ days with Javelin, and an Hβ lag of $65^{+6}_{-1}$ days when we apply artefact deconvolution to both Javelin and the Von Neumann estimator and combine their results. Assuming f = 4.3 ± 1.1, we measure a black-hole mass for Target-10 of $10^{8.27^{+0.13}_{-0.15}}\,M_{\odot}$ with Javelin, and a black-hole mass of $10^{8.22^{+0.13}_{-0.15}}\,M_{\odot}$ when we apply artefact deconvolution and combination.
In conclusion, we find that if a Damped Random Walk (DRW) model is assumed by the fitting procedure when the light curves are generated by a different Continuous Auto-Regressive Moving Average (CARMA) process, we can still recover accurate lags (despite a small loss in reliability). By analysing the resulting probability distribution with more in-depth techniques, photometric techniques can approach the precision demonstrated by spectroscopic reverberation mapping. Furthermore, we achieve a higher average SNR than the SDSS-RM programme with a quarter of the total exposure time and a smaller telescope, a 218 per cent increase in SNR efficiency over SDSS-RM. These simple yet powerful photometric methods can be readily applied to large surveys which require regular calibration in order to build a large baseline of known QSO observations.

[Figure caption: "... Bentz et al. (2013) sample are drawn from the AGN Mass Catalogue where possible and calculated using $f = 4.3 \pm 1.1$ (Grier et al. 2013b). The Grier et al. (2017) masses are scaled from $f = 4.47$ to $f = 4.3$. All points are coloured by redshift. The best estimate for the mass of Target-10 is shown in green, with and without the artefact deconvolution. The best fit in linear space to the Clean2+ExtCorr dataset as well as Target-10 is shown in red. The scatter estimated by MCMC in all best-fit lines is indicated by dashed lines."]

ACKNOWLEDGEMENTS
SCR thanks Garreth Martin and Martin Hardcastle for fruitful discussions. SCR thanks Vishal Kasliwal for informative descriptions of how Kali works and for fruitful discussions about the application of the CARMA(2,1) process to the Kepler light curves. SCR thanks Krista Lynne Smith for providing the reprocessed Kepler light curves. This research made use of Astropy, a community-developed core Python package for Astronomy (Astropy Collaboration et al. 2013).

APPENDIX A: PHOTOMETRIC CALIBRATION
To further improve our set of reference sources, we perform a number of checks. First, we perform the same aperture photometry extraction with Source Extractor that we used on our own i-band exposures on the SDSS i-band exposures containing the candidate reference sources. If the Petrosian magnitude extracted from the SDSS exposures by Source Extractor does not agree with the Petrosian magnitude quoted in the SDSS DR12 catalogue to within 0.05 mag, we discard the source. This leaves the sources depicted in green in Fig A1.

Ideally, we would fit a single value of $m^{AB}_s - m_s$ across all instrumental magnitudes $m_s$ to measure the i-band zeropoint. However, as shown for the three example exposures in Fig A1, the IO:O CCD can become saturated for many bright sources and faint sources are noisy, resulting in non-linearity at both high and low magnitudes. We therefore employ a spline-based technique to select a contiguous range of Source Extractor magnitudes containing "well-behaved" sources, over which we can fit a single flat i-band zeropoint. We fit a spline to $m^{AB}_s - m_s$ against $m_s$ and find the range in which the gradient of the spline is $0 \pm 0.05$ mag (a rough sketch of this selection follows below). This range corresponds to the region where aperture photometry is least affected by saturation and noise, and is shown in the first quadrant of Fig A1. We then select those candidate reference sources with instrumental magnitudes within that range. These sources, along with Target-10, are highlighted in Fig A2 and Fig A3.

To estimate the exposure zeropoints and their uncertainties as accurately as possible, we employ an ensemble photometry technique similar to Honeycutt (1992). We begin by fitting the instrumental magnitudes to the SDSS AB magnitudes by minimising
$$\chi^2 = \sum_{e}\sum_{r} w_{er}\left(m_{er} - \hat{m}_r - \hat{z}_e\right)^2 + \sum_{r} \frac{\left(\hat{m}_r - m^{AB}_r\right)^2}{\left(\sigma^{AB}_r\right)^2},$$
where $m_{er}$ is the instrumental magnitude for reference source $r$ in exposure $e$ with weighting $w_{er}$, $\hat{m}_r$ is the magnitude of reference source $r$ assuming that it does not vary over the course of observations, $\hat{z}_e$ is the zeropoint for exposure $e$, and $m^{AB}_r$ is the AB magnitude of reference source $r$ as measured by SDSS, with associated uncertainty $\sigma^{AB}_r$. We begin the fitting procedure by setting the weight $w_{er}$ for each reference source in each exposure to the inverse variance of the instrumental magnitude given by Source Extractor, $1/\sigma^2_{er}$. We then fit the quantities $\hat{m}_r$ and $\hat{z}_e$ using EMCEE (Foreman-Mackey et al. 2013) with 20 walkers until chain convergence is observed.
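As a rough illustration of the spline-based range selection above (a sketch assuming the magnitudes are plain NumPy arrays with strictly distinct values; the smoothing and tolerance choices are placeholders, not the values used in our reduction):

```python
import numpy as np
from scipy.interpolate import UnivariateSpline

def flat_zeropoint_range(m_inst, m_ab, tol=0.05):
    """Find the contiguous range of instrumental magnitudes over which a
    spline fit to m_AB - m_inst vs m_inst has gradient within +/- tol,
    i.e. where a single flat zeropoint describes the data well."""
    order = np.argsort(m_inst)
    x, y = m_inst[order], (m_ab - m_inst)[order]
    spline = UnivariateSpline(x, y)              # smoothing spline fit
    flat = np.abs(spline.derivative()(x)) < tol
    # Keep the longest contiguous run of "flat" points
    best_lo, best_hi, run_start = 0, 0, 0
    for i in range(len(flat) + 1):
        if i == len(flat) or not flat[i]:
            if i - run_start > best_hi - best_lo:
                best_lo, best_hi = run_start, i
            run_start = i + 1
    if best_hi == best_lo:
        return None                              # no flat region found
    return x[best_lo], x[best_hi - 1]
```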
Some reference sources may genuinely vary over the course of our observations. In addition, the instrumental uncertainty from Source Extractor may be underestimated by some factor. To reduce the offset in the zeropoint caused by the inclusion of varying sources, we scale each initial weighting by the probability of its residual under a fitted Student-t distribution:
$$w_{er} \rightarrow \frac{p_{er}}{\sigma^2_{er}}, \qquad p_{er} = T\!\left(m_{er} - \hat{m}_r \,\middle|\, \hat{\mu}=0, \hat{\lambda}, \hat{\nu}\right),$$
where the inverse scale parameter, $\hat{\lambda}$, and the number of degrees of freedom, $\hat{\nu}$, are both fit to the distribution of $m_{er} - \hat{m}_r$, assuming a mean of $\hat{\mu} = 0$. The Student-t distribution, fit to the distribution of deviations of the instrumental magnitudes from their estimated means (i.e. the distribution of the values of the black points in Fig 3), updates the weighting of each magnitude in each exposure and therefore assigns very low weighting to sources which vary more strongly over the course of our observations than others. We iteratively run this re-weighting procedure until each flux measurement in the light curve of the target QSO changes by less than a tolerance of 0.001 mag. This typically takes 3-5 runs of MCMC inference, updating the weighting each time. The resulting light curves are given in Table A1.
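In practice, one pass of this re-weighting reduces to fitting a Student-t to the residuals and rescaling the inverse-variance weights. A minimal sketch follows, using scipy's parametrisation, in which the fitted scale plays the role of the reciprocal square root of the inverse scale parameter $\hat{\lambda}$ above:

```python
import numpy as np
from scipy.stats import t as student_t

def reweight(residuals, sigma_inst):
    """One re-weighting pass: fit a Student-t to the residuals
    m_er - m_hat_r with the location pinned to zero, then scale each
    initial inverse-variance weight by the density at its residual."""
    nu, _, scale = student_t.fit(residuals, floc=0.0)  # fit df and scale
    p = student_t.pdf(residuals, df=nu, loc=0.0, scale=scale)
    return p / sigma_inst**2

# Iterate: re-fit m_hat_r and z_e with EMCEE under the new weights,
# recompute the residuals, and repeat until the target light curve
# changes by less than 0.001 mag between passes.
```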

APPENDIX B: QSO-HOST DECOMPOSITION
We correct for the contribution of the host galaxy by fitting a host disc and QSO point source, both convolved with the SDSS i-band PSF obtained from the relevant psField file eigenimages, to the SDSS i-band photometry.

Figure B1. The posterior distribution of the QSO+Host fit to SDSS i-band data. The bounds of the corner-plot axes indicate the bounds of the uniform prior used in the nested sampling, except x0 and y0, for which the prior is normal with a width of 2 pixels. The parameters, left to right, are QSO amplitude, host amplitude, effective radius of the disc, the centre point, ellipticity, orientation angle, and background. The inset histogram is the derived posterior distribution of the ratio of QSO luminosity to host luminosity. The maximum posterior image of the QSO+host model is shown in the 5 top-left axes. The top three images show the total model and its residuals from the data. The bottom two show the QSO and disc components convolved with the PSF separately.
We use the nested sampler Dynesty (Speagle 2019) and allow all parameters to vary, including the background, orientation, ellipticity, and centre point. We use uniform priors on each parameter, as shown in Fig B1, except for the centre point (x0, y0), for which we impose a normal prior centred on the measured RA and Dec of the target with a width of 2 pixels (a minimal sketch of such a fit follows). As shown in Fig B1, we find strong constraints on the contribution of the host (24 per cent), and the residual of the maximum posterior model image shows that we have successfully modelled Target-10.
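For concreteness, a stripped-down version of such a fit might be set up as follows. This is a sketch, not our exact code: the prior bounds are illustrative placeholders, and `render_qso_plus_disc`, `image`, `var`, `X0_CAT`, and `Y0_CAT` stand in for the PSF-convolved model renderer, the SDSS cut-out, its per-pixel variance, and the catalogue pixel position.

```python
import numpy as np
from scipy.stats import norm
import dynesty

NDIM = 8  # QSO amp, host amp, r_eff, x0, y0, ellipticity, angle, background

def prior_transform(u):
    """Map the unit cube to the priors: uniform boxes everywhere except
    (x0, y0), which are normal with width 2 pixels about the catalogue
    position. All bounds here are illustrative placeholders."""
    x = np.empty(NDIM)
    x[0] = 100.0 * u[0]                           # QSO amplitude
    x[1] = 100.0 * u[1]                           # host amplitude
    x[2] = 20.0 * u[2]                            # disc effective radius (pix)
    x[3] = norm.ppf(u[3], loc=X0_CAT, scale=2.0)  # centre x0 (pix)
    x[4] = norm.ppf(u[4], loc=Y0_CAT, scale=2.0)  # centre y0 (pix)
    x[5] = u[5]                                   # ellipticity
    x[6] = np.pi * u[6]                           # orientation angle (rad)
    x[7] = 10.0 * (u[7] - 0.5)                    # background
    return x

def loglike(theta):
    """Gaussian pixel likelihood of the PSF-convolved QSO+disc model."""
    model = render_qso_plus_disc(theta)           # hypothetical renderer
    return -0.5 * np.sum((image - model)**2 / var + np.log(2 * np.pi * var))

sampler = dynesty.NestedSampler(loglike, prior_transform, NDIM)
sampler.run_nested()
results = sampler.results  # results.logz[-1] is the final log-evidence
```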
We also fit a QSO+disc+bulge model with a bulge Sérsic index of 4, but the data do not support the additional complexity of another component, with a Bayes factor of $\log[P(\mathrm{data}|\mathrm{disc})/P(\mathrm{data}|\mathrm{disc{+}bulge})] = \log B_{d,d+b} = 2.3$ in favour of the simpler model.
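The Bayes factor follows directly from the log-evidences of the two nested-sampling runs. In the notation of the sketch above, with `results_disc` and `results_disc_bulge` the hypothetical `sampler.results` objects of the two runs (Dynesty reports natural logarithms):

```python
# Log Bayes factor between the disc-only and disc+bulge models;
# a positive value favours the simpler QSO+disc model.
log_B = results_disc.logz[-1] - results_disc_bulge.logz[-1]
```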