The Cosmic Dipole in the Quaia Sample of Quasars: A Bayesian Analysis

We present a Bayesian analysis of the Quaia sample of 1.3 million quasars as a test of the cosmological principle. This principle postulates that the universe is homogeneous and isotropic on sufficiently large scales, forming the basis of prevailing cosmological models. However, recent analyses of quasar samples have found a matter dipole inconsistent with the inferred kinematic dipole of the Cosmic Microwave Background (CMB), representing a tension with the expectations of the cosmological principle. Here, we explore various hypotheses for the distribution of quasars in Quaia, finding that the sample is influenced by selection effects with significant contamination near the galactic plane. After excising these regions, we find significant evidence that the Quaia quasar dipole is consistent with the CMB dipole, both in terms of the expected amplitude and direction. This result is in conflict with recent analyses, lending support to the cosmological principle and the interpretation that the observed dipole is due to our local departure from the Hubble flow.


INTRODUCTION
A critical assumption in the contemporary cosmological framework is that the universe is homogeneous and isotropic at the largest scales (Harrison 2000).This is the cosmological principle, and it is for example taken as a starting point by the Friedmann-Lemaître-Robertson-Walker (FLRW) metric of spacetime and the Friedmann equations describing cosmic evolution.Homogeneity and isotropy were initially raised to the level of an a priori principle by Milne (1935) -but the question as to whether there is an a posteriori basis remains.If such a basis cannot be found, then we must critically re-examine the support for prevailing cosmological models.
The cosmological principle tacitly assumes the existence of a set of fundamental observers which reside in the 'cosmic rest frame' where the universe is maximally isotropic.This is supported by the fact that the 'cosmic microwave background' (CMB) is remarkably smooth with temperature anisotropies of order Δ/ ≈ 10 −5 .However, imprinted on these underlying small-scale fluctuations is a dipole anisotropy of order Δ/ ≈ 10 −3 .This is conventionally explained by the Earth's peculiar motion through the universe with a speed of 369.82 ± 0.11 km s −1 towards (, ) = (264.• 021, 48.• 253) in galactic coordinates (Planck Collaboration et al. 2020), which we denote as v CMB for future reference.If this explanation (the kinematic interpretation of the CMB) is correct, then other cosmological probes using all-sky surveys should show a similar anisotropy.Critically, distributions of matter at sufficiently large distances -namely where local clustering effects are negligible -should exhibit a dipole anisotropy, which we call the 'cosmic dipole' or the 'matter dipole'.If ★ E-mail: vasudeviiser@gmail.com† E-mail: ooay3125@uni.sydney.edu.au the cosmological principle is an accurate description of the universe, then the peculiar velocity inferred from this dipole should correspond with v CMB .
This matter anisotropy is observed, but there is no clear consensus on whether it is consistent with the cosmological principle or not.However, the general trend is that the matter dipole studies -specifically with radio galaxies and quasars -find a dipole that aligns with the CMB dipole in direction, but is larger in magnitude (Peebles 2022;Kumar Aluri et al. 2023).This 'dipole anisotropy problem' thus represents an outstanding problem amongst cosmological probes.Insofar that a consensus on this issue has not been reached, independent studies of matter dipoles with new catalogues of sources are key in further understanding the nature of this anomaly; for example, does it represents a shortcoming of our scientific understanding or an as of yet unresolved systematic issue?With this in mind, in this work we present an analysis of the recently-released Quaia quasar catalogue (Storey-Fisher et al. 2023).At the highest magnitude limit, this catalogue contains 1 295 502 sources.We examine the anisotropy in angular distribution of these quasars over the sky, applying a Bayesian framework to compare the inferred dipole to that of the CMB.The structure of this paper is as follows.In Section 2, we present the background theory and an overview of the instant state of the literature, including current observations of the cosmic dipole and an assessment of their consistency with the cosmological principle.The data under consideration in this study -the Quaia quasar catalogue -is presented in Section 3, and our approach to analysing the sample is examined in Section 4. The results are presented in Section 5. We discuss our results and present our conclusions in Section 6.

BACKGROUND: NUMBER COUNT DIPOLE
The cosmological principle's key assumption of homogeneity and isotropy can, and has been, tested.One critical family of tests involves probing the distribution of matter in the Earth's frame of reference; these are the 'matter dipole' studies.If we assume the principle to be an accurate description of the universe, then the CMB's temperature dipole is interpreted to arise solely from the Earth's peculiar motion.Moreover, the dipole-removed frame is the frame of 'cosmic rest' where the universe is perceived as maximally isotropic and homogeneous.Insofar that the Earth's peculiar velocity imprints a Doppler shift on observed sources like radio galaxies, we should be able to recover the magnitude and direction of this motion from the dipole in matter distributions over the sky.Framed in this way, measuring the consistency between the CMB-inferred and matter-inferred velocities is the linchpin of the matter dipole studies.
To see this, consider an observer moving with velocity  ≪  with respect to distant sources which are isotropic and homogeneous in their own rest frame.As suggested in Ellis & Baldwin (1984), if within the observer's passband the sources have a spectral energy distribution with a power law dependence on frequency described by  ∝  −  , and the apparent flux density has a cumulative power law distribution  (> ) ∝  −  , then Doppler boosting and relativistic aberration will induce a dipole anisotropy in the distribution of sources in the observer's frame.The isotropic frame of reference will be boosted by an amplitude This is the famous 'kinematic dipole', and Ellis & Baldwin (1984) made the rough estimate that a minimum of  (10 5 ) sources would be needed to discern this dipole.The implicit assumption here is that the observer should survey the sky until a flux density above which there is no directional bias in the completeness of the survey.Additionally,  and  are assumed to not be redshift-dependent, although there has been some suggestion that this simplification should be revisited (see e.g.Dalang & Bonvin 2022).Further, local inhomogeneities introduce a clustering dipole, so for a genuine measurement of the cosmic dipole a significant fraction of the sources need to be at high redshifts ( ≈ 1; Tiwari & Nusser 2016).From equation (1), the net dipole anisotropy for a patch of sky in the direction n will be Various all-sky surveys of radio sources have been used to trace out this dipole over the sky, and thus probe the cosmological principle.We note that Kumar Aluri et al. (2023) deals with the genealogy of these tests in greater detail, but none the less we recount some of the salient results here.Blake & Wall (2002) initially found support for a kinematic dipole aligned with the CMB and possessing the expected amplitude.However, the immediate state of the literature is equivocal as to whether or not the matter dipole is consistent with the CMB dipole.Many studies (see e.g.Singal 2011;Colin et al. 2017;Bengaly et al. 2018;Singal 2019;Siewert et al. 2021;Singal 2023;Wagenveld et al. 2023) have reported dipole amplitudes that are in excess of the CMB expectation, while the inferred dipole directions generally align with the CMB dipole (although notably Darling (2022) and Cheng et al. (2023) find consistency with the CMB dipole for their chosen radio catalogues).We point out that in the foregoing works and amongst others, authors discussed the appropriate choice of dipole estimator at length, including whether or not certain estimators incur a bias that must be accounted for.To our knowledge, tests instead formulated in the language of Bayesian statistics have been used less extensively, which we discuss below.
Turning away from the radio galaxy studies, Secrest et al. (2021) showcased that the Ellis & Baldwin (1984) method can be used to study the matter dipole in quasar samples.This study, taken together with the joint radio galaxy and quasar analysis in Secrest et al. (2022), is perhaps one of the more significant challenges to the cosmological principle.Therein, the authors studied the dipole in the distribution of quasars from CatWISE2020 (Marocco et al. 2021) using a least squares estimator, finding that the amplitude was at least twice as large as expected (at a 4.9 level of statistical significance).A similar conclusion with the same sample was reached in Kothari et al. (2022).Separately, Singal (2021) used a sample of 0.28 million quasars and also found a dipole magnitude in excess of the CMB expectation, although the sample size there was about 5 times smaller than that of Secrest et al. (2021).
As we touched on earlier, these analyses used frequentist statistics, and the results are sensitive to the estimator chosen.However, a Bayesian analysis of CatWISE2020 was performed by Dam et al. (2023), in which Secrest et al. (2021)'s result of an anomalously large dipole was confirmed at a statistical significance of 5.7.Taken together, these results lend evidence to the proposition that the quasar dipole is in tension with the kinematic dipole inferred from the CMB.
On the basis of the foregoing, the literature interrogating the matter dipole is by no means unanimous.That being said, these works do not represent an exhaustive survey of what is possible; a suite of other probes have been formulated, many of which are accounted for in Kumar Aluri et al. (2023).Some of these include tests with Type Ia SNe (see e.g.Horstmann et al. 2022;Singal 2022;Sorrenti et al. 2022), analyses of bulk flows (see e.g.Watkins et al. 2023) and direct probes of the FLRW metric with tests of spatial curvature (see e.g.Zhou & Li 2020).Recently, Oayda & Lewis (2023) proposed a novel test involving a dipole in time dilation, as sources with intrinsic time-scales are time dilated along the direction of the Earth's motion.
Returning to quasars, if there is an outstanding tension between the dipole inferred from quasars and that expected from the CMB, then closer scrutiny is warranted.Since the cosmological principle is a foundational assumption in the prevailing cosmological paradigm (Harrison 2000), a challenge to it cannot be easily overlooked.In this work, we present another analysis of the dipole in quasar distributions.We tested the recently-released Quaia catalogue (Storey-Fisher et al. 2023), employing Bayesian inference to understand which model is best supported by the sample and whether the inferred dipole is consistent with that of the CMB.

QUAIA CATALOGUE
The Quaia catalogue (Storey-Fisher et al. 2023) is principally taken from quasars observed by the Gaia satellite (Gaia Collaboration et al. 2016), which were released in Gaia DR3 (Gaia Collaboration et al. 2023a,b).The full sample of DR3 quasar candidates totals to 6 649 162 sources, which was the starting point for the authors.
In constructing their catalogue, they first imposed that all Gaia quasars have a measurement of photometric magnitude in the ,  and  bands.Additionally, the authors cross-matched each of the quasar candidates with those from the Wide-field Infrared Survey Explorer (WISE; Wright et al. 2010), using the unWISE reprocessing to also provide photometric information in the 1 and 2 infrared bands.To decontaminate their sample, the authors imposed proper motion cuts, since quasars are anticipated to be sources well within the background, and a number of colour magnitude cuts.They finally applied a  < 20.5 magnitude cut, the result of which constitutes their primary catalogue: the 'Quaia high' catalogue.Another cut of  < 20.0 created the 'Quaia low' catalogue, since the authors noted that deeper magnitudes sacrificed purity and measurement precision.
One other issue is outstanding: selection effects.To mitigate these, the authors created a selection function to account for how some sources are preferentially observed at different locations on the sky due to dust extinction, stellar density and the peculiarities of Gaia's scanning pattern.This information is encoded in four maps: a dust extinction map; a stellar distribution map; a separate Large Magellanic Cloud and Small Magellanic Cloud stellar map; and, a map encoding Gaia's scanning law and source crowding.This data is passed to a Gaussian process, producing a probability map: the selection function.The selection function describes how likely it is for sources to be included in the final catalogue depending on where they are on the sky.In other words, regions which are less dense on the basis of systematics like dust extinction will be associated with a lower probability, and regions which do not suffer from these effects have a probability closer to 1.
A visualisation of the raw Quaia low and Quaia high catalogues with number count densities can be seen in the top row of Fig. 1.These maps, as well as subsequent ones, are displayed in galactic coordinates.We show the selection function provided by the Quaia authors for both catalogues in the middle row of Fig. 1.By visual inspection, dust extinction appears to dominate the map, which explains the dearth of sources near the galactic plane in the raw catalogue.Finally, we show smoothed maps in the bottom row of Fig. 1 for Quaia low and Quaia high.To generate the smoothed map, we first scaled the catalogue according to the selection function such that the -th pixel with number of sources   (see Section 4.1.1 for information on how sources are binned) is scaled by 1/  , where   is the value of the selection function at that pixel.We then implemented a sliding average; for each pixel, we selected pixels within 1 steradian and computed the mean density.These maps give a visual cue of a source over-density near the galactic centre, as well as under-densities near mid galactic longitudes along the galactic plane.Superimposed on the maps are two masks we chose to use, which are explained in more detail in Section 4.1.2.

Binning
In order to prepare the catalogue for analysis, the sky was divided into equal-area pixels using the the pixelisation regime of HEALPix1 (Górski et al. 2005;Zonca et al. 2019) as incorporated in the Python package healpy. side = 64 -generating a total of 49 152 pixelswas chosen, since the selection maps created by the Quaia authors are given at this resolution.The choice of  side depends upon the fact that for number count analysis, the uncertainty in number counts for each pixel due to shot noise should not be greater than the mean number count for the catalogue.We then summed the number of sources within each pixel using their recorded positions in right ascension and declination.This gives a means by which changes in the source density can be discerned as a function of sky position.-Fisher et al. (2023) noted that the selection function is potentially poorly-modelled in the vicinity of the galactic plane.In making this judgment, they computed the fractional residuals between a synthetic catalogue generated by randomly sampling over a sphere according to the selection function and the actual Quaia catalogue.Around the edge of the plane, the random synthetic catalogue over-predicts the data; additionally, near the galactic centre, the random catalogue seems to under-predict the data.We note that in the bottom row of Fig. 1, which shows our smoothed map of the Quaia low and Quaia high samples, there indeed appears to be an over-density near (, ) ≈ (0 • , 30 • ) as well as under-densities along the galactic plane from about  = 120 • to  = 240 • .This is in line with the proposition made by the Quaia authors.For example, if the galactic centre is under-predicted by the selection function, then   < s where s is some true value of the selection function.Thus, in our smoothed maps which originate from scaled number counts, the -th pixel has a number count   /  >   / s , manifesting as an over-density.

Storey
In order to address this issue, we chose to mask the galactic plane with a series of increasingly conservative masks, as the Quaia authors suspected may be necessary at Section 4.5 in Storey-Fisher et al. (2023).To be explicit, we examined the effect of || < 10 • , 20 • , 30 • and 40 • galactic plane masks on the recovered signal in conjunction with an unmasked catalogue.The 30 • mask curtains much of the problematic regions, but it is still possible that at the edge of the mask the issues at the galactic plane seep into the masked sample.Accordingly, in addition to testing with a 40 • mask, we implemented a circular mask centred on ( • ,  • ) = (0, 0) and subtending a solid angle of 4 sr in concert with the 30 • galactic plane mask.We denote this as a 30 * mask for future reference.The 40 • mask is represented by the solid black line overlaid on the bottom row of Fig. 1, and the 30 * mask is represented by the dashed black line.

Dipole amplitude expectation
Since we are ultimately testing the kinematic interpretation of the CMB, we will need to compare the expected dipole amplitude given CMB-inferred motion and the actual recovered dipole from the Quaia sample.Conventionally, this amounts to using equation (1) with  =  CMB ≈ 369 km s −1 .This also means that  and  must be ascertained from the sample of galactic sources.Here, we instead use the actual source counts themselves -rather than their proxy  -and take the distribution of  to find a distribution of dipole amplitudes D given .This approach is detailed below.

Spectral index
As mentioned earlier, we assume that the -th Quaia source follows a flux power law such that   ∝  −   .To find the spectral index   , we compute the colour magnitude  − .Since Gaia magnitudes are measured in the Vega system, we use the zero points (ZP) and mean wavelengths of the  and  bands, as provided in (Riello et al. 2021)  where  is ZP  − ZP  and in the last line we used the assumption that   ∝  −   .Equation ( 5) yields a distribution of spectral indices for Quaia low and Quaia high, which we show in Fig. 2. The mean value of  is labelled there only for illustrative purposes; what is important, in our analysis, is the distribution itself.

Source number counts
To find the distribution of fluxes in the Quaia sample, we first convert the  magnitude into a Gaia flux using the zero points mentioned above.This yields Gaia fluxes in units of photoelectrons s −1 , though we note that these fluxes can also be found by matching each Quaia source with its entry in DR3 by using each entry's Gaia DR3 source identifier.To convert from these units into Jy, we apply the relevant conversion factor   found in the Gaia documentation (European Space Agency & Gaia Data Processing and Analysis Consortium 2021).A histogram showing the resultant flux distribution is presented in Fig. 3. Overlaid there in red is the integrated distribution, i.e. the number of sources above some limiting flux density  (> ).
In the context of quasar studies, the approach used in describing the source count distribution has been similar to that of radio galaxy studies.Specifically, in determining the dipole amplitude, power law fits of the form  − to the integrated source counts have been used (see e.g.Secrest et al. 2021), as well as piece-wise straight line fits, onto which a flux cut is imposed such that the data is constrained to the regime of one of those power laws (see e.g.Singal 2023).This traces back to the original conceptual framing of Ellis & Baldwin (1984), which supposed that a radio galaxy population can be described by a power law.Yet, by inspection, a straight line fit to the integrated counts in Fig. 3 is not the best reflection of the actual data.In order to get a better hold of the non-linear nature (in logarithmic space) This follows from the analysis in Section 4.2.2 and was performed with both the Quaia low and Quaia high samples.
of the data, we instead work directly with the observed fluxes in the  band.When we observe a population of sources and their associated fluxes, we expect a number count enhancement in the forward hemisphere and a diminution in the backwards hemisphere; sources become brighter and fainter respectively, and they congregate along the line of motion.The conceptual underpinning of Ellis & Baldwin (1984) is that the observed source count power law  −  is the resultant of these two effects.Thus, in order to find the expected dipole amplitude, the following method was used.
(i) We computed the number of sources greater than some limiting flux density  0 , denoted as   .
(ii) For the -th source with measured flux   , a Doppler shift was applied such that   →    1+ where  = (1 +  cos ) and  is the Lorentz factor.This is the relationship between the observed and rest frame flux densities as described in Ellis & Baldwin (1984).
(iii) We then computed the number of boosted fluxes greater than some limiting flux density and multiplied this sum by  2 , which anticipates relativistic aberration.We denote this final value as   .
(iv) Combining the above, we then calculated the expected dipole amplitude as Note that we have selected units where  = 1, and if we take a measurement along the line of motion -the direction of maximal density enhancement -then  = 0 so  = (1 + ).We also fixed  0 to be near the flux limit of the catalogue.For Quaia low,  = 20.0 corresponds to a flux of ≈ 3.27 × 10 −5 Jy in the  band, so we took  0 = 3.3 × 10 −5 Jy.For Quaia high,  = 20.5 corresponds to ≈ 2.06×10 −5 Jy, and so we used  0 = 2.1×10 −5 Jy.Then, substituting  =  CMB into the above analysis and randomly sampling  from Fig. 2 50 000 times, we find a dipole described by the distributions in Fig. 4 with mean amplitude D ≈ 0.0080 for Quaia low and D ≈ 0.0068 for Quaia high.

Bayes's theorem
With the Quaia sample prepared for analysis, we now explore how the framework of Bayesian statistics provides a natural language to test competing hypotheses and understand the Quaia data.The specific models we consider are explained in Section 4.5, but for now, a Bayesian approach to model comparison can be broken down into two key steps or levels (Mackay 2003).At the first level of inference, a model's parameters are optimised and the posterior distributions for those parameters are recovered.This amounts to solving Bayes's theorem, where We have recast the notation of Bayes's theorem in line with Speagle (2020) to better indicate what each term means in the context of model inference; namely, D refers to the data and  refers to the set of parameters pertaining to model .L,  and Z refer to the likelihood, prior and evidence or marginal likelihood functions respectively, and  denotes the resulting posterior probability distribution.

Evidence and the Bayes factor
At the first level, the marginal likelihood only represents a normalisation term.However, at the second level of inference, competing hypotheses or models are ranked using the Bayes factor, which is the ratio of the marginal likelihoods for each model.Specifically, if we wish to determine the relative levels of support for models  1 and  2 , we may compute where  denotes the Bayes factor.It is advantageous to work with natural logarithms since the actual value of the marginal likelihood is generally very small.In this work, any quoted marginal likelihood or Bayes factor is a natural logarithm.The interpretation of the level of support depends on the value of the Bayes factor.In this example, ln  12 > 0 means that  1 is preferred over  2 , with larger values indicating more support for  1 .For reference, Kass & Raftery (1995) provide a qualitative description depending on the value of the Bayes factor, although we stress that this is not meant to be definitive.We also note that the table in their work has units of 2 ln .We present our values as ln , so the reader will need to keep the factor of two in mind when comparing our values to the ranges appearing in Kass & Raftery (1995).
Since the marginal likelihood is an integral over all parameter space Z = ∫ Ω  L () × () , models with excessive parameters which waste parameter space are intrinsically disfavoured.To a rough approximation, the marginal likelihood can be written as following the argument in Mackay (2003).L ( MP ) is the value of the likelihood function at the most optimal set of parameters, where Δ/Δ 0  can be analogised as the ratio of the peak in the likelihood function to the width of the prior distribution.This term is called the Occam factor, and it generally penalises models which squander parameter space and only have explanatory power (i.e. a high likelihood) for a comparatively small region of parameter space.
For each of our hypotheses or models, we compute the marginal likelihoods and Bayes factors, using these metrics to rank them.The result of this process is a quantitative evaluation of which hypothesis has the strongest level of support, and as such what kind of model best accounts for the Quaia sample.

Nested sampling
A key difficulty with Bayesian methods is that the marginal likelihood cannot usually be computed analytically, and is somewhat expensive to determine numerically.However, modern computational methods are well-adapted to this challenge, and provide efficient algorithms which are easy to implement.In this work, we take advantage of the Nested Sampling (NS) algorithm (Skilling 2004(Skilling , 2006)).The focus of NS is on first determining the marginal likelihood and then evaluating the posterior distributions for a model's parameters as a 'subsidiary element' (Skilling 2004).In trying to evaluate the marginal likelihood, which is an integral over all parameter space as mentioned earlier, NS recasts the integral to be over the prior instead and generates iso-likelihood contours or shells of increasing likelihood (Speagle 2020).This gives an effective means of evaluating Z and an associated uncertainty, with the posterior distribution as a byproduct.Here, we have used Dynesty2 (Koposov et al. 2023), a Python package which implements the NS algorithm.

Likelihood functions
Armed with the methods of Bayesian analysis, the next step in our approach is to construct likelihood functions L to be placed in Bayes's theorem at equation ( 7).There are essentially two approaches that we can apply; these each test the same underlying assumption, but are framed in slightly different ways.We describe each of these in a model-invariant manner, eluding to how they would be altered to fit a particular hypothesis along the way.

Poissonian statistics
In the first approach, the binning process referenced in Section 4.1.1 is analogous to a Poisson point process, and variations in number density (occupancy) across each pixel can ideally be explained by two factors: shot noise and an intrinsic signal.Shot noise necessitates that we associate each pixel with a Poisson distribution, i.e. the occupancy for a given pixel is a random variate drawn from a Poisson distribution.The intrinsic signal could be a dipole, where sources in the forward hemisphere are associated with higher number densities than sources in the backward hemisphere.This modifies the rate parameter of the Poisson distribution describing a pixel's number density.
In light of this, the probability  of observing   sources in pixel  can be written as where pi is a unit vector pointing towards the -th pixel and   is the rate parameter for the -th pixel.The expected number of occupants in a pixel is just the rate parameter: In practice, as explained in Section 3, Quaia is packaged with a selection function that ascribes each pixel with a probability.On top of statistical fluctuations and variations due to the underlying signal, pixels might have source under-densities because of factors like extinction from dust in the galactic plane.This needs to be accounted for insofar that we are examining the assumption of homogeneity and isotropy.We can do this through attenuating the rate parameter by the value of the selection function   (a probability between 0 and 1) at the -th pixel.Explicitly, this means that By associating each pixel on the sky with a probability determined from Poisson distributions, the likelihood function can be written as the product of all probabilities.As a logarithm, this becomes for total number of pixels  pix. .

Point-by-point analysis
Turning to the second approach, we are not restricted to our choice of Poissonian statistics.We can see this by examining each source individually such that the likelihood function is now the product of all points, not the product of all pixels.Thus, we term this the 'pointby-point' approach.However, the sky is still discretised to simplify the calculation; points within a certain pixel are assumed to have the same probability.This approach was adopted by Conn et al. (2011Conn et al. ( , 2012) ) in their distance determinations using the tip of the red giant branch for sparsely populated systems.
Let us introduce the function   , which describes the anticipated signal at pixel .This is a model-dependent term, the functional form of which we leave for Section 4.5.If we examine each source individually and not as a member of a pixel, then the distribution associating points with a probability depending on their position on the sky takes the form   , at least up to a normalising constant.Thus, the contribution to the likelihood function at the -th pixel is ∝ (  ×   )   , since we are taking the product over all points   in the pixel.
To normalise this distribution for the -th pixel (denoted as f below), which is critical after application of a mask and selection map, we sum over all unmasked pixels such that Thus, the natural logarithm of the likelihood function can be written as In principle, both the Poissonian and point-by-point models should give consistent results, since both represent slightly different approaches to describe the same underlying effect.We confirm this in our results at Section 5.

Hypotheses under consideration
Suppose that the distribution of sources is in fact homogeneous and isotropic in the observer's frame.Then a monopole signal is anticipated, where pixels are expected to have some mean number density N irrespective of the location of the pixel on the sky.Expressed differently, the expected number density is This acts as the null hypothesis for our study.In order to compute the likelihood function in the Poissonian case, the rate parameter   of equation ( 11) is replaced with N, whereas in the point-by-point case   is set to 1.These are then substituted into equations ( 12) and ( 14) respectively.Thus,  Pois.= { N } in the Poissonian case and  P×P = ∅ in the point-by-point case.

𝑀 1 : Dipole
As an alternative to the null hypothesis, we introduce the vector D, which points in the direction of the dipole signal and has amplitude equal to the magnitude of the dipole (see e.g.equation ( 1)).The anticipated number count is then described by the sum of the monopole and dipole signals (see e.g.Dam et al. 2023) such that where   is the angle between the dipole direction and the -th pixel vector, and D is the magnitude of the dipole.
Here, Equation ( 16) is the rate parameter   that is inserted into equation ( 11) where Poissonian statistics is used, and   = 1 + D cos   in the point-by-point analysis for the purposes of equation (13).Evidently, the parameter spaces are given by  Pois.= { N, D, , } and  P×P = {D, , }, where  and  characterise the direction of the dipole in galactic coordinates.

𝑀 2 : Double dipole
The presence of an over-density region just above the galactic center and other under-densities along the galactic plane in both Quaia low and Quaia high, which we described in Section 4.1.2,hints towards the fact that the net dipole in Quaia might be a combination of two dipoles.The net dipole modulation is then the multiplication of two individual dipoles, and the expected number density for the -th pixel is given by This is used as a rate parameter for equation ( 11) where Poissonian statistics is used, and in the point-bypoint case for equation ( 13).Thus, our parameter space is The assumption here is that the two dipoles were generated at different times and due to different factors.So, the observer's frame was already anisotropic due to one dipole at the genesis of the second dipole, and it makes sense to apply an extra modulation on top of an already modulated sky.If this was not the case and the genesis time were the same for the two, then the net motion should be in a direction that is in between the two dipoles and hence only a single dipole would be observed.Even if such a scenario seems unlikely, it is worth examining as the marginal likelihood will balance the explanatory power of this model and its complexity, potentially revealing deeper insight into the nature of the Quaia sample.

𝑀 3 : Quadrupole
For the sake of completeness, we also test for an underlying quadrupole signal.We postulate that this signal is ∝ cos 2 , such that the expected number density for the -th pixel is Our parameter spaces are  Pois.= { N, D, , }, where the tilde on the signal magnitude suggests the fact that the quadrupole amplitude is not necessarily the same as the foregoing dipole models.For the point-by-point case,  P×P = { D, , }.The expected form of this signal is inserted into equations ( 11) and (13), as already described above.It is worth noting that the quadrupole model is in fact a special case for the double dipole model with D 1 • pi = −D 2 • pi in equation ( 17).

𝑀 4 : Dipole pointing towards the kinematic dipole
In order to verify the kinematic interpretation of the cosmic dipole, it is worth fixing certain parameters to their value as determined from the CMB dipole while varying the others.Doing so is advantageous, as the marginal likelihood for a kinematic hypothesis can be compared to models where such an interpretation is not assumed.
Here, we fix the model's dipole to the direction of the CMB dipole, namely (, ) = (264.• 0.21, 48.• 253), and leave the amplitude as a free parameter.The expectation of the number density is identical to equation ( 16) except with  fixed to  CMB , and so the parameters pertaining to each model are  Pois.= { N, D} and  P×P = {D}.

𝑀 5 : Dipole fixed by the kinematic dipole velocity
Another test involves fixing the magnitude of the dipole to the CMB value while allowing its direction to vary.The dipole magnitude is fixed to the mean values of D calculated using the method described in 4.2.2.More explicitly, D = 0.0080 for Quaia low and D = 0.0068 for Quaia high.Here, equation ( 16) applies with D fixed, and so the parameters for each model are  Pois.= { N, , } and  P×P = {, }.

𝑀 6 : CMB motion
Finally, we may totally align this model's dipole in both direction and magnitude with the CMB kinematic dipole and compute the marginal likelihood.This gives a metric by which the veracity of CMB-aligned motion in the Quaia sample compares to an inferred direction after parameter optimisation.The parameters in this case are  Pois.= { N } and  P×P = ∅.

Choice of priors
As a final step, we determine the prior functions (|) for equation (7), which is needed for each model's parameters.The choice of prior represents our belief about what values the parameters are likely to take before knowledge of the data.
• We adopted a broad prior for the dipole amplitude , as well as the double dipole amplitudes  1 ,  2 , choosing them from a uniform distribution ,  1 ,  2 ∼ U [0, 1].This choice was motivated by the significant uncertainty in the magnitude of the dipole across a diverse spectrum of independent tests (see e.g.Abdalla et al. 2022;Kumar Aluri et al. 2023).In contrast, we sampled the quadrupole amplitude according to D ∼ U [−1, 0], since in testing we found that the positive amplitude solution restricted the solution to a poorer fit at the north galactic pole.
• For the direction parameters, in internal calculations we work in equatorial coordinates but convert the posterior afterwards to galactic coordinates for presentation.We denote these equatorial coordinates (, ) for right ascension  in radians and co-declination  in radians.For the dipole direction, we uniformly sampled over the surface of a unit sphere, and so  ∼ U [0, 2] and  ∼ cos −1 (1 − 2) where  ∼ U [0, 1].For the double dipole directions, we took cross-talk between the two signals in the posterior distribution at opposite hemispheres.In a similar sense, for the quadrupole model we took  ∼ [/2, 3/2] and  ∼ cos −1 (1 − 2), since there is a degenerate solution in the other hemisphere.
• For the mean number density or monopole signal N, we took N ∼ U [0, 30] for Quaia low and N ∼ U [0, 50] for Quaia high.

RESULTS
With our models outlined, we now turn to the recovered parameters and marginal likelihoods of each model.Since we are chiefly interested in this comparative assessment, and not the actual value of the evidence itself, an easy way to represent the data is by computing the Bayes factor for model   with respect to the null hypothesis  0 : this is ln  0 = ln Z  − ln Z 0 .All models are being assessed with a common benchmark -the null hypothesis -which allows a quick identification of the strongest hypothesis.Thus, the relative level of support for one hypothesis (  ) over another hypothesis (  ) is simply computed by One point to keep in mind, however, is that since our Poissonian and point-by-point approaches of Section 4.4 use different likelihood functions, the actual value of Z for a given model is very different across the two approaches.What should not change appreciably between them, as we show below, is the Bayes factor.
Since our experiment has many variations -namely different models, masks, approaches and catalogues -we generated a considerable number of marginal likelihoods and hence Bayes factors, with the latter being tabulated in Appendix A. There, the highlighted cells draw attention to the model with the highest Bayes factor for a given galactic mask.These are too voluminous to give in their entirety here.Certain salient results, however, are referenced periodically in the following text to substantiate our findings.

Low galactic masks: |𝑏|
In this masking regime, the prevailing model is the double dipole ( 2 ).This is exemplified by the Bayes factors in in Table A1 and Table A2 for the point-by-point and Poissonian approaches respectively.Note here that the conclusions -that is, the level of support for each model -are the same across both approaches.
As an example, we placed the Bayes factors for a 30  A1 and A2, followed by the kinematic velocity model ( 5 ).Critically, this sheds light on the fact that a transition is occurring from the || < 30 • mask to the || < 40 • mask; the support for the fitted hypotheses  1 - 3 dwindles substantially, while the kinematic hypotheses  4 - 6 , which generally have less parameters, gain comparatively higher Bayes factors.This is also evinced by the 30 * mask, which incorporates both the || < 30 • and a 4 sr circular mask centred on (, ) = (0 • , 0 • ).The double dipole  2 and kinematic dipole  6 have comparable Bayes factors in this regime, and so the 30 * mask must represent an intermediate stage of this transition.

Low galactic masks: |𝑏|
Similar to Quaia low, the prevailing model is the double dipole ( 2 ).This is again exemplified by the Bayes factors in Tables A3 and A4 for the point-by-point and Poissonian approaches respectively.The level of support is the same across these approaches.
The Bayes factors for the 30 • mask with the point-by-point method are reproduced in Table 2 for reference.Curiously, there is equal support for both the double dipole ( 2 ) and the dipole ( 1 ) with ln  21 = 0, although for the Poissonian approach the Bayes factor for  2 is slightly higher than  1 (see Tables A3 and A4).This difference, however, does not change the interpretation significantly since it at best suggests a marginal level of support for  2 over  1 .Further, the model with the third-highest Bayes factor in Table 2 the kinematic velocity model  5 -yields a relative Bayes factor of ln  25 = 14.3.This suggests overwhelming support for both the double dipole and dipole models at || < 30 • mask over the other competing explanations.
For lower galactic mask angles, the double dipole model is consistently favoured over the dipole model, with the difference in Bayes factors increasing as the galactic mask angle  decreases.

High galactic mask: |𝑏| < 40 •
Unlike the Quaia low sample,  4 (kinematic direction) is the prevailing explanation with a || < 40 • galactic plane mask on Quaia high.This is consistent across both the point-by-point and Poissonian approaches, as expected.This being said, the Bayes factors (see Tables A3 and A4) for all models except  3 and  0 are comparable, with the difference between the lowest and highest Bayes factor being ≈ 1 log unit.This suggests that each model is on a similar footing in terms of explanatory power; one model does not totally dominate over the others.At best, the largest Bayes factor ln  41 ≈ 1 suggests some positive support for  4 over the double dipole ( 1 ).
Interestingly, with the 30 * mask, the dipole ( 1 ) is the favoured hypothesis, and the kinematic dipole ( 6 ) offers the next-best explanation of the data.The difference between the Bayes factor for  1 and the other models is in general higher than the differences between  4 and the other hypotheses with the 40 • mask, suggesting the dipole  1 is more dominant in this regime.

Prevailing double dipole as the effect of over-densities
We turn first to explaining the dominance of the double dipole at low galactic masks as opposed to the other hypotheses, which is observed in both Quaia low and Quaia high.This was outlined in Sections 5.1.1 and 5.2.1.In essence, what must be explained is why the double dipole -despite having a more significant penalty from the Occam factor for its complexity -can provide a comparatively better explanation of the data than the other models.
To better see how the model is fitting the data, we extracted the single best fit values (highest marginal probability) for the double dipole with a 30 • mask and computed the signal term We did this for all pixels over the sky and show the resultant map in Fig. 5. Compare this map with the smoothed versions shows in the bottom row of Fig. 1, noting the dashed lines which indicate the 30 * mask we applied.With this in mind, the high value of the Bayes factors likely arise because of the over-density at (, ) ≈ (0 • , 30 • ) and the under-densities along the galactic plane from about  = 120 • to  = 240 • ; this cannot be adequately captured by for instance the dipole  1 , but is better captured by the double dipole  2 .This applies both for Quaia low and Quaia high.Considering the coverage of the || < 30 • mask, it is probable that this angle and those below it are not able to sufficiently remove the region of over-density.
We therefore are of the view that systematic effects arising from the Quaia selection function impact our analysis.These systematic effects -whether arising from inadequate consideration of dust extinction effects, stellar contaminants, etc. -likely contribute to overdense regions near the galactic centre, which give spurious fits as far as probing the distribution of distant quasars is concerned.We can better understand the effect of this dubious over-density by examining how the direction and amplitude of the recovered dipole of model  1 changes with galactic mask angle.This is illustrated in Fig. 6.Looking at the top pane (Quaia low), as the region of over-density near the galactic centre is progressively masked out with increasing galactic latitude , the recovered dipole direction shifts towards the Signal .971 1.02 Signal CMB dipole.For a 40 • galactic plane mask (red contours), the recovered direction is consistent with the CMB direction within ≈ 0.5, although we note that at this stage the uncertainty in the fitted parameters has drastically increased because more than half of the sky has been masked.With respect to the dipole amplitudes, these are shown for each mask in the single row below the Quaia low sky projection of Fig. 6.As the over-dense region is masked out, the recovered dipole amplitude approaches the CMB dipole amplitude.Interestingly, for a 40 • galactic plane mask, the recovered dipole amplitude  × 10 3 ≈ 11 +6 −5 is consistent with the CMB dipole amplitude  ×10 3 ≈ 8 given the 2 uncertainties, which sheds light on why  6 is the preferred model in this regime.We interpret this as signifying that the spurious over-densities are filtered out with the 40 • mask, and so the prevailing model is simply a dipole consistent with the CMB dipole.
We also attempted to remove this over-dense region -while maximising the final number of sources analysed -by imposing a 4 sr circular mask centered at (, ) = (0 • , 0 • ) on top of the 30 • mask.This mask has been labelled as 30 * in Tables A1-A4.Note that this mask covers regions beyond the || < 40 • limit and hence in principle is better in removing the over-densities concentrated in the northern hemisphere along the line of  ≈ 0 • .Still, for Quaia low, the Bayes factor for the kinematic dipole ( 6 ) is about equal to that of the double dipole ( 2 ) where the 30 * mask is employed, suggesting The graticules are in galactic coordinates, and the contours give intervals of 11.8%, 39.4% and 67.5% of posterior mass, equivalent to 0.5, 1 and 1.5 for a 2D Gaussian.The different colours correspond to different galactic plane cuts applied, as indicated by the legend.Amplitudes of the recovered dipole (×10 3 ) are also tabulated beneath the plots with the same colour code as the sky projections.The listed uncertainties give an interval containing 95% of the posterior mass, equivalent to 2 for a 1D Gaussian.The key point is that as increasingly conservative masks are used, the recovered and CMB dipoles more closely align for Quaia low.Top: Quaia low.Bottom: Quaia high.they have equivalent explanatory power.This is likely because the 30 * mask is being influenced by other over-densities and under-densities, which are better filtered out by the 40 • mask.We therefore find that the 40 • mask is most apt for genuinely interpreting the cosmic dipole in Quaia low.
Quaia high cannot be viewed with the same interpretation.Looking at the bottom pane of Fig. 6, the recovered dipole for Quaia high drifts away from the CMB dipole and towards (, ) ≈ (330 • , 60 • ) with high galactic masks, which it reaches by || < 40 • .Moreover, the CMB and inferred dipole amplitudes do not agree with each other within 2, even with the 40 • mask.Looking at the Bayes factors, the favoured hypothesis with a 40 • mask is the kinematic direction ( 4 ), but only marginally; the next-best model,  1 , is only 0.2 to 0.3 log units behind.This is only a superficial, 'bare mention' of support for  4 over  1 (Kass & Raftery 1995).We therefore form the view that, while a dipole is being inferred in the sample, its parameters cannot be constrained.That is, the data is insufficiently clear on whether this dipole aligns with the CMB dipole, or has a preferred direction somewhere away from the CMB dipole.This is likely happening because, while  4 has one less parameter than  1 and hence a more favourable Occam factor, it also has less explanatory power.On net, the models balance out in terms of support.21)-( 23), using a 40 • mask.The contours give intervals of 11.8%, 39.4% and 67.5% of posterior mass, equivalent to 0.5, 1 and 1.5 for a 2D Gaussian.
We are hesitant to draw genuine conclusions from Quaia high because of the building evidence that the sample suffers more severely from contamination than Quaia low.Inspection of Fig. 6 lends support to this, but as an additional check, we tested one final model on both samples.Suppose that Quaia high is in fact contaminated, such that the sample consists of some genuine dipole component aligned with that of Quaia low (L), as well as a contaminated component.If so, then using our Poissonian methodology, the rate parameter describing Quaia high (H) for the -th pixel would be where H − L refers to the 'high minus low' sample and C refers to the contaminated dipole component.  .Evidently, the Quaia low dipole is recovered well (cf.Fig. 6), and the bulk of the posterior mass coincides with the CMB dipole.In contrast, the contamination dipole, which to reiterate has a rate parameter described by NH−L (1 + D C • pi ), coincides with the region of over-density near the galactic centre.This is instructive insofar that it lends further evidence to their being sources near the galactic centre which significantly contaminate Quaia high.
To summarise all the above, we form the view that while the overdensities are removed with the || < 40 • mask for Quaia low, they are present beyond this limit for Quaia high.Thus, we use the 40 • mask to infer the cosmic dipole in Quaia low, and more broadly the Quaia sample.This is predicated on its ability to remove most of the clustering issues arising from the selection function.For Quaia high, we can at best infer that there is a dipole, but not its parameters.There is strong reason to believe that this is because the sample is contaminated by sources not part of the quasar background.

Dipole estimation from the results
Having interpreted our results, we present our final determination of the cosmic dipole in Quaia.To reiterate the foregoing, we take the result from the 40 • mask from Quaia low and withhold any conclusions regarding Quaia high.
For Quaia low, a dipole aligned both in direction and magnitude with the CMB dipole ( 6 ) best accounts for the results.This model has positive support over the next-favoured model  5 (ln  65 = 2.6), which only fixes the dipole magnitude to that expected from the CMB.Accordingly, the inferred distribution of quasars in Quaia low is consistent with the kinematic interpretation of the CMB, and thus is consistent with the cosmological principle.

Impact of priors on results
It is worth noting that because the marginal likelihood in Bayes's theorem is an integral over all parameter space, it is generally sensitive to the choice of prior.This could have an effect on the inferences made when evaluating competing hypotheses.Thus, we also investigated whether or not our conclusions still hold with a narrower prior on the dipole amplitude.Specifically, we sampled the dipole amplitude according to  ∼ [0, 0.1] and the quadrupole amplitude according to D ∼ [−0.1, 0].These ranges of values are one order of magnitude smaller than our previous choices.
With more restrictive priors, we find that our conclusions do not change; that is, our findings are not strongly sensitive to the choice of prior.In general, the marginal likelihood for the dipole hypothesis ( 1 ), quadrupole hypothesis ( 3 ) and kinematic direction hypothesis ( 4 ) increased with respect to the other hypotheses.With a || < 40 • galactic plane mask on Quaia low, the Bayes factor for  1 moved from 2.8 → 5.2, 0.5 → 2.9 for  3 and 5.8 → 8.0 for  4 .This is with respect to the point-by-point approach, but the trend of increasing Bayes factors was similar for the Poissonian approach.While  4 is slightly more favoured than it was hitherto,  6 is still the dominant hypothesis (recall it has a Bayes factor of 10, as in Table A1).Similar changes occur for Quaia high. 4 remains the favoured hypothesis, but with only marginally more support than  1 .However, since the marginal likelihood for  4 increases while  5 and  6 are fixed between the two prior functions,  4 prevails more significantly over the other kinematic hypotheses.None the less, we reserve drawing further conclusions from Quaia high for the reasons mentioned in Section 6.1.Thus, even with more restrictive priors, the best explanation of the data is a dipole consistent with the CMB dipole.

Final thoughts and future outlook
In this work, we presented a Bayesian analysis of the distribution of quasars in the Quaia catalogue of Storey-Fisher et al. (2023).Analysing both the Quaia low and Quaia high samples, there is substantial evidence that systematic effects arising from the construction of the selection function introduce spurious over-densities and under-densities into the catalogue, contaminating it.This affects the inference of the cosmic dipole for both samples.After masking the contaminated regions, the inferred dipole is consistent with the CMB dipole in both magnitude and direction for Quaia low, but the data affords insufficient clarity to define the parameters of the dipole for Quaia high.There, the dipole drifts along  = 0 • towards high values of  as more conservative galactic plane masks are applied, which we have identified as an effect of a large over-density in the northern hemisphere.However, taking the results together, the inferred dipole in the distribution of quasars is in agreement with the CMB dipole, and hence the cosmological principle.
There are numerous avenues that can be pursued in the future to build on this result.These include: (i) Studying the redshift-based evolution of the dipole.In this work, we examined the cosmic dipole of both Quaia samples without reference to the quasar redshift distribution.A redshift-binned selection function map for Quaia high has been released, in which a cut at  = 1.47 was imposed to divide sources into those at a redshift lower and greater than this threshold (Alonso et al. 2023).Future work might generate a number of source redshift bins, regenerating the selection map for each bin using the code released with Storey-Fisher et al. (2023), and then study the evolution of the cosmic dipole as a function of redshift.This approach has been utilised by Horstmann et al. (2022) in their study of the dipole evolution of Type Ia SNe.
(ii) A joint analysis with other data-sets.A joint analysis of Quaia with other all-sky catalogues such as radio galaxies, SNe and other quasar samples can give interesting insights into the overall matter dipole of the universe.It might also shed light on how systematic effects are influencing the recovered dipole for different catalogues.Such cross-sample studies have been performed by many different research groups (see e.g.Secrest et al. 2022;Darling 2022;Wagenveld et al. 2023).
(iii) A revisit of the Quaia selection function.In this work, substantial masking of the galactic plane was employed to screen out over-densities and under-densities, which are suspected to arise not because of the Earth's peculiar motion, but because of systematic errors introduced by the selection function.Our primary concern is how the selection function appears to over-estimate source density near the galactic centre.These density fluctuations have limited our ability to determine the cosmic dipole of Quaia, chiefly by reducing the final number of sources which are analysed after masking.Hence a full appraisal of the selection function will be essential in determining the robustness of the cosmic dipole within the Quaia sample and its impact on the cosmological principle.This paper has been typeset from a T E X/L A T E X file prepared by the author.
In the original paper (Mittal et al. 2024), an error was made in the calculation of the distribution of spectral indices , as appearing in fig. 2 of that paper.The error arises from equation ( 3), which in the original paper reads While this equation is mathematically correct, the issue relates to the correct interpretation of the units of the passband flux density   when using Gaia magnitudes.These magnitudes, like the G band magnitude as appearing in the phot_g_mean_mag column of the Gaia DR3 release table, are defined such that   in Equation ( 1) is in units of photoelectrons per second ( −  −1 ; Hambly et al. 2022).
We noted this in our original paper at the beginning of Section 4.2.2, at which stage we mentioned that to convert from  −  −1 into Jy, one must first multiply the passband flux density by a correction factor   to give units of W m −2 Hz −1 .However, since this factor was not applied in equation ( 4) in the original work, equation ( 5) is incorrect as one cannot use the relation   ∝  −  if   is in units of  −  −1 .
(2) Thus, equations ( 4) and ( 5 As the Ellis & Baldwin (1984) relation is dependent on , the correction to the spectral index changes the expected dipole amplitude D. We repeated the methodology in Section 4.2.2 with the amended values of , finding a distribution of dipole amplitudes as shown in Fig. 2.This is to replace fig. 4 of the original work.We thus take the mean amplitude of D = 0.0048 for Quaia low and D = 0.0043 for Quaia high, using these as the dipole expectations.
Correcting for this error in the way described above impacts the relative Bayesian evidences of our different hypotheses.This is because models which incorporate the expected dipole amplitude will necessarily arrive at different marginal likelihoods.These are models  5 (kinematic velocity) and  6 (kinematic dipole) in our original paper.Accordingly, the last two rows of tables A1-A4 in the original paper are to be replaced with those given here in Tables 1-4.Based on these tables, our conclusions are altered slightly.The relative ordering of each model in terms of evidential power is not changed.Namely, for the Quaia low sample with a 40 • galactic plane mask, the kinematic dipole (model  6 ) is still the prevailing model.However, the difference in support between  6 and the next-favoured model,  4 (the marginal likelihood of which is given in the original work), becomes ln  64 = 1.5.This means that the kinematic dipole is somewhat less preeminent than as in the original work (ln  65 = 2.6).
Additionally, the conclusion in section 6.3 in the original work needs to be replaced.We now find that our conclusions are, to a certain extent, sensitive to the choice of prior.As mentioned there, after adjusting the prior function for D from U [0, 1.0] to U [0, 0.1], the marginal likelihood of model  4 increases to 8.0 (Quaia low;  • < |40| masked).Based on the corrected Bayes factors in Table 1, this means that model  4 (kinematic direction) is the prevailing model with ln  46 = 0.7, in which case we find D ≈ 11 +5 −5 × 10 −3 .Although, this Bayes factor is indicative of only a slight preference for  4 over  6 .Meanwhile, the conclusions for Quaia high using the adjusted prior on D are unchanged.
Though these aspects have been altered, it still remains that the Quaia sample strongly favours a dipole aligning with the direction of the CMB dipole.The question of its amplitude is not as decisive; while we cannot rule out the possibility of a value larger than expected from the CMB dipole, we have also pointed towards support for it being equal to the CMB value.This leaves significant scope for future inquiry to attempt to resolve this matter.

Figure 1 .
Figure 1.Visualisation of the salient features of the two Quaia catalogues in galactic coordinates, with Quaia low in the left column and Quaia high in the right column.Top row: Raw catalogue prior to any additional masking or processing.Note that the catalogue already has an absence of sources near the galactic plane, shown in grey, primarily due to dust absorption.Middle row: The selection function provided for both the Quaia catalogues, with the colour scale indicating the probability of source detection associated with each pixel due to factors like dust extinction.Bottom row: Both catalogues have been smoothed via a sliding average over a 1 steradian scale after scaling according to the selection function.Deviations in source density along the galactic plane can be seen, with an over-density at the galactic centre and an under-density at mid galactic longitudes.The solid and dashed lines indicate the 40 • and 30 * galactic plane masks respectively.

Figure 2 .Figure 3 .
Figure 2. Distribution of spectral indices  in the Quaia low and Quaia high samples computed from  − .Blue and orange are used to denote Quaia low and Quaia high respectively.The mean spectral indices ᾱ for each catalogue are indicated by the vertical lines.

Figure 4 .
Figure 4. Probability distribution for the dipole amplitude assuming  CMB .This follows from the analysis in Section 4.2.2 and was performed with both the Quaia low and Quaia high samples.

Figure 5 .
Figure 5. Reconstructed signal using the best fit parameters for model  2 (double dipole) tested on a 30 • mask.The functional form of this signal is as explained in Section 4.5.3.By visual inspection, the nature of this signal is similar to the pattern of over-densities and under-densities shown in the bottom row of Fig. 1, since it is this pattern that the model is trying to fit.Top: Fit with Quaia low ( < 20.0).Bottom: Fit with Quaia high ( < 20.5).

Figure 6 .
Figure6.Projection of the posterior distribution for recovered dipole direction (using the Poissonian likelihood) onto a Mollweide projection.The graticules are in galactic coordinates, and the contours give intervals of 11.8%, 39.4% and 67.5% of posterior mass, equivalent to 0.5, 1 and 1.5 for a 2D Gaussian.The different colours correspond to different galactic plane cuts applied, as indicated by the legend.Amplitudes of the recovered dipole (×10 3 ) are also tabulated beneath the plots with the same colour code as the sky projections.The listed uncertainties give an interval containing 95% of the posterior mass, equivalent to 2 for a 1D Gaussian.The key point is that as increasingly conservative masks are used, the recovered and CMB dipoles more closely align for Quaia low.Top: Quaia low.Bottom: Quaia high.

Figure 7 .
Figure 7. Projection of the 2D posterior distribution for the Quaia low dipole and the contaminated dipole, as defined in equations (21)-(23), using a 40 • mask.The contours give intervals of 11.8%, 39.4% and 67.5% of posterior mass, equivalent to 0.5, 1 and 1.5 for a 2D Gaussian.

Figure 1 .
Figure 1.Distribution of spectral indices  in the Quaia low and Quaia high samples computed from  − .Blue and orange are used to denote Quaia low and Quaia high respectively.The mean spectral indices ᾱ for each catalogue are indicated by the vertical lines.

Figure 2 .
Figure 2. Probability distribution for the dipole amplitude assuming  CMB .This follows from the analysis in Section 4.2.2 and was performed with both the Quaia low and Quaia high samples.

Table 1 .
Table of Bayes factors by model for a 30 • mask with the Quaia low sample and using the point-by-point approach.The highlighted cell represents the model with the highest Bayes factor, indicating it has the strongest level of support.

Table 2 .
• mask with Quaia low and the point-by-point method in Table1.The level of support of the double dipole ( 2 ) over the dipole ( 1 ) is ln  21 = Table of Bayes factors by model for a 30 • mask with the Quaia high sample and using the point-by-point approach.The highlighted cell represents the model with the highest Bayes factor, indicating it has the strongest level of support.4.5, which represents strong support.The model with the secondhighest Bayes factor in Table 1 -the kinematic dipole  6 -yields a relative Bayes factor of ln  26 = 3.4.For the || < 30 • mask, this is the closest a competing model comes to out-competing the double dipole.As we move towards less conservative galactic plane masks, this difference in general increases.For || < 10 • , 20 • , the dipole model ( 1 ) has the second-highest Bayes factor as seen in Tables The 'high minus low' sample is simply a catalogue we constructed containing the sources in Quaia high that are not in Quaia low.In this model, we fit both catalogues simultaneously: we use the rate parameter (  ) L to fit a dipole to Quaia low, which is our genuine component, and use the rate parameter (  ) H to fit the Quaia high sample.Accordingly, the total likelihood is the sum of the two likelihoods for both catalogues, i.e. ln L tot. = In this case, the set of parameters  L = { NL ,   ,   ,   } and  H = { L , NH−L ,  C ,  C ,  C }, since the rate parameter for Quaia high now depends on the parameters of the dipole for Quaia low and the parameters for the contaminated 'high minus low' component.Testing this on a 40 • mask yielded the posterior distributions seen in Fig 7

Table A1 .
TableofBayes Factors for different hypotheses and galactic masks using the Quaia low catalogue with the point-by-point analysis.Here, 30 * represents the combination of a 30 • mask and a 4 sr circular mask centered at the ( • ,  • ) = (0, 0).The highlighted cell represents the model with the highest Bayes factor, indicating it has the strongest level of support.

Table A2 .
As for Table A1 but with the Poisson statistics.

Table A3 .
As for Table A1 but with Quaia high.

Table A4 .
As for Table A1 but with Quaia high and the Poisson statistics.

Table 1 .
Correction to: The cosmic dipole in Quaia 3 Table of Bayes Factors for different hypotheses and galactic masks using the Quaia low catalogue with the point-by-point analysis.Here, 30 * represents the combination of a 30 • mask and a 4 sr circular mask centered at the ( • ,  • ) = (0, 0).The highlighted cell represents the model with the highest Bayes factor, indicating it has the strongest level of support.

Table 2 .
As for Table 1 but with the Poisson statistics.

Table 3 .
As for Table 1 but with Quaia high.

Table 4 .
As for Table 1 but with Quaia high and the Poisson statistics.