Analysis of Unified Galaxy Power Spectrum Multipole Measurements

We present a series of full-shape analyses of galaxy power spectrum multipole measurements from the 6dFGS, BOSS, and eBOSS galaxy surveys. We use an emulated effective field theory of large-scale structure (EFTofLSS) model to conduct these analyses. We exploit the accelerated prediction speed of the neural-network-based emulator to explore various analysis setups for our cosmological inference pipeline. Via a set of mock full-shape analyses of synthetic power spectrum multipoles, designed to approximate measurements from the surveys above, we demonstrate that the use of alternative priors on nuisance parameters and restricted model complexity reduces many of the biases previously observed in marginalised cosmological constraints coming from EFTofLSS analyses. The alternative priors take the form of a Jeffreys prior, a non-informative prior that can mitigate biases induced by marginalising over poorly constrained nuisance parameters. When performing a joint analysis of all synthetic multipoles, we see an improvement in the level of agreement between the marginalised $\ln{\left(10^{10}A_s\right)}$ constraints and the truth, from $\sim2.0\sigma$ to $\sim0.42\sigma$. Using our pipeline to analyse the measured multipoles, we find an improvement in the level of agreement with cosmic microwave background (CMB) results, from $\sim2.4\sigma$ to $\sim0.5\sigma$. Therefore, we conclude that the spectroscopic galaxy survey datasets listed above are consistent with constraints obtained from the CMB.


INTRODUCTION
Conducting full-shape analyses of galaxy clustering statistics (Satpathy et al. 2017; Kobayashi et al. 2021; Chen et al. 2022; Lange et al. 2023), such as the power spectrum, is becoming a standard approach to complement analyses that focus on specific features like the baryon acoustic oscillations (BAO). To run one of these full-shape analyses, we require a theoretical model that allows us to make a prediction for the clustering statistic of interest for a given set of cosmological parameters. There are two possible routes here: (1) use a simulation-based model, or (2) use an analytical model. A simulation-based model will likely be more accurate on small, nonlinear scales. Comparisons of dark-matter-only N-body simulation codes have shown agreement in predictions of the dark matter power spectrum for scales $k \lesssim 1\,h\,\mathrm{Mpc}^{-1}$ (Schneider et al. 2016; Grove et al. 2022). However, developing a simulation-based model requires many simulations with different sets of cosmological parameters sampling from the parameter space of interest. These suites of simulations (e.g. Heitmann et al. 2010; Maksimova et al. 2021) are hugely expensive to produce, and this cost can prohibit the use of such models. An analytic model may be less accurate on nonlinear scales (Foreman et al. 2016; Alkhanishvili et al. 2022), but using such a model will incur a significantly lower computational cost.
★ E-mail: jamie.donald-mccann@port.ac.uk
One such analytical model that is gaining in popularity when conducting full-shape analyses is the effective field theory of large-scale structure (EFTofLSS; Baumann et al. 2012; Carrasco et al. 2012; Senatore 2015; de la Bella et al. 2017; Philcox et al. 2020; Ivanov 2022; Mergulhão et al. 2023; Moretti et al. 2023). This perturbation-theory-based model maps predictions for the dark matter clustering to that of galaxies via a series of nuisance parameters that are marginalised over when putting constraints on the cosmological parameters. Two popular examples of EFTofLSS code implementations are PyBird (D'Amico et al. 2021) and CLASS-PT (Chudaykin et al. 2020). Predictions for the galaxy power spectrum multipoles can be made with PyBird in $\mathcal{O}(1\,\mathrm{s})$.$^{1}$ This is significantly faster than a numerical simulation, but running an MCMC with PyBird still requires a non-negligible amount of computational resources. This cost can limit the exploration of the analysis setup when using this model to carry out parameter inference.
The idea of emulation to reduce computational cost is being used more and more frequently for cosmological inference problems and is now used to accelerate inference pipelines that are based on analytic theory models (Albers et al. 2019; Aricò et al. 2022; DeRose et al. 2022; Mancini et al. 2022; Günther et al. 2022; Eggemeier et al. 2022; Günther 2023; Nygaard et al. 2023) as well as those with simulation-based models (Heitmann et al. 2006; Agarwal et al. 2014; Nishimichi et al. 2019; Euclid Collaboration et al. 2021; Storey-Fisher et al. 2022). These emulators consist of nonlinear interpolators that are fitted to (or trained with) a set of input and output pairs $\{x, f(x)\}$, with $f(x)$ being the function of interest. The nonlinear interpolation scheme generally takes the form of a machine learning algorithm like a Gaussian process or neural network (NN). In Donald-McCann et al. (2022b), the NN-based EFTEMU was added to the matryoshka suite of emulators (Donald-McCann et al. 2022a). The EFTEMU was developed to reduce the cost of EFTofLSS model evaluations and increased the prediction speed of the galaxy power spectrum multipoles by over three orders of magnitude. This increase in prediction speed opens up the opportunity to test more analysis setup choices when using the EFTofLSS model.
$^{1}$ The prediction speed was reported as 1.01 s ± 13.1 ms, based on 100 predictions made on a laptop with an Intel i5 2.50 GHz dual-core processor with four threads and 8 GB of RAM. Table 1 of Chudaykin et al. (2020) reports prediction speeds from CLASS-PT. In default mode, the performance appears similar to PyBird.
In this paper, we exploit the increased prediction speed from the EFTEMU to perform full-shape analyses of galaxy power spectrum multipole measurements from several completed galaxy surveys. We also examine how the analysis setup impacts the inferred cosmology. Through a series of mock full-shape analyses, we validate our cosmological inference pipeline. We then demonstrate that using alternative priors and more restrictive sets of nuisance parameters can alleviate some of the biases in the inferred cosmological parameters that can be seen when conducting full-shape analyses with the EFTofLSS. We find that using these alternative priors can alleviate some of the slight tensions in the marginalised cosmological parameter constraints when comparing with results from cosmic microwave background (CMB) analyses. The paper is organised as follows. In Section 2, we introduce the galaxy surveys considered for this work, along with the multipole measurements used. In Section 3, we further introduce the EFTofLSS and discuss any changes made to the EFTEMU for this work. In Section 4, we present a series of mock analyses designed to test our inference pipeline. In Section 5, we present results from the analysis of the multipole measurements introduced in Section 2. We conclude in Section 6.

DATA
Several large-scale spectroscopic redshift surveys have now run to completion, combining to provide detailed maps of the universe covering a wide redshift range. For this work, we focus on three surveys that cover distinct redshift ranges: the 6dF galaxy survey (6dFGS; Jones et al. 2004, 2009), the baryon oscillation spectroscopic survey (BOSS; Dawson et al. 2013; Alam et al. 2017), and the extended baryon oscillation spectroscopic survey (eBOSS; Dawson et al. 2016; eBOSS Collaboration et al. 2021). The redshift catalogues from each of these surveys are now publicly available, such that galaxy clustering measurements can be made for each of them. Beutler & McDonald (2021) present measurements of the power spectrum multipoles from each of these surveys, along with wide-angle and window function matrices. These matrices allow wide-angle effects and the survey window function to be included in theory predictions of the galaxy power spectrum multipoles via two simple matrix multiplications. All measurements have 40 $k$-bins over the range $0 < k < 0.4\,h\,\mathrm{Mpc}^{-1}$. The BOSS and eBOSS samples are split into subsamples for the northern and southern galactic cap (NGC and SGC) and, in the case of BOSS, two redshift bins (BOSSz1 and BOSSz3). This results in seven sets of multipoles with four effective redshifts $z_\mathrm{eff} = [0.096, 0.38, 0.61, 1.52]$. We refer the reader to Table 1 in Beutler & McDonald (2021) for more details about each sample.
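The two matrix multiplications mentioned above can be sketched as follows. This is a minimal illustration only: the function name, the matrix shapes, and the random toy matrices are assumptions made for the example, and do not reflect the actual file format or binning of the matrices published by Beutler & McDonald (2021).

```python
import numpy as np


def apply_survey_matrices(theory_multipoles, wide_angle_matrix, window_matrix):
    """Return window-convolved multipoles from a stacked theory vector."""
    theory = np.asarray(theory_multipoles, dtype=float)
    # First multiplication: add wide-angle corrections to the theory vector.
    with_wide_angle = wide_angle_matrix @ theory
    # Second multiplication: convolve with the survey window function.
    return window_matrix @ with_wide_angle


# Toy shapes only (hypothetical): 10 theory entries in, 3 observed bins out.
rng = np.random.default_rng(0)
toy_theory = rng.uniform(1.0, 2.0, size=10)
toy_wide_angle = rng.uniform(size=(15, 10))
toy_window = rng.uniform(size=(3, 15))
toy_convolved = apply_survey_matrices(toy_theory, toy_wide_angle, toy_window)
```

Because both steps are linear, the two matrices could equally be multiplied together once and cached, so that each likelihood evaluation costs a single matrix-vector product.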

Mocks
When exploring analysis setups, we need to examine if a particular setup leads to more or less bias in the inferred cosmological parameters than another. Mock multipoles were published alongside the measurements in Beutler & McDonald (2021). These mocks are those used to calculate covariance matrices and contain survey geometry and systematics to match their associated measurements. Each of the galaxy surveys considered for this work has its own set of mocks. The number of mock realisations and specifics of the simulations used to produce them are covered in Section 5 of Beutler & McDonald (2021); for the 6dFGS mocks see Koda et al. (2016) and Carter et al. (2018), for BOSS see Klypin et al. (2016) and Kitaura et al. (2016), and for eBOSS see Chuang et al. (2015) and Zhao et al. (2021). It is helpful to have sets of mock multipoles for which we know the true cosmology as well as the "true" values for the nuisance parameters of the EFTofLSS model (bias parameters and counterterms, see Section 3). To that end, we produce a set of mock multipoles using PyBird with the cosmology set to the TT,TE,EE+lowE+lensing+BAO ΛCDM best-fit values from Table 2 in Planck Collaboration et al. (2020, henceforth Planck 2018). The nuisance parameters are fit to the mean of the mock multipole measurements published in Beutler & McDonald (2021) for each sample. We refer to the resulting multipoles as the "PyBird mocks".
The nuisance parameters for the PyBird mocks are determined by finding the maximum a posteriori (MAP) estimate for four bias parameters and six counterterms. This is done by finding the minimum of the negative log-likelihood (see Section 4.1 for the likelihood definition) with a wide uniform prior on all bias parameters and counterterms. Except for the linear bias, this prior ranges from $-50$ to $50$. The linear bias prior is truncated at zero to allow for positive values only. The nuisance parameters are fit to the mean of the mock multipoles on scales $0 < k < 0.2\,h\,\mathrm{Mpc}^{-1}$, and the covariance is rescaled by a factor of 10.$^{2}$ Figure 1 shows the PyBird mock multipoles alongside the multipole measurements and mocks from Beutler & McDonald (2021) for the $z = 0.61$ NGC sample. The bottom panel shows the residuals normalised by the error from the rescaled covariance. We can see that the agreement of the PyBird mock multipoles and the mocks of Beutler & McDonald (2021) is within $1\sigma$. It should be noted that the agreement is better still when considering the unscaled covariance. Plots showing the PyBird mocks for the other samples all exhibit similar results.

MODEL
As alluded to in Section 1, there are two general routes to modelling the galaxy power spectrum. The first is to use numerical simulations, which provide accurate small-scale predictions but come at a high computational cost. The second is to develop an analytic model, which produces computationally efficient predictions (in comparison to numerical simulations) but is less accurate on small scales.
Probing the small, nonlinear scales of the galaxy power spectrum can improve the constraints on the cosmological parameters. For a given survey, we will have a larger number of galaxy-galaxy pairs with small separations than large separations; thus, the statistical error on small scales will be lower than on large scales. The EFTofLSS was developed to extend the scales of validity of analytic predictions, allowing us to probe smaller scales and exploit the reduced statistical error.

EFTofLSS
Standard perturbation theory (SPT) models the dark matter overdensity field as a perfect fluid. Although successful on large scales, where the density perturbations are small, its description starts to break down when entering nonlinear scales (1-loop SPT breaks down at $k \sim 0.1\,h\,\mathrm{Mpc}^{-1}$ for redshift $z = 0$; Carlson et al. 2009). In recent years considerable effort has been put into an effective description which extends the range of SPT into a mildly nonlinear regime.
EFTofLSS introduces a cut-off scale which acts as an effective low-pass filter, leading to the fluid equations now being solved in terms of long-wavelength overdensity and velocity fields. Furthermore, an effective stress-energy tensor is introduced, which captures the effects of the small-scale physics on the larger scales. At a given order, the effect of these small scales and their backreaction onto the long-wavelength field can be captured by a finite number of so-called "counterterms". These counterterms are free parameters that must be fitted to data or calibrated with simulations. Including a nonlinear bias scheme, mapping the underlying dark matter field as described above to the observed galaxy densities, the 2D redshift-space galaxy power spectrum in terms of scale $k$ and cosine of the angle to the line of sight $\mu$ can be written as in Equation 1. In that expression, $Z_n$ are the redshift-space galaxy density kernels (for their exact form, see D'Amico et al. 2020), $\bar{n}$ is the mean galaxy density$^{3}$, and $k_M^{-1}$ is a normalisation scale$^{4}$.
Overall the 1-loop EFTofLSS introduces ten nuisance parameters. Four parameters ($b_1$ to $b_4$) are introduced in the expansion of the galaxy density and velocity field in terms of the underlying dark matter field. These parameters are found in the galaxy kernels $Z_n$. It has been noted that $b_2$ and $b_4$ are highly degenerate (D'Amico et al. 2020). It is common to reparameterise such that $c_2 = (b_2 + b_4)/\sqrt{2}$ and $c_4 = (b_2 - b_4)/\sqrt{2}$ (2). There are three stochastic parameters ($c_{\epsilon,1}$, $c_\mathrm{mono.}$, $c_\mathrm{quad.}$) that are introduced to capture the difference between the actual observed galaxy field and its expected value. Finally, there are three counterterms that encapsulate the impact of UV physics: the effective sound speed of the dark matter field $c_{ct}$, and $c_{r,1}$ and $c_{r,2}$, which control the impact of small scales on redshift-space distortions.

Alcock-Paczyński effect
A reference cosmology is required to measure the galaxy power spectrum from redshift catalogues provided by surveys like those introduced in Section 2. Any differences between the true underlying cosmology and the reference cosmology lead to distortions of distances parallel and perpendicular to the line of sight. This is the so-called Alcock-Paczyński (AP) effect (Alcock & Paczyński 1979).
The distortions parallel and perpendicular to the line of sight are given by the distortion parameters $q_\parallel$ and $q_\perp$, respectively. These parameters are defined in terms of $H(z)$ and $D_A(z)$, the Hubble parameter and angular-diameter distance as a function of redshift, evaluated at both the true and the reference cosmology; the superscript "ref." indicates quantities calculated at the reference cosmology. The AP distortion is applied to the scales $k$ and angles $\mu$ (Equation 5), and depends on the ratio $F = q_\parallel / q_\perp$. The 2D power spectrum can then be decomposed into multipoles $P_\ell(k)$ by projecting onto $\mathcal{L}_\ell$, the $\ell$-th order Legendre polynomial.
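The AP rescaling and the multipole projection described above can be sketched as follows. This is a minimal sketch assuming the standard textbook form of the AP mapping, $k' = (k/q_\perp)\,[1 + \mu^2(F^{-2} - 1)]^{1/2}$ and $\mu' = (\mu/F)\,[1 + \mu^2(F^{-2} - 1)]^{-1/2}$, and a Gauss-Legendre projection onto Legendre polynomials. The function names are hypothetical, and the overall volume prefactor that some conventions include is deliberately omitted, so the normalisation may differ from the one used in this work.

```python
import numpy as np
from numpy.polynomial.legendre import leggauss, legval


def ap_rescale(k, mu, q_par, q_perp):
    """Map (k, mu) to the AP-distorted (k', mu'); standard form, assumed here."""
    f_ratio = q_par / q_perp
    factor = np.sqrt(1.0 + mu**2 * (1.0 / f_ratio**2 - 1.0))
    return k / q_perp * factor, mu / f_ratio / factor


def legendre(ell, mu):
    """Evaluate the ell-th order Legendre polynomial at mu."""
    coeffs = np.zeros(ell + 1)
    coeffs[ell] = 1.0
    return legval(mu, coeffs)


def multipole(p2d, k, ell, q_par=1.0, q_perp=1.0, n_mu=32):
    """Project a callable p2d(k, mu) onto the ell-th multipole.

    No AP volume prefactor is applied (an assumption of this sketch).
    """
    nodes, weights = leggauss(n_mu)  # quadrature nodes on [-1, 1]
    k_grid = np.atleast_1d(np.asarray(k, dtype=float))[:, None]
    k_ap, mu_ap = ap_rescale(k_grid, nodes[None, :], q_par, q_perp)
    integrand = p2d(k_ap, mu_ap) * legendre(ell, nodes)[None, :]
    return (2 * ell + 1) / 2.0 * (integrand @ weights)


def toy_p2d(k, mu):
    """Toy anisotropic spectrum, flat in k: P(k, mu) = 2 + 3 mu^2."""
    return (2.0 + 3.0 * mu**2) * np.ones_like(k)


toy_k = np.array([0.05, 0.10, 0.15])
toy_p0 = multipole(toy_p2d, toy_k, ell=0)
toy_p2 = multipole(toy_p2d, toy_k, ell=2)
```

With $q_\parallel = q_\perp = 1$ the mapping is the identity, which gives a convenient sanity check before any real distortion parameters are used.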
The EFTEMU (and PyBird) make predictions for the power spectrum multipoles rather than the 2D power spectrum. To include the AP effect, via Equation 5, we need to reconstruct the 2D power spectrum from the multipoles. We do this via $P(k, \mu) = \sum_\ell P_\ell(k)\,\mathcal{L}_\ell(\mu)$. The EFTEMU (as trained for this work) makes predictions for the first two even multipoles. Reconstructing the 2D power spectrum from only the first two even multipoles will result in systematic errors when including the AP effect via Equation 5. These errors are expected to be small compared to the error associated with the multipole measurements discussed in Section 2. It should be noted that the PyBird mocks introduced in Section 2.1 were constructed including the hexadecapole $P_4(k)$. As such, the mock analyses of Section 4 will test if these systematic errors from the 2D power spectrum reconstruction impact the inferred cosmology.
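The reconstruction of the 2D power spectrum from a truncated set of multipoles can be illustrated with a short sketch. The function name and the toy multipole values are assumptions for illustration; the sum over Legendre polynomials is the standard multipole expansion, written out explicitly for $\ell = 0, 2, 4$.

```python
import numpy as np

# Explicit even Legendre polynomials up to the hexadecapole.
LEGENDRE = {
    0: lambda mu: np.ones_like(mu),
    2: lambda mu: 0.5 * (3.0 * mu**2 - 1.0),
    4: lambda mu: 0.125 * (35.0 * mu**4 - 30.0 * mu**2 + 3.0),
}


def reconstruct_p2d(multipoles, mu):
    """Sum P_ell(k) * L_ell(mu) over the supplied multipoles.

    multipoles: dict mapping ell -> array of P_ell(k), shape (n_k,)
    mu:         array of line-of-sight cosines, shape (n_mu,)
    Returns an array of shape (n_k, n_mu).
    """
    mu = np.asarray(mu, dtype=float)
    total = 0.0
    for ell, p_ell in multipoles.items():
        total = total + np.asarray(p_ell, dtype=float)[:, None] * LEGENDRE[ell](mu)[None, :]
    return total


# Toy multipoles (hypothetical values) on three k-bins.
toy_mu = np.linspace(-1.0, 1.0, 5)
toy_multipoles = {0: np.full(3, 3.0), 2: np.full(3, 2.0), 4: np.full(3, 0.4)}
full_p2d = reconstruct_p2d(toy_multipoles, toy_mu)
# Dropping the hexadecapole mimics a reconstruction from two multipoles only.
truncated_p2d = reconstruct_p2d({0: toy_multipoles[0], 2: toy_multipoles[2]}, toy_mu)
truncation_error = full_p2d - truncated_p2d
```

In this toy case the truncation error is exactly the omitted hexadecapole term, $P_4(k)\,\mathcal{L}_4(\mu)$, which is the kind of systematic the mock analyses are designed to probe.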

Emulator
The  (2022b). The increased width of the cosmological prior, particularly for $\ln(10^{10}A_s)$, increases the dynamic range of the kernels $P_{n,\ell}$. The original preprocessing procedure involved rescaling all $P_{n,\ell}$ such that at every $k$-value their magnitude was in the range [0, 1]. We modify this procedure by first taking the log of the $P_{n,\ell}$ before rescaling into the range [0, 1].
Figure 2 shows the kernels for the PyBird mocks at $z = 0.61$ for the first three even multipoles on scales $0.001 \leq k \leq 0.3\,h\,\mathrm{Mpc}^{-1}$. There are 21 kernels for each multipole, and these 21 kernels can be split into three groups. The first group ($P^{11}_{n,\ell}$) contains the linear terms, the second group ($P^\mathrm{loop}_{n,\ell}$) contains the loop terms, and the third group ($P^\mathrm{ct.}_{n,\ell}$) contains the counterterms. These three groups also represent the grouping used for the EFTEMU; each component of the EFTEMU emulates a different group (see Section 3 of Donald-McCann et al. 2022b). It can be seen from Figure 2 that some of the $P^\mathrm{loop}_{n,\ell}$ and $P^\mathrm{ct.}_{n,\ell}$ kernels are exclusively negative or have a zero crossing. To allow us to take the log of these kernels, we include either a simple sign change or the addition of a constant in the kernel preprocessing. Taking the log results in a reduced dynamic range in the training data and leads to higher prediction accuracy. We also significantly increase the number of samples generated for training and testing from 10,000 to 50,000. Only 40,000 are used for training; the remaining 10,000 are used for testing.
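The modified preprocessing described above (sign change or constant offset, then log, then rescaling to [0, 1] at every $k$-value) can be sketched as below. The function names, argument names, and toy values are assumptions for illustration and are not taken from the EFTEMU code.

```python
import numpy as np


def log_rescale(kernels, offset=0.0, flip_sign=False):
    """Log-transform kernels, then min-max rescale each k-bin to [0, 1].

    kernels: array of shape (n_samples, n_k) for a single kernel.
    """
    values = -kernels if flip_sign else kernels
    values = values + offset
    if np.any(values <= 0.0):
        raise ValueError("kernel must be strictly positive before taking the log")
    logged = np.log(values)
    # Min and max are taken over the training samples, separately per k-bin.
    low, high = logged.min(axis=0), logged.max(axis=0)
    return (logged - low) / (high - low), (low, high)


def undo_log_rescale(scaled, low, high, offset=0.0, flip_sign=False):
    """Invert log_rescale to recover the original kernel values."""
    values = np.exp(scaled * (high - low) + low) - offset
    return -values if flip_sign else values


# Toy, exclusively negative kernel spanning three decades (hypothetical values).
toy_kernel = -np.array([[1.0, 10.0], [100.0, 1000.0], [10.0, 100.0]])
toy_scaled, (toy_low, toy_high) = log_rescale(toy_kernel, flip_sign=True)
toy_recovered = undo_log_rescale(toy_scaled, toy_low, toy_high, flip_sign=True)
```

In the toy example the three samples end up evenly spaced in the rescaled range, whereas a linear rescaling would push two of them close to zero; this is the reduction in dynamic range that motivates the log.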
Figure 3 shows the prediction error on the monopole of the power spectrum when producing predictions with the re-trained EFTEMU. Each row shows the prediction error at a different redshift, and each column shows the prediction error computed with different sets of nuisance parameters. The orange shaded regions show the 68% and 95% credible intervals (CIs) of the prediction error as a function of $k$. The solid coloured lines show the inverse signal-to-noise ratio (SNR) for the monopole measurements considered for this work at their respective redshifts. The shaded regions have been calculated from predictions for 10,000 unseen cosmologies. For the left column, the 10,000 cosmologies have been combined with sets of nuisance parameters that produce "reasonable" predictions for the monopole. We take random draws from a very wide uniform prior$^{5}$ on the nuisance parameters and calculate the multipoles for each set of cosmological and nuisance parameters. We define "reasonable" predictions as those for which the monopole is strictly positive and which can be said to remain perturbative$^{6}$. Any sets of parameters that do not meet these criteria are rejected, and the nuisance parameters are resampled from the prior. This is repeated until we have nuisance parameters for all 10,000 cosmologies. For the right column, samples from the posterior resulting from the full-shape analysis of the 6dFGS-like PyBird mock (see Section 4) are used to inform the nuisance parameters for the unseen cosmologies. For each unseen test cosmology, the posterior sample with the closest cosmology$^{7}$ is selected, and its nuisance parameters are associated with that test cosmology. The two columns of Figure 3 show two different aspects of the prediction accuracy: the left column represents the prediction accuracy across the entire theoretically viable parameter space, while the right column represents the prediction accuracy for power spectra that look more similar to something that has been previously observed. We can see from the right column that for all redshifts considered and for all $k < 0.25\,h\,\mathrm{Mpc}^{-1}$, the prediction error from the emulator is less than the error on the data at the 68% level at each respective redshift. However, from the left column, we can see that for $z = 0.38, 0.61$, when considering the entire theoretically viable prior space, the prediction error can be greater than the error on the data on small scales ($k \gtrsim 0.17\,h\,\mathrm{Mpc}^{-1}$). In practice, we find that the level of prediction accuracy from the re-trained EFTEMU does not induce any significant bias in the cosmological parameters when performing inference, as shown in Section 4.
$^{5}$ $0 < b_1 < 10$, $-10 < \{c_2, c_4\} < 10$, $-500 < \{b_3, c_{ct}, c_{r,1}, c_{r,2}\} < 500$.
$^{6}$ See Appendix A for our perturbative condition.
$^{7}$ The nearest neighbour in the 4D cosmological parameter space, with the Euclidean distance as the distance metric.
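The nearest-neighbour association of nuisance parameters to test cosmologies can be sketched as follows. The array names and toy values are assumptions for illustration; the distance is the plain Euclidean distance in the 4D cosmological parameter space, and whether the parameters are normalised before computing distances is not specified here, so no normalisation is applied.

```python
import numpy as np


def assign_nearest_nuisance(test_cosmologies, posterior_cosmologies, posterior_nuisance):
    """Give each test cosmology the nuisance parameters of its nearest posterior sample.

    test_cosmologies:      shape (n_test, 4)
    posterior_cosmologies: shape (n_post, 4)
    posterior_nuisance:    shape (n_post, n_nuisance)
    """
    diff = test_cosmologies[:, None, :] - posterior_cosmologies[None, :, :]
    distance = np.sqrt((diff**2).sum(axis=-1))  # Euclidean, shape (n_test, n_post)
    nearest = distance.argmin(axis=1)
    return posterior_nuisance[nearest], nearest


# Toy values (hypothetical), ordered as (omega_c, omega_b, h, ln 10^10 A_s).
toy_test = np.array([[0.120, 0.022, 0.68, 3.00]])
toy_posterior = np.array([[0.110, 0.022, 0.70, 3.10],
                          [0.120, 0.022, 0.68, 3.01]])
toy_nuisance = np.array([[1.8, 0.5], [2.1, -0.3]])
toy_assigned, toy_index = assign_nearest_nuisance(toy_test, toy_posterior, toy_nuisance)
```

The full distance matrix is convenient for a few thousand points; for much larger sets a tree-based neighbour search would scale better.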

MOCK ANALYSES
In this section, we present the results from a series of analyses of the PyBird mocks (described in Section 2.1). These mock analyses aim to verify that our cosmological inference pipeline does not induce biases in the cosmological parameter constraints. In addition, we explore how various analysis setups impact the results. In all cases, to put constraints on cosmological parameters, we sample from the posterior distribution via Preconditioned Monte Carlo (Karamanis et al. 2022b), as implemented in pocoMC$^{8}$ (Karamanis et al. 2022a). Preconditioned Monte Carlo utilises Normalising Flows (Papamakarios et al. 2021) and Sequential Monte Carlo (Del Moral et al. 2006) to efficiently sample from posterior distributions even when they have a very complex shape. We use a Gaussian likelihood of the form $\ln \mathcal{L}(P \mid \theta, \phi) = -\frac{1}{2}\,(P - \tilde{P})^{T}\, C^{-1}\, (P - \tilde{P})$, with $P$ being a concatenation of the multipole measurements considered, $P = [P_0, P_2]$, $\tilde{P}$ being the multipole predictions from the model, $\tilde{P} = [\tilde{P}_0, \tilde{P}_2]$, for a given set of cosmological parameters $\theta$ and nuisance parameters $\phi$, and $C$ being the covariance matrix.
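A Gaussian likelihood of the kind described above can be sketched in a few lines. The function name and toy numbers are assumptions for illustration; the normalisation constant is dropped, which is harmless when the covariance is held fixed.

```python
import numpy as np


def gaussian_loglike(data, model, cov):
    """Gaussian log-likelihood, up to a constant, for a fixed covariance."""
    residual = np.asarray(data, dtype=float) - np.asarray(model, dtype=float)
    # Solve C x = r rather than forming the explicit inverse of C.
    return -0.5 * residual @ np.linalg.solve(cov, residual)


# Toy data vector standing in for a concatenated [P0, P2] (hypothetical numbers).
toy_data = np.array([1.0, 2.0])
toy_model = np.array([0.0, 0.0])
toy_cov = np.diag([1.0, 4.0])
toy_loglike = gaussian_loglike(toy_data, toy_model, toy_cov)
```

For repeated evaluations with the same covariance, precomputing a Cholesky factor or the inverse covariance once would avoid solving the linear system on every call.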
Many of the nuisance parameters of the EFTofLSS model appear linearly as multiplicative factors for the kernels. This allows us to marginalise over these parameters analytically rather than sampling from them. This is standard practice when conducting parameter inference with the EFTofLSS (D'Amico et al. 2020, 2021; Glanville et al. 2022). Carrying out the analytic marginalisation reduces dimensionality and thus leads to a more efficient inference of the cosmological parameters. Although it is more efficient to analytically marginalise the linearly appearing parameters, the prediction speed of the EFTEMU means that fully sampling the parameter space is tractable. We refer to the likelihood with no analytic marginalisation as the "full" likelihood, and we explore the use of both the marginalised and full likelihood in the results below.
Figure 3: The prediction error is defined as the ratio of the EFTEMU prediction to the PyBird prediction for the same set of cosmological and nuisance parameters. The ratio is then normalised such that it is equal to zero for a perfect prediction. Each row represents a different redshift: 0.096, 0.38, 0.61, and 1.52 from top to bottom. For the left column, the cosmological parameters are combined with random draws of nuisance parameters from the theoretically viable prior space. For the right column, each test cosmology is combined with a set of nuisance parameters that result in 6dFGS-like predictions. The coloured solid lines show the inverse signal-to-noise ratio on the monopole for the datasets considered for this work. In panels with both blue and green lines, these represent the NGC and SGC, respectively.
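The analytic marginalisation over linearly appearing parameters can be sketched as a standard Gaussian integral. The sketch below assumes a model of the form $\tilde{P} = \tilde{P}_\mathrm{fixed} + \sum_i b_i K_i$ with independent zero-mean Gaussian priors of width $\sigma_i$ on the linear parameters $b_i$; the function and argument names are hypothetical, and data-independent constants are dropped. It is not necessarily the exact expression implemented in the pipeline used for this work.

```python
import numpy as np


def marginalised_loglike(data, model_fixed, kernels_lin, cov, prior_sigma):
    """Log-likelihood after integrating out linear parameters analytically.

    kernels_lin: shape (n_lin, n_data), one kernel per linear parameter.
    prior_sigma: shape (n_lin,), widths of zero-mean Gaussian priors.
    """
    residual = data - model_fixed
    cinv_r = np.linalg.solve(cov, residual)
    cinv_kt = np.linalg.solve(cov, kernels_lin.T)  # shape (n_data, n_lin)
    # Quadratic and linear coefficients of the exponent in the linear parameters.
    a_mat = kernels_lin @ cinv_kt + np.diag(1.0 / prior_sigma**2)
    b_vec = kernels_lin @ cinv_r
    _, logdet_a = np.linalg.slogdet(a_mat)
    return (-0.5 * residual @ cinv_r
            + 0.5 * b_vec @ np.linalg.solve(a_mat, b_vec)
            - 0.5 * logdet_a
            - np.sum(np.log(prior_sigma)))


# Toy problem with a single linear parameter (hypothetical numbers).
toy_data = np.array([1.0, 2.0])
toy_fixed = np.array([0.5, 0.5])
toy_kernels = np.array([[1.0, 2.0]])
toy_cov = np.eye(2)
toy_sigma = np.array([2.0])
toy_value = marginalised_loglike(toy_data, toy_fixed, toy_kernels, toy_cov, toy_sigma)
```

The determinant term is what distinguishes the marginalised likelihood from a profile likelihood: it depends on the kernels, and therefore on cosmology, even though the linear parameters themselves have been integrated out.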

Fiducial Results
We start by presenting results from an analysis with a fiducial setup. For this fiducial setup, we analyse the power spectrum monopole and quadrupole on scales $0.01 < k < 0.15\,h\,\mathrm{Mpc}^{-1}$. Figure 3 shows that the nearest-neighbour prediction error on these scales is considerably lower than the error associated with the mocks at all redshifts for which the EFTEMU is trained. We fix three out of the ten nuisance parameters to zero, those parameters being $c_4$, $c_{r,2}$, and $c_\mathrm{mono.}$. These parameters are commonly set to zero in analyses of the monopole and quadrupole with PyBird (D'Amico et al. 2020; Simon et al. 2022a). The priors on $\omega_c$, $h$, and $\ln(10^{10}A_s)$ are those that define the emulator training space (given in Table 1). For $\omega_b$, we use a truncated normal distribution as the prior, with a mean of 0.02235 and a standard deviation of 0.000499. The hard bounds of this prior are given by the emulator training space, as with the other cosmological parameters. The priors on the nuisance parameters are given in Table 2. We refer to the prior of Table 2 as the "classic" prior. A majority of the EFTofLSS works cited in this paper use a prior of a similar form. Note that the prior on $c_{\epsilon,1}$ is defined independent of $\bar{n}$. For $\bar{n} = 4 \times 10^{-4}\,h^{3}\,\mathrm{Mpc}^{-3}$ the prior width is 400, which is in line with other works that use the PyBird EFTofLSS model.
Figure 4 shows the resulting marginalised 1D and 2D posteriors from the analysis of the PyBird mocks with the fiducial setup and using the full likelihood$^{10}$. The two contour levels in the off-diagonal panels are $1\sigma$ and $2\sigma$, and the grey dashed lines indicate the location of the true values used to generate the mocks. Along with the sampled parameters $\omega_c$, $h$, and $\ln(10^{10}A_s)$, we also plot the marginalised posterior distributions on two derived parameters: $\Omega_m = (\omega_c + \omega_b)\,h^{-2}$ and $\tilde{A} = b_1^2 A_s 10^{8}$. For the purposes of this plot, the derived $\tilde{A}$ posterior samples have had the truth subtracted, such that the 1D marginalised posterior should peak exactly at zero if unbiased. This normalisation of $\tilde{A}$ allows us to compare the distributions calculated for each sample, as they all have different $b_1$ values. Looking at Figure 4, it is clear that for PyBird mocks with a higher SNR (BOSSz1 and BOSSz3 NGC), the agreement with the truth is very good for all parameters. For PyBird mocks with a lower SNR (6dFGS and eBOSS QSO SGC), we observe some significant shifts from the truth in many of the 1D and 2D projections. A likely cause for these shifts is the volume effect (Carrilho et al. 2022; Simon et al. 2022a; Hadzhiyska et al. 2023); these shifts are (at least partially) a result of marginalisation.
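The two derived parameters can be computed directly from posterior samples. The sketch below assumes the definitions $\Omega_m = (\omega_c + \omega_b)\,h^{-2}$ and $\tilde{A} = b_1^2 A_s 10^{8}$, with $A_s$ recovered from the sampled $\ln(10^{10}A_s)$; the function names and numerical values are illustrative only.

```python
import math


def omega_matter(omega_c, omega_b, h):
    """Total matter density parameter from the physical densities."""
    return (omega_c + omega_b) / h**2


def a_tilde(b1, ln_1e10_as):
    """Amplitude combination b1^2 * A_s * 1e8, with A_s = exp(ln_1e10_as) * 1e-10."""
    a_s = math.exp(ln_1e10_as) * 1e-10
    return b1**2 * a_s * 1e8


# Illustrative values only (not results from this work).
toy_omega_m = omega_matter(omega_c=0.12, omega_b=0.022, h=0.7)
toy_a_tilde = a_tilde(b1=2.0, ln_1e10_as=3.0)
# Raising b1 while lowering ln(10^10 A_s) along b1^2 A_s = const leaves a_tilde unchanged.
shifted_a_tilde = a_tilde(b1=2.0 * math.exp(0.1), ln_1e10_as=3.0 - 0.2)
```

The last line illustrates why $\tilde{A}$ is a useful diagnostic: shifts along the $b_1$ and $\ln(10^{10}A_s)$ degeneracy direction leave it unchanged.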

Table 2: The "classic" prior on the nuisance parameters (columns: Parameter, Prior; recoverable entry: $\mathcal{N}(0, 2)$).
In previous works, it has been shown that $\ln(10^{10}A_s)$ is particularly susceptible to volume effects (Carrilho et al. 2022; Simon et al. 2022a), and indeed it is the parameter in Figure 4 that shows the most significant observed shift. See Appendix B for more discussion on the volume effect with a toy example.
The shifts induced in marginalised posteriors are reduced when the constraining power from the data is higher. Figure 5 shows, with dashed coloured lines, the $2\sigma$ region of the 2D marginalised posterior distributions on $b_1$ and $\ln(10^{10}A_s)$ resulting from analysis of the PyBird mocks for various samples with the fiducial setup described above. Also plotted in Figure 5, with coloured shaded regions, is the $2\sigma$ region of the 2D marginalised posteriors obtained from analysis of the PyBird mocks with covariance matrices rescaled by a factor of 1/50. It can be seen that although there is agreement with the truth (represented with dotted grey lines) at the $2\sigma$ level in both cases for all the data samples plotted, the agreement is significantly better when the covariance has been rescaled. The posteriors have shrunk and remained consistent with the truth. If the biases observed in Figure 4 resulted from anything other than marginalisation, we would not see this behaviour. We also note from Figure 5 that the shift in posteriors and median values (shown with coloured squares and points) resulting from rescaling the covariance is along a line of constant $\tilde{A}$ (shown with grey solid lines). This gives a compelling argument for using $\tilde{A}$ as a diagnostic quantity when determining whether observed biases in $\ln(10^{10}A_s)$ are a result of a true systematic bias from the analysis pipeline or a result of volume effects. Finally, we note that rescaling the covariance in this way resolves the observed bias not only in $\ln(10^{10}A_s)$, but in all parameters shown in Figure 4.

Exploration of Analysis Setups
The results from the previous section have shown that the analysis pipeline developed for this work can return unbiased constraints on cosmological parameters of interest for a typical EFTofLSS analysis setup. We can exploit the increased prediction speed of the EFTEMU to explore various analysis setups and observe their impact on the constrained cosmology.

Scale Cuts
We start by exploring different scale cuts. It can be seen from the solid coloured lines in Figure 3 that there is clear scale dependence in the inverse SNR for all the data samples considered for this work. There is also a clear scale dependence in the emulator prediction error. As mentioned in Section 3, when analysing LSS data, there is a general expectation that the SNR increases when pushing to smaller scales. However, this is only true if the scales are not dominated by shot noise. If we combine this with a higher modelling error on smaller scales, including smaller scales might not improve the constraints, contrary to the general expectation.
Figure 6 shows the peak posterior values and 68% CIs of 1D marginalised posteriors (with coloured squares and lines, respectively) on the cosmological parameters $\Omega_m$, $h$, and $\ln(10^{10}A_s)$ resulting from analysis of the PyBird mocks with $k_\mathrm{max.} = 0.150, 0.175, 0.200\,h\,\mathrm{Mpc}^{-1}$ and the full likelihood. The results from the analysis of the BOSS-like mocks all show the same general trend: including smaller scales shrinks the 68% CI, reduces the observed bias in the peak posterior value, or both. The results for 6dFGS show a slightly tighter constraint on $\Omega_m$ and $h$ when including smaller scales, but the constraint on $\ln(10^{10}A_s)$ remains almost constant. This is likely because the constraint on $\ln(10^{10}A_s)$ from 6dFGS is completely dominated by volume effects. We can also see that including smaller scales worsens the agreement with the truth for the eBOSS-like mocks: the 68% CI shrinks, the peak posterior shifts away from the truth, or both. As can be seen from Figure 3, the emulator error is always significantly lower than the error associated with the eBOSS-like mocks; thus, the behaviour of the eBOSS-like results is more likely to be caused by the worsening SNR than by emulator error. It can also be seen from Figure 3 that the smaller-scale modes have larger errors, thus including them worsens the volume effect.
Table 3 quantifies the level of agreement between the true cosmological parameters of the PyBird mocks and the 1D marginalised posteriors resulting from analysis of these mocks with $k_\mathrm{max.} = 0.15\,h\,\mathrm{Mpc}^{-1}$ and $k_\mathrm{max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$. For the purposes of this paper, we quantify the agreement as the number of $\sigma$ separating the peak posterior values of two given marginalised distributions. We define the agreement $N_\sigma$ as $N_\sigma = |\mu_0 - \mu_i| / \sqrt{\sigma_i^2 + \sigma_0^2}$, with $\mu_i$ and $\sigma_i$ being the mean and $1\sigma$ error calculated from the 1D marginalised posterior, and $\mu_0$ and $\sigma_0$ being the mean and $1\sigma$ error of the reference (when calculating $N_\sigma$ for the PyBird mocks, $\sigma_0 = 0$). In the case of asymmetric distributions, if the residual $\mu_0 - \mu_i$ is positive, we use the $1\sigma$ error to the right of the peak posterior. If the residual is negative, we use the $1\sigma$ error to the left of the peak posterior. We note that for all apart from the eBOSS-like mocks, the level of agreement does not significantly change and is at the $\lesssim 0.5\sigma$ level for $\Omega_m$ and $h$ when comparing the results obtained with the two $k_\mathrm{max.}$ values. For the BOSS-like mocks, the level of agreement improves to $< 1\sigma$ for $\ln(10^{10}A_s)$ when including smaller scales. It is also worth noting that although the analyses with $k_\mathrm{max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$ include scales at which the observed emulator error from Figure 3 is at a similar level to the data error, we find no significant bias in the constrained cosmology for those samples least susceptible to volume effects (BOSSz1 NGC, BOSSz3 NGC).
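The agreement metric with its asymmetric-error rule can be sketched as below. The quadrature sum of the posterior and reference errors is an assumption of this sketch, chosen because it reduces to a simple residual over posterior error when the reference error is zero (as for the PyBird mocks); the function and argument names are hypothetical.

```python
import math


def n_sigma(peak, err_left, err_right, ref, ref_err=0.0):
    """Number of sigma between a posterior summary and a reference value.

    The right-hand error is used when the reference lies above the peak,
    the left-hand error when it lies below.
    """
    residual = ref - peak
    err = err_right if residual > 0 else err_left
    # Quadrature sum of the two errors (an assumed form for this sketch).
    return abs(residual) / math.sqrt(err**2 + ref_err**2)


# Illustrative numbers only: the truth sits above the peak, so err_right is used.
toy_mock_agreement = n_sigma(peak=3.00, err_left=0.20, err_right=0.10, ref=3.05)
# With a reference that has its own error, the two errors are combined.
toy_data_agreement = n_sigma(peak=3.00, err_left=0.30, err_right=0.10, ref=2.50, ref_err=0.40)
```

Choosing the error on the side facing the reference avoids overstating agreement when a skewed posterior has a long tail pointing away from the reference value.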

Bayesian Model Comparison
pocoMC allows us to easily calculate the Bayesian evidence for each posterior distribution. We use these evidence calculations to compare EFTofLSS sub-models. We define the full model as the PyBird EFTofLSS model with all nuisance parameters free, and a sub-model as any model that results from fixing any single nuisance parameter or combination of parameters to zero.
The first sub-model we consider ($M_1$) is that of the fiducial setup, with $c_4$, $c_{r,2}$, and $c_\mathrm{mono.}$ all set to zero. Figure 7 shows the natural log of the Bayes factor $\ln(B_i)$ resulting from analysis of the PyBird mocks with $k_\mathrm{max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$ and the full likelihood, with $\ln(B_i)$ given by $\ln(B_i) = \ln[\mathcal{Z}(M_i)] - \ln[\mathcal{Z}(M_0)]$. In the above equation, $\mathcal{Z}(M_0)$ is the evidence calculated for the full model, and $\mathcal{Z}(M_i)$ is the evidence calculated for the sub-model being tested. We can see that although $\ln(B_i)$ is positive for all data samples, indicating that the sub-model is preferred, the preference is weak for all samples apart from the two eBOSS-like samples.
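The sign convention of the Bayes factor can be made concrete with a short sketch. The function name and the log-evidence values are made up for illustration; the convention assumed is that a positive value favours the sub-model over the full model.

```python
def ln_bayes_factor(ln_evidence_sub, ln_evidence_full):
    """Natural log of the Bayes factor of a sub-model relative to the full model."""
    return ln_evidence_sub - ln_evidence_full


# Hypothetical log-evidences for two samples (not results from this work).
toy_ln_evidence = {
    "sample_a": {"full": -120.4, "sub": -119.9},
    "sample_b": {"full": -85.2, "sub": -81.0},
}
toy_ln_bayes = {
    name: ln_bayes_factor(values["sub"], values["full"])
    for name, values in toy_ln_evidence.items()
}
preferred_sub_model = {name: value > 0.0 for name, value in toy_ln_bayes.items()}
```

Working with log-evidences throughout avoids numerical underflow, since the evidences themselves are typically far too small to represent directly.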
The next sub-model we consider ($\mathcal{M}_3$) is chosen by observing the level of constraint beyond the prior for each of the bias parameters and counterterms when analysing the PyBird mocks with $\mathcal{M}_0$, $k_{\rm max.} = 0.2\ h\,{\rm Mpc}^{-1}$, and the full likelihood. Figure 8 shows the ratio of the prior standard deviation to the 1D marginalised posterior standard deviation for each bias parameter and counterterm. We can see that the only parameters to have a significant constraint beyond the prior (ratio $> 1$) are $b_1$, $c_2$, and $c_{r,1}$. As such, we define sub-model $\mathcal{M}_3$ to be that with $b_1$, $c_2$, and $c_{r,1}$ as the only free nuisance parameters, and all others fixed to zero. The results of Figure 8 are clearly prior dependent; a reduction in the prior width for $c_{r,1}$ will result in the ratio in Figure 8 being lower for this parameter. These results represent the case in which we are limited to the classic prior defined in Table 2. We calculate the Bayes factor for each sample in the same way as for sub-model $\mathcal{M}_1$. These Bayes factors are also plotted in Figure 7. We can see that sub-model $\mathcal{M}_3$ is preferred over the full model $\mathcal{M}_0$ at a similar level to $\mathcal{M}_1$ for all the BOSS-like samples and the 6dFGS-like sample. However, the preference for sub-model $\mathcal{M}_3$ over the full model for the eBOSS-like samples is much stronger than that for sub-model $\mathcal{M}_1$. This stronger preference for the more restrictive sub-model $\mathcal{M}_3$ is likely because of the SNR of the eBOSS-like samples, as discussed in previous sections (shot noise leads to a worse SNR on small scales compared to other samples). As the parameters set to zero primarily impact small scales, and the small scales of the eBOSS-like samples are much noisier than those of the other samples, the data provide very little evidence for these parameters.
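The diagnostic described above, the ratio of prior to posterior standard deviation, is straightforward to compute from posterior samples. The following is a minimal sketch with hypothetical function names and synthetic samples. It assumes equally weighted samples, and for a uniform prior it uses the standard deviation of a uniform distribution.

```python
import numpy as np


def uniform_std(lower, upper):
    """Standard deviation of a uniform prior on [lower, upper]."""
    return (upper - lower) / np.sqrt(12.0)


def prior_to_posterior_ratio(prior_std, posterior_samples):
    """Ratio > 1 indicates the data constrain the parameter beyond the prior."""
    return prior_std / np.std(posterior_samples, ddof=1)


rng = np.random.default_rng(1)
# Synthetic posterior samples, four times narrower than a Gaussian prior of width 2.
constrained = rng.normal(0.0, 0.5, size=20000)
# Synthetic posterior samples that simply return the prior.
prior_dominated = rng.normal(0.0, 2.0, size=20000)

print(prior_to_posterior_ratio(2.0, constrained))
print(prior_to_posterior_ratio(2.0, prior_dominated))
```

A ratio close to one flags a parameter whose posterior is set by the prior, which is the criterion used above to decide which parameters to fix.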
Table 4 shows the same as Table 3 for analyses of the PyBird mocks with sub-model $\mathcal{M}_3$. If we compare the results from the two tables, we can see that generally, the agreement is of a similar level or better than that from the results obtained with sub-model $\mathcal{M}_1$. For the eBOSS-like mocks, the level of agreement is significantly better, and the evolution with $k_{\rm max.}$ is now similar to that of the results from the BOSS-like mocks when considering $\ln{\left(10^{10}A_s\right)}$. These results show that we can reduce the parameter space significantly without biasing the constrained cosmology and, in some cases, can alleviate biases likely caused by volume effects.

Priors on Nuisance Parameters
The choice of prior for the nuisance parameters can have a significant impact on the constraint on the cosmological parameters (Carrilho et al. 2022; Simon et al. 2022a); however, physically motivating priors on these parameters is challenging. The EFTofLSS is a perturbative model, and as such, if the contribution to the model from the loop corrections becomes too large, the model breaks down; this has led to priors on the nuisance parameters restricting values to be $\mathcal{O}(1)$. In this section, we explore using a Jeffreys prior (Jeffreys 1998) as an alternative to the zero-centred Gaussian priors commonly used in the literature. We explore the use of a Jeffreys prior because it is non-informative. This is a desirable property as it means we are not favouring any particular region of the parameter space a priori. Hadzhiyska et al. (2023) show that the use of the Jeffreys prior on nuisance parameters can resolve volume effects like those observed in the results presented in previous sections. The Jeffreys prior is defined as
$$J(\theta) = \sqrt{\left|F(\theta)\right|},$$
with $F(\theta)$ being the Fisher information matrix, which for a Gaussian likelihood with covariance independent of model parameters $\theta$ can be written as
$$F_{ij}(\theta) = \frac{\partial M}{\partial\theta_i}\, C^{-1}\, \frac{\partial M^{T}}{\partial\theta_j},$$
where $M$ is the model prediction and $C$ is the data covariance. From the equations above, we can see that partial derivatives of the model with respect to the model parameters are needed to evaluate the Jeffreys prior. These partial derivatives are trivial for the nuisance parameters that appear linearly in the model. They are simple sums of relevant kernels that are predicted by the EFTEMU (or PyBird) for a given set of cosmological parameters. For this work, we only impose the Jeffreys prior on these linearly appearing nuisance parameters. This means that volume effects related to these parameters should be mitigated. However, any volume effects related to marginalisation over the remaining nuisance parameters ($b_1$, $c_2$, and $c_4$) and the cosmological parameters will still remain.
Figure 7 (caption, start not recovered). [...] (Jeffreys 1998); any models with a Bayes factor greater than $\sim 2.5$ have definite evidence that the sub-model is preferred, and any models with a Bayes factor greater than $\sim 5$ have very strong evidence that the sub-model is preferred.
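For parameters that enter the model linearly, the derivative of the model with respect to each parameter is just the corresponding kernel, so the log of the Jeffreys prior reduces to half the log-determinant of a small matrix. The sketch below illustrates this under stated assumptions: the array `kernels` stands in for the kernels an emulator would return at fixed cosmology, and the function name is hypothetical rather than part of EFTEMU or PyBird.

```python
import numpy as np


def ln_jeffreys_prior(kernels, cov):
    """ln sqrt(det F) for linearly appearing parameters.

    kernels : (n_params, n_data) array; row a is dM/dtheta_a
    cov     : (n_data, n_data) data covariance, independent of the parameters
    """
    # Fisher matrix F_ab = K_a C^{-1} K_b^T, using a solve rather than an explicit inverse.
    fisher = kernels @ np.linalg.solve(cov, kernels.T)
    sign, logdet = np.linalg.slogdet(fisher)
    if sign <= 0:
        return -np.inf  # degenerate kernels: the prior is not defined
    return 0.5 * logdet


# Two hypothetical kernels over two data points, with a diagonal covariance.
kernels = np.array([[1.0, 0.0], [0.0, 2.0]])
cov = 4.0 * np.eye(2)
print(ln_jeffreys_prior(kernels, cov))
```

Because the kernels depend on the cosmological parameters, this quantity has to be re-evaluated at every point in the chain, which is one place where a fast emulator helps.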
In practice, we impose hard bounds at $-100$ and $100$ on the linear nuisance parameters in addition to the Jeffreys prior when using the Jeffreys prior with the full likelihood, and we impose additional Gaussian priors with $\sigma = 200$ when using the Jeffreys prior with the marginalised likelihood. These additional priors are chosen relatively arbitrarily and are motivated by the practicalities of our inference pipeline$^{11}$. For the mock analyses presented below, the linearly appearing parameters are constrained well within the additional uniform prior when using the full likelihood. We also test setting $\sigma = 1000$ when using the Jeffreys prior and see no significant difference when comparing to posteriors calculated with $\sigma = 200$.
11 pocoMC requires prior samples as starting positions for particles. This means we must define a prior that we can sample from when using the full likelihood, hence the imposition of the hard bounds at $-100$ and $100$.
Figure 9 shows 1D marginalised posteriors for the cosmological parameters obtained from analysis of the PyBird mocks with sub-models $\mathcal{M}_1$ and $\mathcal{M}_3$ (defined in Section 4.2.2), the Jeffreys prior, and the full likelihood (these setups will henceforth be referred to as JP1 and JP3, respectively). Also plotted are the results obtained with $\mathcal{M}_1$ and $\mathcal{M}_3$, the classic prior, and the full likelihood (henceforth referred to as CP1 and CP3, respectively). We start by considering the results obtained with CP1 and JP1. We can see that for all samples, the agreement with the truth is better when using the Jeffreys prior; this is particularly noticeable for $\ln{\left(10^{10}A_s\right)}$. When using the classic prior, the $\ln{\left(10^{10}A_s\right)}$ peak posterior values shift significantly depending on the SNR of the sample. When using the Jeffreys prior, these peak posterior values are more consistently located around the true value. We expect consistency when examining the results obtained from analysis of the PyBird mocks as they are sample variance free. We can visualise the consistency of the results by calculating the agreement between the results obtained from each sample with Equation 8 and plotting this as a matrix in Figure 10. We can see that for the results obtained with the Jeffreys prior, with the exception of 6dFGS, there is good agreement between the results from each of the other samples; however, the results obtained with the classic prior show some inconsistency. We can quantify the level of consistency by averaging the lower triangle of the matrices in Figure 10. This results in $0.30\sigma$ and $0.94\sigma$ for the Jeffreys prior and classic prior, respectively.
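The consistency summary described above (a pairwise agreement matrix, then the mean of its lower triangle) can be sketched as follows. This is an illustration only. It assumes symmetric $1\sigma$ errors added in quadrature and reports magnitudes, whereas the analysis in the text selects the left or right error for asymmetric posteriors. Function names and input values are hypothetical.

```python
import numpy as np


def agreement_matrix(peaks, errors):
    """|N_sigma| between every pair of samples, assuming symmetric errors."""
    peaks = np.asarray(peaks, dtype=float)
    errors = np.asarray(errors, dtype=float)
    diff = peaks[:, None] - peaks[None, :]
    combined = np.sqrt(errors[:, None] ** 2 + errors[None, :] ** 2)
    return np.abs(diff) / combined


def mean_lower_triangle(matrix):
    """Average of the entries strictly below the diagonal."""
    rows, cols = np.tril_indices_from(matrix, k=-1)
    return matrix[rows, cols].mean()


# Hypothetical peak posterior values and errors for three samples.
peaks = [3.0, 3.1, 2.9]
errors = [0.1, 0.1, 0.1]
print(mean_lower_triangle(agreement_matrix(peaks, errors)))
```

Excluding the diagonal matters here, since each sample agrees perfectly with itself and would otherwise pull the average towards zero.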
From Figure 9, we can also see that using the Jeffreys prior results in an increase in the width of the 68% CIs of the marginalised 1D posteriors. This should be expected, as many of the nuisance parameters converge to the prior when using the classic prior. These parameters have some degeneracy with the cosmological parameters; expanding the space that these parameters can explore inevitably leads to some degradation of the constraints on the cosmological parameters. If we examine the results obtained with JP3, we see that for the low SNR BOSS-like mocks (BOSSz1 and BOSSz3 SGC), we still have a reduction in bias in the $\ln{\left(10^{10}A_s\right)}$ constraint whilst at the same time maintaining a CI that is competitive with the classic prior. We note that for the eBOSS-like mocks, although the $\ln{\left(10^{10}A_s\right)}$ bias is reduced, it is not reduced to the same degree as with JP1. We also note that a greater bias is observed in the $\Omega_m$ constraints when using JP3 compared to CP3.

Joint Analyses
So far, we have considered each sample individually, which can give interesting insights into how the specifics of each sample (such as redshift and sample selection) impact the results. However, we would ultimately like to analyse multiple samples simultaneously to improve constraining power on the cosmological parameters. To do this we treat each sample as being independent, and as such define the joint likelihood as
$$\ln \mathcal{L}_{\rm joint}(\theta|\phi_{\rm joint}) = \sum_i \ln\left[\mathcal{L}(\theta|\phi_i)\right],$$
with $\theta$ being the shared cosmological parameters, $\phi_{\rm joint}$ being the complete set of nuisance parameters $\phi_{\rm joint} = [\phi_1, \phi_2, \ldots, \phi_n]$, and $\ln\left[\mathcal{L}(\theta|\phi_i)\right]$ being defined in Equation 7. Unless explicitly stated, the joint analyses of mocks and data measurements in this work are done with the marginalised likelihood. We exclusively use the marginalised likelihood for these kinds of analyses as the joint parameter space can become very large when considering multiple samples. The analytic marginalisation keeps the dimensionality low, thus keeping the joint analyses tractable$^{12}$.
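Treating the samples as independent means the joint log-likelihood is a plain sum over samples, with the cosmological parameters shared and one nuisance set per sample. The sketch below shows that structure with a generic Gaussian likelihood. The function names and the toy `model_fn` are hypothetical, and the sketch uses explicit nuisance parameters rather than the analytically marginalised likelihood used for the joint analyses in this work.

```python
import numpy as np


def ln_like_gaussian(data, model, inv_cov):
    """Gaussian log-likelihood for one sample, up to an additive constant."""
    resid = data - model
    return -0.5 * resid @ inv_cov @ resid


def ln_like_joint(theta, nuisance_sets, samples, model_fn):
    """Sum of per-sample log-likelihoods.

    theta         : shared cosmological parameters
    nuisance_sets : one set of nuisance parameters per sample
    samples       : list of (data, inverse covariance) pairs
    model_fn      : callable returning the model for (theta, nuisance)
    """
    total = 0.0
    for phi, (data, inv_cov) in zip(nuisance_sets, samples):
        total += ln_like_gaussian(data, model_fn(theta, phi), inv_cov)
    return total


def toy_model(theta, phi):
    """Toy stand-in for a multipole prediction: a constant vector theta * phi."""
    return theta * phi * np.ones(2)


samples = [(np.array([2.0, 2.0]), np.eye(2)), (np.array([3.0, 3.0]), np.eye(2))]
print(ln_like_joint(2.0, [1.0, 1.0], samples, toy_model))
```

The number of free parameters grows with every sample added to `nuisance_sets`, which is the dimensionality problem that analytic marginalisation avoids.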
Figure 11 shows the posterior distributions resulting from analysis of all the BOSS-like mocks (BOSSz1 NGC, BOSSz1 SGC, BOSSz3 NGC, BOSSz3 SGC) with sub-model $\mathcal{M}_1$, the classic prior, $k_{\rm max.} = 0.2\ h\,{\rm Mpc}^{-1}$, and the marginalised likelihood. We note that biases can be observed in the marginalised posteriors. To verify that our joint inference pipeline does not cause these biases, we also analyse the BOSS-like mocks with the covariance rescaled by a factor of 50. These results are also plotted in Figure 11. We can see that the $\sim 1\sigma$ shift from the truth when considering $\ln{\left(10^{10}A_s\right)}$ has been completely resolved. It can be seen that there is still a slight shift when considering $\Omega_m$ and $h$. These biases are now more likely a result of the analysis setup, emulator error, or both rather than volume effects. We do not explore this further, as in all projections of the posterior resulting from analysis with the rescaled covariance, the truth is contained within $1\sigma$. Appendix C compares results obtained with the inference pipeline of this work with those obtained with the pipeline of Zhao et al. (2023).
Figure 12 summarises the marginalised 1D posteriors for the cosmological parameters of interest resulting from analyses of various combinations of the PyBird mocks, with various analysis setups. All analyses were conducted with $k_{\rm max.} = 0.2\ h\,{\rm Mpc}^{-1}$ and the marginalised likelihood. Results obtained with sub-models $\mathcal{M}_1$ and $\mathcal{M}_3$ and the classic prior (as before referred to as CP1 and CP3) are represented with blue and orange points and lines, respectively. Results obtained with sub-models $\mathcal{M}_1$ and $\mathcal{M}_3$ and the Jeffreys prior (JP1 and JP3) are represented with green and red points and lines, respectively. Much of what can be seen from Figure 12 is in line with that from Figure 9. Namely, when limited to the classic prior, the results obtained using CP3 are less biased than those obtained with CP1; when considering alternative priors, the results obtained with JP1 are less biased than those from CP1 and CP3, at the cost of wider error bars; and although JP3 reduces the bias in the $\ln{\left(10^{10}A_s\right)}$ constraints compared to CP1, these results are more biased than those from CP3 when considering $\Omega_m$.
As mentioned above, the 68% CIs are considerably wider when using JP1. This raises the question: is it even worth combining high SNR data with low SNR data if the Jeffreys prior is needed to mitigate against bias? To answer this, we look at the ratio of the 68% CIs resulting from the joint analysis of the BOSSz1 NGC and BOSSz3 NGC PyBird mocks with CP1 to the 68% CIs resulting from joint analysis of all the PyBird mocks with JP1. For $\Omega_m$, $h$, and $\ln{\left(10^{10}A_s\right)}$, this ratio is 0.81, 0.92, and 0.99, respectively. We can see that the use of the Jeffreys prior in JP1 has degraded the constraint in such a way that it is better to simply combine the two samples that have negligible volume effects rather than combine all samples. If we instead look at the ratio of the 68% CIs obtained from analysis of BOSSz1 NGC and BOSSz3 NGC with CP1 to those obtained from the analysis of all the mocks with JP3, it is 1.3, 1.3, and 1.7 for $\Omega_m$, $h$, and $\ln{\left(10^{10}A_s\right)}$, respectively. In this case, there is a significant benefit from doing the joint analysis of all the samples even if the Jeffreys prior is required. It is important to note that when using JP3, we see a $\sim 1\sigma$ shift from the truth when considering $\Omega_m$. This is no worse than the bias in $\Omega_m$ seen in the results of the joint analysis of all the PyBird mocks with CP1 but is worse than that from the joint analysis with JP1.
12 [...] However, we find that the number of particles for the sampler needs to be increased as suggested in the pocoMC documentation; https://pocomc.readthedocs.io/en/latest/. These extra particles mean extra likelihood evaluations are required for each iteration. This adds to the computational cost for each analysis that is already increased by expanding the dimensionality.
Figure caption (start not recovered). [...] 2) and sub-models $\mathcal{M}_1$ and $\mathcal{M}_3$ (defined in Section 4.2.2) respectively. The green and red lines and squares show the results obtained using the Jeffreys prior (defined in Section 4.2.3) with sub-models $\mathcal{M}_1$ and $\mathcal{M}_3$ respectively. All analyses conducted with $k_{\rm max.} = 0.2\ h\,{\rm Mpc}^{-1}$.
Figure 10. Matrices visualising agreement between constraints on $\ln{\left(10^{10}A_s\right)}$ resulting from analysis of different datasets with the same setup. For both panels the data was analysed with sub-model $\mathcal{M}_1$ (as defined in Section 4.2.2) with $k_{\rm max.} = 0.15\ h\,{\rm Mpc}^{-1}$, and the colour indicates the magnitude of $N_\sigma$ (Equation 8). Left: results from analysis using the Jeffreys prior defined in Section 4.2.3. Right: results from analysis with the fiducial prior.

MAIN RESULTS
In this section, we present the main results of this work; constraints on cosmological parameters from analysis of the unified power spectrum multipole measurements discussed in Section 2. We repeat many of the analyses discussed in Section 4, replacing the mock multipoles with those measured from the 6dFGS, BOSS, and eBOSS redshift surveys.

Individual Constraints
We start by presenting the cosmological parameter constraints obtained via analysis of each sample individually. Figure 13 shows the peak posterior values and 68% CIs for the cosmological parameters $\Omega_m$, $h$, and $\ln{\left(10^{10}A_s\right)}$ resulting from analysis of the galaxy power spectrum multipole measurements with four different setups. The first (shown with blue points and lines) is sub-model $\mathcal{M}_1$ ($c_4$, $c_{r,2}$, and $c_{\rm mono.}$ set to zero; see Section 4.2.2) with the classic prior (see Table 2 and Section 4.2.3), the next (shown with green) is sub-model $\mathcal{M}_1$ with the Jeffreys prior described in Section 4.2.3, the third (shown with orange) is sub-model $\mathcal{M}_3$ (all nuisance parameters set to zero except $b_1$, $c_2$, and $c_{r,1}$) with the classic prior, and the last is sub-model $\mathcal{M}_3$ with the Jeffreys prior. We refer to these four setups as CP1, JP1, CP3, and JP3, respectively. The black points and lines, and grey shaded regions, show the 99% CI of the Planck 2018 $\Lambda$CDM TT,TE,EE+low$\ell$+lowE+lensing+BAO results$^{13}$. The results shown in Figure 13 are also summarised in Table D1.
The first thing to note is the strange appearance of the CIs resulting from the analyses of BOSSz1 and BOSSz3 SGC with JP1. The marginalised 1D posteriors on $\Omega_m$ and $h$ are multimodal in these cases. The second modes of these distributions correspond to chain samples with extreme nuisance parameters. This could indicate a breakdown of the model. Further discussion on these results can be found in Appendix A. With the exception of these results, we see good agreement between the results obtained with both sub-models and prior choices. Each given sample and parameter has agreement within $1\sigma$ for all analysis setups. However, we do note that although $< 1\sigma$, there are more differences between the analysis setups when considering $\ln{\left(10^{10}A_s\right)}$.
Figure 11 (caption, start not recovered). [...] 3). Blue lines represent results from analysis of all BOSS-like PyBird mocks, orange represents the results from analysis of the BOSS-like mocks rescaled by a factor of 50. Both analyses done with sub-model $\mathcal{M}_1$ (see Section 4.2.2), the classic style prior (see Table 2), and the marginalised likelihood.
Table 5 quantifies the average level of agreement$^{14}$ between the results presented in Figure 13 and the Planck 2018 results. When we compare the level of agreement between the results obtained with CP1 and CP3 and the Planck 2018 results, we find that they are similar for both setups. For $\Omega_m$ and $h$ there is very little difference between the results obtained with the two setups for a majority of the samples. When considering the results from the eBOSS samples, we see more differences in the $\Omega_m$ and $h$ constraints when comparing the two setups. However, as there is a shift from an $\Omega_m$ that is lower than that from Planck 2018 to one that is higher, the average level of agreement does not change significantly. As mentioned above, the differences between the results obtained with CP1 and CP3 are clearer when considering $\ln{\left(10^{10}A_s\right)}$. For a majority of the samples, there is a shift in the peak posterior $\ln{\left(10^{10}A_s\right)}$ value towards the Planck result. This is combined with a $\sim 10\%$ reduction in the width of the 68% CIs. However, Table 5 shows a similar level of agreement when using CP1 and CP3. This is because of the results from the eBOSS QSO NGC analysis. We can see from Figure 13 that the $\ln{\left(10^{10}A_s\right)}$ posterior when using CP1 is higher than that from Planck 2018. When using CP3, this shifts to even higher values. If we exclude this result, the average level of agreement in the $\ln{\left(10^{10}A_s\right)}$ constraints between the results obtained with CP1 and CP3 and the Planck results is now $1.06\sigma$ and $0.860\sigma$, respectively. We now consider the average level of agreement between the results obtained with JP1 and JP3 and the Planck 2018 results. We see that there is better agreement with the Planck 2018 $\ln{\left(10^{10}A_s\right)}$ constraint when compared to results obtained with CP1 for a majority of samples. As discussed in previous sections, using JP1 widens the 68% CIs. Some of the improvement in agreement with the Planck results will be because of this. However, the shifts in the peak posterior values towards the Planck 2018 results that can be seen in Figure 13 will also result in better agreement.
Figure 13 (caption, start not recovered). [...] 9 for posteriors resulting from analysis of the multipole measurements described in Section 2. Grey shaded regions and black points and lines show the peak posterior and 99% CI from the Planck 2018 $\Lambda$CDM results. The 99% CI has been plotted rather than 68% (as for the other results) to aid comparison, as the Planck 68% is much smaller than all others.
These results show that reducing model complexity going from CP1 to CP3 does not induce any statistically significant bias when considering analyses of the same sample with the two sub-models. They also show that using the reduced sub-model results in $\ln{\left(10^{10}A_s\right)}$ peak posterior values that are closer to that from Planck 2018 for the majority of data samples. We can also see that using the Jeffreys prior allows us to obtain results consistent with those obtained with the classic prior whilst being more agnostic to the form of the nuisance parameter prior. The Jeffreys prior can also increase the level of agreement with CMB results for $\ln{\left(10^{10}A_s\right)}$. However, this can come with the possible probing of unphysical regions of the parameter space.

Joint Constraints
Figure 14 is the same as Figure 13 but summarises the 1D marginalised posterior distributions on the cosmological parameters of interest resulting from joint analyses of the unified multipole measurements. These results are also summarised in Table D2. Also plotted are the Planck 2018 results and relevant results from Carrilho et al. (2022), Simon et al. (2022b), and Glanville et al. (2022). These works use the EFTofLSS to constrain $\Lambda$CDM parameters from analysis of galaxy power spectrum multipoles measured from different datasets. Simon et al. (2022b) uses PyBird to analyse the same eBOSS QSO multipoles used for this work. As with sub-model $\mathcal{M}_1$ of this work, $c_4$, $c_{r,2}$, and $c_{\rm mono.}$ are fixed to zero. Glanville et al. (2022) uses PyBird to perform joint analysis of 6dFGS, BOSS, and eBOSS QSO multipole measurements. The BOSS samples used in Glanville et al. (2022) are slightly different from those used in this work; we refer the reader to Table 1 in Glanville et al. (2022) for details. The analysis of Glanville et al. (2022) also differs in that the hexadecapole $P_4(k)$ is included in the data vector in addition to $P_0(k)$ and $P_2(k)$. Additionally, fewer nuisance parameters are fixed to zero than in either of the sub-models of this work; Glanville et al. (2022) only fixes $c_4$ to zero. Carrilho et al. (2022) uses an independent modelling pipeline for the EFTofLSS to analyse BOSS multipole measurements. Again, the BOSS measurements used in Carrilho et al. (2022) are slightly different from those used in this work; we refer the reader to Section 2.1 of Carrilho et al. (2022) for details. The Carrilho et al. (2022) analysis also differs in that $n_s$ is free, and the data vector includes $P_4(k)$. The form of the nuisance parameters in the Carrilho et al. (2022) pipeline differs from that of this work (for details, see Section 2.2.3 of Carrilho et al. 2022). None of these nuisance parameters are fixed in the Carrilho et al. (2022) analysis. Each of the three joint analysis results presented in Figure 14 approximates one of the EFTofLSS works above in the sense that the same kind of data is used. The eBOSS analysis is comparable to Simon et al. (2022b), the BOSS analysis is comparable to Carrilho et al. (2022), and the ALL analysis is comparable to Glanville et al. (2022).
Table 6 quantifies the level of agreement between the results of the joint analyses presented in Figure 14 and the EFTofLSS literature results and the Planck 2018 $\Lambda$CDM results. We first note the good agreement between the results of each joint analysis and their respective EFTofLSS literature results. With the exception of the constraint on $h$ from the ALL analysis, we see the results of this work agree with the literature results within $\lesssim 1\sigma$. The results of the joint eBOSS analysis show a more significant dependence on the analysis setup for all parameters compared to the BOSS and ALL analyses. Unlike with the analyses of the PyBird mocks, it is more difficult to determine if these shifts in the results are because of volume effects (resulting from a given analysis setup), sample variance, or errors in the modelling. From the mock analysis results presented in Figure 12, we see a slight shift towards the truth when using CP3. From Figure 14, we see that using CP3 shifts the results towards those of Simon et al. (2022b). However, this shift is away from the other EFTofLSS literature results and the Planck 2018 results. If we look again at Figure 12, we see that using JP1 shifts the $\ln{\left(10^{10}A_s\right)}$ results even closer to the truth. Comparing to the equivalent result in Figure 14, we see that using JP1 shifts the $\ln{\left(10^{10}A_s\right)}$ back toward the results obtained with CP1. We note that the Ã posteriors obtained with both sub-models agree very well with each other and with those from Simon et al. (2022b). The linear bias values obtained with CP1 are $2.4 \pm 0.3$ and $2.3 \pm 0.3$ for the NGC and SGC, respectively. This aligns with the linear bias obtained via analysis of the eBOSS QSO samples with non-EFTofLSS models (Hou et al. 2020). These linear bias values are significantly lower when using CP3, at $2.1 \pm 0.2$ for both the NGC and SGC.
The results of the BOSS and ALL analyses show less dramatic shifts in the parameters compared to those from the eBOSS analysis and behave more like the results of the mock analyses. We see that for $\Omega_m$ and $h$, there is very little difference between the analysis setups. There is slightly better agreement with the EFTofLSS literature results and Planck 2018 for these parameters when using JP1 for both the BOSS and the ALL analysis. This results from the increased width of the 68% CI in addition to a slight shift in the peak posterior values. The 68% and 95% CIs from the Carrilho et al. (2022) results appear wider than those from this work. This is most likely a result of the differences in the analysis setup mentioned above; for example, allowing $n_s$ to vary. Glanville et al. (2022) shows an increase in the CIs of all relevant cosmological parameters when including $n_s$ as a free parameter.
When we examine $\ln{\left(10^{10}A_s\right)}$, we observe that the BOSS and ALL joint analyses with CP1 display a level of agreement with the Planck 2018 results that is at the $\sim 2.5\sigma$ level. For the results from CP3, this is at the $\sim 2.4\sigma$ and $\sim 1.9\sigma$ levels for the BOSS and ALL analyses, respectively. The results from JP1 improve the level of agreement with the Planck 2018 results for both the BOSS and ALL joint analyses to $< 1\sigma$. The results from JP3 also show improved agreement with the Planck 2018 results. However, this is still $> 1\sigma$ for the results from the BOSS analysis. Although the peak posterior agrees with that of the JP1 analysis, the 68% CI is tighter and results in a $> 1\sigma$ difference.

CONCLUSIONS
We have presented results from multiple cosmological inference analyses of mock galaxy power spectrum multipoles designed to determine how choices about the analysis setup impact the inferred cosmological parameters. To minimise the computational cost of these mock analyses, we use the neural-network-based EFTEMU to predict the power spectrum multipoles. The classic prior is broadly motivated by the idea of keeping the nuisance parameters small in order for the EFTofLSS model to remain perturbative. However, many of the parameters are very weakly constrained and introduce volume effects that bias the cosmological parameters upon marginalisation. We explore the use of a Jeffreys prior, a non-informative prior that can mitigate against these volume effects. Results from mock analyses with the Jeffreys prior show a significant reduction in bias in the $\ln{\left(10^{10}A_s\right)}$ constraint. Considering the joint analysis of all the mocks, we find the shift from the true $\ln{\left(10^{10}A_s\right)}$ is reduced from $2.0\sigma$ to $0.42\sigma$ when comparing results obtained with sub-model $\mathcal{M}_1$ and the classic prior and Jeffreys prior, respectively. The use of the Jeffreys prior comes at the cost of widening the marginalised posteriors on the cosmological parameters. This comes as a consequence of allowing the nuisance parameters to explore a much larger prior volume. For example, we find that the ratio of the width of the 68% CI to the peak posterior value for the $\Omega_m$ marginalised posterior is $\sim 7.9\%$ when analysing only the mocks with negligible volume effects, with the classic prior. If we instead analyse all of the mocks with the Jeffreys prior, this ratio is $\sim 9.5\%$. This represents a $\sim 20\%$ weakening of the constraint on $\Omega_m$ even though much more data has been used. We can reduce this degradation of the constraint by using the Jeffreys prior with sub-model $\mathcal{M}_3$. The more restrictive parameter space leads to a less significant widening of the 1D marginalised posteriors for the cosmological parameters than when combining $\mathcal{M}_1$ with the Jeffreys prior. If we compute the ratio again, it is now $\sim 6.4\%$, representing a $\sim 20\%$ tightening of the constraint. Now we again see a benefit from analysing all the data, including those samples that are susceptible to volume effects.
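The percentage changes quoted above follow directly from the three fractional CI widths. As a quick arithmetic check, the snippet below computes the change relative to the $\sim 7.9\%$ reference; the function and variable names are illustrative only.

```python
def fractional_change(new_ratio, reference_ratio):
    """Relative change of a fractional CI width with respect to a reference."""
    return (new_ratio - reference_ratio) / reference_ratio


reference = 7.9  # per cent: low-volume-effect mocks, classic prior, sub-model M1
all_jp1 = 9.5    # per cent: all mocks, Jeffreys prior, sub-model M1
all_jp3 = 6.4    # per cent: all mocks, Jeffreys prior, sub-model M3

print(f"JP1: {fractional_change(all_jp1, reference):+.1%}")
print(f"JP3: {fractional_change(all_jp3, reference):+.1%}")
```

The first value is positive (a wider CI, roughly a fifth larger) and the second is negative by a similar amount, consistent with the $\sim 20\%$ weakening and tightening described in the text.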
From the results of the mock analyses, we expect that when the analysis setup uses sub-model $\mathcal{M}_3$ with the classic prior, we see better agreement with the truth compared to sub-model $\mathcal{M}_1$ for the eBOSS samples, and similar levels of agreement for all other samples. We also expect that when using the Jeffreys prior with both sub-models, we will see better agreement with the truth compared to analysis with sub-model $\mathcal{M}_1$ and the classic prior. Upon joint analysis of the unified multipole measurements provided in Beutler & McDonald (2021), we find that analysis with sub-model $\mathcal{M}_3$ and the classic prior leads to better agreement with Planck 2018 $\Lambda$CDM results compared to results from the same analysis with $\mathcal{M}_1$. The level of agreement is improved from $2.4\sigma$ to $1.9\sigma$ for $\ln{\left(10^{10}A_s\right)}$. Analysing all of the multipoles with the Jeffreys prior and both sub-models leads to better levels of agreement again. This is now $0.54\sigma$ and $0.70\sigma$ for $\mathcal{M}_1$ and $\mathcal{M}_3$, respectively. These results indicate that some of the slight tensions between results obtained via analysis with the EFTofLSS and those obtained via analysis of the CMB are a result of analysis setup.
When using the Jeffreys prior, the nuisance parameters can take extreme values. From analysis of the individual data samples, some cases indicate that using the Jeffreys prior allows for probing regions of the parameter space in which the EFTofLSS model is no longer valid. This presents as multimodal distributions in the cosmological parameters. As mentioned above, we also see a degradation of the cosmological parameter constraint when using the Jeffreys prior. Both of these issues can be addressed by the inclusion of the classic prior within the Jeffreys prior. This limits how extreme the nuisance parameters can be, removing potentially unphysical regions of the parameter space and preventing degradation of the cosmological parameter constraints. This is explored in Zhao et al. (2023), where it is shown that the resolution of volume effects can be achieved without severe degradation of the cosmological parameter constraints. We do not explore this in this work as the values of the prior widths for the nuisance parameters represent extra hyperparameters of the analysis that need to be motivated. The use of the Jeffreys prior without this additional tight prior represents an agnostic approach to the nuisance parameters, whether all regions of the parameter space are physically valid or not.
Figure A1 shows the marginalised 1D posteriors for the cosmological parameters of interest obtained with different analysis setups when analysing the BOSSz1 SGC multipole measurements. In black are the results obtained when using the Jeffreys prior. We note the unusual shape of the marginalised posteriors for $\Omega_m$ and $h$; there are second modes of the distributions very far from the Planck 2018 results (plotted in Figure A1 in purple) and at the extremes of the prior-space. The nuisance parameters associated with the cosmological parameters of these second modes tend to have more extreme values than those in the central mode. This potentially indicates a breakdown in the EFTofLSS model in these regions. We can test this by imposing the perturbative condition defined by the equations above when sampling$^{15}$. The red lines in Figure A1 show the results obtained when doing this. We can see that the extreme second modes in the $\Omega_m$ and $h$ posteriors have been removed. However, this also results in a shift in $\ln{\left(10^{10}A_s\right)}$ to lower values. These results appear to show that imposing the perturbative condition when sampling negates the effects of the Jeffreys prior when it comes to resolving the volume effects. A possible cause for this is the way that the perturbative condition has been included. Imposing the condition reduces the prior volume. However, no change has been made to the Jeffreys prior, so the prior volume corrected for with the inclusion of the Jeffreys prior is not the prior volume explored. It should be noted that these results are obtained with a prior that is still more agnostic than the classic style prior. We have made no choice on how large the nuisance parameters can be. We also note that the condition defined by the equations above is by no means exact. [...] then we obtain the results shown with blue lines in Figure A1. In this case, the $\ln{\left(10^{10}A_s\right)}$ constraint agrees with that obtained with the classic style prior (shown with green), but again is a more agnostic prior than the classic style prior.

APPENDIX B: TOY MODEL
The purpose of this appendix is to give some intuition for the volume effect discussed frequently in the main text. [...] (B2)
The two terms in Equation B1 are referred to as the profile and Laplace terms, respectively. The Laplace term is responsible for the volume effect that induces biases in the marginalised posterior on the parameters of interest $\Omega$ when marginalising over the nuisance parameters $n$.
To aid in understanding, we present results from analysis with a toy model. This toy model is the same as that discussed in Section 2.3 of Hadzhiyska et al. (2023); it is a simple power law of the form $n x^{\Omega}$. We start by defining true values for $n$ and $\Omega$. For the example here, we use 50 and 1.5, respectively. We then generate 100 $x$-values as random draws from a uniform distribution $\mathcal{U}(1, 5)$. We compute the model prediction for these 100 $x$-values and the true parameters, and associate the same uncertainty $\sigma$ with each of them (this corresponds to a covariance matrix with a constant diagonal and zero off-diagonals). These synthetic data are then analysed with the same kind of inference pipeline used for the work presented in the main text. When $\sigma = 25$ the inferred $\Omega$ after marginalisation over $n$ is $1.500^{+0.031}_{-0.033}$, in very good agreement with the true $\Omega$. However, when $\sigma = 800$ the inferred $\Omega$ is $0.73^{+0.93}_{-0.91}$; we now have a shift in the peak posterior. This shift in the peak posterior has come from the Laplace term. In Figure B1, the grey solid line shows the location of the true $\Omega$. We can see that when $\sigma = 25$, the minima of both the profile term and the sum of the profile and Laplace terms are in agreement with the truth; the Laplace term has a negligible impact when the constraining power is high. When $\sigma = 800$, we see that the minimum of the profile term is still located at the truth. However, the minimum of the sum of the Laplace and profile terms is shifted toward lower values of $\Omega$.
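The behaviour described above can be reproduced with a few lines of NumPy. The following is a minimal sketch rather than the pipeline used in this work: it assumes the toy model is written as $n x^{\Omega}$, uses noiseless data, evaluates the profile and Laplace terms on a hypothetical grid in $\Omega$ (the prior range of the original analysis is not stated here), and uses an arbitrary random seed. Because the nuisance parameter enters linearly, the best-fit $n$ at fixed $\Omega$ and the second derivative of $\chi^2$ with respect to $n$ are available in closed form.

```python
import numpy as np

# Assumed set-up: y = n * x**Omega with a single linear nuisance parameter n.
N_TRUE, OMEGA_TRUE = 50.0, 1.5

rng = np.random.default_rng(0)         # arbitrary seed (an assumption)
x = rng.uniform(1.0, 5.0, size=100)    # 100 x-values drawn from U(1, 5)
y = N_TRUE * x**OMEGA_TRUE             # noiseless synthetic data

# Hypothetical grid in Omega, chosen only for illustration.
omegas = np.linspace(0.0, 3.0, 601)


def profile_and_laplace(omega, sigma):
    """Return (chi2_*(Omega), log F_*(Omega)) for a constant uncertainty sigma."""
    kernel = x**omega
    fisher = np.sum(kernel**2) / sigma**2             # 0.5 * d^2 chi2 / dn^2
    n_best = np.sum(y * kernel) / np.sum(kernel**2)   # best-fit n at fixed Omega
    chi2_star = np.sum((y - n_best * kernel) ** 2) / sigma**2
    return chi2_star, np.log(fisher)


def minima(sigma):
    """Grid locations of the minima of the profile term and of profile + Laplace."""
    terms = np.array([profile_and_laplace(om, sigma) for om in omegas])
    profile, laplace = terms[:, 0], terms[:, 1]
    return omegas[np.argmin(profile)], omegas[np.argmin(profile + laplace)]


for sigma in (25.0, 800.0):
    om_profile, om_total = minima(sigma)
    print(f"sigma={sigma:g}: profile minimum at {om_profile:.3f}, "
          f"profile + Laplace minimum at {om_total:.3f}")
```

Under these assumptions the profile minimum stays at the true $\Omega$ for both values of $\sigma$, whereas the minimum of the sum moves to lower $\Omega$ only when $\sigma$ is large, since the Laplace term grows with $\Omega$ for $x > 1$.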
As discussed in Section 4.2.3, Hadzhiyska et al. (2023) show that a Jeffreys prior can be used to mitigate against the volume effect. Figure B1 also shows, with a red dashed line, the peak posterior for $\Omega$ when carrying out inference with a Jeffreys prior. For this toy example, we only have one nuisance parameter $n$. As such, the Fisher matrix needed to evaluate the Jeffreys prior is a single value. We can see from Figure B1 that the marginalised peak posterior is now in good agreement with the truth and with the minimum of the profile term.
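The reason the Jeffreys prior removes the shift in this toy example can be made explicit with a short worked derivation. It is a sketch under stated assumptions: the toy model is written as $n x^{\Omega}$ with independent Gaussian uncertainties $\sigma$ (the symbols $n$, $x_i$, $y_i$, and $\pi_{\rm J}$ are notation chosen here for illustration), and the Jeffreys prior is taken to be proportional to the square root of the Fisher information for $n$ at fixed $\Omega$.

```latex
\begin{align}
\chi^2(\Omega, n) &= \sum_i \frac{\left(y_i - n\,x_i^{\Omega}\right)^2}{\sigma^2}, \\
F(\Omega) &= \frac{1}{2}\frac{\partial^2 \chi^2}{\partial n^2}
           = \frac{1}{\sigma^2}\sum_i x_i^{2\Omega}, \\
-2\log \pi_{\rm J}(n \mid \Omega) &= -\log F(\Omega) + \mathrm{const.}, \\
\chi^2_*(\Omega) + \log F(\Omega) - \log F(\Omega) &= \chi^2_*(\Omega).
\end{align}
```

Under these assumptions the Jeffreys prior contributes a term equal and opposite to the Laplace term, so the peak of the marginalised posterior coincides with the minimum of the profile term. Without it, the Laplace term favours smaller $\Omega$, because $F(\Omega)$ grows with $\Omega$ when $x_i > 1$.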
We can relate this toy example to the work of the main text by considering the form of the toy model. We can think of $n$ as being one of the linearly appearing nuisance parameters of the EFTofLSS model and $x^{\Omega}$ as the kernel or combination of kernels relevant for that nuisance parameter.

APPENDIX D: RESULTS TABLES
Table D1 summarises the cosmological constraints resulting from the unified multipole measurements discussed in Section 5.1. Table D2 summarises the cosmological constraints resulting from the joint analyses discussed in Section 5.2.

Same as Figure 4, showing the results of joint analyses of the BOSS-like PyBird mocks. Blue lines show results obtained with the EFTEMU and the cosmological inference pipeline developed for this work. Orange results were obtained using PyBird and the inference pipeline of Zhao et al. (2023). Both analyses were done with the classic prior, the marginalised likelihood, and $k_{\rm max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$.

Figure 1. Top: With points and error bars, the mean of 1049 multipoles measured from the MD-Patchy mocks (Kitaura et al. 2016) for the NGC at $z = 0.61$. The error bars show the $1\sigma$ error calculated from the 1049 measurements. The solid lines show the PyBird prediction for the Planck 2018 TT,TE,EE+lowE+lensing+BAO ΛCDM best-fit cosmology and the MAP estimate resulting from fitting bias parameters and counterterms to the mean multipoles from the MD-Patchy mocks. The crosses show the multipoles measured from BOSS NGC data, again with $z = 0.61$. Bottom: The residual of the mean multipole measurements and the PyBird prediction, normalised by the $1\sigma$ errors reduced by a factor of 10. The colours blue, orange, and green in both panels represent the monopole, quadrupole, and hexadecapole multipole moments, respectively.

3 For the analyses of this work we use values of $4 \times 10^{-4}\,h^3\,\mathrm{Mpc}^{-3}$ for the 6dFGS and BOSS samples. For the eBOSS QSO samples we use $1.5 \times 10^{-5}\,h^3\,\mathrm{Mpc}^{-3}$.

4 More recent papers that use the PyBird EFTofLSS model have an additional normalisation scale $k_R$. For this work, we neglect $k_R$, as such $k_R = k_M$. Throughout we set $k_M = 0.7\,\mathrm{Mpc}^{-1}$.

Figure 3. Prediction error of the re-trained EFTEMU used in this work. The orange shaded regions in each panel show the 68% and 95% credible intervals of the prediction error. The credible intervals are calculated by examining the prediction error on 10,000 test cosmologies not used for training. The prediction error is defined as the ratio of the EFTEMU prediction to the PyBird prediction for the same set of cosmological and nuisance parameters. The ratio is then normalised such that it is equal to zero for a perfect prediction. Each row represents a different redshift: 0.096, 0.38, 0.61, and 1.52 from top to bottom. For the left column, the cosmological parameters are combined with random draws of nuisance parameters from the theoretically viable prior space. For the right column, each test cosmology is combined with a set of nuisance parameters that result in 6dFGS-like predictions. The coloured solid lines show the inverse signal-to-noise ratio on the monopole for the datasets considered for this work. In panels with both blue and green lines, these represent the NGC and SGC, respectively.

Figure 4. 1D and 2D marginalised posterior distributions on the cosmological parameters of interest resulting from analysis of the PyBird mocks with the fiducial analysis setup (described in Section 4.1). The two contour levels in the off-diagonal panels represent the $1\sigma$ and $2\sigma$ regions, and the grey dashed lines in all panels show the true values of the PyBird mocks. The parameters $\Omega_m$ and Ã have been derived, whilst the other parameters were sampled (see Section 4.1 for details).

Figure 5. 2D marginalised posterior for $b_1$ and $\ln{\left(10^{10}A_s\right)}$ resulting from analysis of mocks representing various samples of interest for this work with the fiducial setup described in Section 4.1. The dashed coloured contours represent the $2\sigma$ region calculated when analysing mocks with covariance representative of their respective datasets. The filled contours represent the $2\sigma$ region calculated with the covariance rescaled by a factor of 50. The coloured squares show the median values of the posterior obtained from analysis with the standard covariance and the circles from analysis with the rescaled covariance. The vertical dotted line shows the true $\ln{\left(10^{10}A_s\right)}$ value of the mock, and the horizontal dashed lines show the true $b_1$ values for each mock. The grey solid lines show lines of constant Ã with $b_1$ values equal to the truth from the mocks.

Figure 6. Summary of the 1D marginalised posteriors on cosmological parameters of interest resulting from the analyses described in Section 4.2.1. Coloured squares show peak posterior values, dark horizontal coloured lines show the width of the 68% CI, light coloured lines with caps show the 95% CI, and vertical dashed lines show the true values of the mocks.

Figure 7. Natural log of the Bayes factor comparing two EFTofLSS sub-models, $\mathcal{M}_1$ and $\mathcal{M}_3$, for each of the datasets considered for this work with $k_{\rm max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$. The grey dashed lines indicate two limits of the Jeffreys scale (Jeffreys 1998); a value greater than ∼2.5 indicates definite evidence that the sub-model is preferred, and a value greater than ∼5 indicates very strong evidence that the sub-model is preferred.

Figure 8. Ratio of the prior standard deviation to the posterior standard deviation for the marginalised 1D posteriors resulting from analysis of the BOSSz3 NGC PyBird mock with $k_{\rm max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$ and the full likelihood. The black solid line indicates unity.

Figure 9. Same as Figure 6 but comparing the impact of prior choice rather than varying $k_{\rm max.}$. The blue and orange lines and squares show results using the classic EFTofLSS prior (defined in Table 2) and sub-models $\mathcal{M}_1$ and $\mathcal{M}_3$ (defined in Section 4.2.2), respectively. The green and red lines and squares show the results obtained using the Jeffreys prior (defined in Section 4.2.3) with sub-models $\mathcal{M}_1$ and $\mathcal{M}_3$, respectively. All analyses were conducted with $k_{\rm max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$.

Figure 11. Same as Figure 4 for posteriors resulting from the joint analyses of the PyBird mocks (discussed in Section 4.3). Blue lines represent results from analysis of all BOSS-like PyBird mocks; orange lines represent the results from analysis of the BOSS-like mocks rescaled by a factor of 50. Both analyses were done with sub-model $\mathcal{M}_1$ (see Section 4.2.2), the classic style prior (see Table 2), and the marginalised likelihood.

Figure 12. Same as Figure 9 for posteriors resulting from the joint analyses of the PyBird mocks (discussed in Section 4.3) with various analysis setups. Blue and orange represent results from analyses with the classic style prior and sub-models $\mathcal{M}_1$ and $\mathcal{M}_3$, respectively. Green and red show results with the Jeffreys prior defined in Section 4.2.3. All analyses used $k_{\rm max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$ and the marginalised likelihood.

Figure 13. Same as Figure 9 for posteriors resulting from analysis of the multipole measurements described in Section 2. Grey shaded regions and black points and lines show the peak posterior and 99% CI from the Planck 2018 ΛCDM results. The 99% CI has been plotted rather than the 68% CI (as for the other results) to aid comparison, as the Planck 68% CI is much smaller than all others.

Figure 14. Same as Figure 12 for the analyses of the mock multipole measurements discussed in Section 2. In addition to the results from this work, the Planck 2018 results and results from Carrilho et al. (2022); Simon et al. (2022b); Glanville et al. (2022) are plotted for comparison. As with Figure 13, the 99% CI of the Planck results has been plotted to aid comparison. All results from this work were obtained with $k_{\rm max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$ and the marginalised likelihood.

Figure B1 compares the profile and Laplace terms for the two values of $\sigma$. The top panel shows the results with $\sigma = 25$, and the bottom panel shows the results with $\sigma = 800$. The blue lines show the profile term $\chi^2_*(\Omega)$, and the dashed orange lines show the sum of the profile term and the Laplace term $\log\{\det[F_*(\Omega)]\}$. The results have been normalised such that the minimum has a value of zero.

Figure A1. Marginalised 1D posteriors for the cosmological parameters of interest obtained with different analysis setups when analysing the BOSSz1 SGC multipole measurements. All analyses were conducted with $k_{\rm max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$ and the full likelihood. The black lines represent the results obtained with the Jeffreys prior (defined in Section 4.2.3), the green lines represent the results with the classic prior (defined in Table 2), and the red and blue lines represent the results obtained with the Jeffreys prior and the perturbative conditions discussed in Appendix A. Also plotted (in purple) are the marginalised 1D posteriors for the Planck 2018 results.

Figure B1. Comparison of the profile and Laplace terms discussed in Appendix B. The top panel shows the terms calculated for each $\Omega$ with $\sigma = 25$. The bottom panel shows the same with $\sigma = 800$. The grey solid lines indicate the location of the true $\Omega$. The green dashed lines indicate the peak of the marginalised posterior obtained from carrying out inference with the given value of $\sigma$. The red dashed line in the bottom panel shows the peak posterior when conducting inference with a Jeffreys prior.
The width of the prior on $\omega_c$, $h$, and $\ln{\left(10^{10}A_s\right)}$ was increased significantly, and the spectral index $n_s$ was fixed as we do not expect to get any meaningful constraint on $n_s$ from our analyses. Table 1 compares the prior for the original EFTEMU to that used in this work. The larger training space required a change in the training procedure compared to that in Donald-McCann et al.
Table 2. Priors on the bias parameters and counterterms of the PyBird EFTofLSS model. $\mathcal{U}(a, b)$ denotes a uniform distribution with boundaries $a$ and $b$, and $\mathcal{N}(\mu, \sigma)$ denotes a normal distribution with mean $\mu$ and standard deviation $\sigma$. The last two columns indicate which parameters are included in the two sub-models $\mathcal{M}_1$ and $\mathcal{M}_3$ (defined in Section 4.2.2).

Table 3. Number of sigma between the true cosmology of the PyBird mocks and the 1D marginalised posteriors, resulting from analysis with $k_{\rm max.} = 0.15\,h\,\mathrm{Mpc}^{-1}$ and $k_{\rm max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$. Left and right columns for each cosmological parameter correspond to $k_{\rm max.} = 0.15\,h\,\mathrm{Mpc}^{-1}$ and $k_{\rm max.} = 0.2\,h\,\mathrm{Mpc}^{-1}$, respectively. Lower values are indicated with bold font.

Table 4. Same as Table 3 for analyses with sub-model $\mathcal{M}_3$ defined in Section 4.2.2.

Table 5. The average level of agreement between the 1D marginalised posteriors resulting from the analyses (described in Section 5.1) of the unified multipole measurements and the Planck 2018 results. Each row corresponds to results obtained with a different analysis setup. From top to bottom, those are: sub-model $\mathcal{M}_1$ with the classic prior, sub-model $\mathcal{M}_3$ with the classic prior, sub-model $\mathcal{M}_1$ with the Jeffreys prior, and sub-model $\mathcal{M}_3$ with the Jeffreys prior.

Table 6. Level of agreement between the marginalised 1D posteriors on the cosmological parameters of interest resulting from the analyses described in Section 5.2, and the Planck 2018 results and appropriate EFTofLSS literature results. For each sample, there are four rows; these correspond to results with different analysis setups. From top to bottom, they are: sub-model $\mathcal{M}_1$ with the classic prior, sub-model $\mathcal{M}_3$ with the classic prior, sub-model $\mathcal{M}_1$ with the Jeffreys prior, and sub-model $\mathcal{M}_3$ with the Jeffreys prior. Each cosmological parameter has two columns. The left column of each corresponds to the comparison with the appropriate EFTofLSS literature results: Glanville et al. (2022) for ALL, Carrilho et al. (2022) for BOSS, and Simon et al. (2022b) for eBOSS. The right column shows the comparison with Planck 2018.
The EFTEMU has been improved beyond that in Donald-McCann et al. (2022b) to allow for accurate predictions to be made on a much larger cosmological prior space. The main analysis setup choices we explore are the choice of prior on the nuisance parameters of the EFTofLSS model and which parameters to include in our analyses. The classic EFTofLSS prior takes the form of zero-centred Gaussian distributions with narrow widths on the majority of the nuisance parameters. We compare the Bayesian evidence calculated from analyses of the mock multipoles with different sets of nuisance parameters fixed at zero and the classic prior. Fixing different sets of nuisance parameters to zero results in different EFTofLSS sub-models. The first sub-model we consider ($\mathcal{M}_1$) is constructed by fixing the parameters  4 ,   ,2 , and  mono. to zero. This is a typical choice in the EFTofLSS literature. The next sub-model we consider ($\mathcal{M}_3$) is constructed by fixing all nuisance parameters but  1 ,  2 , and   ,1 to zero. There is a significant preference for sub-model $\mathcal{M}_3$ over $\mathcal{M}_1$ for the eBOSS-like mocks constructed for this work. The results of the mock analyses show less bias in the inferred cosmology when using sub-model $\mathcal{M}_3$ instead of $\mathcal{M}_1$ to analyse the eBOSS-like mocks with the classic prior.
All of what follows is based on Hadzhiyska et al. (2023, see Section 2 for a detailed discussion of the concepts discussed in this appendix). The marginalised posterior on $\Omega$ can be approximated as

$$-2\log P(\Omega) = \chi^2_*(\Omega) + \log\{\det[F_*(\Omega)]\} + \mathrm{const.}, \quad \mathrm{(B1)}$$

with $\chi^2_*(\Omega) = \chi^2(\Omega, n_*)$, where $\Omega$ and $n_*$ are the model parameters of interest and the best-fit nuisance parameters, respectively, and $F_*(\Omega)$ is given by

$$F_{*,ij}(\Omega) = \frac{1}{2}\left.\frac{\partial^2\chi^2}{\partial n_i\,\partial n_j}\right|_{n_*}. \quad \mathrm{(B2)}$$

15 In practice, this involves heavily penalising any sample that breaks this condition, rather than imposing a hard bound.