The Parameter-Level Performance of Covariance Matrix Conditioning in Cosmic Microwave Background Data Analyses

Empirical estimates of the band power covariance matrix are commonly used in cosmic microwave background (CMB) power spectrum analyses. While this approach easily captures correlations in the data, noise in the resulting covariance estimate can systematically bias the parameter fitting. Conditioning the estimated covariance matrix, by applying prior information on the shape of the eigenvectors, can reduce these biases and ensure the recovery of robust parameter constraints. In this work, we use simulations to benchmark the performance of four different conditioning schemes, motivated by contemporary CMB analyses. The simulated surveys measure the $TT$, $TE$, and $EE$ power spectra over the angular multipole range $300 \le \ell \le 3500$ in $\Delta \ell = 50$ wide bins, for temperature map-noise levels of $10$, $6.4$, and $2\,\mu$K-arcmin. We divide the survey data into $N_{\mathrm{real}} = 30, 50,$ or 100 uniform subsets. We show how different conditioning schemes affect the errors in the covariance estimate, and how these uncertainties on the covariance matrix propagate to the best-fit parameters and parameter uncertainties. The most significant effect we find is an additional scatter in the best-fit point, beyond what is expected from the data likelihood. For a minimal conditioning strategy, $N_{\mathrm{real}} = 30$, and a temperature map-noise level of $10\,\mu$K-arcmin, we find the uncertainty on the recovered best-fit parameter to be a factor of 1.3 larger than the apparent posterior width from the likelihood (a factor of 1.2 larger than the uncertainty when the true covariance is used). Stronger priors on the covariance matrix reduce the misestimation of parameter uncertainties to $<1\%$. As expected, empirical estimates perform better with higher $N_{\mathrm{real}}$, ameliorating the adverse effects on parameter constraints.


INTRODUCTION
Current cosmological surveys provide new insight on inflation, dark matter, dark energy, and neutrino properties (Dodelson et al. 2013; Abazajian et al. 2015; Planck Collaboration et al. 2020a; Dutcher et al. 2021; Aiola et al. 2020). Cosmic microwave background (CMB) and large-scale structure (LSS) observations complement one another by providing snapshots of the universe at different times. The interplay of these two probes allows us to constrain structure formation and modified gravity models (Omori et al. 2019).
A crucial ingredient in many data analyses is the covariance matrix, which encodes the uncertainties on the measured data points. An accurate covariance matrix is essential for producing reliable model constraints; errors in the covariance matrix propagate directly into biased cosmological parameters (Hartlap et al. 2007; Hamimeche & Lewis 2009; Taylor et al. 2013; Dodelson & Schneider 2013a).
Since analytic calculations of the covariance matrix may struggle to capture the full complexity of the instrument, estimators are frequently used on simulated or real data to generate a robust estimate of the covariance matrix. While one can use simulations to produce a covariance estimate to any desired precision, this requires a good understanding of the noise properties of the experiment and, perhaps more importantly, can be very computationally expensive. Dodelson & Schneider (2013a) note that modern surveys underestimate their parameter uncertainties by ∼5-10%, due to only running O(100) simulations (e.g., Louis et al. 2017b; Henning et al. 2018a; Planck Collaboration et al. 2020b). Shrinkage estimators (Hamimeche & Lewis 2009), the CARPool algorithm (Chartier et al. 2021), data compression via MOPED (Heavens et al. 2017), emulators (Schneider et al. 2008; Morrison & Schneider 2013), and large-scale mode-resampling methods (Schneider et al. 2011) can ameliorate the increasing computational cost of simulations by reducing the number of realisations required to achieve adequate precision.

Corresponding author: lbalkenhol@student.unimelb.edu.au
Alternatively, the measured data itself can be used to estimate the covariance matrix (e.g., Lueker et al. 2010b; Das et al. 2014; Crites et al. 2015; Henning et al. 2018b; Dutcher et al. 2021). This approach does not make any assumptions about the noise profile of the experiment and can avoid or reduce the computational cost of simulations.

The band power covariance matrix, C, describes the uncertainty on band power measurements and the correlations between them. It is defined as

C_{bb'} = ⟨ (C_b − ⟨C_b⟩)(C_{b'} − ⟨C_{b'}⟩) ⟩ ,  (1)

where C_b is the binned power spectrum and ⟨C_b⟩ the average thereof (e.g., Hivon et al. 2002; Tristram et al. 2005). Tristram et al. (2005) show that under certain assumptions we can model the covariance matrix as

Cov(C^{AB}_b, C^{CD}_b) = (C^{AC}_b C^{BD}_b + C^{AD}_b C^{BC}_b) / ν_b ,  (2)

where {A, B, C, D} ∈ {T, E, B} are any temperature or polarisation fields of the CMB, and ν_b is the effective number of modes in a given band power bin.
However, modelling the experimental noise sufficiently accurately to use these theoretical prescriptions can be difficult. Eqn. 2 also assumes that the mode-coupling matrix can be treated as diagonal, which is rarely satisfied given filtering choices and the requirement to mask bright radio galaxies and the Milky Way. The mode-coupling reduces the accuracy of straightforward analytic methods to O(10%) (Hamimeche & Lewis 2009).
Alternatively, the covariance matrix can be estimated empirically. This approach relies on forming a set of N_real realisations of the experimental data, by dividing it into uniform subsets. These realisations form the basis of estimating the noise and signal-noise interaction terms of the covariance matrix, which we refer to together as the noise-variance contribution, C^N_{bb'}. Adding the signal term, i.e. the sample-variance contribution, C^S_{bb'}, to the noise-variance gives the full band power covariance matrix:

C_{bb'} = C^S_{bb'} + C^N_{bb'} .

By calculating the N_real(N_real − 1)/2 unique cross spectra of the data realisations, Ĉ^{αβ}_b with α ≠ β, where α and β index the realisations, we can use the observed scatter of the distribution of band powers to estimate part of the covariance matrix. As shown by Lueker et al. (2010a), the correct noise-variance contribution to the covariance is provided by the estimator

Ĉ^N_{bb'} = f(N_real) × [2 / (N_real(N_real − 1))] Σ_{α<β} ∆Ĉ^{αβ}_b ∆Ĉ^{αβ}_{b'} ,  (3)

where ∆Ĉ^{αβ}_b are the mean-subtracted binned power spectra and f(N_real) is a correction for the finite number of realisations that approaches unity for large N_real. In this limit, Ĉ^N_{bb'} = C^N_{bb'}. The sample-variance contribution can be obtained from the signal-only simulations typically used for the transfer-function calculation; since these simulations are uncorrelated, the variance of their band powers provides an unbiased estimate of C^S_{bb'}. Therefore, by adding the noise-variance estimator to the variance of the signal-only simulation spectra, the band power covariance matrix can be estimated.
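The cross-spectrum based noise-variance estimator described above can be sketched as follows. This is a minimal illustration, assuming the cross-spectra have already been binned into band powers and omitting the finite-N_real correction f(N_real); the function name and array layout are our own.

```python
import numpy as np

def noise_variance_estimator(cross_spectra):
    """Estimate the noise-variance contribution to the band power
    covariance from the scatter of the unique cross-spectra.

    cross_spectra : array of shape (n_pairs, n_bins), one row per
        unique cross-spectrum C^{alpha beta}_b with alpha != beta.

    The finite-N_real correction f(N_real), which approaches unity
    for large N_real, is omitted here for simplicity.
    """
    # Mean-subtracted binned power spectra, Delta C_b
    delta = cross_spectra - cross_spectra.mean(axis=0)
    n_pairs = cross_spectra.shape[0]
    # Average the outer product of the scatter over all unique pairs
    return delta.T @ delta / n_pairs
```

For N_real = 30 realisations there are 30 × 29/2 = 435 unique cross-spectra feeding this estimate.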
Empirical estimators provide a noisy estimate of the band power covariance matrix, where the number of realisations N_real sets the precision. Lueker et al. (2010b) show that each element has a statistical variance of

Var(Ĉ_{bb'}) ≈ (C_{bb} C_{b'b'} + C²_{bb'}) / N_real .  (4)

Errors in the covariance matrix estimate propagate through to cosmological parameters, widening the posterior distributions and biasing the best-fit point (Hartlap et al. 2007; Hamimeche & Lewis 2009; Taylor et al. 2013; Dodelson & Schneider 2013a). Increasing N_real reduces the errors, and thus the magnitude of these effects, although the observation strategy often limits the possible number of data splits for real data. For simulations, the computing cost limits the number of realisations available.
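The per-element scaling above can be checked numerically in a toy two-bin setup. This is a sketch: the Gaussian band powers and the 2×2 "true" covariance are stand-ins, not quantities from the simulations in this work.

```python
import numpy as np

rng = np.random.default_rng(0)
n_real, n_trials = 50, 4000
# Toy "true" 2-bin band power covariance (hypothetical numbers)
c_true = np.array([[2.0, 0.5],
                   [0.5, 1.0]])
chol = np.linalg.cholesky(c_true)

# Repeatedly form a sample covariance from n_real Gaussian realisations
ests = np.empty((n_trials, 2, 2))
for t in range(n_trials):
    x = (chol @ rng.standard_normal((2, n_real))).T
    ests[t] = np.cov(x, rowvar=False)

# Predicted statistical variance of each element:
#   Var(C_hat_bb') ~ (C_bb C_b'b' + C_bb'^2) / N_real
pred = (np.outer(np.diag(c_true), np.diag(c_true)) + c_true**2) / n_real
ratio = ests.var(axis=0) / pred  # element-wise ratios, all near 1
```

The measured element-wise variances track the 1/N_real prediction to within the Monte Carlo noise of the experiment above.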

Conditioning
Conditioning the covariance matrix estimate can reduce the errors for a fixed N_real. This step takes place before parameter estimation and uses prior knowledge to constrain noisy elements of the covariance matrix. To navigate this broad landscape we define four different, increasingly strict conditioning strategies, which we benchmark in this work. These act on the complete covariance matrix, i.e. the sum of the noise-variance and sample-variance estimators.
• No conditioning: The final covariance is given by the noise-variance plus sample-variance estimators.
• Diagonal conditioning: We calculate the diagonal elements of all covariance blocks as before, but set all off-diagonal elements to zero.
• Moderate conditioning: Diagonal elements are calculated as before. Off-diagonal elements of the auto-covariance blocks are band-averaged in the correlation matrix for bins within ∆ℓ ≤ 100 of the diagonal, but set to zero for ∆ℓ > 100. All off-diagonal elements of off-diagonal covariance blocks are set to zero.
• Deep conditioning: An estimate of the effective number of modes and the noise spectrum is extracted from a combination of real data and transfer-function simulations. Using this information, diagonal elements are replaced by their theoretical expectation values. Second-order polynomials are fitted to band-diagonal elements in the correlation matrices (for details see Dutcher et al. 2021).
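The first two non-trivial schemes can be sketched for a single auto-covariance block as below. The band-averaging details here are our own simplification of the scheme described above; the deep scheme additionally requires mode counts and noise spectra and is not reproduced.

```python
import numpy as np

def diagonal_condition(cov):
    """'Diagonal' scheme: keep only the main diagonal, zero the rest."""
    return np.diag(np.diag(cov))

def moderate_condition(cov, bin_width=50, max_sep=100):
    """A sketch of the 'moderate' scheme for one auto-covariance block:
    band-average the correlation matrix for bins within `max_sep` (in
    ell) of the diagonal, zero it beyond that, and keep the original
    diagonal. Details differ from any specific published analysis."""
    d = np.sqrt(np.diag(cov))
    corr = cov / np.outer(d, d)
    n = cov.shape[0]
    out = np.eye(n)
    max_off = max_sep // bin_width  # band offsets kept after averaging
    for off in range(1, n):
        band = np.diag(corr, k=off)
        val = band.mean() if off <= max_off else 0.0
        out += val * (np.eye(n, k=off) + np.eye(n, k=-off))
    # Restore the original variances
    return out * np.outer(d, d)
```

For ∆ℓ = 50 bins and max_sep = 100, only the first two off-diagonal bands of the correlation matrix survive, each replaced by its band average.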

SIMULATION FRAMEWORK
We run simulations to gauge to what extent the conditioning schemes outlined in §2.2 protect against the adverse effects of a noisy covariance matrix estimate. To reduce the computational cost, we mimic the observation of a modest ∼290 deg² field, comprised of 512 × 512 square pixels of 2 arcmin width, using the flat-sky approximation. The underlying CMB signal is based on a power spectrum calculated by camb (Lewis et al. 2000a), with cosmological parameters set to the latest Planck results (Planck Collaboration et al. 2020d). We add white and 1/f noise at the map level, which follows the power spectrum

N^X_ℓ = N^X_white [1 + (ℓ/ℓ^X_knee)^{α_X}] ,

where X ∈ {T, E}, α_T = −3.5, ℓ^T_knee = 1000, α_E = −1.4, ℓ^E_knee = 700, and N^X_white is the white noise level, which is in turn set by the map-noise level, σ_map. We assume that the white noise power in polarisation is twice that of temperature. This noise curve is chosen to approximate the anticipated noise characteristics of the Simons Observatory (SO) large aperture telescope (c.f. Equation 1 in Simons Observatory Collaboration (2019)). We simulate having access to N_real maps of the same CMB signal, with independent noise realisations added to each one. We proceed to measure the TT, TE, and EE power spectra across the angular multipole range 300 ≤ ℓ ≤ 3500 and bin them into band powers of width ∆ℓ = 50. The simulated power spectrum measurements are complemented by N_sim = 300 signal-only simulations, which mimic the typical approach to calculating the transfer-function.
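The noise curve above can be evaluated directly. The conversion from a map-noise level in µK-arcmin to a white-noise power, N_white = (σ_map × 1 arcmin in radians)², is the standard flat-sky convention; the function below is our own sketch.

```python
import numpy as np

def noise_cl(ell, sigma_map_uk_arcmin, ell_knee, alpha):
    """White + 1/f noise power spectrum:
        N_ell^X = N_white^X * (1 + (ell / ell_knee^X)^alpha_X),
    with alpha negative so the 1/f term dominates at low ell.
    sigma_map_uk_arcmin is the map-noise level in uK-arcmin."""
    arcmin = np.pi / (180.0 * 60.0)  # one arcminute in radians
    n_white = (sigma_map_uk_arcmin * arcmin) ** 2
    return n_white * (1.0 + (ell / ell_knee) ** alpha)

ell = np.arange(300, 3500)
# Temperature curve for the 10 uK-arcmin case; polarisation carries
# twice the white-noise power, i.e. sqrt(2) times the map noise.
n_tt = noise_cl(ell, 10.0, ell_knee=1000, alpha=-3.5)
n_ee = noise_cl(ell, 10.0 * np.sqrt(2.0), ell_knee=700, alpha=-1.4)
```

At ℓ = ℓ_knee the 1/f and white terms are equal by construction, so N_ℓ is twice the white level there.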
We run nine sets of simulations, for N_real = 30, 50, or 100, and σ_map = 10, 6.4, or 2 µK-arcmin. The numbers of realisations are chosen to span the range seen in contemporary analyses, without inflating the computational cost (e.g., Dutcher et al. 2021; Henning et al. 2018b; Choi et al. 2020; Louis et al. 2017a; Planck Collaboration et al. 2020c). The map-noise levels are chosen to match the SO baseline configuration, the SO goal, and the projected depth of the complete SPT-3G survey (Simons Observatory Collaboration 2019; Bender et al. 2018). We run 1000, 500, and 100 simulations for N_real = 30, 50, and 100, respectively.
For each simulation we estimate the band power covariance matrix as detailed in §2.1 and condition copies of it as described in §2.2, i.e. each simulation is associated with four differently conditioned versions of the same covariance matrix estimate. Furthermore, we calculate the average of all unconditioned covariance matrices for a given N_real, σ_map combination, which we refer to as the fiducial covariance matrix. We also calculate a theoretical covariance void of mode-coupling using the prescription in Equation 2.
We note that the sky area chosen here is much smaller than the coverage of current and next-generation ground-based CMB surveys. However, the sky fraction only serves to scale the overall amplitude of the covariance matrix. Ultimately, we are interested in a comparative study of conditioning schemes, not their absolute performance. Therefore the shape of the covariance matrix and the noise on the covariance estimate, as set by σ_map and N_real, respectively, are of greater interest here.

Comparison in Covariance Space
First, we look at how well the true covariance matrix can be estimated, for a fixed number of realisations, under each conditioning scheme. We assess this by looking at the eigenvalues of differently conditioned covariance matrices for the N_real = 30, σ_map = 10 µK-arcmin simulations in Figure 1.
We see that the largest ∼70 eigenvalues are recovered well by all conditioning schemes, though the unconditioned matrices show the largest fluctuations around the theoretical expectation values. We consider fluctuations >20% away from the fiducial values as failures to retrieve the correct eigenvalues. For the unconditioned matrices, 61% of eigenvalues are within this threshold. The performance of the other conditioning schemes is comparable to one another for the largest ∼70 eigenvalues, though for smaller eigenvalues they begin to deviate; 14% of the eigenvalues of the diagonally and moderately conditioned covariances deviate by more than 20% from the fiducial case. Deeply conditioned matrices recover all eigenvalues, with no statistically significant deviation from the theoretical covariance.

Figure 1. We plot the ratio of the average of the absolute eigenvalues across all N_real = 30, σ_map = 10 µK-arcmin simulations, in increasing order, to the eigenvalues of the theoretical covariance matrix. The shaded regions correspond to the standard deviation around the mean. The eigenvalues of unconditioned and mildly conditioned covariances recover the largest third and half of the eigenvalues well, respectively; beyond that the eigenvalues drop off. The dotted grey lines indicate our threshold for acceptable deviations from the fiducial case, ≤ 20%. For the unconditioned covariances 39% of values lie outside this range, whereas 14% of the eigenvalues of diagonally and moderately conditioned matrices show large deviations from the expectation. Deep conditioning recovers the entire range of eigenvalues.

Noisy estimates of the covariance matrix can lead to negative eigenvalues, which remain a problem for most of the conditioning strategies. Deep conditioning is the exception, yielding positive-definite covariance matrices in all cases tested. As the breakdown in positive definiteness is related to the magnitude of the noise on the covariance estimate, increasing the number of realisations generally improves the situation (c.f. Equation 4). For example, for the σ_map = 10 µK-arcmin simulations, 24%, 19%, and 12% of the eigenvalues of unconditioned matrices are negative for N_real = 30, 50, and 100, respectively. Diagonal and moderate conditioning perform better: at N_real = 30 only 0.4% and 0.3% of eigenvalues are negative, which reduces to 0.03% and 0.02% for N_real = 50, respectively. For N_real = 100, diagonal and moderate conditioning match the performance of deep conditioning and we register no negative eigenvalues.
This work varies the number of realisations used in the empirical covariance estimate for the noise variance, but uses a fixed, higher number of realisations to estimate the sample variance. As a result, the number of negative eigenvalues decreases in these tests as the angular multipole range dominated by sample variance increases (i.e. as the map-noise level decreases). The sample-variance contribution to the covariance is always comparatively well-constrained with 300 realisations. For example, the average fraction of negative eigenvalues for unconditioned covariance matrices reduces from 24% for σ_map = 10 µK-arcmin, to 20% and 14% for σ_map = 6.4 µK-arcmin and σ_map = 2 µK-arcmin, respectively, for the N_real = 30 simulations.
One can also improve the performance of empirical covariance estimators by reducing the number of entries in the covariance. In this context, the number can be reduced by using a coarser angular multipole binning for the power spectrum measurement. For the N_real = 30, σ_map = 10 µK-arcmin simulations, the temperature and E-mode power spectra are signal-dominated for ℓ < 3300 and ℓ < 1750, respectively. Doubling the bin size to ∆ℓ = 100 decreases the fraction of negative eigenvalues in the unconditioned covariances from 24% to 17%. Similarly, for diagonal and moderate conditioning the fraction of negative eigenvalues is reduced to 0.02% and 0.01%, respectively. The performance of deep conditioning does not change. While such rebinning can help reduce the noise of the covariance matrix, a coarser binning of the power spectrum can result in wider parameter constraints.
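The benefit of reducing the number of bins relative to N_real can be seen in a toy example: a sample covariance built from fewer realisations than bins is necessarily rank-deficient, so some eigenvalues are consistent with zero and can scatter negative once additional noisy terms enter the estimate. A minimal illustration of the rank argument, with hypothetical bin counts:

```python
import numpy as np

rng = np.random.default_rng(1)
n_bins = 60
ranks = []
for n_real in (30, 100):
    # Each row is one realisation of n_bins band powers
    x = rng.standard_normal((n_real, n_bins))
    cov = np.cov(x, rowvar=False)
    # Count eigenvalues well above numerical noise
    ranks.append(int(np.sum(np.linalg.eigvalsh(cov) > 1e-10)))
# A sample covariance from n_real mean-subtracted realisations has
# rank min(n_real - 1, n_bins): singular whenever n_real <= n_bins.
print(ranks)  # [29, 60]
```

Coarser binning lowers n_bins at fixed N_real and so moves the estimate away from this singular regime.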
Given the large fraction of negative eigenvalues of the unconditioned matrices, we exclude this scheme from further analysis.

Impact on Cosmological Parameters
While errors on the covariance are easy to measure, the real statistics of interest are the effects on the derived parameter constraints during model fitting. Our analysis focuses on the following key questions: Is the parameter uncertainty over- or under-estimated? Is the best-fit point biased? Are the derived constraints suboptimal, i.e. do the results improve appreciably if the covariance estimate is improved? Before pursuing these questions and reporting our results, we briefly describe our analysis methodology below.
We benchmark the effects on parameter fitting under the assumption of the ΛCDM model, using model power spectra calculated by camb (Lewis et al. 2000b). The six model parameters are: the amplitude of primordial density perturbations, A_s, and the tilt of the power spectrum, n_s, defined at a pivot scale of 0.05 Mpc⁻¹; the density of cold dark matter, Ω_c h²; the baryon density, Ω_b h²; the optical depth to reionisation, τ; and θ_MC, the approximation to the ratio of the sound horizon to the angular diameter distance used by camb.
To reduce the computational cost of fitting cosmological parameters on O(5000) simulations, we primarily focus on a single parameter: θ_MC, the approximation to the ratio of the sound horizon to the angular diameter distance. We fix the other cosmological parameters to the best-fit values of the signal-only spectrum obtained from combining all transfer-function simulations across the nine sets of simulations. We explore the range 1.034112 ≤ θ_MC ≤ 1.046912 around this fiducial θ_MC value, although in practice we filter out any likelihoods that peak close to the edge of the range, i.e. outside of the ∼7.5σ region of the least constraining simulation.
We investigate the statistics in the full 6-dimensional parameter space for a subset of 100 of the N_real = 30, σ_map = 10 µK-arcmin simulations. We use Cobaya (Torrado & Lewis 2021) to find the minimum of the likelihood for these realisations. Since the simulated surveys lack access to information from large angular scales, we fix the optical depth to reionisation, τ, to the input value. Using the sampler of Lewis & Bridle (2002), we run a full Markov Chain Monte Carlo for a single realisation using the fiducial band power covariance to obtain a fiducial parameter covariance matrix.
We calculate the joint likelihood of the TT, TE, and EE band powers for each simulation using a Gaussian approximation:

−2 ln L = Σ_{bb'} ∆C_b [C⁻¹]_{bb'} ∆C_{b'} ,

where ∆C_b is the difference between the simulation and model band powers and C_{bb'} is the covariance matrix. This procedure is carried out for each realisation using the diagonally-, moderately-, and deeply-conditioned covariance matrices, as well as the fiducial covariance. We set any negative eigenvalues of the covariance matrices to a large positive number, six orders of magnitude larger than the maximum eigenvalue, in order to guarantee positive-definiteness for parameter estimation. We analyse the likelihood curves of all simulations, identifying the peak of the likelihood as the best-fit point and obtaining the uncertainty of the parameter constraint from the curvature of the likelihood around this point. We filter the results for outliers as follows. First, we reject a realisation if the best-fit value of any conditioned matrix peaks towards the edge of the parameter range, as mentioned before. Furthermore, we perform sigma-clipping (5σ, five iterations) based on the best-fit values and parameter uncertainties. For all simulations and analysis cases >96% of realisations survive these cuts and we therefore do not expect this filtering to bias our results significantly.
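The likelihood evaluation and the eigenvalue guard described above can be sketched as follows. The 10⁶ factor follows the text; the function names are our own.

```python
import numpy as np

def clip_negative_eigenvalues(cov):
    """Guard against non-positive-definite covariance estimates by
    replacing negative eigenvalues with a large positive value (here
    1e6 times the largest eigenvalue, as in the text), which
    effectively removes those modes from the fit."""
    vals, vecs = np.linalg.eigh(cov)
    vals[vals < 0] = 1e6 * vals.max()
    return (vecs * vals) @ vecs.T

def minus_two_ln_like(data_bp, model_bp, cov):
    """Gaussian band power likelihood (up to a constant):
        -2 ln L = dC^T C^{-1} dC."""
    d = data_bp - model_bp
    return d @ np.linalg.solve(cov, d)
```

A down-weighted (huge-variance) mode contributes essentially nothing to −2 ln L, so the fit proceeds as if that direction carried no information.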
We compare parameter uncertainties first and investigate the bias afterwards.We highlight key results in detail below, and point the reader to Tables 1 and 2 in the Appendix for the full results.

Uncertainty
We inspect the uncertainty on parameter constraints for different conditioning schemes for the N_real = 30, σ_map = 10 µK-arcmin simulations and compare them to the fiducial covariance. Diagonal covariance matrices underestimate the parameter uncertainty by 5.8^{+2.1}_{−1.9}%. This underestimate is in part due to zeroing the off-diagonal entries in the covariance matrix (which are non-zero due to mode-coupling), but also due to the excessive noise on the main diagonals of the covariance. The latter point is confirmed by the performance of the moderately conditioned covariance matrices, which recover the off-diagonal entries but have the same noisy estimate of the diagonals. The moderately conditioned case underestimates the uncertainty by 3.5^{+2.7}_{−2.3}%. The deep conditioning scheme, which also reduces the uncertainty on the diagonal entries, recovers the fiducial error on θ_MC to within 0.3 ± 1.5%. These differences between the conditioning schemes are illustrated in the left panel of Figure 2. The uncertainty on θ_MC is artificially changed by the uncertainty in the covariance matrix for the diagonal and moderate conditioning cases, while the likelihood for deep conditioning closely traces the fiducial likelihood curve.
Increasing the number of realisations used in the covariance estimation will reduce the uncertainties (for a fixed number of band powers) and decrease their impact on the parameter uncertainties. The right panel of Figure 2 shows the trend with increasing N_real for the σ_map = 10 µK-arcmin simulations. As expected, the uncertainties on θ_MC converge to the fiducial uncertainty for all conditioning schemes as the number of realisations, N_real, increases. At N_real = 100, moderate conditioning recovers the fiducial uncertainty well, although diagonal conditioning still underestimates the uncertainty by 2.4^{+1.5}_{−1.2}% due to zeroing the off-diagonal entries.
By considering the number of noise-variance dominated bins estimated, we can understand how these results change for: different map-noise levels; different band power bin widths; and subsets of the TT, TE, and EE power spectra. Lowering the map-noise level increases the angular multipole range over which the well-measured sample-variance part of the covariance dominates. This boosts the performance of the conditioning schemes; for example, for the N_real = 30, σ_map = 2 µK-arcmin simulations, the uncertainty underestimation of moderate conditioning is reduced from 3.5^{+2.7}_{−2.3}% to 0.5^{+1.4}_{−1.2}%. Similarly, when doubling the bin size from ∆ℓ = 50 to ∆ℓ = 100, moderate conditioning recovers the correct parameter uncertainty to 0.1^{+1.2}_{−1.3}% (at σ_map = 10 µK-arcmin). Since the temperature power spectrum is sample-variance dominated over the largest angular multipole range, we observe more robust constraints from TT-only data: moderate conditioning yields the correct parameter uncertainty at 1.3^{+2.7}_{−2.5}%. The improvement in each case is owed to the reduction in the number of covariance elements estimated from the limited data realisations.

Figure 2 (right panel). We plot the median of the ratio of the θ_MC uncertainty inferred from the data likelihood, σ_L, from diagonal (green), moderate (blue), and deep conditioning (red), to that from the fiducial covariance matrix, σ_0, across all N_real, σ_map = 10 µK-arcmin simulations. The wide and thin markers indicate the 68% and 95% intervals, respectively. While shallow conditioning schemes tend to underestimate the parameter uncertainty, increasing the number of independent realisations N_real ameliorates this issue.

Bias
We next turn to evaluating the level of bias in the recovered parameter constraints due to the noisy covariance estimate.
We define the bias ∆θ as the median difference of the best-fit values of a given conditioning scheme compared to the best-fit value of the fiducial covariance matrix. We also measure the additional scatter (or noise) introduced in the best-fit values by the noisy covariance. We quote both quantities relative to the expected uncertainty, by dividing the measured bias or scatter by the expected uncertainty from a Fisher forecast, σ_F. We use the median and σ_b, defined as half the central 68% interval, to describe the two key quantities of interest, the bias and the extra noise on the best-fit point, respectively. The bias is a global shift of the parameter constraint and depends on the specific point in parameter space being explored. In theory, this can be corrected for using a set of simulations. The extra noise, on the other hand, can be understood as a random shift of the best-fit value, which is unique to each individual realisation. The additional scatter is not captured by the data likelihood and is independent of the change to the curvature of the likelihood explored in §4.2.1. This effect needs to be accounted for explicitly, through a modification of the likelihood or a similar procedure.
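The two summary statistics, the median bias and the half-width σ_b of the central 68% interval, can be computed as below. This is a sketch with hypothetical array inputs, not the analysis pipeline itself.

```python
import numpy as np

def bias_and_scatter(best_fits, fiducial_best_fits, sigma_fisher):
    """Summarise the effect of a noisy covariance on the best-fit
    point: the median offset (bias) and half the central 68%
    interval (extra scatter), both in units of the Fisher error."""
    delta = (best_fits - fiducial_best_fits) / sigma_fisher
    bias = np.median(delta)
    lo, hi = np.percentile(delta, [16, 84])
    sigma_b = 0.5 * (hi - lo)
    return bias, sigma_b
```

For a Gaussian distribution of shifts, σ_b reduces to the usual standard deviation; the percentile definition is robust to the heavy tails seen at low N_real.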
Scaled bias distributions are shown in Figure 3 for the σ_map = 10 µK-arcmin simulations. For N_real = 30, diagonal and moderate conditioning lead to a sizeable scatter of parameter constraints, with σ_b/σ_F = 0.44 and 0.47, respectively. There is little difference in the best-fit values obtained from these two schemes, which indicates that this extra noise is driven by the uncertainty on the main diagonal elements of the covariance matrix estimates. With medians of 0.16 and 0.17 for diagonal and moderate conditioning, respectively, we detect a bias at ∼10σ. However, this bias is drowned out by the random scatter of the best-fit point. Deep conditioning does not admit a sizeable bias or significant extra noise. The median of the distribution is consistent with zero and σ_b/σ_F = 0.11.
We do not expect to reach zero scatter in the best-fit point, due to residual noise in the fiducial covariance. We quantify this lower limit by running an additional set of simulations with the same settings and benchmarking the performance of the associated new fiducial covariance using the original simulations. We find that the half-width of the 68% interval of the scaled bias distribution spans 0.07. This is over half of the level of scatter observed for deep conditioning quoted above, indicating that the random shifts admitted by the conditioning scheme are in practice appreciably lower and typically below the 0.1σ_F level.

Figure 3. Noise in the covariance matrix leads to additional scatter in the parameter constraints. We show the probability density function (PDF) of the ratio of the parameter bias to the fiducial uncertainty, ∆θ/σ_F, for σ_map = 10 µK-arcmin simulations for diagonal (green), moderate (blue), and deep conditioning (red). With increasing vertical offset we show the N_real = 30, 50, and 100 simulations and indicate the 68% and 95% intervals in faint colour under the curves. Diagonal and moderate conditioning lead to a substantial scatter in the recovered parameters, as shown by the broad 68% and 95% intervals. This extra noise is not accounted for in the data likelihood and is independent of the change to the curvature of the likelihood. Increasing N_real reduces the uncertainty on the covariance estimate and leads to narrower distributions, i.e. the scatter in the parameter constraints is decreased. Deep conditioning performs almost as well as the fiducial covariance matrix, even for N_real = 30. Moreover, with increasing N_real the distributions approach a Gaussian, with KS-test p-values in the 95% interval for all conditioning schemes given N_real = 100. The PDFs shown here are estimated by computing a histogram of the scaled bias and smoothing it with a Gaussian kernel.

Figure 3 also allows us to observe the evolution of the distributions with N_real. As expected, the distributions become tighter as more realisations become available, i.e. the 68% and 95% intervals shrink, and the median values move closer towards zero as the covariance estimate is improved. Practically, this means that the scatter in the parameter constraints, σ_b, shrinks and the uncertainty approaches the fiducial value. We focus on the evolution of the extra noise with N_real in Figure 4, which shows this trend more clearly. However, even at N_real = 100, diagonal and moderate conditioning both still lead to non-negligible scatter in the best-fit parameters, with σ_b/σ_F = 0.19 and 0.21, respectively. We detect no significant change in the performance of deep conditioning with N_real.
The extra noise dominates over the change to the curvature of the likelihood seen in §4.2.1 for most simulations tested. We combine the two effects by adding the standard deviation of the bias distributions to the parameter uncertainties measured in §4.2.1 in quadrature. Generally, the total effect is an increase in the parameter uncertainty compared to the fiducial case. For the N_real = 30, σ_map = 10 µK-arcmin simulations, we find the uncertainty on the recovered best-fit parameter to be a factor of 1.3 larger than the apparent posterior width from the likelihood for diagonal and moderate conditioning. This corresponds to an increase by a factor of 1.2 and 1.3, respectively, compared to when the true covariance is used. Diagonal conditioning leads to a total uncertainty closer to unity because the scatter of the best-fit point is offset by the larger uncertainty underestimation from the curvature of the likelihood. This balance changes with increasing N_real: the extra noise shrinks, but the likelihood curve obtained from diagonal conditioning remains narrower than the fiducial case, leading to a total underestimation of the parameter uncertainty by 2.4^{+1.5}_{−1.2}%. For N_real ≥ 100 the latter is no longer an issue for moderate conditioning, and as the scatter of the best-fit point is reduced the total uncertainty converges to the fiducial value. Deep conditioning recovers the fiducial uncertainty to better than 2% in all cases tested.
We expect that the effect of noise in the covariance on the parameter constraints will scale as the ratio of the number of bins estimated to the number of realisations (Dodelson & Schneider 2013b). For a fixed number of realisations, reducing the number of bins estimated will also reduce the additional uncertainty incurred due to the covariance estimate for all conditioning schemes. We have demonstrated this on all three axes: changing the map-noise level (varying the number of band powers dominated by noise variance); changing the number of band power bins; and reducing the number of spectra from TT, TE, and EE to TT-only. For instance, for N_real = 30 and moderate conditioning, lowering the noise

Figure 4. We show here the ratio of the additional scatter to the fiducial parameter uncertainty for simulations with σ_map = 10 µK-arcmin, parametrised by σ_b, the half-width of the 68% interval of ∆θ/σ_F. The wide and thin markers indicate the 1-2σ range based on the number of samples for diagonal (green), moderate (blue), and deep conditioning (red), respectively. The dotted lines show an inverse power-law fit with slope N_real^{−1} and the semi-transparent region corresponds to the 1σ region around it. Due to the small dataset, these fits only serve to guide the eye and are not statistically meaningful. The additional scatter in parameter constraints is reduced as the covariance estimate improves with higher N_real. We do not expect to be able to reach zero extra noise due to residual noise in the fiducial covariance matrix.
While the tests so far have focused on single-parameter fits, we are generally interested in the size of the additional scatter due to noise in the covariance matrix in multiple-parameter fits, for instance to the full ΛCDM model. We use the best-fit points of a subset of 100 N_real = 30, σ_map = 10 µK-arcmin simulations and look at the additional scatter scaled by the fiducial parameter covariance matrix described in §4.2. We find that all cosmological parameters show a similar level of additional scatter in the marginalised 1D posteriors. Taking the geometric mean across the five dimensions, the extra scatter with diagonal, moderate, and deep conditioning is σ_b/σ_0 = 0.44, 0.50, and 0.11, respectively, statistically consistent with the results of 0.44, 0.47, and 0.11 in the single-parameter fits. Note, however, that this scatter applies to each individual axis in the parameter space, so if one looks at the distribution of the distance between the true and recovered best fit across N dimensions, the width expands by up to a factor of √N. We conclude that the estimates of the additional scatter due to noise in the covariance estimate in §4.2 and Fig. 4 are applicable to the marginalised 1D posterior constraints of each parameter in the full ΛCDM model.
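The √N widening of the best-fit distance can be illustrated with an isotropic toy model. This assumes independent, equal per-axis scatter, which is a simplification (the real scatter is correlated across parameters); the numbers below are illustrative only.

```python
import numpy as np

rng = np.random.default_rng(2)
n_dim, n_sim = 5, 200000
sigma_b = 0.44  # per-axis scatter, in units of the fiducial uncertainty

# Isotropic toy model: each of the n_dim parameter axes receives an
# independent Gaussian shift of width sigma_b
shifts = sigma_b * rng.standard_normal((n_sim, n_dim))

# Euclidean distance between true and recovered best fit
dist = np.linalg.norm(shifts, axis=1)
rms = np.sqrt((dist ** 2).mean())
print(rms / sigma_b)  # ~ sqrt(5) ~ 2.236
```

The rms distance is exactly σ_b √n_dim for this isotropic case, which is the "up to a factor of √N" widening quoted above.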

CONCLUSION
In this work, we assess how effective conditioning noisy estimates of the covariance matrix is at producing robust parameter constraints. We benchmark the performance of four conditioning schemes imposing increasingly strong priors on the estimated covariance matrix: no conditioning, diagonal conditioning, moderate conditioning, and deep conditioning. We simulate TT+TE+EE power spectrum analyses spanning an angular multipole range of 300 ≤ ℓ ≤ 3500 with band powers of width ∆ℓ = 50, mimicking access to N real = 30, 50, or 100 data realisations and map-noise levels of σmap = 10, 6.4, and 2 µK-arcmin. We find that, for all cases tested, conditioning the covariance matrix is necessary to obtain robust parameter constraints. When examining the resulting parameter constraints, the largest effect of noise in the covariance estimate is a significant scatter in the best-fit points. Crucially, this additional uncertainty is not reflected in the reported parameter posteriors, from, e.g., an MCMC analysis of the data likelihood, and must be explicitly accounted for by some other means. Not accounting for this additional scatter can significantly underestimate the uncertainty on cosmological parameters. For N real = 30, σmap = 10 µK-arcmin simulations, we find that the parameter uncertainties are a factor of 1.3 larger than the apparent posterior width from the likelihood for diagonal and moderate conditioning. Importantly, the distribution of this additional scatter is significantly non-Gaussian. The stricter priors of deep conditioning eliminate the increase in uncertainty. Increasing the number of data splits used in the covariance estimate also improves the situation: with 100 data realisations, we report no increase in the parameter uncertainty for diagonal and moderate conditioning. In many cases, we expect it will be necessary to add an explicit correction for this additional scatter in order to accurately capture the parameter uncertainty. While the optimal approach may vary between analyses, we alert the reader to Sellentin & Heavens (2016) and Percival et al. (2021), who offer solutions in the Bayesian and frequentist inference frameworks.
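As an illustration of the Bayesian route, the Sellentin & Heavens (2016) prescription replaces the Gaussian likelihood with a multivariate t-distribution when the covariance is estimated from a finite number of realisations. A minimal sketch (function names are ours; normalisation constants are dropped):

```python
import numpy as np

def gaussian_loglike(residual, cov_inv):
    """Standard Gaussian chi-square log-likelihood (up to a constant)."""
    chi2 = residual @ cov_inv @ residual
    return -0.5 * chi2

def sellentin_heavens_loglike(residual, cov_inv, n_real):
    """Sellentin & Heavens (2016) log-likelihood (up to a constant):
    marginalising over the unknown true covariance, given an estimate
    built from n_real realisations, yields a multivariate t-distribution."""
    chi2 = residual @ cov_inv @ residual
    return -0.5 * n_real * np.log1p(chi2 / (n_real - 1))

# Toy example: 3 band powers, identity covariance, n_real = 30 realisations.
r = np.array([1.0, -0.5, 0.3])
c_inv = np.eye(3)
gauss = gaussian_loglike(r, c_inv)
sh = sellentin_heavens_loglike(r, c_inv, n_real=30)
# Near the peak the t-distribution is slightly steeper (factor n/(n-1));
# far out in the tails it is much shallower than the Gaussian.
```

The heavier tails of the t-distribution are what propagate the covariance-matrix uncertainty into broader, more honest parameter posteriors.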
We also report two other less significant phenomena: a distortion of the curvature of the likelihood and a bias of parameter constraints. For substantial noise in the covariance estimate (N real = 30, σmap = 10 µK-arcmin), diagonal and moderate conditioning schemes lead to an artificially tight likelihood; this translates into an underestimation of the parameter uncertainty by a few percent. More data realisations ameliorate this underestimate. However, for diagonal conditioning, which does not capture the full correlation structure of the data, the likelihood does not approach the fiducial curve, and parameter errors remain modestly too small, by 2.4 (+1.5/−1.2)%, at N real = 100. While we detect a small bias to parameter constraints induced by the residual noise in the covariance matrix, it is smaller than the additional scatter in all cases. Both the bias and the uncertainty underestimation approach zero as N real increases and the covariance estimate improves; the additional scatter is more significant than both effects at all times.
At a fixed number of data realisations, we expect the covariance matrix to become better constrained as the number of band powers decreases, as there are fewer matrix elements (with the same fractional error per element). In the tests in this work, where the sample variance contributions were estimated from a much larger number of simulations, this means the effects of uncertainty in the covariance matrix should also fall at lower map-noise levels, as well as when restricting the analysis to a subset of the TT, TE, and EE spectra, or when widening the band power angular multipole bins. We confirm that this intuition is correct and find that these changes indeed lead to a reduction in the additional scatter, the misestimation of the likelihood curvature, and the bias.
We are interested in expanding upon this work by exploring parameter constraints for a wider and finer grid of analysis parameters, including N real, σmap, the angular multipole range, the band power bin width, the number of transfer-function simulations, the inclusion of the lensing power spectrum, and the sky area observed. This extended set of simulations would allow us to provide scaling relations for the parameter uncertainty misestimation and bias for different conditioning schemes as these analysis parameters are varied. This would ultimately allow us to produce a practical guide for readers to determine what amount of conditioning is necessary for a given analysis. Additionally, we would like to explore further how parameter constraints behave in a multi-dimensional model space. Using interpolation approaches to obtain model CMB power spectra (e.g., Fendt & Wandelt 2007; Spurio Mancini et al. 2021) may make full explorations of the parameter posteriors computationally tractable. Current-generation ground-based CMB experiments have the potential to make exciting discoveries of physics beyond the standard model. Understanding and managing the sources of even small changes to parameter uncertainties and biases is crucial to producing reliable results.

Figure 1 .
Figure 1. Comparison of the eigenvalues of covariance matrices for different conditioning strategies: unconditioned (black), diagonal (green), moderate (blue), deep (red). We plot the ratio of the average of the absolute eigenvalues across all N real = 30, σmap = 10 µK-arcmin simulations, in increasing order, to the eigenvalues of the theoretical covariance matrix. The shaded regions correspond to the standard deviation around the mean. The unconditioned and mildly conditioned covariances recover the largest third and half of the eigenvalues well, respectively; beyond that, the eigenvalues drop off. The dotted grey lines indicate our threshold for acceptable deviations from the fiducial case, ≤ 20%. For the unconditioned covariances, 39% of values lie outside this range, whereas 14% of the eigenvalues of the diagonal and moderately conditioned matrices show large deviations from the expectation. Deep conditioning recovers the entire range of eigenvalues.
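The eigenvalue drop-off seen for unconditioned covariances has a simple origin that can be demonstrated in a toy setup (illustrative only; the bin count, correlation structure, and seed are ours, not the paper's simulation configuration):

```python
import numpy as np

rng = np.random.default_rng(1)
n_bins, n_real = 64, 30   # more band powers than realisations, as in the paper

# Toy "true" covariance: unit diagonal with nearest-neighbour correlations.
true_cov = np.eye(n_bins) + 0.2 * (np.eye(n_bins, k=1) + np.eye(n_bins, k=-1))

# Empirical covariance estimated from n_real Gaussian realisations.
chol = np.linalg.cholesky(true_cov)
draws = (chol @ rng.standard_normal((n_bins, n_real))).T   # (n_real, n_bins)
emp_cov = np.cov(draws, rowvar=False)

# With n_real < n_bins the empirical estimate is rank deficient: at most
# n_real - 1 eigenvalues are non-zero, so the small-eigenvalue end of the
# spectrum collapses unless prior information (conditioning) restores it.
eig_emp = np.sort(np.linalg.eigvalsh(emp_cov))
n_nonzero = int(np.sum(eig_emp > 1e-10))
```

The rank deficiency (n_nonzero = n_real − 1 here) is exactly the regime in which the smallest eigenvalues in Figure 1 cannot be recovered without conditioning.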

Figure 2 .
Figure 2. Left panel: We show stacked likelihood curves for N real = 30, σmap = 10 µK-arcmin simulations for diagonal (green), moderate (blue), and deep conditioning (red), and the fiducial covariance (black) in the top panel. The shaded region around the lines corresponds to the standard deviation across simulations. The curvatures of the stacked likelihoods from diagonal, moderate, and deep conditioning deviate by 13%, 8%, and 1% from the fiducial covariance, respectively, which translates into percent-level differences in the uncertainty on θ MC. The bottom panel shows the ratio of the different conditioning schemes to the stacked likelihood of the fiducial covariance. We mask the region |∆θ| ≤ 0.0003 in dark grey because of the numerical artefact of dividing by zero. Right panel: We plot the median of the ratio of the θ MC uncertainty inferred from the data likelihood, σ L, from diagonal (green), moderate (blue), and deep conditioning (red), to that from the fiducial covariance matrix, σ 0, across all N real, σmap = 10 µK-arcmin simulations. The wide and thin markers indicate the 68% and 95% intervals, respectively. While shallow conditioning schemes tend to underestimate the parameter uncertainty, increasing the number of independent realisations N real ameliorates this issue.

Figure 4 .
Figure 4. Errors on the covariance matrix introduce additional (and unaccounted for) scatter in the recovered parameters. We show here the ratio of the additional scatter to the fiducial parameter uncertainty for simulations with σmap = 10 µK-arcmin, parametrised by σ b, the half-width of the 68% interval of θ/σ F. The wide and thin markers indicate the 1−2σ range based on the number of samples for diagonal (green), moderate (blue), and deep conditioning (red), respectively. The dotted lines show an inverse power-law fit with slope N real^−1, and the semi-transparent region corresponds to the 1σ region around it. Due to the small dataset, these fits only serve to guide the eye and are not statistically meaningful. The additional scatter in the parameter constraints is reduced as the covariance estimate improves with higher N real. We do not expect to be able to reach zero extra noise due to residual noise in the fiducial covariance matrix.
Conditioning schemes vary drastically between analyses. For example, Polarbear Collaboration et al. (2014), Das et al. (2014), and Louis et al. (2017b) drop the off-diagonal elements of all blocks, i.e., ignore mode-coupling. Choi et al. (2020) allow for mode-coupling to the nearest angular multipole bin, but explicitly zero the effects of filtering and masking fur-
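Both styles of conditioning described here amount to masking the empirical correlation matrix at a chosen bandwidth. A minimal sketch (the function name and `keep_offdiag` parameter are ours, not from any of the cited pipelines):

```python
import numpy as np

def condition_covariance(cov, keep_offdiag=0):
    """Zero correlations beyond `keep_offdiag` bins from the diagonal.

    keep_offdiag=0 drops all off-diagonal elements, i.e. ignores
    mode-coupling entirely (the Polarbear/Das/Louis-style treatment above);
    keep_offdiag=1 keeps mode-coupling to the nearest band-power bin only
    (the Choi-et-al-style treatment above)."""
    sig = np.sqrt(np.diag(cov))
    corr = cov / np.outer(sig, sig)            # covariance -> correlation
    i, j = np.indices(cov.shape)
    corr[np.abs(i - j) > keep_offdiag] = 0.0   # mask distant correlations
    return corr * np.outer(sig, sig)           # back to covariance units

# Toy 3x3 band-power covariance.
cov = np.array([[4.0, 1.0, 0.5],
                [1.0, 3.0, 0.8],
                [0.5, 0.8, 2.0]])
diag_only = condition_covariance(cov, keep_offdiag=0)
nearest = condition_covariance(cov, keep_offdiag=1)
```

Masking in correlation space, rather than directly in covariance space, keeps the variance estimates on the diagonal untouched while discarding only the noisy correlation structure.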