Impact of Galactic dust non-Gaussianity on searches for B-modes from inflation

A key challenge in the search for primordial B-modes is the presence of polarized Galactic foregrounds, especially thermal dust emission. Power-spectrum-based analysis methods generally assume the foregrounds to be Gaussian random fields when constructing a likelihood and computing the covariance matrix. In this paper, we investigate how non-Gaussianity in the dust field instead affects CMB and foreground parameter inference in the context of inflationary B-mode searches, capturing this effect via modifications to the dust power-spectrum covariance matrix. For upcoming experiments such as the Simons Observatory, we find no dependence of the tensor-to-scalar ratio uncertainty $\sigma(r)$ on the degree of dust non-Gaussianity or the nature of the dust covariance matrix. We provide an explanation of this result, noting that when frequency decorrelation is negligible, dust in mid-frequency channels is cleaned using high-frequency data in a way that is independent of the spatial statistics of dust. We show that our results hold also for non-zero levels of frequency decorrelation that are compatible with existing data. We find, however, that neglecting the impact of dust non-Gaussianity in the covariance matrix can lead to inaccuracies in goodness-of-fit metrics. Care must thus be taken when using such metrics to test B-mode spectra and models, although we show that any such problems can be mitigated by using only cleaned spectrum combinations when computing goodness-of-fit statistics.


INTRODUCTION
The cosmic microwave background (CMB) was emitted when the Universe was approximately 380,000 years old.The temperature and polarisation fluctuations present in the CMB have informed our understanding of the origin of the Universe.In particular, in the standard cosmological model these primordial density fluctuations were produced by quantum fluctuations in the inflationary early Universe.The simplest inflation models make a key prediction that has not yet been confirmed: the production of a primordial gravitational wave background (Starobinsky 1979;Abbott & Wise 1984).This background would induce a unique B-mode polarisation pattern in the CMB (Kamionkowski et al. 1997;Seljak 1997), which is an important signature of primordial tensor fluctuations.The amplitude of these fluctuations is parametrized by the tensor-to-scalar ratio r; the tightest constraint to date sets r < 0.032 at 95 % confidence (Tristram et al. 2022).Although r could in principle be arbitrarily small, broad classes of well-motivated theories (e.g.Starobinsky 1979) predict a value for r that would be within the observable range of future ⋆ E-mail: ia404@cam.ac.ukCMB experiments, given current constraints on the scalar spectral index (see Abazajian et al. 2016).
In this context, there are and will be multiple CMB experiments pursuing the search for the faint primordial B-mode signal, such as Planck (Planck Collaboration et al. 2020c), the BICEP2 and Keck array telescope (Buder et al. 2014), CLASS (Essinger-Hileman et al. 2014), SPT-3G (Benson et al. 2014), the POLARBEAR-2 and Simons Array experiments (Suzuki et al. 2016), the Simons Observatory (SO, Simons Observatory Collaboration et al. 2019), the CMB-S4 experiment (Abitbol et al. 2017;Barron et al. 2022), and Lite-BIRD (Hazumi et al. 2019;LiteBIRD Collaboration et al. 2023).In particular, SO will deploy a number of Small Aperture Telescopes (SATs) targeted at measuring large-scale B-modes to constrain r at a level of σ(r = 0) ≲ 0.003 (Wolz et al. 2023), using a effective sky fraction fsky ∼ 0.1 with a white noise level of around 2 µK-arcmin.
Unfortunately, the observed B-mode signal is contaminated by Galactic foreground emission.Polarized synchrotron emission sourced by the motion of charged particles in the Galactic magnetic field (GMF) overwhelms the B-mode signal at low (ν ≲ 70 GHz) frequencies (see e.g.Page et al. 2007).In addition to this, thermal dust emission from our own Galaxy is dominant at frequencies ν ≳ 70 GHz (see e.g.Planck Collaboration et al. 2016d).Dust grains, which are aligned with the GMF, are heated by interstellar radiation and subsequently re-emit partially polarized light that traces the GMF structure (Draine & Fraisse 2009).Dust is the most important foreground at most frequencies relevant for B-mode searches.
In this paper, we focus on the impact of Galactic dust non-Gaussianity on C ℓ -based methods.These methods perform a likelihood analysis of the power-spectrum signal, which includes all BB auto-and cross-power spectra at a set of frequencies.Crucially, they require an estimation of the power-spectrum covariance matrix.At this stage, most works usually assume that all sky components can be described by statistically isotropic Gaussian random fields at the map level.In the full sky, and assuming statistical isotropy, this would give rise to a purely diagonal covariance matrix, directly related to the power spectrum of the different components.Off-diagonal elements coupling different power-spectrum multipoles would then only be caused by masking and filtering of the data (Knox 1997;Efstathiou 2004).In practice, however, Galactic foregrounds are both statistically anisotropic and exhibit rather non-Gaussian features, both in temperature (e.g.Ben-David et al. 2015;Jung et al. 2018) and in polarisation (e.g. von Hausegger et al. 2019;Coulton & Spergel 2019), which inevitably leads to couplings between different scales.Given the magnetised and turbulent nature of the diffuse interstellar medium (ISM), which gives rise to highly non-linear motions in the ISM fluid, these couplings are indeed expected.At the map-level, they give rise to structures such as filaments.At the power-spectrum level, they introduce couplings between different ℓ-bins in the power-spectrum covariance, which itself is no longer completely determined by just the multi-component power spectra.Although the problem of foreground non-Gaussianity has been briefly discussed in the literature (e.g.The Keck Array and BICEP2 Collaborations et al. 2021, hereafter BK18), an in-depth study on how dust non-Gaussianities affect the power-spectrum covariance, and the full impact this has on CMB and foreground parameter inference is still outstanding.
Traditionally, component separation studies have made use of simulations based on observed sky templates.Most of these templates rely on polarisation measurements that lack sensitivity on small scales.This issue is usually circumvented by filtering out these scales, and filling them in with Gaussian realizations of a given target power spectrum that follows the trend measured on large scales (see e.g.Hervías-Caimapo et al. 2016;Thorne et al. 2017).As a result, works using these models do not account correctly for non-Gaussianity on small scales.In the last few years, several groups have produced foreground simulations including non-Gaussianities on small scales.These include magnetohydrodynamic simulations (e.g.Kritsuk et al. 2018;Kim et al. 2019), phenomenological models based on temperature observations using a superposition of Galactic dust layers (e.g.Vansyngel et al. 2017;Martínez-Solaeche et al. 2018), models built from wavelet scattering transform statistics (Allys et al. 2019;Regaldo-Saint Blancard et al. 2020), dust models based on neutral hydrogen (Clark & Hensley 2019), and filament-based ones (Huffenberger et al. 2020;Hervías-Caimapo & Huffenberger 2022).Similar studies have also been carried out in the context of Galactic synchrotron (e.g.Wang et al. 2020;Martire et al. 2022Martire et al. , 2023)).BK18 explored some of these non-Gaussian foreground models.They used single dust realizations from Kritsuk et al. (2018); Martínez-Solaeche et al. (2018);Vansyngel et al. (2017), which they added to their modelled signal.However, with the analysis of only a limited number of realizations, it is unclear how reliably and precisely one can quantify the effect that foreground non-Gaussianity has on the C ℓ covariance matrix and on the inference of r.
The paper is organized as follows.We present the general methodology of C ℓ -based analyses in Section 2, with a particular focus on how to incorporate dust non-Gaussianity into the covariance matrix (2.4).In Section 3, we quantify the impact that dust non-Gaussianity has on parameter constraints, including in the presence of frequency decorrelation.We study the effect on parameter biases and variances, as well as goodness-of-fit metrics.We summarise our results and conclude in Section 4.

Parameter inference
We focus on multi-frequency power-spectrum-based analyses.Here, the data vector is composed of all BB auto-and cross-power spectra between all frequency maps.We use Monte Carlo Bayesian analysis, as implemented in emcee (Foreman-Mackey et al. 2013) to obtain posterior distributions of our sky model parameters (Section 2.2) given the mock data vector we generated (Section 2.3).This analysis pipeline has already been used in works such as Azzoni et al. (2021); Wolz et al. (2023).
Often, C ℓ -based analyses constraining primordial B-modes assume a Gaussian likelihood within the Bayesian analysis.This implicitly assumes that each sky component can be summarised by its power spectrum and its covariance (ignoring measurement noise and systematics).While this is certainly the case for the CMB, one also makes these assumptions for the foregrounds because each bandpower is an average over approximately ∆ℓ × (2ℓeff + 1) modes, (neglecting correlations due to, e.g., masking), where ℓeff is the effective ℓ value at each bin of size ∆ℓ.Thanks to the high number of modes, the central limit theorem applies and one can assume that each C ℓ follows a Gaussian distribution.However, this bandpower averaging does not remove the impact of non-Gaussian foregrounds, as the bandpower covariance matrix, with elements Cov(C ℓ , C ℓ ′ ), includes contributions from the connected trispectrum of the non-Gaussian foregrounds1 .
We assess the impact of the non-Gaussian dust foreground by comparing two separate runs of the parameter inference.For these we use two different models for the covariance matrix, keeping the data vector and the sky model fixed.In the first case, we assume a Gaussian dust foreground.In the second case, we describe the anisotropic and non-Gaussian nature of the dust sky by modelling the covariance matrix accordingly.We denote these covariances as ΣG and ΣNG, respectively.In Section 2.4, we describe how they are computed.
We run our analyses with the Hamimeche & Lewis (2008, HL) likelihood.It accounts for the non-Gaussian form of the likelihood on large scales, where there are fewer modes to average over.On the scales we are interested in (30 ≤ ℓ ≤ 300), both the Gaussian and HL likelihoods yield the same results (Wolz et al. 2023).We explore the posterior distribution by Markov-Chain Monte Carlo sampling using 128 walkers and 10 4 steps, and compute the integrated autocorrelation time (Goodman & Weare 2010) to assess convergence and ensure we have a sufficient number of independent samples.We remove the initial 25 % of the chains ("burn-in") to perform inferences.We present our results in Section 3.

Sky model
To create the mock data vector in this work, we use the same sky model described in Abitbol et al. (2021), which follows the fiducial BICEP2/Keck model (BK18; BICEP2 Collaboration et al. 2018;BICEP2/Keck Collaboration et al. 2015).We assume three components c for the sky signal: dust (d), synchrotron (s), and CMB.The foregrounds (d, s) auto-spectra are parametrized as a power-law: where Ac is the amplitude, αc the tilt, and ℓ0 = 80 the pivot frequency.The model also allows for a cross-correlation between dust and synchrotron, via a scale-independent parameter ϵ ds .The CMB B-mode power spectrum is modelled as where C lens ℓ and C tens,r=1 ℓ are theoretical predictions obtained with camb (Lewis & Challinor 2011) for the B-mode power spectrum produced by gravitational lensing and primordial tensor perturbations with r = 1, respectively.Treating A lens as a free parameter allows us to consider the need to marginalise over uncertainties in the residual lensing B-modes if delensing is present (A lens = 1 recovers the case without delensing), which will be of importance to improve r constraints (see e.g.Seljak & Hirata 2004;Sherwin & Schmittfull 2015;Namikawa et al. 2022).
The multi-frequency power spectrum is then given by where S ν c is the spectral energy distribution (SED) of component c at frequency ν in thermodynamic temperature units.We model the SEDs in the same manner as Abitbol et al. (2021): • CMB.For the CMB, the SED is a constant by definition: (5) • Synchrotron.We model the synchrotron spectrum as a simple power-law (see e.g.Planck Collaboration et al. 2020b): where βs is the synchrotron spectral index and ν s 0 = 23 GHz is the pivot frequency.g(xν ) is the conversion factor between Rayleigh-Jeans brightness temperature units (commonly used for foregrounds) Table 1.Priors used in the analysis, based on those used in Azzoni et al. (2021).The power-spectrum amplitude for tensor modes r is allowed to be negative to detect possible biases.

Parameter
Prior Bounds r Top-hat and thermodynamic temperature units (Planck Collaboration et al. 2014): where ΘCMB = 2.7255 K is the CMB temperature (Fixsen 2009).
• Dust.Thermal dust emission is characterised by a modified black-body spectrum (see e.g.Planck Collaboration et al. 2020b): where β d is the dust spectral index, ν d 0 = 353 GHz is the pivot frequency, and Bν (Θ) is the Planck black-body spectrum, which expresses the spectral radiance of a black-body for frequency ν at temperature Θ: and where k and h are the Boltzmann and Planck constants, respectively.The priors for all model parameters presented hereto are based on table 1 of Azzoni et al. (2021), which we reproduce in Table 1 for completeness.
The simple frequency dependence of equation ( 4) assumes that all components are perfectly correlated across frequencies.However, we expect Galactic foregrounds to present some level of frequency decorrelation, caused by variations in their spectral energy distribution across the sky and along the line of sight (see e.g.Pelgrims et al. 2021;Ritacco et al. 2023).To parametrize this decorrelation we will make use of a simple scale-independent dust decorrelation parameter, ∆ d , quantifying the map-level correlation coefficient between dust at 353 and 217 GHz (Planck Collaboration et al. 2016a).Specifically, it is defined as the ratio of the cross-spectrum between those two maps and the geometric mean of the corresponding autospectra (see equation 1 in Planck Collaboration et al. 2017).Even though the physically meaningful range for ∆ d is ∆ d ≤ 1 (BI-CEP2 Collaboration et al. 2018), we allow for ∆ d values greater than 1 to check for possible biases.We will also assess the impact of frequency decorrelation through the minimal moment expansion method of Azzoni et al. (2021), which is based on the "momentsbased" expansion (Chluba et al. 2017) (30,300), binned with ∆ℓ = 30, and is the same across all panels.The first panel corresponds to the covariance matrix of our initial Gaussian dust field, Σ G , which already incorporates masking effects: the covariance is essentially diagonal with the exception of small off-diagonal terms close to the diagonal that appear due to the mask.The second panel corresponds to a Gaussian dust field modulated by a large-scale template (Section 2.4.1).One can see that the large-scale modulation moves power from the diagonal to the off-diagonal terms, thus further coupling neighbouring ℓ-bins.The third panel corresponds to the covariance of a filamentary small-scale dust population obtained from 200 realizations of the dustfilaments (Hervías-Caimapo & Huffenberger 2022) model.The net effect of this population is to couple small scales to large scales.The last panel refers to the covariance matrix Σ NG , which contains both large and small scale effects.We use it to capture the full anisotropic non-Gaussian nature of the dust field.
the foreground spectral indices, δβc, as Gaussian random fields with a power spectrum of the form and associated frequency dependence (Azzoni et al. 2021) Even though Bs, B d correspond to auto-spectra amplitudes (see e.g.Mangilli et al. 2021;Vacher et al. 2022), we do not enforce positivity in their priors.This is to avoid a bias in the marginalised posterior distribution of r, which is found when data with no spectral index variations are analysed with the moments expansion and a positive prior Bc ≥ 0 (Azzoni et al. 2021).In summary, our baseline sky model has 9 free parameters.The number of parameters is extended to 10 as we include decorrelation in the dust component, or to 13 for the moments expansion in both dust and synchrotron.

Mock data
We generate mock multi-frequency power-spectrum data on which to run our analysis by evaluating the baseline model we described in the previous section (2.2) with the following parameter values.
We take the Simons Observatory experiment, as described in Ade et al. (2019), as our baseline instrument model.We generate multifrequency power spectra using the sky model above assuming six 2 See, however, nuances on A d and α d in Section 2.4.2.SO frequency bandpasses, centred around 27 and 39 GHz (lowfrequency bands, LF), 93 and 145 GHz (mid-frequency bands, MF), and 225 and 280 GHz (ultra-high-frequency bands, UHF).This leads to a total of 21 unique multi-frequency power spectra C νν ′ ℓ .
To incorporate the contribution from instrumental noise to the covariance matrix, we include the noise auto-spectrum at each frequency, assuming no correlation of the noise between different frequencies.For this we follow Ade et al. (2019) and model the SO SATs' noise spectrum as: where the first term accounts for white noise and the second for red 1/f noise, most prominent on large scales, and caused by atmospheric contributions and timestream filtering.This noise model is rather simplistic, and does not incorporate features associated with the specific scanning strategy, such as inhomogeneity or anisotropy.
Although Wolz et al. (2023) find that noise inhomogeneity may lead to an increase in σ(r) of up to ∼ O(30 %), this level of precision is sufficient for the purposes of quantifying the impact of foreground non-Gaussianity.We use values of N ν white and α ν knee corresponding to the "goal-optimistic" scenario, specified in tables 1 and 2 of Ade et al. (2019).This corresponds to the most optimistic noise model envisaged by SO.
We apply our results to BB-power-spectrum measurements on the SO patch, which we take to be our baseline analysis.We use the SAT footprint released in Ade et al. (2019) as our sky mask.Its edges are apodized on 5 • scales using the "C1" apodization method (Grain et al. 2009) as implemented in namaster3 (Alonso et al. 2019), giving an effective sky fraction of fsky = 0.1.We only use data in the range 30 ≤ ℓ ≤ 300, binned into ∆ℓ = 30 (9 bins in total).We choose a somewhat broad ℓ-binning to ensure that the covariance converges given the number of dust filament realizations we have (see Section 2.4.2), but we check that our results are robust against a finer ∆ℓ = 15 binning.

Dust non-Gaussian covariance
As we describe in Section 2.1, we capture the effect of the non-Gaussian dust foreground in the covariance matrix ΣNG.This is then compared to the results obtained assuming a Gaussian dust foreground as per ΣG.We first sketch how we compute the Gaussian covariance matrix ΣG.Detailed calculations of all results obtained in this section are described in Appendix A.
We start with a Gaussian field δ(n) 4 , which is modulated by the sky mask w(n): and define the corresponding pseudo-C ℓ , Cd G , for which one has where C δ ℓ is the power spectrum of δ(n), and we have defined the C ℓ -level mode-coupling matrix M w ℓℓ ′ , which can be computed from the sky mask alone (Hivon et al. 2002;Alonso et al. 2019).From equation ( 14), one can see that the effect of the sky mask is to couple different ℓ-modes.
We calculate the covariance of the power spectrum of the Gaussian field via its 4-point function.We assume that the sky mask is smooth, the mode-coupling coefficients are peaked around ℓ = ℓ ′ and that C δ ℓ is slowly varying.The final expression is: where Cδ ℓ 12 denotes the geometric mean of C δ ℓ 1 and C δ ℓ 2 , and ⟨• • • ⟩Ω denotes sky averaging.M w 2 ℓ 1 ℓ 2 is now the mode-coupling matrix for the sky mask squared, w 2 (n), which we compute analytically with namaster (Alonso et al. 2019).Under the given approximations, the ⟨w 2 ⟩Ω term serves as the conversion factor between Cd G ℓ and C δ ℓ .Equation ( 15) is essentially the Knox formula (Knox 1995(Knox , 1997) ) modified to properly include mask-coupling effects (Efstathiou 2004;García-García et al. 2019).We show in the first panel of Figure 1 the correlation matrix for the power-spectrum covariance of a Gaussian dust field at 280 GHz computed in this way.The covariance ΣG is essentially diagonal with the exception of small coupling terms close to the diagonal that appear due to the mask.
However, as we already discussed, the nature of foregrounds is anisotropic and non-Gaussian, and quantifying the impact of this on parameter inference is the main goal of this paper.Previous analyses (e.g.Azzoni et al. 2023) have assumed Gaussian foregrounds or have only briefly discussed the question of how non-Gaussian foregrounds impact r inference (BK18).Focusing on dust, we compare how the covariance matrix of dust changes as we introduce non-Gaussianity to an initially Gaussian isotropic dust field in two regimes: • Large-scale modulation.We introduce non-Gaussianity by modulating the Gaussian dust field by a large-scale template, which we take to be the Planck 353 GHz temperature map smoothed on small scales.This model was already proposed in Mak et al. (2017).The net effect of the modulation is to couple neighbouring bins, thus moving power from the diagonal to the nearby off-diagonal elements of the covariance.
• Small-scale dust filaments.We use dust simulations based on filaments (dustfilaments, Hervías-Caimapo & Huffenberger 2022) to estimate the covariance on small scales.These structures affect all scales and therefore introduce dominant off-diagonal elements away from the diagonal.

Large-scale modulating template
As found in Miville-Deschênes et al. ( 2007), the small-scale variance of the dust emission scales with the square of the large-scale intensity.This relation motivates a simple model to account for the statistically anisotropic nature of Galactic dust, as proposed in Mak et al. (2017).
In this model, we modify equation ( 13) so that the Gaussian isotropic dust field δ(n) is also modulated by a large-scale intensity field t(n): which enforces the results found empirically.There are multiple ways to estimate the template t(n) (Delabrouille et al. 2013;Remazeilles et al. 2015;Hervías-Caimapo et al. 2016;Thorne et al. 2017;Mak et al. 2017).In this work, we take as modulating template the 353 GHz temperature map from Planck (Planck Collaboration et al. 2020a), smoothed on 0.4 • (ℓ ≈ 400) scales.This scale is chosen to be well above the map noise limit, which we estimate using the FFP10 simulations (Planck Collaboration et al. 2020a).We run our analysis for different smoothing scale values in the range 0.1 • -40.0 • and check that our final results are robust against changes in the smoothing scale.The results we present in Section 3 correspond to a somewhat aggressive choice of smoothing scale (0.4 • ), where the non-Gaussian component is enhanced in comparison to the anisotropy that would be present in a map with stronger smoothing.After smoothing, the template is normalized to ensure that both the Gaussian dust field and the modulated one have the same power spectrum (see Appendix A1).As a result, the non-Gaussianity only enters at the level of the power-spectrum covariance.We show the modulating template on the SO patch in Figure 2.
In Appendix A, we derive how the dust power-spectrum covariance changes as we account for the modulating template.The idea is to capture the effect of t(n) as producing an effective sky mask w(n) ≡ w(n)t(n) and repeat the procedure used for the masked Gaussian field.The template normalization ensures that ⟨w 2 ⟩ 2 Ω = ⟨ w2 ⟩ 2 Ω .The new effective mask introduces new couplings between neighbouring ℓ-bins.This is in turn reflected in the covariance matrix of the dust field D where the mode-coupling matrix now captures both the effect of the mask and the template via the effective mask squared, w2 (n).We show how it compares to the Gaussian case in the first two panels of Figure 1.As can be seen in the second panel, the large-scale modulating template increases the coupling between neighbouring bins and introduces a larger variance with respect to the Gaussian case (first panel).

Small scales
The previous model assumes that the small-scale fluctuations are Gaussian distributed and modulated by a template.In practice, we know that this is a poor description for Galactic dust, which, for instance, exhibits a filamentary structure that cannot be reproduced from random Gaussian statistics.These filaments have been observed at millimetre wavelengths by Planck (e.g.Planck Collaboration et al. 2016b,c), among others, and also found to be traced by Galactic neutral hydrogen emission (Clark et al. 2014(Clark et al. , 2015;;Clark & Hensley 2019;Halal et al. 2023).To improve the modelling of small-scale fluctuations, we use the dust filament model dustfilaments5 (Hervías-Caimapo & Huffenberger 2022).Filaments and their misalignment with respect to the Galactic magnetic field can quantitatively reproduce the properties of Galactic dust as observed by Planck at 353 GHz (Planck Collaboration et al. 2016a, 2020d), such as the BB/EE power ratio or the dust BB-power-spectra powerlaw slope α d (Huffenberger et al. 2020).Based on this idea, dustfilaments constructs a full-sky simulation of the Galactic dust at millimetre frequencies by integrating the emission of millions of filaments in a predefined volume.By construction, the filament population is calibrated to match the Large Region 71 (LR71) Galactic mask in Planck Collaboration et al. (2020d), as described in Hervías-Caimapo & Huffenberger (2022).Since the polarization fraction changes depending on the portion of the sky considered, the dustfilaments model would produce a different amplitude of polarization signal if calculated in a different region than LR71, e.g. the SO patch (see section 5 of Hervías-Caimapo & Huffenberger 2022, for an extended discussion of this point).Consequently, we re-scale the final dustfilaments power spectra to match our chosen value for A d in the SO patch.Moreover, the slope α d in the simulations has the value found for the LR71 mask (α d = −0.54)by default.Our mock data therefore agree with this value, instead of the number previously quoted in Section 2.3.
The raw dustfilaments model does not reproduce the large-scale features in polarization.Before computing the covariance matrix from a set of 200 realizations, we employ a harmonic high-pass filter that removes the large-scale signal ℓ ≲ 50 from each map.The resulting dust power-spectrum covariance matrix on small scales is shown in the third panel of Figure 1.This covariance is sourced only by the filamentary small-scale dust population, which induces important correlations between small and large scales.This effectively adds off-diagonal terms to the power-spectrum covariance and also leads to larger small-scale variance than predicted by the Gaussian covariance.We also check that the power spectra from which the covariance is derived follow a Gaussian distribution once they are binned in bandpowers of width ∆ℓ = 30.For this, we make histograms of the distribution of power spectra across the 200 simulations for the lowest and highest bandpower.We check that they closely follow a Gaussian distribution with a mean given by the bandpower mean and a standard deviation derived from the diagonal of the covariance matrix.
Since we have filtered out the large scales, we cannot estimate the covariance matrix at low ℓ directly from the simulations.Consequently, we merge the covariance obtained from small scales with the one obtained in the previous section for large scales.The procedure is described in Appendix A3.It involves computing power spectra at every ℓ using the usual anafast healpix6 routine (Górski et al. 2005).We show the resulting correlation coefficient for the covariance matrix in the last panel of Figure 1.This final covariance matrix ΣNG is the one we use to represent the non-Gaussian anisotropic dust field.

Baseline analysis
We show in Figure 3 the 2-and 1-dimensional marginalised posterior distributions for the 9 theoretical model parameters.We compare the posterior distributions obtained when the likelihood incorporates the covariance matrix of a Gaussian dust field (ΣG, filled solid blue) with the contours obtained when assuming a non-Gaussian anisotropic dust field (ΣNG, unfilled dashed red).Note that the only difference between the two cases lies in the dust contribution to the covariance; the mock power-spectra remain the same in both runs regardless of the employed covariance, as discussed in Section 2.1.
Judging from this plot we can describe how dust non-Gaussianities impact C ℓ -based analyses on primordial B-modes.We mark in dotted gray the values used to generate the mock data, from which one can see that we obtain unbiased constraints on all parameters.This is expected since non-Gaussian dust only affects the covariance matrix in our analysis.More importantly, the uncertainties of both r and Alens are insensitive to the anisotropic and non-Gaussian nature of the dust covariance matrix.We recover σ(r) = 1.3 × 10 −3 , in agreement with previous forecasts focused on SO that used the same noise model (Ade et al. 2019;Azzoni et al. 2021;Wolz et al. 2023) 7 .We note that this result is also consistent with the BK18 test, which also found no effect on their r constraints when using dust simulations whose amplitude is modulated by degree-scale BB power measured on the Planck 353 GHz map.However, for the dust power-spectrum parameters A d and α d , we do see a difference in their posterior variance depending on the covariance matrix used.In the Gaussian case, we have A d = 28.08±0.93,α d = −0.542±0.035,as opposed to A d = 28.1 ± 1.2, α d = −0.543± 0.040 in the non-Gaussian case; the non-Gaussian uncertainties are larger by a factor 1.3 and 1.1, respectively.Therefore, using a Gaussian covariance to describe the statistics of dust leads to an underestimation of the dust spatial parameter uncertainties.
We also find the same qualitative behaviour in a more realistic case that includes delensing at the level expected for SO, Alens = 0.3 (Namikawa et al. 2022).Although the off-diagonal entries of the The key result is that σ(r) is unaffected, while the spatial dust parameter uncertainties are greater in the non-Gaussian case.When we use the covariance matrix capturing the non-Gaussian anisotropic nature of dust, we see an increase in σ(α d ) and σ(A d ) by a factor 1.1 and 1.3, respectively.Therefore, using a Gaussian covariance to describe the statistics of dust leads to an underestimation of the dust spatial parameter uncertainties.True mock data values are shown as grey dotted lines.We recover unbiased parameter constraints in all cases.This is expected since non-Gaussian dust only affects the covariance matrix in our analysis.
B-mode power-spectrum covariance will become somewhat more relevant after delensing, the lensing B-modes are typically small compared to the dust foreground power, so this result is not surprising.
In this paper, we do not investigate the impact of dust non-Gaussianity in the delensing tracer itself, as the results will be dependent on which tracer is used and the bias arising from non-Gaussian dust in delensing has already been investigated in other works (Beck et al. 2020;Baleato Lizancos et al. 2022).
We also check that this result holds for larger sky coverages (fsky ∼ 0.6) in a noise configuration without atmospheric noise (similar to a LiteBIRD-like experiment).In this case, the increase in sky fraction and decrease in noise levels improves the error on r, σ(r) = 5.0 × 10 −4 , for both ΣG and ΣNG.Remarkably, we observe the same behaviour in this configuration even when we make a large change to the dust covariance matrix, artificially boosting every offdiagonal dust covariance matrix element to a level corresponding to a (very large) 10 % correlation matrix: the error on r remains at σ(r) = 5.0 × 10 −4 , whereas now the errors on the spatial dust parameters A d and α d increase by a factor 2.8 and 1.3, respectively.We show the corresponding posteriors in Figure 4.
We also explore whether dust non-Gaussianities (incorporated via the power-spectrum covariance matrix) introduce biases in the re-covered parameter values.We generate 10 5 mock power-spectrum observations Ĉℓ by performing a multivariate Gaussian sampling of the non-Gaussian power-spectrum covariance matrix ΣNG.For each mock observation, we find the best-fitting sky model parameters.We find that the histogram of best-fitting values found for r and the spatial dust parameters are the same regardless of whether we use ΣNG or ΣG when evaluating the likelihood, and these are centred around the true model parameter values (Section 2.2).We thus expect dust non-Gaussianities to introduce no bias in our measurements.
We thus see that the presence of significant dust non-Gaussianity does not seem to affect the final constraints on r, and its impact is reduced to those parameters describing the spatial distribution of dust.Understanding this result is the goal of the next section.

Theoretical model for observed behaviour
To gain insight into the insensitivity of the constraints on r to dust non-Gaussianity, we employ a simple Fisher forecast to estimate parameter constraints analytically.We start by writing down a simple model (Efstathiou & Gratton 2019) for an experiment which measures only two frequencies: M (mid-frequency band, where the  Comparison between posterior distributions obtained with a Gaussian and non-Gaussian dust covariance matrix (shown in filled solid blue and unfilled dashed red, respectively).We only show the most relevant parameters included in the sky model.We employ a configuration similar to a LiteBIRD-like experiment and artificially boost Σ NG to a level corresponding to a (very large) 10 % correlation matrix.We find the same results as our baseline analysis: we recover the same unbiased parameter constraints in both (Σ G , Σ NG ) cases, with the exception of the dust spatial parameters (A d , α d ), whose variance increases.The larger sky fraction and decrease in noise levels improves the error on r, σ(r) = 5.0 × 10 −4 .primordial CMB signal is present) and H (high-frequency band, dominated by dust).Ignoring any other foregrounds and lensing for simplicity, we write the data vector at the power-spectrum level as: where we have included a noise signal N ν on each channel, which captures other terms we are neglecting, such as the lensing B-mode power in the mid-frequency channel or the B-mode tensor contribution in the high-frequency channel.This is a two-parameter model.f controls the ratio of the dust SED between both frequency channels, while r is the amplitude of the primordial BB-power spectrum.As before, we invoke the central limit theorem and write down a Gaussian likelihood for which the Fisher matrix elements take the form (see e.g.Heavens 2016): where Σ is the (diagonal) covariance matrix, µ is the mean of the data and where we have denoted partial derivatives with respect to model parameters as ",α".We have also ignored parameter dependences in the covariance matrix, as is appropriate (Carron 2013).
Assuming Gaussian fields, we calculate the (diagonal) covariance using the Wick's theorem (see e.g.Allison et al. 2015): where the proportionality factor includes terms dependent on ℓ and fsky.When inverting the covariance matrix, we neglect the contribution from tensor modes in the mid-frequency channel.We finally invert the Fisher matrix to obtain the marginalised errors on each parameter (see e.g.Verde 2007) -for the full expressions of these variances see Appendix B. We now take the limit N M ℓ ≪ C dd ℓ , as is justified for an experiment such as SO, where the noise in the mid-frequency channel is at most 1 % of the dust power spectrum in the high-frequency channel at 30 ≲ ℓ ≲ 150.The expressions for the variance per ℓ simplify to: where the proportionality factor is 1/ [fsky (2ℓ + 1)], as there are fsky (2ℓ + 1) modes per multiple ℓ.In the further limit N H ℓ /C dd ℓ ≪ 18 , the expression for σ(r) becomes: A direct evaluation of this expression for a SO-like configuration (fsky = 0.1, H referring to the 225 GHz ultra-high-frequency channel) gives σ(r) = 1.3 × 10 −3 at ℓ = 80, in agreement with the value found in our baseline MCMC analysis.Moreover, we explicitly see from equation ( 23) that σ(r) is independent of the dust power spectrum up to second order in N H ℓ /C dd ℓ .Moreover, as N H ℓ goes to zero, the error on r gets arbitrarily small thanks to sample variance cancellation (Schmittfull & Seljak 2018).This limit also results in a small error in the dust parameter f .Thus, we see that the error on r is unaffected by the statistical properties of the dust (making only reasonable assumptions about the levels of dust and noise).
The intuition for these findings is that when measuring the power spectra in the high-frequency channels, we are measuring the actual dust fluctuations.This means that one can simply subtract them without knowing the specific statistical realization of the dust, simply because the fluctuations are the same on both high-and mid-frequency channels (provided negligible levels of frequency decorrelation).For this to work, we need an estimate of f that is not limited by our knowledge of the dust statistics.This was proven in our forecast, for which the error on f tends to zero as the noise in the high-frequency channel becomes insignificant.Another way of understanding this result is that, with a good enough estimate of the dust frequency-scaling f , the MF channel (where the CMB signal is most prominent) can be cleaned from dust using the UHF channel (where dust is dominant) by a simple subtraction at the map level.The power-spectrum residuals in the MF map will scale as σ 2 ℓ (f ) (equation 22) times C dd ℓ , which is indeed independent of C dd ℓ itself.This map-level subtraction holds for whatever the dust realization may be (Gaussian or not).As a result, σ(r) is not affected.In fact, this technique has been implicitly used in most map-based component separation methods (de Belsunce et al. 2022(de Belsunce et al. , 2023;;Efstathiou et al. 2009;Efstathiou & Gratton 2019).We have now explicitly shown why these methods are immune to the statistics of the foreground fields; given this result, it appears that this operation at the map level is being "implicitly" performed in the C ℓ -likelihood.For the toy model at hand, where all parameters are linear, one can show that the power-spectrum likelihood indeed yields the same results as a map-based likelihood.Although r is unaffected, from equation ( 22) we can see that our ability to constrain the dust field properties is directly determined by our measurement of the dust field itself.Hence, the absence of a reliable method to quantify the level of foreground non-Gaussianity Figure 5. Histogram of PTE values obtained for 10 5 power-spectrum mock simulations drawn from Σ NG , then analysed with Σ NG (dashed line) and with Σ G (solid line).In the top panel, the PTE is obtained directly from the raw simulations with respect to the true model.In the bottom panel, we first perform a best fit against which we compare the mock data.Different colours correspond to the inclusion of different channels.With the exception of the low-frequency (LF) channel, which barely contains any dust, the PTE distributions are skewed towards low values.This would be interpreted as the model being unable to provide a good description of the data, whereas the actual problem is a mis-estimation of the data vector uncertainties.
directly impacts the error estimation on the spatial dust parameters A d and α d .

Impact on goodness-of-fit
We have seen that C ℓ -based foreground cleaning constraints on r are not impacted by a non-Gaussian, anisotropic dust field.However, we see that the uncertainties on the dust spatial parameters (A d , α d ) can become significantly underestimated in the Gaussian approximation.We now investigate how this impacts goodness-of-fit χ 2 values, which are common way to assess the suitability of the model being used in the likelihood (e.g.BK18; Wolz et al. 2023) or to search for systematic errors in the measurements.
We take the 10 5 mock power-spectrum observations Ĉℓ generated by performing a multivariate Gaussian sampling of ΣNG.We study two cases.In the first case, we directly compute the χ 2 value, defined as: where C model ℓ refers to our initial data vector (Section 2.3) that we generated with the sky model presented in Section 2.2.The χ 2 value is computed twice, for both Σ = ΣG and Σ = ΣNG.In the second case, we first refit the mock observations and obtain best-fitting values C bf ℓ .The χ 2 value is then computed for the mock data vector with respect to (w.r.t.) the best-fitting model: for both Σ =ΣG, ΣNG.Finally, we obtain the probability to exceed (PTE) of the observed χ 2 value for each simulation.In order to convert from a χ 2 value to a PTE, we assume the numbers of degrees of freedom (d.o.f.) to be the same as the number of data points in the first case, and the number of data points minus the number of model parameters in the second case.
We show the PTE histograms in the form of a probability density function (PDF) in Figure 5.The top panel corresponds to the first case (equation 24), and the bottom panel corresponds to the second case (equation 25, where the χ 2 is calculated with respect to each best-fitting model).The dashed line represents the analysis using ΣNG, for which we obtain a uniform distribution since the χ 2 is computed with the same covariance that generated the data.On the other hand, the solid black line represents the PTEs obtained with ΣG.This PTE distribution is skewed towards low values, meaning that there is a substantial number of simulations where the χ 2 are very high and significantly differ from the expected value for the data: 8 % of simulations have PTE < 0.05 (6 % if compared to the best fit) as opposed to the expected 5 % value.This would be interpreted as the model being unable to provide a good description of the data, whereas the actual problem is a mis-estimation of the data vector uncertainties.
Similar results were found in Azzoni et al. (2023), where the authors assumed a Gaussian covariance when analysing simulated data in which the dust component, generated using the model of Vansyngel et al. (2017), was significantly non-Gaussian.They found that, while the theoretical model used to describe the power spectra was able to obtain unbiased constraints on r, the best-fitting χ 2 values were consistently high.
We further calculate the PTE distributions corresponding to specific frequency channels (LF, MF, UHF).When comparing to the best-fitting model, the d.o.f. to use is unclear a priori because the best fit was found including all frequency channels.We therefore resort to the distributions obtained with ΣNG.For each band, we find the effective d.o.f. that produces a uniform distribution.This is then used for the equivalent calculation using ΣG.The results are included as differently colour-coded solid lines in Figure 5.We are only able to recover the true distribution for the LF channels.This is because they barely contain any dust.As a result, a non-Gaussian dust covariance has a negligible effect on them.The strongest effect is seen for the UHF channels, which are dust dominated.Even though comparing to the best fit alleviates the differences, neglecting the impact of dust non-Gaussianity in the covariance matrix still leads to poor goodness-of-fit metrics.

Recovering good PTEs with cleaned maps
The reason for the degradation in the χ 2 PTE described in the previous section is our inability to fully describe the statistical uncertainties of the dust power spectrum.However, as we have shown in Section 3.1, and justified in Section 3.2, this mostly affects the final constraints on parameters related to the dust power spectrum, and does not impact the cosmological and spectral parameters.It should therefore be possible to select a sector of the data vector where the effects of dust non-Gaussianity are minimized, and for which standard goodness-of-fit metrics can be used as a reliable diagnostic of model suitability.This can be achieved by extracting a "cleaned" CMB power spectrum.A similar approach at the map-  5, but we now show the PTE distributions obtained after cleaning the 10 5 power-spectrum mock simulations with the procedure outlined in Section 3.3.1.The cleaning procedure nulls the foregrounds and therefore minimizes the effect of their non-Gaussian statistics.We recover a uniform distribution for Σ G , which reflects how it correctly summarises the statistics of the data after cleaning.This contrasts with the originally skewed PTE distribution found for Σ G (solid black line, bottom panel Figure 5), added here as a solid gray line for comparison.
based level was developed by de Belsunce et al. ( 2022) and applied to Planck polarization maps.
Consider the harmonic coefficients of maps observed at Nν different frequencies, and let us model them as arising from a linear combination of components with spectra S ν c and instrumental noise: where A c ℓm are the harmonic coefficients of component c, and n ν,ℓm is the noise contribution.Assuming we know the spectra of all components, the maximum-likelihood estimator for the component amplitudes is where the matrix S, of size Nc × Nν , contains the SEDs of all components, N ℓ is the Nν ×Nν noise power spectrum matrix, and we have promoted mν and A c to vectors of sizes Nν and Nc respectively.If we are only interested in the CMB amplitude, we can express it as a linear combination of the input maps: where Then, the power spectrum of the cleaned CMB map is and its covariance is related to the full multi-frequency covariance via which is the same expression derived in Azzoni et al. (2023) within the context of residual covariances in component separation methods.We take our 10 5 mock power-spectrum observations.With their best-fitting values for the SED parameters, we compute the elements Qν , specific to the SO frequency channels and noise spectra.We then clean the mock observations and the best-fitting signal estimates C bf ℓ with equation ( 30), and obtain the clean covariance matrix (equation 31).This is performed twice, once for the best fit and covariance derived from ΣG and once for the corresponding quantities using ΣNG.We compute the corresponding χ 2 bf values for each cleaned mock observation.In order to convert these to PTE values, we proceed as in the previous section and compute the effective number of d.o.f. that return a uniform PTE distribution in the ΣNG case, which is the non-Gaussian covariance from which the mock observations were generated.
We show the corresponding PTE distributions in Figure 6.The distribution we recover for ΣG (solid red line) exactly matches that of ΣNG (blue dashed line), in contrast with the previous disagreement found when no cleaning was performed (Section 3.3) and which we overplot as a solid gray line for comparison.This highlights the suitability of the procedure to null the non-Gaussian foregrounds.It can therefore be used to quantify the ability of the model to describe the data when considering only the sources whose statistical distribution we have more under control (the CMB, in this case).It is another goodness-of-fit metric that should be explored in addition to the standard χ 2 , to ward off issues with non-Gaussianities.

Foreground extensions
Perhaps the most important challenge for component separation is foreground frequency decorrelation, which may arise from inaccurate modelling of the foreground SEDs, as well as through the spatial variation of these SEDs.This has been noted in the context of mapbased techniques (e.g.Armitage-Caplan et al. 2012;Kogut & Fixsen 2016;McBride et al. 2023), as well as in C ℓ -based methods (BI-CEP2/ Keck Collaboration et al. 2015;Planck Collaboration et al. 2017).It can be addressed by extending the foreground model used in component separation, to directly account for these spatial variations at the map level (Eriksen et al. 2004(Eriksen et al. , 2008;;Alonso et al. 2017;Errard & Stompor 2019) or by forward-modelling the impact of this contamination on to the multi-frequency power spectra, either through a moment expansion (Chluba et al. 2017;Rotti & Chluba 2021;Remazeilles et al. 2021;Mangilli et al. 2021;Azzoni et al. 2021;Vacher et al. 2022;Sponseller & Kogut 2022;Vacher et al. 2023b), or through an effective description of frequency decorrelation (BICEP2/ Keck Collaboration et al. 2015).We here investigate whether the presence of decorrelation would affect our results, since the qualitative explanation in Section 3.2 relies on the assumption of 100 % correlation between frequency channels.
To do this, we first repeat our analysis using an extended foreground model, through the so-called minimal moment expansion method of Azzoni et al. (2021).The model allows for spatially-varying spectral indices for both dust and synchtrotron, and parametrizes those variations as uncorrelated Gaussian random fields with power-lawlike power spectra.Since we are only interested in the changes that foreground non-Gaussianities introduce at the level of the powerspectrum covariance, we analyse data without any spectral index variations.This is equivalent to evaluating the parameters of the minimal moment expansion model at Bc, B d , γ d , γc = 0.The results obtained using this model for component separation are presented in Figure 7.We find the same results as our baseline analysis: we recover the same unbiased parameter constraints in both (ΣG, ΣNG) cases,   3, we show the comparison between posterior distributions obtained with a Gaussian and non-Gaussian dust covariance matrix (shown in filled solid blue and unfilled dashed red, respectively).In this case, the data have been analysed with the minimal moments expansion method of Azzoni et al. (2021) to assess the impact of frequency decorrelation.This adds 4 more parameters to our model (see equation 10), although we only show a reduced dataset for clarity.The amplitude of the spectral index power-spectrum fluctuations (B d , Bs) are consistent with zero.As a result, we ignore the prior volume effects that appear on the corresponding γ d , γs.We also find the same results as our baseline analysis: we recover the same unbiased parameter constraints in both (Σ G , Σ NG ) cases, with the exception of the dust spatial parameters (A d , α d ), whose variance increases.Due to the complexity of this model, the error on r increases by more than 50 %, σ(r) = 2.0 × 10 −3 , with respect to the baseline analysis.
with the exception of the dust spatial parameters α d and A d whose variance increases a factor 1.2 and 1.3, respectively.We obtain null amplitudes for the spectral index power-spectrum fluctuations (Bs, B d ).Due to the complexity of this model 9 , the error on r increases in both cases by more than 50 %, σ(r) = 2.0 × 10 −3 , with respect to the baseline analysis where no moments method was used.This agrees with the increase found in Azzoni et al. (2021).
The results obtained so far have been recovered from a data vector which itself contained no frequency decorrelation (even if the moment expansion model allows for it).To test if the presence of significant decorrelation in the data would change our results, we generate mock data that include decorrelation via the scale-independent decorrelation parameter ∆ d presented in Section 2.2.It is worth noting that the moment expansion model is a generalisation of this simple constant-decorrelation model, the latter corresponding to the limit in which spectral index variations are uncorrelated between pixels (Vansyngel et al. 2017;Azzoni et al. 2023).The amount 9 We are now marginalising over a model that allows for frequency decorrelation, and thus expect the additional uncertainty to leak into r.
of decorrelation present in data is still highly disputed.Sheehy & Slosar (2018) found negligible levels of decorrelation between the 217 and 353 GHz Planck channels, a result also found independently in Planck Collaboration et al. (2020d).No significant evidence of decorrelation was found in the in the BICEP/Keck region either (BK18), a result that was further supported in an analysis including correlations with neutral hydrogen (Ade et al. 2023).Nevertheless, decorrelation between the foreground emission spectral laws at different frequencies can be detected on individual lines of sight, as noted by Pelgrims et al. (2021), and hence it is worth exploring the potential impact of its presence on our conclusions.
We choose to generate data with 2 % decorrelation between the 217 and 353 GHz channels for the SO SAT region.This level of decorrelation is compatible with existing data (Planck Collaboration et al. 2020d;Pelgrims et al. 2021).While the power-spectrum values change, this small level of decorrelation allows us to use the same covariance matrix as in all previous tests.This incurs a 2 % error on the cross-frequency correlations and is expected to be irrelevant when computing the covariance matrix.
We run the MCMC inference with both data and model now in-   3, we show the comparison between posterior distributions obtained with a Gaussian and non-Gaussian dust covariance matrix (shown in filled solid blue and unfilled dashed red, respectively) for the most relevant parameters included in the sky model with simple ∆ d decorrelation.The scale-independent parameter ∆ d characterizes the level of decorrelation between the two UHF channels.In this case, the dust powerspectrum signal itself contains 2 % decorrelation, although this level is small enough not to be captured in the covariance.We also find the same results as our baseline analysis: we recover the same unbiased parameter constraints in both (Σ G , Σ NG ) cases, with the exception of the dust spatial parameters (A d , α d ), whose variance increases.The complexity of decorrelation increases the error on r by 30 %, σ(r) = 1.7 × 10 −3 .cluding scale-independent decorrelation in the dust component.We show our posteriors in Figure 8.While the uncertainty on r has now increased by 30 %, σ(r) = 1.7 × 10 −3 , we recover the same behaviour observed in the baseline analysis (Figure 3).Firstly, we obtain unbiased constraints for all parameters.Secondly, we find that the constraint on σ(r) is unaffected by the non-Gaussian statistics of dust at this level of decorrelation.Finally, we see a 1.1 and 1.2 factor increase in the errors of the spatial dust parameters α d and A d , respectively, when the data are analysed with ΣNG.Thus, as long as it is included in the modelling, this level of decorrelation does not hamper our ability to employ dust maps at one frequency to constrain the dust emission at another one.
If decorrelation is not taken into account within the model, we see an increase in σ(r).In particular, when we analyse data containing ∆ d = 0.98 with our baseline model (which does not include decorrelation), we obtain σ(r) = 3.5 × 10 −3 (which corresponds to a factor of 2 increase) 10 .This illustrates how the interpretation obtained with our simple toy model (equation 18) only applies when all frequency channels are perfectly correlated.If a significant level of decorrelation is present, a potential solution would be to break up the sky into different patches, within which the spectral index is constant and the implicit map-level subtraction can be performed (provided the patches are signal dominated and the spectral fluctuations are slowly varying).
10 In practice, however, this value for σ(r) is not reliable.This model would never be used to analyse the dataset as the PTE goodness-of-fit metric is less than 10 −9 .

CONCLUSIONS
A detection of the primordial gravitational wave background is of paramount importance: it would provide a conclusive test of inflation and open a whole new window to probe physics at the highest energies and earliest times.However, exquisite control over the polarized foreground signal is key for a reliable detection of primordial B-modes.In this paper, we addressed the standard assumption made in C ℓ -based cleaning methods, namely that foregrounds can be approximated as Gaussian (e.g.BK18).Focusing on dust, its covariance matrix is modified to include the effects of the anisotropic and non-Gaussian nature of this field.Both effects lead to additional couplings between different ℓ bins, as well as modifying the power-spectrum uncertainties.
We studied two regimes of non-Gaussianity, introduced by 1) a large-scale modulating field, and 2) small-scale filamentary dust structures.In 1), the large-scale template, which we took to be the Planck 353 GHz temperature map smoothed on small scales, introduces couplings between neighbouring ℓ bins close to the dust covariance matrix diagonal.In 2), we obtained the covariance on small-scales using 200 realizations of the dustfilaments model (Hervías-Caimapo & Huffenberger 2022).These filamentary structures couple large and small scales, and also modify the diagonal elements of the covariance matrix.
We compared the CMB and dust parameter inference obtained with a dust covariance matrix that captures these non-Gaussianities against one assuming Gaussian foregrounds.We selected noise levels and sky coverage such that we replicate the SO experiment, although we checked that our results are robust against other configurations more similar to a LiteBIRD-like experiment.We found that the final constraints on r are not affected by non-Gaussian dust foregrounds, although the error on the spatial dust parameters increases by a factor ∼ 1.2 on average.This reflects how their uncertainties can be significantly underestimated in the Gaussian approximation.We believe these results hold in general, as we recovered the same behaviour for r when we artificially set all off-diagonal covariance matrix elements to have different, very large values.
We explained these results by arguing that C ℓ -based constraints on r are nearly immune to the statistics of the dust field, as one can always clean the CMB signal from dust by a simple subtraction of the particular realization of dust on the sky using a dust-dominated high-frequency channel.We checked this conclusion using a simple analytical Fisher forecast that assumes 100 % correlation between frequency channels, although we found the same behaviour numerically for data that include decorrelation using an MCMC analysis.We also expect map-based component separation methods to be immune to the statistics of the foreground fields, given that similar arguments about removing the exact realization of the foreground field, without requiring knowledge of its statistics, apply to map-based cleaning methods.
We note, however, that even though r constraints are not affected, dust non-Gaussianity noticeably impacts goodness-of-fit analyses.This behaviour was also found by de la Hoz et al. (2022) in the context of determining polarization angles in CMB experiments.The χ 2 values computed with a covariance representing Gaussian foregrounds are degraded not because our model is unable to provide a good description of the data, but because we are unable to fully describe the statistical fluctuations of the dust power-spectrum through the Gaussian covariance matrix.We explored how this can be mitigated by using only foreground-cleaned spectrum combinations when computing goodness-of-fit statistics.
In this paper, we have considered non-Gaussianity in the fore-ground amplitudes, but not in the spectral index variations.The latter could add additional contributions to the non-Gaussian covariance that produce a non-trivial frequency structure.Without better data to model these SED variations, it is difficult to construct a well-educated model, so we leave this task for future work.The arguments presented in this work should also apply to non-Gaussianities present in the synchrotron background, although we leave such a study to future work.In addition, our work highlights the importance of taking into account non-Gaussianities in any kind of Galactic science study.Although the work presented here focused on the Simons Observatory, its results will be relevant for any experiment that interprets B-mode constraints on the tensor-to-scalar ratio in the absence of a reliable method to quantify the level of foreground non-Gaussianity.from the small scales, Cov(C SS ℓ , C SS ℓ ′ ), from a set of dustfilaments simulations.
We can approximate the covariance of the total map as: where Cov(C T T ℓ 1 , C T T ℓ 2 ) is the covariance of the large-scale modulated dust field, for which ⟨ CT T ℓ 12 ⟩ is simply the analytical power-spectra used in this work (Section 2.3).This ensures that non-Gaussianities are introduced only at the level of the power-spectrum covariance.The term Cov(C SS ℓ 1 , C SS ℓ 2 ) is the Gaussian contribution from the small scales, for which ⟨ CSS ℓ 12 ⟩ is the average power spectra, computed over simulation ensembles of mS maps.We only bin the resulting covariance matrices at the last step.Therefore, in order to obtain ⟨ CSS ℓ 12 ⟩ at each ℓ, we follow the method described in Nicola et al. ( 2021): we compute the power spectra at every ℓ with the usual anafast healpix routine (Górski et al. 2005) on the masked maps, and then divide this quantity by ⟨w 2 ⟩Ω.We verified that the covariance matrices estimated with this method were both symmetric and positive-definite in all cases.

APPENDIX B: TOY MODEL AND FISHER FORECAST
In Section 3.2, we devised a simple model (18) to gain insight into the observed behaviour for the posterior distributions (Figure 3).The model has two parameters: r, the amplitude of the primordial BB signal and f , the dust emission ratio between the mid-and highfrequency channels.We assume Gaussian fields and compute the covariance matrix using the Knox formula (Knox 1995).The full marginalised errors on each parameter, obtained from the Fisher forecast are: where Taking into account that there are fsky(2ℓ + 1) modes per multiple ℓ, one can check that these expressions simplify to equations (21, 22) in the limit N M ℓ ≪ C dd ℓ .This approximation holds for an experiment such as SO, where the noise in the mid-frequency channel is at most 10 % of the dust power spectrum in the high-frequency channel.

Figure 1 .
Figure1.Correlation of the dust power spectrum covariance at 280 GHz for different dust models in the SO patch.The ℓ range spans ℓ ∈ (30, 300), binned with ∆ℓ = 30, and is the same across all panels.The first panel corresponds to the covariance matrix of our initial Gaussian dust field, Σ G , which already incorporates masking effects: the covariance is essentially diagonal with the exception of small off-diagonal terms close to the diagonal that appear due to the mask.The second panel corresponds to a Gaussian dust field modulated by a large-scale template (Section 2.4.1).One can see that the large-scale modulation moves power from the diagonal to the off-diagonal terms, thus further coupling neighbouring ℓ-bins.The third panel corresponds to the covariance of a filamentary small-scale dust population obtained from 200 realizations of the dustfilaments (Hervías-Caimapo & Huffenberger 2022) model.The net effect of this population is to couple small scales to large scales.The last panel refers to the covariance matrix Σ NG , which contains both large and small scale effects.We use it to capture the full anisotropic non-Gaussian nature of the dust field.

4Figure 2 .
Figure2.Large-scale modulating template in normalised arbitrary units (a.u.) for the SAT region of the SO experiment.It is the product of a uniform SO patch mask (apodized on 5 • scales around the edges) with the large-scale field.The large-scale field corresponds to the Planck 353 GHz temperature map, smoothed on 0.4 • (ℓ ≈ 400) scales.We take this map to modulate the Gaussian dust field, introducing anisotropic features which are reflected as couplings between neighbouring ℓ-bins in the power-spectrum covariance.

Figure 3 .
Figure 3.Comparison between posterior distributions obtained with a Gaussian and non-Gaussian dust covariance matrix (shown in filled solid blue and unfilled dashed red, respectively).The key result is that σ(r) is unaffected, while the spatial dust parameter uncertainties are greater in the non-Gaussian case.When we use the covariance matrix capturing the non-Gaussian anisotropic nature of dust, we see an increase in σ(α d ) and σ(A d ) by a factor 1.1 and 1.3, respectively.Therefore, using a Gaussian covariance to describe the statistics of dust leads to an underestimation of the dust spatial parameter uncertainties.True mock data values are shown as grey dotted lines.We recover unbiased parameter constraints in all cases.This is expected since non-Gaussian dust only affects the covariance matrix in our analysis.

Figure 4 .
Figure 4.Comparison between posterior distributions obtained with a Gaussian and non-Gaussian dust covariance matrix (shown in filled solid blue and unfilled dashed red, respectively).We only show the most relevant parameters included in the sky model.We employ a configuration similar to a LiteBIRD-like experiment and artificially boost Σ NG to a level corresponding to a (very large) 10 % correlation matrix.We find the same results as our baseline analysis: we recover the same unbiased parameter constraints in both (Σ G , Σ NG ) cases, with the exception of the dust spatial parameters (A d , α d ), whose variance increases.The larger sky fraction and decrease in noise levels improves the error on r, σ(r) = 5.0 × 10 −4 .

Figure 6 .
Figure6.Similar to the bottom panel of Figure5, but we now show the PTE distributions obtained after cleaning the 10 5 power-spectrum mock simulations with the procedure outlined in Section 3.3.1.The cleaning procedure nulls the foregrounds and therefore minimizes the effect of their non-Gaussian statistics.We recover a uniform distribution for Σ G , which reflects how it correctly summarises the statistics of the data after cleaning.This contrasts with the originally skewed PTE distribution found for Σ G (solid black line, bottom panel Figure5), added here as a solid gray line for comparison.

Figure 7 .
Figure 7. Similar to Figure3, we show the comparison between posterior distributions obtained with a Gaussian and non-Gaussian dust covariance matrix (shown in filled solid blue and unfilled dashed red, respectively).In this case, the data have been analysed with the minimal moments expansion method ofAzzoni et al. (2021) to assess the impact of frequency decorrelation.This adds 4 more parameters to our model (see equation 10), although we only show a reduced dataset for clarity.The amplitude of the spectral index power-spectrum fluctuations (B d , Bs) are consistent with zero.As a result, we ignore the prior volume effects that appear on the corresponding γ d , γs.We also find the same results as our baseline analysis: we recover the same unbiased parameter constraints in both (Σ G , Σ NG ) cases, with the exception of the dust spatial parameters (A d , α d ), whose variance increases.Due to the complexity of this model, the error on r increases by more than 50 %, σ(r) = 2.0 × 10 −3 , with respect to the baseline analysis.

Figure 8 .
Figure 8. Similar to Figure3, we show the comparison between posterior distributions obtained with a Gaussian and non-Gaussian dust covariance matrix (shown in filled solid blue and unfilled dashed red, respectively) for the most relevant parameters included in the sky model with simple ∆ d decorrelation.The scale-independent parameter ∆ d characterizes the level of decorrelation between the two UHF channels.In this case, the dust powerspectrum signal itself contains 2 % decorrelation, although this level is small enough not to be captured in the covariance.We also find the same results as our baseline analysis: we recover the same unbiased parameter constraints in both (Σ G , Σ NG ) cases, with the exception of the dust spatial parameters (A d , α d ), whose variance increases.The complexity of decorrelation increases the error on r by 30 %, σ(r) = 1.7 × 10 −3 .