Bayesian inference methodology to characterize the dust emissivity at far-infrared and submillimeter frequencies

We present a Bayesian inference method to characterise the dust emission properties using the well-known dust-Hi correlation in the diffuse interstellar medium at Planck frequencies ν ≥ 217 GHz. We use the Galactic Hi map from the Galactic All-Sky Survey (GASS) as a template to trace the Galactic dust emission. We jointly infer the pixel-dependent dust emissivity and the zero level present in the Planck intensity maps. We use the Hamiltonian Monte Carlo technique to sample the multi-dimensional parameter space. We demonstrate that the methodology is unbiased when applied to realistic Planck sky simulations over a 7500 deg 2 area around the Southern Galactic pole. As an application on data, we analyse the Planck intensity map at 353 GHz to jointly infer the pixel-dependent dust emissivity at N side = 32 resolution (1.8 ◦ pixel size) and the global offset. We find that the spatially varying dust emissivity has a mean of 0.031 MJy sr − 1 (10 20 cm − 2 ) − 1 and 1 σ standard deviation of 0.007 MJy sr − 1 (10 20 cm − 2 ) − 1 . The mean dust emissivity increases monotonically with increasing mean Hi column density. We find that the inferred global offset is consistent with the expected level of Cosmic Infrared Background (CIB) monopole added to the Planck data at 353 GHz. This method is useful in studying the line-of-sight variations of dust spectral energy distribution in the multi-phase interstellar medium.


INTRODUCTION
Galactic dust emission is one of the significant foregrounds in measurements of the Cosmic Microwave Background (CMB) intensity and polarization at frequencies above ∼100 GHz (Planck 2018 results.IV. 2020).At this frequency range, the form of spectral energy distribution (SED) of the Galac-⋆ E-mail: adak@iac.es† sshaik14@asu.edutic dust and the Cosmic Infrared Background (CIB, Puget et al. 1996;Lagache et al. 2005) are similar.While interesting in its own right, characterisation and understanding of the Galactic dust properties are crucial for measuring the CMB B-mode polarization (Planck 2018 results.XI. 2020).To mitigate this foreground contribution from the measurements, it is essential to understand how dust grains contribute to the B-mode polarisation.Understanding the Galactic dust emission is also critical for the reliable reconstruction of the CIB anisotropies (Planck intermediate results.XLVIII. 2016).The fact that the CIB and the Galactic dust share similar spectral properties makes it crucial to understand the spatial distribution and the SED of Galactic dust to separate the two emissions reliably.The CIB traces the matter distribution in the Universe and plays a crucial role in delensing the lensed B-mode contribution in the observed B-mode measurements (Larsen et al. 2016).It is also an important probe to be cross-correlated with other probes of the largescale structure (Ade et al. 2014;van Engelen et al. 2015;Ade et al. 2016;Maniyar et al. 2019;Jego et al. 2023).Moreover, even though the CIB is weakly polarized, at the frequencies where CIB is significant, the polarization of the CIB can be a contaminant to CMB B-mode polarization (Lagache et al. 2020;Feng & Holder 2020).
Various methods have been used to study the nature of the Galactic dust emission with the Modified Black Body (MBB) as a model of the Galactic dust SED at Planck 1 frequencies.Inferring Galactic dust SED parameters at the highest Planck angular resolution with sufficient signal-tonoise ratio is also important because smoothing of resolution may reduce information associated with the Galactic dust.Planck 2015 results.X (2016) use Commander method to perform Bayesian inference of the Galactic dust spectral properties at 7.5 ′ full-width half maximum (FWHM) angular resolution.A complementary method to do a similar task is implementing the dust-Hi correlation.Galactic dust is correlated with Hi 21 cm line emission of neutral hydrogen at high Galactic latitude and low-column density regions (Boulanger & Perault 1988).Hence, the Galactic Hi emission map can be used as a tracer of the Galactic dust emission.Using Planck and IRAS data along with Hi 21 cm observations obtained by Green Bank Telescope (GBT), Planck early results.XXIV.2011 estimate the dust emissivities in 14 fields covering more than 800 deg 2 at high Galactic latitude.Using the same formalism, Planck intermediate results.XVII.( 2014) study the dust emissivity and its SED by fitting the MBB model over the Southern Galactic Pole (SGP, b < −25 • ).A proper estimation of the Galactic dust emission is needed to separate the CIB emission (see, e.g., Planck intermediate results.XLVIII. 2016;Lenz et al. 2019;Irfan et al. 2019).Once the contribution of Galactic dust to a given frequency map is estimated, it can be subtracted to obtain the contribution of the CIB anisotropy (Planck 2013 results.XXX.2014).Lenz et al. (2019) have produced the CIB maps by cross-correlating Hi4PI data (HI4PI Collaboration: et al. 2016) with the Planck intensities at 353, 545, and 857 GHz over approximately 25% of the sky.
Planck does not measure the absolute sky background referred to as the zero levels of intensity, which have been fixed at the level of map-making (Planck 2013 results.VIII. 2014;Planck 2015results. VIII. 2016).Two considerations go into determining the zero level of an HFI frequency map: the zero level of the Galactic emission and the CIB monopole.Zero level of the Galactic emission has been set using dust-Hi correlation at high Galactic latitude where Hi is the reliable Galactic dust tracer.The underlying assumption is that the dust emission is zero where Hi column density is zero.The offset at 857 GHz is obtained by crosscorrelating the Planck HFI data at 857 GHz with Leiden 1 http://www.esa.int/PlanckArgentine Bonn (LAB) Galactic Hi Survey data.For other HFI frequencies, the Galactic zero level has been fixed using cross-correlation of maps at respective frequencies with the 857 GHz map (for details, see section 5.1 of Planck 2013 results.VIII. 2014).The estimated offsets are subtracted from the detector data at the time of map-making.After this step, the CIB monopole, estimated from the CIB model of Béthermin et al. (2012), has been added to each Planck HFI frequency intensity map.
Separation of the Galactic dust emission from the Planck frequency maps needs to take into account the proper treatment of the offset present in the maps.In the previous studies, the dust emissivities are fitted over local sky patches and variable offsets.Planck early results.XXIV.( 2011) utilises dust-Hi correlation to estimate emissivity properties of the dust at Low, Intermediate, and High-Velocity Clouds (LVC, IVC, and HVC) over 14-fields.The analysis in Planck intermediate results.XVII. (2014) estimates the emissivity at LVC and offset using dust-Hi correlation over the patches of 15 • diameter.Both works use the χ 2 minimisation technique.When dealing with very high dimensional parameter problems, χ 2 minimisation often results in parameter estimates that are far from the typical set (Mackay 2003).Furthermore, the efficiency of the χ 2 minimisation method is limited by the correlation between the parameters of interest and becomes inefficient for the high dimensional parameter space.Planck 2013 results.XI. ( 2014) measures global offset first and then estimates the Galactic dust SED parameters by fitting the MBB spectrum to offset-corrected intensity maps.In this work, we utilise the dust-Hi correlation to jointly infer the dust emissivity and the global offset of the whole sky region considered instead of individual smaller sky patches.We use GASS Hi data with Planck intensity maps over the same sky region used in the study of Planck intermediate results.XVII.(2014).We show that such an analysis can benefit from the joint inference of the emissivity and the global offset.
We sample the joint posterior distribution of emissivity and the global offset.The total number of variables to sample in this problem is around 2 × 10 3 .We use the Hamiltonian Monte Carlo (HMC) sampling method, which can more efficiently sample such large dimensional parameter space than the Metropolis-Hastings (MH) algorithm (Duane et al. 1987).For both HMC and MH algorithms, a new proposed point (θ ⋆ ) is generated from the present point (θ) by taking some sort of steps based on the target density.For a D-dimensional problem, the number of steps required to reach a nearly independent proposal point grows as D 1/4 for HMC and D for MH algorithm.The total amount of computation time with a reasonable acceptance rate grows as D 5/4 for HMC and D 2 for MH algorithm (Creutz 1988;Neal 2012;Betancourt & Girolami 2013).HMC uses the gradient of the posterior distribution to generate a proposed point, which, in principle, has an acceptance probability equal to one.This feature has led to HMC being increasingly employed in the high dimensional inference problems encountered in cosmology (Hajian 2007;Taylor et al. 2008;Jasche et al. 2010;Jasche & Wandelt 2013;Anderes et al. 2015), including the inference of CMB foreground parameters (Grumitt et al. 2020).
This paper is structured along the following lines.In Section 2, we present the data model, the likelihood, and the details of the HMC sampler.Section 3 describes the data used in this paper and the pre-processing of the data before the main analysis.Validation of the Bayesian inference method using simulated maps is discussed in Section 4. In Section 5, we present and discuss the CMB-subtracted Planck 353 GHz intensity map analysis results.We summarise the main results of the paper in Section 6.

METHOD
This section discusses the data model and likelihood analysis to sample the model parameters from their posterior distributions using the HMC sampler.

Dust emission model and the data likelihood
We are interested in characterising the dust emission properties in the diffuse interstellar medium over the frequency range covered by the Planck HFI maps (ν ≥ 217 GHz).The major contributors to the total emission in this range of frequencies are CMB, Galactic dust, CIB, and instrumental noise.We assume that the CMB intensity map derived using the well-established component separation techniques (SMICA, NILC, SEVEM, and COMMANDER) from the Planck multi-frequency observations is accurate enough in the sky region not obscured by the Galactic disc (Planck 2018 results.IV. 2020).With this assumption, we work with the CMB-subtracted Planck frequency maps to reduce the total number of components that need to be modelled.The Hi column density map can be used as a tracer of the dust emission due to strong dust-gas correlation in the diffuse ISM (Boulanger & Perault 1988).We model the CMBsubtracted intensity map (dν ) at frequency ν as the addition of the Hi-correlated dust emission (I Hi-d ), the noise and a constant global offset.In general, the noise term can consist of the CIB emission (I CIB ν ), the instrument noise (I N ν ), and the Galactic residual emission (I R ν ).The Galactic residuals contain the dust emission from H2 gas, which is uncorrelated with the Hi-emission.The global offset (Oν ) over the regions being analysed includes the contribution mainly from the CIB monopole and emission not accounted for by the Hi emission.
It is evident from previous studies that the dust emissivity varies smoothly over the sky (Planck intermediate results.XVII.2014).This fact allows us to assume the emissivity to be constant over a set of pixels.This set of pixels is determined using the HEALPix grid at a coarser resolution (i.e.lower N side ) than the resolution of the data map.The bigger pixel over which emissivity is assumed constant is termed a superpixel, whereas pixels within the superpixel are called subpixels.This model is expressed in the following equation, (1) where Ω j i indicates the direction of i th subpixel within j th superpixel.We model the Hi-correlated dust emission (I Hi-d ν ) as where NHi denotes the integrated column density of Hi component in the direction Ω j i , and ϵ j ν is the dust emissivity at frequency ν at j th superpixel.Here, we consider a single Hi template to trace the Hi-correlated dust emission.Because we are treating I CIB ν and I R ν as noise along with I N ν , we call the remaining contribution of I Hi-d ν and Oν as the signal model, sν , We assume that the three noise components, the CIB, the Galactic residuals and the instrument noise, are independent.Hence, the covariance matrix of the total noise (Σν ) at frequency ν is the sum of the covariance of the individual noise component.We denote the angle between i th subpixel of j th superpixel and the i ′th subpixel of j ′th superpixel by θ jj ′ ii ′ and the elements of the covariance matrix are denoted by Σ jj ′ ν,ii ′ ≡ Σν (θ jj ′ ii ′ ).While we consider the correlation between the subpixels within a superpixel, we neglect the correlation between subpixels that belong to different superpixels, that is Σ jj ′ ν,ii ′ = Σ jj ν,ii ′ δ jj ′ .Hence, the nonzero elements of Σν are given by where Σ N ν denotes the instrument noise covariance, Σ CIB ν and Σ R ν denote the contribution to Σν due to the CIB and the Galactic residuals, respectively.Unlike instrument noise, I CIB ν and I R ν are spatially correlated signals.Hence, its contribution to the total noise covariance matrix gives rise to the non-zero off-diagonal terms in Σν .We further assume the instrument noise, the CIB and the residual Galactic emission to be Gaussian with their respective covariance.Hence, the joint likelihood of all the data elements given the model parameters (ϵ j ν , Oν ) is where χ 2 is ii ′ represents the element of the inverse of the covariance matrix Σ.
In the model fitting, templates NHi(Ω j i ) and the noise variance Σν are known, and {ϵ j ν , Oν } are the unknown parameters of the model that we aim to infer from the observed data.Bayes theorem allows us to write the posterior probability distribution (P({ϵ j ν , Oν }|{dν (Ω j i )})) of the parameters given data as where P({ϵ j ν , Oν }) is the prior probability distribution of the parameters, and P({dν (Ω j i )}) is the evidence.We assume a uniform prior for all the parameters of interest without any bounds.Because the model is linear in parameters and we assume a Gaussian likelihood, the posterior distribution is also a Gaussian as a function of the model parameters.P({dν (Ω j i )}) acts as a normalisation constant.Hence, the functional dependence of the posterior distribution on parameters is the same as that of the likelihood distribution up to a proportionality constant.We sample the posterior distribution given in Equation ( 7) to get the joint samples of all the parameters of interest {ϵ j ν , Oν }.We have around 2 × 10 3 emissivity parameters per NHi template and one offset parameter.

Additional terms in the modelling
In this section, we discuss additional terms that may be required in modelling the data, their motivation, and implementation in the inference methodology.

Dipole term
We can model and fit for a dipole contribution in the signal model, where Dν (Ω j i ) accounts for residual dipole due to the CMB dipole, the CIB dipole, and the dipole from the Galactic residuals.In harmonic space, the expression for the dipole is 9) where Y1,m are spherical harmonics with a ν 1,m being spherical harmonic coefficients.Using the relations Y l,−m = (−1) m Y * l,m and a l,−m = (−1) m a * l,m , the above expression can be rephrased in terms of m = 0 and m = 1 coefficients as: where superscripts R and I indicate real and imaginary parts of a complex quantity, respectively.In the inference process, it is convenient to treat the dipole in harmonic space because with the Gaussian likelihood, posterior is also Gaussian as a function of a1,m due to their linear nature, unlike the real space variables indicating dipole amplitude and the direction of the dipole.

Multiple templates
The dust emission in far-infrared and sub-millimetre frequency bands can also be modelled as a linear combination of multiple Hi templates with different dust emissivity per template.For example, Planck early results.XXIV.(2011) estimate the dust emissivity properties associated with LVC, IVC and HVC clouds over 14 fields.Ghosh et al. (2017) and Adak et al. (2020)  We test the impact of additional terms like dipole term or multiple Hi templates on the inferred {ϵ j,t ν , Oν } parameters using simulated maps at Planck frequencies in Section 4.

CIB model power spectra
The CIB is a relic emission from stellar-heated dust within galaxies formed throughout cosmic history.At farinfrared/sub-millimetre bands and the resolution of Planck, the CIB appears as a diffuse background emission in the total intensity.The CIB anisotropies are found to be correlated across the frequencies and follow approximately ℓ −1 power law angular power spectrum (Planck early results.XVIII. 2011;Planck 2013results. XXX. 2014).We adopt the bestfit CIB model power spectra (including the shot noise) at 217, 353, 545, and857 GHz obtained by Planck 2013 results. XXX. (2014).Figure 1 presents the model CIB power spectra used in our analysis.
We treat the CIB anisotropies as Gaussian and correlated noise in the Planck intensity maps.To take into account the correlation between pixels at a given smoothing scale, we compute the CIB covariance matrix, Σ CIB ν , at a given frequency ν between two pixels i and i ′ using the relation where C CIB ℓ is the CIB power spectrum, B ℓ is the beam window function of the Gaussian beam, P ℓ is the Legendre polynomial of order ℓ and w ℓ is the HEALPix pixel window function.

Hamiltonian Monte Carlo sampling
In this section, we present the essentials of the Hamiltonian Monte Carlo sampling formalism to draw the samples of {ϵ j ν } and {Oν } from the distribution given in Equation ( 7), which is the same as the likelihood in Equation ( 5) for a uniform prior on parameters.Details of the HMC sampling method and some considerations that went into analysing the parameter samples are discussed in Appendix A.
HMC uses Hamiltonian dynamics to generate the proposed point and traverse the parameter space.The method treats the parameters {ϵ j ν } and {Oν } as position variables and augments them with the momentum variables (p j,t kν , pO ν ) to define phase space dynamics.The Hamiltonian of the dynamics for the given problem is (13) where P({ϵ j,t ν , Oν }) is the parameter posterior as obtained in Equation ( 7) and µ j,t ν , µO ν are mass terms for ϵ j,t ν and Oν , respectively.Up to a constant, which is independent of parameters of interest, ln[P({ϵ j,t ν , Oν })] is where χ 2 is given by Equation ( 6).While considering the total covariance matrix, only considering the diagonal part leads to underestimating the parameter uncertainty.Whereas considering all the elements, including the ones corresponding to two subpixels of different superpixels, drastically increases the computation cost.We take the approach somewhere in between.We consider the correlations between subpixels corresponding to a given superpixel only and neglect the correlations among the pixels of two different superpixels.This is expressed in Equation (4).The mathematical expressions given in the subsequent discussion are under this approximation.
We need the time derivatives of position and momentum variables to simulate the Hamiltonian dynamics.The time derivative of the position corresponding to a given parameter is simply momentum divided by the mass of the corresponding parameter: which is not at all a computationally involved operation.
The time derivative of the momentum involves computing the derivative of the logarithm of the posterior: Our model is linear in the parameters of interest, and the likelihood is a Gaussian distribution.Hence, with the flat priors, the derivatives of the logarithm of the posterior are simple expressions given below.The derivative with respect to emissivity is and derivative with respect to offset Oν is While sampling the model parameters, including the residual CMB dipole contribution, the spherical harmonic coefficients a (R/I),ν 1m in Equation ( 10) are jointly sampled along with emissivity and offset as {ϵ j,t ν , Oν , a (R/I),ν 1m }. Since a ν 1,m in real space indicates the dipole amplitude and direction, they are sampled as global parameters similar to Oν .The momentum derivative for the dipole coefficients is given by for m = 0 and 1, where (±) corresponds to the real (R) or imaginary (I) parts respectively, of the coefficients (see Equation ( 10)).In general, sν (Ω j i ′ ), is given by Equation ( 11).Our HMC formalism takes into account pixel-dependent dust emissivity, global offset, and three dipole amplitudes.
As the algorithm requires, we simulate the Hamiltonian dynamics using the Leap-Frog scheme (see Appendix A).In the Leap-Frog scheme, the step size ∆ decides the time step, which is generally different for different parameters.A general practice is to standardize the parameter distribution scales, which requires some knowledge of the parameter covariance structure (e.g.Betancourt & Stein 2011).In the particular case of Gaussian posterior and the model that is linear in parameters, one can choose the mass matrix to achieve this goal.For problems where the curvature is isotropic and constant, such as for the Gaussian likelihood we consider in this work, a parameter independent ∆ can be chosen.This choice of ∆ in the case of Gaussian likelihoods is discussed in Taylor et al. (2008).For the general distributions with hierarchical modelling or nonlinear parameter dependencies, this procedure may not work as well as it does in our case.By setting the mass matrix equal to the inverse of the covariance matrix of the parameters, ∆ is made independent of the distribution of the individual parameter.The inverse of the parameter covariance matrix is the negative of the parameter Fisher matrix.With our choice of neglecting the correlations between the superpixels, for the given likelihood, the Fisher matrix turns out to be diagonal over the ϵ j,t ν parameters.The only non-zero off-diagonal terms are those which connect ϵ j,t ν with Oν , a1m, and Oν with a1m.However, we neglect these off-diagonal terms.Considering only the diagonal elements while assigning mass for the parameters then does not lead to ∆ for Oν and a1m being the same as that of {ϵ j,t ν }.Hence, we choose a different ∆ in the dynamical equations corresponding to Oν and the dipole parameters.Hence, we have different step sizes in the Leap-Frog scheme, ∆ϵ and ∆O corresponding to ϵ j,t ν and Oν , respectively.The details of this choice are discussed in Appendix A. This leads to the mass matrix with the following terms in the diagonal.The mass for ϵ j,t ν is for Oν , it is given by and the following is the mass for the dipole coefficients for m = 0 and 1. ( Note that the mass matrix elements for ϵ j,t ν depend on the noise covariance as well as the templates, whereas those for Oν and a1m depend only on the noise covariance.In HMC, the proposed parameter is obtained by evolving Hamilton's equations in a certain number of Leap-Frog jumps N .The product of ∆ and N determines the total distance traversed in the parameter space and controls the correlation length in the parameter chain.In this section, we have discussed the choices for N and ∆ to simulate the HMC process.While we discussed these choices without much rigour, we tested that the algorithm works using realistic simulations of the data.We would like to point out that formal methods have been developed to tune the HMC algorithm to facilitate appropriate choices for step size and path length.For example, Hoffman & Gelman (2011) presents the No-U-Turn Sampler scheme to alleviate the need for the user to choose the number of steps and also presents a method for adaptive stepsize.Recent developments in also include SNAPER-HMC for implementation on GPU and TPU hardware (Sountsov & Hoffman 2021), ChEES-HMC (Hoffman et al. 2021a), and various adaptive schemes, for example, MALT-HMC (Riou-Durand et al. 2022).Some of these schemes are implemented in probabilistic programming frameworks such as STAN (Carpenter et al. 2017), PyMC (Salvatier et al. 2015), and pyro (Bingham et al. 2019).
We first validate the above methodology on the Planck simulations.The results obtained from the simulations are presented in Section 4. The next section discusses the data, the CIB model, and the sky masks used for our analysis.

DATA SETS
In this section, we describe the Planck data, Hi data, and the sky mask used in the analysis.We also describe the procedure for computing the CIB covariance matrix using the model CIB power spectrum.

Planck data
We use the publicly available Planck 2018 Public Release 3 (PR32 ) legacy intensity map at 353 GHz (Planck 2018 results.I. 2020) for our analysis.The 353 GHz intensity map has been provided in HEALPix3 (Górski et al. 2005) grid at N side = 2048 (pixel size 1.7 ′ ) with an angular resolution of FWHM 4.82 ′ (Planck 2018 results.III.2020).We subtract the CMB contribution at 353 GHz using the following procedure.We use the SMICA CMB map provided at a beam resolution of 5 ′ (FWHM) and N side = 2048 (Planck 2018 results.IV. 2020).We smooth the 353 GHz intensity map at the resolution of the SMICA CMB map using the Gaussian approximation of the Planck beam and subtract the contribution of CMB.We further smooth the CMB-subtracted Planck 353 GHz map at the beam resolution of 16.2 ′ (FWHM), downgrade to N side = 512 (pixel size 6.8 ′ ) resolution.We choose the Planck 353 GHz map for our analysis because the contribution of synchrotron and free-free emissions are negligible compared to that of dust after the contribution due to the CMB is subtracted.We consider the Planck CMBsubtracted intensity map as the primary data.We treat the CIB monopole term as the global offset parameter.We use the unit conversion factors mentioned in Planck 2013 results.IX. ( 2014) to convert 353 GHz map from KCMB to kJy sr −1 unit.
We use 300 end-to-end (E2E) noise realizations for the Planck frequency maps from the Planck Legacy Archive (Planck 2018 results.XI. 2020).The original maps are provided at N side = 2048.We smoothed all the 300 noise maps at 16.2 ′ FWHM beam resolution and re-project them at the resolution of N side = 512.Then, we compute the variance map using the 300 smoothed E2E noise maps.

Hi data
To separate the Hi-correlated dust emission from the Planck data at high Galactic latitude, we use the Hi data provided by GASS4 survey carried out by Parkes telescope (McClure-Griffiths et al. 2009).The survey observed 21 cm emission over the southern galactic sky (at declinations δ < 1 • ) within velocity range, −400 km s −1 < VLSR < 500 km s −1 ; where VLSR is the velocity of the Hi clouds with respect to local standard of rest.The survey has a beam resolution of 14.5 ′ FWHM, velocity resolution δv = 1 km s −1 , and rootmean-square brightness temperature uncertainty of 50 mK (1σ).The GASS survey maps used in our analysis are from Kalberla et al. (2010) and are corrected for instrumental effects, stray radiation, and radio-frequency interference.
In the southern Galactic cap, the Hi in the Galactic disk (or what we call Galactic Hi) is mixed with significant emission from the Magellanic Stream (MS) (Nidever et al. 2008;D'Onghia & Fox 2016).The spectra in the 3D data cube (longitude, latitude and radial velocity) likely to be associated with MS are distinguished from Galactic Hi spectra using the velocity information (Venzmer et al. 2012).The three-dimensional model of Kalberla & Dedes (2008) helps to distinguish the spectra associated with the Galactic Hi emission and MS.Finally, the Hi template map is produced integrating 3D spectra over velocity range and projected on HEALPix grid at N side = 1024.The Galactic Hi column density map NHi used in our analysis has an angular resolution of 16.2 ′ FWHM and is projected on the HEALPix grid N side = 512 (pixel size 6.9 ′ ).We use the Galactic Hi map as a tracer for the Hi-correlated dust emission.The same Galactic Hi map is used by the Planck collaboration to study dust emission properties in the diffuse interstellar medium (Planck intermediate results.XVII.2014).

Sky masking
We use the same mask as used in Planck intermediate results.XVII.(2014) for our dust-Hi correlation analysis.The total sky area of the mask is 6300 deg 2 (f sky = 15.3%)where NHi < 6 × 10 20 cm −2 , and thus avoids the high column density regions.The unmasked sky region covers the Galactic latitude b ≤ −25 • .The area of 20 • diameter centred around (lMS, bMS) = (−50 • , 0 • ) is masked to avoid Magellanic Stream (Nidever et al. 2010).Further, the bright radio sources at microwave frequencies and infrared galaxies at 100 µm have been masked out.
To test the dependence of the analysis results on the sky fraction, we generate four additional masks with different NHi cutoff over the range between 2 × 10 20 cm −2 and 5 × 10 20 cm −2 .We mask the regions with NHi values higher than the cutoff value.We label the mask with NHi cutoff Q × 10 20 cm −2 as MaskHIQ.The sky masks are overlapping by construction.We use overlapping masks to study the dependence of the global offset as a function of mean Hi column density.The respective sky fraction f sky for each sky mask is quoted in Table 1.

PLANCK SIMULATIONS
In this section, we validate our methodology on the Planck simulations to simultaneously fit a dust emissivity per superpixel and a global offset using the HMC sampler.
We simulate the dust intensity maps at Planck HFI frequency bands between 217 and 857 GHz.We analyse simulated maps considering the total noise contribution from the instrument noise and the CIB anisotropies.We ignore the contribution of Galactic residuals in this analysis.We consider the instrument noise to be uncorrelated between pixels.However, for the CIB, we consider the inter-pixel correlations.Analysis with the simulated data helps to validate the pipeline and to determine some analysis choices, for example, the optimal values of the Leap-Frog step size (∆) and the Leap-Frog jumps (N ) by examining the behaviour of the Markov Chains (MC) drawn from the posterior distribution.
We require sufficient unmasked subpixels within each superpixel to fit the signal model with the data.We set this threshold to one-third of the subpixels within each superpixel5 .We excluded those superpixels from the joint fitting which do not fulfil this criterion.

Simulated maps
We construct the simulated maps using the following procedure: (i) We start with the dust emissivity map of Planck intermediate results.XVII.(2014) at 353 GHz (ϵ353) projected on N side = 32 (low-resolution) HEALPix grid.This map is obtained through the dust-Hi correlation analysis over 15 • circular patches in diameter centred on HEALPix pixels at N side = 32.We assume that ϵ353 is the same at all subpixels defined at resolution N side = 512 that fall within a superpixel defined by N side = 32.Each superpixel contains 256 subpixels.(ii) We translate the dust emissivity map from 353 GHz to other HFI frequencies using the MBB spectrum, ν 353 where β d is the dust spectral index fixed to 1. (iv) To simulate the instrument noise contribution, we use the variance map (II) computed from 300 smoothed E2E noise maps.We assume the instrumental noise to be Gaussian, white, and uncorrelated between the pixels.For the CIB noise component, we simulate the CIB map smoothed at 16.2 ′ FWHM beam resolution projected at N side = 512 from the model CIB power spectrum at each HFI frequency.Though the CIB is correlated between two frequencies, analysing individual frequency maps entails neglecting the correlation between the CIB emission at different frequencies.
(v) Finally, we co-add simulated dust intensity, global offset, instrument noise, and the CIB, all expressed in kJy sr −1 units.
(vi) To build the likelihood, we construct the instrumental noise covariance matrix (Σ N ν ) and the CIB covariance matrix (Σ CIB ν ).Σ N ν is taken as a diagonal covariance matrix because we neglect the inter-pixel instrument noise.I CIB ν exhibits a significant correlation between subpixels within a given superpixel.

Validation with simulations
We focus our discussion of the simulated map analysis on the 353 GHz frequency channel without the loss of generality.However, we summarise the simulation results at all four HFI frequencies (217 − 857 GHz).
The output of our HMC algorithm is the chains of MC samples of the dust emissivity per superpixel at N side = 32 and the global offset.The total number of parameters at each frequency is 2011 (dust emissivity values) + 1 (global offset) over MaskHI6.
We obtain 2 × 10 4 samples for each derived parameter and discard the first 10 3 samples.We use the remaining 1.9 × 10 4 MC samples for further analysis.In practice, the samples for a given parameter are not independent but are correlated with a certain correlation length.Figure 2 presents the auto-correlation coefficients, ρ(t), for sample chains of the dust emissivity at one superpixel and the global offset chains for 353 GHz maps.The results are depicted for two choices of {∆ϵ, ∆O}.When step size for ϵ and O are the same, {∆ϵ, ∆O} = {0.1,0.1}, the global offset (magenta solid) chain is highly correlated.For this choice, the correlation length of emissivity and offset chain is around 8 and 220, respectively.For {∆ϵ, ∆O} = {0.1,1.0}, the global offset (red dashed) chain becomes less correlated, and correlation length decreases by a factor of 2. However, the correlation length for the emissivity chain remains almost the same.For the latter choice of step sizes, the correlation length of dust emissivity and global offset chains are around 10 and 102, respectively.Therefore, we adopt the second choice of step sizes for the HMC sampling at all frequencies.Along with these {∆ϵ, ∆O}, we choose the number of Leap-Frog steps N = 10, which gives a reasonable acceptance rate and lower correlation length.We check the convergence of the chains using the Gelman-Rubin test (Gelman & Rubin 1992) (for details, see Appendix A1).Using seven independent chains, we find Gelman-Rubin Markov Chain Monte Carlo (MCMC) convergence diagnostics are 1.0002 (for emissivity) and 1.002 (for offset).These values confirm the chains are converged to a reasonable accuracy.
To check the correlations between the model parameters, we show the joint distributions of the dust emissivity at three superpixels and the global offset in Figure 3 at 353 GHz.We do not find a significant correlation between the dust emissivities at two different superpixels or between the emissivity at a given superpixel and the offset.We also show the marginalised posterior distributions of the respective parameters along with their posterior mean values (red dashed line) and corresponding input values (orange dashed line) used in the Planck simulation.We find the inferred posterior mean values of the parameters agree with their respective input values.
To quantify the goodness of fit, we use posterior predictive checks considering the pixels included in the analysis, which are a subset of those in the MaskHIQ mask.This is due to the exclusion of superpixels that do not satisfy the threshold criteria for the number of subpixels.Most of the boundary pixels in the MaskHIQ mask do not satisfy the criteria and are excluded from further analysis.We term the resultant extended mask as eMaskHIQ.We replicate the simulated data based on the mean and standard deviation of the parameter samples.The distribution of the original sim- Hi column density (⟨N Hi (Ω j i )⟩).The mean ⟨..⟩ is taken over all unmasked subpixels (i) that fall within a given superpixel.The "dot" symbol represents ϵ j ν and "plus" represents σ ϵ j ν at all Planck frequencies considered in this analysis.
ulated data and the replicated data match very well.Both the distributions are non-Gaussian due to the non-Gaussian nature of the thermal dust emission, which has a dominant contribution.We use summary statistics like median and 95% quantile level to test the consistency between the original and the replicated simulated data.The probabilityto-exceed (PTE) statistics are used to test the probability that the summary statistics of the replicated data exceed the original data.The PTE values of the summary statistics are quoted in Table 2.
From the MC chains of the parameters, we compute the mean and standard deviation of the respective parameters.
In order to quantify the accuracy of inference, in Figure 4, we present the distribution of the standard deviation (σ ϵ j out ) normalised difference, δ j = (ϵ j inp − ϵ j out )/σ ϵ j out of input dust emissivity at all superpixels (ϵ j inp ) and the posterior mean emissivity at respective superpixels (ϵ j out ), for all four frequency bands.We also show the same quantity for the offset with vertical dashed lines.The mean values obtained using the MC samples agree very well with their respective mean values and are within 3σ deviations.The largely symmetric nature of the histograms implies that the best-fit values of output dust emissivities for all superpixels are unbiased.
In Figure 5, we show the mean and standard deviation of dust emissivity per superpixel obtained from the dust-Hi correlation analysis as a function of mean Galactic Hi column density at those superpixels.The uncertainty decreases with an increase in mean Galactic Hi column density, indicating a better estimation of dust emissivity in high column density regions within the sky mask MaskHI6.This is expected as the uncertainty scales with 1/µO ν (see Equation ( 21)) and is inversely proportional to NHi.
We use the mean of the parameter samples as the bestfit value of the respective parameter (Mackay 2003).Owing to the linear nature of the signal model, the signal that corresponds to the best-fit values of emissivity and the offset is also the best-fit signal.We obtain the best-fit intensity map corresponding to the signal model using the mean values of the dust emissivity and the global offset following Equation (3).In Figure 6, we show the simulated map at 353 GHz along with maps of the best-fit model and the residual.The residual map is the difference between the input and best-fit model intensity maps.The residual map contains the contribution from the instrument noise and the CIB.The residual map shows the small-scale structure, lacking large-scale fluctuations, consistent with the nature of instrument noise and the CIB.While the map shows the spatial distribution of the residual, to check the nature of the distribution of the residual, in Figure 7, we compare it (black) with the expected residual contribution from instrument noise and the CIB (magenta).We find good agreement between the recovered and the expected residual distribution at all Planck frequencies, indicating that the analysis does not introduce systematic bias.At the map level, the residual map over the unmasked region strongly correlates with the expected residual map obtained by combining the input CIB and instrumental noise map at all Planck frequencies.The Pearson correlation coefficients are 0.97, 0.96, 0.96, and 0.96 at 217, 353, 545, and 857 GHz, respectively.Offset is a pixel-independent parameter; hence, we expect its inference to be unaffected by mask choice.We infer the global offset values for different mask choices using simulated maps to test this.In Table 1, we list the posterior mean values of the global offsets along with 1σ error bars and the respective masks.Irrespective of the choice of mask, the inferred offset values at all frequencies are consistent with their input values.This indicates the stability of the analysis for different choices of masks.
To elucidate the effect of the CIB noise in our analysis, we redo the analysis without the CIB contribution in the noise covariance matrix.In Figure 8, we compare the standard deviation of the dust emissivity and offset inferred with and without considering the CIB in the noise covariance matrix.We plot the ratio of these two standard deviations for all the superpixels against the mean ⟨NHi⟩ value at the respective superpixel.The uncertainty ratio in the offset parameter with and without the CIB for all four frequencies is depicted with horizontal solid lines.There is a weak dependence between the ratio of σϵ with and without the CIB as a function of NHi value.The ratio increases with the increasing frequency, consistent with the fact that the CIB noise dominates over instrumental noise at higher HFI frequency (Planck intermediate results.XVII.2014).
We have shown that the method gives an unbiased inference of emissivity and the offset in the presence of realistic The ratio is plotted for all the superpixels over the unmasked sky area against their mean N Hi value ("plus" marker).The horizontal lines represent the ratio of the uncertainty in the offset parameter, which is a single number at a given frequency, with and without the CIB in the noise covariance.
noise.We can faithfully take into account the noise arising due to the CIB, including the CIB-induced inter-pixel correlations within a superpixel, while we neglect the correlation between subpixels belonging to two different superpixels.The implication of considering the CIB noise is evident in Figure 8.Not considering the CIB can lead to underestimating the uncertainty by orders of magnitude and possible bias in the inference.Further, joint sampling of emissivities with the global offset mitigates any bias that may arise due to the biased or position-dependent value of the offset.
In this simulation section, we completely ignore the contribution of the Galactic residual.Assuming the same SED for Hi-correlated dust emission and Galactic residuals, one can expect it to dominate over the CIB at Planck's highest frequency, 857 GHz.Like the CIB emission, we can incorporate the contribution of Galactic residuals in the noise covariance term.

Validation of two-template fit with Galactic Hi and MS templates
To test the method's robustness, we repeat the same analysis on the 353 GHz simulated map that has a contribution from the two NHi templates.We use the MS and Galactic Hi templates to simulate map 353 GHz.The MS template traces the gas from IVC and HVC.We add the MS template with a constant emissivity ϵ MS 353 = 10 −2 ⟨ϵ Hi 353 ⟩, where ⟨ϵ Hi 353 ⟩ is the average over all the superpixels of the input dust emissivity map (ϵ353 map as used in Section 4.1).The corresponding input values for the dust emissivities are ⟨ϵ Hi 353 ⟩ = 37.3 kJy sr −1 (10 20 cm −2 ) −1 and ϵ MS 353 = 0.37 kJy sr −1 (10 20 cm −2 ) −1 , respectively, and the offset is 120 kJy sr −1 , the same as that used in Section 4.1.We infer the global offset and the emissivity parameter per superpixels for both the Galactic Hi template and the MS template over MaskHI6 performing the HMC methodology as discussed in Section 2.4. Figure 9 shows that the recovered emissivity values are well within 3σ of the input value without significant bias.The inferred global offset is 120.0 ± 0.3 kJy sr −1 , which is close to the input global offset value.The recovered emissivities are quoted as the mean of the dust emissivity values over all valid superpixels, along with its standard deviation.They are respectively, 37.3 kJy sr −1 (10 20 cm −2 ) −1 and 7.2 kJy sr −1 (10 20 cm −2 ) −1 for the Galactic Hi template while 0.39 kJy sr −1 (10 20 cm −2 ) −1 and 0.27 kJy sr −1 (10 20 cm −2 ) −1 for the MS template.

Validation with a residual CMB dipole term
Given that we infer a global offset parameter from the partial sky, the presence of residual CMB dipole can bias its inference.We demonstrate that the method can infer a residual CMB dipole and the global offset jointly.We use this exercise to assess the extent to which the residual dipole affects the inference of the global offset from the partial sky.
We simulate a single realization of the Planck map at 217 GHz with the residual CMB dipole amplitude A dip = 3.9 kJy sr −1 , corresponding to 8 µK in KCMB unit.We choose this representative amplitude approximately equal to the difference between WMAP and Planck estimates of the Solar system dipole (Planck 2018 results.I. 2020).The direction (l dip , b dip ) = (264.0• , 48.3 • ) is chosen same as the Solar dipole (Planck 2013 results.XXVII.2014).Corresponding to these real space dipole coordinates, the input values of the harmonic coefficients in Equation ( 10) are (a1,0, a R 1,1 , a I 1,1 ) = (5.9,0.4, −3.7) kJy sr −1 .The value of the global offset is O = 40 kJy sr −1 (same as in Table 1), and the input dust emissivity averaged over the superpixels for 217 GHz is 8.15 kJy sr −1 (10 20 cm −2 ) −1 .Following the procedure detailed in Section 2.4, we jointly sample emissivity per superpixel, the global offset, and the harmonic coefficients using variable Leap-Frog jumps to reduce the corre- lation among the parameters (as discussed in Appendix B).
We recover the offset O = 39.2 ± 0.6 kJy sr −1 and the harmonic coefficients as a1,0 = 3.9 ± 1.5, a R 1,1 = −0.1 ± 0.3 and a I 1,1 = −4.4± 0.4 in kJy sr −1 for maskHI6.The recovered emissivity is within 3σ as verified from a normalised histogram, indicating the method works well.The harmonic coefficients are related to the dipole amplitude and Galactic longitude and latitude, respectively, as We calculate A dip and (l dip , b dip ) for each 1.9 × 10 4 samples using Equation ( 24).The joint and marginalised probability distributions of the global offset, dipole amplitude and direction are shown in Figure 10.The corresponding mean values with standard deviations are obtained as A dip = 3.7 ± 0.3 kJy sr −1 and direction (l dip , b dip ) = (271.4± 4.2 • , 31.0 ± 10.9 • ).We find that the amplitude of the fitted dipole is positively correlated with the global offset.This is expected due to the dipole direction lying in the northern Galactic part, which is almost opposite to the SGP.The increased dipole amplitude corresponds to more negative dipole fluctuation in the SGP region, which is compensated by the global offset parameter increase.A similar reason leads to the positive correlation between the b dip and the offset.Lower values of b dip correspond to the most negative part of the dipole pointing away from the SGP, leading to the dipole compensating for a greater fraction of the positive zero level and letting the offset have a relatively smaller value.Including dipole in the analysis does not significantly change the inference of the offset, indicating unbiased inference.However, due to the correlated nature of the offset and the dipole parameters, there is a significant increase in the uncertainty of the offset parameter.Table 3 presents the recovered offset and dipole parameters as a function of sky masks at 217 GHz.As the sky area increases, the uncertainty of the global offset and three dipole parameters decreases.
We repeat the analysis with the residual CMB dipole at 353 GHz using the same Planck simulations.The same residual CMB dipole amplitude at 353 GHz is A dip = 2.4 kJy sr −1 (in intensity units).We performed the same analysis and recovered the offset O = 119.0± 2.4 kJy sr −1 and dipole amplitude A dip = 3.7 ± 1.4 kJy sr −1 and direction as (l dip , b dip ) = (277.4± 31.1 • , 14.9 ± 43.2 • ).The difference between input and the recovered values of the O, A dip , l dip and b dip are respectively 0.4σ, −0.9σ, −0.4σ and 0.8σ, σ being the uncertainty on the respective parameters.The uncertainty of inferred dipole parameters at 353 GHz is larger than that at 217 GHz.This is due to higher noise at 353 GHz, where the contribution from the CIB is higher than at 217 GHz (see Figure 1).The lower amplitude of residual CMB dipole at 353 GHz (in intensity units) and higher uncertainty results in a low signal-to-noise ratio, but the recovered value is consistent with the input within 1σ.Our results show that the residual CMB dipole contribution can be ignored at frequencies 353 GHz and above.We conclude that at the noise level considered here, if the offset is larger than the amplitude of the dipole, neglecting the residual CMB dipole would not lead to significant bias in the inference of global offset.

PLANCK DATA RESULTS
In this section, we discuss the analysis results of the 353 GHz CMB-subtracted Planck intensity map.We model the data with pixel-dependent dust emissivity at N side = 32 resolution and a global offset.For this analysis, we do not consider the residual CMB dipole contribution in the signal model.As shown with the simulations, the noise at 353 GHz leads to increased uncertainty on the residual CMB dipole parameters as compared to the same inference at 217 GHz.However, we do test the robustness of the offset to the addition of the MS template, discussed later in this section.We use the same superpixel and subpixel resolution as in the analysis of the simulations.We apply the HMC sampler as discussed in Section 2.4.We use the same values for the HMC sampler hyper-parameters (Leap-Frog step sizes and the number of jumps) as were used in the simulations for the analysis of the Planck data.We obtain 2 × 10 4 samples of the emissivity and the offset parameters over MaskHI6, discard the first 10 3 samples, and perform the analysis with the remaining 1.9 × 10 4 samples.Using the remaining 1.9 × 10 4 MC samples, we compute the posterior mean and standard deviation of the model parameters.The Gelman-Rubin convergence diagnostics values for the Planck 353 GHz data estimated using seven independent chains are 1.0003 (for emissivity) and 1.003 (for offset).This shows that the chains are converged.
In Figure 11, we show the map of the mean and the standard deviation of the dust emissivity for analysis with the MaskHI6 mask.The mean ϵ map shows the fluctuations that extend down to the scales of superpixels, indicating small-scale variations (1.8 • pixel size) in dust emissivity.The standard deviation map shows the inhomogeneity in the uncertainty of the dust emissivity.The NHi map partly determines the inhomogeneity in σϵ map as the instrument noise and the CIB are fairly statistically isotropic.This map also allows us to assess the spatial variation in the dust emissivity.The spatial average of the dust emissivity map in Figure 11 is 0.031 MJy sr −1 (10 20 cm −2 ) −1 and the standard deviation of the dust emissivity values over the unmasked sky area is 0.007 MJy sr −1 (10 20 cm −2 ) −1 .Note that, as shown in  We obtain the signal model map using the mean dust emissivity map and the mean global offset.As superpixels do not overlap and the emissivity varies over the superpixels, the estimated mean dust emissivity map becomes discontin-uous at the scale of HEALPix superpixel with 1.8 • angular size (pixel size at N side = 32).To avoid this discontinuity, we first obtain the dust emissivity map at N side = 512 by interpolating the N side = 32 pixel values using interpolate.griddatafunction of scipy and then smooth the resultant mean dust emissivity map with a Gaussian kernel of 1.8 • FWHM.We construct the signal model map using the smoothed dust emissivity map and the mean value of the global offset following Equation (3).We show the results of this analysis in Figure 13.The leftmost panel in Figure 13 shows the CMB-subtracted Planck intensity map at 353 GHz over the extended unmasked region eMaskHI6, and the second panel shows the signal map.The third panel in Figure 13 shows the difference between the CMB-subtracted Planck 353 GHz intensity map and signal model map over the same sky mask.
To compare the residual distribution with the expected distribution of the residual at 353 GHz, we simulate a random Gaussian realization of the CIB at a beam resolution of 16.2 ′ (FWHM) using the best-fit CIB model angular power spectrum and an uncorrelated Gaussian white noise realization from the E2E smoothed noise variance map and add them.The expected distribution of the total noise is centred around zero because both the CIB and the instrumental noise have a zero mean.If our model assumptions are correct, the distribution of the residual should be consistent with the distribution of the expected fluctuations from the CIB and instrument noise, similar to as seen in the simulated data (see Figure 7).The fourth panel of Figure 13 shows the residual for masks with different sky fractions and compares it with the residual expected from the CIB and the instrument noise for the respective sky fraction.Though the residual distributions obtained using different MaskHIQ masks are mostly consistent with the expected model residuals from the CIB and instrumental noise contributions, they have a minor fraction of pixels contributing to the non-Gaussian tails with increasing sky fractions.The residual histogram is in logarithmic scale to highlight the asymmetric nature of the positive tail arising from a very small number of localised pixels in which dust is uncorrelated with the Galactic Hi.
Table 4 shows the recovered offset and recovered mean emissivities at 353 GHz over different sky masks.As the available sky area decreases, the mean value of dust emissivity decreases, indicating a Hi column density dependence of dust emissivity.We also see an opposite trend in the global offset value, which increases with the reduction in the sky area and decreasing average emissivity.This is expected as a decrease in the global offset value compensates for an increase in the mean dust emissivity.Over the range of NHi thresholds and corresponding sky area, the inferred offset parameter ranges from 0.1358 MJy sr −1 (over MaskHI2) to 0.1284 MJy sr −1 (over MaskHI6).This range encompasses the value of the CIB monopole in the 353 GHz intensity map.The offset we infer is the total zero-level of the map that includes the CIB or any other residual zero-level.The differences between the CIB monopole value and the value we infer could be due to the differences in (1) the smoothing scale of the Planck and Hi column density map, (2) the Planck Galactic zero-level is estimated over a larger sky fraction (28% of the sky) than considered here, and (3) the difference in the local velocity cloud Hi column density thresh-old (< 3 × 10 20 cm −2 ) from the LAB survey (Kalberla et al. 2005).

SUMMARY
We demonstrate the use of the Bayesian inference method to estimate the pixel-dependent dust emissivity and the global offset in the diffuse interstellar medium at far-infrared and submillimeter frequencies.We utilise the dust-Hi correlation to model the Galactic thermal dust emission, using the integrated Galactic Hi column density map from the GASS survey (McClure-Griffiths et al. 2009) as a template.The signal model incorporates spatially varying dust emissivity and global offset over a 6300 deg 2 region centred around the southern Galactic pole.The HMC method allows efficient sampling of the high dimensional posterior distribution of the dust emissivity and the offset parameters.
We first validate the method on the Planck simulations, which include the CIB and instrumental noise.The dust emissivity parameters are fixed based on earlier Planck analysis (Planck intermediate results.XVII. 2014).This validation process allows us to test the analysis pipeline and fix the parameters of the HMC sampler that we use for the real Planck data.Given that the data is on the partial sky, the inference of the offset can be biased if any residual CMB dipole is not taken into account.We demonstrate the nature of the inferred parameters in the presence of residual CMB dipole at 217 GHz.We show that the amplitude of the residual CMB dipole is highly correlated with the global offset parameter for partial sky analysis.A small residual CMB dipole does not significantly affect the global offset inference beyond the increase in the error bar of the global offset, whose uncertainty is still significantly smaller than the global offset value at 217 GHz.At 353 GHz, the error bar on the fitted residual CMB dipole term increases due to increased noise contributions from the CIB and the instrumental noise.However, we note that the joint dipole and offset inference will be important in the case of application to the NPIPE data where the CMB dipole is retained in the frequency maps (Planck intermediate results.LVII.2020).We further test the robustness of the inferred offset in the presence of multiple Hi templates in the signal model.We consider the emission from the MS as an additional component in the signal model.The global offset value does not change significantly when considering the two-template analysis -Galactic Hi and MS templates.
As a demonstration of the method, we apply the same HMC methodology to the Planck intensity map at 353 GHz.Results of this analysis are depicted in Figure 11, Figure 12, Figure 13, and Table 4.As shown in Figure 11, we infer the  As expected, we find the inferred value of the global offset is stable for different sky masks.We also find that the mean value of the dust emissivity decreases as we go to the low column density regions or lower sky masks.The non-Gaussian tail in the residual distribution (caused by the emission not correlated with NHi) does depend on NHi threshold and sky fraction.This additional emission component can be partly dealt with as an additional noise component in the data model, denoted by I R ν in Equation ( 1), which we leave for future work.
The methodology introduced in this paper opens a new way to jointly estimate the pixel-dependent dust emissivity and a global offset over the field of interest using the dust-Hi correlation.The HMC method is able to sample around 2 × 10 3 parameters and infer the parameter posterior distribution.This method can be useful in studying the 3D variation of dust SEDs to constrain the frequency decorrelation of dust B-modes.In the future study, we will apply this technique to study the dust emissivity properties associated with different Hi phases (CNM, LNM, and WNM) of diffuse ISM considered in Adak et al. (2020) and Ghosh et al. (2017), which will be useful to extend the sky model of dust polarisation at multiple sub-mm frequencies.The residual maps estimated using this method will be useful for estimating the CIB maps at the best angular resolution of Planck.In the forthcoming paper, we will apply the same methodology to other Planck HFI frequency maps to separate the thermal dust emission from the CIB emission and characterise the dust emissivity parameters.oration for the same.This work has used Hi data of the GASS survey headed by the Parkes Radio Telescope, a part of the Australia Telescope funded by the Commonwealth of Australia for operation as a National Facility managed by CSIRO.Some of the results in this paper have been derived using the HEALPix package.All the computations in this paper were run on the Aquila cluster at NISER.Here, we discuss some considerations for analysing these Markov chains before using these chains to draw the inference.
Burn-in: We neglect some of the initial samples from the chain as Burn-in.To quantitatively determine the Burn-in sample size, we monitor the χ 2 using a model map estimated with the increasing number of samples in the chains of the parameters.The model map was built from the posterior mean of the parameter chain with an increasing number of samples.Burn-in sample then consists of the initial points where χ 2 is away from the total number of degrees of freedom.In practice, we discard the first 10 3 samples, which is much larger than the Burn-in sample determined using χ 2 criteria.Note that, in the absence of explicit priors on the parameters and Gaussian likelihood, χ 2 is equal to the logarithm of the posterior up to a constant term.
Correlation length: When we draw the samples from the target distribution, all the samples may not be independent because of the correlation within a chain of samples.The effective number of the independent samples out of the total samples, Ns, is n eff = Ns/L, where correlation length, L, is defined as, where ρ(t) is the auto-correlation coefficient of the chain for lag t (Taylor et al. 2008).
Correlation length depends on the total jump, s = N ∆ in the Leap-Frog scheme and the scale of the distribution of the parameters of the problem.The total jump s should be of the order of scale of distribution of the parameters to get less correlated samples.∆ controls the accuracy with which we implement Hamiltonian dynamics.This, in turn, determines the acceptance rate of the proposed sample.For a given ∆, N determines how far the proposed sample is from the current position.If ∆ is chosen to be too small, N needs to be large enough to move a sufficient distance along the trajectory, resulting in increased computation time.In contrast, for the choice of a large value of ∆, the computation of the phase space trajectories becomes relatively less accurate, resulting in a reduced acceptance rate.If N is chosen to be small, samples are correlated, whereas too large a value of N may bring back the proposed sample very close to the starting point after completing Leap-Frog jumps (Neal 2012).In Appendix B, we show the difference in correlation length for fixed and variable Leap-Frog jumps.The accuracy of Hamiltonian dynamics can be increased by efficient choice of total jump in the Leap-Frog scheme (Hoffman & Gelman 2011;Bou-Rabee & Sanz-Serna 2017;Hoffman et al. 2021b;Sountsov & Hoffman 2022).
As discussed in Section 2.4, we choose the mass matrix µ to make step size ∆ independent of the scale of the target distribution.In the approximations considered in our analysis, neglecting the correlation between subpixels belonging to two different superpixels renders the mass matrix sparse and diagonal dominant.The only non-zero off-diagonal terms are those due to the cross-correlation among emissivity, offset and spherical harmonic coefficients (see Section 2.4).The rest of the off-diagonal terms are the cross-correlation between emissivities at different superpixels, which are zero due to the approximations considered in our analysis.Inverting the mass matrix and evolving the vector forms of the Leap-Frog equations with µ −1 are relatively computationally costly when dealing with the non-diagonal mass matrix.Therefore, we neglect the off-diagonal terms in the column (and row) of the mass matrix that connect the emissivity and offset parameters.As a consequence of this choice, we have to choose two separate step sizes, ∆ϵ and ∆O, for emissivity and offset, respectively.For the various schemes used to tune the HMC hyper-parameters, see (Hoffman & Gelman 2011;Bou-Rabee & Sanz-Serna 2017;Hoffman et al. 2021b;Sountsov & Hoffman 2022) and for their implementation, refer to the probabilistic programming frameworks such as STAN (Carpenter et al. 2017), PyMC (Salvatier et al. 2015), and pyro (Bingham et al. 2019).
Convergence test: To quantify the convergence of the Monte Carlo sample, we adopt the Gelman-Rubin test (Gelman & Rubin 1992).To implement this test, one needs multiple Monte Carlo chains starting with sufficiently separated positions in the parameter space.One then computes the following ratio R, which should be ideally equal to one.
where V is the variance of the given parameter between the chain and W is the variance of the same parameter along the chain (Brooks & Gelman 1998;Heavens 2009).In practice, it is recommended that R should be less than 1.01 to consider the sample as converged to the distribution being sampled (Vehtari et al. 2021).

APPENDIX B: VARIABLE LEAP-FROG STEPS
To investigate the change in correlation length with Leap-Frog jumps N , we perform the HMC sampling with variable N .Following Neal (2012), we randomly draw N at each iteration from a uniform distribution, U [9, 15] and apply the HMC sampler on the simulated map at 353 GHz over MaskHI6 as discussed in Section 2.4.We then compute the compute the auto-correlation coefficient of emissivity ϵ at one superpixel and the global offset O as in Section 4.2.
Figure B1 shows the auto-correlation coefficient for fixed Leap-Frog jumps N = 10 (Section 4.2) and variable Leap-Frog jumps N ∈ [9,15].It shows that for variable N , the correlation length decreases considerably for O. Variable N minimises the correlation among the various global parameters for multiple template fit or fit with dipole contribution.

Figure 2 .
Figure 2. The auto-correlation coefficient, ρ(t), at 353 GHz for the samples of the emissivity in one superpixel (ϵ) and the global offset (O) as a function of the lag (t) for two different choices of Leap-Frog step size (∆ O ) for the global offset and fixed ∆ϵ = 0.1.For ∆ O = 0.1, the correlation length for ϵ and O are 8 and 220, respectively.For ∆ O = 1.0, the correlation length for ϵ in the same pixel and O are 10 and 102, respectively.

Figure 3 .
Figure3.The joint and marginalised probability distributions of dust emissivities (expressed in kJy sr −1 (10 20 cm −2 ) −1 ) at three representative superpixels (pixel index as super-script) and the global offset (in kJy sr −1 ) at 353 GHz.The red lines mark the posterior mean, and the orange lines depict the input values of the respective parameters.The contours mark the 68% and 90% regions of the joint distributions.The vertical black dashed lines in the histogram mark 16, 50, and 84 percentiles of the distribution.

Figure 6 .Figure 7 .
Figure 6.The left panel shows the input simulated map at 353 GHz (in units of kJy sr −1 ) over MaskHI6.The middle panel shows the signal model map (in kJy sr −1 ) derived from the mean values of the emissivity and the offset obtained using the HMC sampler.The right panel depicts the residual map (in kJy sr −1 ) obtained by subtracting the signal model map from the input map at 353 GHz.The residual map has contributions from the CIB and the instrumental noise.

Figure 8 .
Figure8.The ratio of 1σ uncertainty in dust emissivity with and without the CIB contribution in the noise covariance term.The ratio is plotted for all the superpixels over the unmasked sky area against their mean N Hi value ("plus" marker).The horizontal lines represent the ratio of the uncertainty in the offset parameter, which is a single number at a given frequency, with and without the CIB in the noise covariance.

Figure 9 .
Figure 9.The histograms of the normalised difference of input and the posterior mean emissivities for Galactic Hi and MS template at 353 GHz.The vertical line shows the normalised difference between the input and output global offset at the same frequency.

Figure 10 .
Figure 10.The joint and marginalised probability distributions of global offset and the three dipole parameters at 217 GHz.The global offset and residual CMB dipole amplitude are expressed in kJy sr −1 unit and the residual CMB dipole direction in degrees.Details are the same as Figure 3.

Figure 11 .
Figure 11.The mean (left) and the standard deviation (right) of the dust emissivity obtained from the Planck 353 GHz intensity map.Both the maps are in MJy sr −1 (10 20 cm −2 ) −1 units.These are for the analysis with the MaskHI6 mask, which has f sky = 15.3%.

Figure 12 .
Figure 12.Upper panel: Histogram of mean dust emissivity obtained in the Planck 353 GHz intensity map analysis over MaskHI6.The map of these dust emissivity values is given in the left panel of Figure 11.Lower panel: Variation of dust emissivity as a function of mean dust column density.⟨N Hi ⟩ represents the bin average Hi column density and ⟨ϵ⟩ are the average over the superpixels within the same N Hi bins.The error bars are the standard deviations in the respective bins.

Figure 12 ,
Figure 12, the spatial distribution of the dust emissivity is slightly skewed towards higher values.The upper panel of Figure 12 shows the histogram of dust emissivity values in the ϵ map depicted in Figure 11.The lower panel of Figure 12 shows the variation of dust emissivity as a function of NHi.The data point and the error bar represent the mean and standard deviation of the emissivity computed over superpixels that fall within the given NHi bin.The average dust emissivity increases with increasing Hi column density.Our spatial average value of the dust emissivity is ≈ 20% lower than the value quoted in Planck intermediate results.XVII.(2014).This difference could arise due to (1) different aperture sizes considered for the dust-Hi correlation analysis in Planck intermediate results.XVII.(2014) and (2) change in the calibration of the Planck 353 GHz map from PR1 to PR3.For MaskHI6, we report the global offset value equal to 0.1284 ± 0.0003 MJy sr −1 .We obtain the signal model map using the mean dust emissivity map and the mean global offset.As superpixels do not overlap and the emissivity varies over the superpixels, the estimated mean dust emissivity map becomes discontin-

Figure 13 .
Figure 13.Results of analysis with Planck data at 353 GHz in the MJy sr −1 unit.First panel: the CMB subtracted Planck intensity map at 353 GHz over the South Galactic Pole region, Second panel: the estimate of the signal map obtained using the posterior mean emissivity and the offset, Third panel: the residual map, which is the difference between the difference map and the signal estimate, Fourth panel: histogram of the residual map for the analysis with different sky masks.The dashed blue curve shows the expected residual based on the CIB and the instrument noise.

Figure B1 .
Figure B1.Same as Figure 2, but for fixed and variable values of Leap-Frog jumps N .For a fixed N = 10, the correlation length for ϵ and O are 10 and 102, respectively.For variable N ∈ [9, 15], the correlation length for ϵ and O are 7 and 57, respectively.
5, Bν is Planck black-body function and T d is the dust temperature fixed to 20 K (Planck intermediate results.XXII 2015).(iii) We simulate the dust intensity maps at 217, 353, 545, and 857 GHz at N side = 512 using the dust emissivity map and Galactic Hi template.We add the global offset values from Planck intermediate results.XLVIII.(2016) to produce the signal model maps at all HFI frequencies.

Table 1 .
Results of analysis of simulated maps at four Planck HFI frequencies.The global offset values with corresponding 1σ error bars are estimated over five sky masks, tracing low to intermediate Hi column density regions.The inference of offset is stable with respect to sky coverage, with a slight decrease in the uncertainty with higher f sky .

Table 2 .
The PTE values obtained for all Planck frequencies considered for our analysis over eMaskHI6.They are defined as the probability of obtaining summary statistics (median and 95% quantile) larger than fitted data, based on 1000 simulations with dust plus CIB and Planck noise.The PTE values are expressed as a percentage.

Table 3 .
Recovered global offset (O), dipole amplitude (A dip ) and Galactic longitude (l dip ) and latitude (b dip ) estimated over five different N Hi cutoff sky masks for simulated map at 217 GHz.The corresponding input values are respectively O = 40 kJy sr −1 , A dip = 3.9 kJy sr −1 , l dip = 264.0• , b dip = 48.3• .Column δ represents the difference between the input and the recovered parameter value in units of 1σ uncertainty.

Table 4 .
Béthermin et al. (2012)Planck intensity map at 353 GHz.We list the global offset (O) values and the mean dust emissivity over the superpixels (⟨ϵ j ⟩) estimated over different sky masks corresponding to different N Hi cutoffs.Column σ represents the uncertainty on the mean dust emissivity ⟨ϵ j ⟩.We also include the estimates from the Planck collaboration analysis for comparison.MJy sr −1 (10 20 cm −2 ) −1 σ MJy sr −1 (10 20 cm −2 ) −1 emissivity of the dust over the sky area ≈ 7500 deg 2 and its variation at scales larger than 1.8 • .For the region of interest over 7500 deg 2 area centered around the southern Galactic pole with NHi threshold NHi < 6 × 10 20 cm −2 , the spatial average of dust emissivity is 0.031 MJy sr −1 (10 20 cm −2 ) −1 with standard deviation of 0.007 MJy sr −1 (10 20 cm −2 ) −1 .The inferred offset for MaskHI6 is 0.1284±0.0003MJysr−1 is close to the monopole of theBéthermin et al. (2012)CIB model value 0.130 ± 0.038 MJy sr −1 added to the Planck intensity map at 353 GHz after setting the Galactic zero level (Planck 2018 results.I. 2020).We further performed the same analysis on the smaller sky mask by putting lower NHi thresholds.
This work was supported by the Science and Engineering Research Board, Department of Science and Technology, Govt. of India, grant number SERB/ECR/2018/000826. D. Adak acknowledges UGC for providing Ph.D. fellowship when a major part of this project is done.D. Adak acknowledges IMSc for providing a postdoc fellowship for one year when the rest of the project is completed.S. Shaikh acknowledges support from SERB for the Research Associate position at NISER, during which a part of this work was performed, and the Beus Center for Cosmic Foundations for current support.S. Sinha would like to thank NISER Bhubaneswar for the postdoctoral fellowship.G. Lagache acknowledges funding from the European Research Council (ERC) under the European Union's Horizon 2020 research and innovation programme (grant agreement No 788212) and from the Excellence Initiative of Aix-Marseille University-A*Midex, a French "Investissements d'Avenir" programme.A1 Burn-in, Correlation Length, and Convergence Test