CMB foreground: A concise review

Observations of Cosmic Microwave Background (CMB) anisotropies have become the most powerful probe of the early universe and cosmology. Recent satellite CMB experiments have revealed the all-sky view at microwave frequencies, and have provided a wealth of information not only about the cosmological CMB but also about various astrophysical processes in the Galaxy through foreground emissions. An accurate understanding of the foreground emissions is essential for determining the cosmological CMB signal precisely, especially when one tries to estimate the weaker polarization signals from the CMB anisotropies. Here we briefly review recent progress on foreground subtraction methods and the current knowledge about the foregrounds, focusing mainly on the large-scale, diffuse, Galactic components.


Introduction
Among cosmological observables, the Cosmic Microwave Background (CMB) anisotropies are considered the most powerful probe of the early universe and cosmology. The satellite experiments COBE [1], WMAP [2], and PLANCK [3], along with balloon-borne and ground-based experiments such as DASI [4], MAXIMA [5], BOOMERang [6], CBI [7], ACBAR [8], ACT [9], and SPT [10], have already put strong constraints on cosmological parameters. The constraints are mainly derived from the total intensity (T mode; temperature) anisotropies. CMB photons are polarized at the 10% level by Thomson scattering at the recombination and reionization epochs (E and B modes; nonlocal combinations of the Stokes Q and U parameters), and ongoing observations of polarization anisotropies by, e.g., PLANCK, QUIET [11], PolarBEAR [12], SPTPol [13], ACTPol [14], and QUIJOTE [15] are expected to provide further information about cosmology. In particular, a detection of the B-mode will constitute strong evidence for the existence of primordial gravitational waves, because B-mode polarization is not generated by the scalar density mode at linear order in perturbation theory (gravitational lensing can generate non-primordial B-modes from E-modes at second order; for a review, see [16]).
Such cosmological information is available only when the sources that contaminate the estimation of the CMB are removed successfully. Because most of the cosmological information comes from large angular scales of the CMB anisotropies, say θ ≳ 0.1°, and because recent CMB instruments have sufficient sensitivity and angular resolution, the main source of uncertainty is the contamination by foreground emissions from the Galaxy rather than the instrumental noise itself. Which foreground sources dominate at frequencies relevant to the CMB? It is now well known that synchrotron emission from the Galaxy dominates at low microwave frequencies (≲30 GHz), while thermal dust emission dominates at higher frequencies (≳70 GHz). Between the two in frequency, it has been argued that thermal free-free emission and non-thermal dust emission, possibly due to spinning dust grains, become important [17] (Fig. 1). In addition, it has become clear that the rotational transitions of carbon monoxide (CO) significantly contaminate the PLANCK observing bands [18]. One purpose of this brief review is therefore to summarize the current status of our knowledge about each component of the Galactic foregrounds.
Even though understanding the foregrounds is undoubtedly important, they are not a major issue for observations of temperature anisotropies, because it is observationally evident that cosmological CMB photons dominate at high Galactic latitudes. The situation is similar for E-mode observations, because both the foregrounds and the cosmological signal are polarized at the ∼10% level. However, the primordial B-mode signal is expected to be 1% of the foreground emission, and thus extracting such a faint signal will be challenging. If the B-mode signal turns out to be 10% of the current upper limit, the Galactic foregrounds are larger than the signal at all frequencies. Since the foreground emission is relatively smooth, it becomes more important on large angular scales (say, ≳1°). Therefore, the development of foreground subtraction methods has become increasingly important, and many methods have been proposed so far, based on analyses of data at different frequencies and on the different frequency dependence of the astrophysical emission laws [19].
After the initial draft was submitted, a striking result was announced by the BICEP2 collaboration [20]. They claim that "we find an excess of B-mode power over the base lensed-ΛCDM expectation in the range 30 < ℓ < 150, inconsistent with the null hypothesis at a significance of >5σ" [21]. If this excess is interpreted as a signal due to primordial gravitational waves, the tensor-to-scalar ratio should be as large as r ≈ 0.2. This is somewhat in tension with the WMAP and PLANCK temperature angular power spectra, which suggest r ≲ 0.1 [22], and it has stimulated further detailed investigation. The observation was made toward the "Southern Hole" direction, where polarized foregrounds are expected to be especially low (see, however, [23]). To confirm that the excess is the primordial B-mode due to gravitational waves, one obvious way is to measure the reionization bump, which is a firm prediction of the primordial B-mode due to gravitational waves. Since the reionization bump is expected in the multipole range ℓ ≲ 10, we need to analyze a wider sky region than that observed by BICEP2, and the Galactic foreground contamination correspondingly becomes more significant there. Wider sky coverage is also important because super-horizon correlations, i.e., correlations at angular separations greater than the angle subtended by the particle horizon when the polarization was generated, are essential for identifying the B-mode due to inflationary gravitational waves [24]. Therefore, precise knowledge about foreground emissions and accurate foreground removal methods will remain important even in the post-discovery era.
PTEP 2014, 06B109 K. Ichiki
This brief review consists of two parts. In the next section we summarize the current status of our knowledge about each component of the Galactic foregrounds, paying particular attention to their contribution to polarization measurements. In Sect. 3 we list the proposed methods for foreground subtraction or component separation, though not exhaustively, and discuss their performance. Section 4 is devoted to a summary and discussion.

Synchrotron
Synchrotron emission arises from interactions between cosmic ray electrons and magnetic fields in the Galaxy. The intensity and spectrum depend on the magnetic field strength and the cosmic ray energy, and therefore show significant spatial variations on the sky. For electrons with a power-law energy distribution, $N(E) \propto E^{-p}$, the spectrum of synchrotron emission becomes $T_\nu \propto B^{(p+1)/2}\,\nu^{\beta}$ with $\beta = -(p+3)/2$ [25]. A typical value is β ≈ −2.5 at radio frequencies [26,27], steepening to β ≈ −3.0 at ∼10 GHz, with typical spatial variations of ±0.2. The steepening can be explained by the aging of cosmic rays, but flattening can also arise from the superposition of multiple components. In MHz bands, thermal free-free absorption reduces the index across the Galactic plane strip and introduces uncertainty in the spectral index determination [27]. The spatial variation and uncertainty of the spectral index, and the possibility of steepening and/or flattening of the spectrum (i.e., a running spectral index), are the key issues for foreground modeling and component separation. In any case, synchrotron emission from the Galaxy dominates the foreground in the lower frequency range.
Full-sky continuum maps in the low frequency range are available, for example, the Haslam et al. map at 408 MHz (0.85° angular resolution) [28] and the map at 1.4 GHz (35′ angular resolution) by Reich and Reich [29]. Other maps are also compiled as the Global Sky Model (GSM) by de Oliveira-Costa et al. [26], and are conveniently organized at her web site. In particular, the Haslam map was used as a template to subtract the synchrotron foreground in the earlier versions of the WMAP analyses. Synchrotron photons are emitted by cosmic ray electrons accelerated by the magnetic fields, and are therefore polarized perpendicular to the field lines. Thus, accurate modeling of the Galactic cosmic ray and magnetic field distributions can in principle be used to predict the polarized foreground from synchrotron emission and remove it from observed maps. The degree of linear polarization, integrated over all electron energies and frequencies, is $\Pi = (p+1)/(p+7/3)$ [25]. The polarization degree can be as large as ∼40% at WMAP 23 GHz, with higher values at higher Galactic latitudes [30,31] (Fig. 2b). The larger polarization degrees at high Galactic latitudes are mostly attributed to local structures, namely the Fan region and the North Galactic Spur, which have polarization degrees as large as ∼30%. Kogut et al. found that the mean polarization degree is ∼14% at high Galactic latitudes |b| > 50° outside the P06 mask, while the Galactic plane |b| < 5° is less than 10% polarized, with a mean of 5% [30]. The decline of the polarization degree toward low Galactic latitudes could be interpreted as a depolarization effect due to the superposition of emissions with different polarization angles. The depolarization due to Faraday rotation (Faraday depolarization) is negligible at the WMAP 23 GHz band. Even though the polarization degree at low Galactic latitudes may be small, synchrotron emission is intrinsically strong there, and these directions are not suitable for CMB observation in any case (Fig. 2a).
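The two relations quoted in this section, β = −(p + 3)/2 and the intrinsic polarization degree (p + 1)/(p + 7/3), can be evaluated directly. The following is a minimal sketch; the function names and the chosen values of p are illustrative assumptions and are not taken from any of the cited analyses.

```python
def synchrotron_spectral_index(p):
    """Brightness-temperature spectral index, beta = -(p + 3) / 2."""
    return -(p + 3.0) / 2.0


def synchrotron_polarization_degree(p):
    """Intrinsic degree of linear polarization, (p + 1) / (p + 7/3)."""
    return (p + 1.0) / (p + 7.0 / 3.0)


def scale_brightness_temperature(t_ref, nu_ref_ghz, nu_ghz, beta):
    """Extrapolate T_nu with a single power law (no running of beta assumed)."""
    return t_ref * (nu_ghz / nu_ref_ghz) ** beta


# Illustrative electron indices: p = 2 gives beta = -2.5, p = 3 gives beta = -3.0
for p in (2.0, 3.0):
    beta = synchrotron_spectral_index(p)
    pol = synchrotron_polarization_degree(p)
    print(f"p = {p}: beta = {beta:.2f}, intrinsic polarization = {pol:.3f}")
```

For p = 3 the intrinsic polarization degree is 75%, well above the observed values quoted above, which is consistent with the depolarization by superposed emission discussed in the text.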

Free-free
Free-free emission, also known as thermal bremsstrahlung, arises from electron-ion scattering in interstellar plasma. The emission can be traced with Hα line emission, since both come predominantly from H II regions in the Galaxy. The WMAP analysis therefore used, as a foreground template of the Galactic free-free emission [37], the Hα line emission maps from the Virginia Tech Spectral line Survey (VTSS) [33], the Southern Hα Sky Survey Atlas (SHASSA) [34], and the Wisconsin Hα Mapper (WHAM) [35], compiled by Finkbeiner et al. [36] with a dust extinction correction applied. The WMAP nine-year analysis additionally applies a scattering correction [17].
For optically thin plasma, the intensity of free-free emission is given by an integration along the line of sight as
$$ I_\nu = \int j_\nu \, dl, \qquad j_\nu = \frac{8 e^6}{3 m_e c^3} \left( \frac{2\pi}{3 m_e k T_e} \right)^{1/2} Z_i^2 \, n_e n_i \, g_{\rm ff} \, e^{-h\nu/kT_e}. $$
Here, $n_e$ and $n_i$ are the number densities of electrons and ions, respectively, $Z_i$ is the atomic number, and $T_e \approx 8000$ K is the electron temperature. The Gaunt factor for free-free emission is approximately given, for $h\nu \ll kT_e$, by
$$ g_{\rm ff} = \frac{\sqrt{3}}{\pi} \left[ \ln \frac{(2kT_e)^{3/2}}{\pi Z_i e^2 m_e^{1/2} \nu} - \frac{5\gamma}{2} \right], $$
where $e$ is the electron charge, $m_e$ is the electron mass, and $\gamma$ is the Euler constant. In the CMB community, the observed intensity $I_\nu$ is often expressed in terms of the brightness temperature $T_B$ and/or the fluctuation in thermodynamic temperature $\Delta T_{\rm CMB}$. A useful conversion formula is given by
$$ I_\nu = \frac{2k\nu^2}{c^2} T_B, \qquad \Delta T_B = \frac{x^2 e^x}{(e^x - 1)^2} \, \Delta T_{\rm CMB}, $$
where $x \equiv h\nu/kT_{\rm CMB}$ and $T_{\rm CMB} = 2.725$ K. One can read off that, in brightness temperature, the spectral index is ≈ −2. Here we quote the fiducial value at the WMAP K-band, $T_B \propto \nu^{-2.14}$ for $T_e \approx 8000$ K [17]. Thermal free-free emission is intrinsically unpolarized because the scattering directions of electrons are isotropic and random. Magnetic fields can break the isotropy, but interstellar magnetic fields are too weak to generate appreciable polarization at microwave frequencies [39]. Some of the emitted photons are, however, scattered again by electrons through Thomson scattering and can acquire polarization. The scattered photons are expected to be polarized tangentially to the edges of H II regions, at a maximum level of ∼10% for an optically thick cloud. For optically thin plasmas at high Galactic latitude, the effect should be small. Free-free emission is found to be unpolarized, with an upper limit of 3.4% at the 95% confidence level [40].
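As a numerical check of the spectral index quoted above, the sketch below evaluates the standard low-frequency approximation of the Gaunt factor, g_ff = (√3/π)[ln((2kT_e)^{3/2}/(π Z e² m_e^{1/2} ν)) − 5γ/2], in Gaussian CGS units, together with the factor x²eˣ/(eˣ − 1)² that converts a thermodynamic temperature fluctuation into a brightness temperature fluctuation. The function names are illustrative, and the factor exp(−hν/kT_e) is neglected on the assumption hν ≪ kT_e.

```python
import math

# Physical constants in Gaussian CGS units
K_B = 1.380649e-16          # Boltzmann constant [erg/K]
H_PLANCK = 6.62607015e-27   # Planck constant [erg s]
E_CHARGE = 4.80320471e-10   # electron charge [esu]
M_E = 9.1093837e-28         # electron mass [g]
EULER_GAMMA = 0.5772156649  # Euler constant


def gaunt_ff(nu_hz, t_e, z_ion=1.0):
    """Low-frequency (h nu << k T_e) approximation of the free-free Gaunt factor."""
    arg = (2.0 * K_B * t_e) ** 1.5 / (
        math.pi * z_ion * E_CHARGE ** 2 * math.sqrt(M_E) * nu_hz
    )
    return math.sqrt(3.0) / math.pi * (math.log(arg) - 2.5 * EULER_GAMMA)


def ff_brightness_index(nu_hz, t_e):
    """Local index of T_B ~ nu^-2 g_ff, i.e. -2 + d ln(g_ff) / d ln(nu)."""
    return -2.0 - (math.sqrt(3.0) / math.pi) / gaunt_ff(nu_hz, t_e)


def thermo_to_brightness(nu_hz, t_cmb=2.725):
    """Factor x^2 e^x / (e^x - 1)^2 with x = h nu / (k T_CMB)."""
    x = H_PLANCK * nu_hz / (K_B * t_cmb)
    return x ** 2 * math.exp(x) / math.expm1(x) ** 2


nu_k_band = 23e9  # WMAP K band [Hz]
print("g_ff          :", round(gaunt_ff(nu_k_band, 8000.0), 3))
print("spectral index:", round(ff_brightness_index(nu_k_band, 8000.0), 3))
print("dT_B / dT_CMB :", round(thermo_to_brightness(nu_k_band), 4))
```

At 23 GHz and T_e = 8000 K the logarithmic frequency dependence of the Gaunt factor steepens the index from −2 to about −2.14, in agreement with the fiducial value quoted in the text.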

Thermal dust
In the microwave sky at frequencies ≳70 GHz, thermal emission from interstellar dust grains, mostly made of graphites, silicates, and PAHs (Polycyclic Aromatic Hydrocarbons), dominates the foreground. The spectrum is well described by a modified black-body of the form
$$ I_\nu \propto \nu^{\beta_d} B_\nu(T), $$
where $B_\nu(T)$ is the Planck spectrum. The temperature T is determined by the balance between heating by the interstellar radiation field and cooling by far-infrared emission from the dust grains, and the variety of environments and grain shapes leads to multiple temperatures. The widely used all-sky dust model of Schlegel, Finkbeiner and Davis, which is based on COBE/DIRBE and IRAS/ISSA maps and is primarily intended for estimating Galactic extinction, adopts two temperature components (the so-called Model 8) with $T_{1,2} = (9.4, 16)$ K and $\beta_{d\,1,2} = (1.67, 2.70)$, respectively, and has been used to predict the dust foreground at microwave frequencies [41].
Detailed all-sky maps of dust intensity and temperature have been released by the Planck collaboration (Fig. 3), who derived the temperature using the IRIS 100 μm and Planck-HFI data at 857 and 545 GHz. Along the Galactic plane, a temperature gradient can be seen from the outer Galactic regions to the Galactic center, from T ≈ 14-15 K to T ≈ 19 K. This trend is thought to be due to more active star formation in the inner regions of the Galaxy [42]. Near the Galactic poles the temperature seems systematically higher, but this should be interpreted with care because the determination becomes noisy there owing to the low signal levels.
It should be noted that the temperature is determined from a single-component model with the spectral index fixed to β = 1.8 [42]. Therefore it cannot be used to predict the dust emission at lower frequencies, where multicomponent models are favored [43]. In addition, the dust temperature should depend on the grain size, and the dust emission spectrum must be a superposition of the emissions from grains of different sizes [44-46]. Therefore, a simple single-component gray-body model is not a good approximation and may introduce systematic errors, especially at frequencies in the Wien part of the gray body. This systematic error should be taken into account when one predicts the dust emission at frequencies below or above the range used for the fit.
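To illustrate how sensitive such an extrapolation is to the assumed temperature, the following sketch evaluates a single-component modified black-body, I_ν ∝ ν^β B_ν(T), normalized at a pivot frequency. The pivot of 353 GHz, the target of 100 GHz, and the function names are assumptions made only for this illustration; the temperatures and β = 1.8 are the values quoted in the text.

```python
import math

# Physical constants in SI units
H_PLANCK = 6.62607015e-34  # [J s]
K_B = 1.380649e-23         # [J/K]
C_LIGHT = 2.99792458e8     # [m/s]


def planck_bnu(nu_hz, t_k):
    """Planck spectrum B_nu(T) in SI units."""
    x = H_PLANCK * nu_hz / (K_B * t_k)
    return 2.0 * H_PLANCK * nu_hz ** 3 / C_LIGHT ** 2 / math.expm1(x)


def modified_blackbody(nu_hz, t_k, beta_d, nu0_hz=353e9):
    """Single-component gray body nu^beta_d B_nu(T), normalized to 1 at nu0_hz."""
    return (nu_hz / nu0_hz) ** beta_d * planck_bnu(nu_hz, t_k) / planck_bnu(nu0_hz, t_k)


# Extrapolate from the (assumed) 353 GHz pivot down to 100 GHz for the two
# temperatures quoted for the outer and inner Galactic plane.
for t_dust in (14.0, 19.0):
    ratio = modified_blackbody(100e9, t_dust, 1.8)
    print(f"T = {t_dust} K: I(100 GHz) / I(353 GHz) = {ratio:.4f}")
```

The two temperatures give extrapolated intensities that differ at the ten-percent level even with β held fixed, which gives a sense of the size of the systematic error discussed above.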
Another problem related to Galactic dust modeling is the subtraction of extragalactic sources, which appear as the Cosmic Infrared Background (CIB). One way to subtract this contribution is to cross-correlate the dust maps with the Galactic H I emission, determine the dust emissivity relative to the H I column density, and read off the intercept as the offset due to the background light. The error in the determination of the offset should be propagated when subtracting the offset from the Planck and IRIS maps [42].
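The offset determination described above amounts to a straight-line fit of dust intensity against H I column density. A minimal sketch on synthetic data follows; the emissivity, offset, noise level, and units are made-up values for illustration and do not correspond to the Planck analysis.

```python
import random


def fit_line(x, y):
    """Ordinary least-squares fit y = slope * x + intercept."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


random.seed(0)
true_emissivity = 0.5  # dust intensity per unit H I column density (arbitrary units)
true_offset = 0.2      # isotropic background level, standing in for the CIB

# Synthetic low-column-density pixels with Gaussian noise
n_hi = [random.uniform(1.0, 5.0) for _ in range(2000)]
i_dust = [true_emissivity * n + true_offset + random.gauss(0.0, 0.05) for n in n_hi]

emissivity, offset = fit_line(n_hi, i_dust)
print(f"emissivity = {emissivity:.3f}, offset = {offset:.3f}")
```

The intercept of the fit is the quantity that would be subtracted from the maps, and its uncertainty is the error that needs to be propagated.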
Thermal emission from aspherical dust grains can be polarized when the grain spin axes are aligned with interstellar magnetic fields [48]. Basically, the grains emit (or absorb) photons most efficiently along their longest axis. Combining this fact with the rule of thumb that alignment mechanisms tend to make the long grain axis perpendicular to the local magnetic field [49], the polarization is perpendicular to the local magnetic field in emission, but parallel to the field in background light seen in absorption. Again, the degree of alignment varies with the size of the dust grains, leading to frequency-dependent polarization [50].
Observations by the WMAP satellite indicate that the polarization fraction of dust emission is ∼1% toward the Galactic center and can be as large as ∼6% toward the anti-center. The fraction at high latitude (outside the P06 mask) is reported as 3.6 ± 1.1% [30]. The Archeops experiment at 353 GHz found coherent polarization with similar polarization degrees (4-5%) over the longitude ranges (100, 120) and (180, 200) degrees [51].

Spinning dust
A growing body of evidence is accumulating to show that there is a fourth, anomalous, foreground component in the microwave range. The anomalous emission at 20-60 GHz, sometimes nicknamed "Foreground X" [52], was first identified as free-free emission, but the idea was later ruled out due to the lack of correlation with Hα data [53]. The emission is spatially more correlated with the 100 μm dust map, and the currently most plausible candidate is tiny PAH particles spinning with dipole moments, i.e., "spinning dust" (for a historical review, see [54]).
The spatial distribution of the anomalous emission and its frequency dependence have been estimated by subtracting the contributions of synchrotron, free-free, and thermal dust emission (e.g., assuming a constant spectral index β) from the WMAP 23, 33, 41, and 61 GHz maps (Fig. 4a) [31]. They confirm that the distribution is closely related to the thermal dust emission (Fig. 3b) [...] for the Warm Neutral Medium [55]. Macellari et al. found that in the WMAP K and Ka bands the spectral index of the anomalous dust emission is β_d ≈ −2.5 [40], which is slightly larger than, but consistent with, the result β_d ≈ −2.85 of [56]. Recently, the Planck collaboration analyzed dense molecular clouds from which strong anomalous microwave emission (AME) is expected [57]. They find that the spectra are well fitted by a model [58] that includes two spinning dust components, one from dense molecular gas and one from low-density atomic gas regions (Fig. 4b).
The polarization amplitude of spinning dust emission is expected to be small. Lazarian and Draine show that the polarization degree is no larger than 8% at 2 GHz, and falls rapidly for higher frequency, no greater than 0.5% above 30 GHz (Fig. 1 in Ref. [59]). Superpositions of emissions from a variety of dust grains populating different areas of the Galaxy will further reduce the polarization degree.
Another candidate for the anomalous emission is magneto-dipole emission from strongly magnetized grains [60]. A distinct feature of the magnetic dust emission is its high polarization degree, as high as 40% [60]. However, strong upper limits on the polarization degree of the anomalous emission, 3.0%, support the spinning dust hypothesis, although the existence of magnetic dust cannot be ruled out [40,61].

CO molecular cloud
Along with the synchrotron and thermal dust emissions, which constitute a substantial portion of the Galactic foreground emission at microwave frequencies, the rotational transitions of carbon monoxide (CO) have appeared on the foreground stage. In the words of the Planck paper [18], "After launch it became apparent that the contribution of CO rotational transitions to the HFI measurements was greater than anticipated, especially for the 100 GHz band." The CO rotational lines fall within the bands of the High Frequency Instrument (HFI), that is, in the 100, 217, and 353 GHz bands. Generally, the CO distribution is associated with dust but is more concentrated. An analysis of CO molecular clouds at high Galactic latitude shows that CO emission can contribute significantly to the power spectrum on small angular scales (ℓ ≳ 1000) [62]. The CO line surveys that pre-date Planck, such as the Columbia-CfA survey (e.g., [63,64]) and the NANTEN Galactic Plane Survey (NGPS) with the NANTEN telescope [65,66], are dedicated mainly to the Galactic plane, while Planck provides the first full-sky map of the CO emission.
In March 2013, the first cosmological results from the Planck collaboration came out [67]. It was shown that the abovementioned lines give significant foreground contamination in the Planck intensity maps. They have derived CO component maps in several ways with different assumptions, since the total number of parameters for the foreground components exceeds the number of observing frequency bands. Among them the TYPE-1 maps are derived using MILCA (Modified Internal Linear Combination Algorithm) [68], based on the differences in the spectral transmission of a CO line among the bolometers in a single frequency channel. The TYPE-2 maps are derived using multifrequency channels, using the Ruler algorithm which is based on a parametric model of the Galactic emissions, which is shown in Fig. 5b. The methods of internal linear combination and Ruler are discussed in the following section. Because estimations of the CO component rely on different assumptions, it is desirable to apply several algorithms to the CO estimations and check the stability of the results. As is shown in Fig. 5b, the distribution of CO line emission follows that of thermal dust emission ( Fig. 3b) with larger concentration (i.e., covering a smaller fraction of the sky), and therefore the strategy the Planck team devised for cosmological analysis is to mask them out. The other transitions have similar distributions, but in fact, the line ratios are different from one direction to another.

Template fitting
A simple and powerful foreground cleaning technique is the template fitting method, in which the microwave sky at pixel $\hat{n}$ and frequency $\nu$, $T(\hat{n}, \nu)$, is assumed to be given by a superposition of various components $X_i(\hat{n})$ and noise $n(\hat{n}, \nu)$ as
$$ T(\hat{n}, \nu) = \sum_i \alpha_i(\nu) X_i(\hat{n}) + n(\hat{n}, \nu), $$
where $\alpha_i(\nu)$ are the template coefficients, which can thus be interpreted as the frequency dependence (i.e., spectrum) of the template emission $X_i$. A famous example of cleaning the microwave sky by template fitting can be found in the WMAP analysis [17], which used a synchrotron template constructed from the lowest WMAP frequency channels, a free-free template from the Hα map with corrections for dust extinction and scattering, and a dust template from Schlegel, Finkbeiner and Davis (SFD) [41]. One advantage that follows from the simplicity of the method is that the statistical properties of the noise in the foreground-cleaned map are almost unaffected by the procedure. The error in the cleaned map should increase as one includes a larger number of parameters. However, in typical examples the number of pixels available for the fit, $O(10^5)$, is much larger than the number of parameters, $O(10)$, and therefore the impact on the noise amplitude after cleaning is negligible. This is an important fact when considering the error propagation to the cosmological parameter estimation. One drawback, on the other hand, is that the method usually assumes that the frequency dependence $\alpha_i(\nu)$ is independent of the position $\hat{n}$, which is not fully satisfied in reality. This is in fact one of the reasons why the WMAP team stopped using the Haslam 408 MHz map as a synchrotron template in the microwave frequency range: a large spatial variation of the synchrotron spectral index could bias the power spectrum estimated from the template-cleaned map.
We should note that the templates need not be real (physical) emission components. For example, a template from SFD can represent both the thermal dust emission component at higher frequencies and the anomalous emission component at lower frequencies simultaneously, with an appropriate weighting of the template coefficients $\alpha_i(\nu)$. In this case, the weights should deviate from a simple power-law scaling. Furthermore, any hypothetical (not astrophysically motivated) template can be used in the cleaning, e.g., higher powers of existing templates. For example, Fixsen et al. used the square of the H I map in the Galactic foreground cleaning to take into account the uncertainty in the H I-IR correlation [69].
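Template fitting at a single frequency is a linear least-squares problem for the coefficients α_i(ν). The sketch below uses random maps as stand-ins for the templates; the number of pixels, the coefficient values, and the variable names are assumptions for illustration only, and the CMB plus instrumental noise is treated together as the residual term.

```python
import numpy as np

rng = np.random.default_rng(1)
n_pix = 100_000  # number of pixels used in the fit, O(10^5)

# Stand-ins for three templates X_i(n), e.g. synchrotron, free-free, and dust
templates = rng.normal(size=(3, n_pix))

# Made-up template coefficients alpha_i(nu) at one observing frequency
alpha_true = np.array([0.8, 0.3, 0.1])

# CMB plus noise plays the role of the residual in the fit
cmb_plus_noise = rng.normal(size=n_pix)
sky = alpha_true @ templates + cmb_plus_noise

# Least-squares estimate of alpha_i(nu), then subtraction of the fitted foregrounds
alpha_fit, *_ = np.linalg.lstsq(templates.T, sky, rcond=None)
cleaned = sky - alpha_fit @ templates

print("fitted coefficients:", np.round(alpha_fit, 3))
print("rms before / after cleaning:", sky.std().round(3), cleaned.std().round(3))
```

Because only three parameters are fitted to 10^5 pixels, the rms of the cleaned map is essentially that of the CMB-plus-noise term, which illustrates why the noise properties are almost unaffected by the procedure.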

Commander-Ruler.
Commander is a Bayesian method that estimates the CMB and foreground components as well as the CMB power spectrum using a Gibbs sampling algorithm; the implementation has been developed by Eriksen et al. [70,71] and Larson et al. [72]. The sampling algorithm estimates the signal map $s$, the foreground maps $f(A_i, \theta)$, with $A_i$ and $\theta$ being the amplitude and spectral model parameters, respectively, and the CMB power spectrum $C_\ell$ simultaneously, given the data $d$, by computing the joint posterior distribution of the parameters, $P(s, f(\theta), C_\ell \mid d)$. The joint distribution can be sampled from the conditional distributions as
$$ (s, f)^{i+1} \leftarrow P(s, f \mid C_\ell^{\,i}, d), \qquad C_\ell^{\,i+1} \leftarrow P(C_\ell \mid s^{i+1}, d). $$
From Bayes' theorem and assuming Gaussian likelihoods, the conditional distributions are given by
$$ P(s, f \mid C_\ell, d) \propto \exp\left[ -\frac{1}{2} \sum_\nu (d_\nu - s - f_\nu)^{T} N_\nu^{-1} (d_\nu - s - f_\nu) \right] \exp\left[ -\frac{1}{2} s^{T} S^{-1} s \right], $$
$$ P(C_\ell \mid s, d) \propto C_\ell^{-(2\ell+1)/2} \exp\left[ -\frac{(2\ell+1)\, \sigma_\ell^2}{2 C_\ell} \right], $$
where $N_\nu$ is the noise covariance at frequency channel $\nu$, $S$ is the signal covariance ($C_\ell$ in harmonic space), and $\sigma_\ell^2 = \sum_m |s_{\ell m}|^2/(2\ell+1)$ is the variance of the signal estimated from the sample. Thus the conditional distributions are a multivariate Gaussian for $(s + f)$ and an inverse gamma distribution for $C_\ell$. If polarization anisotropies are included in the model, the inverse gamma distribution is generalized to an inverse Wishart distribution. The actual implementation treats the foreground amplitudes $A_i$ and the spectral parameters $\theta$ separately, and we refer readers to [70] for details. For the Commander algorithm to work, one needs to model the foregrounds. For example, the Planck collaboration adopted the parametric foreground model of Eq. 9, in which $A_i(\hat{n})$ denotes the foreground amplitude for component $i$ in direction $\hat{n}$, $\nu_{0,i}$ is the pivot frequency, and $f_{\nu,{\rm CO}}$ is the band transmission of the CO line emission for each frequency channel. Note that the synchrotron, free-free, and spinning dust components are combined into a single "low-frequency" (lf) emission in this model.
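The alternation between the two conditional distributions can be illustrated with a deliberately simplified toy, which is not the Commander implementation: a single multipole with 2ℓ + 1 real modes, white noise of known variance, no foregrounds, and a flat prior on C_ℓ. Under these assumptions the signal step is a Wiener-filtered mean plus a Gaussian fluctuation, and the power spectrum step is an inverse gamma draw, obtained here as (2ℓ + 1)σ² divided by a χ² variate with 2ℓ − 1 degrees of freedom. All names and numbers below are illustrative.

```python
import numpy as np


def gibbs_power_spectrum(d, noise_var, n_steps=3000, seed=0):
    """Toy Gibbs chain for C_ell at one multipole (no foregrounds, flat prior)."""
    rng = np.random.default_rng(seed)
    n_modes = d.size  # 2 * ell + 1 real modes
    c_ell = max(np.mean(d ** 2) - noise_var, 0.1 * noise_var)  # starting guess
    chain = np.empty(n_steps)
    for i in range(n_steps):
        # Step 1: s | C_ell, d is Gaussian with Wiener-filter mean
        wiener = c_ell / (c_ell + noise_var)
        s = wiener * d + np.sqrt(wiener * noise_var) * rng.standard_normal(n_modes)
        # Step 2: C_ell | s is inverse gamma, drawn via a chi-square variate
        sigma2 = np.mean(s ** 2)
        c_ell = n_modes * sigma2 / rng.chisquare(n_modes - 2)
        chain[i] = c_ell
    return chain


ell = 50
c_true, noise_var = 4.0, 1.0  # made-up signal power and noise variance
rng = np.random.default_rng(4)
signal = rng.normal(scale=np.sqrt(c_true), size=2 * ell + 1)
d = signal + rng.normal(scale=np.sqrt(noise_var), size=2 * ell + 1)

chain = gibbs_power_spectrum(d, noise_var)
c_median = np.median(chain[500:])  # discard the first samples as burn-in
print("posterior median of C_ell:", round(float(c_median), 2))
```

The spread of the chain gives the posterior uncertainty of C_ℓ directly, which is the feature of the Gibbs approach emphasized in the text.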
In the Planck analysis, Commander is run on maps smoothed to the lowest resolution of 40′ and at a low HEALPix resolution of $N_{\rm side} = 256$, a limit set by the lowest frequency channel (LFI 30 GHz), in order to estimate the spectral indices in the foreground model robustly. After the Commander step, they therefore numerically find the best-fitting values of the amplitude parameters $A_i(\theta)$ using the high-resolution maps, with the other parameters related to the spectrum, $\theta$, held fixed. This second step is called "Ruler," hence the name "Commander-Ruler." The power spectrum estimate obtained from the Commander run is used for the low-multipole part of the Planck likelihood function for cosmological parameters. The advantage of this method is that it provides the best estimate of the power spectrum $C_\ell$ and its posterior probability distribution, taking into account the errors in the foreground model parameters. The method also provides the best estimate of the foreground parameters with their associated errors, and it can therefore be regarded as a component separation method rather than a foreground removal method.
One caveat for a parametric method like Commander-Ruler is that the results are sensitive to the assumed priors on the foreground model parameters. For instance, the FFP6 (Full Focal Plane) simulation for the Planck experiment shows that Commander-Ruler leaves a free-free emission-like component as small under-subtracted residuals (below a few μK) [47], partly because the foreground model of Eq. 9 treats synchrotron, free-free, and spinning dust emission together as a single low-frequency component. Furthermore, Gaussian priors imposed on the spectral indices of the foreground components directly affect the results at high Galactic latitudes, where the signal-to-noise ratio is low. Although the Planck collaboration considered minor variations around the fiducial prior and found only small differences among them, it is important to employ non-parametric methods for cross checks, as discussed below.

Non-parametric method

Internal linear combination.
The internal linear combination (ILC) is a simple and powerful method when we have poor knowledge about the foregrounds [73]. The method uses only the observed maps and requires knowledge of the frequency spectrum of the signal, which is the well-known blackbody in the CMB application. The method has been applied to the WMAP data and gives successful visualizations of the CMB anisotropies [17,32,37,74,75]. The idea is that, under the assumption that the signal and foregrounds are uncorrelated and that the CMB signal is common to all observed maps, the linearly combined map with weights chosen to keep the CMB component unchanged should have the minimum total variance if the foreground components are successfully canceled out.
The ILC model is given by
$$ x_j(\hat{n}_i) = a_j\, s(\hat{n}_i) + f_j(\hat{n}_i) + n_j(\hat{n}_i), $$
where $s(\hat{n}_i)$ is the CMB signal that we want to estimate, $f_j(\hat{n}_i)$ and $n_j(\hat{n}_i)$ are the foreground and noise contributions at frequency channel $j$, respectively, and $a_j$ is the CMB spectrum at each channel $j$. Following Tegmark et al. [76], let us move to harmonic space, where the above equation is recast as $x^j_{\ell m} = a^j s_{\ell m} + f^j_{\ell m} + n^j_{\ell m}$, or, in vector notation,
$$ \mathbf{x}_{\ell m} = \mathbf{a}\, s_{\ell m} + \mathbf{f}_{\ell m} + \mathbf{n}_{\ell m}. $$
In the ILC method, as its name suggests, one estimates the signal by linearly combining the observations $\mathbf{x}_{\ell m}$:
$$ \hat{s}_{\ell m} = \mathbf{w}_\ell^\dagger \cdot \mathbf{x}_{\ell m}, $$
where $\mathbf{w}_\ell$ are the weights, subject to the constraint $\mathbf{w}_\ell^\dagger \cdot \mathbf{a} = 1$, i.e., a constraint that leaves the CMB signal untouched. Minimizing the variance of the ILC map, $\langle |\hat{s}_{\ell m}|^2 \rangle$, under this constraint gives the solution
$$ \mathbf{w}_\ell = \frac{\mathbf{C}_\ell^{-1} \mathbf{a}}{\mathbf{a}^\dagger \mathbf{C}_\ell^{-1} \mathbf{a}}, $$
where $\mathbf{C}_\ell$ is the covariance matrix (cross-power spectrum) defined by
$$ \mathbf{C}_\ell \equiv \langle \mathbf{x}_{\ell m} \mathbf{x}_{\ell m}^\dagger \rangle. $$
Finally, for visualization purposes, one can Wiener filter the ILC map:
$$ \hat{s}^{\rm W}_{\ell m} = \frac{C_\ell^{\rm CMB}}{C_\ell^{\rm ILC}}\, \hat{s}_{\ell m}, $$
where $C_\ell^{\rm CMB}$ is the CMB power spectrum (the model from theory) and $C_\ell^{\rm ILC}$ is the power spectrum of the ILC map, which includes signal and noise. The ILC method relies on the assumption that the CMB signal and the foregrounds are independent. Therefore, it cannot be used to estimate the Galactic foreground components separately, because they are correlated with each other. The ILC should be regarded as a foreground subtraction method rather than a component separation method.
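The minimum-variance weights w = C⁻¹a / (aᵀC⁻¹a) are simple to compute once a covariance matrix is available. The sketch below works in pixel space with one set of weights for the whole map, rather than in harmonic space with ℓ-dependent weights as in Tegmark et al.; that simplification, the synthetic three-channel data, and the foreground scaling are assumptions made for illustration.

```python
import numpy as np


def ilc_weights(maps, a):
    """Weights w = C^-1 a / (a^T C^-1 a) from the empirical channel covariance."""
    cov = np.cov(maps)               # (n_chan, n_chan) covariance between channels
    cinv_a = np.linalg.solve(cov, a)
    return cinv_a / (a @ cinv_a)


rng = np.random.default_rng(2)
n_pix = 50_000
a = np.ones(3)                            # CMB is identical in thermodynamic units
cmb = rng.normal(size=n_pix)
foreground = rng.normal(size=n_pix)
fg_scaling = np.array([3.0, 1.0, 0.3])    # hypothetical foreground spectrum
noise = 0.05 * rng.normal(size=(3, n_pix))

maps = np.outer(a, cmb) + np.outer(fg_scaling, foreground) + noise

w = ilc_weights(maps, a)
cmb_hat = w @ maps

print("weights:", np.round(w, 3), " sum over channels:", round(float(w @ a), 6))
print("rms of (ILC - input CMB):", round(float(np.std(cmb_hat - cmb)), 3))
```

Note that no foreground spectrum is passed to `ilc_weights`; the cancellation comes entirely from the empirical covariance, which is why the method is useful when the foregrounds are poorly known.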

FastICA.
Independent component analysis (ICA) also assumes a (linear) superposition of the astrophysical components, but the method simultaneously estimates the spatial distribution of the components based on their independence. Specifically, in the simplest case, the ICA model is given by
$$ x_j(\hat{n}_i) = \sum_k A_{jk}\, s_k(\hat{n}_i), $$
where $x_j(\hat{n}_i)$ is the observed intensity at frequency $j$, $s_k(\hat{n}_i)$ are the signals including the foregrounds, and the matrix $A$ is called the mixing matrix. The problem here is to estimate $A$ and $s_k(\hat{n}_i)$ simultaneously. This class of methods has several advantages over the methods mentioned above; the most important is that ICA methods need no prior assumptions about the distribution of the foreground components or their frequency dependence (hence the name Blind Source Separation in the statistics community).
In the FastICA method [77], independence is measured through the degree of non-Gaussianity. The decomposition is done by maximizing the non-Gaussianity of the linearly combined stochastic variable
$$ y_k = \sum_j W_{kj}\, x_j, $$
where $W$ is the decomposition matrix, because the superposition of independent foreground components makes the distribution "more Gaussian" owing to the central limit theorem. If $W = A^{-1}$, then $y_k = s_k$. To find the matrix $W$ we need an evaluation function $g(y_k)$ of the level of non-Gaussianity. Any non-linear function should work in principle, but in practice kurtosis and negentropy are frequently used. Several applications of FastICA to the CMB component separation problem can be found in the literature, including applications to COBE [78,79], BEAST [80], WMAP [81,82], and 21 cm maps [83], and to the extraction of the CO component from the PLANCK data [62].
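The kurtosis-based variant of this idea can be written in a few lines. The following is a minimal sketch of a fixed-point iteration with deflation, not the code used in the cited applications: the data are first whitened, and each row of the decomposition matrix is then updated as w ← E[z (w·z)³] − 3w and renormalized. The two synthetic sources (uniform and Laplace distributed) and the mixing matrix are assumptions chosen so that both sources are clearly non-Gaussian.

```python
import numpy as np


def whiten(x):
    """Remove the mean and rescale so that the covariance is the identity."""
    x = x - x.mean(axis=1, keepdims=True)
    evals, evecs = np.linalg.eigh(np.cov(x))
    return (evecs / np.sqrt(evals)).T @ x


def fastica_kurtosis(x, n_iter=200, seed=0):
    """Kurtosis-based fixed-point ICA with deflation; returns the estimated sources."""
    rng = np.random.default_rng(seed)
    z = whiten(x)
    n_comp = z.shape[0]
    w_rows = np.zeros((n_comp, n_comp))
    for k in range(n_comp):
        w = rng.standard_normal(n_comp)
        w /= np.linalg.norm(w)
        for _ in range(n_iter):
            y = w @ z
            w_new = (z * y ** 3).mean(axis=1) - 3.0 * w    # fixed-point update
            w_new -= w_rows[:k].T @ (w_rows[:k] @ w_new)   # stay orthogonal to found rows
            w_new /= np.linalg.norm(w_new)
            done = abs(abs(w_new @ w) - 1.0) < 1e-12       # converged up to a sign
            w = w_new
            if done:
                break
        w_rows[k] = w
    return w_rows @ z


rng = np.random.default_rng(3)
n_pix = 20_000
sources = np.vstack([rng.uniform(-1.0, 1.0, n_pix), rng.laplace(size=n_pix)])
mixing = np.array([[1.0, 0.5], [0.3, 1.0]])  # hypothetical mixing matrix A
observed = mixing @ sources

recovered = fastica_kurtosis(observed)
corr = np.abs(np.corrcoef(np.vstack([recovered, sources]))[:2, 2:])
print("|correlation| between recovered and true sources:")
print(np.round(corr, 3))
```

As with any ICA method, the sources are recovered only up to ordering, sign, and overall amplitude, which is why the comparison is made through the absolute correlation coefficients.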

Spectral Matching Independent Component Analysis (SMICA).
In the Spectral Matching Independent Component Analysis (SMICA) method, the ICA model is built in the angular spectrum domain. Assuming that the number of foreground components is $d$ and that the sky is observed in $N_{\rm chan}$ frequency channels, the data model is described as

$$C_\ell(\theta) = a a^\dagger C_\ell^{\rm CMB} + A P_\ell A^\dagger + N_\ell .$$

Here, $C_\ell(\theta)$ is the model covariance matrix, $a$ is an $N_{\rm chan}$ vector that describes the spectrum of the CMB evaluated at each channel, $C_\ell^{\rm CMB}$ is the CMB angular power spectrum, the term $A P_\ell A^\dagger$ describes the foreground covariance, and $N_\ell$ is a diagonal noise matrix. More specifically, the foreground emission matrix $A$ is an $N_{\rm chan} \times d$ matrix that describes the spectra of the foregrounds evaluated at each channel, and the foreground covariance matrix $P_\ell$ is a positive $d \times d$ matrix. Thus the total number of model parameters is $N_{\rm chan} + (N_{\rm chan} \times d) + d(d+1)/2$ for each $\ell$ (in practice, binned $\ell$). In the case of the Planck analysis, for example, $N_{\rm chan} = 9$ and $d = 6$ [47]. Independence is imposed between the signals (the CMB and the foregrounds), and hence this is an ICA method.

The model $C_\ell(\theta)$ is fitted to the data, i.e. to the sample spectral covariance matrix $\hat{C}_\ell$. The matrix is written as

$$\hat{C}_\ell = \frac{1}{2\ell+1} \sum_{m=-\ell}^{\ell} x_{\ell m} x_{\ell m}^\dagger ,$$

where the $N_{\rm chan}$ vector $x_{\ell m}$ contains the spherical harmonic coefficients of the observed maps. The fitting is done by finding the parameters that maximize the Gaussian likelihood, or equivalently that minimize its negative logarithm, as

$$\hat{\theta} = \arg\min_{\theta} \sum_\ell (2\ell+1) \left[ {\rm tr}\left( \hat{C}_\ell\, C_\ell(\theta)^{-1} \right) + \ln\det C_\ell(\theta) \right] .$$

The fitted covariance matrix $C_\ell(\hat{\theta})$ is used to find a solution, the spherical harmonic coefficients of the CMB, similarly to the ILC method, as

$$\hat{a}_{\ell m} = w_\ell^\dagger x_{\ell m} ,$$

with

$$w_\ell = \frac{C_\ell(\hat{\theta})^{-1} a}{a^\dagger C_\ell(\hat{\theta})^{-1} a} ,$$

where the vector $a$ is again the spectrum of the CMB evaluated at each channel.

Because the SMICA method is based on the model of the angular power spectrum (Eq. 18), it cannot blindly separate the foreground components if their angular power spectra are similar to each other. Fortunately, the cosmological CMB angular power spectrum is quite distinct from those of the foreground emissions, and thus separation between the CMB and the foregrounds is possible. Component separation among the foregrounds themselves may, however, be difficult in this method. This is a similar situation to the ILC case.
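The last two steps of the SMICA procedure, forming the sample spectral covariance and applying the weights $w_\ell = C_\ell^{-1} a / (a^\dagger C_\ell^{-1} a)$, can be illustrated with a minimal NumPy sketch. It is not the Planck implementation: the likelihood fit is skipped and a toy covariance stands in for the fitted $C_\ell(\hat{\theta})$. The function names, the random toy matrices, and the choice of a unit CMB spectrum vector (as it would be in thermodynamic units) are all assumptions made for illustration, and the harmonic coefficients are drawn as real numbers for simplicity.

```python
import numpy as np

def n_params(n_chan, d):
    """Parameter count per multipole (bin) as quoted in the text."""
    return n_chan + n_chan * d + d * (d + 1) // 2

def sample_covariance(x_lm):
    """Sample spectral covariance (1/(2l+1)) sum_m x_lm x_lm^dagger.

    x_lm has shape (2*ell + 1, n_chan): one row per m, one column per channel.
    """
    return x_lm.T @ x_lm.conj() / x_lm.shape[0]

def smica_weights(cov, a):
    """Weights w = C^{-1} a / (a^dagger C^{-1} a) for one multipole."""
    cinv_a = np.linalg.solve(cov, a)
    return cinv_a / (a.conj() @ cinv_a)

# Toy stand-in for the fitted model covariance (all values hypothetical).
rng = np.random.default_rng(0)
n_chan, d, ell = 9, 6, 50
a = np.ones(n_chan)                                # assumed CMB spectrum vector
A = rng.normal(size=(n_chan, d))                   # toy foreground spectra
P = np.diag(rng.uniform(1.0, 5.0, size=d))         # toy foreground covariance
N = np.diag(rng.uniform(0.1, 0.5, size=n_chan))    # diagonal noise matrix
cov = 2.0 * np.outer(a, a) + A @ P @ A.T + N

# Simulated harmonic coefficients for this multipole (real-valued toy draw).
x_lm = rng.multivariate_normal(np.zeros(n_chan), cov, size=2 * ell + 1)
c_hat = sample_covariance(x_lm)

w = smica_weights(cov, a)
a_hat = x_lm @ w.conj()        # a_hat_lm = w^dagger x_lm for every m

print("parameters per bin:", n_params(n_chan, d))
print("w^dagger a =", w.conj() @ a)
```

By construction the weights satisfy $w_\ell^\dagger a = 1$, so the CMB passes through with unit response while the total variance of the output is minimized, which is the sense in which the solution resembles the ILC.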

Summary and Discussion
As summarized in Table 1, we have accumulated knowledge on the diffuse Galactic foreground emission in the microwave sky. So far, the confirmed contributors are synchrotron emission at lower frequencies, thermal dust emission at higher frequencies, and, at intermediate frequencies, free-free emission and anomalous emission possibly due to spinning dust grains. The simple sky model described in Eq. 9 has been shown to successfully describe the microwave sky observed by Planck at high Galactic latitudes, while it has also been shown that the model fails to describe the sky toward the Galactic plane [47]. Although further detailed investigation is required for emission from the Galactic plane, we can use a Galactic mask to exclude such contaminated regions from the cosmological analyses. One complication is that the spectral index of synchrotron emission and the temperature of dust emission can vary from one direction to another; however, this can be modeled as in Eq. 9.
Various statistical tools have been developed to remove the foreground emission. As discussed in Sect. 3, the methods are broadly categorized into three types: template fitting, parametric, and non-parametric (blind) methods. Template fitting is powerful for removing the foregrounds, and has been shown to be successful on real data. We should note, however, that template cleaning assumes that the foreground emission is separable into a frequency dependence and a spatial pattern, an assumption that is violated in the real sky to some extent. For example, the spectral index of synchrotron emission varies over the sky and, furthermore, the spectrum may have curvature. Therefore, if a template is made at a frequency far from that of the map to be cleaned, the wide separation will cause large fitting errors in some sky directions, possibly leading to a biased result. Errors in the templates will also propagate into the resultant map in an ill-defined way, and the errors could be significant even in the clean region that we want to use for cosmological analyses.
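The effect of a spatially varying spectral index on template cleaning can be illustrated with a toy calculation. The sketch below is a noise-free, foreground-only example under assumed values: the two frequencies, the amplitude range, the mean index of -3, and its scatter of 0.1 are hypothetical choices for illustration and are not taken from any of the analyses cited here. A single coefficient is fitted over all pixels, as in a global template fit.

```python
import numpy as np

def template_fit_residual(target, template):
    """Least-squares fit target ~ alpha * template; returns (alpha, residual)."""
    alpha = (template @ target) / (template @ template)
    return alpha, target - alpha * template

rng = np.random.default_rng(0)
n_pix = 10_000
nu_template, nu_target = 0.408, 60.0         # GHz, illustrative values only
amp = rng.uniform(10.0, 100.0, size=n_pix)   # foreground at nu_target, arbitrary units

def rms_residual(beta):
    """RMS foreground residual after cleaning, for a per-pixel index array."""
    # The template is the same foreground extrapolated to nu_template
    # with a power law of index beta in each pixel.
    template = amp * (nu_template / nu_target) ** beta
    _, resid = template_fit_residual(amp, template)
    return float(np.sqrt(np.mean(resid ** 2)))

rms_const = rms_residual(np.full(n_pix, -3.0))             # separable sky
rms_vary = rms_residual(rng.normal(-3.0, 0.1, size=n_pix))  # varying index

print("rms residual, constant index:", rms_const)
print("rms residual, varying index: ", rms_vary)
```

With a constant index the template is exactly proportional to the foreground and the residual vanishes to rounding error. With a varying index the single coefficient cannot match every direction, and the mismatch grows with the logarithm of the frequency ratio between the template and the cleaned map.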
The propagation of foreground fitting errors to the CMB map and the power spectrum may be properly taken into account in the Commander-type parametric methods (Sect. 3.2.1) in a Bayesian framework. Parametric methods are appropriate for applying our knowledge of the physical processes of the foregrounds, and additional observational constraints are easily imposed as priors on the model parameters of the foregrounds. Wrong modeling of the sky may bias the signal estimate, of course, but models can be tested by, for example, monitoring the value of the reduced χ². The current implementation of the Commander algorithm works in pixel space, where frequency-dependent beams cannot be handled easily. Furthermore, the parametric methods are, in general, computationally intensive. For these reasons, the resolution of the maps should be degraded to, at least, the lowest one in the data set. Thus the Commander map has the lowest angular resolution among the cleaned maps derived by Planck [47], although high angular resolution may not be necessary for future experiments targeting the B-mode polarization caused by the inflationary gravitational waves on large scales. Non-parametric methods are less computationally demanding, and have the advantage that they do not need any assumption about the physical sources of the foregrounds. In any case, because estimations of the foreground components rely on different assumptions, it is desirable to apply several algorithms to check the stability of the results.
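The reduced χ² test of a sky model mentioned above can be sketched for a single pixel. This is not the Commander algorithm, which samples the full posterior; it is a hypothetical linear toy in which the spectral shapes are held fixed and only the amplitudes are fitted. The frequencies, the power-law shapes (including a simple power law standing in for the dust spectrum), the amplitudes, and the unit noise level are all assumptions made for illustration.

```python
import numpy as np

def reduced_chi2(data, design, sigma):
    """Weighted linear least-squares fit; returns chi^2 per degree of freedom."""
    weighted_design = design / sigma[:, None]
    coeffs, *_ = np.linalg.lstsq(weighted_design, data / sigma, rcond=None)
    resid = (data - design @ coeffs) / sigma
    dof = data.size - design.shape[1]
    return float(resid @ resid) / dof

rng = np.random.default_rng(1)
freqs = np.array([30.0, 44.0, 70.0, 100.0, 143.0, 217.0, 353.0])  # GHz, illustrative
sync = (freqs / 30.0) ** -3.0     # assumed synchrotron-like shape
dust = (freqs / 353.0) ** 1.6     # assumed dust-like shape (toy power law)
sigma = np.ones_like(freqs)       # assumed noise per channel

# Simulated pixel containing both components plus Gaussian noise.
data = 100.0 * sync + 50.0 * dust + rng.normal(0.0, sigma)

chi2_sync_only = reduced_chi2(data, sync[:, None], sigma)
chi2_both = reduced_chi2(data, np.column_stack([sync, dust]), sigma)

print("reduced chi^2, synchrotron only: ", chi2_sync_only)
print("reduced chi^2, synchrotron + dust:", chi2_both)
```

The model that omits the dust-like component leaves large residuals in the high-frequency channels and a reduced χ² far above unity, whereas the two-component model gives a value of order unity, which is the kind of diagnostic that flags an inadequate sky model.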
While we may say that the foreground problem is resolved for the temperature CMB anisotropies, studies of the polarization anisotropies are now in progress. An encouraging result was obtained by Katayama and Komatsu [84], who showed that a template cleaning method using three bands (60, 100, and 240 GHz) with a noise level of 2 µK arcmin is able to reach the tensor-to-scalar ratio r ≈ 10⁻³ using the Planck Sky Model 4 [85] (see also [86]). The bias is dominated by the residual synchrotron emission due to the spatial variation of the spectral index, as discussed above. Using the SMICA method, the authors of [87] find that r = 0.1 can be detected at 3σ by Planck, and that a detection of r = 0.001 at 6σ will be possible with the most ambitious experiment. Note that these simulations are based on the pre-launch Planck sky model; later this year the Planck collaboration will publish their polarization data, a new view of our universe in the microwave sky in polarization. It is of great interest to examine real data using the variety of methods available for foreground separation and removal, and to pursue a course toward the detection of gravitational waves, a fossil of inflation in the early universe.