Results from the Wilkinson Microwave Anisotropy Probe

The Wilkinson Microwave Anisotropy Probe (WMAP) mapped the distribution of temperature and polarization over the entire sky in five microwave frequency bands. These full-sky maps were used to obtain measurements of temperature and polarization anisotropy of the cosmic microwave background with unprecedented accuracy and precision. The analysis of two-point correlation functions of temperature and polarization data gives determinations of the fundamental cosmological parameters such as the age and composition of the universe, as well as the key parameters describing the physics of inflation, which is further constrained by three-point correlation functions. WMAP observations alone reduced the six-parameter volume of the flat $\Lambda$ cold dark matter ($\Lambda$CDM) cosmological model by a factor of >68,000 compared with pre-WMAP measurements. The WMAP observations (sometimes in combination with other astrophysical probes) convincingly show the existence of non-baryonic dark matter, the cosmic neutrino background, flatness of spatial geometry of the universe, a deviation from a scale-invariant spectrum of initial scalar fluctuations, and that the current universe is undergoing an accelerated expansion. The WMAP observations provide the strongest ever support for inflation; namely, the structures we see in the universe originate from quantum fluctuations generated during inflation.


Introduction
The WMAP [1] spacecraft was designed to measure the full-sky distribution of temperature differences (anisotropy) and polarization of the cosmic microwave background (CMB). WMAP is the successor of the legendary Cosmic Background Explorer (COBE) satellite, whose spectrograph provided a precision measurement of the CMB blackbody, implying that matter and radiation were in thermal equilibrium, consistent with the expectation of the hot Big Bang theory of the universe [2]. The COBE differential radiometers discovered the primordial ripples in spacetime that existed in the early universe [3]. With 35 times

Fig. 1 Definition of the Stokes parameters with respect to Galactic coordinates (adapted from [35]). "N" and "E" denote the Galactic north and east directions, respectively.

These linear combinations are amplified by High Electron Mobility Transistor (HEMT) amplifiers, yielding the signals $u_1$ and $u_2$, which include the noise added by the amplifiers, $n_1$ and $n_2$, and are scaled by the gains of the amplifiers, $g_1$ and $g_2$.
3. The latter combination, $u_2$, is phase-switched to yield $\pm u_2$. Then $u_1$ and the phase-switched $u_2$ are combined to form $v_l \equiv u_1 \pm u_2$ and $v_r \equiv u_1 \mp u_2$.
4. The combined signals are detected by the square-law detectors (diodes). The outputs, $V_l$ and $V_r$, are proportional to $v_l^2$ and $v_r^2$, respectively.
5. Finally, we compute the difference between $V_l$ and $V_r$, obtaining $\frac{1}{2}(V_l - V_r) = \pm 2 s\, u_1 u_2$, where $s$ is the proportionality constant of the square-law detector. This quantity is thus proportional to the difference between the powers of light coming from the A and B sides; i.e., WMAP measures temperature differences between two points in the sky separated by 141°, and the mean CMB temperature (2.725 K) and many types of undesirable systematic effects cancel out.
We need to convert the measured $\frac{1}{2}(V_l - V_r)$ (in units of voltage) to a temperature difference in thermodynamic units. We do this by using the dipole anisotropy of the CMB. As the Earth orbits the Sun, the L2 point (and hence WMAP) also orbits the Sun at 30 km/s. This motion creates a time-varying dipole anisotropy, which appears as a sinusoidal signal with a period of one year. As we know the mean CMB temperature and the orbital velocity precisely, we know that the amplitude of this signal must be $T_{\rm cmb} v/c = 273~\mu$K. This fixes the proportionality constant between $\frac{1}{2}(V_l - V_r)$ and $T_A - T_B$, where $T_A$ and $T_B$ are the temperatures toward the A and B sides, respectively.
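The calibration arithmetic above can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the inputs are the rounded values quoted in the text (2.725 K and 30 km/s), not the values used in the actual WMAP pipeline.

```python
# Minimal sketch of the orbital-dipole calibration arithmetic.
# Function names are hypothetical; inputs are the values quoted in the text.

SPEED_OF_LIGHT_KM_S = 299_792.458


def orbital_dipole_amplitude_uK(t_cmb_K=2.725, v_km_s=30.0):
    """Return the dipole amplitude T_cmb * v / c in microkelvin."""
    return t_cmb_K * (v_km_s / SPEED_OF_LIGHT_KM_S) * 1.0e6


def volts_per_uK(dipole_amplitude_volts, t_cmb_K=2.725, v_km_s=30.0):
    """Proportionality constant between (V_l - V_r)/2 and T_A - T_B,
    given the amplitude (in volts) of the annual sinusoid in the data."""
    return dipole_amplitude_volts / orbital_dipole_amplitude_uK(t_cmb_K, v_km_s)


amplitude = orbital_dipole_amplitude_uK()
print(f"{amplitude:.1f} uK")  # 272.7 uK, i.e. about 273 uK
```

Dividing the measured amplitude of the annual sinusoid (in volts) by this number gives the gain in volts per microkelvin.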

Polarization
The polarization information is still entangled in $T_A - T_B$ measured by each radiometer. To measure polarization, we need to combine the measurements of a pair of radiometers forming each DA. Let us define $d_1 \equiv (T_A - T_B)_1$ and $d_2 \equiv (T_A - T_B)_2$, where $d_1$ and $d_2$ are the outputs of two radiometers. The action of OMTs gives (equations 10 and 11 of [18])
$$d_1 = I_A + Q_A\cos 2\gamma_A + U_A\sin 2\gamma_A - I_B - Q_B\cos 2\gamma_B - U_B\sin 2\gamma_B, \tag{3}$$
$$d_2 = I_A - Q_A\cos 2\gamma_A - U_A\sin 2\gamma_A - I_B + Q_B\cos 2\gamma_B + U_B\sin 2\gamma_B. \tag{4}$$
Here, $I_A$, $Q_A$, and $U_A$ are the Stokes parameters describing the incoming waves from the A side (and similarly for the B side). We define the Stokes $Q$ and $U$ such that the polarization directions of a pure $Q$ signal are parallel to either Galactic longitudes or latitudes, and the polarization directions of a pure $U$ signal are tilted by 45 degrees from those of a pure $Q$ signal. In other words, a pure $Q$ signal aligns with either the Galactic north-south or east-west direction (see figure 1). Then, $\gamma_A$ is the angle between a meridian through the Galactic poles and the projection of the electric field of each output port of the OMTs on the sky. The sum and difference of the instantaneous outputs of two radiometers thus yield
$$\frac{1}{2}(d_1 + d_2) = I_A - I_B,$$
$$\frac{1}{2}(d_1 - d_2) = Q_A\cos 2\gamma_A + U_A\sin 2\gamma_A - Q_B\cos 2\gamma_B - U_B\sin 2\gamma_B.$$
The sum gives the temperature difference (i.e., difference of unpolarized intensities), while the difference gives a combination of the Stokes $Q$ and $U$.

The polarization angles of radiometers were measured on the ground using a polarized source. The uncertainty of the measurement is 1° (section 2.5 of [41]), and the measurements are within ±1.5° of the design orientation (section 3 of [20]). In flight, we observe Tau A [20,36], and find that the standard deviation of angles measured in five bands is 0.6° (see Table 15 of [36]). This is consistent with (and is smaller than) the scatter of the ground measurements, and we conservatively use 1.5° as an estimate of the systematic error in the polarization angle of WMAP.
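The separation of temperature and polarization by summing and differencing the two radiometer outputs can be illustrated with a small sketch. The sign convention below (the polarized term enters $d_1$ and $d_2$ with opposite signs) is an assumption made for this example, as are the function name and the toy sky values.

```python
import math

# Minimal sketch, assuming each radiometer of a pair sees the polarized
# part Q cos(2*gamma) + U sin(2*gamma) with opposite sign.
# The function name and the toy numbers are hypothetical.


def radiometer_pair(sky_a, sky_b, gamma_a, gamma_b):
    """Return (d1, d2) for Stokes tuples sky_a = (I, Q, U) and sky_b."""
    i_a, q_a, u_a = sky_a
    i_b, q_b, u_b = sky_b
    pol_a = q_a * math.cos(2 * gamma_a) + u_a * math.sin(2 * gamma_a)
    pol_b = q_b * math.cos(2 * gamma_b) + u_b * math.sin(2 * gamma_b)
    d1 = (i_a + pol_a) - (i_b + pol_b)
    d2 = (i_a - pol_a) - (i_b - pol_b)
    return d1, d2


d1, d2 = radiometer_pair((100.0, 3.0, -1.0), (40.0, 0.5, 2.0), 0.3, 1.1)
temperature_difference = 0.5 * (d1 + d2)   # I_A - I_B, no polarization left
polarization_difference = 0.5 * (d1 - d2)  # combination of Q and U only
print(temperature_difference, polarization_difference)
```

The sum is independent of the angles, while the difference changes as $\gamma_A$ and $\gamma_B$ rotate, which is what allows $Q$ and $U$ to be separated once a pixel has been observed at several orientations.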

Map making
The instantaneous outputs of two radiometers per DA yield the temperature and polarization differences between the A and B sides. The next step is to reconstruct the distribution of temperature (minus the mean CMB temperature) and polarization over the entire sky. WMAP scans the full sky in six months. We can then estimate maps of temperature and polarization over the full sky using the six-month data. We first write down the measured time-ordered data (TOD) as (section 3.4 of [18])
$$d_t = \sum_p M_{tp}\, m_p + n_t,$$
where $d_t$ is the TOD of two radiometers, $d = (d_1, d_2)$, measured at a given observation time, $t$; $m_p$ is the actual sky map consisting of $m = (I, Q, U)$ at a given sky location (pixel), $p$; $n_t$ is noise of the TOD; and $M_{tp}$ is the so-called "mapping matrix" which projects $m_p$ onto $d_t$.
For an ideal differential experiment, the mapping matrix is a $2N_t \times 3N_p$ matrix, where $N_t$ is the number of data-recording times and $N_p$ is the number of sky pixels. Each row corresponds to one observation, while each column corresponds to a map pixel. Each row of $M_{tp}$ has 6 non-zero elements. The non-zero elements are $\pm 1$ in the columns corresponding to the observed pixels in the $I$ map; and $\pm\cos 2\gamma_A$, $\pm\sin 2\gamma_A$, $\pm\cos 2\gamma_B$, and $\pm\sin 2\gamma_B$ for the $Q$ and $U$ maps. The plus and minus signs are chosen according to equations (3) and (4). Each observation is associated with 12 non-zero values of $M_{tp}$ that are distributed in two rows.
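To make the sparsity pattern concrete, the two rows of the mapping matrix for one observation can be sketched as dictionaries. The column layout (an $I$ block followed by $Q$ and $U$ blocks), the sign convention, and all names below are assumptions chosen for illustration, not the layout of the WMAP software.

```python
import math

# Minimal sketch of the two mapping-matrix rows for a single observation.
# Assumed (illustrative) column layout: I block, then Q, then U, each
# n_pix columns wide. Assumes pix_a != pix_b.


def mapping_rows(pix_a, pix_b, gamma_a, gamma_b, n_pix):
    """Return sparse rows {column: value} for d1 and d2."""
    def col(stokes, pix):
        return stokes * n_pix + pix

    ca, sa = math.cos(2 * gamma_a), math.sin(2 * gamma_a)
    cb, sb = math.cos(2 * gamma_b), math.sin(2 * gamma_b)
    row_d1 = {col(0, pix_a): 1.0, col(1, pix_a): ca, col(2, pix_a): sa,
              col(0, pix_b): -1.0, col(1, pix_b): -cb, col(2, pix_b): -sb}
    row_d2 = {col(0, pix_a): 1.0, col(1, pix_a): -ca, col(2, pix_a): -sa,
              col(0, pix_b): -1.0, col(1, pix_b): cb, col(2, pix_b): sb}
    return row_d1, row_d2


def apply_row(row, sky):
    """Project a flattened (I, Q, U) sky vector onto one TOD sample."""
    return sum(value * sky[column] for column, value in row.items())


n_pix = 4
# Flattened toy sky: I values, then Q values, then U values.
sky = [100.0, 40.0, 0.0, 0.0] + [3.0, 0.5, 0.0, 0.0] + [-1.0, 2.0, 0.0, 0.0]
row_d1, row_d2 = mapping_rows(0, 1, 0.3, 1.1, n_pix)
d1, d2 = apply_row(row_d1, sky), apply_row(row_d2, sky)
print(len(row_d1), len(row_d2))  # 6 non-zero elements in each row
```

Storing only the non-zero entries matters in practice, since the full matrix has $2N_t$ rows but only 6 entries per row.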
In reality, there are two dominant non-idealities in radiometers that must be taken into account. One is the "bandpass mismatch." We take the difference between two radiometers to measure polarization. While these radiometers were designed and built to have nearly identical frequency responses (bandpass) to the incoming electromagnetic waves, a slight mismatch in the bandpass produces a spurious polarization signal even in the absence of polarization. Suppose that the incoming waves are unpolarized and have a spectrum of $I(\nu)$. Due to the bandpass mismatch of two radiometers, they receive the incoming waves at slightly different effective frequencies, $\nu_1$ and $\nu_2$. As a result, the difference between two radiometers does not vanish, producing a spurious polarization, $s$, given by $s = I(\nu_1) - I(\nu_2) \approx (\nu_1 - \nu_2)\,\partial I/\partial\nu$. While the CMB, whose temperature does not depend on frequency, does not produce a spurious polarization, the other components (such as Galactic emission) that depend on frequency do produce a spurious polarization.
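The size of the spurious signal can be illustrated numerically. The sketch below compares the exact difference $I(\nu_1) - I(\nu_2)$ with its first-order approximation for a toy power-law foreground, and shows that a frequency-independent signal gives zero. The spectra, amplitudes, and effective frequencies are made-up values for illustration, and unit conversions are ignored.

```python
# Minimal sketch of the bandpass-mismatch effect with toy spectra.
# All amplitudes and frequencies are hypothetical illustration values.


def spurious_signal(spectrum, nu1, nu2):
    """Exact spurious polarization s = I(nu1) - I(nu2)."""
    return spectrum(nu1) - spectrum(nu2)


def spurious_signal_linear(spectrum, nu1, nu2, step=1.0e-4):
    """First-order estimate (nu1 - nu2) * dI/dnu, with the derivative
    taken numerically at the midpoint frequency."""
    nu_mid = 0.5 * (nu1 + nu2)
    slope = (spectrum(nu_mid + step) - spectrum(nu_mid - step)) / (2 * step)
    return (nu1 - nu2) * slope


def cmb(nu):
    """Frequency-independent temperature (toy value)."""
    return 70.0


def power_law_foreground(nu):
    """Toy foreground falling as nu^-3, normalized at 23 GHz."""
    return 100.0 * (nu / 23.0) ** -3


nu1, nu2 = 22.9, 23.1  # slightly mismatched effective frequencies (GHz)
print(spurious_signal(cmb, nu1, nu2))  # 0.0
print(spurious_signal(power_law_foreground, nu1, nu2),
      spurious_signal_linear(power_law_foreground, nu1, nu2))
```

The steeper the spectrum, the larger the spurious term for a given mismatch, which is why bright Galactic emission is the main concern.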
Fortunately, it is relatively straightforward to remove this effect. Equations (3) and (4) show that the real polarization signals are modulated by the angle $\gamma$. On the other hand, a spurious polarization is independent of $\gamma$. Therefore, we can separate the real and spurious polarization signals if we have enough coverage in $\gamma$. We modify equations (3) and (4) by adding a $\gamma$-independent spurious term for each of the A and B sides, and expand the mapping matrix to a $2N_t \times 4N_p$ matrix. (Each row has 8 non-zero elements.) While WMAP's scan pattern allows for a uniform coverage in $\gamma$ near the ecliptic poles, it covers only 30% of possible $\gamma$ on the ecliptic plane. This produces noisy modes in the reconstructed sky maps, which must be properly de-weighted. (By comparison, Planck's coverage is < 4% on the ecliptic plane.)

The second non-ideality is the "transmission imbalance," which is the difference between the A and B sides; namely, the A and B sides do not necessarily have equal responses to the incoming waves due to loss (i.e., imperfect transmission) in the system. The spurious polarization is an additive effect, but the transmission imbalance is a multiplicative effect (equations 19 and 20 of [18]), characterized by the transmission imbalance factor, $x_{\rm im}$, which has been measured using the responses of radiometers to the CMB dipole (table 2 of [18]). We include the transmission imbalance in the mapping matrix by multiplying the A- and B-side elements by $1 + x_{\rm im}$ and $1 - x_{\rm im}$, respectively.

The optimal estimator for a sky map, $\tilde{m}_p$, that is unbiased and has the minimum variance, is given by (in matrix notation)
$$\tilde{m} = (M^T N^{-1} M)^{-1} M^T N^{-1} d,$$
where $N$ ($= N_{tt'}$) is the noise matrix of the TOD. The TOD noise of WMAP is stationary, in the sense that the noise matrix is a function of $\Delta t \equiv |t - t'|$. The inverse noise matrix, $N_i^{-1}(\Delta t)$, is constructed from the TOD noise correlation function, $N_i(t)$, following equation 21 of [18], where $\Delta t$ is in units of samples, and $\Delta t_{\rm max}$ is the time lag at which $N_i(t)$ crosses zero, typically ≈ 600 seconds. The coefficients of that expression, $C$ and $K$, are chosen such that $N_i^{-1}(0)$ is normalized to unity, and that the mean of $N_i^{-1}(\Delta t)$ over $0 \le \Delta t < \Delta t_{\rm max}$ vanishes. The TOD noise correlation function, $N_i(t)$, is measured from the data and a functional form is given in equation (4) of [18]. Given this noise matrix of the TOD, we solve the equation, $(M^T N^{-1} M)\tilde{m} = M^T N^{-1} d$, to find a sky map solution, $\tilde{m}$, using the conjugate gradient method.
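The structure of this solve can be illustrated with a toy problem. The sketch below is not the WMAP pipeline: it assumes temperature only, white noise ($N$ equal to the identity), and a handful of pixels, and all names are hypothetical. Because purely differential data do not constrain the mean of the map, the recovered map is defined up to a constant; starting the iteration from zero returns the zero-mean solution.

```python
# Toy differential map-making with conjugate gradients (pure Python).
# Assumptions: temperature only, N = identity, hypothetical names.


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def apply_normal_matrix(pairs, x):
    """Compute (M^T M) x, where each row of M has +1 at pixel a, -1 at b."""
    out = [0.0] * len(x)
    for a, b in pairs:
        diff = x[a] - x[b]
        out[a] += diff
        out[b] -= diff
    return out


def project_data(pairs, data, n_pix):
    """Compute M^T d."""
    out = [0.0] * n_pix
    for (a, b), value in zip(pairs, data):
        out[a] += value
        out[b] -= value
    return out


def conjugate_gradient(apply_a, rhs, max_iter=50, rel_tol=1.0e-20):
    """Solve A x = rhs for a symmetric positive semi-definite A."""
    x = [0.0] * len(rhs)
    r = list(rhs)
    p = list(rhs)
    rs = dot(r, r)
    rs0 = rs
    for _ in range(max_iter):
        if rs <= rel_tol * rs0:
            break
        ap = apply_a(p)
        alpha = rs / dot(p, ap)
        x = [xi + alpha * pi for xi, pi in zip(x, p)]
        r = [ri - alpha * api for ri, api in zip(r, ap)]
        rs_new = dot(r, r)
        p = [ri + (rs_new / rs) * pi for ri, pi in zip(r, p)]
        rs = rs_new
    return x


true_map = [10.0, -5.0, 3.0, 0.0, -8.0]            # zero-mean toy sky
pairs = [(0, 1), (1, 2), (2, 3), (3, 4), (4, 0), (0, 2)]
tod = [true_map[a] - true_map[b] for a, b in pairs]  # noiseless differences
rhs = project_data(pairs, tod, len(true_map))
solution = conjugate_gradient(lambda v: apply_normal_matrix(pairs, v), rhs)
print([round(value, 6) for value in solution])
```

The matrix $M^T N^{-1} M$ is never formed explicitly; only its action on a vector is needed, which is what makes the conjugate gradient method practical for maps with millions of pixels.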
As described in section 2.1, we use the CMB dipole to convert the input signals (in voltages) to thermodynamic temperatures. The uncertainty in this conversion (calibration uncertainty) per DA is 0.2% for the final nine-year temperature and polarization maps, which has remained unchanged since the five-year analysis (section 4 of [23]). Figure 2 shows the nine-year solutions to the full-sky $I$ maps (minus dipole anisotropy), while figures 3 and 4 show the Stokes $Q$ and $U$ maps, respectively, in five frequency bands.

Galactic and extra-galactic foreground emission

Temperature
Figure 2 shows that the distributions of the measured temperatures at high frequencies (41, 61, and 94 GHz) at high Galactic latitudes are quite similar. This means that, at these frequencies, the temperature data at high Galactic latitudes are dominated by the CMB, which is independent of frequency. On the other hand, the data at lower frequencies (23 and 33 GHz) are clearly affected by strong Galactic emission, and the data near the Galactic plane are dominated by the Galactic emission at all five frequencies. Also, there are many extra-galactic sources (most of which are synchrotron sources) all over the sky, which need to be removed from the cosmological analysis.
To reduce the effects of the so-called "foreground" emission from our Galaxy and extragalactic sources, we combine two methods: one is to simply mask the pixels which are strongly affected by the foreground emission; and the other is to estimate and remove the foreground emission from the sky maps.
Three Galactic foreground components are known to dominate in the WMAP frequencies (section 3 of [5]): synchrotron, free-free, and dust. The antenna temperatures of these three components go (very approximately) as $\propto \nu^{-3}$, $\nu^{-2}$, and $\nu^{2}$, respectively. As a result, synchrotron dominates at the lowest frequencies (23 and 33 GHz), free-free dominates in some

We define the mask for the temperature data as follows (section 2 of [25]). We first smooth the 23 GHz (K band) map to one degree resolution, and remove an estimate of the CMB from this map. The CMB is estimated by the internal linear combination (ILC) method (section 5.2 of [19]). We then mask the pixels brighter than a certain threshold temperature, until 75% or 85% of the sky is left unmasked. We repeat the same procedure for the 41 GHz (Q band) map. The masks defined by the K- and Q-band maps are added to form two masks, "KQ75" and "KQ85" masks, depending on how much sky was left unmasked in the K- or Q-band map.
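The thresholding step can be sketched in a few lines. This is only an illustration of the idea (keep the faintest fraction of pixels in each band, then mask a pixel if either band masks it); the function names and the eight-pixel toy maps are hypothetical, and the smoothing and CMB subtraction described above are assumed to have been done already.

```python
# Minimal sketch of a brightness-threshold mask built from two bands.
# True means "unmasked". Names and toy values are hypothetical.


def brightness_mask(values, keep_fraction):
    """Keep the faintest keep_fraction of pixels; mask the brightest ones."""
    n_keep = int(round(keep_fraction * len(values)))
    order = sorted(range(len(values)), key=lambda idx: values[idx])
    keep = [False] * len(values)
    for idx in order[:n_keep]:
        keep[idx] = True
    return keep


def combine_masks(mask_one, mask_two):
    """A pixel stays unmasked only if both input masks keep it."""
    return [a and b for a, b in zip(mask_one, mask_two)]


k_band = [5.0, 1.0, 9.0, 2.0, 0.5, 7.0, 3.0, 0.2]   # toy CMB-subtracted maps
q_band = [1.0, 0.3, 2.0, 4.0, 0.1, 0.2, 0.6, 0.05]
mask_k = brightness_mask(k_band, 0.75)
mask_q = brightness_mask(q_band, 0.75)
combined = combine_masks(mask_k, mask_q)
print(sum(combined) / len(combined))  # 0.625
```

Because the two bands mask different pixels, the combined mask retains less sky than either input, which is consistent with the final masks retaining less than the nominal 75% or 85%.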
The mask for extra-galactic sources is created using the locations of the known bright radio sources in the literature (as described in section 7 of [5]), the sources found in the WMAP nine-year data (section 5.2.2 of [37]), and additional sources found in the Planck early release compact source catalog at 100 GHz [42]. An exclusion radius of 1.2 degrees is used for sources brighter than 5 Jy in any WMAP band, and an exclusion radius of 0.6 degrees is used for fainter sources.

Fig. 3 Full-sky nine-year Stokes Q maps in five frequency bands measured by WMAP (adapted from [37]). Maps are smoothed to a common Gaussian beam of 2 degrees to suppress noise. See figure 1 for the correspondence between the signs of Q and polarization directions.
To further reduce foreground emission in unmasked pixels, we estimate the distribution of the diffuse Galactic foreground emission over the full sky and remove it from the WMAP maps. We use the difference between the 23 and 33 GHz maps (which does not contain the CMB) for synchrotron emission; a map of Hα [43] corrected for extinction and scattering (section 5.3.1 of [37]) for free-free emission; and a map of dust emission [44]. These three maps are simultaneously fit to and removed from the 41, 61, and 94 GHz maps, yielding the foreground-reduced maps at these frequencies.
Finally, we slightly enlarge the KQ75 and KQ85 masks by further masking the regions which have significant excess in the differences between the foreground-reduced 41 and 61 GHz maps, and 61 and 94 GHz maps. The resulting nine-year temperature masks, "KQ75y9" and "KQ85y9," retain 68.8% and 74.8% of the sky, respectively. The former is used for testing Gaussianity of the temperature data, while the latter is used for the power spectrum analysis.

Polarization
Figures 3 and 4 show that the polarization maps at low frequencies are dominated by the Galactic emission. We thus mask the regions strongly contaminated by foreground emission, and remove an estimate of the foreground emission from unmasked pixels.
Two polarized foreground components are known to dominate in the WMAP frequencies [20]: synchrotron and dust. Therefore, one might think that reducing foreground in polarization is easier than in temperature, as we have fewer foreground components. In reality, the polarization analysis is more challenging because the CMB signal in polarization is 10 times fainter than in temperature. While the polarized foreground emission is clearly seen in figures 3 and 4, there is no clear evidence for the CMB signals; thus, we need more sophisticated statistical analysis to extract faint polarization signals of the CMB.

We define the mask of polarization maps using the 23 GHz map (which is dominated by synchrotron) and a model of dust emission [20]. We first create lower resolution $Q$ and $U$ maps at 23 GHz, each containing 3072 pixels. We then compute the total polarization intensity, $Q^2 + U^2$. As the presence of noise produces a small positive bias in this quantity, we remove it using our estimate of noise in the maps. Finally, we mask the pixels that are brighter than a certain threshold polarization intensity. We choose the threshold to be 0.6 times the mean polarization intensity, and call this mask the "P06" mask. As for dust, we use a map of dust emission constructed from the WMAP temperature data using the Maximum Entropy Method (MEM) (section 5 of [5]). We choose a threshold in this map to be 0.5 times the maximum value found in the polar caps ($|b| > 60$ degrees), and mask the pixels brighter than this threshold value. We then add this dust mask to the P06 mask. We find that extra-galactic sources are minimally polarized in the WMAP frequencies; thus, we only mask ten bright sources outside of the P06 mask [20,37]. The combined P06 mask including synchrotron, dust, and extra-galactic sources retains 73.2% of the sky for the cosmological analysis.
To further reduce foreground emission in unmasked pixels, we use the 23 GHz polarization maps to trace synchrotron. As for dust, we use (equation 15 of [20])
$$Q_{\rm dust}(\hat{n}) = I_{\rm dust}(\hat{n})\, g_{\rm dust}(\hat{n}) \cos[2\gamma_{\rm dust}(\hat{n})], \qquad U_{\rm dust}(\hat{n}) = I_{\rm dust}(\hat{n})\, g_{\rm dust}(\hat{n}) \sin[2\gamma_{\rm dust}(\hat{n})],$$
where $I_{\rm dust}$ is the same dust map that we used for the temperature analysis [44]. Dust emission is polarized because dust grains are not spherical, and are aligned with coherent magnetic fields in our Galaxy such that the semi-major axes of grains are perpendicular to the fields. As a result, the polarization directions of dust emission are perpendicular to the field directions. The dust polarization we observe along a particular line of sight is the projection of multiple polarization signals along the line of sight, which is affected by the geometry of the fields. We take this projection effect into account by the function $g_{\rm dust}$, which is calculated using a simple model of fields given in section 4.1 of [20].

How do we estimate $\gamma_{\rm dust}$? As the polarization directions of synchrotron are also perpendicular to the field directions, we could take the polarization directions at 23 GHz as an estimate for $\gamma_{\rm dust}$; however, alignment of dust grains with the fields is not necessarily perfect, which yields some differences between $\gamma_{\rm dust}$ and $\gamma_{\rm synch}$. Therefore, we use the polarization directions measured toward stars. While the intrinsic starlight is usually unpolarized (or polarized very weakly), the observed starlight can be polarized due to selective extinction of the starlight when it passes through regions with dust grains. As it gets more extinction along the semi-major axes of dust grains, the observed starlight polarization direction is precisely orthogonal to the polarization direction of dust emission from the same location. We have compiled the existing measurements of starlight polarization in the literature (section 4.1.2 of [20]), and created a map of the starlight polarization directions, $\gamma_*(\hat{n})$. We then compute the dust polarization direction as $\gamma_{\rm dust}(\hat{n}) = \gamma_*(\hat{n}) + \pi/2$. The correlation between $\gamma_{\rm dust}$ computed in this way and $\gamma_{\rm synch}$ shows that these two angles typically agree to within 20 degrees.
We simultaneously fit the 23 GHz polarization maps and dust polarization maps to reduce polarized foregrounds from the higher frequency maps at 33, 41, 61, and 94 GHz. Due to much lower signal-to-noise ratios in the polarization maps, removal of the polarized foreground presents a great challenge to the WMAP analysis. The error in the foreground removal has a non-negligible impact on the inferred value of the optical depth to Thomson scattering, $\tau$. For example, an alternative foreground removal method presented in [30] shifts the value of $\tau$ by as much as the 1-σ statistical uncertainty.

Power spectrum measurements
The steps we have described so far give us foreground-reduced temperature maps at 41, 61, and 94 GHz, and foreground-reduced polarization maps at 33, 41, 61, and 94 GHz. The temperature maps come with the KQ75 and KQ85 masks depending on the purpose of the cosmological analysis, and the polarization maps come with the P06 mask. In this section, we describe how to measure the angular power spectra from these maps.

Temperature
Assuming that the distribution of CMB temperatures in the foreground-reduced maps outside the mask is given by a Gaussian distribution, which is a good approximation as described in section 6, a complete description of the measured CMB temperatures is given by the following probability density function (PDF):
$$p(T) = \frac{\exp\left[-\frac{1}{2}\sum_{ia,jb} \delta T_{ia}\,(S+N)^{-1}_{ia,jb}\,\delta T_{jb}\right]}{\sqrt{\det[2\pi(S+N)]}}, \tag{17}$$
where $\delta T_{ia} \equiv T_{ia} - T_{\rm cmb}$ is the difference between the CMB temperature toward a direction (or a sky pixel) $i$ and the mean CMB temperature, $T_{\rm cmb} = 2.725$ K, with an extra index $a$ denoting a DA and an observed period (e.g., first year, second year, etc.). The total covariance matrix consists of the signal matrix, $S_{ia,jb}$, and the noise matrix, $N_{ia,jb}$. These matrices are $M_{\rm pix} M_{\rm DA} M_{\rm year} \times M_{\rm pix} M_{\rm DA} M_{\rm year}$ matrices, where $M_{\rm pix}$ is the number of pixels, $M_{\rm DA} = 6$ is the number of DAs used for the temperature analysis (2 at 61 GHz and 4 at 94 GHz), and $M_{\rm year} = 9$ is the number of years. For WMAP, the noise matrix vanishes unless $a = b$, i.e., noise is uncorrelated between different DAs or years.
As WMAP is a differential experiment measuring temperature differences between two points separated by 141 degrees, there is a pixel-to-pixel noise correlation at 141 degrees (figure 11 of [10]). However, this correlation has a negligible influence on the temperature analysis, as the signal covariance due to the CMB totally dominates at large angular scales where this pixel correlation is important. We thus have a simple description of the noise matrix:
$$N_{ia,jb} = \frac{\sigma_0^2}{n_{{\rm obs},i}}\,\delta_{ij}\,\delta_{ab},$$
where $\sigma_0$ is the noise per observation and $n_{{\rm obs},i}$ gives the corresponding effective number of observations in a sky pixel $i$. We also set the noise matrix to have a large value in the masked pixels, i.e., we effectively set the noise level to be infinity at the masked pixels.
The CMB signal matrix is given by
$$S_{ia,jb} = \sum_\ell \frac{2\ell+1}{4\pi}\, C_\ell\, b_\ell^{(a)} b_\ell^{(b)}\, P_\ell(\hat{n}_i \cdot \hat{n}_j),$$
where $C_\ell$ is the temperature power spectrum of the CMB, $b_\ell^{(a)}$ is the so-called "beam transfer function," which is the Legendre transform of a symmetrized beam profile of a given DA, $\hat{n}_i$ is the unit vector toward pixel $i$, and $P_\ell(x)$ is the usual Legendre polynomial. An additional smearing due to pixelization may also be included in $b_\ell^{(a)}$.

[37]). Then, the physical optics modeling is used from the inner beams out to 7.0, 5.5, 5.0, 4.0, and 3.5 degrees from the beam center at 23, 33, 41, 61, and 94 GHz, respectively. The beam response at greater distances from the beam center (i.e., far sidelobes) has been measured on the ground before launch and in flight using the Moon [9].
The accurate determination of the beam transfer functions is crucial for the accurate recovery of the intrinsic CMB power spectrum, $C_\ell$, as the errors we make in $b_\ell^{(a)}$ propagate directly into $C_\ell$. We estimate that the 1-σ uncertainty in the recovered $C_\ell$ from the nine-year data due to the uncertainty in the beam transfer functions is 0.6% at ℓ 100. This uncertainty is coherent (correlated) over a wide range in $\ell$, and must be included in the cosmological analysis of the CMB power spectrum.
How do we infer cosmological parameters from the CMB maps? We can calculate $C_\ell$ as a function of cosmological parameters using a linear Boltzmann code such as CMBFAST [45], CAMB [46], or CLASS [47]. Ideally, we wish to evaluate equation (17) directly as a function of cosmological parameters given our knowledge of noise, mask, and beam transfer functions; namely, we interpret equation (17) as the likelihood function of temperature data given cosmological parameters, $p(T|\theta)$, where $\theta$ denotes a set of cosmological parameters. We then use Bayes' theorem to obtain the posterior probability of the parameters given the temperature data as $p(\theta|T) \propto p(T|\theta)\,p(\theta)$. Here, $p(\theta)$ is the prior probability of cosmological parameters, and we take it to be uniform within a certain reasonable range of $\theta$. We then calculate the best-fit values of $\theta$ and the 68% confidence intervals, etc.
Another approach is to use equation (17) as the likelihood function of temperature data given power spectra, $p(T|C_\ell)$, and obtain the posterior probability as $p(C_\ell|T) \propto p(T|C_\ell)\,p(C_\ell)$ with uniform $p(C_\ell)$ within a certain reasonable range of $C_\ell$. We then evaluate $p(C_\ell|T)$ for theoretically computed $C_\ell$ with various values of $\theta$, and find the best-fit values of $\theta$ and the 68% confidence intervals, etc. In other words, we write the posterior probability of $\theta$ as $p(\theta|T) \propto p(C_\ell(\theta)|T)$, because we know how to calculate $C_\ell$ theoretically as a function of $\theta$.
The difference between these two approaches is that the latter approach produces a convenient intermediate product, $p(C_\ell|T)$, which can be made much faster to evaluate than the full likelihood function using the so-called "Blackwell-Rao (BR) estimator" [48]. Note that the form of $p(C_\ell|T)$ is non-Gaussian even though $p(T|C_\ell)$ is a Gaussian (equation 17), as $C_\ell$ is a quadratic function of temperatures. The central limit theorem makes $p(C_\ell|T)$ closer to a Gaussian distribution for large values of $\ell$, but it is important to use the full non-Gaussian form of $p(C_\ell|T)$ at small values of $\ell$ for the cosmological parameter estimation.

While it is certainly possible to obtain $p(C_\ell|T)$ using the BR estimator out to large values of $\ell$ [49,50], the computational cost is still quite substantial. Therefore, the WMAP team has adopted a hybrid approach: we use the BR estimator to compute the full $p(C_\ell|T)$ at $\ell \le 32$ from the ILC map with the KQ85 mask (following section 6 of [37]), and use the so-called "quadratic maximum likelihood (QML) estimator" [51,52] for $\ell > 32$. The QML estimator, $\hat{C}_\ell$, is obtained by Taylor-expanding the logarithm of equation (17) up to second order in $\hat{C}_\ell - C_\ell$, and maximizing it with respect to $C_\ell$, i.e., $d\ln p(T|C_\ell)/dC_\ell|_{\hat{C}_\ell} = 0$. The solution is written in terms of $\tilde{a}^{(a)}_{\ell m}$, the spherical harmonics coefficients of a map filtered by $(S+N)^{-1}$ (with repeated indices summed in the filtering). The QML estimator gives our best estimate for $C_\ell$ at each $\ell$ if we use the correct $C_\ell$ in the $S$ matrix. We can therefore improve the performance of the estimator by iterating the estimation: assume some reasonable $C_\ell$, estimate $C_\ell$, use the estimated $C_\ell$ to recompute the QML estimator, and repeat. If an incorrect $C_\ell$ is used, the QML estimator does not give the minimum variance, but it is still unbiased.
While the QML estimator gives an estimate, we still need to calculate the form of the posterior distribution of $C_\ell$. We know that a Gaussian approximation is not accurate enough; thus, we combine a Gaussian distribution and a log-normal distribution with appropriate weights to obtain an improved form of the posterior distribution (following section 2.1 of [15]). We use the nine-year temperature data at 61 and 94 GHz, which have the highest angular resolutions, to compute $C_\ell$. Figure 5 shows the nine-year measurements of $C_\ell$ along with estimates of the 68% CL error bars. The error bars are calculated as follows. Given the form of $p(C_\ell|T)$, we calculate the second-order moment (variance) of $C_\ell$, and parametrize it as
$$\langle \Delta C_\ell^2 \rangle = \frac{2\,(C_\ell + N_\ell)^2}{(2\ell+1)\, f_{{\rm sky},\ell}^2}, \tag{22}$$
where $N_\ell$ shows the contribution from instrumental noise and a parameter $f_{{\rm sky},\ell}$ may be regarded as the effective fraction of sky used for the analysis at each $\ell$. We know how to calculate $N_\ell$ from the known properties of noise and beam transfer functions; thus, for a given value of $C_\ell$, the only unknown quantity is $f_{{\rm sky},\ell}$. Equation (22) thus provides the definition of $f_{{\rm sky},\ell}$, which is a slowly-varying function of $\ell$ (section 2.2.1 of [15]). This equation shows that there is an irreducible uncertainty even in the absence of noise, $2C_\ell^2/[(2\ell+1) f_{{\rm sky},\ell}^2]$. This is the so-called "cosmic variance" term, which arises from the fact that $C_\ell$ is the variance of CMB temperatures, and only $2\ell+1$ samples are available for estimating the variance at each $\ell$.
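The split of the error bar into a cosmic variance part and an instrumental noise part (the two terms quoted in the caption of figure 5) can be sketched as follows. The function name and the numerical inputs are hypothetical and chosen only to illustrate the formula; they are not WMAP values.

```python
import math

# Minimal sketch of the power-spectrum error budget at one multipole.
# The variance is taken as 2 (C + N)^2 / [(2 ell + 1) f_sky^2].
# Function name and inputs are hypothetical illustration values.


def cl_error_terms(ell, c_ell, n_ell, f_sky):
    """Return (cosmic, noise, total) standard deviations of C_ell."""
    denom = (2 * ell + 1) * f_sky ** 2
    cosmic = math.sqrt(2.0 * c_ell ** 2 / denom)
    noise = math.sqrt(2.0 * (2.0 * c_ell * n_ell + n_ell ** 2) / denom)
    total = math.sqrt(2.0 * (c_ell + n_ell) ** 2 / denom)
    return cosmic, noise, total


cosmic, noise, total = cl_error_terms(ell=200, c_ell=1.0, n_ell=0.1, f_sky=0.75)
print(cosmic, noise, total)

# Full sky, no noise, quadrupole: fractional error sqrt(2/5), about 0.63.
quadrupole_cosmic, _, _ = cl_error_terms(ell=2, c_ell=1.0, n_ell=0.0, f_sky=1.0)
print(quadrupole_cosmic)
```

The two partial terms add in quadrature to the total, and the cosmic variance term alone limits the precision at low multipoles regardless of how low the instrumental noise is.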

Fig. 5 Nine-year angular power spectrum of the CMB temperature (adapted from [37]). While we measure $C_\ell$ at each $\ell$ in $2 \le \ell \le 1200$, the points with error bars show the binned values of $C_\ell$ for clarity. The error bars show the standard deviation of $C_\ell$ from instrumental noise, $[2(2C_\ell N_\ell + N_\ell^2)/(2\ell+1) f_{{\rm sky},\ell}^2]^{1/2}$. The shaded area shows the standard deviation from the cosmic variance term, $[2C_\ell^2/(2\ell+1) f_{{\rm sky},\ell}^2]^{1/2}$ (except at very low $\ell$ where the 68% CL from the full non-Gaussian posterior probability is shown). The solid line shows the theoretical curve of the best-fit ΛCDM cosmological model.

Polarization
The polarization analysis is similar to the temperature analysis. We begin with a Gaussian PDF for temperature and polarization:
$$p(m) = \frac{\exp\left[-\frac{1}{2}\, m^T (S+N)^{-1} m\right]}{\sqrt{\det[2\pi(S+N)]}}, \tag{23}$$
where $m = (\delta T, Q, U)$, and the signal matrix contains all the power spectrum combinations such as $C_\ell^{TT}$, $C_\ell^{TE}$, $C_\ell^{EE}$, and $C_\ell^{BB}$ (as well as parity-violating combinations, $C_\ell^{TB}$ and $C_\ell^{EB}$, if necessary). The explicit expressions are given in the appendix of [53].
The TE power spectrum does not add much to the parameter constraints but is included in the model fits. The most important information we obtain from the polarization likelihood is the optical depth, $\tau$, from the EE power spectrum at $\ell \lesssim 10$. We can evaluate the exact likelihood function given by equation (23) for such low multipoles, using the steps described in appendix D of [20]. More precisely, we use equation (23) to calculate the likelihood using the data at $\ell \le 23$. We use the polarization maps at 33, 41, 61, and 94 GHz, while we use the ILC map for the temperature. Figure 6 shows the likelihood of the EE power spectrum, $\ell(\ell+1)C_\ell^{EE}/(2\pi)$, for $\ell = 2$ through 7.

Fig. 6 Likelihood functions of the nine-year EE power spectrum for $\ell = 2$ through 7 obtained from equation (23) (adapted from [37]). These data fix the optical depth, $\tau$. The diamonds show the theoretical values of the best-fit ΛCDM cosmological model.

It is still useful to compute the power spectrum of TE at high multipoles. For this we use a simplified approach: we do not weight the temperature maps at 61 and 94 GHz, while we weight the polarization maps at 41, 61, and 94 GHz by $\sigma_0^2/n_{{\rm obs},i}$. We then compute $(2\ell+1)^{-1}\sum_m a^T_{\ell m} a^{E*}_{\ell m}$, and deconvolve the effects of the mask and weight following appendix A of [13]. Figure 7 shows the nine-year measurements of $C_\ell^{TE}$ along with estimates of the 68% CL error bars. The error bars are calculated as follows:
$$\langle (\Delta C_\ell^{TE})^2 \rangle = \frac{(C_\ell^{TT} + N_\ell^{TT})(C_\ell^{EE} + N_\ell^{EE}) + (C_\ell^{TE})^2}{(2\ell+1)\, f^T_{{\rm sky},\ell}\, f^E_{{\rm sky},\ell}},$$
where $N_\ell^{TT}$ and $N_\ell^{EE}$ are the noise bias spectra of the temperature and E-mode polarization, respectively, and $f^T_{{\rm sky},\ell}$ and $f^E_{{\rm sky},\ell}$ are the effective sky fractions of the temperature and E-mode polarization data, respectively.

Fig. 7 Nine-year angular cross power spectrum of the CMB temperature and E-mode polarization (adapted from [37]). While we measure $C_\ell^{TE}$ at each $\ell$ in $2 \le \ell \le 1000$, the points with error bars show the binned values of $C_\ell^{TE}$ for clarity. The error bars show the standard deviation of $C_\ell^{TE}$, which includes both the instrumental noise and the cosmic variance. The solid line shows the theoretical curve of the best-fit ΛCDM cosmological model.

As the temperature and E-mode polarization are correlated, we can create images of E-mode polarization around temperature spots by averaging the polarization data around hot and cold spots. Figure 8 shows average images of temperature and polarization data. We find that the polarization data around hot and cold spots exhibit radial and tangential polarization patterns, as predicted by simulations. What is the physics behind them?
First of all, the necessary and sufficient conditions for generating non-zero polarization of the CMB are to have Thomson scattering and quadrupolar temperature anisotropy around an electron. Frequent Thomson scatterings between photons and electrons suppress quadrupolar temperature anisotropy around an electron, and thus we need to wait until the photon decoupling epoch (at which photons and electrons become less strongly coupled) to produce polarization. How, then, is quadrupolar anisotropy around an electron created?
It turns out that polarization (for scalar modes) traces a velocity gradient field of the plasma around gravitational potentials. Suppose that a packet of the plasma is falling into the bottom of the potential well. Due to acceleration, a velocity gradient is generated: the front of the packet falls faster than the back of the packet. Therefore, an electron at the center of the packet observes redshifted photons from both the front and back of the packet, whereas there is no redshift or blueshift from the sides of the packet. This produces a quadrupolar radiation pattern (colder along the motion of the packet and hotter in the perpendicular directions), and the produced polarization is parallel to the motion of the packet. The polarization pattern around a spherically symmetric gravitational potential well is radial, and the magnitude of radial polarization is maximal at twice the sound horizon radius at the decoupling epoch (or 1.2 degrees in the sky) from the bottom of the potential well [35]. As the packet approaches the bottom of the potential well, the packet decelerates because of a pressure gradient. In the adiabatic initial condition, the photon density is high at the bottom of the potential well, producing a pressure gradient to decelerate motion of the plasma falling into the potential well. The front of the packet falls slower than the back of the packet. Therefore, an electron at the center of the packet observes blueshifted photons from both the front and back of the packet, whereas there is no redshift or blueshift from the sides of the packet. This produces the opposite quadrupolar radiation pattern (hotter along the motion of the packet and colder in the perpendicular directions), and the produced polarization is tangential to the motion of the packet. The magnitude of tangential polarization is maximal at the sound horizon radius (or 0.6 degrees in the sky) from the bottom of the potential well [35].
These predictions have been confirmed by the WMAP polarization data. The bottom panels of Figure 8 show the average polarization directions measured around hot and cold temperature spots. On these angular scales (a few degrees), hot and cold spots correspond to potential wells and hills, respectively. (The high photon energy density at the bottom of the well overcomes the Sachs-Wolfe effect, turning potential wells into hot spots in the sky.) Therefore, we expect each hot spot to come with the radial and tangential polarization patterns at 1.2 and 0.6 degrees from the center, respectively, and each cold spot to come with the opposite patterns. As the magnitude of polarization is small, WMAP cannot detect polarization around each spot; however, by averaging polarization patterns around many spots, we can detect polarization. There are 12387 hot spots and 12628 cold spots outside the Galactic mask in the WMAP seven-year temperature map. Averaging the polarization data around these spots, the expected polarization patterns (shown in the "Simulation" columns in Figure 8) are clearly detected in the data (shown in the "WMAP Data" columns) [35].
The TE cross power spectrum and the average polarization images offer a powerful, precision test of the standard cosmological model. We fix the basic six cosmological parameters by fitting the temperature power spectrum at 2 ≤ ℓ ≤ 1200 and the E-mode polarization power spectrum at low multipoles. We can then predict the cross power spectrum without any additional free parameters. The prediction matches the data at the precision shown in Figures 7 and 8. This is a great triumph of the standard cosmological model.
Table 1 Six cosmological parameters of the standard flat ΛCDM model, determined from the CMB data alone. We show the constraints from the WMAP nine-year temperature and polarization power spectra [37,38]; the WMAP data combined with the ACT [55] and SPT [56] temperature spectra ("WMAP+ACT+SPT"); and the Planck 15.5-month temperature power spectrum combined with the WMAP nine-year polarization power spectrum ("Planck+WP") [57].

Parameters
The TE cross power spectrum also offers a powerful test of one of the generic predictions of cosmic inflation: the presence of "super-horizon" fluctuations, whose wavelength is greater than the horizon size at the decoupling time. This test is possible because polarization is generated only when there are free electrons. The reionization of the universe at z ∼ 10 can generate polarization only on very large angular scales, ℓ ≲ 10; thus, any TE signals at high multipoles must be generated at the decoupling epoch. The angle subtended by the radius of the horizon at the decoupling epoch is 1.2 degrees, which corresponds to ℓ = 150. Therefore, the anti-correlation seen in the TE cross power spectrum at ℓ < 150 provides direct evidence for the presence of super-horizon fluctuations at the decoupling epoch, i.e., the key prediction of inflation [16,54].
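The correspondence between the 1.2-degree horizon scale and ℓ = 150 can be checked with the common rule of thumb ℓ ≈ 180°/θ. The short Python sketch below applies it; the function name `multipole_from_angle` is introduced here for illustration only, and the relation is an approximation rather than an exact mapping between angle and multipole.

```python
def multipole_from_angle(theta_deg):
    """Approximate multipole for an angular scale in degrees (l ~ 180 / theta)."""
    return 180.0 / theta_deg


# Horizon radius at decoupling as seen on the sky, as quoted in the text.
horizon_angle_deg = 1.2
ell_horizon = multipole_from_angle(horizon_angle_deg)

print(f"theta = {horizon_angle_deg} deg  ->  l ~ {ell_horizon:.0f}")
```

Under this rule of thumb, TE signals at multipoles below the resulting value probe scales larger than the horizon at decoupling.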

Standard six parameters
The WMAP nine-year temperature and polarization data are consistent with the minimal six-parameter flat ΛCDM model [37,38]. The high-ℓ temperature power spectrum (33 ≤ ℓ ≤ 1200) gives $\chi^2 = 1200$ for 1168 degrees of freedom, with a probability to exceed (PTE) of 25.1%. The high-ℓ TE spectrum (24 ≤ ℓ ≤ 800) gives $\chi^2 = 815.4$ for 777 degrees of freedom, with a PTE of 16.5%. Therefore, the best-fit ΛCDM model is a good fit to the high-ℓ data.
While the low-ℓ temperature likelihood (2 ≤ ℓ ≤ 32) based on the BR estimator does not give a $\chi^2$ value, we find that the best-fit ΛCDM model is consistent with the distribution of spectra generated by the BR estimator. The low-ℓ polarization likelihood (2 ≤ ℓ ≤ 23) evaluated directly in pixel space gives $\chi^2 = 1321$ for 1170 degrees of freedom with a PTE of 0.13%, which is unusually low. This excess $\chi^2$ can be interpreted as an additional noise component (due to, for example, residual foreground emission) of 0.27 µK per $N_{\rm side} = 8$ pixel (7.3° on a side), which is significantly lower than the average standard deviation of 0.86 ± 0.17 µK. Using differences between frequencies, we confirm that this excess noise does not affect the determination of τ. See section 7.1 of [37] for details.
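The PTE values quoted for the three $\chi^2$ fits can be reproduced approximately without special libraries. The sketch below uses the Wilson-Hilferty normal approximation to the chi-square distribution, which is a swapped-in technique chosen here for convenience and not necessarily how the WMAP team computed the values; the function name `pte_wilson_hilferty` is hypothetical.

```python
import math


def pte_wilson_hilferty(chi2, dof):
    """Approximate P(X >= chi2) for a chi-square variable with `dof` degrees of freedom.

    Wilson-Hilferty: (X / dof)^(1/3) is roughly normal with
    mean 1 - 2/(9 dof) and variance 2/(9 dof).
    """
    spread = 2.0 / (9.0 * dof)
    z = ((chi2 / dof) ** (1.0 / 3.0) - (1.0 - spread)) / math.sqrt(spread)
    # Upper tail of the standard normal distribution.
    return 0.5 * math.erfc(z / math.sqrt(2.0))


# (chi-square, degrees of freedom) as quoted in the text.
fits = {
    "high-l TT": (1200.0, 1168),
    "high-l TE": (815.4, 777),
    "low-l polarization": (1321.0, 1170),
}

pte = {name: pte_wilson_hilferty(c, d) for name, (c, d) in fits.items()}
for name, value in pte.items():
    print(f"{name}: PTE ~ {100.0 * value:.2f}%")
```

For degrees of freedom this large the approximation is expected to be close to the exact chi-square tail probability, so the printed values should agree with the quoted PTEs to within rounding.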
We assume flat priors on the following six parameters: the amplitude of the primordial power spectrum at $k = 0.002~{\rm Mpc}^{-1}$, $\Delta^2_{\mathcal{R}}$; the tilt of the primordial power spectrum, $n_s$; the physical baryon density parameter, $\Omega_b h^2$; the physical CDM density parameter, $\Omega_c h^2$; the cosmological constant density parameter, $\Omega_\Lambda$; and the optical depth of the reionization, τ. Table 1 summarizes the constraints of the cosmological parameters from the CMB data alone. Adding the smaller-scale CMB data from the Atacama Cosmology Telescope (ACT) [55] and the South Pole Telescope (SPT) [56] improves the parameter constraints significantly. In particular, the statistical significance of a deviation of $n_s$ from unity increases from 2.1σ to 3.6σ. With additional cosmological measurements (Baryon Acoustic Oscillation [58][59][60][61] and the local Hubble constant [62]) this improves to $n_s = 0.9608 \pm 0.0080$ (68% CL), a 4.9σ deviation from unity. This is a great achievement in cosmology, providing strong evidence for cosmic inflation.
The parameters found from the Planck 15.5-month data combined with the WMAP low-ℓ polarization data ("Planck+WP") are consistent with the WMAP and WMAP+ACT+SPT parameters to within the quoted error bars. With the Planck+WP combination, $1 - n_s$ is detected at 5.4σ. This is the first time that $n_s < 1$ is detected at > 5σ significance from the CMB data alone.
The WMAP measurements also provide definitive evidence for the existence of non-baryonic dark matter with $\Omega_c/\Omega_b = 5.0 \pm 0.2$ (68% CL). This measurement comes from a combination of the ratio of the heights of odd and even acoustic peaks, giving $\Omega_b h^2$, and the ratio of the heights of the first and other peaks, giving the total matter density contributing to gravitational potential wells. In other words, WMAP measures both the density of matter that interacts with photons and the density of matter that does not. (No matter is left behind.) The difference between the two provides definitive evidence for non-baryonic dark matter.
WMAP has erased lingering doubts about the existence of dark energy. This measurement comes from the peak positions giving the angular diameter distance, $d_A = c\int_0^{z_*} dz/H(z)$, where $z_* = 1091$ is the redshift of the photon decoupling. The Friedmann equation relates the Hubble expansion rate, $H(z)$, to the total energy density in the universe. As the integral is dominated by low redshift contributions, this provides an estimate of the total energy density in the local universe. As we have the complete account of matter density in the universe at any redshift after $z_*$, the difference between the total energy density inferred from $d_A$ and the total matter density gives the energy density of some substance which is not even matter, i.e., dark energy. Strictly speaking, this measurement is possible if we assume flatness of the universe or combine the WMAP data with other cosmological data (such as the local Hubble constant measurements). Alternatively, we can use the effects of gravitational lensing on the CMB to detect dark energy from the CMB data alone [63].
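To illustrate the distance integral $c\int_0^{z_*} dz/H(z)$ and the statement that it is dominated by low redshifts, the following Python sketch evaluates it numerically for a flat universe containing only matter and a cosmological constant. Several inputs are assumptions made for illustration: the Hubble constant of 70 km/s/Mpc is not quoted in this section, the matter fraction is taken as the sum of the atom and cold dark matter fractions (4.6% + 24%), radiation is neglected, and the function names are hypothetical.

```python
import math

C_KM_S = 299792.458        # speed of light in km/s
H0 = 70.0                  # km/s/Mpc; assumed value for illustration
OMEGA_M = 0.046 + 0.24     # atoms + cold dark matter
OMEGA_L = 1.0 - OMEGA_M    # flatness assumed; radiation neglected
Z_STAR = 1091.0            # redshift of photon decoupling


def hubble(z):
    """Hubble rate in km/s/Mpc from the flat-universe Friedmann equation."""
    return H0 * math.sqrt(OMEGA_M * (1.0 + z) ** 3 + OMEGA_L)


def distance_to(z_max, steps=2000):
    """Integrate c dz / H(z) from 0 to z_max (in Mpc) with Simpson's rule.

    The substitution x = ln(1 + z) keeps the grid fine at low redshift.
    `steps` must be even.
    """
    x_max = math.log(1.0 + z_max)
    h = x_max / steps

    def integrand(x):
        z = math.exp(x) - 1.0
        return C_KM_S * math.exp(x) / hubble(z)  # dz = exp(x) dx

    total = integrand(0.0) + integrand(x_max)
    for i in range(1, steps):
        total += (4.0 if i % 2 else 2.0) * integrand(i * h)
    return total * h / 3.0


d_star = distance_to(Z_STAR)
low_z_fraction = distance_to(10.0) / d_star

print(f"distance to z* ~ {d_star:.0f} Mpc")
print(f"fraction accumulated at z < 10 ~ {low_z_fraction:.2f}")
```

Changing `OMEGA_M` while keeping the universe flat shifts the result, which is the sense in which the measured peak positions constrain the energy density that is not matter.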

Parameters beyond flat ΛCDM
The minimal six-parameter model fits all the data we have at the moment (with a possible exception of the tensor-to-scalar ratio, $r$, which the BICEP/Keck Array collaboration claims to have found recently from the B-mode polarization at degree angular scales [64]). As a result, the WMAP data (sometimes in combination with other CMB and non-CMB data) place stringent limits on the parameters beyond the minimal model.
Spatial geometry of the universe is consistent with flat (Euclidean) space. By combining the WMAP+ACT+SPT with the CMB lensing data, we find $\Omega_k = -0.001 \pm 0.012$ (68% CL). When other non-CMB data (Baryon Acoustic Oscillation and the local Hubble constant measurements) are added, we find a stringent limit of $\Omega_k = -0.0027^{+0.0039}_{-0.0038}$ (68% CL), i.e., a 0.4% measurement.
Dark energy is consistent with a cosmological constant. The constraints on the equation of state parameter, w, are consistent with w = −1 typically to within 10% (95% CL), depending on the data combinations.
The cosmic neutrino background affects temperature anisotropy of the CMB in four ways: peak locations, early integrated Sachs-Wolfe effect, anisotropic stress, and enhanced damping tail (see section 4.3.1 of [38] for summary). Using these effects and the WMAP five-year data, we have made the first (indirect) detection of the cosmic neutrino background [29]. The CMB data give the total energy density of neutrinos, $\rho_\nu = (7\pi^2/120) N_{\rm eff} T_\nu^4$, where $N_{\rm eff}$ is the effective number of neutrino species. Assuming the standard thermal history of the universe relating the asymptotic neutrino temperature to the CMB temperature as $T_\nu = (4/11)^{1/3} T_{\rm cmb}$, we use the nine-year data combined with ACT and SPT to find $N_{\rm eff} = 3.89 \pm 0.67$ (68% CL), consistent with the standard value of 3.046 to within 2σ.
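The relations above lend themselves to a quick numerical check. The Python sketch below evaluates the neutrino temperature from $T_\nu = (4/11)^{1/3} T_{\rm cmb}$, the neutrino-to-photon energy density ratio implied by $\rho_\nu = (7\pi^2/120) N_{\rm eff} T_\nu^4$ together with the blackbody photon density $\rho_\gamma = (\pi^2/15) T_{\rm cmb}^4$, and the distance of the measured $N_{\rm eff}$ from the standard value. The CMB temperature of 2.725 K and the photon density formula are inputs not stated in this section, and all variable and function names are illustrative.

```python
T_CMB_K = 2.725          # present-day CMB temperature in kelvin; assumed input
N_EFF_STANDARD = 3.046   # standard value quoted in the text
N_EFF_MEASURED = 3.89    # WMAP nine-year + ACT + SPT
N_EFF_SIGMA = 0.67       # 68% CL uncertainty

# Asymptotic neutrino temperature from the standard thermal history.
temperature_ratio = (4.0 / 11.0) ** (1.0 / 3.0)
t_nu_kelvin = temperature_ratio * T_CMB_K


def neutrino_to_photon_density(n_eff):
    """rho_nu / rho_gamma = (7/8) * N_eff * (T_nu / T_cmb)^4."""
    return (7.0 / 8.0) * n_eff * temperature_ratio ** 4


deviation_sigma = (N_EFF_MEASURED - N_EFF_STANDARD) / N_EFF_SIGMA

print(f"T_nu ~ {t_nu_kelvin:.3f} K")
print(f"rho_nu/rho_gamma (standard N_eff) ~ {neutrino_to_photon_density(N_EFF_STANDARD):.3f}")
print(f"N_eff deviation from standard ~ {deviation_sigma:.2f} sigma")
```

The factor 7/8 is simply the ratio of the two prefactors, $(7\pi^2/120)/(\pi^2/15)$.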
The damping tail of the CMB is sensitive to the primordial helium abundance, $Y_{\rm He}$. The more helium we have, the more electrons are captured by helium nuclei before the decoupling, the fewer electrons are available at the decoupling, and the more diffusion damping results. We have made the first detection of this effect by combining the WMAP seven-year data and the small-scale CMB data [35]. The nine-year data combined with ACT and SPT give $Y_{\rm He} = 0.299 \pm 0.027$ (68% CL), consistent with the standard value of 0.25 to within 2σ. These measurements of $N_{\rm eff}$ and $Y_{\rm He}$ offer a unique test of Big Bang nucleosynthesis [65]. Our measurements are consistent with the standard Big Bang nucleosynthesis calculations.
The WMAP nine-year data alone place a limit on the sum of neutrino masses, $\sum m_\nu < 1.3$ eV (95% CL). Adding the Baryon Acoustic Oscillation and the local Hubble constant measurements, the limit improves to $\sum m_\nu < 0.44$ eV (95% CL).
Single-field inflation models predict that the fluctuations in matter and photons trace each other, obeying the adiabatic relation of $\delta\rho_m/\rho_m = (3/4)\,\delta\rho_\gamma/\rho_\gamma$. We find that this relation holds to better than 7% (95% CL). This limit plays an important role in constraining the parameter space of axion dark matter models [29,35].
The shape of the primordial power spectrum is sensitive to the physics of inflation. The "running spectral index," $dn_s/d\ln k$, is typically predicted to be of order $(n_s - 1)^2 = \mathcal{O}(10^{-3})$. The WMAP nine-year data alone give $dn_s/d\ln k = -0.019 \pm 0.025$, while adding the small-scale CMB data improves the limit to $dn_s/d\ln k = -0.022^{+0.012}_{-0.011}$ (68% CL), consistent with a power-law power spectrum to within 2σ.
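As a quick arithmetic check of the numbers in this paragraph, the Python sketch below evaluates $(n_s - 1)^2$ for the tilt $n_s = 0.9608$ quoted earlier in the text and expresses the measured running in units of its upper 68% error bar. Treating the asymmetric error bar as a Gaussian sigma is a simplification made here for illustration, and the variable names are hypothetical.

```python
N_S = 0.9608             # tilt from WMAP + ACT + SPT + BAO + H0, as quoted in the text
RUNNING = -0.022         # dn_s / d ln k with small-scale CMB data added
RUNNING_ERR_UP = 0.012   # upper 68% CL error, i.e. the side facing zero

# Typical size of the running expected in simple models.
expected_running_order = (N_S - 1.0) ** 2

# Distance of the measured running from zero, in units of the upper error bar.
running_deviation_sigma = abs(RUNNING) / RUNNING_ERR_UP

print(f"(n_s - 1)^2 ~ {expected_running_order:.2e}")
print(f"running differs from zero by ~ {running_deviation_sigma:.1f} sigma")
```

The expected magnitude is an order of magnitude below the current uncertainty on the running, so the data cannot yet test this prediction directly.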
Finally, inflation generates nearly scale-invariant tensor mode metric perturbations (gravitational waves) [66], $h_{ij}$, which also contribute to the observed temperature and polarization anisotropies of the CMB. The amplitude of $h_{ij}$ is parametrized by the "tensor-to-scalar ratio," $r$, defined by $r \equiv 2\langle h_{ij} h_{ij}^{*}\rangle / \langle|\mathcal{R}|^2\rangle$. The WMAP data alone give $r < 0.38$ (95% CL), which improves to $r < 0.17$ by adding the small-scale CMB data. Adding the Baryon Acoustic Oscillation measurements improves the limit further to $r < 0.12$ (95% CL). This limit largely comes from the low multipole temperature data (see section 3.2.3 of [29]), and thus it is sensitive to our assumption of a power-law power spectrum. Including the running index relaxes the limits to $r < 0.43$ (95% CL) with a large running index, $dn_s/d\ln k \approx -0.04$, more or less independent of the data sets used.
WMAP did not have sufficient sensitivity to detect B-mode polarization. Recently the BICEP/Keck Array collaboration claimed to have found B-mode polarization at degree angular scales at 150 GHz. If this signal is cosmological and originates from gravitational waves from inflation, it corresponds to $r \approx 0.1$ to $0.2$ [64]. As the measurement was done only at one frequency (the BICEP1 data at 100 GHz are too noisy to be useful), confirmation of the signal at other frequencies must be made to reject foregrounds. In any case, an independent detection from an independent group is required before interpreting the detected signal as inflationary.

Tests of Gaussianity with Angular Bispectrum
Inflation predicts that primordial fluctuations originate from quantum fluctuations, and the distribution of primordial fluctuations is nearly a Gaussian distribution (see [67] for a review). Sustained inflationary expansion for at least 50 e-folds requires a field driving inflation to be weakly coupled. The wave function of quantum fluctuations of a scalar field with no interaction in the ground state is precisely a Gaussian; thus, a weakly coupled field is nearly a Gaussian field. The linear physics preserves Gaussianity, and thus CMB temperature and polarization anisotropies are predicted to obey Gaussian statistics with high precision. Confirmation of this prediction gives strong evidence for the quantum origin of primordial fluctuations.
When the distribution is not a Gaussian, the PDF is no longer given by equation (17). However, when a departure from Gaussianity, i.e., non-Gaussianity, is small, we may approximate the PDF by "Taylor-expanding" around a Gaussian distribution. Let us do this in harmonic space. We obtain an expanded PDF for the spherical harmonic coefficients [68,69], in which $C_{\ell m,\ell' m'} \equiv \sum_{ij} Y_{\ell m,i}(S+N)_{ij} Y^*_{\ell' m',j}$ is the signal plus noise covariance matrix in harmonic space. (We do not write indices for DAs or years for simplicity.) Here, the expansion is truncated at the three-point function (bispectrum) of $a_{\ell m}$, and thus we have assumed that the connected four-point and higher-order correlation functions are negligible compared to the power spectrum and bispectrum. (This condition is not always satisfied.) Evaluating the derivatives in this expansion gives an explicit formula for the PDF (see footnote 2). This formula is useful, as it tells us how to estimate the angular bispectrum, $\langle a_{\ell_1 m_1} a_{\ell_2 m_2} a_{\ell_3 m_3}\rangle$, optimally from given data by maximizing this PDF. In practice, we usually parametrize the bispectrum using a few parameters (e.g., $f_{\rm NL}$), and estimate those parameters from the data by maximizing the PDF with respect to the parameters.
In the limit that the contribution of the connected four-point function (trispectrum) to the PDF is negligible compared to those of the power spectrum and bispectrum, equation (26) contains all the information on non-Gaussian fluctuations characterized by the covariance matrix, $C_{\ell_1 m_1,\ell_2 m_2} = \langle a^*_{\ell_1 m_1} a_{\ell_2 m_2}\rangle$, and the angular bispectrum, $\langle a_{\ell_1 m_1} a_{\ell_2 m_2} a_{\ell_3 m_3}\rangle$. This approach can be extended straightforwardly to the trispectrum if necessary.
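For orientation, an expansion of the kind described above is commonly written in the following form. This is a sketch in standard notation based on the surrounding definitions, not a verbatim copy of the numbered equations of this paper; the symbol $N_{\rm harm}$ (the number of harmonic coefficients) and the shorthand $(C^{-1}a)_{\ell m} \equiv \sum_{\ell' m'} (C^{-1})_{\ell m,\ell' m'} a_{\ell' m'}$ are assumptions introduced here, and conventions for complex conjugation may differ.

```latex
% Gaussian PDF with a bispectrum correction, truncated at the three-point function.
% N_harm and the shorthand (C^{-1} a)_{lm} are assumed notation.
\begin{align}
P(a) &= \left[ 1 - \frac{1}{6} \sum_{\mathrm{all}\ \ell_i m_i}
        \langle a_{\ell_1 m_1} a_{\ell_2 m_2} a_{\ell_3 m_3} \rangle
        \frac{\partial}{\partial a_{\ell_1 m_1}}
        \frac{\partial}{\partial a_{\ell_2 m_2}}
        \frac{\partial}{\partial a_{\ell_3 m_3}} \right]
        \frac{\exp\left[ -\frac{1}{2} \sum_{\ell m} \sum_{\ell' m'}
        a^*_{\ell m} (C^{-1})_{\ell m,\ell' m'} a_{\ell' m'} \right]}
        {(2\pi)^{N_{\rm harm}/2} |C|^{1/2}} \\
     &= \frac{\exp\left[ -\frac{1}{2} \sum_{\ell m} \sum_{\ell' m'}
        a^*_{\ell m} (C^{-1})_{\ell m,\ell' m'} a_{\ell' m'} \right]}
        {(2\pi)^{N_{\rm harm}/2} |C|^{1/2}}
        \Bigg\{ 1 + \frac{1}{6} \sum_{\mathrm{all}\ \ell_i m_i}
        \langle a_{\ell_1 m_1} a_{\ell_2 m_2} a_{\ell_3 m_3} \rangle \nonumber \\
     &\qquad \times \Big[ (C^{-1}a)_{\ell_1 m_1} (C^{-1}a)_{\ell_2 m_2} (C^{-1}a)_{\ell_3 m_3}
        - 3\, (C^{-1})_{\ell_1 m_1,\ell_2 m_2} (C^{-1}a)_{\ell_3 m_3} \Big] \Bigg\}
\end{align}
```

The second line follows from differentiating the Gaussian three times and using the symmetry of the bispectrum under permutations of its indices, which collapses the three linear terms into the single term with the factor of 3.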
As the bispectrum has three angular wavenumbers, $\ell_1$, $\ell_2$ and $\ell_3$, it can form triangles with various shapes. Among all the shapes, the so-called "local-form bispectrum," parametrized by a non-linear parameter $f_{\rm NL}$ [71], carries a special significance, as detection of a large local-form bispectrum would rule out all inflation models based on a single energy component with a Bunch-Davies initial vacuum state and an attractor solution (see [72] for the latest discussion on this theorem). This bispectrum has the largest amplitude in the "squeezed configurations" in which one of the wavenumbers, say $\ell_3$, is much smaller than the other two, i.e., $\ell_3 \ll \ell_1 \approx \ell_2$ [73]. Detailed descriptions of what this bispectrum is and what the other shapes are, as well as of how to measure them, can be found in [74].
Using the foreground-reduced WMAP nine-year temperature data at 61 and 94 GHz with the KQ75 mask, we find $f_{\rm NL} = 37 \pm 20$ (68% CL), which is consistent with zero to within 2σ; thus, the measurement agrees with the basic prediction of single-field inflation models with a Bunch-Davies initial vacuum state and an attractor solution. Planck greatly improves this limit, finding $f_{\rm NL} = 2.7 \pm 5.8$ (68% CL) [75].
One way to generate the local-form bispectrum is to write the primordial curvature perturbation as $\mathcal{R}(\mathbf{x}) = \mathcal{R}_L(\mathbf{x}) + \frac{3}{5} f_{\rm NL} \mathcal{R}_L^2(\mathbf{x})$. This form is called the "local form" because both sides are evaluated at the same spatial location, $\mathbf{x}$. Here, $\mathcal{R}_L$ is a Gaussian random field, and the curvature perturbation is defined such that the linear Sachs-Wolfe effect gives $\delta T/T = -\mathcal{R}_L/5$. Using this form and the fact that the variance of $\mathcal{R}$ is $2 \times 10^{-9}$, we find that the 95% upper bound from Planck, $f_{\rm NL} < 14$, implies that the observed CMB is Gaussian to the precision of 0.04% or better. This is a remarkable degree of Gaussianity, which provides strong evidence that the observed CMB fluctuations originate from quantum fluctuations generated during single-field inflation.
Footnote 2: Babich [70] derived this formula for $C_{\ell m,\ell' m'} = C_\ell \delta_{\ell\ell'} \delta_{mm'}$.
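The quoted 0.04% figure can be checked with a few lines of arithmetic. The Python sketch below assumes the local form $\mathcal{R} = \mathcal{R}_L + (3/5) f_{\rm NL} \mathcal{R}_L^2$ and estimates the fractional size of the quadratic term as $(3/5) f_{\rm NL} \sqrt{\langle\mathcal{R}^2\rangle}$. The WMAP bound used for comparison, $37 + 2 \times 20 = 77$, is a rough 2σ figure formed here from the quoted measurement and is not a limit stated in the text; the function name is hypothetical.

```python
import math

VARIANCE_R = 2e-9  # variance of the curvature perturbation, as quoted in the text


def fractional_non_gaussianity(f_nl):
    """Size of the quadratic term relative to the Gaussian term: (3/5) |f_NL| * rms(R)."""
    return 0.6 * abs(f_nl) * math.sqrt(VARIANCE_R)


planck_bound = fractional_non_gaussianity(14.0)            # Planck 95% upper bound on f_NL
wmap_bound = fractional_non_gaussianity(37.0 + 2.0 * 20.0)  # rough WMAP 2-sigma bound (assumed)

print(f"Planck: Gaussian to ~ {100.0 * planck_bound:.3f}%")
print(f"WMAP:   Gaussian to ~ {100.0 * wmap_bound:.2f}%")
```

Because the estimate scales linearly with $f_{\rm NL}$, the improvement from WMAP to Planck in this measure simply tracks the tightening of the $f_{\rm NL}$ bound.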

Implications for inflation
Models of inflation [76][77][78][79][80] make specific, testable predictions. The simplest models based upon a single energy component (scalar field), which slowly rolls down on its potential and drives a sustained quasi-exponential expansion for at least 50 e-folds, predict that the observable universe is homogeneous and isotropic with flat geometry, and is filled with small fluctuations which are precisely adiabatic and nearly Gaussian (before fluctuations become non-linear). Both scalar and tensor fluctuations with various wavelengths are generated during inflation. The wavelengths of these fluctuations can exceed the horizon size at the decoupling epoch, and the amplitude of these fluctuations weakly depends on wavelengths.
All of these predictions fit the WMAP data remarkably well: flatness is measured with 0.4% precision (from WMAP combined with the Baryon Acoustic Oscillation and the local Hubble constant measurements); the adiabatic condition holds to better than 7% precision; a deviation from Gaussian fluctuations is restricted to be less than 0.2% (and 0.04% with the Planck 2013 data); and the presence of super-horizon fluctuations at decoupling is decisively detected in the TE cross power spectrum at ℓ < 150. The WMAP data combined with the Baryon Acoustic Oscillation and the local Hubble constant measurements find convincing evidence for the scale dependence of the scalar initial power spectrum with 4.9σ significance, with the best-fit value in agreement with the first prediction made in [81].
While WMAP did not find signatures of tensor fluctuations, the upper bound on the tensor-to-scalar ratio inferred from the temperature data is consistent with many single-field inflation models. Figure 9 compares the limits on $n_s$ and $r$ with a few representative single-field inflation models.

Conclusion
The nine years of observations of the WMAP satellite have taught us many things. The current universe is 13.77 billion years old, and consists of 4.6% atoms, 24% cold dark matter, and 71% dark energy [37,38]. The nature of dark energy is consistent with that of a cosmological constant. The spatial geometry of the universe is consistent with Euclidean geometry. The universe is filled with neutrinos, whose abundance is consistent with the standard model of particle physics. The mass of neutrinos is much less than 1 eV.
The measured properties of primordial fluctuations such as adiabaticity, Gaussianity, and near scale invariance all point toward a remarkable scenario: the observed fluctuations originate from quantum fluctuations generated during inflation driven by a single energy component. WMAP offered a number of stringent tests of the simplest inflation scenarios: (1) flat universe, (2) adiabatic fluctuations, (3) super-horizon fluctuations, (4) nearly, but not exactly, scale-invariant initial power spectrum, and (5) Gaussian fluctuations. The simplest scenarios passed all of these tests. The Planck 2013 data have confirmed all of these findings with greater precision.
Yet, neither the WMAP nor the Planck 2013 data detect the signature of primordial gravitational waves from inflation in the CMB. Detecting and characterizing the B-mode polarization of the CMB is the next milestone in cosmology. While the BICEP/Keck Array collaboration claims to have found the B-mode polarization from inflationary gravitational waves at 150 GHz, confirmation of the signal at other frequencies and with an independent experiment must be made before we claim a victory in observing all of the inflation predictions.
Fig. 9 Two-dimensional joint marginalized constraints (68% and 95% CL) on the primordial tilt, $n_s$, and the tensor-to-scalar ratio, $r$. The red contours show the constraint from the WMAP nine-year data combined with the small-scale CMB temperature data (ACT and SPT), the Baryon Acoustic Oscillation data, and the local Hubble constant. The blue contours show the constraint from the Planck 15.5-month temperature data combined with the WMAP nine-year polarization data and the Baryon Acoustic Oscillation data. The symbols show the predictions from single-field inflation with monomial potentials, $V(\phi) \propto \phi^n$ [82], with $n = 4$ (black), 2 (white), and 1 (light grey), and with an $R^2$ term in the gravitational action (dark grey) [76].