## Abstract

In this paper, we extend a previous work where we presented a method based on the *N*-point probability density function (pdf) to study the Gaussianity of the cosmic microwave background (CMB). We explore a local non-linear perturbative model up to third order as a general characterization of the CMB anisotropies. We focus our analysis in large-scale anisotropies (θ > 1°). At these angular scales (the Sachs–Wolfe regime), the non-Gaussian description proposed in this work defaults (under certain conditions) to an approximated local form of the weak non-linear coupling inflationary model. In particular, the *quadratic* and *cubic* terms are governed by the non-linear coupling parameters *f*_{NL} and *g*_{NL}, respectively. The extension proposed in this paper allows us to directly constrain these non-linear parameters. Applying the proposed methodology to *Wilkinson Microwave Anisotropy Probe* (*WMAP*) 5-year data, we obtain −5.6 × 10^{5} < *g*_{NL} < 6.4 × 10^{5}, at 95 per cent confidence level. This result is in agreement with previous findings obtained for equivalent non-Gaussian models and with different non-Gaussian estimators, although this is the first direct constraint on *g*_{NL} from CMB data. A model selection test is performed, indicating that a Gaussian model (i.e. *f*_{NL}≡ 0 and *g*_{NL}≡ 0) is preferred to the non-Gaussian scenario. When comparing different non-Gaussian models, we observe that a pure *f*_{NL} model (i.e. *g*_{NL}≡ 0) is the most favoured case and that a pure *g*_{NL} model (i.e. *f*_{NL}≡ 0) is more likely than a general non-Gaussian scenario (i.e. *f*_{NL}≠ 0 and *g*_{NL}≠ 0). Finally, we have analysed the *WMAP* data in two independent hemispheres, in particular the ones defined by the dipolar pattern found by Hoftuft et al. We show that, whereas the *g*_{NL} value is compatible between both hemispheres, it is not the case for *f*_{NL} (with a *p*-value of ≈0.04). However, if, as suggested by Hoftuft et al., anisotropy of the data is assumed, the distance between the likelihood distributions for each hemisphere is larger than expected from Gaussian and anisotropic simulations, not only for *f*_{NL} but also for *g*_{NL} (with a *p*-value of ≈0.001 in the case of this latter parameter). This result is extra evidence for the CMB asymmetries previously reported in *WMAP* data.

## 1 INTRODUCTION

Current observations of the cosmic microwave background (CMB) temperature and polarization fluctuations, in addition to other astronomical data sets (e.g. Gupta et al. 2009; Komatsu et al. 2009, see Barreiro 2009 for a recent review), provide an overall picture for the origin, evolution and matter and energy content of the universe, which is usually referred to as the *standard cosmological model*. In this context, we believe the universe to be highly homogeneous and isotropic, in expansion, well described by a Friedmman–Robertson–Walker metric and with a trivial topology. The space geometry is very close to flat, and it is filled with cold dark matter and dark energy (in the form of a cosmological constant, Λ), in addition to baryonic matter and electromagnetic radiation. Large-scale structure (LSS) is assumed to be formed by the gravitational collapse of an initially smooth distribution of adiabatic matter fluctuations, which were seeded by initial Gaussian quantum fluctuations generated in a very early inflationary stage of the universe evolution.

It is interesting to mention that, besides the success of current high-quality CMB data [in particular, the data provided by the *Wilkinson Microwave Anisotropy Probe* (*WMAP*) satellite; Hinshaw et al. 2009] in constraining the cosmological parameters with good accuracy and in showing the high degree of homogeneity and isotropy of the Universe (as predicted by the standard inflation scenario; see e.g. Liddle & Lyth 2000), it has been, precisely, through the analysis of this very same data that the CMB community has been allowed to probe fundamental principles and assumptions of the *standard cosmological model*. Most notably, the application of sophisticated statistical analysis to CMB data might help us to understand whether the temperature fluctuations of the primordial radiation are compatible with the fundamental isotropic and Gaussian standard predictions from the *inflationary phase*.

Indeed, the interest of the cosmology community in this field has experienced a significant growth, since several analyses of the *WMAP* data have reported some hints for departure from isotropy and Gaussianity of the CMB temperature distribution. The literature on the subject is very large, and is still growing, which makes really difficult to provide a complete and updated list of publications. We refer to our previous work (Vielva & Sanz 2009) for a (almost) complete review.

Among the previously mentioned analyses, those related to the study of non-standard inflationary models have attracted a larger attention. For instance, the non-linear coupling parameter *f*_{NL} that describes the non-linear evolution of the inflationary potential (see e.g. Bartolo et al. 2004, and references therein) has been constrained by several groups, from the analysis of the *WMAP* data using the angular bispectrum (Komatsu et al. 2003; Creminelli, Senatore & Zaldarriaga 2007; Spergel et al. 2007; Yadav & Wandelt 2008; Komatsu et al. 2009; Smith, Senatore & Zaldarriaga 2009), applying the Minkowski functionals (Komatsu et al. 2003; Gott et al. 2007; Spergel et al. 2007; Hikage et al. 2008; Komatsu et al. 2009), using different statistics based on wavelets (Mukherjee & Wang 2004; Cabella et al. 2005; Curto et al. 2009a; Curto, Martínez-González & Barreiro 2009b; Pietrobon et al. 2009; Rudjord et al. 2009) and exploring the *N*-point probability density function (*N*-pdf) (Vielva & Sanz 2009). Besides marignal detections of *f*_{NL} > 0 (with a probability of around 95 per cent; Yadav & Wandelt 2008; Rudjord et al. 2009), there is a general consensus on the *WMAP* compatibility with the predictions made by the standard inflationary scenario at least at 95 per cent confidence level (CL). The current best limits obtained from the CMB data are −4 < *f*_{NL} < 80 at 95 per cent CL by Smith et al. (2009). In addition, very recently, promising constraints coming out from the analysis of LSS have been reported: −29 < *f*_{NL} < 70 at 95 per cent CL (Slozar et al. 2008).

The aim of this paper is to extend our previous work (Vielva & Sanz 2009), where the full *N*-pdf of a non-Gaussian model that describes the CMB anisotropies as a local (pixel-by-pixel) non-linear expansion of the temperature fluctuations (up to second order) was derived. For this model – that, at large scales, can be considered as an approximation to the weak non-linear coupling scenario – we are able to build the exact likelihood on pixel space. Working in pixel space allows one to include easily non-ideal observational conditions, like incomplete sky coverage and anisotropic noise. The extension of the present work is to account for higher order moments in the expansion, in particular we are able to directly obtain constraints on *g*_{NL}, that is the coupling parameter governing the cubic term of the weak non-linear expansion.

As far as we are aware, direct constraints on *g*_{NL} have been made available only very recently (Desjacques & Seljak 2010) and are obtained from LSS analyses: −3.5 × 10^{5} < *g*_{NL} < 8.2 × 10^{5} at 95 per cent CL. This constraint was obtained for the specific case in which the coupling parameter governing the quadratic term (*f*_{NL}) is negligible (i.e. *f*_{NL}≡ 0, which is the situation required for some curvaton inflationary models; e.g. Sasaki, Valiviita & Wands 2006; Enqvist & Takahashi 2008; Huang 2009). We present in this work the first direct measurement of *g*_{NL} obtained from CMB data. In addition to study the particular case of *f*_{NL}≡ 0, we also consider a more general case in which a joint estimation of *f*_{NL} and *g*_{NL} is performed. Finally, and justified by recent findings (e.g. Hoftuft et al. 2009), we compute the *N*-pdf in two different hemispheres and derive from it constraints on *f*_{NL} and *g*_{NL} for this hemispherical division of the celestial sphere.

The paper is organized as follows. In Section 2, we describe the physical model based on the local expansion of the CMB fluctuations and derive the full posterior probability, recalling how it defaults to the case already addressed in Vielva & Sanz (2009). In Section 3, we check the methodology against *WMAP*-like simulations. Results on *WMAP* 5-year data are presented in Section 4. Conclusions are given in Section 5. Finally, in Appendix A we provide a detailed computation of the full *N*-pdf.

## 2 THE NON-GAUSSIAN MODEL

Although current CMB measurements are well described by random Gaussian anisotropies (as it is predicted by the standard inflationary scenario), observations also allow for small departures from Gaussianity, which could indicate the presence of an underlying physical process generated in non-standard models.

As we did in Vielva & Sanz (2009), we adopt a parametric non-Gaussian model for the CMB anisotropies, which accounts for a small and local (i.e. point-to-point) perturbation of the CMB temperature fluctuations, around its intrinsic Gaussian distribution:

The*linear term*[(Δ

*T*)

_{i}_{G}] is given by a Gaussian

*N*-point probability density function (

*N*-pdf) that is easily described in terms of the standard inflationary model. The second and third terms on the right-hand side are the

*quadratic*and the

*cubic*perturbation terms, respectively, and they are governed by the

*a*and

*b*parameters.

The subindex *i* refers to a given direction in the sky that, in practice, is described in terms of a certain pixelization of the sphere. The operator 〈·〉 indicates averaging over all the pixels defining the sky coverage. We have not considered explicitly an *instrumental noise*-like term, since, for the particular case that we intend to explore (i.e. large-scale CMB data), its contribution to the measured signal (for experiments like the *WMAP* or *Planck*) is negligible. Precisely at the large-scale regime the term Δ*T _{i}* is mostly dominated by the Sachs–Wolfe contribution to the CMB fluctuations, and can be related to the primordial gravitational potential Φ (e.g. Komatsu & Spergel 2001) by

*T*

_{0}is the CMB temperature. Small departures from Gaussianity of the Φ potential are usually described via the weak non-linear coupling model:

Taking into account equations (1)–(3), and always considering the specific case for scales larger than the horizon scale at the recombination time (i.e. above the degree scale), it is trivial to establish the following relations:

At this point, it is worth mentioning that the model in equation (1) does not pretend to incorporate all the gravitational and non-gravitational effects, due to the evolution of the initial quadratic potential model, but rather allows for a useful parametrization for describing a small departure from Gaussianity. The relationships in equation (4) have to be understood just as an asymptotic equivalence for large scales.Let us simplify the notation by transforming the Gaussian term (Δ*T _{i}*)

_{G}into a zero mean and unit variance random variable φ

_{i}. It is easy to show that equation (1) transforms into

^{2}≡〈(Δ

*T*)

_{i}^{2}

_{G}〉 is the standard deviation of the CMB fluctuations. Trivially, the normalized Gaussian variable φ satisfies where

*n*≥ 0 and

*m*> 0 are integer numbers and ξ

_{ij}represents the normalized correlation between pixels

*i*and

*j*. Obviously, the

*N*-pdf of the

**φ**={φ

_{1}, φ

_{2}, …, φ

_{N}} random field (where

*N*refers to the number of pixels on the sphere that are actually observed) is given by a multivariate Gaussian distribution: where

**ξ**denotes the correlation matrix and operator ·

^{t}denotes standard matrix/vector transpose.

As it was the case in Vielva & Sanz (2009), the objective is to obtain the *N*-pdf associated with the non-Gaussian ** x**={

*x*

_{1},

*x*

_{2}, …,

*x*} field, as a function of the non-linear coupling parameters (or, equivalently, the

_{N}*a*and

*b*coefficients):

*Z*is the determinant of the Jacobian for the transformation. Because the proposed model is local (i.e. point-to-point), the Jacobian matrix is diagonal and therefore

*Z*is given by

Both, equations (9) and (10), require the inversion of equation (5), i.e. to express φ_{i} as a function of *x _{i}*. After some algebra, it can be proved that

*p*(

**|**

*x**a*,

*b*), it is equivalent, but more convenient, to work with the log-likelihood : A detailed computation of is given in Appendix A. Let us just recall here its final expression: where

*N*is the number of data points and

*F*,

*G*,

*H*and

*I*are functions of

*a*and

*b*(see A13). The desired

*N*-pdf,

*p*(

**|**

*x**a*,

*b*) is obtained by the inversion of equation (13) and taking into account that

*p*(

**|0) ≡**

*x**p*(φ=

*x*), i.e. the known Gaussian

*N*-pdf in equation (8).

## 3 APPLICATION TO *WMAP* SIMULATIONS

In this section, we aim to investigate the performance of the parameters estimation from the *N*-pdf derived in the previous section. We explore different non-Gaussian scenarios; in particular, we study three particular cases of special interest.

Case I (

*a*≠ 0,*b*= 0). This scenario would correspond, for example, to the case for the slow-roll standard inflationCase II (

*a*= 0,*b*≠ 0). This scenario would correspond to the particular situation for some curvaton models.Case III (

*a*≠ 0,*b*≠ 0). It is a generic case, not representing any specific inflationary model, but rather a general scenario.

In particular, we will study how the determination of the parameters governing the non-Gaussian terms is performed, and what is the impact when one is exploring different configurations. In the next sections, we will focus, first, on the case in which a slow-roll standard like scenario is assumed (i.e. we only try to adjust for the *quadratic* term, assuming the *cubic* one is negligible), whatever the data are actually a pure *quadratic* or *cubic* model or a general non-Gaussian scenario. Secondly, we will follow a similar analysis, but assuming the estimation of a pure *cubic* term. Finally, we will address the case for a joint estimation of both (*quadratic* and *cubic*) terms. In the following, we will refer all our results in terms of the non-linear coupling parameters (*f*_{NL} and *g*_{NL}), rather than to the *a* and *b* coefficients.

In order to carry out this analysis, we have generated Gaussian CMB simulations coherent with the model induced from the *WMAP* 5-year data at nside= 32 healpix (Górski et al. 2005) resolution (≈2°).

The procedure to generate a CMB Gaussian simulation –(Δ*T*)_{G} in equation (1)– is as follows. First, we simulate *WMAP* observations for the Q1, Q2, V1, V2, W1, W2, W3, W4 difference assemblies at nside= 512 healpix resolution. The *C*_{ℓ} obtained with the cosmological parameters provided by the best fit to *WMAP* data alone (table 6 in Hinshaw et al. 2009) are assumed.

Secondly, a single co-added CMB map is computed afterwards through a noise-weighted linear combination of the eight maps (from Q1 to W4). The weights used in this linear combination are proportional to the inverse mean noise variance provided by the *WMAP* team. They are independent of the position (i.e. they are uniform across the sky for a given difference assembly) and forced to be normalized to unity. Note that we have not added Gaussian white noise to the different difference assembly maps, since we have already checked that instrumental noise plays a negligible role at the angular resolution in which we are interested (≈2°; see Vielva & Sanz 2009, for details).

Thirdly, the co-added map at nside= 512 is degraded down to the final resolution of nside= 32, and a mask representing a sky coverage like the one allowed by the *WMAP* KQ75 mask (Gold et al. 2009) is adopted. At nside= 32, the mask keeps around 69 per cent of the sky. The mask is given in Fig. 1. Let us remark that observational constraints from an incomplete sky coverage can be easily taken into account by the local non-Gaussian model proposed in this work, since it is naturally defined in pixel space. This is not the case for other common estimators like the bispectrum, where the presence of an incomplete sky coverage is usually translated into a loss of efficiency.

We have generated 500 000 simulations of (Δ*T*)_{G}, computed as described above, to estimate the correlation matrix **ξ** accounting for the Gaussian CMB cross-correlations. We have checked that this large number of simulations is enough to obtain an accurate description of the CMB Gaussian fluctuations.

In addition, we have generated another set of 1000 simulations. These are required to carry out the statistical analysis to check the performance of the parameter estimation. Each one of these 1000(Δ*T*)_{G} simulations is transformed into * x* (following equations 1 and 6) to study the response of the statistical tools as a function of the non-linear parameters defining the local non-Gaussian model proposed in equation (5).

Finally, let us remark that hereinafter the likelihood maximization is simply performed by exploring a grid of values in the parameter space of the non-linear coupling parameters. The step used in the grid is small enough to guarantee a good estimation of both the likelihood peak and tails.

### 3.1 The recovery of *f*_{NL} in the presence of a *cubic* term

The results obtained from the 1000 simulations are given in Fig. 2. We have explored 16 different non-Gaussian models, accounting for all the possible combinations obtained with simulated values of 0, 3 × 10^{5}, 5 × 10^{5} and 10^{6} and values of 0, 200, 400 and 600. For each panel, we present the histogram of the maximum-likelihood estimation of the non-linear coupling *quadratic* parameter: . Note that we refer to a simulated value of a given non-linear coupling parameter (*x*_{NL}), as , whereas its estimation via the maximum likelihood is denoted as . Vertical dashed lines in each panel indicate the value of the maximum-likelihood estimation for the parameter.

As it can be noted from the figure, when the simulated data satisfy the condition of the particular explored model (i.e. , first column), the *f*_{NL} is accurately and efficiently estimated, at least for values of . Actually, this is a result that we already obtained in Vielva & Sanz (2009), which indicates that for the perturbative model is no longer valid.

However, when the simulated non-Gaussian maps also contain a significant contribution from a *cubic* term, the bias in for the cases with . It is interesting to note that, even if the simulated is large (for instance ), we do not see any significant bias in for simulated values lower than 200.

Summarizing, we can infer that for non-Gaussian scenarios with and no significant bias on the estimation of a pure *quadratic* term is found. It is worth mentioning that these range of values are in agreement with predictions from most of the physically motivated non-Gaussian inflationary models. Note that, in general, even for the cases in which a bias is observed, the efficiency in the determination of *f*_{NL} (somehow related to the width of the histograms) is almost unaltered.

### 3.2 The recovery of *g*_{NL} in the presence of a *quadratic* term

As for the previous case, a graphical representation of the results obtained from the 1000 simulations is given in Fig. 3. We have explored the same 16 different non-Gaussian models already described above. As it can be noted from the figure, when the simulated data correspond to the explored model (i.e. , first row) the *g*_{NL} parameter is reasonably estimated, at least for simulated .

However, when the simulated non-Gaussian maps also contain a significant contribution from the *quadratic* term, a bias in the determination of the *g*_{NL} parameter stars is to be notorious for lower values of the simulated coefficient. In particular, the plots of the first column (i.e. ) show a clear bias on . This indicates that, when the analysed case corresponds to a pure *quadratic* scenario, and a pure *cubic* model is assumed, the *g*_{NL} estimator is sensitive to the *quadratic* non-Gaussianity and, somehow, it absorbs the non-Gaussianity in the form of a fake *cubic* term. In particular, an input value of is determined as a pure . Note that this was not the situation for the previous case, where the *f*_{NL} estimation was not sensitive to the presence of a pure *cubic* model (at least for reasonable values of ). This is an expected result, since any skewed distribution would imply the presence of a certain degree of kurtosis, whereas the opposite is not necessary true.

### 3.3 The general case: the joint recovery of *f*_{NL} and *g*_{NL}

Finally, we have also explored the case of a joint estimation of the *quadratic* and *cubic* terms. The results obtained from the 1000 simulations are given in Fig. 4. As for the previous cases, we have explored the same 16 different non-Gaussian models already described above. The plots represent the contours of the 2D histograms obtained for the pair of the maximum-likelihood estimation. Vertical and horizontal dashed lines indicate the simulated and values, respectively.

As it can be noted from the figure, only for the regime and , we obtain an accurate and efficient estimation of the non-linear coupling parameters. As it was reported above, this regime corresponds to the boundaries obtained from the pure *f*_{NL} case.

It is interesting to note the presence of very large biases for cases outside the previous range. In particular, estimations tend to move towards a region of the parameter space of larger values of both, *f*_{NL} and *g*_{NL}. Only a secondary peak in the 2D histogram corresponds to the simulated pair of values.

This result, combined with the previous ones, indicates that the non-Gaussian model proposed in equation (1) is only valid up to values of the *quadratic* and *cubic* terms of around 1 and 0.05 per cent, respectively.

## 4 APPLICATION TO *WMAP* 5-YEAR DATA

We have studied the compatibility of the *WMAP* 5-year data with a non-Gaussian model as the one described in equation (1). In particular, we have analysed the co-added CMB map generated from the global noise-weighted linear combination of the reduced foreground maps for the Q1, Q2, V1, V2, W1, W2, W3 and W4 difference assemblies (see Gold et al. 2009, for details). The weights are proportional to the inverse average noise variance across the sky, and are normalized to unity. This linear combination is made at nside= 512 healpix resolution, being degraded afterwards down to nside= 32.

Under these circumstances, we are in the same condition as for the analysis performed on the simulations described in Section 3. Therefore, the theoretical multinormal covariance of the CMB temperature fluctuations (**ξ**) is the one already computed with the 500 000 simulations (see the previous section).

Two different analyses were performed. The first accounts for an all-sky study (except for the sky regions covered by the Galactic mask described in the previous section), where constraints on the non-linear coupling parameters from different scenarios are presented. We will present as well results derived from a model selection approach, where we investigate which are the models that are more favoured by the data. The second analysis explores the *WMAP* data compatibility with the local non-Gaussian model in two different hemispheres. In particular, we have studied independently the two hemispheres related to the dipolar pattern described in Hoftuft et al. (2009).

### 4.1 All-sky analysis

We have computed the full *N*-pdf in equation (9), for three different scenarios: a non-Gaussian model with a pure *quadratic* term (i.e. *g*_{NL}≡ 0); another case with a pure *cubic* term (i.e. *f*_{NL}≡ 0) and a general non-Gaussian model (i.e. *f*_{NL}≠ 0 and *g*_{NL}≠ 0),

Results are given in Fig. 5. The left-hand panel shows the likelihood obtained for the first case: . Actually, this result is the one that we already obtained in our previous work (Vielva & Sanz 2009). The maximum-likelihood estimation for the *quadratic* factor is .1 The constraint on the non-Gaussian parameter is −154 < *f*_{NL} < 94 at 95 per cent.

The middle panel in Fig. 5 presents the likelihood obtained from a model with a pure *cubic* term: . The maximum-likelihood estimation for the *quadratic* factor is . The constraint on the parameter is −5.6 × 10^{5} < *g*_{NL} < 6.4 × 10^{5} at 95 per cent. This result is compatible with a previous finding obtained from the analysis of LSS data (Desjacques & Seljak 2010). The result reported in this work is, as far as we know, the first direct constraint of *g*_{NL} from CMB data alone.

The right-hand panel in Fig. 5 shows the contour levels at the 68, 95 and 99 per cent CL, for the likelihood obtained from an analysis of a general *quadratic* and *cubic* model: *p*(** x**|

*f*

_{NL},

*g*

_{NL}). Note that the maximum-likelihood estimation for the

*f*

_{NL}and

*g*

_{NL}parameters are similar to those obtained from the previous cases (where the pure models were investigated). Even more, the marginalized distributions for the two parameters are extremely similar to the likelihood distributions discussed previously, and therefore the constraints on the non-linear coupling parameters are virtually the same.

Finally, we want to comment a few words about two issues: the incorporation of possible a priori information related to the parameters defining the non-Gaussian model and the application of model selection criteria (or hypothesis tests) to discriminate among the Gaussian model and different non-Gaussian models.

As we largely discussed in our previous work (Vielva & Sanz 2009), one of the major advantages of computing the full *N*-pdf on the non-Gaussian model is that, in addition to provide a maximum-likelihood estimation for the non-linear coupling parameters, we have a full description of the statistical properties of the problem. More, in particular, if we could have any physical (or empirical) motivated prior for the *f*_{NL} and *g*_{NL} parameters, it could be used together with the likelihood function to perform a full Bayesian parameter estimation. This aspect has not been considered in this work, precisely because such a well-motivated prior is lacking. Actually, a possible and trivial a priori information that could be used in this specific case would be to limit the range of values that can be taken by *f*_{NL} and *g*_{NL}, such as the non-Gaussian model is, indeed, a local perturbation of a Gaussian field (i.e. the typical values that we discussed in Section 3). However, these priors do not seem to be quite useful since, first, there is not any clear evidence to choose a prior form different from a uniform value over the parameters range and, secondly, the limits of these ranges are somehow arbitrary. These kind of priors do not provide any further knowledge of the Bayesian parameter determination: as it is well known, such estimation would be totally driven by the likelihood itself, since it is fully defined within any reasonable a priori ranges.

The possibility of performing a model selection approach is an extra advantage of dealing with the full *N*-pdf. Of course, under the presence of a hypothetical well-motivated prior on the non-linear coupling parameters, model selection could be done in terms of the Bayesian evidence or the ratio of posterior probabilities (see Vielva & Sanz 2009, for a specific discussion on this application). However, the lack of such a prior (as we discussed above) makes the application of a full Bayesian model selection framework significantly less powerful than in other situations: as it is very well known, the use of uniform priors for all the parameters would provide very little information, since the results would be very much dependent on the size of the parameters range. Despite this, we can still make a worthy use of the likelihood to perform model selection. In particular, some asymptotic model selection criteria, like the Akaike Information Criteria (AIC; Akaike 1973) and the Bayesian Information Criteria (BIC; Schwarz 1978), can be applied. Both methods provide a ranging index for competitive hypotheses, where the most likely one is indicated by the lowest value of the index. The AIC and BIC indices depend on the maximum value of the log likelihood ():

*p*is the number of parameters that determines the hypothesis or model

*H*. We have applied these two asymptotic model selection criteria to the

_{i}*WMAP*5-year data. Defining the Gaussian model as

*H*

_{0}, the pure

*quadratic*model as

*H*

_{1}, the pure

*cubic*model as

*H*

_{2}and the general non-Gaussian model as

*H*

_{3}, and considering the maximum value for the log likelihoods obtained for all these cases, we obtain AIC (

*H*

_{0}) < AIC (

*H*

_{1}) < AIC (

*H*

_{2}) < AIC (

*H*

_{3}) and BIC (

*H*

_{0}) < BIC (

*H*

_{1}) < BIC (

*H*

_{2}) < BIC (

*H*

_{3}). That is, the most likely model is the Gaussian one (what is in agreement with the results obtained from the parameter determination, since

*f*

_{NL}≡ 0 and

*g*

_{NL}≡ 0 cannot be rejected at any meaningful CL). Among the non-Gaussian models, a pure

*f*

_{NL}model is the most likely scenario, being a joint

*f*

_{NL},

*g*

_{NL}model the most disfavoured by the

*WMAP*5-year data.

### 4.2 Hemispherical analysis

Among the large number of the *WMAP anomalies* that have been reported in the literature, an anisotropy manifested in the form of a hemispherical asymmetry is one of the topics that has been more largely studied (e.g. Eriksen et al. 2004a; Hansen et al. 2004a; Hansen, Banday & Górski 2004b). Most of the works related to this issue have reported that such asymmetry is more marked for a north–south hemispherical division relatively close to the Northern and Southern ecliptic hemispheres.

In a recent work, Hoftuft et al. (2009) reported that large-scale *WMAP* data were compatible with such kind of anisotropy, in the form of a dipolar modulation defined by a preferred direction pointing towards the Galactic coordinates (*l*, *b*) = (224°, −22°).

Motivated by these results, we have repeated the analysis described in the previous section, but addressing independently the two hemispheres associated with the dipolar pattern found by Hoftuft et al. (2009). Hereinafter, we will refer to the Northern hemisphere of this dipolar pattern as the half of the celestial sphere whose pole is closer to the Northern ecliptic pole, and, equivalently, we will indicate as the Southern hemisphere the complementary half of the sky. The corresponding areas of the sky that are analysed are shown in Fig. 6. The left- and right-hand panels show the allowed sky regions, when the Northern and Southern hemispheres of the dipolar pattern are independently addressed, respectively. Note that the regions not allowed by the Galactic mask are also excluded from the analysis. The portion of the sky that is analysed is around 34 per cent for the Northern hemisphere and around 35 per cent for the Southern half.

As we discussed in the previous section, the constraints of the *f*_{NL} and *g*_{NL} parameters obtained from the analysis of pure *quadratic* and *cubic* non-Gaussian models do not differ significantly from those obtained from a general analysis of a joint scenario. This is expected for a regime of relatively low values of the non-linear coupling parameters. For that reason, in the present study we will only consider the following two cases: a pure *quadratic* (i.e. *g*_{NL}≡ 0) and a pure *cubic* (i.e. *f*_{NL}≡ 0) models. Results are given in Fig. 7. We present the likelihood probabilities for the first case –*p*(** x**|

*f*

_{NL})– in the left-hand plot and the one corresponding to the second case –

*p*(

**|**

*x**g*

_{NL})– in the right-hand panel. Each plot shows the results for the Northern (dashed lines), and the Southern (dot–dashed lines) hemispheres. The maximum-likelihood estimation for the non-linear coupling parameters is given as vertical lines.

The right-hand panel shows that both hemispheres have a similar likelihood [*p*(** x**|

*g*

_{NL})] for the case of a pure

*cubic*model. However, interestingly, it is not the case when addressing a

*g*

_{NL}≡ 0 scenario. In this case, we note two important results. First, whereas the

*f*

_{NL}estimation from the analysis of the Northern hemisphere provides a constraint compatible with the Gaussian scenario, it is not the case for the Southern hemisphere. In fact, we find that

*f*

_{NL}< 0 at 96 per cent CL. In particular, we find . Secondly, the distance between both distributions is too large. Let us make use of the Kullback–Leibler divergence (KLD; Kullback & Leibler 1951) as a measurement of the distance between the two likelihoods

*p*

_{n}(

**|**

*x**f*

_{NL}) and

*p*

_{s}(

**|**

*x**f*

_{NL}):

*p*

_{n}(

**|**

*x**f*

_{NL}) and

*p*

_{s}(

**|**

*x**f*

_{NL}) are the likelihoods for the Northern and the Southern hemispheres, respectively. Actually, we use the symmetrized statistic

*D*, defined as We have found that the distance

*D*for the likelihood distributions of the

*f*

_{NL}parameter estimated in the Northern and the Southern hemispheres defined by the dipolar pattern described by Hoftuft et al. (2009) is much larger than it would be expected from Gaussian and isotropic random CMB simulations. In particular, such a distance has a

*p*-value of ≈0.04. This result is a further evidence on the largely discussed

*WMAP*north–south asymmetry, and it is as well an indication that such asymmetry is manifested in terms of the non-Gaussianity of the CMB temperature fluctuations, in agreement with previous results (e.g. Eriksen et al. 2004b; Hansen et al. 2004a,b; Park 2004; Vielva et al. 2004; Cruz et al. 2005; Eriksen et al. 2005; Land & Magueijo 2005; Monteserín et al. 2008; Räth et al. 2009; Rossmanith et al. 2009).

At this point, it is worth recalling that the analysis described above has been performed assuming isotropy, i.e. we have used the same type of correlations to described the second-order statistics in both the Northern and the Southern hemispheres. However, the result obtained by Hoftuft et al. (2009) indicates that these two hemispheres might be described by two different correlations (i.e. the sky would not be isotropic any longer). The dipolar modulation proposed by Hoftuft et al. (2009) was small (its amplitude was lower than 0.7 per cent) but significant (a 3.3σ detection was claimed). Assuming this point, we have repeated our previous analysis, but using different statistical properties for the correlation matrices in the two halves. The way we have estimated these new correlation matrices is as follows: we have generated 500 000 simulations (in the same way it has been already described at the beginning of Section 4, and, once the co-added maps are degraded to nside= 32, each one of the simulations has been modified by applying the dipolar modulation estimated by Hoftuft et al. (2009) from the *WMAP* data. It is from these modulated simulations that we have estimated the new correlation matrices needed to estimate the likelihood probabilities. The result of this test is presented in Fig. 8. The conclusions related to the *f*_{NL} estimation are essentially the same: on the one hand, the analysis of the Northern hemisphere provides a constraint compatible with the Gaussian scenario, whereas that of the Southern hemisphere does not; on the other hand, the distance between both distributions is again too large (it also has a *p*-value – as compared to, in this case, anisotropic simulations – increases up to ≈0.09 (estimated in terms of the KLD). However, despite this slight change in the *f*_{NL} hemispherical estimation, dramatic differences can be observed for the pure *cubic* scenario. Interestingly, accounting for the dipolar modulation correction reveals an extra departure form anisotropy related to the *g*_{NL} constraints. The dipolar modulation makes the maximum-likelihood estimation of *g*_{NL} highly incompatible between both hemispheres. In particular, the distance between both distributions is extremely rare as compared with the expected behaviour from Gaussian and anisotropic CMB simulations (generated, as explained above, by applying the dipolar modulation reported by Hoftuft et al. 2009): it has a *p*-value of ≈0.002, in the sense of the KLD.

We have also studied whether *WMAP* data corrected by the dipolar modulation found by Hoftuft et al. (2009) could present a behaviour compatible with the Gaussian and isotropic hypotheses. Results for the corrected *WMAP* data are given in Fig. 9. Note that we do not see any significant differences from the previous situation (i.e. the case in which uncorrected *WMAP* data were analysed assuming anisotropy): the dipolar modulation does not affect the *f*_{NL} and *g*_{NL} constraints.

Summarizing, the results obtained in this section seem to confirm that there is some kind of anomaly related to a hemispherical asymmetry as the one defined by the dipolar pattern reported by Hoftuft et al. (2009), in the sense of the *f*_{NL} parameter. Even more, when *WMAP* data are analysed using correlations compatible with the dipolar modulation suggested by Hoftuft et al. (2009), asymmetries related to the *f*_{NL} parameter are not only clear but also associated with the *cubic term* (i.e. the *g*_{NL} parameter). Intriguingly, the correction of the *WMAP* data in terms of this dipolar modulation is not enough to obtain a CMB signal compatible with a Gaussian and isotropic random field.

At this point, it is worth mentioning that the dipolar modulation of Hoftuft et al. (2009) was obtained by considering second-order moments of the CMB data and therefore this correction only addresses the problems related to an asymmetry in terms of this order. Hence, it is not totally surprising that this dipolar modulation correction is not sufficiently satisfactory to solve the anomaly reported in this work, since such anomaly is related to higher order moments. It is also need to point out that we have tested that the dipolar modulation correction of the *WMAP* data does not affect the results obtained from an all-sky analysis of the CMB data.

Finally, let us recall that, in a recent work, Rudjord et al. (2009) searched for specific asymmetries related to the local estimation of the *f*_{NL} parameter, by using needlets. Contrarily to our findings, in this work no significant asymmetry was found when analysing *WMAP* data. There are some differences between the analyses that could explain the discrepancy, although they have to be taken as mere suggestions. First, the kind of non-Gaussianity that is probed by each work is different: whereas the Rudjord et al. (2009) paper explores a *f*_{NL} model that is local in the gravitational potential (from which the non-Gaussian temperature fluctuations are obtained taking into account all the gravitational effects), here we adopt a local model in the Sachs–Wolfe regime. Secondly, they work at the best *WMAP* resolution (around 10–20 arcmin), whereas we focus on scales of around 2°. Thirdly, we explore an specific division of the sky (the one reported by Hoftuft et al. 2009), whereas they consider several divisions that, not necessarily, have to match the one used by us (they explore hemispherical divisions within an interval of around 30°).

## 5 CONCLUSIONS

We have presented an extension of our previous work (Vielva & Sanz 2009), by defining a parametric non-Gaussian model for the CMB temperature fluctuations. The non-Gaussian model is a local perturbation of the standard CMB Gaussian field, which (under certain circumstances) is related to an approximative form of the weak non-linear coupling inflationary model at scales larger than the horizon scale at the recombination time (i.e. above the degree scale; see e.g. Komatsu & Spergel 2001; Liguori, Matarrese & Moscardini 2003). For this model, we are able to build the posterior probability of the data given the non-linear parameters *f*_{NL} and *g*_{NL}. From these pdfs, optimal maximum-likelihood estimators of these parameters can be obtained.

We have verified with *WMAP*-like simulations that the maximum-likelihood estimation of the *quadratic* non-linear coupling parameters () is unbiased, at least for a reasonable range of values, even when non-Gaussian simulations also account for a *cubic* term. In particular, we found that for simulated non-Gaussian coefficients such as and the estimation of *f*_{NL} is accurate and efficient. However, when trying to study the case in which only a pure *cubic* model is addressed the situation is different. In particular, the simulated *quadratic* term has an important impact on the estimation of *g*_{NL}. For instance, if a pure *quadratic* non-Gaussian model is simulated with a value of , a value of is wrongly estimated. This results indicates, obviously, that the *quadratic* term is more important than the *cubic* one in the expansion of the local non-Gaussian model and therefore not accounting properly for the former might have a dramatic impact on the latter. Contrarily, the opposite situation is much more unlikely. Finally, we have investigated the joint estimation of the *f*_{NL} and *g*_{NL} parameters. In this case, we find that for a similar regime as the one mentioned above (i.e. and ) an accurate and efficient estimation of the non-linear coupling parameters is obtained. However, for larger values of these coefficients, we find that the parameter estimation is highly biased, favouring a region of the parameter space of larger values for both coefficients.

We have addressed, afterwards, the analysis of the *WMAP* 5-year data. We have considered two different analyses. First, we have investigated the case of an all-sky analysis (except for the Galactic area not allowed by the *WMAP* KQ75 mask). Secondly, motivated by previous findings, we have performed a separated analysis of two hemispheres. In particular, the hemispherical division associated with the dipolar pattern found by Hoftuft et al. (2009) was considered.

Regarding the all-sky analysis, we find, for the case in which a pure *quadratic* model is investigated, the same result that we already found in our previous work (Vielva & Sanz 2009). In particular, we determine that −154 < *f*_{NL} < 94 at 95 per cent. Equivalently, for the case of a pure *cubic* non-Gaussian model we establish −5.6 × 10^{5} < *g*_{NL} < 6.4 × 10^{5} at 95 per cent. This is in agreement with a recent work by Desjacques & Seljak (2010), where an analysis of LSS data was performed. The result that we provide in this paper is, as far as we know, the first direct constraint on *g*_{NL} from CMB data alone. Finally, we have also investigated the case of a joint estimation of the *quadratic* and *cubic* non-linear coupling parameters. In this case, the constraints obtained on *f*_{NL} and *g*_{NL} are virtually the same as the ones already reported for the independent analyses.

We have performed a model selection to evaluate which of the four hypotheses (i.e. Gaussianity, a pure *quadratic* model, a pure *cubic* model and a general non-Gaussian scenario) is more likely. Since a well-motivated prior for the non-linear coupling parameters is lacking, we have used asymptotic model selection criteria (like AIC and BIC), instead of more powerful Bayesian approaches like the Bayesian Evidence or the posterior ratio test. Both methodologies (AIC and BIC) indicate that the Gaussian hypothesis is more likely than any of the non-Gaussian models. We also found that, among the non-Gaussian scenarios, the one with a pure *quadratic* model is the most favoured one, whereas the general one (i.e. *f*_{NL}≠ 0 and *g*_{NL}≠ 0) is the most unlikely.

The analysis of the *WMAP* data in two hemispheres revealed that, whereas both halves of the sky present similar constraints on the *g*_{NL} parameter (and, in both cases, not indicating any significant incompatibility with the zero value), the analysis of a pure *quadratic* scenario showed a clear asymmetry. First, the *f*_{NL} value in the hemisphere whose pole is closer to the Southern ecliptic pole is , which implies that *f*_{NL} < 0 at 96 per cent. Even more, the distance between both likelihoods (given in terms of the KLD) presents a *p*-value of ≈0.04.

We have also analysed the *WMAP* data after by considering different correlation properties in each hemisphere (according to the dipolar modulation described by Hoftuft et al. 2009). We have tested that the behaviour found for the *f*_{NL} is practically the same as before, and that, in addition, an extra anomaly appears associated with the *g*_{NL} parameter. In particular, the distance between both likelihoods is anomalously large as well (it corresponds to a *p*-value of ≲0.002). Further test was performed after correcting *WMAP* data from the dipolar modulation. In this case, the asymmetries in the maximum-likelihood estimations of both non-linear coupling parameters remain unaltered. Hence, these results indicate that, as it has been previously reported in other works, there are evidences of anisotropy in the *WMAP* data, reflected as an asymmetry between two opposite hemispheres. Such anomaly is related to a different distribution for a non-linear coupling parameter related to the *quadratic* term. However, a correction in terms of a dipolar modulation as the one proposed by Hoftuft et al. (2009) seems not to be sufficient to account for this anomaly related to the likelihood distribution of the *f*_{NL} parameter.

*f*

_{NL}parameter had an opposite sign with respect to the definition used in this paper.

We thank Belén Barreiro and Enrique Martínez-González for useful comments. We acknowledge partial financial support from the Spanish Ministerio de Ciencia e Innovación project AYA2007-68058-C03-02. PV also thanks financial support from the *Ramón y Cajal* programme. The authors acknowledge the computer resources, technical expertise and assistance provided by the Spanish Supercomputing Network node at Universidad de Cantabria. We acknowledge the use of Legacy Archive for Microwave Background Data Analysis. Support for it is provided by the NASA Office of Space Science. The healpix package was used throughout the data analysis (Górski et al. 2005).

## REFERENCES

*et al.*(

## Appendix

### APPENDIX A: *N*-PDF DERIVATION

In this appendix, we provide a detailed derivation of the *N*-pdf for the non-linear local model given by equation (5). As it has been already discussed, the objective is to make use of this likelihood to constrain the parameters that define the perturbative model. In particular, we are interested in estimating the parameters governing the *quadratic* (*a*) and *cubic* (*b*) terms, which, as it was explained in Section 2, can be related (under certain circumstances) to the non-linear coupling inflationary parameters *f*_{NL} and *g*_{NL}, respectively.

Let us recall here the expression for the *N*-pdf for the non-Gaussian model, as a function of the multinormal *N*-pdf [*p*(**φ**), equation 9]:

*Z*is the determinant of the Jacobian of the transformation (equation 10). For practical purposes (i.e. in order to constrain the non-linear coupling parameters), it is more convenient to deal with the log-likelihood (given in equation 13), instead of

*p*(

**|**

*x**a*,

*b*) (given in equation 9). Obviously, the later can be easily obtained by the inversion of equation (13).

Replacing equation (9) into (13), it is easy to show that the computation of implies to solve two terms:

where and, we recall it,*p*(

**|0) ≡**

*x**p*(

**φ=**).

*x*Let us address the determination of these two terms (log *Z* and Ω) independently.

#### A1 Determination of the log-Jacobian

According to equation (10), the log-Jacobian for the transformation is given by

where φ_{i}as a function of

*x*is given in equation (11). It is straightforward to prove that where Since we are considering a perturbative non-Gaussian model, then and therefore the log-Jacobian can be easily derived from the Taylor expansion for the logarithm function: Taking into account that (as it can be trivially shown from equations 5 and 7) and replacing equation (A5) into equation (A6), one finally gets up to the appropriate order in σ.

_{i}#### A2 Determination of Ω

Since we are dealing with a perturbative model (i.e. **φ**=** x**+

**ε**, with

**ε**≪

**and where**

*x***ε**≡

**η**σ+

**ν**σ

^{2}+

**μ**σ

^{3}+

**λ**σ

^{4}, according to equation 11) it is interesting to note that the probability function

*p*[

**φ**=

**φ**(

**)] can be expanded in terms of the Taylor expansion, up to the appropriate order:**

*x**P*≡

*p*(

**φ**=

**+ 0) ≡**

*x**p*(

**|0). The function ∂**

*x*_{i}

*Y*(

**) is the derivative of**

*x**Y*(

**) with respect to**

*x**x*, i.e. . Equivalently, and .

_{i}Taking into account this expansion, we have

where, and . It is trivial to show that these quantities are related to the data*, its expected correlation*

**x****ξ**=[ξ

_{ij}], and the model (i.e.

*a*and

*b*). After some algebra, one trivially finds

Since we are interested in the quantity , the logarithm function can be expanded up to the appropriate order. It is straightforward to obtain

whereWe have defined the operator *S*_{αβ}≡α^{i}ξ^{−1}_{ij}β^{j}, and **η**, **ν**, **λ** and **μ** were defined in equation (12).

Therefore, taking into account equations (A8) and (A12), the final expression for the log likelihood is