Beauty Contests and the Term Structure

A novel decomposition highlights the scope for information to influence the term structure of interest rates. Based on the law of total covariance, we show that real term premia in macroeconomic models contain a component that depends on covariances of realised stochastic discount factors and a component that depends on covariances of expectations of those stochastic discount factors. The impact of different informational assumptions can then be identified by looking at their effect on the second, expectational, component. If agents have full information about technology in a simple macro-finance model then the conditional covariance of expectations is low, which contributes to the real term premia implied by the model being at least an order of magnitude too small, a result that is unchanged if some components of technology are unobservable or observed with noise. To generate realistic term premia, we draw on the beauty contest literature by differentiating between private and public information and introducing the possibility of strategic complementarities in the formation of expectations. A quantitative version of the model is found to explain a significant proportion of observed term premia when estimated using data on expectations of productivity growth from the Survey of Professional Forecasters.


Introduction
The way that developments in the real economy affect the pricing of financial assets is central to much recent research in monetary economics and macro-finance. Despite this, quantitative DSGE models of the type developed by Christiano et al. (2005) and Smets and Wouters (2003, 2007) continue to find it remarkably difficult to generate risk premia of a magnitude anything like those seen in financial markets. For example, Rudebusch and Swanson (2012) find that the average term premium on a default-free nominal 5-year zero-coupon bond is less than one basis point in a medium-scale DSGE model with nominal rigidities and a reasonable coefficient of relative risk aversion. Estimates from Adrian et al. (2013) suggest that the term premium on 5-year US Treasury Bills 1999-2017 was, at 57.2 bps, at least an order of magnitude higher. This paper presents a new decomposition that stresses the importance of informational assumptions for the emergence of sizeable term premia in asset pricing models. We focus our attention on the relationship between information and the real term premium.1 Using the standard no-arbitrage pricing condition, we divide the real term premium into a component that is affected by information and a component that is not. By applying the law of total covariance, the mean of the real term premium at any maturity can be shown to depend on covariances of successive realised stochastic discount factors and covariances of successive expectations of stochastic discount factors, with the latter being directly affected by the informational assumptions imposed on the model. The new decomposition prompts us to attribute the quantitative failure of medium-scale DSGE models to informational assumptions that are unable to deliver sufficient covariance in expectations. Having identified the channels by which information impacts on term premia, the new decomposition provides a useful framework in which to explore alternative assumptions about information.
We begin our analysis by focusing on the real term premium on two-period bonds in a simplified yet tractable model where persistent and transitory technology shocks are the only exogenous disturbances. In keeping with quantitative results from medium-scale DSGE models, we find that the term premium is small if agents are assumed to have full information about the current level of technology. This is because the unconditional covariance of both realised and expected stochastic discount factors is low for any realistic parameterisation of the technology process and any reasonable level of risk aversion. The unconditional covariance of expectations is especially small, a result that still holds if we assume that the representative agent has only partial or noisy information and so has to infer the current level of technology and break it down into persistent and transitory components.

1 Information has the potential to influence term premia through its effects on real term premia and inflation risk premia. Our decision to concentrate on real term premia reflects a belief, as in Hördahl and Tristani (2012), that "most of the average slope in the nominal term structure is due to compensation for real risks, rather than inflation risk." We also have reservations about empirical estimates of inflation risk premia, which are yet to settle in the literature and often differ across specifications using survey data, synthetic yields or TIPS.
In thinking of alternative informational assumptions that may increase the covariance of expected stochastic discount factors, we are guided by the quote of Shiller (1984) above the start of this introduction. We interpret his hypothesis of a social process driving opinions as motivation for exploring informational assumptions that introduce strategic complementarities into the formation of expectations. Drawing on the literature on beauty contests, we look at what happens when agents price the real term premium according to forecasts of technology that reward the agent for being similar not only to fundamentals but also to the average forecast across all agents. In this scenario, the greater the strategic complementarity in forecasting the more that agents are willing to incorporate aggregate and idiosyncratic noise into their forecasts of future technology. If the aggregate noise is persistent then it will induce additional unconditional autocovariance in expectations of the stochastic discount factor, which increases term premia as agents demand compensation for the extra risk they face.
The potential for complementarities in forecasting to occasion significant term premia is first demonstrated in our simple analytical framework, where they gain traction by increasing the unconditional covariance of expectations of the stochastic discount factor. We find the same result in a more general model in which the degree of strategic complementarity is tightly disciplined by data on individual forecasts of productivity from the Survey of Professional Forecasters. When estimated, the general model generates a real term premium on 5-year zero-coupon bonds of 23.8 bps if strategic complementarities are present, a distinct improvement on the 5.6-7.2 bps estimated when they are absent, and more than 40% of the 57.2 bps term premium previously noted for 5-year US Treasury Bills 1997-2017. Our decomposition attributes all of the extra term premium to a larger covariance in expectations of the stochastic discount factor.

The paper is organised as follows. Section 2 discusses related literature before Section 3 applies the law of total covariance to derive our new decomposition. Section 4 presents the implications of different informational assumptions for the premium on a two-period bond in our simplified analytical model. Section 5 presents the more general business cycle model and discusses quantitative results from its estimation. Section 6 concludes.

Related literature
The finding that standard general equilibrium asset pricing models are inconsistent with data on the equity premium and the term structure of interest rates dates back to Mehra and Prescott (1985), Backus et al. (1989), and Hansen and Jagannathan (1991). A large literature evolved in response to this apparently "puzzling" disconnect between models and data, including explanations based on recursive preferences (Epstein and Zin, 1989), long-run risk (Bansal and Yaron, 2004; Croce, 2014), rare disasters (Rietz, 1988; Barro, 2006) and habit formation (Jermann, 1998; Abel, 1999; Rudebusch and Swanson, 2008). However, models which include one or more of the above mechanisms typically explain data on risk premia only at the expense of implausibility still remaining at some other margin. Tallarini (2000) shows that asset pricing models with recursive preferences can simultaneously match data on the equity risk premium and the risk-free rate with a coefficient of relative risk aversion of 50, considerably less than what is needed to justify the risk premium with standard preferences but still well above the levels usually estimated in micro data. Rudebusch and Swanson (2012) similarly ask whether a medium-scale DSGE model with long-run risk and recursive preferences can match the average term premium on a nominal 10-year bond. The answer is yes with a plausible degree of long-run risk, but their preferred specification still relies on a coefficient of relative risk aversion of 110.
An attempt to rationalise the ostensibly high coefficients of relative risk aversion has been made by Barillas et al. (2009) in the literature on ambiguity aversion. They show that the market price of risk includes components that compensate agents not only for known risks but also for ambiguity surrounding the true data generating process for returns, in which case the high coefficients of relative risk aversion in recursive preference formulations can be re-interpreted as measuring reasonable levels of aversion to both risk and ambiguity. Results from van Binsbergen et al. (2012) caution against applying this reasoning to the results of Rudebusch and Swanson (2012) though, finding that inflation would still need to be unrealistically volatile for a standard DSGE model with nominal frictions to square with the US Treasury yield curve, even if the high coefficient of relative risk aversion is rationalised by ambiguity aversion. van Binsbergen et al. (2012) work with a model in which the capital stock is endogenous, which moderates the effects of exogenous fluctuations in capital that are important in Rudebusch and Swanson (2012).
A number of authors have explored how amending the standard informational assumptions in macroeconomic models can help explain the behaviour of selected financial assets. Cogley and Sargent (2008) create significant equity premia by requiring agents to re-learn the law of motion for consumption growth after a bout of pessimism brought on by the Great Depression, although the premia eventually dissipate as the influence of the initial pessimism declines. Collard et al. (2018) report similar findings in a model where ambiguity-averse agents fear that their model of the joint distribution of future consumption and dividends may be misspecified.
Another promising line of research is by Luo (2010) and Luo and Young (2016), who examine simple portfolio choice models in which agents solve a rational inattention problem of the type introduced by Sims (2003). Models in which higher-order expectations play a role in the dynamics of equity prices have been proposed in work by Allen et al. (2006), Bacchetta and van Wincoop (2008), Kasa et al. (2014) and Barillas and Nimark (2018) that is broadly related to the beauty contest literature of Morris and Shin (2002) and Angeletos and Pavan (2007).
The idea that strategic motives can amplify volatility in aggregate expectations is studied by Angeletos and La'O (2013), who demonstrate how random matching of agents uncertain about their idiosyncratic productivity makes expectations volatile even when aggregate productivity is common knowledge. Benhabib et al. (2015) make a related point in a model with multiple equilibria. We regard these approaches as complementary to ours.

Decomposing the term premium
This section explains bond pricing in dynamic stochastic general equilibrium and applies the law of total covariance to derive our novel decomposition of risk premia. In what follows we fix ideas by assuming that financial assets are priced by a representative agent, leaving until Section 4.5 the decomposition of risk premia in beauty contests with heterogeneous agents and strategic complementarities in expectations formation.

The household's optimisation problem
The household's expected utility is

E_t Σ_{s=t}^∞ β^{s−t} u(c_s, l_s)   (1)

in all periods t ≥ 0, where β ∈ (0, 1) is the discount factor, E_t is the conditional expectation operator with respect to the information set I_t, (c_t, l_t) ∈ Ω ⊆ (R_+)² is a consumption and labour supply choice and u : Ω → R_+ is the period utility function. The flow budget constraint is

c_t + Σ_{n=1}^N p_t^(n) b_t^(n) = w_t l_t + d_t + Σ_{n=1}^N p_t^(n−1) b_{t−1}^(n),   (2)

where w_t is the wage and d_t is lump-sum income. The household can invest in non-contingent zero-coupon real bonds that have a redemption value of one unit of consumption at maturity. Their period t holdings of bonds due to mature in n periods are denoted by b_t^(n) for n = 1, 2, . . . , N, with corresponding prices p_t^(n). Note that bond holdings b_{t−1}^(n) inherited from period t − 1 are priced at p_t^(n−1) in period t. The price of a bond at maturity is its redemption value, hence p_t^(0) = 1. The representative household maximises utility (1) subject to the sequence of flow budget constraints (2) and conditions ruling out Ponzi schemes. An interior solution satisfies N consumption Euler equations of the form

p_t^(n) = E_t[m_{t+1} p_{t+1}^(n−1)]   (3)

for n ∈ {1, 2, . . . , N}, where m_{t+1} is the stochastic discount factor defined by

m_{t+1} ≡ β u_c(c_{t+1}, l_{t+1}) / u_c(c_t, l_t).   (4)

Equation (3) is the standard asset pricing condition equating the price of an asset to its expected discounted price in the following period. Together with the law of iterated expectations, it implies that the price of a bond p_t^(n) is given by the expected product of successive stochastic discount factors over the maturity of the bond,

p_t^(n) = E_t[m_{t+1} m_{t+2} · · · m_{t+n}].   (5)

These bond prices can be translated into yields by defining the continuously-compounded yield of an n-period zero-coupon bond as the value of i_t^(n) that solves p_t^(n) = e^{−n i_t^(n)}, that is

i_t^(n) = −(1/n) ln p_t^(n).   (6)

We denote the yield on a one-period bond as i_t ≡ i_t^(1) without a superscript to simplify notation.

Comovement, expectations and the term premium
The stochastic nature of bond prices implies that bonds of maturity n > 1 are a source of risk for the household, even if there is no possibility of default. 2 It is well-known from the capital asset pricing model (CAPM) that a risk-averse household demands compensation for holding bonds if there is undesirable comovement between bond prices and the household's marginal utilities of consumption. Since marginal utilities depend on realised stochastic discount factors and bond prices are a function of expected stochastic discount factors, the term premium will depend on the joint autocovariance structure of realisations and expectations.
The real term premium at maturity n is defined relative to the hypothetical price of an n-period bond under risk neutrality. Following the literature in Andreasen (2012), Gürkaynak and Wright (2012) and Rudebusch and Swanson (2008, 2012), the risk-neutral price is assumed to be

p̃_t^(n) = E_t[e^{−(i_t + i_{t+1} + · · · + i_{t+n−1})}],   (7)

which is the redemption value of an n-period bond discounted by expected future one-period bond yields rather than expected household stochastic discount factors. Translating prices into yields as we did for equation (6), the risk-neutral yield at maturity n is

ĩ_t^(n) ≈ (1/n) E_t Σ_{j=0}^{n−1} i_{t+j}   (8)

to an approximation that ignores a Jensen's inequality term. It coincides with the bond yield predicted by the expectations theory of the term structure. The per-period real term premium in bond prices is defined by

ψ_t^(n) ≡ p̃_t^(n) − p_t^(n),   (9)

with ψ_t^(1) = 0. With regard to yields the term premium is i_t^(n) − ĩ_t^(n).

2 Holding bonds of maturity n = 1 is not risky as they always deliver one unit of consumption next period.

A useful decomposition
The real term premium on bond prices for n = 2 is given by equations (5), (7) and (9) as

ψ_t^(2) = E_t(e^{−i_t − i_{t+1}}) − E_t(m_{t+1} m_{t+2}).   (10)

The last term on the right hand side of (10) satisfies E_t(m_{t+1} m_{t+2}) = Cov_t(m_{t+1}, m_{t+2}) + E_t m_{t+1} E_t m_{t+2} by the definition of conditional covariance. Combining the Euler equation (3) with the bond yield equation (6) allows us to write E_t m_{t+1} = e^{−i_t} and E_{t+1} m_{t+2} = e^{−i_{t+1}}. By the law of iterated expectations the latter becomes E_t m_{t+2} = E_t e^{−i_{t+1}} and the last term on the right hand side is

E_t(m_{t+1} m_{t+2}) = Cov_t(m_{t+1}, m_{t+2}) + e^{−i_t} E_t e^{−i_{t+1}}.   (11)

Applying analogous reasoning to the first term on the right hand side of (10), and noting that i_t is known in period t, yields

E_t(e^{−i_t − i_{t+1}}) = e^{−i_t} E_t e^{−i_{t+1}},   (12)

and the decomposed real term premium is

ψ_t^(2) = −Cov_t(m_{t+1}, m_{t+2}).   (13)

The unconditional mean of the real term premium follows as

Eψ^(2) = −Cov(m_{t+1}, m_{t+2}) + Cov(E_t m_{t+1}, E_{t+1} m_{t+2})   (14)

by applying the law of total covariance to equation (13), noting that Cov(E_t m_{t+1}, E_t m_{t+2}) = Cov(E_t m_{t+1}, E_{t+1} m_{t+2}) by the law of iterated expectations.
The novelty is the ability to decompose the real term premium into components depending on conditional covariances of successive realised and expected stochastic discount factors. A positive term premium arises if the conditional covariance of expectations of stochastic discount factors is positive and larger than the conditional covariance of realised stochastic discount factors over the maturity of the bond. The intuition comes from risk premia being the difference in price between two bonds of the same maturity, so the common components in (11) and (12) cancel and the risk premium is determined solely by how the bond price conditionally covaries with the stochastic discount factor and the yield on a one-period bond.
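The law of total covariance that underlies equation (14) can be checked numerically. The sketch below is a minimal Monte Carlo illustration in a linear-Gaussian setting of our own choosing; the conditioning variable z and the loadings a and b are illustrative and not part of the model. Conditioning on z splits the total covariance of two outcomes into the mean of conditional covariances plus the covariance of conditional means.

```python
import numpy as np

rng = np.random.default_rng(0)
N = 1_000_000
a, b = 0.7, 0.4          # illustrative loadings, not model parameters

# Conditioning variable and two outcomes that share it
z = rng.standard_normal(N)
e1 = rng.standard_normal(N)
e2 = b * e1 + rng.standard_normal(N)   # correlated conditional noise
x = z + e1
y = a * z + e2

# Conditional moments are analytic in this setup:
#   E[x|z] = z,  E[y|z] = a z,  Cov(x, y | z) = b
cov_total = np.cov(x, y)[0, 1]
cov_of_means = np.cov(z, a * z)[0, 1]   # Cov(E[x|z], E[y|z]), close to a
mean_of_covs = b                        # E[Cov(x, y | z)], exact here

# Law of total covariance:
#   Cov(x, y) = E[Cov(x, y | z)] + Cov(E[x|z], E[y|z])
print(cov_total, mean_of_covs + cov_of_means)
```

The same logic applied to successive stochastic discount factors, with I_t as the conditioning information, delivers the decomposition of the mean term premium.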
It is possible to decompose conditional and unconditional real term premia at all horizons into components that depend on realisations and expectations, albeit with involved calculations.
For example, the conditional term premium for n = 3 involves conditional covariances among three successive stochastic discount factors and their expectations. Proposition 1 and Corollary 1 in Appendix A.1 present the general case. The resulting expressions are independent of assumptions about information and the stochastic processes of exogenous disturbances, which gives the decomposition the power to inform across many different environments. The only requirement is that the law of iterated expectations holds. We are particularly interested in how the expectations components react to different informational assumptions.

A simple analytical model
This section uses the new decomposition to connect informational assumptions and term premia in simple analytical models, in preparation for the more general investigation in Section 5.

Households, firms and technology
The economy consists of a representative household and a representative firm. The representative household has inelastic labour supply l_t = l̄ and solves the household's optimisation problem in Section 3.1. The representative firm produces output y_t according to

y_t = A_t k_t^α l_t^{1−α} = A_t l_t^{1−α},   (15)

where the fixed capital stock is normalised to one, α ∈ (0, 1) is a parameter and the logarithm of technology a_t ≡ ln A_t follows the exogenous process

a_t = x_t + η_t,   (16)
x_t = ρ x_{t−1} + ε_t,   (17)

with ρ ∈ (−1, 1), η_t ∼ N(0, σ_η²), ε_t ∼ N(0, σ_ε²) and Cov(η_t, ε_t) = 0. The logarithm of technology hence has an AR(1) persistent component x_t and an i.i.d. transitory component η_t. The household's optimisation problem defines the general form (4) of the stochastic discount factor. We assume that utility is logarithmic in consumption so the stochastic discount factor is

m_{t+1} = β c_t / c_{t+1}.   (18)

In equilibrium c_t = A_t l̄^{1−α} for all t, since in the household's budget constraint (2) labour is paid its marginal product, firms make zero profits and bonds are in zero net supply. To a log-linear approximation the stochastic discount factor implied by the model becomes

m_{t+1} ≈ β(1 + a_t − a_{t+1}).   (19)

It is instructive to begin with the shortest maturity n = 2 at which bonds have a non-zero real term premium, for which the unconditional mean real term premium is given by equation (14).
The first term on the right hand side of (14) is the unconditional autocovariance of successive realised stochastic discount factors. When the stochastic discount factor is approximated by equation (19), the technology process (16)-(17) implies

Cov(m_{t+1}, m_{t+2}) = −β²[(1 − ρ)² σ_x² + σ_η²],   (20)

where σ_x² ≡ σ_ε²/(1 − ρ²) is the unconditional variance of the persistent component. The autocovariance of realised stochastic discount factors Cov(m_{t+1}, m_{t+2}) in equation (20) is independent of the household information set I_t, which means that informational assumptions are fully reflected in the expectations component Cov(E_t m_{t+1}, E_{t+1} m_{t+2}) in the simple model.
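As an illustrative cross-check, the sketch below simulates the technology process (16)-(17), constructs the log-linear stochastic discount factor (19), and compares the sample autocovariance of realised stochastic discount factors with the analytical expression −β²[(1 − ρ)²Var(x_t) + Var(η_t)]. The parameter values (β = 0.99, ρ = 0.8, Var(a_t) = 0.01², Var(x_t)/Var(a_t) = 0.9) are our choice of one point on the left panel of Figure 1.

```python
import numpy as np

# Monte Carlo check of the autocovariance of realised SDFs under (16)-(19)
rng = np.random.default_rng(1)
beta, rho = 0.99, 0.8
var_a = 0.01 ** 2
var_x = 0.9 * var_a                         # Var(x_t)/Var(a_t) = 0.9
sig_eps = np.sqrt(var_x * (1 - rho ** 2))   # innovation std of x_t
sig_eta = np.sqrt(var_a - var_x)            # std of transitory component

T = 1_000_000
x = np.zeros(T)
eps = rng.normal(0.0, sig_eps, T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + eps[t]          # persistent component (17)
a = x + rng.normal(0.0, sig_eta, T)         # technology (16)

m = beta * (1.0 + a[:-1] - a[1:])           # log-linear SDF, equation (19)
cov_mc = np.cov(m[:-1], m[1:])[0, 1]        # Cov(m_{t+1}, m_{t+2})
cov_analytic = -beta ** 2 * ((1 - rho) ** 2 * var_x + (var_a - var_x))
print(cov_mc, cov_analytic)
```

The sample autocovariance is negative and close to the analytical value, so the realisations component −Cov(m_{t+1}, m_{t+2}) of the term premium is positive at this calibration.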

Full information
The benchmark informational assumption is that households have full information about current and past values of the persistent and transitory components of technology. The assumption is

I*_t = {x^t, η^t},

where I*_t is the information set of the representative household and the superscript notation z^t ≡ {z_s}_{s=0}^t indicates the entire history of a variable up to and including period t. Taking expectations of the stochastic discount factor implied by the model (19) with respect to I*_t gives its conditional expectation as

E(m_{t+1}|I*_t) = β(1 + a_t − ρ x_t),   (21)

and the unconditional autocovariance of successive expected stochastic discount factors is

Cov(E(m_{t+1}|I*_t), E(m_{t+2}|I*_{t+1})) = β² ρ (1 − ρ)² σ_x².   (22)

With the unconditional autocovariance of successive realised stochastic discount factors (20) being independent of informational assumptions, the unconditional mean of the real term premium can be calculated analytically. Substituting the covariance terms (20) and (22) into (14) it is

Eψ^(2)_FI = β²[(1 + ρ)(1 − ρ)² σ_x² + σ_η²].   (23)

Figure 1 depicts the mean real term premium Eψ^(2)_FI with full information in a quantitative version of our model. In both panels we fix the discount factor β = 0.99 and the overall volatility in technology a_t at Var(a_t) = 0.01², which leaves freedom to explore term premia in two dimensions. The left panel plots the real term premium against the degree of persistence ρ in the persistent component x_t of the technology process, holding constant its relative contribution to overall volatility in technology by setting σ_ε² and σ_η² such that Var(x_t)/Var(a_t) = 0.9. In the right panel we have the real term premium plotted against the relative contribution of the persistent component, adjusting σ_ε² and σ_η² to fix the persistence of the persistent component at ρ = 0.8. The term premium in each panel is decomposed into covariance terms −Cov(m_{t+1}, m_{t+2}) and Cov(E(m_{t+1}|I*_t), E(m_{t+2}|I*_{t+1})) that respectively depend on realisations and expectations of stochastic discount factors.
The real term premium for the model with full information is decreasing in the persistence parameter ρ and the relative contribution of the persistent component of technology. Most of the real term premium comes from the component that depends on realisations, which decreases in ρ in the left panel of Figure 1 because a_t and a_{t+1} enter equation (19) for the stochastic discount factor with opposite signs. Higher autocovariance in technology therefore translates into lower autocovariance in realised stochastic discount factors. The component of the real term premium that depends on expectations first rises with ρ as the autocovariance of the persistent component of technology increases. However, it eventually falls as a higher ρ reduces the extent to which that autocovariance is loaded into expectations of the stochastic discount factor by equation (21). The largest contribution from the expectations component comes at the value of ρ that makes technology sufficiently persistent yet still tracked by expectations. At the extremes expectations make no contribution to the real term premium, at ρ = 0 because there is no persistence in the model and at ρ = 1 as a random walk in technology removes all persistence from the stochastic discount factor. The expectations component increases with the relative contribution of the persistent component to technology in the right panel of Figure 1, although its contribution remains small and dominated by the component that depends on realisations.
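The full-information expressions can be verified by simulation. The sketch below constructs the conditional expectation of the stochastic discount factor under full information, E(m_{t+1}|I*_t) = β(1 + a_t − ρx_t), and checks that the expectations component β²ρ(1 − ρ)²Var(x_t) and the mean premium β²[(1 + ρ)(1 − ρ)²Var(x_t) + Var(η_t)] both match their sample counterparts; the calibration (ρ = 0.8, Var(x_t)/Var(a_t) = 0.9) is again one point on the left panel of Figure 1.

```python
import numpy as np

# Monte Carlo check of the full-information decomposition of the n = 2 premium
rng = np.random.default_rng(2)
beta, rho = 0.99, 0.8
var_a = 0.01 ** 2
var_x = 0.9 * var_a
sig_eps = np.sqrt(var_x * (1 - rho ** 2))
sig_eta = np.sqrt(var_a - var_x)

T = 1_000_000
x = np.zeros(T)
eps = rng.normal(0.0, sig_eps, T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + eps[t]
a = x + rng.normal(0.0, sig_eta, T)

m = beta * (1.0 + a[:-1] - a[1:])    # realised SDF, equation (19)
em = beta * (1.0 + a - rho * x)      # E(m_{t+1} | I*_t), equation (21)

cov_real = np.cov(m[:-1], m[1:])[0, 1]       # realisations component
cov_exp = np.cov(em[:-1], em[1:])[0, 1]      # expectations component
premium = cov_exp - cov_real                 # mean premium via (14)
premium_analytic = beta ** 2 * (
    (1 + rho) * (1 - rho) ** 2 * var_x + (var_a - var_x)
)
print(premium, premium_analytic)
```

Most of the simulated premium comes from −cov_real, with cov_exp making the small positive contribution visible in Figure 1.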

Partial information
The first relaxation of the full information benchmark supposes that the household knows current and past values of technology but does not observe its decomposition into transitory and persistent components. Formally, the representative household's information set I_t contains the history a^t but there is no period s ∈ {0, 1, . . . , t} such that x_s ∈ I_t or η_s ∈ I_t.
The household forms expectations over future stochastic discount factors according to

E(m_{t+1}|I_t) = β(1 + a_t − ρ E(x_t|I_t)),   (24)

which requires them to infer the fraction of current technology that comes from its persistent component. This is a standard signal extraction problem, requiring an estimate of the state x_t from a sequence of noisy signals a^t. The solution satisfies the Kalman filter recursion

E(x_t|I_t) = (1 − K_t) ρ E(x_{t−1}|I_{t−1}) + K_t a_t,   (25)
K_t = Σ_t / (Σ_t + σ_η²),   Σ_{t+1} = ρ²(1 − K_t)Σ_t + σ_ε²,   (26)

where Σ_t ≡ Var(x_t|I_{t−1}) and K_t is the Kalman gain. We further assume that x_0 ∼ N(0, Σ), where Σ is the steady-state solution of the recursion for Σ_t, in which case Σ_t and K_t are constant for all t and the resulting autocovariance of expected stochastic discount factors can be calculated numerically using Monte Carlo methods. The autocovariance of realised stochastic discount factors is independent of the household's information set, so identical to that under full information. What these calculations imply for the real term premium and its decomposition is shown in Figure 2.
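The filtering problem in (25)-(26) can be sketched in a few lines of scalar Kalman filtering. The code below iterates the variance recursion to its fixed point and then filters a simulated technology history; the parameter values follow the calibration of Figure 2 (ρ = 0.8, Var(x_t)/Var(a_t) = 0.9) and the implementation is a sketch of the model's signal extraction problem rather than a substitute for it.

```python
import numpy as np

# Steady-state Kalman filter for x_t = rho x_{t-1} + eps_t observed as
# a_t = x_t + eta_t, following the recursion in (25)-(26)
rng = np.random.default_rng(3)
rho = 0.8
var_a = 0.01 ** 2
var_x = 0.9 * var_a
sig2_eps = var_x * (1 - rho ** 2)
sig2_eta = var_a - var_x

# Iterate the variance recursion to the steady-state prior variance Sigma
Sigma = var_x
for _ in range(500):
    K = Sigma / (Sigma + sig2_eta)                  # Kalman gain (26)
    Sigma = rho ** 2 * (1 - K) * Sigma + sig2_eps   # next prior variance (26)

# Filter a simulated history of technology
T = 500_000
x = np.zeros(T)
eps = rng.normal(0.0, np.sqrt(sig2_eps), T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + eps[t]
a = x + rng.normal(0.0, np.sqrt(sig2_eta), T)

xhat = np.zeros(T)                                  # E(x_t | I_t), as in (25)
for t in range(1, T):
    xhat[t] = (1 - K) * rho * xhat[t - 1] + K * a[t]

posterior_var = (1 - K) * Sigma                     # Var(x_t | I_t)
print(K, posterior_var, np.var(x - xhat))
```

At this calibration the signal a_t is informative enough that the filtered estimate tracks x_t closely, which is why partial information barely moves the expectations component in Figure 2.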
If the majority of the volatility in technology comes from its persistent component then a t is a precise signal about x t and expectations formed using the Kalman filter are close to those with full information. As can be seen from the left panel of Figure 2 drawn for Var(x t )/Var(a t ) = 0.9, the expectations component of the real term premium under I t is practically identical to that under the full information set I * t at all levels of ρ. The right panel of Figure 2 shows a marginally larger expectations component at intermediate levels of Var(x t )/Var(a t ) when ρ is fixed at 0.8, but requiring the household to decompose technology into transitory and persistent components still only has a very weak impact on the real term premium.

Noisy information
Adding noise to the household's signal extraction problem does not generally result in larger term premia in our model. To see this, we change the informational assumptions so that the representative agent only observes noisy signals of current and past values of technology when making decisions that determine the real term premium. Separating the information set relevant for the term premium from that determining labour supply and consumption is necessary to prevent the household imputing the current level of technology from the wage rate or total household income. It can be rationalised by dividing the household into a representative investor and a representative consumer. The investor trades both maturities of bonds in financial markets until the real term premium is arbitrage-free with respect to their information set, whereas the consumer knows current technology when it supplies labour and allocates household income to consumption and saving.3 The signal s_t has the form

s_t = a_t + ξ_t,   (27)

with noise ξ_t ∼ N(0, σ_ξ²) that may correlate with the transitory component of technology according to σ_ηξ ≡ Cov(η_t, ξ_t). I_t is the information set of the representative investor, defined such that s^t ⊂ I_t but with no period s ∈ {0, 1, . . . , t} such that a_s ∈ I_t, x_s ∈ I_t or η_s ∈ I_t. The conditional expectation of m_{t+1} is

E(m_{t+1}|I_t) = β(1 + (1 − ρ) E(x_t|I_t)),   (28)

and the representative investor has to infer the persistent component of technology from a noisy signal, as before when it was unobservable. The solution satisfies the same Kalman filter recursion defined by equations (25) and (26) in Section 4.3, only now the signal is (27) and the Kalman gain is

K_t = Σ^s_t / (Σ^s_t + σ_η² + σ_ξ² + 2σ_ηξ),

with Σ^s_t ≡ Var(x_t|I_{t−1}). What this implies for the real term premium is illustrated in Figure 3, drawn for Var(ξ_t) = Var(a_t)/2 and σ_ηξ = 0.

Figure 3: Components of the real term premium (n = 2) with full information (FI) and noisy information (NI). β = 0.99, Var(a_t) = 0.01², σ_ξ² = Var(a_t)/2 and σ_ηξ = 0.
Assuming that technology is observed through a noisy signal reduces the expectations component of the real term premium in both panels of Figure 3, most notably at intermediate values of the persistence parameter ρ. Noise has the potential to inject additional volatility and persistence into estimates of the persistent component of technology, but any effect on the autocovariance of expected stochastic discount factors is offset by expectations no longer directly reacting to realisations of technology. Expectations of the stochastic discount factor react to a t −ρx t with full information by equation (21) or a t −ρE(x t |I t ) with partial information by equation (24). But when there is noisy information they track (1 − ρ)E (x t |I t ) by equation (28). The absence of a direct effect of a t reduces the negative autocovariance of expectations and explains the fall in the expectations component of the term premium. If the aim is to generate term premia that match estimates from the data then adding noise in this way is counterproductive.
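The attenuation result can be illustrated by comparing the expectations component under full information, which tracks a_t − ρx_t by equation (21), with its noisy-information counterpart, which tracks (1 − ρ)E(x_t|I_t) by equation (28). The sketch below uses the calibration of Figure 3 (σ_ξ² = Var(a_t)/2, σ_ηξ = 0) and treats η_t + ξ_t as combined observation noise in the investor's scalar filter, an assumption made purely for simplicity here.

```python
import numpy as np

# Expectations component of the n = 2 premium: full vs noisy information
rng = np.random.default_rng(4)
beta, rho = 0.99, 0.8
var_a = 0.01 ** 2
var_x = 0.9 * var_a
sig2_eps = var_x * (1 - rho ** 2)
sig2_eta = var_a - var_x
sig2_xi = var_a / 2                 # signal noise, sigma_eta_xi = 0
sig2_obs = sig2_eta + sig2_xi       # combined observation noise in s_t

# Steady-state Kalman gain for the investor's filtering problem
Sigma = var_x
for _ in range(500):
    K = Sigma / (Sigma + sig2_obs)
    Sigma = rho ** 2 * (1 - K) * Sigma + sig2_eps

T = 500_000
x = np.zeros(T)
eps = rng.normal(0.0, np.sqrt(sig2_eps), T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + eps[t]
a = x + rng.normal(0.0, np.sqrt(sig2_eta), T)
s = a + rng.normal(0.0, np.sqrt(sig2_xi), T)    # noisy signal s_t = a_t + xi_t

xhat = np.zeros(T)                               # E(x_t | I_t)
for t in range(1, T):
    xhat[t] = (1 - K) * rho * xhat[t - 1] + K * s[t]

em_full = beta * (1.0 + a - rho * x)             # equation (21)
em_noisy = beta * (1.0 + (1 - rho) * xhat)       # equation (28)
cov_full = np.cov(em_full[:-1], em_full[1:])[0, 1]
cov_noisy = np.cov(em_noisy[:-1], em_noisy[1:])[0, 1]
print(cov_full, cov_noisy)
```

Under these assumptions the noisy-information expectations component stays positive but falls short of its full-information value, matching the ranking in Figure 3.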
The finding that noise attenuates real term premia extends to the general case in which the representative investor chooses the informational content of the noisy signals they observe, subject to a limited information processing capacity. Luo and Young (2014) show that rational inattention and signal extraction problems are isomorphic when the task is to extract an estimate of the persistent component of an exogenous process in a linear-quadratic Gaussian framework.
Their results apply to our model, so our findings extend to the general case under rational inattention and we conclude that the expectations component of the real term premium is dampened by noise, however that noise is introduced.

Beauty contests
The inability of either partial or noisy information to engender the autocovariance in expectations required to support meaningful real term premia suggests a need for more radical informational assumptions. In this section we do just that by dropping the assumption of the representative investor and instead assuming that investors demand term premia on the basis of heterogeneous forecasts of technology that are conditioned to be similar to the forecasts of other investors. Our setup mirrors the classic Keynesian beauty contest, where to predict the winner it is necessary to identify not only the prettiest contestant but also who other people think is the prettiest. This coordinates investors in our model as they form expectations of the expectations of others.

Information
The household is comprised of an investor and a consumer. There is a representative consumer making labour supply and consumption decisions based on a common information set as before, but investors are heterogeneously informed. Investor i ∈ [0, 1] receives a signal s_{i,t} about technology that has public and private noise components. Common noise n_t and idiosyncratic noise n_{i,t} follow mean-zero AR(1) processes with common persistence parameter ρ and respective innovations ξ_t ∼ N(0, σ_ξ²) and ζ_{i,t} ∼ N(0, σ_ζ²). The signal is

s_{i,t} = x_t + n_t + n_{i,t} + η_t = x^n_{i,t} + η_t,

where the sum of all persistent components x^n_{i,t} ≡ x_t + n_t + n_{i,t} evolves as

x^n_{i,t} = ρ x^n_{i,t−1} + ε_t + ξ_t + ζ_{i,t}.

The formal definition of the information set I_{i,t} of investor i is s_i^t ⊂ I_{i,t} but with no period s ∈ {0, 1, . . . , t} such that a_s ∈ I_{i,t}, x_s ∈ I_{i,t}, n_s ∈ I_{i,t} or n_{i,s} ∈ I_{i,t}. We assume that the transitory component of technology η_t is observable following the negligible effect of partial information on the term structure in Section 4.3. This makes the persistent component x^n_{i,t} observable too, and {η^t, x^{n,t}_i} ⊂ I_{i,t}.

Strategic complementarity
The expectations component of the term premium demanded by investor i depends as before on the autocovariance of their expectations of stochastic discount factors. The equilibrium stochastic discount factor for household i is analogous to that in equation (19), so the expectation m̂_{i,t+1} of its stochastic discount factor is assumed to be constructed as

m̂_{i,t+1} = β(1 + E(a_t|I_{i,t}) − ρ x̂_{i,t})

in period t, which highlights the role played by forecasts x̂_{i,t} of the persistent component of technology. We target our strategic complementarity at this forecasting problem.5 Strategic complementarities are introduced by assuming that investor i sets their forecast of the persistent component of technology to minimise

(1 − ω) E[(x̂_{i,t} − x_t)² | I_{i,t}] − ω x̂_{i,t} E[x̄_t | I_{i,t}],   where x̄_t ≡ ∫ x̂_{j,t} dj,

and 0 ≤ ω ≤ 1 is a parameter measuring the degree of strategic complementarity.
The first term gives the investor an incentive to minimise the mean squared deviation of their forecast from the true value of the persistent component of technology. This reflects the investor acting on behalf of the household, and so wanting to produce a forecast that best estimates the household's stochastic discount factor. The second term captures a strategic complementarity by rewarding the investor for a forecast that has the same sign as the expected average forecast of all investors. If the average is expected to be positive then the investor adjusts their forecast upwards, if negative the impetus is for the investor to adjust downwards.
This acts to coordinate forecasts and expectations. Constructs of this type have been used to introduce strategic interactions in a variety of settings, although establishing microfoundations for any given specification has proved challenging. 6 The difficulties are also apparent here, so we eschew precise microfoundations and proceed under the assumption that our specification adequately captures incentives when the forecasts of others matter. 7

5 In our linear model it could equally be assumed that there are complementarities in forecasting the stochastic discount factor. We prefer to work with forecasts of technology since they have a clear empirical counterpart that proves useful in disciplining estimation of the general model in Section 5.
6 In management science, Dessein and Santos (2006) and Dessein et al. (2016) posit a function similar to ours to capture strategic complementarities in organising tasks. Haltiwanger and Waldman (1989) do likewise for strategic complementarities in production, as does Vives (2014) in a finance environment. All these applications shy away from the microfoundations of strategic complementarities, as does the microeconomic theory in Bergemann et al. (2015).
7 The strategic complementarity in our model could be rationalised by fears that the household might suffer a liquidity shock and so need to liquidate their bond holdings within the period, in which case they would be interested in the expected price on liquidation.

Equilibrium signal extraction
The first order condition of investor i implies that they set their forecast according to

x̂_{i,t} = E[x_t | I_{i,t}] + (ω / (2(1 − ω))) E[x̄̂_t | I_{i,t}],

which adjusts the mean squared error minimising rational expectation of x_t to account for the strategic complementarity. The rational expectation of the persistent component of technology is extracted from investor i's noisy signal x^n_{i,t} as

E[x_t | I_{i,t}] = (σ_ε² / (σ_ε² + σ_ξ² + σ_ζ²)) x^n_{i,t}.

That the expectation is linear in the signal suggests the existence of a symmetric equilibrium in which the forecasts of all investors are a linear function of their respective signals,

x̂_{j,t} = θ x^n_{j,t},

where θ is a parameter to be determined in equilibrium, in which case investor i's expectation of investor j's forecast solves the complementary signal extraction problem as

E[x̂_{j,t} | I_{i,t}] = θ ((σ_ε² + σ_ξ²) / (σ_ε² + σ_ξ² + σ_ζ²)) x^n_{i,t}.

Symmetric linear equilibrium is confirmed with

θ = (σ_ε² / (σ_ε² + σ_ξ² + σ_ζ²)) / (1 − (ω / (2(1 − ω))) (σ_ε² + σ_ξ²) / (σ_ε² + σ_ξ² + σ_ζ²)).

The equilibrium exists with 0 ≤ θ < ∞ provided 2(1 − ω)/ω > (σ_ε² + σ_ξ²)/(σ_ε² + σ_ξ² + σ_ζ²), which is satisfied if there is a sufficiently large private noise component σ_ζ² in signals or only a limited degree of strategic complementarity ω. The maximal strategic complementarity the model can support is therefore bounded from above by a limit that depends on the variance of noise relative to the variance of the persistent component of technology. 8 We restrict ourselves throughout to configurations of the model for which the condition holds.
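The fixed point for θ can be sketched numerically. The projection weights below (regressing x_t and x_t + n_t on the composite state, which is exact here because the components share ρ) follow the signal extraction above; parameter values are illustrative:

```python
def theta_equilibrium(omega, sig_eps, sig_xi, sig_zeta):
    """Fixed point theta of the symmetric linear equilibrium x_hat = theta * x^n.

    Sketch under the projection weights in the text: kappa_x projects x_t,
    and kappa projects x_t + n_t, onto the composite state x^n_{i,t}.
    """
    tot = sig_eps**2 + sig_xi**2 + sig_zeta**2
    kappa_x = sig_eps**2 / tot              # weight in E[x_t | I_it]
    kappa = (sig_eps**2 + sig_xi**2) / tot  # weight in E[x_t + n_t | I_it]
    # Existence condition from the text: 2(1 - omega)/omega > kappa.
    if omega > 0 and 2 * (1 - omega) / omega <= kappa:
        raise ValueError("no symmetric linear equilibrium with finite theta")
    return kappa_x / (1 - omega / (2 * (1 - omega)) * kappa)
```

At ω = 0 the function returns the rational-expectations weight κ_x; raising ω increases θ at an increasing rate, consistent with forecasts reacting progressively more to signals.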
The value of θ determines how much the investor's forecastsx i,t andm i,t+1 react to the noisy signals they observe. It is increasing and convex in ω in equilibrium, meaning that forecasts react progressively more to signals when there are more strategic complementarities. When ω = 0 the investor forecasts in isolation, being indifferent to the forecasts of others and having no reason to consider whether the noise in their signal comes from a common or idiosyncratic component.
When ω > 0 the investor can no longer do this. They know that the forecasts of others react to signals that have a common noise component, so to keep their forecast in line with others they would ideally react to the common noise too. Since the common noise is unobservable and cannot be identified, the best the investor can do is react more to their own noisy signal.

Term premium
The real term premium demanded by investor i is again an average of covariances of successive realisations and expectations of the household's stochastic discount factor. The strategic complementarities in the beauty contest have no effect on realisations, which continue to have the autocovariance calculated in Section 4.1. The autocovariance of expectations mirrors that in forecasts of the stochastic discount factor, which inherit the properties of signals by their construction, as does the unconditional mean real term premium. The mean real term premium demanded by an investor depends on how much forecasts react to signals in equilibrium, but strikingly is independent of the particular signal that the investor receives. The reason is the standard capital asset pricing model (CAPM) intuition that the term premium is a compensation for the risk that the investor gives up consumption in a period when marginal utility is high and receives a claim to consumption in a period when marginal utility is low. The risk depends on what is expected to happen to the household's stochastic discount factor, and is compensated for according to our decomposition of the term premium into components that depend on the autocovariances of realisations and expectations. The autocovariance of realisations is common to all households by our assumption of a representative consumer that sets labour supply and decides on consumption and saving.
The autocovariance of expectations is a function of the volatility in expectations and how much an investor expects their forecasts to persist into the future. These are identical for all investors, volatility by construction and persistence because each investor projects their forecasts forward using the persistence parameter that governs the persistent component in their signal. Whether an investor thinks their household's stochastic discount factor is currently high or low, they all project forward in the same way, the autocovariance of expectations is common across investors, and the no-arbitrage real term premium is independent of signals. Whilst investors agree on the real term premium, there is dispersion in their forecasts of the household stochastic discount factor and hence the valuation they place on a bond of a given maturity. Put another way, investors agree about the slope but not the intercept of the yield curve. Focussing on the real term premium allows us to be agnostic about how these differences are reconciled, but to avoid heterogeneity in valuations spilling over into heterogeneity in household income we invoke a no-trade theorem. We make our model concordant with the conditions in Milgrom and Stokey (1982) by recognising that the initial allocation before signals are observed is Pareto efficient and by assuming that it is common knowledge under rational expectations that trades must be mutually acceptable to both investors. There are no insurance or transactional gains to trading bonds in our model, so the only reason to trade would be to take advantage of another investor. The absence of gains to trade means that any such trade must be disadvantageous to the other investor and so cannot take place under rational expectations.
There is no trade in equilibrium and investors hold zero bonds at all maturities.

A more general model
The results with the simple analytical model suggest that real term premia may be sizeable if there is a beauty contest element to the formation of expectations. In this section we show that this still applies when some of the seemingly restrictive assumptions of the simple model are relaxed. Our main innovation is to allow for endogeneity in labour supply, which creates a business cycle in output. This is done by specifying a form for household preferences that incorporates the disutility of labour and recognises that the household's coefficient of relative risk aversion is not necessarily equal to one. We further generalise by working with real term premia at horizons of up to 20 quarters and by using the exact stochastic discount factor rather than its log-linear approximation. The other features of the simple analytical model are retained, in particular that a no-trade theorem prevents non-zero bond holdings. Sections 5.1-5.4 outline the general model and present its quantitative implications through a series of numerical examples, which sets the scene for taking the model to data in Section 5.5.

Model
The economy consists of heterogeneously-informed households and a representative firm. The firm produces according to

Y_t = A_t L_t^{1−α},

and maximises profits by demanding labour L_t until the marginal product of labour is equal to the wage rate. The household is divided into a representative consumer and a representative investor, the latter receiving the same noisy signal of technology as they did in the simple analytical model. The representative consumer has period utility

U(c_t, l_t) = (c_t − χ₀ l_t^{1+χ}/(1 + χ))^{1−σ} / (1 − σ),

following Greenwood et al. (1988). Commonly referred to as GHH preferences, this functional form implies that labour supply is independent of wealth in equilibrium, and so provides a tractable route to modelling realistic fluctuations without recourse to nominal rigidities or labour market frictions (Gertler et al., 2012). 9 The equilibrium conditions are standard, with the household's stochastic discount factor a function of current and future technology, with coefficients γ_1 and γ_2 that are functions of the parameters of the model. The full derivation of equilibrium conditions and the stochastic discount factor is presented in Appendix A.2.
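The wealth-independence of labour supply under GHH preferences can be seen directly. A minimal sketch, assuming the utility takes the standard GHH form with disutility weight χ₀ kept explicit (the values below are illustrative):

```python
from math import isclose, log

# Sketch of the GHH labour supply property. Assumption: period utility is
# u(c, l) = (c - chi0 * l**(1 + chi) / (1 + chi))**(1 - sigma) / (1 - sigma),
# so the intratemporal condition equates the wage to the marginal disutility
# of labour, w = chi0 * l**chi, and labour supply is independent of wealth.
chi, chi0 = 0.5, 1.0   # illustrative; chi is the inverse Frisch elasticity

def labour_supply(w):
    # Inverting w = chi0 * l**chi; consumption does not appear.
    return (w / chi0) ** (1 / chi)

# The Frisch elasticity d ln l / d ln w equals 1/chi.
frisch = log(labour_supply(1.1) / labour_supply(1.0)) / log(1.1)
```

Because consumption drops out of the intratemporal condition, there is no wealth effect on hours, which is what delivers realistic fluctuations without nominal rigidities.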

Bond prices and valuations
The price of an n-period bond with full information is

p^{(n)}_t = E[m_{t+1} p^{(n−1)}_{t+1} | I*_t],   with p^{(0)}_t = 1.

The equilibrium price is a function of the current levels of the persistent and transitory components of technology x_t and η_t. Expectations are conditional on the full information set I*_t and taken over the joint distribution of future technology and its components. The risk-neutral price under full information is defined by

p̃^{(n)}_t = E[m_{t+1} | I*_t] E[p̃^{(n−1)}_{t+1} | I*_t].

The valuation p^{(n)}_{i,t} of an n-period bond by investor i when they receive a noisy signal that has common and idiosyncratic noise components is defined analogously, with the investor's forecasts replacing full-information expectations. It is a function of the sum of the persistent technology and noise components in their signal x^n_{i,t} and of the transitory component of technology η_t, both of which are observed by assumption.
The expectation is conditional on investor i's information set I_{i,t} and is defined over the joint distribution of the investor's forecasts of future technology, its components, and the sum of the persistent components in the signal. The corresponding risk-neutral bond valuation p̃^{(n)}_{i,t} separates the conditional expectations in the same way as its full-information counterpart.

Computation
The equations for the pricing and valuing of bonds in Section 5.2 have a recursive structure that enables us to compute term premia up to any desired horizon. Our algorithm has four steps.
1. The exogenous processes for technology and noise are discretised into Markov chains using standard methods.
2. The joint conditional distributions of current and future endogenous variables are discretised using a moment matching procedure.
3. Conditional real term premia are calculated up to any desired maturity by recursively integrating over the joint conditional distributions of exogenous and endogenous variables at successively increasing horizons.
4. The unconditional real term premium is approximated by Monte Carlo simulation.
Steps 1 to 4 deliver an accurate numerical characterisation of the real term premium, provided that enough nodes are used when discretising and that the Monte Carlo simulation has converged. We find that 15 nodes and simulating for 10^6 periods is sufficient. Full details of the algorithm are presented in Appendix A.3.
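The recursive structure in Step 3 can be sketched on a tiny discretised state space. The two-state chain and SDF values below are illustrative placeholders, not the paper's calibration; the recursions are the standard no-arbitrage and risk-neutral pricing conditions:

```python
import numpy as np

# Sketch of the pricing recursion in Step 3 on a discretised state space.
# Assumptions: a two-state Markov chain with transition matrix P and one SDF
# realisation m per node. No-arbitrage and risk-neutral prices satisfy
#   p_n(s)  = sum_s' P[s, s'] * m(s') * p_{n-1}(s'),          p_0 = 1,
#   pt_n(s) = (sum_s' P[s, s'] m(s')) * (sum_s' P[s, s'] pt_{n-1}(s')).
P = np.array([[0.9, 0.1],
              [0.1, 0.9]])
m = np.array([1.01, 0.97])

def bond_prices(P, m, horizon):
    p = np.ones(len(m))
    pt = np.ones(len(m))
    prices, rn_prices = [], []
    for _ in range(horizon):
        p = P @ (m * p)            # no-arbitrage price
        pt = (P @ m) * (P @ pt)    # risk-neutral counterpart
        prices.append(p.copy())
        rn_prices.append(pt.copy())
    return np.array(prices), np.array(rn_prices)

prices, rn_prices = bond_prices(P, m, 20)

# Per-period term premium in yields at each node and maturity.
maturities = np.arange(1, 21)[:, None]
term_premium = (np.log(rn_prices) - np.log(prices)) / maturities
```

The premium is zero at maturity one and, with this persistent chain, negative thereafter because the SDF covaries positively with continuation prices; the sign flips when that conditional covariance flips, in line with the decomposition into realisation and expectation covariances.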

Term premia
The real term premia in the general model are illustrated in Figure 5. The general model has an increasing and concave term structure. A higher degree of strategic complementarity ω is associated with a steeper term structure, as expected from the discussion in Section 4.5.4. Real term premia are below their full information equivalents when ω is low because the reaction of investors to their signal is muted and there is only limited volatility in their forecasts. As ω rises so does the reaction coefficient θ and enough noise enters investors' forecasts that real term premia rise above those in the full information case. The coefficient of relative risk aversion σ raises term premia and amplifies the effects of ω.

Quantitative analysis
The model is now confronted with US data. Our interest is in how much the model can explain average term premia at different maturities. To find out, we estimate the model and derive what its parameter estimates entail for the average real term premium at different horizons. Term premium data enters estimation only at the shortest maturity; instead of fitting the yield curve, we identify ω in the model with strategic complementarities by requiring the cross-sectional and time-series distributions of forecasts implied by the model to be consistent with those of productivity growth forecasts in the Survey of Professional Forecasters.

Data
The sample period for estimation is 1999q1-2017q2. The Survey of Professional Forecasters asks participants at the beginning of each year to forecast the average annual growth in labour productivity over the next ten years. The survey responses are shown in Figure 6, where the solid line is the median of the cross-sectional distribution each period and the dashed lines are the lower and upper quartiles. Not all forecasters respond every time they are asked. Over the 26 years for which data is available, the number of respondents ranges from 21 to 46 with an average of 30.5. Moments of this series are used when estimating the parameters governing forecasts in the model with strategic complementarities.
We calculate moments over the whole period 1992-2017 for which data is available. 10 Consumption is measured as the quarterly sum of real personal consumption expenditure on non-durables and services recorded by the Bureau of Economic Analysis. It is transformed into per capita terms using data on the civilian non-institutional population from the Bureau of Labour Statistics. Labour supply is taken as the quarterly average weekly hours worked in the non-farm business sector reported by the Bureau of Economic Analysis. Consumption and labour supply data are used to estimate the exogenous processes for technology that drive the household's stochastic discount factor.
Data on the term premium provides a benchmark against which models can be judged.
Estimates of the term premium on nominal zero-coupon Treasuries are available from Adrian et al. (2013) for maturities between one and ten years. 11 The premia estimated on yields are converted into premia on bond prices using p^{(n)}_t = exp(−n y^{(n)}_t). The term premium is calculated as the annualised difference between the prices of a bond and its risk-neutral counterpart. High frequency data is averaged where necessary.

Table 1 lists a subset of parameters that are calibrated before estimating the model. The discount factor is set to match the steady-state annualised yield on a four-quarter bond in the model to the average yield on one-year zero-coupon nominal Treasuries in the data. 12 The mean is 2.05% in the data, which includes an extended period when overnight rates were at the effective lower bound. The labour share of income 1 − α is fixed at 61.6%, the mean share of labour compensation in GDP in the data. 13 The parameter χ is equal to the inverse Frisch elasticity of labour supply. It can be shown that Var(ln l_t)/Var(ln c_t) = 1/(1 + χ)² in equilibrium, so χ is chosen such that the relative standard deviation of hours and consumption in the model matches that in detrended data. The value that results lies in the range of common calibrations. The weight on labour disutility χ₀ is calibrated so that 1/3 of the time endowment is spent working.

10 Calculating moments from 1997 to match the sample period for estimation does not lead to significant changes.
11 Adrian et al. (2013) use a regression-based approach to estimate an affine term structure model with five pricing factors. Their frequently-updated estimates are available at https://www.newyorkfed.org/research/data_indicators/term_premia.html.

Table 1: Calibrated parameters
12 Matching the steady-state yield on a real bond in the model to the average yield on a nominal bond in the data implicitly assumes that the inflation compensation component in yields is negligible at this maturity. This is reasonable for four-quarter bonds, especially given the low and stable inflation over the sample period.

Estimation procedure
The model parameters to be estimated relate to the exogenous processes for technology and noise, the degree of strategic complementarities and the coefficient of relative risk aversion. We estimate them using a method-of-moments procedure that fits the model to moments of the data from consumption and the Survey of Professional Forecasters. The process for technology in the model is independent of the degree of strategic complementarities or risk aversion, so its parameters can be estimated independently. The remaining parameters can then be estimated conditional on the process fitted to technology.
There are three parameters to estimate in the exogenous process for technology: the variance of the transitory component and the innovation variance and persistence of the persistent component. We collect these in the vector Φ_1 = (ρ, σ_η, σ_ε). The properties of the technology process are inherited by consumption through an equilibrium relationship that depends only on calibrated parameter values, so we can estimate the parameters of interest from the moments of the detrended consumption data. Our method-of-moments estimator targets the variance and autocovariance of consumption and the variance of the consumption growth rate by minimising the sum of squared deviations of the model from these moments. We are able to match the moments almost perfectly with a computationally efficient procedure that exploits closed form expressions to calculate the moments of detrended consumption in the model. 14

Having estimated Φ_1 we proceed to estimate Φ_2 = (σ_ξ, σ_ζ, ω, σ): the innovation variances of common and idiosyncratic noise, the degree of strategic complementarity, and the coefficient of relative risk aversion. Three moments from the Survey of Professional Forecasters are targeted.

13 The share of labour compensation in GDP is taken from the Penn World Table 9.0 and has declined from around 64% at the beginning of the sample to values close to 60% from 2009 onwards. We detrend when necessary using the Hodrick-Prescott filter with λ = 1600 for quarterly data.
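The closed-form moment calculations behind the estimation of Φ_1 can be sketched as follows, under the hypothetical simplification that detrended consumption equals technology itself; the paper's equilibrium mapping adds scaling coefficients, but the moment logic is the same:

```python
import numpy as np

# Closed-form moment sketch for estimating Phi_1, assuming (hypothetically)
# detrended consumption c_t = x_t + eta_t, with x_t AR(1) (rho, sig_eps)
# and eta_t i.i.d. (sig_eta).
def consumption_moments(rho, sig_eta, sig_eps):
    var_x = sig_eps**2 / (1 - rho**2)
    var_c = var_x + sig_eta**2      # Var(c_t)
    acov_c = rho * var_x            # Cov(c_t, c_{t-1})
    var_dc = 2 * (var_c - acov_c)   # Var(c_t - c_{t-1})
    return var_c, acov_c, var_dc

rho, sig_eta, sig_eps = 0.95, 0.01, 0.005   # illustrative values
var_c, acov_c, var_dc = consumption_moments(rho, sig_eta, sig_eps)

# Cross-check the closed forms by simulation.
rng = np.random.default_rng(1)
T = 500_000
x = np.zeros(T)
for t in range(1, T):
    x[t] = rho * x[t - 1] + sig_eps * rng.standard_normal()
c = x + sig_eta * rng.standard_normal(T)
```

Having the targeted moments in closed form is what makes the estimator computationally efficient: no simulation is needed inside the minimisation loop.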
The first two are the variance and autocovariance of the median forecast γ̂^{50}_t of technology growth over the next ten years, which prove highly informative for the estimates of ω, σ_ξ and σ_ζ because these parameters affect the amount of noise incorporated in forecasts. The third moment to target is the mean interquartile range of forecasts γ̂^{75}_t − γ̂^{25}_t, which helps in allocating the volatility in noise between its idiosyncratic and common components.
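The three targeted moments can be sketched from a panel of forecasts. The synthetic common and idiosyncratic pieces below are placeholders, not the survey data:

```python
import numpy as np

# Sketch of the three targeted SPF moments, computed from a synthetic panel
# of forecasts (T periods x N forecasters): the variance and first
# autocovariance of the cross-sectional median, and the mean interquartile
# range.
rng = np.random.default_rng(2)
T, N, rho = 120, 30, 0.9
common = np.zeros(T)
for t in range(1, T):
    common[t] = rho * common[t - 1] + 0.1 * rng.standard_normal()
panel = common[:, None] + 0.2 * rng.standard_normal((T, N))

median = np.median(panel, axis=1)
q25, q75 = np.percentile(panel, [25, 75], axis=1)

var_median = median.var()
dev = median - median.mean()
acov_median = np.mean(dev[1:] * dev[:-1])
mean_iqr = np.mean(q75 - q25)
```

The median moments load on the common component while the interquartile range loads on the idiosyncratic one, which is what separates σ_ξ from σ_ζ in estimation.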
The moments from the Survey of Professional Forecasters are not informative about the coefficient of relative risk aversion. We therefore target one additional moment, the mean term premium on a one-year nominal Treasury. 15 This does not mean that we are targeting the whole term structure in estimation: the coefficient of relative risk aversion is allowed to target only the smallest term premium in the data, a single point on the yield curve, so it is a legitimate test of the model to ask whether it can explain term premia at longer horizons.
The procedure yields an estimate of σ that is within the range typically considered.
The method-of-moments estimator of Φ 2 minimises a weighted sum of squared deviations of the model from the four targeted moments. The weighting matrix is constructed in the standard way as the inverse of a bootstrap estimate of the variance-covariance matrix of the moments of the Survey of Professional Forecasters. It implies that less weight is placed on matching the mean interquartile range, since that moment is measured with less precision than the others. 16
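The bootstrap construction of the weighting matrix can be sketched as follows; a synthetic three-column series stands in for the SPF data, and the draw count is reduced from the paper's 100,000 for illustration:

```python
import numpy as np

# Sketch of the bootstrap weighting matrix: resample the data with
# replacement, recompute the moment vector on each draw, and invert the
# resulting variance-covariance matrix of the moments.
rng = np.random.default_rng(3)
data = rng.standard_normal((120, 3))   # placeholder series

def moment_vector(d):
    return np.array([d[:, 0].var(), d[:, 1].var(), d[:, 2].mean()])

B = 2_000                              # paper uses 100,000 draws
draws = np.empty((B, 3))
for b in range(B):
    idx = rng.integers(0, len(data), len(data))   # resample rows
    draws[b] = moment_vector(data[idx])

V = np.cov(draws, rowvar=False)   # bootstrap variance of the moments
W = np.linalg.inv(V)              # method-of-moments weighting matrix
```

A moment estimated less precisely gets a larger bootstrap variance and hence a smaller weight, which is exactly why the imprecisely measured interquartile-range moment receives less weight in the estimation.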

Estimation results
The parameter estimates are presented in Table 2. 17 The estimated model generates dispersion in forecasts, but not as much as in the data, although that in part is due to the low weight that is placed on matching the relatively imprecise estimate of the interquartile range that comes from the data.

15 Targeting the average term premium on a nominal Treasury again abstracts from inflation compensation in yields, this time by implicitly assuming that there is no inflation risk premium in the data at this maturity.
16 The estimate of the variance-covariance matrix of the sample moments is based on 100,000 bootstrapped draws of the data. The weight on the fourth moment is not relevant because the coefficient of relative risk aversion does not affect the moments of the forecast of technology generated by the model. It only affects the term premium on a one-year nominal Treasury, so the fourth moment can be matched perfectly irrespective of its weight in the estimation.
17 The estimate of ω is at the upper bound that the model can support in equilibrium. As explained in Section 4.5.4, this is the maximal strategic complementarity that permits existence of equilibrium with θ > 0 for the estimated technology and noise processes.
The model is able to match the mean term premium on a one-year nominal Treasury perfectly by suitable choice of the coefficient of relative risk aversion.
The final four rows in Table 3 assess the fit of the model to moments that are not targeted in estimation. The model delivers an annualised real term premium of 14.9, 19.2, 22.0 and 23.8 bps at 2, 3, 4 and 5 year horizons, between 40% and 70% of the corresponding annualised term premia in the data. To understand the success in this dimension we add two more models in the fourth and fifth columns of Table 3. The model with full information takes the estimates from Table 2 and assumes that households observe the current level of technology and its decomposition into persistent and transitory components. Making technology observable renders estimates of the noise processes and the degree of strategic complementarities irrelevant. The model with ω = 0 also adopts the parameter estimates in Table 2, but abstracts from strategic complementarities by removing the incentive for forecasters to coordinate their forecasts. The informational assumptions are otherwise as in the estimated model.
The term premia in the full information and ω = 0 models fall badly short of those in the data and the estimated model. The level of the term premia can be brought closer to the data by estimating the alternative models rather than taking the estimates from Table 2, but doing so comes at the cost of significantly higher estimates for the coefficient of relative risk aversion.
The reason why the full information and ω = 0 models fail is their inability to generate sizeable variance and autocovariance in forecasts of technology. Forecasts are anchored to technology in the full information model and to rational expectations in the ω = 0 model, neither of which leaves much scope for a large expectations component in term premia. The variances and autocovariances in these models are several orders of magnitude less than in the Survey of Professional Forecasters. It is only in the estimated model with strategic complementarities that movements in forecasts are sufficient to rationalise the term premia observed at longer horizons.
Forecasts are not very dispersed in the ω = 0 model and by definition not dispersed at all in the full information model. The lack of heterogeneity in forecasts contributes to the low real term premia in these models, but it cannot explain why their term structures are so flat. The dispersion of forecasts is low in the estimated model, yet it still supports a term structure that is clearly upward-sloping.

Conclusion
Our findings stress the importance of informational assumptions in models of the term structure of interest rates, as starkly visible in Figure 7 when comparing the term structure in our models to estimates from US data. The estimated model exactly matches the term premium at the short end since it is targeted in estimation, but in doing so the data prefers strategic complementarities over an unrealistically high coefficient of relative risk aversion. With these strategic complementarities in place, the estimated model can explain a significant proportion of term premia at all maturities, even though they have not been targeted. The full information model is singularly unable to do this with realistic levels of risk aversion, failing to deliver either sizeable term premia or a term structure that is upward sloping.
The success of the estimated model is driven by a beauty contest in forecasting, which rewards investors for being not only accurate but also close to the average forecast of others.
Even though we discipline the strength of the beauty contest by requiring the distribution of forecasts in the model to be consistent with that in the Survey of Professional Forecasters, we are still able to justify 40% of the term premium in the data at the 5-year horizon. The remaining 60% may be due to inflation risk premia or stronger strategic complementarities than can be supported in equilibrium in our models.

Figure 7: Term premia in the data and in the models

The general decomposition of the per-period real term premium in bond prices (9) into components depending on realisations and expectations follows as Proposition 1 for the real term premium and Corollary 1 for the mean real term premium.
Proposition 1. The real term premium ψ^{(n)}_t decomposes into a component that depends on covariances of realised stochastic discount factors and a component that depends on covariances of expectations of stochastic discount factors.

The household chooses consumption, labour supply and bond holdings to maximise expected utility, where I*_t is their information set. The first order conditions are standard. Profit maximisation by the firm sets w_t = (1 − α)A_t L_t^{−α}, bonds are in zero net supply, and all markets clear. Combining the first and third first order conditions gives w_t = l_t^χ, and consumption drops out of the intratemporal condition for labour supply, as expected with GHH preferences.

A.3 Computational algorithm
Before describing the algorithm, it is necessary to identify the joint conditional distributions of current and future endogenous variables that investors use to price the term premium. Under full information these are the usual conditional distributions under rational expectations that can be characterised by standard signal extraction results. Things are more involved when information is heterogeneous and there are strategic complementarities, because investors price the term premium according to forecasts rather than rational expectations. The key determinant of forecasts is the conditional distribution of investor i's forecast of the current persistent component of technology, conditional on the signal they observe. The first moment of this conditional forecast is pinned down by the linear reaction of the forecast to the signal, but our procedure for setting forecasts is silent on its conditional second moment. We therefore fix the second moment at what it would be under rational expectations.
1. The univariate exogenous processes for z_t where z ∈ {x, η, x^n_i, ε, ξ + ζ_i} are approximated by finite-state Markov chains with Ñ nodes using the Tauchen (1986) procedure. The vector of nodes x̄_z = (x̄_{z,1}, x̄_{z,2}, . . . , x̄_{z,Ñ}) is chosen so that the extremes x̄_{z,Ñ} = −x̄_{z,1} cover a multiple of the standard deviation of the unconditional distribution of z_t. The remaining nodes partition [x̄_{z,1}, x̄_{z,Ñ}] into equispaced intervals. The transition probabilities between nodes follow from the normal distribution of z_t.

To achieve precision in computations we set Ñ = 15, which is significantly larger than the values typically used in the literature. The extremes of the vector of nodes in Step 1 are at ±7 standard deviations for all the exogenous processes, with the exception of x^n_i for which a lower span of ±2 standard deviations is imposed to guarantee that the conditional forecast θx^n_{i,t} remains in [x̄_{x,1}, x̄_{x,Ñ}] even for large θ. Figure 8 shows the Euclidean distance between discretised and theoretical moments in Step 2 for different values of ω. At all levels of strategic complementarity the mean squared distance on the vertical axis lies between 10^{−5} and 10^{−8}, implying a high degree of accuracy when approximating the x_t | x^n_{i,t} distribution.
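Step 1 can be sketched with a compact implementation of the Tauchen (1986) procedure; the defaults follow the appendix's choices of Ñ = 15 and a ±7 standard deviation span:

```python
import numpy as np
from math import erf, sqrt

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + erf(x / sqrt(2.0)))

def tauchen(rho, sigma, n=15, span=7.0):
    """Tauchen (1986) discretisation of z_t = rho * z_{t-1} + sigma * e_t.

    Sketch implementation: nodes cover +/- span unconditional standard
    deviations (span=2 would be used for the x^n_i process).
    """
    sd_z = sigma / np.sqrt(1 - rho**2)
    nodes = np.linspace(-span * sd_z, span * sd_z, n)
    step = nodes[1] - nodes[0]
    P = np.zeros((n, n))
    for i in range(n):
        mu = rho * nodes[i]   # conditional mean from node i
        for j in range(n):
            if j == 0:
                P[i, j] = phi((nodes[0] - mu + step / 2) / sigma)
            elif j == n - 1:
                P[i, j] = 1.0 - phi((nodes[-1] - mu - step / 2) / sigma)
            else:
                P[i, j] = (phi((nodes[j] - mu + step / 2) / sigma)
                           - phi((nodes[j] - mu - step / 2) / sigma))
    return nodes, P

nodes, P = tauchen(0.95, 0.01)
```

Each row of P assigns the conditional normal distribution's mass to the interval around each node, with the tails absorbed into the extreme nodes, so rows sum to one by construction.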