Botond Kőszegi, Filip Matějka, Choice Simplification: A Theory of Mental Budgeting and Naive Diversification, The Quarterly Journal of Economics, Volume 135, Issue 2, May 2020, Pages 1153–1207, https://doi.org/10.1093/qje/qjz043
Abstract
We develop a theory of how an agent makes basic multiproduct consumption decisions in the presence of taste, consumption opportunity, and price shocks that are costly to attend to. We establish that the agent often simplifies her choices by restricting attention to a few important considerations, which depend on the decision at hand and affect her consumption patterns in specific ways. If the agent’s problem is to choose the consumption levels of many goods with different degrees of substitutability, then she may create mental budgets for more substitutable products (e.g., entertainment). In some situations, it is optimal to specify budgets in terms of consumption quantities, but when most products have an abundance of substitutes, specifying budgets in terms of nominal spending tends to be optimal. If the goods are complements, in contrast, then the agent may—consistent with naive diversification—choose a fixed, unconsidered mix of products. And if the agent’s problem is to choose one of multiple products to fulfill a given consumption need (e.g., for gasoline or a bed), then it is often optimal for her to allocate a fixed sum for the need.
I. Introduction
Individuals and households must make myriad decisions on how to allocate money in the face of many competing uses and a barrage of relevant information. A central part of Thaler’s (1985, 1999) influential framework of mental accounting proposes that to help solve such allocation problems, individuals create different “mental budgets” for different purposes (entertainment, clothing, etc.), and treat these budgets as separate when responding to changes in circumstances. Yet despite the intuitive appeal of and empirical support for the concept, there is no theory that explains how a person creates separate mental budgets from fungible finances, and how this process interacts with her reactions to shocks.
In this article, we formulate a theory of expenditure allocation based on the premise that a person’s attention is costly as well as flexible, so that she is motivated to both economize on attention and direct it toward important issues. We show that as a result, the person often engages in choice simplification: she restricts attention to a few considerations customized to be most useful for the decision at hand. The way she simplifies in turn affects her consumption patterns in specific ways, allowing us to explain mental budgeting, identify a connection between mental budgeting and naive diversification—a phenomenon that has hitherto been treated separately—and make other predictions.
After illustrating the logic of our results in a simple example in Section II, in Section III we develop tools for analyzing the effects of costly attention on decision making when an agent’s action and information are multidimensional. We extend the rational-inattention approach of Sims (2003) using the water-filling algorithm from information theory (Telatar 1999; Cover and Thomas 2006) to show that the agent establishes a pecking order of information vectors and tilts her attention toward the more important vectors, potentially even ignoring the least important ones. Beyond consumption problems, our general methods are likely to apply to many economic situations, such as how individuals digest complex information about the economy or form political opinions.
In Section IV.A, we turn to our main topic, consumption decisions with costly attention. We analyze how a person allocates expenditure when she faces independently distributed shocks to her preferences or consumption opportunities for (but not the prices of) different products, and she can reduce any aspect of that uncertainty through costly attention. We assume that the goods can be grouped into nested consumption categories and, first considering the case of substitutes, we posit that they are more substitutable within than between categories. For instance, a restaurant dinner and a play could both be in the “entertainment” category under the larger category of “discretionary spending,” with the two being more substitutable with each other than either is with products outside the entertainment category.
Our main result says that the agent often behaves as if she had separate mental budgets for separate categories: (i) consumption in a category is independent of shocks to other categories, and (ii) total consumption is unresponsive, but individual consumption levels are smoothly responsive, to shocks within the category. In a classical consumption problem, (i) holds only if utility is separable across categories—which our model does not assume—and (ii) does not hold for any utility function we could think of. Intuitively, the most relevant consideration for the agent to think about is which of multiple highly substitutable products are worth buying, so if she has sufficiently costly attention, she simplifies her decision making by thinking only about this consideration. As a result, she does not think about shocks to the optimal level of consumption, and hence her budget is fixed. Even if her attention cost is lower and therefore she does not have a hard budget, her spending in a category varies less than with full information, so she can be interpreted as having a soft budget.
Our budgeting result helps explain evidence that many individuals and households separate expenditures into budgetary categories (Rainwater et al. 1959; Kahneman and Tversky 1984; Lave 1995; Ameriks, Caplin, and Leahy 2003; Antonides, De Groot, and Van Raaij 2011), and makes the novel prediction that products are grouped into mental budgets according to their substitutability. Through a simple reinterpretation, our theory predicts that individuals may use budgeting strategies for other types of decisions, for instance, allocating separate time budgets for substitute tasks.
We illustrate that mental budgeting can interact in an economically interesting way with budget constraints. Even more than an unconstrained agent, a budget-constrained agent may prefer not to think about how much to consume in total, leading her to mentally budget. Furthermore, if her budget constraint is relatively tight, her mental budget exhausts all of her available funds—despite lower consumption being optimal with some probability. For budget-constrained individuals, therefore, costly attention increases consumption as well as the marginal propensity to consume out of increases in available funds.
An entirely different prediction emerges when we assume that the products are complements, and (paralleling the case of substitutes) they are more complementary within than between categories. Because the optimal consumption levels of complementary products tend to move together, the agent may now simplify her choice by not thinking about her relative values for products at all, only about how much she should consume in total. Hence, she may choose a fixed, unconsidered mix of products. Furthermore, such an unconsidered mix can also be optimal for substitute products if the agent’s preferences for the products are sufficiently positively correlated. We argue that these predictions are consistent with the phenomenon of naive diversification in financial (Benartzi and Thaler 2001, 2007) and consumption (Simonson 1990) decisions. Suppose, for instance, that the products are funds in an employer-based retirement program, and they look ex ante identical to the agent (e.g., because she knows nothing about her values for the individual funds to start with). If the funds invest in different assets, then they are complements because buying them jointly serves diversification purposes; if the funds invest in similar assets, then the agent’s values for them are highly correlated. In either case, the agent may follow the |$\frac{1}{N}$| rule, investing equal amounts in the available funds. Mental budgeting and naive diversification can therefore be viewed as solutions to the same type of decision-making problem that apply in different circumstances.
In Section V, we ask whether the agent still wants to set budgets for substitute products when there are price shocks, and whether she prefers budgets expressed in quantities of consumption or amounts of spending. Accordingly, we allow the agent to make and execute plans in two different ways: she can choose the quantity of consumption for each product, or she can choose the amount of spending on each product. Although these two ways of thinking are equivalent in a classical consumer problem in which prices are known, in our framework and with price uncertainty—in which the agent may not fully learn prices before making decisions—they are not equivalent. We establish that thinking in terms of spending is optimal whenever optimal total consumption is sufficiently price sensitive, or there are sufficiently many substitute products in a category. Intuitively, fixing the amount to be spent on a product means that consumption responds to unforeseen changes in the product’s price, and this is optimal if an average of the relevant optimal price elasticities (of substitution and total consumption) is sufficiently high. Furthermore, we show that under reasonable conditions, a consumer who thinks in terms of spending sets spending budgets. These results explain the prevalence of spending budgets as well as the greater prevalence of spending budgets among (generally more price-sensitive) lower-income households, but they also predict consumption budgets in some plausible circumstances. For instance, a high-income consumer may set an entertainment budget in terms of nights out per month.
In Section VI, we consider a variant of our model in which the agent has unit demand for each product—for example, she needs a single mattress or computer to replace her old one or a given amount of gasoline to drive that month—but has multiple versions of each product to choose from. Similar to the above, we ask whether deciding the version of the product (e.g., the grade of gasoline) to buy or the amount to spend is optimal for a consumer who does not process all price information before making decisions. We show that thinking in terms of spending is optimal if and only if product prices are on average sufficiently positively correlated with premiums for better products. If prices and premiums are uncorrelated, then it is optimal for the agent to absorb a price increase fully by increasing spending. If prices and premiums are highly correlated, however, an increase in prices greatly increases the average marginal price of increasing quality, so it is better to lower spending back to the original level. Thinking in terms of spending implies, in line with evidence by Hastings and Shapiro (2013) on gasoline purchases, that when prices for all varieties of a product rise, the agent switches to a lower-priced variety. Although in Hastings and Shapiro’s specific setting the price and price premium are not positively correlated, our explanation applies if such situations are sufficiently uncommon in consumers’ lives and consumers do not think about the specific setting separately.
In Section VII, we discuss how our model relates to existing theories. Whereas previous work explores another central aspect of mental budgeting, self-control problems (Shefrin and Thaler 1988; Galperti 2019), we are the first to explain how a person creates mental budgets from fungible finances and the first to formally connect mental budgeting and naive diversification. In Section VIII, we note that our model does not distinguish between, and therefore cannot explain differential consumption responses to, different but fully fungible sources of income. We argue, however, that closely related attention-based models can potentially explain some of these phenomena. In addition, we emphasize that it would be fruitful to study the interaction between our framework and related phenomena, especially self-control problems and loss aversion.
II. Example
In this section, we illustrate the logic of our main insights using a simple example.
II.A. Setup
Before choosing y1 and y2, the agent can observe exactly one of x1, x2, x1 − x2, and x1 + x2: she can think about her taste for one of the goods or her relative or total taste for the two goods. We ask: what does she optimally choose to think about, and how does this affect her consumption?
II.B. Solution: Mental Budgeting versus Naive Diversification
Optimal information acquisition is now obvious from how information affects the conditional variances of x− and x+. If the products are substitutes (i.e., θ > 0, and therefore |$\frac{1}{1-\theta } > \frac{1}{1+\theta }$|), the agent chooses to observe x− because she wants to know her relative taste for the two products. Since x− and x+ are independent, observing x− provides no information about x+, so |$y_+ =\frac{2(\overline{x}- 1)+ E[x_+]}{1+\theta } = \frac{2(\overline{x}- 1)}{1+ \theta }$|. This means that y1 + y2 is constant: the agent has a fixed budget determined by her average taste |$\overline{x}$| for the products. Because |$y_- = \frac{x_-}{1-\theta }$|, however, the consumption levels y1 and y2 are not fixed—the agent does respond to changes in circumstances, but not by changing her total budget.
If the products are complements (i.e., θ < 0, and therefore |$\frac{1}{1-\theta } < \frac{1}{1+\theta }$|), then the agent chooses to observe x+—she wants to know her total taste for the products. As a result, she learns nothing about x−, so |$y_- = \frac{E[x_-]}{1-\theta } = 0$|. This means that y1 = y2: the agent naively diversifies, always choosing the goods in equal proportion. Since |$y_+ = \frac{2(\overline{x}-1) + x_+}{1+\theta }$|, however, the consumption levels y1 and y2 are not fixed—the agent does think about the problem, but not by changing the ratio in which she buys the products.
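The ranking of the four signals can be checked numerically. The sketch below assumes the quadratic utility U = (x̄ + x1)y1 + (x̄ + x2)y2 − y1²/2 − y2²/2 − θy1y2 − (y1 + y2) with i.i.d. tastes of variance σ². This functional form is a reconstruction that reproduces the expressions for y+ and y− above; it is not quoted from the text. Under that assumption, the expected gain from perfectly observing a signal a · x is half the trace of Θ⁻¹ times the variance of the posterior mean. The function and variable names are illustrative.

```python
import numpy as np

def signal_values(theta, sigma2=1.0):
    """Expected utility gain from perfectly observing each candidate signal."""
    Theta = np.array([[1.0, theta], [theta, 1.0]])   # assumed curvature matrix
    Theta_inv = np.linalg.inv(Theta)
    prior = sigma2 * np.eye(2)                        # x1 and x2 are i.i.d.
    signals = {
        "x1": np.array([1.0, 0.0]),
        "x2": np.array([0.0, 1.0]),
        "x-": np.array([1.0, -1.0]),
        "x+": np.array([1.0, 1.0]),
    }
    values = {}
    for name, a in signals.items():
        # Variance of the posterior mean E[x | a.x] under a Gaussian prior.
        pa = prior @ a
        var_post_mean = np.outer(pa, pa) / (a @ pa)
        values[name] = 0.5 * np.trace(Theta_inv @ var_post_mean)
    return values

for theta in (0.5, -0.5):
    vals = signal_values(theta)
    best = max(vals, key=vals.get)
    print(f"theta={theta:+.1f}: best signal is {best}")
```

With substitutes (θ > 0) the relative taste x− is the most valuable signal, and with complements (θ < 0) the total taste x+ is, in line with the comparison of 1/(1 − θ) and 1/(1 + θ) above.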
II.C. Other Implications
We substantially generalize the insights above and derive other predictions in Section IV.A, and study the implications of price uncertainty in Section V. We use variants of our simple model to make a few further points, which we do not reconsider in a more general setting. First, our model assumes that x1 and x2 are independent. If x1 and x2 are positively correlated, then var[x+] > var[x−], which by equation (4) increases the value of observing x+. Hence, in this case naive diversification is more likely to occur. Intuitively, if the tastes for two products are highly positively correlated, the consumer is unlikely to learn much from thinking about which one she likes, so she chooses to think about her total taste. Conversely, a negative correlation between x1 and x2 increases the value of observing x−, increasing the tendency toward budgeting.
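The effect of correlated tastes can be made concrete with a small sketch. Equation (4) is not reproduced in this excerpt, so the code assumes that the value of observing x+ is proportional to var[x+]/(1 + θ) and the value of observing x− is proportional to var[x−]/(1 − θ), which matches the comparison of 1/(1 + θ) and 1/(1 − θ) in Section II.B. With correlation ρ between x1 and x2 and a common variance, var[x+] is proportional to 1 + ρ and var[x−] to 1 − ρ. The function name is hypothetical.

```python
def prefers_total_taste(theta, rho):
    """True if observing x+ is worth more than observing x- (assumed value formula)."""
    value_plus = (1 + rho) / (1 + theta)    # proportional to var[x+] / (1 + theta)
    value_minus = (1 - rho) / (1 - theta)   # proportional to var[x-] / (1 - theta)
    return value_plus > value_minus

# Substitutes (theta = 0.3): independence leads to budgeting,
# strong positive correlation leads to naive diversification.
print(prefers_total_taste(theta=0.3, rho=0.0))
print(prefers_total_taste(theta=0.3, rho=0.5))
```

Under these assumptions the inequality simplifies to ρ > θ: an unconsidered mix arises for substitutes only when the correlation in tastes exceeds the degree of substitutability.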
Second, by treating the disutility of spending money (or, equivalently, the value of saving) as a constant, we have implicitly assumed that the agent knows it or does not want to think about it. Uncertainty in the value of saving affects the disutility of spending on both products equally, so, if the agent can lower the uncertainty through thinking, it is equivalent to a positive correlation between x1 and x2.¹ If the value of saving is highly uncertain, therefore, the previous point implies that our budgeting result fails. In this sense, figuring out one’s value of saving to a point where one no longer wants to think about it much is a precursor to budgeting.
Third, consider also what happens when the goods are substitutes, and the agent has a relatively tight budget constraint |$y_+ \le y_+^{\max } \le \frac{2(\overline{x}-1)}{1+ \theta }$|. Without information, the constraint would be binding, with the agent choosing |$y_+ = y_+^{\max }$| and y− = 0. Because observing x− is only useful for choosing y−, the constraint, which does not restrict y−, leaves the value of observing x− unchanged. In contrast, since observing x+, x1, or x2 is useful for choosing y+, the constraint, which prevents increases in y+ in response to news, decreases the value of observing any of these variables. Hence, the agent still prefers to observe x−, and her total consumption is |$y_+ = y_+^{\max }$|, that is, she always exhausts her spendable funds. Intuitively, while consuming less might be optimal, thinking about this is less valuable than thinking about how to split her spendable funds between the goods. When the budget constraint is relatively tight, therefore, limited attention increases consumption.
Furthermore, notice that the agent spends a marginal increase in available funds if and only if her optimal unconstrained consumption exceeds available funds. Hence, her average marginal propensity to consume out of increases in available funds equals the probability with which her budget constraint binds. As a result, limited attention also increases the average marginal propensity to consume from as low as |$\frac{1}{2}$| to 1.²
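The full-information benchmark behind this comparison can be sketched as follows. The code assumes i.i.d. normal tastes with variance σ², so that x+ is normal with variance 2σ², and uses the unconstrained optimum y+ = (2(x̄ − 1) + x+)/(1 + θ) from Section II.B. The function name and parameter values are illustrative.

```python
from math import erf, sqrt

def full_info_mpc(y_max, theta, x_bar, sigma2=1.0):
    """Probability that the constraint y+ <= y_max binds under full information."""
    # The constraint binds iff the unconstrained optimum is at least y_max,
    # i.e. iff x+ >= (1 + theta) * y_max - 2 * (x_bar - 1).
    threshold = (1 + theta) * y_max - 2 * (x_bar - 1)
    z = threshold / sqrt(2 * sigma2)                 # x+ ~ N(0, 2 * sigma2)
    return 1 - 0.5 * (1 + erf(z / sqrt(2)))          # 1 - standard normal CDF at z

theta, x_bar = 0.5, 2.0
loosest = 2 * (x_bar - 1) / (1 + theta)              # upper end of the "tight" range
for y_max in (loosest, 1.0, 0.5):
    print(f"y_max={y_max:.3f}: full-information MPC = {full_info_mpc(y_max, theta, x_bar):.3f}")
```

At the loosest constraint in the tight range the probability is one half, and it rises as the constraint tightens. An agent who observes only x− always exhausts her funds, so her marginal propensity to consume is 1 throughout.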
Fourth, budget constraints can interact with attention costs. To illustrate, we suppose that acquiring a second signal from the set {x1, x2, x−, x+} has a positive rather than infinite marginal cost. Once the agent knows x−, any other signal perfectly reveals x1 and x2, so the other signals are equally valuable. Because the budget constraint does not affect the value of learning only x− but lowers the value of learning both x1 and x2, it can lead the agent not to acquire a second signal. In this case, the budget constraint induces budgeting, as well as the associated focus on relative tastes, due to costly attention.
III. Theoretical Tools
In this section, we develop a methodology for analyzing rational-inattention models in which, as with mental budgeting and naive diversification, the agent’s information and action are multidimensional.³ Because these tools are potentially applicable to many economic settings, we present them in a general form. We lay out our results on consumption decisions in a self-contained way, so readers not interested in the general tools can skip to Section IV.A.
III.A. Multidimensional Rational Inattention
Before choosing |$\mathbf {y}$|, the agent can obtain any Gaussian signal about |$\mathbf {x}$|. The resulting posterior beliefs are also Gaussian, with the agent being able to choose the posterior variance-covariance matrix Σ subject to the constraint that ψ − Σ is positive definite, that is, that the posterior is more precise than the prior. Denoting by |·| the determinant of a matrix, we posit that the cost of information is |$\frac{\lambda }{2} \cdot (\log |\psi | - \log |\Sigma |)$|, where λ ≥ 0 is the agent’s attention cost. This specification of decision making with costly attention is the reduced form of a general rational-inattention model in which the agent (instead of being restricted to a Gaussian signal) can choose any signal at a cost equal to the reduction in the entropy of her beliefs.⁴
As in the previous literature, there are three main reasons for using the entropy-based functional form for attention costs. First, it is highly tractable. Second, it has the basic property that information is costly (if the agent learns |$\mathbf {x}$| more precisely, then |Σ| is lower, and therefore |$\frac{\lambda }{2} \cdot (\log |\psi | - \log |\Sigma |)$| is higher). Third, it implies that all information has the same cost—what matters is the amount of uncertainty reduction, not what the uncertainty is about—so it can be viewed as ideal for studying information acquisition based on endogenous considerations about the benefits of information, not based on exogenous assumptions about the costs of information.
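A brief numerical sketch illustrates the third property. It evaluates the cost |$\frac{\lambda }{2} \cdot (\log |\psi | - \log |\Sigma |)$| for two posteriors that each halve the variance along a single direction, one along x1 and one along x1 + x2. The prior, the directions, and the function name are illustrative choices.

```python
import numpy as np

def attention_cost(prior_cov, posterior_cov, lam):
    """Entropy-based cost (lambda / 2) * (log|psi| - log|Sigma|)."""
    _, logdet_prior = np.linalg.slogdet(prior_cov)
    _, logdet_post = np.linalg.slogdet(posterior_cov)
    return 0.5 * lam * (logdet_prior - logdet_post)

prior = np.eye(2)
v = np.array([1.0, 1.0]) / np.sqrt(2.0)
halve_x1 = np.diag([0.5, 1.0])               # learn about x1 only
halve_sum = prior - 0.5 * np.outer(v, v)     # learn about x1 + x2 only
print(attention_cost(prior, halve_x1, lam=1.0))
print(attention_cost(prior, halve_sum, lam=1.0))
```

Both posteriors cost the same, because only the amount of uncertainty reduction enters the cost, not the direction in which it occurs.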
At the same time, researchers have raised various concerns about specifying attention costs to be linear in entropy reduction. Woodford (2012) points out that the entropy-based cost function fails to predict the finding from perceptual experiments that subjects make smaller errors in more likely states. Dean and Neligh (2017) find that experimental subjects’ behavior is consistent with a cost function that is convex in entropy reduction. Similarly, Morris and Strack (2017) establish that a constant marginal cost of signals in sequential information-acquisition problems corresponds to a convex entropy-based cost function. Accordingly, theoretical work generalizes the entropy-based cost function to allow for differences in comparison costs across versus within nests of products (Fosgerau, Melo, and Shum 2017), on different dimensions of the state space (Pomatto, Strack, and Tamuz 2019), and for nearby versus distant states (Morris and Yang 2016), and Caplin and Dean (2015) study a broader class of cost functions called posterior separable. With alternatives going beyond entropy-based costs, our decision problem would be difficult or impossible to analyze. Such extensions would add the consideration that the agent is more likely to obtain less costly information, but there is no obvious sense in which they might undermine the logic of our results.
III.B. Optimal Information Acquisition and Actions
1. Decomposition into One-Dimensional Problems
Proposition 1. The optimal information-acquisition strategy is to acquire independent signals of vi · x such that the posterior variance of vi · x is |$\min \lbrace \sigma _0^2, \frac{\lambda }{2\Lambda _{i}}\rbrace$|.
Intuitively, the agent processes more information about vectors in the space of x that are more costly to misestimate. Specifically, if |$\sigma _0^2 \le \frac{\lambda }{2\Lambda _{i}}$|, then the agent acquires no information about vi · x; and if |$\sigma _0^2 > \frac{\lambda }{2\Lambda _{i}}$|, then she observes a signal about vi · x with precision chosen to bring the posterior variance of vi · x down to |$\frac{\lambda }{2\Lambda _{i}}$|. Hence, when the cost of information λ is high (|$\frac{\lambda }{2\Lambda _{i}}>\sigma _0^2$| for all i), then the agent does not process any information. If the cost is somewhat lower, then the agent processes information about the vi · x with the highest Λi, but she processes no other information. At even lower costs, the agent processes information about vi · x for more i’s, and so on.
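The pecking order described above can be illustrated with a short sketch that applies the posterior-variance formula |$\min \lbrace \sigma _0^2, \frac{\lambda }{2\Lambda _{i}}\rbrace$| direction by direction. The values of Λi, |$\sigma _0^2$|, and λ are hypothetical and chosen only to show the thresholds.

```python
import numpy as np

def posterior_variances(Lambdas, sigma0_sq, lam):
    """Posterior variance of v^i . x for each i: min(sigma0^2, lam / (2 * Lambda_i))."""
    Lambdas = np.asarray(Lambdas, dtype=float)
    return np.minimum(sigma0_sq, lam / (2.0 * Lambdas))

Lambdas = [4.0, 1.0, 0.25]        # importance of each direction, most important first
sigma0_sq = 1.0
for lam in (0.1, 1.0, 4.0, 10.0):
    post = posterior_variances(Lambdas, sigma0_sq, lam)
    attended = int(np.sum(post < sigma0_sq))
    print(f"lambda={lam:5.1f}: posterior variances {post}, directions attended: {attended}")
```

As λ rises, directions drop out of the agent's attention in increasing order of Λi, until no information is processed at all.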
2. Responsiveness of Actions
Next we discuss implications for actions. We show in the Appendix that |${\bf y}= H \tilde{{\bf x}}$|, where |$H= \frac{C^{-1}{B}^{\prime }}{2}$|. We define |$\varphi ^\lambda _i$| as the average change in the action y when x changes in direction vi by 1. We can think of |$\varphi ^\lambda _i$| as the average responsiveness of the agent’s behavior to shocks along vi. Using this notation, the responsiveness under perfect information—when the agent has no attention costs—is |$\varphi ^0_i$|.
(i) The space of actions is spanned by |$\lbrace H v^i| \frac{\lambda }{2\Lambda _{i}}<\sigma _0^2 \rbrace$|.
(iii) In the range |$\Lambda _i>\Lambda _j > \frac{\lambda }{2\sigma _0^2}$|, the relative responsiveness |$\frac{\varphi ^\lambda _i}{\varphi ^\lambda _j}$| is strictly increasing in λ.
Part (i), an immediate implication of Proposition 1, says that the agent’s action moves only in response to information that is sufficiently important to pay attention to. Part (ii) says that the agent underresponds to shocks. Because the agent pays only partial attention to information, on average she does not notice the extent of shocks, so she does not respond as much as an agent with zero attention costs. More interestingly, part (iii) says that with costly attention, optimal behavior calls for concentrating reactions to shocks in directions that are the most important. As a result, the responsiveness to shocks along vi relative to vj is higher than under perfect information if and only if Λi > Λj.
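The underresponse and the concentration of reactions can be illustrated numerically. The sketch rests on an assumption not stated in this excerpt: under standard Gaussian updating the posterior mean moves by the fraction 1 − (posterior variance)/(prior variance) of a shock, so the ratio of |$\varphi ^\lambda _i$| to |$\varphi ^0_i$| is taken to be |$\max \lbrace 0, 1 - \frac{\lambda }{2\Lambda _{i}\sigma _0^2}\rbrace$|. The function name and numbers are illustrative.

```python
def attenuation(Lambda_i, sigma0_sq, lam):
    """Assumed ratio phi_i^lambda / phi_i^0 = 1 - posterior variance / prior variance."""
    post_var = min(sigma0_sq, lam / (2.0 * Lambda_i))
    return 1.0 - post_var / sigma0_sq

Lambda_i, Lambda_j, sigma0_sq = 4.0, 1.0, 1.0   # direction i is more important than j
for lam in (0.2, 0.5, 1.0):
    a_i = attenuation(Lambda_i, sigma0_sq, lam)
    a_j = attenuation(Lambda_j, sigma0_sq, lam)
    print(f"lambda={lam}: attenuation i={a_i:.4f}, j={a_j:.4f}, ratio={a_i / a_j:.4f}")
```

Both attenuation factors lie below one, and their ratio grows with λ while both directions remain attended to, which is the pattern that part (iii) describes.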
IV. Consumption Patterns
We now apply the tools from Section III to analyze how a person attends and responds to taste or consumption opportunity shocks when choosing a consumption basket from many products with different degrees of substitutability or complementarity. This generalizes our example in Section II by allowing for more products and by assuming that the consumer can choose any type and any amount of information (rather than one of four signals). We analyze price shocks in the next section.
To be able to analytically solve and economically interpret our model, we posit the following specific structure for Θ. The goods can be grouped into L ≥ 1 levels of nested categories. The level l = L is the largest category (e.g., discretionary spending), which includes all N goods; the level l = L − 1 is the set of second-largest categories (e.g., entertainment), and so on, with the smallest (l = 1) categories being individual consumption goods (e.g., a dinner out). We denote by |$R^{k,l} \subset \lbrace 1, \dots , N\rbrace$| the consumption category k at level l. We assume that all categories at level l are of the same size (|$|R^{k,l}| = |R^{k^{\prime },l}|$| for all k, k′, l), and that each category at level l < L is a subset of a higher category (for each l < L, k, there is a unique k′ such that |$R^{k,l} \subset R^{k^{\prime },l+1}$|). The substitutability of two goods is determined by the smallest category to which they both belong. For two goods m and n, let l be the smallest l′ such that there is a k with |$m,n \in R^{k,l^{\prime }}$|. Then, Θmn = γl, where γ2 through γL are constants.⁵
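The nested structure can be made concrete with a short construction of Θ. The sketch assumes that the diagonal entries of Θ are normalized to 1, which this passage does not state, and it encodes the hierarchy by the number of level-l categories contained in each level-(l + 1) category. The function name, its arguments, and the numerical values are hypothetical.

```python
import numpy as np

def nested_theta(branching, gammas, own=1.0):
    """Build Theta with Theta[m, n] = gamma_l for the smallest shared level l.

    branching[i] is the number of level-(i+1) categories in each level-(i+2)
    category; gammas = [gamma_2, ..., gamma_L]; `own` is the assumed diagonal.
    """
    sizes = np.cumprod([1] + list(branching))   # sizes[l-1] = goods per level-l category
    n_goods = int(sizes[-1])
    theta = np.empty((n_goods, n_goods))
    for m in range(n_goods):
        for n in range(n_goods):
            # Smallest level at which goods m and n fall into the same category.
            level = next(l for l, s in enumerate(sizes, start=1) if m // s == n // s)
            theta[m, n] = own if level == 1 else gammas[level - 2]
    return theta

# L = 3: four goods, grouped into two level-2 categories of two goods each.
print(nested_theta(branching=[2, 2], gammas=[0.6, 0.2]))
```

Goods in the same level-2 category share the coefficient γ2, and goods that meet only in the top category share γ3.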
We assume that the xm are i.i.d. normal random variables with mean 0 and variance |$\sigma _0^2$|, and the agent can obtain any multivariate normal signal about (x1, …, xN). Part of this thinking could, for instance, involve mentally simulating future consumption (as in Gabaix and Laibson 2017) or searching for information about consumption opportunities. The agent’s cost of attention is the same as in Section III, so that she maximizes the sum of her expected utility given her posterior beliefs plus |$\frac{\lambda \log |\Sigma |}{2}$|, where λ ≥ 0 is her cost of attention, Σ is the variance-covariance matrix of her posterior, and |Σ| is the determinant of Σ.
The attention cost in our model is most straightforwardly interpreted as an information-acquisition or information-processing cost. But it can also be interpreted as a calculation cost when the agent knows her tastes and consumption opportunities (or, in Section V, prices), but without thinking does not know what they imply for optimal consumption. Under either interpretation, the assumption that the agent can think about the vector (x1, …, xN) in a fully flexible way is unrealistic. For instance, it is unlikely that one can obtain a noisy signal about an arbitrary linear combination of the marginal utilities of this month’s entertainment programs. At the same time, there is clearly flexibility in what a person thinks about or focuses on, and our framework captures such flexibility without making potentially ad hoc assumptions on its limits. Fortunately, the optimal ways of thinking we identify below are highly plausible and intuitive, so if we allowed only plausible ways of thinking, the same solutions would obtain.⁶
As a benchmark, we identify how the agent behaves if she has costless attention and how she responds to ex ante known changes (i.e., changes in the |$\overline{x}_m$|). For instance, the agent’s average taste may evolve over time. To state the result, let |$\mathbf {y} = (y_1, \dots , y_N)^{\prime },\mathbf {\overline{x}} = (\overline{x}_1 , \dots , \overline{x}_N)^{\prime } , \mathbf {x} = (x_1 , \dots , x_N)^{\prime }$|.
Fact 1. If λ = 0, then |$\mathbf {y} = \frac{\Theta ^{-1}(\mathbf {\overline{x}} + \mathbf {x})}{2}$|. For any λ ≥ 0, |$E[\mathbf {y}] = \frac{\Theta ^{-1}\mathbf {\overline{x}}}{2}$|.
The agent’s average behavior responds to ex ante known changes in exactly the same way as with perfect information. This also means that her utility function (i.e., the matrix Θ) can be extracted from her responses to ex ante known changes. As we show later, her responses to ex post shocks she needs to think about are often markedly different and by implication do not accurately reflect her true preferences over consumption. Nevertheless, these responses can be predicted from her true preferences, which are measurable from her responses to ex ante known shocks.
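The extraction of Θ from responses to ex ante known changes can be sketched directly from the stated formula |$E[\mathbf {y}] = \frac{\Theta ^{-1}\mathbf {\overline{x}}}{2}$|. Shifting each mean taste in turn and recording the change in average consumption yields the columns of Θ⁻¹/2, which can then be inverted. The function names and the example matrix are illustrative.

```python
import numpy as np

def mean_consumption(Theta, x_bar):
    """E[y] = Theta^{-1} x_bar / 2, as stated in the benchmark result."""
    return np.linalg.solve(Theta, x_bar) / 2.0

def recover_theta(Theta_true, x_bar, step=1.0):
    """Back out Theta from how E[y] moves when each mean taste is shifted by `step`."""
    n = len(x_bar)
    base = mean_consumption(Theta_true, x_bar)
    response = np.column_stack([
        (mean_consumption(Theta_true, x_bar + step * np.eye(n)[m]) - base) / step
        for m in range(n)
    ])                                    # column m is the response to a shift in good m
    return np.linalg.inv(2.0 * response)  # response equals Theta^{-1} / 2

Theta = np.array([[1.0, 0.5], [0.5, 1.0]])
x_bar = np.array([2.0, 3.0])
print(recover_theta(Theta, x_bar))
```

Because average behavior is linear in the mean tastes, the recovered matrix does not depend on the step size or on the attention cost λ.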
IV.A. Substitutes: Mental Budgeting
First, we consider substitute goods, assuming that 0 < γL < … < γ2 < 1. This captures the idea that a good is a better substitute for other goods in its category than for goods in a different category. For instance, a French dinner is a better substitute for a Chinese dinner than for a movie. Then:
Proposition 3 says that if (and only if) her attention cost is sufficiently high, the agent has a fixed mental budget—a constant total expenditure—for each l-category of products. Accordingly, the higher is her cost of attention—for example, because she has lower cognitive ability or is busy with other things—the more likely she is to budget, and the narrower are her budgets. To appreciate ways in which such behavior differs from that of a classical decision maker, suppose that λ2 < λ < λ1, and one category at level 2 is entertainment. Denoting the entertainment category by R:
Corollary 1. (i) For any m ∈ R and |$n\not\in R$|, ym does not depend on xn; (ii) |$\sum _{m\in R} y_m$| is constant; and (iii) for any m ∈ R, |$E[y_m | \mathbf {x} ]$| is a smooth function of the vector |$(x_m - x_n)_{n\in R\setminus \lbrace m\rbrace }$| that is strictly increasing in each component.
Corollary 1 implies two related phenomena. First, part (i) says that the agent’s consumption decisions regarding entertainment are independent of other shocks. In a classical consumption problem, this occurs only if the utility from entertainment is separable from the rest of the utility function. We do not impose such separability; in fact, with full information |$\frac{\partial y_m}{\partial x_n} <0$| for all n ≠ m.⁷ Second, parts (ii) and (iii) imply that the agent’s total consumption of entertainment is independent of shocks, but her consumption within the category responds smoothly to within-category shocks. This is not the case in any classical model we could think of.
Intuitively, knowing about a shock to the relative marginal utility of movies and theater is very valuable, as it allows for substantial readjustment of both consumption levels through substitution. Knowing about a shock to the relative marginal utility of movies and clothing is less valuable, because the scope for substitution between these goods is lower. And knowing about a shock to the marginal utility of movies is also less valuable, as it leads mainly to the adjustment of movies consumption. With the agent’s attention being costly, she thinks only about the most important consideration, the relative marginal utility of movies and theater. As a result, she fixes total entertainment consumption.
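The budgeting pattern can be reproduced in a small numerical sketch with four goods in two categories. It assumes the utility −y′Θy + (x̄ + x)′y, which is inferred from Fact 1 and not quoted from the text. Under that assumption, misperceiving x along an eigenvector of Θ with eigenvalue μ costs Λ = 1/(4μ), and the agent notices the fraction 1 − (posterior variance)/|$\sigma _0^2$| of each shock. The parameter values are illustrative.

```python
import numpy as np

# Two level-2 categories of two goods each; goods 0 and 1 play the role of entertainment.
gamma2, gamma3 = 0.6, 0.2          # illustrative values with 0 < gamma3 < gamma2 < 1
Theta = np.array([
    [1.0, gamma2, gamma3, gamma3],
    [gamma2, 1.0, gamma3, gamma3],
    [gamma3, gamma3, 1.0, gamma2],
    [gamma3, gamma3, gamma2, 1.0],
])
sigma0_sq = 1.0

def average_response(lam):
    """Matrix of d E[y_m | x] / d x_n under the assumed quadratic utility."""
    mu, V = np.linalg.eigh(Theta)              # eigenvalues and directions of Theta
    Lambdas = 1.0 / (4.0 * mu)                 # assumed loss from misperceiving each direction
    post_var = np.minimum(sigma0_sq, lam / (2.0 * Lambdas))
    attention = 1.0 - post_var / sigma0_sq     # share of each shock the agent notices
    return V @ np.diag(attention / (2.0 * mu)) @ V.T

R = average_response(lam=0.6)   # here only within-category comparisons are attended to
print(np.round(R, 3))
print(np.round(average_response(lam=0.0), 3))   # full-information benchmark
```

At λ = 0.6 the response matrix is block diagonal and each block's columns sum to zero: entertainment goods do not react to shocks outside the category, and total entertainment consumption is fixed while its composition responds to relative tastes. At λ = 0 every good responds negatively to shocks to every other good.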
Proposition 3 explains evidence that a significant number of consumers have category-specific budgets. As a stark manifestation of this phenomenon, many households used to place budgets allocated for different purposes into different envelopes or tin cans (Rainwater et al. 1959; Lave 1995). Furthermore, Ameriks, Caplin, and Leahy (2003) and Antonides, De Groot, and Van Raaij (2011) document that the mental budgeting (if not physical separation) of expenses is still common. More generally, the majority of U.S. households report having a spending plan or budget (e.g., Lin et al. 2016), which may also be broken down into smaller budgets. Indeed, most of the many online financial management tools seem to presume that users want to set separate budgets for separate categories. Proposition 3 not only accounts for this evidence but makes the novel prediction that the most substitutable goods go into the same budget.
Of course, testing this prediction requires identifying which products are most substitutable. One way to do so is to use the agent’s responses to ex ante known information to measure Θ, as we mentioned after Fact 1. In addition, the proof of Proposition 3 implies that ex post information is useful as well: although the agent does not maximize her consumption utility, she does substitute more between product pairs m, n for which Θmn is greater. Formally, for two products m and n in a budget, the smaller is the smallest common category to which both belong, the greater is the agent’s average tendency to substitute between the products in response to a relative taste shock, |$\frac{\partial E[y_m-y_n | x_m - x_n ]}{\partial (x_m - x_{n})}$|.⁸ This allows an observer to establish a ranking of product pairs according to substitutability. After observing a person’s or population’s substitutability ranking, it is possible to predict, for instance, how a comparable person or population with greater attention cost might form budgets.
The logic of Proposition 3 applies in other domains as well. For instance, there is evidence that some individuals have mental budgets for time allocation, such as hours per day devoted to studying (Rajagopal and Rha 2009). This follows from our model by reinterpreting ym as the time allocated to task m, and xm as a shock to the return of working on task m. Furthermore, our theory predicts that a person creates budgets for substitute tasks, for instance, different ways of studying for an exam.
Our model is static in the sense that the agent solves a single optimization problem over what information to obtain and what to consume. But she does not have to make all choices at the same time. When choosing budgets, she can leave her plans incomplete and obtain information about shocks only when relevant consumption opportunities start arising, even making decisions separately for separate categories of products. This piecemeal execution is facilitated by the separable nature of the optimal plan, and is in fact optimal if obtaining the same information or mentally simulating consumption at the earlier budgeting stage is costlier.9
Having budgets leads to specific patterns in how a person reacts to shocks. Suppose, for instance, that the xm in the entertainment category all increase by the same amount—that is, unusually fun entertainment opportunities present themselves across the board. Then the agent’s average consumption of entertainment as well as other goods remains unchanged. Because she evaluates entertainment goods only relative to each other, on average she does not see a reason to change her behavior. If she had unlimited attention, in contrast, she would respond to such positive shocks by increasing entertainment consumption and decreasing other consumption. Similarly, if a single xm increases, that leads the agent to increase ym. If she had full information, she would also decrease the consumption of all other goods. Because she has a budget, however, she concentrates the substitution to within the category.
For simplicity our setup imposes a strong form of symmetry on products, but a simple extension of our proof makes it clear that weaker assumptions can also lead to budgeting. For instance, suppose that L = 3, there are two categories of unequal size at level 2, and products are symmetric within each category. Then, for sufficiently high attention costs the agent has a budget for each category. More generally, for an agent with high attention costs to budget within a category, it is sufficient for utility within the category to have the symmetric nested structure we have imposed; other categories could have different structures.
Proposition 3 identifies a particularly strong form of budgeting, in which the budget is completely fixed: if λ ≥ λl, then the correlation between the consumption of a good and the total consumption of other goods in its l-category is −1. Although this stark hard-budgeting result does not hold in many other cases, it identifies a force toward budgeting that holds more generally. As a case in point, we identify a version for lower attention costs.
Proposition 4. Suppose that 0 < γL < … < γ2 < 1 and λ < λl. For any k and |$m \in R^{k,l}$|, the correlation between |$y_m$| and |$\sum _{n\in R^{k,l}\setminus \lbrace m\rbrace } y_n$| is strictly decreasing in λ.
Proposition 4 says that the higher the agent’s attention cost, the more she restricts consumption adjustments to substitutions within a category. In this sense, she can be viewed as having a soft budget for l-categories.
Asymmetries in the prior variances of xm or prices also lead to a kind of soft budgeting. To illustrate, suppose that L = 2 and N = 4—there is a single category of four products—and the prior variances |$\sigma _{0,m}^2$| satisfy |$\sigma _{0,1}^2 \ne \sigma _{0,2}^2 = \sigma _{0,3}^2 = \sigma _{0,4}^2$|. We show in Appendix B (Proposition 9) that there are λ1 and α such that if λ ≥ λ1, then αy1 + y2 + y3 + y4 is constant, with α < 1 if and only if |$\sigma _{0,1}^2$| is greater than the other |$\sigma _{0,m}^2$|. Hence, total spending equals y1 + y2 + y3 + y4 = constant + (1 − α)y1. Furthermore, simulations show that unless the asymmetry is very large, an increase in y1 is associated with a decrease in y2 + y3 + y4 much more than with full information. This can be interpreted as saying that the agent has a soft target budget, allowing herself to go over the target if she happens to have a high value for a good with more volatile value. Relatedly, if good 1 has price p1 ≠ 1, then total spending is p1y1 + y2 + y3 + y4 = constant + (p1 − α)y1: now the agent allows herself to go over the target if she has a high value for a more expensive product. When choosing between cheaper chicken and more expensive beef, for instance, she allows herself to splurge when especially nice beef is available.
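The two spending identities in the paragraph above follow by substituting the budget relation into total spending. A short derivation, writing c for the constant (a label introduced here only for illustration):

```latex
\begin{align*}
\alpha y_1 + y_2 + y_3 + y_4 = c
  &\;\Longrightarrow\; y_2 + y_3 + y_4 = c - \alpha y_1, \\
y_1 + y_2 + y_3 + y_4 &= c + (1 - \alpha)\, y_1, \\
p_1 y_1 + y_2 + y_3 + y_4 &= c + (p_1 - \alpha)\, y_1 .
\end{align*}
```

With α < 1, total spending therefore rises with y1, and with p1 > α it rises with consumption of the more expensive good.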
Figure I illustrates Propositions 3 and 4 in an example. We consider four goods grouped into categories {1, 2} and {3, 4}, and draw the joint distribution of y1 and y2 for different levels of λ. For costless attention (λ = 0), the distribution of possible consumption pairs is quite dispersed. At the other extreme, for a very high attention cost (λ = 1), the consumption amounts are fixed. For lower, but relatively high attention costs (λ = 0.75, 0.5), the agent sets a budget for the two products, so her consumption is always on the same budget line. These situations correspond to Proposition 3. For even lower positive attention costs (λ = 0.48, 0.45), the agent starts substituting goods 1 and 2 with goods 3 and 4, but not as much as with costless attention, so the distribution of y1 and y2 is closer to a budget line than for λ = 0. These situations correspond to Proposition 4.

Figure I. Example: joint distributions of y1 and y2 for different costs of attention when N = 4, there are two categories, {1, 2} and {3, 4}, and |$\sigma _0^2=1,\gamma ^2= \frac{1}{2},\gamma ^3= \frac{1}{4}$|. Iso-density curves are shown.
IV.B. Complements: Naive Diversification
We turn to complementary products, assuming that γ2 < … < γL < 0. This means that products are arranged in a nested fashion into categories and are stronger complements within than across categories. For instance, different features of a car (e.g., driving experience, seats, sound system) might be highly complementary to each other but not to one’s furniture. To simplify our statement and capture situations in which the products are ex ante equally desirable, we also assume that the |$\overline{x}_m$| are equal. Then:
Proposition 5 says that if the agent’s attention cost is sufficiently high, then she engages in naive diversification, that is, chooses a fixed mix of products, in category l. Intuitively, because the optimal consumption levels of complementary products tend to move together, the agent does not think about their optimal relative consumption at all, only about how much she should consume in total. Continuing with the example of cars, the agent does not think separately about the quality of the engine, seats, sound system, and so on that she wants; she thinks only about whether she wants an economy or luxury car.
Although it is more typical for complementary products, under some circumstances naive diversification also occurs for substitutes. First, as we have illustrated in Section II, it emerges if the xm are sufficiently positively correlated. Second, it emerges—in a trivial form—if the agent’s attention cost is so high that she does not obtain any information. In this case, her consumption of all products is fixed at the ex ante optimal level, and therefore the mix of products is fixed as well.
An important application of the above results is naive diversification in financial decisions, whereby a person chooses a simple mix of investments that is unlikely to be fully optimal. For instance, Benartzi and Thaler (2001) document that many employees in employer-based retirement savings plans divide their investments equally across available funds, and relatedly, employees invest more in stocks if there are more stock funds available. Huberman and Jiang (2006) find a similar pattern for plans offering 10 or fewer funds, although not for plans offering more funds. To see how our model can account for this phenomenon in an example, suppose that an investor with mean-variance preferences decides the amounts y1 and y2 to invest into two assets. There are two equally likely states, with asset 1’s net return being x1 + 1 in state 1 and x1 − 1 in state 2, and asset 2’s net return being x2 − 1 in state 1 and x2 + 1 in state 2. It is easy to check that the mean of the investor’s wealth is x1y1 + x2y2 and the variance is |$(y_1 - y_2)^2$|, so the utility function can be written in the form of expression (8) with Θ12 = γ2 = −1. Hence, Proposition 5 predicts that an investor with sufficiently costly attention splits her investment equally between the two assets. More generally, because diversification is desirable, different investments are often complements, so Proposition 5 predicts that investors may diversify naively.10
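The mean and variance claimed in the two-asset example can be checked numerically. The sketch below is only an illustration: the function name and the numerical values are hypothetical, and wealth is taken to be the sum of each investment times its net return, as the example implies.

```python
def wealth_moments(x1, x2, y1, y2):
    """Mean and variance of wealth over the two equally likely states."""
    # State 1: asset 1 returns x1 + 1, asset 2 returns x2 - 1.
    w1 = y1 * (x1 + 1) + y2 * (x2 - 1)
    # State 2: asset 1 returns x1 - 1, asset 2 returns x2 + 1.
    w2 = y1 * (x1 - 1) + y2 * (x2 + 1)
    mean = 0.5 * (w1 + w2)
    var = 0.5 * ((w1 - mean) ** 2 + (w2 - mean) ** 2)
    return mean, var


# Illustrative values (assumptions, not taken from the article).
x1, x2, y1, y2 = 0.5, 0.25, 3.0, 1.0
mean, var = wealth_moments(x1, x2, y1, y2)

# Compare with the closed forms x1*y1 + x2*y2 and (y1 - y2)^2.
print(mean, x1 * y1 + x2 * y2)
print(var, (y1 - y2) ** 2)
```

An equal split (y1 = y2) removes the variance entirely, which is why the two assets act as complements for a mean-variance investor.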
Of course, it may be the case that different funds invest in similar assets, so combining them does not serve a diversification purpose, and therefore the funds are more appropriately viewed as substitutes. In this case, however, the investor’s values for the funds are prone to be highly positively correlated, making naive diversification optimal again. Reinforcing this tendency is that preferences are prone to be positively correlated to start with: that one fund is a good investment reflects in part that employer-sponsored retirement savings is a good investment in general, and therefore other funds in the program are good investments as well.
Investigating a completely different domain, Simonson (1990) finds that individuals naively diversify when choosing items to consume at different future dates.11 Our model explains this finding if individuals have a taste for variety—which is equivalent to complementarity—and are subject to taste shocks. Consistent with our perspective, Simonson argues that naive diversification is attributable to the combination of taste uncertainty and the desire to simplify the decision.
The observation that the agent reacts to ex ante known changes exactly as in the full-information case (Fact 1) qualifies Proposition 5 in an interesting way. For instance, suppose that an investor distinguishes between stock funds and bond funds and knows that stocks are more valuable investments for her. Then she chooses more stock funds than bond funds or might choose only stock funds. But if she considers stock funds as ex ante identical, then she still naively diversifies within the class of stock funds. More generally, if the agent sees a reason to invest in only a handful of funds, but treats them as ex ante equally good investments, then she may naively diversify between these funds. Huberman and Jiang (2006) find some evidence of such a conditional |$\frac{1}{N}$| rule.
For simplicity of presentation, we treat the cases of substitute products and complementary products separately. But it is easy to combine the two problems into one grand decision problem. In particular, suppose that a subset of the products are substitutes as above, while the rest are complements as above, with preferences over the two subsets being separable. Because the two problems are then separable, our results apply unchanged to each subset. Furthermore, although we have not investigated such cases in detail, more complicated patterns involving both budgeting and naive diversification of the same products can emerge if some categories are complements, and some are substitutes. Suppose, for instance, that dinners and movies, as well as jazz and cocktails, are complements, but they are substitutes across the two pairs. Then, an agent with high attention costs always pairs dinners and movies as well as jazz and cocktails—in this sense naively diversifying—but has a budget for the total number of entertainment evenings.12
V. Price Uncertainty and the Nature of Budgets
We conceptualize the question of whether the agent might want a budget for consumption or for spending by asking a more fundamental question: whether she wants to think (that is, make plans and execute decisions) in terms of the consumption levels of the goods or the amounts of spending on the goods.14 Formally, in the former case she chooses consumption ym for each good, and in the latter case she chooses spending Ym = pmym on each good. Although these two ways of thinking are equivalent in a classical problem with known prices, they are not equivalent in our model, in which the agent does not fully learn prices before making decisions. For instance, deciding to buy a front-row ticket to a concert no matter how much it costs will generally not result in the same purchase as deciding to spend $100 on the concert no matter where one sits.
Proposition 6. For any |$\lambda , \sigma _0^2 ,\epsilon ^1, \epsilon ^2$|, thinking in terms of spending yields strictly higher expected utility than thinking in terms of consumption if (i) |$\epsilon ^1,\epsilon ^2 > \frac{1}{2}$| or (ii) |$\epsilon ^1>\frac{1}{2}$| and N is sufficiently large, and the converse holds if (iii) |$\epsilon ^1,\epsilon ^2 < \frac{1}{2}$|.
Proposition 6 identifies two sufficient conditions for thinking in terms of nominal spending to be optimal. Both conditions require that the products are relatively good substitutes (|$\epsilon ^1>\frac{1}{2}$|). To understand the logic of condition (i), suppose first that N = 1, that is, there is a single product. Then, the condition says that the price elasticity of consumption of the single product must be greater than |$\frac{1}{2}$|. Intuitively, fixing nominal spending generates a price elasticity of consumption of 1 (from approximation (12), |$\frac{(y_m-\overline{y})}{(p_m-\overline{p})}\frac{\overline{p}}{\overline{y}} =1$|) while fixing consumption generates a price elasticity of consumption of 0, so the former is optimal if and only if the optimal price elasticity is closer to 1 than to 0. Extending the logic to N > 1 gives condition (i): thinking in terms of spending is optimal if both relevant elasticities are greater than |$\frac{1}{2}$|. The converse gives condition (iii): thinking in terms of consumption is optimal if both relevant elasticities are less than |$\frac{1}{2}$|.
If |$\epsilon ^1> \frac{1}{2}$| and |$\epsilon ^2 < \frac{1}{2}$|, then the logic is not sufficient to determine whether thinking in terms of spending is optimal. Still, condition (ii) says that it is optimal if N is sufficiently large. Intuitively, this occurs because with many products, the predominant manner in which the agent wants to adjust consumption to shocks is by substituting between products—not by adjusting total consumption—so this substitution elasticity is more important in determining how she wants to think.
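The single-product logic behind conditions (i) and (iii) can be sketched in a few lines. This is a stylized illustration, not the article's derivation: it assumes that the loss from a rule is proportional to the squared gap between the price elasticity the rule generates (1 for fixed spending, 0 for fixed consumption, both in absolute value) and the optimal elasticity. The function names and the quadratic form are assumptions introduced here.

```python
def approx_loss(optimal_elasticity, rule_elasticity, price_variance=1.0):
    # Assumed quadratic loss: the rule's response to a log-price shock
    # differs from the optimal response by the gap in elasticities.
    return (optimal_elasticity - rule_elasticity) ** 2 * price_variance


def spending_beats_consumption(optimal_elasticity):
    fixed_spending = approx_loss(optimal_elasticity, 1.0)     # elasticity 1
    fixed_consumption = approx_loss(optimal_elasticity, 0.0)  # elasticity 0
    return fixed_spending < fixed_consumption


# Scan a few hypothetical elasticities around the 1/2 threshold.
for eps in (0.2, 0.4, 0.6, 0.8):
    print(eps, spending_beats_consumption(eps))
```

Under this assumed loss, fixing spending wins exactly when the optimal elasticity exceeds one half, which is the threshold in the proposition.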
Our next proposition extends the budgeting result in Proposition 3 and Corollary 1 to spending.
Suppose that the agent thinks in terms of spending, and ε1ε2 > 1. Then, there are λ1, λ2 satisfying 0 < λ2 < λ1 such that if λ2 ≤ λ < λ1, then total spending ∑mYm is constant, but the individual spending levels Ym are not constant.
The logic also parallels that before: the most valuable pieces of information to know about are price differences, so often this is all the agent pays attention to. As a result, she restricts adjustments to substitutions between products, fixing total spending.16
Thinking in terms of nominal spending, and having nominal budgets, is therefore optimal if the optimal price elasticity of total consumption is sufficiently high, or it is not too low and product categories feature many closely substitutable products. These results explain the general prevalence of spending budgets, and—since lower-income people have higher price elasticities of consumption—the greater prevalence of spending budgets among lower-income individuals.17 Nevertheless, our model predicts that individuals who do not care much about prices are more likely to have budgets expressed in terms of quantities. Consider, for example, a high-income person whose primary constraint in entertainment consumption is time, not money. Because she is therefore not price sensitive, she is more likely to choose a budget in entertainment quantity. Anecdotally, some people do seem to set consumption budgets, for instance, deciding to go out twice a month or to take two weeks of vacation a year. Relatedly, as Krishnamurthy and Prokopec (2010) note, in some self-control settings people tend to have quantity budgets, for instance in the number of weekly desserts or Weight Watchers points they allow themselves. Although our model does not formalize a self-control motive, this is another setting in which the primary cost of consumption is not the price, so that we predict quantity budgets rather than spending budgets.
Having a spending budget leads to an interesting pattern in how a person reacts to price shocks.
Corollary 2. Suppose that the agent thinks in terms of spending, and ε1ε2 > 1, λ2 ≤ λ < λ1. A decrease in the price of good m lowers spending on good m and increases spending on other goods.
With full information, a decrease in the price of a good would lead to an increase in the consumption of that good and a decrease in the consumption of substitutes. In direct contrast, Corollary 2 says that the agent increases the consumption of substitutes as well. Although they are not precise confirmations, some experimental results are indicative of this prediction. In the experiment of Heilman, Nakamoto, and Rao (2002), shoppers who were given $1 off an item increased their purchases of products related to the discounted item. But unlike in our model, the discount applied only to one item and hence was not a price decrease, and the discount also increased purchases of unrelated “treats.” Similarly, Heath and Soll (1996) find in hypothetical choices that MBA students reduce their entertainment consumption more if they had spent $20 on a sports ticket than if they had received the same ticket as a gift. But again, a gift is not identical to a price shock.
Note that our model assumes linear disutility of money. Because thinking in terms of spending rather than quantities reduces risk in one’s total spending, a budget constraint over nominal spending, or more generally a concave utility function over nominal savings, provides an additional reason to think in terms of spending and therefore to have spending budgets.18 Once again, this is especially likely to apply to low-income individuals, who typically face tighter constraints.
VI. Unit Demand
In our main model, the agent chooses consumption levels from a continuum. In a number of prototypical consumer decisions, however, a person is better described as having unit demand, needing exactly one item and being able to choose it from a selection. For instance, in the medium run the car a person uses, and how much she uses her car, are fixed, so that she needs to buy a fixed amount of gasoline. When a consumer’s computer breaks, she needs to buy exactly one new computer to replace it. When shopping for a new bedroom, a homeowner may be looking for exactly one mattress and one comforter. We now analyze the implications of our framework for such purchases.
We assume that there are N categories of products. In each category, there is a continuum of products with different quality levels, and the agent is looking to buy exactly one of them (with her utility being −∞ if there is a category in which she does not purchase). In category m, product |$y_m \in \mathbb {R}$| has utility ym, and a random price pm(ym). For instance, one category m could be gasoline, with the choice of ym representing the grade of gasoline that the agent chooses. The agent’s total utility is |$\sum _m (y_m - p_m(y_m))$|.
The shape of the pricing functions pm(ym) is determined by the differentiable, strictly increasing and strictly convex function p(·) that has full range and satisfies |$\lim _{y\rightarrow -\infty } p^{\prime }(y) < 1$| and |$\lim _{y\rightarrow \infty } p^{\prime }(y) > 1$|. But the actual pricing functions are subject to shocks of the following form. For each category m, nature independently draws (i) the random variable xm from a distribution with mean 0, and (ii) whether a vertical or a horizontal price shock occurs, which have probabilities s and 1 − s, respectively. Then, the pricing function for category m becomes pm(ym) = p(ym) + xm if the price shock is vertical, and pm(ym) = p(ym + xm) if the price shock is horizontal. In combination with the assumption that p(·) is convex, this specification captures two canonical types of price shocks of interest for shopping behavior. A vertical price shock changes the price level while leaving the marginal price of increases in quality unchanged. With a horizontal price shock, however, a change in the price level changes the marginal price of increases in quality in the same direction; so (say) a price increase is associated with an increase in the marginal price as well.19
We consider an agent who has sufficiently costly attention (a sufficiently high λ) such that she does not want to think about price shocks, and therefore makes a plan that is independent of price realizations. An alternative interpretation is that the price uncertainty is the residual uncertainty after the agent has thought about the problem. Similarly to the previous section, we ask whether the agent wants to fix the level of quality or the amount of spending for each category. For computers, for instance, she could decide on a specific computer brand and configuration no matter how much it costs, or she could ask for the best $2,000 computer no matter what specific machine that is. For gasoline, she could buy the same grade each time, or she could decide how much she is willing to spend on gas, and choose the grade whose price is closest to that amount. These choice variables seem equally easy to implement in practice: in the former case the agent needs to remember the version she wants to buy in each category, and in the latter case she needs to remember the price she is aiming for in each category.
Suppose that the agent does not acquire information about prices. For any p(·) and any mean-zero nondegenerate shock distribution, there is an S ∈ (0, 1) such that fixing quality is optimal for s > S, and fixing spending is optimal for s < S.
If all price shocks are vertical (s = 1), then fixing quality is optimal. In this case, the marginal price of increasing ym is constant, so choosing a fixed ym is optimal. This means that if prices in category m increase, then the agent absorbs the shock and increases spending on category m. If some price shocks are horizontal, however, an increase in prices also increases the average marginal price of increasing ym, so fully absorbing a price increase by increasing spending is not optimal. Instead, lowering spending back toward the original level increases utility. In the extreme case in which all price shocks are horizontal (s = 0), fixing spending is optimal. In this case, decreasing spending back to the original level perfectly aligns marginal value with marginal price. Extending this logic, fixing spending is superior to fixing quality if a sufficiently large share of price shocks is horizontal—or, equivalently, the price and marginal price of quality are sufficiently positively correlated.
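The comparison between fixing quality and fixing spending can be illustrated numerically for a single category. The sketch below rests on assumptions that are not in the article: a particular pricing function p(y) = 0.5y + log(1 + e^y), whose slope runs from 0.5 to 1.5, a two-point shock x ∈ {−1, +1}, and a grid search over plans. All names are hypothetical.

```python
import math

SHOCKS = (-1.0, 1.0)  # assumed two-point, mean-zero shock distribution


def p(y):
    # Assumed baseline price schedule: strictly increasing and strictly
    # convex, with slope 0.5 + sigmoid(y), i.e. below 1 for very low
    # quality and above 1 for very high quality.
    return 0.5 * y + math.log1p(math.exp(y))


def p_inv(price, lo=-50.0, hi=50.0):
    # Invert p by bisection; p is strictly increasing, so the root is unique.
    for _ in range(80):
        mid = 0.5 * (lo + hi)
        if p(mid) < price:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)


def mean(values):
    values = list(values)
    return sum(values) / len(values)


GRID = [i / 100.0 for i in range(-300, 301)]  # candidate plans

# For each plan: (expected utility if shocks are vertical, if horizontal).
QUALITY = [(mean(y - p(y) - x for x in SHOCKS),        # price is p(y) + x
            mean(y - p(y + x) for x in SHOCKS))        # price is p(y + x)
           for y in GRID]
SPENDING = [(mean(p_inv(P - x) - P for x in SHOCKS),   # solve p(y) + x = P
             mean(p_inv(P) - x - P for x in SHOCKS))   # solve p(y + x) = P
            for P in GRID]


def best(plans, s):
    # s is the probability that the price shock is vertical.
    return max(s * v + (1 - s) * h for v, h in plans)


def fixing_quality_is_better(s):
    return best(QUALITY, s) > best(SPENDING, s)


for s in (0.0, 0.25, 0.5, 0.75, 1.0):
    print(s, fixing_quality_is_better(s))
```

Because expected utility is linear in s for any fixed plan, the vertical and horizontal components are computed once and then combined for each s. The extreme cases s = 1 and s = 0 reproduce the two benchmark results in the text. Where the switch occurs in between depends on the assumed p(·) and shock distribution.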
Again, our model assumes linear disutility of money. If the agent has a budget constraint over nominal spending, or more generally her utility over nominal savings is concave, then she is more prone to think in terms of spending to reduce risk in her total spending.
The economically most important prediction of this section emerges when consumers choose to fix spending. Then, because pm(·) is always strictly increasing, an increase in prices means that the agent must substitute to a lower-quality product. This prediction provides a potential explanation for the finding of Hastings and Shapiro (2013) that when gasoline prices rise, there is a shift in demand from premium to regular gasoline—that is, a cheaper product in the same category. Our explanation, however, requires that the circumstances that Hastings and Shapiro study are relatively rare in consumers’ lives. In particular, Hastings and Shapiro find substitution toward lower-grade gasoline for price shifts for which the price and marginal price of quality are approximately uncorrelated, while our model predicts such behavior only if the price and marginal price of quality are sufficiently positively correlated. Nevertheless, the correlation between the price and marginal price of quality is plausibly positive for many product categories consumers have experience with. Even for gasoline, Hastings and Shapiro focus on the short run, and the correlation may be positive in the longer run. Hence, our model accounts naturally for the evidence under at least two circumstances, especially for budget-constrained consumers. First, if the relevant correlation is sufficiently positive for gasoline, then fixing spending is optimal even for gasoline purchases in isolation. Second, more plausibly, if the correlation is sufficiently positive for the average consumer product with unit demand, and the consumer does not want to think separately about the correlation or does not want a separate shopping strategy for gasoline, then again fixing spending on gasoline is optimal.
Note that for a consumer who prefers to think in terms of spending (s < S), the implications of our unit-demand model contrast in an interesting way with those of our continuous-demand model above. When the agent has a budget in the continuous model, an equal increase in prices—that is, a vertical price increase—for a category leaves her spending levels unchanged in expectation for all products. This occurs because she is paying attention only to the relative marginal utilities of products in the category, which a vertical price increase does not change. In the unit-demand model, in contrast, a vertical increase in prices leads the consumer to reallocate spending to a cheaper substitute.
VII. Related Literature
Mental budgeting is a central component of, and is often referred to in the literature as, mental accounting, but the latter term is used for a broader set of issues. In applications of prospect theory, in particular, a mental account often refers to the set of monetary outcomes that are evaluated jointly in the context of a single decision (e.g., Kahneman and Tversky 1984; Thaler 1985; Henderson and Peterson 1992). For instance, a person is more willing to drive 20 minutes for a $5 saving if it comes off of a $15 purchase than if it comes off of a $125 purchase (Tversky and Kahneman 1981), presumably because she evaluates the saving together with the purchase to which it is applied. Our article is instead about mental accounts/budgets that serve as a decision-making aid when there are multiple competing uses for money.
There are two main explanations for mental budgets that have long been noted in the literature. Our framework is close in spirit to one of them: the idea that mental budgets simplify a consumer’s otherwise hopelessly complex problem by breaking it into manageable pieces (Thaler 1999; Zhang and Sussman 2018). This idea, however, is very underdeveloped in the literature. Most notably, previous work does not formalize how mental budgets simplify decisions and does not provide precise predictions on how consumers group products into budgets. Indeed, the need for research on these questions seems widely recognized (e.g., Hastings and Shapiro 2013; Zhang and Sussman 2018). We provide a theory of product categorization, and our formalization also generates other predictions, such as the connection we find between mental budgeting and naive diversification.
The other, more developed explanation for mental budgets is self-control problems: a person uses budgets to mitigate overconsumption in the future. In a classic paper, Shefrin and Thaler (1988) develop a life cycle consumption-savings model in which the individual’s “planner” self would like to control the “doer” self’s tendency to consume too much. Shefrin and Thaler assume that the individual can separate money into different mental accounts (current spendable income, current assets, and future income), out of which her marginal propensities to consume are different. In the context of goal setting under self-control problems, Koch and Nafziger (2016) assume that a person can decide between broad and narrow goals, and that falling short of one’s chosen goal(s) leads to sensations of loss. The motive to avoid such losses creates an incentive that mitigates self-control problems.20 Similarly, Galperti (2019) compares good-specific and total-expenditure budgets for a person who is subject to self-control problems and intratemporal and intertemporal taste shocks.
Our theory provides a complementary reason for mental budgets that has a different foundation and therefore different predictions and features. Conceptually, the most important difference is in the nature of mental budgets themselves: while self-control-based theories exogenously assume nonfungibility of money in the sense that spending from different accounts or budgets is subject to different constraints or preferences, in our model mental budgets emerge despite money being fully fungible. The different foundation also allows us to derive naive diversification from the same framework and make other predictions.
Gorman (1959) identifies circumstances under which it is optimal for a standard utility maximizer to make consumption decisions using a two-step procedure similar to that in Sections IV.A and V, whereby she first allocates fixed budgets to different consumption categories, and then optimizes within each category given the allocated budget. Unlike in our model, the budgeting in the first stage requires the agent to know with certainty all the relevant price indices for the categories, and there is no taste uncertainty.
In predicting that the agent may completely ignore some aspects of her decision environment, our model is similar to the sparsity-based model of bounded rationality by Gabaix (2014). In Gabaix’s setting, the variables the agent may choose to look at are exogenously given, whereas in ours the agent can choose any combination of variables. We also apply the model to different questions than Gabaix.
Because our theory predicts unambiguous budgets based on economic preferences and fundamentals, it fails to capture some subtle context dependence in how individuals categorize outlays. For instance, Cheema and Soman (2006) find that individuals categorize a restaurant dinner flexibly as either food or entertainment depending on which budget has more money left over in it. The authors interpret such malleability in mental budgeting as an attempt to justify spending.
VIII. Conclusion
Although our models explain some important regularities in how individuals allocate money between multiple products, there are related phenomena that we have not covered. In particular, because our model does not distinguish multiple fully fungible sources of income, it cannot explain differences in whether and how consumers spend different types of income. The most important of these phenomena is the consumption effect of transfers that can only be used on a subset of products. The rational consumer model with full information implies that if such a transfer is inframarginal—that is, if the consumer would have spent more than the transfer on the products in question—then it is equivalent to cash. Yet experimental work by Abeler and Marklein (2016) and empirical work by Hastings and Shapiro (2018) indicate that inframarginal transfers have larger effects on the consumption of targeted products than do cash transfers. Even when a transfer is not inframarginal, it can have a surprisingly large effect: for instance, incentives for health-improving behaviors that are minute relative to the health benefits can significantly influence behavior (Volpp et al. 2008; Dupas 2014).21 Although not predicted by our current framework, there is a plausible attention-based explanation for these findings. Namely, there are many things that a person could consider doing but that she deems not worthwhile to think about due to costly attention, and she therefore does not do them. Receiving a transfer or subsidy can induce the person to think about the potential benefits, increasing the effect of the transfer. In ongoing work, we formalize this mechanism and consider what it implies for the optimal design of transfers.
Of course, we do not believe that mental budgeting is solely about costly attention. As we have mentioned, a likely motive for creating mental budgets is self-control problems. It would be interesting to combine the attention-based and self-control-based explanations of mental budgeting to identify interactions. For example, a person may use the costly nature of her attention to improve self-control by creating plans that she is unwilling to reconsider later. And when it comes to implementing a mental-budgeting-based consumption plan, researchers understand that if the budget becomes a reference point, then loss aversion helps her stick with the plan. Our theory provides one possible foundation for which outcomes are evaluated jointly in a reference-dependent model. Once again, it would seem fruitful to combine the attention-based view with loss aversion.
Appendix A: LQ Multivariate Setup
Now we show by contradiction that S is diagonal. Suppose that the optimal S is not diagonal, and let |$S^D$| be the matrix constructed from its diagonal, that is, |$S^D_{ii}=S_{ii}$| for all i and |$S^D_{ij}=0$| for all i ≠ j.
Part (ii): Let |$\xi _i=1-\frac{S_{ii}}{\sigma _0^2}$| be the relative reduction of uncertainty about the component vi · x. ξ is also the linear weight on a signal (as opposed to on the prior) in Bayesian updating with Gaussian signals. This means that in one-dimensional Bayesian updating, if the random variable vi · x moves by Δx, then the posterior mean about this variable moves in expectation by ξiΔx.
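The equivalence between the relative reduction of uncertainty and the linear weight on a signal can be checked numerically in the scalar case. The following is a minimal sketch, assuming a signal of the form s = x + noise with independent Gaussian noise; the function names and the numerical values are illustrative and do not come from the text.

```python
def gaussian_update(prior_var, noise_var):
    """Scalar Gaussian updating for a prior N(0, prior_var) and a signal
    s = x + noise with noise ~ N(0, noise_var) (assumed signal structure).
    Returns the posterior variance and the linear weight on the signal."""
    signal_weight = prior_var / (prior_var + noise_var)
    posterior_var = prior_var * noise_var / (prior_var + noise_var)
    return posterior_var, signal_weight


def attention_factor(prior_var, posterior_var):
    """xi = 1 - S_ii / sigma_0^2, the relative reduction of uncertainty."""
    return 1.0 - posterior_var / prior_var


sigma0_sq = 2.0  # illustrative prior variance (assumed value)
noise_var = 6.0  # illustrative noise variance (assumed value)

S, weight = gaussian_update(sigma0_sq, noise_var)
xi = attention_factor(sigma0_sq, S)

# The attention factor coincides with the weight on the signal, so a move of
# dx in the underlying variable moves the posterior mean by xi * dx in expectation.
print(S, weight, xi)  # 1.5 0.25 0.25
```

Because the posterior mean is the signal weight times the signal, and the signal moves one for one with the variable in expectation, the expected response of the posterior mean is ξiΔx.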
Part (iii): Differentiating |$\frac{\varphi ^\lambda _i}{\varphi ^\lambda _j}$| with respect to λ then implies the statement. □
Appendix B: Consumption and Spending Budgets
Proof of Fact 1. This is an immediate implication of equation (14) for C = Θ and B = I. □
Let rl denote the size of the category Rk,l on level l, that is, rl = |Rk, l| for all k.
Therefore, each |$v^{k,l,r^{\prime }}$| is an eigenvector, and together they form a basis, because they are all mutually orthogonal. Vectors associated with distinct categories are orthogonal due to equation (20), and vectors associated with mutually nested categories are orthogonal due to equations (21) and (22). For vectors of the same category, equation (22) makes the dimensionality equal to the number of subcategories, and equation (21) lowers it by one, to |$\frac{r_l}{r_{l-1}}-1$|. The total number of vectors associated with level l > 1 is |$\frac{N}{r_{l-1}}-\frac{N}{r_l}$|, and the total number of these orthogonal eigenvectors on all levels is N − 1, which together with the eigenvector (1, …, 1) delivers N orthogonal eigenvectors, and thus a basis. □
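The counting step at the end of this proof is a telescoping sum, which can be illustrated with a small sketch. It assumes that the lowest level consists of single goods (r1 = 1) and that the highest level contains all goods (rL = N), as the total of N − 1 requires; the category sizes below are hypothetical.

```python
def eigenvector_counts(sizes):
    """sizes = [r_1, ..., r_L], with r_1 = 1, r_L = N, and each size dividing
    the next (assumed nesting). Returns the number of eigenvectors associated
    with each level l = 2, ..., L."""
    n_goods = sizes[-1]
    counts = []
    for smaller, larger in zip(sizes, sizes[1:]):
        categories_on_level = n_goods // larger   # N / r_l categories on level l
        vectors_per_category = larger // smaller - 1  # r_l / r_{l-1} - 1
        counts.append(categories_on_level * vectors_per_category)
    return counts


sizes = [1, 2, 4, 12]  # hypothetical r_1, ..., r_4 with N = 12
counts = eigenvector_counts(sizes)

# Each entry equals N / r_{l-1} - N / r_l, and the entries sum to N - 1.
print(counts, sum(counts))  # [6, 3, 2] 11
```

Adding the eigenvector (1, …, 1) to these N − 1 vectors gives the N orthogonal eigenvectors used in the proof.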
Proof of Proposition 3. We proceed in the following steps. First, we use Lemma 1 to infer the eigenvectors vi and eigenvalues Λi of Ω. Second, we use Proposition 1 to find costs of information for which the agent acquires information about vi · x. Third, we connect information acquisition to changes in vi · y by invoking Proposition 2. Finally, we show how this relates to fixed budgets.
Second, let |$\lambda _{l-1} = 2\sigma ^2_0\Lambda _i$|. According to Proposition 1, the agent gets information about vi · x if and only if λ < λl−1. Note that λl−1 are decreasing in l, because Λi are decreasing in l. Therefore, there are λ1, …, λL satisfying λL < … < λ1 such that |$v^i\cdot \tilde{{\bf x}}$| is constant for all vi associated with a level l if and only if λ ≥ λl−1.
Proof of Corollary 1.
Part (ii): This is an immediate implication of Proposition 3.
The derivative is constant (i.e., the dependence is linear) and clearly nonnegative. Moreover, it is positive since for i′s associated with l = 2 the attention factor ξi is positive and due to equation (21) the nonzero eigenvectors cannot have entries |$v_m^i$| constant for all |$m\in R.$| □
Let l* > 1 be the smallest level such that both m, n belong to a category |$R^{k,l^*}$|. From Lemma 1, we know that |$v^i_n v^i_m$| on the right-hand side of equation (26) is equal to 0 for all i associated with a level lower than l*, by condition (20), and |$v^i_n v^i_m$| is nonnegative for all i associated with levels higher than l*, by condition (22).
Moreover, if ξ* > 0, then the monotonicity is strict. In that case, (ξ*Λ* − ξiΛi) > 0 and for all levels l > l* there always exists i such that |$v^i_mv^i_n>0$|, because on all such levels there exists a category including both m and n. Therefore, there exists i on the level (l* + 1) such that |$(\xi ^*\Lambda ^*-\xi _i\Lambda _i) v^i_n v^i_m$| is positive. Dropping this term as l* increases decreases the derivative strictly. □
Proof of Proposition 4. The variance-covariance matrix of posterior means (describing correlations of beliefs about xi and xj) is P = Ψ − Σ. This matrix is diagonal in the basis of eigenvectors vk, that is, |$P = UQU^{-1}$|, where the columns of U are the vk. The diagonal elements Qkk ≡ Qk equal |$\sigma _0^2-\sigma _k^2$|, which is the reduction of uncertainty about vk · x. The reduction |$Q_k={\rm max}(0,\sigma ^2_0-\frac{\lambda }{2\Lambda _k})$| is weakly increasing in Λk and weakly decreasing in λ; see Proposition 1.
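The formula for Qk and the attention threshold used in the proof of Proposition 3 can be put side by side in a short sketch. The eigenvalues and the cost of attention below are hypothetical numbers chosen for illustration; only the two formulas are taken from the text.

```python
def uncertainty_reduction(eigenvalue, sigma0_sq, lam):
    """Q_k = max(0, sigma_0^2 - lambda / (2 * Lambda_k))."""
    return max(0.0, sigma0_sq - lam / (2.0 * eigenvalue))


def attention_threshold(eigenvalue, sigma0_sq):
    """Cost 2 * sigma_0^2 * Lambda_k; the direction v^k receives attention
    if and only if lambda is strictly below this value."""
    return 2.0 * sigma0_sq * eigenvalue


sigma0_sq = 1.0                 # illustrative prior variance (assumed)
eigenvalues = [0.5, 0.25, 0.1]  # hypothetical Lambda_k, largest first
lam = 0.3                       # illustrative cost of attention (assumed)

reductions = [uncertainty_reduction(ev, sigma0_sq, lam) for ev in eigenvalues]
thresholds = [attention_threshold(ev, sigma0_sq) for ev in eigenvalues]
attended = [lam < t for t in thresholds]

# Directions with larger eigenvalues receive more attention; the direction
# with the smallest eigenvalue is ignored because lam exceeds its threshold.
print(attended)  # [True, True, False]
```

In this sketch a direction has a positive Qk exactly when λ lies below its threshold, which is the sense in which Qk is weakly increasing in Λk and weakly decreasing in λ.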
Proof of Proposition 9. Without loss of generality |$\sigma _{0,2}^2=1$|. We first transform the state space such that in the new coordinates, |$\tilde{x}_1=\frac{x_1}{\sqrt{\sigma _{0,1}^2}}$| and |$\tilde{x}_m=x_m$| for all m > 1; the new prior variance-covariance matrix is then |$\tilde{\Psi }=I$|. The only other change to the original choice problem is that now the utility is |$-{\bf y}^{\prime } \Theta {\bf y} + \tilde{{\bf x}}^{\prime } \tilde{B} {\bf y}$|, where the element |$\tilde{B}_{11}$| of the new matrix |$\tilde{B}$| interacting actions and states is equal to |$a=\sqrt{\sigma _{0,1}^2}$|.
Now we compute the loss matrix |$\Omega =\frac{\tilde{B}\Theta ^{-1}\tilde{B}^{\prime }}{4}$|, its eigenvectors and eigenvalues.24 The three largest eigenvalues are associated with eigenvectors v1, v2, and v3, which are proportional to (0, −1, 0, 1), to (0, −1, 1, 0), and to (ϱ1, 1, 1, 1) respectively. The eigenvector associated with the strictly smallest eigenvalue is v4 ∝ (ϱ2, 1, 1, 1), where ϱ1 ≠ ϱ2.
Finally, we need to show that α1 > 1 if and only if a < 1. We express α1 using the condition that |$\tilde{H}v^3$| is orthogonal to α, and tedious but basic algebra reveals that in fact α1 > 1 if and only if a < 1. □
One natural question is whether the agent still engages in soft budgeting in the sense of the text, namely that an increase in the consumption of a good is associated with a decrease in the consumption of other goods much more than with full information. To measure this, we analyze the relative volatility of the budget, that is, the ratio of the variance of ∑iyi to the sum of the variances of the yi. Intuitively, this answers how much the agent changes her budget relative to how much she changes the consumption levels of the individual goods. For λ = 0.3, for instance, this ratio is 0 for a = 1 and increases fairly slowly in a. For |$\theta =\frac{1}{4}$| we find that the volatility of x1 needs to be quadrupled, that is, |$\sigma _{0,1}^2 = 4\sigma _{0,2}^2$|, to make the relative volatility of the budget one-half of the relative volatility under perfect information.
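The relative volatility of the budget is straightforward to compute from the variance-covariance matrix of consumption levels. The sketch below uses two hypothetical covariance matrices, not the ones from the model, to show the two benchmark cases: a hard budget and independently varying consumption levels.

```python
def relative_budget_volatility(cov):
    """Var(sum_i y_i) / sum_i Var(y_i) for a covariance matrix of the y_i,
    given as a list of rows."""
    n = len(cov)
    variance_of_sum = sum(cov[i][j] for i in range(n) for j in range(n))
    sum_of_variances = sum(cov[i][i] for i in range(n))
    return variance_of_sum / sum_of_variances


# Hypothetical hard budget: any increase in y_1 is offset one for one by y_2.
hard_budget = [[1.0, -1.0],
               [-1.0, 1.0]]

# Hypothetical benchmark with uncorrelated consumption levels.
independent = [[1.0, 0.0],
               [0.0, 1.0]]

print(relative_budget_volatility(hard_budget))  # 0.0
print(relative_budget_volatility(independent))  # 1.0
```

A ratio of 0 corresponds to a fixed total, as in the case a = 1 above, and values between 0 and the perfect-information ratio correspond to soft budgeting.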
Proof of Proposition 5. The first step is the same as in the proof of Proposition 3. In the second step, we first note that the ordering of the eigenvalues is the opposite, since for complements γ2 < … < γL < 0. Therefore, there are λ1, …, λL satisfying λ1 < … < λL such that vi · x is constant for all vi associated with level l if and only if λ ≥ λl−1.
Because m, n ∈ Rk,l, the only nonzero elements on the right-hand side might be those with i associated with level l and lower. Above we showed that |$v^i\cdot \tilde{{\bf x}}$| associated with levels l and lower are constant if and only if λ ≥ λl−1. Therefore, ym − yn is constant if and only if λ ≥ λl−1. The symmetry of the problem in fact implies ym = yn if and only if λ ≥ λl−1, and relabeling the threshold costs λl−1 → λl concludes the proof. □
Proof of Corollary 4. Applying the arguments of the proof of Proposition 3 for the level l = 3 of substitutes (categories R1,2 and R2,2) we find that there exist λ2 > λ3 such that y1 + y2 + y3 + y4 = const if and only if λ ≥ λ3, and y1 + y2 = y3 + y4 = const if and only if λ ≥ λ2.
Similarly, replicating the proof of Proposition 5 for the level l = 2 of complements we find that there exists λ1 > 0 such that y1 = y2 = const and y3 = y4 = const if and only if λ ≥ λ1.
Finally, from equation (24) we find that μ2 = 1 − γ2 > 1, because for complements γ2 < 0, and μ3 = μ2 + (γ2 − γ3) = 1 − γ3 < 1, because for substitutes γ3 > 0. Therefore μ2 > μ3, and thus λ1 < λ2. The statement is concluded by denoting |$\hat{\lambda }_1=\max (\lambda _1,\lambda _3)$| and |$\hat{\lambda }_2=\lambda _2.$| Hence for |$\lambda \in (\hat{\lambda }_1,\hat{\lambda }_2)$|: y1 + y2 + y3 + y4 = const, y1 = y2 = const and y3 = y4 = const, while y1 + y2 ≠ y3 + y4, and thus individual ym are not constant. □
Proof of Lemma 2.
Proof of Proposition 6. Part (i) is an immediate implication of Lemma 2. This is because if both |$\epsilon ^1,\epsilon ^2>\frac{1}{2}$|, then losses are lower when thinking in terms of spending for any given form of information. Therefore, whatever information strategy the agent chooses when thinking in terms of consumption, the agent can generate a higher objective when thinking in terms of spending by replicating the same information strategy.
For |$\epsilon ^1>\frac{1}{2}$|, all three differences between the two objectives in expressions (39)–(41) are positive for sufficiently large N. In each of the expressions, the second term is independent of N, while the first term is positive for |$\epsilon ^1>\frac{1}{2}$| and increases linearly with N. □
Proof of Proposition 7. We replicate the proof of Proposition 3 as long as the ordering of eigenvalues is the same regardless of whether thinking in terms of spending or consumption.
To complete the proof, we show that for s = 0 choosing spending is optimal, and for s = 1 choosing quality is optimal. If s = 0, then expression (44) is strictly less than ym − p(ym), which is exactly expression (45) with Ym = p(ym), so fixing spending dominates fixing the quality. If s = 1, then expression (45) is strictly less than q(Ym) − Ym, which is exactly expression (44) for ym = q(Ym), so fixing the quality dominates fixing spending. □
Footnotes
Formerly titled “An Attention-Based Theory of Mental Accounting.” We thank Yuriy Gorodnichenko, Paul Heidhues, Rupal Kamdar, Marc Kaufmann, David Laibson, Andrei Shleifer, Philipp Strack, four anonymous referees, and seminar audiences for excellent comments. This project has received funding from the European Research Council (grant agreement numbers 678081 and 788918) and the UNCE project (UNCE/HUM/035).
Formally, let the disutility of spending be 1 + μ, with μ capturing the uncertainty in the value of money, and let |$x_1^{\prime }$| and |$x_2^{\prime }$| be the independent taste shocks. The agent’s utility is then |$(\overline{x}+ x_1^{\prime }) y_1 + (\overline{x}+ x_2^{\prime }) y_2 -\frac{y_1^2}{2} - \frac{y_2^2}{2} - \theta y_1 y_2 - (1+\mu ) y_1 - (1+\mu ) y_2$|. Setting |$x_1 = x_1^{\prime } - \mu$| and |$x_2 = x_2^{\prime } - \mu$| gives expression (1), where x1 and x2 are now positively correlated.
To see that with full attention the marginal propensity to consume can be as low as |$\frac{1}{2}$|, suppose that |$\rm{y_+^{max}} = \frac{2(\overline{x}-1)}{1+ \theta }$|, and note that optimal unconstrained total consumption is |$y_+ = \frac{2( \overline{x}-1)+ x_+}{1+\theta }$|. Since x+ has mean zero, the constraint binds with probability one-half.
For some previous applications of rational inattention, see Veldkamp (2006); Mackowiak and Wiederholt (2009); Woodford (2009); Luo and Young (2014); Caplin and Dean (2015); Matějka and McKay (2015); and Matějka (2016). See Mackowiak, Matějka, and Wiederholt (2018) for a review. Recent papers by Afrouzi and Yang (2019), Fulton (2017), Miao, Wu, and Young (2019), and Verstyuk (2019) also solve models of multidimensional rational inattention. These papers demonstrate that the agent may prefer lower-dimensional signals but do not identify implications for consumption patterns.
Sims (2003) shows that in a multidimensional rational-inattention model with entropy costs, it is optimal for an agent with our linear-quadratic consumption utility to collect Gaussian signals; hence, we simply assume that the agent does so. In addition, the entropy of a Gaussian distribution with variance-covariance matrix Σ is a constant plus |$\frac{\log |\Sigma |}{2}$|.
Note that we have introduced the notion of categories merely to facilitate the definition of the substitutability matrix Θ and the statement of our results; we do not presume that the agent thinks of goods in the same category separately from other goods.
Relatedly, our formal framework, in which the agent solves a complex optimization problem, may suggest a view of consumers going through a conscious, elaborate, and precise thought process to arrive at their budgets and strategies to spend. As with many other models in microeconomic theory, a more realistic interpretation is that consumers approximate the solution through trial and error or other means and make a habit of strategies that have worked in the past.
See Fact 2 in Appendix B for a proof.
See Corollary 3 in Appendix B for a proof.
Technically speaking, at the budgeting stage it is necessary for the agent to understand exactly what she will do at the execution stage. Interpreted more broadly, it is sufficient for her to have (perhaps based on experience) a reasonable understanding of the average value of increasing her budget. Relatedly, when the agent acquires information piecemeal, the question arises how costly each piece of information is. A simple assumption consistent with our formulation is that at each stage, the cost of information equals |$\frac{\lambda }{2} ( \log |\Sigma _0| - \log |\Sigma _1|)$|, where Σ0 and Σ1 are the variance-covariance matrices of her previous and new beliefs, respectively.
In the illustrative example above, the complementarity of the two investments relies on the asset returns being negatively correlated. Even for uncorrelated or somewhat positively correlated asset returns, investments are complements if the investor’s disutility from variance is strictly concave. Furthermore, with a precautionary savings motive, risky and safe investments are often complements.
In one study, for instance, students chose snacks to be received at the end of three different classes. When choosing the snacks one at a time at the beginning of these classes, 9% of students chose three different snacks. But when simultaneously choosing three snacks ahead of time, 64% of students chose three different snacks. To the extent that in the sequential-choice condition students know more about their momentary tastes, the former choices better reflect their true preferences.
For a formal statement and proof, see Corollary 4 in Appendix B.
In our main application for naive diversification, retirement investment, decisions are naturally denominated in dollar amounts invested into funds. This corresponds to prices that equal 1, so there is no price uncertainty.
This type of question is almost never considered in the literature on rational inattention, but one notable exception is Reis (2006). Analyzing a consumption-savings problem in which a consumer does not know her wealth perfectly, Reis asks whether the consumer prefers to make decisions in terms of consumption or savings.
Because the utility function is quadratic, optimal perfect-information consumption is linear in prices. Hence, the above directional derivatives are constant. Furthermore, due to the symmetry of the problem, ε1 does not depend on m and n.
The intuition for the qualifier ε1ε2 > 1 derives from the central property of thinking in terms of spending, that it forces consumption to be sensitive to unanticipated price shocks. If the optimal (full-information) price elasticity of category consumption, ε2, is low, then it is important for the agent to pay attention to the price level to reduce unanticipated price shocks. Hence, in that case paying attention to the price level is more important than paying attention to price differences, so trading off only within the category is never optimal.
An additional potential reason for the last pattern is that lower-income individuals have higher costs of attention. Indeed, an experiment by Mani et al. (2013), and a variety of other evidence discussed in Schilbach, Schofield, and Mullainathan (2016), indicate that poverty impedes cognitive performance, which means that lower-income people have a higher λ. A classical account, however, would suggest that lower-income people have a lower opportunity cost of time due to lower wages, and therefore have a lower λ.
The simplest formal way to make this point is to assume mean-variance preferences over spending. Start with our model above, in which the agent does not care about the variance of spending. Suppose that the agent wants to set budgets, and is indifferent between thinking in terms of spending and thinking in terms of consumption. Now suppose that she also derives disutility from the variance of her spending. Then, her achievable level of utility is strictly lower if she thinks in terms of quantities, but the same if she thinks in terms of spending. This means that she strictly prefers to think in terms of spending.
In terms of consumption utility, our unit-demand model is an extremely simplified variant of our basic model: there is one product in each category with separable, linear utility. Unlike in our basic model, however, prices are now nonlinear. An alternative approach would have been to assume that prices are linear and (to ensure interior solutions) that the utility function is concave. With this alternative specification, the condition for when fixing spending is optimal implicates both the pricing function and the utility function in a way that we do not find transparent. With our specification, the condition implicates only the pricing function and is transparent.
See also Hsiaw (2018) for related work and Pagel (2017) for other implications of loss aversion for consumption-savings behavior.
A related finding in political economy is the flypaper effect (Hines and Thaler 1995): when a local government receives a grant earmarked for a specific purpose, it tends to increase spending on that purpose by the amount of the grant.
Entropy of a multivariate N(μ, Σ) of dimension n is |$\frac{n}{2}(\log (2\pi )+1)+\frac{1}{2}\log |\Sigma |.$|
Equation (28) implies that actions yi and yj are more positively correlated if more uncertainty is reduced in the direction of vk, for which the signs of entries |$v_i^k$| and |$v_j^k$| are the same.
We provide all the detailed computations on request.
References
Verstyuk, Sergiy,