Revealed preference analysis and bounded rationality

The principle of revealed preference is the backbone of structural empirical work on consumer demand. It focuses on what we can learn about the processes by which economic agents make decisions, using observed choices and minimal auxiliary assumptions. Classical revealed preference methods assume that choices and preferences are stable and consistent, but many studies have found violations of these assumptions in both consumer data and experimental settings. Recent research effort extends far beyond the axiomatic characterization of neoclassical choice models. New behavioural theories explain these violations in a theoretically founded way, considering data consistency and preference recoverability for a wide class of behavioural models. This article reviews some of the themes emerging from the recent literature. JEL classifications: C43, D11.


Introduction
Revealed preference analysis is an elementary, structural approach to the analysis and interpretation of data using economic theory. Following the principles of Samuelson (1938) and Houthakker (1950), this approach aims to minimize the use of untestable assumptions in empirical work by examining observable properties of data, rather than unobservable statistical functions (such as demand curves). In its purest form, revealed preference analysis uses empirical inequalities describing the properties that data must satisfy if (and only if) they are consistent with a particular model, without requiring functional form assumptions or statistical model estimation.
Unlike structural econometrics, the revealed preference approach does not need error terms. Structural econometrics uses economic theory to develop formal mathematical statements about causes (explanatory variables) and effects (endogenous variables), and then attempts to fit these economically meaningful relationships to data. Causes, both observed (x) and unobserved (g), are linked to effects (y) by the structural equation y = f(x, g; h), where h represents a set of unknown parameters or functions.
To account for the fact that the model does not perfectly explain the data, structural econometrics uses unobservable error terms (e). When combined with statistical assumptions on the joint distribution of (x, g, e), the resulting empirical model is capable of rationalizing any set of observables. Structural econometric modelling thus relies on the correct specification of this statistical aspect, because the source and properties of the econometric errors (e) can have a critical impact on the estimation and interpretation of results. This approach can be challenging because economic theories, which are completely deterministic, generally cannot provide guidance on the appropriate statistical model to use, and it is difficult to learn about the properties of the unobserved (g) and the unobservable (e) from the data alone.
Empirical revealed preference is also based on economic theory, but is entirely different from the 'y = f(x, g, h, e)' framework used in structural econometrics. Instead, it uses systems of inequalities which depend on observables and not on the form of structural functions or unobservables. While statistical error terms and functional form assumptions may be added, in contrast to the econometric approach, these are not an essential requirement of the method. In other words, empirical revealed preference focuses on what we can learn solely by combining economic theory with observable features of the world.
The revealed preference characterization of many canonical neoclassical microeconomic models (such as utility maximization by price-taking consumers and profit maximization by competitive firms) dates back to contributions by Paul Samuelson. In general, revealed preference conditions exploit the consistency implied by some essential stability in the relevant model. For utility maximization, the combination of rationality and stability of preferences implies that 'if an individual selects batch one over batch two, he does not at the same time select two over one' (Samuelson, 1938, p. 65).
However, over the last few decades, the field of behavioural economics has questioned this stability assumption, proposing models that allow neoclassically inconsistent behaviour. In many cases, behavioural models have revealed preference characterizations and thus can be non-parametrically falsified given finite data. The revealed preference analysis of behavioural models of individual choice was the subject of a recent conference, 'The Symposium for the Analysis of Revealed Preference', which took place in Oxford in the summer of 2018. This article reviews some of the themes and recent developments discussed at the conference.
After laying out the history and basic tools of the revealed preference approach (Sections 2 and 3, respectively), we focus on explanations of departures from neoclassical models, considering both the role of errors (Section 4) and behavioural explanations (Section 5). Finally, we discuss the importance of accounting for the relatively permissive nature of many behavioural models when comparing the empirical performance of competing models.

A brief history of classical revealed preference
The theory of revealed preference was first introduced by Samuelson in his 1938 Economica article. He argued that the testable implications of utility theory should be based on axioms about demand (which is observable) rather than on axioms about preferences (which are unobservable), and sought to derive these implications without using a specific utility function to represent preferences. Samuelson published an Econometrica article later that year, demonstrating that utility analysis contains empirically meaningful implications and utility theory is therefore refutable. Houthakker (1950) extended Samuelson's work by introducing the Strong Axiom of Revealed Preference (SARP), which works by exploiting transitivity. He also demonstrated that SARP is a necessary and sufficient condition for data to be consistent with the maximization of well-behaved preferences subject to the consumer's budget constraint, thus establishing a close link, also recognized by Samuelson (1950), between the axioms about demand and the axioms about preferences.
However, both Samuelson and Houthakker assumed that the researcher could observe the entire demand system. If the entire demand system were observable, then the question of testable implications could easily be addressed by checking the conditions required for a utility function to be recoverable from a demand function.[1] In reality, the researcher only observes a finite number of observations. The structural econometric approach addresses this data deficit by fitting functions to the finite data, allowing demand to be evaluated at any point, including points not observed in the data. However, estimating these functions consistently requires untestable auxiliary statistical assumptions. Testing any hypothesis of maximizing behaviour is therefore really a joint hypothesis test: consistency with the behaviour of interest and with the statistical assumptions made. This joint test is the essence of the Duhem-Quine problem in the philosophy of science (Quine, 1951; Duhem, 1954).

Afriat (1967) derived testable implications for settings with only a finite number of observations. While this might seem a small contribution, it was actually the key to liberating applied work from the need to rely on assumed properties of unobserved and unobservable quantities. Diewert (1973) made very important further contributions: he analysed the assumptions about the utility function that were needed for a solution to the utility maximization problem to exist, an issue not considered by Afriat (1967). He discovered that the assumption of local non-satiation is crucial for a solution to exist; otherwise, any set of observed choices can trivially be rationalized by resorting to thick indifference curves. Diewert (1973) also demonstrated that a linear programme can be constructed to solve the questions of testability (checking for the existence of a solution) and recoverability (recovering preferences from the observed behaviour).
This characterization was the first step in translating Afriat's theoretical work into practical applications. Subsequent research focused on solving or simplifying many important computational aspects of revealed preference analysis, and on considering the problem of extrapolation (using observed behaviour to predict choices at unobserved prices).[2]

The classical revealed preference treatment of consumer demand
In the canonical model of consumer choice, a utility-maximizing, price-taking individual faces an exogenous linear budget constraint:

max_q u(q) subject to p_t′q ≤ x_t.

Suppose we observe some data on prices and choices {p_t, q_t}_{t=1,...,T} for an individual consumer. Two questions immediately arise. First, if the data were generated by the model, what properties must they necessarily have? Secondly, if the observed data have these properties, is that sufficient to know that the data could have been generated by the model? The answer to both of these questions lies in the central result of the revealed preference literature: Afriat's Theorem.

[1] The question of whether there exists a utility function that generates an observed demand function, and the conditions under which the utility function can be recovered, is known as the 'integrability' problem and was first considered by Antonelli (1886). Required conditions include symmetry of the Slutsky matrix.

[2] See, for example, Varian (1982, 1983, 1985, 1988).
Definition 1: A utility function u(q) rationalizes the data {p_t, q_t}_{t=1,...,T} if u(q_t) ≥ u(q) for all q such that p_t′q_t ≥ p_t′q.

Afriat's Theorem. The following statements are equivalent:

(A) There exists a continuous, non-satiated, and concave utility function u(q) that rationalizes the data {p_t, q_t}_{t=1,...,T}.

(B.1) There exist numbers {U_t, λ_t > 0}_{t=1,...,T} that satisfy the Afriat inequalities U_s ≤ U_t + λ_t p_t′(q_s − q_t) for all s, t ∈ {1,...,T}.

(B.2) The data {p_t, q_t}_{t=1,...,T} satisfy the Generalized Axiom of Revealed Preference (GARP).

(C) There exists a non-satiated utility function u(q) that rationalizes the data {p_t, q_t}_{t=1,...,T}.
There are now several different proofs of Afriat's Theorem in the literature;[3] we present a simple heuristic version in our Online Appendix.
Afriat's Theorem is exhaustive: it summarizes all of the empirical implications of the canonical model without making any additional functional form assumptions. The empirical content of Afriat's Theorem lies in conditions (B.1) and (B.2), which provide two equivalent but computationally distinct ways of testing whether a data set is consistent with the model of interest. (B.1) is more commonly used in the literature than (B.2) because of its linearity: the existence or non-existence of a solution can be verified efficiently using standard linear-programming methods, such as the simplex algorithm. This efficient decidability is an important aspect of Afriat's Theorem.[4] The linear programme has two possible outcomes: either a basic feasible solution is found, or the feasible set is empty (the required numbers {U_t, λ_t > 0}_{t=1,...,T} do not exist). The conditions of the model are satisfied if (and only if) the linear programme has a basic feasible solution. Note that since the data are finite, there will be many possible utility functions that are consistent with the observed behaviour, so if a solution exists, it is usually not unique.
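The feasibility check in condition (B.1) is simple to implement with an off-the-shelf LP solver. The following is a minimal illustrative sketch (the function name and data are our own, not from the literature); the normalization λ_t ≥ 1 is harmless because the inequalities are homogeneous of degree one in (U, λ):

```python
import numpy as np
from scipy.optimize import linprog

def afriat_feasible(prices, quantities):
    """Feasibility test of the Afriat inequalities (condition (B.1)):
    do there exist U_t and lambda_t > 0 with
        U_s <= U_t + lambda_t * p_t.(q_s - q_t)  for all s, t?
    lambda_t is normalised to >= 1, which is harmless because the
    system is homogeneous of degree one in (U, lambda)."""
    P, Q = np.asarray(prices, float), np.asarray(quantities, float)
    T = len(P)
    A, b = [], []
    for s in range(T):
        for t in range(T):
            if s == t:
                continue
            row = np.zeros(2 * T)
            row[s], row[t] = 1.0, -1.0            # U_s - U_t
            row[T + t] = -(P[t] @ (Q[s] - Q[t]))  # -lambda_t * p_t.(q_s - q_t)
            A.append(row); b.append(0.0)
    res = linprog(np.zeros(2 * T), A_ub=np.array(A), b_ub=np.array(b),
                  bounds=[(None, None)] * T + [(1.0, None)] * T,
                  method="highs")
    return res.status == 0   # 0 = feasible solution found
```

With a zero objective, the solver simply reports whether the feasible set is non-empty, which is exactly the test described above.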
Condition (B.2) is the Generalized Axiom of Revealed Preference (GARP).
Definition 2: A data set {p_t, q_t}_{t=1,...,T} satisfies GARP if and only if there exist relations R0 and R such that

(i) for all t, s: if p_t′q_t ≥ p_t′q_s, then q_t R0 q_s;
(ii) for all t, s, u, ..., r, v: if q_t R0 q_s, q_s R0 q_u, ..., q_r R0 q_v, then q_t R q_v;
(iii) for all t, s: if q_t R q_s, then p_s′q_s ≤ p_s′q_t.

[3] See, for example, Varian (1982).

[4] In contrast, if the conditions in Afriat's Theorem were nonlinear in the unknowns, the problem would become NP-hard and decidability would not be guaranteed: if a solution does not exist, the researcher may not be able to firmly establish this fact in a finite amount of time.
GARP imposes a no-cycling condition on individual choice. Condition (i) states that the bundle q_t is directly revealed preferred over q_s if q_t was chosen when q_s was also attainable. Condition (ii) imposes transitivity on the revealed preference relation R. Finally, condition (iii) states that if bundle q_t is revealed preferred to bundle q_s, then q_s cannot be more expensive than q_t.
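Conditions (i)-(iii) translate directly into a short computational test. A minimal sketch in Python (illustrative names; the transitive-closure step is the Floyd-Warshall algorithm, which runs in time proportional to T³):

```python
import numpy as np

def satisfies_garp(prices, quantities, tol=1e-9):
    """Check GARP via conditions (i)-(iii): build the direct revealed
    preference relation R0, take its transitive closure with the
    Floyd-Warshall algorithm, and test the no-cycling condition."""
    P, Q = np.asarray(prices, float), np.asarray(quantities, float)
    T = len(P)
    cost = P @ Q.T                                 # cost[t, s] = p_t . q_s
    # (i): q_t R0 q_s iff q_s was affordable when q_t was chosen
    R = cost.diagonal()[:, None] >= cost - tol
    # (ii): transitive closure, O(T^3)
    for k in range(T):
        R = R | (R[:, k][:, None] & R[k, :][None, :])
    # (iii): if q_t R q_s, q_t must not be strictly cheaper than q_s at p_s
    for t in range(T):
        for s in range(T):
            if R[t, s] and cost[s, t] < cost[s, s] - tol:
                return False
    return True
```

The small tolerance guards against spurious violations from floating-point rounding in the budget comparisons.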
Verifying that a data set satisfies GARP involves checking the consistency of all the direct revealed preference relations, which for large data sets is more difficult than checking condition (B.1) due to the number of observations and direct revealed preference relations.[5] While Afriat's Theorem applies to individual (unitary) behaviour, it is extendable to models of household (collective) behaviour, where observed behaviour is the result of a joint (and usually unobserved) decision-making process that occurs between household members. Cherchye et al. (2007, 2011a,b, 2015) derive revealed preference conditions for various models of collective consumption behaviour and show how to set-identify the aspects of the unobserved within-household decision-making process.

Errors as explanations for violations
Revealed preference conditions are deterministic: the subject either satisfies or does not satisfy the conditions. This distinction is binary because the data-generating process is a deterministic model rather than a stochastic process. Important statistical considerations are present nonetheless. For example, we might only get to see a sample of an individual's behaviour from the population of potential choices which that person could have made, the data may be subject to measurement errors, and/or the individual may make optimization errors which are stochastic in nature. For all of these reasons, revealed preference test results require careful interpretation.

Measurement error
As discussed in Section 1, empirical revealed preference does not require error terms. However, errors cannot be ignored once we apply revealed preference conditions to data. Measurement error is an obvious consideration, but identical issues arise when revealed preference conditions are applied to statistical objects like estimates of aggregate consumption, as in Browning (1989), or non-parametric Engel curves, as in Blundell et al. (2003, 2008). In these cases, the price and quantity data we observe are a function of a random variable. This introduction of a statistical element to empirical revealed preference forms an important link between revealed preference and structural econometrics. To illustrate the case of classical (mean-zero, symmetric) additive measurement error, consider the model

q_t = q*_t + e_t,

where q*_t denotes the true values of demands and e_t is a vector of classical measurement errors. Now the data-generating process is the deterministic economic model describing individual behaviour, plus a stochastic model describing the behaviour of the errors.

[5] The Floyd-Warshall algorithm (Warshall, 1962) is an efficient method of finding the adjacency matrix of the transitive closure of a relation R on a finite set. Varian (1982) showed that this algorithm could solve the problem of finding the transitive closure very efficiently (in time proportional to T³). See Rockafellar (1970) for more details on convex analysis and solving optimization problems.
Suppose that we are interested in the null hypothesis that the true data {p_t, q*_t}_{t=1,...,T} satisfy GARP. If the revealed preference conditions fail for the observed demands q_t, we can use the following minimum distance criterion function to generate a restricted estimator of demand, q̃_t, such that {p_t, q̃_t}_{t=1,...,T} satisfies GARP:

min_{q̃} Σ_t (q_t − q̃_t)′ Ω_t⁻¹ (q_t − q̃_t) subject to {p_t, q̃_t}_{t=1,...,T} satisfying GARP,

where the weight matrix Ω_t⁻¹ is the inverse of the covariance matrix of the demands. The solution to this minimization problem leads to demands q̃_t which satisfy the revealed preference restrictions and are unique almost everywhere. When evaluated at the restricted demands, this distance function also provides a test statistic for the revealed preference conditions, which can be used for conducting conservative inference.[6]

While this approach is reasonable for data with classical measurement error, the advent of scanner-level data, in which consumers use electronic scanners to record their purchases at the bar-code level, introduces non-classical measurement problems. These issues include non-recording or under-recording of expenditures on certain items (such as tobacco, alcohol, or illegal drugs), and the modification of behaviour during the period of data collection (e.g. by purchasing healthier foods). Problems like these seem at least as important as classical measurement error (if not more so), and there is currently no clear way to deal with them.

Optimization errors
Instead of asking whether the outcome of an empirical revealed preference test represents a statistically significant departure from a data-generating process into which a stochastic element has been introduced, we can also ask whether the results of the test represent an economically significant departure from rational choice. A consumer who violates revealed preference conditions appears to 'waste' money by buying a consumption bundle when a cheaper bundle is available that is also revealed preferred to it. The cost-efficiency measure (e) suggested by Afriat (1973) captures the smallest amount of this 'wastage' (as a proportion of the overall budget) consistent with the given demand data. In other words, we allow the consumer a 'margin of error' of proportion (1 − e).[7] If e < 1, there are violations of GARP in the original data. This measure, known as Afriat's Critical Cost Efficiency Index or Afriat's Efficiency Index, provides a simple way of measuring the size of a GARP violation, and does so in units which are easy to understand and interpret economically.[8] Afriat's Efficiency Index has been used in the experimental literature to explain GARP violations, including settings where subjects make choices over a range of goods under various budget sets and/or relative prices.[9] Typically, researchers set a critical value of this index, denoted e*, such that they would consider any e > e* a 'small' or 'tolerable' violation of GARP. Varian (1991), for instance, suggests a value of e* = 0.95.
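Afriat's index can be computed by combining a GARP test at a given efficiency level e with a bisection search over e. The sketch below is illustrative (our own function names and example data); deflating the revealed preference relation to e·p_t′q_t ≥ p_t′q_s is one standard way to implement the efficiency adjustment:

```python
import numpy as np

def garp_at_efficiency(P, Q, e, tol=1e-9):
    """GARP with budgets deflated by efficiency level e in (0, 1]:
    q_t is revealed preferred to q_s only if e * p_t.q_t >= p_t.q_s."""
    T = len(P)
    cost = P @ Q.T                                 # cost[t, s] = p_t . q_s
    R = e * cost.diagonal()[:, None] >= cost - tol
    for k in range(T):                             # transitive closure
        R = R | (R[:, k][:, None] & R[k, :][None, :])
    return not any(R[t, s] and cost[s, t] < e * cost[s, s] - tol
                   for t in range(T) for s in range(T))

def afriat_efficiency_index(prices, quantities, iters=40):
    """Largest e such that the data satisfy GARP at efficiency level e
    (Afriat's Critical Cost Efficiency Index), found by bisection."""
    P, Q = np.asarray(prices, float), np.asarray(quantities, float)
    if garp_at_efficiency(P, Q, 1.0):
        return 1.0                                 # no violations: e = 1
    lo, hi = 0.0, 1.0                              # passes at lo, fails at hi
    for _ in range(iters):
        mid = (lo + hi) / 2
        if garp_at_efficiency(P, Q, mid):
            lo = mid
        else:
            hi = mid
    return lo
```

A value of 1 means the data satisfy GARP exactly; values below the chosen critical level e* flag economically significant violations.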

Errors as 'field-dressings'
If a data set violates revealed preference conditions, it is almost always possible to 'rescue' the model by claiming that errors of some kind are present. However, these errors are typically unobserved and unobservable, so they cannot constitute a true explanation; instead, they represent the inability of the model to explain the data.[10] Errors can thus be (mis)used as a 'field-dressing' for a failing theory.
Even if an individual's data satisfy GARP, there is an inferential question which arises regardless of these other issues: how justified might we be in concluding that this individual is 'really' a utility maximizer? This question relates to the problem of underdetermination in the philosophy of science, which refers to the fact that data can be consistent with various possible hypotheses, not just the particular one(s) we consider.
Our assessment of a GARP result depends on factors such as the number of observations and the ability of the GARP test to detect non-utility-maximizing behaviour. If we have few observations, or budget constraints which do not cross often, then we should be hesitant to conclude that this person must be a utility maximizer simply because she passed GARP. Crawford and De Rock (2014) place these issues in the following Bayesian framework. Let U denote the event that the individual is a utility maximizer, and let G denote the event that the data generated by this individual satisfy GARP. Bayes' Theorem gives us the posterior probability that the individual truly is a utility maximizer given that they satisfy GARP:

P(U|G) = P(G|U)P(U) / [P(G|U)P(U) + P(G|¬U)P(¬U)],

where P(U) is the prior and P(G|¬U) is the probability that the individual passes GARP when they are not a utility maximizer. Assuming no optimization or measurement error, P(G|U) = 1 because a utility maximizer will pass GARP. Thus, the posterior probability simplifies to

P(U|G) = P(U) / [P(U) + P(G|¬U)(1 − P(U))].

If the GARP test is insensitive to non-rational behaviour (e.g. if the budget constraints do not cross frequently) then P(G|¬U) ≈ 1 and P(U|G) ≈ P(U), so a successful GARP test should not alter our prior beliefs very much. If the GARP test is very sensitive, then P(G|¬U) ≈ 0 and P(U|G) ≈ 1, so the GARP test then gives us rational grounds to believe that the individual is truly a utility maximizer.
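The simplified posterior is straightforward to compute. A minimal sketch (the function name is illustrative), under the same assumption that P(G|U) = 1:

```python
def posterior_utility_maximizer(prior, p_garp_given_not_u):
    """P(U | G) via Bayes' theorem, assuming P(G | U) = 1: a true
    utility maximizer always passes GARP (no optimization or
    measurement error)."""
    return prior / (prior + p_garp_given_not_u * (1.0 - prior))

# An uninformative test (P(G|not-U) = 1) leaves the prior unchanged;
# a sharp test (P(G|not-U) near 0) pushes the posterior towards 1.
```

For example, with a prior of 0.5, a test that non-maximizers pass with probability 1 returns a posterior of 0.5, while one they pass with probability 0 returns a posterior of 1.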
The term P(G|¬U) is key to the way in which we interpret a successful empirical GARP test. Whether it is close to one or close to zero partly depends on the alternative data-generating process (¬U). In the next section, we consider some alternatives to rational choice models which are now being analysed from a revealed preference perspective.

'It Takes a Model to Beat a Model': behavioural violations of revealed preference
The behavioural economics literature provides a rich and growing set of alternatives to the benchmark rational choice model. While standard neoclassical models often perform surprisingly well, they cannot rationalize all behaviour. Systematic violations of rationality have been documented in experimental and field settings in a range of circumstances. For example, households hold high pre-retirement wealth, high credit card debt, and low liquid assets simultaneously (Angeletos et al., 2001), and one's willingness to take risks is strongly influenced by the way that choices are framed (Kahneman and Tversky, 1979).
Behavioural models attempt to explain these phenomena by relaxing particular assumptions made by standard models, such as stable preferences or complete attention. The necessary and sufficient conditions for behaviour to be rationalized by behavioural models can be derived in most circumstances. Chambers et al. (2017) prove that any model that can be written as a set of universal statements will have necessary and sufficient revealed preference conditions, defined only in terms of observables. Thus, the revealed preference approach that applies to standard models can also be used to characterize behavioural models that satisfy this condition.[11] In this section, we review the emerging literature in which revealed preference conditions are derived for models with boundedly rational agents. We organize our discussion into three parts. First, we review the recent literature that relaxes the assumption of stable, consistent preferences, including revealed preference treatments of time inconsistency and social preferences. A further strand of the literature relaxes the assumption that individuals are aware of all the available alternatives and face no computational difficulties in evaluating their complex consumption decisions; we then discuss revealed preference tests of limited attention and of cognitive costs of searching for and/or evaluating alternatives.
Finally, it is important to recognize that behavioural models will perform at least as well empirically as their neoclassical counterparts because they typically add 'free parameters' to the standard framework. Thus, data that are consistent with the standard model will necessarily be consistent with its behavioural counterpart. Falsifiability is therefore a larger concern with models of bounded rationality, because models whose conditions are impossible to violate have limited empirical usefulness. We end this section by discussing how best to compare the empirical performance of alternative theories.

Behavioural models: preferences
The benchmark model of rational choice posits stable, non-satiated, and time-consistent preferences that depend on own consumption only. Recent work considers many ways to relax this assumption.[12] In structural econometrics, unobserved preference shifts are addressed via errors-in-variables models, where the true values of the independent variables are assumed to be measured with error. Here, we review the revealed preference treatments given to non-standard preferences, specifically preferences over time (self-control problems), risk (subjective expected utility [SEU] maximization), and the utility of others (social preferences).

[11] Spiegler (2008) highlights the methodological importance of providing revealed preference characterizations of behavioural models.

Time preferences
The standard model of intertemporal choice is exponentially discounted utility (EDU), where the individual maximizes the sum of per-period utilities, discounted by a factor δ ∈ (0, 1]:[13]

Σ_{t=1}^{T} δ^{t−1} u(q_t).

This model is tractable as it summarizes an individual's time preference with a single parameter. However, it also imposes time consistency on an individual's behaviour. Time consistency is a strong assumption in reality, as there is substantial evidence that individuals exhibit present-biased preferences beyond that implied by exponential discounting.[14] For example, Benzion et al. (1989) find strong evidence against the classical hypothesis that individual discount rates are uniform across scenarios, time delays, and sums of cash flow.
The quasi-hyperbolic discounting model (Strotz, 1955; Laibson, 1997, 1998; Harris and Laibson, 2001) incorporates present bias through an additional discounting parameter β ∈ (0, 1] applied to all future consumption:

u(q_t) + β Σ_{s=t+1}^{T} δ^{s−t} u(q_s).

This model encompasses the EDU case (β = 1). EDU and quasi-hyperbolic discounting have been tested in both field and laboratory settings.[15] While many studies find evidence of quasi-hyperbolic discounting (Thaler, 1981; Kirby and Herrnstein, 1995), support for quasi-hyperbolic discounting is not unanimous.[16] For example, Andersen et al. (2014) do not find evidence of significant hyperbolic discounting behaviour in an experiment where adult Danes made choices of deferred monetary payments. Andreoni and Sprenger (2012) also find no evidence of present bias or hyperbolic discounting in their experimental subjects. Benhabib et al. (2010) find clear evidence for present bias, but little evidence for quasi-hyperbolic discounting. However, this model is generally considered to fit observed behaviour better than the EDU model in a wide range of settings, and it can explain observed phenomena that the EDU model cannot. For example, hyperbolic discounting can explain why consumers use illiquid assets as a commitment device to constrain their own future choices (Laibson, 1997), and why consumption at retirement drops sharply, particularly for low-wealth households (Laibson, 1998).

[12] See DellaVigna (2009) for a summary of the literature on non-standard preferences, beliefs, and decision-making.

[13] While it is theoretically possible for an individual to value the future more than the present, empirically, δ falls within the (0, 1] interval across a wide range of contexts and time frames (Frederick et al., 2002).

[14] See Frederick et al.
When combined with concerns over self-image and/or an inability to commit to future consumption, time inconsistency may lead individuals to engage in seemingly irrational behaviour, such as avoiding information, incomplete learning, creating obstacles to their own performance, and 'self-deception' through selective memory, limited awareness, and other forms of belief manipulation (Benabou and Tirole, 2000; Carrillo and Mariotti, 2000). Dynamically inconsistent preferences can also explain undesired procrastination (Fischer, 1999; Rabin, 1999a,b, 2001), and indulgence in addictive activities beyond that predicted by the Becker and Murphy (1988) model of rational addiction (Carrillo, 1998; O'Donoghue and Rabin, 2000; Gruber and Koszegi, 2001). Time inconsistency can rationalize observed choices from a contractual menu, and inertia in those choices (despite low or no switching costs), that the standard utility maximization model cannot explain, in areas ranging from gym membership (Della Vigna and Malmendier, 2006) and credit card offers (Shui and Ausubel, 2004) to 401(k) plans (Madrian and Shea, 2001) and welfare programme participation (Fang and Silverman, 2009). In macroeconomic analysis, the interaction between hyperbolic discounting and staggered nominal contracts can explain the significant long-run trade-off between inflation and unemployment that many well-known studies have documented (Graham and Snower, 2008).
Revealed preference conditions have been developed to test the EDU and quasi-hyperbolic discounting models on experimental data (Dziewulski, 2018; Echenique et al., 2020) and consumption survey data (Blow et al., 2017). As with standard models, the derived conditions are necessary and sufficient, and can be checked computationally using observable data only.
For the EDU model, Browning (1989) showed that the following condition is necessary and sufficient for consistency: the data are consistent with the EDU model if and only if there exist {u_t}_{t=1,...,T}, λ > 0, and δ ∈ (0, 1] such that

u_s ≤ u_t + (1/δ^{t−1}) λ ρ_t′(q_s − q_t) for all s, t ∈ {1,...,T},

where ρ_t^k = p_t^k / Π_{i=1}^{t} (1 + r_i) for all goods k = 1,...,K. These inequalities are similar to the Afriat inequalities (B.1) in Afriat's Theorem, except that the marginal utility of income (λ) is constant over time, prices are discounted by the per-period interest rates (r), and the term involving prices and quantities is discounted by a common discount factor δ. These inequalities are also linear conditional on δ, so they can be tested using a fine grid search on the interval (0, 1].
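Because the inequalities are linear conditional on δ, the grid-search test reduces to a sequence of LP feasibility problems. A minimal illustrative sketch (function name, grid, and data are our own; the discounted prices ρ_t are computed from per-period interest rates, and the single λ is normalized to at least 1, which is harmless because the system is homogeneous in (u, λ)):

```python
import numpy as np
from scipy.optimize import linprog

def edu_consistent(prices, quantities, rates, grid=None):
    """Grid-search test of the EDU conditions: for each candidate delta,
    check by LP feasibility whether there exist u_1..u_T and a single
    lambda >= 1 with
        u_s <= u_t + (1 / delta**(t-1)) * lambda * rho_t.(q_s - q_t),
    where rho_t are the interest-rate-discounted prices."""
    P, Q = np.asarray(prices, float), np.asarray(quantities, float)
    r = np.asarray(rates, float)
    T = len(P)
    rho = P / np.cumprod(1.0 + r)[:, None]   # rho_t = p_t / prod_{i<=t}(1+r_i)
    if grid is None:
        grid = np.linspace(0.05, 1.0, 20)
    for delta in grid:
        A, b = [], []
        for s in range(T):
            for t in range(T):
                if s == t:
                    continue
                row = np.zeros(T + 1)
                row[s], row[t] = 1.0, -1.0                 # u_s - u_t
                # minus the discounted price term (t is 0-indexed here)
                row[T] = -(rho[t] @ (Q[s] - Q[t])) / delta ** t
                A.append(row); b.append(0.0)
        res = linprog(np.zeros(T + 1), A_ub=np.array(A), b_ub=np.array(b),
                      bounds=[(None, None)] * T + [(1.0, None)],
                      method="highs")
        if res.status == 0:
            return True, float(delta)       # feasible at this delta
    return False, None
```

A finer grid tightens the test at the cost of more LP solves; in practice a few hundred grid points is cheap for the data sizes typical of consumption panels.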
For the quasi-hyperbolic discounting model, Blow et al. (2017) show that the revealed preference conditions for consistency are identical to those of the EDU model. In other words, without further restrictions, the quasi-hyperbolic discounting model and the EDU model are non-parametrically indistinguishable with only data on prices and quantities. However, with restrictions on the functional form of u(q_t), it is possible to distinguish between these two models.
In an experimental setting, individuals can be presented with many different intertemporal paths for incomes and prices. However, with consumption survey data, we only observe a single realized consumption path for each individual in a data set. The fact that we can only see what individuals actually do and not what they would or could have planned to do has implications both for the nature of the revealed preference test and the ability to discriminate between naive versus sophisticated consumers. Furthermore, the nature of errors and the level of data aggregation will differ between the two settings (Adams et al., 2014).
Conclusions about the relative empirical performance of the EDU and quasi-hyperbolic models vary according to the data set in question. Echenique et al. (2020) apply their tests to experimental data collected by Andreoni and Sprenger (2012) and Carvalho et al. (2016), and find little support for either the EDU or the quasi-hyperbolic framework: approximately 20-30% of respondents passed both tests in the two experimental designs. However, with field consumption data, the empirical performance of the quasi-hyperbolic model appears much stronger than that of the EDU model. Blow et al. (2017) find that only 2% of households satisfy EDU in a consumption panel of Spanish households, while 84% of the sample satisfies the conditions for quasi-hyperbolic discounting. The predictive success (explanatory ability) of the quasi-hyperbolic model is also larger than that of EDU (0.06 compared with 0.00).

Risk preferences
Decision making under risk and uncertainty is another active area of behavioural economics research that has been given a revealed preference treatment. Under the SEU model, consumers make decisions to solve the problem:

max_{q ∈ B(p, I)} Σ_{s∈S} μ_s u(q_s)    (1)

given prices p ∈ R^S_{++}, income I > 0, and the set S representing all possible states of the world. The set B(p, I) is the budget set defined by p and I.
Early revealed preference characterizations of expected utility maximization took state probabilities, $\mu_s$, to be known.17 The revealed preference conditions for the SEU model are an extension of the Afriat inequalities: the data are consistent with the SEU model if and only if there exist utility numbers $\{u^s_t\}$ and multipliers $\{\lambda_t > 0\}_{t=1,\dots,T}$ such that

$u^j_s \le u^k_t + \dfrac{\lambda_t}{\mu_k} p^k_t (q^j_s - q^k_t)$ for all $s, t \in \{1, \dots, T\}$ and all $j, k \in \{1, \dots, S\}$.

The assumption of known state probabilities has been relaxed in the more recent literature. For example, Echenique and Saito (2015) derive the Strong Axiom of Revealed Subjective Expected Utility (SARSEU), applicable to purchases of a state-contingent pay-off at varying prices and income levels, which has a GARP-type characterization but is non-linear.18 In general, the non-linearity of revealed preference tests can create computational difficulties with implementation, as there is no efficient algorithm for determining whether a solution to general non-linear inequalities exists. However, Echenique and Saito (2015) prove that there is an efficient algorithm for determining whether or not a data set satisfies SARSEU. While most of the literature focuses on characterizing risk-averse behaviour due to the concavity properties of the resulting utility function, Polisson et al. (2020) provide a characterization that does not require convexity of preferences over the state-contingent consumption space, thereby allowing for risk-loving behaviour.

17 See, for example, Green and Srivastava (1986), Varian (1983, 1988), Diewert (2012), and Kubler et al. (2014).
18 Data are consistent with SARSEU if and only if, for any sequence of pairs $(q^{k_i}_{s_i}, q^{k'_i}_{s'_i})_{i=1,\dots,n}$ for which (i) $q^{k_i}_{s_i} > q^{k'_i}_{s'_i}$ for all $i$, (ii) each $s$ appears as $s_i$ (on the left of a pair) the same number of times it appears as $s'_i$ (on the right), and (iii) each $k$ appears as $k_i$ (on the left of a pair) the same number of times it appears as $k'_i$ (on the right), the product of the price ratios satisfies $\prod_{i=1}^{n} \big( p^{k_i}_{s_i} / p^{k'_i}_{s'_i} \big) \le 1$.
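As a sketch of how the SEU conditions with known probabilities can be taken to data, feasibility of the inequalities above is a linear program in the utility numbers and multipliers. The helper below is an illustrative implementation (the function name and structure are not from the cited papers), using `scipy`'s linear programming routine:

```python
import numpy as np
from scipy.optimize import linprog

def seu_consistent(p, q, mu):
    """Feasibility test for the SEU Afriat-type inequalities with known
    state probabilities mu: search for utility numbers u[t, s] and
    multipliers lam[t] >= 1 such that, for all observations s, t and
    states j, k:
        u[s, j] <= u[t, k] + (lam[t] / mu[k]) * p[t, k] * (q[s, j] - q[t, k]).
    Illustrative sketch only."""
    p, q, mu = np.asarray(p, float), np.asarray(q, float), np.asarray(mu, float)
    T, S = p.shape
    n_u = T * S              # one utility number per (observation, state)
    n = n_u + T              # plus one multiplier per observation
    rows, rhs = [], []
    for s in range(T):
        for j in range(S):
            for t in range(T):
                for k in range(S):
                    # u[s,j] - u[t,k] - (lam[t]/mu[k]) p[t,k] (q[s,j]-q[t,k]) <= 0
                    row = np.zeros(n)
                    row[s * S + j] += 1.0
                    row[t * S + k] -= 1.0
                    row[n_u + t] = -p[t, k] * (q[s, j] - q[t, k]) / mu[k]
                    rows.append(row)
                    rhs.append(0.0)
    # lam[t] >= 1 is a harmless normalization: the system is homogeneous
    bounds = [(None, None)] * n_u + [(1.0, None)] * T
    res = linprog(np.zeros(n), A_ub=np.array(rows), b_ub=np.array(rhs),
                  bounds=bounds, method="highs")
    return res.status == 0  # status 0 means a feasible solution was found
```

For example, two observations generated by maximizing $0.5\log q_1 + 0.5\log q_2$ on the budget line should pass the test.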
A number of violations of expected utility maximization have been documented in the experimental literature. For example, consumers are often observed choosing lotteries that are stochastically dominated. Polisson et al. (2020) extend their revealed preference results on expected utility to rank-dependent utility (Quiggin, 1982) and disappointment aversion (Gul, 1991) to establish whether these relaxations of SEU might better rationalize behaviour. They find that allowing for disappointment aversion does little to improve the proportion of behaviour that can be rationalized relative to the SEU model. However, rank-dependent utility might provide a superior characterization of consumer behaviour; the pass rate of the rank-dependent utility model is about double that of SEU, and its predictive success is also (statistically) significantly higher, conditional on passing GARP.

Social preferences
In the benchmark rational choice model, utility only depends on one's own consumption. However, many lab experiments have shown that dictators engage in reciprocal behaviour in several different contexts.19 In the dictator games of Hoffman et al. (1996) and Bohnet and Frey (1999), a considerable proportion of dictators offered positive amounts to their recipients. In Ben-Ner et al. (2004), when the dictator game was played again between the same subjects but with roles switched, the amount sent back was strongly correlated with the amount received, even though the interaction was one-off and anonymous. Cox (2004) finds evidence of altruistic other-regarding preferences, trust, and reciprocity. Experiments indicate that when recipients are friends, dictators share more with those to whom they are more closely connected (Goeree et al., 2010). Similarly, dictators give more to close friends than to strangers, and these differences are strongest when the giving is not anonymous (Leider et al., 2009).
There are many reasons why people might give to others, so it can be difficult to disentangle whether transfers to others reflect altruistic or strategic motives. Individuals may be averse to unfairness or inequality and may be willing to sacrifice some material pay-off to achieve a more equitable outcome (Fehr and Schmidt, 1999). People may give to others because they experience a 'warm-glow' contribution to their own utility from doing so (Andreoni, 1989, 1990). Other reasons include reciprocity, i.e. rewarding friendly actions or punishing hostile actions at a cost (Rabin, 1993; Camerer and Fehr, 2004), and reciprocal altruism, i.e. giving to generate or relieve an obligation (Camerer and Fehr, 2004; Cox, 2004; Leider et al., 2009; Ligon and Schechter, 2012). Andreoni and Miller (2002) and Porter and Adams (2016) combine experimental and revealed preference methods in order to analyse preferences for giving. Using a modified dictator game in which respondents are endowed with a certain number of tokens (their budget) and the price of giving is varied across rounds, they test the consistency of preferences for giving and recover heterogeneity in those preferences. In their framework, transfers to others ($\pi_o$) enter as a separate good in the respondent's utility function, in addition to the respondent's own consumption ($\pi_s$). Both studies test for consistency of preferences over own and others' consumption, and hypotheses concerning the precise functional form of the utility function. For example, $u(\pi_s, \pi_o) = \pi_s + \pi_o$ represents utilitarian preferences, $u(\pi_s, \pi_o) = \min\{\pi_s, \pi_o\}$ represents Rawlsian preferences, and $u(\pi_s, \pi_o) = \pi_o$ represents completely selfless preferences. Both studies find a considerable degree of rationality in altruistic behaviour: over 90% of respondents in both papers passed GARP. A considerable degree of heterogeneity in preferences for giving was also identified. For example, approximately a quarter of respondents were purely selfish in Andreoni and Miller (2002), with the remainder displaying varying degrees of altruism.

19 See Levitt and List (2007) for a discussion of the external (real-world) validity of lab experiments.
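To make these functional forms concrete, a small illustrative helper (hypothetical code, not taken from either paper) computes each type's predicted transfer in a dictator game where the dictator splits $m$ tokens subject to the budget $\pi_s + p\,\pi_o = m$, with $p$ the price of giving:

```python
def predicted_transfer(m, p, kind):
    """Predicted transfer to the recipient, pi_o, for stylized preference
    types, given budget pi_s + p * pi_o = m. Illustrative only."""
    if kind == "selfish":                 # u = pi_s: keep everything
        return 0.0
    if kind == "selfless":                # u = pi_o: give everything
        return m / p
    if kind == "rawlsian":                # u = min(pi_s, pi_o): equalize payoffs
        return m / (1.0 + p)
    if kind == "utilitarian":             # u = pi_s + pi_o: corner solution
        return m / p if p < 1.0 else 0.0  # indifferent at p == 1; keep all here
    raise ValueError(f"unknown type: {kind}")
```

For instance, with 10 tokens and a price of giving of 1, a Rawlsian dictator transfers enough to equalize payoffs at 5 tokens each, while a utilitarian gives everything only when the price of giving is below 1.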

Behavioural models: constraints
It is not only a consumer's preferences that might deviate from the rational choice benchmark, but also the constraints under which choices are made. For example, standard models assume that individuals optimize by taking all available information into account. However, there is strong evidence from the marketing and psychology literature that cognitive constraints force individuals to focus on a subset of information or potential choices (a 'consideration set'), and then choose optimally from this reduced set.20 Consumer choice theory has adopted this idea of consideration sets, with recent papers in the decision theory literature providing revealed preference characterizations of these models. Demuynck and Seel (2018) provide revealed preference conditions for consumption survey data to be consistent with the formation of consideration sets, and present evidence that consideration set models are a better empirical fit than the standard utility maximization model. In the most basic model of limited consideration, the individual maximizes $u(q_t)$ over the budget set in period $t$ subject to the additional constraint that $q^k_t = 0$ for all $k \notin I_t$, where $I_t$ is a subset of the full set of goods $G$. The standard model is thus a special case in which $I_t = G$.
The associated revealed preference conditions are an extension of the Afriat inequalities (B.1): the data are consistent with choice under consideration sets (limited consideration) if there exist $\{u_t, \lambda_t > 0\}_{t=1,\dots,T}$ and vectors $P_t \in \mathbb{R}^n_{++}$, with $P^k_t$ equal to the observed price $p^k_t$ for every purchased good $k$, such that $u_s \le u_t + \lambda_t P_t (q_s - q_t)$ for all $s, t$. In other words, the subset of goods purchased and the subset of goods not purchased should both satisfy GARP. For goods that are not purchased, the consumer's behaviour does not need to be consistent with the actual prices, but there must be some beliefs about prices that can explain the choice not to purchase them. Assumptions about these beliefs place additional restrictions on $P_t$ and provide extra revealed preference conditions for testing.
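Since GARP is the workhorse condition throughout this survey, a minimal checker is sketched below. This is a generic illustrative implementation (not code from any of the cited papers): it builds the direct revealed preference relation, takes its transitive closure with Warshall's algorithm, and searches for a strict cycle.

```python
def garp_satisfied(p, q):
    """Check GARP for observed prices p[t] and bundles q[t]. Bundle t is
    directly revealed preferred to s when p[t].q[t] >= p[t].q[s]; GARP
    fails if the transitive closure of this relation contains a pair
    where the 'dominated' bundle was strictly cheaper."""
    T = len(p)
    dot = lambda x, y: sum(a * b for a, b in zip(x, y))
    # direct revealed preference relation (with a small numerical tolerance)
    R = [[dot(p[t], q[t]) >= dot(p[t], q[s]) - 1e-9 for s in range(T)]
         for t in range(T)]
    # transitive closure via Warshall's algorithm
    for k in range(T):
        for i in range(T):
            for j in range(T):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    # strict violation: t revealed preferred to s, yet q[t] strictly cheaper at p[s]
    for t in range(T):
        for s in range(T):
            if R[t][s] and dot(p[s], q[s]) > dot(p[s], q[t]) + 1e-9:
                return False
    return True
```

In the limited-consideration test above, a checker of this kind would be applied to the purchased sub-bundles at observed prices, and to the non-purchased goods at the candidate belief prices $P_t$.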
There is a growing interest in going beyond using revealed preference conditions to test whether behaviour is consistent with a given model, to using these conditions to recover the underlying structural functions from choice data. In a discrete choice setting, Masatlioglu et al. (2012) derive necessary and sufficient conditions under which a general model of consumer choice with deterministic preferences and deterministic consideration sets is consistent with the data. Building on this framework, Cattaneo et al. (2020) place minimal structure on the stochastic process generating consideration sets and show that preferences and consideration set probabilities are partially identified from choice probabilities. Manzini and Mariotti (2014) prove that consideration probabilities and a consumer's preference relation can be uniquely identified from individual choice data on the basis of the revealed preference conditions if we can observe a choice from every possible non-degenerate subset of feasible alternatives.

20 See Roberts and Lattin (1997) for an overview of findings in this area.
Some of this literature considers the process by which individuals form consideration sets.  provide a revealed preference characterization of sequential and potentially incomplete search, which can be applied to data on choice processes. Using an elegant experimental design,  find that consumer choice behaviour is consistent with simple models of sequential search and cannot be plausibly rationalized by an alternative theory in which consumers consider the entire choice set but make mistakes when choosing. Caplin and Dean (2015) provide a revealed preference characterization of rational inattention (the choice of whether or not to acquire information in the presence of search costs), finding support for this hypothesis in their experiment: respondents actively modified their attention in line with the predictions of an optimizing model. Finally, Dardanoni et al. (2020) develop revealed preference conditions for a model of choice where consumers are subject to cognitive constraints that limit their ability to consider the full set of feasible products. Their approach assumes standard market share data, so in principle can be applied to a wide range of non-experimental settings.
Another strand of literature focuses on revealed preference characterizations of heuristics in riskless choice. In real-life and experimental settings, individuals' choices are often found to violate the Weak Axiom of Revealed Preference (WARP), which states that if alternative a is directly revealed preferred to b (i.e. chosen from some menu in which b is also present), then b can never be selected from any other menu in which both a and b are present. For example, individual preferences may contain pairwise cycles where a is directly revealed preferred to b, b is directly revealed preferred to c, but c is directly revealed preferred to a. Individuals may also exhibit menu dependence, where a is directly revealed preferred to b and to c in pairwise choices, but not when given the three options a, b, and c together.
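Detecting WARP violations in menu-choice data is a simple finite check. A minimal sketch (hypothetical helper; menus are represented as frozensets mapped to the chosen alternative):

```python
def warp_violations(choice):
    """List ordered pairs (a, b) that violate WARP: a is chosen from a
    menu containing b, while b is chosen from another menu containing a."""
    violations = []
    items = list(choice.items())
    for m1, a in items:
        for m2, b in items:
            if a != b and b in m1 and a in m2:
                violations.append((a, b))
    return violations
```

Menu dependence, for example, surfaces immediately: choosing a from {a, b} but b from {a, b, c} produces a violation pair.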
To explain these violations, Kalai et al. (2002) propose the concept of rationalization by multiple rationales, in which the consumer applies a different rationale for each subset of her choice problem. The revealed preference test involves finding the minimum number of rationales that rationalize the observed choices. In a similar vein, the rational shortlist method (RSM) of Manzini and Mariotti (2007) supposes that the consumer has multiple rationales, applied sequentially, to shortlist the alternatives in a choice set. Testing for consistency for RSM involves checking whether 'weak WARP' is satisfied: if a is chosen both when only b is also available and when b and other alternatives fc 1 ; . . . ; c k g are available, then b is not chosen when a and a subset of fc 1 ; . . . ; c k g are available. Weak WARP is also a necessary and sufficient condition for the 'categorize then choose' heuristic of Manzini and Mariotti (2012), in which consumers categorize alternatives before choosing the best item in that category, and Cherepanov et al.'s (2013) model of rationalization, in which individuals choose their preferred alternative from the subset of choices that have a subjectively appealing rationale. In some cases, these heuristics are observationally equivalent to neoclassical utility maximization: Mandler et al. (2012) show that if consumers choose by proceeding sequentially through a checklist of desirable properties, they act as if they were maximizing a utility function.
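The weak WARP condition also lends itself to a direct finite check. A sketch under the definition above (hypothetical helper, using the same frozenset-to-choice representation):

```python
def satisfies_weak_warp(choice):
    """Check 'weak WARP': if a is chosen from the pair {a, b} and from a
    larger menu containing b and other alternatives, then b must never be
    chosen from any menu nested between the pair and that larger menu."""
    pair_menus = [m for m in choice if len(m) == 2]
    for big, a in choice.items():
        for pair in pair_menus:
            if choice[pair] != a or a not in pair or not pair <= big:
                continue
            b = next(x for x in pair if x != a)
            # a beats b head-to-head and in the big menu; look for a menu
            # in between from which b is chosen
            for mid, chosen in choice.items():
                if pair <= mid <= big and chosen == b:
                    return False
    return True
```

A rationalizable choice function passes, while a choice pattern in which a beats b both pairwise and in a large menu, yet b is chosen from an intermediate menu, fails.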

Predictive success: evaluating models
As behavioural models are generalizations of standard models, they will necessarily be able to rationalize a larger proportion of the data. In other words, the pass rate (the percentage of individuals whose behaviour is consistent with the revealed preference conditions) will always be weakly higher for a behavioural model than for its standard counterpart. Thus, a higher pass rate is not sufficient to conclude that the behavioural model is a 'better fit' than the standard model.
Instead, the pass rate must be evaluated in the light of the model's restrictiveness. A common measure of restrictiveness is power, which is defined as the probability of rejecting a false null hypothesis (Bronars, 1987). Here, the null hypothesis is that observed behaviour is consistent with the model in question. In other words, does the individual satisfy the conditions of the model because the conditions were extremely difficult or even impossible to violate? The alternative hypothesis is usually taken to be uniform random behaviour (Becker, 1962), as there is no obvious definition of 'non-utility-maximizing' behaviour. Zero power means that any behaviour is rationalizable so it is impossible to reject the null hypothesis, while a power close to one means there are only a few behaviours that satisfy the model's conditions.
Predictive success combines pass rate and power into a single measure of a model's explanatory ability. This measure and its properties were first proposed by Selten (1991) and applied to revealed preference analysis by Beatty and Crawford (2011). There are many functions of pass rate and power that could be used, but Selten (1991) provided an axiomatic argument for a simple difference measure,21 the most commonly used function being

Pass rate $-$ (1 $-$ Power).

Using this function, predictive success is close to 1 when the data are consistent with a very restrictive model, and close to $-1$ when the data are inconsistent with a very unrestrictive model.
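To illustrate how power and predictive success are computed in practice, the sketch below (generic illustration; the function names are not from the cited papers) simulates Bronars-style random consumers who spend uniform random budget shares at the observed prices and incomes, records how often GARP rejects them, and combines the result with a pass rate via Selten's difference measure. A compact GARP check is included so the sketch is self-contained.

```python
import random

def garp_ok(p, q):
    """GARP check: the transitive closure of the revealed preference
    relation must contain no strict cycle."""
    T = len(p)
    dot = lambda x, y: sum(a * b for a, b in zip(x, y))
    R = [[dot(p[t], q[t]) >= dot(p[t], q[s]) - 1e-9 for s in range(T)]
         for t in range(T)]
    for k in range(T):
        for i in range(T):
            for j in range(T):
                R[i][j] = R[i][j] or (R[i][k] and R[k][j])
    return not any(R[t][s] and dot(p[s], q[s]) > dot(p[s], q[t]) + 1e-9
                   for t in range(T) for s in range(T))

def bronars_power(prices, incomes, n_sims=500, seed=0):
    """Share of simulated uniform-random consumers whose demands violate
    GARP at the observed prices and incomes (Bronars-style power)."""
    rng = random.Random(seed)
    n_goods = len(prices[0])
    violations = 0
    for _ in range(n_sims):
        demands = []
        for p, y in zip(prices, incomes):
            w = [rng.random() for _ in range(n_goods)]  # random budget shares
            total = sum(w)
            demands.append([y * wi / (total * pi) for wi, pi in zip(w, p)])
        violations += not garp_ok(prices, demands)
    return violations / n_sims

def predictive_success(pass_rate, power):
    """Selten's difference measure: pass rate - (1 - power)."""
    return pass_rate - (1.0 - power)
```

A model that everyone passes but that random behaviour also always passes scores 0; full consistency with a maximally restrictive model scores 1, and full inconsistency with a vacuous model scores -1.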
Predictive success can be used to compare the explanatory ability of models that place different restrictions on behaviour and may be nested. Therefore, when comparing the degree to which alternative models can rationalize a data set, we advocate that predictive success rather than the pure pass rate should be the metric of interest.

Conclusion
Many classes of behavioural models have revealed preference characterizations, meaning that they are non-parametrically falsifiable given finite data sets, despite relaxing various assumptions of neoclassical utility maximization. This article reviews recent behavioural characterizations of time preferences, risk preferences, social preferences, and cognitive constraints on decision-making. We argue that these models are preferable to simply ascribing departures from the neoclassical benchmark to statistical errors, because errors fail to explain the underlying cause of the violation, whereas behavioural models propose well-formed and theoretically founded alternatives. Even accounting for the relative permissiveness of these behavioural models, there is evidence that they outperform their standard counterparts in some contexts. Therefore, the new behavioural revealed preference characterizations seem to offer much promise for empirical research.

21 Selten argued that any measure of predictive success should satisfy the following properties: (i) monotonicity (consistency with a restrictive model should get a higher value than inconsistency with an unrestrictive model), (ii) equivalence (consistency with a model that can rationalize any behaviour should be given the same value as inconsistency with a model that cannot explain any observed behaviour), and (iii) aggregability (the group-level measure of predictive success should equal a weighted sum of individual-level measures).

Supplementary information
Supplementary material (the online appendix containing a heuristic proof of Afriat's Theorem) is available on the OUP website.

Funding
This work was supported by Oxford Economic Papers; the Royal Economic Society; Nuffield College, Oxford; and the British Academy.