## Abstract

Using a large sample of individual investor records over a nine-year period, we analyze survival rates, the disposition effect, and trading performance at the individual level to determine whether and how investors learn from their trading experience. We find evidence of two types of learning: some investors become better at trading with experience, while others stop trading after realizing that their ability is poor. A substantial part of overall learning by trading is explained by the second type. By ignoring investor attrition, the existing literature significantly overestimates how quickly investors become better at trading.

Academics have recently shown an interest in the investment behavior and performance of individuals, a field that has been called “household finance” by Campbell (2006). Over the past decade, several researchers have documented a number of behavioral biases among individual investors. More recently, researchers have found evidence that some individual investors are more informed or skilled than others.1 Considering these findings, it is natural to ask how skilled or informed investors acquire their advantage. For example, do investors learn by trading? If so, to what extent do investors improve their ability, and to what extent do they learn about their inherent ability? And how quickly do investors learn? In this paper, we exploit trading records to study both average investor performance and the strength of the behavioral bias known as the disposition effect.2 We correlate performance and disposition with investor experience and investor survival rates to determine whether and how investors learn by trading.

Motivated by the existing economics literature on learning, we consider two specific ways in which investors can learn. First, in the spirit of classical learning-by-doing models (Arrow 1962; Grossman, Kihlstrom, and Mirman 1977), investors might improve their ability as they trade (“learning by doing”). Second, as investors trade, they might realize that their inherent level of ability is low and decide to stop trading (“learning about ability”). Although these types of learning are different, they are not mutually exclusive, and the primary contribution of our paper is to separate these types of learning empirically and estimate the importance of each. Our results provide robust evidence for both types of learning, but the magnitudes of the learning estimates suggest that most of the learning by trading occurs as individuals learn about their own ability and low-ability investors stop trading. This implies that, by ignoring investor attrition, the existing literature substantially overstates how quickly investors become better at trading.

To clarify the model of learning we have in mind, consider the case of an individual who decides to begin trading. The investor must decide which of the myriad sources of market information and investment advice available to him or her to take seriously. He or she could consult standard news sources, Internet sites, investment newsletters, and neighbors or friends. He or she might also consider the advice of brokers, news analysts, authors of books and magazines, and finance professors. To the extent that these sources fail to agree completely, individuals must determine how much decision weight to assign to each source. Moreover, the quality of these sources is likely to differ across individuals: some investors may know executives at a firm, while others may not. As investors begin trading, they can learn to which of the various sources they should pay more attention. This can be thought of as improving ability through learning. Investors who have access only to poor sources of information cannot improve by focusing more on particular sources. Instead, they will learn that they have no useful information and will stop trading actively, choosing instead to invest in a passive investment such as an index fund.3 This is also a type of learning, but rather than improving his or her ability, the investor learns about his or her ability.

Since investors who learn that their ability is poor should stop trading, we empirically examine whether investors learn about their ability by examining attrition in our data. Once we account for time-invariant individual heterogeneity and endogenous attrition, any improvement in investors’ performance and any reduction in their disposition effect that comes with experience should reflect a direct improvement in their ability. By tracking the survival rates, performance, and the level of the disposition effect for each individual over time, we are able to differentiate between these two types of learning. In doing so, ours is the first paper to identify and measure both types of learning.4

We test our hypotheses with a remarkable data set that includes the complete trading records of investors in Finland from 1995 to 2003, including more than 22 million observations of trades placed by households. We use these data to estimate disposition and calculate performance at the account level. Our disposition estimates indicate that a median individual in our sample is 2.8 times more likely to sell a stock when its price has risen since purchase than when its price has fallen. We exploit the panel structure of our data to examine whether individual investors learn to avoid the disposition effect and improve their performance as they trade. In particular, we estimate the mean return and the disposition effect for each account and year in our sample and relate these estimates to experience, past returns, and various demographic controls.

We identify learning about ability in the data by adjusting for survivorship and heterogeneity with a modified Heckman selection model that allows for individual fixed effects. The model is a two-stage instrumental variables model that adjusts for the possibility that the composition of the sample is endogenous while accounting for any cohort effects. We construct two variables for use as instruments: the first is an indicator of whether the individual inherited shares in the previous year due to the death of a relative, and the second is the proportion of accounts in an investor’s zip code that are held by active traders. We argue that both of these variables are likely to satisfy the necessary exclusion restrictions—that is, they are likely to affect the probability of an investor remaining in the sample, but are unlikely to affect changes in the investor’s performance or disposition effect except through their effect on survival.

Differentiating between types of learning allows us to determine in a meaningful way how quickly investors learn. Estimating learning without accounting for heterogeneity and attrition results in inflated estimates of improvement that do not correspond to the experience of any particular type of investor. Correctly measuring the speed of learning is important for a number of reasons. If investors learn quickly and there is low turnover in the population of investors, behavioral biases are unlikely to affect asset prices significantly. Moreover, if they learn relatively quickly, then the “excessive” trading documented by Odean (1999) and Barber and Odean (2001) may be justified, because investors may optimally choose to trade more actively if they know they will improve with experience. Finally, thinking about both the speed and the type of learning has implications for market efficiency. For example, if many inexperienced investors begin trading around the same time, and they learn slowly, their trading could lead to time-varying market efficiency. In fact, we find evidence that suggests that investors learn less about their ability in years with positive market returns. In this respect, our paper contributes to the theoretical literature concerned with the survival and price impact of noise traders (see, for example, DeLong, Shleifer, Summers, and Waldmann 1991; Kogan, Ross, Wang, and Westerfield 2006).

While our tests have some features in common with existing papers, they differ from the literature in a number of important respects. First, unlike other papers, our tests use measures of performance and estimates of the disposition effect that are specific to individuals, allowing us to track particular individuals over time. This permits us to control for investor heterogeneity and survivorship effects, which allows us to separate the two types of learning. It also ensures that each observation in almost all of our tests is an average or regression coefficient for one individual in one particular year. Finally, using only one observation per person per year reduces the likelihood that our standard error estimates are incorrect because of correlation among our regression residuals. Given the unique features of our data and our test methods, the results of our hypothesis tests add significantly to the literature on financial learning.

The rest of the paper is organized as follows. Section 1 provides details on our data, while Section 2 describes the hypotheses we test and our statistical methods. Section 3 discusses our results, and Section 4 concludes. We give some details about our statistical methods in the Appendix.

## Data and Methods

The data used in this study come from the central register of shareholdings in Finnish stocks maintained by the Nordic Central Securities Depository (NCSD), which is responsible for the clearing and settlement of trades in Finland. Finland has a direct holding system, in which individual investors’ shares are held directly by the NCSD. Since our data come from the NCSD, they reflect the official record of holdings and are therefore of extremely high quality. The data cover all trading in all Finnish stocks over a nine-year period. Grinblatt and Keloharju (2000, 2001a, 2001b) use a subset of the same data, comprising the first two years of our sample period.6 The data include the transactions of nearly 1.3 million individuals and firms, beginning in January 1995 and ending in December 2003. In all, more than 22 million trades by individual investors are included. On average during our sample period, individuals hold 12.6% of all equity in Finland, financial institutions hold 9.6%, the government holds 34.7%, and nonfinancial firms hold 33.4%. (Additional statistics on investor trading are provided in Table 1.)

Table 1

Summary statistics

| | Mean | 25th Pctl | Median | 75th Pctl |
| --- | --- | --- | --- | --- |
| Panel A: Entire sample (322,454 accounts) | | | | |
| Number of years with trades | 1.9 | 1.0 | 1.0 | 2.0 |
| Number of securities traded | 3.5 | 1.0 | 1.0 | 3.0 |
| Number of trades | 15.4 | 1.0 | 3.0 | 8.0 |
| Average value of shares traded, EUR | 3,447 | 808 | 1,653 | 3,310 |
| Average portfolio value, EUR | 11,588 | 1,470 | 2,794 | 5,856 |
| Age in 1995 | 39.3 | 27.0 | 39.0 | 51.0 |
| Gender (1 = female) | 0.39 | | | |
| Trades options (1 = yes) | 0.03 | | | |
| Panel B: Accounts with disposition estimates (11,979 accounts) | | | | |
| Number of years with trades | 4.4 | 3.0 | 4.0 | 6.0 |
| Number of securities traded | 22.3 | 12.0 | 18.0 | 28.0 |
| Number of trades | 222.3 | 68.0 | 117.0 | 224.0 |
| Average value of shares traded, EUR | 5,356 | 1,855 | 3,235 | 5,759 |
| Average portfolio value, EUR | 58,828 | 5,102 | 11,483 | 26,147 |
| Age in 1995 | 35.3 | 27.0 | 34.0 | 44.0 |
| Gender (1 = female) | 0.15 | | | |
| Trades options (1 = yes) | 0.17 | | | |

| Panel C: Hazard function estimates | Mean | $$t$$-stat | 10th Pctl | Median | 90th Pctl |
| --- | --- | --- | --- | --- | --- |
| $$\beta^{d}$$ | 1.32 | 6.81 | −0.36 | 1.04 | 2.57 |
| $$\beta^{r}$$ | −0.04 | −0.27 | −0.65 | 0.03 | 0.93 |
| $$\beta^{s}$$ | 0.01 | 0.46 | −0.05 | 0.00 | 0.04 |
| $$\beta^{V}$$ | 0.25 | 0.91 | −2.62 | 0.24 | 3.08 |

This table presents summary statistics for our data. Panel A includes all individual accounts in our data that started trading during the sample period. Panel B gives results just for those accounts for which we are able to estimate at least one disposition coefficient. We estimate the disposition coefficient only if an individual has placed at least seven round-trip trades in a given year. Number of trades is the total number of trades placed by an investor during the sample period. Average portfolio value is the average marked-to-market value of an investor’s portfolio using daily closing prices. Panel C reports summary statistics for the estimates of the hazard model in Equation (A2) in the text.

Our data allow us to reconstruct the portfolio of stocks held by each account on a daily basis. Using these holdings, we construct a proxy for wealth by calculating the average daily marked-to-market portfolio value for each investor. We also calculate the average value of trades placed by an investor each year. To measure sophistication, we note that investors who trade options are likely to be more familiar with financial markets. This is particularly true in our setting because many of the options in our data are granted to corporate executives as part of compensation. Therefore, while we do not include options trades in our estimates of disposition, we use whether an investor ever trades options as a proxy for sophistication. We also count the number of distinct securities traded by an investor over the sample period and use this as a measure of portfolio diversification.
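As an illustration of the wealth proxy described above, the following is a minimal sketch of how an average daily marked-to-market portfolio value could be computed from a trade history. The function name, the input layout, and the choice to average over every day in the supplied list (including days with no holdings) are assumptions made for this example, not details taken from the paper.

```python
from collections import defaultdict

def average_portfolio_value(trades, prices, days):
    """Average daily marked-to-market value of one account (illustrative).

    trades: list of (day, ticker, signed_shares); positive = buy, negative = sell
    prices: dict mapping (day, ticker) -> closing price
    days:   ordered list of trading days to average over
    """
    # Group share changes by day so holdings can be rolled forward.
    changes = defaultdict(list)
    for day, ticker, shares in trades:
        changes[day].append((ticker, shares))

    holdings = defaultdict(int)
    daily_values = []
    for day in days:
        # Apply the day's trades, then value the end-of-day position.
        for ticker, shares in changes.get(day, []):
            holdings[ticker] += shares
        value = sum(qty * prices[(day, ticker)]
                    for ticker, qty in holdings.items() if qty != 0)
        daily_values.append(value)
    return sum(daily_values) / len(daily_values)
```

Averaging only over days with a nonzero position would be an equally plausible convention; the sketch simply makes one choice explicit.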

Table 1 provides summary statistics for the new accounts in our data set. New accounts are those that place their first trade in 1995 or a subsequent year; they have no recorded initial positions. Panel A includes all new accounts that place at least one trade during our sample period (1995–2003), while Panel B gives results only for those new accounts for which we are able to estimate the disposition coefficient at least once. We attempt to estimate the disposition coefficient only if an individual has placed at least seven round-trip trades in a given year, although even with this restriction the procedure to maximize the likelihood function does not always converge. The last two rows of Panels A and B report indicator variables that take a value of one if the investor is female or trades options, respectively, and zero otherwise.

Comparing Panels A and B, it is apparent that the subset of investors for whom disposition coefficients are available differs somewhat from the larger population. By construction, the accounts in Panel B place more trades, but they also have larger portfolios, trade larger amounts of money, trade in a wider selection of securities, and are somewhat younger. Also, investors for whom we can estimate disposition are more likely to trade options (17%) than investors in the overall sample (3%). Since we are only able to estimate disposition for investors who trade with some frequency, this likely results from the fact that investors who trade options are simply more likely to trade in general.

Figure 1 shows the number of accounts (including both new and existing accounts) that place one or more trades in each year. There is considerable variation in the number of accounts placing trades over time, from a low of 54,196 accounts in 1995 to a high of 311,013 accounts in 2000. Additions of new accounts follow a similar pattern. We discuss entry and exit from the sample in more detail in the next section.

Figure 1

Participation by year

This graph shows the number of accounts that place one or more trades in each year, including both accounts that exist at the beginning of the sample and new accounts. There is considerable variation in the number of accounts placing trades over time, from a low of 54,196 in 1995 to a high of 311,013 in 2000.


### Measuring performance

Investor performance is the primary variable that we correlate with experience to test our hypotheses. Measuring the performance of individual investors is a significant challenge for a number of reasons. For example, it is not obvious how we should compare the performance of investors with different holding periods. Given the challenges associated with calculating performance, we take a straightforward approach that is nevertheless likely to capture much of the relevant information in the individual’s returns. In short, we calculate the returns earned by each purchased stock in the 30 trading days following each investor’s purchases. Importantly, we truncate this calculation window at the length of the actual holding if it is shorter than 30 days. We choose to focus on 30-day returns because the median holding period in our data is 39 trading days, but all of our findings remain unchanged if we use a 10-, 45-, or 60-day holding period. Our annual performance measure is the average of these 30-day returns. We provide details of our performance measures in the Appendix.7
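The performance measure described above can be sketched as follows. This is a minimal illustration under stated assumptions: the purchase is valued at the closing price on the purchase day, the window is also cut off where the price series ends, and the function names and input layout are hypothetical. The paper's exact construction is given in its Appendix.

```python
def purchase_return(prices, buy_idx, sell_idx=None, horizon=30):
    """Return on one purchase over `horizon` trading days, truncated at the sale.

    prices:   list of daily closing prices for the stock
    buy_idx:  index of the purchase day
    sell_idx: index of the sale day, or None if the position is still open
    """
    end = buy_idx + horizon
    if sell_idx is not None:
        end = min(end, sell_idx)       # truncate at the actual holding period
    end = min(end, len(prices) - 1)    # do not run past the available data
    return prices[end] / prices[buy_idx] - 1.0

def annual_performance(purchases, horizon=30):
    """Average truncated return over all purchases made in one year.

    purchases: list of (prices, buy_idx, sell_idx) tuples
    """
    returns = [purchase_return(p, b, s, horizon) for p, b, s in purchases]
    return sum(returns) / len(returns)
```

Changing `horizon` to 10, 45, or 60 gives the alternative windows mentioned in the text.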

### Measuring the disposition effect

The other outcome variable that we track to evaluate whether investors learn with experience is the disposition effect. We estimate the disposition effect using a hazard model estimated with daily transaction and return data in each year for each investor in our sample. Estimating the effect for each year means that each individual/year coefficient is estimated with a unique data set that is completely unaffected by the individual’s trades in other years. We estimate a hazard model of the form

$$h_{i,j}(t) = \phi_i(t)\exp\left\{\beta_i^d\,\mathrm{PriceDummy}_t + \mathrm{controls}\right\}, \qquad (1)$$

where $$h_{i,j}(t)$$ is investor $$i$$’s probability of selling position $$j$$ on date $$t$$, conditional on not having sold prior to date $$t$$. $$\mathrm{PriceDummy}_t$$ takes a value of one when the stock price on date $$t$$ is above investor $$i$$’s purchase price, and zero otherwise. Therefore, $$\beta_i^d$$ measures investor $$i$$’s susceptibility to the disposition effect, which we term the “disposition coefficient.” Details of the estimation and additional discussion are provided in the Appendix.
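To see why the disposition coefficient can be read as a multiplicative effect on the selling hazard, it may help to take the ratio of Equation (1) evaluated with the price dummy switched on and off, holding the controls fixed. This short derivation is our own illustration and uses only the symbols defined in Equation (1); the baseline hazard cancels:

$$\frac{h_{i,j}\left(t \mid \mathrm{PriceDummy}_t = 1\right)}{h_{i,j}\left(t \mid \mathrm{PriceDummy}_t = 0\right)} = \frac{\phi_i(t)\exp\left\{\beta_i^d + \mathrm{controls}\right\}}{\phi_i(t)\exp\left\{\mathrm{controls}\right\}} = \exp\left\{\beta_i^d\right\}.$$

A coefficient of zero therefore corresponds to a ratio of one (no disposition effect), while a coefficient of 1.04 corresponds to a ratio of $$e^{1.04} \approx 2.8$$.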

### Measuring experience

We measure investing experience with both the number of years that an investor has been trading and the cumulative number of trades that an investor has placed. Of course, investors may gain experience by actively trading securities and observing the results of each trade. If this is the primary way in which investors learn, then cumulative trades will predict future investment performance and the disposition effect. However, investors may also learn by observing market quantities and considering the outcomes of hypothetical trades based on, for example, a particular information source. If investors mainly learn this way, then years of experience will be a better predictor of investment performance and the disposition effect than cumulative trades.8
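The two experience measures can be sketched as below. The sketch assumes that experience in a given year counts only what was accumulated before that year, and that years of experience are measured as calendar years since the first trade; both conventions, and the function name, are assumptions made for illustration rather than details confirmed by the text.

```python
def experience_measures(trade_years):
    """Map each active year to (years_of_experience, cumulative_trades).

    trade_years: list with the calendar year of every trade one investor placed.
    Both measures count experience accumulated *before* the year in question.
    """
    first_year = min(trade_years)
    measures = {}
    cumulative = 0
    for year in sorted(set(trade_years)):
        measures[year] = (year - first_year, cumulative)
        cumulative += trade_years.count(year)  # add this year's trades afterwards
    return measures
```

For an investor who trades twice in 1998, once in 1999, and three times in 2001, the two measures diverge in 2001: three years of experience but only three prior trades.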

## Hypotheses

We test two hypotheses in this paper to examine whether individual investors learn by trading. Both hypotheses are about how the characteristics of individuals change with time, so we test both with panel data. Our first hypothesis (H1) is that investors learn to improve their ability over time; that is, “learning by doing” occurs. The second hypothesis (H2) is that investors learn about their inherent ability by trading.

H1 is in the spirit of classical learning-by-doing models (Arrow 1962; Grossman, Kihlstrom, and Mirman 1977), which argue that the productivity of agents increases with experience. The notion here is that investors improve over time since they should, for instance, be able to choose better combinations of various information sources that help them choose which stocks to hold. The main testable prediction of H1 is that performance should improve over time for investors who remain active.

H2 is best understood in the context of Mahani and Bernhardt's (2007) model, where individuals do not initially know their ability. However, individuals obtain information about their skills through their trading experience. Only a small fraction of the population is adept at identifying profitable trading opportunities. In equilibrium, each period some novice speculators enter the financial markets. Recognizing that most traders lack financial acumen, the novices first experiment on a small scale. They then use the information contained in their trading profits to decide whether to continue. Those who earn sufficient profits conclude that they are likely to be skilled and expand their speculative activities. However, some speculators do less well, conclude that they are more likely to be inept traders, and stop actively trading. The main testable prediction of H2 is that investors who experience losses stop actively trading, while those with sufficient profits remain and increase their trading intensity over time.

If either H1 or H2 is true, we should find an improvement in the average performance of the pool of investors over time. We therefore begin by testing a simple learning model to look for evidence of learning overall. If we assume a world with random attrition and no heterogeneity in ability among investors, a simple regression of performance on experience gives an estimate of the first type of learning (learning-by-doing). Alternatively, it gives an estimate of the second type of learning (learning-about-ability) if the pool of investors improves over time as the low-ability investors stop trading and those with sufficient profits continue to trade. To disentangle H1 and H2, we therefore need to examine the role played by individual heterogeneity and attrition in our data. Accounting for unobserved individual heterogeneity and attrition provides us with an assessment of H2, and any learning that remains provides an estimate of H1.
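The identification problem described above can be illustrated with a small simulation. In this hypothetical setup (not the paper's model or data), each investor's ability is fixed, so there is no learning by doing at all, and investors stop trading after any year with a negative return. The normal distributions, the quitting rule, and all parameter values are assumptions chosen purely for illustration.

```python
import random

def simulate_pooled_performance(n_investors=20000, n_years=4, seed=0):
    """Simulate investors with fixed ability who quit after a losing year.

    Returns two lists indexed by experience year: the mean observed
    performance and the mean (unchanging) ability of investors still trading.
    """
    rng = random.Random(seed)
    # Ability is drawn once and never changes: no learning by doing.
    active = [rng.gauss(0.0, 1.0) for _ in range(n_investors)]
    mean_perf, mean_ability = [], []
    for _ in range(n_years):
        perf = [a + rng.gauss(0.0, 1.0) for a in active]
        mean_perf.append(sum(perf) / len(perf))
        mean_ability.append(sum(active) / len(active))
        # Learning about ability: investors with a losing year stop trading.
        active = [a for a, p in zip(active, perf) if p > 0.0]
    return mean_perf, mean_ability
```

In this simulated world, pooled performance rises with experience even though no individual improves, because the surviving pool is increasingly composed of high-ability investors. A simple regression of performance on experience would attribute that rise to learning by doing, which is the overstatement the text warns about.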

## Results

We present our empirical findings in this section. We begin by presenting the results of our tests relating to performance and disposition in Section 3.1. The simple learning model is presented in Section 3.2, and we deal with unobserved heterogeneity in Section 3.3. Section 3.4 examines the importance and magnitude of survivorship effects, and we present evidence on changes in trading intensity and risk-taking in Section 3.5. Finally, a number of robustness tests are presented in Section 3.6.

### Performance and disposition

We start by examining trader performance to be sure that our performance measure has properties similar to those of the measures used in the literature. Previous papers, in particular Odean (1998), have shown that average investor performance is worse than that of the market portfolio. Poor performance by average investors is also a prediction of the model in Mahani and Bernhardt (2007). We test whether individuals on average outperform a market index by calculating the average return to a stock purchased by an individual investor net of the market return. Calculating this average at a 30-day horizon (truncating the window, as before, if the individual sells the stock within 30 days) yields an average return net of the market of −48 basis points. At a 60-day horizon, the average net return is −50 basis points. At both horizons, returns net of the market are negative and statistically significant.

While the average performance of individual investors is likely to be quite poor, Coval, Hirshleifer, and Shumway (2005) show that some individuals persistently outperform others. Again, performance persistence among individuals is an implication of Mahani and Bernhardt (2007). We test whether there is any persistence in investor performance in three related ways. First, we regress each investor’s average 30-day return in year $$t$$ on the investor’s average return in year $$t - 1$$ and year fixed effects. Using year fixed effects adjusts for time-series variation in average market returns. The estimated coefficient in this regression is 0.183 ($$p < 0.0001$$), which is both statistically and economically significant. Second, we calculate each investor’s average return in two disjoint time periods: 1995–1999 and 2000–2003. We then calculate the Spearman rank correlation between the returns from the first period and those from the second period. This correlation is 0.164 ($$p < 0.0001$$), again both statistically and economically significant. Our third test involves sorting investors in each year into performance quartiles, and then plotting the average performance of each of those quartiles for the next several years. This plot, which appears in Figure 2, again gives evidence that the most successful investors in the past continue to outperform the least successful investors for at least a couple of years. Results calculated with alphas instead of raw returns are qualitatively the same. These results confirm that there is a degree of persistence in individual returns.9
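The second persistence test relies on the Spearman rank correlation, which is the Pearson correlation computed on ranks. The following is a minimal, dependency-free sketch of that statistic; the function names are our own, and the inputs stand in for each investor's average return in the two subperiods.

```python
def average_ranks(values):
    """Ranks starting at 1, with tied values sharing their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = shared
        pos = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because it depends only on ranks, this measure is insensitive to a few extreme return observations, which is one plausible reason to prefer it for comparing investors across periods.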

Figure 2

Returns persistence

This figure plots the average 30-day returns earned by investors in years following their first purchase. Investors are grouped into quartiles in their first year of trading, and we then calculate average returns for each quartile in subsequent years. Returns are calculated using the approach discussed in the text, and are demeaned by calendar year, which removes the impact of any year fixed effects. Raw returns are reported here, but the results for risk-adjusted returns are not qualitatively different.


Next, we examine the other outcome variable that we will use in our analysis—the disposition effect. The disposition effect is quite large in our data. Figure 3 is a plot of the relation between the propensity to sell an existing position (the hazard ratio) and the position’s holding period return. To generate this plot, we group all investors and estimate one hazard model each year. We group the data for this procedure so that we can estimate a model with many covariates, but almost all of the tests that follow are based on individual-level results.10 The conditional hazard ratio is remarkably similar across years. The plot shows an obvious kink in the hazard ratio near zero: investors are clearly more likely to sell a stock if it has increased in value since the date of purchase. This provides strong support for the presence of a disposition effect in aggregate, consistent with the extensive literature cited above.

Figure 3

Disposition effect in aggregate

This graph shows how the propensity to sell a stock depends on the stock’s return since purchase. Each line plots the regression coefficients from one hazard regression modeling the conditional probability of selling a stock. The coefficients correspond to dummy variables for return bins ranging from [−10, −9) percent to [9, 10) percent. In each year, there is a pronounced kink near zero, and the hazard increases rapidly for positive returns.


Turning to our main individual-level disposition regressions, we require that an investor place at least seven round-trip trades in a year to be included in the sample, and we run the regression for each investor-year to generate a separate disposition coefficient whenever possible. While this filter drastically reduces our sample size, it is necessary to ensure that our coefficients of interest are identified. Panel C of Table 1 summarizes these estimates. The median disposition coefficient in the cross-section of investors across years is 1.04, which is economically quite large. This coefficient implies that the median new investor in our data is $$e^{1.04} \approx 2.8$$ times more likely to sell a stock whose price is above its purchase price than a stock that has fallen in value since the time of purchase. None of the controls is statistically significant in the cross-section.

Before we can consider whether investors learn to avoid the disposition effect, we need to argue that the effect is in fact a behavioral bias. In particular, one necessary condition for disposition to be a costly behavioral bias is that investors with more disposition have inferior investment performance. If disposition is unrelated to investment performance, investors with the effect have little incentive to learn to avoid it. To get a sense of how returns vary with disposition, we examine average investor returns across quintiles of the disposition coefficient. In this sort, the disposition coefficients are always estimated one year before the average returns are calculated, so disposition coefficients and average returns are not mechanically correlated in any way. For each quintile, Figure 4 graphs the average return earned by investors over different horizons from the purchase date. Returns are substantially higher in the lowest disposition quintile than in the highest disposition quintile. For example, in the 30 days following a purchase, a stock’s price increases 46 bp on average when bought by an investor in the lowest disposition quintile, compared with a decline of 54 bp if purchased by an investor in the highest disposition quintile. The differences between high- and low-quintile average returns range from 17 bp at the 10-day horizon to 131 bp at the 45-day horizon. These differences are both economically and statistically large, leading us to conclude that individuals with high disposition effect coefficients have relatively poor investment performance.
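The quintile sort described above can be sketched as follows. The sketch assumes each investor contributes one pair: a disposition coefficient estimated in the previous year and an average return in the current year, so the two are not mechanically linked. The function name and the equal-sized grouping rule are illustrative assumptions.

```python
def returns_by_disposition_quintile(investors):
    """Average next-year return within each quintile of the disposition coefficient.

    investors: list of (lagged_disposition_coefficient, next_year_average_return),
               with at least five entries.
    Returns a list of five averages, from the lowest (1) to the highest (5) quintile.
    """
    ranked = sorted(investors, key=lambda pair: pair[0])
    n = len(ranked)
    averages = []
    for q in range(5):
        # Integer arithmetic splits the sorted list into five near-equal groups.
        group = ranked[q * n // 5:(q + 1) * n // 5]
        averages.append(sum(r for _, r in group) / len(group))
    return averages
```

Comparing the first and last elements of the returned list gives the low-minus-high spread of the kind reported in the text.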

Figure 4

Returns by disposition quintile

This figure shows average 10-, 20-, 30-, and 45-day returns following a purchase for each disposition quintile. Returns are calculated using the approach discussed in the text. Raw returns are reported here, but the results for risk-adjusted returns are not qualitatively different. Returns earned by the lowest quintile (1) are higher than those earned by the highest quintile (5).


We can also verify that disposition is costly by mimicking the tests of Odean (1998), as presented in Table 2. The idea is to compare the returns of stocks sold at a gain with those that could have been sold at a loss but were not. To see the difference in the cost of the disposition effect to investors in the low- and high-disposition groups, we implement a difference-in-difference specification. Using a regression framework instead of simple averages allows us to include year fixed effects to absorb year-specific factors such as average market performance. We define low- and high-disposition groups as the bottom and the top quintiles of the disposition coefficient and exclude investors who do not fall into either of these two quintiles. The negative coefficient in the first row indicates that the effect documented by Odean is reversed for the low-disposition group: stocks sold for a gain by low-disposition investors subsequently underperform those that could have been sold at a loss. The difference-in-difference coefficient is significantly positive at the three horizons we consider, indicating that the effect documented by Odean (1998) is stronger for individuals in our high-disposition group than in the low-disposition group. That is, high-disposition investors sell stocks that subsequently outperform the stocks they could have sold, but low-disposition investors do not. This test reiterates that the disposition effect is costly to investors.

Table 2

Ex post return test

| Ex post return horizon | 30 Days | 45 Days | 60 Days |
|---|---|---|---|
| Sold for a gain | −0.0164 | −0.0520 | −0.0541 |
| | (0.0021)*** | (0.0024)*** | (0.0026)*** |
| High disposition | −0.0120 | −0.0191 | −0.0145 |
| | (0.0052)*** | (0.0069)*** | (0.0084) |
| Sold for a gain × high disposition | 0.0605 | 0.1063 | 0.1146 |
| | (0.0026)*** | (0.0033)*** | (0.0040)*** |
| Year fixed effects | Yes | Yes | Yes |

This table shows the results of a regression that effectively implements the same test reported in Odean's (1998) Table 6. The idea is to compare the returns of stocks sold at a gain with those that could have been sold at a loss but were not. In order to see the difference in cost of the disposition effect to investors in the low- and high-disposition groups, we implement a difference-in-difference specification. Using a regression framework instead of simple averages allows us to include year fixed effects to absorb year-specific factors such as average market performance. We define low- and high-disposition groups as the bottom and top quintile of the disposition coefficient. The negative coefficient in the first row indicates that for the low-disposition group the effect documented by Odean is reversed: stocks sold for a gain subsequently underperform. (Robust standard errors, clustered by investor, are presented in parentheses. ***, **, and * denote significance at 1%, 5%, and 10%, respectively.) The difference-in-difference coefficient is significantly positive at the three horizons we consider, indicating that the effect documented by Odean (1998) is stronger for individuals in our high-disposition group than in the low-disposition group.
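As a reading aid, the coefficients in Table 2 can be combined into the implied return gap between stocks sold for a gain and stocks that could have been sold at a loss, separately for each group. The sketch below is arithmetic only. It assumes the usual difference-in-difference reading, in which the gap for the high-disposition group is the main effect plus the interaction term; the dictionary layout and function name are ours and purely illustrative.

```python
# Coefficients copied from Table 2, keyed by ex post return horizon in days.
SOLD_FOR_GAIN = {30: -0.0164, 45: -0.0520, 60: -0.0541}
GAIN_X_HIGH_DISPOSITION = {30: 0.0605, 45: 0.1063, 60: 0.1146}


def implied_gaps(horizon):
    """Return (low-disposition gap, high-disposition gap) for a horizon.

    Assumed interpretation: the low-disposition gap is the main effect and
    the high-disposition gap adds the interaction coefficient to it.
    """
    low = SOLD_FOR_GAIN[horizon]
    high = SOLD_FOR_GAIN[horizon] + GAIN_X_HIGH_DISPOSITION[horizon]
    return low, high


for horizon in (30, 45, 60):
    low, high = implied_gaps(horizon)
    print(f"{horizon} days: low = {low:+.4f}, high = {high:+.4f}")
```

Under this reading the gap is negative for the low-disposition group and positive for the high-disposition group at every horizon, which is the pattern described in the text.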

Another necessary condition for disposition to be a behavioral bias is that disposition is a somewhat stable, predictable attribute of a particular investor. We test this conjecture by estimating the disposition effect at the investor level in adjacent time periods. Each set of estimates comes from a completely disjoint data set. Any trades that are not closed at the end of the first period are considered censored in the model estimated with first-period data, and the same trades are completely ignored in the model estimated with second-period data. We explore the stability of disposition coefficients by estimating the rank correlation of account-level disposition coefficients over the two periods, testing whether the rank correlation is significantly different from zero. We estimate the rank correlation between an investor’s disposition coefficient in year $$t$$ and their coefficient in year $$t - 1$$ to be 0.364, suggesting that there is a fair degree of persistence in the individual’s disposition coefficient. This correlation is highly statistically significant. Taken together, these results provide strong evidence that the disposition effect is a widespread and economically important behavioral bias that is present in each year of our study.
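The persistence check can be illustrated with a small rank-correlation routine. This is a minimal sketch, assuming a Spearman-type rank correlation (the text says only "rank correlation") and using made-up coefficient values; the function names and the sample data are hypothetical and are not taken from our data.

```python
from math import sqrt


def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        # Extend the block while the next sorted value ties with this one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        shared_rank = (pos + end) / 2.0 + 1.0
        for k in range(pos, end + 1):
            ranks[order[k]] = shared_rank
        pos = end + 1
    return ranks


def spearman(x, y):
    """Pearson correlation of the ranks of x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / sqrt(var_x * var_y)


# Hypothetical disposition coefficients for five accounts in years t-1 and t.
beta_prev = [0.2, 0.9, 1.1, 1.6, 2.3]
beta_curr = [0.8, 0.4, 1.9, 1.2, 2.5]
print(round(spearman(beta_prev, beta_curr), 3))
```

Working with ranks rather than raw coefficients keeps a few imprecisely estimated, extreme coefficients from dominating the persistence measure.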

### Simple learning model

We start our analysis by examining whether more experienced investors have better investment performance. We should find an improvement in the average performance of the pool of investors over time if either H1 or H2 is true. Therefore, before disentangling H1 and H2, the simple learning model helps us to assess whether either or both of these effects exist. We adjust for individual heterogeneity and survivorship in the next section. Our analysis begins by estimating a simple learning model of the form

(2)
$$y_{i,t+1} = \alpha + \beta_1 \mathrm{Experience}_{i,t} + \beta_2 \mathrm{Experience}^2 + \delta X_{i,t} + \epsilon_{i,t},$$
where the dependent variable is either the investor’s performance (average return following purchases) or the investor’s disposition coefficient. The primary coefficient of interest in this specification is on Experience. In the specification, we include Experience$$^2$$ to allow investors to learn faster during earlier years. We proxy for investors’ trading experience by either years of experience or cumulative number of trades placed, and also include a vector of controls ($$X_{i,t}$$) that might be expected to affect an individual’s performance as he or she trades, such as the individual’s average total daily portfolio value (PortVal).

Columns 1 and 2 of Table 3 report the results of the performance learning regression. When experience is measured in either number of years or cumulative trades, it is positively and significantly related to average returns. An additional year of experience increases average 30-day post-purchase returns by 41 − 4 = 37 bp, or approximately 3% at an annualized rate. An additional 100 trades increases returns at slightly over one-fourth of this rate. Again, results estimated with alpha instead of raw returns are quite similar (unreported).11 While these estimates are encouraging, the speed of learning they imply seems almost implausibly large. For instance, taking the regression parameters at face value, an investor with eight years of experience should outperform a new investor by about 22% per year. While we observe some heterogeneity in investor ability (or some performance persistence), it is not nearly large enough to justify these large coefficients.
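The one-year figure quoted above follows directly from the Column 1 coefficients. The sketch below reproduces that arithmetic under two stated assumptions: that the coefficients are in percent per 30-day horizon (so 0.414 corresponds to roughly 41 bp), and that the 30-day horizon is measured in trading days with about 252 trading days per year. The second assumption is ours and is used only to show one way the annualized figure of about 3% can arise; the variable names are illustrative.

```python
# Column 1 of Table 3 (performance regression, simple learning model).
BETA_YEARS = 0.414       # coefficient on YearsTraded, percent per 30 days
BETA_YEARS_SQ = -0.043   # coefficient on YearsTraded squared

# Gain from the first year of experience: beta1 * 1 + beta2 * 1**2.
first_year_gain_pct = BETA_YEARS + BETA_YEARS_SQ
first_year_gain_bp = first_year_gain_pct * 100.0

# Assumption: 30-day horizons are trading days, 252 trading days per year.
PERIODS_PER_YEAR = 252.0 / 30.0
annualized_gain_pct = first_year_gain_pct * PERIODS_PER_YEAR

print(f"first-year gain: {first_year_gain_bp:.1f} bp per 30 days")
print(f"annualized (simple scaling): {annualized_gain_pct:.2f}%")
```

Simple scaling is used here instead of compounding; at these magnitudes the two differ only in the second decimal place.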

Table 3

Learning models

Columns 1–4: Simple learning model. Columns 5–8: Learning model with heterogeneity.

| | (1) $$\bar R_{i,t+1}$$ | (2) $$\bar R_{i,t+1}$$ | (3) $$\beta^d_{i,t+1}$$ | (4) $$\beta^d_{i,t+1}$$ | (5) $$\bar R_{i,t+1}$$ | (6) $$\bar R_{i,t+1}$$ | (7) $$\beta^d_{i,t+1}$$ | (8) $$\beta^d_{i,t+1}$$ |
|---|---|---|---|---|---|---|---|---|
| $$YearsTraded_{t}$$ | 0.414 | | −0.050 | | 0.249 | | −0.010 | |
| | (0.160)*** | | (0.014)*** | | (0.401) | | (0.038) | |
| $$YearsTraded^{2}_{t}$$ | −0.043 | | 0.007 | | −0.022 | | 0.01 | |
| | (0.027) | | (0.002)*** | | (0.038) | | (0.003)*** | |
| $$CumulTrades_{t} ( \div 10^{2})$$ | | 0.110 | | −0.041 | | 0.058 | | −0.030 |
| | | (0.043)*** | | (0.003)*** | | (0.059)*** | | (0.005)*** |
| $$CumulTrades^{2}_{t} ( \div 10^{4})$$ | | −0.0005 | | 0.0002 | | −0.0009 | | 0.0002 |
| | | (0.0003)* | | (0.00002)*** | | (0.0004)** | | (0.00003)*** |
| $$NumSec_{t}$$ | 0.096 | 0.104 | −0.011 | −0.012 | 0.003 | 0.002 | −0.008 | −0.008 |
| | (0.012)*** | (0.012)*** | (0.001)*** | (0.001)*** | (0.024) | (0.024) | (0.002)*** | (0.002)*** |
| $$NumTrades_{t}$$ | −0.031 | −0.043 | 0.0006 | 0.005 | −0.021 | −0.029 | −0.0007 | 0.0003 |
| | (0.005)*** | (0.007)*** | (0.001) | (0.001)*** | (0.010)** | (0.010)*** | (0.001) | (0.001) |
| $$PortVal_{t} ( \div 10^{6})$$ | 0.104 | 0.099 | −0.006 | −0.003 | 0.618 | 0.587 | −0.067 | −0.065 |
| | (0.061)* | (0.058)* | (0.003)** | (0.003) | (0.602) | (0.601) | (0.602) | (0.601) |
| $$\bar R_{t}$$ | | | −0.005 | −0.005 | | | −0.002 | −0.002 |
| | | | (0.001)*** | (0.001)*** | | | (0.001)* | (0.001)* |
| Individual fixed effects | No | No | No | No | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Observations | 13,404 | 13,404 | 17,715 | 17,715 | 13,404 | 13,404 | 17,715 | 17,715 |
| $$R^{2}$$ (%) | 1.6 | 1.7 | 1.1 | 1.6 | 45.7 | 45.9 | 38.7 | 38.6 |

This table presents results for regressions of the form

$$y_{i,t+1} = \alpha_i + \beta_1 \mathrm{Experience}_{i,t} + \beta_2 \mathrm{Experience}^2 + \delta X_{i,t} + \epsilon_{i,t},$$

where the dependent variable is either the investor’s average 30-day return following purchases ($$\bar R_{i,t+1}$$) or the investor’s disposition coefficient ($$\beta^d_{i,t+1}$$). In the Simple Learning Model, the $$\alpha_i$$ are held constant across all investors, while in the model with heterogeneity they are allowed to vary. Experience is measured by either years of experience (YearsTraded) or cumulative number of trades placed (CumulTrades). $$X_{i,t}$$ is a vector of controls including the number of trades placed by the individual in a given year (NumTrades), the number of securities held by the individual in a given year (NumSec), and the individual’s average total daily portfolio value (PortVal). Data are from the period 1995 to 2003. Standard errors are in parentheses, and ***, **, and * denote significance at 1%, 5%, and 10%, respectively.

Columns 3 and 4 of Table 3 present our results for the disposition learning regressions. To reduce the weight given to disposition coefficients that are not estimated very precisely, we estimate the regressions with weighted least squares, where the weights are proportional to $$1/\widehat{{\rm{Var}}\left( {{{\rm{\beta }}^d}} \right)}$$ from our hazard regression (Equation (A2) in the Appendix). Column 3 shows that disposition declines with experience (β1 < 0). Moreover, investors tend to slow down in their learning as they gain experience since β2 > 0. Wealthier traders, investors who trade more securities, and investors who earned higher returns in the previous year all have lower levels of disposition. Column 4 indicates that an additional 100 trades reduces the disposition coefficient by 0.041, which is similar to the coefficient on Experience in Column 3. In other words, a year of experience or 100 trades has approximately the same effect on disposition. In each of the specifications, the estimated YearsTraded and CumulTrades coefficients are statistically significant at the 1% level. Economically, however, our results suggest that investors learn relatively slowly. Specifically, the estimates in Column 3 suggest that an additional year of experience corresponds to a reduction in the disposition coefficient of approximately 0.05. To provide some context for this estimate, note that the unconditional median disposition coefficient in our sample is 1.04. An extra year of experience decreases this by about 5%.
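The weighting scheme can be illustrated with a one-regressor weighted least squares fit in which each observation is weighted by the inverse of its estimated variance. This is a minimal sketch, not our estimation code: the function name, the single regressor, and the data points are hypothetical, and the actual regressions include the full set of controls and fixed effects.

```python
def wls_line(x, y, variances):
    """Fit y = a + b*x by weighted least squares with weights 1/variance."""
    w = [1.0 / v for v in variances]
    total_w = sum(w)
    x_bar = sum(wi * xi for wi, xi in zip(w, x)) / total_w
    y_bar = sum(wi * yi for wi, yi in zip(w, y)) / total_w
    s_xy = sum(wi * (xi - x_bar) * (yi - y_bar) for wi, xi, yi in zip(w, x, y))
    s_xx = sum(wi * (xi - x_bar) ** 2 for wi, xi in zip(w, x))
    slope = s_xy / s_xx
    intercept = y_bar - slope * x_bar
    return intercept, slope


# Hypothetical data: disposition coefficients by years of experience.
# The last coefficient is far off the trend but very imprecisely estimated.
years = [1, 2, 3, 4]
beta_d = [1.00, 0.95, 0.90, 2.00]
var_beta_d = [0.01, 0.01, 0.01, 100.0]

_, slope_wls = wls_line(years, beta_d, var_beta_d)
_, slope_ols = wls_line(years, beta_d, [1.0] * len(years))  # equal weights
print(f"WLS slope: {slope_wls:.4f}, equal-weight slope: {slope_ols:.4f}")
```

In this made-up example the equal-weight fit is pulled upward by the noisy observation, while the inverse-variance fit stays close to the slope implied by the precisely estimated coefficients.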

As mentioned in Section 2, these estimates could reflect the amount of learning-by-doing in a world with exogenous attrition and no investor heterogeneity or they could reflect learning-about-ability if the pool of investors improves over time due to attrition of low-ability traders. We now move on to testing H1 and H2 by examining the role played by individual heterogeneity and attrition in our data.

### Learning model with individual fixed effects

The simple learning model does not account for unobserved investor heterogeneity. As indicated in Figure 1, investor participation in our sample changes significantly over time. Consequently, cohort effects might make it appear as if there is learning even if there is none. For instance, if the number of trades placed by an investor is a noisy measure of ability, our experience variables could capture learning even if investors have constant ability and high-ability investors trade more than low-ability ones. With this in mind, our next model exploits the long time series of performance and disposition estimates available for each investor to assess the impact of time-invariant unobserved investor-level factors such as ability, education, or wealth on investor learning:

(3)
$$y_{i,t+1} = \alpha_i + \beta_1 \mathrm{Experience}_{i,t} + \beta_2 \mathrm{Experience}^2 + \delta X_{i,t} + \epsilon_{i,t},$$
where the fixed effects, $${\alpha }_{i}$$, control for unobserved individual heterogeneity.

The performance results on YearsTraded, reported in Column 5 of Table 3, suggest that an investor with one year of experience will earn 22 bp more than an inexperienced investor over a 30-day horizon. Column 6 indicates that a similar increase in returns comes from an additional 400 trades. This suggests that though investor performance improves with experience, accounting for individual heterogeneity reduces the estimates by about 50%. Results from the disposition regressions including individual fixed effects are presented in Columns 7 and 8. While YearsTraded is no longer significant in these regressions, Column 8 suggests that 100 trades reduces the disposition coefficient by approximately 0.03. Comparing these estimates with those from the regression without fixed effects, we find that the learning estimates are again reduced by roughly one-half.

These regressions assume that individuals stop trading for purely exogenous reasons. The model shows that some learning by trading is occurring, but it gives little information about the nature of that learning. In the model of Mahani and Bernhardt (2007), as low-ability investors trade they realize that their inherent level of ability is low and decide to stop trading actively. Thus, examining attrition in the sample is critical to disentangle how much of the estimates of β1 and β2 is impacted by the endogenous nature of our sample, or how much of β1 and β2 is driven by H1 and how much is driven by H2. To separate our inferences about H1 and H2, we carefully control for investor heterogeneity and survivorship in our next set of tests.

### Impact of attrition on learning estimates

Endogenous attrition can significantly affect our learning estimates, a problem best understood if we consider the decision of the investor to continue trading. To represent an investor who continues to trade when his or her performance is good and stops once he or she gets a few bad draws, define

(4)
$$s_{it} = I\left( \alpha_i + \beta_1 \mathrm{Experience}_{i,t} + \beta_2 \mathrm{Experience}^2 + \delta X_{i,t} + \nu_{i,t} > 0 \right),$$
where $$I({\cdot})$$ is an indicator variable. Here, $$s_{it}$$ equals zero if investor $$i$$ exits the sample after year $$t$$, and one otherwise. That is, $$s_{it}$$ equals zero only in the last year of investor $$i$$’s trading. The problem with both the simple learning model estimates and the fixed effect estimates is that Equation (3) assumes $$\nu_{i,t}$$ is uncorrelated with $$\epsilon_{i,t}$$, which may not be the case.
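To see why correlation between the selection and outcome errors matters, consider a stylized simulation in which no investor ever improves, yet the average return of the surviving pool rises with experience. This is an illustrative sketch only: ability is held fixed for each investor, returns are ability plus noise, and an investor exits after any year in which the realized return falls below a threshold. The distributions, the threshold, and all names are assumptions chosen for the illustration and are not calibrated to our data.

```python
import random


def mean_return_by_year(n_investors=20000, n_years=4,
                        exit_threshold=-0.5, seed=0):
    """Average realized return of active investors in each year.

    Ability is constant over time (no learning by doing); attrition is
    endogenous because investors with a bad draw stop trading.
    """
    rng = random.Random(seed)
    ability = [rng.gauss(0.0, 1.0) for _ in range(n_investors)]
    active = list(range(n_investors))
    means = []
    for _ in range(n_years):
        realized = {i: ability[i] + rng.gauss(0.0, 1.0) for i in active}
        means.append(sum(realized.values()) / len(realized))
        # Selection rule: continue trading only after an acceptable year.
        active = [i for i in active if realized[i] > exit_threshold]
    return means


means = mean_return_by_year()
for year, value in enumerate(means, start=1):
    print(f"year {year}: mean return of active investors = {value:.3f}")
```

A pooled regression of returns on experience in such data would find a positive experience coefficient even though individual ability never changes, which is the confound that the selection correction is meant to address.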

We first present some overall attrition evidence by examining the rate at which investors who are in our sample (having placed seven round-trip trades) in one year fail to place any trades during the rest of our sample period. Figure 5 shows that attrition is a significant feature of our data. Since the rest of the sample period changes from year to year, the earlier years of our sample period provide more reliable estimates of true exit rates than the later years. Approximately 25% of those traders who enter the sample in one year fail to ever trade again. Of traders who trade for two or three years, about 5% permanently exit the sample.

Figure 5

Proportion of accounts who exit

This graph shows the percentage of investors who exit in 1 year, 2 years,…, or 6 years following the first year in which they place at least seven trades. Exit is defined as placing no further trades during our sample period. Within each year group, the first bar indicates the percentage of investors who exited in the first year after placing a trade, the second bar indicates the percentage of investors who exited in the second year, and so on. Data are missing for later years because we do not know how many investors stopped trading after 2003.

We conduct the Verbeek and Nijman (1992) test to assess whether selection is a problem by including $$s_{i,t-1}$$ in the fixed effects model. If selection is not a problem, so that investor attrition is random, the coefficient estimate on the selection dummy should be insignificant. The estimated coefficient on the lagged selection dummy ($$s_{i,t-1}$$) in regression (3) is significant for both performance (coefficient 2.19, $$t$$-stat 6.44) and disposition (coefficient 0.10, $$t$$-stat 3.30). This again indicates that selection is severe in the sample and that attrition is not random.

To examine directly how much ceasing to trade (or learning about ability) affects our inferences about learning, we use a modified version of the selection model introduced by Heckman (1976). While the classic Heckman model involves a two-stage procedure—a selection model in the first stage to predict which observations will be observable in the second stage and the regression of interest with an adjustment for survivorship bias in the second stage—it does not account for individual heterogeneity. The evidence in Section 3.3 suggests that accounting for individual heterogeneity may be important to control for cohort or similar effects. Thus, we modify this procedure to account for both survivorship bias and individual heterogeneity, adopting the empirical strategy of Wooldridge (1995), which modifies the Heckman model to allow for fixed effects. Intuitively, this approach accounts for survivorship by estimating the selection model every year and including the inverse Mills ratios (the conditional probability that an individual continues to trade) of each selection equation in the learning regression model. Individual time-invariant heterogeneity is accounted for in this method by running the regression in first-differences. More concretely, the learning regression model we estimate is

(5)
$$\Delta y_{i,t+1} = \beta \Delta x_{i,t} + \rho_{96} I\left( t = 96 \right) \lambda_{96} + \ldots + \rho_{02} I\left( t = 02 \right) \lambda_{02} + \epsilon_{i,t},$$
where $$\lambda_{96}, \ldots, \lambda_{02}$$ are the inverse Mills ratios from the cross-sectional probit model (the selection model) in each of the years 1996–2002, and $$I({\cdot})$$ is an indicator variable. (1995 is omitted to avoid perfect collinearity of the model.) Including these variables in the learning regression accounts for the impact of the selection equation. Note that a joint test of $$\rho_{t} = 0$$ for $$t = 1996, \ldots , 2002$$ is a test of whether survivorship bias is a concern (Wooldridge 1995).
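For reference, the inverse Mills ratio for an observation that is selected into the sample is, in the textbook Heckman setup, the standard normal density divided by the standard normal distribution function, both evaluated at the fitted probit index. The sketch below computes it from a given index value. The function names are ours, the index values are hypothetical, and the probit estimation that would produce the index each year is not shown.

```python
from math import erf, exp, pi, sqrt


def normal_pdf(z):
    """Standard normal density."""
    return exp(-0.5 * z * z) / sqrt(2.0 * pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))


def inverse_mills(index):
    """phi(index) / Phi(index) for a selected observation.

    `index` is the fitted linear index from a probit selection model.
    For very negative indices the denominator underflows; a production
    implementation would need a more careful tail approximation.
    """
    return normal_pdf(index) / normal_cdf(index)


# Hypothetical fitted probit indices for three accounts in one year.
for index in (-1.0, 0.0, 1.5):
    print(f"index = {index:+.1f}, lambda = {inverse_mills(index):.4f}")
```

The ratio is large for accounts that the selection model regards as unlikely to remain in the sample and shrinks toward zero for accounts that are very likely to remain, so its coefficient in the second stage picks up the part of the outcome that is correlated with survival.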

The first stage uses cross-sectional probit regressions to predict whether or not the individual ceases to trade in a given period. The probit regressions include a constant, linear and quadratic experience terms, the number of different stocks the investor trades, the individual’s average return in the previous year, the cross-sectional standard deviation of the individual’s previous-year 30-day return, and a dummy variable that is one if the individual’s average daily marked-to-market total portfolio value is in the top quartile of all investors. As instruments, we use the following variables: (1) a dummy variable for whether an investor inherited shares in the previous calendar year (Inheritance); and (2) the proportion of active traders in the zip code of the individual (excluding the individual) in the previous calendar year (Activeness). As we will explain below, both these variables are likely to satisfy the necessary exclusion restrictions: they are likely to affect the probability of remaining in the sample, but are unlikely to affect changes in an individual’s performance or disposition effect except through their effect on survival.

For our first instrument, we conjecture that an individual who inherits shares is more likely to trade in the future, perhaps because his or her wealth has increased or because the new shares cause him or her to pay more attention to the stock market. Another explanation, consistent with Jin and Scherbina (2008), is that inherited shares may prompt the recipient to trade since the shares may not fit with the investor’s desired asset allocation. This satisfies the exogeneity condition since inheritance of shares from a relative is unlikely to directly affect changes in the performance or disposition effect of an individual. In the data, when an investor dies and shares are transferred to an heir, it appears as a transaction with the account of the deceased selling shares and the account of the heir purchasing shares. A special code identifies the transaction as an inheritance. In our sample, death transfers are evenly spread over the sample (ranging from 62 in 1995 to as high as 443 in 2003).

Our second instrument is based on the proportion of accounts in an individual’s zip code that place at least seven round-trip trades in the previous year. The papers that argue that an individual investor is more likely to trade if his or her neighbors are trading (Hong, Kubik, and Stein 2004; Ivković and Weisbenner 2007) imply that this measure is correlated with investor activity. This instrument satisfies the exclusion restriction, since having active traders in one’s zip code is unlikely to directly affect changes in performance or disposition effect.12 There are 1979 zip codes in the data and the measure varies between 0% and 88%.

We construct the sample to be used in the selection model as follows. An account observation is added to the selection sample if it places one or more trades in a given year. This differs from our main sample, where we require investors to have placed at least seven round-trip trades in order to estimate either the average performance or the disposition coefficient. Once an account is added, it remains in the selection sample until 2003, which is the end of our data. In some years, an account will have placed enough round-trip trades to be included in our hazard regressions, so the data will include a performance average and a disposition estimate for this account. However, each year we will also have data on many accounts for which we do not have performance and disposition estimates. If estimates are available, we treat the account as having been selected into our data. Results from the selection model, with two-step efficient estimates of the parameters and standard errors, are given in Table 4. The first-stage selection model uses 36,030 observations, while the second-stage regression (in first differences) uses only 11,959 observations in the performance regression and 16,188 observations in the disposition regressions.

Table 4

Learning with survival controls

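Columns 1–3: Selection models (1st stage). Columns 4–5: 2nd stage.

| | (1) | (2) | (3) | (4) | (5) |
|---|---|---|---|---|---|
| Dependent variable | $$\text{In-sample}_{i,t+1} = 1$$ | $$\text{In-sample}_{i,t+1} = 1$$ | $$\text{In-sample}_{i,t+1} = 1$$ | $$\Delta \bar R_{i,t+1}$$ | $$\Delta\beta^d_{i,t+1}$$ |
| $$YearsTraded_{t}$$ | | | 0.292 | 0.451 | −0.038 |
| | | | (0.199) | (0.521) | (0.042) |
| $$YearsTraded^{2}_{t}$$ | | | 0.028 | −0.009 | 0.007 |
| | | | (0.008)*** | (0.043) | (0.003)** |
| $$CumulTrades_{t} ( \div 10^{2})$$ | | | 0.341 | 0.036 | −0.021 |
| | | | (0.021)*** | (0.003)*** | (0.006)*** |
| $$CumulTrades^{2}_{t} ( \div 10^{4})$$ | | | −0.0011 | −0.001 | 0.002 |
| | | | (0.000)*** | (0.000) | (0.000)*** |
| $$\bar R_{t-1}$$ | 0.859 | 1.040 | 1.050 | | −0.211 |
| | (0.064)*** | (0.082)*** | (0.082)*** | | (0.162) |
| $$\sigma \left( \bar R_{t-1} \right)$$ | | −0.160 | −0.181 | −6.539 | 0.047 |
| | | (0.062)*** | (0.061)*** | (0.829)*** | (0.097) |
| | | (0.018)*** | (0.018)*** | | |
| Wealthy | | 0.430 | 0.427 | | |
| | | (0.021)*** | (0.022)*** | | |
| $$I$$(Inherit = 1) | | | 0.134 | | |
| | | | (0.060)** | | |
| Activeness | | | 0.289 | | |
| | | | (0.043)*** | | |
| Inverse Mills ratios: | | | | | |
| $$\rho_{96}$$ | | | | 4.868 | −0.128 |
| | | | | (0.883)*** | (0.062)** |
| $$\rho_{97}$$ | | | | 12.57 | −0.141 |
| | | | | (1.130)*** | (0.074)* |
| $$\rho_{98}$$ | | | | 24.03 | −0.388 |
| | | | | (1.350)*** | (0.143)*** |
| $$\rho_{99}$$ | | | | −7.6973 | 0.291 |
| | | | | (4.910) | (0.203) |
| $$\rho_{00}$$ | | | | 4.417 | −0.146 |
| | | | | (0.822)*** | (0.064)** |
| $$\rho_{01}$$ | | | | 7.16 | −0.092 |
| | | | | (0.636)*** | (0.050)* |
| $$\rho_{02}$$ | | | | 12.874 | −0.229 |
| | | | | (0.671)*** | (0.058)*** |
| Observations $$(N)$$ | 36,030 | 36,030 | 36,030 | 11,959 | 16,188 |
| Time fixed effects | Yes | Yes | Yes | | |
| Other controls | Yes | Yes | Yes | Yes | Yes |
| Log likelihood | −26077 | −22982 | −22958 | | |
| $$R^{2}$$ (%) | | | | 12.7 | 0.83 |
| Joint test of $$\rho_{t} = 0$$ | | | | | |
| $$F(7, N)$$ | | | | 35.80 | 12.18 |
| Pr > $$F$$ | | | | 0.000 | 0.000 |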

This table reports estimates of selection model regressions with the fixed effects modification developed by Wooldridge (1995). The regressions are of the form

$$\Delta {y_{i,t + 1}} = {\rm{\beta }}\Delta {x_{i,t}} + {{\rm{\rho }}_{96}}I\left( {t = 96} \right){\lambda _{96}} + \ldots + {{\rm{\rho }}_{02}}I\left( {t = 02} \right){\lambda _{02}} + {\epsilon _{i,t}},$$

where $$\lambda_{96}, \ldots, \lambda_{02}$$ are the inverse Mills ratios from the cross-sectional probit model (the selection model) in each of the years 1996–2002, and $$I({\cdot})$$ is an indicator variable. Including these variables in the learning regression accounts for the impact of the selection equation. (1995 is omitted to avoid perfect collinearity of the model.) A joint test of $$\rho_{t} = 0$$ for $$t = 1996, \ldots , 2002$$ tests whether survivorship bias is a concern. The first-stage probit model is estimated each year, and the inverse Mills ratios for each year are computed separately from each of these models. For brevity, the table reports estimates of the first-stage probit model with data from all of the years of the sample pooled together. The second-stage regressions are estimated with all the variables in first differences, except the inverse Mills ratios. Taking first differences accounts for individual fixed effects. The dependent variable in the second stage is either the individual’s average return in the following year, $$\bar R_{i,t+1}$$, or the individual’s disposition coefficient, $$\beta^d_{i,t+1}$$. $$X_{i,t}$$ is a vector of controls described in the text. Standard errors are in parentheses, and ***, **, and * denote significance at 1%, 5%, and 10%, respectively.

We estimate the first-stage regression for each year and construct inverse Mills ratios for each year. For brevity, we report only one set of pooled first-stage estimates with year fixed effects in Columns 1–3 of Table 4. Results for each of the years are qualitatively similar to those reported. The model in Column 1 shows that the estimate on $$\bar R_{t-1}$$ is positive and significant. This is consistent with H2: as low-ability investors trade, they learn about their inherent ability and cease trading when they get negative returns. More successful investors continue to trade actively. The estimate is also economically meaningful and suggests that, keeping other explanatory variables at their mean levels, a decrease in returns of one standard deviation increases the probability that the individual will cease to trade in the next period by around 15%. The other coefficient estimates reported in the first column of Table 4 also seem sensible: investors are more likely to remain in the sample and trade if they hold relatively diversified portfolios and have relatively more trading experience.

Following the model of Mahani and Bernhardt (2007), under H2, it is likely that individuals will cease trading when the variance of the signals they get from the performance of their trades is large—i.e., the signals are noisy. We proxy for the noise in the signals the investors receive from the performance by the variance in the returns across all positions taken by an account in the previous year and include it in the model in Column 2.13 We also include measures of ex ante investor sophistication proxied by whether they trade options or have significant wealth. The idea is that such investors are more likely to continue trading. The coefficient estimates reported on these additional variables in the second column also seem sensible: investors are more likely to remain in the sample if they receive precise signals about their performance and if they are relatively sophisticated.

Finally, in the third column, we also add our instruments to the first-stage regression. As is reported, the coefficient estimates for both of our instruments are statistically significant at the 5% level or better, and they have the predicted sign. Specifically, both inheriting shares and having higher trading activity in the zip code of an investor increase the probability that the investor will continue trading in the next period. A joint $$\chi^2_2$$ test for the significance of the instruments rejects the null hypothesis that the instruments are weak at the 1% level ($$\chi^2_2 = 49.10$$). In addition, we test the validity of exclusion restrictions related to the instruments using the likelihood ratio (LR) test proposed by Wooldridge (2003, Chapter 17).14 We find that the null hypothesis that exclusion restrictions related to the two instruments are violated is rejected at the 1% level, suggesting again that our instruments are doing a good job in explaining the selection equation. Both the instruments also have an economically significant impact. For instance, keeping other variables at their mean levels, inheriting shares increases the probability that the investor will continue trading in the next period by around 8%. Similarly, a one-standard-deviation increase in investor activity (an increase of 0.20) in the zip code of the individual in the previous year increases the probability that the investor will continue trading in the next period by about 5%.
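The size of the joint test statistic can be put in context with a short calculation. For a chi-square distribution with two degrees of freedom, the upper-tail probability has the closed form $$\exp(-x/2)$$, so no statistical library is needed. The sketch below applies that formula to the reported statistic; the function name is ours, and the calculation is a check on orders of magnitude, not a re-estimation.

```python
from math import exp, log


def chi2_upper_tail_2df(x):
    """P(X > x) for a chi-square variable with 2 degrees of freedom."""
    return exp(-x / 2.0)


REPORTED_STATISTIC = 49.10  # joint test of the two instruments

p_value = chi2_upper_tail_2df(REPORTED_STATISTIC)
critical_value_1pct = -2.0 * log(0.01)  # solves exp(-x/2) = 0.01

print(f"1% critical value: {critical_value_1pct:.2f}")
print(f"p-value of the reported statistic: {p_value:.2e}")
```

The reported statistic is several times larger than the 1% critical value of about 9.21, which is why the null is rejected comfortably.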

We find that accounting for selection has a significant impact on our learning estimates. Column 4 uses performance as the dependent variable in a regression of the form of Equation (5). Comparing the estimates in Table 4 with the simple model reported in Table 3, the coefficient on YearsTraded is no longer statistically significant, and the coefficient on cumulative trades is reduced by about 90%.15 When we use disposition as the dependent variable in Column 5, the coefficient is reduced by slightly more than 50%.

The coefficients on the inverse Mills ratios in Columns 4 and 5 are also sensible. In particular, they suggest that factors that predict which investors stay in the sample are positively correlated with future performance and negatively related with disposition. This suggests that survivorship is indeed important. The endogenous attrition of low-ability investors significantly boosts the experience coefficient estimate in the simple performance regression, and it diminishes the estimate in the simple disposition regression. The joint tests of statistical significance of the inverse Mills ratios also show that accounting for sample selection is important. In the disposition regression, the joint F-test of significance of the inverse Mills ratios yields an F-statistic of 12.18, much greater than the 1% critical value of 2.64. Similarly, in the performance regression, the F-statistic of 35.80 is much greater than the 1% critical value. Unreported results in which YearsTraded and CumulTrades are included in separate regression models yield almost the same coefficients, but regressions that include both variables are reported for brevity.
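For readers less familiar with selection corrections, the inverse Mills ratio that enters such second-stage regressions is the standard normal density divided by the standard normal distribution function, evaluated at the first-stage probit index. The following is a minimal sketch of that textbook formula only; the function name and the example index values are illustrative assumptions and are not taken from our estimation code.

```python
import math


def inverse_mills_ratio(z: float) -> float:
    """Return phi(z) / Phi(z) for a probit index z (standard normal pdf over cdf)."""
    pdf = math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)
    cdf = 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
    return pdf / cdf


# Hypothetical first-stage indices: a higher index means the investor is
# more likely to keep trading, so the correction term is smaller.
for z in (-1.0, 0.0, 1.0):
    print(f"index={z:+.1f}  inverse Mills ratio={inverse_mills_ratio(z):.3f}")
```

The ratio falls as the index rises, so investors who were predicted to be unlikely to survive but did survive receive the largest correction.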

We also explore the properties of the coefficients on the inverse Mills ratios from the disposition and performance regressions in unreported tests. We find a strong negative relation between the coefficients on the inverse Mills ratios from the performance and disposition models. This is quite reasonable, since endogenous attrition of low-ability investors should significantly increase the estimates of experience in the performance regression and should reduce them in the disposition regression. Moreover, regressing the coefficients of the inverse Mills ratios on a dummy for whether the excess market return (over the risk-free rate) is positive, we find that there is less attrition in high-return periods than in low-return periods. This is indicated by the coefficient on the return inverse Mills ratios being higher (by about 12.3; $$p = 0.07$$) and the coefficient on the disposition inverse Mills ratios being lower (by about −0.23; $$p = 0.11$$) during high-return periods. These results suggest that less learning-about-ability occurs in “good” times.

Our estimates suggest that the fraction of learning that is driven by investors learning about their inherent ability is large. After adjusting for this type of learning, the portion of learning that is due to investors learning to improve their ability over time is statistically different from zero, but not excessively large. Overall, our findings are consistent with the two hypotheses outlined in Section 2, and the evidence supporting H2 is much stronger than the evidence supporting H1.

Figure 6

This figure shows how trading intensity changes with experience. Intensity is measured as the number of trades placed (dark blue) and the total value of trades placed (light blue; in 10,000s of EUR). Results are demeaned by calendar year to adjust for year fixed effects. The results for the first partial year in which investors trade are omitted, so the positive values in the plot are all deviations from that first partial year period. We report results beginning in the investor’s first full calendar year in the sample.

#### Learning by doing: risk and the speed of learning

As outlined in the hypothesis section, H1 suggests that the productivity of agents increases with experience, for instance, by using better combinations of various information sources that help them choose the stocks to invest in. In order to examine this, we estimate our learning regressions with risk-adjusted returns (or 30-day alphas) instead of raw returns, and we regress the average factor betas of stocks purchased by investors on experience and our control variables. Our regressions also control for survivorship and individual and year fixed effects. The results of our regressions appear in Table 7. This table shows that risk-adjusted returns improve with experience, with a coefficient that is actually larger than the coefficient we estimate for raw returns. Looking at the coefficients on average factor betas makes it clear why this is the case. With more experience, investors are both improving raw returns and taking less risk, that is, purchasing stocks with lower factor betas. This result is particularly strong for the market (RMRF) and size (SMB) factor betas. These tests provide additional support for H1.

As a final confirmation that agents improve their ability with experience, we examine learning for a group of less-sophisticated (low-ability) investors. We conjecture that these ex ante low-ability investors will learn at a faster rate than their higher-ability peers. The model of Mahani and Bernhardt (2007) assumes that investors do not know their ability before they start trading. As a result, it has no clear prediction on differential speeds of learning of ex ante sophisticated and unsophisticated investors. However, if we assume that investors who are ex ante sophisticated have some notion of their ability, our first hypothesis suggests that since these investors have little to improve, they will either learn slowly or not learn at all. To test this, we need to sort investors by their level of sophistication. We rely on ex ante observable characteristics of investors that we believe are related to their financial sophistication. In particular, we consider investors ex ante likely to be relatively sophisticated if they trade options or have significant wealth.

Each row of Table 5 displays the mean of the disposition coefficient and the average returns (or performance) of each group, the simple regression coefficient of these variables on cumulative trades, estimates corrected for survivorship and unobserved heterogeneity, and the number of observations used in the calculations. Results for disposition are shown in Columns 1–3 and for returns in Columns 4–6. Looking at the table, it is clear that the mean for each of our sophistication subgroups is significantly different, and each change in the mean across subgroups is of the sign we expect. In each pair of rows of the table there is a clear difference between the unsophisticated investors, who learn to avoid the disposition effect at a rate of about 10% per year, and sophisticated investors, for whom the learning coefficient is often insignificant. Moreover, the survivorship-adjusted estimates show that even after correcting for attrition and investor fixed effects, the unsophisticated investors do improve their ability to trade over time; that is, we find support for H1. Similar results are obtained if we use YearsTraded as the experience variable instead.

Table 5

Heterogeneity in learning

Dependent variable: $$\beta^d_{i,t+1}$$ (first four data columns) and $$\bar R_{i,t+1}$$ (last four data columns)

| Classification | Mean | Mod. 1 | Mod. 2 | Obs | Mean | Mod. 1 | Mod. 2 | Obs |
| --- | --- | --- | --- | --- | --- | --- | --- | --- |
| No options trades | 1.17 | −0.024 | −0.02 | 14078 | −0.81 | 0.14 | 0.32 | 10389 |
| | | (0.003)*** | (0.001)*** | | | (0.06)*** | (0.13)*** | |
| Trades options | 0.99 | −0.013 | −0.01 | 3846 | 0.1 | 0.008 | 0.01 | 3020 |
| | | (0.004)*** | (0.004)*** | | | (0.04) | (0.01) | |
| Low wealth | 1.14 | −0.023 | −0.018 | 15426 | −1.18 | 0.05 | 0.029 | 11411 |
| | | (0.003)*** | (0.001)*** | | | (0.01)*** | (0.01)*** | |
| High wealth | 1.11 | −0.04 | −0.01 | 2498 | 1.08 | 0.01 | 0.02 | 1198 |
| | | (0.05) | (0.012) | | | (0.023) | (0.021) | |

This table reports both means and simple learning coefficient estimates from regressions of the form

$${y_{i,t + 1}} = \alpha + {{\rm{\beta }}_1}{\rm{CumulTrade}}{{\rm{s}}_{i,t}} + {{\rm{\beta }}_2}{\rm{CumulTrades}}_{i,t}^2 + {\rm{\delta }}{X_{i,t}} + {\gamma _t} + {\epsilon _{i,t}}.$$

The variable of interest is either the disposition coefficient ($$\beta^d_{i,t+1}$$) or returns ($$\bar R_{i,t+1}$$). Regressions using the simple learning model are labeled Mod. 1, while regressions using the adjustment for attrition are labeled Mod. 2. Attrition-corrected regressions include the inverse Mills ratios described in Table 3. For brevity, we report only $$\beta_1$$ coefficients in the table. We classify investors as “Trades options” if they trade in options at any point during our sample. Similarly, investors are classified as wealthy if they are in the top 25th percentile of average portfolio value. We include year dummies in all the regressions. Data are from the period 1995 to 2003. Standard errors are in parentheses and ***, **, and * denote significance at 1%, 5%, and 10%, respectively. All group means are significantly different at the 1% level.

Table 6

| | No attrition correction (1) | No attrition correction (2) | With attrition correction (3) | With attrition correction (4) |
| --- | --- | --- | --- | --- |
| $$YearsTraded_{t}$$ | 10.074 | 135257 | 13.32 | 22713 |
| | (0.618)*** | (14091)*** | (0.749)*** | (5626)*** |
| $$YearsTraded^{2}_{t}$$ | −0.462 | −9515 | −0.896 | −1216 |
| | (0.097)*** | (2123)*** | (0.085)*** | (557)*** |
| Observations | 13,266 | 13,266 | 13,266 | 13,266 |
| Individual fixed effects | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes |

This table reports regression results for trading intensity and experience. In Columns 1 and 3, the dependent variable is the number of trades placed by an investor in a year. In Columns 2 and 4, the dependent variable is the value of shares traded, measured in Euros. Attrition-corrected regressions include the inverse Mills ratios described in Table 3. For brevity, we report only the coefficients on experience variables in this table. All regressions include individual and year fixed effects. Data are from the period 1995 to 2003. Standard errors are in parentheses, and ***, **, and * denote significance at 1%, 5%, and 10%, respectively.

Table 7

Risk-taking and experience

Dependent variables

| Coefficient | 30-day α | β on RMRF | β on SMB | β on HML | β on UMD |
| --- | --- | --- | --- | --- | --- |
| YearsTraded | 0.58 | −0.034 | −0.016 | 0.0544 | 0.0073 |
| | (0.41) | (.0128)** | (.0059)** | (.0435) | (.0250) |
| YearsTraded² | 0.02 | 0.00246 | 0.00163 | 0.0037 | −0.00076 |
| | (0.02) | (.0012)* | (.0005)*** | (.0028) | (.0019) |
| CumulTrades (÷ 10²) | 0.05 | −0.01 | −0.002 | −0.002 | −0.016 |
| | (0.02)*** | (0.002)*** | (0.0009)*** | (0.004) | (.015) |
| CumulTrades² (÷ 10⁴) | −0.0002 | 0.00005 | 0.00001 | 0.00002 | 0.00008 |
| | (0.0001)** | (0.00001)*** | (0.000006)** | (0.00002) | (0.00005) |
| Dep var mean | 0.07 | 0.663 | 0.028 | −0.086 | −0.054 |
| Standard dev | 3.30 | 0.288 | 0.103 | 0.431 | 0.440 |

This table reports the results of fixed effect selection model estimates of regressions of various performance and risk measures on experience measures. The method of these regressions and their associated first-stage estimates are described in Table 4. The dependent variables include each investor’s average 30-day risk-adjusted return (alpha) and each investor’s average beta coefficient on four factors—RMRF, SMB, HML, and UMD. These betas are estimated in the standard way, as described in the text. Each regression includes the control variables and inverse Mills ratios described in Table 4, but only the experience variables coefficients are reported in this table. Standard errors are in parentheses, and ***, **, and * denote significance at 1%, 5%, and 10%, respectively.

### Robustness tests

In this section we report the results of a few additional tests that relate to our main predictions. First, in unreported results, we substitute the market return for each stock’s return to see if individuals learn to time the market. If investors are learning to identify good times to buy, then the market as a whole will tend to increase after their purchases; if instead they are learning to select stocks, we will not find evidence of learning when we look only at market returns. In fact, we find that the coefficient estimates on experience variables are insignificant, which suggests that performance improves because investors become better at stock selection. Second, we also conduct our tests using the Wooldridge (1995) method, taking data on investors who resume trading after ceasing to trade for a few years (the tests reported in the last section had dropped such investors, using only observations in two consecutive years). Including these investors increases the sample by around 250 observations in the second stage but does not affect the nature of the results reported. Third, all of the results on disposition remain qualitatively unchanged if we include a “December dummy” in Equation (A2) or remove partial sales from our sample. This rules out tax-motivated selling or rebalancing as possible explanations for the disposition effect. Finally, in all the fixed effect regressions that control for individual heterogeneity, we cluster the standard errors at the individual level and find that our results are unaffected.

We also perform all of our tests with 30-day returns, regardless of when the investor actually sells. We find that our results are unchanged. This implies that our results are not driven by the selling behavior of investors. It is not the case, for example, that correlated liquidity shocks cause investors to sell simultaneously, drive down prices, and experience poor returns. In additional tests, we find that an investor’s sales are not concentrated in time, which is not consistent with their selling being driven by correlated liquidity shocks.

We also confirm that the learning we find is not driven by increasing precision in our disposition estimates. Although we estimate the disposition effect with disjoint data for each investor-year, since we expect surviving investors to increase their trading over time, it is likely that the precision of our disposition estimates will increase with time. It is, however, important to note that the precision of our estimates is also driven by the volatility of the stocks that investors hold, not just the number of positions they open and close. Moreover, we have no reason to believe that increasing precision will cause a reduction in our disposition estimates and not an increase. Nevertheless, to rule out this possibility, we conduct a bootstrap experiment that keeps constant the number of observations we use to estimate each investor’s disposition effect. We assume that an investor places ten trades each year, regardless of the number of trades he or she actually places. We take a random sample of ten trades from the actual trades (with replacement) and estimate the disposition coefficient as before. We then estimate our survivorship and heterogeneity-adjusted model as before with these new disposition coefficients. Our results are robust to this procedure and clearly show that our finding of learning is not driven by increasing precision in our estimates of the disposition effect.
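The resampling step of this bootstrap experiment can be sketched as follows. This is an illustration under stated assumptions rather than our actual code: the function name, the seed argument, and the representation of trades as list elements are hypothetical, and the re-estimation of the disposition coefficient on each resampled set is not shown.

```python
import random


def bootstrap_trades(trades, n_draws=10, seed=None):
    """Draw n_draws trades with replacement from one investor-year's trades."""
    rng = random.Random(seed)
    return [rng.choice(trades) for _ in range(n_draws)]


# Hypothetical investor-year with 23 round-trip trades, identified by index.
investor_year_trades = list(range(23))
sample = bootstrap_trades(investor_year_trades, n_draws=10, seed=0)

# The resampled set always has ten trades, whatever the true trade count,
# so the precision of the disposition estimate no longer grows with activity.
print(len(sample), sample)
```

Because sampling is with replacement, the same trade can appear more than once, and investors with fewer than ten trades in a year still yield ten draws.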

## Appendix

This appendix provides details of our approach to calculating individual returns and the disposition effect. It also describes the results of an alternative estimation procedure for the disposition effect that follows Feng and Seasholes (2005).

### Measuring performance

As mentioned in the main text, measuring the performance of individual investors is a significant challenge. Our data do not include all nonequity securities that may be held by an investor, so it is impossible to measure the return for the investor’s entire portfolio. This is made more difficult by the fact that the amount of money an individual has invested in equities often fluctuates significantly over time. Since we cannot accurately measure portfolio returns, we measure performance by examining the average return of stocks purchased. However, this generates a new problem—comparing the returns on holding periods of different lengths. For example, it is particularly difficult to compare the performance of one investor who holds a stock for one week and earns a holding period return of 3% to that of another investor who holds a stock for one year and earns a holding period return of 15%.

We therefore calculate the returns earned by the purchased stock in the 30 trading days following each investor’s purchases. Importantly, we truncate this calculation window at the length of the actual holding if it is shorter than 30 days. That is, the 30-day return for investor $$i$$ holding stock $$j$$ is

(A1)
$${R_{i,j}}\left( t \right) = \frac{{{P_j}\left( {t + \min \left( {s,30} \right)} \right)}}{{{P_j}\left( t \right)}} - 1,$$
where $$P_{j}({\cdot})$$ denotes the stock’s closing price adjusted for splits and dividends, $$t$$ denotes the purchase date, and $$s$$ denotes the actual holding period.
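Equation (A1) can be illustrated with a short sketch. The function name, the list-of-prices input, and the example price path are hypothetical; the only logic taken from the text is the truncation of the window at the shorter of the actual holding period and 30 trading days.

```python
def truncated_return(prices, purchase_idx, holding_days, window=30):
    """Return P(t + min(s, window)) / P(t) - 1, following Equation (A1).

    prices: adjusted closing prices indexed by trading day (hypothetical input)
    purchase_idx: index t of the purchase date
    holding_days: actual holding period s, in trading days
    """
    horizon = min(holding_days, window)
    return prices[purchase_idx + horizon] / prices[purchase_idx] - 1.0


# Hypothetical price path: 100 on the purchase date, rising by 1 each trading day.
prices = [100.0 + d for d in range(61)]
short_hold = truncated_return(prices, 0, holding_days=5)   # sold after 5 days
long_hold = truncated_return(prices, 0, holding_days=45)   # capped at 30 days
print(round(short_hold, 4), round(long_hold, 4))
```

In this example the position sold after five days is credited with the five-day return, while the position held for 45 days is credited only with the return over its first 30 trading days.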

Our approach is an attempt to deal with the problem of comparing returns over similar holding periods while ensuring that the actual selling decisions of investors affect their performance. By measuring returns in this way, we hope to capture the value of short-term signals that the investor may have received. As a robustness check, we perform all of our tests with 30-day returns, regardless of when the investor actually sells, and find that our results are unchanged. Looking over longer horizons would introduce considerable noise into our return estimates.

### Measuring disposition

Previous researchers have measured the disposition effect in a number of ways. Odean (1998) compares the proportion of losses realized to the proportion of gains realized by a large sample of investors at a discount brokerage firm. Grinblatt and Keloharju (2001b) model the decision to sell or hold each stock in an investor’s portfolio by estimating a logit model that includes one observation for each position on each day that an account sells any security. Days in which an account does not trade are dropped from their analysis.

As Feng and Seasholes (2005) point out, a potential problem with these and similar approaches is that they may give incorrect inferences in cases in which capital gains or losses vary over time. Hazard models, which have been extensively applied in a number of fields including labor economics and epidemiology, are ideally suited to our setting. Since our focus is on estimating disposition at an individual level, we estimate the hazard regression for each investor and year. Implementation of the hazard model uses all data about the investor’s trading and the stock price path, rather than just data on days when a purchase or sale is made. That is, it implicitly considers the hold-or-sell decision each day. This improves our disposition estimates and gives us more power with which to investigate learning. More important, individual level estimates enable us to test H1 and H2 by explicitly accounting for investor attrition and heterogeneity.

To measure the disposition effect for each investor-year, we use a Cox proportional hazard model with time-varying covariates to model the probability that an investor will sell shares that he or she currently holds. We count every purchase of a stock as the beginning of a new position, and a position ends on the date the investor first sells part or all of his or her holdings. Alternative definitions of a holding period, such as first purchase to last sale, or requiring a complete liquidation of a position, do not substantively alter our results. Our time-varying covariates include daily observations of some market-wide variables and daily observations of whether each position corresponds to a capital gain or loss.

Specifically, we estimate

(A2)
$${h_{i,j}}\left( {t|{x_{i,j}}\left( t \right)} \right) = {\phi _i}\left( t \right)\exp \left\{ {{\rm{\beta }}_i^dI\left( {{R_{i,j}}\left( t \right) \gt 0} \right) + {\rm{\beta }}_i^r{{\bar R}_{M,t}} + {\rm{\beta }}_i^s{\sigma _{M,t}} + {\rm{\beta }}_i^{\rm{V}}{{\rm{V}}_{M,t}}} \right\},$$
where the hazard rate, $$h_{i,j}(t)$$, is investor $$i$$’s probability of selling position $$j$$ at time $$t$$ conditional on not selling the position until time $$t$$, and $$\phi_i(t)$$ is the investor’s baseline hazard. Since we estimate the hazard model for each investor-year, the baseline hazard rate describes the typical holding period of just one investor in one particular year. The Cox proportional hazard model does not impose any structure on the baseline hazard, and Cox's (1972) partial likelihood approach allows us to estimate the β coefficients without estimating $$\phi_i(t)$$.

The time-varying covariates, $$x_{i,j}(t)$$, are allowed to change each day. These include $$I(R_{i,j}(t) > 0)$$, an indicator of whether the total return on position $$j$$ from the time of purchase up until time $$t$$ is positive. Investors who suffer from the disposition effect are more likely to sell when this condition is true, so they will have positive values of $$\beta^d_i$$. We therefore refer to $$\beta^d_i$$ as investor $$i$$’s “disposition coefficient.” The total return variable includes any dividends or other distributions and is calculated using closing prices on all days, including the date of purchase. We use the closing price on the purchase date instead of the actual purchase price to ensure that our results are not contaminated by microstructure effects. Also, our data do not include transaction prices during the first three months of 1995, but we do have closing prices during this period. (Using ex-dividend returns leaves our results unchanged.) In addition, we include as controls three five-day moving averages of market-level variables to ensure that we are not capturing selling related to market-wide movements: market returns ($$\bar R_{M,t}$$), squared market returns ($$\sigma_{M,t}$$), and market volume ($$V_{M,t}$$). We repeat this estimation via maximum likelihood each year from 1995 to 2003 for each individual $$i$$ who places at least seven round-trip trades in a year.
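The construction of the daily covariates can be sketched as follows. This is only an illustration: the function names and example series are hypothetical, and the use of a trailing window (with shorter windows on the first days) is an assumption, since the text specifies only that the controls are five-day moving averages.

```python
def gain_indicator(closes):
    """Daily I(R(t) > 0): 1 if the close exceeds the purchase-date close, else 0."""
    base = closes[0]  # closing price on the purchase date
    return [1 if p / base - 1.0 > 0 else 0 for p in closes]


def trailing_mean(values, window=5):
    """Trailing moving average; early days use the observations available (assumption)."""
    out = []
    for k in range(len(values)):
        chunk = values[max(0, k - window + 1): k + 1]
        out.append(sum(chunk) / len(chunk))
    return out


# Hypothetical adjusted closing prices for one position, starting on the purchase date.
closes = [10.0, 9.5, 10.2, 10.4, 9.9, 10.1]
gains = gain_indicator(closes)

# Hypothetical daily market series, smoothed into a five-day control variable.
market_series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
smoothed = trailing_mean(market_series)
print(gains, smoothed)
```

Each position then contributes one row per day it is held, with the gain indicator and the smoothed market variables as that day's covariates.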

### Estimates using the Feng and Seasholes method

We estimate the yearly hazard models presented in Table 8, which are comparable with the model of Feng and Seasholes (2005). We use a proportional hazard regression rather than the parametric Weibull model used by Feng and Seasholes. Our indicator variable is the same as their “Trading Gain Indicator” (TGI), and the experience variable we use for this model is the total number of trades placed before the current trade, rather than CumulTrades, which is the total number of trades placed in previous calendar years. The estimates are from pooled hazard models estimated each year, in which all individuals are grouped together. Experience is interacted with an indicator for whether the price is above the purchase price, and the coefficient on this interaction term is interpreted as a learning coefficient. These models again give evidence of learning. However, the learning coefficient estimates are quite variable over time (0.149 to 0.063), and they are statistically insignificant in three of the nine years. Furthermore, the average disposition coefficient of an investor is estimated in aggregate to be around 0.65 in this model. This suggests that, depending on the period chosen, an additional year of experience corresponds to a reduction in the disposition coefficient of approximately 10% to 25%, significantly higher than the estimate from our heterogeneity- and survivorship-adjusted learning model. Table 8 also lists the number of observations available each year and the fraction of the observations that are censored, or the purchased stocks that are not sold by the end of each year.
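One way to see roughly where a range of this size comes from, assuming the percentage reduction is the learning coefficient divided by the aggregate disposition coefficient of about 0.65, is to apply that ratio to the two endpoints quoted above:

$$\frac{0.063}{0.65} \approx 9.7\%, \qquad \frac{0.149}{0.65} \approx 22.9\%.$$

This back-of-the-envelope ratio is an illustrative assumption about the calculation, not a description of a separate estimation.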

Table 8

Disposition estimates at aggregate level using alternative estimation

| Year | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| $$\beta_4$$ | −0.074 | −0.093 | −0.096 | −0.149 | −0.031 | −0.013 | −0.063 | −0.069 | −0.123 |
| | (0.039)* | (0.062) | (0.031)*** | (0.022)*** | (0.024) | (0.011) | (0.011)*** | (0.012)*** | (0.012)*** |
| Trades | 6.6 | 14.5 | 26.6 | 44.6 | 99.6 | 251.9 | 161.9 | 115.6 | 131.8 |
| Censored | 16% | 19% | 19% | 19% | 17% | 16% | 17% | 18% | 20% |
| Accounts | 384 | 713 | 1360 | 2412 | 4953 | 11585 | 7445 | 5416 | 6056 |

This table presents learning estimates from pooled proportional hazards models, using a method similar to that of Feng and Seasholes (2005). Each year we pool the trades of all investors, treating them as if they were just one individual and estimating one learning coefficient for the entire population. The model is

$$h\left( t \right) = \phi \left( t \right)\exp \left\{ \beta_1 I\left( R_{i,j}(t) > 0 \right) + \beta_2\,\mathrm{Exper} + \beta_3\,\mathrm{Exper}^2 + \beta_4\,\mathrm{Exper}\left[ I\left( R_{i,j}(t) > 0 \right) \right] + \beta_5\,\mathrm{Exper}^2\left[ I\left( R_{i,j}(t) > 0 \right) \right] \right\},$$

where Exper is measured in years since first placing a trade and $$I(R_{i,j}(t) > 0)$$ is an indicator variable that takes a value of one when a stock has increased in price since its date of purchase. For brevity, only the $$\beta_4$$ coefficient estimates are reported. Estimated $$\beta_5$$ coefficients are all insignificant. We also report the number of trades or observations considered by the model (in thousands of trades), the percentage of observations censored (trades not closed by the end of the year), and the number of accounts contributing observations to the model. Standard errors are in parentheses, and ***, **, and * denote significance at 1%, 5%, and 10%, respectively.

The low level of the average disposition effect, the high variability in the annual learning estimates, and the high level of learning found both by Feng and Seasholes (2005) and in our implementation of their model suggest that there are significant differences between our approach and theirs. One important difference is that we estimate a different disposition coefficient for each individual. This allows us to control for each individual’s baseline hazard function (or their average holding period). Another difference is that we estimate learning over a much longer period of time, since Feng and Seasholes (2005) only have about two years of transaction data. Thus, the year-to-year variation in the annual estimates in Table 8 is much less of a concern for our analysis.

## References

Arrow
K.
The Economic Implications of Learning by Doing
Review of Economic Studies
,
1962
, vol.
29
(pg.
155
-
73
)
Barber
B. M.
Odean
T.
Boys Will Be Boys: Gender, Overconfidence, and Common Stock Investment
Quarterly Journal of Economics
,
2001
, vol.
116
(pg.
261
-
92
)
Barber
B. M.
Odean
T.
Strahilevitz
M.
Once Burned, Twice Shy: Naïve Learning, Counterfactuals and the Repurchase of Stocks Previously Sold
2004

Working Paper, University of California, Berkeley
Bolton
P.
Harris
C.
Strategic Experimentation
Econometrica
,
1999
, vol.
67
(pg.
349
-
74
)
Calvet
L. E.
Campbell
J. Y.
Sodini
P.
Fight or Flight? Portfolio Rebalancing by Individual Investors
Quarterly Journal of Economics
,
2009
, vol.
124
(pg.
301
-
48
)
Campbell
J. Y.
Household Finance
Journal of Finance
,
2006
, vol.
61
(pg.
1553
-
604
)
Carhart
M.
On Persistence in Mutual Fund Performance
Journal of Finance
,
1997
, vol.
52
(pg.
57
-
82
)
Chancellor
E.
Devil Take the Hindmost: A History of Financial Speculation
,
2000
New York
Plume
Choi
J.
Laibson
D.
B.
Metrick
A.
Reinforcement Learning and Investor Behavior
Journal of Finance
,
2009

Coval
J. D.
Hirshleifer
D. A.
Shumway
T.
Can Individual Investors Beat the Market?
2005

Working Paper, University of Michigan
Coval
J. D.
Shumway
T.
Journal of Finance
,
2005
, vol.
60
(pg.
1
-
34
Do Behavioral Biases Affect Prices?
Cox
D. R.
Regression Models and Life Tables
Journal of the Royal Statistical Society, B
,
1972
, vol.
34
(pg.
187
-
220
)
DeLong
J. B.
Shleifer
A.
Summers
L. H.
Waldmann
R. J.
The Survival of Noise Traders in Financial Markets
,
1991
, vol.
64
(pg.
1
-
19
)
Fama
E.
French
K.
Common Risk Factors in the Returns on Stocks and Bonds
Journal of Financial Economics
,
1993
, vol.
33
(pg.
3
-
56
)
Feng
L.
Seasholes
M. S.
Do Investor Sophistication and Trading Experience Eliminate Behavioral Biases in Financial Markets?
Review of Finance
,
2005
, vol.
9
(pg.
305
-
51
)
Frazzini
A.
The Disposition Effect and Underreaction to News
Journal of Finance
,
2006
, vol.
61
(pg.
2017
-
46
)
Genesove
D.
Mayer
C.
Loss-Aversion and Seller Behavior: Evidence from the Housing Market
Quarterly Journal of Economics
,
2001
, vol.
116
(pg.
1233
-
60
)
Greenwood
R.
Nagel
S.
Inexperienced Investors and Bubbles
Journal of Financial Economics
,
2009
, vol.
93
(pg.
239
-
58
)
Grinblatt
M.
Keloharju
M.
The Investment Behavior and Performance of Various Investor Types: A Study of Finland’s Unique Data Set
Journal of Financial Economics
,
2000
, vol.
55
(pg.
43
-
67
)
Grinblatt
M.
Keloharju
M.
How Distance, Language, and Culture Influence Stockholdings and Trades
Journal of Finance
,
2001a
, vol.
56
(pg.
1053
-
73
)
Grinblatt
M.
Keloharju
M.
Journal of Finance
,
2001b
, vol.
56
(pg.
589
-
616
)
Grossman
S. J.
Kihlstrom
R. E.
Mirman
L. J.
A Bayesian Approach to the Production of Information and Learning by Doing
Review of Economic Studies
,
1977
, vol.
44
(pg.
533
-
47
)
Heckman
J.
The Common Structure of Statistical Models of Truncation, Sample Selection, and Limited Dependent Variables and a Simple Estimator for Such Models
Annals of Economic and Social Measurement
,
1976
, vol.
5
(pg.
475
-
92
)
Hong
H.
Kubik
J. D.
Stein
J. C.
Social Interaction and Stock-Market Participation
Journal of Finance
,
2004
, vol.
59
(pg.
137
-
63
)
Ivković
Z.
Sialm
C.
Weisbenner
S. J.
Portfolio Concentration and the Performance of Individual Investors
Journal of Financial and Quantitative Analysis
,
2008
, vol.
43
(pg.
613
-
55
)
Ivković, Z., and S. J. Weisbenner. Local Does as Local Is: Information Content of the Geography of Individual Investors’ Common Stock Investments. Journal of Finance, 2005, vol. 60 (pg. 267-306).
Ivković, Z., and S. Weisbenner. Information Diffusion Effects in Individual Investors’ Common Stock Purchases: Covet Thy Neighbors’ Investment Choices. Review of Financial Studies, 2007, vol. 20 (pg. 1327-57).
Jin, L., and A. Scherbina. Inheriting Losers, 2008. Working Paper, University of California, Davis.
Kogan, L., S. A. Ross, J. Wang, and M. M. Westerfield. The Price Impact and Survival of Irrational Traders. Journal of Finance, 2006, vol. 61 (pg. 195-229).
Korniotis, G. M., and A. Kumar. Superior Information or a Psychological Bias? A Unified Framework with Cognitive Abilities Resolves Three Puzzles, 2008. Working Paper, University of Texas at Austin.
Linnainmaa, J. Learning from Experience, 2006. Working Paper, University of Chicago.
Linnainmaa, J. Do Limit Orders Alter Inferences About Investor Performance and Behavior?, 2007. Working Paper, University of Chicago.
List, J. A. Does Market Experience Eliminate Market Anomalies? Quarterly Journal of Economics, 2003, vol. 118 (pg. 41-71).
Mahani, R., and D. Bernhardt. Financial Speculators’ Underperformance: Learning, Self-Selection, and Endogenous Liquidity. Journal of Finance, 2007, vol. 62 (pg. 1313-40).
Nicolosi, G., L. Peng, and N. Zhu. Do Individual Investors Learn from Their Trading Experience? Journal of Financial Markets, 2008, vol. 12 (pg. 317-36).
Odean, T. Are Investors Reluctant to Realize Their Losses? Journal of Finance, 1998, vol. 53 (pg. 1775-98).
Odean, T. Do Investors Trade Too Much? American Economic Review, 1999, vol. 89 (pg. 1279-98).
Pástor, L., L. Taylor, and P. Veronesi. Entrepreneurial Learning, the IPO Decision, and the Post-IPO Drop in Firm Profitability. Review of Financial Studies, 2009, vol. 22 (pg. 3005-46).
Pástor, L., and P. Veronesi. Stock Valuation and Learning About Profitability. Journal of Finance, 2003, vol. 58 (pg. 1749-89).
Scholes, M., and J. Williams. Estimating Betas from Nonsynchronous Data. Journal of Financial Economics, 1977, vol. 5 (pg. 309-27).
Shapira, Z., and I. Venezia. Patterns of Behavior of Professionally Managed and Independent Investors. Journal of Banking and Finance, 2001, vol. 25 (pg. 1573-87).
Shefrin, H., and M. Statman. The Disposition to Sell Winners Too Early and Ride Losers Too Long: Theory and Evidence. Journal of Finance, 1985, vol. 40 (pg. 777-90).
Shiller, R. J. Irrational Exuberance, 2005, 2nd ed. Princeton, NJ: Princeton University Press.
Shumway, T., and G. Wu. Does Disposition Drive Momentum?, 2006. Working Paper, University of Michigan.
Verbeek, M., and T. Nijman. Testing for Selectivity Bias in Panel Data Models. International Economic Review, 1992, vol. 33 (pg. 681-703).
Weber, M., and C. Camerer. The Disposition Effect in Securities Trading: An Experimental Analysis. Journal of Economic Behavior and Organization, 1998, vol. 33 (pg. 167-84).
Wooldridge, J. M. Selection Corrections for Panel Data Models under Conditional Mean Independence Assumptions. Journal of Econometrics, 1995, vol. 68 (pg. 115-32).
Wooldridge, J. M. Introductory Econometrics: A Modern Approach, 2003. Cincinnati, OH: South-Western College Publishing.
1. Coval, Hirshleifer, and Shumway (2005) document significant performance persistence among individuals. Ivković and Weisbenner (2005) find that individuals place more informed trades in stocks of companies located close to their homes, and Ivković, Sialm, and Weisbenner (2008) show that individuals with more concentrated portfolios tend to outperform those who are more diversified. Linnainmaa (2007) finds that individuals who trade with limit orders suffer particularly poor performance.
2. The disposition effect is the propensity of investors to sell assets on which they have experienced gains and to hold assets on which they have experienced losses. The effect was first proposed by Shefrin and Statman (1985) and was subsequently documented in a sample of trading records from a U.S. discount brokerage firm by Odean (1998). The effect has been found in other contexts, including in Finland (Grinblatt and Keloharju 2001b), China (Feng and Seasholes 2005; Shumway and Wu 2006), and Israel (Shapira and Venezia 2001); among professional market makers (Coval and Shumway 2005), mutual fund managers (Frazzini 2006), and home sellers (Genesove and Mayer 2001); and in experimental settings (Weber and Camerer 1998). We focus on the disposition effect because it is a robust empirical finding and is relatively easy to measure.
3. This example is similar in spirit to well-known “bandit” problems. For example, Bolton and Harris (1999) study the strategic interaction of agents in an experimentation game. In our setting, we can think of investors as being randomly assigned slot machines with different unobservable expected payoffs, and experimenting to learn what the payoffs are. The random assignment is analogous to the assignment of inherent ability.
4. Two recent papers examine the relation between investor sophistication and behavior. Calvet, Campbell, and Sodini (2009) find that more-sophisticated households in Sweden are more likely to begin trading and less likely to stop than are less-sophisticated households. Korniotis and Kumar (2008) show that investors in the United States have better performance when their demographic characteristics indicate that they are likely to have high cognitive abilities. Neither of these studies directly examines whether investors learn.
5. Other papers that find some learning in various settings include Pástor, Taylor, and Veronesi (2009); Choi, Laibson, Madrian, and Metrick (2009); Linnainmaa (2006); Barber, Odean, and Strahilevitz (2004); Pástor and Veronesi (2003); and List (2003).
6. These references provide a detailed discussion of the data.
7. Some returns in our observation period are quite volatile. Particularly in the last few years of the decade, some stocks saw large returns over short periods, and some investor returns are quite large or sensitive to the holding period. In our main analysis, we confirm that our results are not sensitive to removing investors with extreme portfolio returns or to winsorizing returns.
8. It is also possible, of course, that investors learn by considering returns to hypothetical trades before they ever start trading (“paper trading”). If this is the only way in which investors learn, then we should find no evidence of learning by trading (either H1 or H2). Learning from paper trading may be difficult for a number of reasons, including that it requires significant discipline to keep track of hypothetical trades and that transaction prices for hypothetical trades are not observable.
9. We use a four-factor model to adjust returns for the known risk factors of Fama and French (1993) and Carhart (1997). To construct the HML and SMB factors, we take quarterly data on shares outstanding and book value of equity using definitions as similar as possible to those of Fama and French and construct portfolios in the manner they describe. We measure a firm’s risk in a particular year by regressing daily returns in excess of the risk-free rate on the daily returns of the four factor-mimicking portfolios using the Scholes and Williams (1977) method to avoid well-known problems associated with nonsynchronous trading. As with the raw returns, we calculate holding period returns over a fixed interval of 30 days, but truncate the holding period at the length of the actual holding period.
10. Rather than using only one indicator variable as in Equation (A2), we use 20 dummy variables corresponding to different 1% return “bins” in this model. In Figure 3 we plot the dummy variable coefficients by year. The sum of these coefficients times their corresponding dummy variables is multiplied by the baseline hazard rate to give the actual conditional hazard rate.
11. Results estimated with both YearsTraded and CumulTrades terms included in the same regression are also very similar to those reported.
12. While these papers acknowledge that other common factors may drive investor activity (such as access to similar financial advisors), they claim that a social component unrelated to performance drives at least part of investor trading. Our identification relies only on this component.
13. This measure is not equivalent to the volatility of stocks purchased by an investor. Rather, it is the variance of the returns generated by each of the trades placed by the investor. Empirically, the correlation between this quantity and the average volatility of stocks held by an investor in our data is only 0.23.
14. The idea is to measure the fall in the log-likelihood from the unrestricted model (with instruments) to the restricted model (without instruments) and compare it with a chi-square distribution with the appropriate degrees of freedom, which in our case is two.
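The comparison described in this footnote can be sketched in a few lines. This is an illustration only, under the standard likelihood-ratio convention that the test statistic is twice the difference in log-likelihoods; the function name `lr_test_df2` and the log-likelihood values are hypothetical and are not taken from the paper. With two degrees of freedom the chi-square survival function has the closed form exp(-x/2), so no statistics library is needed.

```python
import math


def lr_test_df2(loglik_unrestricted: float, loglik_restricted: float) -> tuple[float, float]:
    """Likelihood-ratio test for exactly two restrictions (hypothetical helper).

    Returns the LR statistic and its p-value under a chi-square(2) null.
    """
    # Standard LR statistic: twice the fall in the log-likelihood.
    stat = 2.0 * (loglik_unrestricted - loglik_restricted)
    if stat < 0:
        raise ValueError("restricted model cannot fit better than the unrestricted model")
    # For 2 degrees of freedom, P(chi2 > x) = exp(-x / 2) exactly.
    p_value = math.exp(-stat / 2.0)
    return stat, p_value


# Illustrative (made-up) log-likelihoods, not results from the paper.
stat, p = lr_test_df2(loglik_unrestricted=-1000.0, loglik_restricted=-1004.5)
print(f"LR = {stat:.2f}, p = {p:.4f}")
```

A small p-value would indicate that dropping the two instruments significantly worsens the fit, which is the sense in which the instruments are jointly relevant.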
15. The coefficients on YearsTraded in these regressions must be viewed with caution. Because the change in YearsTraded is always equal to exactly one year, the coefficient on YearsTraded can only be identified if we leave the fixed effect for one year out of the regression. For robustness, we also estimate standard Heckman selection models without either individual or year fixed effects. The coefficients on YearsTraded in these models are 0.31 in the performance regression and −0.019 in the disposition regression. Both of these coefficients are marginally statistically significant (at 10%). The magnitude and significance of these coefficients are consistent with our other results.
16. An alternative explanation for why some traders stop trading relates to investor wealth: investors could stop trading because they run out of money, not because they realize that their ability is low. We do not believe that this alternative explains our results. First, our previous results directly control for the initial wealth of the investor and for investor fixed effects. Doing so should partially account for any initial wealth-related attrition. Second, in contrast to H2, this alternative makes no predictions about how an investor’s trading intensity should change over time. Under H2, low-ability traders should stop trading, while high-ability traders should scale up their trading intensity. The results in this section also provide support for H2 rather than this alternative.

## Author notes

We thank Brad Barber, Dan Bernhardt, James Choi, Juhani Linnainmaa, Uday Rajan, Mark Seasholes, Morten Sorensen, Matthew Spiegel (the editor), Ning Zhu, two anonymous referees, and seminar participants at the 2007 Western Finance Association meeting, the 2007 Wharton Household Portfolio Choice conference, Carnegie Mellon University, Nanyang Technological University, National University of Singapore, Singapore Management University, University of California at Irvine, University of Edinburgh, University of Manchester, University of Toronto, and University of Michigan for helpful comments. Any remaining errors are ours. We are grateful to Jussi Keppo for helping us to acquire the data used in this study, and to the Mitsui Life Financial Research Center at the University of Michigan for funding. We also thank Jarkko Heinonen and Monica Bergström at the Helsinki Exchange for helping us with the data.