## Abstract

Using a large sample of individual investor records over a nine-year period, we analyze survival rates, the disposition effect, and trading performance at the individual level to determine whether and how investors learn from their trading experience. We find evidence of two types of learning: some investors become better at trading with experience, while others stop trading after realizing that their ability is poor. A substantial part of overall learning by trading is explained by the second type. By ignoring investor attrition, the existing literature significantly overestimates how quickly investors become better at trading.

Academics have recently shown an interest in the investment behavior and performance of individuals, a field that has been called “household finance” by Campbell (2006). Over the past decade, several researchers have documented a number of behavioral biases among individual investors. More recently, researchers have found evidence that some individual investors are more informed or skilled than others.^{1} Considering these findings, it is natural to ask how skilled or informed investors acquire their advantage. For example, do investors learn by trading? If so, to what extent do investors improve their ability, and to what extent do they learn about their inherent ability? And how quickly do investors learn? In this paper, we exploit trading records to study both average investor performance and the strength of the behavioral bias known as the disposition effect.^{2} We correlate performance and disposition with investor experience and investor survival rates to determine whether and how investors learn by trading.

Motivated by the existing economics literature on learning, we consider two specific ways in which investors can learn. First, in the spirit of classical learning-by-doing models (Arrow 1962; Grossman, Kihlstrom, and Mirman 1977), investors might improve their ability as they trade (“learning by doing”). Second, as investors trade, they might realize that their inherent level of ability is low and decide to stop trading (“learning about ability”). Although these types of learning are different, they are not mutually exclusive, and the primary contribution of our paper is to separate these types of learning empirically and estimate the importance of each. Our results provide robust evidence for both types of learning, but the magnitudes of the learning estimates suggest that most of the learning by trading occurs as individuals learn about their own ability and low-ability investors stop trading. This implies that, by ignoring investor attrition, the existing literature substantially overstates how quickly investors become better at trading.

To clarify the model of learning we have in mind, consider the case of an individual who decides to begin trading. The investor must decide which of the myriad sources of market information and investment advice available to him or her to take seriously. He or she could consult standard news sources, Internet sites, investment newsletters, and neighbors or friends. He or she might also consider the advice of brokers, news analysts, authors of books and magazines, and finance professors. To the extent that these sources fail to agree completely, individuals must determine how much decision weight to assign to each source. Moreover, the quality of these sources is likely to differ across individuals: some investors may know executives at a firm, while others may not. As investors begin trading, they can learn to which of the various sources they should pay more attention. This can be thought of as improving ability through learning. Investors who have access only to poor sources of information cannot improve by focusing more on particular sources. Instead, they will learn that they have no useful information and will stop trading actively, choosing instead to invest in a passive investment such as an index fund.^{3} This is also a type of learning, but rather than improving his or her ability, the investor learns about his or her ability.

Since investors who learn that their ability is poor should stop trading, we empirically examine whether investors learn about their ability by examining attrition in our data. Once we account for time-invariant individual heterogeneity and endogenous attrition, any improvement in investors’ performance and any reduction in their disposition effect that comes with experience should reflect a direct improvement in their ability. By tracking the survival rates, performance, and the level of the disposition effect for each individual over time, we are able to differentiate between these two types of learning. In doing so, ours is the first paper to identify and measure both types of learning.^{4}

We test our hypotheses with a remarkable data set that includes the complete trading records of investors in Finland from 1995 to 2003, including more than 22 million observations of trades placed by households. We use these data to estimate disposition and calculate performance at the account level. Our disposition estimates indicate that a median individual in our sample is 2.8 times more likely to sell a stock when its price has risen since purchase than when its price has fallen. We exploit the panel structure of our data to examine whether individual investors learn to avoid the disposition effect and improve their performance as they trade. In particular, we estimate the mean return and the disposition effect for each account and year in our sample and relate these estimates to experience, past returns, and various demographic controls.

We identify learning about ability in the data by adjusting for survivorship and heterogeneity with a modified Heckman selection model that allows for individual fixed effects. The model is a two-stage instrumental variables model that adjusts for the possibility that the composition of the sample is endogenous while accounting for any cohort effects. We construct two variables for use as instruments: the first is an indicator of whether the individual inherited shares in the previous year due to the death of a relative, and the second is the proportion of accounts in an investor’s zip code that are held by active traders. We argue that both of these variables are likely to satisfy the necessary exclusion restrictions—that is, they are likely to affect the probability of an investor remaining in the sample, but are unlikely to affect changes in the investor’s performance or disposition effect except through their effect on survival.

Estimates of the selection model confirm that investors with poor performance are more likely to cease trading. An investor whose performance is one standard deviation worse than the mean is about 15% less likely to continue trading. Our instruments are also good predictors of whether individuals continue trading. The second stage of our selection model suggests that investors learn by trading: after accounting for survivorship, an extra 100 trades is associated with an improvement in average returns of approximately 3.6 basis points (bp) over a 30-day horizon (or about 30 bp per year), and a reduction in the disposition effect of about 2%. If we measure experience with the number of years an individual has been trading instead of the number of trades he or she has placed, the improvement is negligible. This implies that individuals actually have to place trades to learn, or that learning truly occurs by trading. Perhaps more important, the magnitude of the learning estimates presented above is about two to four times higher when not adjusted for investor attrition. This implies that the learning we document occurs primarily as investors learn about their ability.
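The annualized figure quoted above follows from scaling the 30-day improvement to a full year. A minimal sketch of that arithmetic, assuming roughly 250 trading days per year and simple (non-compounded) scaling; both are assumptions of this illustration rather than details given in the text:

```python
# Scale a 30-trading-day return improvement to an annual figure.
TRADING_DAYS_PER_YEAR = 250  # assumption: not stated in the text


def annualize_bp(bp_per_window: float, window_days: int = 30) -> float:
    """Scale a per-window return (in basis points) to a yearly figure."""
    return bp_per_window * TRADING_DAYS_PER_YEAR / window_days


improvement = annualize_bp(3.6)
print(f"{improvement:.1f} bp per year")
```

Under these assumptions, 3.6 bp per 30 trading days corresponds to about 30 bp per year.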

Differentiating between types of learning allows us to determine in a meaningful way how quickly investors learn. Estimating learning without accounting for heterogeneity and attrition results in inflated estimates of improvement that do not correspond to the experience of any particular type of investor. Correctly measuring the speed of learning is important for a number of reasons. If investors learn quickly and there is low turnover in the population of investors, behavioral biases are unlikely to affect asset prices significantly. Moreover, if they learn relatively quickly, then the “excessive” trading documented by Odean (1999) and Barber and Odean (2001) may be justified, because investors may optimally choose to trade more actively if they know they will improve with experience. Finally, thinking about both the speed and the type of learning has implications for market efficiency. For example, if many inexperienced investors begin trading around the same time, and they learn slowly, their trading could lead to time-varying market efficiency. In fact, we find evidence that suggests that investors learn less about their ability in years with positive market returns. In this respect, our paper contributes to the theoretical literature concerned with the survival and price impact of noise traders (see, for example, DeLong, Shleifer, Summers, and Waldmann 1991; Kogan, Ross, Wang, and Westerfield 2006).

Some evidence about learning by trading exists. Feng and Seasholes (2005) give evidence that investors, in aggregate, display significantly less disposition over time, estimating that for sophisticated investors the disposition effect is essentially attenuated after about 16 trades. These estimates do not appear to be consistent with Frazzini's (2006) finding that mutual fund managers, who trade substantially more than most individuals, display a significant disposition effect. Furthermore, since Feng and Seasholes (2005) do not adjust for heterogeneity and attrition, it is impossible to tell whether their estimates imply that a particular investor who trades 16 times will no longer exhibit the disposition effect or whether they imply that most investors who exhibit the disposition effect will cease trading before they place 16 trades. Nicolosi, Peng, and Zhu (2008) show that the trading performance of individuals appears to improve with trading experience, estimating that individuals can improve their risk-adjusted portfolio return by about 2% per year (or about 0.8 bp per day) over a three-year period. This seems quite large. Again, it is not possible to tell whether this estimate is driven by the most successful investors surviving or by the least successful investors improving their ability.^{5}

While our tests have some features in common with existing papers, they differ from the literature in a number of important respects. First, unlike other papers, our tests use measures of performance and estimates of the disposition effect that are specific to individuals, allowing us to track particular individuals over time. This permits us to control for investor heterogeneity and survivorship effects, which allows us to separate the two types of learning. It also ensures that each observation in almost all of our tests is an average or regression coefficient for one individual in one particular year. Finally, using only one observation per person per year reduces the likelihood that our standard error estimates are incorrect because of correlation among our regression residuals. Given the unique features of our data and our test methods, the results of our hypothesis tests add significantly to the literature on financial learning.

The rest of the paper is organized as follows. Section 1 provides details on our data, while Section 2 describes the hypotheses we test and our statistical methods. Section 3 discusses our results, and Section 4 concludes. We give some details about our statistical methods in the Appendix.

## Data and Methods

The data used in this study come from the central register of shareholdings in Finnish stocks maintained by the Nordic Central Securities Depository (NCSD), which is responsible for the clearing and settlement of trades in Finland. Finland has a direct holding system, in which individual investors’ shares are held directly by the NCSD. Since our data come from the NCSD, they reflect the official record of holdings and are therefore of extremely high quality. The data cover all trading in all Finnish stocks over a nine-year period. Grinblatt and Keloharju (2000, 2001a, 2001b) use a subset of the same data, comprising the first two years of our sample period.^{6} The data include the transactions of nearly 1.3 million individuals and firms, beginning in January 1995 and ending in December 2003. In all, more than 22 million trades by individual investors are included. On average during our sample period, individuals hold 12.6% of all equity in Finland, financial institutions hold 9.6%, the government holds 34.7%, and nonfinancial firms hold 33.4%. (Additional statistics on investor trading are provided in Table 1.)

| | Mean | 25th Pctl | Median | 75th Pctl |
|---|---|---|---|---|
| Panel A: Entire sample (322,454 accounts) | | | | |
| Number of years with trades | 1.9 | 1.0 | 1.0 | 2.0 |
| Number of securities traded | 3.5 | 1.0 | 1.0 | 3.0 |
| Number of trades | 15.4 | 1.0 | 3.0 | 8.0 |
| Average value of shares traded, EUR | 3,447 | 808 | 1,653 | 3,310 |
| Average portfolio value, EUR | 11,588 | 1,470 | 2,794 | 5,856 |
| Age in 1995 | 39.3 | 27.0 | 39.0 | 51.0 |
| Gender (1 = female) | 0.39 | | | |
| Trades options (1 = yes) | 0.03 | | | |
| Panel B: Accounts with disposition estimates (11,979 accounts) | | | | |
| Number of years with trades | 4.4 | 3.0 | 4.0 | 6.0 |
| Number of securities traded | 22.3 | 12.0 | 18.0 | 28.0 |
| Number of trades | 222.3 | 68.0 | 117.0 | 224.0 |
| Average value of shares traded, EUR | 5,356 | 1,855 | 3,235 | 5,759 |
| Average portfolio value, EUR | 58,828 | 5,102 | 11,483 | 26,147 |
| Age in 1995 | 35.3 | 27.0 | 34.0 | 44.0 |
| Gender (1 = female) | 0.15 | | | |
| Trades options (1 = yes) | 0.17 | | | |

| Variable | Mean | $$t$$-stat | 10th Pctl | Median | 90th Pctl |
|---|---|---|---|---|---|
| Panel C: Hazard function estimates | | | | | |
| $$\beta^{d}$$ | 1.32 | 6.81 | -0.36 | 1.04 | 2.57 |
| $$\beta^{r}$$ | -0.04 | -0.27 | -0.65 | 0.03 | 0.93 |
| $$\beta^{s}$$ | 0.01 | 0.46 | -0.05 | 0.00 | 0.04 |
| $$\beta^{V}$$ | 0.25 | 0.91 | -2.62 | 0.24 | 3.08 |

This table presents summary statistics for our data. Panel A includes all individual accounts in our data that started trading during the sample period. Panel B gives results just for those accounts for which we are able to estimate at least one disposition coefficient. We estimate the disposition coefficient only if an individual has placed at least seven round-trip trades in a given year. Number of trades is the total number of trades placed by an investor during the sample period. Average portfolio value is the average marked-to-market value of an investor’s portfolio using daily closing prices. Panel C reports summary statistics for the estimates of the hazard model in Equation (A2) in the text.

Our data allow us to reconstruct the portfolio of stocks held by each account on a daily basis. Using these holdings, we construct a proxy for wealth by calculating the average daily marked-to-market portfolio value for each investor. We also calculate the average value of trades placed by an investor each year. To measure sophistication, we note that investors who trade options are likely to be more familiar with financial markets. This is particularly true in our setting because many of the options in our data are granted to corporate executives as part of compensation. Therefore, while we do not include options trades in our estimates of disposition, we use whether an investor ever trades options as a proxy for sophistication. We also count the number of distinct securities traded by an investor over the sample period and use this as a measure of portfolio diversification.

Table 1 provides summary statistics for the new accounts in our data set. New accounts are those that place their first trade in 1995 or a subsequent year; they have no recorded initial positions. Panel A includes all new accounts that place at least one trade during our sample period (1995–2003), while Panel B gives results only for those new accounts for which we are able to estimate the disposition coefficient at least once. We attempt to estimate the disposition coefficient only if an individual has placed at least seven round-trip trades in a given year, although even with this restriction the procedure to maximize the likelihood function does not always converge. The last two rows of each panel are indicator variables, taking a value of one if the investor (a) is female or (b) trades options, and zero otherwise.

Comparing Panels A and B, it is apparent that the subset of investors for whom disposition coefficients are available differs somewhat from the larger population. By construction, the accounts in Panel B place more trades, but they also have larger portfolios, trade larger amounts of money, trade in a wider selection of securities, and are somewhat younger. Also, investors for whom we can estimate disposition are more likely to trade options (17%) than the overall sample (3%). Since we are only able to estimate disposition for investors who trade with some frequency, this likely results from the fact that investors who trade options are simply more likely to trade in general.

Figure 1 shows the number of accounts (including both new and existing accounts) that place one or more trades in each year. There is considerable variation in the number of accounts placing trades over time, from a low of 54,196 accounts in 1995 to a high of 311,013 accounts in 2000. Additions of new accounts follow a similar pattern. We discuss entry and exit from the sample in more detail in the next section.

### Measuring performance

Investor performance is the primary variable that we correlate with experience to test our hypotheses. Measuring the performance of individual investors is a significant challenge for a number of reasons. For example, it is not obvious how we should compare the performance of investors with different holding periods. Given the challenges associated with calculating performance, we take a straightforward approach that is nevertheless likely to capture much of the relevant information in the individual’s returns. In short, we calculate the returns earned by each purchased stock in the 30 trading days following each investor’s purchases. Importantly, we truncate this calculation window at the length of the actual holding if it is shorter than 30 days. We choose to focus on 30-day returns because the median holding period in our data is 39 trading days, but all of our findings remain unchanged if we use a 10-, 45-, or 60-day holding period. Our annual performance measure is the average of these 30-day returns. We provide details of our performance measures in the Appendix.^{7}
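The truncation rule described above can be sketched in a few lines. This is an illustration only: it assumes raw returns measured from the purchase-day closing price, and the function names and the toy price paths are hypothetical rather than taken from our estimation code.

```python
from statistics import mean


def truncated_return(prices, horizon=30, sell_day=None):
    """Return from purchase (day 0) to min(horizon, sell_day).

    prices: daily closing prices, with prices[0] the purchase-day price.
    sell_day: index of the day the position was sold, or None if still held.
    """
    end = horizon if sell_day is None else min(horizon, sell_day)
    end = min(end, len(prices) - 1)  # guard against short price histories
    return prices[end] / prices[0] - 1.0


def annual_performance(purchases, horizon=30):
    """Average truncated return over all purchases made in one year."""
    return mean(truncated_return(p, horizon, s) for p, s in purchases)


# Hypothetical example: one position held past the horizon, one sold on day 2.
long_hold = [100.0] + [101.0] * 29 + [103.0] + [110.0] * 5  # day-30 price is 103
short_hold = [50.0, 51.0, 49.0, 60.0]                        # sold on day 2 at 49
perf = annual_performance([(long_hold, None), (short_hold, 2)])
```

In this toy case the first purchase contributes its 30-day return (+3%) and the second contributes only the return up to its sale (−2%), so later price moves in either stock do not enter the average.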

### Measuring the disposition effect

The other outcome variable that we track to evaluate whether investors learn with experience is the disposition effect. We estimate the disposition effect using a hazard model estimated with daily transaction and return data in each year for each investor in our sample. Estimating the effect for each year means that each individual/year coefficient is estimated with a unique data set that is completely unaffected by the individual’s trades in other years. We estimate a hazard model of the form

where the hazard rate $$h_{i,j}(t)$$ is investor $$i$$’s probability of selling position $$j$$ on date $$t$$, conditional on not having sold prior to date $$t$$. $$PriceDummy_{t}$$ takes a value of one when the stock price on date $$t$$ is above investor $$i$$’s purchase price, and zero otherwise. Therefore, $$\beta^d_i$$, the coefficient on $$PriceDummy_{t}$$, measures investor $$i$$’s susceptibility to the disposition effect, which we term the “disposition coefficient.” Details of the estimation and additional discussion are provided in the Appendix.
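To make the conditioning concrete, a hazard model of this kind is typically fit on a position-day panel in which each open position contributes one row per day until it is sold. The sketch below builds such rows for a single position; the function name, row layout, and prices are hypothetical, and the sketch omits the control variables.

```python
def price_dummy_panel(purchase_price, daily_closes, sell_day=None):
    """Build (day, price_dummy, sold) rows for one position.

    daily_closes[k] is the closing price on day k + 1 after purchase.
    sell_day is the day number on which the position was sold, or None.
    """
    rows = []
    for day, close in enumerate(daily_closes, start=1):
        dummy = 1 if close > purchase_price else 0  # price above purchase price
        sold = 1 if sell_day == day else 0
        rows.append((day, dummy, sold))
        if sold:
            break  # the position leaves the risk set once it is sold
    return rows


rows = price_dummy_panel(10.0, [9.5, 10.0, 10.4, 10.8, 10.2], sell_day=4)
for row in rows:
    print(row)
```

Days on which the position trades at or below the purchase price carry a dummy of zero, and no rows are generated after the sale date, which reflects the "conditional on not having sold" part of the definition.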

### Measuring experience

We measure investing experience with both the number of years that an investor has been trading and the cumulative number of trades that an investor has placed. Of course, investors may gain experience by actively trading securities and observing the results of each trade. If this is the primary way in which investors learn, then cumulative trades will predict future investment performance and the disposition effect. However, investors may also learn by observing market quantities and considering the outcomes of hypothetical trades based on, for example, a particular information source. If investors mainly learn this way, then years of experience will be a better predictor of investment performance and the disposition effect than cumulative trades.^{8}
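Both experience measures can be computed from an account's trade dates. A minimal sketch, assuming that years of experience are counted as calendar years since the first trade (an assumption of this illustration) and using hypothetical dates:

```python
from datetime import date


def experience_measures(trade_dates, as_of):
    """Return (years_of_experience, cumulative_trades) as of a given date."""
    past = sorted(d for d in trade_dates if d < as_of)
    if not past:
        return 0, 0
    years = as_of.year - past[0].year  # calendar years since the first trade
    return years, len(past)


trades = [date(1996, 3, 1), date(1996, 3, 15), date(1997, 6, 2), date(1999, 1, 20)]
years, n_trades = experience_measures(trades, date(1999, 1, 1))
```

Two investors with the same number of years in the market can differ widely in cumulative trades, which is what allows the two measures to be distinguished in the regressions.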

## Hypotheses

We test two hypotheses in this paper to examine whether individual investors learn by trading. Both hypotheses are about how the characteristics of individuals change with time, so we test both with panel data. Our first hypothesis (H_{1}) is that investors learn to improve their ability over time; that is, “learning by doing” occurs. The second hypothesis (H_{2}) is that investors learn about their inherent ability by trading.

H_{1} is in the spirit of classical learning-by-doing models (Arrow 1962; Grossman, Kihlstrom, and Mirman 1977), which argue that the productivity of agents increases with experience. The notion here is that investors improve over time since they should, for instance, be able to choose better combinations of various information sources that help them choose which stocks to hold. The main testable prediction of H_{1} is that performance should improve over time for investors who remain active.

H_{2} is best understood in the context of Mahani and Bernhardt's (2007) model, where individuals do not initially know their ability. However, individuals obtain information about their skills through their trading experience. Only a small fraction of the population is adept at identifying profitable trading opportunities. In equilibrium, each period some novice speculators enter the financial markets. Recognizing that most traders lack financial acumen, the novices first experiment on a small scale. They then use the information contained in their trading profits to decide whether to continue. Those who earn sufficient profits conclude that they are likely to be skilled and expand their speculative activities. However, some speculators do less well, conclude that they are more likely to be inept traders, and stop actively trading. The main testable prediction of H_{2} is that investors who experience losses stop actively trading, while those with sufficient profits remain and increase their trading intensity over time.

If either H_{1} or H_{2} is true, we should find an improvement in the average performance of the pool of investors over time. We therefore begin by testing a simple learning model to look for evidence of learning overall. If we assume a world with random attrition and no heterogeneity in ability among investors, a simple regression of performance on experience gives an estimate of the first type of learning (learning-by-doing). Alternatively, it gives an estimate of the second type of learning (learning-about-ability) if the pool of investors improves over time as the low-ability investors stop trading and those with sufficient profits continue to trade. To disentangle H_{1} and H_{2}, we therefore need to examine the role played by individual heterogeneity and attrition in our data. Accounting for unobserved individual heterogeneity and attrition provides us with an assessment of H_{2}, and any learning that remains provides an estimate of H_{1}.
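A toy panel illustrates why a pooled comparison cannot separate the two hypotheses. In the sketch below, which uses made-up numbers and is not our selection model, ability is fixed and no investor improves, yet the pooled average return rises because the low-ability investor exits.

```python
from statistics import mean

# (investor, year, return) with fixed ability: nobody improves over time.
panel = [
    ("skilled", 1, 0.01), ("skilled", 2, 0.01),
    ("unskilled", 1, -0.01),  # exits after year 1
]


def pooled_mean_by_year(rows):
    """Average return of all investors active in each year."""
    years = sorted({y for _, y, _ in rows})
    return {y: mean(r for _, yr, r in rows if yr == y) for y in years}


def within_investor_changes(rows):
    """Year-over-year change for investors observed in consecutive years."""
    by_inv = {}
    for inv, y, r in rows:
        by_inv.setdefault(inv, {})[y] = r
    return [rets[y + 1] - rets[y]
            for rets in by_inv.values() for y in rets if y + 1 in rets]


pooled = pooled_mean_by_year(panel)
changes = within_investor_changes(panel)
```

Here the pooled mean rises from 0 to 1% between years 1 and 2, while the only within-investor change is zero: all of the apparent improvement is learning about ability.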

## Results

We present our empirical findings in this section. We begin by presenting the results of our tests relating to performance and disposition in Section 3.1. The simple learning model is presented in Section 3.2, and we deal with unobserved heterogeneity in Section 3.3. Section 3.4 examines the importance and magnitude of survivorship effects, and we present evidence on changes in trading intensity and risk-taking in Section 3.5. Finally, a number of robustness tests are presented in Section 3.6.

### Performance and disposition

We start by examining trader performance to be sure that our performance measure has similar properties to the measures used in the literature. Previous papers, in particular Odean (1998), have shown that average investor performance is worse than that of the market portfolio. Poor performance by average investors is also a prediction of the model in Mahani and Bernhardt (2007). We test whether individuals on average outperform a market index by calculating the average return to a stock purchased by an individual investor net of the market return. Calculating this average at a 30-day horizon (using our convention of using a shorter holding period if the individual sells the stock before 30 days) yields an average return net of the market of −48 basis points. At a 60-day horizon, the average net return is −50 basis points. At both of these horizons, returns net of the market return are negative and highly statistically significant.

While the average performance of individual investors is likely to be quite poor, Coval, Hirshleifer, and Shumway (2005) show that some individuals persistently outperform others. Again, performance persistence among individuals is an implication of Mahani and Bernhardt (2007). We test whether there is any persistence in investor performance in three related ways. First, we regress each investor’s average 30-day return in year $$t$$ on the investor’s average return in year $$t - 1$$ and year fixed effects. Using year fixed effects adjusts for time-series variation in average market returns. The estimated coefficient in this regression is 0.183 ($$p < 0.0001$$), which is both statistically and economically significant. Second, we calculate each investor’s average return in two disjoint time periods: 1995–1999 and 2000–2003. We then calculate the Spearman rank correlation between the return series from the first period and that from the second period. This correlation is 0.164 ($$p < 0.0001$$), again both statistically and economically significant. Our third test involves sorting investors in each year into performance quartiles, and then plotting the average performance of each of those quartiles for the next several years. This plot, which appears in Figure 2, again gives evidence that the most successful investors in the past continue to outperform the least successful investors for at least a couple of years. Results calculated with alphas instead of raw returns are qualitatively the same. These results confirm that there is a degree of persistence in individual returns.^{9}
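For readers unfamiliar with the second test, the Spearman rank correlation is the Pearson correlation of the ranks of the two series. The following self-contained sketch uses hypothetical returns for five investors; it illustrates the statistic only and does not reproduce the estimate reported above.

```python
def average_ranks(values):
    """Ranks starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    vx = sum((a - mx) ** 2 for a in x)
    vy = sum((b - my) ** 2 for b in y)
    return cov / (vx * vy) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical average returns for five investors in the two subperiods.
early = [0.02, -0.01, 0.00, 0.03, -0.02]
late = [0.01, 0.00, -0.01, 0.02, -0.03]
rho = spearman(early, late)
```

Because the statistic depends only on ranks, it is insensitive to outliers in individual returns, which is useful when some accounts have very few trades.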

Next, we examine the other outcome variable that we will use in our analysis—the disposition effect. The disposition effect is quite large in our data. Figure 3 is a plot of the relation between the propensity to sell an existing position (the hazard ratio) and the position’s holding period return. To generate this plot, we group all investors and estimate one hazard model each year. We group the data for this procedure so that we can estimate a model with many covariates, but almost all of the tests that follow are based on individual-level results.^{10} The conditional hazard ratio is remarkably similar across years. The plot shows an obvious kink in the hazard ratio near zero: investors are clearly more likely to sell a stock if it has increased in value since the date of purchase. This provides strong support for the presence of a disposition effect in aggregate, consistent with the extensive literature cited above.

Turning to our main individual-level disposition regressions, we require that an investor place at least seven round-trip trades in a year to be included in the sample, and we run the regression for each investor-year to generate a separate disposition coefficient whenever possible. While this filter drastically reduces our sample size, it is necessary to ensure that our coefficients of interest are identified. Panel C of Table 1 summarizes these estimates. The median disposition coefficient in the cross-section of investors across years is 1.04, which is economically quite large. This coefficient implies that the median new investor in our data is $$e^{1.04} \approx 2.8$$ times more likely to sell a stock whose price is above its purchase price than a stock that has fallen in value since the time of purchase. None of the controls is statistically significant in the cross-section.
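The conversion from a disposition coefficient to a relative selling propensity is a simple exponentiation, as the short sketch below shows for the median and mean coefficients reported in Panel C of Table 1 (the function name is illustrative):

```python
import math


def hazard_ratio(beta_d: float) -> float:
    """Relative propensity to sell a winner versus a loser implied by beta_d."""
    return math.exp(beta_d)


median_ratio = hazard_ratio(1.04)  # median coefficient from Panel C
mean_ratio = hazard_ratio(1.32)    # mean coefficient from Panel C
print(round(median_ratio, 2), round(mean_ratio, 2))
```

A coefficient of zero corresponds to a ratio of one, that is, no difference in the propensity to sell winners and losers.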

Before we can consider whether investors learn to avoid the disposition effect, we need to argue that the effect is in fact a behavioral bias. In particular, one necessary condition for disposition to be a costly behavioral bias is that investors with more disposition have inferior investment performance. If disposition is unrelated to investment performance, investors with the effect have little incentive to learn to avoid it. To get a sense of how returns vary with disposition, we examine average investor returns across quintiles of the disposition coefficient. In this sort, the disposition coefficients are always estimated one year before the average returns are calculated, so disposition coefficients and average returns are not mechanically correlated in any way. For each quintile, Figure 4 graphs the average return earned by investors over different horizons from the purchase date. Returns are substantially higher in the lowest disposition quintile than in the highest disposition quintile. For example, in the 30 days following a purchase, a stock’s price increases 46 bp on average when bought by an investor in the lowest disposition quintile, compared with a decline of 54 bp if purchased by an investor in the highest disposition quintile. The differences between high- and low-quintile average returns range from 17 bp at the 10-day horizon to 131 bp at the 45-day horizon. These differences are both economically and statistically large, leading us to conclude that individuals with high disposition effect coefficients have relatively poor investment performance.

We can also verify that disposition is costly by mimicking the tests of Odean (1998), as presented in Table 2. The idea is to compare the returns of stocks sold at a gain with those that could have been sold at a loss but were not. In order to see the difference in cost of the disposition effect to investors in the low- and high-disposition groups, we implement a difference-in-difference specification. Using a regression framework instead of simple averages allows us to include year fixed effects to absorb year-specific factors such as average market performance. We define low- and high-disposition groups as the bottom and the top quintiles of the disposition coefficient and exclude investors who do not fall into either of these two quintiles. The negative coefficient in the first row indicates that the effect documented by Odean is reversed for the low-disposition group: stocks sold for a gain by low-disposition investors subsequently underperform those that could have been sold at a loss. The difference-in-difference coefficient is significantly positive at the three horizons we consider, indicating that the effect documented by Odean (1998) is stronger for individuals in our high-disposition group than in the low-disposition group. That is, high-disposition investors sell stocks that subsequently outperform the stocks they could have sold, but low-disposition investors do not. This test reiterates that the disposition effect is costly to investors.
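Because the specification is a difference-in-difference, the post-sale return gap for each group can be read off the coefficients in Table 2: the "sold for a gain" coefficient applies to the low-disposition group, and adding the interaction term gives the gap for the high-disposition group. The sketch below does this arithmetic; the dictionary layout and function name are illustrative only.

```python
# Coefficients from Table 2 (returns expressed as decimals), keyed by horizon.
TABLE2 = {
    30: {"gain": -0.0164, "high": -0.0120, "gain_x_high": 0.0605},
    45: {"gain": -0.0520, "high": -0.0191, "gain_x_high": 0.1063},
    60: {"gain": -0.0541, "high": -0.0145, "gain_x_high": 0.1146},
}


def gain_effect(horizon: int, high_disposition: bool) -> float:
    """Return gap between stocks sold for a gain and losers that were kept."""
    c = TABLE2[horizon]
    return c["gain"] + (c["gain_x_high"] if high_disposition else 0.0)


for h in TABLE2:
    low = gain_effect(h, high_disposition=False)
    high = gain_effect(h, high_disposition=True)
    print(f"{h} days: low {low:+.4f}, high {high:+.4f}")
```

At the 30-day horizon, for example, the implied gap is −0.0164 for the low-disposition group and −0.0164 + 0.0605 = 0.0441 for the high-disposition group.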

| Ex post return horizon | 30 Days | 45 Days | 60 Days |
|---|---|---|---|
| Sold for a gain | −0.0164 | −0.0520 | −0.0541 |
| | (0.0021)^{***} | (0.0024)^{***} | (0.0026)^{***} |
| High disposition | −0.0120 | −0.0191 | −0.0145 |
| | (0.0052)^{***} | (0.0069)^{***} | (0.0084) |
| Sold for a gain × high disposition | 0.0605 | 0.1063 | 0.1146 |
| | (0.0026)^{***} | (0.0033)^{***} | (0.0040)^{***} |
| Year fixed effects | Yes | Yes | Yes |

This table shows the results of a regression that effectively implements the same test reported in Odean's (1998) Table 6. The idea is to compare the returns of stocks sold at a gain with those that could have been sold at a loss but were not. To see the difference in the cost of the disposition effect to investors in the low- and high-disposition groups, we implement a difference-in-difference specification. Using a regression framework instead of simple averages allows us to include year fixed effects to absorb year-specific factors such as average market performance. We define low- and high-disposition groups as the bottom and top quintiles of the disposition coefficient. The negative coefficient in the first row indicates that for the low-disposition group the effect documented by Odean is reversed: stocks sold for a gain subsequently underperform. The difference-in-difference coefficient is significantly positive at the three horizons we consider, indicating that the effect documented by Odean (1998) is stronger for individuals in our high-disposition group than in the low-disposition group. Robust standard errors, clustered by investor, are presented in parentheses. ^{***}, ^{**}, and ^{*} denote significance at 1%, 5%, and 10%, respectively.
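The way the Table 2 coefficients combine can be made explicit with a little arithmetic. In a difference-in-difference regression of this kind, the "sold for a gain" coefficient is the effect for the low-disposition group, and the effect for the high-disposition group is that coefficient plus the interaction term. The sketch below applies this to the reported estimates; the function name `implied_gain_effect` is illustrative and not part of the original analysis.

```python
# Point estimates copied from Table 2, keyed by ex post horizon in days.
SOLD_FOR_GAIN = {30: -0.0164, 45: -0.0520, 60: -0.0541}  # low-disposition group
INTERACTION = {30: 0.0605, 45: 0.1063, 60: 0.1146}       # sold for a gain x high disposition


def implied_gain_effect(horizon, high_disposition):
    """Return the implied 'sold for a gain' effect for one disposition group."""
    effect = SOLD_FOR_GAIN[horizon]
    if high_disposition:
        # For the high-disposition group the interaction term is added on top.
        effect += INTERACTION[horizon]
    return effect


if __name__ == "__main__":
    for horizon in (30, 45, 60):
        low = implied_gain_effect(horizon, high_disposition=False)
        high = implied_gain_effect(horizon, high_disposition=True)
        print(f"{horizon} days: low = {low:+.4f}, high = {high:+.4f}")
```

Under this reading, the implied effect is negative for the low-disposition group and positive for the high-disposition group at every horizon, which is the pattern the text describes.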

Another necessary condition for disposition to be a behavioral bias is that disposition is a somewhat stable, predictable attribute of a particular investor. We test this conjecture by estimating the disposition effect at the investor level in adjacent time periods. Each set of estimates comes from a completely disjoint data set. Any trades that are not closed at the end of the first period are considered censored in the model estimated with first-period data. Moreover, any trades that are not closed at the end of the first period are completely ignored in the model estimated with second-period data. We explore the stability of disposition coefficients by estimating the rank correlation of account-level disposition coefficients over the two periods, testing whether the rank correlation is significantly different from zero. We estimate the rank correlation between an investor’s disposition coefficient in year $$t$$ and their coefficient in year $$t - 1$$ to be 0.364, suggesting that there is a fair degree of persistence in the individual’s disposition coefficient. This correlation is highly statistically significant. Taken together, these results provide strong evidence that the disposition effect is a widespread and economically important behavioral bias that is present in each year of our study.
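As an illustration of the persistence test, the following is a minimal sketch of a rank (Spearman) correlation between investor-level coefficients in two adjacent periods. The text does not say which rank correlation is used, so Spearman's is an assumption here, and the five coefficient pairs are made-up numbers, not values from the sample.

```python
def average_ranks(values):
    """Assign 1-based ranks, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda k: values[k])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        tied_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = tied_rank
        i = j + 1
    return ranks


def pearson(x, y):
    """Plain Pearson correlation of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5


def spearman(x, y):
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


# Hypothetical disposition coefficients for five accounts in years t-1 and t.
disposition_prev = [0.2, 1.5, 0.9, 2.1, 1.1]
disposition_next = [0.4, 1.2, 1.0, 1.9, 0.7]

if __name__ == "__main__":
    print(spearman(disposition_prev, disposition_next))
```

Because only ranks enter the calculation, the measure is insensitive to a few very imprecisely estimated coefficients, which is one reason a rank correlation is a natural choice here.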

### Simple learning model

We start our analysis by examining whether more experienced investors have better investment performance. We should find an improvement in the average performance of the pool of investors over time if either H_{1} or H_{2} is true. Therefore, before disentangling H_{1} and H_{2}, the simple learning model helps us to assess whether either or both of these effects exist. We adjust for individual heterogeneity and survivorship in the next section. Our analysis begins by estimating a simple learning model of the form

$$y_{i,t+1} = \alpha + \beta_{1}\,Experience_{i,t} + \beta_{2}\,Experience_{i,t}^{2} + \gamma' X_{i,t} + \varepsilon_{i,t},$$

where $$y_{i,t+1}$$ is either investor $$i$$'s average 30-day return following purchases ($$\bar R_{i,t+1}$$) or the investor's disposition coefficient ($$\beta^{d}_{i,t+1}$$). We include $$Experience^{2}$$ to allow investors to learn faster during earlier years. We proxy for investors’ trading experience by either years of experience or cumulative number of trades placed, and also include a vector of controls ($$X_{i,t}$$) that might be expected to affect an individual’s performance as he or she trades, such as the individual’s average total daily portfolio value (PortVal).

Columns 1 and 2 of Table 3 report the results of the performance learning regression. When experience is measured in either number of years or cumulative trades, it is positively and significantly related to average returns. An additional year of experience increases average 30-day post-purchase returns by 41 − 4 = 37 bp, or approximately 3% at an annualized rate. An additional 100 trades increases returns at slightly over one-fourth of this rate. Again, results estimated with alpha instead of raw returns are quite similar (unreported).^{11} While these estimates are encouraging, the speed of learning they imply seems almost implausibly large. For instance, taking the regression parameters at face value, an investor with eight years of experience should outperform a new investor by about 22% per year. While we observe some heterogeneity in investor ability (or some performance persistence), it is not nearly large enough to justify these large coefficients.
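The "41 − 4 = 37 bp" figure is the change in the fitted quadratic when experience moves from zero to one year. The sketch below reproduces that arithmetic from the Column 1 estimates. The annualization step assumes 252 trading days per year and a 30-trading-day horizon; that convention is an assumption, chosen because it yields a figure close to the approximately 3% quoted above.

```python
def experience_effect(years, beta1, beta2):
    """Fitted contribution of experience: beta1 * years + beta2 * years^2."""
    return beta1 * years + beta2 * years ** 2


# Table 3, Column 1 estimates, in percent per 30-day horizon.
BETA1, BETA2 = 0.414, -0.043

# Gain from the first year of experience: 0.414 - 0.043 = 0.371 (about 37 bp).
first_year_gain = experience_effect(1, BETA1, BETA2) - experience_effect(0, BETA1, BETA2)

# Assumed convention: 252 trading days per year, 30-trading-day horizon.
PERIODS_PER_YEAR = 252 / 30
annualized = first_year_gain * PERIODS_PER_YEAR

if __name__ == "__main__":
    print(f"first-year gain: {first_year_gain:.3f}% per 30 days")
    print(f"annualized (simple scaling): {annualized:.2f}%")
```

Because the quadratic coefficient is negative, each further year adds less than the previous one under these estimates.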

| | Simple learning model | | | | Learning model with heterogeneity | | | |
|---|---|---|---|---|---|---|---|---|
| Dependent variable | $$\bar R_{i,t+1}$$ | $$\bar R_{i,t+1}$$ | $$\beta^{d}_{i,t+1}$$ | $$\beta^{d}_{i,t+1}$$ | $$\bar R_{i,t+1}$$ | $$\bar R_{i,t+1}$$ | $$\beta^{d}_{i,t+1}$$ | $$\beta^{d}_{i,t+1}$$ |
| | (1) | (2) | (3) | (4) | (5) | (6) | (7) | (8) |
| $$YearsTraded_{t}$$ | 0.414 | | −0.050 | | 0.249 | | −0.010 | |
| | (0.160)^{***} | | (0.014)^{***} | | (0.401) | | (0.038) | |
| $$YearsTraded^{2}_{t}$$ | −0.043 | | 0.007 | | −0.022 | | 0.01 | |
| | (0.027) | | (0.002)^{***} | | (0.038) | | (0.003)^{***} | |
| $$CumulTrades_{t} ( \div 10^{2})$$ | | 0.110 | | −0.041 | | 0.058 | | −0.030 |
| | | (0.043)^{***} | | (0.003)^{***} | | (0.059)^{***} | | (0.005)^{***} |
| $$CumulTrades^{2}_{t} ( \div 10^{4})$$ | | −0.0005 | | 0.0002 | | −0.0009 | | 0.0002 |
| | | (0.0003)^{*} | | (0.00002)^{***} | | (0.0004)^{**} | | (0.00003)^{***} |
| $$NumSec_{t}$$ | 0.096 | 0.104 | −0.011 | −0.012 | 0.003 | 0.002 | −0.008 | −0.008 |
| | (0.012)^{***} | (0.012)^{***} | (0.001)^{***} | (0.001)^{***} | (0.024) | (0.024) | (0.002)^{***} | (0.002)^{***} |
| $$NumTrades_{t}$$ | −0.031 | −0.043 | 0.0006 | 0.005 | −0.021 | −0.029 | −0.0007 | 0.0003 |
| | (0.005)^{***} | (0.007)^{***} | (0.001) | (0.001)^{***} | (0.010)^{**} | (0.010)^{***} | (0.001) | (0.001) |
| $$PortVal_{t} ( \div 10^{6})$$ | 0.104 | 0.099 | −0.006 | −0.003 | 0.618 | 0.587 | −0.067 | −0.065 |
| | (0.061)^{*} | (0.058)^{*} | (0.003)^{**} | (0.003) | (0.602) | (0.601) | (0.602) | (0.601) |
| $${\bar R_t}$$ | | | −0.005 | −0.005 | | | −0.002 | −0.002 |
| | | | (0.001)^{***} | (0.001)^{***} | | | (0.001)^{*} | (0.001)^{*} |
| Individual fixed effects | No | No | No | No | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes | Yes | Yes | Yes | Yes |
| Observations | 13,404 | 13,404 | 17,715 | 17,715 | 13,404 | 13,404 | 17,715 | 17,715 |
| $$R^{2}$$ (%) | 1.6 | 1.7 | 1.1 | 1.6 | 45.7 | 45.9 | 38.7 | 38.6 |

This table presents results for regressions of the form

$$y_{i,t+1} = \alpha_{i} + \beta_{1}\,Experience_{i,t} + \beta_{2}\,Experience_{i,t}^{2} + \gamma' X_{i,t} + \varepsilon_{i,t},$$

where the dependent variable is either the investor’s average 30-day return following purchases ($${\bar R_{i,t + 1}}$$) or the investor’s disposition coefficient ($$\beta^{d}_{i,t+1}$$). In the Simple Learning Model, the $${\alpha }_{i}$$ are held constant across all investors, while in the model with heterogeneity they are allowed to vary. Experience is measured by either years of experience (YearsTraded) or cumulative number of trades placed (CumulTrades). $$X_{i,t}$$ is a vector of controls including the number of trades placed by the individual in a given year (NumTrades), the number of securities held by the individual in a given year (NumSec), and the individual’s average total daily portfolio value (PortVal). Data are from the period 1995 to 2003. Standard errors are in parentheses, and ^{***}, ^{**}, and ^{*} denote significance at 1%, 5%, and 10%, respectively.

Columns 3 and 4 of Table 3 present our results for the disposition learning regressions. To reduce the weight given to disposition coefficients that are not estimated very precisely, we estimate the regressions with weighted least squares, where the weights are proportional to $$1/\widehat{{\rm{Var}}\left( {{{\rm{\beta }}^d}} \right)}$$ from our hazard regression (Equation (A2) in the Appendix). Column 3 shows that disposition declines with experience (β_{1} < 0). Moreover, investors tend to slow down in their learning as they gain experience since β_{2} > 0. Wealthier traders, investors who trade more securities, and investors who earned higher returns in the previous year all have lower levels of disposition. Column 4 indicates that an additional 100 trades reduces the disposition coefficient by 0.041, which is similar to the coefficient on Experience in Column 3. In other words, a year of experience or 100 trades has approximately the same effect on disposition. In each of the specifications, the estimated YearsTraded and CumulTrades coefficients are statistically significant at the 1% level. Economically, however, our results suggest that investors learn relatively slowly. Specifically, the estimates in Column 3 suggest that an additional year of experience corresponds to a reduction in the disposition coefficient of approximately 0.05. To provide some context for this estimate, note that the unconditional median disposition coefficient in our sample is 1.04. An extra year of experience decreases this by about 5%.
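The weighting scheme can be illustrated with a small sketch of inverse-variance weighted least squares. It assumes a single regressor plus an intercept, whereas the regressions above include further controls and fixed effects; the function name `wls_line` and the data are hypothetical.

```python
def wls_line(x, y, variances):
    """Fit y = a + b * x by weighted least squares with weights 1 / variance.

    Returns (intercept, slope). Observations whose dependent variable is
    estimated imprecisely (large variance) receive little weight.
    """
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    x_bar = sum(w * xi for w, xi in zip(weights, x)) / total
    y_bar = sum(w * yi for w, yi in zip(weights, y)) / total
    s_xy = sum(w * (xi - x_bar) * (yi - y_bar) for w, xi, yi in zip(weights, x, y))
    s_xx = sum(w * (xi - x_bar) ** 2 for w, xi in zip(weights, x))
    slope = s_xy / s_xx
    return y_bar - slope * x_bar, slope


# Hypothetical data: three precisely estimated disposition coefficients that
# fall by 0.05 per year of experience, plus one very noisy outlier.
years = [1, 2, 3, 4]
disposition = [0.95, 0.90, 0.85, 2.00]
estimated_variance = [0.01, 0.01, 0.01, 1e6]

if __name__ == "__main__":
    intercept, slope = wls_line(years, disposition, estimated_variance)
    print(f"intercept = {intercept:.3f}, slope = {slope:.3f}")
```

In this toy example the noisy fourth observation has almost no influence on the fitted slope, which is the purpose of the weighting described above.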

As mentioned in Section 2, these estimates could reflect the amount of learning-by-doing in a world with exogenous attrition and no investor heterogeneity or they could reflect learning-about-ability if the pool of investors improves over time due to attrition of low-ability traders. We now move on to testing H_{1} and H_{2} by examining the role played by individual heterogeneity and attrition in our data.

### Learning model with individual fixed effects

The simple learning model does not account for unobserved investor heterogeneity. As indicated in Figure 1, investor participation in our sample changes significantly over time. Consequently, cohort effects might make it appear as if there is learning even if there is none. For instance, if the number of trades placed by an investor is a noisy measure of ability, our experience variables could capture learning even if investors have constant ability and high-ability investors trade more than low-ability ones. With this in mind, our next model exploits the long time series of performance and disposition estimates available for each investor to assess the impact of time-invariant unobserved investor-level factors such as ability, education, or wealth on investor learning:

$$y_{i,t+1} = \alpha_{i} + \beta_{1}\,Experience_{i,t} + \beta_{2}\,Experience_{i,t}^{2} + \gamma' X_{i,t} + \varepsilon_{i,t}, \qquad (3)$$

where $$\alpha_{i}$$ is an investor fixed effect and the remaining terms are defined as in the simple learning model.

The performance results on YearsTraded, reported in Column 5 of Table 3, suggest that an investor with one year of experience will earn 22 bp more than an inexperienced investor over a 30-day horizon. Column 6 indicates that a similar increase in returns comes from an additional 400 trades. This suggests that though investor performance improves with experience, accounting for individual heterogeneity reduces the estimates by about 50%. Results from the disposition regressions including individual fixed effects are presented in Columns 7 and 8. While YearsTraded is no longer significant in these regressions, Column 8 suggests that 100 trades reduces the disposition coefficient by approximately 0.03. Comparing these estimates with those from the regression without fixed effects, we find that the learning estimates are again reduced by roughly one-half.
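The cohort argument can be illustrated with a toy panel. The sketch below compares a pooled slope with a within-investor slope obtained by demeaning each variable by investor, which is one standard way to estimate a fixed effects model; the text does not state which estimator is used, and the two hypothetical investors and their numbers are invented for illustration.

```python
from collections import defaultdict


def ols_slope(x, y):
    """Slope from a pooled regression of y on x with an intercept."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    return s_xy / s_xx


def within_slope(ids, x, y):
    """Slope after removing investor means (the within transformation)."""
    groups = defaultdict(list)
    for investor, xi, yi in zip(ids, x, y):
        groups[investor].append((xi, yi))
    dx, dy = [], []
    for observations in groups.values():
        mean_x = sum(o[0] for o in observations) / len(observations)
        mean_y = sum(o[1] for o in observations) / len(observations)
        dx.extend(o[0] - mean_x for o in observations)
        dy.extend(o[1] - mean_y for o in observations)
    return sum(a * b for a, b in zip(dx, dy)) / sum(a * a for a in dx)


# Hypothetical panel: both investors improve by 0.2 per year of experience,
# but investor "B" has higher fixed ability (+1.0) and more experience.
investor = ["A", "A", "B", "B", "B"]
experience = [1, 2, 3, 4, 5]
performance = [0.2, 0.4, 1.6, 1.8, 2.0]

if __name__ == "__main__":
    print("pooled slope:", ols_slope(experience, performance))
    print("within slope:", within_slope(investor, experience, performance))
```

In this example the pooled slope overstates learning because the more able investor is also the more experienced one, while the within-investor slope recovers the common rate of improvement.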

These regressions assume that individuals stop trading for purely exogenous reasons. The model shows that some learning by trading is occurring, but it gives little information about the nature of that learning. In the model of Mahani and Bernhardt (2007), as low-ability investors trade they realize that their inherent level of ability is low and decide to stop trading actively. Thus, examining attrition in the sample is critical to disentangle how much of the estimates of β_{1} and β_{2} is impacted by the endogenous nature of our sample, or how much of β_{1} and β_{2} is driven by H_{1} and how much is driven by H_{2}. To separate our inferences about H_{1} and H_{2}, we carefully control for investor heterogeneity and survivorship in our next set of tests.

### Impact of attrition on learning estimates

Endogenous attrition can significantly affect our learning estimates, a problem best understood if we consider the decision of the investor to continue trading. To represent an investor who continues to trade when his or her performance is good and stops once he or she gets a few bad draws, define a selection indicator $$s_{i,t}$$ that equals one if investor $$i$$ is in the sample in year $$t$$ and zero otherwise. The learning regressions above are valid only if $$s_{i,t}$$ is uncorrelated with $$\varepsilon_{i,t}$$, which may not be the case.

We first present some overall attrition evidence by examining the rate at which investors who are in our sample (having placed seven round-trip trades) in one year fail to place any trades during the rest of our sample period. Figure 5 shows that attrition is a significant feature of our data. Since the rest of the sample period changes from year to year, the earlier years of our sample period provide more reliable estimates of true exit rates than the later years. Approximately 25% of those traders who enter the sample in one year fail to ever trade again. Of traders who trade for two or three years, about 5% permanently exit the sample.

To assess whether selection is a problem, we conduct the Verbeek and Nijman (1992) test by including the lagged selection dummy $$s_{i,t-1}$$ in the fixed effects model. If selection is not a problem, so that investor attrition is random, the coefficient estimate on the selection dummy should be insignificant. In regression (3), the estimated coefficient on $$s_{i,t-1}$$ is significant for both performance (coefficient 2.19, $$t$$-stat 6.44) and disposition (coefficient 0.10, $$t$$-stat 3.30). This again indicates that selection is severe in the sample and attrition is not random.

To examine directly how much ceasing to trade (or learning about ability) affects our inferences about learning, we use a modified version of the selection model introduced by Heckman (1976). The classic Heckman model involves a two-stage procedure: a selection model in the first stage to predict which observations will be observable in the second stage, and the regression of interest with an adjustment for survivorship bias in the second stage. It does not, however, account for individual heterogeneity. The evidence in Section 3.3 suggests that accounting for individual heterogeneity may be important to control for cohort or similar effects. Thus, we modify this procedure to account for both survivorship bias and individual heterogeneity, adopting the empirical strategy of Wooldridge (1995), which modifies the Heckman model to allow for fixed effects. Intuitively, this approach accounts for survivorship by estimating the selection model every year and including the inverse Mills ratios of each selection equation (which summarize the predicted probability that an individual continues to trade) in the learning regression model. Individual time-invariant heterogeneity is accounted for in this method by running the regression in first differences. More concretely, the learning regression model we estimate is

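$$\Delta y_{i,t+1} = \beta_{1}\,\Delta Experience_{i,t} + \beta_{2}\,\Delta Experience_{i,t}^{2} + \gamma' \Delta X_{i,t} + \rho_{96}\lambda_{96} I(t = 1996) + \cdots + \rho_{02}\lambda_{02} I(t = 2002) + \Delta\varepsilon_{i,t}, \qquad (5)$$

where $$\Delta$$ denotes a first difference, $$\lambda_{96}, \ldots, \lambda_{02}$$ are the inverse Mills ratios from the cross-sectional probit model (the selection model) in each of the years 1996–2002, and $$I({\cdot})$$ is an indicator variable. (1995 is omitted to avoid perfect collinearity of the model.) Including these variables in the learning regression accounts for the impact of the selection equation. Note that a joint test of $${\rho }_{t} = 0$$ for $$t = 1996, \ldots , 2002$$ is a test of whether survivorship bias is a concern (Wooldridge 1995).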
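For readers unfamiliar with the inverse Mills ratio, the sketch below computes it in its standard form for selected observations, λ(z) = φ(z)/Φ(z), where z is the fitted probit index and φ and Φ are the standard normal density and distribution function. The text does not spell out the formula, so this standard form is an assumption, and the function names are illustrative.

```python
import math


def normal_pdf(z):
    """Standard normal density."""
    return math.exp(-0.5 * z * z) / math.sqrt(2.0 * math.pi)


def normal_cdf(z):
    """Standard normal distribution function, via the error function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def inverse_mills(z):
    """Inverse Mills ratio for a selected observation with probit index z."""
    return normal_pdf(z) / normal_cdf(z)


if __name__ == "__main__":
    # A higher probit index means a higher predicted probability of staying
    # in the sample and therefore a smaller selection correction.
    for index in (-2.0, 0.0, 2.0):
        print(index, round(normal_cdf(index), 4), round(inverse_mills(index), 4))
```

The ratio is large for investors who were unlikely to remain in the sample but did, which is where a selection correction matters most.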

The first stage uses cross-sectional probit regressions to predict whether or not the individual ceases to trade in a given period. The probit regressions include a constant, linear and quadratic experience terms, the number of different stocks the investor trades, the individual’s average return in the previous year, the cross-sectional standard deviation of the individual’s previous-year 30-day return, and a dummy variable that is one if the individual’s average daily marked-to-market total portfolio value is in the top quartile of all investors. As instruments, we use the following variables: (1) a dummy variable for whether an investor inherited shares in the previous calendar year (*Inheritance*); and (2) the proportion of active traders in the zip code of the individual (excluding the individual) in the previous calendar year (*Activeness*). As we will explain below, both of these variables are likely to satisfy the necessary exclusion restrictions; that is, they are likely to affect the probability of remaining in the sample, but are unlikely to affect changes in an individual’s performance or disposition effect except through their effect on survival.

For our first instrument, we conjecture that an individual who inherits shares is more likely to trade in the future, perhaps because his or her wealth has increased or because the new shares cause him or her to pay more attention to the stock market. Another explanation, consistent with Jin and Scherbina (2008), is that inherited shares may prompt the recipient to trade since the shares may not fit with the investor’s desired asset allocation. This satisfies the exogeneity condition since inheritance of shares from a relative is unlikely to directly affect changes in the performance or disposition effect of an individual. In the data, when an investor dies and shares are transferred to an heir, it appears as a transaction with the account of the deceased selling shares and the account of the heir purchasing shares. A special code identifies the transaction as an inheritance. In our sample, death transfers are evenly spread over the sample (ranging from 62 in 1995 to as high as 443 in 2003).

Our second instrument is based on the proportion of accounts in an individual’s zip code that place at least seven round-trip trades in the previous year. The papers that argue that an individual investor is more likely to trade if his or her neighbors are trading (Hong, Kubik, and Stein 2004; Ivković and Weisbenner 2007) imply that this measure is correlated with investor activity. This instrument satisfies the exclusion restriction, since having active traders in one’s zip code is unlikely to directly affect changes in performance or disposition effect.^{12} There are 1979 zip codes in the data and the measure varies between 0% and 88%.

We construct the sample to be used in the selection model as follows. An account observation is added to the selection sample if it places one or more trades in a given year. This differs from our main sample, where we require investors to have placed at least seven round-trip trades in order to estimate either the average performance or the disposition coefficient. Once an account is added, it remains in the selection sample until 2003, which is the end of our data. In some years, an account will have placed enough round-trip trades to be included in our hazard regressions, so the data will include a performance average and a disposition estimate for this account. However, each year we will also have data on many accounts for which we do not have performance and disposition estimates. If estimates are available, we treat the account as having been selected into our data. Results from the selection model, with two-step efficient estimates of the parameters and standard errors, are given in Table 4. The first-stage selection model uses 36,030 observations, while the second-stage regression (in first differences) uses only 11,959 observations in the performance regression and 16,188 observations in the disposition regressions.

Selection models (1st Stage) | 2nd Stage | 2nd Stage | |||
---|---|---|---|---|---|

Dependent variable | In-sample_{$$i,t+1$$} = 1 | $$\Delta {\bar R_{i,t + 1}}$$ | $$\Delta\beta^d_{i,t+1}$$ | ||

$$YearsTraded_{t}$$ | 0.292 | 0.451 | −0.038 | ||

(0.199) | (0.521) | (0.042) | |||

$$YearsTraded^{2}_{t}$$ | 0.028 | −0.009 | 0.007 | ||

(0.008)^{***} | (0.043) | (0.003)^{**} | |||

$$CumulTrades_{t} ( \div 10^{2})$$ | 0.341 | 0.036 | −0.021 | ||

(0.021)^{***} | (0.003)^{***} | (0.006)^{***} | |||

$$CumulTrades^{2}_{t} ( \div 10^{4})$$ | −0.0011 | −0.001 | 0.002 | ||

(0.000)^{***} | (0.000) | (0.000)^{***} | |||

$${\bar R_{t - 1}}$$ | 0.859 | 1.040 | 1.050 | −0.211 | |

(0.064)^{***} | (0.082)^{***} | (0.082)^{***} | (0.162) | ||

$$\sigma \left( {{{\bar R}_{t - 1}}} \right)$$ | −0.160 | −0.181 | −6.539 | 0.047 | |

(0.062)^{***} | (0.061)^{***} | (0.829)^{***} | (0.097) | ||

Trades options | 0.257 | 0.257 | |||

(0.018)^{***} | (0.018)^{***} | ||||

Wealthy | 0.430 | 0.427 | |||

(0.021)^{***} | (0.022)^{***} | ||||

$$I$$(Inherit = 1) | 0.134 | ||||

(0.060)^{**} | |||||

Activeness | 0.289 | ||||

(0.043)^{***} | |||||

Inverse Mills ratios: | |||||

ρ_{96} | 4.868 | −0.128 | |||

(0.883)^{***} | (0.062)^{**} | ||||

ρ_{97} | 12.57 | −0.141 | |||

(1.130)^{***} | (0.074)^{*} | ||||

ρ_{98} | 24.03 | −0.388 | |||

(1.350)^{***} | (0.143)^{***} | ||||

ρ_{99} | −7.6973 | 0.291 | |||

(4.910) | (0.203) | ||||

ρ_{00} | 4.417 | −0.146 | |||

(0.822)^{***} | (0.064)^{**} | ||||

ρ_{01} | 7.16 | −0.092 | |||

(0.636)^{***} | (0.050)^{*} | ||||

ρ_{02} | 12.874 | −0.229 | |||

(0.671)^{***} | (0.058)^{***} | ||||

Observations $$(N)$$ | 36,030 | 36,030 | 36,030 | 11,959 | 16,188 |

Time fixed effects | Yes | Yes | Yes | ||

Other controls | Yes | Yes | Yes | Yes | Yes |

Log likelihood | −26077 | −22982 | −22958 | ||

$$R^{2}$$ (%) | 12.7 | 0.83 | |||

Joint test of $${\rho }_{t} = 0$$ | |||||

$$F(7, N)$$ | 35.80 | 12.18 | |||

Pr > $$F$$ | 0.000 | 0.000 |

This table reports estimates of selection model regressions with the fixed effects modification developed by Wooldridge (1995). The regressions are of the form

$$\Delta y_{i,t+1} = \beta_{1}\,\Delta Experience_{i,t} + \beta_{2}\,\Delta Experience_{i,t}^{2} + \gamma' \Delta X_{i,t} + \rho_{96}\lambda_{96} I(t = 1996) + \cdots + \rho_{02}\lambda_{02} I(t = 2002) + \Delta\varepsilon_{i,t},$$

where λ_{96}, …, λ_{02} are the inverse Mills ratios from the cross-sectional probit model (the selection model) in each of the years 1996–2002, and $$I({\cdot})$$ is an indicator variable. Including these variables in the learning regression accounts for the impact of the selection equation. (1995 is omitted to avoid perfect collinearity of the model.) A joint test of $${\rho }_{t} = 0$$ for $$t = 1996, \ldots , 2002$$ tests whether survivorship bias is a concern. The first-stage probit model is estimated each year, and the inverse Mills ratios for each year are computed separately from each of these models. For brevity, the table reports estimates of the first-stage probit model with data from all of the years of the sample pooled together. The second-stage regressions are estimated with all the variables in first differences, except the inverse Mills ratios. These first differences add fixed effects to the model. The dependent variable in the second stage is either the individual’s average return in the following year, $${\bar R_{i,t + 1}},$$ or the individual’s disposition coefficient, $$\beta^{d}_{i,t+1}$$. $$X_{i,t}$$ is a vector of controls described in the text. Standard errors are in parentheses, and ^{***}, ^{**}, and ^{*} denote significance at 1%, 5%, and 10%, respectively.

We estimate the first-stage regression for each year and construct inverse Mills ratios for each year. For brevity, we report only one set of pooled first-stage estimates with year fixed effects in Columns 1–3 of Table 4. Results for each of the years are qualitatively similar to those reported. The model in Column 1 shows that the estimate on $${\bar R_{t - 1}}$$ is positive and significant. This is consistent with H_{2}: as low-ability investors trade, they learn about their inherent ability and cease trading when they get negative returns. More successful investors continue to trade actively. The estimate is also economically meaningful and suggests that, keeping other explanatory variables at their mean levels, a decrease in returns of one standard deviation increases the probability that the individual will cease to trade in the next period by around 15%. The other coefficient estimates reported in the first column of Table 4 also seem sensible: investors are more likely to remain in the sample and trade if they hold relatively diversified portfolios and have relatively more trading experience.

Following the model of Mahani and Bernhardt (2007), under H_{2}, it is likely that individuals will cease trading when the variance of the signals they get from the performance of their trades is large—i.e., the signals are noisy. We proxy for the noise in the signals the investors receive from the performance by the variance in the returns across all positions taken by an account in the previous year and include it in the model in Column 2.^{13} We also include measures of *ex ante* investor sophistication proxied by whether they trade options or have significant wealth. The idea is that such investors are more likely to continue trading. The coefficient estimates reported on these additional variables in the second column also seem sensible: investors are more likely to remain in the sample if they receive precise signals about their performance and if they are relatively sophisticated.

Finally, in the third column, we also add our instruments to the first-stage regression. As is reported, the coefficient estimates for both of our instruments are statistically significant at the 5% level or better, and they have the predicted sign. Specifically, both inheriting shares and having higher trading activity in the zip code of an investor increase the probability that the investor will continue trading in the next period. A joint χ^{2}_{2} test for the significance of the instruments rejects the null hypothesis that the instruments are weak at the 1% level (χ^{2}_{2} = 49.10). In addition, we test the validity of exclusion restrictions related to the instruments using the likelihood ratio (LR) test proposed by Wooldridge (2003, Chapter 17).^{14} We find that the null hypothesis that exclusion restrictions related to the two instruments are violated is rejected at the 1% level, suggesting again that our instruments are doing a good job in explaining the selection equation. Both the instruments also have an economically significant impact. For instance, keeping other variables at their mean levels, inheriting shares increases the probability that the investor will continue trading in the next period by around 8%. Similarly, a one-standard-deviation increase in investor activity (an increase of 0.20) in the zip code of the individual in the previous year increases the probability that the investor will continue trading in the next period by about 5%.

We find that accounting for selection has a significant impact on our learning estimates. Column 4 uses performance as the dependent variable in a regression of the form of Equation (5). Comparing the estimates in Table 4 with the simple model reported in Table 3, the coefficient on YearsTraded is no longer statistically significant, and the coefficient on cumulative trades is reduced by about 90%.^{15} When we use disposition as the dependent variable in Column 5, the coefficient is reduced by slightly more than 50%.

The coefficients on the inverse Mills ratios in Columns 4 and 5 are also sensible. In particular, they suggest that factors that predict which investors stay in the sample are positively correlated with future performance and negatively related with disposition. This suggests that survivorship is indeed important. The endogenous attrition of low-ability investors significantly boosts the experience coefficient estimate in the simple performance regression, and it diminishes the estimate in the simple disposition regression. The joint tests of statistical significance of the inverse Mills ratios also show that accounting for sample selection is important. In the disposition regression, the joint F-test of significance of the inverse Mills ratios yields an F-statistic of 12.18, much greater than the 1% critical value of 2.64. Similarly, in the performance regression, the F-statistic of 35.80 is much greater than the 1% critical value. Unreported results in which YearsTraded and CumulTrades are included in separate regression models yield almost the same coefficients, but regressions that include both variables are reported for brevity.

We also explore the properties of the coefficients on the inverse Mills ratios from the disposition and performance regressions in unreported tests. We find a strong negative relation between the coefficients on the inverse Mills ratios from the performance and disposition models. This is quite reasonable, since endogenous attrition of low-ability investors should significantly increase the estimates of experience in the performance regression and should reduce them in the disposition regression. Moreover, regressing the coefficients of the inverse Mills ratios on a dummy for whether the excess market return (over the risk-free rate) is positive, we find that there is less attrition in high-return periods than in low-return periods. This is indicated by the coefficient on the return inverse Mills ratios being higher (by about 12.3; $$p = 0.07$$) and the coefficient on the disposition inverse Mills ratios being lower (a difference of about −0.23; $$p = 0.11$$) during high-return periods. These results suggest that less learning-about-ability occurs in “good” times.

Our results suggest that accounting for selection is important and significantly affects inferences about learning. Investor heterogeneity and survivorship effects account for something on the order of one-half to three-quarters of the learning estimates found in simple and aggregate models. This translates directly into slower learning than that inferred from simpler models. Taking the disposition learning coefficient, for example, 100 trades correspond to an improvement of about 0.04 in the simplest model, an improvement of about 0.03 in the model with individual fixed effects, and an improvement of about 0.02 in the survivorship/fixed effects model. Roughly speaking, if it takes about 100 trades to improve about 4% in the simple model, it takes about 200 trades to achieve the same improvement after adjusting for survivorship and individual heterogeneity.
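The arithmetic behind this comparison can be sketched as follows. The calculation assumes a purely linear extrapolation of the per-100-trades coefficients quoted above (it ignores any quadratic experience terms), and the function name is hypothetical.

```python
def trades_for_improvement(target: float, improvement_per_100_trades: float) -> float:
    """Number of trades needed to reach `target`, under linear extrapolation."""
    return 100.0 * target / improvement_per_100_trades


# Improvement in the disposition coefficient per 100 trades, as quoted in the text.
models = {
    "simple model": 0.04,
    "individual fixed effects": 0.03,
    "survivorship + fixed effects": 0.02,
}

target = 0.04  # the improvement that takes about 100 trades in the simple model
for name, coef in models.items():
    print(f"{name:30s} {trades_for_improvement(target, coef):6.0f} trades")
```

Under this assumption, halving the coefficient doubles the number of trades required for the same improvement, which is the sense in which learning is slower after the adjustment.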

Our estimates suggest that the fraction of learning that is driven by investors learning about their inherent ability is large. After adjusting for this type of learning, the portion of learning that is due to investors learning to improve their ability over time is statistically different from zero, but not excessively large. Overall, our findings are consistent with the two hypotheses outlined in Section 2, and the evidence supporting H_{2} is much stronger than the evidence supporting H_{1}.

### Additional evidence of learning

#### Learning about ability: trading intensity

An important implication of H_{2} is that investors increase their trading as they become more experienced and their beliefs about their ability become more precise, as in Mahani and Bernhardt (2007). We therefore examine whether investors who decide to continue trading increase their trading intensity. First, in Figure 6, we plot the average number of trades (dark blue) and the value of shares traded in euros (light blue) for each year of experience. We demean the results by calendar year to adjust for year fixed effects, which ensures that market-wide changes in investor behavior do not contaminate our results. The plot clearly indicates that surviving investors place more trades as they gain more experience, both in quantity and in value of shares traded. This is confirmed using regressions, which are reported in Table 6. The regressions use the average number of trades and the value of trades as the dependent variables, and they include year and individual fixed effects. In Columns 3 and 4, we also control for the endogenous attrition decision of investors. As is clear from the table, we find that investors trade more intensively as they become more experienced. This analysis does not include those investors for whom we are not able to estimate the disposition effect. The results are qualitatively unchanged if we expand our sample to the entire population of investors.^{16}
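The calendar-year demeaning used for the plot can be illustrated with a minimal sketch. The record layout, function names, and numbers below are hypothetical and serve only to show the mechanics.

```python
from collections import defaultdict


def demean_by_year(rows):
    """Subtract each calendar year's mean from the trade counts in that year.

    rows: iterable of (calendar_year, years_of_experience, num_trades).
    Returns a list of (years_of_experience, demeaned_num_trades).
    """
    rows = list(rows)
    totals, counts = defaultdict(float), defaultdict(int)
    for year, _, trades in rows:
        totals[year] += trades
        counts[year] += 1
    year_mean = {year: totals[year] / counts[year] for year in totals}
    return [(exp, trades - year_mean[year]) for year, exp, trades in rows]


def mean_by_experience(demeaned):
    """Average the demeaned values within each experience level."""
    totals, counts = defaultdict(float), defaultdict(int)
    for exp, value in demeaned:
        totals[exp] += value
        counts[exp] += 1
    return {exp: totals[exp] / counts[exp] for exp in sorted(totals)}


# Hypothetical investor-year records: (calendar year, experience, trades).
records = [(2000, 1, 10), (2000, 2, 20), (2001, 1, 30), (2001, 2, 50)]
profile = mean_by_experience(demean_by_year(records))
print(profile)
```

In this toy example, 2001 is a high-volume year for everyone, but after demeaning only the within-year difference between experience levels remains.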

#### Learning by doing: risk and the speed of learning

As outlined in the hypothesis section, H_{1} suggests that the productivity of agents increases with experience, for instance, because they learn to use better combinations of the information sources that help them choose stocks. To examine this, we estimate our learning regressions with risk-adjusted returns (30-day alphas) instead of raw returns, and we regress the average factor betas of stocks purchased by investors on experience and our control variables. Our regressions also control for survivorship and individual and year fixed effects. The results appear in Table 7. The table shows that risk-adjusted returns improve with experience, with a coefficient that is actually larger than the coefficient we estimate for raw returns. Looking at the coefficients on average factor betas makes it clear why this is the case. With more experience, investors are both improving raw returns and taking less risk, that is, purchasing stocks with lower factor betas. This result is particularly strong for the market (RMRF) and size (SMB) factor betas. These tests provide additional support for H_{1}.

As a final confirmation that agents improve their ability with experience, we examine learning for a group of less-sophisticated (low-ability) investors. We conjecture that these *ex ante* low-ability investors will learn at a faster rate than their higher-ability peers. The model of Mahani and Bernhardt (2007) assumes that investors do not know their ability before they start trading. As a result, it has no clear prediction on differential speeds of learning of *ex ante* sophisticated and unsophisticated investors. However, if we assume that investors who are *ex ante* sophisticated have some notion of their ability, our first hypothesis suggests that since these investors have little to improve, they will either learn slowly or not learn at all. To test this, we need to sort investors by their level of sophistication. We rely on *ex ante* observable characteristics of investors that we believe are related to their financial sophistication. In particular, we consider investors *ex ante* likely to be relatively sophisticated if they trade options or have significant wealth.

Each row of Table 5 displays the mean of the disposition coefficient and the average returns (or performance) of each group, the simple regression coefficient of these variables on cumulative trades, estimates corrected for survivorship and unobserved heterogeneity, and the number of observations used in the calculations. Results for disposition are shown in the left panel and for returns in the right panel. Looking at the table, it is clear that the means of our sophistication subgroups differ significantly, and each difference across subgroups has the sign we expect. In each pair of rows there is a clear difference between the unsophisticated investors, who learn to avoid the disposition effect at a rate of about 10% per year, and sophisticated investors, for whom the learning coefficient is often insignificant. Moreover, the survivorship-adjusted estimates show that even after correcting for attrition and investor fixed effects, these investors do improve their ability to trade over time; that is, the evidence supports H_{1}. Similar results are obtained if we use YearsTraded as the experience variable instead.

| Dependent variable | $$\beta^{d}_{i,t+1}$$ | | | | $$\bar R_{i,t+1}$$ | | | |
|---|---|---|---|---|---|---|---|---|
| Classification | Mean | Mod. 1 | Mod. 2 | Obs | Mean | Mod. 1 | Mod. 2 | Obs |
| No options trades | 1.17 | −0.024 | −0.02 | 14078 | −0.81 | 0.14 | 0.32 | 10389 |
| | | (0.003)^{***} | (0.001)^{***} | | | (0.06)^{***} | (0.13)^{***} | |
| Trades options | 0.99 | −0.013 | −0.01 | 3846 | 0.1 | 0.008 | 0.01 | 3020 |
| | | (0.004)^{***} | (0.004)^{***} | | | (0.04) | (0.01) | |
| Low wealth | 1.14 | −0.023 | −0.018 | 15426 | −1.18 | 0.05 | 0.029 | 11411 |
| | | (0.003)^{***} | (0.001)^{***} | | | (0.01)^{***} | (0.01)^{***} | |
| High wealth | 1.11 | −0.04 | −0.01 | 2498 | 1.08 | 0.01 | 0.02 | 1198 |
| | | (0.05) | (0.012) | | | (0.023) | (0.021) | |


This table reports both means and simple learning coefficient estimates from regressions of the form

The variable of interest is either the disposition coefficient ($$\beta^{d}_{i,t+1}$$) or returns ($$\bar R_{i,t+1}$$). Regressions using the simple learning model are labeled Mod. 1, while regressions using the adjustment for attrition are labeled Mod. 2. Attrition-corrected regressions include the inverse Mills ratios described in Table 3. For brevity, we report only β_{1} coefficients in the table. We classify investors as “Trades options” if they trade options at any point during our sample. Similarly, investors are classified as “High wealth” if they are in the top 25% of average portfolio value. We include year dummies in all the regressions. Data are from the period 1995 to 2003. Standard errors are in parentheses, and ^{***}, ^{**}, and ^{*} denote significance at 1%, 5%, and 10%, respectively. All group means are significantly different at the 1% level.

| | No attrition correction | | With attrition correction | |
|---|---|---|---|---|
| | Num. trades | Val. trades | Num. trades | Val. trades |
| $$YearsTraded_{t}$$ | 10.074 | 135257 | 13.32 | 22713 |
| | (0.618)^{***} | (14091)^{***} | (0.749)^{***} | (5626)^{***} |
| $$YearsTraded^{2}_{t}$$ | −0.462 | −9515 | −0.896 | −1216 |
| | (0.097)^{***} | (2123)^{***} | (0.085)^{***} | (557)^{***} |
| Observations | 13,266 | 13,266 | 13,266 | 13,266 |
| Individual fixed effects | Yes | Yes | Yes | Yes |
| Year fixed effects | Yes | Yes | Yes | Yes |


This table reports regression results for trading intensity and experience. In Columns 1 and 3, the dependent variable is the number of trades placed by an investor in a year. In Columns 2 and 4, the dependent variable is the value of shares traded, measured in Euros. Attrition-corrected regressions include the inverse Mills ratios described in Table 3. For brevity, we report only the coefficients on experience variables in this table. All regressions include individual and year fixed effects. Data are from the period 1995 to 2003. Standard errors are in parentheses, and ^{***}, ^{**}, and ^{*} denote significance at 1%, 5%, and 10%, respectively.

| | Dependent variables | | | | |
|---|---|---|---|---|---|
| Coefficient | 30-day α | β on RMRF | β on SMB | β on HML | β on UMD |
| YearsTraded | 0.58 | −0.034 | −0.016 | 0.0544 | 0.0073 |
| | (0.41) | (0.0128)^{**} | (0.0059)^{**} | (0.0435) | (0.0250) |
| YearsTraded^{2} | 0.02 | 0.00246 | 0.00163 | 0.0037 | −0.00076 |
| | (0.02) | (0.0012)^{*} | (0.0005)^{***} | (0.0028) | (0.0019) |
| CumulTrades (÷ 10^{2}) | 0.05 | −0.01 | −0.002 | −0.002 | −0.016 |
| | (0.02)^{***} | (0.002)^{***} | (0.0009)^{***} | (0.004) | (0.015) |
| CumulTrades^{2} (÷ 10^{4}) | −0.0002 | 0.00005 | 0.00001 | 0.00002 | 0.00008 |
| | (0.0001)^{**} | (0.00001)^{***} | (0.000006)^{**} | (0.00002) | (0.00005) |
| Dep var mean | 0.07 | 0.663 | 0.028 | −0.086 | −0.054 |
| Standard dev | 3.30 | 0.288 | 0.103 | 0.431 | 0.440 |


This table reports the results of fixed effect selection model estimates of regressions of various performance and risk measures on experience measures. The method of these regressions and their associated first-stage estimates are described in Table 4. The dependent variables include each investor’s average 30-day risk-adjusted return (alpha) and each investor’s average beta coefficient on four factors—RMRF, SMB, HML, and UMD. These betas are estimated in the standard way, as described in the text. Each regression includes the control variables and inverse Mills ratios described in Table 4, but only the experience variables coefficients are reported in this table. Standard errors are in parentheses, and ^{***}, ^{**}, and ^{*} denote significance at 1%, 5%, and 10%, respectively.

### Robustness tests

In this section we report the results of a few additional tests that relate to our main predictions. First, in unreported results, we substitute the market return for each stock’s return to see if individuals learn to time the market. If investors are learning to identify good times to buy, then the market as a whole will tend to increase after their purchases; if instead they are learning to select stocks, we will not find evidence of learning when we look only at market returns. In fact, we find that the coefficient estimates on experience variables are insignificant, which suggests that performance improves because investors become better at stock selection. Second, we also conduct our tests using the Wooldridge (1995) method, taking data on investors who resume trading after ceasing to trade for a few years (the tests reported in the last section had dropped such investors, using only observations in two consecutive years). Including these investors increases the sample by around 250 observations in the second stage but does not affect the nature of the results reported. Third, all of the results on disposition remain qualitatively unchanged if we include a “December dummy” in Equation (A2) or remove partial sales from our sample. This rules out tax-motivated selling or rebalancing as possible explanations for the disposition effect. Finally, in all the fixed effect regressions that control for individual heterogeneity, we cluster the standard errors at the individual level and find that our results are unaffected.

We also perform all of our tests with 30-day returns, regardless of when the investor actually sells. We find that our results are unchanged. This implies that our results are not driven by the selling behavior of investors. It is not the case, for example, that correlated liquidity shocks cause investors to sell simultaneously, drive down prices, and experience poor returns. In additional tests, we find that an investor’s sales are not concentrated in time, which is not consistent with their selling being driven by correlated liquidity shocks.

We also confirm that the learning we find is not driven by increasing precision in our disposition estimates. Although we estimate the disposition effect with disjoint data for each investor-year, since we expect surviving investors to increase their trading over time, it is likely that the precision of our disposition estimates will increase with time. It is, however, important to note that the precision of our estimates is also driven by the volatility of the stocks that investors hold, not just the number of positions they open and close. Moreover, we have no reason to believe that increasing precision will cause a reduction in our disposition estimates and not an increase. Nevertheless, to rule out this possibility, we conduct a bootstrap experiment that keeps constant the number of observations we use to estimate each investor’s disposition effect. We assume that an investor places ten trades each year, regardless of the number of trades he or she actually places. We take a random sample of ten trades from the actual trades (with replacement) and estimate the disposition coefficient as before. We then estimate our survivorship and heterogeneity-adjusted model as before with these new disposition coefficients. Our results are robust to this procedure and clearly show that our finding of learning is not driven by increasing precision in our estimates of the disposition effect.
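The resampling step of this experiment can be sketched as below. This is an illustration under stated assumptions: the data layout and function names are hypothetical, and `estimate` is a caller-supplied placeholder for the hazard-model estimation, which is not reproduced here.

```python
import random


def resample_fixed_size(trades, n_trades=10, rng=None):
    """Draw n_trades items from one investor-year's trades, with replacement."""
    if rng is None:
        rng = random.Random()
    return rng.choices(list(trades), k=n_trades)


def bootstrap_disposition(trades_by_investor_year, estimate, n_trades=10, seed=0):
    """Re-estimate each investor-year coefficient on a fixed-size resample.

    trades_by_investor_year: dict mapping (investor, year) -> non-empty list of trades.
    estimate: function mapping a list of trades to a disposition coefficient
              (a stand-in for the hazard model described in the Appendix).
    """
    rng = random.Random(seed)
    return {
        key: estimate(resample_fixed_size(trades, n_trades, rng))
        for key, trades in trades_by_investor_year.items()
    }


# Toy data: every investor-year contributes exactly ten resampled trades,
# no matter how many trades were actually placed.
toy = {
    ("A", 1999): ["gain", "loss", "gain"],
    ("A", 2000): ["gain"] * 40 + ["loss"] * 5,
}


def share_gains(sample):
    """Stand-in statistic, not a hazard model: share of resampled trades that are gains."""
    return sample.count("gain") / len(sample)


print(bootstrap_disposition(toy, share_gains))
```

Holding the sample size fixed across investor-years means that any trend in the resulting coefficients cannot come from estimates simply becoming more precise as investors trade more.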

## Conclusion

We examine learning using the complete trading records of individual investors in Finland during the period 1995–2003. We correlate performance and disposition with investor experience and investor survival rates to determine whether and how investors learn by trading. We find that performance improves and the disposition effect declines as investors become more experienced, suggesting that investors learn by trading. Importantly, cumulative trades is a better measure of trading experience than the number of years that an investor has traded; our evidence that years of experience matters is relatively weak. We differentiate between investors learning about their inherent ability by trading and investors learning to improve their ability over time by accounting for investor attrition. We find that a substantial part of this learning occurs when investors stop trading after learning about their inherent ability rather than continuing to trade and improving their ability over time. That is, the primary way that low-ability investors learn is by learning to stop trading. By not accounting for investor attrition and heterogeneity, the previous literature significantly overestimates how quickly investors become better at trading.

Our results suggest a number of interesting implications. First, since investors who continue trading learn slowly and there is a great deal of turnover in the investor population, it is likely that behavioral biases are an important feature of financial markets. Agents do not learn fast enough to make it impossible for biases to affect asset prices. Second, while it would be nice to know how quickly investors who cease to trade would learn if they chose to continue trading, we have no way to estimate this speed. If we assume that those who continue trading learn more quickly than those who cease to trade, policy makers might enhance welfare by devising screening mechanisms or tests that measure and reveal inherent investing ability. Allowing unskilled investors to learn of their poor ability without incurring significant costs might be more valuable than encouraging people to become active investors. Third, an open question in the literature is why there is such high trading volume, particularly among seemingly uninformed individual investors. Our results indicate that such trading may be rational; investors may be aware that they will learn from experience and choose to trade in order to learn. Our results also suggest that differences in the expected performance of investors may arise from different experience levels. Finally, if many inexperienced investors begin trading around the same time, their trades could lead to time-varying market efficiency. Our evidence is therefore consistent with the recent results of Greenwood and Nagel (2009) and the more general discussion found in Chancellor (2000) and Shiller (2005).

## Appendix

This appendix provides details of our approach to calculating individual returns and the disposition effect. It also describes the results of an alternative estimation procedure for the disposition effect that follows Feng and Seasholes (2005).

### Measuring performance

As mentioned in the main text, measuring the performance of individual investors is a significant challenge. Our data do not include all nonequity securities that may be held by an investor, so it is impossible to measure the return for the investor’s entire portfolio. This is made more difficult by the fact that the amount of money an individual has invested in equities often fluctuates significantly over time. Since we cannot accurately measure portfolio returns, we measure performance by examining the average return of stocks purchased. However, this generates a new problem—comparing the returns on holding periods of different lengths. For example, it is particularly difficult to compare the performance of one investor who holds a stock for one week and earns a holding period return of 3% to that of another investor who holds a stock for one year and earns a holding period return of 15%.

We therefore calculate the returns earned by the purchased stock in the 30 trading days following each investor’s purchases. Importantly, we truncate this calculation window at the length of the actual holding if it is shorter than 30 days. That is, the 30-day return for investor $$i$$ holding stock $$j$$ is

Our approach is an attempt to deal with the problem of comparing returns over similar holding periods while ensuring that the actual selling decisions of investors affect their performance. By measuring returns in this way, we hope to capture the value of short-term signals that the investor may have received. As a robustness check, we perform all of our tests with 30-day returns, regardless of when the investor actually sells, and find that our results are unchanged. Looking over longer horizons would introduce considerable noise into our return estimates.
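A minimal sketch of the truncated 30-day return described above follows. It assumes a list of daily closing prices indexed by trading day, uses the closing price on the purchase date as the base, and ignores dividends; these choices and all names are assumptions made for illustration.

```python
def truncated_return(closing_prices, purchase_idx, sale_idx=None, window=30):
    """Return over min(window, actual holding period) trading days after purchase.

    closing_prices: list of daily closing prices, one per trading day.
    purchase_idx:   index of the purchase date in closing_prices.
    sale_idx:       index of the sale date, or None if the position is still open.
    """
    end = purchase_idx + window
    if sale_idx is not None:
        end = min(end, sale_idx)                 # truncate at the actual sale
    end = min(end, len(closing_prices) - 1)      # do not run past the available data
    return closing_prices[end] / closing_prices[purchase_idx] - 1.0


# Hypothetical price path that rises by one unit per trading day from 100.
prices = [100.0 + day for day in range(61)]

print(truncated_return(prices, 0))               # held longer than 30 days or still open
print(truncated_return(prices, 0, sale_idx=10))  # sold after 10 days: window is truncated
print(truncated_return(prices, 0, sale_idx=45))  # sold after 45 days: capped at 30 days
```

The robustness check mentioned above corresponds to calling the function with `sale_idx=None` for every purchase, so that the window is always 30 days.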

### Measuring disposition

Previous researchers have measured the disposition effect in a number of ways. Odean (1998) compares the proportion of losses realized to the proportion of gains realized by a large sample of investors at a discount brokerage firm. Grinblatt and Keloharju (2001b) model the decision to sell or hold each stock in an investor’s portfolio by estimating a logit model that includes one observation for each position on each day that an account sells any security. Days in which an account does not trade are dropped from their analysis.

As Feng and Seasholes (2005) point out, a potential problem with these and similar approaches is that they may give incorrect inferences in cases in which capital gains or losses vary over time. Hazard models, which have been extensively applied in a number of fields including labor economics and epidemiology, are ideally suited to our setting. Since our focus is on estimating disposition at an individual level, we estimate the hazard regression for each investor and year. Implementation of the hazard model uses all data about the investor’s trading and the stock price path, rather than just data on days when a purchase or sale is made. That is, it implicitly considers the hold-or-sell decision each day. This improves our disposition estimates and gives us more power with which to investigate learning. More important, individual level estimates enable us to test H_{1} and H_{2} by explicitly accounting for investor attrition and heterogeneity.

To measure the disposition effect for each investor-year, we use a Cox proportional hazard model with time-varying covariates to model the probability that an investor will sell shares that he or she currently holds. We count every purchase of a stock as the beginning of a new position, and a position ends on the date the investor first sells part or all of his or her holdings. Alternative definitions of a holding period, such as first purchase to last sale, or requiring a complete liquidation of a position, do not substantively alter our results. Our time-varying covariates include daily observations of some market-wide variables and daily observations of whether each position corresponds to a capital gain or loss.

In this model, the hazard rate for position $$j$$ of investor $$i$$ at time $$t$$ is investor $$i$$’s probability of selling position $$j$$ at time $$t$$ conditional on not selling the position until time $$t$$, and $${\phi }_{i}(t)$$ is the investor’s baseline hazard. Since we estimate the hazard model for each investor-year, the baseline hazard rate describes the typical holding period of just one investor in one particular year. The Cox proportional hazard model does not impose any structure on the baseline hazard, and Cox's (1972) partial likelihood approach allows us to estimate the β coefficients without estimating $${\phi }_{i}(t)$$.

The time-varying covariates, $$x_{i,j}(t)$$, are allowed to change each day. These include $$I(R_{i,j}(t) > 0)$$, an indicator of whether the total return on position $$j$$ from the time of purchase up until time $$t$$ is positive. Investors who suffer from the disposition effect are more likely to sell when this condition is true, so they will have positive values of $$\beta^d_i$$. We therefore refer to $$\beta^d_i$$ as investor $$i$$’s “disposition coefficient.” The total return variable includes any dividends or other distributions and is calculated using closing prices on all days, including the date of purchase. We use the closing price on the purchase date instead of the actual purchase price to ensure that our results are not contaminated by microstructure effects. Also, our data do not include transaction prices during the first three months of 1995, but we do have closing prices during this period. (Using ex-dividend returns leaves our results unchanged.) In addition, we include as controls three five-day moving averages of market-level variables to ensure that we are not capturing selling related to market-wide movements: market returns ($$\bar R_{M,t}$$), squared market returns ($$\sigma_{M,t}$$), and market volume ($$V_{M,t}$$). We repeat this estimation via maximum likelihood each year from 1995 to 2003 for each individual $$i$$ who places at least seven round-trip trades in a year.
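For reference, the specification described above can be written compactly as a Cox model. The following is a sketch only: the symbol $$h_{i,j}(t)$$ for the hazard and the coefficient names $$\gamma_{1,i}$$, $$\gamma_{2,i}$$, and $$\gamma_{3,i}$$ on the market controls are notation assumed here for illustration, while $$\beta^d_i$$, $$\phi_i(t)$$, and the covariates are as defined in the text.

```latex
h_{i,j}(t) = \phi_i(t)\,
  \exp\!\Big( \beta^d_i \, I\big(R_{i,j}(t) > 0\big)
  + \gamma_{1,i}\, \bar R_{M,t}
  + \gamma_{2,i}\, \sigma_{M,t}
  + \gamma_{3,i}\, V_{M,t} \Big)
```

Written this way, $$\exp(\beta^d_i)$$ is the factor by which the hazard of selling is scaled when a position is at a gain rather than a loss, holding the market controls fixed.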

### Estimates using the Feng and Seasholes method

We estimate the yearly hazard models presented in Table 8, which are comparable with the model of Feng and Seasholes (2005). We use a proportional hazard regression rather than the parametric Weibull model used by Feng and Seasholes. Our indicator variable is the same as their “Trading Gain Indicator” (TGI), and the experience variable we use for this model is the total number of trades placed before the current trade, rather than CumulTrades, which is the total number of trades placed in previous calendar years. The estimates are from pooled hazard models estimated each year, in which all individuals are grouped together. Experience is interacted with an indicator for whether the price is above the purchase price, and the coefficient on this interaction term is interpreted as a learning coefficient. These models again give evidence of learning. However, the learning coefficient estimates are quite variable over time (from −0.149 to −0.063 in the years in which they are significant), and they are statistically insignificant in three of the nine years. Furthermore, the average disposition coefficient of an investor is estimated in aggregate to be around 0.65 in this model. This suggests that, depending on the period chosen, an additional year of experience corresponds to a reduction in the disposition coefficient of approximately 10% to 25%, significantly higher than the estimate from our heterogeneity- and survivorship-adjusted learning model. Table 8 also lists the number of observations available each year and the fraction of the observations that are censored, that is, the purchased stocks that are not sold by the end of each year.
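The percentage range quoted above follows from dividing the magnitude of the learning coefficient by the average disposition coefficient of about 0.65. A quick check of that arithmetic, using the two coefficient magnitudes cited in the text (the function name is hypothetical):

```python
def relative_reduction(learning_coef: float, avg_disposition: float) -> float:
    """Learning coefficient magnitude as a share of the average disposition coefficient."""
    return abs(learning_coef) / avg_disposition


avg_disposition = 0.65  # aggregate average disposition coefficient quoted in the text
for coef in (-0.063, -0.149):
    share = relative_reduction(coef, avg_disposition)
    print(f"learning coefficient {coef:+.3f} -> reduction of about {share:.1%}")
```

The two ratios come out just under 10% and just under 23%, in line with the approximate 10% to 25% range.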

| Year | 1995 | 1996 | 1997 | 1998 | 1999 | 2000 | 2001 | 2002 | 2003 |
|---|---|---|---|---|---|---|---|---|---|
| β_{4} | −0.074 | −0.093 | −0.096 | −0.149 | −0.031 | −0.013 | −0.063 | −0.069 | −0.123 |
| | (0.039)^{*} | (0.062) | (0.031)^{***} | (0.022)^{***} | (0.024) | (0.011) | (0.011)^{***} | (0.012)^{***} | (0.012)^{***} |
| Trades | 6.6 | 14.5 | 26.6 | 44.6 | 99.6 | 251.9 | 161.9 | 115.6 | 131.8 |
| Censored | 16% | 19% | 19% | 19% | 17% | 16% | 17% | 18% | 20% |
| Accounts | 384 | 713 | 1360 | 2412 | 4953 | 11585 | 7445 | 5416 | 6056 |


This table presents learning estimates from pooled proportional hazards models, using a method similar to that of Feng and Seasholes (2005). Each year we pool the trades of all investors, treating them as if they were just one individual and estimating one learning coefficient for the entire population. The model is

where Exper is measured in years since first placing a trade and $$I(R_{i,j}(t) > 0)$$ is an indicator variable that takes a value of one when a stock has increased in price since its date of purchase. For brevity, only the β_{4} coefficient estimates are reported. Estimated β_{5} coefficients are all insignificant. We also report the number of trades or observations considered by the model (in thousands of trades), the percentage of observations censored (trades not closed by the end of the year), and the number of accounts contributing observations to the model. Standard errors are in parentheses, and ^{***}, ^{**}, and ^{*} denote significance at 1%, 5%, and 10%, respectively.

The low level of the average disposition effect, the high variability in the annual learning estimates, and the high level of learning found both by Feng and Seasholes (2005) and in our implementation of their model suggest that there are significant differences between our approach and theirs. One important difference is that we estimate a different disposition coefficient for each individual. This allows us to control for each individual’s baseline hazard function (or their average holding period). Another difference is that we estimate learning over a much longer period of time, since Feng and Seasholes (2005) only have about two years of transaction data. Thus, the year-to-year variation in the annual estimates in Table 8 is much less of a concern for our analysis.

## References

H_{1} or H_{2}). Learning from paper trading may be difficult for a number of reasons, including that it requires significant discipline to keep track of hypothetical trades and that transaction prices for hypothetical trades are not observable.

H_{2}, this alternative makes no predictions on how trading intensity of an investor should change over time. Under H_{2}, low-ability traders should stop trading, while high-ability traders should scale up their trading intensity. The results in this section also provide support for H_{2} rather than this alternative.