Can Survey Participation Alter Household Saving Behaviour?

We document an effect of survey participation on household saving. Identification comes from random assignment to modules within a population‐representative Internet panel. The saving measure is based on linked administrative wealth data. Households that responded to a detailed questionnaire on needs in retirement reduced their non‐housing saving rate by 3.5 percentage points, on a base of 1.5%. The survey may have acted as a salience shock, possibly with respect to reduced housing costs in retirement. Our findings present an important challenge to survey designers. They also add to the evidence of limited attention in household financial decision making.

Much empirical research in economics analyses data from surveys of individuals and households. The development of panel surveys has allowed researchers to assess and account for heterogeneity and dynamics in economic behaviour. However, repeated data collection from the same individuals or households brings a risk of 'survey effects': the possibility that questioning individuals about their actions or attitudes in a particular domain can alter their later behaviour. Finding significant survey effects in important areas of economic research would require a rethinking of data collection strategies. More positively, finding such effects might also provide insight into the cognitive processes underlying broader economic behaviour.
In this article, we test for survey effects in a central domain of economic research: household saving behaviour. In particular, we test whether being asked questions about retirement income needs leads to changes in household saving behaviour. Recent work in behavioural finance suggests possible mechanisms for survey effects on saving behaviour. Limited attention means that individuals tend to overlook some of the consequences of their decisions (DellaVigna, 2009). If those unnoticed consequences materialise in the future, as do the benefits of saving today, this results in biases that are similar to those induced by limited self-control (Karlan et al., 2012). However, in contrast to self-control problems, limited attention suggests that behaviour might be corrected by focusing individuals' attention on the aspects they are missing. For example, recent literature suggests that sending out mailings can induce desired behaviour (such as tax compliance or personal care) cost effectively (Fellner et al., 2013; Altmann and Traxler, 2014; Hallsworth et al., 2014). To the extent that surveys on retirement planning can direct participants' attention, they may have behavioural effects, though the direction of such effects is not clear a priori.
Our research design has a number of critical features. First, we study survey effects in a large representative survey of a developed country population (the Netherlands). Second, we study effects on a key life-cycle choice: the level of savings. Third, the randomised allocation of members of an Internet panel to a survey module on retirement income needs provides for clean identification of the causal effect of participating in that survey module. Fourth, we measure household saving with linked administrative data, not with responses in subsequent panel surveys. This allows us to rule out the possibility that any observed effect is on reporting, rather than on the underlying saving behaviour. No prior study combines all these attributes.
In an environment in which public and occupational pension schemes implied high income replacement rates in retirement (the Netherlands in 2008), we find robust evidence that exposure to the retirement needs module subsequently led to lower average household saving. This mean effect is driven by older and more educated households. These households have the highest expected pension wealth and higher housing wealth. They are also much more likely, in the survey, to report that their housing costs will fall in retirement (perhaps because they anticipate paying off their mortgage debt). Our interpretation is that the survey module directed attention to aspects of retirement preparedness and needs that were not otherwise salient and, for wealthy households, this apparently implied that they should save less.
The survey methods literature has long been concerned with 'panel conditioning': the way in which experience in a panel survey affects subjects' responses. Several studies have examined panel conditioning in domains such as subjective well-being (Van Landeghem, 2014), marital satisfaction (Glenn, 1998), and preferences (Binswanger et al., 2013). Much of this literature compares experienced panel members with refreshment samples or other novice respondents. As pointed out by Das et al. (2011), disentangling panel conditioning from unobserved factors influencing attrition is a complex task. Das et al. (2011) conclude that, after controlling for unobserved attrition factors, there are significant panel conditioning effects in knowledge questions but not in other types of questions. Our design, based on random invitations to a survey module, avoids concerns about attrition as well as age and time effects. In a recent review paper, Warren and Halpern-Manners (2012) note that the survey research literature on panel conditioning generally failed to employ randomised designs. They also point out that to date, there is little systematic evidence on panel conditioning in large-scale longitudinal social science surveys. Our article addresses this concern as well.
The psychology and marketing literatures have documented a number of related behavioural phenomena. The 'question-behaviour effect' refers to the observation that behaviour and retirement preparation in particular. Measurement is not innocuous. Even a random sample of a population, which has not been subject to non-random attrition, can fail to be representative of the population under study if the act of measurement alters the behaviour of sample members. More positively, this study demonstrates the value of randomised survey content in allowing for the exploration of such effects.
Our results also significantly extend the evidence base for the importance of limited attention in household financial decision making (Karlan et al., 2012; Stango and Zinman, 2014). We show that survey questions can alter choices with respect to a key life-cycle variable, the overall level of saving. Our findings also demonstrate that salience shocks can shift behaviour in different directions, depending on the context. While Karlan et al. (2012) find that deliberate and targeted reminders raise contributions to goal-specific savings accounts in Bolivia, Peru and the Philippines, we find that exposure to a module of questions about retirement income needs lowered overall saving in the Netherlands. We also find quite different patterns of effect heterogeneity. While Stango and Zinman (2014) find larger effects of being surveyed among lower-educated subjects, we find the largest effects among the high-education group. We discuss this further below.
The rest of the article is organised as follows. Section 1 describes the institutional context of our study. Section 2 describes the research design, data and methods. In Section 3, we present the main results as well as a number of robustness checks and falsification tests. Section 4 concludes.

Institutional Context
Active saving reflects the economic circumstances that households face, so some background is required to interpret the results that follow. The Dutch pension system in 2008 was characterised by arrangements that were almost universal and provided extremely generous income replacement.
The Dutch system of income provision during retirement consists of four categories or 'pillars'. The first pillar is that of the public pension, which provides anyone who lived in the Netherlands between the ages of 15 and 65 with a subsistence income. Coverage of the public pension is close to universal, since uninterrupted residence in the country is the only criterion (benefits are cut by 2% for each year spent abroad). The level of the public pension is set with reference to the minimum wage. Since public pensions only provide a minimum income, almost all employees accumulate additional entitlements in occupational pensions (the second pillar). Such arrangements cover 90% of all employees and are usually organised at the level of the sector or of the company (Bovenberg and Meijdam, 2001). Participation in the first two pillars is mandatory and together they replace, on average, 70% of the gross last earned income before retirement. This translates into replacement rates net of taxes above 80% (Kapteyn and De Vos, 1999; Bovenberg and Meijdam, 2001). The third pillar contains all private savings vehicles that are aimed specifically at retirement, such as life annuities, and these accounted for 7% of retirement income on average in the relevant period. Such voluntary arrangements are especially important for individuals who cannot rely on occupational pensions, such as the self-employed. Finally, the remainder of retirement income comes from the fourth pillar, which contains all other forms of wealth that can be drawn down after retirement, such as savings accounts, investments, and real estate.
There were no major changes to the Dutch pension system between 2002 and 2013, except for the move away from tax exempt early retirement schemes towards actuarially fair adjustment of benefits which took place between 2000 and 2006; see De Vos (1999, 2007) and Kapteyn et al. (2010) for further descriptions of the Dutch pension system during the period covered by our study.
De Bresser and Knoef (2015) use responses to the retirement needs module in the LISS panel (our treatment) combined with the administrative wealth data we used in this study and other data to analyse whether the Dutch pension system succeeded in providing an adequate retirement income to its contributors around the time of our study. They compare projected annuities from pensions and non-pension savings with the self-reported minimal and desired expenditure levels from the retirement needs module. Respondents report rather high minimum expenditure requirements, on average 50% higher than the highest official poverty line of Statistics Netherlands. Nevertheless, a majority of close to 70% can still expect to exceed their own minimum expenditure level using their funds in the first two pillars of the system. The extent of over-saving is substantial: the median difference relative to what would be required to maintain the self-reported minimum expenditure floor is 25% taking only the first two pillars of the retirement saving system into account; and this rises to 36% if the authors take non-pension, non-housing savings into account. Moreover, the median difference between the annuity from pensions and non-housing wealth and the self-reported desired expenditure level is 18%. Thus it seems that at the time the retirement module was fielded in LISS (in 2008), a large fraction of the population could significantly reduce their savings and still meet their post-retirement expenditure goals.
Another important aspect of the institutional context of our study is that beginning in 2008, individuals could find detailed information on their personal pension entitlements in their uniform pension overview (UPO). These UPOs provide all members of pension funds, in both the second and third pillars, with yearly updates on their current entitlements and projected future entitlements at age 65. UPOs have been mandatory for all financial institutions in the Netherlands since 1 January 2008.

Data and Research Design
Our analysis is based on the Dutch LISS Panel, a population-representative Internet panel survey of households that has been conducted in the Netherlands since 2007. Members of the LISS Panel complete online surveys on a regular basis. These surveys collect data on a range of core demographic, financial, health and social topics. The LISS has two features that are crucial to our study.
First, eligible respondents are selected randomly to be asked to complete additional, non-core survey modules. These modules are typically designed and submitted by researchers and fielded only once. Researchers can specify a subset of the LISS panel as eligible for their module. Researchers' required sample size often does not exhaust the full subset of those eligible, so costs are reduced by randomly choosing eligible panel members to be asked to complete the module. The treatment we study is a module of questions on expected needs in retirement and preferences for current versus retirement consumption (henceforth, the 'retirement needs module'). This module was fielded in January 2008 and it was the first randomly assigned module in the LISS Panel.
Second, we can measure household saving through linkage to records in the Dutch national tax record system between 2007 and 2009. This system records detailed information on assets and debt across different asset classes at the beginning of the calendar year. We are thus able to construct a very accurate measure of saving for 2008 and 2007: the years immediately after and before treatment. Critically, an independent measure of saving is necessary to be sure that observed effects represent genuine behavioural change and not changes in survey reporting style.
Details of these data sources and our research design are as follows.

The LISS Panel
The LISS Panel is a representative random sample from the Dutch population that was initiated during the autumn of 2007. The LISS Panel is administered by CentERdata, a survey research institute affiliated with Tilburg University, and follows close to 8,000 individuals from 5,000 households. Surveys are administered over the Internet.
Though the Netherlands has a high rate of Internet access (more than 80% of Dutch households are connected), CentERdata safeguards representativeness by providing sample households with an Internet subscription and a simple computer when necessary. Scherpenzeel (2011) provides details on the design and sample.

Treatment
We define treatment as participation in the first randomly assigned module in the LISS Panel: the retirement needs module. This module, formally titled 'What is an adequate old age income?', was designed to study preferences and attitudes relating to living standards in retirement. The module was first fielded in an older Internet panel (the CentERpanel) and the resulting data are analysed in Binswanger and Schunk (2012). The module was then fielded in the LISS panel in January 2008. Binswanger et al. (2013) then compare responses to the module across the two panels to study differences in responses between experienced (CentERpanel) and novice (LISS) respondents. They find greater non-response among novice respondents but little difference between experienced and novice panel members conditional on response. Note that the survey effects they study are very different from the effects studied in this article. They compare novice and experienced panel members, all of whom receive the retirement needs module, to estimate the effect of past experience with an Internet panel on survey responses (particularly responses to this module). We compare novice panel members who received the retirement needs module to novice panel members who did not, in order to estimate the effect of receiving this module on actual savings behaviour (measured independently of the survey).
It is important to emphasise that all the subjects we study were novice panel members, and none had previously been exposed to randomised survey content. Eligible panel members who were not offered the retirement needs module were not offered another module. Moreover, random assignment to the retirement needs module was independent of assignment to all subsequent modules. Thus, the effect we study is that of receiving the module versus not receiving it, and not a comparison between this module and some other module or sequence of modules.
Eligibility for the survey was limited to LISS members who were 25 years or older, who had a net household income of at least 800 euro per month and who were either the head of the household or his/her partner (children or other household members were excluded from participation). This led to a total eligible sample of 5,435 individuals, 2,755 of whom were selected at random and were offered the retirement needs module. Multiple members of a household can be members of the LISS panel, and so households may contain zero, one or two eligible individuals. The 5,435 eligible individuals were members of 3,125 households.
The basic unit of our analysis is the household since, as elaborated below, we measure both wealth and income at the household level. We classify a household as eligible if it contained at least one eligible individual, and we define a household as 'offered' if at least one household member that was eligible for the retirement needs module received the request to participate. Similarly, we define a household as treated if at least one household member was offered the module and actually completed the module. The completion rate among those who received the offer was 74% at the individual level and 79% at the household level.
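The household-level definitions above can be made concrete with a short sketch. The Python snippet below is illustrative only: the record layout and the field names (`household_id`, `eligible`, `offered`, `completed`) are assumptions made for exposition and are not the LISS variable names, and the example records are made up.

```python
from collections import defaultdict


def household_status(individuals):
    """Aggregate individual records to household-level indicators.

    Each record is a dict with hypothetical keys: household_id, eligible,
    offered, completed (all booleans except household_id).
    """
    households = defaultdict(
        lambda: {"n_eligible": 0, "offered": False, "treated": False}
    )
    for person in individuals:
        if not person["eligible"]:
            continue  # only eligible members determine household status
        hh = households[person["household_id"]]
        hh["n_eligible"] += 1
        # Offered: at least one eligible member received the request.
        hh["offered"] = hh["offered"] or person["offered"]
        # Treated: at least one eligible member was offered AND completed.
        hh["treated"] = hh["treated"] or (person["offered"] and person["completed"])
    for hh in households.values():
        # Dummy for the presence of a second eligible member.
        hh["multiple_eligible"] = hh["n_eligible"] >= 2
    return dict(households)


# Made-up example: three households; none of these values come from the LISS data.
people = [
    {"household_id": 1, "eligible": True, "offered": True, "completed": False},
    {"household_id": 1, "eligible": True, "offered": True, "completed": True},
    {"household_id": 2, "eligible": True, "offered": False, "completed": False},
    {"household_id": 3, "eligible": True, "offered": True, "completed": False},
    {"household_id": 3, "eligible": False, "offered": False, "completed": False},
]
status = household_status(people)
```

In this toy example household 1 is offered and treated, household 2 is eligible but not offered, and household 3 is offered but not treated because its only eligible member did not complete the module.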
The retirement needs module consists of around 60 items that concern desired expenditure levels in retirement, the trade-off between current and future consumption, and risk attitudes with respect to income after retirement.
• The module starts by asking how much respondents have thought about retirement and whether they would be willing to cut down on housing expenditures when they stop working.
• The next two questions elicit expectations with regard to the evolution of housing costs during the first decade following retirement: the general direction, decrease/roughly equal/increase, followed by the expected change in euro per year.
• After having been primed in this way to consider housing as an important, and potentially changing, category of expenditures, respondents are asked what the minimum expenditure level is that they would never want to fall below during retirement. Respondents then compare this minimal expenditure level with their current expenditures and indicate the reasoning behind their answer (e.g. summing projected expenditures in different categories or taking a certain fraction of current income or expenditures).
• After reporting their minimal expenditures during retirement, a series of multiple choice questions elicits desired expenditures by means of choices between different expenditure paths during working life and retirement.
• The questions on desired expenditure levels are followed by a series of choices between lotteries that involve income streams during retirement, to measure (domain-specific) risk preferences, and a vignette question in which respondents indicate whether they agree that one hypothetical lifetime expenditure path is preferable to another.
• Finally, respondents are asked a question about their willingness to pay for prevention of climate change and they evaluate how difficult they found the questionnaire.
The exact wording of the questions can be found in Binswanger and Schunk (2012). Importantly, the module neither provided respondents with any information about the Dutch system of retirement income provision in general, nor about respondents' personal entitlements in particular. Thus, the randomised survey module did not constitute an information shock. Since the survey did not include any questions on predicted or intended savings either, the randomised survey module could not induce a question-behaviour effect; respondents were not asked to predict their own behaviour.

Saving Measures
We investigate the effect of survey participation on household saving. Though the LISS data include elaborate biannual surveys on assets and debt, we construct our saving measure from matched administrative data on wealth for two reasons. First, we want to isolate effects of survey participation on economic behaviour. If we found an effect of survey participation on self-reported savings, this could reflect altered survey-reporting styles rather than altered economic behaviour. Deriving our outcome measures from administrative data eliminates that concern. Second, there is a general concern about the quality of self-reported survey data on assets (Bound et al., 2001).
The administrative wealth data we use are taken from the Complete Asset Data of the Netherlands (Integraal Vermogensbestand, CAD), which was constructed by Statistics Netherlands. The CAD is based on tax records, which are supplemented with information from banks. The CAD contains a detailed decomposition of household-level wealth for the entire Dutch population. It measures assets on the first of January for the years 2007, 2008 and 2009 (data for more recent years had not yet been released at the time of writing). Available records thus allow us to compute yearly savings during 2007 and 2008 as the differences between wealth stocks on 1 January of consecutive years. Note that the timing is ideal for measuring the effect of a module fielded in January of 2008. We compute these wealth stocks net of the value of the primary residence, because we want to focus on pure savings and housing has an important consumption component. The use of administrative data on changes in wealth to measure savings and consumption is becoming more common; see Browning et al. Many studies of household savings behaviour are limited by having only data on specific assets or accounts. Thus, changes in contributions may represent portfolio reshuffling rather than changes in net saving. In contrast, a strength of our data is that we observe an almost complete measure of wealth. The categories of assets that we observe are checking and saving accounts, bonds, stocks, property, other real estate, business capital and other tangibles. For debt, the CAD distinguishes between mortgage and other debt. We miss savings held in small accounts because banks are not obliged to report accounts with a balance below 500 euro or interest payments below 15 euro. We also do not observe debt for households without capital income. Finally, we miss savings held in tax-exempt private retirement ('third pillar') pensions.
Such accounts are taxed only during the payout phase and are therefore invisible in tax records up to retirement. However, they were not very important during the period we study, since annuities contributed only 7% of household retirement income (Bovenberg and Meijdam, 2001). While we do not observe holdings in such accounts in the administrative data, we do have information from the LISS assets survey (one of the core surveys that is answered by all panel households). Thus, we do have a check on changes in saving in these accounts.
In our analysis of savings, we look both at levels, in euro per year, and rates, which are levels divided by yearly disposable income. The data on the yearly disposable income of households are also taken from tax records. We use the Complete Household Income Data of the Netherlands (Integraal Huishoudens Inkomstenbestand, CHID), assembled by Statistics Netherlands. The measure for primary income in the CHID is quite complete: in addition to labour income it includes income from entrepreneurship and from assets (interest payments and imputed rent for homeowners). Disposable income is defined as primary income plus government transfers that the household received minus the transfers and taxes paid by the household. The administrative income measure that we use is likely to be more accurate than survey measures of income since information about the various income streams is provided electronically by employers and financial institutions to the tax authority.
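As a minimal sketch of how the two outcome measures relate, the snippet below computes a saving level as the change in non-housing net worth between two consecutive 1 January stocks, and a saving rate as that level divided by yearly disposable income. The function and argument names are hypothetical, the numbers are invented, and returning `None` for non-positive income is an assumption of this sketch rather than something stated in the text.

```python
def saving_measures(wealth_start, wealth_end, disposable_income):
    """Return (saving level, saving rate) for one household-year.

    wealth_start, wealth_end: non-housing net worth on 1 January of two
    consecutive years (euro). disposable_income: yearly disposable
    household income (euro). All names are illustrative.
    """
    level = wealth_end - wealth_start
    # Assumption of this sketch: the rate is left undefined (None)
    # when disposable income is not positive.
    rate = level / disposable_income if disposable_income > 0 else None
    return level, rate


# Made-up numbers, not taken from the CAD or the CHID.
level, rate = saving_measures(40_000, 41_200, 30_000)
```

Here the household saves 1,200 euro over the year, which is a saving rate of 4% of its disposable income.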

Matching Survey and Administrative Data and Sample Selection
The construction of our estimation sample starts with 3,125 households that contain at least one member that was eligible for the retirement needs module according to the criteria mentioned in Section 2. Incomplete linkage is often an important concern when combining survey and administrative data. Of these 3,125 households, we could find administrative data matches for only 1,602. Respondent refusal of consent for matching can be a reason for incomplete linkage (Sakshaug et al., 2012) but in our case panel attrition is a bigger problem. Informed consent for the match of LISS data with administrative records was elicited only in September of 2011, almost four years after the retirement needs module was fielded, and by this time many of the households who were eligible for the retirement needs module had attrited from the panel. De Bresser and Knoef (2015) show that only 10% of the respondents to the retirement needs module that were still in the LISS Panel in 2011 refused consent to the administrative data linkage. After matching the LISS respondents to administrative data, we obtain wealth records for 1,429, 1,437 and 1,449 households in the years 2007, 2008 and 2009 respectively. We drop those households for which all eligible members were retired in 2008, reducing the sample to 1,275 households. Even with accurately measured administrative data, there can be large outliers in ratio variables. We trim all households for which 2008 savings rates relative to after-tax household income were larger than 50% in absolute value, leaving us with an estimation sample of 999 households. We also tried trimming the sample at savings rates larger than 75% and 100% of net household income and found quantitatively similar results to those reported below.
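The trimming rule and its sensitivity check can be sketched as follows. This is an illustration under stated assumptions: the helper name `trim_by_saving_rate` and the household identifiers are hypothetical, and the saving rates are invented values rather than data from the study.

```python
def trim_by_saving_rate(rates, threshold=0.5):
    """Keep households whose saving rate is at most `threshold` in absolute value."""
    return {hh: r for hh, r in rates.items() if abs(r) <= threshold}


# Hypothetical 2008 saving rates (saving / after-tax income) keyed by household id.
rates_2008 = {"a": 0.04, "b": -0.62, "c": 0.48, "d": 0.90, "e": -0.10}

# Sample size under the main 50% rule and the two looser alternatives.
sample_sizes = {
    t: len(trim_by_saving_rate(rates_2008, t)) for t in (0.5, 0.75, 1.0)
}
```

Re-running the analysis on each trimmed sample is one simple way to check that results do not hinge on the particular cut-off.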
Table 1 compares our matched estimation sample to a 'full' eligible sample from the LISS that excludes only households for which all eligible members were retired in 2008 (from the 3,125 eligible LISS households this produces a sample of 2,816 households). We provide descriptive statistics for these two samples separately for couples and singles. Couples are households in which two partners live together irrespective of their marital status and, among couples, individual-specific attributes (such as gender and age) are reported for the head of the household. For both couples and singles, these samples are very similar in terms of observables.
Table 2 summarises, for our matched estimation sample, administrative assets records for the years 2007, 2008 and 2009 (all in 2008 euro). The single most important category of assets is that of the primary residence, with an average value of around 200,000 euro. Savings accounts follow at a great distance as the second most important type both in terms of mean (27,000 euro) and median (13,000 euro) value. Real estate other than the primary residence is also important but only for a small minority: the mean value is around 7,000 euro though only 8% of the sample has any non-residential real estate. The mean value of risky assets, stocks and bonds, drops from 7,210 euro in 2007 to 4,857 euro in 2009 (median holdings are zero in all years). Business wealth and other wealth are the least important categories of assets with a mean value below 1,500 euro in all years.
On average, households have about 105,000-110,000 euro in mortgage debt and around 2,000-2,500 euro of non-mortgage debt. Non-mortgage debt is concentrated in a small minority of 6% of the sample, among which the mean non-mortgage debt is around 20,000 euro.
Taking assets and debt together, the mean net worth of the households in the sample is around 135,000 euro. Not surprisingly, net worth is concentrated in the primary residence, which has a mean value net of mortgage of around 95,000 euro. Because of the consumption value of housing, we compute savings based on the remaining 40,000 euro of non-housing savings.

Threats to Validity
Our analysis faces three threats to internal validity. Figure 1 provides an overview of the issues. The first threat is that the randomisation of the offer was performed at the level of the individual while the outcome variables we analyse are measured at the household level. We classify a household as being offered the survey if at least one eligible member received the offer, so by construction households with multiple eligible members are more likely to receive the offer. Thus we have conditionally random allocation to treatment and control: conditional on the number of eligible household members, randomisation across individuals ensures that the offer is random at the household level. Therefore, in all of the regressions reported in the next Section, we control for the presence of multiple eligible household members. Since, in practice, eligible households have either one or two eligible members, this amounts to including in our regression models a dummy variable equal to one if the household has a second eligible member.
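To see why households with two eligible members are more likely to be offered the module, consider a stylised calculation. It assumes, as a simplification rather than a description of the actual sampling procedure, that each eligible individual is offered the module independently with probability $p$, approximated here by the share of eligible individuals who were selected. $Z_h = 1$ denotes that household $h$ is offered the module.

```latex
\begin{align*}
p &\approx \frac{2{,}755}{5{,}435} \approx 0.507, \\
\Pr(Z_h = 1 \mid \text{one eligible member}) &= p \approx 0.51, \\
\Pr(Z_h = 1 \mid \text{two eligible members}) &= 1 - (1 - p)^2 \approx 0.76.
\end{align*}
```

Under this assumption the offer probability depends on the number of eligible members and on nothing else, which is why conditioning on that number is sufficient.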
We checked the conditional random allocation of households to treatment and control by regressing our instrument (an indicator for being offered the retirement needs module) on all the socio-demographic variables listed in Table 1 while controlling for the presence of multiple eligibles in the household. As expected, the dummy for multiple eligibles does predict receipt of offer at the household level. However, conditional on this control, the other covariates are jointly insignificant (p = 0.901). One-by-one balance tests also confirm that treatment and control groups are similar in terms of demographics, with no statistically or economically significant differences. Hence, the covariates are (conditionally) balanced, as one would expect given the conditional randomisation.
The second issue is incomplete compliance with the offer of treatment: not every LISS Panel member who was offered the retirement needs module completed this module. We apply two remedies. First, we perform an intention-to-treat (ITT) analysis that compares those who did receive the offer with those who did not (instead of comparing those who were treated with those who were not). Second, we perform instrumental variables (IV) analyses in which we use the (conditionally) random offer of treatment as an instrument for being treated. Both methods allow us to obtain estimates of treatment effects that are not affected by endogenous sample selection as a result of non-compliance, since they rely on exogenous variation in the module offer.
Our intention-to-treat analysis, exploiting the conditionally random allocation of eligible households to the offer of the retirement needs module, is implemented by estimating the following regression:
$$S_{h,t} = \beta_0 + \beta_1 \mathit{IME}_h + \beta_2 Z_h + \varepsilon_{h,t},$$
where $S_{h,t}$ is the saving (level or rate) of household $h$ at time $t$; $\mathit{IME}_h$ is an indicator variable for the presence of multiple eligible individuals in the household; $Z_h$ is our instrument, which is equal to one if one of those eligible in the household received the offer of the retirement needs module; and $\varepsilon_{h,t}$ is an error term. The parameter $\beta_2$ is the ITT effect. Our estimate of the treatment effect is obtained by estimating the equation:
$$S_{h,t} = \alpha_0 + \alpha_1 \mathit{IME}_h + \alpha_2 T_h + u_{h,t},$$
where $T_h$ is the treatment dummy, equal to one if at least one of those eligible in the household received the offer of the retirement needs module and completed the module, and $u_{h,t}$ is an error term. We estimate this equation by instrumental variables with $Z_h$ as the instrument for $T_h$. The parameter $\alpha_2$ is the treatment effect. Importantly, the research design we use is characterised by one-sided non-compliance: those respondents who were not randomly selected for the offer of the module could not complete the module.
Thus, the monotonicity requirement for the identification of local average treatment effects (Angrist et al., 1996) is satisfied. In addition to IV regressions for the conditional mean of the savings distribution, we also estimate decile treatment effects in order to establish the robustness of our results. We use the estimator proposed in Frölich and Melly (2013).
The final threat to internal validity is the incomplete matching to the administrative data, described above. This would invalidate our design if, among those eligible, selection into our estimation sample were related to the offer of the retirement needs survey (i.e. if selection were related to our instrument). It is important to check whether this is the case.
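The ITT and IV regressions described above can be illustrated with a minimal, self-contained sketch. It uses made-up data rather than the LISS or CAD records, and the helper names (`ols`, `itt_and_iv`) are hypothetical. Because there is a single instrument for a single treatment dummy, the IV coefficient is computed here as the ratio of the reduced-form coefficient on the offer to the first-stage coefficient on the offer, which coincides with two-stage least squares in this just-identified case. Standard errors are omitted.

```python
def ols(y, X):
    """OLS coefficients obtained by solving the normal equations (X'X) b = X'y."""
    k = len(X[0])
    # Augmented matrix [X'X | X'y].
    A = [
        [sum(row[i] * row[j] for row in X) for j in range(k)]
        + [sum(row[i] * yi for row, yi in zip(X, y))]
        for i in range(k)
    ]
    # Gauss-Jordan elimination with partial pivoting.
    for c in range(k):
        p = max(range(c, k), key=lambda r: abs(A[r][c]))
        A[c], A[p] = A[p], A[c]
        for r in range(k):
            if r != c:
                f = A[r][c] / A[c][c]
                A[r] = [a - f * b for a, b in zip(A[r], A[c])]
    return [A[i][k] / A[i][i] for i in range(k)]


def itt_and_iv(S, IME, Z, T):
    """Return (ITT effect, first-stage coefficient on Z, IV treatment effect)."""
    X = [[1.0, float(m), float(z)] for m, z in zip(IME, Z)]
    reduced_form = ols(S, X)  # saving on constant, IME, Z
    first_stage = ols(T, X)   # treatment on constant, IME, Z
    itt = reduced_form[2]
    compliance = first_stage[2]
    return itt, compliance, itt / compliance


# Made-up data with one-sided non-compliance: T can equal 1 only if Z equals 1.
S, IME, Z, T = [], [], [], []
for ime, n_offered, n_control in [(0, 10, 10), (1, 15, 5)]:
    for i in range(n_offered + n_control):
        z = 1 if i < n_offered else 0
        complies = 5 * i < 4 * n_offered  # first 80% of offered households comply
        t = 1 if (z == 1 and complies) else 0
        IME.append(ime)
        Z.append(z)
        T.append(t)
        S.append(1.0 + 2.0 * ime - 3.0 * t)  # constant treatment effect of -3

itt, compliance, iv = itt_and_iv(S, IME, Z, T)
```

In this constructed example the ITT effect is the treatment effect scaled down by the compliance rate, and dividing by the first-stage coefficient recovers the effect of treatment itself.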
The descriptive statistics in Table 1 show that the estimation sample is quite like the full sample of those eligible in terms of observable characteristics. However, one difference between the full sample and the estimation sample that is not reported in Table 1 is the degree of compliance with the survey offer. As noted above, in the full sample, 74% of individuals who were offered the survey participated. Household-level compliance in the full sample is 5 percentage points higher at 79%. In the estimation sample, the corresponding compliance rates are 82% for individuals and 87% for households. It is not surprising that compliance with the offer of the retirement needs module is related to inclusion in the estimation sample. Non-compliers with the survey offer are also more likely to leave the LISS Panel altogether. As a result, non-compliers were less likely to be present in 2011 to give consent for the match to administrative records. However, this does not compromise our research design, so long as the instrument is orthogonal to this selection process.
In the next Section, we formally test whether selection into the estimation sample is (mean) independent of the instrument. We do this by estimating the following regression on the full sample of eligible households:

$$IES_h = \gamma_0 + \gamma_1 IME_h + \gamma_2 Z_h + \nu_h, \qquad (3)$$

where $IES_h$ is a dummy variable which is equal to one if the household is included in the estimation sample. The hypothesis that inclusion in the sample is independent of the instrument implies $\gamma_2 = 0$, and this is our test.

[...] This is unsurprising given that conditions for eligibility for the module included a minimum income. The difference in mean incomes between the not-offered and offered groups reflects the fact that random assignment of the module to those eligible occurred at the level of the individual respondent, as described in the previous Section. This means that households with multiple eligible members were more likely to be offered the module, and so offered households are more likely to have multiple eligible members. Table 3 reveals that in 2008, 82% of offered households contained multiple eligible individuals while only 62% of not-offered households contained multiple eligible individuals. These differences, and the resulting income differences, emphasise again that, at the household level, we have conditional random assignment to the offer of the module (conditional on the number of eligible individuals in the household). As explained in the previous Section, our empirical strategy accounts for this. In 2007, the mean level of savings among eligible households who were subsequently not offered the module was 1,207 euro, with a median of 1.6 euro. This is non-housing saving, computed as the difference between the non-housing wealth stocks of 1 January 2007 and 2008. For eligible households that were subsequently offered the module, mean savings were 1,247 euro, with a median of 184 euro. Note that the savings levels are very dispersed, with the standard deviation in both groups about 9,000 euro.
In the same year, the mean savings rates were 2.3% for the not-offered and 1.7% for the offered. We compute savings rates as the level of savings divided by after-tax income. The medians are respectively 0% and 1%. As with saving levels, savings rates are very dispersed with a standard deviation of 18 percentage points in both groups.
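As a compact restatement of the two definitions used above, non-housing saving and the saving rate of household $h$ in 2007 can be written as follows. The symbols $W$ (non-housing wealth stock) and $Y$ (after-tax income) are illustrative shorthand introduced here and do not appear in the original text:

```latex
S_{h,2007} = W_{h,\,1\,\mathrm{Jan}\,2008} - W_{h,\,1\,\mathrm{Jan}\,2007},
\qquad
s_{h,2007} = \frac{S_{h,2007}}{Y_{h,2007}} .
```

Because $S_{h,2007}$ is a difference of wealth stocks, it can be negative, which is why mean and median saving rates close to zero are compatible with a standard deviation of 18 percentage points.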

Outcomes: Descriptive Statistics
In 2008, the mean saving rate of the not-offered group fell somewhat, to 1.6%. The mean saving rate of the offered group fell much more, from 1.7% to −1.6% (a fall of 3.3 percentage points). This suggests that participation in the module may have had an effect on the saving behaviour of the 87% of the offered group that did participate. Note, however, that the saving rate of the offered group in 2007 does not provide a credible counterfactual for the saving rate of that group in 2008 (because of time effects). The saving rate of the not-offered group in 2008 also does not provide a credible counterfactual for the saving rate of the offered group in 2008 (because the groups differ in the frequency of households with multiple eligible individuals). We now turn to the ITT and IV estimators that provide a credible counterfactual and therefore credible estimates of the causal effect of participation in the module.

Validity and Relevance of the Instrument
As explained in Section 4, we lose about half of our sample when we match LISS Panel records with administrative data. This loss of data would compromise the internal validity of our empirical strategy if the matching of the LISS Panel observations with the administrative data were related to the instrument, the offer of the retirement needs module. Table 4 shows estimates of (3). This is a linear regression that uses our instrument, called 'offer', and a dummy for households with multiple eligible individuals, to explain an indicator of inclusion in the estimation sample. We find that sample selection is not correlated with the offer of the retirement needs module, so the loss of data that results from matching survey participants to administrative data does not invalidate our research design. Turning to instrument relevance, the first-stage regression reported in Table 5 demonstrates that the instrument is highly relevant: the F-statistic for the coefficient of the instrument in a model that controls for the presence of multiple eligible individuals is 4,818.37.

Notes. Robust standard errors in parentheses. *Significant at 10%; **significant at 5%; ***significant at 1%.
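The two checks described above, the selection regression (3) and the first-stage relevance test, can be sketched as follows. This is an illustrative sketch on simulated data under stated assumptions, not a reproduction of Tables 4 and 5: the function names, the inclusion probabilities and the 80% latent compliance rate are hypothetical. In the simulation, inclusion in the estimation sample depends on a household's latent complier type but not on the offer, which is the situation in which the design remains valid.

```python
import numpy as np

def ols_robust(y, X):
    """OLS coefficients with heteroskedasticity-robust (HC1) standard errors."""
    n, k = X.shape
    XtX_inv = np.linalg.inv(X.T @ X)
    beta = XtX_inv @ X.T @ y
    resid = y - X @ beta
    meat = (X * resid[:, None] ** 2).T @ X              # X' diag(e^2) X
    cov = XtX_inv @ meat @ XtX_inv * n / (n - k)
    return beta, np.sqrt(np.diag(cov))

def simulate(n=20_000, seed=1):
    """Hypothetical data: selection depends on complier type, not on the offer."""
    rng = np.random.default_rng(seed)
    ime = rng.binomial(1, 0.7, n)
    offer = rng.binomial(1, np.where(ime == 1, 0.75, 0.5))
    complier = rng.binomial(1, 0.8, n)
    treated = offer * complier
    in_sample = rng.binomial(1, np.where(complier == 1, 0.55, 0.35))
    return ime, offer, treated, in_sample

def selection_test(ime, offer, in_sample):
    """Coefficient and t-statistic on the offer in the inclusion regression."""
    X = np.column_stack([np.ones(len(offer)), ime, offer])
    beta, se = ols_robust(in_sample.astype(float), X)
    return beta[2], beta[2] / se[2]

def first_stage(ime, offer, treated):
    """First-stage coefficient on the offer and its F-statistic (squared t)."""
    X = np.column_stack([np.ones(len(offer)), ime, offer])
    beta, se = ols_robust(treated.astype(float), X)
    return beta[2], (beta[2] / se[2]) ** 2

ime, offer, treated, in_sample = simulate()
print("selection coef, t:", selection_test(ime, offer, in_sample))
keep = in_sample == 1
print("first stage coef, F:", first_stage(ime[keep], offer[keep], treated[keep]))
```

Note that in this simulated setting compliance is higher in the retained sample than in the full sample, mirroring the pattern reported for the LISS data, yet the offer coefficient in the inclusion regression is zero in expectation.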

Together these two tables confirm that our empirical strategy overcomes both incomplete matching of households to the administrative wealth data and non-compliance with the offer of module participation. Table 6 presents ITT estimates of the effect of (the offer of) survey participation on household saving, both for the mean and at various quantiles. The top panel uses 2008 non-housing savings as the outcome variable, while the bottom panel explains the 2008 savings rate (non-housing savings divided by household income). In the mean regression, we find a reduction in the savings rate of 3.1 percentage points, or about 1,500 euro of 2008 non-housing savings. The estimated saving rate effects at various deciles are of the same order of magnitude as the mean effect, between 2 and 5 percentage points.

Main Results
These ITT estimates, in conjunction with the summary statistics in Table 3, allow us to calculate counterfactual savings levels and rates for the offered group in 2008. In particular, subtracting the ITT estimate of −3.1 percentage points from the observed saving rate of −1.6% gives a counterfactual saving rate of 1.5% for the offered group in 2008. An analogous calculation gives a counterfactual saving level of 1,173 euro. Table 7 presents our main results, obtained from IV regressions of the two savings measures (levels and rates) on participation in the retirement needs module, where we instrument survey participation with the random offer of the survey and control for multiple eligibles in the household. The leftmost column shows the estimated coefficients and accompanying standard errors for the treatment dummy in 2SLS models, which capture the mean effect. Participation in the survey caused households to save 1,683 euro less on average during 2008. This is a large effect, and can be compared to the counterfactual saving of the offered group of 1,173 euro.
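The counterfactual calculation above is simple arithmetic and can be written out explicitly. The notation $\bar{s}^{\,\mathrm{obs}}$ and $\bar{s}^{\,\mathrm{cf}}$ for the observed and counterfactual mean saving rates of the offered group is illustrative shorthand, and $\hat{\beta}_2$ denotes the ITT estimate:

```latex
\bar{s}^{\,\mathrm{cf}}_{2008}
  = \bar{s}^{\,\mathrm{obs}}_{2008} - \hat{\beta}_2
  = -1.6\% - (-3.1\ \text{pp})
  = 1.5\% .
```

A related consistency check, under the assumption that the first-stage coefficient is close to the 87% household compliance rate in the estimation sample: with a single binary instrument and the same controls in both stages, the IV estimate equals the ITT estimate divided by the first-stage coefficient. Here $-3.1 / 0.87 \approx -3.6$, which is roughly in line with the reported IV effect on the saving rate of 3.5 percentage points once rounding of the inputs is taken into account.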
When we express savings relative to household income, we also find a significant and negative effect. Survey participation caused households to save 3.5 percentage points less on average, compared with a counterfactual mean saving rate of 1.5% and a standard deviation of 18 percentage points (both for the offered group). On average, participation in the survey moved households from modest saving to modest dis-saving. The remaining columns of Table 7 report quantile IV effects. We report estimates for the second to eighth deciles in order to show that participation in the retirement needs module affected subsequent savings throughout the distribution of savings rates. For the level of savings, we find significant and large effects for the third, sixth and seventh deciles. The estimated coefficients for the other deciles are also all negative. For the saving rate, we find strongly significant effects at the third and sixth deciles as well as marginally significant effects at the second and fourth deciles. These estimates show that large parts of the savings distribution were shifted by participation in the survey, with similar effect sizes below and above the median. The mean effects reported above are not driven by responses in the tails of the saving distribution.

Notes. Standard errors in parentheses. *Significant at 10%; **significant at 5%; ***significant at 1%. †For decile models we report unconditional treatment effects. We control for the presence of multiple eligibles.
We investigated the robustness of these results in a number of other ways as well. In particular, alternative trimming rules yield quantitatively similar results, as do narrower definitions of wealth that include only risky assets and bank accounts. Our identification is based on the randomised offer of the retirement needs module to a subset of the eligible panel members. This implies conditional randomisation to households (conditional on the number of eligibles in a household) and allows us to cleanly measure the causal effect of interest. Nevertheless, we added to the models reported in Table 7 all the covariates listed in Table 1 and found that all effects are robust to including these additional controls.

Falsification Test
As a falsification test, we estimate the same IV model from (2) with 2007 savings as the dependent variable, that is, on behaviour realised before exposure to the retirement needs module. The results of this exercise are reported in Table 8. We find no evidence of any systematic difference in savings behaviour, either in terms of the mean level or rate of saving or at any of the deciles. Note that, in contrast to Table 7, the coefficients at the various deciles vary in sign. This provides further assurance that our research design is valid and that our main results are not driven by incomplete data linkage (as any differential selection into the matched sample would also affect the 2007 results).

Effect Heterogeneity
We next investigate effect heterogeneity. One approach would be to run IV analyses on subsamples but many variables that could be used for interesting splits of the sample are correlated. Examples are income and education, or income and age. Therefore, we prefer a pooled IV approach, in which we interact the treatment (survey participation) dummy with household characteristics, and then generate additional instruments by interacting our instrument (the offer dummy) with household characteristics. In addition to the interaction between selected covariates and the treatment dummy, the estimated model includes all the covariates from Table 1 as exogenous independent variables. We investigate heterogeneity along the lines of income, education and age. Note that, for reasons of sample size, the specification does not contain dummies for all cells defined by those variables but only interactions of the separate variables with the treatment indicator. The upper panel of Table 9 displays coefficient estimates for the main effect and interaction terms of the model with the savings rate as the dependent variable. According to these estimates, the survey did not affect the average savings rate of the base group: young, income-poor households that are poorly educated. We find strong evidence for differential effects along the age and education dimensions: households with more highly educated or older heads reduced their savings more after having been offered the retirement needs module. The evidence for effect heterogeneity by current income is weaker. Once we allow the treatment effect to interact with education, the interaction with income is positive (indicating that, conditional on education, households with higher current income cut saving by less). But the income interaction is only weakly statistically significant (p < 0.1) and much smaller in magnitude than the interaction with higher education. 
It may be that education is a better measure of long-run economic position or retirement preparedness. We then use the coefficients of this model to calculate treatment effects for households with different combinations of characteristics. These are reported in the lower panel of Table 9. Participation actually raised the saving rate of households with poorly educated heads who are younger than 40 and have a disposable income above the sample median by 8 percentage points (though this effect is not particularly precise). Households in the highest education category saved less regardless of the age of the head and their household income but we find the strongest effects for older households: the mean treatment effect is −3/−9 percentage points for households below age 40; −5/−11 percentage points for age 40-54; and −12/−17 percentage points for households aged 55 or older. We find similar heterogeneity in the survey effect on the level of savings. Overall, Table 9 shows that the size of the effect of survey participation on saving is much larger for the highly educated (college and university graduates). In addition, we also investigated effect heterogeneity by gender and family type, and whether the number of household members participating matters. In particular, we checked whether the effect differs depending on whether the individual(s) who received the offer is a husband; a wife; both husband and wife; a single male; or a single female. However, with our sample size, we are unable to discern differential effects depending on whether men or women were offered the survey, or whether one or two household members were offered the survey. These estimates are available on request.
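The pooled IV approach described above, in which the treatment dummy is interacted with household characteristics and the offer dummy is interacted with the same characteristics to form additional instruments, can be sketched as follows. This is a simplified illustration with a single binary characteristic (a hypothetical `high_edu` indicator) on simulated data; the function names and all parameter values are assumptions, and the actual specification in Table 9 includes more interactions and covariates.

```python
import numpy as np

def two_sls(y, X, W):
    """2SLS coefficients for regressors X using instruments W.

    Exogenous columns (constant, controls) must appear in both X and W.
    """
    first_stage, *_ = np.linalg.lstsq(W, X, rcond=None)
    X_hat = W @ first_stage                              # fitted regressors
    coef, *_ = np.linalg.lstsq(X_hat, y, rcond=None)
    return coef

def simulate(n=60_000, seed=2):
    """Hypothetical data: zero effect for the base group, -6 extra for high_edu."""
    rng = np.random.default_rng(seed)
    ime = rng.binomial(1, 0.7, n)
    high_edu = rng.binomial(1, 0.4, n)
    z = rng.binomial(1, np.where(ime == 1, 0.75, 0.5))   # conditionally random offer
    complier = rng.binomial(1, 0.87, n)
    t = z * complier                                     # one-sided non-compliance
    s = (1.0 + 1.0 * ime + 0.5 * high_edu + 2.0 * complier
         + 0.0 * t - 6.0 * t * high_edu + rng.normal(0, 5, n))
    return s, ime, high_edu, z, t

def heterogeneous_effects(s, ime, high_edu, z, t):
    """Return (base-group effect, interaction with high_edu) from pooled 2SLS."""
    const = np.ones(len(s))
    X = np.column_stack([const, ime, high_edu, t, t * high_edu])
    W = np.column_stack([const, ime, high_edu, z, z * high_edu])
    coef = two_sls(s, X, W)
    return coef[3], coef[4]

base, interaction = heterogeneous_effects(*simulate())
print("base effect:", base)
print("interaction:", interaction)
print("effect for high_edu households:", base + interaction)
```

The treatment effect for a given group is the sum of the main effect and the relevant interaction coefficients, which is how group-specific effects such as those in the lower panel of a table like Table 9 can be derived from a single pooled model.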
Next we further explored potential sources of the significant effect heterogeneity across education groups. The various estimates underlying the results discussed below are not presented in the Tables but, again, are available on request.
Households in the highest education group have higher rates of home ownership. They also have higher pension wealth in terms of a standardised annuity from the first two (mandatory) pillars of the pension system: the median predicted annuity, net of taxes, is 1,442 euro/month for poorly educated households, compared with 1,725 and 2,039 euro/month respectively for the higher education groups. However, the replacement rate of the projected annuity relative to current income is similar for all education groups, ranging from 79% to 81%. Conditional on education, neither home ownership nor either measure of pension wealth further significantly interacts with the treatment effect. This suggests that differences in financial circumstances may not drive the heterogeneity in treatment effects across education groups.
An alternative hypothesis is that the education groups differ in the way the survey affects attention to retirement saving and needs. However, there is no significant difference in their reported rates of thinking about retirement prior to the survey. The high education group answers the survey more quickly than the low education group (median difference two minutes, p = 0.058) but the difference between the high education and middle education groups is not statistically significant. Moreover, we find no significant interaction between the treatment effect and survey response time.
A question in the retirement needs module elicited subjects' expectations of housing costs in retirement. Interestingly, relative to low education households, highly educated households are much more likely to expect a decrease in housing costs after retirement.
These differences are economically and statistically significant (p < 0.001). Unfortunately, as this question was asked in the retirement needs module, i.e. only of the treated sample, we cannot test for treatment effect heterogeneity in this specific dimension. Nevertheless, this difference raises the possibility that the survey made salient to high education households likely decreases in housing expenditure after retirement.

Evidence on Portfolio Effects
Finally, we consider whether the observed effect of survey participation on household saving might reflect portfolio reshuffling rather than changes in active saving. There are two concerns. First, as explained earlier, the administrative wealth data do not contain information on a relatively small category of tax-favoured savings accounts. Though these are not a major asset category in the Netherlands, it remains possible that our finding of a negative effect of survey participation on non-housing savings results from a re-allocation of assets to those accounts. Second, our wealth-based measure of saving contains both active saving and capital gains. A possibility is that treated households shifted saving to risky assets and then experienced significant losses with the onset of the financial crisis in 2008. Changes in active saving and changes in portfolio allocation would both be a survey effect but it is important and interesting to distinguish the nature of the effect.
With respect to the first point, we used survey data from the LISS assets module to look at investments in these tax-favoured saving accounts. We tested for survey effects on these accounts on both the extensive and intensive margins. We find no effect on participation in such accounts, change in participation, balances conditional on participation or unconditional balances. While the survey data may be quite noisy at the intensive margin, we believe that participation is relatively well measured. Turning to the second point, we use the administrative wealth data to test for an effect of survey participation on the portfolio share of risky assets, participation in risky assets, or changes in either shares or participation. We do not find any evidence to suggest that capital gains for risky assets drive the survey effects that we document. Moreover, we also find survey effects when we limit our attention to changes in the balance of saving accounts, an asset class that does not exhibit large price-driven variation in value. These checks point to the effect documented above being an effect on active saving, and not an effect on portfolio allocation.

Discussion and Conclusion
In this article, we show that participating in a non-informative survey module on retirement needs led Dutch households to save significantly less. Our analysis uses administrative wealth data to calculate clean measures of savings that are not contaminated by the reporting styles of survey respondents, which themselves might have been affected by the intervention. Participation in the survey is instrumented by randomised assignment of invitations to participate in the module, so our estimates are unaffected by endogenous compliance. Estimated effects are large: the saving rate (saving as a fraction of disposable income) is 3.5 percentage points lower among treated households (on a base saving rate of 1.5%). Our wealth data only allow us to compute savings during 2007 and 2008. Naturally we will investigate the durability of the effects when more data become available.
Quantile IV models show effects across a wide range of the savings distribution. Falsification checks reveal no effect on savings prior to the survey. We find evidence for heterogeneous treatment effects. The mean effect is driven by older and highly educated households. These households have the highest expected pension wealth and higher housing wealth. They are also much more likely, in the survey, to report that their housing costs will fall in retirement (perhaps because they anticipate paying off their mortgage debt).
Our results are consistent with limited attention (DellaVigna, 2009). The survey may have made aspects of retirement needs and retirement planning more salient to participants. After reflecting on their expenditure needs in retirement, older and highly educated households concluded that they can afford to save less while the young and poorly educated marginally increased their savings. Asymmetric costs of adjustment may also be relevant. If the survey led some participants to conclude that they were saving too much, and some to conclude they were saving too little, the former may have found it easier to adjust. They only need to consume more.
These results are likely specific to the context of the experiment. At the time the retirement needs module was fielded in 2008, mandatory pensions were generous, replacing 80% of final income after tax on average, and covered nearly the entire population. De Bresser and Knoef (2015) show that many Dutch households were oversaving, relative to their self-reported desired expenditure level in retirement. They also show that more educated households could look forward to higher occupational pensions. Moreover, financial institutions were obliged to provide all pension holders with Universal Pensions Overviews (UPOs) from 2008 onwards. This may have meant that households whose attention was drawn to their retirement needs and plans could obtain information on current entitlements and projections for age 65 at very low cost. It is not clear how households would react to the salience shock of the survey if they were largely under-saving (especially if adjustment costs are asymmetric), or if information on retirement preparedness was costly to obtain.
While effects which are likely due to limited attention have been documented previously (Stango and Zinman, 2014), our results highlight that such effects can operate in surprising directions depending on the context. We also show that the patterns of heterogeneous effects are context specific. We find the largest effects for the most educated while Stango and Zinman (2014) find the opposite. Perhaps most importantly, we show that such effects can operate on the most central of life-cycle choices, such as the level of consumption and saving. These salience effects might be exploited for purposes of pension policy. But it is only through the continued accumulation of evidence on how salience shocks manifest in widely-varied contexts, and with respect to different outcomes, that general models of these effects can be formulated and convincingly tested.
The validity of empirical research that uses survey data relies on the representativeness of the sample. Survey effects such as the one documented in this article imply that panels may not be representative of the underlying population even if the initial sampling was representative and the representativeness of the panel has not been degraded by attrition or non-response. If survey participation alters the behaviour of respondents, the external validity of any study based on such data will be compromised. Moreover, if the behaviour of survey participants is altered by participation, new ethical issues arise in survey research. This issue is already being debated in the public health literature (Fitzsimons and Moore, 2008; Schwarz, 2008).
Such considerations may be a further argument for greater use of administrative data whenever available (Einav and Levin, 2014). Also, as noted by Zwane et al. (2011), survey effects may mean that it is better to achieve statistical power with large panels and infrequent measurement rather than with smaller but more frequently measured samples.
In their recent survey, Warren and Halpern-Manners (2012) call for more research on panel conditioning and survey effects that employs experimental or quasi-experimental designs, and which documents effects in the kinds of large-scale longitudinal surveys that underpin much research in the social sciences. Our analysis shows one way in which this can be done: by exploiting the randomisation of content that is becoming more common in such surveys. We strongly support their call for further research.