Return on Trust is Lower for Immigrants

Trustworthiness is key for successful economic and social interactions. We conduct an experiment with a representative sample of the Dutch population to study whether trustworthiness depends on the ethnicity of the interaction partner. Native Dutch trustees play trust games with an anonymous other, who is either another native Dutch or an immigrant of non-Western descent. We find that the trustees reciprocate trust up to 13% less frequently if the trustor is a non-Western immigrant than if he/she is native Dutch. This percentage increases up to 23% for trustees who report disliking ethnic diversity in society in a survey that took place one year before the experiment. Since the decision to reciprocate does not involve behavioral risk, we take our results as evidence of taste-based discrimination. The implication is that the return on trust is lower for immigrants of non-Western descent than for native Dutch.


Introduction
In several European countries about 10% of the population is born outside of Europe. This mix of individuals has posed and continues to pose challenges. For example, immigrants in Europe born outside of Europe earn about 77% of what native Europeans earn (median net income of €12,258 versus €15,830), and their unemployment rates are almost twice as high (18.8% versus 9.5%) (Eurostat, 2014). There are several factors that potentially underlie this gap. Employers may be uncertain about skills and education acquired abroad, there may be language problems, and the socio-demographic composition of the immigrant population may be different from the native European population. The achievement gap between native Europeans and immigrants persists for second-generation immigrants, though, who are born and typically trained in the country of residence (Eurostat, 2011; OECD and EU, 2015). Second-generation immigrants in Europe are almost twice as likely to be unemployed if both parents have a foreign background as compared to individuals with native-born parents. In some countries, including Germany and the Netherlands, their youth unemployment rate in the age range of 25-34 is close to three times as high. Remarkably, second-generation immigrants report feeling more discriminated against than their parents do.
Several field experiments have shown that ethnic discrimination is a reality in many European (and other) countries and not just a perception (see Guryan and Charles, 2013; OECD, 2013; Zschirnt and Ruedin, 2016; Bertrand and Duflo, 2016, for overviews). Audit studies, for example, where actors of different ethnic origins participate in job interviews, typically find that auditors from ethnic minorities are less likely to be called back, and if called back, they are less likely to be hired. Likewise, in correspondence studies, resumes that are sent in response to job vacancies have less chance of success if the name of the applicant is from an ethnic minority than if not so, even if all the other elements in the resume are the same (Bertrand and Mullainathan, 2004; Oreopoulos, 2011). 1 The discrimination identified by field experiments can be the result of 'stereotypes' or of 'tastes'. Stereotyping (or statistical discrimination) may occur in a context of behavioral risk: if the decision-maker lacks information about a person, he may form expectations about her behavior by relying on general information about the group to which this person belongs (Arrow, 1973; Phelps, 1972). To illustrate, an employer who sees many immigrants being unemployed may believe that immigrants are less productive than workers from the native majority, and may therefore be reluctant to hire an immigrant. Discrimination driven by tastes stems from a dislike of immigrants and is arguably more difficult to overcome than stereotyping (Becker, 1957). It persists even if accurate information about the other person's behavior is available. What is more, it may lead to societal outcomes that are economically inefficient, for example, because it distorts the allocation of talent.
Understanding the nature of discrimination is key for understanding how it can be overcome. Given that field data are inherently noisy and almost always characterized by behavioral risk, a difficulty in empirical research is to identify the type of discrimination involved (see Guryan and Charles, 2013, for a discussion). In this paper, we report the results from an experiment that eliminates behavioral risk. The experiment thus rules out stereotyping as a channel of discrimination and isolates taste-based discrimination.
We conducted the experiment with participants recruited from the LISS (Longitudinal Internet Studies for the Social sciences) panel, which constitutes a representative sample of the population living in the Netherlands. The main decision-makers are native Dutch, and they interact either with another native Dutch or with a 'non-Western' immigrant. 2 Participants were matched with an anonymous other in the sample and played binary trust games. The decision context was simple, could easily be explained to nonstudent participants, and is a workhorse model of several types of social and economic interactions involving incomplete contracts, such as employer-employee, buyer-seller, or borrower-lender relationships. Trustworthiness is a key component of success in these types of relations.
Our research question is whether trustworthiness of native Dutch depends on whether the trustor is also native Dutch or is, instead, a non-Western immigrant. We focus on trustworthiness because the choice to be trustworthy does not involve behavioral risk, and thus depends on tastes. This is in contrast to the choice to trust, which can be driven by both tastes and beliefs: people may not trust others because they dislike them or because they think they are not trustworthy (Ashraf et al., 2006). All trustees were thus native Dutch, and were randomly matched to a native or non-native, non-Western, trustor. In order to create awareness of the trustor's origins, we truthfully revealed his or her first name to the trustee. A first name is a signal of the ethnic roots and, at the same time, preserves anonymity. 3 Hence, participants have no reputational incentives to act in a certain manner. 4 We also revealed the trustor's gender and age range to the trustees, so that the first name was not focal.
2 'Western' countries refers to all European countries (except Turkey), North America, and Oceania. 'Non-Western' countries include countries in Africa, Latin America and Asia (including Turkey). The demographic composition and social and economic conditions of immigrants in the Netherlands are roughly similar to those in other European countries, like Germany, France, or Belgium. See section A in the Appendix for more information about immigrants in the Netherlands.
We find that native Dutch reciprocate trust of immigrants about 4.5 percentage points (8%) less frequently than trust of other native Dutch. If gains to trust and reciprocation are relatively high, the treatment effect increases to 7-8 percentage points (12-13%). The overall trust rate is 55%. Moreover, we find that behavior in the experiment corresponds remarkably well to self-reported attitudes towards diversity elicited in a survey one year before we ran the experiment: discrimination is driven by individuals who tend to dislike diversity in society. In contrast, the reciprocation rate of individuals who are relatively open to diversity and immigrants does not depend on the background of the trustor.
3 Examples of experiments on discrimination that reveal participants' surnames or full names are Fershtman and Gneezy (2001) and Bouckaert and Dhaene (2004). A separate laboratory experiment we ran among Dutch students who were shown a list of the trustors' names indicates that the first name is indeed a signal of ethnicity: participants were successful in correctly linking names in 89% of the cases.
4 Glaeser, Laibson, Scheinkman, and Soutter (2000) find that trustees are less trustworthy vis-à-vis a trustor who is of a different race. However, given that participants met in person, the result may be driven by reputation concerns, as outside the experimental setting same-race interactions are more likely than mixed-race ones.

The LISS panel and our sample
The LISS panel is managed by CentERdata, a survey research institute located at the campus of Tilburg University, and consists of a true probability sample of 8000 individuals drawn from the population registered by Statistics Netherlands. Panel members participate in surveys and experiments in exchange for a fixed fee (plus a variable fee in some cases). The experiments do not involve deception. Households that lack the means to participate are provided with a computer, or a television set, which is connected to the internet. Panel members also take part in yearly longitudinal studies ('core studies') that keep track of the changes in the panel members' lives. These studies aim to measure individuals' reactions to policy measures and changes in society, and it is possible to match behavior in experiments to their answers in the core studies.
In Table 1 we present the distributions of key socio-economic variables both for the entire population of the Netherlands and the LISS sample (third and fourth column, respectively). The table shows that the distributions are very similar along the dimensions of gender, age, and education. Individuals who live in a strongly urbanized area and one-person households are somewhat under-represented in the LISS panel, though.
A sample of 839 pairs of players whose first name was available was drawn from the LISS panel by CentERdata. The pairs each consisted of a native Dutch trustee and a native Dutch trustor or a trustor of non-Western descent. The trustees were contacted in the course of December 2014. 5 Out of these, 691 (82.4%) participated in the experiment, of which 329 were matched to a Dutch native trustor (treatment Native) and 362 to an immigrant trustor (treatment Non-native).
The trustors made their choices in the course of March 2015. Details on their sampling and descriptive statistics can be found in section F of the Appendix. Trustors in Non-native are either born outside the Netherlands (first-generation immigrants), or are born in the Netherlands but have at least one parent who is born elsewhere (second-generation immigrants). We only sampled immigrants with a 'non-Western' background, given that it is this group in particular that suffers from substantial economic and social problems, and, potentially, from discrimination.
An overview of the distributions of key socio-economic variables among the participants in the role of trustee is included in Table 1 (last three columns). It can be seen that the distributions are highly similar in the experiment and the LISS sample as well as between treatments in the experiments. The distributions of gender, age, and size of household do not differ significantly between treatments (P > 0.269 in χ² tests), and neither does household income (P = 0.619 in a t-test). The distribution of participants across education categories is significantly different though (P = 0.012 in a χ² test); it turned out that participants with tertiary education were oversampled in Non-native. Therefore, it is important to control for education when analyzing treatment effects.

The experiment
The trustees played three binary trust games with the same trustor. 6 The three games differed with respect to the payoffs of mutual cooperation and had the same payoffs otherwise. Figure 1 shows the game; A and B refer to the trustor and trustee, respectively, and x is equal to €40, €60, or €80. Eliciting choices in three games allows us to gain some insight into the preference function that motivates trustees' behavior. 7 Moreover, we expected that participants in our experiment may feel less pushed to act according to what is considered desirable (reciprocate if trusted, irrespective of the identity of the trustor) when making several choices rather than a single, focal, choice. In what follows, we refer to the three games as G40, G60, and G80.
In the instructions, it was explained to the participants that at the end of the research five pairs (one A and one B) would be randomly drawn for payment among the 200 pairs that were expected to participate. 8 The variable earnings that trustees could expect were thus between 0.875 Euro and 2.125 Euro, depending on the choice of the trustor and the trustee's own choice. In addition, they received a fixed fee of 1.5 Euro, calculated on the basis of an expected duration of 6 minutes. In terms of hourly earnings, trustees could thus expect to earn between 23.75 and 36.25 Euro an hour. In reality, they earned 14.86 Euro an hour (variable earnings of 5.85 Euro), which was lower than expected because the number of participating pairs turned out to be higher (CentERdata oversampled in order to be certain that we would have at least 200 pairs of participants by treatment) and the duration of the experiment was underestimated. 9
6 See sections B and C of the Appendix for a detailed description of the experimental procedures and a sample of instructions, respectively.
7 A similar approach has been used before in Dufwenberg and Gneezy (2000) in a related game.
8 In experiments on representative samples it is standard practice to pay a fraction of the participants (e.g. von Gaudecker et al., 2011; Dohmen et al., 2012). Bolle (1990) shows that such a reward structure does not make subjects behave differently.
We elicited trustees' choices using the strategy method of Selten (1967), which has the advantage of ensuring sufficient statistical power to identify a treatment effect. A number of papers study whether decisions are different as compared to a direct response mode in the context of a trust game, or, relatedly, sequential prisoner's dilemma (Casari and Cason, 2009; Brandts and Charness, 2000).
Although differences in terms of 'levels' of choices (e.g., level of trustworthiness) sometimes occur, the mode of response does not have an effect on whether a treatment effect is obtained (see Brandts and Charness, 2011, for a survey). Brandts and Charness (2011) suggest that the strategy method provides a lower bound for testing for treatment effects.
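The expected-earnings figures quoted in the instructions can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text (5 of 200 pairs paid, a fixed fee of 1.5 Euro, an expected duration of 6 minutes). The function names are ours, and the "implied game payoff" is a back-calculation from the quoted variable earnings, not a value taken from Figure 1.

```python
# Back-of-the-envelope check of the expected earnings quoted in the text.
PAIRS_PAID = 5          # pairs drawn for payment (from the text)
PAIRS_EXPECTED = 200    # pairs expected to participate (from the text)
FIXED_FEE = 1.5         # Euro, based on an expected duration of 6 minutes
EXPECTED_MINUTES = 6


def hourly_rate(variable_earnings: float) -> float:
    """Convert expected earnings per session into Euro per hour."""
    return (variable_earnings + FIXED_FEE) * 60 / EXPECTED_MINUTES


def implied_game_payoff(variable_earnings: float) -> float:
    """Game payoff implied by an expected variable earning.

    This inverts expected = payoff * (pairs paid / pairs expected);
    it is an inference from the quoted figures, not a stated payoff.
    """
    return variable_earnings * PAIRS_EXPECTED / PAIRS_PAID


if __name__ == "__main__":
    for quoted in (0.875, 2.125):
        print(quoted, hourly_rate(quoted), implied_game_payoff(quoted))
```

Under these assumptions the two quoted bounds reproduce the hourly range of 23.75 to 36.25 Euro given above.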
At the point of choice elicitation, participants received information about the matched player. In particular, they were told the first name of the trustor as well as the gender and the age range (which was between 16 and 89 years old in all cases). We revealed the gender and the age range with the sole purpose of reducing the salience of A's ethnic background, as signaled by his/her first name. 10 Since participants could quit the experiment at any time, one may wonder whether there is a treatment effect on the number of participants who decided to leave the experiment at the point they got to see the name of their partner and before entering their choices. We find that there is no such difference between the two treatments (P = 0.538 in a χ² test). Our experiment is thus not affected by a selection bias.
9 See section E for a detailed calculation of expected and actual earnings.
10 To study whether perceptions of first names can be connected to the ethnic background, we elicited these perceptions in a separate laboratory experiment conducted with Dutch students. The experiment reveals that in 89% of the cases it was correctly perceived whether the name is of a native Dutch or not. The design and results of this experiment are described in section D of the Appendix.
The elicitation of participants' choices was followed by two belief elicitation tasks (beliefs about choices of other trustees and beliefs about choices of trustors were elicited) and a short comprehension questionnaire. We discuss beliefs in section 4. A particularly relevant question in the comprehension questionnaire relates to the clarity of the experiment: it asked participants to rate on a five-point scale how clear they found the questions (1 = not at all; 5 = very much; see screen 8). We used the answers to this question as a proxy for how confused participants were and include it as a control in part of our regressions.

Politics and values survey
LISS panelists take part in longitudinal studies ('Core Studies') that measure the same set of variables every year. The goal is to keep track of changes in the panel members' lives and their reaction to modifications in society. One of the Core Studies is a survey on politics and values, where, among other things, attitudes towards diversity and immigrants in the Netherlands are surveyed. In order to study whether participants' behavior in the experiment correlates with their self-reported attitudes towards immigrants, we match the data from the December 2013-January 2014 wave to our experimental data. Table 2 displays the eight survey items that are used to collect opinions about immigration. 11 For each item, panelists are asked to indicate on a scale from 1 to 5 whether they agree with the statement, where 1 means full disagreement and 5 means full agreement.
In order to obtain a single measure that summarizes one's attitude towards immigrants on a five-point scale, we used responses to items 1, 3, 4, 5, 7, and 8, and calculated their average, after reversing the scale for items 5 and 7. Higher values correspond to a more positive attitude. We excluded items 2 and 6 because these questions elicit one's belief about how immigrants are generally treated in the Netherlands, rather than eliciting one's personal attitude to immigrants. Furthermore, all items are positively and significantly correlated (correlations ranging from 0.20 to 0.58, P < 0.05), except items 2 and 6. 12 In total, 5661 panel members answered the above eight questions, corresponding to 88% of the LISS panel members who were contacted for the survey. We could match 566 of our trustees to the survey, which means that out of the 691 trustees who participated in our experiment, 125 did not answer the survey questions. In the results section we discuss the robustness of our main results to the missing survey data.
Table 2 (excerpt): survey items on immigration
3. It should be made easier to obtain asylum in the Netherlands.
4. Legally residing foreigners should be entitled to the same social security as Dutch citizens.
5. There are too many people of foreign origin or descent in the Netherlands.
6. People of foreign origin or descent are not accepted in the Netherlands.
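The construction of the composite attitude measure described above can be sketched as follows. The item numbers, the reversed items, and the 1 to 5 scale come from the text; the function name, the example respondent, and the assumption that every included item was answered are ours.

```python
from statistics import mean

INCLUDED_ITEMS = (1, 3, 4, 5, 7, 8)  # items 2 and 6 are excluded (from the text)
REVERSED_ITEMS = (5, 7)              # reverse-coded before averaging (from the text)
SCALE_MIN, SCALE_MAX = 1, 5


def attitude_score(responses: dict) -> float:
    """Average the included items after reversing items 5 and 7.

    `responses` maps item number to an answer on the 1-5 scale.
    Higher scores correspond to a more positive attitude.
    Assumes all included items were answered.
    """
    values = []
    for item in INCLUDED_ITEMS:
        answer = responses[item]
        if item in REVERSED_ITEMS:
            # On a 1-5 scale, reversing maps 1 -> 5, 2 -> 4, and so on.
            answer = SCALE_MIN + SCALE_MAX - answer
        values.append(answer)
    return mean(values)


if __name__ == "__main__":
    # Hypothetical respondent, for illustration only.
    example = {1: 4, 2: 3, 3: 2, 4: 4, 5: 2, 6: 3, 7: 1, 8: 3}
    print(attitude_score(example))
```

Answers to items 2 and 6 are simply ignored, which mirrors their exclusion from the measure.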
Since our treatment allocation is random, attitudes towards immigrants should be balanced between treatments. We indeed find that the average of the composite measure is equal to 2.85 (with a standard deviation of 0.68) in Native and 2.86 (with a standard deviation of 0.64) in Non-native (P = 0.818 in a t-test).
12 Item 2 is positively and significantly correlated with 4 other items (namely 3, 4, 5, and 7) and item 6 is positively and significantly correlated with 2 other items (namely items 3 and 7). Including all the 8 items in the measure leaves the results reported in section 3.2 practically unchanged. These results are available from the authors upon request.
Notes: The figure shows overall reciprocation rates by treatment (left-hand-side panel) and reciprocation rates by game and treatment (right-hand-side panel). Error bars refer to 95% confidence intervals estimated in probit regressions of which results are reported in columns 1 and 3 of Table 3.

Reciprocation rates
Figure 2 shows reciprocation rates by treatment averaged over all games (left-hand-side panel) and reciprocation rates in each game (right-hand-side panel). As can be seen in the left-hand-side panel of the figure, the overall reciprocation rate in Non-native is equal to 57.5%, and is lower than that in Native, which is 62%. If we look at reciprocation rates by game, we observe that these are similar in Native (59.3%) and Non-native (60.5%) in G40, that is, when the payoffs of reciprocation are equal to €40. For payoffs of €60 and €80, however, reciprocal choices are more frequent in Native than in Non-native. In G60, the reciprocation rate is 61.7% in Native and 55.3% in Non-native, and in G80, it is 65.4% in Native and 58.0% in Non-native.
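To make the distinction between percentage-point gaps and relative gaps concrete, the raw rates quoted above can be compared directly. This is a descriptive sketch only: it uses the reported rates, the helper name `gap` is ours, and the numbers it yields are raw differences, not the regression-based marginal effects.

```python
# Reciprocation rates in percent, as reported in the text.
RATES = {
    "G40": {"Native": 59.3, "Non-native": 60.5},
    "G60": {"Native": 61.7, "Non-native": 55.3},
    "G80": {"Native": 65.4, "Non-native": 58.0},
}


def gap(native: float, non_native: float) -> tuple:
    """Return (percentage-point gap, gap relative to the Native rate in %)."""
    pp = non_native - native
    relative = 100 * pp / native
    return pp, relative


if __name__ == "__main__":
    for game, rates in RATES.items():
        pp, relative = gap(rates["Native"], rates["Non-native"])
        print(f"{game}: {pp:+.1f} pp, {relative:+.1f}% relative to Native")
```

The relative gap divides by the Native rate, so a given percentage-point difference looks larger in relative terms when the baseline rate is lower.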
To test the statistical significance of the treatment effects, we ran probit regressions with the choice of the trustee as a dependent variable. In all reported regressions, standard errors are corrected for clustering at the individual (trustee) level. Table 3 reports the results of four specifications. The first two columns present estimations of general treatment effects, without and with controlling for individual-specific variables, respectively. The controls include the trustee's gender, a range of socio-economic variables (age, income, education, urbanization of the area of residence) and a confusion variable. The confusion variable is a dummy equal to 1 for participants who report to be confused in the post-experimental questionnaire, that is, who answered 1 or 2 to the clarity question (see screen 8 in the instructions included in section C of the Appendix). These specifications show that trustees are generally 4.2 to 4.7 percentage points (about 7%) less likely to reciprocate trust from a non-Western immigrant than from a native Dutch. In both specifications the treatment difference is statistically significant at the 10% level.
Notes: The table shows marginal effects in probit regressions with clustering at the individual level (P-values in parentheses). The dependent variable is a dummy equal to 1 if the trustee reciprocates. Stars ***, ** and * indicate that the marginal effect is statistically significant at the 1%, 5% and 10% level, respectively.
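As a rough illustration of what a probit marginal effect of the treatment dummy means, consider the simplest case of a probit with only a constant and the Non-native dummy. Such a model is saturated, so its maximum likelihood fit reproduces the two group proportions, and the marginal effect equals the raw difference in reciprocation rates. The sketch below uses the overall rates quoted in the text (62% and 57.5%); it leaves out the controls of Table 3 and says nothing about the clustered standard errors, which affect inference but not the point estimate. Function names are ours.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal, the probit link


def probit_from_rates(rate_control: float, rate_treated: float) -> tuple:
    """Coefficients of a probit with a constant and one treatment dummy.

    With a single binary regressor the model is saturated, so the fitted
    probabilities equal the observed group proportions.
    """
    constant = STD_NORMAL.inv_cdf(rate_control)
    dummy_coef = STD_NORMAL.inv_cdf(rate_treated) - constant
    return constant, dummy_coef


def marginal_effect(constant: float, dummy_coef: float) -> float:
    """Change in the predicted probability when the dummy goes from 0 to 1."""
    return STD_NORMAL.cdf(constant + dummy_coef) - STD_NORMAL.cdf(constant)


if __name__ == "__main__":
    const, coef = probit_from_rates(0.62, 0.575)
    print(f"constant={const:.3f}, dummy={coef:.3f}, "
          f"marginal effect={marginal_effect(const, coef):.3f}")
```

The dummy coefficient lives on the latent index scale, which is why reported results are expressed as marginal effects rather than raw coefficients.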
The last two columns in Table 3 allow for game-specific treatment effects by including interactions between treatment and game dummies, again without and with controlling for individual-specific variables. Marginal effects and associated P-values from Wald tests are shown at the bottom of the table. These tests indicate that the probability of acting reciprocally is significantly smaller in Non-native than in Native when the payoffs of reciprocation are equal to €60 (P ≤ 0.085) and €80 (P ≤ 0.047). Marginal effects are between 6.4 and 8.1 percentage points, corresponding to an effect of about 12-13%.
Notice that we do not find evidence for gender-based discrimination. If we control for the gender of the trustor, which is known to the trustee when making decisions, the treatment effect of Non-native persists and the gender dummy is not significant (see Table G.1 in the Appendix). 13

Attitude to immigrants
We study the relation between trustees' attitude to immigrants as measured in the survey on politics and values discussed in section 2.3 and their behavior in the trust game. 14 First, we classify trustees into two groups based on a median-split of their attitude to immigrants. Trustees with a more positive (negative) than median attitude (equal to 2.92) towards immigrants are classified as having a 'positive' ('negative') attitude. Figure 3 shows overall reciprocation rates (left-hand-side panels) and reciprocation rates by game (right-hand-side panels), for trustees with a negative attitude to immigrants (upper panels) and for trustees with a positive attitude to immigrants (lower panels). Error bars as well as P-values reported in the text are based on estimations from probit regressions of which estimation results are reported in Tables G.3 and G.4 in the Appendix.
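The median split described above can be sketched as follows. The text classifies trustees as more positive or more negative than the median; how scores exactly equal to the median are handled is not stated, so assigning them to the 'negative' group below is an assumption, as are the function name and the example identifiers.

```python
from statistics import median


def classify_by_median(scores: dict) -> dict:
    """Label each trustee 'positive' or 'negative' relative to the median score.

    `scores` maps a trustee identifier to the composite attitude score.
    Assumption: scores equal to the median are labeled 'negative'.
    """
    cutoff = median(scores.values())
    return {
        trustee: "positive" if score > cutoff else "negative"
        for trustee, score in scores.items()
    }


if __name__ == "__main__":
    # Hypothetical trustees, for illustration only.
    example = {"t1": 2.0, "t2": 2.5, "t3": 3.0, "t4": 3.5}
    print(classify_by_median(example))
```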
Panel (a) of Figure 3 reveals that the overall reciprocation rate of people with a relatively negative attitude to immigrants is 12 percentage points (more than 20%) lower in Non-native than in Native (P ≤ 0.008). As can be seen in panel (b), the pattern is very different for people with a relatively positive attitude to immigrants; their reciprocation rate is practically identical in Native and Non-native (P > 0.554). If we look at reciprocation rates sliced up by game, we see that in all three games, trustees with a negative attitude are less reciprocal to non-Western immigrants than to native Dutch. The effect is large and significant for G60 and G80 (P ≤ 0.037), and not so for G40. To illustrate, in G80, trustees with a negative attitude are 23% less likely to reciprocate trust by an immigrant. For trustees with a positive attitude, differences are not significant (P ≥ 0.148).
13 The same table in the Appendix also reports results from regressions with interactions between the Non-native dummy and the trustor's gender. These regressions show that native Dutch (non-native) female trustors are reciprocated the most (the least), while reciprocation rates towards native Dutch and non-native males are in between.
14 Out of the 691 trustees 125 did not answer the related questions. We obtain treatment effects of the same order of magnitude as reported previously for a sample of participants that excludes these 125 participants (see Table G.2 in the Appendix).
Notes: The figure shows overall reciprocation rates by treatment (left-hand-side panels) and reciprocation rates by game and treatment (right-hand-side panels), for trustees with a negative attitude to immigrants (upper panel) and for trustees with a positive attitude to immigrants (lower panel). Error bars refer to 95% confidence intervals estimated in probit regressions of which results are reported in columns (1) and (3)
Next, we dig deeper into the relation between reciprocation and self-reported attitude to immigrants by showing the relation between deciles of the distribution of the attitude to immigrants and reciprocation rates. Figure 4 shows reciprocation rates across the three games by decile, where higher deciles represent a more positive attitude. The fitted line is obtained from probit regressions where the probability to reciprocate is regressed on the deciles of the attitudes to immigrants. Estimation results are reported in Table G.6 of the Appendix. We find that in Non-native the reciprocation rate is significantly increasing in the attitude to immigrants (marginal effect of 0.018, P = 0.015). By contrast, in Native the relation between reciprocation and immigrant attitudes is not significant (marginal effect of 0.005, P = 0.423).
Figure 4: Reciprocation rate (observed and fitted) by decile of attitude to immigrants, for Native and Non-native.
Notes: The figure shows reciprocation rates by decile of the distribution of attitudes to immigrants going from highly negative (1) to highly positive (10). The fitted lines are from probit regressions where the choice to reciprocate is the dependent variable. Results of the probit regressions are in columns (1) and (3) of Table G.6 of the Appendix.

Types of trustees
We classify trustees into types based on the combination of choices they made in the three games. To do so, we use two characteristics of decision-making: consistency and reciprocity. Consistency refers to choices being consistent with a well-behaved utility function (Fisman et al., 2007), which in our games also allows for social preferences à la Fehr and Schmidt (1999), or Charness and Rabin (2002). Out of the combinations of consistency and reciprocity, we extract three types: consistent reciprocators, defectors, and inconsistent types. Consistent reciprocators (call these reciprocal types) are trustees who either always reciprocate, or reciprocate only in G80, or only in G60 and G80. Defectors are trustees who never reciprocate. 15 Inconsistent types are types who either switch back and forth between reciprocating and defecting, or start reciprocating in a game with low gains to cooperation (e.g., G40) and then stop reciprocating as gains to cooperation increase (e.g., G80). Table 4 shows the distributions of types by treatment.
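The type classification above amounts to a simple rule on the triple of choices: a reciprocal type never stops reciprocating as gains to cooperation rise. A minimal sketch, with a function name and boolean encoding of our own choosing (True means the trustee reciprocates in that game):

```python
from itertools import product


def classify_trustee(g40: bool, g60: bool, g80: bool) -> str:
    """Assign a type from the choices in G40, G60, and G80.

    - 'defector': never reciprocates.
    - 'reciprocal': always reciprocates, or only in G80, or only in G60 and G80,
      i.e. reciprocation is weakly increasing in the gains to cooperation.
    - 'inconsistent': every other pattern.
    """
    if not (g40 or g60 or g80):
        return "defector"
    # False <= True in Python, so this checks that reciprocation never switches off.
    if g40 <= g60 <= g80:
        return "reciprocal"
    return "inconsistent"


if __name__ == "__main__":
    # Enumerate all eight possible choice patterns.
    for choices in product((False, True), repeat=3):
        print(choices, classify_trustee(*choices))
```

Of the eight possible patterns, three are reciprocal, one is the defector pattern, and the remaining four are inconsistent.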
As can be seen in Table 4, overall there are 8 percentage points fewer reciprocal types and more defectors and inconsistent types in Non-native than in Native (P = 0.149 in a χ²-test based on three types, and P = 0.056 if defectors and inconsistent types are merged into one type). The difference in distributions is particularly clear for trustees who have a negative attitude to immigrants. Non-native has 12 percentage points fewer reciprocal types than Native (P = 0.117 in a χ²-test based on three types, and P = 0.050 if defectors and inconsistent types are merged into one type). For trustees with a positive attitude to immigrants the distribution of types is not that different between treatments.
For example, someone who attaches a strongly negative weight to the monetary payoff of an immigrant (basically, someone who gets pleasure from destroying an immigrant's money) would be less likely to reciprocate in games with high gains to cooperation than in games with lower such gains in an interaction with an immigrant. Second, results on 'attention discrimination' (due to Bartoš et al., 2016) suggest that trustees matched to an immigrant may switch back and forth more frequently because they pay relatively little attention to the specifics of the games.
Our data allow us to study the interpretation of inconsistency in choices further. In particular, we look at the relation between being an inconsistent type and self-reported confusion as measured in the post-experimental questionnaire. Before reporting results on this relation, it is important to emphasize that we find no significant difference in the central tendency, variance or distribution of self-reported confusion between Native and Non-native (P = 0.890 in a t-test, P = 0.576 in a variance ratio test, and P = 0.969 in a Kolmogorov-Smirnov test). In both treatments, the average answer to the clarity question is 3 on a Likert scale of 1 to 5. Therefore, if inconsistency is associated with confusion in Native and not so in Non-native, this would suggest that at least part of the inconsistencies in the latter treatment are meaningful and not mere noise. Probit regressions with an inconsistent type dummy as dependent variable and a dummy for self-reported confusion as independent variable show this is indeed the case: confusion significantly increases the likelihood of being inconsistent in Native (average marginal effect of confusion is 0.19, P < 0.001) but does not have an effect in Non-native (average marginal effect of confusion is -0.01, P = 0.962). 16 So in contrast to Native, inconsistent choices in Non-native are less likely to stem from participants who found the experiment relatively unclear. We take this as suggestive evidence that inconsistent behavior in Non-native may actually be meaningful and could be motivated by, for instance, atypical social preferences or attention discrimination.

Expected payoff of trust
Ultimately, an important question is whether the return on trust depends on one's ethnicity. Figure 5 shows the expected payoff of trust for native Dutch and immigrant trustors. Overall, as can be seen in the left-hand-side panel, immigrants who trust are expected to earn about €2 (that is, 5%) less than native Dutch who trust. The effect is generally significant at the 5% level. If we look at each game separately, we find that immigrants who trust are expected to earn €2.6-€3.1 less (about 6%) in G60 and €4.4-€4.9 less (about 8%) in G80 than native Dutch who trust. The effects are significant at the 5% or 10% level.
If the trustee happens to dislike ethnic diversity (middle panel of Figure 5), immigrants who trust are expected to earn about €5 (that is, 13%) less than native Dutch who trust (P ≤ 0.007). In the worst possible case for immigrants, that is, in G80 when matched with a trustee who dislikes diversity, the payoff of trust is €8.4 to €9.2 (about 15%) lower than for native Dutch trustors. Finally, if the trustee's attitude to immigrants is positive, the expected payoff of trust does not significantly depend on the trustor's background (P ≥ 0.281).
Table G.12 in the Appendix. Stars ***, ** and * indicate that the marginal effect is statistically significant at the 1%, 5% and 10% level, respectively.
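The link between reciprocation rates and the expected payoff of trust can be sketched with a simple expectation. The payoff structure of Figure 1 is not reproduced in this section, so both payoff values below are assumptions: we take the trustor's payoff under reciprocation to be x, and the payoff when trust is not reciprocated is a purely hypothetical placeholder. Only the G80 reciprocation rates (65.4% and 58.0%) come from the text.

```python
def expected_payoff_of_trust(p_reciprocate: float,
                             payoff_reciprocated: float,
                             payoff_betrayed: float) -> float:
    """Expected payoff to a trustor who trusts, given the reciprocation probability."""
    return (p_reciprocate * payoff_reciprocated
            + (1 - p_reciprocate) * payoff_betrayed)


# ASSUMPTION: the trustor earns x = 80 when trust is reciprocated in G80.
ASSUMED_PAYOFF_RECIPROCATED = 80.0
# HYPOTHETICAL placeholder: the payoff after betrayal is not given in this section.
HYPOTHETICAL_PAYOFF_BETRAYED = 20.0

if __name__ == "__main__":
    native = expected_payoff_of_trust(
        0.654, ASSUMED_PAYOFF_RECIPROCATED, HYPOTHETICAL_PAYOFF_BETRAYED)
    non_native = expected_payoff_of_trust(
        0.580, ASSUMED_PAYOFF_RECIPROCATED, HYPOTHETICAL_PAYOFF_BETRAYED)
    print(f"Native: {native:.2f}, Non-native: {non_native:.2f}, "
          f"gap: {non_native - native:.2f}")
```

Whatever the true payoffs are, the gap in expected payoff equals the gap in reciprocation rates multiplied by the difference between the two payoffs, which is why the return on trust moves one-for-one with trustworthiness.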

Socio-economic characteristics of discriminators
Up to this point, in our analysis we have assumed that the socio-economic characteristics of trustees do not interact with the treatment. We now relax this assumption, and exploit the richness of our data set to study whether certain groups, defined along standard socioeconomic indicators, are more likely to discriminate against immigrants. In particular, we study whether the probability of behaving in a discriminatory way is related to age, education, urbanization of the area of residence or income. Table 5 displays reciprocation rates for the different socio-economic groups by treatment, and includes results from tests of treatment effects based on probit regressions.
First, we find that trustees above median age are significantly less likely to reciprocate the trust of an immigrant than that of a native Dutch, while such discrimination is not observed among trustees who are younger. Second, we observe that less educated trustees, and not highly educated trustees, are significantly less trustworthy towards immigrants. Third, trustees who live in areas with relatively low urbanization discriminate against immigrants, while discrimination is not observed among trustees living in urbanized areas. Lastly, we find that trustees with a relatively low income discriminate against immigrants, while those with a relatively high income do not.

Trust and beliefs about trust
Given that trustees can condition their choice on the choice of the matched trustor, they do not face behavioral risk, so their beliefs about the choices of trustors are irrelevant for their decisions in a standard economics framework. However, it may be that they reciprocate trust by immigrants less than that by native Dutch because they think immigrants will not trust them anyway. Therefore, we investigate whether treatment differences are related to differences in beliefs about trust. We study beliefs about trust by trustees using two different proxies. As a first proxy we use beliefs elicited after trustees made their choices.
As a second proxy we use the observed behavior of trustors assuming that beliefs of trustees are rational.
The left-hand-side panel of Figure 6 displays trustees' beliefs about trustors' behavior for both treatments, with 95% confidence intervals. The figure shows that trustees believe that trust increases with the payoff of mutual cooperation (from G40 to G80) and, most importantly, that beliefs do not differ between Native and Non-native. 18 Next, we look at the behavior of trustors. 19 We compare trust rates of Native trustors to those of Non-native trustors. The right-hand-side panel of Figure 6 displays trust rates and 95% confidence intervals in both treatments. Trustors in our immigrant sample are overall more likely to trust a native trustee than native Dutch trustors are. This result is driven by a treatment effect in G40. In G60 and G80 treatment differences are not statistically significant.
17 Also, for example, Trautmann and van de Kuilen (2015) show that there is no clear advantage of incentivizing beliefs as compared to simple introspection.
18 We also elicited trustees' beliefs about the behavior of other trustees, and find no treatment effects either.
19 Section F of the Appendix describes the experimental procedures in the trustors' wave in detail.
Notes: The figure shows beliefs of trustees about trust rates normalized on a scale from 0 to 1 by game and treatment (left-hand-side panel) and trust rates by game and treatment (right-hand-side panel). Error bars refer to 95% confidence intervals estimated in regressions whose results are reported in columns (1) and (2).
We take these two sets of results as suggestive evidence that the observed discrimination against immigrants cannot be accounted for by differences in elicited beliefs about trust rates, nor by differences in actual trust rates.

Discussion
We study taste-based discrimination in a controlled experiment on a representative sample of the Dutch population by eliciting reciprocation choices of native Dutch trustees who were matched either to a native Dutch trustor or to a trustor who is a non-Western immigrant. Our trustees reciprocate trust of an immigrant less frequently than trust of a native Dutch. The overall effect is driven by individuals who self-report having a negative attitude to diversity in society, which supports the external validity of the experimental results. Moreover, it suggests that discrimination is not always implicit or unintentional (e.g. Greenwald et al., 1998; Hofmann et al., 2005; Stanley et al., 2011).
Experiments aimed at detecting taste-based discrimination typically do not find much evidence for it (see, for example, Fershtman and Gneezy, 2001; Bouckaert and Dhaene, [...]). An exception is Whitt and Wilson (2007), who report taste-based discrimination in a dictator game experiment conducted with a large sample of individuals living in Bosnia (Bosnjaks, Croats and Serbs). They find that individuals send significantly less money to non-co-ethnics than to co-ethnics. We speculate that, as in Whitt and Wilson (2007), our subjects are less likely to share a common identity than is the case in more specific subject pools (such as students, small businessmen or sportscards fans). As argued by Tajfel and Turner (1986) and Akerlof and Kranton (2010), a lack of common identity can be a source of discrimination. Another exception is Danilov and Saccardo (2017), who find that German students are more likely to reject unfair offers in ultimatum games if a Turkish instead of a German student made the offer. We speculate that the room for spitefulness that exists in their experiment as well as in ours may make discrimination more likely.
Since each trustee in the experiment plays three games that vary in the gains of mutual cooperation, we can gain some insights into the preferences that motivate behavior. We find no discrimination with low gains to cooperation, and a large treatment effect with high gains to cooperation. This is consistent with trustees attaching a strongly negative weight to the payoffs of non-Westerners in their utility, stronger than the (positive) weight attached to their own payoff. 20 And indeed, if we look at the distribution of types of trustees, we observe that trustees are less likely to be reciprocal and consistent (that is, they are less likely to reciprocate always or to start reciprocating as gains to cooperation increase) when matched to an immigrant. Instead, these trustees are more likely to either defect irrespective of the gains to cooperation, or to make choices that cannot be rationalized with a standard social preference function.
Discrimination in our experiment shows up among trustees who are above median age or live in areas that are not strongly urbanized. These results may be due to the fact that younger generations and people living in large cities have more occasions to come into contact with immigrants. Literature in social psychology suggests that inter-group contact causes a reduction of prejudice against other groups, including ethnic prejudice (see Pettigrew and Tropp, 2006, for a meta-study). However, since individuals with a rather positive attitude to diversity may self-select into large, multi-ethnic cities, our results do not allow making any causal claim. We also find that trustees with lower-than-median income or at most secondary education are more likely to discriminate as compared to richer or higher-educated individuals. This is consistent with them being in direct competition with immigrants in the labor market, but no causal claims can be made here either.
20 Remarkably, a similar effect of gains to cooperation has been reported in an audit study where male auditors of different ethnicity are sent out to bargain to purchase a new car at different car dealers (Price et al., 2012). African-American auditors get worse offers than Caucasian auditors, and have the initial counter offer rejected almost twice as frequently when gains to cooperation are high (when the car involved is high-end), but not when they are low.
A relevant question is whether immigrants also discriminate against native Dutch.
To answer this question, we ran an additional treatment where non-Western immigrants played the role of trustee and were matched to a native Dutch trustor. 21 Section H in the Appendix provides a detailed description of the procedures as well as a data analysis.
We find that the immigrants in our sample are more reciprocal to native Dutch than native Dutch are to them (P = 0.064). Moreover, the immigrants' reciprocation rates are not different from the reciprocation rates of native Dutch (P = 0.767). Of course, these results do not necessarily imply that they treat native Dutch in the same way as they would treat other immigrants; it may be that they are even more likely to reciprocate trust of other immigrants. Even if that were the case, the results would still suggest that they do not discriminate against native Dutch in absolute terms.
In our experiment, a standard economics framework leaves little role for beliefs held by trustees about the choice of the matched trustor to rationalize discrimination. Yet, as formalized in psychological game theory, preferences may be a function of different layers of beliefs (Geanakoplos et al., 1989). For example, the intentions trustees attribute to the choice to trust may depend on the ethnicity of the trustor. If our trustees associate trust of the immigrants in our sample with greed and trust of native Dutch with willingness to cooperate, then they may be less reciprocal to the immigrants (Rabin, 1993). Alternatively, native Dutch may feel less guilty about not reciprocating trust of an immigrant if they think that immigrants expect lower payoffs than native Dutch do (see Battigalli and Dufwenberg, 2007). In summary, there are a variety of psychological forces that may underlie taste-based discrimination à la Becker. Our experiment does not allow us to identify these, nor was it intended to do so, so we leave these questions for future research.
A big question that remains unanswered is how persistent tastes are over time. In the long run, tastes can be endogenous, and social and institutional contexts may shape them (Bowles, 1998; Bauer et al., 2016). For instance, an experiment run in Mostar (Bosnia-Herzegovina) shows that children who live in an integrated neighborhood reach higher cooperation levels in ethnically diverse groups than children who live in a segregated neighborhood (Alexander and Christia, 2011). Future research will hopefully help to gain understanding of what makes people more or less trustworthy towards ethnically different others. Whereas trust is essential for the functioning of societies (Knack and Keefer, 1997), in the long run it cannot exist without trustworthiness.

A Immigrants in The Netherlands
In the Netherlands, 10.8% of the population was born in another country 22 ; 4.3% in a 'Western' country and 6.5% in a 'non-Western' country. 23 In addition, 10.6% of the population (the second-generation immigrants) has one or two parents born in another country; 5.2% in a 'Western' country and 5.4% in a 'non-Western' country. In sum, 11.9% of the population has 'non-Western' roots. Immigrants with 'non-Western' roots suffer from substantial economic and social problems. In 2014 their unemployment rate was 16.5%, versus 6.1% for native Dutch and 8.7% for 'Western' immigrants. 24 The unemployment rate of 'non-Western' youngsters was particularly high, namely 28.3% as compared to 9.8% for native Dutch.
22 The data reported in this subsection are from January 1, 2014, and are extracted from a year report on

B Experimental procedures for trustees
The 839 panelists were contacted with a request to participate in research organized by Tilburg University (see section C of the Appendix for the detailed instructions). On the first screen they got to see, they were informed that by participating they could earn additional money on top of the usual fixed participation fee. They were also informed that  answered the second set of beliefs questions.
After the elicitation of beliefs, participants were asked to fill in a short comprehension questionnaire with questions related to clarity, difficulty, interest, etc.

C Instructions for participants in the experiment
We include instructions (translated from Dutch) that were shown on the screens to LISS panel members who were contacted to participate in the experiment, and who (would) play in the role of trustee. Apart from role-related instructions, the instructions for the trustors were the same. In order to proceed to the next screen, participants had to click a 'Continue' button at the bottom right of the screen. From screens 2 to 4, and 6 to 8 they could go back one screen by clicking a 'Back' button at the bottom left of the screen. On screens 5 and 6, they could not go back.

{Screen 1}
This research is commissioned by Tilburg University. By participating you can earn money, in addition to the usual participation fee.
The amount of money that you can earn in addition depends on choices made by yourself and another participant. All choices are anonymous. The identity of all participants (including yourself) remains strictly confidential.
For the research it is important that you read the instructions carefully.

{Screen 2}
You will get a role: A or B.
A will be asked to choose between 'IN' and 'OUT' in three choice situations and B will be

{Screen 3}
At the end of the research, one of the three choice situations will be drawn randomly.
Also, 5 pairs (1 A and 1 B) will be drawn randomly. Each of these pairs will receive the amount that corresponds to the choices they made in the randomly drawn choice situation. The payment will be included in the regular payment. In total, there will be about 200 participants with role B.

{Screen 4}
You have role B.

The matched participant with role A is called [first name], lives in the Netherlands, is [male/female], and is between 16 and 89 years old. To guarantee anonymity, we cannot give you more details about A's identity.
Indicate your 3 choices at the bottom of the screen and click 'Continue' to enter your choices. Be careful, once you have clicked 'Continue', you cannot come back to this screen.
[Figure 1 is shown]
NB: Please finish the questionnaire until you arrive at the starting screen. Only then does the system register the questionnaire as complete.
What did you think of the current survey:

D Experiment on the perception of names
In order to test whether native Dutch people are able to distinguish between native Dutch and other first names, we conducted a laboratory experiment. Subjects were presented with a list of names of all trustors who participated in the experiment with the LISS sample. For each name, subjects were asked to indicate whether they thought the name was that of a Dutch person or of a person with non-Dutch origins. Each correct guess yielded €0.02, and total earnings were at most €12. There was no time limit to complete the task and the instructions emphasized the importance of giving careful answers. The experiment was conducted at CentERlab, Tilburg University, with 6 student participants.
In what follows we describe the results of the experiment. We define the accuracy rate by name as the percentage of subjects who correctly guessed whether that name belongs to a Dutch person or not. Figure D.1 shows the distribution of accuracy rates. The median and mode accuracy rate is 100%, implying that most raters correctly guessed the name's origin, and the mean is 89%. We take these figures as evidence that communicating trustors' names is an effective way to reveal their origins to trustees.
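The accuracy rate defined above can be computed as in the following minimal Python sketch. The names, raters and guesses are placeholders invented for illustration (they are not the experimental data), and the function name is hypothetical.

```python
from statistics import mean, median, mode


def accuracy_by_name(truth, ratings):
    """Percentage of raters who classify each name correctly.

    truth:   {name: True if the name belongs to a native Dutch person}
    ratings: list with one {name: guessed_is_dutch} dict per rater
    """
    rates = {}
    for name, is_dutch in truth.items():
        correct = sum(1 for rating in ratings if rating[name] == is_dutch)
        rates[name] = 100.0 * correct / len(ratings)
    return rates


# Placeholder data: five names and four raters (all invented).
truth = {"name_a": True, "name_b": True, "name_c": False,
         "name_d": False, "name_e": False}
all_correct = dict(truth)
wrong_on_e = {**truth, "name_e": True}  # misclassifies only name_e
ratings = [all_correct, wrong_on_e, all_correct, wrong_on_e]

rates = accuracy_by_name(truth, ratings)
values = list(rates.values())
summary = {"median": median(values), "mode": mode(values), "mean": mean(values)}
print(rates)
print(summary)
```

As in the reported results, a median and mode of 100% can coexist with a lower mean when a few names are hard to classify.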

E Earnings of trustees
The trustees' earnings are composed of a fixed part and a variable part. The fixed fee is calculated on the basis of an hourly payment of 15 Euro and the expected duration of the survey. Based on experience with past experiments, CentERdata estimated that the panelists would take about 6 minutes to complete our experiment. Panelists thus received a participation fee of 1.5 Euro. The amount of the fee sets panelists' expectations regarding the length of the experiment.
We proceed with calculating how much panelists could expect to earn. In order to calculate the variable part we use the probability of being paid out conditional on the matched trustor participating (equal to 0.025 = 5 out of 200 pairs that were expected to participate by CentERdata). This gives us variable earnings between 0.875 Euro and 2.125 Euro depending on one's choice and the choice of the matched trustor. If we combine the fixed and variable fee and translate the earnings to hourly earnings (that is, we multiply the amounts by 10 assuming an expected duration of 6 minutes), we get that a trustee could expect to earn between 15 + 8.75 = 23.75 Euro and 15 + 21.25 = 36.25 Euro an hour.
Next, we calculate how much trustees actually earned. To do so we take into account the amount paid to trustees, which is equal to 520 Euro in total, and the median duration of the experiment, which is 10 minutes. Across treatments Native and Non-native, 533 pairs actually participated (out of the 691 participating trustees, 533 could be matched to a participating trustor), which is more than the 400 pairs (200 by treatment) initially expected by CentERdata. This is because CentERdata oversampled in order to be 100% certain that we would get at least 200 pairs by treatment. Among these 533 pairs 10 were randomly drawn for payment (5 by treatment). This implies that variable earnings for trustees whose matched trustor participated were equal to 0.976 Euro on average, or 5.85 Euro an hour assuming a duration of 10 minutes. Adding the fixed fee, this gives average earnings of 14.86 Euro an hour.
In summary, participants in our experiment could expect to earn between 23.75 and 36.25 Euro on average an hour and actually earned 14.86 Euro on average an hour.
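The arithmetic in this section can be retraced with the short Python sketch below, which uses only the figures stated in the text; the variable and function names are our own. Carrying full precision gives roughly 14.85 Euro an hour for actual earnings, while rounding the per-pair amount to 0.976 Euro before scaling reproduces 14.86, which is presumably how the reported figure was obtained.

```python
HOURLY_FEE = 15.0        # Euro per hour used to set the fixed fee
EXPECTED_MINUTES = 6     # duration estimated by CentERdata
MEDIAN_MINUTES = 10      # median duration actually observed


def hourly(amount, minutes):
    """Convert an amount earned in `minutes` to an hourly rate."""
    return amount * 60 / minutes


# Fixed participation fee: 15 Euro an hour for 6 minutes.
fixed_fee = HOURLY_FEE * EXPECTED_MINUTES / 60

# Expected earnings: variable part between 0.875 and 2.125 Euro.
expected_low = hourly(fixed_fee + 0.875, EXPECTED_MINUTES)
expected_high = hourly(fixed_fee + 2.125, EXPECTED_MINUTES)

# Actual earnings: 520 Euro paid out in total across 533 participating pairs.
variable_per_pair = 520 / 533
actual_hourly = hourly(fixed_fee + variable_per_pair, MEDIAN_MINUTES)

# Same calculation, rounding the per-pair amount to three decimals first.
actual_hourly_rounded_first = (
    hourly(fixed_fee, MEDIAN_MINUTES)
    + hourly(round(variable_per_pair, 3), MEDIAN_MINUTES)
)

print(f"fixed fee: {fixed_fee:.2f} Euro")
print(f"expected hourly range: {expected_low:.2f} to {expected_high:.2f} Euro")
print(f"actual hourly: {actual_hourly:.2f} Euro "
      f"({actual_hourly_rounded_first:.2f} when rounding first)")
```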

F Procedures for trustors
In March 2015, a total of 1122 panelists, who would play in the role of trustor, were contacted to participate in the experiment. 25 Recall that the trustee-trustor pairs were drawn before trustees made their choices, so before December 2014. Out of the contacted trustors, 899 (80.1%) took part in the experiment: 326 native Dutch trustors were matched to a native Dutch trustee (treatment Native), 275 non-Western immigrant trustors were matched to a native Dutch trustee (treatment Non-native) and 298 native Dutch trustors were matched to an immigrant trustee (cf. the additional treatment discussed in section H of the Appendix). Table F.1 summarizes the socio-demographic characteristics of the sample of trustors.
As expected, there are substantial differences between native Dutch and non-native trustors.
The latter are generally younger, lower educated, more likely to live in large cities, and have larger households in our sample.
In contacting the trustors, we followed the same procedures as with the trustees. The trustors also received the same instructions describing the experimental games up until the point their role was revealed (see Section 2.2). In what follows we only describe the parts in which the instructions and procedures are role-specific.
On the decision screen participants learned that they were assigned role A (the role of trustor) and were provided with information about the matched player with role B.
The information about the matched trustee was formulated as follows: "The matched participant with role B is called [first name], lives in the Netherlands, is [male/female], and is between 16 and 89 years old. To guarantee anonymity, we cannot give you more details about B's identity." After making a decision for each of the three games, trustors proceeded to another decision screen. They were then reminded of their role and they were informed that B players had already made their decisions in a previous phase of the experiment. For each of the three games, trustors could view the percentage of trustees that reciprocated trust, and were asked to make a decision again for each game.
Trustors were instructed that at the end of the research, five pairs (one A and one B) would be randomly drawn among the 200 pairs that were expected to participate. After trustors made their choices, we elicited their beliefs about the behavior of other trustors. We asked trustors to indicate for each of the three games how many out of 100 trustees they thought would choose IN and how many would choose OUT. A total of 872 participants (97% of the participants whose choice was elicited) answered the whole set of beliefs questions. The elicitation of beliefs was not incentivized. After the elicitation of beliefs, participants were asked to fill in a short comprehension questionnaire.
Notes: The table shows marginal treatment effects from probit regressions with clustering at the individual level (P-values in parentheses). The dependent variable in the regressions is a dummy equal to 1 if the trustee reciprocates. Stars ***, ** and * indicate that the effect is statistically significant at the 1%, 5% or 10% level, respectively.
Notes: The table shows marginal treatment effects from probit regressions with clustering at the individual level (P-values in parentheses). Estimations are based on decisions from trustees who have also answered the questions related to attitudes to immigrants in the political and values survey. The dependent variable in the regressions is a dummy equal to 1 if the trustee reciprocates. Stars ***, ** and * indicate that the effect is statistically significant at the 1%, 5% or 10% level, respectively.
Notes: The table shows marginal effects in probit regressions with clustering at the individual level (P-values in parentheses). The dependent variable is a dummy equal to 1 if the trustee reciprocates. Decile indicates the decile of the distribution of attitudes to immigrants where higher deciles represent a more positive attitude. Stars ***, ** and * indicate that the marginal effect is statistically significant at the 1%, 5% and 10% level, respectively.
Notes: The table shows marginal effects in probit regressions with clustering at the individual level (P-values in parentheses). The dependent variable is a dummy equal to 1 if the trustee is an inconsistent type. Stars ***, ** and * indicate that the marginal effect is statistically significant at the 1%, 5% and 10% level, respectively.
Notes: The table shows linear regression results with clustering at the individual level based on decisions of trustees who have a positive attitude to immigrants (P-values in parentheses). The dependent variable is the expected payoff of the trustor. Stars ***, ** and * indicate that the marginal effect is statistically significant at the 1%, 5% or 10% level, respectively.
Notes: The first two columns show results from OLS regressions with clustering at the individual level (P-values in parentheses). The dependent variable is the trustee's belief about trust rates (between 0 and 1). The set of controls includes the trustee's gender, age, education, income, urbanization of the area of residence and a dummy variable indicating whether the participant reciprocates trust or not. The last two columns show marginal effects in probit regressions with clustering at the individual level (P-values in parentheses). The dependent variable is a dummy equal to 1 if the trustor trusts. Stars ***, ** and * indicate statistical significance at the 1%, 5% and 10% level, respectively.

H Treatment with non-native trustees
We ran an additional treatment with non-native trustees matched to native trustors at the same time that the treatments with the native trustees were run (treatment Extra).
All non-native trustees were recruited from the Immigrant panel and were matched to a native trustor from the LISS panel. The Immigrant panel was a panel separate from the LISS panel, also managed by CentERdata. It was available from October 2010 up until December 2014, and consisted of around 2400 individuals of whom 1700 were of non-Dutch origin. We followed the exact same procedures as in the two main treatments (e.g., participants got the same instructions and were contacted in the same way) and only selected non-Western immigrants. A total of 360 panelists from the Immigrant panel were contacted to participate in the experiment. Out of the contacted panelists, 248 individuals (69%) actually participated and made choices in the three trust games.
Figure H.1 shows the percentage of non-native trustees who reciprocate trust of a native trustor (treatment Extra), and compares it to the behavior of native trustees towards non-native trustors. Comparing the light-green to the pink bars, the figure reveals that non-native trustees are systematically more trustworthy than natives in a relationship with a member of the out-group. The regression results in Table H.1 show that this difference is (marginally) statistically significant in the aggregate. It is not significant in the separate games, though.
[Figure H.1: reciprocation rate (vertical axis) by treatment (Extra, Native, Non-native) in games G40, G60 and G80]
Notes: The light-green bars depict reciprocation rates in Extra, where non-native trustees are matched with native trustors. Blue and pink bars refer to reciprocation rates in Native and Non-native, where native trustees are matched to native and non-native trustors, respectively.