Do Voters Vote in Line with Their Policy Preferences?—the Role of Information

In this article, I investigate how political information affects voting behavior. Specifically, I test (i) if more informed voters are more likely to vote for their closest politicians; and (ii) if this translates into a bias on the aggregate level. To do so, I use a set of Swedish individual survey data on the preferences for local public services of both politicians and voters, which provides an opportunity to investigate how information affects voters' ability to match their preferences with those of their politicians. The results indicate that more informed voters are more likely to vote for politicians with similar preferences for local public services and, on the aggregate level, that the left-wing parties would have received 1–3 percentage points fewer votes if all voters had been equally well-informed.


Introduction
Ever since Downs (1957) put forth the hypothesis that voters are 'rationally ignorant', scholars have discussed what role information plays for the individual vote decision and, ultimately, for the aggregate election result. In this article I argue that even if more political knowledge does not change the voters' preferences, the mere fact that some voters have more information than others may distort the election outcome. Using individual survey data on the preferences for public spending of both politicians and voters in Swedish municipalities, I study the extent to which more informed voters are more likely to vote for the politicians whose preferences are most similar to their own. I then use the result to simulate the electoral consequences of all voters having the same amount of information as the most informed voters. The results suggest that information is important both for the individual vote decision and on the aggregate level. I find that the left-wing bloc would have received 1-3 percentage points fewer votes if all voters had been equally well-informed.
From a theoretical perspective, it is not obvious that a lot of political knowledge is needed for the voters to cast informed votes. Some scholars have argued that uninformed voters can use informational shortcuts to vote as if they are informed (see, for instance, Popkin et al. 1976, Conover and Feldman 1989, Lupia 1994, McDermott 1997). Although voters may not know the candidates', or parties', specific policy positions, they may be able to identify their ideological leanings or party label. They can then use this knowledge to vote in the same manner as if they had complete information. If voters use these shortcuts successfully, their lack of information will have no impact on the election outcome.
Even if voters do make errors when deciding for whom to vote, this may not necessarily introduce an aggregate bias. Following Condorcet's jury theorem, Shapiro and Page (1988), Wittman (1989), and Page and Shapiro (1992) argue that voter errors cancel out when votes are aggregated. The implicit assumption is that the errors are unsystematic. However, Caplan (2007) argues that voters do make systematic mistakes. He claims that voters hold biased beliefs about the economy which lead them to demand undesirable economic policies. This, in turn, also means that the aggregate outcome gets distorted.1 Furthermore, even if the errors are in fact unbiased, the jury theorem may not be valid if certain groups of voters have more information and therefore make fewer mistakes. As Delli Carpini and Keeter (1996) put it: political knowledge is not randomly distributed in the population. The very groups who are disadvantaged economically and socially are also less politically informed and, thus, disadvantaged in the struggle over the political allocation of scarce goods, services, and values (p. 265).
To test whether the election outcome would differ if all voters had the same information, Bartels (1996), Delli Carpini and Keeter (1996), Althaus (1998, 2003), and Gilens (2001) simulate 'fully informed' voter behavior.2 The idea is that voters with similar demographic background characteristics have similar political interests. Therefore, in a given demographic group, differences in voting behavior or preferences between differently informed voters can be interpreted as an information effect. The result is typically that information heterogeneity causes systematic biases in voters' preferences and voting behavior.
This article is closely related to that group of papers with one important difference. Whereas earlier papers generally consider the information effect to occur either explicitly through policy preferences (i.e., information changes the preferences, or beliefs, voters hold) or implicitly (the vote decision is based on preferences that are a function of information), I consider policy preferences as exogenous. Instead, I investigate whether more informed voters are more likely to vote for the politicians whose policy preferences are closest to their own.3 To do so I use individual data on the preferences of both voters and politicians in Swedish municipalities. These data make it possible to directly investigate whether more informed voters vote for politicians who have preferences closer to their own.

1 See also Caplan (2002) and Romer (2003) for a discussion of how voters' beliefs are formed and how they may be biased.
To what extent should we expect voters to care about politicians' preferences? In the Downsian median voter model (Downs 1957), candidates are office-motivated and converge to the median voter's preferred position to win the election. This result holds even if they are policy-motivated; they are still forced to locate at the median.4 If this is the case, politicians' preferences are irrelevant from the voters' perspective. The crucial assumption of this model is that it is possible for candidates to commit to policy before the election. As Alesina (1988) showed, if the candidates [...] This result raises the question of why voters elect politicians who have preferences that differ from their own. In this article, I investigate whether the lack of information held by voters can be one explanation for why preferences differ between voters and politicians.

Theoretical framework
I use a simplified model of partisan politics to structure the empirical analysis.7 There are two parties, R and L, that are assumed to have exogenous preferences, i.e., they can be thought of as citizen candidates who cannot credibly commit to anything other than their own preferred policy.8 They have preferences over the size of government spending with bliss points $g^*_R$ and $g^*_L$ (with $g^*_R < g^*_L$), respectively. Voters derive utility from the winning party's policy, $g_p$. I assume that voters also care about other party characteristics that are unrelated to government spending. These characteristics are captured by the variable $\varepsilon$. Specifically, I assume the following utility function:9 [...]

Due to a lack of information, voters do not directly observe the parties' policy positions; instead, voters observe these positions with some error. The more informed voters are, the smaller this error is. Voter i will vote for party L if [...] where $\varepsilon_i := \varepsilon_{iR} - \varepsilon_{iL}$, whereas $\eta_{ij}$ is the error the voter makes about the policy position of the two parties, where j indicates how much information the voter has. A positive $\eta_{ij}$ means that the voter believes herself to be closer to party R compared with party L than is actually true, whereas a negative $\eta_{ij}$ implies the opposite. I assume that $E(\varepsilon) = 0$ and $E(\eta_j) = 0$ but that the variance of $\eta_j$ decreases with increasing information. Let $F_j$ be the cumulative distribution function of $\varepsilon + \eta_j$. The probability that the voter votes for L is then [...] For any two information levels, $I$ and $I'$, [...] where [...]

These equations indicate that as long as voter i is closer to party R than party L ($x_i < 0$), she is less likely to vote for party L the more informed she is. Conversely, if she is closer to party L than party R ($x_i > 0$), she is more likely to vote for party L if she has more information, which means that the probability that the voter votes for the party closest to her is larger for more informed voters. This is illustrated in Figure 1 where [...]

In this article I test whether more informed voters are more likely to vote in line with their own preferences. The crucial assumption is that information does not have a causal effect on voters' preferences, but only on how well they observe the parties' policy positions. This implies that a sufficient condition for the voters to vote according to their preferences is that $\eta_{ij} = 0$. Because both $\varepsilon$ and $\eta$ are unobserved, the identifying assumption in the empirical part of the article is that $\varepsilon$ is orthogonal to $x_i$ and the voters' information levels.

7 For a fuller model of a similar type see Enelow and Hinich (1981). See also Bartels (1986) for an empirical operationalization and test of that model.

8 In a full citizen-candidate model the decision to become a politician is endogenous. Because I only investigate voting behavior and assume voters vote sincerely, I take the policy positions of the politicians as exogenously given.
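The displayed equations of this section are not legible in the text above. The block below is one reconstruction that is consistent with the surrounding definitions; it is a sketch under stated assumptions, not the article's own notation. Assumed here: the absolute-distance form of the utility function, the voter bliss point $g_i$, the symbols $\varepsilon$ (non-policy party characteristics) and $\eta$ (perception error), and, for the last line, that the distribution of $\varepsilon + \eta_j$ is symmetric around zero and more concentrated at higher information levels.

```latex
\begin{align*}
u_i(g_p) &= -\lvert g_p - g_i \rvert + \varepsilon_{ip},
  \qquad p \in \{R, L\}, \\
x_i &:= \lvert g^*_R - g_i \rvert - \lvert g^*_L - g_i \rvert, \\
\text{vote for } L &\iff x_i - \eta_{ij} > \varepsilon_i
  \iff \varepsilon_i + \eta_{ij} < x_i, \\
\Pr(\text{vote for } L) &= F_j(x_i), \\
F_I(x_i) &< F_{I'}(x_i) \ \text{ if } x_i < 0, \qquad
F_I(x_i) > F_{I'}(x_i) \ \text{ if } x_i > 0, \qquad I > I'.
\end{align*}
```

Under these assumptions a voter with $x_i > 0$ (closer to L) votes for L with probability above one half, and that probability rises with information, which is the pattern the text describes.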

Institutional setting and data
To test the hypothesis presented above, I use survey data on the preferences of both politicians and voters in Swedish municipalities. The Swedish municipalities are well suited to test the hypothesis because of their economic importance. During the period studied in this article, municipal revenues made up around 16-20% of total GDP and the municipalities employed about 18% of the Swedish workforce. Their responsibilities range from providing schools, child care, and social care to housing and infrastructure. Furthermore, the Swedish constitution states that the municipalities are autonomous.
The municipalities collect revenue from three primary sources: income taxation, grants from the central government, and various user fees. Central government grants, which during the studied period accounted for about one-fourth of the revenues, are determined exogenously at the national level and are not likely to be a determinant when voters decide for whom to vote.10 The most important revenue source in the local budget is the proportional income tax, which makes up approximately 40-50% of the revenues (nearly 10% of national GDP) and is determined by the municipal council.11 The rest of the revenues are made up of different user fees (around 15-17%) and other sources.
Because the tax is the most important revenue source for the municipalities, I argue that changing the municipal tax rate is the most straightforward way to change the size of the municipal budget. Thus, given its economic importance, voters are likely to assign substantial weight to the parties' positions on whether to raise municipal spending or lower the municipal tax rate.

The surveys
The surveys I use cover two elections, the 1979 election and the 1991 election.12 The municipalities under study in 1979 were drawn using a stratified sampling technique. The municipalities were divided into twenty-five strata based on the demographic, economic, and political characteristics of the municipalities.13 One municipality was drawn from each stratum; hence, the survey covers twenty-five municipalities. The voter survey was carried out with personal interviews. The vast majority of the interviews were conducted between less than a month before and 2 weeks after the election. A total of 2,100 individuals were selected, and 1,608 of these individuals participated in the interviews. Thus, the response rate was around 77%.14 All politicians elected to the municipal councils in the twenty-five municipalities were selected to answer the survey, which was distributed via mail within 6 months after the election. Of the 1,179 politicians, 827 answered the survey (a response rate of 70%).

10 Dahlberg and Johansson (2002) and Johansson (2003) find evidence suggesting that the central government uses grants strategically, and Jordahl (2002) finds some support for the hypothesis that voters reward the central government with their vote if they get a large amount of grants. However, these votes are only for the central government, which distributes the grants. There is less reason to believe voters hold the local government responsible, because grant allocation is not decided at this level.

11 An exception is the tax freeze enforced by the central government between 1991 and 1993 when the municipalities were not allowed to raise the income tax.

12 The survey data are handled and distributed by the Swedish National Data Service (SND). Neither the SND nor the principal investigators bear responsibility for the analysis presented in this article. The surveys are SND 0100-Local elections 1979, SND 0101-Local politicians 1979-1980, SND 0306-Local citizen survey 1991, and SND 0482-Local politicians 1993.
Twenty of the municipalities included in the 1979 survey were also included in the 1991 survey. Additionally, eight new municipalities were added to the 1991 survey.15 This survey was conducted via mail (sent out the day after the election) instead of through personal interviews, as was the case for the previous voter survey. This is the likely cause of the drop in response rate; 3,187 out of a total of 7,550 individuals (39%) answered the survey. The politicians were surveyed in 1993, 2 years after the election. This poses a problem, which is discussed further below. Of the politicians, 78% (1,011/1,292) replied to this survey.
Although Sweden has a multi-party system, it has traditionally been treated as a two-party system with one right-wing bloc (consisting of the Center Party (c), the Liberals (fp), and the Moderates (m)) and one left-wing bloc (consisting of the Communists (vpk) and the Social Democrats (s)).16 To test the hypothesis presented above, I only study the respondents who reported voting for one of these five parties.17

13 The 1979 survey focused on the recent amalgamation of municipalities in Sweden. Because the three largest municipalities (Stockholm, Göteborg, and Malmö) were hardly affected by this reform, they were excluded from the population.

14 This number represents those respondents who answered the questions on the vote decision, demand for public consumption, and all of the information variables. In 5 of the municipalities, 300 respondents were selected from each municipality. For the other 20 municipalities, 40 respondents were selected from each municipality.

15 One of the additional municipalities was Göteborg, which was excluded from the population in the 1979 survey. 1,000 individuals were selected from Göteborg. For 10 other municipalities, 400 individuals were selected. For the final 17 municipalities, 150 individuals were selected from each municipality.

16 See, for instance, Alesina et al. (1997); Ågren et al. (2007); Pettersson-Lidbom (2008). Nowadays the Environmental Party is generally considered a left-wing party and the Christian Democrats a right-wing party. However, at the time of the surveys, they either did not exist or considered themselves to be neither left-wing nor right-wing. Therefore, they are excluded from the analysis. However, the results do not change if these parties are included. I also exclude the right-wing populist party, New Democracy, that only existed during the 1990s. The Communists have changed their name and are now called the Left Party.

Measuring voter information
The variable of interest is whether the voter is informed about the preferences of the politicians in each bloc. Although I am not able to directly observe this variable, I observe several proxy variables. To get a single measure of how much information each respondent has, I use factor analysis to combine the information proxies into a single information index. The basic idea is that the higher the correlation between a given proxy variable and the other proxy variables, the more likely it is that the variable is also correlated with the information variable of interest; consequently, it is assigned a larger weight in the index.
I use five different variables as proxies for information. These are: (i) if the respondent knows the name of at least one member of the municipal council; (ii) if the respondent reads the part of the newspaper that deals with the local government;18 (iii) how well-informed the respondent considers herself to be about local politics; (iv) if she is interested in politics; and (v) how often the respondent talks about local government issues with people in her surroundings. From the above questions it is clear that the information variable constructed here captures general information and interest in local politics rather than specific knowledge of the politicians' preferences for local public services.
Table 1 shows the result from the factor analysis. A more detailed discussion of the result of the factor analysis together with a description of the actual questions asked and the variable coding can be found in Appendix A. All the proxy variables have positive factor loadings, shown in the first column, which suggests that they, as expected, are all positively associated with information. The second column shows the amount of variation of each proxy that is not explained by the common factor.19 Finally, the third column presents the regression scores, which indicate the weight each proxy (standardized to have mean 0 and standard deviation of 1) is given in the final information index. To simplify interpretation of the information variable, I standardize the index so that the least informed are given a value of 0 and the most informed a value of 1. Figure 2 shows the distribution of the information variable.

Notes to Table 1: The table shows the result from the factor analysis. The first column presents the factor loadings, whereas the second column shows the variation of each proxy that cannot be explained by the common factor (uniqueness). The third column presents the regression scores.

The hypothesis put forth in the theoretical framework is that more informed voters are more likely to vote according to their preferences. If certain groups of voters are more informed, the actual election result may be biased toward these groups because they are better at selecting their most preferred politicians. Thus, it is important to investigate which groups have a high or low level of information. In Appendix A, I show how information varies with a large number of background characteristics. Information increases with education and age, and is higher for respondents who are male or work in the municipal sector. Overall, the results are similar to the findings in previous literature (Delli Carpini and Keeter 1996), which suggests that the information index indeed measures the respondents' levels of political knowledge. Further discussion of these results, together with a description of how they are estimated, can be found in the Appendix.

19 The variation not explained by the common factor, called uniqueness, is simply 1 minus the squared factor loading.

CESifo Economic Studies, 60, 4/2014
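The last two steps of the index construction (weighting standardized proxies by their regression scores, then rescaling to the unit interval) can be sketched as follows. This is a minimal illustration, not the article's code: the data and weights are hypothetical, the factor analysis that would produce the weights is not implemented, and the use of the population standard deviation is an assumption.

```python
from statistics import mean, pstdev


def standardize(column):
    """Rescale one proxy to mean 0 and standard deviation 1."""
    m, s = mean(column), pstdev(column)
    return [(v - m) / s for v in column]


def uniqueness(loading):
    """Share of a proxy's variance not explained by the common factor."""
    return 1.0 - loading ** 2


def information_index(proxies, score_weights):
    """Combine proxies into one index rescaled to the interval [0, 1].

    proxies: list of columns, one list of responses per proxy variable.
    score_weights: one regression-score weight per proxy (hypothetical).
    """
    z_columns = [standardize(col) for col in proxies]
    n_respondents = len(proxies[0])
    raw = [sum(w * z[i] for w, z in zip(score_weights, z_columns))
           for i in range(n_respondents)]
    lo, hi = min(raw), max(raw)
    # Least informed respondent gets 0, most informed gets 1.
    return [(r - lo) / (hi - lo) for r in raw]


# Hypothetical data: three proxies observed for four respondents.
proxies = [[0, 1, 1, 1], [1, 2, 3, 4], [1, 1, 2, 3]]
weights = [0.2, 0.4, 0.3]  # hypothetical regression scores
index = information_index(proxies, weights)
```

Because the rescaling is linear, it changes only the units of the index, not the ordering of respondents.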

Demand for local public services
I want to investigate whether voters vote in line with their own preferences for local public services. To identify these preferences, I use the responses to the following statement, which was provided to both politicians and voters: 'It is more urgent to lower municipal taxes than to increase municipal services.'
The respondents could choose from four different responses. They could 'agree' (coded as 1), 'mostly agree' (2), 'mostly disagree' (3), or 'disagree' (4). From a theoretical standpoint, we want a single measure of the politicians' preferences within each bloc. Although each politician in the winning bloc may have a different demand for public consumption, there can ultimately be only one policy outcome. Theoretically, it seems reasonable that the median politician within each bloc will decide policy. Therefore, one way to estimate the bloc's policy position would be to use the observed median position. The problem is that there is no reason to believe that demand for public consumption is naturally divided into four categories. An alternative approach would be to say that individuals have a latent demand on a continuous interval. If this is the case, there is a problem with non-classical measurement error. However, there exists some additional information that can be used to reduce this problem, which is illustrated in the following example. Consider the case of two different municipal councils, both of which have eleven right-wing politicians. In the first, there are five politicians who 'agree' and six who 'mostly agree'. In the second, there are also six who 'mostly agree' but instead five who 'mostly disagree'. If we were to take the observed median in each of these municipalities, it would be 'mostly agree' (coded as 2) in both. However, in the first municipality, the politician with the lowest latent preference (in category 2) would be the median, whereas in the second municipality, the politician with the highest latent preference would be the median. It therefore seems likely that the demand for local public services of the right-wing bloc is higher in the second municipality than in the first.
If we are willing to make an assumption regarding how politicians' preferences for local public services are distributed within each response category, it is possible to obtain an estimate of the bloc's position that will lead to a smaller problem with measurement error in the estimations. Specifically, I assume that preferences are uniformly distributed within each response category and that the distance between each response category is equal, that is, the data are cardinal. Given these assumptions, I can obtain a better estimate of the median preference in the following manner. Suppose the observed median is in the category coded as $a$,20 and there are $N$ individuals with spending preferences in that category. Suppose also that the median has the $\tilde{N}$-th lowest spending preference in that category. The estimate of the latent median position for bloc p in municipality j at time t, $g^*_{pjt}$, is

$$\hat{g}_{pjt} = a - \frac{1}{2} + \frac{\tilde{N}}{N + 1},$$

which follows from the uniform distribution assumption.21
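The estimator described above can be sketched in a few lines. The function name and the dictionary input are illustrative choices, and taking the lower median when a bloc has an even number of politicians is an assumption the text does not address. The two calls reproduce the two-council example with eleven right-wing politicians.

```python
def latent_median(counts):
    """Estimate a bloc's latent median from response counts per category.

    counts: dict mapping category code (1-4) to number of politicians.
    Assumes preferences are uniform within a category of width 1 centred
    on its code; the lower median is used for even-sized blocs (assumption).
    """
    total = sum(counts.values())
    if total == 0:
        raise ValueError("counts must contain at least one respondent")
    median_rank = (total + 1) // 2       # rank of the median politician
    below = 0                            # politicians in lower categories
    for a in sorted(counts):
        n = counts[a]
        if below + n >= median_rank:
            n_tilde = median_rank - below    # rank within category a
            return a - 0.5 + n_tilde / (n + 1)
        below += n


# Five 'agree' (1) and six 'mostly agree' (2): median is lowest in category 2.
first = latent_median({1: 5, 2: 6})     # 1.5 + 1/7, about 1.64
# Six 'mostly agree' (2) and five 'mostly disagree' (3): highest in category 2.
second = latent_median({2: 6, 3: 5})    # 1.5 + 6/7, about 2.36
```

Both councils share the observed median category 2, yet the estimates differ by roughly 0.7 category widths, which is the information the observed median discards.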

Empirical strategy and results
The prediction from equation (3) in the theoretical framework is that information should increase the probability that a voter votes for the bloc closest in preference for local public services. Using the preference measure for each bloc developed in the previous section, the estimate of the distance to the right-wing bloc minus the distance to the left-wing bloc is $\hat{x}_{ijt} = |\hat{g}_{Rjt} - g^{obs}_{ijt}| - |\hat{g}_{Ljt} - g^{obs}_{ijt}|$, where $g^{obs}_{ijt}$ is the observed demand for local public services of individual i.22 I can then define a variable for which bloc is closest in preference for local public services to each individual, $A_{ijt}$, as

$$A_{ijt} = \begin{cases} R & \text{if } \hat{x}_{ijt} < 0, \\ L & \text{if } \hat{x}_{ijt} > 0, \end{cases}$$

where R stands for the right-wing bloc and L stands for the left-wing bloc.
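The relative distance and the closest-bloc indicator can be illustrated with a short sketch. The bloc positions and voter responses below are hypothetical, and returning `None` for an exact tie is an assumption, since the text does not say how ties are treated.

```python
def relative_distance(g_right, g_left, g_voter):
    """Distance to the right-wing bloc minus distance to the left-wing bloc."""
    return abs(g_right - g_voter) - abs(g_left - g_voter)


def closest_bloc(x_hat):
    """Return 'R' or 'L' for the closer bloc; None on a tie (assumption)."""
    if x_hat < 0:
        return "R"
    if x_hat > 0:
        return "L"
    return None


# Hypothetical bloc positions on the 1-4 response scale.
G_RIGHT, G_LEFT = 1.64, 3.2
x_low_demand = relative_distance(G_RIGHT, G_LEFT, 2)    # voter answers 2
x_high_demand = relative_distance(G_RIGHT, G_LEFT, 3)   # voter answers 3
blocs = (closest_bloc(x_low_demand), closest_bloc(x_high_demand))
```

A negative value means the voter is closer to the right-wing bloc, a positive value that she is closer to the left-wing bloc, and the absolute value measures how clear-cut the choice is.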
An additional prediction from the theoretical framework is that the probability of voting for the closest bloc is increasing in the relative distance to the two blocs, $|\hat{x}_{ijt}|$. That is, as the distance to the bloc farthest away in preference for local public services increases, relative to the closest bloc, the probability of voting for the closest bloc is expected to increase. These two hypotheses can be tested by estimating the following equation: [...] where $I_{ijt}$ is the information variable and $\Phi(\cdot)$ is the standard normal distribution function, which makes it possible to estimate this equation using probit regression.23 The theoretical prediction is that $\beta_1 > 0$ and $\beta_2 > 0$.

20 That is, a can take on any integer from 1 to 4.

21 In the example above, this means that the estimate of the latent median preference would be 1.5 + 1/(6 + 1) ≈ 1.64 in the first municipality and 1.5 + 6/(6 + 1) ≈ 2.36 in the second municipality. It should be noted that using the observed median instead of the approach outlined in this section does not alter the results in the article (results available upon request). The reason is that the estimate of the latent median is highly correlated with the observed median.
Importantly, the theoretical framework suggests that the effect of information will depend on the distance to the two blocs. That is, I expect an interaction effect between $I_{ijt}$ and $|\hat{x}_{ijt}|$. Such a model is estimated in Section 4.1. Here I begin by estimating the simpler model without such an interaction. The estimated information effect should therefore be interpreted as how much information increases the probability of voting for the closest bloc on average. The term $\mu_{jt}$ allows for the effect to vary for each municipality and time period.
To control for potential omitted variable bias, I also include demographic control variables for age, age squared, gender, and marital status in the vector $C_{ijt}$. In principle, there is a large set of variables that could be used as controls. However, there are two reasons why I do not do so in my baseline specification. First, due to missing values, the sample size shrinks somewhat when additional controls are added. Second, and more importantly, the information variable is only a proxy. This means that a control variable correlated with 'true' information, such as education, could potentially not only remove spurious correlation but also remove some of the information effect (education could even potentially have been used as a proxy for information in the factor analysis). Nevertheless, I have also run regressions controlling for both education level and different work-related controls. The results are discussed in Appendix C, where it can be seen that the results do not change substantially with a larger set of controls. Descriptive statistics of the variables used in the baseline regressions, for both voters and politicians, are found in Tables A2 and A3 in Appendix B.
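To make the probit specification concrete, the sketch below evaluates a probit probability of voting for the closest bloc and the marginal effect of information at a chosen point. It is an illustration only: the index contains just a constant, information, and the absolute relative distance (the controls and the municipality-by-time terms are left out), and the coefficients `B0`, `B1`, `B2` are hypothetical values, not estimates from the article.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()  # standard normal: mean 0, standard deviation 1


def prob_vote_closest(info, abs_distance, b0, b1, b2):
    """Probit probability of voting for the closest bloc."""
    return STD_NORMAL.cdf(b0 + b1 * info + b2 * abs_distance)


def marginal_effect_of_info(info, abs_distance, b0, b1, b2):
    """Derivative of the probit probability with respect to information."""
    return STD_NORMAL.pdf(b0 + b1 * info + b2 * abs_distance) * b1


# Hypothetical coefficients, chosen only for illustration.
B0, B1, B2 = 0.0, 0.6, 0.3
low = prob_vote_closest(0.0, 0.5, B0, B1, B2)    # least informed voter
high = prob_vote_closest(1.0, 0.5, B0, B1, B2)   # most informed voter
effect_at_mean = marginal_effect_of_info(0.5, 0.5, B0, B1, B2)
```

Because the marginal effect depends on where the normal density is evaluated, a probit coefficient has to be reported at a specific point, such as the mean of the regressors.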
Voters and politicians are not observed at the same time for the 1991 election, which poses a problem if something happened between the time the voters' preferences are observed (in 1991) and the time the politicians' preferences are observed (in 1993). Specifically, one concern is the possibility that the demand for local public services is a function of economic conditions in the municipalities. The time between 1991 and 1993 was one of the most economically turbulent periods in Sweden in the last century, which means it is important to account for the economic situation. However, I cannot simply add economic indicators to the right-hand side of equation (6) because the preferences of both the median politician and the voter are included in the dependent variable.

23 Throughout the article I will use probit instead of the simpler alternative of estimating a linear probability model (LPM). Although the results are similar when using LPM, the functional form becomes important in Section 4.2 when I use predicted values from the estimations. The sensitivity of the results to the functional form assumption is tested in Section 4.1.1 when a non-parametric model is estimated.
Instead, I can estimate the equation in two steps. Let $Z_{jt}$ be a vector of economic and demographic variables that affect preferences for spending in the municipality. First, I estimate the relationship between observed demand for local public services, $g^{obs}_{ijt}$ (on voters and politicians simultaneously), and the vector $Z_{jt}$ using OLS: [...] This equation shows how the economic and demographic situation in the municipality affects demand for local public services. $v_{ijt}$ is the part of the demand that cannot be explained by aggregate economic conditions, meaning that the residuals, $\hat{v}_{ijt}$, from the estimation of equation (7) represent the idiosyncratic demand net of the economic condition in the municipality. I call this variable estimated spending preference. I use variables that are likely to affect the need and demand for local public services. These include unemployment rate, tax base, net migration rate, (log of) population size, the proportion of foreign citizens and the proportion of young and elderly, as well as municipal taxes, debts, and expenditures. For instance, the proportion of young is likely to affect the need for child care, whereas unemployment may affect the need for social care.24 Throughout the article I will perform the estimations using both the observed (or stated) preferences as well as these estimated preferences.
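The first step, purging stated preferences of municipal conditions, amounts to keeping OLS residuals. The sketch below shows this with a single covariate for brevity, whereas the article uses a vector of municipal variables; the data are hypothetical and the function name is illustrative.

```python
from statistics import mean


def ols_residuals(y, z):
    """Residuals from a simple OLS regression of y on a constant and z."""
    y_bar, z_bar = mean(y), mean(z)
    slope = (sum((zi - z_bar) * (yi - y_bar) for zi, yi in zip(z, y))
             / sum((zi - z_bar) ** 2 for zi in z))
    intercept = y_bar - slope * z_bar
    return [yi - (intercept + slope * zi) for yi, zi in zip(y, z)]


# Hypothetical data: stated preference (1-4) for six respondents and the
# unemployment rate (%) of the municipality each respondent belongs to.
stated = [1, 2, 2, 3, 4, 3]
unemployment = [2.0, 2.0, 5.0, 5.0, 8.0, 8.0]
estimated_preference = ols_residuals(stated, unemployment)
```

By construction the residuals average zero and are uncorrelated with the covariate, so differences in them reflect individual rather than municipal variation in demand.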
Table 2 shows the result where equation (6) has been estimated using probit regression. The top panel shows the result from using stated preferences. As can be seen in column (1), information is strongly positively correlated with voting according to preference for local public services and significant at any conventional significance level. The coefficient shows the marginal effect evaluated at the mean of the information variable. Figure 3 shows the estimated effect over the entire distribution as well as the actual data aggregated for each value of the information variable.25 As the figure shows, the estimated probability of voting according to preference is just over 50% for the least informed voters. That is, the least informed voters are equally likely to vote for the left- or right-wing bloc regardless of their preferences for local public services. For the most informed voters, the probability of voting according to preference is over 70%. A one-standard-deviation increase in information increases the probability of voting in line with preferences for local public services by approximately 4 percentage points.

Notes to Table 2: The coefficients represent the marginal effects evaluated at the mean of all independent variables on the probability of voting for the bloc closest in preference for local public services (with the exception of the 'Left preference' variable where the coefficient is the effect of a discrete change from having right-wing to having left-wing preference). Standard errors, adjusted to allow for cluster effects on the municipal × time level, are shown in parentheses. *, **, and *** denote significance at the 10%, 5%, and 1% level, respectively.

24 A description of these variables is found in Table A4 in Appendix B.

25 The size of the markers depends on the number of respondents of a given information level.
The second column of Table 2 shows the estimate of the relative distance variable. As expected, it enters strongly positive. The greater the distance is to the farthest bloc relative to the closest bloc, the greater the probability of voting for the latter is.26 A one-standard-deviation increase in the distance variable increases the probability of voting for the closest bloc by around 9 percentage points. The point estimates remain similar when both the information and the relative distance variables are included together. Furthermore, neither municipal × time effects nor the demographic control variables seem to affect the result in any significant way. In column (6) a variable that indicates whether the voter shares her spending preference with the left-wing parties is added (i.e., a variable taking the value 0 if $\hat{x}_i < 0$ and 1 if $\hat{x}_i > 0$). If the coefficient on such a variable is positive, it would suggest that the voters, given their relative distance from the two blocs, are biased toward the left-wing parties. However, the coefficient is small and insignificant, which suggests that voters are not biased toward either of the blocs.
The middle panel shows the result from the model run using estimated preferences to account for the difference in timing between the voters' interviews (in 1991) and the politicians' (in 1993). The results are virtually unchanged. Information is still significant in all specifications with similar point estimates as before. However, this two-step procedure of estimating preferences may not completely solve the timing problem. During the years 1991-1993, the central government imposed a tax freeze which forbade the municipalities to raise taxes. This could potentially affect the respondents' demand for local public services. Also, the response rate of the 1991 survey was much lower than that of the survey conducted in 1979, which makes it less representative of the electorate. Therefore, the bottom panel shows the estimated information effect using only the 1979 survey. The results do not change much when applying this restriction.
Thus far, I have assumed a two-party system. However, because Sweden has a multi-party system, this assumption could be an oversimplification. To test for this possibility, I let the outcome variable be a dummy variable for whether the respondent reported voting for the party (instead of bloc) closest in spending preference. The results are shown in Table 3. As can be seen, the information variable is strongly positively related to voting for the party closest in preference for local public services. Because I now compare the vote decision between five parties instead of two, I can no longer include the relative distance variable. Instead, I control for party biases by including a dummy variable for each party that indicates whether that party is the closest in preference for local public services. Adding these controls does not affect the result in any significant way. Furthermore, neither using estimated preferences nor restricting the estimation to the 1979 election makes any difference for the estimates. The fact that the results when using all five parties are similar to the results when grouping the left-wing and right-wing parties together suggests that the results are not sensitive to the two-party simplification. 27 Because the bloc division allows for the possibility of exploring how the relative distance measure interacts with the information variable (see next section), I will continue the analysis dividing the parties into a left-wing and right-wing bloc.

Allowing for an interaction effect
The theoretical framework discussed in Section 2 suggests that the information effect depends on the relative distance to the two blocs, as shown in Figure 1. When the voter is at a nearly equal distance from the two blocs, she is almost indifferent between the two spending policies. As the relative distance between the two blocs increases, preferences for local public services become relatively more important, and the information effect increases. When the relative distance is large, the information effect could potentially decrease because even less informed voters may be aware of the blocs' positions when the difference between the blocs is large. To test whether there is such an interaction effect, I estimate a probit model in which the information variable is interacted with the relative distance measure (equation (8)). Table 4 shows the results from the probit estimations of equation (8). The top panel presents the results when the stated preferences are used. Because information is scaled between 0 and 1, the main effect of the distance variable measures the effect the relative distance has on the least informed voters. That variable is positive but insignificant, suggesting that the relative distance from the two blocs does not matter much for the least informed. The main effect of information measures the information effect when the relative distance is 0, that is, when the voter is equally distant from the two blocs. As expected, this variable is not statistically significant. The estimate of the information variable interacted with the relative distance measure is positive and significant at the 5% or 10% level depending on the specification. However, as noted by Ai and Norton (2003), because the probit model is a nonlinear model, this is not the interaction effect of interest. The full interaction effect is the cross-partial derivative of the predicted probability with respect to information and the relative distance; its first part, $\beta_3 \Phi'(\cdot)$, is the marginal effect reported in Table 4 (evaluated at the mean of all independent variables). Importantly, unlike the main effects, the sign of the interaction effect can differ depending on where in the distribution it is evaluated. Figure 4 illustrates the interaction effect corresponding to the estimates in column (1) in the top panel of Table 4. The predicted values are indicated on the x-axis and the interaction effect is shown on the y-axis. The lower and upper bounds of the 95% confidence interval are also shown. 28 The figure shows that the interaction effect is positive and statistically significant when the predicted probability is close to 0.5. As the predicted probability increases, the interaction effect decreases and becomes significantly negative when the predicted probability is close to 0.9.

27
This is consistent with the finding in Folke (2012). He finds that two of the smaller single-issue parties in Sweden, the Environmental Party and New Democracy, have had a causal effect on environmental and immigration policies but not an effect on overall taxation at the municipal level. One interpretation is that while small parties can have a direct effect on specific policy issues, for overall spending policies the two-party simplification is a good approximation.
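The full interaction effect from Ai and Norton (2003) discussed in this section can be written out for a probit with a single interaction term. This is a sketch under assumed notation: the symbols $I$ (information), $d$ (relative distance), $\beta_1$, $\beta_2$ and the index $u$ are labels chosen here for illustration, only the term $\beta_3 \Phi'(\cdot)$ is taken from the text, and all other controls are suppressed.

```latex
\begin{align*}
\Pr(y = 1) &= \Phi(u), \qquad u = \beta_1 I + \beta_2 d + \beta_3 (I \times d) + \dots \\
\frac{\partial^2 \Phi(u)}{\partial I \, \partial d}
  &= \beta_3 \, \Phi'(u) + (\beta_1 + \beta_3 d)(\beta_2 + \beta_3 I) \, \Phi''(u)
\end{align*}
```

For the standard normal, $\Phi''(u) = -u\,\phi(u)$, so the second term vanishes when the predicted probability is 0.5 ($u = 0$) and turns negative for $u > 0$ whenever both bracketed terms are positive. Under these assumptions, that is consistent with an interaction effect that is positive near 0.5 and can become negative at high predicted probabilities.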

Figure 4 Estimated interaction effect, stated preferences (x-axis: predicted probability; y-axis: interaction effect).

28
They are calculated using the Delta method; see Ai and Norton (2003) and Norton et al. (2004).

To further illustrate the interaction effect, Figure 5 plots the result from the same regression as above, with the predicted probability on the y-axis. The relative distance is on the x-axis, and the effect is shown for three different values of information: the least informed, the most informed, and the median informed. When the relative distance is close to 0, i.e., the voter is at almost equal distance from the two blocs, the predicted probability is close to 0.5, as expected. Importantly, this is almost the same for the most and least informed voters. The intuition is that when the distance to the two blocs is around the same, information does not have an effect on the probability of voting for the closer of the two because the utility difference between them for the voters, with respect to preferences for local public services, is close to 0. 29 Importantly, as shown in Figure 4, although the absolute difference in predicted probability between differently informed voters is small when the relative distance is close to 0 (and the predicted probability is close to 0.5), the marginal effect of information is large at this point. On the other hand, when the relative distance is large, additional information matters less for informed voters than for less informed voters, which explains the negative interaction effect when the predicted probability is high. The intuition is that extra information plays very little role for well-informed voters when the relative distance is large because they are already voting for the closest bloc. Figure 5 shows that when the relative distance is at its maximum, the probability that the most informed voters vote for the closest bloc is 89%, whereas the least informed voters have a predicted probability of 58% at this point. Although the absolute difference between the most and least informed voters is the greatest at this point, the marginal effect of information is smaller for the former group than for the latter.
Going back to Table 4, the second and third columns in the top panel show the results when municipality and time dummies are added as well as demographic background characteristics. The results are virtually unchanged when these controls are added. The middle panel shows the result when using the estimated preferences. The main effects of both information and the relative distance variables are statistically insignificant and fairly close to 0. The estimate of the interacted variable is now larger and statistically significant at the 1% level. The full interaction effect is shown in Figure 6. The interaction effect is larger than when using stated preferences and statistically significant up until the predicted probability of voting for the closest bloc is around 0.7. The effect also decreases faster and becomes negative and significant when the predicted probability is close to 0.9. This is also shown in Figure 7, where the difference between the most informed and the least informed is close to 0 when the relative distance is small. As the relative distance increases, the predicted probability increases much faster for more informed voters. The difference between the most informed and the least informed when the relative distance is at its maximum is almost 40 percentage points. The slope for the least informed voters is almost 0, meaning that those voters barely react to the relative distance at all.

CESifo Economic Studies, 60, 4/2014
Finally, the bottom panel of Table 4 shows the result when only the 1979 sample is used. The results are very similar to those obtained with the estimated preferences, but the precision is lower because of the smaller sample size.

Non-parametric estimation
The probit model relies on restrictive assumptions regarding the underlying data generating process. In this section I relax these assumptions and, as a sensitivity test, estimate a fully non-parametric model where the only assumption placed on the functional form is that the data generating process is smooth, i.e., no discontinuous jumps are allowed. Specifically, I use local linear regression. The drawback of this approach is that I cannot use many covariates as controls because they will cause the estimator to converge slowly (curse of dimensionality). The model I estimate is simply the probability that voter i votes for the closest bloc as a smooth function of information, the relative distance, and $D_{jt}$, where $D_{jt}$ is a categorical variable indicating municipality and year. Using a smooth kernel function (Aitchison and Aitken 1976), I allow the municipality-time effect to vary with the other covariates.
Figure 8 shows the predicted values from the local linear estimation. 30 The predicted values differ for each municipality and time period, so to present the effect in a single graph I have taken the average predicted value weighted by the number of respondents in each municipality. The bandwidth was selected using the Akaike information criterion cross-validation method; see Hurvich et al. (1998) and Li and Racine (2004) for details. Using the least-squares cross-validation technique to select the bandwidth instead does not change the estimates in any significant way.
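The idea behind local linear regression can be sketched in a few lines. This is a deliberately simplified illustration, not the estimator used in the article: it assumes a single continuous regressor, a Gaussian kernel, and a fixed bandwidth `h` supplied by the user, whereas the article uses several covariates, a categorical kernel for the municipality-year variable, and a data-driven bandwidth. The function name and the simulated data are hypothetical.

```python
import numpy as np

def local_linear(x, y, x0, h):
    """Local linear estimate of E[y | x = x0] using a Gaussian kernel with bandwidth h."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Kernel weights: observations close to x0 count more.
    w = np.exp(-0.5 * ((x - x0) / h) ** 2)
    # Weighted least squares of y on a constant and (x - x0);
    # the fitted intercept is the estimate at x0.
    X = np.column_stack([np.ones_like(x), x - x0])
    XtW = X.T * w
    beta = np.linalg.solve(XtW @ X, XtW @ y)
    return float(beta[0])

# Illustrative use on simulated binary outcomes (hypothetical data).
rng = np.random.default_rng(0)
distance = rng.uniform(0.0, 2.0, size=500)
votes_closest = (rng.uniform(size=500) < 0.5 + 0.15 * distance).astype(float)
grid = np.linspace(0.2, 1.8, 5)
fitted = [local_linear(distance, votes_closest, g, h=0.3) for g in grid]
```

Because the fit at each point is a weighted linear regression, the estimator reproduces an exactly linear relationship without error, which gives a simple sanity check.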

The difference from the probit model is that the information effect increases quickly until the relative distance is moderate (around 1). After that, the information effect increases much more slowly. Figure 9 shows the result when preferences are estimated with the two-stage procedure outlined above. As can be seen, the result is similar.

Aggregate effects
Even though information may have a large effect on the individual's vote decision, it is not clear that it matters on the aggregate level. If information is randomly distributed among voters, it will not matter for the electoral outcome that more informed voters are better at voting according to their spending preferences. If informed and uninformed voters have similar preferences, Condorcet's jury theorem implies that errors will cancel out and that the 'correct' decision will always be made when the number of voters is large. Page and Shapiro (1992) and Wittman (1989) argue along this line. However, if informed and uninformed voters have very different preferences, the composition of politicians will be biased toward the informed voters because they are more likely to vote according to these preferences. In this case, more informed voters also want higher local public spending. The correlation coefficient is 0.1229 and significant at any conventional significance level. 31

To test whether there exists an aggregate information effect, I use the previous estimates to simulate what the election result would have been had all voters had the maximum level of information in the sample. This calculation is done by aggregating the predicted probability for each individual of voting for the left-wing parties given (i) their actual level of information; and (ii) the counterfactual that they would have had the highest information score. 32 The difference between these two measures is the aggregate information effect.
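The aggregation described above can be sketched as follows. This is a minimal illustration under stated assumptions: the probit index with an intercept, two main effects and one interaction, the coefficient values, and all function names are hypothetical, and other controls are omitted. The conversion from "votes for the closest bloc" to "votes for the left-wing bloc" follows the rule given in footnote 32.

```python
import numpy as np
from math import erf, sqrt

def norm_cdf(z):
    """Standard normal CDF, applied elementwise."""
    z = np.asarray(z, dtype=float)
    return 0.5 * (1.0 + np.vectorize(erf)(z / sqrt(2.0)))

def p_closest(info, dist, b0, b_info, b_dist, b_inter):
    """Probit probability of voting for the closest bloc (hypothetical coefficients)."""
    return norm_cdf(b0 + b_info * info + b_dist * dist + b_inter * info * dist)

def left_share(info, dist, prefers_left, coefs):
    """Average predicted probability of a left-wing vote."""
    p = p_closest(info, dist, *coefs)
    # Footnote 32: p for voters closest to the left bloc, 1 - p for the others.
    return float(np.mean(np.where(prefers_left, p, 1.0 - p)))

def aggregate_information_effect(info, dist, prefers_left, coefs):
    """Return (actual share, counterfactual share, difference)."""
    info = np.asarray(info, dtype=float)
    dist = np.asarray(dist, dtype=float)
    prefers_left = np.asarray(prefers_left, dtype=bool)
    actual = left_share(info, dist, prefers_left, coefs)
    # Counterfactual: every voter gets the highest information score in the sample.
    info_max = np.full_like(info, info.max())
    counterfactual = left_share(info_max, dist, prefers_left, coefs)
    return actual, counterfactual, counterfactual - actual
```

The sign of the difference depends on how information is distributed across left-leaning and right-leaning voters, which is the point made in the text: if information were unrelated to preferences, the two shares would be expected to be close.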
The results are shown in Table 5. The first three columns correspond to the regressions in Table 4. The first column of the top panel shows the result from the estimation of equation (8) without any control variables. The first cell shows that the predicted left-wing vote share, using stated preferences and given the actual (observed) information level, is 48.0%, which is close to the actual value in the sample of 48.5%. The predicted left-wing vote share when information is set at its maximum value for all respondents is 46.1%. This means that the aggregate effect is that the left-wing vote share would decrease by 1.9 percentage points if all voters had the information level of the most informed respondents. This effect is significant at the 1% level. 33 The results are similar when control variables are added (columns (2) and (3)). The fourth column shows the result from the local linear regression. The aggregate impact using that model is slightly larger; the estimates suggest that the left-wing vote share would decrease by around 2.2 percentage points if all voters had the information level of the most informed respondents. 34 The second panel shows the result when using estimated preferences. The impact is smaller: the aggregate information effect is around 0.9 percentage points and significant only at the 10% or 5% level depending on the specification. The information effect is, again, slightly larger for the non-parametric specification.
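The significance levels reported for the aggregate effect rest on bootstrapped standard errors that allow for clustering at the municipality-time level. A generic cluster bootstrap can be sketched as below. This is an illustration only: the function name and arguments are hypothetical, and `statistic` stands in for the full procedure of re-estimating the regression on the resampled data and recomputing the difference in predicted vote shares.

```python
import numpy as np

def cluster_bootstrap_se(statistic, data, clusters, n_rep=1000, seed=0):
    """Bootstrap standard error of statistic(data), resampling whole clusters."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data)
    clusters = np.asarray(clusters)
    ids = np.unique(clusters)
    # Row indices belonging to each cluster, in the order of ids.
    rows_by_cluster = [np.flatnonzero(clusters == c) for c in ids]
    draws = np.empty(n_rep)
    for r in range(n_rep):
        # Draw as many clusters as there are in the sample, with replacement.
        picked = rng.integers(0, len(ids), size=len(ids))
        rows = np.concatenate([rows_by_cluster[k] for k in picked])
        draws[r] = statistic(data[rows])
    return float(draws.std(ddof=1))
```

Resampling entire clusters rather than individual respondents keeps the within-cluster dependence intact in every replication, which is the reason for clustering in the first place.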
As previously mentioned, the 1979 voter survey had a response rate of 77%, whereas the 1991 voter survey only had a response rate of 39%. Although these rates may not be problematic for the individual-level estimations, the difference in response rates becomes a much bigger problem in the aggregate, where we need to assume that the non-respondents are identical to the respondents. Specifically, it is possible that less informed voters are less likely to respond, which could lead to an underestimation of the aggregate information effect. The observed information level is indeed lower in the 1979 survey (mean of 0.516) than in the 1991 survey (mean of 0.557). Although this may be caused by a general increase in knowledge between the two surveys, it may also be caused by selection in the 1991 survey.

31
The t-value is 7.41. The correlation coefficient when using estimated preferences is 0.1211 with a t-value of 7.24.

32
The predicted probability of voting for the left-wing bloc is simply the predicted probability of voting according to spending preference when the voter has a left-wing spending preference and 1 minus the predicted probability of voting according to spending preference when the voter has a right-wing spending preference.

33
Standard errors are bootstrapped because the predicted probability for each respondent is dependent on the regression estimates.
The bottom panel of Table 5 shows the results when only the 1979 survey is used. 35 The aggregate information effect is indeed larger; the simulated left-wing vote share would decrease by between 2.6 and 3.0 percentage points if all of the voters had the same level of information as the most informed voters.
The results so far suggest that the vote result would differ by a couple of percentage points with more information. The question is whether this is an important effect. If elections are generally lopsided, a couple of percentage points in either direction make little difference. However, if elections tend to be close, the estimated effect could be important. In nine out of the fifty-three elections (twenty-five in 1979 and twenty-eight in 1991) studied in this article, of the five parties, the two left-wing parties received between 50% and 52% of the votes, which suggests that the information effect is indeed important. 36

The size of the aggregate information effect is comparable to the finding by Bartels (1996). He finds that the aggregate election result from the six presidential elections in the USA from 1972 to 1992 would have differed by, on average, 3 percentage points if all voters had been 'fully informed'. It is important to note that he does not model the mechanisms through which information changes voting behavior. That is, he allows information to affect not only the ability of the voters to vote for their most preferred candidate, but also to change the voters' preferences and beliefs about policy. In this article I take preferences as given and investigate the effect of information under the assumption that voters' preferences would be the same if they were more (or less) informed. It is therefore, perhaps, surprising that the aggregate information effect is as large as it is given the more restrictive model.
On the other hand, if the positive correlation between information and preferences for local public services is interpreted as causal, it is possible that the aggregate effect of an equally well-informed electorate would go in the opposite direction. Voters with preferences for a relatively low level of local public services (i.e., voters with right-wing preferences) are also less likely to be well-informed. If more information increases the probability that they vote in line with these preferences (the mechanism studied in this article), then they are more likely to vote for the right-wing parties. However, if information causes them to be more positive toward local public consumption, then they might be more likely to vote for the left-wing parties.

Aggregate effect of restricted turnout
Given that voters with little information have trouble voting according to their preferences for local public services, we may ask whether they would be better off not voting at all. Feddersen and Pesendorfer (1996) argue along this line. Given the previous estimates, I can test whether the election outcome would be less biased if the least informed voters abstained from voting by restricting the aggregation to voters with information above a given threshold.
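The restricted-turnout calculation amounts to recomputing the predicted left-wing vote share on successively smaller subsets of voters. A minimal sketch, assuming each voter's predicted probability of a left-wing vote is already available; the function name is hypothetical, and treating the threshold as inclusive (so that a threshold of 0 keeps everyone when information is scaled between 0 and 1) is an assumption made here:

```python
import numpy as np

def left_share_by_threshold(p_left, info, thresholds):
    """Predicted left-wing vote share among voters with information >= each threshold."""
    p_left = np.asarray(p_left, dtype=float)
    info = np.asarray(info, dtype=float)
    shares = []
    for t in thresholds:
        keep = info >= t  # voters who still turn out at this threshold
        # An empty group has no defined vote share.
        shares.append(float(p_left[keep].mean()) if keep.any() else float("nan"))
    return shares
```

Plotting the returned shares against the thresholds gives a curve of the kind described for Figure 10, with the first point equal to the full-sample prediction.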
Figure 10 shows the result for the estimates from column (1) in the top panel of Table 4. The y-axis shows the predicted left-wing vote share, and the x-axis presents the information threshold. The first point corresponds to the predicted left-wing vote share for all voters (0.480, as shown in Table 5), and moving farther right on the x-axis shows the predicted aggregate left-wing vote share for voters with information above a given level. The horizontal line indicates the predicted 'all informed' left-wing

Summary
In this article, I study the extent to which more informed voters are more likely to vote in line with their preferences for government spending than less informed voters. To do so, I use a data set that includes the observed preferences for local government services of both politicians and voters, which makes it possible to test whether voters with a high degree of information vote for politicians with preferences closer to their own than voters with a low degree of information do.
The results suggest that the most informed voters are around 20 percentage points more likely to vote for the bloc closest in spending preference compared to the least informed voters. A standard deviation increase in information increases this probability by around 4 percentage points. I also show that the effect depends on the relative distance in spending preference between the two blocs; when a voter is at almost equal distance from the two blocs, the effect of information is negligible. As the relative distance increases, the effect also increases.
This raises the question of whether information heterogeneity in the electorate has an aggregate effect on the election outcomes. Information can affect the aggregate result either by changing voters' preferences or, given voters' preferences, by increasing the probability that voters vote according to these preferences. Previous studies have primarily focused either on the first mechanism (for instance, Althaus 1998) or on the joint outcome (Bartels 1996) of these two effects. In this article, I specifically focus on the second mechanism. Taking voters' preferences as exogenous, I investigate whether the fact that more informed voters are more likely to vote for their most preferred politicians causes an aggregate distortion. If the information level is uncorrelated with policy preferences, it does not matter that some voters are informed and others are not. However, in this case, information is strongly positively related to preferences for local government services. Using the parameter estimates from the individual-level regressions, I am able to simulate what the election result would have been if all voters had been equally well-informed. The results suggest that, in that case, the left-wing bloc would have received between 1 and 3 percentage points fewer votes. This means that the mere fact that information is not distributed homogeneously in the electorate causes a skewed election result, even if more information does not change the preferences of the voters. However, this result does not mean that it would be better if only the most informed citizens voted. Because their preferences are distinct from those of the population at large, restricting turnout would actually increase the aggregate bias.
In this article, the policy positions of the politicians are taken as exogenous. However, the fact that more informed voters are significantly 'better' at voting has implications for the incentives facing politicians. Grossman and Helpman (2001) develop a model where some voters are perfectly informed about the policy positions of the political parties and others are not. They show that when the parties are office-motivated they will commit to policies that are biased toward the informed voters. 37 On the other hand, if politicians have policy preferences, as discussed in this article, and cannot commit to policy in advance, it stands to reason that politicians that are closer to the preferences of the informed voters are more likely to run for office. This expectation is supported by the data. More informed voters want higher government spending, and, as Ågren et al. (2007) show, politicians want higher spending than the average citizen. Such a phenomenon could be understood in a citizen-candidate framework where the decision to run for office depends on the probability of getting elected. A possible avenue for future research would be to develop such a model formally and analyse how a heterogeneously informed electorate will affect public policy.

in Public Economics. The Swedish Research Council is acknowledged for their financial support.

criterion that the eigenvalue of the factor matrix should be greater than 1 (Sharma 1995) can easily be rejected for all factors but the first one. The question is whether the proxies are correlated enough for factor analysis to be suitable. If the variables are only weakly correlated, the information variable cannot be identified. There is a formal criterion to test for this, the so-called Kaiser-Meyer-Olkin (KMO) measure (Kaiser and Rice 1974). It is suggested that this measure, which lies between 0 and 1, should be 0.8 or greater, even though a measure of 0.6 or greater can be accepted (Sharma 1995). In this case the KMO measure is 0.76, which suggests the data are acceptable for factor analysis.
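For readers unfamiliar with the KMO measure, the overall statistic compares the squared correlations between the variables with their squared partial correlations. The sketch below uses the standard textbook definition; the function name and the input layout (one row per respondent, one column per proxy variable) are assumptions made for illustration, and it is not necessarily the routine used to obtain the value reported in the text.

```python
import numpy as np

def kmo(data):
    """Overall Kaiser-Meyer-Olkin measure for the columns of data."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    inv = np.linalg.inv(corr)
    # Partial correlation of each pair, holding all other variables fixed.
    scale = np.sqrt(np.outer(np.diag(inv), np.diag(inv)))
    partial = -inv / scale
    off = ~np.eye(corr.shape[0], dtype=bool)  # off-diagonal entries only
    r2 = np.sum(corr[off] ** 2)
    p2 = np.sum(partial[off] ** 2)
    return float(r2 / (r2 + p2))
```

When the variables share a strong common factor, the partial correlations are small relative to the raw correlations and the measure approaches 1, which is why higher values indicate data better suited to factor analysis.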

Regression of the information variable on background characteristics
It may matter how information is distributed among different groups in the electorate. To investigate which groups of citizens have a high or low information level, I estimate an OLS regression of the information variable on background characteristics (equation (A.2)), where $I_{ijt}$ is the information level for individual i in municipality j at time t, and $X_{ijt}$ is a vector of background characteristics including demographic, educational, and work status variables. The model includes a unique intercept for each municipality and time period, and $\varepsilon_{ijt}$ is an idiosyncratic error term. Table A1 shows the results from the estimation of equation (A.2). As expected, the education variables are strongly positively related to information. Information also increases with the age of the respondents but at a decreasing rate. Women generally have a lower level of information, but the gender gap decreased substantially between the two surveys. Interestingly, conditional on the other covariates, income is not related to information. Although the raw correlation (data not shown here) between information and income is positive, the effect disappears when the education and working status variables are controlled for. Compared to blue-collar workers, white-collar workers and employers/self-employed have more information. Finally, individuals who work in the public sector, and specifically in the municipal sector, have more information. This is reasonable because those who work in the municipal sector are strongly affected by political decisions at the local level.

The coefficients represent the marginal effects evaluated at the mean of all independent variables on the probability of voting for the bloc closest in preference for local public services. Standard errors, adjusted to allow for cluster effects on the municipal × time level, are shown in parentheses. *, **, and *** denote significance at the 10%, 5%, and 1% level, respectively.
present results using a larger set of controls. Specifically, I add a set of education controls and a set of work-related controls. The education controls are dummy variables for whether the respondent had a medium or high education level (where the reference category is having only gone to elementary school). The work controls are dummy variables for income quartiles, working at home, being a student, being retired or not working, working in the state, municipal, county or private sector and finally being a white-collar worker, blue-collar worker, employer/self-employed, or having never worked.
The results are shown in Table A5, where the first three columns correspond to the results in Table 2. That is, the table shows estimates of how information correlates with the probability of voting for the parties closest in preference for local public services. Compared to the results in Table 2, the information estimates are slightly lower. However, they are still significant at the 5% level or better in all specifications. Columns (4)-(6) show the results when information is interacted with the relative distance, which corresponds to the estimations in Table 4. As can be seen, there are no substantial changes compared to the baseline results when the new controls are added. If anything, the estimate of the interaction variable slightly increases with additional controls. All in all, these results suggest that the findings are not sensitive to the inclusion of controls for education and work status.

Figure 1 Probability of voting for party L. 13

Figure 2 Frequency plot of the information variable. 19

Figure 3 Relationship between the probability of voting according to preference for local public services and information.

Figure 5 Predicted probabilities of voting for the bloc closest in preference for local public services, stated preferences.

Figure 7 Predicted probabilities of voting for the bloc closest in preference for local public services, estimated preferences.

Figure 8 Predicted probabilities of voting for the bloc closest in preference for local public services from the local linear regression, stated preferences. 30

Figure 9 Predicted probabilities of voting for the bloc closest in preference for local public services from the local linear regression, estimated preferences.

Table 1
Factor analysis result

Table 2
Probability of voting for the bloc closest in preference for local public services, results from probit regressions

Table 3
Probability of voting for the party closest in preference for local public services, results from probit regressions
The coefficients represent the marginal effects evaluated at the mean of all independent variables on the probability of voting for the party closest in preference for local public services. Standard errors, adjusted to allow for cluster effects on the municipal × time level, are shown in parentheses. *, **, and *** denote significance at the 10%, 5%, and 1% level, respectively.

Table 4
Probability of voting for the bloc closest in preference for local public services, results from probit regressions

Table 5
Predicted vote share for the left-wing bloc given different information levels
Standard errors, shown in parentheses in the first three columns, have been estimated by performing 1,000 bootstrap replications of the regressions in Table 4 (allowing for cluster effects on the municipal × time level). *, **, and *** denote significance at the 10%, 5%, and 1% level, respectively, for the difference between the predicted left-wing vote share for the hypothetically informed electorate and the actual information level of the electorate.

Figure 10 Effect of restricted turnout on the vote share of the left-wing bloc.

vote share (0.461). The figure shows that as turnout is restricted to more informed voters, the left-wing vote share increases and, as a result, the aggregate bias increases. Therefore, if reducing aggregate bias is desirable, restricting turnout to only informed voters would not be preferable. It is worth noting that I have only included individuals who actually voted. Therefore, I cannot say what would happen with the aggregate bias if the least informed voters who abstained were to vote. However, the average turnout rate in the 53 elections under study was around 87%, which means that accounting for those who did not vote is of relatively minor importance.

Table A1
OLS regressions of the information variable on background characteristics
Standard errors, adjusted to allow for cluster effects on the municipal × time level, are shown in parentheses. *, **, and *** denote significance at the 10%, 5%, and 1% level, respectively.

Table A5
Probability of voting for the bloc closest in preference for local public services, results from probit regressions where additional control variables have been included