Measuring Support for Women’s Political Leadership

Abstract Public opinion surveys are a fundamental tool to measure support for women’s political rights. This article focuses on perceptions of women’s suitability for leadership. To what extent do influential cross-country surveys that include such items suffer from measurement errors stemming from gender of interviewer effects? Building on the literature on social desirability, we expect that respondents are more likely to express preference for men’s suitability as political leaders with male interviewers and more likely to state support for women’s leadership when interviewed by a woman. We hypothesize that these processes are conditioned by having one’s spouse present, by age differences between respondents and interviewers, as well as by respondents’ levels of education. Analyzing Afrobarometer data, we generally find support for our claims. In addition, it seems that men are slightly more affected by such effects than women are. These gender of interviewer effects persist when analyzing alternative survey rounds and are insensitive to various fixed effects specifications and robustness tests. For the analysis of survey data, we suggest that researchers using gender-related items should control for gender of interviewer effects. We propose that comparative survey programs pay even more attention to interviewer characteristics and the interview situation in their protocols.


Introduction
Survey items gauging the extent to which citizens see women as fit for political leadership (Inglehart and Norris 2003) commonly capture public demand for women in public office (Lovenduski and Norris 1993). Opinions about women's fitness for higher political office serve as proxy variables for understanding gender equality in low-income countries (Bolzendahl and Myers 2004), especially where women's rights are weak (Benstead 2018a, 2018b). This article looks at gendered distortions in citizens' assessments of women's suitability to hold higher political office, focusing on the sex of the interviewer. To what extent do men and women give different answers to questions about women's fitness for political office depending on whether the interviewer is male or female, and to what extent do influential cross-country surveys suffer from such measurement errors? Initially based on small samples within the United States (Benney et al. 1956; Landis et al. 1973) and more recently on representative samples in single-country settings (e.g., Flores-Macias and Lawson 2008; Benstead 2014a), empirical work on gender of interviewer effects has not yet explored interviewer effects on gender-related items in cross-country surveys. Therefore, the literature has seldom discussed the "big picture," that is, how such potential measurement errors affect our understanding of support for women's rights across the globe or within one region. For instance, the largest comparative survey program, the World Values Survey, has never systematically recorded the gender of interviewers.
Social desirability-where respondents seek to make a favorable impression and adapt their answers to the interviewer's expected opinions-could explain why survey participants give different answers when interviewed by a man rather than a woman. In such interactions, the respondent might adjust their responses in the direction of what they think the interviewer wants to hear. When the interviewer and the respondent are of the same gender, self-disclosure theories imply that respondents are less likely to shy away from revealing sensitive attitudes (see Catania et al. 1996; Dykema et al. 2012), giving men room to air attitudes about men's superiority and women room to voice attitudes about equal rights. Yet, these same-sex dyads are possibly also tainted by socially desirable responding, as respondents could act to reinforce in-group esteem by answering in line with the stereotyped views of their gender (Benstead 2014b). This leads us to expect that male and female respondents are more prone to express a preference for male leadership with male interviewers and less prone to do so when interviewed by a woman. Moreover, we believe that the presence of one's spouse, the age difference between respondent and interviewer, and the education level of the respondent could condition these interviewer effects. We test our expectations in an analysis of round six of the Afrobarometer project.

Interviewer Effects
The study of public opinion in low-income countries is beset with several methodological challenges (Lupu and Michelitch 2018). Much of today's survey research in such settings builds on face-to-face interviews, situations where interviewer effects could matter. Aware of these potential effects, prior research has looked at how a range of observable interviewer characteristics, such as audio-visual cues (e.g., skin color, a gendered voice, and accents) and the social categorizations they trigger (e.g., social class and race), systematically alter answers from respondents (Hyman et al. 1954; Schuman and Converse 1971; Campbell 1981). 1 In particular, there is some consensus that appearance-based characteristics such as the wearing of a headscarf can affect answers in the field of gender equality (Blaydes and Gillum 2013; Benstead 2014a, 2014b). Another relevant finding, from the context of African countries, is that co-ethnicity matters: respondents are more likely to voice discriminatory attitudes toward other ethnic groups when interviewed by someone of their own ethnicity (Adida et al. 2016).

Gender of Interviewer Effects
The gender of the interviewer can also play an important role in the interview process (Johnson and Braun 2016). In fact, studies on gender of interviewer effects build on three bodies of literature: research on 1) survey interviews, 2) job interviews and counselling studies, and 3) social-psychological experiments (Lueptow et al. 1990). For example, one vein of work proposes that female interviewers obtain better response rates (Benney et al. 1956) and quality of responses (Liu and Wang 2016), because they appear less threatening and are more likely to gain access to respondents' homes (Huddy et al. 1997), even if this effect is not always significant (West and Blom 2017). Another feature, discussed by Becker, Feyisetan, and Makinwa-Adebusoye (1995), is the tendency of female respondents to shy away from answering questions about sexual behavior when interviewed by a man (see also Pollner 1998; Davis et al. 2010).
1. Moreover, perceptions about whether interviewers represent the state may affect responses about support for democracy (Lau 2018) and government (Tannenberg 2022), especially in authoritarian settings (Calvo et al. 2019).
Does the interviewer's gender influence attitudes toward gender equality among male and female respondents? Landis et al. (1973) analyzed a sample of US students' responses to issues related to gender roles in society and found that female students give more feminist responses when interviewed by women. Studying a small group of US students, Lueptow and colleagues (1990) add that women voice more liberal attitudes to female interviewers. Another study, by Kane and Macaulay (1993), establishes that men state different attitudes about gender inequalities in the labor market when interviewed by men and by women, respectively (see also Catania et al. 1996). Huddy and colleagues (1997) add that these trends are larger among less well-educated and younger respondents.
Social-psychological experiments further suggest that gender might be a cue for what the respondents perceive as a desirable response. For instance, Galla et al. (1981) find that male interviewers generate more traditional attitudes from female respondents on a sex-role questionnaire (see also Frisone et al. 1982). Work by Flores-Macias and Lawson (2008) also reports gender of interviewer effects for questions on women's rights among men, but only in the part of their sample that is drawn from the capital region in Mexico, where respondents' characteristics might be more heterogeneous than in more rural parts of the country. Finally, in the Moroccan context, Benstead (2014a) finds that interviewers' visible religiosity and gender interactively affect responses to religiously sensitive questions.

Theoretical Expectations
We know from a case study of Morocco (see Benstead 2014b) that men tend to report more egalitarian answers to questions about women and politics when interviewed by women. Yet, it is not clear from this case study whether these effects are generalizable to other contexts and, if so, which other factors condition these processes. To understand these effects theoretically, we make use of the literature on social desirability, which Fisher (1993, p. 303) defines as "systematic error in self-report measures resulting from the desire of respondents to avoid embarrassment and project a favorable image to others." In detail, processes of social desirability generally reflect people's propensity to "deny socially undesirable traits and to claim socially desirable ones, and the tendency to say things which place the speaker in a favorable light" (Nederhof 1985, p. 264). Face-to-face interviews are particularly prone to social desirability biases. To explain the origin of these predispositions, Brenner (2017, p. 7) suggests that observable characteristics are important in interactions between parties with no prior experience of each other: "the dyad-interviewer and sample element-begins an interaction typically with very little information about the other. [They] quickly size up each other on the basis of physical appearance, vocal characteristics, and so on." Such interactions trigger beliefs about what is desirable and are heterogeneous across survey item types: "Respondents tend to report attitudes in line with their expectation of the interviewer's opinion on the basis of these observable characteristics . . . [but] only those questions relevant to the observable physical characteristic are prone to interviewer effects" (Brenner 2017, p. 6). In addition, social desirability is likely to appear in discussions about controversial political questions, when respondents might "believe their true answer goes against perceived societal norms" (Streb et al. 2008, p. 77).
The notion of social distance-a concept that pinpoints how similarities in characteristics between interviewer and interviewee (such as gender or race) determine perceptions of whether the two actors share a mutual understanding of different phenomena (Tu and Liao 2007)-can explain why respondents engage in socially desirable responding. Shying away from statements that could contradict the interviewer's perceived opinions can be a mechanism to reduce the social distance between interviewer and respondent (Williams 1964; Landis et al. 1973; Benstead 2014a). These processes follow what Paulhus (2002) labels impression management and self-deception. As Nederhof (1985, p. 264) notes, impression management is about the "norms of what constitutes a good impression in a given situation." 2 It is plausible that processes related to impression management apply to both male and female respondents, especially in the African context, where women's political rights are far from taken for granted (see Sundström et al. 2017) and where voicing ideas about gender equality might still be radical. Deference theory, or power relations theory, which builds on the reasoning that constructed gender roles tend to shape how people behave in conversations (Lau 2018), would predict that women have reasons to downplay gender equality views when interviewed by a man. Being outspoken about the promotion of women's rights might require considerable self-confidence. In line with this view, theories of attribution further suggest that women might attribute less progressive gender-related views to men when trying to adjust in a socially desirable way to their male conversation partners (Benney et al. 1956; Hyman 1954). This implies that respondents who feel subordinate to the interviewer may be more likely to engage in socially desirable responding and adjust their stated views to the perceived opinions of the interviewer.
In such a situation, a woman describing herself as a feminist might still expect ridicule or hostile responses from men. As a result, we expect that women tend to voice more gender-egalitarian opinions to a woman than to a man. Similar interviewer effects should be at play for men. For example, theories of self-disclosure could explain men's tendencies to prefer male leaders when interviewed by men. According to this reasoning, survey respondents are more likely to expose sensitive views to an interviewer perceived as supportive or non-judgmental (Catania et al. 1996; Lau 2018). In such a situation, the processes of impression management and self-deception would not come into play. As stated by Dykema and colleagues (2012, p. 312), "individuals are expected to be more honest and disclose more to someone they trust and with whom they feel comfortable." 3 However, this is not to say that responses in same-sex interviews are completely free from processes where respondents adjust according to who is asking the questions. In fact, Benstead (2014a) notes that "respondents (might) demonstrate loyalty and enhance in-group esteem by agreeing with the stereotyped views of their in-group" (pp. 740-44). They might do so to impress their conversation partner or because they feel comfortable voicing such opinions. Regardless, preferences for male leadership should be more likely to emerge in conversations among only male participants. Male pairs may invoke social pressures for respondents to engage in "locker room talk," in which they can air attitudes about female subordination freely and even in exaggerated ways (Grenz 2005). In contrast, men might voice more egalitarian views about women in leadership roles when the interviewer is a woman. In the words of Streb et al. (2008, p. 79), "respondents might want to avoid appearing sexist" when talking to a woman.
Taken together, this leads to the following expectation:
Hypothesis 1: Respondents are more likely to state support for the election of women to political leadership positions when interviewed by a woman than when interviewed by a man. Likewise, they are less likely to state that men are better suited for leadership when interviewed by a woman than when interviewed by a man.
While this hypothesis predicts that both men and women are susceptible to interviewer effects, it is also important to detect whether the size of these effects differs between the two groups (see Huddy et al. 1997). Methodologically, greater susceptibility to interviewer effects among either men or women would entail more noise in that group's answers. Greater gender of interviewer effects for women would be consistent with the theoretical idea of "sex role stereotyping, where females are generally more sensitive to the characteristics of the interview situation, especially when these involve threat or desirability" (Lueptow et al. 1990, p. 31; see also Kane and Macaulay 1993). In contrast, greater gender of interviewer effects for men would imply that men, on average, still adhere to gender-traditional attitudes but are more likely to voice these thoughts when interviewed by a man (Flores-Macias and Lawson 2008). Since we do not know a priori which of the two expectations is more in tune with reality, we postulate the following two hypotheses:
Hypothesis 2a: Gender of interviewer effects on survey items gauging stated support for women in political leadership are larger among women than among men.
Hypothesis 2b: Gender of interviewer effects on survey items gauging stated support for women in political leadership are larger among men than among women.
Gender of interviewer effects might also depend on the privacy of the interview, that is, whether other parties, especially spouses, are present (Zipp and Toth 2002). Hartmann (1994) suggests that privacy matters in the interview, which often takes place in the household. For example, the presence of a "third party," which may include spouses, children, or bystanders, can influence respondents' answers on sensitive issues such as gender roles (Smith 1997; Mneimneh et al. 2018). We deem it likely that the presence of the interviewee's husband or wife could subtly impede the airing of traditional attitudes (among men) or possibly defiant views (among women) about the role of women in political office. 4 There are four possible dyads, or combinations of interviewer and respondent gender. In a male-male dyad, the presence of the respondent's wife could instill some restraint, making men uncomfortable belittling women's rights. With a female interviewer, a male respondent with his wife present could be even more reluctant to express aversion toward having female leaders; the presence of two women might be a double constraint against voicing preferences for male leadership. In the third combination, a female-female dyad, the presence of the husband might dampen the female respondent's tendency to express emancipatory views. She might feel uneasy stating egalitarian views because her husband could disapprove. With a male interviewer, the effect of a husband's presence is plausibly stronger; in the presence of two men, a female respondent might not dare to express progressive views about women's political rights.
Hypothesis 3: The presence of the spouse will weaken gender of interviewer effects on survey items gauging stated support for women in political leadership, and this weakening effect will be particularly strong in mixed-sex interviewer/respondent dyads.
In the context of an interview, it is possible that the age difference between interviewer and interviewee influences respondents' view of socially acceptable opinions (Lau 2018). We believe that stating progressive opinions to an older interviewer can feel intimidating, because respondents might worry about how it makes them appear to that older person, especially in a context where respect for older people is more ingrained than in European or North American settings. This might particularly apply to a woman if the conversation partner is an older man. More generally, because of deference, the presence of an older interviewer could increase men's tendency to state support for traditional gender roles and therefore strengthen the magnitude of gender of interviewer effects on gendered issues. Even more so, being interviewed by a much older person may decrease women's likelihood of voicing gender equality attitudes, and this should interact with the gender of the interviewer.
Hypothesis 4: Gender of interviewer effects on survey items gauging stated support for women in political leadership will intensify when the interviewer is considerably older than the respondent, particularly if the interviewer is a man.
An extensive literature (Schuman and Converse 1971; Campbell 1981; Huddy et al. 1997; Blaydes and Gillum 2013) finds that less well-educated individuals seem to be more easily swayed by social desirability and interviewer effects in surveys. Applied to support for women's political rights, this would imply that individuals with less education should have a higher tendency to "please" the interviewer by voicing preferences for male leadership if the interviewer is a man and a more emancipatory stance if the interviewer is a woman.
Hypothesis 5: Gender of interviewer effects on survey items gauging stated support for women in political leadership will be smaller at higher levels of respondent education.

Research Design
We focus on round six of the Afrobarometer project, which was collected in 2014 to 2015 and covers 36 countries. 5 The Afrobarometer survey draws a clustered, stratified, multi-stage area probability sample that consists of 49.8 percent men (and 50.2 percent women). 6 In-person interviews are conducted in respondents' households. We are confident that gender of interviewer effects in the data do not stem from assignment bias or a systematic skewness in the assignment of male or female interviewers to areas and respondents (Flores-Macias and Lawson 2008; Lau 2018; Adida et al. 2016; Lupu and Michelitch 2018). The Afrobarometer protocol oversees the completion of the survey in the same way in all partner countries, and the organization trains the national team, employed by a partner firm, in its implementation. When fielded, interviewers move in gender-mixed teams consisting of one field supervisor and four interviewers (Afrobarometer 2014, p. 6). Therefore, a team's withdrawal from an insecure enumeration area will not disproportionately affect the areas surveyed by men or women.
5. We refrain from using rounds five or seven, fielded two years prior and later, because they covered only 34 countries. However, when replicated, our results remain largely the same with these additional rounds (see Supplementary Material table Sm1 for a basic model showing that the variables behave in similar ways across rounds and that main effects persist). The sample of Afrobarometer round six contains 53,415 respondents; the analytical samples for the regression models contain between 51,624 and 52,912 observations. For missing data, we engaged in listwise deletion. For sampling details, see https://afrobarometer.org/surveys-and-methods/sampling-principles. For figures on response rates, see https://afrobarometer.org/sites/default/files/data/afrobarometer_response_rates_round_2_to_7.xls. We used unweighted data to analyze all relationships between variables within the dataset.
To corroborate this, we interviewed the Afrobarometer Project's Deputy Director of Surveys. We also communicated with 15 of the national partner firms, all of which confirmed that teams indeed consist of both men and women (see Supplementary Material table Sm2 for details and Supplementary Material table Sm3, which shows the gender distribution of interviewers per country).
Moreover, the assignment of interviewers to respondents should not create gendered distortions. The survey adheres to a gender quota in sampling: on any given day, an interviewer alternates respondents by gender, starting each morning by referring to her last interview the previous day, or by a coin flip. 7 Upon approaching a selected household, and knowing which gender to sample, the interviewer establishes a list of potential respondents as the adult household member first encountered lists the women or men in the household. The interviewer clarifies: "We would like to choose an adult from your household. Would you help us pick one?" One person from the allotted gender category is chosen by drawing a numbered card. 8
6. While it is possible to investigate our questions on the Americas Barometer, the Arab Barometer, and the Asian Barometer, these projects do not ask the question about women's leadership in the same way, lack several independent variables, and cover fewer countries.
7. Because of the binary categories used in the survey, we assume that people are sorted into two gender (or sex) categories. For this purpose, we use the terms male/man and female/woman interchangeably. Respondents' gender is recorded in the sampling phase and interviewers' gender is self-reported.
8. Male and female interviewers are generally not assigned to different types of areas or respondents, but there is a small difference in the distribution of characteristics such as respondents' age, education levels, and religious denominations across all four types of interviewer/respondent dyads, as well as residence type (see Supplementary Material tables Sm4-Sm7). However, we show in our models (e.g., models 2-7 and models Sm1-Sm20) that including controls for area type or respondents' characteristics does not alter the gender of interviewer effects.
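As a minimal sketch, this respondent-selection protocol (day-level gender alternation plus a numbered-card draw within the household) can be simulated as follows. The helper names and the household data are hypothetical illustrations, not part of Afrobarometer's actual field procedures or software:

```python
import random

def next_target_gender(previous_gender, rng):
    """Alternate the target gender from one interview to the next; flip a
    coin when there is no previous interview to refer to (a hypothetical
    simplification of the quota rule described above)."""
    if previous_gender is None:
        return rng.choice(["male", "female"])
    return "female" if previous_gender == "male" else "male"

def draw_respondent(household, target_gender, rng):
    """List adult household members of the allotted gender and pick one
    by drawing a numbered card (modeled as a uniform random draw)."""
    eligible = [m for m in household
                if m["gender"] == target_gender and m["age"] >= 18]
    if not eligible:
        return None  # in the field, the team would move to a substitute household
    card = rng.randrange(len(eligible))  # the numbered card that is drawn
    return eligible[card]

rng = random.Random(42)  # seeded for reproducibility
household = [
    {"name": "A", "gender": "female", "age": 34},
    {"name": "B", "gender": "male", "age": 40},
    {"name": "C", "gender": "female", "age": 19},
    {"name": "D", "gender": "female", "age": 12},  # minor, never eligible
]
chosen = draw_respondent(household, next_target_gender("male", rng), rng)
```

Because the previous interview was with a man, the target gender flips to female, and the draw is restricted to the two adult women in the household.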

Dependent Variable
Our dependent variable measures attitudes about whether men make better political leaders than women, which we see as a fundamental hurdle for women's advancement in the political sphere (Lovenduski and Norris 1993; Inglehart and Norris 2003). In detail, the interviewer reads aloud: "Which of the following statements is closest to your view? Choose Statement 1 or Statement 2."
Statement one: Men make better political leaders than women, and should be elected rather than women.
Statement two: Women should have the same chance of being elected to political office as men.
Interviewers record the neutral response "agree with neither" when it is volunteered, even though they do not read it aloud. For respondents who selected one of the two statements, interviewers probe the strength of the opinion by asking, "Do you agree or agree very strongly?" Therefore, responses range from 0 (agree very strongly with statement one) to 4 (agree very strongly with statement two), with a neutral mid-category. 9
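As an illustration, the mapping from the two statements plus the strength probe onto the 0 to 4 scale can be sketched as follows; the response labels are made up for clarity and do not reproduce the official Afrobarometer codebook values:

```python
# 0 = agree very strongly with statement 1 (men make better leaders) ...
# 4 = agree very strongly with statement 2 (equal chance for women);
# 2 is the neutral "agree with neither" mid-category.
SCALE = {
    ("statement1", "very strongly"): 0,
    ("statement1", "agree"): 1,
    ("neither", None): 2,
    ("statement2", "agree"): 3,
    ("statement2", "very strongly"): 4,
}

def code_leadership_item(statement, strength=None):
    """Return the 0-4 ordinal code; None for "don't know" or refusals."""
    return SCALE.get((statement, strength))

codes = [
    code_leadership_item("statement1", "very strongly"),  # 0
    code_leadership_item("neither"),                      # 2
    code_leadership_item("statement2", "agree"),          # 3
]
```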

Gender of Respondent and Interviewer
To construct our key independent variable, we created four dummy variables, one for each of the following dyads: 1) the respondent and interviewer are both male, 2) the respondent is male and the interviewer is female, 3) the respondent is female and the interviewer is male, and 4) the respondent and interviewer are both female.
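The dyad coding amounts to four mutually exclusive indicator variables; a minimal sketch, with hypothetical variable names:

```python
def dyad_dummies(respondent_male, interviewer_male):
    """Return the four mutually exclusive respondent/interviewer dyad
    indicators (exactly one equals 1 for any given interview)."""
    return {
        "male_male":     int(respondent_male and interviewer_male),
        "male_female":   int(respondent_male and not interviewer_male),
        "female_male":   int(not respondent_male and interviewer_male),
        "female_female": int(not respondent_male and not interviewer_male),
    }

# A male respondent interviewed by a woman falls in the second dyad.
d = dyad_dummies(respondent_male=True, interviewer_male=False)
```

In a regression, three of these dummies enter the model and the fourth (in the paper, the male-male dyad) serves as the reference category.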

Additional Independent Variables
Our full models have several independent variables. The indicators capturing education, also self-reported, consist of four dummy variables: no formal education, primary education, secondary education, and post-secondary education. Three dummy variables gauge privacy during the interview: a) whether no one else was in the same room, b) whether only a spouse was present, and c) whether others were attending the interview (combining the options "children," "a small crowd," and "a few others"). 10 Finally, we use information on the age of respondents (self-reported, in years) and the age of the interviewer (in years, recorded by the interviewer completing the survey) to gauge the age difference between interviewers and respondents. In more detail, we create a binary variable coded 1 if the interviewer is at least 15 years older than the interviewee (0 if not). 11 We opt for the 15-year difference for two reasons. First, and more theoretically, 15 years is generally the lower limit to denote a generation (see Pew Research Center 2015). At the time of the survey (i.e., in 2014/2015), most respondents belonged to Generation X (those born between 1965 and 1980) and the millennial generation (those born between 1981 and 1997). To qualify for participation in the survey, participants had to be 18 years of age. This implies that the life span of these two most important generations in the survey was 15 years for Generation X and 16 years for the millennials.
Second, and more empirically, if we were to use a higher limit to denote a generation, we would have very few interviewer-respondent dyads with a sufficiently large age difference. To illustrate, only 2 percent of the dyads have an age difference of 20 years or more. We control for respondents' residency and religious denomination to hold constant two factors that could determine variation in gender equality attitudes (Bolzendahl and Myers 2004). To capture residency, we add a dummy variable coded 1 for urban and 0 for rural (categories recorded by survey administrators). Religion distinguishes between Christians, Muslims, and other religions (as stated by respondents), through dummy variables. Table 1 reports summary statistics of all variables (for the exact question wording of the items we used, see Supplementary Material table Sm22).
Our analysis has several steps. First, to examine whether there are interviewer effects, we report cross-tabular statistics on the gendered leadership question across the four dyads and use a Chi-square test of independence to test whether differences are significant. 12 Second, to test hypotheses 1 and 2, we evaluate gender of interviewer effects using multivariate regression. 13 On the left-hand side of Model 1 is the five-value ordinal gendered leadership variable. On the right-hand side are the three dummy variables capturing interviewer-respondent dyads. The reference category is the male-male dyad. Country fixed effects in each model hold country-specific confounders such as national political culture constant (Johnson and Braun 2016). Model 1 tests our main effects and Model 2 adds controls.
9. For a distribution per country, including "don't know" responses, see Supplementary Material table Sm8.
10. Information about who is attending is recorded by the interviewer in the questionnaire, and the assumption we make here is that the interviewer asks whether a person attending is the respondent's spouse. As the third category includes the three response options "children," "a small crowd," and "a few others," it is possible that a spouse is present in the latter two categories. The option "spouse only" is not included in this category.
11. In additional models, we test for smaller age differences between interviewer and respondent (i.e., dummy variables for an age difference of at least 5 years and at least 10 years, respectively), as well as a larger age difference (i.e., 20 years or more).
12. We display the mean responses to this question across the four dyads for all countries in the dataset (Supplementary Material table Sm9).
13. We use Stata 17.
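The first analytical step, a Chi-square test of independence on a contingency table of responses by interviewer gender, can be sketched as follows. The counts are invented for illustration (they are not Afrobarometer figures), and only the test statistic is computed; obtaining a p-value would additionally require a Chi-square distribution function:

```python
def chi_square_statistic(table):
    """Pearson chi-square statistic for an r x c contingency table:
    the sum over all cells of (observed - expected)^2 / expected, where
    expected = row total * column total / grand total."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            stat += (observed - expected) ** 2 / expected
    return stat

# Rows: interviewer gender; columns: the five ordinal response categories.
table = [
    [900, 700, 400, 800, 1200],   # male interviewer
    [700, 600, 400, 900, 1500],   # female interviewer
]
stat = chi_square_statistic(table)
df = (len(table) - 1) * (len(table[0]) - 1)  # (2-1) * (5-1) = 4
```

A large statistic relative to the Chi-square distribution with 4 degrees of freedom would indicate that the response distribution differs by interviewer gender.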
We use ordered logistic regression models. 14 Given the ordinal nature of our dependent variable, we deem this analytical choice in line with theory. To interpret the logistic regression coefficients, we create marginal effects plots that display the predictive margins of interviewer gender on responses. In more detail, we create two marginal effects plots for each model. The first plot shows the predicted average marginal effect for men, split by interviewer gender. The second plot shows the predicted average marginal effect for women, split by interviewer gender. To examine hypotheses 2a and 2b, we run an additional model where we interact two indicator variables capturing the gender of the interviewer (0 = female, 1 = male) and the gender of the respondent (0 = female, 1 = male). This model performs a significance test of whether the gender of interviewer effect is stronger for men or for women. We also display these effects graphically via a conditional marginal effects plot.
14. The models present the ordered log-odds (logit) regression coefficients and the cut-off points (which display the levels of the latent variable used to differentiate levels of support for women as leaders). For each variable, we present the ordered log-odds regression coefficient, the standard error, and the significance level (p-value). We also present the log-likelihood of the model, a likelihood ratio (LR) Chi-squared test indicating whether the model is significant, and the Pseudo R-squared value.
To investigate hypotheses 3 to 5, we generate interactive models. We start by interacting the interviewer-respondent dyads with our measure of spousal presence. We then create models that control for the age difference between interviewers and respondents and proceed to interact the dyads with the binary indicator of a 15-year age difference between interviewer and respondent. In an additional model, we also interact our interviewer-respondent dyad dummies with respondents' education. As in our main models, we create marginal effects plots to display these relationships graphically.

Results
Our analysis provides clear results. We find that the gender of the interviewer affects men's and women's responses to the gendered leadership item. Table 2 shows that regardless of interviewer gender, men are more supportive than women of the statement that men are better suited for political leadership, with women stating more support for gender equality in political leadership. Yet, the gender of the interviewer matters as well. Chi-square tests of independence, reported in table 2, show that there is a significant difference in answers across male and female interviewers to the gendered leadership item: respondents are less likely to agree with women's suitability as leaders when interviewed by a man, and more likely to do so when interviewed by a woman. These results offer general support for hypothesis 1.
Our multivariate regression models (i.e., Models 1 and 2 in table 3) confirm this finding. For example, holding everything else constant, men have an approximately 10-percentage-point higher chance of strongly agreeing with the statement that women make as good political leaders as men when interviewed by a woman than when interviewed by a man (the predicted probability of giving this answer increases from 30 to 40 percent, as shown in figure 1). Women have an approximately 8-percentage-point higher chance of strongly agreeing with the same statement when the interviewer is female rather than male. Table 4 illustrates that this interviewer effect is larger for male respondents. The negative interaction (p = 0.043 in the full model 2) between the gender of the interviewer and the gender of the respondent indicates that men are significantly more affected by the gender of the interviewer than women are. Figure 2 further shows that this effect is particularly strong for the first response category (i.e., those who "strongly agree that men make better leaders than women"). For the other response categories, the influence is smaller and the confidence intervals overlap. We therefore find limited support for hypothesis 2b, which speaks against the proposition that women are more prone to engage in socially desirable responding when answering sensitive items (Lueptow et al. 1990). The interactive model that explores the role of spousal presence (Model 3, table 3) illustrates that male respondents primarily react differently to male and female interviewers when their wife is absent (see figure 3). This could indicate that men are reluctant to display traditional attitudes about women in politics (a preference for male leadership) in the presence of their wives.
In contrast, there are no significant interactive effects for women; that is, women respond differently to male and female interviewers regardless of whether their husbands are present. Hypothesis 3 is therefore only partially supported.
We find limited support for hypothesis 4. When we include the interactive terms capturing the 15-year age differences, the effect of interviewer gender disappears among female respondents (see model 1 in table 5 and figure 4). Women give more progressive responses to female interviewers who are less than 15 years older than them, but not to those at least 15 years older. This suggests to us that (relatively younger) women adapt their answers in the presence of older female interviewers. Yet, this finding comes with the caveat that the share of dyads in which the interviewer is this much older and female is small (about 4 percent of our dyads; see table 1).15 Figure 4 further illustrates that there are no significant interactive effects for men. In Supplementary Material table Sm10, we show that age differences of five years and more (model Sm10a) and 10 years and more (model Sm10b) produce no significant interactions. There is also an interactive effect between the interviewer-respondent dyads and respondents' levels of education (model 2 in table 5 and figures 5 and 6). In support of hypothesis 5, we find that the effect of the gender of the interviewer is strongest for individuals with low education and weakens with each additional level of education. We find this pattern among both female and male respondents.

Robustness Checks
Because our ordered logistic regression analysis violates the parallel lines assumption (Williams 2016), we run two additional models with the same variable specification using generalized ordered logistic regressions and multinomial logistic regressions. These two models (Supplementary Material tables Sm11 and Sm12) confirm our main results. We then show, again using ordered logistic regression models, that our results are insensitive to different operationalizations of the dependent variable. For example, if we disregard the strength of opinion and reduce our dependent variable to three categories instead of five, results are consistent: the variables capturing gender of interviewer effects remain significant (p < 0.001), and the size and direction of the coefficients are largely unchanged (Supplementary Material). Additional specifications indicate that the gender of interviewer effects are a rather consistent feature of the national surveys within the Afrobarometer project. Analyzing the main model for each of the 36 countries, we find that they persist in a majority: in 19 countries, all three interviewer-respondent dyads are significant; in 16 countries, at least two of the dyads are significant; and in one country, only one of the dyads is significant (Supplementary Material table Sm16 and figure Sm1).
The findings are robust regardless of whether we run our models with country-level, regional-level, local-level, or even enumeration area (EA)-level fixed effects (Supplementary Material table Sm19). Moreover, these findings are unchanged when we run our main models on data from Afrobarometer rounds five and seven (for these replication results, see Supplementary Material table Sm1). Finally, we test the boundary conditions of our findings; that is, do these effects apply merely to survey items having to do with women's standing in society, or to non-gendered questions as well? We do this through two types of analyses. First, we obtain very similar gender of interviewer effects when we regress three items (attitudes toward women's right to divorce, their right to work, and the chances of a woman becoming president in a Muslim society) on our interviewer-respondent dyads (Supplementary Material table Sm20).17 Second, and as a contrast, we do not find any gender of interviewer effects when we use two gender-neutral trust items as the dependent variable.

Our findings also extend the literature: an age difference of at least 15 years between the respondent and interviewer eliminates the increase in women's likelihood of stating support for women in political leadership when interviewed by female rather than male interviewers. This suggests to us that women adapt their answers in the presence of older female interviewers. In addition, we confirm some prior findings from the literature. In particular, we endorse early analysis on spousal presence (see Zipp and Toth 2002): our models illustrate that men are more likely to voice preferences for male leadership without their wives present than in their presence, irrespective of the gender of the interviewer.
Finally, we find support for the proposition that gender of interviewer effects are larger among respondents with low education (see Campbell 1981; Blaydes and Gillum 2013). Overall, our findings suggest that it may be inappropriate to estimate opinions about women's suitability for political leadership within countries, or to compare such estimates across countries, using only responses from interviewer-administered surveys, because such responses can be distorted by the interview conditions. Cross-country variation in data collection procedures, such as the proportion of interviews conducted by male versus female interviewers, the proportion of men and women interviewed, and age differences between interviewers and respondents, may distort country-level estimates and cross-country comparisons. In the data used here, for example, the share of interviews conducted by male interviewers ranged from 27 percent in South Africa to 75 percent in Cameroon (see Supplementary Material table Sm3), a difference with the potential to introduce considerable noise into comparisons between these two countries. Our results further suggest that analysts need to account for characteristics of the interview itself (especially interviewer sex and age and who else was present during the interview), as well as respondents' sex, age, and education, when estimating support for women's leadership. However, to do so, analysts need access to information about interviewers and the interview context that is often unavailable in the world's leading multinational surveys. Practically, we suggest that large survey programs, such as the World Values Survey and the European Values Study, systematically report characteristics of the interviewer such as gender, as well as intersectional aspects of race, age, class, and religious symbols (see also Adida et al. 2016; Benstead 2014b).
We also recommend that researchers control for these characteristics when analyzing attitudinal questions, in particular sensitive items such as those concerning gender equality.