Education, intelligence and Alzheimer’s disease: evidence from a multivariable two-sample Mendelian randomization study

Abstract
Objectives: To examine whether educational attainment and intelligence have causal effects on risk of Alzheimer’s disease (AD), independently of each other.
Design: Two-sample univariable and multivariable Mendelian randomization (MR) to estimate the causal effects of education on intelligence and vice versa, and the total and independent causal effects of both education and intelligence on AD risk.
Participants: 17 008 AD cases and 37 154 controls from the International Genomics of Alzheimer’s Project (IGAP) consortium.
Main outcome measure: Odds ratio (OR) of AD per standard deviation (SD) increase in years of schooling (SD = 3.6 years) and intelligence (SD = 15 points on intelligence test).
Results: There was strong evidence of a causal, bidirectional relationship between intelligence and educational attainment, with the magnitude of effect being similar in both directions [standardized β for intelligence on education = 0.51 SD units, 95% confidence interval (CI): 0.49, 0.54; standardized β for education on intelligence = 0.57 SD units, 95% CI: 0.48, 0.66]. Similar overall effects were observed for both educational attainment and intelligence on AD risk in the univariable MR analysis; with each SD increase in years of schooling and intelligence, odds of AD were, on average, 37% (95% CI: 23–49%) and 35% (95% CI: 25–43%) lower, respectively. There was little evidence from the multivariable MR analysis that educational attainment affected AD risk once intelligence was taken into account (OR = 1.15, 95% CI: 0.68–1.93), but intelligence affected AD risk independently of educational attainment, with a magnitude similar to that observed in the univariable analysis (OR = 0.69, 95% CI: 0.44–0.88).
Conclusions: There is robust evidence for an independent, causal effect of intelligence in lowering AD risk. The causal effect of educational attainment on AD risk is likely to be mediated by intelligence.


Introduction
Alzheimer's disease (AD) is the leading cause of death in England and Wales. 1 Existing treatments are currently unable to reverse or delay progression of the disease. Thus, strategies for reducing the incidence of the disease by intervening on modifiable risk factors are important. Higher educational attainment is associated with a lower risk of dementia. [2][3][4][5] However, the mechanisms underlying the associations of educational attainment with AD risk are uncertain and this has implications for intervention design.
In particular, what is the role of intelligence? The degree to which education affects intelligence, vs intelligence being largely fixed in early life and acting as a determinant of educational attainment, has been debated for decades [6][7][8][9][10] and studies have provided evidence of an effect in both directions. 8,11 If the principal direction of causality is intelligence to educational attainment, intelligence would induce confounding bias in the association between educational attainment and AD. In this case, interventions aiming to increase educational attainment (e.g. raising the school leaving age to increase years of schooling) are unlikely to affect AD risk, but alternative prevention strategies such as cognitive training may prove effective. In contrast, if the principal direction of causality is such that greater educational attainment increases intelligence (i.e. intelligence lies on the causal pathway from educational attainment to AD risk), then interventions designed to prolong the duration of education may reduce AD risk, either directly or indirectly through subsequently increasing intelligence.
Determining the relative contributions of education and intelligence to AD risk is of clear importance for designing appropriate policy interventions to reduce AD risk. Using observational methods to unpick these associations is challenging due to bias from measurement error, confounding and reverse causation. More recently, studies have attempted to estimate causal effects of educational attainment on AD risk using methods such as univariable Mendelian randomization (MR). MR is a form of instrumental variable analysis, in which genetic variants are used as proxies for a single environmental exposure. 12 Due to their random allocation at conception, genetic variants associated with a particular risk factor are largely independent of potential confounders that may otherwise bias the association of interest when using observational methods. Genetic variants also cannot be modified by subsequent disease, thereby eliminating potential bias by reverse causation. Thus, MR can be a useful tool for helping to establish whether the association between an exposure and an outcome is likely to be causal. However, these methods can be problematic with traits that are highly genetically and phenotypically correlated (such as educational attainment and intelligence). 13,14 Figure 1 illustrates possible models underlying the observed associations of educational attainment and intelligence with AD risk. In all models shown, causal effects for both exposures on AD risk would be implied from univariable MR analyses. However, depending on the underlying model, intervention targets will differ. Multivariable MR is an extension of univariable MR in which multiple exposures are included within the same model. It can estimate causal effects of one trait, independently of another related trait. Thus, extending MR analyses from the univariable to the multivariable setting may be a useful tool for further disentangling these relationships and establishing the respective roles of both education and intelligence in AD risk. 13 In this study, we estimated (i) the effect of educational attainment on intelligence and vice versa, (ii) the overall effects of educational attainment and intelligence on AD risk and (iii) the independent effects of both education and intelligence on AD risk (i.e. the effects of educational attainment and intelligence on AD risk that are independent of the other trait).

Key Messages
• Mendelian randomization (MR) estimates of the causal effect of education on risk of Alzheimer's disease (AD) can be biased when traits are highly genetically and phenotypically correlated (such as education and intelligence).
• We provide evidence that intelligence and education are likely to have causal effects on each other, with the magnitude of effect being similar in both directions.
• We show that the existing associations reported in the literature between greater educational attainment and lower AD risk are likely to be largely driven by intelligence, rather than there being an independent protective effect of staying in school for longer.

Figure 1. A non-exhaustive list of possible models underlying the observed causal effects of educational attainment, intelligence and risk of Alzheimer's disease. These are not intended to be directed acyclic graphs. IQ denotes intelligence, EA denotes educational attainment and AD denotes Alzheimer's disease. G denotes a set of instruments that are drawn as a single node for visual simplicity. (a) Illustrates a model in which G is identified in a GWAS of EA, because it is associated with EA indirectly through IQ. IQ has an independent effect on AD but EA does not. A spurious association between EA and AD is induced due to confounding by IQ. Accounting for IQ in multivariable analysis would reveal no independent effect of EA on AD risk and the intervention target should be IQ. (b) Illustrates a model in which G is identified in a GWAS of IQ because it is associated with IQ indirectly through EA. EA has an independent effect on AD but IQ does not. A spurious association between IQ and AD is induced due to confounding by EA. Accounting for EA in multivariable analysis would reveal no independent effect of IQ on AD risk and the intervention target should be EA. (c) Illustrates a model in which the effect of EA on AD risk is entirely mediated by IQ (i.e. IQ lies on the causal pathway between EA and AD). Multivariable analyses would reveal an independent effect of IQ on AD risk, but no independent effect of EA. The intervention target could be either IQ or EA. (d) Illustrates a model in which the effect of IQ on AD risk is entirely mediated by EA (i.e. EA lies on the causal pathway between IQ and AD). Multivariable analyses would reveal an independent effect of EA on AD risk, but no independent effect of IQ. The intervention target could be either EA or IQ. (e) Illustrates a model in which there is full horizontal pleiotropy through IQ. Horizontal pleiotropy occurs when G has a causal effect on disease independently of its effect on the exposure. In this case, multivariable analyses would reveal an independent effect of IQ on AD risk, but no independent effect of EA and the intervention target should be IQ. (f) Illustrates a model in which there is full horizontal pleiotropy through EA. Multivariable analyses would reveal an independent effect of EA on AD risk, but no independent effect of IQ and the intervention target should be EA. (g) Illustrates a model in which G independently affects all three traits, but the three traits have no causal effect on each other. Multivariable analysis would show no independent effects of EA or IQ on AD risk. (h) Illustrates a model in which there are joint independent effects of both EA and IQ on AD risk. Multivariable analysis would show independent effects of both IQ and EA and the intervention target could be either IQ or EA. Here, the bidirectional relationship between IQ and EA does not affect the qualitative interpretation.

Mendelian randomization
MR is a form of instrumental variable analysis that uses genetic variants to proxy for environmental exposures. Two-sample MR 15 is an extension in which the effects of the genetic instrument on the exposure and on the outcome are obtained from separate genome-wide association studies (GWAS). This method is particularly useful for trying to identify early life risk factors for later life diseases like AD, because unlike in observational studies, rich longitudinal data across the whole life course (which are scarce) are not needed. MR is based on three key assumptions: (i) genetic variants must be robustly associated with the exposure of interest, (ii) genetic variants must not be associated with potential confounders of the association between the exposure and the outcome and (iii) there must be no effects of the genetic variants on the outcome that do not go via the exposure (i.e. no horizontal pleiotropy). 16 To date, MR studies have typically been univariable (i.e. examining the effect of one exposure on an outcome), thereby estimating the total effect of the exposure on the outcome through all possible pathways. More recently, multivariable MR methods have been proposed to investigate the independent effects of multiple traits on an outcome. Methods for conducting a multivariable MR analysis have been published elsewhere. 13,17,18

Data
For educational attainment, we used the GWAS (discovery and replication meta-analysis, n = 293 723) 19 which identified 162 approximately independent genome-wide significant (P < 5 × 10⁻⁸) single nucleotide polymorphisms (SNPs) associated with years of schooling. SNP coefficients were expressed in standard deviation (SD) units of years of schooling (SD = 3.6 years). For intelligence, we used the largest (n = 248 482) and most recent iteration of the Multi-Trait Analysis of Genome-wide association studies, 20 which identified 194 approximately independent (r² threshold <0.01 within a 10 Mb window using the 1000 Genomes reference panel 21 ) genome-wide significant SNPs. SNP coefficients were per one SD increase in the intelligence test scores (SD = 15 points on the intelligence test score). Tables S1 and S2, available as Supplementary data at IJE online, detail each SNP used from the education and intelligence GWAS along with the chromosome, gene position, effect and other alleles, effect allele frequency and the associations of each SNP with the exposure and the outcome. F statistics provide an indication of instrument strength 22 and are a function of R² (how much variance in the trait is explained by the set of genetic instruments being used), the number of instruments being used and the sample size. The F statistics for the educational attainment and intelligence instruments are 43.5 and 50.45, respectively (F > 10 indicates the analysis is unlikely to suffer from weak instrument bias). 23 For the outcome (AD) we used the large-scale GWAS of AD conducted by the International Genomics of Alzheimer's Project (IGAP, n = 17 008 AD cases and 37 154 controls). 24 SNP coefficients were log odds ratios (ORs) of AD. Ethical approval was granted for each of the original GWAS studies and details can be found in the respective publications.
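To illustrate how instrument strength depends on the three quantities named above, the following minimal sketch computes one commonly used approximation, F = [R²/(1 − R²)] × [(n − k − 1)/k], where n is the sample size and k is the number of instruments. The function name and the input values are hypothetical and chosen only for illustration; this is not necessarily the exact calculation behind the F statistics reported here.

```python
def f_statistic(r2: float, n: int, k: int) -> float:
    """Approximate F statistic for k instruments that jointly explain
    a proportion r2 of the variance in the exposure among n individuals."""
    if not 0.0 <= r2 < 1.0:
        raise ValueError("r2 must lie in [0, 1)")
    if n <= k + 1:
        raise ValueError("n must exceed k + 1")
    return (r2 / (1.0 - r2)) * ((n - k - 1) / k)


# Hypothetical inputs (NOT taken from the GWAS used in this study).
example_f = f_statistic(r2=0.03, n=100_000, k=100)
print(f"F = {example_f:.1f}")
```

Under this approximation, F rises with the variance explained and with the sample size, and falls as more instruments are added for the same R².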

Estimating the bidirectional association between intelligence and educational attainment
After (i) excluding non-independent SNPs, (ii) excluding SNPs that overlapped between the two GWAS and (iii) harmonization across both GWAS, there were 148 genome-wide significant SNPs for educational attainment and 180 for intelligence available for these analyses. Full details of the harmonization procedure are available as Supplementary data at IJE online. Univariable MR was used to estimate the total effect of intelligence on educational attainment, and of educational attainment on intelligence. This was done using inverse-variance-weighted (IVW) regression analysis. 25 Briefly, in IVW regression the causal effect estimates for each genetic variant are averaged using an inverse-variance-weighted formula (taken from the meta-analysis literature) to provide an overall causal estimate of the exposure on the outcome. 26 In this regression, the intercept is constrained to zero, which makes the assumption of no horizontal pleiotropy. Results are presented in SD units to enable a comparison of the magnitude of effect across both exposures.
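The IVW calculation described above can be sketched as follows. This is a minimal fixed-effect version that assumes summary-level SNP-exposure associations, SNP-outcome associations and SNP-outcome standard errors; it is equivalent to an inverse-variance-weighted average of the per-SNP ratio estimates and to a weighted regression through the origin. The function name and the toy values are hypothetical and are not taken from the analyses reported here.

```python
import numpy as np


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW estimate: weighted regression of SNP-outcome on
    SNP-exposure associations with the intercept constrained to zero."""
    bx = np.asarray(beta_exp, dtype=float)
    by = np.asarray(beta_out, dtype=float)
    w = 1.0 / np.asarray(se_out, dtype=float) ** 2  # inverse-variance weights
    denom = np.sum(w * bx ** 2)
    estimate = np.sum(w * bx * by) / denom
    std_error = np.sqrt(1.0 / denom)
    return float(estimate), float(std_error)


# Toy summary statistics (illustrative only): every SNP implies a ratio of 0.5.
toy_bx = [0.10, 0.20, 0.30]
toy_by = [0.05, 0.10, 0.15]
toy_se = [0.01, 0.02, 0.01]
ivw_beta, ivw_se = ivw_estimate(toy_bx, toy_by, toy_se)
print(ivw_beta, ivw_se)
```

Because there is no intercept term, any average direct effect of the variants on the outcome is absorbed into the slope, which is why this estimator relies on the assumption of no (or balanced) horizontal pleiotropy.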

Estimating the total and independent effects of education and intelligence on Alzheimer's disease
There were 142 genome-wide significant SNPs for educational attainment and 185 for intelligence available for these analyses, after excluding non-independent SNPs and harmonization across both GWAS (full details of harmonization are available as Supplementary data at IJE online). Univariable MR was used to estimate the total effects of both intelligence and educational attainment (separately) on AD risk, through all possible pathways, using IVW regression analysis (described above). 25 As mentioned previously, this univariable method has been shown to yield biased effect estimates if the genetic instruments being used are non-specific for the hypothesized exposure. 13,14 Thus, to demonstrate these effects as they would be observed in a typical univariable analysis, we did not exclude the 9 SNPs that overlapped across the education and intelligence GWAS. We then used multivariable MR to estimate the effects of educational attainment and intelligence on AD risk, independently of each other, by including both exposures within the same model. 13 After clumping the full list of SNPs from both the education and intelligence GWAS (to ensure only independent SNPs are included) and restricting to those SNPs (or proxies) found in the AD GWAS, a total of 231 SNPs were available for the multivariable MR analyses (84 for education and 156 for intelligence, 9 of which overlap between both GWAS).
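A minimal sketch of the multivariable IVW idea is given below, assuming summary-level SNP associations with each exposure (in SD units) and with the outcome (log odds). The SNP-outcome associations are regressed on both sets of SNP-exposure associations at once, with no intercept and with weights equal to the inverse variance of the SNP-outcome associations. The function name and the toy data are hypothetical; the published multivariable MR methods cited above should be consulted for the full procedure.

```python
import numpy as np


def mvmr_ivw(beta_exposures, beta_out, se_out):
    """Multivariable IVW: weighted least squares of SNP-outcome associations
    on several SNP-exposure associations, intercept constrained to zero."""
    X = np.asarray(beta_exposures, dtype=float)  # shape (n_snps, n_exposures)
    y = np.asarray(beta_out, dtype=float)
    sqrt_w = 1.0 / np.asarray(se_out, dtype=float)  # square root of 1/se^2
    coef, *_ = np.linalg.lstsq(X * sqrt_w[:, None], y * sqrt_w, rcond=None)
    return coef


# Toy data: column 0 = education, column 1 = intelligence (illustrative only).
toy_X = np.array([[0.10, 0.02], [0.03, 0.12], [0.08, 0.08], [0.02, 0.15]])
true_effects = np.array([0.0, -0.4])       # assumed log odds per SD
toy_y = toy_X @ true_effects                # noise-free SNP-outcome associations
toy_se = np.array([0.01, 0.02, 0.01, 0.02])

mv_coef = mvmr_ivw(toy_X, toy_y, toy_se)
mv_odds_ratios = np.exp(mv_coef)            # log odds -> odds ratios
print(mv_coef, mv_odds_ratios)
```

In this toy example the first exposure has no direct effect once the second is included, which mirrors the kind of pattern a multivariable model is designed to detect.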

Sensitivity analyses
Firstly, in the bidirectional analysis between educational attainment and intelligence, we endeavoured to rule out the possibility that the genetic instruments used to proxy for educational attainment are actually instruments for intelligence and vice versa (i.e. we wanted to test that the hypothesized causal direction was correct for each SNP used). To do this we performed Steiger filtering 27 for each SNP to examine whether it explains more variance in the exposure than it does in the outcome (which should be true if the hypothesized causal direction from exposure to outcome is correct). We then re-ran analyses excluding those SNPs for which there was evidence that it explained more variance in the outcome than the exposure. Secondly, to check that the SNPs do not exert a direct effect on the outcome apart from through the exposure (which would violate a key MR assumption of no horizontal pleiotropy 12 ) we compared results from all univariable (both the bidirectional education on intelligence analyses and the analysis of education and intelligence on AD risk) and multivariable IVW regressions to those obtained with MR-Egger regression. In MR-Egger regression, the intercept is not constrained to zero, thus, the assumption of no horizontal pleiotropy is relaxed. 16,26,28 The estimated value of the intercept in MR-Egger regression can be interpreted as an estimate of the average pleiotropic effect across the genetic variants. An intercept term that differs from zero is therefore indicative of horizontal pleiotropy, and the causal effect estimate obtained from an MR-Egger regression is adjusted for the degree of pleiotropy detected. 16 Full details of the MR-Egger regression analyses are available as Supplementary data at IJE online. Thirdly, we conducted a leave-one-out analysis for the univariable models in which we systematically removed one SNP at a time to assess the influence of potentially pleiotropic SNPs on the causal estimates. 
29 If any single SNP were invalid, there would likely be distortion in the distribution of the causal effect estimates. Fourth, in all univariable analyses, we assessed whether causal estimates from different genetic variants were comparable (i.e. heterogeneity) using Cochran's Q statistic. 16 Considerable heterogeneity would imply that the MR assumptions may not be valid for all the variants included in the analysis. Finally, funnel plots were generated to enable the visual assessment of the extent to which pleiotropy is balanced across the set of instruments used in each analysis. Symmetry in these plots provides evidence against directional pleiotropy.
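Two of the sensitivity analyses described above, MR-Egger regression and Cochran's Q, can be sketched from summary statistics as follows. This is a simplified illustration that uses first-order weights (the inverse variance of the SNP-outcome associations) and orients every SNP so that its association with the exposure is positive before fitting the Egger intercept. Function names and toy values are hypothetical and are not the exact implementation used in this study.

```python
import numpy as np


def mr_egger(beta_exp, beta_out, se_out):
    """Weighted regression WITH an intercept; the intercept estimates the
    average pleiotropic effect, the slope the pleiotropy-adjusted effect."""
    bx = np.asarray(beta_exp, dtype=float)
    by = np.asarray(beta_out, dtype=float)
    sign = np.where(bx < 0, -1.0, 1.0)      # orient SNP-exposure effects positive
    bx, by = bx * sign, by * sign
    sqrt_w = 1.0 / np.asarray(se_out, dtype=float)
    design = np.column_stack([np.ones_like(bx), bx])
    coef, *_ = np.linalg.lstsq(design * sqrt_w[:, None], by * sqrt_w, rcond=None)
    return float(coef[0]), float(coef[1])   # (intercept, slope)


def cochran_q(beta_exp, beta_out, se_out):
    """Cochran's Q: weighted squared deviations of per-SNP ratio estimates
    from their inverse-variance-weighted mean."""
    bx = np.asarray(beta_exp, dtype=float)
    by = np.asarray(beta_out, dtype=float)
    w = bx ** 2 / np.asarray(se_out, dtype=float) ** 2
    ratio = by / bx
    pooled = np.sum(w * ratio) / np.sum(w)
    return float(np.sum(w * (ratio - pooled) ** 2))


# Toy data with a constant pleiotropic offset of 0.02 and a true slope of 0.5.
toy_bx = np.array([0.1, 0.2, 0.3, 0.4])
toy_se = np.full(4, 0.01)
egger_intercept, egger_slope = mr_egger(toy_bx, 0.02 + 0.5 * toy_bx, toy_se)
q_no_heterogeneity = cochran_q(toy_bx, 0.5 * toy_bx, toy_se)
print(egger_intercept, egger_slope, q_no_heterogeneity)
```

An intercept away from zero suggests directional pleiotropy, whereas a large Q relative to its degrees of freedom (number of SNPs minus one) suggests that the per-SNP estimates are more variable than sampling error alone would predict.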

Bidirectional effects of intelligence on educational attainment and their influences on AD risk
Using 180 and 148 genetic instruments for intelligence and educational attainment, respectively (and no overlapping SNPs), we found strong evidence of causal effects both of intelligence on educational attainment, and of educational attainment on intelligence (Table 1). However, the magnitude of the effect was over two-fold greater for educational attainment on intelligence compared with intelligence on educational attainment. Per SD increase in intelligence (i.e. per 15 points on the intelligence test), years of schooling increased by 0.51 SD [95% confidence interval (CI): 0.49 to 0.54]. Per SD increase in years of schooling (i.e. per 3.6 years of schooling), intelligence increased by 1.04 SD (95% CI: 0.99 to 1.10).
The main IVW regression using all SNPs from the educational attainment GWAS showed that, with each SD more years of schooling (i.e. 3.6 years), the odds of AD were, on average, 37% lower (95% CI: 23-49%). Per one SD higher intelligence test score, the odds of AD were, on average, 35% lower (95% CI: 25-43%, Fig. 2 and Table  S3, available as Supplementary data at IJE online).

Multivariable analysis of education and intelligence on AD
When both intelligence and educational attainment were included within a single multivariable model, there was little evidence of an effect of educational attainment on AD risk, independent of intelligence (Fig. 2 and Table S3, available as Supplementary data at IJE online; OR for the effect of a one SD increase in years of schooling on AD = 1.15, 95% CI: 0.68-1.93). There was, however, evidence that higher intelligence lowers AD risk, independently of educational attainment. On average, after accounting for educational attainment, odds of AD were 38% lower (95% CI: 12-56%) per one SD higher intelligence test score (Fig. 2 and Table S3, available as Supplementary data at IJE online).
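Because results are reported both as odds ratios and as percentage reductions in odds, the following small sketch shows the conversion between the two, assuming the usual relationship percentage lower odds = (1 − OR) × 100. The function name is hypothetical; the inputs are chosen to correspond to the 38% (12-56%) reduction quoted above.

```python
def percent_lower_odds(odds_ratio: float) -> float:
    """Convert an odds ratio below 1 into a percentage reduction in odds."""
    return (1.0 - odds_ratio) * 100.0


# An OR of 0.62 with confidence limits 0.44 and 0.88 corresponds to
# 38% lower odds, with limits of 56% and 12% lower odds.
point = percent_lower_odds(0.62)
strongest = percent_lower_odds(0.44)
weakest = percent_lower_odds(0.88)
print(round(point), round(weakest), round(strongest))
```

Note that the lower confidence limit of the OR maps to the larger percentage reduction, so the order of the limits reverses on conversion.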

Sensitivity analyses
The Steiger filtering provided evidence that all intelligence SNPs explained more variance in intelligence than in educational attainment, suggesting they were all in the correct causal direction (i.e. from intelligence to education). However, there was evidence that 125 (85%) of the 148 education SNPs explained more variance in intelligence than in educational attainment, suggesting that for these SNPs the hypothesized causal direction is incorrect and is more likely to go from intelligence to education. This left 23 education SNPs. When using only these 23 education SNPs, there was still strong evidence of a causal effect of educational attainment on intelligence (standardized β = 0.57, 95% CI: 0.48-0.66, Table S4, available as Supplementary data at IJE online), but the magnitude attenuated so that it was comparable to the effect of intelligence on educational attainment (as opposed to the main analysis, which showed a >2-fold greater magnitude of effect for education on intelligence than vice versa). There was some evidence of horizontal pleiotropy only in the estimate of the total effect of intelligence on AD risk (Tables S3 and S5, available as Supplementary data at IJE online). However, for all univariable and multivariable analyses (including the bidirectional effects of intelligence on educational attainment), MR-Egger effect estimates adjusting for pleiotropy were consistently comparable to those from the IVW regressions (Tables S3 and S5, available as Supplementary data at IJE online). As expected, the standard errors were much larger for MR-Egger estimates, because MR-Egger regression provides estimates of two parameters (i.e. both an intercept and a slope) compared with the single parameter in the IVW regressions (i.e. only the slope). The MR-Egger estimate for the total effect of intelligence on AD risk went in the opposite direction to the IVW estimate (i.e. greater rather than lower odds of AD per SD increase in the intelligence score); however, the CIs were very wide, and the effect estimate could plausibly go in either direction (OR: 1.36, 95% CI: 0.75, 2.48). There was no distortion in the leave-one-out plots for univariable analyses (Figures S1-S4, available as Supplementary data at IJE online), suggesting that no single SNP was driving the observed effect from any analysis. There was evidence of heterogeneity in the causal effect estimates from all analyses (Tables S3 and S5, available as Supplementary data at IJE online). However, provided the pleiotropic effects of genetic variants are equally likely to be positive or negative (i.e. no directional pleiotropy), the overall causal estimate based on all genetic variants is likely to be unbiased and the funnel plots showed little evidence of departure from symmetry (Supplemental Figures S5-S8, available as Supplementary data at IJE online).

Bidirectional causal effects in the relationship between educational attainment and intelligence
In this study we examined the bidirectional effects of intelligence on educational attainment. We found that the relationship between intelligence and educational attainment is indeed likely to be bidirectional in nature (i.e. there is evidence of an effect in both directions), with the magnitude of effect being similar in both directions. A recent meta-analysis of quasi-experimental studies of educational effects on intelligence provides evidence that supports our MR findings. Across 142 effect sizes from 42 data sets involving over 600 000 participants, the authors reported consistent evidence for beneficial effects of education on cognitive abilities of 1-5 IQ points (contingent on study design, inclusion of moderators and publication-bias correction) for an additional year of education. 11 These findings are similar to ours with respect to magnitude of effect. Assuming an SD of 15 for IQ (as described in the meta-analysis 11 ), intelligence was, on average, up to one-third of an SD higher per year of schooling. In our study we show an average of 0.57 SD higher intelligence per SD (3.6 years) increase in years of schooling, which equates to 0.16 SD higher intelligence per one additional year of schooling. It is worth noting that in the quasi-experimental policy reform studies, levels of prior intelligence (or underlying general cognitive ability) will be similar among individuals who left school before and after the policy reforms, making confounding by prior intelligence unlikely. Similarly, in the MR analyses, we endeavoured to exclude any SNPs for education for which there was evidence that they explained more variance in intelligence than education, making it unlikely that our findings for the effect of education on intelligence are a result of all genetic instruments being associated with intelligence and not educational attainment. Thus, both genetic and non-genetic instruments (which contain different sources of bias) provide consistent evidence that educational attainment affects later intelligence.
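The unit conversion behind this comparison can be written out explicitly. The short sketch below rescales the estimate of 0.57 SD of intelligence per SD of schooling to a per-year effect, and then to IQ points assuming an SD of 15; the variable names are illustrative only.

```python
# Values taken from the text above.
effect_sd_per_sd_schooling = 0.57   # SD of intelligence per SD of schooling
sd_schooling_years = 3.6            # one SD of schooling, in years
sd_iq_points = 15                   # one SD of intelligence, in IQ points

# Rescale to a per-year effect, first in SD units and then in IQ points.
effect_sd_per_year = effect_sd_per_sd_schooling / sd_schooling_years
effect_iq_points_per_year = effect_sd_per_year * sd_iq_points

print(f"{effect_sd_per_year:.2f} SD per year")
print(f"{effect_iq_points_per_year:.1f} IQ points per year")
```

On this scale the MR estimate corresponds to roughly 2.4 IQ points per additional year of schooling, which lies within the range of 1-5 IQ points reported in the meta-analysis.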
Longitudinal observational studies have previously reported associations between early-life intelligence and educational attainment. 8 However, we are unaware of any longitudinal studies that have compared the magnitude of effect for baseline intelligence on educational attainment, with educational attainment on subsequent intelligence in the same sample. One previous study has examined the association between education and lifetime cognitive change after controlling for childhood IQ. Authors reported that (after controlling for childhood IQ score) education was positively associated with IQ at ages 70 and 79 (with the two outcome ages being in different samples), and more strongly for participants with lower initial IQ scores. Education, however, showed no significant association with processing speed, measured at ages 70 and 83 (again, with the two ages being in different samples). 30 Another study examined associations between father's occupation, childhood cognition, educational attainment, own occupation in the 3rd decade, and self-reported literacy and numeracy problems in the 4th decade in 1946 and 1958 Birth Cohorts. 31 The authors report inverse associations between childhood cognition, educational attainment and adult literacy and numeracy problems. Some studies have looked at genetic overlap between the two traits 20,32 and reported correlations of up to 0.7 20,33 but to date, none have explicitly tried to examine the direction of the association using genetic variants that are associated with each of them. As mentioned previously, the largest and most robust evidence to date comes from a recent meta-analysis of quasi-experimental studies of educational effects on intelligence. 11

Effects of educational attainment and intelligence on AD risk
We also examined the total effects of education and intelligence on AD risk, and the effects of each exposure on AD risk independent of the other exposure. Our findings imply that the existing associations reported in the literature between greater educational attainment and lower AD risk are likely to be largely driven by intelligence, rather than there being an independent protective effect of staying in school for longer. This provides evidence against the underlying models illustrated in Fig. 1b, d, f and h (i.e. models in which there is an independent effect of educational attainment on AD risk). There are then four main possible explanations for our finding. The first is that prior intelligence is a confounder and induces a spurious association between education and AD risk (i.e. Fig. 1a). However, given the evidence supporting an effect of education on later intelligence from instrumental variable analyses using policy reforms to increase the school leaving age (in which prior intelligence is randomly distributed among instrument arms and thereby cannot confound), the model in Fig. 1a is unlikely. The second and third explanations relate to horizontal pleiotropy (either a pathway through IQ as in Fig. 1e, or independently affecting all traits as in Fig. 1g). Given that our causal effect estimates were comparable when using methods to quantify and adjust for horizontal pleiotropy, these models are also unlikely to fully explain our findings. The fourth explanation is that there is an effect of educational attainment on AD risk, but it is largely mediated by its effects on later intelligence (i.e. Fig. 1c). Given the existing evidence supporting an effect of education on later intelligence from quasi-experimental studies, 11 and from our own MR analyses, this explanation seems most plausible.
Together, these findings suggest that increasing educational attainment (for example, by increasing years of schooling) may have beneficial consequences for future AD incidence. As such, they offer support to the most recent change in school policy in the UK (in 2013), which requires young people to remain in at least part-time education until age 18 years (as opposed to 16 years). Our findings also suggest that there may potentially be other ways of reducing AD risk by improving various aspects of intelligence (e.g. with cognitive training), which may be particularly effective in those with lower educational attainment or in populations where increasing years of schooling is not feasible (e.g. older populations). However, it is worth noting that it is not clear what type of training (if any) would be beneficial or when in the life course (and indeed disease course) such training would confer protection.

Limitations
There are a number of limitations to our study. Firstly, in two-sample MR, 'winner's curse' (i.e. where the effect sizes of variants identified within a single sample are likely to be larger than in the overall population, even if they are truly associated with the exposure) can bias causal estimates towards the null. However, we used SNPs identified in the meta-analysis of the discovery and replication samples of the educational attainment GWAS, 19 making it unlikely that the estimate of the independent effect of education is biased to the null. Secondly, in the presence of weak instruments (i.e. SNPs that are not associated with the exposure at the genome-wide significance level), sample overlap in two-sample MR can bias estimates towards the confounded observational estimate. 34 There were no overlapping samples in the analysis of educational attainment and intelligence on AD risk, but there was considerable overlap in the samples used for the bidirectional educational attainment on intelligence analysis. Given that all instruments used in the analysis were strong (associated with the exposure at P < 5 × 10⁻⁸), any bias should be minimal. Thirdly, it is currently not possible to estimate the F statistic (a measure of instrument strength) for multivariable MR in a two- or three-sample setting. Thus, we are unable to assess the conditional strength of our instruments for each exposure, once the SNP effect on the other exposure is taken into account. 13 Fourth, the estimated effect of an exposure on an outcome, when both are associated with mortality, may be susceptible to survival bias. 35 For example, if individuals with lower educational attainment are more likely to die before the age of onset of AD, bias may occur because those individuals with a genetic predisposition for higher educational attainment are likely to live longer, thus having greater risk of being diagnosed with AD.
This may induce a non-zero causal effect estimate even if no true biological association exists. In a previous study, we performed simulations to investigate whether our estimates of the effect of educational attainment on AD risk may be biased by survival and found no evidence to suggest this was the case. 5 Fifth, the phenotype used in the GWAS of intelligence was typically brief (a 2-min, 13-item test) and heterogeneous. Thus, results may be different if a more precise measure of intelligence were available for GWAS studies. Finally, the educational attainment GWAS only assessed years of full-time academic training from primary education through to advanced qualifications (e.g. degree). Therefore, it remains unclear whether the same genetic variants would be associated with other aspects of education (for example, vocational courses or completing part-time as opposed to full-time courses). It is worth noting that there is a larger GWAS of family history of Alzheimer's disease. 36 In that study, the phase 3 GWAS meta-analysis includes only AD-by-proxy cases from the UK Biobank (i.e. there are no diagnosed cases). AD-by-proxy cases were defined as a positive response to the question 'Has your mother or father ever suffered from Alzheimer's disease/dementia'. We had several concerns about using these data for MR analysis. Firstly, participants defined as cases have not themselves been diagnosed with AD. Secondly, the question does not specify Alzheimer's disease but asks about any form of dementia. Lastly, the question does not ask if family members were diagnosed by a doctor. These issues are likely to introduce measurement error in the outcome, which may offset any power gained by the increased sample size of that GWAS, over the IGAP GWAS used in our MR analyses. 24 Given that we have sufficient power to test our hypotheses, we opted to use the IGAP GWAS which, although it has a smaller sample size, has a more precise phenotype.

Conclusions
Our findings imply that there is a bidirectional effect of intelligence on educational attainment and that the magnitude of effect is likely to be similar in both directions. There is robust evidence for an independent, causal effect of intelligence in reducing AD risk. The implications of this are uncertain, but it potentially increases support for a role of cognitive training interventions to improve various aspects of fluid intelligence. However, given that greater educational attainment also increases intelligence, there is potentially also support for policies aimed at increasing length of schooling in order to lower incidence of AD.

Supplementary data
Supplementary data are available at IJE online. All authors provided critical comments on the manuscript. E.L.A. is the guarantor and accepts full responsibility for the work and/or the conduct of the study, had access to the data, and controlled the decision to publish. The corresponding author attests that all listed authors meet authorship criteria and that no others meeting the criteria have been omitted.

Conflict of interest:
The authors have no competing interests to disclose. All authors have completed the ICMJE uniform disclosure form at www.icmje.org/coi_disclosure.pdf and declare: no support from any organisation for the submitted work; no financial relationships with any organizations that might have an interest in the submitted work in the previous three years; no other relationships or activities that could appear to have influenced the submitted work.