Exploring a causal role of DNA methylation in the relationship between maternal vitamin B12 during pregnancy and child’s IQ at age 8, cognitive performance and educational attainment: a two-step Mendelian randomization study

Abstract An adequate intake of vitamin B12 during pregnancy plays an important role in offspring neurodevelopment, potentially via epigenetic processes. We used a two-step Mendelian randomization approach to assess whether DNA methylation plays a mediating and causal role in associations between maternal vitamin B12 status and offspring’s cognition. Firstly, we estimated the causal effect of maternal vitamin B12 levels on cord blood DNA methylation using the maternal FUT2 genotypes rs492602:A > G and rs1047781:A > T as proxies for circulating vitamin B12 levels in the Avon Longitudinal Study of Parents and Children (ALSPAC) and we tested the observed associations in a replication cohort. Secondly, we estimated the causal effect of DNA methylation on IQ using the offspring genotype at sites close to the methylated CpG site as a proxy for DNA methylation in ALSPAC and in a replication sample. The first step Mendelian randomization estimated that maternal vitamin B12 had a small causal effect on DNA methylation in offspring at three CpG sites, which was replicated for one of the sites. The second step Mendelian randomization found weak evidence of a causal effect of DNA methylation at two of these sites on childhood performance IQ which was replicated for one of the sites. The findings support a causal effect of maternal vitamin B12 levels on cord blood DNA methylation, and a causal effect of vitamin B12-responsive DNA methylation changes on children’s cognition. Some limitations were identified and future studies using a similar approach should aim to overcome such issues.


Introduction
Components of one carbon metabolism, which includes folate, and several B vitamins, play an important role in prenatal nutrition and have been implicated in a range of neurodevelopmental disorders in offspring (1). Periconceptional folic acid or multivitamin supplementation has been shown to protect offspring against neural tube defects (2-6) and more recently folic acid supplementation during pregnancy has been linked to reduced risk of severe language delay (7) and autism in children (8). Adequate circulating vitamin B 12 levels during pregnancy are also associated with a decreased risk of neural tube defects (9) and there is evidence that maternal vitamin B 12 status during pregnancy is associated with offspring cognition (10)(11)(12)(13).
The mechanisms through which sub-optimal levels of circulating vitamin B 12 exert unfavourable effects during pregnancy are not fully understood. One possible pathway is via the role of vitamin B 12 in one carbon metabolism and the donation of methyl groups for the methylation of a range of biological molecules including DNA. Profound changes in DNA methylation which occur in early development in the brain and nervous system may be particularly sensitive to vitamin B 12 availability during foetal life. Emerging evidence suggests that maternal prenatal vitamin B 12 status influences DNA methylation at the insulin-like growth factor-II locus and at a global level (14)(15)(16).
A recurring problem in addressing questions of maternal prenatal nutrition and offspring outcomes is distinguishing causal relationships in the face of multiple potential confounding factors. A Mendelian randomization (MR) approach has been widely applied in an attempt to strengthen causal inference and circumvent the issue of confounding in such studies (13,(17)(18)(19)(20)(21)(22)(23). In a Mendelian randomization framework, a genetic variant is used as a proxy for the exposure of interest. This approach can be implemented when considering vitamin B 12 as an exposure, as genetic variants have recently been reported that are robustly associated with vitamin B 12 levels. For example, genome-wide association studies have identified common genetic variants in the FUT2 gene to be associated with serum vitamin B 12 levels in individuals with European ancestry (24)(25)(26)(27) and in a Chinese population (28). Two FUT2 SNPs that are strongly associated with vitamin B 12 levels are rs492602:A > G and rs1047781:A > T. In a previous study (13), maternal genotype at the common single nucleotide polymorphism (SNP) rs492602:A > G in the FUT2 gene was found to be associated with offspring's IQ at age 8. Since maternal FUT2 genotype is not associated with socioeconomic confounders that might affect maternal vitamin B 12 levels and intake, these findings suggest that maternal vitamin B 12 levels are causally related to children's IQ but have little effect. These findings were not confirmed in more recent studies (29,30) and the intermediate effect of DNA methylation has not been previously examined.
An extension of Mendelian randomization to investigate mediation has been described (17,31), and the particular application to epigenetic studies elaborated upon (17). The aim of this method is to assess whether DNA methylation plays a mediating and causal role in linking an exposure to an outcome. In 'two-step epigenetic Mendelian randomization' a SNP is used to proxy for the exposure of interest but rather than explore its association with the phenotype of interest (here child IQ), DNA methylation is considered as an outcome in the first step. If DNA methylation variation is associated with the exposure, then a second Mendelian randomization step can be applied using an SNP that proxies for methylation levels at the site modified by the exposure of interest, and its association with the interrogated exposure. The process is summarised in Figure 1. The rationale for this study is twofold. Firstly, based on indications in the literature that maternal vitamin B 12 levels during pregnancy affect cognitive neurodevelopment, we wanted to address the biological question of whether epigenetic changes play a role in this association. Secondly, we wanted to explore and develop the recently established Mendelian randomization methodology and apply it to a novel setting within the context of mediation.
In this study, we investigated the potential mediating role of DNA methylation in the observed relationship between maternal prenatal vitamin B 12 levels and offspring IQ using a two-step epigenetic Mendelian randomization (MR) approach (17). In the first step, we used maternal genetic variants at the FUT2 gene as proxies for in utero vitamin B 12 exposure on offspring DNA methylation at the genome-wide level. Site-specific differences in DNA methylation (CpG sites) identified in step 1 were taken forward to step 2; in the second step we used offspring SNPs near the CpGs of interest as proxies for offspring DNA methylation, and used these to estimate the causal effect of DNA methylation in children on their own IQ. If DNA methylation mediates an association between prenatal vitamin B 12 exposure and IQ, we should observe firstly that the maternal FUT2 genetic variation is associated with offspring DNA methylation and secondly that SNPs that proxy for the associated methylation sites are associated with childhood IQ. The SNP effect on both the exposure and the outcome might indicate causality or a pleiotropic effect (the outcome and the exposure are affected by the same SNP) as elucidated recently (22,32,33). Ideally, the outcomes of both MR steps should be justified a-priori or based on analyses in a separate dataset to avoid bias. However, because available datasets are lacking and no previous studies have investigated DNA methylation as a mediator on the causal pathway between maternal FUT2 genotype and childhood IQ, we performed both steps in subgroups of the same study population and tried to replicate the first and second step in separate studies. This work has also highlighted other challenges associated with two-step Mendelian randomization that future studies should consider and attempt to overcome.

Results
A flow-chart summarizing the two-steps diagram including the datasets used and relevant results tables is presented in Figure 2.

Study sample characteristics
We performed the first and second step MR in the Avon Longitudinal Study of Parents and Children (ALSPAC) study (34,35). From the original eligible sample, only subsets of subjects had the data needed for each analysis step (mother and child genetic, epigenetic, IQ). The numbers of subjects included in each different analysis are shown in the corresponding result tables. Table 1 shows the characteristics of the study population and of the subsamples used in the 1 st -step MR and the 2 nd -step MR. The children that were participants in the first step MR analysis had slightly older and better educated mothers who smoked less than the children in the second step MR analysis. The children of the first step MR analysis also were tested at a slightly younger age on average and had higher IQ scores. Mother's BMI, parity and child's sex did not seem to differ between the two study samples.
The replication of the first step was carried out in the Genetics of Overweight Young Adults (GOYA) study. The GOYA study was similar to the ALSPAC in terms of age of delivery, child's sex and parity. However, BMI was higher, and there was a higher proportion of pregnancy smokers and highly educated mothers. The characteristics of the study populations are summarized in Table 1. 1 st -step MR. The association of maternal FUT2 genotype with cord blood DNA methylation was explored by running a methylome-wide association study (MWAS) in ALSPAC. The results are presented in Figure 3 and the top 20 CpGs with the lowest P-values are reported in Tables 2 and 3. The MWAS for mother's rs492602:A > G genotype (genomic inflation factor k ¼ 0.969) did not show any CpG site where offspring DNA methylation was associated with the maternal FUT2 genotype at FDR 0.05. The rs1047781:A > T MWAS (k ¼ 0.802) revealed three CpGs where offspring DNA methylation was associated with the maternal FUT2 genotype at FDR 0.05. Maternal FUT2 genotype was associated with lower DNA methylation at cg23332223 in an intergenic region upstream the gene USP29 and it was associated with higher methylation at cg10543947 and cg15676719, at the transcription start sites of the genes APOL2 and RCSD1. These three CpG sites were carried forward in the first step MR analysis. Step 1 Step 2 CONFOUNDERS Figure 1. Diagram showing the two-step Mendelian randomization approach used in this study.
Step 1: The maternal FUT2 genotype is used as instrumental variable (IV) for circulating vitamin B12 levels in pregnancy (exposure) as its effect on cord blood CpG methylation (outcome) is not vulnerable to confounders.
Step 2: The offspring genotype at a cis-SNP around the CpG whose methylation is affected by maternal FUT2 genotype is used as IV for CpG methylation to estimate the effect of cord blood methylation (exposure) on childhood IQ (outcome) as it is free from confounder biases.
Methylome-wide association study for maternal FUT2 in ALSPAC-ARIES cord blood 450K data (N=641): 3 CpG sites at FDR<0.05 (Tables 2 and 3, Figure 3) Cis-SNP discovery from publicly available ALSPAC-ARIES data (N=771): 2 independent SNPs for 2 CpG sites at p<1 x 10 -7 (Table 5) Two-sample IV analysis ( Two-sample IV analysis (Table 4) Replication of 2 CpG sites using publicly available SSGAC data in MR-base Two-sample IV analysis (Table 7) Observational analysis in ALSPAC-ARIES (N=611-608) ( Table 6) The two-sample MR analysis for the first step was performed using the genotype-exposure estimates from the GWAS study on vitamin B 12 levels previously published by Lin et al. (28). Specifically, beta coefficient ¼ 70.21 (pg/ml) and standard error ¼ 5.53 (pg/ml) were used. The results are reported in Table 4. In line with the above results, maternal vitamin B 12 levels were associated with decreased cord blood methylation upstream the gene USP29, and with higher cord blood methylation at APOL2 and RCSD1. Our analysis estimated that 1 pg/ml higher maternal vitamin B 12 increased or decreased cord blood methylation by less than 1% at each site investigated.
In the GOYA cohort, the association of maternal FUT2 genotype at rs1047781:A > T was in the same direction for all the three CpGs investigated. There was evidence of replication for cg23332223 (P-value < Bonferroni 0.05) as shown in Table 4.
2 nd -step MR. The second step MR was carried out in the ALSPAC and using summary data from the Social Science Genetic Association Consortium (SSGAC) (36)(37)(38). In order to perform the second step MR we looked in the ARIES mQTL database for cis-SNPs for the vitamin B 12 -responsive CpG sites discovered in the first step. At birth, there were no SNPs associated with cg23332223 DNA methylation, therefore we could not perform the second step MR for this CpG site. There were six SNPs associated with DNA methylation at cg10543947, which were not independent and, amongst these, only rs5750236:C > T had data in the SSGAC database, therefore we used rs5750236:C > T as IV in the 2 nd -step MR for this CpG. For cg15676719, 91 SNPs were associated with DNA methylation, and there was only one independent cis-SNP, rs1890131:C > T, which was therefore taken forward as IV in the 2nd step MR. The characteristics of the chosen cis-SNP are reported in Table 5. Table 6 shows the results of the associations of cis-SNP genotype and DNA methylation at corresponding CpG sites with children's IQ in the ALSPAC sample after excluding the samples  Study of Parents and Children (ALSPAC), adjusted for batch, cell composition, child's genotype, mother's body mass index (BMI), mother's education, mother's age at conception, smoking during pregnancy and parity.  with methylation data as they were used for the discovery of cis-SNPs associations with methylation. The estimates from the conventional (non-IV) analyses between DNA methylation and IQ had very large standard errors and high P-values. However, using MR, there was some evidence for a minimal causal effect of DNA methylation on increasing performance IQ at APOL2 and decreasing overall IQ and performance IQ at RCSD1. The two-sample analysis conducted in MR-base using the SSGAC database (Table 7) provided some evidence for an increase in childhood intelligence via DNA methylation at the APOL2 gene. Neither methylation site was associated with cognitive/educational outcomes in the larger study. Childhood intelligence data for RCSD1 were not available.

Functional characterization of the methylation sites
The relatively novel USP29 gene codes for a ubiquitin-specific processing protease and it is involved in ubiquitin-dependent protein catabolic processes and thiol-dependent ubiquitinyl hydrolase activity. Some evidence show a role in the repair of DNA damage and cell survival (39). It is expressed mainly in the testis, while virtually no expression was found in the brain. An association with urate levels in European lean men is reported in the GWAS Catalog (40). The APOL2 gene encodes for a cytoplasmatic protein that is involved in lipid transport and lipoprotein metabolic processes. It is expressed in most tissues, including the brain and the pituitary gland. It is upregulated in the prefrontal cortex of schizophrenic patients (41), and the GWAS Catalog reports an association with non-diabetic endstage renal disease in African Americans (42). The RCSD1 gene encodes for a protein involved in actin filament binding. It was found to be expressed at minimal levels in the brain, while it was highly expressed in EBV-transformed lymphocytes, lung, muscle, small intestine, spleen and whole blood. No associations were found in OMIM or in the GWAS Catalog.

Hypothesis-free MR analysis of other potential health outcomes
Within MR-Base we performed a hypothesis-free MR analysis of the association of DNA methylation with the disease outcomes. For the APOL2 variant there were 32 diseases with available

Discussion
Increasing evidence suggests that DNA methylation is an intermediate link between prenatal malnutrition and offspring's neurocognitive development (1,9,10,12,29,43,44). We used a two-step Mendelian randomization (MR) approach to investigate whether prenatal exposure to maternal vitamin B 12 levels is causally linked to offspring's IQ at age 8 via changes in offspring's DNA methylation. In the first MR step, we used maternal genotype in the FUT2 gene at rs492602:A > G and at rs1047781:A > T two genetic variants that are strongly associated with vitamin B 12 levels to estimate the causal association with DNA methylation. Our IV analysis suggests that maternal vitamin B 12 status is causally associated with small differences in DNA methylation in the cord blood of offspring. In the second MR step, we used the offspring genotype at cis-SNPs associated with DNA methylation at the three CpG sites most strongly associated with maternal vitamin B 12 (identified in the first step) and estimated the causal association between DNA methylation and IQ. The causal estimates for this association were positive at one site and negative at another site, therefore not providing enough evidence for a positive effect of increasing vitamin B 12 levels prenatally on offspring intelligence. Some caution is warranted because the CpG sites taken forward to the second step were selected amongst those that were associated with the instrument from the first step (maternal FUT2 genotype), therefore a causal effect on DNA methylation was expected for these sites and the causal effect estimates for the first step might be overestimated through this procedure. The replication cohort provided more evidence for the associations found in the first step, at least for one site, but we could not take forward this site to the second step. Moreover, the effect sizes were much smaller than in the discovery cohort. We tried to replicate the findings of the second step using two-sample MR in MR-base where we used GWAS summary data on childhood intelligence, cognitive performance and years of schooling and we replicated a positive effect of DNA methylation at the APOL2 gene on childhood intelligence, while we could not replicate the negative effect at the RCSD1 gene. Overall, our study suggests a small positive effect of prenatal vitamin B 12 on child's IQ through DNA methylation. Moreover, DNA methylation at vitamin B 12 -responsive genes seemed to have an effect on other disease outcomes as ulcerative colitis and irritable bowel disease, although these results need further investigation. The approach taken in this study highlighted several other shortcomings that are discussed below and should be considered when performing twostep Mendelian randomization in future studies. A possible confounder in Mendelian randomization is population stratification, however the ALSPAC study is mostly representative of white mother-child pairs since non-white mothers constitute only 2.2% of the ALSPAC sample (33) and genotyping and epigenotyping has been undertaken on mainly white mother-child pairs. Moreover, we excluded mothers and children that were classified "non-white". The drawback of this approach is that the results are not generalizable to non-white or more mixed populations.
One typical limitation to consider in Mendelian randomization studies is often the low power due to a small genetic effect on the exposure. This is even more of a consideration in two step MR where numbers need to be an order of magnitude greater to account for the power needed at each stage. In the first step, our study had 80% power to detect at true effect size f 2 ¼0.01 of maternal FUT2 on DNA methylation at a ¼ 0.05/485000 and n ¼ 641. The first step two-sample analysis in ALSPAC had 80% power to detect a true standardized causal effect b ¼ 0.52 at R 2 ¼0.05, a ¼ 0.05 and n ¼ 641 (online calculator https://sb452.shi nyapps.io/power/; date last accessed May 03, 2017) (45). Our findings in ALSPAC were still largely underpowered due to the low minor allele frequency of FUT2 genotype and due to the small effect sizes therefore we tried to overcome this by running the 1 st step MR in a separate cohort, where we had 80% power to detect a true effect b ¼ 0.42 at R 2 ¼0.05, a ¼ 0.05 and n ¼ 916. Since we focused our investigation on the top CpG sites resulting from the methylome-wide association analysis, we may have missed sites with weaker association with FUT2 that could potentially still be associated with IQ. The 450K data set contains probes for only around 1% of the CpGs that are potentially methylated in the whole genome. However, more informative technologies such as sequencing are not yet affordable for large sample sizes from population studies. In the second step, we had 80% power to detect cis-SNPs for cord blood DNA methylation at a true effect size f 2 ¼0.01 at a ¼ 0.05/6000000 and n ¼ 770. Using those cis-SNPs in the two-sample MR allowed us to detect true standardized causal effects b ¼ 0.45 at 80% power, a ¼ 0.05 and n ¼ 3843. Therefore, we tested the 2 nd step MR additionally in a larger consortium, where we had 80% power to detect a causal effect b ¼ 0.025 at a ¼ 0.05 and n ¼ 12441. Moreover, only two out of three CpG sites had independent SNPs and therefore there is a need to identify cis-SNPs in larger datasets. Another limitation of Mendelian randomization relevant to our study is the potential for confounding due to pleiotropy and linkage disequilibrium (LD). An on-line search on the GWAS Catalog (46) has revealed that FUT2 is associated with phenotypes such as obesity-related traits, Crohn's disease, cholesterol levels and bipolar disorder. These phenotypes might be downstream to the effects on vitamin B 12 levels (vertical pleiotropy) and therefore they might not represent a problem in the interpretation of the Mendelian randomization analysis. However, as the molecular biology of FUT2 is still unclear in relation to vitamin B 12 , we cannot exclude potential biological pathways that do not involve vitamin B 12 (horizontal pleiotropy). A hypothesis-free look-up in MR-base of all available traits and outcomes identified associations of rs492602:A > G FUT2 genotype with Crohn's disease, cholesterol levels, inflammatory bowel disease and type 1 diabetes, but no associations for rs1047781:A > T FUT2 genotype, the SNP on which our findings are based. A future study could combine multiple genetic variants to proxy for vitamin B 12 in order to overcome pleiotropy. In the second step, we used independent cis-SNPs for methylation, as these may be less likely to have pleiotropic effects and LDinduced confounding than trans-SNPs. However, the children's genotype at the methylation sites showed some weak association with phenotypes that are also affected by FUT2, suggesting some potential pleiotropy perhaps in relation to their own FUT2 genotype.
Although blood is relatively easy to access in large population studies, epigenetic marks are often tissue-specific, so stronger associations might have been found in central nervous system tissue. Although this is a potential limitation, obvious ethical and logistical reasons precluded us from using such tissue in this study and the relevance of these findings in blood must be carefully considered. Some studies have suggested little overlap between blood and brain DNA methylation patterns (47,48) and some associations between blood DNA methylation and brain-related processes, for instance with the functionality of the serotonin pathway (49), childhood physical aggressiveness (50), major depressive disorder (51), autistic spectrum disorder (52) and schizophrenia (53). However, the concordance is highly heterogeneous between different loci and can be minimal (54)(55)(56). Regarding our specific findings, a publicly available dataset (http://epigenetics.essex.ac.uk/bloodbrain/; date last accessed May 03, 2017) (47) show some correlation between adult brain and blood in two out of the three vitamin-B 12 responsive CpGs (Supplementary Material, Fig. S1), the sites that we took forward in step 2 MR (for cg15676719 the correlation is for one brain area only). At this stage, it is still not possible to know whether blood and brain correspondence translates into similar effects from the maternal exposure to vitamin B 12 and this issue clearly requires further investigation.
Finally, the two-sample IV approach can be cost-effective, especially when gene-exposure association estimates from large GWAS studies are available (45). However, it assumes that the two independent samples are comparable. The published study we used is based on Chinese men. The SNP rs1047781:A > T used as IV for prenatal vitamin B 12 exposure seems to be a specific SNP for Chinese populations (MAF ¼ 0.459) and it is rare in the ARIES (MAF ¼ 0.004) and in the GOYA mothers (MAF¼ .009) that participated in this study. The difference in the MAF between the two samples might create a bias in the IV estimate. Data on the association between vitamin B 12 and FUT2 genotype in pregnant women are lacking and there is to date no indication that hormonal changes affect the association between FUT2 genotype and vitamin B 12 levels. In a small subset of ALSPAC we found evidence for a positive association only for child's rs492602:A > G genotype, (Supplementary Material, Table S1) and it remained after adjusting for maternal genotype. Furthermore, although there is evidence for a correlation between maternal and cord vitamin B 12 status (57), there is also evidence suggesting that maternal vitamin B 12 deficiency might not translate into a foetal vitamin B 12 deficiency (58). Consequently, impaired vitamin B 12 levels in the mothers might affect foetal DNA methylation through mechanisms other than lowering foetal vitamin B 12 levels that have yet to be explored.
To our knowledge, this is the first report of a methylomewide study of the effects of maternal prenatal vitamin B 12 on cord blood DNA methylation. Previous studies showed that maternal vitamin B 12 levels were associated with cord blood hypomethylation of the promoter of the IGF2 gene (15) and overall hypomethylation of genomic DNA (14). Another study found no association with cord blood LINE-1 methylation (16). Other studies have found that maternal levels of other components of the one-carbon metabolism affect cord blood DNA methylation (43,59). Together with these other studies, our data suggest that during pregnancy an adequate intake of micronutrients involved in one-carbon pathway could affect DNA methylation. With regards to cognitive development, the association found in our study with APOL2 methylation in the blood might reflect changes in lipid transport and metabolism in the brain that in turn alter brain functioning and cognition (60). The hypothesisfree analysis performed to extend the question to a wider range of health consequences suggests causal effects beyond the cognitive phenotype.
In conclusion, we applied a two-step Mendelian randomization approach to investigate DNA methylation as a mediator between a prenatal exposure and a childhood outcome. We found some evidence of a causal effect of maternal vitamin B 12 levels on cord blood DNA methylation and little evidence of a causal effect of DNA methylation on childhood intelligence. This work has highlighted the various strengths and particular challenges associated with two-step Mendelian randomization and we hope that it will motivate further use of the approach in similar contexts where DNA methylation is hypothesised to mediate observed associations between a prenatal exposure and a laterlife phenotype.

ALSPAC data
The subjects for the main part of this study were participants of the ALSPAC (see for details (34,35). The participants included in the study were from the core sample, singletons, and of white ethnicity. Please note that the study website contains details of all the data that is available through a fully searchable data dictionary (http://www.bris.ac.uk/alspac/researchers/data-access/ data-dictionary/; date last accessed May 03, 2017). Ethical approval for the study was obtained from the ALSPAC Ethics and Law Committee and the Local Research Ethics Committees. Mothers' and children's genetic data at specified SNPs (see below) were extracted from the ALSPAC GWAS database (34).
Briefly, the genetic data for mothers (FUT2 genotype) were generated using the Illumina human660W-quad and the IlluminaGenomeStudio calling algorithm. Quality Control measures included the removal of SNPs with more than 5% of missingness, a Hardy-Weinberg-Equilibrium P-value lower than 10 À6 and a minor allele frequency on less than 1%. Samples with more than 5% missingness, indeterminate X chromosome heterozygosity or extreme autosomal heterozygosity were excluded. SNP imputation was carried out against the 1000 Genome Project database (www.1000genomes.org; date last accessed May 03, 2017) (61). Genetic data for the children were generated by Sample Logistics and Genotyping Facilities at the Wellcome Trust Sanger Institute and LabCorp (Laboratory Corportation of America) using support from 23andMe using the Illumina Human Hap 550-quad and the Illumina GenomeStudio calling algorithm. Quality control and imputation were performed as above.
Neonatal blood DNA methylation data were generated using the IlluminaHumanMethylation450 BeadChip as part of the ARIES project (http://www.ariesepigenomics.org.uk; date last accessed May 03, 2017 (62)), where DNA methylation was measured in white cells extracted from cord blood (82% of the subjects) and in cord blood spots (18% of the subjects) and using the Illumina Infinium 450K platform. Briefly, DNA samples were bisulphite converted using the Zymo EZ DNA methylationTM kit (Zymo, Irvine, CA). Following conversion genome-wide methylation was measured using the Illumina HumanMethylation450 (HM450K) BeadChip. The arrays were scanned using an Illumina iScan, with initial quality review using GenomeStudio. For each sample the proportion of DNA molecules methylated at each CpG site is represented as a beta-value (63). Samples with >20% probes with P-value > ¼ 0.01 were excluded from further analysis and scheduled for repeat assay. Genotype probes on the HM450k were compared between samples from the same individual and against SNP-chip data to identify and remove any sample mismatches. Methylation data were normalised with the wateRmelon R package (64), using the "Tost" algorithm to reduce the non-biological differences between probes (65). All the processing of the methylation data was done using the meffil R package (available at https://github.com/per ishky/meffil; date last accessed May 03, 2017).
After excluding multiple births, genetic data were available for n ¼ 8890 mothers and n ¼ 8871 children whereas epigenetic data were available for n ¼ 914 children. The complete-cases sample containing genetic (mother and children), epigenetic and covariate data used for the MWAS (first step MR) was n ¼ 641.
Childhood cognition was measured as the overall intelligence score (IQ) derived from a shortened version of the Wechsler Intelligence Scale for Children (WISC-III) that was administered to the children at approximately 8 years of age (66,67). The Verbal IQ score was computed using the age-scaled scores obtained by administering the Information, Similarities, Arithmetic, Vocabulary and Comprehension subtests. The Performance IQ score is composed of the age-scaled scores of Picture Completion, Coding, Picture Arrangement, Block Design and Object Assembly subtests. IQ data were available from n ¼ 5164 children. Amongst these, n ¼ 611 participants had epigenetic data and the relevant covariates and were used for the conventional (non-IV analysis), whereas n ¼ 3843 that did not include the ones with epigenetic data were used for the IV analysis (second step MR).

Replication data
The subjects for the replication study of the first step MR were participants in the GOYA Study which is described previously by Paternoster et al. (68). It is based on the Danish National Birth Cohort (69) and included 2451 women that were pregnant during 1996-2002, had given birth to a live born infant, had provided a blood sample during pregnancy and had a BMI that ranged from 32.6 to 64.4 and 2450 pregnant women during the same period that were chosen at random. Of these, 1,960 extremely overweight and 1,948 control women were genotyped using the Illumina Human610-Quad v1.0 BeadChip and passed quality control. Cord blood DNA methylation data were generated for the offspring of 1000 mothers in the GOYA study. Cord blood was collected according to standard procedures, spun and frozen at À80˚C. DNA methylation analysis and data preprocessing were performed at the University of Bristol using the same protocol as for the generation of the ARIES (ALSPAC) data. The complete-case sample was n ¼ 916.
The replication of the second step MR was carried out using summary data from published studies that were carried out by the Social Science Genetic Association Consortium (SSGAC) (36)(37)(38).

Two-step MR analyses
Two-step Mendelian randomization analysis ( Fig. 1) was performed using instrumental variable (IV) analysis. For each step, the IV estimator was calculated as the ratio between the b coefficient of the genotype-outcome regression and the b coefficient of the genotype-exposure regression. The standard error was calculated as reported in Thomas et al., 2007 (70).
1 st -step MR: Examining the causal effect of maternal vitamin B 12 on DNA methylation. We first evaluated the association between the genotype-outcome (FUT2-methylation) association in the ALSPAC-ARIES study by conducting a methylome-wide association study (MWAS) using the CpGassoc R package (71). Briefly, after removing probes on sex chromosomes and probes with Pvalue > 0.05, and after removing outliers, a linear regression model measured the additive effect of the minor allele in the maternal genotype at the rs492602:A > G SNP and at the rs1047781:A > T in the FUT2 gene on DNA methylation at 468,622 CpG sites in cord blood. Batch (10 surrogate variables), cell composition estimated for cord blood as in Bakulski et al., (72), child's genotype, mother's BMI, mother's education, mother's age at conception, smoking during pregnancy and parity were included in the model as covariates as they could potentially influence DNA methylation at birth and therefore they could potentially lead to errors in the phenotype assessment. The three CpG sites observed to be most strongly associated with maternal FUT2 based on the MWAS (false discovery rate, FDR, 0.05) were taken forward in the IV analysis, which estimated the causal exposure-outcome relationship, i.e. the causal effect of maternal vitamin B 12 levels on offspring DNA methylation. We then tested the top three FUT2-CpG associations using the genetic and epigenetic data from the GOYA cohort. For this analysis, we followed the same samples and probes exclusions and statistical models as for the ALSPAC cohort, including the same covariates.
Functional analyses for the methylation genome-wide significance hits were carried out using online tools. A search for functional annotation terms was performed using the Gene Ontology Consortium website (www.geneontology.org; date last accessed May 03, 2017) (73). Gene expression was examined in the Genotype-Tissue Expression (GTEx) portal version V6p (www.gtexportal.org; date last accessed May 03, 2017). A literature search for potential associations with health phenotypes was performed using the NHGRI-EBI GWAS Catalog of published genome-wide association studies (www.ebi.ac.uk/gwas/; date last accessed May 03, 2017) (46) and the Online Mendelian Inheritance in Man (OMIM) database (www.omim.org; date last accessed May 03, 2017).
2 nd -step MR: Examining the causal effect of DNA methylation on IQ. Firstly, conventional (non-IV) linear regression analysis was performed to assess the association between DNA methylation (the exposure) and IQ (the outcome) in the ALSPAC cohort. These analyses were carried out using beta-values for each of the CpG sites identified as associated with maternal FUT2 in step 1. Models were adjusted for age of testing batch (surrogate variables), mother's BMI, mother's age at delivery, mother's education, smoking during pregnancy, and parity.
Next, Mendelian randomization was used to estimate the causal association between DNA methylation and IQ. For the methylation probes discovered in step 1, cis-SNPs were identified in the mQTL database (http://mqtldb.org; date last accessed May 03, 2017) (74), which contains the associations between CPG methylation and SNPs observed in the ARIES project at P < 10 À7 . These were SNPs associated with CpG methylation and were located within 1 Mb either side of the CpG. Independent cis-SNPs, obtained through LD clumping, were selected as the instrumental variables for CpG methylation and the b coefficient and standard error for the cis-SNP-methylation association were used in the IV analysis. The sample size for this analysis was n ¼ 770.
For the genotype-outcome associations, IQ was regressed on the cis-SNPs in the ALSPAC cohort, excluding participants that were used to identify cis-SNPs, as it has been shown that choosing variants according to their strength of association with exposure in the data under analysis can give biased estimates (75,76). The models included age of testing as a covariate.
The 2 nd -step MR was also run in a separate study in order to investigate replication and benefit from the increased power of a larger study sample. A two-sample MR was carried out using MRbase (http://www.mrbase.org/beta/; date last accessed May 03, 2017), an online platform that contains data from 985 genomewide association studies and interfaces with the R package TwoSampleMR (https://github.com/MRCIEU/TwoSampleMR; date last accessed May 03, 2017). Cognitive and education genomewide association data were available from the Social Science Genetic Association Consortium (SSGAC) (36)(37)(38). We selected childhood intelligence, cognitive performance, college completion and years of schooling as outcomes.

Supplementary Material
Supplementary Material is available at HMG online.