DNA methylation mediates the effect of maternal smoking during pregnancy on birthweight of the offspring

Background: We examined whether the effect of maternal smoking during pregnancy on birthweight of the offspring was mediated by smoking-induced changes to DNA methylation in cord blood. Methods: First, we used cord blood of 129 Dutch children exposed to maternal smoking vs 126 unexposed to maternal and paternal smoking (53% male) participating in the GECKO Drenthe birth cohort. DNA methylation was measured using the Illumina HumanMethylation450 Beadchip. We performed an epigenome-wide association study for the association between maternal smoking and methylation followed by a mediation analysis of the top signals [false-discovery rate (FDR) < 0.05]. We adjusted both analyses for maternal age, education, pre-pregnancy BMI, offspring’s sex, gestational age and white blood cell composition. Secondly, in 175 exposed and 1248 unexposed newborns from two independent birth cohorts, we replicated and meta-analysed results of eight cytosine-phosphate-guanine (CpG) sites in the GFI1 gene, which showed the most robust mediation. Finally, we performed functional network and enrichment analysis. Results: We found 35 differentially methylated CpGs (FDR < 0.05) in newborns exposed vs unexposed to smoking, of which 23 survived Bonferroni correction (P < 1 × 10⁻⁷). These 23 CpGs mapped to eight genes: AHRR, GFI1, MYO1G, CYP1A1, NEUROG1, CNTNAP2, FRMD4A and LRP5. We observed partial confirmation as three of the eight CpGs in GFI1 replicated. These CpGs partly mediated the effect of maternal smoking on birthweight (Sobel P < 0.05) in meta-analysis of GECKO and the two replication cohorts. Differential methylation of these three GFI1 CpGs explained 12–19% of the 202 g lower birthweight in smoking mothers. Functional enrichment analysis pointed towards activation of cell-mediated immunity. Conclusions: Maternal smoking during pregnancy was associated with cord blood methylation differences. We observed a potentially mediating role of methylation in the association between maternal smoking during pregnancy and birthweight of the offspring. Functional network analysis suggested a role in activating the immune system.


Introduction
It is well known that maternal smoking during pregnancy can cause intrauterine growth restriction and low birthweight. [1][2][3][4] Low birthweight, in turn, has been associated with increased childhood growth and cardiometabolic problems in childhood and adulthood. 5,6 The development of chronic diseases in adulthood is therefore believed to start during pregnancy as a result of exposure to adverse intrauterine environments, also known as fetal programming. We hypothesized that the long-lasting effects of adverse fetal exposures (e.g. smoking) on birthweight and subsequent cardiometabolic risk are at least partly caused by DNA methylation. [7][8][9] Thus, maternal smoking during pregnancy may have adverse health consequences during the offspring's entire life course via DNA methylation.
In recent studies, tobacco smoke exposure has been associated with DNA methylation changes in smokers. [10][11][12] The effect of maternal tobacco smoking during pregnancy on DNA methylation of their offspring has also been investigated in a number of studies using different designs. [13][14][15][16][17][18][19][20] Several of these studies in offspring investigated global or gene-specific DNA methylation differences, in umbilical cord blood and placental cells. 13,14,16,21 Several studies have used an epigenome-wide association study (EWAS) design, [17][18][19][20] focusing on methylation differences of individual cytosine-phosphate-guanine (CpG) sites. Some EWASs used the 27 k chip (Illumina Inc., San Diego, USA) in placental samples or whole blood samples of children and identified methylation of several CpGs to be associated with maternal smoking during pregnancy. 19,20 Other EWASs used the 450 K chip (Illumina Inc., San Diego, USA) to identify changes in methylation associated with maternal smoking during pregnancy. 17,18 Joubert et al. 17 identified and replicated methylation changes in cord blood of several genes (AHRR, CYP1A1 and GFI1) associated with maternal smoking during pregnancy. More recently, Markunas et al. 18 identified and replicated differential methylation of CpGs in 10 novel genes in whole blood from 889 newborns. Other EWASs studied associations with birthweight. 22,23 Adkins et al. 23 found no epigenome-wide associations with birthweight, whereas Engel et al. 22 identified 19 CpGs. Interestingly, no studies investigated mediation by methylation in the association between maternal smoking and birthweight or other health-related outcomes.
Therefore, we conducted an EWAS in cord blood to examine the association between maternal smoking during pregnancy and DNA methylation, with the 450 K chip. Furthermore, we studied for the first time whether differentially methylated CpGs mediated the effect of smoking on birthweight. Finally, we sought to replicate the most promising mediation findings in two independent birth cohorts, and meta-analysed the results.

Subjects
We derived data from GECKO Drenthe, a Dutch population-based birth cohort that studies risk factors associated with the development of overweight from birth into adulthood. 24 The cohort includes 2874 children born between April 2006 and April 2007. Children have been extensively phenotyped on parental characteristics, pregnancy and delivery, children's health, nutrition and childhood growth. Data were gathered during pregnancy and at multiple time points during childhood. Maternal and paternal smoking during pregnancy were self-reported and (if available) additional information from obstetricians was used. Directly after delivery, umbilical cord blood was collected from 1565 children and stored at −80 °C. DNA was extracted from the buffy coats using the QIAamp96 DNA Blood Kit (QIAGEN). To increase DNA concentration to ≥ 50 ng/ml, all samples were treated with Glycoblue.
From all children in the total cohort with stored cord blood, we selected those that had sufficient DNA of good quality after DNA isolation (DNA concentration ≥ 50 mg/ml). Of those, we excluded non-Dutch newborns, premature newborns (< 37 weeks), twins and those with a mother with (gestational) diabetes. We also excluded children with missing information on these variables, which resulted in n = 1118. Then 447 children were selected because they had information on maternal and paternal smoking during pregnancy and the number of cigarettes smoked by the mother. This resulted in 129 children exposed to maternal smoking and 318 children unexposed to either maternal or paternal smoking. This group of 447 did not differ from the group of 1118 on gestational age, birthweight, maternal educational level or gender. Only the maternal pre-pregnancy BMI of the group of 447 was slightly lower (24.4 vs 25.0 kg/m²). Therefore, we concluded that these 447 were broadly representative of the total cohort. We used the complete exposed group (n = 129) and randomly selected 129 unexposed newborns (of which 3 dropped out during QC); see the flowchart in Supplementary Figure S1, available as Supplementary data at IJE online.
This study has been approved by the Medical Ethics Committee of the University Medical Center Groningen, and parents of all participants gave written informed consent.

Genome-wide methylation assay
We used 500 ng DNA per sample to perform methylation analysis. To minimize batch effects, we randomized all samples on sex and exposure status per chip over three 96-well plates. Thus, each chip contained three exposed boys, three unexposed boys, three exposed girls and three unexposed girls. In addition, we randomly assigned five control samples of the same male to the plates: two on the first plate, two on the second plate and one on the third plate. We performed bisulphite conversion using the EZ-96 DNA methylation kit (Zymo Research Corporation, Irvine, USA). After validating that unmethylated cytosines had converted to thymidines using commercially available bisulphite conversion controls (Zymo Research Corporation, Irvine, USA), we processed the samples using the Infinium HumanMethylation450 BeadChip (Illumina Inc., San Diego, USA). We checked performance of built-in internal quality controls in the Controls Dashboard in the methylation module of GenomeStudio (Illumina Inc., San Diego, USA).

Quality control
For all 485 577 CpGs we calculated beta-values and detection P-values using the Minfi R package. 25 Overall, beta-values ranged from zero to one, showing the level of methylation for each CpG, and detection P-values < 0.05 indicated that the target sequence signal was distinguishable from the background. We performed all quality control steps for the three plates separately. Cluster plots for the betas on the X chromosome showed a clear distinction by sex. Two males were in the female cluster, and were excluded from further analyses. Illumina-suggested background normalization and colour correction were performed. One sample did not meet the criterion of ≥ 99% of the CpGs with detection P-value < 0.05 and was excluded. This resulted in a final sample of 255 children: 129 exposed and 126 unexposed. Control probes, probes on X or Y chromosomes and probes that did not meet our criteria of a detection P-value of < 0.05 in ≥ 99% of the samples were excluded. This resulted in 465 891 remaining CpGs. The five duplicate male control samples (included in each plate) showed high correlations ranging from 0.995 to 0.998, indicating that batch effects were minimal. These five samples were removed from further analyses.
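The two detection P-value criteria described above can be illustrated with a minimal Python sketch. It is not the Minfi-based pipeline used in the study: the function name `qc_filter`, the probes-by-samples array layout and the choice to apply the probe filter only to samples that passed the sample filter are assumptions made for illustration.

```python
import numpy as np


def qc_filter(detection_p, alpha=0.05, min_fraction=0.99):
    """Return boolean masks (keep_samples, keep_probes).

    detection_p: 2-D array of detection P-values,
                 rows = probes (CpGs), columns = samples (assumed layout).
    """
    detected = np.asarray(detection_p, dtype=float) < alpha

    # Sample criterion: at least 99% of probes detected in the sample.
    keep_samples = detected.mean(axis=0) >= min_fraction

    # Probe criterion: detected in at least 99% of the retained samples.
    # Restricting to retained samples is an assumption of this sketch.
    keep_probes = detected[:, keep_samples].mean(axis=1) >= min_fraction

    return keep_samples, keep_probes


# Toy data: 200 probes, 4 samples, all well detected to start with.
p = np.full((200, 4), 0.001)
p[:10, 3] = 0.5   # sample 3 fails for 10 probes (only 95% detected)
p[0, 0] = 0.5     # probe 0 fails in sample 0

keep_samples, keep_probes = qc_filter(p)
```

In the toy data the fourth sample is dropped by the sample criterion, and the first probe is dropped because it is undetected in one of the three remaining samples.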

Statistical analyses
We performed all pre-processing steps using R packages SWAN (Subset-quantile Within Array Normalization) and Minfi 25 and linear regression in the R package Limma (Linear Models for Microarray Analysis). 26 We generated basic characteristics, mediation analysis and the volcano plot in Stata v12 (StataCorp, College Station, TX, USA).

Epigenome-wide association (EWAS) analysis
We performed linear regression analyses in Limma comparing the methylation beta values of the exposed with the unexposed group. We adjusted for the following covariates that were selected based on their expected association with maternal smoking and/or methylation: sex, gestational age, maternal age, pre-pregnancy BMI, educational level, plate number and cell type composition. 17,18,27 Sex and gestational age (weeks) were reported by obstetricians. Maternal educational level (low/average vs university educated), maternal BMI before pregnancy (kg/m²) and maternal age (years) were self-reported by the mothers. Missing values on gestational age (n = 2), maternal educational level (n = 3) and maternal pre-pregnancy BMI (n = 8) were imputed with the mean/median to maintain power. Excluding the 10 newborns with ≥ 1 missing covariate did not alter the results, and since multiple imputation in an EWAS dataset would be computationally burdensome, we present our findings including these 10 samples with single imputed covariate data. Furthermore, the number of participants with missing data was very small, thus substantial bias was unlikely. Additionally, we included plate number to adjust for potential batch effects and we calculated cell type proportions based on the method previously presented by Houseman and colleagues 28 with the dataset presented by Reinius and colleagues. 29 These cell type proportions (B cells, granulocytes, monocytes, NK cells, CD4+ T cells and CD8+ T cells) were included as covariates in the model. As a sensitivity analysis, we also performed our analysis without correction for cell type and even in a crude model without any of the covariates, to test the effect of these covariates on our results. We converted raw P-values to false discovery rates (FDRs) based on Benjamini and Hochberg. 30 We used both FDR < 0.05 (raw P < 7.5 × 10⁻⁶) and Bonferroni corrected P-values (raw P < 1 × 10⁻⁷) as significance thresholds.
We tested a dose-response effect of number of cigarettes per day on methylation in the exposed group for those signals with FDR < 0.05.
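The two multiple-testing thresholds can be sketched in a few lines of Python. This is an illustration of the Benjamini-Hochberg adjustment and of a Bonferroni cut-off, not the Limma code used in the study; the function names are hypothetical, and taking the Bonferroni threshold as 0.05 divided by the 465 891 retained CpGs is an assumption, although it gives roughly 1.07 × 10⁻⁷, in line with the reported raw P < 1 × 10⁻⁷.

```python
import numpy as np


def benjamini_hochberg(p_values):
    """Convert raw P-values to Benjamini-Hochberg FDR-adjusted values."""
    p = np.asarray(p_values, dtype=float)
    m = p.size
    order = np.argsort(p)

    # Scale each sorted P-value by m / rank.
    scaled = p[order] * m / np.arange(1, m + 1)

    # Enforce monotonicity, working down from the largest P-value.
    scaled = np.minimum.accumulate(scaled[::-1])[::-1]

    adjusted = np.empty(m)
    adjusted[order] = np.minimum(scaled, 1.0)
    return adjusted


def bonferroni_threshold(alpha, n_tests):
    """Raw P-value cut-off after Bonferroni correction."""
    return alpha / n_tests


# Toy example with four raw P-values (hypothetical numbers).
fdr = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])

# Assumed number of tests: the 465 891 CpGs that passed quality control.
cutoff = bonferroni_threshold(0.05, 465_891)
```

A CpG would then be flagged at the FDR level when its adjusted value is below 0.05, and at the Bonferroni level when its raw P-value is below `cutoff`.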

Mediation analysis
We tested the CpGs with FDR < 0.05 for mediation in the association between maternal smoking during pregnancy and birthweight, using the widely used method of Baron and Kenny 31 and the Sobel test. 32 As shown in Figure 1, mediation was considered to be present when: (i) smoking correlated with methylation level (βa); (ii) smoking correlated with birthweight without adjusting the model for the mediator (βc); (iii) differential methylation correlated with birthweight (βb); (iv) the association between smoking and birthweight decreased upon addition of methylation to the model (βc'); and (v) the Sobel test gave P < 0.05, indicating a decrease in the effect of smoking on birthweight after adjusting for the differentially methylated CpG. For those CpGs showing mediation, we tested the assumption that there is no interaction of the exposure and covariates with the mediator CpGs. 33,34 For the mediating CpGs, we further calculated which part of the association between smoking and birthweight could be explained by the mediator, 35 as the ratio (βc − βc') / βc.
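The Sobel test and the proportion explained by the mediator can be sketched as follows. This is a minimal Python illustration, not the Stata code used in the study: the function names are hypothetical, the first-order Sobel standard error is assumed, and the proportion is computed as the drop in the smoking effect divided by the total effect, which matches the worked example reported in the Results (37.5 g out of 202 g).

```python
import math


def sobel_test(a, se_a, b, se_b):
    """First-order Sobel test for the indirect effect a * b.

    a, se_a: effect of exposure on mediator and its standard error.
    b, se_b: effect of mediator on outcome (adjusted for exposure)
             and its standard error.
    Returns the z statistic and a two-sided normal P-value.
    """
    z = (a * b) / math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    p = math.erfc(abs(z) / math.sqrt(2))
    return z, p


def proportion_mediated(total_effect, direct_effect):
    """Share of the total effect that disappears after adding the mediator."""
    return (total_effect - direct_effect) / total_effect


# Hypothetical path estimates, chosen only to exercise the function.
z, p = sobel_test(a=1.0, se_a=0.5, b=1.0, se_b=0.5)

# Values from the paper: 202 g total effect, reduced by 37.5 g
# after adding cg09935388 to the model (about 0.186, reported as 19%).
share = proportion_mediated(202.0, 202.0 - 37.5)
```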

Functional network analysis
We performed network and enrichment analysis to facilitate the functional interpretation of our differentially methylated genes using GeneMANIA. 36,37 To this end, we selected all genes to which the top CpGs (FDR < 0.05) mapped as input, to construct a functional interaction network by adding the 100 most strongly interacting genes. Data resources used by the GeneMANIA algorithm were functional association datasets including genetic interactions, protein-protein, co-expression, shared protein domains and co-localization networks. 36,38 Functional enrichment analysis of all genes of the constructed interaction network against Gene Ontology (GO) terms was performed to find the most enriched GO terms.

Replication
We performed replication analyses for the top findings of our EWAS and mediation analysis in two independent birth cohorts with 450 K methylation data in cord blood samples from Caucasian children: ALSPAC (Avon, UK) 39,40 and Generation R (Rotterdam, The Netherlands). 41 For the replication analyses, we analysed data of 65 exposed and 613 unexposed offspring in ALSPAC and 110 exposed and 635 unexposed offspring in Generation R (see Supplementary text and Supplementary Table S1, available as Supplementary data at IJE online). All eight GFI1 CpGs with FDR < 0.05 in the EWAS were taken forward for replication. We limited replication to the GFI1 gene as its CpGs showed the most robust and clearest mediation results and GFI1 was among the genes with the most robust EWAS signals in GECKO. Furthermore, unlike NEUROG1, differential methylation of GFI1 was previously reported to be associated with maternal smoking. 17 Exposure in the replication cohorts was defined as sustained maternal smoking during pregnancy vs no maternal smoking during pregnancy, because this was the most accurate measure of exposure in the replication cohorts. Paternal smoking was adjusted for in the mediation analysis. Except for this additional covariate, mediation analyses were performed using the same analysis protocol as in GECKO. In order to obtain one overall estimate of the results for each of the eight GFI1 CpGs, we used fixed effects inverse variance meta-analysis of the results of the two replication cohorts. Subsequently, we combined results of discovery (GECKO) and replication (ALSPAC and Generation R) stages in a joint meta-analysis. We concluded that mediation was present for CpGs showing a two-sided P < 0.05 in both the replication and the joint meta-analysis.
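Fixed effects inverse variance meta-analysis, as used to pool the cohort-specific estimates, can be sketched in Python as below. The function name and the input numbers are hypothetical; the study's own software for this step is not specified here, so this is only an illustration of the standard weighting scheme.

```python
import math


def fixed_effects_meta(estimates, standard_errors):
    """Pool per-cohort estimates with inverse-variance weights.

    Returns the pooled estimate, its standard error and a
    two-sided normal P-value.
    """
    weights = [1.0 / se**2 for se in standard_errors]
    total_weight = sum(weights)

    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_weight
    pooled_se = math.sqrt(1.0 / total_weight)

    z = pooled / pooled_se
    p = math.erfc(abs(z) / math.sqrt(2))
    return pooled, pooled_se, p


# Two hypothetical cohorts with equal precision.
pooled, pooled_se, p = fixed_effects_meta([1.0, 3.0], [1.0, 1.0])

# Unequal precision: the more precise cohort dominates the pooled value.
pooled_w, pooled_w_se, _ = fixed_effects_meta([1.0, 3.0], [1.0, 2.0])
```

Under this scheme a joint analysis of discovery and replication cohorts is the same calculation with three inputs instead of two.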

Results
General characteristics of all participants in GECKO are presented in Table 1; for characteristics of ALSPAC and Generation R participants, see Supplementary Table S1, available as Supplementary data at IJE online. On average, in GECKO, smoking mothers were 1.4 years younger and more often had a lower educational level, and their children had a 281 g lower birthweight.
We found 35 CpGs, mapping to 10 genes, that showed differential methylation (FDR < 0.05) between the groups exposed and unexposed to maternal smoking (Table 2). After the more conservative Bonferroni correction, 23 CpGs remained. These 23 CpGs mapped to eight genes: AHRR, GFI1, MYO1G, CYP1A1, NEUROG1, CNTNAP2, FRMD4A and LRP5. All eight CpGs mapping to GFI1, LRP5 and CNTNAP2 had lower methylation levels in the group exposed to maternal smoking during pregnancy compared with the unexposed group (methylation difference (beta value exposed minus beta value unexposed) ranged from −0.021 to −0.117). The 11 CpGs that mapped to MYO1G, NEUROG1, FRMD4A and CYP1A1 had higher methylation levels in the exposed group (methylation difference ranged from 0.028 to 0.077). For AHRR, three CpGs had lower methylation levels (methylation difference between −0.024 and −0.073) whereas one had higher methylation in the exposed group (methylation difference 0.038).
Effects of covariate adjustment on EWAS results are shown in Supplementary Table S2, available as Supplementary data at IJE online. Analysis without adjustment for cell type distribution did not substantially change our top (Bonferroni significant) findings, but the list of CpGs with FDR < 0.05 decreased substantially after cell type correction.
The volcano plot in Figure 2 shows the methylation differences between the exposed and unexposed groups plotted against statistical significance. It shows the 35 differentially methylated CpGs (FDR < 0.05) and the 23 CpGs that remained statistically significant after Bonferroni correction.
We observed no dose-response effect of number of cigarettes smoked per day on differential methylation in the exposed group for any of the 35 top CpGs (data not shown).
Next, we considered the 35 top CpGs to test for the mediating effect in the association between maternal smoking and birthweight. All CpGs on the growth factor independent 1 transcription repressor (GFI1) gene (eight CpGs) and the neurogenin 1 (NEUROG1) gene (two CpGs) showed mediation with P < 0.07 in GECKO (Table 3), whereas the other CpGs did not (Supplementary Table S3, available as Supplementary data at IJE online). None of these CpGs showed interaction with the exposure or covariates in its effect on birthweight (Supplementary Tables S4a-h, available as Supplementary data at IJE online). We limited replication analysis to CpGs in GFI1 because it showed the most robust results.
Table footnote: Data shown as n (%) or mean ± SD, except for number of cigarettes smoked: median (range). P-values are given for independent samples t-test (continuous) or chi-square test (categorical). Unexposed group was defined as no smoking during pregnancy, by mother or by father. Exposed group was defined as smoking during pregnancy by mother.
Replication and meta-analysis in ALSPAC and Generation R confirmed the association with maternal smoking for seven of the eight CpGs in GFI1 and mediation was replicated for three of the eight GFI1 CpGs: cg09935388, cg14179389 and cg12876356 (Table 4). Although not all these CpGs were significant in the two individual replication cohorts, directions of the effects were consistent (Supplementary Table S5, available as Supplementary data at IJE online). Joint meta-analysis of discovery and replication cohorts combined showed that differential methylation of these three GFI1 CpGs explained 12–19% of the 202 g lower birthweight in smoking mothers. For example, this was 19% for cg09935388, calculated as follows: newborns of smoking mothers had a 202 g lower birthweight compared with unexposed newborns (meta-analysis of βc, data not shown). After adding the CpG as mediator in the model, the effect of smoking on birthweight decreased by 37.5 g (βc − βc' in overall meta-analysis, see Table 4). Therefore, 37.5/202 = 19% of the 202 g lower birthweight in exposed newborns could be explained by mediation through differential methylation.
We observed 28 enriched GO terms (FDR < 0.05) for the 110 genes in the interaction network (Table 5). Most enriched terms are closely related and point towards regulation of immune system processes, particularly the cell-mediated immunity response.

Discussion
We aimed to examine the effect of maternal tobacco smoking during pregnancy on DNA methylation in cord blood.
Our second aim was to study the mediating effect of DNA methylation in the association between maternal smoking during pregnancy and offspring's birthweight. We found 35 CpGs (FDR < 0.05) in 10 genes to be differentially methylated in the exposed and non-exposed groups; 23 of these CpGs (in eight genes) survived Bonferroni correction. Furthermore, replication analysis confirmed methylation of three GFI1 CpGs to mediate the association between maternal smoking during pregnancy and decreased birthweight. Finally, functional network analysis showed that the top differentially methylated genes influenced immune system processes, particularly related to cell-mediated immunity.
The association between smoking and methylation is one of the most widely studied epigenetic associations and evidence from EWASs on maternal tobacco smoking and DNA methylation specifically in offspring is accumulating rapidly. [13][14][15][16][17][18][19][20] EWASs investigating the influence of cigarette smoking have used a variety of DNA sources, including placental cells, 19 and studies in active smokers have been performed in whole blood, peripheral blood, lymphoblast DNA or lung alveolar macrophages [10][11][12] with a generally high level of consistency across tissue and studies. To our knowledge only a limited number of EWASs have been published investigating the effect of maternal smoking during pregnancy in offspring using the 450 K chip, of which only one was done in cord blood. 17,18 The 23 differentially methylated CpGs mapped to eight genes: AHRR, GFI1, MYO1G, CYP1A1, NEUROG1, CNTNAP2, FRMD4A and LRP5. Differential methylation of these genes (except for NEUROG1) was also observed (but not all consistently replicated) in other EWASs in cord and whole blood 17,18 and/or in other studies into smoking and methylation in adults. 10,12,15 Previous studies related methylation in the aryl-hydrocarbon receptor repressor (AHRR) gene and the Cytochrome P450, family 1, subfamily A1 (CYP1A1) gene to tobacco smoke exposure in both smokers and newborns and most studies, including ours, reported the same CpG as the top signal (cg05575921). 10,12,15,42,43 Both AHRR and CYP1A1 are involved in the aryl-hydrocarbon receptor (AhR) pathway, regulating the biological responses to hydrocarbons found in cigarette smoke and xenobiotic metabolism in general. [43][44][45] The myosin-1 G (MYO1G) gene is involved in haematopoietic processes and regulation of cell elasticity. 46 The contactin-associated protein-like 2 (CNTNAP2) gene is involved in the development of the nervous system 47 and in neuropsychiatric disorders. 
Finally, the low-density lipoprotein receptor-related protein 5 (LRP5) gene plays a role in skeletal homeostasis. 48 Differential methylation of the FERM Domain Containing 4A (FRMD4A) gene has also previously been observed in relation to tobacco smoke exposure in offspring of smoking mothers (in whole blood). 18 Interestingly, single nucleotide polymorphisms in FRMD4A have been shown to be involved in nicotine dependence. 49 An important finding in our study was the mediating effect of differential methylation of the growth factor independent 1 transcription repressor (GFI1) gene in the association between maternal smoking and birthweight. GFI1 is known to play a role in developmental processes such as haematopoiesis and oncogenesis. 50,51 Thus, GFI1 could be involved in cellular development and possibly fetal growth. However, it has not previously been linked to birthweight or other anthropometric measures.
Differential methylation of NEUROG1 also seemed to mediate the association between maternal smoking and birthweight in GECKO; however, our discovery results in NEUROG1 await future replication. NEUROG1 is known to be associated with neuronal differentiation and neurogenesis, 52 making a link to fetal development plausible. It should be noted that these CpGs were not mapped within the NEUROG1 gene regions, but located close to this gene (57 k downstream).
To the best of our knowledge, we were the first to investigate and identify statistical evidence of mediation by DNA methylation (in GFI1) in the pathway from maternal tobacco smoking during pregnancy to decreased birthweight of the offspring. Meta-analysis of all three cohorts showed that three CpGs on GFI1 explained between 12% and 19% of the effect of maternal smoking on birthweight. These findings are promising, as this biological mechanism seemed to explain part of the effect of smoking on birthweight. Other mechanisms causing reduced fetal growth may involve impaired placental perfusion, chronically low levels of fetal oxygen supply 53 and sensitivity to adipocytokines, e.g. leptin or ghrelin. 54 However, it should be kept in mind that many other factors are involved in intrauterine growth and birthweight, e.g. malnutrition or stress, 55,56 and that DNA methylation could not explain the total variation in birthweight resulting from smoking. As in any epidemiological study, residual confounding could not be entirely excluded. However, maternal smoking during pregnancy is known to have a direct adverse effect on growth of the fetus and is therefore likely to have a much stronger effect on methylation than other possible confounding factors.
Figure 2. Volcano plot showing methylation differences between exposed and unexposed against -log10 of the P-values.
We performed network and enrichment analysis to facilitate the functional interpretation of our 10 differentially methylated genes. Most enriched GO terms were related to immune system processes, especially to those related to cell-mediated immunity. Thus, intrauterine exposure to components in cigarette smoke seemed to elicit an immune response in the offspring. Such an immune response in smokers and offspring of smoking mothers may play a role in the increased risk of developing asthma. 57 This is in line with studies showing that the AhR pathway activates the immune system triggered by environmental exposures such as tobacco smoke, pollutants and diet. 58,59 Additional research will be needed to show whether these smoking-induced methylation effects may increase the risk of developing autoimmune diseases. [60][61][62] These results seemed independent of cell type differences caused by maternal smoking, as we have adjusted all our analyses for these differences, although we cannot entirely exclude that cell correction was incomplete and residual cell (sub)type effects could be possible.
Table footnote: The coefficients βc and βc' can be interpreted as the amount of grams lower birthweight for smoking vs non-smoking mothers in the 'smoking to birthweight' and full model, respectively. βb represents the effect of methylation level (coded as a proportion between 0 and 1) on birthweight. For cg09935388 this means that an increase of 100% in methylation level is associated with 1190.4 g higher birthweight. For extra information on the betas, see Figure 1.
The current study has many strengths. We found that 78% of our top CpG signals overlapped with those from a previous EWAS on the same topic (data not shown), which is a testament to the robustness of our findings. 17 Moreover, cord blood is an excellent tissue to test for methylation differences associated with maternal smoking, because cord blood has not yet been exposed to external influences other than those provided by the intrauterine environment. As such, potential confounding by those external exposures on the newborn is minimized. Use of cord blood to study DNA methylation as a potential mediator of birthweight is less ideal, as it implicitly assumes that it reflects methylation patterns from other tissues such as muscle, fat and bone that might be more plausibly causally related to fetal growth and birthweight. However, such tissues would be prohibitively difficult to collect from newborns and for this reason cord blood is currently the most commonly used tissue in epidemiological studies of newborns. 63 Furthermore, in (epi)genetic epidemiology the winner's curse is a well-known phenomenon, which means that the effect sizes of newly identified associations are often overestimated in the discovery cohort. For this reason we reported effect sizes of the combined analyses of discovery and replication cohorts, which showed only partial replication of our discovery findings. We were able to replicate three of the eight mediating CpGs in two other cohorts, which confirmed and strengthened our results. However, it should be kept in mind that not all CpGs replicated and those CpGs that did replicate did not show such strong mediation as in the discovery sample. Another strength was the inclusion of the mediation analysis, giving more insight into the biological pathway between maternal smoking and birthweight.
To our knowledge, this is the first study to formally assess and report this mediating effect of DNA methylation. Additionally, we gave a functional interpretation of our results using functional network and enrichment analyses, which indicated that the differentially methylated genes play a role in activation of immune system processes.
Finally, we used the Houseman correction with the Reinius dataset, a popular method to adjust for differences in cell type distributions between the exposed and unexposed groups of six cell types (B cells, granulocytes, monocytes, NK cells, CD4+ T cells and CD8+ T cells). 28,29 This, reassuringly, showed no alterations in our top findings. The top signals still survived Bonferroni correction after cell type correction; however, the larger list of CpGs that survived FDR differed substantially (Supplementary Table S2, available as Supplementary data at IJE online). Consequently, the gene list that was used as input for the network and functional enrichment analysis was also different. Interestingly, the general pattern of results did not change, as we still observed that most enriched terms pointed towards positive regulation of particularly cell-mediated immune responses. Furthermore, the mediation results did not change as we observed significant mediation by the GFI1 gene and not by any of the other genes, before and after cell type correction (mediation results before correction are not shown). This method was based on a reference dataset of whole blood samples from adult males, which have a different cell composition from cord blood, and this cell type correction did not account for more specific cell subtypes. However, currently this is the best option because no cord blood reference dataset exists and, even in cord blood, this reference-based cell type correction is the best method available as recently applied by Kile and colleagues. 27
In contrast to an earlier study, which observed dose-dependency by maternal cotinine plasma levels, 17 we did not find an effect of the number of cigarettes smoked per day. Joubert et al. 17 found a dose-response relationship for two of the significant genes, but not for all top genes. Thus, a dose-response relationship could be expected for some genes but not for all. Another potential reason for the lack of a dose-response relationship in our data is our smaller sample size compared with the study of Joubert et al.
Table footnote: For all meta-analyses we have used a two-sided P < 0.05 as significance threshold. Bold: CpG sites for which significant mediation was confirmed (P < 0.05 for both replication meta-analysis and overall meta-analysis).
A potential limitation was the use of self-reported smoking behaviour during pregnancy. This may have caused underreporting of smoking behaviour and possibly could have resulted in an underestimation of the effects. In the GECKO Drenthe cohort, 14% of the mothers smoked during pregnancy. This is comparable to the prevalence of 7.6-13.2% found in The Netherlands in 2001-07 64 and 12.3% in the USA. 65 Furthermore, we observed results that were highly comparable to the study by Joubert et al. which measured smoking status objectively as plasma cotinine levels. 17 We found support for our hypothesis that differential methylation mediates part of the effect of smoking on birthweight, but we could not be certain about the direction of causation in this observational study. One possibility is that methylation markers simply provided a better measure of smoking exposure than the self-reported smoking behaviour we used in our study. Such biomarkers would then also be expected to be associated with birthweight. However, the fact that only GFI1 showed significant association with birthweight and not, for example, the AHRR cg05575921 CpG showing the strongest EWAS signal, contradicted this explanation. Another possibility we could not entirely exclude is that retardation of fetal growth expressed as lower birthweight led to differential methylation rather than the other way around. However, we believe this is unlikely given the primary role of epigenetic mechanisms in orchestrating changes in gene expression during growth and development.
We acknowledge that the Baron and Kenny approach for mediation analysis has been criticized, among other things, for its dependence on and sensitivity to measurement errors, misclassification and violation of model assumptions. 66,67 However, the Infinium HumanMethylation450 BeadChip is a reliable instrument reflecting the state of the art in measurement of genome-wide DNA methylation. 68 Moreover, mediation effects of three CpG sites were independently replicated in cord blood data from two other birth cohorts, in spite of presumably differential measurement errors between the three cohorts. Instability of methylation over time is an additional potentially important source of measurement error that could not be addressed by the cross-sectional design of our study, which only looked at differential methylation at birth. We backed up our mediation results from the Baron and Kenny approach with a more advanced statistical approach, and additionally applied causal mediation analysis to the three replicated CpGs in the GECKO cohort. This analysis uses a more general potential outcomes framework, can provide additional distribution-free estimates of the mediated effects and facilitates sensitivity analyses for the observed effects. 67 Results of these analyses were in line with our Baron-Kenny results and Sobel tests (see Supplementary Note, available as Supplementary data at IJE online).
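As an illustration of the logic behind the Sobel test applied to Baron-Kenny regression coefficients, the following minimal Python sketch computes the indirect (mediated) effect, its Sobel z statistic and the proportion of the total effect that is mediated. It is a sketch under stated assumptions and not the analysis code used in this study: the function names and all numeric inputs are hypothetical and chosen only to make the arithmetic easy to follow, and covariate adjustment is assumed to have been done when estimating the coefficients.

```python
import math


def sobel_test(a, se_a, b, se_b):
    """Sobel test for the indirect effect a*b.

    a, se_a: coefficient (and standard error) of exposure -> mediator
    b, se_b: coefficient (and standard error) of mediator -> outcome,
             estimated with the exposure also in the model
    Returns (indirect effect, z statistic, two-sided p-value).
    """
    indirect = a * b
    # First-order (delta method) standard error of the product a*b
    se_indirect = math.sqrt(b**2 * se_a**2 + a**2 * se_b**2)
    z = indirect / se_indirect
    # Two-sided p-value from the standard normal distribution
    p = math.erfc(abs(z) / math.sqrt(2))
    return indirect, z, p


def proportion_mediated(a, b, total_effect):
    """Share of the total exposure effect that runs through the mediator."""
    return (a * b) / total_effect


# Hypothetical inputs, NOT estimates from this study:
a, se_a = -0.02, 0.004      # smoking -> CpG methylation (fraction methylated)
b, se_b = 1500.0, 500.0     # methylation -> birthweight (g per unit methylation)
total = -200.0              # total effect of smoking on birthweight (g)

indirect, z, p = sobel_test(a, se_a, b, se_b)
share = proportion_mediated(a, b, total)
print(f"indirect effect = {indirect:.1f} g, z = {z:.2f}, p = {p:.4f}")
print(f"proportion mediated = {share:.0%}")
```

The Sobel test relies on a normal approximation for the product of two coefficients, which is one reason why resampling-based or potential-outcomes approaches are often used alongside it.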
Previously, fathers who started smoking early were shown to have heavier sons, 69 indicating a possible direct effect of paternal smoking on fetal programming through the sperm epigenome, which can affect embryogenesis. 70,71 We did not explicitly test this possible direct effect in our study. However, only 39 (30%) of the fathers in the exposed group had smoked during pregnancy and, after excluding these children from the analysis, 83% of our top CpGs remained Bonferroni-significant. We also controlled for this possible paternal smoking effect in the study design, as we only included in the unexposed group those children whose mother and father did not smoke.
Our results suggested that in utero exposure to smoking could have an effect on selected methylation markers, which may in turn affect later health outcomes in offspring. Our approach of testing the effects of intrauterine exposures on DNA methylation in the child may serve as a model that could be extended to other exposures. One example is fetal exposure to polycyclic aromatic hydrocarbons (PAHs), which has been linked to childhood obesity. 72 PAHs are produced during incomplete combustion and are constituents not only of cigarette smoke but also of emissions from many other sources. Results of such studies may then provide guidance to future prevention efforts tailored to limit certain exposures for pregnant women, with major potential impact on public health.
In conclusion, maternal tobacco smoking during pregnancy was associated with genome-wide significant methylation differences in cord blood at 35 CpGs mapped to 10 genes. Our results showed remarkable similarity to previous findings, confirming the robustness of the effects. Additionally, we observed a potentially mediating role of DNA methylation in the association between maternal smoking during pregnancy and birthweight of the offspring. We were able to replicate the mediating effect for three CpGs in GFI1, which confirmed and strengthened our findings. Finally, our network and enrichment analyses indicated that smoking in the mother may induce a cellular immune response in the fetus.

Supplementary Data
Supplementary data are available at IJE online.