Longitudinal analysis of DNA methylation associated with birth weight and gestational age

Gestational age (GA) and birth weight have been implicated in the determination of long-term health. It has been hypothesized that changes in DNA methylation may mediate these long-term effects. We obtained DNA methylation profiles from cord blood and peripheral blood at ages 7 and 17 in the same children from the Avon Longitudinal Study of Parents and Children. Repeated-measures data were used to investigate changes in birth-related methylation during childhood and adolescence. Ten developmental phenotypes (e.g. height) were analysed to identify possible mediation of health effects by DNA methylation. In cord blood, methylation at 224 CpG sites was found to be associated with GA and 23 CpG sites with birth weight. Methylation changed in the majority of these sites over time, but neither birth characteristic was strongly associated with methylation at age 7 or 17 (using a conservative correction for multiple testing of P < 1.03 × 10⁻⁷), suggesting resolution of differential methylation by early childhood. Associations were observed between birth weight-associated CpG sites and phenotypic characteristics in childhood. One strong association involved birth weight, methylation of a CpG site proximal to the NFIX locus and bone mineral density at age 17. Analysis of serial methylation from birth to adolescence provided evidence for a lack of persistence of methylation differences beyond early childhood. Sites associated with birth weight were linked to developmental genes and have methylation levels which are associated with developmental phenotypes. Replication and interrogation of causal relationships are needed to substantiate whether methylation differences at birth influence the association between birth weight and development.


Introduction
Gestational age (GA) and birth weight have been implicated in the determination of long-term health (1-4) and may mediate their persistent biological effects via epigenetic mechanisms (5-8). This is predicated on the notion that epigenetic changes induced in utero or in early postnatal life persist across the life course. Studies have included appraisal of either preterm birth or GA across the normal range, although it is clear that the two have potentially very different determinants and underlying mechanisms.
Previous investigation of the relationship between DNA methylation and GA includes a study by Schroeder et al. (5) that detected 41 cytosine bases (CpG sites) showing differential methylation in relation to GA in 39 genes using a cohort of 259 neonates from mothers with a history of neuropsychiatric disorders. They replicated their findings for 26 of these sites (in 25 genes) using a sample of 194 newborns from healthy mothers. Many of the identified CpG sites were within genes implicated in labour, delivery and development of later adverse health outcomes. Similarly, Lee et al. (6) identified several differentially methylated regions of the epigenome which were associated with GA. These regions were adjacent to three genes involved in early development (NFIX, RAPGEF2 and MSRB3). Further support for an association of DNA methylation and GA comes from a study by Parets et al. (7), who performed an epigenome-wide association study (EWAS) on 50 newborns, of whom 22 were preterm deliveries. They identified 9637 CpG sites with methylation levels associated with GA. With respect to birth weight, in a study of 1046 infants, Engel et al. (8) reported cord blood DNA methylation levels at 19 CpG sites to be associated with birth weight; some of the identified CpG sites were within genes that had previously been associated with roles in adipogenesis [ARID5B (14)] and DNA repair [XRCC3 (15,16)]. Tan et al. (17) found no epigenome-wide associations between birth weight and methylation in a study of 150 pairs of adult monozygotic twins (median age 57, range 30-74) who were discordant for birth weight. However, they identified three CpG sites in a sub-sample of twins who were extremely discordant for birth weight.
Studies have also been conducted in the context of preterm birth, which by definition is a category of low GA and thus closely correlates with low birth weight (18,19). A recent study (19) identified 1555 CpG sites with differential methylation between term and preterm newborns in a matched case-control study. A key finding was that many of these differences had been resolved by adulthood, suggesting that methylation difference at delivery (when not at term) merely reflects methylation changes during a normal developmental trajectory. Whether these differences in early life affect later phenotypic development has not yet been explored.
Physiological or molecular changes induced in early life have the potential to have profound developmental consequences and implications for health across the life course (20-22). Although previous studies have attempted to consider the persistence of DNA methylation from birth to adulthood, they have been severely limited by analysing DNA methylation at a single time point: either sampled at birth and compared cross-sectionally with birth weight (8) and GA (5-7) or sampled in childhood or adulthood and compared retrospectively with birth weight and GA [or other exposures such as prenatal famine (11)]. 'Change' in DNA methylation, i.e. difference detected over time, has been the focus of several studies to date (23-32). Many of these report on cross-sectional samples (26,28), adult populations (25) or both (27,29,30,32). Appraisal of change over time in individuals requires measurement of DNA methylation at more than a single time point. A small number of studies have included two time points of serially measured DNA methylation (23,24,31,33,34). However, while two serial measurements are useful, they are limited to the study of linear change, i.e. the difference in methylation from one clinic to the next.
This study sought to analyse DNA methylation in serial samples from the same individuals at three time points from birth to age 17. Focus was placed on the identification of methylation variable sites associated with birth weight and GA, these being two exposures clearly associated with long-term health consequences. Ten developmental phenotypes (e.g. bone mineral density and weight) were used to identify potential mediation of the birth-development association by DNA methylation.

Results
There were 914, 973 and 974 samples with blood DNA methylation profiles obtained using the Illumina Infinium HumanMethylation450 BeadChip (Illumina, Inc.) (35). This array measures DNA methylation simultaneously at over 485 000 CpG sites across the human genome. After a quality control (QC) step (see Materials and Methods), we obtained DNA methylation in cord blood (466 432 CpGs), peripheral blood at age 7 years (471 347 CpGs) and peripheral blood at age 15/17 years (469 902 CpGs), respectively. The Accessible Resource for Integrated Epigenomic Studies (ARIES) cohort is gender-balanced, with 49% males; average birth weight was 3.48 kg and GA 39.6 weeks (Table 1). Average maternal age was 29.2 years; 61% of mothers had never smoked, whereas 11% continued to smoke during pregnancy.

EWAS
There was evidence for an association between GA and cord blood methylation at 224 different probes annotated to 155 genes (Table 2 and Supplementary Material, Table S1), after correcting for cell type composition in blood using the method described by Houseman et al. (36). GA had a negative association with methylation at 188 probes and a positive association at 36 probes. There was no strong evidence for any associations between GA and peripheral blood methylation measured at age 7 or 15/17.

Longitudinal analysis
The results of longitudinal analysis of methylation from these 224 probes during childhood and adolescence are provided in Table 2 and Supplementary Material, Table S1. Methylation at the vast majority of the 188 probes showing a negative relationship with GA continued to decrease during childhood. For example, the change in methylation at cg25551168 (AVP) is shown in Figure 1. From an estimated 54.2% in cord blood, methylation decreased by 2.7% per year on average during early childhood and then reached a plateau during adolescence with negligible change in methylation from 7 to 17 years. This pattern of rapid change during early childhood followed by stabilization during adolescence is evident across the majority of CpG sites.
Of the 188 negatively associated probes, there were 160 where GA was associated with childhood changes (from 0 to 7 years) in methylation (i.e. where an interaction between GA and age was found). Almost all of the observed interactions are positive, which suggests that those children with a shorter gestation period have faster methylation change during childhood, i.e. methylation differences at birth are resolving during childhood. There was no strong evidence for an effect of GA on methylation changes from age 7 to 17.
Of the 36 probes which had increased methylation per week of gestation, 34 continued to increase in methylation during childhood. The maximum increase was 6.7% per year between birth and 7 years (EBF4). However, just 5 of the 36 had an increase in methylation between 7 and 17 years, suggesting that methylation levels had largely stabilized by age 7. An interaction between GA and methylation changes from birth to age 7 was identified for 31 of the 36 positively associated probes. Each of these suggests that children with shorter gestation have a faster rate of change in methylation during childhood, which again suggests that methylation differences attributed to GA are resolving during early life. For example, the average cord blood methylation at cg25551168 (AVP) was 54.7% and for each week of GA, methylation was 0.7% lower on average, such that children with GA of 40 weeks would be estimated to have a 7% lower methylation than those with GA of 30 weeks, on average. We found that methylation decreased by 2.7% per year but that for each extra week of GA the change in methylation is slowed down by 0.1% per year. Thus, children with a 40-week GA at delivery would lose 1.7% methylation per year, compared with 2.7% per year for children with 30-week GA, on average. Thus, at age 7, children with GA of 30 or 40 weeks would have equal methylation on average, explaining a lack of an association between GA and methylation or methylation change at or beyond age 7.
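The catch-up arithmetic for cg25551168 (AVP) can be checked with a short calculation. The sketch below is illustrative only: it uses the rounded estimates quoted above (0.7% lower methylation per extra week of GA at birth, a loss of 2.7% per year at 30 weeks, slowed by 0.1% per year for each extra week of GA), and the function and constant names are hypothetical rather than part of the study's analysis code.

```python
# Rounded estimates quoted in the text for cg25551168 (AVP).
GA_EFFECT_AT_BIRTH = -0.7   # % methylation per extra week of GA, in cord blood
SLOPE_AT_30_WEEKS = -2.7    # % methylation per year from birth to age 7 at GA = 30 weeks
GA_BY_AGE = 0.1             # slowing of the yearly loss per extra week of GA


def methylation_gap(ga_low, ga_high, age):
    """Mean methylation at ga_high minus that at ga_low, for an age between 0 and 7."""
    extra_weeks = ga_high - ga_low
    gap_at_birth = GA_EFFECT_AT_BIRTH * extra_weeks
    catch_up = GA_BY_AGE * extra_weeks * age
    return gap_at_birth + catch_up


def yearly_slope(ga):
    """Yearly change in methylation during childhood for a given GA in weeks."""
    return SLOPE_AT_30_WEEKS + GA_BY_AGE * (ga - 30)


for age in (0, 3.5, 7):
    print(age, round(methylation_gap(30, 40, age), 2))
```

Under these rounded values the 7% gap at birth between 30-week and 40-week gestations shrinks by 1% per year and closes at age 7, which matches the absence of a GA association at that age.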

EWAS
We identified 23 probes in 14 genes where cord blood methylation was associated with birth weight (7 probes did not have a RefSeq gene annotation; Table 3). Birth weight was positively associated with cord blood methylation at 10 probes and negatively associated at 13 probes.

Replication
Of the 23 associations observed between birth weight and cord blood methylation, two had been previously shown to have a negative association (8) in the Norwegian Mother and Child (MoBa) cohort (37). These were cg20076442 (no RefSeq gene) and cg25953130 (ARID5B).

Longitudinal analysis
Longitudinal analysis of methylation at these probes showed that 11 of the 13 sites negatively associated with birth weight had a reduction in average methylation during childhood (Table 3). Beyond age 7, seven of these 13 probes demonstrated a continued reduction in average methylation levels. The most pronounced change in methylation occurred in cg20076442 (no RefSeq gene). Cord blood methylation was estimated to be 72.4% and this decreased by 4.3% per year from birth to age 7 on average. This reduction in methylation continued throughout later childhood and adolescence, but at a rate of 0.11% per year (Fig. 2). Probes with methylation levels which were positively associated with birth weight tended to increase in methylation during childhood and adolescence. During childhood (age 0-7), an interaction between birth weight and age was found for each of the 23 probes. Across all probes, lower birth weight was associated with faster changes in methylation during childhood, i.e. birth weight-related methylation differences resolved during childhood. There was no strong evidence that birth weight was associated with methylation change between ages 7 and 17.

Exploration of phenotypic differences in adolescence
With 224 identified probes in the EWAS of GA and 10 selected developmental phenotypes (height, weight, leg length, lean mass, fat mass, bone mass, bone mineral density, IQ, forced expiratory volume and forced vital capacity), there were a possible 2240 associations between GA-related methylation and developmental phenotypes. Of these, 70 probe-phenotype associations were observed, yet just two GA-phenotype associations were observed (with leg length and IQ). However, neither of these associations overlapped, so there is no support in this data set for methylation differences in these probes being associated with later phenotypic differences.
In the birth weight EWAS, 23 probes were identified such that 230 probe-phenotype association tests were performed. Of these, 14 probe-phenotype associations were observed, while all 10 developmental phenotypes were associated with birth weight. The 14 overlapping associations are presented in Table 4. There was a strong positive association between birth weight and development and a negative relationship between methylation and development in 12 of the 14 phenotypes (i.e. lower methylation was associated with better development). Of the 14 associations, eight involved methylation at cg15783941 (NFIX) and five were found in two CpG sites in the LTA gene. Longitudinal analysis showed that estimated cord blood methylation at cg15783941 (NFIX) is 75.4% and this increases by 1.4% per year on average during childhood with no evidence for further methylation change beyond age 7 (Fig. 3). Given the observed association between cord blood methylation and later phenotypes, there is support here for differences at birth having longer term effects.

Summary
We identified 224 CpG sites where cord blood methylation was found to be associated with GA across the normal range of gestation. These included an appreciable overlap with previously identified associations including three probes that have previously been reported in three separate cohorts (5,7) and one probe which was common to all four cohorts (5,7,19). An EWAS of birth weight identified 23 CpG sites, two of which have been reported previously (8). There was little evidence of an association between methylation at age 7 or 17 years and either of birth weight or GA. This is suggestive of the non-persistence of birth-related DNA methylation differences and implies that much of the variation in birth-related DNA methylation observed at birth attenuates markedly in the first few years of life. Indeed, there appears to be a phase of rapid 'catch-up' in methylation differences. This observation has important implications for the role of epigenetic processes in developmental programming (in blood); although it does not preclude the involvement of epigenetic mechanisms per se, it suggests that any downstream consequences of these specific epigenetic perturbations observed at birth may be set in train during early life rather than the marks themselves persisting across the life course. These inferences are limited to differential methylation associated with GA and birth weight and may not extend to all changes induced in utero, exemplified by the smoking-responsive changes in DNA methylation recently reported which can persist for longer periods of time (38,39). Longitudinal analysis provided evidence that methylation levels change rapidly during early development. For the majority of probes with changing methylation levels, this change was more pronounced during the immediate postnatal years, with many sites tending to stabilize in methylation level beyond age 7.
Where interactions were found, lower birth weight and shorter GA were associated with an increased rate of methylation change during the life course, suggesting a 'catch-up' mechanism in early life. The resolution of cord blood methylation differences during childhood has been recently reported (40) at differentially methylated probes (DMPs) related to pre-pregnancy maternal BMI. However, this is in contrast to changes in methylation at smoking-responsive probes, where differences in cord blood methylation associated with maternal smoking have been shown to persist through childhood and adolescence (39). Two previous studies have sought to investigate methylation changes in early life, from birth to 6 months (34) and from birth to 18 months in twins (33). Using two time points, they performed paired t-tests and established that 30% of 330 168 probes under investigation had methylation which changed in the first 18 months, with an average increase of 3.1% per year (33). In contrast, our longitudinal analysis was not epigenome-wide and investigated change in only those DMPs identified in cord blood in relation to birth weight or GA. The questions addressed are therefore different; one establishing the difference in methylation across the methylome between two ages, and the other establishing the extent of change over time in loci shown to be differentially methylated with regard to a specific trait. In contrast to the 1% of the probes previously shown to differ between birth and 12 months of age (5), our study suggests that the vast majority of GA-associated (87%, 190/224) and birth weight-associated (87%, 20/23) DMPs undergo a period of rapid change. The lower estimate of change found by Martino et al. (34) may relate to the cell type studied (CD4+ cells rather than whole blood cells) and the shorter time period (1 year compared with the 7 years in the current study).
Moreover, our findings do not necessarily contradict the previous results because the CpG sites perturbed by GA and birth weight are likely to be specifically targeted by the 'catch-up' mechanism discussed above that resolves GA and birth weight-related changes in early life. Birth weight-related cord blood methylation in the NFIX gene was shown to be associated with developmental phenotypes, which were also associated with birth weight, including bone mineral density at 17 years of age. NFIX regulates the development of the brain and of bone and skeletal muscle. NFIX has been identified previously as having birth weight-associated levels of DNA methylation (6). Birth weight-associated sites near the transcription start of LTA were also associated with developmental phenotypes including lean mass. LTA is a tumour necrosis factor family member produced by lymphocytes and is integrally involved in the development of secondary lymphoid organs both before and shortly after birth (41) as well as in a wide variety of immune responses. Previous work has also identified likely associations between cord blood methylation in the region of the endothelial nitric oxide synthase gene (eNOS) and bone mineral density at 9 years (42). We observed an association with GA in a neighbouring gene promoter (KCNH2) less than 40 Kb away.

Comparison with previous EWAS
Our analyses replicate the findings of several previous studies (5,7,8). Among these are two GA-related probes in the AVP gene, which has been linked together with the OXT gene to the timing of delivery (43). Suggestively, nearby associated probes were identified in the promoter of OXT in our cohort ARIES and in WMHP and CANDLE (5), though not at precisely the same probes. Another is a probe adjacent to the transcription start site of the CRHBP gene. CRHBP encodes a protein that is produced by the placenta but whose levels drop dramatically prior to parturition in order to promote corticotropin-releasing hormone activity (7). Whereas previous studies identified an association near the ESR1 gene, also known to play a role in parturition, no such associations were found in the ARIES cohort. A third associated probe (cg05294455) is located about 100 bp upstream of the MYL4 gene that encodes a motor protein involved in muscle contractions. Salomonis et al. (44) show that MYL4 is one of the most downregulated transcripts in the mouse myometrium during late gestation and note that MYL4 plays a primary role in uterine contraction at term. In human cord blood, we show that MYL4 methylation is positively associated with GA, possibly implying a negative association with MYL4 gene expression in contrast to a positive association in the myometrium. This is consistent with the observation of Cruickshank et al. (19) of a large methylation decrease at the same probe in neonatal blood spots of preterm infants compared with term infants. The relationship between blood DNA methylation of these parturition-related genes and timing of delivery remains to be elucidated.
Of the three regional associations identified by Lee et al. (6), we replicated one near the MSRB3 gene encoding a product that catalyzes the reduction of methionine sulphoxide to methionine. It has been hypothesized that increased levels of methionine sulphoxide in body tissues contribute to aging (45,46). The negative association between DNA methylation and GA that we observe suggests that MSRB3 may be more highly expressed in cord blood at later GAs.
Unlike Parets et al. (7), we did not observe associations at probes near key epigenetic genes DNMT1, DNMT3A, DNMT3B and TET1. However, we did observe associations at the same sites near the CHD4 and CHD5 genes that encode members of a nucleosome remodelling and deacetylase complex. Parets et al. (7) observe positive associations at probes near MMP9, which is involved in the breakdown of the extracellular matrix in the process of cervical ripening (47). We similarly observe positive associations at probes near MMP15, another matrix metalloproteinase involved in extracellular matrix breakdown.
Two of the 23 birth weight-related probes have previously been identified in the MoBa cohort (8), including the same probe in ARID5B. This gene has been shown to be associated with postnatal adiposity in mice (14). As noted for the MoBa cohort (8), several birth weight-related sites (6) are linked to genes that play important roles in development, including RAR, NFIX, LTA and HOXA3. For example, knockout of RAR in mice induces lethality shortly after birth and results in testis degeneration (48), NFIX provides foetal-specific transcription regulation in developing skeletal muscle (49), LTA is critically involved in the development of secondary lymphoid organs both before and shortly after birth (50), and HOXA3 is involved in patterning the cranial neural crest (51). There was no overlap between our birth weight-related probes and three probes found in a study of extremely birth weight discordant adult twins (17).

Strengths
A major strength of our analysis is the ARIES data set, which contains serially measured DNA methylation on 1018 children along with a plethora of phenotypic information. This has enabled longitudinal analysis of DNA methylation over three measurement occasions. Using these data, we have provided an in-depth study of methylation changes during childhood and adolescence at methylation loci associated with birth weight and GA.

Limitations
Although we corrected for cellular heterogeneity using the Houseman (36) algorithm, there is a different ratio of white blood cell types in cord blood compared with peripheral blood drawn in childhood. This raises the possibility that differences observed can be explained by longitudinal (possibly developmental) changes in white blood cell profiles. However, we used independent surrogate variable analysis (ISVA) estimated components to attempt to correct for changing blood composition. Due to the unique nature of our data set and the lack of available serial data in other cohort studies (at the current time), we have not been able to replicate our longitudinal analysis findings in an independent cohort, although our findings do show concordance with other cross-sectional studies. Modelling methylation over time at single sites rather than over regions has inherent limitations. Furthermore, with only three repeated measures of methylation available, our longitudinal modelling strategy was limited. For example, with several repeated measures available, smoothing methods (52) may have been employed to accurately capture the pattern of methylation change. With respect to interpretation, more detailed modelling of correlated groups of probes or differentially methylated regions of the genome would help to enhance the functional relevance of our observations. Another limitation is tissue specificity. Although convenient, blood is unlikely to be the most relevant tissue for investigating the effects of birth characteristics on developmental phenotypes. It is possible that DNA methylation perturbations in tissues that play a direct role in the phenotypes (e.g. brain tissues and IQ) might persist throughout childhood.
The limited association observed between DNA methylation at birth and later phenotype may also be related to the stringent cut-offs imposed in our study, and consideration of a larger pool of DMPs may uncover more evidence of phenotypic consequences of birth weight- and GA-associated differential methylation.

Conclusion
Using serially collected DNA methylation, we have provided strong evidence for the non-persistence of epigenetic marks that are associated with birth characteristics throughout childhood and adolescence. In the vast majority of probes identified in our study, methylation levels change rapidly in the early stages of development and then stabilize into adolescence. While birth weight appears to have an influence on methylation changes, there is less evidence that GA plays a role in the evolution of long-term DNA methylation patterns. There is evidence that birth weight-related methylation differences may be linked to later developmental phenotypes. Replication in other tissue types and the application of causal analysis methods are needed to further address this hypothesis, namely that the effect of birth weight on later developmental phenotypes is influenced in part by DNA methylation differences at birth.

Study population
This study used DNA methylation data generated under the auspices of the Avon Longitudinal Study of Parents and Children (ALSPAC) (53). ALSPAC recruited 14 541 pregnant women with expected delivery dates between April 1991 and December 1992. Of these initial pregnancies, there were 14 062 live births and 13 988 children who were alive at 1 year of age. The study website contains details of all the data that are available through a fully searchable data dictionary (http://www.bris.ac.uk/alspac/researchers/data-access/data-dictionary).
As part of the ARIES project (http://www.ariesepigenomics.org.uk), a sub-sample of 1018 ALSPAC child-mother pairs had DNA methylation obtained using the Infinium HumanMethylation450 BeadChip (Illumina, Inc.) (35). In this study, we use DNA methylation data generated from cord blood and peripheral blood samples at age 7 and again at age 15 or 17 years, leading to three measurements of DNA methylation per child. DNA methylation data were also obtained from the mothers of these children, from blood samples taken during pregnancy and at a follow-up clinic 18 years later.

Laboratory methods, QC and preprocessing
All DNA methylation wet laboratory and preprocessing analyses were performed at the University of Bristol as part of the ARIES project. Following extraction, DNA was bisulphite-converted using the Zymo EZ DNA Methylation™ kit (Zymo, Irvine, CA, USA). Infinium HumanMethylation450 BeadChips were used to measure genome-wide DNA methylation levels at over 485 000 CpG sites. The arrays were scanned using an Illumina iScan, with initial quality review using GenomeStudio. The assay detects methylation of cytosine at CpG sites using two site-specific probes: one to detect the methylated (M) locus and one to detect the unmethylated (U) locus. Single-base extension of the probes incorporates a labelled chain-terminating ddNTP, which is then stained with a fluorescence reagent. The ratio of fluorescent signals from the methylated site versus the unmethylated site determines the level of methylation at the locus. The level of methylation is expressed as a 'beta' value (β-value), ranging from 0 (no cytosine methylation) to 1 (complete cytosine methylation). β-values are reported as percentages.
During the data generation process, a wide range of batch variables were recorded in a purpose-built laboratory information management system (LIMS). The LIMS also reported QC metrics from the standard control probes on the 450 K BeadChip. Samples failing QC (average probe detection P-value ≥0.01) were repeated. Samples from all time points in ARIES were randomized across arrays to minimize the potential for batch effects. As an additional QC step, genotype probes on the 450 K BeadChip were compared between samples from the same individual and against SNP-chip data to identify and remove any sample mismatches. In addition to these QC steps, probes that contained <95% of signals detectable above background signal (detection P-value <0.01) (N = 7938) were excluded from analysis. After excluding these probes, as well as control probes and probes on sex chromosomes, a total of 466 432 CpGs were included in the main analysis for cord blood methylation. At age 7, 471 347 CpGs were included and at age 17, 469 902 CpGs were included in the main analysis, following the same exclusion criteria. Raw β-values were preprocessed using R (version 3.0.1) with background correction and subset quantile normalization performed using the pipeline described by Touleimat and Tost (54). β-values were corrected for cell type heterogeneity in blood using the method described by Houseman et al. (36).
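The probe-level exclusion rule described above can be sketched in a few lines. This is a minimal illustration, assuming detection P-values are held as one list per probe; the function name, data layout and probe names are hypothetical and do not reflect the ARIES pipeline itself.

```python
def probes_to_exclude(detection_p, p_cutoff=0.01, min_detected=0.95):
    """Return probe IDs detected in fewer than `min_detected` of samples.

    detection_p maps a probe ID to a list of per-sample detection P-values;
    a sample counts as detected when its P-value is below `p_cutoff`.
    """
    excluded = []
    for probe_id, p_values in detection_p.items():
        detected = sum(1 for p in p_values if p < p_cutoff)
        if detected / len(p_values) < min_detected:
            excluded.append(probe_id)
    return excluded


# Toy data with hypothetical probe names: 20 samples per probe.
toy = {
    "probe_a": [0.001] * 20,               # detected in all samples
    "probe_b": [0.001] * 18 + [0.2, 0.3],  # detected in 90% of samples
}
print(probes_to_exclude(toy))
```

In this toy input only the second probe falls below the 95% detection rate and would be dropped.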

EWAS
Using CpGassoc (55), separate EWASs of GA and birth weight with cord blood methylation, peripheral blood methylation at age 7 and peripheral blood methylation at age 15/17 years were carried out. As well as being mutually adjusted for GA and birth weight, each EWAS was adjusted for parity, maternal age, maternal smoking, child sex and delivery method (caesarean yes/no) as potential confounders. DNA methylation M-values (i.e. logit-transformed β-values) were used in each EWAS (56). We report those CpG sites where an association was found at the P < 1.03 × 10⁻⁷ level (to account for multiple testing across CpG sites). A description of each CpG site, as defined by the Illumina BeadChip probe name and RefSeq annotation (where known), is presented.
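The two quantities used in this step can be illustrated with a short sketch. It assumes the base-2 logit convention commonly used for M-values (the text says only that β-values were logit-transformed) and a Bonferroni-style division of 0.05 by roughly 485 000 tests, which reproduces the reported threshold; the exact denominator is not stated, and the function names are hypothetical.

```python
import math


def beta_to_m(beta):
    """Convert a beta-value (strictly between 0 and 1) to an M-value.

    Assumes M = log2(beta / (1 - beta)).
    """
    if not 0.0 < beta < 1.0:
        raise ValueError("beta must lie strictly between 0 and 1")
    return math.log2(beta / (1.0 - beta))


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


print(beta_to_m(0.5))                                # 0.0
print(f"{bonferroni_threshold(0.05, 485_000):.2e}")  # 1.03e-07
```

A β-value of 0.5 maps to an M-value of 0, and values near 0 or 1 are stretched away from zero, which is the usual motivation for modelling on the M-value scale.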

Replication
Two previous studies have used Illumina BeadChip arrays to identify associations with GA, the first using the 27 K BeadChip in the WMHP cohort and the CANDLE cohort (5), and the second using the 450 K BeadChip in the NB cohort (7). The relationship between birth weight and DNA methylation has been investigated in the MoBa cohort (37) using the 450 K BeadChip (8). We compared identified probes with those found in these EWAS [of GA (5,7) and birth weight (8)], highlighting any replicated CpG sites. We also compared the results of our GA EWAS with findings from a case-control study of preterm birth (19), which used the 450 K BeadChip.

Longitudinal analysis
Longitudinal DNA methylation data from birth to 17 years were analysed for those CpG sites found to be associated with birth weight or GA. Methylation β-values from all three time points were normalized together using the method described by Touleimat and Tost (54) and ISVA (57) was used to generate the top 20 components of variation. These components account for confounding due to position effects and also any changing cell type proportions. The methylation β-values were modelled over time in the children using multilevel models (58,59) to account for within- and between-child variation in methylation. Multilevel models also allow investigation of both methylation change during childhood and the effect of birth weight/GA on this methylation change. Irregular measurements of methylation over time (i.e. some children measured at 15 and some at 17) are handled naturally by multilevel models, introducing no bias from this design. A linear spline term was added to allow for different linear changes from 0 to 7 and from 7 to 17. For example, the model for GA-related methylation changes is

$$\mathrm{meth}_{ij} = \beta_0 + u_{0i} + \beta_1\,\mathrm{GA} + \beta_2\,\mathrm{age}_{ij} + \beta_3\,(\mathrm{age}_{ij} - 7)_+ + \beta_4\,\mathrm{GA} \times \mathrm{age}_{ij} + \beta_5\,\mathrm{GA} \times (\mathrm{age}_{ij} - 7)_+ + \varepsilon_{ij}$$

$$\varepsilon_{ij} \sim N(0, \sigma_\varepsilon^2), \qquad u_{0i} \sim N(0, \sigma_u^2)$$

where $i$ indexes the children in ARIES, $j = 1, 2, 3$ indexes the measurement occasion, $a_+ = a$ if $a > 0$ or 0 otherwise and $u_{0i}$ is a random intercept, which allows children to have different cord blood methylation. $\beta_1$ gives the average change in methylation per week increase in GA at delivery; $\beta_2$ gives the average change in methylation from birth to adolescence; $\beta_3$ is the change to this trend (i.e. $\beta_2$) from 7 to 17; $\beta_4$ is the difference in methylation change between 0 and 7 per week increase in GA at delivery; $\beta_5$ is the difference in methylation change between 7 and 17 per week increase in GA.
A random intercept-only model is reported, since random slope models could not be fitted to these data due to the small number of repeated measures. The models were further adjusted for the first 20 ISVA components (which account for variance due to changing cell type proportions and batch effects), maternal age, maternal alcohol consumption, maternal education, maternal smoking, parity and delivery method as potential confounders.
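The fixed part of the linear spline model described above can be written out explicitly. The sketch below only evaluates the population-average trajectory for given coefficients, with a knot at age 7; it does not fit the multilevel model, and the coefficient values and function names are hypothetical.

```python
def hinge(a):
    """Positive part: a if a > 0, otherwise 0."""
    return a if a > 0 else 0.0


def spline_terms(ga, age, knot=7.0):
    """Fixed-effect terms in the order of the coefficients beta_0 to beta_5."""
    late = hinge(age - knot)
    return [1.0, ga, age, late, ga * age, ga * late]


def predicted_mean(betas, ga, age, knot=7.0):
    """Population-average methylation, with the random intercept and residual set to zero."""
    return sum(b * x for b, x in zip(betas, spline_terms(ga, age, knot)))


# Hypothetical coefficients, chosen only to show a trend that flattens after age 7.
betas = [60.0, 0.0, -2.0, 2.0, 0.0, 0.0]
for age in (0, 7, 17):
    print(age, predicted_mean(betas, ga=40, age=age))
```

Because the hinge term is zero before the knot, the coefficient on age alone governs change from 0 to 7, and the slope from 7 to 17 is the sum of the age and hinge coefficients; here the two cancel, giving a plateau.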

Phenotypic differences in adolescence
To investigate whether observed birth characteristic-related differences in methylation were associated with longer term phenotypic differences, we identified a set of 10 developmental phenotypes measured during childhood and adolescence. These were height (cm) and weight (kg; measured at 17 years); leg length (cm; at 11 years); lean mass (g), fat mass (g), bone mass (g) and bone mineral density (g/cm²; 17 years); IQ (at 8 years); lung capacity [forced expiratory volume (l) in 1 second and forced vital capacity (l) at 15 years]. These outcomes were chosen to represent development in size (height, weight, leg length, lean/bone/fat mass and bone mineral density), mental function (IQ) and physical function (lung capacity) and based on published evidence that each has been associated with birth weight and/or GA (60-68). Using simple linear regression, these 10 phenotypes were tested for association with birth weight/GA as well as cord blood methylation from those CpG sites identified through EWAS. To account for multiple testing, the threshold for evidence of association was set at P < 0.05/(10 [phenotypes] × number of CpG sites).
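As a worked example of this threshold, the sketch below applies the formula to the 224 GA-associated and 23 birth weight-associated CpG sites reported in the Results. The function name is hypothetical and the printed values are simple arithmetic, not figures quoted from the study.

```python
def phenotype_threshold(n_cpg_sites, n_phenotypes=10, alpha=0.05):
    """P-value threshold alpha / (phenotypes x CpG sites)."""
    return alpha / (n_phenotypes * n_cpg_sites)


for label, n_sites in (("GA", 224), ("birth weight", 23)):
    n_tests = 10 * n_sites
    print(label, n_tests, f"{phenotype_threshold(n_sites):.2e}")
```

This gives 2240 tests with a threshold of about 2.23 × 10⁻⁵ for GA and 230 tests with a threshold of about 2.17 × 10⁻⁴ for birth weight.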

Supplementary Material
Supplementary Material is available at HMG online.