Epigenetic predictors of lifestyle traits applied to the blood and brain

Abstract
Modifiable lifestyle factors influence the risk of developing many neurological diseases. These factors have been extensively linked with blood-based genome-wide DNA methylation, but it is unclear if the signatures from blood translate to the target tissue of interest, the brain. To investigate this, we apply blood-derived epigenetic predictors of four lifestyle traits to genome-wide DNA methylation from five post-mortem brain regions and the last blood sample prior to death in 14 individuals in the Lothian Birth Cohort 1936. Using these matched samples, we found that correlations between blood and brain DNA methylation scores for smoking, high-density lipoprotein cholesterol, alcohol and body mass index were highly variable across brain regions. Smoking scores in the dorsolateral prefrontal cortex had the strongest correlations with smoking scores in blood (r = 0.5, n = 14, P = 0.07) and smoking behaviour (r = 0.56, n = 9, P = 0.12). This was also the brain region which exhibited the largest correlations for DNA methylation at site cg05575921 (the single strongest correlate of smoking in blood) in relation to blood (r = 0.61, n = 14, P = 0.02) and smoking behaviour (r = −0.65, n = 9, P = 0.06). This suggested a particular vulnerability to smoking-related differential methylation in this region. Our work contributes to understanding how lifestyle factors affect the brain and suggests that lifestyle-related DNA methylation is likely to be both brain region dependent and in many cases poorly proxied for by blood. Though these pilot data provide a rarely available opportunity for the comparison of methylation patterns across multiple brain regions and the blood, the limited sample size means that our results must be considered preliminary and should be used as a basis for further investigation.


Introduction
DNA methylation (DNAm) is one route by which modifications to the genome can occur and typically involves the addition of a methyl group to a cytosine residue on CG dinucleotides (CpGs). 1 Lifestyle traits such as smoking, alcohol intake and body mass index (BMI), as well as high-density lipoprotein (HDL) cholesterol levels are known to associate with differential blood DNAm at CpG sites across the genome. [2][3][4][5][6][7][8] These lifestyle factors are associated with a range of brain health outcomes and neurological diseases, [9][10][11][12][13][14] in addition to brain morphology differences. [15][16][17][18] Whereas the DNAm differences are likely to be a consequence, as opposed to cause, of lifestyle traits, it is unknown if these patterns are consistent across the blood and brain. Previous work suggests that blood is unlikely to reflect the brain reliably for all CpGs, but that some sites more closely reflect brain DNAm than others. [19][20][21] Given that the brain is the critical organ of interest for the pathology of neurological diseases, characterizing the signature of DNAm resulting from lifestyle exposures in brain, as well as the extent to which blood DNAm can proxy for this is therefore paramount.
Blood-based DNAm predictors have been previously shown to explain 60% of the variance in self-reported smoking patterns and approximately 12% of the variance in alcohol, smoking and BMI when projected into blood DNAm. 22 We, therefore, hypothesized that differential methylation patterns at the CpG sites associated with lifestyle traits in these blood predictors would be present across the corresponding sites in the brain. To test this, we applied the described out-of-sample blood-based predictors from our previous work 22 to a pilot dataset consisting of matched blood and brain samples in 14 individuals from the Lothian Birth Cohort 1936 (LBC1936). We profiled four traits: HDL, BMI, alcohol and smoking. The most recent DNAm measure in blood taken prior to death was matched with genome-wide DNAm from post-mortem brain samples across five regions: Brodmann's areas BA35 (hippocampus), BA46 (dorsolateral prefrontal cortex), BA24 (anterior cingulate cortex), BA20/21 (inferior temporal cortex) and BA17 (primary visual cortex).
We first assessed the correspondence between the scores derived from blood-based lifestyle predictors between the blood and brain, to characterize how well circulating DNAm measures of lifestyle traits may translate to the brain. We then compared the associations between the predictor scores and the corresponding self-reported or clinically assessed data available for each lifestyle trait. DNAm at a single CpG (cg05575921 in the AHRR gene; the strongest CpG correlate of smoking in blood which is hypomethylated in response to smoking) 23,24 was also profiled in each of these analyses. This site is a key component of the smoking predictor and we hypothesized that the same hypomethylation observed in blood would also be evident in the brain. The relationship between blood-based predictor scores and lifestyle traits is presented for 499 individuals in the wider LBC1936 group to illustrate how representative the 14 individuals are of a larger reference group. The brain regions and study design are presented in Fig. 1.

Materials and methods
The Lothian Birth Cohort 1936
The LBC1936 (N = 1091) is a longitudinal study of healthy ageing in individuals who reside in Scotland. 25,26 Participants completed a childhood intelligence test at age 11 years in 1947 and were then recruited for this cohort at the mean age of 70 years. They have been followed up approximately every 3-4 years (currently at the sixth wave), collecting a series of cognitive, physical, clinical and social data, along with blood donations that have been used for genetic, epigenetic, and proteomic measurement. Approximately 15% of individuals in the LBC1936 have consented to post-mortem tissue collection. To date, brain samples from 14 individuals are available and were therefore selected as the brain bank group (n = 14) for the present study.

Participant consent
Written informed consent was obtained from all participants. Ethical permission for the LBC1936 was obtained from the Multi-Centre for Scotland (MREC/01/0/56), Lothian (LREC/2003/2/29) and Scotland A (07/MRE00/58) Research Ethics Committees. Use of human tissue for post-mortem studies has been reviewed and approved by the Edinburgh Brain Bank ethics committee and the ACCORD medical research ethics committee, AMREC (ACCORD is the Academic and Clinical Central Office for Research and Development, a joint office of the University of Edinburgh and NHS Lothian, ethical approval number 15-HV-016). Human tissue from the Edinburgh brain bank was used under the research ethics committee (REC) approval (16/ES/0084). All experimental methods were in accordance with the Helsinki declaration.

Blood DNAm in the LBC1936
DNA from whole blood at 485 512 CpG sites was assessed using the Illumina Human Methylation 450K array at the Edinburgh Clinical Research Facility. The full details of the processing steps have been previously described. 27,28 Raw intensity data were background-corrected and normalized using internal controls. Following background correction, manual inspection permitted removal of low-quality samples presenting issues relating to bisulphite conversion, staining signal, inadequate hybridization or nucleotide extension. Quality control analyses were performed to remove probes with a low detection rate (<95% at P < 0.01). Samples with a low call rate (samples with <450 000 probes detected at P-values of less than 0.01) were also eliminated. Furthermore, samples were removed if they had a poor match between genotype and SNP control probes, or incorrect DNA methylation-predicted sex. A total of 450 276 probes remained. Investigators were blinded to participant information when assessing DNAm to reduce potential sources of bias.
Blood DNAm was available for up to 4 waves, measured over a 10-year period. The most recent blood sample prior to death was selected and self-report and clinical information was also taken from the most recent wave for which it was available. The most recent blood measurement was performed at wave 3 in seven individuals and wave 4 in the remaining seven, with a mean time between blood sampling and death of 3.2 years (SD 1.6) in the wave 3 group and 1.6 years (SD 0.94) in the wave 4 group (Supplementary Table 1). In the 14 individuals, the mean age at blood sampling was 77.9 years (SD 1.7) and the mean age at death was 80.3 years (SD 1.6). The blood DNAm reference group (n = 499) was taken from wave 4 of the LBC1936 and the mean age at sampling was 79.3 years (SD 0.62). Lifestyle trait information and blood DNAm were recorded in a consistent way across both the brain bank and reference groups.

Brain DNAm processing in the LBC1936
Brain tissue samples were received from the Edinburgh Brain Bank. Regions of the brain were dissected after brains were removed and cut coronally into slices, as per previous methodology. 29 Samples from regions BA46 (dorsolateral prefrontal cortex), BA17 (primary visual cortex), BA24 (anterior cingulate cortex), BA20-21 [ventral (20) and lateral (21) inferior temporal cortex] and BA35 (hippocampus) were taken from the cortex and snap frozen. A tissue selection of approximately 25 mg was processed for DNA extraction, which was done using the DNeasy kit (Qiagen). DNAm was measured at the Edinburgh Clinical Research Facility using Illumina MethylationEPIC BeadChips. Quality control steps were then performed: the wateRmelon pfilter() function removed samples in which >1% of probes had a detection P-value of >0.05, probes with a beadcount of <3 in >5% of samples, and probes with >1% of samples reaching a detection P-value of >0.05. Additional SNP probes and cross-hybridising probes on X and Y chromosomes were removed. 30 Samples were also removed if a discordance between the methylation-predicted sex and recorded sex was identified. Performance of 15 normalization functions was examined, as per Pidsley et al., with danet selected as the top-ranking method. 31 The normalized data had a total of 69 samples (14 individuals, across 5 regions, with 1 hippocampal sample unavailable) and 807 163 probes. DNAm beta values were used in all analyses. Investigators were blinded to participant information when assessing DNAm.

Figure 1 Blood-derived lifestyle predictors (McCartney et al. 22) were applied to matched genome-wide blood and brain DNAm. The cg05575921 site in the AHRR gene locus was also measured across matched blood and brain samples. Analyses investigated the correlation between blood and brain measures and the correlation between the measures and the respective lifestyle trait phenotypes. Figure created with BioRender.com.
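The sample and probe thresholds described for the brain DNAm quality control can be illustrated with a short Python sketch. This is not the wateRmelon pfilter() implementation used in the study; the function name `qc_filter`, its parameter names and the nested-dictionary input layout are hypothetical, and the sketch assumes that all thresholds are evaluated on the full set of samples.

```python
def qc_filter(detection_p, beadcount, p_cut=0.05, sample_frac=0.01,
              probe_frac=0.01, min_beads=3, bead_frac=0.05):
    """Illustrative QC filter (hypothetical helper, not wateRmelon code).

    detection_p[probe][sample] holds detection P-values and
    beadcount[probe][sample] holds bead counts.
    Returns (kept_samples, kept_probes).
    """
    probes = list(detection_p)
    samples = list(next(iter(detection_p.values())))
    n_probes, n_samples = len(probes), len(samples)

    # Samples in which more than 1% of probes have a detection P-value > 0.05.
    bad_samples = {
        s for s in samples
        if sum(detection_p[p][s] > p_cut for p in probes) / n_probes > sample_frac
    }

    # Probes with a bead count < 3 in more than 5% of samples, or with more
    # than 1% of samples reaching a detection P-value > 0.05.
    bad_probes = set()
    for p in probes:
        low_beads = sum(beadcount[p][s] < min_beads for s in samples) / n_samples
        failed = sum(detection_p[p][s] > p_cut for s in samples) / n_samples
        if low_beads > bead_frac or failed > probe_frac:
            bad_probes.add(p)

    kept_samples = [s for s in samples if s not in bad_samples]
    kept_probes = [p for p in probes if p not in bad_probes]
    return kept_samples, kept_probes
```

The default thresholds mirror the values stated in the text; in practice the order in which samples and probes are filtered may differ between implementations.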

Lifestyle trait information in the LBC1936
Lifestyle trait phenotype measurements were as follows: Self-reported smoking status (0 = never smoked, 1 = former smoker, 2 = current smoker); alcohol consumption in a usual week (converted into units); BMI (defined as weight in kg divided by height in m²) and HDL cholesterol (measured in mmol/l). Pack years smoked was calculated by multiplying the number of packs of cigarettes smoked per day by the number of years the individual had smoked for, divided by 20 (cigarettes per pack). If a person reported having never smoked, their pack years was recorded as 0 and this was done for both the reference and brain bank groups. Of the reference group, 37 people had pack years available, 193 did not have information available and the remaining 269 indicated that they were never smokers. Five individuals were excluded from the pack years trait in the brain bank group as they did not have information on the number of packs smoked per day, which is why a smaller subset was used for analysis of this trait (n = 9). One individual in the brain bank group did not have a starting age for smoking; however, the mean of the group starting age was imputed (16 years old). Any other unknown trait information from the reference group was not used. Mortality data were obtained through data linkage to the National Health Service Central Register, provided by the General Register Office for Scotland (now National Records of Scotland) and were correct as of July 2020.
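As a worked illustration of the pack years phenotype, the Python sketch below assumes that the daily quantity recorded is the number of cigarettes smoked per day, which is what the division by 20 cigarettes per pack implies. The function name `pack_years` and its arguments are hypothetical and are not taken from the study code.

```python
CIGARETTES_PER_PACK = 20  # pack size stated in the text


def pack_years(cigarettes_per_day, years_smoked, never_smoker=False):
    """Return pack years, 0.0 for never smokers, or None if data are missing.

    Assumption: cigarettes_per_day is a count of cigarettes (not packs),
    so dividing by 20 converts it to packs per day.
    """
    if never_smoker:
        # Never smokers are recorded as 0 pack years, as in the text.
        return 0.0
    if cigarettes_per_day is None or years_smoked is None:
        # Missing smoking quantity or duration: the trait cannot be derived.
        return None
    return (cigarettes_per_day / CIGARETTES_PER_PACK) * years_smoked
```

For example, under this assumption 10 cigarettes per day for 40 years corresponds to 20 pack years.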

DNAm signature predictor scores for lifestyle traits
CpG predictor weights for complex traits were generated and validated through a pipeline described previously. 22 Briefly, LASSO penalized regression was used to identify a linear combination of CpG sites with DNAm levels that were associated with lifestyle traits. The coefficients/weights generated for the CpG sites were based on whole-blood derived samples from 5087 individuals (mean age = 49, SD = 14) in the Generation Scotland cohort. 32,33 Though Scotland has a historic and sustained high prevalence of unhealthy lifestyle behaviours such as smoking and alcohol consumption, with a large proportion of the current population either overweight or obese, 34,35 cohort profiling has found both the Generation Scotland and Lothian Birth Cohorts to be healthier and of higher socioeconomic status than the general population. 26,36 Here, the described CpG weights for each trait (taken from McCartney et al. 22 Additional file 1: Tables S1-3 and Table S6) were applied to DNAm at the same CpG sites in LBC1936 individuals and summed to generate predictor scores for smoking, HDL, BMI and alcohol for each individual. Lifestyle trait scores were generated using DNAm in the blood and five brain regions in the brain bank group (n = 14) and the blood DNAm which was available in the reference group (n = 499). Blood and brain DNAm samples were restricted to sites included on the Illumina Human Methylation 450K array, to ensure comparability across samples. Over 95% of CpG sites in the predictors were available in both the blood and brain DNAm for each lifestyle trait score.
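The scoring step described above, in which weights are applied to DNAm at the matching CpG sites and summed, can be sketched in Python as follows. The function name `predictor_score`, the dictionary inputs and the example CpG identifiers other than cg05575921 are hypothetical; the weights in the usage example are invented for illustration and are not the published values.

```python
def predictor_score(weights, betas):
    """Weighted sum of DNAm beta values over CpGs shared with the predictor.

    weights: dict mapping CpG identifier -> predictor weight.
    betas:   dict mapping CpG identifier -> DNAm beta value for one sample.
    Returns (score, coverage), where coverage is the fraction of predictor
    CpGs that were available in the sample.
    """
    shared = [cpg for cpg in weights if cpg in betas]
    score = sum(weights[cpg] * betas[cpg] for cpg in shared)
    coverage = len(shared) / len(weights)
    return score, coverage


# Usage with made-up numbers: one predictor CpG is missing from the sample.
example_weights = {"cg05575921": -2.0, "cg_example_1": 1.0, "cg_example_2": 0.5}
example_betas = {"cg05575921": 0.8, "cg_example_1": 0.5}
score, coverage = predictor_score(example_weights, example_betas)
```

Returning the coverage alongside the score makes it easy to check that most predictor CpGs are present in each tissue before scores are compared.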

Statistical analyses
First, Pearson correlations were applied to measure the correspondence between the methylation predictors in the blood and brain. These correlated the lifestyle predictor scores (generated from the application of blood-derived CpG predictor weights) between blood and brain, as well as the DNAm measurements at site cg05575921 between the blood and brain. Second, Pearson correlations were used to measure the relationships between brain and blood DNAm measures of lifestyle traits and the respective lifestyle trait information. Units of alcohol consumed weekly, BMI, HDL cholesterol from blood and both smoking status and pack years smoked were included as traits and correlated with the DNAm-based lifestyle predictor scores. Smoking information was also correlated with DNAm at site cg05575921. These lifestyle trait correlations were also performed for the blood-based predictor scores and cg05575921 measurements in the reference group from wave 4 of the LBC1936.
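The correlation analyses can be illustrated with a small, dependency-free Python sketch. It is not the study's R code; the helper names are hypothetical, and pairwise deletion of missing values (represented here as None) is an assumption made for the example. A rank-based variant is included because Spearman correlation is equivalent to Pearson correlation computed on ranks.

```python
import math


def _complete_pairs(x, y):
    """Keep only pairs in which both values are observed (not None)."""
    return [(a, b) for a, b in zip(x, y) if a is not None and b is not None]


def pearson_r(x, y):
    """Return (r, n) for the Pearson correlation over complete pairs."""
    pairs = _complete_pairs(x, y)
    n = len(pairs)
    mean_x = sum(a for a, _ in pairs) / n
    mean_y = sum(b for _, b in pairs) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in pairs)
    sxx = sum((a - mean_x) ** 2 for a, _ in pairs)
    syy = sum((b - mean_y) ** 2 for _, b in pairs)
    return sxy / math.sqrt(sxx * syy), n


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Return (rho, n): Pearson correlation of the ranked complete pairs."""
    pairs = _complete_pairs(x, y)
    xs = [a for a, _ in pairs]
    ys = [b for _, b in pairs]
    return pearson_r(average_ranks(xs), average_ranks(ys))


def t_statistic(r, n):
    """t statistic (n - 2 degrees of freedom) used to test r against zero."""
    return r * math.sqrt((n - 2) / (1 - r * r))
```

The t statistic is compared against a t distribution with n − 2 degrees of freedom to obtain a two-sided P-value; with n = 14, even moderately large coefficients correspond to wide uncertainty.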
As a sensitivity analysis, Spearman correlations were conducted for every association tested in the described Pearson correlations. Inter-region Spearman correlations were also performed for the blood-derived lifestyle predictor scores applied to brain and for the cg05575921 DNAm measure across the five regions. Though covariate information was available on the proportion of neurons in the brain, the brain weight, brain pH and post-mortem interval, power calculations suggested that linear mixed effects regression analyses in the sample size of 14 individuals were unlikely to be sufficiently powered to detect significant effects (Supplementary Table 2). For this reason, no further statistical testing was performed. Correlations were conducted using the Hmisc library (Version 4.4-0). 37 Inter-region correlation heatmaps were produced using the ggcorrplot library (Version 0.1.3) 38 and correlation plots were produced using the corrplot library (Version 0.84). 39 All analyses were performed using R (Version 3.6.3). 40

Data availability
LBC1936 data are available on request from the Lothian Birth Cohort Study, University of Edinburgh (simon.cox@ed.ac.uk). LBC1936 data are not publicly available because they contain information that could compromise participant consent and confidentiality.

Code availability
All R code used in this study is available with open access at the following Gitlab repository: https://gitlab.com/dannigadd/blood-brain-lifestyle-traits.
The predictor weights used in our study and the original code used to generate them (McCartney et al. 22) can also be found at this location.

Results
Cohort assessment
Summary information for the 14 individuals from the LBC1936 brain bank subset and the reference group (up to n = 499) is presented in Table 1. The brain bank subset had a higher proportion of males (64%) than the reference group (50%). Age at death in the brain donor group ranged from 77.6 to 83.0 years and the two groups were well matched for mean age at blood sampling (77.9 years in the brain bank subset versus 79.3 years in the reference group). The majority of the 14 individuals within the brain bank subset either had been or were still smokers (86%) at the time of death, which was a higher proportion than in the reference group (46%). Most of the 14 individuals had high HDL cholesterol (> 1 mmol/l; 86%) and drank alcohol weekly (92%), and the group had a mean BMI of 25.5 kg/m². Five of the individuals did not have smoking pack years data recorded.

Correlation between blood and brain measures
Blood-derived epigenetic predictors for lifestyle traits were applied to the matched blood and brain DNAm samples in the brain bank group to generate lifestyle trait scores. DNAm at the smoking-associated CpG site cg05575921 was also considered. The variability across brain regions in the magnitude of correlations between DNAm predictor scores and DNAm at site cg05575921 is illustrated in Supplementary Fig. 1. The correlations between both the blood and brain lifestyle scores and the blood and brain measures of cg05575921 were regionally variable (Fig. 2). The strongest association between blood and brain DNAm at site cg05575921 was observed for BA46 (r = 0.61, n = 14, P = 0.02) followed by BA17 (r = 0.39, n = 14, P = 0.16). Blood smoking scores were most highly correlated with the BA46 region scores for smoking (r = 0.5, n = 14, P = 0.07), with weaker associations observed for BA24 (r = 0.29, n = 14, P = 0.31) and BA35 (r = 0.36, n = 13, P = 0.23). BMI scores were negatively correlated in regions BA46 (r = −0.72, n = 14, P = 0.004) and BA35 (r = −0.46, n = 14, P = 0.12), suggesting that methylation patterns related to BMI were divergent across blood and brain. The HDL scores were moderately correlated between the blood and region BA20/21 (r = 0.55, n = 14, P = 0.04), with weaker correlations observed for regions BA24 (r = 0.38, n = 14, P = 0.18) and BA35 (r = −0.32, n = 13, P = 0.29). Blood alcohol scores were most highly correlated with BA24 (r = 0.35, n = 14, P = 0.22) and with BA46 (r = 0.25, n = 14, P = 0.38) alcohol scores. A sensitivity analysis suggested that the strongest correlations were consistent across Pearson and Spearman methods. The correlation coefficients and P-values are presented for both analyses in Supplementary Table 3.

Correlation of blood and brain measures with lifestyle traits
The lifestyle trait scores generated by applying blood-derived epigenetic predictors to brain and blood samples and the DNAm measures at site cg05575921 were then correlated with clinical and self-reported lifestyle phenotypes (Fig. 3). As expected, in the reference cohort we observed hypomethylation in blood DNAm levels of cg05575921 associated with pack years smoked (r = −0.31, n = 306, P = 4.7 × 10⁻⁸). This trend was mirrored by the negative association found for blood DNAm at cg05575921 in the brain bank subset (r = −0.55, n = 9, P = 0.11). Similar patterns were observed in some but not all brain regions. The strongest association for cg05575921 in brain with pack years was found for region BA46 (r = −0.65, n = 9, P = 0.06), followed by a weaker correlation in BA35 (r = −0.22, n = 8, P = 0.59). BA20/21 had an opposite trend to that expected in blood (r = 0.2, n = 9, P = 0.61). Methylation in all other brain regions showed much weaker associations with the pack years phenotype (|r| ≤ 0.1). Regarding the lifestyle predictor scores, all trait scores in the blood samples from the brain bank subset were reflective of the wider reference cohort in terms of direction and approximate magnitude (|r| between 0.35 and 0.65), except BMI for which the brain bank group (r = −0.14, n = 14, P = 0.68) was not representative of the reference group (r = 0.41, n = 498, P < 0.05). As with the results for cg05575921, BA46 was the region for which smoking scores were most highly correlated with pack years smoked (r = 0.56, n = 9, P = 0.12). To a lesser extent, BA17 (r = 0.22, n = 9, P = 0.56) and BA35 (r = 0.24, n = 8, P = 0.56) smoking scores were also correlated with pack years. HDL brain scores correlated with HDL trait information strongly in BA17 (r = 0.62, n = 14, P = 0.02), whereas BA35 (r = 0.21, n = 14, P = 0.49) had a weaker association and a negative trend was observed for BA46 (r = −0.4, n = 14, P = 0.16). Alcohol signatures showed the strongest trends in regions BA17 (r = −0.55, n = 14, P = 0.04) and BA20/21 (r = −0.41, n = 14, P = 0.15), in the opposite direction to that found in blood. BMI signature scores in BA17 were negatively correlated with BMI information (r = −0.3, n = 14, P = 0.3), a trend opposite to that in the LBC reference group, whereas BA20/21 had a positive correlation with BMI (r = 0.27, n = 14, P = 0.35). The remaining regions did not show any notable correlations with BMI trait information (|r| ≤ 0.1). A sensitivity analysis comparing Pearson and Spearman methods found that, though there was variability across the methods, the top associations were consistent (Supplementary Table 4). Correlations between smoking measures and smoking status are also available in this file.

Figure 2 Relationships between brain DNAm and blood DNAm are shown for each brain region and measure, with lifestyle trait scores applied to the blood and brain shown in light blue. Each point represents one individual. Pearson correlation coefficients are annotated in each case. All individuals had both blood and brain samples available (n = 14), apart from one individual for which no BA35 hippocampal sample was available (n = 13). The solid blue line represents the linear regression slope; shaded areas represent 95% confidence intervals.

Discussion
Through the application of blood-derived epigenetic predictors of four lifestyle traits to whole-genome DNA methylation in matched blood and brain samples, we uncover regional variability in how well blood-derived scores may be able to proxy for brain-based scores. Though our results highlight disparities between the blood and the brain, we did find evidence to suggest that the blood-derived predictors of lifestyle traits may translate to specific regions. The dorsolateral prefrontal cortex (BA46) was identified as a region of interest for the smoking trait and showed relationships in our analyses using the epigenetic predictor of smoking and the CpG site cg05575921, the single strongest known correlate of smoking across the epigenome. We present these preliminary findings as a contribution to the ongoing, important question of how circulating DNAm measures are able to reflect methylation in brain tissue.

Figure 3 Correlations for blood measures are shown in red, with cg05575921 DNAm in dark blue and lifestyle trait scores generated from the application of blood-derived lifestyle predictors to brain in light blue. (A) DNAm at site cg05575921 is correlated with pack years smoked for the reference (n = 306) and brain bank (n = 9) groups. Correlations are then provided for each group between lifestyle trait scores for (B) smoking, (C) HDL, (D) alcohol and (E) BMI, against relevant lifestyle trait information. HDL is measured in mmol/l. Alcohol is measured in average units per week. Each point represents one individual. Pearson correlation coefficients are annotated in each case. All individuals had both blood and brain samples available (n = 14), apart from one individual for which no BA35 hippocampal sample was available. The solid blue line represents the linear regression slope; shaded areas represent 95% confidence intervals.
The pilot dataset used in this study provided a rarely accessible resource to compare genome-wide methylation specific to each of the five brain regions against matched blood methylation in the same individuals. Given that the generation of epigenetic predictors requires large training datasets, it was not feasible to generate lifestyle predictors derived from brain DNAm in this study, but we hypothesized that CpG sites from the blood predictors of lifestyle traits would be concordant in their methylation patterns in the brain. We tested this by applying blood-derived CpG predictor weights to the matched blood and brain samples. This context is central to the interpretation of our results, which suggest that lifestyle-related methylation in some brain regions is more highly correlated with blood than others. The variability we found across brain regions may be due to two possibilities: (i) lifestyle traits may have a stronger influence on DNAm in the regions which were well-correlated, or (ii) poorly correlated regions may have CpGs influenced by lifestyle traits which are unique to the brain and therefore not captured through the blood predictors.
Previous studies have found that many sites in the genome are poorly correlated between the blood and brain with further variability across brain regions. [19][20][21] The cg05575921 site has been shown to correlate between blood and adipose tissue, 41 suggesting that it may have a discernible trace across circulating measures and tissues; however, in most brain regions we did not observe strong correlations at this site in relation to either blood or smoking traits. Generally, there were weak correlations between the lifestyle predictors across blood and brain samples and in relation to lifestyle traits for many regions. Though the direct effects of lifestyle on brain DNAm are relatively unknown, in the LBC1936 sample, the blood-based DNAm predictor for smoking used in the present study has been shown to associate with brain morphology differences. 16 There is also evidence linking in utero exposure to smoking and alcohol use disorders to brain DNAm alterations. 42,43 Blood lipids have also been causally associated with DNAm differences in circulating cells, 44 though it is unknown whether this is true of lipids and brain DNAm. Taken together, these studies suggest that our findings may be due to possibility (ii), that there could be CpG sites related to lifestyle traits which are unique to the brain and may not be captured in the present study.
The dorsolateral prefrontal cortex (BA46) smoking predictor score and methylation at site cg05575921 were well-correlated with both the equivalent measures in blood and smoking trait information. A previous study found that of four regions, DNAm at cg05575921 in blood was most highly correlated with cg05575921 in the prefrontal cortex (r = 0.28, P = 0.02) in matched samples in 74 individuals. 20 Though this correlation is somewhat weaker than that observed in our study, it suggests that possibility (i) may be correct; there may be a particularly strong influence of smoking on differential DNAm in regions such as BA46. This possibility is further supported by studies that have pinpointed the frontal cortex as a region that is particularly vulnerable to the effects of smoking on brain morphology. 15,45 One of the discussed studies was able to show that smoking had unique statistical contributions to brain morphology when modelling against many other vascular risk factors (VRFs) in a large population (N = 9722) 45 ; these findings indicate that some brain areas may be more susceptible to smoking. As vascular risk factors (such as smoking) associate with various adverse outcomes including cognitive ageing and dementia, [46][47][48][49] areas such as BA46 may be more susceptible to ageing and effects that underpin cognitive function, as they show the strongest vascular risk factor-related coupling in blood and brain. A larger sample with more detailed regional sampling across the brain will be required to investigate DNAm differences in relation to this.
Though our brain bank sample was particularly enriched for smoking and alcohol traits, we anticipate our results generalizing to similar populations due to the representative nature of the large healthy ageing cohort used to train lifestyle predictors. 36 The weights that we used for BMI and smoking have also been projected into an independent cohort. 50 An important caveat is the relatively homogeneous Scottish ancestries of both cohorts in this study, which may limit translation of our findings to other ethnic backgrounds. There may also be selection biases which exist within the cohorts, as they are considered to be of higher socioeconomic status than the general population. 26,36 The LBC1936 group were born in 1936 and there are socioeconomic and cultural trends in their lifestyles that must be taken into consideration; many may have worked in factories or in shipbuilding yards, exposing them to high levels of respiratory pollutants, and poorer socioeconomic status in this era was related to behaviours such as smoking. 51,52 Associations between childhood intelligence and a range of comorbidities in this group have also been shown to be partially attenuated by smoking. 51 However, we anticipate that the generally poor correlations between lifestyle trait predictors which we observe across the blood and brain would also be seen in other cohorts with reduced exposures to the lifestyle factors studied here.
There are a number of limitations to our study. First, there were only 14 individuals with matched samples in the LBC1936 brain bank. As discussed, the lack of brain DNAm samples meant that it was not possible to create lifestyle predictors in brain tissue. Recent work has generated predictors for ageing in cortical samples, providing evidence that, though imperfect, there is a concordance between predictors generated in blood and brain. 53 Future work should seek to determine if this is the case for lifestyle predictors. The limited sample size also meant that though we had information on covariates such as the post-mortem interval between death and brain DNAm sampling, regression analyses were not feasible due to insufficient power. Though confounders may have influenced our results, the effect of post-mortem interval is still debated. 54,55 Second, the blood-based BMI DNAm signatures of the 14 individuals were not reflective of the wider reference group and are therefore limited in their interpretability. Third, differences have been observed in DNAm levels from the same blood samples when measured by different arrays. 56 Here, the EPIC array was used to generate DNAm measures in both the Generation Scotland cohort used to train predictor weights (after being subset to sites overlapping with the 450K array) and in the brain DNAm subset of 14 individuals. The EPIC-derived DNAm predictors correlated well with the lifestyle traits when applied to the blood-based LBC1936 samples, which were all assessed on the 450K array. Fourth, many current or former smokers had not reported pack years information and there was variability in the strength of correlations across smoking phenotypes. Though trends for the smoking status trait were generally weaker than those observed for pack years, this difference may be reflective of the longitudinal nature of smoking that pack years captures. 
Finally, whereas all participants were free from neurodegenerative conditions at the study recruitment at age 70, the absence of clinical data on subsequent diagnoses means that we cannot rule out the possibility that these results are partly driven by disease-specific DNAm profiles in the brain. Several studies have found differential DNAm at regions such as the hippocampus in those with Alzheimer's disease pathology, suggesting that DNAm alterations may result from the pathological changes seen before symptoms arise. 57,58 Growing sample donations and ongoing clinical ascertainment will partly address these limitations in future work.

Conclusion
In this study, we characterize variability in how well blood-derived epigenetic measures of lifestyle traits correlate when applied across a rarely available pilot dataset consisting of matched blood and brain samples. We find variability in the alignment between blood and brain lifestyle predictor scores across brain regions, with the most notable relationships found between the dorsolateral prefrontal cortex (BA46) and smoking-related measures. Though our work relies on the application of blood-based signatures of lifestyle traits to brain tissue and is limited by low sample size, it nonetheless provides a preliminary insight into whether circulating DNAm proxies may reflect the epigenetic effects of lifestyle traits in the brain. This is critical given the known associations of modifiable lifestyle factors with both neurological disease risk and brain health outcomes.

Supplementary material
Supplementary material is available at Brain Communications online.