Schizophrenia-associated methylomic variation: molecular signatures of disease and polygenic risk burden across multiple brain regions

Abstract Genetic association studies provide evidence for a substantial polygenic component to schizophrenia, although the neurobiological mechanisms underlying the disorder remain largely undefined. Building on recent studies supporting a role for developmentally regulated epigenetic variation in the molecular aetiology of schizophrenia, this study aimed to identify epigenetic variation associated with both a diagnosis of schizophrenia and elevated polygenic risk burden for the disease across multiple brain regions. Genome-wide DNA methylation was quantified in 262 post-mortem brain samples, representing tissue from four brain regions (prefrontal cortex, striatum, hippocampus and cerebellum) from 41 schizophrenia patients and 47 controls. We identified multiple disease-associated and polygenic risk score-associated differentially methylated positions and regions, which are not enriched in genomic regions identified in genetic studies of schizophrenia and do not reflect direct genetic effects on DNA methylation. Our study represents the first analysis of epigenetic variation associated with schizophrenia across multiple brain regions and highlights the utility of polygenic risk scores for identifying molecular pathways associated with aetiological variation in complex disease.


Introduction
Schizophrenia is a severe neurodevelopmental disorder, characterized by episodic psychosis and altered cognitive function (1) that contributes significantly to the global burden of disease (2). Twin and family studies have highlighted a notable heritable component to schizophrenia (3), however the role of genetic variation in the aetiology of the disorder is complex. Rare, highly penetrant inherited and de novo mutations have been implicated in some cases of schizophrenia (4)(5)(6)(7), however, susceptibility is predominantly attributed to the action of common genetic variants of low penetrance. Recently, a large-scale genome-wide association study (GWAS) identified 108 independent genomic loci exhibiting genome-wide significant association with schizophrenia, which provided convincing evidence for a substantial polygenic component to aetiology within signals falling below genome-wide levels of significance (8). Despite these advances in understanding the genetic epidemiology of schizophrenia, little is known about the mechanisms by which schizophrenia risk variants mediate disease susceptibility in the brain (9,10).
Improved understanding about the biology of the genome has led to increased interest in the role of non-DNA sequencebased variation in the aetiology of neurodevelopmental phenotypes, including schizophrenia. Epigenetic processes have been hypothesized to mediate associations between genetic risk burden, environmental risk exposure and phenotype. Furthermore, a growing number of studies provide evidence for the dysregulation of epigenetic mechanisms in complex psychiatric disorders (9,(11)(12)(13). To date, such studies have primarily focused on DNA methylation at CpG dinucleotides, as this is the best characterized and most stable epigenetic modification. DNA methylation influences gene expression via physical disruption of transcription factor binding and through the attraction of methyl-binding proteins that initiate chromatin compaction and gene silencing. Of note, previous studies characterizing schizophrenia-associated methylomic variation have been limited by small sample number or the assessment of a single brain region (14)(15)(16)(17)(18)(19).
This study represents the first attempt to systematically examine the association of genome-wide methylomic variation with schizophrenia and schizophrenia polygenic risk burden, across multiple brain regions, using post-mortem tissue obtained from two independent cohorts of schizophrenia patients and controls.

Overview of experimental strategy
We quantified genome-wide patterns of DNA methylation in 262 post-mortem samples derived from four brain regions dissected from 88 individuals (41 schizophrenia and 47 nonpsychiatric controls) obtained from two independent brain banks, using the Illumina Infinium HumanMethylation450 BeadChip (450K array) (Illumina Inc., San Diego, CA, USA) (see Materials and Methods). In total, data from 76 prefrontal cortex (PFC; n ¼ 38 schizophrenia patients and 38 controls), 82 striatum (STR; n ¼ 37 schizophrenia patients and 45 controls), 27 hippocampus (HC; n ¼ 14 schizophrenia patients and 13 controls) and 77 cerebellum (CER; n ¼ 37 schizophrenia patients and 40 controls) samples passed stringent quality control (QC) metrics and were used for analysis (Table 1 and Supplementary Material, Fig. S1). For post-mortem brain regions available from both brain banks (PFC, STR and CER), a meta-analysis approach was used to combine data from both sources. Our initial analyses focused on identifying differentially methylated positions (DMPs) and differentially methylated regions (DMRs) associated with disease status. Analyses were first performed independently for each brain region, and we subsequently employed a multi-level model to identify consistent DNA methylation markers of schizophrenia present across multiple brain regions. We subsequently calculated a schizophrenia polygenic risk score (PRS) for Caucasian samples (Supplementary Material, Table S1) to identify DMPs and DMRs associated with the polygenic risk burden. A schematic overview of the study is given in Supplementary Material, Fig. S2 with more detailed experimental procedures described in the Material and Methods section.
In contrast, we find significant evidence for schizophreniaassociated variation at specific loci across the genome in each brain region. Our first analyses focused on identifying DNA methylation differences between schizophrenia cases and nonpsychiatric controls. The fifty top-ranked schizophrenia-associated DMPs in each brain region are listed in Supplementary  Materials, Tables S2-S5, Figs S4-S7, with Fig. S8). Results for all probes included in the final QC'd dataset can be downloaded from http://epigenetics.essex.ac.uk/schizo brain/. Although the specific list of top-ranked DMPs identified in each tissue is relatively distinct, many DMPs are characterized by consistent effects across brain regions (Supplementary Materials, Figs S4-S7), and for DMPs identified in each of the four individual brain regions, schizophrenia-associated DNA methylation differences are significantly positively correlated with those at the same probes in the other three brain regions (correlations for: PFC DMPs Quantile-quantile (Q-Q) plots for the P-values of the analyses in each tissue are shown in Supplementary Material, Fig. S10A-D highlighting some evidence of P-value inflation (PFC k ¼ 1.18, STR k ¼ 1.02, HC k ¼ 1.13, CER k ¼ 1.23) in several of the brain regions; such inflation is not unusual in epigenome-wide association study (EWAS) analyses and standard genomic control methods -widely used in GWAS -are not suitable for EWAS data (22). Because it is likely that unmeasured factors beyond the variables included in our analysis model (i.e. age, sex, and estimated neuronal proportion) confound our case-control analysis of methylomic variation associated with schizophrenia, we therefore investigated the impact of additional surrogate variables capturing variation in DNA methylation on the association statistics for schizophrenia-associated DMPs. We compared the regression coefficients from our initial analysis model to sequential models iteratively including up to 10 principal components (PCs) derived from the DNA methylation data, observing a strong positive correlation for schizophreniaassociated DNA methylation differences between analyses (Supplementary Materials, Figs S11-S14). This sensitivity analysis implies that although additional confounders potentially exist in our dataset, the identified schizophreniaassociated DMPs are relatively robust to the major PCs associated with methylomic variance.

Consistent methylomic markers of schizophrenia across brain regions
We next employed a multi-level model to further explore consistent schizophrenia-associated differences across multiple brain regions (see Material and Methods). As reported in previous analyses of epigenetic variation in the human brain (25)(26)(27), our data show that at a global level the patterns of DNA methylation in the CER are very distinct to the other three brain regions included in this study (Supplementary Material, Fig. S17); for this reason we excluded the CER from our multi-region model and focused on identifying consistent signals across the PFC, STR and HC. Of note, there is inflation in the distribution of Pvalues in the multi-region case-control analysis (k ¼ 1.43, Supplementary Material, Fig. S9E); although our model is designed to control for the non-independence of brain regions from the same individual, it is possible that combining datasets has resulted in some residual inflation. Compared to other published EWAS analyses, however, this inflation is relatively  Table S9), suggesting that DNA methylation differences at these loci are consistently-associated with schizophrenia across the three regions. Of note, the top-ranked crossregion DMRs include a highly-significant signal spanning 11 CpG sites ( Sid ak-corrected P ¼ 8.90E-11) annotated to WNT5A (28), in addition to regions annotated several loci identified in the analyses of the different brain regions such as GBP4 ( Sid akcorrected P ¼ 0.01), PRDM9 ( Sid ak-corrected P ¼ 0.04) and RPH3AL ( Sid ak-corrected P ¼ 1.23E-05).

Methylomic variation associated with schizophrenia polygenic risk score (PRS)
A recent large-scale GWAS of schizophrenia identified 128 independent associations spanning 108 genomic regions in a metaanalysis of over 80,000 samples (8). 5006, 5058, 5066 and 4951 Illumina 450K array probes included in our PFC, STR, HC and CER analyses, respectively, are located within these broad genomic regions. Although a number of probes within these regions were nominally associated with schizophrenia (see Supplementary Material, Table S10 for all DMPs with P < 1.00E-03), we found no overall enrichment of DMPs in any of the analyses performed (Fisher's exact test: PFC P ¼ 0.20, STR P ¼ 0.27, HC P ¼ 0.59, CER P ¼ 0.55, multi-region model P ¼0.04) at a Bonferroni corrected P-value for the number of tests performed (P < 1.25E-02).
Beyond the specific genome-wide significant loci identified in GWAS, an individual's accumulated genetic burden can be quantified to define an overall PRS -i.e. the sum of traitassociated alleles across many genetic loci, weighted by effect sizes estimated by GWAS analyses (8,29). We next explored if an increased burden of polygenic variants associated with schizophrenia was itself associated with variation in DNA methylation in the brain. Each sample was genotyped and SNP data was imputed using the latest data release from the 1,000 Genomes project, and a PRS for each sample was generated using data from the recent schizophrenia GWAS (8) (see Materials and Methods). None of the samples used in this study were included in the PGC GWAS analysis of schizophrenia, and thus did not contribute to defining the variants included in the PRS. To avoid population stratification effects, ethnicity was determined using data from HapMap Phase 3 (see Materials and Methods) and non-Caucasian samples (n ¼ 10) were excluded from subsequent PRS-based analyses. Despite the relatively small sample size (see Supplementary Material, Table S1 for an overview of samples), schizophrenia patients (n ¼ 34) were characterized by a significantly higher PRS than controls (n ¼ 40) (P ¼ 4.42E-03) (Fig.  3A). For Caucasian samples, we repeated our case-control study with and without the inclusion of PRS as a covariate. For the top-ranked DMPs associated with schizophrenia (presented above), there was a highly-significant correlation of both schizophrenia-associated DNA methylation difference and Table 2. Top-ranked schizophrenia-associated differentially methylated positions (DMPs). Shown are DMPs associated with schizophrenia at a highly stringent significance threshold (P < 1.66E-07) derived using permutations to estimate the nominal P-value for 5% family-wise error. Top-ranked schizophrenia-associated DMPs for each of the four brain regions profiled are presented in   Table 3. schizophrenia at a highly stringent significance threshold (P < 1.66E-07) derived using permutations to estimate the nominal P-value for 5% family-wise error.
Additional information on these DMPs is given in Table 2. Colour depicts a brain region in which the schizophrenia-association was identified: prefrontal cortex ¼ blue, striatum ¼ green, hippocampus ¼ red, and cerebellum ¼ yellow. Table 3. Significant schizophrenia-associated differentially methylated regions (DMRs). Shown in chromosomal order is the location of significant ( Sid ak-corrected P < 0.05) DMRs identified in each of the four brain regions, with the median P-value for DMR probes given for the other three brain regions (bold denotes median P < 0.05 and grey boxes denote regions that were not identified in that brain region). The 'Gene' column lists the combined Illumina and Genomic Regions Enrichment of Annotation Tool (GREAT) annotation (38). Region P-value across all four brain regions (correlations ranging from 0.80 to 1.00 for all comparisons, except for the HC (0.56-1. 00)), indicating that polygenic risk burden is not impacting greatly on the schizophrenia-associated differences identified. We next employed a linear model, controlling for age, sex, and neuronal estimates (except in the CER, as described in Materials and Methods) to identify methylomic variation associated with the schizophrenia PRS. Q-Q plots for the P-values of the analyses in each tissue are shown in Supplementary Material, Fig. S20, again providing evidence of some P-value inflation (PFC k ¼ 0.96, STR k ¼ 1.10, HC k ¼ 1.17, CER k ¼ 1.26) in some brain regions. The 15 PRS-associated DMPs passing our stringent significance threshold listed in Table 4 and the fifty top-ranked PRS-associated DMPs in each brain region presented in Supplementary Materials, Figs S21-S24, Tables S11-S14. Results for all probes included in the analysis of PRS can be downloaded from http://epigenetics.essex.ac.uk/schizobrain/. The top-ranked PRS-associated DMP is cg20640266 (annotated to the zinc-finger gene ZNF618), at which an increased polygenic burden was associated with elevated DNA methylation in the CER (P ¼ 6.50E-07) (Fig. 3B). Although the specific top-ranked PRS-associated loci in each tissue are distinct, effect sizes at PRS-associated DMPs are significantly correlated across brain regions (Supplementary Materials, Table S15, Fig. S25), with the exception of the HC where the low number of samples (n ¼ 23) means we are probably underpowered to detect robust effects. Furthermore, although there is no direct overlap between the top-ranked schizophrenia-associated and PRS-associated DMPs, the effect sizes at the top-ranked PRS-associated probes are significantly correlated with those at the same sites in the casecontrol analysis, and vice versa, across all brain regions (Supplementary Materials, Figs S26-S29). We used to identify spatially correlated regions of differential DNA methylation significantly associated with polygenic burden for schizophrenia ( Sid ak-corrected P < 0.05, number of consecutive probes ! 2). PRS-associated DMRs in each of the four brain regions are listed in Table 5 Table S17 across PFC, STR and HC. Of note, the top-ranked PRS-associated DMP (cg04910228), at which PRS is negatively correlated with DNA methylation (estimate ¼ À0.38%, P ¼ 6.50E-07), is located within the TSNAX-DISC1 locus on chromosome 1 (Fig. 3C). A balanced translocation involving this gene that segregates with several major psychiatric disorders including schizophrenia has been intensively studied in a Scottish pedigree (5), although the involvement of this locus in the aetiology of the disorder remains controversial and common genetic variation in this region was not identified in recent GWAS analyses (30,31). Our data suggest that an increased polygenic burden for schizophrenia may impact upon regulatory variation of the DISC1 locus in the brain.
Polygenic risk score-associated methylomic variation does not reflect direct genetic effects on DNA methylation Although one of the top-ranked PRS-associated DMPs was located within a GWAS-nominated genomic region-cg01682070    Table S20). None of the top-ranked PRS-associated DMPs in any of the individual brain regions, in addition to those identified in the multiregion model, were significantly associated with any of the genetic variants included in the PRS calculation. Because it is possible that weaker-effect mQTLs may still underlie some of the PRS-associated epigenetic variation, we subsequently relaxed our mQTL significance threshold to P < 1.00E-10, again finding no overlap with PRS-associated DMPs ( Supplementary  Material, Fig. S32). Together, these data indicate that PRSassociated epigenetic variation does not directly result from genetic influences on DNA methylation in any of the brain regions tested.

Discussion
In this study, we quantified genome-wide patterns of DNA methylation in postmortem brain samples isolated from PFC, STR, HC and CER obtained from two independent cohorts of schizophrenia patients and controls. We identified numerous DMPs and DMRs associated with disease and polygenic risk burden; many of these loci were differentially methylated in individual brain regions although others showed consistent patterns across brain regions. Many of the DMPs and DMRs associated with increased genetic burden for schizophrenia are independent of the changes observed in the disease itself, and there is no evidence for direct genetic effects on DNA methylation (i.e. via mQTLs) for variants included in the PRS. Overall, our study represents the first analysis of epigenetic variation associated with schizophrenia across multiple brain regions and highlights that DNA methylation in the brain is robustly associated with the polygenic risk burden, independently of many of the changes observed in the disease itself.
Genes annotated to several of the schizophrenia-associated DMPs and DMRs have been previously implicated in the pathophysiology of schizophrenia or have relevant roles in brain function; such as NCAM1 (which encodes a neural cell adhesion molecule with a well-established role in neurodevelopment and synaptic plasticity) (32,33), SYNPO (which encodes a actinassociated protein that is associated with postsynaptic densities and dendritic spines and differentially expressed in schizophrenia brain) (34), GBP4 (that encodes a gene that has been found to be differentially expressed in schizophrenia patients) (35), PRDM9 (which encodes a protein with histone H3K4 trimethyltransferase activity during meiosis that has been previously hypothesized to play a role in schizophrenia) (36,37) and WNT5A (an important neurodevelopmental locus) (28). Of note, a DMR spanning four probes within the RPH3AL gene (which encodes a protein that plays a direct regulatory role in calciumion-dependent exocytosis), was consistently hypomethylated in schizophrenia patients across all four brain regions. Interestingly, this DMR is also functionally annotated to the DOC2B gene using the Genomic Regions Enrichment of Annotation Tool (GREAT) (38); this encodes a high-affinity Ca 2þ sensor involved in the spontaneous neurotransmitter release from synaptic vesicles (39). Most notably, the top-ranked probe associated with PRS in our multi-region model (i.e. across PFC, STR and HC) is located in the gene body of DISC1, a gene previously strongly linked to schizophrenia in a Scottish pedigree with a balanced translocation spanning the locus (5). Our data suggest that an increased PRS for schizophrenia may impact upon regulatory variation of the DISC1 locus in the brain, implicating a potentially common pathway between polygenic and highly penetrant single locus aetiologies that warrants further investigation. Despite this being the first study to quantify DNA methylation across four different brain regions from schizophrenia patients and controls, this study has a number of important limitations. First, the number of samples assessed in this study is relatively low, especially for analyses involving the HC, which was only available from one of the two brain-bank cohorts. Despite this, we were able to identify a number of DMPs and DMRs passing our stringent significance thresholds in both the analyses of diagnosed schizophrenia and polygenic risk burden. Furthermore, although the magnitude of change at the differentially methylated loci was relatively small (i.e. involving a relative small proportion of cells in a given brain region), we were able to technically validate the Illumina 450K array data using bisulfite-pyrosequencing. Of note, given the relatively small number of individual donors, the PRS analyses were undertaken in both cases and controls, and therefore our study design is potentially confounded in a way that makes it not completely independent from the schizophrenia analysis. Despite this, we observed no direct overlap between the top-ranked schizophrenia-associated and PRS-associated DMPs although effect sizes were correlated.
Second, because epigenetic processes play an important role in defining cell-type-specific patterns of gene expression (40)(41)(42), the use of bulk tissue from each brain region is a potential confounder in DNA methylation studies (43,44). Despite our efforts to control for the effect of cell type diversity in DNA methylation quantification in our analyses using in silico approaches, this approach is not suitable to estimate the neuronal proportion in the cerebellum and cannot inform us about disease relevant DNA methylation changes specific to individual brain cell types. Third, there is increasing awareness of the importance of 5-hydroxymethyl cytosine (5-hmC) in the human brain (45,46), although this modification cannot be distinguished from DNA methylation using standard bisulfite-based approaches (47). It is therefore plausible that many of the differences identified in this study are confounded by modifications other than DNA methylation. To date, no study has evaluated the role of 5-hmC in schizophrenia or any other psychiatric disorder, although a recent paper from our group quantified levels of 5-hmC across the genome in human cortex and cerebellum (47); of note, none of the significant DMPs identified in this study were characterized by detectable levels of this DNA modification.
Definitively distinguishing cause from effect in epigenetic epidemiology is difficult, especially for disorders like schizophrenia that manifest in inaccessible tissues such as the brain and are therefore particularly refractory to longitudinal study Sid ak-corrected P < 0.05, number of probes !2) associated with schizophrenia PRS identified in any of the four brain regions. Effect sizes for individual probes within each DMR are also shown for the other three brain regions (blue ¼ hypomethylation, red ¼ hypermethylation). Further details for individual DMRs are provided in Table 5. (44). However, our observation of consistent changes across multiple brain regions in two independent cohorts for many DMPs and DMRs suggests that the identified loci are potentially directly relevant to the schizophrenia pathogenesis. Furthermore, our identification of PRS-associated variation in DNA methylation potentially less confounded by medication intake and other disease-associated exposures that can influence case-control analyses. We tested if the PRS associations reflected the direct effects of genetic variation by testing whether the genetic variants used to derive the polygenic risk scores are mQTLs that influence DNA methylation at PRS-associated DMPs; our analyses suggest that the associations with schizophrenia polygenic burden are independent from genetic variation itself.
Unlike for GWAS, little work has been done to determine appropriate levels of significance in EWAS and a major issue in the field of epigenetic epidemiology is that no empirically-derived thresholds have been established that can be used consistently across studies (48). To establish a stringent multiple-testing significance threshold to identify schizophrenia-associated DMPs, we utilized data from a large Illumina 450K dataset (n ¼ 675 individuals) generated as a part of recent study from our group (49). Briefly, we performed 5,000 EWAS permutations using a multiple linear regression model controlling for age, sex, smoking and cell composition, and used these to estimate the nominal P-value for 5% family-wise error (P ¼ 1.66E-07). Although we believe this approach to be highly stringent, providing a significance threshold that can be used in subsequent EWAS analyses, it is important to note that the permutations were performed in an independent dataset.
Although we controlled for age, sex and derived neuronal composition in our analyses, it is plausible that other factors may be confounding our case-control analyses of schizophrenia, as highlighted by the inflated Q-Q plots observed for several of the analyses. For example, epidemiological data highlights a much higher rate of smoking in schizophrenia patients compared to unaffected controls (50,51). Although smoking has been shown to have striking effects on DNA methylation in blood (52), none of the robust smoking-associated DMPs identified in the blood are amongst the schizophrenia DMPs identified in any of the four brain regions assessed in the current study. Although P-value inflation is a common feature of many DNA methylation datasets, standard genomic control methodswidely used in GWAS -are not suitable for EWAS data (22). Therefore, we investigated the impact of including additional surrogate variables capturing variation in DNA methylation on the association statistics for schizophrenia-associated DMPs (Supplementary Material, Figs S11-S14), observing that the identified schizophrenia-associated DMPs are relatively robust to the major PCs associated with methylomic variance. Of course, the modest P-value inflation observed in this study does not necessarily result from residual confounding; it is plausible that there are multiple differentially methylated loci of small effect associated with schizophrenia, and that changes across genomic regions are coordinated. Finally, although the control samples used in this study were selected to be free of psychiatric morbidity, little additional information about these donors is available; given the nature of post-mortem tissue, for example, they will have died from a number of different causes. Although our study presents novel evidence for associations between schizophrenia diagnosis, schizophrenia polygenic burden and variable DNA methylation across different brain regions, further replication using larger sample sizes is essential to further support these results. Future studies should focus on understanding the transcriptional consequences of the observed associations, and testing whether these associations are causal or a consequence of disease and/or medication.
In summary, our data provide evidence for differences in DNA methylation across multiple brain regions in schizophrenia. We also identify evidence for differential DNA methylation associated with the increased polygenic burden for schizophrenia, including in the vicinity of DISC1, a gene previously implicated in the disease by a highly penetrant balanced translocation. Of note, there is no enrichment of loci identified in a recent large GWAS of schizophrenia amongst DMPs for either schizophrenia or schizophrenia PRS identified in this study. Our study represents the first analysis of epigenetic variation associated with schizophrenia across multiple brain regions and highlights the utility of polygenic risk scores for identifying molecular pathways associated with aetiological variation.

Post-mortem tissue samples
Post-mortem PFC (Brodmann area 9), STR (putamen), HC, and CER samples from a total of 41 schizophrenia patients and 47 non-psychiatric control samples were obtained from the MRC London Neurodegenerative Diseases Brain Bank (LNDBB) (http:// www.kcl.ac.uk/ioppn/depts/bcn/Our-research/Neurodegenera tion/brain-bank.aspx) and the Douglas-Bell Canada Brain Bank (DBCBB), Montreal (http://douglasbrainbank.ca/). LNDBB subjects were approached in life for written consent for brain banking, and all tissue donations were collected and stored following legal and ethical guidelines (NHS reference number 08/MRE09/38; LBBND brain bank HTA license number 12293). Schizophrenia patients were diagnosed by trained psychiatrists, according to the Diagnostic and Statistical Manual of Mental Disorders criteria. DBCBB samples were collected post-mortem following consent obtained with next of kin, according to tissue banking practices regulated by the Quebec Health Research Fund (http://ethique.msss.gouv.qc.ca/), and based on the OECD Guidelines on Human Biobanks and Genetic Research Databases (http://www.oecd.org/science/biotech/44054609.pdf). Psychiatric diagnoses were based on best-estimate diagnostic procedures, following SCID I diagnostic interviews conducted with informants, as described elsewhere (53). The current study was approved by the University of Exeter Medical School Research Ethics Committee (reference number 13/02/009). All samples were dissected by trained neuropathologists from each brain bank, snap-frozen and stored at À80 C.

Methylomic profiling
Genomic DNA was isolated using a standard phenol-chloroform extraction protocol and assessed for quality and purity using spectrophotometry. DNA ($500 ng) from each sample was treated with sodium bisulfite using the EZ-96 Gold DNA methylation kit (Zymo Research, Irvine, CA, USA). DNA methylation was quantified using the Illumina Infinium Human Methylation450 BeadChip (Illumina, San Diego, CA, USA) scanned on an Illumina HiScan System (Illumina, San Diego, CA, USA). Samples were batched by tissue and brain-bank, and randomized with respect to diagnosis, sex and age throughout all experimental procedures. Illumina Genome Studio software was used to extract the raw signal intensities of each probe (without background correction or normalization). QC and normalization steps were performed separately for samples from each brain bank. Signal intensities for each probe were imported into R (54) using the methylumi and minfi packages (55,56). Multidimensional scaling plots of sex chromosome probes were used to check that the predicted sex corresponded with the reported sex for each individual. The ten bisulfite conversion control probes on the array were used to calculate the efficiency of the bisulfite conversion reaction. Comparison of 65 SNP probes on the array confirmed that matched tissues were sourced from the same individual. The 65 SNP probes, probes on sex chromosomes, cross-hybridizing probes (57,58) and probes containing an SNP with minor allele frequency > 5% within 10 bp of the single base extension position were excluded from analysis (57). The 'pfilter' function of the wateRmelon package (59) was used to filter data by beadcount and detection P-value. Samples with-> 1% probes with a detection P-value > 0.01 were removed, along with probes with a detection P-value > 0.05 in at least 1% of the samples and/or a beadcount < 3 in 5% of samples were also removed. The 'dasen' function in wateRmelon was used to normalize the data as previously described (59). The total number of CpG sites included and excluded for each brain region in the final dataset are presented in Supplementary Material, Table S21. In total 5 PFC, 4 STR, 3 HC and 4 CER samples were excluded during these stringent QC procedures. The number of samples in the final dataset used in the analyses are presented in Table 1.

Genotyping and derivation of schizophrenia polygenic risk scores
Genomic DNA (200 ng) from each individual was used for genotyping on the Illumina Infinium HTS HumanOmniExpress-24 BeadChip v1-0 using an iScan Microarray Scanner (Illumina, San Diego, CA, USA), according to manufacturer's instructions. Illumina GenomeStudio software was used for genotype calling and the data were exported as .ped and .map files. PLINK (60) was used to remove samples with > 5% missing data and SNPs with > 1% missing values, Hardy-Weinberg equilibrium P < 1.00E-03 or minor allele frequency of < 5%. Sample ethnicity was determined by merging the genotypes with data from HapMap Phase 3 (http://www.sanger.ac.uk/resources/downloads/ human/hapmap3.html) and LD pruning the overlapping SNPs such that no pair of SNPs within 1500 bp had r 2 > 0.20. GCTA software (61) was used to calculate principal components of the genetic data, which were visually inspected to ascertain ethnicity for each sample by comparison with the known ethnicities of the HapMap sample. Non-Caucasian samples (n ¼ 10) were excluded from PRS analyses. For imputation, genotypes were recoded into .vcf files using PLINK1.9 (62) and VCFtools (63) before uploading to the Michigan Imputation Server (https://imputationserver.sph. umich.edu/start.html#!pages/home), which uses SHAPEIT (64) to phase haplotypes, and Minimac3 (65) with the most recent 1000 Genomes reference panel (phase 3, version 5) (http://www. 1000genomes.org/). PRSs were calculated in PLINK (60) using the imputation dosages from 99,940 variants and the score file downloaded from the Psychiatric Genomics Consortium (PGC) website (https://www.med.unc.edu/pgc/downloads) where GWAS results have been clumped, retaining the best association (identified by P-value) in each LD block.

Identification of schizophrenia-associated differentially methylated positions and regions
An overview of the samples included in our schizophrenia casecontrol analysis is given in Table 1. We estimated the proportion of neuronal cells for each sample using the CETS package in R (43). To identify DMPs in each brain region, we used linear regression with the preprocessed and normalized methylation (b) values separately for samples from each brain bank using disease status, age, sex and neuronal proportion estimates as independent variables. Given the nature of the samples used in this study, information about medication, smoking status and other phenotypic information was not available and could not be included as covariates in analyses. Neuronal proportion estimates were not included as a variable for cerebellum samples because of the high proportion of non-NeuNexpressing neurons, which make CETS unsuitable for estimating the cell composition. In our data, cerebellum neuronal estimates derived from CETS correlated significantly with age (q ¼ 0.48, P ¼ 1.26E-05) reflecting the previously reported ageassociated variance in the ratio of NeuN-expressing and nonexpressing cells in the cerebellum (43). For tissues collected from both brain banks (PFC, STR and CER) a fixed-effect metaanalysis on the adjusted mean b values computed with inverse variance weights was performed using the 'metacont' from the meta package in R (66). Only probes that survived QC and were common to both datasets were used in the meta-analysis (Supplementary Material, Table S21). We employed a fixedeffects (rather than random-effects) meta-analysis because with only two sample cohorts contributing to the pooled effect size, the precision of the estimate of the between-studies variance is poor using a random-effects model. To identify DMRs, we identified spatially correlated P-values in our data using the Python module comb-p (23) to group spatially correlated DMPs (seed P-value < 1.00E-03, minimum of 2 probes) at a maximum distance of 300 bp in each brain tissue. DMR P-values were corrected for multiple testing using Sid ak correction (67) which corrects the combined P for n a /n r tests, where n a is the total number of probes tested in the initial EWAS and n r the number of probes in the given region. The Bioconductor package Bumphunter (24) was used to confirm specific DMRs identified by comb-p with an alternative method. The probes common to the PFC, STR and HC analyses (411,449 probes) were tested for homogeneous DNA methylation effects associated with schizophrenia across the three brain regions using a mixed-effects model with sex, age, neuronal estimates and brain bank as fixed effects and individual and brain region as random effects.

Identification of polygenic risk score-associated differentially methylated positions and regions
Although the utility of PRS for exploring the molecular genomic mechanisms involved in disease pathogenesis is largely unexplored, PRS-associated epigenetic variation is potentially less affected by factors associated with the disease itself (e.g. medication exposure, stress and smoking), which can confound case-control analyses. An overview of the samples included in our analysis of methylomic variation associated with the polygenic burden for schizophrenia is given in Supplementary Material, Table S1. To identify PRS-associated DMPs we performed a multiple linear regression for each cohort using the PRS, age, sex and neuronal proportion estimates as independent variables (except in the CER, where neuronal proportion estimates were not included, as described above). For each of the tissues collected from both brain banks (PFC, STR and CER), a fixed-effect meta-analysis based on the linear regression estimates and their standard errors was computed with inverse variance weights using the 'metagen' function from the meta package in R (66). Only probes that passed our QC metrics and were common to both cohorts in each brain region were used in the meta-analysis (Supplementary Material, Table S21). To identify PRS-associated DMRs we used comb-p (23) as described above. To identify homogeneous DNA methylation effects associated with PRS across PFC, STR and HC data a mixed effects model was fitted as described above.

Establishing multiple testing significance threshold for EWAS analysis
To establish a stringent multiple-testing significance threshold to identify schizophrenia-associated DMPs, we utilized data from a large Illumina 450K dataset (n ¼ 675 individuals) generated as part of a recent study from our group (49). The sample was randomly split into cases and controls 5,000 times, and for each permutation an EWAS was performed using a multiple linear regression model controlling for age, sex, smoking and cell composition, and the probe-level P-values recorded. The minimum or most significant P-value was identified for each permutation and the 5 th percentile across the permutations was used to estimate the nominal P-value for 5% family-wise error (P ¼ 1.66E-07).

Targeted validation using bisulfite-pyrosequencing
Bisulfite-pyrosequencing was used to quantify DNA methylation across the chr17:154410-154672 region identified in our DMR analysis. The bisulfite-pyrosequencing assay was designed using the PyroMark Assay design software (Qiagen, Hilden, Germany), with bisulfite-PCR amplification performed in duplicate using the primers and assay conditions in Supplementary Material, Table S22. Fully methylated control samples were included in all experiments. DNA methylation was quantified across amplicons using the Pyromark Q24 system (Qiagen) following the manufacturer's standard instructions and Pyromark Q24 CpG 2.0.6 software.