Obesity-related genetic determinants of stroke

As obesity, circulating lipids and other vascular/metabolic factors influence the risk of stroke, we examined whether genetic variants associated with these conditions are related to the risk of stroke using a case-control study in Galicia, Spain. A selection of 200 single-nucleotide polymorphisms previously found to be related to obesity, body mass index, circulating lipids, type 2 diabetes, heart failure, obesity-related cancer and cerebral infarction was genotyped in 465 patients diagnosed with stroke and 480 population-based controls. An unsupervised Lasso regression procedure was carried out for single-nucleotide polymorphism selection based on their potential effect on stroke according to obesity. Selected genotypes were further analysed through multivariate logistic regression to study their association with risk of stroke. Using unsupervised selection procedures, nine single-nucleotide polymorphisms were found to be related to risk of stroke overall and after stratification by obesity. Of these, rs10761731, rs2479409 and rs6511720 in obese subjects [odds ratio (95% confidence interval) = 0.61 (0.39–0.95) (P = 0.027); 0.54 (0.35–0.84) (P = 0.006) and 0.42 (0.22–0.80) (P = 0.0075), respectively], and rs865686 in non-obese subjects [odds ratio (95% confidence interval) = 0.67 (0.48–0.94) (P = 0.019)], were independently associated with risk of stroke after multivariate logistic regression procedures.
The associations between the three single-nucleotide polymorphisms and stroke risk in obese subjects were more pronounced among females: for rs10761731, odds ratios among obese males and females were 1.07 (0.58–1.97) (P = 0.84) and 0.31 (0.14–0.69) (P = 0.0018), respectively; for rs2479409, odds ratios were 0.66 (0.34–1.27) (P = 0.21) and 0.49 (0.24–0.99) (P = 0.04) for obese males and females, respectively; the stroke–rs6511720 association was also slightly more pronounced among obese females, with odds ratios of 0.33 (0.13–0.87) (P = 0.022) and 0.28 (0.09–0.85) (P = 0.02) for obese males and females, respectively. The rs865686–stroke association was more pronounced among non-obese males [odds ratios = 0.61 (0.39–0.96) (P = 0.029) and 0.72 (0.42–1.22) (P = 0.21) for non-obese males and females, respectively]. A combined genetic score of variants rs10761731, rs2479409 and rs6511720 was highly predictive of stroke risk among obese subjects (P = 2.04 × 10⁻⁵), particularly among females (P = 4.28 × 10⁻⁶). In summary, single-nucleotide polymorphisms rs10761731, rs2479409 and rs6511720 were found to be independently associated with the risk of stroke in obese subjects after adjustment for established risk factors. A combined score with the three genomic variants was an independent predictor of risk of stroke among obese subjects in our population.


Introduction
Ischaemic stroke is a neurological disease caused by focal obstruction of the cerebral circulation. Its etiopathogenesis is associated with disease of large and small vessels and with embolisms originating mainly in the heart. Although genetic causes are only demonstrated in a small proportion of cases, it is possible that genetics plays a more significant role in stroke aetiology. 1,2 Currently, stroke prevalence is increasing with the ageing of the population; it is the second cause of death and the first cause of disability associated with illness. Recent advances in genomics can provide new clues on stroke mechanisms and help predict future risk of disease and prognosis or identify new drug targets for treatment. 2 So far, the complexity and heterogeneity of stroke, with many aetiological factors, has made the identification of genes difficult. Because several mechanisms are involved in stroke aetiology, such as blood pressure, obesity, atrial fibrillation, as well as other conditions, the disease is expected to share genetic influences with these conditions. In addition, genetic factors may shed light on the complex relationship between ischaemic stroke and some of these conditions, such as obesity, which can act either as a risk factor or, after occurrence of a first event, even as a protective factor, a fact underlying the concept of the obesity paradox in stroke aetiology and prognosis. 3 Thus, it is likely that the different roles of obesity and other conditions in ischaemic stroke could be, at least in part, genetically determined.
Genetic studies of obesity and other stroke-related conditions have highlighted strong genetic influences; we have therefore selected genetic variants implicated in major risk factors for stroke to provide new insights into the biology and pathways leading to the disease. We have also selected some common gene variants that directly influence the risk of stroke itself and could therefore modify disease risk, progression or response to pharmacological therapy.
The objective of this study was to examine the potential role of selected single-nucleotide polymorphisms (SNPs) previously found to be related to obesity, body mass index (BMI), circulating lipids, type 2 diabetes, obesity-related cancer or directly related to stroke/cerebral infarction or heart disease, on risk of stroke in the Spanish Galician population, in which no data on the genetics of stroke has been previously published.

Subjects
Patients consecutively admitted to the Stroke Unit of the University Clinical Hospital of Santiago de Compostela and included in the BICHUS (Biobanco de Ictus del Complejo Hospitalario Universitario de Santiago de Compostela) registry with a diagnosis of stroke in accordance with the current European guidelines of clinical practice were invited to participate in this study. In total, 465 patients with stroke and 480 population-based healthy controls were included. Controls were free of disease, were confirmed not to have had any previous stroke episode through Ianus, the computerized Galician electronic medical history, and were selected from the same base population as cases, from a parallel study of metabolic syndrome in Galicia, Spain. 4 Information on risk factors and anthropometrical and clinical characteristics was collected for each patient and control subject. This research was carried out in accordance with the Declaration of Helsinki of the World Medical Association (2008) and approved by the Ethics Committee of Clinical Research of Galicia (CEIC). All patients and controls were included in the study after giving signed written informed consent.

Ethics approval
All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki declaration and its later amendments or comparable ethical standards. The study was approved by the Ethics Committee of Clinical Research of Galicia (CEIC). Informed written consent was obtained from all individual participants included in the study.

Measurements and laboratory data
Venous peripheral blood samples for genetic analysis were obtained at admission of stroke patients. BMI was calculated at the moment of inclusion and categorized as non-obese (BMI <30 kg/m²) or obese (BMI ≥30 kg/m²). The diagnosis of diabetes mellitus was based on the latest criteria established by the American Diabetes Association. 5 Hypertension was defined as habitual systolic/diastolic blood pressure >140/90 mmHg or current use of any antihypertensive medication. Dyslipidaemia was defined by current treatment with anti-hyperlipidaemic drugs. We did not have specific total cholesterol (TC), low-density lipoprotein (LDL) or high-density lipoprotein (HDL) measurements for the study subjects. All patients were admitted to an acute stroke unit and treated according to the European Stroke Organisation guidelines. 6

SNP selection
The objective of this study is to assess the independent and interactive roles of genetic variants identified from genome-wide association studies (GWAS) of metabolic traits such as obesity, BMI, circulating lipids, type 2 diabetes, heart failure, obesity-related cancer and stroke (listing of SNPs in Supplementary Table 1), 7–15 on the risk of stroke. SNPs were selected based on the magnitude and consistency of their association with major risk factors for stroke, replication by several studies and budgetary considerations.
Briefly, (i) we selected 200 SNPs from those identified by large-scale GWAS as associated with obesity, BMI, circulating lipids, heart failure, obesity-related cancers and stroke; (ii) genotyped them using the iPLEX/MALDI-TOF genotyping assay; (iii) analysed them to evaluate their association with the risk of ischaemic stroke, overall and stratified by sex and obesity; and (iv) examined whether a genetic risk score (GRS) may play a role. Although an increasing number of genetic variants for ischaemic stroke are being identified in different populations, the majority of the studies were conducted in Caucasians, and very limited data exist on the effect of the identified genetic variants in the Spanish population.
Our SNP selection criteria were similar to those used in Mendelian randomization analysis. Briefly, all the phenotype-related SNPs included in the present study exhibited a P value ≤ 9 × 10⁻⁶ in the GWAS Catalogue, with the majority of them being ≤ 5 × 10⁻⁸, and were confirmed in several independent studies. Specifically, selected SNPs included common genetic variants influencing obesity at two loci, FTO and MC4R, which have been reproducibly associated with obesity measured by BMI. BMI-associated loci found in the SH2B1, TMEM18, NEGR1, KCTD15, BDNF, ETV5/DGKG, SEC16B/RASAL2, BCDIN3D/FAIM2 and MTCH2 genes were also included. Additionally, three loci (in NPC1, near MAF and near PTER) whose BMI-related alleles influence the risk of obesity-related diseases such as type 2 diabetes were also included. Additional SNPs identified in GWAS of type 2 diabetes and of other obesity-related metabolic traits, such as glucose, insulin/insulin response and C-reactive protein, were also included. Recent GWAS have also localized common DNA variants affecting circulating serum HDL, LDL, TC and triglycerides (TG). Also included are SNPs at ~38 loci that have been associated with one or more of these traits at genome-wide significance level (LCAT, APOB, APOE, PCSK9, LDLR, HMGCR, CETP, MLXIPL, GCKR, TRIB1, GALNT2). Finally, we also included 26 SNPs identified from GWAS that increase the risk of obesity-related cancers such as breast cancer, and that have previously been shown to exhibit potential interactions with obesity. 16

Genotyping
DNA was extracted from buffy coat using the Chemagic DNA Buffy Coat Kit special with the Chemagic MSM I system (Perkin Elmer, Waltham, MA), based on magnetic beads. After quantification of dsDNA using PicoGreen (Thermo Fisher Scientific, Waltham, MA), the DNA was diluted to a final concentration of 50 ng/ml in water.
Genotyping of the selected 200 SNPs was conducted by the CEGEN-PRB2 USC node using the iPlex Gold chemistry and MassARRAY platform, according to the manufacturer's instructions (Agena Bioscience, San Diego, CA). All assays were performed in 384-well plates, including negative controls and a trio of DNA samples obtained from the NIGMS Human Genetic Cell Repository at the Coriell Institute for Medical Research (NA10860, NA10861 and NA11984) for quality control.

Quality control
The following SNPs were excluded from analysis: rs4836133, which was not biallelic; rs12422552, rs13281615, rs2642442 and rs6602024, which had a Hardy-Weinberg equilibrium P value lower than 0.00025 among controls (corresponding to 0.05/200); and 10 additional SNPs with a minor allele frequency (MAF) lower than 1%. Therefore, the initial 200 SNPs were reduced to 185 SNPs that were finally analysed in relation to the risk of stroke. Because of the limited existing data on the Spanish and, particularly, the Galician population, we decided to genotype all selected SNPs despite a lower cost-efficiency. The statistical methods used are not affected by any possible linkage disequilibrium between SNPs.
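The two SNP-level filters described above can be sketched in a few lines. This is a minimal illustration in Python, not the authors' pipeline: the text does not say which Hardy-Weinberg test was used, so a 1-df chi-square goodness-of-fit test is assumed here, and the function names and genotype counts are hypothetical.

```python
import math

HWE_THRESHOLD = 0.05 / 200   # Bonferroni-style cut-off quoted in the text (0.00025)
MAF_THRESHOLD = 0.01         # minor allele frequency cut-off quoted in the text (1%)

def minor_allele_frequency(n_aa, n_ab, n_bb):
    """MAF from genotype counts (AA, AB, BB); assumes at least one genotyped sample."""
    total_alleles = 2 * (n_aa + n_ab + n_bb)
    freq_a = (2 * n_aa + n_ab) / total_alleles
    return min(freq_a, 1.0 - freq_a)

def hwe_chi2_p_value(n_aa, n_ab, n_bb):
    """P value of a 1-df chi-square test for Hardy-Weinberg equilibrium (assumed test)."""
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)
    q = 1.0 - p
    if p == 0.0 or q == 0.0:
        return 1.0  # monomorphic marker: no departure from HWE can be tested
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2))
    return math.erfc(math.sqrt(chi2 / 2.0))

def passes_qc(n_aa, n_ab, n_bb):
    """Keep a SNP only if it passes both the HWE and the MAF filter."""
    return (hwe_chi2_p_value(n_aa, n_ab, n_bb) >= HWE_THRESHOLD
            and minor_allele_frequency(n_aa, n_ab, n_bb) >= MAF_THRESHOLD)
```

In the study the Hardy-Weinberg filter was applied to genotype counts among controls only; the counts passed to `passes_qc` would therefore come from the control group.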

Patients and outcomes
We excluded samples with more than 5% missing genotype or relevant risk factor data. Two cases were excluded from the analysis because of missing information on important covariates such as sex or BMI. Another case was excluded because of 10 (5%) or more missing genotypes. Therefore, 462 patients were included in the final analysis. Among controls, 3 were excluded because they had a previous diagnosis of stroke, and 30 were excluded because of 10 (5%) or more missing genotypes. Therefore, 447 controls were included in the final analysis.

Data analysis
The statistical analyses were performed using the R statistical software version 3.2.3. 17 Categorical or dichotomous clinical variables were expressed as absolute values and percentages and were compared with the Pearson χ² test or t-test. Continuous variables were described as mean ± standard deviation (SD). For the comparison of quantitative continuous variables, Student's t-test or ANOVA was used.
Because of sample size limitations and the large number of SNPs studied, we first carried out an unsupervised Lasso regression procedure for SNP preselection. The procedure is based on the classical cross-validation technique used to evaluate machine learning models, in our case Lasso models. Briefly, samples were divided into 10 subsets (or folds) of approximately equal size and with equal proportions of cases and controls, and several Lasso models were constructed. The first fold was treated as a validation set, and the Lasso models were fitted on the remaining nine folds as a training set. Each of the folds was used in turn as the validation set, and information about the best SNPs in each round was retained to make an SNP selection at the end of the full process. We used the function cv.glmnet() from the R package glmnet, with misclassification error as the criterion for 10-fold cross-validation. Using Lasso methods, we preselected any genetic marker that showed differences in its ability to increase the risk of stroke, overall and/or stratified by the exposure of interest, obesity, and including SNP interaction terms with obesity and/or BMI. Next, the preselected SNPs were considered potential candidates for increasing the risk of stroke and were therefore analysed through multivariate logistic regression procedures in obese and/or non-obese subjects. Lasso preselection methods are based on cross-validation and are therefore free of multiple testing, since no association test is performed. The number of preselected SNPs was then used to establish a new significance threshold for the standard multivariate logistic regression procedures. Logistic regression analyses were carried out to assess the independence of genotypes in predicting the risk of stroke. Data were presented as odds ratios (OR) and 95% confidence intervals (95% CIs). Established risk factors for stroke, such as age, sex, obesity, BMI, smoking, height, type 2 diabetes, hypertension, hyperlipidaemia and ischaemic cardiopathy (Table 1), were considered as potential confounders and/or effect modifiers in the case-control analysis.
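The fold construction used for cross-validation can be illustrated as follows. This is a minimal Python sketch of stratified 10-fold assignment only; it does not fit the Lasso models, which the study ran with cv.glmnet() in R. The function names and the random seed are hypothetical, and the group sizes are the final analysis counts (462 cases, 447 controls).

```python
import random
from collections import Counter

def stratified_folds(labels, k=10, seed=0):
    """Assign each sample a fold index 0..k-1, keeping the case/control ratio per fold."""
    rng = random.Random(seed)
    folds = [None] * len(labels)
    for cls in sorted(set(labels)):
        idx = [i for i, y in enumerate(labels) if y == cls]
        rng.shuffle(idx)
        # Deal shuffled members of this class round-robin across the k folds
        for pos, i in enumerate(idx):
            folds[i] = pos % k
    return folds

def train_validation_split(folds, held_out):
    """Indices of the training folds and of the single held-out validation fold."""
    train = [i for i, f in enumerate(folds) if f != held_out]
    valid = [i for i, f in enumerate(folds) if f == held_out]
    return train, valid

# 1 = stroke case, 0 = control
labels = [1] * 462 + [0] * 447
folds = stratified_folds(labels)
cases_per_fold = Counter(f for f, y in zip(folds, labels) if y == 1)
controls_per_fold = Counter(f for f, y in zip(folds, labels) if y == 0)
```

Each fold then serves once as the validation set while a model is fitted on the other nine, matching the rotation described in the text.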
Since the relationship between stroke and BMI has been reported to be non-linear (U-shaped), with both low and high BMI conferring increased risk, we took this into account to model the underlying biological mechanism and conducted separate stratified analyses by BMI: non-obese (BMI <30 kg/m²) and obese (BMI ≥30 kg/m²).
Genetic risk score (GRS) calculation. Briefly, the GRS was calculated using a weighted method, calculating an average GRS per risk allele from each SNP for each individual. 18
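One common way to build a weighted GRS is sketched below in Python. The exact weights and normalization used in the study are not given in the text, so everything here is an assumption for illustration: the weights are the natural log of the inverse of the obese-stratum odds ratios quoted in the abstract (treated as per-allele effects), and the score is rescaled so that it is expressed in units of risk alleles.

```python
import math

# Placeholder effect sizes: obese-stratum ORs from the abstract, assumed per allele
ODDS_RATIOS = {"rs10761731": 0.61, "rs2479409": 0.54, "rs6511720": 0.42}

# The quoted ORs are below 1, so the risk-increasing allele is the other one;
# its weight on the log scale is ln(1 / OR)
WEIGHTS = {snp: math.log(1.0 / odds) for snp, odds in ODDS_RATIOS.items()}

def weighted_grs(risk_allele_counts, weights=WEIGHTS):
    """Weighted GRS for one individual.

    risk_allele_counts maps each SNP to the number of risk alleles carried (0, 1 or 2).
    The weighted sum is rescaled by (number of SNPs / sum of weights), an assumed
    convention that makes the score range from 0 to 2 * number of SNPs.
    """
    weighted_sum = sum(weights[snp] * risk_allele_counts[snp] for snp in weights)
    return weighted_sum * len(weights) / sum(weights.values())
```

For example, `weighted_grs({"rs10761731": 1, "rs2479409": 0, "rs6511720": 2})` gives more credit to the two rs6511720 risk alleles than an unweighted allele count would, because that SNP carries the largest weight.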

Availability of data and material
The datasets generated in the current research can be obtained from the corresponding author upon reasonable request.

Results
Demographic and clinical characteristics of study subjects are shown in Table 1. The mean ages of cases and controls were 70 and 64.6 years, respectively. Among cases, 57.4% were male, 23.2% were diabetic and 64.5% were hypertensive. The corresponding figures among control subjects were 55.3%, 9.4% and 27.5%, respectively (Table 1). Since our cases were, on average, 5.4 years older than controls, we conducted stratified analyses and reran all analyses excluding cases and controls under two different scenarios to fully account for the age-related differences; results were virtually unchanged (see Discussion section).

SNP selection by Lasso regression
As described above, we first carried out an unsupervised Lasso procedure for SNP selection in two steps. We first preselected any genetic marker that exhibited differences in its ability to increase the risk of stroke, overall or stratified by obesity. Through a 10-fold cross-validation strategy, 30 SNPs were preselected. Next, we repeated the procedure in the subset of 30 SNPs without stratification and with an SNP interaction term for obesity and/or BMI. Thus, a final list of nine SNPs (rs10761731, rs12190287, rs1432679, rs2116830, rs2479409, rs2531995, rs581080, rs6511720, rs865686) was selected. We then conducted multivariate logistic regression analyses to examine the associations between the nine SNPs and the risk of stroke, adjusting for established risk factors for stroke. We carried out case-control analyses in total subjects and stratifying by obesity, i.e. in obese subjects and non-obese subjects separately (Tables 2–4).
Three SNPs (rs2479409, rs6511720 and rs10761731) showed a strong association with the risk of stroke in obese subjects. Although these associations did not reach the required multiple testing Bonferroni corrected level of significance (0.05/9 ≈ 0.006), all three showed borderline independent associations among obese subjects, and showed virtually no association among non-obese subjects (Tables 3 and 4). Only one other SNP, rs865686 in 9q31.2, was found to be significantly associated with the risk of stroke among non-obese subjects (P = 0.019), and showed no association in obese subjects (Tables 3 and 4).
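The relation between a reported odds ratio, its 95% confidence interval and the P value can be sketched as below, together with the Bonferroni threshold for nine tests. This is an illustrative Python check under a normal (Wald) approximation on the log-odds scale, not the authors' computation; the inputs are the rounded obese-stratum values quoted in the abstract, so the recomputed P values only approximate the published ones.

```python
import math

Z_95 = 1.959964  # two-sided 95% quantile of the standard normal distribution

def wald_p_from_ci(odds_ratio, ci_low, ci_high):
    """Approximate two-sided P value recovered from an OR and its 95% CI."""
    # The CI spans 2 * Z_95 standard errors on the log-odds scale
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * Z_95)
    z = math.log(odds_ratio) / se
    return math.erfc(abs(z) / math.sqrt(2.0))

BONFERRONI = 0.05 / 9  # nine SNPs carried forward to logistic regression

# (OR, lower 95% limit, upper 95% limit) among obese subjects, from the abstract
reported = {
    "rs10761731": (0.61, 0.39, 0.95),
    "rs2479409": (0.54, 0.35, 0.84),
    "rs6511720": (0.42, 0.22, 0.80),
}
approx_p = {snp: wald_p_from_ci(*values) for snp, values in reported.items()}
```

Comparing each entry of `approx_p` with `BONFERRONI` shows why the text describes these associations as borderline: the values sit close to, rather than clearly below, the corrected threshold.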
Multivariate logistic regression models showed independence of the associations of the three SNPs found to be associated with the risk of stroke in obese subjects (data not shown). We constructed a GRS with the three SNPs (rs10761731, rs2479409 and rs6511720), taking into account their risk allele effects on the risk of stroke on a logarithmic scale. Among obese subjects, the GRS was strongly and significantly associated with the risk of stroke [1-unit change OR = 2.2 (1.5–3.2) (P = 2.04 × 10⁻⁵)] (Fig. 1). No association was observed among non-obese individuals [1-unit change OR = 1.1 (0.9–1.3) (P = 0.54)]. Despite sample size limitations, we also checked for potential differences in the GRS–stroke association by sex: it was more pronounced among females (P = 4.28 × 10⁻⁶) than males [1-unit change OR = 1.4 (0.9–2.2) (P = 0.14)] (Fig. 1). We also examined the effects of the selected SNPs on the risk of stroke stratified by hyperlipidaemia (Supplementary Tables 2 and 3). Results were similar to those stratified by obesity. We also examined the association among non-obese and non-hyperlipidaemic individuals (Supplementary Table 4), and results were similar to those found for non-obese or non-hyperlipidaemic individuals. Finally, we examined the SNP–stroke relationship among obese or hyperlipidaemic individuals, or both (Table 5), and found a more pronounced effect of the key SNPs on the risk of stroke.
The allele frequencies of our nine selected SNPs were similar to those in publicly available control datasets. Supplementary Table 5 lists the allele frequencies of the nine stroke-related SNPs from Tables 3 to 5 in publicly available control datasets such as the GWAS Catalogue and the NCBI ALFA.

Discussion
In this study, we selected 200 SNPs with a potentially relevant role in cerebrovascular disease because of their previously reported associations with metabolic and vascular disorders that play a role in the pathophysiology of stroke, mostly obesity- and lipid-related disorders, such as heart failure and obesity-related cancers, as well as genetic variants directly related to the risk of stroke. After a quality control analysis of the genotyping data, 185 SNPs were finally included in the study. An unsupervised Lasso procedure was carried out, from which a consensus list of nine SNPs was obtained. After adjustment for established risk factors for stroke, three of the SNPs (rs10761731, rs2479409 and rs6511720) were associated with the risk of stroke among obese subjects and one (rs865686) among non-obese subjects. Thus, obesity seems to play a role in the SNP–stroke relationship, as the three genetic variants were found to be independently associated with stroke in obese patients but not in non-obese patients, and, conversely, one of the variants was found to be associated with the risk of stroke only in non-obese patients. These data emphasize that obesity may play a role in the genetics of stroke, as different variants were found to be related to stroke in obese and not in non-obese subjects and vice versa. Obesity may be a determinant factor underlying the genetically based stroke risk stratification in this population.
The main results of the present study are that three SNPs (rs10761731, rs2479409 and rs6511720) were associated with the risk of stroke among obese subjects. For these three SNPs, the associations were confined to or more pronounced among females. We also built a combined GRS weighting the risk of each risk allele of the three genomic variants (rs10761731, rs2479409 and rs6511720), taking into account their effects on the risk of stroke on a logarithmic scale. The GRS–stroke association in obese individuals was also more pronounced among females. Only one SNP, rs865686, was found to be associated with risk of stroke in non-obese subjects, particularly among non-obese males. These findings, particularly among obese females, are the first report of an association in a Spanish population. We further checked the potential determinant role of obesity in the genetics of stroke, as supported by our findings, in the recently published literature. We first searched GWAS data from the publicly available MEGASTROKE study datasets, 19 and examined the summary statistics of our main findings, i.e. the SNPs found to be associated or borderline-associated with risk of stroke in the present study. Using a sample size of 239 313 subjects (503 624 subjects in the case of rs6511720 and rs2116830), we validated three (rs10761731, rs6511720, rs865686) of the four SNPs that were associated with risk of stroke in the present study, and two (rs12190287 and rs2116830) of the four that were of borderline significance (Supplementary Table 6).
We also reviewed the existing data on stroke GWAS and obesity/BMI from other publicly available large-scale studies such as Abraham et al., 20 who conducted a meta-analysis of GRSs which included a BMI–GRS that was found to enrich the ischaemic stroke GRS. The authors of that study noted that while recent GWAS had identified a GRS for ischaemic stroke, this GRS had only a modest predictive power in comparison with established stroke risk factors. In their meta-analysis, Abraham et al. 20 enriched this ischaemic stroke GRS with 19 additional GRSs from ischaemic stroke risk factors, co-morbidities and stroke subtypes, constructing a meta-GRS that outperformed the ischaemic stroke GRS. In this meta-GRS, the BMI–GRS played an important role, being the fourth GRS in terms of biggest contribution, and the first GRS from a 'non-cardio-related' phenotype. This relevant role of BMI in the ischaemic stroke meta-GRS is consistent with the results from our study showing differences in the genetic risk of ischaemic stroke by BMI.
We also downloaded the GWAS datasets from Abraham et al. 20 to examine the nine stroke-selected SNPs in our study stratified by BMI. At the SNP level, we checked the Supplementary materials of Abraham et al., 20 where meta-GRS model coefficients were available for each of 3.2 million SNPs. We found that four of the nine selected SNPs from Table 2 (rs10761731, rs2116830, rs2531995 and rs6511720) exhibited a sufficiently strong effect to be included in the ischaemic stroke GRS and/or the BMI–GRS from the study by Abraham et al., 20 and, consequently, they were included in the meta-GRS model.
Little is known about the genes and proteins related to the genetic variants revealed by the study. First, SNP rs2479409 in the proprotein convertase subtilisin-like kexin type 9 (PCSK9, OMIM 607786) gene may influence inter-individual variation in circulating LDL cholesterol levels. In a previous study, this common, potentially functional SNP showed an unusually extended homozygosity. PCSK9 is a newly discovered serine protease that plays a key role in LDL cholesterol homeostasis by mediating LDL receptor (LDL-R) breakdown through a post-transcriptional mechanism. 21–24 PCSK9 may also regulate apolipoprotein B-containing lipoprotein production, 25,26 and promote production of nascent very-low-density lipoprotein (VLDL) in the fasting state. 27 Adenoviral-mediated overexpression of human PCSK9 in mice promotes the accumulation of LDL cholesterol in plasma, but this response is absent in LDL receptor-deficient animals. 22,24,28 Recent studies show that PCSK9 binds directly to the extracellular domain of the LDL receptor 29,30 and increases its degradation. 29 With respect to neurological functions, PCSK9 expression has been detected in the cerebellum, as well as in other tissues. 31 PCSK9 may enhance degradation of other receptor types or proteins during the development of cerebellum and telencephalon, 31 and promote cerebellar cortical neurogenesis, possibly by increased recruitment of undifferentiated neural progenitor cells into the neuronal lineage. 32
rs6511720 in the 19p13 locus is another genetic variant associated with blood LDL cholesterol. 33 The nearest gene is LDLR (low-density lipoprotein receptor gene), which provides instructions for making the LDL receptor protein that binds to LDLs. Mutations in the LDLR gene cause an inherited form of high cholesterol called familial hypercholesterolaemia. The specific effect of rs6511720[T] in LDLR on LDL was quantified and estimated to be −0.26 ± 0.02 SD by Kathiresan et al. 33 and −0.15 ± 0.03 SD in a subsequent study. 34
rs10761731 in 10q21.3 is in the Jumonji domain containing 1C (JMJD1C) gene. The protein encoded by this gene interacts with thyroid hormone receptors. This SNP was previously associated with plasma TG level, 15,35 and a recent combined GWAS analysis found rs10761731 to be associated also with C-reactive protein and HDL cholesterol. 36
rs865686 in 9q31.2 was the only one among all studied SNPs to be significantly associated with the risk of stroke among non-obese subjects (P = 0.019). rs865686 has previously been found to be associated with the risk of oestrogen-positive breast cancer. 37 A study found that the effect of rs865686 on percent mammographic density (MD), an established risk factor for breast cancer, clearly differed across strata of BMI (P = 0.01). Interestingly, the effect of this SNP on density reflected an obesity paradoxical effect, as it was inversely associated with percent MD among heavy women (BMI ≥25 kg/m²), but positively associated with percent MD among lean women (BMI <25 kg/m²; per minor allele change in percent MD: −0.67% and 1.43%, respectively; P = 0.01). 38 In the present study, we found an inverse association between this genetic variant and stroke risk only among non-obese subjects (OR = 0.67, 95% CI 0.48–0.94) which could

Table 1
Baseline patients' and controls' characteristics

Table 4
Selected SNPs and risk of stroke among non-obese subjects. Adjusted by age, sex, smoking, diabetes mellitus, hyperlipidaemia, hypertension and ischaemic cardiopathy.

Table 3
Selected SNPs and risk of stroke among obese subjects. Adjusted by age, sex, smoking, diabetes mellitus, hyperlipidaemia, hypertension and ischaemic cardiopathy.

Table 2
Selected SNPs and risk of stroke, overall subjects

Table 5
Selected SNPs and risk of stroke among obese and/or hyperlipidaemic subjects