Laura Ibanez, Laura Heitsch, Caty Carrera, Fabiana H G Farias, Jorge L Del Aguila, Rajat Dhar, John Budde, Kristy Bergmann, Joseph Bradley, Oscar Harari, Chia Ling Phuah, Robin Lemmens, Alessandro A Viana Oliveira Souza, Francisco Moniche, Antonio Cabezas-Juan, Juan Francisco Arenillas, Jerzy Krupinksi, Natalia Cullell, Nuria Torres-Aguila, Elena Muiño, Jara Cárcel-Márquez, Joan Marti-Fabregas, Raquel Delgado-Mederos, Rebeca Marin-Bueno, Alejandro Hornick, Cristofol Vives-Bauza, Rosa Diaz Navarro, Silvia Tur, Carmen Jimenez, Victor Obach, Tomas Segura, Gemma Serrano-Heras, Jong Won Chung, Jaume Roquer, Carol Soriano-Tarraga, Eva Giralt-Steinhauer, Marina Mola-Caminal, Joanna Pera, Katarzyna Lapicka-Bodzioch, Justyna Derbisz, Antoni Davalos, Elena Lopez-Cancio, Lucia Muñoz, Turgut Tatlisumak, Carlos Molina, Marc Ribo, Alejandro Bustamante, Tomas Sobrino, Jose Castillo-Sanchez, Francisco Campos, Emilio Rodriguez-Castro, Susana Arias-Rivas, Manuel Rodríguez-Yáñez, Christina Herbosa, Andria L Ford, Alonso Gutierrez-Romero, Rodrigo Uribe-Pacheco, Antonio Arauz, Iscia Lopes-Cendes, Theodore Lowenkopf, Miguel A Barboza, Hajar Amini, Boryana Stamova, Bradley P Ander, Frank R Sharp, Gyeong Moon Kim, Oh Young Bang, Jordi Jimenez-Conde, Agnieszka Slowik, Daniel Stribian, Ellen A Tsai, Linda C Burkly, Joan Montaner, Israel Fernandez-Cadenas, Jin Moo Lee, Carlos Cruchaga, Multi-ancestry GWAS reveals excitotoxicity associated with outcome after ischaemic stroke, Brain, Volume 145, Issue 7, July 2022, Pages 2394–2406, https://doi.org/10.1093/brain/awac080
Abstract
During the first hours after stroke onset, neurological deficits can be highly unstable: some patients rapidly improve, while others deteriorate. This early neurological instability has a major impact on long-term outcome. Here, we aimed to determine the genetic architecture of early neurological instability measured by the difference between the National Institutes of Health Stroke Scale (NIHSS) within 6 h of stroke onset and NIHSS at 24 h.
A total of 5876 individuals from seven countries (Spain, Finland, Poland, USA, Costa Rica, Mexico and Korea) were studied using multi-ancestry meta-analyses. We found that 8.7% of the variance in NIHSS at 24 h was explained by common genetic variation, and also that early neurological instability has a different genetic architecture from that of stroke risk. Eight loci (1p21.1, 1q42.2, 2p25.1, 2q31.2, 2q33.3, 5q33.2, 7p21.2 and 13q31.1) were genome-wide significant and explained 1.8% of the variability, suggesting that additional variants influence early change in neurological deficits. We used functional genomics and bioinformatic annotation to identify the genes driving the association from each locus. Expression quantitative trait loci mapping and summary data-based Mendelian randomization indicate that ADAM23 (log Bayes factor = 5.41) was driving the association for 2q33.3. Gene-based analyses suggested that GRIA1 (log Bayes factor = 5.19), which is predominantly expressed in the brain, is the gene driving the association for the 5q33.2 locus. These analyses also nominated GNPAT (log Bayes factor = 7.64) and ABCB5 (log Bayes factor = 5.97) for the 1p21.1 and 7p21.1 loci. Human brain single-nuclei RNA-sequencing indicates that the gene expression of ADAM23 and GRIA1 is enriched in neurons. ADAM23, a presynaptic protein, and GRIA1, a protein subunit of the AMPA receptor, are part of a synaptic protein complex that modulates neuronal excitability.
These data provide the first genetic evidence in humans that excitotoxicity may contribute to early neurological instability after acute ischaemic stroke.
Introduction
Stroke is the second most common cause of death and the most common cause of disability worldwide.1 Ischaemic stroke, the most common subtype,2 is caused by the occlusion of an artery in the brain, resulting in the abrupt development of cerebral ischaemia and neurological deficits.3 During the first hours after stroke onset, neurological deficits can be highly unstable, with some patients demonstrating rapid deterioration while others rapidly improve.4 In fact, early changes in neurological deficits have a major influence on long-term outcome. National Institutes of Health Stroke Scale (NIHSS) changes from baseline (within 6 h of stroke onset) to 24 h after acute ischaemic stroke (ΔNIHSS) have a significant and independent association with favourable 90-day outcome, accounting for >30% of the explained variance.4–6 A number of mechanisms are thought to account for these early changes, including fibrinolysis and reperfusion, haemorrhagic transformation, aetiology and endogenous neuroprotective mechanisms.7–14
Previous genome-wide association studies (GWAS), mostly in populations of European descent, have identified numerous loci associated with stroke risk. In 2018, the MEGASTROKE consortium performed one of the largest GWAS to date, combining most of the available GWAS for stroke risk in a unique multi-ancestry meta-analysis including 67 162 cases and 454 450 controls. This analysis led to the discovery of 22 novel loci, bringing the total stroke risk loci to 32. Many loci were previously linked to other vascular traits (blood pressure, cardiac phenotypes, venous thromboembolism), while others had no obvious connection with stroke, warranting further investigation to identify potentially novel mechanisms.15 A similar approach, used to decipher the genetics of long-term disability after ischaemic stroke in 6165 non-Hispanic Whites, identified one locus that has not been replicated so far.16,17 However, to date, there have been no genetic studies examining early neurological change after ischaemic stroke.
To our knowledge, the Genetics of Early Neurological InStability after Ischaemic Stroke (GENISIS) is the largest well-characterized study for early outcomes quantified by ΔNIHSS.18 To increase the power to detect genetic associations, our study recruited patients from multiple diverse ancestry groups. We leveraged the GENISIS cohort using ΔNIHSS as a quantitative phenotype, to identify novel variants, genes and pathways associated with early neurological instability after ischaemic stroke.
Materials and methods
Study design
A detailed description of the acute ischaemic stroke patients, recruited from 21 sites in seven countries throughout the world, has been published elsewhere.6 Briefly, adult acute ischaemic stroke patients with a measurable deficit on the NIHSS who presented within 6 h of stroke onset (or last known normal) were enrolled in the study after obtaining informed consent, including patients treated with tissue plasminogen activator (tPA). All available inpatient data, including history, clinical exam, laboratory values, diagnostic tests, imaging and discharge diagnosis, were used to confirm the diagnosis of ischaemic stroke. Patients who underwent a thrombectomy, were enrolled in other treatment trials or for whom consent and/or a blood sample could not be obtained were excluded. Demographics, co-morbidities, acute treatment variables, imaging data and TOAST (trial of ORG 10172 in acute stroke treatment) classification were collected.
To accommodate the differences in genetic architecture intrinsic to the country of origin, we performed a three-stage analysis (Fig. 1A). First, we used an additive model to perform a GWAS in each country individually, except for the USA, where the population was stratified into European and African ancestry cohorts. We then performed fixed-effects meta-analyses within the same ethnic cohorts. Finally, we used a multi-ancestry Bayesian meta-analysis to combine all the ethnic backgrounds. Unlike a fixed-effects meta-analysis, the Bayesian approach is able to account for population structure differences.19 Genetic loci that passed multiple test correction, with the threshold set at log Bayes factor (LBF) >5,19,20 were annotated using bioinformatics tools to identify the gene driving the genetic signal (Fig. 1B). We used functional annotation, multi-tissue expression quantitative trait loci (eQTL) data and summary data-based Mendelian randomization to map the genome-wide data to specific genes. Single-nuclei RNA-sequencing (RNA-seq) data derived from cortex samples were used to determine potential correlation between the transcripts of the identified genes and to determine in which brain cell types the genes are expressed.21

Figure 1. Study design. (A) Summarized description of the multi-step approach used to account for the genetic heterogeneity intrinsic to the multi-ancestry nature of the GENISIS study. We performed single variant analysis in each of the participating countries separately. Then we meta-analysed all the non-Hispanic whites (blue) and Hispanic (green) ethnicities. Finally, we analysed the non-Hispanic whites, Hispanics, Korea (orange) and US participants of African descent (US AfA, yellow) using a Bayesian model. (B) The variants with genome-wide significant or suggestive results were annotated using sequential steps to elucidate the gene driving the association. We performed gene-based and pathway analyses, we collected the information available in publicly available datasets and we performed Mendelian randomization. We also performed genetic architecture overlap tests to examine overlap with known genetic risk factors.
The study was approved by the Institutional Review Boards at every participating site. Written informed consent was obtained from all participants or their family members. All research was performed according to the approved protocols and consents.
Genotyping
All participants were genotyped using Illumina single nucleotide polymorphism (SNP) array technology. Samples were genotyped in seven batches during the GENISIS recruitment (Supplementary material). Genotyping quality control and imputation were performed separately for each genotyping round using SHAPEIT22 and IMPUTE2.23 For each genotyping batch, SNPs with a call rate lower than 98% and autosomal SNPs that were not in Hardy–Weinberg equilibrium (P < 1 × 10⁻⁶) were removed from the dataset. The X chromosome SNPs were used to determine sex based on heterozygosity rates, and samples with discordant inferred sex and reported sex were removed. Only samples with call rate >98% were considered to pass quality control. Finally, the genotype batches were merged in a single file to perform the analyses.
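The SNP-level filters described above (call rate and Hardy–Weinberg equilibrium) can be sketched as follows. This is a minimal illustration, not the pipeline used in the study: the function names are hypothetical, genotypes are assumed to be coded as 0/1/2 copies of one allele with `None` for a missing call, and a 1-df chi-square test stands in for whatever Hardy–Weinberg test the QC software applied.

```python
import math

def call_rate(genotypes):
    """Fraction of non-missing genotype calls (None marks a missing call)."""
    called = sum(g is not None for g in genotypes)
    return called / len(genotypes)

def hwe_chi2_pvalue(n_aa, n_ab, n_bb):
    """P-value of a 1-df chi-square test for Hardy-Weinberg equilibrium.

    Assumption: a chi-square approximation is used here for brevity;
    production QC tools often use an exact test instead.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)   # frequency of allele A
    q = 1 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square distribution with 1 degree of freedom
    return math.erfc(math.sqrt(chi2 / 2))

def passes_snp_qc(genotypes, min_call_rate=0.98, hwe_threshold=1e-6):
    """Apply the two SNP filters named in the text to one SNP."""
    if call_rate(genotypes) < min_call_rate:
        return False
    called = [g for g in genotypes if g is not None]
    counts = [called.count(k) for k in (0, 1, 2)]
    return hwe_chi2_pvalue(*counts) >= hwe_threshold
```

A SNP with genotype counts in exact Hardy–Weinberg proportions passes, whereas one with no heterozygotes at an allele frequency of 0.5 is rejected by the second filter.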
Additional quality control was performed in the merged dataset. We tested pairwise genome-wide estimates of proportion identity-by-descent to detect unexpected duplicates and cryptic relatedness (PI-HAT > 0.30). For each flagged pair, the sample with the higher genotyping rate was kept for downstream analysis. Principal component analysis was performed using HapMap as an anchor to remove ethnic outliers and keep the populations as homogeneous as possible for each of the participant countries. Principal components were also used to cluster and identify ancestry populations for US participants of European descent and African-American descent. Samples outside 2 standard deviations (SD) from the centre of the Non-Hispanic White or the Asian cluster were considered outliers for Spain, Finland and Poland. We confirmed the ethnicity of the African-American and Hispanic populations; however, due to the genetic heterogeneity present in these populations, we did not remove the samples outside 2 SD from the mean.
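The 2 SD outlier rule can be illustrated with a short sketch. The text does not say how distance from the cluster centre was measured, so this example assumes a per-component rule on the first two principal components, with the centre and SD estimated from the samples themselves; `flag_pc_outliers` and the sample identifiers are hypothetical.

```python
from statistics import mean, stdev

def flag_pc_outliers(pcs, n_sd=2.0):
    """Return the ids of samples lying more than n_sd SDs from the
    cluster centre on either of the first two principal components.

    pcs: dict mapping sample id -> (PC1, PC2).
    """
    ids = list(pcs)
    outliers = set()
    for axis in range(2):
        values = [pcs[i][axis] for i in ids]
        centre, spread = mean(values), stdev(values)
        for i in ids:
            if abs(pcs[i][axis] - centre) > n_sd * spread:
                outliers.add(i)
    return outliers

# Toy cluster of 20 tightly grouped samples plus one distant sample.
samples = {f"s{i}": (0.01 * (i % 5 - 2), 0.01 * (i % 4 - 1.5)) for i in range(20)}
samples["far"] = (1.0, 1.0)
print(flag_pc_outliers(samples))
```

Because the centre and SD are computed from the same samples being screened, a single extreme point inflates the SD; anchoring to a reference panel, as the text does with HapMap, avoids that circularity.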
Analysis of variance
We used genome-wide complex trait analysis (GCTA) to determine the heritability of ΔNIHSS.24 GCTA estimates the amount of phenotypic variance in a given complex trait explained by all the SNPs and fits the effects of these SNPs as random effects in a linear mixed model. Because it relies on large, homogeneous populations for accurate results, we only included the individuals with non-Hispanic White ancestry.
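For reference, the linear mixed model underlying this kind of variance estimate is usually written as below. The notation is the standard one for SNP-based heritability and is not taken from this article: y is the vector of ΔNIHSS values, X holds the fixed-effect covariates, g is the aggregate effect of all SNPs, and A is the genetic relationship matrix estimated from the genotypes.

```latex
\begin{aligned}
\mathbf{y} &= \mathbf{X}\boldsymbol{\beta} + \mathbf{g} + \boldsymbol{\varepsilon},
\qquad
\mathbf{g} \sim N(\mathbf{0}, \mathbf{A}\sigma_g^2),
\qquad
\boldsymbol{\varepsilon} \sim N(\mathbf{0}, \mathbf{I}\sigma_e^2) \\
\operatorname{var}(\mathbf{y}) &= \mathbf{A}\sigma_g^2 + \mathbf{I}\sigma_e^2,
\qquad
h^2_{\mathrm{SNP}} = \frac{\sigma_g^2}{\sigma_g^2 + \sigma_e^2}
\end{aligned}
```

Under this formulation, the proportion of variance explained by common variants is the ratio of the SNP variance component to the total phenotypic variance.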
Single variant analyses
To mitigate the effects of genetic heterogeneity due to the diverse ancestry of participants enrolled in the GENISIS study, we used a multi-step study design (Fig. 1A). First, we performed single variant analyses in each participating country separately. We tested the association of SNPs across the genome with ΔNIHSS using an additive linear model with PLINK 1.9.25 Sex, age and the two principal components calculated for each population were included in the model. Additional covariates included the SNP genotyping batch, TOAST classification (using dummy variables to incorporate all subtypes), tPA and baseline NIHSS to adjust for stroke severity. The primary focus of the GWAS was on early neurological change; thus, baseline NIHSS was included as a covariate in the model. Although baseline NIHSS was used to calculate ΔNIHSS, it does not fully explain the observed variance in ΔNIHSS; further, there is no multicollinearity between these two variables, permitting their inclusion in the model.26 Second, we meta-analysed the populations with similar ethnic backgrounds using fixed-effect meta-analyses in METAL.27 We performed two meta-analyses, one for the non-Hispanic Whites (Spain, Finland, Poland and USA European descent) and one for the Hispanics (Costa Rica and Mexico). Finally, we analysed the four available ethnicities, non-Hispanic Whites (meta-analysis), Hispanics (meta-analysis), Asians (Korea) and African Americans, using MANTRA, a Bayesian multi-ancestry meta-analysis method.19 An LBF >5 was considered to be genome-wide significant after multiple test correction.19,20
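The second step, a fixed-effect meta-analysis, can be sketched with inverse-variance weighting. Two assumptions are made here and should be read as such: the text does not state which weighting scheme was used in METAL, and standard errors are not reported, so they are back-computed from the per-cohort beta and two-sided P under a normal approximation. The inputs are the Table 2 estimates for rs17115054 in the non-Hispanic White cohorts (Poland is NA for this SNP and is left out).

```python
import math
from statistics import NormalDist

def se_from_beta_p(beta, p):
    """Approximate a standard error from beta and a two-sided P-value."""
    z = NormalDist().inv_cdf(1 - p / 2)
    return abs(beta) / z

def fixed_effect_meta(estimates):
    """Inverse-variance weighted pooling of (beta, se) pairs.

    Returns the pooled beta, its standard error and a two-sided P-value.
    """
    weights = [1 / se ** 2 for _, se in estimates]
    total = sum(weights)
    beta = sum(w * b for w, (b, _) in zip(weights, estimates)) / total
    se = math.sqrt(1 / total)
    p = math.erfc(abs(beta / se) / math.sqrt(2))
    return beta, se, p

# (beta, P) per cohort for rs17115054, taken from Table 2
cohorts = {
    "Spain": (1.267, 1.35e-7),
    "Finland": (1.455, 0.032),
    "US European descent": (-0.032, 0.963),
}
estimates = [(b, se_from_beta_p(b, p)) for b, p in cohorts.values()]
pooled_beta, pooled_se, pooled_p = fixed_effect_meta(estimates)
print(f"pooled beta = {pooled_beta:.3f}, SE = {pooled_se:.3f}, P = {pooled_p:.2e}")
```

The pooled estimate can be compared with the non-Hispanic White meta-analysis row of Table 2 (beta = 1.162, P = 6.41 × 10⁻⁸); small differences are expected because the standard errors here are approximations.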
To ensure that the loci were related to ΔNIHSS in all ischaemic strokes and were not specific to a stroke subtype (defined by TOAST criteria), we also performed joint analyses for cardioembolic stroke (n = 2149), large-artery atherosclerosis (n = 980), small vessel disease (n = 618), undetermined (n = 1926) and other (n = 222). No significant loci were found associated with specific stroke subtypes.
As both time to evaluation and time to treatment with tPA have been shown to be predictors of stroke outcome, we conducted sensitivity analyses for subjects that had available information regarding elapsed time to evaluation (n = 4477) and elapsed time to tPA (n = 2312). In both instances, the results of the joint GWAS with and without the time variable of interest demonstrated highly correlated beta and P-values and did not reveal any additional potential loci associated with ΔNIHSS.
Functional annotation
We annotated all the variants with suggestive associations (LBF > 4) with ANNOVAR28 and SnpEff29 to identify the nearest gene and to determine whether any variant is predicted to change protein sequence (non-synonymous variants) or could affect expression. We also confirmed whether any of the SNPs were possible regulatory elements or DNA features using RegulomeDB.30
DEPICT31 and FUMA32 were used to perform gene ontology and pathway analyses. We also leveraged brain single-nuclei RNA expression data (http://ngi.pub/snuclRNA-seq/),21 to determine whether the genes located in each identified locus were expressed in brain. For those expressed in brain, we also investigated whether they were expressed in specific brain cell types (Fig. 1B). Finally, we accessed blood RNA expression data taken at different times after stroke onset (3, 5 and 24 h) from the CLEAR trial33 (NCT00250991 at https://www.clinicaltrials.gov/) to test whether the expression of genes located in the identified loci was associated with NIHSS or ΔNIHSS (NIHSS at 5 h − NIHSS at 24 h). We extracted the correlation between ΔNIHSS and gene expression (measured using Affymetrix U133 Plus 2.0 array).34
Expression quantitative trait mapping, Mendelian randomization and co-localization
To identify the most likely functional gene, we accessed available expression quantitative trait loci (eQTL) datasets: the Genotype-Tissue Expression (GTEx) Project V8 (accessed on 12/09/2021), the Brain eQTL Almanac (Braineac) and an in-house dataset that includes brain expression data for 613 brains.35 We used summary data-based Mendelian randomization (SMR)36 and co-localization37 to test for pleiotropic association between the expression level of a gene and a complex trait, to evaluate whether the effect size of a genetic variant on the phenotype is mediated by gene expression (Fig. 1B). We tested GWAS-significant and -suggestive loci from the ΔNIHSS analysis in two datasets: selected GTEx tissues (brain anterior cortex, cerebellum, brain cerebellar hemisphere, substantia nigra, hippocampus, frontal cortex and putamen) and the Westra et al.38 dataset derived from whole blood. Both summary data-based Mendelian randomization and co-localization require effect sizes and their standard errors to test the causal relationship, but MANTRA does not provide effect sizes. As a consequence, we performed these analyses using the summary statistics from the joint analysis for all populations, which are correlated with the results from MANTRA (r = −0.57; P < 1.07 × 10⁻⁵; data not shown). To complement the Mendelian randomization analyses with the posterior probability of a variant being causal in both GWAS and eQTL studies, accounting for genetic heterogeneity and linkage disequilibrium, we used eCAVIAR,39 which considers several variants within the GWAS-significant loci to perform the test.
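The need for effect sizes and standard errors becomes clear from the form of the summary data-based Mendelian randomization test for a single variant. The sketch below follows the commonly cited formulation of that test (an effect ratio plus an approximate 1-df chi-square statistic built from the two z-scores); the function name and the numeric inputs are hypothetical and are not values from this study.

```python
import math

def smr_test(b_gwas, se_gwas, b_eqtl, se_eqtl):
    """Single-variant SMR estimate and approximate test.

    b_gwas, se_gwas: SNP effect on the trait and its standard error.
    b_eqtl, se_eqtl: SNP effect on gene expression and its standard error.
    Returns (b_xy, t_smr, p): the estimated effect of expression on the
    trait, the approximate chi-square statistic (1 df) and its P-value.
    """
    z_gwas = b_gwas / se_gwas
    z_eqtl = b_eqtl / se_eqtl
    b_xy = b_gwas / b_eqtl
    t_smr = (z_gwas ** 2 * z_eqtl ** 2) / (z_gwas ** 2 + z_eqtl ** 2)
    p = math.erfc(math.sqrt(t_smr / 2))
    return b_xy, t_smr, p

# Hypothetical inputs: GWAS z = 4, eQTL z = 10
b_xy, t_smr, p = smr_test(b_gwas=0.8, se_gwas=0.2, b_eqtl=0.5, se_eqtl=0.05)
print(f"b_xy = {b_xy:.2f}, T_SMR = {t_smr:.2f}, P = {p:.2e}")
```

The statistic can never exceed the smaller of the two squared z-scores, so a strong eQTL cannot rescue a weak GWAS signal, and neither input can be derived from a Bayes factor alone.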
Genetic correlation
We examined similarities in the genetic architectures of stroke early outcomes (ΔNIHSS) and stroke risk15 using PRSice,40 linkage disequilibrium score regression (LDSC)41 and genetic covariance analyzer (GNOVA)42 (Fig. 1B). Briefly, PRSice calculates polygenic risk scores at different P-value thresholds by weighting each SNP by its effect size estimate. SNPs present in only one dataset, ambiguous SNPs (A/T or C/G) and all SNPs in linkage disequilibrium are removed before polygenic risk score calculation. LDSC and GNOVA estimate the genetic covariance and the variant-based heritability for two sets of summary statistics, each one corresponding to one trait of interest. These two parameters are used to calculate the genetic correlation and covariance, respectively, between the two traits. We limited our comparisons to the non-Hispanic White population to keep the population genetically homogeneous and to use the 1000 Genomes European population-derived reference dataset. We calculated the genetic correlation between the European ischaemic stroke summary statistics of the MEGASTROKE15 study and the non-Hispanic Whites meta-analysis summary statistics from the GENISIS study. We also determined whether traits related to cardiovascular and general health (age at death,43 lipid levels44 and body mass index45) are genetically correlated to ΔNIHSS.
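The polygenic risk score step can be illustrated with a small sketch covering P-value thresholding, removal of strand-ambiguous SNPs and removal of SNPs missing from one dataset. It is not PRSice itself: the function names, SNP identifiers and weights are hypothetical, and the linkage disequilibrium pruning mentioned in the text is assumed to have been done beforehand.

```python
AMBIGUOUS = {frozenset("AT"), frozenset("CG")}

def is_ambiguous(effect_allele, other_allele):
    """True for A/T or C/G SNPs, whose strand cannot be resolved."""
    return frozenset((effect_allele, other_allele)) in AMBIGUOUS

def polygenic_score(dosages, weights, p_threshold):
    """Weighted allele-count score for one individual.

    dosages: dict snp -> count of the effect allele (0 to 2) in the target data.
    weights: dict snp -> (effect_allele, other_allele, beta, p) from the
             discovery summary statistics.
    Returns (score, number of SNPs used).
    """
    score, used = 0.0, 0
    for snp, (a1, a2, beta, p) in weights.items():
        if snp not in dosages:       # SNP present in only one dataset
            continue
        if is_ambiguous(a1, a2):     # A/T or C/G
            continue
        if p > p_threshold:          # fails the P-value threshold
            continue
        score += beta * dosages[snp]
        used += 1
    return score, used

# Hypothetical discovery weights and one individual's dosages
weights = {
    "snp1": ("A", "G", 0.5, 1e-6),
    "snp2": ("A", "T", 1.0, 1e-8),   # ambiguous, always skipped
    "snp3": ("C", "T", -0.2, 0.04),
    "snp4": ("G", "A", 0.3, 1e-5),   # absent from the target data
}
dosages = {"snp1": 2, "snp2": 1, "snp3": 1}

for threshold in (1e-4, 0.05):
    print(threshold, polygenic_score(dosages, weights, threshold))
```

Scoring at several thresholds, as in the loop above, mirrors the idea of evaluating the score across a range of P-value cut-offs rather than committing to a single one.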
Data availability
Summary statistics of the GENISIS dataset used for these analyses along with individual data for the full GENISIS dataset will be uploaded to the Database of Genotypes and Phenotypes and will be titled ‘GENISIS’.
Results
The GENISIS study recruited 5876 acute ischaemic stroke patients from seven countries (Spain, Finland, Poland, USA, Costa Rica, Mexico and Korea). The mean patient age was 73 years; 45% of the patients were females, 54% were treated with tPA, 20% had a previous history of stroke. No significant differences in age or sex were found across sites. The distribution of TOAST classification of stroke aetiology was also similar across sites. Significant differences were observed in baseline NIHSS and tPA treatment rates, probably due to differences in practices across the sites (Table 1).6 ΔNIHSS approximated a normal distribution, similar to that of each of the ethnic groups (non-Hispanic whites, Hispanics, African descent and Asians) (Supplementary Fig. 1).
| | Spain (n = 3419) | Finland (n = 490) | Poland (n = 356) | US European descent (n = 798) | Costa Rica (n = 141) | Mexico (n = 63) | Korea (n = 285) | US AfA (n = 324) | GENISIS (n = 5876) |
|---|---|---|---|---|---|---|---|---|---|
| Age, yearsᵃ | 76.0 (66.0–83.0) | 68.0 (58.0–76.0) | 71.0 (63.0–80.0) | 70.0 (60.0–79.0) | 67.0 (56.0–78.0) | 67.0 (50.5–75.5) | 69.0 (58.0–78.0) | 63.0 (54.0–74.3) | 73.0 (62.0–81.0) |
| Sex, females, n (%) | 1554 (45.5) | 193 (39.4) | 159 (44.7) | 337 (42.2) | 57 (40.4) | 28 (44.4) | 91 (31.9) | 169 (52.2) | 2588 (44.0) |
| Baseline NIHSSᵃ | 10.0 (5.0–17.0) | 5.0 (2.0–9.0) | 6.0 (3.0–12.0) | 6.0 (3.0–8.2) | 13.0 (9.0–18.0) | 11.0 (6.0–14.5) | 4.0 (2.0–8.0) | 7.0 (4.0–12.0) | 8.90 (4.0–15.0) |
| tPA treatment, % | 48.32 | 48.37 | 59.55 | 73.81 | 100 | 46.03 | 28.07 | 75.62 | 54.20 |
| ΔNIHSSᵇ | 2.77 ± 5.42 | 2.34 ± 5.68 | 2.12 ± 3.40 | 2.11 ± 5.98 | 6.00 ± 7.14 | 3.40 ± 4.90 | 1.17 ± 3.40 | 2.37 ± 6.29 | 2.56 ± 5.52 |
| TOAST classificationᶜ, % | | | | | | | | | |
| Cardioembolic | 38.32 | 41.63 | 29.21 | 37.72 | 21.28 | 23.81 | 30.53 | 29.01 | 36.50 |
| Large artery | 17.17 | 16.53 | 12.36 | 13.03 | 39.01 | 25.40 | 24.56 | 8.64 | 16.76 |
| Small vessel disease | 9.15 | 6.73 | 3.09 | 13.16 | 12.77 | 14.29 | 17.89 | 16.98 | 10.14 |
| Other | 2.46 | 8.16 | 2.81 | 3.13 | 2.13 | 15.87 | 13.68 | 3.09 | 3.76 |
| Undetermined | 32.90 | 26.94 | 52.53 | 32.96 | 24.11 | 20.63 | 13.33 | 42.28 | 32.83 |
US AfA = African-American descent.
ᵃ Values are expressed as median (95% confidence interval).
ᵇ Values are expressed as mean ± standard deviation.
ᶜ TOAST classification criteria.1
Identification of novel loci associated with stroke early outcomes
We performed single variant analyses for each individual cohort separately; then we combined cohorts with similar ethnic backgrounds; finally, we performed a multi-ancestry meta-analysis with the four ethnic groups available in this study (non-Hispanic Whites, Hispanics, Asians and African Americans) (Fig. 1A). We identified eight GWAS-significant loci (Fig. 2A and Table 2) associated with ΔNIHSS.

Figure 2. Association and annotation results. (A) Manhattan plot shows the LBF values from the multi-ancestry meta-analysis in each genomic location. The red line indicates the GWAS-significant threshold (LBF > 5) and the blue line the GWAS suggestive threshold (LBF > 4). The genome-wide significant loci are highlighted. Local Manhattan plots are shown for rs16838349 (C) and rs17115057 (F) along with the corresponding forest plots (D and G), showing the contribution of each population to the overall signal. As part of the functional gene mapping, we accessed an in-house single-nuclei dataset (B) to describe the expression patterns in human brain cortical cell populations of the driving genes identified for rs16838349, ADAM23 (E) and rs17115057, GRIA1 (H).
| SNP | rs1451040 | | rs9660272 | | rs58763243 | | rs13403787 | | rs16838349 | | rs17115054 | | rs10807797 | | rs9545725 | |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| MAF | 0.160 | | 0.161 | | 0.070 | | 0.158 | | 0.067 | | 0.059 | | 0.579 | | 0.108 | |
| Effect allele | T | | T | | G | | A | | G | | T | | A | | A | |
| Chr: position | 1:103158738 | | 1:232253211 | | 2:7762999 | | 2:178459146 | | 2:207427437 | | 5:153074938 | | 7:19995629 | | 13:82056977 | |
| | Beta | P | Beta | P | Beta | P | Beta | P | Beta | P | Beta | P | Beta | P | Beta | P |
| Non-Hispanic White cohorts | | | | | | | | | | | | | | | | |
| Spain | −0.095 | 0.565 | −0.060 | 0.712 | −0.413 | 0.158 | 0.687 | 7.50 × 10⁻⁵ | 0.803 | 0.001 | 1.267 | 1.35 × 10⁻⁷ | 0.560 | 1.33 × 10⁻⁵ | 0.096 | 0.578 |
| Finland | −0.046 | 0.905 | −0.273 | 0.481 | −1.426 | 0.003 | 0.133 | 0.763 | 0.674 | 0.250 | 1.455 | 0.032 | 0.635 | 0.016 | 0.015 | 0.973 |
| Poland | 0.360 | 0.411 | −0.427 | 0.320 | −0.285 | 0.613 | 0.310 | 0.588 | 1.893 | 0.005 | NAᵃ | NAᵃ | 0.127 | 0.705 | 0.074 | 0.892 |
| US European descent | 0.254 | 0.510 | −0.586 | 0.117 | −0.844 | 0.084 | NAᵃ | NAᵃ | 0.793 | 0.130 | −0.032 | 0.963 | 0.505 | 0.082 | −0.012 | 0.977 |
| Non-Hispanic White METAᵇ | −0.004 | 0.977 | −0.187 | 0.159 | 0.661 | 1.41 × 10⁻³ | 0.591 | 1.43 × 10⁻⁴ | 0.884 | 8.74 × 10⁻⁶ | 1.162 | 6.41 × 10⁻⁸ | 0.524 | 2.82 × 10⁻⁷ | 0.073 | 0.614 |
| Hispanic cohorts | | | | | | | | | | | | | | | | |
| Costa Rica | −3.236 | 0.014 | −5.203 | 0.002 | 0.485 | 0.691 | 2.812 | 0.053 | 2.674 | 0.029 | 0.249 | 0.893 | 1.880 | 0.019 | −8.735 | 1.33 × 10⁻⁵ |
| Mexico | −4.569 | 7.45 × 10⁻⁶ | −6.335 | 1.17 × 10⁻⁶ | 0.168 | 0.825 | 2.519 | 0.066 | 1.451 | 0.203 | 0.616 | 0.678 | −0.624 | 0.343 | 1.424 | 6.76 × 10⁻⁴ |
| Hispanic METAᵇ | −4.131 | 3.45 × 10⁻⁸ | −5.953 | 1.81 × 10⁻¹⁰ | −0.257 | 0.690 | 2.655 | 6.74 × 10⁻³ | 2.019 | 0.001 | 0.473 | 0.681 | 0.385 | 0.444 | −6.42 | 2.07 × 10⁻⁸ |
| Additional cohorts | | | | | | | | | | | | | | | | |
| Korea | 1.276 | 0.023 | −0.566 | 0.364 | −0.277 | 0.462 | 0.827 | 0.004 | NAᵃ | NAᵃ | NAᵃ | NA | 0.500 | 0.084 | NA | NA |
| US AfA | −0.919 | 0.071 | 0.078 | 0.887 | −6.491 | 8.24 × 10⁻⁸ | NAᵃ | NAᵃ | −2.896 | 0.020 | −1.230 | 0.158 | 0.416 | 0.387 | −0.514 | 0.452 |
| | Effectᶜ | LBF | Effectᶜ | LBF | Effectᶜ | LBF | Effectᶜ | LBF | Effectᶜ | LBF | Effectᶜ | LBF | Effectᶜ | LBF | Effectᶜ | LBF |
| MANTRAᵇ | −−+− | 6.56 | −−−+ | 7.64 | −+−− | 6.58 | +++? | 5.13 | ++?+ | 5.41 | ++?− | 5.19 | ++++ | 5.97 | +−−? | 5.50 |
a NA = not available due to MAF below the inclusion threshold (0.03) or non-convergence of the statistical model.
b Results from the meta-analysis of the single populations.
c Directions of effect are shown in the following order: people of non-Hispanic White, Hispanic, Korean and African-American (AfA) descent; +/−/? = positive beta in given population/negative beta in given population/not present in given population.
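The four-character direction strings in the MANTRA row can be read mechanically. The following is a minimal sketch, assuming only the population order and symbol meanings given in the table footnote; the function and constant names are hypothetical and are not part of MANTRA itself.

```python
# Population order as stated in the table footnote.
POPULATIONS = ["non-Hispanic White", "Hispanic", "Korean", "African-American"]

# Symbol meanings as stated in the table footnote.
SYMBOLS = {"+": "positive beta", "-": "negative beta", "?": "not present"}


def decode_directions(effect):
    """Map a four-character direction string to one label per population."""
    # The table prints the Unicode minus sign (U+2212); normalise it to ASCII.
    effect = effect.replace("\u2212", "-")
    if len(effect) != len(POPULATIONS):
        raise ValueError("expected one symbol per population")
    return {pop: SYMBOLS[sym] for pop, sym in zip(POPULATIONS, effect)}


# Direction string reported for rs17115054 (chr5) in the MANTRA row.
decoded = decode_directions("++?\u2212")
for population, label in decoded.items():
    print(f"{population}: {label}")
```

For this variant the sketch reports a positive beta in the non-Hispanic White and Hispanic groups, absence in the Korean cohort and a negative beta in the African-American cohort, which matches the per-cohort rows of the table.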
Three independent loci were identified in chr2. The first locus, tagged by rs58763243 [minor allele frequency (MAFG) = 0.07; LBF = 6.58], was located in a region comprising several long non-coding RNAs and microRNAs (Supplementary Fig. 2E). For this locus, all of the populations contributed to the association with negative betas, indicating that the minor allele was associated with lower (or more negative) ΔNIHSS. In addition, this locus reached genome-wide significance in the US African-American population and was nominally significant in the Finnish population (Supplementary Fig. 2F).
The second locus, rs13403787 (MAFA = 0.16; LBF = 5.13), was also located on chr2 in a region with >20 genes (Supplementary Fig. 2G). The minor allele was associated with higher (or more positive) ΔNIHSS in all cohorts (Supplementary Fig. 2H). The last genome-wide significant locus in chr2 was rs16838349 (MAFA = 0.07; LBF = 5.41), located in a region that includes ADAM23, CREB1, DYTN, NRP2 and MDH1B, among many others (Fig. 2C). The signal is driven by the non-Hispanic Whites (meta-analysis P = 8.74 × 10−6), but virtually all ethnic groups contributed to this association, as the directionality was consistent across Hispanic, non-Hispanic White and African-American ethnic groups (Fig. 2D and Supplementary Table 1). However, the SNPs in this locus were monomorphic in the Asian population. We did not observe any significant correlation between ΔNIHSS and the genotype at this locus (R2 = 0.063, P = 0.09; Supplementary Table 2).
Five additional loci were identified outside chr2. Two independent loci were identified in chr1. Both rs1451040 (MAFT = 0.16, LBF = 6.56) and rs9660272 (MAFT = 0.16, LBF = 7.64) were in gene-rich regions (Supplementary Fig. 2A and C). These two loci are highly significant in the Latino populations (Mexico and Costa Rica), with large negative effect sizes (Supplementary Fig. 2B and D). However, they are not significant in any of the other populations, except for rs1451040, which is nominally significant in the Korean population but has the opposite direction of effect. The locus identified on chr5 is located in a region containing nine genes (LOC101927134, GRIA1, FAM114A2, SAP30L, SAP30L-AS1, MFAP3, GALNT10, HAND1 and MIR3141; Fig. 2F). The minor allele for the top hit in this locus, rs17115057 (MAFT = 0.06; LBF = 5.19), was associated with greater (more positive) ΔNIHSS across most of the cohorts, and was significant in the Spanish (P = 1.35 × 10−7) and Finnish cohorts (P = 0.03; Fig. 2G). Another locus on chr7, tagged by the variant rs10807797 (MAFG = 0.42; LBF = 5.97), was located in a gene-rich region with 15 genes, including TWISTNB, MACC1, TMEM196, ABCB5 and RPL23P8 (Supplementary Fig. 2I). This locus is tightly encompassed by two recombination sites. The top signal was significant or suggestive in all populations except the Polish and Mexican cohorts. Consistently, the direction of effect was the same in all cohorts except the Mexican cohort (Supplementary Fig. 2J and Supplementary Table 1). Finally, we identified a locus on chr13, tagged by rs9545725 and located in a gene desert. No genes were identified in this locus (Supplementary Fig. 2K). The variants in the region were significant in the Latino cohorts but the direction of effect was not consistent across all cohorts (Supplementary Fig. 2L).
Moreover, the MAF for these variants ranged from 1% in the Korean population to 15% in the African-American and Spanish populations, suggesting that allele frequencies in the region vary widely by ethnicity. Thus, even though the locus is important for ΔNIHSS, it is possible that the tagging variant is not the causal variant.
Genetic contribution to early outcomes after ischaemic stroke
We used GCTA to quantify the phenotypic variance explained by common SNPs. Because GCTA exploits linkage disequilibrium patterns to calculate the explained variance, we restricted our analysis to non-Hispanic Whites. Due to founder effects present in the Finnish population, we also removed this cohort from the variance calculation (final n = 4573). GCTA revealed that common genetic variants explained 8.7% of the variance of ΔNIHSS (P = 0.001), confirming that genetic variants and genes are implicated in stroke outcomes. Next, we determined what proportion of the genetic component is explained by the GWAS signals.
The SNPs contained in the eight genome-wide significant loci, defined as 500 bp upstream or downstream of the top signal, explained 1.8% of the total variance (P = 2.18 × 10−4) of ΔNIHSS or just 20.7% of the genetic component of ΔNIHSS. This suggests that there are additional loci associated with ΔNIHSS yet to be discovered. Thus, studies with larger sample size and more statistical power are needed to identify these additional loci.
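The proportion quoted above follows directly from the two GCTA estimates. As a minimal sketch of that arithmetic (the variable names are illustrative only), the share of the SNP-based variance captured by the genome-wide significant loci, and the part left unexplained, can be computed as:

```python
# Percentages of ΔNIHSS variance reported in the text.
total_snp_variance = 8.7  # explained by all common SNPs (GCTA)
gwas_loci_variance = 1.8  # explained by the genome-wide significant loci

# Share of the genetic component captured by the significant loci.
share_of_genetic_component = gwas_loci_variance / total_snp_variance * 100

# Variance attributable to common SNPs outside those loci.
remaining_variance = total_snp_variance - gwas_loci_variance

print(f"Share of genetic component: {share_of_genetic_component:.1f}%")
print(f"Remaining SNP-based variance: {remaining_variance:.1f}%")
```

The first value rounds to 20.7% and the second to 6.9%, so roughly four-fifths of the SNP-based variance lies outside the loci identified here.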
Functional annotation of the genome-wide significant loci
Identifying the likely causal gene driving the association at each locus is a multi-step process (Fig. 1B). We first annotated the suggestive variants (LBF > 4), but none of them were predicted to change the protein sequence, comprise a regulatory element or affect the chromatin architecture. Next, we explored publicly available datasets to investigate whether any of the SNPs with suggestive LBFs were eQTLs (Supplementary Table 3). We performed gene-based analyses (Supplementary Table 4) and Mendelian randomization analyses to identify possible causal relationships between gene expression and ΔNIHSS (Supplementary Table 5). Summary results can be found in Fig. 3.

Gene prioritizing summary. Summary showing the seven genome-wide significant loci from the multi-ancestry analysis (first column), the total number of genes identified in each locus (second column) and the gene name for genes for which we have found some kind of evidence (third column). We have included the results from the gene-based analyses, the presence of any eQTL in the GTEx portal or Braineac for any of the genome-wide or suggestive variants, whether the gene is differentially expressed in any brain region according to the single-nuclei RNA-seq data and the results from Mendelian randomization using the Westra dataset (whole blood) or the GTEx portal (all tissues). Black dots indicate that the gene was not found, red that it was found but was not significant, yellow that it was moderately significant (1 × 10−3 < P < 0.05) and green that the association was significant (P < 1 × 10−3).
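The colour coding described in this caption amounts to a simple threshold rule on the P-value. The sketch below is one possible reading of it, not the authors' code: the function name is hypothetical, a missing P-value stands in for "gene not found", and the handling of values falling exactly on a threshold is an assumption, since the caption does not specify it.

```python
from typing import Optional


def evidence_colour(p_value: Optional[float]) -> str:
    """Return the dot colour for one gene in one line of evidence."""
    if p_value is None:
        return "black"   # gene not found in the dataset
    if p_value < 1e-3:
        return "green"   # significant association
    if p_value < 0.05:
        return "yellow"  # moderately significant
    return "red"         # found, but not significant


# Illustrative inputs only; None marks a gene absent from a dataset.
for p in (2.55e-4, 0.04, 0.30, None):
    print(p, evidence_colour(p))
```

Under this reading, a P-value of 2.55 × 10−4 maps to green, 0.04 to yellow and 0.30 to red.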
Gene-based analyses using FUMA suggested that DYTN (P = 2.55 × 10−4, Z = 3.47) and ADAM23 (P = 2.04 × 10−3, Z = 2.87) were the genes driving the association at 2q33.3. Several variants in the ADAM23 region were strong eQTLs for this gene in multiple tissues, based on the GTEx data (oesophageal mucosa: P = 2.00 × 10−6; cultured fibroblasts: P = 5.90 × 10−5). Mendelian randomization analyses indicated that ADAM23 (P = 0.04) was the gene driving the association in this locus. Human brain single-nuclei RNA-seq data indicate that ADAM23 expression is enriched in neurons (P < 2.20 × 10−16) compared to all the other brain cell types. More specifically, its expression is enriched in excitatory neurons (Fig. 2E).
Gene-based analyses using FUMA revealed that GRIA1, located in 5q33.2, was the gene most probably driving the association in that region (P = 0.03, Z = 1.83). However, Braineac identified several eQTLs for GALNT10 (P = 3.70 × 10−4) in the occipital cortex, although GALNT10 was less significant in the gene-based analysis (P = 0.04, Z = 1.79). The GTEx portal and the protein atlas reveal that GRIA1 is mainly expressed in brain tissue. While GALNT10 is also expressed in the brain, it has higher expression in other tissues. The human brain single-nuclei RNA-seq data confirmed that GRIA1 and GALNT10 are expressed in different brain cell types (Fig. 2H and Supplementary Fig. 3A). GRIA1 is highly expressed in neurons compared to other cell types (P < 2.20 × 10−16), but not expressed in oligodendrocytes (P < 2.20 × 10−16) or astrocytes (P < 2.20 × 10−16). In contrast, GALNT10 is expressed in microglia, oligodendrocytes and astrocytes, but expression in neurons is low (P < 2.20 × 10−16). GRIA1 expression in peripheral blood was also nominally associated with ΔNIHSS in the CLEAR trial dataset (P = 0.002, r2 = 0.22).
Of the remaining six loci, we were able to map five (Supplementary material). Briefly, eQTL analysis revealed that 1p21.1 was likely to be driven by COL11A1 or AMY2B. Gene-based and Mendelian randomization analyses suggested that the locus 1q42.2 was driven by GNPAT. No eQTLs were identified for 2p25.1, but gene-based analyses suggested that the signal is probably driven by AGPS or TTC30A. Regarding 2q31.2, it is probably driven by DFNB59, while 7p21.1 contains several eQTLs for TWISTNB and ABCB5 (Supplementary material).
Pathway analyses
Gene ontology and pathway analyses using DEPICT and summary statistics for ΔNIHSS revealed consistent suggestive associations with functions relating to the brain and CNS. The top tissue enrichment from DEPICT identified the cardiovascular system (P = 1.8 × 10−3) and the CNS (P = 2.0 × 10−3), including the brain (P = 0.01) and some brain regions: occipital lobe (P = 2.00 × 10−3), cerebral cortex (P = 4.80 × 10−3) and temporal lobe (P = 6.33 × 10−3; Supplementary Table 6). The most significant pathways in the gene-set enrichment were the regulation of the heart contraction (P = 5.80 × 10−6), the sodium ion transmembrane transport (P = 6.27 × 10−6), the circulatory system process (P = 6.39 × 10−6), learning or memory (P = 7.11 × 10−6) and abnormal CNS synaptic transmission (P = 2.88 × 10−5; Supplementary Table 7). Several genome-wide significant candidate genes fell within these networks. Of special interest is GRIA1 (5q33.2, LBF = 5.19), which falls within the sodium ion transmembrane transport pathway, adding evidence for the involvement of GRIA1 in ΔNIHSS. MAGMA gene-set analyses did not reveal any enriched gene set associated with ΔNIHSS (Supplementary Table 8).
Unique genetic architecture of early outcomes after stroke
We examined the genetic architecture of ΔNIHSS for shared genetic variation with other cardiovascular and ageing-related traits, including stroke risk, age at death, plasma lipid levels and body mass index using PRSice (Supplementary Table 9 and Supplementary Fig. 4), LDSC (Supplementary Table 10) and GNOVA (Supplementary Table 11). Although the P-value for PRSice was significant in the comparison of stroke risk and ΔNIHSS, the amount of variance explained was very small (R2 = 0.009). Additionally, this finding was not supported by LDSC or GNOVA analyses, suggesting that there is no genetic overlap, as reported in a previous work.18 Similarly, no overlap with age at death, lipid levels or body mass index was identified by LDSC or GNOVA. Even though PRSice found significant correlations with several stroke risk factors, namely high-density lipoprotein levels (P = 0.01), triglyceride levels (P = 8.97 × 10−4), total cholesterol (P = 0.02), body mass index (P = 1.89 × 10−6) and age at death (P = 0.01), the variance explained was below 0.5% in every case, suggesting that the overlap is minimal. LDSC was unable to calculate the heritability estimate for ΔNIHSS. GNOVA was successful at estimating the heritability for ΔNIHSS, but could not calculate the genetic correlation estimate. Several of the heritability estimates for ΔNIHSS in the overlap analyses were negative, probably due to the low number of variants included in the analyses. Because both GNOVA and LDSC require larger sample sizes, the results of these analyses were inconclusive.
Discussion
The first 24 h after stroke onset is a period of great neurological instability, which may reflect brain tissue at risk for infarction but with the potential for salvageability.4,46–48 Not only is early neurological change (as reflected by ΔNIHSS) common, but it is also influenced by known mechanisms involved in early deterioration/improvement and has a strong influence on long-term functional outcome.6 Here, we performed a GWAS using ΔNIHSS as a quantitative phenotype in 5876 acute ischaemic stroke patients. We found that ΔNIHSS is heritable: common SNPs account for 8.7% of its variance. We have found eight genome-wide significant loci that are related to ΔNIHSS. However, they explain only 1.8% of the variance, indicating that the remaining 6.9% is explained by variants below the genome-wide significance threshold. Through functional annotation, we have linked each locus to specific genes, some of which are uniquely expressed in the brain.
Of all the loci showing association with ΔNIHSS, functional annotation analyses strongly suggest that ADAM23 is the functional gene for the locus 2q33.3. ADAM23 belongs to the ADAM (a disintegrin and metalloproteinase) family of proteins, defined by a single-pass transmembrane structure with a metallopeptidase domain (some inactive). This protein family is involved in cell adhesion, migration, proteolysis and signalling.49 ADAM23 is a transmembrane member without a catalytic domain, and is involved in cell–cell and cell–matrix interactions.49,50 Previous studies have shown that ADAM23 is expressed in presynaptic membranes, linked by the extracellular protein LGI1 to postsynaptic ADAM22.51,52 We found that ADAM23 was expressed primarily in excitatory neurons of the cerebral cortex, based on our human brain single-nuclei transcriptomics dataset,21 and confirmed by the Human Transcriptomic Cell Types dataset from the Allen Brain Map.53 Several lines of evidence suggest that ADAM23 is important for pathological synaptic excitability: (i) adam23 is a common risk gene for canine idiopathic epilepsy54–56; (ii) mutations in its binding partner, LGI1, cause the neurological syndrome ADPEAF (autosomal dominant partial epilepsy with auditory features)57; and (iii) autoimmunity against LGI1 (as seen in limbic encephalitis) results in seizures and encephalopathy.58
Indeed, ADAM23 is also known to be a binding partner (via ADAM22 and PSD95) of the protein product of another one of our genome-wide significant associated genes, GRIA1, which encodes the α-amino-3-hydroxy-5-methyl-4-isoxazolepropionic acid receptor subunit 1 (AMPAR1).52 It has long been known that AMPA receptors, along with other glutamate receptors, are mediators of excitotoxic neuronal death, proposed to play an important role in ischaemic brain injury.59,60 The failure of numerous older clinical trials examining the efficacy of anti-excitotoxic drugs has cast doubt on the relevance of excitotoxicity in human acute ischaemic stroke, although questions about the quality of these early clinical trials have been raised.61–64 Thus, the association of ADAM23 and GRIA1 with ΔNIHSS provides the first genetic evidence that excitotoxicity may contribute to ischaemic brain injury in humans.
The plausible roles that ADAM23 and GRIA1 play in acute brain ischaemia mechanisms lend support to the idea that GWAS using ΔNIHSS as a quantitative phenotype can identify novel mechanisms and potential drug targets to mitigate neurological deterioration or enhance early improvement after stroke. From the CLEAR dataset,33,34 expression levels of GRIA1 in peripheral blood of ischaemic stroke patients were associated with ΔNIHSS between 5 and 24 h post-stroke onset, supporting a link between increased expression of GRIA1 and improved outcomes. In addition to the two genes discussed previously, our GWAS identified six other loci for which the functional genes remain to be identified. Acute ischaemic stroke patients are extremely well phenotyped, as part of standard of care, with both clinical assessments and structural/physiological imaging. Thus, there is great potential for additional quantitative phenotypes to expand understanding of the genetic architecture of acute ischaemic stroke, promising to identify novel mechanisms and drug targets. Larger and more comprehensive genetic studies of acute ischaemic stroke are needed.
There are several limitations to this study. GENISIS enrolled a heterogeneous group of stroke patients without regard to underlying aetiology, stroke localization and genetic and environmental background. Although we have previously demonstrated that aetiology (TOAST criteria) has little influence on ΔNIHSS, it is likely that mechanisms involved in neurological instability may depend on aetiology. Stroke localization may also be an important determinant of mechanisms involved in neurological instability. For example, mechanisms in cortical strokes may differ from those in subcortical or brainstem strokes. Furthermore, specific medication information (such as type of anticoagulation medication, if being used for secondary prevention at the time of stroke) was not collected, and therefore cannot be accounted for. False positive findings due to the characteristics of the population are possible, but by using MANTRA we were able to correct for population heterogeneity. Future studies might aim to enrol a more homogeneous cohort of stroke patients to increase power to discover more genetic variants that associate with neurological instability. Finally, most of the patients in GENISIS were enrolled before the thrombectomy treatment era, and patients who underwent thrombectomy were excluded from the study to reduce heterogeneity. As a result, genetic interactions with reperfusion are largely unexplored.
Acknowledgements
We would like to thank the patients and their families for making possible all the genetic studies included in this paper. We also thank the MEGASTROKE consortium for access to the data (see full list of MEGASTROKE authors in supplementary data), the Genotype-Tissue Expression (GTEx) Project (supported by the Common Fund of the Office of the Director of the National Institutes of Health, and by NCI, NHGRI, NHLBI, NIDA, NIMH, and NINDS) and the Brain eQTL Almanac (Braineac) resource to access the UK Brain Expression Consortium (UKBEC) dataset.
Funding
This work was supported by grants from the Emergency Medicine Foundation Career Development Grant; AHA Mentored Clinical & Population Research Award (14CRP18860027); NIH/NINDS-R01-NS085419 (C.C., J.M.L.); NIH/NINDS-R37-NS107230, NIH/NINDS U24-NS107230 (J.M.L.); NIH/NINDS-K23-NS099487 (L.H.); NIH/NIA-K99-AG062723 (L.I.); Barnes-Jewish Hospital Foundation (J.M.L.); Biogen (C.C., J.M.L.); Bright Focus Foundation, US Department of Defense, Helsinki University Central Hospital; Finnish Medical Foundation; Finland government subsidiary funds; Spanish Ministry of Science and Innovation; Instituto de Salud Carlos III (grants ‘Registro BASICMAR’ Funding for Research in Health (PI051737), ‘GWALA project’ from Fondos de Investigación Sanitaria ISC III (PI10/02064, PI12/01238 and PI15/00451), JR18/00004); Fondos FEDER/EDRF Red de Investigación Cardiovascular (RD12/0042/0020); Fundació la Marató TV3; Genestroke Consortium (76/C/2011); Recercaixa’13 (JJ086116). Tomás Sobrino (CPII17/00027), Francisco Campos (CPII19/00020) and Israel Fernandez are supported by Miguel Servet II Program from Instituto de Salud Carlos III and Fondos FEDER. I.F. is also supported by Maestro project (PI18/01338) and Pre-test project (PMP15/00022) from Instituto de Salud Carlos III and Fondos Feder, Agaur; and Epigenesis project from Marató TV3 Foundation. J.C., J.M., A.D., J.M.-F., J.A. and I.F. are supported by Invictus plus Network (RD16/0019) from Instituto de Salud Carlos III and Fondos Feder. Fundação de Amparo à Pesquisa do Estado de São Paulo (FAPESP-2013/07559-3) (I.L.-C.), Sigrid Juselius Foundation. The MEGASTROKE project received funding from sources specified at http://www.megastroke.org/acknowledgments.html. B.S., B.A. and F.S. are supported by NIH awards NS097000, NS101718, NS075035, NS079153 and NS106950.
Competing interests
C.C. receives research support from Biogen, EISAI, Alector and Parabon, and is a member of the advisory board of ADx Healthcare and Vivid Genomics. J.M.L. receives research support from Biogen, and is a consultant for Regenera. E.A.T. and L.C.B. are employed by Biogen. J.F.A. has received speaker or consultant honoraria from Bayer, Boehringer Ingelheim, Pfizer-BMS, Daiichi Sankyo, Amgen and Medtronic. T.T. receives or has received research support from Bayer, Boehringer Ingelheim and Bristol Myers Squibb; he is a member of advisory boards for Bayer, Boehringer Ingelheim, Bristol Myers Squibb and Portola Pharmaceuticals, and he has been granted international patents: new therapeutic uses (method to prevent brain oedema and reperfusion injury), and thrombolytic compositions (method to prevent post-thrombolytic haemorrhage formation). The funders of the study had no role in the collection, analysis or interpretation of data; in the writing of the report or in the decision to submit the paper for publication.
Supplementary material
Supplementary material is available at Brain online.
References
Abbreviations
- eQTL = expression quantitative trait loci
- GCTA = genome-wide complex trait analysis
- GENISIS = Genetics of Early Neurological InStability after Ischaemic Stroke
- GWAS = genome-wide association studies
- LBF = log Bayes factor
- MAF = minor allele frequency
- NIHSS = National Institutes of Health Stroke Scale
- SNP = single nucleotide polymorphism
- tPA = tissue plasminogen activator