Jue-Sheng Ong, Puya Gharahkhani, Thomas L Vaughan, David Whiteman, Bradley J Kendall, Stuart MacGregor, Assessing the genetic relationship between gastro-esophageal reflux disease and risk of COVID-19 infection, Human Molecular Genetics, Volume 31, Issue 3, 1 February 2022, Pages 471–480, https://doi.org/10.1093/hmg/ddab253
Abstract
Symptoms related to gastro-esophageal reflux disease (GERD) were previously shown to be linked with increased risk for the 2019 coronavirus disease (COVID-19). We aim to interrogate the possibility of a shared genetic basis between GERD and COVID-19 outcomes. Using published GWAS data for GERD (78 707 cases; 288 734 controls) and COVID-19 susceptibility (up to 32 494 cases; 1.5 million controls), we examined the genetic relationship between GERD and three COVID-19 outcomes: risk of developing severe COVID-19, COVID-19 hospitalization and overall COVID-19 risk. We estimated the genetic correlation between GERD and COVID-19 outcomes, followed by Mendelian randomization (MR) analyses to assess genetic causality. Conditional analyses were conducted to examine whether known COVID-19 risk factors (obesity, smoking, type-II diabetes, coronary artery disease) can explain the relationship between GERD and COVID-19. We found small to moderate genetic correlations between GERD and COVID-19 outcomes (rg between 0.06 and 0.24). MR analyses revealed an OR of 1.15 (95% CI: 0.96–1.39) for severe COVID-19; 1.16 (1.01–1.34) for risk of COVID-19 hospitalization; and 1.05 (0.97–1.13) for overall risk of COVID-19 per doubling of odds in developing GERD. The genetic correlations and associations between GERD and COVID-19 showed mild attenuation towards the null when obesity and smoking were adjusted for. Susceptibility for GERD and risk of COVID-19 hospitalization were genetically correlated, with MR findings supporting a potential causal relationship between the two. The genetic association between GERD and COVID-19 was partially attenuated when obesity was accounted for, consistent with obesity being a major risk factor for both diseases.
Introduction
The coronavirus disease 2019 (COVID-19) outbreak has affected tens of millions globally, with the COVID-19 related death toll exceeding 1 million in October 2020. COVID-19 is caused by the newly identified severe acute respiratory syndrome coronavirus (SARS-CoV-2), which enters host cells via the transmembrane serine protease 2 (TMPRSS2) and the ACE-II cell receptor (ACE2) (1). Human hosts infected by the virus risk developing an excessive immune reaction and pneumonia-like symptoms, which can be fatal if left untreated. Whilst promising progress on COVID-19 vaccines has been made (2), risk factors associated with susceptibility for COVID-19 remain severely understudied.
More recently, symptoms related to gastro-esophageal reflux disease have been reported to be associated with COVID-19 in several epidemiological studies (3,4). Gastro-esophageal reflux is a chronic disease characterized by the frequent regurgitation of acid from the stomach into the esophagus. Prior in-vivo studies have shown evidence that the digestive system plays a role in the pathogenesis of SARS-CoV-2, as the key receptors ACE2 and TMPRSS2 were shown to be co-expressed in both the upper epithelial and gland cells along the esophagus and the colon (5). On the other hand, the observational link between gastro-esophageal reflux disease (GERD) and COVID-19 is unsurprising since both diseases share some major risk factors and present common symptoms (Fig. 1) (6–10). For instance, obesity and smoking are established risk factors for both GERD and COVID-19 (9,11). However, most of these studies were observational in nature and causality cannot be assumed. Genetic data offer an interesting avenue to validate these associations given the availability of large-scale genome-wide association study (GWAS) data on both COVID-19 susceptibility and GERD. If there is a direct causal effect of GERD on COVID-19, we would expect that genetic variants which increase GERD risk will also increase COVID-19 risk. Genetically derived findings can hence provide a complementary perspective into the biological mechanisms linking both diseases (12).

Diagram illustrating the observational risk factors and symptoms shared between COVID-19 outcomes and GERD susceptibility. Findings summarized from previous observational studies on GERD, risk of COVID-19 infection and severity (6–10). Arrows in bold are risk factors and symptoms unique to the disease.
In this study, we interrogate the possibility of a shared genetic architecture between COVID-19 and GERD in two ways. We first estimated the genetic correlation between GERD diagnosis and several COVID-19 related outcomes, followed by a genetic instrumental variable analysis to evaluate whether genetically predicted GERD diagnosis is linked with risk of COVID-19 infection. We further implemented a conditional analysis on common risk factors between GERD and COVID-19 to evaluate potential mediation mechanisms.
Results
Genetic correlation between GERD and COVID-19 outcomes
The sample size for each COVID-19 outcome is shown in Table 1. Each of the examined COVID-19 outcomes showed moderate evidence of being heritable. Based on an estimated 8% population-wide COVID-19 infection rate in Europe with ~10% of cases developing severe symptoms, the estimated SNP-heritability on the liability scale was h2 = 0.206 (se 0.038) for risk of developing severe COVID-19, h2 = 0.074 (0.015) for risk of COVID-19 hospitalization and h2 = 0.01 (0.004) for overall risk of developing COVID-19. The estimated genetic correlation between GERD and risk of severe COVID-19 was rg = 0.145 (se 0.050, p = 3.9e-03). Similarly, the correlation with risk of COVID-19 hospitalization was rg = 0.239 (se 0.053, p = 7.88e-06). There was no evidence for a genetic correlation between GERD and overall risk of COVID-19 infection [rg = 0.056 (se 0.078, p = 0.47)].
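As a rough illustration of how the p-values quoted above relate to the point estimates and their standard errors, the sketch below computes a two-sided Wald p-value under a normal approximation. This is an assumption about the test, not a description of the authors' pipeline: LD-score regression obtains its standard errors by block jackknife, and the inputs here are the rounded values from the text, so small differences from the published p-values are expected. The function name `wald_p` is purely illustrative.

```python
from statistics import NormalDist


def wald_p(estimate: float, se: float) -> float:
    """Two-sided p-value for H0: estimate = 0 under a normal approximation."""
    z = abs(estimate / se)
    return 2.0 * (1.0 - NormalDist().cdf(z))


# Rounded rg estimates and standard errors quoted in the paragraph above.
reported = {
    "severe COVID-19": (0.145, 0.050),
    "COVID-19 hospitalization": (0.239, 0.053),
    "overall COVID-19": (0.056, 0.078),
}

for outcome, (rg, se) in reported.items():
    print(f"{outcome}: z = {rg / se:.2f}, p = {wald_p(rg, se):.2g}")
```

The same calculation applies to any estimate reported with a standard error, provided the normal approximation is reasonable.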
Description of GWAS studies for GERD and COVID-19 outcome phenotypes involved in the present analysis
Trait | Phenotype definition | PMID | Total sample size | N cases; N controls | Additional information | Studies involved |
---|---|---|---|---|---|---|
Main exposure | | | | | | |
GERD | Cases inferred through self-reported GERD symptoms, use of GERD medication and ICD-10 codes | 31527586 | 367 441 | 78 707; 288 734 | Meta-analysis of GERD GWAS from UK Biobank and the Australian QSKIN study. See An et al. for more information. | UK Biobank, QSKIN |
COVID-19 outcomes (based on COVID-19 Host Genetic Initiative phenotype definitions) | | | | | | |
COVID-A2 | Very severe respiratory confirmed COVID-19 cases vs. controls | 32404885 | 1 059 456 | 4792; 1 054 664 | Please visit https://www.covid19hg.org/data-sharing/ for precise analysis protocol. Summary statistics based on January 2021 data release, including up to 30 studies. | Please visit https://www.covid19hg.org/results/ for the description of individual studies. |
COVID-B2 | Hospitalized COVID-19 cases vs. controls | 32404885 | 1 557 411 | 8316; 1 549 095 | | |
COVID-C2 | All COVID-19 cases vs. controls | 32404885 | 1 348 701 | 32 494; 1 316 207 | | |
Estimate of genetic causal effect for GERD on COVID-19 outcomes
A total of 88 independent GERD-associated SNPs (GERD association p-value < 5e-08) were used as genetic proxies to evaluate whether higher genetic predisposition towards GERD influenced risk of COVID-19 infection (Supplementary Material, Table S1). Power calculations indicated adequate statistical power for Mendelian randomization (MR) to detect moderate effect sizes (Supplementary Material, Table S3). Based on the inverse variance weighted (IVW) estimate, a doubling of the odds of being diagnosed with GERD was associated with increased risk of COVID-19 hospitalization (OR 1.16 [1.01–1.34]). The estimate for severe COVID-19 was similar in magnitude, but its confidence interval included the null (OR 1.15 [0.96–1.39]), and there was no evidence of an association with overall risk of COVID-19 infection (OR 1.05 [0.97–1.13]). Findings from alternative MR sensitivity models were not meaningfully different from the IVW estimates, with largely overlapping 95% confidence intervals (CIs; Fig. 2; complete summary in Supplementary Material, Table S4). Moreover, the MR-Egger intercept estimates were very small (<0.01) with 95% CIs overlapping the null, indicating very weak evidence of directional pleiotropy (Supplementary Material, Table S5). Similarly, there was limited evidence for heterogeneity among the SNP effect sizes as indicated by the Cochran Q statistics and the MR-PRESSO global heterogeneity tests (Supplementary Material, Tables S6 and S7). The MR scatter plots illustrating the fitted slope (causal effect) under various MR models are presented in Supplementary Material, Figures S1–S3. Finally, we found limited evidence for an association in the reverse direction, from COVID-19 outcomes to GERD susceptibility (Supplementary Material, Tables S8–S9).

MR estimates on COVID-19 outcomes per doubling of odds in GERD susceptibility. COVID-A2: Severe COVID-19 infection; COVID-B2: COVID-19 hospitalization; COVID-C2: Overall COVID-19 susceptibility; MRE: Multiplicative random effects. The MR Egger (bootstrap) model used 1000 iterations in the bootstrapping procedure. Error-bars represent the 95% CI around the derived OR estimate. For each COVID-19 outcome, each row represents the OR estimate derived from a different MR technique. Whilst some estimators had lower precision, findings were largely consistent with those derived from the inverse-variance weighted model. Estimates derived using robust IVW and robust MR-Egger models are not presented here, but can be found in the Supplementary materials.
The genetic correlation between GERD and COVID-19 outcomes after conditioning on known COVID-19 risk factors
The correlation matrix in Figure 3(a) shows the estimated genetic correlations between GERD, COVID-19 risk factors and COVID-19. Each of these risk factors showed strong evidence of genetic overlap with GERD (rg between 0.2 and 0.4). We also found moderate genetic correlations of BMI, CAD, cigarettes smoked per day and risk of Type-2 diabetes with risk of severe COVID-19 infection (COVID-A2) and risk of COVID-19 hospitalization (COVID-B2) (Fig. 3(a)). However, the estimated correlation of these risk factors with overall COVID-19 susceptibility (COVID-C2) was much weaker. We hence re-estimated the genetic correlation between GERD and COVID-19 outcomes after adjusting for these risk factors. Apart from the obesity-adjusted and smoking-adjusted COVID-19 models, where our revised rg estimates between GERD and COVID-19 outcomes showed evidence of attenuation towards the null, findings for the other covariate-adjusted models remained broadly consistent with the original unadjusted rg estimates (see Fig. 3(b)).

(A) Pairwise genetic correlation between GERD, COVID-19 risk factors and COVID-19 outcomes. (B) Genetic correlation estimates between GERD and risk factor-adjusted COVID-19 outcomes derived using bivariate LD-Score regression. COVID-A2: Severe COVID-19 infection; COVID-B2: COVID-19 hospitalization; COVID-C2: Overall COVID-19 susceptibility; SI: Smoking Initiation; CigPerDay: Cigarettes smoked per day; T2D: Type-II diabetes; CAD: Coronary artery disease; BMI: Body mass index. For (A), estimates in cells that are crossed-off represent correlation estimates that were non-significant following correction for multiple testing. Note that the rg estimates between COVID-19 outcomes were out of bound (i.e. rg > 1) but are not significantly different from one. (B) Error-bars represent the 95% CI around the point estimate. Adjustment for individual COVID-19 risk factors in the COVID-19 GWAS summary statistics performed via GCTA-mtCOJO. For the smoking-adjusted model, both cigarettes smoked per day and smoking initiation were conditioned on.
Estimate of genetic causal effect for GERD on COVID-19 outcomes after conditioning on known COVID-19 risk factors
To assess our genetic instruments for potential horizontal pleiotropic association via known COVID-19 risk factors, we evaluated the association between each of the 88 variants and the genetic effect sizes on BMI, smoking phenotypes, risk of T2D and risk of CAD using publicly available summary statistics (13–16). In our analysis, at least 50 SNPs showed evidence of pleiotropy (i.e. absolute Z-scores > 5) on one or more of the aforementioned COVID-19 risk factors. The heatmap in Figure 4 illustrates the distribution of Z-scores across these risk factors, indicating pervasive pleiotropic association between GERD instruments and established COVID-19 risk factors.
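The screening step described above can be sketched as follows: each instrument's effect on a risk factor is converted to a Z-score (effect size divided by its standard error) and flagged when the absolute value exceeds the threshold of 5 used in the text. The SNP identifiers, effect sizes and the function name `pleiotropy_flags` are hypothetical and serve only to illustrate the calculation.

```python
def pleiotropy_flags(effects, threshold=5.0):
    """Return {snp: [traits]} for traits where |beta / se| exceeds the threshold.

    effects maps each SNP to {trait: (beta, se)} taken from risk-factor GWAS
    summary statistics.
    """
    flagged = {}
    for snp, traits in effects.items():
        hits = [t for t, (beta, se) in traits.items() if abs(beta / se) > threshold]
        if hits:
            flagged[snp] = hits
    return flagged


# Hypothetical SNP IDs and summary statistics, for illustration only.
toy = {
    "rsA": {"BMI": (0.030, 0.004), "T2D": (0.010, 0.008)},   # BMI Z = 7.5
    "rsB": {"BMI": (0.002, 0.004), "CAD": (-0.001, 0.006)},  # no |Z| > 5
}

print(pleiotropy_flags(toy))
```

Counting the keys of the returned dictionary gives the number of instruments that are pleiotropic for at least one risk factor.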

Distribution of Z-scores for the association between GERD SNP instruments and COVID-19 risk factors. COVID-A2: Severe COVID-19 infection; COVID-B2: COVID-19 hospitalization; COVID-C2: Overall COVID-19 susceptibility; BMI: Body mass index; CAD: Coronary artery disease; T2D: Type-II diabetes; CigPerDay: Cigarettes smoked per day; SmkInit: Smoking initiation. The univariateGERD column shows the Z-scores derived from the primary An et al. (20) GERD GWAS. The MTAG_GERD column shows the Z-scores derived from the multi-trait GERD GWAS model, which showed an overall enhancement in genetic signal on GERD by leveraging correlated traits. As shown above, there is evidence of pervasive pleiotropy between the GERD instruments and most of these risk factors.
To control for potential genetic confounding between GERD and COVID-19 outcomes, we performed a conditional analysis adjusting for established COVID-19 risk factors in the COVID-19 GWAS summary statistics using GCTA-mtCOJO. Results for the conditional MR analysis are shown in Figure 5. They show minimal evidence of mediation through known COVID-19 risk factors, except for BMI, where the resultant MR estimate showed slight attenuation of the point estimate on COVID-19 hospitalization and risk of severe COVID-19 infection. However, the genetic association between GERD and overall risk of COVID-19 remained broadly unaffected in the conditional analyses. The pattern of associations under each adjusted mtCOJO model was broadly similar using alternative MR estimators, with concordant effect size estimates and direction of association (Supplementary Material, Tables S10–S13). Finally, excluding the variant rs9940128 in the FTO gene in our leave-one-SNP-out analyses did not alter our original MR findings (see Supplementary Material, Figs S4–S6).

MR estimates on COVID-19 outcomes per doubling of odds in GERD susceptibility conditioned on known COVID-19 risk factors. COVID-A2: Severe COVID-19 infection; COVID-B2: COVID-19 hospitalization; COVID-C2: Overall COVID-19 susceptibility; IVW: Inverse variance weighted; BMI: Body mass index; CAD: Coronary artery disease; T2D: Type-II diabetes. For every COVID-19 outcome, each row represents the estimated IVW OR on risk of COVID-19 based on either the unadjusted (original) model or a model adjusted for known COVID-19 risk factors using GCTA-mtCOJO. Error-bars represent the 95% CI around the derived OR estimate. The smoking-adjusted model includes adjustment for both cigarettes smoked per day and smoking initiation. Among all the aforementioned risk factors, the BMI-adjusted model showed the largest attenuation of effect towards the null. However, none of these covariate-adjusted estimates were meaningfully different from the unadjusted model.
Discussion
Our genetic analyses reveal strong genetic evidence supportive of a potentially causal relationship between GERD and COVID-19 susceptibility. Genetically, a doubling of the odds of having GERD was associated with a ~15% increase in the risk of severe COVID-19 and COVID-19 hospitalization, consistent with retrospective observational findings (3,17). Adjustment for known COVID-19 risk factors, most notably obesity, partially weakens the association between GERD and COVID-19, although with 95% confidence intervals that overlap the unadjusted estimates.
Drawing a direct causal inference between GERD and COVID-19 can be difficult, as both diseases share common risk factors such as smoking, diabetes and obesity. For instance, obesity was previously shown to be causally associated with both GERD and COVID-19 outcomes in earlier MR findings (18,19). Genetic correlation analyses further revealed strong genetic overlap between these risk factors and both GERD and COVID-19 risks. We controlled for these risk factors via mediation analyses, which typically reduced (but did not completely eliminate) the genetic correlations between GERD and COVID-19 outcomes (Fig. 3). Similarly, our MR estimates on COVID-19 outcomes after adjusting for these risk factors showed partial attenuation towards the null, but remained broadly consistent with a moderate adverse effect (Fig. 5).
Our analysis primarily focused on GERD susceptibility, as opposed to previous findings that evaluated the use of PPIs specifically (3). We showed in our previous study that GERD diagnoses attained through self-report and medication use (i.e. GERD cases inferred through individuals using GERD-related medications such as PPIs) were genetically very similar (20), and hence our genetics-based approach cannot reliably separate the effect of GERD from the effect of PPI use. That is, our data cannot clarify whether increased risk from COVID-19 relates to GERD and its complications per se, or to associated treatments such as PPIs. Most of our GERD instruments showed strong evidence of being replicated in an independent cohort (21), reducing the chance of winner’s curse bias (22). Our analysis also tried to control for genetic confounding arising from comorbidity with obesity, smoking behavior, diabetes and risk of cardiovascular complications; the association with COVID-19 was unchanged under these adjustments, apart from the model adjusting for obesity. While these findings are consistent with a role of PPIs in promoting greater risk for COVID-19 through a mechanism independent of obesity and/or detection bias (5), further validation is required.
Several limitations ought to be considered. Firstly, our study focused chiefly on evaluating COVID-19 susceptibility and severity/hospitalization relative to the non-infected population; while GWAS data were available for severe versus non-severe COVID-19 within COVID-19 cases, the sample sizes were smaller and our power was correspondingly limited. Obtaining heritability estimates on the genetic liability scale requires an approximate prevalence (heritability does not change dramatically if this is mis-specified); we tentatively assumed a population-wide COVID-19 prevalence of 10% based on availability of data (23). We note, however, that mis-specification of the prevalence has only negligible impact on the genetic correlation estimates (24). Instruments used for GERD susceptibility were derived from a symptom-based GERD definition (20). Even though we previously showed that the genetic architecture of our broad GERD definition is similar to that obtained via more robust clinical diagnosis, these instruments are highly heterogeneous in nature. We attempted to minimize biases arising from instrument misspecification by adopting MR models such as median- and mode-based estimators, though these models are typically less powered. Whilst the sample size for the analysis of overall COVID-19 infection was the largest, we did not observe strong evidence for an association with GERD. One possible explanation is that the overall COVID-19 phenotype might be heterogeneous, which is reflected in its relatively lower estimated heritability. We performed post-hoc analyses using an earlier dataset (based on the COVID-19 HGI Release 3 in September 2020), from before the availability of vaccines across the US and Europe, in an attempt to capture a better phenotype. We did observe evidence for an association with GERD, though the estimated effect sizes for overall risk of COVID-19 were still lower than those derived for COVID-19 severity and risk of COVID-19 hospitalization (Supplementary Material, Table S14). Lastly, our analysis found very little evidence for reverse causality from COVID-19 susceptibility to GERD, though our power was limited as there are only a handful of COVID-19 associated variants.
Due to the nature of our study, we were also unable to directly compare our findings to those evaluating the effect of duration of PPI use on COVID-19 severity (3). Genetically derived findings are conceivably less biased by confounding and reverse causality; however, issues of residual pleiotropy cannot be completely ruled out. Moreover, precise estimation of MR effect sizes for a binary–binary relationship such as that between GERD and COVID-19 susceptibility is difficult, as the homogeneity and monotonicity assumptions (25) might not be satisfied. Whilst we present genetic evidence supporting a causal relationship between GERD and COVID-19, the precise magnitude of association cannot be reliably estimated and compared with observational findings. The derived instrumental variable estimates will hence reflect the average causal effect of GERD on COVID-19 susceptibility among genetically predicted compliers (25) (i.e. GERD patients with genes that predict GERD diagnosis). Finally, although individuals included in the COVID-19 GWAS were predominantly of European ancestry, some of the smaller studies included participants from other/mixed ancestries.
In summary, genetic findings support a common genetic basis for both GERD and COVID-19 susceptibility, showing strong genetic correlation between GERD susceptibility and COVID-19 outcomes. Genetically predicted higher odds of being diagnosed with GERD increased the risk of developing severe COVID-19 and of being hospitalized due to COVID-19. Our conditional analyses show that the relationship between GERD and COVID-19 might be partly explained by shared comorbidities such as obesity, though the attenuation of effect sizes was only partial. Future studies are warranted to explore the biological mechanisms linking these risk factors, GERD and COVID-19.
Materials and Methods
Data source for GERD diagnosis
We obtained the largest GWAS summary data for GERD susceptibility from a study of 71 522 GERD cases and 261 039 controls of European ancestry from the UK Biobank and the Australian QSKIN cohorts (20). Both UK Biobank and QSKIN are population-based cohorts with predominantly middle-aged participants (26,27). Details on the genotyping, genetic QC and imputation for these studies have been previously described (26,27). GERD cases in both studies were derived through a combination of self-report, ICD-10 codes (K21), GERD-related medication or heartburn status (QSKIN only), though our previous findings have shown strong genetic correlations between broad and clinical GERD definitions (20). For the instrumental variable analyses, 88 GERD-associated variants were selected based on SNPs which reached genome-wide significance in a multi-trait GERD GWAS meta-analysis (21).
Data source for COVID-19 diagnosis
GWAS summary statistics for COVID-19 infection were obtained from the COVID-19 Host Genetic Initiative website (https://www.covid19hg.org/results/) (28,29). For the genetic analyses, we used the genetic summary statistics for each COVID-19 trait excluding participants from both 23andMe and UK Biobank (January 2021 data release) to prevent bias in the genetic causal inference analyses. Information on the demographics of the individual studies involved in the COVID-19 Host Genetic Initiative is available on the COVID-19 HGI partners interactive website (https://www.covid19hg.org/partners/). In total, GWAS data from 26 individual studies were meta-analyzed to generate the combined GWAS summary statistics, including up to 32 494 cases and more than 1 million controls. Information on the analysis protocol and how the genetic data from each study were quality controlled is provided in Supplementary Notes. We manually mapped the chromosome:basepair variant notation in the GWAS summary statistics back to RSID (under build 37) using the 1000G European reference panel for all our analyses. The three primary COVID-19 related outcomes evaluated in this study are provided below.
Exposure
Genetically predicted susceptibility towards GERD, derived through a combination of self-report status, ICD-10 codes, hospital records and use of GERD-related medications such as omeprazole. Note that we have previously shown that GERD phenotypes defined in these ways were highly consistent, i.e. the genetic correlations between self-report, medication-inferred and clinical GERD definitions were very close to one (20).
Primary outcome
Genetic estimates on risk of severe COVID-19 infection, risk of COVID-19 hospitalization and overall risk of developing COVID-19. See Supplementary Notes for the precise case definition for each COVID-19 outcome. Sample sizes for each COVID-19 outcome analysis are provided in Table 1.
Statistical approach
Genetic correlation analysis: The LD-score regression (30) method, implemented in the ldsc python software (available at https://github.com/bulik/ldsc) and requiring only summary-level genetic data, was used to estimate the pairwise genetic correlation between GERD and COVID-19 susceptibility outcomes. We first estimated the SNP-heritability (i.e. proportion of phenotypic variance captured by common genetic variation) for each COVID-19 outcome via LD-score regression and converted the observed heritability estimate to the genetic liability scale (--h2 option; see Supplementary Notes for technical details) (31). The pairwise genetic correlation between GERD and each COVID-19 outcome was estimated via bivariate LD-score regression (--rg option). For all analyses, the European 1000 Genomes Project reference panel was used to obtain the LD-score derived for each SNP among European populations. The genetic correlation estimate, rg, describes the genome-wide correlation of genetic effect sizes between two traits (30).
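To make the observed-to-liability scale conversion concrete, the sketch below implements the commonly used transformation for case-control studies, which depends on the population prevalence K, the proportion of cases in the sample P, and the height z of the standard normal density at the liability threshold. Whether this matches every detail of the authors' procedure is an assumption (their Supplementary Notes give the technical details), and the input values are hypothetical.

```python
from statistics import NormalDist


def liability_scale_h2(h2_obs: float, pop_prev: float, sample_prev: float) -> float:
    """Convert observed-scale SNP-heritability to the liability scale.

    h2_liab = h2_obs * K^2 (1 - K)^2 / (z^2 * P (1 - P)), where K is the
    population prevalence, P the sample case fraction and z the standard
    normal density at the threshold that leaves a fraction K in the upper tail.
    """
    nd = NormalDist()
    threshold = nd.inv_cdf(1.0 - pop_prev)
    z = nd.pdf(threshold)
    k_term = (pop_prev * (1.0 - pop_prev)) ** 2
    p_term = sample_prev * (1.0 - sample_prev)
    return h2_obs * k_term / (z ** 2 * p_term)


# Hypothetical inputs: observed-scale h2, population prevalence, sample case fraction.
print(round(liability_scale_h2(h2_obs=0.002, pop_prev=0.01, sample_prev=0.005), 4))
```

Because K enters only through this scaling factor, a different assumed prevalence rescales the heritability of a trait but cancels out of the genetic correlation, which is consistent with the authors' remark that rg is largely insensitive to the assumed prevalence.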
Genetic instrumental variable analysis (MR): We obtained 88 genome-wide significant SNP markers associated with GERD susceptibility from our previous work (21). The individual SNP effect sizes and allele frequencies for each of the 88 GERD-associated SNPs are provided in Supplementary Material, Table S1. Power calculations using the mRnd online interface (accessible here: https://cnsgenomics.com/shiny/mRnd/) were performed to assess power for MR to detect clinically relevant endpoints (32). To avoid over-inflation of effect sizes from the multi-trait model, we adopted genetic effect size estimates for the 88 SNPs from the published (univariate) GERD GWAS (20). Effect alleles were harmonized, removing variants with palindromic alleles whose strand could not be inferred from minor allele frequency (MAF > 0.4). We then fitted an IVW regression model for the genetic effect of GERD against the genetic effect of risk on COVID-19 outcomes, weighted by the inverse variance of the genetic effect on GERD. We repeated our MR analysis using other MR estimators (MR-Egger (33), MR-Weighted Mode (34), MR-Weighted median (34), MR-PRESSO (35), robust and penalized IVW (36) methods) to triangulate evidence for genetic causality under weak violation of key MR assumptions. Technical details of these alternative models have been previously described (35,37,38). To aid the interpretation of our association estimates, we scaled our results to reflect the log(OR) on COVID-19 outcomes per doubling of odds of developing GERD. This was done by multiplying the resultant genetically derived log(OR) estimate from the IVW model by log(2) = 0.693 (25). We also checked for reverse causality by performing an MR analysis for COVID-19 outcomes on GERD susceptibility (see Supplementary Notes). All statistical analyses were performed using the statistical software R v4.0.3. Genetic instrumental variable analyses were conducted using the TwoSampleMR and MendelianRandomization R packages (39,40).
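The IVW fit and the rescaling to a doubling of odds can be sketched as below. This is a minimal illustration, not the authors' code (they used the TwoSampleMR and MendelianRandomization R packages): it uses the conventional fixed-effect IVW weights based on the outcome standard errors, which is an assumption and may differ from the exact weighting described above, and all per-SNP values are hypothetical.

```python
import math


def ivw_estimate(beta_exp, beta_out, se_out):
    """Fixed-effect IVW slope through the origin, weighting each SNP by 1 / se_out^2."""
    weights = [1.0 / s ** 2 for s in se_out]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exp, beta_out))
    den = sum(w * bx * bx for w, bx in zip(weights, beta_exp))
    return num / den, math.sqrt(1.0 / den)


def or_per_doubling(beta, se):
    """Rescale a log(OR) per unit log-odds of exposure to an OR per doubling of exposure odds.

    Returns (OR, lower 95% limit, upper 95% limit).
    """
    scaled = beta * math.log(2)
    half_width = 1.96 * se * math.log(2)
    return math.exp(scaled), math.exp(scaled - half_width), math.exp(scaled + half_width)


# Hypothetical per-SNP log-odds effects, for illustration only.
beta_gerd = [0.020, 0.030, 0.025, 0.040]
beta_covid = [0.004, 0.006, 0.005, 0.008]  # constructed as 0.2 x the GERD effects
se_covid = [0.010, 0.012, 0.011, 0.015]

beta, se = ivw_estimate(beta_gerd, beta_covid, se_covid)
odds_ratio, lower, upper = or_per_doubling(beta, se)
print(f"IVW log(OR) = {beta:.3f}; OR per doubling = {odds_ratio:.3f} ({lower:.3f}-{upper:.3f})")
```

Multiplying by log(2) works because a doubling of the odds of GERD corresponds to an increase of log(2) on the log-odds scale of the exposure.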
Sensitivity analyses: conditional-analyses on common risk factors between GERD and COVID-19 outcomes
To evaluate potential horizontal pleiotropy between GERD and COVID-19 susceptibility, we first estimated the genetic effect size for the 88 GERD SNPs on a series of established risk factors (9) on COVID-19 susceptibility. These risk factors include: body mass index (BMI), type-2 diabetes (T2D), smoking traits (cigarettes per day, smoking initiation) and coronary artery disease (CAD). The largest and most recent GWAS summary statistics for these risk factors were obtained from publicly available repositories (13–16). Details of the data source and curation of GWAS datasets for these phenotypes available in Supplementary Note (sources in Supplementary Material, Table S2). Genetic effect size estimates for each risk factor are aligned based on the GERD-increasing allele.
We then explored whether the pattern of genetic correlation/causality changed when risk factors such as body mass index and smoking were adjusted for in the model. We implemented the GCTA-mtCOJO (mtCOJO) framework (41) to remove the genetic effects of established COVID-19 risk factors from the COVID-19 GWAS. In brief, the mtCOJO model is a summary-level approach which estimates the genetic effect on a trait of interest by conditioning out the causal effect of potential mediating factors on the outcome of interest (estimated using the GCTA-GSMR model). The resultant conditional estimates have also previously been shown to be robust against collider bias (see for instance (42) on collider bias generated via adjusting for heritable covariates). Further technical details on the mtCOJO model have been previously described (41); the mtCOJO framework is available within the GCTA software at https://cnsgenomics.com/software/gcta.
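The core idea of the conditioning step can be illustrated with a deliberately simplified Python sketch for a single covariate. It is not the GCTA-mtCOJO implementation: it computes point estimates only and ignores standard errors and the corrections for sample overlap that the real method handles. The function name, the assumed causal effect `b_xy` and the toy numbers are all hypothetical.

```python
def condition_on_covariate(beta_outcome, beta_covariate, b_xy):
    """Remove the covariate-mediated part of each SNP effect on the outcome.

    beta_outcome   : SNP effects on the outcome (e.g. a COVID-19 outcome)
    beta_covariate : effects of the same SNPs on the covariate (e.g. BMI)
    b_xy           : assumed causal effect of the covariate on the outcome
                     (in mtCOJO this is estimated with GSMR; here it is given)
    """
    return [by - b_xy * bx for by, bx in zip(beta_outcome, beta_covariate)]


# Toy, made-up values; not data from the study.
beta_covid = [0.010, 0.004]
beta_bmi = [0.020, 0.000]   # the second SNP has no effect on the covariate
adjusted = condition_on_covariate(beta_covid, beta_bmi, b_xy=0.3)
print(adjusted)
```

A SNP with no effect on the covariate is left unchanged, whereas a SNP that acts partly through the covariate has that portion of its outcome effect subtracted.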
We repeated our MR analyses between GERD and COVID-19 outcomes after adjusting for each COVID-19 risk factor in turn. This was done by regressing the GERD instrumental variables against the genetic estimate on COVID-19 susceptibility after conditioning on the aforementioned risk factors via mtCOJO. Similarly, MR estimates of the covariate-adjusted model derived from alternative MR techniques were used to validate the robustness of the IVW findings. Given that obesity is potentially a major risk factor for both GERD and COVID-19 susceptibility, we performed a leave-one-SNP-out MR analysis to assess whether the genetic association between these traits is potentially driven by specific variants, such as those in BMI-associated genes (e.g. the FTO gene).
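A leave-one-SNP-out analysis simply refits the IVW model once per instrument, each time dropping one SNP. The Python sketch below shows the idea under the same conventional weighting assumption (inverse variance of the SNP-outcome association). The SNP identifiers, function names and numbers are hypothetical and are not taken from the study.

```python
def ivw_slope(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW slope through the origin (weights = 1 / se_outcome**2)."""
    weights = [1.0 / se ** 2 for se in se_outcome]
    num = sum(w * bx * by for w, bx, by in zip(weights, beta_exposure, beta_outcome))
    den = sum(w * bx ** 2 for w, bx in zip(weights, beta_exposure))
    return num / den


def leave_one_out(snps, beta_exposure, beta_outcome, se_outcome):
    """Return {snp: IVW slope estimated with that SNP excluded}."""
    results = {}
    for i, snp in enumerate(snps):
        keep = [j for j in range(len(snps)) if j != i]
        results[snp] = ivw_slope(
            [beta_exposure[j] for j in keep],
            [beta_outcome[j] for j in keep],
            [se_outcome[j] for j in keep],
        )
    return results


# Toy, made-up data: the last SNP has a much larger outcome effect
# relative to its exposure effect than the other three.
snps = ["rsA", "rsB", "rsC", "rsD"]          # hypothetical identifiers
beta_gerd = [0.020, 0.030, 0.025, 0.050]
beta_covid = [0.004, 0.006, 0.005, 0.030]
se_covid = [0.002, 0.002, 0.002, 0.002]

full = ivw_slope(beta_gerd, beta_covid, se_covid)
loo = leave_one_out(snps, beta_gerd, beta_covid, se_covid)
for snp, slope in loo.items():
    print(f"without {snp}: slope = {slope:.3f} (all SNPs: {full:.3f})")
```

If the estimate shifts markedly when one SNP is removed, as it does for the outlying SNP in this toy example, the overall association is likely to depend heavily on that variant.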
Authorship Contribution
J.S.O. has full access to all of the data in the study and takes responsibility for the integrity of the data and the accuracy of the data analysis. Study concept and design: J.S.O. and S.M. Statistical analysis and interpretation of data: J.S.O., S.M., P.G., D.W., B.J.K., T.L.V. Acquisition of data and obtained funding: J.S.O., S.M., P.G., D.W. Data preparation: J.S.O., P.G. Drafting of the manuscript: J.S.O. and S.M. Critical review of the manuscript for important intellectual content: J.S.O., S.M., P.G., D.W., B.J.K., T.L.V. Study supervision: S.M. All authors read and approved the final version of the manuscript for submission.
Role of Sponsors
The funding bodies for our study had no role in the design or conduct of the study; collection, management, analysis and/or interpretation of the data; preparation, review or approval of the manuscript or the decision to submit the manuscript for publication.
Data Availability
The summary statistics for the GERD GWAS meta-analysis can be downloaded from here (https://doi.org/10.6084/m9.figshare.8986589). GWAS summary statistics for the COVID-19 outcomes provided by the COVID-19 HGI (Release 5) can be accessed here (https://www.covid19hg.org/results/r5/). The individual GWAS summary statistics for traits used in the mediation analyses are publicly available with data sources provided in Supplementary Notes.
Acknowledgement
We thank all the genetics consortia and the COVID-19 Host Genetics Initiative for making COVID-19 related summary statistics publicly accessible for our analysis. We thank research staff and participants from the Australian QSKIN study (https://www.qimrberghofer.edu.au/study/qskin/) and the UK Biobank (https://www.ukbiobank.ac.uk/) for contributing to the GERD GWAS summary data. We acknowledge the use of UK Biobank data under application number 25331. We finally acknowledge the contribution of the participating studies in the COVID-19 Host Genetics Initiative listed at https://www.covid19hg.org/acknowledgements/. This study primarily uses GWAS summary data and does not involve the use of human test subjects. This research was approved by the QIMR Berghofer’s Human Research Ethics Committee under project ID 3501.
Conflict of Interest. We declare no conflicts of interest in relation to this work.
Funding
QSkin was funded by grants from the National Health and Medical Research Council of Australia (NHMRC) APP1073898, APP1063061. S.M. acknowledges support from an NHMRC fellowship. S.M. acknowledges funding from the NHMRC grant 1123248.