Hugues Aschard, Martin D Tobin, Dana B Hancock, David Skurnik, Akshay Sood, Alan James, Albert Vernon Smith, Ani W Manichaikul, Archie Campbell, Bram P Prins, Caroline Hayward, Daan W Loth, David J Porteous, David P Strachan, Eleftheria Zeggini, George T O’Connor, Guy G Brusselle, H Marike Boezen, Holger Schulz, Ian J Deary, Ian P Hall, Igor Rudan, Jaakko Kaprio, James F Wilson, Jemma B Wilk, Jennifer E Huffman, Jing Hua Zhao, Kim de Jong, Leo-Pekka Lyytikäinen, Louise V Wain, Marjo-Riitta Jarvelin, Mika Kähönen, Myriam Fornage, Ozren Polasek, Patricia A Cassano, R Graham Barr, Rajesh Rawal, Sarah E Harris, Sina A Gharib, Stefan Enroth, Susan R Heckbert, Terho Lehtimäki, Ulf Gyllensten, Understanding Society Scientific Group, Victoria E Jackson, Vilmundur Gudnason, Wenbo Tang, Josée Dupuis, María Soler Artigas, Amit D Joshi, Stephanie J London, Peter Kraft, Evidence for large-scale gene-by-smoking interaction effects on pulmonary function, International Journal of Epidemiology, Volume 46, Issue 3, June 2017, Pages 894–904, https://doi.org/10.1093/ije/dyw318
Abstract
Background: Smoking is the strongest environmental risk factor for reduced pulmonary function. The genetic component of various pulmonary traits has also been demonstrated, and at least 26 loci have been reproducibly associated with either FEV1 (forced expiratory volume in 1 second) or FEV1/FVC (FEV1/forced vital capacity). Although the main effects of smoking and genetic loci are well established, the question of potential gene-by-smoking interaction effect remains unanswered. The aim of the present study was to assess, using a genetic risk score approach, whether the effect of these 26 loci on pulmonary function is influenced by smoking.
Methods: We evaluated the interaction between smoking exposure, considered as either ever vs never or pack-years, and a genetic risk score built from 26 single nucleotide polymorphisms (SNPs) in relation to FEV1 or FEV1/FVC in 50 047 participants of European ancestry from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) and SpiroMeta consortia.
Results: We identified an interaction (βint = –0.036, 95% confidence interval, –0.040 to –0.032, P = 0.00057) between an unweighted 26 SNP genetic risk score and smoking status (ever/never) on the FEV1/FVC ratio. In interpreting this interaction, we showed that the genetic risk of falling below the FEV1/FVC threshold used to diagnose chronic obstructive pulmonary disease is higher among ever smokers than among never smokers. A replication analysis in two independent datasets, although not statistically significant, showed a similar trend in the interaction effect.
Conclusions: This study highlights the benefit of using genetic risk scores for identifying interactions missed when studying individual SNPs and shows, for the first time, that persons with the highest genetic risk for low FEV1/FVC may be more susceptible to the deleterious effects of smoking.
Spirometric measures of pulmonary function are influenced by both smoking and genetics. This paper reports a genetic risk score-by-ever smoking interaction on FEV1/FVC (forced expiratory volume in 1 second/forced vital capacity).
In individuals of European ancestry, the reduction in FEV1/FVC as a result of smoking was greater among individuals who are genetically predisposed to lower FEV1/FVC ratio.
Genetic risk score-by-ever smoking interaction can allow the identification of subgroups in the population whose genetic background makes them more susceptible to the deleterious effects of smoking.
Introduction
Spirometric measures of pulmonary function, such as the forced expiratory volume in 1 second (FEV1) or its ratio with the forced vital capacity (FEV1/FVC), form the basis of the diagnosis of chronic obstructive pulmonary disease (COPD).1–3 Pulmonary function measures are also used clinically to monitor severity and control of asthma and other respiratory diseases and are independent risk factors for mortality.1–3 Pulmonary function is strongly influenced by cigarette smoking and by multiple low-penetrance genetic variants. Indeed, genome-wide association studies (GWAS) of marginal genetic effects (i.e. not including interaction effects between genetic variants and smoking) have identified at least 26 loci associated with FEV1 or FEV1/FVC in the general population.4 However, the interplay between genetic factors and environmental exposures has not been well established for pulmonary function or its associated traits. More broadly, although considerable efforts have been made to identify interaction effects between genetic variants and environmental exposures across the wide range of human traits and diseases,5,6 such investigations have been mostly unsuccessful in detecting robust gene–environment interactions.5,7 The well-established effect of cigarette smoking on numerous human health outcomes8 makes it a serious candidate for identification of novel gene–environment interactions, especially for pulmonary traits.
Hypothesizing the presence of single nucleotide polymorphism (SNP)-by-smoking interaction, Hancock et al.9 performed a genome-wide interaction study of pulmonary function, modelling single SNP main effects and their interactions with smoking in 50 047 participants of European ancestry across 19 studies within the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE)10 and SpiroMeta consortia11—the largest genome-wide interaction study of pulmonary function as modified by smoking to date. However, rather than focusing on the interaction effects per se, they performed a meta-analysis of the joint test of SNP main effects and SNP-by-smoking interaction effects to improve power for identifying genetic variants associated with pulmonary function.12,13 Although they reported new candidate variants based on this joint test, the study did not identify any SNPs with genome-wide significant interaction with smoking.
Here, we explored gene-by-smoking interaction effects limited to genetic variants previously found to be associated with pulmonary function in standard marginal effects GWAS,4 therefore not including the new variants reported by Hancock et al.9 based on the joint test of main effects plus interaction. Specifically, we aimed to determine whether smoking modifies the effect of established genetic variants when considered singly or in combination using a genetic risk score summarizing the genetic predisposition to abnormal pulmonary function. The primary motivation for using a genetic risk score is statistical power.14,15 Indeed, several genetic risk score-by-exposure interactions have already been identified in cases where single SNPs did not show evidence for statistically significant interactions.16–21 Genetic risk score-by-exposure interaction testing expands on the principle of the omnibus test while leveraging the assumption that, for a given choice of coded alleles, most interaction effects will have the same direction. This is similar to burden tests that have been widely used for rare variant analysis,22 where a single parameter can accumulate evidence for association without increasing the number of degrees of freedom. When interaction effects are null on average (i.e. if interaction effects are both negative and positive so that the sum of interaction coefficients tends to zero), the single SNP approach will generally outperform the risk score-based approach. Conversely, if interaction effects tend to be in the same direction, the risk score-based approach can have dramatically higher power.14
Methods
Study sample
The present analysis relies on the Hancock et al.9 genome-wide meta-analysis for main genetic effects plus interaction effects with smoking in relation to pulmonary function among 50 047 participants (56% women) of European ancestry from 19 studies. The mean age was 53 years at the time of pulmonary function testing. Approximately 15% were current smokers and 56% were ever smokers. Among ever smokers, the average pack-years of smoking was 21. Supplementary Table 3 (available as Supplementary data at IJE online) provides the main characteristics of the studies included; complete details of study-specific pulmonary function testing protocols have been published.4 For studies with spirometry at a single visit, we analysed FEV1/FVC and FEV1 measured at that visit. For studies with spirometry at more than one visit, measurements from the baseline visit or the most recent examination with spirometry data were used. Smoking history (current, former and never smoking) was ascertained by questionnaire at the time of pulmonary function testing. Pack-years of smoking were calculated for current and past smokers by multiplying smoking amount (packs per day) and duration (years smoked). Approximately 2.5 million autosomal SNPs were tested for interaction with smoking status (ever smoking vs never smoking) and pack-years, for two outcomes: FEV1 and FEV1/FVC (see next section). We also used two independent datasets of individuals of European ancestry to test for replication. The first replication dataset included 8859 unrelated individuals, and the second dataset included 9457 family-based individuals. The look-up was performed in the GWAS of marginal genetic effects conducted separately in ever and never smokers as part of a recent meta-analysis of FEV1 and FEV1/FVC.23
Single SNP-by-smoking interaction
A detailed description of the studies used in the replication analysis can be found in Soler Artigas et al.23 In brief, FEV1 and FEV1/FVC were each regressed on age, age², sex, height and principal components for population structure, separately for ever smokers and never smokers. The residuals were normalized using a rank-based inverse normal transformation, again separately in ever smokers and never smokers. These transformed residuals were then used as the phenotype for association testing under an additive genetic model in each exposure stratum. Inference of the interaction effects from the exposure-stratified analyses is described in the Supplementary Note (available as Supplementary data at IJE online).
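The rank-based inverse normal transformation applied to the residuals can be sketched as follows. This is a minimal illustration only: the Blom offset (c = 3/8), the handling of ties by average rank and the example residuals are assumptions, since the text does not state which variant of the transformation was used.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=3.0 / 8.0):
    """Rank-based inverse normal transform.

    The offset c = 3/8 (Blom) is an assumed default; tied values
    receive the average of the ranks they span.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        j = i
        # Extend j over a run of tied values.
        while j + 1 < n and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg_rank = (i + j) / 2.0 + 1.0  # 1-based average rank of the run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


# Hypothetical regression residuals for one stratum (e.g. ever smokers).
residuals_ever = [0.12, -0.40, 0.03, 0.95, -0.07]
transformed_ever = rank_inverse_normal(residuals_ever)
```

In a stratified analysis of this kind, the function would be called once per stratum, so that ever smokers and never smokers are each mapped onto their own standard normal scale.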
Multivariate interaction analysis overview
First, we considered an unweighted genetic risk score-by-smoking interaction where the risk score simply sums the number of risk alleles (i.e. alleles associated with a lower pulmonary function). This unweighted genetic risk score is most powerful when the interaction effects have the same direction as marginal SNP effects (i.e. the harmful effects of smoking are magnified in individuals with a genetic predisposition to reduced pulmonary function). Second, we used a weighted genetic risk score where SNPs were weighted by the absolute value of their marginal effect estimates obtained from stage 1 screening of FEV1 and FEV1/FVC from Soler Artigas et al.4 (Supplementary Table 1, available as Supplementary data at IJE online). This weighting scheme is most powerful when the magnitude of interaction effects is proportional to the SNP marginal effects. Finally, for our third multivariate analysis, we derived a standard omnibus test of all interaction effects. This test will retain power in the presence of effects in both directions or of different magnitudes. Although there is strong correlation among the 12 tests performed (these three models, considering interaction with two smoking metrics, ever/never smoking or pack-years, for the two pulmonary function metrics FEV1 and FEV1/FVC), we used a stringent Bonferroni P-value correction threshold of 4 × 10–3 to account for multiple testing.
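The three multivariate tests can be illustrated from per-SNP summary statistics. The sketch below assumes independent SNPs coded on the risk allele and combines the single SNP interaction estimates by inverse-variance weighting; this is one standard way to obtain a risk score interaction from summary data and is not confirmed here as the exact estimator used. The function names and the example numbers are hypothetical.

```python
import math


def grs_interaction(betas, ses, weights=None):
    """Inverse-variance weighted GRS-by-exposure interaction estimate.

    betas, ses: per-SNP interaction estimates and standard errors.
    weights: per-SNP GRS weights (all 1.0 gives the unweighted score).
    Assumes independent SNPs; returns (estimate, se, two-sided P).
    """
    if weights is None:
        weights = [1.0] * len(betas)
    num = sum(w * b / s ** 2 for w, b, s in zip(weights, betas, ses))
    den = sum(w ** 2 / s ** 2 for w, s in zip(weights, ses))
    est = num / den
    se = math.sqrt(1.0 / den)
    p_value = math.erfc(abs(est / se) / math.sqrt(2.0))
    return est, se, p_value


def omnibus_chi2(betas, ses):
    """Sum of squared z-scores; chi-square with len(betas) df under the null."""
    return sum((b / s) ** 2 for b, s in zip(betas, ses)), len(betas)


def chi2_sf_even_df(x, df):
    """Exact upper-tail chi-square probability, valid for even df only."""
    if df % 2 != 0:
        raise ValueError("closed form implemented for even df only")
    half = x / 2.0
    term, total = 1.0, 1.0
    for k in range(1, df // 2):
        term *= half / k
        total += term
    return math.exp(-half) * total


# Hypothetical interaction estimates for four SNPs.
betas = [-0.010, -0.020, 0.005, -0.015]
ses = [0.010, 0.010, 0.010, 0.010]

u_est, u_se, u_p = grs_interaction(betas, ses)                        # unweighted
w_est, w_se, w_p = grs_interaction(betas, ses, [0.02, 0.05, 0.01, 0.03])  # weighted
stat, df = omnibus_chi2(betas, ses)
omni_p = chi2_sf_even_df(stat, df)
```

The single-parameter score tests gain power when the per-SNP estimates share a sign, because the numerator accumulates them; the omnibus statistic squares each z-score and so is indifferent to direction.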
Relative risk in ever smokers vs never smokers
GRS interaction effects can further be translated in terms of risk prediction. For pulmonary function, low FEV1 or FEV1/FVC increases the risk of death24 and together they form the basis for the diagnosis of COPD.1–3 COPD stage 2 or higher is defined by the Global Initiative for Chronic Obstructive Lung Disease (GOLD) as FEV1/FVC < 0.70 and FEV1 < 80% of the predicted value. According to recent studies,2,25 between 5% and 20% of European ancestry adults are expected to have FEV1/FVC < 0.70, depending on smoking characteristics and age distribution. Several studies argue for a more stringent threshold to define COPD25,26 based on the lower limit of normal predicted value, rather than a fixed absolute value, to prevent disease misclassification.
Results
We selected 26 loci previously found to be associated with FEV1 or FEV1/FVC at genome-wide significance (P < 5 × 10–8) in marginal association tests4,11,27 (i.e. not including interaction effects with smoking exposures) and replicated in the GWAS by Soler Artigas et al.,4 the largest meta-analysis of marginal genetic effects conducted for these two traits in the general population. Additional loci for these two phenotypes have been identified in two recent studies.28,29 However, these new loci were not included in our analysis because both these studies used a large cohort ascertained through smoking status. For each of the 26 selected loci, we chose the SNP with the strongest evidence for association (i.e. smallest P-value) with each of these phenotypes. The final list included 26 SNPs per phenotype, with only two SNPs being different between FEV1 and FEV1/FVC as previously reported4 (Supplementary Table 1, available as Supplementary data at IJE online). Estimated interaction effects of these SNPs were extracted from the meta-analysis summary statistics for the four tests performed in the Hancock et al.9 analysis: SNP-by-smoking status (ever smoking vs never smoking) interaction effect on FEV1 and FEV1/FVC; and SNP-by-smoking pack-years interaction effect on FEV1/FVC and FEV1. As shown in Supplementary Table 2 (available as Supplementary data at IJE online), nine SNPs showed nominal significance (P < 0.05) out of the 104 tests performed; however, none remained significant after accounting for multiple testing (Bonferroni corrected P-value threshold of 5 × 10–4). The minimum P-value was observed for the interaction between rs993925, near the TGFβ2 gene, and smoking status on FEV1 [βint = –0.036, 95% confidence interval (CI), –0.009 to –0.032, P = 0.007].
Next, using these data, we conducted three multivariate (as opposed to single SNP) interaction analyses, testing jointly for the interaction effects between those SNPs and either smoking status or pack-years on the two phenotypes (FEV1 and FEV1/FVC) for a total of 12 tests. As shown in Table 1, none of the multivariate interaction tests with pack-years was significant. However, four of the six multivariate interaction tests with smoking status (ever vs never) showed nominal significance, and two tests for FEV1/FVC had a P-value below the Bonferroni significance level (12 tests, P < 4 × 10–3). The strongest signal was observed for the unweighted genetic risk score-by-smoking status interaction effect on FEV1/FVC (βint = –0.036, 95% CI –0.040 to –0.032, P = 0.00057). The Cochran’s Q test for heterogeneity of the interaction effect across studies was not significant (P = 0.97) and the forest plot of study-specific results did not display any obvious outlier (Supplementary Figure 1, available as Supplementary data at IJE online).
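The two Bonferroni thresholds quoted in the text follow from simple arithmetic. The sketch below assumes a family-wise significance level of 0.05, which is consistent with the reported thresholds; the function name is illustrative.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# 26 SNPs x 4 single SNP interaction tests = 104 tests.
single_snp_threshold = bonferroni_threshold(0.05, 26 * 4)

# 3 multivariate models x 2 smoking metrics x 2 outcomes = 12 tests.
multivariate_threshold = bonferroni_threshold(0.05, 3 * 2 * 2)

# P-values of the three FEV1/FVC-by-smoking status tests (uGRS, wGRS, CHISQ).
fev1_fvc_status_p = [0.00057, 0.0022, 0.026]
passing = [p for p in fev1_fvc_status_p if p < multivariate_threshold]
```

Rounded to one significant figure, the two thresholds are 5 × 10–4 and 4 × 10–3, and only the two risk score tests for FEV1/FVC fall below the latter.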
Table 1. Multivariate interaction tests of the 26 loci associated with pulmonary function
Outcome | Exposure | Test | ^βint | CI | P-value |
---|---|---|---|---|---|
FEV1 | Smoking status (a) | uGRS | –0.0055 | (–0.011, 2.7 × 10–5) | 0.051 |
FEV1 | Smoking status | wGRS | –0.21 | (–0.40, –0.033) | 0.020 |
FEV1 | Smoking status | CHISQ | – | – | 0.49 |
FEV1 | Pack-years | uGRS | –1.6 × 10–5 | (–4.6 × 10–5, 1.4 × 10–5) | 0.30 |
FEV1 | Pack-years | wGRS | –6.5 × 10–4 | (–1.6 × 10–3, 3.3 × 10–4) | 0.19 |
FEV1 | Pack-years | CHISQ | – | – | 0.46 |
FEV1/FVC | Smoking status | uGRS | –0.0099 | (–0.016, –0.0043) | 0.00057 (b) |
FEV1/FVC | Smoking status | wGRS | –0.21 | (–0.33, –0.073) | 0.0022 (b) |
FEV1/FVC | Smoking status | CHISQ | – | – | 0.026 |
FEV1/FVC | Pack-years | uGRS | –4.4 × 10–6 | (–3.6 × 10–5, 2.7 × 10–5) | 0.78 |
FEV1/FVC | Pack-years | wGRS | –6.5 × 10–5 | (–8.0 × 10–4, 6.6 × 10–4) | 0.85 |
FEV1/FVC | Pack-years | CHISQ | – | – | 0.53 |
uGRS is the genetic risk score using equal weights to all SNPs; wGRS is the genetic risk score weighted by effect estimates from the marginal screening; CHISQ is the omnibus test of all interaction effects; ^βint is the estimated interaction effect between the GRS and the outcome; and CI is the confidence interval of that estimate. Nominally significant tests are indicated in bold. aSmoking status is defined as never smokers vs ever smokers. bSignificant P-value after Bonferroni correction.
The contrast between this significant risk score interaction and the absence of strong single SNP interaction effects can be explained by looking at the distribution of the single SNP interaction effect estimates. Figure 1 shows this distribution for the alleles associated with decreased FEV1/FVC. It highlights that, although the 95% CI of most single SNP interaction effects encompass the null (and therefore the absence of significant single SNP interaction effect), there is an enrichment for negative interaction effects. Indeed, even a binomial test can be used to confirm the unbalanced direction of interaction effects (18 of 26 interactions are negative leading to a P-value of 0.014 for a binomial test with an expected equiprobable distribution of 0.5). The genetic risk score-based interaction test exploits such enrichment by testing for the average interaction effect across all SNPs.14 As with any multivariate approach based on a composite null hypothesis, this result indicates that at least a subset of these 26 SNPs interact with smoking status, but does not allow us to determine which or how many SNPs are driving the genetic risk score-by-smoking interaction. The three other sets of single SNP interaction tests showed a similar (but not significant after correction for multiple testing) trend with enrichment for negative interactions (Supplementary Figures 2–4, available as Supplementary data at IJE online). We summarized the contribution of the unweighted genetic risk score-by-smoking interaction on FEV1/FVC in Table 2 and Figure 2A. This indicates that the deleterious effect of smoking is enhanced among carriers of the risk alleles or equivalently that the deleterious effect of smoking is reduced among subjects carrying the protective alleles.
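A sign test of this kind can be sketched with an exact binomial tail probability. The function below is a generic one-sided version; whether it reproduces the P-value reported above depends on details not given in the text (one- or two-sided testing and exactly how the directions were tallied), so the call on 18 of 26 is illustrative only.

```python
from math import comb


def binomial_upper_tail(k, n, p=0.5):
    """Exact P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p ** i * (1.0 - p) ** (n - i) for i in range(k, n + 1))


# One-sided probability of seeing at least 18 negative estimates out of 26
# if negative and positive directions were equally likely.
p_one_sided = binomial_upper_tail(18, 26)

# A two-sided version doubles the tail (capped at 1) when p = 0.5.
p_two_sided = min(1.0, 2.0 * p_one_sided)
```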
Table 2. Summary of effect estimates for genetic risk score-by-smoking status interaction on FEV1/FVC
Predictors | Beta | SD | P-value |
---|---|---|---|
From the marginal exposure model | | | |
Pack-years | –0.0030 | 0.00017 | 1.2 × 10–71 |
Current smoking | –0.040 | 0.0047 | 7.7 × 10–18 |
Smoking status (a) | –0.0023 | 0.0046 | 0.61 |
From the interaction model | | | |
GRS | –0.0363 | 0.0021 | 3.9 × 10–64 |
GRS × Smoking status (a) | –0.0099 | 0.0029 | 5.7 × 10–4 |
GRS is the unweighted genetic risk score; beta is the effect estimate of each predictor; and SD is the standard deviation of each beta. aSmoking status was defined as never smokers vs ever smokers.

Figure 1. Distribution of interaction effects on FEV1/FVC. Single SNP risk allele-by-smoking status (ever/never) interaction effect estimates (βint) and 95% confidence intervals are plotted by increasing values. The unweighted genetic risk score-by-smoking status interaction is plotted at the bottom.

Figure 2. Overview of the unweighted genetic risk score-by-smoking interaction effect on FEV1/FVC. Upper panel (A) presents the distribution of the unweighted genetic risk score (GRS, grey density plot) and the relationship between the unweighted GRS and standardized FEV1/FVC in ever smokers (dashed line) and never smokers (solid line). Lower panel (B) shows the excess relative risk (RR) of having FEV1/FVC in the lowest 1%, 5% and 20% of the population for ever smokers compared with never smokers, as stratified by GRS quintiles.
We used two independent datasets, one of 8859 unrelated individuals and another of 9457 related individuals, to test for independent replication of our results (Supplementary Note, available as Supplementary data at IJE online). Although the interaction effects were not significant, both replication samples showed a consistent negative GRS-by-ever smoking interaction effect on FEV1/FVC (βint = –0.0025, 95% CI –0.0165, 0.0115, P = 0.72 and βint = –0.0030, 95% CI –0.0214, 0.0154, P = 0.74, and overall interaction effect in the combined replication datasets βint = –0.0027, 95% CI –0.0136, 0.0082, P = 0.63) and a Cochran’s Q test for heterogeneity showed no significant difference in the three effect estimates (P = 0.51).
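The combined replication estimate and the heterogeneity test can be approximated with a fixed-effect inverse-variance meta-analysis. This is a sketch under assumptions: standard errors are back-calculated from the reported 95% confidence intervals, the discovery standard error is taken from Table 2 (0.0029), and the family-based dataset is treated like an independent sample. The published figures may therefore differ slightly in the last digit.

```python
import math

Z_975 = 1.959964  # standard normal quantile for a 95% confidence interval


def se_from_ci(lower, upper):
    """Back-calculate a standard error from a symmetric 95% CI."""
    return (upper - lower) / (2.0 * Z_975)


def fixed_effect_meta(estimates, ses):
    """Inverse-variance pooled estimate, its SE, Cochran's Q and its df."""
    weights = [1.0 / s ** 2 for s in ses]
    total_w = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, estimates)) / total_w
    pooled_se = math.sqrt(1.0 / total_w)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, estimates))
    return pooled, pooled_se, q, len(estimates) - 1


# The two replication estimates with their reported confidence intervals.
rep_betas = [-0.0025, -0.0030]
rep_ses = [se_from_ci(-0.0165, 0.0115), se_from_ci(-0.0214, 0.0154)]
pooled_rep, pooled_rep_se, _, _ = fixed_effect_meta(rep_betas, rep_ses)

# Heterogeneity across discovery plus both replication estimates.
all_betas = [-0.0099] + rep_betas
all_ses = [0.0029] + rep_ses
_, _, q_stat, q_df = fixed_effect_meta(all_betas, all_ses)
p_het = math.exp(-q_stat / 2.0)  # exact chi-square tail for 2 degrees of freedom
```

The shortcut `exp(-Q/2)` is valid only because three estimates give two degrees of freedom; other counts need a general chi-square tail function.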
To quantify the impact of this result from a public health perspective, we estimated the impact of the genetic risk score-by-smoking interaction on having FEV1/FVC in the lowest 1%, 5% and 20% of the population distribution. Specifically, we derived the RR of having FEV1/FVC below these cut-off points (1%, 5% and 20%) in ever smokers compared with never smokers. Figure 2B quantifies the excess RR (i.e. the RR minus one) of individuals across five GRS quintiles. It highlights the higher risk associated with smoking among individuals carrying risk alleles (i.e. alleles associated with poorer pulmonary function) as compared with individuals carrying protective alleles (i.e. alleles associated with better pulmonary function). For example, among individuals with a GRS above the 80th percentile, ever smokers have on average a 26% excess RR of having FEV1/FVC in the lowest 1% of the population distribution, whereas ever smokers with a GRS below the 20th percentile have on average an 18% excess RR of falling in that same FEV1/FVC category compared with never smokers. Applying the same approach for FEV1, we observed a similar pattern (Supplementary Figure 5, available as Supplementary data at IJE online). However, as expected, the lower magnitude of the genetic risk score-by-ever smoking interaction on FEV1 implied a smaller difference in RR between ever smokers and never smokers.
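The link between a negative interaction coefficient and a higher excess RR at high GRS can be illustrated with a simple normal model. This is not the derivation used for Figure 2B: it assumes a standardized outcome with normally distributed residuals, a linear model with a GRS-by-smoking product term, and entirely hypothetical parameter values.

```python
from statistics import NormalDist


def excess_relative_risk(grs, threshold, b_smoke, b_grs, b_int, sigma=1.0):
    """Excess RR (RR - 1) of falling below `threshold`, ever vs never smokers.

    Assumed model: outcome = b_smoke*E + b_grs*G + b_int*G*E + noise,
    with E = 1 for ever smokers and normally distributed noise (sd sigma).
    """
    mean_never = b_grs * grs
    mean_ever = b_smoke + (b_grs + b_int) * grs
    p_never = NormalDist(mean_never, sigma).cdf(threshold)
    p_ever = NormalDist(mean_ever, sigma).cdf(threshold)
    return p_ever / p_never - 1.0


# Hypothetical setting: GRS centred at zero, cut-off at the 1st percentile
# of a standard normal outcome, and illustrative effect sizes.
cutoff = NormalDist().inv_cdf(0.01)
params = dict(threshold=cutoff, b_smoke=-0.10, b_grs=-0.036, b_int=-0.010)

excess_low_grs = excess_relative_risk(grs=-5.0, **params)   # few risk alleles
excess_high_grs = excess_relative_risk(grs=5.0, **params)   # many risk alleles
```

Under these assumed values the excess RR is larger for the high-GRS individual, which is the qualitative pattern described for the upper and lower GRS quintiles.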
Discussion
Using the largest dataset to date of European ancestry participants from the general population with pulmonary function (FEV1/FVC and FEV1), smoking and genetic data, we identified a gene-by-smoking interaction effect on FEV1/FVC by using a GRS composed of 26 SNPs identified and replicated in a prior GWAS meta-analysis of marginal genetic effects. To our knowledge, our study is the first to report a synergistic action of genes and smoking on pulmonary function (i.e. the reduction in FEV1/FVC as a result of smoking is greater among individuals who are genetically predisposed to lower FEV1/FVC ratio). Our study also highlights the importance of developing and applying alternative strategies to evaluate interaction effects for lung phenotypes along with other complex traits and diseases. The genetic risk score-based approach enabled us to identify an interaction when the standard univariate test (i.e. evaluating each single genetic variant for interaction independently) failed to identify any interactions.
Replication studies showed interaction effect estimates in the same direction as the discovery study but were not significant, and the magnitude of the interaction effects was substantially smaller. We acknowledge that, despite careful evaluation of the interaction effects in the discovery sample, the observed signal might be overestimated or confounded by unmeasured complex factors. However, we can a priori rule out a systematic bias of the single SNP interaction effects in the discovery study, because the genomic inflation factor λ, defined as the ratio of the median of the empirically observed distribution of the test statistic to the expected median,30 was not substantially different from 1 (λ = 1.044 for FEV1/FVC and smoking status). Instead, differences in significance and effect estimates might be partly explained by the limited sample size in the replication study and differences in the analytical design. Indeed, the discovery analysis was performed using a saturated model including three smoking exposures and explicitly modelled the interaction effect. In comparison, the replication analysis was not adjusted for current smoking status and pack-years, and the interaction effect was approximated from analyses stratified by smoking status, which has some limitations (see Supplementary Note and Supplementary Figure 6, available as Supplementary data at IJE online). Previous work has shown that combined analyses are more powerful when effects exist in both strata,31 as observed in the discovery study. Further, even with N = 18 316 individuals in the combined replication population, we are underpowered. This sample size provides less than 50% power, at nominal significance of 5%, to detect interaction effects with the GRS.
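The genomic inflation factor defined above can be computed as follows. The sketch assumes 1-degree-of-freedom chi-square test statistics, for which the expected median under the null is about 0.455; the input values are hypothetical.

```python
from statistics import NormalDist, median


def genomic_inflation(chi2_stats):
    """Ratio of the observed median test statistic to its expected null median.

    Assumes 1-df chi-square statistics, whose null median equals the
    squared 75th percentile of the standard normal distribution.
    """
    expected_median = NormalDist().inv_cdf(0.75) ** 2
    return median(chi2_stats) / expected_median


# Hypothetical z-scores from single SNP interaction tests, squared to chi-square.
z_scores = [0.3, -1.1, 0.7, -0.2, 1.6, -0.68, 0.05]
lambda_gc = genomic_inflation([z ** 2 for z in z_scores])
```

A value close to 1 suggests that the bulk of the test statistics behaves as expected under the null, whereas values well above 1 point to systematic inflation.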
Genetic risk score-by-exposure interaction can have higher clinical value than the identification of single SNP-by-exposure interaction by capturing a wealth of information in a single measure to identify subgroups in the population whose genetic background makes them more susceptible to the deleterious effects of smoking.19,32,33 Indeed, if single SNP-by-smoking interactions are distributed independently of the marginal genetic effect (i.e. interaction effects are equally likely to be positive or negative given that the coded alleles are the risk alleles), the genetic effect is expected to be similar between ever and never smokers. The enrichment for negative interactions we identified through our GRS approach reveals a stronger genetic component among the ever smoker subgroup in the population and can allow more efficient implementation of prevention strategies. For example, in the public health setting, programmes targeting smoking cessation campaigns to individuals who are genetically predisposed to low pulmonary function may have a stronger impact in preventing COPD.
Our results may also elucidate biological mechanisms underlying the interplay between genes and smoking in pulmonary function. In particular, the higher statistical power for the genetic risk score-based interaction test points towards the potential presence of an unmeasured intermediate biomarker mediating the effect of the 26 loci on FEV1/FVC. As shown in Figure 3, the most parsimonious model (i.e. the least complex, following Occam’s razor) that would explain multiple interactions going in the same direction (Figure 1) implies that the genetic variants together influence an intermediate biomarker, which itself interacts with smoking. Future studies with extended genomic data, including transcriptomic, proteomic or metabolomic data, might be able to further assess such a hypothesis by (i) evaluating the effect of the GRS on those biomarkers and (ii) testing for interactions between smoking and the candidate biomarkers identified at step (i).

Figure 3. Underlying causal model. Potential causal diagrams underlying the gene and smoking interaction effects on FEV1/FVC. Panel (A) presents a scenario where each genetic variant influences the outcome through a SNP-specific pathway, and interactions with the environmental exposure take place along these pathways. Panel (B) presents an alternative (and simpler) model where multiple genetic variants influence an unmeasured intermediate biomarker U, whose effect on FEV1/FVC depends on smoking. In scenario (A), the single SNP-by-smoking interaction test is the optimal approach, whereas, in scenario (B), the single SNP-by-smoking interaction test can become inefficient, and interaction would be easier to detect using a genetic risk score-by-smoking interaction test, because it summarizes all interaction effects in a single test.
This study has some limitations. The 26 selected variants together explain a relatively small proportion of the additive genetic variance in FEV1/FVC and in FEV1.4 However, GWAS with increasing sample sizes will likely continue to provide additional associated genetic variants to further assess the role of SNP-by-smoking interaction effects on pulmonary phenotypes and may increase the gap between smokers and never smokers to allow for a significant impact in the clinic or at the population level. Moreover, we focused on genetic variants previously found to be associated at genome-wide significance level, but future studies might consider less stringent criteria to select genetic variants, including those with only suggestive evidence, or alternatively candidate variants with functional annotation relevant to the outcomes and exposures in question. Obviously, the signal-to-noise ratio might decrease when relaxing the constraint on the SNP selection. However, as we recently showed, additional gain in statistical power might be achieved even if a substantial proportion of the variants do not interact with the exposure.14 Finally, investigation of interaction effects with other environmental exposures such as second-hand smoke, air pollution, asbestos or occupational risks may lead to a more comprehensive understanding of the biological and epidemiological significance of these variants.
In summary, the identification of interaction effects between genetic variants and environmental exposures in human traits is recognized as extremely challenging, and this quest has been mostly unsuccessful so far. In this study, we discovered novel gene-by-smoking interactions using risk scores that were not observed at the level of individual genetic variants. This risk score analysis suggests that persons with a greater genetic predisposition to low pulmonary function are more susceptible to the deleterious effects of smoking. By extension, the use of a GRS may help predict which smokers will fall below thresholds that establish the diagnosis of COPD.
Supplementary Data
Supplementary data are available at IJE online.
Acknowledgements
We thank the many colleagues who contributed to collection and phenotypic characterization of the clinical sampling and genotyping of the data. We especially thank those who kindly agreed to participate in the studies. H.A. was supported by R21HG007687. S.J.L. was supported by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences. The research undertaken by M.D.T. and L.V.W. was partly funded by the National Institute for Health Research (NIHR). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. A full list of studies, funding sources and acknowledgements is provided in the Supplementary data. H.A., P.K., S.J.L., M.T., D.B.H. and A.J. were involved in designing the study. M.D.T., D.B.H., A.S., A.J., A.V.S., A.W.M., D.W.L., D.P.S., G.O.C., R.G.B., G.G.B., I.P.H., J.K.P., J.F.W., J.W., J.H.Z., K.d.J., L.V.W., M.S.A., H.M.B., M-R.J., M.F., P.A.C., S.A.G., S.R.H., V.G., W.T., S.J.L., I.R., O.P., J.E.H., C.H., A.C., D.J.P., S.E.H., I.J.D., S.E., U.G., LP.L., T.L., E.Z., B.P.P. and V.E.J. were involved in participant recruitment, sample collection or genotyping. H.A. performed analyses from the discovery study. H.A., V.E.J. and M.S.A. performed the replication analysis. H.A. drafted the paper, with substantial editorial input from P.K., S.J.L., J.D., D.S., M.T., D.B.H. and A.J. All authors have reviewed and approved the final draft. This material has not been published previously in a substantively similar form.
Conflict of interest: J.K. consulted for Pfizer on nicotine dependence. J.B.W. was employed by Pfizer at the time this research was undertaken. W.T. is a full-time employee and receives salary from Boehringer Ingelheim Pharmaceuticals Inc. Other authors declare no competing financial interest.
Author notes
These authors contributed equally to this work.