Abstract

Background: Smoking is the strongest environmental risk factor for reduced pulmonary function. The genetic component of various pulmonary traits has also been demonstrated, and at least 26 loci have been reproducibly associated with either FEV1 (forced expiratory volume in 1 second) or FEV1/FVC (FEV1/forced vital capacity). Although the main effects of smoking and genetic loci are well established, the question of potential gene-by-smoking interaction effect remains unanswered. The aim of the present study was to assess, using a genetic risk score approach, whether the effect of these 26 loci on pulmonary function is influenced by smoking.

Methods: We evaluated the interaction between smoking exposure, considered as either ever vs never or pack-years, and a 26-single nucleotide polymorphism (SNP) genetic risk score in relation to FEV1 or FEV1/FVC in 50 047 participants of European ancestry from the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) and SpiroMeta consortia.

Results: We identified an interaction (βint = –0.0099, 95% confidence interval, –0.016 to –0.0043, P = 0.00057) between an unweighted 26 SNP genetic risk score and smoking status (ever/never) on the FEV1/FVC ratio. In interpreting this interaction, we showed that the genetic risk of falling below the FEV1/FVC threshold used to diagnose chronic obstructive pulmonary disease is higher among ever smokers than among never smokers. A replication analysis in two independent datasets, although not statistically significant, showed a similar trend in the interaction effect.

Conclusions: This study highlights the benefit of using genetic risk scores for identifying interactions missed when studying individual SNPs and shows, for the first time, that persons with the highest genetic risk for low FEV1/FVC may be more susceptible to the deleterious effects of smoking.

Key Messages

  • Spirometric measures of pulmonary function are influenced by both smoking and genetics. This paper reports a genetic risk score-by-ever smoking interaction on FEV1/FVC (forced expiratory volume in 1 second/forced vital capacity).

  • In individuals of European ancestry, the reduction in FEV1/FVC as a result of smoking was greater among individuals who are genetically predisposed to lower FEV1/FVC ratio.

  • Genetic risk score-by-ever smoking interaction can allow the identification of subgroups in the population whose genetic background makes them more susceptible to the deleterious effects of smoking.

Introduction

Spirometric measures of pulmonary function, such as the forced expiratory volume in 1 second (FEV1) or its ratio with the forced vital capacity (FEV1/FVC), form the basis of the diagnosis of chronic obstructive pulmonary disease (COPD).1–3 Pulmonary function measures are also used clinically to monitor severity and control of asthma and other respiratory diseases and are independent risk factors for mortality.1–3 Pulmonary function is strongly influenced by cigarette smoking and by multiple low-penetrance genetic variants. Indeed, genome-wide association studies (GWAS) of marginal genetic effects (i.e. not including interaction effects between genetic variants and smoking) have identified at least 26 loci associated with FEV1 or FEV1/FVC in the general population.4 However, the interplay between genetic factors and environmental exposures has not been well established for pulmonary function or its associated traits. More broadly, although considerable efforts have been made to identify interaction effects between genetic variants and environmental exposures across the wide range of human traits and diseases,5,6 such investigations have been mostly unsuccessful in detecting robust gene–environment interactions.5,7 The well-established effect of cigarette smoking on numerous human health outcomes8 makes it a serious candidate for identification of novel gene–environment interactions, especially for pulmonary traits.

Hypothesizing the presence of single nucleotide polymorphism (SNP)-by-smoking interaction, Hancock et al.9 performed a genome-wide interaction study of pulmonary function, modelling single SNP main effects and their interactions with smoking in 50 047 participants of European ancestry across 19 studies within the Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE)10 and SpiroMeta consortia11—the largest genome-wide interaction study of pulmonary function as modified by smoking to date. However, rather than focusing on the interaction effects per se, they performed a meta-analysis of the joint test of SNP main effects and SNP-by-smoking interaction effects to improve power for identifying genetic variants associated with pulmonary function.12,13 Although they reported new candidate variants based on this joint test, the study did not identify any SNPs with genome-wide significant interaction with smoking.

Here, we explored gene-by-smoking interaction effects limited to genetic variants previously found to be associated with pulmonary function in standard marginal effects GWAS,4 therefore not including the new variants reported by Hancock et al.9 based on the joint test of main effects plus interaction. Specifically, we aimed to determine whether smoking modifies the effect of established genetic variants when considered singly or in combination using a genetic risk score summarizing the genetic predisposition to abnormal pulmonary function. The primary motivation for using a genetic risk score is statistical power.14,15 Indeed, several genetic risk score-by-exposure interactions have already been identified in cases where single SNPs did not show evidence for statistically significant interactions.16–21 Genetic risk score-by-exposure interaction testing expands on the principle of the omnibus test while leveraging the assumption that, for a given choice of coded alleles, most interaction effects will have the same direction. This is similar to burden tests that have been widely used for rare variant analysis,22 where a single parameter can accumulate evidence for association without increasing the number of degrees of freedom. When interaction effects are null on average (i.e. if interaction effects are both negative and positive so that the sum of interaction coefficients tends to zero), the single SNP approach will generally outperform the risk score-based approach. Conversely, if interaction effects tend to be in the same direction, the risk score-based approach can have dramatically higher power.14

Methods

Study sample

The present analysis relies on the Hancock et al.9 genome-wide meta-analysis for main genetic effects plus interaction effects with smoking in relation to pulmonary function among 50 047 participants (56% women) of European ancestry from 19 studies. The mean age was 53 years at the time of pulmonary function testing. Approximately 15% were current smokers and 56% were ever smokers. Among ever smokers, the average pack-years of smoking was 21. Supplementary Table 3 (available as Supplementary data at IJE online) provides the main characteristics of the studies included; complete details of study-specific pulmonary function testing protocols have been published.4 For studies with spirometry at a single visit, we analysed FEV1/FVC and FEV1 measured at that visit. For studies with spirometry at more than one visit, measurements from the baseline visit or the most recent examination with spirometry data were used. Smoking history (current, former and never smoking) was ascertained by questionnaire at the time of pulmonary function testing. Pack-years of smoking were calculated for current and past smokers by multiplying smoking amount (packs per day) and duration (years smoked). Approximately 2.5 million autosomal SNPs were tested for interaction with smoking status (ever smoking vs never smoking) and pack-years, for two outcomes: FEV1 and FEV1/FVC (see next section). We also used two independent datasets of individuals of European ancestry to test for replication. The first replication dataset included 8859 unrelated individuals, and the second dataset included 9457 individuals from family-based studies. The look-up was performed in the GWAS for marginal genetic effects conducted separately in ever smokers and never smokers as part of a recent meta-analysis of FEV1 and FEV1/FVC.23

Single SNP-by-smoking interaction

The analysis performed in this study used summary statistics data from the aforementioned meta-analysis of 19 studies performed by Hancock et al.9 In brief, each of the 19 studies derived the residuals of FEV1 and FEV1/FVC after regressing out age, age², sex, standing height, principal component eigenvectors of genotypes and recruitment site if applicable. The residuals were normalized using a rank-based inverse normal transformation. Single SNP interaction effects were assessed using the following model (see Supplementary Note, available as Supplementary data at IJE online):
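$$
Y = \beta_0 + \beta_G G + \sum_{l} \beta_{E_l} E_l + \beta_{GE_k}\, G \times E_k + \varepsilon \qquad (1)
$$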
where βG and βEl are the main effect of the SNP G and exposure El, βGEk is the interaction effect between G and exposure Ek, and β0 the intercept.
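The rank-based inverse normal transformation applied to the residuals before fitting Equation (1) can be sketched as follows. This is a minimal illustration, not the studies' actual pipeline: the Blom offset c = 3/8, the average-rank handling of ties and the function name `rank_inverse_normal` are assumptions, and the input values are made up.

```python
from statistics import NormalDist


def rank_inverse_normal(values, c=3.0 / 8.0):
    """Map values to normal scores via their ranks (ties share the average rank).

    The offset c = 3/8 (Blom) is an assumed convention, not taken from the source.
    """
    n = len(values)
    order = sorted(range(n), key=lambda i: values[i])
    ranks = [0.0] * n
    pos = 0
    while pos < n:
        end = pos
        # Extend the block while the next sorted value is tied with the current one.
        while end + 1 < n and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg_rank = (pos + end) / 2.0 + 1.0  # 1-based average rank of the tied block
        for j in range(pos, end + 1):
            ranks[order[j]] = avg_rank
        pos = end + 1
    std_normal = NormalDist()
    return [std_normal.inv_cdf((r - c) / (n - 2.0 * c + 1.0)) for r in ranks]


# Hypothetical residuals, for illustration only.
residuals = [0.12, -1.50, 0.33, 2.80, -0.07]
transformed = rank_inverse_normal(residuals)
```

The transformation keeps the ordering of the residuals while forcing their distribution to be approximately standard normal, which is why the effect estimates reported later are on a standardized scale.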

Detailed description of studies used in the replication analysis can be found in Soler Artigas et al.23 In brief, FEV1 and FEV1/FVC were regressed on age, age², sex, height and principal components for population structure, separately for ever smokers and never smokers. The residuals were normalized using a rank-based inverse normal transformation, again separately in ever smokers and never smokers. These transformed residuals were then used as the phenotype for association testing under an additive genetic model in each exposure stratum. Inference of the interaction effects from the exposure-stratified analyses is described in the Supplementary Note (available as Supplementary data at IJE online).
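The exact procedure for inferring interaction effects from stratified results is given in the Supplementary Note and is not reproduced here. As a rough sketch of the general idea, a common approximation takes the difference between the stratum-specific SNP effects and, assuming the two strata are independent, sums their variances. The function name and the input values below are hypothetical.

```python
import math


def interaction_from_strata(beta_ever, se_ever, beta_never, se_never):
    """Approximate a SNP-by-smoking interaction from exposure-stratified estimates.

    Assumes independent strata, so the variance of the difference is the sum
    of the two stratum-specific variances.
    """
    beta_int = beta_ever - beta_never
    se_int = math.sqrt(se_ever ** 2 + se_never ** 2)
    z = beta_int / se_int
    p_value = math.erfc(abs(z) / math.sqrt(2.0))  # two-sided normal P-value
    return beta_int, se_int, p_value


# Hypothetical stratum-specific estimates, for illustration only.
beta_int, se_int, p_value = interaction_from_strata(-0.040, 0.010, -0.030, 0.012)
```

Because the phenotype is normalized separately within each stratum, such a difference is only an approximation of the interaction term estimated in a joint model.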

Multivariate interaction analysis overview

First, we considered an unweighted genetic risk score-by-smoking interaction where the risk score simply sums the number of risk alleles (i.e. alleles associated with a lower pulmonary function). This unweighted genetic risk score is most powerful when the interaction effects have the same direction as marginal SNP effects (i.e. the harmful effects of smoking are magnified in individuals with a genetic predisposition to reduced pulmonary function). Second, we used a weighted genetic risk score where SNPs were weighted by the absolute value of their marginal effect estimates obtained from stage 1 screening of FEV1 and FEV1/FVC from Soler Artigas et al.4 (Supplementary Table 1, available as Supplementary data at IJE online). This weighting scheme is most powerful when the magnitude of interaction effects is proportional to the SNP marginal effects. Finally, for our third multivariate analysis, we derived a standard omnibus test of all interaction effects. This test will retain power in the presence of effects in both directions or of different magnitudes. Although there is strong correlation among the 12 tests performed (these three models, considering interaction with two smoking metrics, ever/never smoking or pack-years, for the two pulmonary function metrics FEV1 and FEV1/FVC), we used a stringent Bonferroni P-value correction threshold of 4 × 10–3 to account for multiple testing.

When raw data are available, the weighted genetic risk score (GRS) is usually expressed as $\mathrm{GRS} = \sum_{i=1}^{m} w_i \times G_i$, where m is the number of SNPs included in the genetic risk score and $w = (w_1, \ldots, w_m)$ are the weights attributed to each single SNP. Following previous notation, the test of interaction between the GRS and the exposure Ek can be applied using the following model:
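$$
Y = \gamma_0 + \gamma_{GRS}\, \mathrm{GRS} + \sum_{l} \gamma_{E_l} E_l + \gamma_{INT}\, \mathrm{GRS} \times E_k + \varepsilon \qquad (2)
$$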
where γ0, γGRS, γEl and γINT are the intercept, the main effect of the GRS, the main effect of the exposure El and the interaction effect between Ek and the GRS, respectively. However, because individual-level data were not directly available, we performed the test of γINT from summary statistics of interaction effects using an inverse-variance weighted sum as proposed by Aschard.14 The chi-square for the interaction term γINT was derived as follows:
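$$
\chi^2_{int} = \frac{\left( \sum_{i=1}^{m} w_i \, \hat{\beta}_{G_i \times E_k} / \hat{\sigma}^2_{\beta_{G_i \times E_k}} \right)^2}{\sum_{i=1}^{m} w_i^2 / \hat{\sigma}^2_{\beta_{G_i \times E_k}}} \qquad (3)
$$
where $\hat{\beta}_{G_i \times E_k}$ and $\hat{\sigma}^2_{\beta_{G_i \times E_k}}$ are the estimated effects and variance of the interaction between the exposure Ek and the SNP Gi obtained from Equation (1) and $w_i$ is the weight applied to SNP Gi. Under the null hypothesis of no interaction effect, $\chi^2_{int}$ follows a chi-squared distribution with one degree of freedom.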
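A minimal sketch of the summary-statistics GRS interaction test follows. It assumes the statistic has the inverse-variance weighted form (Σ wᵢβ̂ᵢ/σ̂ᵢ²)² / Σ wᵢ²/σ̂ᵢ², with the matching point estimate and standard error. The function name and the input values are hypothetical, and the estimate and standard error returned are an assumption consistent with that form rather than something stated in the text.

```python
import math


def grs_interaction_test(betas, ses, weights=None):
    """Inverse-variance weighted GRS-by-exposure interaction test.

    betas, ses: single SNP interaction estimates and their standard errors.
    weights: SNP weights; equal weights give the unweighted GRS.
    """
    if weights is None:
        weights = [1.0] * len(betas)
    num = sum(w * b / s ** 2 for w, b, s in zip(weights, betas, ses))
    den = sum(w ** 2 / s ** 2 for w, s in zip(weights, ses))
    gamma_int = num / den            # assumed GRS interaction estimate
    se_gamma = math.sqrt(1.0 / den)  # assumed standard error of that estimate
    chi2 = num ** 2 / den            # chi-square statistic with 1 degree of freedom
    p_value = math.erfc(math.sqrt(chi2 / 2.0))  # survival function for 1 df
    return gamma_int, se_gamma, chi2, p_value


# Hypothetical single SNP interaction estimates, for illustration only.
betas = [-0.020, -0.010, 0.005, -0.015]
ses = [0.013, 0.013, 0.013, 0.013]
gamma_int, se_gamma, chi2, p_value = grs_interaction_test(betas, ses)
```

With equal weights and equal standard errors the estimate reduces to the mean of the single SNP interaction effects, which illustrates why the test gains power when most effects share the same sign.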
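The standard omnibus test of all interaction effects consisted of evaluating jointly $\alpha_{G \times E_k} = (\alpha_{G_1 \times E_k}, \ldots, \alpha_{G_m \times E_k})$ from the model:
$$
Y = \alpha_0 + \sum_{i=1}^{m} \alpha_{G_i} G_i + \sum_{l} \alpha_{E_l} E_l + \sum_{i=1}^{m} \alpha_{G_i \times E_k}\, G_i \times E_k + \varepsilon \qquad (4)
$$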
where α0, αGi, αEl and αGi×Ek are the intercept, the main effects of SNP Gi and the exposure El, and the interaction effect between Gi and Ek. Leveraging the independence between the SNPs considered (a single SNP was selected for each independent locus), we also derived the omnibus test using summary statistics. Under this independence assumption, the Gi × Ek interaction terms are also independent,14 so that the omnibus test can be performed by summing the chi-squares from each univariate interaction test to form a chi-square with m degrees of freedom as follows:
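$$
\chi^2_{omnibus} = \sum_{i=1}^{m} \frac{\hat{\beta}^2_{G_i \times E_k}}{\hat{\sigma}^2_{\beta_{G_i \times E_k}}} \qquad (5)
$$
where $\hat{\beta}_{G_i \times E_k}$ and $\hat{\sigma}^2_{\beta_{G_i \times E_k}}$ are the estimated effects and variance of the interaction between the exposure Ek and the SNP Gi obtained from Equation (1).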
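The omnibus test described above, summing the univariate chi-squares of independent SNPs, can be sketched as follows. To stay within the standard library, the P-value uses the closed-form chi-square survival function that holds for an even number of degrees of freedom (as with 26 SNPs); this restriction, the function names and the input values are assumptions made for the illustration.

```python
import math


def chi2_sf_even_df(x, df):
    """Chi-square survival function, closed form valid for positive even df."""
    if df <= 0 or df % 2 != 0:
        raise ValueError("this closed form needs a positive even df")
    half = x / 2.0
    term = 1.0
    total = 1.0
    # Accumulates sum_{j=0}^{df/2 - 1} (x/2)^j / j!
    for j in range(1, df // 2):
        term *= half / j
        total += term
    return math.exp(-half) * total


def omnibus_interaction_test(betas, ses):
    """Sum of squared z-scores of independent single SNP interaction tests."""
    chi2 = sum((b / s) ** 2 for b, s in zip(betas, ses))
    return chi2, chi2_sf_even_df(chi2, len(betas))


# Hypothetical single SNP interaction estimates, for illustration only.
betas = [-0.020, -0.010, 0.005, -0.015]
ses = [0.013, 0.013, 0.013, 0.013]
omnibus_chi2, omnibus_p = omnibus_interaction_test(betas, ses)
```

Unlike the risk score test, this statistic ignores the sign of each effect, so it retains power when effects go in both directions but spends one degree of freedom per SNP.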

Relative risk in ever smokers vs never smokers

GRS interaction effects can further be translated in terms of risk prediction. For pulmonary function, low FEV1 or FEV1/FVC increases the risk of death24 and together they form the basis for the diagnosis of COPD.1–3 COPD stage 2 or higher is defined by the Global Initiative for Chronic Obstructive Lung Disease (GOLD) as FEV1/FVC < 0.70 and FEV1 < 80% of the predicted value. According to recent studies,2,25 between 5% and 20% of European ancestry adults are expected to have FEV1/FVC < 0.70, depending on smoking characteristics and age distribution. Several studies argue for a more stringent threshold to define COPD25,26 based on the lower limit of normal predicted value, rather than a fixed absolute value, to prevent disease misclassification.

To explore the impact of interaction effect on the risk of disease, we derived the relative risk (RR) of having FEV1/FVC below a given threshold (1%, 5% and 20%) in ever smokers vs never smokers conditional on the unweighted GRS. This quantity is defined as the joint probability of having both FEV1/FVC in the interval [–∞, FEV1/FVCup] and the GRS in the interval [GRSlow,GRSup]. This can be expressed as the following integral:
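$$
P\left( y \le \mathrm{FEV_1/FVC}_{up},\; GRS_{low} \le g \le GRS_{up} \mid e \right) = \int_{GRS_{low}}^{GRS_{up}} \int_{-\infty}^{\mathrm{FEV_1/FVC}_{up}} f_1(y \mid e, g)\, f_2(g)\, dy\, dg \qquad (6)
$$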
where y, e and g are FEV1/FVC, smoking status and the GRS, respectively, and f1 and f2 are the probability density function of y and g. The detailed derivation of the above integral is available as Supplementary data at IJE online.
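The relative risk computation can be illustrated numerically. The sketch below is not the derivation given in the Supplementary data: it assumes a standardized outcome with unit residual variance, a linear mean model with a GRS-by-smoking interaction, a centred GRS that is approximately normal and independent of smoking, and a midpoint-rule integration. All parameter values and function names are hypothetical.

```python
from statistics import NormalDist

STD_NORMAL = NormalDist()


def joint_probability(y_up, g_low, g_up, e, params, grs_dist, steps=2000):
    """Midpoint-rule approximation of P(y <= y_up, g_low <= g <= g_up | e)."""
    width = (g_up - g_low) / steps
    total = 0.0
    for k in range(steps):
        g = g_low + (k + 0.5) * width
        # Assumed mean model: smoking effect plus a GRS effect modified by smoking.
        mean_y = params["b_e"] * e + (params["b_g"] + params["b_int"] * e) * g
        total += STD_NORMAL.cdf(y_up - mean_y) * grs_dist.pdf(g) * width
    return total


def relative_risk(y_up, g_low, g_up, params, grs_dist):
    """Ratio of the joint probabilities in ever smokers (e=1) vs never smokers (e=0)."""
    p_ever = joint_probability(y_up, g_low, g_up, 1, params, grs_dist)
    p_never = joint_probability(y_up, g_low, g_up, 0, params, grs_dist)
    return p_ever / p_never


# Hypothetical parameters on the standardized scale, for illustration only.
params = {"b_e": -0.10, "b_g": -0.036, "b_int": -0.0099}
grs_dist = NormalDist(0.0, 3.3)          # assumed centred GRS distribution
y_up = STD_NORMAL.inv_cdf(0.05)          # lowest 5% of a standard normal outcome
tail = 6.0 * grs_dist.stdev              # practical stand-in for an infinite bound

rr_top = relative_risk(y_up, grs_dist.inv_cdf(0.8), tail, params, grs_dist)
rr_bottom = relative_risk(y_up, -tail, grs_dist.inv_cdf(0.2), params, grs_dist)
```

Under these assumptions a negative interaction term yields a larger relative risk in the top GRS quintile than in the bottom one, which is the qualitative pattern the risk calculation is meant to expose.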

Results

We selected 26 loci previously found to be associated with FEV1 or FEV1/FVC at genome-wide significance (P < 5 × 10–8) in marginal association tests4,11,27 (i.e. not including interaction effects with smoking exposures) and replicated in the GWAS by Soler Artigas et al.,4 the largest meta-analysis of marginal genetic effect conducted for these two traits in the general population. Additional loci for these two phenotypes have been identified in two recent studies.28,29 However, these new loci were not included in our analysis because both these studies used a large cohort ascertained through smoking status. For each of the 26 selected loci, we chose the SNP with the strongest evidence for association (i.e. smallest P-value) with each of these phenotypes. The final list included 26 SNPs per phenotype, with only two SNPs being different between FEV1 and FEV1/FVC as previously reported4 (Supplementary Table 1, available as Supplementary data at IJE online). Estimated interaction effects of these SNPs were extracted from the meta-analysis summary statistics for the four tests performed in the Hancock et al.9 analysis: SNP-by-smoking status (ever smoking vs never smoking) interaction effect on FEV1 and FEV1/FVC; and SNP-by-smoking pack-years interaction effect on FEV1/FVC and FEV1. As shown in Supplementary Table 2 (available as Supplementary data at IJE online), nine SNPs showed nominal significance (P < 0.05) out of the 104 tests performed; however, none remained significant after accounting for multiple testing (Bonferroni corrected P-value threshold of 5 × 10–4). The minimum P-value was observed for the interaction between rs993925, near the TGFβ2 gene, and smoking status on FEV1 [βint = –0.036, 95% confidence interval (CI), –0.009 to –0.032, P = 0.007].

Next, using these data, we conducted three multivariate (as opposed to single SNP) interaction analyses, testing jointly for the interaction effects between those SNPs and either smoking status or pack-years on the two phenotypes (FEV1 and FEV1/FVC) for a total of 12 tests. As shown in Table 1, none of the multivariate interaction tests with pack-years was significant. However, four of the six multivariate interaction tests with smoking status (ever vs never) showed nominal significance, and two tests for FEV1/FVC had a P-value below the Bonferroni significance level (12 tests, P < 4 × 10–3). The strongest signal was observed for the unweighted genetic risk score-by-smoking status interaction effect on FEV1/FVC (βint = –0.0099, 95% CI –0.016 to –0.0043, P = 0.00057). The Cochran’s Q test for heterogeneity of the interaction effect across studies was not significant (P = 0.97) and the forest plot of study-specific results did not display any obvious outlier (Supplementary Figure 1, available as Supplementary data at IJE online).
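The two Bonferroni thresholds quoted in the text follow from simple arithmetic, shown here as a short check. The helper name is hypothetical; the test counts (104 single SNP tests, 12 multivariate tests) and the P-values compared against the threshold are those reported in the text and in Table 1.

```python
def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold controlling the family-wise error at alpha."""
    return alpha / n_tests


# 26 SNPs x 4 single SNP interaction tests = 104 tests (about 5 x 10^-4).
single_snp_threshold = bonferroni_threshold(0.05, 26 * 4)

# 3 models x 2 smoking metrics x 2 outcomes = 12 tests (about 4 x 10^-3).
multivariate_threshold = bonferroni_threshold(0.05, 3 * 2 * 2)

# P-values reported for the smoking status tests on FEV1/FVC.
reported = {"uGRS": 0.00057, "wGRS": 0.0022, "CHISQ": 0.026}
significant = {name: p < multivariate_threshold for name, p in reported.items()}
```

Only the two risk score tests fall below the 12-test threshold, whereas the omnibus test is nominally significant only.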

Table 1.

Multivariate interaction tests of the 26 loci associated with pulmonary function

| Outcome | Exposure | Test | int | (CI) | P-value |
| --- | --- | --- | --- | --- | --- |
| FEV1 | Smoking status^a | uGRS | –0.0055 | (–0.011, 2.7 × 10–5) | 0.051 |
| | | wGRS | –0.21 | (–0.40, –0.033) | **0.020** |
| | | CHISQ | | | 0.49 |
| FEV1 | Pack-years | uGRS | –1.6 × 10–5 | (–4.6 × 10–5, 1.4 × 10–5) | 0.30 |
| | | wGRS | –6.5 × 10–4 | (–1.6 × 10–3, 3.3 × 10–4) | 0.19 |
| | | CHISQ | | | 0.46 |
| FEV1/FVC | Smoking status^a | uGRS | –0.0099 | (–0.016, –0.0043) | **0.00057**^b |
| | | wGRS | –0.21 | (–0.33, –0.073) | **0.0022**^b |
| | | CHISQ | | | **0.026** |
| FEV1/FVC | Pack-years | uGRS | –4.4 × 10–6 | (–3.6 × 10–5, 2.7 × 10–5) | 0.78 |
| | | wGRS | –6.5 × 10–5 | (–8.0 × 10–4, 6.6 × 10–4) | 0.85 |
| | | CHISQ | | | 0.53 |

uGRS is the genetic risk score using equal weights to all SNPs; wGRS is the genetic risk score weighted by effect estimates from the marginal screening; CHISQ is the omnibus test of all interaction effects; int is the estimated interaction effect between the GRS and the exposure; and CI is the confidence interval of that estimate. Nominally significant tests are indicated in bold. ^a Smoking status is defined as never smokers vs ever smokers. ^b Significant P-value after Bonferroni correction.

The contrast between this significant risk score interaction and the absence of strong single SNP interaction effects can be explained by looking at the distribution of the single SNP interaction effect estimates. Figure 1 shows this distribution for the alleles associated with decreased FEV1/FVC. It highlights that, although the 95% CI of most single SNP interaction effects encompass the null (and therefore the absence of significant single SNP interaction effect), there is an enrichment for negative interaction effects. Indeed, even a binomial test can be used to confirm the unbalanced direction of interaction effects (18 of 26 interactions are negative leading to a P-value of 0.014 for a binomial test with an expected equiprobable distribution of 0.5). The genetic risk score-based interaction test exploits such enrichment by testing for the average interaction effect across all SNPs.14 As with any multivariate approach based on a composite null hypothesis, this result indicates that at least a subset of these 26 SNPs interact with smoking status, but does not allow us to determine which or how many SNPs are driving the genetic risk score-by-smoking interaction. The three other sets of single SNP interaction tests showed a similar (but not significant after correction for multiple testing) trend with enrichment for negative interactions (Supplementary Figures 2–4, available as Supplementary data at IJE online). We summarized the contribution of the unweighted genetic risk score-by-smoking interaction on FEV1/FVC in Table 2 and Figure 2A. This indicates that the deleterious effect of smoking is enhanced among carriers of the risk alleles or equivalently that the deleterious effect of smoking is reduced among subjects carrying the protective alleles.

Table 2.

Summary of effect estimates for genetic risk score-by-smoking status interaction on FEV1/FVC

| Predictors | Beta | SD | P-value |
| --- | --- | --- | --- |
| From the marginal exposure model | | | |
| Pack-years | –0.0030 | 0.00017 | 1.2 × 10–71 |
| Current smoking | –0.040 | 0.0047 | 7.7 × 10–18 |
| Smoking status^a | –0.0023 | 0.0046 | 0.61 |
| From the interaction model | | | |
| GRS | –0.0363 | 0.0021 | 3.9 × 10–64 |
| GRS × Smoking status^a | –0.0099 | 0.0029 | 5.7 × 10–4 |

GRS is the unweighted genetic risk score; beta is the effect estimate of each predictor; and SD the standard deviation of each beta. ^a Smoking status was defined as never smokers vs ever smokers.

Figure 1.

Distribution of interaction effects on FEV1/FVC. Single SNP risk allele-by-smoking status (ever/never) interaction effect estimates (βint) and 95% confidence intervals are plotted by increasing values. The unweighted genetic risk score-by-smoking status interaction is plotted at the bottom.

Figure 2.

Overview of the unweighted genetic risk score-by-smoking interaction effect on FEV1/FVC. Upper panel (A) presents the distribution of the unweighted genetic risk score (GRS, grey density plot) and the relationship between the unweighted GRS and standardized FEV1/FVC in ever smokers (dashed line) and never smokers (solid line). Lower panel (B) shows the excess relative risk (RR) of having FEV1/FVC in the lowest 1%, 5% and 20% of the population for ever smokers compared with never smokers, as stratified by GRS quintiles.

We used two independent datasets, one of 8859 unrelated individuals and another of 9457 related individuals, to test for independent replication of our results (Supplementary Note, available as Supplementary data at IJE online). Although the interaction effects were not significant, both replication samples showed a consistent negative GRS-by-ever smoking interaction effect on FEV1/FVC (β̂int = –0.0025, 95% CI –0.0165 to 0.0115, P = 0.72 and β̂int = –0.0030, 95% CI –0.0214 to 0.0154, P = 0.74; overall interaction effect in the combined replication datasets β̂int = –0.0027, 95% CI –0.0136 to 0.0082, P = 0.63), and a Cochran’s Q test for heterogeneity showed no significant difference in the three effect estimates (P = 0.51).
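The combined replication estimate and the heterogeneity test can be approximately recovered from the reported numbers. The sketch below assumes a fixed-effect inverse-variance weighted meta-analysis, standard errors back-calculated from the reported 95% CIs, and a discovery standard error of 0.0029 taken from Table 2; the function names are hypothetical, and small departures from the published values are expected because the inputs are rounded.

```python
import math


def se_from_ci(lower, upper, z=1.96):
    """Back-calculate a standard error from a symmetric 95% confidence interval."""
    return (upper - lower) / (2.0 * z)


def inverse_variance_meta(betas, ses):
    """Fixed-effect pooled estimate, its standard error and Cochran's Q."""
    weights = [1.0 / s ** 2 for s in ses]
    pooled = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    q_stat = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    return pooled, pooled_se, q_stat


# Replication estimates and CIs as reported in the text.
rep_betas = [-0.0025, -0.0030]
rep_ses = [se_from_ci(-0.0165, 0.0115), se_from_ci(-0.0214, 0.0154)]
pooled, pooled_se, _ = inverse_variance_meta(rep_betas, rep_ses)

# Heterogeneity across the discovery estimate and the two replication estimates.
all_betas = [-0.0099] + rep_betas
all_ses = [0.0029] + rep_ses
_, _, q_stat = inverse_variance_meta(all_betas, all_ses)
q_p_value = math.exp(-q_stat / 2.0)  # chi-square survival function for 2 df
```

Under these assumptions the pooled replication estimate is close to the reported –0.0027, and the heterogeneity P-value lands near the reported 0.51.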

To quantify the impact of this result from a public health perspective, we estimated the impact of the genetic risk score-by-smoking interaction on having FEV1/FVC in the lowest 1%, 5% and 20% of the population distribution. Specifically, we derived the RR of having FEV1/FVC below these cut-off points (1%, 5% and 20%) in ever smokers compared with never smokers. Figure 2B quantifies the excess RR (i.e. the RR minus one) of individuals across five GRS quintiles. It highlights the higher risk associated with smoking among individuals carrying risk alleles (i.e. alleles associated with poorer pulmonary function) as compared with individuals carrying protective alleles (i.e. alleles associated with better pulmonary function). For example, among individuals with a GRS above the 80th percentile, ever smokers have on average a 26% excess RR of having FEV1/FVC in the lowest 1% of the population distribution, whereas ever smokers with a GRS below the 20th percentile have on average an 18% excess RR of falling in that same FEV1/FVC category compared with never smokers. Applying the same approach for FEV1, we observed a similar pattern (Supplementary Figure 5, available as Supplementary data at IJE online). However, as expected, the lower magnitude of the genetic risk score-by-ever smoking interaction on FEV1 implied a smaller difference in RR between ever smokers and never smokers.

Discussion

Using the largest dataset to date of European ancestry participants from the general population with pulmonary function (FEV1/FVC and FEV1), smoking and genetic data, we identified a gene-by-smoking interaction effect on FEV1/FVC by using a GRS composed of 26 SNPs identified and replicated in a prior GWAS meta-analysis of marginal genetic effects. To our knowledge, our study is the first to report a synergistic action of genes and smoking on pulmonary function (i.e. the reduction in FEV1/FVC as a result of smoking is greater among individuals who are genetically predisposed to lower FEV1/FVC ratio). Our study also highlights the importance of developing and applying alternative strategies to evaluate interaction effects for lung phenotypes along with other complex traits and diseases. The genetic risk score-based approach enabled us to identify an interaction when the standard univariate test (i.e. evaluating each single genetic variant for interaction independently) failed to identify any interactions.

Replication studies showed interaction effect estimates in the same direction as the discovery study but were not significant, and the magnitude of interaction effects was substantially smaller. We acknowledge that, despite careful evaluation of the interaction effects in the discovery sample, the observed signal might be overestimated or confounded by unmeasured complex factors. However, we can a priori rule out a systematic bias of the single SNP interaction effects in the discovery study, because the genomic inflation factor λ, defined as the ratio of the median of the empirically observed distribution of the test statistic to the expected median,30 was not substantially different from 1 (λ = 1.044 for FEV1/FVC and smoking status). Instead, differences in significance and effect estimates might be partly explained by the limited sample size in the replication study and differences in the analytical design. Indeed, the discovery analysis was performed using a saturated model including three smoking exposures and explicitly modelled the interaction effect. In comparison, the replication analysis was not adjusted for current smoking status and pack-years, and the interaction effect was approximated from analyses stratified by smoking status, which has some limitations (see Supplementary Note and Supplementary Figure 6, available as Supplementary data at IJE online). Previous work has shown that combined analyses are more powerful when effects exist in both strata,31 as observed in the discovery study. Further, even with N = 18 316 individuals in the combined replication population, we are underpowered. This sample size provides less than 50% power, at nominal significance of 5%, to detect interaction effects with the GRS.
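The power statement can be checked with a rough normal-approximation calculation. This sketch assumes that the true interaction effect equals the discovery estimate (–0.0099, which may itself be overestimated) and that the standard error in the combined replication sample can be back-calculated from its reported 95% CI (–0.0136 to 0.0082); the function name is hypothetical.

```python
from statistics import NormalDist


def power_two_sided(effect, se, alpha=0.05):
    """Power of a two-sided z-test for a given true effect and standard error."""
    norm = NormalDist()
    z_crit = norm.inv_cdf(1.0 - alpha / 2.0)
    ncp = abs(effect) / se  # expected z-score under the alternative
    return norm.cdf(ncp - z_crit) + norm.cdf(-ncp - z_crit)


# Standard error implied by the combined replication CI reported in the text.
replication_se = (0.0082 - (-0.0136)) / (2 * 1.96)

# Assumes the true effect equals the discovery estimate.
replication_power = power_two_sided(-0.0099, replication_se)
```

Under these assumptions the power stays below 50%, in line with the statement above; if the discovery estimate is inflated, the actual power would be lower still.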

Genetic risk score-by-exposure interaction can have higher clinical value than the identification of single SNP-by-exposure interaction by capturing a wealth of information in a single measure to identify subgroups in the population whose genetic background makes them more susceptible to the deleterious effects of smoking.19,32,33 Indeed, if single SNP-by-smoking interactions are distributed unconditionally on the marginal genetic effect (i.e. interaction effects are equally likely to be positive or negative given that the coded alleles are the risk alleles), the genetic effect is expected to be similar between ever and never smokers. The enrichment for negative interactions we identified through our GRS approach reveals a stronger genetic component among the ever smoker subgroup in the population and can allow more efficient implementation of prevention strategies. For example, in the public health setting, programmes targeting smoking cessation campaigns to individuals who are genetically predisposed to low pulmonary function may have a stronger impact in preventing COPD.

Our results may also elucidate biological mechanisms underlying the interplay between genes and smoking in pulmonary function. In particular, the higher statistical power for the genetic risk score-based interaction test points towards the potential presence of an unmeasured intermediate biomarker mediating the effect of the 26 loci on FEV1/FVC. As shown in Figure 3, the most parsimonious model (i.e. the least complex, following Occam’s razor) that would explain multiple interactions going in the same direction (Figure 1) implies that the genetic variants together influence an intermediate biomarker, which itself interacts with smoking. Future studies with extended genomic data, including transcriptomic, proteomic or metabolomic data, might be able to further assess such a hypothesis by (i) evaluating the effect of the GRS on those biomarkers and (ii) testing for interactions between smoking and the candidate biomarkers identified at step (i).

Figure 3. Underlying causal model. Potential causal diagrams underlying the gene and smoking interaction effects on FEV1/FVC. Panel (A) presents a scenario where each genetic variant influences the outcome through a SNP-specific pathway, and interactions with the environmental exposure take place along these pathways. Panel (B) presents an alternative (and simpler) model where multiple genetic variants influence an unmeasured intermediate biomarker U, whose effect on FEV1/FVC depends on smoking. In scenario (A), the single SNP-by-smoking interaction test is the optimal approach, whereas in scenario (B) the single SNP-by-smoking interaction test can become inefficient, and the interaction would be easier to detect using a genetic risk score-by-smoking interaction test, because it summarizes all interaction effects in a single test.

This study has some limitations. The 26 selected variants together explain a relatively small proportion of the additive genetic variance in FEV1/FVC and in FEV1.4 However, GWAS with increasing sample sizes will likely continue to provide additional associated genetic variants. These variants can be used to further assess the role of SNP-by-smoking interaction effects on pulmonary phenotypes, and they may widen the gap between ever smokers and never smokers enough to allow for a significant impact in the clinic or at the population level. Moreover, we focused on genetic variants previously found to be associated at the genome-wide significance level, but future studies might consider less stringent criteria to select genetic variants, including those with only suggestive evidence, or alternatively candidate variants with functional annotation relevant to the outcomes and exposures in question. The signal-to-noise ratio might decrease when relaxing the constraint on SNP selection. However, as we recently showed, additional gain in statistical power might be achieved even if a substantial proportion of the variants do not interact with the exposure.14 Finally, investigation of interaction effects with other environmental exposures such as second-hand smoke, air pollution, asbestos or occupational risks may lead to a more comprehensive understanding of the biological and epidemiological significance of these variants.

In summary, the identification of interaction effects between genetic variants and environmental exposures in human traits is recognized as extremely challenging, and this quest has been mostly unsuccessful so far. In this study, we discovered novel gene-by-smoking interactions using risk scores that were not observed at the level of individual genetic variants. This risk score analysis suggests that persons with a greater genetic predisposition to low pulmonary function are more susceptible to the deleterious effects of smoking. By extension, the use of a GRS may help predict which smokers will fall below thresholds that establish the diagnosis of COPD.

Supplementary Data

Supplementary data are available at IJE online.

Acknowledgements

We thank the many colleagues who contributed to the collection and phenotypic characterization of the clinical samples and to the genotyping of the data. We especially thank those who kindly agreed to participate in the studies. H.A. was supported by R21HG007687. S.J.L. was supported by the Intramural Research Program of the NIH, National Institute of Environmental Health Sciences. The research undertaken by M.D.T. and L.V.W. was partly funded by the National Institute for Health Research (NIHR). The views expressed are those of the author(s) and not necessarily those of the NHS, the NIHR or the Department of Health. A full list of studies, authors' sources of funding and acknowledgements is provided in the Supplementary data. H.A., P.K., S.J.L., M.T., D.B.H. and A.J. were involved in designing the study. M.D.T., D.B.H., A.S., A.J., A.V.S., A.W.M., D.W.L., D.P.S., G.O.C., R.G.B., G.G.B., I.P.H., J.K.P., J.F.W., J.W., J.H.Z., K.d.J., L.V.W., M.S.A., H.M.B., M-R.J., M.F., P.A.C., S.A.G., S.R.H., V.G., W.T., S.J.L., I.R., O.P., J.E.H., C.H., A.C., D.J.P., S.E.H., I.J.D., S.E., U.G., LP.L., T.L., E.Z., B.P.P. and V.E.J. were involved in participant recruitment, sample collection or genotyping. H.A. performed the analyses for the discovery study. H.A., V.E.J. and M.S.A. performed the replication analysis. H.A. drafted the paper, with substantial editorial input from P.K., S.J.L., J.D., D.S., M.T., D.B.H. and A.J. All authors have reviewed and approved the final draft. This material has not been published previously in a substantively similar form.

Conflict of interest: J.K. consulted for Pfizer on nicotine dependence. J.B.W. was employed by Pfizer at the time this research was undertaken. W.T. is a full-time employee and receives salary from Boehringer Ingelheim Pharmaceuticals Inc. Other authors declare no competing financial interest.

References

1. Rabe KF, Hurd S, Anzueto A et al. Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease: GOLD executive summary. Am J Respir Crit Care Med 2007;176:532–55.

2. Lange P, Celli B, Agusti A et al. Lung-function trajectories leading to chronic obstructive pulmonary disease. N Engl J Med 2015;373:111–22.

3. Vestbo J, Hurd SS, Agusti AG et al. Global strategy for the diagnosis, management, and prevention of chronic obstructive pulmonary disease: GOLD executive summary. Am J Respir Crit Care Med 2013;187:347–65.

4. Soler Artigas M, Loth DW, Wain LV et al. Genome-wide association and large-scale follow up identifies 16 new loci influencing lung function. Nat Genet 2011;43:1082–90.

5. Aschard H, Lutz S, Maus B et al. Challenges and opportunities in genome-wide environmental interaction (GWEI) studies. Hum Genet 2012;131:1591–1613.

6. Khoury MJ, Wacholder S. Invited commentary: from genome-wide association studies to gene–environment-wide interaction studies—challenges and opportunities. Am J Epidemiol 2009;169:227–30; discussion 34–5.

7. Hutter CM, Mechanic LE, Chatterjee N et al. Gene–environment interactions in cancer epidemiology: a National Cancer Institute Think Tank report. Genet Epidemiol 2013;37:643–57.

8. How Tobacco Smoke Causes Disease: The Biology and Behavioral Basis for Smoking-Attributable Disease: A Report of the Surgeon General. Atlanta (GA), 2010.

9. Hancock DB, Artigas MS, Gharib SA et al. Genome-wide joint meta-analysis of SNP and SNP-by-smoking interaction identifies novel loci for pulmonary function. PLoS Genetics 2012;8:e1003098.

10. Psaty BM, O’Donnell CJ, Gudnason V et al. Cohorts for Heart and Aging Research in Genomic Epidemiology (CHARGE) Consortium: design of prospective meta-analyses of genome-wide association studies from 5 cohorts. Circulation Cardiovascular Genetics 2009;2:73–80.

11. Repapi E, Sayers I, Wain LV et al. Genome-wide association study identifies five loci associated with lung function. Nat Genet 2010;42:36–44.

12. Aschard H, Hancock DB, London SJ, Kraft P. Genome-wide meta-analysis of joint tests for genetic and gene–environment interaction effects. Hum Hered 2011;70:292–300.

13. Kraft P, Yen YC, Stram DO et al. Exploiting gene–environment interaction to detect genetic associations. Hum Hered 2007;63:111–19.

14. Aschard H. A perspective on interaction effects in genetic association studies. Genet Epidemiol 2016;doi: 10.1002/gepi.21989.

15. Marigorta UM, Gibson G. A simulation study of gene-by-environment interactions in GWAS implies ample hidden effects. Frontiers in Genetics 2014;5:225.

16. Pollin TI, Isakova T, Jablonski KA et al. Genetic modulation of lipid profiles following lifestyle modification or metformin treatment: the Diabetes Prevention Program. PLoS Genetics 2012;8:e1002895.

17. Qi L, Cornelis MC, Zhang C et al. Genetic predisposition, Western dietary pattern, and the risk of type 2 diabetes in men. Am J Clin Nutr 2009;89:1453–8.

18. Ahmad S, Rukh G, Varga TV et al. Gene x physical activity interactions in obesity: combined analysis of 111,421 individuals of European ancestry. PLoS Genetics 2013;9:e1003607.

19. Qi Q, Chu AY, Kang JH et al. Sugar-sweetened beverages and genetic risk of obesity. N Engl J Med 2012;367:1387–96.

20. Fu Z, Shrubsole MJ, Li G et al. Interaction of cigarette smoking and carcinogen-metabolizing polymorphisms in the risk of colorectal polyps. Carcinogenesis 2013;34:779–86.

21. Langenberg C, Sharp SJ, Franks PW et al. Gene-lifestyle interaction and type 2 diabetes: the EPIC interact case-cohort study. PLoS Medicine 2014;11:e1001647.

22. Lee S, Abecasis GR, Boehnke M, Lin X. Rare-variant association analysis: study designs and statistical tests. Am J Hum Genet 2014;95:5–23.

23. Soler Artigas M, Wain LV, Miller S et al. Sixteen new lung function signals identified through 1000 Genomes Project reference panel imputation. Nature Communications 2015;6:8658.

24. Viegi G, Pistelli F, Sherrill DL et al. Definition, epidemiology and natural history of COPD. Eur Respir J 2007;30:993–1013.

25. Roche N, Dalmay F, Perez T et al. FEV1/FVC and FEV1 for the assessment of chronic airflow obstruction in prevalence studies: do prediction equations need revision? Respiratory Medicine 2008;102:1568–74.

26. Swanney MP, Ruppel G, Enright PL et al. Using the lower limit of normal for the FEV1/FVC ratio reduces the misclassification of airway obstruction. Thorax 2008;63:1046–51.

27. Hancock DB, Eijgelsheim M, Wilk JB et al. Meta-analyses of genome-wide association studies identify multiple loci associated with pulmonary function. Nat Genet 2010;42:45–52.

28. Artigas MS, Wain LV, Miller S et al. Sixteen new lung function signals identified through 1000 Genomes Project reference panel imputation. Nature Communications 2015;6:8658.

29. Wain LV, Shrine N, Miller S et al. Novel insights into the genetics of smoking behaviour, lung function, and chronic obstructive pulmonary disease (UK BiLEVE): a genetic association study in UK Biobank. The Lancet Respiratory Medicine 2015;3:769–81.

30. Devlin B, Roeder K. Genomic control for association studies. Biometrics 1999;55:997–1004.

31. Behrens G, Winkler TW, Gorski M et al. To stratify or not to stratify: power considerations for population-based genome-wide association studies of quantitative traits. Genet Epidemiol 2011;35:867–79.

32. Aschard H, Zaitlen N, Lindstrom S et al. Variation in predictive ability of common genetic variants by established strata: the example of breast cancer and age. Epidemiology 2015;26:51–8.

33. Aschard H, Chen J, Cornelis MC et al. Inclusion of gene-gene and gene–environment interactions unlikely to dramatically improve risk prediction for complex diseases. Am J Hum Genet 2012;90:962–72.

Author notes

These authors contributed equally to this work.

This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited.