Body mass index and breast cancer survival: a Mendelian randomization analysis

Abstract Background There is increasing evidence that elevated body mass index (BMI) is associated with reduced survival for women with breast cancer. However, the underlying reasons remain unclear. We conducted a Mendelian randomization analysis to investigate a possible causal role of BMI in survival from breast cancer. Methods We used individual-level data from six large breast cancer case-cohorts including a total of 36 210 individuals (2475 events) of European ancestry. We created a BMI genetic risk score (GRS) based on genotypes at 94 known BMI-associated genetic variants. Association between the BMI genetic score and breast cancer survival was analysed by Cox regression for each study separately. Study-specific hazard ratios were pooled using fixed-effect meta-analysis. Results BMI genetic score was found to be associated with reduced breast cancer-specific survival for estrogen receptor (ER)-positive cases [hazard ratio (HR) = 1.11, per one-unit increment of GRS, 95% confidence interval (CI) 1.01–1.22, P = 0.03]. We observed no association for ER-negative cases (HR = 1.00, per one-unit increment of GRS, 95% CI 0.89–1.13, P = 0.95). Conclusions Our findings suggest a causal effect of increased BMI on reduced breast cancer survival for ER-positive breast cancer. There is no evidence of a causal effect of higher BMI on survival for ER-negative breast cancer cases.


Introduction
Breast cancer is the most common form of cancer for women worldwide. 1 There is substantial variation in survival outcomes between patients. Some of this variation can be explained by established clinico-pathological factors including clinical stage, tumour grade and the molecular phenotype of the tumour. However, other factors such as germline genetic variation 2 and lifestyle factors may also be important. The association between body mass index (BMI) and survival has been investigated in many studies with increased BMI being associated with a reduced survival, [3][4][5][6][7][8][9][10][11] with some studies reporting an association limited to estrogen receptor (ER)-positive disease. [12][13][14][15] Whether this association is causal or simply due to confounding by other factors remains unclear.
Mendelian randomization (MR) 16,17 has become an established method used to estimate the causal relationship between an exposure and an associated outcome using data on inherited genetic variants that influence exposure status. Genetic variants are attractive as candidate instrumental variables because they are randomly assigned at conception and are not affected by potential environmental confounding factors. The use of germline genetic variants as instruments for modifiable exposures has the potential to avoid some of the limitations of conventional observational epidemiology for making causal inferences. 18 Recent genome-wide association studies have identified multiple loci associated with BMI, 19 enabling investigation of a possible causal role of BMI in breast cancer outcomes using an MR approach.
The aim of this study was to utilize germline genotype data for genetic variants known to be associated with BMI, in a breast cancer case-cohort to evaluate the association between BMI and breast cancer survival in an unbiased way. There are three assumptions under which genetic variants provide valid instrumental variables for the effect of BMI on breast cancer survival: first, the genetic variants are associated with BMI; second, the variants are not associated with any confounder of the BMI-breast cancer survival association (pleiotropy); third, the variants are conditionally independent of the survival, given the BMI and confounders (exclusion restriction).

Methods
We included six datasets where a genotyping array providing genome-wide coverage of common genetic variation had been used to genotype multiple breast cancer case-cohorts in populations of European ancestry (COGS, CGEMS, METABRIC, PG-SNPs, SASBAC and UK2). A summary of these case-cohorts has been described in detail previously. 2 The characteristics of the studies used in our analysis are summarized in Table S1 (available as Supplementary data at IJE online). Genotypes for common variants across the genome were imputed using a reference panel from the 1000 Genomes Project (March 2012) for each dataset. All patients provided written informed consent, and each study was approved by the relevant institutional review board. Data on age at diagnosis, vital status, breast cancer-specific mortality, follow-up time, time between diagnosis and blood draw, lymph node status, histological grade, tumour size and estrogen receptor status were also available. In addition, some case-cohorts from the COGS study provided data on height and weight (self-reported) at date closest to diagnosis (cases) or study entry (controls) for 65 582 participants. BMI was calculated as weight in kilograms divided by height in metres squared (kg/m²).

Key Messages
• Observational studies have reported an association between elevated body mass index (BMI) and reduced survival for women with breast cancer. However, the causal nature of the association is unclear.
• We conducted a large Mendelian randomization analysis in order to examine a potential causal effect of BMI on breast cancer survival, using both individual genotype data and summary data.
• Our study provides evidence that the reported association between BMI and survival for estrogen receptor-positive breast cancer is likely to be causal.

Calculation of BMI genetic risk score
The Genetic Investigation of Anthropometric Traits (GIANT) consortium involving over 300 000 individuals of European descent has reported 97 common variants associated with BMI, of which three were only associated with BMI for men. 19 We used the genotype data described above to construct the BMI genetic risk score (GRS) based on 94 BMI-associated genetic variants. The BMI genetic risk score is given by the sum of the weighted imputed allele doses (number of risk alleles carried) where the weights are the reported beta-coefficients for association with BMI. The manuscript 19 presented the results as the number of standard deviations increase in BMI per allele. We therefore transformed these to the increase in BMI per allele. The imputation r² of all 94 single nucleotide polymorphisms (SNPs) in the breast cancer dataset is greater than 0.4.
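As an illustration of the genetic risk score calculation, the following minimal Python sketch computes a weighted sum of imputed allele doses. The function name, the three example variants, their weights and the BMI standard deviation of 4.5 kg/m² are all hypothetical. Rescaling the per-allele weights by multiplying by a BMI standard deviation is an assumption about how the transformation from standard deviation units to BMI units might be done, not a description of the exact procedure used in the study.

```python
from typing import Sequence


def bmi_genetic_risk_score(
    dosages: Sequence[float],
    betas_sd: Sequence[float],
    bmi_sd: float,
) -> float:
    """Weighted sum of imputed risk-allele dosages, in BMI units.

    dosages  : expected number of risk alleles per variant (0 to 2)
    betas_sd : per-allele effects on BMI in standard-deviation units
    bmi_sd   : assumed standard deviation of BMI in kg/m^2, used to
               rescale the weights (hypothetical conversion)
    """
    if len(dosages) != len(betas_sd):
        raise ValueError("one weight is required per variant")
    return sum(d * b * bmi_sd for d, b in zip(dosages, betas_sd))


# Hypothetical values for three variants; the study used 94.
example_dosages = [1.0, 0.3, 2.0]
example_betas_sd = [0.08, 0.02, 0.03]
example_score = bmi_genetic_risk_score(
    example_dosages, example_betas_sd, bmi_sd=4.5
)
print(round(example_score, 3))
```

Because imputed doses are expected allele counts, they may take non-integer values between 0 and 2, which is why the sketch accepts floating-point dosages.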

Statistical analysis
We verified the first assumption of Mendelian randomization by evaluating the association between BMI GRS and BMI in a set of control subjects from the COGS study. MR analysis was performed using Cox proportional hazard models, to evaluate the associations of the BMI genetic risk scores with breast cancer-specific mortality based on 36 210 cases with 2475 events over 170 504 person-years of follow-up. The date of diagnosis was used to calculate time-to-event with follow-up being censored at death, last follow-up or 10 years, whichever came sooner. As several studies include prevalent cases, the date of study entry was used to determine time under observation in order to adjust for the potential bias of prevalent cases in a prospectively recruiting study (left-truncation). 20 All analyses were performed for each study separately, and summary statistics were obtained using a fixed-effect meta-analysis. We also conducted MR subtype-specific analysis for 5683 ER-negative cases (679 events) and 22 567 ER-positive cases (1161 events) (Table S1).
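Two steps of this procedure can be sketched in Python: the construction of a left-truncated observation window on the time-since-diagnosis scale, and the fixed-effect pooling of study-specific log hazard ratios. This is a minimal sketch under stated assumptions. The function names and the three study estimates are hypothetical, and the handling of cases who enter after the 10-year limit (dropping them) is an assumption. The Cox model fit itself is not shown.

```python
import math
from typing import Optional, Sequence, Tuple


def observation_window(
    years_to_entry: float,
    years_to_last_seen: float,
    died_of_breast_cancer: bool,
    max_years: float = 10.0,
) -> Optional[Tuple[float, float, bool]]:
    """Return (entry, exit, event) in years since diagnosis.

    Follow-up is censored at death, last follow-up or max_years,
    whichever comes first. Time before study entry is excluded
    (left-truncation). Returns None if no time at risk remains.
    """
    exit_time = min(years_to_last_seen, max_years)
    event = died_of_breast_cancer and years_to_last_seen <= max_years
    if exit_time <= years_to_entry:
        return None
    return (years_to_entry, exit_time, event)


def fixed_effect_meta(
    log_hrs: Sequence[float], ses: Sequence[float]
) -> Tuple[float, float]:
    """Inverse-variance weighted pooled log hazard ratio and its SE."""
    weights = [1.0 / se ** 2 for se in ses]
    pooled = sum(w * b for w, b in zip(weights, log_hrs)) / sum(weights)
    pooled_se = math.sqrt(1.0 / sum(weights))
    return pooled, pooled_se


# Hypothetical study-specific estimates (log HR per unit GRS, and SE).
study_log_hrs = [0.12, 0.05, 0.15]
study_ses = [0.10, 0.08, 0.20]
pooled_log_hr, pooled_se = fixed_effect_meta(study_log_hrs, study_ses)
pooled_hr = math.exp(pooled_log_hr)
ci_low = math.exp(pooled_log_hr - 1.96 * pooled_se)
ci_high = math.exp(pooled_log_hr + 1.96 * pooled_se)
print(round(pooled_hr, 3), round(ci_low, 3), round(ci_high, 3))
```

A prevalent case recruited two years after diagnosis and followed for twelve years would contribute the window (2.0, 10.0) with no event counted, since any death after 10 years falls outside the analysis period.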
We assessed the relationship between BMI GRS and breast cancer survival using summary statistics for the association of each BMI-associated SNP with survival, for each dataset. We used both an inverse-variance weighted method and a likelihood-based method 21 to estimate the association. Several clinico-pathological factors are known to be associated with survival. Rather than being true potential confounders of any relationship between BMI and survival, these factors should be considered as intermediates. Nevertheless, in order to evaluate the second assumption of MR, we tested for association between BMI-associated SNPs and node status, tumour size and histological grade. Alternatively, it is possible that smoking behaviour might mediate the true causal mechanisms for the association between BMI and breast cancer survival. We therefore examined the potential associations between smoking behaviour (measured as self-reported total pack-years smoked) and survival and between GRS and smoking behaviour. Pleiotropic effects of the BMI SNPs on unmeasured confounders may also violate the assumption. The role of directional pleiotropy was assessed using Egger regression on the summary statistics of association for each BMI-associated SNP with survival. 22 Egger regression is a modified form of standard inverse-variance weighted meta-analysis. When applied to MR analyses, the slope of the Egger regression provides an estimate of the causal effect, and the estimated value of the intercept can be interpreted as an estimate of the average pleiotropic effect across all the genetic variants. 23 All analyses were performed using R (R project for Statistical Computing).
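The difference between the inverse-variance weighted and Egger estimates can be sketched in Python as two weighted regressions of the SNP-survival associations on the SNP-BMI associations, the first forced through the origin and the second with a free intercept. This is a minimal sketch, not the analysis code of the study (which was run in R). The function names and the four SNP effects are hypothetical, weights are taken as the inverse variance of the SNP-survival estimates, and standard errors of the slope and intercept are omitted.

```python
from typing import Sequence, Tuple


def ivw_estimate(
    bx: Sequence[float], by: Sequence[float], se_by: Sequence[float]
) -> float:
    """Weighted regression of by on bx through the origin."""
    weights = [1.0 / s ** 2 for s in se_by]
    num = sum(w * x * y for w, x, y in zip(weights, bx, by))
    den = sum(w * x * x for w, x in zip(weights, bx))
    return num / den


def egger_estimate(
    bx: Sequence[float], by: Sequence[float], se_by: Sequence[float]
) -> Tuple[float, float]:
    """Weighted regression of by on bx with an intercept.

    Returns (intercept, slope). Each SNP is first oriented so that
    its association with the exposure is non-negative.
    """
    xs = [abs(x) for x in bx]
    ys = [y if x >= 0 else -y for x, y in zip(bx, by)]
    weights = [1.0 / s ** 2 for s in se_by]
    total = sum(weights)
    x_mean = sum(w * x for w, x in zip(weights, xs)) / total
    y_mean = sum(w * y for w, y in zip(weights, ys)) / total
    sxx = sum(w * (x - x_mean) ** 2 for w, x in zip(weights, xs))
    sxy = sum(
        w * (x - x_mean) * (y - y_mean)
        for w, x, y in zip(weights, xs, ys)
    )
    slope = sxy / sxx
    intercept = y_mean - slope * x_mean
    return intercept, slope


# Hypothetical per-SNP associations: survival effects are exactly
# 0.1 times the BMI effects, i.e. no pleiotropy by construction.
bmi_effects = [0.02, 0.05, 0.03, 0.08]
survival_effects = [0.1 * x for x in bmi_effects]
survival_ses = [0.01, 0.02, 0.015, 0.01]

ivw_slope = ivw_estimate(bmi_effects, survival_effects, survival_ses)
egger_intercept, egger_slope = egger_estimate(
    bmi_effects, survival_effects, survival_ses
)
print(ivw_slope, egger_intercept, egger_slope)
```

In this constructed example both slopes agree and the Egger intercept is zero up to rounding. Adding a constant to every SNP-survival effect would shift the Egger intercept by that constant while leaving its slope unchanged, which is the sense in which the intercept captures an average directional pleiotropic effect.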

Results
We observed strong positive associations between the BMI GRS and observed BMI using a set of 28 190 controls from the COGS study. A one-unit increase in GRS corresponds to a 0.94 kg/m² (95% CI 0.85–1.03, P = 4.16 × 10⁻⁹⁹) increase in BMI and explained 1.6% of the BMI variance (F statistic = 450). Self-reported BMI was significantly associated with breast cancer survival for both ER-negative and ER-positive disease in the COGS data (Table 1). Both associations were attenuated after adjustment for tumour grade, nodal status and tumour size.
We performed MR analysis for all available ER-negative and ER-positive breast cancer cases. The GRS was found to be associated with reduced breast cancer-specific survival for ER-positive cases with hazard ratio (HR) of 1.11 (95% CI = 1.01–1.22, P = 0.03) per one-unit increment of the GRS (Table 1). In order to evaluate whether this association varied by menopausal status, we compared the estimates for GRS for premenopausal (defined as age at diagnosis < 50 years) and postmenopausal (age at diagnosis ≥ 50 years) women with ER-positive breast cancer, using data from the COGS study. We found no evidence for a difference in the hazard ratios (P = 0.93). No significant association with genetic score was observed for ER-negative cases (HR = 1.00, 95% CI 0.89–1.13; Table 1). This indicates that the observed association between BMI and breast cancer survival for ER-negative cases might not be causal. However, we had only 38% power to detect the same magnitude of association as that observed for ER-positive disease with a type I error of 5%. 24 The number of events would need to be approximately 2000 for a power of 80% in ER-negative cases (Supplementary Figure 1, available as Supplementary data at IJE online). The differences between the estimated associations with genetic score for ER-positive and ER-negative were not significant (P = 0.07). The association between BMI and breast cancer survival was also evaluated using standard inverse-variance weighted meta-analysis of summary statistics for the association of each BMI-associated SNP with survival. The results were similar to those based on individual-level data (Table 1).
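The relationship between the reported confidence intervals, P-values and power can be illustrated with a rough normal-approximation sketch in Python. This is not the power calculation method cited in the text. It assumes that the log hazard ratio is approximately normally distributed, that its standard error can be recovered from the width of the 95% confidence interval, and that the standard error shrinks with the square root of the number of events. The function names are hypothetical, and the results should be expected to fall in the same region as the reported figures without reproducing them exactly.

```python
import math


def normal_cdf(z: float) -> float:
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))


def se_from_ci(lower: float, upper: float, z: float = 1.96) -> float:
    """Standard error of log(HR) implied by a 95% CI for the HR."""
    return (math.log(upper) - math.log(lower)) / (2.0 * z)


def two_sided_p(hr: float, se: float) -> float:
    """Two-sided Wald P-value for the null hypothesis HR = 1."""
    z = abs(math.log(hr)) / se
    return 2.0 * (1.0 - normal_cdf(z))


def approx_power(hr: float, se: float, z_alpha: float = 1.96) -> float:
    """Approximate power to detect hr at two-sided alpha = 0.05."""
    return normal_cdf(abs(math.log(hr)) / se - z_alpha)


# ER-positive: HR 1.11 (95% CI 1.01-1.22), as reported in the text.
se_pos = se_from_ci(1.01, 1.22)
p_pos = two_sided_p(1.11, se_pos)

# ER-negative: precision implied by HR 1.00 (95% CI 0.89-1.13),
# applied to an effect of the ER-positive magnitude (HR 1.11).
se_neg = se_from_ci(0.89, 1.13)
power_neg = approx_power(1.11, se_neg)

# Events needed for 80% power, assuming SE scales as 1/sqrt(events);
# 679 is the reported number of events among ER-negative cases.
z_power = 0.8416  # standard normal quantile for 80% power
target_se = math.log(1.11) / (1.96 + z_power)
events_needed = 679 * (se_neg / target_se) ** 2

print(round(p_pos, 3), round(power_neg, 2), round(events_needed))
```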
In order to test the validity of the exclusion restriction assumption, we compared the results of a standard inverse-variance weighted regression with the Egger regression for the SNPs in the GRS (Figure 1A). The slope of the inverse-variance weighted regression was 0.10 (95% CI 0.02 to 0.19), which was similar to that from the Egger regression, 0.10 (95% CI −0.11 to 0.32). The intercept from the Egger regression was not significantly different from zero (−0.0002, P-value = 0.99), suggesting no overall directional pleiotropy. A funnel plot of the minor allele frequency-corrected genetic associations with the BMI against the individual causal effect estimates for each SNP shows little evidence for asymmetry (Figure 1B).
We tested each GRS SNP for association with either node status or grade or tumour size or stage. Sixteen of the BMI SNPs were associated with one or more of these variables. We then repeated the individual data MR analysis using a GRS-78 that excluded these SNPs. The magnitudes of the associations with ER-positive breast cancer were similar to those for the results based on all the BMI SNPs (GRS-78: HR = 1.10, 95% CI 1.00–1.22, P = 0.06).
We explored a potential complex relationship between smoking behaviour, BMI and prognosis by investigating the association between BMI GRS and smoking behaviour and between smoking behaviour and prognosis. There was a very weak correlation between GRS and number of pack-years smoked (correlation coefficient = 0.017, P = 0.004). However, there was no association between smoking and prognosis (P = 0.47 and 0.79 for ER-positive and ER-negative disease, respectively). It is unlikely that the association between smoking behaviour and BMI can explain the association between BMI GRS and prognosis.

Discussion
We conducted a large Mendelian randomization analysis in order to examine a potential causal effect of BMI on breast cancer survival, using both individual data and summary data. We constructed a weighted BMI genetic score comprising 94 BMI-associated genetic variants identified in genome-wide association studies as instrumental variables. We also used an inverse-variance weighted method and likelihood-based method to evaluate the combined association of BMI-associated SNPs with breast cancer survival. The results from the summarized data were in agreement with the results from two-stage regression based on individual-level genotype data. Our findings suggest a possible causal association between increased BMI and reduced breast cancer survival for ER-positive cases. This is consistent with other findings in which increased BMI has repeatedly been associated with ER-positive breast cancer.
A limitation of the analysis is that, even if the genetic variants are not associated with confounders of the relationship between BMI and breast cancer survival for the population as a whole (that is, the genetic variants are valid instrumental variables for the population), the genetic variants may be associated with these confounders for the subpopulation of breast cancer patients. This is due to conditioning on a collider: if BMI is a causal risk factor for breast cancer risk, then conditioning on breast cancer risk (by only including breast cancer patients in the analysis) means that all common causes of breast cancer risk (including the genetic variants and confounders) are conditionally associated. In simple terms, even if genetic variants are distributed randomly in the population as a whole, they are not necessarily randomly distributed in the ascertained population of breast cancer patients. This may lead to bias in the analysis, although it is unclear how serious this bias might be. In order to evaluate the potential for collider bias, we performed a simulation study in which we simulated data on a genetic risk score and an exposure (BMI in our example) for 100 000 individuals. For each individual, we simulated whether that individual had a positive breast cancer diagnosis as a binomial random variable. For each individual with a positive breast cancer diagnosis, we simulated the time-to-event for breast cancer progression as an exponential random variable. The genetic risk score was simulated as a normally distributed random variable, as was the confounder (assumed unmeasured), and the independent error term. The probability of breast cancer diagnosis was modelled as a function of the exposure. This leads to the collider (selection) bias: individuals with a breast cancer diagnosis (and therefore eligible for the Mendelian randomization analysis) will have higher average levels of the exposure and confounder than those not included.
While collider bias was observed for extreme values of the effect of the risk factor on disease status, it was not observed for values that are in line with the effect of BMI on breast cancer diagnosis as observed in previous investigations. Hence, while we would be cautious not to generalize the result of this limited simulation study to other analysis contexts, in this case there seemed to be little potential for bias and type 1 error rate inflation to arise due to collider bias.
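The selection mechanism behind this simulation can be sketched in Python. The sketch is not the simulation code of the study: all parameter values, the logistic form of the diagnosis model and the function names are assumptions, a smaller sample than 100 000 is used for speed, and the time-to-event step is omitted. It only shows how restricting to diagnosed individuals can induce a correlation between the genetic risk score and an unmeasured confounder, and how the size of that correlation depends on the effect of the exposure on diagnosis.

```python
import math
import random
from typing import Dict, List


def pearson(xs: List[float], ys: List[float]) -> float:
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx = sum(xs) / n
    my = sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)


def simulate_selection(
    n: int,
    effect_on_diagnosis: float,
    grs_effect: float = 0.5,   # assumed effect of GRS on exposure
    baseline: float = -2.0,    # assumed log-odds of diagnosis at exposure 0
    seed: int = 1,
) -> Dict[str, float]:
    """Correlation of GRS and confounder, overall and among cases."""
    rng = random.Random(seed)
    grs_all, conf_all = [], []
    grs_cases, conf_cases = [], []
    for _ in range(n):
        grs = rng.gauss(0.0, 1.0)
        confounder = rng.gauss(0.0, 1.0)  # treated as unmeasured
        exposure = grs_effect * grs + confounder + rng.gauss(0.0, 1.0)
        log_odds = baseline + effect_on_diagnosis * exposure
        p_case = 1.0 / (1.0 + math.exp(-log_odds))
        grs_all.append(grs)
        conf_all.append(confounder)
        if rng.random() < p_case:  # diagnosed, so eligible for the MR analysis
            grs_cases.append(grs)
            conf_cases.append(confounder)
    return {
        "n_cases": float(len(grs_cases)),
        "corr_population": pearson(grs_all, conf_all),
        "corr_cases": pearson(grs_cases, conf_cases),
    }


# Extreme versus modest (hypothetical) effects of exposure on diagnosis.
extreme = simulate_selection(20000, effect_on_diagnosis=3.0)
modest = simulate_selection(20000, effect_on_diagnosis=0.1)
print(extreme)
print(modest)
```

In the whole simulated population the score and the confounder are generated independently. Among cases, a strong effect of the exposure on diagnosis is expected to induce a negative correlation between them, whereas a modest effect should leave them close to uncorrelated, in line with the pattern described in the text.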
While our results suggest a causal association between BMI and survival for women with ER-positive breast cancer, BMI is, in itself, a complex phenotype. It is conceivable that more specific phenotypes related to body fat composition and distribution might be better predictors of outcome. Untangling such complex relationships with survival will require data on the association between germline genetic variation and specific body fat composition and distribution phenotypes. Potential mechanisms underlying effects of obesity on breast cancer survival are mediators such as members of the insulin/insulin-like growth factor family, adipocytokines secreted from adipose tissue and inflammatory cytokines. 23 Our study, based on data from multiple large-scale genetic association studies of breast cancer, provides evidence that the reported association between BMI and survival for ER-positive breast cancer is likely to be causal. This suggests that BMI reduction in overweight women with ER-positive breast cancer might improve clinical outcomes.

Supplementary Data
Supplementary data are available at IJE online.