Circulating insulin-like growth factors and risks of overall, aggressive and early-onset prostate cancer: a collaborative analysis of 20 prospective studies and Mendelian randomization analysis

Abstract
Background Previous studies had limited power to assess the associations of circulating insulin-like growth factors (IGFs) and IGF-binding proteins (IGFBPs) with clinically relevant prostate cancer as a primary endpoint, and the association of genetically predicted IGF-I with aggressive prostate cancer is not known. We aimed to investigate the associations of IGF-I, IGF-II, IGFBP-1, IGFBP-2 and IGFBP-3 concentrations with overall, aggressive and early-onset prostate cancer.
Methods Prospective analysis of biomarkers using the Endogenous Hormones, Nutritional Biomarkers and Prostate Cancer Collaborative Group dataset (up to 20 studies, 17 009 prostate cancer cases, including 2332 aggressive cases). Odds ratios (OR) and 95% confidence intervals (CI) for prostate cancer were estimated using conditional logistic regression. For IGF-I, two-sample Mendelian randomization (MR) analysis was undertaken using instruments identified using UK Biobank (158 444 men) and outcome data from PRACTICAL (up to 85 554 cases, including 15 167 aggressive cases). Additionally, we used colocalization to rule out confounding by linkage disequilibrium.
Results In observational analyses, IGF-I was positively associated with risks of overall (OR per 1 SD = 1.09: 95% CI 1.07, 1.11), aggressive (1.09: 1.03, 1.16) and possibly early-onset disease (1.11: 1.00, 1.24); associations were similar in MR analyses (OR per 1 SD = 1.07: 1.00, 1.15; 1.10: 1.01, 1.20; and 1.13: 0.98, 1.30, respectively). Colocalization also indicated a shared signal for IGF-I and prostate cancer (PP4: 99%). Men with higher IGF-II (1.06: 1.02, 1.11) and IGFBP-3 (1.08: 1.04, 1.11) had higher risks of overall prostate cancer, whereas higher IGFBP-1 was associated with a lower risk (0.95: 0.91, 0.99); these associations were attenuated following adjustment for IGF-I.
Conclusions These findings support the role of IGF-I in the development of prostate cancer, including for aggressive disease.


Introduction
Prostate cancer is the second most common cancer in men worldwide and a leading cause of cancer death. 1 Insulin-like growth factors (IGFs) are important growth-promoting peptides that act through the IGF-I receptor. 2,3 IGF-I and IGF-II are mainly produced by the liver and circulate in the bloodstream, but they are also produced in local tissues where they function in a paracrine/autocrine manner. 3 The majority of both of these growth factors circulate bound to IGF-binding proteins (IGFBPs), 2,4 which extend the half-life of the IGFs and modulate IGF signalling. 2,4 Higher IGF-I signalling increases cell survival and decreases apoptosis, increasing the probability of carcinogenesis. 4,5 Circulating IGF-I concentrations are positively associated with risks of several cancers, particularly prostate, breast and colorectal cancer. 6,7 The Endogenous Hormones, Nutritional Biomarkers and Prostate Cancer Collaborative Group (EHNBPCCG) is a pooled individual participant nested case-control dataset of prospective studies of hormonal and nutritional factors and prostate cancer risk, which previously reported positive associations of IGF-I, IGF-II, IGFBP-2 and IGFBP-3 with overall prostate cancer risk and an inverse association with IGFBP-1. 8 However, in this previous study it was unclear whether IGF-II or the IGFBPs are associated with prostate cancer independently of IGF-I, and the analyses of associations with aggressive disease subtypes were underpowered to provide strong evidence of an effect. 8 The EHNBPCCG dataset has since been expanded to include more than double the number of prostate cancer cases (up to 17 000 prostate cancer cases, including 2300 aggressive cases).
In blood-based observational analyses it is difficult to rule out the possibility of biases including residual confounding or reverse causality. Mendelian randomization (MR) uses germline genetic variants as proxies of putative risk factors and estimates their associations with disease risk. These germline genetic variants are randomly allocated and fixed at conception, and therefore MR is less likely to be affected by these biases and so is potentially a more robust method for causal inference. 9 In order to appraise causality for IGF-I, we carried out two-sample MR analyses using instruments identified from UK Biobank and genetic data from the PRACTICAL consortium. 10-12 Using these genetic datasets, we also ran colocalization analyses to investigate whether the IGF1 gene region and prostate cancer share the same genetic signal to exclude the possibility of confounding by linkage disequilibrium. 13 Using these two international consortia and UK Biobank, we aimed to assess the associations of circulating IGF-I with overall, aggressive and early-onset prostate cancer risk, using observational and genetic methods. The analysis of very large datasets can provide more robust risk estimates, and the integration of evidence from these different epidemiological approaches can strengthen the basis for causal inference. 14 We additionally report observational associations of IGF-II and IGFBP-1, -2 and -3 with overall, aggressive and early-onset subtypes.

Endogenous Hormones, Nutritional Biomarkers and Prostate Cancer Collaborative Group
Data collection and study designs Individual participant data were available from up to 20 prospective studies with IGF-I (17 009 cases), IGF-II (4466 cases), IGFBP-1 (4491 cases), IGFBP-2 (3776 cases) and IGFBP-3 (9113 cases) measurements. Participating studies are listed in Supplementary Table S1 and further details of data collection and processing are provided in the Supplementary material. Matching criteria are shown in Supplementary Table S2. Assay details and hormone measurement data are provided in Supplementary Table S3.
Data processing and outcomes Disease definitions were as defined by the PRACTICAL consortium. 10,11 Aggressive prostate cancer was categorized as 'yes' for any of the following: disease metastases at diagnosis (M1), Gleason score 8+ (or equivalent), prostate cancer death (defined as death from prostate cancer) or prostate-specific antigen (PSA) >100 ng/mL. Early-onset prostate cancer was defined as a diagnosis aged ≤55 years. Further details of the disease characterization can be found in the Supplementary Methods.

Statistical analysis
Conditional logistic regression was used to estimate prostate cancer risk by circulating concentrations of IGF-I, IGF-II, IGFBP-1, IGFBP-2 and IGFBP-3. Analyses were conditioned on the study-specific matching variables and adjusted for age at blood collection, body mass index (BMI), height, smoking status, alcohol consumption, racial or ethnic group, education, married/cohabiting and diabetes status. Biomarkers were standardized by study and entered into the model as continuous variables, so each increment represents 1 study-specific SD increase in biomarker concentration. For categorical analyses, biomarkers were categorized into study-specific fifths with cut-points determined in controls. 15 Further details are available in the Supplementary Methods.
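The exposure coding described above can be illustrated with a minimal Python sketch. It is not the code used for the analysis: the field names (`study`, `biomarker`, `case`) and the toy data are hypothetical, and it assumes that the study-specific mean and SD are computed from all participants in a study, which the text does not state. Cut-points for fifths are taken from controls only, as described.

```python
from bisect import bisect_left
from collections import defaultdict
from statistics import mean, stdev, quantiles


def standardize_by_study(rows):
    """Add 'z': the biomarker expressed in study-specific SD units.

    Assumption: mean and SD are computed from all participants in a study.
    """
    values = defaultdict(list)
    for row in rows:
        values[row["study"]].append(row["biomarker"])
    scale = {s: (mean(v), stdev(v)) for s, v in values.items()}
    return [
        {**row, "z": (row["biomarker"] - scale[row["study"]][0]) / scale[row["study"]][1]}
        for row in rows
    ]


def assign_fifths(rows):
    """Add 'fifth' (1 to 5) using cut-points taken from each study's controls only."""
    controls = defaultdict(list)
    for row in rows:
        if not row["case"]:
            controls[row["study"]].append(row["biomarker"])
    # quantiles(..., n=5) returns the four cut-points that separate fifths.
    cuts = {s: quantiles(v, n=5) for s, v in controls.items()}
    return [
        {**row, "fifth": bisect_left(cuts[row["study"]], row["biomarker"]) + 1}
        for row in rows
    ]


# Hypothetical toy data: one study with ten controls and two cases.
rows = [{"study": "A", "biomarker": float(v), "case": False} for v in range(1, 11)]
rows += [
    {"study": "A", "biomarker": 9.5, "case": True},
    {"study": "A", "biomarker": 5.0, "case": True},
]
rows = assign_fifths(standardize_by_study(rows))
```

The resulting `z` values would then enter a conditional logistic regression as a continuous term, so that the fitted odds ratio is per 1 study-specific SD; the `fifth` values would enter as a categorical term.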

Further analyses
We examined heterogeneity in the associations of each biomarker with prostate cancer by participant characteristics, with subgroups defined a priori based on the availability of data and previous analyses using this dataset 8,16; heterogeneity in the associations by study was also examined (Supplementary Methods). We additionally investigated unadjusted matched associations, associations in tenths, and estimates per 80th percentile increase. Associations were also examined following mutual adjustment for other biomarkers (IGF-I, IGF-II, IGFBP-1, -2 and -3, free and total testosterone and sex hormone-binding globulin [SHBG]), and we tested for interactions between these biomarkers; further details are available in the Supplementary Methods. Stratified analyses and associations in tenths were not investigated for early-onset disease due to the limited number of cases.

Mendelian randomization analysis
Genetic instruments for hormone concentrations Single nucleotide polymorphisms (SNPs) associated with circulating IGF-I concentrations were identified from a publicly available genome-wide association study (GWAS) based on 158 444 male UK Biobank participants of White British ancestry (P <5 × 10⁻⁸ significance threshold). 17 We pruned SNPs by a linkage disequilibrium threshold of r² <0.001, based on the lowest P-value.
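As an illustration of the instrument selection step, the following Python sketch applies a genome-wide significance filter and then greedy pruning that keeps the SNP with the lowest P-value and discards any remaining SNP in linkage disequilibrium with an already retained SNP. The SNP identifiers, P-values and pairwise r² values are hypothetical, and the `r2` lookup stands in for a reference panel. Whether a physical distance window was also applied is not stated here, so none is assumed.

```python
def prune_instruments(snps, r2, r2_threshold=0.001, p_threshold=5e-8):
    """Greedy LD pruning of genome-wide significant SNPs.

    snps: list of dicts with keys 'id' and 'p'.
    r2:   callable returning the pairwise r-squared for two SNP ids
          (a stand-in for a reference-panel lookup; hypothetical).
    """
    candidates = sorted(
        (s for s in snps if s["p"] < p_threshold), key=lambda s: s["p"]
    )
    kept = []
    for snp in candidates:
        # Retain the SNP only if it is independent of every SNP kept so far.
        if all(r2(snp["id"], k["id"]) < r2_threshold for k in kept):
            kept.append(snp)
    return kept


# Hypothetical toy inputs.
toy_snps = [
    {"id": "snp_a", "p": 1e-20},
    {"id": "snp_b", "p": 1e-12},
    {"id": "snp_c", "p": 1e-9},
    {"id": "snp_d", "p": 1e-3},  # not genome-wide significant
]
toy_r2 = {
    frozenset(("snp_a", "snp_b")): 0.8,
    frozenset(("snp_b", "snp_c")): 0.5,
}


def lookup_r2(a, b):
    # Pairs absent from the toy table are treated as uncorrelated.
    return toy_r2.get(frozenset((a, b)), 0.0)


instruments = prune_instruments(toy_snps, lookup_r2)
```

In this toy example `snp_b` is dropped because it is correlated with the more significant `snp_a`, and `snp_c` is retained because the only SNP it is correlated with has already been removed.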
Genetic associations with prostate cancer SNP associations for prostate cancer were obtained from the PRACTICAL and GAME-ON/ELLIPSE consortia, 10,11 which currently do not include UK Biobank data. Individual studies included in these consortia are detailed in Conti et al. 12 and Schumacher et al. 10 Associations with overall prostate cancer risk were generated from 85 554 prostate cancer cases and 91 972 controls, 12 with aggressive disease from 15 167 cases and 58 308 controls and with early-onset disease from 6988 cases and 44 256 controls, 10 all of White European ancestry.

Statistical analysis
The MR estimation for hormones was conducted using the inverse-variance weighted (IVW) method. 18 We additionally calculated the I² statistic to assess measurement error in SNP-exposure associations, 19 the F statistic to assess instrument strength 20,21 and Cochran's Q statistic to test for heterogeneity between the MR estimates for each SNP; 22 PhenoScanner was used to assess pleiotropy of the genetic instruments. 23 As sensitivity analyses, we used the MR residual sum and outlier (MR-PRESSO), MR robust adjusted profile score (MR-RAPS) and leave-one-out analyses to investigate the role of SNP outliers. 24 To assess pleiotropy, we used the weighted median, MR-Egger and the MR-Egger intercept. 25 We also used the contamination mixture method, which accounts for potentially pleiotropic variants by assuming that valid instruments are normally distributed around the true causal value and invalid instruments are normally distributed around zero. 26 To rule out reverse causality, analyses were repeated after applying Steiger filtering, which excludes variants with larger effects on prostate cancer risk than on IGF-I. 27 The associations of the IGF-I cis-SNP, defined as the lead SNP on the biomarker gene coding region identified from the exposure datasets, with prostate cancer were investigated using the Wald ratio. This cis-SNP is less likely than trans-SNPs to be affected by horizontal pleiotropy. 28
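The core estimators named above can be sketched in a few lines of Python. This is a simplified illustration rather than the analysis code: it assumes a fixed-effect IVW estimate with first-order weights, a per-SNP F statistic approximated as the squared ratio of the SNP-exposure estimate to its standard error, and hypothetical summary statistics (`bx`, `se_bx` for SNP-exposure and `by`, `se_by` for SNP-outcome associations).

```python
import math


def wald_ratio(bx, by, se_by):
    """Per-SNP causal estimate and its first-order standard error."""
    return by / bx, se_by / abs(bx)


def ivw(bx, by, se_by):
    """Fixed-effect IVW estimate, its standard error and Cochran's Q."""
    ratios = [wald_ratio(x, y, s) for x, y, s in zip(bx, by, se_by)]
    weights = [1.0 / se ** 2 for _, se in ratios]
    total = sum(weights)
    theta = sum(w * est for w, (est, _) in zip(weights, ratios)) / total
    q = sum(w * (est - theta) ** 2 for w, (est, _) in zip(weights, ratios))
    return theta, math.sqrt(1.0 / total), q


def f_statistics(bx, se_bx):
    """Approximate per-SNP instrument strength."""
    return [(x / s) ** 2 for x, s in zip(bx, se_bx)]


# Hypothetical summary statistics for three SNPs (log odds per SD of exposure).
bx = [0.10, 0.20, 0.05]
se_bx = [0.01, 0.01, 0.01]
by = [0.010, 0.020, 0.005]
se_by = [0.005, 0.005, 0.005]

theta, se_theta, q = ivw(bx, by, se_by)
odds_ratio = math.exp(theta)  # OR per 1 SD higher exposure
ci = (math.exp(theta - 1.96 * se_theta), math.exp(theta + 1.96 * se_theta))
f_stats = f_statistics(bx, se_bx)
```

With a single instrument, such as a cis-SNP, the IVW estimate reduces to the Wald ratio for that SNP. A large Cochran's Q relative to its degrees of freedom (number of SNPs minus one) indicates heterogeneity between the per-SNP estimates.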

Colocalization analysis
Colocalization was used to investigate whether the associations of variation in the IGF1 gene region with both circulating IGF-I concentration and prostate cancer risk share the same genetic signal, or whether the associations identified by our MR analysis may be confounded by linkage disequilibrium. 13 Analyses were conducted for a 75 kb region surrounding the lead IGF-I cis-SNP (rs5742653) using the UK Biobank and PRACTICAL datasets. 12,17 Colocalization was assessed using three approaches: conventional colocalization, 13 which tests for the presence of a single shared genetic signal; the sum of single effects (SuSiE) regression framework; 29 and conditional iterative colocalization. 30 The latter two methods allow for the possibility of multiple independent (but partially correlated) causal variants in proximity. 31 We created colocalization plots using LocusCompareR 32 and a z-z locus plot. 33 We considered a posterior probability of a shared causal variant (PP4) of >0.7 as being consistent with evidence of colocalization between IGF-I and prostate cancer. 13 Further details of the colocalization analysis are available in the Supplementary Methods.
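To make the PP4 criterion concrete, the sketch below computes posterior probabilities for the five hypotheses of a single-causal-variant colocalization model from per-SNP approximate Bayes factors. It is a simplified illustration, not the pipeline used here: the prior probabilities (`p1`, `p2`, `p12`), the prior standard deviations of the effect sizes and the three-SNP toy region are all illustrative assumptions, and the SuSiE and conditional approaches are not covered.

```python
import math


def log_sum_exp(xs):
    """Numerically stable log of a sum of exponentials."""
    m = max(xs)
    return m + math.log(sum(math.exp(x - m) for x in xs))


def wakefield_log_abf(beta, se, prior_sd):
    """Log approximate Bayes factor for one SNP from its estimate and SE."""
    z2 = (beta / se) ** 2
    r = prior_sd ** 2 / (prior_sd ** 2 + se ** 2)
    return 0.5 * (math.log(1.0 - r) + r * z2)


def coloc_posteriors(labf1, labf2, p1=1e-4, p2=1e-4, p12=1e-5):
    """Posterior probabilities of H0 to H4; the last element is PP4."""
    n = len(labf1)
    shared = [a + b for a, b in zip(labf1, labf2)]
    # H3: each trait has its own causal SNP (all ordered pairs with i != j).
    distinct = [labf1[i] + labf2[j] for i in range(n) for j in range(n) if i != j]
    log_h = [
        0.0,                                                   # H0: no association
        math.log(p1) + log_sum_exp(labf1),                     # H1: trait 1 only
        math.log(p2) + log_sum_exp(labf2),                     # H2: trait 2 only
        math.log(p1) + math.log(p2) + log_sum_exp(distinct),   # H3: distinct variants
        math.log(p12) + log_sum_exp(shared),                   # H4: shared variant
    ]
    total = log_sum_exp(log_h)
    return [math.exp(h - total) for h in log_h]


# Hypothetical three-SNP region; all standard errors set to 0.01.
se = 0.01
exposure = [wakefield_log_abf(b, se, 0.15) for b in (0.01, 0.08, 0.005)]
# Outcome signal peaking at the same SNP as the exposure signal.
outcome_same = [wakefield_log_abf(b, se, 0.2) for b in (0.005, 0.07, 0.01)]
# Outcome signal peaking at a different SNP.
outcome_other = [wakefield_log_abf(b, se, 0.2) for b in (0.07, 0.005, 0.01)]

pp_same = coloc_posteriors(exposure, outcome_same)
pp_other = coloc_posteriors(exposure, outcome_other)
```

In the first toy scenario the posterior mass falls on H4, so PP4 exceeds the 0.7 threshold; in the second it falls on H3, the pattern expected when two associations in the same region arise from distinct variants in linkage disequilibrium.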
Details of statistical software and packages used are available in the Supplementary Methods. All tests of significance were two-sided, and P-values <0.05 were considered statistically significant.

Study and participant characteristics in the observational analyses
A total of 20 studies, contributing up to 17 009 cases and 37 243 controls, were included in this analysis. Prostate cancer was classified as aggressive in 2332 cases and early-onset disease in 607 cases. Study participants were 91.3% of White ethnicity (Table 1). Men who were diagnosed with overall prostate cancer were taller and had a lower BMI than their matched controls (Table 1).
Prostate cancer characteristics by study are displayed in Supplementary Table S4. Mean age at blood collection for each study ranged from 33.8 to 76.8 years (overall mean = 61.2 years, SD = 7.8 years). Cases were diagnosed on average 6.7 years (SD = 5.4) after blood collection, and the average age at diagnosis was 67.5 years (SD = 6.5) (Table 1). Aggressive disease was diagnosed on average 8.0 years after blood collection (SD = 6.3) (Table 1). Partial correlations between biomarkers ranged from r = -0.004 (PSA and IGF-II) to r = 0.54 (IGF-II and IGFBP-2) (Supplementary Table S5).

Further analyses-observational analysis
Associations of IGF-I with overall and aggressive prostate cancer were generally consistent by subgroups and secondary outcomes (Figures 2 and 3). The OR for prostate cancer death was 1.08 for IGF-I (1.00, 1.17) (Figure 2). There was some evidence of larger magnitudes of associations with overall prostate cancer for men with a family history of prostate cancer (1.19: 1.09, 1.29) than for men without (1.07: 1.03, 1.11; P het = 0.02) (Figure 2).
The associations of IGF-II and IGFBPs with prostate cancer risk were broadly similar by subgroups (Supplementary Figures S3-S10). There was evidence of heterogeneity in the association of IGFBP-2 with overall prostate cancer by BMI (P het = 0.0007); for men whose BMI was <25 kg/m² at baseline, IGFBP-2 was inversely associated with prostate cancer (0.89: 0.83, 0.96), and the OR for men with BMI 30+ was 1.19 (0.99, 1.42) (Supplementary Figure S7). IGFBP-2 was also inversely associated with aggressive disease risk for men whose BMI was <25 kg/m² (0.78: 0.66, 0.94), but not for men who had a higher BMI (P het = 0.01) (Supplementary Figure S8).
Associations with overall and aggressive prostate cancer by study are available in Supplementary Figures S11-S20. There was some evidence of heterogeneity by study in the associations of IGF-I with aggressive disease (P het = 0.02) (Supplementary Figure S12), and IGF-II and IGFBP-2 with overall prostate cancer risk (P het = 0.0001 and 0.02, respectively) (Supplementary Figures S13 and S17). Associations were broadly similar to the primary analyses in unadjusted matched analyses (Supplementary Figure S21), using study-specific tenths (Supplementary Figure S22) and per 80th percentile increase (Supplementary Table S7).

Figure 1 Risks of overall, aggressive* and early-onset† prostate cancer by study-specific fifths of biomarker concentrations (observational only) and 1 SD increment (observational and Mendelian randomization). Estimates are from logistic regression conditioned on the matching variables and adjusted for age, BMI, height, alcohol intake, smoking status, marital status, education status, racial/ethnic group and diabetes status. The position of each square indicates the magnitude of the odds ratio, and the area of the square is proportional to the inverse of the variance of the log odds ratio. The length of the horizontal line through the square indicates the 95% confidence interval. MR risk estimates are estimated using the inverse-variance weighted method for the full instrument methods and the Wald ratio in the cis-SNP analyses. P trend represents 1-SD increase in biomarker concentration. *Aggressive cancer defined as Gleason grade 8+, or prostate cancer death, or metastases or PSA >100 ng/mL. †Early-onset defined as diagnosed ≤55 years. BMI, body mass index; CI, confidence interval; IGF, insulin-like growth factor; IGFBP, insulin-like growth factor-binding protein; OR, odds ratio; PSA, prostate-specific antigen; SD, standard deviation; MR, Mendelian randomization; SNP, single nucleotide polymorphism

Following mutual adjustment for IGF-I, the associations of IGF-II and IGFBP-1 with risk were attenuated to the null (Supplementary Table S8). For IGF-I and IGFBP-3, mutual adjustment slightly attenuated the associations with overall prostate cancer risk, but both these associations remained (Supplementary Table S8).
There was some evidence of interactions in the associations of IGF-II, IGFBP-1 and IGFBP-2 concentrations with prostate cancer risk by total testosterone concentrations; men with total testosterone concentrations above the study-specific median showed evidence of a positive relationship for IGF-II and an inverse association for IGFBP-1, whereas these associations were null for men with lower total testosterone concentrations (P het = 0.03 and 0.02, respectively) (Supplementary Table S9). Only men with lower total testosterone concentrations had a positive association between IGFBP-2 and overall prostate cancer (P het = 0.01). For aggressive disease, the OR for IGFBP-2 was 1.27 for men with lower total testosterone concentrations (1.00, 1.62), and in men with higher total testosterone there was an inverse relationship of IGFBP-2 with aggressive disease (0.75: 0.60, 0.93; P het <0.01), although the number of aggressive cases was limited (N = 443) (Supplementary Table S10).

Further analyses-Mendelian randomization
There was no strong evidence of measurement error in the genetic instruments for IGF-I (I² = 0.99) and all SNPs had an F statistic >10. 20 There was significant heterogeneity in the MR estimates for the SNPs with overall prostate cancer, and for aggressive and early-onset disease (Cochran's Q P <0.001). Full MR results are found in Supplementary Table S11. Forest plots of single-SNP results are available in Supplementary Figures S23-25, leave-one-out plots are available in Supplementary Figures S26-28 and MR scatterplots are available in Supplementary Figure S29. Outliers identified by MR-PRESSO are available in Supplementary Table S12. Following Steiger filtering, the results were slightly attenuated (Supplementary Table S13). Using PhenoScanner, 430 traits were identified as being linked to genetically predicted IGF-I, including height and measures of adiposity (Supplementary Figure S30). Higher concentrations of IGF-I instrumented by the cis-SNP (rs5742653) were associated with increased peak expiratory flow (P <5 × 10⁻⁸).

Figure 2 Odds ratio (95% CIs) for overall prostate cancer per study-specific 1-SD increment of IGF-I concentration by subgroup in the EHNBPCCG. Estimates are from logistic regression conditioned on the matching variables and adjusted for age, BMI, height, alcohol intake, smoking status, marital status, education status, racial/ethnic group and diabetes status. The position of each square indicates the magnitude of the odds ratio, and the area of the square is proportional to the inverse of the variance of the log odds ratio. The length of the horizontal line through the square indicates the 95% confidence interval. Tests for heterogeneity for case-defined factors were obtained by fitting separate models for each subgroup and assuming independence of the ORs using a method analogous to a meta-analysis. Tests for heterogeneity for non-case-defined factors were assessed with a χ² test of interaction between subgroup and the binary variable. *Aggressive cancer defined as Gleason grade 8+, or prostate cancer death, or metastases or PSA >100 ng/mL. †Localized defined as TNM stage <T2 with no reported lymph node involvement or metastases or stage I; other localized stage if TNM stage T2 with no reported lymph node involvement or metastases, stage II, or equivalent; advanced stage if they were TNM stage T3 or T4 and/or N1+ and/or M1, stage III-IV or equivalent. ‡Low grade defined as Gleason score was <7 or equivalent (i.e. extent of differentiation good, moderate); medium grade if Gleason score was 7 (i.e. poorly differentiated); high grade if the Gleason score was ≥8 or equivalent (i.e. undifferentiated). BMI, body mass index; CI, confidence interval; EHNBPCCG, Endogenous Hormones, Nutritional Biomarkers and Prostate Cancer Collaborative Group; IGF-I, insulin-like growth factor-I; OR, odds ratio; PSA, prostate-specific antigen; SD, standard deviation; TNM, tumour, node, metastases

Figure 3 Odds ratio (95% CIs) for aggressive* prostate cancer per study-specific 1-SD increment of IGF-I concentration by subgroup in the EHNBPCCG. Estimates are from logistic regression conditioned on the matching variables and adjusted for age, BMI, height, alcohol intake, smoking status, marital status, education status, racial/ethnic group and diabetes status. The position of each square indicates the magnitude of the odds ratio, and the area of the square is proportional to the inverse of the variance of the log odds ratio.

Discussion
This is the first study that has applied both observational and genetic approaches using data from large international consortia to investigate the associations of IGF-I with prostate cancer risk. Our results support a role of circulating IGF-I in the development of prostate cancer, including aggressive disease. In observational analyses, IGF-II and IGFBPs-1 and -3 were also associated with overall prostate cancer risk, but these associations were attenuated following adjustment for IGF-I. Genetic analyses may be more informative than observational analyses about the direct role of the exposure on the outcome. The weaker findings from genetic analyses from the multi-SNP (cis and trans) instrument, compared with the cis-SNP may be related to associations of some of the trans-SNPs with other components of the IGF signalling pathway such as the IGFBPs. 34 For the lead cis-SNP MR we observed larger magnitude effects, which likely indicates stronger biological plausibility of a direct role for IGF-I and a reduced role of horizontal pleiotropy, 35 and may also be due to the possible role of local IGF1 expression in the prostate tissue. Moreover, colocalization analyses showed strong evidence of a shared genetic cause at the IGF1 gene for IGF-I concentrations and risk for prostate cancer, indicating that our findings are unlikely to be due to confounding by linkage disequilibrium.
In our observational analyses, IGF-II, IGFBP-1 and IGFBP-3 were associated with overall prostate cancer (IGFBP-1 inversely), but we were underpowered to detect associations with aggressive or early-onset disease. Following further adjustment for IGF-I, the associations with overall disease were attenuated, although IGFBP-3 remained significantly associated with overall prostate cancer. These results suggest that the observed associations may be at least partially due to the correlations of these biomarkers with IGF-I. Analogous genetic approaches such as multivariable MR may be useful in exploring the direct and indirect effects of these biomarkers on prostate cancer risk. 36
These analyses have several strengths. This is the largest collection of observational and genetic data on hormones and prostate cancer risk available, representing almost all the available data worldwide. This large sample size maximizes power to assess associations robustly and enabled us to investigate associations across subgroups. Further, by incorporating observational and genetic methods, we were able to use different lines of evidence for a more robust investigation towards causal inference. 14
This study had a number of limitations. IGFs and IGFBPs are also produced locally as well as by the liver, which may affect prostate cancer risk independently of circulating concentrations. 2,4 Consequently, the predictive value of circulating IGF-I as an indicator of intra-prostatic IGF signalling remains incompletely understood, 4 and future research including measured intra-prostatic IGF-I and IGF-I receptor expression may help to clarify this. Our analyses relied on single biomarker measurements, and although these biomarkers have good reproducibility over a 1 to 5 year period (intraclass correlation coefficients 0.60-0.90 for IGF-I and IGFBP-1, -2 and -3), 37-39 this would be expected to lead to underestimates of risk in the observational analyses. 40 Although associations were generally consistent by subgroup, the number of statistical tests in these analyses increased the possibility of false-positives. Assay methods used to measure the biomarkers varied by study, and some IGF biomarkers are more difficult to measure than others (for example, IGF-II); measurement error would be expected to be non-differential and therefore tend to bias associations towards the null. As in the standard approach for MR, effect estimates were calculated on the same scale as for the observational analyses, and this scaling-up results in some imprecision with wide confidence intervals in the associations; the concordance of the directions of the associations is therefore particularly important. Wider confidence intervals in MR sensitivity analyses may relate to lower power for some of these methods. 41

Conclusion
In conclusion, the findings from these analyses using observational and genetic data from large-scale international consortia are supportive of a role of IGF-I in the aetiology of prostate cancer. For the first time we show evidence that IGF-I is important for aggressive, clinically relevant disease. These findings support the need for more research on the modifiable determinants of IGF-I, and on whether interventions to lower IGF-I might reduce the risk of prostate cancer.

Ethics approval
Each individual study obtained ethical approval; therefore, additional ethical approval for this study was not required.

Data availability
Studies pooled by the EHNBPCCG are not owned by the writing group and so are not available from this consortium. Individual studies may be contacted to request access to their data. PRACTICAL consortium data are available upon request; see http://practical.icr.ac.uk/blog/ for further details.

Supplementary data
Supplementary data are available at IJE online.

Author contributions
Author contributions are available as a Supplementary file at IJE online.