Janne Pott, Jesper R Gådin, Elizabeth Theusch, Marcus E Kleber, Graciela E Delgado, Holger Kirsten, Stefanie M Hauck, Ralph Burkhardt, Hubert Scharnagl, Ronald M Krauss, Markus Loeffler, Winfried März, Joachim Thiery, Angela Silveira, Ferdinand M van't Hooft, Markus Scholz, Meta-GWAS of PCSK9 levels detects two novel loci at APOB and TM6SF2, Human Molecular Genetics, Volume 31, Issue 6, 15 March 2022, Pages 999–1011, https://doi.org/10.1093/hmg/ddab279
Abstract
Proprotein convertase subtilisin/kexin type 9 (PCSK9) is a key player in lipid metabolism, as it degrades low-density lipoprotein (LDL) receptors from hepatic cell membranes. So far, only variants of the PCSK9 gene locus have been found to be associated with PCSK9 levels. Here we aimed to identify novel genetic loci that regulate PCSK9 levels and to determine how they relate to other lipid traits. Additionally, we investigated to what extent the causal effect of PCSK9 on coronary artery disease (CAD) is mediated by low-density lipoprotein–cholesterol (LDL–C).
We performed a genome-wide association study meta-analysis of PCSK9 levels in up to 12 721 samples of European ancestry. The estimated heritability was 10.3%, which increased to 12.6% using only samples from patients without statin treatment. We successfully replicated the known PCSK9 hit consisting of three independent signals. Interestingly, in a study of 300 African Americans, we confirmed the locus with a different PCSK9 variant. Beyond PCSK9, our meta-analysis detected three novel loci with genome-wide significance. Co-localization analysis with cis-eQTLs and lipid traits revealed biologically plausible candidate genes at two of them: APOB and TM6SF2. In a bivariate Mendelian Randomization analysis, we detected a strong effect of PCSK9 on LDL-C, but not vice versa. LDL-C mediated 63% of the total causal effect of PCSK9 on CAD.
Our study identified novel genetic loci with plausible candidate genes affecting PCSK9 levels. Ethnic heterogeneity was observed at the PCSK9 locus itself. Although the causal effect of PCSK9 on CAD is mainly mediated by LDL-C, an independent direct effect also occurs.
Introduction
Proprotein convertase subtilisin/kexin type 9 (PCSK9) is a key player in lipid metabolism because it controls cellular low-density lipoprotein–cholesterol (LDL-C) uptake. This regulation is based on PCSK9 binding to LDL receptors (LDLR) on hepatic cells followed by endocytosis and degradation of the PCSK9-LDLR complex (1). Understanding genetic and non-genetic regulators of PCSK9 levels is crucial for effective personalized treatment of lipid disorders. Statins are the most commonly used class of lipid-lowering drugs, but it is well known that their effect is attenuated to some extent by upregulation of PCSK9. Therefore, statins are combined with PCSK9 inhibitors (e.g. alirocumab (2) or evolocumab (3)) to counter this effect in cases of insufficient response to statin treatment.
The genetic regulation of PCSK9 is only partly unraveled. First, mutations in the PCSK9 gene were discovered in 2003 and are related to autosomal dominant hypercholesterolemia (4). Since then, several gain-of-function and loss-of-function mutations of the PCSK9 gene have been described (5,6). It is a genetically diverse locus, with differences of allele frequencies between Europeans and African Americans, leading to a significant population differentiation (7). In a trans-ethnic fine-mapping study of lipid traits it was shown that PCSK9 variants are associated with LDL-C in both European and African American samples, but only one signal was associated in both, namely rs11591147 (8). PCSK9 levels are not routinely measured in large studies, and the two published genome-wide association studies (GWAS) of PCSK9 levels are of individuals of European ancestry (9,10). The variant rs11591147 was the lead SNP in both GWASs (9,10), and only the PCSK9 locus was detected with sufficient significance. So far, no GWAS of PCSK9 levels in African Americans has been published. Although a recently conducted study estimated the PCSK9 heritability at 47% in Europeans (11), SNPs described so far only explained 4% of the trait’s variance (9). This and the limited sample size of previous studies (N = 1215 and N = 3290) suggest that further genetic associations could be discovered. A GWAS focusing on statin-induced change in PCSK9 levels (N = 562) (12) revealed a significant hit in CFAP44 (aka WDR52), which points toward an interaction of statin treatment and genetic regulation on PCSK9 levels.
In a Mendelian Randomization (MR) study we showed causal effects of PCSK9 levels on risk for several atherosclerotic phenotypes comprising coronary artery disease (CAD), and carotid plaques (9). However, mediating effects of LDL-C were not analyzed in this context.
In this study, we significantly increased the sample size of our previous work by performing the first genome-wide association meta-analysis (GWAMA) of PCSK9 levels in 12 721 participants of five independent studies of European descent. For African Americans, only one study with a sample size of n = 300 was available, which we used to validate the detected significantly associated loci in another ethnicity. The primary objective was to improve our understanding of the genetic regulation of PCSK9 levels by adding new genes and increasing the set of causal variants. We also analyzed statin-stratified subgroups to detect possible interactions of genetics and statin treatment. We used these results to unravel the causal relationships between lipid traits, PCSK9 and CAD by genetic correlation analyses, co-localization analyses and MR network analyses.
Figure 1. Miami plot. Distribution of log10-transformed P-values of our GWAMA for all samples (top, −log10) and the statin-free subset (bottom, +log10). The red dashed and blue dotted lines mark genome-wide significance (α = 5 × 10−8) and suggestive significance (α = 1 × 10−6), respectively. The y-axis is limited to [−20, 20] (original maximum and minimum y-values: 48.0 for all, −45.8 for the subset). Four distinct loci with genome-wide significance were found; their candidate genes are given in black for the best associating group. Candidate genes of suggestive loci are shown in grey. QQ-plots are given in the Supplementary Material, Fig. S1.
Results
Meta-GWAS and fine-mapping results
We performed a GWAMA for PCSK9 levels in five independent studies: LIFE-Heart (13), LIFE-Adult (14), LURIC (15), CAP (16) and TwinGene (17) (n = 12 721 individuals of European ancestry, n = 10 186 without statin treatment). The average age ranged between 54 and 62 years and sex distributions were between 52% and 69% males. Detailed study characteristics are provided in the Supplemental Material and Supplementary Material, Table S1. Each cohort imputed genetic data to 1000 Genomes Phase 3 (18), and performed association analyses with PCSK9 levels using a standardized protocol (see Methods).
After applying SNP filters (MAF ≥ 1%, info ≥ 0.5, I2 ≤ 0.9, number of studies ≥ 2), about 8.7 million SNPs remained for a fixed-effect meta-analysis. No general signs of inflation were observed (λ = 1.01 for the fixed-effect meta-analysis using genomic-control-corrected single-study results, see Supplementary Material, Fig. S1 for QQ-plots). Genome-wide results of both the combined set and the statin-free subset are shown in Figure 1. We discovered 182 SNPs at four loci achieving genome-wide significance in at least one of the settings (P < 5 × 10−8). A further 125 SNPs at eight additional loci reached at least suggestive significance (P < 5 × 10−6). Summary statistics and annotations can be found in Supplementary Material, Tables S2–S5.
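The fixed-effect meta-analysis and the heterogeneity filter described above can be sketched as follows. This is a minimal illustration of standard inverse-variance weighting with Cochran's Q and I², not the pipeline used in the study; the function names and the per-study numbers are hypothetical, and only the filter thresholds are taken from the text.

```python
import math

def fixed_effect_meta(betas, ses):
    """Inverse-variance weighted fixed-effect meta-analysis of one SNP."""
    weights = [1.0 / se**2 for se in ses]
    w_sum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se = math.sqrt(1.0 / w_sum)
    # Two-sided P-value from the normal approximation
    p = math.erfc(abs(beta / se) / math.sqrt(2.0))
    # Cochran's Q and I^2 (share of total variation attributed to heterogeneity)
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) if q > 0 else 0.0
    return {"beta": beta, "se": se, "p": p, "i2": i2}

def passes_filters(maf, info, i2, n_studies):
    """SNP filters quoted in the text: MAF >= 1%, info >= 0.5, I2 <= 0.9, >= 2 studies."""
    return maf >= 0.01 and info >= 0.5 and i2 <= 0.9 and n_studies >= 2

# Hypothetical per-study estimates for a single SNP (not taken from the paper)
result = fixed_effect_meta(betas=[-0.05, -0.04, -0.06], ses=[0.01, 0.02, 0.015])
keep = passes_filters(maf=0.18, info=0.97, i2=result["i2"], n_studies=3)
print(result, keep)
```

Studies with smaller standard errors dominate the pooled estimate, which is why the I² filter matters: a SNP whose effect differs strongly between cohorts is poorly summarized by a single fixed effect.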
The strongest association was observed at 1p32.3, within the PCSK9 gene. This locus has been previously described for associations with PCSK9 levels (9,10), but also for lipids (19) and CAD (20). The other three genome-wide significant hits were novel: at 2p24.1 (lead SNP rs673548), 19p13.11 (rs58542926) and 19q13.41 (rs71180459). To better define potential causal genes, we performed per-locus conditional-joint (COJO) analyses to identify independent variants, and validated the results in 300 African American samples. Next, credible sets (CS) containing the causal variant with 99% certainty were determined for each independent SNP, and checked for interactions regarding sex and statin treatment. Finally, we tested for co-localization with cis-eQTLs, lipids and CAD association signals (see Methods). We present the results per locus.
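To make the notion of a 99% credible set concrete, the sketch below builds one from per-SNP posterior probabilities (PP): SNPs are ranked by PP and added until the cumulative PP reaches the requested coverage. How the PPs themselves are computed is described in the Methods and is not reproduced here; the SNP names and PP values in the example are hypothetical.

```python
def credible_set(posteriors, coverage=0.99):
    """Smallest set of SNPs whose posterior probabilities sum to >= coverage."""
    ranked = sorted(posteriors.items(), key=lambda kv: kv[1], reverse=True)
    chosen, total = [], 0.0
    for snp, pp in ranked:
        chosen.append(snp)
        total += pp
        if total >= coverage:
            break
    return chosen

# Hypothetical locus; the PP values are illustrative only
locus = {"snp_a": 0.60, "snp_b": 0.30, "snp_c": 0.06, "snp_d": 0.035, "snp_e": 0.005}
cs99 = credible_set(locus, coverage=0.99)
cs95 = credible_set(locus, coverage=0.95)

# A variant with PP > 0.99 forms a credible set on its own
single = credible_set({"lead": 0.995, "other": 0.005})
print(cs99, cs95, single)
```

A sharply peaked signal yields a small set, whereas a flat posterior over many correlated variants yields a large one.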
PCSK9 locus (1p32.3)
We detected 71 SNPs with genome-wide significance at the 1p32.3 locus. In the COJO analysis, three independent signals were detected: the lead SNPs were rs11591147, rs2495477 and rs11206510 (see Table 1 for summary statistics of COJO, Figure 2 for regional association of the conditioned estimates, and Supplementary Material, Fig. S2 for forest plots). Since both rs11591147 and rs2495477 had a posterior probability (PP) > 0.99, their CS contained only the respective variant. The lead SNP rs11591147 codes for the well-known missense mutation R46L, increasing PCSK9 degradation. Rs2495477 is an intronic variant, which, according to CADD evaluation, influences the splicing process. Therefore, both are functionally plausible causal variants for PCSK9 levels. For the third independent signal, rs11206510, 21 SNPs form the respective 99% CS. Besides rs11206510, which is known for its association with CAD, two SNPs are plausible causal variants: rs11583680, a missense mutation (A53V), and rs45448095, a 5′ UTR modifier. Both have regulatory consequences according to CADD evaluation (see Supplementary Material, Tables S6 and S7 for CS and CADD annotation).
| Cytoband / candidate gene | #SNPs in 99% CS (95% CS) | SNP | EA/OA | MAF | Info | I2 | Beta | Beta (COJO) | P-value | P-value (COJO) | CADD score |
|---|---|---|---|---|---|---|---|---|---|---|---|
| 1p32.3 PCSK9 | 1 (1) | **rs11591147** | T/G | 0.014 | 0.880 | 0.793 | −0.293 | −0.269 | 9.20 × 10−49 | 2.18 × 10−39 | 17.1 |
| | 1 (1) | **rs2495477** | G/A | 0.405 | 0.899 | 0.554 | −0.046 | −0.045 | 1.40 × 10−23 | 2.18 × 10−22 | 14.7 |
| | 21 (17) | **rs11206510** | C/T | 0.179 | 0.976 | 0.000 | −0.048 | −0.037 | 1.40 × 10−17 | 4.36 × 10−11 | 0.073 |
| 2p24.1 APOB | 68 (34) | **rs673548** | A/G | 0.216 | 0.996 | 0.000 | 0.036 | | 1.92 × 10−12 | | 5.1 |
| | | rs676210* | A/G | 0.215 | 1.000 | 0.000 | 0.036 | | 3.47 × 10−12 | | 27.1 |
| 19p13.11 | 171 (61) | **rs58542926** | T/C | 0.088 | 0.963 | 0.338 | −0.048 | | 4.17 × 10−8 | | 23.2 |
| 19q13.41 | 4109 (3118) | **rs71180459** | T/TG | 0.103 | 0.904 | 0.000 | 0.040 | | 1.58 × 10−8 | | 0 |
For 1p32.3, in addition to the beta estimate and P-values for the univariate analysis (GWAMA), we report the results of the joint analysis (COJO-cond). Lead SNPs per credible set are marked in bold. For APOB, the most likely causal variant is provided in the second line. Results of all associated variants and SNPs in credible sets are shown in Supplementary Material, Tables S2 and S6. *LD between rs673548 and rs676210: r2 = 0.988.

Figure 2. Regional association (RA) plots of the locus 1p32.3. The respective lead SNP is colored blue, the other SNPs are colored according to their LD with the lead SNP (using 1000 Genomes Phase 3, Europeans only). (A)–(C) RA plots of the independent variants with statistics conditioned on the respective other independent variants. (D) RA plot of the unconditional statistics with rs11591147 as lead SNP. Red circles indicate the two other independent signals, rs2495477 and rs11206510.
In an attempt to validate the locus in African American samples from CAP, rs11591147 turned out to be monomorphic in this ethnicity. The other two SNPs did not reach the nominal threshold, but rs11206510 had a concordant effect direction. Statistics are provided in Supplementary Material, Table S8. Analyzing the PCSK9 locus in more detail, we found one variant significantly associated in African Americans, namely rs28362263, which causes the loss-of-function mutation A443T (21) in PCSK9 (β = 0.36, P = 1.0 × 10−11). This SNP was monomorphic in our European GWAMA and had been reported for association with LDL-C in multi-ethnic studies (6,22), but not with PCSK9. We provide summary statistics in African Americans for this locus in Supplementary Material, Table S9 and a regional association plot in Supplementary Material, Fig. S3.
The lead SNP rs11591147 showed a significant interaction with statin treatment. Although it remained genome-wide significant in the statin-free subset ($\beta_{\text{no statin}} = -0.320$, $p_{\text{no statin}} = 9.06 \times 10^{-47}$), there was no effect in participants under statin treatment ($\beta_{\text{statin}} = -0.076$, $p_{\text{statin}} = 0.199$). Effect sizes differed significantly when controlling the FDR at 5% ($p_{\text{diff}} = 1.24 \times 10^{-4}$, $q_{\text{diff}} = 0.035$; see Supplementary Material, Table S10 and Supplementary Material, Fig. S4). However, this interaction was not confirmed when using association with statin-induced changes in plasma PCSK9 in European ancestry CAP participants (P = 0.502), who all had plasma PCSK9 measured before and during statin treatment. Five SNPs of the rs11206510 CS reached suggestive significance in the statin treatment group, and their effects were more than twice those in the statin-free subset. However, formal interaction analysis did not pass the FDR-controlled threshold. In the sex-stratified analysis, rs2495477 showed a trend toward a stronger effect in men, but again this difference was not significant after FDR control.
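One common way to test such a SNP × statin interaction from stratified results is a z-test on the difference of the two effect estimates. The sketch below applies it to the rs11591147 numbers quoted above. It rests on two assumptions that are not stated in this section: the standard errors are reconstructed from the rounded effect sizes and P-values, and the two strata are treated as independent. The result therefore need not match the reported value exactly.

```python
import math
from statistics import NormalDist

def se_from_p(beta, p):
    """Back out a standard error from an effect and its two-sided P-value."""
    z = abs(NormalDist().inv_cdf(p / 2.0))
    return abs(beta) / z

def difference_test(beta1, se1, beta2, se2):
    """Two-sided z-test for the difference of two independent estimates."""
    diff = beta1 - beta2
    se_diff = math.sqrt(se1**2 + se2**2)
    p = math.erfc(abs(diff / se_diff) / math.sqrt(2.0))
    return diff, p

# Effect sizes and P-values for rs11591147 as quoted in the text
b_no, p_no = -0.320, 9.06e-47   # statin-free subset
b_st, p_st = -0.076, 0.199      # statin-treated subset

diff, p_diff = difference_test(b_no, se_from_p(b_no, p_no),
                               b_st, se_from_p(b_st, p_st))
print(f"difference = {diff:.3f}, p_diff = {p_diff:.2e}")
```

The wide standard error in the smaller statin-treated stratum dominates the test, so a large difference in point estimates is needed before the interaction becomes significant.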
The unconditioned association signals at 1p32.3 co-localized with total cholesterol (TC), LDL-C and CAD (PP4 > 0.75), but not with eQTLs of PCSK9 expression levels (max. PP4 = 0.110 in transformed fibroblasts). Using results of the COJO analysis, there was a strong co-localization of PCSK9 eQTLs with rs2495477 in testis and whole blood, and with rs11206510 in brain (cerebellum and cerebellar hemisphere), lung and nerve (tibial) tissues. Similar results were obtained when using statistics of the statin-free subgroup. Co-localization results are summarized in Table 2, Supplementary Material, Table S11 and Figure 3.
APOB locus (2p24.1)
There were 168 genome-wide significant SNPs at 2p24.1, but the conditional analysis revealed only one independent signal, rs673548 (PP = 0.350, see Supplementary Material, Figs S5 and S6 for forest plot and RA plot). The 99%-CS of this locus contained 68 variants. The second highest PP was observed for rs676210 (PP = 0.209), which is in high LD with the lead SNP (LD r2 = 0.99). It is a missense mutation of APOB (P2739L) with high CADD score, and therefore a likely causal variant for the observed association. Both variants showed the same effect direction in African Americans, and rs673548 reached nominal significance (P = 0.040).
We did not observe any significant interactions for the CS-SNPs of this locus (lead SNP: $\beta_{\text{no statin}} = 0.038$, $\beta_{\text{statin}} = 0.030$, $p_{\text{diff}} = 0.682$). Co-localization analyses revealed an overlap with high-density lipoprotein–cholesterol (HDL-C) and triglyceride (TG) (PP4 = 0.985 and 0.975, respectively), but not with TC and LDL-C (both PP3 = 1), indicating that different SNPs drive the associations of these traits compared to PCSK9. In the analyses of eQTL data, we only detected co-localizations when using statistics of the statin-free subgroup. Here, we observed co-localizations between PCSK9 associations and APOB eQTLs in colon sigmoid (PP4 = 0.928), brain (substantia nigra, PP4 = 0.869), artery (tibial, PP4 = 0.843) and esophagus (muscularis and gastroesophageal junction, PP4 = 0.970 and PP4 = 0.847, respectively).
Genetic relationships between PCSK9, CAD and lipid traits at genome-wide and locus-specific level
| Trait | rg | p(rg) | PP4 at 1p32.3 | PP4 at 2p24.1 | PP4 at 19p13.11 | PP4 at 19q13.41 |
|---|---|---|---|---|---|---|
| TC | **0.392** | 0.005 | **1.000** | 0.000 | **0.993** | 0.006 |
| LDL-C | **0.325** | 0.017 | **1.000** | 0.001 | **0.992** | 0.006 |
| HDL-C | 0.230 | 0.061 | 0.122 | **0.985** | 0.013 | 0.019 |
| TG | 0.123 | 0.414 | 0.049 | **0.974** | **0.994** | 0.006 |
| CAD | 0.021 | 0.825 | **0.888** | 0.000 | 0.193 | 0.004 |
Genetic correlations (rg) with PCSK9 were estimated using our PCSK9 summary statistics and those of lipid traits and CAD available from Teslovich et al. (26) and Nikpay et al. (20). LDHub was used for analyses. Co-localization (columns 4–7, Bayesian posterior probabilities for PP4) was determined using summary statistics of Surakka et al. (19) (lipids) and Nikpay et al. (20). High PP4 indicates co-localization. Significant genetic correlations (p(rg) < 0.05) and clear evidence for co-localization (PP4 > 0.75) are marked in bold. Further results can be found in Supplementary Material, Tables S11 and S12.

Figure 3. Results of pairwise co-localization for the genome-wide significant loci and other lipid traits, CAD and eQTLs. The maximum of PP4 and PP3 is shown. Blue indicates a high posterior probability of co-localization (PP4, both traits share the same causal variant), and red a high posterior probability for two independent associations at this locus (PP3). White cells indicate that there is no signal for the trait compared. For the eQTL comparison, only the maximum PP4 value over independent SNPs and eQTL tissues is displayed per locus.
TM6SF2 locus (19p13.11)
There was one genome-wide and 21 suggestive significant SNPs at 19p13.11 in the statin-free setting, while in the combined analysis this locus reached only suggestive significance. Only the lead SNP rs58542926 represented an independent signal, and its 99% CS contains 171 SNPs. The lead variant codes for a missense mutation in TM6SF2 (E167K, loss-of-function mutation) and was associated with a negative effect on PCSK9 levels. We therefore considered it the most likely causal variant at this locus and TM6SF2 as the respective candidate gene.
Although the locus was more strongly associated in statin-free participants than in the combined analysis, there was no significant SNP × statin interaction (rs58542926: $\beta_{\text{no statin}} = -0.052$, $\beta_{\text{statin}} = -0.020$, $p_{\text{diff}} = 0.200$). However, there were four SNPs with nominally significant interaction, and all of them were also associated with statin-induced changes in plasma PCSK9 levels in CAP. TC, LDL-C and TG associations co-localized with the PCSK9 association here (PP4 > 0.98), whereas there was no co-localization with HDL-C (PP1 = 0.956). We did not observe any co-localization with TM6SF2 eQTLs, but with ATP13A1 in skin tissue (sun exposed, PP4 = 0.873), and with MAU2 in whole blood (PP4 = 0.850).
PPP2R1A locus (19q13.41)
At the fourth locus, only one SNP reached genome-wide and three SNPs suggestive significance (lead SNP rs71180459). The 99%-CS of rs71180459 was large, containing 4109 of the 4370 variants of the region, of which 107 had a CADD score > 10 (see Supplementary Material, Fig. S7 and Supplementary Material, Table S6). Three of the SNPs are in weak LD (r2 = 0.3) with a variant reported for HDL-C levels (23) (reported gene FPR3).
In an enrichment analysis using all eQTLs in LD (r2 > 0.3) and nearby genes of our four genome-wide loci, we detected one pathway that connected the gene PPP2R1A from this locus with APOB: ‘Platelet sensitization by LDL’ (pathway ID R-HSA-432142, Enrichment OR = 21.1, P = 0.0042).
The locus did not reach suggestive significance in the statin free subset, and only 153 variants of the CS reached nominal significance here. In the statin-treated subset, 390 SNPs were nominally associated, of which two displayed significant SNP × statin interaction. However, these two SNPs were not associated in the combined analysis or the statin free analysis. Although we could not confirm statin interaction of these two SNPs in CAP, 17 other SNPs reached nominal significance here.
In the sex-stratified analyses of the CS, 589 SNPs were associated in men, whereas only 101 were associated in women. This resulted in 40 SNPs with significant sex interaction, all of which showed sex-specific differences in both effect size and effect direction.
We detected no co-localization between our PCSK9 signal and any lipid trait, CAD or eQTLs of any gene. The highest PP4 was observed for a lncRNA (CTC-471 J1.10, PP4 = 0.304 in thyroid tissue), while the PP for co-localization with PPP2R1A was 0.237 in colon tissue.
Heritability and genetic correlation
The five independent and genome-wide significant SNPs of the combined setting explained about 3.5% of PCSK9 variance (4.0% in the statin-free setting). By expanding the significance threshold to 15 SNPs with FDR < 1%, the explained variance increased to 6.7% (7.4% with 13 SNPs in the statin-free setting). Using the online tool LDHub (24) we estimated the heritability of PCSK9 levels and genetic correlations with other lipid traits (25). The estimated heritability was h2 = 10.3% (standard error 3.7%; 12.6% with SE 4.8% in the statin-free setting), which is of similar magnitude to that of other lipid traits according to LDHub (LDL–C: 10.7%; TC: 13.7%). There was a significant genetic correlation between PCSK9 and TC (rg = 0.39, P = 0.005) and LDL–C (rg = 0.34, P = 0.020), but not with HDL–C or TG (see Table 2, data from Teslovich et al. (26)). We also analyzed lipid metabolites of Kettunen et al. (27) for genetic correlation and found 29 of the 107 traits significantly correlated (see Supplementary Material, Table S12). The strongest correlation was observed with free cholesterol in large LDL (rg = 0.82, P = 0.002). There was also a significant correlation with Apolipoprotein B (rg = 0.54, P = 0.025).
In our GWAS catalog look-up, we found 2554 variants associated with TC, LDL-C, HDL-C or TG. For 2099 of these SNPs distributed over 368 distinct loci, PCSK9 association statistics were available in our study. Using our results of combined and statin-free analyses, we detected 30 genome-wide, 16 suggestive and 527 nominally significant SNPs (at 3, 1 and 102 independent loci, respectively). Therefore, 28% (106 out of 368) of known lipid loci are also associated with PCSK9 on at least a nominal level (enrichment P = 8.52 × 10−50). Of all LDL-C loci, 31% were co-associated, whereas for HDL-C only 21% were co-associated (enrichment P = 1.64 × 10−31 and P = 1.46 × 10−16, respectively). Statistics for all 2099 SNPs are given in Supplementary Material, Table S13.
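An enrichment P-value of this kind can be obtained with a one-sided exact binomial test: if PCSK9 were unrelated to lipid loci, roughly 5% of loci would be expected to reach nominal significance by chance. The sketch below applies that test to the 106 of 368 loci quoted above. That the authors used exactly this test, and a 5% null rate, is an assumption made here for illustration.

```python
from math import comb

def binomial_upper_tail(k, n, p0):
    """Exact P(X >= k) for X ~ Binomial(n, p0)."""
    return sum(comb(n, i) * p0**i * (1.0 - p0) ** (n - i) for i in range(k, n + 1))

n_loci = 368          # lipid loci with PCSK9 statistics available (from the text)
n_associated = 106    # loci at least nominally associated with PCSK9 (from the text)
null_rate = 0.05      # assumed chance rate of nominal significance

share = n_associated / n_loci
p_enrich = binomial_upper_tail(n_associated, n_loci, null_rate)
print(f"share = {share:.3f}, enrichment P = {p_enrich:.2e}")
```

Under the null about 18 loci would be expected, so observing 106 lies far in the upper tail of the binomial distribution.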
MR analyses
We aimed to distinguish the direct effect of PCSK9 on CAD from the indirect effect mediated by LDL-C using MR (28). To rule out possible reverse causality, we first performed a bidirectional MR of PCSK9 and LDL-C levels (see Methods and Supplementary Material, Fig. S8).
As expected, we found a significant causal effect of PCSK9 on LDL–C (β = 1.770, P = 4.44 × 10−144, see Table 3 and Supplementary Material, Fig. S9) when using the three independent SNPs of the PCSK9 locus as instruments for PCSK9 plasma levels. The opposite direction did not reach significance when 24 instruments of LDL-C were used ($\beta_{m=24} = 0.015$, $p = 0.179$). For sensitivity, we repeated the analysis with the subset of SNPs explaining >0.5% of LDL-C variance. Here, a significant reverse causality was observed ($\beta_{m=3} = 0.029$, $p = 0.016$), suggesting feedback loops between PCSK9 and LDL-C. We obtained the same results when restricting to statin-free subjects ($\beta_{m=24} = 0.022$, $p = 0.071$; $\beta_{m=3} = 0.043$, $p = 0.001$).
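The inverse-variance weighted (IVW) estimator used for these causal estimates combines per-instrument ratios of SNP-outcome to SNP-exposure effects. The sketch below is a textbook fixed-effect IVW implementation from summary statistics, not the authors' code. The SNP-PCSK9 effects are the univariate betas from Table 1, whereas the SNP-outcome effects and their standard errors are hypothetical, generated from a made-up slope of 1.5 so that the estimator's behaviour is easy to check.

```python
import math

def ivw_estimate(beta_x, beta_y, se_y):
    """Fixed-effect IVW causal estimate of X on Y from summary statistics."""
    num = sum(bx * by / s**2 for bx, by, s in zip(beta_x, beta_y, se_y))
    den = sum(bx**2 / s**2 for bx, s in zip(beta_x, se_y))
    est = num / den
    se = math.sqrt(1.0 / den)
    p = math.erfc(abs(est / se) / math.sqrt(2.0))  # two-sided normal P-value
    return est, se, p

# SNP effects on PCSK9: rs11591147, rs2495477, rs11206510 (Table 1, univariate betas)
beta_pcsk9 = [-0.293, -0.046, -0.048]

# HYPOTHETICAL outcome effects: exact multiples of the exposure effects
assumed_slope = 1.5
beta_outcome = [assumed_slope * b for b in beta_pcsk9]
se_outcome = [0.02, 0.005, 0.006]   # hypothetical standard errors

est, se, p = ivw_estimate(beta_pcsk9, beta_outcome, se_outcome)
print(f"IVW estimate = {est:.3f} (se {se:.3f})")
```

Because the hypothetical outcome effects are exact multiples of the exposure effects, the estimator returns the assumed slope; with real data the per-instrument ratios scatter around the pooled estimate.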
| Parameter (see Supplementary Material, Fig. S8) | X | Y | Causal estimate | SE | P-value | #SNPs |
|---|---|---|---|---|---|---|
| α | PCSK9 | LDL-C | 1.770 | 0.069 | 4.4 × 10−144 | 3* |
| β | LDL-C | CAD | 0.357 | 0.070 | 3.2 × 10−7 | 24+ |
| τ | PCSK9 | CAD | 0.996 | 0.131 | 3.2 × 10−14 | 3* |
| Indirect effect (α × β) mediated by LDL-C | PCSK9 | CAD | 0.632 | 0.126 | 5.4 × 10−7 | |
| Direct effect (τ − α × β) | PCSK9 | CAD | 0.364 | 0.182 | 4.5 × 10−2 | |
Parameters correspond to Supplementary Material, Fig. S8. Analyzed exposure and outcome are denoted with X and Y, respectively. Causal estimates, their standard errors and P-values were obtained using the IVW approach. We used the summary statistics of our meta-analysis, Surakka et al. (19), and Nikpay et al. (20) for instrument effects on PCSK9, LDL-C and CAD, respectively.
*Independent SNPs at PCSK9 (rs11591147, rs2495477 and rs11206510).
+Independent SNPs associated with LDL-C but not associated with PCSK9.
In our mediation analysis of PCSK9 on CAD, we estimated the indirect effect as the product of the PCSK9 effect on LDL-C and the LDL-C effect on CAD: βindir = 0.632 (P = 5.44 × 10−7). The direct causal effect is the difference between the total and the indirect effect of PCSK9 on CAD, which was also significant (βdir = 0.364, P = 0.045). Similar results were obtained when restricting to statin-free subjects or using other instruments for LDL-C (see Supplementary Material, Table S14). We conclude that 63% of the causal PCSK9 effect on CAD is mediated by LDL-C, but there is also a significant direct effect.
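As an illustration, the mediation arithmetic can be retraced from the tabulated estimates of α, β and τ. The following Python sketch is not the original analysis code. It assumes that the standard errors follow from a first-order delta method with independent estimates and that P-values come from a normal approximation; neither is stated in the text, although both reproduce the reported values up to rounding. All variable and function names are hypothetical.

```python
import math

# Causal estimates and standard errors as reported in the table
alpha, se_alpha = 1.770, 0.069  # PCSK9 -> LDL-C
beta, se_beta = 0.357, 0.070    # LDL-C -> CAD
tau, se_tau = 0.996, 0.131      # PCSK9 -> CAD (total effect)

indirect = alpha * beta               # effect mediated by LDL-C, rounds to 0.632
direct = tau - indirect               # remaining direct effect, rounds to 0.364
proportion_mediated = indirect / tau  # about 0.63

# Assumption: first-order delta method, estimates treated as independent
se_indirect = math.sqrt(alpha**2 * se_beta**2 + beta**2 * se_alpha**2)
se_direct = math.sqrt(se_tau**2 + se_indirect**2)


def two_sided_p(estimate, se):
    """Two-sided P-value under a normal approximation (assumption)."""
    z = abs(estimate / se)
    return math.erfc(z / math.sqrt(2))


p_indirect = two_sided_p(indirect, se_indirect)
p_direct = two_sided_p(direct, se_direct)

print(f"indirect = {indirect:.3f} (se {se_indirect:.3f}, P = {p_indirect:.2g})")
print(f"direct   = {direct:.3f} (se {se_direct:.3f}, P = {p_direct:.2g})")
print(f"proportion mediated = {proportion_mediated:.1%}")
```

Because the inputs are the rounded table values, the P-values obtained this way agree with the reported ones only approximately.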
Discussion
We performed the first genome-wide meta-analysis of PCSK9 blood levels in a sample of 12 721 individuals, six times the size of the largest single study published so far. We detected three novel loci with genome-wide significance and added further independent variants to the already known locus at 1p32.3. Based on these variants, we estimated the direct and LDL-C-mediated causal effects of PCSK9 on CAD, showing that both are significant and of the same order of magnitude.
We confirmed the known association at 1p32.3 at the PCSK9 gene locus. By our fine-mapping approach, we identified three independent signals representing different modes of action. The lead SNP is the known missense mutation R46L, which increases the rate of PCSK9 degradation (5,9,29). In a phenome-wide association study, this SNP was linked to disorders of lipid metabolism (hyperlipidemia, hypercholesterolemia) (30). The second independent variant was located in the fifth intron of PCSK9 and was predicted to modify RNA splicing. One possible functional mechanism is that the modified splicing results in reduced mRNA levels. This is supported by our co-localization analyses, in which the conditioned estimates of this hit co-localized with eQTLs of PCSK9 gene expression in whole blood and testis. The third CS included three plausible candidates. Although the lead SNP rs11206510 is known for its association with CAD (20), the highest CADD score was observed for rs11583680, coding for a missense mutation in PCSK9 (A53V). Another causal candidate was rs45448095, which was reported to be in high LD with an in-frame leucine insertion associated with lower LDL-C levels (31,32). This leucine insertion was not included in our study data due to low MAF. The SNP itself was located in the 5′UTR region regulating transcription according to CADD annotation. Analyzing ethnic heterogeneity of this locus in 300 African Americans, we observed a different lead SNP, rs28362263. It is a loss-of-function mutation that increases the chance for furin cleavage by which large parts of the catalytic domain are lost (21). We observed a positive SNP effect here, which could be explained by the antibody used in the ELISA that detects both the furin-cleaved and intact versions of PCSK9 in the circulation and hence only partially relates to PCSK9 activity. This variant is not present in our GWAMA, since it is monomorphic in Europeans. Conversely, our independent variant rs11591147 was monomorphic in African Americans, demonstrating the necessity of further trans-ethnic analyses of this locus with larger sample sizes.
A novel association was found at 2p24.1 around the APOB gene, which has been reported for associations with TC, LDL-C, HDL-C and TG in GWAMAs (19) and with disorders of lipid metabolism in a PheWAS analysis (30). Of note, we only observed co-localization with HDL-C and TG. Because ApoB100 data were available in LIFE-Heart, we performed co-localization analysis with this trait but could not find any evidence for co-localization (33), indicating that our hit does not act via ApoB100 plasma levels (data not shown). The CS of the lead variant contained 69 SNPs, of which the missense mutation P2739L affecting the secondary structure of ApoB100 is the most plausible causal variant (34). ApoB100 is targeted by biotherapeutics and small molecule drugs (35) to treat familial hypercholesterolemia, a genetic disorder caused by mutations in the PCSK9 gene, among others. PCSK9 binds to ApoB100 in LDL-C (36,37). This mutation could change the binding affinity resulting in higher levels of free PCSK9, i.e. more available PCSK9 antibody binding sites in the immunoassays. More experimental data are required to corroborate this mechanism.
We detected another genome-wide significant signal at 19p13.11 with 171 SNPs in the 99% CS of the lead variant. This locus was described for its associations with plasma lipoprotein concentrations (38), and both SUGP1 and TM6SF2 were suggested as causal genes (39). However, the gene expression associations of both genes were not co-localized with our PCSK9 signal. Subsequent studies identified the E167K missense mutation in TM6SF2 as the most likely causal variant for the observed associations (40,41). A PheWAS of this variant detected associations with various liver diseases, decreased neutrophil count, hemoglobin traits and platelet traits (42). In addition, this variant was shown to increase the expression levels of SREBP-1c (43), which is an important transcription factor of HMGCR, LDLR and PCSK9. This might also explain the observed trend toward a statin interaction. Functional studies in human hepatoma cells (44) and Tm6sf2 KO-mice (45) confirmed that TM6SF2 inhibition influences the secretion and/or lipidation of TG-rich lipoproteins by the liver. In a model of 3D spheroids from primary human hepatocytes it was shown that TM6SF2 E167K increases hepatocyte fat content by reducing ApoB particle secretion, and that genes of the cholesterol metabolism were differentially expressed compared to wild-type TM6SF2 (46). Here, we observed that three lipid traits (TG, cholesterol and LDL-cholesterol) were co-localized with our PCSK9 association signal. Of note, Smagris et al. (45) observed a decrease in plasma PCSK9 concentration in Tm6sf2−/− mice compared to wild-type mice, but this reduction did not reach statistical significance. Overall, these observations suggest that TM6SF2 influences the secretion of PCSK9 by hepatocytes and that TM6SF2 could be a potential target for future drug development. The mechanism responsible for the involvement of TM6SF2 in PCSK9 secretion remains to be further elucidated.
Finally, we found an association at 19q13.41. The CS of the lead variant was so large that it was impossible to pin down the causal variant and candidate gene with sufficient certainty. Possible candidate genes were from the FPR family and PPP2R1A. FPRs have been linked to HDL-C levels (23), as their activation of neutrophils is canceled by ApoA1 (47). We compared our in-house annotation with the online tool FUMA (Functional Mapping and Annotation) (48), which listed eight ZNFs as possible candidate genes, all of which were included in the co-localization analyses as nearby genes. However, our association signal did not co-localize with FPRs, ZNFs or HDL-C levels. PPP2R1A codes for the structural subunit of protein phosphatase 2 (PP2A), necessary as a scaffold for the regulatory and catalytic subunits of PP2A (49). It lies in a shared pathway with APOB enhancing platelet aggregation (‘Platelet sensitization by LDL’, pathway ID R-HSA-432142). In addition, PP2A interacts with SREBP-2 and changes its phosphorylation status, an important modification that enables SREBP-2 to act as a transcription factor for LDLR and PCSK9 (50,51). The observed sex and statin interactions further support PPP2R1A as a candidate gene, as it is also a putative biomarker for endometrial cancer (52) and has been discussed to interact with statins (53). However, this mechanism requires further experimental validation.
Based on the newly identified variants, we analyzed the causal relationships between PCSK9, LDL-C and CAD in more detail by MR analyses. For bivariate MR approaches, it is recommended to use instruments that have a direct biological relationship to the respective exposure, which can be assumed for the three independent variants of PCSK9. Using these instruments, we found a significant causal effect of PCSK9 on LDL-C. However, for the reverse direction of LDL-C on PCSK9 the choice of instruments was more challenging. In a first attempt, we used 24 SNPs reported to be associated with LDL-C, with sufficient quality in our data and not associated with PCSK9 levels. No significant causal effect was detected with this choice. In a second attempt, we used the three strongest LDL-C signals in terms of explained LDL-C variance, namely variants at sortilin 1 (SORT1), APOE and LDLR. This resulted in a weak causal effect. However, a pleiotropic effect of those instruments cannot be completely excluded. SORT1 is not only involved in LDL-metabolism, but also in PCSK9 secretion (54), and both APOE and LDLR mediate the effect of PCSK9 on hepatic lipid production (55).
We also observed a strong mediating effect of LDL-C on the causal relationship between PCSK9 and CAD, explaining about 63% of the total effect of PCSK9 on CAD. However, the direct effect reached significance too, which is in line with previous findings of PCSK9 as an independent risk factor for CAD (56). Underlying mechanisms of this direct effect need to be investigated in further studies, also taking into account the particle sizes of LDL and LDL subclasses.
In conclusion, we detected four independent loci associated with PCSK9 levels. The strongest hit at the PCSK9 locus showed both considerable locus heterogeneity, as demonstrated by the identification of three independent variants, and ethnic heterogeneity requiring further investigation. Although no clear candidate gene could be assigned to the 19q13.41 locus, the remaining two hits are in plausible genes involved in lipid metabolism. This suggests that the PCSK9 level is a polygenic trait regulated not only by variants in its own gene, but also by genes affecting secretion of or binding affinity to PCSK9. Our MR analysis suggests that the causal effect of PCSK9 on LDL-C levels is much larger than a possible effect in the reverse direction. Finally, the causal effect of PCSK9 on CAD is mainly mediated by LDL-C, but an independent direct effect also occurs.
Materials and methods
Data availability
Data related to this project, including summary level data from the meta-analyses, can be found online at https://doi.org/10.5281/zenodo.5643551.
Studies
Five independent studies contributed to this GWAMA: LIFE-Heart (13), LIFE-Adult (14), LURIC (15), CAP (16) and TwinGene (17), reaching a total sample size of 12 721 participants of European ancestry (n = 10 186 without statin treatment), and 300 participants of African American ancestry (CAP, all without statin treatment). Given the small sample size for African Americans, we refrained from carrying out a trans-ethnic GWAMA and instead performed a GWAMA in Europeans, and then validated the detected GWAMA lead SNPs in African Americans. Detailed study characteristics are provided in the Supplemental Material and in Supplementary Material, Table S1.
All studies meet the ethical standards of the Declaration of Helsinki and were approved by relevant institutional review boards. Written informed consent including agreement with genetic analyses was obtained from all participants in all studies.
PCSK9 measurement
In LIFE-Heart and LURIC, total PCSK9 levels were analyzed in plasma samples using a commercial assay (Quantikine Human PCSK9 immunoassay, R&D Systems). Data were log-transformed for further analyses (9,57). In LIFE-Adult, relative quantification of EDTA plasma PCSK9 was assessed by normalized protein expression units from the Olink target 96 multiplex platform (Olink Proteomics AB; CVD panel III). In CAP, a colorimetric ELISA assay using an AX213 antibody was used to measure PCSK9 in plasma (measured by BG Medicine in collaboration with Merck) (12), whereas in TwinGene serum PCSK9 concentration was determined as described in (58).
Genotyping, imputation and study level quality control
Genotyping arrays and preimputation quality control (QC) per study are shown in Supplementary Material, Table S1. Sample and SNP QC of genotype raw data was performed at the discretion of the single study analysts (see Supplemental Data). SNP-QC measures included call rate, violation of Hardy–Weinberg equilibrium and minor allele frequency (MAF) or monomorphism. For imputation, all studies used 1000 Genomes Phase 3 (18).
Analysis plan
Single study analysts were requested to follow a standardized analysis plan sent to all contributing studies. Genome-wide associations were estimated using linear regression analyses assuming additive genetic models (SNP dosage), adjusting for sex, age, statin treatment and current smoking. Principal components were included in the regression model if considered necessary by the respective study analyst (CAP). X-chromosomal SNPs were analyzed assuming total X inactivation (i.e. male genotypes were coded as A = 0, B = 2 and female genotypes were coded as AA = 0, AB = 1 and BB = 2). Association analyses were performed with PLINK2 (LIFE-Adult, LIFE-Heart and LURIC) or mach2qtl (CAP). Due to dizygotic twin pairs in TwinGene, associations were calculated with a mixed linear model adjusting for the estimated genetic relationship matrix. As recommended, the genetic relationship matrix was estimated leaving out the chromosome under analysis (GCTA MLMA LOCO (59,60)).
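The X-chromosomal coding described above can be sketched as follows. This is a minimal Python illustration with a hypothetical function name, not the code used by the study analysts: hemizygous males are scaled so that carrying allele B counts like a homozygous BB female.

```python
def x_chromosome_dosage(b_alleles: int, is_male: bool) -> int:
    """Coded dosage of allele B for an X-chromosomal SNP, assuming total X inactivation."""
    if is_male:
        # Males are hemizygous: genotype A -> 0, genotype B -> 2
        if b_alleles not in (0, 1):
            raise ValueError("males carry 0 or 1 copies of an X-linked allele")
        return 2 * b_alleles
    # Females: AA -> 0, AB -> 1, BB -> 2
    if b_alleles not in (0, 1, 2):
        raise ValueError("females carry 0, 1 or 2 copies of an X-linked allele")
    return b_alleles


print(x_chromosome_dosage(1, is_male=True))   # male carrying B
print(x_chromosome_dosage(1, is_male=False))  # heterozygous female
```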
We also requested genome-wide sub-group analyses of subjects stratified by statin medication, if available (n = 10 186 without statin treatment). The same software and the same regression model, except that statin treatment was not included as a covariate, were used for that purpose.
Meta-analysis
FileQC
Single study GWAS results were harmonized centrally by a pre-meta QC of summary files per study. First, we excluded SNPs with missing values in allelic information (effect allele, effect allele frequency) or statistics (beta estimate, standard error, imputation quality score). Further SNP filtering criteria were MAF < 1%, imputation info score < 0.5 and minor allele count ≤ 6.
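The exclusion rules above could be expressed as a single per-SNP check. The Python sketch below is only an illustration: the field names are hypothetical, and the minor allele count is assumed to be 2 × N × MAF (an autosomal approximation that the text does not specify).

```python
REQUIRED_FIELDS = ("effect_allele", "eaf", "beta", "se", "info")  # hypothetical names


def passes_file_qc(snp: dict, n_samples: int) -> bool:
    """Return True if a summary-statistics record survives the pre-meta filters."""
    # Exclude records with missing allelic information or statistics
    if any(snp.get(field) is None for field in REQUIRED_FIELDS):
        return False
    maf = min(snp["eaf"], 1.0 - snp["eaf"])
    mac = 2 * n_samples * maf  # assumption: MAC approximated as 2 * N * MAF
    # Exclude MAF < 1%, info score < 0.5 and minor allele count <= 6
    return maf >= 0.01 and snp["info"] >= 0.5 and mac > 6


example = {"effect_allele": "A", "eaf": 0.30, "beta": 0.05, "se": 0.01, "info": 0.9}
print(passes_file_qc(example, n_samples=1000))
```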
We used the R package ‘EasyQC’ (61) to filter SNPs with mismatching alleles or chromosomal position with respect to the reference (1000 Genomes Phase 3, Version 5 (2015) for European samples (18)), and with high deviation of study to reference allele frequency (difference > 0.2). Finally, the alleles were harmonized so that the same effect allele was used in all studies.
Meta-GWAS
For the meta-analysis, single study results were combined using a fixed-effect model assuming homogenous genetic effects across studies. Heterogeneity of study results was assessed by I2 statistics. We filtered SNPs with minimum of imputation info-score across studies < 0.5, I2 > 90%, or number of studies with association statistics < 2. The genome-wide and suggestive significance level was set to $\alpha_{gw} = 5 \times 10^{-8}$ and $\alpha_{sug} = 1 \times 10^{-6}$, respectively. SNPs were considered for down-stream analyses if reaching at least suggestive significance. A locus was defined as the set of associated SNPs in physical proximity (±500 kb) to the respective regional lead SNP. Pairwise LDs were calculated using data from 1000 Genomes Phase 3, Version 5 (2015) for European samples (18).
We annotated all variants reaching at least suggestive significance with the following bioinformatics resources: nearby Ensembl genes (±250 kb) (62), variants reported in the GWAS Catalog in linkage disequilibrium (LD, r2 > 0.3) (63), and expression quantitative traits in LD (r2 > 0.3) (64–68). We then used the nearby genes and eQTL genes to test for pathway enrichments (retrieved from DOSE (69) and Reactome (70)).
Secondary analyses
Conditional and joint analyses
We used the tool GCTA (version 1.92.0beta3) (60) to test for secondary signals at each locus. First, we applied GCTA’s stepwise model selection algorithm (cojo-slct) to identify independent variants. In case of more than one independent variant, we performed conditional association analysis adjusting for the respective other independent variants (cojo-cond) (71), resulting in conditional statistics for all SNPs at the locus (fixed genomic range). As reference panel we used the genetic data of the combination of LIFE-Adult and LIFE-Heart (n = 13 369).
We looked up all independent SNPs in the CAP African American samples (before statin treatment) and checked for concordant effect direction. In addition, we analyzed the PCSK9 locus in African Americans in more detail and tested for ethnicity-specific causal variants.
CS analyses
For each independent variant, we performed a CS analysis to determine the likely causal SNPs within a region of ±500 kb of the respective SNP (72,73). In case of more than one independent signal per locus, we used the respective conditional statistics. The R package ‘gtx’ was used to derive Approximate Bayes Factors (ABF) from the (conditional) effect estimates and standard errors. Standard deviation priors were chosen per locus depending on the respective distribution of effect sizes (difference of the 97.5th and 2.5th percentiles divided by 2 × 1.96). The resulting priors varied between 0.007 (locus 19q13.41) and 0.014 (2p24.1). The ABFs were used to calculate the PP that a variant drives the association signal. We ordered variants by their PP and calculated for each SNP the cumulative posterior in descending order until 99% was achieved, resulting in a set of SNPs containing the causal variant with 99% certainty.
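The credible set construction described above can be sketched in a few lines of Python. The sketch uses Wakefield's approximate Bayes factor in favor of association and assumes a single causal variant per signal with equal prior probability for every SNP; it is an illustration with hypothetical SNP identifiers and values, not a reimplementation of the ‘gtx’ package.

```python
import math


def log_abf(beta, se, prior_sd):
    """Log of Wakefield's approximate Bayes factor in favor of association."""
    v, w = se**2, prior_sd**2
    z = beta / se
    return 0.5 * math.log(v / (v + w)) + 0.5 * z * z * w / (v + w)


def credible_set(snps, prior_sd, coverage=0.99):
    """snps: list of (snp_id, beta, se). Returns [(snp_id, PP), ...] in descending PP."""
    logs = [log_abf(b, s, prior_sd) for _, b, s in snps]
    top = max(logs)
    rel = [math.exp(value - top) for value in logs]  # rescaled to avoid overflow
    total = sum(rel)
    ranked = sorted(((r / total, snp[0]) for r, snp in zip(rel, snps)), reverse=True)
    selected, cumulative = [], 0.0
    for pp, snp_id in ranked:
        selected.append((snp_id, pp))
        cumulative += pp
        if cumulative >= coverage:  # stop once the requested coverage is reached
            break
    return selected


# Hypothetical conditional statistics for three SNPs at one locus
locus = [("rsA", 0.10, 0.01), ("rsB", 0.02, 0.01), ("rsC", 0.00, 0.01)]
for snp_id, pp in credible_set(locus, prior_sd=0.01):
    print(snp_id, round(pp, 4))
```

In this toy locus the first SNP carries almost all of the posterior mass, so the 99% set contains that SNP alone.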
We annotated the variants of the 99% CS with the above mentioned bioinformatics resources. We also added CADD scores as measure of deleteriousness (74) and considered variants with CADD > 15 and high PP as possible causal SNPs. A gene was considered a plausible candidate if containing such a likely causal variant.
Interaction analyses
Finally, we performed statin- and sex-stratified analyses for all SNPs within the 99% CSs. We compared effect sizes of SNPs by testing their differences against zero (t-test, prefiltering for SNPs associated at least on nominal level in one of the strata, followed by Benjamini & Hochberg FDR 5%) (75). For the sex-stratified analysis, we used data from LIFE and TwinGene (n = 5345 and n = 4880 for men and women, respectively). For the statin-stratified analysis, we used data from participants without statin treatment (subsets of LIFE-Heart, LIFE-Adult and LURIC; all TwinGene, n = 9623) and with statin medication (subsets of LIFE-Heart, and LIFE-Adult; n = 1589). Since CAP assessed PCSK9 levels prior and during statin treatment, we calculated associations with the respective differences in order to validate observed SNP × statin interactions (n = 563).
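A comparison of stratum-specific effects followed by FDR control can be sketched as below. The Python sketch makes two assumptions that go beyond the text: the strata are treated as independent, and the difference statistic is evaluated against a normal distribution rather than a t distribution. Function names and input values are hypothetical.

```python
import math


def difference_p(beta1, se1, beta2, se2):
    """Two-sided P-value for beta1 - beta2 = 0, assuming independent strata."""
    z = (beta1 - beta2) / math.sqrt(se1**2 + se2**2)
    return math.erfc(abs(z) / math.sqrt(2))


def benjamini_hochberg(pvals):
    """Benjamini-Hochberg adjusted P-values, returned in the input order."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):  # step down from the largest P-value
        i = order[rank - 1]
        running_min = min(running_min, pvals[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical effects (beta, se) in men and women for three SNPs
men = [(0.20, 0.05), (0.10, 0.04), (0.05, 0.03)]
women = [(0.00, 0.05), (0.09, 0.04), (0.04, 0.03)]
pvals = [difference_p(bm, sm, bw, sw) for (bm, sm), (bw, sw) in zip(men, women)]
significant = [q <= 0.05 for q in benjamini_hochberg(pvals)]
print(pvals, significant)
```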
Co-localization analyses
For each independent locus, we performed a pairwise co-localization test (76) between our GWAS results and literature GWAS results for TC, LDL-C, high-density lipoprotein–cholesterol (HDL-C) and TG (19), cis-eQTLs (64) (GTEx v7), and CAD (20). In more detail, the co-localization method evaluates whether two trait associations share the same causal variant (76). Five hypotheses are tested (H0: no association with either trait; H1: association with trait 1, not with trait 2; H2: association with trait 2, not with trait 1; H3: association with trait 1 and 2, two independent SNPs; H4: association with trait 1 and 2, shared SNP). As threshold for co-localization a PP of ≥0.75 for H4 was applied. As before, we used all SNPs within the 500 kb window around the independent lead SNPs. For the loci 1p32.3 and 2p24.1, we used eQTL-statistics of the candidate genes only (PCSK9, and APOB) as determined by CS analysis. Since multiple independent variants were detected at the 1p32.3 locus, we performed co-localization analyses for each of them separately using conditional estimates. This allows identification of signals likely acting via eQTLs. For the loci 19p13.11 and 19q13.41, we analyzed all genes of our annotation for co-localizing eQTLs (SNPs in the 99% CSs with CADD > 10, genes nearby (±250 kb) or known cis effects). Therefore, we tested 40 genes at 19p13.11, and 78 genes at 19q13.41.
Heritability, genetic correlation and look-up of lipid loci
We estimated the heritability of PCSK9 in three modes: using only the genome-wide significant and independent SNPs (n = 6 SNPs), using SNPs with FDR < 1% (n = 15 SNPs, pairwise LD r2 < 0.1), and using the online-tool LDHub (24) (n = 1 085 662 SNPs). We repeated this in the statin-free set (n = 5, n = 13, and n = 1 087 910 SNPs in the three modes, respectively). In addition, we estimated the genetic correlation (25) of PCSK9 with lipid traits (26) and metabolites (27) and we checked known lipid loci for association with PCSK9 levels. For this analysis, we searched the GWAS Catalog (63) for loci associated with TC (trait ID in the experimental factor ontology: EFO_0004574), LDL-C (EFO_0004611), HDL-C (EFO_0004612) and TG (EFO_0004530). We downloaded all variants reported for these four traits (download date 23.01.2020) and excluded those not achieving genome-wide significance and duplicates (m = 2554 SNPs after filtering). Due to filter criteria in the meta-analysis, no PCSK9 statistics were available for 458 of these SNPs.
Mendelian randomization
By MR analyses, we aimed to determine the causal effect of PCSK9 on CAD. Moreover, we distinguished between a direct effect and an indirect effect mediated by LDL-C, as suggested by (28) (see Supplementary Material, Fig. S8 for a graphical visualization). We first performed a bidirectional MR of PCSK9 and LDL-C levels in order to rule out a possible reverse causality.
As instruments for PCSK9, we used the three independent variants of PCSK9 (G1). For LDL-C, we used 24 LDL-C associated SNPs not associated with PCSK9 levels (G2). For sensitivity, we repeated the analysis with three SNPs explaining >0.5% of LDL-C variance each and using the summary statistics of the statin-free subset. Summary statistics for LDL-C and CAD were obtained from Surakka et al. (19) and Nikpay et al. (20), respectively.
Thus, we estimated four causal effects, $\alpha = \hat{\beta}_{IVW}(\mathrm{PCSK9} \to \mathrm{LDL\text{-}C})$, $\beta = \hat{\beta}_{IVW}(\mathrm{LDL\text{-}C} \to \mathrm{CAD})$, $\gamma = \hat{\beta}_{IVW}(\mathrm{LDL\text{-}C} \to \mathrm{PCSK9})$ and $\tau = \hat{\beta}_{IVW}(\mathrm{PCSK9} \to \mathrm{CAD})$, using G1 and G2 as instruments by inverse-variance weighting as implemented in the R package ‘MendelianRandomization’.
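To make the inverse-variance weighting concrete, the Python sketch below computes a simple fixed-effect IVW estimate from summary statistics using first-order weights, which ignore the uncertainty of the SNP-exposure effects. It is a textbook formulation with hypothetical names and input values and does not reproduce the options of the ‘MendelianRandomization’ package.

```python
import math


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW causal estimate and standard error (first-order weights)."""
    numerator = sum(bx * by / sy**2
                    for bx, by, sy in zip(beta_exposure, beta_outcome, se_outcome))
    denominator = sum(bx**2 / sy**2
                      for bx, sy in zip(beta_exposure, se_outcome))
    return numerator / denominator, math.sqrt(1.0 / denominator)


# Hypothetical SNP effects on the exposure and on the outcome for three instruments
bx = [0.40, 0.10, 0.08]
by = [0.70, 0.18, 0.15]
sy = [0.05, 0.02, 0.02]
estimate, se = ivw_estimate(bx, by, sy)
print(f"IVW estimate = {estimate:.3f} (se {se:.3f})")
```

The estimate is equivalent to a weighted regression of the SNP-outcome effects on the SNP-exposure effects through the origin, with weights given by the inverse outcome variances.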
Acknowledgements
We thank Robert John Konrad and his team at Eli Lilly and Company for the PCSK9 measurements in serum samples from the TwinGene cohort.
We thank Sylvia Henger for data QC of LIFE-Adult and LIFE-Heart, Kay Olischer and Annegret Unger for technical assistance regarding LIFE-Heart, and Kerstin Wirkner for running the LIFE-Adult study center. We thank all study participants of the LIFE-Adult study whose personal dedication and commitment have made this project possible. LIFE-Adult genotyping (round 3) was done at the Cologne Center for Genomics (CCG, University of Cologne, Peter Nürnberg and Mohammad R. Toliat). For LIFE-Adult genotype imputation, compute infrastructure provided by ScaDS (Dresden/Leipzig Competence Center for Scalable Data Services and Solutions) at the Leipzig University Computing Centre was used.
We thank the LURIC study team who were either temporarily or permanently involved in patient recruitment as well as sample and data handling, in addition to the laboratory staff at the Ludwigshafen General Hospital and the Universities of Freiburg and Ulm, Germany.
Icons for the graphical abstract were taken from the Noun Project and created by Rflor, Made, DigitalShards, and Trevor Dsouza (CC BY 3.0 license).
Conflict of Interest statement
Winfried März is employed with SYNLAB Holding Deutschland GmbH and reports grants and personal fees from AMGEN, BASF, Sanofi, Siemens Diagnostics, Aegerion Pharmaceuticals, Astrazeneca, Danone Research, Numares, Pfizer, Hoffmann LaRoche; personal fees from MSD, Alexion; and grants from Abbott Diagnostics, all outside the submitted work.
Marcus Kleber received lecture fees from Bayer and SYNLAB outside the submitted work and is employed with SYNLAB Holding Deutschland GmbH.
Hubert Scharnagl received grants and personal fees from AMGEN, Sanofi, Abbott, numares, and Unilever, all outside the submitted work.
Funding
This work was supported by the Leducq Foundation [13CVD03 to TwinGene]; Swedish Research Council [12660 to TwinGene]; the Swedish Heart-Lung Foundation [201202729 to TwinGene]; means of the European Union, the European Regional Development Fund, funds of the Free State of Saxony within the framework of the excellence initiative [713-241202, 14505/2470, 14575/2470 to LIFE]; the Helmholtz Institute for Metabolic, Obesity and Vascular Research to LIFE-Adult; the German Federal Ministry of Education and Research within the framework of the e:Med research and funding concept [01ZX1906B to LIFE]; the 7th Framework Program of the European Union [201668 to LURIC, 305739 to LURIC]; National Institutes of Health [U01 HL069757 to CAP]; and a Merck Investigator Initiated Research Grant to CAP.