A genome-wide association meta-analysis on apolipoprotein A-IV concentrations

Apolipoprotein A-IV (apoA-IV) is a major component of HDL and chylomicron particles and is involved in reverse cholesterol transport. It is an early marker of impaired renal function. We aimed to identify genetic loci associated with apoA-IV concentrations and to investigate relationships with known susceptibility loci for kidney function and lipids. A genome-wide association meta-analysis on apoA-IV concentrations was conducted in five population-based cohorts (n = 13,813) followed by two additional replication studies (n = 2,267) including approximately 10 M SNPs. Three independent SNPs from two genomic regions were significantly associated with apoA-IV concentrations: rs1729407 near APOA4 (P = 6.77 × 10^-44), rs5104 in APOA4 (P = 1.79 × 10^-24) and rs4241819 in KLKB1 (P = 5.6 × 10^-14). Additionally, a look-up of the replicated SNPs in downloadable GWAS meta-analysis results was performed on kidney function (defined by eGFR), HDL-cholesterol and triglycerides. Of these three SNPs, only rs1729407 showed an association with HDL-cholesterol (P = 7.1 × 10^-07). Moreover, weighted SNP-scores were built from known susceptibility loci for the aforementioned traits (53, 70 and 38 SNPs, respectively) and tested for association with apoA-IV concentrations. This analysis revealed a significant inverse association of kidney function with apoA-IV concentrations (P = 5.5 × 10^-05). Furthermore, a higher number of triglyceride-increasing alleles was associated with lower apoA-IV concentrations (P = 0.0078). In summary, we identified two independent SNPs located in or next to the APOA4 gene and one SNP in KLKB1. The association of KLKB1 with apoA-IV suggests an involvement of apoA-IV in renal metabolism and/or an interaction within HDL particles. Analyses of SNP-scores indicate potential causal effects of kidney function and, to a lesser extent, triglycerides on apoA-IV concentrations.


Introduction
Apolipoprotein A-IV (apoA-IV) is an antioxidative glycoprotein that is synthesized primarily in the intestine and to a lesser extent in the liver (1,2). It is secreted into the lymph as a structural protein of chylomicrons, very-low-density lipoproteins and high-density lipoproteins, and it participates in reverse cholesterol transport (3,4). Consequently, it plays an important role in relieving peripheral cells of an overload of cholesterol (5,6). It has an effect on fat resorption and has been discussed as a satiety factor related to diet-induced obesity, at least in animal models (2). ApoA-IV shows antiatherogenic properties (7,8), and low concentrations were found to be associated with cardiovascular outcomes (9-12). Moreover, it acts as an early marker of impaired renal function and is a predictor of the progression of chronic kidney disease (13-16).
Knowledge of the genetic regulation of apoA-IV is limited. Heritability estimates derived from a family-based study in 119 nuclear families varied between 0% and 67%, depending on the underlying model (17). ApoA-IV is encoded by the APOA4 gene on chromosome 11. This gene is in close proximity to, and in linkage with, APOA5, APOC3 and APOA1. This gene region is often referred to as the APOA5-A4-C3-A1 gene cluster. There have been numerous candidate gene studies, which primarily evaluated the non-synonymous variants rs675 (T347S) and rs5110 (Q360H) with respect to, for example, the ability of apoA-IV to bind lipids and to promote cholesterol efflux from cells (18). Association results of these variants with plasma apoA-IV levels (19,20) as well as proposed associations with triglycerides (21-23) were contradictory. Variants in the APOA5-A4-C3-A1 gene cluster have also been found to be associated on a genome-wide scale with lipid phenotypes, primarily with triglyceride and HDL cholesterol (HDL-C) concentrations (24). Up to now, there have been no genome-wide association studies (GWAS) investigating apoA-IV concentrations.
The aim of the present study was to identify gene loci that are associated with apoA-IV concentrations based on a hypothesis-free approach. We conducted a genome-wide association meta-analysis using data from five population-based studies followed by a replication step in two additional studies. We also performed gene-based and pathway analyses to shed new light on the functional role of the identified genes and/or apoA-IV. Since the information on the heritability of apoA-IV is limited, we conducted a polygenic analysis to calculate the heritability of apoA-IV concentrations as well as the proportion of phenotypic variance explained by the single nucleotide polymorphisms (SNPs). ApoA-IV is known to be associated with kidney function and lipid phenotypes. Therefore, we also performed look-ups in and from the respective GWAS to elucidate possible causal relationships.

Description of cohorts & quality control
Five studies contributed to the discovery stage (n=13,813) and 2 additional studies to the replication phase (n=2,267) (Table 2).

GWA discovery stage
The GWA meta-analysis (stage 1) resulted in two genome-wide significant gene regions (Manhattan plot shown in Figure 2, QQ-plot shown in Supplementary Figure 3). In a broad region surrounding the APOA4 gene, 423 SNPs reached genome-wide significance, with the lowest p-value for SNP rs1729407 (p=6.00 × 10^-40, Figure 3). Additionally, 64 genome-wide significant SNPs in the gene region around the KLKB1 gene on chromosome 4 were identified (lowest p-value for SNP rs4241819: 1.08 × 10^-12, Figure 4). Furthermore, one locus on chromosome 5 (SOWAHA) reached our predefined level of significance sufficient for replication using the RE model by Han & Eskin (lowest p-value for SNP rs59698941: 3.76 × 10^-07, Supplementary Figure 4).
Besides the two SNPs rs1729407 and rs5104, the following missense variants were selected for replication: rs5110 (p=9.26 × 10^-07) and rs675 (p=0.0021). The latter was selected due to its wide use in the literature. For KLKB1 and SOWAHA, no additional SNP was added by applying the conditional analysis. One missense variant was selected within the SOWAHA gene region (rs2292030, p=9.24 × 10^-5). In the KLKB1 region, the lead SNP rs4241819 and the missense variant rs3733402 that were selected for replication were not accessible to iPLEX genotyping. Therefore, a proxy SNP (rs4253311) in high linkage disequilibrium (LD) with both the lead SNP and the missense variant was chosen for replication (p=1.43 × 10^-11, r^2=0.932 with rs4241819, r^2=0.994 with rs3733402, based on 1000 Genomes phase 3 v5; see also Figure 4 for graphical display of LD between the SNPs). Characteristics of all selected SNPs can be found in Supplementary Table 3.

Replication stage and combined analysis
Altogether, 7 SNPs were genotyped in the two replication studies. The single study results for these SNPs are given in Supplementary Table 4. Of these, 3 SNPs reached a false-discovery rate less than 0.05 at the replication stage and a genome-wide significance level after inclusion of all 7 studies (discovery stage + replication stage, Table 1): rs1729407 near APOA4 (p=6.77 × 10^-44), rs5104 in APOA4 (p=1.79 × 10^-24) and rs4241819 (using rs4253311 as proxy in the replication studies) in KLKB1 (p=5.63 × 10^-14). For these 3 SNPs, effect directions were identical in all studies. For each copy of the minor allele of rs1729407, apoA-IV concentrations decrease by 0.2645 mg/dl. Each minor allele of rs5104 also decreases apoA-IV concentrations by 0.2526 mg/dl. In a joint analysis including both SNPs, both SNPs remain significant (p=2.66 × 10^-25 for rs1729407, p=4.01 × 10^-08 for rs5104) with slightly smaller effect estimates (β=0.2041 for rs1729407 and β=0.1455 for rs5104). The minor allele of SNP rs4241819/rs4253311 in KLKB1 increases apoA-IV concentrations by 0.1469 mg/dl.

Effects in men and women
GWAS stratified for men and women did not reveal any additional genome-wide significant SNPs outside the wider APOA4 gene region (Supplementary Figures 7 and 8). There was also no genome-wide significant SNP-gender interaction effect (Supplementary Figure 9).

Gene-based and Pathway analyses
The gene-based association scan resulted in 18 significant genes, all of which are located either in the broad APOA4 or KLKB1 gene regions (Table 2). The pathway analysis revealed 15 gene sets to be significantly enriched with susceptibility genes, including expected lipid transport and lipoprotein metabolism pathways as well as some additional liver-related pathways (Supplementary Table 5).

Look-up in other GWA meta-analysis consortia
We looked up the two replicated SNPs in the APOA4 gene region (rs1729407, rs5104), the replicated SNP in KLKB1 (rs4241819), its proxy used in the replication step (rs4253311) and the correlated missense mutation (rs3733402) in the GWA meta-analysis results on kidney function, HDL-C and triglycerides. Two SNPs (rs1729407, rs4253311) were available in all GWAS consortia. Only one significant result was found: the apoA-IV lead SNP rs1729407 was associated with HDL-C with a p-value of 7.1 × 10^-07 (Table 3).
In addition, lead SNPs from the most recent HDL-C (n=70), triglyceride (n=38) and kidney function (n=53) GWAS were selected. The p-values of these partially overlapping 142 SNPs were retrieved from our GWA meta-analysis on log-transformed apoA-IV (Supplementary Tables 6-8). Only one SNP was significantly associated with apoA-IV, rs964184 in APOA1 (p=0.0001), which is included in the HDL-C as well as in the triglyceride SNP list.
The analyses based on the weighted genetic SNP-scores for kidney function, HDL-C and triglycerides in the combined KORA F3 and F4 dataset yielded two significant results. The weighted kidney function SNP-score was significantly and inversely associated with apoA-IV (p=5.5 × 10^-05). That means apoA-IV concentrations increased with an increasing number of GFR-decreasing alleles.
Furthermore, a greater number of triglyceride-increasing alleles was shown to be associated with lower apoA-IV concentrations (p=0.0078). The association with HDL-C SNPs was not significant but pointed in the opposite direction as expected: the more HDL-C-increasing alleles, the higher were apoA-IV concentrations (p=0.0554).

Discussion
This study revealed three major findings. First, using genome-wide data from five studies and two independent replication studies, we identified three independent SNPs from two genomic regions (APOA4 and KLKB1) that were significantly associated with apoA-IV concentrations. Second, approximately one third of the phenotypic variability of apoA-IV seems to be genetically regulated. Third, genetic variants that have a significant effect on kidney function and triglyceride concentrations suggest a causal role of these phenotypes on apoA-IV concentrations.

Genome-wide significant and replicated SNPs in APOA4 and KLKB1
Conditional stepwise regression analysis including all SNPs in the broad APOA4 gene region (a 1 Mb region including the APOA5-A4-C3-A1 cluster) led to the identification of two SNPs: the lead SNP rs1729407, located between APOA5 and APOA4, and one missense variant (rs5104). So far, the effect of rs5104 has been studied only in some small studies: it was associated with dyslipidemia in Han Chinese (25), with postprandial ApoA-I plasma concentration in healthy young men (26) and with triglyceride response to fenofibrate treatment (27). Conversely, no association between the lead SNP rs1729407 and any phenotype had been shown until now. The effect of other missense variants in APOA4 (rs675, rs5110), although widely studied before, could not be replicated. However, these previous studies were markedly smaller, showed contradictory results and investigated different inheritance models (19,20,23).
Neither of the two APOA4 top hits presents overt functional effects. The lead SNP rs1729407 is located in an intergenic region (Supplementary Figure 10A), while rs5104 causes a serine to asparagine substitution (Ser147Asn), which is classified as benign by common bioinformatics prediction tools. Of note, the lead SNP is in perfect LD (r^2=1) with a SNP located in a large cluster of transcription factor binding sites located approximately 1.5 kb downstream (rs1729405, p=9.92 × 10^-40 in our meta-analysis; Supplementary Figure 10B).
Besides the APOA4 gene, we also identified a locus on chromosome 4 encompassing the three genes CYP4V2, KLKB1 and F11. The top hit was in nearly perfect LD with the missense variant rs3733402 in KLKB1. KLKB1 encodes the glycoprotein plasma kallikrein (also known as "Fletcher factor" (28)), which acts as a proteolytic activator of several vasoactive and circulating peptides (kinins) (29,30).

F11 is a paralog of KLKB1 and codes for coagulation factor XI. Both are part of the intrinsic pathway (36). However, to our knowledge, a mechanism that clearly links apoA-IV to the kinin-kallikrein system or the intrinsic pathway has not been described so far. Therefore, replication and functional studies will be required to appraise the significance of this finding. The third gene in the locus, CYP4V2, is a nearly ubiquitously expressed omega-hydroxylase, with the phenotype of loss-of-function mutations being restricted to the eye (37,38) and causing a degenerative ocular disease.
Finally, gene-based analysis or pathway-based analysis did not reveal additional novel genes beyond those located in the genomic regions around APOA4 and KLKB1. Since the stepwise conditional analysis resulted in only two independent SNPs located at the APOA4 or KLKB1 loci, the observation in the gene-based analysis that multiple genes were significant in each locus could most likely be explained by LD.

Variance explained and heritability
Another aim of this study was the estimation of the heritability of apoA-IV as well as the variation of apoA-IV explained by all included additive-coded SNPs. Both genomic and narrow-sense heritability were calculated to be around 30%. Only a relatively small fraction of this is explained by the two gene regions we have identified, which leaves sufficient room for the discovery of other gene regions. In addition, the larger part of the variation in apoA-IV concentrations seems to be regulated by non-genetic factors.

SNP look-up using results from other GWAS consortia
Another aspect of this study was the look-up of the identified SNPs in other GWAS consortia. Since variants in the APOA5-A4-C3-A1 gene cluster have consistently been found to be associated with triglycerides and HDL-C (24,40,41), results from the most recent lipids-GWA meta-analysis (24) were used for this look-up. The lead APOA4-SNP rs1729407 showed an association with HDL-C (p=7.1 × 10^-07). However, this SNP seems to be independent from the lead SNP of the lipid-GWA within that gene region (rs964184, reported gene APOA1, r^2 with rs1729407 < 0.1), which had a p-value of 6.00 × 10^-48 in the GWAS on HDL-C and 7.00 × 10^-224 in the GWAS on triglycerides (24). SNP rs964184 has also been associated with coronary heart disease on a genome-wide scale, an association possibly driven by the strong association of rs964184 with triglyceride concentrations (42-44). In our analysis, rs964184 was also associated with apoA-IV (p=0.0001). However, this is far from genome-wide significance. Altogether, it seems that, despite being within the APOA5-A4-C3-A1 gene cluster, the SNPs associated with HDL-C and triglycerides are statistically independent from the APOA4-SNPs associated with apoA-IV concentrations.
We also performed a look-up to check whether the SNPs detected in our apoA-IV GWAS were associated with kidney function, defined by eGFR, using data from the CKDGen consortium (45). This consortium was chosen because of the already known association of apoA-IV with kidney function and chronic kidney disease (13-16,46). However, none of the APOA4 and KLKB1 lead SNPs showed significant associations with eGFR.
We further applied a look-up approach the other way around: when we selected in total 142 unique SNPs that were retrieved from the kidney-and lipid-GWAS, no single SNP was associated with apoA-IV in our GWAS besides the aforementioned SNP in APOA1 (TG and HDL-C). However, taken together as weighted SNP-scores, the strongest associations with apoA-IV could be found for the kidney-SNP-score and still significant associations for the triglycerides-SNP-score. These results potentially support a possible causal effect of kidney function on apoA-IV concentrations. This might also be true for a potential causal effect of triglycerides on apoA-IV, but to a lesser extent.
So far, only a few studies have investigated the association between apoA-IV and triglyceride levels, and the results have been inconsistent: for example, no association could be found in the EARS study (1261 controls and 629 cases) (23), whereas a study conducted in 105 participants reported a significantly positive association between apoA-IV and triglyceride concentrations (47). This finding is contradictory to the direction of correlation we found using a triglyceride-increasing SNP-score.
As part of the HDL particle, apoA-IV plays a role as a mediator in reverse cholesterol transport (48). Some epidemiological studies also suggest an association of HDL-C with apoA-IV (23). However, a causal role of HDL-C on apoA-IV could be neither shown nor ruled out with our data. In Hanniman et al. (49), APOA4 knockout mice showed decreased HDL-C values, whereas overexpression of APOA4 led to an increase in HDL-C, which suggests a causal role of apoA-IV on HDL-C.

Conclusion
Using data from five population-based studies and two additional replication studies, two independent SNPs located in or next to the APOA4 gene and one SNP in the KLKB1 gene were significantly associated with apoA-IV levels. These two gene regions alone can explain only a small fraction of the genome-wide variance explained by SNPs, which we estimated to be roughly 30%. Therefore, a major part of apoA-IV variability is likely to be regulated by non-genetic factors. Analyses of SNP-scores indicate potential causal effects of kidney function and, to a lesser extent, triglycerides on apoA-IV concentrations.

Study design
The genome-wide SNP association analysis on apoA-IV is based on a two-stage design with a discovery stage and a replication stage (Figure 1). Genome-wide SNP arrays were available for 5 studies of European ancestry (n=13,813 in total). All independent SNPs and missense variants with a p-value below 1 × 10^-6 were taken forward to the replication stage. In addition, one non-synonymous SNP from the APOA4 gene that did not fulfill the p-value selection criteria was selected for replication (rs675), since it has been widely studied before (18). Altogether, 8 SNPs were then genotyped in both replication studies. A SNP was considered replicated if the following criteria were met: genome-wide significance (p < 5 × 10^-8) in the meta-analysis of all 7 studies within the discovery + replication stage (n=16,080 in total), direction of effects in the replication studies consistent with the discovery stage, and a false-discovery rate (FDR) (50) of less than 0.05 at the replication stage.
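The three replication criteria can be written as a simple predicate. The following is a minimal sketch and not the authors' code: it assumes that the FDR of reference (50) is the Benjamini-Hochberg procedure, and the function names `bh_adjust` and `is_replicated` as well as all example p-values and effect estimates are hypothetical. Only the two thresholds (5 × 10^-8 and 0.05) come from the text.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so that adjusted values stay monotone.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_replicated(p_combined, beta_discovery, beta_replication, q_replication,
                  gw_threshold=5e-8, fdr_threshold=0.05):
    """Apply the three criteria: combined genome-wide significance,
    consistent effect direction, and replication-stage FDR below 0.05."""
    same_direction = beta_discovery * beta_replication > 0
    return (p_combined < gw_threshold
            and same_direction
            and q_replication < fdr_threshold)


# Hypothetical replication-stage p-values for four SNPs.
q_values = bh_adjust([0.01, 0.04, 0.03, 0.005])
print(q_values)
print(is_replicated(1e-12, -0.26, -0.20, q_values[0]))
```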

GWAS discovery stage: study population, genotyping and imputation
Details on genotyping and imputation for each study can be found in Supplementary Table 2. The CoLaus study is a single-center, cross-sectional study including 6,182 Caucasian subjects aged 35 to 75 years from the city of Lausanne in Switzerland (51). From 5,435 participants, genotypes were imputed using the software minimac (52) and 1000 Genomes (phase 1, version 3), resulting in over 7 million SNPs after filtering. Full phenotype information as well as imputed genotypes are available for n=3,996 participants.
For the NHLBI Family Heart Study (FamHS), 1,200 families (∼6,000 individuals) were ascertained in 1992, half randomly sampled, half selected because of an excess of coronary heart disease (CHD) or risk factor abnormalities (53). Study participants belonging to the largest pedigrees were invited for a second clinical exam (2002-2004). GWAS analysis was undertaken for 4,135 European American subjects using Illumina arrays. SNP genotypes were subsequently imputed with the software MACH (version 1.0.16) (54) using 1000 Genomes phase 1 version 3 (55) as reference, leading to a total of ∼7.7 million SNPs after filtering. Both imputed genotype data and phenotype information were available for n=1,712 participants.

The KORA F3 study, conducted in the years 2004/05, is a population-based sample from the general population living in the region of Augsburg, Southern Germany, which has evolved from the WHO MONICA study (Monitoring of Trends and Determinants of Cardiovascular Disease). Genome-wide data are available for all participants (n=3,075 with complete phenotype information) based on Illumina Omni 2.5/Illumina Omni Express. The KORA F4 survey is an independent non-overlapping sample drawn from the same population in the years 2006/08 (n=2,926 with complete phenotype information).
Genome-wide data are available for all participants in the KORA F4 study (Affymetrix Axiom) (40,56). Both genome-wide genotype data have been imputed with the software IMPUTE using 1000

Replication stage: study population and de-novo genotyping
The Bruneck study is a prospective population-based survey designed to investigate the epidemiology and pathogenesis of atherosclerosis (58,59). The study population was recruited in 1990 as a sex-and age-stratified random sample of all inhabitants of Bruneck, Italy. The attendance rate was 93.6% with complete data in 919 subjects. An intensive phenotyping was done and follow-up data are available for a period of 25 years.

The SAPHIR study (Salzburg Atherosclerosis Prevention Program in subjects at High Individual Risk) is an observational study conducted in the years 1999-2002 involving 1,770 unrelated subjects from a healthy working population. Study participants were recruited by health-screening programs in companies in and around the Austrian city of Salzburg (60). Full phenotype and genotype information is available for n=1,454 participants.
In both studies, de novo genotyping was performed in a multiplex approach using the SEQUENOM MassArray platform and iPLEX Gold chemistry. In the Bruneck study, full phenotype and genotype information is available for n=802 participants.

Measurement of apoA-IV
For all participating studies, quantification of plasma apoA-IV was done in the same laboratory (Division of Genetic Epidemiology, Medical University of Innsbruck, Austria). It was based on a double-antibody enzyme-linked immunosorbent assay using an affinity-purified polyclonal rabbit antihuman apoA-IV antibody for coating and the same antibody coupled to horseradish peroxidase for detection. Plasma with a known concentration of apoA-IV was used as the calibration standard (61).
Four control sera with different concentrations were run on each plate in double measurements for control purposes throughout the entire project. The intra-and interassay coefficients of variation were 2.7% and 6.0%, respectively (61).

GWAS analysis of single studies & discovery stage meta-analysis
An overview of the quality control and meta-analysis workflow in the discovery stage is given in  Figure 2). In each study, each SNP was associated with log-transformed apoA-IV concentrations in an additive genetic model using linear regression, adjusted for age and sex.
Additionally, linear regression was performed on the untransformed apoA-IV levels to obtain interpretable effect estimates. Since women have slightly lower apoA-IV levels than men, gender-stratified models have also been applied in all studies (62). Genome-wide analysis in the FamHS study was performed using a linear mixed model accounting for familial dependencies described by a pedigree-based kinship matrix.
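To make the per-SNP model concrete, the sketch below fits an ordinary least-squares regression of log-transformed apoA-IV on allele dosage (additive coding), age and sex, and returns the effect estimate and standard error for the SNP. It is an illustration under stated assumptions rather than the software used in the studies: the helper names `solve` and `snp_effect` are hypothetical, the input data are invented, and the kinship-based mixed model applied in FamHS is not covered.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gauss-Jordan elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(n):
            if r != col:
                f = m[r][col] / m[col][col]
                m[r] = [x - f * y for x, y in zip(m[r], m[col])]
    return [m[i][n] / m[i][i] for i in range(n)]


def snp_effect(dosage, age, sex, apoa4):
    """OLS of log(apoA-IV) on dosage, age and sex; returns (beta, se) for dosage."""
    y = [math.log(v) for v in apoa4]
    rows = [[1.0, d, a, s] for d, a, s in zip(dosage, age, sex)]
    k = 4
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    beta = solve(xtx, xty)
    resid = [yi - sum(b * x for b, x in zip(beta, r)) for r, yi in zip(rows, y)]
    sigma2 = sum(e * e for e in resid) / (len(y) - k)
    # Var(beta_dosage) = sigma2 * [(X'X)^-1]_11; obtain that entry by solving X'X v = e_1.
    v = solve(xtx, [0.0, 1.0, 0.0, 0.0])
    return beta[1], math.sqrt(sigma2 * v[1])


# Invented example data: eight individuals, sex coded 0/1.
dosage = [0, 1, 2, 0, 1, 2, 1, 0]
age = [35, 42, 58, 61, 47, 70, 53, 39]
sex = [0, 1, 0, 1, 1, 0, 0, 1]
apoa4 = [14.2, 15.1, 17.9, 16.0, 15.5, 19.3, 16.8, 13.9]
print(snp_effect(dosage, age, sex, apoa4))
```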
Quality control and filtering of SNPs was performed centrally and standardized by the Innsbruck study group using EasyQC (63). SNPs were only included in the analysis if they fulfilled the following criteria: imputation quality ≥ 0.4 (e.g. IMPUTE info), minor allele frequency ≥ 1% and a p-value of the HWE-test ≥ 1 × 10^-06. Additional analyses for quality control were applied on the already filtered datasets, which included a P-Z-plot (63) and calculation of the genomic inflation factor λ. The P-Z-plot compares the reported p-values from each study with the p-values calculated from Z-statistics derived from the reported beta coefficient and standard error.
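The filter and the two diagnostics can be illustrated in a few lines. This is a hedged sketch, not EasyQC itself: the function names are hypothetical, the P-Z check assumes a two-sided normal (Wald) test of beta/SE, and λ is computed as the median observed 1-df chi-square statistic divided by its expected median under the null (about 0.4549).

```python
import math
import statistics


def passes_qc(info, maf, hwe_p):
    """SNP filter with the thresholds stated in the text."""
    return info >= 0.4 and maf >= 0.01 and hwe_p >= 1e-6


def p_from_z(beta, se):
    """Two-sided normal p-value recomputed from beta / se (for the P-Z comparison)."""
    return math.erfc(abs(beta / se) / math.sqrt(2))


def genomic_inflation(pvalues):
    """Lambda: median observed chi-square (1 df) over the expected null median."""
    nd = statistics.NormalDist()
    # By symmetry, the squared normal quantile at p/2 is the 1-df chi-square statistic.
    chi2 = [nd.inv_cdf(p / 2) ** 2 for p in pvalues]
    return statistics.median(chi2) / 0.4549364


# Invented example: a reported p-value should sit close to the recomputed one.
reported_p = 0.0123
recomputed_p = p_from_z(beta=0.05, se=0.02)
print(passes_qc(info=0.85, maf=0.12, hwe_p=0.4), reported_p, recomputed_p)
```

A study whose reported and recomputed p-values diverge systematically in such a comparison would typically point to a problem with the reported test statistic or standard errors.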
For the meta-analysis over all GWAS studies, METASOFT (64) was used for all imputed SNPs that met imputation and quality control criteria. SNPs were only included in the meta-analysis if they were available in 3 or more studies. Based on the heterogeneity between studies for each SNP, a fixed effects (FE) model or the optimized random effects (RE) model proposed by Han and Eskin (64) was used as implemented in METASOFT. The test statistic for this optimized RE model is partitioned into a mean effects part and a heterogeneity part. To give higher weights to the mean effects, this RE model was only used when the test statistic for the mean effects part was higher than that for the heterogeneity part and if the test for heterogeneity was significant (p-value of the Q statistic < 0.1 and I² ≥ 50%). The test statistics were corrected for genomic inflation in both the GWAS analysis stage and the meta-analysis stage. Based on the gender-stratified analyses, a t-test on effect differences between men and women was performed (62). All regional plots presenting the p-values and LD between SNPs in predefined genomic regions were done using LocusZoom (65).
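The heterogeneity part of this decision rule can be sketched as follows. The code implements a standard inverse-variance fixed-effects meta-analysis together with Cochran's Q and I²; it does not implement the Han and Eskin random-effects statistic or the comparison of its mean-effects and heterogeneity parts, which is left to METASOFT. Function names and example numbers are hypothetical.

```python
import math


def chi2_sf(x, df):
    """Upper tail of the chi-square distribution for integer df (recurrence on df/2)."""
    if x <= 0:
        return 1.0
    y = x / 2.0
    if df % 2 == 0:
        a, q = 1.0, math.exp(-y)
    else:
        a, q = 0.5, math.erfc(math.sqrt(y))
    while a < df / 2.0:
        q += y ** a * math.exp(-y) / math.gamma(a + 1.0)
        a += 1.0
    return q


def fixed_effect_meta(betas, ses):
    """Inverse-variance fixed-effects estimate with Cochran's Q and I² (in percent)."""
    weights = [1.0 / se ** 2 for se in ses]
    wsum = sum(weights)
    beta = sum(w * b for w, b in zip(weights, betas)) / wsum
    se = math.sqrt(1.0 / wsum)
    q = sum(w * (b - beta) ** 2 for w, b in zip(weights, betas))
    df = len(betas) - 1
    i2 = max(0.0, (q - df) / q) * 100.0 if q > 0 else 0.0
    p = math.erfc(abs(beta / se) / math.sqrt(2))
    return {"beta": beta, "se": se, "p": p, "q": q, "q_p": chi2_sf(q, df), "i2": i2}


def heterogeneity_flag(result):
    """Heterogeneity criterion from the text: Q p-value < 0.1 and I² >= 50%."""
    return result["q_p"] < 0.1 and result["i2"] >= 50.0


# Hypothetical per-study estimates for one SNP in five discovery cohorts.
res = fixed_effect_meta([-0.25, -0.30, -0.22, -0.28, -0.26],
                        [0.04, 0.06, 0.05, 0.05, 0.07])
print(res, heterogeneity_flag(res))
```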

SNP selection for replication
To detect independently associated SNPs, a conditional stepwise analysis was performed using the program GCTA (version 1.24.7 (66)). For each locus with at least one p-value < 10^-6, the SNP with the lowest p-value at the discovery stage was taken as the lead SNP. All SNPs within a region of ±500 kb surrounding the lead SNP were included in the conditional analysis. GCTA uses the summary-level statistics of the meta-analysis plus one reference population for LD calculation. As reference population, a combined genotype dataset of KORA F3 and KORA F4 was used (n=6,001). By default, the lead SNP is included in the model. Then, all SNPs in the included gene region are tested for association in addition to the already included SNPs. Finally, all SNPs within a gene region with a p-value of < 10^-6 in the conditional analysis were taken forward for replication. Furthermore, all missense mutations with p-values of < 10^-6 were selected for replication, irrespective of possible LD with already selected SNPs.
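The first step of this procedure, picking a lead SNP per locus and collecting all SNPs within ±500 kb, can be sketched as below. The conditional analysis itself relies on GCTA and an LD reference and is not reproduced here. The function name `define_loci`, the SNP identifiers and the positions are hypothetical; only the p-value threshold and window size follow the text.

```python
def define_loci(snps, p_threshold=1e-6, window=500_000):
    """Greedy locus definition: the most significant unclaimed SNP below the
    threshold becomes a lead SNP and claims all SNPs within +/- window bp."""
    loci = []
    claimed = set()
    for lead in sorted(snps, key=lambda s: s["p"]):
        if lead["p"] >= p_threshold:
            break  # remaining SNPs are even less significant
        if lead["rsid"] in claimed:
            continue  # already part of a locus with a stronger lead SNP
        members = [s["rsid"] for s in snps
                   if s["chrom"] == lead["chrom"]
                   and abs(s["pos"] - lead["pos"]) <= window]
        claimed.update(members)
        loci.append({"lead": lead["rsid"], "members": members})
    return loci


# Hypothetical summary statistics (identifiers and positions are invented).
example = [
    {"rsid": "snp_a", "chrom": 11, "pos": 116_650_000, "p": 6e-40},
    {"rsid": "snp_b", "chrom": 11, "pos": 116_700_000, "p": 2e-20},
    {"rsid": "snp_c", "chrom": 11, "pos": 118_000_000, "p": 0.3},
    {"rsid": "snp_d", "chrom": 4, "pos": 187_150_000, "p": 1e-12},
    {"rsid": "snp_e", "chrom": 4, "pos": 187_400_000, "p": 4e-7},
]
for locus in define_loci(example):
    print(locus)
```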

Two-stage meta-analysis
All genotyped SNPs in the replication phase were meta-analyzed in both replication studies separately as well as in a combined analysis of all GWAS and replication-stage studies. Again, METASOFT (64) was used in the same way as in the first stage meta-analysis.

Gene-based test and Pathway analysis
In addition to the analysis of single SNP effects, a gene-based scan and a pathway analysis were performed using KGG version 3.5 (67). Gene regions were defined as the gene ± 20 kb according to the RefGene database. Using this definition, 66.35% of the available SNPs were included in the analysis. For the gene-based analysis, the extended Simes test (GATES) was used as implemented in KGG (68). To adjust for multiple testing, the Bonferroni method was applied on the number of tested genes. To calculate LD between the SNPs, the 1000G Phase 1 v3 Reference was used. All pathways that are available in the C2 curated gene set from GSEA (http://software.broadinstitute.org/gsea/msigdb/) were included in the pathway analysis. To test for enrichment of each pathway with significant genes, a hypergeometric test as implemented in KGG was used (69). To adjust for multiple testing, the Bonferroni method was applied on the number of pathways tested.
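The enrichment test can be illustrated with a generic one-sided hypergeometric test and a Bonferroni threshold. This is a sketch of the textbook test and may differ in detail from the KGG implementation; the function names and all counts in the example are hypothetical.

```python
from math import comb


def hypergeom_sf(k, population, successes, draws):
    """P(X >= k) for X ~ Hypergeometric(population, successes, draws).

    population: genes tested, successes: significant genes overall,
    draws: genes in the pathway, k: significant genes in the pathway.
    """
    upper = min(successes, draws)
    favorable = sum(comb(successes, i) * comb(population - successes, draws - i)
                    for i in range(k, upper + 1))
    return favorable / comb(population, draws)


def bonferroni_threshold(alpha, n_tests):
    """Per-test significance threshold under Bonferroni correction."""
    return alpha / n_tests


# Hypothetical counts: 20,000 genes tested, 18 significant, pathway of 40 genes,
# 4 of which are significant; 1,000 pathways tested.
p_enrich = hypergeom_sf(4, 20_000, 18, 40)
print(p_enrich, p_enrich < bonferroni_threshold(0.05, 1_000))
```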

Variance explained
The percentage of explained variance for the SNPs that were taken forward for replication was calculated in the SAPHIR study (n=1,465), as an independent replication cohort, as well as in a combined dataset of both KORA studies (n=6,001). The combined KORA dataset was also used to estimate the proportion of phenotypic variance explained by the regression on additively coded SNPs for a) all SNPs within the APOA4 and KLKB1 gene regions, defined as the lead SNPs ±500 kb, as in the conditional analysis, and b) all available genome-wide imputed SNPs. The latter has been denoted as the genomic heritability (70). Hence, this genomic heritability includes solely the variance attributable to the measured SNP effects. For these analyses, the software GCTA version 1.24.7 was used (66). In the FamHS study, an estimate of the proportion of the additive (polygenic) variance in the phenotypic variance, the narrow-sense heritability h^2, was obtained using GenABEL's polygenic function, taking the kinship matrix into account. This narrow-sense heritability thus also includes the variance explained by unmeasured SNPs and other variants (e.g. copy-number variations). All estimates for the explained variance and heritability refer to log-transformed values of apoA-IV.

SNP Look-up
We performed a look-up of our replicated SNPs in downloadable GWA meta-analysis results on kidney function (defined by eGFR) (71), HDL-C and triglycerides (24). We further looked up lead SNPs identified in these consortia in our apoA-IV GWA meta-analysis. 53 SNPs associated with kidney function, defined by eGFR, were derived from the CKDGen-GWA meta-analysis, and 70 SNPs associated with HDL-C and 38 with triglycerides from the GLGC-GWA meta-analysis. For these SNPs, their respective p-values from the log-transformed analysis on apoA-IV levels at the discovery stage were looked up. Altogether, 143 unique SNPs were included in this analysis, some of them involved in more than one phenotype (especially for HDL-C and triglycerides). Therefore, results are declared significant when the p-value is lower than 0.05/143 = 0.00035. Since the effect of single SNPs (and therefore the statistical power) is assumed to be low, we also used the imputed genotypes in both KORA studies to create SNP-scores. Weighting and direction of effects were based on the original publication from which the SNPs were derived. All SNPs were scaled in such a way that they are phenotype and/or risk increasing and weighted by the beta-estimate derived from the respective original study. These weighted genotype scores were then summed up to derive a genetic risk score for each of the phenotypes studied. For these analyses, a combined dataset of KORA F3 and KORA F4 was used (n=6,001). A mixed-effects model was fitted for this analysis, with study included as a random effect. Since three SNP-scores were evaluated, the significance threshold was set to 0.05/3=0.0167 for these analyses.
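The score construction and the two Bonferroni thresholds can be sketched as follows. This is an illustration under assumptions, not the authors' pipeline: the function name `weighted_score`, its `align_to_increasing` switch and the example dosages and betas are hypothetical, and the mixed-effects association model is not shown.

```python
def weighted_score(dosages, betas, align_to_increasing=True):
    """Weighted genetic score for one individual.

    dosages: per-SNP dosage of the reported effect allele (0 to 2, may be
             fractional for imputed genotypes).
    betas:   published effect estimates for that allele.
    Each SNP is re-coded so that the counted allele increases the trait
    (or decreases it when align_to_increasing is False), then weighted by |beta|.
    """
    score = 0.0
    for dosage, beta in zip(dosages, betas):
        flip = beta < 0 if align_to_increasing else beta > 0
        if flip:
            dosage = 2.0 - dosage
        score += abs(beta) * dosage
    return score


# Significance thresholds stated in the text.
single_snp_threshold = 0.05 / 143   # look-up of 143 unique SNPs
score_threshold = 0.05 / 3          # three SNP-scores

# Hypothetical individual with three SNPs.
print(weighted_score([2, 0, 1], [0.1, -0.2, 0.3]))
print(round(single_snp_threshold, 5), round(score_threshold, 4))
```

Aligning all alleles to one direction before summing means that a higher score always corresponds to a genetically higher (or, for a risk score, lower) level of the trait, which makes the sign of the association with apoA-IV directly interpretable.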

Acknowledgments / Funding
The measurements of apoA-IV were supported by a grant from the "Standortagentur Tirol" to Florian