Genetic loci associated with renal function measures and chronic kidney disease in children: the Pediatric Investigation for Genetic Factors Linked with Renal Progression Consortium



ABSTRACT
Background. Chronic kidney disease (CKD) in children is characterized by rapid progression and a high incidence of end-stage renal disease and therefore constitutes an important health problem. While unbiased genetic screens have identified common risk variants influencing renal function and CKD in adults, the presence and identity of such variants in pediatric CKD are unknown. Methods. The international Pediatric Investigation for Genetic Factors Linked with Renal Progression (PediGFR) Consortium comprises three pediatric CKD cohorts: Chronic Kidney Disease in Children (CKiD), Effect of Strict Blood Pressure Control and ACE Inhibition on the Progression of CRF in Pediatric Patients (ESCAPE) and Cardiovascular Comorbidity in Children with CKD (4C). Clean genotype data from >10 million genotyped or imputed single-nucleotide polymorphisms (SNPs) were available for 1136 patients with measurements of serum creatinine at study enrolment. Genome-wide association studies were conducted to relate the SNPs to creatinine-based estimated glomerular filtration rate (eGFRcrea) and proteinuria (urinary albumin- or protein-to-creatinine ratio ≥300 and ≥500 mg/g, respectively). In addition, European-ancestry PediGFR patients (cases) were compared with 1347 European-ancestry children without kidney disease (controls) to identify genetic variants associated with the presence of CKD. Results. SNPs with suggestive association P-values <1×10⁻⁵ were identified in 10 regions for eGFRcrea, four regions for proteinuria and six regions for CKD, including some plausible biological candidates. No SNP was associated at genome-wide significance (P < 5×10⁻⁸). Investigation of the candidate genes for proteinuria in adults from the general population provided support for a region on chromosome 15 near RSL24D1/UNC13C/RAB27A.
Conversely, targeted investigation of genes harboring GFR-associated variants in adults from the general population did not reveal significantly associated SNPs in children with CKD. Conclusions. Our findings suggest that larger collaborative efforts will be needed to draw reliable conclusions about the presence and identity of common variants associated with eGFR, proteinuria and CKD in pediatric populations.
Keywords: pediatric chronic kidney disease, genetic epidemiology, genome-wide association study, glomerular filtration rate, proteinuria

INTRODUCTION
Chronic kidney disease (CKD) is an important global health problem [1]. Although uncommon in children [2][3][4], CKD in this population is associated with increased risk for poor growth, cardiovascular disease and mortality [5][6][7]. Despite the lower prevalence of CKD in children when compared with adults, a higher proportion of pediatric CKD patients progress to end-stage renal disease. The average life expectancy of children with renal replacement therapy (RRT) is 63 years for those receiving a transplant and only 38 years for those remaining on dialysis [8].
Thus, gaining insights into pathophysiological mechanisms that underlie the development and progression of pediatric CKD is of great importance. Previous studies in adults support a genetic component to CKD and kidney function that is also observed in individuals from the general population without clear monogenic causes of CKD [9][10][11]. Genome-wide association studies (GWAS) were used successfully to identify common genetic variants associated with measures of renal function and CKD in population-based studies [12][13][14][15][16][17] as well as with specific kidney diseases such as IgA nephropathy [18,19] and membranous nephropathy [20].
Previous GWAS of CKD and renal function measures were conducted among middle-aged or older adults. It is unclear whether common genetic variants also contribute to CKD risk in children. While monogenic causes account for a substantial fraction of pediatric CKD, it is conceivable that common genetic variants, each typically conferring small increases in risk, contribute to the development and/or progression of CKD. To date, no pediatric GWAS of renal function parameters have been conducted because of the lack of sufficiently large study populations. The Pediatric Investigation for Genetic Factors Linked with Renal Progression (PediGFR) Consortium has now assembled several pediatric CKD cohorts. Here we report on GWAS of baseline renal function parameters, namely glomerular filtration rate (GFR) and proteinuria, and of CKD, using data from 1136 PediGFR participants.

MATERIALS AND METHODS
The PediGFR Consortium is an international collaborative effort to identify genetic risk factors for kidney disease. The Chronic Kidney Disease in Children (CKiD) prospective cohort study [21] enrolled 540 children aged 1-16 years with an estimated GFR (eGFR) of 30-75 mL/min/1.73 m², 440 of whom were genotyped. The Effect of Strict Blood Pressure Control and ACE Inhibition on the Progression of CRF in Pediatric Patients (ESCAPE) trial recruited 468 children. Of these, 315 children aged 3-18 years and with an eGFR of 15-80 mL/min/1.73 m² at the baseline visit were genotyped [22]. The Cardiovascular Comorbidity in Children with CKD (4C) prospective study enrolled 705 children aged 6-17 years with an initial eGFR of 10-45 mL/min/1.73 m² [23], 691 of whom were genotyped.
Publicly available data from the Study of the Genetic Causes of Complex Pediatric Disorders [24] from the Children's Hospital of Philadelphia (CHOP) were obtained [database of Genotypes and Phenotypes (dbGAP) accession number phs000490.v1.p1] and used as an external control population (interquartile age range 6-17 years). Patients with kidney-related diseases (n = 41) were excluded based on primary diagnosis International Classification of Diseases, Ninth Revision codes (580-594, 598, 403, 404, 250.4 and 753).
For the case-control analyses, all PediGFR patients were treated as CKD cases, while CHOP children without kidney disease were treated as controls.
Using the Illumina Infinium 2.5M-8 microarray, 1450 PediGFR participants were successfully genotyped at ∼2.4 million single-nucleotide polymorphism (SNP) markers. Data cleaning was performed according to standard protocols [26] separately for ESCAPE/4C and for CKiD and resulted in the exclusion of 84 samples (Supplementary data, Figure S1).
A flow chart outlining data preparation for the analyses using the external control population is presented in the Supplementary data (Figure S3). Control samples had been genotyped on the Illumina Human 610-Quad v1 (n = 1480) and the Illumina Human Hap550 v3 arrays (n = 1151). Genotype datasets of only European-ancestry PediGFR cases and CHOP controls were merged. Based on the top 20 calculated principal components, 2005 individuals of similar genetic ancestry were retained for analyses (658 cases, 1347 controls; Supplementary data, Figure S4). Because of the different genotyping arrays and data sources, stringent data cleaning was performed prior to imputation, removing SNPs with minor allele frequency (MAF) <1%, Hardy-Weinberg equilibrium (HWE) P-value <0.01 (in cases and controls separately) and call rate <95%. Imputation of the combined case-control dataset with the same settings as outlined above resulted in 8.3 million high-quality SNPs (information measure ≥0.8).
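The pre-imputation SNP filter described above can be illustrated with a minimal Python sketch. It is not the pipeline used in the study: the function names are hypothetical, genotypes are assumed to be coded as 0/1/2 alternate-allele counts with None for missing calls, and HWE is checked with a simple 1-df chi-square test, which is an assumption because the text does not state which HWE test was applied. Only the three thresholds (MAF 1%, HWE P 0.01 in cases and controls separately, call rate 95%) come from the text.

```python
import math


def maf_and_call_rate(genotypes):
    """Return (minor allele frequency, call rate) for one SNP.

    genotypes: list of 0/1/2 alternate-allele counts, None for missing calls.
    """
    called = [g for g in genotypes if g is not None]
    call_rate = len(called) / len(genotypes)
    alt_freq = sum(called) / (2 * len(called))
    return min(alt_freq, 1 - alt_freq), call_rate


def hwe_chi2_pvalue(genotypes):
    """1-df chi-square HWE P-value (assumed test; an exact test is also common)."""
    called = [g for g in genotypes if g is not None]
    n = len(called)
    p = sum(called) / (2 * n)  # alternate-allele frequency
    q = 1 - p
    expected = [n * q * q, 2 * n * p * q, n * p * p]
    observed = [called.count(0), called.count(1), called.count(2)]
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Survival function of a chi-square variable with 1 degree of freedom.
    return math.erfc(math.sqrt(chi2 / 2))


def passes_qc(cases, controls, min_maf=0.01, min_hwe_p=0.01, min_call_rate=0.95):
    """Apply the MAF, call-rate and HWE thresholds quoted in the text."""
    maf, call_rate = maf_and_call_rate(cases + controls)
    if maf < min_maf or call_rate < min_call_rate:
        return False
    # HWE is evaluated in cases and controls separately.
    return all(hwe_chi2_pvalue(group) >= min_hwe_p for group in (cases, controls))
```

In this sketch a SNP is dropped as soon as any one criterion fails; whether MAF and call rate were computed on the merged data or per array is not stated in the text, so pooling cases and controls for those two checks is likewise an assumption.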
Genome-wide association tests were performed using SNPtest v2.5 [32]. Within each of the five PediGFR groups, a linear regression model was used to relate continuous eGFR (ln transformed) to each SNP using an additive model and genotype dosages. The analysis for the binary outcome proteinuria among the PediGFR children was performed similarly using logistic regression. We adjusted for sex, age and any of the first 10 principal components associated with the analyzed phenotype (P < 0.05).
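As a rough illustration of the additive model on the ln scale, the following Python sketch regresses ln(eGFR) on genotype dosage for a single SNP. It is a simplification and not a reimplementation of SNPtest: the covariates used in the study (sex, age and principal components) are omitted for brevity, and the function name and return values are hypothetical. Because the outcome is ln-transformed, the slope translates into a relative change in eGFR per allele of exp(beta) − 1.

```python
import math


def additive_snp_test(dosages, egfr):
    """Unadjusted OLS of ln(eGFR) on allele dosage for one SNP.

    dosages: expected allele counts in [0, 2] (imputed values may be fractional).
    egfr:    eGFR values on the original scale (must be > 0).
    Returns (beta, standard error, relative eGFR change per allele).
    """
    y = [math.log(v) for v in egfr]
    n = len(y)
    mean_x = sum(dosages) / n
    mean_y = sum(y) / n
    sxx = sum((x - mean_x) ** 2 for x in dosages)
    sxy = sum((x - mean_x) * (yi - mean_y) for x, yi in zip(dosages, y))
    beta = sxy / sxx
    alpha = mean_y - beta * mean_x
    rss = sum((yi - alpha - beta * x) ** 2 for x, yi in zip(dosages, y))
    se = math.sqrt(rss / (n - 2) / sxx)  # residual variance on n - 2 df
    return beta, se, math.exp(beta) - 1
```

For example, a slope of ln(0.9) corresponds to a 10% lower eGFR per additional copy of the coded allele.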
A fixed-effects inverse-variance weighted meta-analysis combined estimates across the five groups using the software GWAMA [33]. Only SNPs with MAF >1% and present in >50 individuals in each group were meta-analyzed. Data were combined over all five groups (n = 1136) and separately for individuals of central European ancestry (n = 744) and Turkish ancestry (n = 397). Genomic control was applied at the individual study level and post meta-analysis [34]. A P-value threshold <1×10⁻⁵ was used to indicate suggestive signals and <5×10⁻⁸ to indicate genome-wide significance.
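The two statistical steps named above, fixed-effects inverse-variance weighting and genomic control, follow standard formulas that can be sketched in a few lines of Python. This is an illustration under stated assumptions rather than GWAMA's implementation: the function names are hypothetical, P-values are two-sided and based on a normal approximation, and the inflation factor λ is taken as the median observed 1-df chi-square statistic divided by its expected median (about 0.455).

```python
import math
from statistics import NormalDist, median

_STD_NORMAL = NormalDist()
# Median of a 1-df chi-square distribution, approximately 0.455.
CHI2_1DF_MEDIAN = _STD_NORMAL.inv_cdf(0.75) ** 2


def fixed_effects_meta(betas, ses):
    """Inverse-variance weighted pooled estimate across groups for one SNP."""
    weights = [1 / se ** 2 for se in ses]
    beta = sum(w * b for w, b in zip(weights, betas)) / sum(weights)
    se = math.sqrt(1 / sum(weights))
    p_value = math.erfc(abs(beta / se) / math.sqrt(2))  # two-sided normal P
    return beta, se, p_value


def genomic_inflation(p_values):
    """Lambda: median observed chi-square divided by its expected median."""
    chi2 = [_STD_NORMAL.inv_cdf(p / 2) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


def genomic_control_se(se, lam):
    """Inflate a standard error by sqrt(lambda); left unchanged if lambda <= 1."""
    return se * math.sqrt(max(lam, 1.0))
```

Under the null hypothesis, P-values are uniformly distributed and λ is close to 1, which is the benchmark against which the reported values of 1.00 to 1.03 can be read.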
For the comparison of PediGFR cases with external controls, power calculations using Quanto 1.2.4 showed that, for a discovery subset of 658 cases and 2 controls per case, there was at least 80% power to detect a CKD odds ratio of 1.87 (1.57) or higher at genome-wide significance at a MAF of 10% (30%), assuming a population prevalence of pediatric CKD of 0.01%. A logistic regression model was used to relate CKD or proteinuria case status to each SNP using an additive model with covariates as described above.
Experimental follow-up was attempted for novel genome-wide significant signals and genes in loci suggestively associated with the clinically important phenotype proteinuria. For genes in the linkage disequilibrium (LD) block of the index SNP, relevant orthologs in Danio rerio, including all paralogs, were identified. All animal work was carried out according to relevant national and international guidelines. Zebrafish of the strain AB/TL wild-type were maintained, and the embryos were staged as previously described [35].
Zebrafish orthologs were identified for clmp, gramd1b, rab27a and rsl24d1. SAMD3 had no ortholog; the extended LD block containing FAM151B contained too many genes to make experimental follow-up feasible. Amplification was carried out from zebrafish cDNA with specific primers, cloned into TOPO (Invitrogen) and linearized with corresponding restriction enzymes. Whole-mount in situ hybridization (WISH) analysis using digoxigenin-labeled probes was performed as described [36] using 4-Nitro blue tetrazolium chloride (blue) (Roche) as substrate.
In order to evaluate the presence of risk variants in genes previously associated with the complex traits CKD, eGFR, creatinine and/or cystatin C levels in adults, a list of 34 genes was assembled using the NHGRI GWAS Catalog [37] (Supplementary data, Table S1). For each of these genes plus a 5 kb flanking region, the SNP of MAF >5% with the lowest P-value was retrieved from the PediGFR meta-analysis data for the corresponding phenotype. To correct for multiple testing, a Bonferroni correction for the number of independent SNPs per gene (determined using a pruning procedure of the 1000 Genomes Project EUR data; r² <0.2, window size 50 bp, offset 5 bp) and the number of investigated genes was used.
We further investigated the genes implicated by the PediGFR data in the Chronic Renal Insufficiency Cohort (CRIC) study [38], a study of adult patients with CKD. Data were acquired from dbGAP (accession number phs000524.v1.p1). PediGFR candidate genes were queried in CRIC baseline eGFR association data in the subsample of patients of European ancestry without diabetes (n = 926).
In addition, we queried PediGFR proteinuria-associated regions for their association with the UACR in data from up to 54 450 mostly population-based adults studied in the CKDGen Consortium ([14] and personal communication). Significance thresholds were derived based on the number of independent SNPs in the region as described above. Lastly, we investigated the expression of these genes in publicly available data from freshly isolated murine podocytes and other glomerular cells [39]. Differences in gene expression between podocytes and other glomerular cells were corrected for transcriptome-wide multiple testing and additionally for the number of regions investigated (n = 4).
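One possible reading of this correction is a per-gene threshold equal to the nominal alpha divided by the product of the number of independent SNPs in that gene and the number of genes tested. The Python sketch below encodes that reading only. The exact formula is not given in the text, so the product form, the nominal alpha of 0.05 and the function names are assumptions, and the count of 10 independent SNPs in the usage note is purely hypothetical.

```python
def bonferroni_threshold(n_independent_snps, n_genes, alpha=0.05):
    """Per-gene significance threshold under the assumed product form."""
    return alpha / (n_independent_snps * n_genes)


def is_significant(p_value, n_independent_snps, n_genes=34, alpha=0.05):
    """Compare the lowest P-value in a gene region with its threshold.

    n_genes defaults to 34, the number of candidate genes stated in the text.
    """
    return p_value < bonferroni_threshold(n_independent_snps, n_genes, alpha)
```

With a hypothetical 10 independent SNPs in a gene, the threshold would be 0.05 / (10 × 34), or about 1.5×10⁻⁴, so a nominal P-value of 1×10⁻³ would not pass.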

RESULTS
Demographic parameters were similar across the three PediGFR cohorts (Table 1). Differences in baseline GFR, serum creatinine and cystatin C as well as the proportion of proteinuric children were consistent with differences in study-specific inclusion criteria.

GWAS of eGFR in PediGFR participants
Association analyses of eGFRcrea showed no evidence for systematic inflation of results in the individual groups (1.00 < λ < 1.02) or after meta-analysis (λ = 1.01; Figure 1A). While 10 genomic regions contained at least one SNP associated with eGFRcrea at P < 10⁻⁵, none of these loci reached genome-wide significance. Table 2 shows the SNP with the lowest P-value (index SNP) at each of the loci. Regional association plots of the associated regions are shown in the Supplementary data. To assess whether the association with creatinine-based eGFR reflected an association with kidney function rather than with creatinine metabolism, we first assessed the association of the index SNPs with an alternative estimate of GFR, eGFRcycr. Both effect sizes and directions of association were consistent (Supplementary data, Table S2). Second, measured iohexol GFR was available in a subset of CKiD study participants. Again, the effects of the index SNPs were consistent with those found in the main analysis (Figure 2), indicating that the use of eGFRcrea as the primary phenotype was appropriate. Allele frequencies of the index SNPs were similar across the five groups.
In the secondary meta-analyses of the three groups of central European ancestry and the two groups of Turkish ancestry separately, there were also no genome-wide significant findings for eGFRcrea (Supplementary data, Table S3).

GWAS of proteinuria in PediGFR
In association analyses comparing PediGFR children with proteinuria with PediGFR children without proteinuria, we observed no evidence for systematic inflation of study-specific or combined results (Figure 1B). Four genomic regions contained index SNPs of MAF ≥5% and P < 10⁻⁵, mapping in or near FAM151B, SAMD3, MIR4493/CLMP and RSL24D1/UNC13C/RAB27A (Table 2), but none of these associations reached genome-wide significance.
FIGURE 1: (A) Quantile-quantile (QQ) plots for the PediGFR eGFRcrea quantitative GWAS and (B) the PediGFR proteinuria GWAS. X-axis, expected −log10 P-value; Y-axis, observed −log10 P-value. Thin lines above and below the identity line indicate the 95% confidence interval.

GWAS of CKD and proteinuria comparing PediGFR cases with external pediatric controls
In order to test for the presence of common genetic variants associated with the presence of CKD or proteinuria, we compared PediGFR cases with external pediatric controls without kidney disease. There was no systematic inflation of results either for the comparison of PediGFR participants (CKD cases) versus CHOP controls (λ = 1.03) or for the comparison of PediGFR patients with proteinuria versus CHOP controls (λ = 1.00). Table 3 shows that while there were no genome-wide significant associations in either analysis, six regions contained index SNPs associated with CKD at P < 10⁻⁵ and 10 regions contained such SNPs for proteinuria.

Experimental evaluation of results
In lieu of suitable and sufficiently powered pediatric replication cohorts, we sought experimental evidence and examined kidney-specific expression during zebrafish development. Genes in regions suggestively associated with proteinuria were selected because of the clinical relevance of proteinuria in CKD progression. None of the selected genes showed specific glomerular expression during zebrafish development (Supplementary data, Figure S5), although general expression was observed for the genes clmp, gramd1b, rab27a and rsl24d1. Among other organ systems, rab27a showed expression in the pronephric tubule.

Candidate gene evaluation
Finally, we used evidence from studies of CKD or complex renal traits in adults to examine overlap with the loci implicated here. None of the 34 known risk genes for CKD or complex renal traits in adults contained SNPs significantly associated with the corresponding traits in the PediGFR or the PediGFR-CHOP datasets after correcting for multiple testing. Some nominally significant associations, such as rs11097415 in the SHROOM3 region and CKD, were observed (P = 7.3×10⁻⁴). In addition, we queried the 10 genes associated with eGFR in PediGFR (Table 2) in eGFR association results of the CRIC baseline data, a study of adult patients with CKD. Again, the search did not yield any significantly associated SNPs.
One of the four candidate regions for proteinuria, RSL24D1/UNC13C/RAB27A, contained SNPs significantly associated with UACR in the CKDGen dataset. The index SNP in the CKDGen Consortium was rs1528472 (P = 5×10⁻⁷). This variant falls into the same LD block as the PediGFR index variant rs76158983, which was not present in the HapMap-imputation-based CKDGen data; these data also did not contain a good proxy (r² > 0.8) for rs76158983. One of the genes, FAM151B, showed significantly higher expression in podocytes compared with other glomerular cells in the murine glomerular expression dataset (1.4-fold change, corrected P = 3×10⁻⁴).
FIGURE 2: Scatter plot comparing the effects of the index SNPs associated with eGFRcrea with their effects on iohexol-measured GFR (iGFR) in a subsample of CKiD patients with measured GFR. Black: regression line.
Supplementary data Table S5 summarizes results attempting to prioritize genes in the implicated regions through several bioinformatics resources that leverage information on gene expression and functional variant annotation as well as a literature search.

DISCUSSION
This study is the first GWAS meta-analysis based on data from 1136 children with CKD recruited in large international collaborations. Our analyses revealed suggestive associations of several common genetic variants with eGFR and proteinuria as well as with CKD status at study baseline. However, none of the associations reached genome-wide significance.
Multiple GWASs among adults from the general population identified common risk variants associated with measures of kidney function and CKD [12][13][14][15][16][17]40]. The variants identified in these studies confer modest effects, as evidenced by the large sample sizes required for their identification. For example, GWASs among ∼20 000 adults identified SNPs in four eGFR-associated regions [12], whereas almost 70 000 individuals were required to identify 25 such regions [13].
Consequently, genetic studies in 1136 children would only be able to identify common risk variants if they conferred larger risks than those observed in population-based adults. It is conceivable that a given susceptibility variant may exert a larger effect in children because additional factors, e.g. the presence of micro- and macro-vascular diseases and of lifestyle-related factors such as smoking and obesity, can be assumed to have a smaller impact. In support, a previous study of growth and height reported a larger effect of the same common variant in children compared with adults [41]. Our results do not support the presence of common variants of large effect associated with GFR or proteinuria in this sample of children, since our study had >85% power to detect, for a variant with MAF of 20%, a change in average GFR ≥16% or an odds ratio for CKD ≥1.7 per risk allele at genome-wide significance.
There are at least three potential explanations for the negative findings at genome-wide significance: first, common genetic susceptibility variants may be present in children but only confer small changes in risk, similar to adults. This is supported by published GWASs of other phenotypes in pediatric populations that reported mostly small effect sizes of significant and replicated results [24,42,43]. The discovery of such variants in children with CKD would require a larger study sample than PediGFR, and additional samples for replication. This represents a considerable challenge in light of the low incidence of pediatric CKD and the fact that PediGFR already represents the currently largest international effort collecting patients across continents for more than a decade.
A second explanation could be that the genetic architecture of CKD differs between children and adults, and that common genetic variants of moderate effect do not play an equally important role in children as they do in adults. The substantial fraction of children with CKD due to a single-gene mutation [44][45][46] may limit the role of common genetic variants that modify rather than determine disease. As GWASs, by design, do not allow for the detection of single-gene mutations, and because for many children in PediGFR, monogenic causes of CKD were not comprehensively evaluated, this explanation represents a hypothesis that can be addressed in the future once whole genome sequences are available.
Third, the general phenotypes studied in PediGFR arise as a result of different underlying diseases. It is plausible that this etiologic heterogeneity complicates the detection of cause-specific common variants of larger effect. The combined study of renal function parameters and CKD in all individuals was performed to increase statistical power for the discovery of variants that impact pathophysiological mechanisms irrespective of disease etiology. Evidence from studies in adults with CKD suggests that common variants of large effects indeed exist for specific renal diseases, for example, membranous nephropathy [20] or APOL1-related kidney disease [47]. Again, given the low incidence of pediatric CKD, subgroup analyses will be extremely challenging.
Since there were no regions associated at genome-wide significance and suggestive regions could not be replicated because of the lack of adequate studies, we refrain from detailed discussions of the genes mapping into suggestive regions. One locus that received some support through additional evidence is the one at RSL24D1/UNC13C/RAB27A on chromosome 15 that showed an association with proteinuria in CKD. This region was significantly associated with a closely related phenotype, UACR, in population-based data from the CKDGen Consortium.
The strengths of our study are the large international collection of patients and the careful and comprehensive phenotypic characterization, including the availability of several renal function markers as well as measured GFR in a subset of the participants. Data cleaning and analysis were carried out using state-of-the-art methods, which together with the public posting of the genotype data facilitates reproducibility and allows for future pooling with data from additional cohorts. The consortium assembled data from understudied ethnicities such as the Turkish population, where the Chronic Renal Disease in Turkey (CREDIT) study estimated that the prevalence of pediatric CKD stages 3-5 was 2600 per million age-related population as opposed to 68.9 in Western Europe [48].
A number of limitations deserve acknowledgement. The study did not have sufficient statistical power to detect genetic risk variants that confer only moderate changes in risk. As such, it is not possible to draw definite conclusions about the presence of such variants, the genetic architecture of CKD in children when compared with adults or their effect on specific CKD etiologies. Another important limitation is the lack of suitable external replication cohorts. Finally, because of the study inclusion criteria, the full spectrum of eGFR could not be examined. As the studies were designed with the objective to study CKD progression, GWAS of kidney function decline and incident end-stage renal disease should be performed once sufficient follow-up data have been collected.
In this first GWAS meta-analysis of pediatric CKD, we did not identify common genetic susceptibility variants significantly associated with GFR, proteinuria or pediatric CKD status, although several suggestive candidates emerged. Even larger efforts will be needed to draw reliable conclusions about the presence of such variants in pediatric populations.

CONFLICT OF INTEREST STATEMENT
The results presented in this paper have not been published previously in whole or part, except in abstract format. The authors declare no conflicts of interest.