Genome-wide association studies (GWAS) of urinary bladder cancer (UBC) have yielded common variants at 12 loci that associate with risk of the disease. We report here the results of a GWAS of UBC including 1670 UBC cases and 90 180 controls, followed by replication analysis in an additional 5266 UBC cases and 10 456 controls. We tested a dataset containing 34.2 million variants, generated by imputation based on whole-genome sequencing of 2230 Icelanders. Several correlated variants at 20p12, represented by rs62185668, show genome-wide significant association with UBC after combining discovery and replication results (OR = 1.19, P = 1.5 × 10−11 for rs62185668-A, minor allele frequency = 23.6%). The variants are located in a non-coding region approximately 300 kb upstream from the JAG1 gene, an important component of the Notch signaling pathways that may be oncogenic or tumor suppressive in several forms of cancer. Our results add to the growing number of UBC risk variants discovered through GWAS.

## INTRODUCTION

Worldwide, almost 400 000 individuals are diagnosed with urinary bladder cancer (UBC) and >150 000 patients die from the disease every year (1,2). In the United States, UBC is the 5th most common type of cancer with an estimated 72 500 new cases diagnosed and 15 200 deaths in 2013 (3). The male to female ratio of UBC is 3 : 1. Although in vitro studies and animal models support an essential role for the androgen receptor in UBC risk and progression, this gender difference has been largely attributed to historical differences in tobacco smoking and occupational exposure to aromatic and heterocyclic amines, the most important risk factors for UBC.

In addition to smoking and occupational exposures, genetic factors play a role in UBC risk. Population-based studies show that the risk of UBC is almost 2-fold the population average for the first-degree relatives of UBC patients (4,5) and the risk is even higher for relatives of young probands (4–6). Families with many cases of UBC are rare and segregation analyses suggest that the ‘no major gene’ model is the best one to describe the occurrence of UBC in these families (7). Taken together, these studies suggest that the genetic UBC risk may be largely conferred by many variants of moderate-to-low effect. Early genetic studies that focused on variants in carcinogen-metabolizing genes strongly suggested that NAT2 slow acetylator and GSTM1 null genotypes increase UBC risk (8). In addition, genome-wide association studies (GWAS) have yielded 12 UBC risk loci in populations of European ancestry (9–15). All the variants reported are common with risk allele frequencies >19% and exert a relatively weak effect on UBC risk (allelic odds ratios (ORs) <1.24).

In this study, we searched for additional variants that affect UBC risk in Iceland by performing GWAS on sequence variants identified through whole-genome sequencing of Icelanders and with follow-up of selected variants in >5000 UBC cases and 10 000 controls from 11 European populations.

## RESULTS

To search for sequence variants that associate with UBC, we performed a GWAS using information from the Icelandic Cancer Registry (ICR) on all UBC diagnoses in Iceland since 1955 and sequence variants identified through whole-genome sequencing of Icelanders. At the beginning of this study, 2230 Icelanders, including 22 UBC cases, had been whole-genome sequenced to an average sequencing depth of 22×. Approximately 34.2 million SNPs and small INDELs were identified in this effort and, using imputation assisted by long-range phased haplotypes, the genotype probabilities of all the sequence variants were estimated for 826 UBC cases genotyped with Illumina chips and for an additional 844 UBC cases who are close relatives of chip-typed individuals (16–18). The sequences of these 1670 UBC cases were compared with the sequences of 90 180 controls matched on genotype informativeness (see Materials and Methods). In this analysis, the marker showing the strongest association with UBC was the previously reported SNP rs10094872-T on 8q24.2 (P = 1.7 × 10−7, OR = 1.26) (13). The variants reported to associate with UBC at 2q37.1, 3q26.2, 3q28, 4p16, 5p15.3, 8p22, 8q24.21, 8q24.3, 11p15.5, 18q12.3, 19q12 and 22q13.1 all showed effect in the reported direction with P-values ranging from 0.06 to 1.7 × 10−7 except for rs10936599 at 3q26.2 (P = 0.68) and rs8102137 at 19q12 (P = 0.21) (Supplementary Material, Table S1). Since our dataset allows testing of variants, both common and rare, that were previously not covered by HapMap or commercial chips, we scrutinized previously reported GWAS loci for variants representing the association better than the reported SNPs or representing additional independent signals. At three of the loci, we observed variants that have either slightly stronger association in our data than the reported variants or coding variants that could potentially explain the observed association (Supplementary Material, Table S1). However, at none of the loci did we find evidence that the associated variants are tagging a rarer variant with higher risk, nor did we observe evidence of other independent association signals at any of the loci (P < 10−5).

Since no signal reached the threshold of genome-wide significance in our analysis (P < 2 × 10−9), we applied the following marker selection scheme for identification of new UBC signals (Supplementary Material, Fig. S1). SNPs at previously reported loci and all SNPs with minor allele frequency (MAF) < 0.005 were filtered out; of the remaining markers, we selected those with P-value of <1 × 10−5 and imputation information score >0.95 for further analysis. A total of 82 variants representing 19 loci fulfilled these criteria (Supplementary Material, Table S2). These 82 variants were tested for association with UBC, using in silico follow-up in a set of 1601 Dutch UBC cases and 1824 controls, genotyped with Illumina's Human Hap CNV370 chip and imputed using 1000 Genomes low-coverage pilot haplotypes (released June 2010) and the HapMap3 haplotypes (release #2 February 2009) (13). Twelve of the 19 loci had imputed SNPs that could be used for follow-up and two of those, represented by rs62185668 on 20p12.2 and rs10784282 on 12q14.1, showed suggestive association with UBC risk in the same direction as the Icelandic dataset (Supplementary Material, Table S2A). For the other 10 loci, the association signal did not replicate in the Dutch GWAS. Nine markers, representing 7 loci, were not present in the Dutch GWAS dataset and no good proxies were identified. These markers, along with rs62185668 and rs10784282 that showed suggestive results in silico, were genotyped with single SNP assays in UBC case–control sample sets from The Netherlands (2098 cases and 4545 controls—including the samples used in the GWAS) and Sweden (343 cases and 1264 controls) (Supplementary Material, Table S2B). Only rs62185668 on 20p12.2 showed suggestive association with UBC in both sample sets and this locus was selected for a further follow-up.
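The marker selection scheme described above can be illustrated with a minimal sketch. The `Variant` record, its field names and the toy values are hypothetical and are not taken from the study data; only the thresholds (MAF of at least 0.005, P < 1 × 10−5, imputation information score >0.95, exclusion of previously reported loci) come from the text.

```python
from dataclasses import dataclass


@dataclass
class Variant:
    """Hypothetical summary record for one tested variant."""
    name: str
    maf: float              # minor allele frequency
    p_value: float          # discovery association P-value
    info: float             # imputation information score
    at_reported_locus: bool  # True if the variant lies at a known UBC locus


def select_for_follow_up(variants, min_maf=0.005, max_p=1e-5, min_info=0.95):
    """Keep variants passing the filters described in the text."""
    return [
        v for v in variants
        if not v.at_reported_locus   # drop previously reported loci
        and v.maf >= min_maf         # drop variants with MAF < 0.005
        and v.p_value < max_p        # P-value below 1e-5
        and v.info > min_info        # imputation information above 0.95
    ]


# Toy input (made-up values, for illustration only).
toy = [
    Variant("var_a", maf=0.24, p_value=6e-6, info=0.99, at_reported_locus=False),
    Variant("var_b", maf=0.002, p_value=1e-7, info=0.98, at_reported_locus=False),
    Variant("var_c", maf=0.30, p_value=2e-7, info=0.99, at_reported_locus=True),
    Variant("var_d", maf=0.10, p_value=5e-6, info=0.90, at_reported_locus=False),
]
selected = [v.name for v in select_for_follow_up(toy)]
```

In this toy input only `var_a` passes all four filters; the others fail on MAF, prior reporting and imputation quality, respectively.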

In the Icelandic GWAS dataset, three markers at 20p12.2 show the strongest association with UBC; two single nucleotide variants rs62185668 and rs4813953 and an INDEL rs148953085 (4 bp deletion) (Fig. 1 and Table 2). Because none of the markers were present on the chips used for genotyping, all three variants were directly genotyped to verify the accuracy of the Icelandic imputation. Comparing the counts of the risk alleles of the three markers among 2633 individuals, the correlation between imputation and direct genotyping was >0.97 for all variants.
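As an illustration of this kind of concordance check, the sketch below computes the Pearson correlation between imputed risk-allele counts (expected dosages between 0 and 2) and directly genotyped counts. The text does not specify which correlation measure was used, so the choice of Pearson correlation is an assumption, and the six dosage values are invented for the example.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length numeric sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical data: imputed expected allele counts vs. genotyped counts.
imputed_dosage = [0.02, 0.98, 1.95, 0.10, 1.05, 0.00]
genotyped_count = [0, 1, 2, 0, 1, 0]
r = pearson_r(imputed_dosage, genotyped_count)
```

A value of `r` close to 1 indicates that the imputed dosages track the directly measured genotypes closely.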

Figure 1.

Regional association plot for the 20p12.2 locus. The figure shows the −log10 P-values (left vertical axis) of variant associations with UBC in the Icelandic discovery samples against their positions at the 20p12.2 locus. The purple circle highlights the most significant SNP, rs4813953, in the discovery analysis and circles corresponding to other SNPs are color coded to reflect their LD with rs4813953 estimated in the Icelandic dataset. The red line indicates recombination rates (right axis), based on the Icelandic recombination map for males and females (19) combined with the peaks indicating recombination hotspots defining LD blocks. Known genes in the region are shown underneath the plot, taken from the UCSC genes track in the UCSC genome browser. All positions are in NCBI Build 36 coordinates. The plot was created using a standalone version of the LocusZoom software (http://csg.sph.umich.edu/locuszoom/) (20).

The variants rs62185668, rs4813953 and rs148953085 were genotyped in a total of 5241 UBC cases and 10 456 controls from 11 European UBC case–control sample sets (Table 1). Using information on over 18 000 cases and controls from the 12 study populations directly genotyped for all three markers, we determined that rs62185668 and rs148953085 are highly correlated (r2 = 0.96) and can be assumed to represent the same association signal. For convenience, this signal will hereafter be referred to using rs62185668. The correlation between rs62185668 and rs4813953 is moderate (r2 = 0.50). The evidence for replication of the UBC associations in non-Icelandic populations was significant (OR = 1.17, P = 2.5 × 10−7 for rs62185668 and OR = 1.13, P = 6.2 × 10−6 for rs4813953) and showed no evidence of heterogeneity (Table 2). Combined with the Icelandic data, the overall association passed the threshold of genome-wide significance for both variants (rs62185668; OR = 1.19, P = 1.5 × 10−11 and rs4813953; OR = 1.16, P = 2.1 × 10−10) (Table 2).
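The pooling of replication estimates and the heterogeneity statistic can be approximated from the published summary data. The sketch below is not the Mantel–Haenszel model used in the study; it substitutes a fixed-effect inverse-variance meta-analysis on the log-OR scale, with standard errors recovered from the 95% confidence intervals reported for rs62185668[A] in the follow-up groups (Table 2), and computes Cochran's Q and I2. The function name and return structure are illustrative.

```python
import math

# (OR, lower 95% CI, upper 95% CI) for rs62185668[A], follow-up groups, Table 2.
FOLLOW_UP = [
    (1.13, 1.04, 1.23),  # The Netherlands
    (1.20, 1.00, 1.45),  # UK
    (1.33, 1.04, 1.69),  # Italy, Torino
    (1.03, 0.70, 1.52),  # Italy, Brescia
    (1.16, 0.86, 1.55),  # Belgium
    (1.03, 0.76, 1.41),  # Eastern Europe
    (1.19, 0.97, 1.46),  # Sweden
    (1.15, 0.93, 1.43),  # Spain
    (1.41, 1.02, 1.95),  # Germany, Dortmund
    (1.14, 0.80, 1.61),  # Germany, Luth/Witt
    (1.60, 1.08, 2.37),  # Germany, Neuss
]


def fixed_effect_meta(studies, z=1.96):
    """Inverse-variance fixed-effect pooling of odds ratios on the log scale."""
    betas, weights = [], []
    for odds_ratio, lower, upper in studies:
        se = (math.log(upper) - math.log(lower)) / (2 * z)  # SE from the CI width
        betas.append(math.log(odds_ratio))
        weights.append(1.0 / se ** 2)
    w_sum = sum(weights)
    pooled = sum(w * b for w, b in zip(weights, betas)) / w_sum
    se_pooled = math.sqrt(1.0 / w_sum)
    # Cochran's Q and I2 (percentage of variation attributed to heterogeneity).
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, betas))
    df = len(studies) - 1
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return {
        "or": math.exp(pooled),
        "ci": (math.exp(pooled - z * se_pooled), math.exp(pooled + z * se_pooled)),
        "q": q,
        "i2": i2,
    }


result = fixed_effect_meta(FOLLOW_UP)
```

Because the inputs are rounded and the pooling model differs from the one used in the study, the result should be read as an approximate cross-check of the combined follow-up estimate rather than a reproduction of it.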

Table 1.

Case–control groups used in the study

| Study group | No. of cases | No. of controls | Average age at diagnosis (range) | % males (cases) | Study type |
|---|---|---|---|---|---|
| **Discovery group (GWA)** | | | | | |
| Iceland | 1670 | 90 180 | 68 (20–95) | 76 | Population-based |
| **In silico replication (GWA)** | | | | | |
| The Netherlands<sup>a</sup> | 1601 | 1824 | 62 (25–93) | 81 | Population-based |
| **Follow-up groups** | | | | | |
| The Netherlands<sup>a</sup> | 2340 | 4545 | 64 (23–93) | 81 | Population-based |
| UK, Leeds | 771 | 574 | 73 (30–101) | 71 | Hospital-based |
| Italy, Torino | 332 | 391 | 63 (40–75) | 100 | Hospital-based |
| Italy, Brescia | 183 | 193 | 63 (22–80) | 100 | Hospital-based |
| Belgium, Leuven | 201 | 385 | 68 (40–93) | 86 | Population-based |
| Eastern Europe (Hungary, Romania, Slovakia) | 214 | 533 | 65 (36–90) | 83 | Hospital-based |
| Sweden, Stockholm | 352 | 1350 | 69 (32–97) | 67 | Population-based |
| Spain, Zaragoza | 246 | 1844 | 65 (27–94) | 87 | Hospital-based |
| Germany, Dortmund | 213 | 298 | 65 (20–91) | 86 | Hospital-based |
| Germany, Lutherstadt Wittenberg | 197 | 239 | 71 (35–89) | 75 | Hospital-based |
| Germany, Neuss | 217 | 104 | 71 (26–93) | 78 | Hospital-based |
| Total | 6936 | 99 682 | | | |

<sup>a</sup>The Dutch dataset used for in silico replication is a subset of the total Dutch case–control sample set used for follow-up.

Table 2.

Association of rs62185668[A] and rs4813953[T] on 20p12.2 with UBC

rs62185668[A]

| Study population | Frequency, cases | Frequency, controls | OR | 95% CI | P-value | Phet | I2 |
|---|---|---|---|---|---|---|---|
| **Discovery groups (GWA)** | | | | | | | |
| Iceland<sup>a</sup> | 0.290 | 0.253 | 1.26 | 1.14, 1.40 | 5.84E−06 | | |
| **Follow-up groups** | | | | | | | |
| The Netherlands | 0.260 | 0.236 | 1.13 | 1.04, 1.23 | 0.0037 | | |
| UK | 0.266 | 0.231 | 1.20 | 1.00, 1.45 | 0.052 | | |
| Italy, Torino | 0.287 | 0.233 | 1.33 | 1.04, 1.69 | 0.023 | | |
| Italy, Brescia | 0.243 | 0.237 | 1.03 | 0.70, 1.52 | 0.86 | | |
| Belgium | 0.247 | 0.221 | 1.16 | 0.86, 1.55 | 0.34 | | |
| Eastern Europe | 0.260 | 0.253 | 1.03 | 0.76, 1.41 | 0.84 | | |
| Sweden | 0.253 | 0.222 | 1.19 | 0.97, 1.46 | 0.10 | | |
| Spain | 0.278 | 0.250 | 1.15 | 0.93, 1.43 | 0.20 | | |
| Germany, Dortmund | 0.300 | 0.233 | 1.41 | 1.02, 1.95 | 0.038 | | |
| Germany, Luth/Witt | 0.262 | 0.238 | 1.14 | 0.80, 1.61 | 0.47 | | |
| Germany, Neuss | 0.313 | 0.221 | 1.60 | 1.08, 2.37 | 0.019 | | |
| Follow-up groups<sup>b</sup> | 0.270 | 0.234 | 1.17 | 1.10, 1.24 | 2.50E−07 | 0.78 | 0.0 |
| All combined<sup>b</sup> | 0.271 | 0.236 | 1.19 | 1.13, 1.26 | 1.50E−11 | 0.70 | 0.0 |

rs4813953[T]

| Study population | Frequency, cases | Frequency, controls | OR | 95% CI | P-value | Phet | I2 |
|---|---|---|---|---|---|---|---|
| **Discovery groups (GWA)** | | | | | | | |
| Iceland<sup>a</sup> | 0.425 | 0.367 | 1.26 | 1.15, 1.38 | 9.06E−07 | | |
| **Follow-up groups** | | | | | | | |
| The Netherlands | 0.396 | 0.368 | 1.13 | 1.05, 1.21 | 0.0019 | | |
| UK | 0.422 | 0.394 | 1.12 | 0.95, 1.32 | 0.17 | | |
| Italy, Torino | 0.451 | 0.399 | 1.23 | 0.99, 1.53 | 0.057 | | |
| Italy, Brescia | 0.406 | 0.369 | 1.17 | 0.86, 1.59 | 0.33 | | |
| Belgium | 0.415 | 0.367 | 1.22 | 0.95, 1.57 | 0.13 | | |
| Eastern Europe | 0.390 | 0.403 | 0.95 | 0.73, 1.23 | 0.68 | | |
| Sweden | 0.389 | 0.377 | 1.05 | 0.88, 1.27 | 0.58 | | |
| Spain | 0.409 | 0.394 | 1.06 | 0.88, 1.29 | 0.52 | | |
| Germany, Dortmund | 0.448 | 0.379 | 1.33 | 1.00, 1.76 | 0.046 | | |
| Germany, Luth/Witt | 0.391 | 0.362 | 1.13 | 0.85, 1.51 | 0.40 | | |
| Germany, Neuss | 0.433 | 0.375 | 1.27 | 0.90, 1.80 | 0.17 | | |
| Follow-up groups<sup>b</sup> | 0.413 | 0.381 | 1.13 | 1.07, 1.19 | 6.20E−06 | 0.86 | 0.0 |
| All combined<sup>b</sup> | 0.414 | 0.380 | 1.16 | 1.11, 1.21 | 2.10E−10 | 0.56 | 0.0 |

All P-values shown are two sided. Shown are the allelic frequencies of variants in affected and control individuals and the allelic OR with P-values based on the multiplicative model.

<sup>a</sup>Results adjusted by the method of genomic control (see Supplementary Material). Of the Icelandic subjects, 826 cases and 44 604 controls were directly genotyped; the remaining cases and controls are individuals who had not been chip typed but for whom genotype probabilities were imputed using methods of familial imputation (17).

<sup>b</sup>For the combined study populations, the control frequency was the average, unweighted control frequency of the individual populations, while the OR and the P-value were estimated using the Mantel–Haenszel model.
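For readers who want to relate the allele frequencies in Table 2 to the allelic ORs, the following sketch computes a crude allelic odds ratio directly from case and control frequencies. This is a simplification: the reported ORs come from the authors' association models (for Iceland, with imputed genotypes and genomic-control adjustment), so the crude value is only expected to be close, not identical. The function name is illustrative.

```python
def allelic_odds_ratio(freq_cases, freq_controls):
    """Crude allelic OR: odds of the allele in cases over odds in controls."""
    for f in (freq_cases, freq_controls):
        if not 0.0 < f < 1.0:
            raise ValueError("allele frequencies must lie strictly between 0 and 1")
    odds_cases = freq_cases / (1.0 - freq_cases)
    odds_controls = freq_controls / (1.0 - freq_controls)
    return odds_cases / odds_controls


# rs62185668[A] in the Dutch follow-up group (frequencies from Table 2).
or_netherlands = allelic_odds_ratio(0.260, 0.236)
```

For the Dutch frequencies the crude value is close to the reported OR of 1.13; the small difference reflects rounding of the published frequencies.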

We ran conditional analysis on rs62185668 and rs4813953 where we adjusted the results for each variant, using the other variant as a covariate, including all individuals that had been genotyped by single-variant assays for both markers. Once adjusted for the other, the effect of each marker became considerably smaller and the P-value much less significant (P > 0.01), suggesting that both markers are capturing the same signal although neither marker appears to fully explain the association (Supplementary Material, Table S3).

We tested whether the risk variants at 20p12.2 associate with disease subtypes that carry different risks of progression. Clinical and molecular evidence suggests that UBC can develop along at least two distinct pathways, one predicted to have low risk of progression (tumors confined to the bladder mucosa and not poorly differentiated) and the other having a high risk of progression (tumor invasion in or beyond the lamina propria or poorly differentiated) (21). Using the same classification of cases, we found that the frequency of rs62185668 and rs4813953 was not different between patients with tumors of different risks of progression (P = 0.84 and 0.23, for rs62185668 and rs4813953, respectively) (Supplementary Material, Table S4). Neither rs62185668 nor rs4813953 associates with gender or age at diagnosis of UBC.

The UBC risk variants are located in a non-coding region on 20p12.2. The closest gene is JAG1 (jagged1), located ∼300 kb upstream of rs62185668, rs148953085 and rs1327235 (Fig. 1). JAG1 is the ligand for the NOTCH family of transmembrane receptors that play important roles in determining cell fate and stem cell maintenance, promote cell survival and have been implicated in many forms of cancer (22). An immunohistochemical study on 70 bladder carcinomas and 10 samples of normal urothelium revealed significantly decreased NOTCH1 and JAG1 staining in tumor tissues (23). We assessed the expression of JAG1 mRNA in bladder tissues (4 normal urothelial samples and 181 tumors) profiled on the Affymetrix U133Plus2 expression microarray (24). Consistent with the previous study, there was a trend towards lower JAG1 expression in bladder tumors than in normal urothelium (t-test P-value = 0.063; Supplementary Material, Table S5). Furthermore, in the large majority of our panel of 45 UBC cell lines, expression of JAG1 mRNA is lower than in normal human urothelial cells (Supplementary Material, Fig. S2).

We next attempted to determine if the UBC risk variants associate with differential expression of JAG1 in urothelial cells. Normal urothelium is hard to obtain because of the difficulty in isolating the few layers of urothelial cells from the stroma beneath it and no large-scale expression datasets are available for this tissue type. We therefore assessed JAG1 expression in 34 low-passage normal human urothelial cell strains, using quantitative real-time RT-PCR and correlated expression with genotype of rs62185668, adjusting for number of passages. JAG1 showed a trend towards lower expression in carriers of rs62185668-A (P-value = 0.11, effect = −0.15 per allele) (Supplementary Material, Fig. S3). We searched for eQTLs for JAG1 in the GTEx database (http://www.broadinstitute.org/gtex) which contains eQTLs for nine different tissue types, albeit no data from urothelium. Lung tissue was the only tissue type reporting eQTLs for JAG1; these eQTLs were not correlated to our UBC markers. We then genotyped rs62185668 in 67 tumor samples for which DNA was available and correlated the genotype with JAG1 expression, JAG1 locus loss in tumors, stage and grade of the tumor, mutation status of FGFR3 and TP53, and the presence or absence of the recently defined MRES epigenetic phenotype (24). While JAG1 expression is associated with its copy number, rs62185668 genotype did not show significant association with any of the tumor phenotypes. However, taking into account bladder tumor heterogeneity, the number of tumors is too small to conclusively rule out modest effects.

Analysis was undertaken to explore the potential functional roles of the UBC risk variants. First, we identified a set of variants at the 20p12.2 locus that are strongly correlated with rs62185668 or rs4813953 (r2 > 0.9). A total of 12 variants were correlated with rs62185668 with r2 > 0.9 but no variant was found that had this strong correlation with rs4813953 (the most highly correlated marker is rs62185671 with r2 = 0.86). We then cross-referenced the selected variant locations with potential biological functional features according to The Encyclopedia of DNA Elements project (25–27). The most highly annotated variant is the INDEL rs148953085, located within a predicted DNaseI hotspot with strong signal in 32 cell types (Supplementary Material, Table S6). According to a ChIP-Seq analysis, this site also binds the C-FOS, C-JUN and GATA2 transcription factors in the umbilical vein endothelial cell line HUVEC. Furthermore, this site is in a region annotated as an enhancer based on histone mark data from four cell types (27); it has a RegulomeDB score of 4 (28) and a GERP score of 2.62 (29). The second strongest candidate is the SNP variant rs62185663 (r2 = 1 with rs62185668), which is located within a predicted DNaseI hotspot with strong signal in 17 cell types. This site is in a region annotated as an enhancer based on histone mark data from two cell types; it has a RegulomeDB score of 4 (28).

Heterozygous mutations in JAG1 cause Alagille syndrome (arteriohepatic dysplasia; OMIM 118450), a multisystem disorder primarily affecting the liver, heart, skeleton, eye and face. Previously, common variants at the JAG1 locus have been reported that associate with bone mineral density (BMD) (rs3790160, rs2273061) (30,31) and various measures of blood pressure (BP) (rs1327235) (32). Neither rs62185668 nor rs4813953 is correlated with the BMD variants (r2 < 0.04), whereas rs62185668 shows modest correlation with the BP variants (r2 = 0.32) (Supplementary Material, Table S7). We tested the association of the BMD and BP variants with UBC and, conversely, whether the UBC variants associated with BMD or BP, using the extensive genotype and phenotype data available at deCODE (Supplementary Material, Table S8). The BMD and BP variants showed no association with bladder cancer in the Icelandic data (Supplementary Material, Table S8), whereas the UBC variant rs62185668 showed association with several measures of bone density with P < 3 × 10−4. To further test the association between the UBC variants and BMD, we used data from the meta-analysis of GWAS for BMD of the femoral neck (FN-BMD; n = 32 961) and lumbar spine (LS-BMD; n = 31 800), including ∼2.5 million autosomal genotyped or imputed SNPs from 17 studies from populations across North America, Europe, East Asia and Australia (http://www.gefos.org/?q=content/data-release, downloaded February 12, 2014) (31). rs4813953 was directly tested in the meta-analysis, but rs62185668 was not present in the dataset, so a perfect surrogate, rs6077985 (D′ = 1, r2 = 1, according to http://www.broadinstitute.org/mpg/snap based on 1000 Genomes pilot 1), was tested instead. In this large sample set, rs4813953 and rs6077985 showed association with FN-BMD with P-values of 0.003 and 0.001, respectively, but did not associate with LS-BMD (P = 0.41 and 0.03, respectively). Given that the meta-analysis has far more power and greater sample size than the Icelandic study alone, we conclude that the additional evidence from non-Icelandic studies does not replicate the association with the same effect observed initially in Iceland and that the UBC SNPs are unlikely to have an effect on BMD.

Finally, a suggestive association between a variant at 20p12.2 (rs6104690, MAF = 0.44) and UBC was recently reported; however, this association (OR = 0.89, P = 7.13 × 10−7) did not reach the level of genome-wide significance (15). The variants that we report here as associating with UBC at genome-wide significance have a lower frequency and a stronger effect than rs6104690 and show low-to-moderate correlation with it (r2 between rs6104690 and each of the variants rs62185668 and rs4813953 is 0.30 and 0.14, respectively). We performed conditional analysis where we adjusted the association results for rs62185668 for rs6104690 and vice versa, using data on 826 UBC cases and 44 604 controls in the Icelandic population (Supplementary Material, Table S9). After adjustment for rs62185668, no signal remained at rs6104690, whereas the association signal of rs62185668 was not affected by adjustment for rs6104690. In addition, a variant at 6p22.3, rs4510656, was reported to have a suggestive association with UBC (OR = 0.89, P = 6.7 × 10−7) (15). We assessed the association between rs4510656 and UBC in the Icelandic and Dutch GWAS data. In both datasets, the variant showed a weak effect in the same direction as reported (combined OR = 0.94, P = 0.061); however, the results from combined analysis of the Icelandic, Dutch and published GWAS failed to reach the threshold for genome-wide significance (OR = 0.91, 95% CI (0.88, 0.94), P = 3.6 × 10−7, Phet = 0.29, I2 = 18.9).

## DISCUSSION

In this study, we discovered and validated associations between UBC and common markers at 20p12.2. The strongest association signal is represented by two moderately correlated markers rs62185668 and rs4813953. Conditional analysis showed that neither variant retains a significant association after adjustment for the other, indicating that they may represent the same signal. Using the sequence information of 2230 Icelanders who have been whole-genome sequenced to an average sequencing depth of 22×, we can impute variants down to ∼0.1% frequency into our case–control population. We scrutinized the region flanking rs62185668 (1 Mb on either side) and did not observe any variants that could better explain the association signal than rs62185668 and rs4813953. However, it should be noted that even at this coverage, current WGS technology has problems resolving some regions of the genome, such as repetitive sequences and first exons of protein coding genes. We can therefore not exclude that the causative variant still remains to be found.

The closest gene to the UBC variants at 20p12.2 is the NOTCH1 ligand, JAG1, prompting speculations that this gene may contribute to susceptibility to UBC. The Notch pathway has been shown to play a complex role in cancer where its role is highly dependent on cell type and context, being implicated both as a tumor promoter and suppressor (22). Activation of Notch pathways is commonly observed in different cancer types and inhibitors of Notch signaling are being tested as cancer medicines (33). Reports of expression of Notch pathway genes in UBC have been conflicting. In the Expression Atlas (a study that compiles 5372 samples from 206 different studies generated on the HG-U133A array platform) both NOTCH1 and JAG1 are reported to be upregulated in UBC (34) and a recent study on 131 bladder tumors, released through The Cancer Genome Atlas research network, reported upregulation of JAG1 expression in 12 tumors (35). However, a study that assessed the presence of the NOTCH1 and JAG1 proteins by immunohistochemistry of bladder tumors and normal urothelium and our own results suggest that JAG1 is downregulated in bladder tumors and UBC cell lines. Notably, postoperative disease-free survival time in patients with papillary tumors with low Notch1 and Jag1 expression has been found to be significantly shorter than that in patients with other expression patterns (23). Finally, NOTCH1 is located on the long arm of chromosome 9 (9q), which is deleted in a high fraction of both superficial and muscle-invasive UBC (36).

In conclusion, we have discovered an association between sequence variants close to JAG1 at 20p12.2 and the risk of UBC. This finding brings the growing number of UBC risk loci identified through GWAS to 13.

## MATERIALS AND METHODS

### Study subjects from Iceland

Records of all UBC diagnoses were obtained from the ICR (http://www.krabbameinsskra.is). The ICR contains all cancer diagnoses in Iceland from January 1, 1955. The ICR contained records of 1983 Icelandic UBC patients diagnosed until December 31, 2011. Recruitment of UBC cases was initiated in 2001 and included all prevalent cases as well as newly diagnosed cases from that time. The participation rate for newly diagnosed cases since 2001 is 75%. Patients were recruited by trained nurses on behalf of the patients’ treating physicians, through special recruitment clinics. Participants in the study donated a blood sample and answered a lifestyle questionnaire. A total of 826 patients (77% males; diagnosed from December 1964 to December 2011) were included in a genome-wide SNP genotyping effort, using the Infinium II assay method and either the Sentrix HumanHap 300, HumanCNV370-duo or Omni Express BeadChips (Illumina). The median age at diagnosis for all consenting cases was 67.4 years (range 10–96 years) as compared with 68.9 years for all UBC patients in the ICR. In addition to the chip-genotyped cases, we used information on 844 UBC cases without chip information whose genotype probabilities were imputed using methods of familial imputation (17). The 90 180 controls used in this study consisted of individuals from other ongoing GWAS at deCODE. No individual disease group is represented by >10% of the total control group. Cancer patients (prostate, breast, colorectal and lung) were analyzed separately, and the frequency of the sequence variants studied did not differ from other controls. Samples from prostate, breast, colorectal and lung cancer patients as well as individuals used for the analysis of smoking variables come from other ongoing projects at deCODE Genetics. The study was approved by the Data Protection Authority of Iceland and the National Bioethics Committee. Written informed consent was obtained from all patients, relatives and controls. Personal identifiers associated with medical information and blood samples were encrypted with a third-party encryption system for which the Data Protection Authority maintains the code.

### Study subjects from the Netherlands

The Dutch patients were recruited for the Nijmegen Bladder Cancer Study (NBCS: http://dceg.cancer.gov/icbc/membership.html). The NBCS identified patients in the Eastern part of the Netherlands through the Netherlands Cancer Registry held by the Comprehensive Cancer Centre the Netherlands (IKNL). Patients diagnosed between 1995 and 2006 under the age of 75 years were selected and their vital status and current addresses updated through the hospital information systems of seven community hospitals and one university hospital (Radboud university medical center, Radboudumc). All patients still alive on August 1, 2007 were invited to the study by the Comprehensive Cancer Center on behalf of the patients’ treating physicians. A second group of patients, diagnosed between 2006 and 2008, was invited in 2009, a third group, diagnosed between 2008 and 2009, was invited in 2010 and a fourth group, diagnosed between 2009 and 2011, was invited in 2012. In case of consent, patients were sent a lifestyle questionnaire to fill out and blood samples were collected by Thrombosis Service centers, which hold offices in all the communities in the region. In total, 2654 patients were invited to participate. Of all the invitees, 1744 gave informed consent (66%): 1603 filled out the questionnaire (60%) and 1626 (61%) provided a blood sample. All the patients that were selected for the analyses were of self-reported European descent. The median age at diagnosis was 64 (range 25–81) years. Eighty percent of the participants were males. Data on tumor stage and grade were obtained through the cancer registry.

The series of patients that was recruited through the Comprehensive Cancer Centre was combined with a non-overlapping series of 465 bladder cancer patients who were either (i) recruited previously for a study on gene–environment interactions in three hospitals (Radboudumc; Canisius Wilhelmina Hospital, Nijmegen; and Streekziekenhuis Midden-Twente, Hengelo, the Netherlands) or (ii) consecutive bladder cancer patients from the Department of Urology, Radboudumc, Nijmegen, The Netherlands. The median age at diagnosis was 64 (range 30–93) years. Eighty-five percent of these participants were males. Data on tumor stage and grade were obtained through the cancer registry.

Additionally, another non-overlapping series of 249 consecutive bladder cancer patients from the Department of Urology, Erasmus Medical Center in Rotterdam, The Netherlands were included (81% males). All patients signed an informed consent form for their blood samples.

The control group (46% males) was recruited within a project entitled ‘Nijmegen Biomedical Study’ (NBS). Control individuals from the NBS were invited to participate in a study on gene–environment interactions in multifactorial diseases such as cancer. The details of this study were reported previously (37). Briefly, this is a population-based survey conducted by the Department for Health Evidence and the Department of Clinical Chemistry of the Radboudumc. Age- and sex-stratified randomly selected adult inhabitants of Nijmegen (n = 22 451), a city located in the eastern part of the Netherlands, received an invitation to fill out a postal questionnaire including questions about lifestyle, health status and medical history, and to donate a blood sample for DNA isolation and biochemical studies. A total of 9350 (43%) persons filled out the questionnaire, of whom 6468 (69%) donated blood samples.

The study protocols of the Nijmegen Bladder Cancer Study and the Nijmegen Biomedical Study were approved by the Institutional Review Board of the Radboud university medical center and all study subjects gave written informed consent.

### Study subjects from the United Kingdom

Details of the Leeds Bladder Cancer Study have been reported previously (38). In brief, patients from the urology department of St James's University Hospital, Leeds were recruited from August 2002 to March 2006. All those patients attending for cystoscopy or transurethral resection of a bladder tumor who had previously been found, or were subsequently shown, to have urothelial cell carcinoma of the bladder were included. Exclusion criteria were significant mental impairment or a blood transfusion in the past month. All non-Caucasians were excluded from the study leaving 764 patients. The median age at diagnosis of the patients was 73 years (range 30–101). Seventy-one percent of the patients were male and 36% of all the patients had tumors with low risk of progression (pTaG1/2). The controls were recruited from the otolaryngology outpatients and ophthalmology inpatient and outpatient departments at St James's Hospital, Leeds, from August 2002 to March 2006. All controls of appropriate age for frequency matching with the cases were approached and recruited if they gave their informed consent. As for the cases, exclusion criteria for the controls were significant mental impairment or a blood transfusion in the past month. Also, controls were excluded if they had symptoms suggestive of bladder cancer, such as hematuria. 2.8% of the controls were non-Caucasian leaving 530 Caucasian controls for the study. Seventy-one percent of the controls were male. Data were collected by a health questionnaire on smoking habits and smoking history (non-, ex- or current smoker, smoking dose in pack-years), occupational exposure history (to plastics, rubber, laboratories, printing, dyes and paints, diesel fumes), family history of bladder cancer, ethnicity and place of birth and places of birth of parents. The participation response rate of cases was ∼99%, that among the controls ∼80%. 
Ethical approval for the study was obtained from Leeds (East) Local Research Ethics Committee, project number 02/192.

### Study subjects from Torino, Italy

The cases for the Torino bladder cancer study were drawn from two urology departments of the main hospital in Torino, the San Giovanni Battista Hospital (39). Cases are all Caucasian men, aged 40–75 years (median 63 years) and living in the Torino metropolitan area. They were newly diagnosed between 1994 and 2006 with a histologically confirmed, invasive or in situ, bladder cancer. Of all the patients with information on stage and grade, 56% were at low risk of progression (pTaG1/2). The controls were drawn from the urology, medical and surgical departments of the same hospital in Torino. All controls are Caucasian men resident in the Torino metropolitan area. They were diagnosed and treated between 1994 and 2006 for benign diseases (such as prostatic hyperplasia, cystitis, hernias, heart failure, asthma and benign ear diseases). Controls with cancer, liver or renal diseases and smoking-related conditions were excluded. The median age of the controls was 57 years (range 40–74). Data were collected by a professional interviewer who used a structured questionnaire to interview both cases and controls face to face. Data collected included demographics (age, sex, ethnicity, region and education) and smoking. For cases, additional data were collected on tumor histology, tumor site, size, stage, grade and treatment of the primary tumor. The participation response rates were 90% for cases and 75% for controls, resulting in 328 cases and 389 controls. Ethical approval for the study was obtained from Comitato Etico Interaziendale, A.O.U. San Giovanni Battista – A.O. C.T.O./Maria Adelaide.

### Study subjects from Belgium

The Belgian study has been reported in detail (41). In brief, cases were selected from the Limburg Cancer Registry and were approached through urologists and general practitioners. All cases were diagnosed with histologically confirmed urothelial cell carcinoma of the bladder between 1999 and 2004 and were Caucasian inhabitants of the Belgian province of Limburg. The median age of the patients was 68 years; 86% of all the patients were males. For the recruitment of controls, a request was made to the ‘Kruispuntbank’ of the social security for simple random sampling, stratified by municipality and socio-economic status, among all citizens >50 years of age of the province. The median age of the controls was 64 years; 59% of the controls were males. Three trained interviewers visited cases and controls at home. Information was collected through a structured interview and a standardized food frequency questionnaire. In addition, biological samples were collected. Data collected included medical history, lifetime smoking history, family history of bladder cancer and a lifetime occupational history. Informed consent was obtained from all participants and the study was approved by the ethical review board of the Medical School of the Catholic University of Leuven, Belgium.

### Study subjects from Eastern Europe

The details of this study have been described previously (42). Cases and controls were recruited as part of a study designed to evaluate the risk of various cancers due to environmental arsenic exposure in Hungary, Romania and Slovakia between 2002 and 2004. The recruitment was carried out in the counties of Bacs, Bekes, Csongrad and Jasz-Nagykun-Szolnok in Hungary; Bihor and Arad in Romania; and Banska Bystrica and Nitra in Slovakia. The cases (N = 214) and controls (N = 533) selected were of Hungarian, Romanian and Slovak nationalities. Bladder cancer patients were invited on the basis of histopathological examinations by pathologists. Hospital-based controls were included in the study, subject to fulfillment of a set of criteria. All general hospitals in the study areas were involved in the process of control recruitment. The controls were frequency matched with cases for age, gender, country of residence and ethnicity. Controls included general surgery, orthopedic and trauma patients aged 30–79 years. Patients with malignant tumors, diabetes and cardiovascular diseases were excluded as controls. The median age of the bladder cancer patients was 65 years (range 36–90); 83% of the patients were males. The median age of the controls was 61 years (range 28–83); 51% of the controls were males. The participation rates among cases and controls were ∼70%. Of all the patients with known stage and grade information, 28% had a low-risk tumor (pTaG1/2). Clinicians took venous blood and other biological samples from cases and controls after consent forms had been signed. Cases and controls recruited to the study were interviewed by trained personnel and completed a general lifestyle questionnaire. Ethnic background for cases and controls was recorded along with other characteristics of the study population. Local ethical boards approved the study plan and design.

### Study subjects from Sweden

The Swedish patients come from a population-based study of UBC patients diagnosed in the Stockholm region in 1995–1996 (43). Blood samples from 352 patients were available out of a collection of 538 patients with primary urothelial carcinoma of the bladder. The average age at onset for these patients is 69 years (range 32–97 years), and 67% of the patients are males. Clinical data, including age at onset, grade and stage of tumor, were prospectively obtained from hospitals and urology units in the region. The control samples came from blood donors in the Stockholm region and were from cancer-free individuals of both genders. The regional ethical committee approved the study and all participants gave informed consent.

### Study subjects from Spain

The Spanish study patients were recruited from the Urology and Oncology Departments of Zaragoza Hospital between September 2007 and June 2009. Two hundred and forty-six patients with histologically proven urothelial cell carcinoma of the bladder were enrolled (response 77%). Clinical information including age at onset, grade and stage was obtained from medical records. The median age at diagnosis for the patients was 65 years (range 27–94) and 87% were males. The 1844 Spanish control individuals were part of a larger collection of control samples obtained from individuals who had attended the University Hospital in Zaragoza, Spain, for diseases other than cancer between November 2001 and May 2007. The controls were of both genders and median age was 52 years (range 11–87). Controls were questioned to rule out prior cancers before drawing the blood sample. All patients and controls were of self-reported European descent. Study protocols were approved by the Institutional Review Board of Zaragoza University Hospital. All subjects gave written informed consent.

### Study subjects from Germany

The study subjects from Germany came from three different studies.

• The Neuss bladder cancer study. Details of the bladder cancer cases of this study have been published previously (44). The ongoing case–control series consists of 217 bladder cancer cases and 105 controls from the Department of Urology, Lukasklinik Neuss, Germany, located ∼20 km from the Ruhr area. The median age at diagnosis was 72.9 (range 26.1–93.4) years. Seventy-eight percent of the participants were males. Data on tumor stage and grade were obtained through the cancer registry. Forty-five percent of the patients had a low-risk tumor (pTaG1/2). The 105 control individuals (64% males) (median age 42.4, range 18.0–89.0) were cancer free. Data were collected from June 2009 to July 2010. The local ethics committee approved the study plan and design.

• The Dortmund bladder cancer study. Details of the bladder cancer cases of this study have been published previously (43). The case–control series consists of 197 patients with a confirmed bladder cancer from the Department of Urology, St.-Josefs-Hospital Dortmund-Hörde, located in the Ruhr area, an area of former coal, iron and steel industries and 240 controls from the same Department of Urology, admitted for treatment of benign urological diseases, enrolled from July 2009 to July 2010. The median age at diagnosis was 71.2 (range 35.0–89.2) years. Seventy-five percent of the participants were males. Sixty percent of the patients had tumors with low risk of progression (pTaG1/2). The 240 control individuals (77% males) were cancer free and frequency matched for age with the cases (median age 70.7, range 21.7–100). The local ethics committee approved the study plan and design.

### Genome-wide genotyping and imputations of untyped variants

The Icelandic chip-typed samples were assayed with the Illumina HumanHap300, HumanCNV370, HumanHap610, HumanHap1M, HumanHap660, Omni-1, Omni 2.5 or Omni Express bead chips. SNPs were excluded if they had (i) yield <95%, (ii) minor allele frequency <1% in the population, (iii) significant deviation from Hardy–Weinberg equilibrium (P < 0.001), (iv) if they produced an excessive inheritance error rate (over 0.001) or (v) if there was substantial difference in allele frequency between chip types (from just a single chip if that resolved all differences, but from all chips otherwise). All samples with a call rate <97% were excluded from the analysis. For the HumanHap series of chips, 308 840 SNPs were used for long-range phasing, whereas for the Omni series of chips 642 079 SNPs were included. The final set of SNPs used for long-range phasing was composed of 785 863 SNPs.
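The numeric exclusion thresholds (i) to (iv) above can be expressed as a simple filter. The following is a minimal sketch only: the `SnpStats` record and its field names are hypothetical, and criterion (v), the comparison of allele frequencies between chip types, is not modeled.

```python
from dataclasses import dataclass


@dataclass
class SnpStats:
    """Per-SNP summary statistics (hypothetical field names)."""
    name: str
    yield_rate: float         # fraction of samples with a called genotype
    maf: float                # minor allele frequency in the population
    hwe_p: float              # P-value of the Hardy-Weinberg equilibrium test
    inheritance_error: float  # rate of inheritance errors


def passes_qc(snp: SnpStats) -> bool:
    """Return True if the SNP survives exclusion criteria (i) to (iv)."""
    return (
        snp.yield_rate >= 0.95              # (i) yield of at least 95%
        and snp.maf >= 0.01                 # (ii) MAF of at least 1%
        and snp.hwe_p >= 0.001              # (iii) no significant HWE deviation
        and snp.inheritance_error <= 0.001  # (iv) inheritance error rate not over 0.001
    )


def filter_snps(snps):
    """Keep only the SNPs that pass the thresholds above."""
    return [s for s in snps if passes_qc(s)]
```

The sample-level filter (call rate <97%) would be applied in the same way to per-sample statistics.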

Variants were imputed based on WGS data from 2230 Icelanders, sequenced at a minimum depth of 10× (average 22×). Approximately 34.2 million SNPs and small indel variants were imputed based on this set of individuals. A detailed description of imputation methods used for the Icelandic population has been published recently (18). In brief, sequencing by synthesis was performed on Illumina GAIIx and HiSeq2000 instruments. SNPs that were identified through sequencing were imputed into all Icelanders who had been phased with long-range phasing using the same model used by IMPUTE.

The full GWAS results provide specific information on the genetic characteristics of the whole Icelandic nation, precluding full disclosure of the data. Association results for all variants with P-value < 10⁻⁴, MAF > 0.005 and imputation information score >0.95 are shown in Supplementary Material, Table S10.

The Dutch GWAS has been described previously (13). In brief, 1631 Dutch cases and 1824 Dutch controls were assayed with the Illumina HumanHap300 or HumanHapCNV370 (Illumina, San Diego, CA, USA). SNPs were excluded if they had MAF <0.01, were not in Hardy–Weinberg equilibrium (P < 10⁻⁵) or had different frequency for the two chip types used (P < 10⁻⁵). For imputation into the Dutch dataset, we used 292 650 autosomal SNPs that were present on both chip types and passed QC to impute an additional 7 543 837 ungenotyped SNPs using the IMPUTE v2.1 software (47,48) and a training set consisting of the combined 1000 Genomes low-coverage pilot haplotypes (released June 2010, 120 chromosomes) and the HapMap3 haplotypes (release #2 February 2009, 1920 chromosomes) downloaded from http://mathgen.stats.ox.ac.uk/impute/impute_v2.html#filtered_1kg_hm3_haps.

### Single track variant genotyping

Single-SNP genotyping was carried out by deCODE Genetics in Reykjavik, Iceland, applying the Centaurus (Nanogen) platform (49). The quality of the imputation was evaluated by comparing imputed genotypes to genotypes obtained by using the assay. The concordance was >97%. Positive and negative controls were present on all genotyping plates in order to ensure correct genotyping. Genotyping of the indel variant rs148953085 was done by PCR amplification followed by size fractionation on an Applied Biosystems model 3730 sequencer, using Genescan (v. 3.0) peak-calling software. Alleles were called using an internal allele-calling program (50). The sequences of the primers used for genotyping were as follows: forward 5′-(GTGGGTAAGGCAAACCAAAA)-3′, reverse 5′-(AGGTTCAGTCTGCCCTGAAA)-3′.

### Association testing

#### Case–control association testing

Logistic regression was used to test for association between sequence variants and disease, treating disease status as the response and expected allele counts from imputation or allele counts from direct genotyping as covariates. The analysis included assessment of chromosome X. Testing was performed using the likelihood ratio statistic. When testing for association using the in silico genotypes, controls were matched to cases based on the informativeness of the imputed genotypes, such that for each case, C controls of matching informativeness were chosen. Failing to match cases and controls will lead to a highly inflated genomic control factor, and in some cases may lead to spurious false positive findings. The informativeness of the imputation of each of an individual's haplotypes was estimated by taking the average of

$a(e,\theta)=\begin{cases}\dfrac{e-\theta}{1-\theta}, & e\geq\theta\\[6pt] \dfrac{\theta-e}{\theta}, & e<\theta\end{cases}$
over all SNPs imputed for the individual, where e is the expected allele count for the haplotype at the SNP and θ is the population frequency of the SNP. Note that a(θ,θ) = 0 and a(0,θ) = a(1,θ) = 1. The mean informativeness values cluster into groups corresponding to the most common pedigree configurations used in the imputation, such as imputing from parent into child or from child into parent. Based on this clustering of imputation informativeness, we divided the haplotypes of individuals into seven groups of varying informativeness, which created 27 groups of individuals of similar imputation informativeness: 7 groups of individuals with both haplotypes having similar informativeness, plus 21 groups of individuals with the two haplotypes having different informativeness, minus the one group of individuals with neither haplotype being imputed well. Within each group, we calculated the ratio of the number of controls to the number of cases, and chose the largest integer C that was less than this ratio in all the groups. For example, if in one group there were 10.3 times as many controls as cases and in all other groups this ratio was greater, then we would set C = 10 and within each group randomly select 10 times as many controls as there are cases.
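The informativeness measure and the choice of C can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the software used in the study: all function names and the dictionary-based grouping are hypothetical, θ is assumed to lie strictly between 0 and 1, and the assignment of individuals to the 27 informativeness groups is taken as given.

```python
import math
import random


def informativeness(e: float, theta: float) -> float:
    """a(e, theta): 0 when e equals the population frequency theta, 1 when e is 0 or 1."""
    if e >= theta:
        return (e - theta) / (1.0 - theta)
    return (theta - e) / theta


def mean_informativeness(expected_counts, frequencies) -> float:
    """Average a(e, theta) over all SNPs imputed for one haplotype."""
    values = [informativeness(e, t) for e, t in zip(expected_counts, frequencies)]
    return sum(values) / len(values)


def controls_per_case(group_counts) -> int:
    """Largest integer C strictly below the control:case ratio in every group.

    group_counts maps a group label to (n_cases, n_controls).
    """
    min_ratio = min(
        n_controls / n_cases
        for n_cases, n_controls in group_counts.values()
        if n_cases > 0
    )
    return math.ceil(min_ratio) - 1


def sample_controls(controls_by_group, cases_by_group, c, seed=0):
    """Within each group, randomly select c controls per case."""
    rng = random.Random(seed)
    return {
        group: rng.sample(controls_by_group[group], c * len(cases))
        for group, cases in cases_by_group.items()
    }
```

With a smallest ratio of 10.3, `controls_per_case` gives C = 10, matching the example in the text.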

#### Quantitative trait association testing

A generalized form of linear regression was used to test for association between sequence variants and BMD or BP. Let y be the vector of quantitative measurements, and let g be the vector of expected allele counts for the SNP being tested. We assume the quantitative measurements follow a normal distribution with a mean that depends linearly on the expected allele count at the SNP and a variance–covariance matrix proportional to the kinship matrix:

$y\sim N(\alpha+\beta g,\,2\sigma^{2}\Phi),$
where
$\Phi_{ij}=\begin{cases}\frac{1}{2}, & i=j\\ 2k_{ij}, & i\neq j,\end{cases}$
is the kinship matrix as estimated from the Icelandic genealogical database. It is not computationally feasible to use this full model and we therefore split the individuals with in silico genotypes and BMD and BP measurements into smaller clusters. Here, we chose to restrict the cluster size to at most 300 individuals.

The maximum likelihood estimates for the parameters α, β and σ² involve inverting the kinship matrix. If there are n individuals in the cluster, then this inversion requires O(n³) calculations. Since the inversion only needs to be performed once, however, the computational cost per variant in a genome-wide association scan is only O(n²) calculations, which is the cost of calculating the maximum likelihood estimates once the kinship matrix has been inverted.
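To make the cost argument concrete, the estimates can be written in generalized least-squares form. This is a sketch under the model stated above; the design matrix X = (1, g), with a column of ones and the column of expected allele counts, and the residual vector r are notation introduced here for illustration and do not appear in the original description.

$\begin{pmatrix}\hat{\alpha}\\ \hat{\beta}\end{pmatrix}=\left(X^{T}\Phi^{-1}X\right)^{-1}X^{T}\Phi^{-1}y,\qquad \hat{\sigma}^{2}=\frac{1}{2n}\,r^{T}\Phi^{-1}r,\qquad r=y-\hat{\alpha}-\hat{\beta}g$

With $\Phi^{-1}$ computed once per cluster, each tested variant only requires matrix–vector products with $\Phi^{-1}$ (such as $\Phi^{-1}g$) and the inversion of a 2 × 2 matrix, which accounts for the O(n²) cost per variant.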

### Association analysis of follow-up datasets

For association analysis of the follow-up datasets, we used a standard likelihood ratio statistic, implemented in the NEMO software (51), to calculate two-sided P-values for each individual allele, assuming a multiplicative model for UBC risk, i.e. the bladder cancer risk is multiplied by a constant factor for each risk allele a person carries. Results from multiple case–control groups were combined using a Mantel–Haenszel model in which the groups were allowed to have different population frequencies for alleles and genotypes but were assumed to have common ORs (52).
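As an illustration of how a common OR can be estimated while allowing allele frequencies to differ between groups, the classic Mantel–Haenszel estimator on per-group 2 × 2 allele-count tables is sketched below. This is the textbook estimator, assumed here for illustration; it is not necessarily the exact computation performed by NEMO, and the function name and input layout are hypothetical.

```python
def mantel_haenszel_or(tables):
    """Classic Mantel-Haenszel common odds ratio.

    tables: iterable of (a, b, c, d) allele counts, one tuple per study group:
      a = risk alleles in cases,    b = other alleles in cases,
      c = risk alleles in controls, d = other alleles in controls.
    """
    numerator = 0.0
    denominator = 0.0
    for a, b, c, d in tables:
        n = a + b + c + d          # total number of alleles in this group
        numerator += a * d / n
        denominator += b * c / n
    return numerator / denominator


# Two hypothetical study groups with different allele frequencies
example = [(60, 40, 50, 50), (30, 70, 20, 80)]
combined_or = mantel_haenszel_or(example)
```

Because each group contributes its own table, a difference in baseline allele frequency between populations does not by itself bias the combined estimate.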

Stratified analyses were conducted by smoking and by UBC aggressiveness. For the latter, all patients for whom detailed histology information was available were classified with regards to risk of progression, based on stage and grade information. Patients with ‘low risk of progression’ were defined as having TNM stage pTa in combination with WHO 1973 differentiation Grade 1 or 2 or WHO/ISUP 2004 low grade. All other tumors were classified as ‘high risk of progression’ (stage pTis or ≥ pT1 or WHO 1973 Grade 3 or WHO/ISUP 2004 high grade).
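The classification rule can be written as a small function. This is a minimal sketch: the function name, argument names and string encodings are hypothetical. Two choices are assumptions rather than something stated in the text: high-risk indicators take precedence when grading systems disagree, and `None` is returned when the available information is insufficient to classify.

```python
from typing import Optional


def risk_of_progression(
    stage: Optional[str],
    who1973_grade: Optional[int] = None,
    who2004_grade: Optional[str] = None,
) -> Optional[str]:
    """Classify a tumor as 'low' or 'high' risk of progression.

    stage: TNM stage such as 'pTa', 'pTis', 'pT1', 'pT2' (None if unknown).
    who1973_grade: 1, 2 or 3 (None if unknown).
    who2004_grade: 'low' or 'high' (None if unknown).
    """
    # High risk: stage pTis or >= pT1, or WHO 1973 Grade 3, or WHO/ISUP 2004 high grade
    if stage is not None and stage != "pTa":
        return "high"
    if who1973_grade == 3 or who2004_grade == "high":
        return "high"
    # Low risk: pTa combined with WHO 1973 Grade 1 or 2, or WHO/ISUP 2004 low grade
    if stage == "pTa" and (who1973_grade in (1, 2) or who2004_grade == "low"):
        return "low"
    return None  # insufficient histology information
```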

### Conditional analysis

For the two variants rs62185668 and rs4813953, the association of each was tested conditional on the observed association of the other. For the replication cohorts for which direct genotypes were available, the conditional analysis was done using the NEMO software (51). For the Icelandic cohort, the conditional analysis was done using the same method used for the genome-wide association analysis that takes into account the imputation uncertainty. However, the analysis was restricted to the 863 cases and 46 602 controls that have been chip typed, as the imputation is much less reliable for individuals with genotype probabilities estimated based on information from chip-typed relatives. While this does not affect the single marker association analysis, very uncertain estimates of the genotype probabilities for the variant that is conditioned on make the conditional analysis harder to interpret.

### Heterogeneity calculations

Heterogeneity was tested by comparing the null hypothesis of the effect being the same in all populations to the alternative hypothesis of each population having a different effect using a likelihood ratio test. I² lies between 0 and 100% and describes the proportion of total variation in study estimates that is due to heterogeneity (53).
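For readers unfamiliar with I², a minimal sketch of its usual definition from Cochran's Q follows. The text does not state how I² was computed in this study, so the inverse-variance weighting used here is an assumption, and the function name and inputs (per-study log odds ratios and their standard errors) are hypothetical.

```python
def i_squared(effects, std_errors):
    """I^2 (in percent) from per-study effect estimates and standard errors.

    Uses Cochran's Q with inverse-variance weights:
      Q = sum_i w_i * (b_i - b_pooled)^2,  I^2 = max(0, (Q - df) / Q) * 100
    """
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * b for w, b in zip(weights, effects)) / sum(weights)
    q = sum(w * (b - pooled) ** 2 for w, b in zip(weights, effects))
    df = len(effects) - 1
    if q <= 0.0:
        return 0.0  # identical estimates: no observed heterogeneity
    return max(0.0, (q - df) / q) * 100.0


# Two hypothetical studies with log odds ratios 0.1 and 0.3, each with SE 0.1
example_i2 = i_squared([0.1, 0.3], [0.1, 0.1])
```

Values near 0% indicate that the variation between study estimates is compatible with sampling error alone.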

### Bioinformatics methods

A search was carried out to detect overlaps between variant locations and known bioinformatic features. We retrieved data from UCSC test browser (HG19 build 37) (26). We accessed feature tracks relevant to the bladder tissue containing genome positional information and identified those features that overlapped with the SNP. We also accessed HaploReg v2 (27) and RegulomeDB (28). We accessed The Gene Expression Atlas by EBI for reports on expression of JAG1 in bladder cancer (54). These are recognized draft quality data and were used as is without quality filtering.

### Assessment of expression in cell lines and tissues

Expression of JAG1 in tumor and normal UBC tissues was measured by Affymetrix HG-U133 Plus 2.0 arrays as described (55). The microarray data are available from ArrayExpress (www.ebi.ac.uk/arrayexpress/) under the accession numbers E-MTAB-1803 for the muscle-invasive bladder tumors and E-MTAB-1940 for the non-muscle-invasive tumors and normal samples.

Bladder cell lines used in this study and gene expression profiling using GeneChip Human Genome U133 Plus 2.0 Arrays (Affymetrix) were as described (56). Normal human urothelial cells (NHUC) were isolated from human ureters obtained at nephrectomy (57) and were used either uncultured (passage 0) or cultured to low passage (passages 1 or 2). Total RNA was extracted directly from passage 0 NHUC or from cells cultured to 70–80% confluence using an RNeasy Mini Kit (Qiagen). cDNA was synthesized using 1 μg of total RNA and Superscript II (Invitrogen) according to the manufacturer's instructions. Expression of JAG1 in low-passage NHUC was assessed by quantitative real-time RT-PCR using a TaqMan assay (Hs01070032_m1; Applied Biosystems). Levels of expression were normalized to SDHA (Hs00417200_m1) and measured relative to cell line 97-18.

## FUNDING

This work was supported by the following funding agencies. Collection of samples and data in Iceland and the Netherlands was funded in part by the European Commission (POLYGENE: LSHC-CT-2005) (grant number 018827) and a research investment grant of the Radboud university medical centre (Radboudumc). The Leeds Bladder Cancer Study was funded by Cancer Research UK and Yorkshire Cancer Research. Torino Bladder Cancer Case Control Study was supported by a grant to ECNIS (Environmental Cancer Risk, Nutrition and Individual Susceptibility), a network of excellence operating within the European Union 6th Framework Program, Priority 5: ‘Food Quality and Safety’ (Contract No. 513943); by a grant of the Compagnia di San Paolo—Human Genetics Foundation (HuGeF), the Italian Association for Cancer Research, Italy and the Piedmont Region Progetti di Ricerca Sanitaria Finalizzata. The Belgian bladder cancer study was funded by the Flemish government and by the health authorities of the Belgian province of Limburg. The Swedish study was funded by the Swedish Cancer Society and the Swedish Research Council. J.I.M. is funded by Red Temática de Investigación Cooperativa en Cáncer (grant number RD06/0020/1054). The expression array work was supported by INSERM, CNRS, Institut Curie and by grants from INCa (INCA_1053 and INCA_5176), La Ligue Nationale Contre le Cancer (FR: équipe labellisée), and the program ‘Carte d'Identité des Tumeurs’, initiated, developed and funded by Ligue Nationale Contre le Cancer.

## ACKNOWLEDGEMENTS

We thank the individuals that participated in the study and whose contribution made this work possible. We also thank the personnel at all the recruitment centres. We acknowledge the cancer registries in Iceland and the Netherlands for assistance in the ascertainment of the Icelandic and Dutch UBC patients.

Conflict of Interest statement. None declared.

## REFERENCES

1
Ploeg
M.
Aben
K.K.H.
Kiemeney
L.A.
The present and future burden of urinary bladder cancer in the world
World J. Urol.

2009
27
289
293
2
Ferlay
J.
Shin
H.-R.
Bray
F.
Forman
D.
Mathers
C.
Parkin
D.M.
Estimates of worldwide burden of cancer in 2008: GLOBOCAN 2008
Int. J. Cancer J. Int. Cancer

2010
127
2893
2917
3
N.
Noone
A.M.
Krapcho
M.
Garshell
J.
Neyman
N.
Altekruse
S.F.
Kosary
C.L.
Yu
M.
Ruhl
J.
Tatalovich
Z.
et al.
SEER Cancer Statistics Review 1975–2010

Bethesda, MD, USA
National Cancer Institute
,
4
Aben
K.K.H.
Witjes
J.A.
Schoenberg
M.P.
Hulsbergen-van de Kaa
C.
Verbeek
A.L.M.
Kiemeney
L.A.L.M.
Familial aggregation of urothelial cell carcinoma
Int. J. Cancer J. Int. Cancer

2002
98
274
278
5
Murta-Nascimento
C.
Silverman
D.T.
Kogevinas
M.
García-Closas
M.
Rothman
N.
Tardón
A.
García-Closas
R.
Serra
C.
Carrato
A.
Villanueva
C.
et al.
Risk of bladder cancer associated with family history of cancer: do low-penetrance polymorphisms account for the increase in risk?
Cancer Epidemiol. Biomark. Prev. Publ. Am. Assoc. Cancer Res. Cosponsored Am. Soc. Prev. Oncol.

2007
16
1595
1600
6
Dong
C.
Hemminki
K.
Modification of cancer risks in offspring by sibling and parental cancers from 2,112,616 nuclear families
Int. J. Cancer J. Int. Cancer

2001
92
144
150
7
Aben
K.K.H.
Baglietto
L.
Baffoe-Bonnie
A.
Coebergh
J.-W.W.
Bailey-Wilson
J.E.
Trink
B.
Verbeek
A.L.M.
Schoenberg
M.P.
Alfred Witjes
J.
Kiemeney
L.A.
Segregation analysis of urothelial cell carcinoma
Eur. J. Cancer Oxf. Engl.

2006
42
1428
1433
8
García-Closas
M.
Malats
N.
Silverman
D.
Dosemeci
M.
Kogevinas
M.
Hein
D.W.
Tardón
A.
Serra
C.
Carrato
A.
García-Closas
R.
et al.
NAT2 slow acetylation, GSTM1 null genotype, and risk of bladder cancer: results from the Spanish Bladder Cancer Study and meta-analyses
Lancet

2005
366
649
659
9
Kiemeney
L.A.
Thorlacius
S.
Sulem
P.
Geller
F.
Aben
K.K.H.
Stacey
S.N.
Gudmundsson
J.
Jakobsdottir
M.
J.T.
Sigurdsson
A.
et al.
Sequence variant on 8q24 confers susceptibility to urinary bladder cancer
Nat. Genet.

2008
40
1307
1312
10
Wu
X.
Ye
Y.
Kiemeney
L.A.
Sulem
P.
Rafnar
T.
Matullo
G.
Seminara
D.
Yoshida
T.
Saeki
N.
Andrew
A.S.
et al.
Genetic variation in the prostate stem cell antigen gene PSCA confers susceptibility to urinary bladder cancer
Nat. Genet.

2009
41
991
995
11
Kiemeney
L.A.
Sulem
P.
Besenbacher
S.
Vermeulen
S.H.
Sigurdsson
A.
Thorleifsson
G.
Gudbjartsson
D.F.
Stacey
S.N.
Gudmundsson
J.
Zanon
C.
et al.
A sequence variant at 4p16.3 confers susceptibility to urinary bladder cancer
Nat. Genet.

2010
42
415
419
12
Rothman
N.
Garcia-Closas
M.
Chatterjee
N.
Malats
N.
Wu
X.
Figueroa
J.D.
Real
F.X.
Van Den Berg
D.
Matullo
G.
Baris
D.
et al.
A multi-stage genome-wide association study of bladder cancer identifies multiple susceptibility loci
Nat. Genet.

2010
42
978
984
13
Rafnar
T.
Vermeulen
S.H.
Sulem
P.
Thorleifsson
G.
Aben
K.K.
Witjes
J.A.
Grotenhuis
A.J.
Verhaegh
G.W.
Hulsbergen-van de Kaa
C.A.
Besenbacher
S.
et al.
European genome-wide association study identifies SLC14A1 as a new urinary bladder cancer susceptibility gene
Hum. Mol. Genet.

2011
20
4268
4281
14
Garcia-Closas
M.
Ye
Y.
Rothman
N.
Figueroa
J.D.
Malats
N.
Dinney
C.P.
Chatterjee
N.
Prokunina-Olsson
L.
Wang
Z.
Lin
J.
et al.
A genome-wide association study of bladder cancer identifies a new susceptibility locus within SLC14A1, a urea transporter gene on chromosome 18q12.3
Hum. Mol. Genet.

2011
20
4282
4289
15
Figueroa
J.D.
Ye
Y.
Siddiq
A.
Garcia-Closas
M.
Chatterjee
N.
Prokunina-Olsson
L.
Cortessis
V.K.
Kooperberg
C.
Cussenot
O.
Benhamou
S.
et al.
Genome-wide association study identifies multiple loci associated with bladder cancer risk
Hum. Mol. Genet

2013
23
1387
1398
16
Kong
A.
Masson
G.
Frigge
M.L.
Gylfason
A.
Zusmanovich
P.
Thorleifsson
G.
Olason
P.I.
Ingason
A.
Steinberg
S.
Rafnar
T.
et al.
Detection of sharing by descent, long-range phasing and haplotype imputation
Nat. Genet.

2008
40
1068
1075
17
Kong
A.
Steinthorsdottir
V.
Masson
G.
Thorleifsson
G.
Sulem
P.
Besenbacher
S.
Jonasdottir
A.
Sigurdsson
A.
Kristinsson
K.T.
Jonasdottir
A.
et al.
Parental origin of sequence variants associated with complex diseases
Nature

2009
462
868
874
18
Styrkarsdottir
U.
Thorleifsson
G.
Sulem
P.
Gudbjartsson
D.F.
Sigurdsson
A.
Jonasdottir
A.
Jonasdottir
A.
Oddsson
A.
Helgason
A.
Magnusson
O.T.
et al.
Nonsense mutation in the LGR4 gene is associated with several human diseases and other traits
Nature

2013
497
517
520
19
Kong
A.
Thorleifsson
G.
Gudbjartsson
D.F.
Masson
G.
Sigurdsson
A.
Jonasdottir
A.
Walters
G.B.
Jonasdottir
A.
Gylfason
A.
Kristinsson
K.T.
et al.
Fine-scale recombination rate differences between sexes, populations and individuals
Nature

2010
467
1099
1103
20
Pruim
R.J.
Welch
R.P.
Sanna
S.
Teslovich
T.M.
Chines
P.S.
Gliedt
T.P.
Boehnke
M.
Abecasis
G.R.
Willer
C.J.
LocusZoom: regional visualization of genome-wide association scan results
Bioinformatics

2010
26
2336
2337
21
Knowles
M.A.
Molecular subtypes of bladder cancer: Jekyll and Hyde or chalk and cheese?
Carcinogenesis

2006
27
361
373
22
Lobry
C.
Oh
P.
Aifantis
I.
Oncogenic and tumor suppressor functions of Notch in cancer: it's NOTCH what you think
J. Exp. Med.

2011
208
1931
1935
23
Shi
T.
Xu
H.
Wei
J.
Ai
X.
Ma
X.
Wang
B.
Ju
Z.
Zhang
G.
Wang
C.
Wu
Z.
et al.
Association of low expression of notch-1 and jagged-1 in human papillary bladder cancer and shorter survival
J. Urol.

2008
180
361
366
24
Vallot
C.
Stransky
N.
Bernard-Pierrot
I.
Hérault
A.
Zucman-Rossi
J.
Chapeaublanc
E.
Vordos
D.
Laplanche
A.
Benhamou
S.
Lebret
T.
et al.
A novel epigenetic phenotype associated with the most aggressive pathway of bladder tumor progression
J. Natl. Cancer Inst.

2011
103
47
60
25
Rosenbloom
K.R.
Sloan
C.A.
V.S.
Dreszer
T.R.
Learned
K.
Kirkup
V.M.
Wong
M.C.
M.
Fang
R.
Heitner
S.G.
et al.
ENCODE data in the UCSC Genome Browser: year 5 update
Nucleic Acids Res.

2013
41
D56
D63
26. Meyer L.R., Zweig A.S., Hinrichs A.S., Karolchik D., Kuhn R.M., Wong M., Sloan C.A., Rosenbloom K.R., Roe G., B., et al. (2013) The UCSC Genome Browser database: extensions and updates 2013. Nucleic Acids Res., 41, D64–D69.
27. Ward L.D., Kellis M. (2012) HaploReg: a resource for exploring chromatin states, conservation, and regulatory motif alterations within sets of genetically linked variants. Nucleic Acids Res., 40, D930–D934.
28. Boyle A.P., Hong E.L., Hariharan M., Cheng Y., Schaub M.A., Kasowski M., Karczewski K.J., Park J., Hitz B.C., Weng S., et al. (2012) Annotation of functional variation in personal genomes using RegulomeDB. Genome Res., 22, 1790–1797.
29. Ng S.B., Turner E.H., Robertson P.D., Flygare S.D., Bigham A.W., Lee C., Shaffer T., Wong M., Bhattacharjee A., Eichler E.E., et al. (2009) Targeted capture and massively parallel sequencing of 12 human exomes. Nature, 461, 272–276.
30. Kung A.W.C., Xiao S.-M., Cherny S., Li G.H.Y., Gao Y., Tso G., Lau K.S., Luk K.D.K., Liu J., Cui B., et al. (2010) Association of JAG1 with bone mineral density and osteoporotic fractures: a genome-wide association study and follow-up replication studies. Am. J. Hum. Genet., 86, 229–239.
31. K., Styrkarsdottir U., Evangelou E., Hsu Y.-H., Duncan E.L., Ntzani E.E., Oei L., Albagha O.M.E., Amin N., Kemp J.P., et al. (2012) Genome-wide meta-analysis identifies 56 bone mineral density loci and reveals 14 loci associated with risk of fracture. Nat. Genet., 44, 491–501.
32. Ehret G.B., Munroe P.B., Rice K.M., Bochud M., Johnson A.D., Chasman D.I., Smith A.V., Tobin M.D., Verwoert G.C., et al.; International Consortium for Blood Pressure Genome-Wide Association Studies (2011) Genetic variants in novel pathways influence blood pressure and cardiovascular disease risk. Nature, 478, 103–109.
33. Espinoza I., Miele L. (2013) Notch inhibitors for cancer treatment. Pharmacol. Ther., 139, 95–110.
34. Lukk M., Kapushesky M., Nikkilä J., Parkinson H., Goncalves A., Huber W., Ukkonen E., Brazma A. (2010) A global map of human gene expression. Nat. Biotechnol., 28, 322–324.
35. The Cancer Genome Atlas Research Network (2014) Comprehensive molecular characterization of urothelial bladder carcinoma. Nature, 507, 315–322.
36. Hirao S., Hirao T., Marsit C.J., Hirao Y., Schned A., Devi-Ashok T., Nelson H.H., Andrew A., Karagas M.R., Kelsey K.T. (2005) Loss of heterozygosity on chromosome 9q and p53 alterations in human bladder cancer. Cancer, 104, 1918–1923.
37. Wetzels J.F.M., Kiemeney L.A.L.M., Swinkels D.W., Willems H.L., den Heijer M. (2007) Age- and gender-specific reference values of estimated GFR in Caucasians: the Nijmegen Biomedical Study. Kidney Int., 72, 632–637.
38. Sak S.C., Barrett J.H., Paul A.B., Bishop D.T., Kiltie A.E. (2005) The polyAT, intronic IVS11–6 and Lys939Gln XPC polymorphisms are not associated with transitional cell carcinoma of the bladder. Br. J. Cancer, 92, 2262–2265.
39. Matullo G., Guarrera S., Sacerdote C., Polidoro S., Davico L., Gamberini S., Karagas M., Casetta G., Rolle L., Piazza A., et al. (2005) Polymorphisms/haplotypes in DNA repair genes and smoking: a bladder cancer case-control study. Cancer Epidemiol. Biomark. Prev., 14, 2569–2578.
40. Shen M., Hung R.J., Brennan P., Malaveille C., Donato F., Placidi D., Carta A., Hautefeuille A., Boffetta P., Porru S. (2003) Polymorphisms of the DNA repair genes XRCC1, XRCC3, XPD, interaction with environmental exposures, and bladder cancer risk in a case-control study in northern Italy. Cancer Epidemiol. Biomark. Prev., 12, 1234–1240.
41. Kellen E., Zeegers M., Paulussen A., Van Dongen M., Buntinx F. (2006) Fruit consumption reduces the effect of smoking on bladder cancer risk. The Belgian case control study on bladder cancer. Int. J. Cancer, 118, 2572–2578.
42. Thirumaran R.K., Bermejo J.L., Rudnai P., Gurzau E., Koppova K., Goessler W., Vahter M., Leonardi G.S., Clemens F., Fletcher T., et al. (2006) Single nucleotide polymorphisms in DNA repair genes and basal cell carcinoma of skin. Carcinogenesis, 27, 1676–1681.
43. P., Wijkström H., Thorstenson A., J., Norming U., Wiklund P., Onelöv E., Steineck G. (2003) A population-based study of 538 patients with newly detected urinary bladder neoplasms followed during 5 years. Scand. J. Urol. Nephrol., 37, 195–201.
44. Lehmann M.-L., Selinski S., Blaszkewicz M., Orlich M., Ovsiannikov D., Moormann O., Guballa C., Kress A., Truss M.C., Gerullis H., et al. (2010) Rs710521[A] on chromosome 3q28 close to TP63 is associated with increased urinary bladder cancer risk. Arch. Toxicol., 84, 967–978.
45. Golka K., Schmidt T., Seidel T., Dietrich H., Roemer H.C., Lohlein D., Reckwitz T., Sokeland J., Weistenhofer W., Blaszkewicz M., et al. (2008) The influence of polymorphisms of glutathione S-transferases M1 and M3 on the development of human urothelial cancer. J. Toxicol. Environ. Health A, 71, 881–886.
46. Golka K., Hermes M., Selinski S., Blaszkewicz M., Bolt H.M., Roth G., Dietrich H., Prager H.-M., K., Hengstler J.G. (2009) Susceptibility to urinary bladder cancer: relevance of rs9642880[T], GSTM1 0/0 and occupational exposure. Pharmacogenet. Genomics, 19, 903–906.
47. Howie B.N., Donnelly P., Marchini J. (2009) A flexible and accurate genotype imputation method for the next generation of genome-wide association studies. PLoS Genet., 5, e1000529.
48. Marchini J., Howie B., Myers S., McVean G., Donnelly P. (2007) A new multipoint method for genome-wide association studies by imputation of genotypes. Nat. Genet., 39, 906–913.
49. Kutyavin I.V., Milesi D., Belousov Y., Podyminogin M., Vorobiev A., Gorn V., Lukhtanov E.A., Vermeulen N.M.J., Mahoney W. (2006) A novel endonuclease IV post-PCR genotyping system. Nucleic Acids Res., 34, e128.
50. Kong A., Gudbjartsson D.F., Sainz J., Jonsdottir G.M., Gudjonsson S.A., Richardsson B., Sigurdardottir S., Barnard J., Hallbeck B., Masson G., et al. (2002) A high-resolution recombination map of the human genome. Nat. Genet., 31, 241–247.
51. Gretarsdottir S., Thorleifsson G., Reynisdottir S.T., Manolescu A., Jonsdottir S., Jonsdottir T., Gudmundsdottir T., S.M., O.B., Gudjonsdottir H.M., et al. (2003) The gene encoding phosphodiesterase 4D confers risk of ischemic stroke. Nat. Genet., 35, 131–138.
52. Mantel N., Haenszel W. (1959) Statistical aspects of the analysis of data from retrospective studies of disease. J. Natl. Cancer Inst., 22, 719–748.
53. Higgins J.P.T., Thompson S.G. (2002) Quantifying heterogeneity in a meta-analysis. Stat. Med., 21, 1539–1558.
54. Kapushesky M., T., Burdett T., Culhane A., Farne A., Filippov A., Holloway E., Klebanov A., Kryvych N., Kurbatova N., et al. (2012) Gene Expression Atlas update – a value-added database of microarray and sequencing-based functional genomics experiments. Nucleic Acids Res., 40, D1077–D1081.
55. El Behi M., Krumeich S., Lodillinsky C., Kamoun A., Tibaldi L., Sugano G., De Reynies A., Chapeaublanc E., Laplanche A., Lebret T., et al. (2013) An essential role for decorin in bladder cancer invasiveness. EMBO Mol. Med., 5, 1835–1851.
56. Morgan R., Bryan R.T., Javed S., Launchbury F., Zeegers M.P., Cheng K.K., James N.D., Wallace D.M.A., Hurst C.D., Ward D.G., et al. (2013) Expression of Engrailed-2 (EN2) protein in bladder cancer and its potential utility as a urinary diagnostic biomarker. Eur. J. Cancer, 49, 2214–2222.
57. Southgate J., Hutton K.A., Thomas D.F., Trejdosiewicz L.K. (1994) Normal human urothelial cells in vitro: proliferation and induction of stratification. Lab. Investig., 71, 583–594.

## Author notes

These authors contributed equally to this work.