-
PDF
- Split View
-
Views
-
Cite
Cite
Minako Imamura, Atsushi Takahashi, Masatoshi Matsunami, Momoko Horikoshi, Minoru Iwata, Shin-ichi Araki, Masao Toyoda, Gayatri Susarla, Jeeyun Ahn, Kyu Hyung Park, Jinhwa Kong, Sanghoon Moon, Lucia Sobrin, International Diabetic Retinopathy and Genetics CONsortium (iDRAGON), Toshimasa Yamauchi, Kazuyuki Tobe, Hiroshi Maegawa, Takashi Kadowaki, Shiro Maeda, Genome-wide association studies identify two novel loci conferring susceptibility to diabetic retinopathy in Japanese patients with type 2 diabetes, Human Molecular Genetics, Volume 30, Issue 8, 15 April 2021, Pages 716–726, https://doi.org/10.1093/hmg/ddab044
- Share Icon Share
Abstract
Several reports have suggested that genetic susceptibility contributes to the development and progression of diabetic retinopathy. We aimed to identify genetic loci that confer susceptibility to diabetic retinopathy in Japanese patients with type 2 diabetes. We analysed 5 790 508 single nucleotide polymorphisms (SNPs) in 8880 Japanese patients with type 2 diabetes, 4839 retinopathy cases and 4041 controls, as well as 2217 independent Japanese patients with type 2 diabetes, 693 retinopathy cases and 1524 controls. The results of these two genome-wide association studies (GWAS) were combined with an inverse variance meta-analysis (Stage-1), followed by de novo genotyping for the candidate SNP loci (P < 1.0 × 10−4) in an independent case–control study (Stage-2, 2260 cases and 723 controls). After combining the association data (Stages 1 and 2) using meta-analysis, the associations of two loci reached a genome-wide significance level: rs12630354 near STT3B on chromosome 3, P = 1.62 × 10−9, odds ratio (OR) = 1.17, 95% confidence interval (CI) 1.11–1.23, and rs140508424 within PALM2 on chromosome 9, P = 4.19 × 10−8, OR = 1.61, 95% CI 1.36–1.91. However, the association of these two loci was not replicated in Korean, European or African American populations. Gene-based analysis using Stage-1 GWAS data identified a gene-level association of EHD3 with susceptibility to diabetic retinopathy (P = 2.17 × 10−6). In conclusion, we identified two novel SNP loci, STT3B and PALM2, and a novel gene, EHD3, that confers susceptibility to diabetic retinopathy; however, further replication studies are required to validate these associations.
Introduction
Diabetic retinopathy (DR) is one of the microvascular complications in patients with diabetes mellitus, and a major cause of visual loss in adults. Its prevalence has been reported to be approximately 35% of patients with diabetes mellitus (1). It has been shown that persistent hyperglycaemia, along with known risk factors, including a long duration of diabetes, hypertension and dyslipidaemia, are responsible for the onset or progression of DR (2). However, the susceptibility to DR or severity of the disease varies among patients with diabetes mellitus, even if their glycaemic controls, the duration of diabetes or status of other risk factors are similar (3). In addition, familial aggregation of DR or advanced DR has been observed among patients with type 1 or type 2 diabetes (4). These pieces of evidence suggest that genetic susceptibility contributes to the development and progression of DR. Although the heritability of DR has been estimated as 25–52% from the results of family-based studies (5), the details about the genetic background and molecular mechanisms of DR remain poorly understood.
A genome-wide association study (GWAS) is a powerful tool to identify genetic loci for complex diseases, and a large number of genetic loci for the susceptibility to various diseases, such as type 2 diabetes, have been successfully identified through GWAS (6–9). GWAS for DR have been performed, but most of the studies only reported suggestive signals with no replication (5) because of their limited sample sizes. Recently, several loci with genome-wide significant associations have been identified using GWAS; these include ARHGAP22, HS6ST3, KIAA0825, PLXDC2, GRB2, NVL and NOX4 (10–13), as well as EPO and SLC19A3 by candidate gene approaches (14–15). However, the associations in those original studies have not been confirmed by subsequent replication analyses, which is likely due to limited sample sizes in the discovery stage (n = 749–4666) and the possible heterogeneity of the phenotypic characteristics among the studies.
To facilitate identification of the novel genetic factors associated with DR, we performed a GWAS meta-analysis for DR using the existing genome-wide single nucleotide polymorphism (SNP) data for Japanese patients with type 2 diabetes (8,16). To the best of our knowledge, this is the largest GWAS for DR to date.
Results
A meta-analysis of GWAS for DR in Japanese patients with type 2 diabetes
In the discovery stage (Stage-1), we performed GWAS for DR using 8880 Japanese individuals with type 2 diabetes (4839 DR cases and 4041 controls) in set-1 as well as 2217 Japanese individuals with type 2 diabetes (693 DR cases and 1524 controls) in set-2 that passed the quality control (Supplementary Material, Fig. S1). DR cases include type 2 diabetes patients with any stages of DR: simple retinopathy, pre-proliferative retinopathy, proliferative retinopathy or post laser photocoagulation therapy. We used two genome-wide SNP data from the previously reported GWAS for type 2 diabetes, one was genotyped with Omni Express Exome Bead Chip (8) for set-1, and the other was genotyped with Illumina Human 610 K SNP array (16) for set-2. Imputation was performed for both sets using MACH and Minimac (17,18) with individuals from the 1000 Genomes Project (phased JPT, CHB, and Han Chinese South (CHS) data n = 275, March 2012) as reference populations (19). We obtained directly genotyped data for 258 754 SNPs, which were available in both set-1 and set-2, and imputed data for 5 786 456 SNPs, which passed the quality control (r2 ≥ 0.7) in both studies. As for 254 702 SNPs for which both directly and imputed genotyped data were available, we prioritized directly genotyped data for further analysis. Thus, we performed two GWAS for DR with 5 790 508 SNP data (258 754 direct genotyped, and 5 531 754 imputed) in set-1 and set-2 independently, then combined these results by meta-analysis (Fig. 1, Supplementary Material, Fig. S1). There was no obvious genomic inflation in the quantile–quantile plot (Supplementary Material, Fig. S2). The most significant association for DR in the discovery stage was observed at rs8025089 in chromosome 15, although its association did not reach genome-wide significance level (P = 2.97 × 10−7, Supplementary Material, Table S1).

Manhattan plot in the discovery stage (Stage-1, set-1 + set-2). Each plot indicates individual association data of 5 790 508 SNPs. The Y axis shows −log10 P-values of association analysis of each SNP. The red line indicates the genome-wide significant threshold (P = 5 × 10−8). The association of suggestive loci shown in large-sized plots above the blue line was further examined by Stage-2 analysis.
We then selected 85 suggestive loci associated with DR in the Stage-1 meta-analysis (P < 1 × 10−4, Supplementary Material, Table S1) for further analysis using an independent case–control sample (Stage-2, 2260 cases, and 723 controls). In Stage-2, we successfully obtained de novo genotyping data for 62 lead SNPs or their proxies out of 85 loci using the multiplex PCR invader assay. By combining all association data (Stage-1 set-1, set-2, and Stage-2) with the inverse-variance fixed-effects meta-analysis, we identified two SNP loci showing genome-wide significant association with DR: rs12630354 near STT3B and rs140508424 in PALM2 (rs12630354; P = 1.62 × 10−9, odds ratio (OR) = 1.17, 95% confidence interval (CI) 1.11–1.23, rs140508424; P = 4.19 × 10−8, OR = 1.61, 95% CI 1.36–1.91, Table 1 and Supplementary Material, Table S2). We additionally identified four SNP loci with suggestive evidence for the associations with DR (5.0 × 10−8 ≤ P < 5.0 × 10−7, Table 1, and Supplementary Material, Table S2). Regional plots for these 6 (2 + 4) loci are shown in Fig. 2 or Supplementary Material, Fig. S3.
Association of six SNP loci with DR in Japanese patients with type 2 diabetes
SNPid . | RA . | AA . | Stage . | RAF . | RSQR . | OR . | 95%CI . | P . | Phet . | |
---|---|---|---|---|---|---|---|---|---|---|
CHRPOS (GRCh37)Nearest gene . | . | . | . | Case . | Control . | . | . | . | . | . |
rs12630354 | T | A | Stage-1, set-1 | 0.435 | 0.400 | 0.96 | 1.16 | (1.09–1.23) | 2.17 × 10−6 | |
3 | Stage-1, set-2 | 0.432 | 0.407 | 0.90 | 1.12 | (0.98–1.28) | 0.10 | |||
31 501 119 | Stage2 | 0.436 | 0.385 | 1.24 | (1.10–1.41) | 5.00 × 10−4 | ||||
STT3B | Combined a | 1.17 | (1.11–1.23) | 1.62 × 10−9 | 0.49 | |||||
rs140508424 | C | T | Stage-1, set-1 | 0.983 | 0.974 | 1.00 | 1.53 | (1.24–1.87) | 4.84 × 10−5 | |
9 | Stage-1, set-2 | 0.985 | 0.981 | 0.93 | 1.32 | (0.78–2.24) | 0.31 | |||
112 695 256 | Stage2 | 0.985 | 0.968 | 2.18 | (1.49–3.21) | 7.15 × 10−5 | ||||
PALM2 | Combined | 1.61 | (1.36–1.91) | 4.19 × 10−8 | 0.20 | |||||
rs1074390 | G | A | Stage-1, set-1 | 0.419 | 0.384 | typed | 1.15 | (1.09–1.23) | 3.00 × 10−6 | |
15 | Stage-1, set-2 | 0.424 | 0.390 | typed | 1.15 | (1.01–1.30) | 0.036 | |||
80 080 307 | Stage2 | 0.394 | 0.366 | 1.12 | (0.99–1.27) | 0.064 | ||||
MTHFS | Combined | 1.15 | (1.09–1.21) | 5.69 × 10−8 | 0.92 | |||||
rs7861760 | G | A | Stage-1, set-1 | 0.687 | 0.657 | 0.96 | 1.15 | (1.08–1.23) | 1.36 × 10−5 | |
9 | Stage-1, set-2 | 0.675 | 0.667 | 0.96 | 1.04 | (0.91–1.20) | 0.56 | |||
34 368 674 | Stage2 | 0.691 | 0.638 | 1.27 | (1.12–1.44) | 2.00 × 10−4 | ||||
KIAA1161/NUDT2 | Combined | 1.15 | (1.10–1.22) | 1.04 × 10−7 | 0.12 | |||||
rs2471299 | A | C | Stage-1, set-1 | 0.563 | 0.528 | 0.98 | 1.15 | (1.09–1.22) | 3.83 × 10−6 | |
7 | Stage-1, set-2 | 0.561 | 0.541 | 0.95 | 1.09 | (0.95–1.25) | 0.20 | |||
139 982 756 | Stage2 | 0.549 | 0.519 | 1.13 | (1.0004–1.27) | 0.049 | ||||
SLC37A3 | Combined | 1.14 | (1.08–1.20) | 2.05 × 10−7 | 0.75 | |||||
rs74305293 | T | G | Stage-1, set-1 | 0.274 | 0.249 | 0.87 | 1.16 | (1.08–1.25) | 6.19 × 10−5 | |
8 | Stage-1, set-2 | 0.258 | 0.236 | 0.86 | 1.15 | (0.98–1.35) | 0.084 | |||
83 575 369 | Stage2 | 0.308 | 0.269 | 1.22 | (1.06–1.40) | 4.40 × 10−3 | ||||
SNX16 | Combined | 1.17 | (1.10–1.24) | 2.47 × 10−7 | 0.79 |
SNPid . | RA . | AA . | Stage . | RAF . | RSQR . | OR . | 95%CI . | P . | Phet . | |
---|---|---|---|---|---|---|---|---|---|---|
CHRPOS (GRCh37)Nearest gene . | . | . | . | Case . | Control . | . | . | . | . | . |
rs12630354 | T | A | Stage-1, set-1 | 0.435 | 0.400 | 0.96 | 1.16 | (1.09–1.23) | 2.17 × 10−6 | |
3 | Stage-1, set-2 | 0.432 | 0.407 | 0.90 | 1.12 | (0.98–1.28) | 0.10 | |||
31 501 119 | Stage2 | 0.436 | 0.385 | 1.24 | (1.10–1.41) | 5.00 × 10−4 | ||||
STT3B | Combined a | 1.17 | (1.11–1.23) | 1.62 × 10−9 | 0.49 | |||||
rs140508424 | C | T | Stage-1, set-1 | 0.983 | 0.974 | 1.00 | 1.53 | (1.24–1.87) | 4.84 × 10−5 | |
9 | Stage-1, set-2 | 0.985 | 0.981 | 0.93 | 1.32 | (0.78–2.24) | 0.31 | |||
112 695 256 | Stage2 | 0.985 | 0.968 | 2.18 | (1.49–3.21) | 7.15 × 10−5 | ||||
PALM2 | Combined | 1.61 | (1.36–1.91) | 4.19 × 10−8 | 0.20 | |||||
rs1074390 | G | A | Stage-1, set-1 | 0.419 | 0.384 | typed | 1.15 | (1.09–1.23) | 3.00 × 10−6 | |
15 | Stage-1, set-2 | 0.424 | 0.390 | typed | 1.15 | (1.01–1.30) | 0.036 | |||
80 080 307 | Stage2 | 0.394 | 0.366 | 1.12 | (0.99–1.27) | 0.064 | ||||
MTHFS | Combined | 1.15 | (1.09–1.21) | 5.69 × 10−8 | 0.92 | |||||
rs7861760 | G | A | Stage-1, set-1 | 0.687 | 0.657 | 0.96 | 1.15 | (1.08–1.23) | 1.36 × 10−5 | |
9 | Stage-1, set-2 | 0.675 | 0.667 | 0.96 | 1.04 | (0.91–1.20) | 0.56 | |||
34 368 674 | Stage2 | 0.691 | 0.638 | 1.27 | (1.12–1.44) | 2.00 × 10−4 | ||||
KIAA1161/NUDT2 | Combined | 1.15 | (1.10–1.22) | 1.04 × 10−7 | 0.12 | |||||
rs2471299 | A | C | Stage-1, set-1 | 0.563 | 0.528 | 0.98 | 1.15 | (1.09–1.22) | 3.83 × 10−6 | |
7 | Stage-1, set-2 | 0.561 | 0.541 | 0.95 | 1.09 | (0.95–1.25) | 0.20 | |||
139 982 756 | Stage2 | 0.549 | 0.519 | 1.13 | (1.0004–1.27) | 0.049 | ||||
SLC37A3 | Combined | 1.14 | (1.08–1.20) | 2.05 × 10−7 | 0.75 | |||||
rs74305293 | T | G | Stage-1, set-1 | 0.274 | 0.249 | 0.87 | 1.16 | (1.08–1.25) | 6.19 × 10−5 | |
8 | Stage-1, set-2 | 0.258 | 0.236 | 0.86 | 1.15 | (0.98–1.35) | 0.084 | |||
83 575 369 | Stage2 | 0.308 | 0.269 | 1.22 | (1.06–1.40) | 4.40 × 10−3 | ||||
SNX16 | Combined | 1.17 | (1.10–1.24) | 2.47 × 10−7 | 0.79 |
RA: risk allele; AA: alternative allele; RAF: risk allele frequency; RSQR: R-square (imputation quality); OR: odds ratio; 95%; CI: 95% confidence interval; P: association P-value; Phet: P-value for heterogeneity test; CHR: Chromosome; POS: Chromosome position, a result of meta-analysis combining Stage-1 (set-1 and -2), Stage-2, fixed-effect inverse-variance meta-analysis was utilized to generate the P-value.
Association of six SNP loci with DR in Japanese patients with type 2 diabetes
SNPid . | RA . | AA . | Stage . | RAF . | RSQR . | OR . | 95%CI . | P . | Phet . | |
---|---|---|---|---|---|---|---|---|---|---|
CHRPOS (GRCh37)Nearest gene . | . | . | . | Case . | Control . | . | . | . | . | . |
rs12630354 | T | A | Stage-1, set-1 | 0.435 | 0.400 | 0.96 | 1.16 | (1.09–1.23) | 2.17 × 10−6 | |
3 | Stage-1, set-2 | 0.432 | 0.407 | 0.90 | 1.12 | (0.98–1.28) | 0.10 | |||
31 501 119 | Stage2 | 0.436 | 0.385 | 1.24 | (1.10–1.41) | 5.00 × 10−4 | ||||
STT3B | Combined a | 1.17 | (1.11–1.23) | 1.62 × 10−9 | 0.49 | |||||
rs140508424 | C | T | Stage-1, set-1 | 0.983 | 0.974 | 1.00 | 1.53 | (1.24–1.87) | 4.84 × 10−5 | |
9 | Stage-1, set-2 | 0.985 | 0.981 | 0.93 | 1.32 | (0.78–2.24) | 0.31 | |||
112 695 256 | Stage2 | 0.985 | 0.968 | 2.18 | (1.49–3.21) | 7.15 × 10−5 | ||||
PALM2 | Combined | 1.61 | (1.36–1.91) | 4.19 × 10−8 | 0.20 | |||||
rs1074390 | G | A | Stage-1, set-1 | 0.419 | 0.384 | typed | 1.15 | (1.09–1.23) | 3.00 × 10−6 | |
15 | Stage-1, set-2 | 0.424 | 0.390 | typed | 1.15 | (1.01–1.30) | 0.036 | |||
80 080 307 | Stage2 | 0.394 | 0.366 | 1.12 | (0.99–1.27) | 0.064 | ||||
MTHFS | Combined | 1.15 | (1.09–1.21) | 5.69 × 10−8 | 0.92 | |||||
rs7861760 | G | A | Stage-1, set-1 | 0.687 | 0.657 | 0.96 | 1.15 | (1.08–1.23) | 1.36 × 10−5 | |
9 | Stage-1, set-2 | 0.675 | 0.667 | 0.96 | 1.04 | (0.91–1.20) | 0.56 | |||
34 368 674 | Stage2 | 0.691 | 0.638 | 1.27 | (1.12–1.44) | 2.00 × 10−4 | ||||
KIAA1161/NUDT2 | Combined | 1.15 | (1.10–1.22) | 1.04 × 10−7 | 0.12 | |||||
rs2471299 | A | C | Stage-1, set-1 | 0.563 | 0.528 | 0.98 | 1.15 | (1.09–1.22) | 3.83 × 10−6 | |
7 | Stage-1, set-2 | 0.561 | 0.541 | 0.95 | 1.09 | (0.95–1.25) | 0.20 | |||
139 982 756 | Stage2 | 0.549 | 0.519 | 1.13 | (1.0004–1.27) | 0.049 | ||||
SLC37A3 | Combined | 1.14 | (1.08–1.20) | 2.05 × 10−7 | 0.75 | |||||
rs74305293 | T | G | Stage-1, set-1 | 0.274 | 0.249 | 0.87 | 1.16 | (1.08–1.25) | 6.19 × 10−5 | |
8 | Stage-1, set-2 | 0.258 | 0.236 | 0.86 | 1.15 | (0.98–1.35) | 0.084 | |||
83 575 369 | Stage2 | 0.308 | 0.269 | 1.22 | (1.06–1.40) | 4.40 × 10−3 | ||||
SNX16 | Combined | 1.17 | (1.10–1.24) | 2.47 × 10−7 | 0.79 |
SNPid . | RA . | AA . | Stage . | RAF . | RSQR . | OR . | 95%CI . | P . | Phet . | |
---|---|---|---|---|---|---|---|---|---|---|
CHRPOS (GRCh37)Nearest gene . | . | . | . | Case . | Control . | . | . | . | . | . |
rs12630354 | T | A | Stage-1, set-1 | 0.435 | 0.400 | 0.96 | 1.16 | (1.09–1.23) | 2.17 × 10−6 | |
3 | Stage-1, set-2 | 0.432 | 0.407 | 0.90 | 1.12 | (0.98–1.28) | 0.10 | |||
31 501 119 | Stage2 | 0.436 | 0.385 | 1.24 | (1.10–1.41) | 5.00 × 10−4 | ||||
STT3B | Combined a | 1.17 | (1.11–1.23) | 1.62 × 10−9 | 0.49 | |||||
rs140508424 | C | T | Stage-1, set-1 | 0.983 | 0.974 | 1.00 | 1.53 | (1.24–1.87) | 4.84 × 10−5 | |
9 | Stage-1, set-2 | 0.985 | 0.981 | 0.93 | 1.32 | (0.78–2.24) | 0.31 | |||
112 695 256 | Stage2 | 0.985 | 0.968 | 2.18 | (1.49–3.21) | 7.15 × 10−5 | ||||
PALM2 | Combined | 1.61 | (1.36–1.91) | 4.19 × 10−8 | 0.20 | |||||
rs1074390 | G | A | Stage-1, set-1 | 0.419 | 0.384 | typed | 1.15 | (1.09–1.23) | 3.00 × 10−6 | |
15 | Stage-1, set-2 | 0.424 | 0.390 | typed | 1.15 | (1.01–1.30) | 0.036 | |||
80 080 307 | Stage2 | 0.394 | 0.366 | 1.12 | (0.99–1.27) | 0.064 | ||||
MTHFS | Combined | 1.15 | (1.09–1.21) | 5.69 × 10−8 | 0.92 | |||||
rs7861760 | G | A | Stage-1, set-1 | 0.687 | 0.657 | 0.96 | 1.15 | (1.08–1.23) | 1.36 × 10−5 | |
9 | Stage-1, set-2 | 0.675 | 0.667 | 0.96 | 1.04 | (0.91–1.20) | 0.56 | |||
34 368 674 | Stage2 | 0.691 | 0.638 | 1.27 | (1.12–1.44) | 2.00 × 10−4 | ||||
KIAA1161/NUDT2 | Combined | 1.15 | (1.10–1.22) | 1.04 × 10−7 | 0.12 | |||||
rs2471299 | A | C | Stage-1, set-1 | 0.563 | 0.528 | 0.98 | 1.15 | (1.09–1.22) | 3.83 × 10−6 | |
7 | Stage-1, set-2 | 0.561 | 0.541 | 0.95 | 1.09 | (0.95–1.25) | 0.20 | |||
139 982 756 | Stage2 | 0.549 | 0.519 | 1.13 | (1.0004–1.27) | 0.049 | ||||
SLC37A3 | Combined | 1.14 | (1.08–1.20) | 2.05 × 10−7 | 0.75 | |||||
rs74305293 | T | G | Stage-1, set-1 | 0.274 | 0.249 | 0.87 | 1.16 | (1.08–1.25) | 6.19 × 10−5 | |
8 | Stage-1, set-2 | 0.258 | 0.236 | 0.86 | 1.15 | (0.98–1.35) | 0.084 | |||
83 575 369 | Stage2 | 0.308 | 0.269 | 1.22 | (1.06–1.40) | 4.40 × 10−3 | ||||
SNX16 | Combined | 1.17 | (1.10–1.24) | 2.47 × 10−7 | 0.79 |
RA: risk allele; AA: alternative allele; RAF: risk allele frequency; RSQR: R-square (imputation quality); OR: odds ratio; 95%; CI: 95% confidence interval; P: association P-value; Phet: P-value for heterogeneity test; CHR: Chromosome; POS: Chromosome position, a result of meta-analysis combining Stage-1 (set-1 and -2), Stage-2, fixed-effect inverse-variance meta-analysis was utilized to generate the P-value.

Regional association plots of the GWAS meta-analysis for two novel susceptibility loci to DR in the Japanese population. Each plot shows −log10 P-values against the chromosomal positions of SNPs in a specific region. The SNP with the strongest association signal (lead SNP) in each locus of Stage-1 (set-1 + set-2) + Stage-2 meta-analysis is represented as a large pink diamond. The other plots represent the association data of Stage-1 (set-1 + set-2) meta-analysis, coloured according to the extent of LD with the lead SNP. The estimated recombination rates from the hg19/1000 Genomes Project in March 2014 East Asian references are shown as light-blue lines. (a) STT3B; (b) PALM2.
We next performed de novo replication analysis using additional independent Japanese (n = 1589) samples. None of these associations were statistically significant (P > 0.05), although the association of rs12630354 in STT3B was strengthened by including the Japanese replication analysis data (OR = 1.16, 95% CI 1.11–1.22, P = 6.97 × 10−10, Supplementary Material, Table S3). The association of rs140508424 was no longer genome-wide significant after including the additional Japanese replication data (OR = 1.53, 95% CI 1.30–1.81, P = 2.78 × 10−7, Supplementary Material, Table S3). We additionally performed in silico replication analyses in previously reported European (n = 2464) and African American (n = 1495) GWAS for DR (12) and a Korean DR cohort (n = 14 305). The association of rs12630354 was not significant in either the European, the African American or the Korean cohort (P > 0.05), although the direction of the effect was in concordance with that of our Japanese GWAS (Supplementary Material, Table S4). Regarding rs140508424, the association was not significant in the Korean cohort (P > 0.05, Supplementary Material, Table S4). The association data of rs140508424 were not present in either the European or African American cohort.
We next performed a sub-analysis of the primary studies described above to explore the genetic loci specifically related to the advanced DR. We selected type 2 diabetes patients with pre-proliferative DR, proliferative DR or post photocoagulation as cases, and used in the GWAS for advanced DR (Stage-1) and a validation study (Stage-2) (Supplementary Material, Figs S1 and S4, Supplementary Material, Tables S5 and S6). We identified two loci showing borderline association with advanced DR (5.0 × 10−8 ≤ P < 5.0 × 10 −7, Supplementary Material, Table S6), but no locus was significantly associated with the advanced stage of DR in the meta-analysis (Stage-1 set-1, set-2, and Stage-2).
Fine-mapped replication analysis for previously reported DR loci and candidate loci
We examined the association of nine previously reported DR susceptible loci (P < 5.0 × 10−8 in the original reports) (10–15) and three candidate loci, ACE, AKR1B1 and VEGFA, in our Japanese GWAS meta-analysis data (Supplementary Material, Tables S7 and S8). Of the 12 loci examined, the lead SNPs of three loci, HS6ST3, KIAA0825 and NOX4, were significantly associated with DR in Japanese patients with type 2 diabetes (P < 0.05/12 = 4.17 × 10−3), although linkage disequilibrium between our lead SNPs and those of original studies were low to moderate (r2 < 0.30) (Supplementary Material, Fig. S5). We observed no significant association with DR in the other nine loci (P > 4.17 × 10−3).
Expression quantitative trait (eQTL) analysis
To uncover the molecular mechanisms of DR susceptibility, we examined the association of the lead SNPs and gene expression in various tissues by searching publicly available eQTL database, GTEx (20), and Eye Genotype Expression database, EyeGEx (21) (Supplementary Material, Tables S9 and S10). Regarding the two SNP loci with a genome-wide significant association, rs12630354 and rs140508424, we identified a significant positive association between rs12630354-T and STT3B expression in the adrenal cortex (P = 9.2 × 10−6, Supplementary Material, Table S9). However, we did not find any significant association between rs12630354-T and the expression of genes, including STT3B in the retina (EyeGEx; P > 0.05, Supplementary Material, Table S10). Rs140508424 is monoallelic in the European population, and eQTL data for the SNP were not available in either GTEx or EyeGEx. We also found that rs1074390-G, a risk allele of a suggestive locus, was associated with decreasing ST20 and MTHFS expression in the retina (ST20; P = 1.35 × 10−7, MTHFS; P = 0.0078) as well as in other multiple tissues (e.g. ST20 in the tibial nerve, P = 1.9 × 10−7, MTHFS in the tibial artery, P = 2.9 × 10−9) (Supplementary Material, Tables S9 and S10).
Functional mapping and annotation
We next performed functional mapping and annotation using a publicly available platform, FUMA-GWAS (https://fuma.ctglab.nl/) (22). Results of the chromatin interaction analysis suggested that rs12630354 in the STT3B locus interacted with STT3B, GADL1, OSBPL10, ZNF860, CMTM8 and CNOT10, respectively, (Fig. 3, Supplementary Material, Table S11). Similar to the results of eQTL analysis described above, it was shown that rs12630354 had a physical interaction with STT3B, and rs12630354-T, the risk allele for DR, was positively correlated with the expression value of STT3B (Fig. 3, Supplementary Material, Table S11).

3D chromatin interaction (Hi-C) and eQTL analysis on Chromosome 3. The outermost layer indicates a Manhattan plot of GWAS (Stage-1). In the second layer, genomic risk loci with association P < 5 × 10−6 in Stage-1 are highlighted in blue. If the gene is mapped only by chromatin interactions or only by eQTLs, it is coloured orange or green, respectively. When the gene is mapped by both, it is coloured red. Links coloured orange are chromatin interactions. Hi-C revealed significant interactions between rs12630354 and STT3B or five other genes on chromosome 3 (FDR < 1 × 10−6 shown in red or orange). Also, rs12630354 was associated with the expression of STT3B and TGFBR2 (FDR < 1 × 10−5, shown in red or green).
Gene-based and gene-set association analyses
We performed a gene-based association analysis by MAGMA (23) using Stage-1 meta-analysis data of 5 786 456 imputed SNP data with high imputation quality (r2 ≥ 0.7). We identified that EHD3 was significantly associated with DR (P = 2.17 × 10−6 < 2.71 × 10−6 = 0.05/18480, Fig. 4, Supplementary Material, Fig. S6 and Supplementary Material, Table S12). We also performed a gene-set analysis with MAGMA using the Stage-1 summary data. The most significant gene set was the adenylate cyclase-inhibiting G protein-coupled receptor signalling pathway (GO:0007193; beta = 0.40, P = 1.38 × 10−5), followed by the leucine zipper domain binding (GO:0043522; beta = 1.05, P = 3.21 × 10−5). However, these enrichments were not statistically significant after Bonferroni correction (P > 3.2 × 10−6 = 0.05/15485, Supplementary Material, Table S13).

Manhattan plot for gene-based association analysis. Each plot indicates individual gene-based association data of 18 480 genes. The x-axis indicates the chromosomal position. The Y-axis shows −log10 P-values of association. The red broken line indicates the threshold of the Bonferroni adjusted P-value (P = 2.71× 10−6).
Polygenic correlations between DR and other traits
We assessed the genetic overlap between DR and other traits by evaluating the genetic correlations using LD regression, which is a polygenic model that could take the consistency of the genetic effect directions between the traits examined into account (24). We incorporated GWAS results for type 2 diabetes (8), diabetic nephropathy (DN) (9,25,26) or 58 quantitative traits, such as body mass index (BMI), fasting glucose and glycated haemoglobin (HbA1c) (27,28). We did not observe any significant genetic overlap between DR and DN; in contrast, DR and DN unexpectedly showed the opposite direction of the effect (Rg < 0, Supplementary Material, Table S14). We observed a better correlation between DR and serum phosphate, fasting glucose, HbA1c or type 2 diabetes, although these correlations were not significant in the present analyses (P > 0.05, Supplementary Material, Table S15).
Discussion
In this study, we performed a GWAS meta-analysis for DR in 11 097 Japanese patients with type 2 diabetes, and a subsequent validation study using an independent 2983 Japanese patients with type 2 diabetes. After combining these data, we identified two genetic loci, namely STT3B and PALM2, predisposing to DR with a genome-wide significant association (P < 5 × 10−8), although the association of PALM2 was no longer significant after including the additional Japanese replication data. We also identified EHD3 as a novel susceptibility gene to DR by subsequent gene-based analysis.
To the best of our knowledge, this is the largest GWAS for DR to date. The strongest association was observed at the rs12630354 locus, 73 kbp upstream of STT3B. In silico chromatin interactions and eQTL analyses suggest that the risk allele, rs12630354-T, might be involved in transcriptional regulation and may contribute to DR susceptibility through a consequent up-regulation of STT3B expression. STT3B encodes a catalytic subunit of an oligosaccharyltransferase (OST) complex, which transfers oligosaccharides onto asparagine residues (29). STT3B is ubiquitously expressed in human tissues (Supplementary Material, Fig. S7), and the homozygous mutation of STT3B, which results in the ablation of STT3B mRNA, causes a congenital neurological disease, namely CDG1X (congenital disorder of glycosylation Ix; OMIM entry # 615597), and clinical manifestations of CDG1X include multiple neurologic abnormalities, as well as a lack of visual tracking and the optic nerve hypoplasia (30). Although we did not find any significant effect of rs12630354 in the retina eQTL database (EyeGEx, Supplementary Material, Table S10), rs12630354-T was significantly associated with an increase in STT3B expression in the adrenal gland (P = 9.2 × 10−6) (GTEx, Supplementary Material, Table S9). Since OST complex proteins, including STT3B, have been suggested to participate in local synthesis and quality control of membrane proteins involved in cholesterol and steroid metabolism in the adrenocortical cells (31), rs12630354-T might affect the susceptibility to DR indirectly via the dysregulation of cholesterol and steroid metabolism in the adrenal glands.
The second signal is rs140508424 located at the intron of PALM2. PALM2 (paralemmin 2) is a member of a multigene family consisting of two other members, paralemmin (PALM) and palmdelphin (PALMD) (32). Palm2 has been reported to be expressed in the mouse lens after birth but is not abundant in the retina, whereas the expression of palm has been observed in the lens and retina (33). Although there has been no evidence that rs140508424 affects the function or transcriptional regulation of PALM2, rs140508424-C might confer DR susceptibility via dysregulation of retinal layer formation. Since rs140508424 is monoallelic in the population of European descent (risk allele: C EUR 100% in 1000G Phase 3), this is a susceptibility locus to DR specific for East Asian populations, including the Japanese.
By gene-based analysis, we have additionally identified EHD3 as a new gene susceptible to DR. EHD3 is a member of the EH domain-containing protein family, and the principal function of EHD3 is to regulate endosomal membrane trafficking during endocytosis (34). A previous experimental report using zebrafish demonstrated that EHD1/EHD3-dependent ciliary vesicle formation was required in the early stages of ciliogenesis (35), which suggests that the dysregulation of ciliogenesis could functionally link to DR susceptibility. By searching the publicly available databases, we found that EHD3 was expressed in human retina: Bgee (https://bgee.org), RefEx (https://refex.dbcls.jp/index.php?lang=en).
Among the remaining loci with a borderline association, the association of rs1074390 located near MTHFS (methenyltetrahydrofolate synthetase) was close to a genome-wide significance level (P = 5.69 × 10−8), and the DR risk allele, rs1074390-G, was associated with a decrease in MTHFS expression in multiple tissues including the retina. The MTHFS gene encodes a key enzyme involved in folate metabolism, which catalyses the conversion of 5-formyltetrahydrofolate to 5,10-methenyltetrahydrofolate (36). Since it has been shown that folate metabolism is closely related to the regulation of homocysteine, which has been suggested to be associated with the prevalence of microvascular diseases including DR (37,38), MTHFS is a biologically plausible candidate for the susceptibility to DR. A non-synonymous variant, rs1801133 (677C-T, ALA222VAL) of methylenetetrahydrofolate reductase (MTHFR), another folate cycle enzyme, which catalyses the conversion of 5,10-methylenetetrahydrofolate to 5-methyltetrahydrofolate, has been shown to be associated with plasma homocysteine concentrations, or with several macro/microvascular diseases, including DR (39–41). However, we did not observe a significant association between variants in MTHFR and DR in our GWAS meta-analysis (data not shown).
As stated above, the GWAS meta-analysis we performed in this study is the largest GWAS for DR in terms of sample size. In addition, we selected individuals belonging to a genetically homogeneous population, the Japanese Hondo cluster, from the results of PCA, further strengthening the power of our study. However, the association of the two novel loci for DR, STT3B or PALM2, has not been replicated in replication studies using Japanese, Korean, European or African American samples; therefore, these observations may not be in line with standards for GWA studies. The estimated study power for the Japanese, European and African American replication studies was between 20.4% and 73.9% when we set α = 0.05, the odds ratio in the discovery stage as the expected effect sizes and prevalence of the disease was assumed to be 10% (Supplementary Material, Tables S3 and S4), suggesting that the sample size of these replication sets except that of the Korean study may be underpowered. Moreover, the lack of a significant association in the population other than East Asians might be explained by the genetic heterogeneity among different ethnicities; in other words, the association of the two novel loci with the disease could be population specific. The association results of the replication studies might have been biased by the phenotypic heterogeneity among Japanese cohorts (Stages 1, 2 and 3) and in silico replication cohorts (Korean, European and African American). Japanese cohorts included extreme controls defined as type 2 diabetes patients with no DR with long duration of diabetes (set-1; ≥10 years, set-2; ≥5 years) or with diabetic nephropathy. Meanwhile controls of European and African American cohorts included type 2 diabetes patients with no DR regardless of diabetic duration; controls of Korean cohorts included type 2 diabetes patients regardless of DR status. Further studies, such as a large-scale longitudinal study with strictly defined phenotype in various ethnic groups, are required to elucidate the association of STT3B and PALM2 with DR.
Our study has some limitations. First, detailed clinical information, that is, the duration of diabetes, haemoglobin A1c, blood pressure, lipid profiles and medication, was not available for some participants. Since these are considered as important co-variables for DR risk, the results of our study are influenced by these factors. Second, the clinical information, including diagnosis of DR in this study, is based on existing medical records in multiple medical institutions. Diagnostic variation among ophthalmologists at individual hospitals might cause a minor misclassification bias. Third, there is an imbalance of sample sizes in Stage-2 analysis between case and control due to the limit of control sample availability, which might adversely affect to association analysis. Lastly, our control samples included patients with type 2 diabetes with diabetic nephropathy, which could decrease the study power in identifying shared genetic loci for DR and DN.
Although our main analysis, including all DR patients as cases, had an advantage to identify a certain portion of genetic variants for DR by increasing the statistical power, further phenotyping of DR in GWAS would be useful to identify variants specifically predisposing to the DR progression. Indeed, the sub-analysis in this study using patients with advanced stages of DR as cases identified additional candidate loci for DR; further phenotyping in GWAS, i.e. DR cases with shorter duration of diabetes versus no DR controls with longer diabetes duration (>20 years), DR cases under good glycaemic control versus no DR controls under poor glycaemic control, is considered useful if the enough clinical information would be available.
In conclusion, we have identified two novel SNP loci, STT3B and PALM2, and a novel gene, EHD3, that contribute to DR susceptibility. However, further replication studies are required to validate these associations.
Materials and Methods
Subjects
Discovery stage (Stage-1): Participants were selected from type 2 diabetes patients registered in the BioBank Japan Project (8). DR cases were defined as type 2 diabetes patients with any stage of DR: simple retinopathy, pre-proliferative retinopathy, proliferative retinopathy or post laser photocoagulation therapy, based on their medical records (set-1, n = 5415 set-2, n = 790). Controls were defined as type 2 diabetes patients with no signs of DR and a long duration of diabetes (set-1; ≥10 years, set-2; ≥ 5 years) or with diabetic nephropathy (set-1, n = 4676, set-2, n = 1779).
Validation stage (Stage-2): Participants were selected from type 2 diabetes patients registered in the BioBank Japan Project. There was no overlap sample between Stage-1 and Stage-2. The definition of DR cases was the same as that of Stage-1 (n = 2260). The control group consisted of individuals having type 2 diabetes with diabetic nephropathy but who did not show any signs of DR (n = 723).
Replication stage (Stage-3): Participants were type 2 diabetes patients who visited the outpatient clinics of Toyama University Hospital, Tokai University Hospital, Shiga University of Medical Science Hospital and IMSUT Hospital—The Institute of Medical Science, The University of Tokyo. The definition of DR cases was the same as that of Stage-1 and 2 (n = 750). The control group consisted of individuals with type 2 diabetes who did not show any signs of DR with a long duration of diabetes (≥10 years) or with diabetic nephropathy (n = 839).
Diabetes was diagnosed according to World Health Organization criteria. We excluded individuals who were positive for antibodies against glutamic acid decarboxylase (GAD) and those with diabetes due to liver dysfunction, steroids and other drugs that might raise glucose levels, malignancy or those with a monogenic disorder known to cause diabetes. The clinical characteristics of Stages 1, 2 and 3 participants are shown in Supplementary Material, Table S16. Genomic DNA was extracted from peripheral leukocytes using a standard procedure. All individuals provided written informed consent to participate in this study.
Ethics approval
The protocol of this study conformed to the provisions of the Declaration of Helsinki and was approved by the ethical committees of the RIKEN Yokohama Institute, Toyama University Hospital, Tokai University Hospital, Shiga University of Medical Science and the Institute of Medical Science, The University of Tokyo.
Genotyping and quality control during the discovery stage
Set-1 individuals in stage-1 were genotyped using the Human Omni Express Exome Bead Chip (Illumina, Inc, CA, USA). There were 561 845 autosomal SNPs that passed quality control, with a call rate ≥0.99, for the Hardy–Weinberg equilibrium test P ≥ 1 × 10−6 in controls, and minor allele frequency (MAF) ≥0.01. Set-2 individuals in stage-1 were genotyped using the Illumina Human 610 K SNP array (Illumina, Inc, CA, USA). There were 479 948 autosomal SNPs that passed the quality control and were used for further analysis. For sample quality control, we evaluated cryptic relatedness for each sample using an identity-by-state method and removed samples that exhibited second-degree or closer relatedness. We further performed principal component analysis (PCA) to select individuals within the major Japanese (Hondo) cluster, as reported previously (7,8,9,42). Samples for 8880 individuals (4839 DR cases and 4041 controls) in set-1 as well as 2217 individuals (693 DR cases and 1524 controls) in set-2 that passed the quality control were utilized in subsequent analyses. To evaluate the potential effect of population stratification, we used a quantile–quantile plot of the observed P-values (Supplementary Material, Fig. S2).
Imputation
Genotype imputation was performed using MACH and Minimac (17,18) with individuals from the 1000 Genomes Project (phased JPT, CHB and Han Chinese South (CHS) data n = 275, March 2012) as reference populations (19). We selected SNPs with MAF ≥0.01 and a Minimac software quality score (r2) ≥0.7. Individual genotype dosage data were utilized for association studies using mach2dat (17,18).
Genotyping and quality control in the Stage-2 and Stage-3 samples
We genotyped 4572 type 2 diabetes patients (Stage-2; 2260 DR and 723 controls, Stage-3; 750 DR and 839 controls) using a multiplex PCR-Invader assay, as described previously (7,8). Genotyping success rates <95% or concordance rates <99.7% were excluded from further evaluation.
In silico replication analyses in non-Japanese studies
We obtained follow-up analysis data from previously reported GWAS cohorts (European population n = 2464, and African American population n = 1495) (12). The case–control definition compared any DR to no DR (Early Treatment Diabetic Retinopathy Study (ETDRS) ≥ 14 vs. ETDRS < 14) (12).
We also obtained in silico association data from a Korean cohort (n = 14 305). The case (n = 1452) consisted of diabetic patients visiting retina clinics at Seoul National University, Seoul National University Bundang Hospital and Seoul Metropolitan Government Seoul National University Boramae Medical Center for fundus examination. Patients with any stage of DR were defined as DR cases. The controls (n = 12 853) were participants of a population cohort, the Korea Genome and Epidemiology Study (KoGES), who had type 2 diabetes. Both DR cases and controls were genotyped using the Korea Biobank Array (KBA v1.0 and v1.1). Imputation was done with Minimac with individuals from the 1000 Genomes Project (Phase 3) as reference populations. Imputed SNPs with high imputation quality (r2 ≥ 0.8), MAF ≥0.01 were applied and EPACTS was used for association analysis.
The study protocol was approved by the ethical committees of Seoul National University Hospital, Seoul National University Bundang Hospital and Seoul Metropolitan Government Seoul National University Boramae Medical Center.
Statistical analyses
The association between each SNP and DR was assessed using a logistic regression test with an additive model. We combined data from each GWAS and our replication analyses using an inverse variance method and examined heterogeneity with Cochran’s Q test using METAL (43). Regional association plots were generated using LocusZoom (44). We used the GAS power calculator for statistical power analysis (http://csg.sph.umich.edu/abecasis/gas_power_calculator/index.html).
Gene-based and gene-set association analyses
Gene-based and gene-set association analyses were conducted using the summary statistics of GWAS meta-analysis (Stage-1 set-1 and set-2), which consisted of association data of 5 786 456 imputed SNPs with high imputation quality (r2 ≥ 0.7). Linkage disequilibrium (LD) from the 1000G Phase 3 EAS to map the GWAS SNPs to 18 480 protein-coding genes (hg19 build) and to calculate gene-based P-values, using MAGMA v1.07 (23), as implemented in FUMA v1.3.6 (22). Window sizes were set between 3 kb upstream and 1 kb downstream of the genes. The statistical significance of the gene-based analysis was set at a Bonferroni-corrected P < 2.7 × 10−6 (0.05/18480). In the gene-set analysis, 15 485 gene sets (curated gene sets: 5497, GO terms: 9988) from MsigdB v7.0 were examined, therefore, statistical significance was set at P < 3.2 × 10−6 (0.05/15485).
Functional mapping and annotation
Functional annotation was conducted in FUMA v1.3.6 (22), using the GWAS meta-analysis data (Stage-1 set-1 and set-2) of 5 786 456 imputed SNPs with high imputation quality (r2 ≥ 0.7). We assigned functional annotations to SNPs with association P < 5.0 × 10−6 and all variants with r2 ≥ 0.6 and included expression quantitative trait locus (eQTL) mapping and 3D chromatin interaction mapping (Hi-C). All LD information was calculated from the 1000G Phase 3 EAS. We additionally assessed cis-eQTL effects by manually searching publicly available data as of January 2020: GTEx portal v8 (https://gtexportal.org/home/) (20) and EyeGEx data, available from GTEx portal external datasets (21).
Genetic correlation
We conducted bivariate LD score regression (24) to quantify genetic correlations between DR and other diseases such as type 2 diabetes (8), diabetic nephropathy (DN) (9,25,26) or various quantitative traits including anthropometric, metabolic, kidney-related, haematological and blood pressure traits (27,28). We utilized in-house data for DR, type 2 diabetes (8), Japanese DN (9) or publicly available GWAS summary statistics for European DN, Japanese BMI and Japanese clinical laboratory data (25–28). To maintain sufficient statistical power, we utilized GWAS results with a sample size of >10 000. For the regression, we used the population-specific LD score and summary statistics of high-quality common SNPs present in the HapMap 3 reference panel for each available trait or disease. We defined significant genetic correlations as those with P < 0.05.
Acknowledgements
The authors thank all participating doctors, staff and patients from the collaborating institutes for providing DNA samples. We also thank Mr Takashi Morizono, Dr Michiaki Kubo, RIKEN Centre for Integrative Medical Sciences for their support of data managements, and technical staffs of RIKEN Centre for Integrative Medical Sciences for performing de novo genotyping.
Korean in silico study was performed with bioresources from National Bank of Korea, the Centers for Disease Control and Prevention, Republic of Korea.
Conflicts of Interest statement. The authors have declared that no competing interests exist.
Funding
This work was partly supported by a grant from the Ministry of Education, Culture, Science and Technology, Japan (to S.M.), and from the Japan Agency of Medical Research and Development (17km0405202h0802) (T.K., T.Y. , S.M.). National Research Foundation of Korea grants funded by the Korea government (NRF-2012R1A1A2008943), and intramural grants from the Korea National Institute of Health (2018-NI011-02). (J.A., K.H.P., J.K., S.M.) and National Institutes of Health grant EY022302.
Contribution statement
Conception and Design: Shiro Maeda.
Data curation: Atsushi Takahashi.
Analysis—Genome wide SNP discovery: Minako Imamura, Atsushi Takahashi.
Analysis—Gene based, gene set analysis, functional annotation: Minako Imamura.
Analysis- Genetic correlation: Masatoshi Matsunami.
In silico replication analysis: Lucia Sobrin and Gayatri Susarla on behalf of the iDRAGON consortium, Jeeyun Ahn, Kyu Hyung Park, Jinhwa Kong, Sanghoon Moon.
Data acquisition: Minoru Iwata, Shin-ichi Araki, Masao Toyoda, Lucia Sobrin on behalf of the iDRAGON consortium, Jeeyun Ahn, Kyu Hyung Park, Jinhwa Kong, Sanghoon Moon Kazuyuki Tobe, Hiroshi Maegawa, Shiro Maeda.
Supervising and data interpretation: Momoko Horikoshi, Toshimasa Yamauchi, Takashi Kadowaki, Shiro Maeda.
Writing—original draft: Minako Imamura.
Writing—review & editing: Shiro Maeda.
All authors approved the final manuscript.
Shiro Maeda is the guarantor of the work and as such had full access to all the data and takes responsibility for the integrity of the data and the accuracy of the data analysis.
References
Author notes
Appendix international Diabetic Retinopathy and Genetics CONsortium (iDRAGON) members: Samuela Pollack, Robert P. Igo, Jr., Richard A. Jensen, Mark Christiansen, Xiaohui Li, Ching-Yu Cheng, Maggie C.Y. Ng, Albert V. Smith, Elizabeth J. Rossin, Ayellet V. Segrè, Yii-Der Ida Chen, Jane Z. Kuo, Latchezar M. Dimitrov, Lynn K. Stanwyck, Alan Penman, Ching J. Chen, Heather Hancock, Kent D. Taylor, Family Investigation of Nephropathy and Diabetes-Eye Research Group, Barry I. Freedman, Paul Mitchell, Jamie E. Craig, Emily Y. Chew, Donald W. Bowden, Brian L. Yaspan, David Siscovick, Mary Frances Cotch, Jie Jin Wang, Kathryn P. Burdon, Tien Y. Wong, Barbara E. K. Klein, Ronald Klein, Jerome I. Rotter, Sudha K. Iyengar, Alkes Price, Lucia Sobrin