Abstract

Large meta-analyses of rheumatoid arthritis (RA) susceptibility in European (EUR) and East Asian (EAS) populations have identified >100 RA risk loci, but genome-wide studies of RA in African-Americans (AAs) are absent. To address this disparity, we performed an analysis of 916 AA RA patients and 1392 controls and aggregated our data with genotyping data from >100 000 EUR and Asian RA patients and controls. We identified two novel risk loci that appear to be specific to AAs: GPC5 and RBFOX1 (PAA < 5 × 10−9). Most RA risk loci are shared across different ethnicities, but among discordant loci, we observed strong enrichment of variants having large effect sizes. We found strong evidence of effect concordance for only 3 of the 21 largest effect index variants in EURs. We used the trans-ethnic fine-mapping algorithm PAINTOR3 to prioritize risk variants in >90 RA risk loci. Addition of AA data to those of EUR and EAS descent enabled identification of seven novel high-confidence candidate pathogenic variants (defined by posterior probability > 0.8). In summary, our trans-ethnic analyses are the first to include AAs, identified several new RA risk loci and point to candidate pathogenic variants that may underlie this common autoimmune disease. These findings may lead to better ways to diagnose or stratify treatment approaches in RA.

Introduction

Rheumatoid arthritis (RA) affects 0.5–1% of populations worldwide (1) and has both environmental and genetic influences (2). The genetic basis of RA has been explored extensively among persons of European (EUR) and of East Asian (EAS) ancestry (3,4) but is much less well studied in populations of African (AFR) and African-American (AA) ancestry. Our group has previously shown significant overlap between risk loci for RA in AAs (5,6), but genome-wide association studies (GWASs) systematically interrogating genetic influences on RA in AAs are lacking.

There is growing evidence of important differences in RA heritability among various populations across the globe. HLA-DRB1 is the strongest genetic risk factor for RA in all racial/ethnic groups analyzed (7,8), in particular among patients with RA-associated serum anti-citrullinated peptide/protein antibodies (ACPA) (9). Nevertheless, the HLA (Human leukocyte antigen) alleles (10) and amino acid residues (11) implicated differ between ethnicities (12). For instance, in EUR ancestry RA, a valine residue at amino acid position 11 of HLA-DRB1 confers risk (11). In AAs, an aspartic acid at position 11 is strongly associated with RA, which is not observed in EUR ancestry RA (13). Furthermore, residues at positions 71 and 74 appear to account for more RA risk in EURs than in AAs. Thus, even when a risk locus is shared across populations, genetic variants underlying the association may differ by ethnicity, supporting the need for more in-depth analysis of RA risk determinants in AAs.

Recent meta- and mega-analyses suggest multiple differences in risk loci outside the HLA region as well (14–16). Okada et al. (3) conducted a large trans-ethnic analysis of RA in populations of EUR and Asian ancestry and identified several RA risk loci that are not common to both populations. Of note, in this analysis only 2 of the 11 largest effect non-HLA RA risk loci found in EURs (TNFAIP3 and NFKBIE) have a concordant effect in Asians. The others either did not replicate or have minor allele frequency (MAF) too low to draw meaningful conclusions (3). For example, rs2476601 in PTPN22 is the strongest non-HLA risk variant for RA in EURs (17), but this variant is very rare in AFR and Asian populations, and variants in the PTPN22 locus have not been shown to confer genetic risk of RA in these populations (5,17). Other studies provide additional examples of non-major histocompatibility complex risk loci identified in EUR or Asian RA populations that differ in AAs and black AFRs (5,18,9). Thus, a major goal of the present study is to quantify similarities and differences among RA susceptibility loci in AAs compared with other global populations.

Another major gap in the study of RA genetic risk is to identify variants driving the biology of the risk locus (pathogenic variants) from those merely associated with the disease but not causing the pathobiology of the disease. Addressing this issue could help expedite focused mechanistic studies in RA, which could lead to more precise diagnosis and patient stratification for prognosis and optimal targeted therapies for individual patients (20). Many approaches to identify pathogenic variants have been used, such as the CAusal Variants Identification in Associated Regions Bayes Factor (CAVIARBF) (21) and Probabilistic Annotation INtegraTOR (PAINTOR) algorithms (22,23), which integrate linkage disequilibrium (LD) and association patterns in disease risk loci to assign each associated single nucleotide polymorphism (SNP) a posterior probability (PPOST) of being a pathogenic variant. When genotyping data from multiple global populations are available, integrating them in trans-ethnic fine-mapping (TEFM) studies can improve identification of pathogenic variants, in particular when AFR genotypes are included (15,16,24–26). Due to the paucity of available data in AAs, however, fine-mapping approaches have largely been limited to EUR and Asian ancestry individuals (22). Thus, the second goal of our study is to address this critical gap by integrating genetic studies of RA susceptibility from three ethnicities to better identify candidate pathogenic variants.

Our study represents the most comprehensive assessment of genetic risk for RA in AAs yet reported. Our analyses were performed in three stages (Fig. 1). In Phase 1, we conducted an RA GWAS on AAs using Omni 1M and 1S arrays (Illumina, San Diego, CA) and jointly analyzed these results with genotyping data from the Omni 5M array on an additional set of AA RA and controls. In Phase 2, we merged our AA RA data with EUR and Asian data from Okada et al. (3) to conduct a large trans-ethnic meta-analysis (TEMA) using METASOFT (27,28). In Phase 3, we used the results from the TEMA to conduct TEFM of RA risk loci and identify candidate pathogenic variants.

Our analyses provide several new insights into the genetic basis of RA. We identify two novel associations in AAs with RA. We also find many risk loci whose association may be limited to one or two ethnicities. These variants are typically discordant because of low MAF in one or more populations, not because of differing effect size of a variant common in all populations. We found several likely pathogenic variants with a posterior probability (PPOST > 0.8) not previously thought to be pathogenic, five of which have also been identified as potentially influencing efficacy and toxicity of drugs used to treat RA. These findings improve our understanding of the biology of RA-associated genetic variants and may ultimately lead to better ways to diagnose and treat patients with RA.

Consort diagram showing the phases and flow of the study. Phase 1 is a two-part GWAS of RA in AAs, which was jointly analyzed in a fixed-effects meta-analysis. Phase 2 is a TEMA, in which we combine the association summary statistics from our AA data (Phase 1) with those provided by Okada et al. (3) to conduct a random effects meta-analysis using METASOFT. In addition to computing association P-values, we generated M-values (see Materials and Methods). Phase 3 uses the TEMA data along with publicly available data to perform TEFM. We then prioritized candidate pathogenics variants by posterior probability and constructed credible sets for RA risk loci.
Figure 1

Consort diagram showing the phases and flow of the study. Phase 1 is a two-part GWAS of RA in AAs, which was jointly analyzed in a fixed-effects meta-analysis. Phase 2 is a TEMA, in which we combine the association summary statistics from our AA data (Phase 1) with those provided by Okada et al. (3) to conduct a random effects meta-analysis using METASOFT. In addition to computing association P-values, we generated M-values (see Materials and Methods). Phase 3 uses the TEMA data along with publicly available data to perform TEFM. We then prioritized candidate pathogenics variants by posterior probability and constructed credible sets for RA risk loci.

(A) Manhattan plot P-values from association testing of AAs with RA using calculated from the random effects meta-analysis conducted in Phase 2. Only PAA values (association P-values generated using AA genotypes) are displayed in this plot. The x-axis indicates chromosome and position; the y-axis indicates –log10(p). The horizontal red line is drawn at 5.0 × 10−9 and is the threshold for genomewide significance; the blue line is for suggestive evidence of association and is drawn at 1.0 × 10−6. Figure 2B–D is a locus zoom plot for each of the non-HLA associations detected in AAs. (B) chr8:114 980 000–115 150 000 (nearest gene: CSMD3). (C) chr13:92 900 000–93 050 000 in an intronic region of GPC5. Red coloration indicates the variants are more strongly linked to rs9516053, while blue coloration indicates variants more strongly linked to rs9589512. (D) chr16: 5 538 689–5 638 S689 in an intronic region of RBFOX1. (E) The independence of the associations of the PADI2 and PADI4 loci is shown. The y-axis is –log10(combined P-value) from the TEMA of all three populations, and the x-axis is genomic position. Red coloration indicates the variants are more strongly linked to rs761426 (the index variant in PADI2) than to rs2301888 (the index variant in PADI4). For each of the two loci, darker shading indicates stronger LD. The blue line indicates genomic recombination rate; peaks of recombination may explain the close proximity of two independent associations.
Figure 2

(A) Manhattan plot P-values from association testing of AAs with RA using calculated from the random effects meta-analysis conducted in Phase 2. Only PAA values (association P-values generated using AA genotypes) are displayed in this plot. The x-axis indicates chromosome and position; the y-axis indicates –log10(p). The horizontal red line is drawn at 5.0 × 10−9 and is the threshold for genomewide significance; the blue line is for suggestive evidence of association and is drawn at 1.0 × 10−6. Figure 2B–D is a locus zoom plot for each of the non-HLA associations detected in AAs. (B) chr8:114 980 000–115 150 000 (nearest gene: CSMD3). (C) chr13:92 900 000–93 050 000 in an intronic region of GPC5. Red coloration indicates the variants are more strongly linked to rs9516053, while blue coloration indicates variants more strongly linked to rs9589512. (D) chr16: 5 538 689–5 638 S689 in an intronic region of RBFOX1. (E) The independence of the associations of the PADI2 and PADI4 loci is shown. The y-axis is –log10(combined P-value) from the TEMA of all three populations, and the x-axis is genomic position. Red coloration indicates the variants are more strongly linked to rs761426 (the index variant in PADI2) than to rs2301888 (the index variant in PADI4). For each of the two loci, darker shading indicates stronger LD. The blue line indicates genomic recombination rate; peaks of recombination may explain the close proximity of two independent associations.

Results

TEMA identifies novel genetic associations in AAs

Consistent with the results of genetic studies of RA in other populations (2–4), our TEMA found a strong association of the HLA-DRB1 region with RA in AAs. We also identified two novel genetic associations of RA among AAs as shown in Figure 2A. The index variants in these novel risk loci are rs9516053 in glypican-5 (GPC5) and rs4602043 in RBFOX1. A third strong association was found with rs2203098 in CSMD3, but LD support for this variant seems to be weak (see Fig. 2B–D; Table 1; Supplementary Material, Table S1). These associations were not found in previous studies of RA in EURs or Asians. To assess whether variant associations detected in one population replicate in the others, we calculate an M-value for each (denoted MAFR,MEAS and MEUR), which is a posterior probability that the effect exists in each study of a meta-analysis (27). SNPs that are predicted to have an effect have M > 0.8; SNPs predicted to not have an effect have M < 0.2; and ambiguous SNPs have an M-value between 0.2 and 0.8. The associations with GPC5, RBFOX1 and CSMD3 appear to be specific to AAs, as M-values in AAs were 1.0 for each of the three loci; for EURs and Asians, MEUR and MEAS were <0.2 for each locus.

We also identified a fourth association that appears to be present in all three ethnic groups (MAFR, MEUR and MEAS > 0.8). This association maps to a group of variants in intron 14 of PADI2, for which rs761426 was the most strongly associated SNP [odds ratio (OR) = 0.90; 95% confidence interval: 0.88–0.914; PTE = 2.48 × 10−10; see Figure 2E]. Okada et al. (3) did not identify the PADI2 locus in their initial meta-analysis, but a subsequent conditional analysis of the PADI4 locus identified an independent association signal at PADI2 (rs761426, adjusted P = 2.3 × 10−9) (29). The RA risk T allele of rs761426 has a cis-eQTL (expression quantitative trait locus) effect that increases PADI2 mRNA expression in whole blood (P = 4.6 × 10−12) (29). Our data are consistent with rs761426 being both the index variant and a functional candidate pathogenic variant in AAs and other populations. We performed a conditional analysis to confirm that the association signal in PADI2 is independent of the PADI4 locus in AAs. Conditioning the association on the lead variant in PADI4 (rs2301888; PTE = 5.89 × 10−19) left the association in PADI2 essentially unaltered, matching a prior report (29). Thus, the association of PADI2 with RA appears to be independent of the association of PADI4 in EUR, Asian and AA populations.

Assessment of validated risk loci in AAs with RA

We found evidence that 29 of 101 known validated risk loci for RA (3) are associated with RA in AAs (Table 2). Given the differences in the number of subjects in the ethnic groups included in our TEMA, we sought to determine whether the lack of an association of a known risk locus in AAs is due to insufficient statistical power to detect an actual effect (type II error) or the true absence of an effect (β = 0). We accomplished this by plotting association P-values for RA in EURs, EASs and in AAs (PEUR, PEAS and PAFR, respectively) against M-values (posterior probabilities that a genetic variant has an effect in a given population) for each index variant reported by Okada et al. (3) (Table 1). P–M plots (30) for all three populations are shown in Figure 3A–C. Of the 101 RA risk variants, 18 could not be assigned an M-value in AAs because of low allele frequency. Of the remaining 83 RA variants, 51 variants had intermediate MAA values (0.2–0.8), signifying that there was insufficient evidence to interpret the effect size as either lack of effect or lack of statistical power (Supplementary Material, Table S1). Importantly, however, 28 variants had MAA ≥ 0.8 (Table 2), suggesting that the effect size for AAs is similar to that in EURs and EASs (28). In comparison, the number of variants showing an effect in EASs similar to that in EURs (MEAS ≥ 0.8) is 59, but this difference likely reflects improved statistical power, not biological differences in the degree of shared risk.

Table 1

Novel associations with RA reaching genomewide significance in one or more global population

SNP IDChrPositionCandidate geneAllele 1 (A1)Allele 2 (A2)PEURPEASPAATransethnic P-value*MEURMEASMAFR
rs761426117413899PADI2AT1.23E-041.26E-062.56E-022.48E-100.99910.96
rs22030988115012597CSMD3GC6.87E-022.85E-016.54E-10*5.47E-08001
rs95160531392945884GPC5CT3.24E-018.06E-013.09E-09*3.28E-06001
rs4602043165588689RBFOX1CT3.31E-019.12E-024.07E-09*5.03E-0700.0091
SNP IDChrPositionCandidate geneAllele 1 (A1)Allele 2 (A2)PEURPEASPAATransethnic P-value*MEURMEASMAFR
rs761426117413899PADI2AT1.23E-041.26E-062.56E-022.48E-100.99910.96
rs22030988115012597CSMD3GC6.87E-022.85E-016.54E-10*5.47E-08001
rs95160531392945884GPC5CT3.24E-018.06E-013.09E-09*3.28E-06001
rs4602043165588689RBFOX1CT3.31E-019.12E-024.07E-09*5.03E-0700.0091

Abbreviations: Chr, chromosome; Pos, base pair location; PEUR, PEAS, PAA, the P-value for the variant calculated using only data from EURs, EASs and AAs, respectively. * These P-values are the association P-value for the variant in the overall TEMA as calculated by METASOFT using Han and Eskins’s Random Effects model, which is optimized to detect potentially heterogenous associations statistic. MEUR, MEAS and MAFR, the M-value for the variant calculated using only data from EURs, EAS and AAs, respectively.

Table 1

Novel associations with RA reaching genomewide significance in one or more global population

SNP IDChrPositionCandidate geneAllele 1 (A1)Allele 2 (A2)PEURPEASPAATransethnic P-value*MEURMEASMAFR
rs761426117413899PADI2AT1.23E-041.26E-062.56E-022.48E-100.99910.96
rs22030988115012597CSMD3GC6.87E-022.85E-016.54E-10*5.47E-08001
rs95160531392945884GPC5CT3.24E-018.06E-013.09E-09*3.28E-06001
rs4602043165588689RBFOX1CT3.31E-019.12E-024.07E-09*5.03E-0700.0091
SNP IDChrPositionCandidate geneAllele 1 (A1)Allele 2 (A2)PEURPEASPAATransethnic P-value*MEURMEASMAFR
rs761426117413899PADI2AT1.23E-041.26E-062.56E-022.48E-100.99910.96
rs22030988115012597CSMD3GC6.87E-022.85E-016.54E-10*5.47E-08001
rs95160531392945884GPC5CT3.24E-018.06E-013.09E-09*3.28E-06001
rs4602043165588689RBFOX1CT3.31E-019.12E-024.07E-09*5.03E-0700.0091

Abbreviations: Chr, chromosome; Pos, base pair location; PEUR, PEAS, PAA, the P-value for the variant calculated using only data from EURs, EASs and AAs, respectively. * These P-values are the association P-value for the variant in the overall TEMA as calculated by METASOFT using Han and Eskins’s Random Effects model, which is optimized to detect potentially heterogenous associations statistic. MEUR, MEAS and MAFR, the M-value for the variant calculated using only data from EURs, EAS and AAs, respectively.

Table 2

Replication of genetic variants previously associated with RA in persons of EUR/Asian ethnicity

SNP IDChrPositionCandidate geneA1A2PEURPEASPAATransethnic P-valueMEURMEASMAFR
rs9268839632428772HLA-DRB1AG<1E-2503.68E-1342.66E-07>1E-2501.0001.0001.000
rs30872432204738919CTLA4AG9.00E-202.16E-042.05E-039.94E-241.0000.9990.989
rs118893412191943742STAT4TC6.36E-126.27E-094.77E-035.30E-201.0001.0000.979
rs2736337811341880BLKTC1.61E-071.99E-062.59E-034.20E-131.0001.0000.976
rs96534422100825367AFF3TC3.51E-124.60E-041.43E-021.67E-151.0000.9990.964
rs761426117413899PADI2AT1.23E-041.26E-062.56E-022.48E-080.9991.0000.960
rs93788156426155IRF4GC1.59E-075.97E-051.55E-029.96E-121.0001.0000.956
rs9096852239747671SYNGR1AT3.10E-103.81E-063.33E-023.68E-141.0001.0000.954
rs597165451738031857IKZF3-CSF3TG2.02E-099.48E-051.36E-022.60E-131.0001.0000.950
rs2233424644233921NFKBIETC3.32E-089.34E-134.88E-021.57E-191.0001.0000.934
rs21053251173349725LOC100506023AC1.02E-087.56E-033.60E-029.93E-111.0000.9880.928
rs24512586159506600TAGAPTC6.42E-101.07E-012.25E-026.16E-111.0000.8970.921
rs7301352711128496952ETS1TC2.02E-061.24E-066.45E-023.14E-111.0001.0000.905
rs19804222204610396CD28TC6.08E-113.40E-021.11E-012.99E-121.0000.9570.903
rs96036161340368069COG6TC8.29E-111.01E-021.14E-012.27E-121.0000.9750.899
rs25614775102608924C5orf30AG5.26E-102.05E-018.86E-035.51E-101.0000.3410.887
rs23172301157674997FCRL3TG1.04E-053.14E-041.46E-011.12E-081.0000.9990.864
rs1077462412111833788SH2B3-PTPN11AG2.36E-072.09E-028.20E-081.0000.864
rs80329391538834033RASGRP1TC2.42E-123.07E-052.18E-013.49E-161.0001.0000.863
rs18770301737740161MED1TC1.46E-052.95E-042.17E-011.45E-081.0000.9990.862
rs10175798230449594LBHAG1.35E-079.78E-032.05E-013.35E-091.0000.9800.861
rs2664035448220839TECAG2.45E-066.03E-013.66E-024.07E-061.0000.2840.857
rs22366682145650009ICOSLG-AIRETC1.19E-052.60E-030.273847.99E-081.0000.9930.846
rs67325652111607832ACOXLAG8.77E-058.67E-032.59E-011.81E-061.0000.9820.841
rs93721206106667535ATG5TG1.18E-051.22E-032.34E-011.41E-071.0000.9920.838
rs81338432136738242RUNX1-LOC100506403AG6.00E-093.27E-021.76E-019.70E-101.0000.8970.835
rs706778106098949IL2RATC7.09E-122.90E-017.51E-022.12E-111.0000.0880.826
rs116050421172411664ARAP1AG1.36E-029.65E-042.79E-015.33E-050.9410.9930.821
rs64798001064036881RTKN2GC2.21E-036.93E-072.46E-018.10E-080.7811.0000.820
rs172643326138005515TNFAIP3AG6.85E-193.46E-011.26E-171.0000.186
rs11933540426120001C4orf52TC9.54E-176.40E-012.17E-151.0000.147
rs1079026811118729391CXCR5AG3.32E-153.17E-014.60E-011.23E-131.0000.0590.072
rs80268981569991417LOC145837AG2.35E-176.83E-034.48E-022.46E-171.0000.9730.013
SNP IDChrPositionCandidate geneA1A2PEURPEASPAATransethnic P-valueMEURMEASMAFR
rs9268839632428772HLA-DRB1AG<1E-2503.68E-1342.66E-07>1E-2501.0001.0001.000
rs30872432204738919CTLA4AG9.00E-202.16E-042.05E-039.94E-241.0000.9990.989
rs118893412191943742STAT4TC6.36E-126.27E-094.77E-035.30E-201.0001.0000.979
rs2736337811341880BLKTC1.61E-071.99E-062.59E-034.20E-131.0001.0000.976
rs96534422100825367AFF3TC3.51E-124.60E-041.43E-021.67E-151.0000.9990.964
rs761426117413899PADI2AT1.23E-041.26E-062.56E-022.48E-080.9991.0000.960
rs93788156426155IRF4GC1.59E-075.97E-051.55E-029.96E-121.0001.0000.956
rs9096852239747671SYNGR1AT3.10E-103.81E-063.33E-023.68E-141.0001.0000.954
rs597165451738031857IKZF3-CSF3TG2.02E-099.48E-051.36E-022.60E-131.0001.0000.950
rs2233424644233921NFKBIETC3.32E-089.34E-134.88E-021.57E-191.0001.0000.934
rs21053251173349725LOC100506023AC1.02E-087.56E-033.60E-029.93E-111.0000.9880.928
rs24512586159506600TAGAPTC6.42E-101.07E-012.25E-026.16E-111.0000.8970.921
rs7301352711128496952ETS1TC2.02E-061.24E-066.45E-023.14E-111.0001.0000.905
rs19804222204610396CD28TC6.08E-113.40E-021.11E-012.99E-121.0000.9570.903
rs96036161340368069COG6TC8.29E-111.01E-021.14E-012.27E-121.0000.9750.899
rs25614775102608924C5orf30AG5.26E-102.05E-018.86E-035.51E-101.0000.3410.887
rs23172301157674997FCRL3TG1.04E-053.14E-041.46E-011.12E-081.0000.9990.864
rs1077462412111833788SH2B3-PTPN11AG2.36E-072.09E-028.20E-081.0000.864
rs80329391538834033RASGRP1TC2.42E-123.07E-052.18E-013.49E-161.0001.0000.863
rs18770301737740161MED1TC1.46E-052.95E-042.17E-011.45E-081.0000.9990.862
rs10175798230449594LBHAG1.35E-079.78E-032.05E-013.35E-091.0000.9800.861
rs2664035448220839TECAG2.45E-066.03E-013.66E-024.07E-061.0000.2840.857
rs22366682145650009ICOSLG-AIRETC1.19E-052.60E-030.273847.99E-081.0000.9930.846
rs67325652111607832ACOXLAG8.77E-058.67E-032.59E-011.81E-061.0000.9820.841
rs93721206106667535ATG5TG1.18E-051.22E-032.34E-011.41E-071.0000.9920.838
rs81338432136738242RUNX1-LOC100506403AG6.00E-093.27E-021.76E-019.70E-101.0000.8970.835
rs706778106098949IL2RATC7.09E-122.90E-017.51E-022.12E-111.0000.0880.826
rs116050421172411664ARAP1AG1.36E-029.65E-042.79E-015.33E-050.9410.9930.821
rs64798001064036881RTKN2GC2.21E-036.93E-072.46E-018.10E-080.7811.0000.820
rs172643326138005515TNFAIP3AG6.85E-193.46E-011.26E-171.0000.186
rs11933540426120001C4orf52TC9.54E-176.40E-012.17E-151.0000.147
rs1079026811118729391CXCR5AG3.32E-153.17E-014.60E-011.23E-131.0000.0590.072
rs80268981569991417LOC145837AG2.35E-176.83E-034.48E-022.46E-171.0000.9730.013

Chr, chromosome; Pos, base pair location; PEUR, the P-value for the variant calculated using only data from EURs. PEAS, the P-value for the variant calculated using only data from Asians. PAA, the P-value for the variant calculated using only data from AAs. These P-values are the association P-value for the variant in the overall TEMA as calculated by METASOFT using Han and Eskins’s Random Effects model, which is optimized to detect potentially heterogenous associations statistic. MEAS, the M-value for the variant calculated using only data from EURs. MEUR, the P-value for the variant calculated using only data from EASs. MAFR, the P-value for the variant calculated using only data from AAs.

Table 2

Replication of genetic variants previously associated with RA in persons of EUR/Asian ethnicity

SNP IDChrPositionCandidate geneA1A2PEURPEASPAATransethnic P-valueMEURMEASMAFR
rs9268839632428772HLA-DRB1AG<1E-2503.68E-1342.66E-07>1E-2501.0001.0001.000
rs30872432204738919CTLA4AG9.00E-202.16E-042.05E-039.94E-241.0000.9990.989
rs118893412191943742STAT4TC6.36E-126.27E-094.77E-035.30E-201.0001.0000.979
rs2736337811341880BLKTC1.61E-071.99E-062.59E-034.20E-131.0001.0000.976
rs96534422100825367AFF3TC3.51E-124.60E-041.43E-021.67E-151.0000.9990.964
rs761426117413899PADI2AT1.23E-041.26E-062.56E-022.48E-080.9991.0000.960
rs93788156426155IRF4GC1.59E-075.97E-051.55E-029.96E-121.0001.0000.956
rs9096852239747671SYNGR1AT3.10E-103.81E-063.33E-023.68E-141.0001.0000.954
rs597165451738031857IKZF3-CSF3TG2.02E-099.48E-051.36E-022.60E-131.0001.0000.950
rs2233424644233921NFKBIETC3.32E-089.34E-134.88E-021.57E-191.0001.0000.934
rs21053251173349725LOC100506023AC1.02E-087.56E-033.60E-029.93E-111.0000.9880.928
rs24512586159506600TAGAPTC6.42E-101.07E-012.25E-026.16E-111.0000.8970.921
rs7301352711128496952ETS1TC2.02E-061.24E-066.45E-023.14E-111.0001.0000.905
rs19804222204610396CD28TC6.08E-113.40E-021.11E-012.99E-121.0000.9570.903
rs96036161340368069COG6TC8.29E-111.01E-021.14E-012.27E-121.0000.9750.899
rs25614775102608924C5orf30AG5.26E-102.05E-018.86E-035.51E-101.0000.3410.887
rs23172301157674997FCRL3TG1.04E-053.14E-041.46E-011.12E-081.0000.9990.864
rs1077462412111833788SH2B3-PTPN11AG2.36E-072.09E-028.20E-081.0000.864
rs80329391538834033RASGRP1TC2.42E-123.07E-052.18E-013.49E-161.0001.0000.863
rs18770301737740161MED1TC1.46E-052.95E-042.17E-011.45E-081.0000.9990.862
rs10175798230449594LBHAG1.35E-079.78E-032.05E-013.35E-091.0000.9800.861
rs2664035448220839TECAG2.45E-066.03E-013.66E-024.07E-061.0000.2840.857
rs22366682145650009ICOSLG-AIRETC1.19E-052.60E-030.273847.99E-081.0000.9930.846
rs67325652111607832ACOXLAG8.77E-058.67E-032.59E-011.81E-061.0000.9820.841
rs93721206106667535ATG5TG1.18E-051.22E-032.34E-011.41E-071.0000.9920.838
rs81338432136738242RUNX1-LOC100506403AG6.00E-093.27E-021.76E-019.70E-101.0000.8970.835
rs706778106098949IL2RATC7.09E-122.90E-017.51E-022.12E-111.0000.0880.826
rs116050421172411664ARAP1AG1.36E-029.65E-042.79E-015.33E-050.9410.9930.821
rs64798001064036881RTKN2GC2.21E-036.93E-072.46E-018.10E-080.7811.0000.820
rs172643326138005515TNFAIP3AG6.85E-193.46E-011.26E-171.0000.186
rs11933540426120001C4orf52TC9.54E-176.40E-012.17E-151.0000.147
rs1079026811118729391CXCR5AG3.32E-153.17E-014.60E-011.23E-131.0000.0590.072
rs80268981569991417LOC145837AG2.35E-176.83E-034.48E-022.46E-171.0000.9730.013
SNP IDChrPositionCandidate geneA1A2PEURPEASPAATransethnic P-valueMEURMEASMAFR
rs9268839632428772HLA-DRB1AG<1E-2503.68E-1342.66E-07>1E-2501.0001.0001.000
rs30872432204738919CTLA4AG9.00E-202.16E-042.05E-039.94E-241.0000.9990.989
rs118893412191943742STAT4TC6.36E-126.27E-094.77E-035.30E-201.0001.0000.979
rs2736337811341880BLKTC1.61E-071.99E-062.59E-034.20E-131.0001.0000.976
rs96534422100825367AFF3TC3.51E-124.60E-041.43E-021.67E-151.0000.9990.964
rs761426117413899PADI2AT1.23E-041.26E-062.56E-022.48E-080.9991.0000.960
rs93788156426155IRF4GC1.59E-075.97E-051.55E-029.96E-121.0001.0000.956
rs9096852239747671SYNGR1AT3.10E-103.81E-063.33E-023.68E-141.0001.0000.954
rs597165451738031857IKZF3-CSF3TG2.02E-099.48E-051.36E-022.60E-131.0001.0000.950
rs2233424644233921NFKBIETC3.32E-089.34E-134.88E-021.57E-191.0001.0000.934
rs21053251173349725LOC100506023AC1.02E-087.56E-033.60E-029.93E-111.0000.9880.928
rs24512586159506600TAGAPTC6.42E-101.07E-012.25E-026.16E-111.0000.8970.921
rs7301352711128496952ETS1TC2.02E-061.24E-066.45E-023.14E-111.0001.0000.905
rs19804222204610396CD28TC6.08E-113.40E-021.11E-012.99E-121.0000.9570.903
rs96036161340368069COG6TC8.29E-111.01E-021.14E-012.27E-121.0000.9750.899
rs25614775102608924C5orf30AG5.26E-102.05E-018.86E-035.51E-101.0000.3410.887
rs23172301157674997FCRL3TG1.04E-053.14E-041.46E-011.12E-081.0000.9990.864
rs1077462412111833788SH2B3-PTPN11AG2.36E-072.09E-028.20E-081.0000.864
rs80329391538834033RASGRP1TC2.42E-123.07E-052.18E-013.49E-161.0001.0000.863
rs18770301737740161MED1TC1.46E-052.95E-042.17E-011.45E-081.0000.9990.862
rs10175798230449594LBHAG1.35E-079.78E-032.05E-013.35E-091.0000.9800.861
rs2664035448220839TECAG2.45E-066.03E-013.66E-024.07E-061.0000.2840.857
rs22366682145650009ICOSLG-AIRETC1.19E-052.60E-030.273847.99E-081.0000.9930.846
rs67325652111607832ACOXLAG8.77E-058.67E-032.59E-011.81E-061.0000.9820.841
rs93721206106667535ATG5TG1.18E-051.22E-032.34E-011.41E-071.0000.9920.838
rs81338432136738242RUNX1-LOC100506403AG6.00E-093.27E-021.76E-019.70E-101.0000.8970.835
rs706778106098949IL2RATC7.09E-122.90E-017.51E-022.12E-111.0000.0880.826
rs116050421172411664ARAP1AG1.36E-029.65E-042.79E-015.33E-050.9410.9930.821
rs64798001064036881RTKN2GC2.21E-036.93E-072.46E-018.10E-080.7811.0000.820
rs172643326138005515TNFAIP3AG6.85E-193.46E-011.26E-171.0000.186
rs11933540426120001C4orf52TC9.54E-176.40E-012.17E-151.0000.147
rs1079026811118729391CXCR5AG3.32E-153.17E-014.60E-011.23E-131.0000.0590.072
rs80268981569991417LOC145837AG2.35E-176.83E-034.48E-022.46E-171.0000.9730.013

Chr, chromosome; Pos, base pair location; PEUR, the P-value for the variant calculated using only data from EURs. PEAS, the P-value for the variant calculated using only data from Asians. PAA, the P-value for the variant calculated using only data from AAs. These P-values are the association P-value for the variant in the overall TEMA as calculated by METASOFT using Han and Eskins’s Random Effects model, which is optimized to detect potentially heterogenous associations statistic. MEAS, the M-value for the variant calculated using only data from EURs. MEUR, the P-value for the variant calculated using only data from EASs. MAFR, the P-value for the variant calculated using only data from AAs.

(A–C) Plots of association P-values for RA in EURs, EASs and AAs (y-axis) and M-values (posterior probabilities that a genetic variant has an effect in a given population) (x-axis) in EURs (A) Asians, (B) and AAs (C). Larger values indicate increased likelihood the variant has an effect in the population plotted. Variants with M > 0.8 are colored blue and are considered to have evidence supporting effect replication in that population. Variants with M < 0.2 are colored red and are considered to have evidence against replication. Note that in (A) and (B) the HLA-DRB1 locus is shown at the top of the graph because it has P < 1 × 10−250 in EUR ancestry and < 1 × 10−133 in Asian ancestry (see Supplementary Material, Table 2). (D and E) Scatter plots of effect size of RA risk loci in EURs versus MAA (measure of evidence for association the same locus in AAs) (D) and EASs (E). EURs are used as the comparator because studies on this population have been larger than others to date. Loci are categorized as concordant (blue), discordant (red) or uncertain (gray). The color saturation is proportional to increasing MAF in EURs, while dot size is proportional to the MAF in the population analyzed. All ORs forced to be >1 by first taking the absolute value of beta then exponentiating. This is done in order to enable simple visual comparison of effect size.
Figure 3

(A–C) Plots of association P-values for RA in EURs, EASs and AAs (y-axis) and M-values (posterior probabilities that a genetic variant has an effect in a given population) (x-axis) in EURs (A) Asians, (B) and AAs (C). Larger values indicate increased likelihood the variant has an effect in the population plotted. Variants with M > 0.8 are colored blue and are considered to have evidence supporting effect replication in that population. Variants with M < 0.2 are colored red and are considered to have evidence against replication. Note that in (A) and (B) the HLA-DRB1 locus is shown at the top of the graph because it has P < 1 × 10−250 in EUR ancestry and < 1 × 10−133 in Asian ancestry (see Supplementary Material, Table 2). (D and E) Scatter plots of effect size of RA risk loci in EURs versus MAA (measure of evidence for association the same locus in AAs) (D) and EASs (E). EURs are used as the comparator because studies on this population have been larger than others to date. Loci are categorized as concordant (blue), discordant (red) or uncertain (gray). The color saturation is proportional to increasing MAF in EURs, while dot size is proportional to the MAF in the population analyzed. All ORs forced to be >1 by first taking the absolute value of beta then exponentiating. This is done in order to enable simple visual comparison of effect size.

Table 3

TEFM results for risk variants for RA susceptibility

SNP informationZ-statistics from GWAS of RA susceptibility in three global populationsUnannotated and (annotated) posterior probabilities from TEFMAnnotations and citations regarding variant
rsIDGeneEffect / alternate alleleEUR (3)EAS (3)AAEUR, Asian RA
(3,26)
AA, Asian & EUR RADescription; Reference numbers
rs2476601bPTPN22G / A−26.041.00 (1.00)1.00 (1.00)R620W (93)
rs7731626bANKRD55A / G−9.83−0.301.00 (1.00)1.00 (1.00)ANKRD55 eQTL (94)
rs147622113bILF3T / C−6.13−0.471.00 (1.00)1.00 (1.00)
rs34536443aTYK2C / G−8.120.99 (0.99)P1104A variant—
& rs74956615A / T−8.160.99 (0.99)Loss of kinase activity (95)
rs909685bSYNGR1A / T6.294.622.130.65 (0.84)0.94 (1.00)Disrupts PITX3 TF binding site (62)
rs1893592UBASH3AC / A−5.73−4.01−0.441.00 (1.00)1.00 (1.00)UBASH3A eQTL; (96)
rs72634030aC1QBPA / C3.504.140.290.99 (0.99)
rs67574266RELA / G7.48−0.321.850.21 (0.96)Canonical CTCF binding site in REL 5’UTR (63)
rs706778aIL2RAT / C6.861.062.40.94 (0.94)Linked to ASE of IL2RA; PFKFB3 expression (97)
rs10774624aSH2B3/
PTPN11
A / G−5.17−1.380.96 (0.94)
rs2233424 &NFKBIEA / G5.527.141.970.51 (0.51)Val194Ala (64). Alters
rs2233434aG / A5.527.131.970.48 (0.48)MTX uptake (65)
rs12715125bEOMESG / C−5.58−0.270.95 (0.99)0.79 (0.83)
rs3087243aCTLA4A / G−9.10−3.70−2.960.76 (0.80)CTLA4 CT60G (68)
rs13330176aIRF8A / T5.753.612.060.79 (0.80)
rs968567bFADS2T / C−4.950.29 (0.85)0.25 (0.65)AS-ELK1 binding to FADS2 promoter (98)
rs657075bIL3 / CSF2A / G2.644.450.73 (0.82)0.52 (0.52)ASCL6 eQTL (99)
rs71508903bARID5BT / C7.265.880.76 (0.93)0.51 (0.51)
rs187339910MMELA / G−5.22−4.181.00 (1.00)0.01 (0.01)
rs72767222ANK55RDA / C5.110.99 (0.99)0.00 (0.00)
rs12693993CD28G / A−2.74−1.76−0.960.68 (0.88)0.00 (0.00)
SNP informationZ-statistics from GWAS of RA susceptibility in three global populationsUnannotated and (annotated) posterior probabilities from TEFMAnnotations and citations regarding variant
rsIDGeneEffect / alternate alleleEUR (3)EAS (3)AAEUR, Asian RA
(3,26)
AA, Asian & EUR RADescription; Reference numbers
rs2476601bPTPN22G / A−26.041.00 (1.00)1.00 (1.00)R620W (93)
rs7731626bANKRD55A / G−9.83−0.301.00 (1.00)1.00 (1.00)ANKRD55 eQTL (94)
rs147622113bILF3T / C−6.13−0.471.00 (1.00)1.00 (1.00)
rs34536443aTYK2C / G−8.120.99 (0.99)P1104A variant—
& rs74956615A / T−8.160.99 (0.99)Loss of kinase activity (95)
rs909685bSYNGR1A / T6.294.622.130.65 (0.84)0.94 (1.00)Disrupts PITX3 TF binding site (62)
rs1893592UBASH3AC / A−5.73−4.01−0.441.00 (1.00)1.00 (1.00)UBASH3A eQTL; (96)
rs72634030aC1QBPA / C3.504.140.290.99 (0.99)
rs67574266RELA / G7.48−0.321.850.21 (0.96)Canonical CTCF binding site in REL 5’UTR (63)
rs706778aIL2RAT / C6.861.062.40.94 (0.94)Linked to ASE of IL2RA; PFKFB3 expression (97)
rs10774624aSH2B3/
PTPN11
A / G−5.17−1.380.96 (0.94)
rs2233424 &NFKBIEA / G5.527.141.970.51 (0.51)Val194Ala (64). Alters
rs2233434aG / A5.527.131.970.48 (0.48)MTX uptake (65)
rs12715125bEOMESG / C−5.58−0.270.95 (0.99)0.79 (0.83)
rs3087243aCTLA4A / G−9.10−3.70−2.960.76 (0.80)CTLA4 CT60G (68)
rs13330176aIRF8A / T5.753.612.060.79 (0.80)
rs968567bFADS2T / C−4.950.29 (0.85)0.25 (0.65)AS-ELK1 binding to FADS2 promoter (98)
rs657075bIL3 / CSF2A / G2.644.450.73 (0.82)0.52 (0.52)ASCL6 eQTL (99)
rs71508903bARID5BT / C7.265.880.76 (0.93)0.51 (0.51)
rs187339910MMELA / G−5.22−4.181.00 (1.00)0.01 (0.01)
rs72767222ANK55RDA / C5.110.99 (0.99)0.00 (0.00)
rs12693993CD28G / A−2.74−1.76−0.960.68 (0.88)0.00 (0.00)

aNovel candidate pathogenic variants identified in our study.

bCandidate pathogenic variants from Kichaev et al. (22) that were validated in our study.

Posterior probability estimates are given in the column ‘AA, EAS & EUR RA’. Those generated by Kichaev et al. are in the ‘EUR and EAS RA’ column. The zeros in the ‘AA, Asian & EUR RA’ column indicate the number of causal variants being modelled differed between studies.

Table 3

TEFM results for risk variants for RA susceptibility

SNP informationZ-statistics from GWAS of RA susceptibility in three global populationsUnannotated and (annotated) posterior probabilities from TEFMAnnotations and citations regarding variant
rsIDGeneEffect / alternate alleleEUR (3)EAS (3)AAEUR, Asian RA
(3,26)
AA, Asian & EUR RADescription; Reference numbers
rs2476601bPTPN22G / A−26.041.00 (1.00)1.00 (1.00)R620W (93)
rs7731626bANKRD55A / G−9.83−0.301.00 (1.00)1.00 (1.00)ANKRD55 eQTL (94)
rs147622113bILF3T / C−6.13−0.471.00 (1.00)1.00 (1.00)
rs34536443aTYK2C / G−8.120.99 (0.99)P1104A variant—
& rs74956615A / T−8.160.99 (0.99)Loss of kinase activity (95)
rs909685bSYNGR1A / T6.294.622.130.65 (0.84)0.94 (1.00)Disrupts PITX3 TF binding site (62)
rs1893592UBASH3AC / A−5.73−4.01−0.441.00 (1.00)1.00 (1.00)UBASH3A eQTL; (96)
rs72634030aC1QBPA / C3.504.140.290.99 (0.99)
rs67574266RELA / G7.48−0.321.850.21 (0.96)Canonical CTCF binding site in REL 5’UTR (63)
rs706778aIL2RAT / C6.861.062.40.94 (0.94)Linked to ASE of IL2RA; PFKFB3 expression (97)
rs10774624aSH2B3/
PTPN11
A / G−5.17−1.380.96 (0.94)
rs2233424 &NFKBIEA / G5.527.141.970.51 (0.51)Val194Ala (64). Alters
rs2233434aG / A5.527.131.970.48 (0.48)MTX uptake (65)
rs12715125bEOMESG / C−5.58−0.270.95 (0.99)0.79 (0.83)
rs3087243aCTLA4A / G−9.10−3.70−2.960.76 (0.80)CTLA4 CT60G (68)
rs13330176aIRF8A / T5.753.612.060.79 (0.80)
rs968567bFADS2T / C−4.950.29 (0.85)0.25 (0.65)AS-ELK1 binding to FADS2 promoter (98)
rs657075bIL3 / CSF2A / G2.644.450.73 (0.82)0.52 (0.52)ASCL6 eQTL (99)
rs71508903bARID5BT / C7.265.880.76 (0.93)0.51 (0.51)
rs187339910MMELA / G−5.22−4.181.00 (1.00)0.01 (0.01)
rs72767222ANK55RDA / C5.110.99 (0.99)0.00 (0.00)
rs12693993CD28G / A−2.74−1.76−0.960.68 (0.88)0.00 (0.00)
SNP informationZ-statistics from GWAS of RA susceptibility in three global populationsUnannotated and (annotated) posterior probabilities from TEFMAnnotations and citations regarding variant
rsIDGeneEffect / alternate alleleEUR (3)EAS (3)AAEUR, Asian RA
(3,26)
AA, Asian & EUR RADescription; Reference numbers
rs2476601bPTPN22G / A−26.041.00 (1.00)1.00 (1.00)R620W (93)
rs7731626bANKRD55A / G−9.83−0.301.00 (1.00)1.00 (1.00)ANKRD55 eQTL (94)
rs147622113bILF3T / C−6.13−0.471.00 (1.00)1.00 (1.00)
rs34536443aTYK2C / G−8.120.99 (0.99)P1104A variant—
& rs74956615A / T−8.160.99 (0.99)Loss of kinase activity (95)
rs909685bSYNGR1A / T6.294.622.130.65 (0.84)0.94 (1.00)Disrupts PITX3 TF binding site (62)
rs1893592UBASH3AC / A−5.73−4.01−0.441.00 (1.00)1.00 (1.00)UBASH3A eQTL; (96)
rs72634030aC1QBPA / C3.504.140.290.99 (0.99)
rs67574266RELA / G7.48−0.321.850.21 (0.96)Canonical CTCF binding site in REL 5’UTR (63)
rs706778aIL2RAT / C6.861.062.40.94 (0.94)Linked to ASE of IL2RA; PFKFB3 expression (97)
rs10774624aSH2B3/
PTPN11
A / G−5.17−1.380.96 (0.94)
rs2233424 &NFKBIEA / G5.527.141.970.51 (0.51)Val194Ala (64). Alters
rs2233434aG / A5.527.131.970.48 (0.48)MTX uptake (65)
rs12715125bEOMESG / C−5.58−0.270.95 (0.99)0.79 (0.83)
rs3087243aCTLA4A / G−9.10−3.70−2.960.76 (0.80)CTLA4 CT60G (68)
rs13330176aIRF8A / T5.753.612.060.79 (0.80)
rs968567bFADS2T / C−4.950.29 (0.85)0.25 (0.65)AS-ELK1 binding to FADS2 promoter (98)
rs657075bIL3 / CSF2A / G2.644.450.73 (0.82)0.52 (0.52)ASCL6 eQTL (99)
rs71508903bARID5BT / C7.265.880.76 (0.93)0.51 (0.51)
rs187339910MMELA / G−5.22−4.181.00 (1.00)0.01 (0.01)
rs72767222ANK55RDA / C5.110.99 (0.99)0.00 (0.00)
rs12693993CD28G / A−2.74−1.76−0.960.68 (0.88)0.00 (0.00)

aNovel candidate pathogenic variants identified in our study.

bCandidate pathogenic variants from Kichaev et al. (22) that were validated in our study.

Posterior probability estimates are given in the column ‘AA, EAS & EUR RA’. Those generated by Kichaev et al. are in the ‘EUR and EAS RA’ column. The zeros in the ‘AA, Asian & EUR RA’ column indicate the number of causal variants being modelled differed between studies.

There was evidence of effect discordance (MAA < 0.2) in four loci (Table 2) in AAs. For an additional 20 variants, no M-value was calculated because of very low MAF. In EASs, the results are similar: there are five variants with MEAS < 0.2 (in CD2, IFNGR2, CXCR5, IL2RA and GATA3; see Supplementary Material, Table S1) and no MEAS was calculated for 20 variants because of low MAF. Finally, in EURs, there was one variant for which no M-value was calculated because of low frequency, and there were four variants with MEUR < 0.2. We searched for characteristics common to the genetic variants that did not replicate (M < 0.2 in one or more populations or M-value not calculated). Interestingly, we noted that alleles that did not replicate were much more likely to have a large effect size (OR < 0.8 or OR > 1.25) and to be coding variants. When the index variants were sorted by effect size in the EUR population, only 3 (HLA-DRB1, NFKBIE and CD40) of the 21 variants with the largest effect size had both MAF > 0.05 and M >0.2 in at least one of the other populations (Supplementary Material, Table S1). These observations fit with well-known principles in population genetics and evolutionary theory that rare coding variants tend to be found at lower allele frequency because they exert stronger effects on fitness.

Overall, we found that variants with larger ORs in one population were likely to have lower MAF and decreased probability of replication (M < 0.2) in the remaining populations. Figure 3D plots the OR of each index variant in >100 risk loci in EUR populations against the M-value for AAs with RA. Among AAs, the following large effect loci in EURs (OREUR < 0.80 or OREUR > 1.25) are found at low risk allele frequency (RAF): PTPN22, TYK2, IL20RB, ATM, 10p14, DNASE1L4 and TNFAIP3. Several variants appear to have no effect in AAs (MAA < 0.2): TNFAIP3, CXCR5, LOC145837 and c4orf52. An analogous plot is presented for EAS populations in Figure 3E. In EASs, the following large effect risk loci in EURs are found at low RAF: PTPN22, ILF3, TYK2, IL20RB, ANKRD55, ATM, 10p14, DNASE1L4 and TNFAIP3. One large effect variant appears to lack a true effect in EASs (CXCR5) and may contribute to RA susceptibility only in EURs.

Association enrichment analysis

The index variant in a given locus is thought to be the actual disease-producing variant in the locus in only a small minority of loci (31). Thus, analyses based only on index variants could suffer from type II error. To account for this possibility, we performed an enrichment analysis using MAGENTA (32) to analyze the discordant loci across the entire gene locus. In the IL3/CSF2 locus, the index variant had low allele frequency and was not associated with RA in AAs, but both IL3 and CSF2 were enriched for genetic associations (P = 6.21 × 10−3 and P = 7.16 × 10−3). This indicates that IL3/CSF2 locus may in fact contribute to risk of RA in AAs but perhaps through different variants. For all other loci, neither analysis of the index variant (using M-values) nor analysis of the remainder of the locus provided evidence of an association.

Several factors unrelated to differing disease biology can result in effect size heterogeneity across populations in a TEMA. First, differences in statistical power directly influence variability of effect size estimates (33). Second, differences in LD structure across ethnicities can produce differences in effect size estimates in a GWAS risk locus, even when there are not differences in biology. This is relevant in that AFR populations are characterized by high genetic diversity and low levels of LD among loci, as compared with other populations (34). Third, the estimated effect of an allele from the initial study reporting the association is often exaggerated relative to the estimated effect in follow-up studies—a phenomenon known as the ‘winner’s curse’ (35). Differences in the biology between populations may also influence the effect of RA risk loci. PTPN22 is a known risk factor for RA in EUR ancestry patients, but the index SNP is virtually absent among AFR and Asian populations (17). Similarly, among AAs, focal segmental glomerulosclerosis and hypertension-attributed end-stage kidney disease are associated with two independent sequence variants in the APOL1 gene (36). These two APOL1 variants are common in AFR chromosomes but absent from EUR chromosomes (37). A third example is that a haplotype defined by three SNPs in the endothelial nitric oxide synthase gene confers susceptibility to hypertension in subjects of EUR descent but not in AAs (38).

TEFM and prioritizing candidate pathogenic variants

Our study leveraged the inclusion of genetic data from several populations to prioritize candidate pathogenic variants from among the many RA-associated variants known risk loci. As shown in Figure 1, we used the association summary statistics from the TEMA to conduct TEFM analyses. Excluding the risk loci in the HLA region, which have been extensively investigated, we analyzed 98 non-HLA RA risk loci.

Despite being a relatively small proportion of the total data set, addition of the AA data enabled identification of nine additional variants highly likely to be pathogenic (PPOST >0.8). These include rs3087243 in CTLA4, rs72634030 in C1QBP, rs34536443 in TYK2, rs706778 in IL2RA, rs10774624 in SH2B3/PTPN11, rs7902146 in ARID5B, rs2812378 in CCL19-CCL21, rs2233434 in NFKBIE and rs13330176 in IRF8 (Table 3). Consistent with Kichaev et al. (22), we validated that the following variants are likely to be pathogenic (PPOST >0.8): rs2476601 in PTPN22, rs7731626 in ANKRD55, rs147622113 in ILF3, rs909685 in SYNGR1, rs12715125 in EOMES, rs968567 in FADS2, rs657075 in IL3/CSF2 and rs71508953 in ARID5B (Table 3).

While no single variant with PPOST > 0.8 was identified in most loci, there were several interesting findings. For example, the 80% credible set (see Materials and Methods for definitions) in the IFNGR2 locus was comprised only of rs9974603 (PPOST = 0.59) and rs9975155 (PPOST = 0.24). Both of these variants lay >25 bp from a conserved transcription factor binding site (TFBS): rs9974603 is ~25 bp from the TFBS of the transcription factor E2F6, and rs9975155 is within the 5′UTR of IFNGR2 and is >10 bp from the TFBS of ZBTB7A. All the credible sets containing five variants or fewer are shown in Supplementary Material, Table S1.

Finally, we intersected all of the variants with PPOST > 0.05 with variants be associated with efficacy or side effects due to drugs used to treat RA as reported in the PharmGKB database (39) (see Materials and Methods). We detected five variants (in four RA risk loci) that have been reported to affect response or adverse events in four different drugs used to treat RA. These are the following: rs6920220 (PPOST = 0.17) in the TNFAIP3 locus, which has been associated with lack of sustained efficacy of methotrexate (MTX) (40); rs6927172 (PPOST = 0.16) in the TNFAIP3 locus, which has been associated with efficacy of tumor necrosis factor (TNF) inhibitors in inflammatory bowel disease (41); rs9373839 (PPOST = 0.08) in the ATG5 locus, which has also been associated with efficacy of a TNF inhibitor in inflammatory bowel disease (42); rs3087243 (PPOST = 0.76) in the CTLA4 locus, which has been associated with toxicity to TNF inhibitors in psoriasis (43); and rs7574865 (PPOST = 0.06) in STAT4, which has been associated with efficacy of etanercept in RA (44).

Discussion

Novel associations with RA susceptibility in AAs

Our detailed analyses of AAs have identified several new RA risk loci (P-value attaining our P-value threshold of 5 × 10−9): CSMD3, GPC5, RBFOX1 and PADI2. As shown in Figure 2, there is stronger LD support for GPC5, RBFOX1 and PADI2 than for CSMD3. CSMD3 is a 73 exon gene stretching ~1.2 Mb across 8p23. Interestingly, variants near its homolog CSMD2 were also suggestively associated with RA in our analysis (rs55798295, P = 2.84 × 10−7). This family of molecules (CSMD1, CSMD2 and CSMD3) appears to be involved in complement-mediated synapse pruning in the central nervous system. CSMD3 is associated with immune phenotypes such as influenza infection and asthma (45,46). We speculate that CSMD3 might contribute to RA by decrease inhibition of complement activation, but the remarkable pleiotropy of these genes suggests multiple other explanations. RBFOX1 is a regulator of alternative splicing of mRNA that has been extensively implicated in neural phenotypes (45). However, there is also evidence that it influences many immune processes, including T cell receptor (TCR) and B cell receptor (BCR) signaling, leukocyte migration and differentiation (47). We also found association of RA with a variant near GPC5, a pleotropic gene associated with several immune phenotypes, including multiple sclerosis in AAs (45). Glypicans are components of proteoglycans that appear to influence the behavior of the extracellular matrix during development and cellular proliferation. Another glypican molecule, GPC3, is a well-studied oncogene that serves as a ligand to CTLA4 (CD152) on the surface of CD4+CD25 T-cells, thereby influencing cellular proliferation and TNF production (48).

GPC5 is also a tumor suppressor gene but appears to inhibit tumor growth by suppressing WNT/B-catenin signaling (49). Polymorphisms in GPC5 have been associated with multiple sclerosis in Norwegian (50), Spanish (51) and AA populations (52), as well as with response to IFN-β in MS (53). A trans-eQTL in the GPC5 locus appears to downregulate KIAA0101 that encodes Proliferating Cell Nuclear Antigen-associated factor. It has been suggested that increased proliferation and reduced apoptosis of CD4+ T cells in the synovial membrane in RA are also regulated by the expression of KIAA0101 (54). GPC5 is also known to modulate levels of blood proteins through interaction with PRKCQ (55), another RA risk gene that is highly expressed on T lymphocytes. In light of these observations, we speculate that variants in the GPC5 locus may alter CD4+ T-cell behavior, contributing to dysregulated extracellular matrix (ECM) growth in the RA synovium.

Our TEMA identified an association with PADI2 in EUR, Asian and AA populations (M-value for each population > 0.95). The related gene PAD4 is a known risk allele for RA, particularly among EASs (3) and a key enzyme in RA pathogenesis because of its role in citrullination and the generation of the ACPA response (56). PAD2, PAD3 and PAD4 have distinct specificities against cellular substrates, which has important implications regarding autoantigens in RA (57), as their expression may result in a greater number of citrullinated neoantigens and thereby increase autoantibody-mediated RA pathobiology.

Prior genetic and experimental data suggest the association of PADI2 with RA is independent of that of PADI4 (29,58). Consistent with this finding, our data suggest that PADI2 is an independent genetic risk factor for RA in AAs as well (Fig. 2E). We did not find strong evidence of an effect of rs2301888 in PADI4 on RA risk in AAs (PAA = 0.44; MAA = 0.781). Not only PAD4 but PAD2 protein level and activity are increased in synovial fluid of RA compared with osteoarthritis (59). Likewise, a recent study of TNF-induced arthritis in mice showed that PADI2 deficiency led to decreased numbers of plasma cells, decreased serum IgG levels and decreased clinical and pathological findings (60). In addition, citrullination has been reported to be nearly absent from the ankles of PADI2-deficient but not from those deficient in PADI4 (60).

There may be two distinct mechanisms by which PAD2 levels become elevated in RA synovial fluid. First, the T allele of rs761426 (the index variant in our study) appears to increase expression of PADI2 mRNA in blood (29). Second, suppression of PADI2 expression mediated by microRNA miR-4728-5p (which targets PADI2) appears to be reduced in RA. Notably, not only PADI2 but also the locus containing miR-4728-5p has a genetic association with RA in EURs and EASs (3) and now in AAs (rs59716545; PTE = 2.60 × 10−13; MEUR = 1, MEAS = 1, MAFR = 0.95). Thus, our TEFM supports the conclusion that genetic variation in PADI2 and PADI4 loci and their regulators contribute to RA pathogenesis by increasing expression of these enzymes in three global populations.

Finally, our data show that, although risk loci are shared across different populations, the variants with the largest effect sizes are most likely to be specific to only one population (Fig. 3). This statement has implications for population-based models of RA risk. For example, Okada et al. (3) construct a trans-ethnic RA risk model that uses the RAF from one population but the OR from the other population. Based on this, they conclude 80% of genetic risk is shared. However, the proportion of phenotypic variance explained by a given SNP crucially depends not only on the effect size of a given SNP but on the RAF of that SNP as well (61). Our data suggest the largest effect risk loci also show the greatest interpopulation variance in RAF. Thus, the model used by Okada and many others may overestimate the variance that is shared across populations.

Candidate pathogenic variants

Kichaev et al. (22) reported a posterior probability of pathogenicity >0.8 for 12 of the 101 loci associated with RA analyzed by Okada et al. (3). We generated posterior probability estimates for the same 12 variants, which were largely concordant with prior results (Table 3). Addition of our modestly sized AA RA data set enabled identification of eight additional novel candidate pathogenic variants. Several of these variants are known to be functional polymorphisms. For instance, we obtained a posterior probability estimate of 1 for rs2476601, a well-known autoimmune risk variant and a posterior probability of 0.99 for rs34536443, which is a loss-of-function variant that alters catalytic ability of TYK2. We also identified the recently characterized variant rs909685 in SYNGR1. This SNP was recently shown to disrupt the binding site of PITX3, which imparts allele-specific expression of SYNGR1 (62). Of note, the variant we identified in the 5′UTR of REL also acts by disrupting a conserved residue in the CCCTC-binding factor (CTCF) canonical binding motif (63). Other examples are shown in Table 3.

Candidate pathogenic variants in two loci are of particular interest: NFKBIE (rs2233434-rs2233424, PPOST = 0.99) and CTLA4 (rs3087243, PPOST = 0.80). NFKBIE is the second largest effect non-HLA risk locus that has strong evidence of trans-ethnic support (PTE = 1.57 × 10−19; OREUR = 1.25, OREAS = 1.24, ORAA = 1.50; see Supplementary Material, Table S2 ). A recent study identified two functional nonsynonymous variants in NFKBIE: the G allele of rs2233434 (Val194Ala) and the C allele of rs2233433 (Pro175Leu). Both of these variants increase NF-κB activity upon stimulation of HEK293A cells with TNF, with rs2233433 having a stronger effect and demonstrated an expression imbalance in individuals heterozygous for rs2233434 (64). The same study showed dose-dependent inhibition of NFKBIE expression during exposure to vectors carrying the non-risk allele of rs2233434. Thus, functional data suggest that either of these variants could contribute to RA risk in EUR and Asian populations as they are in strong LD (r2 = 1). Our study helps shed light on which variant in the NFKBIE locus is more likely to be pathogenic. Despite the near absence of rs2233433 T allele in AFR populations, in our analysis rs2233434 (PPOST = 0.482) is >400 times more likely to be the pathogenic variant than rs2233433 (PPOST = 0.001).

Importantly, Imamura et al. (65) provided further support for the role of rs2233424 in RA. They analyzed both knockdown and overexpression of NFKBIE in human RA synovial cells with and without the Val194Ala genetic variant (rs2233434). They also show that MTX derivatives accumulate within the cells overexpressing the Val194Ala mutant allele compared with wild-type NFKBIE because of a decrease in SLC19A1 mRNA. Thus, this variant may affect both NF-κB signaling in RA susceptibility and MTX treatment response. This is an example in which analysis of samples from individuals of different ethnicities, including AAs, may inform precision medicine.

Our TEMA confirmed the well-known association of CTLA4 to RA susceptibility (Fig. 4A), and our TEFM provided additional insights (Fig. 4B). Specifically, our 90% credible set included rs3087243 (PPOST = 0.80) in the 3′UTR of CTLA-4 as well as rs11571302—an eQTL for CTLA4. rs3087243 lies within the binding site for several transcription factors including STAT3 (Fig. 4C) and is associated with lower mRNA levels of soluble CTLA-4 (66). While these findings provide suggestive evidence for how this variant may influence RA susceptibility, the precise mechanism is unclear. The rs3087243G (known as the CTLA4 CT60G allele) was recently identified in another fine-mapping study (67). That study did not detect differences in chromatin accessibility in CD4+ T cells nor differential binding to protein lysate based on rs3087243 allele status. Thus, while it is entirely possible rs3087243 is a pathogenic variant, it is also possible that a different variant underlies RA risk. If this is true, based on the assumptions of TEFM, such a variant is highly likely to be (1) linked to rs3087243, (2) not directly genotyped on our arrays and (3) not easily imputed. Importantly, there is a dinucleotide repeat lying within the 3′UTR of CTLA4 (Fig. 4C) (68) with a major allele of seven repeats (AT)7 and a variant allele with 28 dinucleotide repeats (AT)28. The (AT)28 variant is in strong linkage with the G allele of rs3087243, and transfection of (AT)28 allele decreases CTLA4 mRNA and protein levels in Jurkat cells (68). Because CTLA4 serves an inhibitor of T-cell co-stimulation (69), decreased RNA and protein levels would be expected to increase risk of autoimmune disease processes whose pathophysiology has T-cell component. Thus, we present a model in which either rs3087243, or (AT)28, or a haplotype containing both variants, produces RA risk by decreasing CTLA4 expression and protein levels.

In addition to identifying elongated (AT)n elements, we identified five other genetic variants in strong LD (r2 > 0.8) with rs3087243 in one or more ethnic populations. Because CTLA-4 can compete with CD28 for CD80/86 binding, reduction of its expression could lead to decrease the activation threshold of T cells. Finally, this variant is associated with paradoxical psoriasiform reactions to anti-TNFα drugs such as etanercept, which is used to treat RA (43). Therefore, it is possible that rs3087243 may tag a functional (AT)n repeat element that decreases transcription of CTLA4 that may thus predispose an individual to RA.

 (A) Scatter plots of the association (−log10P-value) of genetic variants in the CTLA4 locus (chr2:204 689 000–204 789 000) in AAs (left), EASs (middle) and EUR (right) to RA susceptibility. Below each scatter plot is a heatmap of LD between variants (r2) in each population. (B) Plot of the posterior probability of pathogenicity for each variant found within chr2:204 689 000–204 789 000. The two variants in red belong to the 90% credible set and are also shown in (C) The other variants are colored in blue. The region highlighted in the yellow panel represents an expanded view of the yellow vertical line in (C). (C) Gene diagram of the region containing the most likely candidate pathogenic variants. The top track is a partial gene diagram of CTLA4 showing Intron 3 and exon 4, the 3′UTR and intergenic region downstream. There is a dinucleotide repeat in the 3′UTR (marked (AT)28) that is strongly linked with rs3087243. The (AT)28 variant of this short tandem repeat is associated with decreased CTLA4 mRNA levels in autoreactive T-cell lines. The middle track shows the ENCODE TFBS. Variants rs3087243 and rs11571302 are the same as the two SNPs in the credible set shown in (B). The bottom track shows raw DNAse hypersensitivity signal in Tregs (blue), Th17 cells (brown), Th1 cells (yellow) and naïve T cells (purple).
Figure 4

 (A) Scatter plots of the association (−log10P-value) of genetic variants in the CTLA4 locus (chr2:204 689 000–204 789 000) in AAs (left), EASs (middle) and EUR (right) to RA susceptibility. Below each scatter plot is a heatmap of LD between variants (r2) in each population. (B) Plot of the posterior probability of pathogenicity for each variant found within chr2:204 689 000–204 789 000. The two variants in red belong to the 90% credible set and are also shown in (C) The other variants are colored in blue. The region highlighted in the yellow panel represents an expanded view of the yellow vertical line in (C). (C) Gene diagram of the region containing the most likely candidate pathogenic variants. The top track is a partial gene diagram of CTLA4 showing Intron 3 and exon 4, the 3′UTR and intergenic region downstream. There is a dinucleotide repeat in the 3′UTR (marked (AT)28) that is strongly linked with rs3087243. The (AT)28 variant of this short tandem repeat is associated with decreased CTLA4 mRNA levels in autoreactive T-cell lines. The middle track shows the ENCODE TFBS. Variants rs3087243 and rs11571302 are the same as the two SNPs in the credible set shown in (B). The bottom track shows raw DNAse hypersensitivity signal in Tregs (blue), Th17 cells (brown), Th1 cells (yellow) and naïve T cells (purple).

The results of our TEFM of the AFF3 locus are concordant with our previously published ImmunoChip analysis (6). Our current approach identified candidate variants near the 5′UTR of AFF3, many of which are eQTLs for AFF3. In particular, rs9653442 (PPOST = 0.33) and rs6712515 (PPOST = 0.25) were again found in the credible set (6).

Finally, TEFM identified two candidate variants in the IFNGR2 promoter. A recent study demonstrated that a large proportion of RA pathogenic variants lay within loci occupied by the Epstein Barr viral proteins EBNA2 and EBNA3C (70). This study suggested that autoimmune risk variants exert their effects by allele-dependent binding events in loci known to interact with EBNA proteins. EBNA3C promotes ZBTB7A and E2F6-mediated inhibition of E2F1, which promotes cell proliferation through several prominent RA risk genes involved in the cell cycle, including ATM, CDKN2 and CDKN4 (71). Because EBNA3C accentuates the effects of E2F6 and ZBTB7A, it seems possible that the C allele of rs9974603 and the T allele of rs9975155 serve to increase binding of these transcription factors to the IFNGR2 promoter, thereby increasing IFNGR2 expression in RA. We have recently shown that increased expression of IFNGR2 in peripheral blood mononuclear cells (PBMCs) is strongly associated with radiographic severity of RA in our Consortium for Longitudinal Evaluation of Early Arthritis Registry (CLEAR) AA cohort (72). Prior evidence suggests that increased expression of IFNGR2 in RA is most likely to occur in T cells (73), B cells or macrophages (74). Future studies may focus on testing whether TFs display allele-specific binding leading to differences in IFNGR2 expression with regard to RA or its phenotypes, such as radiographic erosions.

Three fine-mapping studies of RA, including ours, have obtained very different posterior probability estimates for variants in CD28 and ANKRD55. Kichaev et al. (22) reported rs7731626 and rs72767222 in ANKRD55 as candidate pathogenic variants, while Westra et al. (67) reported rs11377254 in ANKRD55 and rs117701653 in CD28. We found rs7736126 was a candidate variant (PPOST = 1.0) but did not confirm the others. The differences in findings could stem from the following: (1) different study designs (e.g. use of related traits in EURs only versus use of RA-only data in a trans-ethnic study); (2) differences in imputation quality, imputation method and additional quality control (QC) steps (see Supplementary Materials); (3) differences created by inclusion of rare variants, indels, etc.; and (4) differences in preparing samples for modeling more than one causal variant per locus. Given these discrepancies, we advise caution in interpretation of our fine-mapping results in the CD28 and ANKRD55 loci.

Our findings should be interpreted in light of limitations of our study. Although our study is the largest genetic association study of RA conducted in AAs, the sample size limits statistical power. Thus, we may miss associations of variants with low MAF and small effect sizes. In addition, a weakness of SNP genotyping studies is that indels, short tandem repeats and other variants are not evaluated as candidate pathogenic variants. If such a variant is in fact the pathogenic variant, the posterior probability of that variant can be misattributed to other variants (typically strongly linked variants). In addition, one of the associated variants in CSMD3 has weak LD support.

This study reports the largest GWAS in AAs with RA to date and is the first to accomplish a TEFM study of RA using three different ethnicities. Thus, this study is the most comprehensive genetic study of RA to date. Important findings include the identification of several novel RA loci. We present evidence that the PADI2 locus is associated with RA in EURs, Asians and AAs independent of the association with PADI4. Our findings also show that of the >100 known RA risk loci identified in EUR and Asian populations, 28 likely predispose to RA in AAs. Despite comprising only ~4% of the total data, inclusion of AA genotypes in a large TEFM analysis, identified eight new suspected candidate pathogenic variants. Finally, these findings may help identify biomarkers of RA treatment response. Overall, our study provides a strong rationale to include multiple populations in large analyses of RA to both to understand differences in risk architecture and to aid the goals of precision medicine for RA.

Materials and Methods

Study patients and controls

We genotyped a total of 916 cases of autoantibody-positive (ACPA-positive) AA patients with RA [from the CLEAR registry and the Veterans Affairs Rheumatoid Arthritis registry (VARA) registry] and 1370 AAs controls without rheumatic disease. Genotyped controls were from the CLEAR registry and the local Birmingham area. Out of study, previously genotyped controls were from SLEGEN, The International Consrtium on the Genetics of Systemic Lupus Erythematosus (see https://slegen.phs.wakehealth.edu/public/index.cfm). All RA cases satisfied the 1987 American Rheumatism Association (ARA) classification criteria (75). All subjects were self-declared AAs and >19 years of age. Genomic DNA from the CLEAR patients with RA and controls was isolated from peripheral blood using standard techniques as previously reported (8).

The CLEAR registry

The CLEAR registry enrolled AAs with RA in two phases. CLEAR 1 enrolled AAs of ≤2 years disease duration (from 2000 to 2006) and CLEAR 2 enrolled AAs with RA (not previously enrolled in CLEAR 1) of any disease duration (2006–2011). Participants were enrolled at the following academic sites: University of Alabama at Birmingham (Coordinating Center); Grady Hospital/Emory University, Atlanta, GA; University of North Carolina, Chapel Hill, NC; Washington University, St. Louis, MO; and Medical University of South Carolina, Charleston, SC. CLEAR controls were AAs without rheumatic disease who were age-, sex- and geographic location-matched (as a group) to the CLEAR RA patients. Controls were screened to exclude rheumatic disease using the validated Connective Tissue Disease Screening Questionnaire (76,77). Human subject protocols were approved by the institutional review boards of each of the participating institutions. Patient sociodemographics, medical history, medications, co-morbid conditions, disease activity measures and other variables were collected along with blood for extraction of DNA, serum, etc. and radiographs of the hands and feet for scoring of erosions and joint space narrowing (see below), making the CLEAR registry a valuable resource for the study of RA (8,78,79,72,80). ACPA status was confirmed on all CLEAR participants as previously reported (81). Characteristics of the CLEAR RA patients and controls are shown in Table 4. The genotyping arrays on which the all participants were assayed in this study is shown in Supplementary Material, Table S3.

Table 4

Baseline characteristics of CLEAR registry

CLEAR I
RA (n = 265)
CLEAR I
controls (n = 80)
CLEAR II
RA (n = 597)
CLEAR II
controls (n = 194)
Age at enrollment, years, mean (SD)51.0 (12.7)54.5 (13.1)56.0 (11.2)57.7 (7.61)
Age at RA onset, mean (SD)49.2 (12.6)45.4 (11.7)
Gender (female), %83.072.587.271.2
Disease duration at enrollment, months, median (IQ 25–75)12.1 (6.3–18.2)101 (36.5–196.0)
Family history of RA %30.938.7
Smoking (ever %)53.343.451.354.4
HAQ Score, median (IQ 25–75)1.38 (0.70–1.95)1.3 (0.75–1.82)
JAM Score, median (IQ 25–75)4.00 (0.00–12.75)2.30 (0.00–12.3)
Number of tender jointsa (of the 28 in the DAS28) (IQ 25–75)6.0 (1.0–15.0)3.8 (1.0–9.0)
Number of swollen jointsa (of the 28 in the DAS28) (IQ 25–75)3.0 (1.0–7.0)3.4 (1.0–9.0)
Rheumatoid factor, % positive94.815.282.719.8
Anti-CCP antibody, % positive97.93.874.32.8
Medications
DMARDs, ever used %85.694.8
MTX, ever used %78.293.9
Biologics, ever used %4.97.2
CLEAR I
RA (n = 265)
CLEAR I
controls (n = 80)
CLEAR II
RA (n = 597)
CLEAR II
controls (n = 194)
Age at enrollment, years, mean (SD)51.0 (12.7)54.5 (13.1)56.0 (11.2)57.7 (7.61)
Age at RA onset, mean (SD)49.2 (12.6)45.4 (11.7)
Gender (female), %83.072.587.271.2
Disease duration at enrollment, months, median (IQ 25–75)12.1 (6.3–18.2)101 (36.5–196.0)
Family history of RA %30.938.7
Smoking (ever %)53.343.451.354.4
HAQ Score, median (IQ 25–75)1.38 (0.70–1.95)1.3 (0.75–1.82)
JAM Score, median (IQ 25–75)4.00 (0.00–12.75)2.30 (0.00–12.3)
Number of tender jointsa (of the 28 in the DAS28) (IQ 25–75)6.0 (1.0–15.0)3.8 (1.0–9.0)
Number of swollen jointsa (of the 28 in the DAS28) (IQ 25–75)3.0 (1.0–7.0)3.4 (1.0–9.0)
Rheumatoid factor, % positive94.815.282.719.8
Anti-CCP antibody, % positive97.93.874.32.8
Medications
DMARDs, ever used %85.694.8
MTX, ever used %78.293.9
Biologics, ever used %4.97.2

aBased on the 28 joints used in the calculation of the DAS28.

AA patients with ACPA-positive RA and controls analyzed from the CLEAR registry. HAQ, Health Assessment Questionnaire; IQ 25–75, interquartile range; JAM, Joint Alignment and Motion score. DMARDs - disease modifying anti-rheumatic drugs. The CLEAR registry was started prior to the common use of DAS28 to quantify disease activity.

Table 4

Baseline characteristics of CLEAR registry

CLEAR I
RA (n = 265)
CLEAR I
controls (n = 80)
CLEAR II
RA (n = 597)
CLEAR II
controls (n = 194)
Age at enrollment, years, mean (SD)51.0 (12.7)54.5 (13.1)56.0 (11.2)57.7 (7.61)
Age at RA onset, mean (SD)49.2 (12.6)45.4 (11.7)
Gender (female), %83.072.587.271.2
Disease duration at enrollment, months, median (IQ 25–75)12.1 (6.3–18.2)101 (36.5–196.0)
Family history of RA %30.938.7
Smoking (ever %)53.343.451.354.4
HAQ Score, median (IQ 25–75)1.38 (0.70–1.95)1.3 (0.75–1.82)
JAM Score, median (IQ 25–75)4.00 (0.00–12.75)2.30 (0.00–12.3)
Number of tender jointsa (of the 28 in the DAS28) (IQ 25–75)6.0 (1.0–15.0)3.8 (1.0–9.0)
Number of swollen jointsa (of the 28 in the DAS28) (IQ 25–75)3.0 (1.0–7.0)3.4 (1.0–9.0)
Rheumatoid factor, % positive94.815.282.719.8
Anti-CCP antibody, % positive97.93.874.32.8
Medications
DMARDs, ever used %85.694.8
MTX, ever used %78.293.9
Biologics, ever used %4.97.2
CLEAR I
RA (n = 265)
CLEAR I
controls (n = 80)
CLEAR II
RA (n = 597)
CLEAR II
controls (n = 194)
Age at enrollment, years, mean (SD)51.0 (12.7)54.5 (13.1)56.0 (11.2)57.7 (7.61)
Age at RA onset, mean (SD)49.2 (12.6)45.4 (11.7)
Gender (female), %83.072.587.271.2
Disease duration at enrollment, months, median (IQ 25–75)12.1 (6.3–18.2)101 (36.5–196.0)
Family history of RA %30.938.7
Smoking (ever %)53.343.451.354.4
HAQ Score, median (IQ 25–75)1.38 (0.70–1.95)1.3 (0.75–1.82)
JAM Score, median (IQ 25–75)4.00 (0.00–12.75)2.30 (0.00–12.3)
Number of tender jointsa (of the 28 in the DAS28) (IQ 25–75)6.0 (1.0–15.0)3.8 (1.0–9.0)
Number of swollen jointsa (of the 28 in the DAS28) (IQ 25–75)3.0 (1.0–7.0)3.4 (1.0–9.0)
Rheumatoid factor, % positive94.815.282.719.8
Anti-CCP antibody, % positive97.93.874.32.8
Medications
DMARDs, ever used %85.694.8
MTX, ever used %78.293.9
Biologics, ever used %4.97.2

aBased on the 28 joints used in the calculation of the DAS28.

AA patients with ACPA-positive RA and controls analyzed from the CLEAR registry. HAQ, Health Assessment Questionnaire; IQ 25–75, interquartile range; JAM, Joint Alignment and Motion score. DMARDs - disease modifying anti-rheumatic drugs. The CLEAR registry was started prior to the common use of DAS28 to quantify disease activity.

The VARA registry

VARA is a prospective, observational, multicenter study that includes 12 VA medical centers (82). Fifty-five AAs with ACPA-positive RA from VARA were included. ACPA status was confirmed using a second-generation anti-cyclic citrullinated protein ELISA. Demographic and disease characteristics for AAs participating in VARA have previously been reported (83).

Birmingham controls

In addition to CLEAR controls, we genotyped healthy AAs from the Birmingham, Alabama area, as previously described (84). Genotype data from previously reported AA controls from the SLE Genetics consortium (SLEGEN) study were also included in the analysis (84) (see Supplementary Material, Table S1A).

Genotyping methods

Genotyping arrays

Illumina genotyping for all samples was conducted with standard Illumina Infinium protocols, and 500 ng of genomic DNA and clustering was performed with Illumina GenomeStudio software. CLEAR and VARA RA cases, CLEAR controls and Birmingham controls were genotyped on the Illumina Omni 1M array, the Omni 1S array or both (Supplementary Material, Table S3A). These samples were hybridized to Omni 1M (Omni 1M Quad v1.0_B) and Omni 1S (Omni 1S_8 v1_H) arrays.

We initially genotyped 683 subjects on Omni 1M and 1S arrays. This included 476 AA RA patients (421 from CLEAR and 55 from VARA; see Supplementary Material, Table S3A) and 207 AA controls (38 from CLEAR and 169 from the local Birmingham area; see Supplementary Material, Table S3B). We used previously existing Omni 1M and 1S genotyping data from 991 additional out-of-study AA controls from SLEGEN (23). In addition, we performed genotyping of an independent set of 634 AA subjects on the Omni 5M array (440 RA patients and 194 controls; Supplementary Material, Table S3).

Quality control and principal components analysis

We performed rigorous quality control on each array separately using the same criteria for all arrays. Samples were excluded for any of the following reasons: (i) a call rate <98.5% of the total number of SNPs on the chip; (ii) observed heterozygosity rate ± 3 standard deviation (SD) from the mean; (iii) outliers based on principal component analysis (PCA) were removed based on visual inspection of PCA plots; (iv) mismatches in sex designation as determined by genotype versus reported gender; and (v) identity by descent (IBD) >1875 between samples, in which case the sample with the lower call rate was excluded. Markers were excluded for any of the following reasons: (i) call rate > 98.5% on each chip separately; (ii) Hardy–Weinberg equilibrium P > 1 × 10−5 in control samples; (iii) MAF < 0.05. We used EIGENSTRAT (85) to estimate principal components. We ensured the selection of SNPs used to generate prinicipal component loadings did not introduce correlations between principal components and cohort membership, plate membership and other variables of interest. For the Omni 1M and 1S chips, principal components 1, 2, 4 and 6 were included as covariates because of mild correlation with phenotype status; for the Omni 5M chip, principal components 1, 2, 4 and 8 were included. We used data from HapMap3 Utah residents with Northern and Western EUR Ancestry (CEU), Yoruba in Ibadan, Nigeria (YRI) and Han Chinese and Japanese from Tokyo (CHB + JPT) to investigate the clustering of our study data. As expected, AA samples were admixed between the CEU and YRI samples (Supplementary Material, Fig. S1). To control for batch effect, we included batch ID as a covariate in association testing.

Imputation

Imputation was carried out using IMPUTE2 (86) with 1000 Genomes cosmopolitan samples. SNPs with high missingness across all studies were preferred and care was taken to ensure there were no large differences between case–control status and data missingness among SNPs used for imputation. The total number of SNPs after imputation was ~22 000 000. After QC based on imputation quality (Info > 0.5) and expected allele frequency (EAF > 0.05), 8 380 626 SNPs remained for analysis. About 0.6% of SNPs with EAF > 0.05 were excluded because of imputation quality.

Analysis of genotyping data

Joint analysis of Omni 1M, 1S and 5M data in a fixed effect model

We jointly analyzed all of our GWA data from the Illumina Omni 1M and 1S (phase I) and 5M (phase II) arrays. Logistic regression analysis was performed on both genotyped and imputed SNPs using SNPtest v2.5.1 (87) after including sex, cohort ID and principal components as covariates under additive and dominant genetic models. Setting an appropriate threshold of genome-wide significance has been the subject of substantial inquiry because of the difficulty of establishing an accurate family-wise error rate in the context of widespread LD. Most studies, particularly studies of EUR subjects, set this threshold at 5.0 × 10−8 when testing markers under an additive genetic model.

We used a conservative alpha level for two reasons. First, compared with EUR populations, AFR/AA populations have smaller average linkage blocks and weaker average LD. This has resulted in a recommendation of ~1.0 × 10−8 to 2.0 × 10−8 for the alpha threshold for genome-wide significance in AFR/AA populations (88,89). Second, we tested many of our variants under both additive and dominant genetic models but show results only from additive models. We applied an additional Bonferonni correction as a further conservative measure, which yielded alpha = 5.0 × 10−9. We considered SNPs with joint P < 1 × 10−6 suggestively associated, which is conservative compared with many recent reports (90). Several suggestive associations were found prior to incorporating the data from Okada et al. (3) (Supplementary Material, Fig. S2). QQ plots are given for the Omni 1M, 1S and 5M arrays with and without the HLA region included (Supplementary Material, Fig. S3; left and right panels, respectively). Because we expected that true positive associations would share the same direction of effect (granted the similar disease state and ethnic background) we used a fixed effects meta-analysis for this portion of the study.

Meta-analysis of RA GWASs in EURs, Asians and AAs

Joint analysis has been shown to be more well powered than replication analysis in GWASs (91). The summary statistics from the fixed-effect meta-analysis of RA in AAs in Phase I (916 RA and 1392 controls) were thus combined with summary statistics from EUR ancestry and Asian ancestry RA from Okada et al. (3). This data included 19 234 EUR RA and 61 654 EUR controls and 4873 EAS RA and 17 641 EAS controls. We then performed a random-effects TEMA using METASOFT (27,28). We selected a random effects framework because we expected heterogeneity of effect in some risk loci as previously reported in Asians and EURs (see Discussion). We computed P-values for all three populations together (PTE) but also for each population individually using the framework of Han and Eskin (27,28).

We relied on reference data sets to generate LD matrices for the EUR and Asian data because genotype data were not available for analysis. However, in several cases we detected a large d ifference between reported subpopulation-specific allele frequencies from Okada et al. (3) and the 1000 Genomes data. Variants with a difference in MAF > 0.1 were excluded from the analysis. We used a combination of MAF, strand checks and comparisons of LD and Z-statistics to unambiguously align most SNPs to the 1000 Genomes reference. A fraction of SNPs (in particular A/T and G/C SNPs with AF near 0.5) had to be dropped from the study.

In addition, we sought to employ a formal test of effect size in order to decide which RA risk loci from studies of other ethnic groups replicated in our study of AAs. To do this, we conducted a TEMA using GWA summary statistics from AAs (our analysis as described above) and summary statistics from persons of EUR ancestry and from Asian ancestry kindly provided by Okada and colleagues (3) as used in their RA meta-analysis. All data were combined into one data set and analyzed using METASOFT, an open-source meta-analysis tool (http://genetics.cs.ucla.edu/meta/) (27,28).

For each variant in this multi-ethnic data set, METASOFT calculated (a) association P-values for RA in EURs, EASs and in AAs (PEUR, PEAS and PAFR, respectively) and (b) M-values (posterior probabilities that a genetic variant has an effect in a given population). P- and M-values for all the index variants reported by Okada et al. (3) are shown in Table 1. M-values were not calculated for 18 of the index variants in AAs because of either low MAF or exclusion based on QC metrics as described above. We categorized results into three strata as used by Han and Eskin (27,28). These include (a) SNPs that are predicted to have an effect (M > 0.8); (b) SNPs that are predicted to not have an effect (M < 0.2); and (c) ambiguous SNPs (M-value between 0.2 and 0.8) in which it is unknown whether there is an effect. Figure 3 shows a P–M plot to aid with visualization of these results. For loci in which the index variant had an M-value between 0.2 and 0.8 or had very low allele frequency, we conducted an association enrichment analysis using MAGENTA (32). Specifically, we scanned whether the gene was enriched for associated variants. This was done in order to further assess whether the association was truly absent or merely discordant at the index variant alone.

TEFM to identify candidate pathogenic variants

We initially used both CAVIARBF and PAINTOR3 for the TEFM analyses. In general, we found the results produced by the PAINTOR3 algorithm to be more reliable than those computed using CAVIARBF (data not shown). This is likely because CAVIARBF does not yet integrate data from multiple populations into a single fine-mapping experiment, which is a key point given the strengths of our study. As a result, though we generated estimates using CAVIARBF, we relied on those from PAINTOR3. We conducted TEFM in 91 RA risk loci using the PAINTOR algorithm, which calculates a posterior probability that each variant in a risk locus is pathogenic. Results range from 0 indicating very unlikely to 1, indicating highly likely. This analysis included the following steps: step 1, alignment of reference and alternate alleles for all populations (Asian, EUR and AA) to match the designations found in the 1000 Genomes Project (23); step 2, generation of LD matrices for each population based on linkage in the 1000 Genomes Project; step 3, quantifying the importance of each genomic annotation to RA risk for each of 8138 genomic annotations in effect size estimates (gamma estimates) RA risk. The initial analysis was performed by running each of the 8138 genomic annotations in the 91 RA risk loci individually. A list of these annotations may be found in https://github.com/gkichaev/PAINTOR_V3.0/wiki/2b.-Overlapping-annotations (23); step 4, these effect size estimates were sorted in descending order, selecting those that were most informative (most likely to improve the model fit); step 5, we sequentially compared informative annotations, and if they were correlated, we removed the annotation with the smaller effect size estimate until only weakly correlated annotations remained (r2 < 0.10); and step 6, TEFM was performed using the five most informative uncorrelated annotations.

Within each locus, we summed sorted variants in descending order of posterior probability, until 90% of the probability mass was accounted for. This is known as a 90% credible set (92) of candidate pathogenic variants for each RA risk locus (Supplementary Material, Table S1). We compared our results to functionally validated SNPs to assess the degree of external validation. In most cases, the precise variant that gives rise to a GWAS association is unknown, but in some cases it can be reasonably assumed. For example, in the PTPN22 locus we obtained a posterior probability of 1.0 (almost certainly pathogenic) for rs2476601, which is generally accepted as the pathogenic variant in EURs with RA. We next established internal consistency by benchmarking our results on those reported by Kichaev et al. (22), who previously analyzed data on EURs and Asians with RA. Initially, our findings differed from those previously reported. We applied rigorous quality control measures to (1) increase the number of variants we were able to include and (2) improve the accuracy of our LD matrices. After QC, our data generally agreed strongly with the prior results. Finally, we wished to search in a systematic fashion for variants that (1) were found with posterior probability of pathogenicity >0.05 and (2) were found in a curated pharmacological database, such as the PharmGKB database (39). To accomplish this analysis, we downloaded the database of genetic variants from https://www.pharmgkb.org/ (date of access July 8, 2018) and intersected these against all variants with PPOST > 0.05.

Acknowledgements

We gratefully acknowledge the CLEAR Investigators Doyt L. Conn, MD, Beth L. Jonas, MD, Leigh F. Callahan, PhD, Edwin A. Smith, MD, Richard D. Brasington, Jr., MD and Larry W. Moreland, MD. The assistance of Stephanie Ledbetter, MS is greatly appreciated. We thank Dr Yukinori Okada for providing data and for helpful discussions. We appreciate the guidance and advice of Dr Robert Plenge and his willingness to provide samples.

Conflict of Interest statement. None declared.

Funding

National Institutes of Health (R01 AR057202 and 2P60 AR048095 to S.L.B., K01 AR060848 to R.J.R., K23 AR062100 to M.I.D., R37 AI024717 and 1U01 HG008666 to J.B.H., UL1 TR001417 and P01 AR49084 to R.P.K.); US Department of Veterans Affairs (to J.B.H.).

References

1.

Silman
,
A.J.
and
Pearson
,
J.E.
(
2002
)
Epidemiology and genetics of rheumatoid arthritis
.
Arthritis Res.
,
4
(
Suppl.) 3
,
S265
S272
.

2.

McInnes
,
I.B.
and
Schett
,
G.
(
2011
)
The pathogenesis of rheumatoid arthritis
.
N. Engl. J. Med.
,
365
,
2205
2219
.

3.

Okada
,
Y.
,
Wu
,
D.
,
Trynka
,
G.
,
Raj
,
T.
,
Terao
,
C.
,
Ikari
,
K.
,
Kochi
,
Y.
,
Ohmura
,
K.
,
Suzuki
,
A.
,
Yoshida
,
S.
et al. (
2014
)
Genetics of rheumatoid arthritis contributes to biology and drug discovery
.
Nature
,
506
,
376
381
.

4.

Viatte
,
S.
,
Plant
,
D.
and
Raychaudhuri
,
S.
(
2013
)
Genetics and epigenetics of rheumatoid arthritis
.
Nat. Rev. Rheumatol.
,
9
,
141
153
.

5.

Hughes
,
L.B.
,
Reynolds
,
R.J.
,
Brown
,
E.E.
,
Kelley
,
J.M.
,
Thomson
,
B.
,
Conn
,
D.L.
,
Jonas
,
B.L.
,
Westfall
,
A.O.
,
Padilla
,
M.A.
,
Callahan
,
L.F.
et al. (
2010
)
Most common single-nucleotide polymorphisms associated with rheumatoid arthritis in persons of European ancestry confer risk of rheumatoid arthritis in African Americans
.
Arthritis Rheum.
,
62
,
3547
3453
.

6.

Danila
,
M.I.
,
Laufer
,
V.A.
,
Reynolds
,
R.J.
,
Yan
,
Q.
,
Liu
,
N.
,
Gregersen
,
P.K.
,
Lee
,
A.
,
Kern
,
M.
,
Langefeld
,
C.D.
,
Arnett
,
D.K.
et al. (
2017
)
Dense genotyping of immune-related regions identifies loci for rheumatoid arthritis risk and damage in African Americans
.
Mol. Med.
,
23
,
177
187
.

7.

Stastny
,
P.
(
1978
)
Association of the B-cell alloantigen DRw4 with rheumatoid arthritis
.
N. Engl. J. Med.
,
298
,
869
871
.

8.

Hughes
,
L.B.
,
Laufer
,
V.A.
,
Reynolds
,
R.J.
,
Yan
,
Q.
,
Liu
,
N.
,
Gregersen
,
P.K.
,
Lee
,
A.
,
Kern
,
M.
,
Langefeld
,
C.D.
,
Arnett
,
D.K.
and
Bridges
,
S.L.
Jr
(
2008
)
The HLA-DRB1 shared epitope is associated with susceptibility to rheumatoid arthritis in African Americans through European genetic admixture
.
Arthritis Rheum.
,
58
,
349
358
.

9.

Ding
,
B.
,
Padyukov
,
L.
,
Lundstrom
,
E.
,
Seielstad
,
M.
,
Plenge
,
R.M.
,
Oksenberg
,
J.R.
,
Gregersen
,
P.K.
,
Alfredsson
,
L.
and
Klareskog
,
L.
(
2009
)
Different patterns of associations with anti-citrullinated protein antibody-positive and anti-citrullinated protein antibody-negative rheumatoid arthritis in the extended major histocompatibility complex region
.
Arthritis Rheum.
,
60
,
30
38
.

10.

Gregersen
,
P.K.
,
Silver
,
J.
and
Winchester
,
R.J.
(
1987
)
The shared epitope hypothesis. An approach to understanding the molecular genetics of susceptibility to rheumatoid arthritis
.
Arthritis Rheum.
,
30
,
1205
1213
.

11.

Raychaudhuri
,
S.
,
Sandor
,
C.
,
Stahl
,
E.A.
,
Freudenberg
,
J.
,
Lee
,
H.S.
,
Jia
,
X.
,
Alfredsson
,
L.
,
Padyukov
,
L.
,
Klareskog
,
L.
,
Worthington
,
J.
et al. (
2012
)
Five amino acids in three HLA proteins explain most of the association between MHC and seropositive rheumatoid arthritis
.
Nat. Genet.
,
44
,
291
296
.

12.

Viatte
,
S.
,
Plant
,
D.
,
Han
,
B.
,
Fu
,
B.
,
Yarwood
,
A.
,
Thomson
,
W.
,
Symmons
,
D.P.
,
Worthington
,
J.
,
Young
,
A.
,
Hyrich
,
K.L.
et al. (
2015
)
Association of HLA-DRB1 haplotypes with rheumatoid arthritis severity, mortality, and treatment response
.
JAMA
,
313
,
1645
1656
.

13.

Reynolds
,
R.J.
,
Ahmed
,
A.F.
,
Danila
,
M.I.
,
Hughes
,
L.B.
,
Gregersen
,
P.K.
,
Raychaudhuri
,
S.
,
Plenge
,
R.M.
and
Bridges
,
S.L.
Jr
(
2014
)
HLA-DRB1-associated rheumatoid arthritis risk at multiple levels in African Americans: hierarchical classification systems, amino acid positions, and residues
.
Arthritis Rheumatol.
,
66
,
3274
3282
.

14.

Richard-Miceli
,
C.
and
Criswell
,
L.A.
(
2012
)
Emerging patterns of genetic overlap across autoimmune disorders
.
Genome Med.
,
4
,
6
14
.

15.

Morris
,
A.P.
(
2011
)
Transethnic meta-analysis of genomewide association studies
.
Genet. Epidemiol.
,
35
,
809
822
.

16.

DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium
,
Asian Genetic Epidemiology Network Type 2 Diabetes (AGEN-T2D) Consortium
,
South Asian Type 2 Diabetes (SAT2D) Consortium
,
Mexican American Type 2 Diabetes (MAT2D) Consortium
,
Type 2 Diabetes Genetic Exploration by Nex-generation sequencing in muylti-Ethnic Samples (T2D-GENES) Consortium
,
Mahajan
,
A.
,
Go
,
M.J.
,
Zhang
,
W.
,
Below
,
J.E.
,
Gaulton
,
K.J.
et al. (
2014
)
Genome-wide trans-ancestry meta-analysis provides insight into the genetic architecture of type 2 diabetes susceptibility
.
Nat. Genet.
,
46
,
234
244
.

17.

Begovich
,
A.B.
,
Carlton
,
V.E.
,
Honigberg
,
L.A.
,
Schrodi
,
S.J.
,
Chokkalingam
,
A.P.
,
Alexander
,
H.C.
,
Ardlie
,
K.G.
,
Huang
,
Q.
,
Smith
,
A.M.
,
Spoerke
,
J.M.
et al. (
2004
)
A missense single-nucleotide polymorphism in a gene encoding a protein tyrosine phosphatase (PTPN22) is associated with rheumatoid arthritis
.
Am. J. Hum. Genet.
,
75
,
330
337
.

18.

Viatte
,
S.
,
Flynn
,
E.
,
Lunt
,
M.
,
Barnes
,
J.
,
Singwe-Ngandeu
,
M.
,
Bas
,
S.
,
Barton
,
A.
and
Gabay
,
C.
(
2012
)
Investigation of Caucasian rheumatoid arthritis susceptibility loci in African patients with the same disease
.
Arthritis Res. Ther.
,
14
,
R239
R245
.

19.

Govind
,
N.
,
Choudhury
,
A.
,
Hodkinson
,
B.
,
Ickinger
,
C.
,
Frost
,
J.
,
Lee
,
A.
,
Gregersen
,
P.K.
,
Reynolds
,
R.J.
,
Bridges
,
S.L.
Jr
,
Hazelhurst
,
S.
et al. (
2014
)
Immunochip identifies novel, and replicates known, genetic risk loci for rheumatoid arthritis in black South Africans
.
Mol. Med.
,
20
,
341
349
.

20.

Laufer
,
V.A.
,
Chen
,
J.Y.
,
Langefeld
,
C.D.
and
Bridges
,
S.L.
Jr
(
2017
)
Integrative approaches to understanding the pathogenic role of genetic variation in rheumatic diseases
.
Rheum. Dis. Clin. North Am.
,
43
,
449
466
.

21.

Chen
,
W.
,
Larrabee
,
B.R.
,
Ovsyannikova
,
I.G.
,
Kennedy
,
R.B.
,
Haralambieva
,
I.H.
,
Poland
,
G.A.
and
Schaid
,
D.J.
(
2015
)
Fine mapping causal variants with an approximate Bayesian method using marginal test statistics
.
Genetics
,
200
,
7197
7136
.

22.

Kichaev
,
G.
and
Pasaniuc
,
B.
(
2015
)
Leveraging functional-annotation data in trans-ethnic fine-mapping studies
.
Am. J. Hum. Genet.
,
97
,
260
271
.

23.

Kichaev
,
G.
,
Roytman
,
M.
,
Johnson
,
R.
,
Eskin
,
E.
,
Lindström
,
S.
,
Kraft
,
P.
and
Pasaniuc
,
B.
(
2017
)
Improved methods for multi-trait fine mapping of pleiotropic risk loci
.
Bioinformatics
,
33
,
248
255
.

24.

Ong
,
R.T.
,
Wang
,
X.
,
Liu
,
X.
and
Teo
,
Y.Y.
(
2012
)
Efficiency of trans-ethnic genome-wide meta-analysis and fine-mapping
.
Eur. J. Hum. Genet.
,
20
,
1300
1307
.

25.

International HapMap Consortium
(
2005
)
A haplotype map of the human genome
.
Nature
,
437
,
1299
1320
.

26.

Hinch
,
A.G.
,
Tandon
,
A.
,
Patterson
,
N.
,
Song
,
Y.
,
Rohland
,
N.
,
Palmer
,
C.D.
,
Chen
,
G.K.
,
Wang
,
K.
,
Buxbaum
,
S.G.
,
Akylbekova
,
E.L.
et al. (
2011
)
The landscape of recombination in African Americans
.
Nature
,
476
,
170
175
.

27.

Han
,
B.
and
Eskin
,
E.
(
2011
)
Random-effects model aimed at discovering associations in meta-analysis of genome-wide association studies
.
Am. J. Hum. Genet.
,
88
,
586
598
.

28.

Han
,
B.
and
Eskin
,
E.
(
2012
)
Interpreting meta-analyses of genome-wide association studies
.
PLoS Genet.
,
8
,
e1002555
.

29.

Okada
,
Y.
,
Muramatsu
,
T.
,
Suita
,
N.
,
Kanai
,
M.
,
Kawakami
,
E.
,
Iotchkova
,
V.
,
Soranzo
,
N.
,
Inazawa
,
J.
and
Tanaka
,
T.
(
2016
)
Significant impact of miRNA-target gene networks on genetics of human complex traits
.
Sci. Rep.
,
6
,
22223
.

30.

Kang
,
E.Y.
,
Park
,
Y.
,
Li
,
X.
,
Segrè
,
A.V.
,
Han
,
B.
and
Eskin
,
E.
(
2016
)
ForestPMPlot: a flexible tool for visualizing heterogeneity between studies in meta-analysis
.
G3 (Bethesda)
,
6
,
1793
1798
.

31.

Farh
,
K.K.
,
Marson
,
A.
,
Zhu
,
J.
,
Kleinewietfeld
,
M.
,
Housley
,
W.J.
,
Beik
,
S.
,
Shoresh
,
N.
,
Whitton
,
H.
,
Ryan
,
R.J.
,
Shishkin
,
A.A.
et al. (
2015
)
Genetic and epigenetic fine-mapping of causal autoimmune disease variants
.
Nature
,
518
,
337
343
.

32.

Goodarzi
,
M.O.
,
Guo
,
X.
,
Cui
,
J.
,
Jones
,
M.R.
,
Haritunians
,
T.
,
Xiang
,
A.H.
,
Chen
,
Y.D.
,
Taylor
,
K.D.
,
Buchanan
,
T.A.
,
Hsueh
,
W.A.
et al. (
2013
)
Systematic evaluation of validated type 2 diabetes and glycaemic trait loci for association with insulin clearance
.
Diabetologia
,
56
,
1282
1290
.

33.

Zanetti
,
D.
and
Weale
,
M.E.
(
2018
)
Transethnic differences in GWAS signals: a simulation study
.
Ann. Hum. Genet.
,
82
,
280
286
.

34.

Tishkoff
,
S.A.
,
Reed
,
F.A.
,
Friedlaender
,
F.R.
,
Ehret
,
C.
,
Ranciaro
,
A.
,
Froment
,
A.
,
Hirbo
,
J.B.
,
Awomoyi
,
A.A.
,
Bodo
,
J.M.
,
Doumbo
,
O.
et al. (
2009
)
The genetic structure and history of Africans and African Americans
.
Science
,
324
,
1035
1044
.

35.

Kraft
,
P.
(
2008
)
Curses—winner’s and otherwise—in genetic epidemiology
.
Epidemiology
,
19
,
649
651
.

36.

Genovese
,
G.
,
Friedman
,
D.J.
,
Ross
,
M.D.
,
Lecordier
,
L.
,
Uzureau
,
P.
,
Freedman
,
B.I.
,
Bowden
,
D.W.
,
Langefeld
,
C.D.
,
Oleksyk
,
T.K.
,
Uscinski Knob
,
A.L.
et al. (
2010
)
Association of trypanolytic ApoL1 variants with kidney disease in African Americans
.
Science
,
329
,
841
845
.

37.

Friedman
,
D.J.
and
Pollak
,
M.R.
(
2016
)
Apolipoprotein L1 and kidney disease in African Americans
.
Trends Endocrinol. Metab.
,
27
,
204
215
.

38.

Sandrim
,
V.C.
,
Coelho
,
E.B.
,
Nobre
,
F.
,
Arado
,
G.M.
,
Lanchote
,
V.L.
and
Tanus-Santos
,
J.E.
(
2006
)
Susceptible and protective eNOS haplotypes in hypertensive black and white subjects
.
Atherosclerosis
,
186
,
428
432
.

39.

Whirl-Carrillo
,
M.
,
McDonagh
,
E.M.
,
Hebert
,
J.M.
,
Gong
,
L.
,
Sangkuhl
,
K.
,
Thorn
,
C.F.
,
Altman
,
R.B.
and
Klein
,
T.E.
(
2012
)
Pharmacogenomics knowledge for personalized medicine
.
Clin. Pharmacol. Ther.
,
92
,
414
417
.

40.

Plant
,
D.
,
Farragher
,
T.
,
Flynn
,
E.
,
Martin
,
P.
,
Eyre
,
S.
,
Bunn
,
D.
,
Worthington
,
J.
,
Symmons
,
D.
,
Barton
,
A.
and
Thomson
,
W.
(
2012
)
A genetic marker at the OLIG3/TNFAIP3 locus associates with methotrexate continuation in early inflammatory polyarthritis: results from the Norfolk Arthritis Register
.
Pharmacogenomics J.
,
12
,
128
133
.

41.

Bank S.,

Andersen
,
P.S.
,
Burisch
,
J.
,
Pedersen
,
N.
,
Roug
,
S.
,
Galsgaard
,
J.
,
Turino
,
S.Y.
,
Brodersen
,
J.B.
,
Rashid
,
S.
,
Rasmussen
,
B.K.
et al. . (
2014
)
Associations between functional polymorphisms in the NFkappaB signaling pathway and response to anti-TNF treatment in Danish patients with inflammatory bowel disease
.
Pharmacogenomics J.
,
14
,
526
534
.

42.

Dezelak
,
M.
,
Repnik
,
K.
,
Koder
,
S.
,
Ferkolj
,
I.
and
Potočnik
,
U.
(
2016
)
A prospective pharmacogenomic study of Crohn’s disease patients during routine therapy with anti-TNF-alpha drug adalimumab: contribution of ATG5, NFKB1, and CRP genes to pharmacodynamic variability
.
OMICS
,
20
,
296
309
.

43.

Cabaleiro
,
T.
,
Prieto-Pérez
,
R.
,
Navarro
,
R.
,
Solano
,
G.
,
Román
,
M.
,
Ochoa
,
D.
,
Abad-Santos
,
F.
and
Daudén
,
E.
(
2016
)
Paradoxical psoriasiform reactions to anti-TNFalpha drugs are associated with genetic polymorphisms in patients with psoriasis
.
Pharmacogenomics J.
,
16
,
336
240
.

44.

Conigliaro
,
P.
,
Ciccacci
,
C.
,
Politi
,
C.
,
Triggianese
,
P.
,
Rufini
,
S.
,
Kroegler
,
B.
,
Perricone
,
C.
,
Latini
,
A.
,
Novelli
,
G.
,
Borgiani
,
P.
et al. (
2017
)
Polymorphisms in STAT4, PTPN2, PSORS1C1 and TRAF3IP2 genes are associated with the response to TNF inhibitors in patients with rheumatoid arthritis
.
PLoS One
,
12
,
e0169956
.

45.

Welter
,
D.
,
MacArthur
,
J.
,
Morales
,
J.
,
Burdett
,
T.
,
Hall
,
P.
,
Junkins
,
H.
,
Klemm
,
A.
,
Flicek
,
P.
,
Manolio
,
T.
,
Hindorff
,
L.
et al. . (
2014
)
The NHGRI GWAS Catalog, a curated resource of SNP-trait associations
.
Nucleic Acids Res.
,
42
(
Database issue
),
D1001
D1006
.

46.

Julià
,
A.
,
Ballina
,
J.
,
Cañete
,
J.D.
,
Balsa
,
A.
,
Tornero-Molina
,
J.
,
Naranjo
,
A.
,
Alperi-López
,
M.
,
Erra
,
A.
,
Pascual-Salcedo
,
D.
,
Barceló
,
P.
et al. (
2008
)
Genome-wide association study of rheumatoid arthritis in the Spanish population: KLF12 as a risk locus for rheumatoid arthritis susceptibility
.
Arthritis Rheum.
,
58
,
2275
2286
.

47.

Gorenshteyn
,
D.
,
Zaslavsky
,
E.
,
Fribourg
,
M.
,
Park
,
C.Y.
,
Wong
,
A.K.
,
Tadych
,
A.
,
Hartmann
,
B.M.
,
Albrecht
,
R.A.
,
García-Sastre
,
A.
,
Kleinstein
,
S.H.
et al. (
2015
)
Interactive big data resource to elucidate human immune pathways and diseases
.
Immunity
,
43
,
605
614
.

48.

Boswell
,
S.
,
Pathan
,
A.A.
,
Pereira
,
S.P.
,
Williams
,
R.
and
Behboudi
,
S.
(
2013
)
Induction of CD152 (CTLA-4) and LAP (TGF-beta1) in human Foxp3- CD4+ CD25- T cells modulates TLR-4 induced TNF-alpha production
.
Immunobiology
,
218
,
427
434
.

49.

Yuan
,
S.
,
Yu
,
Z.
,
Liu
,
Q.
,
Zhang
,
M.
,
Xiang
,
Y.
,
Wu
,
N.
,
Wu
,
L.
,
Hu
,
Z.
,
Xu
,
B.
,
Cai
,
T.
et al. (
2016
)
GPC5, a novel epigenetically silenced tumor suppressor, inhibits tumor growth by suppressing Wnt/beta-catenin signaling in lung adenocarcinoma
.
Oncogene
,
35
,
6120
6131
.

50.

Lorentzen
,
A.R.
,
Melum
,
E.
,
Ellinghaus
,
E.
,
Smestad
,
C.
,
Mero
,
I.L.
,
Aarseth
,
J.H.
,
Myhr
,
K.M.
,
Celius
,
E.G.
,
Lie
,
B.A.
,
Karlsen
,
T.H.
et al. (
2010
)
Association to the Glypican-5 gene in multiple sclerosis
.
J. Neuroimmunol.
,
226
,
194
197
.

51.

Cavanillas
,
M.L.
,
Fernández
,
O.
,
Comabella
,
M.
,
Alcina
,
A.
,
Fedetz
,
M.
,
Izquierdo
,
G.
,
Lucas
,
M.
,
Cénit
,
M.C.
,
Arroyo
,
R.
,
Vandenbroeck
,
K.
et al. (
2011
)
Replication of top markers of a genome-wide association study in multiple sclerosis in Spain
.
Genes Immun.
,
12
,
110
115
.

52.

Johnson
,
B.A.
,
Wang
,
J.
,
Taylor
,
E.M.
,
Caillier
,
S.J.
,
Herbert
,
J.
,
Khan
,
O.A.
,
Cross
,
A.H.
,
De Jager
,
P.L.
,
Gourraud
,
P.A.
,
Cree
,
B.C.
et al. (
2010
)
Multiple sclerosis susceptibility alleles in African Americans
.
Genes Immun.
,
11
,
343
350
.

53.

Cenit
,
M.D.
,
Blanco-Kelly
,
F.
, de las
Heras
,
V.
,
Bartolomé
,
M.
, de la
Concha
,
E.G.
,
Urcelay
,
E.
,
Arroyo
,
R.
and
Martínez
,
A
. (
2009
)
Glypican 5 is an interferon-beta response gene: a replication study
.
Mult. Scler.
,
15
,
913
917
.

54.

Aterido
,
A.
,
Palacio
,
C.
,
Marsal
,
S.
,
Avila
,
G.
and
Julià
,
A.
(
2014
)
Novel insights into the regulatory architecture of CD4+ T cells in rheumatoid arthritis
.
PLoS One
,
9
,
e100690
.

55.

Suhre
,
K.
,
Arnold
,
M.
,
Bhagwat
,
A.M.
,
Cotton
,
R.J.
,
Engelke
,
R.
,
Raffler
,
J.
,
Sarwath
,
H.
,
Thareja
,
G.
,
Wahl
,
A.
,
DeLisle
,
R.K.
et al. (
2017
)
Connecting genetic risk to disease end points through the human blood plasma proteome
.
Nat. Commun.
,
8
,
14357
.

56.

Firestein
,
G.
, and
McInnes
,
I.B.
(
2017
)
Immunopathogenesis of rheumatoid arthritis
.
Immunity
,
46
,
183
196
.

57.

Darrah
,
E.
,
Rosen
,
A.
,
Giles
,
J.T.
and
Andrade
,
F.
(
2012
)
Peptidylarginine deiminase 2, 3 and 4 have distinct specificities against cellular substrates: novel insights into autoantigen selection in rheumatoid arthritis
.
Ann. Rheum. Dis.
,
71
,
92
98
.

58.

Naranbhai
,
V.
,
Fairfax
,
B.P.
,
Makino
,
S.
,
Humburg
,
P.
,
Wong
,
D.
,
Ng
,
E.
,
Hill
,
A.V.
and
Knight
,
J.C.
(
2015
)
Genomic modulators of gene expression in human neutrophils
.
Nat. Commun.
,
6
,
7545
.

59.

Damgaard
,
D.
,
Senolt
,
L.
and
Nielsen
,
C.H.
(
2016
)
Increased levels of peptidylarginine deiminase 2 in synovial fluid from anti-CCP-positive rheumatoid arthritis patients: association with disease activity and inflammatory markers
.
Rheumatology (Oxford)
,
55
,
918
927
.

60.

Bawadekar
,
M.
,
Shim
,
D.
,
Johnson
,
C.J.
,
Warner
,
T.F.
,
Rebernick
,
R.
,
Damgaard
,
D.
,
Nielsen
,
C.H.
,
Pruijn
,
G.J.M.
,
Nett
,
J.E.
and
Shelef
,
M.A.
(
2017
)
Peptidylarginine deiminase 2 is required for tumor necrosis factor alpha-induced citrullination and arthritis, but not neutrophil extracellular trap formation
.
J. Autoimmun.
,
80
,
39
47
.

61.

Stahl
,
E.A.
,
Wegmann
,
D.
,
Trynka
,
G.
,
Gutierrez-Achury
,
J.
,
Do
,
R.
,
Voight
,
B.F.
,
Kraft
,
P.
,
Chen
,
R.
,
Kallberg
,
H.J.
,
Kurreeman
,
F.A.
et al. (
2012
)
Bayesian inference analyses of the polygenic architecture of rheumatoid arthritis
.
Nat. Genet.
,
44
,
483
489
.

62.

Cavalli
,
M.
,
Pan
,
G.
,
Nord
,
H.
,
Wallerman
,
O.
,
Wallén Arzt
,
E.
,
Berggren
,
O.
,
Elvers
,
I.
,
Eloranta
,
M.L.
,
Rönnblom
,
L.
,
Lindblad Toh
,
K.
et al. (
2016
)
Allele-specific transcription factor binding to common and rare variants associated with disease and gene expression
.
Hum. Genet.
,
135
,
485
497
.

63.

Wang
,
J.
,
Zhuang
,
J.
,
Iyer
,
S.
,
Lin
,
X.
,
Whitfield
,
T.W.
,
Greven
,
M.C.
,
Pierce
,
B.G.
,
Dong
,
X.
,
Kundaje
,
A.
,
Cheng
,
Y.
et al. (
2012
)
Sequence features and chromatin structure around the genomic regions bound by 119 human transcription factors
.
Genome Res.
,
22
,
1798
1812
.

64.

Myouzen
,
K.
,
Kochi
,
Y.
,
Okada
,
Y.
,
Terao
,
C.
,
Suzuki
,
A.
,
Ikari
,
K.
,
Tsunoda
,
T.
,
Takahashi
,
A.
,
Kubo
,
M.
,
Taniguchi
,
A.
et al. (
2012
)
Functional variants in NFKBIE and RTKN2 involved in activation of the NF-kappaB pathway are associated with rheumatoid arthritis in Japanese
.
PLoS Genet.
,
8
,
e1002949
e1002962
.

65.

Imamura
,
H.
,
Yoshina
,
S.
,
Ikari
,
K.
,
Miyazawa
,
K.
,
Momohara
,
S.
and
Mitani
,
S.
(
2016
)
Impaired NFKBIE gene function decreases cellular uptake of methotrexate by down-regulating SLC19A1 expression in a human rheumatoid arthritis cell line
.
Mod. Rheumatol.
,
26
,
507
516
.

66.

Ueda
,
H.
,
Howson
,
J.M.
,
Esposito
,
L.
,
Heward
,
J.
,
Snook
,
H.
,
Chamberlain
,
G.
,
Rainbow
,
D.B.
,
Hunter
,
K.M.
,
Smith
,
A.N.
,
Di Genova
,
G.
et al. (
2003
)
Association of the T-cell regulatory gene CTLA4 with susceptibility to autoimmune disease
.
Nature
,
423
,
506
511
.

67.

Westra
,
H.J.
,
Martínez-Bonet
,
M.
,
Onengut-Gumuscu
,
S.
,
Lee
,
A.
,
Luo
,
Y.
,
Teslovich
,
N.
,
Worthington
,
J.
,
Martin
,
J.
,
Huizinga
,
T.
,
Klareskog
,
L.
et al. (
2018
)
Fine-mapping and functional studies highlight potential causal variants for rheumatoid arthritis and type 1 diabetes
.
Nat. Genet.
,
50
,
1366
1374
.

68.

de
Jong
,
V.M.
,
Zaldumbide
,
A.
,
van der
Slik
,
A.R.
,
Laban
,
S.
,
Koeleman
,
B.P.
and
Roep
,
B.O.
(
2016
)
Variation in the CTLA4 3′UTR has phenotypic consequences for autoreactive T cells and associates with genetic risk for type 1 diabetes
.
Genes Immun.
,
17
,
75
78
.

69.

Chen
,
L.
and
Flies
,
D.B.
(
2013
)
Molecular mechanisms of T cell co-stimulation and co-inhibition
.
Nat. Rev. Immunol.
,
13
,
227
242
.

70.

Harley
,
J.B.
,
Chen
,
X.
,
Pujato
,
M.
,
Miller
,
D.
,
Maddox
,
A.
,
Forney
,
C.
,
Magnusen
,
A.F.
,
Lynch
,
A.
,
Chetal
,
K.
,
Yukawa
,
M.
et al. (
2018
)
Transcription factors operate across disease loci, with EBNA2 implicated in autoimmunity
.
Nat. Genet.
,
50
,
699
707
.

71.

Pei
,
Y.
(
2016
)
EBV nuclear antigen 3C mediates regulation of E2F6 to inhibit E2F1 transcription and promote cell proliferation
.
PLoS Pathog.
,
12
,
e1005844
e1005868
.

72.

Tang
,
Q.
,
Danila
,
M.I.
,
Cui
,
X.
,
Parks
,
L.
,
Baker
,
B.J.
,
Reynolds
,
R.J.
,
Raman
,
C.
,
Wanseck
,
K.C.
,
Redden
,
D.T.
,
Johnson
,
M.R.
et al. (
2015
)
Expression of interferon-γ receptor genes in PBMCs is associated with rheumatoid arthritis and its radiographic severity in African Americans
.
Arthritis Rheumatol.
,
67
,
1165
1170
.

73.

Regis
,
G.
,
Conti
,
L.
,
Boselli
,
D.
and
Novelli
,
F.
(
2006
)
IFNgammaR2 trafficking tunes IFNgamma-STAT1 signaling in T lymphocytes
.
Trends Immunol.
,
27
,
96
101
.

74.

Bach
,
E.A.
,
Aguet
,
M.
and
Schreiber
,
R.D.
(
1997
)
The IFN gamma receptor: a paradigm for cytokine receptor signaling
.
Annu. Rev. Immunol.
,
15
,
563
591
.

75.

Arnett
,
F.C.
,
Edworthy
,
S.M.
,
Bloch
,
D.A.
,
McShane
,
D.J.
,
Fries
,
J.F.
,
Cooper
,
N.S.
,
Healey
,
L.A.
,
Kaplan
,
S.R.
,
Liang
,
M.H.
,
Luthra
,
H.S.
et al. (
1988
)
The American Rheumatism Association 1987 revised criteria for the classification of rheumatoid arthritis
.
Arthritis Rheum.
,
31
,
315
324
.

76.

Karlson
,
E.W.
,
Sanchez-Guerrero
,
J.
,
Wright
,
E.A.
,
Lew
,
R.A.
,
Daltroy
,
L.H.
,
Katz
,
J.N.
and
Liang
,
M.H.
(
1995
)
A connective tissue disease screening questionnaire for population studies
.
Ann. Epidemiol.
,
5
,
297
302
.

77.

Karlson
,
E.W.
,
Costenbader
,
K.H.
,
McAlindon
,
T.E.
,
Massarotti
,
E.M.
,
Fitzgerald
,
L.M.
,
Jajoo
,
R.
,
Husni
,
E.
,
Wright
,
E.A.
,
Pankey
,
H.
and
Fraser
,
P.A.
(
2005
)
High sensitivity, specificity and predictive value of the connective tissue disease screening questionnaire among urban African-American women
.
Lupus
,
14
,
832
836
.

78.

Tan
,
W.
,
Wu
,
H.
,
Zhao
,
J.
,
Derber
,
L.A.
,
Lee
,
D.M.
,
Shadick
,
N.A.
,
Conn
,
D.L.
,
Smith
,
E.A.
,
Gersuk
,
V.H.
,
Nepom
,
G.T.
et al. (
2010
)
A functional RANKL polymorphism associated with younger age at onset of rheumatoid arthritis
.
Arthritis Rheum.
,
62
,
2864
2875
.

79.

Song
,
J.J.
,
Hwang
,
I.
,
Cho
,
K.H.
,
Garcia
,
M.A.
,
Kim
,
A.J.
,
Wang
,
T.H.
,
Lindstrom
,
T.M.
,
Lee
,
A.T.
,
Nishimura
,
T.
,
Zhao
,
L.
et al. (
2011
)
Plasma carboxypeptidase B downregulates inflammatory responses in autoimmune arthritis
.
J. Clin. Invest.
,
121
,
3517
3527
.

80.

Bridges
,
S.L.
, Jr,
Causey
,
Z.L.
,
Burgos
,
P.I.
,
Huynh
,
B.Q.
,
Hughes
,
L.B.
,
Danila
,
M.I.
,
van
Everdingen
,
A.
,
Ledbetter
,
S.
,
Conn
,
D.L.
,
Tamhane
,
A.
et al. . (
2010
)
Radiographic severity of rheumatoid arthritis in African Americans: results from a multicenter observational study
.
Arthritis Care Res. (Hoboken)
,
62
,
624
631
.

81.

Mikuls
,
T.R.
,
Holers
,
V.M.
,
Parrish
,
L.
,
Kuhn
,
K.A.
,
Conn
,
D.L.
,
Gilkeson
,
G.
,
Smith
,
E.A.
,
Kamen
,
D.L.
,
Jonas
,
B.L.
,
Callahan
,
L.F.
et al. (
2006
)
Anti-cyclic citrullinated peptide antibody and rheumatoid factor isotypes in African Americans with early rheumatoid arthritis
.
Arthritis Rheum.
,
54
,
3057
3059
.

82.

Mikuls
,
T.R.
,
Kerr
,
G.S.
, and
Cannon
,
G.W.
(
2015
)
Insights and implications of the VA Rheumatoid Arthritis Registry
.
Fed. Pract
,
32
,
24
29
.

83.

Mikuls
,
T.R.
,
Kazi
,
S.
,
Cipher
,
D.
,
Hooker
,
R.
,
Kerr
,
G.S.
,
Richards
,
J.S.
and
Cannon
,
G.W.
(
2007
)
The association of race and ethnicity with disease expression in male US veterans with rheumatoid arthritis
.
J. Rheumatol.
,
34
,
1480
1484
.

84.

Li
,
X.
,
Wu
,
J.
,
Ptacek
,
T.
,
Redden
,
D.T.
,
Brown
,
E.E.
,
Alarcón
,
G.S.
,
Ramsey-Goldman
,
R.
,
Petri
,
M.A.
,
Reveille
,
J.D.
,
Kaslow
,
R.A.
et al. (
2013
)
Allelic-dependent expression of an activating Fc receptor on B cells enhances humoral immune responses
.
Sci. Transl. Med.
,
5
,
216ra175
216ra186
.

85.

Price
,
A.L.
,
Patterson
,
N.J.
,
Plenge
,
R.M.
,
Weinblatt
,
M.E.
,
Shadick
,
N.A.
and
Reich
,
D.
(
2006
)
Principal components analysis corrects for stratification in genome-wide association studies
.
Nat. Genet.
,
38
,
904
909
.

86.

Howie
,
B.N.
,
Donnelly
,
P.
and
Marchini
,
J.
(
2009
)
A flexible and accurate genotype imputation method for the next generation of genome-wide association studies
.
PLoS Genet.
,
5
,
e1000529
.

87.

Marchini
,
J.
,
Howie
,
B.
,
Myers
,
S.
,
McVean
,
G.
and
Donnelly
,
P.
(
2007
)
A new multipoint method for genome-wide association studies by imputation of genotypes
.
Nat. Genet.
,
39
,
906
913
.

88.

Dudbridge
,
F.
and
Gusnanto
,
A.
(
2008
)
Estimation of significance thresholds for genomewide association scans
.
Genet. Epidemiol.
,
32
,
227
234
.

89.

Risch
,
N.
and
Merikangas
,
K.
(
1996
)
The future of genetic studies of complex human diseases
.
Science
,
273
,
1516
1517
.

90.

Wang
,
Z.
,
Sun
,
Y.
,
Fu
,
X.
,
Yu
,
G.
,
Wang
,
C.
,
Bao
,
F.
,
Yue
,
Z.
,
Li
,
J.
,
Sun
,
L.
,
Irwanto
,
A.
et al. (
2016
)
A large-scale genome-wide association and meta-analysis identified four novel susceptibility loci for leprosy
.
Nat. Commun.
,
7
,
13760
13767
.

91.

Skol
,
A.D.
,
Scott
,
L.J.
,
Abecasis
,
G.R.
and
Boehnke
,
M.
(
2006
)
Joint analysis is more efficient than replication-based analysis for two-stage genome-wide association studies
.
Nat. Genet.
,
38
,
209
213
.

92.

Edwards
,
W.
,
Lindman
,
H.
and
Savage
,
L.J.
(
1963
)
Bayesian statistical inference for psychological research
.
Psychol. Rev.
,
70
,
193
242
.

93.

Menard
,
L.
,
Saadoun
,
D.
,
Isnardi
,
I.
,
Ng
,
Y.S.
,
Meyers
,
G.
,
Massad
,
C.
,
Price
,
C.
,
Abraham
,
C.
,
Motaghedi
,
R.
,
Buckner
,
J.H.
et al. (
2011
)
The PTPN22 allele encoding an R620W variant interferes with the removal of developing autoreactive B cells in humans
.
J. Clin. Invest.
,
121
,
3635
3644
.

94.

Lopez de Lapuente
,
A.
,
Feliú
,
A.
,
Ugidos
,
N.
,
Mecha
,
M.
,
Mena
,
J.
,
Astobiza
,
I.
,
Riera
,
J.
,
Carrillo-Salinas
,
F.J.
,
Comabella
,
M.
,
Montalban
,
X.
et al. (
2016
)
Novel insights into the multiple sclerosis risk gene ANKRD55
.
J. Immunol.
,
196
,
4553
4565
.

95.

Couturier
,
N.
,
Bucciarelli
,
F.
,
Nurtdinov
,
R.N.
,
Debouverie
,
M.
,
Lebrun-Frenay
,
C.
,
Defer
,
G.
,
Moreau
,
T.
,
Confavreux
,
C.
,
Vukusic
,
S.
,
Cournu-Rebeix
,
I.
et al. . (
2011
)
Tyrosine kinase 2 variant influences T lymphocyte polarization and multiple sclerosis susceptibility
.
Brain
,
134
(
Pt 3
),
693
703
.

96.

Jansen
,
R.
,
Hottenga
,
J.J.
,
Nivard
,
M.G.
,
Abdellaoui
,
A.
,
Laport
,
B.
,
de
Geus
,
E.J.
,
Wright
,
F.A.
,
Penninx
,
B.W.J.H.
and
Boomsma
,
D.I.
(
2017
)
Conditional eQTL analysis reveals allelic heterogeneity of gene expression
.
Hum. Mol. Genet.
,
26
,
1444
1451
.

97.

Corradin
,
O.
,
Saiakhova
,
A.
,
Akhtar-Zaidi
,
B.
,
Myeroff
,
L.
,
Willis
,
J.
,
Cowper-Sal lari
,
R.
,
Lupien
,
M.
and
Markowitz
,
S.
(
2014
)
Scacheri PC. Combinatorial effects of multiple enhancer variants in linkage disequilibrium dictate levels of gene expression to confer susceptibility to common traits
.
Genome Res.
,
24
,–
13
.

98.

Bokor
,
S.
,
Dumont
,
J.
,
Spinneker
,
A.
,
Gonzalez-Gross
,
M.
,
Nova
,
E.
,
Widhalm
,
K.
,
Moschonis
,
G.
,
Stehle
,
P.
,
Amouyel
,
P.
,
De Henauw
,
S.
et al. (
2010
)
Single nucleotide polymorphisms in the FADS gene cluster are associated with delta-5 and delta-6 desaturase activities estimated by serum fatty acid ratios
.
J. Lipid Res.
,
51
,
2325
2333
.

99.

Chen
,
W.C.
,
Wei
,
J.C.
,
Lu
,
H.F.
,
Wong
,
H.S.
,
Woon
,
P.Y.
,
Hsu
,
Y.W.
,
Huang
,
J.D.
and
Chang
,
W.C.
(
2017
)
rs657075 (CSF2) is associated with the disease phenotype (BAS-G) of ankylosing spondylitis
.
Int. J. Mol. Sci.
,
18
,
E83
E95
.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Supplementary data