Abstract

Background

There is growing support for the use of genetic risk scores (GRS) in routine clinical settings. Due to the limited diversity of current genomic discovery samples, there are concerns that the predictive power of GRS will be limited in non-European ancestry populations. GRS for cardiometabolic traits were evaluated in sub-Saharan Africans in comparison with African Americans and European Americans.

Methods

We evaluated the predictive utility of GRS for 12 cardiometabolic traits in sub-Saharan Africans (AF; n = 5200), African Americans (AA; n = 9139) and European Americans (EUR; n = 9594). GRS were constructed as weighted sums of the number of risk alleles. Predictive utility was assessed using the additional phenotypic variance explained and the increase in discriminatory ability over traditional risk factors [age, sex and body mass index (BMI)], with adjustment for ancestry-derived principal components.

Results

Across all traits, GRS showed up to a 5-fold and 20-fold greater predictive utility in EUR relative to AA and AF, respectively. Predictive utility was most consistent for lipid traits, with percentage increase in explained variation attributable to GRS ranging from 10.6% to 127.1% among EUR, 26.6% to 65.8% among AA and 2.4% to 37.5% among AF. These differences were recapitulated in the discriminatory power, whereby the predictive utility of GRS was 4-fold greater in EUR relative to AA and up to 44-fold greater in EUR relative to AF. Obesity and blood pressure traits showed a similar pattern of greater predictive utility among EUR.

Conclusions

This work demonstrates the poorer performance of GRS in AF and highlights the need to improve representation of multiple ethnic populations in genomic studies to ensure equitable clinical translation of GRS.

Key Messages
  • Genetic risk score (GRS) prediction is poorer in sub-Saharan Africans compared with African Americans and European Americans.

  • To ensure equitable clinical translation of GRS, there is a need to improve ethnic diversity in genomic studies.

Background

The use of aggregate genetic risk, as summed up in genetic risk scores (GRS), to identify subgroups of individuals at increased risk of disease or more likely to benefit from early intervention, is gaining recognition as a practical translational strategy of genomic findings for both public health and clinical care. This trend is supported by evidence showing that risk associated with GRS for certain common complex diseases, such as severe obesity and coronary artery disease, can be as high as the risk conferred by some rare monogenic mutations, and that incorporating such GRS in disease risk prediction models can substantially increase prediction accuracy.1–4 However, GRS derived from existing genome-wide association studies (GWAS) show greater predictive value in European populations than in non-European populations, a reflection of the fact that most GWAS have been conducted in European-ancestry populations. For example, GRS derived from the largest available datasets show up to 2- to 5-fold greater predictive power in European-ancestry populations relative to African Americans and East Asians for a number of complex traits, including anthropometric indices and mental health disorders.5–8

There are concerns that the adoption of routine use of GRS in clinical settings could exacerbate existing health disparities because of suboptimal utility in non-European-ancestry populations. Therefore, as the use of GRS moves from research to clinical settings, it is essential to clarify its utility in populations that are currently under-represented in genomic discoveries. Whereas there are limited data on the predictive utility of GRS in populations such as East Asians and African Americans, similar information is lacking in populations from continental Africa.6–9 In the present study, we sought to assess the predictive utility of GRS for a range of cardiometabolic traits in sub-Saharan Africans (AF) and to make comparisons with European Americans (EUR) and African Americans (AA). We aimed to do this using GRS constructed from genetic variants reported in publicly available databases of GWAS, to exemplify the potential use of such resources.

Methods

All human research was conducted according to the Declaration of Helsinki and all relevant ethical regulations for work with human participants. The AADM study protocol was approved by the Institutional Ethics Review Board (IRB) of the National Institutes of Health/National Human Genome Research Institute (protocol HG-09-N070). HUFS received ethical approval from the Howard University IRB (protocol IRB-00-MED-13-1G). We obtained approval for controlled access (protocol number: 12-HG-N185) to each of the dbGaP studies used (dbGaP study accessions are given under Study participants). All dbGaP studies obtained ethical approvals from the relevant institutions. Written informed consent was obtained from each participant before enrolment in all studies.

Study participants

The predictive utility of GRS was assessed in up to 5200 sub-Saharan Africans (AF), 9139 African Americans (AA) and 9594 European Americans (EUR). AF were drawn from the AADM study10,11 that enrolled participants aged 18 years or older from Nigeria, Ghana and Kenya, as described previously.12 Data on AA were obtained from the Howard University Family Study (HUFS)13 and from the following dbGaP studies: Cleveland Family Study (CFS, phs000284),14 Jackson Heart Study (JHS, phs000286),15 Multi-Ethnic Study of Atherosclerosis (MESA, phs000209)16 and Atherosclerosis Risk in Communities Study (ARIC, phs000280).17 CFS, JHS, HUFS, MESA and ARIC participants were aged 35–84 years and were recruited from different parts of the USA. Data on EUR were obtained from the ARIC study.17

Cardiometabolic traits studied

We studied body mass index (BMI), waist circumference (WC), hip circumference (HC), waist-to-hip ratio (WHR), systolic blood pressure (SBP), diastolic blood pressure (DBP), fasting plasma glucose (FPG), triglycerides (TG), total cholesterol (TC), low-density lipoprotein (LDL) and high-density lipoprotein (HDL), all measured in standard units; type 2 diabetes (T2D) status was determined according to the American Diabetes Association criteria. Additionally, we derived the following binary traits based on commonly used clinical definitions: general obesity (BMI ≥30 kg/m2), abdominal obesity (WC: ≥94 cm, men; ≥80 cm, women), raised WHR (WHR: ≥1.0, men; ≥0.85, women), raised TG (TG ≥2.26 mmol/L), raised TC (TC ≥6.22 mmol/L), raised LDL (LDL ≥4.14 mmol/L), raised FPG (FPG ≥7.0 mmol/L), raised SBP (SBP ≥140 mmHg) and raised DBP (DBP ≥90 mmHg).18–21

SNP selection

We accessed all data (regardless of the ancestry of the population studied) for each trait in the NHGRI-EBI database of published genome-wide association studies (GWAS Catalog) as of 25 May 2019.22 The GWAS Catalog is a curated comprehensive public repository of published GWAS reporting single nucleotide polymorphism (SNP)-trait associations with P-value <1 × 10^-5. From the GWAS Catalog, we extracted the SNP identifier (RefSeq rs number) and the risk allele for each SNP reported. Each of the SNPs was then mapped to Ensembl release version 92 to identify the reference and alternative alleles. The set of SNPs present both among those extracted from the GWAS Catalog and in the target dataset (genotype data imputed into the corresponding ancestry population in the 1000 Genomes Project) was retained for constructing GRS. Further, we performed sensitivity analyses using independent SNPs obtained by pruning out, from the above SNPs, those with a variance inflation factor >2 (i.e. retaining SNPs with R2 <0.5) within a sliding ‘window’ of size 50 bp shifted over five SNPs at every step.23
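The pruning threshold can be read through the standard definition of the variance inflation factor. The relation below is a short worked note rather than part of the original methods; it assumes R2 denotes the squared multiple correlation of one SNP regressed on the other SNPs in the window, which is how PLINK defines the quantity for its VIF-based pruning.

```latex
\mathrm{VIF} = \frac{1}{1 - R^{2}}
\qquad\Longrightarrow\qquad
\mathrm{VIF} > 2 \;\Longleftrightarrow\; R^{2} > 0.5
```

Under that definition, removing SNPs with a variance inflation factor above 2 leaves a set in which each retained SNP has R2 of at most 0.5 with the others in its window.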

Construction of GRS

An individual’s GRS was constructed as a weighted sum of the number of risk alleles over all the SNPs identified for each trait, using PLINK 1.9.24 Effect sizes used for weighting were obtained from the UK Biobank (UKBB)25 for BMI, WC, HC, WHR, SBP, DBP and T2D, or from the largest study in the GWAS Catalog for the other traits (Spracklen et al.26 for TC, TG, HDL and LDL, and Manning et al.27 for FPG). UKBB data were from White British individuals, Spracklen et al. study data were from European and East Asian individuals, and Manning et al. study data were from European-ancestry individuals. For FPG, GRS was constructed for non-T2D cases only. The sign of the effect size was flipped when the reported risk allele in the weight-source dataset was the alternative of the risk allele in the target dataset.
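The weighted sum and the sign flip described above can be sketched in a few lines. This is a minimal illustration and not the PLINK implementation used in the study: the function name, argument layout and toy data are hypothetical, and it assumes biallelic SNPs reported on the same strand in both datasets.

```python
def weighted_grs(dosages, weights, weight_alleles, target_alleles):
    """Weighted sum of allele counts for each individual.

    dosages        : one row per individual, one count (0, 1 or 2) per SNP,
                     counting the allele named in target_alleles
    weights        : per-allele effect size for each SNP (weight source)
    weight_alleles : allele that each weight refers to in the weight source
    target_alleles : allele counted in the target dataset
    Assumes biallelic SNPs on the same strand (an illustrative simplification).
    """
    # Flip the sign of a weight when the two datasets refer to opposite alleles.
    signed = [w if wa == ta else -w
              for w, wa, ta in zip(weights, weight_alleles, target_alleles)]
    return [sum(d * s for d, s in zip(row, signed)) for row in dosages]


# Toy data (hypothetical): 3 individuals, 3 SNPs
dosages = [[0, 1, 2],
           [2, 0, 1],
           [1, 1, 0]]
weights = [0.10, 0.20, 0.05]
weight_alleles = ["A", "G", "T"]
target_alleles = ["A", "C", "T"]  # the second SNP counts the other allele

scores = weighted_grs(dosages, weights, weight_alleles, target_alleles)
print(scores)
```

In the toy data the second SNP is counted for the opposite allele, so its weight enters the sum with a negative sign.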

Construction of principal components

To adjust for potential effects of genetic stratification within populations on the predictive performance of GRS, we adjusted for the principal components (PCs) of genotypes in trait-GRS regression models. PCs were constructed separately for each population using a set of approximately independent SNPs across the genome, using PLINK 1.9. The optimal set of SNPs (AF: 55 034 SNPs, AA: 77 013 SNPs, EUR: 59 096 SNPs) was obtained by pruning out SNPs with a variance inflation factor >2 within a sliding window of size 50 bp shifted over five SNPs at every step. The original data points were then projected onto the extracted PCs using eigenvectors produced with the --pca flag in PLINK.24
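To make the idea of projecting individuals onto a principal component concrete, the following is a conceptual sketch only. It finds the leading eigenvector of the centred genotype matrix by power iteration and projects each individual onto it. It is not PLINK's algorithm (PLINK derives several components from a genetic relationship matrix), and the function name, starting vector and toy genotypes are hypothetical.

```python
import math


def first_pc_scores(genotypes, n_iter=100):
    """Project individuals onto the first principal component.

    genotypes: one row per individual, one allele count per SNP.
    Returns (scores, loadings). Power iteration is used for illustration;
    it fails if every SNP is constant or if the starting vector happens to
    be orthogonal to the leading eigenvector.
    """
    n, m = len(genotypes), len(genotypes[0])
    # Centre each SNP column on its mean.
    means = [sum(row[j] for row in genotypes) / n for j in range(m)]
    x = [[row[j] - means[j] for j in range(m)] for row in genotypes]
    v = [float(j + 1) for j in range(m)]  # arbitrary non-zero start
    for _ in range(n_iter):
        s = [sum(xi[j] * v[j] for j in range(m)) for xi in x]        # X v
        w = [sum(x[i][j] * s[i] for i in range(n)) for j in range(m)]  # X^T X v
        norm = math.sqrt(sum(wj * wj for wj in w))
        v = [wj / norm for wj in w]
    scores = [sum(xi[j] * v[j] for j in range(m)) for xi in x]
    return scores, v


# Toy genotypes (hypothetical): SNPs 1 and 2 vary together, SNP 3 is constant
genotypes = [[0, 0, 1],
             [1, 1, 1],
             [2, 2, 1]]
pc1, loadings = first_pc_scores(genotypes)
print(pc1, loadings)
```

The resulting scores would then enter the regression models as covariates, one column per retained component.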

Statistical analysis

Trait-GRS association was assessed using correlations between traits and GRS, and by plotting the observed mean or prevalence of a trait against its GRS deciles. Predictive utility of GRS was assessed using two metrics: (i) additional trait variability attributable to GRS, in terms of the adjusted R-squared of the regression model; and (ii) additional discriminatory power attributable to GRS, in terms of the area under the receiver operating characteristic (ROC) curve (AUC). R-squared assessments were based on comparisons of regression models fitted for each quantitative trait against traditional risk factors [age, sex, principal components of ancestry and BMI (except when BMI was the trait under study)], with GRS (GRS model) and without GRS (traditional model). Logistic regression models were fitted for T2D, and Efron’s R2 was used to estimate the additional variation in the probability of T2D explained by GRS.28 AUCs were based on logistic regression models fitted for binary traits, and the additional discriminatory power of GRS was assessed by comparing the model of GRS plus traditional risk factors with the model of only traditional risk factors. In addition, we compared the performance of our GWAS Catalog-based GRS with a genome-wide GRS based on all SNPs (P ≤ 1, i.e. not restricted to P < 1 × 10^-5) that were approximately independent (R2 <0.5) within a window of one Mbp and had minor allele frequency (MAF) >0.01. Filtering of SNPs and computation of weights were performed in the software GCTA using the flags --cojo-sblup, with relevant parameters for each trait (Supplementary Table S1, available as Supplementary data at IJE online), and --cojo-wind 1000, and scores for each individual in the target dataset were computed in PLINK 1.9.24,29
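The three metrics named above can be written down compactly. The sketch below is illustrative and not the Stata code used in the study: it takes fitted values or predicted probabilities from any regression routine as given, and all function names and toy numbers are hypothetical. The gain attributed to GRS is the metric for the GRS model minus the metric for the traditional model.

```python
def r_squared(y, fitted):
    """1 - residual sum of squares / total sum of squares."""
    mean_y = sum(y) / len(y)
    ss_res = sum((yi - fi) ** 2 for yi, fi in zip(y, fitted))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return 1.0 - ss_res / ss_tot


def adjusted_r_squared(y, fitted, n_predictors):
    """Adjusted R-squared for a linear model with n_predictors covariates."""
    n = len(y)
    return 1.0 - (1.0 - r_squared(y, fitted)) * (n - 1) / (n - n_predictors - 1)


def efron_r_squared(status, prob):
    """Efron's R-squared: same formula, with 0/1 outcomes and predicted probabilities."""
    return r_squared(status, prob)


def auc(status, score):
    """Probability that a case scores higher than a non-case (ties count half)."""
    cases = [s for s, d in zip(score, status) if d == 1]
    controls = [s for s, d in zip(score, status) if d == 0]
    wins = sum((c > k) + 0.5 * (c == k) for c in cases for k in controls)
    return wins / (len(cases) * len(controls))


# Toy quantitative trait (hypothetical fitted values)
y = [1.0, 2.0, 3.0, 4.0, 5.0]
fitted_traditional = [1.5, 2.5, 2.5, 3.5, 5.0]  # model with 2 covariates
fitted_grs = [1.1, 1.9, 3.2, 3.8, 5.0]          # same covariates plus GRS
additional_r2 = (adjusted_r_squared(y, fitted_grs, 3)
                 - adjusted_r_squared(y, fitted_traditional, 2))

# Toy binary trait (hypothetical predicted probabilities)
status = [0, 0, 1, 1]
prob_traditional = [0.2, 0.6, 0.4, 0.8]
prob_grs = [0.2, 0.4, 0.6, 0.8]
additional_auc = auc(status, prob_grs) - auc(status, prob_traditional)
additional_efron = (efron_r_squared(status, prob_grs)
                    - efron_r_squared(status, prob_traditional))
print(additional_r2, additional_auc, additional_efron)
```

Because the adjusted R-squared penalizes the extra GRS term, the gain can be slightly negative when GRS adds nothing, as seen for WHR in Table 3.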

All downstream analyses were performed in Stata version 15.1 (StataCorp, TX), and two-tailed P-values < 1.388e-3 (type 1 error rate, α = 0.05, adjusted for 36 tests) were considered to be consistent with evidence in support of the alternative hypothesis. The P-values referred to here relate to regression and correlation coefficients of association between each trait and its corresponding GRS.
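The significance threshold follows from dividing the type 1 error rate by the number of tests. As a worked note, and assuming (the text does not say so explicitly) that the 36 tests correspond to 12 traits in each of three groups:

```latex
\alpha_{\text{adjusted}} = \frac{\alpha}{m} = \frac{0.05}{36} = 0.0013\overline{8} \approx 1.39 \times 10^{-3},
\qquad m = 12 \times 3 = 36
```

The value quoted in the text, 1.388e-3, is this quotient truncated at the fourth significant digit.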

Results

Distribution of GRS

Information about the cardiometabolic traits studied, number of SNPs, sources of weights and numbers of individuals studied are shown in Table 1. Our study samples clustered as expected with the 1000 Genomes Project samples (Supplementary Figure S1, available as Supplementary data at IJE online). The number of SNPs used to construct GRS did not significantly differ between the three groups. The distribution of GRS for the cardiometabolic traits studied differed among the three groups, except for total cholesterol (TC) (Figure 1).

Figure 1

Distribution of genetic risk scores by group. AF, sub-Saharan Africans; AA, African Americans; EUR, European Americans; WHR, waist-to-hip ratio; SBP, systolic blood pressure; DBP, diastolic blood pressure; TG, triglycerides; TC, total cholesterol; LDL, low-density lipoprotein; HDL, high-density lipoprotein; FPG, fasting plasma glucose; T2D, type 2 diabetes; OR, odds ratio

Table 1

Sample size, descriptive summary of single nucleotide polymorphisms (SNPs) and source of weights

Trait | SNPs identified in GWAS Catalog (N) | Source of weights (sample size) | GWAS Catalog SNPs in UKBB or largest study in GWAS Catalog (N) | AF: SNPs present (N) | AF: individuals (N) | AF: mean GRS (SD) | AA: SNPs (N) | AA: individuals (N) | AA: mean GRS (SD) | EUR: SNPs (N) | EUR: individuals (N) | EUR: mean GRS (SD)
--- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | ---
BMI | 650 | UKBB (360 564) | 650 | 620 | 5187 | 15.271 (2.3) | 626 | 9139 | 13.06 (1.2) | 615 | 9594 | 15.22 (3.4)
WC | 253 | UKBB (360 564) | 251 | 242 | 5197 | 6.953 (1.7) | 245 | 9119 | 7.297 (1.5) | 242 | 9584 | 7.786 (2.0)
HC | 157 | UKBB (360 564) | 156 | 148 | 5200 | 13.536 (1.6) | 149 | 6939 | 13.492 (1.6) | 147 | 9584 | 11.989 (1.5)
WHR | 214 | UKBB (484 900) | 212 | 189 | 5195 | 4.782 (0.7) | 189 | 6460 | 5.002 (0.8) | 189 | 9583 | 5.985 (0.9)
SBP | 183 | UKBB (360 564) | 170 | 159 | 4646 | 22.845 (2.3) | 163 | 7223 | 17.112 (2.2) | 157 | 9589 | 25.004 (2.8)
DBP | 208 | UKBB (360 564) | 198 | 187 | 4646 | 11.991 (1.3) | 189 | 7223 | 10.825 (1.3) | 186 | 9589 | 13.287 (1.7)
TG | 480 | Spracklen study (222 097) | 225 | 207 | 4140 | 3.325 (0.8) | 209 | 8573 | 3.216 (0.9) | 207 | 9575 | 3.665 (1.3)
TC | 420 | Spracklen study (222 097) | 188 | 174 | 4140 | 7.371 (0.6) | 174 | 8576 | 7.369 (0.7) | 174 | 9573 | 7.395 (0.7)
LDL | 423 | Spracklen study (222 097) | 186 | 173 | 4108 | 4.786 (0.8) | 174 | 8517 | 3.24 (0.6) | 173 | 9418 | 5.753 (0.8)
HDL | 499 | Spracklen study (222 097) | 263 | 246 | 4140 | 8.098 (1.2) | 249 | 8572 | 8.132 (1.3) | 247 | 9575 | 7.84 (1.5)
FPG | 42 | Manning study (58 074) | 35 | 31 | 2149 | 0.761 (0.1) | 31 | 7255 | 0.728 (0.1) | 31 | 8745 | 0.573 (0.1)
T2D | 374 | UKBB (360 564) | 362 | 339 | 4662 | 0.029 [a] (0.004) | 341 | 9021 | 0.023 [a] (0.003) | 339 | 9576 | 0.027 (0.004)

SNPs, single nucleotide polymorphisms; GWAS, genome-wide association studies; AF, sub-Saharan Africans; AA, African Americans; EUR, European Americans; BMI, body mass index; WC, waist circumference; HC, hip circumference; WHR, waist-to-hip ratio; SBP, systolic blood pressure; DBP, diastolic blood pressure; TG, triglycerides; TC, total cholesterol; LDL, low-density lipoprotein; HDL, high-density lipoprotein; FPG, fasting plasma glucose; T2D, type 2 diabetes; UKBB, UK Biobank; N, number; GRS, genetic risk score; SD, standard deviation.

[a] Weighted by log(odds ratio).

Overall, relative to AF and AA, EUR had significantly higher GRS for six (waist circumference, WC; waist-hip ratio, WHR; systolic blood pressure, SBP; diastolic blood pressure, DBP; triglycerides, TG; low-density lipoprotein, LDL) out of the 12 traits studied. On the other hand, AF had a significantly higher GRS for hip circumference (HC), fasting plasma glucose (FPG) and T2D. The overlap of GRS distributions was greater between AF and AA [nearly identical for HC, WHR, TG and high-density lipoprotein (HDL)] than between any one of them and EUR, except for T2D and LDL for which there was greater overlap of GRS distributions between AF and EUR. Generally, the distribution of GRS among AA was consistently below or between the distributions among AF and EUR. We note that differences in the distributions of GRS between populations should be interpreted cautiously. Simulation studies have shown that the sign of mean GRS differences between populations is random even when causal variants and their effects are shared across ancestries.30 Systematic differences in GRS distributions likely reflect underlying differences in allele frequency and linkage disequilibrium (LD).

Association of GRS with cognate outcomes

GRS were more strongly associated with their respective traits among EUR relative to AF and AA (Table 2). Among EUR, 10 of the 12 trait-GRS showed evidence of association (P < 1.388e-3) and eight and six of 12 trait-GRS showed evidence of association among AA and AF, respectively (Supplementary Figure S2, available as Supplementary data at IJE online). In addition, the strongest trait-GRS associations were observed for lipid traits in all three groups.

Table 2

Correlations between quantitative traits and genetic risk score

Trait | AF: correlation coefficient | AF: P | AA: correlation coefficient | AA: P | EUR: correlation coefficient | EUR: P
--- | --- | --- | --- | --- | --- | ---
BMI | 0.051 | 0.0002 | 0.041 | 0.0001 | 0.091 | 3.14E-19
WC | 0.028 | 0.0470 | 0.023 | 0.0264 | 0.054 | 1.20E-07
HC | 0.043 | 0.0020 | 0.033 | 0.0060 | 0.081 | 1.25E-15
WHR | 0.012 | 0.3761 | 0.007 | 0.5642 | 0.010 | 0.3421
SBP | 0.008 | 0.5826 | 0.034 | 0.0036 | 0.071 | 3.14E-12
DBP | 0.023 | 0.1219 | 0.049 | 2.17E-05 | 0.062 | 1.45E-09
TG | 0.102 | 4.70E-11 | 0.093 | 5.62E-18 | 0.192 | 2.30E-80
TC | 0.113 | 3.12E-13 | 0.124 | 6.27E-31 | 0.186 | 3.63E-75
LDL | 0.141 | 1.32E-19 | 0.095 | 8.22E-19 | 0.186 | 1.10E-74
HDL | 0.100 | 1.23E-10 | 0.199 | 1.49E-78 | 0.182 | 2.45E-72
FPG | 0.001 | 0.9621 | 0.020 | 0.0606 | 0.084 | 1.29E-16

AF, sub-Saharan Africans; AA, African Americans; EUR, European Americans; BMI, body mass index; WC, waist circumference; HC, hip circumference; WHR, waist-to-hip ratio; SBP, systolic blood pressure; DBP, diastolic blood pressure; TG, triglycerides; TC, total cholesterol; LDL, low-density lipoprotein; HDL, high-density lipoprotein; FPG, fasting plasma glucose.

Predictive utility of GRS

In regression models adjusted for traditional risk factors and population genetic structure (represented by the first three principal components of ancestry), GRS was significantly associated with body mass index (BMI), DBP, lipid traits and T2D in all three groups (Table 3). Furthermore, among AA and EUR, GRS was also significantly associated with WC, SBP and FPG and, additionally, with HC among EUR only. The effect sizes of the above seven trait-GRS associations (GRS association with BMI, DBP, lipid traits and T2D) ranked in roughly the same order, with the TC-GRS association being the strongest and the BMI-GRS association the weakest. Notably, among these trait-GRS associations, the largest effect size was observed among EUR for TG, TC, LDL and T2D, whereas the other three (BMI, DBP and HDL) had their largest effect sizes among AA. As an example, among trait-GRS associations common to all three groups, the TC-GRS association was the strongest and the effect sizes were 0.226, 0.216 and 0.281 mmol/L per unit increase in GRS (all P < 0.0001) among AF, AA and EUR, respectively. Furthermore, there was evidence of association based on odds ratios for binary traits (comparing individuals in the top 10% of GRS with the rest) for lipids, FPG and T2D, but not for raised TG and raised FPG among AF (Figure 2).

Figure 2

Association between GRS (individuals in the top 10% versus the others) and binary traits. AF, sub-Saharan Africans; AA, African Americans; EUR, European Americans; WHR, waist-to-hip ratio; SBP, systolic blood pressure; DBP, diastolic blood pressure; TG, triglycerides; TC, total cholesterol; LDL, low-density lipoprotein; HDL, high-density lipoprotein; FPG, fasting plasma glucose; T2D, type 2 diabetes.
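The comparison of individuals in the top 10% of GRS with the rest can be illustrated with a crude odds ratio from a 2 x 2 table. This is a sketch under stated assumptions: it is unadjusted, whereas the associations reported here come from models that include covariates, and the function name and toy data are hypothetical.

```python
def top_decile_odds_ratio(grs, status):
    """Crude odds ratio of a binary trait: top 10% of GRS versus the rest.

    grs    : genetic risk score per individual
    status : 1 if the individual has the trait, 0 otherwise
    Ties at the decile boundary are broken by sort order, and a zero cell
    in the 2 x 2 table raises ZeroDivisionError (no continuity correction).
    """
    n = len(grs)
    order = sorted(range(n), key=lambda i: grs[i], reverse=True)
    n_top = max(1, n // 10)
    top = set(order[:n_top])
    a = sum(1 for i in top if status[i] == 1)                        # top, affected
    b = n_top - a                                                    # top, unaffected
    c = sum(1 for i in range(n) if i not in top and status[i] == 1)  # rest, affected
    d = (n - n_top) - c                                              # rest, unaffected
    return (a * d) / (b * c)


# Toy data (hypothetical): 20 individuals with GRS 1..20
grs = list(range(1, 21))
status = [0] * 20
status[19] = 1                           # one of the two top-decile individuals
status[0] = status[1] = status[2] = 1    # three affected among the other 18

odds_ratio = top_decile_odds_ratio(grs, status)
print(odds_ratio)
```

An adjusted version would replace the 2 x 2 table with a logistic regression of the trait on a top-decile indicator plus the traditional risk factors and principal components.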

Table 3

Genetic risk Score effect size and R-squared

Panel A: AF

Trait | GRS effect size | SE | P-value | R2 (GRS model) | R2 (model without GRS) | Additional variation explained
--- | --- | --- | --- | --- | --- | ---
BMI | 0.133 | 0.0336 | 0.0001 | 0.0767 | 0.0741 | 0.0026
WC | 0.074 | 0.0676 | 0.2749 | 0.5700 | 0.5700 | 0.0000
HC | 0.095 | 0.0722 | 0.1898 | 0.5545 | 0.5545 | 0.0000
WHR | 0.0004 | 0.0015 | 0.7781 | 0.1965 | 0.1967 | −0.0002
SBP | 0.141 | 0.1383 | 0.3068 | 0.1640 | 0.1640 | 0.0000
DBP | 0.349 | 0.1514 | 0.0213 | 0.0659 | 0.0651 | 0.0008
TG | 0.073 | 0.016 | 2.83E-06 | 0.1803 | 0.1761 | 0.0042
TC | 0.272 | 0.036 | 6.89E-14 | 0.0628 | 0.0502 | 0.0126
LDL | 0.226 | 0.025 | 1.45E-19 | 0.0781 | 0.0596 | 0.0185
HDL | 0.041 | 0.006 | 5.44E-12 | 0.0403 | 0.0293 | 0.0110
FPG | −0.1701 | 0.1570 | 0.2788 | 0.0447 | 0.0447 | 0.0000
T2D | 44.9 | 0.0024 | 6.84E-08 | 0.1180 | 0.1050 | 0.0130

Panel B: AA

Trait | GRS effect size | SE | P-value | R2 (GRS model) | R2 (model without GRS) | Additional variation explained
--- | --- | --- | --- | --- | --- | ---
BMI | 0.242 | 0.0641 | 0.0002 | 0.0214 | 0.0200 | 0.0014
WC | 0.122 | 0.0558 | 0.0284 | 0.7679 | 0.7678 | 0.0001
HC | −0.014 | 0.0437 | 0.7436 | 0.8387 | 0.8387 | 0.000
WHR | 0 | 0.0012 | 0.7457 | 0.2430 | 0.2431 | −0.0001
SBP | 0.334 | 0.0997 | 0.0008 | 0.1036 | 0.1023 | 0.0013
DBP | 0.448 | 0.1041 | 1.75E-05 | 0.0174 | 0.0150 | 0.0024
TG | 0.089 | 0.0101 | 1.08E-18 | 0.0359 | 0.0272 | 0.0087
TC | 0.216 | 0.0181 | 1.29E-32 | 0.0651 | 0.0496 | 0.0155
LDL | 0.187 | 0.02 | 9.93E-21 | 0.0462 | 0.0365 | 0.0097
HDL | 0.066 | 0.0034 | 1.01E-80 | 0.0980 | 0.0591 | 0.0389
FPG | 0.261 | 0.0724 | 0.0003 | 0.1806 | 0.1793 | 0.0013
T2D | −7.477 | 0.0022 | 0.3944 | 0.0670 | 0.0620 | 0.0050

Panel C: EUR

Trait | GRS effect size | SE | P-value | R2 (GRS model) | R2 (model without GRS) | Additional variation explained
--- | --- | --- | --- | --- | --- | ---
BMI | 0.124 | 0.0137 | 1.79E-19 | 0.0153 | 0.0070 | 0.0083
WC | 0.077 | 0.0282 | 0.0066 | 0.8215 | 0.8214 | 0.0001
HC | 0.104 | 0.0295 | 0.0004 | 0.8027 | 0.8025 | 0.0002
WHR | 0 | 0.0006 | 0.71 | 0.4750 | 0.4750 | 0.00
SBP | 0.439 | 0.0569 | 1.33E-14 | 0.1438 | 0.1385 | 0.0053
DBP | 0.369 | 0.059 | 3.96E-10 | 0.0896 | 0.0860 | 0.0036
TG | 0.15 | 0.0077 | 6.47E-82 | 0.1085 | 0.0737 | 0.0348
TC | 0.281 | 0.0147 | 1.61E-79 | 0.0721 | 0.0369 | 0.0352
LDL | 0.237 | 0.0125 | 1.30E-78 | 0.0636 | 0.0280 | 0.0356
HDL | 0.05 | 0.0025 | 1.95E-88 | 0.3061 | 0.2767 | 0.0294
FPG | 0.707 | 0.0509 | 1.77E-43 | 0.1374 | 0.1184 | 0.0190
T2D | 53.798 | 0.0023 | 6.31E-10 | 0.0690 | 0.0550 | 0.0140

PC1, PC2, PC3 are the first three principal components of ancestry; additional variation explained is in percentage points; GRS effect size for T2D is on the logit scale. GRS effect size = β linear regression coefficient (for quantitative trait) or odds ratio (for disease trait). GRS model, trait = α + age + sex + BMI + PC1 + PC2 + PC3 + GRS (BMI excluded in covariates when it is the trait under study).

AF, sub-Saharan Africans; AA, African Americans; EUR, European Americans; SE, standard error; R2, adjusted R-squared; BMI, body mass index; WC, waist circumference; HC, hip circumference; WHR, waist-to-hip ratio; SBP, systolic blood pressure; DBP, diastolic blood pressure; TG, triglycerides; TC, total cholesterol; LDL, low-density lipoprotein; HDL, high-density lipoprotein; FPG, fasting plasma glucose; T2D, type 2 diabetes.

The predictive utility of GRS was assessed as the additional variation explained by the model including GRS (GRS model) relative to the variation explained by the model of traditional risk factors only (traditional model). The predictive utility of GRS showed significant variation both among traits and among groups (Figure 3). We observed substantial predictive utility of GRS for lipid traits and T2D in all groups, and additionally among EUR for BMI and FPG. Among AA, GRS also appeared to have predictive power for DBP. Overall, the predictive power of GRS was significantly greater in EUR than in AF and AA, with up to 5-fold and 20-fold greater predictive utility in EUR relative to AA and AF, respectively. Exceptions were observed for HDL and DBP, for which the predictive utility of GRS was greater among AF (HDL, 4-fold) and AA (HDL, 6-fold; DBP, 3.8-fold) than among EUR. Between AF and AA, disparity in the predictive value of GRS was less consistent and less profound, but still substantial for some traits. For example, the predictive utility of GRS for TG was 13-fold greater among AA relative to AF, whereas for HDL it was 1.5-fold greater among AF relative to AA.
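
The nested-model comparison described above can be sketched in Python as follows. This is a minimal illustration on simulated data, not the study's pipeline: the sample size, covariate distributions, effect sizes and the helper name `adjusted_r2` are all assumptions made for the example. It fits the traditional model and the GRS model by ordinary least squares and reports the gain in adjusted R-squared.

```python
import numpy as np

def adjusted_r2(X, y):
    """Adjusted R-squared of an OLS fit of y on X (an intercept is added here)."""
    n, p = X.shape
    design = np.column_stack([np.ones(n), X])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ beta
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r2 = 1.0 - ss_res / ss_tot
    return 1.0 - (1.0 - r2) * (n - 1) / (n - p - 1)

# Simulated (hypothetical) data: age, sex, BMI, three ancestry PCs and a GRS.
rng = np.random.default_rng(0)
n = 5000
age = rng.normal(50, 10, n)
sex = rng.integers(0, 2, n).astype(float)
bmi = rng.normal(27, 5, n)
pcs = rng.normal(size=(n, 3))
grs = rng.normal(size=n)
trait = 0.02 * age + 0.3 * sex + 0.05 * bmi + 0.25 * grs + rng.normal(size=n)

# Traditional model: age + sex + BMI + PC1-PC3; GRS model adds the GRS column.
traditional = np.column_stack([age, sex, bmi, pcs])
with_grs = np.column_stack([traditional, grs])

r2_base = adjusted_r2(traditional, trait)
r2_grs = adjusted_r2(with_grs, trait)
additional = r2_grs - r2_base

print(f"adjusted R2 without GRS: {r2_base:.4f}")
print(f"adjusted R2 with GRS:    {r2_grs:.4f}")
print(f"additional variation explained: {additional:.4f}")
```

Because the two models differ only by the GRS term, any gain in adjusted R-squared is attributable to the GRS after accounting for the traditional risk factors and ancestry components.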

Figure 3

Percentage increase in R-squared attributable to genetic risk score. AF, sub-Saharan Africans; AA, African Americans; EUR, European Americans; WHR, waist-to-hip ratio; SBP, systolic blood pressure; DBP, diastolic blood pressure; TG, triglycerides; TC, total cholesterol; LDL, low-density lipoprotein; HDL, high-density lipoprotein; FPG, fasting plasma glucose; T2D, type 2 diabetes

The predictive utility of GRS based on additional trait variation explained was limited for traits for which variability was substantially explained by traditional risk factors. This is not surprising, given the definition of predictive utility as the percentage increase in adjusted R2 of the model including traditional risk factors and GRS, relative to the base model of traditional risk factors only. The phenomenon was especially true for anthropometric traits across groups, except for BMI among EUR where the addition of GRS into the model more than doubled prediction accuracy, representing a 34-fold and 17-fold greater predictive utility in EUR compared with AF and AA, respectively.
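
As a worked check of this definition, the short Python sketch below recomputes the percentage increase in adjusted R2 for BMI from the adjusted R2 values reported above for AA and EUR, together with the resulting fold difference. The helper names `pct_increase` and `fold_difference` are illustrative and do not come from the study.

```python
def pct_increase(r2_with_grs, r2_without_grs):
    """Percentage increase in adjusted R2 attributable to the GRS."""
    return 100.0 * (r2_with_grs - r2_without_grs) / r2_without_grs

def fold_difference(pct_a, pct_b):
    """How many times larger pct_a is than pct_b (infinite if pct_b is zero)."""
    return float("inf") if pct_b == 0 else pct_a / pct_b

# Adjusted R2 for BMI as (GRS model, model without GRS), taken from the tables above.
eur_bmi = pct_increase(0.0153, 0.0070)
aa_bmi = pct_increase(0.0214, 0.0200)

print(f"EUR: {eur_bmi:.1f}%")   # about 118.6%
print(f"AA:  {aa_bmi:.1f}%")    # about 7.0%
print(f"EUR/AA fold difference: {fold_difference(eur_bmi, aa_bmi):.1f}")  # about 16.9
```

The EUR value above 100% corresponds to the statement that adding GRS more than doubled prediction accuracy, and the ratio of roughly 17 matches the fold difference reported for EUR relative to AA.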

We assessed the predictive utility of GRS for dichotomized transformations of the quantitative traits in addition to T2D using the area under the receiver operating characteristic curve (AUC). The heterogeneity among traits and disparity among groups of the predictive utility of GRS were similar under this approach. We observed a substantial predictive utility of GRS for components of lipid dysregulation and T2D across groups, but more so among EUR (Figure 4). Among AF, the greatest increases in AUC (less than 2% gains) were observed for lipid traits and T2D. Among AA, lipid traits and T2D had increases of up to 5.7%. Among EUR, increases of up to 23.2% were observed for nine of 12 traits, again showing better predictive performance among EUR.
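
The AUC comparison can be illustrated with a small Python sketch. It is only a sketch under stated assumptions: the data are simulated, the "traditional" predictor is a single stand-in variable rather than a fitted logistic model, and the names `auc`, `traditional` and `pct_gain` are hypothetical. The AUC is computed from ranks (the Mann-Whitney formulation), and the percentage increase is taken relative to the model without GRS.

```python
import numpy as np

def auc(labels, scores):
    """Area under the ROC curve via the Mann-Whitney U statistic.

    Tied scores receive the average of the ranks they span.
    """
    labels = np.asarray(labels, dtype=bool)
    scores = np.asarray(scores, dtype=float)
    order = np.argsort(scores, kind="mergesort")
    sorted_scores = scores[order]
    ranks = np.empty(len(scores), dtype=float)
    i = 0
    while i < len(scores):
        j = i
        while j + 1 < len(scores) and sorted_scores[j + 1] == sorted_scores[i]:
            j += 1
        ranks[order[i:j + 1]] = 0.5 * (i + j) + 1.0  # average 1-based rank
        i = j + 1
    n_pos = labels.sum()
    n_neg = len(labels) - n_pos
    u = ranks[labels].sum() - n_pos * (n_pos + 1) / 2.0
    return u / (n_pos * n_neg)

# Simulated (hypothetical) cohort.
rng = np.random.default_rng(1)
n = 4000
traditional = rng.normal(size=n)   # stand-in for the age/sex/BMI/PC predictor
grs = rng.normal(size=n)           # stand-in for a standardised GRS
liability = traditional + 0.5 * grs + rng.normal(size=n)
case = liability > np.quantile(liability, 0.8)   # dichotomised trait

auc_base = auc(case, traditional)
auc_grs = auc(case, traditional + 0.5 * grs)
pct_gain = 100.0 * (auc_grs - auc_base) / auc_base

print(f"AUC without GRS: {auc_base:.3f}")
print(f"AUC with GRS:    {auc_grs:.3f}")
print(f"percentage increase in AUC: {pct_gain:.1f}%")
```

Since an uninformative predictor already has an AUC of 0.5, percentage gains in AUC are bounded and tend to look smaller than percentage gains in R-squared for the same trait.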

Figure 4

Percentage increase in area under the receiver operating characteristic curve (AUC) attributable to genetic risk score

As with the R-squared method, we found limited absolute predictive utility of GRS among AF and AA under the AUC approach for traits such as general obesity and abdominal obesity, whether defined by WC or by WHR. Absolute predictive performance of GRS was, however, substantially better among EUR. Thus, the disparity in predictive utility of GRS between EUR and AF was extremely large for these traits. For example, the predictive utility of GRS among EUR relative to AF for general obesity and raised WHR was 249- and 172-fold, respectively. The disparity was smaller between EUR and AA for general obesity but not for raised WHR: the predictive utility of GRS among EUR relative to AA was 17- and 172-fold, respectively. For abdominal obesity, where GRS had no predictive utility beyond traditional risk factors among AF and AA, the relative gain in predictive performance among EUR was effectively infinite, because the denominator of the ratio was zero.

Sensitivity analyses

As a sensitivity analysis, we assessed the predictive utility of GRS constructed from only independent SNPs (i.e. with SNPs in high LD removed) (prunedGRS). The predictive utility of prunedGRS broadly recapitulated the above results: consistent trait-GRS associations for lipids with greater predictive power among EUR compared with AF and AA (Supplementary Figure S3, available as Supplementary data at IJE online). Predictive utility was lower for prunedGRS compared with GRS based on all SNPs in all three groups except for LDL among AF and AA. The number of SNPs removed due to high LD was lower for AF compared with AA and EUR across traits, but largely comparable between AA and EUR (Supplementary Table S2, available as Supplementary data at IJE online).
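
The idea of restricting a score to approximately independent SNPs can be sketched as a greedy pruning step. The Python example below is an illustration only, assuming simulated genotypes and a hypothetical r-squared threshold of 0.5; the function name `prune_by_ld` is invented for the example, and analyses of this kind are normally run with dedicated genetics software rather than code like this.

```python
import numpy as np

def prune_by_ld(genotypes, r2_threshold=0.5):
    """Greedy LD pruning: keep a SNP only if its squared correlation with
    every SNP already kept is at or below r2_threshold.

    genotypes: (n_individuals, n_snps) array of allele counts (0, 1, 2);
    columns are assumed to be polymorphic so that correlations are defined.
    Returns the column indices of the SNPs that are kept.
    """
    g = np.asarray(genotypes, dtype=float)
    r2 = np.corrcoef(g, rowvar=False) ** 2
    kept = []
    for snp in range(g.shape[1]):
        if all(r2[snp, k] <= r2_threshold for k in kept):
            kept.append(snp)
    return kept

# Simulated (hypothetical) genotypes for three SNPs.
rng = np.random.default_rng(2)
n = 2000
snp_a = rng.binomial(2, 0.3, n)
# snp_b copies snp_a for about 95% of individuals, so the two are in high LD.
copy = rng.random(n) < 0.95
snp_b = np.where(copy, snp_a, rng.binomial(2, 0.3, n))
snp_c = rng.binomial(2, 0.4, n)   # independent of the other two
genotypes = np.column_stack([snp_a, snp_b, snp_c])

kept = prune_by_ld(genotypes, r2_threshold=0.5)
print("SNP columns kept after pruning:", kept)
```

A pruned score would then be the weighted sum of risk alleles over the kept columns only, which is why fewer SNPs are removed in populations with weaker LD.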

To assess the impact of adjusting for BMI on the predictive performance of GRS, we compared GRS models with and without adjustment for BMI (Supplementary Table S3, available as Supplementary data at IJE online). Adjustment for BMI affected the prediction of lipids more than other traits, with more traits affected among EUR. Among AF and AA, the greatest impact of BMI adjustment was with respect to HDL (AF: BMI-adjusted model R2 = 0.0403 versus BMI-unadjusted model R2 = 0.034; AA: BMI-adjusted model R2 = 0.098 versus BMI-unadjusted model R2 = 0.0538), whereas TG was the most affected among EUR (BMI-adjusted model R2 = 0.1085 versus BMI-unadjusted model R2 = 0.0497). Notably, the predictive accuracy of GRS was better in EUR relative to AF and AA regardless of BMI adjustment.

We also compared our GWAS Catalog-based GRS with a genome-wide risk score based on a set of all approximately independent SNPs within 1 Mbp (R2 <0.5) with MAF >0.01 (GRSA). For the majority of the traits studied, the predictive accuracy of GRSA was lower than or not different from that of the simpler GRS, with a few exceptions (Table 4). GRSA prediction accuracy was better than that of GRS for WC among AF, for BMI, LDL and TC among AA, and for BMI and FPG among EUR. This is consistent with other studies in which genome-wide risk scores with correction for LD structure did not yield a uniform improvement in predictive power across all traits and populations evaluated.31,32 However, prediction accuracy of both GRSA and the simpler GRS was lower among AF compared with AA and EUR.

Table 4

Prediction accuracy (adjusted R2) of the genetic risk score constructed from all single nucleotide polymorphisms with minor allele frequency greater than 0.01 (GRSA)

| Trait | AF: model without GRS | AF: model with GRSA | AF: % change in adj-R2 | AA: model without GRS | AA: model with GRSA | AA: % change in adj-R2 | EUR: model without GRS | EUR: model with GRSA | EUR: % change in adj-R2 |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| BMI | 0.0741 | 0.0741 | 0.00 | 0.0200 | 0.0199 | −0.50 | 0.0070 | 0.0334 | 377.14 |
| WC | 0.5700 | 0.7770 | 36.32 | 0.0741 | 0.0768 | 3.62 | 0.8214 | 0.8222 | 0.10 |
| HC | 0.5545 | 0.5545 | 0.00 | 0.8387 | 0.8397 | 0.12 | 0.8025 | 0.8037 | 0.15 |
| WHR | 0.1967 | 0.1966 | −0.05 | 0.2431 | 0.2538 | 4.40 | 0.4750 | 0.4773 | 0.48 |
| SBP | 0.1640 | 0.1645 | 0.30 | 0.1023 | 0.1022 | −0.10 | 0.1385 | 0.1549 | 11.84 |
| DBP | 0.0651 | 0.0649 | −0.31 | 0.0150 | 0.0352 | 134.67 | 0.0860 | 0.0984 | 14.42 |
| TG | 0.1761 | 0.1768 | 0.40 | 0.0272 | 0.0299 | 9.93 | 0.0737 | 0.0774 | 5.02 |
| TC | 0.0502 | 0.0503 | 0.20 | 0.0496 | 0.0953 | 92.14 | 0.0369 | 0.0404 | 9.49 |
| LDL | 0.0596 | 0.0600 | 0.67 | 0.0365 | 0.0779 | 113.42 | 0.0280 | 0.0306 | 9.29 |
| HDL | 0.0293 | 0.0296 | 1.02 | 0.0591 | 0.0627 | 6.09 | 0.2767 | 0.2834 | 2.42 |
| FPG | 0.0474 | 0.0472 | −0.42 | 0.0694 | 0.0835 | 20.32 | 0.0686 | 0.1027 | 49.71 |
| T2D | 0.1050 | 0.1050 | 0.00 | 0.0620 | 0.0660 | 6.45 | 0.0550 | 0.0560 | 1.82 |

GRSA Model: trait = α + age + sex + BMI + PC1 + PC2 + PC3 + GRSA (BMI excluded in covariates when it is the trait under study).

SNP, single nucleotide polymorphism; AF, sub-Saharan Africans; AA, African Americans; EUR, European Americans; GRSA, genetic risk score based on all SNPs; adj-R2, adjusted R-squared; BMI, body mass index; WC, waist circumference; HC, hip circumference; WHR, waist-to-hip ratio; SBP, systolic blood pressure; DBP, diastolic blood pressure; TG, triglycerides; TC, total cholesterol; LDL, low-density lipoprotein; HDL, high-density lipoprotein; FPG, fasting plasma glucose; T2D, type 2 diabetes.

Further, we compared the proportion of phenotypic variance due to all SNPs with MAF >0.01 (hsnp2) to assess the potential role of non-additive genetic effects in determining differences in the predictive accuracy of GRS between AF and AA. We found differences in hsnp2 between AF and AA, with a greater hsnp2 observed for 8/12 traits among AF. Yet, as indicated above, GRS prediction accuracy was generally higher among AA. The explanation for this is not clear, but is potentially due to differences in the effect of non-additive genetic factors on these traits between the two populations (Supplementary Table S4, available as Supplementary data at IJE online).

Discussion

Using a dataset of about 24 000 individuals, we demonstrate that the predictive utility of GRS varied substantially among 12 cardiometabolic traits and among populations with differing proportions of African ancestry and in comparison with European ancestry populations. Trait-GRS association was strongest for lipids in all three groups but was only strong for the other traits in EUR. Additionally, the predictive utility of GRS was often strongest in EUR and poorest among AF. Between AF and AA, differences in GRS performance were less pronounced, but GRS prediction accuracy tended to be higher among AA, perhaps reflecting European admixture in AA. To our knowledge, this is the first study of GRS for complex traits in sub-Saharan Africans and the first comparison of GRS predictive utility between sub-Saharan Africans and African Americans. These findings have important implications for the potential benefits to be derived from the application of GRS in routine clinical risk prediction across populations of different ancestries.

Our results among EUR (ethnically matched with the population from which the summary data are derived) are broadly consistent with results reported for Europeans in the UKBB. For example, the additional variance explained by the GRS constructed for BMI (R2 = 0.0083) among EUR in the current study is within the confidence limits of what has been reported in the UKBB for White British individuals. In the UKBB, a GRS based on SNPs with P < 5e-8 in a discovery GWAS explained <1% of BMI variance [R2 = 0.0093, 95% confidence interval (CI) 0.0036–0.0142] in a withheld validation dataset.9 This consistency of results notwithstanding, the minimal added variance explained by GRS in both the current and previous studies for some traits limits the applicability of GRS in the prediction of those traits. We note that others have reported differing estimates of prediction accuracy with respect to BMI among European ancestry individuals.33 These differences in accuracy of GRS between populations of broadly the same ancestry may be explained by interaction of genetic risk factors with demographic factors including age and sex.9

The variation in the predictive performance of GRS among traits likely reflects differential heritability—a measure of the relative influence of genetic and environmental factors on a trait. The predictive power of GRS has been shown to correlate with heritability and greater heritability has been reported for lipids compared with obesity/anthropometric, blood pressure and glycaemic traits.34,35 This is consistent with the observations from the present study in which lipid traits stood out in terms of association with GRS. However, we note that among EUR, the predictive utility of GRS was higher for BMI than some lipid components, suggesting that differences in heritability among traits may not be consistent across populations, due to varying gene-environmental interactions. Further, differences in GWAS sample sizes as well as differences in the proportion of non-European participants in trait-specific GWAS may explain some of the variation in GRS performance observed across traits.

The predictive utility of GRS among AA was better than in AF but worse than in EUR in the present study. Reduced prediction accuracy in AA relative to EUR is consistent with previous reports of lower predictive utility of similarly constructed GRS in admixed individuals compared with Europeans.5,7,9,30 The observed pattern of predictive performance of GRS is consistent with the disproportionately large number of individuals of European ancestry in current genome-wide discovery studies and with the degree of genetic divergence of AF and AA from EUR. EUR contribute nearly four-fifths of individuals included in current GWAS, and AF is more genetically distant from EUR than admixed AA, who have about 20% European ancestry.36 In addition, under-representation of diverse global populations in available genomic resources (including genotyping arrays and imputation panels) means that these resources do not adequately capture global genetic diversity, owing to differences in MAF and LD patterns among populations.9,30,37,38 Simulation studies suggest ∼70% loss in relative accuracy of polygenic scores due to differences in MAF and LD between GWAS discovery and GRS test populations.39 When population differences in variant effect sizes are factored in, an expected consequence is poorer prediction accuracy of GRS in under-represented populations. These considerations highlight the need for genomic resources, methods and tools that take into account global genetic diversity. Indeed, there is increasing evidence demonstrating improved GRS predictive accuracy when GRS are constructed from ancestry-matched variants and GWAS summary statistics.9,40,41

Further, inflation in the association between GRS and the trait tested due to sample overlap with the discovery GWAS could potentially explain better performance of GRS among EUR. Although selecting SNPs from the GWAS Catalog makes sample overlap likely, using weights from a non-overlapping GWAS, such as the UKBB, limits the potential effects of such overlap. Other factors that are important in disparities of GRS predictive utility include differences in polygenic adaptation due to natural selection, historical population size, residual uncorrected population structure and aetiological differences between populations.31,32,42,43 Other possible factors include differences in genetic architecture due to gene-environment or gene-gene interactions in admixed populations or monomorphism of the causal variant in an ancestral population.44,45 In this regard, it is important to note that AF differ from AA not just in genetic variation but also in environmental factors that influence cardiometabolic phenotypes, including dietary, behavioural, socioeconomic and other lifestyle factors.46

The reason for the intriguing lack of predictive utility of GRS for TG among AF is unclear, but it parallels the lower TG levels observed in African ancestry individuals compared with non-African-ancestry individuals.47 A significant role for a genetic influence characterized by ancestry-specific loci has been suggested because of the consistency of lower TG levels across African-ancestry populations, despite divergent environmental contexts and the persistence of lower TG among AA compared with EUR in spite of similar environments.36,48 Therefore, poor predictive utility of GRS for TG among AF may be a reflection of non-transferability of current GWAS loci to AF, possibly due to differences in sample size, effect size, allele frequency and gene-environment interactions. For HDL, the role of a genetic influence is less clear because of inconsistent differences in HDL levels among populations of different ancestry. Whereas AA tend to have higher HDL levels compared with EUR, AF from West Africa have been shown to have lower levels of HDL, suggesting an important role for environmental factors.36 Further research is needed to clarify the potential role of underlying genetic differences as the force behind HDL variation among populations of different ancestry and its impact on the predictive utility of GRS in the context of environmental differences.

Despite concerns about the impact on health disparities, our findings are indicative of a promising role for GRS in predicting the risk of hypercholesterolaemia across populations of different ancestral backgrounds. A potential application of GRS in this context could be assessing additive risk of elevated LDL beyond the causative monogenic mutations of familial hypercholesterolaemia (FH). As high GRS has been shown to be associated with severity of the FH phenotype, carriers of monogenic FH-mutations with extreme GRS could be prioritized for early intervention including treatment with statins, and knowledge of concomitant high GRS could encourage adherence to treatment among FH patients.49,50

Important strengths of this study are the large sample size, use of independent datasets for discovery and assessment of predictive utility in different populations. Additionally, SNPs were identified from the NHGRI-EBI GWAS Catalog (a curated comprehensive public repository of published GWAS) and highly precise summary statistics used for weighting were obtained from the UK Biobank, which has genotype and extensive phenotypic data on ∼500 000 individuals.22,25 However, our findings should be interpreted in the context of the limitations of the study. First, for constructing GRS, we only included SNPs that reached the GWAS Catalog significance criterion of P < 1 × 10⁻⁵. There are SNPs not yet identified with the current sample sizes but which may be associated with the traits studied. Second, we did not account for gene-gene and gene-environment interactions, which may limit the predictive utility of GRS. Third, the high level of genetic heterogeneity observed in Africans calls for the inclusion of samples from other African populations beyond those included in the current study, in order to better represent the genetic diversity on the continent. Fourth, observed differences in GRS prediction accuracy between AF and AA may partly be explained by differences in sample size, especially in light of the similar correlations between the GRS and quantitative traits between the two groups, as well as similarities in variance explained by the GRS. Finally, the predictive utility of GRS observed in this study might be understated if the causal variants of the traits studied are poorly tagged by SNPs used to construct GRS. This is particularly relevant because LD is weaker in African-ancestry individuals compared with European-ancestry individuals in whom most of the current genetic variants were discovered.

This first evaluation of GRS in sub-Saharan Africans demonstrates that the predictive performance of GRS for cardiometabolic traits is markedly poor among sub-Saharan Africans and currently provides little or no benefit over traditional risk factors. We also confirm that GRS prediction accuracy is lower among African Americans compared with European Americans. Therefore, unlike in EUR populations, GRS for cardiometabolic disorders remain suboptimal for clinical translation in sub-Saharan Africans as well as in African Americans. These findings add to the growing understanding of the strengths and limitations of the applications of GRS in routine clinical and/or public health settings and highlight the need to increase the inclusion of under-represented populations in genomic discovery to promote equity in translation of such discovery.

Funding

This research was supported by the Intramural Research Program of the Center for Research on Genomics and Global Health (CRGGH). The CRGGH is supported by the National Human Genome Research Institute, the National Institute of Diabetes and Digestive and Kidney Diseases and the Office of the Director at the National Institutes of Health (1ZIAHG200362). The following studies were funded by the listed NIH grants. AADM was supported by NIH grant 3T37TW00041-03S2. HUFS was supported by NIH grants S06GM008016-320107 (C.R.), S06GM008016-380111 (A.A.) and 2M01RR010284. This research was also supported in part by the NIH Intramural Research Program in the Center for Research on Genomics and Global Health (1ZIAHG200362). ARIC was supported by NIH grants N01-HC-55015, N01-HC-55018, N01-HC-55016, N01-HC-55021, N01-HC-55019, N01-HC-55020 and N01-HC-55022. CFS was supported by NIH grants R01-HL-46380 and M01-RR-00080. JHS was supported by NIH grants N01-HC-95170, N01-HC-95171, and N01-HC-95172. MESA was supported by NIH grants N01-HC-95159, N01-HC-95160, N01-HC-95161, N01-HC-95162, N01-HC-95168, N01-HC-95163, N01-HC-95164, N01-HC-95165, N01-HC-95166, N01-HC-95167, N01-HC-95169 and R01-HL-071205. The funders had no role in study design, data collection, data analysis, interpretation, or writing of the paper.

Data availability

Data generated in this study are available upon reasonable request from the corresponding author.

Acknowledgements

This work used the computational resources of the NIH HPC Biowulf cluster [https://hpc.nih.gov]. The contents of this publication are solely the responsibility of the authors and do not necessarily represent the official view of the National Institutes of Health (NIH).

Author contributions

C.R. and A.A. conceptualized the study; K.E., G.C., J.Z. and D.S. performed data management and statistical analysis; K.E. and A.A. drafted the paper; C.R., A.A., K.E., A.D., A.B., G.C. and D.S. edited the paper. All contributors reviewed and approved the manuscript.

Conflict of interest

None declared.

References

1. Khera AV, Chaffin M, Wade KH et al. Polygenic prediction of weight and obesity trajectories from birth to adulthood. Cell 2019;177:587–96.e9.

2. Khera AV, Chaffin M, Aragam KG et al. Genome-wide polygenic scores for common diseases identify individuals with risk equivalent to monogenic mutations. Nat Genet 2018;50:1219–24.

3. Natarajan P, Young R, Stitziel NO et al. Polygenic risk score identifies subgroup with higher burden of atherosclerosis and greater relative benefit from statin therapy in the primary prevention setting. Circulation 2017;135:2091–101.

4. Paquette M, Chong M, Thériault S, Dufour R, Paré G, Baass A. Polygenic risk score predicts prevalence of cardiovascular disease in patients with familial hypercholesterolemia. J Clin Lipidol 2017;11:725–32.e5.

5. Belsky DW, Moffitt TE, Sugden K et al. Development and evaluation of a genetic risk score for obesity. Biodemography Soc Biol 2013;59:85–100.

6. Ware EB, Schmitz LL, Faul J et al. Heterogeneity in polygenic scores for common human traits. bioRxiv 2017; doi: . 5 February, preprint: not peer reviewed.

7. Curtis D. Polygenic risk score for schizophrenia is more strongly associated with ancestry than with schizophrenia. Psychiatr Genet 2018;28:85–89.

8. Duncan L, Shen H, Gelaye B et al. Analysis of polygenic score usage and performance across diverse human populations. bioRxiv. doi: . 3 November 2018. Preprint: not peer reviewed.

9. Martin AR, Kanai M, Kamatani Y, Okada Y, Neale BM, Daly MJ. Clinical use of current polygenic risk scores may exacerbate health disparities. Nat Genet 2019;51:584–91.

10. Rotimi CN, Dunston GM, Berg K et al. In search of susceptibility genes for type 2 diabetes in West Africa: the design and results of the first phase of the AADM study. Ann Epidemiol 2001;11:51–58.

11. Rotimi CN, Chen G, Adeyemo AA et al. A genome-wide search for type 2 diabetes susceptibility genes in West Africans: the Africa America Diabetes Mellitus (AADM) Study. Diabetes 2004;53:838–41.

12. Adeyemo AA, Zaghloul NA, Chen G et al.; South Africa Zulu Type 2 Diabetes Case-Control Study. ZRANB3 is an African-specific type 2 diabetes locus associated with beta-cell mass and insulin response. Nat Commun 2019;10:3195.

13. Adeyemo A, Gerry N, Chen G et al. A genome-wide association study of hypertension and blood pressure in African Americans. PLoS Genet 2009;5:e1000564.

14. Dean DA 2nd, Goldberger AL, Mueller R et al. Scaling up scientific discovery in sleep medicine: the national sleep research resource. Sleep 2016;39:1151–64.

15. Sempos CT, Bild DE, Manolio TA. Overview of the Jackson Heart Study: a study of cardiovascular diseases in African American men and women. Am J Med Sci 1999;317:142–46.

16. Bild DE, Bluemke DA, Burke GL et al. Multi-ethnic study of atherosclerosis: objectives and design. Am J Epidemiol 2002;156:871–81.

17. The Atherosclerosis Risk in Communities (ARIC) Study: design and objectives. The ARIC investigators. Am J Epidemiol 1989;129:687–702.

18. Browning LM, Hsieh SD, Ashwell M. A systematic review of waist-to-height ratio as a screening tool for the prediction of cardiovascular disease and diabetes: 0.5 could be a suitable global boundary value. Nutr Res Rev 2010;23:247–69.

19. Third Report of the National Cholesterol Education Program (NCEP) expert panel on detection, evaluation, and treatment of high blood cholesterol in adults (Adult Treatment Panel III) final report. Circulation 2002;106:3143–421.

20. Whelton PK, Carey RM, Aronow WS et al. 2017 ACC/AHA/AAPA/ABC/ACPM/AGS/APhA/ASH/ASPC/NMA/PCNA guideline for the prevention, detection, evaluation, and management of high blood pressure in adults: a report of the American College of Cardiology/American Heart Association task force on clinical practice guidelines. Circulation 2018;138:e484–e594.

21. Alberti KG, Eckel RH, Grundy SM et al. Harmonizing the metabolic syndrome: a joint interim statement of the International Diabetes Federation Task Force on Epidemiology and Prevention; National Heart, Lung, and Blood Institute; American Heart Association; World Heart Federation; International Atherosclerosis Society; and International Association for the Study of Obesity. Circulation 2009;120:1640–45.

22

MacArthur J, Bowler E, Cerezo M et al. The new NHGRI-EBI Catalog of published genome-wide association studies (GWAS Catalog). Nucleic Acids Res 2017;45:D896–D901.

23

Purcell S, Neale B, Todd-Brown K et al. PLINK: a tool set for whole-genome association and population-based linkage analyses. Am J Hum Genet 2007;81:559–75.

24

Chang CC, Chow CC, Tellier LC, Vattikuti S, Purcell SM, Lee JJ. Second-generation PLINK: rising to the challenge of larger and richer datasets. Gigasci 2015;4:7.

25

Sudlow C, Gallacher J, Allen N et al. UK biobank: an open access resource for identifying the causes of a wide range of complex diseases of middle and old age. PLoS Med 2015;12:e1001779.

26

Spracklen CN, Chen P, Kim YJ et al. Association analyses of East Asian individuals and trans-ancestry analyses with European individuals reveal new loci associated with cholesterol and triglyceride levels. Hum Mol Genet 2017;26:1770–84.

27

Manning AK, Hivert MF, Scott RA et al.; DIAbetes Genetics Replication And Meta-analysis (DIAGRAM) Consortium. A genome-wide approach accounting for body mass index identifies genetic variants influencing fasting glycemic traits and insulin resistance. Nat Genet 2012;44:659–69.

28

Efron B. Regression and ANOVA with zero-one data - measures of residual variation. J Am Stat Assoc 1978;73:113–21.

29

Yang J, Lee SH, Goddard ME, Visscher PM. GCTA: a tool for genome-wide complex trait analysis. Am J Hum Genet 2011;88:76–82.

30

Martin AR, Gignoux CR, Walters RK et al. Human demographic history impacts genetic risk prediction across diverse populations. Am J Hum Genet 2017;100:635–49.

31

Sohail M, Maier RM, Ganna A et al. Polygenic adaptation on height is overestimated due to uncorrected stratification in genome-wide association studies. eLife 2019;8:e39702.

32

Berg JJ, Harpak A, Sinnott-Armstrong N et al. Reduced signal for polygenic adaptation of height in UK Biobank. eLife 2019;8:e39725.

33

Locke AE, Kahali B, Berndt SI et al.; LifeLines Cohort Study. Genetic studies of body mass index yield new insights for obesity biology. Nature 2015;518:197–206.

34

Dudbridge F. Power and predictive accuracy of polygenic risk scores. PLoS Genet 2013;9:e1003348.

35

Elder SJ, Lichtenstein AH, Pittas AG et al. Genetic and environmental influences on factors associated with cardiovascular disease and the metabolic syndrome. J Lipid Res 2009;50:1917–26.

36

Bentley AR, Rotimi CN. Interethnic differences in serum lipids and implications for cardiometabolic disease risk in African ancestry populations. Glob Heart 2017;12:141–50.

37

Auton A, Brooks LD, Durbin RM et al.; 1000 Genomes Project Consortium. A global reference for human genetic variation. Nature 2015;526:68–74.

38

Sham PC, Cherny SS, Purcell S, Hewitt JK. Power of linkage versus association analysis of quantitative traits, by use of variance-components models, for sibship data. Am J Hum Genet 2000;66:1616–30.

39

Wang Y, Guo J, Ni G, Yang J, Visscher PM, Yengo L. Theoretical and empirical quantification of the accuracy of polygenic scores in ancestry divergent populations. bioRxiv 2020. 15 January 2020, preprint: not peer reviewed.

40

Chikowore T, van Zyl T, Feskens EJ, Conradie KR. Predictive utility of a genetic risk score of common variants associated with type 2 diabetes in a black South African population. Diabetes Res Clin Pract 2016;122:1–8.

41

Chen J, Sun M, Adeyemo A et al. Genome-wide association study of type 2 diabetes in Africa. Diabetologia 2019;62:1204–11.

42

Novembre J, Barton NH. Tread lightly interpreting polygenic tests of selection. Genetics 2018;208:1351–55.

43

Henn BM, Botigué LR, Bustamante CD, Clark AG, Gravel S. Estimating the mutation load in human genomes. Nat Rev Genet 2015;16:333–43.

44

Jain D, Hodonsky CJ, Schick UM et al. Genome-wide association of white blood cell counts in Hispanic/Latino Americans: The Hispanic community health study/study of Latinos. Hum Mol Genet 2017;26:1193–204.

45

Musunuru K, Romaine SP, Lettre G et al. Multi-ethnic analysis of lipid-associated loci: the NHLBI CARe project. PLoS One 2012;7:e36473.

46

Mostafavi H, Harpak A, Conley D, Pritchard JK, Przeworski M. Variable prediction accuracy of polygenic scores within an ancestry group. bioRxiv 2019. doi: https://doi.org/10.1101/629949. 7 May 2019, preprint: not peer reviewed.

47

Chang M-h N, Renée M, Hong Y et al. Racial/ethnic variation in the association of lipid-related genetic variants with blood lipids in the US adult population. Circulation 2011;4:523–33.

48

Adeyemo A, Bentley AR, Meilleur KG et al. Transferability and fine mapping of genome-wide associated loci for lipids in African Americans. BMC Med Genet 2012;13:88.

49

Ghaleb Y, Elbitar S, El Khoury P et al. Usefulness of the genetic risk score to identify phenocopies in families with familial hypercholesterolemia. Eur J Hum Genet 2018;26:570–78.

50

Sarraju A, Knowles JW. Genetic testing and risk scores: impact on familial hypercholesterolemia. Front Cardiovasc Med 2019;6:5–6.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Supplementary data