Abstract

The Transcriptome-Wide Association Study (TWAS) is a widely used approach which integrates gene expression and Genome Wide Association Study (GWAS) data to study the role of cis-regulated gene expression (GEx) in complex traits. However, the genetic architecture of GEx varies across populations, and recent findings point to possible ancestral heterogeneity in the effects of GEx on complex traits, which may be amplified in TWAS by modeling GEx as a function of cis-eQTLs. Here, we present a novel extension to TWAS to account for heterogeneity in the effects of cis-regulated GEx which are correlated with ancestry. Our proposed Multi-Ancestry TwaS (MATS) framework jointly analyzes samples from multiple populations and distinguishes between shared, ancestry-specific and/or subject-specific expression-trait associations. As such, MATS amplifies power to detect shared GEx associations over ancestry-stratified TWAS through increased sample sizes, and facilitates the detection of genes with subgroup-specific associations which may be masked by standard TWAS. Our simulations highlight the improved Type-I error conservation and power of MATS compared with competing approaches. Our real data applications to Alzheimer’s disease (AD) case–control genotypes from the Alzheimer’s Disease Sequencing Project (ADSP) and continuous phenotypes from the UK Biobank (UKBB) identify a number of unique gene-trait associations which were not discovered through standard and/or ancestry-stratified TWAS. Ultimately, these findings promote MATS as a powerful method for detecting and estimating significant gene expression effects on complex traits within multi-ancestry cohorts and corroborates the mounting evidence for inter-population heterogeneity in gene–trait associations.

Introduction

The Transcriptome-Wide Association Study (TWAS) is a common approach to identify genes associated with complex traits and diseases. TWAS integrates expression and GWAS data to study the genetically regulated component of gene expression (also referred to as imputed gene expression), computed by taking a weighted sum of cis-eQTL genotypes, with weights derived from each Single Nucleotide Polymorphism's (SNP) estimated effect on gene expression (GEx). These weights can be obtained from GEx prediction models trained on independent data from a target case–control study, making TWAS applicable to large-scale genetic studies which do not have expression data. Unsurprisingly, imputed GEx is closely tied to (i) the underlying genetic architecture of gene expression (i.e. causal eQTLs and their effect sizes) and (ii) the accuracy and appropriateness of the prediction model for obtaining eQTL weights. Both of these factors can be heavily influenced by ancestry.

Research suggests that at least 25% of gene’s expression levels are significantly correlated with self-reported or local genetic ancestry, findings which are complemented by subsequent studies that identify variation in the underlying genetic architectures of GEx between populations (1–9). Population variation in cis-eQTLs can arise for a number of reasons, including variation in LD patterns and minor allele frequencies. A 2018 study by Mogil et al. revealed that the level of concordance between population eQTL effect sizes is proportional to their degree of shared ancestry, and that gene expression prediction and subsequent TWAS tests are optimized when the ancestry and MAF patterns of the prediction training sample and the target GWAS are similar (9).

The downstream effects of population genetic variation in GEx on complex traits are not fully understood to date, but it is increasingly clear that the impacts may be widespread throughout the genome and phenome (10,11). Bisogno et al. (12) showed that ancestral variation in gene expression can explain interpersonal differences in cell reprogramming, cell fate transitions and biological processes. Recent work by Blue et al. (13) revealed ancestry-linked heterogeneity in the effects of the protective (e2) and risk (e4) alleles for the APOE gene on Alzheimer’s disease (AD), suggesting that the effects of well-established loci involved in AD may be modified by ancestry.

Heterogeneity in GEx–trait interactions has also been directly observed through TWAS discoveries. Notably, TWAS may amplify inter-population heterogeneity by directly modeling GEx as a function of cis-eQTLs, which vary by ancestry. A race-stratified TWAS in breast cancer revealed heterogeneity in expression effects between African American and White women, including some ancestry-specific GEx associations which were not uncovered by unstratified standard TWAS (14). However, power may be lost in stratified analyses due to decreased sample sizes of each ancestry, particularly for genes with shared effects across populations. Therefore, while it is prudent to account for ancestry-specific GEx effects in TWAS, stratified TWAS may have limited power to detect homogeneous GEx–trait associations. The recent METRO model proposed by Li et al. incorporates the ancestry-specific expression prediction models to improve gene expression imputation (via a data determined weighted average of these prediction models). METRO estimates a single gene-trait effect for a GWAS dataset of a single ancestry and, therefore cannot estimate ancestry-specific gene-trait effects (15). As highlighted in our simulations, when applied to multi-ancestry GWAS data, METRO has low power to detect any gene expression effect in the presence of strong but oppositely signed effects between ancestries. Motivated by (16), we also explicitly account for individual-level heterogeneity of gene-trait associations, which is ignored in METRO.

In this study, we present the Multi-Ancestry TwaS (MATS), a novel approach to account for heterogeneity in the effects of imputed gene expression which are correlated with ancestry. By jointly analyzing samples from multiple populations, MATS is expected to improve power for detecting genes with shared expression–trait associations across populations through increased sample sizes, as compared with existing stratified TWAS approaches. Our proposed model of ancestry correlated GEx effects builds off of two existing and closely related approaches for modeling population substructure in genetic association testing, the generalized linear mixed model (GLMM) and principal component regression (PCR) (17). Specifically, we model population structure using a genetic relatedness matrix and its corresponding axes of genetic variation (i.e., Principal Components (PCs)), respectively, for GLMM- and PCR-based approaches (18–20). Using these models, we present a novel TWAS analysis pipeline that distinguishes between expression associations which are (i) shared across populations, (ii) population-specific and/or (iii) subject-specific. This pipeline relies on a sequence of tests to identify population-level and intra-population heterogeneity. We show the equivalence of these tests under both GLMM- and PCR-based modeling frameworks. Our test of intra-population heterogeneity is based on the empirical Bayes test for high-dimensional data of Goeman et al. (21) and has direct parallels to Kernel Machine Regression and related applications in genetic association studies (e.g. SKAT (22), C-alpha (23), SSU (24), MR-MEGA (16)).

In our primary analysis, we apply MATS to 2757 African American and 6710 Caucasian samples from the Alzheimer’s Disease Sequencing Project (ADSP), using ancestry-matched prediction models trained separately on African and European samples from the Multi-Ethnic Study of Atherosclerosis (MESA) study. We discover a number of genes with significant expression effects in AD that were undetected by standard TWAS, including SNX7, UGDH, HYAL3, CD68 and SUPT4H1. In a supplemental application, we apply MATS to a cohort of finescale population substructure, namely 37432 samples from the UK Biobank (UKBB) with neuroimaging phenotypes, which span four genetically determined subpopulations (African, Asian, European and Middle Eastern ancestry). Specifically, we study expression associations in 20 AD-associated imaging-derived phenotypes (IDPs) and 4 hematological traits. In the absence of ancestry-specific expression prediction models for each of the four UKBB subpopulations, we impute gene expression using the European expression predictions model in all samples. Nevertheless, we discover a handful of genes with suggestive population- and individual-level heterogeneity, highlighting the flexibility and power of MATS even when ancestry-specific expression models are unavailable.

These findings highlight the utility of our MATS framework for identifying expression-trait associations in multi-ancestry cohorts, even when the expression effects differ by subpopulation. Furthermore, our discovery of ancestrally heterogeneous expression–trait associations discourages against generalizing TWAS findings from predominantly European genetic studies to other populations and highlights the importance of expanding non-European cohorts in richly phenotyped genomic studies.

Results

Simulations

Here, we highlight key takeaways from our simulation study, which are outlined in greater detail in the Supplemental Material. Our first set of simulations assess the performance of MATS compared with METRO and standard TWAS in the absence of individual-level and population-level heterogeneity. As shown in Figure 1, the Type I errors for MATS are well controlled for the test of overall expression and population-level heterogeneity and are conservative for the test of individual-level heterogeneity. As is expected in the setting without heterogeneity, the TWAS Type-I errors are also well controlled, while METRO yields a conservative Type-I error rate.

Simulation Rejection Rates with varied total expression effect, in the absence of population level and individual level heterogeneity.
Figure 1

Simulation Rejection Rates with varied total expression effect, in the absence of population level and individual level heterogeneity.

As the shared expression effect increases (α1), we find that TWAS and MATS have greater power than METRO across all settings for α1 (Fig. 1C). Unsurprisingly, TWAS is slightly more powerful than MATS when there is no heterogeneity, owing to fewer degrees of freedom and smaller standard errors (Supplementary Material, Fig. S1). However, when we introduce population-level heterogeneity (Fig. 2C; α2 = 0.003, α3 = −0.003), TWAS and METRO are unable to detect expression–trait associations when present within only two populations. This finding is expected since TWAS tests for a non-zero global mean expression effect, which is close to zero when α2 = −α3 and the shared expression effect (α1) is small. In contrast, the MATS test of expression effects and population-level heterogeneity retains high power and unbiasedness across all magnitudes of α1, even when only two populations have true non-zero expression effects (Fig. 2, Supplementary Material, Fig. S2). In fact, the MATS test of population-level heterogeneity and overall expression effects remains powerful across most reasonable magnitudes of α2 and α3 (Fig. 3A and B, α1 = 0, vary α2 = −α3), even when only one population has a substantial expression effect (Fig. 3C and D, α1 = 0, α3 = −0.001, vary α2). The same findings hold when one population’s sample size is significantly larger than the rest; we leave these simulation figures to the Supplemental Material.

Simulation Rejection Rates with varied total expression effect, population level heterogeneity and no individual-level heterogeneity.
Figure 2

Simulation Rejection Rates with varied total expression effect, population level heterogeneity and no individual-level heterogeneity.

Simulation Rejection Rates with varied population-level heterogeneity and no individual-level heterogeneity.
Figure 3

Simulation Rejection Rates with varied population-level heterogeneity and no individual-level heterogeneity.

Figure 4 illustrates the rejection rates when perturbing individual-level heterogeneity. Specifically, we vary κ2 when α1 = α2 = α3 = 0. Notably, while the power for the test of individual-level heterogeneity increases as κ2 increases, the rejection rates for population-level heterogeneity (H0,4) and overall expression (H0,1) also increase. While the true values for the main effects are zero (α1 = α2 = α3 = 0), the rejection rates for these tests are not strictly Type-I error rates, given the assumption under both hypotheses that κ2 = 0. Intuitively, the increased rejection rates for both H0,1 and H0,4 with growing κ2 can be explained by spurious expression and inter-population signal introduced when the GEx–trait association is highly variable between subjects.

Simulation Rejection Rates with individual-level heterogeneity.
Figure 4

Simulation Rejection Rates with individual-level heterogeneity.

We further study the effects of extreme misspecification of K when α1 = α2 = α3 = 0, specifically setting K = I, an n × n identity matrix (i.e. misspecifying that all subjects are entirely unrelated). As illustrated in Supplementary Material, Figure S4, we observe mild inflation in Type I errors for the test of population-level heterogeneity for high levels of individual-level heterogeneity, but the Type-I errors remain well conserved across moderate levels of inter-subject heterogeneity. Power for the test of individual-level heterogeneity remained high, in spite of the misspecification.

Importantly, the simulations illustrated in Figures 2 and 3 reflect settings where the non-negative weight constraint in METRO is violated, thus influencing its highly conservative performance. Nevertheless, we feel that the settings in which one or more populations may have opposite signed expression effects can be reasonably expected in real data, which we confirm in our real data applications and highlighted in a number of motivating examples throughout Introduction section. Furthermore, the simulations illustrated in Figures 1 and 4 reflect settings in which the METRO conditions are satisfied, since all population-specific effects are in the same direction.

To parse apart whether the association signal is coming from population-level vs. individual-level effects, we recommend fitting the PC-interaction model outlined in the Methods to follow when H0,3 is rejected to adjust for individual-level heterogeneity by incorporating top PC-GEx interactions. We expect that the Type I error rates for H*0,4: α2 = α3 = 0 and H*0,1: α1 = α2 = α3 = 0 under this full model will be better conserved when the leading PCs explain a large proportion of the variability. However, if a substantial proportion of variability is explained by the remaining PCs, we expect that there may still be some Type I error inflation. In Supplementary Material, Figure S3, we illustrate the rejection rates for LRTs of H*0,1 and H*0,4 under the full PC-interaction model using the top 10 PCs, compared with standard TWAS. As expected, the Type-I errors for population-level heterogeneity under the full model are improved over the model that does not include PC-interaction effects. However, we still observe notable Type-I error inflation as κ2 increases, likely owing to missing signal when including only the top 10 PC-interactions, which explain 45% of the genetic variation in the simulated data.

Application to the Alzheimer’s disease sequencing project

Out of 1358 total prediction models from MESA, 1157 (African American (AFA)) and 1311 (Caucasian (CAU)) genes included at least one variant which was not in the target ADSP Whole Genome Sequencing (WGS) data (Fig. 5A). We applied our proposed method for integrating prediction and target model data using matched ancestry samples from 1000 genomes and successfully recovered complete information for all SNPs in 1160/1358 ≈ 85% and 1012/1358 ≈ 75% genes for the AFA and CAU prediction models, respectively. Figure 5B illustrates the number of recovered SNPs across all genes, highlighting the success of our proposed approach for salvaging information on missing prediction model variants.

Integrating GEx prediction and target data.
Figure 5

Integrating GEx prediction and target data.

Figure 6A gives the PC plots for the top 6 PCs based on the realized genetic relatedness matrix of all ADSP samples, revealing some clustering between self-reported ancestries based on the top two PCs. The proportion of variance explained by the top 3 and 10 PCs is 10.4 and 16.4%, respectively. Figure 6B illustrates PC plots computed on a 9467 (n) × 1358 matrix of imputed gene expression, where each column represents an imputed gene. These plots show notable separation between the AFA and CAU samples based on imputed expression, which can be largely explained by the different prediction models used to imputed GEx expression in AFA and CAU populations, differences of which are driven by variability in the genetic architectures of GEx between populations, impacting downstream interactions between expression and the complex trait.

PC plots computed on A) genotypes and B) imputed IDPs in the ADSP cohort.
Figure 6

PC plots computed on A) genotypes and B) imputed IDPs in the ADSP cohort.

MATS identified 20 genes with significant expression effects on AD at a Bonferonni-corrected threshold of P ≤ 0.05/1358, and 363 genes with false discovery rate q-values ≤ 0.01. Notably, none of the 20 genes reaching the Bonferroni-corrected significance level were identified by standard TWAS, and 17 were also significant for the MATS test of individual-level heterogeneity. The remaining three genes included UGDH (pMATS = 5.643 * 10−6, qMATS = 0.00016), SNX7 (pMATS = 1.618 * 10−6, qMATS = 0.00016) and CAMK1D (pMATS = 3.017 * 10−5, qMATS = 0.00021). As to be discussed, both SNX7 and UGDH yielded a significant test for population-level heterogeneity at an unadjusted significance level of 0.05, suggesting that their discovery was likely driven by population-level effect differences. For CAMK1D, the MATS tests of population-level and individual-level heterogeneity yielded p-values of ppop.het = 0.1048 and pindiv.het = 7.758e−05, respectively, with log odds ratio (OR) estimates for the AFA and CAU populations of 0.0837 and −0.186, respectively. These findings suggest that the significant overall association test was driven by the strong (but not independently significant) signal from the inter- and intra-population expression effects and may warrant further investigation of CAMK1D as a putative gene involved in AD.

The Venn diagram in Figure 7 illustrates the number of genes identified by standard TWAS, CAU and AFA specific TWAS, and the MATS test of overall expression at an unadjusted P-value threshold of 0.05. Notably, we are limited to the comparison of genes identified at an unadjusted significance level given that no genes were detected by standard or stratified TWAS with p≤ 0.05/1358 or q ≤ 0.1, aside from SNX7 under the CAU-specific TWAS model (pCAU-TWAS = 3.129*10−7, qCAU-TWAS = 0.00035). MATS identified 344 genes which were not identified under the stratified or standard TWAS models (p < 0.05), 342 of which had suggestive evidence of individual-level heterogeneity and 8 of which had suggestive evidence of population-level heterogeneity. Of these eight genes, five remained significant for population-level effect differences (p < 0.05) after controlling for intra-population heterogeneity using three leading PC-interactions, namely PQLC3, ALS2, CHURC1, C17orf75 and DNTTIP1. Furthermore, 170 of the 344 genes identified uniquely by MATS had same-signed effect estimates, highlighting the benefit of MATS over standard TWAS to identify genes with expression-AD signal, even when the overall mean effect is not attenuated by opposite signed population-level effects. Many of the genes exclusively detected by MATS have existing links to AD, including (but not limited to) WDFY1, CCT5, CDKN2B (25), SIGLEC7 (26), and SLC22A15 and SLC22A16 (27).

Comparison of the number of significant genes identified by competing methods for ADSP.
Figure 7

Comparison of the number of significant genes identified by competing methods for ADSP.

MATS was also able to detect a large proportion of genes identified by a population-stratified TWAS model, specifically 29/59 and 57/86 genes discovered uniquely by the AFA- and CAU-specific TWAS tests (p< 0.05), respectively. Notably, a number of genes were uniquely identified by the AFA-specific (23 genes) and CAU-specific (12 genes) models which were not discovered by MATS. In spite of non-significant tests under the MATS framework (due to its larger degrees of freedom), the population-level effect estimates and 95% confidence intervals for these genes were highly concordant between MATS and stratified-TWAS for most genes (Fig. 8). On the other hand, we also note the power gains of MATS in detecting many more genes missed by the stratified (and standard) TWAS models, and yielding significant genes after correcting for multiple testing.

Estimated expression effects from standard TWAS vs. MATS for ADSP.
Figure 8

Estimated expression effects from standard TWAS vs. MATS for ADSP.

Forty genes were significant at a Bonferonni-corrected significance level (p < 0.05/1358) for individual-level heterogeneity in AD. The five genes which yielded the smallest p-values included ISG15 (pindiv = 6.891 * 10−7), WDFY1 (pindiv = 1.308 * 10−6), IFT88 (pindiv = 1.308 * 10−6), PRELID1 (pindiv = 2.369 * 10−6) and CCT5 (pindiv = 2.402 * 10−6), none of which were significant for the MATS test of population-level heterogeneity standard TWAS, or stratified TWAS. A comparison of the population-level effect sizes for the full (with random effect for individual-level effect) and reduced model (without random effect) is illustrated in Supplementary Material, Figure S10. Interestingly, all of these genes were previously identified as susceptibility genes for AD or related dementias in the literature (ISG15 (28,29), WDFY1 (30), IFT88 (31), PRELID1 (32) and CCT5 (33)). Our findings suggest that the mechanisms underlying these expression-AD associations may be patient- or subgroup-specific, or may be influenced by environmental or batch effects within this sample. Ultimately, these genes can be promoted as candidate targets for personalized diagnosis and treatment which would likely go undetected in standard approaches focused on study or population-level mean effects.

At a Bonferroni-corrected significance level, no genes were identified by MATS for population-specific expression effects on AD, and only SNX7 was identified under the CAU-specific TWAS model. The lack of significant findings can be attributed to the small sample size of the ADSP data for detecting small population-level effect differences. At an unadjusted significance level of 0.05, 440 genes were identified by the MATS expression-association test, 401 of which were also significant (p < 0.05) for the MATS test of individual-level heterogeneity, and 43/440 were identified for population-level heterogeneity. The top 5 genes with the smallest p-value for population-level heterogeneity were SNX7 (ppop = 0.00028), CD68 (ppop = 0.0014), UGDH (ppop = 0.00218), SUPT4H1 (ppop = 0.00237) and HYAL3 (ppop = 0.00247). None of these genes were significant for the MATS test of individual-level heterogeneity at an adjusted p-value threshold, and only UGDH yielded an individual-level heterogeneity test p-value below 0.05, suggesting that these findings are driven by population-level expression effect differences. Effect estimates for these genes without adjusting for PC-interactions are given in Table 1, along with the findings from standard and stratified TWAS. We discover opposite signed log odds ratios between the AFA and CAU populations for all five genes, suggesting that the current biological models describing these gene’s role in AD may not be generalized across populations. For example, the estimated log odds ratios (ORs) for AFA and CAU population-specific SNX7 expression from the MATS model are |${\hat{\alpha}}_1=0.0229$| and |${\hat{\alpha}}_1+{\hat{\alpha}}_2=-0.1861$|⁠, respectively. Overexpression of SNX7 has been linked to the reduction of the peptides whose accumulation is a key driver of AD (34,35), and our findings suggest that such protective mechanisms may not be generalized to individuals of AFA ancestry.

Table 1

Population-specific effects for the top 5 genes with significant population-level effects

GeneMATS population  
heterogeneity p-value
MATS AFA Effect  
log OR/SE/p-value
MATS CAU Effect  
log OR/SE/p-value
Standard TWAS  
log OR/SE/p-value
AFA TWAS Effect  
log OR/SE/p-value
CAU TWAS Effect  
log OR/SE/p-value
SNX70.0002830.023/0.0456/0.614−0.186/0.0352/1.28e−07−0.109/0.0278/9.17e−050.0213/0.0467/0.649−0.179/0.0351/3.13e−07
CD680.0014600.227/0.104/0.0288−0.181/0.0751/0.016−0.0416/0.061/0.4960.224/0.106/0.0355−0.177/0.0747/0.0175
UGDH0.0021800.325/0.138/0.0183−0.119/0.0448/0.00801−0.0763/0.0425/0.07280.328/0.141/0.02−0.114/0.0445/0.0105
SUPT4H10.002380−0.0781/0.082/0.3410.252/0.0711/0.0003980.112/0.0544/0.0392−0.0956/0.0837/0.2530.241/0.0707/0.000642
HYAL30.002480−0.127/0.0542/0.01880.1/0.0521/0.0546−0.0134/0.0363/0.712−0.141/0.0563/0.01240.102/0.0518/0.0498
GeneMATS population  
heterogeneity p-value
MATS AFA Effect  
log OR/SE/p-value
MATS CAU Effect  
log OR/SE/p-value
Standard TWAS  
log OR/SE/p-value
AFA TWAS Effect  
log OR/SE/p-value
CAU TWAS Effect  
log OR/SE/p-value
SNX70.0002830.023/0.0456/0.614−0.186/0.0352/1.28e−07−0.109/0.0278/9.17e−050.0213/0.0467/0.649−0.179/0.0351/3.13e−07
CD680.0014600.227/0.104/0.0288−0.181/0.0751/0.016−0.0416/0.061/0.4960.224/0.106/0.0355−0.177/0.0747/0.0175
UGDH0.0021800.325/0.138/0.0183−0.119/0.0448/0.00801−0.0763/0.0425/0.07280.328/0.141/0.02−0.114/0.0445/0.0105
SUPT4H10.002380−0.0781/0.082/0.3410.252/0.0711/0.0003980.112/0.0544/0.0392−0.0956/0.0837/0.2530.241/0.0707/0.000642
HYAL30.002480−0.127/0.0542/0.01880.1/0.0521/0.0546−0.0134/0.0363/0.712−0.141/0.0563/0.01240.102/0.0518/0.0498
Table 1

Population-specific effects for the top 5 genes with significant population-level effects

GeneMATS population  
heterogeneity p-value
MATS AFA Effect  
log OR/SE/p-value
MATS CAU Effect  
log OR/SE/p-value
Standard TWAS  
log OR/SE/p-value
AFA TWAS Effect  
log OR/SE/p-value
CAU TWAS Effect  
log OR/SE/p-value
SNX70.0002830.023/0.0456/0.614−0.186/0.0352/1.28e−07−0.109/0.0278/9.17e−050.0213/0.0467/0.649−0.179/0.0351/3.13e−07
CD680.0014600.227/0.104/0.0288−0.181/0.0751/0.016−0.0416/0.061/0.4960.224/0.106/0.0355−0.177/0.0747/0.0175
UGDH0.0021800.325/0.138/0.0183−0.119/0.0448/0.00801−0.0763/0.0425/0.07280.328/0.141/0.02−0.114/0.0445/0.0105
SUPT4H10.002380−0.0781/0.082/0.3410.252/0.0711/0.0003980.112/0.0544/0.0392−0.0956/0.0837/0.2530.241/0.0707/0.000642
HYAL30.002480−0.127/0.0542/0.01880.1/0.0521/0.0546−0.0134/0.0363/0.712−0.141/0.0563/0.01240.102/0.0518/0.0498
GeneMATS population  
heterogeneity p-value
MATS AFA Effect  
log OR/SE/p-value
MATS CAU Effect  
log OR/SE/p-value
Standard TWAS  
log OR/SE/p-value
AFA TWAS Effect  
log OR/SE/p-value
CAU TWAS Effect  
log OR/SE/p-value
SNX70.0002830.023/0.0456/0.614−0.186/0.0352/1.28e−07−0.109/0.0278/9.17e−050.0213/0.0467/0.649−0.179/0.0351/3.13e−07
CD680.0014600.227/0.104/0.0288−0.181/0.0751/0.016−0.0416/0.061/0.4960.224/0.106/0.0355−0.177/0.0747/0.0175
UGDH0.0021800.325/0.138/0.0183−0.119/0.0448/0.00801−0.0763/0.0425/0.07280.328/0.141/0.02−0.114/0.0445/0.0105
SUPT4H10.002380−0.0781/0.082/0.3410.252/0.0711/0.0003980.112/0.0544/0.0392−0.0956/0.0837/0.2530.241/0.0707/0.000642
HYAL30.002480−0.127/0.0542/0.01880.1/0.0521/0.0546−0.0134/0.0363/0.712−0.141/0.0563/0.01240.102/0.0518/0.0498

Given that UGDH reached suggestive significance (p < 0.05) for both population- and individual-level heterogeneity, we fit a full model with three PC-interactions to estimate the population-specific expression effects after accounting for intra-population heterogeneity. Under this model, the estimated effects (as log ORs) of UGDH in the AFA versus CAU population are |${\hat{\alpha}}_1=0.44482$| (SE = 0.14054, p = 0.0015) and |${\hat{\alpha}}_1+{\hat{\alpha}}_2=-0.05745$| (SE = 0.04931, p = 0.2440), respectively, where the standard error (SE) for |${\hat{\alpha}}_1+{\hat{\alpha}}_2$| was computed as|$\sqrt{\mathrm{Var}\big({\hat{\alpha}}_1\big)+\mathrm{Var}\big({\hat{\alpha}}_2\big)+2\mathrm{Cov}\big({\hat{\alpha}}_1,{\hat{\alpha}}_2\ \big)}$| and the p-value corresponds to a subsequent Wald test. The LRT testing of all PC-interactions was significant (p= 1.808* 10−6), and the p-value for α2 (the difference between the AFA and CAU expression effect) is p= 0.000412. These findings suggest that the signal detected for UGDH was driven by both inter- and intra-population heterogeneity and that UGDH overexpression may be a risk factor for AD within the AFA population but incurs protective effects within CAU populations.

Application to imaging and hematological traits in UK Biobank

While we leave the details of our UKBB application to the Supplemental Material, we highlight some key findings here. In total, 63 GEx-IDP associations and 216 GEx-Hematological trait pairs were detected by MATS after Bonferroni correction. Only 24/63 and 10/216 of these pairs were also detected by TWAS, respectively, highlighting the improved power of MATS over standard TWAS to detect expression-trait associations. Among the 63 GEx-IDP associations, one pair was significant for individual-level heterogeneity and 12 were significant for population-level heterogeneity, with no overlap. Of particular note, we identified opposite signed effects in the expression of UTS2 on the insular cortex volume between the African and Asian populations (positive effects) versus the European and Middle Eastern populations (negative effects). Among the 216 GEx-Hematological trait pairs, 40 were significant for individual-level heterogeneity. One gene-trait pair, LRMDA on hemoglobin concentration, was significant for both individual- and population-level heterogeneity and remained significant for population-level heterogeneity after adjusting for the top 10 genetic PC-expression interactions, suggesting the presence of both inter- and intra-population heterogeneity.

Discussion

In this work, we presented the Multi-Ancestry TwaS, a novel extension to TWAS which models ancestral heterogeneity in the effects of cis-regulated gene expression on a complex trait. Within the MATS framework, we jointly model samples from multiple subpopulations from trans-ancestry studies, thereby boosting power over stratified TWAS to detect genes with shared expression-trait effects across ancestries. Furthermore, we proposed tests of both population- and subject-level heterogeneity to detect inter- and intra-population variation in GEx-trait effects, enabling MATS to detect association signal that would go undiscovered using standard TWAS, particularly in the most extreme cases where subpopulations have opposite signed expression effects. Our simulations highlight the superior power of MATS over existing methods to detect expression-trait association signals when the expression effects vary between and/or within subpopulations, and further reveal comparable power to TWAS in the absence of heterogeneity. This finding was validated through our applications to both the Alzheimer’s Disease Sequencing Project and UK Biobank. MATS identified many expression-trait associations which were undiscovered by stratified or standard TWAS, including genes with evidence of population- or subject-specific expression-trait associations. Of particular note, MATS confirmed previously supported protective effects of SNX7 overexpression in AD, but discovered possible risk conferring effects within African American populations. Furthermore, our application to the UK Biobank highlights the flexibility of MATS to study expression-trait associations within cohorts of finescale population substructure in which ancestry-matched expression prediction models are unavailable. Given the generalizability of MATS to both continuous and binary complex traits, we aim to raise the attention to more applications of MATS to both existing and future multi-ancestry genetic studies, which will be an important future direction.

This work, namely our discovery of genes with opposite signed expression effects between subpopulations, highlights the dangerous implications of generalizing findings from ancestrally homogeneous genetic studies to other populations. While our novel framework can detect such heterogeneity in the context of expression-trait association testing, the utility of MATS relies on the availability of trans-ancestry genetic studies with sufficiently sized subpopulations. In spite of the recent efforts of major sequencing studies to increase non-European populations (including ADSP (36)), the representation of non-EUR subjects in most major genetic biobanks remains disproportionate, as seen in our application to the UK Biobank (37). Furthermore, the reliability of MATS and standard TWAS are tied to the accuracy of the gene expression prediction models, which to date are trained on only a few hundred samples with available expression data in non-CAU samples. As such, our work emphasizes the need to prioritize ancestral diversity with well-powered subpopulations in large-scale biobanks with sequencing and expression data.

Materials and Methods

The transcriptome wide association study

The TWAS infers associations between genetically regulated gene expression (GEx) and a complex trait. TWAS is equivalent to the method of 2 Stage Least Squares (2SLS), so-called because it is performed in two stages. In the first stage, a prediction model for gene expression (x) is trained on a set of cis-eQTLs, commonly defined as SNPs associated with expression levels that lie within 1 Mb of the gene’s transcription start site. Stage 1 model coefficients |$(\hat{w})$| are then used to obtain fitted values for x (Eq. 1), denoted as |$\hat{x}$| and commonly interpreted as the genetically regulated component of gene expression. We use gi (p x 1) to denote the vector of p cis-eQTL genotypes for subject i, i ∈ 1, …, n, with each SNP additively coded as 0, 1 or 2 effect alleles.

In Stage 2, the trait (Y) is regressed on imputed gene expression, as given in Eq. (2), where C represents a set of covariates (including the intercept) and h(·) represents some link function. The association between genetically regulated GEx and the trait is tested via divergence from the null hypothesis, H0: β = 0, in the following model:

(1)
(2)

Conveniently, many studies have publicly available pre-trained GEx prediction models. This allows for imputation of GEx in cohorts that have available genotype and phenotype data, but not expression data, making it possible to apply TWAS to many large genetic case–control studies. Furthermore, TWAS benefits from improved biological interpretation over standard GEx-trait association tests by modeling GEx in terms of the additive effects from eQTLs (i.e. directly modeling the full pathway from SNPs to GEx to a complex trait). As described by Knutson et al. (2020), TWAS can also overcome potential biases introduced by confounding on the trait-GEx relationship and can thereby take on causal interpretations when certain conditions are met (38).

Modeling heterogeneous gene expression in TWAS

Here, we expand Stage 2 in TWAS to model possible heterogeneous effects of gene expression at both the intra- and inter-population levels. The proposed Stage 2 model is given in Eq. (3) for subject i in population j

(3)
(4)
(5)

where |${\hat{x}}_{i,j}$| represents the imputed gene expression for subject i in population j with j ∈ 1, …, k and i ∈ 1,... nj such that n1 + · · ·  + nk = n. C represents a set of measured confounders, including an intercept and indicator variables for population. For now, we assume C also contains some leading genetic PCs to account for population stratification, though we consider this treatment in greater detail in the Supplemental Material.

A key feature of Eq. (3) is that the expression effect βi,j is indexed at the population and individual levels. Specifically, we decompose βi,j into three components, as given in Eq. (4): 1) the fixed expression effect for population 1, α1, treated as the reference group, 2) the difference in the expression effect between the j-th ancestral group compared to the reference population, denoted as α2, …, αk, and 3) the subject-specific GEx effect, ηi,j. The final mixed-effects Stage 2 model in Eq. (5) follows from substituting Eq. (4) into Eq. (3), where |$\hat{X}=\mathit{{diag}}({\hat{x}}_1,\dots, {\hat{x}}_n)$| and Dj is an indicator for population j. We refer to the model in Eq. (5) as the Multi-Ancestry TwaS model, or MATS for short.

Under the MATS model, we treat α = [α1, …, αk]T as fixed, and treat η = [η1, …, ηn]T as random such that E[η] = 0 and Var[η] = κ2K. Here, K represents an n x n similarity matrix and serves to account for genetic relatedness between subjects. There are a number of choices for K in practice, including a realized genetic relationship matrix (GRM), estimated as the sample covariance matrix computed on a matrix of genome-wide minor allele counts. Misspecification of K has been studied in detail in the related application of mixed modeling to adjust for population stratification in GWAS: if the extent of the misspecification is severe, power may be reduced and type I errors may be inflated. We explore a specific case of ‘extreme’ misspecification in our simulations, which is well known to inflate Type I error: namely, misspecifying K = I, where I is the n × n identity matrix. Notably, the use of K limits this model to individual-level data, though future work may extend the MATS model for summary data applications, similar to the work of (16).

The MATS model in Eq. (5) provides a framework to test for expression-trait associations while accounting for possible ancestral heterogeneity. As to be described in the following section, this MATS expression association test can detect association signals originating from 1) a shared expression effect across populations and subjects (as in standard TWAS), 2) population-specific effects and/or 3) subject-specific or intra-population expression-trait associations. In the sections to follow, we outline a pipeline of tests and corresponding models to identify the source of the association signal and appropriately estimate expression effects for the given setting.

Importantly, a key distinction between the MATS framework and existing models for Multi-Ancestry TWAS is that MATS models a distinct gene expression effect for each ancestral group, providing a way to estimate the effects of gene expression that differ at the population level (as has been observed in real data analysis). In contrast, the existing METRO method for multi-ancestry TWAS accounts for ancestry by weighting the relative contributions of each population’s gene expression model from Stage 1, but assumes a homogeneous gene expression effect across ancestral groups.

Given that MATS builds off of TWAS (or analogously, 2 Stage Least Squares), associations detected under MATS may be causal under certain conditions, particularly when instrumental variable conditions are met. However, the introduction of the random component and a non-linear link function complicates direct causal interpretations. Nevertheless, we highlight a number of MATS discoveries in our real data application that are closely related to the biology of disease, thereby validating the genes that are uniquely discovered by MATS (as compared to TWAS).

A novel test of expression-trait associations in multi-ancestry cohorts

Testing for expression-trait associations under the model in Eq. (5) corresponds to the null hypothesis H0,1: α1 = α2 = · · · = αk = 0 and κ2 = 0, which will be rejected if any of the population-level or ancestry-adjusted subject-specific expression effects differ from zero. To test this hypothesis, we combine the two asymptotically independent tests of H0,2: α1 = α2 = · · · = αk = 0 given κ2 = 0, and H0,3: κ2 = 0. A detailed outline of the score tests for H0,2 and H0,3 are given in the Supplemental Material. Briefly, the score statistic under H0,2 is given as

(6)

with |${S}_2={\big[\hat{x},\hat{x}\odot D\big]}^{\mathrm{T}}\big(Y-{\mu}^{\ast}\big)\;\mathrm{and}\;{\varDelta}^{\ast }= Var\big[{S}_2\big]$|⁠, where |$\odot$| represents element-wise multiplication, D represents the n x k matrix of population indicators (i.e. D = [D1, …, Dj, …, Dk]) and μ* = (μ1*, … μn*)T represents the mean estimates under the null model for H0,2.

The score test for H0,3, i.e. testing for individual-level heterogeneity, builds off of the work of Goeman et al. (21) and Zhang and Lin (39) and corresponds to the score test statistic

(7)

where |${\hat{\mu}}$| represents the mean estimates under H0,3. We show that U3 asymptotically follows a mixture of independent χ21 distributions, |${\sum}_{i=1}{\lambda}_i{\chi}_1^2$|⁠, where λi are eigenvalues of |$\frac{1}{n}{W}^{1/2}\hat{X}\;K\;{\hat{X}}^{\mathrm{T}}\;{W}^{1/2}$| (39). Several methods have been developed to approximate the mixture chi-square distribution to obtain p-values, including Davies (40), Liu et al. (41), Wang (42) and Chen et al. (43). However, these approaches require computing all eigenvalues λ, which can be very computationally expensive even in moderate sample sizes. As such, we approximate p-values using the highly accurate and computationally inexpensive fastSKAT algorithm proposed by Lumley et al. (44).

As outlined by Sun et al., U2 and U3 (the score test statistic for H0,2 and H0,3, respectively) are asymptotically independent, availing a number of methods for combining independent test statistics to test H0,1 (45,46). Many of these approaches take the general form of a weighted sum of the independent tests, given in Eq. (8), where pα and pκ are the p-values resulting from the tests on the fixed and random components, respectively.

(8)

The most straightforward approach is Fisher’s method. The test statistic of Fisher’s method is equivalent to an unweighted sum of the –2log(p-value) for the individual components, such that ρα = ρκ = 1 in Eq. (8). Fisher’s test statistic follows a χ24 under H0,1.

Fisher’s method achieves high power when both |$\alpha \ne 0$| and κ > 0. However, when only one of the two alternatives is true, Fisher’s method loses power. Su et al. (47) propose a data adaptive approach for determining weights ρα, ρκ. However, their simulations show that Fisher’s approach generally yields power comparable to the adaptively weighted test across different scenarios. As such, we default to the standard Fisher’s method in practice.

Distinguishing between shared, population-specific and subject-specific expression effects

A significant test of H0,1 suggests that gene expression has some effect on the complex trait, either at the global (shared) level, the population level and/or the individual level. A number of subsequent tests under the MATS framework can help identify the source of the association and inform estimation of either shared or population-specific effects, adjusted for individual-level heterogeneity as needed.

To start, the presence of individual-level heterogeneity can be assessed based on the significance of the score test of H0,3. If H0,3 is not rejected, we can operate from a reduced model without the subject-specific random effects, namely

(9)

Under this reduced model, we can test for the presence of heterogeneity in the population-level effects, corresponding to any standard test (e.g. Score, LRT or Wald test) of H0,4: α2 = · · · = αk = 0. We outline the score test for H0,4 in the Supplemental Material. Divergence from H0,4 implies at least one population expression effect differs from the reference population. Notably, this null model is equivalent to the standard TWAS approach, and thus rejection of H0,4 implies improvement of the population-specific MATS model over the standard TWAS approach with ‘pooled’ populations. If H0,4 is not rejected, the standard TWAS model can be used to estimate the shared expression effect. Notably, due to the increased sample size, we expect this approach to have improved power over stratified TWAS to detect risk genes with shared effects across subpopulations, but may lose power compared to standard TWAS when no population heterogeneity is present due to the greater degrees of freedom of the MATS expression association test.

Fitting a full model in the presence of individual-level heterogeneity

When individual-level heterogeneity is detected (i.e. H0,3 is rejected), it is appropriate to fit a full model to estimate and test for population-level expression effects which adjusts for individual-level heterogeneity. However, the full model in Eq. (5) cannot be readily estimated due to the difficulty in accurately finding the maximum likelihood estimate of κ2, estimation which is avoided under the score test for H0,3. To address this challenge, we propose a fixed effects approximation to the random individual-level effect in Eq. (5) which can be used to estimate an approximate full model in the presence of individual-level heterogeneity. A detailed outline of this approach is given in the Supplemental Material.

Briefly, our proposed full model replaces the random individual-level effect in Eq. (5) with interactions between leading genetic PCs and imputed genetic expression:

(10)

Here, t (n x 1) represents the -th genetic PC with corresponding eigenvalue ω, ∈ 1, …, L, where L is some arbitrarily selected number of top PCs. The last L terms in Eq. (10) correspond to the interactions between imputed gene expression and each weighted genetic PC, with each interaction effect denoted as γ, ∈ 1, …, L. These interaction terms represent the effect of gene expression on the trait which is correlated with ancestry.

In the Supplemental Material, we show that if interaction effects for all PCs are included (i.e. L = rank(K)) and we treat all interaction effects as random, the corresponding test of heterogeneity under model in Eq. (10) is directly equivalent to the score test for H0,2. It follows from this equivalence that a model which incorporates a few leading PC-expression interactions as fixed effects is a reasonable approximation to the GLMM MATS model with a random individual-level expression effect. While the ideal number of top PCs is unknown and this approximation may vary in strength depending on the degree of genetic differentiation captured in the top PCs, an LRT of H0: γ1 = · · · = γL* = 0 can be used to confirm whether the intra-population variability is captured within the leading L* PC-interactions. That is, rejection of H0 suggests that non-ignorable intra-population signal is captured within the leading L* terms. Less formally, additional PCs can be added and their importance assessed based on whether the results significantly change.

This model facilitates testing of global and population-level effects after adjusting for individual-level heterogeneity using a standard LRT of H*0,1: α1 = · · · = αk = 0 and H*0,4: α2 = · · · = αk = 0, respectively. This approach is suitable for fitting a heterogeneity-adjusted full model after rejecting the test for individual-level heterogeneity under the GLMM, which provides more accurate estimates of the population-level effects compared to a model which disregards the presence of intra-population heterogeneity altogether.

Data

GEx prediction models from MESA

We downloaded pre-trained gene expression models published by Mogil et al. (https://github. com/WheelerLab/DivPop) for 4674 genes (9). These models were trained using genetic and purified human monocytes expression data from The Multi-Ethnic Study of Atherosclerosis (MESA), which consisted of 6814 individuals, including 233 AFA, 352 Hispanic (HIS), and 578 CAU subjects. Ancestry data was based on self-report. Models were trained separately on data from each of the three ancestral groups. As detailed in Mogil et al., the expression models between AFA and CAU significantly varied at some genes. As such, we match the prediction model to the target GWAS ancestry (AFA and CAU only) when imputing gene expression in TWAS. Notably, a number of key risk genes in AD are not included in the MESA models (e.g. APOE, CLU, PICALM, TREM2, SORL1), which may have implications in our application to ADSP.

For each gene, an elastic net was trained to predict gene expression from cis-eQTL genotypes, specifically including common variants (MAF > 0.01) within 1 Mb of the transcription start site. Tuning parameters were selected using predictive R2 and assessed via nested cross-validation for each ancestry model, where three specific mixing parameters were considered (α = 0.05, 0.5, 1). We limit our analysis to testing a set of 1358 genes with a prediction performance R2 > 0.1 in both AFA and CAU populations.

Given that gene expression is cell-type specific, findings from MATS may differ depending on the tissue type of the gene expression model used. Investigators should carefully prioritize gene expression models of relevant cell-types for the trait in consideration. While the current model considers only expression for a single tissue, future extensions of MATS may integrate data from multiple tissues, which has been shown to boost power to detect associated genes for some traits (48).

Notably, the competing METRO method requires individual-level gene expression data or OLS estimates for eQTLs in the multi-ancestry GEx model. Neither of these data is presently available for our real data applications (ADSP or UKB) or for many other sources of multi-ancestry genetic data, thus highlighting another key advantage of MATS over METRO.

ADSP WGS data

Using the MESA GEx prediction models, we imputed gene expression using the individual-level data from the ADSP to test for heterogeneity-adjusted expression-AD associations. ADSP comprises three sub-studies (ADNI, ADSP discovery, and ADSP Extension), including both multiplex family and case–control design. ADSP includes subjects who self-reported as Black or African American (AFA), Non-Hispanic white (CAU), and Hispanic (HIS). However, due to limited sample sizes within the HIS subpopulation, we limited our analysis to the AFA and CAU samples from the case–control studies. Our sample selection procedure is illustrated in Figure 9. In total, we analyzed 4828 AD cases and 4558 controls (n = 9386 combined), each with available Whole Genome Sequencing data measured on the HisSeq X Ten platform. AD cases were identified as having possible, probable or definite AD based on cognitive assessment criteria from the National Institute of Neurological Diseases–Alzheimer’s (NINCDS-ADRDA). We excluded all subjects with uncertain AD statuses, or missing covariates for sex, age at onset (cases) or age at the last exam (controls). A breakdown of cases and controls by ancestry group is given in Table 2.

Sample selection procedure for real data applications.
Figure 9

Sample selection procedure for real data applications.

Table 2

Sample sizes for ADSP case–control data by self reported ancestry

Ancestryn casesn controlsn total
Black/African American108316662749
Caucasian374528926637
Ancestryn casesn controlsn total
Black/African American108316662749
Caucasian374528926637
Table 2

Sample sizes for ADSP case–control data by self reported ancestry

Ancestryn casesn controlsn total
Black/African American108316662749
Caucasian374528926637
Ancestryn casesn controlsn total
Black/African American108316662749
Caucasian374528926637

To ensure concordance between all reference panels and the MESA prediction models, we performed liftover on the ADSP genotypes from hg38 to hg19. We computed a similarity matrix for all samples after pruning markers using an r2 cutoff of 0.2, a window size of 1000 kb and a step size of 50 variants. We computed a genetic relatedness matrix using plink’s –make-rel function and performed PCA on the pruned genotypes using plink’s –pca option. For all MATS and TWAS models, we adjust for biological sex, age at baseline visit and APOE status.

UK Biobank GWAS

The UK Biobank (UKBB) is a rich resource comprising ∼500000 subjects with GWAS and phenotypic measurements. A subsample of 37438 subjects also provided Magnetic Resonance Images, from which the summarized brain imaging-derived phenotypes (IDPs) were computed, including many volumetric regions with previous evidence of causal associations in AD (38,49). While the UK Biobank is composed of primarily European individuals, population substructure and ancestry-linked gene-phenotype associations have been illustrated within UKBB, findings which motivate our application of MATS (50–52). To analyze population level heterogeneity, we derive four subpopulations based on genetically derived ancestry using the TRACE software (53), which included African, Asian, European and Middle Eastern populations. We apply MATS to study the expression associations in two sets of outcome phenotypes: (i) 20 IDPs with putative causal AD associations, analyses that complement our primary ADSP application and (ii) four blood cell traits, including platelet count, hemoglobin, hematocrit and white blood cell count. The latter analyses mirror the work of Wen et al., who studied ancestry correlated gene expression in these hematological traits using stratified TWAS between African, Hispanic and European ancestries (54). The details of these data and analyses are given in the Supplemental Material.

Integrating prediction models with target GWAS

When imputing GEx using the MESA prediction models, we found that many cis-eQTLs in the prediction models were not measured in the GWAS or WGS data, a common challenge in two-sample TWAS analyses. There are a number of common approaches to circumvent this issue, including (i) performing genotype imputation on the target sample to predict missing alleles, (ii) using a variant in high LD with the missing eQTL (i.e. an LD proxy) as a surrogate marker and (iii) excluding missing eQTLs when imputing gene expression. However, when genetic imputation of the target sample yields poorly predicted genotypes at missing markers or no variants in the target sample are in high LD with the missing variant, the first two approaches are not feasible. The latter approach may severely bias imputed GEx and the resulting TWAS when missing variants are strongly predictive of expression or many moderately predictive variants are missing.

Here, we propose an alternative approach for imputing gene expression using all prediction weights when some prediction model eQTLs are missing in the target data. As previously described, imputed gene expression is given as the fitted values from a regression of expression on its corresponding eQTLs, given as |$\hat{x}=E\big[x|G\big]=E\big[{g}^{(1)}{w}_1+\cdots +{g}^{(j)}{w}_j+{g}^{\big(j+1\big)}{w}_{j+1}+\cdots +{g}^{(p)}{w}_p|g\big]$|⁠, where g(j) represents n × 1 vector of genotypes for the jth eQTL, j ∈ 1, …, p, wj represents the estimated effect of the j-th eQTL and G represents the n x p matrix of genotypes, G = (g1, …, gp). Suppose SNPs 1, …, j are all missing from the target sample. Define the n × j matrix of missing genotypes as gmiss = (g1, …, gj) with weight vector |${\hat{w}}_{miss}$| and the n × (pj) genotypes measured in the GWAS data as ggwas = (gj + 1, …, gp) with weight vector |${\hat{w}}_{gwas}$|⁠, such that G = (gmiss,  ggwas). Then, we can rewrite the stage 1 fitted values as

(11)

The unknown quantity in Eq. (11), E[gmiss|G], can be estimated by performing a multivariate regression of the genotypes g* on G using a reference panel of similar ancestry, where we henceforth use the superscript * to denote genotypes from an external reference. That is, we obtain fitted values from the following regression:

Reference panels such as the 1000 Genomes publicly report whole genome sequencing data on multiple populations, thereby making it likely that all prediction model variants are present in the reference data, and estimation of E[gmiss|G] can be performed separately using reference panel samples matched to the self-reported ancestry of target sample subjects.

Simulation setup

To evaluate the performance of our proposed test, we perform simulations under a variety of settings based on real data. We simulate genotype data using the sim1000G in R (https://github.com/adimitromanolakis/sim1000G), a genotype simulator that has been shown to successfully preserve ancestry-specific MAFs and LD patterns from a reference sample in simulated data (55). We simulate genotypes based on 50 cis-SNPs from a randomly selected gene (TOMM22 on chromosome 22) using the Phase III 1000 Genomes reference panel. For our primary simulations, we separately generate data on 2000 subjects for each of the EUR, AFR and AMR populations (total n = 6000).

We specify true Stage 1 GEx weights (i.e. marginal SNP effects on GEx) by sampling from a wj ∼ N (0, 0.01). We allow 80% of these weights to be shared across all three subpopulations, and randomly sample the remaining 20% separately for each ancestry group to reflect the expected variability in gene expression prediction models across subpopulations (9). We generate gene expression from x = Gw + βU U + ϵX, where ϵX,i is independently sampled from a N(0, 0.01) and U represents ‘unmeasured’ confounders randomly sampled from N(0, 0.1) with effect size βU,X = 0.01. We then generate a continuous phenotype as |$Y={\sum}_{r=1}^{10}{c}_rP{C}_r+{\alpha}_1x+{\alpha}_2x\ast{D}_2+{\alpha}_3x\ast{D}_3+{\beta}_{U,Y}U+\mathbf{X}\eta +{\in}_Y$|⁠, where Dj represents an indicator for the jth population, ϵY ∼ N (0, σϵ) represents the random error and βU,Y represents the effect of unmeasured confounders on Y, which we set to 0.01. Here, PCr represents the rth genetic PC with effect cr, sampled randomly from cr ∼ N (0, 0.05). We sample the expression variance parameter, η, from MVN(0, κ2K), where K represents the genetic relatedness matrix based on the simulated genotypes.

To assess the performance of our proposed MATS model against standard TWAS, we first estimate weights and impute gene expression for each population separately, using the full samples for estimation (i.e. one sample TWAS). We assess the rejection rates for H0,4 (population-level heterogeneity), H0,3 (intra-population heterogeneity), H0,1 (overall gene expression effect) against PC-adjusted TWAS. We vary the parameters α1, α2 and α3 across the values [0, 0.001, 0.005, 0.01, 0.02, 0.03, 0.05], and vary κ2 across [1e − 10, 5e − 04, 0.001, 0.00125, 0.0015, 0.00175, 0.002, 0.0025, 0.0035, 0.005, 0.01].

As it currently stands, most multi-ancestry GWAS or sequencing studies have imbalanced sample sizes between self-reported ancestries. To reflect this situation, we consider a supplementary simulation setting with n1 = 4000, n2 = 500 and n3 = 500. Furthermore, while the Stage 1 sample sizes for our real data applications are small (nAFA = 233, nCAU = 578), recent work suggests that the cis-heritability of gene expression and Stage 2 sample size have substantially greater influences over the power of the Stage 2 TWAS test than does the Stage 1 sample size (56). This suggests our simulations reasonably reflect the expected power of our proposed Stage 2 MATS model in real data, and that the small Stage 1 sample size may not overly limit the power of our applied results (which are limited to heritable genes).

Code availability

MATS can be applied in R using the package available at https://github.com/kathalexknuts/MATS. This package integrates R and C++, along with a number of existing packages to improve computational efficiency, including bigQF (https://github.com/tslumley/bigQF) for p-value approximation in the MATS test of individual-level heterogeneity. For a total sample size of 6000, the suite of MATS tests took only 2.919 seconds to run.

Acknowledgements

Access to the ADSP and UKBB data was approved through the NIAGADS DAR ID #10141 and UKB Application #35107, respectively.

Conflict of Interest statement

The authors have no conflicts of interest to declare.

Funding

National Institutes of Health grants R01AG065636, R01AG069895, RF1 AG067924, U01AG073079, R01HL116720 and R01GM126002; Minnesota Supercomputing Institute at the University of Minnesota.

References

[1]

Wojcik
,
G.L.
,
Graff
,
M.
,
Nishimura
,
K.K.
,
Tao
,
R.
,
Haessler
,
J.
,
Gignoux
,
C.R.
,
Highland
,
H.M.
,
Patel
,
Y.M.
,
Sorokin
,
E.P.
,
Avery
,
C.L.
 et al. (
2019
)
Genetic analyses of diverse populations improves discovery for complex traits
.
Nature
,
570
,
514
518
.

[2]

Storey
,
J.D.
,
Madeoy
,
J.
,
Strout
,
J.L.
,
Wurfel
,
M.
,
Ronald
,
J.
and
Akey
,
J.M.
(
2007
)
Gene-expression variation within and among human populations
.
Am. J. Hum. Genet.
,
80
,
502
509
.

[3]

Spielman
,
R.S.
,
Cheung
,
V.G.
,
Bastone
,
L.A.
,
Burdick
,
J.T.
,
Morley
,
M.
and
Ewens
,
W.J.
(
2007
)
Common genetic variants account for differences in gene expression among ethnic groups
.
Nat. Genet.
,
39
,
226
231
.

[4]

Zhang
,
W.
,
Duan
,
S.
,
Kistner
,
E.O.
,
Bleibel
,
W.K.
,
Huang
,
R.S.
,
Clark
,
T.A.
,
Chen
,
T.X.
,
Schweitzer
,
A.C.
,
Blume
,
J.E.
,
Cox
,
N.J.
and
Dolan
,
M.E.
(
2008
)
Evaluation of genetic variation contributing to differences in gene expression between populations
.
Am. J. Hum. Genet.
,
82
,
1223
1223
.

[5]

Montgomery
,
S.
,
Dermitzakis
,
E.T.
,
Bird
,
C.P.
,
Dunning
,
M.
,
Nica
,
A.C.
,
Koller
,
D.
,
Forrest
,
M.S.
,
Flicek
,
P.
,
Deloukas
,
P.
,
Tavara
,
S.
 et al. (
2007
)
Population genomics of human gene expression
.
Nat. Genet.
,
39
,
1217
1224
.

[6]

Torgerson
,
D.G.
,
Boyko
,
A.R.
,
Hernandez
,
R.D.
,
Indap
,
A.
,
Hu
,
X.
,
White
,
T.J.
,
Sninsky
,
J.J.
,
Cargill
,
M.
,
Adams
,
M.D.
,
Bustamante
,
C.D.
and
Clark
,
A.G.
(
2009
)
Evolutionary processes acting on candidate cis-regulatory regions in humans inferred from patterns of polymorphism and divergence
.
PLoS Genet.
,
5
,
e1000592
e1000592
.

[7]

Zhong
,
Y.
,
Perera
,
M.A.
and
Gamazon
,
E.R.
(
2019
)
On using local ancestry to characterize the genetic architecture of human traits: Genetic regulation of gene expression in multiethnic or admixed populations
.
Am. J. Hum. Genet.
,
104
,
1097
1115
.

[8]

Stranger
,
B.E.
,
Montgomery
,
S.B.
,
Dimas
,
A.S.
,
Parts
,
L.
,
Stegle
,
O.
,
Ingle
,
C.E.
,
Sekowska
,
M.
,
Smith
,
G.D.
,
Evans
,
D.
,
Gutierrez-Arcelus
,
M.
 et al. (
2012
)
Patterns of cis regulatory variation in diverse human populations
.
PLoS Genet.
,
8
,
272
284
.

[9]

Mogil
,
L.S.
,
Andaleon
,
A.
,
Badalamenti
,
A.
,
Dickinson
,
S.P.
,
Guo
,
X.
,
Rotter
,
J.I.
,
Johnson
,
W.C.
,
Im
,
H.K.
,
Liu
,
Y.
and
Wheeler
,
H.E.
(
2018
)
Genetic architecture of gene expression traits across diverse populations
.
PLoS Genet.
,
14
,
e1007586
e1007586
.

[10]

Albert
,
F.W.
and
Kruglyak
,
L.
(
2015
)
The role of regulatory variation in complex traits and disease
.
Nat. Rev. Genet.
,
16
,
197
212
.

[11]

Boyle
,
E.A.
,
Li
,
Y.I.
and
Pritchard
,
J.K.
(
2017
)
An expanded view of complex traits: From polygenic to omnigenic
.
Cell
,
169
,
1177
1186
.

[12]

Bisogno
,
L.S.
,
Yang
,
J.
,
Bennett
,
B.D.
,
Ward
,
J.M.
,
Mackey
,
L.C.
,
Annab
,
L.A.
,
Bushel
,
P.R.
,
Sing-hal
,
S.
,
Schurman
,
S.H.
,
Byun
,
J.S.
 et al. (
2020
)
Ancestry-dependent gene expression correlates with reprogramming to pluripotency and multiple dynamic biological processes
.
Sci. Adv.
,
6
, eabc3851.

[13]

Blue
,
E.E.
,
Horimoto
,
A.R.
,
Mukherjee
,
S.
,
Wijsman
,
E.M.
and
Thornton
,
T.A.
(
2019
)
Local ancestry at APOE modifies Alzheimer’s disease risk in Caribbean Hispanics
.
Alzheimer’s and dementia
,
15
,
1524
1532
.

[14]

Bhattacharya
,
A.
,
Garcia-Closas
,
M.
,
Olshan
,
A.F.
,
Perou
,
C.M.
,
Troester
,
M.A.
and
Love
,
M.I.
(
2020
)
A framework for transcriptome-wide association studies in breast cancer in diverse study populations
.
Genome Biol.
,
21
,
1
42
.

[15]

Li
,
Z.
,
Zhao
,
W.
,
Shang
,
L.
,
Mosley
,
T.H.
,
Kardia
,
S.L.R.
,
Smith
,
J.A.
and
Zhou
,
X.
(
2022
)
Metro: Multi-ancestry transcriptome-wide association studies for powerful gene-trait association detection
.
Am. J. Hum. Genet
., 109, 783–801.

[16]

Magi
,
R.
,
Horikoshi
,
M.
,
Sofer
,
T.
,
Mahajan
,
A.
,
Kitajima
,
H.
,
Franceschini
,
N.
,
McCarthy
,
M.I.
and
Morris
,
A.P.
(
2017
)
Trans-ethnic meta-regression of genome-wide association studies accounting for ancestry increases power for discovery and improves fine-mapping resolution
.
Hum. Mol. Genet.
,
26
,
3639
3650
.

[17]

Zhang
,
Y.
and
Pan
,
W.
(
2015
)
Principal component regression and linear mixed model in association analysis of structured samples: Competitors or complements?
 
Genet. Epi.
,
39
,
149
155
.

[18]

Buckler
,
E.S.
,
Yu
,
J.
,
Pressoir
,
G.
,
Briggs
,
W.H.
,
Vroh Bi
,
I.
,
Yamasaki
,
M.
,
Doebley
,
J.F.
,
McMullen
,
M.D.
,
Gaut
,
B.S.
,
Nielsen
,
D.M.
,
Holland
,
J.B.
and
Kresovich
,
S.
(
2006
)
A unified mixed-model method for association mapping that accounts for multiple levels of relatedness
.
Nat. Genet.
,
38
,
203
208
.

[19]

Zhao
,
K.
,
Aranzana
,
M.J.
,
Kim
,
S.
,
Lister
,
C.
,
Shindo
,
C.
,
Tang
,
C.
,
Toomajian
,
C.
,
Zheng
,
H.
,
Dean
,
C.
,
Marjoram
,
P.
and
Nordborg
,
M.
(
2007
)
An arabidopsis example of association mapping in structured samples
.
PLoS Genet.
,
3
,
0071
0082
.

[20]

Patterson
,
N.
,
Price
,
A.L.
and
Reich
,
D.
(
2006
)
Population structure and eigenanalysis
.
PLoS Genet.
,
2
,
2074
2093
.

[21]

Goeman
,
J.
,
Van de Geed
,
S.
and
van
 
Houwelingen
,
H.
(
2006
)
Testing against a high dimensional alternative
.
J. R. Stat. Soc.
,
68
,
477
493
.

[22]

Wu
,
M.C.
,
Kraft
,
P.
,
Epstein
,
M.P.
,
Taylor
,
D.M.
,
Chanock
,
S.J.
,
Hunter
,
D.J.
and
Lin
,
X.
(
2010
)
Powerful SNP-set analysis for case-control genome-wide association studies
.
Am. J. Hum. Genet.
,
86
,
929
942
.

[23]

Neale
,
B.M.
,
Rivas
,
M.A.
,
Voight
,
B.F.
,
Altshuler
,
D.
,
Devlin
,
B.
,
Orho-Melander
,
M.
,
Kathiresan
,
S.
,
Purcell
,
S.M.
,
Roeder
,
K.
and
Daly
,
M.J.
(
2011
)
Testing for an unusual distribution of rare variants
.
PLoS Genet.
,
7
,
e1001322
e1001322
.

[24]

Pan
,
W.
(
2009
)
Asymptotic tests of association with multiple SNPs in linkage disequilibrium
.
Genet. Epi.
,
33
,
497
507
.

[25]

Cortini
,
F.
,
Fenoglio
,
C.
,
Venturelli
,
E.
,
Villa
,
C.
,
Clerici
,
F.
,
Serpente
,
M.
,
Cantoni
,
C.
,
Fumagalli
,
G.
,
Mariani
,
C.
,
Bresolin
,
N.
,
Scarpini
,
E.
and
Galimberti
,
D.
(
2010
)
Cell-dependent kinase inhibitor 2a and 2b genetic variability in patients with Alzheimers disease
.
J. Neurol.
,
258
,
704
705
.

[26]

Siddiqui
,
S.S.
,
Matar
,
R.
,
Merheb
,
M.
,
Hodeify
,
R.
,
Vazhappilly
,
C.G.
,
Marton
,
J.
,
Shamsuddin
,
S.A.
and
Al Zouabi
,
H.
(
2019
)
Siglecs in brain function and neurological disorders
.
Cell
,
8
,
1125
.

[27]

Jia
,
Y.
,
Wang
,
N.
,
Zhang
,
Y.
,
Xue
,
D.
,
Lou
,
H.
and
Liu
,
X.
(
2020
)
Alteration in the function and expression of SLC and ABC transporters in the neurovascular unit in Alzheimer’s disease and the clinical significance
.
Aging Dis.
,
11
,
390
404
.

[28]

Roy
,
E.R.
,
Wang
,
B.
,
Wan
,
Y.-W.
,
Chiu
,
G.
,
Cole
,
A.
,
Yin
,
Z.
,
Propson
,
N.E.
,
Xu
,
Y.
,
Jankowsky
,
J.L.
,
Liu
,
Z.
 et al. (
2020
)
Type I interferon response drives neuroinflammation and synapse loss in Alzheimer disease
.
J. Clin. Invest.
,
130
,
1912
1930
.

[29]

Mastroeni
,
D.
,
Nolz
,
J.
,
Sekar
,
S.
,
Delvaux
,
E.
,
Serrano
,
G.
,
Cuyugan
,
L.
,
Liang
,
W.S.
,
Beach
,
T.G.
,
Rogers
,
J.
and
Coleman
,
P.D.
(
2018
)
Laser-captured microglia in the Alzheimer’s and Parkinson’s brain reveal unique regional expression profiles and suggest a potential role for hepatitis B in the Alzheimer’s brain
.
Neurobiol. Aging
,
63
,
12
21
.

[30]

Arisi
,
I.
,
D’Onofrio
,
M.
,
Brandi
,
R.
,
Felsani
,
A.
,
Capsoni
,
S.
,
Drovandi
,
G.
,
Felici
,
G.
,
Weitschek
,
E.
,
Berto-Lazzi
,
P.
and
Cattaneo
,
A.
(
2011
)
Gene expression biomarkers in the brain of a mouse model for Alzheimer’s disease: Mining of microarray data by logic classification and feature selection
.
J. Alzheimers Dis.
,
24
,
721
738
.

[31]

Berbari
,
N.F.
,
Malarkey
,
E.B.
,
Yazdi
,
S.M.Z.R.
,
McNair
,
A.D.
,
Kippe
,
J.M.
,
Croyle
,
M.J.
,
Kraft
,
T.W.
and
Yoder
,
B.K.
(
2014
)
Hippocampal and cortical primary cilia are required for aversive memory in mice
.
PLoS One
,
9
,
e106576
e106576
.

[32]

Falabella
,
M.
,
Vernon
,
H.J.
,
Hanna
,
M.G.
,
Claypool
,
S.M.
and
Pitceathly
,
R.D.
(
2021
)
Cardiolipin, mitochondria, and neurological disease
.
Trends Endocrinol. Metab.
,
32
,
224
237
.

[33]

Shen
,
L.
,
Chen
,
C.
,
Yang
,
A.
,
Chen
,
Y.
,
Liu
,
Q.
and
Ni
,
J.
(
2015
)
Redox proteomics identification of specifically carbonylated proteins in the hippocampi of triple transgenic Alzheimer’s disease mice at its earliest pathological stage
.
J. Proteome
,
123
,
101
113
.

[34]

Xu
,
S.
,
Zhang
,
L.
and
Brodin
,
L.
(
2018
)
Overexpression of SNX7 reduces aβ production by enhancing lysosomal degradation of APP
.
Biochem. Biophys. Res. Commun.
,
495
,
12
19
.

[35]

Sekar
,
S.
,
McDonald
,
J.
,
Cuyugan
,
L.
,
Aldrich
,
J.
,
Kurdoglu
,
A.
,
Adkins
,
J.
,
Serrano
,
G.
,
Beach
,
T.G.
,
Craig
,
D.W.
,
Valla
,
J.
,
Reiman
,
E.M.
and
Liang
,
W.S.
(
2015
)
Alzheimer’s disease is associated with altered expression of genes involved in immune response and mitochondrial processes in astrocytes
.
Neurobiol. Aging
,
36
,
583
591
.

[36]

Mena
,
P.R.
,
Kunkle
,
B.W.
,
Faber
,
K.M.
,
Adams
,
L.D.
,
Inciute
,
J.D.
,
Whitehead
,
P.L.
,
Foroud
,
T.
,
Reyes-Dumeyer
,
D.
,
Kuzma
,
A.
,
Naj
,
A.
 et al. (
2021
)
The Alzheimer’s disease sequencing project follow up study (ADSP-FUS): Increasing ethnic diversity in Alzheimer’s genetics research with the addition of potential new cohorts
.
Alzheimers. Dement.
,
17
,
e056101
.

[37]

Sirugo
,
G.
,
Williams
,
S.M.
and
Tishkoff
,
S.A.
(
2019
)
The missing diversity in human genetic studies
.
Cell
,
177
,
1080
1080
.

[38]

Knutson
,
K.A.
,
Deng
,
Y.
and
Pan
,
W.
(
2020
)
Implicating causal brain imaging endophenotypes in Alzheimer’s disease using multivariable IWAS and GWAS summary data
.
NeuroImage
,
223
,
117347
117347
.

[39]

Zhang
,
D.
and
Lin
,
X.
(
2003
)
Hypothesis testing in semiparametric additive mixed models
.
Biostatistics
,
4
,
57
74
.

[40]

Davies
,
R.B.
(
1980
)
Algorithm as 155: The distribution of a linear combination of χ2 random variables
.
J. R. Stat. Soc. Ser. C. Appl. Stat.
,
29
,
323
333
.

[41]

Liu
,
H.
,
Tang
,
Y.
and
Zhang
,
H.
(
2009
)
A new chi-square approximation to the distribution of non-negative definite quadratic forms in non-central normal variables
.
Comp. Stats. and Data Analysis
,
53
,
853
856
.

[42]

Wang
,
K.
(
2016
)
Boosting the power of the sequence kernel association test by properly estimating its null distribution
.
Am. J. Hum. Genet.
,
99
,
104
114
.

[43]

Chen
,
J.
,
Chen
,
W.
,
Zhao
,
N.
,
Wu
,
M.C.
and
Schaid
,
D.J.
(
2016
)
Small sample kernel association tests for human genetic and microbiome association studies
.
Genet. Epi.
,
41
,
5
19
.

[44]

Lumley
,
T.
,
Brody
,
J.
,
Peloso
,
G.
,
Morrison
,
A.
and
Rice
,
K.
(
2018
)
Fastskat: Sequence kernel association tests for very large sets of markers
.
Genet. Epi.
,
42
,
516
527
.

[45]

Sun
,
J.
,
Zheng
,
Y.
and
Hsu
,
L.
(
2013
)
A unified mixed effects model for rare variant association in sequencing studies
.
Genet. Epi.
,
37
,
334
344
.

[46]

Koziol
,
J.
and
Perlman
,
M.
(
1978
)
Combining independent chi-square tests
.
J. Amer. Stat. Asssoc.
,
73
,
753
763
.

[47]

Su
,
Y.-R.
,
Di
,
C.
,
Bien
,
S.
,
Huang
,
L.
,
Dong
,
X.
,
Abecasis
,
G.
,
Berndt
,
S.
,
Bezieau
,
S.
,
Brenner
,
H.
,
Caan
,
B.
 et al. (
2018
)
A mixed-effects model for powerful association tests in integrative functional genomics
.
Am. J. Hum. Genet.
,
102
,
904
919
.

[48]

Barbeira
,
A.N.
,
Pividori
,
M.D.
,
Zheng
,
J.
,
Wheeler
,
H.E.
,
Nicolae
,
D.L.
and
Im
,
H.K.
(
2019
)
Integrating predicted transcriptome from multiple tissues improves association detection
.
PLoS Genet.
,
15
,
e1007889
.

[49]

Knutson
,
K.
and
Pan
,
W.
(
2020
)
Integrating brain imaging endophenotypes with GWAS for Alzheimer’s disease
.
Quant. Bio., 9, 185–200
.

[50]

Galinsky
,
K.J.
,
Loh
,
P.-R.
,
Mallick
,
S.
,
Patterson
,
N.J.
and
Price
,
A.L.
(
2016
)
Population structure of UK Biobank and ancient Eurasians reveals adaptation at genes influencing blood pressure
.
Am. J. Hum. Genet.
,
99
,
1130
1139
.

[51]

Haworth
,
S.
,
Mitchell
,
R.
,
Corbin
,
L.
,
Wade
,
K.H.
,
Dudding
,
T.
,
Budu-Aggrey
,
A.
,
Carslake
,
D.
,
Hemani
,
G.
,
Paternoster
,
L.
,
Smith
,
G.D.
 et al. (
2019
)
Apparent latent structure within the UK Biobank sample has implications for epidemiological analysis
.
Nat. Commun.
,
10
,
333
333
.

[52]

Cook
,
J.P.
,
Mahajan
,
A.
and
Morris
,
A.P.
(
2020
)
Fine-scale population structure in the UK Biobank: Implications for genome-wide association studies
.
Hum. Mol. Genet.
,
29
,
2803
2811
.

[53]

Wang
,
C.
,
Zhan
,
X.
,
Liang
,
L.
,
Abecasis
,
G.
and
Lin
,
X.
(
2015
)
Improved ancestry estimation for both genotyping and sequencing data using projection procrustes analysis and genotype imputation
.
Am. J. Hum. Genet.
,
96
,
926
937
.

[54]

Wen
,
J.
,
Xie
,
M.
,
Rowland
,
B.
,
Rosen
,
J.D.
,
Sun
,
Q.
,
Chen
,
J.
,
Tapia
,
A.L.
,
Qian
,
H.
,
Kowalski
,
M.H.
,
Shan
,
Y.
 et al. (
2021
)
Transcriptome-wide association study of blood cell traits in African ancestry and Hispanic/Latino populations
.
G. E. N.
,
12
,
1049
.

[55]

Dimitromanolakis
,
A.
,
Xu
,
J.
,
Krol
,
A.
and
Briollais
,
L.
(
2019
)
sim1000g: A user-friendly genetic variant simulator in r for unrelated individuals and family-based designs
.
BMC Bioinform.
,
20
,
26
26
.

[56]

He
,
R.
,
Xue
,
H.
and
Pan
,
W.
(
2022
)
Statistical power of transcriptome-wide association studies
.
Genet. Epi
. Advance online publication.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)

Supplementary data