## Abstract

Meta-analysis strategies have become critical to augment power of genome-wide association studies (GWAS). To reduce genotyping or sequencing cost, many studies today utilize shared controls, and these individuals can inadvertently overlap among multiple studies. If these overlapping individuals are not taken into account in meta-analysis, they can induce spurious associations. In this article, we propose a general framework for adjusting association statistics to account for overlapping subjects within a meta-analysis. The key idea of our method is to transform the covariance structure of the data, so it can be used in downstream analyses. As a result, the strategy is very flexible and allows a wide range of meta-analysis methods, such as the random effects model, to account for overlapping subjects. Using simulations and real datasets, we demonstrate that our method has utility in meta-analyses of GWAS, as well as in a multi-tissue mouse expression quantitative trait loci (eQTL) study where our method increases the number of discovered eQTL by up to 19% compared with existing methods.

## Introduction

Meta-analysis of genome-wide association studies (GWAS) has become crucial to scale up genetic discovery (1,2). Using meta-analysis, investigators are able to combine many studies to dramatically increase sample size and thereby to increase statistical power to detect phenotypically associated variants. In recent years, large-scale meta-analyses have become essential to increase the number of known associated loci in many human diseases (3,4). Indeed, almost all GWAS publications today employ meta-analysis methodologies (5).

A range of meta-analysis methods are now commonly used in GWAS (6). The fixed effects (FE) model, which assumes that effect sizes are fixed across studies, is powerful when its assumption holds true (7,8). The random effects (RE) model assumes that the effect sizes can be different between studies, a phenomenon called heterogeneity (9). We recently proposed a RE model that is more powerful than FE if the data are heterogeneous (10). Additional categories of meta-analysis methods include the *P*-value-based approaches (11,12); the subset approaches, which assume that the effects are present or absent in studies (13,14); and the Bayesian approaches (15–17). The majority of these methods assume that component studies are independent and that the individuals collected in one study are separate and unrelated from those in other studies.

Many studies today utilize shared controls to reduce genotyping or sequencing cost. For example, the 1958 British Birth Cohort (18) is being widely used in recent GWAS (19–23). As a result, there can be inadvertent overlap between individuals across multiple studies. The resulting dependencies can induce spurious associations if the overlapping subjects are not explicitly removed prior to meta-analysis or taken into account during meta-analysis. A simple solution would be to manually split the overlapping subjects into separate studies, but this can be suboptimal in terms of statistical power (24). Moreover, splitting genotype data is often not possible in practice if the primary genotype data are not shared; this is a common situation in current GWAS due to the sensitivities of sharing individual-level data.

Previous studies addressed this challenge, but only for specific meta-analysis models. Lin and Sullivan (24) proposed a FE meta-analysis approach that takes overlapping subjects into account. Zaykin and Kozbur (12) and Bhattacharjee *et al.* (13) extended this method to the *P*-value-based approach and the subset approach, respectively. None of these approaches can account for dependency structure in other meta-analysis frameworks such as the RE models. Application of the RE models to meta-analysis has recently become crucial because of their potential to go beyond standard study designs. Investigators are now increasingly using meta-analysis to combine data from multiple related diseases in order to identify shared pathways (25,26), to combine data from multiple tissues to identify shared expression quantitative trait loci (eQTL) (27,28) and to combine neuroimaging data of different psychiatric diseases to identify shared brain features (29). In these new study designs, the use of RE models is ideal because the effect sizes are not necessarily the same between datasets. However, if dependency structure exists in the data, the current RE models cannot be applied due to the independence assumption.

Another practical challenge in using the current solutions to dependency structure is that one cannot use widely used pipelines for GWAS meta-analysis. Many analysis pipelines utilize standard software packages implementing meta-analysis models (2,30). These packages have advantages in solving practical issues in data cleaning, quality control and imputation (2). However, many of these do not support overlapping subjects and therefore cannot be used if subjects are shared between studies.

In this article, we propose a flexible and general framework for meta-analyzing dependent studies. The philosophy of our framework is unique, in that our framework does not compete with existing methods; it enables them to account for overlapping subjects. Given dependent studies, our framework transforms the covariance structure of the data, so it can be used in methods that strictly assume independence between studies. Specifically, our method *decouples* dependent studies into independent studies, adjusting association statistics to account for uncertainties in dependent studies. As a result, our approach enables general meta-analysis methods, including the FE and RE models, to account for overlapping subjects. Existing pipelines implementing these models can be reused for analyzing dependent studies, as long as our framework is applied as a front end of the analysis procedure.

We applied our approach to the eQTL data of Sul *et al.* (27), which involves data from multiple mouse tissues. Using our framework to combine information from multiple tissues and to account for tissue heterogeneity with the RE model, we increased the number of discovered eQTL by up to 19% compared with existing methods. We also performed a meta-analysis of three autoimmune diseases with shared controls using the Wellcome Trust Case Control Consortium (WTCCC) data (18) to demonstrate that our framework improved statistical significance of known candidate loci associated to multiple autoimmune diseases.

## Results

### Overview of the method

We propose a general approach that accounts for the dependency structure between studies in a meta-analysis. As we review in the Materials and Methods section, many traditional as well as recently proposed meta-analysis methods cannot account for overlapping subjects (Supplementary Material, Table S1). Our goal was to develop a framework that could be generally applied to existing approaches, so that methods not yet accounting for dependency could, and so that existing software packages could be reused. To this end, we can transform the covariance structure of the data by *decoupling* dependent studies into independent studies. Specifically, to account for uncertainties between studies that are dependent, we increase the standard errors in the decoupled studies. In Figure 1, we have three studies, A, B and C, whose statistics are correlated and have unit variances. Our approach decouples these studies into independent studies that have the same information with respect to the summary statistic. The variances of A, B and C increase to 1.39, 2.52 and 1.24, respectively. The variance of Study B increases the most drastically to 2.52 because B was highly correlated to A and C, with correlations of 0.5 and 0.3, respectively. High correlations to other studies imply that this study has the smallest information content as an independent study. We can then use the decoupled studies in the downstream meta-analysis approaches.
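A minimal numerical sketch of this example (our own illustration): treating the weights $\Omega^{-1}\mathbf{e}$ of the optimally combined statistic as inverse variances of independent studies reproduces the variances quoted above. The A–C correlation of 0.1 is an assumption on our part, as the text does not state it.

```python
import numpy as np

# Covariance of studies A, B, C: unit variances with the stated
# correlations (A-B = 0.5, B-C = 0.3); A-C = 0.1 is assumed here.
omega = np.array([[1.0, 0.5, 0.1],
                  [0.5, 1.0, 0.3],
                  [0.1, 0.3, 1.0]])

# Decoupling: the weights of the optimal combined statistic are
# w = Omega^{-1} e; treating each study as independent with inverse
# variance w_i leaves the summary statistic and its variance unchanged.
e = np.ones(3)
w = np.linalg.solve(omega, e)
decoupled_var = 1.0 / w

print(np.round(decoupled_var, 2))  # [1.39 2.52 1.24]
```

Note how Study B, which is correlated with both other studies, receives the largest decoupled variance, matching the intuition that it carries the least independent information.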

### False-positive rate

One of the primary goals of decoupling is to prevent spurious associations that can occur if we ignore overlapping subjects in a meta-analysis. Thus, we performed simulations to compare the false-positive rates of three methods: (a) a naive method that does not account for overlapping subjects, (b) a splitting method that splits shared controls among the individual studies and (c) a decoupling method that systematically accounts for overlapping subjects. We simulated genotypes assuming a SNP with minor allele frequency (MAF) *p* = 0.3. We constructed five studies whose numbers of cases were 1000, 2000, 3000, 4000 and 5000, and assumed that these studies share 10 000 controls. In the splitting method, we split the controls equally, allotting 2000 controls per study. Under the null hypothesis of no association of the SNP to the trait, we generated 1 million such study sets. We applied the FE model to all three methods.

Table 1 shows that both decoupling and splitting correctly control the false-positive rate. However, the naive method that ignores overlapping subjects does not. We calculated the ratios between the estimated false-positive rates and the significance threshold *α*. The ratios for decoupling and splitting are consistently close to 1.0 (mean ratio over different thresholds, 0.98 for decoupling and 1.08 for splitting). In contrast, the mean ratio for the naive method is very large (6.65). In fact, the ratio becomes larger as the threshold becomes more significant, reaching a maximum of 13.56 at *α* = 5 × 10^{−5}. This phenomenon emphasizes that ignoring overlapping subjects will incur a large number of false positives at the extreme threshold of 5 × 10^{−8} that is typically used in GWAS. We also applied the RE model of Han and Eskin (10) to the three methods. Decoupling controlled the false-positive rate conservatively (mean ratio of 0.70), and the naive method again had an inflated false-positive rate (mean ratio of 5.40; see Supplementary Material, Table S2). Additionally, to explore a wider range of realistic scenarios, instead of fixing the sample sizes of the five studies, we randomly sampled them. We sampled the case sample size of each study from N(3000, 1000^{2}) truncated below at 500, and assumed study-specific control samples whose sizes were sampled from the same distribution. We again assumed 10 000 shared controls and similarly repeated 1 million null simulations. Supplementary Material, Table S3 shows that the results are largely unchanged: decoupling and splitting controlled the false-positive rate well, and decoupling RE showed a similar conservative tendency (mean ratio of 0.72).
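The inflation of the naive method can also be seen analytically. Below is our own sketch assuming the score-test covariance structure for shared controls: the variance of a study's statistic is proportional to $1/n_{i1} + 1/n_0$ and the covariance between two studies to $1/n_0$, where the shared constant of proportionality cancels.

```python
import numpy as np

# Case sample sizes of the five simulated studies and the shared controls.
n_cases = np.array([1000, 2000, 3000, 4000, 5000], dtype=float)
n0 = 10000.0

# Up to a common constant, Var(X_i) is proportional to 1/n_i1 + 1/n0 and
# Cov(X_i, X_j) to 1/n0, because the control allele frequency is shared.
V = 1.0 / n_cases + 1.0 / n0
omega = np.full((5, 5), 1.0 / n0)
np.fill_diagonal(omega, V)

# The naive FE z-score uses inverse-variance weights W and assumes
# independence, so its true variance is W' Omega W / sum(W) instead of 1.
W = 1.0 / V
var_naive_z = W @ omega @ W / W.sum()
print(round(var_naive_z, 2))  # 1.85: the naive z-score is inflated

# Decoupling uses the weights Omega^{-1} e, whose variance is exactly 1.
w_dec = np.linalg.solve(omega, np.ones(5))
var_dec_z = w_dec @ omega @ w_dec / w_dec.sum()
print(round(var_dec_z, 2))  # 1.0
```

A z-score variance of about 1.85 instead of 1 is consistent with the roughly 2-fold inflation of the naive false-positive rate at *α* = 0.05 in Table 1.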

| Threshold *α* | 0.05 | 0.01 | 0.005 | 0.001 | 5 × 10^{−4} | 1 × 10^{−4} | 5 × 10^{−5} |
|---|---|---|---|---|---|---|---|
| Decoupling | 0.0496 (0.99) | 0.00997 (1.00) | 0.00499 (1.00) | 0.000992 (0.99) | 0.0005 (1.00) | 8.7 × 10^{−5} (0.87) | 4.9 × 10^{−5} (0.98) |
| Splitting | 0.0499 (1.00) | 0.01 (1.00) | 0.00501 (1.00) | 0.00104 (1.04) | 0.000539 (1.08) | 0.000123 (1.23) | 5.9 × 10^{−5} (1.18) |
| Naive | 0.1 (2.00) | 0.0308 (3.08) | 0.0186 (3.72) | 0.0059 (5.90) | 0.00354 (7.09) | 0.00112 (11.20) | 0.000678 (13.56) |


Naive refers to a method that ignores overlapping subjects. Splitting refers to a strategy that splits shared controls into individual studies. All three methods used the FE model. The numbers in parentheses are the ratios of estimated false-positive rates and the threshold *α*. We expect these numbers to be close to 1.0.

### Power

Decoupling can allow the effective use of shared controls to increase power, compared with splitting individuals into separate studies. Using simulations, we compared the power of the decoupling and splitting methods. Among the three methods that we evaluated for false-positive rate, we excluded the naive method because it did not control the false-positive rate properly. We assumed the same numbers of cases and shared controls as in the false-positive rate simulations above (unequal case sample sizes from 1000 to 5000, and 10 000 shared controls, equally split in the splitting method). However, this time, we assumed that the SNP had a genetic effect on the trait with relative risk *γ*. Given population MAF *p*, the MAF in cases becomes

$$p' = \frac{\gamma p}{(\gamma - 1)\,p + 1}.$$

Based on this frequency, we generated 10 000 simulation sets, each consisting of five case/control studies. In each set, we calculated the FE model *P*-values for both decoupling and splitting. We approximated power as the proportion of sets in which the *P*-value was more significant than the threshold of 5 × 10^{−8}. We gradually increased *γ* and evaluated the power of the decoupling and splitting methods. Figure 2 shows that decoupling consistently outperforms splitting at all the relative risks we tested. For example, at relative risk 1.13, the power of decoupling is 78.2%, whereas the power of splitting is 70.4%. This result is in line with Lin and Sullivan's (24) observation that their approach (which is equivalent to decoupling applied to FE) outperforms splitting, unless controls are split into studies proportionally to case sample sizes, in which case both methods have similar power.
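The genotype-generation step can be sketched as a small helper. Under a multiplicative (per-allele) risk model, relative risk *γ* shifts the population MAF *p* to $\gamma p/((\gamma-1)p+1)$ in cases; the function name is ours.

```python
def case_maf(gamma, p):
    """MAF in cases for relative risk gamma and population MAF p,
    under a multiplicative (per-allele) risk model."""
    return gamma * p / ((gamma - 1.0) * p + 1.0)

# With no effect (gamma = 1) the case MAF equals the population MAF.
print(case_maf(1.0, 0.3))             # 0.3
# The relative risk of 1.13 used in the power comparison shifts it slightly.
print(round(case_maf(1.13, 0.3), 4))  # 0.3263
```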

One advantage of decoupling is that it allows a wide range of meta-analysis methods to account for overlapping subjects, including the RE model. Using simulations, we demonstrated that we can gain power by using decoupling with the RE model in situations where effect sizes are heterogeneous between studies. Assuming the same simulation framework as in the previous power simulations, we additionally assumed that only two of the five studies have genetic effects, with relative risk 1.2; specifically, the two studies with the largest case sample sizes (4000 and 5000). After decoupling, we applied both the FE model and Han and Eskin's (10) RE model. We also used the splitting method and applied FE and RE. Figure 3 shows that in this situation, RE gains power over FE. When applied after decoupling, the power of RE is 86%, whereas the power of FE is 61%. Thus, we gained 25 percentage points of power, because decoupling allowed the application of RE to dependent studies. Splitting has suboptimal power in this simulation as well: when applied after splitting, the power of RE and FE is only 51% and 14%, respectively. Note that the power difference between decoupling FE and splitting FE is even more prominent in this simulation than in Figure 2, because of our design that places effects in the two studies with the largest case sample sizes. Since we split controls equally in our simulation, which is equivalent to down-weighting the studies with large case sample sizes, the power drop of splitting relative to decoupling is dramatic in this design.

### Cross-disease analysis using the WTCCC data

One application of the decoupling approach is cross-disease analysis, in which controls are shared between diseases. Because it is reasonable to assume that effect size varies between different diseases, it is ideal to use the RE model. Thus, in this design, decoupling can help because it allows the application of RE models when studies share controls. As an example of cross-diseases analysis, we applied our decoupling approach to the WTCCC data. The WTCCC has performed GWAS on seven diseases, including three autoimmune diseases (Crohn's disease, rheumatoid arthritis and type 1 diabetes; CD, RA and T1D in short), using a set of shared controls (18). We performed meta-analysis on the three autoimmune diseases to identify shared loci between them.

We compared three methods: the naive method that ignores the fact that the studies share controls, the splitting method that splits the controls among the diseases and the decoupling method. For all three methods, we applied both the FE model and Han and Eskin's (10) RE model. In the splitting method, each control individual was assigned to one of the three diseases with equal probability. We considered 397 450 SNPs that passed quality control for all three diseases in the original publication (18) and whose MAF is greater than 1%.

Supplementary Material, Figure S1 shows that the qq-plot is inflated for the naive method, which ignores overlapping subjects, for both FE and RE. The genomic control factors (31) were $\lambda_{FE} = 1.86$ and $\lambda_{RE} = 1.62$. In contrast, the splitting and decoupling methods did not show such inflation: the genomic control factors were $\lambda_{FE} = 1.05$ and $\lambda_{RE} = 1.07$ for splitting, and $\lambda_{FE} = 1.05$ and $\lambda_{RE} = 0.82$ for decoupling.

We then examined the *P*-values of three non-MHC loci that were originally reported in the WTCCC analyses as candidates for shared loci between autoimmune diseases. Both splitting and decoupling gave significant *P*-values at those loci (*P*< 1.2 × 10^{−7}, Bonferroni corrected for 397 450 tests, see Table 2). However, decoupling gave much more significant *P*-values than splitting did. For example, at *PTPN22*, the decoupling RE *P*-value (1.00 × 10^{−29}) is about seven orders of magnitude more significant than the splitting RE *P*-value (1.88 × 10^{−22}). At these loci, RE gave more significant *P*-values than FE did for both splitting and decoupling (Table 2 and Supplementary Material, Fig. S2A). This is due to the effect size heterogeneity at these loci between different diseases.

| Gene | RSID | Chr. | Position | Splitting FE | Splitting RE | Decoupling FE | Decoupling RE |
|---|---|---|---|---|---|---|---|
| *PTPN22* | rs6679677 | 1 | 114 015 850 | 1.45 × 10^{−13} | 1.88 × 10^{−22} | 4.48 × 10^{−23} | 1.00 × 10^{−29} |
| *SH2B3* | rs17696736 | 12 | 110 949 538 | 1.93 × 10^{−9} | 1.18 × 10^{−9} | 7.72 × 10^{−10} | 3.40 × 10^{−10} |
| *PTPN2* | rs2542151 | 18 | 12 769 947 | 1.14 × 10^{−7} | 9.03 × 10^{−8} | 3.74 × 10^{−8} | 5.16 × 10^{−8} |

The last four columns are meta-analysis *P*-values.


The three non-MHC loci reported in the WTCCC combined analysis (18) are presented. The candidate gene for each locus was obtained from the ImmunoBase website (www.immunobase.org).

To test the robustness of the methods against noise, we added the other four diseases as noise into the meta-analysis. Supplementary Material, Figure S2B shows that when we meta-analyze all seven diseases, the absolute significance at *PTPN22* is reduced, but the locus remains significant for both decoupling FE and decoupling RE. Moreover, the relative significance gain of decoupling RE over decoupling FE remains pronounced (*PTPN22*: $P_{FE} = 1 \times 10^{-11}$ and $P_{RE} = 9 \times 10^{-18}$).

### Multi-tissue eQTL analysis

Recently, Sul *et al.* (27) applied our decoupling approach to multi-tissue eQTL analysis. Their goal was to combine information from multiple tissues to increase power for eQTL identification. They proposed a linear mixed model that accounts for multiple tissues being collected from the same individuals. With this approach, however, the resulting effect size estimates to be combined become correlated, which prevents the application of standard meta-analysis methods. Moreover, it is reasonable to assume effect size heterogeneity between tissues, so decoupling with the RE model can assist the analysis. In their study, Sul *et al.* (27) compared the performance of decoupling FE, decoupling RE and the tissue-by-tissue approach (TBT) in terms of the number of eQTL identified. In TBT, an eQTL is called significant if any of the *q*-values in the set of 4 or 10 tissues is smaller than the false discovery rate (FDR) threshold. They demonstrated that applying decoupling with FE found many more eQTL in 4 or 10 mouse tissues than TBT, and applying decoupling with RE found even more at the same FDR level of 5%.

We revisited this dataset and examined whether the relative performances between methods remained unchanged if we changed the FDR threshold. See Sul *et al.* (27) for a detailed description of the dataset. When we varied the FDR threshold down to 0.0001 and up to 0.2, the relative performances of the methods did not change (Fig. 4). For example, at the threshold of FDR = 0.0001, TBT, decoupling FE and decoupling RE discovered 61, 718 and 780 eQTL, respectively, in 10 tissues. At the threshold of FDR = 0.2, TBT, decoupling FE and decoupling RE discovered 3870, 4914 and 5855 eQTL, respectively, in 10 tissues. Thus, decoupling RE improved the number of discovered eQTL by up to 19% compared with the best competing method (decoupling FE). These observations underscore that in multi-tissue eQTL analysis, it is important to take into account both the between-tissue heterogeneity and the between-tissue correlations, and that the decoupling RE approach can handle both of them effectively.

### Simulations of decouplable matrices

We define the covariance matrix of the studies in a meta-analysis as *decouplable* if the decoupled variances are non-negative. In the Materials and Methods section, we analytically demonstrate that when the number of studies is two and the two studies share controls, the covariance matrix is always decouplable. To demonstrate that the covariance matrix is highly likely to be decouplable for more than two studies sharing controls, we performed extensive simulations. We assumed a situation where 10 studies share $n_0$ controls, with the same MAF of 0.3 and odds ratio of 1.2 for all studies. Then, to simulate differences in sample sizes between studies, we made the number of cases large in five studies and small in the other five. Specifically, the number of cases of study *i* was $n_{i1} = \mu + \delta$ for five studies and $n_{i1} = \mu - \delta$ for the other five. We then varied $n_0$ from 1000 to 20 000 and *δ* from 0 to 4500 with *μ* fixed at 5000. For all configurations we tested, the matrices were decouplable: Supplementary Material, Figure S3 shows that the percentage of studies that have negative variance is consistently zero. This empirical result demonstrates that in most practical and common situations where studies share controls, the matrices are highly likely to be decouplable.
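The decouplability check in this simulation can be sketched as follows (our own sketch, again assuming the shared-control score-test covariance: variance proportional to $1/n_{i1} + 1/n_0$, covariance to $1/n_0$). A configuration is decouplable when all entries of $\Omega^{-1}\mathbf{e}$ are positive.

```python
import numpy as np

def decouplable(n_cases, n0):
    """Return True if the shared-control covariance matrix is decouplable,
    i.e. all decoupled inverse variances Omega^{-1} e are positive."""
    n = len(n_cases)
    V = 1.0 / np.asarray(n_cases, dtype=float) + 1.0 / n0
    omega = np.full((n, n), 1.0 / n0)
    np.fill_diagonal(omega, V)
    w = np.linalg.solve(omega, np.ones(n))
    return bool(np.all(w > 0))

# Mimic the grid of the simulation: 10 studies, 5 large and 5 small.
mu = 5000
for n0 in (1000, 5000, 10000, 20000):
    for delta in (0, 1500, 3000, 4500):
        sizes = [mu + delta] * 5 + [mu - delta] * 5
        assert decouplable(sizes, n0)
print("all configurations decouplable")
```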

## Discussion

We proposed a general framework for dealing with dependency structure in a meta-analysis. Our approach is philosophically different from other approaches, in that our method transforms the covariance structure of the data so that it can be used in other approaches that assume independence. Therefore, our approach flexibly allows many meta-analysis methods, such as the RE model, to account for dependency between studies and overlapping subjects. The simulations and applications to the WTCCC data support the utility of our approach. Moreover, our method allows the use of standard software packages designed for meta-analysis of GWAS, even when studies share subjects. Our method can be used beyond the standard meta-analysis designs, such as for multi-tissue eQTL analysis.

In theory, the covariance matrix in our method can be non-decouplable (the decoupled variances can be negative). As we showed analytically and empirically, for the common and typical situations where controls are shared, the matrices are likely to be decouplable. Nevertheless, a non-decouplable covariance matrix is certainly possible, for example because of numerical instabilities when estimating the variances of effect size estimates. In such situations, we can use a heuristic of dropping the least informative (smallest) studies until all decoupled variances become positive. If the test then turns out to be moderately significant, we can revisit the data and apply the standard Lin and Sullivan (24) approach.

Our method builds upon the Lin and Sullivan (24) approach. Our method is optimal in the same sense that the Lin and Sullivan (24) approach is optimal for the FE model. Thus, the optimality with respect to power is guaranteed only under the FE model. Our false-positive rate simulations demonstrated that our approach has some conservative tendencies under the RE model. However, we showed that the decoupling RE approach has much higher power than the decoupling FE approach (which is equivalent to the Lin and Sullivan (24) approach) in both simulations and in real data, showing that decoupling can be a flexible and useful tool despite the fact that it might not be optimal for all methods. Optimal solutions to account for dependency structure in each meta-analysis method will be an interesting and important topic for further research.

In this article, we characterized the false-positive rate of methods such as decoupling RE only through simulations. Obviously, the value of simulations is limited to the conditions chosen. Nevertheless, we explored a wide range of situations by using randomly chosen sample sizes as well as fixed sample sizes (Supplementary Material, Table S3), so that our simulations represent realistic scenarios. In future studies, we expect that the analytical characteristics of decoupling can be further investigated, which may establish theoretical connections between decoupling and meta-analysis methods other than FE.

We note that one should be careful in interpreting data based on decoupled studies. For example, heterogeneity tests such as Cochran's *Q* test (1) are highly conservative when using decoupled studies and may not detect true heterogeneity. Unfortunately, there is no good alternative to Cochran's *Q* test and the $I^2$ estimate (32) that can take correlated data into account. Developing such methods will also be an interesting future research area.

## Materials and Methods

### Review of meta-analysis methods

To better discuss our proposed analysis method, we first briefly describe several existing meta-analysis methods.

#### Fixed effects model

The FE model assumes that the magnitude of the effect size is fixed across studies (7,8). Two widely used FE methods are the inverse-variance-weighted effect size method (1) and the weighted sum of *z*-scores method (2). Because these two methods are approximately equivalent (10), we only describe the inverse-variance-weighted effect size method. Let $X_1, \ldots, X_N$ be the effect size estimates in *N* independent studies (such as the log odds ratios or regression coefficients). Let $SE(X_i)$ be the standard error of $X_i$, let $V_i = SE(X_i)^2$, and let $W_i = V_i^{-1}$ be the inverse variance. The inverse-variance-weighted summary effect size is

$$X_{FE} = \frac{\sum_i W_i X_i}{\sum_i W_i}. \tag{1}$$

Because the variance of $X_{FE}$ is $\mathrm{Var}(X_{FE}) = \left(\sum_i W_i\right)^{-1}$, we can construct a summary *z*-score

$$Z_{FE} = \frac{X_{FE}}{\sqrt{\mathrm{Var}(X_{FE})}} = \frac{\sum_i W_i X_i}{\sqrt{\sum_i W_i}}$$

that follows $N(0,1)$ under the null hypothesis of no association. The *P*-value is calculated as

$$P = 2\,\Phi\!\left(-\left|Z_{FE}\right|\right),$$

where $\Phi$ is the standard normal cumulative distribution function.
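The FE computation can be sketched with the standard library only; `math.erfc` supplies the two-sided normal tail probability.

```python
import math

def fe_meta(effects, variances):
    """Inverse-variance-weighted fixed effects meta-analysis.
    Returns the summary effect, its variance, the z-score and the P-value."""
    W = [1.0 / v for v in variances]
    x_fe = sum(w * x for w, x in zip(W, effects)) / sum(W)
    var_fe = 1.0 / sum(W)
    z = x_fe / math.sqrt(var_fe)
    # Two-sided P-value: 2 * (1 - Phi(|z|)) = erfc(|z| / sqrt(2)).
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return x_fe, var_fe, z, p

# Two hypothetical studies with effects 0.2 and 0.3.
x_fe, var_fe, z, p = fe_meta([0.2, 0.3], [0.01, 0.02])
print(round(x_fe, 4), round(z, 3))  # 0.2333 2.858
```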

#### Random effects model

The RE model assumes that the effect sizes vary among studies. In this model, the effect size is assumed to follow an underlying probability distribution with mean *μ* and variance $\tau^2$. There are several approaches to estimate $\tau^2$, the most common being the moment-based estimator of DerSimonian and Laird (9). Given the estimate $\hat{\tau}^2$, the summary effect size estimate analogous to Equation (1) in FE is

$$X_{RE} = \frac{\sum_i (V_i + \hat{\tau}^2)^{-1} X_i}{\sum_i (V_i + \hat{\tau}^2)^{-1}}.$$

The variance of $X_{RE}$ is $\mathrm{Var}(X_{RE}) = \left(\sum_i (V_i + \hat{\tau}^2)^{-1}\right)^{-1}$. Then, similarly to FE, one can construct a *z*-score statistic

$$Z_{RE} = \frac{X_{RE}}{\sqrt{\mathrm{Var}(X_{RE})}}.$$

The traditional RE model cannot account for overlapping subjects.
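The moment-based DerSimonian–Laird estimator mentioned above can be sketched as:

```python
def dersimonian_laird_tau2(effects, variances):
    """Moment-based estimate of the between-study variance tau^2."""
    W = [1.0 / v for v in variances]
    x_fe = sum(w * x for w, x in zip(W, effects)) / sum(W)
    # Cochran's Q: weighted squared deviations from the FE summary.
    Q = sum(w * (x - x_fe) ** 2 for w, x in zip(W, effects))
    df = len(effects) - 1
    c = sum(W) - sum(w * w for w in W) / sum(W)
    # Truncate at zero: tau^2 cannot be negative.
    return max(0.0, (Q - df) / c)

# Homogeneous effects give tau^2 = 0; divergent effects give tau^2 > 0.
print(dersimonian_laird_tau2([0.1, 0.1, 0.1], [0.01, 0.01, 0.01]))     # 0.0
print(round(dersimonian_laird_tau2([0.0, 0.5], [0.01, 0.01]), 3))      # 0.115
```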

#### New random effects model of Han and Eskin

Han and Eskin (10) found that the traditional RE model is conservative and rarely achieves higher power than the FE model. This is because traditional RE tests $\mu \neq 0$ by constructing a *z*-score, and constructing a *z*-score is equivalent to a likelihood ratio test assuming the same heterogeneity $\tau^2$ under the null and alternative hypotheses. This is a conservative assumption because the known causes of heterogeneity in GWAS, such as gene-by-gene interactions or linkage disequilibrium, do not cause heterogeneity under the null hypothesis. Han and Eskin (10) therefore proposed a new RE model that assumes no heterogeneity under the null hypothesis. The test statistic is the likelihood ratio

$$S = -2 \log \frac{L(\mu = 0, \tau^2 = 0)}{\max_{\mu, \tau^2} L(\mu, \tau^2)}, \qquad L(\mu, \tau^2) = \prod_i \frac{1}{\sqrt{2\pi (V_i + \tau^2)}} \exp\!\left(-\frac{(X_i - \mu)^2}{2 (V_i + \tau^2)}\right).$$

The *P*-value can be calculated using the asymptotic distribution, or using a pre-constructed *P*-value table for a more accurate calculation that accounts for the small number of observations. The RE model of Han and Eskin (10) cannot account for overlapping subjects.
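A sketch of this likelihood ratio statistic is below; for simplicity, the maximization is a coarse grid search (the actual method uses proper optimization and a tabulated null distribution), so this is illustrative only.

```python
import math

def han_eskin_stat(effects, variances, mu_grid=None, tau2_grid=None):
    """Likelihood ratio statistic comparing (mu, tau^2) = (0, 0) against a
    freely estimated mean and heterogeneity, maximized by grid search."""
    def loglik(mu, tau2):
        return sum(-0.5 * math.log(2 * math.pi * (v + tau2))
                   - (x - mu) ** 2 / (2 * (v + tau2))
                   for x, v in zip(effects, variances))
    mu_grid = mu_grid or [i / 100.0 for i in range(-100, 101)]
    tau2_grid = tau2_grid or [i / 100.0 for i in range(0, 101)]
    best = max(loglik(m, t) for m in mu_grid for t in tau2_grid)
    return 2.0 * (best - loglik(0.0, 0.0))

# No effect and no heterogeneity: the statistic is zero.
print(round(han_eskin_stat([0.0, 0.0, 0.0], [0.01, 0.02, 0.01]), 6))  # 0.0
```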

#### *P*-value-based approaches

There exist meta-analysis approaches that combine *P*-values instead of effect sizes, the most traditional being Fisher's method (11). Fisher's method combines the *P*-values $p_1, \ldots, p_N$ by constructing the statistic

$$S_{Fisher} = -2 \sum_{i=1}^{N} \log p_i,$$

which follows a $\chi^2$ distribution with $2N$ degrees of freedom under the null hypothesis. An advantage of *P*-value-based approaches is that they can be used even without knowing the directions of effect. Fisher's method strictly assumes independence between studies and cannot account for overlapping subjects. However, a recently proposed *P*-value-based approach by Zaykin and Kozbur (12) can take overlapping subjects into account.
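Fisher's method can be sketched with the standard library; for even degrees of freedom ($2N$), the χ² survival function has a closed form, so no statistics package is needed.

```python
import math

def fisher_method(pvalues):
    """Fisher's combined test: S = -2 * sum(log p_i) ~ chi^2 with 2N df."""
    s = -2.0 * sum(math.log(p) for p in pvalues)
    # Chi-square survival function with even df 2k:
    # sf(x, 2k) = exp(-x/2) * sum_{i<k} (x/2)^i / i!
    k = len(pvalues)
    half = s / 2.0
    tail = math.exp(-half) * sum(half ** i / math.factorial(i) for i in range(k))
    return s, tail

s, p = fisher_method([0.1, 0.2, 0.3])
print(round(s, 2), round(p, 3))  # 10.23 0.115
```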

#### Subset approaches

Subset approaches are similar to the FE model, but they differ in that they assume the effects can exist in only a subset of the studies. Han and Eskin (10) computed a statistic called the ‘*m*-value’, a posterior probability that the effect is present in a study. They incorporated *m*-values into the weighted *z*-score FE approach to upweight studies with high *m*-values. The *P*-value is assessed using importance sampling. This method cannot account for overlapping subjects. Bhattacharjee *et al.* (13) proposed an approach that computes FE statistics using all possible subsets of studies in a meta-analysis and uses the maximum statistic. The method expedites the enumeration of all $2^N - 1$ possible subsets of *N* studies by using a novel statistical technique. This method can account for overlapping subjects.

#### Bayesian approaches

Morris (15) proposed a Bayesian approach optimized for trans-ethnic meta-analysis. This method utilizes the Markov Chain Monte Carlo (MCMC) procedure to navigate through possible disease models. In his MCMC sampling procedure, closely related populations are given a higher chance of having similar effect sizes, which increases the statistical power to detect heterogeneity caused by the population spectrum. This method cannot account for overlapping subjects. Wen and Stephens (16) proposed a new method that takes the heterogeneity in the data into account by using a hierarchical model in the Bayesian framework. Wen (17) recently extended this method to a model selection approach in complex linear systems, which can account for overlapping subjects.

### Definition of dependency

In this article, we often describe two studies in a meta-analysis as *dependent*, which needs to be defined. We define two studies as dependent if the statistics from those studies are correlated. The statistics can be correlated for a number of reasons, the most common being that control subjects overlap between the studies, resulting in positive correlations.

### The Lin and Sullivan approach

Lin and Sullivan (24) proposed a systematic approach for dealing with overlapping subjects in the FE model. Although only applicable to the FE model, this approach was an innovative breakthrough that provided a foundation for methods that account for dependency in meta-analysis. Recently, Zaykin and Kozbur (12) and Bhattacharjee *et al.* (13) extended this approach to the *P*-value-based approach and the subset approach, respectively. Our proposed approach also builds upon this method.

The first step of their approach is to analytically calculate the correlation structure of the statistics $X_1,\ldots,X_N$ caused by overlapping subjects, where $N$ denotes the number of studies in a meta-analysis. Let $\mathbf{x}$ be the vector of effect size estimates, $\mathbf{x}=(X_1,\ldots,X_N)$, and let

$$\mathbf{C} = \begin{pmatrix} 1 & r_{12} & \cdots & r_{1N} \\ r_{21} & 1 & \cdots & r_{2N} \\ \vdots & \vdots & \ddots & \vdots \\ r_{N1} & r_{N2} & \cdots & 1 \end{pmatrix} \tag{3}$$

be the correlation matrix of $\mathbf{x}$, where $r_{ij}$ denotes the correlation between $X_i$ and $X_j$. $r_{ij}$ is analytically approximated with the formula

$$r_{ij} \approx \frac{n_{ij1}\sqrt{\dfrac{n_{i0}\,n_{j0}}{n_{i1}\,n_{j1}}} + n_{ij0}\sqrt{\dfrac{n_{i1}\,n_{j1}}{n_{i0}\,n_{j0}}}}{\sqrt{n_i\,n_j}}, \tag{4}$$

where $n_i$ and $n_j$ are the total numbers of subjects and $n_{i1}$ and $n_{i0}$ ($n_{j1}$ and $n_{j0}$) are the numbers of cases and controls in the *i*th (or *j*th) study, respectively; $n_{ij1}$ and $n_{ij0}$ are the numbers of cases and controls that overlap between the *i*th and *j*th studies, respectively. See Bhattacharjee *et al.* (13) for an extended formula for a more complicated situation where some cases in one study are controls in other studies. Given the correlation matrix $\mathbf{C}$, it is straightforward to calculate the covariance matrix $\mathbf{\Omega}$.
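The Lin and Sullivan correlation approximation is simple to evaluate given the per-study case/control counts and the overlap counts. The following is a minimal sketch in Python/NumPy (the function name and argument layout are ours, not from the original method's software):

```python
import numpy as np

def pairwise_correlation(n_case, n_ctrl, n_shared_case, n_shared_ctrl):
    """Approximate correlation between the effect size estimates of two
    case-control studies i and j that share subjects (Lin and Sullivan).

    n_case        : (n_i1, n_j1) numbers of cases in studies i and j
    n_ctrl        : (n_i0, n_j0) numbers of controls in studies i and j
    n_shared_case : number of cases shared between the two studies
    n_shared_ctrl : number of controls shared between the two studies
    """
    ni1, nj1 = n_case
    ni0, nj0 = n_ctrl
    ni, nj = ni1 + ni0, nj1 + nj0  # total sample sizes
    return (n_shared_case * np.sqrt((ni0 * nj0) / (ni1 * nj1))
            + n_shared_ctrl * np.sqrt((ni1 * nj1) / (ni0 * nj0))) / np.sqrt(ni * nj)
```

For example, two identical studies of 1000 cases each that share all 2000 controls obtain a correlation of 1/3, while studies with no overlap obtain a correlation of zero.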

The second step of the Lin and Sullivan approach is to optimally take into account $\mathbf{\Omega}$ in the testing. The optimal FE model meta-analysis statistic is

$$X_{Lin} = \frac{\mathbf{e}^T\mathbf{\Omega}^{-1}\mathbf{x}}{\mathbf{e}^T\mathbf{\Omega}^{-1}\mathbf{e}}, \tag{5}$$

where $\mathbf{e}$ is a vector of ones ($\mathbf{e}=(1,\ldots,1)$). The formal proof for the optimality of this statistic is shown in Wei *et al.* (34) and Wei and Johnson (35). The variance of the statistic is given by

$$\mathrm{Var}(X_{Lin}) = \left(\mathbf{e}^T\mathbf{\Omega}^{-1}\mathbf{e}\right)^{-1}. \tag{6}$$

Then, the *z*-score $X_{Lin}/\sqrt{\mathrm{Var}(X_{Lin})}$ is calculated to obtain the *P*-value. Note that in the special case where $X_1,\ldots,X_N$ are independent ($\mathbf{\Omega}$ is a diagonal matrix), $X_{Lin}$ equals $X_{FE}$, and $\mathrm{Var}(X_{Lin})$ equals $\mathrm{Var}(X_{FE})$. Thus, the Lin and Sullivan (24) approach becomes equivalent to the standard FE model.
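The Lin and Sullivan statistic, its variance and the resulting *z*-score are a few lines of linear algebra. A minimal Python/NumPy sketch (the function name is ours):

```python
import numpy as np

def lin_sullivan(x, omega):
    """Optimal fixed-effects summary statistic X_Lin, its variance and
    z-score, given effect size estimates `x` (length N) with covariance
    matrix `omega` (N x N) induced by overlapping subjects."""
    e = np.ones(len(x))
    omega_inv = np.linalg.inv(omega)
    var = 1.0 / (e @ omega_inv @ e)      # Var(X_Lin) = (e^T Omega^-1 e)^-1
    x_lin = var * (e @ omega_inv @ x)    # X_Lin
    z = x_lin / np.sqrt(var)             # z-score used to obtain the P-value
    return x_lin, var, z
```

When `omega` is diagonal (independent studies), this reduces exactly to the standard inverse-variance weighted FE estimate, as noted above.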

### Decoupling approach

#### Decoupling

We propose the following decoupling transformation, which keeps the same information as the Lin and Sullivan (24) approach with respect to the summary effect size. Suppose that we are given the effect sizes **x**, the standard errors **s** and the correlation matrix **C** as computed by formula (4). Computing **C** using formula (4) is feasible in many situations, since we typically know the exact numbers of shared cases and controls. If this is not the case, an alternative is to empirically approximate **C** by calculating correlations between the genome-wide *z*-scores of the studies. This empirical approximation requires the assumption that the majority of SNPs in the genome follow the null hypothesis and have no effect, which may not be true if the trait is polygenic (36).

Given **x**, **s** and **C**, we propose the following procedure:

1. $\mathbf{x}_{Decoupled} \leftarrow \mathbf{x}$.
2. Compute the covariance matrix of $\mathbf{x}$: $\mathbf{\Omega} \leftarrow \mathrm{Diag}(\mathbf{s})\cdot\mathbf{C}\cdot\mathrm{Diag}(\mathbf{s})$, where $\mathrm{Diag}(\mathbf{s})$ denotes a diagonal matrix whose diagonals are $\mathbf{s}$.
3. Compute the decoupled covariance matrix: $\mathbf{\Omega}_{Decoupled} \leftarrow \mathrm{Diag}(\mathbf{e}^T\mathbf{\Omega}^{-1})^{-1}$.
4. $\mathbf{s}_{Decoupled}[i] \leftarrow \sqrt{\mathbf{\Omega}_{Decoupled}[i,i]}$ for each $i=1,\ldots,N$, where the brackets [ ] denote the index of an element in a vector or matrix.
5. Use $\mathbf{x}_{Decoupled}$ and $\mathbf{s}_{Decoupled}$ in the downstream meta-analysis.

In the Supplementary Materials, we present simple R code that performs this procedure.
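The procedure above translates directly into a few lines of NumPy; the supplementary R code follows the same steps. A Python sketch (variable and function names are ours):

```python
import numpy as np

def decouple(x, s, C):
    """Decouple dependent studies: return transformed effect sizes and
    standard errors that can be fed to any downstream meta-analysis.

    x : (N,) effect size estimates
    s : (N,) standard errors
    C : (N, N) correlation matrix of the estimates
    """
    x = np.asarray(x, dtype=float)
    s = np.asarray(s, dtype=float)
    # Covariance matrix: Omega = Diag(s) C Diag(s)
    omega = np.diag(s) @ C @ np.diag(s)
    # Decoupled variances: diagonal of Diag(e^T Omega^-1)^-1,
    # i.e. the reciprocals of the column sums of Omega^-1
    col_sums = np.linalg.inv(omega).sum(axis=0)
    if np.any(col_sums <= 0):
        raise ValueError("covariance matrix is not decouplable")
    # Effect sizes are unchanged; only the standard errors are updated
    return x, np.sqrt(1.0 / col_sums)
```

Feeding the decoupled values into any standard inverse-variance FE tool then reproduces the Lin and Sullivan estimate and variance, as proved in the next subsection.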

#### Relation to the Lin and Sullivan approach

Our approach has a relationship to the Lin and Sullivan (24) approach; if the decoupled studies are used in the FE model, the results are equivalent. Thus, one can use existing software packages, such as METAL (30) or MANTEL (2), to obtain optimal results after the data transformation. Below is a brief proof of this relationship.

Given an effect size vector $\mathbf{x}$ and standard errors $\mathbf{s}$, the standard FE model equations (1) and (2) can be rewritten as

$$X_{FE} = \frac{\mathbf{e}^T\mathrm{Diag}(\mathbf{s})^{-2}\mathbf{x}}{\mathbf{e}^T\mathrm{Diag}(\mathbf{s})^{-2}\mathbf{e}}, \qquad \mathrm{Var}(X_{FE}) = \left(\mathbf{e}^T\mathrm{Diag}(\mathbf{s})^{-2}\mathbf{e}\right)^{-1}. \tag{7}$$

Now, we decouple studies and put them into the standard FE model formulae. Given a covariance matrix $\mathbf{\Omega}$, our decoupling approach will calculate the updated standard errors $\mathbf{s}_{Decoupled}$ by calculating $\mathbf{\Omega}_{Decoupled}$. Note that because $\mathbf{\Omega}_{Decoupled}$ is a diagonal matrix, the following relationship holds:

$$\mathrm{Diag}(\mathbf{s}_{Decoupled})^{-2} = \mathbf{\Omega}_{Decoupled}^{-1} = \mathrm{Diag}(\mathbf{e}^T\mathbf{\Omega}^{-1}).$$

If we plug $\mathbf{x}_{Decoupled}$ and $\mathbf{s}_{Decoupled}$ into Equation (7), then

$$X = \frac{\mathbf{e}^T\mathrm{Diag}(\mathbf{e}^T\mathbf{\Omega}^{-1})\,\mathbf{x}}{\mathbf{e}^T\mathrm{Diag}(\mathbf{e}^T\mathbf{\Omega}^{-1})\,\mathbf{e}} = \frac{\mathbf{e}^T\mathbf{\Omega}^{-1}\mathbf{x}}{\mathbf{e}^T\mathbf{\Omega}^{-1}\mathbf{e}} = X_{Lin}, \qquad \mathrm{Var}(X) = \left(\mathbf{e}^T\mathbf{\Omega}^{-1}\mathbf{e}\right)^{-1} = \mathrm{Var}(X_{Lin}).$$

Thus, applying the standard FE model to the decoupled studies reproduces the Lin and Sullivan statistic and its variance.

#### Decouplable matrix

We define a *decouplable* covariance matrix as a covariance matrix whose decoupled variances are non-negative. In theory, not every matrix **Ω** is decouplable. However, in practice, in typical situations where studies share all or a subset of control samples, the covariance matrix is highly likely to be decouplable (see the Results section).

We can analytically demonstrate that the covariance matrix is always decouplable in a simple situation where two studies share a subset of controls. We assume that the two studies have the same MAFs and the same odds ratios. The correlation formula of equation (4) is

$$r = r_{ij} = \frac{n_{ij1}\sqrt{\dfrac{n_{i0}\,n_{j0}}{n_{i1}\,n_{j1}}} + n_{ij0}\sqrt{\dfrac{n_{i1}\,n_{j1}}{n_{i0}\,n_{j0}}}}{\sqrt{n_i\,n_j}}.$$

Because only controls are shared, $n_{ij1}=0$. Thus,

$$r = \frac{n_{ij0}}{\sqrt{n_i\,n_j}}\sqrt{\frac{n_{i1}\,n_{j1}}{n_{i0}\,n_{j0}}}.$$

Let $s_i$ and $s_j$ be the standard errors of the two studies. Then, the covariance matrix is

$$\mathbf{\Omega} = \begin{pmatrix} s_i^2 & s_i s_j r \\ s_i s_j r & s_j^2 \end{pmatrix}. \tag{8}$$

For the matrix to be decouplable, the column sums of $\mathbf{\Omega}^{-1}$, which up to the positive factor $1/\det(\mathbf{\Omega})$ equal $s_j^2 - s_i s_j r$ and $s_i^2 - s_i s_j r$, have to be positive. Without loss of generality, assume that $s_i \le s_j$ (study *i* is larger than study *j*). The first sum is then always positive, and the following inequality is the sufficient and necessary condition for decouplability:

$$r < \frac{s_i}{s_j}. \tag{9}$$

Under the assumption of the same MAF and odds ratio, the variance of the statistic will only depend on the sample size. The effective total sample size can be calculated as

$$\tilde{n}_i = \frac{4\,n_{i1}\,n_{i0}}{n_{i1}+n_{i0}}.$$

Because the standard error will be approximately inversely proportional to the square root of the effective total sample size, inequality (9) becomes

$$\frac{n_{ij0}}{\sqrt{n_i\,n_j}}\sqrt{\frac{n_{i1}\,n_{j1}}{n_{i0}\,n_{j0}}} < \sqrt{\frac{\tilde{n}_j}{\tilde{n}_i}} = \sqrt{\frac{n_{j1}\,n_{j0}\,n_i}{n_{i1}\,n_{i0}\,n_j}},$$

which simplifies to

$$n_{ij0}\,n_{i1} < n_{j0}\,n_i.$$

This inequality always holds true because $n_{ij0} \le n_{j0}$ and $n_{i1} < n_i$. Thus, inequality (9) holds, and therefore, matrix $\mathbf{\Omega}$ is always decouplable.
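For an arbitrary covariance matrix, decouplability is easy to check numerically: every column sum of $\mathbf{\Omega}^{-1}$ must be positive so that the decoupled variances are finite and non-negative. A minimal sketch (the helper name is ours):

```python
import numpy as np

def is_decouplable(omega):
    """A covariance matrix is decouplable when every decoupled variance,
    i.e. the reciprocal of every column sum of its inverse, is positive."""
    return bool(np.all(np.linalg.inv(omega).sum(axis=0) > 0))
```

For two studies with $s_i \le s_j$, this check agrees with the analytical condition $r < s_i/s_j$: for example, with $s_i = 1$, $s_j = 2$, a correlation of 0.4 is decouplable while 0.6 is not.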

### Meta-analysis in multi-tissue eQTL analysis

One of the applications of meta-analysis methods is combining information from multiple tissues to increase the power to identify eQTL. Because a gene's expression levels in multiple tissues are correlated within an individual, a study should take that correlation into account. Recently, Sul *et al.* (27) proposed a linear mixed model approach to account for the fact that eQTL studies collect multiple tissues from the same individuals. Let *T* be the number of tissues, and let *N* be the number of individuals for each tissue. To test the statistical significance between single nucleotide polymorphism (SNP) *j* and the expression of a gene, Sul *et al.* (27) built a linear mixed model:

$$Y = \mathbf{1}\alpha + X_j\beta + u + e,$$

where $Y$ is an $NT\times 1$ matrix denoting the expression levels of $N$ individuals in $T$ tissues. $\mathbf{1}$ is an $NT\times T$ matrix denoting the intercepts for the $T$ tissues: the first column is the intercept for the first tissue, whose first $N$ rows are ones and remaining $NT-N$ rows are zeros, and the other columns are similarly defined. $\alpha$ is a $T\times 1$ matrix denoting the coefficients of the intercepts. $X_j$ is an $NT\times T$ matrix denoting the SNP genotypes for the $T$ tissues; it has the same structure as the $\mathbf{1}$ matrix, with the ones replaced by the SNP genotypes. $\beta$ is a $T\times 1$ matrix denoting the coefficients of the SNP effects in the $T$ tissues. $u\sim N(0,\sigma_v^2 D)$ is the random effect due to the repeated measurements of individuals, where $D$ is an $NT\times NT$ matrix representing how individuals are shared across tissues. $e\sim N(0,\sigma_e^2 I)$ is the random noise.

Sul *et al.* (27) efficiently estimated the two variance components ($\sigma_v^2$ and $\sigma_e^2$) using the techniques described in Kang *et al.* (37). After obtaining the covariance matrix of the model, $\Sigma = \hat{\sigma}_v^2 D + \hat{\sigma}_e^2 I$, we can apply generalized least squares to obtain the estimates and variances of $\beta$:

$$\begin{pmatrix}\hat{\alpha}\\ \hat{\beta}\end{pmatrix} = \left(W^T\Sigma^{-1}W\right)^{-1}W^T\Sigma^{-1}Y, \qquad \mathrm{Var}\begin{pmatrix}\hat{\alpha}\\ \hat{\beta}\end{pmatrix} = \left(W^T\Sigma^{-1}W\right)^{-1},$$

where $W = [\mathbf{1}\ X_j]$ is the combined design matrix and the variances of $\hat{\beta}$ are the corresponding diagonal block of $\left(W^T\Sigma^{-1}W\right)^{-1}$.

Our goal is to combine information from multiple tissues in $\beta \u02c6$. However, as shown in the variance formula, the estimates in $\beta \u02c6$ are dependent on each other. Thus, we can apply our decoupling strategy. Note that it is reasonable to assume that the genetic effects on expression level can vary between tissues. Therefore, application of the RE model is appropriate in this situation.
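Given the estimated covariance matrix $\Sigma$, the generalized least squares step is a standard computation. A minimal NumPy sketch (the function name is ours), whose outputs feed the decoupling step above:

```python
import numpy as np

def gls(W, y, sigma):
    """Generalized least squares: coefficient estimates and their
    covariance matrix, given design matrix W (n x p), response y (n,)
    and a known residual covariance matrix `sigma` (n x n)."""
    sigma_inv = np.linalg.inv(sigma)
    cov = np.linalg.inv(W.T @ sigma_inv @ W)  # covariance of the estimates
    coef = cov @ (W.T @ sigma_inv @ y)
    return coef, cov
```

When $\Sigma$ is the identity, GLS reduces to ordinary least squares; with the mixed-model $\Sigma$, the off-diagonal entries of the returned covariance capture the dependency between tissues that the decoupling transformation removes.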

## Supplementary Material

Supplementary Material is available at HMG online.

## Funding

This work was supported by the Asan Institute for Life Sciences, Asan Medical Center, Seoul, Korea (2015-0222 to B.H.); the Korean Health Technology R&D Project, Ministry of Health & Welfare, Republic of Korea (HI14C1731 to B.H.); the National Science Foundation (0513612 to D.D. and E.E., 0731455 to D.D. and E.E., 0729049 to D.D. and E.E., 0916676 to D.D. and E.E., 1065276 to D.D. and E.E.); the National Institutes of Health (K25-HL080079 to D.D. and E.E., U01-DA024417 to D.D. and E.E., P01-HL30568 to D.D. and E.E., PO1-HL28481 to D.D. and E.E., NIH-NIAMS 1R01AR062886-01 to B.H., S.R. and P.I.W.d.B.) and the Netherlands Organization for Scientific Research (NWO) (VIDI award to P.I.W.d.B.).

## Acknowledgements

We thank the WTCCC for data access.

*Conflict of Interest statement:* None declared.