Meta-analysis of genome-wide association studies for body fat distribution in 694 649 individuals of European ancestry

Abstract More than one in three adults worldwide is either overweight or obese. Epidemiological studies indicate that the location and distribution of excess fat, rather than general adiposity, are more informative for predicting risk of obesity sequelae, including cardiometabolic disease and cancer. We performed a genome-wide association study meta-analysis of body fat distribution, measured by waist-to-hip ratio (WHR) adjusted for body mass index (WHRadjBMI), and identified 463 signals in 346 loci. Heritability and variant effects were generally stronger in women than men, and we found approximately one-third of all signals to be sexually dimorphic. The 5% of individuals carrying the most WHRadjBMI-increasing alleles were 1.62 times more likely than the bottom 5% to have a WHR above the thresholds used for metabolic syndrome. These data, made publicly available, will inform the biology of body fat distribution and its relationship with disease.


Supplementary Figure 1 | Manhattan and QQ plots for meta-analysis of fat distribution and obesity phenotypes.
We performed meta-analysis of our UK Biobank GWAS with existing GWAS data generated by the GIANT consortium. Manhattan and QQ plots from these meta-analyses in the combined sample for waist-to-hip ratio (max N = 697,734), waist-to-hip ratio adjusted for BMI (max N = 694,649) and BMI (max N = 806,834) are shown here. Note that the y-axes are not continuous and are disrupted at p < 1 × 10⁻¹⁰⁰ or p < 1 × 10⁻⁵⁰. Genome-wide significance was set at p < 5 × 10⁻⁹ to reflect the SNP density of the UK Biobank data [1]. Traditional genome-wide significance (p < 5 × 10⁻⁸) is indicated by the second, lower horizontal line.
a. Analysis of body mass index
b. Analysis of waist-to-hip ratio
c. Analysis of waist-to-hip ratio adjusted for body mass index

b. Waist-to-hip ratio signals in the original meta-analysis (all samples; x-axis) and the sensitivity meta-analyses (y-axis). Panels: combined samples, women only, men only.
b. Waist-to-hip ratio genome-wide significant signals (P < 5 × 10⁻⁸) in the original meta-analysis (GIANT all & UK Biobank; x-axis) and the sensitivity meta-analysis (GIANT population-based only & UK Biobank; y-axis). Panels: combined samples, women only, men only.
c. Waist-to-hip ratio adjusted for BMI genome-wide significant signals (P < 5 × 10⁻⁸) in the original meta-analysis (GIANT all & UK Biobank; x-axis) and the sensitivity meta-analysis (GIANT population-based only & UK Biobank; y-axis). Panels: combined samples, women only, men only.
d. Waist-to-hip ratio genome-wide significant signals (P < 5 × 10⁻⁸) in the original meta-analysis (GIANT all & UK Biobank; x-axis) and the sensitivity meta-analysis (GIANT population-based only & UK Biobank; y-axis). Panels: combined samples, women only, men only.

Supplementary Figure 4 | Test of collider bias at genome-wide associated SNPs.
Conditioning a variable on a second, correlated variable (sometimes called conditioning on a 'collider') can induce both false-positive and false-negative associations [4,5]. Body mass index (BMI) and waist-to-hip ratio (WHR) are correlated with one another (the correlation between the two traits, after correcting for age, sex, principal components, assessment centre and genotyping chip, is 0.5 in 378,178 unrelated European-ancestry individuals from the UK Biobank study; p = 2 × 10⁻¹⁶); therefore, conditioning WHR on BMI to generate the waist-to-hip ratio adjusted for BMI (WHRadjBMI) phenotype may have resulted in collider bias at genome-wide associated SNPs.
We examined the association statistics of WHRadjBMI index SNPs in meta-analyses of BMI and WHR (see Supplementary Methods). WHRadjBMI-associated SNPs that show a stronger association with BMI than with WHR (green points) potentially suffer from collider bias. SNPs with an association >2 orders of magnitude stronger in BMI than in WHR (blue-green points) show stronger effects of collider bias. Consistently, these tend to be SNPs with near-zero effects in WHR and non-zero effects in BMI (right-hand panels). The data underlying these figures are provided in Supplementary Table 2.
a. Collider bias analysis in 346 index SNPs from the combined sample analysis
b. Collider bias analysis in 346 index SNPs from the women-only analysis
c. Collider bias analysis in 346 index SNPs from the men-only analysis
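The colour coding described in this legend can be sketched as a simple comparison of p-values. The snippet below is an illustration only: the function name `collider_flag` and its labels are hypothetical, and it assumes that "orders of magnitude stronger" means a difference of more than 2 on the -log10(p) scale, which is one plausible reading of the legend rather than the authors' stated implementation.

```python
import math


def collider_flag(p_bmi: float, p_whr: float) -> str:
    """Classify a WHRadjBMI index SNP by comparing its BMI and WHR p-values.

    Assumption: association strength is compared on the -log10(p) scale.
    """
    # Positive delta means the SNP is more strongly associated with BMI than WHR.
    delta = -math.log10(p_bmi) + math.log10(p_whr)
    if delta > 2:
        return "strong"     # >2 orders of magnitude stronger in BMI
    if delta > 0:
        return "potential"  # stronger in BMI than in WHR
    return "none"           # at least as strong in WHR


# Illustrative (made-up) p-values, not taken from the study
for p_bmi, p_whr in [(1e-10, 1e-3), (1e-4, 1e-3), (1e-3, 1e-10)]:
    print(p_bmi, p_whr, collider_flag(p_bmi, p_whr))
```

The p-values in the loop are invented for demonstration; in practice they would come from the BMI and WHR meta-analysis columns of the supplementary table.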

Supplementary Figure 5 | Miami and QQ plots for sex-specific meta-analyses of fat distribution and obesity phenotypes.
Shown are sex-stratified results from meta-analyses for waist-to-hip ratio, waist-to-hip ratio adjusted for BMI and BMI. Note that the y-axes are not continuous and are disrupted at p < 1 × 10⁻¹⁰⁰ or p < 1 × 10⁻⁵⁰. Genome-wide significance was set at p < 5 × 10⁻⁹ to reflect the SNP density of the UK Biobank data [1]. Traditional genome-wide significance (p < 5 × 10⁻⁸) is indicated by the second, lower horizontal line.

Supplementary Figure 6 | Test for sex-dimorphism of index SNPs from genome-wide significant loci.
For each locus revealed in each of our meta-analyses (combined samples and sex-specific analyses), we tested for evidence of sex-dimorphism at the index SNP. We repeated these analyses for all phenotypes (waist-to-hip ratio, waist-to-hip ratio adjusted for BMI, and BMI). Approximately 27% of the SNPs associated with waist-to-hip ratio adjusted for BMI (discovered in the sex-specific analyses) and approximately 24% of the SNPs associated with waist-to-hip ratio (discovered in the sex-specific analyses) show evidence of sex-dimorphism. None of the loci discovered through sex-specific analysis of BMI were sex-dimorphic. A full table of the sex-dimorphic SNPs appears in Supplementary Table 1. Non-significant points are shown in faded colors, and points are sized by the -log10(pdiff) from the test for sex-dimorphism. Horizontal bars indicate standard error in men; vertical bars indicate standard error in women.
a. Index SNPs from combined and sex-specific analyses of waist-to-hip ratio.
b. Index SNPs from combined and sex-specific analyses of body mass index.

Supplementary Figure 8 | Concordance between WHRadjBMI-associated SNPs and SNPs from a genome-wide association study of imaging-based measures of subcutaneous and ectopic fat.
Recently, Chu et al [6] performed a genome-wide association study in a multi-ancestry sample, examining 9 different subcutaneous and ectopic fat depots. We downloaded the summary-level data (see Code and Data Release) and examined the effect of WHRadjBMI-associated SNPs on different measures of fat depots from the Chu et al GWAS. Adjusting for 3 sample groups (combined, females only, and males only) and 8 phenotypes (the 8 fat depots) we found a strong correlation between alleles associated with higher WHRadjBMI and higher PAT, higher VAT, higher VATSAT and lower SAT. The effect of WHRadjBMI index SNPs from female analysis on measures of fat depots was stronger than the effect of index SNPs from male analysis.

Supplementary Figure 9 | Phenotypic distributions in UK Biobank sample.
We extracted waist-to-hip ratio (WHR) measures and body mass index (BMI) measures from the UK Biobank phenotype information, as well as a number of important phenotype-level covariates: age at assessment, sex, and UK Biobank assessment centre. We generated phenotypes in a manner consistent with previous efforts in the GIANT consortium [2,3,7]. We regressed each of the WHR and BMI phenotypes on age at assessment, age at assessment squared, assessment centre, and sex. To generate the WHR adjusted for BMI (WHRadjBMI) phenotype, we additionally included BMI as a covariate. We extracted the residuals from each of these regressions and then inverse normalized the residuals to obtain the final phenotype for analysis.
Here, we show all phenotypes before and after standardisation. Phenotype conversions by sex were performed in sex-specific analysis groups.
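As an illustration of the final standardisation step, the sketch below applies a rank-based inverse normal transform to a list of regression residuals. It is not the authors' code: the function name `inverse_normalize`, the average-rank handling of ties, and the rank offset of 0.5 are all assumptions, since the text does not specify which variant of the transform was used. The covariate regression itself is assumed to have been done beforehand.

```python
from statistics import NormalDist


def inverse_normalize(residuals, offset=0.5):
    """Rank-based inverse normal transform of a list of residuals.

    Assumptions: ties receive their average rank, and ranks are mapped to
    quantiles as (rank - offset) / n before applying the normal inverse CDF.
    """
    n = len(residuals)
    order = sorted(range(n), key=lambda i: residuals[i])
    ranks = [0.0] * n
    i = 0
    while i < n:
        # Find the run of tied values starting at sorted position i
        j = i
        while j + 1 < n and residuals[order[j + 1]] == residuals[order[i]]:
            j += 1
        avg_rank = (i + j) / 2 + 1  # 1-based average rank for the tied run
        for k in range(i, j + 1):
            ranks[order[k]] = avg_rank
        i = j + 1
    standard_normal = NormalDist()
    return [standard_normal.inv_cdf((r - offset) / n) for r in ranks]


# Made-up residuals, for demonstration only
print(inverse_normalize([0.3, -1.2, 2.5, 0.3]))
```

For sex-specific analyses, the same function would be applied separately within each sex, mirroring the statement that phenotype conversions were performed in sex-specific analysis groups.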
a. Phenotype distributions in body mass index
b. Phenotype distributions in waist-to-hip ratio, including adjustment for BMI

Supplementary Figure 10 | Correlation between various LD Score reference panels to use in BOLT-LMM.
Before performing genome-wide association testing, we performed sensitivity testing in BOLT-LMM to optimize the data used in the genetic relationship matrix as well as the LD Score (LDSC) reference panel used (Supplementary Methods). We calculated correlation of SNP LD scores across all these panels to evaluate stability of the LD score metric and decided to use either the 'Baseline' panel [8,9] or custom UK Biobank panel for further sensitivity testing. Shading indicates Pearson's correlation (r) on a [-1,1] scale where darker blue shading indicates stronger positive correlation.
eur, LD scores calculated from European-ancestry samples in 1000 Genomes Phase 1; base, LD scores calculated in a 'baseline' model using 1000 Genomes Phase 3; genotyped, LD scores calculated with genotyped SNPs; imputed, LD scores calculated from imputed dosages converted to best-guess genotypes; 1k/5k/10k, the number of samples used to estimate LD Scores.

Supplementary Figure 11 | GWAS in obesity and fat distribution traits in UK Biobank only (combined sample).
Manhattan plots for GWAS of body mass index (BMI), waist-to-hip ratio and waist-to-hip ratio adjusted for BMI in UK Biobank are shown. Note that y-axes are not continuous. Lines indicate traditional genome-wide significance (p < 5 × 10⁻⁸) and genome-wide significance in this analysis (blue line; p < 5 × 10⁻⁹).
a. Analysis of body mass index
b. Analysis of waist-to-hip ratio
c. Analysis of waist-to-hip ratio adjusted for BMI

Supplementary Figure 12 | Genome-wide association testing in obesity and fat distribution traits in UK Biobank (sex-specific analyses).
We performed genome-wide association testing in the UK Biobank samples for: body mass index (BMI), waist-to-hip ratio (WHR) and WHR adjusted for BMI (WHRadjBMI). The resulting Miami plots and quantile-quantile (QQ) plots show the results in the sex-specific analyses.
a. Analysis of body mass index
b. Analysis of waist-to-hip ratio
c. Analysis of waist-to-hip ratio adjusted for BMI

Supplementary Figure 13 | Concordance check of previously-implicated loci in genome-wide association studies in UK Biobank.
After completing our GWAS in UK Biobank, we looked up the previously described loci for WHR and WHRadjBMI (49 loci) and BMI (97 loci) from an effort by the GIANT consortium in 2014 [2,3]. We checked concordance between the previous associations and our data by examining minor allele frequency, effect size (beta), standard error, and -log10(p-value). Concordance checks and the correlation (Pearson's r) between UK Biobank and the previously reported GIANT loci are shown for BMI, WHR and WHRadjBMI.
a. Body mass index, previously-known loci in GIANT (x-axis) vs UK Biobank (y-axis). Panels: combined samples, women only.
a. In the combined sample

Supplementary Tables
Supplementary Table 1 | Summary-level statistics for index and secondary SNPs discovered in the combined and sex-specific meta-analyses.
The tables containing summary-level data for associated SNPs are provided as downloadable text files from the project's GitHub repository (https://github.com/lindgrengroup/fatdistnGWAS). The file names are as follows: (

nmeta.combined: Meta-analysis sample size in the combined analysis
info.combined: The imputation score in the meta-analysis (taken from UK Biobank, as GIANT data does not include imputation quality)
Next 14 columns: Frequency, beta, se, pval, dir, nmeta and info repeated for the female-only and male-only analyses
psexdiff: P-value from the test of sex-dimorphism between males and females (see Methods of the main paper for details)

Supplementary Table 2 | Genomic inflation (lambda) and Linkage Disequilibrium Score Regression (LDSC) intercepts in genome-wide association studies and meta-analysis.
A standard quality control step after performing a genome-wide association study is to check the calibration of the resulting p-values. A standard metric used to evaluate whether the results are well-calibrated (i.e., primarily following the null distribution with some indication of polygenicity) is genomic inflation (lambda, λ) [12]. Lambda is expected to be ~1, under the assumption that only a percentage of SNPs will show true association with the trait. A second metric, the LDSC intercept [13], has been used in more recent GWAS; large sample sizes and tremendous polygenicity make it difficult to understand if a lambda that deviates substantially from 1 is indicative of polygenicity or confounding. As a quality control step on our data, we calculated both lambda and LDSC intercept in the UK Biobank GWAS and in the meta-analysis results. All LDSC intercepts and lambdas were calculated using LD Scores generated using UK Biobank (as described for BOLT-LMM sensitivity testing; see Supplementary Methods).

Conditioning a variable on a second, correlated variable can induce both false-positive and false-negative associations [4,5]. Body mass index (BMI) and waist-to-hip ratio (WHR) are correlated with one another (see Supplementary Figures 4 and 15); therefore, conditioning WHR on BMI to generate the waist-to-hip ratio adjusted for BMI (WHRadjBMI) phenotype may have resulted in collider bias at genome-wide associated SNPs. We examined the association statistics of WHRadjBMI index SNPs in meta-analyses of BMI and WHR (see Supplementary Methods). Here, we provide association statistics for WHRadjBMI-associated SNPs extracted from meta-analyses of BMI and WHR. The tables containing these data are provided as downloadable text files from the project's GitHub repository (https://github.com/lindgrengroup/fatdistnGWAS). The file names are as follows: (

The table containing the results from our directional consistency analysis for the 346 index SNPs is provided as a downloadable text file from the project's GitHub repository (https://github.com/lindgrengroup/fatdistnGWAS).
The file name is as follows:
The association statistics in EXTEND were calculated by fitting a linear regression of WHRadjBMI on each SNP. All betas have been aligned to the WHRadjBMI-increasing allele from our main meta-analysis. GIANTUKB_MA columns refer to data obtained from our main meta-analysis and EXTEND columns refer to estimates obtained from the EXTEND dataset.

Supplementary Table 8 | Summary statistics for the effect of sex-dimorphic WHRadjBMI index SNPs on BF% in UK Biobank individuals.
The table contains summary-level data for the effect of 105 sexually dimorphic SNPs on BF% in males and females separately, and is provided as a downloadable text file from the project's GitHub repository (https://github.com/lindgrengroup/fatdistnGWAS). The file name is as follows:
(1) SuppleTable8/WHRadjBMI_dimorphic_snps_merged_bfp_association_statistics.txt
The table contains columns with the association statistics for the effect of 105 dimorphic SNPs on WHRadjBMI and BF% in the sex-specific analyses. Columns are named with the corresponding phenotype, sex and statistic, e.g., bf_male_beta. All effects are given for WHRadjBMI-increasing alleles from the combined meta-analysis results. The column named Male_or_female_specific indicates whether the SNP has a greater effect in females or in males, based on the criteria described in the 'Identification of sex-dimorphic signals' section of the Methods of the main paper. Male-specific SNPs are denoted MALE and female-specific SNPs are denoted FEMALE.

Supplementary Table 9 | Number of WHRadjBMI index SNPs showing strong associations with BF% and the direction of effect.
Association statistics between WHRadjBMI index SNPs and BF% were obtained from UK Biobank individuals. Amongst the 346 WHRadjBMI index SNPs from the combined meta-analysis, 59 SNPs were strongly associated with BF% in the combined GWAS (based on a Bonferroni-corrected p-value of 0.05/346 = 1.44 × 10⁻⁴); 34 of these were associated with increased BF% and 25 with decreased BF%. Of the 105 sex-dimorphic SNPs, 36 were strongly associated with BF% in the combined GWAS, with 21 being associated with a higher BF% (based on a Bonferroni-corrected p-value of 0.05/105 = 4.8 × 10⁻⁴).
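The two significance thresholds quoted above follow directly from dividing 0.05 by the number of SNPs tested. A minimal sketch of that arithmetic is given below; the helper name `bonferroni_threshold` is hypothetical and introduced only for illustration.

```python
def bonferroni_threshold(alpha: float, n_tests: int) -> float:
    """Per-test significance threshold under a Bonferroni correction."""
    return alpha / n_tests


# 346 index SNPs from the combined meta-analysis; 105 sex-dimorphic SNPs
for n_snps in (346, 105):
    print(n_snps, f"{bonferroni_threshold(0.05, n_snps):.3g}")
```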

Supplementary Table 11 | Samples excluded from genome-wide association testing in UK Biobank.
For genome-wide association testing in UK Biobank, we excluded: samples with withdrawn consent, the heterozygosity and missingness outliers (as identified by UK Biobank upon data release), samples with phenotypic vs genotypic sex mismatches, and samples with genotyped but not imputed data. We additionally performed a series of sensitivity genome-wide association studies, dropping (1) all related samples and (2) all related samples and all samples that were not white British individuals. Sample counts (potentially overlapping) and the exclusion criteria are provided below.

Consent withdrawn 18
Heterozygosity and missingness outliers 968
Phenotypic vs genotypic sex mismatch 378
Samples genotyped but not imputed 968
Not a white British sample 78,560
Related samples (kinship > 0.0442, i.e., 3rd degree relative or higher) 107,162

Supplementary Table 12 | Configurations for sensitivity testing in BOLT-LMM.
We performed a series of genome-wide association studies (GWAS) in the waist-to-hip ratio adjusted for body mass index (WHRadjBMI) phenotype. We varied either (a) the genetic relationship matrix (GRM) or (b) the LD Score reference panel (LDSC), to see which model yielded the best calibrated LD Score intercept and heritability estimate (h²g) consistent with previous information. If the model is well-calibrated (i.e., the association statistics are well behaved), the LD Score intercept should be ~1. An LD Score intercept much larger than one is indicative of stratification due to relatedness, ancestral heterogeneity, or other sources. Where relevant, statistics are reported for both the 'infinitesimal' and 'non-infinitesimal' models in BOLT-LMM. The combination of a GRM calculated with imputed, pruned SNPs (r² < 0.2) and an LD Score reference panel calculated from UK Biobank (UKBB) yielded the heritability estimate closest to the current estimate, and yielded the best-calibrated LD Score intercept (highlighted in green).

To ensure that our initial analyses in UK Biobank were not confounded by relatedness or ancestral heterogeneity (and to check that the linear mixed model was properly accounting for this structure), we additionally ran GWAS in UK Biobank using: (1) only the unrelated samples, and (2) only the unrelated white British samples. We then meta-analyzed this data with the pre-existing data, to check the consistency of our index and secondary signals in these analyses. Sample sizes for these analyses are provided here.

Supplementary Table 15 | Summary of samples analyzed in the sensitivity meta-analysis of GIANT population-based studies only and UK Biobank.
To ensure that no bias was introduced in our GIANT and UK Biobank meta-analysis by the inclusion of cases and controls in the original GIANT meta-analysis [3], we carried out a meta-analysis of GIANT population-based studies only and UK Biobank. A visualisation of this sensitivity analysis is shown in Supplementary Figure 3 and a link to the summary statistics for each study is provided in the 'Code and Data Release' section.

Summary-level data from a genome-wide association study of ectopic fat depots
We looked up SNPs associated to WHRadjBMI in a recently-performed genome-wide association study (GWAS) of ectopic fat depots in a multi-ancestry sample [6]. GWAS was performed in 8 specific depots, and the data and links to that data are here: https://grasp.nhlbi.nih.gov/FullResults.aspx

Data for the genetic relationship matrix and LD Score reference panel
In implementing a linear mixed model, BOLT-LMM [14] requires three primary components: (1) the (imputed) genotype and phenotype data of the samples you wish to test; (2) a genetic relationship matrix (GRM), to estimate structure in the data due to relatedness, ancestral heterogeneity, or other factors; and (3) a reference panel of linkage disequilibrium scores (LDSC) [13], used to calibrate test statistics. Before beginning our genome-wide association studies (GWAS) in UK Biobank, we performed sensitivity testing in BOLT-LMM to ascertain which data should be used to populate the GRM, and which data to use as the LDSC reference panel.
The coordinates for these regions are:
We additionally tested three different LDSC reference panels. The first panel was derived from the European-ancestry 1000 Genomes [9,17] samples and is distributed with the LDSC software (https://github.com/bulik/ldsc). The second panel was called the 'baseline' LDSC reference, generated in work by Finucane et al [8] and computed using data from 1000 Genomes Phase 3 [9]. We constructed the third LDSC panel by calculating LD scores from best-guess genotypes in the UK Biobank data. To test the stability of these scores, we:
(1) Selected three sample sizes in which to calculate LD scores: 974 samples (0.2% of the UK Biobank data), 4,874 samples (1% of the UK Biobank data), and 9,748 samples (2% of the UK Biobank data)
(2) Randomly selected five sets of unrelated samples for each of these sample sizes (e.g., five different sets of ~1,000 unrelated samples in UK Biobank)
(3) Calculated LD scores within each random set
(4) Calculated the correlation of the LD scores for these sample sets.
LD Scores were calculated for either
(a) Genotyped SNPs
(b) Imputed SNPs
We found the LD scores to be highly stable across different sets of samples and SNPs (genotyped or imputed) in UK Biobank (Supplementary Figure 10), and therefore selected a panel calculated in 9,748 samples constructed either from genotyped SNPs or imputed SNPs converted to best-guess genotypes, as well as the 'baseline' panel to use in sensitivity testing.

Genome-wide association studies (GWAS) for sensitivity testing
We then ran a series of GWAS for sensitivity testing, altering the GRM and LDSC reference panel, to see which configuration seemed optimal given the data (Supplementary Table 12).
The combination of:
(1) A genetic relationship matrix calculated from imputed SNPs converted to best-guess genotypes and pruning SNPs at r² = 0.2, and
(2) An LDSC reference panel calculated from imputed SNPs converted to best-guess genotypes and pruning SNPs at r² = 0.2 in 9,748 UK Biobank samples
yielded the best-calibrated LD Score intercept (1.031) as well as a heritability estimate (17.3%) most consistent with current estimates for WHRadjBMI.
We therefore decided to run all of our GWAS in UK Biobank for BMI, WHR, and WHRadjBMI using this selection for GRM and LDSC reference panel.

Constructing an LD reference panel for locus identification and conditional testing
To identify top (i.e., index) signals and any secondary signals, we first performed linkage disequilibrium (LD)-based clumping [15,16], followed by conditional and joint analysis using GCTA [18].

LD Clumping
LD clumping (in Plink [15,16]) relies on (a) summary-level data from a genome-wide association study or meta-analysis and (b) a reference panel from which LD calculations can be performed. Calculating LD in the full UK Biobank (N ~ 500,000) is computationally expensive; therefore, we created a 'reference' set of data from the UK Biobank data. We identified all of the unrelated samples in UK Biobank (N ~ 400,000) and selected a random 5% of the samples. We used these 20,275 samples to create the set of genotypes used for LD clumping. We subsetted these samples out of the UK Biobank data, and kept only high-quality SNPs: imputation info score > 0.9, minor allele frequency > 0.1%, and Hardy-Weinberg p > 1 × 10⁻⁷. Additionally, we used a hard-call threshold (--hard-call-threshold in Plink1.9 [16]) of 0.1. This threshold means the following conversion is applied to the imputed data:
Dosage: 0-0.1 → genotype is AA
Dosage: 0.9-1.1 → genotype is AB
Dosage: 1.9-2.0 → genotype is BB
See Plink1.9 documentation for further details: https://www.cog-genomics.org/plink2/input. After applying this conversion, we additionally removed any SNP with missingness > 0.05.
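The dosage-to-genotype conversion and the subsequent missingness filter can be illustrated with a short sketch. This mimics the behaviour described in the text and is not Plink's own implementation; the function names `hard_call` and `missing_rate` are hypothetical, genotypes are coded 0/1/2 for AA/AB/BB, and the treatment of dosages lying exactly on a boundary is an assumption (floating-point rounding can push such values either way).

```python
from typing import List, Optional


def hard_call(dosage: float, threshold: float = 0.1) -> Optional[int]:
    """Convert an imputed dosage in [0, 2] to a best-guess genotype.

    Returns 0 (AA), 1 (AB) or 2 (BB) when the dosage lies within `threshold`
    of an integer genotype, and None (missing) otherwise.
    """
    nearest = round(dosage)
    if nearest in (0, 1, 2) and abs(dosage - nearest) <= threshold:
        return nearest
    return None


def missing_rate(dosages: List[float], threshold: float = 0.1) -> float:
    """Fraction of samples whose hard call is missing at one SNP."""
    calls = [hard_call(d, threshold) for d in dosages]
    return sum(call is None for call in calls) / len(calls)


# Made-up dosages for one SNP across four samples
snp_dosages = [0.05, 0.5, 1.0, 2.0]
print([hard_call(d) for d in snp_dosages])
# A SNP would be dropped when its missingness exceeds 0.05
print(missing_rate(snp_dosages) > 0.05)
```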
Using this set of SNPs in 20,275 samples, we performed LD clumping in Plink1.9 [16]. We set genome-wide significance at p < 5 × 10⁻⁹, performed clumping in a window of 5Mb, allowing for LD down to r² = 0.05 and down to a secondary p-value (--clump-p2) of 0.05.

Conditional and joint proximal conditional testing in GCTA
After performing LD clumping, we identified the genomic span of each 'clumped' region. Overlapping regions were collapsed into one (larger) locus. We then added a 1kb buffer up- and downstream of the locus boundaries.
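The locus definition described here (collapse overlapping clumped regions, then pad each locus by 1kb) can be sketched as an interval merge. The function name `merge_regions` and the tuple layout are hypothetical, and the sketch does not re-merge loci that come to overlap only after the buffer is added, since the text does not say how that case was handled.

```python
def merge_regions(regions, buffer_bp=1000):
    """Collapse overlapping (chrom, start, end) regions and pad each locus.

    Assumptions: coordinates are 1-based integers in base pairs, and
    chromosome labels all share one type so that they sort consistently.
    """
    merged = []
    for chrom, start, end in sorted(regions):
        same_chrom = bool(merged) and merged[-1][0] == chrom
        if same_chrom and start <= merged[-1][2]:
            # Overlaps the previous region: extend it into one larger locus
            merged[-1][2] = max(merged[-1][2], end)
        else:
            merged.append([chrom, start, end])
    # Add the buffer up- and downstream, without going below position 1
    return [(c, max(1, s - buffer_bp), e + buffer_bp) for c, s, e in merged]


# Made-up clumped regions for demonstration
clumps = [("1", 100_000, 200_000), ("1", 150_000, 300_000), ("2", 5_000, 6_000)]
print(merge_regions(clumps))
```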
Within each locus (i.e., genomic window) we extracted all SNPs from the LD reference panel used for clumping, and again used this data to perform joint and conditional testing using GCTA in order to identify any secondary signals in each locus. We did this using --cojo-slct (http://cnsgenomics.com/software/gcta/GCTA_UserManual_v1.24.pdf), which performs proximal conditional testing when individual-level data is not available for exact conditional testing. Again, we set genome-wide significance (--cojo-p) at p < 5 × 10⁻⁹.

Waist-to-hip ratio (WHR)
As a sensitivity check for our WHRadjBMI meta-analysis, we additionally performed a meta-analysis in the waist-to-hip ratio (WHR) phenotype. The meta-analysis was performed identically to the meta-analysis in WHRadjBMI: genome-wide association testing in UK Biobank, performed in BOLT-LMM, was followed by meta-analysis of the summary-level data with pre-existing data from the GIANT consortium.
Because BMI and WHR are phenotypically correlated, conditioning WHR on BMI (to account for general adiposity) can induce false-positive and false-negative associations due to collider bias. To investigate the extent to which collider bias was affecting our data, we additionally performed a meta-analysis of BMI, following the exact analytic steps as those used to analyze WHRadjBMI and WHR.

Identification of sex-dimorphic SNPs genome-wide
Identifying sex-dimorphic SNPs from the index SNPs identified in our meta-analyses can generate bias around whether the SNP effects will be stronger in men or women (or neither). Index SNPs identified in the combined analysis will be more likely to have similar effects in men and women; index SNPs identified in the women-only analysis will be more likely to have a stronger effect in women (and the same holds for index SNPs identified in the men-only analysis being more likely to have a stronger effect in men). Given this bias, we additionally identified, genome-wide, all SNPs with evidence for sexual dimorphism (pdiff < 5 × 10⁻⁹).
We tested all SNPs in our meta-analyses for evidence of sexual dimorphism, and then used the Plink clumping approach to identify those SNPs that were independent from one another. This clumping approach is identical to that described for identifying the index SNPs reported in the main paper (including arguments passed to Plink, provided in the main Methods as well as on this paper's GitHub repository); the only difference is that the p-value used for clumping was the pdiff (test of sexual dimorphism, as calculated in EasyStrata [19]). Using this approach, we identified 61 sex-dimorphic SNPs, 54 of which had stronger effects in women (see Methods for more details on identifying effects stronger in men or women).
Finally, the sexual dimorphism test is as follows:

$$t = \frac{\beta_{\text{women}} - \beta_{\text{men}}}{\sqrt{se_{\text{women}}^{2} + se_{\text{men}}^{2} - 2\,r\,se_{\text{women}}\,se_{\text{men}}}}$$

where se is the standard error and r is the genome-wide Spearman rank correlation coefficient between SNP effects in females and males. True shared genetic architecture between males and females could inflate the value of r. The value of r across all SNPs in the meta-analysis is 0.023. Therefore, we recalculated r on a set of ~5M null SNPs with p > 0.5 in the combined analysis, in the women-only analysis and in the men-only analysis. We estimated the correlation between the betas in men and women across these SNPs and found r = -0.145.
We then recalculated the sexual dimorphism test for all SNPs in the meta-analysis by first calculating the above t-statistic, and then, assuming that the t-statistic is distributed ~N(0,1) (i.e., approximately z-distributed, as is assumed in EasyStrata), calculated the p-values following:

$$p_{\text{diff}} = 2\,\Phi(-|t|)$$

where Φ is the standard normal cumulative distribution function.

Recalculating pdiff in this manner only somewhat impacted our results. Of the 61 SNPs we initially found to be sexually dimorphic (p < 5 × 10⁻⁹) genome-wide, we found that 48 of them remained significantly sexually dimorphic after adjusting the SNPs used to calculate the Spearman rank correlation coefficient.
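To make the recalculation concrete, the sketch below computes a sex-difference statistic and its two-sided p-value under a standard normal approximation. It assumes the commonly used form of the test, in which the difference in sex-specific betas is divided by the standard error of that difference with a covariance term governed by r; the function name `sex_dimorphism_test` and the example values are hypothetical and are not taken from the study.

```python
import math


def sex_dimorphism_test(beta_w, se_w, beta_m, se_m, r=0.0):
    """Return (t, pdiff) for a difference in SNP effect between women and men.

    Assumed form: t = (beta_w - beta_m) / sqrt(se_w^2 + se_m^2 - 2*r*se_w*se_m),
    with t treated as approximately N(0, 1).
    """
    denom = math.sqrt(se_w**2 + se_m**2 - 2.0 * r * se_w * se_m)
    t = (beta_w - beta_m) / denom
    # Two-sided p-value: 2 * Phi(-|t|), written with erfc for numerical accuracy
    p_diff = math.erfc(abs(t) / math.sqrt(2.0))
    return t, p_diff


# Made-up effect sizes: a SNP with an effect in women and none in men
for r in (0.023, -0.145):
    t, p_diff = sex_dimorphism_test(0.04, 0.003, 0.0, 0.004, r=r)
    print(f"r={r}: t={t:.2f}, pdiff={p_diff:.2e}, significant={p_diff < 5e-9}")
```

A negative r enlarges the denominator and so makes the test slightly more conservative, which is consistent with fewer SNPs remaining significant after r was re-estimated on null SNPs.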