Genome-wide association analysis identifies TYW3/CRYZ and NDST4 loci associated with circulating resistin levels

Resistin is a polypeptide hormone that has been reported to be associated with insulin resistance, inflammation and risk of type 2 diabetes and cardiovascular disease. We conducted a genome-wide association (GWA) study on circulating resistin levels in individuals of European ancestry drawn from two independent studies: the Nurses' Health Study (n = 1590) and the Health, Aging and Body Composition Study (n = 1658). Single-nucleotide polymorphisms (SNPs) identified in the GWA analysis were replicated in an independent cohort of Europeans: the Gargano Family Study (n = 659). We confirmed the association with a previously known locus, the RETN gene (19p13.2), and identified two novel loci near the TYW3/CRYZ gene (1p31) and the NDST4 gene (4q25), associated with resistin levels at a genome-wide significant level, best represented by SNP rs3931020 (P = 6.37 × 10^-12) and SNP rs13144478 (P = 6.19 × 10^-18), respectively. Gene expression quantitative trait loci analyses showed a significant cis association between the SNP rs3931020 and CRYZ gene expression levels (P = 3.68 × 10^-7). We also found that both of these SNPs were significantly associated with resistin gene (RETN) mRNA levels in white blood cells from 68 subjects with type 2 diabetes (both P = 0.02). In addition, the resistin-raising allele of the TYW3/CRYZ SNP rs3931020, but not the NDST4 SNP rs13144478, showed a consistent association with increased coronary heart disease risk [odds ratio = 1.18 (95% CI, 1.03-1.34); P = 0.01]. Our results suggest that genetic variants in the TYW3/CRYZ and NDST4 loci may be involved in the regulation of circulating resistin levels. More studies are needed to verify the associations of the SNP rs13144478 with NDST4 gene expression and resistin-related disease.


INTRODUCTION
Resistin is a polypeptide hormone that belongs to a family of cysteine-rich proteins called resistin-like molecules (1). It was originally reported to link obesity to insulin resistance and diabetes as an adipocytokine in mice (2). Subsequent human studies have yielded conflicting results on the association of circulating resistin levels with insulin resistance, type 2 diabetes and cardiovascular disease (3-17). In mice, resistin is preferentially expressed in adipocytes, whereas in humans it is highly expressed in monocytes and macrophages and only modestly in adipose cells (18,19). Genetic factors may play important roles in the regulation of circulating resistin levels. It has been estimated that 70% of the variation in circulating resistin levels is heritable (20). Several single-nucleotide polymorphisms (SNPs) in the human resistin gene (RETN) have been associated with circulating resistin levels in previous candidate gene studies (12,20-23). A previous one-stage genome-wide association (GWA) study failed to identify loci associated with circulating resistin levels at a genome-wide significant level (24). Therefore, we undertook a GWA analysis for circulating resistin levels in individuals of European ancestry from two discovery cohorts and an independent replication study.

RESULTS
The characteristics of participants in each cohort are shown in Table 1. In the discovery stage, we performed three GWA scans for resistin levels in a total of 3248 individuals of European ancestry from the Nurses' Health Study (NHS) type 2 diabetes controls, the NHS type 2 diabetes cases and the Health, Aging and Body Composition Study (Health ABC) separately. There was no evidence of systematic bias in the distribution of P-values for association tests [genomic inflation factors (λ) were 0.99, 0.99 and 1.02, respectively] (Supplementary Material, Fig. S1). The results from the three GWA scans were combined by a meta-analysis. A graphical summary of the meta-analysis results is displayed in Supplementary Material, Figure S2. No associations with resistin levels reached a genome-wide significance level (P = 5 × 10^-8).
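The genomic inflation factor quoted above is conventionally computed as the median of the observed 1-df chi-square association statistics divided by the median expected under the null (about 0.455). The following is a minimal Python sketch of that calculation, assuming two-sided P-values strictly greater than 0 as input. The function name and the evenly spaced example P-values are illustrative and are not taken from the study.

```python
from statistics import NormalDist, median

# Median of a chi-square distribution with 1 degree of freedom.
CHI2_1DF_MEDIAN = 0.4549364


def genomic_inflation(p_values):
    """Return lambda = median(observed chi-square) / median(expected chi-square).

    Each two-sided P-value is converted back to a 1-df chi-square statistic
    as the square of the corresponding standard normal quantile.
    P-values must be > 0 (a value of exactly 0 has no finite quantile).
    """
    nd = NormalDist()
    chi2 = [nd.inv_cdf(1.0 - p / 2.0) ** 2 for p in p_values]
    return median(chi2) / CHI2_1DF_MEDIAN


# Hypothetical input: P-values spread evenly over (0, 1), as expected under the null.
n = 1000
null_like = [(i + 0.5) / n for i in range(n)]
lam = genomic_inflation(null_like)  # close to 1 for null-like input
```

Values near 1, such as the 0.99 to 1.02 reported here, indicate little inflation of the test statistics; values well above 1 would suggest population stratification or other systematic bias.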
We took forward the four top SNPs, representing independent loci with a stage 1 P < 5 × 10^-5, for replication in the Gargano Family Study (GFS, n = 659) (Table 2). SNPs were considered to be independent if the pairwise linkage disequilibrium r^2 was < 0.1 and if they were at least 1 Mb apart from one another (25). The SNPs rs17372114, rs3931020 and rs6068258 were genotyped, and the SNP rs13144478 was imputed (MACH r^2 = 0.9). The SNPs rs3931020 in the TYW3/CRYZ locus (1p31) and rs13144478 in the NDST4 locus (4q25-q26) showed directionally consistent associations with resistin levels in the GFS, and the P-values for the combined results of the discovery and replication stages reached genome-wide significance (P = 6.4 × 10^-12 and 6.2 × 10^-18, respectively; Table 2). Regional association plots of SNPs in the TYW3/CRYZ and NDST4 loci are displayed in Figure 1. In addition, there were eight SNPs in a previously known locus (12,20-23), the RETN gene (from 7630 to 7650 kb, on chromosome 19p13.2), with association results available in our discovery data. Most of these SNPs were imputed with modest imputation quality (0.2 ≤ MACH r^2 ≤ 0.7), and a previously reported SNP, rs3745367 (IVS2+181G/A) (20), showed the strongest association with resistin levels (P = 0.007) (Supplementary Material, Table S1). After combining our present result with previously reported data in the GFS (20), the SNP rs3745367 showed a suggestive association with circulating resistin levels (P = 2.66 × 10^-6; Table 2).
To further examine the positional candidate genes of the identified loci, gene expression quantitative trait loci (eQTL) analyses were performed in the Gene Expression Omnibus (GEO) database with expression data from human lymphoblastoid cell lines (26). The SNP rs3931020 showed a significant cis association with CRYZ gene expression levels (P = 3.68 × 10^-7), and another SNP, rs277369, in linkage disequilibrium with rs3931020 (r^2 = 0.711), showed the most significant association (P = 6.81 × 10^-24). The SNP rs3931020 also showed a significant but weaker cis association with TYW3 gene expression (P = 2.7 × 10^-4) than with CRYZ gene expression. Unfortunately, neither the SNP rs13144478 nor correlated SNPs near the NDST4 gene were available in the gene expression database.
We also examined the association between the two newly identified SNPs and RETN expression levels in white blood cells. Consistent with the associations with circulating resistin levels, the rs3931020 C allele was significantly associated with higher RETN mRNA levels (P = 0.023 under an additive inheritance model, Fig. 2A), and RETN mRNA was significantly higher in carriers of the SNP rs13144478 T allele than in non-carriers (P = 0.022 under a dominant inheritance model, since only one individual homozygous for the T allele was observed, Fig. 2B). Adjustment for age and sex did not change the observed associations between the SNPs and RETN mRNA expression.
We further tested the associations of the newly identified resistin-associated SNPs with risk of type 2 diabetes (2591 cases; 3052 controls) and coronary heart disease (CHD) (776 cases; 1677 controls) in case-control samples from the NHS and the Health Professionals Follow-up Study (HPFS) (Table 3).
In the pooled analysis, after adjustment for age and BMI, the SNP rs3931020 was associated with an increased CHD risk [odds ratio (OR) = 1.18, 95% CI 1.03-1.34; P = 0.01], in the same direction as its association with resistin levels, but not with type 2 diabetes risk. The association between the SNP rs3931020 and risk of CHD did not change after further adjustment for type 2 diabetes status. No association was observed between the SNP rs13144478 and type 2 diabetes or CHD risk. In addition, we examined the association between the SNP rs3931020 and risk of CHD in CARDIoGRAM (27), a consortium of GWA studies for CHD, and found a marginal association between this SNP and risk of CHD (P = 0.09).
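As a quick consistency check on the reported odds ratio, the standard error of the log odds ratio can be recovered from the confidence limits and turned into an approximate two-sided P-value. The sketch below assumes a Wald-type 95% interval on the log scale, which the text does not state explicitly; the function name is illustrative.

```python
import math


def p_from_or_ci(odds_ratio, ci_low, ci_high, z_crit=1.959964):
    """Approximate SE, z and two-sided P from an OR and its 95% CI.

    Assumes the interval was built as exp(log(OR) +/- z_crit * SE).
    """
    log_or = math.log(odds_ratio)
    se = (math.log(ci_high) - math.log(ci_low)) / (2.0 * z_crit)
    z = log_or / se
    # Two-sided normal tail probability via the complementary error function.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return se, z, p


# Values reported for rs3931020 and CHD in the pooled analysis.
se, z, p = p_from_or_ci(1.18, 1.03, 1.34)
```

For the reported values this gives a P-value on the order of 0.01, in line with the text; small differences are expected because the published OR and confidence limits are rounded.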

DISCUSSION
In the first two-stage GWA analysis of circulating resistin levels in individuals of European ancestry, we identified two novel loci near the TYW3/CRYZ gene (1p31) and the NDST4 gene (4q25), associated with resistin levels at a genome-wide significant level, best represented by SNP rs3931020 and SNP rs13144478, respectively. Our in silico analysis indicates that the SNP rs3931020 strongly regulates expression of the downstream CRYZ gene. The CRYZ gene encodes zeta-crystallin, an NADPH-dependent quinone reductase that is expressed in the lens and many other tissues (28). This protein specifically binds to adenine-uracil-rich elements in the 3′-UTR of mRNA, and it has been reported to act as a trans-acting factor in the regulation of certain mRNAs (29,30). We also found that the SNP rs3931020 was associated with resistin mRNA levels in human white blood cells, in the same direction as its genetic effect on circulating resistin levels. Although speculative, our data suggest that the association of the SNP rs3931020 with circulating resistin and RETN mRNA levels might be mediated through increased CRYZ expression, which in turn, by stabilizing RETN mRNA, increases circulating resistin levels. The SNP rs3931020 also regulates the upstream TYW3 gene, which is involved in wybutosine synthesis. Wybutosine is a hypermodified, tricyclic guanosine derivative found at the 3′-position adjacent to the anticodon of eukaryotic phenylalanine tRNA (31). Whether a potential role of the TYW3 gene in tRNA modification underlies the observed association between rs3931020 and resistin mRNA and circulating levels is presently unknown. Further mapping studies of this locus as well as functional studies are needed to address this possibility. Another novel finding of this study is that the resistin-raising allele (C allele) of the TYW3/CRYZ SNP rs3931020 was associated with increased CHD risk, but not with type 2 diabetes risk.
Our result is in line with the fact that most previous studies reported consistent associations between elevated circulating resistin levels and increased risk of cardiovascular disease (8-11,13,15-17), whereas the results on the association between resistin and type 2 diabetes were conflicting (3,5,7,11,14). The SNP rs13144478 is located downstream of the NDST4 gene, which encodes heparan sulfate N-deacetylase/N-sulfotransferase 4 (32) and is not known to be involved in the regulation of circulating resistin levels. The newly observed association between the NDST4 gene and resistin needs to be interpreted with caution, since cis eQTL data for the NDST4 SNP rs13144478 were not available and no other SNPs in or near this locus showed a strong association with resistin. More studies are needed to verify the results and to clarify the underlying mechanisms.
The major strengths of our study include large and well-defined cohorts, high-quality genotype data, minimal population stratification and consistent evidence from expression analysis (RETN mRNA). Several limitations need to be acknowledged. Resistin levels were lower in the replication sample than in the discovery cohorts, which might be due to differences in the laboratory methods used to measure resistin and in the characteristics of the study populations. However, these differences are unlikely to affect the genetic associations, as the association analyses were conducted separately in each cohort and then combined by a meta-analysis, and there was no significant heterogeneity in genetic associations among the study cohorts. We confirmed the association for one previously reported SNP, rs3745367, in the RETN locus (20), but the association did not reach genome-wide significance. Previously reported SNPs in the 3′ region of RETN (23) were not significantly associated with resistin levels in our study, which might be due to the poor imputation quality (MACH r^2 ≈ 0.2) of these SNPs in the discovery data. Owing to the relatively small sample size of our replication cohort, we selected only the four top independent signals identified in the discovery stage for replication. Studies with larger replication samples and a greater number of top-ranked GWA hits might identify additional genetic variants. In addition, our study samples consist exclusively of individuals of European ancestry. Therefore, these findings may not be generalizable to other populations (33). In conclusion, we found two novel loci at the TYW3/CRYZ and NDST4 gene regions associated with circulating resistin levels and provided consistent data on their association with RETN mRNA levels. We observed that the newly identified SNP rs3931020 at the TYW3/CRYZ locus, but not the SNP rs13144478 at the NDST4 locus, was associated with an increased risk of CHD.
More studies are warranted to verify the associations of the NDST4 locus with resistin and related disease. Further studies are needed to identify the molecular mechanisms that link these genomic loci to circulating resistin levels and to examine their effects on resistin-related disease risk.

Study populations
The present study is a two-stage GWA study of circulating resistin levels, following a study design similar to that of our previous GWA analyses of biomarkers (34-37). The discovery cohorts consisted of 3248 individuals from the NHS (n = 1590) and the Health ABC (n = 1658). The NHS was established in 1976 when 121 700 female registered nurses aged 30-55 years residing in 11 large US states completed a mailed questionnaire on their medical history and lifestyle (38). A total of 32 826 women provided blood samples between 1989 and 1990. NHS participants for the current GWA scans were a subset of women who had data on resistin levels (n = 1590) and were included in a nested case-control study of type 2 diabetes [3221 women (1467 cases, 1754 controls)] (39). For the analysis of associations between SNPs and risk of type 2 diabetes, we included the total nested case-control samples from the NHS (3221 women) and 2422 men (1124 cases, 1298 controls) from the HPFS (39). In addition, to test the associations with risk of CHD, we included nested case-control samples [1141 women (341 cases, 800 controls) and 1312 men (435 cases, 877 controls)] from the NHS and HPFS (40). Detailed information about these nested case-control studies has been described elsewhere (39,40). Plasma resistin was assayed by ELISA (Linco Research, Inc.). The minimum detectable concentration of this assay is 0.16 ng/ml for a sample size of 10 µl (diluted 1:10), with an intra-assay CV% of 3.2-7.0%. Genotyping was done using the Affymetrix Genome-Wide Human 6.0 Array (Santa Clara, CA, USA). Genotypic data were checked for quality as described in detail elsewhere (39). Briefly, all samples used in the present study achieved a call rate of ≥98%.
Individual SNPs were excluded if they were monomorphic, had a missing call rate of ≥2%, more than one discordance, a Hardy-Weinberg equilibrium (HWE) P-value of <1 × …
The Health ABC study is a longitudinal cohort study designed to investigate relationships among health conditions, body composition, social and behavioral factors and functional decline. The study population included 3075 well-functioning black and white men and women aged 70-79 (48% men, 42% black) from Pittsburgh, Pennsylvania and Memphis, Tennessee. Baseline interview and clinic-based examination occurred between April 1997 and June 1998. For this study, only white participants with a resistin measurement (n = 1658) were included. Resistin was measured in EDTA plasma with the Human Resistin ELISA Kit (Linco Research, Inc.) at the University of Pennsylvania. Intra- and inter-assay CV% for this assay were 4.5 and 7.4%, respectively. The lowest 5% extreme values were repeated. Genotyping was performed by the Center for Inherited Disease Research using the Illumina Human1M-Duo BeadChip system. Samples were excluded from the data set because of sample failure, genotypic sex mismatch or being a first-degree relative of an included individual based on genotype data.
In the replication sample of the GFS, a total of 659 non-diabetic individuals (247 men and 412 women) from 235 families were recruited in the Gargano area in central-eastern Italy and examined as previously described (20,41). Serum resistin concentrations were measured with a commercial ELISA kit (BioVendor, Brno, Czech Republic). Inter- and intra-assay CV% were 3.2-4 and 6.3-7.2%, respectively. It has been shown that resistin levels measured by the BioVendor ELISA are lower than those measured by the Linco ELISA, but the two methods have a high correlation coefficient of 0.92 according to the manufacturer's protocol for the Linco ELISA.
Replication SNPs were genotyped by the TaqMan SNP allelic discrimination technique, by means of an ABI 7000 (Applied Biosystems, CA, USA). Call rate and concordance rate were ≥98% and >99%, respectively. All the SNPs were in HWE (P > 0.05).
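A check of Hardy-Weinberg equilibrium such as the one mentioned above can be sketched with a 1-df chi-square goodness-of-fit test on genotype counts. The text does not say whether a chi-square or an exact test was used, so the choice of the chi-square test here is an assumption, and the function name and genotype counts are purely illustrative.

```python
import math


def hwe_chi2_test(n_aa, n_ab, n_bb):
    """Chi-square (1 df) test of Hardy-Weinberg equilibrium for a biallelic SNP.

    Returns (chi2, p_value). Expected counts come from the observed
    allele frequency p: n*p^2, 2*n*p*q and n*q^2.
    """
    n = n_aa + n_ab + n_bb
    p = (2 * n_aa + n_ab) / (2 * n)  # frequency of allele A
    q = 1.0 - p
    expected = (n * p * p, 2 * n * p * q, n * q * q)
    observed = (n_aa, n_ab, n_bb)
    # Skip classes with zero expected count (monomorphic markers).
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected) if e > 0)
    # Upper tail of a chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# Hypothetical genotype counts that sit exactly at HWE proportions.
chi2, p_value = hwe_chi2_test(250, 500, 250)
```

A SNP would pass a filter such as P > 0.05 when the observed genotype counts are close to the expected proportions. For rare alleles or small samples an exact test is usually preferred over this approximation.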

Statistical analysis
In the discovery stage, following the analytical strategy widely used in GWA consortia (25,42), we performed the GWA analysis separately in each study sample and then combined the results by meta-analysis. In the NHS, GWA analysis of resistin levels across 2.5 million SNPs (imputed data were expressed as allele dosage) was performed separately in control and diabetic case samples by linear regression under an additive genetic model, adjusting for age and BMI, in the ProbABEL package (43). In Health ABC, linear regression under an additive genetic model was used with adjustment for age, gender and study site. Meta-analysis of these three GWA scans was conducted using inverse variance weights under a fixed-effect model in METAL (http://www.sph.umich.edu/csg/abecasis/Metal). Associations between resistin levels and each SNP in the GFS were tested by a linear mixed-effects model implemented in SOLAR, which accounts for within-family correlations, with adjustment for age and gender. Each SNP was included in the model as a fixed effect with additive coding. Similarly, fixed-effect models were used to combine the results from the discovery and replication analyses in METAL. Resistin levels were log-transformed before analysis. The associations between SNPs and the risk of type 2 diabetes and CHD in the NHS and HPFS were analyzed using logistic regression, adjusted for age and BMI.
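The inverse-variance-weighted fixed-effect combination described above can be illustrated with a short Python sketch. This is not the METAL implementation, only the standard formula it is based on: each study's effect estimate is weighted by the reciprocal of its squared standard error. The function name and the per-study numbers are hypothetical.

```python
import math


def fixed_effect_meta(betas, ses):
    """Inverse-variance-weighted fixed-effect meta-analysis for one SNP.

    betas: per-study effect estimates (e.g. change in log resistin per allele)
    ses:   per-study standard errors
    Returns (pooled_beta, pooled_se, z, two_sided_p).
    """
    weights = [1.0 / se ** 2 for se in ses]
    w_sum = sum(weights)
    pooled_beta = sum(w * b for w, b in zip(weights, betas)) / w_sum
    pooled_se = math.sqrt(1.0 / w_sum)
    z = pooled_beta / pooled_se
    # Two-sided normal P-value; erfc stays accurate for very small tails.
    p = math.erfc(abs(z) / math.sqrt(2.0))
    return pooled_beta, pooled_se, z, p


# Hypothetical per-allele effects from three scans (not values from the study).
beta, se, z, p = fixed_effect_meta([0.10, 0.10, 0.10], [0.05, 0.05, 0.05])
```

Because the weights depend only on the standard errors, larger or more precise scans contribute more to the pooled estimate, and the pooled standard error shrinks as studies are added.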

eQTL and RETN mRNA expression analyses
We performed eQTL analyses in the GEO database, in which data on 408 273 SNPs and eQTL from measurements of 54 675 transcripts representing 20 599 genes in Epstein-Barr virus-transformed lymphoblastoid cell lines from 400 children were available (26). In the RETN mRNA expression analysis, RNA was extracted using the PAXgene Blood RNA System (PreAnalytiX, GmbH, Germany) from white blood cells of 68 patients with type 2 diabetes (38 male and 30 female, mean age 65.1 ± 7.0 years), consecutively recruited at the IRCCS 'Casa Sollievo della Sofferenza' in San Giovanni Rotondo (Gargano, Italy). RETN mRNA levels were measured by quantitative real-time PCR in triplicate and normalized to the GAPDH housekeeping gene. The difference in mRNA levels by genotype was estimated by ANCOVA models (SPSS 15.0, SPSS, Inc., Chicago, IL, USA), according to additive and dominant models of inheritance. Log-transformed values were used to account for the skewed distribution of the RETN/GAPDH ratios.
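The additive and dominant inheritance models differ only in how genotypes are coded before they are related to the log-transformed RETN/GAPDH ratio. The sketch below shows both codings and an unadjusted least-squares slope. It is a simplification of the ANCOVA described above (no age or sex covariates), and the function names, genotypes and expression ratios are hypothetical.

```python
import math


def additive_code(genotype, allele):
    """Number of copies of the allele (0, 1 or 2), e.g. 'AC' with 'C' gives 1."""
    return genotype.count(allele)


def dominant_code(genotype, allele):
    """1 if the genotype carries at least one copy of the allele, else 0."""
    return 1 if allele in genotype else 0


def ols_slope(x, y):
    """Least-squares slope of y on x (no covariates)."""
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    return sxy / sxx


# Hypothetical genotypes and RETN/GAPDH ratios for four subjects.
genotypes = ["AA", "AC", "CC", "AC"]
ratios = [1.0, math.exp(0.5), math.exp(1.0), math.exp(0.5)]
log_ratios = [math.log(r) for r in ratios]  # log transform for skewed ratios

additive_slope = ols_slope([additive_code(g, "C") for g in genotypes], log_ratios)
dominant_slope = ols_slope([dominant_code(g, "C") for g in genotypes], log_ratios)
```

Under additive coding the slope is the change in log ratio per allele copy; under dominant coding it is the difference in mean log ratio between carriers and non-carriers, which is the natural choice when homozygotes for the minor allele are too rare to estimate separately.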