Abstract

Family-based linkage analysis has been a powerful tool for identification of genes contributing to traits with monogenic patterns of inheritance. These approaches have been of limited utility in identification of genes underlying complex traits. In contrast, searches for common genetic variants associated with complex traits have been highly successful. It is now widely recognized that common variations frequently explain only part of the inter-individual variation in populations. ‘Rare’ genetic variants have been hypothesized to contribute significantly to phenotypic variation in the population. We have developed a combination of family-based linkage, whole-exome sequencing, direct sequencing and association methods to efficiently identify rare variants of large effect. Key to the successful application of the method was the recognition that only a few families in a sample contribute significantly to a linkage signal. Thus, a search for mutations can be targeted to a small number of families in a chromosome interval restricted to the linkage peak. This approach has been used to identify a rare (1.1%) G45R mutation in the gene encoding adiponectin, ADIPOQ. This variant explains a strong linkage signal (LOD > 8.0) and accounts for ∼17% of the variance in plasma adiponectin levels in a sample of 1240 Hispanic Americans and 63% of the variance in families carrying the mutation. Individuals carrying the G45R mutation have mean adiponectin levels that are 19% of non-carriers. We propose that rare variants may be a common explanation for linkage peaks observed in complex trait genetics. This approach is applicable to a wide range of family studies and has potential to be a discovery tool for identification of novel genes influencing complex traits.

INTRODUCTION

Family-based linkage analysis has been highly successful in locating and identifying genes that contribute to relatively rare disorders with monogenic patterns of inheritance. In contrast, efforts to extend these family-based approaches to common disorders and quantitative traits have been less successful. With few exceptions, human geneticists have turned to population-based studies searching for common variants that contribute to these disorders and traits. These population-based approaches have been highly successful, but it is now widely recognized that common variants explain relatively modest proportions of risk for disease or proportions of variance for continuous traits (1). Several sources have been proposed for the ‘missing heritability’ including rare variants, epigenetic mechanisms, copy number variations and gene–gene interactions.

The Insulin Resistance Atherosclerosis Family Study (IRASFS) is a multi-center study designed to identify genetic and environmental determinants of glucose homeostasis and adiposity in Hispanic American and African American populations (2). Hispanic American families were recruited from two sites, San Antonio, Texas, and the San Luis Valley, Colorado, and underwent comprehensive clinical phenotyping and genetic analysis. We previously reported striking evidence for linkage of plasma levels of the adipocytokine adiponectin to chromosome 3q27 with a LOD score of 8.21 (3) in 90 Hispanic families (n = 1153 subjects) from IRASFS. This linkage peak overlies the location of the adiponectin protein coding gene, ADIPOQ, but common polymorphisms in ADIPOQ were, at best, minimally associated with plasma adiponectin levels and explained little of the evidence for linkage in the overall sample (3). The high LOD score suggested that this result was unlikely to be due to chance, making it an attractive target for molecular genetic analysis with the goal of identifying the genetic variation(s) underlying the evidence for linkage. An integrated approach using whole-exome sequencing, direct sequencing, family-based linkage analysis and association analysis rapidly identified the trait-defining mutation.

RESULTS

Family-specific linkage analysis

Common variation in the ADIPOQ gene did not explain the strong evidence for linkage of adiponectin to chromosome 3 (LOD > 8.0) in Hispanic families from IRASFS (3). A notable feature of IRASFS is the relatively large family size (average of 12.8 subjects per family in the Hispanic sample). Consequently, a family-specific quantitative trait multipoint linkage analysis was performed on chromosome 3 for each of 80 families, i.e. focusing on individual family results rather than on the entire sample together. Markers were microsatellite markers from the prior genome scan linkage analysis (3–5). Table 1 summarizes, for each family, the maximum LOD score, the location (in cM) of the maximum LOD on chromosome 3 and the number of DNA samples in the family. Sixty-six families had maximum LOD scores ranging from 0 to 2.0 on chromosome 3. In the remaining 14 families (with LOD >2.0), the maximum LOD scores varied in magnitude and location, but two families, 1008 and 2010, had notable maximum LOD scores of 4.75 and 5.08, respectively, at 201 cM, a position that overlies the ADIPOQ gene. For families 1008 and 2010, DNA samples were available from 22 subjects in each family. When characteristics of the families were evaluated, they did not differ dramatically from the overall sample (Supplementary Material, Table S1) except for the mean level of plasma adiponectin in the two families: 7.0 ± 6.3 µg/ml (family 1008) and 8.0 ± 5.5 µg/ml (family 2010), compared with 13.6 ± 7.4 µg/ml for the entire IRASFS Hispanic sample (P < 0.001).
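
The family-level screen described above can be sketched in a few lines. The LOD scores and peak locations below are transcribed from Table 1; the function name, the 5 cM window and the LOD cut-off of 4.0 are illustrative assumptions rather than thresholds taken from the analysis itself.

```python
# Family-specific maximum LOD scores on chromosome 3, transcribed from Table 1
# as (family ID, maximum LOD, location of peak LOD in cM).
FAMILY_LODS = [
    ("2029", 2.07, 18), ("1002", 2.07, 188), ("1031", 2.07, 103),
    ("2021", 2.14, 216), ("1005", 2.21, 188), ("1007", 2.21, 178),
    ("1037", 2.48, 37), ("2030", 2.53, 104), ("1059", 2.68, 70),
    ("1032", 2.75, 18), ("1036", 3.08, 124), ("2001", 3.57, 186),
    ("1008", 4.75, 201), ("2010", 5.08, 201),
]


def families_driving_peak(lods, peak_cm, window_cm, min_lod):
    """Return IDs of families whose maximum LOD is at least min_lod and
    whose peak lies within window_cm of the sample-wide linkage peak."""
    return [
        family
        for family, lod, position in lods
        if lod >= min_lod and abs(position - peak_cm) <= window_cm
    ]


# Hypothetical thresholds chosen for illustration only.
selected = families_driving_peak(FAMILY_LODS, peak_cm=201, window_cm=5, min_lod=4.0)
print(selected)
```

With these illustrative settings, only families 1008 and 2010 are retained, matching the two families taken forward for sequencing.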

Table 1.

Summary of family-specific LOD scores for plasma adiponectin on chromosome 3

Family number   Number of subjects in family (G45R carriers)   LOD score   Location of peak LOD (cM)
66 families     mean size 12.6                                 all ≤2      —
2029                                                           2.07        18
1002            11                                             2.07        188
1031                                                           2.07        103
2021            16                                             2.14        216
1005            18                                             2.21        188
1007            10                                             2.21        178
1037                                                           2.48        37
2030            11                                             2.53        104
1059                                                           2.68        70
1032            13                                             2.75        18
1036            16                                             3.08        124
2001            21 (4)                                         3.57        186
1008            22 (10)                                        4.75        201
2010            22 (8)                                         5.08        201

Exome sequencing, mutation identification and contribution to linkage

Based on these observations, DNA samples from three subjects in families 1008 and 2010 were chosen for whole-exome sequencing. These samples included one subject from 1008 with adiponectin in the normal range (i.e. 7.96 µg/ml) and two samples with low adiponectin levels (2.61 and 2.85 µg/ml, one each from families 1008 and 2010, respectively). Exome sequencing resulted in an average of 15 148 high-quality variant calls (defined as minimum quality score of 10 for the consensus genotype with reference sequence hg18; http://genome.ucsc.edu/cgi-bin/hgGateway), of which an average of 1831 (12.1%) were not previously annotated in dbSNP. In the linkage region for adiponectin on chromosome 3, 177–191 Mb, the two low-adiponectin samples have only 93 and 68 variant calls (13 and 5 absent from dbSNP, respectively). Fifty of the variant calls were common to both low-adiponectin samples, but only a single variant call was common to the low-adiponectin samples and not previously annotated in dbSNP. This was a G → C at base 188053674 on chromosome 3 corresponding to a glycine to arginine coding change in exon 2 of the ADIPOQ gene at codon 45 (G45R).
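
The filtering logic in the preceding paragraph (restrict to the linkage interval, keep calls shared by both low-adiponectin samples, discard calls already annotated in dbSNP) can be illustrated with a minimal sketch. The `Variant` record and the `candidate_variants` function are hypothetical names, and every variant in the toy data except the reported chromosome 3 position 188053674 G → C change is invented for illustration; this is not the variant-calling pipeline used in the study.

```python
from typing import NamedTuple


class Variant(NamedTuple):
    chrom: str
    pos: int        # base position (the paper reports hg18 coordinates)
    ref: str
    alt: str
    in_dbsnp: bool  # True if the call is already annotated in dbSNP


def candidate_variants(samples, chrom, start, end):
    """Return variant calls that are shared by every sample, fall inside
    the interval [start, end] on chrom, and are absent from dbSNP."""
    shared = set.intersection(*(set(calls) for calls in samples))
    return sorted(
        v for v in shared
        if v.chrom == chrom and start <= v.pos <= end and not v.in_dbsnp
    )


# The G45R call as reported in the text; all other records are made up.
g45r = Variant("chr3", 188_053_674, "G", "C", False)
known_snp = Variant("chr3", 180_000_000, "A", "G", True)    # shared, but in dbSNP
private = Variant("chr3", 185_000_000, "C", "T", False)     # novel, one sample only
off_target = Variant("chr1", 1_000_000, "T", "C", False)    # shared, outside interval

low_sample_1008 = [g45r, known_snp, private, off_target]
low_sample_2010 = [g45r, known_snp, off_target]

hits = candidate_variants(
    [low_sample_1008, low_sample_2010], "chr3", 177_000_000, 191_000_000
)
print(hits)
```

In this toy example, the only call surviving all three filters is the G → C change at position 188053674.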

With this observation, the exons and neighboring regions of the ADIPOQ gene were sequenced using conventional DNA sequencing for members of families 1008 and 2010 in whom plasma adiponectin had been measured. Approximately 5.9 kb of DNA was sequenced in each of 44 samples. Nineteen single-base sequence variants were identified, of which five were novel (Supplementary Material, Table S2). Four coding variants were identified: two synonymous and two non-synonymous, including the G45R variant. The R variant was observed in 9 of 22 DNAs from family 1008 and 7 of 22 DNAs from family 2010. All subjects carrying the R allele were heterozygotes (G/R). Figure 1 shows the pedigrees of families 2010 and 1008 with genotype (G for glycine homozygotes and R for arginine/glycine heterozygotes) and individual values for plasma adiponectin levels. G45R carriers have substantially lower levels of plasma adiponectin (P-value = 1.21 × 10−17). On the basis of these observations, the entire sample of 1240 Hispanic subjects was genotyped for the G45R SNP and an additional 13 G45R individuals were identified. A total of 29 individuals in seven different families from both the San Luis Valley and San Antonio cohorts were identified with the G45R genotype. Individuals with the G45R variant were found in multiple families from the long-established Hispanic population (San Luis Valley) and families with a history of more recent immigration (San Antonio). We have surveyed over 4000 chromosomes in the African American population and 3000 chromosomes from European Americans and found no copies of the G45R. Individuals with the arginine allele have plasma adiponectin levels <20% of those of non-carriers: 2.6 ± 2.2 µg/ml (n = 25) compared with 13.9 ± 7.3 µg/ml (n = 1110). The R allele frequency was 18% in families 1008 and 2010, but only 1.1% in the entire cohort.

To evaluate the contribution of the G45R variant to the overall evidence of linkage, LOD scores were calculated in models with and without the inclusion of the G45R SNP (Fig. 2) as a covariate (in addition to age, gender, BMI and recruitment site). Adjusting for the G45R SNP reduced the maximum LOD from 8.02 to 0.83. Adjusting the linkage analysis for other variants in ADIPOQ had minimal impact on the linkage signal (3).

Figure 1.

Pedigrees of families 2010 (A) and 1008 (B) from the IRASFS. Generations are denoted to the left. Genotyped individuals are noted as ‘G’ (homozygous for G45) and ‘R’ (heterozygotes carrying both G45 and R45 alleles). Below each individual is the measured plasma adiponectin concentration in µg/ml.

Figure 2.

Linkage analysis of plasma adiponectin on chromosome 3 in IRASFS Hispanic families. Maximum LOD score is shown on the y-axis and centiMorgans position on chromosome 3 is shown in the x-axis. Blue line is linkage analysis of all families using age, gender, recruitment center and BMI as covariates. Green line is linkage analysis of all families using age, gender, recruitment center, BMI and G45R genotype as covariates.

Additional evidence that the G45R variant is the trait-altering variant came from two further experiments. First, other variants that were identified in the conventional sequencing survey were also assessed for association with plasma adiponectin levels in the linked families. The coding variants T31I and G51G (Supplementary Material, Table S2) had P-values for association with adiponectin of 0.11 and 0.16, respectively, and could be easily excluded. Only a single individual in family 2010 has the T31I and a single individual in family 1008 has the G51G. A greater challenge comes in trying to differentiate between the common SNP rs2241766 (G15G) and G45R. Supplementary Material, Figure S1, illustrates the pattern of association representative of common variants as observed for rs2241766 (G15G). Adiponectin levels vary by genotype at this common variant in the linked families 1008 and 2010, but do not vary significantly by genotype in the entire sample (Supplementary Material, Fig. S1A). In contrast, adiponectin levels vary by G45R genotype in both the linked families and the entire Hispanic sample (Supplementary Material, Fig. S1B). Thus, within the linked families, other variants may show evidence of association owing to cosegregation with the G45R, but will not show evidence of association in the entire cohort.

In addition, the size distribution of plasma adiponectin multimers was assessed in subjects carrying the G45R variant and their homozygous normal (G45G) family members. These results are summarized in Supplementary Material, Table S3 which shows that <6.3% of the adiponectin in G45R subjects is in the high-molecular-weight (HMW) form (multiple G45R samples have HMW levels less than the minimum level of detection). In samples from other family members homozygous for the normal allele, the average proportion of HMW is 67%.

G45R contribution to variance in adiponectin

With the substantial reduction in the LOD score and other evidence pointing to the functional consequences of carrying the G45R variant, we estimated the contribution of the G45R SNP to the variance in plasma adiponectin in the sample using several different models (Table 2). When all of the samples are considered, age, gender, recruitment center and PC explain 8.8% of the variance in plasma adiponectin. Addition of BMI increases this to 17%. Adding G45R genotype to the models increases the explained variance to 24 and 31%, respectively. Thus, G45R contributes 16–17% of the observed variance in plasma adiponectin levels in the complete sample of 1135 subjects with adiponectin values (Supplementary Material, Table S1) as determined from the partial R2P. Within the two linked families, 1008 and 2010, and all families which have at least one member with the G45R, the proportion of variance in adiponectin explained by the G45R is even greater: 62–63%.

Table 2.

Contribution of the G45R mutation to variance in plasma adiponectin levels

Proportion of variance explained by:

                        age, gender, center, PC            age, gender, center, PC, BMI
Sample                  Without G45R  Add G45R  R2P        Without G45R  Add G45R  R2P
All samples             0.088         0.237     0.157      0.173         0.313     0.168
Families with G45R      0.108         0.660     0.618      0.156         0.695     0.630
Families without G45R   0.088         0.088     —          0.182         0.182     —

R2P, partial R2: the proportion of variation in plasma adiponectin levels explained by G45R after adjusting for family structure and covariates (age, gender, center, PC and ±BMI).
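
As a rough cross-check of Table 2, the partial R2 can be approximated from the two nested model fits with the textbook formula (R2 with G45R − R2 without G45R) / (1 − R2 without G45R). This formula is an assumption introduced here for illustration: the reported R2P values were estimated from residuals within a variance component framework that accounts for family structure, so small differences from the approximation are expected. The function and dictionary names are hypothetical.

```python
def partial_r2(r2_reduced, r2_full):
    """Textbook partial R2: the share of variance left unexplained by the
    reduced model that is explained once the extra term is added."""
    if not 0.0 <= r2_reduced < 1.0:
        raise ValueError("r2_reduced must lie in [0, 1)")
    return (r2_full - r2_reduced) / (1.0 - r2_reduced)


# Values transcribed from Table 2:
# (R2 without G45R, R2 with G45R added, reported R2P)
TABLE2 = {
    ("All samples", "without BMI"): (0.088, 0.237, 0.157),
    ("All samples", "with BMI"): (0.173, 0.313, 0.168),
    ("Families with G45R", "without BMI"): (0.108, 0.660, 0.618),
    ("Families with G45R", "with BMI"): (0.156, 0.695, 0.630),
}

for (sample, model), (reduced, full, reported) in TABLE2.items():
    approx = partial_r2(reduced, full)
    print(f"{sample:20s} {model:12s} approx={approx:.3f} reported={reported:.3f}")
```

The approximation lands within about 0.01 of each reported R2P, which is consistent with the reported values being derived from the same pair of nested models.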

Association with biomedical traits

Levels of adiponectin protein have been associated with a wide range of biomedically important traits. Since the G45R variant leads to a reduction in mean plasma adiponectin levels of over 80%, we investigated whether the G45R variation is associated with other phenotypes that have been associated with adiponectin levels in the literature (6). Table 3 summarizes the results of association analyses with the G45R polymorphism and traits in the entire Hispanic sample and in the G45R-enriched families 1008 and 2010. Plasma adiponectin levels were strongly associated with G45R (P = 5.0 × 10−40 in the entire sample and P = 1.2 × 10−17 in families 1008 and 2010), with the R variant associated with lower adiponectin levels. In contrast, in the entire sample, there was limited evidence of association with other traits such as insulin sensitivity, a phenotype widely believed to be associated with adiponectin levels. There was some evidence of association with acute insulin response (AIR, first-phase insulin response), and trends in others (fasting glucose, diastolic blood pressure). The evidence for association was greater for AIR when analysis was restricted to the individuals in families 1008 and 2010, but was weaker for the other traits. Thus, the family-based design provides additional insight into the possible impact of this polymorphism.

Table 3.

Association analysis of G45R polymorphism and traits in the IRASFS

                                                        Entire IRASFS sample                                                  Families 1008 and 2010
                                                        Genotypic means                             Additive genotypic association a
                                                        1/1                   1/2
Trait                                                   Mean   SD     Median  Mean   SD     Median  P-value   Beta     SE     P-value   Beta     SE
Adiponectin (µg/mL)                                     13.9   7.3    12.5    2.62   2.22   2.16    5.03E−40  −1.17    0.09   1.21E−17  −1.21    0.09
Glucose homeostasis
  Insulin sensitivity index (SI; ×10−5 min−1/[pmol/L])  2.16   1.85   1.72    1.94   2.33   0.97    0.96      0.00     0.10   0.90      0.02     0.15
  AIR (pmol/l)                                          760    645    594     725    811    476     0.047     −4.97    2.50   0.0054    −9.67    3.29
  Disposition index (SI × AIR; ×10−5 min−1)             1324   1233   1008    1113   1469   586     0.29      −3.75    3.56   0.15      −7.40    5.05
  Fasting glucose (mg/dl)                               93.3   9.4    92.0    97.7   11.6   97.0    0.058     3.69     1.95   0.32      2.73     2.69
  Fasting insulin (µU/ml)                               14.8   10.9   12.0    18.6   9.7    19.0    0.21      0.17     0.13   0.08      0.25     0.14
  Metabolic clearance of insulin (min−1)                0.09   0.03   0.10    0.09   0.04   0.09    0.58      0.01     0.01   0.22      −0.02    0.01
Adiposity
  Body mass index (BMI; kg/m2) b                        28.8   6.14   27.9    30.2   5.72   29.5    0.35      0.04     0.04   0.92      0.00     0.06
  Visceral adipose tissue (cm2) b                       113    61     106     122    67     117     0.77      −0.16    0.55   0.62      −0.45    0.89
  Subcutaneous adipose tissue (cm2) b                   339    155    314     308    144    280     0.53      −0.56    0.90   0.26      −1.45    1.27
  Waist circumference (cm) b                            89.6   14.3   88.7    93.3   13.3   93.4    0.32      0.03     0.03   0.85      0.01     0.04
  Liver density (HU)                                    51.8   12.4   55.7    50.3   9.9    52.0    0.49      −163.22  238    0.67      −119.37  289.20
Lipids
  Triglycerides (mg/dl)                                 124    85     101     122    67     113     0.87      −0.02    0.12   0.68      −0.06    0.15
  HDL (mg/dl)                                           44.0   13.1   42.0    36.5   7.0    37.0    0.34      −0.05    0.06   0.38      −0.06    0.07
  LDL (mg/dl)                                           109    31     106     112    25     114     0.72      0.11     0.31   0.18      0.45     0.33
  Cholesterol (mg/dl)                                   178    38     174     173    29     175     0.90      −0.01    0.04   0.51      0.03     0.05
Hypertension
  Systolic blood pressure (mm)                          118.2  17.8   115.0   112.5  12.3   110.0   0.12      −0.04    0.03   0.21      −0.04    0.03
  Diastolic blood pressure (mm)                         76.1   9.8    76.0    71.3   8.2    70.0    0.077     −0.05    0.03   0.19      −0.04    0.03
  Albumin creatinine ratio (American; mg/g)             54.6   310    6.87    18.4   23.3   8.31    0.48      −0.01    0.02   0.28      −0.03    0.03
Inflammation
  IL6 (pg/ml)                                           2.56   3.01   1.87    2.99   2.06   2.45    0.14      0.20     0.14   0.98      0.00     0.17
  C-reactive protein (mg/ml)                            3.50   4.29   1.96    4.56   4.87   2.20    0.83      0.05     0.22   0.41      −0.22    0.26

Bold values denote P-values < 0.05.

a Covariate adjustment: age, sex, center, BMI, PC except where noted.

b Covariate adjustment: age, sex, center, PC (i.e. no BMI adjustment).

DISCUSSION

In this report, we describe a family-based approach which, in combination with whole-exome and direct sequencing, rapidly identified a rare variant (1.1% frequency) that contributes ∼17% of the variance in plasma adiponectin in this Hispanic American sample. This is an extremely efficient and cost-effective approach for identification of a rare variant compared with many proposed methods. The progression of identifying linkage at a specific location in the entire sample, followed by targeting individual families which are linked at that location, is a rapid and elegant approach that can be applied to many existing family-based studies. While the linkage to plasma adiponectin was chosen for detailed analysis due to the strong linkage evidence and an obvious gene target, the approach is generalizable. IRASFS has more than 50 LOD scores >3.0 in combined family analysis that spans a rich set of detailed glucose homeostasis, obesity, lipid and hypertension traits. Evaluation of a sample of these linkage peaks suggests that a small number of families drive most of the evidence for linkage for each linkage peak. When the approach is extended to individual families without reference to the overall linkage result, the number of linkages increases. For example, there are a total of 32 Hispanic families with LOD scores for insulin sensitivity >3.0 and 7 families with LOD scores >4.0 (Supplementary Material, Table S4). It is unlikely that all of these are true linkages, but similar to the adiponectin study, multiple families have mean insulin sensitivity measures in the upper or lower quintiles of the sample.

Key to the success of using this family-based method is the presence of a rare variant in the parent generation (e.g. generation I in Fig. 1). There are over 120 Hispanic and African American families in IRASFS and thus potentially 480 chromosomes in the founders from which rare variants can be inherited. This equates to >90% power to detect variants of 0.5% frequency. While this type of family study is no longer common, there are existing family cohorts to which this approach could be applied and new cohorts could be developed. In this study, large families facilitated application of the method, but methodological development targeting shared haplotypes may facilitate the use of smaller families. Equally, the gene in this study, ADIPOQ, was an obvious candidate. This study is not novel in the sense that there are now multiple examples where rare variants of significant effect have been identified in genes of known function (7–10).
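
The power statement above can be reproduced with a short calculation, under the assumption (ours, for illustration) that "power to detect" means the probability that at least one of the founder chromosomes carries a variant of the given population frequency, with chromosomes sampled independently. The figure of two founders per family is likewise an assumption used to arrive at the 480 chromosomes quoted in the text.

```python
def prob_at_least_one_copy(allele_freq, n_chromosomes):
    """Probability that a variant with population frequency allele_freq
    appears on at least one of n_chromosomes independent chromosomes."""
    return 1.0 - (1.0 - allele_freq) ** n_chromosomes


families = 120
founders_per_family = 2                              # assumption: one founder couple per pedigree
chromosomes = families * founders_per_family * 2     # two chromosomes per founder

power = prob_at_least_one_copy(0.005, chromosomes)
print(f"{chromosomes} founder chromosomes, P(at least one copy) = {power:.3f}")
```

Under these assumptions the probability is roughly 0.91, in line with the >90% figure given for variants of 0.5% frequency.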

The family-based approach we describe has the potential for generalization. As we have shown, it is now technically practical to perform whole-exome sequencing. While exome sequencing facilitated discovery of the G45R, this discovery could have easily been achieved using targeted conventional sequencing of the candidate ADIPOQ. Exome sequencing seems likely, however, to be essential if this method is extended to novel gene discovery. For complex phenotypes (e.g. insulin resistance), obvious candidate genes may not be readily recognized in the dozens to hundreds of genes encompassed by a linkage peak. Exome sequencing would be quicker, more comprehensive and likely less expensive to use in this scenario.

Individuals with the G45R variant were found in multiple families in a long-established Hispanic population (San Luis Valley) and families with a history of more recent immigration (San Antonio). We have surveyed over 4000 chromosomes in the African American population and 3000 chromosomes from European Americans and found no copies of the G45R.

The absence of strong association of the G45R with physiological traits is thought-provoking. While the number of G45R carriers is small (n = 29) in this sample, the association with plasma adiponectin is profound. Especially noteworthy is the absence of association with insulin sensitivity index (P = 0.9). Mouse adiponectin knockout models have been inconsistent in their conclusions on insulin sensitivity (10), but most are insulin resistant. The results with G45R raise the possibility that adiponectin is not a direct effector of insulin resistance or that these individuals can physiologically compensate for the low levels of adiponectin. Understanding such a hypothesized compensation could provide insight into the physiological role of adiponectin in humans. In addition, this family-based approach suggests an influence of adiponectin on AIR (Table 3), which is amplified in families with multiple carriers of the G45R variant. While limited, there are reports of adiponectin influences on insulin secretion in model systems (11).

It is likely that the characteristics of the adiponectin protein amplify the effect of the G45R mutation. Normally, much of circulating adiponectin is in the form of HMW multimers of 12–16 monomers of adiponectin (12). If the 45R-carrying monomers interfere with the assembly or secretion of HMW adiponectin, this could magnify the impact of the mutation on both genetic results and biological actions of the protein. This is what was observed when HMW adiponectin was evaluated (Supplementary Material, Table S3). Defects in multimerization and secretion have previously been described from in vitro experiments with ADIPOQ variants found in subjects with hypoadiponectinemia (12) which are entirely consistent with the G45R mutation described here. For G45R, this ‘perfect storm’ of biology and genetics likely amplifies the linkage signal for adiponectin. Interestingly, this also suggests that more nominal LOD scores for traits without these characteristics that amplify the linkage signal can have a similar molecular basis. Thus, the general method we have outlined could have results of equal interest for a wide range of biological processes in a richly phenotyped sample such as IRASFS.

Several recent studies (13–15) in large samples with GWAS data have convincingly demonstrated that common variants in the ADIPOQ locus are associated with circulating adiponectin levels in European-derived populations. It is striking that the β for association with plasma adiponectin for the G45R mutation, −1.17 (Table 3), is substantially greater in magnitude than that observed with any common variant. This single rare variant in Hispanics explains a greater proportion of the phenotypic variance (≈17%) than observed cumulatively for common variants, estimated by Heid et al. (16) to be 6.7%. Thus, a rare variant can be a major contributor to variation in a population, and in families segregating the variant, the contribution is greater still. The latter emphasizes the importance of a ‘personalized’ approach to rare variant analysis.

These observations do pose the question of whether undiscovered rare variants of large effect underlie at least part of the variance attributed to common variants. This seems possible, but testing it requires identification of such rare variants. The G45R variant described here is 523 bp from rs17366568 (Supplementary Material, Fig. S3), the most strongly associated SNP reported by Heid et al. (16), but, as noted, the G45R was not observed in European-derived DNAs. Hivert et al. (14) reported two common promoter SNPs in high linkage disequilibrium (r2 = 0.80) which were most strongly associated with adiponectin levels. Of these, rs17300539 was previously evaluated in IRASFS (3) and was modestly associated (P-value = 0.011) and contributed modestly to the linkage (LOD score reduction from 8.59 to 8.23). This variant, however, was associated with an increase in adiponectin levels (3). We speculate that variance in adiponectin is likely due to the sum of both common and rare variants, and our results suggest rare variants contribute a greater proportion of the variance.

While common in monogenic disorders, there are few reports describing the molecular basis for linkage to complex or quantitative traits (17,18), though these do include a report for a common variant in the ADIPOQ gene (19). A review of this literature shows that most recent studies have been performed with a focus on common variants. Classic monogenic disorders are distinguished by extreme phenotypes with high penetrance, which facilitates ascertainment of affected families in a background of a large number of unaffected families or individuals in the population. For complex, multigenic traits, which do not result in extreme, readily observable phenotypes, ascertainment of families in the population is much more challenging. While rare variants likely do not account for all linkage peaks, we propose that rare variants contribute significantly to observations of linkage to complex traits. The observations that we have made with adiponectin suggest evaluations of linkage peaks should encompass rare variants and that family-based approaches can greatly facilitate such studies.

MATERIALS AND METHODS

IRAS family study

Characteristics of the study sample are summarized in Supplementary Material, Table S1. The study design, recruitment and phenotyping for IRASFS have been described in detail (2). Briefly, IRASFS is designed to identify the genetic basis of insulin resistance and adiposity. Subjects in this report have been recruited from clinical centers in San Luis Valley, Colorado (a rural Hispanic population) and San Antonio, Texas (an urban Hispanic population). While a diagnosis of diabetes was not a requirement to participate, ∼16% of the subjects have diabetes. Family members were recruited to obtain an average of 12–13 family members. The examination included a fasting blood draw and medical history interview. The clinical examination included an insulin-modified frequently sampled glucose tolerance test (FSIGT) using the reduced sampling protocol (20), and measures of glucose homeostasis were computed using the results of the FSIGT combined with the MINMOD analysis program. Height, weight, waist and hip circumferences were measured, and computed tomography was used to estimate visceral and subcutaneous fat and liver density. Methods for determining these phenotypes have been described in detail (2).

Laboratory methods

Extensive analysis of biomarkers was performed. Relevant to this study, plasma adiponectin levels were measured by radioimmunoassay (RIA; Linco Research, St Charles, MO, USA) in 1153 individuals. This RIA uses a polyclonal anti-adiponectin antibody which recognizes trimers and higher multimers of adiponectin and includes recognition of the globular domain. In addition, samples have been measured with a monoclonal antibody-based ELISA, with similar results to the RIA. In a third analysis, the proportion of adiponectin present in the HMW isoform was assayed using ELISAs (Millipore Corporation, Billerica, MA, USA) for total and HMW adiponectin and calculating the proportion of HMW.

Exome sequencing was performed using the SureSelect Human All Exon Target Enrichment System (Agilent Technologies, version 1.01, Santa Clara, CA, USA) and high-throughput sequencing (Illumina Genome Analyzer IIx) at the Hudson Alpha Institute (Huntsville, AL, USA). Variant calling was performed with Crossbow 2.0 (http://bowtie-bio.sourceforge.net/crossbow/index.shtml) (21). For targeted exon sequencing of ADIPOQ, fragments were PCR-amplified and directly sequenced using Big Dye Ready Reaction Mix on an ABI3730xl sequencer (Applied Biosystems, Foster City, CA, USA). Sequence data were visualized using Sequencher Software version 4.9 (GeneCodes Corporation, Ann Arbor, MI, USA). SNP genotyping was performed on a Sequenom MassArray Genotyping System (Sequenom, San Diego, CA, USA) using methods described previously (22). Discordance between blind duplicate samples included in the genotyping was <0.2%. Microsatellite marker genotyping data for chromosome 3 were available from a 10 cM genome scan performed by the Mammalian Genotyping Center in Marshfield, Wisconsin.

Statistical analysis

Our approach to linkage analysis has been previously summarized in detail (3–5). Briefly, multipoint identity-by-descent estimates were computed using LOKI (23,24). Quantitative trait variance component linkage analysis of chromosome 3 was then performed separately for each pedigree using the Sequential Oligogenic Linkage Analysis Routines (SOLAR) program (25). Plasma adiponectin was log-transformed to best approximate normality and homogeneity of variance conditional on the covariates of age, gender, recruitment center and BMI. To evaluate the contribution of the ADIPOQ G45R variant to the linkage signal, linkage analyses were repeated adjusting for the G45R SNP (coded as presence/absence because there were no R homozygotes).
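For orientation, a variance component linkage model of the general kind implemented in SOLAR (25) can be sketched as follows. The notation is introduced here for illustration and is an assumption, not taken from the analysis itself: y is the vector of log-transformed adiponectin values in a pedigree, X the covariate matrix, \hat{\Pi} the matrix of estimated identity-by-descent sharing at the tested position, \Phi the kinship matrix, and \sigma_q^2, \sigma_g^2 and \sigma_e^2 the locus-specific, residual polygenic and environmental variances.

```latex
\begin{align}
y &\sim \mathcal{N}(X\beta,\ \Omega), \\
\Omega &= \hat{\Pi}\,\sigma_q^2 + 2\Phi\,\sigma_g^2 + I\,\sigma_e^2, \\
\mathrm{LOD} &= \log_{10}
  \frac{L(\hat{\beta}, \hat{\sigma}_q^2, \hat{\sigma}_g^2, \hat{\sigma}_e^2)}
       {L(\tilde{\beta}, \sigma_q^2 = 0, \tilde{\sigma}_g^2, \tilde{\sigma}_e^2)}.
\end{align}
```

Under this sketch, adjusting for G45R corresponds to adding carrier status as a column of X. If the variant accounts for the linkage signal, the estimate of \sigma_q^2, and with it the LOD score, would be expected to fall toward zero.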

The test for association with the G45R polymorphism (presence/absence) and the proportion of variation (R²) it explained were computed using the variance component measured genotype model, also implemented in SOLAR. A partial R² was computed on the residuals from the model regressing adiponectin onto the above covariates. These tests of association and estimates of R² were adjusted for admixture via a principal component based on 80 ancestry informative markers previously genotyped in the sample (22). Haplotype pairs (haplo-genotypes) for each individual were assigned as the most likely haplotypes from the expectation–maximization algorithm implemented in the program Dandelion (www.phs.wfubmc.edu), based on 14 polymorphic SNPs (3,26).
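The partial R² calculation can be illustrated with a minimal sketch. It is not the SOLAR measured genotype model: it uses ordinary least squares, ignores relatedness within pedigrees, and defines the partial R² as the squared correlation between trait and genotype after both have been residualized on the covariates, which is one common definition and an assumption here. The function names and the simulated data (two covariates standing in for the full covariate set, 20 carriers among 1000 individuals) are hypothetical.

```python
import numpy as np


def residualize(y, covariates):
    """Residuals of y after least-squares regression on covariates plus an intercept."""
    design = np.column_stack([np.ones(len(y)), covariates])
    beta, *_ = np.linalg.lstsq(design, y, rcond=None)
    return y - design @ beta


def partial_r2(trait, covariates, genotype):
    """Squared partial correlation between trait and genotype given the covariates."""
    trait_resid = residualize(trait, covariates)
    geno_resid = residualize(genotype, covariates)
    r = np.corrcoef(trait_resid, geno_resid)[0, 1]
    return r ** 2


# Synthetic data for illustration only; none of these values come from the study.
rng = np.random.default_rng(0)
n = 1000
age = rng.uniform(20, 80, n)
bmi = rng.normal(28, 5, n)
carrier = np.zeros(n)
carrier[:20] = 1.0  # presence/absence coding of a rare variant
log_adiponectin = (
    2.5 + 0.005 * age - 0.02 * bmi - 1.5 * carrier + rng.normal(0, 0.5, n)
)

covariates = np.column_stack([age, bmi])
r2_demo = partial_r2(log_adiponectin, covariates, carrier)
print(f"partial R^2 for carrier status: {r2_demo:.3f}")
```

Residualizing the genotype as well as the trait keeps the estimate from being distorted when carrier status is correlated with a covariate such as recruitment center. A mixed model with a kinship-based covariance, as in SOLAR, would be needed to reproduce a family-based analysis.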

SUPPLEMENTARY MATERIAL

Supplementary Material is available at HMG online.

Conflict of Interest statement. None declared.

FUNDING

This work was supported by the National Institutes of Health (HL060894, HL060931, HL060944, HL061019 and HL061210); the Cedars-Sinai Board of Governors' Chair in Medical Genetics (to J.I.R.); the General Clinical Research Centers Program (NCRR grant M01RR00069); the Diabetes Endocrinology Research Center Grant (DDK063491); and the Wake Forest University Health Sciences Center for Public Health Genomics.

REFERENCES

1. Manolio T.A., Collins F.S., Cox N.J., Goldstein D.B., Hindorff L.A., Hunter D.J., McCarthy M.I., Ramos E.M., Cardon L.R., Chakravarti A., et al. Finding the missing heritability of complex diseases. Nature, 2009, vol. 461 (pg. 747-753)
2. Henkin L., Bergman R.N., Bowden D.W., Ellsworth D.L., Haffner S.M., Langefeld C.D., Mitchell B.D., Norris J.M., Rewers M., Saad M.F., et al. Genetic epidemiology of insulin resistance and visceral adiposity. The IRAS Family Study design and methods. Ann. Epidemiol., 2003, vol. 13 (pg. 211-217)
3. Guo X., Saad M.F., Langefeld C.D., Williams A.H., Cui J., Taylor K.D., Norris J.M., Jinagouda S., Darwin C.H., Mitchell B.D., et al. Genome-wide linkage of plasma adiponectin reveals a major locus on chromosome 3q distinct from the adiponectin structural gene: the IRAS family study. Diabetes, 2006, vol. 55 (pg. 1723-1730)
4. Rich S.S., Bowden D.W., Haffner S.M., Norris J.M., Saad M.F., Mitchell B.D., Rotter J.I., Langefeld C.D., Hedrick C.C., Wagenknecht L.E., et al. A genome scan for fasting insulin and fasting glucose identifies a quantitative trait locus on chromosome 17p: the insulin resistance atherosclerosis study (IRAS) family study. Diabetes, 2005, vol. 54 (pg. 290-295)
5. Rich S.S., Bowden D.W., Haffner S.M., Norris J.M., Saad M.F., Mitchell B.D., Rotter J.I., Langefeld C.D., Wagenknecht L.E., Bergman R.N. Identification of quantitative trait loci for glucose homeostasis: the Insulin Resistance Atherosclerosis Study (IRAS) Family Study. Diabetes, 2004, vol. 53 (pg. 1866-1875)
6. Yamauchi T., Kadowaki T. Physiological and pathophysiological roles of adiponectin and adiponectin receptors in the integrated regulation of metabolic and cardiovascular diseases. Int. J. Obes. (Lond.), 2008, vol. 32 Suppl. 7 (pg. S13-S18)
7. Kathiresan S., Willer C.J., Peloso G.M., Demissie S., Musunuru K., Schadt E.E., Kaplan L., Bennett D., Li Y., Tanaka T., et al. Common variants at 30 loci contribute to polygenic dyslipidemia. Nat. Genet., 2009, vol. 41 (pg. 56-65)
8. Lusis A.J., Pajukanta P. A treasure trove for lipoprotein biology. Nat. Genet., 2008, vol. 40 (pg. 129-130)
9. Meyre D., Proulx K., Kawagoe-Takaki H., Vatin V., Gutierrez-Aguilar R., Lyon D., Ma M., Choquet H., Horber F., Van Hul W., et al. Prevalence of loss-of-function FTO mutations in lean and obese individuals. Diabetes, 2010, vol. 59 (pg. 311-318)
10. Yano W., Kubota N., Itoh S., Kubota T., Awazawa M., Moroi M., Sugi K., Takamoto I., Ogata H., Tokuyama K., et al. Molecular mechanism of moderate insulin resistance in adiponectin-knockout mice. Endocr. J., 2008, vol. 55 (pg. 515-522)
11. Okamoto M., Ohara-Imaizumi M., Kubota N., Hashimoto S., Eto K., Kanno T., Kubota T., Wakui M., Nagai R., Noda M., et al. Adiponectin induces insulin secretion in vitro and in vivo at a low glucose concentration. Diabetologia, 2008, vol. 51 (pg. 827-835)
12. Waki H., Yamauchi T., Kamon J., Ito Y., Uchida S., Kita S., Hara K., Hada Y., Vasseur F., Froguel P., et al. Impaired multimerization of human adiponectin mutants associated with diabetes. Molecular structure and multimer formation of adiponectin. J. Biol. Chem., 2003, vol. 278 (pg. 40352-40363)
13. Heid I.M., Henneman P., Hicks A., Coassin S., Winkler T., Aulchenko Y.S., Fuchsberger C., Song K., Hivert M.F., Waterworth D.M., et al. Clear detection of ADIPOQ locus as the major gene for plasma adiponectin: results of genome-wide association analyses including 4659 European individuals. Atherosclerosis, vol. 208 (pg. 412-420)
14. Hivert M.F., Manning A.K., McAteer J.B., Florez J.C., Dupuis J., Fox C.S., O'Donnell C.J., Cupples L.A., Meigs J.B. Common variants in the adiponectin gene (ADIPOQ) associated with plasma adiponectin levels, type 2 diabetes, and diabetes-related quantitative traits: the Framingham Offspring Study. Diabetes, 2008, vol. 57 (pg. 3353-3359)
15. Richards J.B., Waterworth D., O'Rahilly S., Hivert M.F., Loos R.J., Perry J.R., Tanaka T., Timpson N.J., Semple R.K., Soranzo N., et al. A genome-wide association study reveals variants in ARL15 that influence adiponectin levels. PLoS Genet., 2009, vol. 5 pg. e1000768
16. Heid I.M., Henneman P., Hicks A., Coassin S., Winkler T., Aulchenko Y.S., Fuchsberger C., Song K., Hivert M.F., Waterworth D.M., et al. Clear detection of ADIPOQ locus as the major gene for plasma adiponectin: results of genome-wide association analyses including 4659 European individuals. Atherosclerosis, 2010, vol. 208 (pg. 412-420)
17. Hugot J.P., Chamaillard M., Zouali H., Lesage S., Cezard J.P., Belaiche J., Almer S., Tysk C., O'Morain C.A., Gassull M., et al. Association of NOD2 leucine-rich repeat variants with susceptibility to Crohn's disease. Nature, 2001, vol. 411 (pg. 599-603)
18. Parson W., Kraft H.G., Niederstatter H., Lingenhel A.W., Kochl S., Fresser F., Utermann G. A common nonsense mutation in the repetitive Kringle IV-2 domain of human apolipoprotein(a) results in a truncated protein and low plasma Lp(a). Hum. Mutat., 2004, vol. 24 (pg. 474-480)
19. Pollin T.I., Tanner K., O'Connell J.R., Ott S.H., Damcott C.M., Shuldiner A.R., McLenithan J.C., Mitchell B.D. Linkage of plasma adiponectin levels to 3q27 explained by association with variation in the APM1 gene. Diabetes, 2005, vol. 54 (pg. 268-274)
20. Steil G.M., Volund A., Kahn S.E., Bergman R.N. Reduced sample number for calculation of insulin sensitivity and glucose effectiveness from the minimal model. Suitability for use in population studies. Diabetes, 1993, vol. 42 (pg. 250-256)
21. Langmead B., Trapnell C., Pop M., Salzberg S.L. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol., 2009, vol. 10 pg. R25
22. Palmer N.D., Langefeld C.D., Ziegler J.T., Hsu F., Haffner S.M., Fingerlin T., Norris J.M., Chen Y.I., Rich S.S., Haritunians T., et al. Candidate loci for insulin sensitivity and disposition index from a genome-wide association analysis of Hispanic participants in the Insulin Resistance Atherosclerosis (IRAS) Family Study. Diabetologia, vol. 53 (pg. 281-289)
23. Heath S.C. Markov chain Monte Carlo segregation and linkage analysis for oligogenic models. Am. J. Hum. Genet., 1997, vol. 61 (pg. 748-760)
24. Heath S.C., Snow G.L., Thompson E.A., Tseng C., Wijsman E.M. MCMC segregation and linkage analysis. Genet. Epidemiol., 1997, vol. 14 (pg. 1011-1016)
25. Almasy L., Blangero J. Multipoint quantitative-trait linkage analysis in general pedigrees. Am. J. Hum. Genet., 1998, vol. 62 (pg. 1198-1211)
26. Sutton B.S., Weinert S., Langefeld C.D., Williams A.H., Campbell J.K., Saad M.F., Haffner S.M., Norris J.M., Bowden D.W. Genetic analysis of adiponectin and obesity in Hispanic families: the IRAS Family Study. Hum. Genet., 2005, vol. 117 (pg. 107-118)