Maciej Trzaskowski, Paul Lichtenstein, Patrik K Magnusson, Nancy L Pedersen, Robert Plomin, Application of linear mixed models to study genetic stability of height and body mass index across countries and time, International Journal of Epidemiology, Volume 45, Issue 2, April 2016, Pages 417–423, https://doi.org/10.1093/ije/dyv355
Abstract
Background: It is now possible to estimate genetic correlations between two independent samples when there is no overlapping phenotypic information. We applied the latest bivariate genomic methods to children in the UK and older adults in Sweden to ask two questions. Are the same variants driving individual differences in anthropometric traits in these two populations, and are these variants as important in childhood as they are later in life?
Methods: A sample of 3152 11-year-old children in the UK was compared with a sample of 6813 adults with an average age of 65 in Sweden. Genotypes were imputed from 1000 genomes with combined 9 767 136 single nucleotide polymorphisms meeting quality control criteria in both samples. Two cross-sample GCTA-GREML analyses and linkage disequilibrium (LD) score regressions were conducted to assess genetic correlations across more than 50 years: child versus adult height and child versus adult body mass index (BMI). Consistency of effects was tested using the recently proposed polygenic scoring method.
Results: For height, GCTA-GREML and LD score regression indicated strong genetic stability between children and adults, 0.58 (0.16) and 1.335 (1.09), respectively. For BMI, both methods produced similarly strong estimates of genetic stability, 0.75 (0.26) and 0.855 (0.49), respectively. The adult polygenic score explained 60% of genetic variance in childhood height and 10% of variance in childhood BMI.
Conclusions: Here we replicated and extended previous findings of longitudinal genetic stability in anthropometric traits to cross-cultural dimensions, and showed that for height but not BMI these variants are as important in childhood as they are in adulthood.
Key Messages
There are new genomic methods that enable estimation of genetic covariance from individual genotype and GWA summary statistics data where no phenotypic covariance is available.
Strong genetic stability can be observed in anthropometric traits over more than 50 years.
Genetic stability is very strong even when comparing children in the UK and adults in Sweden.
The analytical polygenic prediction method shows that for height, but not BMI, these variants are as important in childhood as they are in adulthood.
Introduction
Recent developments in genetic technology (genome-wide genotyping) and methodology (e.g., the genomic-relatedness-based restricted maximum likelihood model implemented in genome-wide complex trait analysis software: GCTA-GREML) have made it possible to conduct quantitative genetic research on unrelated individuals rather than family members such as twins. 1,2 This advance enabled estimates of genetic influence on individual differences in population samples that can be much bigger than family samples. A bivariate extension to this model also permits estimation of co-heritability between traits. 3,4 An especially interesting application is fitting a bivariate model to two unrelated samples with phenotypic information on one trait at two non-overlapping ages, but where each sample only has data on one of the ages (we could call this a ‘cross-sample’ GCTA-GREML). This approach has previously been used to show strong overlap between genetic influences on schizophrenia in individuals of European ancestry and those of African descent. 5 One of the phenomena that is well suited to testing with cross-sample GCTA-GREML, but has not yet been investigated using this method, is age-to-age genetic stability: the extent to which the same genetic factors influence a trait across development.
Cross-sample GCTA-GREML makes it possible to assess long-term genetic stability by comparing a sample of unrelated children from one study with a sample of unrelated adults from another study. Application of this method reduces the need for the ‘single-sample’ long-term longitudinal design by overcoming some of its limitations. Although the prospective longitudinal design is still the best design for genetic investigation of developmental change and continuity, it is very costly in time and is often troubled by attrition that could lead to measurement biases and a reduction in power. Until recently, this traditional longitudinal design has been the only way to investigate genetic stability, but now we can use cross-sample GCTA-GREML to investigate long-term genetic stability and change.
Age-to-age genetic stability has previously been reported in longitudinal twin and extended family studies for several complex traits including BMI (body mass index) 6,7 and height. 8 These studies were traditional longitudinal studies in which the same twins were assessed from age to age; as a result, the longest age span investigated was from age 20 to age 45. 7 There were strong stabilities over 25 years—phenotypic stability was 0.89 for height and 0.54 for BMI. 7 The authors also reported that most of this phenotypic stability is due to genetic factors, with genetic correlations estimated from the twin design as 0.97 for height and 0.69 for BMI. In a short-term longitudinal analysis of twin data on BMI from ages 4 to 10 years, we reported a twin-estimated genetic correlation of 0.58 [95% confidence interval (CI): 0.48 to 0.68]. 9 In the same study we found suggestive substantial genetic stability (r = 0.66; 95% CI: −0.28 to 1.00), using bivariate GCTA-GREML for unrelated individuals and an increase in influence of a polygenic predictor (based on 32 earlier reported variants 10 ) on BMI. Comparison of these methods within the same sample supported somewhat puzzling findings of strong genetic stability even in the presence of increasing heritability. A more recent long-term longitudinal study used the same polygenic predictor suggesting similar conclusions. 11 The authors showed that higher obesity genetic risk was associated with higher average BMI and a steeper increase in BMI between early adulthood and age 65.
Strong genetic stability was also found in other complex traits. For example, a short-term longitudinal study of cognitive abilities from ages 7 to 12 years found substantial genetic stability, r = 0.73 (95% CI: 0.16 to 1.00), using the above-mentioned bivariate GCTA-GREML method. 12 Similar genetic stability in cognitive abilities was reported in an earlier long-term longitudinal study of unrelated individuals. 13 This study also used a repeated measures bivariate GCTA-GREML design, but it spanned more than 50 years (ages 11 and 65–79). The genetic correlation from childhood to adulthood was 0.62 (95% CI: 0.19 to 1.00); this 95% confidence interval, just like in our short-term study, did not overlap zero but did overlap one. This suggested that nearly two-thirds of the genetic influences on general intelligence at old age are accounted for by the genetic influences in childhood.
In the present study we examined 50-year genetic stability of height and body mass index (BMI). However, unlike the previous longitudinal publications, here we present, for the first time, cross-sample GCTA-GREML analyses of genetic stability for height and BMI using a sample of UK children and a sample of older Swedish adults. This allows us to investigate genetic stability across 50 years and to compare the genetic architecture of anthropometric traits in childhood and adulthood using data where longitudinal repeated measurements are not available. Furthermore, it has been shown that genetic estimates could be affected by linkage disequilibrium (LD) between genotyped markers and causal variants. 14,15 It has been previously discussed that such an effect can bias estimates of single nucleotide polymorphism (SNP) heritability, 14–19 but we have also shown that this would not affect estimates of genetic correlation. 12 We used the recently developed LD score regression (LDSC), 14 as it explicitly controls for the effect of LD. Therefore, if this effect were real, we would observe a difference in estimates of SNP heritability but not in genetic correlations. Finally, we wanted to address another aspect of genetic stability that has previously been examined in BMI data within the Swedish population. 11 Specifically, provided that we found evidence of genetic stability, we wanted to see whether those same SNPs played equally important roles at both ages. Genetic stability could be strong but effect sizes could be completely different. To do that we used the recently developed analytical polygenic predictor method. 20,21 We therefore asked two questions related to genetic stability of anthropometric traits: do the same variants affect children in the UK and adults in Sweden? Is the effect of these variants as important in adult life as it is in childhood?
Material and methods
Samples and genotyping
Twins Early Development Study
The UK sample was drawn from the Twins Early Development Study (TEDS), a longitudinal multivariate study of approximately 11 000 twin pairs born in England and Wales between 1994 and 1996 22 and representative of the UK population. 23 The project received ethical approval from the Institute of Psychiatry ethics committee (05/Q0706/228). Parental consent was obtained before data collection. DNA and anthropometric data were collected when children were 11 years old [mean = 11.26 years, standard deviation (SD) = 0.69]. After quality control, data with no missing measure of height and BMI were available for 2221 and 2186 unrelated individuals, respectively (Table 2), and approximately 7.7 million SNPs (see Supplementary data for detailed description of quality control procedures, available at IJE online).
TwinGene
TwinGene is a sample of approximately 12 600 twins drawn from the Swedish Twin Registry, 24 assessed when adults were aged 47 to 94 (mean = 64.81, SD = 8.26). The current sample was based on one individual from each twin pair. Quality control resulted in a sample with no missing measure of height and BMI of 5938 and 5928 unrelated individuals, respectively ( Table 2 ), and ∼ 9.7 million SNPs (see Supplementary data , available at IJE online). There were 7 686 666 variants in common across the two samples and 9 767 136 across the combined data.
Measures
TEDS
Children’s height and weight in TEDS were self-reported using tape measures that were sent to all families. The twins were allowed to report in either metric or imperial units; data were later re-coded into metric units. Reported heights and weights were validated against measures collected in person: parent- and researcher-measured heights and weights correlated 0.90 and 0.83, respectively, in a subsample of 228 families. 25 BMI was calculated as weight (kg) / height² (m²). Given that we were comparing children’s anthropometric measures with adults’, we needed to standardise the scores of each child. This was done because, unlike adult growth, child development does not follow a linear pattern and varies by age and gender. For this reason, children’s BMI values were standardised to age- and gender-appropriate UK 1990 population reference data. 9,25
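As a minimal illustration of the BMI calculation described above, the following Python sketch converts imperial reports to metric units and computes BMI as weight (kg) divided by height (m) squared. The function names and example values are illustrative assumptions, not code or data from the study, and the age- and gender-specific standardisation against the UK 1990 reference data is not reproduced here.

```python
# Illustrative sketch only: function names and example values are hypothetical.

INCH_TO_M = 0.0254        # exact definition of the inch in metres
POUND_TO_KG = 0.45359237  # exact definition of the pound in kilograms


def to_metric(height_in, weight_lb):
    """Convert height in inches and weight in pounds to (metres, kilograms)."""
    return height_in * INCH_TO_M, weight_lb * POUND_TO_KG


def bmi(weight_kg, height_m):
    """Body mass index: weight (kg) divided by height (m) squared."""
    if height_m <= 0:
        raise ValueError("height must be positive")
    return weight_kg / height_m ** 2


# Made-up example: a child reported as 58 inches tall and 85 pounds.
height_m, weight_kg = to_metric(58, 85)
example_bmi = bmi(weight_kg, height_m)
```

Keeping the unit conversion separate from the BMI formula mirrors the two-step process in the text, where reports were first re-coded into metric units and BMI was computed afterwards.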
TwinGene
Anthropometric measures were collected during DNA collection. 24 Each participant was asked to visit their local healthcare centre, where blood samples and health and anthropometric measures were collected. Participants’ weight and height were recorded without shoes and in light clothing. BMI was calculated as specified above. Of note, we did not remove BMI outliers from either TEDS or TwinGene, as their raw measures of height and weight were within a plausible range.
Covariates
Genomic data were pruned for LD using PLINK 1.07, 26 resulting in ∼ 100 000 SNPs. This thinned SNP set was then used to calculate ancestral axes for the combined sample (children and adults); 20 principal components (PCs) were calculated using PLINK 1.90 27 and all were used as covariates in the main analyses. Age and gender were also included as fixed covariates. For comparison, we also used EIGENSTRAT-calculated 28 within-sample ancestral axes from previous analyses, produced independently for TEDS and TwinGene, before performing cross-sample GCTA-GREML analysis; we obtained nearly identical results (not shown).
Analyses
The bivariate cross-sample GCTA-GREML was first described in detail by Visscher et al., 4 and we have detailed its important algorithms and aspects in the Supplementary data, available at IJE online. Here we briefly highlight its important conceptual elements. First, genetic influence is estimated from across the whole genome (algorithm 1 in Supplementary data, available at IJE online). Second, the model is based on a simple linear mixed design where additive genetic influences are modelled as random effects (see Supplementary data). Third, in contrast to the standard repeated measures design (see Supplementary data), the data have no phenotypic overlap. In our example, one sample has childhood data only and the other sample has only adult data available. For that reason, the only influence that can be estimated on the covariance is genetic; there is no residual covariance (see Supplementary data). P-values for SNP heritability estimates are obtained from univariate likelihood ratio tests (LRT), where the fit of the full model (with genetic component) is compared with the fit of a reduced ‘null’ model (with no genetic component). Similarly, P-values for correlation estimates are from LRT. However, here the test is applied to a bivariate model where the fit of the full model is compared with a reduced model where the genetic correlation is fixed to either zero or one (see Supplementary data, available at IJE online).
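For orientation, the model described above can be written in the conventional bivariate linear mixed model form. This is a sketch in standard notation rather than a reproduction of the authors' Supplementary algorithms: the symbols are assumptions of this sketch, with subscripts c and a denoting the child and adult samples, y the phenotypes, X the fixed-effect covariates, g the additive genetic random effects, e the residuals and A the blocks of the genetic relatedness matrix.

```latex
\begin{aligned}
\mathbf{y}_c &= \mathbf{X}_c\mathbf{b}_c + \mathbf{g}_c + \mathbf{e}_c, \qquad
\mathbf{y}_a = \mathbf{X}_a\mathbf{b}_a + \mathbf{g}_a + \mathbf{e}_a, \\
\operatorname{Var}\begin{pmatrix}\mathbf{y}_c \\ \mathbf{y}_a\end{pmatrix}
&= \begin{pmatrix}
\mathbf{A}_{cc}\,\sigma^2_{g_c} + \mathbf{I}\,\sigma^2_{e_c} & \mathbf{A}_{ca}\,\sigma_{g_c g_a} \\
\mathbf{A}_{ac}\,\sigma_{g_c g_a} & \mathbf{A}_{aa}\,\sigma^2_{g_a} + \mathbf{I}\,\sigma^2_{e_a}
\end{pmatrix}, \\
r_g &= \frac{\sigma_{g_c g_a}}{\sqrt{\sigma^2_{g_c}\,\sigma^2_{g_a}}}.
\end{aligned}
```

The off-diagonal blocks contain only the genetic covariance term: because no individual contributes both a child and an adult phenotype, there is no residual covariance to estimate, which is the third point made in the text.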
As mentioned earlier, to guard against potential bias from uneven LD between tagging markers and causal variants, we took advantage of the recently developed LDSC, which can be applied to genome-wide association (GWA) summary statistics alone. 14 In addition, to test for consistency of effect sizes we used another recently developed polygenic method that can also be applied to GWA summary statistics alone. 29 To produce summary statistics, we ran GWA analyses independently on the TEDS and TwinGene samples (see Supplementary data for details, available at IJE online) and used the results to run the two earlier-mentioned methods. Genome-wide methods are often affected by inflation in test statistics, which could be driven by either population stratification or polygenic signal. LDSC partitions these effects out, giving unbiased results. 14 Finally, the analytical polygenic scoring method described previously 20,21 and available in an R 30 package called ‘gtx’, 31 extends our findings by testing not only whether the same variants affect anthropometric traits at two ages but also whether these variants are as important in childhood as they are in adulthood.
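The idea behind estimating a genetic correlation from summary statistics with LD score regression can be sketched as follows. Under the usual cross-trait LDSC model, the expected product of the two z-scores at a SNP increases linearly with that SNP's LD score, with slope sqrt(N1 N2) ρg / M, where ρg is the genetic covariance and M the number of SNPs. The toy code below is an unweighted ordinary least squares version with hypothetical function names and made-up inputs; the actual LDSC software uses weighted regression and a block jackknife for standard errors, so this is an illustration of the principle rather than the analysis that was run.

```python
# Toy sketch of cross-trait LD score regression (unweighted OLS).
# All names and numbers are illustrative assumptions, not the LDSC software.
import math


def ols_slope_intercept(x, y):
    """Ordinary least squares fit of y = intercept + slope * x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def genetic_covariance(ld_scores, z1, z2, n1, n2, n_snps):
    """Rescale the slope of z1*z2 on LD score to a genetic covariance."""
    products = [a * b for a, b in zip(z1, z2)]
    slope, intercept = ols_slope_intercept(ld_scores, products)
    return slope * n_snps / math.sqrt(n1 * n2), intercept


def genetic_correlation(cov_g, h2_1, h2_2):
    """r_g = cov_g / sqrt(h2_1 * h2_2); the ratio is not bounded by 1."""
    return cov_g / math.sqrt(h2_1 * h2_2)
```

Because the genetic correlation is a ratio of separately estimated quantities, it is not constrained to lie between −1 and 1, so with noisy heritability estimates a point estimate above one can occur.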
Results
Descriptive statistics
Table 1 describes age, gender and sample size of the two samples. Age ranges are 11–12 years in TEDS and 47–94 years in TwinGene. Descriptive statistics for height and BMI are shown in Table 2 . Because they are children, the TEDS sample is shorter and lighter than the TwinGene adult sample, but as mentioned in the Methods section, these traits were standardised to make the comparison possible. Both TEDS and TwinGene samples were representative of their respective populations with respect to their height and BMI. 24,32,33
Sample | Trait | Mean (SD) | Minimum | Maximum | Sample n^a |
---|---|---|---|---|---|
TEDS | Height (cm) | 146.69 (8.36) | 112.00 | 188.00 | 2221 |
TEDS | Weight (kg) | 38.69 (8.68) | 22.20 | 95.00 | 2186 |
TEDS | BMI^b | 17.86 (3.08) | 12.12 | 46.96 | 2186 |
TwinGene | Height (cm) | 170.10 (9.18) | 117.00 | 205.00 | 5938 |
TwinGene | Weight (kg) | 75.61 (13.87) | 37.60 | 171.50 | 5928 |
TwinGene | BMI^b | 26.07 (4.00) | 15.43 | 66.99 | 5928 |
a Unrelated individuals (Genetic Relatedness Matrix (GRM) < 0.025 cut-off) with non-missing phenotype data.
b Body mass index = weight (kg) / height² (m²).
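The relatedness cut-off in footnote a can be illustrated with a small sketch of how a genetic relatedness matrix is commonly computed from allele counts and then thresholded. The formula used here is the standard one from the GCTA literature, and it is an assumption of this sketch that it corresponds to algorithm 1 in the Supplementary data; the function names and toy genotypes are hypothetical.

```python
# Illustrative sketch: names and toy data are hypothetical, not from the study.
from itertools import combinations


def genetic_relatedness_matrix(genotypes):
    """GRM from an individuals x SNPs list of allele counts (0, 1 or 2).

    Entry (j, k) is the average over SNPs of
    (x_j - 2p)(x_k - 2p) / (2p(1 - p)), with p the sample allele frequency.
    Monomorphic SNPs are skipped because their variance is zero.
    """
    n = len(genotypes)
    n_snps = len(genotypes[0])
    grm = [[0.0] * n for _ in range(n)]
    used = 0
    for s in range(n_snps):
        column = [person[s] for person in genotypes]
        p = sum(column) / (2.0 * n)
        var = 2.0 * p * (1.0 - p)
        if var == 0.0:
            continue
        used += 1
        centred = [x - 2.0 * p for x in column]
        for j in range(n):
            for k in range(n):
                grm[j][k] += centred[j] * centred[k] / var
    if used == 0:
        raise ValueError("no polymorphic SNPs")
    return [[value / used for value in row] for row in grm]


def related_pairs(grm, cutoff=0.025):
    """Pairs of individuals whose estimated relatedness exceeds the cut-off."""
    return [(j, k) for j, k in combinations(range(len(grm)), 2)
            if grm[j][k] > cutoff]


# Toy data: individuals 0 and 1 have identical genotypes, individual 2 differs.
toy = [[0, 1, 2], [0, 1, 2], [2, 1, 0]]
toy_grm = genetic_relatedness_matrix(toy)
flagged = related_pairs(toy_grm)
```

In practice one member of each flagged pair would be dropped so that the remaining sample contains only individuals below the cut-off.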
Heritabilities and genetic correlations of height and BMI from childhood to adulthood
In children, the SNP heritability of height was 0.47 (0.15) and in adults 0.69 (0.08) (Figure 1a). The genetic correlation between adult and child height was 0.58 (0.16). The likelihood ratio test (LRT) statistic for SNP heritability of height in children was 8.305, with P-value = 0.002 and n = 2080. For SNP heritability in adults, LRT = 85.051, with P-value = 1.45e-20 and n = 5573. The LRT for the genetic correlation being different from zero was 17.986, with P-value = 1e-05 and n = 7653. When the genetic correlation was tested as different from one, LRT was 14.089, with P-value = 9e-05 and n = 7653.

Figure 1. (a) Bivariate GCTA-GREML model of height between children in the UK and adults in Sweden. (b) Bivariate GCTA-GREML model of BMI between children in the UK and adults in Sweden. V(G): SNP heritability; V(e): residual variance; standard error in parentheses.
For standardised BMI, SNP heritability was 0.37 (0.16) in childhood and 0.26 (0.08) in adulthood (Figure 1b). The genetic correlation between child and adult BMI was 0.75 (0.26). As explained in the Methods section, the correlation represented by the double-ended arrow in Figure 1a and b can be estimated for additive genetic influence but not the residual. The LRT for SNP heritability of BMI in children was 12.028, with P-value = 0.0003 and n = 2049, and in adults LRT = 7.179, with P-value = 0.004 and n = 5462. The LRT for the genetic correlation being different from zero was 11.485, with P-value = 0.0004 and n = 7511. For the genetic correlation being different from one, LRT = 2.250, with P-value = 0.07 and n = 7511.
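The relationship between the LRT statistics and the P-values quoted above can be checked with a short sketch. It assumes a one-sided test in which the statistic follows a 50:50 mixture of a point mass at zero and a chi-square distribution with one degree of freedom, the usual convention for testing a parameter at the boundary of its range; this assumption is not stated in the text but is consistent with the reported numbers. The function name is hypothetical.

```python
# Sketch: one-sided likelihood ratio test P-value, assuming the statistic
# follows a 50:50 mixture of a point mass at zero and a chi-square with 1 df.
import math


def lrt_p_value(lrt):
    """P-value for an LRT statistic under 0.5*chi2(0 df) + 0.5*chi2(1 df)."""
    if lrt <= 0:
        return 1.0
    # The survival function of a chi-square with 1 df is erfc(sqrt(x / 2)).
    return 0.5 * math.erfc(math.sqrt(lrt / 2.0))


# LRT statistics quoted in the text for height and BMI.
reported = {
    "height, heritability in children": 8.305,
    "height, heritability in adults": 85.051,
    "height, correlation vs zero": 17.986,
    "height, correlation vs one": 14.089,
    "BMI, heritability in children": 12.028,
    "BMI, heritability in adults": 7.179,
    "BMI, correlation vs zero": 11.485,
    "BMI, correlation vs one": 2.250,
}

for label, stat in reported.items():
    print(f"{label}: LRT = {stat}, P = {lrt_p_value(stat):.3g}")
```

Only the standard library is used, since the chi-square tail with one degree of freedom has a closed form in terms of the complementary error function.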
LDSC results were consistent with GREML analyses, showing strong genetic correlation between both samples rg LDSC = 1.335 (1.09) and rg LDSC = 0.855 (0.49) in height and BMI respectively ( Supplementary Table 1 , available as Supplementary data at IJE online). The estimates of SNP heritability for height were 0.13 (0.19) and 0.60 (0.11) in TEDS and TwinGene, respectively. The estimates of SNP heritability in BMI were 0.20 (0.15) and 0.23 (0.08) in TEDS and TwinGene, respectively.
The analytical polygenic scoring approach suggested that the most predictive polygenic score obtained from adult height data accounted for 60% (P-value < 3.5e-07) of the genetic variance in child data. In contrast, the most predictive polygenic score from adult BMI accounted for only 10% (P-value < 3.5e-07) of the phenotypic variance in child data. Of note, the most natural implementation of this method would have been to calculate a polygenic score in child data and apply it to adult data. However, the difference in sample size meant that, to maximise the accuracy of the polygenic score, we needed to calculate it in adult data and use it to predict in child data. Although counterintuitive, the direction of the prediction was not crucial to our conjecture and thus did not affect our interpretation.
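For intuition, a polygenic score in its conventional individual-level form is a weighted sum of allele counts, with weights taken from effect sizes estimated in a discovery sample, and its predictive value is summarised by the variance it explains in a target sample. The sketch below shows that conventional form only; the analysis reported here used an analytical method that works from GWA summary statistics (the ‘gtx’ R package), which this code does not reproduce. Function names and inputs are hypothetical.

```python
# Illustrative sketch of a conventional individual-level polygenic score.
# Names and toy numbers are hypothetical; this is not the gtx method.


def polygenic_scores(dosages, betas):
    """Weighted allele count per person: sum over SNPs of beta * dosage."""
    return [sum(b * x for b, x in zip(betas, person)) for person in dosages]


def variance_explained(scores, phenotype):
    """Squared Pearson correlation between score and phenotype."""
    n = len(scores)
    mean_s = sum(scores) / n
    mean_p = sum(phenotype) / n
    cov = sum((s - mean_s) * (p - mean_p) for s, p in zip(scores, phenotype))
    var_s = sum((s - mean_s) ** 2 for s in scores)
    var_p = sum((p - mean_p) ** 2 for p in phenotype)
    return cov * cov / (var_s * var_p)


# Made-up example: two people, three SNPs, discovery-sample effect sizes.
toy_scores = polygenic_scores([[0, 1, 2], [2, 0, 1]], [0.1, -0.2, 0.3])
```

Here the discovery sample would be the adults and the target sample the children, matching the direction of prediction described in the text.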
Discussion
We used cross-sample GCTA-GREML and LDSC to estimate genetic correlations from childhood to adulthood for height and BMI using data from children in the UK (age 11 years) and adults (average age 65) in Sweden. The results from both methods show that genetic correlations were substantial for both height (rg GREML = 0.58 and rg LDSC = 1.335) and BMI (rg GREML = 0.75 and rg LDSC = 0.855). Although genetic stability for height and BMI is a known phenomenon, this is the first time it has been shown across a somewhat uniquely defined 50-year period (from age 11 to 65), and to persist even though our cross-sample design involved children in the UK and adults in Sweden.
A unique feature of our design is its unusual definition of longitudinal. These two samples are not separated by ‘secular time’ in the same way as they would be in the traditional longitudinal design. That is, the two samples were assessed at approximately the same calendar time despite being 50 years apart in age. We have shown, using measured genomic variants, that the influence of common genetic variants on the individual differences in anthropometric traits is highly stable even over a 50-year span, and that this is true for unrelated individuals drawn from different countries, growing up or having grown up in different environments. Because this is a cross-sample GCTA-GREML, no individuals were assessed both as children and adults, which meant that we could not estimate how much of the phenotypic covariance was due to genetic influences (bivariate heritability). However, tentatively using the estimates from a 25-year span twin study 7 suggests that genetic stability in height and BMI could account for as much as 60% and 100% of the phenotypic covariance, respectively.
Our results make sense in terms of what has been found in a longitudinal twin study of genetic stability for height and BMI from age 20 to age 45. 7 Across 25 years, this twin study reported genetic correlations of 0.97 for height and 0.69 for BMI, as compared with our cross-sample GCTA-GREML estimates for a 50-year age span of 0.58 (0.16) for height and 0.75 (0.26) for BMI. Although SNP heritability could underestimate the ‘true’ heritability (due to linkage disequilibrium between the markers and the causal variants), genetic correlations are unbiased. 12 It is therefore not surprising that the results obtained from LDSC analysis, where LD is explicitly modelled, were in agreement with strong genetic stability in both traits.
Finally, the polygenic scoring method suggested that the variants affecting height in adult data explain the majority of genetic variation in child data. In contrast, the adult BMI predictor only explained a small fraction of genetic variation in child BMI. This is consistent with our genetic stability findings and extends our understanding of their nature. Genetic correlations suggest that virtually the same variants influence individual differences in anthropometric traits throughout life. However, polygenic prediction suggests that for height, the effect of the same variants is largely as important in childhood as it is in adulthood. The same is not true for BMI, where the effect of the same variants changes greatly from childhood to adulthood. This is supportive of previously reported longitudinal changes in BMI within a Swedish population where polygenic score acquired at age 25 predicted ∼ 10% in older adults. 11 Of note, when we tested the same hypothesis in reversed order, i.e. using child polygenic score to predict adult height/BMI, we found much lower estimates for both height and BMI (10% and 2%, respectively). The substantial drop was almost certainly driven by the less accurate predictor, as the TEDS sample is only half the size of the TwinGene sample.
In conclusion, we used GCTA-GREML and LD score regression to show strong genetic stability between children in the UK and older adults in Sweden. We have shown that genetic stability spans not only a 50-year age gap but also geographical distance. Furthermore, just as was previously reported in a study of long-term genetic stability of cognitive abilities, 13 genetic stability of anthropometric traits across two European populations suggests that more than half of genetic influence in late adulthood is the same as that influencing individual differences in childhood. Finally, we have also shown that, for height, the influence of these variants does not change much across the human life span, whereas it changes substantially for BMI.
Funding
MT is supported by a British Academy Post-doctoral Fellowship [pf140018]; TEDS is supported by a programme grant to RP from the UK Medical Research Council [G0901245, and previously G0500079], with additional support from the US National Institutes of Health [HD044454; HD046167]. Genome-wide genotyping was made possible by a grant from the Wellcome Trust to the Wellcome Trust Case Control Consortium 2 project [085475/B/08/Z; 085475/Z/08/Z]. RP is supported by a Medical Research Council Research Professorship award [G19/2] and a European Research Council Advanced Investigator award [295366], the Swedish Foundation for International Cooperation in Research and Higher Education.
Acknowledgment
We would like to thank Xu Chen for performing an independent run of GWA on TwinGene data.
Conflict of interest: The authors have declared no conflict of interest.
References