Biological age in UK Biobank: biomarker composition and prediction of mortality, coronary heart disease and hospital admissions

Background: Age is the strongest risk factor for most chronic diseases, and yet individuals may age at different rates biologically. A biological age formed from biomarkers may be a stronger risk factor than chronological age, and understanding what factors contribute to it could provide insight into new opportunities for disease prevention.

Methods and findings: Among 480,019 UK Biobank participants aged 40-70 recruited in 2006-2010 and followed up for 6-12 years via linked death registry and secondary care records, a subpopulation of 141,254 (29.4%) non-smoking adults in good health and with no medication use or disease history at baseline was identified. Independent components of 72 biomarkers measured at baseline were characterised by principal component analysis. The Klemera Doubal method (KDM), which derives a weighted sum of biomarker principal components based on the strengths of their linear associations with chronological age, was used to derive sex-specific biological ages in this healthy subpopulation. The proportions of the overall biological and chronological age effects on mortality, coronary heart disease and age-related non-fatal hospital admissions (based on a hospital frailty index) that were explained by biological age were assessed using log-likelihoods of proportional hazards models. [...]

Conclusions: This study identified that markers of impaired function in a range of organs account for a substantial proportion of the apparent effect of age on disease and hospital admissions. It supports a broader, multi-system approach to research and prevention of diseases of ageing.

[...] biomarker with chronological age were visually assessed for linearity (S1 Appendix 3A) before using linear methods to estimate biological ages. To represent the biomarkers as linearly uncorrelated principal components, PCA with varimax rotation [11] was carried out on the 72 biomarkers, which gave 51 principal components with eigenvalues >0.33. These principal components were characterised based on constituent biomarkers with the largest factor loadings (S1 Appendix 3B-D).

Biological ages were estimated using KDM [10], separately within each sex and prior health subpopulation (S1 Appendix 3C). A similar method, stepwise linear regression, was considered, and the results for both methods were compared in the appendices (S1 Appendix 3C and 4). Biomarker principal components were ranked by their importance, measured by the proportion of variance in the biological ages that each component explained (S1 Appendix 3E).

Three health outcomes were constructed from HES and death records: (1) death from chronic disease (excluding infectious diseases, pregnancy, congenital malformations and external causes) [16], (2) fatal and non-fatal coronary heart disease (CHD) and (3) age-related non-fatal hospital admissions (S1 Appendix 2). These hospital admissions are the subset of those types of admissions in a published hospital frailty risk score [17] that are age-related in the UK Biobank (S1 Table 6). The predictive powers of chronological age and biological ages for [...] the area under the receiver operating curve). Prediction of CHD and hospital admissions by biological age was compared with a benchmark of prediction by a mortality score similar to those proposed by previous studies [2, 14] and derived from stepwise Cox regression, using unadjusted Cox models (S1 Appendix 3F).
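The weighting idea behind KDM can be sketched in a few lines. The block below is a minimal illustration, assuming the simpler Klemera-Doubal estimate that omits the chronological-age term; whether the authors used this or the extended variant is specified in S1 Appendix 3C, not here. The function name `kdm_biological_age` and the synthetic data are hypothetical and are not taken from the study.

```python
import numpy as np


def kdm_biological_age(components, chron_age):
    """Klemera-Doubal estimate from biomarker components (no chronological-age term)."""
    X = np.asarray(components, dtype=float)   # shape (n_people, n_components)
    age = np.asarray(chron_age, dtype=float)  # shape (n_people,)
    n = X.shape[0]

    # Regress every component on chronological age: x_j = q_j + k_j * age + error.
    design = np.column_stack([np.ones(n), age])
    coef, *_ = np.linalg.lstsq(design, X, rcond=None)
    q, k = coef[0], coef[1]

    # Residual standard deviation s_j of each regression (n - 2 degrees of freedom).
    resid = X - design @ coef
    s = np.sqrt((resid ** 2).sum(axis=0) / (n - 2))

    # Components with steeper slopes relative to their noise receive more weight.
    numerator = (X - q) @ (k / s ** 2)
    denominator = np.sum((k / s) ** 2)
    return numerator / denominator


# Synthetic illustration only: three made-up components, two of them age-related.
rng = np.random.default_rng(0)
age = rng.uniform(40, 70, size=500)
components = np.column_stack([
    0.2 * age + rng.normal(size=500),
    -0.1 * age + rng.normal(size=500),
    rng.normal(size=500),
])
bio_age = kdm_biological_age(components, age)
```

By construction, the mean of this estimate equals the mean chronological age of the sample it was fitted on, which is one way to see why a KDM age is calibrated to chronological age on average.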

medRxiv preprint doi: http://dx.doi.org/10.1101/2019.12.12.19014720; first posted online Dec. 15, 2019. The copyright holder for this preprint (which was not peer-reviewed) is the author/funder, who has granted medRxiv a license to display the preprint in perpetuity. It is made available under a CC-BY 4.0 International license.

To investigate the relationship of the biological ages to chronological age, the proportion of variation in chronological age described by each biological age was estimated. The proportion of the overall biological and chronological age effect on mortality, CHD and hospital admission risk that was explained by each biological age was also estimated by comparing the log-likelihoods from these Cox models (S1 Appendix 3G). Calibration of biological ages to chronological age and the risk calibration of biological ages with each health outcome were assessed (S1 Appendix 3H).
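As an illustration of how log-likelihoods can apportion an age effect, one possible formulation is given here. It is an assumption for orientation only; the definition actually used is given in S1 Appendix 3G and may differ. The symbols are introduced here and are not taken from the source: $\ell_{0}$ is the log partial likelihood of a Cox model with no age term, $\ell_{BA}$ that of a model with biological age only, and $\ell_{BA+CA}$ that of a model with both biological and chronological age.

```latex
\[
P_{BA} \;=\; \frac{\ell_{BA} - \ell_{0}}{\ell_{BA+CA} - \ell_{0}}
\]
```

Under this reading, a value of $P_{BA}$ close to 1 would mean that adding chronological age improves model fit little beyond biological age alone.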

The statistical analysis was repeated on biomarkers corresponding to the 10 most important

Study characteristics

Of the 480,019 participants, 141,254 (29.4%) were in the healthy subpopulation (Table 1). During a median follow-up period of 8.7 years for mortality and 8.0 years for CHD and hospital admissions (S1 Appendix 1), 1.7% of healthy and 3.9% of all participants died from chronic diseases; 1.9% of healthy and 4.0% of all participants without CHD at baseline had a first CHD event; and 16.0% of healthy and 23.1% of all participants who had not been admitted to hospital for age-related reasons prior to baseline were admitted with diagnoses of these conditions during follow-up (S1 Table 7). Sociodemographic patterns and the proportion of participants healthy at baseline were similar between sexes.

Biomarker characteristics

The relationships of most candidate biomarkers to chronological age were broadly linear or flat (S1 Figure 3 and Figure 1). Several biomarkers displaying non-linear trends and differences by sex or by prior health are highlighted in Figure 1, which shows standardised [...] calcium, alkaline phosphatase and phosphate (S1 Figure 3).

Many biomarker principal components had a single biomarker strongly loaded onto them and were easily characterised. Multiple biomarkers were strongly loaded onto the adiposity, lung function, blood pressure and blood lipid principal components (S1 Figure 5). The coefficients of the biomarker principal components in the biological ages for the healthy subpopulation are listed in S1 Table 9. The estimated biological ages appeared stable in 10-fold cross-validation, as their prediction errors were small (S1 Appendix 4 and S1 Figure 6).
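The component extraction described in the Methods can be illustrated with a short sketch. It assumes plain PCA on the correlation matrix and deliberately omits the varimax rotation the study applied, so it is a simplified stand-in rather than the authors' procedure. The function name `retained_components`, the `min_eigenvalue` parameter and the toy data are hypothetical; only the 0.33 eigenvalue cut-off comes from the text.

```python
import numpy as np


def retained_components(biomarkers, min_eigenvalue=0.33):
    """Unrotated PCA on standardised biomarkers, keeping eigenvalues above a cut-off."""
    X = np.asarray(biomarkers, dtype=float)
    # Standardise so that every biomarker contributes on the same scale.
    Z = (X - X.mean(axis=0)) / X.std(axis=0, ddof=1)
    corr = np.corrcoef(Z, rowvar=False)

    # eigh returns ascending eigenvalues for symmetric matrices; sort descending.
    eigvals, eigvecs = np.linalg.eigh(corr)
    order = np.argsort(eigvals)[::-1]
    eigvals, eigvecs = eigvals[order], eigvecs[:, order]

    keep = eigvals > min_eigenvalue
    loadings = eigvecs[:, keep] * np.sqrt(eigvals[keep])  # biomarker-component correlations
    scores = Z @ eigvecs[:, keep]                          # linearly uncorrelated scores
    return eigvals[keep], loadings, scores


# Toy data: four made-up biomarkers, the first two nearly duplicates of each other.
rng = np.random.default_rng(1)
base = rng.normal(size=(2000, 3))
toy = np.column_stack([
    base[:, 0],
    base[:, 0] + 0.1 * rng.normal(size=2000),
    base[:, 1],
    base[:, 2],
])
eigvals, loadings, scores = retained_components(toy)
```

In the toy data the two near-duplicate biomarkers load onto a single component, which mirrors how several related measurements can be summarised by one principal component.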

Predictive power and calibration of biological ages

The KDM biological ages (based on 51 principal components) were well calibrated, as they matched healthy participants' chronological ages on average (S1 Figure 7). The KDM ages were more predictive of CHD and hospital admissions than the benchmark mortality score (approximate increases in C-indices for CHD/hospital admissions: 0.135/0.111 in men, 0.109/0.068 in women), and the mortality score performed only slightly better than chance (C-indices ~0.5; S1 Table 12). KDM ages alone were not statistically significantly better than [...] Table 13). However, when estimated in unhealthier subpopulations, they supplemented [...] Table 13).
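For readers less familiar with the C-index reported above, the following is a minimal sketch of Harrell's concordance index for right-censored follow-up data. It uses simplified tie handling (pairs with equal follow-up times are skipped) and is meant to explain the measure, not to reproduce the study's software. The function name `harrell_c_index` and the four-person example are hypothetical.

```python
import numpy as np


def harrell_c_index(time, event, risk):
    """Share of comparable pairs in which the earlier event has the higher risk score."""
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    risk = np.asarray(risk, dtype=float)

    concordant = 0.0
    comparable = 0
    for i in range(len(time)):
        if not event[i]:
            continue  # a censored person cannot be the earlier event of a pair
        for j in range(len(time)):
            if time[j] > time[i]:  # j was still event-free when i had the event
                comparable += 1
                if risk[i] > risk[j]:
                    concordant += 1.0
                elif risk[i] == risk[j]:
                    concordant += 0.5  # tied scores count as half
    if comparable == 0:
        raise ValueError("no comparable pairs")
    return concordant / comparable


# Made-up example: follow-up in years, event indicator, and a risk score.
c_toy = harrell_c_index(
    time=[2, 4, 6, 8],
    event=[True, True, False, True],
    risk=[0.9, 0.7, 0.8, 0.1],
)
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect ranking, which is why a score with C-indices near 0.5 is described as only slightly better than chance.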

The stepwise regression age was poorly calibrated, as it was distributed across a narrower age range than chronological age on average (S1 Figure 7), and was not considered in further analyses. The prediction results and a risk calibration assessment are described in detail in S1 Appendix 4.

Biomarker importance in biological ages

In the KDM ages, reduced lung function featured most strongly in the healthy subpopulation (Figure 2), describing 12.4% (men) and 10.3% (women) of the variation in biological age (S1 Table 10). Higher cystatin C, slower reaction time, lower insulin-like growth factor-1 (IGF-1), lower hand grip strength, higher and higher blood pressure also featured strongly for both sexes, while lower albumin, higher sex hormone-binding globulin and lower muscle mass biomarkers featured strongly for men, and higher levels of alkaline phosphatase, LDL-C, apolipoprotein B and HbA1c for women. Multiple body systems were represented by these biomarkers: respiratory, renal, cardiovascular, musculoskeletal, endocrine, metabolic and immune, liver and nervous systems (S1 Table 5). When the analysis was restricted to the top 10 biomarker components, corresponding to 13 biomarkers for men and 12 for women, forced [...]

Relationship between biological and chronological age

The KDM ages described 44.0% and 51.3% of the variation in chronological age for healthy men and women respectively. More importantly, with respect to the prediction of mortality, CHD and hospital admissions, and averaged across sexes, the KDM ages described 66%, 80% and 63% of the overall biological and chronological age effect respectively (Figure 3A and S1 Table 11). The proportion described by the KDM age is attributed to each constituent [...]

[...] Eastern European biological ages, which instead found that the top-ranking blood-based biomarkers varied by population and sex [8]. Studies of ageing biomarkers also found that lung and renal biomarkers were top-ranking determinants of functional decline [19] and variation in age-related traits [20]. The present study provides additional detail on the relative importance for ageing of biomarkers within body system groups, such as cystatin C over other renal biomarkers (creatinine and creatinine-based eGFR) [21], as previous studies each assessed only one of these biomarkers [8, 9, 19, 20].

Several key biomarkers in this study (blood pressure, blood lipids, height and lung function) have each been shown to be associated observationally, and in some cases causally, in randomised trials and Mendelian randomisation studies, with a range of age-related diseases (Table 2). Associations for other key determinants such as cystatin C and hand grip strength have been less extensively researched, and available studies have focused on mortality and cardiovascular outcomes [21-25]. Blood pressure did not feature as strongly as the aforementioned biomarkers in our study, despite being well established as a modifiable and causal risk factor for cardiovascular disease [26]. Other cardiovascular biomarkers did not feature strongly for men, whereas for women, LDL-C and apolipoprotein B, causally linked to atherosclerotic cardiovascular disease [27], were also important (Figure 2), and may have contributed to better prediction of CHD in women than in men (S1 Tables 12 and 13). Cohort effects in this population are difficult to disentangle, and may influence trends in body size.

Hence, height may be acting as a proxy for cohort effects.

[...] for women) potentially due to reverse causality [56], but has been causally linked to 30 diseases [57]. Therefore, BMI may be a modifiable risk factor that affects biological age.

[...] were better than the benchmark mortality score in predicting CHD and hospital admissions (S1 Table 12). The predictive value of a biological age varied by health status and was greater in unhealthy individuals (S1 Tables 12 and 13), likely reflecting the contribution of diagnostic indicators of ageing. Therefore, it is important to take into account the health and age profile of the population when comparing different studies.

The KDM permitted investigation of the relationship between biological and chronological ages with respect to predicting health outcomes, and automatically calibrated biological ages to chronological age. By contrast, the stepwise regression method resulted in a poorly calibrated biological age in the UK Biobank despite its frequent use in biological ageing studies [5, 9, 12] (where its risk calibration was not investigated). Comparison of an individual's [...]

Strengths and limitations of this study

The estimation methods used assumed that biomarkers with the strongest linear relation to chronological age contribute most to biological age, but these biomarkers are not necessarily strongly linked to health outcomes. Analysis of key determinants of biological age was limited by the range of biomarkers available. Not all biomarker trends in the UK Biobank were linear (S1 Figure 3), but a previous study has shown that incorporating non-linearity was computationally complex and only slightly improved the accuracy of estimated biological age components [63]. However, the epidemiological reliability of the present analyses was increased through stratification by prior health, use of biomarker principal components, cross-validation and adherence to clinical risk prediction reporting guidelines (S1 Table 14) [18].

A biological age consisting of clinical biomarkers reflecting functionality of a range of organs accounted for a substantial proportion of the effect of age on disease and hospital admissions in the UK Biobank. An overall biological age has potential to be used and evaluated as a broader-based approach to risk identification and prevention than individual biomarkers. Of the most important biomarkers contributing to the derived biological age, cardiometabolic biomarkers have well-studied causal associations with mortality and cardiovascular disease, but further research is needed to identify modifiable causal factors underlying all components, for a range of age-related diseases.

[...] Evidence Co-operative. The funders had no role in study design, data collection and analysis, decision to publish, or preparation of the manuscript.

The underlying data are open access through application to the UK Biobank, and materials and methods will be made freely available through the UK Biobank as part of this project.