Impact of follow-up time and analytical approaches to account for reverse causality on the association between physical activity and health outcomes in UK Biobank

Abstract

Background: The advent of very large cohort studies (n > 500 000) has given rise to prospective analyses of health outcomes being undertaken after short (<4 years) follow-up periods. However, these studies are potentially at risk of reverse causality bias. We investigated differences in the associations between self-reported physical activity and all-cause and cardiovascular disease (CVD) mortality, and incident CVD, using different follow-up time cut-offs and methods to account for reverse causality bias.

Methods: Data were from n = 452 933 UK Biobank participants, aged 38–73 years at baseline. Median available follow-up time was 7 years (for all-cause and CVD mortality) and 6.1 years (for incident CVD). We additionally analysed associations at 1-, 2- and 4-year cut-offs after baseline. We fit up to four models: (1) adjusting for prevalent CVD and cancer, (2) excluding prevalent disease, (3) and (4) Model 2 excluding incident cases in the first 12 and 24 months, respectively.

Results: The strength of associations decreased as follow-up time cut-off increased. For all-cause mortality, Model 1 hazard ratios were 0.73 (0.69–0.78) after 1 year and 0.86 (0.84–0.87) after 7 years. Associations were weaker with increasing control for possible reverse causality. After 7 years of follow-up, the hazard ratios were 0.86 (0.84–0.87) and 0.88 (0.86–0.90) for Models 1 and 4, respectively. Associations with CVD outcomes followed similar trends.

Conclusions: As analyses with longer follow-up times and increased control for reverse causality showed weaker associations, there are implications for the decision about when to analyse a cohort study with ongoing data collection, the interpretation of study results and their contribution to meta-analyses.


Introduction
Many prospective cohort studies have demonstrated prospective associations between lifestyle behaviours and the risk of mortality or morbidity. [1][2][3] Typically, follow-up periods have been at least 5 years, and often >10 years, because it was typically necessary to wait for a sufficient number of events to occur before analyses could be undertaken. 4 With the advent of very large cohort studies such as the Million Women Study (n > 1 000 000) and the UK and China Kadoorie Biobanks (both n > 500 000), the number of events rarely limits the possibility of early prospective analyses. [5][6][7][8] However, there is little research to indicate if the length of follow-up time after which analyses are undertaken, combined with the analytical choices made to address issues of reverse causality, impacts on the estimated association between behavioural exposures and health outcomes.
In this study, we focus on physical activity behaviour. Previous work in this field has focussed on the potential biases introduced by using very long follow-up periods (>30 years). Longer durations have generally but not consistently attenuated the protective inverse association between physical activity and all-cause mortality, [9][10][11][12] and coronary heart disease incidence and mortality. [12][13][14][15] Andersen 9 attributed this to violation of the constant exposure assumption, i.e. real changes in physical activity behaviour during that time.
For substantially shorter follow-up periods, a major issue of concern is reverse causality, i.e. the potential that underlying (diagnosed or undiagnosed) illness impacts negatively on physical activity, so the observed association between physical activity and health outcomes is more driven by the causal link between the underlying disease state and subsequent outcome events, rather than the physical activity exposure per se. 16 This will typically lead to an overestimation of the true association between physical activity and health outcomes. Common methods to account for this bias include removing individuals who have prevalent disease, or those who experience the event soon after baseline, who are presumed to represent those with undiagnosed illness at baseline. However, the latter is not always feasible when the study follow-up period is short or the number of events is low. 7,8 The aim of this study was to investigate whether the strength of the associations between physical activity and all-cause mortality and cardiovascular disease (CVD) outcomes varies over different follow-up times in the range 1-7 years. A secondary aim was to examine how different approaches to account for reverse causality impact on the strength of the association.

Data source
We used data from UK Biobank, a study involving over 500 000 individuals aged 37-73 years at baseline. Participants undertook an assessment at one of 22 centres across Great Britain between 2006 and 2010. This included a touchscreen self-administered questionnaire and anthropometric measurements. 17,18 Consent to link these data to the national death registries and to hospital episode inpatient data was also obtained. Questionnaire and mortality data were downloaded on 31 August 2018, containing information from 502 543 participants after withdrawals. The hospital episode data were downloaded on 9 April 2018.

Physical activity exposure
The questionnaire covered the frequency and duration of individuals' participation in four types of discretionary physical activity (home and leisure domains) which are considered to be in the moderate-to-vigorous physical activity intensity range: 19 (i) walking for pleasure, not as a means of transport, (ii) other exercises, e.g. swimming, cycling, keep-fit, bowling, (iii) strenuous sports, and (iv) heavy do-it-yourself (DIY) home maintenance, e.g. weeding, lawn mowing, carpentry, digging. The response options for frequency were: none, once in the last 4 weeks, 2-3 times in last 4 weeks, once per week, 2-3 times per week, 4-5 times per week, or every day. The response options for duration were: <15 min, 15-30 min, 30-60 min, 1-1½ h, 1½-2 h, 2-3 h or >3 h. Responses were scaled to a total weekly duration using the median values (maximum duration of 3½ h), and truncated at 2000 min (n = 544 affected). We excluded those missing both frequency and duration for an activity type (n = 19 643) and those who completed the different pilot questionnaire (n = 3798). We imputed the frequency or duration from the median of all others reporting participation in that activity type (n = 5994) if only one was missing.
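The scaling of questionnaire responses to weekly minutes can be sketched as follows. This is a minimal illustration only: the text states that category medians were used, with a maximum duration of 3½ h and truncation at 2000 min, but it does not list the values assigned to each category, so the midpoints in `FREQ_PER_WEEK` and `DURATION_MIN` (and all names in the block) are assumptions.

```python
# Assumed numeric values for each response category (category midpoints).
# The source only says "median values" were used; these are illustrative.
FREQ_PER_WEEK = {
    "none": 0.0,
    "once in the last 4 weeks": 0.25,
    "2-3 times in last 4 weeks": 0.625,
    "once per week": 1.0,
    "2-3 times per week": 2.5,
    "4-5 times per week": 4.5,
    "every day": 7.0,
}
DURATION_MIN = {
    "<15 min": 7.5,
    "15-30 min": 22.5,
    "30-60 min": 45.0,
    "1-1.5 h": 75.0,
    "1.5-2 h": 105.0,
    "2-3 h": 150.0,
    ">3 h": 210.0,  # maximum duration of 3.5 h, as stated in the text
}
WEEKLY_CAP_MIN = 2000.0  # truncation point stated in the text


def weekly_minutes(responses):
    """Total weekly minutes over activity types.

    `responses` is a list of (frequency_label, duration_label) pairs,
    one per activity type; the duration label is ignored when the
    frequency is "none".
    """
    total = 0.0
    for freq_label, duration_label in responses:
        sessions = FREQ_PER_WEEK[freq_label]
        if sessions == 0.0:
            continue  # no participation in this activity type
        total += sessions * DURATION_MIN[duration_label]
    return min(total, WEEKLY_CAP_MIN)
```

For instance, walking for pleasure 2-3 times per week for 30-60 min with no other activity would give 2.5 × 45 = 112.5 min/week under these assumed midpoints.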

Outcomes
The three outcomes of interest were all-cause mortality, CVD mortality and CVD incidence (fatal and non-fatal). CVD was defined as a primary or secondary diagnosis with International Classification of Disease 10th revision codes I20-25 for ischaemic heart disease and I60-69 for cerebrovascular disease. The censor dates differed for the mortality and hospital episode data for different home nations. Participants attending an assessment centre in England and Eleven individuals were excluded due to inconsistencies in their mortality data (death date prior to interview, diagnosis but no death date, death date but no diagnosis).

Covariates
Potential confounders were chosen a priori, based on previous literature. Demographic characteristics included age, sex, ethnicity (white/non-white), highest educational level achieved (degree or above/any other qualification/no qualification) and Townsend indicator of deprivation (a continuous index derived from the respondent's post code, with higher scores indicating higher deprivation). Lifestyle behaviours included smoking status (never/previous/current); alcohol consumption (never/previous drinker/current drinker); addition of salt to food (never or rarely/sometimes/usually or always); consumption of oily fish (never/less than once per week/at least once per week); fruit and vegetable intake (a score of 0-4 was computed from questions asking the frequency of raw and cooked vegetable, fresh and dried fruit intake; respondents were given 1 point if they reported more than 2 portions of each type); consumption of processed or red meat (average days per week derived from questions on the frequency of processed, beef, lamb and pork intake); leisure screen time (combined total duration of reported TV and computer time during leisure time, categorized to <4 h per day/≥4 h per day in line with current estimates of where the risk of all-cause mortality and CVD mortality increases non-linearly 3 ); and usual sleep per 24-h period (<7 h/7-8 h/>8 h). We also derived a five-category variable that combined the responses from questions on employment status (unemployed/in paid or self-employment), walk or cycle to work (yes/no) and heavy manual/physical job (yes/no) because the latter two were only applicable to those in work.
Variables indicating self-reported health status included: current prescription of blood pressure medication (yes/no); current prescription of cholesterol medication (yes/no); diabetes (insulin prescription or self-reported diagnosis/neither); paternal or maternal history of heart attack, angina or stroke (yes to either parent/neither); and paternal or maternal history of cancer (yes to either parent/neither). Body mass index (BMI) was derived from height and weight measured at baseline and categorized into under/normal weight (<25 kg/m²), overweight (≥25 to <30 kg/m²) and obese (≥30 kg/m²). Individuals with prevalent CVD were identified from their self-reported previous diagnosis of a heart attack, angina or stroke, or a hospital episode with a previous relevant diagnosis. Individuals with prevalent cancer were also identified via self-report or hospital episode data (International Classification of Disease 10th revision codes C00-C99). Missing data in these covariates led to the exclusion of n = 26 104 individuals.

Statistical analyses
Cox regression with age as the underlying timescale was used to estimate the association between physical activity at baseline and all-cause and CVD mortality, and CVD incidence. We used cubic spline regression to examine the nature of the dose-response relationships and identify a transformation of the physical activity variable which would enable us to include this in the model as a continuous variable (i.e. assuming a log-linear association with the hazard), and hence simplify the presentation of the results from different models (see Supplementary Figure 1, available as Supplementary data at IJE online). Based on the findings, the physical activity variable (in min/week) was transformed by adding 1 and taking the natural log. We subsequently standardized this variable by subtracting the sample mean and dividing by its standard deviation.
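The transformation described above (add 1, take the natural log, then standardize) can be sketched as below. The function names are hypothetical, and the use of the sample (n − 1) standard deviation is an assumption, since the text does not say which estimator was used; at the sample sizes in this study the choice makes no practical difference.

```python
import math
import statistics


def log_minutes(minutes_per_week):
    """Natural log of (weekly minutes + 1), so 0 min/week maps to 0."""
    return math.log(minutes_per_week + 1.0)


def standardized_log_minutes(minutes):
    """Return z-scores plus the mean and SD used to compute them.

    The mean and SD are returned so that later measurements can be
    placed on the same (baseline) scale.
    """
    logged = [log_minutes(m) for m in minutes]
    mean = statistics.fmean(logged)
    sd = statistics.stdev(logged)  # assumed: sample (n - 1) SD
    z_scores = [(value - mean) / sd for value in logged]
    return z_scores, mean, sd
```

A hazard ratio reported per one standard deviation then refers to a one-unit difference in these z-scores, i.e. a multiplicative difference in (minutes + 1) rather than a fixed number of minutes.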
The median available follow-up time was 6.1 years (for incident CVD) and 7 years (for all-cause and CVD mortality). We also used cut-offs at 1, 2 and 4 years of follow-up after baseline. Where possible within each outcome/follow-up time combination, four models were fitted using different methods to account for reverse causality bias: Model 1 = adjustment for prevalent CVD and cancer; Model 2 = exclusion of individuals with prevalent CVD and cancer; Model 3 = Model 2 plus excluding incident cases that occurred within the first year; Model 4 = Model 2 plus excluding incident cases that occurred within the first 2 years. All models were adjusted for potential confounders.
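One way to picture how the four models and the follow-up cut-offs combine is the sketch below, which decides for a single participant whether they enter a given analysis and, if so, with what follow-up time and event status. It is an illustration under stated assumptions, not the authors' code: the `Participant` fields and function name are hypothetical, time is measured from baseline here (the fitted models use age as the timescale), cut-offs are applied by censoring at the cut-off, and events occurring exactly at the boundary of the exclusion window are treated as falling inside it.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Participant:
    prevalent_disease: bool  # prevalent CVD or cancer at baseline
    years_to_exit: float     # years from baseline to event or censoring
    event: bool              # True if the outcome occurred at years_to_exit


# Early-event exclusion window in years for each model (None = no window).
EXCLUSION_WINDOW = {1: None, 2: None, 3: 1.0, 4: 2.0}


def prepare(p: Participant, model: int,
            cutoff_years: float) -> Optional[Tuple[float, bool]]:
    """Return (follow-up years, event flag), or None if excluded."""
    # Models 2-4 exclude prevalent disease; Model 1 keeps these
    # participants and adjusts for prevalent disease as a covariate.
    if model >= 2 and p.prevalent_disease:
        return None
    window = EXCLUSION_WINDOW[model]
    if window is not None:
        if cutoff_years <= window:
            # No events could remain, so this combination is not fitted.
            raise ValueError("cut-off lies inside the exclusion window")
        if p.event and p.years_to_exit <= window:
            return None  # early incident case, presumed undiagnosed illness
    if p.years_to_exit > cutoff_years:
        return (cutoff_years, False)  # censored at the follow-up cut-off
    return (p.years_to_exit, p.event)
```

The `ValueError` branch reflects why the models could only be fitted "where possible": for example, Model 3 cannot be combined with a 1-year cut-off, because every event inside the cut-off would also be inside the exclusion window.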
Proportional hazard assumptions for the exposure and all covariates were assessed using log-log plots for each outcome based on Model 4 with the maximum follow-up time. Assessment centre, ethnicity, alcohol consumption and the combined variable for employment status/active commuting/manual work were accounted for by stratification of the baseline hazard function, rather than as covariates in the linear predictor, because they did not always meet the proportional hazard assumptions.
We quantified the potential degree of regression dilution bias using data from two sub-samples that undertook repeat exposure assessment at one of two follow-up visits; these were conducted a median of 4.4 years (n = 18 213) and 7.6 years (n = 21 205) after baseline. We also created a further sub-sample of those who undertook the first repeat visit <3 years after baseline (n = 2122; minimum follow-up 2 years). We standardized the natural log of (minutes of reported physical activity + 1) at these visits to the baseline scale (subtracting the baseline mean and dividing by the baseline standard deviation). The coefficient from a linear regression of the follow-up visit variable on the baseline variable (also transformed and standardized as above) was estimated to indicate the degree of stability in the measured exposure variable over time.
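The stability coefficient described above is the slope from a simple linear regression of the follow-up measurement on the baseline measurement, with both placed on the baseline scale. A minimal sketch is given below; the function name and arguments are hypothetical, and the baseline mean and standard deviation are passed in as inputs because they are assumed to come from the baseline log-transformed variable.

```python
import math
import statistics


def regression_dilution_coefficient(baseline_min, followup_min,
                                    baseline_mean, baseline_sd):
    """Slope of follow-up on baseline, both on the baseline scale.

    `baseline_min` and `followup_min` are paired weekly minutes for the
    same individuals; `baseline_mean` and `baseline_sd` describe the
    baseline log(minutes + 1) variable.
    """
    def to_baseline_scale(minutes):
        return (math.log(minutes + 1.0) - baseline_mean) / baseline_sd

    x = [to_baseline_scale(m) for m in baseline_min]
    y = [to_baseline_scale(m) for m in followup_min]
    mean_x = statistics.fmean(x)
    mean_y = statistics.fmean(y)
    # Ordinary least squares slope: covariance(x, y) / variance(x).
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    return sxy / sxx
```

A coefficient of 1 would indicate that the follow-up measurement tracks the baseline measurement perfectly on average, whereas values closer to 0 indicate less stability in the measured exposure.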
We also performed the models without adjustment for BMI, diabetes diagnosis or insulin prescription, blood pressure medication or cholesterol medication as sensitivity analyses, as these covariates could plausibly be on the causal pathway.

Results
Sample sizes ranged between 384 615 and 452 993 across the different analyses. Table 1 summarizes the baseline characteristics of the cases and non-cases in the analysis samples used to fit Model 1 for each outcome, with the maximum available follow-up time. Cases for all outcomes were less active, older, of higher education level and more overweight than non-cases. They were also more likely to be unemployed, a current smoker, a previous drinker, report higher leisure screen time, take medication and have a history of disease. Figure 1 and Table 2 show the hazard ratios (HRs) for a one standard deviation difference in the transformed physical activity variable, for each combination of outcome, model and follow-up time cut-off. With a few exceptions, for any given model the strength of the association decreased as follow-up time increased (Table 2). The greatest difference was seen for all-cause mortality. For example, for Model 1, the hazard ratio was 0.73 (0.69-0.78) after 1 year of follow-up compared with 0.86 (0.84-0.87) after 7 years, i.e. a 2-fold difference in magnitude (log hazard ratios −0.32 vs −0.15). The equivalent estimates for CVD mortality and incidence were approximately 30-70% higher. There were some exceptions to this in the models that excluded those with early events (Models 3 and 4) after the longer follow-up times (4 and 7 years), but the percentage differences were of smaller magnitudes (<25%).
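The 2-fold figure for all-cause mortality under Model 1 comes from comparing the two hazard ratios on the log scale rather than on the ratio scale. Using the rounded point estimates reported above, the arithmetic is approximately:

```latex
\[
\ln(0.73) \approx -0.315, \qquad
\ln(0.86) \approx -0.151, \qquad
\frac{\ln(0.73)}{\ln(0.86)} \approx \frac{-0.315}{-0.151} \approx 2.1
\]
```

Because the inputs are rounded hazard ratios, the result is only indicative of the magnitude of the difference.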
Supplementary Table 1 and Figure 2, available as Supplementary data at IJE online, display these same estimates, re-arranged to allow direct comparison between models for the same follow-up time cut-off. With the maximum available follow-up time, the estimates from Model 4 (excluding those with prevalent disease and those experiencing the outcome within the first 2 years of follow-up) were attenuated compared with those from Model 1 (adjustment for prevalent disease only). The relative differences between models were greatest in analyses using 4 years of follow-up. The relative differences between models were greatest for incident CVD, although the absolute differences were similar for the other outcomes.
To assist with interpretation, Figure 2 shows the HRs for the different follow-up times by model across the range of 0-600 min/week of discretionary MVPA. Supplementary Figure 3, available as Supplementary data at IJE online, displays these estimates re-arranged to facilitate comparisons between models across the different follow-up times.
The coefficients from regression of standardized physical activity at follow-up on baseline were 0.2 (SE 0.020) for those who undertook the first re-visit <3 years after baseline, 0.49 (SE 0.007) for the whole first re-visit sample (median 4.4 years after baseline) and 0.39 (SE 0.006) for the second re-visit sample (median 7.6 years after baseline).
The results of the sensitivity analyses without adjustment for covariates potentially on the causal pathway were almost identical to the main analyses (data not shown).

Discussion
With increasing availability of large datasets (e.g. biobanks), researchers face important decisions alongside the unique opportunities offered by these resources. This study is the first to quantify the combined impact of choice of follow-up time cut-off and method to address reverse causality bias on estimates of association between self-reported physical activity and these outcomes. We found that choice of cut-off for follow-up time within a range of 1-7 years can strongly influence the magnitude of the prospective association. When all-cause mortality was the outcome, the log HRs based on a maximum follow-up of 1 year were over double the magnitude of those obtained using a median 7-year follow-up. There may be a number of explanations for the attenuation of the association with increasing follow-up time, further to the reduced influence of reverse causality bias. For example, we would expect individual variation in physical activity levels, resulting in the baseline exposure measurement not perfectly reflecting exposure over the entire follow-up period. As Skogstad et al. showed, some health benefits of physical activity may not persist if levels are not maintained: after an 8-week physical activity intervention, both physical activity levels and various biomarkers of CVD risk returned to baseline levels 15 months later. 20 Therefore, one should expect some attenuation as follow-up time increases. However, since the magnitude of the regression dilution over the 3-7 years after baseline was estimated to be fairly similar in this study, it is unlikely that changes in physical activity behaviour fully explain these differences.
Excluding individuals with prevalent disease attenuated the estimated associations compared with adjusting for it, as did excluding early incident cases to address the issue of undiagnosed underlying disease impacting physical activity levels. However, the impact of different analytical approaches for dealing with reverse causality was smaller as follow-up time increased. Our results are comparable with the findings of Andersen et al. who observed attenuation in the associations of physical activity levels with all-cause mortality over a 10-year period in Danish adults. 9 The HR comparing the risk of all-cause mortality for least active individuals with that of the highly active decreased from 2.6 to 1.9 when follow-up lengthened from 2 to 10 years, representing ~30% difference on the log scale. The most comparable result in the present study was the 55% attenuation of the Model 2 estimates between 2- and 7-year follow-up. Different exposure measures and modelling choices, sample sizes and event rates, and sample characteristics are likely to influence the level of attenuation. 21 One other study investigated differences in the association between physical activity and health outcomes by different cut-offs for follow-up time up to 7 years. De Bruijn et al. found attenuation in the association between physical activity and dementia over a 2-6 year period, to the extent that there was no evidence of an association after 5 years. 22 In response to this finding, the authors discussed the need to consider the interplay between physical activity and the disease pathway when choosing the follow-up time. However, for this outcome, there are further complexities to consider, as the disease state (even at a pre-diagnostic stage) may also negatively impact on the subjective recall of physical activity. Therefore, despite their rigorous screening methods at baseline, correlated measurement error may also explain the associations.
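The comparison with Andersen et al. expresses attenuation as a percentage difference between log hazard ratios. A minimal sketch of that calculation is shown below; the function name is hypothetical, and the definition used (one minus the ratio of absolute log HRs) is an assumption about how such a percentage would typically be computed.

```python
import math


def log_scale_attenuation(hr_short, hr_long):
    """Fractional reduction in |log HR| from shorter to longer follow-up."""
    return 1.0 - abs(math.log(hr_long)) / abs(math.log(hr_short))


# HRs of 2.6 (2-year follow-up) and 1.9 (10-year follow-up), as quoted above.
andersen = log_scale_attenuation(2.6, 1.9)
print(f"{andersen:.0%}")
```

Taking absolute values means the same function applies whether the hazard ratios are expressed for the least active group (HR > 1) or per unit of higher activity (HR < 1).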
A major strength of the current work is its relevance to the decisions faced by researchers today. We used a large sample of middle-aged adults where event rates may in principle be high enough to undertake association analyses soon after baseline. By quantifying the difference in the estimates, we have assisted those who meta-analyse results from studies with different follow-up periods in the <10 year range. Our results are particularly timely as the follow-up time for the UK Biobank subsample with accelerometry measures (undertaken ~3 years after baseline interview 23 ) will soon be sufficient for prospective analyses. However, it remains unclear whether our findings would also apply to a situation where physical activity is measured objectively. This will be important to investigate to aid interpretation of the results of two recent studies 8,24 reporting on the associations between accelerometer-derived physical activity and mortality after 1-2 years of follow-up. Accelerometer-derived metrics may be more precise at differentiating between levels of physical activity than self-report methods, 25 which may change the strength of the association with health outcomes and cross-sectionally with underlying disease. If so, it is possible that there may be a different pattern of variation in the estimates based on different lengths of follow-up and ways of accounting for reverse causality. This study is also the first to present associations of this particular physical activity exposure summary measure and all-cause and CVD mortality and CVD incidence in the UK Biobank sample. Previous work has either reported associations by domain 26 or has used summary measures from the International Physical Activity Questionnaire Short Form. 7
There are also some limitations of our work. The UK Biobank sample is non-representative of the general population (5.5% response rate) and has been shown to be healthier than the UK population. 27 This may affect the generalizability of the results. Potentially, the associations we observe in this study would likely be greater still in older or less healthy populations who have an increased prevalence of underlying disease, or for whom the timeline of disease progression may be different. Therefore, similar work in other population samples is needed. We have also only investigated one behavioural exposure; the pattern of associations may also be present for other exposures, for example specific sedentary behaviours or food intake. Lastly, ~30% of the UK Biobank participants have been identified as having at least one third-degree or closer relative in the study. We have not accounted for this relatedness in our analyses, thus the certainty of individual HRs may be slightly overestimated. However, this should not impact on our main conclusions, as we focus here on the relative difference between models rather than absolute associations.
In conclusion, we have shown important differences in the associations between self-reported physical activity and all-cause mortality and CVD outcomes as follow-up time increases over a 7-year period. We have also shown that analytical approaches to account for reverse causality can affect estimates, particularly with shorter follow-up times. The expected time course of disease progression is critical to these decisions, and, as such, just because analyses can be undertaken, it does not mean that they should be.

Supplementary data
Supplementary data are available at IJE online.