Estimation of time-varying causal effects with multivariable Mendelian randomization: some cautionary notes

Abstract Introduction For many exposures present across the life course, the effect of the exposure may vary over time. Multivariable Mendelian randomization (MVMR) is an approach that can assess the effects of related risk factors using genetic variants as instrumental variables. Recently, MVMR has been used to estimate the effects of an exposure during distinct time periods. Methods We investigated the behaviour of estimates from MVMR in a simulation study for different time-varying causal scenarios. We also performed an applied analysis to consider how MVMR estimates of body mass index on systolic blood pressure vary depending on the time periods considered. Results Estimates from MVMR in the simulation study were close to the true values when the outcome model was correctly specified: i.e. when the outcome was a discrete function of the exposure at the precise time points at which the exposure was measured. However, in more realistic cases, MVMR estimates were misleading. For example, in one scenario, MVMR estimates for early life were clearly negative despite the true causal effect being constant and positive. In the applied example, estimates were highly variable depending on the time period in which genetic associations with the exposure were estimated. Conclusions The poor performance of MVMR to study time-varying causal effects can be attributed to model misspecification and violation of the exclusion restriction assumption. We would urge caution about quantitative conclusions from such analyses and even qualitative interpretations about the direction, or presence or absence, of a causal effect during a given time period.


Introduction
Multivariable Mendelian randomization (MVMR) is an extension of standard (univariable) Mendelian randomization to investigate the causal effects of related risk factors with shared genetic predictors. 1 In standard Mendelian randomization, we take genetic variants that are predictors of a single risk factor, and assess whether genetically predicted levels of the risk factor are associated with the outcome. 2 If the genetic variants satisfy the instrumental variable assumptions, then an association between genetically predicted levels of the risk factor and the outcome is indicative of a causal effect of the risk factor on the outcome. 3 MVMR is analogous, except that instead of testing whether genetically predicted values of a single risk factor are associated with the outcome, we test whether genetically predicted values of multiple risk factors are associated with the outcome or not in a multivariable model. 4 Estimates from MVMR represent direct effects of the risk factors on the outcome. 5 The method is primarily used in two contexts: first, to assess the effect of an exposure when genetic variants associated with the exposure may have pleiotropic effects on the outcome via other measured risk factors; 1 and second, to assess the relative contribution of causal pathways from the exposure to the outcome via other measured risk factors in a mediation analysis. 5, 6 We investigate a related application of MVMR that has been considered in empirical investigations 7,8 and has recently been described from a methodological perspective 9,10 that we refer to as time-varying Mendelian randomization. In this setting, the risk factors are not separate exposures, but rather multiple measures of the same exposure at different time points (Figure 1). For instance, Richardson et al. considered body mass index (BMI) measured during early life and during later life as separate risk factors, and assessed whether genetically predicted values of early-life and later-life BMI were associated with coronary artery disease (CAD) risk. 7 The authors interpreted a positive univariable association between genetically predicted early-life BMI and CAD risk as evidence that earlylife BMI is a causal risk factor for CAD, and lack of an independent association between genetically predicted earlylife BMI and CAD risk in a multivariable model that also included genetically predicted later-life BMI as evidence that early-life BMI does not have a direct effect on CAD risk, but that the risk is mediated via later-life BMI.
These analyses pose several difficulties. First, it is necessary to have some linear independence between genetic predictors of the exposure at different time points. 11 If genetic associations with the exposure at different time points are perfectly proportional, then it is not possible to disentangle the effects of the exposure at different time points. It is not necessary to find distinct genetic predictors of the exposure at different time points, but if correlations across variants between the genetic associations with the exposure at different time points are strong, then the analysis will have limited utility.
However, even when such genetic variants are available, the use of MVMR to investigate time-varying causal effects necessitates strong parametric assumptions that are unlikely to be plausible in practice. In univariable Mendelian randomization, in order to estimate a parameter that represents a causal effect, it is necessary to make parametric assumptions. 12 There are different parametric assumptions, depending on which causal effect to be identified or interpreted. For example, for the average treatment effect (ATE), one normally requires the homogeneity and the linearity of exposure-outcome model. 13 Similarly, monotonicity is usually required for the local average treatment effect (LATE). However, even if these assumptions are not satisfied, the standard Mendelian randomization estimate is a valid test statistic for the sharp causal null hypothesis that the exposure has no causal effect on the outcome at any time point. 14 Rejection of this null hypothesis implies that the exposure has a causal effect on the outcome at some time point in at least a subset of the population. Here, we will show by counterexample that an analogous result for time-varying Mendelian randomization does not hold.
In this paper, we will explore the behaviour of estimates from MVMR for a time-varying exposure in a simple simulation study. We shall show that, when the model for the outcome is correctly specified, estimates from MVMR are close to the true parameter values. However, when the model is incorrectly specified (as is likely in practice), estimates are highly misleading. We also perform an applied MVMR analysis of BMI on systolic blood pressure (SBP) and show that estimates and inferences are highly variable depending on the time period at which genetic associations with BMI are estimated. Methods MVMR has been described at length in the literature 1,4 and in particular in the context of time-varying causal effects. 9, 10 We here provide a brief overview of the approach.

Assumptions and estimation
We use the term 'exposure' to describe the putative causal factor and the term 'risk factor' to describe the value of the exposure at a particular time point. MVMR requires each genetic variant to satisfy the multivariable instrumental variable assumptions 15 : i. the variant is associated with one or more of the risk factors, ii. the variant is not associated with the outcome via a confounding pathway and iii. the variant does not affect the outcome directly, only possibly indirectly via one or more of the risk factors.
In order to estimate causal effects, we need to make additional assumptions. For simplicity, we assume that all variables are continuous, and the associations between the genetic variants and the risk factors, the genetic variants and the outcome, and the causal effect of the risk factors on the outcome are homogeneous and linear without effect modification by any confounder. 3 This enables the estimation of the direct effects of the risk factors on the outcome using the two-stage least squares (2SLS) method, which can be implemented by first regressing the risk factors on the genetic variants, and then regressing the outcome on the fitted values of the risk factors in a multivariable model. 16 The term 'direct effect' is imprecise; it has previously been argued that the direct effects estimated in instrumental variable analyses are most naturally interpreted as controlled direct effects (i.e. the effect of varying the exposure while fixing the mediator at a given value), as the instrumental variables set the values of the risk factors. 6 However, in the all-linear setting, natural and controlled direct effects take the same value, so any difference is a question of interpretation. An alternative estimation method that could be applied here is g-estimation of structural mean models, although estimates are similar to those from 2SLS in the alllinear setting. 10 Equivalent estimates to those from 2SLS would also be obtained from the multivariable inversevariance weighted method if we had access to summarized data on the genetic associations with the outcome and with the exposure at the relevant time points. 17

Simulation studies
We investigate the behaviour of MVMR estimates for a time-varying exposure in a series of simulation studies. We simulate the time-varying exposure X(t) according to the following data-generating model: where the genetic variants G j for j ¼ 1; 2; . . . ; 30 follow independent binomial distributions Bð2; 0:3Þ, and the confounder U and the error term X have independent standard normal distributions. The exposure varies over Figure 1 Directed acyclic graph illustrating multivariable Mendelian randomization assumptions for two risk factors (X 1 and X 2 ), J genetic variants (G 1 ; G 2 ; . . . ; G J ) that are assumed to satisfy the multivariable instrumental variable assumptions, an outcome (Y) and an unmeasured confounder (U). In time-varying Mendelian randomization, we assume that the two risk factors are measurements of the same exposure at different time points. The dashed arrow from X 1 to X 2 indicates the potential causal effect of the exposure at time point 1 on the exposure at time point 2 time t, which can be interpreted as an individual's age, owing to sinusoidal effects of the error term [sinðtÞ], the confounder [cosðtÞ] and the genetic variants: where A 1;j ; A 2;j ; A 3;j and A 4;j all have independent normal distributions for each variant j. We use the cosine function to generate variability in the genetic effects and thus avoid weak instruments bias caused by the high linear dependence between genetic predictors of the exposure at different time points. The distributions of fA 1;j ; A 2;j ; A 3;j ; A 4;j g are fixed for each scenario, but vary between scenarios. Detailed values for the independent normal distribution are shown in Supplementary Table S1 (available as Supplementary data at IJE online). We consider two scenarios for the outcome. First, we assume that the outcome is a function of the exposure at two fixed time points, t ¼ 10 and t ¼ 50: where U is the confounder as above and Y has an independent standard normal distribution. The true causal effect of the exposure at time 10 is þ0.4 and the causal effect of the exposure at time 50 is -0.8. We consider estimates from MVMR in four cases: first, when the exposure is measured at times 10 and 50 (Scenario 1A); then when the exposure is measured at times 10, 40 and 50 (Scenario 1B); at times 15 and 30 (Scenario 1C); and at times 15 and 50 (Scenario 1D). We recognize that this outcome model is somewhat unrealistic; we explore these scenarios to investigate whether methods can consistently estimate causal effects when the model is correctly specified. Second, and more realistically, we assume that the outcome is a continuous function of the exposure that varies over time. We express the relationship between the exposure and outcome using an integral, where the causal effect of the exposure depends on the time-varying function bðtÞ: where again the confounder U and Y have independent standard normal distributions. We consider three different scenarios for bðtÞ:  Figure 2. We consider MVMR estimates when the risk factors are the exposure measured at times 10 and 50. In Scenario 2C, we also consider a wider range of choices of timings for the exposure measurements and consider the impact on estimates.

Interpreting multivariable estimates as causal effects
In order to better interpret parameters estimated in timevarying MVMR, we consider a simplified scenario in which the genetic effects on the exposure vary linearly over time. Suppose we have two measured time points t 1 and t 2 satisfying 0 t 1 < t 2 50, and consider estimates at the measured time pointsb 1 andb 2 for an outcome Y defined by Equation (4). The total cumulative effect Ð 50 0 bðtÞdt represents the impact of a lifelong increase in the exposure by one unit. These estimates have the asymptotic values (see Supplementary Text S1 and S2, available as Supplementary data at IJE online): which represent weighted cumulative effects across the whole life course. Hence the estimates do not represent an effect limited to a particular time period: the estimate at t 1 more strongly represents the effect of the exposure during early life and the estimate at t 2 more strongly represents the effect during later life, but both estimates are influenced by the effect of the exposure across the whole life course. Hence, if there is a sustained effect of the exposure during early life, the estimate at t 2 will be non-zero even if the effect during later life is null. If t 1 ¼ 0 and t 2 ¼ 50, then the weighting functions t 2 Àt t2Àt1 and tÀt1 t2Àt1 take values between 0 and 1, and the estimates at t 1 and t 2 will be non-negative if the effect function bðtÞ is non-negative. However, if t 1 ¼ 10, then the weighting function tÀt 1 t2Àt1 will take negative values for t < 10. Similarly, if t 2 ¼ 40, then the weighting function t2Àt t2Àt1 will take negative values for t > 40. Hence, even if the bðtÞ is nonnegative (or even strictly positive) across the whole life course, MVMR estimates can be negative. For instance, if t 1 ¼ 10 and t 2 ¼ 50, then the asymptotic estimate at t 2 would be negative if bðtÞ is large and positive for t < 10, and zero (or alternatively small and positive) for t ! 10.
We note that the sum of the asymptotic estimates from MVMR is the total cumulative effect. This result holds more generally provided the genetic effects on the exposure are not too irregular (see Supplementary Text S3, available as Supplementary data at IJE online for precise conditions).
We also consider simulated scenarios for a datagenerating model in which the genetic effects are linear and the confounder is allowed to vary in time: where UðtÞ; X ðtÞ; Y ðtÞ are independent Brownian motions with variance equal to 1 at t ¼ 50, U 0 $ N ð0; 1 2 Þ and the instrumental effects on the exposure are: where fa j g and fb j g are independently simulated from the uniform distribution UðÀ0:1; 0:1Þ and UðÀ0:04; 0:04Þ, respectively. The trajectory for the exposure of one individual, along with the instrument and confounding effects, is shown in Supplementary Figure S1 (

Illustrative example: BMI and SBP
To investigate how estimates may behave in a real data analysis, we performed a time-varying MVMR analysis in which the exposure is BMI and the outcome is SBP. Previous Mendelian randomization analyses have suggested that BMI has a positive causal effect on SBP, 18 although how this effect may vary over time has not been explored. We took data from UK Biobank, a prospective cohort study of around half a million people aged 40-69 years at baseline, recruited in 2006-2010 from across the UK. 19 We considered 366 089 unrelated individuals of European ancestries, who passed various quality-control filters as described previously. 20 We used 93 uncorrelated (pairwise r 2 < 0:01) single-nucleotide polymorphisms as instrumental variables that have previously been shown to be associated with BMI at a genome-wide level of statistical significance. 21 This genome-wide association study did not include UK Biobank participants, thus avoiding bias due to winner's curse. 22 We derived MVMR estimates using a two-sample 2SLS method, as we only have measurements of BMI for each individual at a single time point. In the first stage, we performed two separate regressions of BMI on the genetic variants using individuals recruited between ages 41-46 and 60-65 years. We used coefficients from these regressions to estimate genetically predicted BMI at ages 41-46 and 60-65 years for individuals aged >65 years at recruitment. We then performed the second-stage regression of SBP on genetically predicted BMI using individuals aged >65 years at recruitment. To investigate variability of results to the choice of time period, we repeated this analysis using different age ranges for the first BMI regression: 42-47, 43-48, 44-49 and so on up to 55-60 years. The second time period was always taken as 60-65 years, and the second-stage regression was always conducted in individuals aged >65 years at recruitment. Estimates represent the change in SBP in mmHg per 1-kg/m 2 increase in genetically predicted BMI. The diagram demonstrating the fitting procedure is shown in Supplementary Figure S2 (available as Supplementary data at IJE online).

Simulation study
For each scenario, we generated 1000 data sets on 10 000 participants. The average proportion of variance in the exposure explained by the genetic variants under each scenario is $10%, corresponding to a univariable F-statistic of 36.9. The average conditional F-statistic values for MVMR in most scenarios are >10, a value conventionally regarded as a guide to diagnose weak instruments. 4,11 Detailed values for instrument strength values are shown in Supplementary Table S2 (available as Supplementary data at IJE online). We present results from MVMR obtained using the 2SLS method.

Discrete effects of the exposure at specific time points
Results from the simulation study in which the outcome is a function of the exposure at times 10 and 50 are shown in Figure 3. In Scenario 1A, we use the exposure measured at times 10 and 50 as risk factors. Median estimates across scenarios are close to the true causal effects: namely þ0.4 at time 10 and -0.8 at time 50, though there is some bias towards the null due to weak instruments. In Scenario 1B, we use the exposure measured at times 10, 40 and 50 as risk factors. Again, median estimates across scenarios are close to the true causal effects, with the median estimate at time 40 around zero. However, in Scenario 1C (using the exposure at times 15 and 30) and in Scenario 1D (using the exposure at times 15 and 50), median estimates are substantially different to the true values. In Scenario 1C, the median estimate at time 15 is negative and at time 30 is positive; this is the opposite to the true situation, as the true effect is positive at the earlier time point and negative at the later time point. In Scenario 1D, the median estimate at time 50 is correctly negative, but the median estimate at time 15 is also negative, whereas the early-age causal effect is positive. We repeated the simulation study with stronger instruments (average proportion of variance in the exposure explained was around 30%); results shown in Supplementary Figure S3 (available as Supplementary data at IJE online) support our claim that bias in Scenarios 1A and 1B is due to weak instruments.

Continuous effects of the exposure across time
Results from the simulation study in which the outcome is a continuous function of the exposure are shown in  Although median estimates were mostly positive, they were sometimes close to zero and occasionally clearly negative. Moreover, it is not clear that estimates improved when considering measures of the exposure at increased numbers of time points, as some median estimates were either close to zero or negative even when the exposure was measured at four time points.

Linear time-varying genetic effects
Results from the simulation study in which the outcome is a continuous function of the exposure and the genetic effects on the exposure vary linearly with time are shown in Figure 6. In Scenarios 3A and 3B, the median estimates for the first measured time point are positive, whereas in truth the causal effect of the exposure is null at these time points. When the second time point is changed to 40, as in Scenario 3D, the median estimates for the first measured  (6). This indicates that the problem in timevarying MVMR is not bias, but rather that the quantities estimated do not have a meaningful interpretation as the causal effect of an exposure restricted to a particular time period.
Finally, results from the simulation in which the effects of genetic variants on the exposure switch from off to on at different times are presented in Supplementary Figures  S5 and S6 (available as Supplementary data at IJE online). Results are very similar to those in Figure 6, indicating that findings are not sensitive to the choice of model relating the genetic variants to the exposure.

Illustrative example: BMI and SBP
Results from the illustrative example for the effect of BMI on SBP are shown in Figure 7. Conditional F-statistics were close to 1 in all cases, indicating little variability in the genetic effect over time (Supplementary Table S3, available as Supplementary data at IJE online). This means that there is little information in the data to obtain precise multivariable estimates at the different time points. Scatter plots of the genetic associations with BMI at different time periods are shown in Supplementary Figure S4 (available as Supplementary data at IJE online); whereas most variants are similarly associated with BMI across time periods, some variants are more strongly associated during the earlier or later time period, enabling the MVMR analysis to be performed. MVMR estimates for both time periods are highly variable about the age range over which the risk factor associations were estimated. The range of variation in Mendelian randomization estimates was greater than expected based on the standard errors of estimates. In some cases, the 95% CI for the estimate in the first time period was negative and excluded the null and the 95% CI for the estimate in the second time period was positive and excluded the null. But there were also positive point estimates for the first time period and negative point estimates for the second time period. In summary, both estimates Multivariable Mendelian randomization estimates (95% CIs) of the effect of body mass index (BMI) on systolic blood pressure for different time periods. The first risk factor is BMI over the first time period as indicated. The second risk factor is BMI over the time period from age 60 to 65 years. Estimates for systolic blood pressure are performed in individuals aged >65 years at the time of recruitment and inferences based on those estimates were strongly dependent on the choice of the first time period-a choice that is likely to be arbitrary in practice.

Discussion
As the well-known aphorism goes: all models are incorrect, but some models are useful. 23 In univariable Mendelian randomization, strict parametric assumptions are required for Mendelian randomization estimates to be interpreted as either an ATE or a LATE. 12 However, even if these assumptions are not satisfied, the univariable Mendelian randomization estimate still has an interpretation as a test statistic for a relevant causal hypothesis. 24 Indeed, it is been argued that the numerical value of a Mendelian randomization estimate rarely represents the estimate of a policy-relevant parameter, 25 as (amongst other reasons) it represents the impact of a lifelong change in the distribution of the exposure, whereas interventions on exposures in clinical practice are more limited in time. 26,27 Hence the primary value of a Mendelian randomization investigation is to provide evidence supporting (or questioning) a causal hypothesis, rather than providing a causal estimate. 28 However, estimates from MVMR for a time-varying exposure do not seem to have a similar interpretation as a test statistic for a relevant causal hypothesis relating to the presence of a causal effect over a specific time period. When the exposure affects the outcome at a limited number of discrete time points and the risk factors in the MVMR analysis are the values of the exposure at these time points, causal effects at these time points can be unbiasedly estimated. But if these time points are not correctly identified, estimates obtained by the ill-specified model are incorrect in a way that is misleading to any inferences being drawn from their magnitude, about either the presence or the direction of a causal effect. Similarly, if the effect of the exposure on the outcome is not discrete, but rather continuous, then the values estimated by MVMR do not have a naturally intuitive interpretation. For the MVMR model to be correctly specified, the model in Figure 1 must represent the true data-generating model, and not merely be a simplification of the model. It is implausible for the true effect of an exposure on the outcome to be discrete at the precise time points at which measurements of the exposure are available; hence, we expect potentially misleading estimates to be ubiquitous for timevarying Mendelian randomization investigations in practice.
We have provided a simulated example in which the effect of the exposure on the outcome is positive in early life only and negative in later life, but estimates from MVMR would suggest the opposite. We have also provided a simulated example in which the effect of the exposure on the outcome occurs in early life only and not in later life, but estimates from MVMR would suggest the opposite. Similarly, we have provided a simulated example in which the effect of the exposure on the outcome occurs in later life only and not in early life, but estimates from MVMR analysis are positive throughout. We have provided a simulated example in which estimates from MVMR at different time points are not necessarily positive even though the effect of the exposure is uniformly constant and positive. Finally, we have considered a simple case in which the genetic effects on the exposure vary linearly, and have shown algebraically that values estimated by MVMR do not represent intuitively interpretable causal effects. We have also provided an applied example with real data, and shown that estimates are sensitive about the time period over which genetic associations with the exposure were estimated. The choice of time period influenced the conclusions drawn from the analysis: whether there is a negative effect of BMI on SBP during the first time period or not, and whether there is a positive effect of BMI on SBP during the second time period or not.
The poor performance of MVMR in these examples can be attributed to model misspecification. In general, when a statistical model is misspecified, estimates suffer from bias that is unpredictable in both magnitude and direction. 29 The estimates of MVMR represent weighted cumulative effects, whereas the weighting function is affected by the measured time points and the time-varying instrumental effects, which makes results sensitive to the time points chosen and unpredictable as the time-varying instrument effects are unknown. Another explanation is violation of the exclusion restriction assumption. The instrumental variable assumptions state that the totality of the effect of the genetic variants on the outcome is mediated via the exposure, such that if the exposure were fixed and the genetic variants varied, the outcome would remain the same. 30 If the effect of the exposure acts continuously over time, it would not be sufficient to fix its value at a few specific time points, but it would be necessary to fix the trajectory of the exposure over time. Similarly, complete mediation of the genetic effect on the outcome via the exposure would only be observed when considering the distribution of the exposure over time. As time is continuous rather than discrete, it is unlikely that any model containing values of the exposure at specific discrete time points could be correctly specified. One potential extension of this work is to consider models that fit the outcome as a continuous function of the exposure, which requires the exposure or the instrument effects over the whole life course to be modelled.
One qualitative limitation of our cautionary remarks is that we have considered scenarios in which the risk factors represent the same exposure measured at different time points. In the example of Richardson et al., 7 it is arguable that early-life BMI and later-life BMI represent biologically distinct exposures, whose effect on the outcome is restricted to a limited time period. For example, early-life BMI may only affect the outcome for the growth and development period until full adult size is achieved. Hence MVMR could be a legitimate tool for obtaining meaningful inferences in this case, and the estimates represent the effect of the corresponding distinct exposure during the time period. Whereas we agree that such a situation is more amenable to the use of MVMR, we would still be cautious about over-interpretation of results from timevarying Mendelian randomization investigations, and would recommend that investigators assess the robustness of findings carefully (e.g. by assessing the consistency of results with different choices of genetic variants and/or different time points for the measurement of the exposure). Another limitation is that the simulation scenarios considered may be unrealistic, although similar fragility in estimates was observed in the applied analysis. A further practical issue, as observed in the applied example, is that it is difficult to find genetic variants for which the effect on the exposure is more or less strong at different time periods, and hence estimates from MVMR may be highly variable and imprecise.
In conclusion, MVMR analyses to investigate timevarying causal effects rely on parametric assumptions that are unlikely to be satisfied in practice, and provide estimates that can be misleading if the model is incorrect. We therefore strongly discourage quantitative conclusions to be drawn from these analyses, and would advise caution about the qualitative interpretation of such findings as a guide of the direction or even existence of a causal effect during a particular phase of life.

Ethics approval
The UK Biobank study has approval from the North West Multicentre Research Ethics Committee (11/NW/0382).

Data availability
Individual-level data from UK Biobank cannot be shared publicly for ethical/privacy reasons. The data will be shared on reasonable request to the corresponding author, with the permission of UKBiobank. The research has been conducted using the UK Biobank resource under Application Number 7439.

Supplementary data
Supplementary data are available at IJE online.

Author contributions
H.T. designed and conducted the simulation and applied study. S.B. directed the study's implementation and helped to interpret the findings. All authors reviewed and edited the manuscript.