Commentary: Incorporating concepts and methods from causal inference into life course epidemiology

© The Author 2017. Published by Oxford University Press on behalf of the International Epidemiological Association. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted reuse, distribution, and reproduction in any medium, provided the original work is properly cited. International Journal of Epidemiology, 2017, 771. doi: 10.1093/ije/dyw367. Advance Access Publication Date: 14 March 2017. Erratum


Introduction
The review by Ben-Shlomo et al. 1 highlights how life course epidemiology is evolving and adapting to accommodate increasing access to data on novel dimensions and over extended periods. This enriched framework raises ever greater methodological challenges, leaving statisticians like us daunted by the task of translating life course enquiries into suitable analyses of the data at hand.
Take, for example, Figure 4 of Ben-Shlomo et al. 1 This is very useful for gaining a 'big picture' understanding of a complex area such as ageing, and for establishing which processes may benefit from a more detailed investigation. However, the leap from such a diagram to a specific data analysis should not be (and is not typically) made without greater thought. We will argue in this commentary that some recent developments from the field of modern causal inference may be helpful in this regard. First, in order to state unambiguously the question (or questions) of interest, the potential outcomes framework, a cornerstone of modern causal inference thinking, is invaluable. Then, the conceptual framework should be refined to a causal directed acyclic graph (DAG) relevant to the question, and the causal DAG should be formally interrogated to see whether the question can be addressed, and if so how. Indeed, depending on the question, the causal DAG and the data available, we may find that standard statistical methods traditionally used in epidemiology are sufficient; in other settings we may find that more novel techniques are needed.
We will discuss each of these points next, mentioning also the issues of missing data and measurement error, as well as highlighting concerns about the difference between the processes which are the focus of investigations and their manifestations in observed data.

What is the question of interest?
For illustration, we take an example briefly discussed by Ben-Shlomo et al., 1 namely the relationship between nutrition and type II diabetes. Whereas the broad aim of a project may be to understand the effect of nutrition across the life course on the risk of developing adult type II diabetes, a more specific question must be established before we can proceed. In this section, we will highlight the range of different questions that may be of interest, and how they can be unambiguously distinguished using the potential outcomes notation.

Potential outcomes
Suppose for simplicity that nutrition is reliably measured (e.g. via detailed food frequency questionnaires together with analysis of urinary samples) twice during the life course on a cohort of people: once in childhood and again in early adulthood. Let $X_1$ and $X_2$ denote relevant summaries of nutritional status in childhood and early adulthood, respectively, and let the binary variable $Y$ denote the development of type II diabetes by age 70 years, say. Let $Y(x_1)$ be the potential outcome, i.e. the value that $Y$ would take if we were hypothetically to intervene on $X_1$ and set it to the value $x_1$. We can similarly define potential outcomes $Y(x_2)$ for hypothetical interventions on $X_2$. We may also define $Y(x_1, x_2)$, the value that $Y$ would take were we to intervene on both $X_1$ and $X_2$ and set them to $x_1$ and $x_2$, respectively. We will now use this very simple example to illustrate how subtly different questions of interest can be articulated using these potential outcomes.
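To make the notation concrete, the following is a minimal Python sketch of potential outcomes in a toy cohort. Everything in it is an illustrative assumption rather than part of the commentary: the probabilities in `p_x2` and `p_y`, the coding of each exposure as 1 for "poor nutrition", the absence of confounding, and the names `Person`, `x2_under`, `y_under_joint`, `y_under_x1` and `y_under_x2`. Each simulated person carries fixed individual-level noise, so that all of their potential outcomes are defined at once, even though only one is ever observed.

```python
import random

def p_x2(x1):
    # Assumed P(X2 = 1) when X1 is set to x1 (1 = poor nutrition).
    return 0.3 + 0.4 * x1

def p_y(x1, x2):
    # Assumed P(Y = 1) when (X1, X2) are set to (x1, x2).
    return 0.10 + 0.05 * x1 + 0.10 * x2 + 0.05 * x1 * x2

class Person:
    """One cohort member with fixed individual noise terms."""

    def __init__(self, rng):
        self.e2 = rng.random()              # noise driving adult nutrition
        self.ey = rng.random()              # noise driving diabetes
        self.x1 = int(rng.random() < 0.5)   # observed childhood nutrition
        self.x2 = self.x2_under(self.x1)    # observed adult nutrition
        self.y = self.y_under_joint(self.x1, self.x2)  # observed outcome

    def x2_under(self, x1):
        """X2(x1): adult nutrition if childhood nutrition were set to x1."""
        return int(self.e2 < p_x2(x1))

    def y_under_joint(self, x1, x2):
        """Y(x1, x2): outcome if both exposures were set."""
        return int(self.ey < p_y(x1, x2))

    def y_under_x1(self, x1):
        """Y(x1): set X1 only; X2 then takes its natural value X2(x1)."""
        return self.y_under_joint(x1, self.x2_under(x1))

    def y_under_x2(self, x2):
        """Y(x2): set X2 only; X1 stays at its observed value."""
        return self.y_under_joint(self.x1, x2)

rng = random.Random(2017)
cohort = [Person(rng) for _ in range(1000)]

# Consistency: the observed outcome equals the potential outcome under the
# exposure values that were actually received.
consistent = all(
    p.y == p.y_under_x1(p.x1) == p.y_under_x2(p.x2)
    == p.y_under_joint(p.x1, p.x2)
    for p in cohort
)
```

In real data only `x1`, `x2` and `y` would be available for each person; the other potential outcomes are counterfactual, which is why assumptions are needed to learn about their distribution.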

Total and joint effects
The total causal effect (TCE) of $X_1$ on $Y$ (including both its direct effect and its indirect effect via $X_2$) can be expressed as a comparison of the distribution of $Y(x_1)$ for different values of $x_1$. Often the mean is compared, as in $\mathrm{TCE}_1(x_1) = E\{Y(x_1)\} - E\{Y(x_1^*)\}$, where $x_1^*$ is a reference (or baseline) value of $X_1$; for binary $X_1$, $\mathrm{TCE}_1 = E\{Y(1)\} - E\{Y(0)\}$. If $Y$ were a time-to-event outcome, such as time to onset of type II diabetes, we could instead compare the survivor functions of the potential outcomes. 2 The TCE of $X_1$ would be of public health interest if, for example, primary school nutrition programmes were being considered. If a public health nutrition initiative targeted at adults were instead being considered, then the causal effect of $X_2$ on $Y$, $\mathrm{TCE}_2$, would more likely be of interest; this would involve a comparison of the distribution (e.g. the mean) of $Y(x_2)$ for different values of $x_2$. Alternatively, the likely impact of a general intervention such as increased taxation of unhealthy food would more naturally lead to a comparison of the distribution of $Y(x_1, x_2)$ for different values of $x_1$ and $x_2$: this is the joint effect of $(X_1, X_2)$ on $Y$. 3,4

Controlled direct effects and conceptual models in life course epidemiology
Another possible aim might be to gain a better biological understanding of the timing and strength of the mechanisms linking nutrition to type II diabetes prevalence. In this case we might compare the distribution of $Y(x_1, x_2)$ for different values of $x_1$ but for a fixed value of $x_2$, known as a controlled direct effect, for example $\mathrm{CDE}_1(x_2) = E\{Y(x_1, x_2)\} - E\{Y(x_1^*, x_2)\}$. 5 Evidence of variation in $\mathrm{CDE}_1(x_2)$ for different values of $x_2$ would indicate that the effect of nutrition in childhood on the risk of diabetes varies according to the level at which adult nutrition is set.
This would support the so-called pathways model discussed in life course epidemiology, 6,7 according to which sensitive periods of exposure interact in their impact on risk (see below for further discussion of this). In the absence of such effect heterogeneity, we could compare the common $\mathrm{CDE}_1$, which represents the effect of $X_1$ that is not mediated by $X_2$, with the total causal effect of $X_2$, $\mathrm{TCE}_2$. A similarity between them would support the cumulative exposure model. This is because the similarity (in addition to the lack of interaction) implies that experiencing the exposure during each of these two periods (directly) influences the risk of type II diabetes by the same amount, and hence it is the cumulative exposure, rather than the timing of it, that matters. If instead one effect were much smaller than the other, there would be support for the sensitive period model, and one or other effect being zero would support the critical period model. 6,8

Interaction versus effect modification
Returning again to the pathways model, there are two subtly different possible questions even here, which can clearly be articulated using potential outcomes, namely the difference between interaction and effect modification. 9 For ease of explanation, suppose that $X_1$ and $X_2$ are both binary, and that we are interested in comparing the means of the potential outcomes. An interaction is said to be present if the two CDEs differ, i.e. if $E\{Y(1,1)\} - E\{Y(0,1)\} \neq E\{Y(1,0)\} - E\{Y(0,0)\}$. This is the same as saying that $E\{Y(1,1)\} - E\{Y(1,0)\} \neq E\{Y(0,1)\} - E\{Y(0,0)\}$, i.e. the causal effect of changing adult nutrition on the risk of type II diabetes differs according to the level at which we set childhood nutrition. Conversely, effect modification, as defined by VanderWeele, 9 would be present if the causal effect of changing adult nutrition on the risk of type II diabetes differs between those who in reality have different childhood nutrition statuses, i.e. if

$$E\{Y(x_2 = 1) \mid X_1 = 1\} - E\{Y(x_2 = 0) \mid X_1 = 1\} \neq E\{Y(x_2 = 1) \mid X_1 = 0\} - E\{Y(x_2 = 0) \mid X_1 = 0\}. \qquad (1)$$

(Note that we write $Y(x_2 = 1)$ rather than $Y(1)$ here, to clarify that the hypothetical intervention being considered is on $X_2$ rather than $X_1$.) In this literature, interaction has a causal connotation with respect to both exposures, whereas effect modification is causal only with respect to (in this case) the later exposure $X_2$. Which of these questions is of interest will depend on the broader aim of the investigation, and will have an impact on how the data are analysed.
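The distinctions drawn in this section can be checked numerically in a small toy model. The Python sketch below is only an illustration under assumed numbers: the unmeasured binary variable `U`, the functions `p_x1`, `p_x2` and `risk`, and all probabilities are hypothetical. `U` is a common cause of childhood nutrition and diabetes, and it changes the size of the effect of adult nutrition; the assumed risk function contains no product term between the two exposures. The code computes the total effect of childhood nutrition, the controlled direct effects, the interaction contrast, and the effect-modification contrast by exact enumeration.

```python
from itertools import product

P_U1 = 0.5  # assumed P(U = 1); U is a hypothetical unmeasured cause of X1 and Y

def bern(p, value):
    """Probability that a Bernoulli(p) variable equals `value` (0 or 1)."""
    return p if value == 1 else 1 - p

def p_x1(u):
    # Assumed P(X1 = 1 | U = u).
    return 0.2 + 0.6 * u

def p_x2(x1):
    # Assumed P(X2 = 1) when X1 is set to x1.
    return 0.3 + 0.4 * x1

def risk(x1, x2, u):
    # Assumed E{Y(x1, x2) | U = u}; note there is no x1 * x2 term.
    return 0.10 + 0.05 * x1 + 0.10 * x2 + 0.10 * x2 * u

def mean_joint(x1, x2):
    """E{Y(x1, x2)}: both exposures set by intervention."""
    return sum(bern(P_U1, u) * risk(x1, x2, u) for u in (0, 1))

def mean_x1(x1):
    """E{Y(x1)}: X1 is set, X2 responds naturally to x1."""
    return sum(
        bern(P_U1, u) * bern(p_x2(x1), x2) * risk(x1, x2, u)
        for u, x2 in product((0, 1), repeat=2)
    )

def mean_x2_given_x1(x2, x1_obs):
    """E{Y(x2) | X1 = x1_obs}: X2 is set, among people observed at x1_obs."""
    # P(U = u | X1 = x1_obs) by Bayes' rule.
    weights = {u: bern(P_U1, u) * bern(p_x1(u), x1_obs) for u in (0, 1)}
    total = sum(weights.values())
    return sum(weights[u] / total * risk(x1_obs, x2, u) for u in (0, 1))

tce_1 = mean_x1(1) - mean_x1(0)
cde_1 = {x2: mean_joint(1, x2) - mean_joint(0, x2) for x2 in (0, 1)}
interaction = cde_1[1] - cde_1[0]
effect_of_x2 = {
    x1: mean_x2_given_x1(1, x1) - mean_x2_given_x1(0, x1) for x1 in (0, 1)
}
effect_modification = effect_of_x2[1] - effect_of_x2[0]
```

Under these assumed numbers the interaction contrast is zero (both controlled direct effects equal 0.05), yet the effect of adult nutrition is 0.18 among those observed with `X1 = 1` and 0.12 among those with `X1 = 0`. Effect modification is present without interaction because observed childhood nutrition carries information about `U`.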

Effect decomposition
Alternatively, we might be interested in effect decomposition, i.e. in asking what proportion of the effect of childhood nutrition on type II diabetes is mediated by early-adult nutrition. For these questions, so-called natural direct and indirect effects are relevant, 10,11 and can again be unambiguously stated as a counterfactual comparison; the natural direct effect, for example, is a comparison of the distribution of $Y\{x_1, X_2(x_1^*)\}$ for different values of $x_1$, where $X_2(x_1^*)$ is the potential value of $X_2$ were we to set $X_1$ to $x_1^*$. More specifically, the natural direct effect of $X_1$, expressed as a mean difference, is defined as $E\{Y\{x_1, X_2(x_1^*)\}\} - E\{Y\{x_1^*, X_2(x_1^*)\}\}$. These effects and their estimation have received much attention in the recent causal inference literature. 5 Since these effects involve nested counterfactuals, they require very strong untestable assumptions for identification, assumptions that could not be verified even hypothetically in an experimental setting. For this reason, attention is currently shifting to more policy-relevant effects, known as interventional direct and indirect effects, which are identifiable under weaker conditions. 12 We expect that these effects will soon be estimated in applications in life course epidemiology.
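As a worked illustration of effect decomposition, the Python sketch below evaluates the nested counterfactual means in a toy model with binary exposures. It is a sketch under strong assumptions that are not part of the commentary: the probabilities in `p_x2` and `mean_y` are made up, there is no confounding of any kind, and the noise driving $X_2$ is taken to be independent of the noise driving $Y$ across "worlds", so that $E\{Y\{x_1, X_2(x_1^*)\}\}$ reduces to a weighted average of outcome means. The reference value is $x_1^* = 0$.

```python
def p_x2(x1):
    # Assumed P(X2 = 1) when X1 is set to x1.
    return 0.3 + 0.4 * x1

def mean_y(x1, x2):
    # Assumed E{Y(x1, x2)}.
    return 0.10 + 0.05 * x1 + 0.10 * x2 + 0.05 * x1 * x2

def nested_mean(x1, x1_star):
    """E{Y{x1, X2(x1_star)}} under the assumptions stated above."""
    p = p_x2(x1_star)
    return (1 - p) * mean_y(x1, 0) + p * mean_y(x1, 1)

# Natural direct effect: change X1 from 0 to 1, holding X2 at X2(0).
nde = nested_mean(1, 0) - nested_mean(0, 0)
# Natural indirect effect: hold X1 at 1, change X2 from X2(0) to X2(1).
nie = nested_mean(1, 1) - nested_mean(1, 0)
# Total causal effect; equals nde + nie by construction.
tce = nested_mean(1, 1) - nested_mean(0, 0)
proportion_mediated = nie / tce
```

With these assumed numbers the natural direct effect is 0.065 and the natural indirect effect is 0.06, so that 48% of the total effect of 0.125 is attributed to the pathway through adult nutrition. The decomposition is only as credible as the cross-world independence assumption, which is the kind of untestable condition referred to in the text.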

Multiple exposures/mediators
Suppose we had an additional exposure time point, such as nutrition during infancy; we would then be in a setting in which we could potentially be interested in the joint effects of more than two exposures, 3,4 or we might be interested in effect decomposition with multiple mediators. 13,14

Comment
In any particular study, it is unlikely that all of the above would be relevant. Our main message, however, is that life course investigations are causal enquiries. Familiarity with the modern causal inference literature, and with the subtly different flavours of causal effects defined therein, has the potential to aid researchers in formulating and communicating the question(s) of interest.

What is the appropriate causal DAG and what can it tell us?
Once the question of interest has been stated, establishing whether it can be answered under plausible assumptions using the data at hand, and if so how, can be aided by drawing a causal diagram [more precisely, a causal directed acyclic graph (DAG)]. 15 Such a causal DAG should reflect a priori subject-matter knowledge regarding the likely causal structure of the variables being studied. Unlike conceptual frameworks, causal DAGs are well-defined mathematical objects that can be interrogated using procedures such as d-separation; 16 as such, care must be taken to draw them correctly, otherwise the resulting conclusions will be unreliable. Further, although a unique causal DAG can never be determined from the data and hence subject-matter knowledge is crucial, some candidate causal DAGs may be incompatible with the data, and thus compatibility should be investigated. There exists some recent technical literature on the various possible causal interpretations of DAGs, 17,18 but one shared feature is that to be causal, any common (measured or unmeasured) cause of two or more variables in the diagram must itself be in the diagram.
See, for example, our Figure 1 for the nutrition-diabetes example. The most naive causal DAG would assume no common causes of any pair of $X_1$, $X_2$, $Y$ and, under this assumption, finding a causally interpretable statistical analysis of the data would be straightforward. More realistically, however, the nutrition-diabetes relationship will be confounded by a number of factors that we denote by $C$ [e.g. socioeconomic position, physical activity and body mass index (BMI)]. In most life course settings, many of these confounders themselves will change over time (hence $C_1$ and $C_2$ in the figure), and to make progress, reliable repeated measures on these confounders would be needed. In particular, note that we have allowed the later confounders $C_2$ to be affected by the earlier exposure $X_1$ (as would be expected, particularly with, say, physical activity and BMI). This dependence of $C_2$ on $X_1$ introduces a potentially problematic feature common in life course studies, namely time-dependent (or intermediate) confounding.
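As a rough illustration of what it means to interrogate a causal DAG formally, the Python sketch below checks d-separation using the standard construction of moralising the ancestral graph. The edge list is an assumption: the figure itself is not reproduced here, so the dictionary `parents` encodes one plausible reading of the description above ($C_1$ affecting everything later, $X_1$ affecting $C_2$, and an unmeasured `U` affecting both exposures). The helper names `ancestors`, `d_separated` and `without_out_edges` are hypothetical. Removing the arrows leaving an exposure and then testing d-separation from $Y$ is one way of checking whether a set of non-descendants of the exposure blocks every back-door path.

```python
def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen

def d_separated(parents, xs, ys, zs):
    """True if xs and ys are d-separated given zs in the DAG `parents`."""
    xs, ys, zs = set(xs), set(ys), set(zs)
    keep = ancestors(parents, xs | ys | zs)
    # Moralise the ancestral graph: link each node to its parents, and link
    # every pair of parents that share a child; then drop edge directions.
    nbrs = {v: set() for v in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for a in pa:
            for b in pa:
                if a != b:
                    nbrs[a].add(b)
    # Delete the conditioning set and look for any remaining path.
    seen, stack = set(), [x for x in xs if x not in zs]
    while stack:
        v = stack.pop()
        if v in ys:
            return False
        if v not in seen:
            seen.add(v)
            stack.extend(n for n in nbrs[v] if n not in zs)
    return True

def without_out_edges(parents, node):
    """Copy of the DAG with every arrow leaving `node` removed."""
    return {child: [p for p in pa if p != node] for child, pa in parents.items()}

# Assumed encoding of the DAG (child -> list of parents).
parents = {
    "C1": [], "U": [],
    "X1": ["C1", "U"],
    "C2": ["C1", "X1"],
    "X2": ["C1", "C2", "X1", "U"],
    "Y": ["C1", "C2", "X1", "X2"],
}
# The same DAG if U could be assumed absent.
no_u = {v: [p for p in pa if p != "U"] for v, pa in parents.items() if v != "U"}

x2_ok = d_separated(without_out_edges(parents, "X2"), ["X2"], ["Y"], ["X1", "C1", "C2"])
x1_ok_with_u = d_separated(without_out_edges(parents, "X1"), ["X1"], ["Y"], ["C1"])
x1_ok_no_u = d_separated(without_out_edges(no_u, "X1"), ["X1"], ["Y"], ["C1"])
```

In this assumed graph, adjusting for $X_1$, $C_1$ and $C_2$ blocks all back-door paths from $X_2$ to $Y$, whereas adjusting for $C_1$ alone blocks the back-door paths from $X_1$ only if `U` is absent.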
If Figure 1(b) were correct, standard regression methods could be used to make inferences about the total effect of $X_2$ (by adjusting for $X_1$, $C_1$, $C_2$), and also about effect modification by $X_1$ of the total effect of $X_2$ on $Y$. 19 In the unlikely event that we could additionally assume no unmeasured common causes $U$ of $X_1$ and $X_2$, then inference about the total effect of $X_1$ could also be made using standard methods upon adjustment for $C_1$. If, however, we wished to make inference about the joint effects of $X_1$, $X_2$, or the controlled direct effect of $X_1$ on $Y$ fixing $X_2$ to a particular value, or to learn about interaction between $X_1$ and $X_2$ in their effect on $Y$, then so-called g-methods that deal with time-dependent confounding would be needed. 3 Effect decomposition using natural direct and indirect effects would not be possible in the presence of $U$, and would anyway require additional strong parametric assumptions (due to the presence of intermediate confounding). 20,21 However, progress could be possible using interventional mediation effects, 12 which use the observed (rather than counterfactual) distribution of $X_2$ given $X_1$ and $C_1$; see a related discussion by VanderWeele and Robinson. 22
Figure 2 is an expanded version of Figure 1(b) specific to the nutrition-diabetes example. It highlights how controlling for adult physical activity and BMI ($C_2$) would remove part of the effect of childhood nutrition, the part that they mediate (thicker arrows in the DAG), whereas not controlling for adult physical activity and BMI would confound the effect of adult nutrition. Hence a traditional regression approach would not achieve the estimation of the joint effects of childhood and adult nutrition (including their interaction), nor of the controlled direct effect of childhood nutrition.
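The dilemma posed by time-dependent confounding can be illustrated with exact arithmetic. The Python sketch below uses the g-computation formula, one of the g-methods, in a deliberately simplified toy model: $C_1$ and $U$ are omitted, all variables are binary, and every probability is an assumed value chosen for illustration. The names `joint`, `prob`, `mean_y`, `g_formula`, `c2_adjusted` and `unadjusted` are hypothetical. By construction, the true mean of $Y(x_1, x_2)$ in this model is $0.09 + 0.15 x_1 + 0.10 x_2$.

```python
from itertools import product

def bern(p, value):
    """Probability that a Bernoulli(p) variable equals `value` (0 or 1)."""
    return p if value == 1 else 1 - p

NAMES = ("x1", "c2", "x2", "y")

# Assumed data-generating model: X1 -> C2 -> X2, with X1, C2 and X2 affecting Y.
joint = {
    (x1, c2, x2, y): bern(0.5, x1)
    * bern(0.2 + 0.5 * x1, c2)
    * bern(0.2 + 0.2 * x1 + 0.4 * c2, x2)
    * bern(0.05 + 0.05 * x1 + 0.20 * c2 + 0.10 * x2, y)
    for x1, c2, x2, y in product((0, 1), repeat=4)
}

def prob(**fixed):
    """Probability of the event described by the keyword arguments."""
    return sum(
        p for cell, p in joint.items()
        if all(cell[NAMES.index(k)] == v for k, v in fixed.items())
    )

def mean_y(**fixed):
    """E[Y | fixed], computed from the observational joint distribution."""
    return prob(y=1, **fixed) / prob(**fixed)

def g_formula(x1, x2):
    """sum_c2 E[Y | x1, c2, x2] P(c2 | x1): C2 is allowed to respond to x1."""
    return sum(
        mean_y(x1=x1, c2=c2, x2=x2) * prob(c2=c2, x1=x1) / prob(x1=x1)
        for c2 in (0, 1)
    )

def c2_adjusted(x1, x2):
    """Standard adjustment: average E[Y | x1, c2, x2] over the marginal P(c2)."""
    return sum(mean_y(x1=x1, c2=c2, x2=x2) * prob(c2=c2) for c2 in (0, 1))

def unadjusted(x1, x2):
    """E[Y | x1, x2], ignoring C2 altogether."""
    return mean_y(x1=x1, x2=x2)

joint_effect = g_formula(1, 1) - g_formula(0, 0)       # truth: 0.25
cde_g = g_formula(1, 1) - g_formula(0, 1)              # truth: 0.15
cde_adjusted = c2_adjusted(1, 1) - c2_adjusted(0, 1)   # misses the path via C2
x2_effect_unadjusted = unadjusted(0, 1) - unadjusted(0, 0)  # truth: 0.10
```

Under these assumptions the g-formula recovers the joint effect (0.25) and the controlled direct effect of $X_1$ (0.15). Adjusting for $C_2$ in the standard way gives 0.05 for the latter, because the part of the effect that runs through $C_2$ is removed, while ignoring $C_2$ gives a confounded estimate of about 0.16 for the effect of $X_2$ rather than 0.10.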

Missing data and measurement error
As well as confounding, two other challenges face those attempting to estimate causal effects in life course studies: the likely depletion of participation during the course of follow-up, with the resulting missing data; and measurement error, in particular when it affects a mediator in the analysis being considered. The first challenge is particularly relevant in studies of ageing, as also discussed by Ben-Shlomo et al., 1 where data missing due to death or other competing events present an especially difficult problem. 23 Measurement error is ubiquitous in observational epidemiology and is arguably of even greater concern when it affects a mediator, with induced biases in opposite directions for direct and indirect effects. 24,25 Inevitably, when faced with incomplete and/or mismeasured data, as with confounding, assumptions have to be made and then sensitivity to these assumptions assessed. Causal diagrams present an opportunity to represent these assumptions, for example by including missingness indicators (see for example Daniel et al. 26 and the extensions by Mohan, Pearl and others) 27 or measurement error mechanisms in the DAG. 28 The implications of different assumptions can then be formally assessed, and appropriate analyses, if they exist, identified.

Processes versus snapshots
We end our commentary by drawing attention to a recent important cautionary note on the use of causal DAGs, by Aalen et al., 29 particularly relevant to studies in life course epidemiology. Conceptual frameworks, e.g. Figure 4 in reference 1, rightly concern processes, usually latent, such as the continuous-time evolution of an individual's reproductive function or nutritional trajectory over the life course. However, when we draw causal DAGs that are to be useful for informing data analysis, we naturally focus on the 'snapshot' observations of this latent process that are available in our data. As long as the translation from process to snapshot is done knowingly and carefully, this is wise; however, if we treat the snapshots in our causal DAGs as if they actually represented the whole process, then mistakes can be made and conclusions adversely affected. For example, one reasonable hypothesis may be that the effect of nutrition on type II diabetes risk is entirely mediated by BMI. However, the mediation in this instance would be through the entire BMI process. If BMI is measured only on a relatively small number of occasions, then we would expect only a proportion of the effect of nutrition on type II diabetes risk to be mediated by the observed measures of BMI.
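The BMI example can be mimicked in a toy calculation. In the Python sketch below, the latent BMI process is crudely represented by just two time points, `ma` and `mb`, and the effect of nutrition `x` on the outcome is entirely mediated by them. This discretisation, the absence of confounding, all probabilities, and the names `joint`, `prob`, `mean_y` and `natural_direct_effect` are assumptions made for illustration. The natural direct effect is computed with the mediation formula, first treating both time points as the mediator and then treating only the first snapshot as the mediator.

```python
from itertools import product

def bern(p, value):
    """Probability that a Bernoulli(p) variable equals `value` (0 or 1)."""
    return p if value == 1 else 1 - p

NAMES = ("x", "ma", "mb", "y")

# Assumed model: X -> Ma -> Mb, X -> Mb, and Y depends on (Ma, Mb) only,
# so the effect of X on Y is fully mediated by the two-point BMI "process".
joint = {
    (x, ma, mb, y): bern(0.5, x)
    * bern(0.3 + 0.3 * x, ma)
    * bern(0.2 + 0.3 * x + 0.3 * ma, mb)
    * bern(0.05 + 0.10 * ma + 0.15 * mb, y)
    for x, ma, mb, y in product((0, 1), repeat=4)
}

def prob(**fixed):
    """Probability of the event described by the keyword arguments."""
    return sum(
        p for cell, p in joint.items()
        if all(cell[NAMES.index(k)] == v for k, v in fixed.items())
    )

def mean_y(**fixed):
    """E[Y | fixed]."""
    return prob(y=1, **fixed) / prob(**fixed)

def natural_direct_effect(mediators):
    """Mediation formula: sum_m {E[Y | X=1, m] - E[Y | X=0, m]} P(m | X=0)."""
    nde = 0.0
    for values in product((0, 1), repeat=len(mediators)):
        m = dict(zip(mediators, values))
        weight = prob(x=0, **m) / prob(x=0)
        nde += (mean_y(x=1, **m) - mean_y(x=0, **m)) * weight
    return nde

total_effect = mean_y(x=1) - mean_y(x=0)
nde_full_process = natural_direct_effect(("ma", "mb"))  # both time points
nde_snapshot = natural_direct_effect(("ma",))           # first snapshot only
share_mediated_by_snapshot = 1 - nde_snapshot / total_effect
```

With the whole (two-point) process as mediator, the direct effect is zero, as constructed. With only the first snapshot, a direct effect of 0.045 appears and just under half of the total effect of 0.0885 is attributed to the observed mediator, even though the truth in this toy model is complete mediation.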

Summary
In this commentary we have highlighted concepts and methods from the field of causal inference, many of them recent contributions, which we believe to be relevant in life course studies. We have discussed how the language of potential outcomes can help to articulate the precise question(s) being asked, and how causal DAGs, as distinct from conceptual frameworks, should be carefully drawn and interrogated, with missing data and measurement error mechanisms included.