Stochastic Mechanistic Interaction

We propose a fully probabilistic formulation of the notion of mechanistic interaction (interaction in some fundamental mechanistic sense) between the effects of putative (possibly continuous) causal factors A and B on a binary outcome variable Y indicating 'survival' vs 'failure'. We define mechanistic interaction in terms of departure from a generalized 'noisy OR' model, under which the multiplicative causal effect of A (resp., B) on the probability of failure cannot be enhanced by manipulating B (resp., A). We present conditions under which mechanistic interaction in the above sense can be assessed via simple tests on excess risk or superadditivity, in a possibly retrospective regime of observation. These conditions are defined in terms of generalized conditional independence relationships (generalised because they may involve non-stochastic 'regime indicators') that can often be checked on a graphical representation of the problem. Inference about mechanistic interaction between direct, or path-specific, causal effects can be accommodated in the proposed framework. The method is illustrated with the aid of a study in experimental psychology.


Introduction
Consider an outcome variable Y that responds to manipulations of two variables, A and B. We are interested in whether the effects of A and B on Y interact in some fundamental mechanistic sense. For example, we might be interested in whether an environmental exposure A interferes with the effect of a drug B on a disease Y at some mechanistic (presumably molecular) level. Such a relationship, which we shall make more formal in a later section of this paper, we call mechanistic interaction.
One might begin to investigate mechanistic interaction by fitting a regression model of the dependence of Y on (A, B) and then testing for presence of statistical A × B interaction, but such a test will depend on the chosen response scale, and will generally not be interpretable in any deep mechanistic sense. Hence the need for a mathematical formalization of mechanistic (as opposed to statistical) interaction, and of the conditions under which this phenomenon can be detected from empirical data via appropriate, response-scale independent, statistical tests. In many applications, discovery of mechanistic interaction could represent a step forward in the understanding of the studied system. In genetics, evidence of mechanistic interaction between two genes with respect to a phenotype of interest could point to the molecular mechanisms implicated (Bernardinelli et al., 2012).
Ideally we would wish to assess mechanistic interaction by a controlled experiment, but this is often not possible or not convenient. Various authors have proposed tests for inferring mechanistic interaction (suitably defined) from observational data (Rothman, 1976; Rothman & Greenland, 1998; Greenland & Poole, 1988; VanderWeele & Robins, 2008, 2009; VanderWeele, 2009, 2010, 2011; VanderWeele & Laird, 2011). Consider, for example, the case that A, B and Y are all binary variables, with Y = 1 referred to as "failure" and Y = 0 as "survival", and let C denote a further (possibly empty) set of observed variables. Let $R_{abc}$ denote the observational risk of failure, Y = 1, conditional on A = a, B = b, C = c. Then, in certain observational situations, and under certain conditions, the following properties (of which the first is stronger than the second) have been shown to imply some form of mechanistic interaction:

Excess risk: $$R_{11c} - R_{10c} - R_{01c} > 0 \qquad (1)$$

Superadditivity: $$R_{11c} - R_{10c} - R_{01c} + R_{00c} > 0 \qquad (2)$$

These can be alternatively expressed as:

Excess risk: $$S_{11c} - S_{10c} - S_{01c} + 1 < 0 \qquad (3)$$

Superadditivity: $$S_{11c} - S_{10c} - S_{01c} + S_{00c} < 0 \qquad (4)$$

where $S_{ijc} := 1 - R_{ijc}$ is the corresponding probability of survival (of Y = 0). An important property of the above tests is that they are (at least approximately, under assumptions) testable under retrospective sampling. More precisely, the above conditions give criteria for synergistic mechanistic interaction between A and B in producing Y, in that the combined effect (suitably measured) of increases in A and in B to increase Y is greater than expected on the basis of their individual effects. This is the interpretation we shall maintain here. The case of antagonistic mechanistic interaction, where the combined effect is smaller than expected, is readily handled by interchanging the values 0 and 1 for Y, and interchanging the Rs and Ss in equations (1)-(4) and elsewhere.
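As a numerical illustration of the two contrasts, the following minimal Python sketch evaluates them within a single stratum c, taking excess risk to mean R_11c − R_10c − R_01c > 0 and superadditivity to mean R_11c − R_10c − R_01c + R_00c > 0. The function names and the four risk values are hypothetical, chosen only for illustration, and are not taken from any study.

```python
def excess_risk_contrast(r11, r10, r01):
    """Excess-risk contrast R11 - R10 - R01 (positive => excess risk)."""
    return r11 - r10 - r01


def superadditivity_contrast(r11, r10, r01, r00):
    """Superadditivity contrast R11 - R10 - R01 + R00 (positive => superadditivity)."""
    return r11 - r10 - r01 + r00


# Hypothetical risks of failure in one stratum c (illustrative values only).
r11, r10, r01, r00 = 0.60, 0.25, 0.20, 0.10

er = excess_risk_contrast(r11, r10, r01)
sa = superadditivity_contrast(r11, r10, r01, r00)

# The same excess-risk contrast on the survival scale, with S = 1 - R.
s11, s10, s01 = 1 - r11, 1 - r10, 1 - r01
er_survival = s11 - s10 - s01 + 1  # negative exactly when er is positive

print("excess risk holds:", er > 0)
print("superadditivity holds:", sa > 0)
```

Since R_00c ≥ 0, a positive excess-risk contrast always entails a positive superadditivity contrast, which is the sense in which the first property is the stronger one.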
Most work to date on mechanistic interaction has been formulated assuming the potential outcome (PO) framework (Rubin, 1974) for causality or some essentially equivalent formulation, though the literature also offers some purely probabilistic approaches. The former category is exemplified by the stochastic PO approach of VanderWeele & Robins (2012); the latter is exemplified by previous work of the authors of this paper (Berzuini & Dawid, 2013) and by the recent paper of Ramsahai (2013). Section 8 discusses these approaches and their limitations. Aiming to overcome these limitations, we here supply a new definition of mechanistic interaction between the effects of possibly continuous causal factors A ∈ A and B ∈ B on a binary variable Y . Our definition is in terms of departure from a null "generalised noisy-OR" model (Pearl, 1988;Lemmer & Gossink, 2004), under which the multiplicative causal effect of A (resp., B) on the probability of survival (Y = 0) cannot be enhanced through manipulation of B (resp., A). Conditions are presented under which mechanistic interaction in the above sense can be assessed from observational data via simple tests on excess risk or superadditivity, under either a prospective or a retrospective sampling regime. The approach extends that of Berzuini & Dawid (2013) in not requiring any variable in the problem to depend on its causes in a deterministic way. In common with our previous work, the conditions for observational identifiability of mechanistic interaction are here expressed in terms of conditional independence relationships between problem variables, which can often be checked on a causal diagram representation of the problem. In § 5 we establish a connection between mechanistic interaction and mediation by discussing sufficient conditions for a valid test of mechanistic interaction under a regime where a mediator of the effects of interest is fixed by intervention. 
In this case, what is being assessed is a mechanistic interaction between direct effects or, given a causal diagram representation of the problem, mechanistic interaction between path-specific effects. Finally, our approach caters for the many applied situations, e.g. in epidemiology, where A and B arise as continuous variables. The methods are illustrated by means of a study in experimental psychology.

Assumptions and notation
We are interested in the way a response variable Y would react to real or hypothetical manipulations of causal factors A and B, and in particular whether or not the effects of A and B on Y can be regarded as interacting in some fundamental mechanistic sense. In order to address this, we must first understand what might be meant by "no mechanistic interaction". Here we suggest a possible explication of this concept. This however is not absolute, but relative to an appropriately chosen "context". That is, we specify certain context variables W in the problem, which might modify in some way the dependence of Y on (A, B), and only consider this dependence within a fixed context, i.e., conditional on fixed values W = w.
In contrast to the formulation of Berzuini & Dawid (2013), we do not require that Y be a deterministic function of (A, B, W ), but allow for a fully stochastic dependence of Y on these inputs. Note that this allows considerable freedom in the selection of the context variables. Indeed, even in those rare cases when there does exist a choice for W supporting a deterministic relationship, this might be regarded as inhabiting too deep a level of description to be useful for the purpose at hand, and a more coarse-grained choice, yielding a genuinely stochastic relationship, could be more appropriate. In any given application, care must be taken to ensure that we are arguing at a suitable level of granularity. As an analogy, for most purposes it is appropriate to think of the determination of the sex of an embryo as governed by a random process (essentially a fair coin toss), even though a very detailed description of the positions, motions, properties and behaviours of the gametes prior to fertilisation might allow deterministic prediction.
Our definition of "no mechanistic interaction" will relate to a (possibly hypothetical) "interventional regime", in which the values of A and B are set by some external agent or process. However, the data available to investigate this property will generally have been generated under some other, typically purely observational, regime, where, in particular, the values of A and B have arisen in some uncontrolled stochastic way. We will need to be able to relate these regimes in order to transfer information from one to the other. To streamline this task we introduce the regime indicator $\sigma_{AB}$, a non-stochastic variable, where $\sigma_{AB} = ab$ indicates the interventional regime where A is set to a and B to b, and $\sigma_{AB} = \emptyset$ the observational regime. More generally, $\sigma_X$ will denote a similar regime indicator for interventions on a set X of variables of interest.
We introduce the symbol

$$\pi_w(a, b) := P(Y = 0 \mid W = w, \sigma_{AB} = ab) \qquad (5)$$

for the probability of survival (Y = 0), in context W = w, when A and B are set to respective values a and b by an exogenous intervention. Then one way (which we shall henceforth adopt) of understanding the effect of (A, B) on Y is by considering the dependence of $\pi_w(a, b)$ on (a, b). In particular, the effect of changing the values set for (A, B) from (a′, b′) to (a, b) is measured by the relative survival probability (RSP):

$$\frac{\pi_w(a, b)}{\pi_w(a', b')}.$$
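The relative survival probability can be made concrete with a short Python sketch. The survival surface `pi_w` below is a hypothetical choice for one fixed context w (it is not estimated from data and is not part of the paper), and `relative_survival` is an illustrative helper name.

```python
import math


def relative_survival(pi, a, b, a_ref, b_ref):
    """RSP for moving (A, B) from (a_ref, b_ref) to (a, b) within one context w."""
    baseline = pi(a_ref, b_ref)
    if baseline <= 0.0:
        raise ValueError("RSP is undefined when the reference survival probability is 0")
    return pi(a, b) / baseline


def pi_w(a, b):
    # Hypothetical interventional survival probability in a fixed context w.
    # It is non-increasing in a and in b, i.e. both factors promote failure.
    return math.exp(-0.3 * a - 0.5 * b)


# Effect of raising A from 1 to 2 while B is held at 1.
rsp = relative_survival(pi_w, a=2.0, b=1.0, a_ref=1.0, b_ref=1.0)
print(round(rsp, 4))  # 0.7408
```

An RSP below 1 indicates that the manipulation lowers the probability of survival; the smaller the RSP, the stronger the failure-inducing effect.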

Structural conditions
We shall impose the following structural conditions:

Condition 1
The causal factors A and B are continuous or ordered categorical scalar random variables.

Condition 2
The effects of A and B on Y are positive: for any fixed (b, w) (resp., (a, w)), $P(Y = 1 \mid W = w, \sigma_{AB} = ab)$ is a non-decreasing function of a (resp., b).
An alternative expression of Condition 2 is that $\pi_w(a, b)$, given by (5), is, for each w, a non-increasing function of each of a and b.

No mechanistic interaction
We henceforth make the structural assumptions of the previous section.

Point null hypothesis
One possible way of expressing the concept of no mechanistic interaction between A and B in producing Y is that, for all w, we can express $\pi_w(a, b)$ in the product form

$$\pi_w(a, b) = \lambda_w(a)\,\mu_w(b) \qquad (6)$$

for all $a \in \mathcal{A}$, $b \in \mathcal{B}$. We term this the point null hypothesis.
For binary A and B with $\pi_w(0, 0) = 1$, imposing the further requirements $\pi_w(1, 0) = \pi_w(0, 1) = 0$ would now imply $\pi_w(1, 1) = 0$, and this constellation of values represents Y as the Boolean expression A OR B.
Intuitively, the point null hypothesis asserts that manipulation of causal factor A (resp., B) is an uncertain cause of Y (it can produce the failure, but it will not always do so), and its effect on Y will never depend on the value imposed on the remaining factor, B (resp., A). For example, in a bowling club, imagine Player 1 being assigned a ball of size A = a and Player 2 a ball of size B = b, and then each player being asked to try to knock his respective pin down. Let the event Y = 1 indicate that at least one of the two players knocks the pin over, and Y = 0 indicate instead that both pins "survive" the throw. Think of π w (a, b) as representing the probability of Y = 0 with assigned ball sizes (a, b) with W = w indicating specific circumstances such as air humidity, temperature. Then the point null hypothesis, as expressed by Equation (6), asserts that the probability of Y = 0 is the product of the probability λ w (a) that Player 1 fails to knock the pin down, and the probability µ w (b) that Player 2 also fails. Which is to say that the performance of one player is not affected by the size of the ball given to the other player. Equation (8) makes the following implication explicit: giving one player a larger ball does not change the effect on the outcome one obtains by giving a larger ball to the other player.
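The bowling illustration can be checked numerically. The sketch below assumes hypothetical miss probabilities `lam` and `mu` for the two players (their functional forms are invented for illustration) and verifies that, under the product form, the relative survival probability associated with enlarging Player 1's ball is the same whatever ball Player 2 receives.

```python
import math


def lam(a):
    # Hypothetical probability that Player 1 misses with ball size a.
    return math.exp(-0.4 * a)


def mu(b):
    # Hypothetical probability that Player 2 misses with ball size b.
    return 1.0 / (1.0 + b)


def pi(a, b):
    # Product form of the point null: both pins survive only if both players miss.
    return lam(a) * mu(b)


sizes = [0.0, 1.0, 2.0, 3.0]

# RSP for enlarging Player 1's ball from size 1 to size 2,
# evaluated at each possible size of Player 2's ball.
rsp_by_b = {b: pi(2.0, b) / pi(1.0, b) for b in sizes}

for b, rsp in rsp_by_b.items():
    print(f"b = {b}: RSP = {rsp:.6f}")
```

Every line of output shows the same ratio, lam(2)/lam(1), because the factor mu(b) cancels; this cancellation is what the point null hypothesis asserts.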

Interval null hypothesis
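Taking into account that we are only interested in synergistic (as opposed to antagonistic) interaction, we can weaken the above point null hypothesis, as expressed by (8) or (9), to yield the following interval null hypothesis: for all w and all a > a′, b > b′,

$$\pi_w(a, b)\,\pi_w(a', b') \ge \pi_w(a, b')\,\pi_w(a', b).$$

Equivalently, wherever the ratios are defined,

$$\frac{\pi_w(a, b)}{\pi_w(a', b)} \ge \frac{\pi_w(a, b')}{\pi_w(a', b')}.$$

Informally: intervening to increase B (resp., A) will not enhance the failure-inducing effect of an interventional increase in A (resp., B).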

Mechanistic interaction
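Definition 3.1 We say that A and B interact mechanistically (interfere) to produce Y if the interval null hypothesis fails, that is, if there exist a context value w and values a > a′, b > b′ such that

$$\pi_w(a, b)\,\pi_w(a', b') < \pi_w(a, b')\,\pi_w(a', b). \qquad (15)$$

When this holds, we may write A * B, or (to make explicit that the property is relative to the specified context variable W), A * B [W]. Note that, unlike statistical interaction, the property of mechanistic interaction is unaffected by monotonic increasing transformations of A and B.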
Because we focus on synergy, we have defined "interference to produce"; we could similarly define its antagonism counterpart, "interference to prevent".
If Inequality (15) holds then clearly $\pi_w(a', b) > 0$, $\pi_w(a, b') > 0$. Also $\pi_w(a', b') > 0$ for, if it were not so, all terms of the inequality would be 0 by virtue of Condition 2. Thus Definition 3.1 applies just when there exist w, a > a′, b > b′ such that

$$\frac{\pi_w(a, b)}{\pi_w(a', b)} < \frac{\pi_w(a, b')}{\pi_w(a', b')}. \qquad (16)$$

Inequality (15) represents a stochastic extension of the deterministic mechanistic interaction concept of Berzuini & Dawid (2013). Under such deterministic dependence of Y on (A, B, W), each term in (15) can only take values 0 or 1. Together with Condition 2, this implies:

$$0 = \pi_w(a, b) < \pi_w(a', b) = \pi_w(a, b') = \pi_w(a', b') = 1.$$

The above inequality says that there are values b, b′ ∈ B such that, in some context W = w, when we set B = b a manipulation of A from a′ to a causes Y to change from 0 to 1; whereas, in the same context, when we set B = b′, the same manipulation makes no difference to Y. In other words, whenever Y is deterministic, presence of mechanistic interaction in our formulation is characterized by the possibility of preventing the effect of a certain manipulation of A by acting on B; and vice versa. If further A and B are binary, then (a, a′, b, b′) = (1, 0, 1, 0), and the definition says that A and B interact mechanistically in producing Y = 1 when there exists a value w of the context variable W such that the dependence of Y on (A, B) obeys the Boolean conjunction law: Y = A ∧ B.
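The definition lends itself to a direct numerical check over a finite grid of levels. The following Python sketch searches, within one fixed context w, for levels a > a′ and b > b′ at which the cross-product π_w(a, b) π_w(a′, b′) falls strictly below π_w(a, b′) π_w(a′, b). The helper name `find_interaction` and the three survival functions are hypothetical illustrations: a deterministic conjunction, a deterministic disjunction, and a noisy-OR product form.

```python
from itertools import combinations


def find_interaction(pi, a_levels, b_levels, tol=1e-12):
    """Return the first (a, a_ref, b, b_ref), with a > a_ref and b > b_ref, at which
    pi(a, b) * pi(a_ref, b_ref) < pi(a, b_ref) * pi(a_ref, b); None if there is none."""
    for a_ref, a in combinations(sorted(a_levels), 2):
        for b_ref, b in combinations(sorted(b_levels), 2):
            if pi(a, b) * pi(a_ref, b_ref) < pi(a, b_ref) * pi(a_ref, b) - tol:
                return (a, a_ref, b, b_ref)
    return None


def pi_and(a, b):
    # Deterministic Y = A AND B: survival unless both factors are present.
    return 0.0 if (a and b) else 1.0


def pi_or(a, b):
    # Deterministic Y = A OR B: survival only if both factors are absent.
    return 0.0 if (a or b) else 1.0


def pi_noisy_or(a, b):
    # Hypothetical noisy-OR: independent survival factors 0.7 (for A) and 0.5 (for B).
    return (0.7 ** a) * (0.5 ** b)


levels = [0, 1]
witness_and = find_interaction(pi_and, levels, levels)
witness_or = find_interaction(pi_or, levels, levels)
witness_noisy = find_interaction(pi_noisy_or, levels, levels)

print("AND:", witness_and)
print("OR:", witness_or)
print("noisy-OR:", witness_noisy)
```

Only the conjunction yields a witness, namely (a, a′, b, b′) = (1, 0, 1, 0); the disjunction and the noisy-OR model satisfy the null. A complete check would repeat the search for every context value w.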

Observational identification of mechanistic interaction
We now consider how we might use observational data to assess the presence or absence of mechanistic interaction between the effects of A and B on Y. We shall do this by means of a set C ⊆ W of observed context variables; the remaining variables U = W \ C may be observed or unobserved.
We shall need to consider, in addition to the structural conditions of § 2.1, some causal conditions, relating the behaviours under observational and interventional circumstances. These we express as follows, where we have used the symbol ⊥⊥ for "conditionally independent of" (Dawid, 1979, 2002).

Condition 3
$W \perp\!\!\!\perp (A, B, \sigma_{AB})$.

Condition 4
$U \perp\!\!\!\perp (A, B, \sigma_{AB}) \mid C$.

Condition 5
$Y \perp\!\!\!\perp \sigma_{AB} \mid (A, B, W)$.

Finally, we shall sometimes require observational independence between A and B, conditional on C:

Condition 6
$A \perp\!\!\!\perp B \mid (C, \sigma_{AB})$.

Condition 3 says that the context variables W should be independent of the values of (A, B), as well as of $\sigma_{AB}$, i.e., of whether those values arose naturally or by external intervention: this condition may be described as exogeneity. It is helpful for interpretive purposes, but is not required for the mathematical development; indeed, in § 5 we shall find it fruitful to weaken it. Condition 4 says that, conditional on C, the distribution of U is fixed: the same under both interventional and observational conditions and, further, independent of the values of A and B.
Condition 5 requires that the effects of A and B on Y be "unconfounded", conditional on the context variables W. Together with Condition 3, this defines W as a "sufficient covariate" for the effect of (A, B) on Y (Guo & Dawid, 2010). Condition 6 holds trivially for an interventional regime $\sigma_{AB} = ab$, so only has force for the observational regime $\sigma_{AB} = \emptyset$. It is a strong condition, but in certain circumstances can be avoided: see Corollary 1 below. Condition 4 is implied by Condition 3, and we shall want to retain this weaker condition.

Causal diagrams
It will sometimes be possible to use an influence diagram (ID) to represent the assumed causal and conditional independence relationships between the problem variables and the regimes (Dawid, 2002). This extension of the methodology of directed acyclic graphs (Cowell et al., 1999) incorporates the relevant regime indicators as decision nodes, sending arrows into the variables they refer to. Figure 1, for example, might represent the effects of genetic variants A and B on myocardial infarction Y say, possibly mediated by obesity M, with Z representing observed information about diet, social status, etc. In the diagram, Z is assumed to be sufficient (albeit not minimal sufficient) for the effect of A and B on Y. The diagram contains the regime indicator $\sigma_{AB}$ (it also contains another regime indicator, $\sigma_M$, which will be considered further in § 6 below). Our causal conditions, expressing conditional independence relationships between the stochastic variables and the non-stochastic indicators, can be represented and checked using an ID, with the aid of a graphical criterion such as d-separation (Geiger et al., 1990) or its moralisation equivalent (Lauritzen et al., 1990). For a problem that can be modelled by the graph of Figure 1, causal conditions 3-6 follow by application of the moralisation criterion to the graph if we choose C to be empty and W ≡ (Z, U).
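The moralisation criterion is easy to mechanise. The Python sketch below implements it for a small directed acyclic graph given as a parent list: restrict to the ancestral set of the variables involved, marry parents and drop directions, delete the conditioning set, and test connectivity. The graph used here is a hypothetical influence diagram loosely inspired by the verbal description of Figure 1 (the figure itself is not reproduced, so the edge set is an assumption), and the function names are illustrative.

```python
from itertools import combinations


def ancestral_set(parents, nodes):
    """Smallest set containing `nodes` and closed under taking parents."""
    seen, stack = set(), list(nodes)
    while stack:
        v = stack.pop()
        if v not in seen:
            seen.add(v)
            stack.extend(parents.get(v, ()))
    return seen


def separated(parents, xs, ys, given):
    """Moralisation criterion: True if xs and ys are separated by `given`."""
    xs, ys, given = set(xs), set(ys), set(given)
    keep = ancestral_set(parents, xs | ys | given)
    # Build the moral graph on the ancestral set: link each node to its parents
    # and "marry" every pair of parents sharing a child.
    nbrs = {v: set() for v in keep}
    for child in keep:
        pa = list(parents.get(child, ()))
        for p in pa:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(pa, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # Search outward from xs without passing through the conditioning set.
    frontier = list(xs - given)
    reached = set(frontier)
    while frontier:
        v = frontier.pop()
        for n in nbrs[v] - given:
            if n not in reached:
                reached.add(n)
                frontier.append(n)
    return not (reached & ys)


# Hypothetical diagram (an assumption, not the paper's Figure 1):
# regime nodes sigma_AB and sigma_M, mediator M, context variables Z and U.
parents = {
    "A": ["sigma_AB"],
    "B": ["sigma_AB"],
    "M": ["A", "B", "Z", "sigma_M"],
    "Y": ["A", "B", "M", "Z", "U"],
}

# Exogeneity-type check: (Z, U) independent of (A, B, sigma_AB).
print(separated(parents, {"Z", "U"}, {"A", "B", "sigma_AB"}, set()))
# Unconfoundedness-type check: Y independent of sigma_AB given (A, B, Z, U).
print(separated(parents, {"Y"}, {"sigma_AB"}, {"A", "B", "Z", "U"}))
# Conditioning on the mediator M links its parents A and B.
print(separated(parents, {"A"}, {"B"}, {"M", "sigma_AB"}))
```

In this hypothetical graph the first two checks succeed, while the third fails because conditioning on the common child M marries A and B in the moral graph.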

Main theorem
Our observational criterion for mechanistic interaction will involve a dichotomisation of the ranges of A and B, determined by respective "cutoff thresholds" $\tau_A$ and $\tau_B$. Let α be the indicator variable of "$A > \tau_A$", and β the indicator variable of "$B > \tau_B$". The symbol $R_{ijc}$ is henceforth reinterpreted as:

$$R_{ijc} := P(Y = 1 \mid \alpha = i, \beta = j, C = c, \sigma_{AB} = \emptyset) \qquad (18)$$

and likewise $S_{ijc} = 1 - R_{ijc}$. We reinterpret the inequalities (1)-(4) correspondingly. Note that $R_{ijc}$ is estimable from data on variables Y, A, B and C, gathered under the observational regime.

Figure 1: In an influence diagram such as this, variables may depend on their predecessors in the graph in a fully stochastic way. Regime nodes, here $\sigma_{AB}$ and $\sigma_M$, indicate whether the variables into which they send arrows are manipulated (interventional regime) or observed (observational regime).

Proof. We proceed by assuming both superadditivity and the interval null hypothesis, and deriving a contradiction. In the following, all probabilities and expectations are taken under the observational regime $\sigma_{AB} = \emptyset$.

Condition 7
The effect of A on Y is either positive, in the sense that $P(Y = 1 \mid W = w, \sigma_{AB} = ab)$ is a non-decreasing function of a for all (b, w); or negative, in the sense that $P(Y = 1 \mid W = w, \sigma_{AB} = ab)$ is a non-increasing function of a for all (b, w); and similarly with the rôles of A and B interchanged.
Corollary 2 Suppose that, in the statement of Theorem 4.1 or Corollary 1, we replace Condition 2 by the weaker Condition 7, and at the same time replace the superadditivity property (2) by the stronger excess risk property (1) (again reinterpreted in terms of definition (18)). Then the conclusion remains valid.
Proof. We use the same notation as in the proof of Theorem 4.1. Arguing similarly to that proof, we deduce that there exists a value $u^*$ of U such that $R^*_{11} - R^*_{10} - R^*_{01} > 0$. This implies both $S^*_{11} - S^*_{10} < 0$ and $S^*_{11} - S^*_{01} < 0$, as well as (19). (29), we deduce that the effect of A is positive. The rest of the proof now follows as before. ✷

Comment.
By allowing the dichotomization of A to be arbitrary, the above theorem fits the common situation where the continuous factor is made available in a dichotomized form, without the possibility of recovering the original continuous measurements.
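The dichotomised risks can be estimated by simple cell proportions. The Python sketch below assumes a list of records (a, b, y) drawn from a single stratum C = c under the observational regime; the data, the thresholds and the helper name `dichotomised_risks` are all hypothetical. It returns point estimates only and ignores sampling uncertainty, which a real test would have to account for.

```python
from collections import defaultdict


def dichotomised_risks(records, tau_a, tau_b):
    """Empirical risk of failure in each cell (i, j) = (A > tau_a, B > tau_b).

    `records` is an iterable of (a, b, y) with y in {0, 1}, all from one stratum."""
    counts = defaultdict(lambda: [0, 0])  # cell -> [failures, total]
    for a, b, y in records:
        cell = (int(a > tau_a), int(b > tau_b))
        counts[cell][0] += y
        counts[cell][1] += 1
    return {cell: fails / total for cell, (fails, total) in counts.items()}


# Hypothetical observational records: ten per cell, with 1, 2, 2 and 7 failures.
records = (
    [(0.2, 0.1, 1)] * 1 + [(0.2, 0.1, 0)] * 9
    + [(0.9, 0.1, 1)] * 2 + [(0.9, 0.1, 0)] * 8
    + [(0.2, 0.8, 1)] * 2 + [(0.2, 0.8, 0)] * 8
    + [(0.9, 0.8, 1)] * 7 + [(0.9, 0.8, 0)] * 3
)

R = dichotomised_risks(records, tau_a=0.5, tau_b=0.5)

excess_risk = R[(1, 1)] - R[(1, 0)] - R[(0, 1)]
superadditivity = R[(1, 1)] - R[(1, 0)] - R[(0, 1)] + R[(0, 0)]

print("estimated risks:", R)
print("excess-risk contrast:", round(excess_risk, 3))
print("superadditivity contrast:", round(superadditivity, 3))
```

Because only the indicators of "A above threshold" and "B above threshold" enter the computation, the same code applies when the factors are recorded in dichotomised form from the outset.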

Direct effects interaction
This section of the paper examines relationships between mechanistic interaction and mediation. Mediation analysis hinges on the concept of direct effect of a variable X on Y . One variant of this concept, the direct effect of X on Y controlling for F , is meant to quantify the sensitivity of Y to changes in X when F is held fixed by intervention, that is, when a hypothetical physical intervention changes the value of X from some reference value x to some value x ′ , while F is set to some constant (Pearl, 2005;Robins & Greenland, 1992).
A connection between our account of mechanistic interaction and mediation can be established by defining the concept of mechanistic interaction between A and B when a further variable, F, is set by intervention to a constant. Let Z := W \ F denote the unmanipulated context variables, and extend the notation (5) by writing

$$\pi^f_z(a, b) := P(Y = 0 \mid Z = z, \sigma_{ABF} = abf)$$

for the probability of Y = 0 given Z = z, conditional on A, B and F being manipulated to take on values a, b and f, respectively. Then take the direct effect of A on Y, in context Z = z, controlling for (B = b, F = f), to be measured in terms of relative survival probability by the quantity

$$\frac{\pi^f_z(a, b)}{\pi^f_z(a', b)}. \qquad (31)$$

Then a sensible way of defining "no direct (synergistic) mechanistic interaction" is to require that the act of setting B to a higher value can never enhance the direct effect of A, as measured by (31) with a > a′. This leads to the following generalization of our previous Definition 3.1 of mechanistic interaction: A and B interact mechanistically to produce Y, controlling for F = f, if there exist values a > a′, b > b′ and z of (A, B, Z), respectively, such that

$$\frac{\pi^f_z(a, b)}{\pi^f_z(a', b)} < \frac{\pi^f_z(a, b')}{\pi^f_z(a', b')}.$$

The following theorem holds.

Theorem 5.1 Suppose
(That is, conditional on (A, B, F, Z), the dependence of Y on F is not further affected by the way the value of F has been generated, be it by mere observation or by intervention.)

Proof. In this case $\pi^f_z(a, b) = \pi_{w^*}(a, b)$ with $w^* = (z, f)$. ✷

Thus under Condition 8 we can use Theorem 4.1 and its Corollaries to investigate mechanistic interaction between direct effects. For such an application, the context variable W must be replaced by W* = (W, F). Since typically F will be affected by (A, B), Condition 3 will no longer be appropriate after such replacement; but this will not affect the mathematical development, so long as we can still assume the weaker Condition 4.

Examples
We now illustrate our framework with the aid of four examples. Example 1 Consider again the diagram of Figure 1. The diagram assumes Under the assumptions embodied in this graph, Condition 3-Condition 6 hold for W ≡ Z, C = ∅. So if we can assume the validity of Condition 1 and Condition 2 (or Condition 7), the "total" mechanistic interaction A * B [U] can be examined under an observational regime by testing for the excess risk condition (1), or for the superadditivity condition (2), as appropriate, with C empty.

Also, it can be checked that Conditions 4-6 and Condition
Because U ′ is unobserved, model (33) cannot be fitted to the data, nor its additive structure empirically tested. We thus integrate (33) with respect to U ′ . Under an observational regime the integration yields This integration corresponds to deleting node U ′ and the arrows it originates from Figure 2(a). The resulting "marginal" graph, shown in Figure 2(b), obeys the causal conditions for an observational test of A * B | M in any stratum C ≡ F ≡ M = c via excess risk or superadditivity, as appropriate. Now if we define then the excess risk (resp., superadditivity) condition for A * B | M = 0 under the causal assumptions of Figure 2(a) and the distributional assumptions of (33) can be expressed in the parametric form f (β) > 0 (resp., f (β) + 1 > 0).
Proceed by choosing appropriate distributions p(U′) and p(M | A, B, U′, σ_M = ∅) in such a way as to complete the specification of an observational probability distribution over the graphs of Figure 2.
Considerations of identifiability of the distribution parameters will often suggest a discrete p(U′), so that the overall distribution will have the form of a discrete mixture of distributions. Once this is done, Markov chain Monte Carlo methods can be used to generate samples from the posterior distribution of f(β), for a simulation-consistent estimate of the posterior probability of excess risk or superadditivity, as a basis for the test of A * B | M. Analysis will generally involve trying different choices of M = 0 and a cautious adjustment for test multiplicity. We shall conclude in favour of a "direct effects" A * B | M interaction when the data adequately support presence of excess risk or superadditivity (as appropriate) for at least one choice of M = 0. ✷

Example 4 Brader and colleagues (Valentino et al., 2008) study the reaction of public opinion to media stories about immigration. A sample of 354 white, non-Latino, adults is invited to read a mock newspaper story about the costs of immigration. All the participants are given the same story, except that in some stories the immigrants are reported to be latino, and in the remaining ones they are white. Previous analyses of Brader's data have been based on the causal assumptions of Figure 1, with A representing the participant's age, the binary variable B indicating whether the participant was randomized to a "latino" or to a "white" story, M being an approximately normal variable representing the participant's level of "anxiety about immigration", as measured at the end of the reading through a questionnaire, and Z representing observed socio-educational variables independent of age. Let the binary variable Y indicate the participant's response (yes or no) to the following question after the reading: "do you agree or disagree about the idea of sending Congress a letter of complaint about immigration policy?". A positive response is interpreted in terms of opposition to immigration.
Brader's own analysis of the data (and a similar analysis by Imai et al. (2013)) suggests that white opposition to immigration is greater when the story involves latinos, compared to stories about white immigrants, and that a substantial component of this effect is mediated by anxiety.
Under the assumptions of Figure 1, a test of A * B[U] along the lines of Example 1 addresses the question whether the total effects of age A and ethnic connotation B interact mechanistically to produce an opposition reaction in the reader. A positive answer to this question might be interpreted as saying that if we "allow the participant to become older", before we test him, we shall cause his opinion to be less sensitive to ethnic connotation.
In order to probe deeper into mechanism, we may wish to test for A * B | M [Z, U] along the lines of Example 1. This is a way of asking whether a young age and a latino connotation of the story would work in synergism to produce an opposition reaction if we were to fix the level of anxiety to some constant level. A positive answer to this question might suggest that the reason why ethnicity has a lower impact on the opinions of the old is not because the old react with less anxiety, but because they let their rational (as opposed to emotional) thinking play a bigger role in the shaping of their opinion. ✷

Causality and agency
The above example raises some issues of the interpretation of "causality" in our approach. According to our description so far, that concept has been closely tied to the possibility of making external interventions to set values for the "causal variables" A and B. This conception is in line with philosophical "agency" theories of causality (Price, 1991; Hausman, 1998; Woodward, 2003), which regard causes as handles for manipulating effects. However, such an anthropocentric manipulationist view is unnecessarily restrictive, and can hamper application of causal inference to numerous scientific disciplines that demand a more general notion of cause, not tied simply to what human agents can do.
In our application above, while the variable B = "stated ethnicity" was manipulable (and was manipulated), we cannot reasonably regard the variable A = "age of participant" as manipulable. We might however conceive of being able to observe an individual at various points of her life, and be interested in the way in which her age then might make a difference to her psychological response to certain media framing techniques. Psychologists have knowledge, theories and hypotheses about the role of age in the response process. They can, for example, make informed guesses about (and explain on the basis of psychological theories) the different outcome we might have observed had the individual been younger or older than she is (e.g., "young people tend to react with less anxiety"). Specific psychological mechanisms and reactions are associated with young age. We should not give up looking into them simply because the age variable falls outside the standard manipulability theory of causation.
As another example, in epidemiology it is often appropriate to consider, as a cause of a disease, a variable such as genotype, whose manipulation by human beings is not practically possible; and application of mechanistic interaction tests to investigations of epistasis or pharmacogenomics will require a broader conception of "intervention" than the agency approach typically supplies. Recent discussions of the topic (Woodward, 2013) have loosened the strict confines of the manipulationist theory, regarding as an "intervention" any appropriate (in a sense that has to be made clear) exogenous causal process, without any necessary connection with human action.

Related work
A recent paper by VanderWeele & Robins (2012) (hereafter VR) tackles mechanistic interaction via stochastic (rather than deterministic) POs. In the standard PO formulation, the value that Y would take in individual ω in response to an intervention that sets (A, B) to values (a, b) is regarded as a potential outcome, $Y_{ab}(\omega)$. Potential outcomes are fixed for each particular individual even before the treatment is applied, and unaffected by the particular regime in which the values of A and B are set. VR relax this by allowing each individual ω to be characterized by a stochastic potential outcome, $Y_{ab}(\omega)$, that varies in the individual according to a Bernoulli distribution with the expected value fixed by the intervention and by random circumstances, although these latter are assumed not to be affected by the treatment. Because of the latter constraint, it is not clear whether VR's approach, as currently formulated, copes with situations where a stochastic mediator of the effect of (A, B) on Y introduces intervention-dependent random variation (see footnote 1).

Footnote 1: We also note that in Rubin's standard PO formulation there is a value of the response for each individual and possible intervention, and such value is constant across all possible regimes, in the sense that it is not affected by the way the values of A and B are generated. In VR's approach, the response has its expected value fixed by the particular individual, set of circumstances and intervention. But conditional on this expected value, is the realized value of the response assumed to vary across regimes? In other words, is the observationally detected response identical to what I would have observed had I intervened? And, if so, are the regime-specific versions of the response assumed independent? We feel that the question matters to the very purpose of carrying inferences from the observational to other regimes. These considerations are related to certain ambiguities of counterfactual-based formulations of causality (Dawid, 2000).

Ramsahai (2013) gives a fully probabilistic account of mechanistic interaction, which boasts aspects of greater generality relative to ours, including freedom from monotonicity assumptions about the effects of A and B. There are also aspects of lesser generality: no attempt is made in Ramsahai's paper to examine the implications of the presence of continuous causal factors. It is therefore appropriate to proceed by comparing Ramsahai's method and ours in the special case where A and B are binary variables, with (a, a′, b, b′) ≡ (1, 0, 1, 0). In this special case, our condition for presence of mechanistic interaction, as expressed by (16), specializes to

$$\frac{\pi_w(1, 1)}{\pi_w(0, 1)} < \frac{\pi_w(1, 0)}{\pi_w(0, 0)}. \qquad (36)$$
In our approach, inequalities (37)-(38) are consistent with, but not sufficient for, the presence of mechanistic interaction. In fact, consistently with our concluding remarks of § 3.1, inequalities (37)-(38) do not imply (36). By contrast, in Ramsahai's approach, those inequalities are taken to define mechanistic interaction for binary variables. Hence Ramsahai's definition of mechanistic interaction is weaker than ours. The more exacting nature of our definition, combined with its allowance for continuity, explains why our approach requires stronger assumptions than Ramsahai's.
To elucidate the differences between the approaches, suppose that, in the bowling example, A (the first player's ball size) takes value 0 (the player has no ball to throw) or 1 (the player throws a ball). Interpret B analogously. It then seems reasonable, on the basis of physics and common sense, to assume that inequalities (37)-(38) hold in this example. In Ramsahai's formulation, this is sufficient to conclude in favour of mechanistic interaction between the effects of the throws of the two players, even before looking at the data, and even if the two players act independently. This appears to clash with our psychological notion of synergism. By contrast, in our formulation, (37)-(38) are not sufficient to conclude in favour of mechanistic interaction: we would need to find evidence that contradicts the idea of independent throws expressed by (14), which accords with our intuition.
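The role of independence in the bowling example can be illustrated numerically. The following is a minimal sketch under stated assumptions, not the paper's own computation: it assumes that π(a, b) denotes the probability of survival, that the condition of interest is the ratio inequality π(1, 1)/π(0, 1) < π(1, 0)/π(0, 0), and that each throw causes failure independently. The function names and the probabilities `p_a`, `p_b`, `p_0` and `extra` are hypothetical.

```python
def survival_independent(a, b, p_a=0.3, p_b=0.5, p_0=0.1):
    """Survival probability pi(a, b) when failure sources act independently.

    p_0 is a hypothetical background failure probability; p_a and p_b are
    hypothetical failure probabilities attached to each player's throw.
    """
    surv = 1.0 - p_0
    if a:
        surv *= 1.0 - p_a
    if b:
        surv *= 1.0 - p_b
    return surv


def survival_joint_action(a, b, extra=0.2):
    """As above, plus a hypothetical failure source active only when a = b = 1."""
    surv = survival_independent(a, b)
    if a and b:
        surv *= 1.0 - extra
    return surv


def ratio_gap(pi):
    """pi(1,0)/pi(0,0) - pi(1,1)/pi(0,1); strictly positive when the
    assumed ratio inequality holds."""
    return pi(1, 0) / pi(0, 0) - pi(1, 1) / pi(0, 1)


gap_independent = ratio_gap(survival_independent)
gap_joint = ratio_gap(survival_joint_action)
```

Under independent throws the two ratios coincide, so the gap is zero up to rounding and the sketch gives no indication of interaction, even though each throw raises the failure probability. Only the added joint-action term makes the inequality strict.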
To conclude, we note that our approach uses statistics (excess risk and superadditivity) that are often testable at negligible computational cost in prospective studies and, approximately and under assumptions, also in retrospective studies. By contrast, attention needs to be paid to the computational feasibility of Ramsahai's approach.
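To indicate how light the computation is in a prospective study with binary factors, here is a minimal sketch. It assumes the forms of the two statistics that are usual in the interaction literature, with R_ab the risk of failure at (A, B) = (a, b): superadditivity R11 - R10 - R01 + R00 > 0 and excess risk R11 - R10 - R01 > 0. These forms, the Wald statistic and the counts below are assumptions made for illustration and are not taken from the paper's data.

```python
import math

# Coefficients of R_ab in each contrast (assumed standard forms).
SUPERADDITIVITY = {(1, 1): 1, (1, 0): -1, (0, 1): -1, (0, 0): 1}
EXCESS_RISK = {(1, 1): 1, (1, 0): -1, (0, 1): -1, (0, 0): 0}


def contrast(risk, coef):
    """Linear contrast sum_ab coef[ab] * risk[ab]."""
    return sum(coef[ab] * risk[ab] for ab in coef)


def wald_z(failures, totals, coef):
    """Wald z-statistic for the contrast, treating the four exposure
    groups as independent binomial samples."""
    risk = {ab: failures[ab] / totals[ab] for ab in totals}
    var = sum(
        coef[ab] ** 2 * risk[ab] * (1.0 - risk[ab]) / totals[ab] for ab in coef
    )
    return contrast(risk, coef) / math.sqrt(var)


# Hypothetical prospective counts, keyed by (a, b).
failures = {(0, 0): 10, (0, 1): 20, (1, 0): 30, (1, 1): 90}
totals = {(0, 0): 200, (0, 1): 200, (1, 0): 200, (1, 1): 200}
risk = {ab: failures[ab] / totals[ab] for ab in totals}

superadd = contrast(risk, SUPERADDITIVITY)
excess = contrast(risk, EXCESS_RISK)
z_superadd = wald_z(failures, totals, SUPERADDITIVITY)
z_excess = wald_z(failures, totals, EXCESS_RISK)
```

A one-sided test would compare each z-statistic with a normal quantile. Whether a positive contrast can be read as mechanistic interaction depends on the conditional independence conditions discussed in the paper, which this sketch does not check.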

Discussion
Mechanistic interaction has often been tackled within a potential outcome framework (Rubin, 1974) or within an equivalent formulation of causality. We have discussed possible limitations of this approach. We have also discussed limitations of current approaches to mechanistic interaction which reject the potential outcome formulation in favour of the standard probability formalism. Motivated by these limitations, we have proposed a novel definition of the causal notion of mechanistic interaction, and presented sufficient conditions for its identification from observational data. Because these conditions are expressed in terms of conditional independence, they hold irrespective of particular parametric or distributional assumptions about the problem variables. A further advantage of our conditional independence formulation of the identifiability conditions is that they can be straightforwardly checked on a causal diagram of the problem, when one is available. We have illustrated this use of causal diagrams extensively.
We recommend the use of our conditions as a first step in an analysis of mechanistic interaction. In applications where our conditions do not directly support the desired test, one can attempt to make the mechanistic interaction of interest observationally identifiable by introducing appropriate distributional assumptions. We have discussed and illustrated this in Example 3.
Our theory provides conditions for testing for mechanistic interaction in (real or hypothetical) situations in which an intervention is exerted on variables (even posttreatment ones) different from the main factors A and B of interest. We have discussed the connection between this and the idea of mechanistic interaction between effects that flow along specific paths in a causal diagram representation of the problem.
Importantly, our method does not require the assumption that Y depends on its causal influences in a functional way. By relaxing such an assumption, our method gains applicability in a much wider range of situations, and confers more leeway on the researcher in the choice of the conditioning variables in the test.
Once the conditions for a test of the mechanistic interaction of interest have been found valid, the actual test involves simple (and well known) excess risk or superadditivity statistics. These tests are valid under prospective sampling and (under assumptions) retrospective sampling. In the latter case, a key assumption is that the response event of interest is rare under any possible configuration of the causal factors. In the context of retrospective case-control studies in epidemiology, this is the well-known rare disease assumption that typically motivates this kind of study.
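The role of the rarity assumption can be seen in a small numerical sketch. It relies on assumptions not stated in this section: that superadditivity is tested retrospectively through its relative form RR11 - RR10 - RR01 + 1 > 0 (obtained by dividing the risk-scale contrast by R00), and that the risk ratios are replaced by odds ratios, which are what case-control sampling makes estimable. The risk values and function names are hypothetical.

```python
def odds(p):
    """Odds corresponding to probability p."""
    return p / (1.0 - p)


def relative_contrast(risk, measure):
    """M11 - M10 - M01 + 1, where M_ab = measure(R_ab) / measure(R_00)."""
    base = measure(risk[0, 0])
    m = {ab: measure(r) / base for ab, r in risk.items()}
    return m[1, 1] - m[1, 0] - m[0, 1] + 1.0


def approximation_error(scale):
    """Absolute gap between the odds-ratio and risk-ratio contrasts when
    all (hypothetical) risks are multiplied by `scale`."""
    base_risk = {(0, 0): 0.05, (0, 1): 0.10, (1, 0): 0.15, (1, 1): 0.45}
    risk = {ab: r * scale for ab, r in base_risk.items()}
    rr_contrast = relative_contrast(risk, lambda p: p)
    or_contrast = relative_contrast(risk, odds)
    return abs(or_contrast - rr_contrast)


error_common = approximation_error(1.0)   # outcome not rare
error_rare = approximation_error(0.01)    # outcome rare in every cell
```

Scaling all risks leaves the risk-ratio contrast unchanged, so the comparison isolates the effect of rarity: the odds-ratio version drifts far from it when the outcome is common and approaches it as the outcome becomes rare in every configuration of the factors.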
Finally, our approach embraces the very large class of applications where the main causal factors, A and B, are only available as a discretized version of the fundamental variables, no longer available in their original continuous form.
Various possible enhancements of the method are envisaged, one of these being the extension of the theory to embrace higher-order mechanistic interactions. Equally important will be the application of the method in a variety of situations and disciplines, from genetic epidemiology (e.g., in the identification of gene-environment interactions) to experimental psychology. We hope that the proposed method will help researchers to identify, from data analysis, small sets of interactions that underlie mechanisms of scientific interest.