## Abstract

Meta-analyses of psychological interventions typically find a pooled effect of “psychological intervention” compared with usual care. This answers the research question, “Are psychological interventions in general effective?” In fact, psychological interventions are usually complex with several different components. The authors propose that mixed treatment comparison meta-analysis methods may be a valuable tool when exploring the efficacy of interventions with different components and combinations of components, as this allows one to answer the research question, “Are interventions with a particular component (or combination of components) effective?” The authors illustrate the methods using a meta-analysis of psychological interventions for patients with coronary heart disease for a variety of outcomes. The authors carried out systematic literature searches to update an earlier Cochrane review and classified components of interventions into 6 types: usual care, educational, behavioral, cognitive, relaxation, and support. Most interventions were a combination of these components. There was some evidence that psychological interventions were effective in reducing total cholesterol and standardized mean anxiety scores, that interventions with behavioral components were effective in reducing the odds of all-cause mortality and nonfatal myocardial infarction, and that interventions with behavioral and/or cognitive components were associated with reduced standardized mean depression scores.

Systematic review and meta-analysis are well-established methods for describing and summarizing a body of literature that compares interventions for a given patient population (1). Standard meta-analytical methods are typically restricted to comparisons of 2 interventions using direct, head-to-head evidence alone. So, for example, if we are interested in the *A* versus *B* comparison, we would include only studies that compare *A* versus *B* directly. Consequently, meta-analyses of psychological interventions typically group all such interventions together as the same “treatment” and make the pairwise comparison of “all psychological interventions” versus some comparator (often “usual care”) (2). This allows us to answer the research question, “*Are psychological interventions in general effective*?” However, psychological interventions are usually complex, consisting of several different components, and so in some sense no 2 interventions are exactly alike. If there is no systematic structure to the differences between interventions, then this heterogeneity can be incorporated by using random effects models (1). In fact, psychological interventions often have well-defined components that can readily be classified. For example, 1 intervention may be primarily cognitive, another primarily behavioral, while others may have both cognitive and behavioral components. Therefore, differences between interventions can be highly structured.

Meta-regression methods (3) allow the potential for modeling systematic structured differences between interventions. Mixed treatment comparison meta-analysis (4–7) is a special form of meta-regression that enables the simultaneous comparison of multiple interventions in a single analysis, allowing us to make indirect comparisons while fully respecting the randomized structure of the evidence. For example, if we have some studies comparing intervention *C* with intervention *A* and some studies comparing intervention *C* with intervention *B*, then we can indirectly compare interventions *A* and *B* via the common comparator treatment *C* using the relation $d_{AB} = d_{AC} - d_{BC}$, where *d* is the treatment effect for the indicated pairwise comparison. All 3 treatments *A*, *B*, and *C* can then be simultaneously compared and ranked according to a given outcome. In addition, we can summarize the uncertainty regarding which is the best intervention type with the probability that it is the most effective. This means that we can put together whole structured networks of evidence and can then answer questions such as, “*Which type of intervention has the greatest probability of being most effective*?” or “*Which combinations of components have the greatest probability of being most effective*?”
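As a minimal sketch of the indirect comparison above, the Python fragment below computes $d_{AB}$ from two direct estimates. The numerical values are hypothetical (not taken from the review), and the standard error assumes that the two direct estimates come from independent sets of trials, so that their variances add.

```python
import math

def indirect_comparison(d_ac, se_ac, d_bc, se_bc):
    """Return the indirect estimate d_AB = d_AC - d_BC and its standard error.

    Assumption: the A-vs-C and B-vs-C estimates are independent,
    so the variance of the difference is the sum of the variances.
    """
    d_ab = d_ac - d_bc
    se_ab = math.sqrt(se_ac ** 2 + se_bc ** 2)
    return d_ab, se_ab

# Hypothetical log-odds ratios and standard errors, for illustration only.
d_ab, se_ab = indirect_comparison(d_ac=-0.50, se_ac=0.20, d_bc=-0.20, se_bc=0.15)
lower, upper = d_ab - 1.96 * se_ab, d_ab + 1.96 * se_ab
print(f"d_AB = {d_ab:.2f} (95% interval: {lower:.2f}, {upper:.2f})")
```

The indirect estimate is less precise than either direct estimate, which is one reason for combining direct and indirect evidence in a single model.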

Applications of these methods have begun to appear in medical journals (8) and in medical decision-making (9, 10). However, there have been no applications, thus far, in the area of psychological interventions and, in particular, not for complex interventions. Our main aim was to develop a framework in which mixed treatment comparison methods could be used to explore the effects of different components of complex interventions. The methods are illustrated by using a Cochrane review of the effect of psychological interventions on coronary heart disease outcomes, that is, coronary heart disease-related morbidity and mortality (2), which is extended here from December 2001 until the end of December 2006.

We first describe the literature searches and the classification scheme for the interventions. We then describe the statistical models that we investigated. Results from the model are presented, followed by a discussion of the methods, some issues related to meta-analysis with continuous outcomes, in particular, when measured on different scales, and further modeling challenges in this area.

## MATERIALS AND METHODS

### Updated systematic literature search

Medline was searched from January 2002 to December 2006 for randomized, controlled trials of psychological interventions for adults with coronary heart disease. To ensure consistency with the earlier review, inclusion criteria stated that trials should use a parallel group design, with at least 6 months’ follow-up, and report at least 1 of the following outcomes: all-cause mortality, cardiac mortality, nonfatal myocardial infarction, total cholesterol, systolic or diastolic blood pressure, depression, or anxiety. Twenty-two studies were identified and added to the 34 studies included in the earlier review. From the earlier review, 1 study was omitted because it was not randomized, and another was omitted because it did not report the outcome of interest. Of the 22 new studies, we were able to extract useful information from only 19. This gave a total of 51 studies, 7 of which were 3-arm trials. Table 1 shows the number of studies reporting each outcome from the earlier review (2) and from our update; references to the included studies are listed separately as “References to Articles Used in Systematic Review” posted on the *Journal*’s website (http://aje.oxfordjournals.org/). Full details of the updated searches are available from the authors.

Table 1. Number of studies reporting each outcome, from the earlier Cochrane review and from the update.

| Outcome | Cochrane Review | Update |
| --- | --- | --- |
| All-cause mortality | 22 | 14 |
| Cardiac mortality | 11 | 4 |
| Nonfatal myocardial infarction | 18 | 4 |
| Total cholesterol | 9 | 5 |
| Systolic blood pressure | 5 | 4 |
| Diastolic blood pressure | 5 | 4 |
| Depression | 10 | 9 |
| Anxiety | 9 | 5 |

Rees et al. *Cochrane Database Syst Rev.* 2004;(2):CD002902 (2).

### Classification of interventions

Psychological interventions, over and above “usual care,” consisted of components that were classified into 5 groups: educational, behavioral, cognitive, relaxation, and psychosocial support. Educational (EDU) interventions were defined as those explicitly educating patients about health risks, cardiovascular health risks, and basic anatomy of the cardiovascular system. Behavioral (BEH) interventions focused on achieving change in behavioral domains relevant to coronary heart disease (e.g., smoking cessation courses, physical exercise training, food preparation classes, and nutritional counseling sessions). The main purpose of the cognitive (COG) interventions was to change patients' beliefs and perceptions about the factors that lead to coronary heart disease to help them manage their stress and adjust to their medical condition. Usually, the cognitive interventions were implemented by highly specialized clinical psychologists, psychotherapists, or psychiatrists. Relaxation (REL) interventions were focused on training patients in different relaxation techniques, such as yoga and breathing courses. Finally, psychosocial support (SUP) interventions included attempts to bring patients together to encourage practical and/or emotional support. The studies included in our review covered 19 of the 32 possible combinations of these components (Table 2). Table 2 also shows the number of trial arms with each type of intervention, in total and also for each outcome. A full breakdown of the components of interventions on each arm, including a definition of “usual care” used in each study, is given in Web Appendix Table 1. (This information is described in the first of 4 supplementary tables; each is referred to as “Web Appendix Table” in the text and is posted on the *Journal*’s website (http://aje.oxfordjournals.org/).)

Table 2. Number of trial arms with each type of intervention, in total and by outcome.

| Intervention | Total No. of Arms | All-Cause Mortality | Cardiac Mortality | Nonfatal Myocardial Infarction | Total Cholesterol | Systolic Blood Pressure | Diastolic Blood Pressure | Depression | Anxiety |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| Usual care only | 51 (7) | 36 (5) | 15 (2) | 22 (4) | 14 | 9 | 9 | 19 (3) | 14 |
| Educational | 3 (1) | 3 (1) | 1 | 1 | 1 | 1 | 1 | 1 (1) | 0 |
| Behavioral | 6 (2) | 6 (1) | 4 (1) | 5 (2) | 2 | 0 | 0 | 1 | 0 |
| Cognitive | 9 (5) | 7 (2) | 5 (3) | 6 (4) | 2 | 2 | 2 | 5 (1) | 3 |
| Support | 1 | 1 | 0 | 0 | 1 | 1 | 1 | 0 | 1 |
| Educational + behavioral | 3 | 2 | 1 | 2 | 0 | 1 | 1 | 2 | 1 |
| Educational + cognitive | 5 (4) | 5 (4) | 1 | 2 (1) | 1 | 1 | 1 | 4 | 2 |
| Educational + relaxation | 2 | 2 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| Behavioral + cognitive | 4 | 2 | 1 | 1 | 2 | 0 | 0 | 0 | 0 |
| Behavioral + relaxation | 1 | 1 | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| Cognitive + relaxation | 2 | 1 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| Cognitive + support | 1 | 1 | 0 | 1 | 0 | 0 | 0 | 0 | 0 |
| Educational + behavioral + cognitive | 2 | 1 | 1 | 1 | 0 | 0 | 0 | 1 | 0 |
| Educational + behavioral + relaxation | 3 (1) | 3 (1) | 0 | 3 (1) | 1 | 0 | 0 | 2 | 2 |
| Educational + behavioral + support | 1 | 2 | 1 | 0 | 1 | 1 | 1 | 0 | 0 |
| Educational + cognitive + relaxation | 2 (1) | 2 (1) | 1 | 1 | 0 | 0 | 0 | 0 | 0 |
| Behavioral + cognitive + relaxation | 1 | 0 | 0 | 0 | 0 | 0 | 0 | 1 | 1 |
| Behavioral + cognitive + support | 1 | 2 | 0 | 1 | 1 | 1 | 1 | 1 | 1 |
| Educational + behavioral + cognitive + relaxation | 2 | 0 | 0 | 0 | 1 | 0 | 0 | 0 | 1 |

Numbers in parentheses indicate the number of arms from 3-arm trials.

### Statistical methods

#### Binary outcome data.

Three binary outcome measures were included in our review: all-cause mortality, cardiac mortality, and nonfatal myocardial infarction. These can be summarized as binomial counts, $r_{j,k}$, out of the total number at risk, $n_{j,k}$, on intervention *k* in study *j*. These data provide information on the probability, $p_{j,k}$, of the outcome (risk of mortality, cardiac-specific mortality, and nonfatal myocardial infarction, respectively). We use a logistic regression model. Each study has a reference “baseline” intervention arm, $b_j$, with study-specific “baseline” log-odds of outcome, $\mu_j$, for intervention arm $b_j$. The log-odds ratio, $\delta_{j,k}$, of outcome for intervention *k*, relative to baseline, $b_j$, is assumed to come from a random effects model with mean log-odds ratio, $(d_k - d_{b_j})$, and between-study standard deviation, τ, where $d_k$ is the mean log-odds ratio of outcome for intervention *k* relative to usual care (so that $d_1 = 0$). This model can be written as follows:

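$$
\begin{aligned}
r_{j,k} &\sim \text{Binomial}(p_{j,k},\, n_{j,k}), \\
\text{logit}(p_{j,k}) &= \begin{cases} \mu_j, & k = b_j, \\ \mu_j + \delta_{j,k}, & k \neq b_j, \end{cases} \\
\delta_{j,k} &\sim \text{Normal}\big(d_k - d_{b_j},\, \tau^2\big).
\end{aligned}
\tag{1}
$$

Although studies will often share a common baseline intervention, *b*, so that $(d_k - d_{b_j})$ can be replaced by $(d_k - d_b)$ in equation 1, this is not a necessary requirement for the mixed treatment comparison methods (7). The model for the mean log-odds ratios for the interventions, $d_k$, is described below. It is through the intervention effect model that we can combine all of the studies comparing different types of intervention.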
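To make the structure of the binary-outcome model concrete, the following sketch simulates event counts for a single 2-arm study from the random effects logistic model described above. It is illustrative only: the function name and all parameter values are hypothetical, and it does not reproduce the authors' WinBUGS code.

```python
import math
import random

def inv_logit(x):
    """Inverse of the logit link: maps log-odds to a probability."""
    return 1.0 / (1.0 + math.exp(-x))

def simulate_study(mu_j, d_k, d_b, tau, n_base, n_trt, rng):
    """Simulate one 2-arm study under the random effects model.

    mu_j : baseline log-odds of the outcome in this study
    d_k, d_b : mean log-odds ratios (vs. usual care) of the
               intervention arm and the baseline arm
    tau : between-study standard deviation
    """
    delta = rng.gauss(d_k - d_b, tau)      # study-specific log-odds ratio
    p_base = inv_logit(mu_j)               # outcome probability, baseline arm
    p_trt = inv_logit(mu_j + delta)        # outcome probability, intervention arm
    r_base = sum(rng.random() < p_base for _ in range(n_base))
    r_trt = sum(rng.random() < p_trt for _ in range(n_trt))
    return r_base, r_trt, delta

rng = random.Random(1)  # fixed seed so the sketch is reproducible
# Hypothetical values: baseline arm is usual care, so d_b = 0.
r_base, r_trt, delta = simulate_study(
    mu_j=-2.0, d_k=-0.5, d_b=0.0, tau=0.2, n_base=200, n_trt=200, rng=rng
)
print(r_base, r_trt, round(delta, 3))
```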

#### Continuous outcome data measured on the same scale.

Three continuous outcomes measured on the same scale across studies were included in our review: total cholesterol, systolic blood pressure, and diastolic blood pressure. These can be summarized as mean change from baseline, $y_{j,k}$, assumed to have a normal likelihood with corresponding standard error, $SE_{j,k}$, for intervention *k* in study *j*. It was necessary, in some trials, to assume a correlation between pre- and posttrial measures in order to obtain these summaries (11). Following the earlier review, we assumed a correlation of 0.5 but also obtained results for a correlation of 0.7 as a sensitivity analysis. The data provide information on the mean outcome, $\theta_{j,k}$, that is, the mean change in total cholesterol, systolic blood pressure, and diastolic blood pressure, respectively. The model for the intervention effects is identical to that described above (equation 1), with the exception that it is on a natural rather than a logistic scale, giving a linear regression model:

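$$
\begin{aligned}
y_{j,k} &\sim \text{Normal}\big(\theta_{j,k},\, SE_{j,k}^2\big), \\
\theta_{j,k} &= \begin{cases} \mu_j, & k = b_j, \\ \mu_j + \delta_{j,k}, & k \neq b_j, \end{cases} \\
\delta_{j,k} &\sim \text{Normal}\big(d_k - d_{b_j},\, \tau^2\big),
\end{aligned}
\tag{2}
$$

where $\mu_j$ is the mean change from baseline in outcome for “baseline” intervention, $b_j$, in study *j*, and $\delta_{j,k}$ is the mean difference in change in outcome for intervention *k* relative to intervention $b_j$. $\delta_{j,k}$ is assumed to come from a random effects model with the mean of the mean differences equal to $(d_k - d_{b_j})$ and with between-study standard deviation, τ. The model for the mean intervention effects, $d_k$, is described below.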
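The role of the assumed pre/post correlation can be illustrated with the usual formula for the standard deviation of a change score. This is a sketch under that assumption, with hypothetical inputs; the exact procedure used in the review follows reference 11 and may differ in detail.

```python
import math

def change_sd(sd_pre, sd_post, corr):
    """SD of (post - pre), given the SDs at each time point and their correlation."""
    return math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2.0 * corr * sd_pre * sd_post)

def change_se(sd_pre, sd_post, n, corr=0.5):
    """Standard error of the mean change from baseline in an arm of size n."""
    return change_sd(sd_pre, sd_post, corr) / math.sqrt(n)

# Hypothetical arm: SD of 1.0 mmol/L before and after, 100 patients.
se_primary = change_se(1.0, 1.0, n=100, corr=0.5)      # primary assumption
se_sensitivity = change_se(1.0, 1.0, n=100, corr=0.7)  # sensitivity analysis
print(round(se_primary, 4), round(se_sensitivity, 4))
```

A higher assumed correlation gives a smaller standard error for the change score, so the choice between 0.5 and 0.7 affects the weight each study carries.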

#### Continuous outcome data measured on different scales.

For the depression and anxiety outcomes, there were a variety of different scales of measurement across studies. Again, these can be summarized as the mean change from baseline, $y_{j,k}$, assumed to have a normal likelihood with corresponding standard error, $SE_{j,k}$, for intervention *k* in study *j*. Again, the data provide information on the mean change in outcome, $\phi_{j,k}$ (mean depression or anxiety score, respectively). However, the $\phi_{j,k}$ will be measured on different scales in different studies. We standardize these using the pooled standard deviation across arms within each study, $\theta_{j,k} = \phi_{j,k}/SD_j$. The linear regression model is on these standardized means, $\theta_{j,k}$, and is identical to equation 2, although the interpretation of the parameters is different.

Here, $\mu_j$ is the change from baseline in the standardized mean outcome for “baseline” intervention, $b_j$, and $\delta_{j,k}$ is the standardized mean difference (SMD) in change in outcome for intervention *k*, relative to intervention $b_j$. $\delta_{j,k}$ is assumed to come from a random effects model with the mean of the SMDs equal to $(d_k - d_{b_j})$ and between-study standard deviation, τ. The model for the mean SMDs, $d_k$, is described below.
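The standardization step can be sketched as follows. The pooled standard deviation here uses the common degrees-of-freedom weighting, which is an assumption (the text does not spell out the weighting), and the data are hypothetical.

```python
import math

def pooled_sd(sds, ns):
    """Pooled SD across the arms of one study, weighted by (n - 1)."""
    numerator = sum((n - 1) * sd ** 2 for sd, n in zip(sds, ns))
    denominator = sum(n - 1 for n in ns)
    return math.sqrt(numerator / denominator)

def standardize(mean_changes, sds, ns):
    """Divide each arm's mean change by the study's pooled SD."""
    sd_j = pooled_sd(sds, ns)
    return [m / sd_j for m in mean_changes]

# Hypothetical 2-arm study on some depression scale:
# usual care changes by -1.0 points, intervention by -3.0 points.
theta = standardize(mean_changes=[-1.0, -3.0], sds=[4.0, 4.0], ns=[50, 50])
smd = theta[1] - theta[0]  # standardized mean difference, intervention vs. usual care
print(theta, smd)
```

Because each study is divided by its own pooled SD, results from different scales are placed on a common, unitless scale before the random effects model is applied.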

#### Models for intervention effects.

We explored 4 different models for the intervention effects.

Model 1 (single-effect model). In this model, all psychological interventions are grouped together as a single “treatment,” and all intervention effects are set equal, $d_k = d$. This is the model that was used in the original Cochrane review (2).

Model 2 (additive main effects model). In this model, there is a separate effect for each of the different components of an intervention. The total intervention effect, $d_k$, is a sum of the relevant component effects, $d_{EDU}$, $d_{BEH}$, $d_{COG}$, $d_{REL}$, and $d_{SUP}$, for a particular intervention, *k*. So for an intervention with behavioral and cognitive components, we have $d_k = d_{BEH} + d_{COG}$; for an intervention with educational, cognitive, and support components, we have $d_k = d_{EDU} + d_{COG} + d_{SUP}$, and so on. In general, this model can be written as follows:

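$$
d_k = d_{EDU} I_{EDU,k} + d_{BEH} I_{BEH,k} + d_{COG} I_{COG,k} + d_{REL} I_{REL,k} + d_{SUP} I_{SUP,k},
\tag{3}
$$

where, for example, $I_{EDU,k}$ is an indicator equal to 1 if intervention *k* contains an educational component and 0 otherwise.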

Model 3 (2-way interaction model). This is an extension of the main effects model with additional terms for the combination of each pair of components. This model allows interventions with particular pairs of components to have either a bigger (synergistic) or smaller (antagonistic) effect than would be expected from the sum of their effects alone. As an example, an intervention with educational and cognitive components would have $d_k = d_{EDU} + d_{COG} + d_{EDU \ast COG}$, and an intervention with behavioral, cognitive, and support components would have $d_k = d_{BEH} + d_{COG} + d_{SUP} + d_{BEH \ast COG} + d_{BEH \ast SUP} + d_{COG \ast SUP}$. In general, this model can be written as follows:

$$
d_k = \sum_{c} d_c I_{c,k} + \sum_{c < c'} d_{c \ast c'} I_{c,k} I_{c',k},
\tag{4}
$$

where *c* and *c′* range over the 5 components and $I_{c,k}$ equals 1 if intervention *k* contains component *c* and 0 otherwise.

Model 4 (full interaction model). In this model, each of the 32 possible different interventions has a different effect, $d_k = d_k$; that is, each different combination of components is considered a different intervention in its own right.
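The way models 2 and 3 build an intervention effect from its components can be sketched in a few lines. The component effects and the interaction term below are hypothetical values chosen for illustration, not estimates from the review.

```python
from itertools import combinations

COMPONENTS = ("EDU", "BEH", "COG", "REL", "SUP")

def effect_additive(components, main):
    """Model 2: sum of the main effects of the components present."""
    return sum(main[c] for c in components)

def effect_two_way(components, main, interaction):
    """Model 3: main effects plus a term for every pair of components present."""
    total = effect_additive(components, main)
    for pair in combinations(sorted(components), 2):
        total += interaction.get(pair, 0.0)  # pairs are keyed in alphabetical order
    return total

# Hypothetical component effects (e.g., log-odds ratios vs. usual care).
main = {"EDU": 0.1, "BEH": -0.5, "COG": -0.2, "REL": 0.0, "SUP": 0.05}
interaction = {("BEH", "COG"): -0.1}  # hypothetical synergy between BEH and COG

d_additive = effect_additive({"BEH", "COG"}, main)
d_two_way = effect_two_way({"BEH", "COG"}, main, interaction)

# Every subset of the 5 components, including the empty set (usual care only).
all_combinations = [
    combo for size in range(len(COMPONENTS) + 1)
    for combo in combinations(COMPONENTS, size)
]
print(d_additive, d_two_way, len(all_combinations))
```

Under model 1 every nonempty combination would share one effect, whereas under model 4 each combination would have its own free parameter rather than a sum of terms.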

Models 1–4 were compared by using the deviance information criterion (DIC) (12), which is the sum of a measure of goodness of fit (posterior mean deviance) and a measure of model complexity (effective number of parameters). Models with a smaller DIC are preferred; however, differences of less than 3 are not considered important (http://www.mrc-bsu.cam.ac.uk/bugs/winbugs/dicpage.shtml (12)). Note also that the power to estimate and detect evidence of interactions will be limited by the data available on the various combinations of components.
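The comparison rule can be sketched as follows, using the DIC values reported for the anxiety outcome in Table 3. The helper names are hypothetical, and the threshold of 3 is the one stated above.

```python
def dic(posterior_mean_deviance, effective_parameters):
    """DIC = goodness of fit (posterior mean deviance) + complexity (effective parameters)."""
    return posterior_mean_deviance + effective_parameters

def compare_dic(dic_by_model, threshold=3.0):
    """For each model, return its DIC excess over the best model and whether
    that excess is below the threshold (i.e., not an important difference)."""
    best = min(dic_by_model.values())
    return {
        model: (round(value - best, 1), (value - best) < threshold)
        for model, value in dic_by_model.items()
    }

# DIC values for the anxiety outcome, as reported in Table 3.
anxiety = {"model 1": 72.4, "model 2": 78.9, "model 3": 82.0, "model 4": 82.1}
for model, (excess, comparable) in compare_dic(anxiety).items():
    print(model, excess, "comparable to best" if comparable else "worse than best")
```

For anxiety, only model 1 falls within the threshold of the best DIC, which matches the conclusion drawn for that outcome in the Results.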

#### Implementation.

All models were fitted by using Bayesian inference computed with Markov chain Monte Carlo simulation in WinBUGS version 1.4.1 (http://www.mrc-bsu.cam.ac.uk/bugs/welcome.shtml (13)). All baseline and intervention effect parameters were given flat normal(0, 1000) priors, and the between-study standard deviation was given a flat uniform prior with an appropriately large range given the scale of measurement. Convergence was assessed by using the Brooks-Gelman-Rubin diagnostic (14) in WinBUGS, and in all cases a burn-in of at least 20,000 simulations was discarded. All results presented are based on a further sample of at least 40,000 simulations. WinBUGS programs can be downloaded from http://www.bris.ac.uk/cobm/research/mpes.
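As a rough illustration of what a convergence diagnostic of this kind measures, the sketch below computes the classic variance-ratio form of the Gelman-Rubin statistic for several chains. This is a simplified stand-in written for this example; the implementation in WinBUGS may differ in detail, and the chains here are simulated rather than taken from the analysis.

```python
import random
import statistics

def gelman_rubin(chains):
    """Potential scale reduction factor for equal-length chains.

    Values close to 1 suggest that the chains are sampling the same
    distribution; values well above 1 suggest a lack of convergence.
    """
    n = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    between = n * statistics.variance(chain_means)
    pooled = (n - 1) / n * within + between / n
    return (pooled / within) ** 0.5

rng = random.Random(2024)  # fixed seed for reproducibility
# Three chains drawn from the same distribution (well mixed).
mixed = [[rng.gauss(0.0, 1.0) for _ in range(2000)] for _ in range(3)]
# Three chains centered on different values (not converged).
stuck = [[rng.gauss(shift, 1.0) for _ in range(2000)] for shift in (0.0, 3.0, 6.0)]

r_mixed = gelman_rubin(mixed)
r_stuck = gelman_rubin(stuck)
print(round(r_mixed, 3), round(r_stuck, 3))
```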

#### Other summary measures.

The Markov chain Monte Carlo simulation framework of WinBUGS allows us to present summaries that are of key interest, such as the probability that a particular intervention is the most effective. This is calculated by recording the proportion of iterations in which a given intervention gave the greatest relative effect. We can also obtain relative effect estimates between 2 active interventions. For example, under the main effects model, model 2, the estimated relative effect between an intervention with behavioral and cognitive components compared with an intervention with an educational component is obtained by calculating the function, $d_{BEH} + d_{COG} - d_{EDU}$, for each iteration of the simulation.
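Both summaries can be computed directly from the simulated values, as in the sketch below. The draws are generated from hypothetical normal distributions purely for illustration (they are not posterior samples from the review), and "most effective" is taken to mean the most negative effect, as for the outcomes considered here.

```python
import random
import statistics

def prob_most_effective(draws):
    """Proportion of iterations in which each intervention has the most
    negative (most beneficial) effect. `draws` is a list of dicts,
    one per iteration, mapping intervention name to its effect."""
    counts = {name: 0 for name in draws[0]}
    for iteration in draws:
        best = min(iteration, key=iteration.get)
        counts[best] += 1
    return {name: count / len(draws) for name, count in counts.items()}

rng = random.Random(42)  # fixed seed for reproducibility
draws = []
for _ in range(5000):
    iteration = {
        "EDU": rng.gauss(0.3, 0.3),   # hypothetical effect of an educational component
        "BEH": rng.gauss(-0.6, 0.3),  # hypothetical effect of a behavioral component
        "COG": rng.gauss(0.0, 0.25),  # hypothetical effect of a cognitive component
    }
    iteration["UC"] = 0.0             # usual care is the reference
    draws.append(iteration)

probs = prob_most_effective(draws)

# Relative effect of (behavioral + cognitive) vs. educational, per iteration.
contrast = [d["BEH"] + d["COG"] - d["EDU"] for d in draws]
print(probs, round(statistics.fmean(contrast), 2))
```

Computing the contrast within each iteration, rather than from the summary estimates, carries the joint uncertainty in the component effects through to the relative effect.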

## RESULTS

Table 3 shows that, for all outcomes except cardiac mortality and anxiety, there is little to choose among models 1–4. This suggests that there is little evidence of synergy/attenuation, although the evidence structure here provides low power to detect such interactions (Table 2). Further examination of the cardiac mortality outcome shows a single study (15) that gave results that were inconsistent with the other evidence reporting this outcome. Removing this study shows that there is little to choose among the models for cardiac mortality also (Table 3). For the anxiety outcome, the best fitting model is model 1, where there is a single effect for any psychological intervention, with a DIC at least 6.5 lower than for the more complex models. For all outcomes, we report the results from the simplest model (model 1) and also from the main effects model (model 2), as the most easily interpretable alternative and of most practical interest. This assumes that the effect of the different components is additive. When interpreting the results from model 2, the reader should bear in mind that this analysis is effectively performing 5 different significance tests, and so ideas of significance should be adjusted accordingly (e.g., Bonferroni correction).

Table 3. Deviance information criterion for models 1–4, by outcome.

| Outcome | Model 1 (Single Effect) | Model 2 (Additive Main Effects) | Model 3 (2-Way Interaction) | Model 4 (Full Interaction) |
| --- | --- | --- | --- | --- |
| All-cause mortality | 361.1 | 360.6 | 360.9 | 362.9 |
| Cardiac mortality | 160.8 (147.6) | 161.2 (150.4) | 157.2 (151.7) | 157.4 (151.5) |
| Nonfatal myocardial infarction | 243.7 | 241.0 | 247.2 | 244.4 |
| Total cholesterol | −21.0 | −20.0 | −18.7 | −18.5 |
| Systolic blood pressure | 86.3 | 87.0 | 87.7 | 87.6 |
| Diastolic blood pressure | 71.0 | 70.7 | 70.5 | 70.5 |
| Depression | 121.9 | 123.5 | 121.6 | 123.2 |
| Anxiety | 72.4 | 78.9 | 82.0 | 82.1 |

The numbers in parentheses for cardiac mortality are obtained after omitting study 5 (Cowan et al. *Nurs Res*. 2001;50(2):68–76 (15)). Note that, in this example, there is little power to detect interaction effects (models 3 and 4).

Deviance information criterion: the sum of a measure of goodness of fit (posterior mean deviance) and a measure of model complexity (effective number of parameters).

### Primary outcomes

If we fit model 1, with a single effect for any psychological intervention, then we see an intervention effect only on the nonfatal myocardial infarction outcome (Table 4, model 1) with a posterior mean log-odds ratio of −0.35 (95% credible interval: −0.65, −0.10). The results from model 2, which has an additive effect for each component of an intervention, show that interventions with behavioral components have the strongest effects on all-cause mortality (Table 4, model 2) with a posterior mean log-odds ratio of −0.58 (95% credible interval: −1.13, −0.05) and on nonfatal myocardial infarction (Table 4, model 2) with a posterior mean log-odds ratio of −0.64 (95% credible interval: −1.13, −0.16). From model 2, interventions with a behavioral component had the highest probability of being the most effective for all-cause mortality (probability 0.61) (Table 5) and for cardiac mortality (probability 0.40) (Table 5), whereas interventions with a psychosocial support component had the highest probability of being the most effective for nonfatal myocardial infarction (probability 0.76) (Table 5).
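For readers more used to odds ratios, the log-odds ratios quoted above can be exponentiated, as in the sketch below. The interval endpoints transform exactly; the exponentiated posterior mean is only an approximate summary on the odds ratio scale (it is not the posterior mean of the odds ratio).

```python
import math

def to_odds_ratio(log_or, lower, upper):
    """Exponentiate a log-odds ratio and its interval, rounded to 2 decimals."""
    return tuple(round(math.exp(value), 2) for value in (log_or, lower, upper))

# Values quoted in the text (Table 4).
model1_mi = to_odds_ratio(-0.35, -0.65, -0.10)    # model 1, nonfatal myocardial infarction
beh_mortality = to_odds_ratio(-0.58, -1.13, -0.05)  # model 2, behavioral, all-cause mortality
beh_mi = to_odds_ratio(-0.64, -1.13, -0.16)       # model 2, behavioral, nonfatal MI
print(model1_mi, beh_mortality, beh_mi)
```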

Table 4. Posterior mean (95% credible interval) of the intervention effect under model 1 (single effect, $d$) and of each component effect under model 2 (additive main effects).

| Outcome | Summary | Model 1: $d$ | Model 2: $d_{EDU}$ | Model 2: $d_{BEH}$ | Model 2: $d_{COG}$ | Model 2: $d_{REL}$ | Model 2: $d_{SUP}$ |
| --- | --- | --- | --- | --- | --- | --- | --- |
| All-cause mortality | Log-odds ratio | −0.14 (−0.47, 0.15) | 0.29 (−0.27, 0.85) | −0.58 (−1.13, −0.05) | −0.01 (−0.52, 0.45) | −0.38 (−1.16, 0.37) | 0.21 (−0.66, 1.06) |
| Cardiac mortality | Log-odds ratio | −0.16 (−0.44, 0.07) | 0.27 (−0.46, 0.98) | −0.34 (−1.00, 0.30) | −0.33 (−0.83, 0.03) | 0.03 (−1.49, 1.53) | 0.10 (−1.12, 1.31) |
| Nonfatal myocardial infarction | Log-odds ratio | −0.35 (−0.65, −0.10) | −0.16 (−0.71, 0.34) | −0.64 (−1.13, −0.16) | −0.09 (−0.41, 0.28) | −0.005 (−0.61, 0.57) | −1.49 (−3.42, 0.19) |
| Total cholesterol, mmol/L | Mean difference | −0.32 (−0.50, −0.13) | −0.13 (−0.71, 0.42) | −0.14 (−0.60, 0.33) | −0.29 (−0.71, 0.13) | 0.49 (−0.23, 1.24) | −0.05 (−0.70, 0.61) |
| Systolic blood pressure, mm Hg | Mean difference | −1.21 (−4.24, 2.33) | −2.81 (−12.84, 7.18) | 5.53 (−8.61, 19.78) | −0.95 (−9.13, 7.80) | −0.07 (−17.54, 16.50) | −0.74 (−12.38, 11.63) |
| Diastolic blood pressure, mm Hg | Mean difference | −1.37 (−3.31, 0.62) | −3.77 (−10.42, 3.00) | 3.18 (−6.61, 12.48) | 0.89 (−4.87, 6.44) | −2.39 (−14.00, 9.43) | −0.85 (−8.74, 7.40) |
| Depression | SMD | −0.23 (−0.35, −0.11) | −0.01 (−0.24, 0.22) | −0.26 (−0.55, 0.02) | −0.24 (−0.42, −0.06) | 0.08 (−0.20, 0.34) | 0.57 (−0.07, 1.21) |
| Anxiety | SMD | −0.15 (−0.29, −0.04) | −0.19 (−0.49, 0.14) | −0.02 (−0.37, 0.34) | −0.12 (−0.37, 0.10) | 0.02 (−0.31, 0.34) | −0.04 (−0.39, 0.38) |

Abbreviations: SMD, standardized mean difference; subscript abbreviations: BEH, behavioral intervention; COG, cognitive intervention; EDU, educational intervention; REL, relaxation intervention; SUP, psychosocial support intervention.

Results are shown for the relevant summary measure for each of the outcome measures, assuming a correlation of 0.5 between pre- and postmeasures for the continuous outcomes.

Results presented for the cardiac mortality outcome omit study 5 (Cowan et al. *Nurs Res.* 2001;50(2):68–76 (15)).

Table 5. Probability of being most effective, by outcome.

| Outcome | Usual Care | Educational | Behavioral | Cognitive | Relaxation | Psychosocial Support |
| --- | --- | --- | --- | --- | --- | --- |
| All-cause mortality | 0.000 | 0.009 | 0.614 | 0.024 | 0.316 | 0.038 |
| Cardiac mortality | 0.000 | 0.022 | 0.398 | 0.207 | 0.239 | 0.135 |
| Nonfatal myocardial infarction | 0.000 | 0.010 | 0.226 | 0.001 | 0.007 | 0.756 |
| Total cholesterol | 0.000 | 0.194 | 0.212 | 0.420 | 0.012 | 0.162 |
| Systolic blood pressure | 0.004 | 0.325 | 0.063 | 0.195 | 0.226 | 0.187 |
| Diastolic blood pressure | 0.001 | 0.465 | 0.056 | 0.043 | 0.317 | 0.117 |
| Depression | 0.000 | 0.039 | 0.515 | 0.423 | 0.015 | 0.008 |
| Anxiety | 0.000 | 0.483 | 0.142 | 0.176 | 0.075 | 0.125 |

Results presented for the cardiac mortality outcome omit study 5 (Cowan et al. *Nurs Res.* 2001;50(2):68–76 (15)).

### Intermediate outcomes

There was no evidence of an effect of either psychological intervention in general or particular components of interventions on systolic or diastolic blood pressure (Table 4, models 1 and 2). There is evidence that psychological interventions, in general, lead to a reduction in mean total cholesterol (Table 4), with a mean difference of −0.32 (95% credible interval: −0.50, −0.13) mmol/L, but this could not be attributed to any single intervention type; interventions with a cognitive, behavioral, educational, or psychosocial support component each have some probability of being most effective (*P* = 0.42, 0.21, 0.19, and 0.16, respectively) (Table 5). However, interventions with a relaxation component do not appear to be effective for total cholesterol (*P* = 0.01) (Table 5). Interventions with an educational component had the highest probability of being most effective for systolic and diastolic blood pressure (*P* = 0.33 and 0.47, respectively) (Table 5).

### Psychological outcomes

There was evidence that psychological interventions reduced standardized mean depression and anxiety scores (Table 4, model 1), with an SMD of −0.23 (95% credible interval: −0.35, −0.11) for depression and an SMD of −0.15 (95% credible interval: −0.29, −0.04) for anxiety. There was some evidence that an intervention with cognitive and/or behavioral components was associated with a reduction in standardized mean depression scores, with a posterior mean SMD of −0.26 (95% credible interval: −0.55, 0.02) for behavioral and a posterior mean SMD of −0.24 (95% credible interval: −0.42, −0.06) for cognitive components, respectively. From model 2, interventions with a behavioral component had the highest probability of being most effective for depression (probability 0.52) (Table 5), whereas interventions with an educational component had the highest probability of being most effective for anxiety (probability 0.48) (Table 5).

## DISCUSSION

Standard methods for the meta-analysis of complex interventions typically “lump” together all intervention arms as the same “treatment” and can therefore only be used to answer the research question, “*Are psychological interventions in general effective*?” Although this may be of interest, it is hard to see how the results of such an analysis could inform decisions on implementation of interventions or inform the design of new interventions. In our review, there were 19 different types of intervention according to our classification scheme. If psychological interventions are, in general, effective, we are then left with the question, “*Which of these 19 in particular should we consider implementing*?” If there is a common comparator group across all trials, then standard methods supported by subgroup analyses may shed some light, albeit often with limited power. The mixed treatment comparison methods presented here allow us to investigate further whether interventions with particular components are more likely to be effective. In other words, we can pose the research questions, “*Which type of intervention has the greatest probability of being most effective*?” with our main effects model (model 2) or “*With which types or combinations of components do interventions have the greatest probability of being most effective*?” with the interaction models (models 3 and 4). For example, we found, as did the original Cochrane review (2), that psychological interventions were associated with a reduction in standardized mean depression scores. However, we additionally showed that there was some evidence that interventions with either behavioral or cognitive components were more likely to be effective than interventions without these components. For anxiety, we found, as did the original Cochrane review (2), that psychological interventions were associated with a reduction in the standardized mean anxiety score. However, we were also able to conclude that different intervention types did not have differential effects, which can only be gleaned from the mixed treatment comparison analyses that we have presented here.

The mixed treatment comparison approach allows us to make indirect comparisons. In this example, there were no trials of relaxation versus usual care, but there were trials on behavioral + relaxation interventions and trials on behavioral interventions, which provide an indirect estimate of the effect of relaxation compared with usual care, as long as we assume that there are no interactions between components. There is a limitation on how many parameters can be identified for a given data structure. In particular, as in this example, there may be little power to detect evidence of interaction effects, and attempts to fit interaction models may result in overfitting of the data. The unified mixed treatment comparison approach does, however, provide greater power than a standard pairwise approach with interaction tests based on subgroup analyses.
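The indirect comparison described above can be illustrated with a minimal sketch. It assumes an additive (no-interaction) model and two independent summary estimates, each measured against usual care. The function name and all numbers are hypothetical, and this simple two-estimate calculation stands in for, rather than reproduces, the joint Bayesian model fitted in this paper.

```python
import math

def indirect_component_effect(d_combined, se_combined, d_single, se_single):
    """Indirect estimate of an added component's effect versus usual care.

    Assumes additivity: effect(behavioral + relaxation) equals
    effect(behavioral) + effect(relaxation), with no interaction term.
    Assumes the two input estimates come from independent sets of trials,
    so their variances add.
    """
    d = d_combined - d_single
    se = math.sqrt(se_combined ** 2 + se_single ** 2)
    ci = (d - 1.96 * se, d + 1.96 * se)  # approximate 95% interval
    return d, se, ci

# Hypothetical inputs (not taken from the paper):
# behavioral + relaxation versus usual care, and behavioral versus usual care
d, se, ci = indirect_component_effect(-0.30, 0.10, -0.20, 0.08)
print(f"relaxation vs usual care: {d:.2f} (SE {se:.3f})")
```

The standard error of the indirect estimate is larger than either input, which is one reason indirect evidence on a single component tends to be imprecise.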

Suppose that every trial had included an arm with every distinct possible intervention. The key assumption made by mixed treatment comparison is that arms are missing at random. A further assumption for meta-analyses of complex interventions is that interventions have been defined in a similar way across trials. This is unlikely to be the case; in particular, we might expect heterogeneity in the definition of “usual care.” While standard procedures for the treatment of coronary heart disease patients exist in many countries (e.g., in the United Kingdom (16)), they may well vary between countries and possibly also vary in implementation between health-care regions/hospitals. The definition of “usual care” in this meta-analysis is heterogeneous across studies and even between institutions within studies (Web Appendix Table 1). However, there do not appear to be any systematic patterns in definition of “usual care” across types of interventions, so although we expect such heterogeneity to weaken the efficacies observed, we do not expect it to induce any systematic biases. Any implementation of interventions such as these will be subject to heterogeneity in usual care.

Trials of complex interventions very often report continuous outcome measures that bring with them specific technical challenges for meta-analysis (1). Continuous outcomes are usually measured pre- and postintervention, with interest being in the mean change from the “baseline” measurement. We have summarized the 2 measures using the mean and standard deviation of the change from baseline which, depending on how results are reported, requires an assumption about the correlation between pre- and postmeasures (11). We assumed in these cases that the correlation is 0.5; however, we found that our results were robust (Web Appendix Tables 2–4) to a higher assumed correlation of 0.7 that has been observed in blood pressure and cholesterol (17, 18). A more sophisticated approach would be to use trial arms that report both pre- and postmeasures to provide information on this correlation (19), which could be applied here. Another issue with continuous outcome measures is that they may be measured on different scales. For example, depression and anxiety scores were measured by using 9 and 3 different instruments, respectively. We have worked with standardized mean scores to allow studies to be pooled regardless of scale of measurement. It is common to summarize each study with a single measure across arms, the SMD (20). However, taking away a fixed mean value from the active intervention arms ignores the uncertainty associated with the usual-care arm measurement. We have instead modeled the uncertainty in each arm and formed the SMD as part of the statistical model on which intervention effects act. While we do not expect the estimated mean effects to differ using this approach, the uncertainty in the intervention effect estimates will be characterized more accurately by use of arm-based data.
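The role of the assumed pre/post correlation can be seen in a minimal sketch. It uses the standard variance identity for the difference of two correlated measurements; the function name and the example standard deviations are illustrative and are not values from the included trials.

```python
import math

def sd_change(sd_pre, sd_post, corr=0.5):
    """Standard deviation of (post - pre) given an assumed correlation.

    Var(post - pre) = sd_pre^2 + sd_post^2 - 2 * corr * sd_pre * sd_post
    """
    if not -1.0 <= corr <= 1.0:
        raise ValueError("corr must lie in [-1, 1]")
    return math.sqrt(sd_pre ** 2 + sd_post ** 2 - 2 * corr * sd_pre * sd_post)

# Illustrative values only: equal pre and post standard deviations of 10
for r in (0.5, 0.7):
    print(f"corr={r}: SD of change = {sd_change(10.0, 10.0, r):.2f}")
```

A higher assumed correlation gives a smaller standard deviation for the change score and so a more precise arm-level estimate, which is why a sensitivity analysis over the assumed value is informative.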

We have performed meta-analyses for several different outcomes independently and, in fact, the original review included yet more outcomes. These outcomes, however, are clearly related. We would expect the intermediate outcomes (total cholesterol, systolic blood pressure, and diastolic blood pressure) to be highly correlated with each other and furthermore to be surrogate markers for the primary endpoint outcomes (all-cause mortality, cardiac mortality, and nonfatal myocardial infarction). Similarly, we would expect the psychological outcomes (depression and anxiety) to be correlated with each other and also associated with other morbidity. There is a need to develop methods that characterize and incorporate these relations, which could potentially strengthen the resulting analysis (21).

In health technology assessment, what is required for an intervention to be recommended is not just its effectiveness but also its cost-effectiveness (22). The simulation environment that we have used allows us to easily rank competing interventions in order of their effectiveness (Table 5). To extend this to cost-effectiveness would require cost and quality-of-life data (quality-adjusted life-years) (23). This presents a possible solution to the problem of multiple correlated outcomes if they could jointly be mapped into a single quality-adjusted life-years outcome measure. Similarly, if all of the continuous outcomes reported on different scales could be mapped onto a common quality-adjusted life-years scale, then we would have a standard metric on which they may be combined.
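Ranking in a simulation setting amounts to counting, across posterior draws, how often each intervention type has the best sampled effect. The following is a minimal sketch of that calculation, assuming that a lower value is better (as for a reduction in a score); the function name and the tiny set of draws are made up for illustration and are not output from the models in this paper.

```python
from collections import Counter

def prob_most_effective(draws, lower_is_better=True):
    """Share of simulation iterations in which each treatment is best.

    draws maps a treatment name to its list of sampled effects; all lists
    must have the same length, with index i denoting the same iteration.
    Ties are awarded to the treatment listed first.
    """
    names = list(draws)
    n_iter = len(draws[names[0]])
    if n_iter == 0 or any(len(draws[n]) != n_iter for n in names):
        raise ValueError("all treatments need the same nonzero number of draws")
    pick = min if lower_is_better else max
    wins = Counter()
    for i in range(n_iter):
        wins[pick(names, key=lambda n: draws[n][i])] += 1
    return {n: wins[n] / n_iter for n in names}

# Made-up draws for three hypothetical component types
example = {
    "behavioral": [-0.3, -0.1, -0.2, 0.0],
    "cognitive": [-0.2, -0.4, -0.1, -0.1],
    "educational": [0.0, 0.0, 0.0, -0.5],
}
print(prob_most_effective(example))
```

In practice the draws would come from the fitted model's MCMC output, with many thousands of iterations per treatment, and the resulting probabilities sum to 1 across treatments.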

Complex interventions are increasingly being developed and researched in trials in the context of health and disease, and methods of analysis are still developing. The methods presented here will be useful in providing pooled effect estimates that have the potential to answer key questions concerning efficacy, cost-effectiveness, and prioritizing areas for further research.

### Abbreviations

- DIC: deviance information criterion
- SMD: standardized mean difference

Author affiliations: Academic Unit of Primary Health Care, Department of Community Based Medicine, University of Bristol, Bristol, United Kingdom (N. J. Welton, D. M. Caldwell); and Department of Social Medicine, University of Bristol, Bristol, United Kingdom (E. Adamopoulos, K. Vedhara).

N. J. W., D. M. C., and K. V. were supported by the Medical Research Council's Health Services Research Collaboration, and E. A. received support from a Medical Research Council Health Services Research Collaboration Research Initiation Grant (06/IG2038).

The authors thank Margaret Burke and Karen Rees for providing data from the earlier Cochrane review (2).

Conflict of interest: none declared.

## References

*A rapid and systematic review of the clinical and cost-effectiveness of newer drugs for treatment of mania associated with bipolar affective disorder*