Claudia Schmoor, Angelika Caputo, Martin Schumacher, Evidence from Nonrandomized Studies: A Case Study on the Estimation of Causal Effects, American Journal of Epidemiology, Volume 167, Issue 9, 1 May 2008, Pages 1120–1129, https://doi.org/10.1093/aje/kwn010
Abstract
Although randomized controlled trials are regarded as the gold standard for comparison of treatments, evidence from observational studies is still relevant. To cope with the problem of possible confounding in these studies, investigators need methods for analyzing their results which adjust for confounders and lead to unbiased estimation of the treatment effect. In this paper, the authors describe the main principles of three statistical methods for doing this. The first method is the classical approach of a multiple regression model including the effects of treatment and covariates. This considers the relation between prognostic factors and the outcome variable as a relevant criterion for adjustment. The second method is based on the propensity score, focusing on the relation between prognostic factors and treatment assignment. The third method is an ecologic approach using a grouped treatment variable, which may aid in avoiding confounding by indication. These approaches are applied to a partially randomized trial conducted in 720 German breast cancer patients between 1984 and 1997. The study had a comprehensive cohort study design that included recruitment of patients who had consented to participation but not to randomization because of a preference for one of the treatments. This design offers a unique opportunity to contrast results from the nonrandomized portion of a study with those for a randomized subcohort as a reference.
Randomized controlled trials are considered the gold standard for comparison of clinical treatments. Consequently, treatment recommendations depend primarily on the results of randomized controlled trials. Nevertheless, evidence from nonrandomized observational studies is relevant. Randomization is sometimes not acceptable to patients (e.g., when treatments differ qualitatively, as in surgical therapy vs. medical therapy). For some questions, nonrandomized observational studies may be the sole source of available evidence.
In the analysis of results from nonrandomized observational studies, a simple overall comparison of the treatment arms may lead to a biased estimate of the treatment effect due to confounding factors (covariates). Various statistical methods have been proposed for analyzing data from nonrandomized observational studies such that estimated treatment effects may be interpreted as causal effects. In this paper, we consider three different approaches. The classical approach of fitting a multiple regression model including the effects of treatment and of covariates considers the relation between prognostic factors and outcome as a relevant criterion for adjustment. The second approach is based on the propensity score, focusing on the relation between prognostic factors and treatment assignment (1). The third approach estimates the effect of a grouped-treatment (GT) variable related to the assigned treatment, for which it can be assumed that no confounding by indication exists (2–4).
We illustrate these approaches using an example of a study with a so-called comprehensive cohort study design (5, 6), where all patients fulfilling the clinical eligibility criteria and giving consent to participation are recruited. Patients are randomized between study treatments, if they consent to randomization. If not, they receive their preferred study treatment according to the protocol. This results in a prospective cohort study that includes as a subcohort the participants in the classical randomized clinical trial. This design offers an ideal situation for investigating the properties of the different approaches for analysis of nonrandomized observational studies (7).
MATERIALS AND METHODS
Breast cancer study with a comprehensive cohort study design
In 1984, the German Breast Cancer Study Group started a comprehensive cohort study of breast cancer at 44 hospitals in Germany. The study was initiated to compare, using a 2 × 2 design, three cycles of chemotherapy with six cycles of chemotherapy and to investigate the additional effect of tamoxifen as an adjuvant treatment among patients with primary histologically proven nonmetastatic node-positive breast cancer who had been treated with mastectomy. Chemotherapy was administered according to the so-called CMF (cyclophosphamide-methotrexate-fluorouracil) regimen, consisting of 500 mg/m² cyclophosphamide, 40 mg/m² methotrexate, and 600 mg/m² fluorouracil given intravenously on days 1 and 8 of a 4-week treatment period. Endocrine therapy consisted of a daily dose of 30 mg of tamoxifen over 2 years. The study was performed after approval by an ethical committee. Informed consent was obtained from each patient. Further details on the study design have been provided elsewhere (8, 9). In 1986, the protocol was changed; after that time, premenopausal patients were not allowed to receive tamoxifen and were randomized only with regard to three cycles of CMF (3×CMF) versus six cycles of CMF (6×CMF). Therefore, the patients in the randomized portion of the study were all randomized with regard to 3×CMF versus 6×CMF, but only some of them were randomized with regard to tamoxifen. Here, we consider only the effect of 3×CMF versus 6×CMF. The tamoxifen treatment is considered as a covariate and is dealt with in the same way as the prognostic factors.
The covariates given in table 1 were considered as possible confounders using the listed categories. All analyses were restricted to the 450 of 473 randomized patients and 238 of 247 nonrandomized patients with complete information on the covariates. Table 2 shows the randomization rates at the 44 clinical centers.
Table 1. Distribution of randomized and nonrandomized patients according to prognostic factors and treatment with tamoxifen in a comprehensive cohort study of breast cancer, German Breast Cancer Study Group, Germany, 1984–1997
| Factor | Proportion of randomized patients (n = 450) | Proportion of nonrandomized patients (n = 238) |
| Menopausal status | | |
| Premenopausal | 0.42 | 0.43 |
| Postmenopausal | 0.58 | 0.57 |
| No. of positive lymph nodes | | |
| 1–3 | 0.56 | 0.53 |
| 4–9 | 0.30 | 0.29 |
| >9 | 0.14 | 0.18 |
| Tumor size (mm) | | |
| ≤20 | 0.27 | 0.24 |
| 21–30 | 0.42 | 0.42 |
| >30 | 0.31 | 0.34 |
| Tumor grade | | |
| I | 0.12 | 0.12 |
| II | 0.66 | 0.62 |
| III | 0.22 | 0.25 |
| Estrogen receptor status | | |
| Positive | 0.60 | 0.65 |
| Negative | 0.40 | 0.35 |
| Progesterone receptor status | | |
| Positive | 0.59 | 0.64 |
| Negative | 0.41 | 0.36 |
| Treatment with tamoxifen | | |
| No | 0.60 | 0.71 |
| Yes | 0.40 | 0.29 |
Table 2. Rates of randomization versus rates of treatment with three cycles of CMF,* by clinical center, among nonrandomized patients in a comprehensive cohort study of breast cancer, German Breast Cancer Study Group, Germany, 1984–1997
| Clinical center | No. of patients | Proportion of randomized patients | No. of nonrandomized patients | Proportion of nonrandomized patients treated with three cycles of CMF (grouped-treatment variable) |
| 1 | 12 | 0.50 | 6 | 1.00 |
| 2 | 6 | 0.33 | 4 | 1.00 |
| 3 | 5 | 0.80 | 1 | 1.00 |
| 4 | 30 | 0.97 | 1 | 1.00 |
| 5 | 34 | 0.38 | 21 | 0.90 |
| 6 | 34 | 0.65 | 12 | 0.83 |
| 7 | 22 | 0.82 | 4 | 0.75 |
| 8 | 27 | 0.33 | 18 | 0.72 |
| 9 | 20 | 0.50 | 10 | 0.70 |
| 10 | 7 | 0.57 | 3 | 0.67 |
| 11 | 11 | 0.55 | 5 | 0.60 |
| 12 | 18 | 0.17 | 15 | 0.53 |
| 13 | 26 | 0.77 | 6 | 0.50 |
| 14 | 12 | 0.83 | 2 | 0.50 |
| 15 | 6 | 0.67 | 2 | 0.50 |
| 16 | 46 | 0.89 | 5 | 0.40 |
| 17 | 22 | 0.27 | 16 | 0.38 |
| 18 | 24 | 0.42 | 14 | 0.36 |
| 19 | 6 | 0.50 | 3 | 0.33 |
| 20 | 28 | 0.89 | 3 | 0.33 |
| 21 | 7 | 0.57 | 3 | 0.33 |
| 22 | 20 | 0.85 | 3 | 0.33 |
| 23 | 10 | 0.70 | 3 | 0.33 |
| 24 | 11 | 0.00 | 11 | 0.27 |
| 25 | 46 | 0.83 | 8 | 0.25 |
| 26 | 4 | 0.00 | 4 | 0.25 |
| 27 | 19 | 0.37 | 12 | 0.17 |
| 28 | 45 | 0.69 | 14 | 0.14 |
| 29 | 14 | 0.07 | 13 | 0.00 |
| 30 | 12 | 0.25 | 9 | 0.00 |
| 31 | 3 | 0.00 | 3 | 0.00 |
| 32 | 4 | 0.25 | 3 | 0.00 |
| 33 | 19 | 0.95 | 1 | 0.00 |
| 34 | 45 | 1.00 | 0 | —† |
| 35 | 2 | 1.00 | 0 | — |
| 36 | 1 | 1.00 | 0 | — |
| 37 | 8 | 1.00 | 0 | — |
| 38 | 3 | 1.00 | 0 | — |
| 39 | 3 | 1.00 | 0 | — |
| 40 | 1 | 1.00 | 0 | — |
| 41 | 5 | 1.00 | 0 | — |
| 42 | 2 | 1.00 | 0 | — |
| 43 | 2 | 1.00 | 0 | — |
| 44 | 6 | 1.00 | 0 | — |
| Total | 688 | 0.65 | 238 | 0.46 |
* CMF, cyclophosphamide-methotrexate-fluorouracil.
† Not applicable.
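As a quick consistency check on table 2, the pooled proportion in the "Total" row should equal the patient-weighted mean of the center-level proportions. The sketch below copies the counts and rounded proportions for the 33 centers with nonrandomized patients from the table; the variable names are illustrative only, and small discrepancies are possible because the tabulated proportions are rounded to two decimals.

```python
# (number of nonrandomized patients, proportion treated with 3xCMF) for the
# 33 centers that entered nonrandomized patients, copied from table 2.
centers = [(6, 1.00), (4, 1.00), (1, 1.00), (1, 1.00), (21, 0.90), (12, 0.83),
           (4, 0.75), (18, 0.72), (10, 0.70), (3, 0.67), (5, 0.60), (15, 0.53),
           (6, 0.50), (2, 0.50), (2, 0.50), (5, 0.40), (16, 0.38), (14, 0.36),
           (3, 0.33), (3, 0.33), (3, 0.33), (3, 0.33), (3, 0.33), (11, 0.27),
           (8, 0.25), (4, 0.25), (12, 0.17), (14, 0.14), (13, 0.00), (9, 0.00),
           (3, 0.00), (3, 0.00), (1, 0.00)]

# Total number of nonrandomized patients across centers.
n_nonrandomized = sum(n for n, _ in centers)

# Patient-weighted mean of the center-level proportions.
pooled = sum(n * p for n, p in centers) / n_nonrandomized

print(n_nonrandomized, round(pooled, 2))
```

The result agrees with the "Total" row (238 patients, proportion 0.46).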
The primary endpoint for analysis was event-free survival time. Patients were followed at regular intervals until the middle of 1997, leading to a median follow-up time of approximately 8.5 years (10). Event-free survival time was defined from mastectomy to the first event of failure (locoregional recurrence, distant metastasis, a second cancer contralateral or at a distant site, or death without previous failure); 400 events were observed.
Unadjusted analysis of treatment effect
For unadjusted analysis, the hazard ratio for the comparison between treatment groups and its 95 percent confidence interval were calculated, and a two-sided Wald test of the hypothesis of no treatment effect was performed in a Cox regression model (11) including only treatment as a covariate. Cumulative event-free survival rates for the treatment arms were estimated by means of the Kaplan-Meier method (12). This analysis was performed separately for randomized and nonrandomized patients. For the latter, this analysis is obviously inadequate, but it is included for illustration.
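For readers unfamiliar with the Kaplan-Meier method, the following is a minimal sketch of the product-limit estimator. It is not the code used in the study (the analysis was run in SAS), and the toy follow-up times and event indicators are hypothetical.

```python
def kaplan_meier(times, events):
    """Return [(t, S(t))] at each distinct event time.

    times  -- follow-up time for each patient
    events -- 1 if the event was observed at that time, 0 if censored
    """
    # Distinct times at which at least one event occurred, in ascending order.
    event_times = sorted({t for t, e in zip(times, events) if e == 1})
    survival = 1.0
    curve = []
    for t in event_times:
        # Patients still under observation just before time t.
        at_risk = sum(1 for u in times if u >= t)
        # Number of events occurring exactly at time t.
        d = sum(1 for u, e in zip(times, events) if u == t and e == 1)
        survival *= 1.0 - d / at_risk
        curve.append((t, survival))
    return curve


# Hypothetical toy data (years of follow-up), not taken from the study.
toy_times = [1.0, 2.0, 2.0, 3.0, 4.0, 5.0]
toy_events = [1, 1, 0, 1, 0, 1]
km_curve = kaplan_meier(toy_times, toy_events)
for t, s in km_curve:
    print(f"t = {t:.1f}  S(t) = {s:.3f}")
```

Censored patients contribute to the risk set up to their censoring time but never reduce the survival estimate themselves.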
Analysis strategies adjusting for confounders
Multiple Cox regression analysis adjusting for covariates.
An adjusted analysis was performed within a Cox regression model including the respective covariates in addition to treatment. The hazard ratio for the comparison between treatment groups and its 95 percent confidence interval were calculated, and a two-sided Wald test of the hypothesis of no treatment effect was performed. Cumulative event-free survival rates for the treatment arms adjusted for covariates were estimated through an adjusted Cox model (13). This analysis was performed in both nonrandomized and randomized patients.
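For orientation, the adjusted model can be written in standard Cox proportional hazards notation. The symbols are introduced here for illustration and do not appear in the original text: $Z$ is the treatment indicator (1 for 3×CMF, 0 for 6×CMF), $X_1, \dots, X_p$ are the covariates, and $h_0(t)$ is the unspecified baseline hazard.

```latex
h(t \mid Z, X_1, \dots, X_p)
  = h_0(t)\,\exp\!\left(\beta Z + \gamma_1 X_1 + \dots + \gamma_p X_p\right),
\qquad
\mathrm{HR} = e^{\beta},
\qquad
95\%\ \mathrm{CI} = \exp\!\left(\hat\beta \pm 1.96\,\widehat{\mathrm{SE}}(\hat\beta)\right)
```

The unadjusted model is the special case with no covariate terms, and the Wald test compares $\hat\beta / \widehat{\mathrm{SE}}(\hat\beta)$ with a standard normal distribution.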
Propensity-score-based analysis.
In the nonrandomized portion of the study, a propensity-score-based analysis was performed. The propensity score is defined as the conditional probability of receiving one of the treatments under comparison, here 3×CMF, given the observed covariates (1). Then a stratified analysis of the treatment effect is performed using strata that are homogeneous with respect to the propensity score. Since the true propensity score has the property that treatment assignment and covariates are conditionally independent given the propensity score (1), within homogeneous strata, covariates should be balanced between treatment groups. Thus, a stratified analysis theoretically leads to unbiased estimation of the treatment effect, assuming that all confounders are accounted for.
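To make the definition concrete, here is a minimal sketch of how a propensity score could be evaluated from a fitted logistic model and mapped to a stratum. The intercept, coefficients, covariate coding, and cut points are all hypothetical values chosen for illustration; they are not estimates from the study.

```python
import math


def propensity(covariates, intercept, coefs):
    """Logistic-model probability of receiving 3xCMF given the covariates."""
    linear = intercept + sum(b * x for b, x in zip(coefs, covariates))
    return 1.0 / (1.0 + math.exp(-linear))


def stratum(score, cut_points):
    """Index of the stratum containing score; cut_points must be ascending."""
    return sum(1 for c in cut_points if score > c)


# Hypothetical coefficients and covariate patterns, for illustration only.
# Covariates: (1 if more than 3 positive nodes else 0, 1 if tamoxifen given else 0).
INTERCEPT, COEFS = 0.2, (-1.2, 0.9)
patients = [(0, 0), (0, 1), (1, 0), (1, 1)]
CUTS = (0.3, 0.5, 0.7)  # hypothetical cut points giving four strata

scores = [propensity(p, INTERCEPT, COEFS) for p in patients]
strata = [stratum(s, CUTS) for s in scores]
for p, s, k in zip(patients, scores, strata):
    print(p, round(s, 3), k)
```

Within each stratum, treated and untreated patients have similar estimated scores, which is what makes a stratum-wise comparison of the treatments reasonable.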
First, the propensity score is estimated by means of a logistic regression model for treatment assignment, 3×CMF versus 6×CMF, dependent on the covariates. We included for estimation of the propensity score only those covariates that showed an effect on treatment assignment with a p value smaller than 0.157, corresponding to the Akaike criterion (14). Second, patients are divided into strata based on the estimated propensity score, and we ascertain whether covariates are balanced between the treatment groups. Then the treatment effect is estimated using a stratified Cox regression model.
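The threshold of 0.157 can be motivated by a short calculation. The Akaike criterion penalizes each additional parameter by 2, so a single covariate improves the criterion when its chi-square statistic with 1 degree of freedom exceeds 2. The sketch below computes the corresponding tail probability; the function name is illustrative, and the correspondence is the usual large-sample one rather than something specific to this study.

```python
import math


def chi2_1df_upper_tail(x):
    """P(X > x) for a chi-square variable with 1 degree of freedom.

    Uses X = Z**2 with Z standard normal, so P(X > x) = P(|Z| > sqrt(x)).
    """
    return math.erfc(math.sqrt(x / 2.0))


# A covariate with one parameter is retained under AIC when its
# chi-square statistic exceeds the penalty of 2.
aic_p_threshold = chi2_1df_upper_tail(2.0)
print(round(aic_p_threshold, 3))  # 0.157
```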
The GT approach.
In the nonrandomized portion of the study, an analysis using a GT approach was performed. In the GT approach, the treatment individually assigned is considered to be confounded by indication, which means that patients may be selected to receive one of the treatments because of known or unknown prognostic factors (4). Whereas the first two approaches try to adjust only for known confounders, the GT approach also tries to eliminate bias arising from unknown confounders. This approach requires several assumptions (2), which are sometimes called instrumental-variable assumptions (3):
1. The GT variable must be related to the treatment individually assigned, in order to have reasonable strength.
2. The GT variable must be unrelated to observed and unobserved prognostic factors.
3. The GT variable must be unrelated to outcome, except through pathways that operate via the treatment individually assigned.
The observation that hospital practice of treating patients with 3×CMF or 6×CMF varies considerably among the 33 clinical centers entering nonrandomized patients into the study (see table 2) leads us to use as a GT variable (3) the proportion of patients treated with 3×CMF at the respective center. This variable was estimated from the data on the nonrandomized patients. A causal diagram satisfying assumptions 1–3 with regard to the GT variable is given in figure 1. Assumptions 1–3 imply that the hospital where a patient is treated, and thus the probability of receiving 3×CMF (i.e., the GT variable), can be considered a pseudorandomized treatment choice. The key assumptions for valid inferences using the GT approach are similar to those in randomized controlled trials (15). In randomized controlled trials, the random assignment has no direct effect on the outcome (i.e., assumption 3), except via the close relation between the randomized treatment assignment and the actual treatment received (i.e., assumption 1). Like the randomized treatment assignment, the GT variable must be unrelated to observed and unobserved prognostic factors (i.e., assumption 2), which means that prognostic factors do not influence the patient's choice of a certain hospital.
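The construction of the GT variable can be sketched as follows: each patient is assigned the proportion of patients treated with 3×CMF at her center, irrespective of the treatment she herself received. The center labels and records below are hypothetical, and the function name is illustrative.

```python
from collections import defaultdict


def grouped_treatment(records):
    """Map each center to the proportion of its patients given 3xCMF.

    records -- iterable of (center_id, got_3xcmf) pairs, got_3xcmf being 0 or 1
    """
    treated = defaultdict(int)
    total = defaultdict(int)
    for center, got_3xcmf in records:
        total[center] += 1
        treated[center] += got_3xcmf
    return {c: treated[c] / total[c] for c in total}


# Hypothetical patient-level records; center labels "A" and "B" are made up.
records = [("A", 1), ("A", 1), ("A", 0), ("A", 1),
           ("B", 0), ("B", 0), ("B", 1), ("B", 0)]
gt_by_center = grouped_treatment(records)

# Every patient inherits the value of her center, whatever her own treatment.
gt_per_patient = [gt_by_center[c] for c, _ in records]
print(gt_by_center)
```

The per-patient values, rather than the individual treatment indicator, would then enter the outcome model as the only covariate.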
Figure 1. Causal diagram satisfying the assumptions regarding the grouped treatment (GT) variable used in the nonrandomized portion (n = 238) of a comprehensive cohort study of breast cancer, German Breast Cancer Study Group, Germany, 1984–1997. 3×CMF, three cycles of cyclophosphamide-methotrexate-fluorouracil.
The effect of treatment on event-free survival will be estimated by the regression coefficient of the GT variable in a Cox regression model for event-free survival including only the GT variable as a covariate.
Data storage and analysis were performed using the Statistical Analysis System (16).
RESULTS
Analysis of the randomized portion of the study
The results of the comparison between 3×CMF and 6×CMF in the 450 randomized patients are shown in figure 2 and table 3. The unadjusted comparison showed no difference between 3×CMF and 6×CMF. Since the major prognostic factors were well balanced between the randomized treatment arms (8, 9), the result remained unchanged when prognostic factors were adjusted for in the analysis.
Table 3. Effect of treatment with three cycles of CMF* versus treatment with six cycles of CMF among randomized and nonrandomized breast cancer patients in unadjusted and adjusted analyses, using different methods of adjustment, German Breast Cancer Study Group, Germany, 1984–1997
| Method of analysis | Hazard ratio | Standard error | 95% confidence interval | p value† |
| Randomized patients (n = 450; 262 events) | | | | |
| Unadjusted‡ | 1.077 | 0.124 | 0.845, 1.372 | 0.55 |
| Conventional adjustment for covariates§ | 1.054 | 0.125 | 0.825, 1.345 | 0.67 |
| Nonrandomized patients (n = 238; 138 events) | | | | |
| Unadjusted‡ | 0.693 | 0.173 | 0.494, 0.973 | 0.034 |
| Conventional adjustment for covariates§ | 1.002 | 0.195 | 0.683, 1.470 | 0.99 |
| Stratified for propensity score¶ | 0.987 | 0.192 | 0.677, 1.438 | 0.95 |
| Grouped-treatment variable# | 0.758 | 0.280 | 0.438, 1.311 | 0.32 |
* CMF, cyclophosphamide-methotrexate-fluorouracil.
† p value from a two-sided Wald test in a Cox regression model.
‡ Cox regression model including a treatment indicator.
§ Cox regression model including a treatment indicator and the covariates listed in table 1.
¶ Cox regression model including a treatment indicator, stratified for six values of the estimated propensity score.
# Cox regression model including the grouped-treatment variable listed in the last column of table 2.
Figure 2. Event-free survival rates by duration of chemotherapy in the randomized portion (n = 450) of a comprehensive cohort study of breast cancer, German Breast Cancer Study Group, Germany, 1984–1997. a) Unadjusted Kaplan-Meier estimates; b) adjusted estimates from a Cox model, adjusted for the covariates listed in table 1. 3×CMF, three cycles of cyclophosphamide-methotrexate-fluorouracil; 6×CMF, six cycles of cyclophosphamide-methotrexate-fluorouracil.
Comparison of randomized and nonrandomized patients
Analysis of factors that influenced patients' consent to randomization revealed that the major factor was the hospital where the patient was informed about the study. Consent to randomization seemed to depend mainly not on the patient but on the doctor's effort to seek consent (table 2). Prognostic factors were fairly well balanced when randomized and nonrandomized patients were compared (table 1), and event-free survival rates of randomized and nonrandomized patients were nearly identical (figure 3).
Figure 3. Event-free survival rates by randomization status in a comprehensive cohort study of breast cancer, German Breast Cancer Study Group, Germany, 1984–1997.
Analysis of the nonrandomized portion of the study
Unadjusted analysis.
In an unadjusted comparison of the treatments, 3×CMF showed a higher event-free survival rate than did 6×CMF (figure 4, part a). The estimated hazard ratio was 0.693 (95 percent confidence interval (CI): 0.494, 0.973; p = 0.034) (table 3).
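The reported interval and p value can be reproduced from the hazard ratio and the standard error in table 3, under the assumption (made here, not stated in the text) that the tabulated standard error refers to the log hazard ratio. The function name is illustrative, and 1.96 is the usual normal quantile for a 95 percent interval.

```python
import math


def wald_summary(hazard_ratio, se_log_hr, z_crit=1.96):
    """95% CI and two-sided Wald p value from a hazard ratio and SE of log HR."""
    log_hr = math.log(hazard_ratio)
    lower = math.exp(log_hr - z_crit * se_log_hr)
    upper = math.exp(log_hr + z_crit * se_log_hr)
    z = log_hr / se_log_hr
    # Two-sided tail area of the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2.0))
    return lower, upper, p_value


# Unadjusted analysis of the nonrandomized patients (values from table 3).
lower, upper, p_value = wald_summary(0.693, 0.173)
print(f"{lower:.3f}, {upper:.3f}; p = {p_value:.3f}")
```

Up to rounding of the inputs, this agrees with the interval of 0.494 to 0.973 and p = 0.034 given above, which supports reading the tabulated standard error as being on the log scale.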
Figure 4. Event-free survival rates by duration of chemotherapy in the nonrandomized portion (n = 238) of a comprehensive cohort study of breast cancer, German Breast Cancer Study Group, Germany, 1984–1997. a) Unadjusted Kaplan-Meier estimates; b) adjusted estimates from a Cox model, adjusted for the covariates listed in table 1. In part b, the dotted line is not visible because it is superimposed on the solid line. 3×CMF, three cycles of cyclophosphamide-methotrexate-fluorouracil; 6×CMF, six cycles of cyclophosphamide-methotrexate-fluorouracil.
In patients with good prognostic factors such as fewer than four positive axillary lymph nodes or a tumor smaller than 3 cm, the rate of treatment with 3×CMF was higher than that in patients with a poor prognosis (table 4). Additionally, the choice of treatment with tamoxifen was related to length of chemotherapy, with a higher 3×CMF treatment rate in patients receiving tamoxifen. This emphasizes the necessity of adjusting the comparison of the treatment arms for covariates.
Table 4. Relation of covariates to individual treatment assignment and to proportion of patients treated with 3×CMF* at the respective clinical center (grouped-treatment variable) in the nonrandomized portion (n = 238) of a comprehensive cohort study of breast cancer, German Breast Cancer Study Group, Germany, 1984–1997
| Factor | Proportion of patients treated with 3×CMF | Mean proportion of patients treated with 3×CMF at the respective clinical center (mean grouped-treatment variable) |
| Menopausal status | | |
| Premenopausal | 0.44 | 0.43 |
| Postmenopausal | 0.48 | 0.49 |
| No. of positive lymph nodes | | |
| 1–3 | 0.60 | 0.53 |
| 4–9 | 0.33 | 0.38 |
| >9 | 0.26 | 0.39 |
| Tumor size (mm) | | |
| ≤20 | 0.53 | 0.41 |
| 21–30 | 0.50 | 0.49 |
| >30 | 0.36 | 0.46 |
| Tumor grade | | |
| I | 0.46 | 0.45 |
| II | 0.50 | 0.49 |
| III | 0.38 | 0.39 |
| Estrogen receptor status | | |
| Positive | 0.50 | 0.49 |
| Negative | 0.39 | 0.41 |
| Progesterone receptor status | | |
| Positive | 0.47 | 0.46 |
| Negative | 0.45 | 0.46 |
| Treatment with tamoxifen | | |
| No | 0.40 | 0.45 |
| Yes | 0.61 | 0.49 |
* 3×CMF, three cycles of cyclophosphamide-methotrexate-fluorouracil.
Multiple Cox regression analysis with adjustment for covariates.
From a Cox regression analysis including the prognostic factors for adjustment, the hazard ratio for 3×CMF versus 6×CMF was estimated as 1.002 (95 percent CI: 0.683, 1.470; p = 0.99) (table 3). Figure 4, part b, shows the corresponding adjusted event-free survival rates. After adjustment for prognostic factors, no difference was observed between treatment arms in nonrandomized patients.
Propensity-score-based analysis.
Logistic regression analysis revealed that the number of positive axillary lymph nodes and the decision to undergo tamoxifen treatment were both related to the choice of 3×CMF versus 6×CMF (p < 0.157; results not shown in detail). Therefore, the propensity score was estimated with a logistic regression model including these two covariates (table 5). Thus, the estimated propensity score had only six different values defined by the categories of the number of positive axillary lymph nodes and the decision to undergo tamoxifen treatment. Its median value was 0.54 in 3×CMF (range, 0.22–0.75) and 0.41 in 6×CMF (range, 0.22–0.75), showing considerable overlap.
Table 5. Effects of prognostic factors on treatment assignment (three cycles of CMF* vs. six cycles of CMF) in the nonrandomized portion (n = 238) of a comprehensive cohort study of breast cancer, German Breast Cancer Study Group, Germany, 1984–1997
| Factor | Odds ratio | 95% confidence interval | p value† |
| --- | --- | --- | --- |
| No. of positive lymph nodes | | | |
| 1–3 | 1 | | <0.0001 |
| 4–9 | 0.29 | 0.16, 0.55 | |
| >9 | 0.23 | 0.10, 0.51 | |
| Treatment with tamoxifen | | | |
| No | 1 | | 0.003 |
| Yes | 2.54 | 1.38, 4.68 | |
* CMF, cyclophosphamide-methotrexate-fluorouracil.
† p value from a two-sided Wald test in a logistic regression model.
Patients were divided into the six strata defined by the values of the included covariates, leading to perfect balance of these covariates. From a Cox regression analysis stratified for the six propensity score strata, the hazard ratio for 3×CMF versus 6×CMF was estimated as 0.987 (95 percent CI: 0.677, 1.438; p = 0.95) (table 3).
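The reason a two-covariate logistic model produces exactly six propensity score values can be sketched as follows. The odds ratios are those reported in table 5. The intercept is not reported in the text, so the value used here is a hypothetical placeholder, and the printed scores should not be read as the study's estimates.

```python
import math

# Odds ratios for receiving 3xCMF, as reported in table 5.
OR_NODES = {"1-3": 1.0, "4-9": 0.29, ">9": 0.23}
OR_TAMOXIFEN = {"no": 1.0, "yes": 2.54}

# ASSUMPTION: the intercept is not reported in the text; this value is
# hypothetical and chosen only so that the example runs.
HYPOTHETICAL_INTERCEPT = 0.17


def propensity(nodes: str, tamoxifen: str) -> float:
    """Probability of 3xCMF from a main-effects logistic model."""
    logit = (HYPOTHETICAL_INTERCEPT
             + math.log(OR_NODES[nodes])
             + math.log(OR_TAMOXIFEN[tamoxifen]))
    return 1.0 / (1.0 + math.exp(-logit))


# One stratum per covariate combination:
# 3 lymph node categories x 2 tamoxifen levels = 6 strata.
strata = {(n, t): propensity(n, t) for n in OR_NODES for t in OR_TAMOXIFEN}

for (nodes, tam), score in sorted(strata.items(), key=lambda kv: kv[1]):
    print(f"nodes {nodes:>3}, tamoxifen {tam:>3}: {score:.2f}")
```

Because every patient in a stratum shares the same two covariate values, stratifying on the score is equivalent to stratifying on the covariates themselves, which is why the balance within strata is perfect.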
The GT approach.
Hospital practices varied considerably in the choice of 3×CMF versus 6×CMF in nonrandomized patients (table 2). Thus, an important prerequisite for the use of the GT approach (4) is met in our study. The mean of the GT variable was 0.66 in patients who received 3×CMF and 0.29 in patients who received 6×CMF. The rank correlation between the individually assigned treatment variable and the GT variable was 0.61, indicating that assumption 1 was fulfilled for the GT variable.
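As a minimal sketch of how a GT variable of this kind might be constructed, the following computes the proportion of patients treated with 3×CMF at each center and then averages that center-level value within each treatment arm. The records and the function names are hypothetical and serve only to illustrate the calculation; they are not data or code from the study.

```python
from collections import defaultdict

# Hypothetical (center, treatment) records; these are NOT data from the study.
PATIENTS = [
    ("A", "3xCMF"), ("A", "3xCMF"), ("A", "3xCMF"), ("A", "6xCMF"),
    ("B", "3xCMF"), ("B", "6xCMF"), ("B", "6xCMF"), ("B", "6xCMF"),
    ("C", "3xCMF"), ("C", "6xCMF"),
]


def grouped_treatment(records):
    """Return, per center, the proportion of patients given 3xCMF."""
    treated = defaultdict(int)
    total = defaultdict(int)
    for center, treatment in records:
        total[center] += 1
        treated[center] += int(treatment == "3xCMF")
    return {center: treated[center] / total[center] for center in total}


def mean_gt_by_arm(records):
    """Average the center-level GT value over the patients in each arm."""
    gt = grouped_treatment(records)
    by_arm = defaultdict(list)
    for center, treatment in records:
        by_arm[treatment].append(gt[center])
    return {arm: sum(values) / len(values) for arm, values in by_arm.items()}


gt_by_center = grouped_treatment(PATIENTS)
arm_means = mean_gt_by_arm(PATIENTS)
print(gt_by_center)
print(arm_means)
```

A clear separation of the arm means, as in the toy data, is what one would expect when the GT variable is strongly related to the treatment actually received.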
Table 4 shows the relations of the covariates to individual treatment assignment and the GT variable. Imbalances observed between the individual treatment variable and the covariates were substantially reduced when the GT variable was examined in relation to the covariates. An imbalance with a difference in the mean GT variable between the covariate categories of more than 0.1 only remains for the number of positive axillary lymph nodes. This indicates that assumption 2 for the GT variable seems to be reasonably fulfilled for most of the observed prognostic factors, although some differences between clinical centers may still exist regarding the disease status of patients. The large variation in the proportion of 3×CMF also supports the assumption that the heterogeneous treatment assignment at the different centers is more influenced by hospital practice than by individual covariates.
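The comparison described above can be reproduced from the values in table 4. In this sketch, imbalance is summarized as the largest minus the smallest value across the categories of a factor, which is one possible reading of "difference between the covariate categories" and is an assumption of the example.

```python
# (individual proportion, mean GT variable) per covariate category, from table 4.
TABLE_4 = {
    "Menopausal status": [(0.44, 0.43), (0.48, 0.49)],
    "No. of positive lymph nodes": [(0.60, 0.53), (0.33, 0.38), (0.26, 0.39)],
    "Tumor size": [(0.53, 0.41), (0.50, 0.49), (0.36, 0.46)],
    "Tumor grade": [(0.46, 0.45), (0.50, 0.49), (0.38, 0.39)],
    "Estrogen receptor status": [(0.50, 0.49), (0.39, 0.41)],
    "Progesterone receptor status": [(0.47, 0.46), (0.45, 0.46)],
    "Treatment with tamoxifen": [(0.40, 0.45), (0.61, 0.49)],
}


def spread_in_hundredths(values):
    """Max minus min, in integer hundredths to avoid float rounding noise."""
    hundredths = [round(v * 100) for v in values]
    return max(hundredths) - min(hundredths)


# For each factor: (spread of individual proportions, spread of mean GT).
spreads = {
    factor: (
        spread_in_hundredths([individual for individual, _ in rows]),
        spread_in_hundredths([gt for _, gt in rows]),
    )
    for factor, rows in TABLE_4.items()
}

# Factors whose mean GT variable differs by more than 0.1 across categories.
imbalanced = [factor for factor, (_, gt) in spreads.items() if gt > 10]

for factor, (individual, gt) in spreads.items():
    print(f"{factor}: individual {individual / 100:.2f}, GT {gt / 100:.2f}")
print("GT spread > 0.1:", imbalanced)
```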
Using the GT variable, the hazard ratio for 3×CMF versus 6×CMF was estimated as 0.758 (95 percent CI: 0.438, 1.311; p = 0.32) (table 3). The GT approach thus seems to have eliminated some of the bias: the estimated effect of 3×CMF versus 6×CMF was less extreme than in the unadjusted analysis but still much larger than in the other adjusted analyses and in the randomized portion of the study.
DISCUSSION
In this paper, we conducted separate analyses of the randomized and nonrandomized portions of a comprehensive cohort study in breast cancer patients. Our intention was to present and compare different methods proposed for estimation of causal effects in nonrandomized studies and to consider their ability to reproduce the unbiased result, which is known from the randomized portion of the study. The underlying assumption that the randomized and nonrandomized patients were comparable seems to have been fulfilled in our study.
The analysis of the randomized portion of the study has been published previously (9). An unadjusted analysis of the treatment effect and an analysis adjusted for the known covariates showed almost identical results—namely, no difference between 3×CMF and 6×CMF with respect to event-free survival, with an estimated hazard ratio close to 1.
An unadjusted analysis of the nonrandomized portion of the study that did not take covariates into account showed an apparent superiority of treatment arm 3×CMF as compared with treatment arm 6×CMF, with an estimated hazard ratio of 0.693 (95 percent CI: 0.494, 0.973; p = 0.034); this contrasted with the results of the randomized portion of the study. However, obvious imbalances of the covariates between treatment arms indicated confounding. With a conventional Cox regression analysis adjusting for known prognostic factors and treatment with tamoxifen, we could correct for this bias. The resulting estimate of the treatment effect now agreed with that obtained in the randomized portion of the study.
An analysis based on the propensity score was also able to correct for the bias, with a resulting treatment effect estimate that was also close to 1. Recently, investigations have been performed on which variables should be included for optimal estimation of the propensity score (17, 18). This question has not been considered in detail here. We selected those factors showing an influence on treatment assignment, namely the number of positive axillary lymph nodes and the decision to undergo tamoxifen treatment, and stratified the analysis for the resulting six propensity score strata. Other methods of adjusting for the propensity score have been proposed in the literature (e.g., matching for the propensity score or propensity-score-based weighting), which may lead to quite different estimated treatment effects in the case of a nonuniform treatment effect across different values of the propensity score (19). Our procedure is close to an approach that matches for the propensity score, because only patients with an identical propensity score are grouped in a stratum. In a sensitivity analysis, we additionally estimated separate treatment effects in the six strata, which gave no indication of a nonuniform effect in our study. We thus may conclude that our result is not much influenced by the choice of method of adjusting for the propensity score. We performed an additional sensitivity analysis in which an indicator of treatment center was included for estimation of the propensity score. The analysis of the treatment effect was then stratified using quintiles of the estimated propensity score, producing a result similar to the one presented in table 3 (data not shown).
The GT variable used in our study, the proportion of patients treated with 3×CMF at the respective clinical center, was not able to remove the bias due to confounding. The estimated effect of 3×CMF versus 6×CMF was less extreme but still close to that obtained in the unadjusted analysis. This may be due to the residual imbalance with respect to the number of positive axillary lymph nodes. This is especially problematic because the number of positive axillary lymph nodes is the strongest prognostic factor in this patient population (9). For situations where the GT variable is related to known prognostic factors, it has been proposed that these factors be included as individual covariates for adjustment in the model relating the GT variable to the outcome (3). However, this must also be regarded as problematic, because this may introduce confounding with other factors in an uncontrolled way. In our study, including the factor number of positive axillary lymph nodes as an individual covariate in the regression model leads to an estimated effect for the GT variable close to 1, thus being in good agreement with the unbiased effect from the randomized portion of the study. Another consideration concerns the effect of clustering of the GT variable across clinical centers. Table 2 shows that no important clustering is present, and an additional sensitivity analysis adjusting the GT variable for the number of patients per center showed a result similar to the one presented in table 3.
The tamoxifen treatment was dealt with in the same way as the prognostic factors, although, in a strict sense, it is not a variable causing confounding by indication but rather must be considered a part of the hospital practice. This was not considered problematic, especially because its imbalance in relation to the individual treatment with 3×CMF or 6×CMF could be substantially reduced by considering the GT variable.
Note that the estimated standard error of the effect of the GT variable is larger than the estimated standard error of the estimated treatment effect when analyzing the individually assigned treatment. This was also observed in other, similar applications (15, 20) and in a simulation study on the properties of the GT approach (3). This is due to the reduced variability of the GT variable as compared with the individual treatment variable. Some authors refer to this as a loss in precision due to the use of an imperfect surrogate (3, 15). Another point sometimes mentioned is that a considerable correlation between patients at a given center can lead to larger confidence intervals, and it has been proposed that statistical methods accounting for this correlation be used (21). However, this correlation would be relevant not only for the analysis using the GT variable but also for the analysis using the individual treatment variable. Nevertheless, if this were considered necessary, it could be accomplished in survival analysis by including an appropriate frailty term in the model (22).
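The difference in precision can be illustrated by back-calculating approximate standard errors of the log hazard ratio from the confidence intervals reported in the text. This sketch assumes that the intervals are Wald intervals of the form exp(log HR ± 1.96 × SE), which the text does not state explicitly. The results are therefore approximations and not the standard errors estimated in the study.

```python
import math

# Hazard ratio estimates and 95% CIs as reported in the text (table 3).
REPORTED = {
    "unadjusted": (0.693, 0.494, 0.973),
    "Cox, covariate-adjusted": (1.002, 0.683, 1.470),
    "propensity score strata": (0.987, 0.677, 1.438),
    "grouped treatment (GT)": (0.758, 0.438, 1.311),
}

Z_95 = 1.96  # ASSUMPTION: Wald interval, exp(log HR +/- 1.96 * SE)


def se_log_hr(lower: float, upper: float) -> float:
    """Approximate SE of log(HR) implied by a 95% Wald interval."""
    return (math.log(upper) - math.log(lower)) / (2 * Z_95)


standard_errors = {
    method: se_log_hr(lower, upper)
    for method, (_, lower, upper) in REPORTED.items()
}

for method, se in standard_errors.items():
    print(f"{method}: SE(log HR) ~ {se:.2f}")
```

Under this assumption, the interval for the GT variable implies a noticeably larger standard error than the intervals from the analyses of the individually assigned treatment.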
A further point of interest is interpretation of the effect estimates resulting from the different methods. Adjustment of the analyses for covariates results in estimates of the treatment effect that are conditional on the observed covariates and marginal with respect to unobserved covariates, whereas the analyses unadjusted for covariates, the analysis based on the propensity score (23), and the GT approach (3, 20) all result in estimates of the treatment effect that are marginal with respect to observed and unobserved covariates. The marginal effect is interpreted on the population level; it is the change in the hazard of the population if all patients were to receive one treatment rather than the other. The conditional effect is interpreted more on the individual level; it is the change in hazard for a patient if she receives one treatment rather than the other, conditional on the measured covariates. In the proportional hazards model in general, these parameters do not coincide (3, 24–26). In expectation, the marginal effect is more conservative (i.e., closer to unity) than the conditional effect. In many applied settings where conventional regression models and propensity-score-based methods are compared, this difference in parameters is not considered.
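Why the marginal and conditional hazard ratios differ can be seen in a small numerical sketch. It assumes two equally sized covariate strata with constant baseline hazards and the same conditional hazard ratio in both. All numbers are hypothetical and are not taken from the study.

```python
import math

# ASSUMPTIONS (hypothetical, for illustration only): two equally sized
# covariate strata with constant baseline hazards, and the same
# conditional hazard ratio in both strata.
BASELINE_HAZARDS = (0.1, 1.0)
CONDITIONAL_HR = 0.5


def marginal_hazard(t: float, hazard_ratio: float) -> float:
    """Hazard at time t in the mixed population of both strata."""
    rates = [h * hazard_ratio for h in BASELINE_HAZARDS]
    survival = [math.exp(-r * t) for r in rates]
    # Population hazard = (sum of stratum densities) / (sum of stratum
    # survival probabilities); the equal stratum weights cancel.
    return sum(r * s for r, s in zip(rates, survival)) / sum(survival)


def marginal_hr(t: float) -> float:
    """Ratio of population hazards, treated versus untreated, at time t."""
    return marginal_hazard(t, CONDITIONAL_HR) / marginal_hazard(t, 1.0)


for t in (0.0, 1.0, 2.0, 5.0):
    print(f"t = {t:>3}: marginal HR = {marginal_hr(t):.3f}")
```

In this toy setting the marginal hazard ratio equals the conditional one only at time zero. At later times it moves toward 1, because high-risk patients leave the risk set faster in the untreated group. The marginal ratio is also not constant over time, so a single marginal hazard ratio from a proportional hazards model is best read as an average over follow-up.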
We conclude that in our study, the propensity-score-based approach does not yield any obvious advantages over the conventional regression approach, as has also been stressed in other recent applications and reviews (20, 27–29). The conventional regression analysis and the propensity-score-based approach can adjust only for measured confounders. The GT approach additionally tries to reduce bias due to unmeasured variables' being confounders by indication on the individual level. This requires the assumption that these variables are not related to the GT variable—that is, are not confounders by indication on the ecologic level. In our study, this means that prognostic factors did not influence the patient's choice of a certain hospital because of its practice concerning the frequency of giving the treatments under comparison. We conjecture that this may be fulfilled more often than “no confounding” on the individual level. Nevertheless, with the GT approach in our study, we did not succeed in removing the confounding completely.
The comprehensive cohort study design offers an ideal situation for investigating the properties of the different approaches for analyzing nonrandomized observational studies mentioned above, since, being part of the same study, it allows a comparison of the results obtained in the nonrandomized patients with the unbiased results obtained in the randomized patients.
Abbreviations
- CI: confidence interval
- CMF: cyclophosphamide-methotrexate-fluorouracil
- GT: grouped treatment
This work was supported by the Deutsche Forschungsgemeinschaft (German Research Foundation), Research Unit FOR 534.
Conflict of interest: none declared.




