Recent scientific advances in personalized medicine are being translated into clinical benefit: Fontes Jardim et al. (1) report larger treatment effects in personalized vs nonpersonalized US Food and Drug Administration–approved cancer treatments. One should, however, be careful in quantifying the benefits in this way, because such an analysis is difficult to interpret when the treatment effects seen in trials of nonapproved agents are not considered.

Here we focus on a statistical issue that can lead to bias. Fontes Jardim et al. (1) report a statistically significant difference in the overall survival (OS) hazard ratios between personalized and nonpersonalized trials of the approved agents. This difference may be partly due to the personalized trials having fewer events on average than the nonpersonalized trials, for two reasons: 1) the distribution of trial results from smaller trials is more spread out than that from larger trials; and 2) if one selects only trials with small P values, the estimated treatment effects will be larger (hazard ratios further from 1) for small trials than for large trials. Note that even though the sample sizes of the personalized and nonpersonalized trials are not statistically significantly different (Table 1 [1]), the standard errors (SEs) of the estimated log hazard ratios, which are approximately inversely proportional to the square root of the number of events, are quite different. For example, the mean ± standard deviation of the SEs of the estimated log hazard ratios for OS is 0.121 ± 0.060 for the 33 nonpersonalized trials and 0.273 ± 0.150 for the 13 personalized trials (P < .001, Wilcoxon test).
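The size of the selection effect can be gauged with a short worked calculation. It is only a sketch: it assumes the usual large-sample approximation for the SE of a log hazard ratio under 1:1 randomization (with d_1 and d_0 events in the two arms and d = d_1 + d_0), a two-sided .05 significance level, and the mean SEs quoted above as representative values. None of these assumptions is taken from (1).

```latex
\begin{align*}
\mathrm{SE}\bigl(\log \widehat{\mathrm{HR}}\bigr)
  &\approx \sqrt{\frac{1}{d_1} + \frac{1}{d_0}}
   \approx \frac{2}{\sqrt{d}}
   \qquad (d_1 \approx d_0 \approx d/2), \\
\text{statistically significant benefit}
  &\iff \widehat{\mathrm{HR}} < \exp\bigl(-1.96\,\mathrm{SE}\bigr), \\
\mathrm{SE} = 0.121
  &\;\Rightarrow\; \widehat{\mathrm{HR}} < \exp(-0.237) \approx 0.79, \\
\mathrm{SE} = 0.273
  &\;\Rightarrow\; \widehat{\mathrm{HR}} < \exp(-0.535) \approx 0.59 .
\end{align*}
```

Under these assumptions, a trial with the average personalized SE must observe a hazard ratio below about 0.59 to reach significance, whereas a trial with the average nonpersonalized SE needs only about 0.79, so the significant small trials necessarily show more extreme estimates.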

To illustrate the induced bias, we consider a hypothetical example in which there is no true difference between the personalized and nonpersonalized treatments. First, the true hazard ratios for OS and progression-free survival (PFS) for both personalized and nonpersonalized treatments were generated from the same distribution (Figure 1A; Supplementary Appendix, available online). Then the observed hazard ratios for the two groups were simulated by adding to the true hazard ratios random noise corresponding to an SE chosen randomly from the 33 nonpersonalized SEs or the 13 personalized SEs, respectively (Figure 1, B and C). Finally, the subdistributions for “approved trials” were identified as the subset of trials in which either the OS or the PFS hazard ratio was statistically significant. Even though the true hazard ratios came from the same distribution, the mean OS hazard ratio is 0.73 for approved personalized treatments and 0.83 for approved nonpersonalized treatments (Figure 1, B and C). These can be compared with the reported OS hazard ratios of 0.71 and 0.81, respectively (1). Note that because the difference in the number of PFS events between the personalized and nonpersonalized trials is less dramatic than the difference in the number of deaths, the potential bias in estimating the PFS hazard ratios is smaller: the simulated mean PFS hazard ratios for approved personalized and nonpersonalized treatments are 0.53 and 0.59, respectively (which can be compared with the reported PFS hazard ratios of 0.41 and 0.59, respectively).
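The mechanism described above can be reproduced in a few lines. The following is a minimal sketch, not the simulation in the Supplementary Appendix: it considers OS only, defines "approval" as a statistically significant OS benefit, draws SEs from a gamma distribution matched to the reported means and standard deviations (the individual trial SEs are not available here), adds noise on the log hazard ratio scale, and uses a hypothetical normal distribution for the true log hazard ratios. The function names and the parameters `true_log_hr_mean` and `true_log_hr_sd` are illustrative assumptions.

```python
import math
import random
import statistics


def gamma_params(mean, sd):
    """Shape and scale of a gamma distribution with the given mean and SD."""
    shape = (mean / sd) ** 2
    scale = sd ** 2 / mean
    return shape, scale


def simulate_approved_mean_hr(se_mean, se_sd, n_trials, rng,
                              true_log_hr_mean=math.log(0.9),
                              true_log_hr_sd=0.15):
    """Mean observed HR among 'approved' trials, and the number approved.

    ASSUMPTIONS (not from the source): true log HRs are normal with the
    hypothetical mean/SD above; SEs follow a gamma distribution matched to
    the reported mean and SD; approval means a significant OS benefit only.
    """
    shape, scale = gamma_params(se_mean, se_sd)
    approved = []
    for _ in range(n_trials):
        true_log_hr = rng.gauss(true_log_hr_mean, true_log_hr_sd)
        se = rng.gammavariate(shape, scale)       # trial-specific SE
        observed = rng.gauss(true_log_hr, se)     # noisy estimate of log HR
        if observed < -1.96 * se:                 # two-sided P < .05, benefit
            approved.append(math.exp(observed))
    return statistics.fmean(approved), len(approved)


rng = random.Random(2015)
# Same true-effect distribution for both groups; only the SEs differ.
hr_pers, n_pers = simulate_approved_mean_hr(0.273, 0.150, 50_000, rng)
hr_nonpers, n_nonpers = simulate_approved_mean_hr(0.121, 0.060, 50_000, rng)
print(f"personalized:    mean approved HR = {hr_pers:.2f} (n = {n_pers})")
print(f"nonpersonalized: mean approved HR = {hr_nonpers:.2f} (n = {n_nonpers})")
```

Because the true effects are drawn from one distribution in both calls, any gap between the two printed means is produced purely by the difference in SEs combined with selection on statistical significance. The exact values depend on the hypothetical inputs and should not be compared directly with the figures quoted above.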

Figure 1. Hypothetical distributions of overall survival hazard ratios for new treatments compared with control treatments. A) True hazard ratios. B) Observed hazard ratios from personalized treatments. C) Observed hazard ratios from nonpersonalized treatments. Shaded distributions represent the subdistributions of “approved trials.” Arrows designate the means of the shaded subdistributions.

This hypothetical example suggests that some of the differences reported by Fontes Jardim et al. (1) may be due to differences in the number of events in the trials rather than to any intrinsic property of the treatments being tested.

Reference

1. Fontes Jardim DL, Schwaederle M, Wei C, et al. Impact of a Biomarker-Based Strategy on Oncology Drug Development: A Meta-analysis of Clinical Trials Leading to FDA Approval. J Natl Cancer Inst. 2015;107(11):djv253. doi:10.1093/jnci/djv253.
