J. Jack Lee, Demystify Statistical Significance—Time to Move on From the P Value to Bayesian Analysis, JNCI: Journal of the National Cancer Institute, Volume 103, Issue 1, 5 January 2011, Pages 2–3, https://doi.org/10.1093/jnci/djq493
Extract
Compared with the long histories of mathematics, physics, chemistry, and biology, statistics is a young science. Rooted in mathematics, statistics is a science of quantitative reasoning. It provides a formal framework for assessing the strength of evidence in the midst of uncertainty. More specifically, statistics allows one to quantify the probability of an event in order to make proper inferences. It provides tools that we can use to sift through mountains of information to identify the signal and filter out the noise. Recently, the United Nations declared October 20, 2010, as “World Statistics Day.” The declaration stated, “On 20 October 2010, the World will celebrate the first World Statistics Day, to raise awareness of the many achievements of official statistics premised on the core values of service, integrity and professionalism” (1).
Medicine is an ancient profession. The practice of medicine has evolved from being empirically based to being evidence based. Evidence-based medicine has become a motto for modern medicine, and statistics is an indispensable tool for evaluating the strength of evidence contained in the data (2, 3). How much information do the data contain? Formulating the problem in the context of evaluating drug efficacy, one may ask: “Based on the data, does the drug work?” In this issue of the Journal, Ocana and Tannock (4) ask: “When are ‘positive’ clinical trials in oncology truly positive?” They provide a critical assessment of 18 randomized trials that led to the approval of targeted drugs by the United States Food and Drug Administration (FDA) over the past 10 years. They make four major claims: 1) statistical significance does not equate to clinical significance; 2) the P value alone is not sufficient to conclude that a drug works; 3) a prespecified magnitude of clinical benefit, δ, defined as the difference in the primary endpoint between the control and experimental groups, is required to gauge whether a drug works; and 4) the absolute difference is more relevant than the relative difference in measuring a drug's efficacy. The excellent commentary by Ocana and Tannock (4) echoes several previous works addressing these issues (5–8). It challenges the current drug approval paradigm and calls on both the medical and statistical communities to develop a more robust framework for assessing drug effectiveness, so that one can determine more accurately whether a drug really works. The FDA, like the European Medicines Agency, does consider both clinical and statistical evidence in evaluating drug efficacy. The question is how to provide a robust framework for weighing the evidence. I support the general claims of Ocana and Tannock (4) but argue that they do not go far enough.
I provide my own assessment of the statistical evaluation of drug efficacy and give my perspective on its future direction.
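To make the four claims of Ocana and Tannock concrete, the following minimal sketch compares the absolute difference, the relative difference, and the P value for a two-arm trial with a binary endpoint. Everything in it is an illustrative assumption rather than material from the commentary or from any real trial: the event counts, the sample sizes, the threshold `DELTA`, the helper name `two_proportion_summary`, and the choice of a pooled two-proportion z-test are all hypothetical.

```python
import math


def two_proportion_summary(events_control, n_control, events_treated, n_treated):
    """Summarize a two-arm comparison of response rates (hypothetical helper)."""
    p_control = events_control / n_control
    p_treated = events_treated / n_treated

    # Absolute difference (in proportion units) and difference relative to control.
    abs_diff = p_treated - p_control
    rel_diff = abs_diff / p_control

    # Pooled two-proportion z-test, two-sided P value from the normal distribution.
    pooled = (events_control + events_treated) / (n_control + n_treated)
    std_err = math.sqrt(pooled * (1 - pooled) * (1 / n_control + 1 / n_treated))
    z = abs_diff / std_err
    p_value = math.erfc(abs(z) / math.sqrt(2))

    return {"abs_diff": abs_diff, "rel_diff": rel_diff, "z": z, "p_value": p_value}


# Invented data: 10% response in the control arm, 12% in the experimental arm,
# with 5000 patients per arm.
summary = two_proportion_summary(500, 5000, 600, 5000)

# Assumed prespecified clinically meaningful absolute benefit (the delta of claim 3).
DELTA = 0.05

statistically_significant = summary["p_value"] < 0.05
clinically_meaningful = summary["abs_diff"] >= DELTA

print(f"absolute difference: {summary['abs_diff']:.3f}")
print(f"relative difference: {summary['rel_diff']:.1%}")
print(f"two-sided P value:   {summary['p_value']:.4f}")
print(f"statistically significant: {statistically_significant}")
print(f"meets prespecified delta:  {clinically_meaningful}")
```

With these invented numbers, a large sample makes a 2-percentage-point absolute gain statistically significant even though it falls short of the assumed δ, and the same gain looks far larger when expressed as a 20% relative increase.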