Estimation of clinical trial success rates and related parameters

SUMMARY Previous estimates of drug development success rates rely on relatively small samples from databases curated by the pharmaceutical industry and are subject to potential selection biases. Using a sample of 406 038 entries of clinical trial data for over 21 143 compounds from January 1, 2000 to October 31, 2015, we estimate aggregate clinical trial success rates and durations. We also compute disaggregated estimates across several trial features including disease type, clinical phase, industry or academic sponsor, biomarker presence, lead indication status, and time. In several cases, our results differ significantly in detail from widely cited statistics. For example, oncology has a 3.4% success rate in our sample vs. 5.1% in prior studies. However, after declining to 1.7% in 2012, this rate has improved to 2.5% and 8.3% in 2014 and 2015, respectively. In addition, trials that use biomarkers in patient-selection have higher overall success probabilities than trials without biomarkers.


A2. Path-by-path vs. phase-by-phase
This paper uses the path-by-path method of computing the probability of success, where we identify all the drug development paths before computing the proportion of paths that make it through from Phase 1 to approval. In contrast, the phase-by-phase method computes the proportion of observed phase transitions from one phase to the next before multiplying the individual probabilities in each stage to produce the overall probability of success.
It is not uncommon for datasets to contain missing data points. For example, for some drugs and indications, we observe Phase 1 trials and Phase 3 trials, but not Phase 2 trials. This may occur because there is an error in data collection and data processing, or for other reasons. In this example, we do not observe any Phase 2 trials for Drug Development 001. Our idealized model imputes the phase for the drug development and our 'path-by-path' method computes $\mathrm{POS}_{1,2}$, $\mathrm{POS}_{2,3}$, $\mathrm{POS}_{3,\mathrm{APP}}$, and $\mathrm{POS}_{1,\mathrm{APP}}$ to be $1$, $2/3$, $1/2$, and $1/3$, respectively. In contrast, the 'phase-by-phase' method does not impute the phase and will compute $\mathrm{POS}_{1,2}$, $\mathrm{POS}_{2,3}$, $\mathrm{POS}_{3,\mathrm{APP}}$, and $\mathrm{POS}_{1,\mathrm{APP}}$ to be $1$, $1/2$, $1/2$, and $1/4$, respectively.
We treat these cases as successes in our methodology. While we acknowledge that this may produce higher success rates for Phase 1 and Phase 2 trials, we find it only logical to include these 'missing' data points, as they definitely must have occurred in a development path. (We give an example of how the phase-by-phase method underestimates the POS in Figure S3.) In addition, we perform Monte Carlo simulations to demonstrate the impact of ignoring 'missing' phase transitions. Setting $\mathrm{POS}_{1,2}$, $\mathrm{POS}_{2,3}$, and $\mathrm{POS}_{3,\mathrm{APP}}$ to be 0.5, we generate 1,000 drug development paths randomly and corrupt them to simulate missing phase transitions. We then run the phase-by-phase and path-by-path computations on the simulated data. As can be seen in Figure S4, which plots the means of 1,000 such runs, the path-by-path method accurately estimates the POS, while the phase-by-phase method underestimates the POS.
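A minimal sketch of such a simulation is given below. It is not the code behind Figure S4: the corruption rule (each record before the furthest observed stage is lost with probability `P_MISSING`), the value 0.2, the reduced number of runs, and both counting functions are assumptions made for illustration. Stage 4 stands for approval.

```python
import random

P_ADVANCE = 0.5   # POS_{1,2} = POS_{2,3} = POS_{3,APP} = 0.5, as in the text
P_MISSING = 0.2   # assumed chance that an earlier-stage record is lost
N_PATHS = 1000    # paths per run, as in the text
N_RUNS = 200      # the text averages 1,000 runs; fewer are used here for speed


def simulate_path(rng):
    """Return the set of stages (1..4, 4 = approval) a project truly reaches."""
    stage = 1
    while stage < 4 and rng.random() < P_ADVANCE:
        stage += 1
    return set(range(1, stage + 1))


def corrupt(path, rng):
    """Drop records of earlier stages at random; the furthest stage is kept."""
    furthest = max(path)
    return {s for s in path if s == furthest or rng.random() >= P_MISSING}


def path_by_path(paths):
    """Overall POS: share of paths whose furthest observed stage is approval."""
    return sum(max(p) == 4 for p in paths) / len(paths)


def phase_by_phase(paths):
    """Overall POS: product of per-phase rates, using observed records only."""
    overall = 1.0
    for k in (1, 2, 3):
        observed = [p for p in paths if k in p]
        advanced = sum(max(p) > k for p in observed)
        overall *= advanced / len(observed)
    return overall


def run_experiment(seed=0):
    """Average both estimators of the overall POS over N_RUNS simulated datasets."""
    rng = random.Random(seed)
    path_total = phase_total = 0.0
    for _ in range(N_RUNS):
        data = [corrupt(simulate_path(rng), rng) for _ in range(N_PATHS)]
        path_total += path_by_path(data)
        phase_total += phase_by_phase(data)
    return path_total / N_RUNS, phase_total / N_RUNS


mean_path, mean_phase = run_experiment()
print(f"true overall POS: {P_ADVANCE ** 3:.3f}")
print(f"path-by-path mean: {mean_path:.3f}, phase-by-phase mean: {mean_phase:.3f}")
```

Under this corruption rule, the path-by-path estimate depends only on the furthest observed stage, so it is unaffected by the lost records, whereas the phase-by-phase denominators lose successful trials but never failed ones.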
However, the path-by-path approach is not suitable for analyzing instances where one does not have full information about the drug development programs, such as a rolling-window computation where the time window is much shorter than the complete drug development period (typically around a decade). This is because our algorithm aggressively imputes the 'missing' phase transitions when it is given only a snippet of information. We give a fictitious example to illustrate this. As can be seen, our algorithm inferred all the phase transitions for the drug development project given the latest information at that point in time. While the algorithm works accurately when one has a massive database across long time horizons, it is unable to provide an accurate assessment of changes in success rates over short time windows. In our example, the Phase 1 trial is repeatedly counted as a success across multiple time windows, and this inflates the estimate of the success rate of Phase 1 trials in a short interval. When this situation occurs, we use the phase-by-phase approach.
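The double counting can be illustrated with a small sketch. The project below, its trial years, and the 3-year window are hypothetical, and `credited_successes` is a simplified stand-in for the imputation step, not Algorithm 1 itself.

```python
# Hypothetical project: the year in which each phase's trial is observed.
TRIAL_YEARS = {1: 2001, 2: 2005, 3: 2009}
WINDOW = 3  # assumed window length in years, much shorter than the program


def credited_successes(start):
    """Phases counted as successes when only [start, start + WINDOW) is visible."""
    seen = [phase for phase, year in TRIAL_YEARS.items()
            if start <= year < start + WINDOW]
    if not seen:
        return set()
    # Imputation: every phase before the furthest one seen is assumed to
    # have taken place and succeeded, even if its trial is outside the window.
    return set(range(1, max(seen)))


window_starts = range(1999, 2010)  # rolling windows, one-year step
phase1_credits = sum(1 in credited_successes(s) for s in window_starts)
print(f"Phase 1 counted as a success in {phase1_credits} windows")
```

In this toy setting, the single Phase 1 trial is credited as a success in every window that contains the Phase 2 or the Phase 3 trial, although it was conducted only once.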
A subtle but important difference between the two computation methods is that, while the path-by-path approach measures the proportion of drug development projects that progress, the phase-by-phase approach measures the proportion of phase transitions that occur. The two measures produce the same results if there are no missing data points. However, this condition does not hold in real-life clinical trial databases. Applying the phase-by-phase algorithm to the entire dataset, we find that it tends to underestimate the success rate. Nevertheless, the phase-by-phase method is a strong enough proxy to estimate trends in drug development success rates.
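The difference between the two measures can be made concrete with a short sketch. The three development paths below are a hypothetical dataset chosen to be consistent with the fractions quoted earlier in this section; they are not taken from Figure S3. The counting conventions in `path_by_path` and `phase_by_phase` are one possible reading of the two methods, not the authors' implementation. Stage 4 stands for approval.

```python
from fractions import Fraction

STAGES = [1, 2, 3, 4]  # Phase 1, Phase 2, Phase 3, approval

# Hypothetical observed stages per drug development path.
# Path "001" has no observed Phase 2 trial.
PATHS = {
    "001": {1, 3},
    "002": {1, 2, 3, 4},
    "003": {1, 2},
}


def path_by_path(paths):
    """Impute skipped stages: a path passed every stage up to its furthest one."""
    furthest = [max(stages) for stages in paths.values()]
    reached = {k: sum(f >= k for f in furthest) for k in STAGES}
    steps = [Fraction(reached[k + 1], reached[k]) for k in STAGES[:-1]]
    overall = Fraction(reached[4], reached[1])
    return steps, overall


def phase_by_phase(paths):
    """No imputation: only observed trials enter each phase's denominator."""
    steps = []
    for k in STAGES[:-1]:
        observed = [stages for stages in paths.values() if k in stages]
        advanced = [stages for stages in observed if max(stages) > k]
        steps.append(Fraction(len(advanced), len(observed)))
    overall = steps[0] * steps[1] * steps[2]
    return steps, overall


print("path-by-path:  ", path_by_path(PATHS))
print("phase-by-phase:", phase_by_phase(PATHS))
```

Under these assumptions, the path-by-path function returns 1, 2/3, 1/2 and an overall value of 1/3, while the phase-by-phase function returns 1, 1/2, 1/2 and an overall value of 1/4. The gap arises because the unobserved Phase 2 success of path 001 never enters the phase-by-phase count.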

A3. Algorithm
Algorithm 1 - Identifying trials in a drug development and computing the probability of success

A4. All indications versus lead indications
The model and algorithm presented in SECTION A3 considered each drug-indication pair as a unique development path. Some analysts, however, are interested in the lead indication for a given drug, i.e., the indication that has progressed furthest in the development pipeline. If there is more than one indication in the highest phase of the pipeline, the indication that reached the phase first will be considered the lead indication. Indication B in Fig. S2 is the lead indication, as it is the only indication for which the drug is approved. We argue that using lead indications in financial analysis is problematic.
First, the definition of lead indication makes it confusing to analyze phase transition probabilities. Consider the following example: Suppose that a company at time t completes Phase 2 clinical trials for two indications, Ind A and Ind B. It then decides to conduct a Phase 3 trial for Ind A, making Ind A the lead indication for the drug at t + 1. A short time later, at t + 2, the company reconsiders its priorities and decides to accelerate development of the drug for Ind B. Ind B makes it to the market earlier than Ind A, and is now the lead indication for the drug. Hence, depending on when one takes a snapshot of the data, one may end up with different lead indications and different estimates of the indication-specific phase transition probabilities. As such, considering all indications in computing the phase transition probabilities is more robust and accurate.
Second, from a financial perspective, it may be more informative to use indication-specific drug development paths to compute the different metrics. Very often, a New Drug Application (NDA) specifies the indication that the drug is intended to treat and its dosage, and a company would need to submit another application if it wishes to market the drug for another disease or dosage. Since the patient segment determines the market size and thus the financial potential of the drug, it is more appropriate to use indication-specific probabilities in the financial analysis of drug development endeavors.
Table S3. Probability of success with and without biomarkers, using data from January 1, 2005, to October 31, 2015, computed using the phase-by-phase method. These results consider trials that have the objective of evaluating or identifying the use of any novel biomarkers as indicators of therapeutic efficacy or toxicity, in addition to patient stratification. Since the biomarker status of the majority (92.3%) of trials using biomarkers is observed only on or after January 1, 2005, this time period is chosen to ensure a fair comparison between trials using and not using biomarkers.

A7. Comparison of results for biomarker trials against Thomas and others (2016)
Our results for trials using biomarkers are very different from those in extant papers such as Thomas and others (2016). The authors of Thomas and others (2016) kindly shared their analysis with us, allowing us to compare and contrast the methodologies and results. The main differences between the two analyses are in the identification of phase transitions, the application of filters, and the quantity of data (see Table S4).
| | Thomas and others (2016) | This paper |
|---|---|---|
| Identification of phase transitions | From BioMedTracker database | Using Algorithm 1 in Figure S5 |
| What constitutes a biomarker trial? | Considered only biomarkers in patient selection | Considered to 'involve biomarkers' if a trial includes an objective of evaluating or identifying the use of any novel biomarkers as indicators of therapeutic efficacy or toxicity, or to use biomarkers in the selection of patients |
| Data source | Merges BioMedTracker with Amplion's BiomarkerBase. Only trials from clinicaltrials.gov were used as NCT numbers were used as trial identifiers. Analysis consists of 512 phase transitions. | Uses trials tagged as 'involve biomarker' by Informa. Both clinicaltrials.gov and private information were used, summing up to 10,650 phase transitions. |

Table S4. Differences between the biomarker study in Thomas and others (2016) and this paper.
Thomas and others (2016) provided a sample of 1,593 trial entries for comparison. Of these, 722 entries are used in their analysis. We merged our algorithm output with this sample of trials to produce tag outcomes for 1,065 of the 1,593 entries. Only 438 data points exist in both analyses.
Our algorithm is unable to produce outcomes for some trials for which Thomas and others (2016) did because an insufficient period has passed since the conclusion of the trial. This relates to the t1, t2, and t3 parameters in our algorithm.
Of the 438 overlapping data points, our algorithm arrived at the same conclusion as Thomas and others (2016) for 90.0% of the data, suggesting that our algorithm identifies phase transitions accurately.
We compare our outcomes with those of Thomas and others (2016) in Table S5. We see that our algorithm tends to identify more failures than Thomas and others (2016). This may be due to our method of counting a trial that is in limbo for an extended period of time as 'terminated'. Given these checks, we conclude that our results differ from those of Thomas and others (2016) mainly because we use Algorithm 1 to process more trial data to produce POS estimates.

A8. Probability of Success over Time
The following tables supplement SECTION 4.4. We tabulate the POS over time for each therapeutic group.

A10. Completion Rates
An alternative measure of performance for clinical trials is the completion rate. It answers the question, "How likely is a trial to complete?" The completion rate of Phase i trials ($\mathrm{CR}_i$) is computed by dividing the number of trials in Phase i that were tagged as 'completed' by the number of trials that have been initiated in Phase i. This metric is useful in real option valuation, where uncertain possible outcomes with various endpoints are implicitly modeled in order to provide a more robust and comprehensive cost-benefit analysis. Our data shows that clinical trial completion rates are high across all phases, averaging at 85.8% (Table S16). The completion rates for non-industry sponsored trials are provided in SECTION A13.
Table S16. Completion rates of industry-sponsored clinical trials (i.e., the number of trials that were tagged as completed divided by the number of trials that were initiated) by phase and therapeutic group, using the entire dataset from January 1, 2000, to October 31, 2015.
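The completion rate defined in this section can be sketched as follows. The record layout (a phase number paired with a status tag) and the sample records are hypothetical and are used only to show the arithmetic.

```python
from collections import Counter


def completion_rates(trials):
    """Return CR_i = completed trials in Phase i / initiated trials in Phase i."""
    initiated = Counter(phase for phase, _ in trials)
    completed = Counter(phase for phase, status in trials if status == "completed")
    return {phase: completed[phase] / initiated[phase] for phase in initiated}


# Hypothetical records: (phase, status tag).
SAMPLE_TRIALS = [
    (1, "completed"), (1, "completed"), (1, "terminated"),
    (2, "completed"), (2, "terminated"),
    (3, "completed"),
]

rates = completion_rates(SAMPLE_TRIALS)
for phase in sorted(rates):
    print(f"CR_{phase} = {rates[phase]:.3f}")
```

Every initiated trial enters the denominator regardless of its final tag, which is what distinguishes this measure from the probability of success.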

A11. Duration
One principal component of the cost of conducting a trial is its expected duration. All else being equal, one would expect that a longer trial would require more hours of labor and supplies, resulting in a higher cost. In addition, from a financial perspective, a longer trial is exposed to more uncertainties. We quantify the distribution of the duration of trials in order to inform companies and investors of the potential risk in a project. We assume that there is no underlying process that induces gaps in the data. We drop trial data without date-stamps for the start or the end of the trial, as we cannot make a statement on the time spent in development for these trials. After data processing, 99,363 trials remain for our computations. Our data has a resolution of 1 calendar month.
The distribution of duration varies widely across different therapeutic groups and phases (Table S17). Taking cues from Abrantes-Metz and others (2005), we also compute the duration of trials conditioned on their eventual status ('advanced' or 'terminated') using a 5-year rolling window (Figure S6). With our larger dataset, we found that Phase 2 trials that were terminated tend to conclude 8.1 months earlier than Phase 2 trials that advanced (Table S18). Terminated Phase 3 trials, however, tend to conclude about 3.2 months after Phase 3 trials that successfully advanced. The difference within the Phase 1 group is insignificant; while we see a difference of 53 days, this is within our margin of error, given that the resolution for a time period is 2 calendar months, or 60 days. By composing a time series using 5-year rolling windows (see Figure S6), we see that the differences (or lack thereof) remain constant over time.
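The conditioning on eventual status can be sketched as below. The records are hypothetical, dates are held as (year, month) pairs to mirror the monthly date-stamps, trials without a start or end date-stamp are dropped as described above, and the median is an assumed summary statistic, since the text does not state which one underlies the reported differences.

```python
from statistics import median


def duration_months(start, end):
    """Duration in whole calendar months between two (year, month) stamps."""
    (start_year, start_month), (end_year, end_month) = start, end
    return (end_year - start_year) * 12 + (end_month - start_month)


def median_duration_by_status(trials):
    """Median duration per status, skipping trials with a missing date-stamp."""
    by_status = {}
    for status, start, end in trials:
        if start is None or end is None:
            continue  # cannot measure time in development for this trial
        by_status.setdefault(status, []).append(duration_months(start, end))
    return {status: median(values) for status, values in by_status.items()}


# Hypothetical records: (eventual status, start stamp, end stamp).
SAMPLE_DURATIONS = [
    ("advanced", (2005, 1), (2007, 3)),
    ("advanced", (2006, 6), (2008, 12)),
    ("terminated", (2005, 4), (2006, 10)),
    ("terminated", (2007, 2), None),  # dropped: no end date-stamp
    ("terminated", (2008, 1), (2009, 11)),
]

medians = median_duration_by_status(SAMPLE_DURATIONS)
gap = medians["terminated"] - medians["advanced"]
print(f"terminated minus advanced: {gap} months")
```

A negative gap means that terminated trials concluded earlier than trials that advanced.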

A12. Distribution of Duration
In this section, we document the distribution of duration conditioned on the indication group and phase in order to inform interested readers.

A13. Non-Industry Trials
The clinical research sector outside the pharmaceutical industry is an integral part of drug research and development. Not only is this sector actively involved with industry in conducting trials, but academics and hospitals also conduct fundamental research that furthers understanding of basic pharmacokinetics, among other phenomena measured in clinical trials. We thus seek to quantify the performance of this sector.
As our database does not record non-industry approvals, we supplement our dataset with data from Drugs@FDA, the U.S. Food and Drug Administration's (FDA) approved drugs database. In all, 53 drug approvals for 17 unique compounds were awarded to non-industry organizations (see Table S20). Of these, only three compounds were non-generic: two were awarded to the U.S. Army and the remaining compound is a PET imaging diagnostic agent. The remaining drugs are generic compounds whose patents have expired and have been awarded to hospitals and non-profits.
Given the altruistic aims of organizations outside the industry, and the fact that virtually no approvals for novel drugs have been granted by the FDA to these organizations, we look at only the completion rates for non-industry trials. We find that, although Phase 1 trials conducted outside the industry have lower completion rates than those within the industry, non-industry organizations outperform industry in completing Phase 2, Phase 3, and Phase 4 trials (compare Tables S16 and S19). This suggests that each group has a relative advantage in completing different phases of clinical trials, and that there may be exploitable synergies to be gained when working together.
Computing the POS of drug development projects conditioned on the status and number of non-industry partners (