Abstract

Background

Antibiotic noninferiority randomized controlled trials (RCTs) are used for approval of new antibiotics and making changes to antibiotic prescribing in clinical practice. We conducted a systematic review to assess the methodological and reporting quality of antibiotic noninferiority RCTs.

Methods

We searched MEDLINE, Embase, the Cochrane Database of Systematic Reviews, and the Food and Drug Administration drug database from inception until November 22, 2019, for noninferiority RCTs comparing different systemic antibiotic therapies. Comparisons between antibiotic types, doses, administration routes, or durations were included. Methodological and reporting quality indicators were based on the Consolidated Standards of Reporting Trials reporting guidelines. Two independent reviewers extracted the data.

Results

The systematic review included 227 studies. Of these, 135 (59.5%) studies were supported by the pharmaceutical industry. Only 83 (36.6%) studies provided a justification for the noninferiority margin. Both intention-to-treat (ITT) and per-protocol (PP) analyses were reported in 165 (72.7%) studies. The conclusion was misleading in 34 (15.0%) studies. The studies funded by the pharmaceutical industry were less likely to be stopped early because of logistical reasons (3.0% vs 19.1%; odds ratio [OR] = 0.13; 95% confidence interval [CI], .04–.37) and to show inconclusive results (11.1% vs 42.9%; OR = 0.17; 95% CI, .08–.33). The quality of studies decreased over time with respect to blinding, early stopping, reporting of ITT with PP analysis, and having misleading conclusions.

Conclusions

There is room for improvement in the methodology and reporting of antibiotic noninferiority trials. Quality can be improved across the entire spectrum, from investigators and funding agencies to the peer-review process.

There is room for improvement in the methodology and reporting of antibiotic noninferiority trials including justification of noninferiority margin, reporting of intention-to-treat analysis with per-protocol analysis, and having conclusions that are concordant with study results.

Clinical Trials Registration. PROSPERO registration number CRD42020165040.

Noninferiority randomized controlled trials (RCTs) are commonly used study designs to evaluate drug therapy [1]. They have unique methodological considerations including sample size calculations, a priori–defined noninferiority margins and inclusion of both intention-to-treat (ITT) and per-protocol (PP) analyses [2, 3]. Variation in the reporting of noninferiority RCTs can result in misinterpretation of results, overestimation of benefits, and underestimation of harms. A Consolidated Standards of Reporting Trials (CONSORT) statement was published as a guideline for the reporting of noninferiority RCTs [2, 3].

There is emerging evidence that the methodology and reporting quality of noninferiority RCTs have deficiencies when measured against the CONSORT standards. Major concerns include lack of blinding, no justification of noninferiority margin, reporting of only ITT analysis, and misleading conclusions [4–6].

In infectious diseases research, noninferiority RCTs are used to evaluate antibiotics. In a systematic review of noninferiority RCTs of drugs, approximately 20% of trials evaluated an anti-infective agent [6]. Almost all clinical trials for approval of new antibiotics were noninferiority RCTs in the past 30 years [7]. The US Food and Drug Administration (FDA) and European Medicines Agency have provided guidance on the use of noninferiority trials to support approval of antibacterial drug products [8, 9].

The use of only noninferiority trials for the evaluation and approval of new antibiotics has been controversial. There is a concern for lack of assay sensitivity, which is the ability of a study to distinguish between active and inactive treatment [10]. In a noninferiority trial, a new treatment may be shown to be noninferior to the control treatment. However, it is possible that both the new treatment and the control treatment were no better than placebo, so the noninferiority trial cannot provide reliable evidence of the effectiveness of a new therapy [10]. New antibiotics approved on the basis of noninferiority trials usually are from an existing antibiotic class and do not have better outcomes, reflecting stagnancy and lack of innovation [11]. In response to this, experts have suggested the use of innovative superiority trial designs to evaluate new antibiotics [7, 12]. As well, the landscape of antibiotic research is evolving with proposed changes such as a shift to nonprofit organizations [13] and prioritization of targets for drug development [14]. After considering and balancing feasibility, ethics, and incentive for the pharmaceutical industry, a noninferiority trial is nevertheless an acceptable trial design if done in a rigorous manner, as outlined in prior statements from the Infectious Diseases Society of America for different infection syndromes [15–17]. Therefore, it is all the more important for antibiotic noninferiority RCTs to be sound in methodology and reporting because they result in the approval of new antibiotics and changes to antibiotic prescribing in clinical practice and guidelines.

It is unclear if the methodological and reporting deficiencies and concerns found in noninferiority RCTs in general also exist in antibiotic noninferiority RCTs. The primary aim of this systematic review was to evaluate the methodological and reporting quality specifically in noninferiority RCTs on antibiotics.

METHODS

This review was conducted and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses guidelines (see checklist in Supplementary Text 1) [18]. The study protocol was registered with PROSPERO (CRD42020165040).

Data Sources and Selection Criteria

We searched MEDLINE, Embase, and the Cochrane Database of Systematic Reviews from inception to November 22, 2019. The detailed search strategy was developed with a research librarian (Supplementary Text 2). We used the FDA drugs database to supplement this search [19]. For new antibiotics that were approved, we read through the FDA antibiotic approvals and labels to find the clinical studies that supported the approval and were also published in journal articles.

We included studies published in English that were identified as noninferiority RCTs in humans comparing 2 or more systemic antibiotic regimens used to treat a bacterial infection. Studies were included if the treatment and comparison arms differed in terms of antibiotic type, dose, administration, and/or duration.

Trials comparing an antibiotic to placebo alone were excluded, as were commentaries, reviews, study protocols, secondary analyses, and conference proceedings. We excluded trial registrations where the results were not published in a journal article. Phase 2 and pilot studies were identified and excluded after full text reading.

Data Extraction

Reviewers screened abstracts after appropriate training to identify potentially relevant studies and extract full texts for reading. The first 300 abstracts that each reviewer screened were double checked by another independent reviewer for consistency. If consistent, the reviewer then screened abstracts independently. For the full text review, 2 independent reviewers read and extracted the data in duplicate onto a standardized extraction form. Disagreements between reviewers were resolved by discussion to reach consensus and, if necessary, adjudication by a third reviewer.

Variables Collected

Study characteristics of interest included journal, year of study, study center(s), study population, sample size, treatment arms, infectious disease syndrome, rationale for noninferiority design, funding, sample size calculation, outcomes, and interpretation of results. Funding was classified as supported by pharmaceutical industry, not supported by pharmaceutical industry, or unclear based on the conflict of interest and funding statements. Studies on new antibiotics were defined as studies that were conducted before or within 5 years of FDA approval of the antibiotic that the study was focused on.

We used a standardized definition to interpret the results that considers the point estimate, confidence interval (CI), and the noninferiority margin based on the FDA recommendations for superiority, noninferiority, inferiority, and inconclusive results (Figure 1) [8]. The study results were also deemed to be inconclusive if both ITT and PP analysis results were reported and were inconsistent with one another. We compared our conclusion of study results to the authors’ conclusions in the journal article. A misleading conclusion was defined as an authors’ conclusion that was discordant with our conclusion following the FDA definition described previously [4].

Figure 1.

Interpretation of point estimate and confidence interval. Bars indicate confidence interval. Δ is the noninferiority margin. Interpretation is as follows for the new treatment compared to old treatment: A is superior; B and C are noninferior; D and E are inconclusive, where noninferiority would be rejected; F meets the noninferiority margin criteria, but is actually inferior as the lower bound is above 0; and G and H are inferior.
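
The interpretation rules described above can be sketched in a few lines of Python. This is only an illustrative reading of the definitions, not the reviewers' actual procedure. It assumes the effect is expressed as new minus old for a favorable outcome (so negative values favor the old treatment) and that the margin Δ is a positive number; the sign convention in the figure itself may be the reverse. The names `Estimate`, `interpret`, `overall_interpretation`, and `is_misleading` are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Estimate:
    """Effect estimate (new minus old, higher is better) with its CI bounds."""
    point: float
    lower: float
    upper: float


def interpret(est: Estimate, margin: float) -> str:
    """Classify one analysis against a positive noninferiority margin."""
    if margin <= 0:
        raise ValueError("margin must be a positive number")
    if est.lower > 0:
        return "superior"        # A: whole CI favors the new treatment
    if est.lower > -margin:
        if est.upper < 0:
            return "inferior"    # F: inside the margin, but CI excludes 0
        return "noninferior"     # B, C: CI excludes the margin, includes 0
    if est.upper < 0:
        return "inferior"        # G, H: CI reaches the margin and excludes 0
    return "inconclusive"        # D, E: CI includes both the margin and 0


def overall_interpretation(itt: Estimate, pp: Estimate, margin: float) -> str:
    """Inconclusive when the ITT and PP analyses disagree with each other."""
    itt_result = interpret(itt, margin)
    pp_result = interpret(pp, margin)
    return itt_result if itt_result == pp_result else "inconclusive"


def is_misleading(author_conclusion: str, derived_conclusion: str) -> bool:
    """A conclusion is misleading when it is discordant with the derived one."""
    return author_conclusion != derived_conclusion
```

For example, a trial whose ITT interval lies within the margin but whose PP interval crosses it would be classified as inconclusive overall, and an authors' claim of noninferiority for that trial would be flagged as misleading.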

Primary Outcomes

The primary outcomes were quality indicators specific to noninferiority RCTs, which were selected from the checklist in the CONSORT guidelines [2, 3] and prior systematic reviews on the quality of noninferiority trials [4–6]. For each quality indicator, the reviewer determined if it was present or absent [4–6]. Quality indicators included inclusion of noninferiority in title, blinding, specification of noninferiority margin, justification of noninferiority margin, type of CI used, concordance of type I error rate used with CI, early stopping, reporting of ITT with PP analyses, handling of missing data, and comparison to historical control. As well, we recorded the use of a figure to illustrate point estimate, CI, and noninferiority margin. “Double-blinded” was defined specifically as blinding of the participant and individuals providing care. Early stopping because of logistics was defined as stopping the study before reaching target sample size for logistical reasons such as funding or recruitment. This definition excludes early stopping based on interim analyses results that met the prespecified early stopping rule.

Risk of Bias

Two independent reviewers assessed the risk of bias in duplicate using the Cochrane Collaboration’s tool for assessing risk of bias in randomized trials [20].

Comparison

The primary comparison was based on pharmaceutical funding or no pharmaceutical funding. Secondary comparison was an examination of trend over time by 3 publication year periods: 2007 or before, 2008–2013, and 2014 or after. This was based on the CONSORT statement publication years of 2006 and 2012 [2, 3]. We judged that the cutoff time point of 2007 and 2013 allowed sufficient time for the authors and journals to implement the newly published guidelines.
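
As a small illustration of the secondary comparison, the grouping by publication period could be expressed as below. This is a sketch under the stated cutoffs only; the function name `publication_period` and the example years are hypothetical and are not taken from the study data.

```python
from collections import Counter


def publication_period(year: int) -> str:
    """Map a publication year to one of the 3 periods used for the trend analysis."""
    if year <= 2007:
        return "2007 or before"
    if year <= 2013:
        return "2008–2013"
    return "2014 or after"


# Hypothetical publication years, used only to show the grouping.
example_years = [2003, 2007, 2008, 2013, 2014, 2019]
period_counts = Counter(publication_period(y) for y in example_years)
```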

Statistical Analysis

Descriptive analyses included number (percentage) for categorical variables and median (interquartile range) for continuous variables. Comparisons between groups were done with Fisher’s exact test for categorical variables and the Wilcoxon rank-sum test for continuous variables. The odds ratio (OR) and its 95% CI were calculated for categorical variables using a logistic regression model. The studies included in the systematic review were heterogeneous in terms of clinical syndrome, antibiotics used, and outcomes. Therefore, a pooled meta-analysis was not attempted. All tests were 2-sided with a significance level of P < .05. All analyses were done with R, version 3.6.3 (R Foundation for Statistical Computing, Vienna, Austria).
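
To make the group comparison concrete, the following Python sketch computes an odds ratio and a 2-sided Fisher’s exact P value for a single 2 × 2 table, using the early-stopping counts reported in this review (4 of 135 industry-supported studies vs 16 of 84 nonpharmaceutical-supported studies). It is not the authors' R code. For one binary predictor, the logistic regression odds ratio equals the cross-product ratio used here, but the Wald interval below is an assumption about the interval type and need not reproduce the published interval exactly. The function names are hypothetical.

```python
import math
from math import comb


def odds_ratio_wald(a: int, b: int, c: int, d: int, z: float = 1.96):
    """OR and Wald CI; a, b = events, non-events in group A; c, d = in group B."""
    if min(a, b, c, d) == 0:
        raise ValueError("a zero cell makes the odds ratio unstable")
    odds_ratio = (a * d) / (b * c)
    se_log_or = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)
    log_or = math.log(odds_ratio)
    return (odds_ratio,
            math.exp(log_or - z * se_log_or),
            math.exp(log_or + z * se_log_or))


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """2-sided Fisher's exact P value: sum of tables no more likely than observed."""
    row1, row2, col1 = a + b, c + d, a + c
    total = row1 + row2

    def pmf(x: int) -> float:
        # Hypergeometric probability of x events in group A given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / comb(total, col1)

    observed = pmf(a)
    support = range(max(0, col1 - row2), min(row1, col1) + 1)
    p_value = sum(pmf(x) for x in support if pmf(x) <= observed * (1 + 1e-9))
    return min(1.0, p_value)


# Early stopping due to logistics: 4/135 (group A) vs 16/84 (group B, reference).
or_est, ci_low, ci_high = odds_ratio_wald(4, 131, 16, 68)
p_fisher = fisher_exact_two_sided(4, 131, 16, 68)
```

The point estimate rounds to the 0.13 reported in Table 2; the Wald upper bound is somewhat wider than the published one, which is consistent with the published interval having been computed by a different method.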

RESULTS

Our literature search yielded 6017 records, which included 4009 unique abstracts. At the abstract screening stage, 3766 abstracts were excluded, leaving 243 studies for full text review. After full text review, 227 studies were included in the analysis (Figure 2). Over time, the number of antibiotic noninferiority trials has increased (Supplementary Table 1).

Figure 2.

Flow diagram of study selection process.

Study Characteristics

Of the 227 studies, the most common infectious disease syndromes studied were community-acquired pneumonia (15.9%), skin and soft-tissue infection (15.0%), and urinary tract infection (11.5%) (Table 1 and Supplementary Table 2). The most common rationales for a noninferiority design were testing new antibiotics (44.1%), shorter duration (17.6%), and older antibiotics as alternative therapeutic options (16.3%).

Table 1.

Study Characteristics

| | All Studies (N = 227) | Group A: Industry-supported Studies (N = 135) | Group B: Nonpharmaceutical-supported Studies (N = 84) | A vs B, B as Reference: OR (95% CI) | A vs B: P Value |
|---|---|---|---|---|---|
| Adults only | 182 (80.2%) | 117 (86.7%) | 58 (69.1%) | 2.91 (1.49–5.82) | .0029 |
| Multicenter | 203 (89.4%) | 132 (97.8%) | 66 (78.6%) | 12.00 (3.89–52.54) | <.0001 |
| Sample size per group, median (IQR) | 204 (114–313) | 254 (160–359) | 125 (70–245) | | <.0001 |
| Infectious disease syndrome | | | | | <.0001 |
|   CAP | 36 (15.9%) | 28 (20.7%) | 8 (9.5%) | 2.49 (1.12–6.11) | |
|   SSTI | 34 (15.0%) | 28 (20.7%) | 6 (7.1%) | 3.40 (1.43–9.45) | |
|   Urinary tract infection | 26 (11.5%) | 14 (10.4%) | 10 (11.9%) | 0.86 (.36–2.08) | |
|   Intra-abdominal infection | 21 (9.3%) | 16 (11.9%) | 5 (6.0%) | 2.12 (.80–6.71) | |
|   Tuberculosis | 14 (6.2%) | 1 (0.7%) | 12 (14.3%) | 0.04 (.002–.23) | |
|   Helicobacter pylori | 12 (5.3%) | 2 (1.5%) | 9 (10.7%) | 0.13 (.02–.50) | |
|   HAP or VAP | 12 (5.3%) | 11 (8.2%) | 0 (0%) | N/A | |
|   Other | 72 (31.7%) | 35 (25.9%) | 34 (40.5%) | 0.51 (.29–.92) | |
| Rationale | | | | | |
|   New antibiotics | 100 (44.1%) | 97 (71.9%) | 2 (2.4%) | 104.66 (30.85–655.83) | <.0001 |
|   Shorter duration | 40 (17.6%) | 10 (7.4%) | 28 (33.3%) | 0.16 (.07–.34) | <.0001 |
|   Alternative option | 37 (16.3%) | 18 (13.3%) | 17 (20.2%) | 0.61 (.29–1.26) | .1885 |
|   Easier administration | 27 (11.9%) | 19 (14.1%) | 7 (8.3%) | 1.80 (.75–4.80) | .2827 |
|   PO instead of parenteral | 18 (7.9%) | 3 (2.2%) | 15 (17.9%) | 0.10 (.02–.33) | <.0001 |
|   Better safety | 10 (4.4%) | 3 (2.2%) | 6 (7.1%) | 0.30 (.06–1.15) | .0893 |
|   Lower cost | 9 (4.0%) | 2 (1.5%) | 7 (8.3%) | 0.17 (.02–.70) | .0290 |
|   Narrower spectrum | 9 (4.0%) | 0 (0%) | 8 (9.5%) | N/A | .0004 |
| Primary outcome | | | | | .0086 |
|   Clinical outcome | 185 (81.5%) | 119 (88.2%) | 60 (71.4%) | 2.98 (1.48–6.12) | |
|   Microbiological outcome | 34 (15.0%) | 13 (9.6%) | 19 (22.6%) | 0.36 (.17–.78) | |
|   Both clinical and microbiological outcome | 4 (1.8%) | 2 (1.5%) | 2 (2.4%) | 0.62 (.07–5.22) | |
|   Other | 4 (1.8%) | 1 (0.7%) | 3 (3.6%) | 0.20 (.01–1.60) | |
| Recording of adverse events | 219 (96.5%) | 134 (99.3%) | 78 (92.9%) | 10.31 (1.72–196.57) | .0137 |
| Conclusion based on results | | | | | <.0001 |
|   Noninferiority shown | 157 (69.2%) | 109 (80.7%) | 44 (52.4%) | 3.81 (2.10–7.05) | |
|   Superiority shown | 5 (2.2%) | 5 (3.7%) | 0 (0%) | N/A | |
|   Inferiority shown | 10 (4.4%) | 6 (4.4%) | 4 (4.8%) | 0.93 (.26–3.73) | |
|   Inconclusive | 55 (24.2%) | 15 (11.1%) | 36 (42.9%) | 0.17 (.08–.33) | |

8 studies had unclear funding and so were not included in group A or B.

Abbreviations: CAP, community-acquired pneumonia; CI, confidence interval; HAP, hospital-acquired pneumonia; IQR, interquartile range; N/A, not applicable, where numbers were too few in a cell within the 2 × 2 table to make an accurate estimate of the odds ratio; OR, odds ratio; PO, route by mouth; SSTI, skin or soft-tissue infection; VAP, ventilator-associated pneumonia.

Only 83 (36.6%) studies had a justification for the specified noninferiority margin (Table 2). Both ITT and PP analysis results were reported in 165 (72.7%) studies. The authors’ conclusions were misleading in 34 (15.0%) studies (Supplementary Table 3). In 27 (79.4%) of these 34 studies, the authors concluded noninferiority, whereas the study results were inconclusive.

Table 2.

Methodology and Reporting Quality of Studies

| | All Studies (N = 227) | Group A: Industry-supported Studies (N = 135) | Group B: Nonpharmaceutical-supported Studies (N = 84) | A vs B, B as Reference: OR (95% CI) | A vs B: P Value |
|---|---|---|---|---|---|
| Noninferiority in title | 59 (26.0%) | 24 (17.8%) | 32 (38.1%) | 0.35 (.19–.65) | .0013 |
| Double-blinded | 120 (52.9%) | 101 (74.8%) | 18 (21.4%) | 10.89 (5.80–21.36) | <.0001 |
| Noninferiority margin specified | 221 (97.4%) | 135 (100%) | 80 (95.2%) | N/A | .0207 |
| Noninferiority margin justified | 83 (36.6%) | 44 (32.6%) | 37 (44.1%) | 0.61 (.35–1.08) | .1131 |
|   Justification by clinical basis | 32 (14.1%) | 5 (3.7%) | 27 (32.1%) | 0.08 (.03–.20) | <.0001 |
|   Justification by prior studies | 37 (16.3%) | 21 (15.6%) | 15 (17.9%) | 0.85 (.41–1.78) | .7091 |
|   Justification by guidelines | 2 (0.9%) | 1 (0.7%) | 1 (1.2%) | 0.62 (.02–15.80) | >.9999 |
|   Justification by regulatory bodies | 42 (18.5%) | 31 (23.0%) | 10 (11.9%) | 2.21 (1.05–4.99) | .0500 |
|   Justification by effect of control | 18 (7.9%) | 11 (8.2%) | 7 (8.3%) | 0.98 (.37–2.75) | >.9999 |
| Adequate information for sample size recalculation | 152 (67.0%) | 80 (59.3%) | 68 (81.0%) | 0.34 (.18–.64) | .0010 |
| 2-sided 95% CI or 1-sided 97.5% CI used | 193 (85.0%) | 129 (95.6%) | 59 (70.2%) | 9.11 (3.77–25.60) | <.0001 |
| Discordance between type I error rate used and CI | 9 (4.0%) | 2 (1.5%) | 6 (7.1%) | 0.20 (.03–.87) | .0568 |
| Early stopping due to logistics | 21 (9.3%) | 4 (3.0%) | 16 (19.1%) | 0.13 (.04–.37) | .0001 |
| Analysis used | | | | | .1326 |
|   ITT only | 41 (18.1%) | 23 (17.0%) | 17 (20.2%) | 0.81 (.40–1.64) | |
|   PP only | 21 (9.3%) | 8 (5.9%) | 11 (13.1%) | 0.42 (.16–1.08) | |
|   ITT and PP | 165 (72.7%) | 104 (77.0%) | 56 (66.7%) | 1.68 (.91–3.08) | |
| Handling of missing data | 69 (30.4%) | 50 (37.0%) | 19 (22.6%) | 2.01 (1.10–3.80) | .0358 |
|   Imputation of missing data | 18 (7.9%) | 9 (6.7%) | 9 (10.7%) | 0.60 (.22–1.59) | .3182 |
|   Worst case scenario | 55 (24.2%) | 41 (30.4%) | 14 (16.7%) | 2.18 (1.13–4.43) | .0253 |
|   Sensitivity analyses | 11 (4.9%) | 9 (6.7%) | 2 (2.4%) | 2.93 (.73–19.53) | .2111 |
| Point estimate with CI | 205 (90.3%) | 129 (95.6%) | 70 (83.3%) | 4.30 (1.65–12.60) | .0033 |
| Figure of point estimate, CI, and noninferiority margin | 44 (19.4%) | 24 (17.8%) | 20 (23.8%) | 0.69 (.35–1.36) | .3012 |
| Comparison to historical control | 20 (8.8%) | 13 (9.6%) | 7 (8.3%) | 1.17 (.46–3.24) | .8137 |
| Misleading conclusion | 34 (15.0%) | 13 (9.6%) | 19 (22.6%) | 0.36 (.17–.78) | .0105 |

8 studies had unclear funding and so were not included in group A or B.

Abbreviations: CI, confidence interval; ITT, intention-to-treat analysis; N/A, not applicable, where numbers were too few in a cell within the 2 × 2 table to make an accurate estimate of the odds ratio; OR, odds ratio; PP, per-protocol analysis.

Risk of Bias

Of the 227 studies, 112 (49.3%), 89 (39.2%), and 41 (18.1%) were at high risk for performance bias, detection bias, and reporting bias, respectively (Table 3, Supplementary Table 4, Supplementary Figure 1).

Table 3.

Assessment of Risk of Bias in Studies

| | All Studies (N = 227) | Group A: Industry-supported Studies (N = 135) | Group B: Nonpharmaceutical-supported Studies (N = 84) | A vs B, B as Reference: OR (95% CI) | A vs B: P Value |
|---|---|---|---|---|---|
| Randomization | | | | | .0185 |
|   High risk | 5 (2.2%) | 1 (0.7%) | 3 (3.6%) | 0.20 (.01–1.60) | |
|   Low risk | 155 (68.3%) | 88 (65.2%) | 65 (77.4%) | 0.55 (.29–1.01) | |
|   Unclear | 67 (29.5%) | 46 (34.1%) | 16 (19.1%) | 2.20 (1.17–4.31) | |
| Allocation concealment | | | | | .1211 |
|   High risk | 9 (4.0%) | 2 (1.5%) | 6 (7.1%) | 0.20 (.03–.87) | |
|   Low risk | 104 (45.8%) | 65 (48.2%) | 38 (45.2%) | 1.12 (.65–1.95) | |
|   Unclear | 114 (50.2%) | 68 (50.4%) | 40 (47.6%) | 1.12 (.65–1.93) | |
| Performance bias | | | | | <.0001 |
|   High risk | 112 (49.3%) | 39 (28.9%) | 66 (78.6%) | 0.11 (.06–.21) | |
|   Low risk | 109 (48.0%) | 91 (67.4%) | 17 (20.2%) | 8.15 (4.37–15.88) | |
|   Unclear | 6 (2.6%) | 5 (3.7%) | 1 (1.2%) | 3.19 (.50–61.74) | |
| Detection bias | | | | | <.0001 |
|   High risk | 89 (39.2%) | 28 (20.7%) | 54 (64.3%) | 0.15 (.08–.26) | |
|   Low risk | 131 (57.7%) | 102 (75.6%) | 28 (33.3%) | 6.18 (3.43–11.42) | |
|   Unclear | 7 (3.1%) | 5 (3.7%) | 2 (2.4%) | 1.58 (.33–11.19) | |
| Attrition bias | | | | | .6293 |
|   High risk | 78 (34.4%) | 45 (33.3%) | 28 (33.3%) | 1.00 (.56–1.79) | |
|   Low risk | 143 (63.0%) | 85 (63.0%) | 55 (65.5%) | 0.90 (.50–1.58) | |
|   Unclear | 6 (2.6%) | 5 (3.7%) | 1 (1.2%) | 3.19 (.50–61.74) | |
| Reporting bias | | | | | .0268 |
|   High risk | 41 (18.1%) | 17 (12.6%) | 21 (25.0%) | 0.43 (.21–.88) | |
|   Low risk | 186 (81.9%) | 118 (87.4%) | 63 (75.0%) | 2.31 (1.14–4.75) | |
|   Unclear | 0 (0%) | 0 (0%) | 0 (0%) | N/A | |

8 studies had unclear funding and so were not included in group A or B.

Abbreviations: CI, confidence interval; N/A, not applicable, where numbers were too few in a cell within the 2 × 2 table to make accurate estimate of odds ratio; OR, odds ratio.

Comparison of Studies by Funding

A total of 135 (59.5%) studies were supported by pharmaceutical industry, 84 (37.0%) were not supported by pharmaceutical industry, and 8 (3.5%) studies had an unclear source of funding. Of the 84 studies not supported by pharmaceutical industry, 30 (35.7%) were funded by the government and 54 (64.3%) were funded by other public funding agencies (Supplementary Table 5). Compared with nonpharmaceutical-supported studies, industry-supported studies were less likely to show inconclusive results (11.1% vs 42.9%; OR = 0.17; 95% CI, .08–.33) (Table 1).

Industry-supported studies were less likely to be stopped early for logistical reasons (3.0% vs 19.1%; OR = 0.13; 95% CI, .04–.37) (Table 2). In the 16 studies that were not supported by pharmaceutical industry and were stopped early because of logistical reasons, 11 (68.8%) had inconclusive results.
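
The odds ratios quoted above can be reproduced from the counts in Table 2. The following is a minimal sketch, not the authors' analysis code: the helper name `odds_ratio_wald` is hypothetical, and the Wald (log-normal) confidence interval is an assumption, because this section does not state how the published intervals were computed. The point estimate matches the reported 0.13, but the Wald interval should not be expected to match the published limits exactly.

```python
import math

def odds_ratio_wald(events_a, n_a, events_b, n_b, z=1.96):
    """Odds ratio of group A vs group B (B as reference) with a Wald CI.

    Hypothetical helper for illustration; assumes no cell of the 2 x 2 table is zero.
    """
    a, b = events_a, n_a - events_a   # group A: with / without the event
    c, d = events_b, n_b - events_b   # group B: with / without the event
    or_ = (a * d) / (b * c)
    se = math.sqrt(1 / a + 1 / b + 1 / c + 1 / d)  # standard error of log(OR)
    lower = math.exp(math.log(or_) - z * se)
    upper = math.exp(math.log(or_) + z * se)
    return or_, lower, upper

# Early stopping for logistical reasons: 4 of 135 industry-supported studies
# vs 16 of 84 nonpharmaceutical-supported studies (counts from Table 2).
or_, lower, upper = odds_ratio_wald(4, 135, 16, 84)
print(f"OR = {or_:.2f}, Wald 95% CI {lower:.2f} to {upper:.2f}")
```

The same function applies to any row of Tables 2 and 3 that has no zero cell; rows marked N/A in the tables are the ones where such an estimate is unreliable.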

With respect to risk of bias, industry-supported studies were more likely to be at low risk for performance bias (67.4% vs 20.2%; OR = 8.15; 95% CI, 4.37–15.88), detection bias (75.6% vs 33.3%; OR = 6.18; 95% CI, 3.43–11.42), and reporting bias (87.4% vs 75.0%; OR = 2.31; 95% CI, 1.14–4.75) (Table 3).

Comparison of Studies Over Time

Forty-four (19.4%) studies were published in 2007 or earlier, 67 (29.5%) studies were published between 2008 and 2013, and 116 (51.1%) studies were published in or after 2014. Newer studies were more likely to justify the noninferiority margin (11.4%, 28.4%, and 50.9%, respectively; P < .0001) (Table 4). Newer studies were less likely to be double-blinded (68.2%, 53.7%, and 46.6%, respectively; P = .0511) or to report both ITT and PP analyses (81.8%, 73.1%, and 69.0%, respectively; P = .1035), although these differences were not statistically significant. The proportion of studies that were stopped early because of logistical reasons rose over time (4.6%, 6.0%, and 12.9%, respectively; P = .1746), and the proportion with a misleading conclusion was higher in both later periods than in 2007 or earlier (6.8%, 22.4%, and 13.8%, respectively; P = .0797); neither difference was statistically significant. Moreover, newer studies were more likely to be at high risk for performance bias (43.2%, 46.3%, and 53.5%, respectively; P = .0350) and detection bias (25.0%, 38.8%, and 44.8%, respectively; P = .0214) (Table 5).
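
As an illustration of how a comparison across the three publication periods can be checked, the sketch below runs a Pearson chi-square test on the counts of studies that justified the noninferiority margin (5 of 44, 19 of 67, and 59 of 116, from Table 4). The choice of test is an assumption, since this section does not name the test behind the reported P values, and the function name `chi_square_2xk` is hypothetical. With 2 degrees of freedom, the chi-square tail probability has the closed form exp(-x/2), so no statistics library is needed.

```python
import math

def chi_square_2xk(yes_counts, totals):
    """Pearson chi-square statistic for a 2 x k table of yes/no counts.

    Hypothetical helper for illustration only.
    """
    no_counts = [n - y for y, n in zip(yes_counts, totals)]
    grand_total = sum(totals)
    stat = 0.0
    for observed_row in (yes_counts, no_counts):
        row_total = sum(observed_row)
        for observed, column_total in zip(observed_row, totals):
            expected = row_total * column_total / grand_total
            stat += (observed - expected) ** 2 / expected
    return stat

# Studies that justified the noninferiority margin, by publication period
# (2007 or before, 2008-2013, 2014 or after); counts from Table 4.
justified = [5, 19, 59]
totals = [44, 67, 116]

stat = chi_square_2xk(justified, totals)
# Three periods give (2 - 1) * (3 - 1) = 2 degrees of freedom, for which
# the chi-square survival function is exactly exp(-x / 2).
p_value = math.exp(-stat / 2)
print(f"chi-square = {stat:.2f}, P = {p_value:.2g}")
```

Under these assumptions the P value falls well below .0001, in line with the value reported for this row.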

Table 4.

Comparison of Methodology and Reporting Quality of Studies by Publication Year

| Quality Indicator | 2007 or Before (N = 44), N (%) (Reference) | 2008–2013 (N = 67), N (%) | 2008–2013, OR (95% CI) | 2014 or After (N = 116), N (%) | 2014 or After, OR (95% CI) | P Value |
| --- | --- | --- | --- | --- | --- | --- |
| Supported by pharmaceutical industry | 34 (77.3%) | 49 (73.1%) | 0.80 (.32–1.92) | 52 (44.8%) | 0.24 (.10–.51) | <.0001 |
| Noninferiority in title | 5 (11.4%) | 8 (11.9%) | 1.06 (.33–3.72) | 46 (39.7%) | 5.13 (2.03–15.72) | <.0001 |
| Double-blinded | 30 (68.2%) | 36 (53.7%) | 0.54 (.24–1.19) | 54 (46.6%) | 0.41 (.19–.83) | .0511 |
| Noninferiority margin specified | 43 (97.7%) | 67 (100%) | N/A | 111 (95.7%) | 0.52 (.03–3.32) | .2769 |
| Noninferiority margin justified | 5 (11.4%) | 19 (28.4%) | 3.09 (1.12–9.99) | 59 (50.9%) | 8.07 (3.22–24.71) | <.0001 |
| - Justification by clinical basis | 2 (4.6%) | 7 (10.5%) | 2.45 (.56–16.99) | 23 (19.8%) | 5.19 (1.44–33.30) | .0275 |
| - Justification by prior studies | 2 (4.6%) | 9 (13.4%) | 3.26 (.79–22.10) | 26 (22.4%) | 6.07 (1.70–38.75) | .0167 |
| - Justification by guidelines | 0 (0%) | 1 (1.5%) | N/A | 1 (.9%) | N/A | >.9999 |
| - Justification by regulatory bodies | 4 (9.1%) | 6 (9.0%) | 0.98 (.26–4.05) | 32 (27.6%) | 3.81 (1.39–13.43) | .0015 |
| - Justification by effect of control | 1 (2.3%) | 5 (7.5%) | 3.47 (.53–67.69) | 12 (10.3%) | 4.96 (.93–91.78) | .2636 |
| Adequate information for sample size re-calculation | 20 (45.5%) | 39 (58.2%) | 1.67 (.78–3.63) | 93 (80.2%) | 4.85 (2.31–10.40) | <.0001 |
| 2-sided 95% CI or 1-sided 97.5% CI used | 38 (86.4%) | 59 (88.1%) | 1.16 (.36–3.61) | 96 (82.8%) | 0.76 (.26–1.94) | .6222 |
| Discordance between type I error rate used and CI | 1 (2.3%) | 2 (3.0%) | 1.32 (.12–28.99) | 6 (5.2%) | 2.35 (.39–44.98) | .7343 |
| Early stopping due to logistics | 2 (4.6%) | 4 (6.0%) | 1.33 (.25–9.93) | 15 (12.9%) | 3.12 (.83–20.34) | .1746 |
| Analysis used | | | | | | .1035 |
| - ITT only | 4 (9.1%) | 9 (13.4%) | 1.55 (.47–6.04) | 28 (24.1%) | 3.18 (1.15–11.27) | |
| - PP only | 4 (9.1%) | 9 (13.4%) | 1.55 (.47–6.04) | 8 (6.9%) | 0.74 (.22–2.89) | |
| - ITT and PP | 36 (81.8%) | 49 (73.1%) | 0.60 (.23–1.51) | 80 (69.0%) | 0.49 (.20–1.12) | |
| Handling of missing data | 17 (38.6%) | 14 (20.9%) | 0.42 (.18–.97) | 38 (32.8%) | 0.77 (.38–1.61) | .0945 |
| - Imputation of missing data | 1 (2.3%) | 4 (6.0%) | 2.73 (.39–54.39) | 13 (11.2%) | 5.43 (1.03–100.11) | .1461 |
| - Worst case scenario | 16 (36.4%) | 12 (17.9%) | 0.38 (.16–.91) | 27 (23.3%) | 0.53 (.25–1.14) | .0894 |
| - Sensitivity analyses | 1 (2.3%) | 1 (1.5%) | 0.65 (.03–16.77) | 9 (7.8%) | 3.62 (.65–67.71) | .1236 |
| Point estimate with CI | 43 (97.7%) | 58 (86.6%) | 0.15 (.01–.84) | 104 (89.7%) | 0.20 (.01–1.07) | .1400 |
| Figure of point estimate, CI, and noninferiority margin | 1 (2.3%) | 5 (7.5%) | 3.47 (.53–67.69) | 38 (32.8%) | 20.95 (4.30–378.24) | <.0001 |
| Comparison to historical control | 4 (9.1%) | 5 (7.5%) | 0.81 (.20–3.43) | 11 (9.5%) | 1.05 (.34–3.95) | .9064 |
| Misleading conclusions | 3 (6.8%) | 15 (22.4%) | 3.94 (1.20–17.85) | 16 (13.8%) | 2.19 (.68–9.76) | .0797 |

Abbreviations: CI, confidence interval; ITT, intention-to-treat analysis; N/A, not applicable, where numbers were too few in a cell within the 2 × 2 table to make accurate estimate of odds ratio; OR, odds ratio; PP, per-protocol analysis.

Table 5.

Comparison of Risk of Bias of Studies by Publication Year

| Risk of Bias Domain | 2007 or Before (N = 44), N (%) (Reference) | 2008–2013 (N = 67), N (%) | 2008–2013, OR (95% CI) | 2014 or After (N = 116), N (%) | 2014 or After, OR (95% CI) | P Value |
| --- | --- | --- | --- | --- | --- | --- |
| Randomization | | | | | | .0522 |
| - High risk | 1 (2.3%) | 1 (1.5%) | 0.65 (.03–16.77) | 3 (2.6%) | 1.14 (.14–23.42) | |
| - Low risk | 23 (52.3%) | 45 (67.2%) | 1.87 (.86–4.11) | 87 (75.0%) | 2.74 (1.33–5.69) | |
| - Unclear | 20 (45.5%) | 21 (31.3%) | 0.55 (.25–1.20) | 26 (22.4%) | 0.35 (.17–.72) | |
| Allocation concealment | | | | | | .0042 |
| - High risk | 1 (2.3%) | 2 (3.0%) | 1.32 (.12–28.99) | 6 (5.2%) | 2.35 (.39–44.98) | |
| - Low risk | 10 (22.7%) | 37 (55.2%) | 4.19 (1.83–10.24) | 57 (49.1%) | 3.28 (1.53–7.58) | |
| - Unclear | 33 (75.0%) | 28 (41.8%) | 0.24 (.10–.54) | 53 (45.7%) | 0.28 (.12–.59) | |
| Performance bias | | | | | | .0350 |
| - High risk | 19 (43.2%) | 31 (46.3%) | 1.13 (.53–2.45) | 62 (53.5%) | 1.51 (.75–3.07) | |
| - Low risk | 24 (54.6%) | 31 (46.3%) | 0.72 (.33–1.54) | 54 (46.6%) | 0.73 (.36–1.45) | |
| - Unclear | 1 (2.3%) | 5 (7.5%) | 3.47 (.53–67.69) | 0 (0%) | N/A | |
| Detection bias | | | | | | .0214 |
| - High risk | 11 (25.0%) | 26 (38.8%) | 1.90 (.83–4.54) | 52 (44.8%) | 2.44 (1.15–5.48) | |
| - Low risk | 32 (72.7%) | 36 (53.7%) | 0.44 (.19–.97) | 63 (54.3%) | 0.45 (.20–.93) | |
| - Unclear | 1 (2.3%) | 5 (7.5%) | 3.47 (.53–67.69) | 1 (.9%) | 0.37 (.01–9.59) | |
| Attrition bias | | | | | | .0387 |
| - High risk | 20 (45.5%) | 22 (32.8%) | 0.59 (.27–1.28) | 36 (31.0%) | 0.54 (.26–1.10) | |
| - Low risk | 24 (54.6%) | 40 (59.7%) | 1.23 (.57–2.67) | 79 (68.1%) | 1.78 (.87–3.63) | |
| - Unclear | 0 (0%) | 5 (7.5%) | N/A | 1 (.9%) | N/A | |
| Reporting bias | | | | | | .2974 |
| - High risk | 8 (18.2%) | 16 (23.9%) | 1.41 (.56–3.81) | 17 (14.7%) | 0.77 (.31–2.03) | |
| - Low risk | 36 (81.8%) | 51 (76.1%) | 0.71 (.26–1.79) | 99 (85.3%) | 1.29 (.49–3.18) | |
| - Unclear | 0 (0%) | 0 (0%) | N/A | 0 (0%) | N/A | |

Abbreviations: CI, confidence interval; N/A, not applicable, where numbers were too few in a cell within the 2 × 2 table to make accurate estimate of odds ratio; OR, odds ratio.

Discussion

This systematic review assessed the methodological and reporting quality of antibiotic noninferiority trials. The increase in the number of antibiotic noninferiority trials over time likely reflects an overall increase in new antibiotic development as a result of financial incentives and increased regulatory flexibility [21]. Overall, the majority of antibiotic noninferiority trials were reasonably well conducted and reported. Only a minority of trials had substantial deficiencies in the study design, reporting, or interpretation of results. Deficiencies included a lack of justification of the noninferiority margin, lack of reporting of ITT with PP analyses, and having misleading conclusions. With the exception of justification of the noninferiority margin, there may be a trend of a decrease in methodological and reporting quality over time. Industry-supported studies were less likely to be stopped early and to report inconclusive results.

Deficiencies identified in our systematic review were also previously found in systematic reviews of noninferiority trials that were not specific to antibiotics. Justification of the noninferiority margin was missing in 63% of studies in our systematic review, whereas it was missing in 54% to 80% of studies in prior systematic reviews [4–6]. Le Henanff et al found misleading conclusions in 12% of studies [4], and we found misleading conclusions in 15% of studies. Reporting of both ITT and PP analyses occurred in 42%–54% of studies in prior systematic reviews [4–6, 22], which was lower than the 73% of studies found in our systematic review. This may be due to the requirement of both ITT and PP analyses in regulatory body guidelines on noninferiority RCTs [6].

Similar to our review, a prior systematic review showed that industry-supported studies were more likely to be of better methodological quality and to have favorable outcomes [23]. This was attributed to possible publication bias, where studies with unfavorable results may have been withheld from publication by the pharmaceutical industry [23]. Our study results suggest that another factor contributing to more favorable results could be that pharmaceutical industry studies had enough resources to see studies to completion. Finally, FDA guidance likely acted as quality assurance for pharmaceutical industry trials. Infectious Diseases Society of America activism has contributed to the publication of noninferiority trial guidance for different infection syndromes by the FDA [24–28]. These guidance documents outline standards such as noninferiority margin and population analysis [24–28]. Pharmaceutical industry trials must uphold these high standards to receive approval for the drug.

A prior systematic review on noninferiority trials of all drugs showed no improvement over time before and after publication of the CONSORT statements [6]. We found a possible trend of declining quality over time. This difference may have occurred because the prior systematic review included studies only up to 2009 [6], whereas our systematic review included additional studies from 2009 to 2019.

Our review has several implications that could improve the methodological and reporting quality of future antibiotic noninferiority RCTs. First, the reporting quality of antibiotic noninferiority trials may be declining over time, despite publication of the CONSORT guidelines [2, 3]. One possible explanation is the exponential increase in the number of noninferiority trials. In the past, a noninferiority design was rare and novel. The investigators who conducted such trials were likely more familiar with and attentive to the study design and reporting. The decreasing proportion of industry-funded studies over time may also contribute to the decrease in quality, because industry-supported studies were more likely to fulfill certain quality indicators, such as blinding and not having misleading conclusions. In this systematic review, we have highlighted deficiencies including lack of justification of the noninferiority margin, lack of reporting of both ITT and PP analyses, and having misleading conclusions. In the future, authors and journals should focus on these areas during the study design, manuscript writing, and peer review process.

Second, we found that nonpharmaceutical-supported studies were more likely to be stopped early for logistical reasons, leading to inconclusive results. This suggests that feasibility factors may be overlooked during study design and during the review of grant applications by government and other public funding agencies. Funding agencies should require pilot/feasibility data before funding large noninferiority trials; however, these agencies should have a low threshold to support such pilot/feasibility studies. It is of utmost importance that noninferiority trials in infectious diseases can be conducted without the support of the pharmaceutical industry to optimize the use of existing antibiotics, such as studies on shorter durations, oral instead of parenteral routes, better safety, and lower cost.

Our study has several strengths. First, we undertook a comprehensive and inclusive literature search of 4 databases. Second, the data extraction was systematic and rigorous with completion by 2 independent reviewers. Third, the assessment of reporting quality and risk of bias was thorough and based on established guidelines [2, 3, 20] as well as prior studies to improve comparability [4–6].

There are several limitations that merit mentioning. First, we excluded publications not in English; however, only 22 (0.5%) studies were excluded based on language alone. Second, quality based on the reporting of the published article may not necessarily reflect the quality of the study itself. Discrepancies between protocols or trial registry entries and the published article are not an uncommon occurrence [29]. Therefore, it is theoretically possible that a rigorously conducted study may not be well reported. Third, our strict expectations for methodological and reporting quality may not be practical for all trials to fulfill. For example, blinding for a noninferiority trial on intravenous vs oral antibiotics would necessitate an intravenous placebo for an extensive period. This would be impractical and expose the oral antibiotic treatment arm to unnecessary and significant risk associated with long-term intravenous catheters. However, these were rare occurrences.

In conclusion, we found room for improvement in the methodology and reporting of antibiotic noninferiority RCTs that authors and journals can address during study design, manuscript writing, and peer review. Publication of higher-quality noninferiority studies helps ensure that antibiotic therapy used in clinical practice is in accordance with the best possible evidence. Although noninferiority trials provide almost all of the current evidence on new antibiotics, it should be acknowledged that noninferiority trials might limit the possibility of major advances over existing antibiotics [11]. Therefore, we hope to see a shift toward testing for superiority in future antibiotic trials [7, 12].

Supplementary Data

Supplementary materials are available at Clinical Infectious Diseases online. Consisting of data provided by the authors to benefit the reader, the posted materials are not copyedited and are the sole responsibility of the authors, so questions or comments should be addressed to the corresponding author.

Notes

Author contributions. A. D. B., M. L., and D. M. conceived and designed the study. A. D. B., A. S. K., C. K. L. L., P. T., X. X. L., V. M., A. C., V. R. K., A. F., and Y. L. performed abstract screening and data extraction from full text. A. D. B. performed the analysis and wrote a first draft of the manuscript. All authors reviewed and revised the manuscript. All authors approved the final manuscript to be submitted.

Acknowledgments. The authors thank Neera Bhatnagar for her guidance on search strategy.

Potential conflicts of interest. M. L. reports a contract with the WHO to help update antibiotics in the Essential Medicines List. All other authors declare that they have no competing interests. All authors have submitted the ICMJE Form for Disclosure of Potential Conflicts of Interest. Conflicts that the editors consider relevant to the content of the manuscript have been disclosed.

References

1. Suda KJ, Hurley AM, McKibbin T, Motl Moroney SE. Publication of noninferiority clinical trials: changes over a 20-year interval. Pharmacotherapy 2011; 31:833–9.

2. Piaggio G, Elbourne DR, Altman DG, Pocock SJ, Evans SJ; CONSORT Group. Reporting of noninferiority and equivalence randomized trials: an extension of the CONSORT statement. JAMA 2006; 295:1152–60.

3. Piaggio G, Elbourne DR, Pocock SJ, Evans SJ, Altman DG, for the CONSORT Group. Reporting of noninferiority and equivalence randomized trials: extension of the CONSORT 2010 statement. JAMA 2012; 308:2594–2604.

4. Le Henanff A, Giraudeau B, Baron G, Ravaud P. Quality of reporting of noninferiority and equivalence randomized trials. JAMA 2006; 295:1147–51.

5. Rehal S, Morris TP, Fielding K, Carpenter JR, Phillips PP. Non-inferiority trials: are they inferior? A systematic review of reporting in major medical journals. BMJ Open 2016; 6:e012594.

6. Wangge G, Klungel OH, Roes KC, de Boer A, Hoes AW, Knol MJ. Room for improvement in conducting and reporting non-inferiority randomized controlled trials on drugs: a systematic review. PLoS One 2010; 5:e13550.

7. Infectious Diseases Society of America (IDSA). White paper: recommendations on the conduct of superiority and organism-specific clinical trials of antibacterial agents for the treatment of infections caused by drug-resistant bacterial pathogens. Clin Infect Dis 2012; 55:1031–46.

8. Center for Biologics Evaluation and Research (CBER), Center for Drug Evaluation and Research (CDER). Non-Inferiority Clinical Trials to Establish Effectiveness: Guidance for Industry. Available at: https://www.fda.gov/media/78504/download. Accessed 24 February 2020.

9. European Medicines Agency. Guideline on the evaluation of medicinal products indicated for treatment of bacterial infections, Rev. 3. 2019. Available at: https://www.ema.europa.eu/en/documents/scientific-guideline/draft-guideline-evaluation-medicinal-products-indicated-treatment-bacterial-infections-revision-3_en.pdf. Accessed 12 October 2019.

10. Temple R, Ellenberg SS. Placebo-controlled trials and active-control trials in the evaluation of new treatments. Part 1: ethical and scientific issues. Ann Intern Med 2000; 133:455–63.

11. Deak D, Outterson K, Powers JH, Kesselheim AS. Progress in the fight against multidrug-resistant bacteria? A review of US Food and Drug Administration–approved antibiotics, 2010–2015. Ann Intern Med 2016; 165:363–72.

12. Powers JH, Evans SR, Kesselheim AS. Studying new antibiotics for multidrug resistant infections: are today’s patients paying for unproved future benefits? BMJ 2018; 360:k587.

13. Nielsen TB, Brass EP, Gilbert DN, Bartlett JG, Spellberg B. Sustainable discovery and development of antibiotics - is a nonprofit approach the future? N Engl J Med 2019; 381:503–5.

14. Spellberg B, Nielsen TB, Gilbert DN, Shorr AF, Brass EP. Ensuring sustainability of needed antibiotics: aiming for the DART board. Ann Intern Med 2019; 171:580–2.

15. Infectious Diseases Society of America. Position paper: recommended design features of future clinical trials of antibacterial agents for community-acquired pneumonia. Clin Infect Dis 2008; 47:S249–65.

16. Corey GR, Stryjewski ME. New rules for clinical trials of patients with acute bacterial skin and skin-structure infections: do not let the perfect be the enemy of the good. Clin Infect Dis 2011; 52 Suppl 7:S469–76.

17. Spellberg B, Talbot G. Recommended design features of future clinical trials of antibacterial agents for hospital-acquired bacterial pneumonia and ventilator-associated bacterial pneumonia. Clin Infect Dis 2010; 51:S150–70.

18. Moher D, Liberati A, Tetzlaff J, Altman DG; PRISMA Group. Preferred reporting items for systematic reviews and meta-analyses: the PRISMA statement. Ann Intern Med 2009; 151:264–9, W64.

19. U.S. Food and Drug Administration. Drugs@FDA: FDA-Approved drugs. Available at: https://www.accessdata.fda.gov/scripts/cder/daf/. Accessed 20 March 2020.

20. Higgins JP, Altman DG, Gøtzsche PC, et al; Cochrane Bias Methods Group; Cochrane Statistical Methods Group. The Cochrane Collaboration’s tool for assessing risk of bias in randomised trials. BMJ 2011; 343:d5928.

21. Talbot GH, Jezek A, Murray BE, et al. The Infectious Diseases Society of America’s 10×’20 initiative (10 new systemic antibacterial agents US Food and Drug Administration approved by 2020): is 20×’20 a possibility? Clin Infect Dis 2019; 69:1–11.

22. Beckett RD, Loeser KC, Bowman KR, Towne TG. Intention-to-treat and transparency of related practices in randomized, controlled trials of anti-infectives. BMC Med Res Methodol 2016; 16:106.

23. Lexchin J, Bero LA, Djulbegovic B, Clark O. Pharmaceutical industry sponsorship and research outcome and quality: systematic review. BMJ 2003; 326:1167–70.

24. Center for Drug Evaluation and Research (CDER). Guidance for Industry Acute Bacterial Skin and Skin Structure Infections: Developing Drugs for Treatment. Available at: https://www.fda.gov/files/drugs/published/Acute-Bacterial-Skin-and-Skin-Structure-Infections---Developing-Drugs-for-Treatment.pdf. Accessed 8 June 2020.

25. Center for Drug Evaluation and Research (CDER). Guidance for Industry Hospital-Acquired Bacterial Pneumonia and Ventilator-Associated Bacterial Pneumonia: Developing Drugs for Treatment. Available at: https://www.fda.gov/media/79516/download. Accessed 8 June 2020.

26. Center for Drug Evaluation and Research (CDER). Guidance for Industry Community-Acquired Bacterial Pneumonia: Developing Drugs for Treatment. Available at: https://www.fda.gov/media/75149/download. Accessed 8 June 2020.

27. Center for Drug Evaluation and Research (CDER). Guidance for Industry Complicated Intra-abdominal Infections: Developing Drugs for Treatment. Available at: https://www.fda.gov/media/84691/download. Accessed 8 June 2020.

28. Center for Drug Evaluation and Research (CDER). Guidance for Industry Complicated Urinary Tract Infections: Developing Drugs for Treatment. Available at: https://www.fda.gov/files/drugs/published/Complicated-Urinary-Tract-Infections---Developing-Drugs-for-Treatment.pdf. Accessed 8 June 2020.

29. Dwan K, Altman DG, Cresswell L, Blundell M, Gamble CL, Williamson PR. Comparison of protocols and registry entries to published reports for randomized controlled trials. Cochrane Database Syst Rev 2011; 1:MR000031.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)