Two decades of surgical randomized controlled trials: worldwide trends in volume and methodological quality

Abstract
Background: RCTs are essential in guiding clinical decision-making but are difficult to perform, especially in surgery. This review assessed the trend in volume and methodological quality of published surgical RCTs over two decades.
Methods: PubMed was searched systematically for surgical RCTs published in 1999, 2009, and 2019. The primary outcomes were volume of trials and RCTs with a low risk of bias. Secondary outcomes were clinical, geographical, and funding characteristics.
Results: Some 1188 surgical RCTs were identified, of which 300 were published in 1999, 450 in 2009, and 438 in 2019. The most common subspecialty in 2019 was gastrointestinal surgery (50.7 per cent). The volume of surgical RCTs increased mostly in Asia (61, 159, and 199 trials), especially in China (7, 40, and 81). In 2019, countries with the highest relative volume of published surgical RCTs were Finland and the Netherlands. Between 2009 and 2019, the proportion of RCTs with a low risk of bias increased from 14.7 to 22.1 per cent (P = 0.004). In 2019, the proportion of trials with a low risk of bias was highest in Europe (30.5 per cent), with the UK and the Netherlands as leaders in this respect.
Conclusion: The volume of published surgical RCTs worldwide remained stable in the past decade but their methodological quality improved. Considerable geographical shifts were observed, with Asia and especially China leading in terms of volume. Individual European countries are leading in their relative volume and methodological quality of surgical RCTs.


Introduction
RCTs are essential in guiding clinical treatment decisions. However, conducting RCTs can be highly challenging given the numerous ethical, logistic, and financial hurdles. Furthermore, for RCTs to be worthwhile, they should meet high levels of methodological quality [1]. A weak or biased RCT may lead to abandonment of a beneficial intervention or the adoption of an ineffective intervention that might even harm patients [2]. Finally, an RCT must be reported in a clear and comprehensive manner to facilitate its interpretation and critical review.
Surgical RCTs have been criticized for their low methodological quality [3,4]. It is clear that such RCTs face some unique challenges, such as low patient accrual owing to strong patient and surgeon preferences, difficulties with blinding, steep surgical learning curves, varying surgical expertise and experience, variation in surgical quality control, and standardization of procedures [5]. A number of initiatives have been established to provide guidance in facing these unique challenges. Most notably, the IDEAL (Idea, Development, Exploration, Assessment and Long-term follow-up) collaboration formulated a framework to specifically evaluate complex interventions such as surgical procedures, including research options when an RCT might not be the appropriate study design [6,7]. To evaluate the status of surgical RCTs, the trends in volume and methodological quality of surgical RCTs published in 1999 and 2009 were assessed previously [8]. In recent years, however, new regulations such as the European Clinical Trials Directive have been put in place which may further hamper the execution of surgical RCTs [9][10][11]. The previous systematic review was updated with data from 2019 to assess trends in the volume and methodological quality of surgical RCTs in the past decade.

Methods
This review is a 10-year update of a previously published study [8] and is reported in accordance with PRISMA guidelines (Fig. 1). Methodology was similar in regard to the search, inclusion and exclusion criteria, and data extraction [8]. In brief, the Cochrane Highly Sensitive Search Strategy, augmented with free-text terms, was used to identify RCTs in PubMed in 2019 (Fig. 2).
https://doi.org/10.1093/bjs/znad160 Advance Access Publication Date: 28 June 2023

Abstracts were screened for relevance by two reviewers, and disagreements were resolved by consensus.
Inclusion and exclusion criteria were similar to those of the previous review [8]. A surgical RCT was identified as any trial determining the effect of a general surgical procedure (that is, gastrointestinal, trauma based on affiliation, vascular, thoracic, breast, paediatric, transplantation, and other general surgical procedures, regardless of affiliation of corresponding author), or an RCT of which the corresponding author is affiliated to a general surgical department. If participants received an additional treatment (for example chemotherapy) as part of surgical treatment, the RCT was included. If the trial focused purely on the additional treatment and the corresponding author was not a surgeon, the RCT was excluded. RCTs published by other surgical specialties (cardiac surgery, neurosurgery, maxillofacial surgery, otolaryngology, ophthalmology, plastic surgery, gynaecology, urology and orthopaedics) were excluded [8]. Publications in languages other than English, French, German, and Dutch were excluded for practical reasons.
Clinical, geographical, and funding characteristics of included RCTs were extracted (Table 1). All RCTs were evaluated according to a nine-item list based on the Cochrane guidelines for methodological assessment of randomized trials: primary outcome; sample size calculation; presence of baseline; generation of allocation sequence; concealment of allocation; blinding; double blinding; type of analysis; and handling of drop-outs.
Detailed definitions used for each item have been published previously [8]. A trial with a low risk of bias was defined as one that met all of the following four requirements: adequate generation of allocation, adequate concealment of allocation, intention-to-treat analysis, and adequate handling of drop-outs. Extraction of all data was conducted by two reviewers, and all discrepancies were reviewed by one of the senior authors.
Characteristics of the trials were compared between each pair of consecutive study years (1999 versus 2009 and 2009 versus 2019). A subgroup analysis was performed for volume and quality based on the geographical area of origin. For studies published in 1999, 2009, and 2019, population data from the years 2000, 2010, and 2019 respectively were used [12][13][14]. Median (i.q.r.) values were calculated for continuous data, whereas dichotomous outcomes are presented as the number of events with percentage. Data from 2009 were compared with data from 1999, and data from 2019 with those from 2009, by Fisher's exact, χ², and Mann-Whitney U tests, and the relative rate (RR) with corresponding 95 per cent confidence interval, as appropriate. P < 0.050 was considered statistically significant.
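The four-item definition of a low risk of bias can be written as a simple rule. The sketch below is illustrative only: the class, field, and function names are hypothetical and are not taken from the review's data-extraction form.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TrialAssessment:
    """Hypothetical record of the four items used in the low-risk definition."""
    adequate_allocation_generation: bool
    adequate_allocation_concealment: bool
    intention_to_treat_analysis: bool
    adequate_dropout_handling: bool


def is_low_risk_of_bias(trial: TrialAssessment) -> bool:
    """Return True only when all four requirements are met."""
    return all((
        trial.adequate_allocation_generation,
        trial.adequate_allocation_concealment,
        trial.intention_to_treat_analysis,
        trial.adequate_dropout_handling,
    ))


def proportion_low_risk(trials: list[TrialAssessment]) -> float:
    """Percentage of trials at low risk of bias (0.0 for an empty list)."""
    if not trials:
        return 0.0
    return 100.0 * sum(is_low_risk_of_bias(t) for t in trials) / len(trials)
```

A trial that fails any single item, for example drop-out handling, is counted as not low risk under this rule, which is why the remaining items of the nine-item list are reported separately.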
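As an illustration of the relative rate with its 95 per cent confidence interval, the sketch below uses the standard log-transformation method for a ratio of two proportions. The review does not state which interval method was used, so this choice is an assumption, and the function name and the counts in the usage note are hypothetical.

```python
from __future__ import annotations

import math


def relative_rate(events_new: int, total_new: int,
                  events_old: int, total_old: int,
                  z: float = 1.96) -> tuple[float, float, float]:
    """Ratio of two proportions with an approximate 95% CI (log method).

    Assumption: the interval is built on the log scale; the source does not
    specify the method used.
    """
    if min(events_new, events_old) <= 0:
        raise ValueError("log method needs at least one event in each group")
    rr = (events_new / total_new) / (events_old / total_old)
    # Standard error of log(RR) for two independent binomial proportions.
    se_log = math.sqrt(1 / events_new - 1 / total_new
                       + 1 / events_old - 1 / total_old)
    lower = math.exp(math.log(rr) - z * se_log)
    upper = math.exp(math.log(rr) + z * se_log)
    return rr, lower, upper
```

With hypothetical counts of 30 of 200 trials in the later year and 15 of 200 in the earlier year, `relative_rate(30, 200, 15, 200)` gives an RR of 2.0 with an interval that excludes 1, which corresponds to a statistically significant increase at the 0.050 level.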

Results
The search for 2019 was undertaken on 3 April 2020 and identified 52 673 PubMed hits (Fig. 1).

General characteristics
Epidemiological and clinical characteristics of the included trials are shown in Table 1. [...] to 1.58; P < 0.001) in investigator-initiated (non-industry) trials. In an analysis excluding trials that did not report the funding source, the decrease in industry-funded trials and the increase in non-industry trials were more pronounced (Table 1).
The proportion of trials not reporting the source of funding decreased with each study year, from 57.0 per cent in 1999 to 42.2 per cent in 2009 and 34.7 per cent in 2019.

Volume
Overall, the absolute volume of RCTs remained stable between 2009 and 2019 (450 versus 438). This contrasts with a 50.0 per cent increase in the previous decade (300 RCTs in 1999). In 2019, most RCTs originated from Asia/Oceania (Table 1). When the number of inhabitants was considered, Finland was the country with the most trials relative to population in 2009 and 2019, with the Netherlands in second place in both years (Table 2).
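The arithmetic behind these volume comparisons is straightforward. The sketch below reproduces the percentage change between study years from the trial counts reported in this review and shows how a per-capita figure (trials per 10 million inhabitants, as in Table 2) would be derived. The function names are hypothetical, and the population figure in the usage note is a made-up round number, not data from the review.

```python
def percent_change(old: int, new: int) -> float:
    """Percentage change from the earlier count to the later count."""
    return 100.0 * (new - old) / old


def trials_per_10_million(n_trials: int, population: int) -> float:
    """Number of trials per 10 million inhabitants."""
    return n_trials * 10_000_000 / population


# Trial counts reported in the review: 300 (1999), 450 (2009), 438 (2019).
change_1999_2009 = percent_change(300, 450)            # 50.0 per cent increase
change_2009_2019 = round(percent_change(450, 438), 1)  # -2.7 per cent
```

For a hypothetical country with 10 trials and 5 million inhabitants, `trials_per_10_million(10, 5_000_000)` returns 20.0. Normalizing in this way explains why countries with small populations can rank above those with far larger absolute volumes.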

Reported risk of bias
Methodological characteristics related to the risk of bias of the included trials are shown in Table 3. [...] S1.
In the interval 2009-2019, there was a significant increase in trials with a low risk of bias in Asia/Oceania (from 4.9 per cent in 2009 to 18.1 per cent in 2019; RR 3.50, 1.70 to 7.32; P < 0.001). In contrast, RCTs from Africa/South America did not show improvement in reported methodological characteristics over the past 10 years; the proportion of trials with a low risk of bias was below 10 per cent in both years. Quality did not significantly improve in Europe (from 23.0 per cent in 2009 to 30.5 per cent in 2019; RR 1.35, 0.96 to 1.92; P > 0.050) or North America (from 16.1 per cent in 2009 to 23.6 per cent in 2019; RR 1.47, 0.69 to 3.16; P > 0.050) (Table S1). The top 10 countries by methodological quality in 2019 are shown in Table 4. The top three countries were all in Europe; the UK had the highest proportion of trials with a low risk of bias (57.1 per cent), with the Netherlands ranking second (51.9 per cent), and Finland ranking third (38.5 per cent). Korea was in fourth place (30.0 per cent). Nigeria was the top country by relative number of RCTs per specialist surgical workforce per 100 000 inhabitants (Table S2).

Discussion
This updated systematic review demonstrated that the overall worldwide volume of published surgical RCTs has remained stable in the past decade. This is in contrast to a 50.0 per cent increase a decade earlier. The region of origin of published surgical RCTs has shifted considerably, with Asia/Oceania now the leading continent in volume, with 45.4 per cent of surgical RCTs, whereas the number of RCTs from Europe has declined. China has become the leading country in terms of absolute volume, followed by the USA. Concerns about methodological quality persist, although almost all quality characteristics have improved. Some 94.3 per cent of RCTs were at moderate or high risk of bias two decades ago, whereas this has now decreased to 77.9 per cent. Certain European countries continue to lead in terms of relative RCT volume per capita (Finland, the Netherlands) and methodological quality (UK, the Netherlands).
Although the volume of published surgical RCTs has decreased slightly in the past decade, the overall volume of published (both surgical and non-surgical) RCTs continued to increase between 1966 and 2018 (176 620 trials) [15]. This can be explained partly by the described difficulties associated with performing surgical RCTs. For example, when two interventions have different benefit-to-harm profiles, patients and surgeons may have strong treatment preferences [16]. This may lead to difficulties in recruiting both patients and surgeons, especially for a complex surgical procedure in which surgical experience differs [17]. This might hinder recruitment and also pose a threat to the external validity of the RCT. These difficulties might drive researchers to opt for other study designs, such as prospective cohort studies, resulting in fewer RCTs than in other specialties [18,19]. To overcome treatment preferences, participating surgeons should have clearly passed the learning curve and be able to perform both techniques. Alternatively, the two techniques may be performed by different surgeons from the same unit [16,20].
However, the slight decrease in volume of surgical RCTs over the past decade is not necessarily a negative development. An ever-increasing number of RCTs cannot be the ultimate goal. The results of this review may indicate that a steady state has been reached. A positive development is that trial quality has improved significantly over the past decade, while trial size has remained the same. Looking forward to the next decade, the aim would be to publish the same volume of surgical RCTs with a further improvement in quality.
The region of origin of published surgical RCTs has shifted, with Asia/Oceania as the leading continent in volume, whereas the number of RCTs from Europe has declined. In 2014, the new European Trials Directive started, which possibly influenced the use of RCTs in Europe [21]. In Asia, the steep increase in Chinese trials is particularly notable. China published 81 surgical RCTs in 2019, compared with 40 in 2009, and was unranked a decade before that. This was coupled with an over sixfold increase in the proportion of trials with a low risk of bias, from 2.5 per cent in 2009 to 16.0 per cent in 2019. Similar results were observed in another study that included 7422 RCTs published in Chinese medical journals and indicated that the quality of reporting of surgical RCTs has improved [22][23][24]. This improvement could be explained by the fact that China started several programmes, such as the Thousand Talents Plan to temporarily attract scientists from abroad, accompanied by government investments in innovation and healthcare [25].
The UK had the highest proportion of trials with a low risk of bias (57.1 per cent) worldwide in 2019, in comparison to 33.3 per cent in 2009 [8]. A possible explanation for this increase is the implementation of a national programme for surgical trials in the UK by the National Institute for Health Research and the Royal College of Surgeons in 2013 [26]. With the advent of this programme, mentors were available to guide new research surgeons through the trial process, which could have led to the observed high methodological quality of RCTs.
Most medical and surgical journals have adopted the CONSORT criteria, which has led to improved reporting of RCTs [27][28][29][30][31][32]. However, poor reporting is still common, with deficiencies in reporting the randomization method, blinding, and allocation concealment [33]. The present review showed a significant increase in the reporting of blinding, generation of allocation, and concealment of allocation in the interval 2009-2019. Nonetheless, there is still a substantial proportion of RCTs with poor quality of reporting. Journal editors and peer reviewers have a critical role in addressing this issue. This topic has received notable attention, with several published recommendations on how to improve the peer review process and appeals to encourage more journals to adopt the CONSORT criteria [34][35][36][37][38][39][40].
Clearly, adequate reporting of methodological quality is not the same as actual methodological quality. Interestingly, one study [41] even reported that the actual methodology of a published trial is often better than that reported. This study compared study protocols with the published RCTs, and found that adequate concealment of allocation was achieved in all trials, but was reported in only 42 per cent of the published reports. The same was observed for the sample size calculation and the use of intention-to-treat analyses. However, users of randomized trials do not have access to the unpublished study data, making the final article essential for assessing quality.
Interestingly, this review also identified a significant shift in trial funding. Over the past 20 years, fewer RCTs have been funded by industry. This review identified an industry funding rate of 11.4 per cent in contrast with 33 per cent in another review of surgical RCTs (2008-2020) [42]. Several studies [43,44] have shown that industry funding leads to overestimation of positive outcomes, which clearly affects the interpretation of results. On the other hand, funding of surgical RCTs may become increasingly difficult in future years with declining support from industry.
The surgical field will have to develop methods to overcome difficulties in performing surgical RCTs. Options include innovative trial designs such as registry-based trials [45][46][47][48]. Registry-based RCTs are associated with lower costs as they use an ongoing registry for data collection. Several registry-based trials are currently ongoing in the USA and Europe [48][49][50]. In addition, trials within cohorts and stepped-wedge RCTs (SW-RCTs) are alternative RCT designs to overcome difficulties in surgical RCTs [51,52]. For example, a recent nationwide SW-RCT [53] from the Netherlands, focusing on improved complication detection and management after pancreatic surgery, reported a halving of postoperative mortality.
The results of this systematic review should be interpreted in light of some limitations. First, the available Medical Subject Headings (MeSH) term titles in PubMed have changed over time [54]. This may have led to differences in the degree of identification of surgical studies between 1999, 2009, and 2019. However, the search did not rely solely on MeSH terms, but also used various permutations of free-text terms to compensate for these potential differences. Second, reports of published articles in languages not spoken fluently by the authors were excluded. This only pertained to 10 excluded RCTs from 2019 (exact languages of excluded trials are shown in Fig. 2). Third, trials were classified on the basis of the country of the leading department if the trial was performed across multiple countries or continents. Of the included trials, 86.8 per cent were conducted in single countries, so any possible influence of this practice on results is presumably very limited. Moreover, the majority of international trials were undertaken within the same continent. Fourth, this review is a continuation of a previously published article. The methods used were the same as those employed in the earlier article, but differences in interpretation cannot be excluded. To minimize these differences, the reviewers of both studies have been in close contact to clarify any ambiguities. Fifth, although a trial with a low risk of bias is defined according to empirical evidence, it must be noted that other factors not included in this definition could influence the quality of the RCT. Therefore, Table 3 also presents data according to other criteria to allow the reader to judge every criterion separately. Additionally, the term 'low risk of bias' is not the same as 'high quality'. Rather, these trials have adequate methodology based on a number of important characteristics and, although this decreases the risk of bias, it cannot eliminate this risk completely. Sixth, there is an inevitable delay in detecting developments in surgical RCTs because of the lag time between study protocol development and final publication.
In conclusion, the volume of published surgical RCTs worldwide has remained stable in the past decade; although the reported quality improved somewhat, there is still considerable room for improvement. A significant increase in volume of published surgical RCTs was observed in Asia in general and China in particular. The top three countries in terms of methodological quality were all in Europe, with the UK ranking first (Table 4). Thus, education in trial methodology, improved research infrastructure, and enforced adherence to reporting guidelines remain necessary, with additional focus on innovative trial designs to overcome the unique issues with surgical RCTs.

Fig. 2 Flow chart of selection process for the years 1999, 2009, and 2019
*Including orthopaedics, urology, neurosurgery, cardiac surgery, ophthalmology, obstetrics and gynaecology, oral and maxillofacial surgery, and otolaryngology. †Six in Chinese, two in Russian, one in Portuguese, and one in Czech.

Fig. 1 Flow chart showing selection of articles for review published in 2019
The search for 1999 and 2009 was performed on 3 June 2010 and identified 12 870 and 25 611 PubMed hits respectively. After screening, 300, 450, and 438


China was the country with the largest volume of surgical RCTs (Table 2). There was an increase in both absolute and relative volume, from 40 trials (8.9 per cent) in 2009 to 81 (18.5 per cent) in 2019. The volume of published trials in the USA remained stable in the past decade, with 50 in 2019

Table 1 Characteristics of included surgical randomized trials 1999 (n = 300
Values are n (%) unless indicated otherwise; *values in parentheses are 95% confidence intervals. †This comparison was performed in the authors' previous study [8]. +Analysis excluding the less informative 'not reported' group was added to aid interpretation of results. The relationship observed over time regarding industry funding persisted and relative differences were more pronounced. ‡Includes breast, abdominal wall, thoracic, and endocrine surgery. §Analysis excluding the less informative 'unclear' group showed similar results regarding the trend for benign and malignant disease over time. ¶For trials studying surgical interventions only. #On the basis of impact factors of the Institute for Scientific Information for the respective year (1999, 2009, 2019). Journals without an impact factor are not included. ^Comparing 1999 and 2009 using Fisher's exact, χ², and Mann-Whitney U tests. ~Comparing 2009 and 2019 using Fisher's exact, χ², and Mann-Whitney U tests. RR, relative rate.

Table 2 Top 10 countries by absolute and relative volume of published surgical randomized trials
Top 10 by absolute volume of surgical RCTs*; Top 10 by relative volume of surgical RCTs per 10 million inhabitants
*Values are n (%).

Table 4 Top 10 ranking for countries in 2019 based on the proportion of trials with a low risk of bias
*Values are median (i.q.r.) 2019 impact factors. Only countries with at least 10 published trials were analysed. †Trial with adequate generation of allocation, adequate concealment of allocation, intention-to-treat analyses, and adequate handling of drop-outs.