The decision of where to start a research project has been influenced by many factors over the years. Tradition has a large impact, but individual researchers' or clinicians' personal interests have also played a major role. The pharmaceutical industry's interest has without doubt initiated and sponsored many projects in order to bring new products onto the market. The lack of overview and control has led to an abundance of evidence within certain areas of our specialty, whereas other areas are scarcely, or not at all, researched. One way of ‘mapping’ the evidence in order to find out what we know and what we do not know is the production of systematic reviews. Although systematic reviews are considered top of the evidence hierarchy, they are not flawless. The aim of this article is to explain the systematic review and point to some of the challenges in the development and use of systematic reviews.

Editor's key points

  • The systematic review is a summary of all the available evidence on a specific clinical question and includes a meta-analysis of eligible small studies to increase the statistical power of the analysis.

  • The quality of a systematic review is determined by the degree of heterogeneity and reporting bias.

  • Heterogeneity (differences between the studies included) may be classified as clinical, methodological, or statistical heterogeneity.

  • Quality systematic reviews are an invaluable tool to evaluate the current ‘state of the art’ of evidence about a specific clinical question and to focus attention on research priorities for the future.

A systematic review starts with a focused clinical question. The clinical question will include a well-defined population, a clear description of the intervention(s) in focus, the comparisons, and the patient-relevant outcomes to be looked at. The clinical question is the basis for the remaining steps: a comprehensive literature search that seeks to identify all high-quality research evidence relevant to that question; the definition of criteria for including and excluding potential trials; the evaluation of the retrieved trials according to these pre-defined criteria; and a summary of the results in either a meta-analysis or, if trials or results are too heterogeneous, a text-based summary.1

The strength of the systematic review is the strict methodology that is used to minimize bias and increase the transparency of the reporting.

A systematic review may conclude that an intervention is effective, not effective, or that there is insufficient evidence for the authors to draw a conclusion. Unfortunately, the last option is too often the case.

The lack of evidence can be due to no or little research dealing with the intervention or to studies being of poor quality because of a wrong design or high risk of bias.

However, even though a systematic review is considered top of the hierarchy of evidence, it still has flaws and there are some major challenges in undertaking systematic reviews.


Heterogeneity

Studies to be included in systematic reviews differ in several ways. Any kind of variability between studies in a systematic review is termed heterogeneity. These differences can be divided into different types. The most obvious type is termed clinical heterogeneity. This covers variations in patient populations (age range, ethnicity, co-morbidity, etc.), differences in the way the interventions are carried out (doses, application, location, etc.), and differences in the definitions of the outcome measures (pneumonia, for example, can be defined by clinical picture, X-ray finding, microbiology, or any combination of these).

Variability in the study design and in the risk of bias is termed methodological heterogeneity. This will reflect variations in study quality.2

The variability in the effect size (intervention effects) is statistical heterogeneity.

If the methodological or statistical heterogeneity between studies is large, it may be a sign that the studies do not examine the same effect.

Heterogeneity may be assessed in several ways. One way is to inspect the forest plot: if the confidence intervals of the individual studies overlap poorly, this may be a sign of statistical heterogeneity (the results are too different). Heterogeneity can also be assessed with a statistical test, which indicates whether the differences in results are likely to be attributable to chance alone.
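As an illustration of such a statistical assessment, the following minimal sketch computes Cochran's Q and the I² statistic, two commonly used measures that the text above does not name explicitly. The function name and the effect sizes are hypothetical and chosen only for illustration. Under the assumption of no heterogeneity, Q is usually compared against a chi-squared distribution with k − 1 degrees of freedom (k being the number of studies); that step is left out here to keep the example free of external libraries.

```python
def cochran_q_and_i2(effects, std_errors):
    """Return (fixed-effect pooled estimate, Cochran's Q, I^2 in percent)."""
    # Inverse-variance weights: precise studies count more.
    weights = [1.0 / se ** 2 for se in std_errors]
    pooled = sum(w * y for w, y in zip(weights, effects)) / sum(weights)
    # Q: weighted sum of squared deviations from the pooled estimate.
    q = sum(w * (y - pooled) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    # I^2: share of the variability that exceeds what chance alone would give.
    i2 = max(0.0, (q - df) / q) * 100 if q > 0 else 0.0
    return pooled, q, i2


# Hypothetical log odds ratios and standard errors (not real trial data).
log_odds_ratios = [-0.40, -0.35, -0.10, -0.90, 0.20]
standard_errors = [0.20, 0.25, 0.15, 0.30, 0.35]

pooled, q, i2 = cochran_q_and_i2(log_odds_ratios, standard_errors)
print(f"pooled log OR = {pooled:.3f}, Q = {q:.2f}, I^2 = {i2:.1f}%")
```

An I² close to 0% suggests that the spread of results is compatible with chance, whereas a high value suggests that the studies differ by more than sampling error.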

Heterogeneity is inevitable in systematic reviews, but a number of approaches can be considered to handle it. If the heterogeneity is too extensive, the meta-analysis should be abandoned and the results reported in a narrative style. It may also be relevant to explore the reasons for heterogeneity further, which may be done by performing subgroup analysis or meta-regression analysis. If a meta-analysis is performed, a random-effects model should be used.
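A random-effects analysis can be sketched as follows. The example assumes the DerSimonian-Laird estimator of the between-study variance (tau²), which is one common choice among several and is not specified in the text above. The function name and the input values are hypothetical.

```python
import math


def random_effects_dl(effects, std_errors):
    """Return (pooled estimate, its standard error, tau^2) using DerSimonian-Laird."""
    variances = [se ** 2 for se in std_errors]
    weights = [1.0 / v for v in variances]
    sum_w = sum(weights)
    fixed = sum(w * y for w, y in zip(weights, effects)) / sum_w
    q = sum(w * (y - fixed) ** 2 for w, y in zip(weights, effects))
    df = len(effects) - 1
    c = sum_w - sum(w ** 2 for w in weights) / sum_w
    # Between-study variance, truncated at zero.
    tau2 = max(0.0, (q - df) / c) if c > 0 else 0.0
    # Random-effects weights add tau^2 to each within-study variance.
    re_weights = [1.0 / (v + tau2) for v in variances]
    sum_re = sum(re_weights)
    pooled = sum(w * y for w, y in zip(re_weights, effects)) / sum_re
    return pooled, math.sqrt(1.0 / sum_re), tau2


# Hypothetical effect sizes and standard errors (not real trial data).
effects = [-0.40, -0.35, -0.10, -0.90, 0.20]
std_errors = [0.20, 0.25, 0.15, 0.30, 0.35]

estimate, se, tau2 = random_effects_dl(effects, std_errors)
low, high = estimate - 1.96 * se, estimate + 1.96 * se
print(f"random-effects estimate = {estimate:.3f} (95% CI {low:.3f} to {high:.3f}), tau^2 = {tau2:.3f}")
```

When tau² is estimated as zero, the result coincides with the fixed-effect estimate. When it is positive, the weights become more equal across studies and the confidence interval widens.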

If a small number of studies are very different from the main body of others, these may be omitted in a sensitivity analysis.
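One simple form of sensitivity analysis is to recompute the pooled estimate with each study left out in turn and see how much the result moves. The sketch below assumes a fixed-effect inverse-variance pooling and uses hypothetical function names and data; it illustrates the idea rather than prescribing a procedure.

```python
def pooled_fixed(effects, std_errors):
    """Inverse-variance (fixed-effect) pooled estimate."""
    weights = [1.0 / se ** 2 for se in std_errors]
    return sum(w * y for w, y in zip(weights, effects)) / sum(weights)


def leave_one_out(effects, std_errors):
    """Pooled estimate with each study omitted in turn, in study order."""
    results = []
    for i in range(len(effects)):
        rest_effects = effects[:i] + effects[i + 1:]
        rest_errors = std_errors[:i] + std_errors[i + 1:]
        results.append(pooled_fixed(rest_effects, rest_errors))
    return results


# Hypothetical data: the last study differs markedly from the others.
effects = [0.10, 0.15, 0.12, 0.90]
std_errors = [0.20, 0.20, 0.20, 0.20]

print(f"all studies: {pooled_fixed(effects, std_errors):.3f}")
for index, estimate in enumerate(leave_one_out(effects, std_errors), start=1):
    print(f"without study {index}: {estimate:.3f}")
```

If omitting a single study changes the pooled estimate substantially, the conclusion of the review depends heavily on that study, and this should be reported.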

The commonly used depiction of mixing apples and oranges may apply, but if the clinical question is broad, this may not necessarily be wrong.

Reporting bias

Reporting bias occurs when authors, publishers, and the pharmaceutical industry handle the reporting of trials that show a positive (e.g. significant) result differently from the reporting of trials whose results are negative or inconclusive. Positive or statistically significant results increase the chance of being published and of being cited by other authors.

This means that only a selected proportion of research projects will be easily accessible for authors of systematic reviews. It is important to realize that trials with non-significant results are as important for estimating the effects of an intervention as trials with positive results, because reporting bias will tend to lead to an overestimate of the treatment effect of an intervention.
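The tendency towards overestimation can be illustrated with a small simulation. The sketch below rests on stated assumptions: all trials estimate the same true effect with the same standard error, every trial with a significantly beneficial result is published, and only a fraction of the remaining trials is published. All names and parameter values are hypothetical and the data are simulated, not real.

```python
import random


def simulate_publication_bias(true_effect=0.2, se=0.2, n_trials=5000,
                              p_publish_nonsig=0.2, seed=1):
    """Return (mean estimate of all trials, mean estimate of published trials)."""
    rng = random.Random(seed)  # fixed seed so the run is reproducible
    all_estimates, published = [], []
    for _ in range(n_trials):
        estimate = rng.gauss(true_effect, se)
        all_estimates.append(estimate)
        significant = estimate / se > 1.96  # 'positive' result
        # Assumption: positive trials are always published, others only sometimes.
        if significant or rng.random() < p_publish_nonsig:
            published.append(estimate)
    mean_all = sum(all_estimates) / len(all_estimates)
    mean_published = sum(published) / len(published)
    return mean_all, mean_published


mean_all, mean_published = simulate_publication_bias()
print(f"mean effect, all trials:       {mean_all:.3f}")
print(f"mean effect, published trials: {mean_published:.3f}")
```

Because all simulated trials have the same standard error, the simple mean equals the inverse-variance pooled estimate. Under these assumptions, the mean of the published trials lies above the mean of all trials, which is the pattern a review limited to published work would inherit.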

Reporting bias is an umbrella term for all biases relating to the reporting of trials, such as publication bias, ‘time-lag’ bias, duplicate publication bias, citation bias, language bias, and outcome reporting bias.

Publication bias covers the fact that many trials with non-significant or ‘negative’ results will never be published.2 Authors may choose not to seek publication of their work because they were disappointed at not being able to prove their original hypothesis, because they do not find their results interesting, or for fear of a long and cumbersome publication process. Studies have shown that peer reviewers are likely to rate significant studies more positively than negative ones,3 and some journal editors do not want to publish non-significant studies. The pharmaceutical industry may try to avoid publication of results that cast doubt on their products' efficacy or safety.

Time-lag bias describes the fact that even if negative trials are published, this often occurs much later than trials with positive results. Stern and Simes4 found that for trials with significant results, the median time from acceptance in an ethics committee till publication was 4.7 yr, whereas it was 8.0 yr for negative trials. The ‘survival’ of trials as ‘unpublished’ according to results is shown in Figure 1.4

Fig 1

Proportion of quantitative studies not published, according to the type of results.4

Duplicate or multiple publication bias refers to the fact that some research findings are published more than once. Gøtzsche5 found that some results were published as many as three or four times. If it is not clearly indicated that a given set of results has been published previously, all reports may be erroneously included in a meta-analysis, thereby distorting the results.
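The distortion caused by unrecognized duplicates can be shown with a short worked example. The sketch assumes fixed-effect inverse-variance pooling and uses hypothetical effect sizes; the function name is also hypothetical.

```python
import math


def pool(studies):
    """Fixed-effect pooled estimate and standard error from (effect, se) pairs."""
    weights = [1.0 / se ** 2 for _, se in studies]
    total = sum(weights)
    estimate = sum(w * effect for w, (effect, _) in zip(weights, studies)) / total
    return estimate, math.sqrt(1.0 / total)


# Hypothetical trials as (effect, standard error); the third is the most positive.
unique_trials = [(0.10, 0.20), (0.15, 0.20), (0.60, 0.20)]
# The third trial is reported three times and each report is counted as a new trial.
with_duplicates = unique_trials + [unique_trials[2]] * 2

for label, data in [("unique trials", unique_trials), ("with duplicates", with_duplicates)]:
    estimate, se = pool(data)
    print(f"{label}: estimate = {estimate:.3f}, SE = {se:.3f}")
```

Counting the same trial several times pulls the pooled estimate towards that trial and also shrinks the standard error, so the result appears both larger and more precise than the evidence supports.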

Location bias can cover a wide range of biases, from the likelihood of being published in a high-impact international journal vs a journal in a foreign language, to differences in indexing methods between medical databases such as Medline, Embase, CENTRAL, and LILACS. Finally, it can refer to differences in scientific traditions and norms between countries, meaning that not all countries have an equally high standard regarding rigorous peer review and detection of fraud before publication.6

Language bias refers to the fact that significant trials are much more likely to be published in English than in other, less widespread languages. This is why reviewers must be careful not to apply language restrictions when searching for trials. However, translation of results is often a problem, because it is difficult to find translators. The problem is getting smaller as many countries, for example, Denmark, have started publishing their national medical journal in English instead of their own language to reach a broader audience.

Outcome reporting bias occurs when authors collect data on a wide range of outcomes but choose to publish only some of them in their final publication (most likely the ones that showed a statistically significant difference).

Strategies to minimize shortcomings in meta-analysis

Authors of systematic reviews should be careful to search the available literature comprehensively and to avoid language restrictions.

In Cochrane systematic reviews, authors often hand-search abstract books to find trials that have been presented at scientific meetings but not yet published, so that these can be included in the systematic review. Authors of Cochrane reviews also often contact authors of relevant trials and experts in the field of interest to obtain results from trials that have been performed but, for some reason, have not been published (file drawer trials).

During the last decade, it has become the norm to register clinical trials online in a clinical trial registration database before initiating the trial. Most international journals agree that the registration of trials in the initial phase is a mandatory requirement for publication. However, only a summary of protocol information is registered and there is no registration of data.

Some journals have made it a trademark to publish trials with negative results. These journals are peer-reviewed and their publications are generally of high quality. However, these journals are also highly selective in their choice of papers and alleviate the problem of reporting bias only to a limited extent.

The Cochrane Anaesthesia Review Group

The Cochrane Anaesthesia Review Group (CARG) was established in 2000 and produces systematic reviews in the fields of anaesthesia, intensive care medicine, perioperative medicine, prehospital medicine, and emergency medicine. Pain and palliative medicine belong to the Pain and Palliative Care Group (PAPAS). CARG is based in Copenhagen, Denmark, UT, USA, and Oxford, UK, but like all Cochrane Review Groups, it works globally.

CARG has 15 editors from 10 countries. The group helps authors through the process of producing systematic reviews of the highest possible quality, taking into account the issues mentioned above. The review title is checked for originality and importance. The protocol is developed with editorial assistance, and the trial search coordinator assists the authors with comprehensive literature searches and retrieval of papers. Among the editors are three statisticians, who assist authors with analysis. The editorial group assists authors throughout the publication process. The reviews are updated at regular intervals. If the original authors fail to achieve this, the Group looks for others to continue the process. The collaboration between various Cochrane groups minimizes the duplication of effort.

The development of the systematic review process and the existence of the Cochrane Collaboration have focused attention on mapping the evidence, the abundance of knowledge in some fields of the specialty, and the scarcity of it in others.

In conclusion, although there have been major breakthroughs over the past 10 yr, there are still major challenges ahead, in order to achieve a mapping of evidence in anaesthesia. Many areas have not been researched very well, and the challenges of heterogeneity and reporting bias make it difficult to reach firm conclusions about many aspects of evidence-based practice. Quality systematic reviews can provide excellent summaries of available evidence and indicate the optimum direction of the future research effort.

Declaration of interest

None declared.


References

1. Evidence-based medicine and the Cochrane Collaboration in anaesthesia. Br J Anaesth.
2. The Cochrane Handbook. Oxford: Wiley-Blackwell.
3. Reviewer bias: a blinded experimental study. J Lab Clin Med.
4. Publication bias: evidence of delayed publication in a cohort study of clinical research projects. Br Med J.
5. Multiple publication of reports of drug trials. Eur J Clin Pharmacol.
6. Do certain countries produce only positive results? A systematic review of controlled trials. Control Clin Trials.