Sara Monteiro Moraes, Teresa Cristina Abreu Ferrari, Alline Beleigoli, The accuracy of the Global Trigger Tool is higher for the identification of adverse events of greater harm: a diagnostic test study, International Journal for Quality in Health Care, Volume 35, Issue 1, 2023, mzad005, https://doi.org/10.1093/intqhc/mzad005
Abstract
The Global Trigger Tool (GTT) of the Institute for Healthcare Improvement (IHI) has been used as a measurement strategy for patient safety by several institutions and national programs. Although the greater ability of the GTT to identify adverse events (AEs) compared to other methods has already been demonstrated, there are few data on its accuracy, and studies suggest lower sensitivity for minor AEs. This study aimed to assess the accuracy of the GTT for identifying AEs in adult inpatients for all AEs and for the subgroup of AEs with greater harm to the patient, classified as F–I on the IHI-GTT adapted version of the National Coordinating Council for Medication Error Reporting and Prevention (NCC MERP) Index for Categorizing Errors. In this diagnostic test study, the GTT is the index test and identification of AEs (yes/no) represents the condition of interest. Due to the lack of a gold standard test, a composite reference standard method was developed. The reference standard method combined real-time (during hospitalizations) and retrospective search of medical records and administrative data for screening criteria and AEs. Both tests were applied to a random sample of 211 hospitalizations of adult inpatients during October–November 2016 in a large public hospital in Belo Horizonte, Brazil. The accuracy of the GTT was evaluated using sensitivity, specificity, and global accuracy. A total of 176 AEs were identified in 67 admissions using the reference standard method and 129 AEs in 76 admissions using the GTT, resulting in rates of 126 and 93 AEs/1000 patient-days, respectively. Sensitivity, specificity, and global accuracy of the GTT for the identification of individual AEs were, respectively, 0.41 (95% confidence interval [CI] 0.34; 0.49), 0.68 (95% CI 0.60; 0.74), and 0.54 (95% CI 0.49; 0.60) for all AEs, regardless of the harm categorization, and 0.85 (95% CI 0.72; 0.93), 0.88 (95% CI 0.82; 0.92), and 0.87 (95% CI 0.82; 0.91) for the subgroup of AEs categorized as harm F–I.
Among the main AEs missed by the GTT are AEs related to nursing care, such as those related to peripheral venous access and gastric/enteric catheters. The GTT proved to be a valid method for identifying AEs in adult inpatients, and its accuracy increases when minor harm AEs are not counted. Because the AEs it misses are mainly related to nursing care, the GTT should be used in conjunction with other measurement strategies to achieve results that are representative of the quality profile of the care provided and, thus, guide the best improvement strategies.
Introduction
One of the major challenges health-care systems face today is providing safe and quality care in complex, pressured, and fast-moving environments [1]. A reliable, valid, and feasible measurement strategy is required to determine whether efforts to enhance safety result in overall improvements [2, 3].
The Global Trigger Tool (GTT) of the Institute for Healthcare Improvement (IHI) is a simple, inexpensive, and easy-to-execute method for estimating the occurrence of adverse events (AEs) in adult hospitalized patients [4, 5]. Several institutions and national programs have used it as a patient safety measure [2–12]. Although its greater ability to identify AEs compared to other methods has already been demonstrated [6–10], there are no clear data on its accuracy [3, 11, 12]. Moreover, studies suggest that minor harm AEs are harder for the GTT to identify and that excluding these events could increase the method’s reliability and validity [2, 5].
If we understand a method for identifying AEs as a diagnostic test, its accuracy could be assessed by comparing its results with the best available test [13]. The scarcity of studies addressing the accuracy of the GTT is justified by the difficulty in establishing a gold standard or an acceptable reference standard test [2–4, 11, 12]. Classen et al., in one of the few studies that calculated the accuracy of the GTT, compared the results obtained through this method with those obtained through a detailed retrospective review of medical records and administrative data. They found a sensitivity of 94.9% and a specificity of 100% for identifying admissions with AEs [7]. However, the reference standard test used has similar limitations to the GTT—data obtained exclusively from medical records, which may have resulted in overestimated accuracy data.
Some authors discuss methodological alternatives for diagnostic accuracy studies when a gold standard does not exist [14–17]. When there are no clear data on the accuracy of the available tests and none is indicated as the preferred reference standard, one approach is to use a composite reference standard, in which the results of several imperfect tests are combined [17]. In the context of patient safety, this seems particularly applicable, as previous studies have shown that different methods identify different AEs and recommend combining different AE identification approaches to understand patient safety issues occurring within an organization [6–9].
The aim of this study was to assess the accuracy of the GTT for identifying AEs in adult inpatients for all AEs and for the subgroup of AEs with greater patient harm. For this, a composite reference standard method (RSM) was built. The results are expected to support health-care systems in choosing the most appropriate method, or a combination of methods, for managing care risk and building strategies that support actions to improve the safety and quality of care.
Methods
Study design, population, and sample
This is a diagnostic test study in which the GTT is the index test and identification of AEs (yes/no) represents the condition of interest. It was performed in a 500-bed, general, public, university hospital in Belo Horizonte, Brazil, offering tertiary and quaternary care. It has a hybrid medical record, with both electronic and paper-based documentation.
The cross-section of the study was defined as the period from October 2 to November 4, 2016, and is referred to as the “study period.” Patients aged 18 years or older who were hospitalized for >24 hours during the study period were eligible. Patients whose admission records were unavailable or incomplete, i.e. did not contain critical elements, such as a discharge summary, prescriptions, and significant portions of medical and nursing evolutions, were excluded.
Sample size calculation considered a confidence level of 95%, margin of error of 5%, population size of 1500 (historical monthly average of hospitalizations), and minimum expected proportion of AEs in the population of 20%, which was based on previous studies showing AE rates between 7.2% and 27.0% of hospitalizations [3]. This resulted in a sample of 212 patient admissions. An intentional random oversampling of about 25% was added due to potential losses, leading to a final sample of 268 admissions.
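The reported sample of 212 admissions is consistent with the usual sample size formula for estimating a proportion, with a finite population correction. The source does not name the formula used, so the sketch below is an assumption; the function name `sample_size_finite` is hypothetical, and the inputs are the values stated above.

```python
import math
from statistics import NormalDist


def sample_size_finite(population, proportion, margin, confidence=0.95):
    """Sample size for estimating a proportion, with finite population correction.

    Assumed formula (not confirmed by the source):
        n0 = z^2 * p * (1 - p) / e^2
        n  = n0 / (1 + (n0 - 1) / N)
    """
    # Two-sided critical value of the standard normal distribution.
    z = NormalDist().inv_cdf(1 - (1 - confidence) / 2)
    n0 = z ** 2 * proportion * (1 - proportion) / margin ** 2
    n = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n)


base_sample = sample_size_finite(population=1500, proportion=0.20, margin=0.05)
print(base_sample)  # 212

# An exact 25% oversampling of 212 gives 265 admissions.
print(math.ceil(base_sample * 1.25))  # 265
```

An exact 25% oversampling gives 265 admissions; the 268 admissions reported correspond to about 26%, in line with the "about 25%" wording.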
Separate research teams independently applied the GTT and the constructed RSM. Each test was applied according to its own methodology: the GTT was exclusively retrospective, through the analysis of medical records, while the RSM combined information collected during the study period with retrospective analysis of clinical and administrative data. The study design is represented in Fig. 1.

Figure 1. Diagnostic test study: assessment of the accuracy of the Global Trigger Tool to identify adverse events.
GTT test
At the time of the study, there was no official version of the GTT in Portuguese. The tool translation and adaptation were based on the original GTT white paper and a Portuguese version translated by the IHI Latin American team. No substantial changes were made to the general content of the original version. This process was described elsewhere [18].
The review team consisted of a pair of primary reviewers, who were fourth- and fifth-year medical students, and two senior physicians specialized in Internal Medicine, who alternated, depending on availability, to validate the findings of the primary reviewers. All reviewers were familiar with the medical record and the model of care offered by the institution. The qualification of the reviewers included individual reading of the GTT white paper, followed by 12 hours of theoretical–practical training focused on understanding the AE concept, the meaning of each trigger, and harm categorization. All reviewers evaluated 10 medical records, comprising five commented examples from the IHI and five medical records from the institution.
The GTT was applied following the IHI protocol. In the first step, each primary reviewer was allowed a maximum of 20 minutes per record to search for triggers, identify possible AEs, classify them according to the harm category, and record the findings. Subsequently, the primary reviewers discussed individual findings and recorded the consensus. In the second step, the medical reviewer, based on the findings of the primary reviewers, confirmed the AEs and rectified harm categories when necessary. Medical reviewers could consult medical records for clarification and had the support of a group of specialist physicians, who were consulted in complex situations, when the attribution of harm to health care was not evident.
Reference standard test
The choice of which tests would compose the RSM was based on previous publications describing the methods for identifying AEs, their strengths, and limitations and the feasibility of applying each in the research context [19, 20]. Systematic real-time data collection (interviews with professionals and review of prescriptions); voluntary institutional incident reporting system; analysis of existing and routinely collected data (information technology and electronic medical records, such as reviewing laboratory test results, ordering antibiotic initiation, and operating notes for surgical and obstetric procedures, and administrative data, such as reports of incidents related to transfusion and infections associated with health care); and analysis of deaths and early hospital readmissions were included. AEs identified by the RSM research group during the collection of other data were also included. Methods that used a retrospective review of medical records, such as the classic Harvard Method or its variations [21], were not included due to the risk of incorporation bias.
The methods selected to compose the RSM were applied in phase 1. They were understood as a set of different sources of information to identify screening criteria for AEs. Subsequently, in phase 2, a physician reviewed the charts of patient admissions that were positive for at least one screening criterion, looking for additional information to validate the occurrence of harm to the patient related to health care. Once an AE was confirmed, it was rated according to the harm categorization. The RSM team of physicians comprised two Internal Medicine specialists with 5–6 years of experience. They were trained on the concepts and classifications used in the study and could consult each other or a medical specialist. The RSM is described in detail in Supplementary Material 1.
Definitions
AE was defined as “an unintended physical injury resulting from or contributed to by medical care which requires additional monitoring, treatment or hospitalization, or which results in death” [5]. AEs resulting from acts of omission clearly related to the occurrence of harm to the patient were counted. These events include delays in providing health services, such as carrying out diagnoses or treatments, due to organizational issues.
All AEs were classified according to nature and harm. The IHI-GTT adapted version of the National Coordinating Council for Medication Error Reporting and Prevention (NCC MERP) Index for Categorizing Errors was used to classify AE-related harm into five categories: (E) temporary harm and required intervention, (F) temporary harm and required initial or prolonged hospitalization, (G) permanent harm, (H) intervention required to sustain life, and (I) death [5, 22].
Although the GTT allows the identification of AEs during the entire period of hospitalization, the RSM analysis was restricted to the study period. For this reason, only AEs that occurred in this period, regardless of the beginning or end of hospitalization, or those prior to it, provided they were directly related to the current admission, such as those that caused temporary or permanent harm, requiring new treatments or interventions, were counted.
To reduce the subjectivity of the medical reviewer’s judgment in determining the occurrence of an AE in both methods (GTT and RSM), a four-point scale (0–3) was used to determine the level of confidence that the harm could be attributed to health care, rather than the patient’s disease process. Only those classified as 2–3 (moderate to certain evidence that the harm could be attributed to health care) were considered AEs. Questions adapted from Baker et al. [23] and Mendes et al. [24] were used to support the medical reviewer’s judgment.
Analysis and statistics
The sample was characterized using absolute and relative frequencies for qualitative variables and measures of central tendency and dispersion for quantitative variables. The variable of interest was the occurrence of AEs. The frequency of occurrence of AEs was presented as the number of AEs per 1000 patient-days. McNemar’s test was used to compare the performance of the GTT and the RSM to identify AEs of different harm categories. The test results were considered dichotomous, and the classic measures of accuracy—sensitivity, specificity, and global accuracy—were calculated using 2 × 2 tables [14, 15, 25].
The results were reported using two different study units: (i) per patient admission, in which the occurrence of one or more AEs in a hospitalization counted as a single positive case, and (ii) per AE, in which each AE identified in an admission counted as a positive case. In both, each patient admission in which no AE was detected counted as one negative case. The accuracy of the GTT was evaluated for AEs in general (E–I) and for the subgroup of events of greater harm, classified as F, G, H, or I (F–I).
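McNemar's test, mentioned above, compares two methods applied to the same cases and depends only on the discordant pairs (cases flagged by one method but not the other). The sketch below shows the exact (binomial) form as an illustration. The source does not state which form of the test was used or report the paired counts behind each P-value, so the function name `mcnemar_exact_p` and the counts in the usage line are hypothetical.

```python
from math import comb


def mcnemar_exact_p(only_first, only_second):
    """Two-sided exact McNemar P-value from the two discordant counts.

    Under the null hypothesis, each discordant pair is equally likely to
    favor either method, so the smaller count follows Binomial(n, 0.5).
    """
    n = only_first + only_second
    if n == 0:
        return 1.0  # no discordant pairs: no evidence of a difference
    k = min(only_first, only_second)
    # Probability of a split at least as uneven as the observed one (one tail).
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts: 10 cases found only by method A, 2 only by method B.
print(round(mcnemar_exact_p(10, 2), 4))  # 0.0386
```

Concordant pairs (cases found by both methods or by neither) do not enter the calculation, which is why two methods with similar totals can still differ significantly.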
Results
Preliminarily, a total of 1172 admissions were considered eligible. Of the 268 records selected, 49 could not be accessed due to remote storage of large records and their use for other purposes, such as medical care, audits, or billing. Of the 219 available records, 20 were deemed as not meeting the eligibility criteria. An extra round of randomization selected 12 additional admissions, leading to 211 admissions. Table 1 shows the sample description.
| Variable | Sample (n = 211) |
|---|---|
| Gender | |
| Women | 134 (63.5%) |
| Men | 77 (36.5%) |
| Age group | |
| <60 years old | 148 (70.1%) |
| ≥60 years old | 63 (29.9%) |
| Type of hospital admission | |
| Urgency | 172 (81.5%) |
| Elective | 39 (18.5%) |
| Charlson Comorbidity Index | |
| 0 | 86 (40.8%) |
| 1–2 | 78 (36.9%) |
| 3–4 | 31 (14.7%) |
| ≥5 | 16 (7.6%) |
| Reason for admission | |
| Surgical | 90 (42.7%) |
| Clinical | 76 (36.0%) |
| Obstetric | 45 (21.3%) |
| Mean length of stay (days) | 12.2 (SD 18.6) |

SD, standard deviation.
The composite RSM identified 627 occurrences matching screening criteria, of which 274 (43.7%) were considered AEs. One AE could be related to one or more screening criteria. The most frequent sources of information were the interviews with health-care professionals (n = 357), followed by requests to start antibiotics (n = 85). The frequency of occurrence of the screening criteria identified using the RSM, and their positivity for AEs, are described by type of source in Table 2.
Frequency of occurrence of the RSM screening criteria by type of source and percentage of them that were confirmed as adverse events.

| Type of source of the screening criteria | Total | Confirmed as adverse events, n (%) |
|---|---|---|
| Interviews with health-care professionals | 357 | 165 (46.2) |
| Request to start antibiotics | 85 | 29 (34.1) |
| Results of laboratory tests | 49 | 11 (22.5) |
| Review of operative notes | 34 | 9 (26.5) |
| Review of prescriptions | 33 | 15 (45.5) |
| Findings in medical records during the collection or analysis of other data | 21 | 21 (100.0) |
| Voluntary reporting | 21 | 11 (52.4) |
| Early hospital readmission | 11 | 3 (27.3) |
| Review of obstetric notes | 7 | 2 (28.6) |
| Report on health care–associated infections | 6 | 6 (100.0) |
| Transfusion Agency Reports | 2 | 2 (100.0) |
| Review of the cause of death | 1 | 0 (0.0) |
| Total | 627 | 274 (43.7) |
A total of 176 AEs were identified in 67 admissions using the RSM and 129 AEs in 76 admissions using the GTT, resulting in rates of 126 and 93 AEs/1000 patient-days, respectively. Seventy-two AEs were identified by both methods. There was no significant difference between the methods in the identification of AEs in general (P = 0.10). However, in the subgroup analysis, more AEs categorized as harm “E” were identified using the RSM (P < 0.005), while the GTT was superior in identifying AEs of greater harm (F–I) (P = 0.02) (Table 3). No AE resulted in patient death.
| Adverse events by category of harmᵃ | Reference standard | GTT | P-valueᵇ |
|---|---|---|---|
| E | 124 | 64 | <0.005 |
| F | 42 | 49 | 0.11 |
| G | 3 | 8 | – |
| H | 7 | 8 | – |
| I | 0 | 0 | – |
| All categories of harm (E–I) | 176 | 129 | 0.10 |
| Greater harm (F–I) | 52 | 65 | 0.02 |

ᵃ Category of harm by the NCC MERP-IHI adapted version: (E) temporary harm and required intervention, (F) temporary harm and required initial or prolonged hospitalization, (G) permanent harm, (H) intervention required to sustain life, and (I) death.
ᵇ McNemar test.
Table 4 shows the frequency of AEs identified by nature using the two methods, considering all AEs and the subgroup of AEs with greater harm (F–I). In general, the most frequent AEs were those related to peripheral venous access, medication, surgical/anesthetics, infections, and delays in providing health services. None of the 16 AEs related to gastric/enteric catheters were identified through the GTT, and only two of the 67 (3%) AEs related to peripheral venous access were evidenced by the index test. All AEs categorized in these two types were classified as harm “E” and mostly referred to accidental removal.
Frequency of occurrence of adverse events by nature and by method of identification, in absolute numbers and percentages, considering all categories of harm (E–I) and the subgroup of events of greater harm (F–I).

| Adverse events by nature | All (E–I): reference standard, n (%) | All (E–I): GTT-IHI, n (%) | Greater harm (F–I): reference standard, n (%) | Greater harm (F–I): GTT-IHI, n (%) |
|---|---|---|---|---|
| Peripheral venous accesses | 66 (37.5) | 2 (1.6) | 0 (0.0) | 0 (0.0) |
| Medication | 29 (16.5) | 46 (35.7) | 10 (19.2) | 12 (18.5) |
| Surgical/anesthetics | 19 (10.8) | 27 (20.9) | 13 (25.0) | 21 (32.3) |
| Infections | 19 (10.8) | 22 (17.1) | 13 (25.0) | 13 (20.0) |
| Delays in the provision of health services | 12 (6.8) | 12 (9.3) | 12 (23.1) | 12 (18.5) |
| Gastric/enteric catheters | 16 (9.1) | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Transfusion of blood products | 2 (1.1) | 6 (4.7) | 0 (0.0) | 1 (1.5) |
| Phlebitis | 3 (1.7) | 2 (1.6) | 0 (0.0) | 0 (0.0) |
| Pressure injury | 1 (0.6) | 3 (2.3) | 0 (0.0) | 0 (0.0) |
| Radiotherapy | 1 (0.6) | 3 (2.3) | 1 (1.9) | 3 (4.6) |
| Airway | 2 (1.1) | 0 (0.0) | 2 (3.9) | 0 (0.0) |
| Central vascular accesses | 2 (1.1) | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Bladder catheter | 0 (0.0) | 2 (1.6) | 0 (0.0) | 2 (3.1) |
| Obstetric care | 1 (0.6) | 1 (0.8) | 1 (1.9) | 1 (1.5) |
| Fall | 1 (0.6) | 1 (0.8) | 0 (0.0) | 0 (0.0) |
| Transplants | 1 (0.6) | 1 (0.8) | 0 (0.0) | 0 (0.0) |
| Dialysis therapy | 0 (0.0) | 1 (0.8) | 0 (0.0) | 0 (0.0) |
| Skin injury from mechanical restraint | 1 (0.6) | 0 (0.0) | 0 (0.0) | 0 (0.0) |
| Total | 176 | 129 | 52 | 65 |
Although the number of AEs classified as harm “E” is not large enough to allow a subgroup analysis, the GTT tends to be equal or slightly superior to the RSM for the identification of minor AEs of other natures. For example, 34 medication AEs were identified through the GTT, while only 19 were identified through the RSM (12 of them were identified by both); among infections and surgical/anesthetics, nine and six AEs were identified by each of the methods, respectively, with three of them identified by both in each of these natures.
The results of the accuracy of the GTT compared to the RSM are shown in Table 5. Considering only the subgroup of AEs of greater harm (F–I), the sensitivity, specificity, and global accuracy were, respectively, 0.90 (95% confidence interval [CI] 0.77; 0.97), 0.90 (95% CI 0.84; 0.94), and 0.90 (95% CI 0.85; 0.94) for the study unit “per patient admission” and 0.85 (95% CI 0.72; 0.93), 0.88 (95% CI 0.82; 0.92), and 0.87 (95% CI 0.82; 0.91) for the study unit “per AE.” When all AEs were evaluated (E–I), the validity estimates were significantly lower for the study unit “per AE,” with a sensitivity of 0.41 (95% CI 0.34; 0.49), specificity of 0.68 (95% CI 0.60; 0.74), and global accuracy of 0.54 (95% CI 0.49; 0.60).
Sensitivity, specificity, and global accuracy of the GTT in relation to the RSM for the identification of adverse events, considering all categories of harm (E–I) and the subgroup of events of greater harm (F–I).

| Measure | Per patient admission: all (E–I), n = 211 | Per patient admission: greater harm (F–I), n = 211 | Per adverse event: all (E–I), n = 352 | Per adverse event: greater harm (F–I), n = 226 |
|---|---|---|---|---|
| Sensitivity | 0.76 (0.64; 0.86) | 0.90 (0.77; 0.97) | 0.41 (0.34; 0.49) | 0.85 (0.72; 0.93) |
| Specificity | 0.83 (0.75; 0.88) | 0.90 (0.84; 0.94) | 0.68 (0.60; 0.74) | 0.88 (0.82; 0.92) |
| Global accuracy | 0.81 (0.75; 0.86) | 0.90 (0.85; 0.94) | 0.54 (0.49; 0.60) | 0.87 (0.82; 0.91) |

Values in parentheses are 95% confidence intervals.
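As an illustration of how the accuracy measures for the “per AE” study unit (all AEs, E–I) can be computed from a 2 × 2 table, the following is a minimal sketch rather than the authors' analysis code. The cell counts are an assumption: they were back-calculated from the totals reported in the article (176 AEs identified by the RSM, 129 by the GTT, n = 352) and are not published as such. The Wilson score interval is also an assumption, since the interval method is not stated in this section, so the bounds may differ slightly from those in Table 5.

```python
from math import sqrt


def wilson_interval(successes, total, z=1.96):
    """Wilson score interval for a proportion (about 95% when z = 1.96)."""
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half, centre + half


def accuracy_measures(tp, fn, fp, tn):
    """Return {measure: (estimate, (lower, upper))} for a 2 x 2 table.

    tp: events found by both the index test and the reference standard
    fn: events found only by the reference standard
    fp: events found only by the index test
    tn: candidate events ruled out by both
    """
    total = tp + fn + fp + tn
    return {
        "sensitivity": (tp / (tp + fn), wilson_interval(tp, tp + fn)),
        "specificity": (tn / (tn + fp), wilson_interval(tn, tn + fp)),
        "global accuracy": ((tp + tn) / total, wilson_interval(tp + tn, total)),
    }


# Hypothetical cell counts, back-calculated from the reported totals
# (an assumption for illustration; not taken from the article).
measures = accuracy_measures(tp=72, fn=104, fp=57, tn=119)

for name, (estimate, (lower, upper)) in measures.items():
    print(f"{name}: {estimate:.2f} ({lower:.2f}; {upper:.2f})")
```

Under these assumed counts, the point estimates round to the values reported for the “per AE” study unit with all AEs (0.41, 0.68, and 0.54). An exact (Clopper-Pearson) interval could be substituted for `wilson_interval` without changing the rest of the sketch.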
Discussion
Statement of principal findings
The GTT showed satisfactory sensitivity, specificity, and global accuracy for the detection of AEs in inpatients when compared to a composite RSM. The GTT accuracy was higher when AEs with minor harm were not counted. The main AEs categorized as harm “E” that the GTT misses are related to nursing care, such as peripheral venous accesses and gastric/enteric catheters.
Interpretation within the context of the wider literature
To our knowledge, there is no other study that has evaluated the accuracy of the GTT by comparing it with a composite reference standard. A concern about the accuracy of the GTT and other methods that are based on retrospective review of medical records refers to their dependence on the quality of notes and the possible lack of recording of AEs, especially those that caused minor harm to the patient [26]. To assess the supposed impact of this on the GTT accuracy, it was essential that the RSM overcome this limitation and, for this reason, the interview with professionals was included as a search strategy [20]. The screening criteria identified through this source led to the confirmation of 127 of the 176 AEs of the RSM (72.2%), and they were the only source for the identification of 103 AEs (87 categorized as harm “E”).
Considering the study unit “per AE,” there is a significant difference in the GTT accuracy measures when evaluating all AEs in relation to the subgroup of AEs with greater harm. Of the 124 AEs categorized as “E” identified by the RSM, 96 were not identified through the GTT. Of these, notes were found in the medical records of 85, 71 of which were in nursing notes, a part of the record not prioritized in the GTT review process [5, 27, 28]. This loss of sensitivity for minor AEs has already been reported by other researchers and can be explained more by issues inherent to the GTT methodology, which restricts the time for reviewing medical records and recommends that the record not be read “from the first page to the last page” [2, 5], than by low quality of the records [20].
This study was conducted in a university hospital, which has been striving to achieve quality standards in care. These characteristics may have influenced the quality of the medical records and, consequently, the GTT accuracy measures. In addition, during the study period, professionals were approached on the topic of patient safety and this may have improved the quality of notes regarding AEs.
Analyzing only the events identified through the GTT, as in other studies [6–9, 11], the most common natures of AEs were those related to medication, surgery/anesthetics, and infections. The failure to identify a large number of AEs related to nursing care through the GTT method, including those related to peripheral venous access and gastric/enteric catheters, corroborates the impression reported by experienced reviewers that the GTT is primarily focused on harm related to care performed by physicians [27].
Strengths and limitations
The combination of strategies for identifying AEs in the RSM was the strength of the research and aimed to overcome the individual weaknesses of each method, especially regarding underreporting, lack of adequate registration in medical records, and divergences in administrative data [20]. Despite the efforts made by the researchers to create an “almost perfect” method, the reference standard test used in this study is suboptimal. Situations were identified in which the GTT was correct and the RSM was wrong regarding the occurrence of AEs. This bias led to an inadequate reduction in agreement in the 2 × 2 tables and resulted in underestimated accuracy measures [13–15].
Precautions were taken to reduce the chance of bias. The index test and the RSM were applied independently by separate groups of reviewers, ensuring blinding of the results, and both used the same definitions and classifications. Although the RSM includes data from medical records as a source of information, it did not use a systematic search for triggers as in the GTT.
Although primary reviewers with little experience with the GTT were employed, we do not consider this a limitation. Studies have shown that reviewers’ training and experience increase the reliability between them [29, 30], which can also impact the validity of the tool. However, the inter-examiner reliability of this pair of primary reviewers was previously described, with substantial reliability results for the identification of hospitalizations with AEs in relation to a pair of experienced nurses [18]. The results found by the authors are comparable to the findings of other studies that used experienced reviewers [2, 28, 29]. Another concern was that the use of medical students as primary reviewers could have lowered the accuracy of the GTT for nursing care–related AEs. This is unlikely to have occurred because, among the 93 records evaluated by the two pairs of primary reviewers and included in this study, nurses and medical students identified, respectively, one and two AEs related to peripheral venous access (versus 38 through the RSM) and no AEs related to gastric/enteric catheters (versus nine through the RSM).
Although the selection of the methods that composed the RSM was based on the literature, a recent systematic review of all existing methods was not conducted. Feasibility criteria in the local context and the researchers’ view of good opportunities to identify AEs played an important role in this choice. The arbitrary choice of some information sources and screening criteria that were included in the RSM can be considered a limitation of the method.
Another limitation of the methods used to identify AEs, both the GTT and most of those that composed the RSM, is their dependence on professional judgment: a reviewer must decide whether the unfavorable outcome for the patient, which is usually identified by a trigger or screening criterion, is related to health care or to the natural evolution of the disease. To reduce this subjectivity, clear definitions and training strategies were used, in addition to a scale to support the decision of the medical reviewers of both groups. However, a reproducibility analysis was not performed between the medical reviewers of the GTT and the RSM.
Finally, the interviews with the professionals, one of the methods that composed the RSM, were carried out by undergraduate students. The interviews were semi-structured and followed a systematic script (see Supplementary Material 1). The students were trained and supervised by the main researchers. However, we did not assess the students’ ability to obtain information from professionals during interviews or to probe further into what was reported to them.
Implications for policy, practice, and research
The results of this study reinforce the validity of the GTT for the identification of AEs with greater harm but also emphasize its inability to identify some natures (or types) of minor harm AEs. In the present study, AEs related to peripheral venous accesses and gastric/enteric catheters correspond to 35.6% of all AEs identified. These events, although individually causing less impact to the patient, become relevant due to their frequency, consuming significant staff time and costly resources [31, 32]. One study estimated that the annual cost of removing medical devices in a 42-bed intensive care unit was over US$250 000; 88% of these events involved gastrointestinal tubes and vascular catheters [33]. Therefore, these natures of AEs should be seen as indicators of poor-quality care and as targets for improvement actions, which include appropriate measurement strategies [31–34].
Although it was not the objective of the study, the analysis of the AEs identified by different sources in the RSM allows inferences about possible complementary methods to identify certain groups of AEs that are usually missed by the GTT, such as the semi-structured interview with the professionals who provide direct patient care. This method was used as a strategy to identify AEs related to feeding tube complications [34] and can be an alternative to direct observation, which is not suitable for global assessment of AEs, with possible gains in safety culture through greater involvement of direct care professionals [20]. However, further studies are needed on the potential of this method to identify different types of AEs, sampling strategies, possible biases, and costs involved.
Conclusions
The GTT proved to be a valid method for identifying AEs in hospitalized adult patients. Its accuracy increases when minor harm AEs are not counted. Among the main AEs missed by the GTT are those related to nursing care. Therefore, it should be used in combination with other measurement strategies to achieve results that are representative of the quality profile of the care provided and, thus, guide the best improvement strategies.
Supplementary data
Supplementary data is available at INTQHC Journal online.
Data availability statement
The data underlying this article will be shared on reasonable request to the corresponding author.