Kai Sheng Saw, Chen Liu, William Xu, Chris Varghese, Susan Parry, Ian Bissett, Faecal immunochemical test to triage patients with possible colorectal cancer symptoms: meta-analysis, British Journal of Surgery, Volume 109, Issue 2, February 2022, znab411, https://doi.org/10.1093/bjs/znab411
Abstract
This review evaluated the utility of single quantitative faecal immunochemical test (FIT) as a triaging tool for patients with symptoms of possible colorectal cancer, the effect of symptoms on FIT accuracy, and the impact of triaging incorporating FIT on service provision.
Five databases were searched. Meta-analyses of the extracted FIT sensitivities and specificities for detection of colorectal cancer at reported f-Hb thresholds were performed. Secondary outcomes included sensitivity and specificity of FIT for advanced colorectal neoplasia and serious bowel disease. Subgroup analysis by FIT brand and symptoms was undertaken.
Fifteen prospective cohort studies, comprising 28 832 symptomatic patients, were included. At the most commonly reported f-Hb positivity threshold of ≥ 10 µg Hb/g faeces (n=13), the summary sensitivity was 88.7% (95% c.i. 85.2 to 91.4) and the specificity was 80.5% (95% c.i. 75.3 to 84.8) for colorectal cancer. At the lower limit of detection of ≥ 2 µg Hb/g faeces, the summary sensitivity was 96.8% (95% c.i. 91.0 to 98.9) and the specificity was 65.6% (95% c.i. 59.0 to 71.6). At the upper f-Hb positivity thresholds of ≥ 100 µg Hb/g faeces and ≥ 150 µg Hb/g faeces, summary sensitivities were 68.1% (95% c.i. 59.2 to 75.9) and 66.3% (95% c.i. 52.2 to 78.0), with specificities of 93.4% (95% c.i. 91.3 to 95.1) and 95.1% (95% c.i. 93.6 to 96.3) respectively. FIT sensitivity was comparable between different assay brands. FIT sensitivity may be higher in patients reporting rectal bleeding.
Single quantitative FIT at lower f-Hb positivity thresholds can adequately exclude colorectal cancer in symptomatic patients and provides a data-based approach to prioritization of colonoscopy resources.
Introduction
Colorectal cancer incidence and mortality are increasing in some countries1,2. Population bowel screening programmes have increased the demand for limited endoscopy resources. Patients with colorectal symptoms that raise the possibility of a colorectal cancer diagnosis also compete for endoscopy capacity, potentially leading to diagnostic delay and poorer treatment outcomes.
Symptoms alone are poor predictors of a colorectal cancer diagnosis3. A previous systematic review4 concluded that a faecal immunochemical test (FIT) was clinically useful and cost-effective for triage of patients with symptoms of possible colorectal cancer. This resulted in a change to the UK National Institute for Health and Care Excellence (NICE) recommendations to include the use of FIT in symptomatic patients. The COVID-19 pandemic has further limited access to endoscopy, promoting further interest in FIT as a triage tool for patients with colorectal symptoms5,6.
Three previous meta-analyses4,7,8 of small cohort studies concluded that, at the faecal haemoglobin (f-Hb) threshold of 10 µg Hb/g faeces, FIT is useful for excluding colorectal cancer in symptomatic patients. However, larger cohort studies have since been published, so an updated meta-analysis is required. Additionally, further questions about the applicability of FIT in patients with specific symptoms have been raised, and contradictory reports exist on how FIT has influenced service provision4,9–12.
The primary aim of this meta-analysis was to evaluate the utility of quantitative FIT as a triage tool at lower f-Hb thresholds to exclude symptomatic patients from invasive investigations, and at upper f-Hb thresholds to prioritise urgency of investigation. Secondary aims were to evaluate whether specific symptoms have an impact on the diagnostic performance of FIT and to evaluate the impact of FIT triage on service provision.
Methods
This systematic review is reported in accordance with PRISMA guidelines and conducted in accordance with the Cochrane Handbook for Diagnostic Test Accuracy Reviews13,14. The review was registered with the International Prospective Register of Systematic Reviews, PROSPERO (CRD42021244982).
Data sources
A systematic search of MEDLINE, Embase, Cochrane, Scopus, and PubMed databases was undertaken using a predefined search strategy (Appendix S1) including publications from database inception to March 2021. An updated literature search was conducted in August 2021 to capture additional studies published during the time elapsed between the initial search and manuscript submission. Search terms were broadly categorized into: index test of FIT and target condition of colorectal cancer. Search strategies from previous systematic reviews4,7 were referenced. No language limits were applied. The search was limited to humans only and any articles with the word ‘screen’ in the article title were excluded. A bibliographic search of included studies and relevant review articles was also performed.
Study selection: inclusion and exclusion criteria
Prospectively recruited cohort studies that used quantitative FIT for patients reporting symptoms of possible colorectal cancer were included. Acceptable reference standards included lower gastrointestinal tract endoscopy or CT colonography. In studies designed such that not all patients with a negative FIT would undergo further investigation, the acceptable reference standard for the cohort was at least 24 months of clinical records or cancer registry-based follow-up to minimize the risk of verification bias. Sufficient reported data to determine the numbers of true-positives, false-positives, true-negatives, and false-negatives for colorectal cancer were required. Attempts were made to contact authors for additional data when studies included a mixed cohort containing a subset of symptomatic patients. If multiple studies involved the same patient cohort, only the most recently published study was included. No limitations were placed on the quantitative FIT assay brand or analyser.
Exclusion criteria were: retrospective or case–control studies, patients who underwent colonic investigation for screening or surveillance, studies involving asymptomatic patients undergoing investigation for family history alone, and studies excluding specific symptoms or involving patients with existing underlying gastrointestinal tract disease.
Two authors concurrently and independently performed the database search, title and abstract screening, and conducted full-text reviews to determine eligibility. Consensus was required for study inclusion and discrepancies were adjudicated by a senior author.
Data extraction
One author undertook data extraction using a pro-forma table. Two different authors independently checked the accuracy of data extraction. Extracted data included information on study design, study setting, study inclusion and exclusion criteria, participant demographics, presenting symptoms, target condition(s), proportion of returned FITs, FIT assay brand, reference standard used, reduction in referrals, numbers of true-positives, false-positives, true-negatives, and false-negatives for each reported f-Hb positivity threshold, and detailed descriptions of false-negative cases, where present. In studies in which more than one FIT was used in patient assessment, data from only the first test were extracted. Anaemia was considered a symptom for the purpose of this review as it is an important clinical presentation of colorectal cancer15.
Quality assessment of included studies
Methodological bias was assessed using the Quality Assessment of Diagnostic Accuracy Studies (QUADAS-2)16 by three authors independently. Final agreement on assessment outcomes was achieved by consensus and discussion.
Data synthesis and statistical analysis
The primary outcomes were the summary sensitivity and specificity of quantitative FIT for the detection of colorectal cancer at various f-Hb positivity thresholds. The f-Hb positivity thresholds chosen for meta-analyses were thresholds at limits of detection or commonly reported thresholds in included studies. A bivariate meta-analysis using methods described by Reitsma and colleagues17 was conducted. Summary estimates of sensitivity and specificity, with 95 per cent confidence intervals, were calculated for each f-Hb threshold. Although meta-analyses to compute paired summary variables such as positive and negative predictive values were possible, these values are heavily dependent on the prevalence of colorectal cancer in investigated cohorts, so were not calculated. For between-study comparison, diagnostic odds ratios and summary receiver operating characteristic curves (sROCs) were used as a global measure of diagnostic test performance. sROCs were constructed using a bivariate model and graphically depicted a summary operating point and 95 per cent confidence regions. The number needed to scope (NNS), that is the number of individuals required to undergo colonoscopy to detect one colorectal cancer, was calculated for each reported f-Hb threshold18. As NNS is prevalence-dependent, no meta-analysis was undertaken.
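The per-study measures named above all follow from a 2×2 table of FIT result against the reference standard. The following is a minimal Python sketch of that arithmetic, not the bivariate model itself. It assumes that NNS is the number of FIT-positive patients scoped per cancer detected (the reciprocal of the positive predictive value); the function name and the counts are hypothetical and are not taken from any included study.

```python
def diagnostic_measures(tp: int, fp: int, fn: int, tn: int) -> dict:
    """Per-study accuracy measures from a 2x2 table (illustrative sketch).

    Zero cells would need a continuity correction, which is not handled here.
    """
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return {
        "sensitivity": sensitivity,
        "specificity": specificity,
        # Likelihood ratios are functions of sensitivity and specificity only.
        "positive_lr": sensitivity / (1 - specificity),
        "negative_lr": (1 - sensitivity) / specificity,
        # Diagnostic odds ratio: odds of a positive test with vs without disease.
        "diagnostic_odds_ratio": (tp * tn) / (fp * fn),
        # Assumption: NNS = FIT-positive patients scoped per cancer found = 1 / PPV.
        "number_needed_to_scope": (tp + fp) / tp,
    }


# Hypothetical cohort of 1000 patients with 40 cancers.
measures = diagnostic_measures(tp=36, fp=164, fn=4, tn=796)
```

Because the positive predictive value, and therefore NNS, shifts with the proportion of patients who have cancer, the same assay can yield quite different NNS values in different cohorts, which is consistent with the decision not to pool it.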
Subgroup analysis for the target condition of colorectal cancer was performed to assess the impact of assay brand and specific presenting symptoms where sufficient data were available. Subgroup summary estimates were compared with the main results based on 95 per cent confidence intervals. Forest plots for subgroup analysis were constructed.
Secondary outcomes included the summary sensitivity and specificity of FIT for the detection of the secondary target conditions of advanced colorectal neoplasia and serious bowel disease. Advanced colorectal neoplasia was defined as the presence of colorectal cancer or high-risk adenomas. Serious bowel disease was defined as the presence of advanced colorectal neoplasia or inflammatory bowel disease.
Sensitivity analyses based on publication type and variable target condition definitions were performed to assess their impact on the summary findings. Heterogeneity was assessed visually through inspection of sROCs and statistically using the I² method; in the latter analysis, values of 25, 50, and 75 per cent correspond to low, moderate, and high heterogeneity respectively19. Publication bias was assessed visually with funnel plots and quantitatively by means of Deeks’ test20. P < 0.050 was considered statistically significant.
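For orientation, the I² statistic is commonly derived from Cochran's Q as the share of total variation attributable to between-study differences. The sketch below uses that standard univariate formula; how I² was obtained within the bivariate analysis is not stated here, so this should be read as an illustration under that assumption. The function names, the Q value, and the banding rule (label by the highest benchmark reached) are hypothetical.

```python
def i_squared(q: float, n_studies: int) -> float:
    """I² (%) from Cochran's Q using the univariate formula; floored at 0."""
    df = n_studies - 1
    if q <= 0 or df <= 0:
        return 0.0
    return max(0.0, (q - df) / q) * 100.0


def heterogeneity_label(i2: float) -> str:
    """Assumed banding: label by the highest 25/50/75 benchmark reached."""
    if i2 >= 75:
        return "high"
    if i2 >= 50:
        return "moderate"
    if i2 >= 25:
        return "low"
    return "below the low benchmark"


# Hypothetical example: Q = 8.0 across 5 studies (4 degrees of freedom).
example_i2 = i_squared(8.0, 5)
example_label = heterogeneity_label(example_i2)
```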
Analyses were done using R version 4.0.1 (R Foundation for Statistical Computing, Vienna, Austria) with metaR and mada packages.
Results
Search results
The initial literature search yielded 16 698 references, of which 15 studies were included in the review (Fig. 1)9,10,21–33. A table of excluded articles with reasons for exclusion is available in Appendix S2. Additional unpublished data were supplied by the authors of two studies23,28 to allow their inclusion in the meta-analysis.

A total of 28 832 patients were included, with individual cohort sizes ranging from 178 to 9822 patients (Appendix S3). All studies included prospectively recruited cohorts; there were six multicentre studies and one conducted in a primary-care setting9,10,21,23,24,31,33. The prevalence of colorectal cancer ranged from 1.9% to 16.8%. The most commonly used FIT assay brands were HM-JACKarc™ (Hitachi Chemical Diagnostics Systems, Tokyo, Japan) and OC-Sensor™ (Eiken Chemical Company, Tokyo, Japan), and the most commonly reported f-Hb positivity threshold was ≥ 10 µg Hb/g faeces (n=13).
Study quality assessment
Methodological quality assessments demonstrated that no study was at high risk of bias in more than one domain (Appendix S4). Seven studies were assessed to be at low risk of bias across all four domains.
Diagnostic performance of quantitative FIT for detection of colorectal cancer in symptomatic patients
The summary sensitivity and specificity at different f-Hb thresholds are reported in Table 1 and illustrated graphically in Fig. 2. The summary sensitivity for the most commonly reported f-Hb positivity threshold of ≥10 µg Hb/g faeces was 88.7% (95% c.i. 85.2 to 91.4) and the specificity was 80.5% (95% c.i. 75.3 to 84.8). In patients presenting with symptoms of possible colorectal cancer, quantitative FIT, at ≥10 µg Hb per g faeces, will detect 88 of 100 colorectal cancers and miss 12 cases.

Fig. 2 Summary receiver operating characteristic curves for faecal immunochemical test at various faecal haemoglobin positivity thresholds for detection of colorectal cancer. Solid lines represent the summary receiver operating characteristic curves. Fully coloured points on the solid lines represent the summary estimate points. The areas within the dashed lines represent the 95% confidence regions. Lighter coloured points represent individual study cohorts; the size of each point is proportional to the number of participants in that study (a larger symbol indicates a larger study).
Table 1 Colorectal cancer detection: diagnostic performance measures stratified by faecal haemoglobin positivity threshold for faecal immunochemical test
Reference | Assay type/brand | FIT positivity at specified threshold (%) | CRC prevalence (%) | No. of patients in analysis | Sensitivity (%) (95% CI) | Specificity (%) (95% CI) | Negative predictive value (%) (95% CI) | Negative likelihood ratio (95% CI) | No. needed to scope |
---|---|---|---|---|---|---|---|---|---|
Threshold ≥ 2 µg Hb per g faeces or equivalent | | | | | | | | | |
D’Souza et al.10 NICE FIT 2021 | HM-JACKarc | 37.2 | 3.3 | 9822 | 97.0 (94.5, 98.3) | 64.9 (63.9, 65.8) | 99.8 (99.7, 99.9) | 0.047 (0.025, 0.086) | 11.5 |
Turvill et al.33 2021 | HM-JACKarc | 40.9 | 3.0 | 5040 | 92.7 (87.4, 95.9) | 60.7 (59.3, 62.1) | 99.6 (99.4, 99.8) | 0.120 (0.068, 0.212) | 14.7 |
D’Souza et al.27 2020 | HM-JACKarc | 30.0 | 4.0 | 298 | 100 (100) | 73.0 (67.6, 77.8) | 100 (100) | 0.053 (0.003, 0.799) | 7.4 |
Summary estimate (I² = 68%) | | | | | 96.8* (91.0, 98.9) | 65.6 (59.0, 71.6) | | 0.077 (0.035, 0.167) | |
Threshold ≥ 10 µg Hb per g faeces or equivalent | | | | | | | | | |
D’Souza et al.10 NICE FIT 2021 | HM-JACKarc | 19.0 | 3.3 | 9822 | 90.9 (87.3, 93.5) | 83.5 (82.8, 84.3) | 99.6 (99.5, 99.8) | 0.109 (0.078, 0.154) | 6.2 |
Mowat et al.9 2019 | HM-JACKarc | 21.9 | 1.9 | 5372 | 84.5 (76.2, 90.2) | 79.4 (78.2, 80.4) | 99.6 (99.4, 99.8) | 0.196 (0.125, 0.307) | 13.5 |
Turvill et al.33 2021 | HM-JACKarc | 21.2 | 3.0 | 5040 | 87.4 (81.2, 91.8) | 80.9 (79.8, 82.0) | 99.5 (99.3, 99.7) | 0.156 (0.102, 0.237) | 8.1 |
Khan et al.29 2020 | HM-JACKarc | 20.0 | 5.1 | 928 | 85.1 (72.3, 92.6) | 83.5 (80.9, 85.8) | 99.1 (98.4, 99.8) | 0.178 (0.090, 0.353) | 4.6 |
Farrugia et al.28 2020 | HM-JACKarc | 22.1 | 5.7 | 612 | 85.7 (70.6, 93.7) | 81.8 (78.4, 84.7) | 99.0 (98.0, 100) | 0.175 (0.078, 0.394) | 4.5 |
D’Souza et al.27 2020 | HM-JACKarc | 16.0 | 4.0 | 298 | 91.7 (64.4, 98.5) | 87.1 (82.7, 90.5) | 99.6 (98.8, 100) | 0.096 (0.015, 0.625) | 4.4 |
Rodriguez-Alonso et al.22 2015 | OC-Sensor | 22.5 | 3.0 | 1003 | 96.7 (83.3, 99.4) | 79.8 (77.1, 82.2) | 99.9 (99.1, 100) | 0.042 (0.006, 0.287) | 7.8 |
Symonds et al.23 2016 | OC-Sensor | 22.5 | 5.8 | 480 | 85.7 (68.5, 94.3) | 81.4 (77.6, 84.7) | 98.9 (97.8, 99.9) | 0.175 (0.071, 0.435) | 4.5 |
Morales-Arraez et al.25 2018 | OC-Sensor | 48.6 | 11.4 | 245 | 92.9 (77.4, 98.0) | 57.1 (50.5, 63.5) | 98.4 (96.2, 100) | 0.125 (0.033, 0.478) | 4.6 |
Ayling et al.26 2019 | OC-Sensor | 6.7 | 3.9 | 178 | 71.4 (35.9, 91.8) | 95.9 (91.8, 98.0) | 98.8 (97.1, 100) | 0.298 (0.092, 0.962) | 2.4 |
Navarro et al.30 2020 | FOBGold/SENTiFIT | 28.3 | 5.0 | 727 | 94.4 (81.9, 98.5) | 75.1 (71.8, 78.2) | 99.6 (99.0, 100) | 0.074 (0.019, 0.285) | 6.1 |
Maclean et al.32 2021 | QuikRead GO | 31.5 | 2.5 | 553 | 92.9 (68.5, 98.7) | 70.1 (66.1, 73.8) | 99.7 (99.2, 100) | 0.102 (0.015, 0.674) | 13.4 |
Tsapournas et al.31 2020 | QuikRead GO | 26.4 | 5.4 | 242 | 92.3 (66.7, 98.6) | 77.3 (71.4, 82.2) | 99.4 (98.3, 100) | 0.100 (0.015, 0.655) | 5.3 |
Summary estimate (I² = 0%) | | | | | 88.7 (85.2, 91.4) | 80.5 (75.3, 84.8) | | 0.144 (0.119, 0.175) | |

Reference | Assay type/brand | FIT positivity at specified threshold (%) | CRC prevalence (%) | No. of patients in analysis | Sensitivity (%) (95% CI) | Specificity (%) (95% CI) | Positive predictive value (%) (95% CI) | Positive likelihood ratio (95% CI) | No. needed to scope |
---|---|---|---|---|---|---|---|---|---|
Threshold ≥ 100 µg Hb per g faeces or equivalent | | | | | | | | | |
Turvill et al.33 2021 | HM-JACKarc | 9.1 | 3.0 | 5040 | 66.2 (58.4, 73.3) | 92.7 (91.9, 93.4) | 21.9 (18.1, 25.7) | 9.1 (7.8, 10.6) | 4.6 |
Maclean et al.32 2021 | QuikRead GO | 7.1 | 2.5 | 553 | 71.4 (45.4, 88.3) | 94.6 (92.4, 96.2) | 25.6 (11.9, 39.3) | 13.3 (8.2, 21.6) | 3.9 |
Summary estimate (I² = 0%) | | | | | 68.1 (59.2, 75.9) | 93.4 (91.3, 95.1) | | 10.2 (7.2, 14.4) | |
Threshold ≥ 150 µg Hb per g faeces or equivalent | | | | | | | | | |
D’Souza et al.10 NICE FIT 2021 | HM-JACKarc | 7.6 | 3.3 | 9822 | 70.8 (65.7, 75.5) | 94.6 (94.1, 95.0) | 31.1 (27.8, 34.4) | 13.0 (11.7, 14.5) | 3.2 |
Maclean et al.32 2021 | QuikRead GO | 5.4 | 2.5 | 553 | 57.1 (32.6, 78.6) | 95.9 (93.9, 97.3) | 26.7 (10.8, 42.5) | 14.0 (7.6, 25.8) | 3.8 |
Summary estimate (I² = 11%) | | | | | 66.3 (52.2, 78.0) | 95.1 (93.6, 96.3) | | 13.1 (11.7, 14.5) | |
Values in parentheses are 95 per cent confidence intervals. *Continuity correction. FIT, faecal immunochemical test; CRC, colorectal cancer; Hb, haemoglobin; NICE, National Institute for Health and Care Excellence.
At another commonly reported f-Hb positivity threshold of ≥ 20 µg Hb/g faeces involving five studies, the summary sensitivity was 83.6% (95% c.i. 70.8 to 91.5) with specificity of 82.8% (95% c.i. 75.3 to 88.3).
At the lower limit of detection for the HM-JACKarc FIT assay (≥2 µg Hb/g faeces), three studies involving 15 160 patients were included in meta-analysis, with a summary sensitivity of 96.8% (95% c.i. 91.0 to 98.9) and specificity of 65.6% (95% c.i. 59.0 to 71.6). At lower f-Hb positivity thresholds of ≥2 µg Hb/g faeces and ≥10 µg Hb/g faeces, negative predictive values for every study reporting at these thresholds were no lower than 98.4%, regardless of colorectal cancer prevalence.
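To see how negative predictive value relates to prevalence, the calculation can be sketched from sensitivity and specificity via Bayes' rule. The sketch below plugs in the summary estimates at ≥10 µg Hb/g faeces (88.7% and 80.5%) with hypothetical prevalences of 2, 5, and 10 per cent; it will not reproduce any single study's reported value, because each study has its own sensitivity and specificity. The function name is an assumption for illustration.

```python
def negative_predictive_value(sensitivity: float, specificity: float,
                              prevalence: float) -> float:
    """NPV from sensitivity, specificity and prevalence (all as fractions)."""
    true_negatives = specificity * (1 - prevalence)
    false_negatives = (1 - sensitivity) * prevalence
    return true_negatives / (true_negatives + false_negatives)


# Summary estimates at >=10 µg Hb/g faeces; the prevalences are hypothetical.
npv_by_prevalence = {
    p: negative_predictive_value(0.887, 0.805, p) for p in (0.02, 0.05, 0.10)
}
```

The NPV falls as prevalence rises, but slowly while prevalence stays low, which is why a negative FIT remains reassuring across cohorts with modest cancer prevalence.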
At the upper f-Hb positivity threshold of ≥100 µg Hb/g faeces, two studies32,33 were included in meta-analysis, with a summary sensitivity of 68.1% (95% c.i. 59.2 to 75.9) and specificity of 93.4% (95% c.i. 91.3 to 95.1). Two studies10,32 were included in meta-analysis at an upper f-Hb positivity threshold of ≥150 µg Hb/g faeces, with a summary sensitivity of 66.3% (95% c.i. 52.2 to 78.0) and specificity of 95.1% (95% c.i. 93.6 to 96.3). At both of these upper f-Hb positivity thresholds, summary estimates of positive likelihood ratio were greater than 10. Another study33 reported diagnostic performance at an f-Hb positivity threshold of ≥300 µg Hb/g faeces with a specificity of 95.1% (95% c.i. 94.4 to 95.7) and positive likelihood ratio of 10.7. One study31 reported that almost all patients diagnosed with colorectal cancer had f-Hb levels above the upper limit of detection of ≥200 µg Hb/g faeces for the QuikRead GO® FIT assay brand (Orion Diagnostica Oy, Danderyd, Sweden).
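A positive likelihood ratio converts a pre-test probability into a post-test probability through odds. The sketch below applies the summary positive likelihood ratio of 10.2 (≥100 µg Hb/g faeces) to a hypothetical pre-test probability of 3 per cent; the function name and the chosen pre-test probability are assumptions for illustration only.

```python
def post_test_probability(pre_test_probability: float,
                          likelihood_ratio: float) -> float:
    """Post-test probability via odds: post-odds = pre-odds x likelihood ratio."""
    pre_odds = pre_test_probability / (1 - pre_test_probability)
    post_odds = pre_odds * likelihood_ratio
    return post_odds / (1 + post_odds)


# Hypothetical 3% pre-test probability with the summary positive LR of 10.2.
probability_after_positive_fit = post_test_probability(0.03, 10.2)
```

Under these assumptions a positive result at the upper threshold raises the probability of colorectal cancer from 3 per cent to roughly 24 per cent, which illustrates why such results could be used to prioritize urgency of investigation.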
Subgroup analysis by assay type/brand
Six studies used the HM-JACKarc FIT assay. For the primary target condition of colorectal cancer, summary sensitivity at a threshold of ≥10 µg Hb/g faeces was 88.4% (95% c.i. 84.8 to 91.3) and the specificity was 82.2% (95% c.i. 80.3 to 83.9) (Appendix S5). Five studies used the OC-Sensor FIT assay. For the primary target condition of colorectal cancer, the summary sensitivity at an f-Hb threshold of ≥10 µg Hb/g faeces was 87.8% (95% c.i. 72.9 to 95.0) and the specificity was 82.4% (95% c.i. 60.3 to 93.5) (Appendix S6). Two studies31,32 used the QuikRead GO assay. For the primary target condition of colorectal cancer, summary sensitivity at a threshold of ≥10 µg Hb/g faeces was 92.6% (95% c.i. 74.7 to 98.1) and the specificity was 73.5% (95% c.i. 65.8 to 79.9). There was no statistically significant difference in diagnostic test performance when the results were stratified by FIT assay brand at the commonly reported f-Hb threshold of ≥10 µg Hb/g faeces (Appendix S7). All included studies that used the HM-JACKarc FIT assay were conducted in the UK. Meta-analyses of data obtained for the FIT brands FOB GOLD®/SENTiFIT® (Sysmex-Sentinel, Barcelona, Spain) and HM-JACK™ (Kyowa Medex Company, Tokyo, Japan) were not possible as each assay was used in only one included study.
Subgroup analysis by symptoms
Twelve studies reported data outlining symptom frequencies (Appendix S3). The methods of symptom reporting, analysis, and definitions were heterogeneous, and precluded meta-analyses in most instances. Two studies provided sufficient data for a limited meta-analysis of the impact of the symptom of rectal bleeding on FIT diagnostic performance. The summary sensitivity of FIT at a f-Hb positivity threshold of ≥10 µg Hb/g faeces was 96.1% (95% c.i. 91.5 to 98.2) for patients reporting rectal bleeding, whereas it was 86.0% (95% c.i. 80.2 to 90.3) for patients who did not report rectal bleeding. An inverse relationship was noted for specificity (Appendix S8). Two studies10,22 outlined results of multivariable regression analyses assessing the impact of anaemia on FIT diagnostic performance, with inconsistent findings. Two studies28,29 noted that anaemia was a common referral indication among patients with false-negative results at low f-Hb positivity thresholds, but it was unclear whether a regression analysis was undertaken.
Two studies30,31 assessed the impact of symptoms on FIT diagnostic performance, and concluded that no combination of FIT and symptoms was statistically significantly superior at detection of colorectal cancer when compared with single quantitative FIT in isolation.
Diagnostic performance of quantitative FIT in detection of secondary target conditions
The diagnostic performance of quantitative FIT for detection of secondary target conditions is outlined in Appendix S9. For advanced colorectal neoplasia, the summary sensitivity of FIT at an f-Hb positivity threshold of ≥10 µg Hb/g faeces was 68.4% (95% c.i. 64.2 to 72.3) with a specificity of 81.1% (95% c.i. 77.2 to 84.5). For serious bowel disease, the summary sensitivity of FIT at an f-Hb positivity threshold of ≥10 µg Hb/g faeces was 69.7% (95% c.i. 62.6 to 76.0) with a specificity of 80.4% (95% c.i. 73.9 to 85.5).
Impact on service provision
Modelling results for the NNS at an f-Hb threshold of ≥10 µg Hb/g faeces are presented in Table 2. Across the range of reported colorectal cancer prevalences and f-Hb thresholds, the NNS with triaging incorporating FIT was modelled to be lower than the NNS without FIT-incorporated triaging. The same pattern in NNS differences between FIT-positive and FIT-negative patient groups was noted across all f-Hb positivity thresholds reported in the included studies.
Table 2 Number needed to scope for detection of one colorectal cancer at specified faecal haemoglobin positivity thresholds
Reference | CRC prevalence (%) | Cohort size | % of cohort with negative FIT at threshold ≥10 µg Hb per g faeces | NNS without FIT triaging | NNS, FIT ≥10 µg Hb per g faeces | NNS, FIT <10 µg Hb per g faeces | False-negative rate (%) |
---|---|---|---|---|---|---|---|
D’Souza et al.10 | 3.3 | 9822 | 81.0 | 29.9 | 6.2 | 265.3 | 9.1 |
Mowat et al.9 | 1.9 | 5372 | 78.1 | 52.2 | 13.5 | 262.3 | 15.5 |
Turvill et al.33 | 3.0 | 5040 | 78.8 | 33.4 | 8.1 | 209.2 | 12.6 |
Rodriguez-Alonso et al.22 | 3.0 | 1003 | 77.5 | 33.4 | 7.8 | 777.0 | 3.3 |
Khan et al.29 | 5.1 | 928 | 80.0 | 19.7 | 4.6 | 106.1 | 14.9 |
Navarro et al.30 | 5.0 | 727 | 71.7 | 20.2 | 6.1 | 260.5 | 5.6 |
Farrugia et al.28 | 5.7 | 612 | 77.9 | 17.5 | 4.5 | 95.4 | 14.3 |
Maclean et al.32 | 2.5 | 553 | 68.5 | 39.5 | 13.4 | 379.0 | 7.1 |
Symonds et al.23 | 5.8 | 480 | 77.5 | 17.1 | 4.5 | 93.0 | 14.3 |
D’Souza et al.27 | 4.0 | 298 | 83.9 | 24.8 | 4.4 | 250.0 | 8.3 |
Morales-Arraez et al.25 | 11.4 | 245 | 51.4 | 8.8 | 4.6 | 63.0 | 7.1 |
Tsapournas et al.31 | 5.4 | 242 | 73.6 | 18.6 | 5.3 | 178.0 | 7.7 |
Ayling et al.26 | 3.9 | 178 | 93.3 | 25.4 | 2.4 | 83.0 | 28.6 |
CRC, colorectal cancer; FIT, faecal immunochemical test; Hb, haemoglobin; NNS, number needed to scope to detect one CRC.
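The arithmetic behind the NNS columns can be sketched as follows. This is an assumed reconstruction rather than the review's own modelling code: it takes the false-negative rate to be 1 − sensitivity, and it uses the rounded table inputs, so the outputs differ slightly from the tabulated values. The function name `nns_by_fit_result` is hypothetical.

```python
def nns_by_fit_result(prevalence: float, false_negative_rate: float,
                      fraction_fit_negative: float) -> dict:
    """Number needed to scope (NNS) to find one CRC, overall and by FIT result.

    All inputs are proportions. Assumes false_negative_rate = 1 - sensitivity.
    """
    # Share of the whole cohort that has CRC and falls in each FIT group.
    crc_in_fit_negative = prevalence * false_negative_rate
    crc_in_fit_positive = prevalence * (1.0 - false_negative_rate)
    fraction_fit_positive = 1.0 - fraction_fit_negative
    return {
        "without_fit": 1.0 / prevalence,
        "fit_positive": fraction_fit_positive / crc_in_fit_positive,
        "fit_negative": fraction_fit_negative / crc_in_fit_negative,
    }


# Rounded inputs from the first row of the table (D'Souza et al.):
# CRC prevalence 3.3%, false-negative rate 9.1%, 81.0% of cohort FIT negative.
row = nns_by_fit_result(prevalence=0.033, false_negative_rate=0.091,
                        fraction_fit_negative=0.810)
for label, value in row.items():
    print(f"{label}: {value:.1f}")
```

With these rounded inputs the sketch gives roughly 30, 6 and 270 colonoscopies per cancer detected, close to the tabulated 29.9, 6.2 and 265.3. The residual gap is expected from rounding of the published percentages.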
Regarding referral numbers, one study9 reported a 15.1% reduction in overall referrals to relevant secondary-care services within 1 year of the introduction of FIT compared with the same period in previous years. Three studies10,27,31 used the FIT positivity rate at specified lower f-Hb thresholds to project anticipated reductions in referrals based on the application of FIT in triaging pathways for patients with symptoms of possible colorectal cancer. The projections ranged from a 62.8% to an 83.9% reduction in referrals depending on f-Hb threshold. Seven studies also reported the percentage of normal colonoscopies in their cohorts, ranging from 31.3% to 64.3%, but it is unclear how many of these patients had a negative FIT at lower f-Hb thresholds.
Ten studies reported the proportion of respondents (those who returned a FIT kit), with a mean of 72.7% (range 41.6 to 98.2), which can be seen as a surrogate marker for the acceptability of the test.
Study heterogeneity and sensitivity analyses
Sensitivity analyses were performed: including only studies that used colonoscopy as a reference standard; excluding studies with variable definitions of high-risk adenoma and serious bowel disease; excluding conference abstracts; and excluding two cohorts that included only patients with anaemia. All such analyses showed no statistically significant differences compared with the summary sensitivity and specificity results presented.
The I² value for the meta-analyses involving the target condition colorectal cancer ranged from 0% to 68%, indicating low-to-moderate between-study heterogeneity. Visual inspection of sROC plots (Fig. 2) identified two small studies25,26 that recruited symptomatic patients with anaemia as obvious outliers on either side of summary estimates of sensitivity and specificity, suggesting that cohorts of anaemic patients may have contributed to the heterogeneity. However, sensitivity analysis excluding these two studies did not demonstrate a statistically significant difference compared with the summary sensitivity and specificity results presented. Threshold effect was accounted for by separate meta-analyses of FIT diagnostic performance at described thresholds.
Publication bias
Graphical inspection of funnel plots and Deeks’ tests showed no significant publication bias, but the small number of included studies does not allow definitive conclusions to be drawn.
Discussion
This systematic review and meta-analysis evaluated the performance of single quantitative FIT in triaging patients with possible symptoms of colorectal cancer referred for colonic investigation. At lower f-Hb positivity thresholds of 10 and 2 μg Hb/g faeces, the results suggest that single quantitative FIT is a clinically useful test to exclude colorectal cancer in symptomatic patients. At upper f-Hb positivity thresholds of 100 and 150 μg Hb/g faeces, the findings suggest that quantitative FIT can be used to stratify patients into different colorectal cancer risk groups based on the f-Hb level. For detection of the secondary target conditions of advanced colorectal neoplasia and serious bowel disease, FIT at low f-Hb thresholds has inferior sensitivity compared with that for detection of colorectal cancer.
Pin Vieito and colleagues7 previously conducted a similar systematic review and meta-analysis, which included seven studies involving a total of 5229 symptomatic patients who completed a FIT. Only summary diagnostic performance data for the OC-Sensor assay were reported, with a summary sensitivity of 94.1 per cent and specificity of 66.0 per cent. Likewise, Westwood and colleagues4 undertook a systematic review and meta-analysis that included nine studies. Meta-analysis was again reported only for the OC-Sensor assay, with a summary sensitivity of 92.1 per cent and specificity of 85.8 per cent. The variation in summary estimates between reviews may be related to key differences in the patient cohorts included. These two previous reviews included studies of symptomatic cohorts that were mixed with asymptomatic patients, such as patients undergoing surveillance for polyps and colorectal cancers4,7,34–38. Furthermore, inclusion of cohorts in which a proportion of patients who had a FIT did not undergo a reference standard test could lead to verification bias39. The present review used a stringent set of eligibility criteria to minimise the risk of contaminated cohorts and verification bias.
One of the largest reported service evaluations assessing the real-world performance of quantitative FIT-incorporated triaging of symptomatic patients was conducted in the UK from 2017 to 2019, and included 13 361 patients in primary care12. Various subgroups of study recruitment and follow-up period have been described over multiple publications, representing an important body of work in this area of research11,12,40–43. The 10-month interim false-negative rates at thresholds of 10 and 20 μg Hb per g faeces were 7.9 and 12.3 per cent respectively, suggesting promising results in line with the findings of this review12. Furthermore, colorectal cancers detected at thresholds of 100 and 150 μg Hb per g faeces represented 64.1 and 55.4 per cent of all colorectal cancers detected at time of analysis, validating the importance of upper f-Hb thresholds in prioritizing symptomatic patients for further colonic assessment.
No test is perfect. An estimate of the sensitivity of colonoscopy for detection of colorectal cancer is 94.7% (95% c.i. 90.4 to 97.2)44. For a less invasive and less resource-intensive test, FIT has fairly high sensitivity for the detection of colorectal cancer in symptomatic patients, especially at f-Hb thresholds approaching lower limits of detection. It is worth noting that the quantitative FIT assay–analyser systems included in this review have different lower limits of detection45. As with any quantitative test, selection of progressively lower f-Hb positivity thresholds for further investigation to avoid missing colorectal cancers will inevitably involve larger numbers of false-positive results that need further colonic evaluation. However, incorporation of FIT for triaging symptomatic patients appears superior to triaging based on symptoms alone in terms of colonoscopy resource requirements. The NNS for each f-Hb threshold must be interpreted with caution as it is dependent on the prevalence of colorectal cancer in a target population. Therefore, meta-analyses of this variable were not undertaken in the present review. Despite this, NNS remains a useful metric and is better understood by the wider clinical community who will be ordering FIT, making clinical decisions based on FIT, and communicating FIT results to patients.
As reported by some of the larger studies in this review, the ‘optimal’ f-Hb threshold at which both sensitivity and specificity are maximised can vary between study cohorts, ranging from 15 to 38 μg Hb per g faeces despite similar patient selection10,22,33. Within the limitations of the present review, the f-Hb threshold at which both summary sensitivity and specificity estimates would be maximised would appear to be closer to 20 than to the 10 μg Hb per g faeces recommended by NICE guidelines. Recent publications12,33,46 have demonstrated that lowering the f-Hb threshold below points of maximal sensitivity and specificity increases sensitivity minimally with a disproportionately large decrease in specificity. However, this statistical definition of ‘optimal’ may not necessarily be acceptable to individual patients and clinicians alike, who may place greater importance on maximising sensitivity alone given the concerns around missing a colorectal cancer diagnosis, particularly in the presence of concerning symptoms. Reassuringly, the NNS modelling suggests that, even at the lowest f-Hb thresholds where sensitivity alone is maximised, a triage system incorporating FIT remains more efficient than the contemporary practice of triaging based on symptoms alone.
The sensitivity of FIT decreases significantly in the detection of advanced colorectal neoplasia and serious bowel disease. The summary sensitivity was found to be 68.4% and 69.7% for the respective conditions at f-Hb positivity thresholds of at least 10 μg Hb/g faeces. Although this systematic review was not specifically designed to assess the performance of FIT in the detection of serious bowel disease, previous systematic reviews4,7 reported similar FIT summary sensitivity and specificity for these target conditions. The heterogeneity in definitions of high-risk adenomas, advanced colorectal neoplasia, and serious bowel disease contributes to the difficulty in making interstudy comparisons. In the context of symptomatic patients, the diagnostic performance of FIT for detecting conditions such as high-risk adenoma may not be as important given they have no clearly demonstrable association with bowel symptoms potentially indicative of underlying colorectal cancer47. Future research is required to assess methods to improve the diagnostic performance of FIT for detection of other colonic pathologies to allow incorporation of FIT into triage systems for diagnoses beyond just colorectal cancer. The relatively lower sensitivity of FIT for serious bowel disease compared with colorectal cancer will demand clinically and fiscally appropriate safety netting systems to ensure that patients with false-negative tests have readily accessible alternative pathways for the diagnosis and treatment of non-colorectal cancer pathologies.
A limitation of this review was the exclusion of several studies owing to inadequate data reporting. Seven potential study cohorts with a mixture of symptomatic and asymptomatic patients were excluded only after unsuccessful attempts to contact authors to request further data on subsets of symptomatic patients, and the impact of these exclusions on the summary estimates is unclear. There were some potential confounding factors and sources of heterogeneity that may have affected the diagnostic performance of FIT, which this review could not adjust for due to limited data, such as the prevalence of colorectal cancer within the cohort, patient age, sex, level of deprivation, baseline Hb level, and variations in FIT analyser calibration45. Despite no limitation on publication language, there was a narrow geographical distribution of the study cohorts, with 14 from Europe and one from Australia. The included studies were also overwhelmingly conducted in a secondary or tertiary healthcare setting. This may limit the generalizability of the review findings to other populations.
The inclusion of recent publications with different FIT assay brands in this review has facilitated analysis of the impact of assay brand on diagnostic performance. Although there were no significant differences in diagnostic performance between FIT brands, this result needs to be considered with caution as there are known issues with comparing measurement results across different FIT brands and study cohorts despite a universal measurement unit45,48. One study41 that directly compared two of the most commonly used FIT brands in a cohort of 732 symptomatic patients demonstrated small but significant variations between brands, particularly around lower f-Hb cut-off points.
There remain other unanswered questions about the use of quantitative FIT for symptomatic patients that require further research. The first is the impact of specific symptoms on FIT performance. Current NICE guidelines49 recommend use of FIT in low-risk patients who do not report specific symptoms such as rectal bleeding. The present meta-analysis showed that a negative FIT result at low f-Hb positivity thresholds may be even more accurate at excluding colorectal cancer in patients reporting rectal bleeding than in those without. The corollary of this is the increase in patients with false-positive FIT results requiring further colonic investigation. This finding should be interpreted with caution as only two studies were included in the meta-analysis owing to limited data. Meta-analyses of FIT diagnostic performance based on the presentation of anaemia or other common symptoms were not possible. Existing studies10,22,25,26,28,29 have produced contradictory conclusions about the impact of anaemia on FIT diagnostic performance, so it is recommended that future studies report FIT outcome data stratified by presenting symptoms.
The characteristics of patients with false-negative FIT results, especially at lower f-Hb positivity thresholds, are another area warranting more high-quality data. Given that these are expected to be rare occurrences, detailed reporting of the patient- and disease-related risk factors for this group would allow future analysis and may provide insights into the development of a safety netting strategy to further reduce missed colorectal cancers.
In this review, the impact of introducing FIT on service provision has been described in a number of metrics, but comprehensive assessment of the effect on real-world clinical practice is lacking. Only one included study9 investigated FIT use in a primary-care setting and described an actual 15.1% reduction in overall referrals to relevant secondary-care services following the introduction of FIT. This is in stark contrast to the expected reduction in referrals of around 78.1% (based on the FIT positivity rate at a lower f-Hb threshold, as described in a number of other included studies). Hence, large projected reductions in referrals to secondary care based on research data should be considered with caution when planning FIT implementation as there are likely to be a number of clinical and systemic factors that influence referral patterns which are unaccounted for. A possibility could be that primary-care referrers employ a lower threshold or broader criteria for offering patients a FIT than when referring patients to secondary-care services for further colonic investigation. The larger overall number of patients with false-positive FIT results generated at lower f-Hb positivity thresholds would go on to receive colonic investigation and thus reduce the magnitude of the predicted reduction in referral numbers.
The results of this systematic review and meta-analysis suggest that a single quantitative FIT at low f-Hb positivity thresholds can adequately exclude colorectal cancer in patients referred with symptoms of possible colorectal cancer. More data on FIT performance at upper f-Hb thresholds can further validate its usefulness as a prioritisation tool. FIT facilitates a data-based approach to prioritisation and more efficient allocation of colonoscopy resources. These findings have the potential to reduce the rate of unnecessary colonic investigations, and release colonoscopy capacity for colorectal cancer surveillance and screening initiatives.
Funding
Research undertaken by K.S.S. is funded through the University of Auckland—Fellow in Surgery scholarship.
Disclosure. The authors declare no conflict of interest.
Supplementary material
Supplementary material is available at BJS online.
References