Cost-effectiveness of Artificial Intelligence-Aided Colonoscopy for Adenoma Detection in Colon Cancer Screening

Abstract
Background and Aims: Artificial intelligence-aided colonoscopy significantly improves adenoma detection. We assessed the cost-effectiveness of the GI Genius technology, an artificial intelligence-aided computer diagnosis for polyp detection (CADe), in improving colorectal cancer outcomes, adopting a Canadian health care perspective.
Methods: A Markov model with 1-year cycles and a lifetime horizon was used to estimate the incremental cost-effectiveness ratio comparing CADe to conventional colonoscopy polyp detection amongst patients with a positive faecal immunochemical test. Outcomes were life years (LYs) and quality-adjusted life years (QALYs) gained. The analysis applied costs associated with health care resource utilization, including procedures and follow-ups, from a provincial payer's perspective using 2022 Canadian dollars. Effectiveness and cost data were sourced from the literature and publicly available databases. Extensive probabilistic and deterministic sensitivity analyses were performed to assess model robustness.
Results: Life years and QALY gains for the CADe and conventional colonoscopy groups were 19.144 versus 19.125 and 17.137 versus 17.113, respectively. Overall per-case costs for CADe and conventional colonoscopy were $2,990.74 and $3,004.59, respectively. With a willingness-to-pay threshold pre-set at $50,000/QALY, CADe was the dominant strategy for both outcomes, showing that CADe colonoscopy is cost-effective. Deterministic sensitivity analysis confirmed that the model was sensitive to the incidence rate ratio of adenoma per colonoscopy for large adenomas. Probabilistic sensitivity analysis showed that the CADe strategy was cost-effective in up to 73.4% of scenarios.
Conclusion: The addition of a CADe solution to colonoscopy is a dominant, cost-effective strategy when used in faecal immunochemical test-positive patients in a Canadian health care setting.


BACKGROUND
The gold standard for the detection of precancerous colonic lesions is colonoscopy (1). However, lesions are still missed, accounting for 25% of interval colorectal cancers (CRC) (2). The current principal quality indicator for colonoscopy is an endoscopist's adenoma detection rate (ADR) (3). Studies have shown that an increased ADR improves adjusted hazard ratios for interval CRCs and cancer-related mortality (4). Indeed, ADR directly correlates with outcomes, with a 1% increase in ADR resulting in a 3% decrease in CRC risk (5). Recent data have confirmed that the benefits of an increase in ADR apply to more contemporary examinations across many institutions and remain true after extensive adjustments for patient characteristics and procedural indications (6).
Over the past 5 years, there has been an emergence and proliferation of artificial intelligence (AI) clinical solutions targeting various therapeutic areas in gastroenterology (7). These technologies often rely heavily on vast amounts of imaging information, which has been the case in the digestive endoscopy field, particularly colonoscopy (8). Clinical applications of AI to colonoscopy have included the detection of premalignant or malignant lesions, also known as computer-aided detection (CADe), as well as the characterization of such identified polyps (CADx) (9).
Although many CADe solutions are emerging, the first to be approved by the US Food and Drug Administration was the GI Genius solution (Medtronic PLC, Minneapolis, MN, USA), with relevant available data having been published in the form of a recent multicentre, randomized clinical trial (RCT) and subsequent meta-analysis (8,10).
To complement such compelling efficacy evidence for polyp detection by AI-aided colonoscopy, the economic impact of CADe solutions must now be characterized to better determine the feasibility of adopting such technology in health care settings with limited resources, as is the case in the publicly funded Canadian health care system. The goal of this study was thus to evaluate the cost-effectiveness of AI-aided colonoscopy using a specific CADe solution in a Canadian health care setting for which high-quality clinical and generalizable data exist.

METHODS
We adopted a two-stage Excel-based (Microsoft Corp., Redmond, WA, USA) Markov simulation model with 1-year cycles and a lifetime horizon to estimate the incremental cost-effectiveness ratio (ICER) comparing CADe to conventional colonoscopy polyp detection rates (Figure 1).

Patient Population, Index Colonoscopy Findings and Assumptions About Disease Progression
The target population comprised patients 50 years of age and older undergoing colonoscopy as a result of a positive faecal immunochemical test (FIT). We assessed this population since organized population-based CRC screening programs are principally FIT-based, as is the case throughout most of Canada (11). The prevalence of adenomas or CRC in the at-risk population was based on the recent Ontario colon cancer screening program report (12). Baseline adenoma or CRC diagnostic rates were derived from a recent meta-analysis as a function of adenoma size and CRC stage (8). Patient characteristics are displayed in Table 1.

Adenoma Detection With and Without CADe and Missed Adenoma Rates
Assumptions for ADR in conventional and CADe colonoscopies were extracted from the pivotal RCT of 685 subjects undergoing a colonoscopy for broad indications, including initial screening with colonoscopy, post-polypectomy, as a result of symptoms, or following a positive FIT (10). The model is based on the adenoma miss rate (AMR) reported in the meta-analysis, specific to small, medium and large adenomas (16). For CRC stages I and II, the AMR of large adenomas was assumed to be reduced by 50%. We then calculated the probability of correctly detecting adenomas or CRC with colonoscopy (i.e., the sensitivity) as one minus the AMR.
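The relationship between miss rate and sensitivity described above can be illustrated with a short Python sketch. It is not the study's Excel model: the miss rates and lesion counts are hypothetical placeholders (the study's inputs are listed in Table 2), and the function name `detection_split` is introduced here purely for illustration.

```python
def detection_split(n_with_lesion, amr):
    """Split patients who have a lesion into detected and missed.

    Sensitivity is taken as one minus the adenoma miss rate (AMR),
    as described in the text.
    """
    sensitivity = 1.0 - amr
    detected = n_with_lesion * sensitivity
    missed = n_with_lesion * amr
    return sensitivity, detected, missed


# Hypothetical miss rates and lesion counts per 1,000 colonoscopies.
hypothetical_amr = {"small": 0.30, "medium": 0.20, "large": 0.10}
hypothetical_counts = {"small": 300, "medium": 100, "large": 50}

for size, amr in hypothetical_amr.items():
    sens, found, lost = detection_split(hypothetical_counts[size], amr)
    print(f"{size}: sensitivity={sens:.2f}, detected={found:.0f}, missed={lost:.0f}")
```

Missed lesions are the ones that remain in the model and may progress in later cycles, which is why the miss rate drives downstream costs and outcomes.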
The sensitivity of colonoscopy using the CADe solution was then estimated by applying the incidence rate ratio (IRR) of adenoma per colonoscopy (APC) reported by Repici et al. (10). The risk reduction of CADe colonoscopy was calculated by multiplying the AMR by the IRR. The risk reduction of CADe colonoscopy was thus 15.5% for small adenomas (<5 mm), 8.28% for medium adenomas (6-9 mm) and 7.6% for large adenomas (>10 mm). For CRC, we used the relative risk (RR) of the per-patient detection rate as a proxy. All the assumptions relating to the polyp detection rates and incremental benefits provided by the CADe solution are listed in Table 2.

Model Structure and Probability Assumptions
In the first stage of the model, patients were assessed for CRC in each cycle; a missed polyp could remain the same size or grow, resulting in a subsequent endoscopic polypectomy at follow-up examination. As per CRC screening guidelines, patients with a negative colonoscopy (i.e., no adenoma or CRC at the index colonoscopy) were considered healthy and had a follow-up FIT after 10 years (17). Patients diagnosed with smaller adenomas (<10 mm or fewer than three polyps and no high-grade dysplasia) had a colonoscopy after 5 years, while high-risk patients (>10 mm or more than three polyps or exhibiting high-grade dysplasia) underwent a colonoscopy after 3 years; these were the recommended guidelines at the time the study was planned (18).
Amongst patients who would prove to have CRC, patients with large polyps entered the second stage of the model after subsequent assessment for disease progression. In each cycle, the polyp size could remain the same or increase and progress to the next CRC stage (CRC stages I-IV). Possible therapeutic interventions included endoscopic polypectomy, surgery, chemotherapy or radiotherapy (for advanced CRC) and follow-up care. All the assumptions relating to disease progression and health state transitions are listed in Table 3. The probabilities of transitions between health states (in the Markov cycles) were calculated for the following parameters: utility scores, adenoma size (small, medium, large) and CRC progression (stages I-IV), adenoma miss rates (AMR), and IRR per colonoscopy (Table 3).
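For readers less familiar with Markov cohort models, the mechanics of 1-year cycles can be sketched as follows. This is a deliberately simplified toy with four states and made-up transition probabilities; it does not reproduce the two-stage structure or the Table 3 values, and the names `STATES`, `TRANSITIONS`, `step` and `run_cohort` are illustrative only.

```python
# Hypothetical health states and annual transition probabilities.
# These are placeholders, not the study's Table 3 values.
STATES = ["no_lesion", "adenoma", "crc", "dead"]
TRANSITIONS = [
    [0.97, 0.02, 0.00, 0.01],  # from no_lesion
    [0.00, 0.93, 0.05, 0.02],  # from adenoma: stay, or progress to CRC
    [0.00, 0.00, 0.85, 0.15],  # from crc
    [0.00, 0.00, 0.00, 1.00],  # dead is absorbing
]


def step(distribution, matrix):
    """Advance the cohort distribution by one 1-year cycle."""
    n = len(distribution)
    return [sum(distribution[i] * matrix[i][j] for i in range(n)) for j in range(n)]


def run_cohort(start, matrix, cycles):
    """Return the state distribution at cycle 0, 1, ..., cycles."""
    trace = [list(start)]
    for _ in range(cycles):
        trace.append(step(trace[-1], matrix))
    return trace


# Toy example: the whole cohort starts with an adenoma; 50 cycles stands in
# for a lifetime horizon.
trace = run_cohort([0.0, 1.0, 0.0, 0.0], TRANSITIONS, cycles=50)

# Life years: proportion alive at the end of each cycle (no half-cycle correction).
dead = STATES.index("dead")
life_years = sum(1.0 - dist[dead] for dist in trace[1:])
print(f"Undiscounted life years per patient: {life_years:.2f}")
```

In a comparison of two strategies, the same loop would be run twice with different detection probabilities, and the difference in accumulated costs and outcomes would be taken.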

Effectiveness Assumptions
The outcomes of effectiveness were life years (LY) gained and quality-adjusted life years (QALY). LY and QALY are recognized and commonly used measures of effectiveness in economic studies (21). QALY is a single index generated by combining disease mortality and morbidity (expressed by the utility values in the cost-effectiveness studies) and is commonly used as a measurement tool for effectiveness and can assist health policymakers in setting priorities among competing health care technologies (22).
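As a minimal sketch of how a QALY total combines survival and utility, the Python snippet below weights the years spent in each health state by a utility value. The weights of 0.91 (adenoma) and 0.67 (CRC stages I and II) are those reported in this section; the weight of 1.0 for a lesion-free state, the patient history and the function name `qalys` are assumptions made for illustration.

```python
# Utility weights: 0.91 (adenoma) and 0.67 (CRC stages I-II) are reported in
# this section; 1.0 for "no_lesion" is an assumption for illustration.
UTILITY = {"no_lesion": 1.0, "adenoma": 0.91, "crc_stage_1_2": 0.67}


def qalys(years_in_state, utility=UTILITY):
    """Sum of years lived in each state, weighted by that state's utility."""
    return sum(years * utility[state] for state, years in years_in_state.items())


# Hypothetical patient history: years spent in each state.
history = {"no_lesion": 10, "adenoma": 5, "crc_stage_1_2": 2}

life_years = sum(history.values())
print(f"Life years: {life_years}, QALYs: {qalys(history):.2f}")
```

The gap between life years and QALYs in this example reflects time spent in states with utility below 1.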
The mean utility value attributed to having an adenoma was 0.91 (95% CI: 0.87 to 0.93), and the mean utility value for CRC in various stages was: Stages I and II: 0.67, 95% CI:

Health Care Resource Utilization and Associated Cost Assumptions
Direct medical costs associated with health care resource utilization were used in the analysis. The cost variables for various resources included FIT, colonoscopy, endoscopic polypectomy (for discovered and endoscopically resectable adenomas), surgical procedures for patients diagnosed with CRC stages I, II and III, adjuvant chemotherapy for CRC patients with surgical findings of stage III CRC, systemic chemotherapy for patients diagnosed with CRC stage IV, and follow-up visits (Table 4).
According to the Canadian health economics guidelines, costs and outcomes were discounted yearly at 3.5% (28). All costs were expressed in 2022 Canadian dollars and were sourced from the published literature, the Ontario case costing Initiative, and the Canadian Institute for Health Information patient cost estimator (24,29). The costs of physician consultations were derived from the Ontario schedule of benefits: physician services (30). The cost of the CADe solution was estimated at All cost assumptions were varied over a range extending to 15% beyond appropriate point estimates (Table 4). This analysis compares conventional colonoscopy with CADe from a Canadian provincial payer perspective. The acceptable willingness-to-pay (WTP) value in Canada was set a priori at $50,000 per QALY.
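The discounting and cost-range assumptions above can be illustrated with a short Python sketch. The 3.5% rate is the one stated in the text; the cost stream is hypothetical, the helper names are illustrative, and reading the 15% range as symmetric around the point estimate is an assumption.

```python
DISCOUNT_RATE = 0.035  # annual discount rate stated in the text


def present_value(amount, year, rate=DISCOUNT_RATE):
    """Discount an amount accruing `year` years from now to its present value."""
    return amount / (1.0 + rate) ** year


def discounted_total(stream, rate=DISCOUNT_RATE):
    """Sum a yearly stream (index 0 = now) after discounting each entry."""
    return sum(present_value(x, t, rate) for t, x in enumerate(stream))


def sensitivity_range(point_estimate, spread=0.15):
    """Lower and upper bounds for one-way sensitivity analysis.

    Assumes the 15% range is symmetric around the point estimate.
    """
    return point_estimate * (1.0 - spread), point_estimate * (1.0 + spread)


# Hypothetical cost stream in Canadian dollars: index procedure, then follow-ups.
yearly_costs = [1000.0, 200.0, 200.0]
print(f"Discounted total: {discounted_total(yearly_costs):.2f}")
print("Range for a $200 cost:", sensitivity_range(200.0))
```

The same discount factor applies to outcomes, so each year's LYs or QALYs would be divided by the same term before summing.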

Sensitivity Analyses
We performed both deterministic and probabilistic sensitivity analyses using a Monte Carlo simulation on a hypothetical cohort of 1,000 patients on the main outcome variables to assess the robustness of the model. In particular, we examined the confidence interval around the point estimate of the difference in ADR between colonoscopy performed with and without CADe. In the simulation, the point estimate of the difference in ADR was varied across the full 95% confidence interval of the IRR from the published RCT, which was also used to calculate the standard error. We adopted this approach because it is the most conservative. However, as it is clinically unlikely that a colonoscopy performed with a CADe solution would result in a lower ADR compared with conventional colonoscopy, we also completed the probabilistic sensitivity analysis varying the ADR difference only across the range of values over which polyp detection improved with CADe.
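One common way to turn a published ratio and its 95% confidence interval into Monte Carlo draws is sketched below in Python. This is an assumption about implementation, not a description of the study's spreadsheet: the IRR values are hypothetical (chosen so that the interval spans 1), the log-normal distribution is a conventional choice for ratios, and truncating at 1 is only one possible reading of the improvement-only scenario.

```python
import math
import random

# Hypothetical IRR and 95% CI, chosen so the interval spans 1; not the RCT's values.
IRR_POINT, IRR_LOWER, IRR_UPPER = 1.30, 0.95, 1.78

# Ratios are usually modelled on the log scale, where the 95% CI spans
# about 2 * 1.96 standard errors.
se_log_irr = (math.log(IRR_UPPER) - math.log(IRR_LOWER)) / (2 * 1.96)

rng = random.Random(2022)  # fixed seed so the sketch is reproducible
draws = [rng.lognormvariate(math.log(IRR_POINT), se_log_irr) for _ in range(1000)]

# Second scenario: keep only draws in which CADe improves detection (IRR > 1).
improvement_only = [x for x in draws if x > 1.0]

print(f"SE on log scale: {se_log_irr:.3f}")
print(f"Draws with IRR > 1: {len(improvement_only)} of {len(draws)}")
```

Each draw would then be fed through the full model to produce one simulated pair of incremental cost and incremental outcome.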

RESULTS
Incremental Cost-effectiveness Ratios
LYs gained in the CADe colonoscopy and conventional colonoscopy groups were 19.144 and 19.125 (P = 0.019), respectively. The QALY gains for CADe colonoscopy and conventional colonoscopy were 17.137 and 17.113 (P = 0.024) per thousand patients. The total 3-year acquisition cost of the CADe technology was calculated at $81,000.
If applied to 1,000 colonoscopies yearly, the per-case costs of CADe colonoscopy and conventional colonoscopy were $2,990.74 and $3,004.59, respectively, a saving of approximately $14 per case in the CADe group.
With a WTP threshold set at $50,000 per QALY, CADe was the dominant strategy (offering better outcomes at a lower overall cost) for both outcomes, showing that performing colonoscopy using a CADe solution is a cost-effective strategy in the Canadian health care system (Table 5).
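The dominance finding can be checked directly from the reported per-case costs and QALYs. The Python sketch below uses only figures stated in this section; the function name `compare` and the use of net monetary benefit to handle the non-dominant cases are illustrative choices rather than the study's documented method.

```python
WTP = 50_000.0  # willingness-to-pay per QALY, Canadian dollars


def compare(cost_new, cost_old, effect_new, effect_old, wtp=WTP):
    """Classify a new strategy against a comparator."""
    d_cost = cost_new - cost_old
    d_effect = effect_new - effect_old
    nmb = wtp * d_effect - d_cost  # incremental net monetary benefit
    if d_cost < 0 and d_effect > 0:
        verdict = "dominant"  # cheaper and more effective
    elif d_cost > 0 and d_effect < 0:
        verdict = "dominated"  # costlier and less effective
    else:
        verdict = "cost-effective" if nmb > 0 else "not cost-effective"
    return d_cost, d_effect, nmb, verdict


# Reported per-case costs and QALYs: CADe versus conventional colonoscopy.
d_cost, d_qaly, nmb, verdict = compare(2990.74, 3004.59, 17.137, 17.113)
print(f"Incremental cost: {d_cost:.2f}, incremental QALYs: {d_qaly:.3f}")
print(f"Net monetary benefit at WTP: {nmb:.2f} -> {verdict}")
```

When a strategy is dominant, the ICER itself is negative and is usually not reported as a number, which is why the result is stated as dominance rather than as a cost per QALY.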

Sensitivity Analyses
The deterministic sensitivity analysis demonstrated that the model was sensitive to between-group differences in the incidence rate ratio of adenoma and adenoma miss rates per colonoscopy for larger adenomas as main cost drivers and, to a lesser extent, to assumptions about adenoma utility values (Figure 2).
The probabilistic sensitivity analysis showed that the CADe strategy was cost-effective in 63% of simulations when the difference in ADR across its entire 95% confidence interval was varied (Figure 3). This value rose to 73.4% of simulations when varying baseline assumptions only across an improvement in the ADR range attributable to the CADe solution (the latter is not shown in Figure 3).
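A proportion such as "cost-effective in 63% of simulations" is typically obtained by counting the simulated results that fall on the favourable side of the WTP threshold. The Python sketch below shows that counting step under stated assumptions: the four simulated pairs are invented for illustration, and `share_cost_effective` is a hypothetical helper, not part of the study's model.

```python
def share_cost_effective(simulations, wtp=50_000.0):
    """Fraction of (incremental cost, incremental QALY) pairs with
    positive incremental net monetary benefit at the given WTP."""
    favourable = sum(
        1 for d_cost, d_qaly in simulations if wtp * d_qaly - d_cost > 0
    )
    return favourable / len(simulations)


# Four hypothetical simulation results: (incremental cost, incremental QALYs).
simulations = [(-14.0, 0.024), (30.0, 0.010), (50.0, -0.005), (-5.0, 0.001)]
print(f"Share cost-effective: {share_cost_effective(simulations):.0%}")
```

Repeating the count over a range of WTP values would trace out a cost-effectiveness acceptability curve.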

DISCUSSION
The rapid expansion of AI in health care highlights the importance of evaluating the economics of AI-aided technologies in policymaking. The current analysis is timely, as numerous trials and meta-analyses have shown the effectiveness of AI technology in colorectal adenoma detection (31,32). This study demonstrates that the CADe solution is cost-effective compared to conventional colonoscopy amongst FIT-positive patients in the Canadian health care system, exhibiting a dominant ICER. The chosen study patient population is critical to organized CRC screening programs as most are based on FIT testing, as in Canada (11). Despite the upfront acquisition and maintenance costs of the AI technology, this analysis suggests that the overall cost per case would decrease if adopted, using a lifetime horizon amongst patients undergoing colonoscopy for a positive FIT. The plausible biological rationale is that the higher ADR results in a better diagnostic yield and treatment of precursor lesions before they become cancerous. Consequently, there is an improvement in the patient's quality of life over a lifetime while decreasing overall health care costs, explained by a reduced number of downstream colon cancer treatments and follow-ups. The deterministic and probabilistic sensitivity analyses cover a wide range of effectiveness variables (e.g., IRR APC relative to the adenoma size) and costs (e.g., treatment and surgical procedure costs), confirming the robustness of our analysis, using both conservative and more liberal ranges of ADR differences between the CADe and conventional colonoscopy approaches.
We selected data from the GI Genius CADe solution as it is one of the first commercialized, one of the best-studied, and one of only a few approved CADe solutions in Canada and the United States. AMR estimates used in this analysis, taken from a study by Zhao et al., overlap greatly with those in an additional recent report by Wallace et al. (33). A recent study also demonstrated a significant increase in ADR attributable to CADe, specifically in patients undergoing colonoscopy for a positive FIT; importantly, the ADR and its range overlap extensively with the assumptions of our model, supporting the validity and robustness of our findings (34). Interestingly, ADR estimates and ranges attributed to CADe from a large recent meta-analysis that included ten RCTs studying patients with a broad range of indications for colonoscopy (n = 6,629) while regrouping multiple CADe technologies overlapped completely with the assumptions used in our model (ADR = CADe 35.4% vs. standard 25.9%, RR = 1.43 [1.33-1.53]) (30). Importantly, a recent meta-analysis by Huang et al. has also confirmed improved total numbers of sessile serrated and advanced adenomas per colonoscopy, for which individual studies had yielded variable results due to smaller numbers of these rarer lesions (31).
The results are aligned with preliminary findings from recent international studies published only as abstracts to date. These include Jootun et al., who used a Markov model simulation to evaluate the cost-effectiveness of CADe in a cohort of 1,000 patients aged 50 with a time horizon extending from screening and diagnosis through a patient's life span (35). The measures of effectiveness included LYs gained, CRCs prevented and QALYs. The authors showed that the CADe strategy was cost-effective in the Spanish health care system, achieving an incremental 0.033 QALY gain with a WTP threshold set at €20,000-€30,000 per QALY (35). A comparable analysis from Italy yielded similar findings from an Italian national health care system perspective (36).
Areia et al. also reported that a CADe solution yielded greater QALYs at lower costs with assumptions pertinent to a US setting in a secondary analysis of a Markov microsimulation that principally reported on colon cancer incidence and mortality as well as costs of implementing a CADe solution (37). They demonstrated that the CADe solution resulted in a 4.8% incremental gain for relative reduction of CRC incidence and a 3.6% incremental gain for CRC-related mortality. CADe thus decreased the cost per screened individual from USD 3,400 to USD 3,343 (a saving of USD 57 per individual). Similarly, they reported that a once-in-a-lifetime colonoscopy performed with a CADe solution for the United States population would result in additional yearly prevention of 7,194 CRC cases and 2,089 related deaths at an annual saving of USD 290 million.
Although at the macro level, the full economic analysis that considers both costs and outcomes (e.g., cost-effectiveness) is a key decision determinant for policymakers to assess the feasibility of a technology, at a meso-level, understanding the budget impact on a system is helpful for financial planning (38). As such, a few additional studies have evaluated the cost consequences of AI-aided colonoscopy in other health care systems using different perspectives. Döring et al. assessed the cost impact of CADe on the German health care system by adopting data from a recently published meta-analysis on CADe (39). They concluded that a combination of a CADe solution with polyp management could lead to cost control in Germany's CRC screening program.
The economic benefits of AI may extend to polyp characterization and not just detection. Indeed, in an add-on analysis to their single-group, open-label, prospective clinical trial, Mori et al. estimated that the diagnose-and-leave strategy with AI-aided polyp characterization (CADx) could reduce the average colonoscopy cost and the gross annual reimbursement for colonoscopies by USD 149.2 million (18.9%) in Japan, USD 12.3 million (6.9%) in England, USD 1.1 million (7.6%) in Norway and USD 85.2 million (10.9%) in the United States, compared with a resect-all-polyps strategy as is most often current practice (40).
Indirectly addressing potential cost impacts, a retrospective analysis from the United States evaluated CADe solution and concluded that despite some concerns about potential negative effects on efficiency in high-volume ambulatory surgical centres, such CADe solution use was not associated with a significant increase in procedural time (41).
Comparable to other medical procedures that are dependent on operator judgment and decisions, colonoscopy outcomes (e.g., ADR) are quite operator-dependent and often relate to operator experience, with better outcomes for patients who have access to centres with more experienced clinicians (42). Similar to other advanced technologies, AI-aided procedures thus have the potential to improve health equity by democratizing access to enhanced diagnostic results or treatment options (43). This is the case, more specifically, for CADe (and possibly CADx), as optimizing the diagnosis and outcomes of colonoscopy may lead to increased health care system efficiencies while also yielding potential cost savings (44). From a societal perspective, the improved efficiency may help decision-makers finance additional advanced technologies for patient care in other therapeutic areas (45). Furthermore, like other AI technologies, CADe performance will likely further improve through ongoing annotated data inputs.

LIMITATIONS
This study must be interpreted in the context of some limitations. We used a simulation economic model, which imposes methodological structural limitations. Indeed, there exists uncertainty around effectiveness outcomes reported in international publications. However, as discussed above, extensive sensitivity analyses addressed the uncertainty of the effectiveness and cost model assumptions. The study population focused exclusively on FIT-positive individuals, who represent a varying proportion of all patients undergoing a colonoscopy in an endoscopy unit, a proportion that depends on referral indications and on the presence or absence of an organized CRC screening program. This study was conducted using the Ontario colon cancer screening program and cost data from a Canadian provincial health care perspective (17). Ontario has a well-established CRC screening program, and success rates may differ in other jurisdictions in Canada or internationally. Similarly, this analysis used a provincial payer's perspective, including direct medical costs, and did not account for potential societal benefits and indirect and intangible costs, which would further enhance CADe-related benefits. As mentioned in the methods section, ADR assumption ranges included a theoretical worsening of polyp detection when using CADe, even if clinically unlikely, highlighting the conservative nature of the base-case analysis and its main conclusions. When only including the beneficial 'positive interval' of the full CADe-related ADR range difference, the proportion of cost-effective simulations increased from 63% to 73.4%.
Additional unknowns relating to model assumptions include a possible levelling off of CRC benefits with greater ADR values (6), the impact of longer post-polypectomy screening intervals (18) (here, too, resulting in more conservative estimates in our analysis), and reliable estimates for AMR of larger lesions (33,46) that may in part be attributable to limitations in parallel versus tandem RCT colonoscopy study designs (47,48). Although this study adopted a Canadian payer's perspective, due to the conservative nature of our assumptions and the higher WTP threshold usually adopted in the United States, it is likely that an effective CADe solution is good value for money and would prove to be a cost-effective strategy in the United States; further studies are now needed to confirm this assumption. Finally, the global benefits of a CADe platform can be fully appreciated only once its impact across the breadth of colonoscopy quality facets is assessed, including assessment of bowel preparation, mucosal surface area observed, polyp size and completeness of resection, as well as polyp characterization in addition to sole detection. Some of these may bear further potential cost savings, such as enhancing a polyp and discard strategy, although any true benefits currently remain unclear (49).

CONCLUSION
The addition of CADe technology to colonoscopy is a cost-effective and dominant strategy (better outcomes at lower overall cost) for improving polyp detection in patients with a positive FIT, from the perspective of a Canadian universal health care system, when assessing LYs and QALYs over a patient's lifetime.

FUNDING
This study was supported by an unconditional consulting grant by Medtronic Canada.