Abstract

Objectives

The objective of this work is to demonstrate the value of simulation testing for rapidly evaluating artificial intelligence (AI) products.

Materials and Methods

Researcher-physician teams simulated the use of 2 Ambient Digital Scribe (ADS) products by reading scripts of outpatient encounters while both products were in use, yielding a total of 44 draft notes. Time to edit, perceived effort and amount of editing, and errors in the AI-generated draft notes were analyzed.

Results

Draft notes from ADS Product A took significantly longer to edit and contained fewer omission errors but more addition and irrelevant or misplaced text errors than draft notes from ADS Product B. Nonetheless, ADS Product A was rated as performing better for most encounters.

Discussion

Artificial intelligence-enabled products are being rapidly developed and implemented into practice, outpacing efforts to address safety concerns. Simulation testing can efficiently identify safety issues.

Conclusion

Simulation testing is a crucial first step to take when evaluating AI-enabled technologies.
