Pneumonia diagnosis performance in the emergency department: a mixed-methods study about clinicians’ experiences and exploration of individual differences and response to diagnostic performance feedback

Abstract

Objectives
We sought to (1) characterize the process of diagnosing pneumonia in an emergency department (ED) and (2) examine clinician reactions to a clinician-facing diagnostic discordance feedback tool.

Materials and Methods
We designed a diagnostic feedback tool, using electronic health record data from ED clinicians’ patients to establish concordance or discordance between ED diagnosis, radiology reports, and hospital discharge diagnosis for pneumonia. We conducted semistructured interviews with 11 ED clinicians about pneumonia diagnosis and reactions to the feedback tool. We administered surveys measuring individual differences in mindset beliefs, comfort with feedback, and feedback tool usability. We qualitatively analyzed interview transcripts and descriptively analyzed survey data.

Results
Thematic results revealed: (1) the diagnostic process for pneumonia in the ED is characterized by diagnostic uncertainty and may be secondary to goals to treat and dispose the patient; (2) clinician diagnostic self-evaluation is a fragmented, inconsistent process of case review and follow-up that a feedback tool could fill; (3) the feedback tool was described favorably, with task and normative feedback harnessing clinician values of high-quality patient care and personal excellence; and (4) strong reactions to diagnostic feedback varied from implicit trust to profound skepticism about the validity of the concordance metric. Survey results suggested a relationship between clinicians’ individual differences in learning and failure beliefs, feedback experience, and usability ratings.

Discussion and Conclusion
Clinicians value feedback on pneumonia diagnoses. Our results highlight the importance of feedback about diagnostic performance and suggest directions for considering individual differences in feedback tool design and implementation.


Background and significance
Diagnosis is one of the major tasks in clinical practice, but misdiagnosis is a major source of medical error in emergency department (ED) and hospital settings. 1,21][12] However, such tools may present new challenges that are poorly understood.
5][16] On one hand, measures that are properly connected to clinicians' goals to provide high-quality patient care can engage and motivate clinicians. 17 However, measures of performance can also thwart an open culture of safety if misused or not supported by practice environments that support acknowledging and learning from mistakes. 18,19 Interpreting feedback is a complex process influenced by individual characteristics, context, and experience. 20,21 Exploring individual differences in beliefs, including those described by motivational theories such as mindset 22 and failure beliefs, is crucial to better understanding clinicians' responses to feedback. 23,24 6][27] This relates to the individual's underlying mental model of the connection between intelligence and performance, such that poorer performance may point to being a less intelligent person and thus be potentially more distressing for those with fixed mindsets. 9][30] Understanding how beliefs such as these contribute to learning and improvement is critical if we are to create usable, meaningful feedback tools and performance measures that support a continuous learning healthcare system culture. 24,25 Understanding beliefs is consistent with the adult learning theory-based principles of problem-based learning and self-directed learning, but the mindset and failure attributions are tailored specifically to potential challenges in the design of feedback tools. 31
Pneumonia is an ideal clinical diagnosis for disentangling complex factors in diagnostic accuracy. 7][38][39][40][41][42] The majority of pneumonia cases are diagnosed in the ED, where clinicians encounter exceptionally complex patients to evaluate, frequent interruptions, time pressure, and substantial cognitive load, which can all contribute to inaccurate initial diagnoses. 43,44 Currently, clinicians rarely receive feedback about diagnosis, constituting missed opportunities for self-assessment and diagnostic skills improvement. 6,45 Prior work by our team showed over half of all cases of pneumonia demonstrated discordances between the ED diagnosis and discharge diagnosis. 46 Diagnostic discordances may not always represent individual clinician error, but their measurement is an efficient way to identify cases for review to improve clinician performance.
The purpose of this mixed-methods study was to characterize the process of pneumonia diagnosis in the ED and the clinicians' experience of receiving feedback about their performance using a tool we developed and designed for ED clinicians, "Dx-Connect." Dx-Connect displays patient data from the EHR and measures of diagnostic performance at the clinician level, measured by concordance/discordance between the initial diagnosis, radiology report, and discharge documentation, thus connecting the provider to ultimate diagnostic and clinical outcomes of their patients. We examined the providers' responses to this feedback with semistructured interviews and surveys.

Recruitment and ethics
We identified a cohort of patient hospitalizations from the ED at an academic medical center within 12 months preceding the study (October 1, 2020, to September 30, 2021) with an initial or discharge diagnosis of pneumonia identified by a previously validated approach that combines International Classification of Diseases (ICD-10) diagnosis codes and natural language processing (NLP). 46 ED clinicians with at least 10 pneumonia cases (n = 123) were eligible for participation and received a department-wide email with general information about the study and an invitation to participate. We accepted 1 volunteer participant who responded to the general email and purposively sampled an additional 15 participants for email invitation into the interview and survey portion of the study. Purposive sampling was based on age, gender, and type (physician vs advanced practice clinician) to increase representativeness of the sample of interviewees. A consent cover letter was presented at the outset of the interview. All study procedures were reviewed and approved by the University of Utah Institutional Review Board (IRB # 00136521).

Procedures
Using a sequential mixed-methods design we (1) analyzed diagnostic discordance data for ED clinicians and prepared a visualization of these data for display in prototype Dx-Connect, (2) interviewed ED clinicians about diagnosing pneumonia in the ED context and their reactions when exploring the prototype feedback tool, and (3) surveyed clinicians to assess comfort with their personal performance feedback, mindset about intelligence, comfort with failure, and feedback tool usability.

Quantitative analysis of diagnostic performance
Using the EHR from the University of Utah (UU), we developed, tested, and implemented measures of diagnostic accuracy for ED providers caring for patients admitted to the hospital with pneumonia between January 1, 2015, and March 31, 2022. We identified discordances between the initial/ED diagnosis, chest imaging diagnosis, and the final/discharge diagnosis of pneumonia by combining diagnosis codes with natural language processing (NLP) of clinical text from ED documents, chest imaging reports, and discharge summaries. 47 Treating the initial/ED diagnosis as a "test" diagnosis and the discharge diagnosis as an imperfect reference standard ("true"), we classified each case evaluated by the clinician as concordant ("true positive": concordance between ED diagnoses and discharge diagnoses) or discordant (eg, "false positive"). We also classified discordance or concordance between initial positive cases and the diagnosis of pneumonia within the chest imaging report. For each participating provider and the entire department, we summarized annually and quarterly: (1) positive predictive value (PPV) against chest image as the percent of ED diagnoses for pneumonia that also had a positive chest image; (2) PPV against discharge as the percent of positive ED diagnoses with a discharge diagnosis for pneumonia; and (3) the sensitivity against discharge diagnosis as the percent of positive discharge diagnoses of pneumonia that also had an initial positive ED diagnosis.
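
The three measures defined above can be illustrated with a minimal sketch. It is not the study's implementation: the `Case` record, its field names, and the four example cases are hypothetical, and each case is assumed to have already been reduced to three yes/no flags (ED diagnosis, chest imaging, discharge diagnosis) by the upstream code-plus-NLP step.

```python
from dataclasses import dataclass


@dataclass
class Case:
    """One hospitalization, reduced to three hypothetical yes/no flags."""
    ed_dx: bool         # initial/ED diagnosis of pneumonia
    imaging_dx: bool    # chest imaging report positive for pneumonia
    discharge_dx: bool  # final/discharge diagnosis of pneumonia


def _percent(numerator, denominator):
    """Return a percentage, or None when the denominator is empty."""
    return 100.0 * numerator / denominator if denominator else None


def concordance_measures(cases):
    """Compute the three measures as defined in the text above."""
    ed_positive = [c for c in cases if c.ed_dx]
    discharge_positive = [c for c in cases if c.discharge_dx]
    return {
        # Percent of ED pneumonia diagnoses with a positive chest image.
        "ppv_chest_image": _percent(
            sum(c.imaging_dx for c in ed_positive), len(ed_positive)),
        # Percent of ED pneumonia diagnoses with a discharge diagnosis.
        "ppv_discharge": _percent(
            sum(c.discharge_dx for c in ed_positive), len(ed_positive)),
        # Percent of discharge diagnoses that had an initial ED diagnosis.
        "sensitivity_discharge": _percent(
            sum(c.ed_dx for c in discharge_positive), len(discharge_positive)),
    }


# Made-up example data, for illustration only.
example_cases = [
    Case(ed_dx=True, imaging_dx=True, discharge_dx=True),
    Case(ed_dx=True, imaging_dx=False, discharge_dx=False),
    Case(ed_dx=True, imaging_dx=True, discharge_dx=False),
    Case(ed_dx=False, imaging_dx=True, discharge_dx=True),
]
example_measures = concordance_measures(example_cases)
```

Quarterly or annual summaries would apply the same function to the subset of cases falling in each period; returning `None` for an empty denominator avoids reporting a misleading 0% for a clinician with no qualifying cases.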

Prototype feedback tool
Dx-Connect incorporates EHR data and displays clinician-level diagnosis data for pneumonia cases that were hospitalized from the ED by the participating clinician (Figure 1). Additional features included (1) a Venn diagram displaying the total number of pneumonia encounters with an initial diagnosis by that clinician and the number of concordant and discordant cases, (2) quarterly and annual positive predictive value and sensitivity, and (3) a table with select patient case details. Case clinical notes, discharge summaries, and chest X-ray reports (with diagnosis terms highlighted by NLP) were accessible through links embedded in the table. We also created a method for clinicians to compare themselves to their peers, as seen in the "View Department Measures" button (see Figure 2). Clicking this button generated a view of performance on concordant diagnoses relative to other clinicians in the ED in which they worked (see Figure 3).

Interview
We designed a semistructured interview, with input from an ED physician, pulmonary critical care physician, behavioral scientist, and an informaticist, that focused on pneumonia diagnostic experiences, presentation of individual diagnostic performance feedback, and exploration of how the feedback would be used (see Supplementary material). The interview followed a cognitive task analysis (CTA) approach, prompting clinicians to describe a specific case of pneumonia diagnosis in the ED to characterize the cognitive process of pneumonia diagnosis in the ED. 48,49 Interviewers described themselves as researchers engaged in development and testing of the feedback tool. We then presented the feedback tool, described above, to the provider and asked questions surrounding their understanding of the measure and their reactions to their feedback.

Survey measures
Following the interview, participants were provided with a link to complete a survey online via Qualtrics (Qualtrics LLC, United States). The survey included a 3-item Mindset questionnaire assessing beliefs related to a growth mindset and implicit theories of intelligence. 50,51 The Performance Failure Attribution Inventory was used to assess beliefs about failure. 52 A single item assessed Comfort with Feedback. Participants also completed the System Usability Scale (SUS), 53 a usability assessment applied to thousands of informatics tools, with a range of 0-100 and higher scores indicating greater usability. The mean score across tools is 68. 54
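
The text above gives only the 0-100 range of the SUS. As a minimal sketch of where that range comes from, the function below applies the standard published SUS scoring rule (10 items rated 1-5; odd-numbered items contribute the response minus 1, even-numbered items contribute 5 minus the response; the sum is multiplied by 2.5). The function name and the example responses are hypothetical, and the study does not state how its scores were computed.

```python
def sus_score(responses):
    """Score one respondent's SUS questionnaire on the 0-100 scale.

    `responses` holds the 10 item ratings in questionnaire order,
    each an integer from 1 (strongly disagree) to 5 (strongly agree).
    """
    if len(responses) != 10:
        raise ValueError("the SUS has exactly 10 items")
    if any(r not in (1, 2, 3, 4, 5) for r in responses):
        raise ValueError("each response must be an integer from 1 to 5")

    total = 0
    for item_number, rating in enumerate(responses, start=1):
        if item_number % 2 == 1:
            # Odd-numbered items are positively worded.
            total += rating - 1
        else:
            # Even-numbered items are negatively worded.
            total += 5 - rating
    return total * 2.5


# Made-up respondent, for illustration only.
example_sus = sus_score([4, 2, 4, 2, 4, 2, 4, 2, 4, 2])
```

Because each item contributes 0-4 points, a fully neutral respondent (all 3s) scores 50, which is why the benchmark of 68 sits above the scale midpoint.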

Quantitative analysis of EHR data
Department-level PPV against chest image, PPV against discharge, and sensitivity against discharge diagnosis were summarized.

Qualitative thematic analysis
6][57] We reviewed 5 transcripts together and assigned precodes (categorizations based directly on participants' statements). We then developed a code book with code names, definitions, and sample quotes. The remaining 6 transcripts were coded independently by 2 coders per transcript, with discrepancies resolved by full coding team consensus. Specific codes were grouped into themes through iterative discussion and then reviewed by two additional ED physicians (E.R. and M.F.). Interview participants were invited to review and comment on the thematic results as a member checking procedure for validation. 58 All interviewees received an email with a summary of findings. We received no comments back from participants about the findings.

Descriptive survey analysis
We conducted exploratory, descriptive analysis of survey data using SPSS (SPSS Inc., United States). 59 Median values are more interpretable than means in small samples; thus, we split the Mindset Scale and the Failure Attribution Scale at the sample median to explore the relationship with comfort with receiving feedback and usability ratings. Then, we conducted cross-tabulations and graphically displayed how participants above and below the sample median on fixed vs growth intelligence beliefs and on failure attribution were distributed across feedback comfort and the System Usability Scale (SUS).
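
The median split and cross-tabulation described above can be sketched as follows. The analysis in the study was run in SPSS; this Python version is only an illustration, the function names and data are hypothetical, and the handling of scores equal to the median (grouped here with the lower half) is an assumption, since the text does not say how ties were treated.

```python
from collections import Counter
from statistics import median


def median_split(scores):
    """Label each score relative to the sample median.

    Assumption: scores equal to the median go in the lower group.
    """
    cut = median(scores)
    return ["above" if s > cut else "at_or_below" for s in scores]


def crosstab(row_labels, col_labels):
    """Count respondents in each (row, column) combination."""
    if len(row_labels) != len(col_labels):
        raise ValueError("both label lists must describe the same respondents")
    return Counter(zip(row_labels, col_labels))


# Made-up survey data for 9 respondents, for illustration only.
mindset_scores = [2, 3, 4, 4, 5, 6, 3, 5, 4]
feedback_comfort = [
    "neutral", "neutral", "somewhat", "neutral", "very",
    "very", "somewhat", "somewhat", "neutral",
]

mindset_group = median_split(mindset_scores)
example_table = crosstab(mindset_group, feedback_comfort)
```

Each key of `example_table` is a (mindset group, comfort level) pair and each value is a respondent count, which is the information a cross-tab plot of the kind described would display.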

Quantitative analysis
Diagnostic discordance in pneumonia was high, occurring in over half of all pneumonia hospitalizations (Figure 2). Among patients with an initial ED diagnosis of pneumonia, the PPV against chest image (percent with a positive chest image) was 65%, and PPV against discharge (percent with positive discharge diagnosis for pneumonia) was 26%. Among those patients with a discharge diagnosis of pneumonia, 65% had an initial ED diagnosis of pneumonia.
A total of 11 interviews and 9 surveys were completed. 60 Clinicians were primarily physicians (7), with advanced practice clinicians (3) including physician assistants and nurse practitioners (Table 1).

Qualitative results
Thematic results are described below and displayed with sample quotes (Table 2). The table with full length and additional quotes is available in the Supplementary material.

THEME 1. Diagnosing pneumonia in the ED context is characterized by diagnostic uncertainty and may be a secondary priority relative to disposition and treatment
Clinicians described the diagnostic process for pneumonia in the ED as fraught with uncertainty and ambiguity about the pneumonia diagnosis. Some clinicians described discomfort with diagnosis applied to ED settings, favoring the term diagnostic impressions given their uncertainty: "I think one that's interesting here is, . . . with some of these patients I would probably waffle a little bit and say, 'No, this is my clinical impression,'" (Table 2, Quote 1D). Clinicians reported they valued diagnosis and constructed their workflow to improve accuracy (eg, to improve their information gathering or guard themselves against cognitive errors such as confirmation bias or diagnostic anchoring): "I think for me, the big thing is really removing as much subjectivity out of the process as I can." (Table 2, Quote 1A). Many clinicians reported, however, that their primary goal in the ED was appropriate disposition and treatment of the patient and that the diagnosis was a vehicle to support treatment and admission to the hospital.

THEME 2. Existing diagnostic skill improvement processes are fragmented, inconsistent, and self-directed
Clinicians reported they valued and sought follow-up information on individual patients they were curious or worried about: "I followed his progress by looking up his chart. . . . It's usually my own curiosity and chart stalking the patient. . . ." (2A). Participants valued case review as a form of feedback, but they reported difficulty finding and tracking patients in their existing systems: "A lot of times we do not, [get feedback] unless they come back as, what we call, a bounce back, or failure of treatment and coming back to be admitted to the hospital. . . ." (2D).

THEME 3. Clinicians liked the measure, feedback tool, and features
Clinicians had a positive response to the performance measures (eg, positive predictive value), Venn diagram, and information table of individual patient data. These data accurately reflected the clinicians' patients and diagnoses, and the tool prompted memories of the specific patients. Clinicians appreciated task feedback (individual case accuracy and review), which sparked their interest in self-improvement: "You may make me more thoughtful about when I say about how I qualify things that we put into our diagnoses" (3B). Normative feedback (comparison to peers) was consistent with clinicians' self-reported values of excellence in performance and competition: "I think having a comparison with my colleagues is really beneficial because I kind of see where I am relative to them. I respect them" (3D).

THEME 4. Clinicians had strong reactions to feedback data across a spectrum from implicit trust in measure to extreme skepticism
Implicit trust in the measure was reflected by concerns about participants' own performance or suitability for their role: "[The comparison to my peers] is now just helping me understand, 'Am I the worst person in the world? Do I need to find a totally different line of work or am I doing okay?'" (4C). Skeptical reactions included direct challenges to the validity of concordance and/or the validity of the reference standards (chest imaging or discharge diagnosis): "So if I diagnose someone with pneumonia, and I admit them to the hospital. And then, ultimately, on their discharge, they disagree, and they don't have a diagnosis of pneumonia, what further testing is gone in to reach that conclusion?" (4D). Clinicians highlighted the nature of diagnosis as an evolving and ambiguous process, and a mismatch between this process and social pressures to assign patients to diagnostic labels and commit them to associated treatment pathways: "the discharge diagnosis looks strange to me. .

Survey results
An overview of survey results is reported in Table 3. The median level of growth mindset in the sample was 4, representing a slight growth mindset (indicating a median of "somewhat disagree" with the statement that intelligence is stable). For failure attribution, the median value indicated a response for most failure-related questions between "believe none of the time" and "believe 25% of the time." Comfort with feedback was between neutral and "somewhat comfortable." The system usability scale score mean was 75, which is above the 68 considered "average" across systems. 61 The clinicians in our sample with beliefs consistent with a more fixed mindset (below-median ratings on the growth mindset scale) rated comfort with feedback lower than those with beliefs consistent with growth mindset (above-median ratings) (Figure 4). Usability ratings were split: those with beliefs consistent with fixed mindset rated Dx-Connect far below or far above average on the SUS. Clinicians with low comfort with failure (endorsing "do not believe" for most questions, below-median ratings for our sample) reported a restricted range of comfort receiving feedback (neutral or somewhat comfortable), whereas those with high comfort with failure responded across the range of the feedback comfort scale. Those with low comfort with failure rated the tool very low on usability (scores < 61), whereas 5 of the 6 clinicians with higher comfort with failure rated the tool above average on usability (scores > 70) (see Figure 5).

Discussion
In an in-depth exploration of the process of diagnosis of pneumonia and clinician interaction with diagnostic performance feedback, ED clinicians reported valuing diagnostic feedback, enthusiasm for tools to provide feedback, and motivation to use feedback for practice improvement. Their characterization of diagnosis in the ED as a secondary priority to stabilization and treatment, and their strong responses to measures of discordances in diagnosis, suggest complex relationships between measurement, feedback, and meaningful quality improvement in diagnosis for ED clinicians. This complexity requires a nuanced approach when developing paths to improvement. We found that individual learning characteristics of mindset and failure attribution could influence clinicians' reactions to feedback, a finding that warrants further study. Our work has important implications for development and use of practice measures and feedback tools that target diagnostic accuracy. Diagnosis is an intuitive skill that should improve throughout a clinician's experience. However, clinicians often plateau after training, because "for medical professionals to be able to keep improving their diagnostic performance during years of professional practice, they would need more feedback than the clinical environment naturally provides." 62 Clinicians in our study cited the importance of learning through experience but also a lack of the consistent, timely, and accurate feedback that is required for improvement. Clinicians generally accept feedback when they believe it to be accurate. 63 But clinicians in our study had mixed reactions to the feedback measures, ranging from acceptance to substantial disagreement, particularly when the measures were interpreted as evaluations of diagnostic performance or quality.
Measuring diagnostic accuracy is further challenged by the fact that the diagnosis of pneumonia lacks a gold standard and often carries substantial uncertainty, particularly in the ED, even though early and accurate diagnosis is recognized by clinicians as important. Diagnosis simultaneously represents both a clinician's assessment of a patient's biophysical state and a tool for managing care processes to get a patient the treatment the clinician believes they need, even under uncertainty. In the ED, the assignment of a diagnosis is a method to communicate with other care teams and to justify appropriate treatment and evaluation while expeditiously moving the patient to the most appropriate site of care based on the clinician's overall clinical judgment of patient status. Discordance between ED and chest imaging or discharge diagnoses could potentially represent cognitive errors, such as confirmation or anchoring bias, which can contribute to a diagnostic error. 64,65 Our work suggests that for a pneumonia diagnosis with considerable uncertainty in the complex context of the ED, "diagnostic error" may be a mischaracterization of a measurement error. Our findings complement the clinician responses to a recent systematic review of diagnostic error in the ED conducted by the Agency for Healthcare Research and Quality: a multiorganization response representing ED clinicians criticized the use of the term diagnostic error, emphasizing that "Emergency care is less about arriving at the final diagnosis, and more about real-time identification and treatment of life-threatening conditions." 2,66 This was similar to the impressions of the ED clinicians in our sample. In future work, it will be important to acknowledge uncertainty, potentially by choosing labels for feedback that acknowledge this lack of gold standard rather than the "true positive" that we used in this display. Our study findings further highlight the challenge and importance of developing measures of diagnostic accuracy that adequately reflect the uncertainty of initial diagnoses and the importance of balancing accuracy with other aspects of high-quality emergency care.
Feedback has been highlighted as a missing link required to improve diagnostic performance, with new EHR capabilities providing new paths to improvement. 7,9 Yet few tools exist in actual practice due to multiple sociotechnical barriers. 67 Our study demonstrated the feasibility and usability of an EHR-based tool aimed at supporting diagnostic accuracy through feedback, and clinicians favorably assessed the tool. The intent of individual feedback in a culture of safety is to foster clinician curiosity, enhanced motivation to learn, and improvement in performance by using nonpunitive mechanisms. 7 However, improvement through feedback is a complex and social process. Too great a focus on errors, comparison to peers, or provider incentives within a culture of blame historically endemic in medicine can result in biased feedback and interfere with the process of self-reflection and improvement. 68 Previous performance measures in pneumonia have been criticized for lacking clear benefit to patients. 69 As cultural and social processes in medicine evolve, understanding how provider and system experiences and attitudes toward feedback influence the use of measures is crucial to meaningful improvement in care. 70,71 Our descriptive survey results suggested a potential relationship between learning mindset and individual responses to feedback and usability ratings, which has important implications for implementation and evaluation of feedback tools. Mindset theory and similar concepts have been proposed for use in medical education, clinical practice, quality improvement, and within models for data feedback systems. 18,19,24,72 Part of the promise of this approach is that growth mindset can be influenced by interventions and could be harnessed to improve feedback experiences, promote useful and valuable feedback for clinicians, and shift paradigms of feedback toward a safety culture. 23,73 [76]
We recognize several limitations in our study. As a small study at a single institution, the results are not necessarily representative of clinician experiences in other ED environments. While our sample size is consistent with the classic Nielsen usability sample rule and with many other usability studies focused on clinicians, 60 it is small. We were unable to systematically vary our study of reactions to specific feedback types (eg, normative vs specific task feedback) given our holistic approach to understanding reactions to Dx-Connect. Our preliminary study was not designed to investigate whether use of the tool improved diagnostic accuracy or skill in practice. We focused on clinician responses in qualitative interviews, so the influence of Dx-Connect feedback on practice and outcomes remains unknown and requires further study. Clinicians assessed the tool favorably but may have been influenced by being interviewed by members of the development team. 77 Despite these limitations, our preliminary but intriguing findings suggest that performance feedback in diagnosis is feasible, usable, and perceived positively by clinicians, while also eliciting strong responses.
In the case of pneumonia, where there is no gold standard for diagnostic accuracy, the impact of feedback about diagnostic performance is particularly complex and important to understand. When confronting clinicians with potentially imperfect measures and challenging feedback surrounding diagnoses, it is important to develop and use measures with humility, acknowledging uncertainty and the purpose of providing learning opportunities for professional growth. 78 How should we incorporate the complexities of the diagnostic context within feedback processes? How can we ensure feedback is provided in a supportive way to maximize the likelihood of a constructive process of sensemaking and improvement? These are critical questions pointing to future directions for promoting diagnostic excellence for complex diseases.

Figure 1. Agreement in initial and discharge pneumonia diagnosis in patients admitted by ED clinician.

Figure 2. Agreement in initial and discharge pneumonia diagnosis in patients admitted by ED clinician.

Figure 3. Diagnostic Accuracy Assessment for Emergency Department. ED diagnoses of pneumonia are shown in upper left, and discharge diagnoses in upper right. The chest imaging in center displays the number of diagnoses confirmed/disconfirmed with chest imaging.

Figure 5. Individual beliefs and relationships to feedback comfort.

[Figure panel titles: Agreement in Initial & Discharge Pneumonia Diagnosis in Patients Admitted by ED Clinician; Agreement in Initial & Discharge Pneumonia Diagnosis in Patients Admitted by ED Department. 12 months to December 31, 2021.]

Table 1. Clinician demographics by role.

Table 2. Selected quotes by selected codes and theme.

Theme 1. Diagnosing pneumonia in the ED context is characterized by diagnostic uncertainty and may be a secondary priority relative to disposition and treatment
1A. Diagnostic reasoning: . . . I think for me, the big thing is really removing as much subjectivity out of the process as I can. . . . (Participant 1)
1B. Case diagnosis-diagnostic certainty-covering bases: . . . I think oftentimes, we treat a lot of things empirically to cover in case it is pneu-
A lot of times we do not, [get feedback] unless they come back as, what we call, a bounce back, or failure of treatment and coming back to be admitted to the hospital. . . . (Participant 4)

Theme 3. Clinicians liked the measure, feedback tool and features
3A. User design reaction-reaction to normative data: Doctors are . . . competitive and we like to know how we're doing compared to our colleagues because we want to see if we're doing okay. . . . (Participant 6)
3B. Tool evaluation-potential use: You may make me more thoughtful about when I say about how I qualify things that we put into our diagnoses . . . (Participant 6)
3C. Tool evaluation-feedback: reading through my note . . . you can tell that we weren't certain that she had a pneumonia, which is why I think we gave the diagnosis of possible pneumonia. In classic emergency medicine fashion, we kind of hedged. . . . (Participant 5)

Theme 4. Clinicians had strong reactions to feedback data across a spectrum from implicit trust in measure to extreme skepticism.
4A. Interpretation-diagnostic discordance: . . . we like to think of pneumonia as a cut and dry diagnosis, or chest X-rays as black and white. . . . it's easier to call it pneumonia. . . . (Participant