Real-world performance and accuracy of stress echocardiography: the EVAREST observational multi-centre study

Abstract Aims Stress echocardiography is widely used to identify obstructive coronary artery disease (CAD). High accuracy is reported in expert hands but is dependent on operator training and image quality. The EVAREST study provides UK-wide data to evaluate real-world performance and accuracy of stress echocardiography. Methods and results Participants undergoing stress echocardiography for CAD were recruited from 31 hospitals. Participants were followed up through health records, which underwent expert adjudication. Cardiac outcome was defined as anatomically or functionally significant stenosis on angiography, revascularization, medical management of ischaemia, acute coronary syndrome, or cardiac-related death within 6 months. A total of 5131 patients (55% male) participated, with a median age of 65 years (interquartile range 57–74). Dobutamine was used in 72.9% of studies and 68.5% were contrast studies. Inducible ischaemia was present in 19.3% of scans. Sensitivity and specificity for prediction of a cardiac outcome were 95.4% and 96.0%, respectively, with an accuracy of 95.9%. Sub-group analysis revealed high levels of predictive accuracy across a wide range of patient and protocol sub-groups, with the presence of a resting regional wall motion abnormality significantly reducing the performance of both dobutamine (P < 0.01) and exercise (P < 0.05) stress echocardiography. Overall accuracy remained consistently high across all participating hospitals. Conclusion Stress echocardiography has high accuracy across UK-based hospitals, indicating that it is being delivered effectively in real-world practice and reinforcing its role as a first-line investigation in the assessment of patients with stable chest pain.


Introduction
Functional imaging has equal prominence with non-invasive anatomical imaging for the diagnosis of coronary artery disease (CAD) in guidance issued by the European Society of Cardiology. 1 In the UK, NICE proposes non-invasive anatomical imaging for first-line investigation with functional imaging for second-line investigation. 2 However, lack of infrastructure and trained personnel 3 means that, in real-world practice, functional imaging remains the main first-line test for CAD. 4,5 Global reliance on functional imaging as the first-line investigation for CAD, particularly when this approach differs from some national guidelines, raises concerns about whether current patient care is optimal. However, evaluation of individual imaging tests in guidelines has tended to rely on meta-analysis of small experimental studies [6][7][8] or, in the case of stress echocardiography, historical studies from the 1990s and 2000s. 6,9 Recent large-scale randomized clinical trials, such as PROMISE, show similar outcomes with either an anatomical or functional imaging approach, 10 and contemporary single-centre observational studies indicate good performance of stress echocardiography for diagnosis and prognostication. 11,12 Furthermore, recent studies such as ISCHEMIA, 13 combined with evidence from COURAGE, 14 demonstrate the non-inferiority of a medical therapy-first strategy compared with an initial invasive strategy. Whilst other large-scale, prospective studies have examined the accuracy of stress echocardiography in other regions of the world, 15-20 the EVAREST (Echocardiography: Value and Accuracy at Rest and Stress) study is the first such large-scale evaluation of the use and accuracy of stress echocardiography in clinical practice within the National Health Service in the UK. The participating centres are representative of the geographical variation, hospital size, and patient demographics seen within the UK.
In this real-world setting, we describe stress echocardiogram protocol performance, as well as accuracy and patient outcomes, based on all participants with 6-month outcome data by January 2021.

Study design
EVAREST is a prospective, multi-centre, observational study. Following a pilot project in Oxford (OxCardioFuse, IRAS reference: 08/H0604/127), recruitment commenced in the main study (ClinicalTrials.gov ID: NCT03674255). Details about hospital recruitment are provided in Supplementary data online.

Participants
Patients undergoing stress echocardiography for evaluation of stable chest pain were recruited from 28 NHS Trusts, comprising 31 hospitals, between March 2015 and March 2020. All patients provided informed consent. Ethical approval was granted by the Health Research Authority NRES Committee (South Central-Berkshire) review board (IRAS reference: 14/SC/1437). This study was conducted in accordance with the Declaration of Helsinki.

Procedures
Tests were conducted and reported in accordance with each hospital's standard protocol; mode of stress and contrast use were at the operator's discretion. Procedure details, results, and participant medical history were obtained from medical records and recorded on an electronic database (Castor EDC, Amsterdam, Netherlands).

Outcomes
Follow-up is ongoing, with a proportion of patients consented to follow-up for 10 years. Initial follow-up included a medical record review and patient telephone call completed towards the end of the first year after recruitment. Cardiac imaging, procedure reports, or death certification were obtained, if applicable, and data were extracted, including the location of coronary disease, if available. All angiogram reports were reviewed and diameter stenosis (as visually assessed by the operator) was recorded for each coronary artery. The analysis presented in this article is based on the full dataset with follow-up censored at 6 months.
All clinical data were reviewed by an adjudication committee, including at least one accredited cardiologist, blinded to stress echocardiogram results and a binary (cardiac/non-cardiac) outcome assigned (Supplementary data online, Figure S1). Cardiac outcome was defined as angiography demonstrating an anatomically or functionally significant lesion [defined as greater than 70% narrowing (or 50% in the left main stem) or abnormal fractional flow reserve or instantaneous wave-free ratio], referral for revascularization, initiation of appropriate pharmacological therapy, acute coronary syndrome, or cardiac-related death. All patients in whom no additional cardiac intervention, management, or investigation was required were assigned a non-cardiac outcome.
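The outcome definition above can be read as a simple decision rule. The following is a minimal sketch of that rule, not the adjudication committee's actual procedure: the function and argument names are hypothetical, physiological testing (fractional flow reserve or instantaneous wave-free ratio) is reduced to a single boolean flag because no cut-offs are stated in the text, and the stenosis thresholds are assumed to be strict inequalities.

```python
LEFT_MAIN_STEM = "LMS"  # hypothetical vessel label used only in this sketch


def lesion_is_significant(vessel, stenosis_percent, abnormal_physiology=False):
    """Anatomically or functionally significant lesion, per the stated definition.

    Anatomical: >70% diameter narrowing, or >50% in the left main stem.
    Functional: abnormal FFR or iFR (passed in as a flag; cut-offs not modelled).
    """
    threshold = 50.0 if vessel == LEFT_MAIN_STEM else 70.0
    return abnormal_physiology or stenosis_percent > threshold


def assign_outcome(significant_lesion=False,
                   revascularization_referral=False,
                   pharmacological_therapy_started=False,
                   acute_coronary_syndrome=False,
                   cardiac_death=False):
    """Binary outcome: 'cardiac' if any qualifying criterion is met."""
    criteria = (significant_lesion, revascularization_referral,
                pharmacological_therapy_started, acute_coronary_syndrome,
                cardiac_death)
    return "cardiac" if any(criteria) else "non-cardiac"


if __name__ == "__main__":
    lesion = lesion_is_significant("LAD", 80.0)
    print(assign_outcome(significant_lesion=lesion))
```

In practice the committee reviewed all clinical data rather than a fixed set of flags, so this sketch only illustrates how the listed criteria combine into a binary label.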
When assessing the accuracy of stress echo to identify the location of the coronary disease, each segment was assigned a supplying artery (as per Elhendy et al. 21 ). The basal, mid and apical anterior wall, basal and mid-anteroseptum, mid-inferoseptum, apical septum, apical lateral, and apical cap, were assigned to the left anterior descending (LAD) coronary artery territory. The basal and mid-lateral, and basal and mid-inferolateral walls were assigned to the left circumflex artery (LCx) territory. The basal, mid and apical inferior, and basal inferoseptum were assigned to the right coronary artery (RCA) territory.
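The segment-to-territory assignment described above can be written as a lookup table. This is an illustrative sketch only: the segment labels follow the wording of the text, while the dictionary and function names are hypothetical.

```python
# Segment -> supplying coronary artery, following the assignment in the text.
TERRITORY_BY_SEGMENT = {
    # Left anterior descending (LAD)
    "basal anterior": "LAD",
    "mid anterior": "LAD",
    "apical anterior": "LAD",
    "basal anteroseptum": "LAD",
    "mid anteroseptum": "LAD",
    "mid inferoseptum": "LAD",
    "apical septum": "LAD",
    "apical lateral": "LAD",
    "apical cap": "LAD",
    # Left circumflex (LCx)
    "basal lateral": "LCx",
    "mid lateral": "LCx",
    "basal inferolateral": "LCx",
    "mid inferolateral": "LCx",
    # Right coronary artery (RCA)
    "basal inferior": "RCA",
    "mid inferior": "RCA",
    "apical inferior": "RCA",
    "basal inferoseptum": "RCA",
}


def ischaemic_territories(ischaemic_segments):
    """Return the set of coronary territories containing any ischaemic segment."""
    return {TERRITORY_BY_SEGMENT[segment] for segment in ischaemic_segments}


if __name__ == "__main__":
    print(sorted(ischaemic_territories(["mid anterior", "basal inferior"])))
```

The table covers 17 segments in total (9 LAD, 4 LCx, 4 RCA), so each reported ischaemic segment maps to exactly one territory.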

Statistical analysis
Patient demographics and stress echocardiogram protocols were reported using standard approaches. Descriptive statistics were reported as frequencies and medians [interquartile range (IQR)]. Normality was assessed by the Shapiro-Wilk test. Sub-group comparisons were made by Mann-Whitney or χ² tests, as appropriate. Associations of patient demographics or test protocol with contrast usage and accuracy of stress echocardiography were tested with odds ratios (OR) and 95% confidence intervals (CIs) in multivariate logistic regressions. Kaplan-Meier survival curves and Log-Rank tests were used to study differences in cardiac outcomes between groups. A Cox proportional hazard model was used to estimate the hazard ratio (HR) of a positive stress echocardiogram and ischaemic burden, after adjusting for cardiac risk factors and resting regional wall motion abnormalities. To compare outcomes against stress echocardiogram results, each stress echocardiogram was classified as true positive, true negative, false positive, or false negative. Sensitivity, specificity, positive predictive value, and negative predictive value were calculated using standard approaches for stress echocardiography overall, and for sub-groups based on patient characteristics and stress echocardiogram protocol (provided the sub-group contained at least 50 patients). Receiver operating characteristic (ROC) curves were plotted for each sub-group, and the area under the ROC curve (AUROC) was calculated. AUROCs were compared by a χ² test to determine differences in predictive accuracy between patient and protocol sub-groups. Univariate logistic regression, Kruskal-Wallis and Mann-Whitney tests were used to investigate coronary vessel-specific accuracy.
Figure 1 shows patient recruitment from 31 hospitals. The broad geographical distribution of this research network and hospital characteristics are shown in the Supplementary data online (Figure S2).
Of those recruited, 32 were identified as screening failures and 46 were excluded from the analysis as their stress echocardiogram was not performed. A further 97 patients were excluded as their stress echocardiogram was inconclusive or abandoned. Of the 5354 patients who were followed up, 223 were excluded. Therefore, a total of 5131 patients were included in the analysis.

Results
Patient demographics are reported in Table 1. Median age was 65 years (IQR 57-74) and 2823 (55%) were male. Stress echocardiograms were negative for inducible ischaemia in 4139 (80.7%) patients and positive in 992 (19.3%). Table 1 shows that patients with a positive test were more likely to have cardiac risk factors including male sex, increased age, increased body mass index (BMI), hypertension, hypercholesterolaemia, diabetes mellitus, and pre-existing vascular disease. Pre-existing CAD was present in 1868 (36.7%) participants, with 867 (17.2%) participants having previously suffered a myocardial infarction.
Dobutamine was the most common stressor, accounting for 3739 (72.9%) of tests, while exercise was used in 1375 (26.8%) studies. Of those undergoing exercise stress echocardiography, 918 (66.8%) underwent treadmill stress, whilst 454 (33.0%) underwent bicycle ergometer stress; mode of stress was not recorded for 3 (0.2%) patients. Seventeen patients (0.2%) underwent a pacemaker-mediated study. Supplementary data online, Table S1 shows a higher prevalence of cardiovascular risk factors in those undergoing dobutamine stress echocardiograms, compared to those having exercise studies. Left ventricular (LV) contrast was used in 3510 (68.5%) of studies, with more frequent use in dobutamine stress echocardiograms compared to exercise stress (76.1% vs. 47.8%). Increased age and BMI were independently associated with contrast use in multivariate regression analysis (Supplementary data online, Table S2).
Six-month outcome data were analysed to determine the predictive accuracy of stress echocardiography. Figure 2A demonstrates time-related events up to 6 months after stress echocardiography. A positive stress echocardiogram was significantly associated with cardiac outcome (adjusted HR 123.9, 95% CI 88.8-172.8; P < 0.0001).
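For readers less familiar with the time-to-event curves referred to here, the Kaplan-Meier (product-limit) estimator can be sketched in a few lines. This is a generic illustration on made-up follow-up times, not the study's analysis code or data; the function name and the toy values are assumptions.

```python
def kaplan_meier(times, events):
    """Product-limit estimate of event-free survival.

    times  : follow-up time for each patient (e.g. months)
    events : 1 if the cardiac outcome occurred at that time, 0 if censored
    Returns a list of (time, survival_probability) at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_censored = 0
        # Group all patients sharing the same follow-up time.
        while i < len(data) and data[i][0] == t:
            if data[i][1]:
                n_events += 1
            else:
                n_censored += 1
            i += 1
        if n_events:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        at_risk -= n_events + n_censored
    return curve


if __name__ == "__main__":
    # Toy data: six hypothetical patients, follow-up censored at 6 months.
    toy_times = [1, 2, 2, 3, 6, 6]
    toy_events = [1, 0, 1, 1, 0, 0]
    for t, s in kaplan_meier(toy_times, toy_events):
        print(t, round(s, 3))
```

Curves of this kind, estimated separately for positive and negative tests, are what a Log-Rank test compares.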
Overall sensitivity for all types of stress echocardiography and patient was 95.4%, with a specificity of 96.0%. Positive predictive value and negative predictive value were 82.8% and 99.0%, respectively. Overall accuracy was 95.9%. No significant difference in predictive ability was observed between dobutamine and exercise stress echocardiography (P = 0.533). Table 2 shows the sensitivity, specificity, and accuracy for stress echocardiography when separated by sub-group according to patient characteristic and type of stressor. The presence of a resting regional wall motion abnormality was associated with a significant reduction in overall predictive accuracy in both exercise stress echocardiography (P < 0.05) and dobutamine stress echocardiography (P < 0.01). The presence of left bundle branch block (LBBB), which is more common in those with resting wall motion abnormalities, also reduced sensitivity, specificity, and accuracy during dobutamine stress echocardiography. However, there was no statistically significant difference in predictive performance (P = 0.366). The presence of atrial fibrillation selectively reduced sensitivity of dobutamine stress echocardiography but overall predictive ability did not change (P = 0.728). Sensitivity was higher for both dobutamine and exercise stress echocardiography in those with previous coronary artery bypass graft surgery, but specificity was lower, resulting in no overall change in predictive ability for either dobutamine (P = 0.813) or exercise stress echocardiography (P = 0.982). A BMI >40 kg/m² did not significantly affect overall performance (P = 0.402); however, sensitivity was higher during dobutamine stress echocardiography in those patients with a BMI of <40 kg/m². Overall predictive ability was significantly greater (P < 0.0001) in patients aged <40 years.
Sub-group analysis was carried out on patients undergoing stress echocardiography prior to surgery. No significant differences (P = 0.562) in predictive accuracy were observed in this group of patients. Table 3 reports the accuracy of stress echocardiography related to contrast use. No statistically significant differences in overall accuracy were observed between contrast and non-contrast stress echocardiograms (P = 0.813). A significant (P < 0.05) reduction in predictive accuracy was observed with non-contrast exercise stress echocardiography in patients with abnormal resting wall motion, compared with contrast-enhanced stress echocardiography. This related to a higher specificity when contrast was used. However, no difference was observed between contrast and non-contrast stress echocardiography when resting wall motion was normal (P = 0.616).
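The sensitivity, specificity, predictive values, and accuracy reported in this section all derive from the standard 2×2 classification of each test as true positive, false positive, false negative, or true negative. A minimal sketch of those definitions follows; the counts used in the example are made up for illustration and are not the study's counts, and the function name is hypothetical.

```python
def diagnostic_metrics(tp, fp, fn, tn):
    """Standard diagnostic accuracy measures from a 2x2 table.

    tp: positive test, cardiac outcome      fp: positive test, no cardiac outcome
    fn: negative test, cardiac outcome      tn: negative test, no cardiac outcome
    """
    return {
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "ppv": tp / (tp + fp),
        "npv": tn / (tn + fn),
        "accuracy": (tp + tn) / (tp + fp + fn + tn),
    }


if __name__ == "__main__":
    # Toy counts, chosen only to make the arithmetic easy to follow.
    for name, value in diagnostic_metrics(tp=90, fp=20, fn=10, tn=380).items():
        print(f"{name}: {value:.3f}")
```

Note that positive and negative predictive values depend on how common the outcome is in the tested population, whereas sensitivity and specificity do not, which is why a high specificity can coexist with a more modest positive predictive value.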
In the 21 hospitals that recruited more than 50 patients, the diagnostic performance of stress echocardiography was determined by calculating AUROCs. These ranged from 0.900 to 1.000, with a mean AUROC of 0.9494, demonstrating that stress echocardiography is being performed to a high diagnostic standard at all centres. Comparison of the AUROCs between centres, however, did reveal a statistically significant difference in accuracy (P < 0.0001).
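As background to the AUROC values quoted here: when the test result is binary (positive or negative for inducible ischaemia), the ROC curve has a single operating point and the area under it reduces to the mean of sensitivity and specificity. The sketch below illustrates this on made-up data using the pairwise (Mann-Whitney) definition of AUROC. It is an assumption that a binary test result is the predictor; the text does not state exactly how the per-hospital AUROCs were computed.

```python
def auroc(scores, labels):
    """AUROC as the probability that a random patient with the outcome
    scores higher than a random patient without it (ties count as 0.5)."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one patient in each outcome group")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


if __name__ == "__main__":
    # Toy data: 10 patients with the outcome (9 test-positive),
    # 20 without it (18 test-negative).
    toy_scores = [1] * 9 + [0] * 1 + [1] * 2 + [0] * 18
    toy_labels = [1] * 10 + [0] * 20
    sensitivity, specificity = 9 / 10, 18 / 20
    print(auroc(toy_scores, toy_labels), (sensitivity + specificity) / 2)
```

Under this reading, an AUROC close to 0.95 corresponds to sensitivity and specificity that average about 95%.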
The presence of a resting regional wall motion abnormality was significantly associated with the likelihood of having a positive stress echocardiogram, with an adjusted odds ratio of 4.1 (95% CI 3.5-4.9) (P < 0.0001). Of those stress echocardiograms that were positive for inducible ischaemia, 30.7% had resting wall motion abnormalities, compared with 9.7% of negative stress echocardiograms. The presence of a resting regional wall motion abnormality was also significantly associated with the likelihood of severe coronary disease on angiography, with an adjusted HR of 2.8 (95% CI 2.4-3.3) (P < 0.0001). Figure 3 demonstrates how the occurrence of severe coronary disease differs based on both the presence of resting regional wall motion abnormalities and the presence of inducible ischaemia.
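As a rough cross-check of the association described above, a crude (unadjusted) odds ratio can be computed directly from the two reported proportions. This is a back-of-envelope sketch, not the multivariate model used in the study, so it is not expected to match the adjusted estimate exactly; the function name is hypothetical.

```python
def odds_ratio_from_proportions(p_exposed_in_group_a, p_exposed_in_group_b):
    """Crude odds ratio comparing the odds of exposure between two groups."""
    odds_a = p_exposed_in_group_a / (1.0 - p_exposed_in_group_a)
    odds_b = p_exposed_in_group_b / (1.0 - p_exposed_in_group_b)
    return odds_a / odds_b


if __name__ == "__main__":
    # Proportion with resting wall motion abnormalities, as reported in the text:
    # 30.7% of positive tests vs. 9.7% of negative tests.
    crude_or = odds_ratio_from_proportions(0.307, 0.097)
    print(round(crude_or, 2))
```

The crude value falls inside the reported 95% confidence interval for the adjusted odds ratio, which suggests adjustment changed the estimate only modestly.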
The median number of ischaemic segments identified during a positive stress echocardiogram was 3 (IQR 2-4) (see Supplementary data online, Figure S3). Figure 2B demonstrates a significant separation in outcomes over 6 months according to number of ischaemic segments (P < 0.0001). Those patients with a positive stress echocardiogram who were managed medically but subsequently presented with acute coronary syndromes had a significantly higher ischaemic burden at baseline compared to those who were managed medically with no further cardiac events [four segments (IQR 3-6) vs. two segments (IQR 1-3), P < 0.01]. Ischaemic burden was significantly higher in patients referred for angiography compared to those managed medically [four segments (IQR 2-5) and two segments (IQR 1-3), P < 0.0001] and those found to have angiographically severe disease had a higher ischaemic burden compared to those with non-obstructive disease [four segments (IQR 3-6) and three segments (

Discussion
This study provides contemporary, real-world data on the use and accuracy of stress echocardiography in clinical practice across a national healthcare system. When stress echocardiography is used as the first-line test for the evaluation of CAD, outcomes for patients are consistent with, or better than, those reported as best practice from randomized controlled trials of anatomical [22][23][24] or functional imaging. 8,10 Across hospitals of varying sizes, activity levels, and locations, stress echocardiography was performed consistently to a high standard. It is noteworthy that only 1.8% of stress echocardiograms were considered non-diagnostic. Historically, significant variability in the performance of stress echocardiography has been reported between different studies, with sensitivity and specificity ranging from 33 to 96% and 38 to 97%, respectively. 2,6,25 National echocardiography societies have therefore prioritized education, training, and monitoring of competence. 20,26 These initiatives could explain why this study shows delivery of stress echocardiography to a high standard with high levels of clinically meaningful sensitivity and specificity within the UK. Protocol selection may also partly be responsible. Dobutamine stress echocardiography was more commonly used than exercise stress and, although operator experience with exercise and local facilities may drive this difference, 4 use of dobutamine was associated with the presence of a higher BMI, increased age, and a greater number of cardiac risk factors, suggesting a degree of stressor selection to optimize the procedure. Benefits of stress echocardiography include a lack of ionizing radiation, which complicates other cardiac imaging modalities. However, image quality can be adversely affected by patient body habitus, making interpretation challenging. One study reports up to one in three patients may have sub-optimal images. 27 This can be overcome with LV contrast agents. 28
We observed a high use of LV contrast, at 68.5% of studies, which is known to increase diagnostic accuracy. 28 Patients receiving contrast tended to have an elevated BMI and older age, matching known factors likely to increase the requirement for contrast use. 28 Our findings demonstrate contrast-enhanced stress echocardiography has a high predictive accuracy, even in the sub-group of patients with a BMI >40 kg/m².
Accuracy was mainly affected by non-procedural factors, specifically, pre-existing regional wall motion abnormalities, which are recognized as complicating identification of new wall motion abnormalities 29 as well as resulting in a higher risk of adverse events. 21,30 The reduction in accuracy in those with regional wall motion abnormalities may reflect an impact of dobutamine on post-systolic shortening, 31 which could disguise a lack of segmental contractile function, leading to misdiagnosis on visual assessment.
We have demonstrated the ability of stress echocardiography to accurately detect flow-limiting coronary disease in the LAD and RCA; however, no significant association was observed between ischaemia detected in the LCx territory and LCx coronary disease. This lack of association between LCx ischaemia and corresponding disease on angiography may be explained by the termination of the stress echocardiogram following the development of ischaemia in a different territory with a lower coronary flow reserve. Once ischaemia has been documented, especially in dobutamine stress echocardiography, the test is typically terminated, which may mask an ischaemic response in another territory with significant stenosis.
Since stress echocardiography relies on the qualitative assessment of wall motion, accurate interpretation is dependent on operator experience. 32 One obstacle to more widespread use of stress echocardiography may be a lack of trained operators able to interpret the test confidently and accurately. In the future, this obstacle may be overcome by the incorporation into the clinic of artificial intelligence (AI) tools capable of performing a quantitative assessment of stress images. [33][34][35] Increased consistency and confidence in reporting through the use of AI could broaden the range of personnel who could perform stress echocardiograms.
Acute coronary events or cardiac-related deaths that occur after a negative stress echocardiogram remain a concern. However, this study shows similar rates of 1-2% of patients having acute events over 6 months in both the negative and positive stress echocardiogram cohorts. Recent trials have shown an early invasive strategy has a similar impact on longer-term event rates as a medical management-based approach, 13,24 which may reflect the evolving nature of the underlying pathology and emergence of new disease. As CAD progresses over time, accuracy for stress echocardiography to predict longer-term outcomes is likely to vary and subsequent analysis with longer-term follow-up will be of interest.
The present study reveals that over half of patients who have positive functional imaging do not go on to have further investigation or intervention. The number of ischaemic segments was lower in this group, consistent with accepted clinical decision making to manage medically those with lower ischaemic burden. 14 This study confirms a striking graded association between the degree of ischaemia assessed by the clinician and the likelihood of cardiac outcome over the next 6 months. Reassuringly, outcome at 6 months in the medically managed positive stress echocardiogram population was comparable to other arms of clinical care. The recently published ISCHEMIA study would support the medical management of stable ischaemic heart disease patients with preserved ventricular function and no evidence of heart failure or LMS disease, even if they have a large burden of ischaemia. 13 Long-term follow-up of this study will investigate whether revascularization reduces the incidence of myocardial infarction in the longer term in patients with significant ischaemia.
The study has limitations. Firstly, by using real-world data, angiographic confirmation of obstructive or non-obstructive coronary disease was not available for all patients. Instead, patients were allocated to outcome based on clinical history during a 6-month period, using criteria developed for handling outcomes in this setting. 36,37 Therefore, patients with obstructive coronary disease who had a negative stress echocardiogram but then remained well for the next 6 months could have been misclassified from an anatomical perspective in analysis. Arguably, this outcome was clinically acceptable and the statistical misclassification bias is minimized by related misclassification in patients with positive stress echocardiogram who did not undergo further investigation. Secondly, patients who underwent angiography were judged based on the degree of stenosis in their epicardial arteries assessed by the operating clinician rather than an independent review of the angiogram. Thirdly, this meant potential causes of non-obstructive ischaemia, such as microvascular disease, may have been misclassified in outcome allocation as a false positive stress echocardiogram. Fourthly, not all sites started recruiting at the same time and therefore some sites contributed more proportionally to the dataset. Reanalysis at future time points beyond 6 months and with more patients from each site providing outcome data will be of interest. Finally, due to the nature of the consent process, there may