Abstract

Aims To compare 19 risk score algorithms with regard to their validity to predict 30-day and 1-year mortality after cardiac surgery.

Methods and results Risk factors for patients undergoing heart surgery between 1996 and 2001 at a single centre were prospectively collected. Receiver operating characteristics (ROC) curves were used to describe the performance and accuracy. Survival at 1 year and cause of death were obtained in all cases. The study included 6222 cardiac surgical procedures. Actual mortality was 2.9% at 30 days and 6.1% at 1 year. Discriminatory power for 30-day and 1-year mortality in cardiac surgery was highest for logistic (0.84 and 0.77) and additive (0.84 and 0.77) European System for Cardiac Operative Risk Evaluation (EuroSCORE) algorithms, followed by Cleveland Clinic (0.82 and 0.76) and Magovern (0.82 and 0.76) scoring systems. None of the other 15 risk algorithms had a significantly better discriminatory power than these four. In coronary artery bypass grafting (CABG)-only surgery, EuroSCORE followed by New York State (NYS) and Cleveland Clinic risk score showed the highest discriminatory power for 30-day and 1-year mortality.

Conclusion EuroSCORE, Cleveland Clinic, and Magovern risk algorithms showed superior performance and accuracy in open-heart surgery, and EuroSCORE, NYS, and Cleveland Clinic in CABG-only surgery. Although the models were originally designed to predict early mortality, the 1-year mortality prediction was also reasonably accurate.

See page 768 for the editorial comment on this article (doi:10.1093/eurheartj/ehi792)

Introduction

Despite technological advancements, open-heart operations still carry a risk of mortality and morbidity. To aid in the selection of patients for cardiac surgery, several risk-scoring systems have been developed during the last decades. These aim to estimate the risk of peri-operative death, based on the occurrence of different risk factors. Operative mortality is also increasingly used as an indicator of the quality of cardiac surgery.1

To make an accurate comparison between different institutions or surgeons, mortality data must be adjusted to the risk profiles of the patients.2,3 Differences between the available risk algorithms regarding score design and the patient population on which the score development was based could influence their accuracy and performance. Ideally, a risk model should be useful for outcome prediction at different surgical centres, both at the institutional level and for individual patients.4 Operative mortality is the outcome variable most commonly used as a quality indicator, but long-term mortality may be more relevant from a patient perspective.

A few comparative studies of different risk algorithms exist.4–8 However, the relative performance of the risk-scoring systems currently used remains unclear. The purpose of this study was to compare 19 open-source risk score algorithms with regard to their validity to predict 30-day and 1-year mortality after cardiac surgery in a large single-institution patient population.

Methods

Study design and patients

The study was approved by the Ethics Committee of the Medical Faculty at Lund University. Risk factors for all adult patients undergoing heart surgery at the University Hospital of Lund between January 1996 and February 2001 were prospectively collected when the patients were admitted to the Department of Cardiothoracic Surgery. The patient record form contained a total of 248 variables (80 pre-, 106 intra-, and 62 post-operative) based on the Society of Thoracic Surgeons (STS)9 patient record form. The data were stored in a local adult cardiac surgery database.

Data collection and risk-score calculation

From the total of 248 variables, those corresponding to the risk factors in the different risk models were selected. Thus, a subset of 104 of the pre- and intra-operative variables was imported into the statistical software package, together with 30-day and 1-year mortality for the population. Missing values were replaced using the probability imputation technique10 before the risk score was calculated. The probability imputation technique substitutes conditional probabilities for missing covariate values when the covariate is qualitative. The risk score for each algorithm was calculated for every patient according to the published definitions (Table 1).
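The idea behind probability imputation can be illustrated with a minimal Python sketch. It is not the implementation used in the study: conditioning on a single grouping variable, the field names `urgency` and `diabetes`, and the toy records are all assumptions made for illustration.

```python
from collections import defaultdict


def probability_impute(records, covariate, group_key):
    """Fill missing (None) values of a 0/1 covariate with the observed
    proportion of 1s among records in the same group, i.e. an estimated
    conditional probability. Returns new records; the input is not modified."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for rec in records:
        value = rec[covariate]
        if value is not None:
            totals[rec[group_key]] += 1
            positives[rec[group_key]] += value

    filled = []
    for rec in records:
        new_rec = dict(rec)
        if new_rec[covariate] is None:
            group = new_rec[group_key]
            if totals[group] == 0:
                raise ValueError(f"no observed values in group {group!r}")
            new_rec[covariate] = positives[group] / totals[group]
        filled.append(new_rec)
    return filled


# Toy records (made up): the grouping variable and values are hypothetical.
toy_records = [
    {"urgency": "elective", "diabetes": 1},
    {"urgency": "elective", "diabetes": 0},
    {"urgency": "elective", "diabetes": 0},
    {"urgency": "elective", "diabetes": None},
    {"urgency": "emergency", "diabetes": 1},
    {"urgency": "emergency", "diabetes": None},
]
toy_filled = probability_impute(toy_records, "diabetes", "urgency")
```

The imputed value is a probability between 0 and 1 rather than a hard 0 or 1, so a score that weights the covariate receives a proportionally reduced contribution for patients with missing data.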

Follow-up

The vital status at 1 year after the operation was obtained for all patients from the Population and Welfare Statistics Sweden, Statistiska Centralbyrån, Stockholm, Sweden, as were the date and cause of death.

Statistical analysis

Means (±SD) were used to describe the continuous variables, and frequencies were calculated for categorical variables. Score-predicted operative mortality (death within 30 days of operation) was calculated using the mean score from the different risk models, except for the Northern New England algorithm where the published score-mortality table11 was used. Receiver operating characteristics (ROC) curves were used to describe the performance and predictive accuracy for the different algorithms.12 The discriminatory power, i.e. the c-index, was evaluated by calculating the areas under ROC curves.13 The areas under curves are presented with 95% confidence limits. An area of 1.0 under the ROC curve indicates perfect discrimination, whereas an area of 0.50 indicates complete absence of discrimination. Any intermediate value is a quantitative measure of the ability of the risk predictor model to distinguish between survivors and non-survivors.
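As an illustration of what the area under the ROC curve measures, the following Python sketch computes it directly as the proportion of non-survivor/survivor pairs in which the non-survivor has the higher risk score, with ties counted as one half. The function name and the toy data are assumptions for illustration; the study itself used Stata.

```python
def roc_area(scores, died):
    """Area under the ROC curve (c-index) for one risk score.

    Equals the fraction of (non-survivor, survivor) pairs in which the
    non-survivor has the higher score; ties count as one half.
    """
    non_survivors = [s for s, d in zip(scores, died) if d]
    survivors = [s for s, d in zip(scores, died) if not d]
    if not non_survivors or not survivors:
        raise ValueError("need at least one death and one survivor")
    concordant = 0.0
    for x in non_survivors:
        for y in survivors:
            if x > y:
                concordant += 1.0
            elif x == y:
                concordant += 0.5
    return concordant / (len(non_survivors) * len(survivors))


# Toy data (made up): six operations, three deaths.
toy_scores = [1, 2, 3, 4, 5, 6]
toy_died = [0, 0, 1, 0, 1, 1]
toy_area = roc_area(toy_scores, toy_died)  # 8 of the 9 pairs are concordant
```

A score that ranks every non-survivor above every survivor gives an area of 1.0, and a score that is the same for all patients gives 0.5, matching the interpretation above.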

To compare the areas under the resulting ROC curves (used as an index for the predicted value), the non-parametric approach described by DeLong et al.14 was used. The ROC area for each risk algorithm was systematically compared with the ROC area of the other 18 algorithms. The number of algorithms with a significantly larger or smaller ROC area was then computed. The probability significance level was adjusted for the effect of multiple comparisons using Sidak's method.
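The comparison can be sketched in Python as follows. This is a plain-Python rendering of the DeLong variance estimate for two correlated ROC areas, followed by Sidak's per-comparison significance level; it is a sketch, not the Stata routine used in the study. The function names are illustrative, and using 18 as the number of comparisons per algorithm is an assumption, since the text does not state the number entered into the adjustment.

```python
import math


def _psi(x, y):
    """Pair kernel: 1 if the non-survivor scores higher, 0.5 on a tie, else 0."""
    if x > y:
        return 1.0
    return 0.5 if x == y else 0.0


def _components(scores, died):
    """Return the ROC area and DeLong's structural components for one score."""
    pos = [s for s, d in zip(scores, died) if d]
    neg = [s for s, d in zip(scores, died) if not d]
    v10 = [sum(_psi(x, y) for y in neg) / len(neg) for x in pos]
    v01 = [sum(_psi(x, y) for x in pos) / len(pos) for y in neg]
    return sum(v10) / len(v10), v10, v01


def _cov(a, b, mean_a, mean_b):
    """Sample covariance of two equally long component lists."""
    return sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b)) / (len(a) - 1)


def delong_test(scores_a, scores_b, died):
    """Compare two correlated ROC areas; needs >= 2 deaths and >= 2 survivors.

    Returns (area_a, area_b, z, two_sided_p).
    """
    area_a, v10_a, v01_a = _components(scores_a, died)
    area_b, v10_b, v01_b = _components(scores_b, died)
    var = 0.0
    for comp_a, comp_b in ((v10_a, v10_b), (v01_a, v01_b)):
        s_aa = _cov(comp_a, comp_a, area_a, area_a)
        s_bb = _cov(comp_b, comp_b, area_b, area_b)
        s_ab = _cov(comp_a, comp_b, area_a, area_b)
        var += (s_aa + s_bb - 2.0 * s_ab) / len(comp_a)
    if var <= 0.0:  # e.g. identical scores: no evidence of a difference
        return area_a, area_b, 0.0, 1.0
    z = (area_a - area_b) / math.sqrt(var)
    return area_a, area_b, z, math.erfc(abs(z) / math.sqrt(2.0))


def sidak_alpha(alpha, n_comparisons):
    """Per-comparison significance level under Sidak's adjustment."""
    return 1.0 - (1.0 - alpha) ** (1.0 / n_comparisons)


# Assumed number of comparisons per algorithm (18); not stated in the text.
per_test_alpha = sidak_alpha(0.05, 18)
```

Under these assumptions, a pairwise difference would be called significant when the two-sided P-value from `delong_test` falls below `per_test_alpha`, which is roughly 0.0028 for 18 comparisons at an overall level of 0.05.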

Graphs and statistical analyses were performed using the Intercooled Stata version 9.0 (2005) statistical package (StataCorp LP, College Station, TX, USA) and GraphPad Prism 4b, 2004 for Mac OS X, GraphPad Software, Inc., USA.

Results

Patient population

Between January 1996 and February 2001, 6499 consecutive heart operations were performed on 6414 patients. During the period January–March 1998, database service and upgrade resulted in missing values in 30% of the data points. All operations (n=277) from this period were excluded from the study. Thus, 6153 patients, undergoing 6222 operations, were included in the analysis. In 2% of the total data points, missing values were replaced using the probability imputation technique.10 There was accurate documentation of data including mortality and cause of death in all cases, and no patient was lost to follow-up.

The average age was 66.3±10.6 years (range 18–95). The majority of patients were men (72%). A coronary artery bypass grafting (CABG)-only operation was performed in 4351 cases (70%), 1340 (22%) cases had a valve procedure with or without CABG surgery, and 531 (8%) were miscellaneous procedures, e.g. post-infarction septal rupture (37 cases), aortic aneurysm or dissection (209 cases), and cardiac transplantation (78 cases). Previous cardiac surgery had been performed in 457 cases (7.3%). Seventy-eight patients (1.3%) were in cardiogenic shock at the start of the operation and 628 (10%) were operated on within 24 h after acceptance for surgery (emergency surgery). The actual 30-day mortality was 2.9% (n=180) and the 1-year mortality was 6.1% (n=377).
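The observed mortality rates follow directly from the counts above (180 and 377 deaths in 6222 operations). The Python sketch below reproduces them and attaches a 95% confidence interval. The Wilson score interval is an assumption chosen for illustration; the text does not state which interval method was used for the observed mortality.

```python
import math


def wilson_interval(events, n, z=1.96):
    """Observed proportion with a Wilson score interval (default about 95%).

    Returns (proportion, lower, upper).
    """
    p = events / n
    denom = 1.0 + z * z / n
    centre = (p + z * z / (2.0 * n)) / denom
    half = z * math.sqrt(p * (1.0 - p) / n + z * z / (4.0 * n * n)) / denom
    return p, centre - half, centre + half


# Counts taken from the text: deaths among 6222 operations.
p30, low30, high30 = wilson_interval(180, 6222)  # 30-day mortality
p1y, low1y, high1y = wilson_interval(377, 6222)  # 1-year mortality
```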

Performance and predictive accuracy for the algorithms

The discriminatory power (i.e. the area under the ROC curve) for 30-day mortality and 1-year mortality was highest for the logistic (0.84 and 0.77) and additive (0.84 and 0.77) European System for Cardiac Operative Risk Evaluation (EuroSCORE) algorithms, followed by the Cleveland Clinic (0.82 and 0.76) and the Magovern (0.82 and 0.76) scoring systems (Figures 1 and 2). None of the other risk algorithms had a significantly better discriminatory power (larger ROC area) than these four (Figure 3). In the subanalysis with CABG-only patients, the discriminatory power of the two EuroSCORE algorithms was highest, followed by the New York State (NYS) and Cleveland Clinic risk algorithms (Table 2).

The mortality predictions of the different scoring systems are shown in Figure 4.

Follow-up

The most common cause of death within 30 days was cardiovascular disease (n=163, 91%), followed by cerebrovascular disease (n=3, 1.7%), malignant neoplasm (n=3, 1.7%), and chronic lower respiratory disease (n=2, 1.1%). Cardiovascular disease was also the most common cause of death within 1 year (n=280, 74%), followed by malignant neoplasm (n=22, 5.8%), cerebrovascular disease (n=16, 4.2%), chronic lower respiratory disease (n=10, 2.7%), and septicaemia (n=10, 2.7%). For each risk algorithm, the ROC areas for cardiovascular-related (n=163) and total 30-day mortality (n=180) were almost identical (difference 0.005 or less). The discriminatory power for cardiovascular-related 1-year mortality (n=280) increased by approximately 0.03 for all 19 algorithms compared with the discriminatory power for total 1-year mortality (n=377) (logistic EuroSCORE 0.80, additive EuroSCORE 0.80, Cleveland Clinic 0.79, and Magovern 0.78). However, it did not change their relative order of discriminatory power.

Discussion

The purpose of this study was to compare 19 commonly used cardiac surgical risk scores with regard to their validity in a large single-institution patient population. The results show that four of the algorithms showed superior performance and accuracy in predicting 30-day and 1-year mortality, expressed as discriminatory power, compared with the other 15 algorithms. Although all of the algorithms were designed to predict early mortality, they also predicted 1-year mortality well, especially when the cause of death was cardiovascular disease.

Most algorithms overestimated the 30-day mortality in this patient population. The same finding has been reported in other studies.4,6 Rather than reflecting weaknesses in the risk score algorithm, these findings are probably explained by differences in patient mix and temporal periods compared to the original databases used for development of the algorithms.6 Prediction of mortality rate in the CABG-only subgroup was almost perfect using the Northern New England and NYS algorithms, which are both for use in CABG surgery and newly developed.

The potential of ROC curves in medical diagnostic testing was recognized as early as 1960.15 Even if comparison of ROC curves in a statistically valid fashion to evaluate models remains controversial, the ROC curve is currently the best developed statistical tool for describing performance.12 The EuroSCORE model, which had the highest discriminatory power, has been shown to work well to predict 30-day mortality in many European countries16 and in the United States.17 It compared favourably with the STS risk stratification algorithm7 (which is not open source and was therefore not included in the present analysis). Recently, it was demonstrated that EuroSCORE could predict intensive care unit stay and costs of open-heart surgery.18 The Cleveland Clinic model has also shown high discrimination to predict early mortality.8 An important finding in the present study is that these algorithms could be used also to predict long-term mortality (1 year), especially for cardiovascular deaths.

Earlier studies have compared the performance of different risk algorithms to predict 30-day mortality,4,6,8 but have not shown significant differences in performance and accuracy. This may be explained by smaller patient populations.6,8

The predictive accuracy of different risk scoring systems may be influenced by numerous factors, such as differences in variable definitions, management of incomplete data fields, surgical procedure selection criteria, and geographical differences in patient risk factors. The prevalence of risk factors in patients referred for heart surgery may also change over time. Difficulties thus arise when comparison of the accuracy and predictive power of large databases is attempted. However, ROC analysis is a robust technique for such comparisons. Importantly, the shapes of the ROC curves were similar among the compared risk models (Figure 2), making direct comparison possible.12 Murphy-Filkins et al.19 showed that an increase of up to five times in the frequency of a low-frequency variable (for example, due to a difference in variable definition) did not appreciably change the model fit.

All surgical procedures were included in the study, irrespective of the number of operations the patients underwent. Thus, a patient could participate two or more times in the analysis. This could be debated, as multiple procedures performed in the same patient may introduce dependence in the data. An alternative would be to include only the first procedure for each patient. A subanalysis using this approach (n=6153) showed only very small differences in the ROC area for the different risk algorithms (on average 0.001). A drawback of excluding patients having a second procedure during the study period is that some high-risk cases will be eliminated from the analysis. Regardless of which method was used, the differences caused by this dependence were negligible, most likely due to the small number of patients (1%) who had more than one procedure.

The probability imputation technique, used in this study, has been shown to work well in prognostic factor studies.20 Another strategy to handle incomplete data is to exclude the patients with missing values from analysis, but because missing values are more likely in emergent high-risk patients, this could result in bias.

Geographical differences in the occurrence of patient risk factors may have influenced the design of different risk-scoring systems, but do not seem to influence the present results. The best-performing risk scores in this study were developed in two different geographical areas: Europe and the USA.

Eight of the included risk algorithms (Cabdeal, NYS, Northern New England, Magovern, Toronto, Toronto (modified), UK national score, and Veterans Affairs) were originally designed to predict early mortality in CABG-only patients, which also could affect the predictive accuracy. A subanalysis of CABG-only patients in this material identified the same two risk-scoring systems with the largest ROC areas (EuroSCORE additive and logistic), followed by the NYS and the Cleveland Clinic risk-scoring systems.

The smaller ROC area for the 1-year than for the 30-day mortality prediction was expected. Risk models originally designed to predict 30-day mortality will mainly predict cardiovascular death, which was the most common cause of early post-operative mortality (91%). At 1 year, the causes of death will be more diverse and the proportion of cardiovascular-related death will decrease (74%).

The strength of the present study is that the algorithms could be compared using a relatively large patient population, where the patient data were collected on a regular basis in the daily clinical work. The data were entered into the database pre-operatively, generally by residents, and not by the surgeon performing the operation.

During the last decades, several different risk score algorithms for cardiac surgery have been published, but it still remains difficult to risk stratify individual patients.4,8 One method to improve risk algorithm development could be to include more patients with higher risk scores as suggested by Wyse and Taylor.21 However, we found that the Cleveland Clinic score, which was developed on 5051 patients, performed almost as well as the EuroSCORE, developed on 13 302 patients.

Most risk algorithms are based on logistic regression analysis with a priori assumptions of linear relationships. Another method to improve risk prediction could be to use a more complex risk model, such as the artificial neural network, which has the advantage of the capacity to model complex, non-linear relationships and is relatively robust and tolerant of missing data.22 There are only a few studies done in this area, which merits further investigation.
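The difference between an additive score and a logistic model can be shown with a short Python sketch. All risk factors, point values, coefficients, and the intercept below are hypothetical and are not taken from EuroSCORE or any other published algorithm; the sketch only shows the general form, in which the logistic model is linear on the log-odds scale.

```python
import math

# HYPOTHETICAL weights for illustration only; not from any published score.
ADDITIVE_POINTS = {"emergency": 2, "previous_cardiac_surgery": 3, "female": 1}
LOGISTIC_BETAS = {"emergency": 0.7, "previous_cardiac_surgery": 1.0, "female": 0.3}
LOGISTIC_INTERCEPT = -4.8


def additive_score(patient):
    """Sum the points of every risk factor that is present."""
    return sum(pts for factor, pts in ADDITIVE_POINTS.items() if patient.get(factor))


def logistic_risk(patient):
    """Predicted probability of death from a logistic model.

    The linear predictor (log-odds) is the intercept plus the sum of the
    coefficients of the risk factors that are present.
    """
    log_odds = LOGISTIC_INTERCEPT + sum(
        beta for factor, beta in LOGISTIC_BETAS.items() if patient.get(factor)
    )
    return 1.0 / (1.0 + math.exp(-log_odds))


# Hypothetical patient: female, emergency operation, no previous cardiac surgery.
toy_patient = {"emergency": True, "female": True}
toy_points = additive_score(toy_patient)
toy_risk = logistic_risk(toy_patient)
```

Each risk factor shifts the log-odds by a fixed amount regardless of the other factors, which is the linearity assumption that a more flexible model, such as a neural network, would relax.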

Even if a perfect risk prediction algorithm in cardiac surgery is never achieved, identification of the best-performing risk algorithms is important. Pre-operative risk stratification may aid in the selection between cardiac surgery and other therapeutic modalities currently available, facilitate the planning of hospital resource utilization, and enable accurate comparison between different institutions or surgeons.

Conflict of interest: none declared.

Appendix

Table A1

Pre-operative general risk factors in 6222 open-heart operations

Pre-operative risk factor Mean (±SD) or n (%) Amphiascore Cabdeal Cleveland Clinic EuroSCOREa French score Magovern NYS Northern New England Ontario Parsonnet Parsonnet (modified) Pons Toronto Toronto (modified) Tremblay Tuman UK national score Veterans Affairs 
Ageb (years) 66.3 (10.6) √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ 
Female gender 1765 (28.4) √   √  √ √ √ √ √ √  √ √  √ √  
Heightb (centimetres) 171.4 (8.0)  √    √         √    
Weightb (kilograms) 78.7 (13.8)  √ √   √    √ √    √    
Hbb (g/L) 134.1 (16.3)   √   √             
Serum creatinineb (µmol/L) 95.2 (40.5) √ √ √ √ √ √ √    √ √    √   
Hypertension (sys >140 mmHg) 2458 (40.0)          √ √  √ √   √  
Diabetes 1106 (17.9)  √ √   √ √   √ √   √   √  
Hypercholesterolemia (treated) 2274 (37.0)           √        
Chronic pulmonary disease 477 (7.7)  √ √ √  √ √ √   √      √  
Active smoker 539 (8.8)                 √  
Cerebrovascular disease 448 (7.2)   √ √  √ √    √     √ √ √ 
Peripheral vascular disease 636 (10.3)   √ √  √ √ √   √  √ √   √ √ 
Kidney disease by history 248 (4.0)                √ √  
Dialysis 28 (0.5)     √  √ √  √ √      √  
Adult congenital heart disease 11 (0.2)           √        
ASA medication 4346 (69.9)           √        
Diuretic medication 2203 (35.4)                  √ 
Immunosuppressive medication 71 (1.2)           √        

ASA indicates acetylsalicylic acid; Hb, hemoglobin; sys, systolic arterial blood pressure.

aAdditive and logistic.

bContinuous variables are presented as mean (±SD). The analysis is based on operations where the risk factor data were available.

Table A2

Pre-operative cardiac risk factors in 6222 open-heart operations

Pre-operative risk factor Mean (±SD) or n (%) Amphiascore Cabdeal Cleveland Clinic EuroSCOREa French score Magovern NYS Northern New England Ontario Parsonnet Parsonnet (modified) Pons Toronto Toronto (modified) Tremblay Tuman UK national score Veterans Affairs 
Previous cardiac surgery 457 (7.3) √  √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ √ 
Active endocarditis 55 (0.9)    √       √        
Heart failure 1156 (18.6)      √     √    √ √  √ 
Cardiomegaly 327 (5.3)      √             
Unstable angina 744 (12.0)  √  √       √    √   √ 
CCSb 2.6 (1.0)                 √  
NYHAb 2.4 (1.0)            √     √ √ 
Recent MI (within 24 h) 144 (2.3) √                  
Recent MI (within 48 h) 207 (3.3)     √      √        
Recent MI (within 21 days) 793 (12.9)    √        √   √ √   
Ventricular arrhythmia (acute) 64 (1.0)     √  √    √        
Atrial fibrillation 508 (8.3)      √             
Pacemaker 33 (1.0)           √        
Left main stenosis 964 (17.9)           √  √ √   √  
Triple vessel disease 2690 (50.7)             √      
LVEFb 49.7 (11.6) √  √ √ √ √ √ √ √ √ √  √ √ √ √ √  
Aortic gradient >120 mmHg 278 (4.5)          √ √        
Pulmonary hypertension 191 (3.1)    √      √ √     √   

CCS, Canadian Cardiovascular Society; LVEF, left ventricular ejection fraction; NYHA, New York Heart Association; MI, myocardial infarction.

aAdditive and logistic.

bContinuous variables are presented as mean (±SD). The analysis is based on operations where the risk factor data were available.

Table A3

Critical pre-operative situations in 6222 open-heart operations

Pre-operative risk factor n (%) Amphiascore Cabdeal Cleveland Clinic EuroSCOREa French score Magovern NYS Northern New England Ontario Parsonnet Parsonnet (modified) Pons Toronto Toronto (modified) Tremblay Tuman UK national score Veterans Affairs 
Urgent surgery 1376 (22.2)      √  √ √    √ √   √ √ 
Emergency surgery 628 (10.1) √ √ √ √  √  √ √   √ √ √ √ √ √ √ 
PTCA failure/complication 138 (2.2)      √    √ √        
Intubated 71 (1.1)     √      √ √     √  
IABP 134 (2.2)          √ √      √ √ 
Uncontrolled systemic disturbanceb 1135 (18.2)               √    
Cardiogenic shock 78 (1.3)      √ √    √ √       
Hemodynamically unstable 286 (4.6)       √            
Critical statec 308 (5.0)    √               
Catastrophic statesd 206 (3.3)          √         

IABP, intra-aortic balloon pump; PTCA, percutaneous transluminal coronary angioplasty.

aAdditive and logistic.

bAny one or more of the following: systolic pulmonary arterial pressure>50 mmHg; uncontrolled systemic arterial hypertension; renal insufficiency; chronic lung disease; poor hepatic function; cerebrovascular insufficiency; severe arrhythmias; active endocarditis; cachexia.

cAny one or more of the following: ventricular tachycardia or fibrillation or aborted sudden death; pre-operative cardiac massage; pre-operative ventilation before arrival in the anaesthetic room; pre-operative inotropic support; intra-aortic balloon counterpulsation; or pre-operative acute renal failure (anuria or oliguria <10 mL/h).

dAny one or more of the following: acute structural defect (acute ventricular septal defect or acute mitral valve regurgitation); cardiogenic shock; acute renal failure.

Table A4

Surgical information in 6222 open-heart operations

Operation n (%) Amphiascore Cabdeal Cleveland Clinic EuroSCOREa French score Magovern NYS Northern New England Ontario Parsonnet Parsonnet (modified) Pons Toronto Toronto (modified) Tremblay Tuman UK national score Veterans Affairs 
Venous graft alone 572 (9.2)     √              
Single valve surgery only 657 (10.6)         √          
Valve surgery only 721 (11.6)                √   
Aortic valve surgeryb 1106 (17.9)   √       √ √        
Mitral valve surgeryc 449 (7.3) √  √       √ √ √       
Tricuspid valve surgeryb 40 (0.6)     √      √ √       
Valve surgery and CABG 619 (9.9)     √    √ √ √ √    √   
Otherd than isolated CABG 1871 (30.1)    √               
Heart transplantation 78 (1.3)     √              
Post-infarction septal rupture 37 (0.6)    √ √      √        
Left ventricular aneurysm 16 (0.3)          √ √ √       
Surgery on thoracic aorta 209 (3.4)    √        √       
Aortic dissection (acute) 79 (1.3)     √      √        

aAdditive and logistic.

bWith or without CABG surgery.

cWith or without CABG surgery, except for Amphiascore where the definition is mitral valve surgery with CABG surgery.

dTotal number of valve or miscellaneous procedures with or without CABG surgery.

Figure 1 The ROC area (diamonds) with 95% confidence intervals (horizontal bars) for 30-day mortality and 1-year mortality. (A) 30-day mortality and (B) 1-year mortality. Open heart surgery (n=6222). See Table 1 for abbreviations.

Figure 2 The ROC curves. The sensitivity of prediction of 30-day mortality vs. 1-specificity for the 19 risk algorithms is plotted. The solid line represents the absence of discrimination. Open-heart surgery (n=6222).

Figure 3 Comparison of the ROC area for different risk algorithms. For each risk scoring system (left y-axis), the number of risk algorithms with a significantly (P<0.05) larger (black bar) or smaller (grey bar) ROC area are shown. (A) 30-day mortality and (B) 1-year mortality. Open-heart surgery (n=6222). See Table 1 for abbreviations.

Figure 4 Observed 30-day mortality with 95% confidence intervals (vertical lines) in comparison to score-predicted 30-day mortality (diamonds) with 95% confidence intervals (horizontal bars). (A) All open-heart surgery and (B) CABG-only surgery. Asterisk denotes the predicted mortality calculated from ACC/AHA score mortality table11 specified for CABG-only surgery. See Table 1 for abbreviations.

Table 1

Synopsis of original data of 19 risk score algorithms

 Region Year of data collection Year of publication Number of patients (centers) Risk variables ROC area 
Amphiascore23 Netherlands 1997–2001 2003 7282 (1) 0.84 
Cabdeala,24 Finland 1990–1991 1996 386 (1) 0.71 
Cleveland clinic25 USA 1986–1988 1992 5051 (1) 13 N/A 
EuroSCORE (add.)26 Europe 1995 1999 13 302 (128) 17 0.79 
EuroSCORE (log.)27 Europe 1995 2003 13 302 (128) 17 0.79 
French score28 France 1993 1995 7181 (42) 13 0.75 
Magoverna,29 USA 1991–1992 1996 1567 (1) 18 0.86 
NYSa,3,30 USA 1998 2001 18 814 (33) 14 0.79 
NNEa,11 USA 1996–1998 1999 7290 (N/A) N/A 
Ontario31 Canada 1991–1993 1995 6213 (9) 0.75 
Parsonnet32 USA 1982–1987 1989 3500 (1) 16 N/A 
Parsonnet (mod.)33 France 1992–1993 1997 6649 (42) 41 0.70 
Pons34 Spain 1994 1997 1309 (7) 11 N/A 
Torontoa,35 Canada 1993–1996 1999 7491 (2) 0.78 
Toronto (mod.)a,36 Canada 1996–1997 2000 1904 (1) N/A 
Tremblay37 Canada 1989–1990 1993 2029 (1) N/A 
Tuman38 USA N/A 1992 3156 (1) 10 N/A 
UK national scorea,5 UK 1995–1996 1998 1774 (2) 19 0.75 
Veterans Affairsa,39 USA 1987–1990 1993 12 712 (43) 10 N/A 

Add, additive; log, logistic; mod, modified; NNE, Northern New England; N/A, not available. The Cleveland Clinic risk score algorithm is also known as the Higgins score, NNE as the American College of Cardiology/American Heart Association (ACC/AHA) score, and Ontario as the Provincial Adult Cardiac Care Network (PACCN) score.

^a Algorithms developed for CABG-only surgery.

Table 2

ROC area for the five risk algorithms with best performance and accuracy in CABG-only surgery (n=4351)

Risk algorithm | 30-day mortality ROC area (95% CI) | 1-year mortality ROC area (95% CI)
EuroSCORE (logistic) | 0.86 (0.82–0.90) | 0.75 (0.72–0.79)
EuroSCORE (additive) | 0.85 (0.81–0.89) | 0.75 (0.71–0.78)
NYS | 0.84 (0.80–0.88) | 0.75 (0.72–0.79)
Cleveland Clinic | 0.84 (0.80–0.88) | 0.75 (0.71–0.78)
Parsonnet (modified) | 0.84 (0.80–0.88) | 0.73 (0.69–0.77)

The Cleveland Clinic risk score algorithm is also known as the Higgins score.
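As an illustration of how a ROC area with a 95% confidence interval, such as those reported in Table 2, might be obtained from patient-level data, the following is a minimal sketch. It assumes the Mann-Whitney interpretation of the ROC area and the Hanley and McNeil standard error approximation (ref. 13). This part of the article does not state which interval method was used, so the choice is an assumption, and the function names and toy scores are hypothetical.

```python
import math


def roc_area(scores_dead, scores_alive):
    """ROC area as the probability that a patient who died has a higher
    risk score than a survivor; ties count as one half (Mann-Whitney)."""
    wins = 0.0
    for d in scores_dead:
        for a in scores_alive:
            if d > a:
                wins += 1.0
            elif d == a:
                wins += 0.5
    return wins / (len(scores_dead) * len(scores_alive))


def hanley_mcneil_ci(auc, n_dead, n_alive, z=1.96):
    """Approximate confidence interval for a ROC area using the
    Hanley-McNeil standard error (assumed method, see lead-in)."""
    q1 = auc / (2.0 - auc)
    q2 = 2.0 * auc ** 2 / (1.0 + auc)
    variance = (
        auc * (1.0 - auc)
        + (n_dead - 1) * (q1 - auc ** 2)
        + (n_alive - 1) * (q2 - auc ** 2)
    ) / (n_dead * n_alive)
    se = math.sqrt(variance)
    # Clip to the valid range of a ROC area.
    return max(0.0, auc - z * se), min(1.0, auc + z * se)


# Hypothetical toy risk scores, not data from the study.
dead = [8, 6, 5]
alive = [1, 2, 3, 5, 4]

auc = roc_area(dead, alive)
low, high = hanley_mcneil_ci(auc, len(dead), len(alive))
print(f"ROC area {auc:.2f} (95% CI {low:.2f}-{high:.2f})")
```

With cohorts of several thousand patients the pairwise loop becomes slow; a rank-based computation gives the same ROC area more efficiently.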

References

1. Dubois RW, Rogers WH, Moxley JH III, Draper D, Brook RH. Hospital inpatient mortality. Is it a predictor of quality? N Engl J Med 1987;317:1674–1680.
2. Dubois RW, Brook RH, Rogers WH. Adjusted hospital death rates: a potential screen for quality of medical care. Am J Public Health 1987;77:1162–1166.
3. Hannan EL, Kilburn H Jr, Racz M, Shields E, Chassin MR. Improving the outcomes of coronary artery bypass surgery in New York State. JAMA 1994;271:761–766.
4. Asimakopoulos G, Al-Ruzzeh S, Ambler G, Omar RZ, Punjabi P, Amrani M, Taylor KM. An evaluation of existing risk stratification models as a tool for comparison of surgical performances for coronary artery bypass grafting between institutions. Eur J Cardiothorac Surg 2003;23:935–942.
5. Bridgewater B, Neve H, Moat N, Hooper T, Jones M. Predicting operative risk for coronary artery surgery in the United Kingdom: a comparison of various risk prediction algorithms. Heart 1998;79:350–355.
6. Geissler HJ, Holzl P, Marohl S, Kuhn-Regnier F, Mehlhorn U, Sudkamp M, de Vivie ER. Risk stratification in heart surgery: comparison of six score systems. Eur J Cardiothorac Surg 2000;17:400–406.
7. Nilsson J, Algotsson L, Hoglund P, Luhrs C, Brandt J. Early mortality in coronary bypass surgery: the EuroSCORE versus The Society of Thoracic Surgeons risk algorithm. Ann Thorac Surg 2004;77:1235–1239; discussion 1239–1240.
8. Pinna-Pintor P, Bobbio M, Colangelo S, Veglia F, Giammaria M, Cuni D, Maisano F, Alfieri O. Inaccuracy of four coronary surgery risk-adjusted models to predict mortality in individual patients. Eur J Cardiothorac Surg 2002;21:199–204.
9. Edwards FH, Clark RE, Schwartz M. Coronary artery bypass grafting: the Society of Thoracic Surgeons National Database experience. Ann Thorac Surg 1994;57:12–19.
10. Schemper M, Smith TL. Efficient evaluation of treatment effects in the presence of missing covariate values. Stat Med 1990;9:777–784.
11. Eagle KA, Guyton RA, Davidoff R, Ewy GA, Fonger J, Gardner TJ, Gott JP, Herrmann HC, Marlow RA, Nugent W, O'Connor GT, Orszulak TA, Rieselbach RE, Winters WL, Yusuf S, Gibbons RJ, Alpert JS, Garson A Jr, Gregoratos G, Russell RO, Ryan TJ, Smith SC Jr. ACC/AHA guidelines for coronary artery bypass graft surgery: executive summary and recommendations. A report of the American College of Cardiology/American Heart Association Task Force on Practice Guidelines (Committee to revise the 1991 guidelines for coronary artery bypass graft surgery). Circulation 1999;100:1464–1480.
12. Pepe MS. The receiver operating characteristic curve. The Statistical Evaluation of Medical Tests for Classification and Prediction. New York: Oxford University Press; 2003. p66–94.
13. Hanley JA, McNeil BJ. The meaning and use of the area under a receiver operating characteristic (ROC) curve. Radiology 1982;143:29–36.
14. DeLong ER, DeLong DM, Clarke-Pearson DL. Comparing the areas under two or more correlated receiver operating characteristic curves: a nonparametric approach. Biometrics 1988;44:837–845.
15. Lusted LB. Logical analysis in roentgen diagnosis. Radiology 1960;74:178–193.
16. Nashef SA, Roques F, Michel P, Cortina J, Faichney A, Gams E, Harjula A, Jones MT. Coronary surgery in Europe: comparison of the national subsets of the European system for cardiac operative risk evaluation database. Eur J Cardiothorac Surg 2000;17:396–399.
17. Nashef SA, Roques F, Hammill BG, Peterson ED, Michel P, Grover FL, Wyse RK, Ferguson TB. Validation of European System for Cardiac Operative Risk Evaluation (EuroSCORE) in North American cardiac surgery. Eur J Cardiothorac Surg 2002;22:101–105.
18. Nilsson J, Algotsson L, Hoglund P, Luhrs C, Brandt J. EuroSCORE predicts intensive care unit stay and costs of open heart surgery. Ann Thorac Surg 2004;78:1528–1534; discussion 1534–1535.
19. Murphy-Filkins R, Teres D, Lemeshow S, Hosmer DW. Effect of changing patient mix on the performance of an intensive care unit severity-of-illness model: how to distinguish a general from a specialty intensive care unit. Crit Care Med 1996;24:1968–1973.
20. Schemper M, Heinze G. Probability imputation revisited for prognostic factor studies. Stat Med 1997;16:73–80.
21. Wyse RK, Taylor KM. Using the STS and multinational cardiac surgical databases to establish risk-adjusted benchmarks for clinical outcomes. Heart Surg Forum 2002;5:258–264.
22. Lippmann RP, Shahian DM. Coronary artery bypass risk prediction using neural networks. Ann Thorac Surg 1997;63:1635–1643.
23. Huijskes RVHP, Rosseel PMJ, Tijssen JGP. Outcome prediction in coronary artery bypass grafting and valve surgery in the Netherlands: development of the Amphiascore and its comparison with the Euroscore. Eur J Cardiothorac Surg 2003;24:741–749.
24. Kurki TS, Kataja M. Preoperative prediction of postoperative morbidity in coronary artery bypass grafting. Ann Thorac Surg 1996;61:1740–1745.
25. Higgins TL, Estafanous FG, Loop FD, Beck GJ, Blum JM, Paranandi L. Stratification of morbidity and mortality outcome by preoperative risk factors in coronary artery bypass patients. A clinical severity score. JAMA 1992;267:2344–2348.
26. Nashef SA, Roques F, Michel P, Gauducheau E, Lemeshow S, Salamon R. European system for cardiac operative risk evaluation (EuroSCORE). Eur J Cardiothorac Surg 1999;16:9–13.
27. Roques F, Michel P, Goldstone AR, Nashef SA. The logistic EuroSCORE. Eur Heart J 2003;24:882.
28. Roques F, Gabrielle F, Michel P, De Vincentiis C, David M, Baudet E. Quality of care in adult heart surgery: proposal for a self-assessment approach based on a French multicenter study. Eur J Cardiothorac Surg 1995;9:433–439; discussion 439–440.
29. Magovern JA, Sakert T, Magovern GJ, Benckart DH, Burkholder JA, Liebler GA, Magovern GJ Sr. A model that predicts morbidity and mortality after coronary artery bypass graft surgery. J Am Coll Cardiol 1996;28:1147–1153.
30. Coronary Artery Bypass Surgery in New York State 1996–1998. [pdf] 2001 [cited 8 April 2005]. www.health.state.ny.us/nysdoh/consumer/heart/1996-98cabg.pdf
31. Tu JV, Jaglal SB, Naylor CD. Multicenter validation of a risk index for mortality, intensive care unit stay, and overall hospital length of stay after cardiac surgery. Steering Committee of the Provincial Adult Cardiac Care Network of Ontario. Circulation 1995;91:677–684.
32. Parsonnet V, Dean D, Bernstein AD. A method of uniform stratification of risk for evaluating the results of surgery in acquired adult heart disease. Circulation 1989;79:I3–I12.
33. Gabrielle F, Roques F, Michel P, Bernard A, de Vicentis C, Roques X, Brenot R, Baudet E, David M. Is the Parsonnet's score a good predictive score of mortality in adult cardiac surgery: assessment by a French multicentre study. Eur J Cardiothorac Surg 1997;11:406–414.
34. Pons JM, Granados A, Espinas JA, Borras JM, Martin I, Moreno V. Assessing open heart surgery mortality in Catalonia (Spain) through a predictive risk model. Eur J Cardiothorac Surg 1997;11:415–423.
35. Ivanov J, Tu JV, Naylor CD. Ready-made, recalibrated, or remodeled? Issues in the use of risk indexes for assessing mortality after coronary artery bypass graft surgery. Circulation 1999;99:2098–2104.
36. Ivanov J, Borger MA, David TE, Cohen G, Walton N, Naylor CD. Predictive accuracy study: comparing a statistical model to clinicians' estimates of outcomes after coronary bypass surgery. Ann Thorac Surg 2000;70:162–168.
37. Tremblay NA, Hardy JF, Perrault J, Carrier M. A simple classification of the risk in cardiac surgery: the first decade. Can J Anaesth 1993;40:103–111.
38. Tuman KJ, McCarthy RJ, March RJ, Najafi H, Ivankovich AD. Morbidity and duration of ICU stay after cardiac surgery. A model for preoperative risk assessment. Chest 1992;102:36–44.
39. Grover FL, Johnson RR, Marshall G, Hammermeister KE. Factors predictive of operative mortality among coronary artery bypass subsets. Ann Thorac Surg 1993;56:1296–1306; discussion 1306–1307.
