F. Lacour-Gayet, D. Clarke, J. Jacobs, J. Comas, S. Daebritz, W. Daenen, W. Gaynor, L. Hamilton, M. Jacobs, B. Maruszsewski, M. Pozzi, T. Spray, G. Stellin, C. Tchervenkov, C. Mavroudis, The Aristotle score: a complexity-adjusted method to evaluate surgical results, European Journal of Cardio-Thoracic Surgery, Volume 25, Issue 6, June 2004, Pages 911–924, https://doi.org/10.1016/j.ejcts.2004.03.027
Abstract
Objectives: Quality control is difficult to achieve in Congenital Heart Surgery (CHS) because of the diversity of the procedures. It is particularly needed, considering the potential adverse outcomes associated with complex cases. The aim of this project was to develop a new method based on the complexity of the procedures.
Methods: The Aristotle project, involving a panel of expert surgeons, started in 1999 and included 50 pediatric surgeons from 23 countries, representing the EACTS, STS, ECHSA and CHSS. The complexity was based on the procedures as defined by the STS/EACTS International Nomenclature and was evaluated in two steps. The first step was establishing the Basic Score, which adjusts only for the complexity of the procedures. It is based on three factors: the potential for mortality, the potential for morbidity and the anticipated technical difficulty. A questionnaire was completed by the 50 centers. The second step was the development of the Comprehensive Aristotle Score, which further adjusts the complexity according to the specific patient characteristics. It includes two categories of complexity factors: the procedure-dependent and the procedure-independent factors. After considering the relationship between complexity and performance, the Aristotle Committee is proposing that Performance=Complexity×Outcome.
Results: The Aristotle score allows precise scoring of the complexity for 145 CHS procedures. One interesting notion coming out of this study is that complexity is a constant value for a given patient, regardless of the center where the operation is performed. The Aristotle complexity score was further applied to 26 centers reporting to the EACTS congenital database. A new display of centers is presented, based on the comparison of hospital survival to complexity and to our proposed definition of performance.
Conclusion: A complexity-adjusted method named the Aristotle Score, based on the complexity of the surgical procedures, has been developed by an international group of experts. The Aristotle score, electronically available, was introduced in the EACTS and STS databases. A validation process evaluating its predictive value is being developed.
1 Introduction
In 1999, the International Nomenclature of Congenital Heart Surgery [1,2] was initiated by the Society of Thoracic Surgeons (STS) and the European Association of Cardio-Thoracic Surgery (EACTS). Following this important creation, it was possible to establish consistent databases in Congenital Heart Surgery (CHS), worldwide. Simultaneously, the STS and the EACTS began using the STS-EACTS Nomenclature including its minimal dataset. The progression of these two international databases has been rapid. Today, in 2003, there are 13,000 and 16,000 cases recorded in the databases of the STS and of the EACTS, respectively.
The motivation behind the Complexity Score Project was a growing frustration of pediatric cardiac surgeons over the fact that their surgical performance was being evaluated based on hospital mortality without regard for the complexity of the operations performed.
A working group from the European Congenital Heart Surgery Association (ECHSA) and the International Nomenclature Committee of the STS decided to develop a risk-stratification method that could be adapted to our specialty.
The project was only possible because of the excellent relationships, based upon expertise, loyalty, energy, and friendship, created inside this international group of pediatric cardiac surgeons. The group included representatives from the four major international societies of pediatric cardiac surgery: the STS, the EACTS, the Congenital Heart Surgeons Society (CHSS) and the ECHSA.
2 Status of quality evaluation in congenital heart surgery
The continuous evaluation of quality of care is becoming a duty of surgical practice. This is particularly true in pediatric cardiac surgery, where adverse outcomes can be frequent due to the severity of the pathology [3,4]. Initially considered a research issue, this responsibility is rapidly increasing, driven by demand from hospital managers, referring physicians, families, insurance companies, government agencies, courts and the media.
Evaluation of quality of care is a new chapter of modern medicine which follows a different rhetoric and the need to compare and measure. Many instruments used in the past to describe results are inadequate and obsolete. New methods, parameters and vocabulary are needed. The comparison and measurement of quality of care depends on four tools:
A common language used in the population studied: nomenclature [1].
A database with a simplified data set: registry [5].
A parameter to allow comparison: complexity [6].
A data verification process: validation.
Evaluation of quality in CHS is particularly complex, as our specialty deals with approximately 150 surgical procedures and 200 diagnoses. Combined, the outcome analyses require comparing several hundred different factors. In addition, CHS involves performing challenging procedures, which require optimal control of advanced surgical technique. When starting this project in 1999, we faced two difficulties: (1) the multi-institutional databases were just starting [7,8] and no reliable data were yet available; (2) due to the absence of risk stratification, the more prominent centers dealing with the sickest patients, and potentially having a significant mortality, were very reluctant to send their data.
It was necessary to base this risk-adjustment on an evaluation that was partially subjective. Following many discussions, it was felt that an approach based on the consensus of a panel of experts was valid, provided that the risk-adjustment score is subsequently validated prospectively on collected outcome data. Because our scoring system was derived from opinions, we gave the name of Aristotle to this project. According to Aristotle's philosophy (Rhetoric, Book I, 350 BC): “When there is no scientific answer available, the opinion (Doxa) perceived and admitted by the majority has value of truth.”
3 Methods
3.1 Principles
Although it was not perceived at the beginning of the study, the concept of complexity arose naturally within the Aristotle Committee. It is based on a different appreciation of the so-called ‘Risk Factors’. The incremental risk factors for mortality and morbidity are extremely variable. They are currently defined by publications that are narrowly focused on diagnostic groups and come either from prominent centers presenting their best work or from multi-institutional studies (CHSS).
For the task of constructing a quality-of-care evaluation applicable to all centers, and not only to the best ones, the risk factors currently recognized are insufficient. In addition, risk factors are labile. A good example of this is the arterial switch operation. Complex coronary anatomy, which was a risk factor in the eighties, is no longer a risk factor in 2003 in experienced centers [9]. In fact, a given center may well control an adverse anatomy or an associated procedure when another center does not. How do we deal with this evidence, which is the basis of surgical performance?
We decided to introduce a first concept: complexity, based on complexity factors. The original and specific characteristic of complexity is that it is a constant calculated by the scoring system we have developed.
After considering the relationship between Complexity and Performance, we decided to introduce a second concept and propose that complexity is a component of a new equation of quality of care:
Performance = Complexity × Outcome
An analogy can be drawn with the sport of alpine skiing; the difficulty of a ski slope is a constant that is stratified in Europe by the ski resort managers, and labeled green, blue, red, or black, according to the difficulty. Modifiers such as weather conditions or quality of snow can influence the complexity and can be evaluated. The outcome is variable depending on the expertise of the skier…
The same is true for complexity in CHD surgical procedures. It is a constant, at a given time for a given procedure in a given patient, whatever the center and its global location.
The principle may be generalized to other disciplines, including non-surgical specialties, even though the definition of complexity might differ. By looking at a variety of outcomes, different categories of performance can be defined. Although there is no consensus today on the definition of medical or surgical performance, we envision that overall performance is an aggregate of multiple areas of performance, depending on the various outcomes considered (Table 1), the term quality being reserved for long-term results.

4 Methodology details
The objectives of the project were to:
Precisely score the complexity of each procedure.
Produce a comprehensive scoring applied to all procedures.
Develop a system that is applicable worldwide.
The scoring of complexity is based on primary procedures and not on diagnoses as there may be several procedures that can apply to the same diagnosis.
The complexity score is the sum of three factors:
The potential for hospital mortality
The potential for post-operative morbidity, defined as the length of ICU stay
The technical difficulty, defined as the anticipated level of surgical expertise required to perform a given procedure.
The scoring was based on a grade from 1 to 5 in each category (Table 2) .

The complexity is the sum of the potential for mortality (discharge or 30-day mortality), the potential for morbidity (ICU length of stay) and the anticipated technical difficulty.
The evaluation of complexity was carried out in two steps:
The Basic Score is a procedure-adjusted complexity and applies only to procedures. An international group including more than 50 prominent centers from 23 nations was asked, through a questionnaire, to score the 145 procedures of the short list of the International Nomenclature according to potential for mortality, potential for morbidity and estimated technical difficulty. Only the simple form of the pathology indicating the procedure was considered. For each procedure, the median values of mortality, morbidity and technical difficulty obtained from the 50 centers were calculated. The sum of these three median values gives the final basic score for each procedure (Appendix B2). The distribution of the scoring among the centers was, in general, quite uniform, although some rare or new procedures had a large dispersion. The scale ranges from 1.5 to 15 (Fig. 1), and four levels of difficulty were defined (Appendix B2).
Fig. 1. The Aristotle scale ranges from 1.5 to 25. The basic score (1.5–15) reflects only procedure complexity. The comprehensive score (1.5–25) includes complexity factors related to the specific patient.
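The construction of the basic score described above (median of each of the three components across centers, then the sum of the three medians) can be sketched as follows. This is a minimal illustration only: the function name and the five sets of grades are hypothetical and are not data from the questionnaire.

```python
from statistics import median


def basic_score(responses):
    """Sum of the per-component medians for one procedure.

    `responses` holds one (mortality, morbidity, difficulty) tuple per
    center. The tuple layout is an assumption made for this sketch.
    """
    mortality = median(r[0] for r in responses)
    morbidity = median(r[1] for r in responses)
    difficulty = median(r[2] for r in responses)
    return mortality + morbidity + difficulty


# Hypothetical grades from five centers for a single procedure.
example_responses = [(3, 3, 4), (3, 4, 4), (4, 3, 3), (3, 3, 4), (3, 3, 3)]
example_score = basic_score(example_responses)
print(example_score)
```

Using the median rather than the mean keeps a single outlying center from shifting the score, which matters for the rare or new procedures where the responses were widely dispersed.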
The Comprehensive Aristotle Score introduces patient-adjusted complexity. It includes two categories of complexity factors:
Procedure-dependent factors (Appendix B3) adjust the complexity to each patient's specific procedure:
Anatomical factors (n=76).
Associated procedures (n=85).
Age (n=6 age groups). The impact of age varies in either direction depending on the procedure.
Each factor is scored for contribution to mortality, morbidity, and difficulty.
Procedure-independent factors (Appendix B4; 81 factors) adjust the complexity to each patient's clinical status:
General factors (n=3)
Clinical factors (n=31)
Extra-cardiac factors (n=39)
Surgical factors (n=8)
Each factor is scored for contribution to mortality, morbidity, and difficulty.
All complexity factors meet the following requirements: precisely quantifiable, easily available, admitted by a majority, and controllable.
The comprehensive score adds 10 points and two levels of complexity to the basic score scale (15.1–20=level 5 and 20.1–25=level 6; Fig. 1). In the case of associated procedures, the system defines the primary procedure as the one with the highest complexity according to the basic score.
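The two rules in the preceding paragraph (the primary procedure is the one with the highest basic score, and the comprehensive score extends the scale by up to 10 points, with levels 5 and 6 above 15) can be illustrated with a short sketch. The function names, the associated procedure and the factor points are hypothetical. Treating the patient-specific factors as a simple sum capped at 10 added points is an assumption of this sketch, since the text does not spell out the arithmetic. The four basic levels are defined in Appendix B2 and are not reproduced here.

```python
def primary_procedure(basic_scores):
    """Return the name of the procedure with the highest basic score."""
    return max(basic_scores, key=basic_scores.get)


def comprehensive_score(basic, factor_points, max_added=10.0):
    """Basic score plus patient-specific points.

    Assumption: factor points are summed and capped at `max_added`.
    """
    return basic + min(sum(factor_points), max_added)


def added_level(score):
    """Return 5 or 6 for the two levels added by the comprehensive score.

    Returns None at or below 15, where the basic levels of Appendix B2 apply.
    """
    if 15 < score <= 20:
        return 5
    if 20 < score <= 25:
        return 6
    return None


# Basic score 10 for an arterial switch in TGA-IVS is taken from the text;
# the associated procedure and the factor points are hypothetical.
procedures = {"arterial switch (TGA-IVS)": 10.0, "hypothetical associated procedure": 6.0}
primary = primary_procedure(procedures)
total = comprehensive_score(procedures[primary], [3.0, 2.5])
print(primary, total, added_level(total))
```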
The challenging task of developing the scores for 145 procedures took four years to complete, from 2000 through September 2003. It was finally achieved by the Aristotle Executive Committee who met more than 20 times in conjunction with various international meetings.
The score values were exchanged several times among the committee members until a true consensus was achieved. Further, all the scores were finally reviewed in Denver, at The Children's Hospital Heart Institute.
5 Preliminary results
5.1 Basic score
The basic score is very simple to apply and can be used retrospectively to enter complexity into almost any database software. It is important to emphasize that it is only a procedure-adjusted score. Since August 2003, the basic score has been included in the STS and EACTS databases at their respective data collection sites: the Duke Clinical Research Institute in Durham, NC, and the Memorial Hospital in Warsaw. Some initial results comparing centers using the basic score are already available. The STS data analysis will be presented at the STS 2004 meeting in San Antonio.
The data analysis using the basic score has started at the EACTS. The first evaluation deals with a preliminary study of the variation in performance of European centers.
Twenty-six EACTS centers were studied (Table 3) over the period 1999–2003, involving a total of 13,508 patients and 14,493 procedures. Centers with fewer than 200 procedures performed during the time period were excluded. The average number of procedures per center was 519 (206–2457). According to the volume harvested, there were 2 large centers (>1000 procedures), 10 medium centers (500–1000 procedures) and 14 smaller centers (<500 procedures). The average hospital mortality within 30 days was 4.8% (1.9–9.6%), corresponding to a hospital survival of 95.2±2.02% (90.4–98.1%). The average complexity according to the Basic Score was 6.7±0.4 (5.7–7.2). The average Performance=(Complexity×Survival)/100 was 6.3±0.4 (5.5–6.9).
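As a worked illustration of the formula Performance=(Complexity×Survival)/100, the sketch below applies it to the two averages reported above. The function name is hypothetical. The product of the two averages comes out slightly above the reported mean performance of 6.3; this would be expected if the reported figure is the mean of the per-center values rather than the product of the means, but that reading is an assumption.

```python
def performance(complexity, survival_percent):
    """Operative performance: complexity times hospital survival (in %), divided by 100."""
    return complexity * survival_percent / 100


# Averages reported for the 26 EACTS centers: basic score 6.7, survival 95.2%.
average_performance = performance(6.7, 95.2)
print(round(average_performance, 2))
```

Because survival is bounded at 100%, a center's performance can never exceed its average complexity. This is why only centers at the same complexity level can be compared directly.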

We compared centers in two different ways.
First, the centers were compared by plotting Complexity versus Hospital Survival, as shown in Fig. 2A.
Second, the centers were compared using the equation Operative Performance=(Complexity×Hospital Survival)/100, with Performance plotted against Survival, as shown in Fig. 2B. In addition, three different bubble sizes indicate the volume of procedures harvested by centers. The complexity levels are represented by sloping lines on the graph.
Fig. 2. (A) Data from 26 centers reporting to the EACTS Congenital database. Survival is plotted against complexity (basic score). The average survival and complexity of the centers are indicated. (B) Data from the same 26 centers. The graph values follow the proposed equation: Performance=Complexity×Survival. Three bubble sizes indicate the volume of procedures reported by centers (large, medium, smaller). Sloping lines indicate the levels of complexity (basic score). Only centers having the same complexity level (on the same sloping line) can be compared with each other.
The average values of the vertical and horizontal axes define four quadrants:
In the upper right quadrant are the best centers, with three leading centers very close together.
In the upper left quadrant are centers with low mortality but with less complex procedures; these centers select their patients and might send away the more complex cases.
In the right lower quadrant are centers with high complexity but a higher mortality. These centers should be carefully evaluated; they can only be compared to centers of the same level of complexity. If isolated in a large geographic area or being the only national centers, they are performing satisfactorily. If located near a leading center, they may consider sending away their most complex patients. They will then move to the left and toward the top.
The left lower quadrant contains centers with lower complexity and higher mortality. These centers should be informed by the scientific societies and those having survival more than two standard deviations below the mean may be encouraged and supported to organize a retraining of their program.
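The four quadrants described above can be expressed as a small classification rule. The sketch assumes that the quadrants are read with complexity on the horizontal axis and survival on the vertical axis, as in Fig. 2A, and uses the reported averages (complexity 6.7, survival 95.2%, standard deviation of survival 2.02) as defaults. The function names and the example centers are hypothetical.

```python
def quadrant(complexity, survival, mean_complexity=6.7, mean_survival=95.2):
    """Place a center in one of the four quadrants defined by the two averages."""
    vertical = "upper" if survival >= mean_survival else "lower"
    horizontal = "right" if complexity >= mean_complexity else "left"
    return f"{vertical} {horizontal}"


def retraining_candidate(complexity, survival, mean_complexity=6.7,
                         mean_survival=95.2, sd_survival=2.02):
    """Lower-left center whose survival is more than two SD below the mean."""
    in_lower_left = quadrant(complexity, survival,
                             mean_complexity, mean_survival) == "lower left"
    return in_lower_left and survival < mean_survival - 2 * sd_survival


# Hypothetical centers given as (average basic score, hospital survival in %).
centers = {"A": (7.0, 97.0), "B": (6.0, 97.0), "C": (7.1, 93.0), "D": (6.0, 91.0)}
for name, (c, s) in centers.items():
    print(name, quadrant(c, s), retraining_candidate(c, s))
```

With the reported figures, the two-standard-deviation threshold falls at a hospital survival of about 91.2%.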
5.2 Comprehensive Aristotle score
The comprehensive score is much more precise. The complexity can vary enormously within the same basic procedure category.
Using again the example of the arterial switch, it is not difficult to show this variation for the 10 switch operations performed in Denver at The Children's Hospital Heart Institute from January to September 2003. The basic score, adjusting only for the procedures, places the patients on one of two complexity score levels: either line 10 or line 11, respectively for TGA-IVS or for TGA-VSD and DORV nc VSD (Fig. 3A). The comprehensive score, adjusting for the specific patient characteristics, shows a large dispersion of the complexity, jumping to over 15 for 3 patients (one with TGA-IVS and intramural coronaries, one with TGA-VSD and single ostium with double coronary loop, and one with DORV nc VSD and arch obstruction) (Fig. 3B). There was no mortality in this group of patients. This example illustrates well the notion of the complex switch and the need for a second learning curve. The same concept holds true for Norwood, Truncus Arteriosus, TAPVR, CAVSD, etc.

Fig. 3. (A) Ten arterial switch patients plotted using the basic score (procedure only) complexity rating show little variation. (B) The same ten patients plotted using the Aristotle comprehensive score illustrate considerably more dispersion of complexity.
The evaluation of the complexity is limited in this preliminary report by the exclusive use of the basic score (Appendix B2), which deals only with a procedure-related complexity. We expect that the introduction of the comprehensive score (Appendices B3 and B4), which took four years of development, will provide much better accuracy when dealing with combined procedures, with the large variation of the anatomy and with the clinical status of each particular patient. This requires a prospective study that is underway.
6 Development
6.1 Software development
The basic score is available in Appendix B2. The comprehensive score (Appendix B3) is completed and ready to be included in the multi-center databases of the STS and the EACTS.
A prototype of the Aristotle™ software, developed in Excel, was shown at the EACTS meeting in Vienna; it allows very fast and simple navigation to score the patient complexity factors. Several prototype patient databases containing Aristotle™ are under development. The complete Aristotle™ system will be provided free of charge to the database participants of the STS and EACTS, who will be free to use either the Basic Score or the Comprehensive Score. For other parties, the Aristotle™ software will be made available on the Internet or on compact disc.
6.2 Validation of the Aristotle Scores
The Aristotle project is still a work in progress. This new method of evaluation of quality of care needs to be further validated. This is the next task of the Aristotle Committee, which will evaluate the predictive value of the Aristotle Score for mortality and morbidity and compare the respective values of the basic score and the comprehensive score. The validation process is underway and should be completed in 2004 for the Basic Score and in 2005 for the Comprehensive Score. So far, only very few databases are validated, and they do not use the International Nomenclature. Several centers that volunteered to have their databases reviewed by the Aristotle Committee will provide the material for validation.
The Aristotle Scores will evolve over time. As soon as the multi-institutional databases have collected a large amount of validated data, the mortality and morbidity observed in these databases will replace our potentials of mortality and morbidity. The technical difficulty may remain a factor controlled by a panel of experts. Nevertheless, the Aristotle score will remain constant for periods of four years and be updated only at the World Congress of Pediatric Cardiology and Cardiac Surgery.
7 Discussion
7.1 Functions of the Aristotle system
Knowing precisely the complexity of a given patient undergoing CHS is crucial information requested by many parties: the patient and his family, the surgical team, the cardiologists, and the health care payers. This is important information that was not available until now.
The stratification of each patient's complexity allows selective referral of the complex patient to the appropriate center.
Residents and Fellows in charge of pre-operative evaluation can find for each procedure a list of anatomic factors that the surgeon needs to know.
The Aristotle system allows a simple electronic collection of all complexity factors. This should offer to outcome research a rich and organized collection of data allowing clear definition of risk factors.
The calculation of surgeons' remuneration remains imprecise in many places. Further evaluation of cost according to complexity could provide health insurance companies with precise benchmarks to use for their financial management.
7.2 Quality control organized by the scientific societies
Evaluation of quality of care has become a duty of modern medical practice. It is requested first and foremost by the patients and the health insurance companies. It seems important that the evaluation of quality be organized by the physicians. It might be a new role of the scientific societies to organize, implement and finally control this process, which should ultimately become a self-evaluation process.
In their possible role as quality of care supervisor, the scientific societies have several responsibilities: confidentiality, support, and promotion.
The confidentiality of the data collected is crucial in regard to both the patients and the centers. Some kind of protection and support is needed for centers showing suboptimal results at a given time. It is expected that these centers would be informed confidentially and that the societies would support and help organize a retraining of their team.
Centers that send complete and authenticated data should be promoted, provided that their results remain above or close to the average values. This is the system now implemented at the EACTS through the accreditation process of the European Cardiothoracic Surgery Institute of Accreditation [10].
7.3 Databases volume, risk-stratification and validation
To be efficient, a multi-institutional database needs to collect a critical amount of data. The very large size of the STS adult database is based on the participation of nearly 500 centers. One reason for its considerable success seems to be the introduction of a risk-adjustment process. The risk-adjustment method initially developed in the Veterans Hospitals system in 1987 by F. Grover and K. Hammermeister [11] was further applied to the STS database in 1990 [12]. This method was further generalized to all specialties by S. Khuri [13]. Once a fair risk-adjustment for mortality and morbidity was introduced, the STS database markedly increased its volume. We expect that the complexity stratification described here will similarly stimulate the growth of participation in our CHS databases.
The preliminary graphs (Fig. 2A and B) shown in this study are based on data from the EACTS congenital database. These data are not authenticated and do not represent the same time period of harvesting at each center. These graphs, therefore, show only preliminary results; final conclusions on the effect of center size on outcomes will be drawn later from verified and validated data. The first impression, nevertheless, is that centers reporting a smaller volume of procedures can have good results.
The validation process of the Scientific Societies databases remains a controversial issue. It, nevertheless, is needed and is anticipated by the health care payers. The mechanism of such a process is not established yet and is still under investigation at the STS and EACTS.
7.4 Controversies around the proposed definition of performance
Several remarks arose at the EACTS Vienna Annual Meeting and inside the Aristotle Committee regarding the proposed equation of performance. Performance is a concept that is per se subjective. Performance, either medical or surgical, is not precisely defined and has different aspects as shown in Table 1.
The original contribution of the Aristotle project is to define Complexity as a constant and global value for a given patient, and to define Performance as a combination of Complexity and Outcome. Based on this concept, we have proposed that Performance equals Complexity times Outcome. Although it is well accepted that complexity has an impact on survival and that their combination is the performance, there is no consensus on the optimal function that applies between complexity and survival. Multiplication was chosen after trying different mathematical functions to relate complexity and outcome; it is the simplest function.

An analogy can be drawn with Ohm's law (Potential Difference=Resistance×Current): Outcome is Current Flow or Intensity, Complexity is Resistance and Performance is Potential Difference. Indeed, a team having excellent results when dealing with complex patients is probably performing under a ‘high voltage’.
Another more suitable function may arise from the ongoing validation. The absence of definition, as shown on Fig. 2A, raises more questions than answers. At this stage of the development, we consider that the hypothesized equation provides a fair definition of performance but other better solutions may come forth.
7.5 Further developments
A study of the morbidity of CHS will be conducted and may produce a classification of centers, based on peri-operative performance, which should be different from that produced by the study of the operative performance. The precise definition of patients with high risk of morbidity is expected. A long term outcome evaluation is needed in the future and will involve the active participation of the pediatric cardiologists.
A study evaluating cost according to complexity will orient more precisely the financial management of our expensive specialty.
Scoring of interventional cardiology procedures is easily achievable and will complete the listing of procedures possibly performed during surgical hospitalization.
Combined evaluation including centers from the STS and EACTS might be organized in the future.
Finally, the Aristotle method, based on a precise evaluation of complexity that is a constant and global value, can be applied to other disciplines, including non-surgical specialties.
8 Conclusion
The complexity-adjustment project, carried out over a four-year period and involving a large international work group, is now completed. It is based on a precise measurement of the new concept of complexity, which is a constant and global value.
Two Aristotle scores are available: the basic score, a procedure-adjusted complexity score, and the comprehensive score, a patient-adjusted complexity score.
Preliminary results using the equation complexity×survival=performance allow a new mode of classification of the CHS centers to be established, one that we believe is more precise and fair.
The comprehensive Aristotle score allows much more precise complexity stratification, including all characteristics of the patient. Allowing accurate evaluation of surgical performance in CHS, the Aristotle score is also a powerful vector of communication with patients, surgeons, cardiologists and health care payers. Evaluating the predictive values of the Aristotle method is in progress to confirm the validity of the method.
We would like to acknowledge the members of the Aristotle Committee for their contributions to this work and their continuing support of this project. We would also like to acknowledge Mr Josh McKennett from The Children's Hospital in Denver for his assistance with software development. Finally, we acknowledge Dr Zdislaw Tobota, Memorial Hospital, Warsaw for showing us the ‘bubbles’.
Dr W. Klepetko (Vienna, Austria): I think this is really an important tool in our need to validate the outcome of our surgical performance. I would like to start the discussion with two questions.
Is this score already available to every center right now or will there be some more time that they need to introduce it and to open it to a more wider use to all the centers?
Dr Lacour-Gayet: Well, the basic score is running into the STS and the EACTS databases, and as you may see this afternoon at the Business Meeting, it is providing very interesting information. We believe that it is not enough and that we should use the comprehensive score; because under a single procedure name there is a very important spectrum of differences. Everyone knows that a switch with intramural coronary arteries is more complex than a switch with straightforward coronaries. Through a validation process, we would like to make precise calculation and ask the statisticians to calculate the predictability of the system. This is going on and should be available in the next months.
Dr B. Maruszewski (Warsaw, Poland): I would like to add one thing.
I believe for this project an evaluation of the quality of care is very important. This thing needs to be validated, and this is a very powerful and very important tool. This tool doesn't mean that we want to punish anyone or we want to evaluate anyone by name, but this is the information that everyone who sends the data to the database will be able to receive, and this will just help him to know what his performance is.
And just to announce it again, because there are sort of questions and doubts, this is all anonymous. I as the manager of the database don't know which center is which, but every one of you who will join the database will know where actually he is on the scale of performance.
Appendix B1 Aristotle Work Group. 50 Centers—23 Nations

Appendix B2 Basic complexity score: the complexity is calculated on a simple anatomic form of procedures

Appendix B3 Comprehensive complexity score: procedure dependent factors

Appendix B4 Comprehensive complexity score: procedure independent factors
