The development of concise grading schemes for diffuse gliomas with proven relevance to tumor behavior and susceptibility to therapy is important for clinical decision making. At present, there is unacceptably large interobserver discrepancy in the application of the current World Health Organization (WHO) criteria for accrual of patients in trials for patients with gliomas. Because of a lack of relevant studies, the WHO guidelines for grading are not yet as clear as would be desirable. The development of well-defined grading schemes consisting of features with low interobserver scoring variability and prognostic or predictive relevance is needed. Although interobserver concordance can be tested in retrospective studies, the prognostic or predictive qualities of histological parameters can only be tested in prospective studies. Only evidence-based histopathology will retain its critical role in the diagnosis and treatment of diffuse gliomas.
The clinical outcome of patients with glial tumors is determined by many factors. In addition to age, clinical condition, and tumor location, the tissue diagnosis is a major prognostic factor. By histological examination of glioma tissue, one tries to type and grade the tumor. The morphological resemblance of the tumor cells and their architecture to differentiated astrocytes, oligodendrocytes, or their progenitor cells suggests tumor cell lineage. Tumor typing is undisputed in cases with classic histology of morphologically pure cell proliferations, but it is far more subjective when mixed cell populations are encountered or when the tumor cells do not exhibit clear lineage characteristics. Some molecular aberrations that were specifically encountered in either astrocytomas or oligodendrogliomas assist in tumor typing. Classifying aberrations are particularly useful as genetic signatures in gliomas with undetermined (i.e. mixed) histological features. The combined loss of 1p/19q and mutation of the TP53 gene are almost mutually exclusive and are characteristic of gliomas with the histological features of typical oligodendroglioma and astrocytoma, respectively (1-5). Moreover, loss of 1p/19q not only correlates with classic honeycomb histology, but was also identified as an independent prognosticator (6). Furthermore, 1p/19q loss is predictive of the sensitivity of oligodendrogliomas to alkylating chemotherapy (7-11). Hypermethylation of the promoter of the MGMT gene would be directly predictive of success of this treatment, but also seems to some extent to be a prognostic factor (12, 13). Other molecular parameters with prognostic impact are the amplification of the EGFR gene and the recently identified mutation of the IDH1 gene (14, 15).
Recent investigations in the expression of the entire genome have yielded either upregulated or downregulated clusters of genes, and these profiles have been matched with the histological classification and clinical outcome of gliomas (16, 17). Direct comparisons with survival data were also made (18). Only a few studies have focused on expression differences between low-grade and anaplastic gliomas, and various genes reflecting tumor grade were identified (19). Some authors have claimed superiority of expression data above conventional histological parameters for prognostication (17, 20-22). These data have direct relevance to grading, and the findings need to be extended and correlated with the historical parameters of glioma grading. The grade of a glioma relates to the stage of tumor progression and correlates with prognosis. Therefore, not only the type, but also the grade of the tumor is important for therapeutic decisions. In this review, the history of the practice of grading of gliomas is summarized, and current practice and ongoing difficulties in grading are discussed.
Matching the tumor histology with its radiological appearance is essential for clinical practice. First, in most cases, the radiological presentation of the tumor will suggest its histological subtype. Furthermore, the radiological presentation of diffuse infiltrating gliomas will often correlate with malignancy grade and may point to potential sampling errors. Generally, the radiological presentation of diffuse gliomas is more informative of the grade than of the type of the tumor. Tumor enhancement is indicative of a disrupted blood-brain barrier caused by vessels with inferior barrier functions and corresponds to advanced stages of tumorigenesis. During the last decade, new tools of the radiological arsenal have been tested for their contribution to estimates of glioma malignancy. The combination of conventional magnetic resonance (MR) imaging, perfusion MR imaging, and proton MR spectroscopic imaging correlated well with the tumor histology in 160 glioma patients (23). Systematic correlations between histopathology and scan appearance are important to discover new radiological clues for the characterization of the tumors. Although the specificity of different scan modalities, proton MR spectroscopy with perfusion or diffusion applications for histological malignancy grades, is progressing (23-26), histopathologic grading is still regarded as the gold standard of prognostic significance. Some authors include radiological features as part of grading schemes for gliomas. A grading scheme for oligodendroglial tumors was launched in which the radiological appearance of the tumors with respect to neovascularization was included (27). 
A further step was taken in a study on oligodendrogliomas in which, in addition to the histological features, the clinical data, chromosomal changes, and radiological enhancement were taken into consideration; the resulting scoring scheme was claimed to be superior to histological grading according to World Health Organization (WHO) and St. Anne classification criteria alone (28). This integrative approach is crucial for therapeutic decision making but should preferentially be performed at a multidisciplinary level, leaving the contributions of the various disciplines pure to avoid bias. Nevertheless, the lesson of the integrative approach to glioma patients and their tumors is well taken.
HISTORY OF GRADING TUMORS
By the end of the 19th century, it was believed that tumor formation results from defective embryogenesis and that tumor cells are arrested representatives of particular developmental stages (29). Subsequently, the initial classifications were based on the resemblance of the predominant tumor cell type in the tissue specimens with cells known from cerebral development (30). For many tumors, however, no developmental counterpart could be matched. Moreover, tumors with similar histological characteristics could show considerable differences in clinical behavior, thereby invalidating this concept for clinical use. The discrepancies triggered Dr. James W. Kernohan to change the concept and start grading glial tumors analogous to what had been done in the field of colonic tumors by his colleague Broders (31). Kernohan acknowledged only a few groups based on the resemblance of the glial tumor cells with astrocytes, oligodendrocytes, and ependymal lineage (32). In addition, malignancy grades were attributed. Key features in the Kernohan classification scheme were cellular anaplasia and number of mitoses, but other features such as increase in cell density and mitotic activity, vascular proliferation, and necrosis were variably implemented as well. The term anaplasia (ανα = backward and πλασσειν = to form) means loss of differentiation and specifically refers to phenotypical changes of the tumor cells. Clinicians often use the term anaplasia loosely for high-grade malignancy, whereas pathologists are more specific by addressing various histological features that reflect the histopathologic malignancy grade of glial tumors: cell density, mitotic index, degree of pleomorphism of nuclei and cytoplasmic shapes, microvascular proliferation (MVP), and necrosis. Without providing a clear definition of the term, Kernohan mainly addressed nuclear and cytoplasmic characteristics when using the term anaplasia (32).
After Kernohan's efforts, grading schemes encompassing various numbers of the features were launched. Some seemed to be prone to unacceptable subjectivity and large interindividual variations and remained without practical application. Others used histological features with little or no prognostic value.
The atlas on the histology of brain tumors of Zülch (33), which appeared in 1971, may be regarded as the forerunner of the official WHO editions. Zülch made an effort to classify all intracranial neoplasms by creating groups according to expected clinical behavior. This way, the adjectives “benign,” “semibenign,” “semimalignant,” and “malignant” were attributed to heterogeneous groups consisting of tumors of entirely different lineages. Zülch's scheme was not confined to neuroglial tumors, but encompassed other tumors such as medulloblastomas, meningiomas, schwannomas, and even pituitary adenomas. For the benign or grade 1 tumors, postoperative survival of at least 5 years was to be expected; the semibenign grade 2 tumors would match a survival ranging between 3 and 5 years; the semimalignant grade 3 tumors, a 2- to 3-year survival period; the malignant grade 4 tumors matched with survival times of 6 to 15 months (33). While this scheme was merely a malignancy scale of various tumors (34), subsequent WHO editions have incorporated histological parameter-based grading criteria (34-38).
THE PROBLEM OF INTEROBSERVER VARIABILITY
The reproducibility of histological parameter assessments and the prognostic significance evaluated in prospective settings are pivotal for the development of optimal classification schemes for the diffuse gliomas. Prospective study settings are not necessary for measuring interobserver variability in the scoring of individual histological features because the clinical relevance of the features is not a concern. The issue of interobserver variability in typing and grading gliomas has been the subject of several retrospective studies (39-42). In a study of the diagnostic accuracy and concordance between pathologists in typing and grading of gliomas, Coons et al (41) demonstrated that κ scores could be markedly improved by joint sessions in which difficulties were identified and parameters were redefined. Similarly, in a study by Fuller et al (43), a reduction in discrepancies between pathologists on typing and grading of mixed gliomas was achieved when subsequent cooperative sessions were organized. The role of the pathologist's experience on κ scores was demonstrated in a study in which reviewers of various levels of experience graded 30 astrocytomas according to a modified scheme of Ringertz; however, it seemed that even among the 5 most experienced neuropathologists, the κ score still did not exceed 0.63 (44). The importance of exactly defining how to score a histological feature is illustrated by the contradictory results of estimating MVP (42, 44).
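The κ scores cited above express agreement between observers after correcting for the agreement expected by chance. As a minimal sketch, the two-rater form (Cohen's κ) can be computed as shown here; the function name and the example grades are hypothetical and are not taken from any of the cited studies, which involved larger panels and may have used multi-rater variants of the statistic.

```python
from collections import Counter


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who scored the same cases."""
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must score the same non-empty set of cases")
    n = len(rater_a)
    # Observed agreement: fraction of cases given the same score by both raters.
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    # Chance agreement: product of the raters' marginal frequencies per category.
    counts_a = Counter(rater_a)
    counts_b = Counter(rater_b)
    expected = sum(counts_a[c] * counts_b[c] for c in counts_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one and the same category throughout
    return (observed - expected) / (1.0 - expected)


# Hypothetical malignancy grades assigned to ten astrocytomas by two reviewers.
grades_a = [2, 2, 3, 3, 4, 4, 2, 3, 4, 4]
grades_b = [2, 3, 3, 3, 4, 4, 2, 2, 4, 3]
kappa = cohen_kappa(grades_a, grades_b)
print(f"kappa = {kappa:.2f}")
```

In this invented example, the reviewers agree on 7 of 10 cases, but because about a third of that agreement would be expected by chance, κ is considerably lower than the raw 70% agreement.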
Even features such as mitotic index appear to be prone to variation (Fig. 1). Apart from difficulties in their recognition and differentiating mitoses from karyorrhexis or pyknosis (45), some studies have indicated that the fraction of mitoses in a biopsy may be a function of the prefixation time (46). Moreover, the mitotic index may show spatial variations within specimens. Instead, the Mib-1 labeling index (LI) would be an improvement for measuring the proliferative fraction of a tumor cell population (Fig. 2). The LI refers to the fraction of nuclei that are positively stained for the Mib-1 antibody raised against the Ki-67 antigen, a nuclear protein that is upregulated during all phases of the cell cycle, except the G0 phase. In many series, the Mib-1 LI has proven to be reliably correlated with clinical behavior (47, 48). However, Mib-1 LI has not been incorporated into WHO grading because of reservations as to variation in the results of this immunohistochemical technique.
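Because the LI is defined as a simple fraction, the effect of spatial variation on the result can be illustrated with a short sketch. The function names and the nuclear counts below are hypothetical; pooling the counts over several microscopic fields is shown as one possible way of summarizing a specimen, not as a method prescribed by the WHO or by the cited studies.

```python
def labeling_index(positive_nuclei, total_nuclei):
    """Fraction of counted nuclei that stain positively (for example, for Mib-1)."""
    if total_nuclei <= 0:
        raise ValueError("total_nuclei must be positive")
    if not 0 <= positive_nuclei <= total_nuclei:
        raise ValueError("positive_nuclei must lie between 0 and total_nuclei")
    return positive_nuclei / total_nuclei


def pooled_labeling_index(fields):
    """LI over several fields, each given as (positive_nuclei, total_nuclei)."""
    positive = sum(p for p, _ in fields)
    total = sum(t for _, t in fields)
    return labeling_index(positive, total)


# Hypothetical counts from three microscopic fields of one specimen.
fields = [(12, 200), (30, 250), (8, 150)]
per_field = [labeling_index(p, t) for p, t in fields]
pooled = pooled_labeling_index(fields)
print([f"{li:.1%}" for li in per_field], f"pooled: {pooled:.1%}")
```

The per-field values in this invented specimen differ by more than a factor of 2, which shows why the choice of counting area needs to be defined before the LI can be compared between observers.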
Several studies have been aimed at reducing subjectivity of scoring histological features by the use of computer-aided estimations or morphometric methods. For example, some authors have developed semiautomatic methods for estimating nuclear pleomorphism (49) or cell proliferation by cytometrically imaging DNA content (50). Similar efforts were made in categorizing particular histological patterns in pediatric gliomas (13, 51). Other studies addressed the important issue of prioritizing features within the grading schemes to obtain maximal correlation with clinical outcome (52-55). For ranking of scores, various computer-guided methods that claimed to eliminate interobserver variability have been developed (49, 55).
In some studies, interobserver variability was measured against molecular changes, the latter being used as an objective identifier (10, 56). There should, however, be some reservations with the idea that molecular changes can always be objectively assessed. First, tests for particular molecular changes may not always yield clear results; problems in reproducibility may exist or reliable testing may be frustrated by a lack of tumor DNA in the samples. Furthermore, various molecular changes may show good overall correlation with histological tumor lineage, but genotypical-phenotypical discrepancies may remain in some of the cases (1, 2). Once consensus has been reached on the definition of oligodendrogliomas, astrocytomas, and mixed oligoastrocytomas (OAs) (by histological, molecular, or a combination of criteria), the histological parameters to be used for grading of these entities can be developed.
THE SEARCH FOR RELEVANT HISTOPATHOLOGIC FEATURES
Grading of Astrocytomas
The grading scheme used by Kernohan for astrocytomas consisted solely of the parameters anaplasia and number of mitoses (32). The scheme was applied to all tumors with astrocytic differentiation including the pilocytic variant. Although the scheme was designed to recognize 4 malignancy grades, no more than 2 different curves for overall survival were obtained when the scheme was applied in retrospective series (57, 58). Various modified schemes for grading astrocytomas followed. When pilocytic astrocytomas were excluded, the resulting curves invariably showed 3 grades of malignancy: low grade, anaplastic astrocytoma (AA), and glioblastoma (GB) (57, 59-61). However, none of these schemes were widely accepted, and the reason may be that the recognition of individual features was not sufficiently detailed and that respective grades were not clearly defined. In an effort to simplify the grading procedure by using only a few relevant features, Daumas-Duport proposed a new grading scheme for astrocytomas in 1988 consisting of 4 criteria: nuclear atypia, mitotic activity, necrosis, and florid MVP (62). Based on the presence of these criteria, tumors were assigned grades 2 to 4; pilocytic astrocytomas inherently received grade 1. Grading a set of 415 tumors according to Kernohan yielded 2 pairs of intertwined survival curves, whereas application of the new Daumas-Duport (or St. Anne-Mayo) scheme resulted in 3 distinctly running curves, namely, grades 2, 3, and 4. There were only 2 grade 1 tumors, the curves of which intermingled with the curves of 2 pilocytic astrocytomas. In 2000, the WHO adopted the St. Anne-Mayo grading scheme but neglected grade 1, which was reserved for extremely rare diffuse astrocytomas lacking nuclear anaplasia and mitoses (36, 62). Comparing the scheme of St. Anne-Mayo with its WHO modification shows the tendency to upgrade the tumors by WHO grades (Table 1).
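The appeal of the St. Anne-Mayo approach is that the grade follows from a simple count of criteria scored as present or absent. The sketch below assumes the commonly cited counting rule (no criteria, grade 1; 1 criterion, grade 2; 2 criteria, grade 3; 3 or 4 criteria, grade 4), which is not spelled out in the text above; the function and key names are hypothetical, and the WHO modification of the scheme is not modeled.

```python
# The 4 criteria of the scheme, each scored as present (True) or absent (False).
CRITERIA = (
    "nuclear_atypia",
    "mitotic_activity",
    "necrosis",
    "microvascular_proliferation",
)


def st_anne_mayo_grade(findings):
    """Grade from the number of criteria present (assumed counting rule)."""
    present = sum(bool(findings.get(name, False)) for name in CRITERIA)
    if present == 0:
        return 1
    if present == 1:
        return 2
    if present == 2:
        return 3
    return 4  # 3 or 4 criteria present


# Hypothetical tumor showing nuclear atypia and mitotic activity only.
example = {"nuclear_atypia": True, "mitotic_activity": True}
print(st_anne_mayo_grade(example))
```

Because each criterion is scored in an on-off manner, the reproducibility of the resulting grade depends entirely on how reproducibly the individual criteria are recognized.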
The criteria for distinguishing low-grade astrocytoma from AA and AA from GB remain problematic. In the current WHO system (WHO 2007), low-grade astrocytomas are delineated from AAs based on increased cell density (Fig. 3), nuclear pleomorphism (Fig. 4), and mitotic activity (Fig. 1). Because the feature of nuclear pleomorphism is difficult to define (Fig. 4), this delineation is prone to interobserver variability. It is not surprising, therefore, that the feature of nuclear pleomorphism failed to show discriminative power in a recent panel review (63). In the 1979, 1993, 2000, and 2007 WHO editions, the features considered to delineate low- and high-grade astrocytoma and GB were subject to changes (34, 35, 37, 64) (Table 2). In the WHO edition of 1979, the diagnosis of AA was compatible with the presence or high score of cell density, cellular pleomorphism, loss of cellular differentiation, mitoses, MVP (Fig. 5), necrosis (Fig. 6), and giant cells (38). However, according to the subsequent WHO editions, tumors with positive scores for these features should be diagnosed as GB (34, 35, 37, 64). Remarkably, in the 1993 edition of WHO, the so-called incipient MVP was prominently present in an illustration of an AA (Fig. 5 on page 55), whereas in the later editions, the presence of this feature would warrant the immediate diagnosis of GB. The definition of incipient MVP and its demarcation from true MVP remain unresolved (Fig. 5). In the International Agency for Research on Cancer series on Pathology and Genetics of Tumours of the Nervous System of 1997 (36), the presence of MVP, not necrosis, was compatible with the diagnosis of AA. On the other hand, in the official WHO edition of 2000, either MVP or necrosis alone would warrant the diagnosis of GB, and tumors harboring either of these features were no longer to be diagnosed as AA (37). This view prevailed in the latest edition of WHO classification of 2007 (34).
Grading of Oligodendroglial Tumors
In an early publication on oligodendrogliomas, Greenfield and Robertson (65) noticed that not all oligodendrogliomas are slowly growing benign tumors. Therefore, the introduction of a grading scheme seemed legitimate; however, subsequent studies demonstrated that criteria for grading oligodendroglial tumors were difficult to identify (57, 65, 66), and that the reproducibility of grading oligodendrogliomas in particular was troublesome (32, 67, 68). An explanation for the failure to develop satisfactory grading schemes for oligodendroglioma may be the relatively small series of these tumors available (57, 66, 69-71). In addition, the schemes for oligodendroglioma lacked details on the features and scoring (72, 73). The question as to whether grading of oligodendroglial neoplasms is less reproducible than grading of astrocytomas is difficult to answer because there are almost no studies addressing this issue (41). So far, there is no evidence that the prognostic relevance of any particular feature would differ for either tumor type.
Kernohan tried to distinguish malignant oligodendrogliomas from benign ones, but failed to do so (67). In particular, the mitotic count did not match the clinical outcome (67). Moreover, scoring of other histological features also yielded contradictory results (74). Smith and colleagues (68) from the Armed Forces Institute of Pathology introduced a grading scheme for oligodendrogliomas in 1983 that consisted of 4 grades using only 5 histological features that were scored in a simple on-off manner. This scheme consisted of endothelial proliferation, necrosis, nuclear/cytoplasmic ratio, cell density, and cellular pleomorphism. Remarkably, the mitotic index was not included (68). When grading according to Smith's and Kernohan's schemes was compared, the results seemed to be essentially the same. Both schemes yielded 3 distinct survival curves; grades 2 and 3 were intertwined, both running separately from the curves of worse and best survival (75). Some later attempts to isolate features with independent significance for survival identified mitotic activity (42, 76, 77), but many others did not (27, 68, 78, 79). Microvascular proliferation was repeatedly found to be a feature with independent significance (76, 80). The value of necrosis seemed to be less clear (63, 75, 81-84).
Gliomas with oligodendroglial features encompass a spectrum of tumors ranging from tumors with the classic honeycomb architecture to those with various numbers of astrocytic cells. The delineation from either pure oligodendrogliomas or astrocytomas is controversial. It is generally accepted that tumors with variable numbers of glial fibrillary acidic protein-positive oligodendroglial cells (gliofibrillary oligodendrocytes or minigemistocytes) should still be considered as oligodendrogliomas (34). The presence of neoplastic cells with astrocytic phenotypes should trigger the diagnosis of mixed OA. However, the delineation of oligodendrogliomas from the OAs based on the recognition of neoplastic astrocytes has proven to be subject to large interobserver discrepancies (63). The combined loss of 1p/19q correlates well with classic oligodendroglial histology and tumors carrying this genotypic aberration represent a distinct clinical entity (6). There are data addressing the genotypic verification of mixed gliomas, and the percentages of specific genotypes are subject to variations (1). Because of the uncertainties in histological typing no specific grading scheme for tumors with mixed histology has been developed.
In the WHO 2007 edition, anaplastic OAs were split into those with or without necrosis. In a retrospective setting, the former displayed a steeper Kaplan-Meier curve (83). However, anaplastic OAs may be diagnostic dilemmas, particularly with respect to their distinction from pure anaplastic oligodendrogliomas and astrocytomas on one hand, and from GBs on the other. Moreover, in the latest WHO fascicles, the controversial entity of GB with oligodendroglial features was introduced. The “glioblastoma with oligodendroglial component” became rather controversial because GBs have been considered to be of astrocytic lineage. Taking into consideration that oligodendrogliomas (including high-grade oligodendrogliomas) may harbor tumor cells with astrocytic phenotypes, that mixed OAs are not univocally delineated from pure oligodendrogliomas, and that cells with scanty cytoplasm in GBs may look like oligodendroglioma cells, it is not surprising that this new addition is a source of confusion (85).
Grading of Ependymomas
The contradictory results of grading ependymomas in the literature illustrate the relative importance of histological grading for prognostication in patients with this tumor, whereas clinical factors, such as tumor localization, dominate in prognostic importance (86-90). Moreover, relatively limited patient groups without uniform treatments impair evaluation of the parameters and grading schemes. The results of studies are often difficult to compare because of variable inclusion of distinct subtypes, such as the myxopapillary ependymoma variant and subependymomas. In addition, results may be incomparable because of the variable inclusion of children in the surveys. In general, most grading studies identified the parameters of cell proliferation as significant for prognosis (91-95). Because many ependymomas harbor angiomatous vessels and tumors may display exophytic growth patterns, features such as MVP and necrosis seem to have a different connotation in these tumors, as compared with the diffuse gliomas.
PROOF OF THE PUDDING: TESTING INTEROBSERVER VARIATION AND SIGNIFICANCE OF SPECIFIC PARAMETERS
Although there are data on the results of a panel review of gliomas in clinical trials (96), there are only a few results of panel reviews for glioma typing and grading according to the current WHO guidelines. The effects of subsequent modifications of WHO guidelines for grading astrocytomas may be illustrated by the results of central reviewing of a European Organisation for Research and Treatment of Cancer (EORTC) trial on the effects of additional radiation of AAs. This trial encompassed almost 20 years, beginning in 1988 and reopened in 1994 (97). The initial central pathology review was performed during the study, and the diagnosis of AA was confirmed in only 35% of cases. A second central review after closing of the trial showed a confirmation of this diagnosis in 36% of cases. The concordance on the diagnosis of AA between the central reviewers seemed to be even worse and was most likely caused by the aforementioned variation of WHO guidelines for distinguishing AA and GB (35, 36, 38). It should be kept in mind that during the long period the trial was open, not only had changes in the grading of astrocytic tumors taken place, but the therapeutic consequences of the diagnosis of oligodendroglioma had also caused an increasing awareness of this diagnosis, thereby influencing the percentages. The concordance on the diagnosis of GB seemed to be 73%, reflecting the well-recognized diagnostic criteria. In a panel review of an EORTC study on the effects of combining radiotherapy with PCV administration for patients with anaplastic oligodendrogliomas and anaplastic OA, the features of cell density, nuclear abnormalities, mitoses, MVP, and necrosis were scored by 9 neuropathologists (63). The reviewers were asked to score according to their interpretation of the criteria of the WHO 2000 edition. The reviewers were unaware of clinical data and each independently scored the tumors.
Approximately half of the diagnoses of anaplastic oligodendroglioma were confirmed by consensus (6 or more of the 9 reviewers), and 62% confirmation was reached when at least 4 reviewers (the majority) agreed. Discrepancies on tumor grade were much lower, with only 2% (consensus) and 11% ("majority") discrepancies about tumor grade. For diagnosis of anaplastic OAs, the panel consensus was only 8%, and the majority confirmation 32%. For mixed gliomas, there were hardly any discrepancies for tumor grade by consensus. With respect to the individual features of the WHO grading scheme, the concordance of the panel was highest for necrosis (92%) and relatively high for MVP, cell density, and mitoses (87%, 85%, and 85%, respectively). The feature of nuclear abnormalities was unanimously scored as present by all panelists in all tumors; therefore, this criterion is not discriminative. Again, the concordance on scores of individual parameters was higher when features were better distinguishable or better defined (42). Further panel reviews need to be organized to reveal weaknesses in definitions of histological features and grades and to develop improved grading schemes that can be adopted by the WHO.
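The confirmation percentages in such a panel review follow from counting, for each tumor, how many reviewers reproduce the original trial diagnosis. A minimal sketch of that tally is given here, using the thresholds stated above (6 or more of 9 reviewers for consensus, at least 4 for "majority"); the function names, diagnosis labels, and cases are hypothetical and do not reproduce the data of the cited review.

```python
def confirmation_level(panel_diagnoses, trial_diagnosis,
                       consensus_min=6, majority_min=4):
    """Classify one case by the number of reviewers confirming the trial diagnosis."""
    votes = sum(d == trial_diagnosis for d in panel_diagnoses)
    if votes >= consensus_min:
        return "consensus"
    if votes >= majority_min:
        return "majority"
    return "not confirmed"


def confirmation_rates(cases, trial_diagnosis):
    """Fraction of cases confirmed by consensus, and by majority or better."""
    levels = [confirmation_level(case, trial_diagnosis) for case in cases]
    n = len(levels)
    consensus = levels.count("consensus") / n
    majority_or_better = (n - levels.count("not confirmed")) / n
    return {"consensus": consensus, "majority_or_better": majority_or_better}


# Hypothetical diagnoses by 9 reviewers for 4 tumors entered as
# anaplastic oligodendroglioma ("AO"); "AOA" and "GB" are competing diagnoses.
cases = [
    ["AO"] * 7 + ["AOA"] * 2,
    ["AO"] * 4 + ["AOA"] * 3 + ["GB"] * 2,
    ["AO"] * 2 + ["AOA"] * 5 + ["GB"] * 2,
    ["AO"] * 6 + ["GB"] * 3,
]
rates = confirmation_rates(cases, "AO")
print(rates)
```

Keeping the thresholds as parameters makes explicit that the reported percentages depend on how consensus and majority are defined for a given panel size.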
CONCLUSIONS AND RECOMMENDATIONS
Glioma grading schemes should consist of histological criteria that are relevant to the biological progression of the tumor and these criteria should be legitimized in prospective controlled studies. Prior to the application of a grading scheme, tumors need to be reliably typed. It is expected that an increasing number of genetic characteristics will appear to be of prognostic or predictive value. For grading, comparisons of survival curves should be done with consideration of treatments and other relevant clinical parameters. The inclusion of related histological features in grading schemes should be avoided. The histological features should be unequivocally defined and scored to reduce interobserver variability. In addition, respective grades of a scheme should be clearly outlined. Panel reviews measuring interobserver variations on assessment of histological features are necessary. Features that can be reproducibly scored should be subjected to multivariate analysis for interdependencies and correlated with prognostic or predictive molecular changes. This way, a minimal set of features with the highest reproducibility in assessment and with the strongest prognostic or predictive values can be identified. Information on the radiological presentation is important for the interpretation of biopsy findings and should be matched with the results of histological grading. In various glioma subgroups, the proliferation index as assessed by immunohistochemistry for Ki-67 has proven to be a reliable and independent parameter for predicting biological behavior and could be included in new grading schemes.
In grateful reminiscence of my teacher Dr. Ellsworth C. Alvord.