The assessment of new therapies in the adjuvant setting in early breast cancer requires large numbers of patients and many years of follow-up before results can be presented. Therefore, the neoadjuvant study setting, which allows for early prediction of treatment response in smaller patient sets, has become increasingly popular. Ki67 is the most commonly used and extensively studied intermediate biomarker of treatment activity and residual risk in neoadjuvant trials on endocrine therapy, new biological therapies, and chemotherapy. It is increasingly being used as a primary endpoint for new therapies, particularly those added to endocrine therapy. The PeriOperative Endocrine Therapy for Individualizing Care (POETIC) trial, including more than 4000 postmenopausal, estrogen receptor (ER)–positive patients randomly assigned to receive 2 weeks of presurgical treatment with an aromatase inhibitor or no further treatment, is the largest window-of-opportunity trial conducted and is assessing the clinical utility of on-treatment Ki67 as a predictor of long-term outcome. For generalizability, Ki67 measurements in the POETIC and other trials need to use standard methodology. The International Working Group on Ki67 in Breast Cancer is conducting a series of studies to bring this to reality.
One of the main values of neoadjuvant therapy is its potential to reveal comparative treatment effectiveness much earlier than other treatment settings. Clinical response other than pathological complete remission (pCR) in chemotherapy studies has not proved very informative (potential reasons for this are considered below). There has, therefore, been much interest in biomarkers as surrogate or, more realistically, intermediate markers of treatment benefit. Changes in apoptosis and more recently in RAD51 foci have been proposed as markers for chemotherapy but have not been fully validated (1). In contrast, the proliferation marker Ki67 has substantial data to support its use, especially in endocrine therapy. This article, therefore, focuses on the multiple uses of Ki67 in neoadjuvant therapy.
In neoadjuvant endocrine trials, Ki67 is the most commonly used pharmacodynamic marker. It may be used in neoadjuvant trials as an eligibility criterion and for early prediction of response and acquired resistance. At the time of diagnosis, low levels of Ki67 may be used as an exclusion criterion, as statistically significant suppression of Ki67 in response to treatment is less likely to be detectable with low baseline proliferation rates.
The foundation for the use of Ki67 for prediction of response and long-term outcome at the end of neoadjuvant treatment lay in the results from large neoadjuvant and adjuvant studies of parallel design: the Immediate Preoperative Anastrozole, Tamoxifen, or Combined with Tamoxifen (IMPACT) trial in parallel with Arimidex, Tamoxifen Alone or Combined (ATAC), and PO24 (preoperative treatment of postmenopausal breast cancer patients with letrozole; a randomized double-blind multicenter study) in parallel with Breast International Group 1-98 (2–5). IMPACT randomly assigned patients to receive 12 weeks of neoadjuvant anastrozole, tamoxifen, or the combination, and PO24 randomly assigned patients to 16 weeks of letrozole or tamoxifen. In both IMPACT and PO24, the aromatase inhibitor induced a statistically significantly superior suppression of Ki67 levels compared with tamoxifen or the combination, which translated into an improved long-term outcome for patients treated with an aromatase inhibitor in the much larger adjuvant trials ATAC and Breast International Group 1-98, respectively (3,5).
The pretreatment Ki67 value reflects the intrinsic prognosis of the patient, whereas the change in Ki67 corresponds to response to the treatment, and the on-treatment value combines both features and relates to residual risk on endocrine treatment (Figure 1). Evidence for this was provided by the IMPACT study: Both pretreatment and the 2-week Ki67 value were predictive of recurrence-free survival. However, in a multivariate model including both the pretreatment and 2-week Ki67 value, only the 2-week value remained an independent predictor of residual risk and long-term outcome (6).
In predicting benefit from endocrine treatment, the change in Ki67 and not the on-treatment value proved to be the most powerful predictor in the IMPACT study: The absolute differences in on-treatment Ki67 levels between the treatment arms were similar to the differences in percentage change (2), but the statistical significance was weaker, and more patients would have been needed in that trial to detect differences in treatment efficacy by using the static on-treatment Ki67 level as opposed to the dynamic change in Ki67.
In the IMPACT trial, Ki67 predicted outcome, but clinical response did not, there being no differences in clinical response rates between study arms (7). There was a statistically significant relationship of degree of decrease in Ki67 and clinical response, but the association was very modest (2). There is a rational, if unproven, explanation for why changes in a proliferative biomarker such as Ki67 might predict adjuvant benefit better than clinical response itself. In the neoadjuvant setting, a profound reduction in proliferation would be expected to be needed in a highly proliferative tumor to produce an objective clinical response. However, in the adjuvant setting, any reduction in proliferation theoretically translates into an improved outcome.
Recent Data on Validity of Prediction
Over the last few years, more data relating to the validity of Ki67 as a predictor of treatment benefit and long-term prognosis have been published.
In the neoadjuvant Z1031 study, 377 postmenopausal, estrogen receptor (ER)–positive women were randomly assigned to receive neoadjuvant exemestane, letrozole, or anastrozole with Ki67 data available for 266 patients (8). There was no statistically significant difference in changes in Ki67 after treatment between the three treatment groups (Kruskal–Wallis P = .45, adjusted for three-way comparison). In line with this, in one adjuvant equivalent study, MA.27, in which postmenopausal patients were randomly assigned to receive exemestane or anastrozole, neither therapy proved superior (9).
The other comparison, of anastrozole vs letrozole, is being made in the Comparison Trial of Letrozole to Anastrozole in the Adjuvant Treatment of Postmenopausal Women With Hormone Receptor and Node Positive Breast Cancer (FACE) trial, in high-risk postmenopausal women (http://clinicaltrials.gov/show/NCT00248170). When results from this study are presented, it will be interesting to see whether the strong trend to a poorer suppression of Ki67 found with anastrozole (−78%, standard error of the mean (SEM) 4%) vs letrozole (−87.1%, SEM 2.8%) in the Z1031 trial predicts a poorer outcome with anastrozole in the FACE trial.
In the neoadjuvant anastrozole versus tamoxifen in patients receiving goserelin for premenopausal breast cancer (STAGE) trial, 197 ER-positive, HER2-negative premenopausal women were randomly assigned to receive neoadjuvant goserelin either in combination with anastrozole or tamoxifen (10). There were statistically significantly more partial and complete responses as well as statistically significantly superior suppression of Ki67 in the anastrozole + goserelin compared with the tamoxifen + goserelin treatment arm (11). In contrast, in the equivalent adjuvant study, the Austrian Breast and Colorectal Cancer Study Group trial 12 (ABCSG-12 trial) including 1803 premenopausal ER-positive patients, there was no statistically significant difference in disease-free survival between study arms (12). Reasons for the apparent lack of predictive value of Ki67 in the STAGE/ABCSG-12 trials might be found in differences in the baseline characteristics of the patients. There were statistically significantly more overweight patients in ABCSG-12 trial compared with the neoadjuvant STAGE trial. In previous studies on postmenopausal women, obesity has not only been associated with an overall increased risk of recurrences, but there is also a trend toward more relative benefit of anastrozole in women with lower weight (13). Moreover, in a substudy of the ABCSG-12 trial, a threefold increased risk of death was found in overweight women treated with anastrozole compared with tamoxifen, with no such difference found for normal-weight women (14). In the STAGE study, serum estradiol levels were monitored during treatment using a highly sensitive assay. As expected, at 4 weeks, anastrozole treatment induced statistically significantly superior suppression of estradiol levels compared with tamoxifen. 
Over time, however, there was a gradual increase in estradiol levels in the anastrozole-treated arm, indicating that tachyphylaxis might be occurring, with estrogen suppression by anastrozole and goserelin becoming incomplete in this premenopausal cohort as time progressed. Because of increased total-body aromatization, estradiol levels are generally higher in overweight women. The inferior suppression of estrogen in the anastrozole arm with increasing time may therefore have been even more pronounced in the overweight women. Taken together, this may have influenced the long-term efficacy of anastrozole in the ABCSG-12 trial compared with the STAGE trial. At this stage, it is therefore reasonable to continue to accept the value of Ki67 as an intermediate marker of treatment response and long-term outcome in the neoadjuvant setting.
The paired neoadjuvant/adjuvant trials reported above had parallel randomizations, but they were conducted independently of each other: The evidence from the neoadjuvant study was not used for decisions on further clinical development of the respective therapeutic strategy. As increasing numbers of new compounds are being introduced, Ki67 has begun to be used in neoadjuvant trials for early evidence of treatment benefit and as a means of selecting which drugs to move forward into larger trials. Trial 223, a phase II trial in early breast cancer of neoadjuvant anastrozole compared with anastrozole and gefitinib, was the first neoadjuvant trial to use Ki67 and not clinical response as a primary endpoint (15). Previous data on gefitinib had been conflicting (16,17). In Trial 223, there were no statistically significant differences in change in Ki67 between baseline and 2 or 16 weeks, or from 2 to 16 weeks, and there was a nonsignificant trend in clinical response against the combination treatment (15). Based partly on these results, it was decided not to pursue a larger phase III adjuvant trial including gefitinib.
A neoadjuvant trial of letrozole ± everolimus found a statistically significantly greater mean reduction of Ki67 from baseline to day 15 in the presence of everolimus (90.7% ± 3.2%) compared with placebo (74.8% ± 6.8%). There was only a moderate correlation with clinical response, with no statistically significant increase in clinical response with mTOR (mammalian target of rapamycin) inhibition (18). This small difference in Ki67 was followed by a remarkable 6.5-month difference in median progression-free survival in Everolimus in Combination With Exemestane in the Treatment of Postmenopausal Women With Estrogen Receptor Positive Locally Advanced or Metastatic Breast Cancer Who Are Refractory to Letrozole or Anastrozole (BOLERO-2), a trial in endocrine-resistant postmenopausal patients with advanced disease who were randomly assigned to receive exemestane with placebo or exemestane + everolimus (19). Although the neoadjuvant outcome did predict the result of the phase III trial, there must be concern about the risk of a false-negative result for future study drugs if the selection process were to be based solely on Ki67, given the modest difference in Ki67 seen in this case. It is possible that a new treatment that did not generate a survival difference of the magnitude seen with everolimus could still be a very worthwhile development but might be disregarded based on a smaller nonsignificant change in Ki67.
Recent Applications of Ki67 in Clinical Research
In studies aimed at analyzing endocrine-resistant tumors, on-treatment Ki67 levels have been used to define patients as nonresponders and thereby to select only the endocrine-resistant cases from a neoadjuvant patient population for further studies (20). In the neoadjuvant Aromatase Inhibitor Phase II Trial by Ellis et al., presented at the San Antonio Breast Cancer Symposium 2012, postmenopausal ER-positive stage II patients received 2–4 weeks of neoadjuvant treatment with an aromatase inhibitor, after which they were re-biopsied. Patients with a residual Ki67 of more than 10% were classified as treatment resistant and triaged to chemotherapy or immediate surgery. The success of this approach is, however, called into question by the low pCR rate observed in these patients after chemotherapy (21).
Given the validation described above, change in Ki67 can act as an endpoint for studying predictors of benefit or response and resistance to endocrine therapy. We have reported a study based on whole-genome expression analysis in which the gene most strongly predictive of poor ∆Ki67 was SLAMF8 (22). This was one of a large number of genes related to inflammation or immune response associated with poorer suppression of Ki67. An immune metagene showed a statistically highly significant relationship with Ki67 change but notably explained only about 15% of the variability indicating the need for much larger studies to explain most of the endocrine resistance.
More recently, Ellis et al. (20) reported a highly novel, mutation-based study using massively parallel sequencing and Ki67 as the biomarker in breast cancer treated with neoadjuvant letrozole. Tumors with Ki67 of more than 10% after 4 months of treatment were defined as “aromatase inhibitor resistant” and were associated with luminal B status and the presence of multiple pathways based on presence or absence of specific mutations. However, as discussed above, the more direct marker of response is the dynamic change in Ki67, whereas the on-treatment Ki67 signifies residual risk after treatment, which is a combination of prognosis and treatment effect and, at least theoretically, may not reflect change in Ki67. For example, using the on-treatment cutoff of 10%, a substantial decrease in Ki67 from 40% to 12% would be classified as nonresponse and an increase from 2% to 9% as a response, whereas the opposite would be the case if the dynamic variable had been chosen. When Ellis et al. (20) secondarily assessed which mutated genes were associated with change in Ki67, trans-acting T-cell-specific transcription factor (GATA3) was identified as statistically significantly associated with good antiproliferative response when mutated (P = .012).
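The contrast between the static and dynamic definitions can be made concrete with a short sketch. The Python below is illustrative only: the function names are hypothetical, the 10% cutoff and the two example tumors are taken from the paragraph above, and treating any fall from baseline as a dynamic "response" is a simplifying assumption rather than a validated criterion.

```python
def percent_change(baseline: float, on_treatment: float) -> float:
    """Relative change in Ki67 (in %) from baseline to the on-treatment value."""
    return 100.0 * (on_treatment - baseline) / baseline


def static_response(on_treatment: float, cutoff: float = 10.0) -> bool:
    """Static definition: response if on-treatment Ki67 is not above the cutoff."""
    return on_treatment <= cutoff


def dynamic_response(baseline: float, on_treatment: float) -> bool:
    """Dynamic definition (assumed here): response if Ki67 fell from baseline."""
    return percent_change(baseline, on_treatment) < 0.0


# The two example tumors from the text: (baseline Ki67 %, on-treatment Ki67 %).
examples = {"40% to 12%": (40.0, 12.0), "2% to 9%": (2.0, 9.0)}

for label, (baseline, on_treatment) in examples.items():
    print(
        label,
        "change:", percent_change(baseline, on_treatment),
        "static response:", static_response(on_treatment),
        "dynamic response:", dynamic_response(baseline, on_treatment),
    )
```

Under these assumptions the first tumor shows a 70% fall yet is a static nonresponder, whereas the second shows a 350% rise yet is a static responder, which is the reversal described in the text.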
As noted above, studies of response and resistance require large populations for sufficient power. The PeriOperative Endocrine Therapy for Individualizing Care (POETIC) trial is the largest study to assess the validity of Ki67 as a marker of response and long-term outcome in a presurgical window-of-opportunity setting. The study randomly assigns patients (2:1) either to 2 weeks of presurgical and 2 weeks of postsurgical treatment with a nonsteroidal aromatase inhibitor or to no treatment (23). The primary aim is to assess whether short-term presurgical endocrine treatment improves outcome, a hypothesis based on an animal model in which endocrine therapy or chemotherapy given before surgery improved outcome; testing this hypothesis required 4000 postmenopausal patients with ER-positive breast cancer (24). A second aim of the study is to validate the on-treatment Ki67 as a marker of long-term outcome, compared with the baseline Ki67, in a larger patient set, as previous studies have only been performed in smaller patient populations. Importantly, the size of the study will allow a secure assessment of the impact of using the on-treatment value on decision making. There were initial challenges in conducting a presurgical study involving more than 100 centers, especially in obtaining diagnostic and surgical biopsies from each patient. However, the recruitment target was passed in January 2013, with up to about 150 patients recruited per month, showing the clear feasibility of such a large study.
It has recently been recognized that Ki67 may also be used for selecting patients in need of further treatment after neoadjuvant chemotherapy by predicting prognosis in residual disease. The prognosis for patients showing a pCR is generally good; however, only a small percentage of patients achieve this. For the large majority of non-pCR patients, the prognosis is much more heterogeneous. New tools are needed to identify patients with a low risk of relapse who can be spared further treatment, as well as identifying patients with a higher risk of relapse who could be offered other treatments or inclusion in studies on new drugs. Symmans et al. (25) reported that a continuous variable for assessing residual disease—Residual Cancer Burden—correlates with long-term outcome in non-pCR patients. We and others have found that high residual Ki67 after neoadjuvant chemotherapy is predictive of inferior outcome in patients not achieving a pCR (26,27). In a recently presented study from our group, the prognostic value of Ki67 and the Residual Cancer Burden (25), alone or in combination (Residual Proliferative Cancer Burden), was assessed (28). The value of Residual Proliferative Cancer Burden in predicting outcome exceeded that of either Residual Cancer Burden or Ki67 alone.
In the Palbociclib in Addition to Standard Endocrine Treatment in Hormone Receptor Positive Her2 Normal Patients With Residual Disease After Neoadjuvant Chemotherapy and Surgery (PENELOPE-B) study, the value of the CDK 4/6 inhibitor palbociclib in addition to endocrine treatment in ER-positive, HER2-negative patients with high-risk residual disease after neoadjuvant chemotherapy and surgery will be assessed (http://clinicaltrials.gov/ct2/show/NCT01864746). Another postneoadjuvant prognostic tool, the clinical-pathological stage–estrogen/grade index, was chosen as the eligibility criterion (29). Ki67 is not a part of the clinical-pathological stage–estrogen/grade index but will also be assessed within the trial. This offers an opportunity to compare the prognostic and/or predictive value of Ki67 with that of the clinical-pathological stage–estrogen/grade score.
In many ways, the approach to combining Ki67 with clinicopathological parameters after neoadjuvant chemotherapy is analogous to the approach in the postneoadjuvant endocrine setting where the preoperative endocrine index, which also includes Ki67, has been proposed for outcome prediction and as a means of identifying patients in need of additional adjuvant therapy. The index was developed in the PO24 trial and validated in the IMPACT trial (30). The index includes pathological tumor size and nodal status, ER status and Ki67, and divides patients into three risk groups (risk score 0, 1–3, and ≥4) with statistically significant differences in outcome. Of particular note, patients with pathological stage I or 0 and preoperative endocrine index score 0 after treatment were found to have such a low risk of relapse that they might be spared chemotherapy (30).
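The three-way grouping described above amounts to a simple lookup on the total index score. The sketch below assumes the total score has already been computed; the point values assigned to tumor size, nodal status, ER status, and Ki67 are not given in this article and are not reproduced here. The function name and the group labels are hypothetical.

```python
def endocrine_index_risk_group(total_score: int) -> str:
    """Map a total preoperative endocrine index score to one of three groups.

    The boundaries follow the text (score 0, 1-3, and >=4); the labels are
    illustrative and not taken from the original publication.
    """
    if total_score < 0:
        raise ValueError("total_score must be non-negative")
    if total_score == 0:
        return "group 1 (score 0)"
    if total_score <= 3:
        return "group 2 (score 1-3)"
    return "group 3 (score >=4)"


for score in (0, 2, 5):
    print(score, "->", endocrine_index_risk_group(score))
```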
Challenges and Limitations
Because Ki67 has become the proliferation marker of choice in breast cancer both for prediction of prognosis and treatment benefit, considerable efforts have been made to improve standardization and reproducibility. The International Ki67 in Breast Cancer Working Group has created guidelines for the assessment of Ki67 with recommendations on preanalytical and analytical procedures, as well as on interpretation, scoring, and data handling (31). A reproducibility study specifically aimed at the scoring procedures has subsequently been conducted in which centrally stained tissue microarrays were scored by eight experienced laboratories according to local routines (32). Mean Ki67 values ranged from 7% to 24%, with high intralaboratory reproducibility (intraclass correlation coefficient = 0.94) but only moderate interlaboratory reproducibility (intraclass correlation coefficient = 0.71). The main contributors to interlaboratory discordance were differences in the definition of staining positivity and in approaches to scoring. Laboratories with formal counting had more consistent results than those performing visual estimates. In a second part of the study, a web-based calibration training tool was created, which was found to improve overall concordance. An important conclusion was that absolute values and cutoffs for Ki67 cannot be transferred between laboratories without much improved standardization of the scoring methodology (32).
One potential additional variable not addressed by Nielsen may be heterogeneity in Ki67 expression, which also raises the question of how representative one diagnostic core is of the entire tumor. About 15 years ago, we reported the variability in Ki67 seen between two core-cuts taken from a tumor at the same time: Essentially, Ki67 in a second biopsy had to vary by at least 50% relative to the first biopsy to be deemed statistically significantly different from it (33). The control arm of POETIC will provide much more data on this. Studies have been conflicting on whether there are differences in Ki67 levels between biopsy and surgical specimen (34,35). In the guidelines from the International Ki67 in Breast Cancer Working Group, it is noted that it is highly preferable to use the same type of specimens when comparative scores are to be made (31).
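One way to read the 50% figure is as a minimum relative difference that a second biopsy must show before a change can be distinguished from sampling variability. The sketch below encodes that reading. The function name is hypothetical, treating increases and decreases symmetrically is an assumption, and the threshold would need to be re-derived for any other sampling or scoring method.

```python
def exceeds_sampling_variability(
    first_ki67: float,
    second_ki67: float,
    min_relative_change: float = 0.5,
) -> bool:
    """Return True if the second Ki67 value differs from the first by at least
    the given fraction of the first value (0.5 = 50%, as quoted in the text)."""
    if first_ki67 <= 0:
        raise ValueError("first_ki67 must be positive to compute a relative change")
    relative_change = abs(second_ki67 - first_ki67) / first_ki67
    return relative_change >= min_relative_change


# A fall from 20% to 12% is a 40% relative change: within sampling variability.
print(exceeds_sampling_variability(20.0, 12.0))
# A fall from 20% to 8% is a 60% relative change: beyond sampling variability.
print(exceeds_sampling_variability(20.0, 8.0))
```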
Ki67 is by far the most widely used biomarker for assessing response and residual risk with neoadjuvant therapy. Its appropriate application requires an appreciation of the variables that influence its measurement. Studies are ongoing to establish these variables more fully and to create guidance to minimize their impact.
We are grateful to Breakthrough Breast Cancer and the National Institute for Health Research Biomedical Research Centre at the Royal Marsden Hospital for financial support of the multiple studies summarized here. This work was also supported by funds from the Mrs Berta Kamprad Foundation (BKS 24/2013), the Swedish Breast Cancer Association (BRO, 35581), and the Skåne County Council’s Research and Development Foundation (10603).