Single response assessment of transplant-ineligible multiple myeloma: a supplementary analysis of JCOG1105 (JCOG1105S1)

Abstract
Background: The International Myeloma Working Group response criteria require two consecutive assessments of paraprotein levels. We conducted an exploratory analysis to evaluate whether a single response assessment could substitute for the International Myeloma Working Group criteria, using data from JCOG1105, a randomized phase II study of melphalan, prednisolone and bortezomib.
Methods: Of 91 patients with transplant-ineligible newly diagnosed multiple myeloma, 79 were included. We calculated the kappa coefficient to evaluate the degree of agreement between the International Myeloma Working Group criteria and the single response assessment.
Results: Based on the International Myeloma Working Group criteria, 11 (13.9%), 20 (25.3%), 36 (45.6%) and 12 (15.2%) patients had stringent complete response/complete response, very good partial response, partial response and stable disease, respectively. Based on the single response assessment, 17 (21.5%), 19 (24.1%), 35 (44.3%) and 8 (10.1%) patients had stringent complete response/complete response, very good partial response, partial response and stable disease, respectively. The kappa coefficient was 0.76 (95% confidence interval, 0.65–0.88), demonstrating good agreement. Among patients with stringent complete response/complete response, the median progression-free survival under the single response assessment was not inferior to that under the International Myeloma Working Group criteria (3.8 vs. 2.9 years), suggesting that the single response assessment did not overestimate response.
Conclusions: The single response assessment could be a substitute for the current International Myeloma Working Group criteria for transplant-ineligible newly diagnosed multiple myeloma.


Introduction
Multiple myeloma (MM) is a malignant disorder characterized by a clonal proliferation of plasma cells producing a monoclonal immunoglobulin. The European Group for Blood and Bone Marrow Transplant/International Bone Marrow Transplant Registry/American Bone Marrow Transplant Registry (EBMT/IBMTR/ABMTR) published criteria for the response and progression of MM treated by stem cell transplantation, commonly referred to as the EBMT criteria (1). These criteria defined complete response (CR), partial response (PR) and minimal response (MR), each of which required the response to be maintained for a minimum of 6 weeks to avoid recording a transient response. In 2006, the International Myeloma Working Group (IMWG) developed uniform response criteria, which have been used to measure the effect of treatment (2). All response categories require two consecutive assessments of serum or urine monoclonal protein concentrations made at any time. More recently, the IMWG has defined new response categories that also require two consecutive assessments of the paraprotein level (3). The purpose of these two consecutive assessments is to eliminate laboratory error or fluctuation of the measurement. However, two consecutive assessments are burdensome in clinical practice; furthermore, neither the interval between the two assessments nor their exact timing is clearly defined. Moreover, when a second response assessment is missing, the best response may be underestimated, especially in the setting of clinical trials.
The Japan Clinical Oncology Group (JCOG)-Lymphoma Study Group (LSG) has conducted a randomized phase II study to optimize a more promising modified regimen containing melphalan, prednisolone and bortezomib (MPB) for transplant-ineligible newly diagnosed MM (TI-NDMM) (JCOG1105, jRCTs031180097) (4,5). The CR rate in this investigator-initiated study was lower than that in previous studies (6,7), and one possible reason for that difference was a failure to confirm CR with a second response assessment, including immunofixation electrophoresis of both serum and urine.
If a single response assessment can be demonstrated to be as valid and precise as the current IMWG criteria, it could be used as a substitute for two consecutive assessments, thereby lowering the burden on the medical system and avoiding the risk of underestimating the best response. Thus, this analysis aimed to evaluate whether a single response assessment can substitute for the current IMWG criteria using data from JCOG1105.

Patients and methods
Summary of JCOG1105
JCOG1105 (4,5) was a randomized phase II study to develop a more promising MPB regimen for the upcoming phase III study for TI-NDMM. The following patients were enrolled in JCOG1105 between July 2013 and April 2016: newly diagnosed symptomatic MM (IMWG 2003), ECOG performance status of 0-2 or 3 due to bone lesions, aged 65-79 years or 20-64 years, who were not candidates for stem cell transplantation, with preserved organ function. The follow-up ended in June 2019. The study protocol was approved by the Protocol Review Committee of JCOG and the respective institutional review boards. Informed consent about the secondary use of data was obtained from the enrolled patients upon registration in JCOG1105.

Patients and response criteria
A total of 91 patients enrolled in JCOG1105 were analyzed in this study. Patients with a best overall response of progressive disease (PD) or not evaluable (NE) by the IMWG criteria were excluded. We excluded patients with PD because the IMWG criteria also allow a single assessment to determine PD based on clinical judgment, and some patients can continue treatment because PD was not confirmed by two consecutive assessments. Two sets of response criteria were used in this supplementary analysis. The first was the IMWG criteria, which require two consecutive assessments made at any time (2) and were adopted by JCOG1105. The second was an exploratory criterion called the single response assessment, which did not require confirmation by a second assessment. The response categories were stringent complete response (sCR), CR, very good partial response (VGPR), PR and stable disease (SD). Because of the small number of patients with sCR, sCR and CR were combined into one response category.

Statistical analysis
We analyzed the original data from JCOG1105 without collecting additional information. The primary endpoint of this supplementary analysis was the kappa coefficient, used to evaluate the degree of agreement between the IMWG criteria and the single response assessment across the categories sCR/CR, VGPR, PR and SD. We selected the kappa coefficient because it is considered a more robust measure than a simple percentage of agreement: it takes into account the possibility of chance agreement (8). The single response assessment was considered useful if the kappa coefficient was ≥0.7. The secondary endpoints were progression-free survival (PFS), overall survival (OS) and time to next treatment (TNT). Survival analysis was performed to verify that the single response assessment was not inferior to the IMWG criteria, given the possibility of response overestimation. PFS, OS and TNT were estimated using the Kaplan-Meier method. The definitions of OS, PFS and TNT were identical to those reported in JCOG1105 (4), and all three were measured from the date of enrollment in JCOG1105. All statistical analyses were performed by the JCOG Data Center using SAS version 9.4 (SAS Institute, Cary, NC, USA).

Patient characteristics
In JCOG1105, efficacy analyses were performed in all 88 eligible patients (4,5). Among them, nine patients with a best overall response of PD or NE by the IMWG criteria were excluded from this supplementary analysis. As a result, 79 patients were evaluated. Figure 1 shows the patient-flow diagram of this study. Patient characteristics are shown in Table 1.
We also performed the same survival analysis for Arm A and Arm B. There was no difference in PFS (Supplemental Fig. S1), OS (Supplemental Fig. S2) or TNT (Supplemental Fig. S3) between the IMWG criteria and the single response assessment.
Table note: sCR, stringent complete response; CR, complete response; VGPR, very good partial response; PR, partial response; SD, stable disease; IMWG, International Myeloma Working Group; PD, progressive disease. a All response categories require two consecutive assessments made at any time.

Discussion
This exploratory analysis was performed to evaluate whether a single response assessment can substitute for the current IMWG criteria using data from a randomized phase II study that developed a more promising MPB regimen for TI-NDMM (JCOG1105). Herein, we demonstrated that the single response assessment may be used as a substitute for the current IMWG criteria with two consecutive assessments for TI-NDMM. The IMWG published uniform response criteria for clinical trials (2). These criteria eliminated the mandatory minimum 6-week interval to confirm the achievement of response because the 6-week response duration does not carry major clinical significance and is not a surrogate for the durability of response. However, they still required two consecutive assessments made at any time to eliminate laboratory or other errors. In 2016, the IMWG published new response criteria that retained two consecutive assessments of M protein levels (3). There have been no reports on the single response assessment of paraprotein levels and no discussion of whether two consecutive assessments are essential. Response criteria that require two consecutive assessments may underestimate the best response when a second assessment is absent. Indeed, 13 patients (16.4%) of 79 in this analysis were not evaluated in a second response assessment. For 5 of these 13 patients, the best response was recorded at cycle 9 (the last cycle), so the IMWG criteria may underestimate late responders because a second evaluation is lacking. A similar issue was reported in the SWOG S0777 randomized, open-label phase III trial of bortezomib, lenalidomide and dexamethasone (BLd) (9), in which the overall response rate to the BLd regimen was lower than that in the original BLd study (10) (81.5 vs. 100%). The authors suggested that one reason for this difference was that 20 patients (9%) in the BLd group who were not evaluated in a second response assessment had unconfirmed PR and were listed in the confirmed SD category.
Although careful judgment is required because M protein is the surrogate marker for abnormal plasma cells in MM patients, there is little evidence that two consecutive assessments are mandatory. Indeed, the level of agreement between the IMWG criteria and the single response assessment was sufficiently high (kappa coefficient of ≥0.7) in our analysis. Furthermore, PFS, OS and TNT showed no significant difference between the IMWG criteria and the single response assessment. The survival analysis was performed to verify that the single response assessment was not inferior to the IMWG criteria, given the possibility of response overestimation; in fact, in sCR/CR patients the single response assessment seemed to produce longer median PFS (3.8 vs. 2.9 years) and TNT (5.0 vs. 2.9 years) than the IMWG criteria. That patients achieving sCR/CR by the single response assessment had survival similar to that of patients achieving sCR/CR by the more rigorous IMWG criteria is an important finding. It supports the notion that a single response assessment might be sufficient as long as there are no mistakes in the specimens.
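The median PFS and TNT values compared above are read from Kaplan-Meier curves. As a minimal sketch of how such a median is obtained, the Python code below implements the product-limit estimator and returns the first time at which estimated survival falls to 0.5 or below. The function names `kaplan_meier` and `median_survival` are hypothetical, the follow-up times and event flags are invented for illustration (not JCOG1105 data), and the study itself used SAS rather than this code.

```python
from typing import List, Optional, Tuple


def kaplan_meier(times: List[float], events: List[bool]) -> List[Tuple[float, float]]:
    """Product-limit estimate: returns (time, S(time)) at each event time.

    events[i] is True if the event (e.g. progression) was observed at
    times[i], and False if the patient was censored at times[i].
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    survival = 1.0
    curve: List[Tuple[float, float]] = []
    i = 0
    while i < len(data):
        t = data[i][0]
        n_events = 0
        n_removed = 0
        # Group all patients sharing the same time point.
        while i < len(data) and data[i][0] == t:
            n_events += int(data[i][1])
            n_removed += 1
            i += 1
        if n_events > 0:
            survival *= 1.0 - n_events / at_risk
            curve.append((t, survival))
        # Events and censored patients both leave the risk set.
        at_risk -= n_removed
    return curve


def median_survival(curve: List[Tuple[float, float]]) -> Optional[float]:
    """First time at which S(t) <= 0.5, or None if the median is not reached."""
    for t, s in curve:
        if s <= 0.5:
            return t
    return None


# Hypothetical follow-up in years; False marks a censored patient.
toy_times = [1.0, 2.0, 2.5, 3.0, 4.0, 5.0]
toy_events = [True, False, True, True, False, True]

curve = kaplan_meier(toy_times, toy_events)
median = median_survival(curve)
```

With these toy values the estimated survival drops to 5/6 at 1.0 year, 0.625 at 2.5 years and about 0.417 at 3.0 years, so the estimated median is 3.0 years. Censored patients contribute to the risk set until their censoring time but do not lower the curve.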
Recent attempts have focused on the identification of residual tumor cells in the bone marrow using multi-color flow cytometry or next-generation sequencing (11). The IMWG has defined new response categories of minimal residual disease (MRD) negativity, for which two consecutive assessments are not required (3). MRD tests should be initiated only at the time of a suspected CR. Since the sCR/CR rate increased from 13.9% under the IMWG criteria to 21.5% under the single response assessment in this study, the single response assessment may increase the chance of measuring MRD. The importance of MRD measurement is likely to increase in the future, and treatment stratification by MRD is being considered. When considering treatment strategies such as shortening treatment in MRD-negative patients, the single response assessment allows more MRD testing to identify these patients.
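The response rates quoted above can be reproduced from the patient counts reported in the abstract. The short Python sketch below recomputes them from those counts and the 79 analyzed patients. The variable and function names are hypothetical, and rounding to one decimal place is an assumption about how the published percentages were derived.

```python
from typing import Dict

N_ANALYZED = 79  # patients included in this supplementary analysis

# Counts by response category, as reported in the abstract.
imwg_counts = {"sCR/CR": 11, "VGPR": 20, "PR": 36, "SD": 12}
single_counts = {"sCR/CR": 17, "VGPR": 19, "PR": 35, "SD": 8}


def to_percentages(counts: Dict[str, int], total: int) -> Dict[str, float]:
    """Convert counts to percentages of `total`, rounded to one decimal place."""
    return {category: round(100.0 * n / total, 1) for category, n in counts.items()}


imwg_pct = to_percentages(imwg_counts, N_ANALYZED)
single_pct = to_percentages(single_counts, N_ANALYZED)

# Additional patients classified as sCR/CR by the single response assessment.
extra_scr_cr = single_counts["sCR/CR"] - imwg_counts["sCR/CR"]
```

Under these assumptions the sCR/CR rate is 13.9% by the IMWG criteria and 21.5% by the single response assessment, a difference of six patients who would be classified as sCR/CR only under the single response assessment.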
There are several limitations in this supplementary analysis. First, JCOG1105 was not designed to analyze the new response criteria. Second, the number of patients analyzed was small. Finally, we could not conclude whether the single response assessment was also applicable to patients with transplant-eligible MM because all patients who were analyzed in this study were transplant-ineligible. However, to our knowledge, this is the first study addressing the utility of the single response assessment for MM. Recently, a suggestion for simplifying the IMWG criteria regarding the utility of repeating bone marrow biopsy for confirmation of CR was reported (12), and our study is another such attempt.
Figure caption: TNT by response status evaluated using the IMWG criteria (A) and the single response assessment (B). Patients with sCR/CR evaluated using the IMWG criteria (n = 11) and the single response assessment (n = 17) (C). Patients with VGPR evaluated using the IMWG criteria (n = 20) and the single response assessment (n = 19) (D). TNT, time to next treatment.
In conclusion, we found that the single response assessment could be a substitute for the current IMWG criteria with two consecutive assessments in patients with TI-NDMM. As this exploratory analysis included a limited number of patients, further investigation designed prospectively is necessary to confirm our results. We are also planning to validate the single response assessment for MM in the next phase III trial of JCOG-LSG (JCOG1911).

Supplementary Material
Supplementary material is available at Japanese Journal of Clinical Oncology online.