Propensity score regression analysis of oesophageal adenocarcinoma treatment with surgery alone or neoadjuvant chemotherapy

Background Propensity score (PS) regression analysis can be used to minimize differences between cohorts in order to perform comparisons. The aim of this study was to use PS analysis to examine the outcomes of oesophageal adenocarcinoma (OAC) treatment with surgery alone or neoadjuvant chemotherapy (NAC) followed by surgery (NACS), to see whether the benefits seen in a randomized trial (MRC OE02) were reproducible in a UK cancer network clinical practice.
Methods Consecutive patients undergoing potentially curative treatment for OAC in a regional cancer network were studied. Multiple regression models, including PS analysis, were developed to account for confounding factors. Primary outcome measures were disease‐free (DFS) and overall (OS) survival.
Results A cohort of 440 patients was included in a regression analysis controlling for confounders (176 surgery alone, 264 NACS). NACS was associated with a higher positive margin status rate compared with surgery alone (42·4 versus 26·7 per cent respectively; P < 0·001), an inferior 5‐year DFS rate (32·1 versus 56·9 per cent; P < 0·001) and a worse 5‐year OS rate (27·5 versus 47·3 per cent; P < 0·001). On regression adjustment based on propensity scores, NACS was not associated with DFS (P = 0·220) or OS (P = 0·431). The Mandard tumour regression grade (TRG) score was significantly associated with DFS (hazard ratio (HR) 0·21, 95 per cent c.i. 0·07 to 0·70) and OS (HR 0·27, 0·13 to 0·59). Five‐year DFS and OS rates related to TRG were 64 and 62 per cent respectively for 25 good responders versus 8·0 and 8·6 per cent for 127 poor responders (P < 0·001).
Conclusion The prescription of NAC to all patients with OAC risks delay in effective treatment of patients who are relatively chemoresistant, given the variability in pathological response. Identification of patients with OAC who may derive the most benefit from NAC should be the focus.


Introduction
The optimal treatment strategy for patients diagnosed with oesophageal adenocarcinoma (OAC) is controversial. As most patients have at least locoregional disease at presentation, multimodal therapy is used widely. UK guidance recommends neoadjuvant chemotherapy (NAC) followed by surgery (NACS) 1 , whereas chemoradiotherapy is more widely used in the neoadjuvant phase in many other European countries 2 . In North America, NAC followed by surgery is often accompanied by adjuvant postoperative chemotherapy or chemoradiotherapy 3 .
RCTs attempting to establish survival benefit for treatment with NACS compared with surgery alone have reported contrasting outcomes. In the two largest of these studies, the UK MRC OE02 trial 4 reported a 5-year survival rate of 23⋅0 per cent after NAC compared with 17⋅1 per cent after surgery alone (hazard ratio (HR) 0⋅82, 95 per cent c.i. 0⋅71 to 0⋅95; P = 0⋅03), whereas the US RTOG trial 8911 (US Intergroup 113) reported equivalence 5 . A Cochrane review 6 considered these two studies to be of high quality with low risk of bias, and concluded that, although NACS may offer a survival advantage over surgery alone, further research was required.
Propensity score (PS) analysis is being used increasingly to compare non-randomized cohorts 7 . A PS is the estimated probability of receiving a given treatment, conditional on a vector of observed baseline variables, and PS analysis is a powerful alternative to conventional case-mix adjustment for drawing causal inferences from observational data. Its strength lies in adjusting for the influence of confounding factors (baseline characteristics) on the independent variable (for example, treatment option). PSs are generated by a logistic regression model and aim to replace a group of baseline characteristics with one score. PSs can then be used in a number of analytical techniques, the most common being matching, stratification and regression adjustment. In this way, treatment arms can be balanced in terms of important co-variables, allowing a fair comparison of treatments to be made 8-12 . As much of the selection bias is adjusted for, PS analysis provides a scientifically sound alternative to RCTs in situations where interventions cannot be allocated randomly for ethical or practical reasons 8,9 .
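Of the techniques listed, matching is perhaps the simplest to illustrate. The following is a minimal sketch of greedy 1:1 nearest-neighbour matching within a caliper; it is not the method used in this study, which relied on regression adjustment. The function name, the caliper of 0⋅05 and the patient identifiers and scores are all hypothetical.

```python
def match_on_propensity(treated, control, caliper=0.05):
    """Greedy 1:1 nearest-neighbour matching without replacement.

    treated, control: dicts mapping a patient id to a propensity score.
    Returns a list of (treated_id, control_id) pairs.
    """
    available = dict(control)  # controls not yet matched
    pairs = []
    # Work through treated patients in ascending score order for reproducibility.
    for t_id, t_ps in sorted(treated.items(), key=lambda kv: kv[1]):
        if not available:
            break
        # Closest remaining control on the propensity score.
        c_id = min(available, key=lambda k: abs(available[k] - t_ps))
        # Accept the match only if it lies within the caliper.
        if abs(available[c_id] - t_ps) <= caliper:
            pairs.append((t_id, c_id))
            del available[c_id]
    return pairs


# Hypothetical scores, for illustration only.
treated = {"T1": 0.62, "T2": 0.35, "T3": 0.90}
control = {"C1": 0.33, "C2": 0.60, "C3": 0.10}
pairs = match_on_propensity(treated, control)
print(pairs)  # [('T2', 'C1'), ('T1', 'C2')]
```

Patient T3 remains unmatched because no control lies within the caliper. Matching discards such patients, whereas regression adjustment retains the whole cohort.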
The aim of this study was to examine the outcomes of OAC treatment with surgery alone or with NACS, by means of PS regression analysis, to see whether the benefits suggested in the MRC OE02 trial were reproducible in contemporary clinical practice in a UK regional cancer network.

Methods
The study included consecutive patients diagnosed with potentially curable oesophageal cancer of adenocarcinoma cell type between 1 January 2003 and 30 June 2018, by a regional multidisciplinary team serving a population of 1⋅76 million. Clinical and pathological information was collected prospectively. Preoperative staging involved CT, endoluminal ultrasonography (EUS) and laparoscopy, if appropriate. For all patients diagnosed from 2009 onwards, CT-PET has been incorporated routinely. All staging was done in accordance with the UICC TNM seventh edition 13 . Pathological response to chemotherapy was determined using the Mandard tumour regression grade (TRG) score 14 , and was recorded from pathology reports issued at the time of resection. EUS examinations were performed or supervised by one of two radiologists.
Ethical approval was sought from the regional ethics committee, but the chair confirmed that individual patient consent was not required to report clinical outcomes alone and thus no formal approval was necessary.

Surgery with or without neoadjuvant chemotherapy
Before the publication of the initial OE02 results in 2002 4 , the main curative treatment for these patients was primary surgery. However, after this, fit patients with T3 and equivocal T4, N0 and N1 tumours were generally treated with neoadjuvant therapy before surgery 15 . The majority of these patients received two cycles of cisplatin 80 mg/m² and 5-fluorouracil (5-FU) 1000 mg/m² for 4 days. A minority received four cycles of epirubicin 50 mg/m², cisplatin 60 mg/m² and 5-FU 200 mg/m² or capecitabine 625 mg/m². Other slightly altered regimens were used, depending on patient co-morbidity or adverse reactions. CT, after the final dose of NAC and before surgery, was used to establish tumour response to NAC 16 .
Patients treated with neoadjuvant chemoradiotherapy were excluded. Patients with radiologically perceived T1-2, N0 disease, and those considered unsuitable to receive chemotherapy because of other co-morbidities, were offered surgery alone.
Most patients had transthoracic oesophagectomy (TTO) as described by Tanner 17 and Lewis 18 . Transhiatal oesophagectomy (THO), as described by Orringer 19 , was used selectively in patients with adenocarcinoma of the lower third of the oesophagus who had significant cardiorespiratory co-morbidity. Some patients with type 2 junctional cancers who underwent an extended total gastrectomy were also included. Oesophageal resection was defined as potentially curative when all visible tumour had been removed. Involvement of the circumferential resection margin was defined as the presence of tumour less than 1 mm from the circumferential margin 20 .

Follow-up and disease recurrence
All patients were reviewed every 3 months for the first year after oesophagectomy, and every 6 months thereafter. Disease recurrence was based on clinical suspicion and confirmed by radiological investigation or endoscopy. Patterns of recurrence were defined as locoregional, distant (metastatic), or both locoregional and distant when both were diagnosed at the same time. The time of recurrence was taken as the date of the confirmatory investigation. Death certification was obtained from the Office for National Statistics.

Statistical analysis
Sample size calculations were based on a prestudy literature survey of Cancer Research UK cancer statistics 21 , which indicated that the baseline 5-year survival rate in patients diagnosed with stage II OAC was expected to be 40 per cent, compared with 20 per cent in patients with stage III OAC, and that a 15 per cent difference in survival would be a realistic expectation. Thus, a minimum of 276 patients were needed to provide 80 per cent power to detect such a difference with P < 0⋅050.
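The formula behind the sample size is not stated. As an illustrative check, the sketch below applies a standard two-proportion formula (normal approximation, two-sided α of 0⋅05, pooled variance under the null hypothesis). The choice of 20 versus 35 per cent as the compared survival rates is an assumption made here, not a detail taken from the study; under it, the formula gives 138 patients per group, or 276 in total.

```python
from math import ceil, sqrt
from statistics import NormalDist


def n_per_group(p1, p2, alpha=0.05, power=0.80):
    """Patients per arm needed to compare two proportions (normal approximation)."""
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # two-sided critical value
    z_beta = NormalDist().inv_cdf(power)
    p_bar = (p1 + p2) / 2  # pooled proportion under the null hypothesis
    numerator = (
        z_alpha * sqrt(2 * p_bar * (1 - p_bar))
        + z_beta * sqrt(p1 * (1 - p1) + p2 * (1 - p2))
    ) ** 2
    return ceil(numerator / (p1 - p2) ** 2)


# Assumed inputs: 20 per cent baseline survival and a 15-point absolute difference.
n = n_per_group(0.20, 0.35)
total = 2 * n
print(n, total)  # 138 276
```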
PSs were generated using a logistic regression model, and included all relevant independent variables thought to be potential confounding factors. These were considered by the regional multidisciplinary team, and comprised patient demographics (age above 70 years and sex) and clinical staging (cTNM) based on radiological assessment of T and N status 22 . Generated PSs were then used in a regression adjustment to estimate the effect of the exposure to treatment on disease-free (DFS) and overall (OS) survival.
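The PS model described above can be sketched as a logistic regression of treatment received on the baseline covariates. The example below is a minimal pure-Python illustration fitted by gradient ascent; the study itself used SPSS. The binary coding of the covariates (age above 70 years, male sex, cT3 or above, cN positive) and every data row are hypothetical.

```python
from math import exp


def sigmoid(z):
    return 1.0 / (1.0 + exp(-z))


def fit_logistic(X, y, lr=0.1, n_iter=5000):
    """Fit logistic regression by batch gradient ascent on the log-likelihood.

    Returns the coefficients as [intercept, b1, b2, ...].
    """
    n, k = len(X), len(X[0])
    beta = [0.0] * (k + 1)
    for _ in range(n_iter):
        grad = [0.0] * (k + 1)
        for xi, yi in zip(X, y):
            p = sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))
            err = yi - p
            grad[0] += err
            for j, x in enumerate(xi):
                grad[j + 1] += err * x
        # Step in the direction of the averaged gradient.
        beta = [b + lr * g / n for b, g in zip(beta, grad)]
    return beta


def propensity(beta, xi):
    """Estimated probability of receiving NACS given covariates xi."""
    return sigmoid(beta[0] + sum(b * x for b, x in zip(beta[1:], xi)))


# Hypothetical rows: (age > 70, male, cT3 or above, cN positive); 1 = NACS, 0 = surgery alone.
X = [(0, 1, 1, 1), (0, 1, 1, 0), (0, 0, 1, 1), (1, 1, 1, 1), (0, 1, 0, 0),
     (1, 0, 0, 0), (1, 1, 0, 0), (0, 0, 1, 0), (0, 1, 0, 1), (1, 1, 1, 0),
     (0, 0, 0, 0), (1, 0, 1, 1), (0, 1, 1, 1), (1, 1, 0, 1)]
y = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 0, 0, 0, 1]

beta = fit_logistic(X, y)
scores = [propensity(beta, xi) for xi in X]
mean_nacs = sum(s for s, t in zip(scores, y) if t == 1) / y.count(1)
mean_surgery = sum(s for s, t in zip(scores, y) if t == 0) / y.count(0)
print(round(mean_nacs, 2), round(mean_surgery, 2))
```

Each patient's score would then enter the outcome model as a covariate alongside the treatment indicator, which is the regression adjustment described above.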
Complete case analysis was based on intention to treat, and the primary outcome measures were DFS and OS. Secondary outcome measures included OAC recurrence and postoperative morbidity. Grouped data were expressed as median (i.q.r.) values, and non-parametric statistical methods were used. Continuous data were compared using the Mann-Whitney U test, and categorical data using the χ² test, or Fisher's exact test when the number of events was low. DFS for all patients was calculated using methodology similar to that of both the MRC OE02 and US Intergroup randomized trials, by measuring the period from a landmark time of 6 months after diagnosis to the date of recurrence; this allowed for the variable interval between diagnosis and surgery, which depended on whether NAC was prescribed 4 . As in the above trials, events resulting in a failure to complete curative treatment, such as not proceeding to surgery, open and close laparotomy, palliative resection and in-hospital mortality, were assumed to have occurred at this landmark time, to maintain the intention-to-treat analysis. OS was measured from the date of diagnosis to date of death or censorship, whichever occurred first. Cumulative survival was calculated according to the Kaplan-Meier method, and differences between groups were analysed with the log rank test. Univariable analyses were done initially to examine factors influencing survival, and those with associations found to be statistically significant (P < 0⋅050) were retained in a Cox proportional hazards model. Cox models (controlling for PS) were used to estimate the effect of the treatment on the outcomes, DFS and OS. Data analysis was performed using SPSS ® version 25 (IBM, Armonk, New York, USA).
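The Kaplan-Meier product-limit estimate multiplies, at each event time, the proportion of at-risk patients who survive that time. The sketch below is a minimal illustration, assuming follow-up times in months that have already been measured from the appropriate origin; the function name and the eight follow-up records are hypothetical.

```python
def kaplan_meier(times, events):
    """Kaplan-Meier estimate of survival.

    times: follow-up times; events: 1 = event observed, 0 = censored.
    Returns [(time, survival probability)] at each distinct event time.
    """
    data = sorted(zip(times, events))
    at_risk = len(data)
    surv = 1.0
    curve = []
    i = 0
    while i < len(data):
        t = data[i][0]
        deaths = 0
        removed = 0
        # Collect everyone whose follow-up ends at time t.
        while i < len(data) and data[i][0] == t:
            deaths += data[i][1]
            removed += 1
            i += 1
        if deaths:
            surv *= 1 - deaths / at_risk
            curve.append((t, surv))
        # Events and censored patients both leave the risk set after time t.
        at_risk -= removed
    return curve


# Hypothetical follow-up (months), for illustration only.
times = [6, 12, 12, 20, 30, 45, 60, 60]
events = [1, 1, 0, 1, 0, 1, 0, 0]
curve = kaplan_meier(times, events)
for t, s in curve:
    print(t, round(s, 3))
```

Censored patients contribute to the risk set up to their last follow-up but never reduce the survival estimate directly.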

Results
Variation in clinicopathological factors and perioperative outcomes
Details of the 440 patients related to treatment modality are shown in Table 1. The operative approach in the surgery-alone compared with the NACS cohort was TTO in 42 (23⋅9 per cent) and 146 (55⋅3 per cent) patients respectively (P < 0⋅001), and THO in 114 (64⋅8 per cent) and 68 (25⋅8 per cent) (P < 0⋅001). Open and close laparotomy was performed in nine patients (5⋅1 per cent) in the surgery-alone cohort, compared with 38 (14⋅4 per cent) in the NACS cohort (P = 0⋅002).
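The comparison of open and close laparotomy rates can be checked with a Pearson χ² test on the 2 × 2 table of counts (9 of 176 versus 38 of 264). The sketch below assumes the uncorrected χ² statistic with one degree of freedom; the text does not say whether χ² or Fisher's exact test was applied to this comparison, and the function name is illustrative.

```python
from math import erfc, sqrt


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square test (no continuity correction) for a 2 x 2 table.

    Table layout: [[a, b], [c, d]]. Returns (statistic, P value) with 1 d.f.
    """
    n = a + b + c + d
    observed = [[a, b], [c, d]]
    row_totals = [a + b, c + d]
    col_totals = [a + c, b + d]
    chi2 = 0.0
    for i in range(2):
        for j in range(2):
            expected = row_totals[i] * col_totals[j] / n
            chi2 += (observed[i][j] - expected) ** 2 / expected
    # Upper tail of the chi-square distribution with one degree of freedom.
    p_value = erfc(sqrt(chi2 / 2))
    return chi2, p_value


# Rows: surgery alone, NACS; columns: open and close laparotomy, all other patients.
chi2, p = chi_square_2x2(9, 176 - 9, 38, 264 - 38)
print(f"chi2 = {chi2:.2f}, P = {p:.3f}")  # chi2 = 9.53, P = 0.002
```

Under these assumptions the result agrees with the reported P = 0⋅002.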

Discussion
The principal findings of this study were that, following application of PS adjustment, DFS and OS were comparable for surgery and NAC followed by surgery, both clinically and statistically. Operative morbidity and mortality were higher after surgery alone, but this difference was not statistically significant.
This study has several limitations. It was not a randomized trial, rendering it vulnerable to selection bias and confounding by case mix. Groups of patients were unbalanced in terms of age and stage of disease. Data were from a single regional network. As expected, the process and strategy of radiological staging have developed and improved over the 15 years of the study; CT equipment has advanced and therefore the quality of staging may have been inconsistent. In contrast, the EUS equipment used was not upgraded during the study period. The implementation of propensity scoring in regression analyses of DFS and OS means that much of the selection bias has been accounted for.
Despite the advantages of PS analysis, the methodology still has limitations, principally the inability to adjust for unknown confounding factors 10 , as well as the assumption that the relationship between the PS and the outcome has been modelled correctly 9 . Consequently, the implementation of PS analysis in observational studies does not negate the need for randomized trials, but rather emphasizes the advantages associated with randomization. In clinical situations where randomization may be impractical, PS analysis is theoretically a way of minimizing bias to obtain results that may approach the level of evidence provided by the rigorous methodology of an RCT 9,23 .
PS analysis has two other important strengths. First, although multivariable model analyses have traditionally been the preferred statistical method for assessing the effect of a predictor variable on outcomes after controlling for baseline characteristics, their appropriateness depends on consistency with several assumptions underlying any given model. PS analysis has proved to be the most useful statistical method for controlling confounders, providing appropriate estimates even when faced with situations of extreme correlation between the confounders and the exposure 24 . Second, PS analysis is well suited to situations in which several risk-adjusted outcomes are under assessment (here DFS and OS), because, once calculated, the score can be used for each outcome separately.
Allied to PS analysis, this study has additional strengths, in that it is a large study from a regional cancer network, with a well audited practice 16 . Accurate state-of-the-art radiological staging was utilized by means of PET-CT (from 2008) and EUS (all patients) 11 . No patients were lost to follow-up and dates of death were obtained from the Office for National Statistics, making the survival data especially robust.
A number of RCTs have compared NAC followed by surgery with surgery alone for OAC, but conclusions have differed. The MRC OE02 trial 4 randomized 802 patients to either treatment arm (9 per cent of patients in each arm also had radiotherapy) and reported significantly improved survival in the NACS arm, with 2- and 5-year survival rates of 34 and 17 per cent after surgery alone, and 43 and 23 per cent after NACS. The RTOG trial 8911 5 , however, reported equivalence, with 2- and 5-year survival rates of 60 and 20⋅7 per cent after surgery alone, and 59 and 19⋅4 per cent after NACS, consistent with the present PS analysis. The RTOG trial 8911 did, however, describe a highly significant improvement in survival for patients in the NACS cohort who demonstrated (by way of barium study) a significant response to chemotherapy. Although it was reported 5 that only 19 per cent of patients in the NACS cohort had major objective disease regression, these patients also received postoperative therapies. Those who did not respond had around a 10 per cent poorer survival than patients who had surgery alone. Response to NAC is heterogeneous, with TRG score correlating with survival, but only around 16⋅0 per cent of patients benefit from a good response to NAC, which translates into improved DFS and OS 25 .
The present study has demonstrated similar survival after surgery alone and NAC followed by surgery in patients with OAC, in keeping with the US RTOG trial 5 , but in contrast to the UK MRC OE02 trial 4 . The small subset of patients whose disease responded significantly to NAC (16⋅4 per cent) nevertheless did have improved survival, compared with those with a poor response. With recent advances in chemotherapy 26 and the addition of radiotherapy in more recent studies, the best treatment for potentially resectable OAC 27 remains elusive. For new trials to provide definitive answers, the identification of patients who derive benefit from neoadjuvant therapies, and the development of alternative strategies for those who do not, remain important.