Filgotinib decreases both vertebral body and posterolateral spine inflammation in ankylosing spondylitis: results from the TORTUGA trial

Abstract
Objectives: To assess the effects of filgotinib on inflammatory and structural changes at various spinal locations, based on MRI measures in patients with active AS in the TORTUGA trial.
Methods: In the TORTUGA trial, patients with AS received filgotinib 200 mg (n = 58) or placebo (n = 58) once daily for 12 weeks. In this post hoc analysis, spine MRIs were evaluated using the Canada–Denmark (CANDEN) MRI scoring system to assess changes from baseline to week 12 in total spine and subscores for inflammation, fat, erosion and new bone formation (NBF) at various anatomical locations. Correlations were assessed between CANDEN inflammation and clinical outcomes and Spondyloarthritis Research Consortium of Canada (SPARCC) MRI scores and between baseline CANDEN NBF and baseline BASFI and BASMI scores.
Results: MRIs from 47 filgotinib- and 41 placebo-treated patients were evaluated. There were significantly larger reductions with filgotinib vs placebo in total spine inflammation score and most inflammation subscores, including posterolateral elements (costovertebral joints, transverse/spinous processes, soft tissues), facet joints and vertebral bodies. No significant differences were observed for corner or non-corner vertebral body inflammation subscores, spine fat lesion, bone erosion or NBF scores. In the filgotinib group, the change from baseline in the total inflammation score correlated positively with the SPARCC spine score. Baseline NBF scores correlated with baseline BASMI but not BASFI scores.
Conclusions: Compared with placebo, filgotinib treatment was associated with significant reductions in MRI measures of spinal inflammation, including in vertebral bodies, facet joints and posterolateral elements.
Trial registration: ClinicalTrials.gov (https://clinicaltrials.gov), NCT03117270.


Rheumatology key messages
• Filgotinib significantly reduced spinal inflammation in diverse spinal locations when compared with placebo.
• In particular, filgotinib reduced inflammation in the facet joints and posterolateral elements.
• Filgotinib ameliorates inflammation in spinal structures that are highly relevant to spinal function and mobility.

Introduction
Axial SpA (axSpA) is a chronic inflammatory condition involving the axial joints and entheses that can lead to chronic pain, structural damage and disability [1,2]. AS is considered a subset of axSpA. There is substantial overlap in the clinical definitions of classic AS (based on the modified New York criteria) and radiographic axSpA (r-axSpA; based on the Assessment of SpondyloArthritis International Society criteria), such that the two terminologies largely identify the same group of patients [3,4]. Both AS and r-axSpA are characterized by sacroiliitis on conventional radiographs [3].
MRI is the optimal imaging modality for evaluating inflammatory changes in AS [5,6]. Current methods for quantifying inflammation of the spine are highly discriminatory between active therapy and placebo, but focus on lesions in the vertebral bodies. It is assumed that inflammatory lesions in posterolateral elements and facet joints respond similarly to therapeutic intervention, but there are no data from placebo-controlled trials to confirm this. Moreover, inflammation at these locations may significantly affect spinal mobility and function, and the impact of new therapies should therefore also include an evaluation of inflammation in these regions. While inflammation is known to be a predictor of the development of structural lesions in patients with AS [7][8][9], the development of fat lesions has also been associated with structural lesion development [10][11][12]. Measurement of fat lesions in addition to inflammatory lesions may therefore be important when assessing the efficacy of potential AS therapies. The Canada-Denmark (CANDEN) MRI scoring system allows comprehensive semi-quantitative assessment of inflammation, fat, erosion and new bone formation (NBF; i.e. bone spurs and ankylosis) of the spine [13][14][15][16]. In contrast to other scoring systems, CANDEN MRI allows evaluation by anatomical location and includes all the spinal regions that can be affected in AS [13][14][15].
Treatment options for patients with AS who do not respond to NSAIDs currently comprise TNF-α inhibitors, the IL-17 inhibitors secukinumab and ixekizumab and the recently approved Janus kinase (JAK) inhibitor upadacitinib [1,17,18]. JAKs are central transmitters of pro- and anti-inflammatory cytokine signals in immune cells and are therefore interesting targets for immunomodulation [19]. Filgotinib, an oral JAK1 preferential inhibitor, reduced disease activity and improved symptoms in patients with active AS in the phase 2 TORTUGA trial (NCT03117270) [20]. In the TORTUGA trial, filgotinib significantly improved Spondyloarthritis Research Consortium of Canada (SPARCC) [21,22] MRI inflammation scores (bone marrow oedema) in the vertebral bodies and SI joints compared with placebo [20]. However, the effects of JAK inhibitors, including filgotinib, on structural lesions in active AS are unknown and their impact on inflammation in the posterolateral part of the spine, e.g., the facet joints, the entheses of transverse and spinous processes and the surrounding soft tissues, has not been investigated.
The aim of this post hoc analysis was to evaluate the effects of filgotinib on spinal lesions, focussing on inflammatory and fat lesions in different anatomical locations of the spine in patients from the TORTUGA trial.

Study design
The design of the TORTUGA trial, a multicentre, double-blind, randomized trial, has been reported previously [20]. Briefly, 116 adults with active AS (as per the modified New York classification criteria, with sacroiliitis confirmed by central reading) and inadequate response or intolerance to two or more NSAIDs were treated with oral filgotinib 200 mg (n = 58) or placebo (n = 58) once daily for 12 weeks. Prior use of one TNF inhibitor was permitted (in up to 30% of enrolled patients). Patients were recruited at sites in seven countries: Belgium, Bulgaria, Czech Republic, Estonia, Poland, Spain and Ukraine. The study protocol was reviewed and approved by the central or individual independent ethics committee in each participating country (Supplementary Table S1, available at Rheumatology online). All patients provided written informed consent.

CANDEN MRI scoring
During the TORTUGA trial, MRIs were conducted at baseline and week 12 (or at the early discontinuation visit). Semi-coronal T1-weighted and short tau inversion recovery MRI sequences were independently evaluated post hoc by two experts (blinded to time point and assigned treatment) according to the detailed anatomy-based CANDEN MRI method (www.carearthritis.com/mriportal/canden/index/) [13][14][15]. The CANDEN MRI scoring system provides overall scores for spine inflammation, fat, bone erosion and NBF in the cervical, thoracic and lumbar segments on sagittal slices of the spine (Supplementary Fig. S1, available at Rheumatology online) [15,16]. Vertebral body lesions are assessed in each of 23 discovertebral units (DVUs), each unit defined by the area between horizontal lines drawn across the middle of the vertebral bodies of adjacent vertebrae. This area includes the intervertebral disc, vertebral endplates on each side of the disc and adjacent bone marrow. Vertebral body lesions are documented according to their presence in central and lateral sagittal slices. Lesions are also recorded in the facet joints, spinous processes and soft tissues at all 23 vertebral levels, in transverse processes at 17 levels from T1 to L5 and in the rib at 12 levels from T1 to T12. If a lesion is absent, a score of 0 is applied; if a lesion is present, a score of 1 or 2 is applied depending on the lesion type (a score of 6 is applied for corner and non-corner ankylosis). Additional scores of 1 or 2 are added for certain large lesions [16].
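The level counts quoted above can be checked with a short Python sketch. It is illustrative only: the text does not say which vertebrae make up the 23 levels, so the block assumes they run from C2 to L5 (6 cervical, 12 thoracic and 5 lumbar), and all names in it are hypothetical.

```python
# Assumption (not stated in the text): the 23 vertebral levels run from C2 to L5.
CERVICAL = [f"C{i}" for i in range(2, 8)]    # C2..C7, 6 levels
THORACIC = [f"T{i}" for i in range(1, 13)]   # T1..T12, 12 levels
LUMBAR = [f"L{i}" for i in range(1, 6)]      # L1..L5, 5 levels
ALL_LEVELS = CERVICAL + THORACIC + LUMBAR

# Posterior and posterolateral sites and the levels at which each is scored,
# following the description in the text.
SITES = {
    "facet_joints": ALL_LEVELS,
    "spinous_processes": ALL_LEVELS,
    "soft_tissues": ALL_LEVELS,
    "transverse_processes": THORACIC + LUMBAR,  # T1 to L5
    "ribs": THORACIC,                           # T1 to T12
}


def site_counts(sites=SITES):
    """Return the number of scored levels for each anatomical site."""
    return {name: len(levels) for name, levels in sites.items()}
```

Under this assumption, the counts reproduce the 23, 17 and 12 levels given in the text.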
The CANDEN MRI spine fat score has a total scoring range of 0-510 and can be divided into vertebral body and posterior element (facet joints) fat subscores [16]. The CANDEN MRI bone erosion score has a scoring range of 0-208 and comprises vertebral body and posterior element (facet joints) erosion subscores. The CANDEN MRI NBF total score ranges from 0 to 460 and comprises vertebral body and posterior element (facet joints) subscores [16].

Outcome measures
Endpoints of this post hoc analysis included change from baseline to week 12 in CANDEN MRI total spine scores for inflammation, fat, bone erosion and NBF and also spine inflammation and fat subscores. Correlations were assessed between the change in CANDEN MRI inflammation total spine and subscores and the change in clinical outcomes and between the baseline CANDEN MRI NBF score (total, vertebral body and facet joints) and baseline functional (BASFI) and mobility (BASMI) measures.

Statistical analyses
CANDEN MRI scores were treated as continuous variables and observed changes from baseline were evaluated using analysis of covariance with factors for treatment, baseline value and randomization stratification by prior TNF inhibitor use. Least squares mean changes from baseline and between-group differences with 95% CIs were calculated; P-values were nominal.
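As a rough illustration of this model, the following Python sketch fits change from baseline on treatment, baseline value and prior TNF inhibitor use by ordinary least squares. In a model without interaction terms, the treatment coefficient equals the between-group difference in least squares means. The function names are hypothetical, the software used in the trial is not stated in the text, and CIs and P-values are omitted.

```python
from typing import List, Sequence


def solve_linear_system(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("design matrix is singular")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):  # back substitution
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def ancova_treatment_effect(
    change: Sequence[float],
    treated: Sequence[int],    # 1 = filgotinib, 0 = placebo
    baseline: Sequence[float],
    prior_tnf: Sequence[int],  # 1 = prior TNF inhibitor use, 0 = none
) -> float:
    """Return the adjusted between-group difference in change from baseline."""
    rows = [
        [1.0, float(t), float(b), float(p)]
        for t, b, p in zip(treated, baseline, prior_tnf)
    ]
    k = 4  # intercept, treatment, baseline, prior TNF inhibitor use
    # Normal equations: (X'X) beta = X'y
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * y for r, y in zip(rows, change)) for i in range(k)]
    coefficients = solve_linear_system(xtx, xty)
    return coefficients[1]  # treatment coefficient
```

A negative return value corresponds to a larger reduction in the treated group than in the placebo group after adjustment for the covariates.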
Spearman correlations were determined between the change from baseline in CANDEN MRI total spine and subregion inflammation scores and the change from baseline in the following clinical outcomes: CRP, AS Disease Activity Score, BASDAI, BASFI, BASMI, lumbar flexion, chest expansion, SPARCC MRI SI joint inflammation and SPARCC MRI spine inflammation 23-DVU score (changes in SPARCC MRI SI joint and spine inflammation scores were assessed as secondary endpoints in the TORTUGA trial [20]). Pearson correlations were determined between the baseline CANDEN MRI NBF score (total, vertebral body and facet joints) and baseline BASFI and BASMI scores.
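The two correlation measures named above differ only in that Spearman correlation is the Pearson correlation of the ranked data. A minimal Python sketch, with hypothetical function names, tied values given their average rank and P-values not computed:

```python
import math
from typing import List, Sequence


def average_ranks(values: Sequence[float]) -> List[float]:
    """Rank values from 1..n, giving tied values the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions i..j share this rank
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def pearson(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def spearman(x: Sequence[float], y: Sequence[float]) -> float:
    """Spearman rank correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))
```

Because it works on ranks, the Spearman coefficient is insensitive to the skewed distributions that are common for change scores, which may be why it was chosen for the change-from-baseline analyses.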
The mean of the two reader scores was used to compare changes in total spine and regional inflammation scores between treatment groups; interreader intraclass correlation coefficients (ICCs) were calculated to assess the consistency and reliability of scoring between the two MRI readers, using the ICC(2,1) model. As prespecified, interreader discrepancies were resolved by an independent adjudicator if one reader determined a case was unreadable or if the change from baseline in CANDEN spine inflammation, fat lesion or erosion score differed between the two primary reviewers by ≥6 points in different directions (one reader detected an improvement, the other detected a worsening) or by ≥15 points in the same direction (both detected either improvement or worsening). Cut-offs for CANDEN scores triggering adjudication were based on the estimated smallest detectable change for the CANDEN total spine inflammation score derived in a previously reported placebo-controlled trial of adalimumab, in which the CANDEN score was used to evaluate treatment responses [16]. Final scores for cases requiring adjudication were calculated from the mean of the adjudicator's score and the closest score of the two primary readers.
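The adjudication rule can be written as a small decision function. The Python sketch below is one reading of the rule, not the trial's actual procedure: the function names are hypothetical, and a change of exactly zero from one reader is assumed to fall under the same-direction threshold, which the text does not specify.

```python
def needs_adjudication(
    change_reader1: float,
    change_reader2: float,
    readable1: bool = True,
    readable2: bool = True,
) -> bool:
    """Decide whether a case goes to the independent adjudicator."""
    # Any case that either reader judged unreadable is adjudicated.
    if not (readable1 and readable2):
        return True
    gap = abs(change_reader1 - change_reader2)
    # Opposite signs: one reader saw improvement, the other worsening.
    opposite_directions = change_reader1 * change_reader2 < 0
    if opposite_directions:
        return gap >= 6
    # Assumption: a zero change is handled with the same-direction cut-off.
    return gap >= 15


def final_score(reader1: float, reader2: float, adjudicator: float) -> float:
    """Mean of the adjudicator's score and the closer primary reader score."""
    closest = min((reader1, reader2), key=lambda s: abs(s - adjudicator))
    return (adjudicator + closest) / 2
```

For example, change scores of -4 and +3 differ by 7 points in opposite directions and would trigger adjudication, whereas -10 and -4 differ by 6 points in the same direction and would not.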

Patient characteristics
MRI scans from 88 patients with an evaluable MRI at baseline and week 12 (or early termination visit) were evaluated (filgotinib, n = 47; placebo, n = 41) in this post hoc analysis. Baseline characteristics were generally similar between these patients and those from the TORTUGA trial who had been excluded from the present analysis because of missing MRI scans (Supplementary Table S2, available at Rheumatology online).
In patients with MRI scans, the mean duration of AS was longer in those on placebo than those on filgotinib (7.7 vs 5.3 years, respectively; Table 1). The mean baseline total spine inflammation score was higher in the filgotinib group than the placebo group, while the mean baseline NBF score was lower in the filgotinib group vs the placebo group (Table 1). In the filgotinib and placebo groups, 95.7% and 85.4%, respectively, had an NBF score of <100, 2.1% and 7.3% had a score of 100-<150 and 2.1% and 7.3% had a score of ≥150. The mean baseline vertebral body and facet joints CANDEN NBF scores, according to baseline subgroups for CANDEN total NBF score, are presented in Supplementary Table S3, available at Rheumatology online.

Change in CANDEN MRI scores
Total spine inflammation scores decreased from baseline in the filgotinib group but not in the placebo group (P < 0.001 for between-group difference; Table 2); this finding was supported by the corresponding cumulative probability plot (Fig. 1A). There were significantly greater reductions with filgotinib vs placebo in most spine inflammation subscores, including the posterior elements inflammation subscore (P = 0.006), posterolateral inflammation subscore (P = 0.007), vertebral body inflammation subscore (P = 0.009) and facet joints inflammation subscore (P = 0.026; Table 3). An example of reduced facet joints inflammation following treatment with filgotinib is shown in Fig. 2. No statistically significant between-group differences were observed in the change from baseline in vertebral body corner or non-corner (spondylodiscitis) inflammatory lesion subscores (Table 3). These findings were supported by cumulative probability plots (Fig. 1B-G). Total spine fat lesion scores numerically increased from baseline in the filgotinib group but decreased in the placebo group (P = 0.088 for between-group difference; Table 2). The between-group difference for changes in spine fat subscores did not reach statistical significance (Table 3). There were no statistically significant differences between groups for changes in total spine bone erosion (P = 0.20) or NBF (P = 0.39) scores (Table 2).

Interreader reproducibility
Interreader reproducibility data indicated strong agreement between the two readers for CANDEN MRI scores at baseline, with ICC values >0.50 in 12 of the 14 scores assessed, 7 of which were >0.75 (Table 4). For the change from baseline to week 12, ICC values >0.50 and >0.75 were recorded for 5 and 1 of the 14 scores, respectively (Table 4).
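For readers unfamiliar with the ICC used here, the following Python sketch computes a two-way random-effects, absolute-agreement, single-rater coefficient from the mean squares of a subjects-by-readers table. It assumes that the "ICC 2.1 model" named in the statistical methods is the Shrout and Fleiss ICC(2,1) form; the function name and input layout are hypothetical.

```python
from typing import Sequence


def icc_2_1(scores: Sequence[Sequence[float]]) -> float:
    """ICC(2,1) for a table with one row per subject and one column per reader."""
    n = len(scores)     # number of subjects (patients)
    k = len(scores[0])  # number of readers
    grand_mean = sum(sum(row) for row in scores) / (n * k)
    row_means = [sum(row) / k for row in scores]
    col_means = [sum(row[j] for row in scores) / n for j in range(k)]

    # Sums of squares for subjects, readers, total and residual error.
    ss_rows = k * sum((m - grand_mean) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand_mean) ** 2 for m in col_means)
    ss_total = sum((x - grand_mean) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols

    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_error = ss_error / ((n - 1) * (k - 1))

    return (ms_rows - ms_error) / (
        ms_rows + (k - 1) * ms_error + k * (ms_cols - ms_error) / n
    )
```

Because the between-subject mean square appears in both numerator and denominator, the coefficient shrinks when patients differ little from one another, even if the readers disagree by only a few points.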

Correlation between the change in CANDEN MRI inflammation spine scores and change in clinical measures
In these exploratory post hoc analyses, in the filgotinib group, the change from baseline to week 12 in the CANDEN total MRI inflammation score correlated positively with the SPARCC MRI spine score (Supplementary Table S4, available at Rheumatology online). In the placebo group, the change from baseline to week 12 in the CANDEN total MRI inflammation score, facet joints subscore and posterolateral inflammation subscore each correlated positively with SPARCC MRI spine scores (r = 0.33, P = 0.035; r = 0.40, P = 0.010; and r = 0.37, P = 0.016, respectively), while the change in the facet joints subscore correlated negatively with lumbar flexion (r = −0.41, P = 0.009; Supplementary Table S4). However, it should be noted that P-values were not corrected for multiple testing.

These data highlight the additional information obtained through the CANDEN MRI inflammation score compared with the more established scoring systems such as the AS spine MRI (ASspiMRI) inflammation [23], SPARCC [21] and Berlin methods [24], which do not incorporate assessments at different anatomical locations. As such, using the CANDEN MRI score in future research could help to identify patient subgroups with different disease trajectories and allow evaluation of the relationship between different lesion types over time, as well as the impact of therapy on this relationship [16].

Correlation between baseline CANDEN NBF scores and baseline functional and mobility measures
The results from the current analysis on vertebral body inflammation, as assessed using the CANDEN scoring system, are in accordance with findings from the SPARCC analysis from the TORTUGA trial, which have been previously reported [20]. There was a slight difference between treatment groups with regard to fat lesions. Fat lesion development is a predictor of NBF in the spine in patients treated with TNF inhibitors [10][11][12]. However, the pathological basis of the transition from fat to new bone is not well understood [25] and there is a current lack of longitudinal MRI data regarding disease progression, particularly in patients treated with non-TNF inhibitor therapies. Longer-term data are required to evaluate the impact of reductions in fat lesion development on the progression of disease.
In the filgotinib group, positive correlations were observed between changes in the CANDEN total MRI inflammation score and SPARCC MRI spine score. However, there was no correlation between the change in CANDEN inflammation scores and clinical parameters, such as the BASDAI and BASFI. This lack of correlation with clinical parameters has been reported in trials of TNF inhibitor agents that assessed correlations with the SPARCC spine inflammation score [26]. Moderate correlations do exist in early AS, but become less evident as disease progresses [27,28]. This might reflect the confounding effects of concomitant degenerative and mechanical disorders of the spine and the potential for the emergence of non-inflammatory pain hypersensitivity as observed in other chronic inflammatory joint diseases [29,30].
Baseline CANDEN MRI NBF scores (total score and facet joints and vertebral body subscores) each correlated positively with baseline BASMI, but no correlation with BASFI was observed. In a study assessing the relationship between BASMI and ASspiMRI measurements in golimumab-treated patients, Baraliakos et al. [31] found that, at baseline, lumbar active inflammatory ASspiMRI scores correlated with lumbar flexion and lateral lumbar flexion (each P < 0.01), whereas chronic structural ASspiMRI also correlated with lateral lumbar flexion (P = 0.04). No significant correlations were found for changes from baseline in these measures at week 14. At week 104, a weak but significant correlation between the change from baseline in cervical spine chronic structural ASspiMRI score and BASMI cervical tragus-to-wall distance component score was seen [31]. These results suggest that in clinical trial participants with established AS, MRI measures of NBF, and not inflammation, were most consistently associated with restriction of mobility [32,33]. It has been reported that spinal mobility impairment is independently determined by clinical disease activity, MRI spinal inflammation, structural damage, enthesitis and age [33]. The effect of spinal inflammation is more relevant in early AS, while spinal structural damage has a greater impact in later stages of disease [32,33].
The ICC data show strong agreement between MRI readers at baseline, which is a strength of the study. In comparison, ICC values for the change from baseline, particularly for structural changes, were lower. However, low ICC values for change scores may reflect that variation in structural or inflammation changes between patients was limited, especially for lesions in posterolateral locations, and as such do not necessarily indicate poor reliability. As expected over a 12-week study period, minimal changes in erosion and NBF were seen. In addition to the short study duration, potential limitations include the imbalance in MRI measures at baseline and the post hoc nature of the analysis. MRIs were not available from all patients, which could also have impacted results.
In conclusion, filgotinib was associated with significant reductions vs placebo in MRI measures of spinal inflammation at week 12 of the TORTUGA trial using the CANDEN method. In particular, filgotinib resulted in a substantial decrease in inflammation in the posterolateral elements and facet joints. These findings need to be confirmed in larger studies and long-term effects remain to be determined.

Ø. contributed to data collection. All authors contributed to data analysis or interpretation, reviewed and critically revised the manuscript, approved the final draft and are accountable for the accuracy and integrity of the work.