Understanding flare in axial spondyloarthritis: novel insights from daily self-reported flare experience

Abstract
Objectives: Our objective was to explore daily self-reported experiences of axial SpA (axSpA) flare based on data entered into the Project Nightingale smartphone app (www.projectnightingale.org) between 5 April 2018 and 1 April 2020.
Methods: Paired t-tests were conducted for mean_flare_on and mean_flare_off scores for each recorded variable. The mean estimated difference between flare and non-flare values for each variable was calculated with 95% CIs. Mean, S.D. and range were reported for flare duration and frequency. Participants with ≥10 days of data entry were included for affinity propagation cluster analysis. Baseline characteristics and mean flare on vs mean flare off values were reported for each cluster. Welch's t-test was used to assess differences between clusters.
Results: A total of 143/189 (75.7%) participants recorded at least one flare. Each flare lasted a mean of 4.30 days (S.D. 6.82, range 1–78), with a mean frequency of once every 35.32 days (S.D. 65.73, range 1–677). Significant relationships were identified between flare status and variable scores. Two clusters of participants were identified with distinct flare profiles. Group 1 experienced less severe worsening of symptoms during flare in comparison to group 2 (P < 0.01). However, they experienced significantly longer flare duration (7.2 vs 3.5 days; P < 0.01), perhaps indicating a prolonged, yet less intense flare experience. Groups were similar in terms of flare frequency and clinical characteristics.
Conclusions: Two clusters of participants were identified with distinct flare experiences but similar baseline clinical characteristics. Smartphone technologies capture subtle changes in disease experience not currently considered in clinical practice.


Introduction
Rheumatology key messages
- Daily self-reported smartphone data identified two distinct clusters of people living with axSpA who had different flare experiences.
- Despite differences in flare duration and symptoms, baseline clinical measures were similar between clusters.
- Smartphone technologies capture subtle changes in disease experience not currently considered in clinical practice.

Axial SpA (axSpA) is a chronic, inflammatory disease characterized by alternating periods of flare and more stable disease activity. Flares are often both unpredictable and debilitating, and a greater understanding of their nature and outcome is therefore important to both those living with axSpA and clinicians [1,2]. Over the last decade, although rapid advances have been made in terms of our understanding of axSpA, the natural history of the disease remains elusive. It has been hypothesized that the presence of early, severe disease flares (often associated with a worsening of symptoms or increased disease activity) may allow for early identification of people living with axSpA who may develop more severe disease [3][4][5][6]. Indeed, severe flare has been identified as a poor prognostic factor in axSpA, particularly in early disease [4]. It has therefore been suggested that early, aggressive treatment of severe flares in axSpA may improve long-term outcomes.
Despite the frequent use of the concept 'flare' within rheumatic condition terminology, an accepted, consistent definition of a flare does not yet exist for axSpA. In recent years there have been attempts to define flare in both axSpA and other chronic rheumatic conditions such as RA based on validated composite indices or through qualitative retrospective investigation of flare states [3,7-9]. Indeed, there is growing interest in the concept of flare and in characterizing the lived experiences behind this multidimensional phenomenon [1,6,7,9-17]. Such an understanding is critical to better characterize the natural history of the condition and in the future may facilitate optimization/personalization of available treatments. The problematic nature of defining a flare in axSpA lies in part in the multifaceted, heterogeneous nature of the disease. This problem was clearly demonstrated by Gossec et al. [7], whereby, in a preliminary attempt to classify flare, 27 different flare definitions were identified among 38 publications on axSpA.
In prior studies investigating flare experiences, those living with axSpA have often been asked to recall a history of flare or prior experience of flare [3][4][5][6]. However, this retrospective characterization is subject to recall bias and may not provide an accurate picture of the lived day-to-day reality. Recently introduced smartphone technologies for the daily monitoring of disease symptoms and activity provide unique insights into the daily experiences of individuals with chronic, fluctuating conditions [18][19][20][21][22][23][24][25]. Such technologies may allow for a more accurate investigation of flare experiences [12,16,17].
In the present study we conducted an exploratory analysis on a dataset of participants entering daily symptoms and behaviour into the Project Nightingale (uMotif) app (www.projectnightingale.org). Our objective was to explore individuals' self-reported experiences of flare. We hoped to characterize the constituents of flare, the frequency and duration of flare and whether people living with axSpA could be clustered based on their similar experiences. We then attempted to further characterize these clusters of participants to provide detailed insights into potential distinct subtypes of flare experience.

Overview of Project Nightingale
Since April 2018, people living with axSpA under the care of the Royal National Hospital for Rheumatic Diseases (RNHRD), Royal United Hospitals NHS Foundation Trust (RUH), Bath, UK, have been eligible to participate in Project Nightingale. Project Nightingale was created to allow people living with axSpA to track daily symptoms and behaviour via their smartphone device to gain further insights into the nature of their condition.
All participants are invited to track 10 variables via the uMotif smartphone app, including 8 fixed and 2 optional variables. Fixed variables are tracked by all participants and include pain, mood, fatigue, sleep, stress, flare, recommended exercise and anti-inflammatory use. Two optional variables are chosen by each participant from the following: caffeine intake, hot flushes, adherence to medication, screen time, confidence in self-management, eyesight, hydration, chest pain, flare of psoriasis, impact of menstrual cycle, red painful eyes, smoking habits and blood in stool. The variables and associated scales were designed by the lead consultant for axSpA at the RNHRD to optimize clinical relevance, following years of regular, detailed and empathetic interaction with people living with axSpA.
Participants are asked how they are feeling each day via the app. They rate each variable on a 5-point Likert scale. The interface for recording each outcome is displayed as a flower-like visualization whereby each petal represents one of the 10 tracking variables (Supplementary Fig. S1, available at Rheumatology Advances in Practice online). Participants are required to drag their finger from the centre of the flower to the outer edge of each petal to record their symptoms. For each variable, a score of 1 equates to the least healthy or desirable outcome, whereas a score of 5 represents the most healthy/positive outcome or behaviour. For example, for pain, 1 = debilitating pain and 5 = no pain. The flower-motif recording interface acts as a visual metaphor, whereby a full flower represents the most healthy or optimal outcomes. Participants receive daily reminders for data entry as a notification to their smartphone if data have not already been entered. In the uMotif app settings, participants can choose to opt out of reminders or alter their time and frequency.

Data collection
For the present study we utilized smartphone data collected via the Project Nightingale (uMotif) app between 5 April 2018 and 1 April 2020. The South West-Central Bristol UK local research ethics committee for National Health Service research approved the study and all patients provided written informed consent (Bath Spondyloarthritis Biobank; REC reference 13/SW/0096). Clinical data were collected based on routine assessment at the RNHRD. Baseline measures were extracted at the visit date closest to Project Nightingale registration, restricted to visit dates within 90 days of Project Nightingale registration. Data from participants' wearable and smart device applications were downloaded regularly and incorporated into the patient record.

Statistical methods
For participants with at least one flare and non-flare set of recorded variables, data were aggregated to one row per participant, containing mean values with and without flare for each petal variable. For example, Participant 1 would have an average_pain_flare_on feature and an average_pain_flare_off feature, and likewise for every other variable. Paired t-tests were conducted for each variable to investigate which variables were associated with flare status. The difference between the flare_on and flare_off features was taken for each pair to create a set of 'difference' features capturing the effect of a flare on each petal variable for each participant. The mean estimated difference between flare and non-flare values for each variable was calculated with its 95% CI. The mean, S.D. and range were reported for flare duration and flare frequency. For the flare duration calculation, two logged periods of flare occurring within 3 days of each other were considered one period of flare if they were separated by 1 day of missing data.
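The aggregation and paired comparison described above can be illustrated with a minimal Python sketch. It is not the authors' code: the function names, the record layout and the toy pain scores are hypothetical, and the critical value for the 95% CI is passed in by hand (4.303 for 2 degrees of freedom) because the Python standard library has no t-distribution.

```python
from collections import defaultdict
from math import sqrt
from statistics import mean, stdev


def flare_differences(records):
    """records: iterable of (participant_id, in_flare, score) for ONE variable.

    Returns {participant_id: mean_flare_on - mean_flare_off}, keeping only
    participants with at least one flare entry and one non-flare entry.
    """
    on, off = defaultdict(list), defaultdict(list)
    for pid, in_flare, score in records:
        (on if in_flare else off)[pid].append(score)
    return {pid: mean(on[pid]) - mean(off[pid])
            for pid in sorted(on) if pid in off}


def paired_t(differences, t_crit):
    """Paired t statistic and CI from per-participant differences.

    t_crit is the two-sided critical value for n - 1 degrees of freedom,
    supplied by the caller (an assumption of this sketch).
    """
    n = len(differences)
    d_bar = mean(differences)
    se = stdev(differences) / sqrt(n)  # standard error of the mean difference
    return {"mean_diff": d_bar, "t": d_bar / se, "df": n - 1,
            "ci95": (d_bar - t_crit * se, d_bar + t_crit * se)}


# Hypothetical pain scores (1 = debilitating pain, 5 = no pain).
records = [
    ("p1", True, 2), ("p1", True, 2), ("p1", False, 4), ("p1", False, 4),
    ("p2", True, 3), ("p2", False, 4),
    ("p3", True, 1), ("p3", False, 4),
    ("p4", False, 5),  # never in flare, so excluded from the paired test
]
diffs = flare_differences(records)
result = paired_t(list(diffs.values()), t_crit=4.303)
```

A negative difference means the score was lower (worse) during flare, in line with the 1 to 5 petal scale.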
For the cluster analysis, Project Nightingale participants with <10 days of data entry were excluded. Difference features for each variable for each participant were normalized to between −1 and 1 and then used for clustering. Affinity propagation was used as the clustering algorithm via the apcluster R package (R Foundation for Statistical Computing, Vienna, Austria) [26]. negDistMat (r = 2) was used for the similarity matrix, squaring the distance measures between participants to calculate similarities [27]. q = 0 was used to minimize the number of clusters found. Given the size of the dataset (129 participants), it was decided to reduce the number of clusters in order to achieve a meaningful sample size for each cluster [28].
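To make the inputs to the clustering step concrete, the following Python sketch builds a negative squared Euclidean similarity matrix, corresponding to negDistMat with r = 2, and sets the exemplar preference to the minimum off-diagonal similarity, which is what q = 0 selects in apcluster. Several points are assumptions: the text does not state how features were normalized, so scaling each column by its maximum absolute value is only one possibility; the function names and toy values are hypothetical; and the affinity propagation iterations themselves are not reproduced here.

```python
def scale_to_unit_range(rows):
    """Scale each column into [-1, 1] by its maximum absolute value.

    Assumed normalization; the paper does not specify the method used.
    """
    n_cols = len(rows[0])
    max_abs = [max(abs(r[j]) for r in rows) or 1.0 for j in range(n_cols)]
    return [[r[j] / max_abs[j] for j in range(n_cols)] for r in rows]


def neg_squared_distance_matrix(rows):
    """Similarity s(i, k) = -(Euclidean distance between rows i and k) ** 2."""
    n = len(rows)
    return [[-sum((a - b) ** 2 for a, b in zip(rows[i], rows[k]))
             for k in range(n)]
            for i in range(n)]


# Hypothetical difference features (pain, fatigue) for three participants.
features = [[-2.0, -1.0], [-1.0, -0.5], [0.0, 0.5]]
scaled = scale_to_unit_range(features)
similarity = neg_squared_distance_matrix(scaled)

# Quantile q = 0 of the off-diagonal similarities, i.e. their minimum.
n = len(similarity)
preference = min(similarity[i][k]
                 for i in range(n) for k in range(n) if i != k)

# The preference acts as each participant's self-similarity (the diagonal).
for i in range(n):
    similarity[i][i] = preference
```

A low preference makes every participant a less attractive exemplar, which is why this setting tends to produce few clusters.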
Baseline characteristics and mean flare on vs mean flare off values were reported for each cluster. Welch's t-test was used to assess differences between clusters.
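As an illustration of the between-cluster comparison, the sketch below computes Welch's t statistic and the Welch–Satterthwaite degrees of freedom in plain Python. The function name and the toy flare durations are hypothetical, and the P-value is left out because it requires a t-distribution, which the standard library does not provide.

```python
from math import sqrt
from statistics import mean, variance


def welch_t(group_a, group_b):
    """Welch's unequal-variance t statistic and approximate degrees of freedom."""
    n_a, n_b = len(group_a), len(group_b)
    # Per-group squared standard errors (sample variance / group size).
    se2_a = variance(group_a) / n_a
    se2_b = variance(group_b) / n_b
    t = (mean(group_a) - mean(group_b)) / sqrt(se2_a + se2_b)
    # Welch–Satterthwaite approximation to the degrees of freedom.
    df = (se2_a + se2_b) ** 2 / (se2_a ** 2 / (n_a - 1) + se2_b ** 2 / (n_b - 1))
    return t, df


# Hypothetical values for two clusters of unequal size.
t_stat, dof = welch_t([0.5, 1.0, 1.5], [2.0, 2.5, 3.0, 3.5])
```

Unlike Student's t-test, this form does not assume equal variances in the two clusters, which suits groups of different sizes.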

Patient and public involvement (PPI) statement
Project Nightingale was established through a strong collaboration between the RNHRD (RUH, Bath) and consultant R.S., engagement with relevant stakeholders [people living with axSpA and healthcare professionals (HCPs)], the charity White Swan and the Bath Institute for Rheumatic Diseases (BIRD). This has facilitated PPI from project initiation. Petal tracking variables were determined by R.S. based on decades of clinical experience and interactions with people living with axSpA. Additional optional variables were also added to the scope based on patient feedback at Project Nightingale information days. These regular Project Nightingale and axSpA information days organized by BIRD have facilitated patient-HCP-researcher discussion, knowledge exchange, participant feedback and dissemination of results. Such interactions and collaborations have informed advancement of future Project Nightingale research plans and app innovations.
PPI has been maintained during the coronavirus disease 2019 pandemic via regular Project Nightingale patient-HCP-researcher discussions during the well-established RNHRD axSpA rehabilitation course. A Project Nightingale BIRD podcast episode and Facebook Live event with the National Axial Spondyloarthritis Society have also facilitated PPI. The Project Nightingale blog and Twitter have facilitated regular research updates and dissemination of results to the wider axSpA community. This has allowed for further patient participation and discussion of experiences [29,30].

Results
(Table 1). Small but significant (P < 0.01) estimated differences were found between flare and non-flare scores for pain, fatigue, sleep quality, exercise, mood, anti-inflammatory use, stress, confidence in self-management and chest pain.
Between 5 April 2018 and 1 April 2020, 129 patients had registered for participation in Project Nightingale and provided ≥10 days of data entry suitable for the cluster analysis. Two clusters of participants were identified based on distinct profiles of uMotif petal symptom scores during flares, using non-flare scores as a baseline comparator (Fig. 1, Table 2). Group 1 appeared to experience less severe worsening of pain, fatigue, sleep, mood and stress during flare (vs non-flare) compared with group 2 (P < 0.01). However, this group also experienced significantly longer flare duration (7.2 vs 3.5 days; P < 0.01) (Supplementary Table S1, available at Rheumatology Advances in Practice online), perhaps indicating a more prolonged, yet less intense flare experience. Although not reaching significance due to small sample size, group 2 also demonstrated a greater decrease (worsening) in the score for chest pain, confidence in self-management, eyesight, flare of psoriasis, impact of menstrual cycle and screen time. Changes in anti-inflammatory use and recommended exercise during flare vs non-flare appeared similar between the two groups, perhaps suggesting similar behaviours while attempting to resolve flares.
Group 2 reported slightly (petal score difference <0.5) better sleep quality (P = 0.022) and very slightly higher levels of recommended exercise (P = 0.026) than group 1 when not in flare, despite worse scores for pain (P = 0.043), fatigue (P = 0.001), mood (P = 0.031) and stress (P < 0.001) during flare (Table 3). No significant differences were found between groups for pain, fatigue, mood, anti-inflammatory use or stress when not in flare.
The baseline (at Project Nightingale registration) characteristics of participants in each cluster group are presented in Table 4. Both groups were similar in terms of gender, HLA-B27 status and other clinical characteristics such as spinal mobility (BASMI). However, group 1 had a significantly greater proportion of smokers (P < 0.001) and group 2 had a significantly greater proportion of people who had never smoked (P < 0.05).

Discussion
To our knowledge, this is the first study to investigate, characterize and group daily self-reported flare profiles in people with axSpA utilizing a smartphone application and remote data collection. Two distinct clusters of participants were identified. One group reported significantly shorter flare duration (P < 0.01); however, this group experienced a significantly greater worsening of pain, fatigue, mood, sleep and stress during flare (P < 0.01), perhaps indicating a shorter, although more intense, flare experience. The number and frequency of flares were similar between clusters, as were baseline clinical measures such as BASMI, BASDAI, BASFI and quality of life [measured through the Ankylosing Spondylitis Quality of Life (ASQoL) questionnaire]. Smartphone technologies therefore have the potential to capture subtle, potentially critical changes in disease activity that are not currently considered in clinical practice. Although the long-term significance of these changes is yet to be explored, such work is planned in our future research agenda. Furthermore, the study of such daily self-report data may in the future allow for prediction of flare based on patterns of symptoms/behaviour or enable a greater understanding of behaviours that lead to earlier resolution of flare. This may facilitate earlier targeting and prevention of flares to reduce flare frequency and duration, and ultimately improve the quality of life for patients.
Prior qualitative work by Brophy and Calin in 2002 [3] also identified two types of flare, localized and generalized, based on group discussions with 214 patients over the period of 1 year. All participants had experienced a localized flare involving pain and immobility in one area, sometimes accompanied by fatigue and emotional symptoms. In contrast, only 40% (85/214) of participants had experienced generalized flares, involving the whole body. This was described as an infrequent event whereby all symptoms were experienced to the extreme. Individuals reporting generalized flares described the localized flares as not a 'true' flare, perceiving localized increases in disease activity as incomparable to the crippling, acute and devastating phenomenon of a whole-body flare. Similar experiences of localized (minor) or generalized (major) flares have been characterized in later studies by Stone et al. in 2008 [6] and a follow-up study in 2010 [5]. In the present study we were unable to determine the location of flares. However, our results appear broadly consistent in terms of one group of patients experiencing more intense, debilitating flares, with greater changes in symptoms such [3] described the majority of flares as short-term (days to weeks), broadly in agreement with the present study. However, in 2010, Cooksey et al. [5] reported a mean flare duration of 2.4 weeks, compared with an average duration of 7.2 and 3.5 days for group 1 and group 2, respectively, in the present study. This is likely because our flare duration calculation was quite strict, in that a flare required consecutive days of uMotif flare entries to be considered as 'continued'. Just 1 day of missing data was permitted. For example, if a participant recorded a flare on a Monday and Wednesday but with missing data on Tuesday, this would be recorded as a single period of flare. However, if a participant recorded a flare on a Monday and Thursday with 2 missing days of data, this would be considered as two separate periods of flare. This was defined in alignment with a more recent study by Jacquemin et al. [14], whereby the majority of reported flares lasted 3 days. However, this definition may have considerably underestimated the flare duration in the present study. The past 10-20 years have shown dramatic advances in our understanding of axSpA, including the introduction of the widespread use of biologics, improved treatment strategies and a change in definition of disease (to include non-radiographic axSpA in addition to AS). Therefore this may have contributed to the differences in flare duration seen in the present study and the study by Jacquemin et al. [14]. Indeed, both earlier studies (Brophy and Calin [3], Cooksey et al. [5]) included only people with AS, but not non-radiographic axSpA, perhaps further contributing to the disparity in flare duration.
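The merging rule in the Monday/Wednesday and Monday/Thursday examples can be written out as a short Python sketch. This is a hypothetical reading of the rule rather than the study's implementation: the function name and the day-index encoding are illustrative, and two choices are assumptions, namely that a day with a recorded non-flare entry always ends a flare and that a bridged missing day counts towards the duration.

```python
def flare_episodes(entries):
    """Group flare days into episodes, bridging a single missing day.

    entries: {day_index: in_flare}; days with no data entry are absent.
    Returns a list of (first_day, last_day) tuples.
    """
    episodes = []
    for day in sorted(d for d, in_flare in entries.items() if in_flare):
        if episodes:
            start, end = episodes[-1]
            gap = day - end
            # Bridge only when exactly one day lies between the two flare
            # entries and that day has no data at all (assumed reading).
            bridged = gap == 2 and (end + 1) not in entries
            if gap == 1 or bridged:
                episodes[-1] = (start, day)
                continue
        episodes.append((day, day))
    return episodes


# Day 0 = Monday. Tuesday missing, flare on Monday and Wednesday: one episode.
monday_wednesday = flare_episodes({0: True, 2: True})
# Flare on Monday and Thursday with two missing days: two episodes.
monday_thursday = flare_episodes({0: True, 3: True})
# A recorded non-flare Tuesday also splits the flare under this reading.
tuesday_flare_free = flare_episodes({0: True, 1: False, 2: True})

# Duration in days, counting the bridged day (an assumption of this sketch).
durations = [end - start + 1 for start, end in monday_wednesday]
```

Allowing a longer bridge would merge more entries into a single flare and lengthen the estimated durations, which is the sensitivity the discussion above points to.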
It is also important to note that despite short flare duration in the present study, the mean flare frequency per participant was once every 35.32 days (S.D. 65.73). This suggests that there is still a need for optimization/personalization of treatments in axSpA in order to reduce the frequency of debilitating flare and potential associated poor clinical outcomes and work impairment [4,6,31,32].
Beyond the importance of flare characterization in clinical practice, flare also represents an important endpoint to consider in clinical trials. As a potential indicator of disease severity, flare assessment is vital to understanding disease status or treatment efficacy and is of particular importance in tapering or discontinuation trials [33][34][35][36]. There has recently been an attempt to quantify a single definition of flare based on validated composite indices for the purpose of harmonizing trial designs in axSpA [7,9]. However, it is important to distinguish between the necessarily stricter, arbitrarily homogeneous definition of flare that is required in clinical trials vs the highly variable, highly individualized flare experiences of those living with axSpA. In clinical practice, in order to move towards optimization and personalization of treatments, the latter definition as explored in the present study may arguably be of greater significance. This may be supported by the fact that in the present study, although group 2 reported significantly worse flare experiences via the uMotif app, we found no significant differences in baseline spinal mobility, disease activity or function as measured by validated BASMI, BASDAI and BASFI measures between the two groups, highlighting the power of smartphone technologies to capture potentially critical fluctuations in disease severity that are too subtle to be observed by traditional, infrequent measurement of existing validated indices. Indeed, future integration of daily self-reported health data into the electronic health record may allow for greater optimization and personalization of treatment outcomes through more accurate reporting of disease experiences [37].
One limitation of the present study concerns adherence. Upon registration, participants were encouraged to enter data every day. However, they were told that any data entry may be useful, including restarting after inactive periods. Prior qualitative and quantitative evidence suggests that patients with worse disease experiences in axSpA may be more likely to adhere to self-tracking behaviour [38]. Therefore our results may be biased towards those with more severe disease. Similar results have been reported in the literature for other inflammatory, rheumatic conditions such as RA, where it has been suggested that patients may primarily use self-tracking apps in the case of impending flares [39].
Another potential source of bias in the present study is that the RNHRD is a tertiary hospital receiving both local and specialist referrals. Therefore our cohort may be more severely affected by axSpA or less likely to experience a down period between flares. However, both our own data and data from prior studies from the RNHRD suggest that our cohort of patients reflects the full spectrum of axSpA disease [6,40]. For example, the population included in the present study showed a range of BASDAI scores from 0 to 8.6 and BASMI scores from 0 to 7.8. Disease duration (from age of onset to age at study consent) ranged from 4 years to 68 years. Furthermore, it is now common practice and recommended for general practitioners to refer all suspected axSpA diagnoses to a specialist centre [41].

Conclusions
The results of the present study yield novel insights into the characterization of flares in axSpA. Significant relationships were identified between a variety of patient-reported symptoms and flare, including variables that, to our knowledge, have not yet been explored in axSpA. Clustering of daily self-reported symptom data has identified two clusters of people with axSpA who have distinct flare profiles. One group appears to experience significantly longer flare duration. However, this group also experiences less dramatic worsening of pain, fatigue, sleep, mood and stress during flare compared with non-flare. Although we observed differences between the two groups in terms of flare experiences, clinical differences in BASMI, BASDAI and BASFI were not identified, highlighting the potential of smartphone technologies to capture subtle, potentially critical changes in disease activity that are not currently considered in clinical practice.