Development of the Warwick Axial Spondyloarthritis faTigue and Energy questionnaire (WASTEd)—a new patient-reported outcome measure

Abstract
Objective. The aim was to co-produce and test a potential new patient-reported outcome measure (PROM), the Warwick Axial Spondyloarthritis faTigue and Energy questionnaire (WASTEd), providing vital qualitative confirmation of conceptual relevance, clarity and acceptability.
Methods. Informed by measurement theory, we collaborated with patient partners throughout a three-stage, iterative process of PROM development. In stage 1, informed by patient interviews, reviews exploring patients’ fatigue experiences and existing PROMs of fatigue, an initial measurement framework of axial spondyloarthritis (axSpA) fatigue and energy and candidate items were defined. In stage 2, the relevance and acceptability of the measurement framework and candidate items were assessed qualitatively by focus group participants. In stage 3, patients participated in pre-testing interviews to assess item comprehensiveness, relevance, acceptability and comprehensibility.
Results. Stage 1 informed the development of an initial five-domain measurement framework with 59 candidate items. In stage 2, five patients and seven health-care professionals participated in four focus groups to derive a 40-item model of fatigue and energy. Collaborative engagement with patient research partners supported refinement of questionnaire structure and content further. Pre-testing with ten patients across two interview rounds in stage 3 produced a four-domain, 30-item long-form questionnaire.
Conclusion. An active collaboration with patients and health-care professionals has supported the co-production of a potential new PROM of axSpA fatigue, underpinned by strong evidence of face and content validity. The WASTEd extends the assessment of fatigue beyond severity, highlighting the importance of symptom frequency, energy and self-management. Future research will involve psychometric evaluation, supporting item reduction, structural refinement and confirmation of PROM validity.


Introduction
Axial spondyloarthritis (axSpA) is a progressive, disabling rheumatic disease, often beginning in early adulthood [1,2]. AxSpA typically advances slowly, often leading to an insidious decline in quality of life and in physical and social functioning [3]. Although pain, stiffness and reduced mobility are cardinal features [1,2], fatigue is a major concern to patients, with 75% experiencing severe fatigue [4].
Patients describe wide-ranging adaptations attempting to mitigate the impact of fatigue on their daily life and social activities [5,6]. Growing recognition of the importance of fatigue resulted in its inclusion in updated axSpA outcome reporting guidance [7], with the recommended assessment being a single-item measure of fatigue severity (taken from the BASDAI) [8]. However, significant limitations associated with single-item assessment include the inability to detail the nuances of fatigue experience/impact and a risk of overlooking some patients experiencing major fatigue-related impairment [4]. A recent systematic review of the quality and acceptability of single and multi-item, fatigue-specific patient-reported outcome measures (PROMs) in axSpA patients highlights further methodological inadequacies, including poor conceptual underpinning and limited relevance to patients' experience of fatigue [9]. No measures involved patients as collaborative research partners in PROM development. The review concluded that existing measures were likely to underestimate the significant impact of fatigue in axSpA. It is therefore unsurprising that health professionals often overlook the fatigue experienced by axSpA patients [5].
A patient-derived, multi-item PROM specific to the experience of axSpA fatigue would be invaluable in evidencing the significant impact of fatigue on patients' lives, highlighting their unmet needs to health-care professionals and supporting the provision of targeted and timely care. The active engagement with patients in rigorous qualitative research seeks to ensure that the outcomes that really matter to patients are included in PROM development, enhancing face and content validity, relevance and acceptability [10][11][12][13][14]. This study describes the initial qualitative stages in the development of a new measure of fatigue in axSpA.

Methods
This study is part of a five-stage project and describes a three-stage qualitative process to establish a potential new axSpA fatigue-specific PROM (Fig. 1). Study methods involved semi-structured interviews with patients (stage 1), focus groups with patients and health-care professionals (stage 2), and cognitive and pre-test interviews (stage 3).
Working in collaboration with patients as research partners (PRPs) throughout all stages (Supplementary Table S1, available at Rheumatology Advances in Practice online), an iterative process of item development and refinement is described. National Health Service ethical approval was granted (REC reference: 16/WM/0147), and written informed consent was obtained from participants in all stages.
Informed by international guidance [12,13,15], three qualitative development stages are described:
1. Development of a measurement framework. This step clarifies the measurement focus by identifying essential domains, subdomains and patient-important outcomes from the patient perspective.
2. Confirming the measurement framework and refining candidate items. The purpose is to refine the measurement framework and ensure that it covers essential content, enhancing relevance to patients and health-care professionals.
3. Pre-testing items and confirming content validity. This focuses on clarifying the comprehensiveness, relevance, acceptability and comprehensibility of the developing PROM through qualitative evaluation with patients.

Stage 1: development of a measurement framework
A measurement framework provides structure for what should be measured, describing the overarching concept of health and anticipated relationships between domains and patient-important outcomes [14]. Qualitative research is essential to development, clarifying the essence of axSpA from the perspective of a patient [10,11]. Qualitative semi-structured interviews, drawing on phenomenology as their methodological underpinning, were conducted with axSpA patients to explore their lived experiences of axSpA and fatigue. The interview study was conceived with PRPs who supported generation of the topic guide and the initial analysis. These interviews were re-analysed to inform the developing measurement framework [16]. A thematic analysis supported the extraction and grouping of patient-important outcomes and themes into domains and subdomains of similar or shared meaning [17]. An impact triad of severity, importance and self-management informed this process [18].
Working collaboratively and iteratively, members of the core research team (N.A.P., E.T., J.M., K.L.H. and J.C.P.) and PRPs (including G.S. and J.T.) reviewed the developing framework and potential questions (items). As part of our initial review of the quality and acceptability of PROMs used to assess fatigue in axSpA [9], we also sought additional reviews of fatigue PROMs across a range of conditions [19][20][21]. The item content of these existing measures of fatigue or energy was reviewed, and potential items judged relevant to the developing framework were identified and/or modified. Where necessary, new items were crafted from the qualitative data to reflect language used by patients [16].

Key messages
• Single-item assessments of fatigue severity do not fully capture patient experience of fatigue.
• Qualitative research informed the development of a multidomain measurement framework of energy and fatigue.
• The co-produced draft measure of fatigue and energy (WASTEd) has strong face and content validity.

Stage 2: confirming the measurement framework and refining candidate items
Patients with confirmed axSpA [22] (age ≥18 years) and health-care professionals with experience of working with axSpA patients were invited to participate in separate focus groups. To explore the content, relevance and acceptability of the measurement framework and candidate items, focus group activities were structured into three parts (Table 1).
Patients were recruited from rheumatology outpatient departments at three UK hospitals and purposively sampled for age, sex and disease duration. Professionals were identified through the Ankylosing Spondylitis Special Interest Group Northwest (ASSIGNw) network, rheumatology departments of participating sites, and known contacts. Informed consent was secured from all participants. All groups were moderated by N.A.P., co-facilitated by J.C.P. or J.M. and digitally audio recorded.
Focus groups followed a semi-structured format (Table 1). First, findings of the qualitative interviews were shared and discussed. The developing measurement framework was then presented, with each participant receiving a reference paper copy. Participants discussed the framework with reference to their own experiences, highlighting where important outcomes were missing. Second, participants ranked outcomes individually in order of importance. After group discussion, participants were asked to reach agreement on the most important outcomes. Finally, participants explored the relevance, acceptability and comprehensibility of candidate items. The group worked iteratively through the proposed items for each subdomain. N.A.P. and E.T. analysed the data thematically and deductively [17] after each group, drawing on the measurement framework and highlighting areas of resonance or dissonance with the proposed framework or items.
Stage 3: pre-testing the items and confirming content validity
After stage 2, a questionnaire was formatted with candidate items grouped into similar concepts. Two rounds of pre-testing interviews were conducted with patients (recruited as for stage 2) to explore item comprehensiveness, relevance, acceptability and comprehensibility [11,12,15]. These assessed whether participants could: understand items and formulate a response (comprehension); retrieve necessary information to enable a response (retrieval); determine the accuracy of retrieval (judgement); and select an appropriate response option (response mapping) [11,15]. Techniques of thinking aloud and verbal probing were used to explore any problems with the questionnaire, item format or wording, to identify any remaining omissions in content and to confirm content validity [11]. Verbal probes were developed in collaboration with PRPs. The reading level (reading age, difficulty and accessibility) was assessed after each round using the Flesch-Kincaid reading level [23].
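As an illustration of the readability assessment mentioned above, the following minimal Python sketch computes the standard Flesch reading ease score and the Flesch-Kincaid grade level from word, sentence and syllable counts. The text does not state which software was used in the study, so this should not be read as the study's own procedure. The helper names (`flesch_reading_ease`, `flesch_kincaid_grade`, `count_syllables`, `text_counts`) are hypothetical, and the vowel-group syllable counter is a crude assumption that will miscount some English words.

```python
import re


def flesch_reading_ease(n_words, n_sentences, n_syllables):
    """Standard Flesch reading ease formula (higher scores are easier to read)."""
    return 206.835 - 1.015 * (n_words / n_sentences) - 84.6 * (n_syllables / n_words)


def flesch_kincaid_grade(n_words, n_sentences, n_syllables):
    """Standard Flesch-Kincaid grade level formula (approximate US school grade)."""
    return 0.39 * (n_words / n_sentences) + 11.8 * (n_syllables / n_words) - 15.59


def count_syllables(word):
    """Assumption: approximate syllables as runs of consecutive vowels (a, e, i, o, u, y)."""
    groups = re.findall(r"[aeiouy]+", word.lower())
    return max(1, len(groups))


def text_counts(text):
    """Return (words, sentences, syllables) for a piece of questionnaire text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    syllables = sum(count_syllables(w) for w in words)
    return len(words), len(sentences), syllables
```

In use, the full item text of a questionnaire version would be passed to `text_counts` and the resulting counts to either formula. Because of the heuristic syllable counter, scores from this sketch may differ from those produced by dedicated readability tools.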

Round 1
Items were grouped into sequential blocks of six items. After participants had self-completed each block (while thinking aloud), the researcher questioned them using verbal probes. These sought to elucidate whether there was a discernible, conceptual distinction between fatigue and energy at the point of item completion.

Round 2
Questionnaire completion and testing followed the three-step test interview (TSTI) approach [24], as follows. First, to reflect usual questionnaire completion, participants were observed completing the full list of items, uninterrupted by the researcher. Participants were encouraged to think aloud during this activity. Their response behaviour was observed. Second, the participant's experience of questionnaire completion was then explored by the researcher (verbal probes), supplemented by researcher observations, to check that items were understood as intended. Finally, a semi-structured debrief explored general questions about the questionnaire and item presentation; for example, layout, font sizing and white spacing.
Analysis of pre-testing interview data
Informed by international guidance [13][14][15] and a modified category set from the question appraisal system (QAS-99), an item assessment checklist was developed to ensure transparency in the decision-making process for item retention, modification or rejection (Table 2). Decisions were made collaboratively between research team members and PRPs.

Results
Stage 1: development of a measurement framework of fatigue in axSpA
Development of the measurement framework was based on analysis of a prior study of 17 axSpA patients who participated in semi-structured, audio-recorded interviews (8 female: age range 22-72 years; disease duration range 1-41 years) [16]. Five domains were defined, reflecting 13 subdomains (Fig. 2): (1) symptoms: fatigue and energy; (2) impact: cognitive, physical and social; (3) sleep; (4) emotional wellbeing: mood, anxiety and worrying, sense of self and self-isolation; and (5) self-management: achieving balance, energy management and support. Four reviews were identified describing the quality and acceptability of 26 fatigue-specific PROMs across a range of conditions, including RA, axSpA, cancer and Parkinson's disease [9,19,20,21]. A review of energy PROMs was not identified. However, items within the vitality subscale of the short-form 36-item health survey (SF-36) were reviewed [25]. From these PROMs, a total of 44 items were mapped to the developing measurement framework, capturing the conceptualized patient-important outcomes (Supplementary Table S2, available at Rheumatology Advances in Practice online). Fifteen items were newly crafted to reflect the remaining patient-important outcomes and complete coverage of the measurement framework.
The research team and PRPs reviewed all 59 items for comprehensiveness, relevance, item structure and language. Nineteen items were removed owing to repetition and unsuitable language (e.g. feeling 'listless' [26]). Phraseology was standardized for the 40 remaining items. Drawing on the findings from the patient interviews and guidance recommending shorter recall periods for variable, frequent symptoms, such as fatigue [14], a recall period of 1 week was selected.
Measurement theory guided consideration of two potential response scales [27]: numerical rating scale (NRS) or categorical, descriptive scales.
[…] physiotherapists and 1 occupational therapist] participated across two groups (each 1.5 h).
All participants endorsed the proposed conceptualization of fatigue and energy in axSpA, confirming that nothing important was missing from the measurement framework or developing items. This prompted health-care professionals to comment on the limitations associated with the assessment of fatigue in clinical practice, 'historically and even currently, the actual measures that we use don't really factor in the fatigue elements . . . the fatigue isn't asked about really' (professional, group 2), welcoming the potential benefits of a questionnaire based on the developing measurement framework.
In each domain, discussions highlighted the distinctions between fatigue and energy, the importance of emotional wellbeing, challenges of sleep as a fatigue-related concept and benefit gained from a greater awareness of how patients cope with their fatigue.

Symptoms
Patients and professionals agreed on the importance of including items specific to fatigue and energy, with distinct questions pertaining to frequency and severity: 'I would identify more with the lack of energy than fatigue' (patient, group 1). However, fatigue duration was not recommended by either group.

Impact
The cognitive, physical and social impact of fatigue was discussed. Cognitive effects included difficulty in concentrating and recalling memories, although patients tended to link memory changes to ageing. Physical effects included difficulties with self-care, such as cooking and cleaning and exercising, owing to low energy. Social impact extended into work, caring responsibilities and personal life, such as leisure and social activities.
The importance of distinguishing between fatigue and energy was apparent in this domain, with patients agreeing that items on social activity were reflective of energy, rather than fatigue. Health-care professionals also highlighted the importance of financial impact. However, patients argued that financial implications were often associated with axSpA rather than specifically attributable to fatigue.

Sleep
This reflected challenges related to getting to sleep, staying asleep and waking feeling unrefreshed. Sleep as an important mediator of mood and fatigue (and therefore a person's ability to function) was recognized by all participants. However, owing to the complexity and multidimensional nature of sleep, participants felt that it was an unhelpful item to include in a fatigue-specific measure; for example: 'I just don't sleep well; it's nothing to do with my AS' (patient, group 1). Participants agreed that sleep embraced too many variables to contribute meaningfully to an assessment of fatigue and energy and should be considered for removal after further assessment in stage 3 (pre-testing interviews).

Emotional wellbeing
A domain specific to the impact of fatigue on emotional wellbeing was endorsed strongly. Health-care professionals reflected on the failure of clinics to assess this aspect of emotional wellbeing appropriately with axSpA patients. However, terminology was challenged by one professional group as potentially beyond the scope of the WASTEd: 'I like the word downhearted but not depressed. We need to take depressed out because . . . you shouldn't diagnose yourself or someone else with depression [using this questionnaire]' (professional, group 2).

Table 2. Item assessment checklist: identifying areas of concern when assessing items during pre-testing interviews [14,26]

Self-management
'Self-management' sought to address how patients achieved a balance in managing exertion/activity levels and making adaptations. Energy management was important, describing how decisions were made based on perceived energy expenditure, current stamina and methods to re-energise. 'Support' referred to how people delegate responsibilities and rely on others to continue with daily life and activities. Health-care professionals welcomed the inclusion of separate domains: 'with this we can tell whether those not coping [with fatigue] . . . and those who are managing . . .' (professional, group 1). They identified a need for domain-level scoring to support care decision-making: 'we want to be able to see change and use domain scores' (professional, group 1).
All participants confirmed that a categorical, descriptive scale was easier to understand and respond to than a numerical rating scale. A 1 week recall period was supported. Some patients suggested that a longer recall period would capture previous fatigue experiences, whereas professionals were concerned that a longer recall period would have little clinical value.
After conclusion of the focus groups, a 32-item WASTEd was produced ready for consideration in pre-testing interviews. Four items captured symptoms, 11 impact, 2 sleep, 7 emotional wellbeing and 8 self-management. Practicable recommendations to improve the questionnaire, including scoring, scale type and presentation, are summarized in Table 3.
Stage 3: pre-testing the items and confirming content validity
Ten male patients [mean (range) age 52.8 (28-75) years; mean disease duration 18.5 years] participated in the pre-testing interviews: five in round 1 and five in round 2. Interviews were conducted in local rheumatology departments [mean (range) duration 80 (33-120) min].
The research team and PRP group reviewed the results after each round. Item modifications informed by round 1 were agreed before further testing in round 2. The checklist (Table 2) confirmed that most items were presented clearly and understood; no item was assigned more than two areas of concern. Minor modifications were suggested to improve question clarity or response option anchors (Supplementary Table S3, available at Rheumatology Advances in Practice online).

Round 1
Reading ease equated to 75.7, suggesting a reading age between 11 and 12 years [23]. From the 32 items, 13 remained unchanged. Concerns were raised across four of the seven assessment categories for 19 items. First, there were issues of clarity (17 items); for example, the item 'How often have you felt drained?' was amended to read, 'How often have you felt drained of energy?'. In other examples, participants suggested underlining the word that was the focus of the question (for example, fatigue or energy). Second, there were concerns about assumptions (5 items). Items were perceived as being double-barrelled or interpreted differently by participants. For example, for the item, 'Because of your energy levels, have you found it mentally difficult to start, or finish doing things?', a participant responded, 'if I could cross out start then I'd say not at all' (R2). Item rephrasing was explored. Third, there were concerns about sensitivity (2 items): an item asking, 'Has fatigue made you feel downhearted?' elicited emotional responses from three participants and, in one interview, required a brief pause (R2, R3 and R5). Item rephrasing was explored. Fourth, regarding response options (3 items), anchors were refined.
The core research team and PRPs agreed on removal of three items and minor modifications; 29 items were retained for consideration in round 2.
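The round 1 counts reported above can be cross-checked with a short tally. The sketch below is illustrative only: the dictionary layout and the function name `check_round` are hypothetical, and it assumes that each category count refers to distinct items and that every item with a concern was flagged in at least one and at most two categories (the stated maximum).

```python
# Round 1 figures as reported in the text; the structure itself is an assumption.
ROUND1 = {
    "items_tested": 32,
    "items_unchanged": 13,
    "items_with_concerns": 19,
    "max_concerns_per_item": 2,
    "flags_by_category": {
        "clarity": 17,
        "assumptions": 5,
        "sensitivity": 2,
        "response options": 3,
    },
}


def check_round(r):
    """Check that the reported item and concern counts are mutually consistent."""
    total_flags = sum(r["flags_by_category"].values())
    flagged = r["items_with_concerns"]
    return {
        # Unchanged items plus items with concerns should equal the items tested.
        "items_add_up": r["items_unchanged"] + flagged == r["items_tested"],
        "total_flags": total_flags,
        # With one to two flags per flagged item, the total must lie in this range.
        "flags_feasible": flagged <= total_flags <= flagged * r["max_concerns_per_item"],
        # Under the same assumption, each flag beyond the first marks a two-flag item.
        "items_with_two_flags": total_flags - flagged,
    }
```

Under these assumptions, the 27 category flags spread over 19 items imply that some items were flagged in two categories, which is compatible with the statement that no item was assigned more than two areas of concern.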

Round 2
Reading ease equated to 69.7, suggesting a reading age of 13-15 years [23]. Minor issues were raised for 5 of the 29 items (Supplementary Table S4, available at Rheumatology Advances in Practice online). First, minor modifications were made to the language (3 items); for example, changing the item 'preferred to be alone due to fatigue' to read 'need to be alone', recognizing that this is not a person's preference. Second, response anchors were modified to improve meaning (2 items); for example, changing 'always' to 'everyday' as a response to the question, 'How often have you felt fatigued?'. Third, one item about 'coping with fatigue', which was removed in round 1, was highlighted as important and 'missing'. Modifications to the original item were explored between members of the core research team and PRPs. PRPs differentiated between the concept of managing fatigue (viewed as reflecting practical steps) and that of coping with fatigue (described as an internal process), providing important insight that supported item rewriting. In response, minor changes to the identified items were made, and the 'coping with fatigue' item was reintroduced. This process produced a 30-item long-form version of the WASTEd ready for future statistical evaluation. The questionnaire is presented in Supplementary Data S1, available at Rheumatology Advances in Practice online.
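The item counts reported across the three stages can be traced with a simple running total. The sketch below is a bookkeeping aid only: the names `ITEM_FLOW`, `STAGE2_DOMAIN_ITEMS` and `running_totals` are hypothetical, and the net change of eight items during stage 2 is derived from the reported totals (40 items before the focus groups, 32 after) rather than stated directly in the text.

```python
# Each step is (description, change in item count), taken from the Results text.
ITEM_FLOW = [
    ("items mapped from existing PROMs", +44),
    ("newly crafted items", +15),
    ("removed for repetition or unsuitable language", -19),
    ("net change after focus groups (derived: 40 to 32)", -8),
    ("removed after pre-testing round 1", -3),
    ("'coping with fatigue' item reintroduced in round 2", +1),
]

# Domain breakdown of the 32-item version produced after stage 2.
STAGE2_DOMAIN_ITEMS = {
    "symptoms": 4,
    "impact": 11,
    "sleep": 2,
    "emotional wellbeing": 7,
    "self-management": 8,
}


def running_totals(steps):
    """Return (description, cumulative item count) after each step."""
    total = 0
    totals = []
    for label, delta in steps:
        total += delta
        totals.append((label, total))
    return totals


for label, total in running_totals(ITEM_FLOW):
    print(f"{total:>3}  {label}")
```

The running totals follow the sequence 44, 59, 40, 32, 29 and 30, matching the item counts given for each stage, and the stage 2 domain breakdown sums to the reported 32 items.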

Discussion
The Warwick Axial Spondyloarthritis faTigue and Energy (WASTEd) questionnaire represents the first patient-derived, co-produced measure of fatigue and energy specific to the experience of patients with axSpA. Patients have made a substantial contribution to the development process, both as research partners and as participants, informing the co-production of a measurement framework and list of items that have resonance with their lived experience and an assessment of fatigue that is both understandable and useful. The engagement with clinicians in this process has ensured the clinical utility of the measure. Future application of the WASTEd in clinical and research settings will ensure that the patient's experience of fatigue is communicated clearly in decision-making.
Historically, fatigue in axSpA has been assessed as a unidimensional construct of fatigue severity [8]. However, the results of the present study challenge the narrow focus of current assessment guidance and confirm fatigue as a complex, multidimensional experience, of which energy is an important and distinct component. Although increasingly recognized in other fields, including HIV [28,29] and nephrology [30], this is the first time that the importance of energy has been described in a rheumatology population and given such prominence in a PROM. Described as a replenishable resource that was essential to support physical and mental activity [16], the conceptual underpinnings of the WASTEd questionnaire confirmed energy as a necessary component of fatigue assessment, with the potential to be scored as its own subscale. Likewise, self-management and coping also emerged as essential, patient-derived components of patients' fatigue experience and were thus incorporated as items within the measure. Although these are not new concepts to the experience of axSpA [5,6,16], the WASTEd represents the first time that such patient-important outcomes have been incorporated into fatigue assessment. Moreover, the WASTEd questionnaire incorporates an item on fatigue frequency, addressing a known gap in axSpA-fatigue measurement and thus supporting identification of patients with major fatigue (i.e. both frequent and severe) [4].
Developed using a transparent and methodologically robust process [13][14][15], the new fatigue and energy framework was derived from qualitative work with patients [16] and reviews of existing measures of fatigue and/or energy from both within [9,21] and outside rheumatology [19,20]. The distinction of energy as a separate but essential aspect of fatigue experience extends previous knowledge and might help to determine a patient's ability, for example, to maintain their home axSpA-exercise regimens. The reviews of existing measures identified only one assessment of energy: the vitality subscale of the SF-36 [25]. Although widely used, measurement evidence is too limited to recommend its use for axSpA fatigue assessment [9].
Exploration of the developing measurement framework with separate groups of patients and health professionals also confirmed the content, relevance and acceptability of the developing PROM in capturing the fatigue outcomes that really matter: the severity of fatigue and energy, the associated impact on daily life and the ability of patients to self-manage [18]. This triad formed part of the analysis process in this study, making the impact of fatigue and energy the focus of the PROM. Evidence suggests that PROMs with clear conceptual underpinnings have high levels of face and content validity [11,13,15], which can enhance patient acceptability and clinical utility and improve responsiveness to important changes in health [31].
Although the numbers of patient participants in the focus groups were limited, the voice of patients was widely represented throughout the development process and further enhanced by the active involvement of our PRPs at all key stages (Supplementary Table S1, available at Rheumatology Advances in Practice online). Although all patient participants in stage 3 cognitive interviews were male, the contribution of the PRPs (4 female and 3 male) to the analysis ensured that a gendered view of the data was facilitated. A rigorous approach to all phases of the qualitative research is described, which involved patient, clinical and research experts participating in an iterative approach to item development and refinement. This process increases confidence that our multifaceted approach has minimized the risk that any patient-important outcomes have been omitted from the measurement framework. The involvement of health-care professionals in additional focus groups was essential to enhancing the clinical relevance of the model and is a further strength of the WASTEd.
The first-version, 30-item WASTEd has demonstrable face and content validity, underpinned by rigorous qualitative research and active patient involvement [16]. Future research will seek to administer the measure to a larger and more diverse UK-wide patient population (in terms of disease duration, disease activity and sociodemographic variables). Quantitative assessment, using modern psychometric theory, will inform item reduction towards a short-form set of items that best represents patient-reported fatigue. Further analysis will confirm the dimensionality of the fatigue and energy model, measurement reliability, construct validity and ability to detect change in fatigue. Such evidence is crucial to confirming the suitability of the developing PROM for use in research and clinical practice. This work highlights fatigue and energy as important and measurable patient-reported concepts in axSpA. It is an important step towards the availability of a high-quality, patient-derived, relevant and acceptable PROM.