Accuracy of the Wound Healing Questionnaire in the diagnosis of surgical-site infection after abdominal surgery in low- and middle-income countries

Abstract

Introduction: Telemedicine is being adopted for postoperative surveillance but requires evaluation for efficacy. This study tested a telephone Wound Healing Questionnaire (WHQ) to diagnose surgical-site infection (SSI) after abdominal surgery in low- and middle-income countries.

Method: A multicentre, international, prospective study was embedded in the FALCON trial, a factorial RCT testing measures to reduce SSI in seven low- and middle-income countries (NCT03700749). It was conducted according to a pre-registered protocol (SWAT126) and reported according to STARD guidelines. The reference test was in-person review by a trained clinician at 30 postoperative days according to US Centers for Disease Control and Prevention criteria. The index test was telephone administration of an adapted WHQ at 27 to 30 postoperative days by a researcher blinded to the outcome of in-person review. The sum of item response scores generated an overall score between 0 and 29. The primary outcome was the diagnostic accuracy of the WHQ, defined as the proportion of SSI correctly identified by the telephone WHQ, and summarized using the area under the receiver operating characteristic curve (AUROC) and diagnostic test accuracy statistics.

Results: Patients were included from three upper-middle-income (396 patients, 13 hospitals), three lower-middle-income (746 patients, 19 hospitals), and one low-income country (54 patients, 4 hospitals). Of 1196 patients, 1088 (90.3%) were successfully contacted. Those with non-midline incisions (adjusted odds ratio 0.36, 95% c.i. 0.17 to 0.73, P = 0.005) or a confirmed diagnosis of SSI on in-person assessment (odds ratio 0.42, 95% c.i. 0.20 to 0.92, P = 0.006) were harder to reach. The questionnaire correctly discriminated between most patients with and without SSI (AUROC 0.869, 95% c.i. 0.824 to 0.914), which was consistent across subgroups. A representative cut-off score of ≥4 displayed a sensitivity of 0.701 (0.610-0.792), specificity of 0.911 (0.878-0.943), positive predictive value of 0.723 (0.633-0.814) and negative predictive value of 0.901 (0.867-0.935).

Conclusion: SSI can be diagnosed using a telephone questionnaire (obviating in-person assessment) in low-resource settings.


Lay summary
A wound infection happens when germs enter the cut made in your body by a doctor when you are operated on. Germs are small organisms that cannot be seen by your eyes, but they can cause problems in the healing of the cut. Infection is the most common problem after surgery and can delay you getting out of hospital and back to normal life. The current way to check whether you have an infection is for a doctor or nurse to look at the cut made on your tummy and see how it is healing. For example, a doctor may check if the cut has a green liquid oozing from it or if the area of the wound is red or swollen. A month after you leave hospital, a doctor may ask you to come back for a follow-up visit. However, this will require you to travel to hospital and take a day off work or away from your family, and can be expensive and time-consuming if you travel far. We wanted to find out if talking to a doctor over the phone would work as well as you travelling to hospital to show the wound to a doctor or nurse in person. To do this, we asked over 1000 patients who had recently undergone surgery to be checked using both methods: to take a phone call from one doctor and be checked in person by a different doctor. We were able to compare the phone follow-up and in-person check to see if the doctors came to a different conclusion. We also looked at whether patients were able to receive a phone call at home and their experience of the process. For most patients, the phone call from a doctor was just as good at seeing if a patient had an infection as a face-to-face check-up by a doctor. However, the phone call was not perfect all the time, particularly for patients with very mild infections. Most patients were able to receive the phone call after a few tries and all

Introduction
Surgical-site infection (SSI) is the most common postoperative complication, with a cross-societal impact on patients, communities, and economies worldwide [1][2][3]. It has been recognized as the highest-priority research area in global surgery 4 and SSI prevention is the subject of several ongoing global randomized trials and quality improvement programmes 5,6.
Whilst some SSI occurs while patients are in hospital, the majority occurs after discharge 7. Post-discharge surveillance of SSI is therefore considered a key quality marker in wound infection research 8. The accepted reference standard of assessment for SSI during the 30 days after surgery is an in-person review according to US Centers for Disease Control and Prevention (CDC) criteria 9. However, in-person assessment is labour and time intensive, and requires patients to take additional time off work and incur costs of travel. This is particularly challenging in resource-limited environments, where there are shortages in the surgical workforce and patients are already at risk of catastrophic expenditure as a direct and indirect result of their surgical care 10.
Remote follow-up methods were rapidly adopted during the SARS-CoV-2 pandemic to reduce the risk of in-hospital transmission, to conserve resources for surges in COVID-19 admissions, and to address elective surgical backlogs 11,12. Whilst telephone follow-up may offer greater efficiency and cost savings, missed SSI events may lead to patient harm directly through care delays or indirectly through inefficiencies in SSI-prevention research 13.
The overall aim of this study was to evaluate the feasibility and diagnostic accuracy of telephone administration of a Wound Healing Questionnaire (WHQ) for remote detection of SSI after abdominal surgery in low- and middle-income countries (LMICs). The results of this study will inform the efficient design and conduct of future randomized trials and postoperative surveillance programmes.

Methods
This was a prospective, multicentre, international, non-randomized cohort Study Within A Trial (SWAT) exploring the feasibility and accuracy of remote follow-up pathways for SSI assessment (TALON). It was embedded within a pragmatic, multicentre, factorial randomized controlled trial testing measures to reduce SSI in LMICs (FALCON). FALCON was a stratified, pragmatic, multicentre, 2 × 2 factorial trial testing two measures (skin preparation and antimicrobial sutures) to reduce superficial or deep-skin infection after abdominal surgery for 5788 patients in 54 hospitals in 7 LMICs (NCT03700749) 14. In this trial, superiority of the intervention groups over the control group was not demonstrated overall, alone or in combination, or in any pre-planned subgroup 15.
The study protocol was pre-registered online on the MRC Hubs for Trial Methodology Research SWAT store database 16 (Queen's University Belfast) (SWAT ID 126) and published in Trials 17. This report was prepared with reference to the Statistical Analyses and Methods in the Published Literature ('SAMPL') guidelines 18, the methodology standards of the Patient-Centred Outcomes Research Institute ('PCORI') 19, the STARD guidelines for diagnostic test accuracy studies 20, and the COSMIN guidelines for patient-reported outcomes research 21.

Ethics approval and consent to participate
Approval for a protocol amendment to embed TALON in the host trial (FALCON) was obtained from the University of Birmingham International Ethics Committee (Reference: ERN_18-0230A). All individual participating countries obtained local or national ethical approval for TALON in accordance with local protocols. Written (or fingerprint) informed consent to participate was obtained from all participants.

Inclusion and exclusion criteria
Consecutive adult patients (greater than 16 years old) recruited to the FALCON trial between 10 December 2018 and 6 September 2020 were eligible for recruitment to TALON. Any centre participating in FALCON was eligible to participate. Centres were given flexibility to include patients over different date ranges depending on their local capacity and infrastructure, so long as sampling was consecutive. This included a broad range of abdominal operations with a predicted clean/clean-contaminated, contaminated, or dirty operating field and a planned skin incision of greater than 5 cm, for benign disease, malignant disease, trauma, and obstetric indications. This aimed to be representative of patients undergoing emergency or elective surgery in LMICs. Patients who were unlikely to be contactable for the 30-day follow-up were excluded from the FALCON trial. Patients with a missing FALCON 30-day follow-up assessment (either in person or by telephone) or who died before 30 days after surgery were excluded from analysis in this study.

Reference and index diagnostic tests
The reference diagnostic test for SSI during the 30 days after surgery was in-person review according to US CDC criteria 9. This is widely accepted as a quality standard in SSI research, has been used by most major international RCTs 8, and included both in-hospital and post-discharge diagnosis of SSI. A full description of the definition used in the FALCON trial is available in Appendix B.
The index diagnostic test under evaluation was a telephone-administered Bluebelle WHQ 17, adapted for use in LMICs. The WHQ was originally developed and validated in the UK (English language) to assess post-discharge infections after abdominal surgery 22,23. The WHQ was designed to be completed either by healthcare professionals or patients 24, and, as such, has been described as a 'universal-reporter' outcome measure ('UROM') 25. In a UK validation study, the WHQ demonstrated good reliability and high sensitivity and specificity when discriminating between SSI and no SSI in comparison with an in-person US CDC assessment 22,23. The original WHQ was adapted for use in global surgery trials across language and resource settings using recognized practices for translating outcome measures, reported in detail separately and summarized in Appendix C. Briefly, this involved two phases. The first was an adaptation phase with structured interviewing and translatability assessment with local researchers, triangulated with analysis of the scaling and measurement properties of the WHQ in pilot data, and informed by Rasch unidimensional measurement modelling 26,27. The second was a nine-phase translation phase for each language of delivery following Mapi recommendations 26. In the adapted version of the WHQ, the response options and subsequent scoring were also modified. Here, 'WHQ' refers to this adapted questionnaire. In the adapted WHQ scale, items assessing SSI signs and symptoms were scored between 0 and 2 (not at all, a little, and a lot) and items assessing wound care interventions were scored between 0 and 1 (no and yes). These were added together to create an overall score between 0 and 29. The full adapted WHQ instrument is available in Appendix D.
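The scoring rule described above can be illustrated with a minimal Python sketch. The function name, the dictionary names, and the example responses are hypothetical and introduced only for illustration; the actual item list is given in Appendix D and is not reproduced here, so the example uses an arbitrary handful of items rather than the full instrument.

```python
from typing import Sequence

# Response codings taken from the scoring description in the text.
SYMPTOM_SCORES = {"not at all": 0, "a little": 1, "a lot": 2}
INTERVENTION_SCORES = {"no": 0, "yes": 1}


def whq_total_score(symptom_responses: Sequence[str],
                    intervention_responses: Sequence[str]) -> int:
    """Sum the item scores into an overall adapted-WHQ score (hypothetical helper)."""
    symptom_total = sum(SYMPTOM_SCORES[r.lower()] for r in symptom_responses)
    intervention_total = sum(
        INTERVENTION_SCORES[r.lower()] for r in intervention_responses
    )
    return symptom_total + intervention_total


# Hypothetical patient: three sign/symptom items and two wound-care items only.
example = whq_total_score(["a little", "not at all", "a lot"], ["yes", "no"])
print(example)  # 1 + 0 + 2 + 1 + 0 = 4
```

Any response outside the listed options raises a `KeyError` in this sketch, which is one simple way of surfacing incomplete item data rather than silently scoring it as zero.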
According to the TALON protocol, the WHQ was to be administered over the telephone by a non-surgeon (consultant, attending, or equivalent) researcher (that is, a junior doctor, research nurse, or other non-clinical member of staff) between 27 and 30 days after surgery (that is, before the reference diagnostic test) as the index diagnostic test in this study. The researcher administering the questionnaire was independent of the 30-day wound assessment in the FALCON trial (that is, each assessor was blinded to the result of the other test) and underwent standardized training from the Study Management Group ('SMG'). Details of monitoring and quality assurance can be found in Appendix E. Pathways for questionnaire administration were co-designed between patient partners, site investigators, and research managers. Methodological adaptations for delivery during the SARS-CoV-2 pandemic are summarized in Appendix F.

FALCON trial follow-up
Due to personal (mobility, deterioration, and psychological) and environmental (cost, transport links, and SARS-CoV-2 transmission risk) reasons, not all patients were able to return to hospital for the reference test assessment in the FALCON trial (in-person 30-day follow-up according to US CDC criteria). Eligible patients were therefore categorized according to their corresponding FALCON trial follow-up as having: in-person FALCON trial follow-up; or telephone FALCON trial follow-up only. Only patients with in-person FALCON trial follow-up were considered to have had a reference test completed.

Outcomes
The primary outcome measure was the diagnostic accuracy of the telephone WHQ in identification of SSI up to 30 days after surgery. The performance of the test was summarized using discrimination (area under the receiver operating characteristic curve (AUROC)) and diagnostic test accuracy statistics (accuracy, sensitivity, specificity, positive predictive value, and negative predictive value).
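For readers less familiar with these statistics, the following Python sketch shows how they are derived from a 2 × 2 cross-tabulation of the index test against the reference test. The class and function names are hypothetical, and the counts are invented for illustration; they are not study data.

```python
from dataclasses import dataclass


@dataclass
class ConfusionCounts:
    tp: int  # index test positive, reference test SSI
    fp: int  # index test positive, reference test no SSI
    fn: int  # index test negative, reference test SSI
    tn: int  # index test negative, reference test no SSI


def diagnostic_accuracy(c: ConfusionCounts) -> dict:
    """Standard diagnostic test accuracy statistics from a 2 x 2 table."""
    total = c.tp + c.fp + c.fn + c.tn
    return {
        "accuracy": (c.tp + c.tn) / total,
        "sensitivity": c.tp / (c.tp + c.fn),
        "specificity": c.tn / (c.tn + c.fp),
        "ppv": c.tp / (c.tp + c.fp),
        "npv": c.tn / (c.tn + c.fn),
    }


# Hypothetical counts for illustration only.
stats = diagnostic_accuracy(ConfusionCounts(tp=70, fp=20, fn=30, tn=180))
for name, value in stats.items():
    print(f"{name}: {value:.3f}")
```

Sensitivity and specificity are conditioned on the reference diagnosis, whereas the predictive values are conditioned on the index test result and therefore depend on the prevalence of SSI in the cohort.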
The secondary outcome measure was the feasibility of the telephone WHQ follow-up, which was characterized using: telephone contact (successful contact of a patient on the telephone by the research team); return rate (successful completion of the WHQ where telephone contact was made); patient satisfaction (the patient's self-reported satisfaction with the telephone WHQ follow-up); and data completion rate (complete item response data). The 'retention benefit' of using a telephone pathway versus in-person follow-up was estimated as the difference between the proportion of patients for whom the telephone WHQ was successfully completed and/or a telephone FALCON trial follow-up was completed, and the proportion for whom an in-person FALCON trial follow-up was completed 28.

Sample size
A range of sample sizes and their impact on the precision of estimates of sensitivity and specificity, from 95% confidence intervals, were investigated. Calculations assumed a 30-day SSI prevalence of 21.0% using the binomial exact formula and were pre-specified 29 (Appendix G). Sample sizes were adjusted to allow for 15.0% predicted loss to follow-up from the FALCON trial, and 15.0% of patients predicted not to undergo in-person follow-up. In patients with successful telephone contact and in-person FALCON trial follow-up, 87 events and 325 non-events would estimate sensitivity of 0.92 (95% c.i. 0.84 to 0.97) and specificity of 0.95 (95% c.i. 0.92 to 0.97). Recruitment of 100 or more patients per country was recommended; however, no minimum or maximum sample size limitations per site or per country were imposed.
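The precision calculation can be sketched in Python using an exact (Clopper-Pearson) binomial interval, found here by bisection on the binomial tail probabilities. This is an illustrative reconstruction, not the authors' code: the function names are hypothetical, and the numerators (80 of 87 events, 309 of 325 non-events) are assumptions chosen to match the stated proportions of 0.92 and 0.95.

```python
from math import comb


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p)."""
    if k < 0:
        return 0.0
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))


def _solve_increasing(f, target: float, iters: int = 60) -> float:
    """Bisection on [0, 1] for an increasing function f with f(p) = target."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(x: int, n: int, alpha: float = 0.05):
    """Clopper-Pearson interval for x successes out of n trials."""
    # Lower limit: P(X >= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: 1 - binom_cdf(x - 1, n, p), alpha / 2)
    # Upper limit: P(X <= x | p) = alpha / 2, i.e. P(X > x | p) = 1 - alpha / 2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: 1 - binom_cdf(x, n, p), 1 - alpha / 2)
    return lower, upper


# Assumed numerators matching the proportions quoted in the text.
sens_lo, sens_hi = exact_binomial_ci(80, 87)
spec_lo, spec_hi = exact_binomial_ci(309, 325)
print(f"sensitivity 95% c.i. {sens_lo:.2f} to {sens_hi:.2f}")
print(f"specificity 95% c.i. {spec_lo:.2f} to {spec_hi:.2f}")
```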

Telephone Wound Healing Questionnaire administration pathway
Data were collected about the pathway for telephone WHQ follow-up to describe the variability in administration across contexts, including: questionnaire translation (pre-translated questionnaire/ad hoc, translated by questionnaire administrator/ad hoc, translated by formal translator); language of delivery; telephone owner (patient themselves/healthcare worker/friend or relative/other); telephone type (landline/mobile telephone with a camera/mobile telephone without a camera); questionnaire administrator (consultant (doctor)/junior doctor/research nurse/other non-clinical); and duration (min).

Statistical analysis
A full statistical analysis plan (SAP) was published online on 8 March 2021 30. Any changes from the SAP are summarized in Appendix H. All analyses were performed using R Studio version 4.1.1 (R Foundation for Statistical Computing, Vienna, Austria) with the following packages: tidyverse, finalfit, reportROC, predictr, and bcROCcurve. Country income level was defined according to the World Bank's 2018 definitions and countries were classified into upper-middle-income countries, lower-middle-income countries, or low-income countries based on annual gross domestic product per capita ($). The overall rate of missing data was anticipated to be low. A sensitivity analysis for the primary validation model was pre-planned to be performed with missing item response data imputed using Multiple Imputation by Chained Equations if the level of missingness was above 5% overall (that is, per questionnaire) or for any individual item. Data from patients with in-person or telephone FALCON trial follow-up only were included in the evaluation of feasibility outcome measures. Data only from patients with in-person and telephone FALCON trial follow-up (that is, both the reference and index test available) were included in the evaluation of diagnostic accuracy. A potential risk of partial verification bias by including only patients with in-person FALCON trial follow-up in the diagnostic accuracy analysis was identified a priori and addressed in a sensitivity analysis.
Baseline demographics and feasibility outcomes were presented overall, by country, by patient home location, and by FALCON trial follow-up group. Distributions of continuous variables were visually inspected for normality. Differences between these groups were explored using Student's t test for normal data and the Mann-Whitney U test for non-normal data. The chi-squared test was used for categorical data, with Fisher's exact modification, where required. The proportion of patients included by FALCON trial follow-up group over the study interval was summarized graphically. An exploratory mixed-effects binary regression model was used to explore the factors associated with successful telephone contact, with patients nested within countries. The causal pathway for telephone contact was mapped and patient-, disease-, operation-, and location-specific factors were selected a priori for inclusion in risk adjustment. Cross-tabulations of the reference test diagnosis ('no SSI' or 'SSI') against a binary outcome variable derived from the total score of the index test (created by a cut-off score; a WHQ total score of less than or equal to specified values between 1 and 10) were presented. The total WHQ score was examined against the reference test to evaluate the performance of the WHQ in discriminating between individuals with and without SSI. A receiver operating characteristic (ROC) curve was plotted showing test performance across all thresholds, with overall discrimination presented as the AUROC, with 95% confidence intervals, overall and across several subgroups. Diagnostic test accuracy statistics (accuracy, sensitivity, specificity, positive predictive value, and negative predictive value) were presented at statistically 'optimized' cut-off points (using Youden's index, which maximizes the sum of sensitivity and specificity) and at several other cut-off points 'to rule in' (that is, score greater than or equal to X) or 'rule out' (that is, score less than or equal to X) to support clinical application.
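The discrimination and cut-off steps can be sketched in plain Python as follows. This is not the analysis code (which used R packages); the function names and the toy scores are hypothetical. The AUROC is computed through its rank interpretation (the probability that a randomly chosen patient with SSI has a higher score than a randomly chosen patient without SSI, with ties counted as one half), and Youden's index is evaluated at each observed score treated as a 'score greater than or equal to' threshold.

```python
from typing import List, Sequence, Tuple


def auroc(scores: Sequence[float], labels: Sequence[int]) -> float:
    """AUROC as the proportion of SSI/no-SSI pairs ranked correctly (ties = 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def youden_cutoff(scores: Sequence[float], labels: Sequence[int]) -> Tuple[float, float]:
    """Return (cut-off, J) maximizing J = sensitivity + specificity - 1.

    A score greater than or equal to the cut-off counts as test positive.
    """
    n_pos = sum(labels)
    n_neg = len(labels) - n_pos
    best_cut, best_j = None, float("-inf")
    for cut in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if y == 1 and s >= cut)
        tn = sum(1 for s, y in zip(scores, labels) if y == 0 and s < cut)
        j = tp / n_pos + tn / n_neg - 1
        if j > best_j:
            best_cut, best_j = cut, j
    return best_cut, best_j


# Toy data (not study data): total scores with reference diagnosis 1 = SSI.
toy_scores: List[int] = [2, 5, 6, 8, 0, 0, 1, 1, 3, 4]
toy_labels: List[int] = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
print(round(auroc(toy_scores, toy_labels), 3))
print(youden_cutoff(toy_scores, toy_labels))
```

ROC software often reports the optimal threshold as the midpoint between two adjacent observed scores, which is equivalent to the 'greater than or equal to' integer threshold returned by this sketch.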
Several sensitivity analyses were performed for the primary model:
• To allow flexibility during the SARS-CoV-2 pandemic, administration of the WHQ was permitted after FALCON follow-up. The effect of a longer duration after surgery between the WHQ and telephone assessment was explored in a sensitivity analysis including both per-protocol and out-of-protocol patients.
• To evaluate the diagnostic accuracy of the WHQ in post-discharge SSI diagnosis only, a second analysis excluded patients with an in-hospital, pre-discharge SSI diagnosis.
• To address a risk of partial verification bias, an inverse probability weighted (IPW) sensitivity analysis was conducted for the primary model. In brief, this bias represents a missing data problem, where the reference test is missing for a subset of the sample 31. Under an assumption of missing data at random ('MAR'), the IPW method weights each observation in the verified sample by the inverse of the probability of verification to provide a corrected estimate of sensitivity and specificity. The estimated probability of verification is then obtained using a logistic regression model 32,33.
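The weighting step of the IPW correction can be illustrated with the Python sketch below. It is a simplification under stated assumptions: the probability of verification is supplied directly for each patient (in the study it was estimated with a logistic regression model), the function and type names are hypothetical, and the toy cohort is invented so that verification depends only on the index test result.

```python
from typing import Iterable, List, Optional, Tuple

# (index_positive, verified, reference_positive or None, probability_of_verification)
Record = Tuple[bool, bool, Optional[bool], float]


def ipw_sensitivity_specificity(records: Iterable[Record]) -> Tuple[float, float]:
    """Sensitivity and specificity with verified patients weighted by 1 / P(verified)."""
    weighted = {"tp": 0.0, "fn": 0.0, "fp": 0.0, "tn": 0.0}
    for index_pos, verified, ref_pos, p_verify in records:
        if not verified:
            continue  # no reference test; represented through the weights of others
        weight = 1.0 / p_verify
        if ref_pos:
            weighted["tp" if index_pos else "fn"] += weight
        else:
            weighted["fp" if index_pos else "tn"] += weight
    sensitivity = weighted["tp"] / (weighted["tp"] + weighted["fn"])
    specificity = weighted["tn"] / (weighted["tn"] + weighted["fp"])
    return sensitivity, specificity


def build_toy_cohort() -> List[Record]:
    """Invented cohort: index-positive patients are verified more often."""
    records: List[Record] = []
    # 100 index-positive patients, 80 verified (p = 0.8): 60 SSI, 20 no SSI.
    records += [(True, True, True, 0.8)] * 60
    records += [(True, True, False, 0.8)] * 20
    records += [(True, False, None, 0.8)] * 20
    # 400 index-negative patients, 100 verified (p = 0.25): 10 SSI, 90 no SSI.
    records += [(False, True, True, 0.25)] * 10
    records += [(False, True, False, 0.25)] * 90
    records += [(False, False, None, 0.25)] * 300
    return records


sens, spec = ipw_sensitivity_specificity(build_toy_cohort())
print(round(sens, 3), round(spec, 3))
```

In this invented cohort the unweighted sensitivity among verified patients would be 60 of 70, so the weighted estimate is lower; this shows the direction in which preferential verification of test-positive patients can bias a complete-case analysis.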
Subgroups included urban versus rural home location, upper-middle-income countries versus lower-middle-income countries versus low-income countries, patient age greater than 60 years versus less than or equal to 60 years, and elective versus emergency surgery, which were pre-specified, and pre-translated questionnaire versus ad hoc translation and no reoperation (mild SSI only), which were added post hoc for exploratory analysis. Calibration of the WHQ was presented as the proportion of patients with an SSI diagnosis in the reference test at each WHQ point score interval.

Community engagement and involvement
The aim of community engagement and involvement (CEI) in this study was to optimize the pathway for telephone WHQ administration to ensure cultural and contextual acceptability and to maximize both the telephone contact rate and questionnaire completion rate. Patient and community partners were involved in study prioritization, design, steering, and reporting using three methods. First, through direct involvement in the SMG. Second, through a UK-based advisory group with expatriate partners from collaborating countries. Third, through consultation with an extended network of patient and community partners in the NIHR Unit on Global Surgery network. CEI in this study was reported according to the GRIPP2 short form 34.

Results
Overall, 1240 patients were included, with telephone WHQ follow-up attempted, of whom 29 patients had died by 30 days after surgery, the status was missing for 1 patient, and 14 patients had no FALCON trial follow-up. A total of 1196 patients were therefore eligible for inclusion in the analyses (Fig. 1). Patients were from three upper-middle-income countries (396 patients in 13 hospitals), three lower-middle-income countries (746 patients in 19 hospitals), and one low-income country (54 patients in 4 hospitals). The countries contributing the largest numbers of patients were Ghana (532 of 1196 patients; 44.5%), Mexico (216 of 1196 patients; 18.1%), and India (120 of 1196 patients; 10.0%) (Table S1). A total of 209 of 1196 patients (17.5%) had an SSI diagnosis within 30 days after surgery in the FALCON trial. A comparison of patients included in the TALON study versus the FALCON trial overall is presented in Table S2. Of note, there were fewer patients undergoing elective surgery, fewer female patients, more intermediate/minor operations, and more contaminated/dirty surgery in TALON than in FALCON overall.

Feasibility outcomes
Baseline demographics grouped by whether telephone contact was made or not are presented in Table 1. Overall, the telephone contact rate was high at 90.3% (1088 of 1196 patients), with 9.7% (108 of 1196 patients) lost to follow-up, with some variability by country (Table S3). The WHQ was completed for all but one patient where successful contact was made (1087 of 1088 patients; 99.9%). The rate of telephone contact reduced as the time from date of surgery increased (Fig. S1). The most significant factor associated with lower odds of telephone contact in the multivariable model was time from surgery (Fig. 2 and Table S4). Importantly, patients with non-midline incisions (adjusted odds ratio 0.36 (95% c.i. 0.17 to 0.73); P = 0.005) or with a confirmed reference test diagnosis of SSI (OR 0.42 (95% c.i. 0.20 to 0.92); P = 0.006) were less likely to be contactable. Where data were available, most patients were followed up with one (267 of 560 patients; 47.7%) or two to three (185 of 560 patients; 33.0%) attempts at telephone follow-up (data missing for 636 patients). Patients overall felt very satisfied (393 of 550 patients; 71.5%) or satisfied (152 of 550 patients; 27.6%) with undergoing telephone WHQ follow-up (data missing for 646 patients).
Telephone WHQ administration was performed across diverse settings and patient groups, in 22 languages and 36 hospitals (Table 2). Overall, 65.0% of contactable patients (707 of 1087) lived in urban settings and 34.9% of contactable patients (380 of 1087) lived in rural settings; data were missing for 1 patient. In addition, 64.4% of contactable patients (701 of 1087) received the call using their own telephone, whereas 33.7% of contactable patients (367 of 1087) received the call using a family member's telephone. A total of 699 patients (64.2%) used a smartphone with video capability. Importantly, the WHQ was mainly delivered by non-consultant (attending) grade researchers (other doctor for 367 patients (33.7%), research nurse for 327 patients (30.1%), and other non-clinical for 385 patients (35.4%)) and took less than 20 min to complete for 96.0% of patients (528 of 550 patients; data missing for 538 patients). There were several differences in the implementation of telephone WHQ follow-up across the participating countries (Table 2).

Patterns of FALCON trial follow-up
An overview of the grouping of patients included in this study is shown in Fig. 1. Of the 1209 patients who were contactable for telephone WHQ follow-up, 531 (47.5%) had a FALCON trial in-person follow-up. The proportion of patients with in-person follow-up over the study interval is shown in Fig. S2. Having a telephone follow-up pathway (telephone FALCON trial follow-up and/or telephone WHQ) led to 52.5% additional patients (557 of 1088) with complete outcome assessment (estimated 'retention benefit') compared with in-person FALCON trial follow-up alone. No adverse events were reported related to completion of in-person FALCON trial follow-up or the telephone WHQ.
There were several differences between the patients who had in-person FALCON trial follow-up, telephone FALCON trial follow-up only, and no trial follow-up (Table S5). Of note, there were fewer patients in rural settings (30.7% versus 39.0%; P = 0.006) and fewer male patients (48.0% versus 58.7%; P < 0.001), as well as more patients with an obstetric indication (18.8% versus 5.9%; P < 0.001) and urogenital (22  open non-midline (44.6% versus 24.1%; P < 0.001) surgery that returned for in-person versus telephone FALCON follow-up. However, patients from all participating countries and having a mixture of baseline risk and operation types were included in both groups. Of the patients who had in-person FALCON trial follow-up (531 patients), the timing of the telephone WHQ was per protocol for 388 patients (73.1%) and outside of protocol for 141 patients (26.6%) (Fig. S3).

Diagnostic accuracy
The level of data missingness overall for all item responses was low (13 of 10 089; 0.1%) and similarly for each individual item (range 0.0-0.1%), so complete case analysis was conducted without imputation. Patients' total WHQ scores for those with and without a diagnosis of SSI made at the FALCON trial assessment 30 days after surgery are presented in Fig. S4 and Table S6. The proportion of patients with an SSI at each WHQ point score interval is presented in Fig. S5. In patients with a WHQ point score of zero (that is, they did not report any symptoms of SSI over the telephone; 147 patients) who did go on to have an SSI diagnosis made on 30-day follow-up (7 patients), the features that were most commonly detected in person were purulent fluid (6 of 7 patients), pain at the wound site (6 of 7 patients), and diagnosis of SSI by a clinician or on imaging (6 of 7 patients) (Table S7).
A summary of the performance metrics and diagnostic test accuracy statistics is shown in Table 3, Table S8, and Fig. 3. In the per-protocol analysis (388 patients), the WHQ demonstrated excellent overall discrimination (AUROC 0.869 (95% c.i. 0.824 to 0.914)). The discrimination was similar in sensitivity analyses  S9. The cut-off point identified using Youden's index was 3.5 (WHQ total score greater than or equal to 4), which diagnosed post-discharge SSI with a sensitivity of 0.701 (95% c.i. 0.610 to 0.792), a specificity of 0.911 (95% c.i. 0.878 to 0.943), a positive predictive value of 0.723 (95% c.i. 0.633 to 0.814), and a negative predictive value of 0.901 (95% c.i. 0.867 to 0.935).
The performance of the WHQ was maintained across key subgroups (Fig. 3). Some differences were observed (reduced overall discrimination in rural settings (AUROC 0.818 (95% c.i. 0.721 to 0.914)) versus urban settings (AUROC 0.886 (95% c.i. 0.836 to 0.937)) and poorer discrimination after emergency surgery (AUROC 0.871 (95% c.i. 0.826 to 0.916)) versus elective surgery (AUROC 0.966 (95% c.i. 0.895 to 1.000))), although the 95% confidence intervals overlapped for both of these comparisons and interpretation of the analysis of elective surgery was limited by a low SSI rate in the elective surgery subgroup.

Community engagement and involvement
Patients had a direct impact on study delivery and reporting. First, variables related to acceptability, the number of attempts needed, and time taken were added to the telephone WHQ pathway item set in response to pilot testing and early exploration of the data during study monitoring. Second, several suggestions were provided to iteratively improve the implementation of telephone WHQ administration. To summarize this shared learning, a toolkit (Appendix I) was co-produced with study authors and provided to sites to share best practice for acceptable and inclusive delivery of a telephone follow-up pathway. This was presented as a slide presentation (Microsoft PowerPoint®, Microsoft Corporation, Redmond, WA, USA) and infographic poster (Adobe Illustrator®, Adobe, San Jose, CA, USA). Finally, subgroup analysis was added for mild SSI only, due to concerns that patients with less severe problems may be missed and so delay receiving care.

Fig. 2 Factors associated with successful telephone contact in a multivariable model
A lower OR indicates a lower likelihood of successful telephone contact to complete the TALON questionnaire. Full model presented in Table S4. WHQ, Wound Healing Questionnaire; SSI, surgical-site infection.

Discussion
This prospective validation study within a large international pragmatic trial demonstrated high feasibility and validity of telephone assessment for diagnosis of SSI in low-resource environments using the adapted WHQ. The WHQ was demonstrated to be suitable for use across a diverse range of settings, countries, and languages in three continents with high completion and low missing data rates. The diagnostic accuracy of the WHQ score was good when delivered per-protocol and was robust regarding several sensitivity analyses. However, it was less discriminative in certain subgroups, such as patients living in rural areas. Several cut-off points of the WHQ score and their corresponding diagnostic accuracy statistics were presented to facilitate application of the WHQ to different contexts. Co-production of the telephone WHQ administration pathway facilitated culturally and contextually attuned delivery. This tool is now available for global implementation in postoperative surveillance pathways and to optimize efficient trial design and conduct.
Few existing high-quality studies have evaluated the diagnostic accuracy of telemedicine methods for remote diagnosis of SSI. A prospective cohort study published in 2022 raised a significant concern for under-detection of SSI using non-standardized methods. On meta-analysis, only four studies were identified with paired in-person and telephone follow-up for which diagnostic test accuracy statistics could be calculated 13. Three studies were at high risk of bias and just one, the UK validation of the English language WHQ, was identified as being at low risk of bias 23. This is why this instrument was chosen to update and adapt to use in this international study. These international data therefore play an important role in informing safe upscaling of methods for remote postoperative surveillance. Differences in performance for patients living in rural versus urban settings may reflect differences in items related to the treatment pathway for wound infection (for example, seeking advice for a wound problem, readmission to hospital) and patients' access to care in rural environments. Whilst the sensitivity may be marginally reduced, remote follow-up methods may improve the reach into these communities, improve diversity and representation, and reduce attrition bias.
There are several ways in which this WHQ instrument may be applied. First, it may be used in research studies to provide a diagnosis of SSI (that is, a binary outcome of SSI/no SSI) remotely, without the need for in-person review. Choice of cut-off SSI threshold score would need to consider the balance of sensitivity and specificity, and the consequences of missing or overdiagnosing SSI. This has important implications for trial design and conduct. Trials regarding SSI need to be large and pragmatic, and a validated remote method for assessing SSI will reduce trial costs. Second, the WHQ may be used in clinical practice to triage patients into existing clinical care pathways, with those at very low risk of an SSI diagnosis being given reassurance and those at moderate or high risk of an SSI diagnosis being asked to return for outpatient assessment. This could be adopted by either primary or secondary care depending on the structure of the local health system, but will require further optimization during implementation. Other work in this area has suggested that triage using remote, digital methods is safe and feasible, and yields cost savings [35]. Combining remote tools to detect SSI and other common postoperative complications could be an accessible and rapid step towards the digital future of surgery. However, further research is needed to explore the impact of remote assessment pathways on reducing delays to care and improving clinical outcomes.
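The trade-off involved in choosing a cut-off score can be illustrated with a minimal sketch. The scores and reference diagnoses below are invented for illustration and are not study data; the function name `accuracy_at_cutoff` and the rule that a score at or above the cut-off counts as test positive are assumptions made for this example.

```python
from typing import Sequence


def accuracy_at_cutoff(
    scores: Sequence[int], has_ssi: Sequence[bool], cutoff: int
) -> tuple[float, float]:
    """Return (sensitivity, specificity), treating score >= cutoff as test positive."""
    pairs = list(zip(scores, has_ssi))
    tp = sum(1 for s, y in pairs if y and s >= cutoff)      # SSI, flagged
    fn = sum(1 for s, y in pairs if y and s < cutoff)       # SSI, missed
    tn = sum(1 for s, y in pairs if not y and s < cutoff)   # no SSI, reassured
    fp = sum(1 for s, y in pairs if not y and s >= cutoff)  # no SSI, flagged
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical questionnaire scores (possible range 0 to 29) with the
# reference-test diagnosis for each patient. Illustrative values only.
scores = [0, 1, 2, 3, 4, 6, 8, 10, 12, 15]
has_ssi = [False, False, False, False, True, False, True, True, True, True]

for cutoff in (2, 4, 8):
    sens, spec = accuracy_at_cutoff(scores, has_ssi, cutoff)
    print(f"cut-off {cutoff}: sensitivity {sens:.2f}, specificity {spec:.2f}")
```

In this toy example, raising the cut-off trades sensitivity for specificity. A triage pathway that prioritizes not missing SSI might therefore favour a lower cut-off than a trial that needs a specific binary outcome.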
The present study confirms that digital follow-up pathways in low-resource environments are feasible and resilient. This is consistent with World Bank estimates of high access to mobile communications [36]. By moving to remote, telephone assessment, over 50% additional patients were able to be followed up who may otherwise have been lost to follow-up, substantially improving trial retention [29,37-39]. Intuitively, the time from surgery to attempted follow-up was strongly associated with the likelihood of successful contact. Certain groups were highlighted to be more challenging to reach. Patients with non-midline incisions may represent patients undergoing appendicectomy or cholecystectomy, who are likely to return to work soon after their operation and have limited physical opportunity and/or reflective motivation to complete follow-up [40]. An association between SSI diagnosis and reduced odds of successful contact highlights a potential risk for attrition bias. Specific efforts to improve retention in these groups should be co-developed by CEI partners in future trials.
Table 3 footnotes (continued): S8. *Overall analysis included only patients with per-protocol WHQ administration. †Includes ad hoc, translated from English by questionnaire administrator and ad hoc, translated from English with formal translator. ‡Surgical-site infection recorded using reference test of 30-day in-person FALCON trial follow-up. §Cut-off scores defined using Youden's index, which maximizes the sum of sensitivity and specificity in the cohort of interest. Implementation of the WHQ should be supported by clinical decision-making using cut-off scores in Table S9. SSI, surgical-site infection; AUROC, area under the receiver operating characteristic curve (used as an overall measure of discrimination); WHQ, Wound Healing Questionnaire.
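For reference, Youden's index, named in the Table 3 footnote, has a standard definition that can be written out as follows. The symbols $J$, $c$, and $c^{*}$ are notation introduced here for illustration and do not appear in the original report.

```latex
J(c) = \mathrm{sensitivity}(c) + \mathrm{specificity}(c) - 1,
\qquad
c^{*} = \arg\max_{c} J(c)
```

Subtracting 1 does not move the maximum, so maximizing $J$ is equivalent to maximizing the sum of sensitivity and specificity. The index weights a missed SSI and an overdiagnosed SSI equally, which is one reason a different cut-off may be preferable where the consequences of the two errors differ.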

Postoperative surveillance is burdensome in high-volume, low-resource settings, both for patients and health systems. Remote follow-up is likely to substantially reduce direct and indirect costs (for example time out of work, informal caregivers), for patients who may already be at risk of catastrophic expenditure as a result of their surgical episode [10]. Task shifting of wound assessment to more junior or non-clinical staff is likely to significantly improve efficiency and reduce the 'footprint' of research studies on local systems. Here, the WHQ was largely delivered by non-expert assessors, which both reduced the opportunity cost to the limited surgical workforce and built capacity in research skills and wound evaluation.
Video and photographic assessment of the healing surgical wound is a promising area of innovation that was not evaluated in the present study [41,42]. Assessment using the telephone WHQ was less accurate for 'mild SSI' (that is, not needing reoperation) in a subgroup analysis, and signs such as purulent fluid, wound opening, and greater than expected pain on palpation were sometimes missed by the WHQ in an exploratory analysis. The evidence base for adoption of this 'enhanced' remote assessment remains scarce, but it has been widely adopted during the SARS-CoV-2 pandemic [43,44]. The data of the present study show promise for the feasibility of video and photographic assessment in low-resource contexts, with 64.2% of patients having access to a mobile telephone with a camera (range by country: 11.1% in Rwanda to 80.6% in Mexico). Evidence is urgently required to better understand the safety and potential limitations of this practice.
The present study has several limitations. One, it is assumed that the reference test of in-person assessment could correctly detect when a wound infection had or had not occurred. Whilst in the FALCON trial there was a minimum training requirement for those involved in wound evaluation, false positives or false negatives at in-person review would affect the estimates of diagnostic accuracy upon administration of the WHQ and no fully objective reference test is available. Two, there was a theoretical risk of patients developing a new SSI between their WHQ completion and 30-day follow-up. This is clinically unlikely and not supported by existing cohort data [29]. Three, the WHQ was commonly performed with ad hoc translation by the questionnaire administrator. This may have decreased both the reproducibility and accuracy of the instrument, but reflected the diverse, real-world setting of delivery, and no significant difference was seen in discrimination when translation was performed ad hoc versus with a pre-translated questionnaire. Four, despite a careful quality assurance and training process, repeated measurements were not performed, so inter-rater or intra-rater reliability could not be evaluated. Five, the study was underpowered to explore differences in accuracy between countries or languages. Six, there was a risk of partial verification bias in only including patients with in-person FALCON follow-up in the diagnostic accuracy analysis, but this was addressed using inverse probability weighting. Seven, small changes to the published SAP were made; however, these were responsive to CEI and investigators' priorities and are described transparently. Eight, participants were concurrently involved in an RCT in SSI reduction so may have been more conscious of symptoms of infection than a general population. Nine, acceptability of telephone follow-up was assessed at the end of the telephone WHQ and was not anonymized, so was at risk of social acceptability bias. It is also possible that patients with more SSI symptoms had more frequent healthcare interactions and were able to respond more accurately to the WHQ, creating a verification bias. The study excluded patients who died before 30 days (29 patients), representing a competing risk when interpreting the generalizability of the results to a highest-risk group of patients. All three of these sources of bias had minimal impact on relative diagnostic ORs upon meta-epidemiological analysis. Finally, we did not assess accuracy of the WHQ in detecting SSI in non-abdominal surgery or the clinical impact of introducing pathways for remote wound surveillance into routine practice; this highlights important areas for further research.
However, with a robust protocol, run through the platform of a randomized trial and in accordance with best-practice guidelines, this study provides high-quality evidence to support implementation of postoperative tele-surveillance. TALON also provides a proof of concept for international SWATs, which can now be used to explore other high-priority methodological challenges in other global health trials, including outcome assessment in other perioperative events.
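The inverse probability weighting mentioned among the limitations can be sketched in general terms. This is a minimal illustration of the technique, not the weighting model used in the study, which is not described in this section. It assumes that the probability of receiving the reference test depends only on a stratum (here, the index test result), and all names (`ipw_accuracy`, `make`, the record keys) and counts are hypothetical.

```python
from collections import Counter


def ipw_accuracy(records):
    """Estimate (sensitivity, specificity) from verified patients only,
    weighting each by 1 / P(verified | stratum)."""
    total = Counter(r["stratum"] for r in records)
    verified = Counter(r["stratum"] for r in records if r["verified"])
    tp = fn = tn = fp = 0.0
    for r in records:
        if not r["verified"]:
            continue  # no reference diagnosis available for this patient
        # Inverse of the observed verification fraction in this stratum
        weight = total[r["stratum"]] / verified[r["stratum"]]
        if r["ssi"] and r["test_pos"]:
            tp += weight
        elif r["ssi"]:
            fn += weight
        elif r["test_pos"]:
            fp += weight
        else:
            tn += weight
    return tp / (tp + fn), tn / (tn + fp)


def make(n, test_pos, verified, ssi):
    """Build n identical hypothetical patient records (stratum = index test result)."""
    return [
        {"stratum": test_pos, "test_pos": test_pos, "verified": verified, "ssi": ssi}
        for _ in range(n)
    ]


# Invented counts: test-positive patients are verified more often (8 of 10)
# than test-negative patients (10 of 20).
records = (
    make(6, True, True, True)
    + make(2, True, True, False)
    + make(2, True, False, None)
    + make(1, False, True, True)
    + make(9, False, True, False)
    + make(10, False, False, None)
)

sens, spec = ipw_accuracy(records)
print(f"weighted sensitivity {sens:.3f}, weighted specificity {spec:.3f}")
```

In these invented data, the unweighted sensitivity among verified patients would be 6/7. Weighting lowers it because under-verified test-negative patients, including those with SSI, count for more. In practice the verification probability would usually be estimated from a model of patient characteristics rather than from a single stratum.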

Funding
TALON was funded through a doctoral research fellowship from the National Institute for Health and Care Research (NIHR) Academy (NIHR300175). The FALCON trial was funded by an NIHR Global Health Research Unit Grant (NIHR 16.136.79). The funder and sponsor had no role in study design or writing of this report. The funder has approved the submission of this report for publication. The views expressed are those of the authors and not necessarily those of the National Health Service, the NIHR, or the UK Department of Health and Social Care. Jane Blazeby and Rhiannon Macefield are funded by the NIHR Bristol and Weston Biomedical Research Centre. Jane Blazeby is an NIHR Senior Investigator.

Fig. 3 Receiver operating characteristic curves for the Wound Healing Questionnaire in detecting surgical-site infection up to 30 days after surgery. SSI, surgical-site infection.

Table 2 Follow-up for patients contactable for telephone WHQ (n = 1088)
Values are n (%). *Question added after pilot phase in response to Community Engagement and Involvement group feedback, so not available for patients recruited in the pilot phase.

Table 3 Diagnostic test accuracy characteristics overall and across subgroups
Values are n (%) unless otherwise indicated. Full set of test accuracy statistics presented in Table