Role of the halo sign in the assessment of giant cell arteritis: a systematic review and meta-analysis

Abstract Objectives This systematic review and meta-analysis aimed to evaluate the diagnostic value of the halo sign in the assessment of GCA. Methods A systematic literature review was performed using MEDLINE, EMBASE and Cochrane central register databases up to August 2020. Studies reporting the sensitivity and specificity of the US halo sign for GCA (index test) were selected. Studies with a minimum of five participants were included. Studies using clinical criteria, imaging such as PET-CT and/or temporal artery biopsy (TAB) as the reference standard were selected. Meta-analysis was conducted with a bivariate model. Results The initial search yielded 4023 studies. Twenty-three studies (patients n = 2711) met the inclusion criteria. Prospective (11 studies) and retrospective (12 studies) studies in academic and non-academic centres were included. Using clinical diagnosis as the standard (18 studies) yielded a pooled sensitivity of 67% (95% CI: 51, 80) and a specificity of 95% (95% CI: 89, 98). This gave a positive and negative likelihood ratio for the diagnosis of GCA of 14.2 (95% CI: 5.7, 35.5) and 0.35 (95% CI: 0.22, 0.54), respectively. Using TAB as the standard (15 studies) yielded a pooled sensitivity of 63% (95% CI: 50, 75) and a specificity of 90% (95% CI: 81, 95). Conclusion The US halo sign is a sensitive and specific approach for GCA assessment and plays a pivotal role in diagnosis of GCA in routine clinical practice. Registration PROSPERO 2020 CRD42020202179.


Key messages
Compared with previous meta-analyses, the halo sign had similar sensitivity (67%) but higher specificity (95%).
Higher specificity might potentially reflect improved technique and equipment.
Studies showed design heterogeneity; we recommend that future researchers adopt multicentre prospective standardized study protocols.

Introduction
GCA is a form of large vessel vasculitis, which can cause critical ischaemia. Associated retinal ischaemia can lead to permanent blindness in approximately 15-25% of patients, making it a medical emergency [1]. However, making a diagnosis of GCA can be challenging, because none of the symptoms or laboratory findings have perfect sensitivity or specificity for the disease [2]. The ACR 1990 classification criteria for GCA have been developed for research purposes, but have limited specificity for GCA in daily clinical practice [3].
Since the publication of the ACR Classification criteria, ultrasonography has been shown to play a pivotal role in the diagnosis of GCA, with the most specific finding being the halo sign, a circumferential hypoechoic vessel wall thickening around the lumen, most probably attributable to vessel wall oedema [4] and intimal hyperplasia [5]. GCA predominantly involves the external carotid artery and its branches, such as the temporal arteries (cranial GCA), the aorta, subclavian and axillary arteries [6]. Traditionally, glucocorticoids (GCs) have been the mainstay of treatment for GCA [7], although cohort studies and the GiACTA trial showed only 15-20% sustained remission with GCs alone [8]. Current guidelines suggest starting GCs immediately in patients where GCA is strongly suspected, pending investigation, to prevent serious ischaemic complications [9]. Long-term use of high-dose GCs can lead to severe adverse effects, such as hypertension, hyperglycaemia, osteoporosis, Cushingoid changes, mood disturbance, electrolyte imbalance, cataracts and glaucoma, but this is not an exhaustive list [9,10]. Therefore, a prompt and accurate diagnosis is vital to ensure that vision is preserved whilst avoiding unnecessary exposure to a potentially toxic treatment [11]. GCA fast-track clinics have been shown to reduce permanent visual loss by facilitating a rapid specialist clinical assessment with US of the temporal and/or axillary arteries [1,12].
Historically, a positive temporal artery biopsy (TAB) has been the gold standard test for a histological diagnosis of GCA [13,14]. However, TAB is invasive and lacks sensitivity [15]. This deficiency is particularly true with extra-cranial involvement, where access to histological samples has obvious practical constraints and is usually identified incidentally following cardiovascular surgery [15]. Non-invasive imaging techniques, including US, MRI and PET-CT, are readily able to identify these patients [2,16,17]. The EULAR recommends US of temporal and/or axillary arteries as the first imaging modality for suspected predominantly cranial GCA, where adequate expertise and equipment are available [18]. US is safe, non-invasive and has high sensitivity. It is a relatively quick procedure, often used as a point-of-care test, well tolerated by patients, with a growing body of evidence for its use in follow-up [19]. At present, a non-compressible halo sign is the main finding on US of active GCA patients [4,13,20]. The accuracy and criterion validity of US in the diagnosis of GCA was investigated in several studies [19,21-23]. A meta-analysis of prospective studies compared the final diagnosis of GCA with temporal artery US, showing a pooled sensitivity of 77% and a pooled specificity of 96% [24].
US also allows for assessment of the intimal media complex and measurement of intimal medial thickness (IMT). Although no definite consensus has been reached, studies suggest that at the age of 70 years, a normal temporal artery has an IMT of approximately 0.2 mm, whereas abnormal or inflamed temporal arteries have an IMT range between 0.5 and 0.9 mm [18,25]. Axillary arteries of patients aged approximately 70 years have a normal IMT of approximately 0.6 mm, whereas patients with extra-cranial (large vessel) GCA have a mean IMT of 1.6-1.7 mm [25,26]. An axillary artery IMT of 1.0 mm was determined as a cutoff value to discriminate between a normal and abnormal artery by Schäfer et al. [25]. Currently, US assessment of suspected GCA patients is reported in a dichotomous manner (positive or negative). However, a range of extent and severity of these findings can be observed in the temporal and axillary arteries [27]. A recent post hoc prospective study of a quantitative ultrasonographic halo score, which combines the grade and extent of halos seen in temporal arteries, their branches and axillary arteries in GCA, has shown value as a marker of disease activity and ocular ischaemia [3]. Whether the halo score might be of help with diagnosis, prognosis and GCA monitoring is being tested in an ongoing prospective multicentre study of patients presenting with new GCA (HAS-GCA study; NIHR IRAS# 264294) [13].
This systematic review and meta-analysis focused on evaluation of the clinical role of the halo sign in managing a clinically suspected GCA population and ascertaining the areas that warrant further exploration. This study also updates estimates of diagnostic accuracy, because newer studies have been published using modern US equipment.

Methods
For this literature review and meta-analysis, we followed the population, intervention, comparator and outcome (PICO) format [28] (Supplementary Table S1, available at Rheumatology Advances in Practice online) and the Preferred Reporting Items for Systematic Reviews and Meta-Analyses of Diagnostic Test Accuracy Studies (PRISMA-DTA) guidelines [29,30]. The study protocol was registered with the international prospective register of systematic reviews (PROSPERO 2020 CRD42020202179). No ethical approval or informed consent was required.

Literature search
The literature was searched systematically by two investigators (A.S. and F.C.) using a broad search of three databases: MEDLINE, EMBASE and the Cochrane central register (Supplementary Table S2, available at Rheumatology Advances in Practice online). These databases were searched for original primary studies that examined the sensitivity and specificity of the halo sign, demonstrated by temporal artery and/or axillary artery ultrasonography for GCA diagnosis, published in English, from their inception dates until August 2020. The search terms included giant cell arteritis, temporal arteritis, diagnostic imaging, imaging, ultrasound, ultrasonography, halo sign and temporal artery biopsy. An experienced medical librarian carried out the complete search.
Alwin Sebastian et al.

Study selection and eligibility criteria
The titles and abstracts were screened by two independent reviewers (A.S. and F.C.). Full texts were assessed independently by two reviewers (A.S. and F.C.). Any disagreement between reviewers was resolved by consensus or, if consensus could not be obtained, by consulting a third reviewer (K.S.M.v.d.G.), who made the final decision.
We included prospective and retrospective cross-sectional or longitudinal studies and randomized controlled trials of GCA, conducted in single or multicentre settings, provided the patients had temporal and/or axillary artery US performed for diagnosis. We included studies: (1) containing patients with suspected GCA; (2) using clinical diagnosis, an imaging test (US/PET-CT) and/or TAB as the reference standard for GCA; (3) in which US was performed at any time from the clinical suspicion of GCA; and (4) in which at least five patients had GCA and at least five did not have GCA. Case reports, case series, conference abstracts and case-control studies were excluded because specificity could not be evaluated. Adult human subjects (age ≥50 years), clinically classified as suspected GCA, were included. The reference standard clinical diagnosis of GCA was considered to be met when the treating clinician suspected GCA based on clinical criteria such as age ≥50 years, abnormal blood markers (CRP >5 mg/l, ESR >30 mm/h), unequivocal cranial symptoms of GCA and/or PMR symptoms and evidence of GCA by imaging (US/PET-CT) or positive TAB. All the participants must have had a temporal artery and/or axillary artery US to look for the halo sign and/or compression sign, occlusion and stenosis. TAB was also used separately as a reference standard.

Data collection
Study characteristics and data from 2 × 2 tables (true positives, false negatives, false positives and true negatives) were extracted by one reviewer (A.S.) and checked by a second reviewer (F.C.). Any disagreement between reviewers was resolved by consensus or, if no consensus could be obtained, by a third reviewer (K.S.M.v.d.G.), who made the final decision. A standard data sheet was used to collect information on study characteristics. Authors of studies were not contacted. In the event of potential overlap of patients between studies from the same hospital, data were obtained from the most extensive study for the meta-analysis. When multiple reference standards were used in the same study, the clinical diagnosis was used as the primary reference standard for the data analysis, and the other was used for sub-group analysis.
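As an illustration of how per-study accuracy follows from an extracted 2 × 2 table, the sketch below computes sensitivity and specificity with 95% CIs. It is a minimal example, not the authors' code: the function names and the example counts are hypothetical, and the Wilson score interval is an assumption, because the CI method used for individual studies is not stated here.

```python
from math import sqrt


def wilson_ci(successes, total, z=1.96):
    """Wilson score interval for a proportion (assumed CI method, z=1.96 for 95%)."""
    if total <= 0:
        raise ValueError("total must be positive")
    p = successes / total
    denom = 1 + z**2 / total
    centre = (p + z**2 / (2 * total)) / denom
    half_width = z * sqrt(p * (1 - p) / total + z**2 / (4 * total**2)) / denom
    return centre - half_width, centre + half_width


def accuracy_from_2x2(tp, fn, fp, tn):
    """Sensitivity and specificity of the index test from one study's 2 x 2 table."""
    diseased = tp + fn       # reference standard positive (GCA)
    non_diseased = tn + fp   # reference standard negative (no GCA)
    return {
        "sensitivity": tp / diseased,
        "sensitivity_ci": wilson_ci(tp, diseased),
        "specificity": tn / non_diseased,
        "specificity_ci": wilson_ci(tn, non_diseased),
    }


# Hypothetical counts for a single study (not taken from any included study).
example = accuracy_from_2x2(tp=30, fn=10, fp=5, tn=55)
print(example)
```

The example table satisfies the eligibility rule of at least five patients with GCA and at least five without, which is what makes both proportions estimable.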

Quality assessment
The risk of bias was evaluated by two reviewers (A.S. and F.C.) with the quality assessment of diagnostic accuracy studies (QUADAS-2) tool [31]. Any disagreement between reviewers was resolved through discussion with other review authors (S.I., J.J. and B.D.). The QUADAS-2 tool focuses on the bias and applicability of study results regarding patient selection, the index test, the reference standard, and study flow and timing [31].

Statistical analysis
The sensitivity and specificity of the halo sign, along with their 95% CIs, were calculated for each study, and the total sample size of reviews was plotted. Study heterogeneity was examined visually by plotting sensitivity and specificity in forest plots and receiver operating characteristics (ROC) space [32]. We used hierarchical logistic regression modelling (bivariate model) (Supplementary Fig. S1, available at Rheumatology Advances in Practice online) to determine pooled estimates of diagnostic accuracy parameters (i.e. sensitivity, specificity, diagnostic odds ratio and likelihood ratios). STATA v.15 software was used for the statistical analysis and creating hierarchical summary receiver-operating characteristic (HSROC) plots. Forest plots were created in REVIEW MANAGER v.5.3.
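For readers unfamiliar with the bivariate approach, one common formulation of the model is sketched below. The exact parameterization fitted in STATA is not given in the text, so the notation here is an assumption: for study i, n_{1i} and n_{0i} denote the numbers of patients with and without GCA, and Se_i and Sp_i the study-specific sensitivity and specificity.

```latex
\begin{aligned}
TP_i &\sim \mathrm{Binomial}(n_{1i},\, Se_i), \qquad
TN_i \sim \mathrm{Binomial}(n_{0i},\, Sp_i), \\[4pt]
\begin{pmatrix} \operatorname{logit}(Se_i) \\ \operatorname{logit}(Sp_i) \end{pmatrix}
&\sim \mathcal{N}\!\left(
\begin{pmatrix} \mu_{Se} \\ \mu_{Sp} \end{pmatrix},
\begin{pmatrix} \sigma_{Se}^{2} & \sigma_{Se,Sp} \\ \sigma_{Se,Sp} & \sigma_{Sp}^{2} \end{pmatrix}
\right), \\[4pt]
\widehat{Se}_{\text{pooled}} &= \operatorname{logit}^{-1}(\hat{\mu}_{Se}), \qquad
\widehat{Sp}_{\text{pooled}} = \operatorname{logit}^{-1}(\hat{\mu}_{Sp}).
\end{aligned}
```

In this formulation the two accuracy measures are pooled jointly, and the covariance term allows for the trade-off between sensitivity and specificity across studies, for example from differing positivity thresholds.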

Results

Study characteristics
The initial search yielded 4023 unique studies. Based on title/abstract screening, 106 articles were selected for full-text screening. Twenty-three articles were selected for the systematic review and meta-analysis [15,24,…]. The flow of information through the review is illustrated in the PRISMA flow diagram [54,55] (Fig. 1).
A total of 2711 subjects were included from the 23 studies, and their characteristics are summarized in Table 1. There were 12 retrospective and 11 prospective studies performed at academic and non-academic centres. Clinical diagnosis was the most commonly used reference standard, whereas some reports presented TAB as the reference standard. A variable proportion of patients underwent unilateral or bilateral temporal artery US assessment (Table 2). The clinical diagnosis was based mainly on clinical and laboratory findings, imaging and/or TAB results. In the studies using clinical diagnosis as a reference standard (18 studies), all patients were reviewed to ensure the clinical diagnosis was not later revised. The majority of studies assessed the cranial arteries alone (15 studies), whereas others evaluated both cranial and extra-cranial arteries (eight studies). Most of the studies tested the halo sign as the main lesion to define vasculitis. Other US signs addressed (mostly in combination with the halo sign) were stenosis and occlusion (four studies) [33,39,43,50] and the compression sign (two studies) [24,40]. Fifteen studies used TAB [15,33,34,39,41-46,48-50,52,53], and two studies used the compression sign [24,40] as reference. More than half of the publications used colour duplex US with frequencies of 5-15 MHz. The US specifications are summarized in Table 2.

Evaluation of bias
Patient selection and flow and timing were the primary sources of bias (Fig. 2). Studies using TAB as the reference standard might have contributed to the selection bias, because there would be a strong initial clinical suspicion to request this invasive test. Studies using ACR 1990 clinical criteria as the diagnosis standard were at high risk of bias, because the index test could have altered the initial clinical decision. Flow and timing carried a considerable risk of bias, because the index test was performed at various time periods from the initial clinical suspicion of GCA. Additional data and details on the risk of bias assessment are summarized in Fig. 2 and Supplementary Fig. S2.

Meta-analysis
Results of the pooled estimates for US signs of GCA in comparison to the clinical diagnosis or TAB as reference standard are summarized in Table 3. All 23 studies (n = 2711 patients) investigated the value of the halo sign in comparison to the clinical diagnosis ± TAB, yielding a pooled sensitivity of 67% (95% CI: 51, 80) and a specificity of 95% (95% CI: 89, 98). This gave a positive and negative likelihood ratio for the diagnosis of GCA of 14.2 (95% CI: 5.7, 35.5) and 0.35 (95% CI: 0.22, 0.54), respectively (Fig. 3A). When analysed, the halo sign with TAB as standard yielded a pooled sensitivity of 63% (95% CI: 50, 75) and a specificity of 90% (95% CI: 81, 95). The halo sign against TAB as standard revealed a positive likelihood ratio of 6.06 (95% CI: 3.34, …) (Fig. 3B). The analysis of the combined US signs (halo sign, stenosis or occlusion) in comparison to clinical diagnosis or TAB (four studies, n = 270) resulted in a sensitivity of 52% (95% CI: 18, 84) and specificity of 81% (95% CI: 64, 91) (Supplementary Fig. S3A, available at Rheumatology Advances in Practice online). The combination of halo sign and stenosis (four studies, n = 230) resulted in a sensitivity of 43% (95% CI: 12, 80) and specificity of 85% (95% CI: 66, 94) (Supplementary Fig. S3B, available at Rheumatology Advances in Practice online). Authors of two studies (n = 140, both with low risk of bias), from the same research group, investigated the compression sign [24,40] and described sensitivities between 77 and 79% and a specificity of 100% of the compression sign when compared with the clinical diagnosis of cranial GCA. When comparing the studies done before 2010 (seven studies) and after 2010 (11 studies), later studies showed higher sensitivity of 71% (earlier studies, 63%) and similar specificity of 96% (earlier studies, 95%) (Supplementary Table S4, available at Rheumatology Advances in Practice online).
Forest plots and HSROC curves indicated that clinical diagnosis or TAB as a standard had limited heterogeneity, whereas halo sign with stenosis and occlusion or halo with stenosis showed high between-study heterogeneity (Supplementary Fig. S1, available at Rheumatology Advances in Practice online).
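The likelihood ratios reported in this section are linked to sensitivity and specificity by standard definitions, and the short sketch below makes that link explicit. It is an illustrative check, not the authors' analysis: the function name is hypothetical, and the inputs are the rounded pooled point estimates, so the results only approximate the model-based values. The published ratios come from the bivariate model, which is why the reported positive likelihood ratio of 14.2 differs slightly from the roughly 13.4 obtained here.

```python
def likelihood_ratios(sensitivity, specificity):
    """Return (LR+, LR-, diagnostic odds ratio) from sensitivity and specificity.

    LR+ = sensitivity / (1 - specificity)
    LR- = (1 - sensitivity) / specificity
    DOR = LR+ / LR-
    """
    if not (0 < sensitivity < 1 and 0 < specificity < 1):
        raise ValueError("sensitivity and specificity must lie strictly between 0 and 1")
    lr_pos = sensitivity / (1 - specificity)
    lr_neg = (1 - sensitivity) / specificity
    return lr_pos, lr_neg, lr_pos / lr_neg


# Rounded pooled estimates with clinical diagnosis as the reference standard.
lr_pos, lr_neg, dor = likelihood_ratios(0.67, 0.95)
print(f"clinical diagnosis: LR+ = {lr_pos:.1f}, LR- = {lr_neg:.2f}, DOR = {dor:.1f}")

# Rounded pooled estimates with TAB as the reference standard.
tab_lr_pos, tab_lr_neg, tab_dor = likelihood_ratios(0.63, 0.90)
print(f"TAB: LR+ = {tab_lr_pos:.1f}, LR- = {tab_lr_neg:.2f}, DOR = {tab_dor:.1f}")
```

With clinical diagnosis as the reference, the negative likelihood ratio from the rounded estimates is about 0.35, matching the reported value.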

Discussion
This systematic review and meta-analysis evaluated the role of the halo sign in the assessment of GCA. When compared with previous meta-analyses, the diagnostic performance of the halo sign for the diagnosis of cranial GCA was of similar sensitivity (67% vs 68-77%) [19,22,23,56], but higher specificity (95% vs 81-96%) [19,22,23,56]. When combining the halo sign with occlusion or stenosis, the present study showed lower sensitivity (52% vs 78%) [56] and higher specificity (81% vs 79%) [56]. This discrepancy could be attributable to the inclusion of high-quality studies and exclusion of overlapping studies, and might also be related to better equipment, with 5-15 MHz probes used in the earlier studies. Another reason could be that occlusion and stenosis are not assessed routinely, as mentioned in OMERACT, and more work is certainly needed to standardize the definition of these findings. A recent study showed that when combining the GCA pre-test probability score with the halo sign, the sensitivity increases to between 94 and 100% [57].
The present study also showed that the diagnostic accuracy of the halo sign is comparable to that of TAB. US might be a more thorough GCA assessment than TAB, because it allows for detailed analysis of the temporal arteries along their entire length, minimizing the effect of skip lesions [58]. TAB is also an invasive procedure, which can have procedural complications, and is not readily available for re-assessment of the artery if relapse occurs. In line with these findings, a review by Schmidt et al. [59] reported that biopsy has a relatively low yield compared with US in GCA diagnosis. The statistical findings of the present study indicate that the halo sign is a useful tool that could be incorporated in everyday clinical practice, because US is cost-effective and provides more accurate and specific results for the assessment of GCA. The findings of the TABUL study provided significant results for the specificity and sensitivity of the halo sign in GCA assessment, with values of 69% and 82%, respectively [15]. It asserts that the use of US in GCA assessment is highly dependent on the halo sign, because it determines the presence of an area of inflammation in the arteries. A recent publication of the novel halo score, graded with the halo thickness, confirms that the halo sign and halo count are significantly correlated with inflammatory markers, ocular ischaemia and intimal hyperplasia on TAB [3].
A limitation of this systematic review and meta-analysis is the inclusion of both prospective and retrospective observational studies. The retrospective studies might have contributed to bias in analysis of the final data. It has not been possible to evaluate the specific issues related to US operator and image interpretation variability [60]. The reviews did not present inter-rater/intra-rater reliability data. Different sonographic skill levels of the rheumatologists or sonographers might have had an impact on the final results. When the colour intensity is more robust, such as in smaller vessels, it is easier to distinguish the dark, hypoechoic halo sign [56]. Other conditions, such as malignancy, ANCA vasculitis and infections, or poor US technique can give rise to a false-positive halo [52]. A further issue was the variation in methodology between the studies. Studies concluding that US is superior to TAB in diagnosis of GCA vary in their design [35,46]. We included studies even if US was performed >2 weeks after the initial clinical suspicion of GCA, although these patients would have been exposed to high-dose GCs, which might reduce the halo thickness and the accuracy of US. When the ACR classification criteria for GCA were applied as the reference standard [22,61], the meta-analyses reported a lower sensitivity and a higher specificity of the halo sign for GCA diagnosis. However, these criteria were designed for classification and research purposes and are inadequate for diagnosis of GCA in clinical practice [21]. Therefore, the use of ACR criteria as the reference standard could be a limiting factor in the present study.

Conclusion
This meta-analysis shows that the US halo sign has a significant role in the assessment and diagnosis of GCA. US is a sensitive and specific approach for GCA assessment, which seems to be improving with better equipment and user familiarity with US techniques. However, the studies analysed showed heterogeneity in their design and outcomes. Therefore, it is recommended that future researchers conduct multicentre prospective studies for analysing the effectiveness of the halo sign in the assessment of GCA, with a standardized study protocol.
Study concept and design: A.S., F.C., K.S.M.v.d.G., S.I. and B.D.; data collection: A.S. and F.C.; statistical analysis and data interpretation: A.S., F.C., J.J. and K.S.M.v.d.G.; all authors reviewed the manuscript content and gave the final approval of the version.

A temporal artery biopsy was also performed in some studies with the clinical diagnosis as the reference standard for GCA. Clinical diagnosis is the final diagnosis made according to the ACR criteria or physician diagnosis. DOR: diagnostic odds ratio; TAB: temporal artery biopsy.