Targeted use of intraoperative frozen-section analysis lowers the frequency of completion thyroidectomy

Abstract

Background
The impact of intraoperative frozen section (iFS) analysis on the frequency of completion thyroidectomy for the management of thyroid carcinoma is controversial. Although specialized endocrine centres have published their respective results, there are insufficient data from primary and secondary healthcare levels. The aim of this study was to analyse the utility of iFS analysis.

Methods
In the Prospective Evaluation Study Thyroid Surgery (PETS) 2 study, 22 011 operations for benign and malignant thyroid disease were registered prospectively in 68 European hospitals from 1 July 2010 to 31 December 2012. Group 1 consisted of 569 patients from University Medical Centre (UMC) Mainz, and group 2 comprised 21 442 patients from other PETS 2 participating hospitals. UMC Mainz exercised targeted but liberal use of iFS analysis for suspected malignant nodules. iFS analysis was compared with standard histological examination regarding the correct distinction between benign and malignant disease. The percentage of completion thyroidectomies was assessed for the participating hospitals.

Results
iFS analysis was performed in 35.7 per cent of patients in group 1 versus 21.8 per cent of those in group 2 (risk ratio (RR) 1.6, 95 per cent c.i. 1.5 to 1.8; P < 0.001). Sensitivity of iFS analysis was 75.0 per cent in group 1 versus 63.5 per cent in group 2 (RR 1.2, 1.2 to 1.3; P = 0.040). Completion surgery was necessary in 8.1 per cent of patients in group 1 versus 20.8 per cent of those in group 2 (RR 0.4, 0.2 to 0.7; P = 0.001).

Conclusion
iFS analysis is a useful tool in determining the appropriate surgical management of thyroid disease. Targeted use of iFS was associated with a significantly higher sensitivity for the detection of malignancy, and with a significantly reduced necessity for completion surgery.


Introduction
Exclusion of malignancy is currently the predominant indication for thyroid surgery in Germany and other parts of Europe. The suspicion of thyroid malignancy is based mainly on growth rate, ultrasound patterns, elastography, results of 99mTc scintigraphy (cold nodule) 1 and, increasingly, on results of 99mTc-sestamibi (MIBI) scintigraphy (mismatch) 2. Multinodular goitre, compression symptoms and/or multifocal disease are frequently concomitant factors favouring thyroid resection. Fine-needle aspiration cytology (FNAC) is not used frequently in Germany and other parts of Europe, despite recommendations in guidelines 1,3. Reasons include negative experiences with misleading FNAC results, attributable to inexperience in performing the biopsy and in the cytopathological assessment, and insufficient reimbursement for the procedure.
Intraoperative frozen-section (iFS) analysis is another option for the pathological evaluation of thyroid tumours, with the potential to influence the surgical strategy in the primary thyroid procedure, by switching to a more aggressive surgical approach in patients with malignancy. Despite 'suspicion of malignancy' being the predominant indication for thyroid surgery, and again in contrast to guidelines for the surgical management of thyroid disease 1,4, at present iFS analysis is performed with a low frequency in Europe. Since the 1980s, the utility of iFS analysis has remained controversial [5][6][7]. iFS analysis is hampered by the need to identify capsular and vascular invasion, and numerous authors [8][9][10][11][12] have reported a low sensitivity, in particular for the identification of follicular thyroid malignancy. Similar limitations, however, have also complicated preoperative FNAC, leading to 'indeterminate' diagnoses in 20-30 per cent of patients 13.
The Prospective Evaluation Study Thyroid Surgery (PETS) 2, a prospective multicentre European study, was initiated to analyse the quality and pattern of medical care for thyroid disease. PETS 2 includes clinics from all levels of primary, secondary and tertiary care 14 , with the aim of assessing 'service reality' and establishing evidence for future recommendations and guidelines. The aim of the present analysis was to examine the sensitivity and potential impact of iFS analysis on the surgical management of thyroid malignancy in a 'real world' setting.

Methods
PETS 2 included 68 European hospitals from Austria, Czech Republic, Germany, Norway and Poland, with the majority of centres being in Germany. A cross-section of low- and high-volume centres was included 14. Consecutive patients undergoing a thyroid procedure from 1 July 2010 to 31 December 2012 were included in the study, with a follow-up period for complications until 31 December 2015. Perioperative and follow-up data were collected prospectively using a predefined questionnaire and transmitted in pseudonymized form.
To illustrate the differences between targeted but liberal versus infrequent use of iFS analysis, results from University Medical Centre (UMC) Mainz, Section of Endocrine Surgery (group 1) and those of the remaining hospitals participating in PETS 2 (group 2) were compared. In addition to the PETS 2 data, at UMC Mainz the initially intended resection strategy (unilateral versus bilateral lobe resection) as well as strategy changes were documented.
This work was carried out in accordance with the ethical standards of the Helsinki Declaration of 1975. Informed consent was obtained from all participants. The study was approved by the national ethical committee.

Statistical analysis
Data were analysed with IBM SPSS® version 23 (IBM, Armonk, NY, USA). iFS analysis was compared with standard histological examination regarding the distinction between benign and malignant thyroid disease. Subgroup analysis, according to preoperative diagnosis, was also performed. Parameters of accuracy were calculated. The proportion of patients requiring completion surgery for intended radioiodine therapy 1,3 (total thyroidectomy ± central lymphadenectomy in patients with incomplete, mostly unilateral, thyroid surgery) was analysed in both cohorts. Results were compared between the groups using Fisher's exact test. Results with P < 0.050 were considered significant. Risk ratios (RRs) with 95 per cent c.i. are presented. Individual results for participating hospitals are presented in a funnel plot.
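
As an illustration of the group comparison described above, the following is a minimal sketch of a two-sided Fisher's exact test for a 2 × 2 table, written in Python rather than SPSS. The function name `fisher_exact_two_sided` and the example counts are hypothetical and are not taken from PETS 2. The two-sided P value is defined here as the sum of the probabilities of all tables that are no more likely than the observed one; this is a common convention, but it may differ in detail from the SPSS implementation.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact P value for the 2x2 table [[a, b], [c, d]].

    Rows are the two groups; columns are event / no event.
    (Hypothetical helper, not part of the study's SPSS workflow.)
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    denom = comb(row1 + row2, col1)

    def prob(k: int) -> float:
        # Hypergeometric probability of k events in row 1, all margins fixed.
        return comb(row1, k) * comb(row2, col1 - k) / denom

    p_obs = prob(a)
    total = 0.0
    # Range of event counts in row 1 that is compatible with the margins.
    for k in range(max(0, col1 - row2), min(row1, col1) + 1):
        p = prob(k)
        # Small relative tolerance so that tied tables are not dropped
        # because of floating-point rounding.
        if p <= p_obs * (1 + 1e-9):
            total += p
    return min(1.0, total)


# Hypothetical counts: 3 of 4 events in one group versus 1 of 4 in the other.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
print(f"P = {p_value:.3f}")
```

Exact integer binomial coefficients are used so that the sketch also works for tables with several thousand patients, at the cost of some speed.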

Results
PETS 2 included 22 011 operations; 569 operations were performed at UMC Mainz (group 1) and 21 442 at the remaining hospitals (group 2) (Table 1). The interinstitutional comparison of the hospitals participating in PETS 2 found that both iFS analysis and FNAC were used more frequently than average at UMC Mainz (Fig. 1).

Use of intraoperative frozen-section analysis and frequency of completion thyroidectomy
iFS analysis was performed in 35.7 per cent of patients in group 1 and 21.8 per cent of those in group 2 (Fig. 2). In group 1, iFS analysis was performed in 27.5 per cent of patients whose disease was considered benign before surgery but was deemed suspicious owing to intraoperative findings (such as visual appearance, presence of desmoplasia, palpation), compared with 18.8 per cent in group 2. The proportion deemed before surgery to be 'suspicious for malignancy' was 10.0 per cent in group 1 and 5.6 per cent in group 2. For these patients, iFS analysis was performed in 27.1 per cent in group 1 versus 15.6 per cent in group 2 (Table 1). Of patients with malignant disease (later confirmed by histological examination), iFS analysis was conducted in 71.4 per cent in group 1, but in only 41.8 per cent in group 2 (Fig. 3).
A significantly lower proportion of completion thyroidectomies was necessary in group 1 (9 of 112, 8.0 per cent) compared with group 2 (451 of 2169, 20.8 per cent) (RR 0.4, 95 per cent c.i. 0.2 to 0.7; P = 0.001) (Fig. 4), despite the significantly higher percentage of patients with thyroid carcinoma undergoing surgery at UMC Mainz (RR 1.7, 1.5 to 2.0; P < 0.001). Of note, 14.1 per cent (20 of 142) of the tumours judged as benign on iFS analysis at UMC Mainz were rediagnosed as malignant in the final histological assessment (Fig. 2). Seven of the nine cases of completion thyroidectomy in group 1 resulted from false-negative (incorrectly benign) iFS results. The underlying malignancies were four follicular thyroid carcinomas (FTCs) (including 1 oncocytic FTC and 1 minimally invasive FTC), two follicular variant papillary thyroid carcinomas (PTCs) and one 'classical' PTC.
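
The risk ratio for completion thyroidectomy can be reproduced approximately from the published counts. The sketch below assumes the standard log-normal (Katz) approximation for the 95 per cent c.i.; the study does not state which interval method was used, and the function name `risk_ratio_ci` is hypothetical.

```python
from math import exp, log, sqrt


def risk_ratio_ci(events_1, total_1, events_2, total_2, z=1.96):
    """Risk ratio of group 1 versus group 2 with a log-normal (Katz) interval.

    Assumes at least one event in each group (otherwise the standard
    error on the log scale is undefined).
    """
    risk_1 = events_1 / total_1
    risk_2 = events_2 / total_2
    rr = risk_1 / risk_2
    # Standard error of log(RR) for two independent binomial proportions.
    se_log_rr = sqrt(1 / events_1 - 1 / total_1 + 1 / events_2 - 1 / total_2)
    lower = exp(log(rr) - z * se_log_rr)
    upper = exp(log(rr) + z * se_log_rr)
    return rr, lower, upper


# Completion thyroidectomy: 9 of 112 (group 1) versus 451 of 2169 (group 2).
rr, lower, upper = risk_ratio_ci(9, 112, 451, 2169)
print(f"RR {rr:.1f} (95 per cent c.i. {lower:.1f} to {upper:.1f})")
```

Under this approximation the counts give an RR of about 0.39 with an interval of roughly 0.21 to 0.73, which rounds to the reported RR of 0.4 (0.2 to 0.7).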

Predictive value of intraoperative frozen-section analysis
The sensitivity (correct detection of malignancy) of iFS analysis was significantly higher in group 1 than in group 2 (75.0 versus 63.5 per cent respectively) (RR 1.2, 95 per cent c.i. 1.2 to 1.3; P = 0.040). After adjusting for tumour prevalence, the positive predictive value (PPV) and negative predictive value of the two groups were similar (Fig. 4 and Table S1).
In patients with thyroid disease diagnosed as benign before surgery, the sensitivity of iFS analysis in detecting malignancy was 65.2 per cent in group 1 and 40.5 per cent in group 2 (RR 1.6, P = 0.028; specificity 100 versus 99.6 per cent respectively) (data not shown).
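
The text does not describe how the predictive values were adjusted for tumour prevalence. One common approach, shown here only as an assumption, is to recompute the positive and negative predictive values from sensitivity and specificity at a shared prevalence using Bayes' theorem. The function name `predictive_values` and the prevalence of 5 per cent are hypothetical; the sensitivity and specificity values are those reported above for disease diagnosed as benign before surgery.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a test applied at a given disease prevalence.

    All arguments are proportions between 0 and 1. Assumes the test yields
    at least some positive and some negative results at this prevalence.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1 - sensitivity) * prevalence
    true_neg = specificity * (1 - prevalence)
    false_pos = (1 - specificity) * (1 - prevalence)
    ppv = true_pos / (true_pos + false_pos)
    npv = true_neg / (true_neg + false_neg)
    return ppv, npv


# Hypothetical shared prevalence of malignancy, NOT taken from the study.
shared_prevalence = 0.05

# Sensitivity and specificity as reported for the two groups.
for label, sens, spec in [("group 1", 0.652, 1.0), ("group 2", 0.405, 0.996)]:
    ppv, npv = predictive_values(sens, spec, shared_prevalence)
    print(f"{label}: PPV {ppv:.3f}, NPV {npv:.3f}")
```

Fixing the prevalence in this way separates the effect of test accuracy from the effect of case mix, which differs between the two cohorts.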

Extent of surgery
Bilateral thyroid lobe resection was the predominant resection strategy in both cohorts. In 8 per cent (6 of 74) of patients in group 1 with an intended unilateral procedure, the resection strategy was switched to bilateral lobe resection following iFS analysis, which correctly indicated malignancy (Table 2). In group 2, iFS analysis was performed in 18.8 per cent (3636 of 19 387) of patients who had disease classified as benign before surgery. Moreover, in 4 of 128 patients (3.1 per cent) in group 1 with thyroid disease diagnosed benign before surgery, central lymph node dissection was carried out, due to the diagnosis of malignancy following iFS analysis. In group 2, this occurred in only 26 of 3636 patients (0.7 per cent) (Table 2). In both groups, no false-positive iFS results preceded central lymph node resection. For this analysis, central lymphadenectomy was defined as resection of more than three lymph nodes from each side of the central lymph node compartment (compartments 1a and 1b 15, level 6 (and 7) respectively 16).

Histopathological findings
In both cohorts, most malignant tumours were PTC, as would be expected (Table S2). Malignant tumours that escaped detection in iFS analysis were PTC, including pT1a carcinoma and the follicular variant of PTC (Table S2). Among the malignant tumours escaping iFS detection, FTC was also common (20 per cent in group 1, 21.3 per cent in group 2). The relatively high proportion of medullary thyroid carcinoma among the undetected tumours was due to the fact that iFS analysis was used to determine the extent of disease.

Subgroup analysis for fine-needle aspiration cytology categories
Preoperative FNAC analysis was available for 27.1 per cent of patients in group 1 and 21.6 per cent of those in group 2 ( Table S3).
The contemporaneous availability of iFS and FNAC results allowed for an analysis of iFS sensitivity for different Bethesda categories (Fig. S1 and Table S4). The category associated with the highest sensitivity for iFS analysis was Bethesda category V: 82.3 per cent for group 1 and 83.3 per cent for group 2 (Table S4).

Discussion
The sensitivity of iFS analysis was significantly higher in the UMC Mainz group (75.0 per cent versus 63.5 per cent in the PETS 2 cohort), and completion surgery was performed significantly more often in the PETS 2 cohort (group 2) (20.8 per cent versus 8.1 per cent in the UMC Mainz group). One reason for the significantly lower risk of completion surgery at UMC Mainz was the more frequent use of iFS analysis (35.7 per cent versus 21.8 per cent in group 2), particularly the intentional use of iFS analysis for thyroid nodules assumed before surgery to be benign, but deemed suspicious in the intraoperative setting. The decision to perform iFS of a respective nodule was influenced by the clinical picture (such as patient age, node size, growth rate), sonographic patterns (preoperative ultrasound imaging by the operating surgeon), palpatory findings (hardness, elasticity, calcifications) and, especially visible in nodules located near the thyroid capsule, the presence of desmoplastic alterations (such as dense fibrosis around the nodule, star-shaped fibrosis). Cysts or nodules featuring visible colloid areas were judged less suspicious and did not undergo iFS analysis. Enlarged or otherwise suspicious lymph nodes were assessed routinely by means of iFS analysis. The targeted selection of nodules subjected to iFS analysis is illustrated by the result that in group 1 iFS was conducted in 71.4 per cent of patients with confirmed malignancy at final histology, compared with 41.9 per cent in the remaining PETS 2 cohort. Selecting nodules or lymph nodes for iFS analysis is dependent on the experience of the surgeon, which comes from having been exposed frequently to iFS results during training and professional career.
The RR of sensitivity significantly favoured the UMC Mainz cohort, whereas the specificity of iFS analysis was similar for the two groups, and a direct influence of carcinoma prevalence on iFS sensitivity and specificity was excluded statistically. Routine use of iFS develops and maintains the expertise of the pathologists involved, allowing for a higher sensitivity. Although rare, false-negative iFS results were documented at UMC Mainz, leading to seven completion thyroidectomies. To address this problem, from 2013 onward UMC Mainz performed fast paraffin embedding (12-36 h) in patients with 'suspicious' iFS results (stromal desmoplasia, homogeneous follicular architecture and hypercellularity). This allowed completion thyroidectomy to be performed within 72 h of primary surgery, that is, within the period associated with less morbidity and an improved oncological outcome 17.
In 2015, Hosseini and colleagues 18 reported a 73 per cent reduction of secondary surgery in patients with follicular lesions on preoperative FNAC, influenced by the results of iFS analysis. Estebe and co-workers 19 found that, independent of Bethesda categories, iFS analysis contributed to a reduction of secondary surgery. In addition, in PTC diagnosed before surgery, Hong et al. 20 and Park and Lee 21 reported the value of iFS analysis in determining the extent of extracapsular invasion, which potentially influences the resection strategy.
The literature suggests that the sensitivity of iFS depends essentially on the histological subtype. Whereas 21 per cent of follicular lesions were detected, non-follicular lesions (primarily PTC) were detected with a sensitivity of 66 per cent in a meta-analysis by Peng and Wang 22. From this meta-analysis of the literature from 1982 to 2007, the authors concluded that the sensitivity of iFS analysis to detect FTC was significantly lower than the sensitivity provided by preoperative FNAC (21 versus 69 per cent respectively). Yet, for this analysis, the diagnosis of 'follicular neoplasm' by FNAC was considered test-positive, whereas for a test-positive result in iFS analysis precise criteria of malignancy (capsular and vascular invasion) had to be fulfilled. Consequently, specificity and PPV significantly favoured iFS diagnosis (specificity 99 per cent versus 60 per cent for FNAC; PPV 86 versus 35 per cent respectively) 22. In 21 studies that did not differentiate between follicular and non-follicular lesions, iFS analysis appeared significantly superior to FNAC diagnosis for the abovementioned measures of accuracy 22. The detected sensitivity (71 ± 13 per cent) was similar to the results of the present analysis. The present study illustrates the diagnostic restriction of iFS analysis for the correct evaluation of the follicular variant of PTC, papillary microcarcinoma and FTC. Cohen and co-workers 23 also reported particular difficulties for assessment of the follicular variant of PTC.
In addition to differences in the frequency of iFS analysis, the significantly higher number of preoperative FNAC examinations performed in the cohort undergoing surgery at UMC Mainz contributed to the reduction in completion thyroidectomy rates. Cetin et al. 24 reported the utility of iFS analysis (sensitivity 72.9 per cent, specificity 100 per cent) in thyroid nodules with a FNAC diagnosis of 'suspicious for malignant disease'. Similarly, Roychoudhury and colleagues 25 and Cohen et al. 23 reported the utility of iFS analysis in nodules of Bethesda category V, which, in contrast, was not observed by Huang and co-workers 26 . In the present study, a sensitivity for iFS analysis of 82.3 per cent for nodules of Bethesda category V was documented at UMC Mainz, and 83.3 per cent in the PETS 2 cohort. Posillico et al. 27 reported iFS analysis to be an important tool for determining the extent of thyroid surgery in patients with nodular thyroid and preoperative FNAC results categorized as atypia/follicular lesion of undetermined significance (Bethesda category III).
The increasing use of molecular testing of FNAC samples will further refine preoperative diagnosis in the future, directing iFS analysis to become an additional complementary tool for acquiring information on tumour size, focality, lymph node involvement or extracapsular growth 28,29.
To optimize the preoperative and intraoperative diagnosis of differentiated thyroid carcinoma, multimodal assessment including sonography, elastography, scintigraphy, FNAC with molecular genetic analysis, and iFS analysis is of crucial importance. The position of iFS analysis within the framework of this multimodal assessment is a central one, complemented by the experience of the thyroid surgeon in evaluating the examinations performed before surgery, and especially during surgery.