Defining indocyanine green fluorescence to assess anastomotic perfusion during gastrointestinal surgery: systematic review

Abstract Background The aim of this systematic review was to identify all methods to quantify intraoperative fluorescence angiography (FA) of the gastrointestinal anastomosis, and to find potential thresholds to predict patient outcomes, including anastomotic leakage and necrosis. Methods This systematic review adhered to the PRISMA guidelines. A PubMed and Embase literature search was performed. Articles were included when FA with indocyanine green was performed to assess gastrointestinal perfusion in humans or animals, and the fluorescence signal was analysed using quantitative parameters. A parameter was defined as quantitative when a diagnostic numeric threshold for patient outcomes could potentially be produced. Results Some 1317 articles were identified, of which 23 were included. Fourteen studies were done in patients and nine in animals. Eight studies applied FA during upper and 15 during lower gastrointestinal surgery. The quantitative parameters were divided into four categories: time to fluorescence (20 studies); contrast‐to‐background ratio (3); pixel intensity (2); and numeric classification score (2). The first category was subdivided into manually assessed time (7 studies) and software‐derived fluorescence–time curves (13). Cut‐off values were derived for manually assessed time (speed in gastric conduit wall) and derivatives of the fluorescence–time curves (Fmax, T1/2, TR and slope) to predict patient outcomes. Conclusion Time to fluorescence seems the most promising category for quantitation of FA. Future research might focus on fluorescence–time curves, as many different parameters can be derived and the fluorescence intensity can be bypassed. However, consensus on study set‐up, calibration of fluorescence imaging systems, and validation of software programs is mandatory to allow future data comparison.


Introduction
Anastomotic leakage (AL) remains one of the most severe complications after gastrointestinal cancer surgery with restoration of continuity. Leakage rates of up to 20 per cent are reported after restorative cancer resection of both the upper and lower gastrointestinal tract 1,2 . Various risk factors have been associated with AL 3,4 . Adequate blood perfusion has been described as one of the key factors for adequate healing of the anastomosis, and is a surgically modifiable factor.
To aid the surgeon with assessment of gastrointestinal perfusion and determination of the optimal site for anastomosis, fluorescence angiography (FA) has gained support among gastrointestinal surgeons 5,6 . FA is a technique that uses an imaging system capable of excitation and detection of the fluorescent contrast agent indocyanine green (ICG) 7 . ICG is a cyanine dye with an absorption and emission peak in the near-infrared region 6 , at about 800 nm. ICG is approved for FA by the US Food and Drug Administration and European Medicines Agency, and is safe to use as side-effects occur rarely 6 . After intravenous injection, ICG distributes through the vascular system bound to plasma proteins, and its immediate fluorescence detection correlates with areas of perfused tissue. ICG is detectable within 1 min of injection 5 , and imaging can be performed in real time, making FA suitable for intraoperative enhanced reality of perfusion. This aids the surgeon in optimizing the anastomotic site and potentially lowering AL secondary to insufficient perfusion. In early observations 8,9,10 , use of FA was reported to lower AL rates after gastrointestinal cancer surgery.
When intraoperative management was adapted according to subjective interpretation of FA, AL and perianastomotic necrosis still occurred 8,9,10 . Although the pathophysiology of AL is multifactorial and dependent on several factors other than perfusion, other explanations for these observations include undertreatment and overtreatment, and difficulty in visualizing venous congestion by FA. Hitherto, no threshold is known for adequate perfusion. Overtreatment might be a result of more extended resections based on the FA findings when the imaging system was not sufficiently specific for detection of ischaemia. Overtreatment can come at the cost of a tension-free anastomosis, risking AL. Furthermore, venous congestion is more difficult to detect by subjective interpretation of FA, as ICG enters the tissue of interest when arterial blood flow is intact 11 .
To overcome the limitations of subjective interpretation of FA and evaluate ICG fluorescence objectively, research in the past decade has focused on measuring the fluorescent signal in quantitative values. However, no consensus exists on the method of quantification of the ICG fluorescence, and no threshold for adequate perfusion has yet been identified. This systematic review of the literature aimed to provide an overview of all the methods of FA quantification employed during gastrointestinal surgery and thresholds that have been produced to predict patient outcomes, in particular AL and necrosis. According to the identified methods, the aim was to outline recommendations for future research strategies.

Methods
The authors adhered to the PRISMA guideline 12 . PubMed and Embase databases were searched on 15 January 2019 to identify all studies that performed FA during gastrointestinal surgery and investigated quantitative fluorescence values (Appendix S1, supporting information). After removal of duplicates, title and abstract screening was executed independently by two authors according to predetermined criteria (Table S1, supporting information). Subsequently, full-text screening was conducted, and articles were deemed eligible when they presented original work on FA during gastrointestinal surgery in humans or animals. Reference lists of included articles were scanned to obtain potential additional articles. Conflicts were discussed to reach consensus.
Reported outcomes had to include a quantitative fluorescence parameter, ideally correlated with patient outcomes. A parameter was considered quantitative when a diagnostic numeric threshold for AL or necrosis could potentially be produced. Examples of quantitative fluorescence parameters are numeric classification scores, time to fluorescence enhancement, equations, or software analyses. Descriptive grouping using 'no', 'little' or 'good' fluorescence was not considered as numeric quantification of fluorescence.
Quality assessment of all included articles was performed independently by two authors. For human studies, the Newcastle-Ottawa Scale (NOS) for cohort studies was used. Animal studies were assessed using the SYRCLE (SYstematic Review Center for Laboratory animal Experimentation) risk-of-bias tool 13 .
Data extraction and aggregation was done by two authors. From all articles, only the groups that received FA were extracted and analysed for the purpose of the present review. Extracted data on the method of FA included the dose of ICG, the near-infrared imaging system and the software program used. The primary outcome was the quantitative fluorescence parameter (one or multiple). Secondary outcomes included patient outcomes, such as AL and necrosis rates, and change in management due to FA following conventional assessment of perfusion.
To aggregate the software-derived fluorescence-time curves, the curves were extracted from the individual graphs into data points using CurveSnap version 1 (Xoofee; https://curvesnap.en.softonic.com/). To compare the curves, the data points from the curves were read into Excel® (Microsoft, Redmond, Washington, USA). Fluorescence intensity values were normalized from 0 to

Statistical analysis
Results are presented using descriptive statistics. Categorical data are presented as number of cases and percentages. Continuous data, when normally distributed, are shown as mean(s.d.) values or total range, or, when not normally distributed, as median (i.q.r.) values or total range.

Results
A total of 1317 records were screened for title and abstract, after which the full texts of 40 articles were assessed for eligibility. In total, 23 articles were included in this review, of which eight concerned upper and 15 lower gastrointestinal surgery. The process of screening and eligibility assessment is summarized in a PRISMA diagram (Fig. 1).
Fourteen studies were performed in humans and nine in animals. The quality of the human studies was either poor or good (Table S2, supporting information). Most studies had a non-comparative design, so no points were granted to the comparability domain, which resulted in poor quality according to the NOS. All animal studies addressed attrition and reporting bias according to the SYRCLE classification, but selection, performance and detection bias were scarcely considered (Table S3, supporting information). Characteristics of all included studies on patients are shown in Table S4 (supporting information) and those for animal studies in Table S5 (supporting information); overall, 11 different imaging systems and 11 different software programs were described. Clinical outcomes were reported in 13 of the 14 articles in patients (Table 1).
All 23 studies reported one or more quantitative fluorescence parameters of FA (Table 2). For the scope of this review, the reported quantitative fluorescence parameters were divided into four categories: time to fluorescence (20 studies), including manually assessed time to fluorescence (7) and software-derived fluorescence-time curves (13); contrast-to-background ratio (CBR) (3); pixel intensity (2); and numeric classification score (2).

Manually assessed time to fluorescence
Three studies examined perfusion of the gastric conduit (2 during oesophagectomy in patients and 1 using a porcine oesophagectomy model), one study evaluated perfusion of bowel ends after gastrectomy in patients, and three examined anastomotic perfusion of the lower gastrointestinal tract in patients 14-20. All six human studies 14,15,17-20 concerned prospective cohort observations. Change in management and AL were reported in all studies, and management changes were determined by FA in four 14,15,18,20 of the six studies (Table 1). In the porcine oesophagectomy model, ischaemia was studied by reversible ligation of the right gastroepiploic artery.
Five studies 15-18,20 evaluated time between ICG injection and first enhancement in the bowel ends, one study 19 observed time between ICG injection and subjectively interpreted maximum fluorescent excitation, and one study 14 evaluated the flow speed (cm/s) of ICG fluorescence through tissue. Mean values for manually assessed time are shown in Table 2.
One study 14 produced a cut-off value for ICG flow speed (cm/s) to predict AL. Koyanagi and colleagues 14 calculated the ICG flow speed by dividing the measured distance between the pylorus and the terminal end of ICG fluorescence by the time the fluorescence took to travel between the two points. The flow speed was significantly associated with the occurrence of AL, and the cut-off value was determined as 1·76 cm/s (Table 2). In addition, two studies 15,16 produced no cut-off value, but proposed a definition for an FA threshold. Kumagai and co-workers 15 proposed a '90-second rule', constructing all anastomoses in the area of the gastric conduit that was enhanced within 90 s (preferably within 60 s) of the first fluorescent enhancement at the root of the right gastroepiploic artery. Only three anastomoses were constructed in an area enhanced after 60 s, of which one anastomosis, constructed in an area enhanced after 77 s, resulted in AL. In another study, Quan et al. 16 defined areas as ischaemic when no fluorescence was seen 360 s after ICG injection.
Table 1 (partial)
[...] 36; Conventional assessment; 4 of 77 (5); 2 of 77 (3); n.a.
Sherwinter et al. 19; Conventional assessment; 2 of 20 (10); 2 of 20 (10)
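The two manually assessed rules described above reduce to simple arithmetic, which the following minimal Python sketch illustrates. The function names and zone labels are hypothetical. The direction of the flow-speed comparison (slower flow treated as the at-risk side) is an assumption, since the text reports only the cut-off value itself.

```python
CUTOFF_CM_PER_S = 1.76  # cut-off value for ICG flow speed quoted in the text


def flow_speed_cm_per_s(distance_cm: float, elapsed_s: float) -> float:
    """Speed of the fluorescence front: distance travelled divided by elapsed time."""
    if distance_cm < 0 or elapsed_s <= 0:
        raise ValueError("distance must be non-negative and elapsed time positive")
    return distance_cm / elapsed_s


def at_or_below_cutoff(speed_cm_per_s: float, cutoff: float = CUTOFF_CM_PER_S) -> bool:
    """Assumption: slower flow is the at-risk direction; the text gives only the value."""
    return speed_cm_per_s <= cutoff


def ninety_second_zone(seconds_after_root_enhancement: float) -> str:
    """Classify a candidate site by the '90-second rule' (labels are hypothetical).

    Time is counted from first enhancement at the root of the right
    gastroepiploic artery, not from ICG injection.
    """
    if seconds_after_root_enhancement <= 60:
        return "preferred"
    if seconds_after_root_enhancement <= 90:
        return "acceptable"
    return "outside rule"


if __name__ == "__main__":
    # Illustrative numbers only, not taken from any study.
    speed = flow_speed_cm_per_s(distance_cm=30.0, elapsed_s=12.0)
    print(speed, at_or_below_cutoff(speed))  # 2.5 False
    print(ninety_second_zone(77))            # acceptable
```

Note that a site enhanced at 77 s falls inside the 90-second window but outside the preferred 60-second window, which matches the single leak reported for that anastomosis.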

Software-derived fluorescence-time curves
Seven studies were performed in patients and evaluated perfusion of the gastric conduit during oesophagectomy 21,22, perfusion of free jejunal grafts during pharyngo-oesophagectomy 23 and perfusion of bowel ends during procedures of the lower gastrointestinal tract 20,24-26. The other six studies were animal studies and assessed perfusion in a segment of small bowel or sigmoid in pigs 27-31 or stomach perfusion in pigs 32.
Six of the seven studies in patients had a prospective design. Four studies 20,21,23,25 reported change in management according to FA, and the AL rate was observed in five studies 20,21,22,25,26 (Table 1). One study 23 observed venous congestion, which was defined as 'subjectively' judged unusually slow fluorescence inflow or graft necrosis due to venous thrombosis, which was confirmed during reoperation. Animal studies observed normal organ perfusion 32, anastomotic healing 27 or ischaemic areas after ligation of supplying vessels 28-31.
All studies investigated the fluorescence-time curve, which was defined by a software-derived graph that displayed the fluorescent signal on the y-axis and time on the x-axis of a particular part of the gastrointestinal tract (Fig. 2). All reported quantitative derivatives of this curve are summarized in Fig. 2 and their mean values are presented in Table 2. A representative 'normal' fluorescence-time curve was shown in six of the seven studies in patients 20,21,22,23,25,26. Fig. 3a shows the raw data of the curves. The baseline intensity (Fbg) and point t = 0 differed for all curves. After intensity normalization of the curves and creating an overlay of t = 0, the fluorescence-time curves tended to follow similar morphology, but with a large variation (Fig. 3b). Of note, when the inflow was steeper, the outflow declined faster. Based on morphology of the fluorescence-time curves, two studies 21,22 reported curve types of the gastric conduit during oesophagectomy in patients. Ishige [...]
Furthermore, three studies produced cut-off values for patient outcomes: two 20,26 for AL and one 23 for venous congestion. The cut-off values were derived for Fmax, T1/2, TR and slope (Table 2). The studies were inconsistent in considering the quantitative parameters that were predictive for AL. Association of T1/2 with AL was evaluated in two studies 20,26, and was found to be predictive for AL in one 26. The slope was predictive for AL in two studies 20,26, but with a lower area under the curve in one of the studies 26.

Fig. 2 Fluorescence-time curve and its derivatives. AU, arbitrary units; Fnorm, Fmax corrected for background (Fmax subtracted by Fbg); Fmax, maximum intensity; Fbg, baseline or background intensity; F1/2, half of Fmax; T1/2, time from ICG inflow to half of Fmax; Tmax, time from ICG inflow to Fmax; TR, time ratio (T1/2 divided by Tmax); Tbg, time from ICG injection to ICG inflow in tissue of interest; Tout, time of ICG outflow.

Table 2 (partial)
[...] 15; Time from first fluorescence root from RGEA to anastomotic site; Anastomotic site in area perfused < 90 s (preferably < 60 s)
Quan et al. 16; Time to first visible fluorescence signal; s; 138·0(82·1); n.a.
Sherwinter et al. 19; Time to maximum fluorescent excitation; s; 33·0(1·82); n.a.
Wada et al. 20; Time [...]
[...] 34; CBR over time (Fnorm/Fbg); n.a.; n.r.; n.a.
Quan et al. 16; Ratio of gastric conduit CBR/oesophageal CBR; n.a.; 0·97(0·024); n.a.
Pixel intensity
Foppa et al. 35; Maximum pixel intensity; SPY units; n.r.; n.a.
Protyniak et al. 36; Lowest pixel intensity; 0-256 greyscale; 66 §; n.a.
Numeric classification score
Huh et al. 17; Fluorescence and clinical scoring system; 1-5 points; FS 3·5 (range 3-5); n.a.
Sherwinter et al. 19; Fluorescence and clinical scoring system; 1-5 points; n.r.; n.a.
* Values in parentheses are area under the curve, sensitivity (%) and specificity (%). † For an explanation of derivatives, see Fig. 2. ‡ For patients without anastomotic leakage. § Average mean according to volume of procedures. n.a., Not applicable; ICG, indocyanine green; OR, odds ratio; RGEA, right gastroepiploic artery; n.r., not reported; Fmax, maximum intensity; Tmax, time from ICG inflow to Fmax; AU, arbitrary units; T1/2, time from ICG inflow to half of Fmax; Fbg, baseline or background intensity; Fnorm, Fmax corrected for background (Fmax subtracted by Fbg); TR, time ratio (T1/2 divided by Tmax); Tout, time of ICG outflow; Tbg, time from ICG injection to ICG inflow in tissue of interest; CBR, contrast-to-background ratio; FS, fluorescence score.
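To make the curve derivatives concrete, the sketch below computes Fbg, Fmax, Fnorm, Tmax, T1/2 and TR from a sampled fluorescence-time curve, following the definitions given in the Fig. 2 legend. It is a rough illustration and not the algorithm of any of the reviewed software programs. Several choices are assumptions: the baseline is taken as the mean of the first three samples, ICG inflow is taken as the first sample exceeding the baseline by 10 per cent of Fnorm, the slope is taken as Fnorm divided by Tmax, and intensities are normalized to a 0 to 1 range.

```python
from dataclasses import dataclass
from typing import List, Sequence


@dataclass
class CurveDerivatives:
    f_bg: float    # baseline (background) intensity
    f_max: float   # maximum intensity
    f_norm: float  # f_max minus f_bg
    t_max: float   # time from ICG inflow to f_max
    t_half: float  # time from ICG inflow to half of f_max
    tr: float      # time ratio, t_half divided by t_max
    slope: float   # assumed definition: f_norm divided by t_max


def curve_derivatives(times: Sequence[float], intensities: Sequence[float],
                      n_baseline: int = 3,
                      inflow_fraction: float = 0.1) -> CurveDerivatives:
    if len(times) != len(intensities) or len(times) <= n_baseline:
        raise ValueError("need equally long series with more samples than n_baseline")
    f_bg = sum(intensities[:n_baseline]) / n_baseline
    # Index of the first occurrence of the peak intensity.
    i_max = max(range(len(intensities)), key=lambda i: intensities[i])
    f_max = intensities[i_max]
    f_norm = f_max - f_bg
    if f_norm <= 0:
        raise ValueError("no enhancement above baseline")
    # Assumed inflow criterion: first sample clearly above the baseline.
    inflow_level = f_bg + inflow_fraction * f_norm
    i_in = next(i for i in range(i_max + 1) if intensities[i] > inflow_level)
    # First sample on the upslope that reaches half of f_max.
    i_half = next(i for i in range(i_in, i_max + 1) if intensities[i] >= f_max / 2)
    t_max = times[i_max] - times[i_in]
    t_half = times[i_half] - times[i_in]
    if t_max <= 0:
        raise ValueError("peak coincides with inflow; curve too coarsely sampled")
    return CurveDerivatives(f_bg, f_max, f_norm, t_max, t_half,
                            tr=t_half / t_max, slope=f_norm / t_max)


def normalise(intensities: Sequence[float], f_bg: float, f_max: float) -> List[float]:
    """Rescale so that the baseline maps to 0 and the peak to 1 (assumed range)."""
    return [(f - f_bg) / (f_max - f_bg) for f in intensities]


if __name__ == "__main__":
    # Synthetic curve in arbitrary units, sampled every 2 s; not study data.
    t = [0, 2, 4, 6, 8, 10, 12, 14, 16, 18, 20]
    f = [10, 10, 10, 60, 110, 140, 160, 170, 150, 130, 115]
    d = curve_derivatives(t, f)
    print(d)
    print(normalise(f, d.f_bg, d.f_max))
```

Because TR is a ratio of two times, it does not change when every intensity is multiplied by a constant, which is one reason the time-based derivatives are attractive when absolute intensity is not comparable between systems.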

Contrast-to-background ratio
In this category, one study calculated the CBR during perfusion assessment of the gastric conduit in a porcine oesophagectomy model, and two studies calculated it in small bowel segments in pigs and rats 16,33,34. Two different equations for CBR were evaluated. Quan and co-workers 16 defined the CBR as fluorescence intensity divided by background fluorescence intensity, and calculated the CBR separately in the gastric conduit and proximal oesophagus. Subsequently, the ratio between the two CBRs was determined (gastric conduit CBR/oesophageal CBR) 16. Two studies 33,34 defined CBR as: (mean fluorescence intensity − mean background fluorescence intensity)/mean background fluorescence intensity. In these two studies, CBR was studied over time. CBR-time curves in small bowel segments appeared to follow a similar shape to that of fluorescence-time curves (Fig. 3b). Matsui et al. 34 identified four CBR-time curve patterns in pigs: a normal (sharp inflow peak and rapid decline), a delayed (inflow peak and increase over time), a capillary (absent peak and increase over time) and an arterial insufficiency pattern (no change from the background signal). The last two patterns were seen in the ligated areas, whereas the delayed pattern was observed in the adjacent areas. In rats, the absence of an arterial inflow peak in the CBR curve showed an accuracy of 85 per cent for predicting clinical necrosis (sensitivity 60 per cent, specificity 100 per cent). In this category, there was scant evidence for a threshold to predict patient outcomes.
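The two CBR definitions described above differ only by a constant offset of 1, which the short Python sketch below makes explicit. Function names are hypothetical and the input values are illustrative, not study data.

```python
def cbr_ratio(signal: float, background: float) -> float:
    """CBR as fluorescence intensity divided by background intensity."""
    if background <= 0:
        raise ValueError("background intensity must be positive")
    return signal / background


def cbr_background_subtracted(mean_signal: float, mean_background: float) -> float:
    """CBR as (mean signal - mean background) / mean background."""
    if mean_background <= 0:
        raise ValueError("background intensity must be positive")
    return (mean_signal - mean_background) / mean_background


def conduit_to_oesophagus_ratio(conduit_signal: float, conduit_background: float,
                                oesophagus_signal: float,
                                oesophagus_background: float) -> float:
    """Ratio of two separately computed CBRs (gastric conduit CBR / oesophageal CBR)."""
    return (cbr_ratio(conduit_signal, conduit_background)
            / cbr_ratio(oesophagus_signal, oesophagus_background))


if __name__ == "__main__":
    print(cbr_ratio(120.0, 40.0))                            # 3.0
    print(cbr_background_subtracted(120.0, 40.0))            # 2.0
    print(conduit_to_oesophagus_ratio(90.0, 30.0, 100.0, 25.0))  # 0.75
```

Since (S − B)/B equals S/B − 1, values reported under one definition can be converted to the other, but both remain sensitive to where the background region of interest is placed.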

Pixel intensity
Two studies 35,36 used embedded software (SPY Elite™ with SPY Q software; Novadaq Technologies, Toronto, Ontario, Canada) to quantify maximum pixel intensity of the ICG fluorescence in patients during bowel resection. Change in management was determined by FA in one study 35, and AL was observed in the other 36 (Table 1). Quantitative values were reported for the total cohort in one study 36 (Table 2). In this category, there was no evidence for an FA threshold.

Numeric classification score
In this category, one study 17 used a numeric classification score in patients undergoing gastric cancer surgery, and one study 19 in patients undergoing low anterior resection. Change in management, according to conventional white light assessment, and AL were reported in both studies. Sherwinter et al. 19 introduced a scoring system consisting of a fluorescence score of 1-5 (where 1 indicated no uptake and 5 maximum uptake, scored according to subjective assessment) and a clinical score of 1-5. Huh and colleagues 17 used the same scoring system; the mean fluorescence score at the stomach side is shown in Table 2. In this category, there was scant evidence for a threshold to predict patient outcomes.

Discussion
This systematic review identified all current methods of quantifying the ICG fluorescence signal during FA to evaluate perfusion at the anastomotic site during gastrointestinal surgery. The explored quantitative fluorescence parameters were divided into four categories: time to fluorescence, by manually assessed time to fluorescence and by software-derived fluorescence-time curves; contrast-to-background ratio; pixel intensity; and numeric classification score. In the first category, cut-off values were found for ICG flow speed (cm/s) in the gastric conduit wall and derivatives of the fluorescence-time curves (Fmax, T1/2, TR and slope) for AL, and T1/2 for venous congestion.
For the short-term future, manually assessed time to fluorescence seems the most promising method of quantification to produce a threshold for patient outcomes. The method does not require software for analysis, and thus development of a cut-off value is possible on a large scale. Potentially, time to fluorescence would indicate both arterial and venous problems. Additionally, time to no fluorescence could be of added value to predict tissue necrosis, as suggested by Quan and colleagues 16. However, this method of quantification is still dependent on the fluorescence intensity and its subjective interpretation. To bypass the fluorescence intensity, fluorescence-time curves seem most promising in the long term. Derivatives of the curve are potentially independent not only of fluorescence intensity, but also of dose and of the distance between the imaging system and target organ. The slope, for instance, is distance-independent. The curves might also specify inadequate arterial inflow or venous outflow 23,37. Furthermore, multiple measurements with sequential doses of the fluorescent dye might be possible, even when a high background signal remains after the first FA measurement.
The other categories (contrast-to-background ratio, pixel intensity and numeric classification score) seem less appropriate to generate a clinical threshold. Although fluorescence intensity is proportional to the amount of ICG in the tissue, pixel intensity is relative and incomparable, as it depends on patient characteristics, dose, distance, and imaging systems and their settings. The CBR depends on pixel intensity, but also on the background and positioning of the regions of interest. The background differs between imaging systems, as shown in Fig. 3, and favourable positioning of regions of interest to calculate CBR can produce bias in the values 38 . The classification score seems easily applicable for clinical purposes, but the assessment of ICG fluorescence is still qualitative.
This systematic review was limited by the low quality of studies and the small number of patient groups with low absolute numbers of AL. Furthermore, data comparison and aggregation was challenging owing to the lack of consensus in definitions in the reporting of FA findings and patient outcomes, standardization of the FA method, calibration of imaging systems, and lack of insight in software algorithms. Below, fundamentals for future research are outlined to overcome these limitations in the future.
Up to now, no consensus exists on the dependent variable of quantification outcomes. Either perfusion characteristics or patient outcomes are now selected as the dependent variable. For example, Nerup and co-workers 32 also determined microsphere-measured regional blood flow, and Diana et al. 28-31 measured lactate levels to correlate FA quantification to perfusion characteristics. In this way, quantification of FA will not provide information about the impact on patient outcomes, and a threshold is difficult to produce. In this review, quantitative fluorescence parameters were associated with patient outcomes. However, when considering patient outcomes such as AL, the use of quantitative FA will never account for all the risk factors associated with AL. For future research, it is of paramount importance to investigate quantification of FA in relation to occurrence of AL in a large number of patients, and to correct for other risk factors.
The FA protocol of the studies differed in ICG dose, situation of measurements, field of view, definition of t = 0 and change of management. Based on the literature and experience 5,8, an ICG dose of 0·05 mg/kg per bolus is recommended, with an extra bolus of 2·5 mg if the signal starts to fade. The field of view would ideally assess the whole organ of interest, but software analysis must be performed in smaller regions to prevent levelling out of the quantitative value. For time to fluorescence, t = 0 must be stated clearly. Most studies lacked information on t = 0, but, when reported, it was often described as the moment of ICG injection. Using this definition, the location of intravenous access can bias time to fluorescence. It might be more accurate to choose fluorescence enhancement at the base of the supplying vessels of the organ of interest or a nearby organ in the field of view as t = 0. Furthermore, change of management in terms of additional resection with adapted level of the anastomotic site due to FA must be reported.
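The effect of the choice of t = 0 can be shown with a small Python sketch. It assumes that both the enhancement at the supplying vessel and the enhancement at the site of interest are timed on the same clock, started at ICG injection. The function name and all numbers are hypothetical.

```python
def time_from_reference(t_site_s: float, t_reference_s: float) -> float:
    """Time to fluorescence at the site, re-referenced to enhancement at the supplying vessel.

    Both inputs are seconds since ICG injection; the result uses the
    reference enhancement as t = 0.
    """
    return t_site_s - t_reference_s


if __name__ == "__main__":
    # Two hypothetical measurements with different transit times from the
    # intravenous access to the supplying vessel (18 s versus 30 s).
    central_access = {"reference": 18.0, "site": 43.0}
    peripheral_access = {"reference": 30.0, "site": 55.0}

    for label, m in (("central", central_access), ("peripheral", peripheral_access)):
        from_injection = m["site"]
        from_vessel = time_from_reference(m["site"], m["reference"])
        print(label, from_injection, from_vessel)
```

Measured from injection, the two sites appear to differ by 12 s, whereas measured from enhancement at the supplying vessel both take 25 s, so the apparent difference reflects only the location of intravenous access.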
Finally, the available imaging systems have different signal sensitivity due to differences in hardware, optics and image processing, which makes comparison of results difficult. To make data generated by different near-infrared imaging systems comparable, calibration of the imaging systems could contribute to standardization. Gorpas and colleagues 39 proposed a composite phantom for calibration of different imaging systems. As well as calibration, validation of the software algorithms is also mandatory to make pooling of data possible. A standard laboratory experiment using the above-mentioned phantom to assess whether values of the quantitative parameter are the same for different imaging systems and software programs would be helpful, and might allow correction for differences.
In future studies, artificial intelligence might improve the readout of FA 40 . Potentially, artificial intelligence will reduce interobserver variation and show new ways to interpret FA. Ideally, artificial intelligence could create a prediction model that combines the clinical history of patients with FA imaging in order to predict AL.