Breath analysis for the detection of digestive tract malignancies: systematic review

Abstract Background In recent decades there has been growing interest in the use of volatile organic compounds (VOCs) in exhaled breath as biomarkers for the diagnosis of multiple variants of cancer. This review aimed to evaluate the diagnostic accuracy and current status of VOC analysis in exhaled breath for the detection of cancer in the digestive tract. Methods PubMed and the Cochrane Library database were searched for VOC analysis studies in which exhaled air was used to detect gastro-oesophageal, liver, pancreatic, and intestinal cancer in humans. Quality assessment was performed using the QUADAS-2 criteria. Data on diagnostic performance, VOCs with discriminative power, and methodological information were extracted from the included articles. Results Twenty-three articles were included (gastro-oesophageal cancer n = 14, liver cancer n = 1, pancreatic cancer n = 2, colorectal cancer n = 6). Methodological issues included different modalities of patient preparation and sampling and platform used. The sensitivity and specificity of VOC analysis ranged from 66.7 to 100 per cent and from 48.1 to 97.9 per cent respectively. Owing to heterogeneity of the studies, no pooling of the results could be performed. Of the VOCs found, 32 were identified in more than one study. Nineteen were reported as cancer type-specific, whereas 13 were found in different cancer types. Overall, decanal, nonanal, and acetone were the most frequently identified. Conclusion The literature on VOC analysis has documented a lack of standardization in study designs. Heterogeneity between the studies and insufficient validation of the results make interpretation of the outcomes challenging. To reach clinical applicability, future studies on breath analysis should provide an accurate description of the methodology and validate their findings.


Introduction
Cancer is one of the leading causes of premature death. With an increasing worldwide life expectancy, the prevalence of cancer and its burden on society are growing 1 . Early-stage cancers are often asymptomatic and therefore difficult to detect. Treatment options for cancer, and ultimately their success rates, are greatly dependent on the disease stage at the time of diagnosis. The 5-year survival rate of stage I colorectal cancer is approximately 97.7 per cent, but it drops to 43.9 per cent for stage IV. Similar reductions in 5-year survival rate are seen in other cancer types, including gastro-oesophageal, liver, and pancreatic cancers 2 . These survival rates indicate the importance of screening programmes in detecting cancers at an early stage. Current screening and diagnostic techniques are often invasive and not patient friendly. This review focuses on the detection of digestive tract malignancies, including colorectal, gastro-oesophageal, liver, and pancreatic cancers, by analysis of exhaled air.
Colorectal cancer is one of the largest causes of cancer-related deaths. Screening for colorectal cancer with known tumour markers, such as carcinoembryonic antigen or cancer antigen 19-9, is not ideal owing to low sensitivity and specificity 3,4 . Faecal blood tests, such as the guaiac faecal occult blood test and the more recent faecal immunochemical test (FIT), can be used as screening tools for colorectal cancer 5 . In 2014, a national screening programme was introduced in the Netherlands using the FIT, leading to earlier diagnosis of colorectal cancer. In the event of a positive test, which indicates an increased risk of colorectal cancer, colonoscopy is recommended 6 . Although the FIT is non-invasive and has a sensitivity of over 80 per cent, a malignancy is found during colonoscopy after only about 8 per cent of positive tests 7 .
Next to colorectal cancer, gastric carcinoma is a common digestive tract malignancy with reported late diagnosis and high mortality rates 8 . In high-incidence countries, including Japan, screening programmes using gastroscopy have shown a decrease in mortality 9 . However, a major drawback of this screening programme is the invasive character of the endoscopic procedures used and the risk of complications.
Liver cancer is far less common. Screening using regular ultrasound imaging is performed in patients with underlying risk factors, such as chronic viral hepatitis or alcohol intake 10 . Although it is non-invasive, its sensitivity is relatively low and is operator-dependent 11 .
The same holds for pancreatic cancer. Fewer than 20 per cent of patients have operable pancreatic tumours at the time of diagnosis 12 , and screening (using endoscopic ultrasonography or MRI) is currently recommended only for patients with a genetic predisposition 13 . However, the prognosis is poor, symptoms are associated with disease progression, and deaths from the disease are increasing globally 1,14,15 .
There is a general need for improvement in screening techniques for digestive tract malignancies. The sensitivity and specificity of most screening tools are not high enough to reach clinically valuable post-test probabilities in a screening setting. Thus, it remains a challenge within global healthcare to develop more suitable diagnostic tools for tumour detection 16,17 .
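The dependence of post-test probability on prevalence can be illustrated with a short calculation based on Bayes' theorem. The sketch below is only an illustration: the function name and the input values (sensitivity 0.80, specificity 0.95, prevalence 0.5 per cent) are assumptions chosen for demonstration and are not taken from any of the cited studies.

```python
def predictive_values(sensitivity, specificity, prevalence):
    """Return (PPV, NPV) for a binary test, using Bayes' theorem.

    All three arguments are proportions between 0 and 1.
    """
    true_pos = sensitivity * prevalence
    false_neg = (1.0 - sensitivity) * prevalence
    true_neg = specificity * (1.0 - prevalence)
    false_pos = (1.0 - specificity) * (1.0 - prevalence)
    ppv = true_pos / (true_pos + false_pos)  # P(disease | positive test)
    npv = true_neg / (true_neg + false_neg)  # P(no disease | negative test)
    return ppv, npv


# Illustrative inputs only (assumed values, not data from the review).
ppv, npv = predictive_values(sensitivity=0.80, specificity=0.95, prevalence=0.005)
print(f"PPV = {ppv:.1%}, NPV = {npv:.2%}")
```

With these assumed inputs, fewer than one in ten positive tests corresponds to a true malignancy, even though sensitivity and specificity both look respectable in isolation. Raising the assumed prevalence, as in a high-risk population, raises the positive predictive value accordingly.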
In recent years, detection of cancer by analysis of volatile organic compounds (VOCs) in body materials has shown promising results. VOC analysis has a long-standing history in medical research. By 1971, the Nobel Prize winner Linus Pauling 18 had detected 250 different compounds in breath using gas chromatography. Today, the value of VOC analysis in exhaled breath has been examined as a monitoring tool in many diseases 19 , including the heart transplant rejection breath test 20 .
VOCs are carbon-based organic molecules, and their presence in exhaled breath can be divided into exogenously or endogenously derived compounds, according to their origin. Exogenous VOCs originate from environmental factors, such as food and beverage consumption, smoking, or other environmental exposures. Endogenous VOCs are produced as by-products or end-products of human or microbial metabolism. Apart from breath, VOCs can be detected in sweat, blood, tissue samples, urine, and faeces 21,22 . At present, more than 800 different breath VOCs have been registered in the Chemical Abstracts Service system 22 . The composition of VOCs in exhaled breath can be altered owing to pathological processes such as the presence of cancer. Tumour-associated inflammation leading to enhanced oxidative stress, altered glucose metabolism, and redox regulation in cancer cells can lead to different VOC signatures in patients with cancer 23-25 . Breath analysis methods might be able to identify 'breath signatures' specific to those with cancer. This could be of value in clinical practice.
Analysis of VOC profiles can be performed using a variety of analytical platforms 26 . Currently, the most common systems in use are gas chromatography mass spectrometry (GC-MS), proton transfer reaction mass spectrometry (PTR-MS), and selected ion flow tube mass spectrometry (SIFT-MS). In addition, pattern recognition sensor systems are emerging that detect total VOC-binding patterns instead of individual VOCs. The latter systems are commonly referred to as an electronic nose or E-nose 26,27 . All systems have their strengths and limitations. Systems that allow selective quantification of VOCs are usually more laborious, require trained personnel, and are expensive in comparison to systems that register unselective VOC binding patterns, such as portable E-nose systems 17 .
The non-invasive nature of breath analysis makes it interesting for clinical use. Despite a long history of breath research, there are currently only a few applications in the clinic. This review provides an overview of the current literature on the identification of digestive tract cancer by means of VOC analysis in exhaled breath. The aim was to examine the diagnostic performance of VOC analysis and also to identify potential pitfalls in order to improve future research in this field.

Search strategy
An electronic search of PubMed and the Cochrane Library was performed in May 2019. Neoplasm, cancer, tumour, electronic nose, volatile organic compounds, VOC, exhaled breath, predictive value of tests, sensitivity, and specificity were used as search terms, and were combined using AND-OR combinations.
Studies of cancer diagnosis that met the following criteria were included: at least two different groups of patients were included in the study, with regard to the presence of cancer; the index test was analysis of endogenous VOCs in exhaled breath; and the disease type was cancer of the digestive tract (oesophagus, stomach, liver, pancreas, and bowel). Studies were excluded if they were published before 2000, were not performed in adult humans, did not analyse malignant diseases, or analysed biofluids (such as breath condensate, urine, blood, and faeces).
The selection of potentially eligible articles was performed according to the PRISMA guidelines 28 . Discrepancies between the selections were solved in a consensus meeting between the reviewers. The following information was gathered independently and tabulated from the articles by type of cancer: author(s), year of publication, index test, reference test, method of data analysis, comparison groups, sensitivity, specificity, accuracy, and area under the curve (AUC). All VOCs identified in the studies were tabulated.

Quality assessment
The methodological quality of the articles was assessed by means of the Quality Assessment of Diagnostic Studies 2 tool (QUADAS-2) 29 ; a modified version was used 30 (Table S1). The assessment was performed by two independent researchers and discrepancies were resolved by consensus.

Results
A total of 7114 studies were identified by the search in PubMed. After applying the eligibility criteria, 21 articles were identified. Two articles were retrieved by manual search, and finally 23 articles were included in the review (Fig. 1).

Quality assessment of the studies
An overview of the results of quality assessment is provided in Table 1 and Fig. 2. The risk of bias was highest for patient selection. The most common reasons for unclear or high risk of bias were unclear specification, or issues regarding the eligibility criteria. For the index test criterion, the most common reason for a high-risk assessment was not having performed a blinded validation of the diagnostic model.
For flow and timing, the most common reason for high risk of bias was not having attempted to limit exogenous and endogenous influences on VOC composition. Regarding the applicability of the studies to the study question, the overall applicability concern was scored tolerantly and assessed as relatively low.
However, in four studies 37,38,40,44 , oesophagogastroduodenoscopy to rule out malignancy was not performed in controls. Only one study 45 included patients with liver cancer (30 patients). The patients had histologically proven stage I-V cancer and were compared with a group of healthy volunteers, and with a group of patients with hepatitis B-induced liver cirrhosis. The two control groups did not receive the same reference test as the cancer group. Patients with hepatitis B and cirrhosis were untreated and the disease was confirmed histologically or cytologically. The healthy volunteers, however, who were the patients' relatives and hospital staff with no history of cancer or other chronic disease, did not undergo any reference test.
Two studies 46,47 included patients with histologically proven pancreatic cancer (25 and 65 patients respectively). The control groups consisted of perceived healthy controls in one study 47 , and patients suspected to have pancreatic disease who were scheduled for pancreatic imaging and found to be negative for malignancy in the other 46 .
Six studies 48-53 analysed breath samples from patients with colorectal cancer. The study population ranged from 20 to 65 patients with cancer. Patients with stage I-IV disease were included in all but one study 48-52 ; the other study 53 included only patients with stage I-III tumours. The control group consisted of healthy controls in four studies 49-51,53 . One study 52 compared VOCs from patients with colorectal cancer with those from patients with head and neck cancer (squamous cell carcinoma) or breast cancer. The remaining study 48 was a follow-up analysis in which patients with colorectal cancer were compared with those with colorectal cancer from the original study 49 , who meanwhile had been treated and declared tumour-free.
In addition, the follow-up patients were compared with healthy controls. All studies used histologically proven colorectal cancer as reference.
In general, many factors were heterogeneous across the studies. The eligibility criteria were sometimes not described clearly. Some studies included benign disease, whereas this was an exclusion criterion in other studies. There was also no consensus regarding how to deal with co-morbidities, and the timing of the index test compared with the reference test was not always at the same stage of the diagnostic process.

Patient preparation and sample collection
Measures to reduce the influence of ambient air were taken in 21 of 23 studies (Table S2). A lung washout was performed in 10 of 23 studies, and ambient air was sampled as a reference value in 9 of 23. Nineteen of 23 studies described having taken measures to limit the influence of food and/or beverages. The duration of fasting before breath collection ranged from 2 h to more than 24 h. Abstaining from alcohol consumption and/or smoking before measurement was mentioned explicitly in 10 of 23 studies, and was at least recorded in 17 of 23 studies. Other preparatory measures described were refraining from physical exercise, being in emotional balance, gargling with water before breath collection, and refraining from the use of toothpaste.
The timing of breath collection in the diagnostic process differed between the studies. Sample collection was performed using the following systems: Mylar® bags, Tedlar® bags, syringes, inert steel bags or chambers, nalophan sampling bags, BioVOC™ breath sampler, directly into a PTR-MS instrument, and directly into an E-nose. Research groups then stored and analysed the samples themselves or transported them to a laboratory that had access to the required analytical platform.

Analytical platforms and data analysis
A variety of methods were used to analyse VOCs from exhaled breath (Table 2). GC-MS and sensor array systems were most often used to analyse exhaled breath (8 studies) 32 40 , trichloro(phenethyl)silane field effect transistor (1 study) 37 , and IMR-MS (1 study) 47 . Only three studies 34,36,52 used sensor systems for analysis: Aeonose (The eNose Company) (2 studies) and breath analyser (Figaro, USA) (1 study). Data analysis was performed by a variety of techniques, involving the following methods: principal component analysis (PCA), probabilistic neural networks, partial least squares discriminant analysis (PLSDA), discriminant function analysis, artificial neural networks, Fisher linear discriminant function analysis, least absolute shrinkage and selection operator logistic regression (LLR), Mann-Whitney U test with LLR, predictive probability models, Mann-Whitney U test with binary logistic regression model, PCA with PLSDA with variable importance in the projection model, t test with ANOVA, and PCA with stepwise discriminant analysis. A detailed explanation of these methods is beyond the scope of this review.

Diagnostic test performance and validation
A summary of the diagnostic performance of the individual studies is provided in Table 2. The results were divided into four groups based on the cancer type studied. Data on sensitivity, specificity, accuracy, and AUC were retrieved from the articles. Where a study compared the index group with multiple reference groups, the results of these comparisons are also included. Five authors did not report diagnostic performance. The sensitivity ranged from 66.7 to 100 per cent, whereas specificity ranged from 48.1 to 97.9 per cent. In most studies, the sensitivity and specificity were lower in the validation phase than in the training phase. Owing to heterogeneity of the studies, no meta-analysis could be performed. Internal, external or cross-validation was performed in one-third of the studies (Table 2).
Both the largest (484 patients) 32 and the smallest (30) 40 studies, including patients and controls, analysed VOCs from patients with gastro-oesophageal cancer.
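For readers less familiar with the performance measures extracted here, the following sketch shows how sensitivity, specificity, and accuracy follow from a 2 x 2 table of index test result against reference standard. The class name and the counts are hypothetical and serve only to illustrate the definitions; they do not correspond to any included study.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class TwoByTwo:
    """Counts from cross-tabulating the index test against the reference test."""
    tp: int  # cancer present, breath test positive
    fp: int  # cancer absent, breath test positive
    fn: int  # cancer present, breath test negative
    tn: int  # cancer absent, breath test negative

    def sensitivity(self) -> float:
        return self.tp / (self.tp + self.fn)

    def specificity(self) -> float:
        return self.tn / (self.tn + self.fp)

    def accuracy(self) -> float:
        return (self.tp + self.tn) / (self.tp + self.fp + self.fn + self.tn)


# Hypothetical counts: 30 patients with cancer and 40 controls.
table = TwoByTwo(tp=27, fp=8, fn=3, tn=32)
print(f"sensitivity = {table.sensitivity():.1%}")
print(f"specificity = {table.specificity():.1%}")
print(f"accuracy    = {table.accuracy():.1%}")
```

Accuracy depends on the ratio of cases to controls in the sample, which is one reason why accuracy figures from studies with different case mixes are hard to compare directly.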

Reported volatile organic compounds
In total, 106 different VOCs were identified. For most VOCs, there was a statistically significant difference in presence between the groups. Some of the identified VOCs were only significant within a subgroup. Of the VOCs recorded, 32 were identified by more than one study (Table S3). These VOCs were either found to be cancer type-specific in multiple studies (19 VOCs), or were found in different cancer types (13 VOCs) and were therefore more general cancer VOCs. The VOCs that were identified in the most studies (4 studies each) were decanal, nonanal, and acetone.
Thirteen compounds identified in multiple studies were described for different cancer types. Acetone was found to be significantly different in the oesophageal cancer/gastric cancer, pancreatic cancer, and colorectal cancer groups. 2-Methylpentane, 3-methylpentane, 4-methyloctane, dodecane, decanal, and nonanal were found in the oesophageal cancer/gastric cancer and colorectal cancer groups. Pentane, undecane, tetradecane, hexane, ammonia, and 1,2,3-trimethylbenzene were found in the pancreatic cancer and colorectal cancer groups. The remaining 19 VOCs were found only in studies of the same cancer.
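The grouping described above (compounds reported by more than one study, split into those confined to one cancer type and those seen across cancer types) can be expressed as a simple tabulation. The sketch below is one possible way to do this; the study identifiers, the placeholder compounds `compound_x` and `compound_y`, and the assignment of compounds to individual studies are hypothetical and do not reproduce Table S3.

```python
from collections import defaultdict

# Hypothetical records of the form (study id, cancer type, compound).
reports = [
    ("study_A", "gastro-oesophageal", "acetone"),
    ("study_B", "pancreatic", "acetone"),
    ("study_C", "colorectal", "acetone"),
    ("study_A", "gastro-oesophageal", "decanal"),
    ("study_C", "colorectal", "decanal"),
    ("study_B", "pancreatic", "pentane"),
    ("study_D", "colorectal", "pentane"),
    ("study_A", "gastro-oesophageal", "compound_x"),
    ("study_E", "gastro-oesophageal", "compound_x"),
    ("study_D", "colorectal", "compound_y"),
]


def classify_vocs(records):
    """Split compounds reported by more than one study into two sorted lists:
    those seen in a single cancer type and those seen in several."""
    studies = defaultdict(set)
    cancers = defaultdict(set)
    for study, cancer, voc in records:
        studies[voc].add(study)
        cancers[voc].add(cancer)
    replicated = {voc for voc, ids in studies.items() if len(ids) > 1}
    type_specific = sorted(v for v in replicated if len(cancers[v]) == 1)
    shared = sorted(v for v in replicated if len(cancers[v]) > 1)
    return type_specific, shared


type_specific, shared = classify_vocs(reports)
print("cancer type-specific:", type_specific)
print("shared across types: ", shared)
```

Compounds reported by a single study, such as `compound_y` in this toy data set, fall into neither group, mirroring the way single-study VOCs were handled in the tabulation.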

Discussion
The diagnostic performance of breath analysis for diagnosing cancer has shown promising results, with good sensitivity and specificity. The potential use of breath analysis as a non-invasive test that can be applied clinically may differ for each specific type of digestive tract malignancy as it depends on the cancer prevalence and existing diagnostic alternatives. Breath analysis could be considered as an additional screening tool to supplement faecal blood testing in colorectal cancer, or a screening tool for gastric cancer in countries with a high incidence, such as Asian countries, including Japan. Another option could be monitoring of patients with Barrett's oesophagus to detect a potential conversion to malignancy. Breath analysis might be of special interest for pancreatic cancer, as its incidence is rising and the prognosis is poor, partly because it is often missed in the early stages 14 . A non-invasive test with the ability to distinguish between benign and malignant masses would be welcome. Despite the amount of research already done, there is currently no breath test being used for the detection of gastrointestinal tract malignancies, and the majority of clinical investigations are proof-of-concept studies. Most of these studies have been performed in small populations using different analytical techniques with poor standardization. VOCs are a product of metabolic processes and so their presence in exhaled breath greatly depends on the metabolic state of the patient. Alterations in breath profiles could be induced not only by cancer but also by other potential endogenous and exogenous influences, such as fasting status, microbiome, smoking, medication, co-morbidities, and exposure to varying ambient air pollutants; all these issues should be taken into consideration when designing a diagnostic study on breath analysis 21 .
Several initiatives are under way to develop protocols for standardization of sampling and analytical measurements in the International Association of Breath Research 54-56 and the European Respiratory Society 57 . In a recent review 30 , a proposed framework for conducting and reporting future studies investigating the role of VOCs in cancer diagnosis was formulated. Applying standardization would contribute to improved quality of individual studies and enhance comparison between studies, leading to faster implementation of this promising diagnostic tool in clinical practice.
Although there is an abundance of possibilities for performing VOC analysis, a disadvantage in most of the currently available studies is possible overestimation of the predictive value and lack of external validation. Prediction models generally perform better on data on which the model was developed than on new data.
Owing to relatively small sample sizes in most of the studies, there is a lack of external validation leading to a possible reduction in reproducibility 58 . According to the TRIPOD statement 59 , it is highly recommended for studies of prediction models to at least perform internal validation of the findings. Truly reliable results will only be generated by also validating the results externally.
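The gap between apparent and validated performance can be made concrete with a toy example. The sketch below is an illustration under stated assumptions, not a method used in any included study: it generates random 'sensor readings' that carry no information about disease status, fits a one-nearest-neighbour classifier (chosen here only because it memorizes its training data), and compares accuracy on the development data with leave-one-out cross-validated accuracy, a form of internal validation. All function names and parameter values are hypothetical.

```python
import random


def nearest_label(point, train_points, train_labels):
    """Predict the label of the closest training point (squared Euclidean distance)."""
    best = min(
        range(len(train_points)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(point, train_points[i])),
    )
    return train_labels[best]


def apparent_accuracy(points, labels):
    """Accuracy when the model is evaluated on the data it was developed on."""
    hits = sum(nearest_label(p, points, labels) == y for p, y in zip(points, labels))
    return hits / len(points)


def leave_one_out_accuracy(points, labels):
    """Accuracy when each subject is predicted from a model built without that subject."""
    hits = 0
    for i, (p, y) in enumerate(zip(points, labels)):
        rest_points = points[:i] + points[i + 1:]
        rest_labels = labels[:i] + labels[i + 1:]
        hits += nearest_label(p, rest_points, rest_labels) == y
    return hits / len(points)


rng = random.Random(0)  # fixed seed so the sketch is repeatable
n_subjects, n_features = 40, 10
points = [[rng.gauss(0, 1) for _ in range(n_features)] for _ in range(n_subjects)]
labels = [i % 2 for i in range(n_subjects)]  # labels are unrelated to the features

apparent = apparent_accuracy(points, labels)
validated = leave_one_out_accuracy(points, labels)
print(f"apparent accuracy:        {apparent:.0%}")
print(f"cross-validated accuracy: {validated:.0%}")
```

Because every subject is its own nearest neighbour, the apparent accuracy is perfect even though the features are pure noise, whereas the cross-validated figure is expected to fall to around chance level. Internal validation of this kind still reuses the same sample, so it does not replace validation in an independent cohort.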
There are many different analytical methods being used in studies of VOCs, and a distinction can be made between the so-called real-time and offline analysis techniques 22 . The majority of the included studies used an offline combination of GC-MS systems with a sensor array system. An advantage of this approach is that specific discriminative VOCs can be identified and used to develop sensor systems applicable to clinical settings. However, certain conditions must be fulfilled for development of a breath test for use in the clinic. For clinical use, it is most important that the device is easy to carry, gives quick results, is non-invasive, is not susceptible to environmental influences, and has both a high sensitivity and specificity.
VOCs that appeared in multiple studies might have the greatest value for discriminating cancer from non-cancer conditions. Some VOCs, such as acetone, 2-methylpentane, 3-methylpentane, decanal, nonanal, pentane, and tetradecane, were identified in studies of different cancer types. This suggests that VOCs can be cancer type-specific, but also general markers for cancer. The vast majority of the VOCs, however, were only identified in single studies. Of the single VOCs that were identified in multiple studies, including decanal, nonanal and acetone, not all can be attributed directly to certain (patho)physiological processes. However, it is known that cancers often show metabolic abnormalities, such as dysregulation of glucose, fatty acid, and amino acid metabolism 60 . One should keep in mind that not only cancers but also other metabolic abnormalities might cause alterations in breath profiles. For example, an increase in acetone can be a result of diabetic ketoacidosis. However, acetone is a ketone strongly related to fatty acid oxidation. Fatty acids consist of a carboxyl group and a hydrocarbon chain that can be saturated or unsaturated, and are required for synthesis of membranes and signalling molecules in cellular proliferation, as seen in cancers 60-62 .
Headspace analysis of healthy intestinal epithelial cells and colonic cancer cells has already shown differences in release of VOCs. This indicates that metabolic abnormalities of cancer cells might contribute to the differences in exhaled breath profiles 63 . As the pathophysiological mechanisms that lead to the altered VOC production in patients with cancer have not yet been elaborated sufficiently, it remains difficult to determine the origin of the distinctive VOCs.
More recent studies using sensor systems, such as the Aeonose, have shown promising results of exhaled breath analysis for diagnosing malignancies. However, these studies were unable to identify individual compounds as they used sensor measurements that were analysed using pattern recognition techniques 64 . Additionally, they can be criticized for showing poor linear reproducibility of the results and they also seem to be particularly sensitive to exogenous influences, such as humidity 17 .
As for use in clinical practice, it would be of interest to determine whether a breath test could be applied not only to distinguish between healthy individuals and those with cancer, but also between similar diseases such as cancer and benign conditions of the same organ 22 . Therefore, one should consider also including patients with benign diseases in breath analysis studies. During the review process, an additional study 65 was published that met the search criteria for the present analysis. Breath analysis was performed using the Aeonose for diagnosing colorectal cancer. The final model for distinguishing colorectal cancer from healthy controls showed a sensitivity of 95 per cent and specificity of 64 per cent, with an AUC of 0.84. Benign conditions such as advanced adenoma, non-advanced adenomas or hyperplastic polyps were also taken into account. Although the Aeonose was able to distinguish patients with colorectal cancer from healthy controls, it was not able to differentiate colorectal cancer from advanced adenomas, or advanced adenomas from non-advanced adenomas, suggesting that the VOC profiles are too similar 65 . A different study 66 using the Aeonose for a known precursor of oesophageal carcinoma, Barrett's oesophagus, had shown promising results, with a sensitivity of 91 per cent and specificity of 74 per cent for differentiating patients with Barrett's oesophagus from healthy controls. These findings demonstrate that exhaled breath analysis may be of use in the early detection of precancerous conditions, enabling better surveillance or earlier treatment. However, as discussed above, a number of steps still need to be taken to develop clinically applicable breath tests.
Currently, multiple systems are used for VOC detection, which have similar diagnostic performance. However, comparison and pooling of the studies proved to be difficult in the present analysis owing to wide heterogeneity between the studies. A consensus on how studies that analyse VOCs in exhaled breath should be performed will greatly advance progress in this field.
The appearance of some VOCs in multiple studies of the same cancer type, but also of different cancer types, suggests that there could be tumour-specific and also general cancer-associated VOCs. Further studies are needed to determine whether such VOCs could be used to improve cancer diagnostics.

Funding
This study was supported by the Dutch Digestive Foundation (MLDS career development grant CDG16-12 to T.L.).