Uncertainty quantification in computed tomography pulmonary angiography

Abstract Computed tomography (CT) imaging of the thorax is widely used for the detection and monitoring of pulmonary embolism (PE). However, CT images can contain artifacts due to the acquisition or the processes involved in image reconstruction. Radiologists often have to distinguish between such artifacts and actual PEs. We provide a proof of concept in the form of a scalable hypothesis testing method for CT, to enable quantifying uncertainty of possible PEs. In particular, we introduce a Bayesian Framework to quantify the uncertainty of an observed compact structure that can be identified as a PE. We assess the ability of the method to operate under high-noise environments and with insufficient data.


Significance statement
Computed Tomography (CT) imaging in medicine is widely used to visualize internal organs for diagnostic purposes.In the context of pulmonary embolism (PE) detection in the setting of acute chest pain, the PE can appear in CT scans as small structures with weak amplitude.So PE detection can be challenging in practice for clinicians, who have to decide whether the structures are PEs or not.This ambiguity can occur due to imperfect data acquisition (e.g.insufficient data, high noise environment).In this work we propose a computational tool to help clinicians to decide whether an observed structure is a PE or an artifact due to imperfect data.Our method quantifies the uncertainty of the structure, leveraging optimization and Bayesian theory.

Introduction 1.PE detection with computed tomography angiography
Most medical image modalities such as Computed Tomography (CT), ultrasound and Magnetic Resonance Imaging are the result of an intricate image reconstruction process that uses noisy and incomplete captured data.In particular, CT is a popular imaging modality used to diagnose various types of pathologies, such as acute inflammatory conditions, strokes and malignancy.X-rays are passed through the patient's body from multiple angles and an attenuation coefficient is calculated depending on the densities of the different tissues the x-rays pass through.A reconstruction algorithm is used to create final 3D image.This algorithm is subject to creation of artifacts, i.e., structures not present in the ground truth image being captured [23].They can interfere with conclusions drawn by radiologists, who then have to infer if structures appearing in CT images are pathological or artifactual due to the inaccuracy of the data acquisition.This is quite common when assessing CT scans for the presence of acute PE, which is a major cause of mortality with approximately 30,000 deaths per year in the UK [2].Assessment and detection of PE and its cardiovascular complications is routinely performed with a CT pulmonary angiography (CTPA) [12].Chronic thromboembolic pulmonary hypertension is also a potential long term disabling complication of acute PE and CTPA is an important diagnostic tool as well as being useful to assess for operability [10].However, a variety of patient and protocol related factors can result in image artifacts that may impact the clinicoradiological confidence of image interpretation.If a false positive diagnosis is made, this can result in inappropriate patient treatment with anticoagulation, which is associated with an unnecessary increase in bleeding rates [9].
In this context, quantifying uncertainty of the PE-like structures observed in reconstructed CT thorax images would improve diagnosis accuracy.In this paper we present an uncertainty quantification (UQ) framework to perform hypothesis tests on PE-like structures, and determine whether they are present in the patient thorax or are artifacts arising from inaccurate data acquisition.

Bayesian inference for imaging
Reconstruction of images from CT data can be formulated as an inverse problem.The objective is to find an estimate x x x † of an unknown image x x x (i.e., patient's thorax) from measurements y y y acquired with a CT scanner [19,7].Following a Bayesian framework [8], the image and the data are related through a statistical model.Then the estimate x x x † is inferred from y y y according to its posterior distribution, which combines information from the likelihood, related to the observations y y y, and the prior, used to introduce a priori information on the target image.The prior is used to regularize the model, to help to overcome ill-posedness and/or ill-conditionedness of the inverse problem.Common choices are to impose feasibility constraints, and to promote smoothness or sparsity of x x x, possibly in some transformed domain such as wavelet, Fourier or total variation (TV) [3].
Sampling methods, e.g., Markov Chain Monte Carlo methods (MCMC), draw random samples according to the posterior distribution.These methods then allow us to form estimators (e.g., minimum mean square error (MMSE) estimator, posterior mean or maximum a posteriori (MAP) estimator), and to perform UQ through confidence intervals and hypothesis testing [17,18].The main drawback of these methods is their high computational cost making them inefficient for highdimensional problems, as encountered in imaging.Indeed, for CT imaging, the dimension of x x x are often of the order of 10 8 in the case of high resolution lung scans [1].Although multiple works have emerged in the last years to help scaling sampling methods, e.g.[13,22,21], they usually remain prohibitive in such high dimensions.
Methods of choice for handling high-dimensional problems are proximal splitting optimization algorithms [5,11,4].These are known to be very efficient to form MAP estimates.Nevertheless, these methods only provide a point estimate, without quantifying the uncertainty on the delivered solution.To overcome this issue, recently a Bayesian Uncertainty Quantification by Optimization (BUQO) approach has been proposed in [14,15,16], to perform hypothesis testing on particular structures appearing on MAP estimates.The method determines whether the structures of interest are true, or are reconstruction artifacts due to acquisition inaccuracy.BUQO has the advantage of being scalable for high-dimensional problems, as the UQ problem is recast in an optimization framework, to leverage proximal splitting optimization algorithms.

Uncertainty Quantification for PE
UQ is the main tool to assist doctors for accurate decision-making processes.Ill-posed and illconditioned inverse problems result in high uncertainty about the estimate.In this work, we focus on quantifying uncertainty of PE-like structures in CT thorax images.Specifically, we design a method based on BUQO to determine whether these structures are PEs, or if they are reconstruction artifacts.

Methods
In this section we describe the steps of the proposed PE UQ technique.First, we form the CT image using an optimization algorithm (Section 22.1).Second, we identify PE-like structures in the image estimate, and postulate the null hypothesis that these structures are not present in the ground truth image, i.e., they are not in the patient's thorax, but instead are reconstruction artifacts arising due to the ill-posedness of the problem.Third, we use our method to decide whether the null-hypothesis can be rejected or not (Section 22.3).

Bayesian inference and optimization for CT imaging
In general, the gantry of a CT scanner, which includes multiple x-ray sources and multiple detectors will rotate around the patient's chest.This generates an M -dimensional array of data, denoted by y y y, consisting of attenuated X-ray intensities [19,7].The pattern of attenuation is determined by the geometry of the area through which the beams are directed.The aim of CT reconstruction is to recover a voxel array of dimension N1 , denoted by x x x, that represents the geometry of the organs inside the thorax given the observed noisy data y y y.This can be reasonably approximated as a linear inverse problem of the form where φ φ φ represents the CT measurement operator described above, and w w w is a realization of an additive independent and identically distributed (i.i.d.) random noise.
Using a Bayesian formulation, the posterior distribution of the problem, which combines information from the likelihood and the prior, can be expressed as where f is assumed to be a log-concave likelihood associated with the statistical model of (2.1), and g is a log-concave prior distribution for x x x.The usual approach to estimate x x x is to use a MAP approach, that consists in defining x x x † as a minimizer of the negative logarithm of (2.2), i.e., x x x † ∈ argmin x x x f y y y (φ φ φx x x) + g(x x x). (2.3) In this work, we assume that the exact noise distribution is unknown, but that it has a bounded energy, i.e., w w w 2 ε, where • 2 is the usual Euclidean norm, and ε > 0.Then, a typical choice is to take f y y y (φ φ φx x x) to be the indicator function of the 2 -ball B 2 (y y y, ε), centered in y y y with radius ε > 0. In addition, a common choice for the prior term g(x x x) is to promote sparsity of the image of interest in some basis (e.g., wavelet or TV).Then, (2.3) can be rewritten as where the operator ψ ψ ψ models a linear transform, chosen such that ψ ψ ψx x x has only few non-zero coefficients.(2.4) can be solved efficiently using proximal splitting algorithms [5,11,4].

High dimensional hypothesis testing
The method described in the previous section provides a point estimate x x x † of x x x, without additional information regarding its uncertainty.In this work, we propose to perform a hypothesis test on structures that can be identified as PEs in the MAP estimate.In traditional hypothesis testing, one computes the credible interval I α and the test statistic θ from data.The null hypothesis H 0 is rejected if θ is not in the credible region.Similarly for our proposed method, we compute the High Posterior Density (HPD) region C α and an image x x x S with the structure removed (similar to the test statistic).We reject the null hypothesis (which states that the structure is absent) if x x x S does not lie inside the credible region.This is determined by the distance between x x x S and x x x C , which are the two elements of S and C α respectively that are closest to each other.If this distance is zero, we conclude that x x x C ∈ S otherwise, x x x C / ∈ S.
To illustrate our approach, we recall the basics of hypothesis testing.Typically, we postulate a null-hypothesis, i.e., we make a claim about the distribution of observed data.We use the observed data to compute a statistic θ.We decide to reject or not the null-hypothesis depending if θ lies in a High Probability Interval (see Figure 1).This can be extended to computational imaging [15,16], to quantify uncertainty on structures appearing on the MAP estimate x x x † , obtained by solving (2.4).In this context, we postulate the null-hypothesis H 0 and the alternative hypothesis H 1 as follows: H 0 : The structure is ABSENT from the true image H 1 : The structure is PRESENT in the true image Formally, using Bayesian decision theory [17], we can conclude that H 0 is rejected in favor of H 1 if P(H 0 |y y y) α, where α ∈ (0, 1) denotes the level of significance of the test.Such probability can be approximated by MCMC approaches [18], however it becomes intractable for high-dimensional problems such as CT imaging.To overcome this difficulty, we introduce a subset S of R N , associated with H 0 , containing all the possible images without the structure of interest.Then, by definition, we have P(H 0 |y y y) = P(x x x ∈ S|y y y).To perform the hypothesis test, we will compare S with a posterior credible set C * α , corresponding to the set of possible solutions where most of the posterior probability mass of x x x|y y y lies [14].Formally, C * α satisfies P(x x x ∈ C * α |y y y) = 1 − α.Again computing such probability in high-dimension is intractable.Instead, [14] introduced a conservative credible region C α , in the sense that P(x x x ∈ C α |y y y) 1 − α, that does not require any additional computational cost other than building a MAP estimate x x x † , i.e., solving (2.4).Note that, by construction, we have x x x † ∈ C α , and C α consists of defining a feasibility set around x x x † .
The BUQO approach adopted in this work consists in determining if the intersection between S and C α is empty.If it is empty, it means that P(x x x ∈ S|y y y) = P(H 0 |y y y) 1 − (1 − α) = α, hence H 0 is rejected.To determine if S ∩ C α = ∅, we aim to find an image belonging to S ∩ C α .If such image exists, it means that S ∩ C α = ∅, and it is possible to find (at least) one image supported by the data y y y without the structure of interest, hence H 0 cannot be rejected.Otherwise S ∩ C α = ∅, and H 0 is rejected (see the second row of Figure 1).

Hypothesis test for PE detection
In this section, we explain the proposed method to determine whether S ∩ C α is empty or not.In addition, we give mathematical definitions of sets S and C α , tailored for the PE UQ problem.
To find the closest image to the the MAP estimate x x x † , belonging to S, one can project x x x † into S.We denote x x x † S = Proj S (x x x † ) this image.The first step is to verify if x x x † S ∈ C α .If it is the case, then we have found an image in the intersection x x x † S ∈ S ∩ C α , and H 0 cannot be rejected, i.e., we are uncertain that the PE is present.If x x x † S ∈ C α , it does not mean that C α ∩ S is empty, and there might still be an image which belongs to both sets.To ascertain if the intersection is empty, we propose to equivalently compute the distance between S and C α , denoted dist(S, C α ), and to verify if it is zero or positive.If dist(S, C α ) > 0, then we can conclude that C α ∩ S = ∅, so H 0 is rejected in favor of H 1 .Otherwise, if dist(S, C α ) = 0, there exists (at least) one image in the intersection, and hence H 0 cannot be rejected.
To evaluate dist(S, C α ), we need to minimize the distance between an element x x x C of C α and an element x x x S of S, i.e., we want to 2 . (2.5) For our problem, the conservative credible set, associated with (2.4), is defined as C α := {x x x 0 | φ φ φx x x − y y y 2 ε and ψ ψ ψx x x 1 η α }, where η α = ψ ψ ψx x x † 1 + N + 16N log (3/α).One main contribution of this work is to define S to be a set describing all possible images without PE structures that can be identified in the MAP estimate.In particular, we want the pixel intensity profile within the structure's area to be similar to the pixel intensity profile of a neighborhood of the structure.To this aim, we propose to define S as the intersection of three sets, i.e., S := I ∩ E ∩ S, given by intensity: energy: smoothness: where M : R N → R N S is a linear operator selecting the pixels of the image corresponding to the PE area.The first set I is the positive orthant, to ensure images in S are intensity images.The second set E controls the energy in the structure, ensuring that pixels inside the structure's area are taking values around a predefined mean value µ pix , chosen according to its neighborhood.The third set S is a smoothness constraint, to control the pixel intensity variation in the structure's area to be close to a mean value µ ∇ corresponding to the variations in its neighborhood.For both E and S, r pix and r ∇ are positive predefined constants to control the similarity between the structure's area and its neighborhood.

Experiments
In this section, we present experimental results on synthetic CT data.We apply the BUQO method to real CT slices that contain a PE and assess the ability of the algorithm to detect the PEs under different noise levels and detector setups.We also apply the BUQO method to test for the presence of reconstruction artifacts that were created when simulating the forward problem.

Dataset Description
CTPA was performed on multidetector array scanners, (SOMATOM® Drive and Definition Edge, Siemens Healthineers, Erlangen, Germany).The parameters were as follows; 128 x 0.6 mm slice thickness, 1.2 pitch, 0.5 s rotation time, 145 kVp tube voltage and 120 mAs with automatic dose modulation.60 mls of non-ionic intravenous contrast medium (iohexol, 350 mg iodine/ml; Omnipaque 350, Amersham Health, England) was administered at 6 ml/s via an 18 G cannula.The acquisition was triggered by bolus tracking of the main pulmonary artery, with a threshold of 100 Hounsfield units (HU) and 4 second delay after triggering.The study received approval from the Research Ethics Committee and Health Research Authority (IRAS ID 284089).Informed written consent was not required.

Measurements
From this data, we consider two slices of reconstructed clinical images containing PEs.Using these slices, we simulate data to study the effect of CT acquisition quality on PE detection.To this end we consider the model described in (2.1), with a forward operator φ φ φ modeling a parallel beam geometry with a fixed number of detectors D = 450 and a variable number of acquisition angles M a ∈ {50, 100, 200, 300, 450}.We generate w w w in (2.1) as a realization of an i.i.d.Gaussian noise vector of size M a × D and variance σ 2 .We then reconstruct the CT image by solving equation (2.4) to obtain the MAP estimate.

PE definition
To create the masks related to the operator M in (2.7) and (2.8), we used MITK [24].Two types of masks were created by experienced clinical radiologists: Masks identifying the location of real PEs appearing in the CTPA scans; and masks identifying the location of PE-like artifacts appearing in the CTPA scans due to low quality of the acquired data.In Figure 2 we show, for both slices, the PEs, and the artifacts of interest arising from the reconstruction process.
The set S, as defined in Section 22.3, captures the pixel profile for an artery that does not have a PE.In the definition of S, some parameters related to the energy and smoothness constraints must be chosen (see (2.7) and (2.8), resp.).We propose to choose them automatically, by looking at histograms of pixel intensities and gradients in a neighborhood of the mask.Precisely, we sample pixels around the area of interest and compute the histogram of the intensities of the sampled pixel.Then, in (2.7), µ pix is set to be the median of this histogram, and r pix is set to be the maximum of the difference between the upper 60 th percentile and the median; and the difference between the median and the lower 60 th percentile.The same is done to compute µ ∇ and µ pix in (2.8), but with the histogram of sampled gradients instead.

Result interpretation
To assess the effect of the acquisition quality (i.e., noise level σ and number of angles M a ) on the ability of our method to detect true structures, we introduce a structure confidence quantity If ρ α = 0, then dist(S, C α ) = 0 and we can conclude that there exists an image without the observed structure that lies in the credible set C α .If ρ α > 0, then dist(S, C α ) > 0, and the null hypothesis is rejected.The closer to one the value of ρ α is, the more certain we are that the null hypothesis should be rejected, and thus that the structure of interest is present in the true image.In practice, numerical errors must be taken into account, and the two above conditions should be relaxed as ρ α δ and ρ α > δ, respectively, for some tolerance δ to be determined by the user.
Note that ρ α provides additional information than only an accept/reject hypothesis test.It can be interpreted as a percentage of the structure's energy that is confirmed by the data.So when a selected PE-like structure is probed for UQ, ρ α provides a percentage of the structure's energy that can be trusted.
In Figure 1, we compare our method to traditional hypothesis testing in statistics.It is therefore natural to interpret ρ α as being equivalent to a p-value in hypothesis testing.However, accepting or rejecting the null hypothesis in our cases does not depend on some hard threshold on ρ α .There are two reasons for this.Firstly, traditional hypothesis testing is a frequentist method, where one would typically take the output of models at face value.Our method is a Bayesian method, where one is more interested about prior and posterior distribution.As such, ρ α is telling us the percentage of the structure that can be explained by the data.Setting a threshold on when to accept or reject the null should be an application-specific matter.Secondly, the method we have proposed does not only generate ρ α , but also generates x x x C and x x x S , whose qualitative contribution to the decision to accept or reject the null hypothesis is as important as the quantitative contribution of ρ α .Figure 2 shows images x x x C and difference images |x x x C − x x x S |, for different detector settings, and therefore different values of ρ α .It can be seen that nonnegative values of ρ α do not necessarily correspond to images that would be considered normal by a radiologist.However, very high values of ρ α (close to 1) tend to correspond to high fidelity images, which mimic real CT scans very well.

Confidence with respect to measurements
We show in Figure 3 the behavior of ρ α for two assessed PE structures, with respect to the noise level σ for a fixed number of angles M a (left), and with respect to the number of angles for a fixed noise level (right).It can be observed that the ability of the algorithm to confirm the presence of PEs improves with decreasing noise levels and increasing number of angles.
For the PE structure in Figure 3(left), we provide additional results in Figure 2(right).The images show the results of BUQO when considering (σ, M a ) = (50, 0.007), (σ, M a ) = (200, 0.035), and (σ, M a ) = (450, 0.007).In particular, the last row shows the differences (in absolute values) between x x x S and x x x C .This corresponds to the residual PE structure that is probed by BUQO.It can be seen as a 2D map version of quantity ρ α , giving the intensity value per pixel that is validated by the data.We can see that when the acquisition quality improves (i.e., σ decreases and/or M a increases), the intensity value per pixel that is validated by the data increases.
In Figure 2(left) we show results of BUQO for three PE-like structures that are reconstruction artifacts.For these structures, the last row show that the intensity value per pixel that is validated by the data is equal to 0 (i.e., ρ α = 0).Hence our method cannot reject H 0 , and the data cannot support the existence of the structure.

Complexity
In our experiments (see Figure 4), we found that the numerical complexity of the proposed uncertainty quantification is usually negligible compared to that of the reconstruction algorithm providing the MAP estimate.The computational bottleneck is usually the evaluation of the forward operator and its adjoint.The complexity is assessed in terms of total number of iterations (i.e., number of evaluations of the forward operators and their adjoints) to reach convergence of the algorithms used to evaluate the MAP and for BUQO (primal-dual algorithms in both cases).Convergence is assumed to have occurred when all constrained are satisfied, and the estimates are stable, up to a fixed tolerance.

Discussion
We have introduced an UQ method in CT imaging that can be used to assess PE-like structures observed in CT scans.We have simulated different acquisition environments by varying the number of measurements and the noise level in the forward problem and used the resulting MAP estimate to investigate the behavior of the proposed method to quantify uncertainty of PE-like structures.Our method demonstrates diminishing confidence with a decrease in data quality, while correctly identifying reconstruction artifacts produced in simulation using low quality data.In this closing section, we go over the strengths and weaknesses of the proposed method.
Manual annotations.The proposed method requires 3 inputs, namely the MAP estimate, the mask that isolates the area under investigation and the set S, which represents our prior knowledge.Currently, the mask is the result of a time consuming manual segmentation exercise, done by experienced clinical radiologists which can be replaced by an automatic segmentation algorithm based on deep learning methods [20].The set S is built making use of a constraint defined in the gradient domain of the image (which is unsuitable for artifacts appearing close to a boundary); and is done by manual sampling (which is time consuming).Instead, the set S could be the result of a data-driven method such as generative appearance models [6] Clinical Use.Acute PE carries a significant associated morbidity and mortality and thus improvement in the degree of radiologist certainty in positive identification of acute PEs in clinical practice is paramount.It is also important to improve the degree of radiologist certainty in identifying artifacts as such rather than false positive PEs, in order to avoid inappropriate treatment with anticoagulation and unnecessary bleeding risks.Further work is needed to validate the described method in clinical practice.

Figure 1 :
Figure 1: The Figure shows the parallel between traditional hypothesis testing and our method.In traditional hypothesis testing, one computes the credible interval I α and the test statistic θ

Figure 2 :
Figure 2: Left: Output of BUQO when used to quantify uncertainty of reconstruction artifacts.The forward problem parameters are chosen to be (M a , σ) = (50, 0.175) for all the artifacts.Right: Output of BUQO when used to quantify uncertainty of PEs, as the value of ρ α increases.The forward problem parameters are chosen to be (from left to right column): (M a , σ) = (50, 0.007), (M a , σ) = (200, 0.035) and (M a , σ) = (450, 0.007).First row: MAP estimates, zoomed on the structures of interest.Second row: Output image x x x C from BUQO.Third row: Difference images |x x x S − x x x C |.

Figure 3 :
Figure 3: Structure confidence ρ α as a function of number of angles M a (left) and noise level σ (right).High and low structure confidence are illustrated with qualitative examples of x x x C , x x x S and x x x † .Both plots show that as the data quality (i.e.number of angles and signal-to-noise ratio) increases, the structure confidence increases too, and we are more certain of the presence of the structure.

Figure 4 :
Figure 4: The histogram shows the number of forward operator evaluations needed for BUQO convergence as a ratio of the number of forward operator evaluations needed for convergence of the CT reconstruction algorithm.The data is split by the number of angles used in each simulation.