Robustness and classification capabilities of MRI radiomic features in identifying carotid plaque vulnerability

Abstract Objectives To assess how radiomic features may be combined with plaque morphological and compositional features identified by multi-contrast MRI to improve upon conventional risk assessment models in determining culprit carotid artery lesions. Methods Fifty-five patients (mean age: 62.6; 35 males) with bilateral carotid stenosis who experienced transient ischaemic attack (TIA) or stroke were included from the CARE-II multi-centre carotid imaging trial (ClinicalTrials.gov Identifier: NCT02017756). They underwent MRI within 2 weeks of the event. Classification capability in distinguishing culprit lesions was assessed by machine learning. Repeatability and reproducibility of the results were investigated by assessing the robustness of the radiomic features. Results Radiomics combined with a relatively conventional plaque morphological and compositional metric-based model provided incremental value over a conventional model alone (area under curve [AUC], 0.819 ± 0.002 vs 0.689 ± 0.019, respectively, P = .014). The radiomic model alone also provided value over the conventional model (AUC, 0.805 ± 0.003 vs 0.689 ± 0.019, respectively, P = .031). T2-weighted imaging-based radiomic features had consistently higher robustness and classification capabilities compared with T1-weighted images. Higher-dimensional radiomic features outperformed first-order features. Grey Level Co-occurrence Matrix, Grey Level Dependence Matrix, and Grey Level Size Zone Matrix sub-types were particularly useful in identifying textures which could detect vulnerable lesions. Conclusions The combination of MRI-based radiomic features and lesion morphological and compositional parameters provided added value to the reference-standard risk assessment for carotid atherosclerosis. This may improve future risk stratification for individuals at risk of major adverse ischaemic cerebrovascular events. Advances in knowledge The clinical relevance of this work is that it addresses the need for a more comprehensive method of risk assessment for patients at risk of ischaemic stroke, beyond conventional stenosis measurement. This paper shows that in the case of carotid stroke, high-dimensional radiomics features can improve classification capabilities compared with stenosis measurement alone.


Introduction
Carotid atherosclerotic disease is a subset of cardiovascular diseases (CVD), which is the leading cause of stroke and death worldwide. 1Measuring the degree of luminal stenosis remains the major approach for clinical assessment of the severity of carotid atherosclerosis.It is the only currently validated diagnostic criterion for patient risk stratification.However, the majority of clinical events occur in patients with mild to moderate carotid stenosis. 2Moreover, while the severe stenosis patients (�70%) have been proven through clinical trials to benefit from

Key points
� T2-weighted imaging-based radiomic features had consistently higher robustness and classification capabilities compared with T1-weighted images.� Higher-dimensional radiomic features had better performance than first-order features in identifying textures which could detect vulnerable carotid lesions.� Radiomic features combined with MRI plaque features may improve atherosclerotic plaque risk stratification.
carotid endarterectomy (CEA) and carotid artery stenting (CAS), the benefits of surgery taper off when stenosis severity is reduced (<70%). 2Therefore, those who lie in the mild-tomoderate stenosis range in particular require improved risk stratification for future ischaemic events.
Previous studies have suggested that morphological and compositional characteristics of atherosclerotic plaques may potentially better define clinical progression than luminal stenosis alone.A vulnerable carotid atherosclerotic plaque is often characterized by the presence of intra-plaque haemorrhage (IPH) and a large lipid-rich necrotic core (LRNC) with a thin or defective fibrous cap (FC) that can be visualized by high-resolution, multi-contrast MRI. 3 A comprehensive picture of plaque vulnerability should therefore include lesion morphological and compositional features. 4,5Instead of simply identifying features which relate to atherosclerotic composition, radiomics may have the potential to add incremental value by using the data directly from images to identify features which describe lesions in a way which is beyond the reach of already established methods. 6Some radiomic features may have obvious clinical relevance, such as the size of different zones within the image or the maximum brightness of an image; others are more abstract in nature, and represent patterns typically hidden to the human eyes.
In this study, it is hypothesized that radiomic features extracted from MRI can improve risk stratification of carotid plaques beyond conventionally used features such as the reference-standard degree of stenosis.

Patients and image segmentation
Data from fifty-five patients with bilateral carotid stenosis from the Chinese atherosclerotic risk evaluation (CARE-II) multicentre clinical trial were used in this study (ClinicalTrials.govIdentifier: NCT02017756).Institutional review board (IRB) approvals were obtained for the entire study and for each participating institution, and all participants in the study provided written informed consent. 7The primary objective of CARE-II was to determine the prevalence and characteristics of high-risk atherosclerotic plaques in the carotid arteries in Chinese patients with recent ischaemic stroke or transient ischaemic attack (TIA).In this study, linear logistic regression models were trained to differentiate culprit and non-culprit carotid artery lesions.The term "culprit" always refers to the causative ipsilateral carotid artery or plaque, as determined following the assessment of ultrasound and MRI examinations.The "non-culprit" side was deemed to have no causative basis for symptoms.
Those recruited underwent carotid MRI with the following sequences: three-dimensional (3D) time-of-flight (TOF), T1weighted (T1) quadruple inversion recovery (QIR), T2weighted (T2) multi-slice double inversion recovery (MDIR), and magnetization prepared gradient recalled echo (MPRAGE).Detailed MR imaging parameters can be found in the references, 7 and in Supplementary material A. Detailed patient demographics are provided in Table 1.
The MRI segmentations were performed manually by experts with reference to 3D time-of-flight (TOF), T1-weighted, T2weighted and magnetization-prepared rapid gradient echo (MP-RAGE) images to discern lumen and outer wall boundaries and various atherosclerotic components including IPH, LRNC, and calcification. 8The segmentation and related pre-processing procedures were as follows: (1) lumen and outer wall boundaries were drawn manually on the T1-weighted images to define the region of interest (ROI); (2) all images including the masks for the lumen and outer wall were converted into NIFTI format; (3) radio frequency bias correction 9 and de-noising using a spatially adaptive method 10 were performed; (4) global rigid and local deformable co-registrations were performed with T1-weighted images being the reference modality; 11 and (5) pixels of all images were finally extracted according to the segmentation masks.All steps were performed automatically in MR-VascularView# (Nanjing Jingsan Medical Science and Technology, Ltd, Jiangsu, China), except for step 1. Manual delineation of the lumen and outer wall was performed using a custom-built user interface within MATLAB (MathWorks, Natick, MA).
The overall workflow is illustrated in Figure 1 and includes: calibration; pre-processing procedure to select robust features; development of the predictive model for the identification of culprit lesions; and a post-processing procedure to select the most useful features.1][22] The ROI was manipulated by expanding and eroding the wall boundary outwards and inwards with 1-2 pixels.Only stable radiomic features would then be carried forward to the final analysis and those highly sensitive to minor ROI changes and image quality were omitted.

Feature regularization and model cross-validation
Least absolute shrinkage and selection operator (LASSO) and ridge regression were used to reduce over-fitting by introducing regularization penalty terms to the cost function.In both LASSO and ridge regression, large coefficients in the linear model are penalized by a characteristic penalty term, though in slightly different ways.LASSO regression involves the L1 norm of the vector of feature coefficients w in its penalty term, which is the sum of the absolute values of the feature coefficients.In ridge regression, the penalty term involves the square of the L2 norm, which is the sum of squares of feature coefficients. 23The corresponding penalty terms are shown below, where N refers to the number of features involved in the model.
Since the sum of squares of coefficients less than one is less than their sum, the ridge regression penalty is more accommodating for a more balanced coefficient vector, while LASSO regression tends to favour multiple coefficients being set to zero, resulting in a sparser model.While both types of regularization were performed, with feature selection in mind, the results following LASSO regularization were typically considered in the final analysis.
To validate the predictive ability of the model and assess its sensitivity to the dataset, a five-fold cross-validation scheme was employed where the data were split into five equally sized folds.Four-fifths of the data are used to train a linear logistic regression model, and one-fifth of the data are left unseen to evaluate the model.Additionally, to assess the repeatability of this analysis, this scheme was performed on resampled datasets 1000 times according to the bootstrapping technique of resampling with replacement.Bootstrapping resampling measures the sensitivity to the data set of the cross-validation analysis.This also allowed for a more comprehensive analysis for feature selection.The validation methodology is further illustrated in Figure 1.

The combined model and feature selection
The combined model consisted of the radiomic model and conventional risk features composed of morphological and compositional metrics.Morphological metrics included: degree of stenosis; minimum minor axis length (MMAL); minimum lumen area (MLA); inward remodelling index (IRI); and outward remodelling index (ORI), plaque burden, and plaque volume.Compositional features included calcification presence, calcification volume, IPH presence, IPH volume, and LRNC presence, and LRNC volume.The merit of these features alone was considered in the univariate analysis.
To select features, a protocol was developed based on the cross-validation procedure used to identify the most useful features for the combined model.Using LASSO regularization, the frequency of non-zero feature coefficients across all five cross-validation folds and all bootstrap iterations was plotted to illustrate the features that were most commonly included across all possible models.With ridge regression, a similar principle was utilized, except that the features were considered selected if their corresponding average coefficient across all five cross-validation folds was found to be in the top 10 among all features considered in the model.

Statistical analysis
Univariate analysis was performed using the v 2 test, Fisher exact test, or a Student t-test when appropriate and those with P-value �.1 were included in the multivariate analysis.The ICC was defined using a 2-way mixed analysis of variance (ANOVA) model.The relationship between sensitivity and specificity forms the basis of receiver-operating characteristic (ROC) curves.For all possible cut-offs (the point which separates culprit and non-culprit predictions) of the logistic regression S-curve, a point was added to the ROC curve and the resulting area under the curve (AUC) calculated accordingly.The AUC value is then used to measure the effectiveness of a given predictive model.The significance of the difference between two ROC AUC measurements were assessed using the Wilcoxon statistic for paired samples. 24

Univariate baseline predictors
The baseline model is firstly built upon the consideration of the univariate predictive value of morphological and compositional features.Geometric features except for ORI all had significant univariate predictive value, as did calcification presence and IPH presence (Table 2).The multivariate predictive value of these features is also presented along with the significance of the multivariate odds ratio within the model, in Table 2.

Robustness of radiomic features
Robustness of the radiomic features across 12 extraction settings (bin width: 10, 15, 20, 25, 30, 35 and bin count: 8, 16, 32, 64, 128, 256) was calculated for all 107 radiomic features produced by PyRadiomics.The T2-weighted images provided a higher proportion of moderate (0.5 �ICC < 0.9) and high (ICC �0.9) robust features in comparison with T1-weighted images.Of the features derived from T1-weighted images, 20% had low robustness (ICC < 0.5), compared with almost none which were derived from T2-weighted images.The bin width was a more reliable extraction setting compared to the bin count, which led to a high number of low robustness features (see Figure 2).

Integration of the radiomics and conventional models
The classification capabilities of models with different features included is demonstrated in Table 3 across five crossvalidation folds.It can be seen that if only lesion geometry (including stenosis) was considered, the overall accuracy was slightly greater than 60% at the optimal operating point, which is comparable to the case where only lesion components (calcification, etc.) were considered.The combination of lesion geometric and compositional features improved the classification capabilities, albeit without a significant difference when compared with either plaque morphology or composition alone.The radiomics model had better performance than any of the previous models.The combination of radiomic and lesion morphological and compositional features further improved the accuracy to 75.8%.The ROC plots relating to the performance of these models are illustrated in Figure 3.The multivariate odds ratios of all features within the combined model are presented in Supplementary material B, which highlights the relative weight of each feature within the cross-validated predictive models.This contains a conventional model, radiomic model, and a combined model.
In light of prioritizing clinical interpretability, a simplified radiomic model was tested, which includes only seven radiomic features: GLSZM, Large Dependence High Grey Level Emphasis (GLDM), Small Dependence Low Grey Level Emphasis (GLDM), Difference Variance (GLCM), Informational Measure of Correlation (GLCM), kurtosis (first-order), and skewness (first-order).The performance of these simplified features is demonstrated in Table 3 and the ROC plots may be found within Supplementary material C, which contrasts the ability of simplified and full models (containing all available features) to distinguish culprit and nonculprit plaques.

Feature selection
Each feature class was analysed individually, and the corresponding ROC curves were created for calculation of the AUC (Table 4).The AUC was calculated based on the performance of predictive models on a hold-out set according to the cross-fold validation method.In general, GLDM features derived from T2-weighted images had the best performance, followed by GLSZM and GLCM.NGTDM features had the poorest predictive value, and the poorest robustness profile.
The frequency of selection using LASSO and ridge protocols developed in this study are presented in the Supplementary materials D and E. First-order features skewness and kurtosis appear consistently in feature selection according to the established protocol, despite the first-order feature family in itself not having the strongest predictive performance as outlined in Table 4.The morphological and compositional features including calcification, IPH and plaque burden (PB) were often selected by the linear model under different perturbations of the training data.

Discussion
While univariate analysis revealed the capability of stenosis and other more intricate geometrical features to differentiate culprit and non-culprit plaques, these assessments do not tell the whole story.Features obtained using MRI improved upon the predictive capability of stenosis measurement alone.Radiomics then further improved the predictive model to provide a robust means of describing the plaque vulnerability.This was also true of a simplified model with the involvement of GLSZM, GLDM and GLCM features, kurtosis, and skewness.This implies that interpretable features become more useful when combined with other higher order features along with the plaque features; it also emphasizes the multifactorial, complex nature of carotid plaque vulnerability.T2weighted imaging-based radiomic features had consistently higher robustness than T1-weighted images as measured by the ICC among radiomic features.This may have then carried through to the superior classification capabilities of features derived from the T2-weighted images as opposed to the T1weighted images.This suggests that T2-weighting may be more efficient at picking up tissue contrasts which relate to plaque vulnerability.
Since the plaque components such as IPH and the extracellular fibrous matrix can take multiple forms, it is possible that radiomic texture features detect the particular sub-types of these plaque features which are more vulnerable.The profile of plaque haemorrhage-for instance, how it has developed and progressed-as identified by its texture features, may be a more powerful tool to investigate plaque the discrepancy in the AUC increase between the radiomiconly and all-inclusive models is relatively low, with a mean AUC increase of 1.74%, the statistical significance was improved by 98.6%, which increases the certainty of improvement over conventional methods when radiomic features are combined with the morphological and compositional features.
Limitations include the retrospective nature of the analysis and number of patients.Imaging was performed post-event, and the analysis aims to formulate a means of predicting patient outcome before the event occurs.Classifiers may not be so meaningful if the rupture has already happened and caused significant remodelling within the timeframe between event and patient presentation.Therefore, this work relies upon the implicit assumption that the underlying morphology is highly similar to what it was in the time leading up to the event.In the CARE-II trial, the MRI protocol was pre-defined.The 2 mm slice thickness could incur information loss.The effect of different image quantization settings was explored, but how well the radiomic features would perform under MRI scanners with different signal-to-noise ratio and other imaging parameters remains to be seen.Image pixel intensity normalization during pre-processing was forwent due to homogeneous image acquisition parameters.External validation on different scanners would be required to validate the robustness of the risk prediction models recommended by this study.Different ROI delineations, as defined by the original manual segmentation, impact the radiomic feature extraction.Perturbations of the original ROI using dilation and erosion allow us to understand which image modalities are robust to inter-observer variability.Radiomic features extracted from T2-weighted images were generally more robust to these ROI perturbations.
In conclusion, risk stratification for carotid atherosclerosis can be improved by a more comprehensive overview of the plaque risk beyond the current gold-standard degree of stenosis measurement.These results suggest that the geometric and plaque compositional features, in conjunction with radiomic features, can provide diagnostic improvement for patients with carotid atherosclerosis.

Figure 1 .
Figure 1.Schematic which demonstrates the methodology of extracting and evaluating radiomic features, and the cross-validation technique.

Table 2 .
Univariate and multivariate analysis of the baseline model, consisting of lesion geometry and plaque morphology (n ¼ 55).