Incorporation of quantitative MRI in a model to predict temporal lobe epilepsy surgery outcome

Abstract Quantitative volumetric brain MRI measurement is important in research applications, but translating it into patient care is challenging. We explore the incorporation of clinical automated quantitative MRI measurements in statistical models predicting outcomes of surgery for temporal lobe epilepsy. Four hundred and thirty-five patients with drug-resistant epilepsy who underwent temporal lobe surgery at Cleveland Clinic, Mayo Clinic and University of Campinas were studied. We obtained volumetric measurements from the pre-operative T1-weighted MRI using NeuroQuant, a Food and Drug Administration approved software package. We created sets of statistical models to predict the probability of complete seizure-freedom or an Engel score of I at the last follow-up. The cohort was randomly split into training and testing sets, with a ratio of 7:3. Model discrimination was assessed using the concordance statistic (C-statistic). We compared four sets of models and selected the one with the highest concordance index. Volumetric differences in pre-surgical MRI located predominantly in the frontocentral and temporal regions were associated with poorer outcomes. The addition of volumetric measurements to the model with clinical variables alone increased the model’s C-statistic from 0.58 to 0.70 (right-sided surgery) and from 0.61 to 0.66 (left-sided surgery) for complete seizure freedom and from 0.62 to 0.67 (right-sided surgery) and from 0.68 to 0.73 (left-sided surgery) for an Engel I outcome score. 57% of patients with extra-temporal abnormalities were seizure-free at last follow-up, compared to 68% of those with no such abnormalities (P-value = 0.02). Adding quantitative MRI data increases the performance of a model developed to predict post-operative seizure outcomes. 
The distribution of the regions of interest included in the final model supports the notion that focal epilepsies are network disorders and that subtle cortical volume loss outside the surgical site influences seizure outcome.


INTRODUCTION
Surgery is usually the most effective treatment for drug-resistant focal epilepsies. 1,2 Despite the extensive literature on epilepsy surgery, predicting the likelihood of postoperative seizure freedom for a given patient remains challenging. 3,4 Our group developed and validated a nomogram that predicts individual surgical outcomes from basic clinical variables, 5 but its accuracy was modest (C-statistic of 0.6). Since brain MRI is a critical tool to localize the underlying epileptic lesion and the network of brain damage beyond the seizure focus, 6,7 we hypothesize that the inclusion of quantitative MRI (qMRI) data may enhance the model's accuracy.
Studies on volumetric measurements of the hippocampus enhanced our ability to detect signs of hippocampal atrophy on MRI and, by doing so, revolutionized temporal lobe epilepsy (TLE) surgery. Volumetric measurement of brain structures is a long-standing research tool to detect MRI abnormalities that may not be readily identified by visual analysis. 8 Studies on TLE surgery using different neuroimaging techniques have demonstrated that structural abnormalities in brain regions outside the surgical margins influence postoperative seizure outcomes. [8][9][10][11] However, the translation of this imaging research knowledge into routine clinical practice has remained elusive, limiting its ultimate clinical impact.
The development of automatic segmentation algorithms enables volumetric brain measurement in clinical practice. We explore here the prognostic value of quantitative volumetric MRI measurements in the context of temporal lobe surgery, using NeuroQuant, a Food and Drug Administration-approved software package (CorTechs Labs, San Diego, CA, USA) that performs automatic volumetric measurements, providing percentile volume data of brain regions referenced against an age and gender-matched normative cohort. 12 We hypothesize that subtle structural evidence of epileptic network pathology extending beyond the temporal lobe reduces the odds of seizure freedom after surgery. If this is correct, qMRI measured in the context of routine clinical practice could be leveraged to enhance individualized seizure outcome prediction prior to TLE surgery. Given the advanced analytical tools now available, the epilepsy community should explore the incorporation of volumetric measurements into routine clinical care.

Patient selection
In this multicentre retrospective study, we selected patients who underwent temporal lobe surgery for epilepsy (n = 653). We excluded patients with multilobar resections (n = 86), prior brain surgeries (n = 44), post-operative events of unclear nature (n = 22), and patients who did not have an available pre-operative high-resolution 3D T1-weighted MRI (n = 66). The final cohort included 435 patients treated at Cleveland Clinic, USA (n = 289), Mayo Clinic, USA (n = 57) and University of Campinas, Brazil (n = 89) from 2010 to 2018.
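The cohort flow above can be checked with simple arithmetic. The sketch below assumes that the four exclusion categories do not overlap, which the reported totals imply but the text does not state explicitly; the variable names are illustrative.

```python
# Cohort flow as reported in the text.
# Assumption: each excluded patient is counted in exactly one category.
screened = 653
exclusions = {
    "multilobar resections": 86,
    "prior brain surgeries": 44,
    "post-operative events of unclear nature": 22,
    "no pre-operative high-resolution 3D T1-weighted MRI": 66,
}
final_cohort = screened - sum(exclusions.values())

# Patients per centre, as reported.
by_centre = {
    "Cleveland Clinic": 289,
    "Mayo Clinic": 57,
    "University of Campinas": 89,
}

print(final_cohort, sum(by_centre.values()))  # prints: 435 435
```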
Demographic and clinical data were collected from medical records. All patients underwent a comprehensive pre-surgical assessment, including clinical history and video-EEG, with magnetoencephalography, fluorodeoxyglucose positron emission tomography and/or single-photon emission tomography performed when indicated. Visual MRI analysis was performed by a neuroradiologist specialized in epilepsy who was blinded to post-surgical seizure outcomes.
The following potential seizure outcome predictors were considered: pre-operative seizure frequency, age at epilepsy onset, age at surgery, duration of epilepsy, sex, aetiology, side of surgery, presence of generalized tonic-clonic seizures, MRI abnormalities and type of surgery. Aetiology was defined by pathology or MRI findings (when pathology results were not available). These outcome predictors were similar to the ones used in the generation of our already published epilepsy surgery nomogram. 5

Seizure outcomes
The primary outcome was defined by seizure control at last follow-up. Acute seizures were defined as seizures occurring within the first month after surgery and were not considered as seizure recurrence unless they persisted beyond the acute post-operative phase. Two separate analyses were done: one defining seizure control as complete postoperative seizure freedom; and one defining it as maintaining an Engel score 13 of Ia or Ib (allowing for some postoperative seizures but eventual seizure control by the last follow-up).

Quantitative MRI
The pre-operative 3D T1-weighted high-resolution MRIs were de-identified and sent to NeuroQuant for quantitative analysis. The software calculates the volume of 71 brain regions, providing the left, right and total volume of each region.
For these analyses, we excluded the following brain regions: brainstem, cerebellum, choroid plexus and individual ventricles (the total ventricular volume was included in the analysis). This resulted in 58 regions plus intracranial volume, measured in percentiles (a total of 175 measurements: left, right and asymmetry index). The percentile results compare the volumes of the different brain regions against an age and gender-matched normative cohort. NeuroQuant's normative database is built on a population-based sample data set collected from several thousand subjects from 3 to 100 years of age, with balanced representation of genders.
NeuroQuant calculates the asymmetry index as the difference between the left and right volumes (each expressed as a percentage of intracranial volume) divided by their mean. This value is then compared to the normative database, and results are provided as percentiles. When interpreting asymmetry values as percentiles, the closer the value is to 50, the smaller the difference between left and right volumes. If the asymmetry percentile is between 1 and 49, the left side is smaller than the right, and if the value is between 51 and 100, the right side is smaller.
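The asymmetry measure and its percentile reading can be sketched as follows. This is a minimal illustration, not NeuroQuant's implementation: the sign convention (left minus right) is an assumption chosen so that low percentiles correspond to a smaller left side, as the text describes, and the function names are hypothetical. Because both volumes are divided by the same intracranial volume, the ratio is unchanged whether raw volumes or percentages of intracranial volume are supplied.

```python
def asymmetry_index(left_volume, right_volume):
    """Left-right difference as a percentage of the mean of both sides.

    Assumed sign convention: negative when the left side is smaller.
    """
    mean_volume = (left_volume + right_volume) / 2
    return 100.0 * (left_volume - right_volume) / mean_volume


def interpret_asymmetry_percentile(percentile):
    """Translate a normative asymmetry percentile (1-100) into words."""
    if not 1 <= percentile <= 100:
        raise ValueError("percentile must lie between 1 and 100")
    if percentile < 50:
        return "left smaller than right"
    if percentile > 50:
        return "right smaller than left"
    return "no left-right difference relative to norms"


# Hypothetical volumes: the left structure is smaller than the right.
print(asymmetry_index(3.0, 5.0))
print(interpret_asymmetry_percentile(12))
```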
We analysed right- and left-sided surgeries separately to account for structural and functional inter-hemispheric differences. 14,15

Surgical Lacuna
To evaluate whether the ipsilateral temporal regions included in the models were resected or not, we reviewed all available post-operative MRIs (354 patients, including 144 on the right side and 210 on the left side). Anatomical sub-regions relevant to the model predictive performance were classified as being resected or not (partial resections were included in the resection group).

Statistical analysis

Demographics and clinical data
Statistical analysis was run separately for right- and left-sided surgeries. For each side of surgery, the collected patient information was summarized as the mean and standard deviation for continuous variables, and as counts and percentages for categorical variables.
A two-sample t-test was performed for comparing continuous variables by outcomes, while categorical variables were analysed by the Chi-square test. Fisher's Exact test was used when one or more of the cells had an expected frequency of five or less. The Bonferroni correction procedure was applied to account for multiple comparisons.
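The rule for choosing between the Chi-square and Fisher's Exact tests, and the Bonferroni adjustment, can be sketched as below. This is an illustration only: the contingency tables are hypothetical, the function names are not from the study, and the code stops at selecting the test rather than computing P-values.

```python
def expected_counts(table):
    """Expected cell counts under independence for a contingency table."""
    row_totals = [sum(row) for row in table]
    col_totals = [sum(col) for col in zip(*table)]
    n = sum(row_totals)
    return [[r * c / n for c in col_totals] for r in row_totals]


def choose_test(table):
    """Fisher's Exact test if any expected count is five or less."""
    has_small_cell = any(
        e <= 5 for row in expected_counts(table) for e in row
    )
    return "fisher_exact" if has_small_cell else "chi_square"


def bonferroni_threshold(alpha, n_comparisons):
    """Per-comparison significance level after Bonferroni correction."""
    return alpha / n_comparisons


# Hypothetical 2x2 tables (rows: predictor present/absent,
# columns: seizure-free/not seizure-free).
print(choose_test([[30, 20], [25, 25]]))
print(choose_test([[3, 2], [10, 40]]))
print(bonferroni_threshold(0.05, 10))
```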
We used the complete-case analysis to address missing data.

qMRI: variable selection procedure
The statistical method used is illustrated in Fig. 1. Due to the high number of variables, a selection procedure was performed. For the qMRI analysis, we used the volume of different brain regions measured in percentiles. Variables at P < 0.15 on a two-sample t-test were preselected as potential predictor variables. Correlation analysis of the predictors was conducted to avoid multi-collinearity in a regression model.
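The pre-selection step can be sketched as a two-stage filter: keep variables with P < 0.15, then screen the survivors for strong pairwise correlation. The correlation cut-off of 0.8, the rule of keeping the variable with the smaller P-value, the region values and the P-values below are all assumptions made for illustration; the text does not specify them.

```python
from math import sqrt

P_THRESHOLD = 0.15  # pre-selection level stated in the text
R_THRESHOLD = 0.8   # hypothetical collinearity cut-off (not in the text)


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def preselect(p_values, columns):
    """Keep variables with P < 0.15, dropping highly correlated ones."""
    candidates = sorted(
        (v for v, p in p_values.items() if p < P_THRESHOLD),
        key=lambda v: p_values[v],
    )
    kept = []
    for variable in candidates:
        if all(
            abs(pearson(columns[variable], columns[k])) < R_THRESHOLD
            for k in kept
        ):
            kept.append(variable)
    return kept


# Hypothetical univariate P-values and percentile columns.
p_values = {"entorhinal": 0.02, "parahippocampal": 0.05,
            "pericalcarine": 0.10, "hippocampus": 0.40}
columns = {
    "entorhinal": [10, 20, 30, 40, 50],
    "parahippocampal": [12, 22, 29, 41, 52],
    "pericalcarine": [50, 10, 40, 20, 30],
    "hippocampus": [1, 1, 2, 1, 3],
}
print(preselect(p_values, columns))
```

In this toy example the second variable is dropped because it is almost perfectly correlated with the first, and the last is dropped because its P-value exceeds 0.15.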
Both backward elimination using Akaike's information criterion (AIC) 16 as the selection criterion, and random forest selection methods were performed. For the backward elimination method, we started with a model that included all variables and calculated the AIC value. The AIC is a measure of the relative goodness of fit for a specific set of data, which is used to perform model comparisons. A lower AIC indicates a better model. By removing one variable at a time from the initial model, we created new models with new AIC values. The model with the lowest AIC was selected. The same procedure was then repeated on the newer models until we had a final model with the lowest AIC value. The variables selected in this model were those used for the analysis. Random forest variable selection ranks explanatory (independent) variables using the random forest score of importance (i.e. large values are ranked more important than low values). Random forest computes how much each variable decreases the node impurity (e.g. potential for misclassification). The most important variable is the one that decreases the impurity the most. The final importance of a variable is the average of its impurity decrease across all the trees. Two sets of predictors were thus obtained, one from the backward elimination method and one from the random forest selection method.
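The backward elimination loop described above can be sketched generically. The procedure takes any function that returns the AIC of a model fitted on a given variable subset; here a toy scoring function with made-up log-likelihood gains stands in for a real logistic regression fit, so the variable names and numbers are purely hypothetical.

```python
def aic(log_likelihood, n_parameters):
    """Akaike's information criterion: lower values indicate a better model."""
    return 2 * n_parameters - 2 * log_likelihood


def backward_eliminate(variables, aic_of):
    """Drop one variable at a time for as long as the AIC keeps decreasing."""
    current = list(variables)
    current_aic = aic_of(current)
    while len(current) > 1:
        # All models obtained by removing exactly one variable.
        candidates = [
            [v for v in current if v != dropped] for dropped in current
        ]
        best = min(candidates, key=aic_of)
        best_aic = aic_of(best)
        if best_aic >= current_aic:
            break  # no single removal improves the model further
        current, current_aic = best, best_aic
    return current, current_aic


# Toy stand-in for model fitting: each variable adds a fixed, made-up
# amount to the log-likelihood; one extra parameter for the intercept.
GAINS = {"gtc_seizures": 4.0, "middle_frontal": 3.0,
         "noise_a": 0.2, "noise_b": 0.5}


def toy_aic(variables):
    log_likelihood = -100.0 + sum(GAINS[v] for v in variables)
    return aic(log_likelihood, len(variables) + 1)


selected, final_aic = backward_eliminate(list(GAINS), toy_aic)
print(selected, final_aic)
```

In practice `aic_of` would refit the regression on each subset; the loop itself does not change.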

Development of models to predict seizure outcome using clinical and qMRI data
To develop predictive models, we created four models applied to the right- and left-sided surgeries with outcomes being either seizure free or Engel I at last follow-up (Fig. 1).
• Model 1: we used the backward method to pre-select the variables and performed a logistic regression.
• Model 2: we used random forest to pre-select the variables and performed a logistic regression.
• Model 3: we pre-selected variables using the backward method and performed a random forest regression.
• Model 4: we pre-selected variables using a random forest selection method and performed a random forest regression.
Compared to the logistic regression, random forest regression does not assume the model has a linear relationship, and it utilizes ensemble learning. Random forest regression takes random samples, forms many decision trees, and then averages out the leaf nodes to get a more precise model. We selected these two modelling methodologies because logistic regression is the most classical prediction model methodology (a benchmark method) and random forest modelling has been shown to be one of the most reliable machine learning approaches outperforming other machine learning methods 17 such as support vector machines [18][19][20] when the sample size is moderate or small.
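Both the random forest selection step and the random forest models rest on decision trees that split the data to reduce node impurity. The sketch below illustrates that quantity using the Gini index, which is one common impurity measure; the text does not say which measure was used, so this choice, the function names and the toy labels are assumptions. In a full forest, a variable's contribution within a tree would be accumulated over every node that splits on it before averaging across trees.

```python
def gini(labels):
    """Gini impurity of a set of class labels (0 = pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def impurity_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of children."""
    n = len(parent)
    weighted_children = (
        len(left) / n * gini(left) + len(right) / n * gini(right)
    )
    return gini(parent) - weighted_children


def mean_importance(per_tree_decreases):
    """Average a variable's impurity decrease across trees."""
    return sum(per_tree_decreases) / len(per_tree_decreases)


# Toy node: 1 = seizure-free, 0 = seizure recurrence (hypothetical labels).
parent = [1, 1, 0, 0]
print(impurity_decrease(parent, [1, 1], [0, 0]))  # split separates classes
print(impurity_decrease(parent, [1, 0], [1, 0]))  # split separates nothing
print(mean_importance([0.5, 0.0, 0.25]))
```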
The concordance index of each model was calculated. The concordance index is used to compare the goodness of fit of logistic regression models. 3

Development of models to predict seizure outcome using clinical data only
To evaluate the impact of qMRI on the model's performance, we also created models including only clinical predictors applied to the right- and left-sided surgeries with outcomes being either seizure free or Engel I at last follow-up. We used logistic regression and random forest regression to create the models and the concordance index to evaluate performance.
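For a binary outcome, the concordance index is the proportion of (event, non-event) patient pairs in which the patient with the event received the higher predicted probability, with ties counted as one half. A minimal sketch with hypothetical predictions follows; the function name is illustrative and this is not the code used in the study.

```python
def concordance_index(outcomes, predicted):
    """C-index for binary outcomes (1 = event, 0 = no event)."""
    concordant = 0.0
    pairs = 0
    for y_i, p_i in zip(outcomes, predicted):
        for y_j, p_j in zip(outcomes, predicted):
            if y_i == 1 and y_j == 0:
                pairs += 1
                if p_i > p_j:
                    concordant += 1.0
                elif p_i == p_j:
                    concordant += 0.5
    if pairs == 0:
        raise ValueError("need at least one event and one non-event")
    return concordant / pairs


# Hypothetical example: two seizure-free patients, two with recurrence.
outcomes = [1, 1, 0, 0]
predicted = [0.9, 0.4, 0.6, 0.2]
print(concordance_index(outcomes, predicted))  # prints: 0.75
```

A value of 0.5 corresponds to chance-level ranking and 1.0 to perfect discrimination.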

Testing dataset
The entire patient dataset served as both development and validation cohorts. To adjust the concordance index, the whole cohort was randomly split into training and testing sets, with 70% used for training and 30% for testing. Using the training dataset, we fitted the original regression to model the outcome as a function of the selected predictors. The model's performance was assessed on the testing dataset. This process was repeated 100 times with different random seeds, and the mean and 95% confidence limits of the area under the curve were calculated.
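The repeated 70/30 splitting can be sketched as below. Several parts are assumptions made for illustration: the toy one-variable "model" replaces the regression, the data are synthetic, and the 95% limits are taken as approximate 2.5th and 97.5th percentiles of the 100 test-set values, since the text does not say how the limits were derived.

```python
import random


def c_index(outcomes, scores):
    """Share of (event, non-event) pairs ranked correctly; ties count half."""
    events = [s for y, s in zip(outcomes, scores) if y == 1]
    non_events = [s for y, s in zip(outcomes, scores) if y == 0]
    wins = sum((e > m) + 0.5 * (e == m) for e in events for m in non_events)
    return wins / (len(events) * len(non_events))


def fit_direction(train):
    """Toy model (hypothetical): learn whether larger x favours the outcome."""
    pos = [x for x, y in train if y == 1]
    neg = [x for x, y in train if y == 0]
    if not pos or not neg:
        return 1.0
    return 1.0 if sum(pos) / len(pos) >= sum(neg) / len(neg) else -1.0


def repeated_split(rows, n_repeats=100, train_fraction=0.7):
    """Mean and approximate 95% limits of the test-set C-index."""
    results = []
    for seed in range(n_repeats):
        shuffled = list(rows)
        random.Random(seed).shuffle(shuffled)  # a different seed per repeat
        cut = round(train_fraction * len(shuffled))
        train, test = shuffled[:cut], shuffled[cut:]
        direction = fit_direction(train)
        outcomes = [y for _, y in test]
        if len(set(outcomes)) < 2:
            continue  # C-index is undefined without both outcomes
        results.append(c_index(outcomes, [direction * x for x, _ in test]))
    results.sort()
    n = len(results)
    mean = sum(results) / n
    return mean, results[int(0.025 * (n - 1))], results[int(0.975 * (n - 1))]


# Synthetic data: 60 rows of (predictor value, outcome).
rows = [((i // 2) % 10 + 4 * (i % 2), i % 2) for i in range(60)]
mean_c, lower, upper = repeated_split(rows)
print(f"mean C-index {mean_c:.2f} (95% limits {lower:.2f} to {upper:.2f})")
```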
We opted for the bootstrap method instead of the traditional k-fold method to account for the heterogeneity of the different cohorts regarding type of surgery and epilepsy aetiology.

Final model
The concordance index calculated for the testing dataset was used as a measure of the predictive accuracy of the model. The respective C-indices were compared to each other, and the model with the highest concordance index was selected as the final model. All analyses were performed using RStudio software. The level of statistical significance was set at P < 0.05 (two-tailed). To describe the cohort, we used the median (interquartile range) for numeric variables, and counts (%) for categorical data. Kruskal-Wallis and Fisher's Exact tests were used to test for univariate associations of numeric and categorical variables with the treatment, respectively. Lastly, we analysed whether the presence of one or more abnormalities in the extra-temporal regions of interest identified by the final model correlated with seizure outcomes (we defined abnormality as a volume below the 5th percentile of the normative population 21,22 ).
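The extra-temporal abnormality analysis can be sketched as a simple flag followed by a comparison of seizure-free proportions. The sketch reads the 5% cut-off as the 5th normative percentile, which is an interpretation rather than something the text spells out; the region names, patient records and function names are hypothetical.

```python
ABNORMAL_PERCENTILE = 5  # assumed reading of the 5% cut-off in the text


def has_abnormality(percentiles, regions):
    """True if any listed region falls below the abnormality cut-off."""
    return any(percentiles[r] < ABNORMAL_PERCENTILE for r in regions)


def seizure_free_rate(patients, regions, abnormal):
    """Proportion seizure-free among patients with or without abnormality."""
    group = [
        p for p in patients
        if has_abnormality(p["percentiles"], regions) == abnormal
    ]
    return sum(p["seizure_free"] for p in group) / len(group)


# Hypothetical extra-temporal regions and patient records.
regions = ["middle_frontal", "pars_orbitalis"]
patients = [
    {"percentiles": {"middle_frontal": 3, "pars_orbitalis": 40}, "seizure_free": 0},
    {"percentiles": {"middle_frontal": 2, "pars_orbitalis": 4}, "seizure_free": 1},
    {"percentiles": {"middle_frontal": 55, "pars_orbitalis": 60}, "seizure_free": 1},
    {"percentiles": {"middle_frontal": 30, "pars_orbitalis": 12}, "seizure_free": 1},
]
print(seizure_free_rate(patients, regions, abnormal=True))
print(seizure_free_rate(patients, regions, abnormal=False))
```

The two proportions would then be compared with a test for categorical data, such as the Fisher's Exact test named above.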

Graphic visualization
For the graphic visualization of these results, the significant regions of interest included in the final models were identified on a 3D MRI brain atlas, Neuromorphometrics (http://www.neuromorphometrics.com/), and the t-values were displayed using a colour scale to highlight the selected areas, with cool colours representing negative t-values and hot colours positive t-values.

Standard protocol approvals, registrations and patient consent
The Cleveland Clinic Institutional Review Board approved this study and waived the requirement for individual informed consent. All data from participating sites (Mayo Clinic, USA and University of Campinas, Brazil) were de-identified of all patient health information.

Data availability statement
The data that support the findings of this study are available from the corresponding author, LJ, upon reasonable request.

Patient characteristics
A total of 435 patients were included in this cohort. Median follow-up time post-surgery was 34 months (25th/75th, 17/60) with a maximum follow-up of 116 months. Tables 1 and 2 display summary statistics for seizure freedom and Engel I outcomes at the last follow-up, respectively. The initially investigated variables are displayed in Tables 1 and 2.

Comparing different models
We used different techniques to create models to predict the probability of being seizure-free and the probability of an Engel I at last follow-up according to the side of surgery. Models including only clinical variables were also created for comparison. Table 3 displays the c-indices of these different models. After comparing the respective c-indices to each other, we selected Model 1 as our final model: The logistic regression model, including clinical variables using the backward elimination method as a selection procedure (adjusted concordance index) ( Table 3).
When we evaluated models with 'clinical predictors only', we chose those created using a logistic regression as the final model based on the overall c-indices.
The 95% confidence intervals displayed in Table 3 demonstrate that the adjusted c-indices from the final models with and without qMRI data (highlighted in grey) were significantly different.
The variables identified by the final models as outcome predictors are displayed in Tables 4 and 5. The graphic visualization of these results is shown in Fig. 2.

Predictors of surgical outcome in RIGHT-sided surgeries
Complete seizure freedom versus seizure recurrence
In patients with right-sided temporal lobe resection, smaller cortical volumes in the ipsilateral transverse temporal (P = 0.021), entorhinal (P = 0.021) and pericalcarine cortices (P = 0.029), and larger volumes in the contralateral parietal region (P = 0.019) were associated with failure of postoperative seizure freedom (Fig. 2A).
When evaluating asymmetry findings, the group with recurrent seizures had smaller volumes in the contralateral nucleus accumbens (P = 0.004) and paracentral region (P = 0.04) (Table 4).

Predictors of the surgical outcome in LEFT-sided surgeries
The areas included in the model were different when surgery was performed on the left side.

Complete seizure freedom versus seizure recurrence
In patients with left-sided temporal lobe resection, history of generalized tonic-clonic seizures (P = 0.011), smaller cortical volumes in the contralateral middle frontal region (P = 0.033) and the degree of asymmetry in the middle frontal region (P = 0.033), with the contralateral side being smaller, predicted seizure recurrence (Fig. 2C).

Engel I seizure outcome versus Engel II-IV
In the left-sided surgery group, predictors of Engel II-IV were smaller volumes of the contralateral pars orbitalis (P = 0.016) (Fig. 2D), asymmetry of the middle frontal region, with the side contralateral to the surgery smaller (P = 0.004), asymmetry of the occipital lobe (P = 0.023), inferior parietal region (P = 0.042) and cerebral white matter hypointensities (P = 0.046), with the ipsilateral side being smaller (Table 5).

Surgical Lacuna
In the left-sided surgery group, the ipsilateral transverse temporal gyrus was the only temporal region included in the model to predict Engel I outcome. Only 9/210 patients (4.3%) had that structure resected or partially resected. In the right-sided surgery group, the percentage of patients who had the following ipsilateral temporal regions resected is as follows: We then evaluated whether the resection of these areas was associated with surgical outcome; none was (Table 6).

Extra-temporal volume abnormalities
56.6% of patients with one or more abnormal regions of interest (bolded in Tables 4 and 5) were seizure-free at last follow-up as opposed to 68.0% of those with no abnormalities in any of these extra-temporal regions (P-value = 0.02).

DISCUSSION
Despite significant advances in the evaluation of patients undergoing pre-surgical evaluation, our ability to predict surgical outcomes remains suboptimal. 3,4 In the present study, we investigated if qMRI measurements could help develop a model to predict seizure outcome after TLE surgery. 5

The model
Since there is no clear consensus on the best statistical method to apply in the development of tools to predict outcomes, we compared models using the combination of two different selection methods (backward and random forest) and two different regressions (logistic regression and random forest). The logistic regression using the backward elimination method as a selection procedure had the highest c-statistics and was selected as the final model ( Table 3). One of the strengths of this study is that we not only compared the importance of the quantitative and clinical variables in predicting surgical outcome but also explored different methods of conducting the statistical analysis.
We studied right- and left-sided surgeries separately to account for structural and functional interhemispheric differences. 14,15 Studies using different neuroimaging techniques demonstrated that left TLE is usually associated with more diffuse and widespread changes compared to right TLE. 6,15,23,24 In our study, volume differences were better predictors of outcome in the right-sided models than in the left-sided models, where asymmetry differences seemed to be better predictors. The 'floor effect' could explain this difference. Since right-sided TLE seems to be a more unilateral disease, the presence of volume differences as predictors stands out as compared to left-sided TLE. Another possible contributor is the younger age at onset of epilepsy in left TLE as compared to right TLE, as demonstrated in a large multicentre study. 6 If the left and right TLE groups were similar, we would expect a mirror effect, with the same structures being identified by the model. The fact that the structures were different reinforces the hypothesis that left and right TLE behave differently. Our results are, therefore, in agreement with the literature suggesting that left- and right-sided TLE should be viewed as aetiologically and pathologically distinct entities with distinct outcome predictors. 14,15,23,25 We should also consider the possibility that the lack of a mirroring effect could be artificial, due to overfitting of the lateralized models or to underpowering, thus failing to detect the same signals in each model. A large sample size is always intuitively desired for classification or regression studies, and a larger sample size theoretically can minimize the empirical risk. Cui and Gong 26 conducted a comprehensive study on sample size effects for building prediction models in neuroimaging studies.
They showed greater improvements in the accuracy and stability of the prediction when the sample size is increased from an initially small sample size, whereas smaller improvements are observed when the sample size is increased from an initially large sample size. According to their findings, the average accuracy and stability of the prediction appear to plateau at sample sizes of 200-300, regardless of the algorithm. Therefore, a minimum sample size of 200 is recommended for machine learning regression prediction. Our study included 435 subjects, and the samples for building each of the sub-models were around 200. Even though we always desire larger sample sizes, compared to prior studies, we describe a cohort with a relatively reasonable size. Our sample may not fully represent the entire spectrum of the population, therefore limiting the generalizability of the predicted results to certain independent sample sets.
By adding the qMRI data to the model, we were able to increase the c-statistics further from 0.58 to 0.70 (right-sided surgery) and from 0.61 to 0.66 (left-sided surgery) for complete seizure freedom. For Engel I score prediction, the C-statistics increased from 0.62 to 0.67 (right-sided surgery) and from 0.68 to 0.73 (left-sided surgery). The models created in this study using clinical predictors only had similar c-statistics values compared to our previously published nomogram (the C-statistics for complete seizure freedom was 0.6, and for Engel I score 0.61). Even though the increase in the C-index was statistically significant, assessing the clinical significance of this enhancement is a nuanced exercise. Similar studies in other fields considered comparable c-statistic enhancements as an improvement in the model's performance. In lung cancer research, Mayo Clinic's Solitary Pulmonary Nodule Malignancy Risk model was the well-known benchmark model for lung cancer prediction. Reid et al. developed improved models to help characterize Pulmonary Nodules considered high enough risk by a clinician to recommend a biopsy. In an independent sample used for validation, c-index for Reid's model was 0.67 compared with 0.63 for the Mayo Clinic model, and is preferred as offering improved clinical utility. 27 We believe any enhancement in the model's performance is important, especially considering the resources needed to generate this increase in the c-statistic in our study are relatively minor: Neuroquant is commercially available, Food and Drug Administration approved, and user friendly, so implementing it does not necessitate the typical major investments required to build a research imaging program within an epilepsy surgery centre. 
Given the availability of clinical software packages for automated volume segmentation and measurements, our study now provides a new tool that could be incorporated in routine clinical practice to enhance surgical outcome prediction.

Figure 2 The heatmap represents the t statistics* of the univariate analysis and displays the structures that were identified by the model as outcome predictors. (A) Failure of postoperative seizure control for right-sided temporal lobe surgeries associated with smaller volumes of the transverse temporal, pericalcarine and entorhinal cortex in the right (ipsilateral) hemisphere, and with larger volumes of medial parietal in the left (contralateral) hemisphere. Seizure recurrence also associated with asymmetry of nucleus accumbens and paracentral regions, with the left side (contralateral hemisphere) smaller than the right. (B) Failure of postoperative seizure control for left-sided temporal lobe surgeries associated with smaller volumes of the right middle frontal region in the right (contralateral) hemisphere, and with asymmetry of middle frontal gyri, with the right side (contralateral hemisphere) smaller than the left. (C) Worse outcomes (Engel II-IV) for right-sided temporal lobe surgeries associated with smaller volumes of the entorhinal cortex in the right (ipsilateral) hemisphere, and with smaller volumes of the pericalcarine cortex and with larger volumes of the primary motor cortex in the left (contralateral) hemisphere. (D) Worse outcomes (Engel II-IV) for left-sided temporal lobe surgeries associated with smaller volumes of the pars orbitalis in the right (contralateral) hemisphere and asymmetry of the occipital lobe, and inferior parietal region with the left side (ipsilateral) hemisphere smaller than the right and asymmetry of middle frontal region with the right side (contralateral hemisphere) smaller than the left. * t-statistics represented in colour bars: cool colours negative t-values and hot colours positive t-values.

Regions of interest
Volumetric differences located predominantly in the ipsi- and contra-lateral fronto-central and temporal regions were associated with worse outcomes. The comparison between regions included in our model with those described in the literature revealed some interesting similarities. The Enhancing Neuro Imaging Genetics through Meta-Analysis-epilepsy study compared patients with epilepsy and controls, looking for areas with reduced volume 6 using FreeSurfer. Even though NeuroQuant is a different tool, its analysis procedure is similar to the one performed by FreeSurfer, with comparable results. 28 Even though the Enhancing Neuro Imaging Genetics through Meta-Analysis-epilepsy study aimed to look for brain regions related to epilepsy regardless of seizure outcome, in general, many areas identified by our model as outcome predictors overlap with the ones reported by Enhancing Neuro Imaging Genetics through Meta-Analysis-epilepsy. The areas also overlap with other studies that reported progressive atrophy in the ipsilateral temporopolar and central regions and contralateral orbitofrontal, insular, and angular regions in TLE. 29 A more rapid progression of atrophy was seen in the frontocentral and parietal regions in patients with longer duration of disease. 29 Studies using similar morphometric techniques, including voxel-based morphometry, surface-shape analysis and cortical thickness, also reported an association between surgical outcome and morphometric changes in extrahippocampal structures 8 like the entorhinal cortex, 30 temporopolar and insular cortices, 11 parahippocampal region, 31 thalamotemporal structure 32 and whole-brain extrahippocampal structures. 10,33 The fact that many regions identified by our model as outcome predictors overlap with regions described in previous studies argues against the assumption that these areas might have been selected by chance.
Our findings reinforce the involvement of these areas in the pathophysiology of TLE, explaining their relevance in predicting surgical outcome. Resections of the ipsilateral temporal lobe structures identified by this model were not associated with seizure outcome. A limitation of this sub-analysis was the uneven distribution of cases, with some of the evaluated structures being routinely removed and others rarely resected. For example, some structures, like the transverse gyrus, are rarely resected, making it difficult to analyse the role of this region in surgical outcomes. This limitation might have influenced our results, explaining why, even though these variables were predictive of outcome in the model, they were not associated with overall seizure freedom in the surgical lacuna analysis. Future studies focusing on the exact extent of the resection are needed to better address this relationship. Because we do not acquire post-operative MRIs routinely, many patients had missing data. This limitation might have led to a selection bias that needs to be considered when evaluating the results.
One unexpected result was the fact that the ipsilateral hippocampal volume had no predictive value for temporal lobe surgery outcome. One explanation could be the floor effect: since the ipsilateral hippocampus was already atrophic, the range of volume variability was limited, restricting our ability to differentiate subgroups. If we analyse the subgroup of patients with hippocampal sclerosis, the median hippocampal volume percentile (which could range from 1 to 100) was 1 (interquartile range 1-11) in the left-sided surgery group and 1 (interquartile range 1-12) in the right-sided group, confirming the floor effect. Even though unexpected, this finding is in agreement with other studies that could not find an association between hippocampal volume and surgical outcome. 8 Furthermore, because the qMRI data were evaluated in conjunction with clinical data, we hypothesize that the presence of hippocampal sclerosis as a clinical variable might have removed the additional information that hippocampal volumes might have provided.

Epilepsy network as an outcome predictor
We tried to delineate the weight of each brain region included in the model in the definition of surgical results. However, we could not find a linear relationship. Instead, we found a more widespread pattern of atrophy, including areas that are likely important for outcome prediction, independent of the reduced cortical volumes being ipsi- or contra-lateral to the surgical site. The broad distribution of these regions of interest within and outside of the temporal lobe supports the current notion that focal epilepsies are in fact focal network disorders 7,23,34-36 and the notion that the presence of damage and dysfunction outside of the surgical focus is relevant for the outcome. 6,29 Volumetric differences in regions outside the surgical site probably mirror changes in the epileptic network and could work, for that reason, as a biomarker for surgical outcome. 37 The correlation between the regions selected by the model and seizure outcome showed that patients with one or more abnormal regions had worse outcomes, reinforcing the hypothesis that the more abnormal the network, the worse the outcome. Further studies will be necessary to define the exact functional connection between these areas.

Clinical translation
Different techniques have been applied to several neuroimaging applications to improve the accuracy of predictive models. 38 However, the translation from research findings into clinical care still requires improvement. We chose NeuroQuant as a tool for MRI volumetry because it is Food and Drug Administration approved, clinically available, and practical. It performs an automated analysis, has a user-friendly interface, and promptly provides the volumetric results. Another benefit of NeuroQuant is that it accepts high-resolution 3D images acquired using different MRI protocols. One limitation of this technique is that the software is not free of charge. The novelty of this work is that this preliminary model moves us one step closer to translating research findings into a clinically meaningful and practical tool that could be used by any institution, and not only by dedicated research centres. Although our findings need further confirmation, our results are robust: the data come from three different epilepsy centres, and our findings are consistent with previously published results. Our results also emphasize the need for statistical models to account for the complex relationship between quantitative measurement and surgical outcomes, and encourage the use of qMRI measurement to enhance our ability to predict surgical outcomes.

Study limitations
The initial inclusion of most regions of interest provided by NeuroQuant is both a limitation and an advantage.
The limitation is that we included, at first, a large number of regions of interest. To overcome this issue, we pre-selected the most important predictive variables before constructing the model. By doing so, we avoided a selection bias.
The creation of an online risk calculator would facilitate the introduction of our findings into clinical practice. However, the development of clinically useful calculators, based on models with multiple and complex variables, is challenging. Our results show that quantitative data can be used to improve the prediction of surgical outcome; however, new studies focused on automatic import of big data, visualization tools and innovative prediction methods are necessary to improve the usefulness of this tool. Further improvements, such as additional external validation and application to diverse epileptic pathologies, are needed before this tool is ready for general clinical deployment.
Even though we are comparing this model's performance to the previously published nomogram's, 5 it is important to highlight that our study included only patients with temporal lobe resection, while the nomogram also included extra-temporal cases (32% of the development cohort).
Since we use regression models with a multitude of features, there is a potential for overfitting when calculating the full model c-index. To overcome this issue, we use the adjusted c-index.
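As an illustration of the overfitting concern, the sketch below computes the C-statistic for a binary outcome (such as seizure freedom) and then adjusts it with a bootstrap optimism correction. The text does not state how the adjusted c-index was obtained, so the bootstrap procedure shown here is an assumption, chosen because it is a widely used approach. The function names `c_statistic` and `optimism_corrected_c` and the `fit` and `predict` callables are hypothetical placeholders for whatever regression model is used.

```python
import random


def c_statistic(y_true, y_prob):
    """Concordance statistic for a binary outcome.

    Fraction of (event, non-event) pairs in which the event case received
    the higher predicted probability; tied predictions count as 0.5.
    """
    events = [p for y, p in zip(y_true, y_prob) if y == 1]
    non_events = [p for y, p in zip(y_true, y_prob) if y == 0]
    if not events or not non_events:
        raise ValueError("need at least one event and one non-event")
    score = 0.0
    for p_event in events:
        for p_non in non_events:
            if p_event > p_non:
                score += 1.0
            elif p_event == p_non:
                score += 0.5
    return score / (len(events) * len(non_events))


def optimism_corrected_c(X, y, fit, predict, n_boot=200, seed=0):
    """Bootstrap optimism-corrected C-statistic (assumed adjustment method).

    fit(X, y) returns a model; predict(model, X) returns probabilities.
    Both callables are hypothetical stand-ins for the actual model.
    """
    rng = random.Random(seed)
    n = len(y)
    # Apparent performance: model fitted and evaluated on the same data.
    apparent = c_statistic(y, predict(fit(X, y), X))
    optimism = []
    while len(optimism) < n_boot:
        idx = [rng.randrange(n) for _ in range(n)]
        y_boot = [y[i] for i in idx]
        if len(set(y_boot)) < 2:
            continue  # resample lacks one outcome class; draw again
        X_boot = [X[i] for i in idx]
        model = fit(X_boot, y_boot)
        c_boot = c_statistic(y_boot, predict(model, X_boot))
        c_orig = c_statistic(y, predict(model, X))
        # Optimism: how much better the model looks on its own training data.
        optimism.append(c_boot - c_orig)
    return apparent - sum(optimism) / len(optimism)
```

A model with many candidate features tends to show a larger gap between its bootstrap-sample and original-sample C-statistics, so the corrected value falls further below the apparent one as the risk of overfitting grows.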

Conclusion
Our study demonstrates the prognostic value of qMRI in the context of brain surgery for drug-resistant TLE and provides a path for translation of sophisticated epilepsy imaging research into clinical application. The MRI volumetric measurements likely represent an intermediate quantitative trait that influences the surgical result. It is likely not any individual variable per se that defines the outcome, but the interplay among them. The volume of each region of interest may be a deconstruction of the endophenotype MRI, 39 meaning that even though we assessed the areas selected by the model, the individual regions of interest should be interpreted with caution. Regardless of the reason, the combination of variables enhanced the performance of the model compared to our previous nomogram. 5 The models presented here provide an individualized prediction of surgical outcome with potentially clinically meaningful use.