Jasper's Basic Mechanisms of the Epilepsies (5 edn)
Disclaimer
Oxford University Press makes no representation, express or implied, that the drug dosages in this book are correct. Readers must therefore always check the product information and clinical procedures with the most up to date published product information and data sheets provided by the manufacturers and the most recent codes of conduct and safety regulations. The authors and the publishers do not accept responsibility or legal liability for any errors in the text or for the misuse or misapplication of material in this work. Except where otherwise stated, drug dosages and recommendations are for the non-pregnant adult who is not breastfeeding.

Globally, an estimated 2.4 million people are diagnosed with epilepsy each year. Thus, a new person is diagnosed with epilepsy every 13 seconds (WHO, 2019). In 60% of those affected, epileptogenesis is initiated by structural causes such as traumatic brain injury (TBI; Hauser et al., 1993; Scheffer et al., 2017). Over 10 hypothesis-driven monotherapy approaches have demonstrated some disease-modifying effects in animal models of posttraumatic epileptogenesis (Dulla and Pitkänen, 2021). Currently, however, no clinical treatments are available to stop or alleviate epileptogenesis in at-risk patients after TBI or to alleviate the course of posttraumatic epilepsy (PTE) after its diagnosis. One major reason for the stalled progression of compounds showing proof-of-concept evidence in animal models to clinical antiepileptogenesis trials is the lack of prognostic biomarkers for epileptogenesis. Such biomarkers could be used to stratify patient populations for antiepileptogenesis trials and reduce study costs, making sufficiently powered clinical trials affordable (Engel et al., 2013; Pitkänen et al., 2018).

Our previous work using univariate statistical analyses indicated that behavioral tests or assessments of cortical lesion volume performed poorly in separating rats with and without PTE (Manninen et al., 2020; Lapinlampi et al., 2021). Conventional statistical analyses emphasize explanation of the phenomenon under study, for example, by null hypothesis testing for a presumed univariate difference between groups. Conversely, machine learning (ML) excels at prediction, deriving complex multivariate and possibly nonlinear relationships between patterns in data and outcomes of interest. The relationships are encapsulated in the form of a model, for which the defining parameters are learned from the data without domain expertise built into the model. The ability to derive predictive models without human bias allows screening of data for the presence of patterns that can be further refined into biomarkers, by assessing the performance of models learned from the data. Once learned and validated, ML models can be reverse engineered to gain insights into how variables in the data relate to PTE. The variables relevant to classification can be further refined in downstream analyses to transform them into a human-interpretable formula or encapsulated directly in an ML model.

We utilized ML to assess whether animals with epilepsy after lateral fluid-percussion-induced brain injury (TBI+) can be separated from animals without epilepsy (TBI–), using animal weight during the follow-up, TBI induction parameters, behavioral measurements, and cortical lesion volumes extracted from magnetic resonance imaging (MRI). We applied the same approach to separate animals with TBI from naïve and sham-operated experimental controls. Because the large cohorts of animals studied were generated by inducing TBI in eight consecutive subcohorts over a 3-year period, we also studied the degree of inter-cohort differences. Intuitively, if samples can be reliably assigned to their cohorts based on feature values, marked inter-cohort differences are present. Specifically, we address the following questions: (1) Can naïve/sham animals be separated from TBI animals, and (2) can TBI+ animals be separated from TBI– animals, based on given unimodal or multimodal feature sets? (3) Which assay features differentiate the groups? (4) Will a linear combination of measurement variables suffice for PTE classification?

The study outline is summarized in Figure 40–1A. The study population was generated in eight successive subcohorts of 18 to 22 rats per cohort over a period of 3 years. Rats were randomized into the naïve, sham-operated experimental control, or TBI groups. A total of 150 rats completed the 6-month follow-up and were included in the analysis. Of these, 13 were naïve animals, 23 were sham-operated experimental controls, and 114 were rats with lateral fluid-percussion injury (FPI)-induced TBI. Of the 114 TBI animals, 29 had unprovoked electrographic seizures during the sixth month of video-EEG monitoring and were diagnosed with epilepsy (TBI+). Eighty-five of the 114 rats with TBI did not show any seizures (TBI–). Of the TBI+ rats, 7 had 1 unprovoked seizure during the sixth postinjury month, 2 had 2 seizures, and 20 had ≥3 seizures. Details of the procedures used for induction of lateral FPI, behavioral assessment, neuroscore, elevated plus maze (EPM), open field (OF), Morris water maze (MWM), sucrose preference (SP), forced swimming (FS), and video-EEG monitoring were described by Lapinlampi et al. (2021). Cortical lesion volume was assessed as described by Manninen et al. (2020).

Figure 40–1. A. The study population was generated in eight successive subcohorts of 18 to 22 rats per cohort over a period of 3 years. Rats were randomized into the naïve, sham-operated experimental control, or TBI groups. A total of 150 rats completed the 6-month follow-up and were included in the analysis. Uniform Manifold Approximation and Projection (UMAP) for the (B–E) combination of all modalities and (G) neuroscore. B. Naïve, sham-operated, and TBI groups were clearly separated from each other in the feature space. C–E. Division of TBI animals into four separate groups is evident from the visualization. D. A complete homogeneity of one of the clusters was reached when the threshold of seizure frequency/month was increased to ≥2 (i.e., only the rats with more severe epilepsy were included in the TBI+ group). E. The number of TBI+ animals in the bottom right cluster decreases by two, but the cluster still consists of a mixture of TBI+ and TBI– animals. F. No clustering of the eight subcohorts of animals was evident from the cohort-wise UMAP. G. Neuroscore separated animals in the TBI group from those in the naïve and sham-operated groups well, with the exception of two naïve animals.

We considered measurements from several different assays as modalities: body weight, neuroscore, EPM, OF, MWM, SP, FS, MRI lesion volume, and variables related to TBI induction (impact pressure, apnea time, duration of immediate postimpact seizure-like behavior). Each modality was composed of features corresponding to assay measurements (e.g., animal weight on day 1). Furthermore, two multimodal feature sets were generated by combining all available modalities (termed “All”) and all behavioral assays (termed “Behavior”). The modalities and the features they contained are listed in Table 40–1. The total number of features per modality was 440 for All, 316 for Behavior, 54 for MRI, 50 for the EPM test, 24 for the OF test, 14 for the MWM test, 14 for the FS test, 206 for the neuroscore test, 8 for the SP test, 67 for body weight, and 3 for TBI induction parameters. Missing measurements were linearly imputed for animal weight. For other modalities, missing values were denoted with values outside the range of feature values attainable in the experiment. No imputation was performed during feature-wise univariate statistical tests; instead, samples with missing values for the tested feature were excluded. The heading degrees from MWM were split into two features representing the sine and cosine of the heading angle. For modalities for which time series of measurements were available, engineered temporal features were produced by calculating the time series mean, median, standard deviation, difference between subsequent measurements, and the slope of a linear model fitted through the measurements. Three different datasets for TBI– versus TBI+ classification were generated by varying the threshold between TBI– and TBI+ from 1 to 3 seizures. For the remainder of the chapter, these datasets are referred to as TBI1±, TBI2±, and TBI3±. In the TBI1± dataset, the TBI– class consisted of animals without observed seizures and the TBI+ class consisted of animals with at least one observed seizure. In the TBI2± (TBI3±) dataset, the TBI– class consisted of animals with fewer than 2 (3) observed seizures and the TBI+ class consisted of animals with at least 2 (3) observed seizures. Thus, the dataset for naïve/sham versus TBI classification contained 150 samples (13 naïve, 23 sham, 114 TBI), and the datasets for TBI– versus TBI+ classification contained 114 samples (29 TBI1+, 85 TBI1–; 22 TBI2+, 92 TBI2–; 20 TBI3+, 94 TBI3–).
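The feature engineering steps described above can be sketched in a few lines. This is a minimal illustration rather than the authors' code: the function names (`heading_to_sin_cos`, `temporal_features`, `label_tbi_plus`) are hypothetical, and the use of the sample standard deviation is an assumption, since the chapter does not state which estimator was used.

```python
import math
from statistics import mean, median, stdev


def heading_to_sin_cos(heading_deg):
    """Encode a heading angle (degrees) as (sine, cosine).

    The two-feature encoding keeps angles such as 359 and 1 degrees close
    to each other, which a raw degree value would not.
    """
    rad = math.radians(heading_deg)
    return math.sin(rad), math.cos(rad)


def temporal_features(days, values):
    """Summary features for one animal's time series (e.g., body weight)."""
    # Differences between subsequent measurements
    diffs = [b - a for a, b in zip(values, values[1:])]
    # Ordinary least-squares slope of value against measurement day
    d_mean, v_mean = mean(days), mean(values)
    sxx = sum((d - d_mean) ** 2 for d in days)
    sxy = sum((d - d_mean) * (v - v_mean) for d, v in zip(days, values))
    return {
        "mean": v_mean,
        "median": median(values),
        "std": stdev(values),  # sample SD (assumption)
        "diffs": diffs,
        "slope": sxy / sxx,
    }


def label_tbi_plus(seizure_count, threshold):
    """Return 1 (TBI+) if the animal had at least `threshold` seizures, else 0."""
    return int(seizure_count >= threshold)
```

With this labeling, an animal with two observed seizures is TBI+ in the TBI1± and TBI2± datasets but TBI– in TBI3±.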

Table 40–1
The List of Modalities, Measurements (Features) Included in Each Modality, and Time Points on Which the Measurements Were Made
| Modality | Features | Time Points |
|---|---|---|
| EPM | Total distance; Center duration; Latency to first center entry; South duration; South frequency; Latency to first south entry; East duration; East frequency; Latency to first east entry; West duration; West frequency; Latency to first west entry; North duration; North frequency; North latency to first; Open arms duration; Open arms frequency; Latency to first open arms entry; Closed arms duration; Closed arms frequency; Closed arms latency to first; Mean velocity; Total entries; Open arms entries proportion | D28, D126 |
| FST | Climbing duration; Climbing frequency; Swimming duration; Swimming frequency; Immobility duration; Immobility frequency; Immobility latency | D42, D132 |
| MRI | Lesion size on threshold 1; Lesion size on threshold 2; Lesion size on threshold 3; Lesion size on threshold 4; Lesion size on threshold 5; Total lesion size | D2, D7, D21 |
| MWM | Total distance moved; Mean distance traveled to platform; Mean heading; Duration in platform zone; Entry frequency to platform zone; Latency to first entry to platform zone; Duration in zone 1; Duration in northeast zone; Entry frequency to northeast zone; Latency to first entry to northeast zone; Duration in southeast zone; Duration in southwest zone; Duration in northwest zone; Rotation frequency; Rotation frequency 2; Mean velocity | D41 |
| Neuroscore | Left contraflexion; Right contraflexion; Left hindlimb flexion; Right hindlimb flexion; Left lateral pulsion; Right lateral pulsion; Angleboard left; Angleboard right; Angleboard score; Total score left; Total score right; Neuroscore | D0, D2, D6, D14 |
| OF | Total distance; Duration in outer zone; Outer zone entry frequency; Latency to first outer zone entry; Duration in middle zone; Middle zone entry frequency; Latency to first middle zone entry; Duration in inner zone; Inner zone entry frequency; Latency to first inner zone entry; Mean velocity; Total entries | D29, D127 |
| Sucrose consumption | Sugar consumption; Water consumption; Total consumption | D40, D140 |
| TBI | Hit pressure; Apnea time; Seizure duration | D0 |
| Lesion size | Total lesion size | - |
| Weight | Animal weight | D-1, D0, D1, D2, D3, D4, D5, D6, D7, D14, D21, D28, D29, D32, D35, D42, D49, D56, D77, D84, D112, D119, D126, D127, D130, D140, D141, D142, D143, D147, D168, D183 |

Abbreviations: EPM, elevated plus maze; FS, forced swimming; MRI, magnetic resonance imaging; MWM, Morris water maze; OF, open field; TBI, traumatic brain injury. DX denotes measurements from day X.

To assess the tendency of the data to form clusters, we used dimensionality reduction with uniform manifold approximation and projection (UMAP) to visualize the modalities (McInnes et al., 2020). Univariate differences between the naïve, sham, and TBI groups were assessed using the Kruskal-Wallis test, followed by post hoc multiple comparisons with the Dunn test. A 95% bootstrap confidence interval for the difference between group means was calculated for each feature to gauge the size of the difference. A pairwise Mann-Whitney U test was applied to compare the TBI– and TBI+ groups, with the TBI+ membership criterion varying from ≥1 to ≥3 observed seizures during the follow-up. The Benjamini-Hochberg procedure was used to control the false discovery rate (FDR) in all tests (Benjamini and Hochberg, 1995).
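Two of the steps above, the bootstrap confidence interval for a difference in group means and the Benjamini-Hochberg adjustment, are simple enough to sketch in plain Python. The code below is an illustrative sketch under stated assumptions, not the analysis code used in the chapter: the function names are hypothetical, the percentile bootstrap is one of several possible bootstrap variants, and the number of resamples (2000) is an arbitrary choice.

```python
import random
from statistics import mean


def bootstrap_ci_mean_diff(group_a, group_b, n_boot=2000, seed=0):
    """Percentile 95% bootstrap CI for mean(group_a) - mean(group_b)."""
    rng = random.Random(seed)
    diffs = []
    for _ in range(n_boot):
        # Resample each group with replacement, keeping its original size
        resampled_a = rng.choices(group_a, k=len(group_a))
        resampled_b = rng.choices(group_b, k=len(group_b))
        diffs.append(mean(resampled_a) - mean(resampled_b))
    diffs.sort()
    low = diffs[int(0.025 * n_boot)]
    high = diffs[min(int(0.975 * n_boot), n_boot - 1)]
    return low, high


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted
    # values stay monotone in the rank of the raw p-values
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted
```

A feature would then be reported as significant when its adjusted p-value falls below the chosen FDR level.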

Succinctly, ML solves the optimization problem

$$\hat{f} = \underset{f}{\arg\min} \sum_{i} l\big(y_i, f(\bar{x}_i)\big) + \Omega(f) \qquad (1)$$

where $f$ represents a mapping from an input vector $\bar{x}_i$ to the corresponding target variable $y_i$. The vectors $\bar{x}_i$, referred to as feature vectors, correspond to measurements from a single animal $i$, whereas the values $y_i$ encode class membership. The estimated function is encapsulated in a mathematical structure known as a model. The process of adjusting the model parameters to minimize the mapping error, combined with a regularization term $\Omega(f)$ representing model complexity, is referred to as training. The mapping error is expressed in terms of a loss function $l(y, f(\bar{x}))$, which measures the divergence between the known values $y_i$ and the values $f(\bar{x}_i)$ mapped by the model.

The regularization term $\Omega(f)$ is derived from the model parameters. Regularization alleviates the risk of failing to estimate the actual general relationship between $\bar{x}_i$ and $y_i$ by overemphasizing experiment-specific details present in the training data. This focus on specificities of the training data is referred to as overfitting, and it manifests as a decrease in the model’s ability to correctly assign labels to vectors not utilized in training, that is, to generalize outside the training data to a whole population.

In this work, we utilized k-nearest neighbors (KNN; Cover and Hart, 1967), Random Forests (RF; Breiman, 1984; Breiman, 2001), and regularized logistic regression (LR; Tibshirani, 1996; Hastie et al., 2001) as classification models.
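As an illustration of how a concrete model instantiates the regularized training objective described above, one common form of regularized logistic regression combines the logistic (cross-entropy) loss with an L1 (lasso) penalty. The weight vector $\bar{w}$, the intercept $b$, the penalty strength $\lambda$, and the choice of the L1 penalty itself are assumptions made for this sketch; the chapter does not state which penalty was used.

```latex
% Model: logistic function applied to a linear combination of features
f(\bar{x}) = \sigma\!\left(\bar{w}^{\top}\bar{x} + b\right),
\qquad \sigma(z) = \frac{1}{1 + e^{-z}}

% Loss: cross-entropy between the label y \in \{0, 1\} and the prediction
l\big(y, f(\bar{x})\big) = -\,y \log f(\bar{x}) - (1 - y)\log\big(1 - f(\bar{x})\big)

% Regularization: L1 norm of the weights, scaled by \lambda \ge 0
\Omega(f) = \lambda \lVert \bar{w} \rVert_{1} = \lambda \sum_{j} \lvert w_j \rvert
```

Larger values of $\lambda$ drive more weights to exactly zero, which yields a sparser linear combination of features at the cost of a poorer fit to the training data.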

For model training and evaluation, nested cross-validation (CV) with grid search for hyperparameter optimization was performed using a stratified 10-fold split on both the outer and inner CV levels. During training, all features were standardized to zero mean and unit variance. For each CV fold, the test set was standardized using the mean and standard deviation of the training set to avoid bias from information leakage between the training and test sets. Features with zero variance, that is, features that were constant across samples, were excluded from each fold’s training and test sets.
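The nested CV scheme above can be sketched with scikit-learn, which the chapter names as its ML library. The example is illustrative only: the data are synthetic, the grid is reduced to a single hyperparameter, and 3 folds are used instead of 10 so that the sketch runs quickly. Placing the variance filter and the scaler inside a `Pipeline` means they are fitted on the training portion of each fold only, which is one way to obtain the leakage-free standardization described in the text.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_selection import VarianceThreshold
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import GridSearchCV, StratifiedKFold, cross_val_predict
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the animal-by-feature matrix (not the study data)
X, y = make_classification(
    n_samples=120, n_features=20, n_informative=5,
    weights=[0.75, 0.25], random_state=0,
)

pipeline = Pipeline([
    ("drop_constant", VarianceThreshold()),  # removes zero-variance features
    ("scale", StandardScaler()),             # fitted on training folds only
    ("clf", RandomForestClassifier(random_state=0)),
])
param_grid = {"clf__n_estimators": [25, 50]}  # reduced grid (assumption)

inner_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer_cv = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)

# Inner loop: grid search selects hyperparameters by AUC
search = GridSearchCV(pipeline, param_grid, scoring="roc_auc", cv=inner_cv)

# Outer loop: each sample is predicted by a model that never saw it
proba = cross_val_predict(search, X, y, cv=outer_cv, method="predict_proba")[:, 1]

# Pooled AUC: one AUC computed over all out-of-fold predictions together
pooled_auc = roc_auc_score(y, proba)
print(f"pooled AUC: {pooled_auc:.2f}")
```

Repeating the outer loop with different `random_state` values and averaging the metrics would correspond to the repeated nested CV used to reduce the randomness caused by the small sample size.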

In the inner training loop, a separate feature selection step was performed prior to model training, utilizing recursive feature elimination (RFE) and univariate filtering with the Kolmogorov-Smirnov test (KST). In RFE, the least informative feature in terms of reduction in the Gini index was repeatedly removed until the target number of retained features (set as an RFE hyperparameter) was reached. The number of retained features was jointly optimized with the model hyperparameters. To avoid bias induced by a preselected feature selection method, the complete set of features was passed to the classifier as an alternative to RFE and KST filtering. The best-performing feature combination and model configuration in terms of area under the receiver operating characteristic curve (AUC) in the inner CV was used to evaluate the model. Separate models were trained for naïve/sham versus TBI and TBI– versus TBI+ classification.
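Univariate filtering with the Kolmogorov-Smirnov test can be sketched as follows. The helper name `ks_filter` and its signature are hypothetical, and SciPy's `ks_2samp` is used here as an assumption about tooling. To stay consistent with the leakage-free design, such a filter would be fitted on the training portion of each fold only.

```python
import numpy as np
from scipy.stats import ks_2samp


def ks_filter(X, y, p_cutoff=0.05):
    """Return indices of features whose distributions differ between classes.

    X is a samples-by-features array and y holds binary labels (0/1).
    A feature is kept when the two-sample KS test p-value is below p_cutoff.
    """
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    keep = []
    for j in range(X.shape[1]):
        result = ks_2samp(X[y == 0, j], X[y == 1, j])
        if result.pvalue < p_cutoff:
            keep.append(j)
    return keep


# Toy example: feature 0 differs between the classes, feature 1 does not
X_demo = np.column_stack([
    np.r_[np.arange(20), np.arange(100, 120)],
    np.r_[np.arange(20), np.arange(20)],
])
y_demo = np.r_[np.zeros(20, dtype=int), np.ones(20, dtype=int)]
selected = ks_filter(X_demo, y_demo)
```

Treating `p_cutoff` as a tunable hyperparameter, as the chapter does, lets the grid search decide how aggressively features are filtered.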

Classification performance was evaluated by calculating AUC, accuracy, sensitivity, and specificity on pooled predictions from the outer CV loop (Airola et al., 2009). To alleviate randomness caused by the small sample size, nested CV was repeated 10 times with different random number generator seeds, and the performance metrics were averaged over the 10 repeats. Classification concordance was calculated as the proportion of correct classifications of each animal over the 10 nested CV repeats. RF feature importance was calculated by averaging the feature-specific mean decreases in Gini index over 10 CV folds, and p-values were assigned to RF importance using the permutation method (Altmann et al., 2010) with 500 iterations. Similarly, permutation testing with 500 iterations was used to assess the statistical significance of TBI– versus TBI+ classification scores reaching at least moderate AUC.
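The evaluation quantities above can be illustrated in plain Python. The function names are hypothetical, and the permutation test is deliberately simplified: it shuffles labels against a fixed set of prediction scores, whereas a full permutation test would typically rerun the whole training procedure on each shuffled label set. Treat it as a sketch of the idea, not as the chapter's procedure.

```python
import random


def auc_from_scores(labels, scores):
    """AUC as the probability that a positive sample outranks a negative one.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def concordance(labels, predictions_per_repeat):
    """Per-animal proportion of CV repeats with a correct class prediction."""
    n_repeats = len(predictions_per_repeat)
    return [
        sum(rep[i] == labels[i] for rep in predictions_per_repeat) / n_repeats
        for i in range(len(labels))
    ]


def permutation_p_value(labels, scores, n_iter=500, seed=0):
    """Share of label shufflings with an AUC at least as large as observed."""
    rng = random.Random(seed)
    observed = auc_from_scores(labels, scores)
    shuffled = list(labels)
    hits = 0
    for _ in range(n_iter):
        rng.shuffle(shuffled)
        if auc_from_scores(shuffled, scores) >= observed:
            hits += 1
    # Add-one correction so the estimated p-value is never exactly zero
    return (hits + 1) / (n_iter + 1)
```

Pooling means that `auc_from_scores` is called once on the out-of-fold scores of all animals, rather than once per fold followed by averaging.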

Differences in classifier performance between RFs trained for TBI– versus TBI+ classification at the explored thresholds of TBI+ membership, and between LR and RF trained for TBI3– versus TBI3+ classification, were analyzed using the 20 × 10cv paired t-test (Bouckaert and Frank, 2004) performed on averages over sorted runs. For the statistical analysis, nested CV was repeated 20 times using 10-fold nonstratified CV on the outer CV loop, and the t-test was performed over the means of classifier accuracies from ranked folds over the 20 runs. This approach has been previously shown to yield lower type I error and higher replicability than direct comparison over folds (Bouckaert, 2003, 2004). For the 20 × 10cv test, correction for multiple comparisons over models and thresholds was performed using the Bonferroni-Holm method (Holm, 1979).
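The sorted-runs test statistic can be sketched as follows. This reflects our reading of the description above (sort the fold accuracies within each run for each classifier, average each rank position over runs, then apply a paired t-test over the k averaged values); the function name is hypothetical and details may differ from the implementation used in the chapter.

```python
import math
from statistics import mean, stdev


def sorted_runs_t_statistic(acc_a, acc_b):
    """Paired t statistic over sorted-run averages for two classifiers.

    acc_a and acc_b are lists of runs; each run is a list of k per-fold
    accuracies. Returns (t, degrees_of_freedom) with k - 1 degrees of freedom.
    The statistic is undefined if all averaged differences are identical.
    """
    k = len(acc_a[0])
    # Rank the folds within each run, separately for each classifier
    sorted_a = [sorted(run) for run in acc_a]
    sorted_b = [sorted(run) for run in acc_b]
    # Average each rank position over the runs
    mean_a = [mean(run[j] for run in sorted_a) for j in range(k)]
    mean_b = [mean(run[j] for run in sorted_b) for j in range(k)]
    diffs = [a - b for a, b in zip(mean_a, mean_b)]
    t = mean(diffs) / (stdev(diffs) / math.sqrt(k))
    return t, k - 1
```

For a 20 × 10cv design, `acc_a` and `acc_b` would each hold 20 runs of 10 fold accuracies, and the resulting statistic would be compared against a t distribution with 9 degrees of freedom.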

To gauge the level of inter-cohort similarity in the analyzed modalities, multivariate intra-class correlation (ICC) and 1-versus-rest classification of cohorts were utilized. If no inter-cohort differences are present, within-cohort distances between samples in the measurement space should be similar to between-cohort distances, which results in an ICC near zero. Similarly, when inter-cohort differences are nonexistent, an ML classifier cannot differentiate a single cohort from the rest.

The 95% confidence intervals for ICC were computed using bootstrapping. In the classification experiments, a k-nearest neighbor (KNN) classifier was used, with leave-one-out CV in the outer loop of nested CV.

Data processing and analyses were performed using Python 3.9 and Cython 0.29.22. Kruskal-Wallis, Mann-Whitney U, Dunn, and Kolmogorov-Smirnov tests were conducted using statsmodels 0.13. Dimensionality reduction via UMAP was performed using umap-learn 0.5.1. For ML and CV, scikit-learn 1.0 was utilized. During grid search, the KST p-value cutoff was varied between 0.05 and 0.1, and the maximum number of features in RFE was varied between 5 and 15. The number of trees in RF was varied between 100 and 1000, with the maximum number of features per tree kept at the default value of the square root of the number of features. The depth of trees was unrestricted, and the number of samples in the bootstrapped datasets during bagging was equal to the number of samples in the dataset.

Figure 40–1B presents the UMAP visualization of the clustering of naïve, sham, and TBI groups in the feature space of the “All” modality. The combination of all modalities separated TBI animals from naïve and sham-operated rats. Naïve and sham animals were also separated. In the feature space of the neuroscore modality alone, TBI animals were well separated from naïve and sham animals (Fig. 40–1G), even though two naïve animals resembled the rats in the TBI group.

Figure 40–2. A. A box plot of mean accuracies from the paired 20 × 10cv test. RF classifiers differ in performance on different TBI+ inclusion thresholds, and the RF for TBI3– vs. TBI3+ differs significantly from the corresponding LR model (*p < 0.001). B. Feature importance in terms of mean decrease of Gini index in permutation test (*p < 0.05, **p < 0.01). C. Ratio of correct classification (TBI3– vs. TBI3+) of individual animals over the 10 CV folds. Green line indicates the cumulative proportion.

The Kruskal-Wallis (KW) test with Dunn post hoc analysis revealed that naïve and sham animals differed in the slope of the line fitted through the time series of body weight measurements (H 14, difference CI95[–0.33;–0.16]), with the sham group showing a greater weight increase over time.

Naïve/sham and TBI animals showed differences of several points in magnitude in the mean of almost all neuroscores (Dunn p < 0.001, H ≥ 38). Similarly, the mean body weight of TBI animals was up to 43.6 g (CI95[36.5;50.4], H 50, p < 0.001) less than in naïve rats and up to 34.5 g (CI95[27.4;41.2], H 52.2, p < 0.001) less than in sham rats until day 42 postinjury. In MWM, TBI animals traveled up to 18 cm longer distance to reach the platform (CI95[14.7;21.9], H 64.8, Dunn p < 0.001), spent up to 10.7 s less time in the northeast zone (CI95[7.9;13.7], H 46.7, p < 0.0001), and swam approximately 6 cm/s faster (CI95[4.3;9.3], H 43.5, p < 0.0001).

Next, we assessed whether the TBI+ and TBI– animals clustered separately and whether the separation was affected by seizure frequency. Four distinct clusters were visible in the feature space of the TBI group only (Fig. 40–1C–E). When the inclusion criterion for TBI+ was increased to ≥2 seizures (i.e., only the rats with more severe epilepsy were included in the TBI+ group), one of the clusters reached complete homogeneity, indicating the presence of a subgroup among TBI– samples discernible from TBI+.

The Mann-Whitney U (MWU) test showed that the TBI1– and TBI1+ groups differed mainly (p < 0.05) in terms of EPM latency to first visit in the north section (1049 s higher in TBI1–, CI95[474;1173], U 807), duration in the north section (12.5 s longer for TBI1+, CI95[1.7;29.4], U 826), total distance (256 cm longer for TBI1+, CI95[123.2;392.9], U 759), velocity (0.86 cm/s faster for TBI1+, CI95[0.41;1.34], U 761), and EPM north frequency (1 visit more for TBI1+, CI95[0.26;2.17], U 868.5). Similarly, in the OF test, TBI1+ animals showed a 2.26 cm/s faster mean velocity (CI95[1.26;3.48], U 756) and a 674 cm longer total distance (CI95[365.5;996.1], U 757) compared to TBI1–. Additionally, TBI1+ animals differed in several weight-related variables, weighing up to 21.6 g (CI95[11.2;32.9], U 725.5) less before day 32.

The pattern persisted in the TBI2– versus TBI2+ comparisons, including differences in weight, EPM, and OF. TBI2+ rats had lower body weight at multiple time points (p < 0.01, U ≥ 449). In OF, they moved a 787 cm longer total distance (CI95[417.2;1140.4], U 559, p < 0.05) with a 2.6 cm/s higher mean velocity (CI95[1.53;3.95], U 558, p < 0.05). In EPM, TBI2+ rats showed a 4.5 greater center area (CI95[2;7], U 556, p < 0.05) and a 2.1 greater west area visiting frequency (CI95[0.87;3.4], U 639, p < 0.05), and a 32 s longer west area visit duration (CI95[7.3;58.9], U 613, p < 0.05), as compared to TBI2–. Additionally, in contrast to TBI1– versus TBI1+, TBI2+ showed a 0.42 points lower mean left (CI95[0.21;0.64], U 646, p < 0.05) and a 0.32 points lower mean right neuroscore (CI95[0.15;0.56], U < 699, p < 0.05) than TBI2–.

TBI3+ rats showed a lower body weight (p < 0.01, U ≥ 402), made on average 4.1 (CI95[1.53;6.74], U 531, p < 0.01) and 2.1 (CI95[0.79;3.41], U 593, p < 0.05) more visits to the EPM center and west areas, respectively, and spent 32 s longer (CI95[9.9;62.7], U 551, p < 0.05) in the EPM center area than TBI3– rats. They had a 0.32 points lower mean left hindlimb flexion score on day 6 than rats in the TBI3– group (CI95[0.17;0.6], U 629, p < 0.05).

A perfect naïve/sham versus TBI classification was possible with RF utilizing the multimodal All and Behavior modalities, resulting in a pooled AUC of 1.0 (Table 40–2). RF achieved a similar performance using neuroscore and MRI. Using only weight-related variables, RF separated naïve/sham animals from TBI animals with a pooled AUC of 0.96, and using MWM with a pooled AUC of 0.93. Similarly, LR showed a high performance when combined with the multimodal All and Behavior modalities, with pooled AUCs of 0.99 and 0.98, respectively. Utilizing neuroscore, LR reached a pooled AUC of 0.98, and using MWM, 0.91. In contrast to RF, LR was not able to correctly assign classes utilizing only MRI (pooled AUC 0.69).

Table 40–2
Mean Model Performance in Naïve/Sham versus Traumatic Brain Injury (TBI) Classification from Repeated 10-fold Cross-Validation Using the Combinational Modalities
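| Modality | Model | Pooled AUC | Sensitivity | Specificity |
|---|---|---|---|---|
| All | RF | 1.00 | 1.00 | 1.00 |
| All | LR | 0.99 | 1.00 | 0.99 |
| Behavior | RF | 1.00 | 1.00 | 1.00 |
| Behavior | LR | 0.98 | 1.00 | 0.98 |
| MRI | RF | 1.00 | 1.00 | 1.00 |
| MRI | LR | 0.69 | 1.00 | 0.36 |
| EPM | RF | 0.63 | 0.93 | 0.18 |
| EPM | LR | 0.60 | 0.95 | 0.04 |
| OF | RF | 0.71 | 0.92 | 0.29 |
| OF | LR | 0.74 | 0.90 | 0.29 |
| MWM | RF | 0.93 | 0.95 | 0.73 |
| MWM | LR | 0.91 | 0.90 | 0.83 |
| FS | RF | 0.73 | 0.92 | 0.38 |
| FS | LR | 0.69 | 0.81 | 0.47 |
| Neuroscore | RF | 1.00 | 1.00 | 1.00 |
| Neuroscore | LR | 0.98 | 1.00 | 0.98 |
| Sucrose preference | RF | 0.44 | 0.82 | 0.08 |
| Sucrose preference | LR | 0.58 | 1.00 | 0.00 |
| Weight | RF | 0.96 | 0.96 | 0.79 |
| Weight | LR | 0.98 | 0.95 | 0.86 |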

Note: Random forest detected TBI perfectly using a combination of all modalities (All), combination of all behavioral modalities (Behavior) and neuroscores. High naïve/sham versus TBI classification performance was achieved on all modalities except sucrose intake, forced swimming (FS), elevated plus-maze (EPM), and open field (OF) test.

Abbreviations: EPM, elevated plus maze; FS, forced swimming; MRI, magnetic resonance imaging; MWM, Morris water maze; OF, open field.

A reliable naïve/sham versus TBI classification was not possible using EPM, OF, FS, or sucrose consumption with either RF or LR.

In TBI1– versus TBI1+ and TBI2– versus TBI2+ classification, RF achieved pooled AUCs of 0.71 and 0.70, respectively, using the All modality (Table 40–3). When the threshold was increased to ≥3 seizures, RF separated TBI3– from TBI3+ moderately, with an AUC of 0.75 (permutation test, p < 0.05). Conversely, the AUC of LR on the same modality was 0.61, 0.62, and 0.60 at thresholds 1, 2, and 3, respectively. In unimodal classification, none of the individual modalities enabled distinguishing TBI+ from TBI– at any threshold (Table 40–4).

Table 40–3
Mean Model Performance in Rats with Traumatic Brain Injury without Epilepsy (TBI–) versus with Epilepsy (TBI+) Classification from Repeated 10-fold Cross-Validation Using the Combinational Modalities
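| Modality | Threshold for TBI+ | Model | AUC | Sensitivity | Specificity |
|---|---|---|---|---|---|
| All | 1 | RF | 0.71 | 0.23 | 0.91 |
| All | 1 | LR | 0.61 | 0.43 | 0.74 |
| All | 2 | RF | 0.70 | 0.18 | 0.93 |
| All | 2 | LR | 0.62 | 0.36 | 0.82 |
| All | 3 | RF | 0.75 | 0.10 | 0.94 |
| All | 3 | LR | 0.60 | 0.33 | 0.83 |
| Behavior | 1 | RF | 0.67 | 0.23 | 0.89 |
| Behavior | 1 | LR | 0.61 | 0.31 | 0.82 |
| Behavior | 2 | RF | 0.56 | 0.04 | 0.94 |
| Behavior | 2 | LR | 0.55 | 0.13 | 0.89 |
| Behavior | 3 | RF | 0.49 | 0.13 | 0.85 |
| Behavior | 3 | LR | 0.53 | 0.11 | 0.92 |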

Note: Random forest consistently outperforms logistic regression. Moderate pooled area under curve (AUC) of 0.75 was reached by RF trained on All modality with threshold of three or more seizures.

Table 40–4
Mean Model Performance in Rats with Traumatic Brain Injury without Epilepsy (TBI–) versus with Epilepsy (TBI+) Classification from Repeated 10-fold Cross-Validation Using Unimodal Modalities
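| Modality | Threshold for TBI+ | Model | AUC | Sensitivity | Specificity |
|---|---|---|---|---|---|
| MRI | 1 | RF | 0.49 | 0.11 | 0.89 |
| MRI | 1 | LR | 0.40 | 0.05 | 0.88 |
| MRI | 2 | RF | 0.37 | 0.06 | 0.88 |
| MRI | 2 | LR | 0.44 | 0.02 | 0.92 |
| MRI | 3 | RF | 0.44 | 0.12 | 0.91 |
| MRI | 3 | LR | 0.50 | 0.09 | 0.92 |
| EPM | 1 | RF | 0.63 | 0.23 | 0.93 |
| EPM | 1 | LR | 0.58 | 0.27 | 0.85 |
| EPM | 2 | RF | 0.65 | 0.10 | 0.96 |
| EPM | 2 | LR | 0.59 | 0.34 | 0.86 |
| EPM | 3 | RF | 0.58 | 0.06 | 0.94 |
| EPM | 3 | LR | 0.57 | 0.31 | 0.84 |
| OF | 1 | RF | 0.68 | 0.21 | 0.88 |
| OF | 1 | LR | 0.67 | 0.11 | 0.93 |
| OF | 2 | RF | 0.69 | 0.13 | 0.93 |
| OF | 2 | LR | 0.67 | 0.12 | 0.95 |
| OF | 3 | RF | 0.64 | 0.12 | 0.96 |
| OF | 3 | LR | 0.60 | 0.05 | 0.98 |
| MWM | 1 | RF | 0.58 | 0.19 | 0.93 |
| MWM | 1 | LR | 0.62 | 0.22 | 0.87 |
| MWM | 2 | RF | 0.51 | 0.04 | 0.96 |
| MWM | 2 | LR | 0.54 | 0.09 | 0.92 |
| MWM | 3 | RF | 0.64 | 0.12 | 0.96 |
| MWM | 3 | LR | 0.60 | 0.05 | 0.98 |
| FS | 1 | RF | 0.58 | 0.19 | 0.93 |
| FS | 1 | LR | 0.62 | 0.22 | 0.87 |
| FS | 2 | RF | 0.51 | 0.04 | 0.96 |
| FS | 2 | LR | 0.54 | 0.09 | 0.92 |
| FS | 3 | RF | 0.53 | 0.08 | 0.95 |
| FS | 3 | LR | 0.58 | 0.01 | 0.99 |
| Neuroscore | 1 | RF | 0.48 | 0.11 | 0.87 |
| Neuroscore | 1 | LR | 0.55 | 0.20 | 0.84 |
| Neuroscore | 2 | RF | 0.53 | 0.09 | 0.92 |
| Neuroscore | 2 | LR | 0.53 | 0.13 | 0.90 |
| Neuroscore | 3 | RF | 0.63 | 0.10 | 0.94 |
| Neuroscore | 3 | LR | 0.47 | 0.10 | 0.86 |
| Sucrose preference | 1 | RF | 0.41 | 0.17 | 0.75 |
| Sucrose preference | 1 | LR | 0.44 | 0.00 | 0.99 |
| Sucrose preference | 2 | RF | 0.40 | 0.10 | 0.79 |
| Sucrose preference | 2 | LR | 0.44 | 0.00 | 0.99 |
| Sucrose preference | 3 | RF | 0.37 | 0.11 | 0.82 |
| Sucrose preference | 3 | LR | 0.46 | 0.00 | 0.99 |
| Weight | 1 | RF | 0.62 | 0.19 | 0.93 |
| Weight | 1 | LR | 0.57 | 0.40 | 0.77 |
| Weight | 2 | RF | 0.68 | 0.22 | 0.94 |
| Weight | 2 | LR | 0.61 | 0.39 | 0.81 |
| Weight | 3 | RF | 0.62 | 0.13 | 0.94 |
| Weight | 3 | LR | 0.67 | 0.39 | 0.83 |
| TBI | 1 | RF | 0.62 | 0.19 | 0.93 |
| TBI | 1 | LR | 0.57 | 0.40 | 0.77 |
| TBI | 2 | RF | 0.68 | 0.22 | 0.94 |
| TBI | 2 | LR | 0.61 | 0.39 | 0.81 |
| TBI | 3 | RF | 0.62 | 0.13 | 0.94 |
| TBI | 3 | LR | 0.67 | 0.39 | 0.83 |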

Abbreviations: EPM, elevated plus-maze; FS, forced swimming test; MRI, magnetic resonance imaging; MWM, Morris water maze; OF, open field.

Note: None of the modalities enabled reliable TBI– versus TBI+ classification at any of the thresholds.

Because studies with large animal numbers typically need to be conducted in several successive subcohorts, we next assessed the similarity between the subcohorts. A moderate ICC was observed for weight (0.33, CI95 [0.26, 0.42]) and, to a lesser extent, for MRI (0.23, CI95 [0.17, 0.29]). Lower ICCs were observed for FS (0.15, CI95 [0.1, 0.2]), sucrose preference (0.16, CI95 [0.09, 0.23]), TBI induction parameters (0.15, CI95 [0.06, 0.25]), and neuroscore (0.13, CI95 [0.09, 0.18]). For the remaining modalities, the ICC was near zero.

Similarly, animals could be assigned to the correct cohort based on their weight, with a mean pooled AUC of 0.89. However, reliable cohort-wise classification using MRI or other modalities was not possible.
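As an illustration of how a cohort-level ICC can be read, the sketch below computes a one-way random-effects ICC from a one-way ANOVA decomposition. This section does not state which ICC form or confidence-interval method was used, so the one-way form is an assumption, and the weights and the helper `icc_oneway` are hypothetical.

```python
def icc_oneway(groups):
    """One-way random-effects ICC; `groups` is a list of per-cohort value lists."""
    g = len(groups)
    sizes = [len(x) for x in groups]
    n_total = sum(sizes)
    grand_mean = sum(sum(x) for x in groups) / n_total
    means = [sum(x) / len(x) for x in groups]

    # Variation of cohort means around the grand mean, and of animals around their cohort mean.
    ss_between = sum(n * (m - grand_mean) ** 2 for n, m in zip(sizes, means))
    ss_within = sum((v - m) ** 2 for x, m in zip(groups, means) for v in x)
    ms_between = ss_between / (g - 1)
    ms_within = ss_within / (n_total - g)

    # Effective cohort size; equals the common size when cohorts are balanced.
    k0 = (n_total - sum(n * n for n in sizes) / n_total) / (g - 1)
    return (ms_between - ms_within) / (ms_between + (k0 - 1) * ms_within)

# Hypothetical body weights (g): the second cohort is systematically heavier.
weights_by_cohort = [
    [350, 360, 355, 365],
    [380, 390, 385, 395],
    [352, 362, 358, 368],
]
# Hypothetical readout with identical distributions in every cohort.
similar_by_cohort = [[1, 2, 3, 4], [1, 2, 3, 4], [1, 2, 3, 4]]

icc_weight = icc_oneway(weights_by_cohort)
icc_similar = icc_oneway(similar_by_cohort)
print(f"ICC with a cohort shift: {icc_weight:.2f}")
print(f"ICC without a cohort shift: {icc_similar:.2f}")
```

A high value means that much of the variance is explained by cohort membership. The estimator can fall below zero when cohort means are closer together than chance would predict, and such values are commonly interpreted as no cohort effect.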

We have previously investigated behavioral and cortical MRI parameters using more traditional biomarker statistics (Manninen et al., 2020; Lapinlampi et al., 2020). Here, we applied an ML approach to discover single or combinatory biomarkers for PTE that may have gone undetected in previous analyses.

ML classifiers detect patterns specific to PTE during training to derive an optimal rule for classification. This allows hypothesis-free, multivariate, and nonlinear modeling of the relationship between measurements and PTE. When the data contain negligible information related to PTE, the extracted rules reflect measurement noise, leading to low classifier performance (close to chance level) during repeated CV. For this reason, ML provides a built-in check on the validity of biomarker discovery. The inability to combine a set of variables from an assay to detect PTE may indicate the following: (a) low relevance of the latent biological phenomenon measured by the assay to the PTE classification, (b) inability of the assay to measure the latent phenomenon with a sufficiently high signal-to-noise ratio, (c) inability of the ML methods used to determine the relationship between the variables and PTE, or (d) a sample that insufficiently represents the distribution to be estimated. Point (b) may result from the inherent nature of the assay or from inter-cohort differences due to, for example, experimenter-level differences or center-level differences in multicenter studies. In the present dataset, the differences between the cohorts were negligible, with the exception of differences in body weight and measures of cortical MRI.

An assay that is uninformative on its own can still provide complementary information when combined with measurements from another assay. For example, a moderate classification performance of AUC 0.75 was reached by combining modalities such as weight, EPM, sucrose intake, and neuroscore, which individually were not able to separate the TBI3– and TBI3+ classes. Nevertheless, classification accuracy showed relatively high variance, and a subset of animals was assigned to the correct class in fewer than 60% of the CV repeats. The inclusion of additional modalities such as EEG parameters or plasma markers could provide insight into the underlying pathology and further improve the classification performance.
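The idea that individually uninformative measurements can become informative in combination can be shown with a deliberately artificial example. The sketch below is not based on the chapter's data: it assumes two hypothetical binary readouts and a class label that is positive only when exactly one of them is elevated (an XOR pattern). Each readout alone, and their sum, gives chance-level AUC, whereas a nonlinear combination separates the classes.

```python
def auc(labels, scores):
    """AUC as the share of positive-negative pairs ranked correctly (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical toy data: five animals for each combination of two binary readouts.
# The label is 1 only when exactly one of the two readouts is elevated.
data = [(a, b, a ^ b) for a in (0, 1) for b in (0, 1) for _ in range(5)]
labels = [y for _, _, y in data]

auc_a = auc(labels, [a for a, _, _ in data])                    # readout A alone
auc_b = auc(labels, [b for _, b, _ in data])                    # readout B alone
auc_linear = auc(labels, [a + b for a, b, _ in data])           # equal-weight sum
auc_nonlinear = auc(labels, [abs(a - b) for a, b, _ in data])   # nonlinear combination

print(auc_a, auc_b, auc_linear, auc_nonlinear)
```

Real measurements are unlikely to follow such a clean pattern, but the example shows why a unimodal screen can miss information that a multivariate, nonlinear model can use.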

There was limited overlap between the features highlighted by conventional statistics and RF feature importance. Statistical significance does not ensure separation; that is, statistical significance indicates that the class means of a feature differ, which is conceptually different from class separation. Because class separation is required for a pattern to qualify as a biomarker, and individual measurements may need to be put in context using complementary information from other measurements, screening candidate biomarker features by their contribution to a multivariate predictive model for PTE is justified. The low classification performance of the linear combination of measurement variables used by LR implies that nonlinear effects need to be accounted for in PTE classification with the modalities used here.
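The distinction between a significant difference in class means and class separation can be illustrated with simulated data. The sketch below assumes two hypothetical Gaussian classes whose means differ by 0.2 standard deviations, with 5000 samples per class. It is an illustration only and does not use the chapter's measurements. The Welch t statistic is large because the sample is large, while the AUC stays close to chance because the distributions overlap almost completely.

```python
import math
import random

def welch_t(x, y):
    """Welch's t statistic for the difference between two sample means."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    vx = sum((v - mx) ** 2 for v in x) / (len(x) - 1)
    vy = sum((v - my) ** 2 for v in y) / (len(y) - 1)
    return (mx - my) / math.sqrt(vx / len(x) + vy / len(y))

def rank_auc(pos, neg):
    """AUC from the Mann-Whitney U statistic, assuming no tied values."""
    ranked = sorted([(v, 1) for v in pos] + [(v, 0) for v in neg])
    rank_sum = sum(i + 1 for i, (_, is_pos) in enumerate(ranked) if is_pos)
    u = rank_sum - len(pos) * (len(pos) + 1) / 2
    return u / (len(pos) * len(neg))

rng = random.Random(0)
# Hypothetical feature: class means differ by 0.2 standard deviations.
negatives = [rng.gauss(0.0, 1.0) for _ in range(5000)]
positives = [rng.gauss(0.2, 1.0) for _ in range(5000)]

t_stat = welch_t(positives, negatives)
separation = rank_auc(positives, negatives)
print(f"t = {t_stat:.1f}, AUC = {separation:.2f}")
```

A feature like this would pass a significance test easily yet would be of little use for classifying an individual animal.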

In these experiments, PTE was defined by the presence of one or more unprovoked seizures. This classification scheme does not take into account, for example, the number and duration of seizures, the behavioral characteristics of the seizures, or the presence of epileptiform activity in addition to seizures. Thus, animals with a single observed seizure may differ in the severity of the underlying pathology. When the severity of PTE in the TBI+ group was increased through a stricter inclusion criterion of three or more electrographic seizures, an AUC of 0.75 was reached. Conversely, regarding animals with fewer than three seizures as "nonepileptic" did not negatively affect the classification. This suggests that a combination of weight, sucrose intake, neuroscore, and EPM measurements differentiates only severe PTE (i.e., animals with three or more seizures per month). However, because ML models are based solely on patterns present in the data, the observed performance may be affected by possible confounding factors. As is the case with ML generally, validation on independent datasets is necessary before the discovered patterns can be considered possible biomarkers.
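The threshold-based class definitions discussed above can be written out explicitly. The sketch below uses hypothetical seizure counts and a hypothetical helper, `label_by_threshold`. It shows how raising the threshold moves animals with few seizures into the "nonepileptic" class and shrinks the TBI+ class.

```python
def label_by_threshold(seizure_counts, threshold):
    """Label an animal 1 (TBI+) if it had at least `threshold` seizures, else 0 (TBI-)."""
    return [1 if count >= threshold else 0 for count in seizure_counts]

# Hypothetical seizure counts for eight injured animals during follow-up.
counts = [0, 0, 1, 0, 2, 5, 0, 3]

labels_by_threshold = {t: label_by_threshold(counts, t) for t in (1, 2, 3)}
for threshold, labels in labels_by_threshold.items():
    print(f"threshold {threshold}: labels = {labels}, TBI+ animals = {sum(labels)}")
```

Because the positive class becomes smaller at stricter thresholds, each performance estimate rests on fewer TBI+ animals, which is one reason independent validation is needed.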

This study was supported by the Medical Research Council of the Academy of Finland (grants 272249 and 273909, AP), by the Research Council for Natural Sciences and Engineering of the Academy of Finland (grant 316258, JT), and by the European Union's Seventh Framework Program (FP7/2007-2013) under grant agreement n°602102 (EPITARGET) (AP). The computational analyses were performed on servers provided by UEF Bioinformatics Center, University of Eastern Finland, Finland.

The authors declare no relevant conflicts.

Airola, A., Pahikkala, T., Waegeman, W., De Baets, B., & Salakoski, T. A comparison of AUC estimators in small-sample studies. In S. Džeroski, P. Geurts, & J. Rousu (Eds.), Proceedings of the Third International Workshop on Machine Learning in Systems Biology. 2009; 8: 3–13. PMLR. http://proceedings.mlr.press/v8/airola10a.html

Altmann, A., Toloşi, L., Sander, O., & Lengauer, T. Permutation importance: a corrected feature importance measure. Bioinformatics. 2010; 26(10): 1340–1347. https://doi.org/10.1093/bioinformatics/btq134

Benjamini, Y., & Hochberg, Y. Controlling the False Discovery Rate: A Practical and Powerful Approach to Multiple Testing. Journal of the Royal Statistical Society. Series B (Methodological). 1995; 57(1): 289–300. http://www.jstor.org/stable/2346101

Bouckaert, R. R. Choosing between Two Learning Algorithms Based on Calibrated Tests. Proceedings of the Twentieth International Conference on Machine Learning. 2003; 20: 51–58.

Bouckaert, R. R. Estimating Replicability of Classifier Learning Experiments. Proceedings of the Twenty-First International Conference on Machine Learning. 2004; 21: 15. https://doi.org/10.1145/1015330.1015338

Bouckaert, R. R., & Frank, E. Evaluating the Replicability of Significance Tests for Comparing Learning Algorithms. In H. Dai, R. Srikant, & C. Zhang (Eds.), Advances in Knowledge Discovery and Data Mining (pp. 3–12). Springer Berlin Heidelberg. 2004.

Breiman, L., Friedman, J.H., Olshen, R.A., & Stone, C.J. Classification And Regression Trees (1st ed.). Routledge. 1984. https://doi.org/10.1201/9781315139470

Breiman, L. Random Forests. Machine Learning. 2001; 45(1): 5–32. https://doi.org/10.1023/A:1010933404324

Cover, T., Hart, P. Nearest neighbor pattern classification. IEEE Trans. Inf. Theory. 1967; 13: 21–27. https://doi.org/10.1109/TIT.1967.1053964

Dulla, C.G., Pitkänen, A. Novel approaches to prevent epileptogenesis after traumatic brain injury. Neurotherapeutics. 2021 Jul; 18(3): 1582–1601. doi:10.1007/s13311-021-01119-1. Epub 2021 Sep 30. PMID: 34595732; PMCID: PMC8608993.

Engel, J.J., Pitkänen, A., Loeb, J.A., Dudek, F.E., Bertram, E.H. 3rd, Cole, A.J., Moshe, S.L., Wiebe, S., Jensen, F.E., Mody, I., Nehlig, A., Vezzani, A. Epilepsy biomarkers. Epilepsia. 2013; 54 Suppl 4: 61–69. https://doi.org/10.1111/epi.12299

Hastie, T., Tibshirani, R., & Friedman, J. The Elements of Statistical Learning. Springer New York Inc. New York, NY, USA; 2001.

Hauser, W.A., Annegers, J.F., Kurland, L.T. Incidence of epilepsy and unprovoked seizures in Rochester, Minnesota: 1935–1984. Epilepsia. 1993; 34: 453–468.

Holm, S. A Simple Sequentially Rejective Multiple Test Procedure. Scandinavian Journal of Statistics. 1979; 6: 65–70.

Lapinlampi, N., Andrade, P., Paananen, T., Hämäläinen, E., Ekolle Ndode-Ekane, X., Puhakka, N., & Pitkänen, A. Postinjury weight rather than cognitive or behavioral impairment predicts development of posttraumatic epilepsy after lateral fluid-percussion injury in rats. Epilepsia. 2020; 61(9): 2035–2052. https://doi.org/10.1111/epi.16632

Manninen, E., Chary, K., Lapinlampi, N., Andrade, P., Paananen, T., Sierra, A., Tohka, J., Gröhn, O., & Pitkänen, A. Early increase in cortical T2 relaxation is a prognostic biomarker for the evolution of severe cortical damage, but not for epileptogenesis, after experimental traumatic brain injury. J Neurotrauma. 2020 Dec 1; 37(23): 2580–2594. doi:10.1089/neu.2019.6796. Epub 2020 Jun 26. PMID: 32349620.

McInnes, L., Healy, J., & Melville, J. UMAP: Uniform Manifold Approximation and Projection for Dimension Reduction. 2020.

Pitkänen, A., Ekolle Ndode-Ekane, X., Lapinlampi, N., & Puhakka, N. Epilepsy biomarkers - Toward etiology and pathology specificity. Neurobiol. Dis. 2018; 123: 42–58. https://doi.org/10.1016/j.nbd.2018.05.007

Scheffer, I.E., Berkovic, S., Capovilla, G., Connolly, M.B., French, J., Guilhoto, L., Hirsch, E., Jain, S., Mathern, G.W., Moshé, S.L., Nordli, D.R., Perucca, E., Tomson, T., Wiebe, S., Zhang, Y.-H., Zuberi, S.M. ILAE classification of the epilepsies: Position paper of the ILAE Commission for Classification and Terminology. Epilepsia. 2017; 58: 512–521. https://doi.org/10.1111/epi.13709

Tibshirani, R. Regression Shrinkage and Selection via the Lasso. J. R. Stat. Soc. Ser. B. 1996; 58: 267–288.

WHO. Epilepsy, 2021. Accessed November 1, 2021. https://www.who.int/news-room/fact-sheets/detail/epilepsy
