Deep learning unmasks the ECG signature of Brugada syndrome

Abstract One in 10 cases of sudden cardiac death (SCD) strikes without warning as the result of an inherited arrhythmic cardiomyopathy, such as Brugada Syndrome (BrS). Normal physiological variations often obscure visible signs of this and related life-threatening channelopathies in conventional electrocardiograms (ECGs). Sodium channel blockers can reveal previously hidden diagnostic ECG features; however, their use carries the risk of life-threatening proarrhythmic side effects. The absence of a nonintrusive test places a grossly underestimated fraction of the population at risk of SCD. Here, we present a machine-learning algorithm that extracts, aligns, and classifies ECG waveforms for the presence of BrS. This protocol, which succeeds without the use of a sodium channel blocker (88.4% accuracy, 0.934 AUC in validation), can aid clinicians in identifying the presence of this potentially life-threatening heart disease.

" Tables S10, S11 • DNN Performance Comparison with Other Popular Classifiers " Table S12 The ECG Processing Pipeline: A Journey from ECG to Diagnosis

Context
As its ultimate goal, our automated digital ECG analysis seeks to classify a BrS disease state on the basis of features extracted from a standard ECG trace. Broadly speaking, the most common data processing pipeline for ECG analysis applies a convolutional or recurrent neural network (CNN or RNN) directly to the raw ECG trace. This is the most hands-off approach, because the preprocessing, feature selection, and modelling are all streamlined into a single step: the CNN/RNN generates and optimizes the set of features it finds best correlated with the outcome, then feeds this set to a fully connected deep neural network (DNN), which determines the optimal combination and weighting of features to predict the outcome. This strategy comes with its own advantages and disadvantages.

We instead employ a robust ECG preprocessing strategy that derives a single, high-fidelity representative heartbeat from the full trace for each lead of the input ECG; our algorithm subsequently classifies on this basis. A representative heartbeat in our case is the median of a single, complete cardiac cycle, centered near the R peak, as recorded by each lead. We meticulously locate each R peak in an ECG trace by a multi-resolution continuous wavelet transform (CWT) decomposition. We find this yields the most accurate results, because the CWT procedure identifies the parts of the ECG signal in time that exhibit the spatial frequency (sharpness) of the R peak. This function of signal morphology and time requires a significant amount of computational time to generate. Once the R peaks along a trace are identified, however, the algorithm rapidly segments and stacks each detected beat within a canonical 750 ms window that captures a complete cardiac cycle. We next perform a clustering analysis to remove statistical outliers. Fusing this stack of heartbeats (by taking the pointwise median) produces a very high SNR representative heartbeat, which we feed to a fully connected DNN for classification.

Advantages
• The high-SNR representative heartbeat is the feature. It is not learned, but extracted as a persistent property of the raw ECG trace.
• Statistically superior classification performance over CNN/RNN approaches
• Quick to train the DNN

Disadvantages
• Accurate determination of R-peak positions requires elaborate data processing; this step forms the computational bottleneck.
• Cannot be applied to streaming data (no real-time analysis).
• The code is complex.
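As a concrete sketch of the pipeline's final step, the snippet below stacks fixed windows around known R-peak positions and fuses them into a representative beat by a pointwise median. The 2000 Hz sampling rate, the 750 ms window, and the 300 ms R-peak offset come from the text; the function name and the edge handling are illustrative.

```python
import numpy as np

FS = 2000                  # raw ECG sampling rate (Hz), per the text
WIN = int(0.750 * FS)      # 750 ms canonical window -> 1500 samples
PRE = int(0.300 * FS)      # R peak sits 300 ms from window onset

def representative_beat(trace, r_peaks):
    """Stack complete beats around each R peak and fuse them pointwise.

    `trace` is one ECG lead (1-D array); `r_peaks` are sample indices.
    Beats whose window would run off either end of the record are
    skipped, mirroring the edge rule in the post-processing steps.
    """
    beats = [trace[r - PRE : r - PRE + WIN]
             for r in r_peaks
             if r - PRE >= 0 and r - PRE + WIN <= len(trace)]
    stack = np.vstack(beats)           # 2-D stack: one row per heartbeat
    return np.median(stack, axis=0)    # high-SNR representative beat
```

Outlier rejection (clustering on the stack) would run between stacking and the median in the full pipeline.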

Autoencoder for Fast R-Peak Detection
Owing to the computational expense of CWT calculations, we developed a neural network-based approach to identifying and localizing R peaks. Here, we sought to combine the accuracy afforded by the combined representative-beat and DNN approach with the computational efficiency of the CNN approach. Unfortunately, a CNN by itself struggles to learn a diagnostically useful set of features to form the representative beat. To solve this problem, we accelerated the process of R-peak detection while maintaining the accuracy of the CWT algorithm.
An autoencoder (AE) (1) is a convolutional neural network-based learning hierarchy that learns a path leading directly from a raw ECG trace to a segmentation map of R-peak locations. The segmentation map in our case is a classification data structure of the same dimensionality as the ECG trace, representing R-peak locations as 1 and every other point as 0. We apply the AE in place of the CWT algorithm to locate R peaks.
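The segmentation-map encoding just described can be made concrete with two short helpers; the function names are illustrative, and the 0.5 decoding threshold is an assumption about how a soft network output would be binarized, not the paper's value.

```python
import numpy as np

def peaks_to_segmentation_map(r_peaks, n_samples):
    """Encode R-peak sample indices as the 0/1 target map the AE learns:
    same length as the ECG trace, 1 at each R peak, 0 elsewhere."""
    seg = np.zeros(n_samples, dtype=np.float32)
    seg[np.asarray(r_peaks, dtype=int)] = 1.0
    return seg

def segmentation_map_to_peaks(seg, threshold=0.5):
    """Decode a (possibly soft) network output back to discrete indices."""
    return np.flatnonzero(seg > threshold)
```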

Advantages
• Significant improvement in deployment time
• Minimal to no ECG data preprocessing needed
• Produces a representative beat consistent with the CWT method

Disadvantages
• Requires some postprocessing to converge on unique R-peak timestamps
• False-positive identification of T waves can occur

Across 1,865 ECGs, the average R-peak computation time with the CWT for an 8 second trace is 20.3 seconds, about 2.5 times longer than the ECG trace itself. With the same set of ECGs, the autoencoder method averages 0.011 seconds per trace, roughly 727 times shorter than the duration of the ECG trace. This enormous speedup opens the door to real-time monitoring of heartbeats at very low computational expense; any currently available ECG machine has the onboard computational power to apply our algorithm in real time. Some applications could include:
• real-time monitoring of ST elevation in the operating room
• real-time processing on wearable devices
• lightweight cloud computing for uploaded ECGs (capable of processing an enormous bandwidth of ECGs, likely a few hundred per second on a regular desktop computer)
Beyond the diagnosis of BrS, this speed makes it possible to create representative heartbeats for datasets of millions of subjects in a reasonable time frame. We have shown that the representative heartbeat is a superior machine-learning input feature to any that a CNN/RNN can find for itself.

Autoencoder Output Post-Processing
The following steps outline the post-processing strategy for filtering the output of the autoencoder to determine a discrete set of R-peak locations.
1. Apply a threshold to the CNN results to select peaks above the baseline. A standard deviation threshold of σ = 0.5 is used.
2. Apply a low-pass filter with a cutoff frequency of 30 Hz to the second derivative of the ECG signal, using a Butterworth filter of order 9.
3. Multiply the filtered second derivative by the original CNN prediction and normalize the result. This step helps reduce the identification of false-positive P and T waves.
4. Choose the V1 lead from the ECG data.
5. Remove R peaks that are within 20 ms of the beginning or end of the signal, as these often belong to incomplete heartbeats.
6. Use basin-hopping to find unique R-peak locations.
   • For each R-peak location, find the local minimum within a window centered around the R peak.
   • Update the R-peak locations based on the local minima found.
   • Remove duplicates and ensure that the R-peak locations are within the valid range.
7. Apply a 400 ms exclusivity rule to remove duplicates and nearby false positives.
8. Calculate the heart rate (HR) and heart rate variability (HRV) from the final R-peak locations.
9. Create a representative beat for each subject by chopping and stacking ECG signals around the R peaks.
   • For each R-peak location, extract a window of the ECG signal centered near the R peak; typically this window is 750 ms. The window is centered slightly after the R-peak location, owing to the typically longer duration of the QT interval compared with the PR interval.
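Steps 5 and 7 above can be sketched compactly; the helper below is illustrative and omits the derivative-filtering and basin-hopping stages.

```python
import numpy as np

FS = 2000  # sampling rate (Hz)

def postprocess_peaks(candidates, n_samples, fs=FS,
                      edge_ms=20, exclusivity_ms=400):
    """Sketch of steps 5 and 7: drop peaks within 20 ms of either edge,
    then enforce a 400 ms exclusivity rule so that each heartbeat
    contributes exactly one R peak."""
    edge = int(edge_ms / 1000 * fs)
    cand = sorted(c for c in candidates if edge <= c <= n_samples - edge)
    gap = int(exclusivity_ms / 1000 * fs)
    kept = []
    for c in cand:
        # keep a candidate only if it is >= 400 ms past the last kept peak
        if not kept or c - kept[-1] >= gap:
            kept.append(c)
    return kept
```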

Filtering and Denoising
We first apply a sequence of infinite impulse response (IIR) filters to each ECG trace, treating them as electronic signals. To this purpose, we apply a fast Fourier transform (FFT) to the traces and use a peak-finding algorithm on the resulting power spectrum to identify anomalous frequencies. We suppress these with a series of bandstop filters with widths of 3 Hz. We then apply another series of bandstop filters to remove 50 Hz AC hum (2) and its overtones from the ECG; this hum varies according to the selected onboard filters and the electromagnetic interference (EMI) environment of the ECG instrument. Finally, we apply a highpass filter, centered at 0.5 Hz, to remove baseline drift from the trace (1). The result is a denoised ECG signal with a flat baseline at 0 mV. Figure S3 provides an example of the power spectrum of an ECG trace before and after denoising.
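A minimal sketch of this denoising chain, using SciPy's Butterworth designs, is shown below. The 3 Hz stop-band width, the 50 Hz mains frequency, and the 0.5 Hz highpass come from the text; the filter orders and the fixed set of harmonics (in place of the peak-driven selection of anomalous frequencies) are simplifying assumptions.

```python
import numpy as np
from scipy.signal import butter, filtfilt

FS = 2000  # sampling rate (Hz)

def denoise(trace, mains=50.0, n_harmonics=3, fs=FS):
    """Suppress AC hum and baseline drift from one ECG lead.

    3 Hz-wide bandstop filters remove the mains frequency and its
    overtones; a 0.5 Hz highpass flattens the wandering baseline.
    filtfilt gives zero-phase filtering so the QRS is not shifted.
    """
    out = np.asarray(trace, dtype=float)
    nyq = fs / 2
    for k in range(1, n_harmonics + 1):
        f0 = mains * k
        b, a = butter(2, [(f0 - 1.5) / nyq, (f0 + 1.5) / nyq],
                      btype="bandstop")
        out = filtfilt(b, a, out)
    b, a = butter(2, 0.5 / nyq, btype="highpass")
    return filtfilt(b, a, out)
```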

Detection of Heartbeats
In order to generate consistent representative heartbeats, we define a canonical window of fixed length that fully encompasses the typical cardiac cycle for every subject. For the purposes of this study, we set this window to 750 ms, corresponding to a heart rate of 80 BPM, typical of our cohort. We choose the R peak as the reference point of our canonical window owing to its consistently sharp morphology.

To identify and localize R peaks, we apply the CNN-based autoencoder described previously. For a given lead, the set of R-peak positions serves to align the heartbeat waveforms to form an average.

Having established the R-peak positions, we segment each trace to fit a canonical 750 ms window, individually defined around each R peak. As such, these canonical windows capture each detected heartbeat regardless of heart rate, though some overlapping information may be captured in subjects with elevated heart rates. We align the R peaks to 300 ms from window onset, a slight offset from the window's midpoint, in order to fully capture the systole. This offset also helps to account for subjects with longer QT intervals, whose complete cardiac cycles might not otherwise be captured within our canonical window. The result is a two-dimensional stack of individual heartbeats for each subject, as shown in Figure 3a. Here, the R-peak positions also provide the information necessary to synchronize the positions of traces in the measurement window before fusion.

Outlier Identification
The two-dimensional stack of ECG traces for each lead may include statistical outliers for several reasons, such as patient movement, irregularities in a single cardiac cycle (e.g., a premature beat), a change in the shape of the ECG that causes the detection algorithm to misidentify the R peak, or a change in the trace baseline that has escaped the FFT highpass filter. Figure S4 shows some examples. These degrade the fidelity of the representative beat; removing these outliers improves the suitability of the dataset for calibration purposes.

To identify statistical outliers in each ECG stack, we apply a series of outlier detection algorithms and reject individual heartbeats identified as such. We first apply principal component analysis (PCA), which projects the data onto an orthogonal basis, in which each dimension (principal component) represents a unique contribution to the overall variance in the data. Using Hotelling's T² statistic (3), we define a 95% confidence region in the space of the first two principal components (see Figure S4A). The critical distance of the confidence region is defined as

T²_crit = [A(N − 1)/(N − A)] · F_crit(A, N − A),   with   T²_i = Σ_{a=1}^{A} s_{ia}² / var(s_a),

where s is the principal component score matrix of the two-dimensional stack of ECG segments (s_a denoting its a-th column), N is the number of individual heartbeats in the ECG stack (i.e., population size), A is the number of principal components in the model, and F_crit is the critical value of the F-distribution with the given degrees of freedom at 95% confidence. Heartbeats in the ECG stack that lie outside this confidence region in the first two principal components are marked as outliers and are rejected from the dataset. These outliers are transient irregularities in the ECG trace, which can be caused by myriad factors, including patient movement or momentary electrical artifacts. In cases where subjects present with high intra-beat variability, the confidence region is correspondingly larger, encompassing more of that variance.
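The T² screen described above can be sketched as follows, for the stated two-component model at 95% confidence; the SVD-based PCA and the function name are illustrative choices.

```python
import numpy as np
from scipy.stats import f as f_dist

def hotelling_outliers(stack, n_components=2, alpha=0.05):
    """Flag beats outside the Hotelling T^2 confidence region of the
    first two principal components.

    `stack` is the (N beats x samples) matrix of segmented heartbeats.
    Returns a boolean mask, True for beats marked as outliers.
    """
    X = stack - stack.mean(axis=0)
    # PCA via SVD; `scores` are the projections onto the principal axes
    U, S, Vt = np.linalg.svd(X, full_matrices=False)
    scores = U[:, :n_components] * S[:n_components]
    N, A = stack.shape[0], n_components
    # per-beat T^2, each score dimension scaled by its variance
    t2 = np.sum(scores ** 2 / scores.var(axis=0, ddof=1), axis=1)
    t2_crit = A * (N - 1) / (N - A) * f_dist.ppf(1 - alpha, A, N - A)
    return t2 > t2_crit
```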

ECG Trace Processing
All DNN code has been written in-house using Python 3.7 and TensorFlow, while other supervised learning algorithms were explored using MATLAB 2020's deep learning toolkit. Figure 3a outlines the ECG processing workflow from raw text files to representative heartbeats. We first apply a sequence of infinite impulse response (IIR) filters to each ECG trace, treating them as electronic signals.
To this purpose, we apply a fast Fourier transform (FFT) to the traces and use a peak-finding algorithm on the resulting power spectrum to identify anomalous frequencies. We suppress these with a series of bandstop filters with widths of 3 Hz. We then apply another series of bandstop filters to remove 50 Hz AC hum (1) and its overtones from the ECG. Finally, we apply a highpass filter, centered at 0.5 Hz, to remove baseline drift from the trace (2). This FFT denoising step accomplishes three things: it eliminates AC hum, which varies according to the selected onboard filters and the electromagnetic interference (EMI) environment of the ECG instrument; it flattens the wandering baseline; and it suppresses anomalous high-frequency transients. The result is a denoised ECG signal with a flat baseline at 0 mV. Supplementary Figure S12 provides an example of the power spectrum of an ECG trace before and after denoising.
In order to generate consistent representative heartbeats, we define a canonical window of fixed length that fully encompasses the cardiac cycle for every patient. For the purposes of this study, we set this window to 750 ms, which isolates heartbeats for pulse rates as high as 80 BPM. We choose the R-peak as the reference point of our canonical window owing to its consistently sharp morphology.
To identify and localize R-peaks, we apply a continuous wavelet transform (CWT)-based peak-finding algorithm to each lead. This algorithm convolves the ECG trace with a model wavelet, in this case the Mexican Hat wavelet, over a series of widths. This generates a two-dimensional scalogram, depicting wavelet frequency as a function of signal time. A peak-finding algorithm then localizes sharp ridges in this scalogram, which correspond to R-peak positions in the ECG trace. For a given lead, the set of R-peak positions serves to align the heartbeat waveforms to form an average.
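SciPy bundles the scalogram construction and ridge finding into a single call, `find_peaks_cwt`, which uses the Ricker (Mexican Hat) wavelet by default. The width range below, spanning roughly the 10 to 50 ms sharpness of the QRS complex, is an illustrative assumption rather than the paper's setting.

```python
import numpy as np
from scipy.signal import find_peaks_cwt

def cwt_r_peaks(trace, fs=2000):
    """Locate R peaks by Mexican Hat CWT ridge finding.

    `find_peaks_cwt` convolves the trace with Ricker wavelets over the
    given widths and returns the indices of persistent scalogram ridges.
    """
    # widths spanning ~10-50 ms: the assumed sharpness range of the QRS
    widths = np.arange(int(0.010 * fs), int(0.050 * fs), int(0.0025 * fs))
    return find_peaks_cwt(trace, widths)
```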
Having established the R-peak positions, we segment each trace to fit a canonical 750 ms window, defined around each R-peak. The R-peak is slightly offset from the center of the canonical window in order to fully capture the systole. The result is a two-dimensional stack of individual Baseline and Ajmaline heartbeats for each patient, as shown in Figure 3a(i). Here, the R-peak positions also provide the information necessary to synchronize the positions of traces in the measurement window before fusion.
The two-dimensional stack of ECG traces for each lead may include statistical outliers for several reasons, such as patient movement, irregularities in a single cardiac cycle (e.g., a premature beat), a change in the shape of the ECG that causes the CWT algorithm to misidentify the R-peak, or a change in the trace baseline that has escaped the FFT highpass filter. Figure S12 shows some examples. These degrade the fidelity of the representative beat; removing these outliers improves the suitability of the dataset for calibration purposes.
To identify statistical outliers in each ECG stack, we apply a series of outlier detection algorithms and reject individual heartbeats identified as such. We first apply principal component analysis (PCA), which projects the data onto an orthogonal basis, in which each dimension (principal component) represents a unique contribution to the overall variance in the data. Using Hotelling's T² statistic (3), we define a 95% confidence region in the space of the first two principal components (see Figure S13A). The critical distance of the confidence region is defined as

T²_crit = [A(N − 1)/(N − A)] · F_crit(A, N − A),   with   T²_i = Σ_{a=1}^{A} s_{ia}² / var(s_a),

where s is the principal component score matrix of the ECG stack (s_a denoting its a-th column), N is the number of individual heartbeats in the ECG stack (i.e., population size), A is the number of principal components in the model, and F_crit is the critical value of the F-distribution with the given degrees of freedom at 95% confidence. Heartbeats in the ECG stack that lie outside this confidence region in the first two principal components are marked as outliers, and are rejected from the dataset.
We next calculate the leverage h that each heartbeat i in the ECG stack has on the principal component model as follows:

h_i = 1/N + Σ_{a=1}^{A} s_{ia}² / (s_aᵀ s_a),

where a indexes the principal components (a ∈ [1, A]), s is the principal component score matrix of the ECG stack (s_a denoting its a-th column), and N is the number of individual heartbeats in the ECG stack (i.e., population size). Heartbeats whose leverages exceed μ + 2σ, the mean leverage plus two standard deviations, are marked as outliers, and are rejected from the dataset.
We then perform a density-based spatial clustering of applications with noise (DBSCAN) (4) analysis on the principal component scores. This analysis uses a k-nearest-neighbor distance measure to identify clusters of heartbeats in the ECG stack; this is useful in cases where noisy ECG signals have caused the misidentification of some R-peaks. Heartbeats falling outside the largest identifiable cluster are marked as outliers and may be rejected from the dataset. The resulting corrected segmented traces are shown in Figure 3a. Figure S5 shows an example DBSCAN analysis.
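The largest-cluster rule can be sketched with scikit-learn's DBSCAN; the `eps` and `min_samples` values below are illustrative, not the paper's settings.

```python
import numpy as np
from sklearn.cluster import DBSCAN

def largest_cluster_mask(scores, eps=3.0, min_samples=5):
    """Keep only beats in the largest DBSCAN cluster of PC scores.

    Beats labelled as noise (-1) or falling in minor clusters are the
    candidate outliers. Returns a boolean mask of beats to keep.
    """
    labels = DBSCAN(eps=eps, min_samples=min_samples).fit_predict(scores)
    clusters, counts = np.unique(labels[labels >= 0], return_counts=True)
    keep = clusters[np.argmax(counts)]    # label of the biggest cluster
    return labels == keep
```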
Once outliers have been removed from the ECG stack, we calculate the median value for each stack; this generates a representative heartbeat for each of the twelve ECG leads (Figure 3a). Data from each step outlined above were inserted back into the SQL database for each subject.

Neural Networks
Among a total of 1,455 unique subjects, the DNN optimization procedure accepts 1,154 ECG records (596 Brugada and 558 Control, respectively). The primary input for developing the various DNN models is formed from chosen configurations of the input X_ECG, constructed using various combinations of n_leads ∈ [1, 12] leads. We have down-sampled the ECGs in every case from 2000 Hz to 200 Hz, which exceeds the Nyquist limit of 150 Hz; this is consistent with practices reported elsewhere. With a heartbeat sampling window of 750 ms, this down-sampling reduces the dimensionality of the classification problem to n_leads · 1500 / (2000 Hz / 200 Hz) = 150 · n_leads. We have also tested the utility of other factors of variation (FoV), such as age, sex, personal histories of syncope and cardiac arrest, and family histories of sudden death and BrS diagnosis, formatted similarly to the target vector y and concatenated to the representative fused ECG for each subject.
The neural network consists of a sequence of input, hidden, and output layers, with the hidden unit repeated three times, as shown in Figure 3b. The input layer accepts 150 · n_leads + n_FoV elements for each subject. Each hidden layer unit incorporates a fully connected layer linking 5 nodes with an L1 kernel regularizer (5), a Gaussian noise regularization layer (6) with a standard deviation of σ = 0.1, a rectified linear unit (ReLU) activation function (7), and a batch normalization layer (8).
The multilayer perceptron formed after three successive applications of this hidden layer passes to a fully connected classification layer with a sigmoid activation function, in which an adagrad optimizer minimizes the binary cross-entropy function (9). An early stopping algorithm terminates training if a local minimum is reached in the binary cross-entropy loss of the validation data; the algorithm allows for a maximum of 10,000 epochs with a patience of 50 epochs. An isotonic regression function (10) calibrates the predicted probabilities to the distribution of DNN scores (11).
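The architecture just described maps directly onto Keras (assuming TensorFlow, which the text names as the framework). The 5 nodes, σ = 0.1 noise, three hidden blocks, sigmoid output, adagrad optimizer, and cross-entropy loss come from the text; the default L1 strength and the builder's name are assumptions.

```python
import tensorflow as tf
from tensorflow.keras import layers, regularizers

def build_brs_dnn(n_leads=9, n_fov=0, nodes=5, blocks=3):
    """Three hidden blocks of Dense(+L1) -> GaussianNoise -> ReLU ->
    BatchNorm, followed by a sigmoid classification layer."""
    inp = layers.Input(shape=(150 * n_leads + n_fov,))
    x = inp
    for _ in range(blocks):
        x = layers.Dense(nodes, kernel_regularizer=regularizers.l1())(x)
        x = layers.GaussianNoise(0.1)(x)    # sigma = 0.1, as in the text
        x = layers.Activation("relu")(x)
        x = layers.BatchNormalization()(x)
    out = layers.Dense(1, activation="sigmoid")(x)
    model = tf.keras.Model(inp, out)
    model.compile(optimizer="adagrad", loss="binary_crossentropy")
    return model
```

Training with the stated stopping rule would pass `tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=50)` to `model.fit`; the isotonic calibration step is not shown here.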
We have performed an extensive grid search, optimizing calibration accuracy as a function of the structure of the DNN model with respect to the following degrees of freedom. We additionally explored the use of convolutional layers, which yielded lower training accuracies than the DNNs trained with representative heartbeats; the additional computation time needed to train convolutional neural networks (CNNs) made analysis by LOOCV unattainable for this dataset.
• Nodes: 1, 2, 3, ..., 9, 10
• Layers: 1, 2, 3, ..., 9, 10

Calibration and Validation
To exercise maximum independence in calibration and validation, we create a separate, distinct neural network modeling problem for every subject. Setting aside one subject in each case as an independent holdout, we partition the remaining ECG database into training and validation subsets using stratified k-fold cross-validation with 7 folds (86%:14% training-to-validation split). We iteratively optimize the weight and bias matrices through backpropagation, using the training subset under supervision of a validation subset. Upon minimizing the binary cross-entropy function, we test the trained DNN with the independent holdout subject using leave-one-out cross-validation (LOOCV), as shown in Figures 3c and S13. We have also explored bootstrapping, though this exhibits little difference in generalization when compared to k-fold cross-validation.
In order to minimize sampling bias in the training/validation subset, we have carried out 1,154 distinct iterations of the above procedure, in every case tuned by 7-fold cross-validation. This amounts to 8,078 separately calibrated DNNs for each set of lead combinations considered.
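The nested resampling scheme above (one holdout per subject, stratified 7-fold tuning of the remainder) can be sketched with scikit-learn's splitters; the generator name and index-triple interface are illustrative.

```python
import numpy as np
from sklearn.model_selection import LeaveOneOut, StratifiedKFold

def loocv_splits(y, n_folds=7, seed=0):
    """Yield (holdout, train, validation) index triples.

    Each subject is held out in turn; the remainder is split into
    stratified train/validation folds (6/7 vs 1/7, i.e. ~86%:14%).
    """
    y = np.asarray(y)
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    for rest, hold in LeaveOneOut().split(y):
        for tr, va in skf.split(rest, y[rest]):
            yield hold[0], rest[tr], rest[va]
```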

Validation of the 9-Lead DNN Model
The text describes a DNN model for the classification of the response to an ajmaline challenge, performed to diagnose BrS; its performance has been confirmed by nearly 2,500 independent validations. Because the local uncontrolled variance of any single ECG measurement vastly exceeds the variation in response between individual ECG instruments, we can expect our model to show comparable performance in the classification of a subsequent consecutive cohort. As a preliminary test of this hypothesis, we collected the electrocardiograms of a new cohort of 405 consecutive subjects admitted to IRCCS Policlinico San Donato for BrS screening. In addition to the standard protocol, including a sodium channel blocker challenge, these subjects underwent ECG examination using Mortara ELI™ 350 ECG machines, recorded using the same high precordial lead placement. We processed the digitized Mortara ECGs according to the previously described procedure, generating downsampled representative heartbeats, and termed this the Mortara cohort. We analyzed these 9-lead representative traces to predict sodium channel blocker response using our DNN model, which was trained solely with the training dataset, obtained by the Claris™ system, as previously described in the text.

Bandpass Filtering of Electrical Noise
Figure S3 shows the power spectra of a raw and a denoised ECG trace, as calculated by FFT. Note that the regular, sharp peaks in the raw spectrum (A), representing AC hum, are suppressed.

Detailed Outcomes in the Tests for Overfitting
The reliable extension of a multivariate classification model to the classification of a new, independent set of holdout subjects requires rigorous tests of overfitting. Figure S6 shows results that certify the absence of overfitting in the development of the DNN models classifying ECGs.

Though limited in size from an informatics perspective, these datasets represent the largest available collection of electrocardiograms definitively classified for the recognition of Brugada Syndrome, even in the case of an apparently normal ECG.

Our constraints on database size present particular concerns with regard to data handling, and to avoiding the definition of networks solely on the basis of self-consistency. Though we employ rigorous cross-validation steps to prevent overfitting, we have also taken three steps to assess whether this problem is nonetheless present. As one such test, we recalibrated the DNN with a randomly permuted target vector y, which yields accuracies near 50% in both the training and testing curves (bottom). This indicates that the randomly calibrated DNN has a prediction accuracy akin to a 50:50 coin flip. The y-permuted ROC curves in Figure S6 show the LOOCV results of the 9-lead DNN with a randomly permuted y-vector. Here, the AUC-ROC value is 0.470. The overlap of the logistic and diagonal no-skill curves indicates that a randomized DNN has no predictive ability for the classification of BrS.

Confidence Intervals Associated with Statistical Predictions
Parameters such as the sensitivity and specificity in Table 1 reflect the confidence with which the model classifies electrocardiograms for a Brugada SCB outcome on a scale from 0 to 1; their magnitudes depend upon the decision threshold. To find the best-balanced decision threshold (0.498 in the present case), we apply the statistically optimal Youden's J statistic (14).
The area under the curve (AUC) of the receiver operating characteristic (ROC) obtained for a model such as ours gauges its accuracy across all decision thresholds. To compute AUC values, we implement DeLong's algorithm, as described in (15). In addition to calculating the AUC value, DeLong's algorithm evaluates a covariance matrix to produce a standard error (SE) for the AUC; the SE is calculated as the square root of the variance from the covariance matrix. From this error value, using a 95% confidence interval, we calculate the error bars on the AUC as [AUC − 1.96 SE, AUC + 1.96 SE].
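For a single classifier, DeLong's variance estimate reduces to a compact form built from per-observation "placement" values, which is equivalent to the covariance-matrix formulation; the function name below is illustrative.

```python
import numpy as np

def delong_auc_se(pos_scores, neg_scores):
    """AUC, its DeLong standard error, and the 95% CI from raw scores.

    psi compares every positive against every negative (ties count 1/2);
    V10 and V01 are the row/column means ('placements') whose variances
    give DeLong's variance estimate for the AUC.
    """
    pos = np.asarray(pos_scores, float)[:, None]
    neg = np.asarray(neg_scores, float)[None, :]
    psi = (pos > neg).astype(float) + 0.5 * (pos == neg)
    auc = psi.mean()
    v10 = psi.mean(axis=1)              # placement of each positive
    v01 = psi.mean(axis=0)              # placement of each negative
    var = v10.var(ddof=1) / len(v10) + v01.var(ddof=1) / len(v01)
    se = np.sqrt(var)
    return auc, se, (auc - 1.96 * se, auc + 1.96 * se)
```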
Table 1 reports these 95% confidence intervals for the AUC values obtained from the analysis of the independent holdout cases in the Claris (training) and Mortara (independent validation) cohorts, as well as those associated with the manual assessments of MD1 and MD2. Note how this metric sets the manual assessments apart, and assigns a slightly higher uncertainty to AUC values derived from analyses of the smaller Mortara dataset.

Representative ECG Data
This section presents representative ECG data obtained for patients diagnosed with Brugada Syndrome, either as manifested in a spontaneous type 1 ST-elevation in a baseline ECG or suggested by a positive type 1 response to the administration of ajmaline, as well as subjects diagnosed as healthy controls. Figures S7 through S13 show conventional 12-lead electrocardiograms with high precordial lead placements for V1-V6 (V1 II ICS, V2 II ICS, V1 III ICS, V2 III ICS, V1 IV ICS, and V2 IV ICS, respectively) and standard placements for leads I, II, III, aVR, aVF, and aVL. These figures display traces before and after the administration of ajmaline for all cases except those diagnosed as BrS(+) on the basis of a spontaneous type 1 feature in the baseline ECG. The individual traces in blue show average representations of the ECG derived from 9 leads after trace processing, as outlined in the Methods section. Each trace describes a complete depolarization-repolarization cycle, showing the characteristic ECG morphology in each lead before and during the ajmaline challenge. Importantly, these contain the key diagnostic information from the full-length ECG, but are significantly smaller in size, making them far more useful for neural network processing.

Leave One Out Cross Validation (LOOCV) and Ensemble Learning
The most critical feature of any DNN classifier is the ability to make accurate predictions on previously unseen data. Generalization performance, which describes how well a classification algorithm works on a new dataset, is crucial to measuring the real-world performance of the DNN. The present work uses cross-validation as a data subsampling technique to assess the generalization performance of our DNN classification algorithm.
The computational resources required for LOOCV exceed practicable bounds for datasets of more than a few thousand observations. This computational expense normally confines external validation practice to a single step of cross-validation with a withheld fraction of observations. A dataset of fewer than 2,000 subjects, however, presents the opportunity to perform LOOCV using less than one week of computer time.
For each subject in the LOOCV process, we use ensemble learning to calculate a final classification score (ŷ). Ensemble learning, which averages the results of multiple DNN models to produce a single score, is a common practice for improving the performance of a classification algorithm. It reduces the likelihood of a misclassification due to a single poor DNN. Figure S14 depicts this cross-validation and ensemble learning strategy.
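The ensemble step itself reduces to averaging the per-model scores for the held-out subject; an unweighted mean is assumed here, since the text does not specify weights.

```python
import numpy as np

def ensemble_score(model_scores):
    """Fuse the calibrated probabilities of several DNNs into one final
    classification score y-hat for the held-out subject (unweighted mean
    assumed)."""
    return float(np.mean(model_scores))
```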

DNN Feature Importance in the 12 Lead ECG
This section presents the feature importance of the DNN. Figure S15 presents a heat map, generated by overlaying the features of greatest significance, as determined by the DNN input weights, on top of a 12-lead representative ECG. Understanding where a DNN places the most importance in the input data is crucial for interpreting the model's decisions and for understanding its strengths and weaknesses. By identifying which ECG features the DNN uses to make predictions, we can understand which aspects of the ECG signal play the greatest role in identifying the BrS phenotype. This information can guide the selection of new data for training, improve the interpretability of the model, and help us understand how the model might fail in identifying the BrS phenotype. Additionally, by understanding where a DNN places importance on ECG features, we can also identify any potential biases in the training data that the model may have learned.
We have analyzed our DNN to determine the most significant weights of feature importance. The BrS phenotype is most characterized by the type 1 and type 2 patterns. The type 1 pattern is characterized by a 'coved type' ST-segment, while the type 2 pattern is characterized by a 'saddle-back type' ST-segment elevation of ≥ 1 mm in leads V1 to V3. A DNN classification model that aims to distinguish between these distinct type 1 and 2 patterns would likely place significant emphasis solely on the ST segment of leads V1 to V3. However, while the DNN does assign some importance to these regions, it also considers more complex and multi-faceted features, including, but not limited to, a non-linear combination of the ST segments from V1 and V2 and the QRS complexes in leads I, III, V1, V2, V6, and aVL. Importantly, the learning algorithm was not provided with any type of ECG classification labeling; it therefore formed its own representation of the important features indicative of a positive response to the SCB challenge. It is also important to point out that the DNN puts no importance on the 11 areas where the representative beats from each lead are concatenated together end to end.

Table S3. Demographics for the ECG dataset after ECG quality thresholding, factored according to ECG type, sex, and age. The Claris cohort contains 1,154 subjects, while the Mortara cohort contains a total of 370 exclusive subjects.
The ajmaline SCB challenge presents a finite risk of death. The extensive experience of the San Donato BrS screening program, and the standby availability of a life-support team with a capacity for extracorporeal membrane oxygenation (ECMO), minimize this risk (16). Nevertheless, as noted in the text, the occurrence of potentially life-threatening arrhythmias is well documented in the literature, and the fact that the risk is low does not mean the risk is zero. The San Donato centre has experienced a major arrhythmic event rate of <0.05% without any deaths.

Summary of sub-cluster correlations
Within the Claris dataset, 73 of the 75 patients with a type 2 or 3 waveform responded positively to an ajmaline challenge and were diagnosed BrS(+). The validation exercise just described correctly identified 72 patients as positive, for a sensitivity of 97.3%. When applied to the entirely separate Mortara dataset, a completely independent set of Claris-trained DNN models correctly identified 13 of 13 BrS(+) patients, for a sensitivity of 100%. The results thus offer little room for improvement by reclassification. Although this classification success is interesting (and gratifying), the numbers of such patients are small, and we do not wish to associate a perceived type 2 or 3 pattern with a definitive diagnosis of BrS. Indeed, a bias along these lines might have caused the BrS(+) diagnoses of MD1 to include so many apparent BrS phenocopies.

Convolutional Neural Network (CNN) Processing
Our convolutional neural network (CNN) works by optimizing a set of features learned from the 9-lead ECG data. We train the CNN using the ECG data from subjects in the Claris cohort, and then validate the trained CNN with ECGs in the Mortara cohort. The raw traces are subject to the same filtering and denoising steps outlined in the Methods section. Each trace is then down-sampled from 1000 Hz to 250 Hz before input into the CNN. We determined the structure of the CNN presented in Table S8 by an exhaustive grid search of the following hyperparameters (optimized values highlighted):
• Pooling strides: 1, 2, 3
• Gradient descent optimizers: adam, adagrad, rmsprop, sgd
• Dropout layers
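The 1000 Hz to 250 Hz down-sampling step can be sketched as follows. The moving-average anti-alias filter here is a simplified stand-in, not the filtering actually used; a production pipeline would use a proper low-pass decimator such as `scipy.signal.decimate`.

```python
import numpy as np

def downsample_ecg(trace, factor=4):
    """Down-sample an ECG trace, e.g. 1000 Hz -> 250 Hz with factor=4.
    A moving average over `factor` samples serves as a crude anti-alias
    step before decimation."""
    kernel = np.ones(factor) / factor
    smoothed = np.convolve(trace, kernel, mode="same")
    return smoothed[::factor]

fs = 1000
t = np.arange(0, 2, 1 / fs)            # 2 s of signal at 1000 Hz
trace = np.sin(2 * np.pi * 1.2 * t)    # ~72 bpm sinusoid stand-in for an ECG
ds = downsample_ecg(trace, factor=4)   # effectively 250 Hz
```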

Convolutional Neural Network Validation Results
Below, Table S9 details the prediction accuracy of the 9-lead CNN model by ECG morphology, age, and sex when applied to the Mortara independent validation cohort. Among these factors, the type 1 patient subgroup, with an accuracy of 100% (27 of 27), vastly exceeds that of the total cohort. The CNN methodology applied to develop the classification model for the validation detailed above shares similarities with previous research exploring the ability of machine learning models to recognize signs of BrS in clinical ECGs. In every case, those studies trained neural networks to recognize the appearance of a coved ST elevation (type 1 BrS signature) in minimally processed ECG traces. The prediction success of these studies, as measured by accuracy (correctly classified subjects as a percent of the total validation set), ranged from 76.9% to 93.8%. The CNN model above, trained to predict the result of the BrS(+) SCB challenge in an ECG of any type, achieves an accuracy of 100% in recognizing type 1 ECGs, both by leave-one-out cross-validation (LOOCV) in the training cohort of 103 patients, as described in the text, and by independent classification of the 27 patients in the Mortara validation cohort.

The most important distinction of our approach is its demonstrated capacity to identify the Brugada Syndrome phenotype in all ECGs, including the majority that appear normal. In addition, our study differs from others in its ECG preprocessing methodology. Many machine learning (ML) applications, including those by Liu et al. and Dimitri et al., use convolutional layers to extract features from minimally processed 12-lead ECG traces. In contrast, our methodology employs a powerful preprocessing step that reduces the dimensionality of the input ECG data to a single, higher-fidelity representative beat for each lead. This step improves the performance of the learning process by providing a superior representation of the ECG data, as demonstrated by the comparison of our DNN (AUC 0.934 ± 0.027, accuracy 88.4%) with our best CNN (AUC 0.863 ± 0.041, accuracy 83.5%). Additionally, when we apply a CNN to our representative beat instead of a minimally filtered ECG trace, we see little difference in the learning performance in our training cohort, indicating significant added value in our preprocessing.
Our method offers several advantages in terms of machine learning (ML) training and deployment. One advantage is that the additional preprocessing step has a minimal impact on deployment time while still providing significant benefits through dimensionality reduction. This reduction in dimensionality shortens the training time for producing our final DNN models when compared with traditional convolutional neural network (CNN) models. This distinction is beneficial, as it allows us to easily retrain the models with new data and to scale the use of our method to larger datasets in the future.
Our study distinguishes itself with its innovative ECG preprocessing methodology, which improves the performance of the learning process by providing a superior representation of the ECG data, as well as with its ability to identify patients with seemingly normal ECGs who would respond positively to a sodium channel blocker challenge, something not addressed in the aforementioned studies. The added value of our preprocessing, and its distinction from Liu et al. and Dimitri et al., sets our study apart from the currently published literature and adds value for researchers and physicians in the field.

Advantages:
• Simple to implement
• Rapid deployment (applicable to streaming data)
• Minimal or no data preprocessing needed
• Generally yields satisfactory results

Disadvantages:
• Prolonged model training time
• Classifies based on a set of arbitrarily learned features
• Black box learning process

R-Peak Based ECG Segmentation using Wavelet Transform
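A minimal sketch of wavelet-based R-peak detection in this spirit follows. The Ricker wavelet widths, the half-maximum threshold, and the 0.3 s refractory period are illustrative assumptions, not the published CWT parameters.

```python
import numpy as np

def ricker(points, a):
    """Ricker ('Mexican hat') wavelet; its sharp central lobe makes it a
    common matched filter for the QRS complex."""
    t = np.arange(points) - (points - 1) / 2
    amp = 2 / (np.sqrt(3 * a) * np.pi ** 0.25)
    return amp * (1 - (t / a) ** 2) * np.exp(-(t / a) ** 2 / 2)

def detect_r_peaks(trace, fs=1000, widths=(8, 12, 16), min_rr=0.3):
    """Locate R peaks by summing the trace's correlation with Ricker
    wavelets at several widths, then keeping local maxima above half the
    global maximum that are at least min_rr seconds apart."""
    response = sum(np.convolve(trace, ricker(10 * w, w), mode="same")
                   for w in widths)
    thresh = 0.5 * response.max()
    peaks, last = [], -fs  # allow a peak at the very start of the trace
    for i in range(1, len(response) - 1):
        if (response[i] > thresh
                and response[i - 1] <= response[i] >= response[i + 1]
                and i - last >= min_rr * fs):
            peaks.append(i)
            last = i
    return np.array(peaks)

# Synthetic 4 s trace with Gaussian "R peaks" every 0.8 s (75 bpm).
fs = 1000
n = 4 * fs
centers = (500, 1300, 2100, 2900, 3700)
ecg = np.zeros(n)
for c in centers:
    ecg += np.exp(-0.5 * ((np.arange(n) - c) / 8.0) ** 2)
peaks = detect_r_peaks(ecg, fs=fs)
```

Once the peak indices are known, segmentation reduces to slicing a fixed window (for example 750 ms) around each index.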

Figure S1:
Figure S1: Example of R peak location results using the autoencoder algorithm. (Top) Segmentation map, where identified R peaks are assigned a value of 1, and all other points 0. (Middle) Raw ECG trace with R peak locations identified. (Bottom left) Stack of segmented heartbeats, based on R peak locations. (Bottom right) The same stack of segmented heartbeats, after cluster analysis has identified and removed outliers.

Figure S2:
Figure S2: Autoencoder algorithm description. (Top) Graphical representation of the data pipeline, from raw ECG traces (left) to R peak segmentation map (right). (Bottom) Detailed description of the AE CNN structure.

Figure S4.
Figure S4. Typical outlier detection plots for a stack of ECG traces. Outliers are indicated in red. A: Hotelling's T² confidence ellipse overlaid on principal components 1 and 2. B: Leverage h plotted against Pearson's correlation coefficient for each ECG segment. C: The ECG stack, with rejected traces indicated in red.

Figure S5.
Figure S5. Typical example of a DBSCAN analysis on a stack of ECG traces in PCA space.
1) We check for convergence between the training and testing loss functions in the DNN training step. 2) We employ LOOCV. For each subject, we exclude that one subject from the subset of ECG data used in DNN training and validation. Through a 7-fold CV partitioning scheme of the DNN training subset, we apply these DNNs to each reserved subject to generate predictions of the BrS diagnosis, along with concomitant DNN scores. 3) As a final test, we train neural networks after having randomly permuted the diagnosis vector. This process creates a model with no deterministic correlation between the ECG data and the BrS type 1 pattern; such a model should have no better than a random chance of classifying an ECG correctly (≈50% accuracy). We have performed this randomization process dozens of times and observed that the resulting trained neural networks consistently have accuracies near 50%. Additionally, the DNN model accurately classifies independent ECGs recorded in separate circumstances using the Mortara ELI™ 350 system.
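The label-permutation check in step 3 can be sketched on synthetic data. A tiny logistic-regression classifier stands in for the DNN here, and all data sizes and parameters are illustrative assumptions.

```python
import numpy as np

def train_logistic(X, y, epochs=200, lr=0.1):
    """Tiny logistic-regression classifier standing in for the DNN."""
    w, b = np.zeros(X.shape[1]), 0.0
    for _ in range(epochs):
        p = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probability
        grad = p - y                             # d(cross-entropy)/d(logit)
        w -= lr * X.T @ grad / len(y)
        b -= lr * grad.mean()
    return w, b

def accuracy(X, y, w, b):
    return float(np.mean(((X @ w + b) > 0) == y))

rng = np.random.default_rng(42)
# Synthetic "features": class 1 is shifted by +1 in every dimension.
X = np.vstack([rng.normal(0.0, 1.0, (200, 10)),
               rng.normal(1.0, 1.0, (200, 10))])
y = np.repeat([0, 1], 200).astype(float)

w, b = train_logistic(X, y)
true_acc = accuracy(X, y, w, b)     # labels informative: well above chance

y_perm = rng.permutation(y)         # destroy the feature-label correlation
w2, b2 = train_logistic(X, y_perm)
perm_acc = accuracy(X, y_perm, w2, b2)  # should hover near 0.5
```

A model trained on permuted labels that nonetheless scores well above chance would indicate leakage or overfitting, which is the failure mode this check is designed to expose.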

Figure S6.
Figure S6. DNN training curves, y-vector permutation overfitting test training curves, and y-vector permutation overfitting test ROCs. For each training curve and dataset, the results of the cross-entropy loss function (top) and accuracy (bottom) are shown as a function of training epoch. The resulting LOOCV ROC curve is shown for each dataset under the y-vector permutation overfitting test.

Figure S6 shows the results of tests for overfitting in the calibration of DNN models that classify ECGs. For each training curve, the top plot shows how the value of the binary cross-entropy function minimizes with the training epoch, and the bottom plot shows the consequent maximization of the model accuracy. In the DNN training curve, the first criterion of overfitting (outlined in the Methods section) is met by the convergence between the training (blue) and testing (orange) curves. This confirms the absence of overfitting and underfitting of the training and testing subsets used in the DNN weight optimization procedure. We satisfy a second criterion for overfitting by a combination of LOOCV for independent holdout testing and 7-fold cross-validation to reduce training and testing subgroup biases. We apply a third test for overfitting by assessing the performance of a DNN trained with a diagnosis vector that is randomly permuted to remove the ECG-diagnosis correlation. The training curves for each y-permuted dataset in Figure S6 show a minimization of the binary cross-entropy function (top); however, the overall accuracy of the DNN remains stagnant at about 50%.

Figure S7.
Figure S7. Subject 1233, male, survived cardiac arrest. No reported family history of BrS or SCD. (Top) Raw ECG traces for a BrS(+) patient, presenting the ST-elevation characteristic of a spontaneous type 1 diagnosis in the absence of ajmaline. (Bottom) ECG traces showing single average representative heartbeats measured for this patient at nine unique lead positions, without ajmaline administration. Patients in this class do not receive ajmaline. The deep neural network assigned a DNN score of ŷ = 0.978 ± 0.008.

Figure S8.
Figure S8. Subject 274, male, with a family history of SCD (brother). Here, we see a type 2 pattern in the baseline ECG converting to a type 1 pattern after the ajmaline test, confirming the BrS(+) diagnosis. (Top) Raw ECG traces classified by the DNN as BrS(+), before the administration of ajmaline. (Center) Raw ECG traces for the same patient after administration of ajmaline. (Bottom) ECG traces showing single average representative heartbeats measured for this patient at nine unique lead positions, before and after ajmaline administration. The deep neural network assigned a DNN score of ŷ = 0.933 ± 0.017.

Figure S9.
Figure S9. Subject 851, male, affected by atrial fibrillation and a suspicious ECG pattern associated with a life-threatening ventricular arrhythmia during flecainide therapy. (Top) Raw ECG traces forming a suspicious ECG pattern classified by the DNN as BrS(+), before the administration of ajmaline. (Center) Raw ECG traces for the same patient after administration of ajmaline, revealing a type 1 pattern. (Bottom) ECG traces showing single average representative heartbeats measured for this patient at nine unique lead positions, before and after ajmaline administration. The deep neural network assigned a DNN score of ŷ = 0.773 ± 0.063.

Figure S10.
Figure S10. Subject 139, male, survived cardiac arrest, with a family history of SCD (grandfather and two uncles). (Top) Raw ECG traces showing a suspicious pattern with an elevated J point in the high right precordial leads recorded from the 2nd to 4th intercostal spaces (V1 to V6). (Center) Raw ECG traces for the same patient after administration of ajmaline, showing the occurrence of a type 1 BrS pattern. (Bottom) ECG traces showing single average representative heartbeats measured for this patient at nine unique lead positions, before and after ajmaline administration. The deep neural network assigned a DNN score of ŷ = 0.932 ± 0.033.

Figure S11.
Figure S11. Subject 1111, female, with a family history of BrS and SCD. The normal baseline ECG pattern did not convert to type 1 after the ajmaline challenge, thus excluding a BrS diagnosis. (Top) Raw ECG traces classified by the DNN as BrS(-), before the administration of ajmaline. (Center) Raw ECG traces for the same patient after administration of ajmaline. (Bottom) ECG traces showing single average representative heartbeats measured for this patient at nine unique lead positions, before and after ajmaline administration. The deep neural network assigned a DNN score of ŷ = 0.221 ± 0.031.

Figure S12.
Figure S12. Subject 169, female, reported syncope during fever. Family history of BrS (mother affected, carrying a pathogenic SCN5A variant) and SCD (brother). She proved negative for SCN5A variants. (Top) Raw ECG traces classified by the DNN as BrS(-), before the administration of ajmaline, despite a suspicious type 2 baseline ECG pattern in V2 at the II intercostal space. (Center) Raw ECG traces for the same patient after administration of ajmaline, confirming a negative diagnosis. (Bottom) ECG traces showing single average representative heartbeats measured for this patient at nine unique lead positions, before and after ajmaline administration. The deep neural network assigned a DNN score of ŷ = 0.304 ± 0.132.

Figure S13.
Figure S13. Subject 466, male, with previous syncope and a family history of SCD (father and cousin) and BrS (brother). The patient showed a suspicious incomplete right bundle branch block in the high right precordial leads, which did not convert to a type 1 BrS pattern after the ajmaline challenge. (Top) Raw ECG traces classified by the DNN as BrS(-), before the administration of ajmaline. (Center) Raw ECG traces for the same patient after administration of ajmaline. (Bottom) ECG traces showing single average representative heartbeats measured for this patient at nine unique lead positions, before and after ajmaline administration. The deep neural network assigned a DNN score of ŷ = 0.251 ± 0.110.

Figure S14.
Figure S14. Data partitioning strategy for training a neural network. The overall data partitioning scheme implements leave-one-out cross-validation (LOOCV) to assess the performance of a DNN for each subject. For each subject, 7-fold cross-validation is used to separate the remaining dataset into seven training and validation subsets, each tested on the holdout, in an effort to reduce overfitting and improve generalization. This analysis yields an average score and standard deviation for each subject across the 7-fold DNNs.
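The partitioning described in Figure S14 can be sketched as follows; the subject count and the random fold assignment are illustrative.

```python
import numpy as np

def loocv_with_kfold(n_subjects, k=7, seed=0):
    """Yield (holdout, folds): for each held-out subject (LOOCV), the
    remaining subjects are shuffled and split into k cross-validation
    folds, each defining one training/validation partition. The k models'
    scores on the holdout are later averaged into the per-subject mean
    and standard deviation."""
    rng = np.random.default_rng(seed)
    for holdout in range(n_subjects):
        rest = np.array([i for i in range(n_subjects) if i != holdout])
        rng.shuffle(rest)
        yield holdout, np.array_split(rest, k)

# Example: 15 subjects -> 15 holdouts, each paired with 7 folds of the rest.
parts = list(loocv_with_kfold(15))
```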

Figure S15.
Figure S15. DNN feature importance. A 12-lead representative ECG trace, which is used as an input to the DNN for each subject. The colormap overlays the total sum of the DNN input weights for each segment of the representative ECG trace on a log scale. The weights range from a minimum of 0.0001 to a maximum of 0.3021.
Unlike Liu et al. and the other cited researchers (Dimitri et al., Liao et al., and Nakamura et al.), the training of our DNN does not confine it to recognizing BrS only when signified by a spontaneous type 1 ST-elevation.

• Leave One Out Cross Validation (LOOCV) and Ensemble Learning: Figure S14
• DNN Feature Importance in the 12 Lead ECG: Figure S15
• Details in the Prediction of Brugada Syndrome for the Claris Training Cohort: Table S1
• Details in the Prediction of Brugada Syndrome for the Mortara Validation Cohort: Table S2
• Demographics of the Training and Validation Cohorts: Tables S3, S4
• Details of the Classification of Brugada Syndrome
• Detailed Clinician Results: Table S9

• Store the extracted windows in rep_stack_12lead.
10. Apply cluster analysis to remove outlier beats, then calculate the representative beat by taking the median of the remaining beats.
11. If clustering_opt is set to False, calculate the representative beat by taking the median across all beats.
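Steps 10 and 11 can be sketched as follows. The z-score rejection rule below is a simple stand-in for the cluster analysis described above, and the data are synthetic; only the `clustering_opt` flag mirrors the steps themselves.

```python
import numpy as np

def representative_beat(stack, clustering_opt=True, z_thresh=2.5):
    """Collapse a stack of segmented beats (n_beats x n_samples) into one
    representative beat. With clustering_opt=True, beats whose correlation
    with the median beat is a low outlier (z-score rule, a simple stand-in
    for the cluster analysis) are rejected before the final median."""
    if clustering_opt and len(stack) > 2:
        ref = np.median(stack, axis=0)
        corr = np.array([np.corrcoef(b, ref)[0, 1] for b in stack])
        z = (corr - corr.mean()) / (corr.std() + 1e-12)
        stack = stack[z > -z_thresh]   # keep well-correlated beats only
    return np.median(stack, axis=0)

# 10 nearly identical sinusoidal "beats" plus one pure-noise outlier
# (750-sample windows standing in for 750 ms at 1000 Hz).
rng = np.random.default_rng(1)
t = np.linspace(0.0, 0.75, 750)
beat = np.sin(2 * np.pi * 4 * t)
stack = np.vstack([np.tile(beat, (10, 1)) + rng.normal(0, 0.01, (10, 750)),
                   rng.normal(0, 1.0, (1, 750))])
rep = representative_beat(stack)
```

Because the median is taken after outlier rejection, the single noise beat has essentially no influence on the representative beat.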

Table S1.
Table S1. Clinical characteristics of the Claris (training) cohort. *The genetic test result was available in 380 subjects in the BrS Group and in 43 subjects in the Control Group at the time of the analysis.

Table S2.
Table S2. Clinical characteristics of the Mortara (validation) cohort. The genetic test result was available in 26 subjects in the BrS Group and in 7 subjects in the Control Group at the time of the analysis.

Demographics of the Training and Validation Cohorts
Table S5 details the prediction accuracy of the 9-lead DNN regression model by ECG morphology, age, and sex. Among these factors, the prediction accuracy varies to a statistically significant degree only in the normal and suspicious ECG morphology subgroups (78.6% (367 of 467) and 87.0% (508 of 584), respectively). However, diagnostic accuracy in the type 1 patient subgroup, 100% (103 of 103), vastly exceeds that in the normal subgroup.

Table S5.
Table S5. Results of the best performing DNN broken down by subject sub-clusters. The 9-lead DNN is the best performing network for the training cohort, gauged by its AUC value. p-values are calculated by a χ² test comparing accuracy within the ECG type (Normal / Suspicious), sex, and age cohorts. 95% confidence intervals for AUC-ROC values are calculated using DeLong's method.
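The χ² comparison behind these p-values can be sketched as follows. This is the standard Pearson χ² statistic for a 2 × 2 table (1 degree of freedom, no continuity correction); whether the published analysis applied a continuity correction is not stated, so treat this as illustrative.

```python
import numpy as np

def chi2_accuracy_test(correct_a, total_a, correct_b, total_b):
    """Pearson chi-square statistic (1 dof, no continuity correction) for
    comparing classification accuracy between two sub-clusters; the
    p-value follows from the chi-square survival function with 1 dof."""
    observed = np.array([[correct_a, total_a - correct_a],
                         [correct_b, total_b - correct_b]], dtype=float)
    expected = (observed.sum(axis=1, keepdims=True)
                * observed.sum(axis=0, keepdims=True) / observed.sum())
    return float(((observed - expected) ** 2 / expected).sum())

# Normal vs. suspicious subgroup accuracies quoted in the text:
stat = chi2_accuracy_test(367, 467, 508, 584)   # 78.6% vs 87.0%
```

A statistic above 3.84 corresponds to p < 0.05 at 1 degree of freedom, consistent with the significant difference reported between the normal and suspicious subgroups.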

Table S7 details the prediction accuracy of the 9-lead DNN model by ECG morphology, age, and sex when applied to the Mortara independent validation cohort. All subgroups classify statistically similarly (p > 0.05) to the main comparison subgroup.

Table S7.
Table S7. Validation of the Mortara cohort by the 9-lead DNN, broken down by subject sub-clusters. p-values are calculated by a χ² test comparing accuracy within the ECG type (All / Type 1 / Type 2 / Other), sex, and age cohorts. 95% confidence intervals for AUC-ROC values are calculated using DeLong's method.
We determined the structure of the CNN presented in Table S8 by an exhaustive grid search of hyperparameters, including pooling strides (1, 2, 3), the gradient descent optimizer, and the use of dropout layers (optimized values highlighted in Table S8).

Table S8.
Table S8. CNN structure used for validation with the Mortara (validation) cohort.

Table S9.
Table S9. Results of the Mortara cohort validation by the 9-lead CNN, broken down by subject sub-clusters. p-values are calculated by a χ² test comparing accuracy within the sex and age cohorts to the overall Mortara cohort accuracy. 95% confidence intervals for AUC-ROC values are calculated using DeLong's method.