Victor Barreto Mesquita, Florêncio Mendes Oliveira Filho, Paulo Canas Rodrigues, Detection of crossover points in detrended fluctuation analysis: an application to EEG signals of patients with epilepsy, Bioinformatics, Volume 37, Issue 9, May 2021, Pages 1278–1284, https://doi.org/10.1093/bioinformatics/btaa955
Abstract
The quantification of long-range correlation in electroencephalogram (EEG) signals is an important research direction because it helps to understand brain activity. Epileptic seizures have been studied in recent years using different non-linear statistical approaches to understand the relationship between the EEG signal and the epileptic discharge. One of the most widely used methods to analyse long memory processes is the detrended fluctuation analysis (DFA). However, no objective and pragmatic methods have been developed to detect crossover points and reference channels in DFA.
In this article, we propose: (i) two automatic approaches that successfully detect crossover points on the log–log plot in DFA-related methods and (ii) a criterion to choose the reference channel for the log-amplitude function. Moreover, the DFA is applied to EEG signals of 10 epileptic patients collected from the CHB-MIT database, and the log-amplitude function is used to compare the different brain hemispheres by means of the methodology proposed in the article. The existence of long-range power-law correlations is demonstrated and indicates that the EEG signals of epileptic patients present three well-defined regions, with the first region showing 1/f noise (pink noise) for seven subjects and a random walk behaviour for three subjects. The second and third regions show anti-persistent behaviour. Moreover, the results of the log-amplitude function were divided into two groups: the first, including seven subjects, where an increase in the scales results in an increase in the fluctuation in the frontal channels, and the second, including three subjects, where the fluctuations for large scales are greater for the parietal channels.
The functions used in this article are available in the R package DFA (Mesquita et al., 2020).
Supplementary data are available at Bioinformatics online.
1 Introduction
Epilepsy affects about 50 million people around the world, and ∼75% of cases begin in childhood (Stafstrom and Carmant, 2015). This disorder may be characterized by an enduring alteration in brain activity, causing seizures or involuntary periods of unusual behaviour, movement, sensation or consciousness (Ngugi et al., 2010). These events are unpredictable and may occur at any time, resulting in an increased risk of serious injuries (Shoeb, 2009) and neuronal damage (Sutula et al., 2003).
In 2010, the Commission on Classification and Terminology of the International League Against Epilepsy published a revised terminology that classified seizures into three categories (Berg et al., 2010): focal (seizures involving only part of the brain), generalized (seizures involving both sides of the brain) and epileptic spasms. A seizure can begin focally and later generalize (Stafstrom and Carmant, 2015), and the success or failure of epilepsy surgery depends on the location of the epileptic focus (Acar et al., 2007). The history and neurologic examination are still the main tools for diagnosing seizures and epilepsy, whereas laboratory evaluations serve as adjunct tests. Techniques such as computed tomography and magnetic resonance imaging are broadly used due to their high spatial resolution, but have the disadvantage of the high cost of the required equipment. For the evaluation of a patient with seizures, techniques such as magnetoencephalography (MEG), with high temporal and spatial resolution, are also employed. However, they have little mobility and are relatively expensive (Singh, 2014). An electroencephalogram (EEG) is a recording of the brain's electrical activity (Stafstrom and Carmant, 2015) and provides useful information about the brain state (Subha et al., 2010). The EEG method has been widely used to study the brain because: (i) it has a lower hardware cost (Schultz, 2012); (ii) it records with high temporal resolution, in the order of milliseconds (Hämäläinen et al., 1993); and (iii) it is non-invasive, because the required electrodes are placed on the scalp (Baravalle et al., 2018). Because of this, EEG findings and ancillary information are often used by physicians to assess the best way to deal with epilepsy/seizures.
The detrended fluctuation analysis (DFA), originally proposed by Peng et al. (1994), has been widely used to measure the fluctuation of non-stationary time series in the temporal domain. The DFA has been frequently used in areas such as economics (Alvarez-Ramirez et al., 2008; Ferreira et al., 2020; Guedes et al., 2019), physiology (Castiglioni et al., 2011) and environmental sciences (Brito et al., 2019; Yue et al., 2010). Its main advantage over other conventional methods is that it prevents the detection of spurious correlations embedded in the series, which may be artefacts or related to external trends (Peng et al., 1994, 1995). DFA has also proven to be useful for EEG signals, as these signals are often contaminated with artefacts (Acar et al., 2007). The DFA has also been used in studies about epilepsy. For example, Simozo et al. (2014) applied a method of band classification to detect changes in the power-law coefficient, Shalbaf et al. (2009) used features of DFA and the SD of the EEG signals in a simple linear discriminant analysis classifier to differentiate three seizure states, and Zhou et al. (2007) applied the DFA to EEG signals of epileptic rats.
In the context of DFA, the log–log plot usually presents crossover points (CPs) and, often, more than one least-squares fit is necessary to fully comprehend all the properties of the time series (Zebende et al., 2017b). Ge and Leung (2013) employed segmented regression to identify CPs in avian-influenza outbreaks, but did not discuss the situations where the curves present crossover regions at the end of the series that might lead to negative scaling exponents. The identification of these very important CPs is usually done by the analyst without an objective and pragmatic methodology.
In this article, we propose two approaches for the automatic detection of CPs in DFA, based on statistical and mathematical concepts. Moreover, we also propose an objective criterion to choose a reference channel for the delta-log function that successfully detects CPs, based on the concept of area under the curve (AUC).
The proposed methods are applied, in combination with the DFA, to EEG signals of 10 paediatric subjects with intractable seizures, and the different hemispheres of the brain are compared by means of the log-amplitude function and the methodology proposed in this article.
The rest of the article is organized as follows. Section 2 describes the data used in this article, together with the existing and proposed methodology. Section 3 is dedicated to the presentation and discussion of the results. The concluding remarks are drawn in Section 4.
2 Materials and methods
2.1 The data
In this article, we consider a subset of a database that includes EEG recordings from 24 paediatric subjects with intractable seizures, collected at the Children’s Hospital Boston, publicly available on the PhysioNet website (https://physionet.org/content/chbmit/1.0.0/) and fully described in Shoeb (2009). The subjects were monitored for several days in order to characterize their seizures and assess their candidacy for surgical intervention. All signals were sampled at 256 samples per second with 16-bit resolution. Most files contain 23 EEG signals (24 or 26 in a few cases), using the standard International 10–20 system of EEG electrode positions (Fig. 1). In total, the collection includes 664 .edf files with recordings of 1, 2 or 4 h. Of these, 129 files contain one or more seizures, for a total of 198 seizures.

Fig. 1. Distribution of the positions of the electrodes on the scalp, according to the 10–20 system of EEG electrode positions: frontal (blue), central (dark green), temporal (light green), parietal (red) and occipital (dark red). Obtained from the MATLAB toolbox EEGLAB.
The methodology described in this article was applied to a subset of the full data obtained as follows: (i) 10 subjects were randomly selected from the subjects, among the total set of 24, that had at least 1 seizure and records of 1 h (in this case, the subjects chb01, chb02, chb03, chb05, chb08, chb11, chb18, chb20, chb22 and chb24); (ii) for each of these 10 subjects, 1 period of 1 h was randomly selected from those periods that included exactly one seizure; and (iii) all 22 channels from that individual in that period of 1 h were considered in the analysis. This resulted in 10 datasets (individuals) with 22 columns (channels) and 921 600 rows (1 h of EEG signals at 256 samples per second).
2.2 Detrended fluctuation analysis
The DFA, originally proposed by Peng et al. (1994), has been widely used to quantify the long-range correlation in non-stationary time series. Its main advantage over other conventional methods is that it prevents the detection of spurious correlations embedded in the series, which may be artefacts or related to external trends. The DFA can be performed by considering the following steps:
- (i) Let $X_N = \{x_1, x_2, \ldots, x_N\}$ be a time series of length $N$. The time series $X_N$ is integrated by subtracting its mean and taking the cumulative sum, as follows

$$y(k) = \sum_{i=1}^{k} (x_i - \bar{x}), \quad k = 1, \ldots, N, \tag{1}$$

where $\bar{x}$ is the mean value of $X_N$;
- (ii) The integrated signal $y(k)$ is then divided into intervals (boxes) of equal length $n$. For each of these intervals, the trend is fitted by a polynomial of order $m$ using least squares. The local signal for each box, $y_n(k)$, represents the trend in each interval and indicates the ordinate of the fit;
- (iii) The trend of $y(k)$ is removed by subtracting the local trend $y_n(k)$ in each box (of length $n$);
- (iv) The fluctuation, or root mean square (RMS) deviation of the detrended series, $F_{DFA}(n)$, is obtained in this step by computing

$$F_{DFA}(n) = \sqrt{\frac{1}{N} \sum_{k=1}^{N} \left[ y(k) - y_n(k) \right]^2}; \tag{2}$$

- (v) Repeat the procedure described above over a range of different time scales (box lengths $n$), and plot the log–log graph of $F_{DFA}(n)$ against $n$.
The relationship between $n$ and $F_{DFA}(n)$ is expected to behave as the power law $F_{DFA}(n) \sim n^{\alpha_{DFA}}$. The scaling exponent $\alpha_{DFA}$ is analogous to the Hurst exponent and is calculated as the slope of the straight line fitted by least squares to $\log F_{DFA}(n)$ against $\log n$. When $\alpha_{DFA}$ is between 0 and 1.5 (Guedes et al., 2019; Zebende et al., 2017), its value provides information on the self-correlation of the series as follows:
- $0 < \alpha_{DFA} < 0.5$: anti-persistent series, i.e. there is a >50% probability of a negative value being preceded by a positive value and vice versa;
- $\alpha_{DFA} = 0.5$: uncorrelated series, i.e. the series behaves as a random walk;
- $0.5 < \alpha_{DFA} < 1$: series with persistent long-range correlation, i.e. there is a >50% probability of positive values being preceded by positive values and vice versa;
- $\alpha_{DFA} = 1$: $1/f$ noise, i.e. pink noise;
- $\alpha_{DFA} > 1$: non-stationary series;
- $\alpha_{DFA} = 1.5$: Brownian noise.
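The DFA steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation in the R package DFA: the function names `dfa_fluctuation` and `scaling_exponent`, the choice of non-overlapping boxes, the range of scales and the synthetic test signals are all hypothetical. The RMS is averaged over the points covered by complete boxes, which may be slightly fewer than $N$.

```python
import numpy as np
from numpy.polynomial import polynomial as P


def dfa_fluctuation(x, scales, order=1):
    """Return the RMS fluctuation F(n) for each box length n in `scales`."""
    x = np.asarray(x, dtype=float)
    y = np.cumsum(x - x.mean())              # integrated profile y(k)
    fluct = np.empty(len(scales))
    for j, n in enumerate(scales):
        n_boxes = len(y) // n                # non-overlapping boxes (assumption)
        boxes = y[: n_boxes * n].reshape(n_boxes, n)
        t = np.arange(n)
        coefs = P.polyfit(t, boxes.T, order)  # one polynomial trend per box
        trend = P.polyval(t, coefs)           # shape (n_boxes, n)
        fluct[j] = np.sqrt(np.mean((boxes - trend) ** 2))
    return fluct


def scaling_exponent(scales, fluct):
    """Slope of the least-squares line of log F(n) against log n."""
    slope, _intercept = np.polyfit(np.log10(scales), np.log10(fluct), 1)
    return slope


# Synthetic signals (not EEG data): white noise and its cumulative sum.
rng = np.random.default_rng(0)
white = rng.standard_normal(20_000)
scales = np.unique(np.logspace(np.log10(16), np.log10(1024), 20).astype(int))
alpha_white = scaling_exponent(scales, dfa_fluctuation(white, scales))
alpha_walk = scaling_exponent(scales, dfa_fluctuation(np.cumsum(white), scales))
```

For these synthetic inputs, the estimated exponent should be close to 0.5 for the white noise and close to 1.5 for its cumulative sum, which is a quick sanity check before applying the procedure to real signals.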
2.3 Log-amplitude function
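The log-amplitude function compares the DFA fluctuations of two channels, $xx$ and $yy$, on the log scale,

$$\Delta \log F_{xx,yy}(n) = \log F_{yy}(n) - \log F_{xx}(n), \tag{3}$$

where $F_{xx}(n)$ and $F_{yy}(n)$ denote the DFA fluctuation functions of the channels $xx$ and $yy$. It can be evaluated in the following way:
- If $\Delta \log F_{xx,yy}(n) > 0$, then the channel $yy$ presents larger RMS fluctuations than the channel $xx$;
- If $\Delta \log F_{xx,yy}(n) = 0$, then the channel $yy$ presents the same RMS fluctuations as the channel $xx$;
- If $\Delta \log F_{xx,yy}(n) < 0$, then the channel $yy$ presents smaller RMS fluctuations than the channel $xx$.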
Recent studies have used the log-amplitude function to compare the brain activity of different hemispheres (left/right and frontal/parietal) during tasks such as reading (Oliveira Filho et al., 2019), to study EEG signals before and after meditation (Ghosh et al., 2018; Hirekhan and Manthalkar, 2018), and to analyse public electrocardiogram and blood pressure data.
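As a small illustration of the sign convention described above, the following sketch computes the log difference between a reference channel yy and another channel xx, assuming it is defined as log F of yy minus log F of xx. The function name `log_amplitude` and the toy fluctuation values are hypothetical and are not taken from the EEG data.

```python
import numpy as np


def log_amplitude(f_ref, f_other):
    """Log difference log10 F_yy(n) - log10 F_xx(n), with yy the reference."""
    f_ref = np.asarray(f_ref, dtype=float)
    f_other = np.asarray(f_other, dtype=float)
    return np.log10(f_ref) - np.log10(f_other)


# Toy fluctuation values at three scales (illustrative only).
f_yy = np.array([2.0, 4.0, 8.0])    # reference channel yy
f_xx = np.array([1.0, 4.0, 16.0])   # comparison channel xx
delta = log_amplitude(f_yy, f_xx)

# Interpret each scale according to the sign of the log difference.
labels = np.where(delta > 0, "yy larger",
                  np.where(delta < 0, "yy smaller", "equal"))
```

A positive value at a given scale means that the reference channel fluctuates more than the comparison channel at that scale, so a well-chosen reference should yield mostly positive curves.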
2.4 Automatic detection of CPs in DFA
The log–log plot mentioned above usually presents CPs (Peng et al., 1994), which are the locations where the behaviour of the curve changes. Thus, for a better fit to the data, more than one least-squares fit is often necessary. In practice, these CPs are chosen by a subjective criterion based on visual identification, which might cause inaccurate results. Inappropriate and erroneous fits can lead to the wrong interpretation of the self-correlation of the series, because the interpretation of the behaviour of the time series is based on thresholds of the scaling exponent. For instance, a slope of 0.50 can be interpreted as a random walk behaviour, while a slope of 0.56 can be interpreted as a persistent behaviour.
Taking into account this scenario, and the lack of an objective procedure, we propose two criteria for the automatic detection of CPs based on statistical and mathematical concepts, which provide a step forward in the research related to DFA. The first criterion is based on the Euclidean distance and the second is based on the secant method. Detailed descriptions are presented in the next two subsections.
2.4.1 Euclidean method
The Euclidean method can be described as follows.
- (i) Let $r$ be the log fluctuation curve. Fit a straight line $s$ by linking the first ($P_1$) and the last ($P_2$) point of the curve $r$, where $P_1 = (x_1, y_1)$ and $P_2 = (x_2, y_2)$;
- (ii) For each point $P_0 = (x_0, y_0)$ between $P_1$ and $P_2$ on the curve $r$, obtain and store its Euclidean distance to the line $s$, defined by

$$d(P_0, s) = \frac{\left| (y_2 - y_1) x_0 - (x_2 - x_1) y_0 + x_2 y_1 - y_2 x_1 \right|}{\sqrt{(y_2 - y_1)^2 + (x_2 - x_1)^2}}; \tag{4}$$

- (iii) Find the position of the point with the largest distance, which will be the CP.

If there is more than one CP, the others can be obtained by repeating steps (ii) and (iii), considering: (a) $P_1$ to be the first CP if the new CP is located to its right; and/or (b) $P_2$ to be the first CP if the new CP is located to its left.
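The Euclidean method can be sketched as follows, assuming the distance in question is the perpendicular distance from each point of the curve to the straight line through the first and last points. The function name `euclidean_crossover` and the synthetic three-segment curve are hypothetical; they only illustrate the mechanics and the recursive search for a second CP.

```python
import numpy as np


def euclidean_crossover(log_n, log_f):
    """Index of the interior point farthest from the line through P1 and P2."""
    x = np.asarray(log_n, dtype=float)
    y = np.asarray(log_f, dtype=float)
    x1, y1, x2, y2 = x[0], y[0], x[-1], y[-1]
    # Perpendicular distance from every point to the line P1-P2.
    num = np.abs((y2 - y1) * x - (x2 - x1) * y + x2 * y1 - y2 * x1)
    dist = num / np.hypot(y2 - y1, x2 - x1)
    return int(np.argmax(dist[1:-1])) + 1    # skip the two end points


# Synthetic log-log curve with slopes 1.2, 0.3 and 0 (kinks at 10 and 20).
idx = np.arange(31)
x = idx / 10.0
y = np.where(idx <= 10, 1.2 * x,
             np.where(idx <= 20, 1.2 + 0.3 * (x - 1.0), 1.5))

first_cp = euclidean_crossover(x, y)
# Second CP: repeat on the part of the curve to the right of the first CP.
second_cp = first_cp + euclidean_crossover(x[first_cp:], y[first_cp:])
```

On real curves the maximum is less sharply defined than in this piecewise-linear toy example, so neighbouring scales may be returned for similar signals.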
2.4.2 Secant method
In this subsection, we provide an alternative for the automatic detection of CPs by using concepts of the secant method, widely used in numerical analysis. Its main idea is to approximate the position of the CPs using the average of the slopes at the two extremes of the log fluctuation curve. This method provides an advantage over the Euclidean method because no negative coefficients/exponents are obtained, which might happen with the Euclidean method. The adaptation of the secant method for the automatic detection of CPs can be described as follows:
- (i) Fit a regression line to the first points of the curve, and a second regression line to the last points of the same curve;
- (ii) Compute the mean of the two slopes obtained in (i);
- (iii) Fit lines by linking every two consecutive points in the curve between $P_1$ and $P_2$;
- (iv) Find the position of the first point with a slope smaller than the slope found in (ii), which will be the CP.
This method is valid when the points in the curve show an increasing pattern, due to the constraints that result from using only two consecutive points in (iii). If there is more than one CP, the others can be obtained by repeating the steps above considering the curve before and/or after the CP obtained in (iv).
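A minimal sketch of the secant-based steps is given below. The number of points used for the two regression lines is not specified here, so the parameter `k` (default 5) is an assumption, as are the function name `secant_crossover` and the choice of reporting the left end point of the first flat segment as the CP.

```python
import numpy as np


def secant_crossover(log_n, log_f, k=5):
    """Index of the first point whose local slope falls below the mean slope.

    `k` is a hypothetical choice for the number of points in each regression.
    """
    x = np.asarray(log_n, dtype=float)
    y = np.asarray(log_f, dtype=float)
    slope_first = np.polyfit(x[:k], y[:k], 1)[0]     # step (i), first points
    slope_last = np.polyfit(x[-k:], y[-k:], 1)[0]    # step (i), last points
    mean_slope = 0.5 * (slope_first + slope_last)    # step (ii)
    local = np.diff(y) / np.diff(x)                  # step (iii)
    below = np.flatnonzero(local < mean_slope)       # step (iv)
    return int(below[0]) if below.size else None


# Synthetic increasing curve: slope 1.2 up to index 10, then slope 0.2.
idx = np.arange(21)
x = idx / 10.0
y = np.where(idx <= 10, 1.2 * x, 1.2 + 0.2 * (x - 1.0))
cp = secant_crossover(x, y)
```

Because the local slopes use only two consecutive points, noisy curves may trigger the threshold early; this is consistent with the requirement that the curve shows an increasing pattern.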
2.5 Automatic identification of reference channels in log-amplitude
To apply the log-amplitude function, it is necessary to choose a reference channel. Currently, users make this choice by visual inspection of the raw data or by some other subjective criterion. The candidate reference channel is expected to present a greater fluctuation than the other channels in most of the scales. In practice, this means that the reference channel will present a higher curve than the other channels on the log-amplitude plot. Although this is an apparently simple choice, the graphical interpretation is sometimes unclear and the choice might be subjective.
In this article, we propose an objective and pragmatic procedure that helps in the identification of reference channels for the log-amplitude function, based on the AUC. For each subject, the AUC is computed for each channel by the trapezoidal rule on the log–log plot, and the reference channel for that subject is the one with the largest AUC. Under this rule, the AUC is evaluated by dividing the total area between the first and the last box into small trapezoids, and the total AUC is the sum of the areas of these small trapezoids. Formally, the steps to obtain the reference channel by the trapezoidal rule can be defined as follows:
- (i) Calculate the areas of all the small trapezoids between any two consecutive boxes under the log–log curve $r$, from the first ($P_1$) to the last ($P_2$) point on $r$. The area of each small trapezoid can be obtained as

$$A_i = \frac{r(n_i) + r(n_{i+1})}{2} \left( \log n_{i+1} - \log n_i \right); \tag{5}$$

- (ii) Sum the areas of all small trapezoids obtained in (i),

$$\mathrm{AUC} = \sum_{i=1}^{b-1} A_i, \tag{6}$$

with $n_i$ the $i$-th box between $P_1$ and $P_2$, $i = 1, \ldots, b$, $b$ the total number of boxes, and $r(n_i)$ the value of the curve $r$ at $n_i$.
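The AUC criterion can be sketched as follows, assuming the area is measured on the log–log plot, i.e. with the log of the box length on the abscissa and the log fluctuation on the ordinate. The function names `auc_trapezoid` and `reference_channel`, and the three toy curves, are hypothetical.

```python
import numpy as np


def auc_trapezoid(log_n, log_f):
    """Sum of the trapezoid areas between consecutive boxes on the curve."""
    x = np.asarray(log_n, dtype=float)
    y = np.asarray(log_f, dtype=float)
    areas = 0.5 * (y[:-1] + y[1:]) * np.diff(x)   # one trapezoid per interval
    return float(areas.sum())


def reference_channel(log_n, log_f_by_channel):
    """Return the index of the channel with the largest AUC and all AUCs."""
    aucs = [auc_trapezoid(log_n, curve) for curve in log_f_by_channel]
    return int(np.argmax(aucs)), aucs


# Toy log fluctuation curves for three channels (illustrative only).
log_n = [0.0, 1.0, 2.0]
curves = [
    [0.0, 1.0, 2.0],
    [1.0, 2.0, 3.0],
    [0.0, 0.5, 1.0],
]
ref, aucs = reference_channel(log_n, curves)
```

Selecting the channel with the largest area favours the curve that lies highest over most scales, which matches the expectation that the reference channel has the greatest fluctuation.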
3 Results and discussion
3.1 Descriptive statistics of the EEG signals
The main descriptive statistics, mean, SD, skewness and kurtosis, for the 22 channels in each of the 10 subjects are presented in Figure 2. Supplementary Figures S1–S10 show the EEG recordings for the subjects chb01_03, chb02_19, chb03_04, chb05_06, chb08_11, chb11_82, chb18_29, chb20_12, chb22_38 and chb24_13, respectively, for all the 22 channels under consideration: FP1–F7, F7–T7, T7–P7, P7–O1, FP1–F3, F3–C3, C3–P3, P3–O1, FP2–F4, F4–C4, C4–P4, P4–O2, FP2–F8, F8–T8, T8–P8, P8–O2, FZ–CZ, CZ–PZ, P7–T7, T7–FT9, FT9–FT10 and FT10–T8.

Fig. 2. Descriptive statistics for the EEG channels: (a) mean, (b) SD, (c) skewness and (d) kurtosis, for the 22 channels (horizontal axes) in each of the 10 considered subjects (different colours). Each statistic was obtained based on a sample of N = 921 600 observations.
From the analysis of Figure 2, the descriptive statistics do not show large differences between subjects. In general, the frontal channels displayed a higher mean and SD than the channels in other regions of the brain. Most of the channels/EEG signals had a positive skewness (heavy values above the mean) and showed a leptokurtic distribution (heavy tails). All statistics were obtained for EEG signals with a recording time of 1 h, resulting in N = 921 600 observations.
3.2 DFA
Supplementary Figures S11 and S12 show the fluctuation function $F_{DFA}(n)$ as a function of $n$ for the 22 channels in all 10 subjects, with crossovers obtained by the Euclidean and secant methods, respectively. The analysis of these figures indicates that the amplitude of the brain activity of the subjects with seizures is long-range correlated and multi-scaling. These long-range correlations imply that the current and future values of the EEG signals are influenced by previous values of the EEG signals. All EEG signals seem to have crossover phenomena at different time scales. The curves do not exhibit a single power-law behaviour with only one scaling exponent, but three well-defined regions with different average values of the scaling exponent. Table 1 shows a summary of the CPs and scaling exponents estimated for the EEG signals considering both the Euclidean and the secant methods.
Table 1. CPs, scaling exponents and fit errors (in parentheses) estimated for the EEG signals for the Euclidean and secant methods for all individuals
Subject | First CP (Euclidean) | Second CP (Euclidean) | First CP (secant) | Second CP (secant) | First region (Euclidean) | Second region (Euclidean) | Third region (Euclidean) | First region (secant) | Second region (secant) | Third region (secant) |
---|---|---|---|---|---|---|---|---|---|---|
chb01 | 28 | 56 | 26 | 42 | 1.2685 (0.0090) | 0.2136 (0.0938) | 0.0272 (0.0585) | 1.2828 (0.0035) | 0.3154 (0.0330) | 0.0380 (0.1050) |
chb02 | 34 | 68 | 33 | 47 | 1.0430 (0.0251) | 0.1738 (0.1787) | −0.0092 (0.3112) | 1.0522 (0.0196) | 0.3884 (0.0185) | 0.0215 (0.5482) |
chb03 | 33 | 67 | 31 | 44 | 1.0691 (0.0228) | 0.1477 (0.1625) | 0.0810 (0.0529) | 1.0985 (0.0149) | 0.4224 (0.0187) | 0.0608 (0.0684) |
chb05 | 26 | 54 | 24 | 40 | 1.2322 (0.0176) | 0.2112 (0.0726) | 0.0163 (0.2696) | 1.2686 (0.0079) | 0.343 (0.0297) | 0.0338 (0.2554) |
chb08 | 32 | 65 | 30 | 45 | 1.0500 (0.0323) | 0.1613 (0.1332) | 0.0097 (0.6953) | 1.0731 (0.0232) | 0.3601 (0.0202) | 0.0319 (0.3061) |
chb11 | 36 | 74 | 36 | 48 | 0.9699 (0.0272) | 0.1409 (0.2066) | 0.0001 (0.3541) | 0.9699 (0.0259) | 0.3861 (0.0189) | 0.0278 (0.4195) |
chb18 | 37 | 74 | 36 | 51 | 1.0303 (0.0225) | 0.1852 (0.2244) | 0.0116 (0.4646) | 1.0405 (0.02016) | 0.427 (0.02039) | 0.0294 (0.3233) |
chb20 | 33 | 67 | 32 | 45 | 1.0991 (0.0207) | 0.1845 (0.1457) | −0.0322 (0.3720) | 1.0991 (0.0170) | 0.3699 (0.0220) | 0.0258 (0.5506) |
chb22 | 35 | 71 | 34 | 48 | 0.9775 (0.0296) | 0.1459 (0.1698) | 0.0273 (0.1265) | 0.9867 (0.0261) | 0.3562 (0.0176) | 0.0354 (0.1842) |
chb24 | 28 | 57 | 27 | 41 | 1.2326 (0.0164) | 0.1797 (0.1304) | −0.0018 (0.2669) | 1.2818 (0.0073) | 0.4009 (0.0252) | 0.0232 (0.4442) |
For both the Euclidean and the secant methods, the amplitude of the EEG signals at small scales shows a strong long-range correlation, with average exponents close to one, except for subjects chb01, chb05 and chb24. However, for medium and large time scales, starting from the first 15 min of the EEG recording, a weak long-range correlation is exhibited because the mean values of the scaling exponents are <0.5. For small scales, the mean values of the scaling exponent indicate a non-stationary behaviour for subjects chb01, chb05 and chb24, representing unpredictability characteristics of the EEG signals, besides being associated with pathological changes of epileptic seizures (Klonowski, 2009). For the remaining seven subjects, the values close to one indicate a 1/f noise (pink noise) behaviour. The fit errors are very small in the first region for both the Euclidean and the secant methods. For the second region, the fit error shows a slight increase for the secant method and an overall larger increase for the Euclidean method. For the third region, because of its overall less linear behaviour associated with larger scales, the fit errors show a larger increase. Some of the larger fit errors arise because a linear model is being fitted to a possibly non-linear behaviour, and because the coefficients are obtained as the average over all channels in a given individual.
These patterns for the EEG signals are typically observed in signals generated by neurobiological systems (He, 2014; Ouyang et al., 2020; Van de Ville et al., 2010), as well as by physiological processes, such as neural network organization (Lipsitz, 2002; Lipsitz and Goldberger, 1992) and cognitive processes (Ihlen and Vereijken, 2010; Wijnants et al., 2013), and are associated with stability and adaptability in dynamic processes (Bak et al., 1987). However, for larger scales (n > 400), the weak long-range correlation indicates that there is a greater probability of negative values in the time series being preceded by positive values (and/or vice versa). The different degrees of long-range correlation are due to the fact that subjects suffering from epilepsy experience significant brain alterations during the seizures.
Our findings showed that the first CPs are similar for the Euclidean and secant methods, being identified at neighbouring time scales (Table 1). These small differences in the locations identified as CPs are due to the fact that the secant method is influenced by the slopes at the extremes of the curve. For the second CPs, there is a larger discrepancy in the locations among the individuals, which is explained by the construction of these two methods and is especially visible after detecting the first CP. The reason for this is that, after removing the first CP and the first part of the series, the observations become more similar, making it more difficult to accurately estimate the second CP. A similar pattern is found when comparing the coefficients of the three regions between methods.
In our analysis with 220 EEG signals (22 channels from each of the 10 subjects), no negative scaling exponent was found for the secant method. This was not the case for the Euclidean method, which resulted in three subjects with a negative scaling exponent in the third region. This happens because, in the Euclidean method, the decrease in the slope between any two points located on the log–log plot is seen as a decrease in the geometric distance between the same points. In both methods, the location of the following CP can be obtained recursively in a similar manner, but for a subset of the original series.
3.3 Log-amplitude DFA
Based on the AUC criterion, the reference channels were selected as follows: channel 15 (P8–O2) for chb01, channel 13 (FP2–F8) for chb02, channel 5 (FP1–F3) for chb03, channel 2 (F7–T7) for chb05, channel 19 (P7–T7) for chb08, channel 13 (FP2–F8) for chb11, channel 21 (FT9–FT10) for chb18, channel 17 (FZ–CZ) for chb20, channel 14 (F8–T8) for chb22 and channel 21 (FT9–FT10) for chb24.
Figure 3 shows, for each of the 10 subjects, the log-amplitude function as a function of n for all 22 channels, considering the reference channel as defined by the AUC criterion.

Fig. 3. Log-amplitude function as a function of n: each curve represents the log difference between the reference channel yy and the other channels xx. The reference channel was chosen by applying the AUC criterion. The values from 1 to 22 indicate the potential differences between electrode pairs (channels). On the left panel, from top to bottom, are the subjects chb01, chb03, chb08, chb18 and chb22, and on the right panel the subjects chb02, chb05, chb11, chb20 and chb24. Equal colours in different plots are associated with the difference between the same channels.
Based on the location of the reference channel and on the behaviour observed in Figure 3, we are able to divide the subjects into two groups. The first group, with the seven subjects chb02, chb03, chb08, chb11, chb18, chb22 and chb24, shows a larger amplitude around the frontal channels for most of the scales and a similar behaviour across channels: as the size of the scales increases, the fluctuation is concentrated around the frontal channels. It is worth highlighting that the potential difference between the channels FT9 and FT10, which was selected as the reference channel more than once, corresponds to the pair of electrodes with the largest interelectrode distance (Fig. 1) and can be associated with a high-voltage channel (Libenson, 2009). In the second group, with the three subjects chb01, chb05 and chb20, the behaviour is reversed, i.e. as the size of the scales increases, the channels that stand out are located in the parietal–occipital region of the brain, in particular the voltage difference between the channels P3 and O1. Some of these results are related to those presented by Luo et al. (2011), who found that the functional connectivity between the frontal and parietal lobes revealed a significant negative correlation with epilepsy duration.
The contribution of the AUC criterion to the delta-log function methodology becomes clear, as the delta-log fluctuation between the reference channel, chosen by the AUC criterion, and the other channels shows positive values for most of the scales. This indicates that the AUC criterion correctly identified the channel with the highest curve in the delta-log plot.
4 Concluding remarks
The analysis and quantification of long memory processes is of great importance in time series analysis in general and in the study of EEG signals in particular. One of the most widely used methods to do so is the DFA. In this article, we proposed: (i) two automatic and objective algorithms for the detection of CPs in DFA-related methods, which are usually chosen in a subjective and visual manner, and (ii) an objective criterion to choose the reference channel for the log-amplitude function.
The proposed methodology was then applied to the EEG signals of 10 paediatric subjects with seizures, collected from the CHB-MIT database. The results suggest that a single DFA exponent is insufficient to comprehend all the complex properties of the EEG signals. The EEG signals exhibit three power-law correlations: the first indicates that the power of the signal frequency content decreases rapidly as a function of the frequency (f) itself, the second indicates that the behaviour of the EEG signals changes with time, and the third indicates that there is a >50% probability of a negative value being preceded by a positive value.
Besides the power-law correlation, the relationship between different regions of the brain was investigated using the log-amplitude function, and the results suggest that the fluctuations are not concentrated around the same hemisphere for all subjects. This fact confirms that epileptic seizures originate from different areas of the cerebral cortex.
The two algorithms for automatic detection of CPs show a great performance and usefulness. The automatic identification of the reference channel also shows a great usefulness and reduction of the inherent subjective choice. The practical conclusions in relation to the specific application in EEG signals of epileptic patients are also potentially relevant.
The methodology presented in this article is of great generality and utility, and can be applied to any other time series applications where DFA is also employed, which represents one step forward in the methodological research related to DFA and to the study of long memory processes.
Acknowledgements
The authors are grateful to the members of the CHB-MIT database, who made available the dataset used in this study.
Funding
P.C.R. acknowledges financial support from the CNPq grant ‘bolsa de produtividade PQ-2’ 305852/2019-1.
Conflict of Interest: none declared.