The Nencki-Symfonia electroencephalography/event-related potential dataset: Multiple cognitive tasks and resting-state data collected in a sample of healthy adults

Abstract

Background
One of the goals of neuropsychology is to understand the brain mechanisms underlying aspects of attention and cognitive control. Several tasks have been developed as part of this body of research; however, their results are not always consistent. A reliable comparison of the data and a synthesis of study conclusions have been precluded by multiple methodological differences. Here, we describe a publicly available, high-density electroencephalography (EEG) dataset obtained from 42 healthy young adults while they performed 3 cognitive tasks, (i) an extended multi-source interference task, (ii) a 3-stimuli oddball task, and (iii) a control, simple reaction task, and during (iv) a resting-state protocol. Demographic and psychometric information is included within the dataset.

Dataset validation
First, data validation confirmed acceptable quality of the obtained EEG signals. Typical event-related potential (ERP) waveforms were obtained, as expected for attention and cognitive control tasks (i.e., N200, P300, N450). Behavioral results showed the expected progression of reaction times and error rates, which confirmed the effectiveness of the applied paradigms.

Conclusions
This dataset is well suited for neuropsychological research regarding common and distinct mechanisms involved in different cognitive tasks. Using this dataset, researchers can compare a wide range of classical EEG/ERP features across tasks for any selected subset of electrodes. At the same time, 128-channel EEG recording allows for source localization and detailed connectivity studies. Neurophysiological measures can be correlated with additional psychometric data obtained from the same participants. This dataset can also be used to develop and verify novel analytical and classification approaches that can advance the field of deep/machine learning algorithms, recognition of single-trial ERP responses to different task conditions, and detection of EEG/ERP features for use in brain-computer interface (BCI) applications.

regulation of our consortium partner, who has the main authorship of MRI data. The data can be obtained on individual request. We would also like to stress that our EEG recording with individual electrodes' location data can be used for reliable source analyses, which can be cautiously cross-referenced with the fMRI results. As for comment #3b: stimulus and ISI durations are similar to those used by Bush et al. 2006 to study healthy populations (ITI was fixed there at 1750 ms, comprising both stimulus and ISI; 24 trials were displayed in 42 s blocks). We added a pseudorandom component to the ISIs to reduce the likelihood of anticipating the onset of the stimulus (the shortest ISI in our case was 800 ms, and the longest 1300 ms; ITI 1700-2200 ms). Given the average response times from 537.29 ms to 746.82 ms, there is still ~1 s sweep after the movement termination (less than 1 s in the FS condition with the longest RT, and more than 1 s in all other trials). We agree, however, that this experimental setup could not be used to study oscillatory (e.g., motor-related) dynamics over 1.5-2 seconds. We have added that information to the text with corresponding literature sources.
Response #4: We deleted the sentence where the 'extended psychometric' phrase appeared. It was our initial idea to include the information about existing additional data in the manuscript (i.e. corresponding MRI T1 and extended psychometric data; that are not included in the dataset, but available for the readers upon request). However, this is not in line with the journal's requirements (Submission Guidelines), and we were asked to delete the information about the data which cannot be uploaded along with our dataset. The other data belong to our consortium partners and for now, due to the internal regulations of those institutes, we are still not eligible to publish them openly.
Response #5: The experimental group was highly homogeneous with regard to education and social status; the majority of the participants were students. It was a highly educated group. We have added this general information to the manuscript. However, we have exact information about years of completed education for only 50% of the subjects.
Response #6: The Polish adaptation of ARSQ belongs to the consortium partner (who also owns the extended psychometric data), who plans to publish extended analyses regarding the reliability and integrity of the Polish version of ARSQ in a separate, dedicated article.
Response #7: The original paper about the MSIT task (Bush et al. 2006) proposes that the training should last 1-5 min, and we followed this suggestion. We have added an extended description of the training tasks, which we hope answers your questions: 'Training sessions of MSIT+ and oddball tasks were performed by each participant before the main task execution. In case of both tasks, ~45-50 stimuli were presented in a training session. The MSIT+ training lasted for ~2.5 min., whereas oddball training lasted ~1.3 min. One training session was sufficient for most subjects. However, if more errors were made, then the participant went through the training again until the instructions were completely understood.'
Response #8: As mentioned above, for their fMRI study Bush et al. (2006) used a block design with 24 trials of the same category displayed in a row. Such a design generally yields higher statistical power than a fast event-related design in fMRI. We decided to propose a more universal paradigm, which would partially retain the benefits of blocked stimulation but also allow for event-related analysis in both EEG and fMRI experiments. As can be seen in the submitted manuscript, our MSIT mini-block design shows the same general pattern in behavioral results as the block design (Bush et al. 2006) or the event-related design used by other groups (Wiesman et al. 2020).
Response #9: Thank you very much for this hint, we have corrected the text.
Response #10: There is one peer-reviewed proceedings paper related to machine learning on these data (MSIT, SRT) that was presented during IEEE Symposium on Bioinformatics and Bioengineering (BIBE). It has just been published with the DOI 10.1109/BIBE52308.2021.9635187. We added it to the reference list and mentioned it at the end of the paragraph 'Dataset ERP results'. Other research manuscripts are now under peer review.
Dzianok, P., Kołodziej, M., & Kublik, E. (2021). Detecting attention in Hilbert-transformed EEG brain signals from simple-reaction and choice-reaction cognitive tasks. In 2021 IEEE 21st International Conference on Bioinformatics and Bioengineering (BIBE) (pp. 1-4). DOI:10.1109/BIBE52308.2021.9635187
Response #11: As suggested, we re-calculated average impedances for other tasks (and resting state) and added them to Table 4.
Response #12: We believe that the standard deviation corridors presented in the figure clearly illustrate the waveform differences, which are particularly strong in the oddball task. They are less obvious for the four MSIT+ categories, for which we found complex interaction effects that are not easy to illustrate in one simple figure. These data were extensively analyzed with an ERP-microstates approach and cluster-permutation waveform analysis, and the obtained results will be published soon.
Response #13: No fixation point was presented, the fixation cross '+' was presented only during the breaks that are marked on the drawings. Blank black screen appeared between stimuli. We added such information to the manuscript now.
Response #14: We have added the swarm plots to each boxplot, showing individual data in a concise and readable way.
---Response to Reviewer #2
Thank you very much for your time dedicated to reviewing our manuscript and offering helpful remarks. The answers to your questions are given below:
Response #1: We have added the following information to the manuscript: 'Signal artifacts (eye-movements, cardiac, muscle artifacts) were corrected for the purpose of ERP plotting using Independent Component Analysis (ICA; [28]). Remaining artifacts were removed during semi-automatic visual inspection. Data was band-pass filtered (0.5-40 Hz), re-referenced to an average reference and segmented into epochs from 200 ms before stimulus presentation to 1000 ms after stimulus presentation. Clean segments with correct responses were averaged separately for every condition and task. Baseline correction was applied for visualization purposes.'
Response #2: Thank you for that suggestion, we have added the oddball reaction time to Figure 4.
Response #3: The second time slot was only four hours after the first one. That was still early afternoon (1 PM), to ensure that our participants would be fully rested and perform the tasks with the utmost care. Some participants preferred the second time slot, as the first time slot (9 AM) was too early for their daily routine. We did not assign participants to a time slot; there was always a choice between the morning one and the early-afternoon one. We added one additional sentence on that matter to the manuscript.
Response #4: Tasks were divided into two or four runs (all were recorded) according to
Response #6: Thank you for this comment. You are right, Tab. 4-6 were deleted during the editing process and merged into Tab. 3, and we unfortunately overlooked the uncorrected numbering of the following tables. The numbering of the tables and their referencing in the text were corrected and checked carefully. As for the location of the impedance table in the text (and consequently its number), we prefer to leave it as it was, in the 'Dataset quality' subsection, where it illustrates well the thoroughness of preparing the setup for each recording. In the Methods section we provided a reference to this table, to draw the reader's attention to the fact that all the resistance data are given further in the text. However, we do not think that this table should be put so early in the text.
(2) a 3-stimuli oddball task with frequent standard, rare target, and rare distractor stimuli;

Background and purpose
Here, we describe a dataset of EEG/ERP signals and metadata which may be useful in at least three domains: a) in-depth neuroscience investigations related to attention and cognitive control, b) testing machine and deep learning algorithms suited for neuroimaging data, with a particular emphasis on attention and cognitive control aspects, and c) new approaches for brain-computer interface (BCI) applications.
There is a tremendous need for neuroscience datasets to be available to the wider scientific community, which may help to overcome the reproducibility crisis and improve the quality of research (e.g., larger sample sizes, more reproducible findings). Open access to data may also increase the speed of research and reduce the need to collect multiple redundant datasets, such as in computational neuroscience [1,2,3,4].
Research concerning conflict processing and attentional cognitive control is well established in the neuroscience literature. Such research has focused primarily on characterizing conflict processing and attentional cognitive control using behavioral [5,6,7] and neuroimaging [8,9,10] approaches. This work has demonstrated two primary sources of cognitive conflict: (1) stimulus-stimulus interference, which arises from flanking stimuli that are similar to each other, but differ from the target stimulus; (2) stimulus-response interference, which arises from spatial (in)compatibility between the target stimulus and the response button positions. Both sources of conflict may be examined in different sensory domains, including the tactile and auditory ones [11]. The most common tasks used to study these conflict types include the Flanker Task, which evokes the "Flanker effect" [12], and the Simon Task, which evokes the "Simon effect" [13].
Both the Flanker and Simon effects are included in the multi-source interference task (MSIT) [14,15], which was designed to maximize conflict effects and strongly engage conflict-specific brain areas, including the dorsal anterior cingulate cortex (dACC). The popularity of the MSIT task has risen rapidly over the past 17 years (Fig. 1).
[ Figure 1.]
... exists. If so, it is also unclear whether conflict is resolved by a single, dedicated brain mechanism, regardless of the specific details of the conflicting events. Alternatively, conflict may be detected and resolved by different neural mechanisms, depending on the type of conflicting stimuli. A reliable comparison of the data and a synthesis of study conclusions are precluded by multiple methodological differences. To our knowledge, only two MSIT-based datasets are publicly available: (1) a dataset that includes intracranial EEG with and without brain stimulation collected from 21 patients with epilepsy [16,17]; (2) a dataset of EEG/deep brain stimulation with affective MSIT collected from 14 patients with obsessive-compulsive disorder (OCD) and 14 patients with major depressive disorder (MDD) [18,19]. However, these datasets are relatively limited in sample size (N < 25). To our knowledge, there are no publicly available MSIT datasets collected in healthy individuals. This limits the ability to resolve and clarify the findings of ongoing and past studies.
Here, we describe an MSIT+ dataset collected in healthy individuals. This database is well suited to address the aforementioned questions because it includes an extended MSIT (MSIT+) task. In addition to non-conflict and multi-source conflict conditions, our task includes two single-conflict conditions (i.e., Simon and Flanker). The same participants performed the classical 3-stimuli visual oddball task with two rare stimuli: a target and a distractor. A simple reaction time task (SRT) was also included as a no-conflict, no-attention, control situation. Prior to completing tasks, EEG data were recorded during a 10 min resting-state (REST) condition.
Resting-state paradigms are commonly used in neuroimaging and neurophysiology research to examine spontaneous functional activity of the brain at an individual level.
The included tasks engage multiple cognitive processes, including attention, attentional control, working memory, and action selection. These processes are of interest from the perspective of basic science, and studying them also helps to understand how their dysfunctions contribute to various mental and medical conditions. The selected tasks are well suited to answer many interrelated questions regarding the brain processes underlying conflict and attentional control. Indeed, we selected two attentional tasks (i.e., MSIT+ and oddball) and the SRT task, which provides information regarding the processing speed of each participant regardless of attentional resources and effort/conflict. The additional REST procedure is well suited to evaluate task-independent (i.e., spontaneous) activity of the brain, while psychometric questionnaires allow correlating EEG/ERP results with participants' individual characteristics. An in-depth analysis of such a database (i.e., same participants, same experimental conditions, N=42) may also lead to new clinical applications.

Experimental design

Participants and psychometric measures
Forty-two healthy, right-handed young adults (20-34 years of age, 22 females) completed the described experiment (Tab. 1). The experimental group was highly homogeneous with regard to education and social status; the majority of the participants were students. Participants were provided with detailed information about the study and a list of exclusionary criteria, including contraindications for EEG/ERP: pregnancy, chronic diseases (e.g., epilepsy, chronic headaches, chronic sleep disorders), skin diseases and allergies (especially head-related), diagnosed mental and neurodevelopmental disorders, head injury, medication use (particularly medications that may influence nervous system functioning), and alcohol or psychoactive substance abuse.
Each participant completed psychometric tests directly related to the EEG recording session. Handedness was verified with the Edinburgh Handedness Inventory (EHI) [20]. Additionally, each participant was required to get sufficient sleep and come to the study visit well-rested. Questions about their subjective rest and stress levels were included in the questionnaire completed before the EEG session, along with demographic (age, sex) and basic health information (medication use, type of medication, caffeine uptake, phase of the menstrual cycle).
Additionally, mood was measured with the Polish version [21] of the UWIST Mood Adjective Check List (UMACL) [22]. The UMACL comprises the hedonic tone (HT) and energetic arousal (EA) subscales, which are related to the subjective feeling of pleasantness and the energy needed for any activity, and the tense arousal (TA) subscale, which is associated with fear and tension (Tab. 1).
... Hz and no high-pass or notch filters were used during recording.

Environment
The EEG experiments were conducted in the EEG laboratory of the Nencki Institute of Experimental Biology PAS, and started either in the morning (9 AM) or in the early afternoon (1 PM). The participants could choose the most convenient session time so that they could get enough sleep and be well rested for the recording. Each EEG session took ~3 hours, which included signing documents, participant preparation, and task execution (Tab. 2). In order to reduce testing time, the 128-electrode cap was prepared before the participant arrived in the lab.
All EEG sessions were conducted in a quiet, comfortable room with a dim light. The participant was seated in a chair with armrests, and the chair was facing the front of a monitor. The researcher supervised the study from an adjacent room via remote desktop connection to the recording computer and a LAN camera overlooking the EEG laboratory.

Experiment and datasets
Experiment procedures, including the duration and number of recorded data files, are listed in Tab. 2. Responses were made with the right (dominant) hand on the numeric keyboard: response 1, '1' key pressed with the right index finger; response 2, '2' key pressed with the right middle finger; response 3, '3' key pressed with the right ring finger. The task was presented in a mini-block design format. In each trial, the stimulus was presented for 900 ms and followed by an interstimulus interval (ISI) that ranged from 800 to 1300 ms (in steps of 100 ms) (Fig. 2). Stimuli were separated by a blank screen. Three or four stimuli of the same condition were presented consecutively with inter-mini-block intervals (IMI) that ranged from 2.
[ Figure 2.]
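The trial timing described above can be illustrated with a short Python sketch. It only encodes what the text states (900 ms stimulus, ISI of 800-1300 ms in 100 ms steps, three or four trials per mini-block); the uniform random draw, the function names, and the omission of the inter-mini-block interval (whose range is not given here) are assumptions made for illustration, not the original stimulus-presentation script.

```python
import random

STIMULUS_MS = 900                              # stimulus duration from the text
ISI_VALUES_MS = list(range(800, 1301, 100))    # 800, 900, ..., 1300 ms

def draw_trial_timing(rng):
    """Return (isi_ms, onset_to_onset_ms) for one trial.

    Assumption: ISIs are drawn uniformly from the allowed values.
    """
    isi = rng.choice(ISI_VALUES_MS)
    return isi, STIMULUS_MS + isi

def mini_block(condition, rng):
    """Hypothetical helper: 3 or 4 consecutive trials of one condition."""
    n_trials = rng.choice([3, 4])
    return [(condition, *draw_trial_timing(rng)) for _ in range(n_trials)]

if __name__ == "__main__":
    rng = random.Random(0)
    for condition, isi, onset_to_onset in mini_block("FS", rng):
        print(condition, isi, onset_to_onset)
```

With these values, the onset-to-onset interval of consecutive stimuli within a mini-block falls between 1700 and 2200 ms.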

Oddball task
The oddball paradigm provides data with well-known event-related potential (ERP) components such as P3, which is frequently used in BCI approaches. In this paradigm, three stimuli (rare target -þ, rare distractor -Þ, frequent standard -p) were presented in a pseudorandom order with the restriction that two rare stimuli could not appear in a row (Fig. 3).
[ Figure 3.]
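One way to generate a pseudorandom order that satisfies the stated restriction (no two rare stimuli in a row) is sketched below in Python. The approach, placing at most one rare stimulus in each gap around the standard stimuli, is an illustrative choice and not necessarily the procedure used for this dataset; the stimulus counts in the usage example are hypothetical.

```python
import random

def oddball_sequence(n_standard, n_target, n_distractor, rng):
    """Return a stimulus order in which rare stimuli are never adjacent."""
    rares = ["target"] * n_target + ["distractor"] * n_distractor
    if len(rares) > n_standard + 1:
        raise ValueError("too many rare stimuli to keep them apart")
    rng.shuffle(rares)
    # Gaps: before the first standard, between standards, after the last one.
    # Each chosen gap receives exactly one rare stimulus.
    chosen_gaps = set(rng.sample(range(n_standard + 1), len(rares)))
    sequence = []
    for gap in range(n_standard + 1):
        if gap in chosen_gaps:
            sequence.append(rares.pop())
        if gap < n_standard:
            sequence.append("standard")
    return sequence

if __name__ == "__main__":
    # Hypothetical counts, chosen only for the demonstration.
    print(oddball_sequence(8, 2, 2, random.Random(1)))
```

Because every pair of neighbouring gaps is separated by a standard stimulus, the restriction holds by construction rather than by rejection sampling.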

Simple reaction time task
SRT (simple reaction time) task data can be easily compared to the results of other tasks and provides a measure of the basic motor and processing speed capabilities of the participants. In particular, the SRT gauges participants' 'average' reaction time in response to visual stimuli, which is considered the most basic measure of processing speed. A simple stimulus ('000') was displayed (900 ms) centrally on the screen (RGB colors as described above) with 700-1200 ms ISIs (in steps of 100 ms). The participant was instructed to respond with an index finger (pressing '2' on a numerical keyboard) as fast as possible. On average, 129 trials were presented to each subject. Marker denotation is presented in Tab. 3.

Resting-state session
The REST task is widely used to monitor spontaneous brain activity. In particular, the REST task measures an initial, task-negative state during which no task is given to the subject. REST recording lasted for 10 minutes and was performed in an eyes-open condition. A plus sign '+' was displayed (RGB colors as described above) centrally on a dark screen for gaze fixation.
Marker denotation is presented in Tab. 3.

Data format and structure
The data structure was prepared in accordance with BIDS (Brain Imaging Data Structure) rules [25,26] with the use of EEGLAB [27] to correct and standardize the files. This approach allows for coherent data organization, reduces errors, and makes it possible for different researchers to re-use the datasets for their own purposes and develop automated tools for data analysis.
⎯ Health: medication usage on the day of experiment, type of medication used, caffeine uptake, phase of the menstrual cycle (for women).
⎯ Subjective stress level, subjective rest level.

Additional notes
Additional markers of the type 'boundary' can be found in all described tasks and files. Events of this type are added automatically by EEGLAB when portions of the data are deleted, or when portions of continuous datasets are concatenated. Of note, this occurs at the beginning of each run after importing the data from the Brain Products format into EEGLAB. Some raw files were concatenated because of technical errors during recordings, which required saving more than one file for each task. In these files, when a new portion of data was introduced, an additional marker with the code 'New Segment' was created. Files that were prepared that way are as follows: sub-17 (MSIT), sub-21 (MSIT), sub-23 (MSIT), sub-25 (REST), sub-37 (oddball), sub-38 (oddball, MSIT). Furthermore, one participant did not complete the SRT task correctly (sub-29).
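When reading event files, these technical markers usually need to be skipped before trials are counted or epoched. The Python sketch below shows one way to do this with the standard library. The column name `trial_type` and the demo rows are assumptions for illustration only; the actual column holding the marker label and the real marker codes should be taken from the dataset files and Tab. 3.

```python
import csv
import io

# Technical markers named in the text; they do not correspond to stimuli.
NON_TASK_MARKERS = {"boundary", "New Segment"}

def read_task_events(tsv_file, label_column="trial_type"):
    """Read a tab-separated events file and drop technical markers.

    `label_column` is an assumption; check the header of the real files.
    """
    reader = csv.DictReader(tsv_file, delimiter="\t")
    return [row for row in reader if row[label_column] not in NON_TASK_MARKERS]

# Hypothetical file content, for demonstration only.
demo = io.StringIO(
    "onset\tduration\ttrial_type\n"
    "0.0\t0\tboundary\n"
    "1.5\t0.9\tstandard\n"
    "3.4\t0.9\ttarget\n"
    "5.0\t0\tNew Segment\n"
)
events = read_task_events(demo)
print([event["trial_type"] for event in events])
```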

Reliability

Methods
Behavioral data for the SRT and oddball tasks were calculated from EEG files (i.e., '*_events.tsv') and MSIT+ data were calculated from original Presentation (*.log) files. The raw files were used because these provide more precise information about individual stimuli.
For all tasks, the following trials were coded as errors: missed trials (i.e., no response), anticipations (responses faster than 100 ms for the SRT/oddball tasks and 200 ms for the MSIT task), and responses longer than the stimulus duration plus the shortest used ITI. Additionally, multiple responses to one stimulus were coded as an error in the MSIT+ task. One participant was excluded from the calculation of behavioral data for the SRT task, as his/her responses were not recorded correctly (sub-29).
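The error-coding rules above can be written as a small decision function, sketched here in Python. The thresholds for anticipations come from the text; the function name, the category labels, the order in which the rules are checked, and the treatment of a wrong key press as an error are assumptions. The upper response-time limit is passed in as a parameter because its exact value depends on how "stimulus plus the shortest used ITI" is read for each task.

```python
def classify_trial(rt_ms, n_responses, correct, task, max_rt_ms):
    """Label one trial according to the error-coding rules.

    rt_ms       -- response time of the first response, or None if none
    n_responses -- number of key presses registered for this stimulus
    correct     -- whether the first response was the expected key
    task        -- "SRT", "oddball" or "MSIT"
    max_rt_ms   -- upper response-time limit (assumed, task-specific)
    """
    anticipation_ms = 200 if task == "MSIT" else 100
    if n_responses == 0 or rt_ms is None:
        return "missed"
    if task == "MSIT" and n_responses > 1:
        return "multiple"
    if rt_ms < anticipation_ms:
        return "anticipation"
    if rt_ms > max_rt_ms:
        return "late"
    return "correct" if correct else "wrong_response"

if __name__ == "__main__":
    # 1700 ms is an illustrative limit (900 ms stimulus + 800 ms interval).
    print(classify_trial(150, 1, True, "MSIT", 1700))
```

Every label other than "correct" would count toward the error rate under this reading.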
Statistical analysis of RTs was performed in JASP (version 0.14.1.0) [28]. Repeated Measures ANOVA was conducted with task conditions as repeated measures factors and Shapiro-Wilk test was used to test normal distribution. The assumption of sphericity was violated and Greenhouse-Geisser correction was used.
Signal artifacts (eye-movements, cardiac, muscle artifacts) were corrected for the purpose of ERP plotting (Fig. 5) using Independent Component Analysis (ICA; [29]). Remaining artifacts were removed during semi-automatic visual inspection. Data was band-pass filtered (0.5-40 Hz), re-referenced to an average reference and segmented into epochs from 200 ms before stimulus presentation to 1000 ms after stimulus presentation. Clean segments with correct responses were averaged separately for every condition and task. Baseline correction was applied for visualization purposes.
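The re-referencing, epoching, baseline-correction, and averaging steps can be sketched in plain Python as follows. This is a simplified illustration, not the pipeline used for Fig. 5: ICA, band-pass filtering, and artifact rejection are omitted, the sampling rate is a free parameter, the baseline is assumed to be the whole pre-stimulus interval, and all function names are hypothetical. In practice an EEG toolbox would normally be used for these steps.

```python
from statistics import fmean

def average_reference(data):
    """Subtract the across-channel mean from every channel.

    data: list of channels, each a list of samples of equal length.
    """
    n_samples = len(data[0])
    mean_per_sample = [fmean(ch[i] for ch in data) for i in range(n_samples)]
    return [[ch[i] - mean_per_sample[i] for i in range(n_samples)] for ch in data]

def epoch(channel, onset_sample, sfreq, tmin=-0.2, tmax=1.0):
    """Cut one epoch around a stimulus onset and subtract the baseline.

    Returns None when the epoch would extend beyond the recording.
    """
    start = onset_sample + round(tmin * sfreq)
    stop = onset_sample + round(tmax * sfreq)
    if start < 0 or stop > len(channel):
        return None
    segment = channel[start:stop]
    n_baseline = onset_sample - start      # samples before stimulus onset
    baseline = fmean(segment[:n_baseline])
    return [value - baseline for value in segment]

def erp(channel, onsets, sfreq):
    """Average baseline-corrected epochs for one channel and condition."""
    epochs = [e for e in (epoch(channel, o, sfreq) for o in onsets) if e is not None]
    return [fmean(samples) for samples in zip(*epochs)]
```

Calling `erp` once per condition and channel, using only onsets of clean trials with correct responses, would reproduce the averaging logic described above.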

Dataset quality
We placed great emphasis on obtaining clean EEG data. Electrode-skin impedance was kept as low as possible (Tab. 4, M±SD impedance for MSIT task: 4.66±2.36 kΩ), and participants were informed about the deteriorating influence of their movements on the EEG quality. We observed recorded EEG signals via remote desktop connection and intervened when necessary.
Unavoidably, there are eye-blink artifacts on frontal electrodes, and often muscle artifacts on temporal and occipital poles. A large signal drift is also observed for some participants/electrodes as no high-pass filters were used (this step was omitted to ensure the possibility of further data modifications/filtering according to analytical needs).
Data preprocessing for the purpose of our own analyses (not included in the database) indicated relatively good quality of EEG signal. Indeed, on average, in the most restrictive and conservative approach, we excluded 22.1 ± 14% of trials.

Dataset basic results
The basic behavioral and ERP results also confirm the quality of this dataset. Indeed, reaction times and accuracy corresponded with the increasing difficulty of task conditions, as expected.
Responses in SRTs are usually faster than responses in any choice reaction task. As expected, our mean SRT latencies were short (Tab. 5 and Fig. 4), shorter than those observed in the 00 (no conflict) MSIT condition and than oddball responses. In addition, participants made very few errors in the SRT task: 4.23%, including 2.12% related to response failures and only 1.9% related to premature responses (i.e., faster than 100 ms). The rest of the errors corresponded with late/long responses.
[ Figure 4.]
The gradual increase of the conflict over the four conditions of the MSIT+ is clearly seen in reaction times (00<S0<F0<FS, Fig. 4; ... not commit any errors, and another 19 committed 1-3 errors during the entire task. Therefore, a total of 83% of participants did not commit more than 3 errors during the whole task. Only four participants committed substantially more errors during the task (>17). Given these observations, more elaborate error-rate analyses are neither possible nor recommended.

Dataset ERP results
The aim of this section was to show basic characteristics of the obtained signal. During both the MSIT+ and oddball tasks, the primary ERPs are clearly visible (Fig. 5).
[ Figure 5.]
The most studied component in healthy and clinical populations is P3, which peaks between 300-600 ms after stimulus onset [30]. P3 closely reflects attentional and memory processes in the human brain [30], and is often studied in the context of an oddball task. A much higher P3 amplitude for target stimuli than for distractor and standard stimuli is visible in our oddball task results, especially at electrodes placed over parietal areas (such as Pz), which agrees with the current literature [31]. Additionally, there is a clear distinction between two P3 subcomponents, namely P3a and P3b, in our 3-stimulus oddball task. The P3a is more likely to be evoked during tasks with novelty, whereas P3b is considered to reflect attention [31]. Another well-studied component related to cognitive control, attention, and conflict resolution is the fronto-central negative peak (N2). The N2 is commonly observed between 200-350 ms after stimulus onset [32]. A clear fronto-central N2 is observed in our data collected during the oddball task (Fig. 5). In MSIT+ ERPs, a later conflict-related negativity (i.e., the N450) is observed on fronto-central electrodes (Fig. 5). Of note, the N450 extends more posteriorly as a negative deflection within the P3. The P3b and N450 clearly discriminate between the four conditions of the MSIT task. In particular, the P3 shows higher amplitudes for faster responses (00>S0>F0>FS), whereas the N450 amplitude increases with increasing levels of conflict (00<S0<F0<FS). Yet another conflict-related wave is clearly visible in our MSIT data: the slow potential (SP). The SP is commonly observed after a crossover point on the descending slope of a waveform [33]. The conflict SP amplitude increases with increasing levels of conflict (00<S0<F0<FS), as expected from earlier literature [34]. These basic results show that the tasks were designed correctly.
Additionally, we used MSIT and SRT single-trial EEG data to detect and discriminate between attentive brain states with the use of machine learning methods [35]. We used multi-source interference trials (FS) from the MSIT as a condition with high attentional load, and SRT trials as a condition with low attentional load. Classification accuracy between high-attention and low-attention conditions was up to 100% for individual subjects, with 89% average classification accuracy across all subjects [35], which supports the validity of the tasks, the high quality of the obtained data, and their general usefulness in different analytic approaches.

Summary and perspectives
It is essential to integrate and re-use data to improve the reliability of results in neuroscience.  [36,37]. The development of improved BCI requires a deeper understanding of the basics of EEG signals, functioning of brain areas and the connections between them, with a strong emphasis on attention control.
Some approaches have attempted to use conflict tasks (e.g., semantically congruent and incongruent stimuli) first to evaluate performance and then to build BCI systems for patients with disorders of consciousness (DOC) [37]. As seen above, BCIs still face many challenges that could be addressed with more robust system training that would allow better accuracy. A more robust system would also be better able to learn to 'read' real brain signals among nonstationary noise and artifacts.
Deep/machine learning approaches are widely used to detect specific patterns of EEG activity and improve understanding of brain functions [38,39,40,41,42]. These approaches are promising for developing new medical methodologies for early intervention and treatment of various brain dysfunctions, such as depression, stress-induced conditions, Alzheimer's disease, autism spectrum disorder, attention deficit hyperactivity, and more [43]. Neural Attention Models are the most recent state-of-the-art deep learning approaches that show promise in this area [44]. There is also a critical need to improve feature extraction, i.e., the process of analyzing signals to distinguish signal features from extraneous content. The proposed dataset could serve as a benchmark and help to evaluate the performance of several novel classifiers in an off-line scenario. Such a process is frequently used to evaluate new approaches (cf. [45]).
Decoding mental and cognitive states is a vital branch of neuroscience and computer science that has gained new attention in recent years [46,47,48,49]. Approaches based on machine learning can decode task engagement, performance, and attention from MSIT and oddball tasks, which can be vital for BCI and for novel systems that could detect early mental fatigue. This could be used as a biomarker of fatigue, especially in professions where there is a need for constant attention (e.g., air traffic controllers, professional drivers). Resting-state brain dynamics have been shown to predict the effects of oddball paradigms. For example, regional resting-state power in several frequency bands has been shown to correlate with the latency and amplitude of the P300, N3, P2, and N1 components [50].
All the tasks included in this dataset can also be used for the in-depth study of the so-called "time-on-task" (time spent to solve the task) problem [51]. The "time-on-task" concept, in part, appears to oppose the results of conflict processing studies. For example, it has been asserted that the dorsal medial frontal cortex is more sensitive to time-on-task rather than conflict itself.

Summary of the advantages of this dataset
⎯ Localizer data of individual electrode locations, which can greatly improve the reliability of EEG source analysis given that real electrode localization can slightly vary from standard templates and across participants due to differences in head geometry;
⎯ recorded A1 and A2, which enables the off-line possibility of using classical ear-referenced montages;
⎯ high-density recording, which allows for in-depth connectivity and source localization studies;
⎯ 3 different tasks (and additional resting-state protocol) completed by the same participants;
⎯ the first publicly available dataset that includes the MSIT in healthy participants;
⎯ corresponding resting-state questionnaire along with EEG resting-state data;
⎯ relatively large sample size (N=42);
⎯ twofold verification of the data via behavioral and simple ERP investigation;
⎯ detailed health data about participants (menstrual cycle phase, medication & caffeine use).

Summary of the limitations of this dataset
⎯ Relatively short durations of stimuli and ISIs may not be optimal to study induced oscillatory effects, which may be of interest to some researchers, as shown previously [52,53].
⎯ Lack of open access to the fMRI dataset from the same task and participants is a limitation, compromising the cross-methodological comparison of MSIT+ results. However, such a dataset will be made public with the upcoming fMRI results article from our research partners.

Availability of supporting data
All the data described in this paper, including EEG datasets, behavioral, and basic questionnaire data, are available in The Nencki-Symfonia EEG/ERP GigaScience repository, GigaDB [54].

Ethics, consent and permissions
This study was performed in accordance with the Declaration of Helsinki and was approved by the Research Ethics Committee, Faculty of Humanities, Nicolaus Copernicus University in Toruń, Poland (No. 6/2018). All participants provided written informed consent to participate in the study and received cash remuneration (200 PLN, around 44 EUR). Participants signed a consent form that guaranteed the protection of personal data and informed them that the experimental data itself would be anonymized for use in analyses and publications related to the research project.