Erin J Wamsley, Megan Collins, Effect of cognitive load on time spent offline during wakefulness, Cerebral Cortex, Volume 34, Issue 2, February 2024, bhae022, https://doi.org/10.1093/cercor/bhae022
Abstract
Humans continuously alternate between online attention to the current environment and offline attention to internally generated thought and imagery. This may be a fundamental feature of the waking brain, but remains poorly understood. Here, we took a data-driven approach to defining online and offline states of wakefulness, using machine learning methods applied to measures of sensory responsiveness, subjective report, electroencephalogram (EEG), and pupil diameter. We tested the effect of cognitive load on the structure and prevalence of online and offline states, hypothesizing that time spent offline would increase as cognitive load of an ongoing task decreased. We also expected that alternation between online and offline states would persist even in the absence of a cognitive task. As in prior studies, we arrived at a three-state model comprised of one online state and two offline states. As predicted, when cognitive load was high, more time was spent online. Also as predicted, the same three states were present even when participants were not performing a task. These observations confirm our method is successful at isolating seconds-long periods of offline time. Varying cognitive load may be a useful way to manipulate time spent in at least one of these offline states in future experimental studies.
Introduction
Humans spend a large portion of their waking hours attending to internally generated forms of cognition, including but not limited to daydreaming and mind wandering (Killingsworth and Gilbert 2010; Schooler et al. 2011; Christoff et al. 2016; Wang et al. 2018; Wamsley 2022). During these “offline” states, attention to the external environment is minimized, while cognitive processes such as imagination, memory formation, and decision-making continue, decoupled from immediate sensory inputs (Antrobus et al. 1966; Smallwood et al. 2008; Handy and Kam 2015). Perhaps the clearest example of an offline state is sleep, during which sensory afferents are blocked and awareness of the environment is nearly absent, even as the brain continues to consolidate memory and perform other cognitive functions (Stickgold 1998; Paller et al. 2021). But even during wake, attention to the external environment waxes and wanes dramatically, with moments of inattention associated with rich spontaneous cognition and benefits for cognitive functions like memory (Wamsley 2022; Wamsley et al. 2023), creativity (Baird et al. 2012), and decision-making (Dijksterhuis et al. 2006; Strick et al. 2010).
Yet despite an increasing amount of research invoking this term, there remains surprisingly little agreement on the definition of an offline state or on how to measure whether a person is offline. In this paper, we assume being offline is only imperfectly correlated with participants’ subjective experience. Thus, we avoid measuring the extent to which a person is offline by self-report alone, preferring an integrative approach that combines self-report measures with physiological measures of brain state.
Being online or offline is often described as a function of the degree to which external stimuli are being processed, where being online means attending to the current sensory environment, and being offline means attending to internal representations (Schooler et al. 2011; Smallwood et al. 2011). Of course, by this definition, offline wakefulness is itself a varied state. For example, in attending to their own internally generated thought and imagery, participants might be effortlessly daydreaming about the past or future, intentionally concentrating on internal representations (Seli, Risko, and Smilek 2016) (e.g. engaging in goal-directed thought), ruminating, or experiencing a “mind blank” state in which they are unaware of any cognitive process. While theorists have proposed numerous different taxonomies of offline wakefulness, based on a multiplicity of measures, no universally adopted classification scheme exists (Seli, Risko, and Smilek 2016; Vidaurre et al. 2016; Wang et al. 2018; Poulet and Crochet 2019; Greene et al. 2023; Song et al. 2023).
In experimental studies, some labs have operationally defined offline wakefulness as a time during which participants are not performing any cognitive task, and sensory stimulation is experimentally minimized (e.g. sitting with eyes closed in a quiet, darkened room) (Dewar et al. 2012; Brokaw et al. 2016). In other studies, participants are considered to be offline when sensory stimulation is reduced to a lesser degree, for example, while they look at a fixation cross in between task trials (Alm et al. 2018; Bönstrup et al. 2019; King et al. 2022) or complete non-demanding activities such as listening to music or audiobooks (Nelson et al. 2021; McDevitt et al. 2022). Yet other studies have described offline cognition as occurring during an active task, using the term “offline” to signal that participants are no longer encoding the specific learning material of interest, although they are continuing to respond to unrelated stimuli (Strick et al. 2010; Staresina et al. 2013).
In part hampered by this lack of agreement on operational definition, little is known about even the most fundamental features of being offline while awake. For example, how frequently do humans “go offline” and for what duration of time? We suggest the transition between online and offline states of wakefulness can occur on a timescale of seconds. While this is more rapid than sometimes conceptualized, there is increasing evidence of seconds-level fluctuation between online and offline states in the literature. For example, rodents exhibit seconds-timescale alternation between a “desynchronized” cortical state during environment exploration (characterized by desynchronized firing patterns in cortical networks, pupil dilation, and transiently high levels of ACh [acetylcholine] and NE [norepinephrine]) and a “synchronized” state that emerges during pauses in exploration (characterized by synchronized firing patterns in cortical networks, pupil constriction, and decreased ACh and NE neuromodulation) (Harris and Thiele 2011; Poulet 2014; Poulet and Crochet 2019). This seconds-timescale fluctuation between cortical states persists even when animals are immobile and so is not the simple result of movement onset (Reimer et al. 2014).
In human studies, seconds-long offline states have been proposed in several contexts. First, there is behavioral and neural evidence that most improvement during motor sequence learning occurs offline, during the brief (≈10 to 30 s) rests between trials, rather than during the actual practice (Bönstrup et al. 2019; Jacobacci et al. 2020; Buch et al. 2021). Second, rapid alternation between neuroimaging-defined brain states is suggested by exploratory studies applying hidden Markov modeling and other analytic techniques to uncover latent structure in MEG or fMRI recordings (Vidaurre et al. 2017, 2018; Higgins et al. 2020; Goodale et al. 2021; Yamashita et al. 2021; Song et al. 2023). Such neuroimaging studies have described some transient brain states that resemble online task-related processing and others that resemble offline inwardly directed cognition, with dwell times ranging from ~100 ms (Vidaurre et al. 2018) to several seconds (Song et al. 2023). In some cases, neuroimaging-defined brain states have been found to be behaviorally relevant, predicting task performance (Goodale et al. 2021; Yamashita et al. 2021). Such rapid fluctuation between online and offline states has not been apparent in the human mind-wandering literature, but this may be due to the inherently poor temporal resolution offered by experience sampling. Based on this evidence, as well as our own prior studies (Wamsley and Summer 2020; Wamsley et al. 2023), we propose that humans fluctuate between online and offline states on at least a seconds-level timescale.
The current study
Here and in our prior work (Wamsley and Summer 2020; Wamsley et al. 2023), we use a multimodal, data-driven method to describe how humans move through online and offline states during a Sustained Attention to Response Task (SART). In summary, our approach is to continuously record EEG, pupil diameter, and reaction times during an attention task while also intermittently asking participants to report on their subjective experience. We define states of wakefulness using a cluster analysis applied only to trials on which participants reported subjective experience. Subsequently, we train a classifier to identify these states on all trials, even those where subjective report data are not available. This multimodal approach has the advantage of utilizing participants’ self-report of their attentional focus without being constrained by the temporal frequency with which participants are able to report this and without presuming that subjective experience is the sole defining feature of being offline. In a recent study, we found evidence for three distinct states: one online state that we interpreted as reflecting attention to the ongoing task, as well as two offline states that we interpreted as reflecting a decrease in task-focused attention and a turn toward internally focused processing (Wamsley et al. 2023). Following a verbal memory task, the prevalence of one of these two offline states (“Offline2”) predicted memory retention.
The purpose of the current study was to determine how these states vary as a function of cognitive load. The motivation for this was 3-fold. First, we aimed to further validate our interpretation of these waking states. We initially labeled two states as “offline” because they were associated with reduced attention to the external environment, according to both objective and subjective measures (Wamsley and Summer 2020; Wamsley et al. 2023). Experimentally testing the effect of cognitive load on state prevalence affords an additional opportunity to test the assumption that the online state represents task-oriented cognitive effort, whereas the offline states represent internally directed cognition. If our interpretation of the states is correct, then as the attentional demands of an ongoing task increase, the amount of time spent online should increase as well. Second, developing methods to manipulate the amount of time spent in offline states could help us to isolate the function of these states. In Wamsley et al. (2023), we observed that prevalence of the Offline2 state after learning predicted subsequent memory retention. However, to establish a causal role for this or any form of offline state, its occurrence must be experimentally controlled.
Finally, while we and others have speculated that fluctuation between online and offline states may represent a fundamental feature of wakefulness (Wamsley 2022), the vast majority of existing work has defined such states of wakefulness only during the performance of a cognitive task. If the fluctuation between online and offline states is in fact an intrinsic feature of human wakefulness, alternation between them should be evident regardless of whether participants are effortfully directing their attention toward a task. We therefore also aimed to determine the extent to which these same states are detectable when participants are not performing any overt cognitive task.
The current study was therefore designed to address two primary questions:
(1) How do the online and offline states identified by our method respond to changes in task demands?
(2) Are these states present only when participants are performing a task? Or alternatively, do the states still occur even when participants are doing nothing?
In line with our prior work, we expected to find that participants alternate between an online state (characterized by large pupil diameter, decreased EEG alpha, fast reaction times [or in the case of the Task Free condition, increased VEP {visual evoked potential} amplitude], and attention to the current sensory environment) and one or more “offline” states (characterized by smaller pupil diameter, high EEG alpha, slow reaction times [or in the case of the Task Free condition, decreased VEP amplitude], and thought and imagery unrelated to the current sensory environment). We reasoned that, if these data-driven states reflect an alternation between online and offline brain states, participants should spend less time offline when the cognitive load of an ongoing task is high. However, we also hypothesized that alternation between online and offline states would still occur even when participants were not performing a task, as participants spontaneously switch their attention between external and internal foci even in the absence of a task.
Materials and methods
This study was preregistered on Open Science Framework (https://osf.io/2hcmt). Except where otherwise specified, all analyses described below were among those specified in the preregistration. Data, analysis code, and the study procedures manual used by research assistants are archived at the same location.
Participants
Participants were recruited through campus advertisements, word of mouth, and Furman University’s Psychology department subject pool. To qualify for participation, interested persons were required to be a current student at any college or university, between the ages of 18 and 30, proficient in English, able to see the computer screen clearly without glasses, have a hairstyle that allowed use of an EEG cap, and agree not to consume caffeine after 10 AM on the day of the study. Participants were compensated either with $10/h payment or course credit in their introductory psychology course. n = 36 participants enrolled in the study and successfully completed at least one of the three study visits. This is greater than the n = 30 enrollment target stated in the preregistration because there was a ~1-year data collection hiatus during the COVID-19 pandemic that prevented six participants from completing all of their scheduled visits. All valid sessions from these six participants were still included in analysis, but an additional six were recruited to compensate for this partial data loss. As described in our preregistration, participants were excluded from analysis if pupil data were unusable for >50% of trials (n = 6) or if they reported either obtaining <5 h of sleep a night on the pre-study log or a Stanford Sleepiness score of >5 during the study (n = 3). Therefore, following exclusions, n = 27 participants were eligible for analysis. Of these, three participants had missing data for the SART in the Task Free condition only, due to technical failure. They are still included in analyses that do not rely on these data. Table 1 describes the characteristics of the sample.
Measure | Mean | ±SD
---|---|---
Age | 19.778 | 1.928
Epworth Sleepiness Score | 14.444 | 3.434
MAAS Score | 4.134 | 0.830
Daydream Frequency Score | 31.704 | 12.181
Proportion SART target trials correct | 0.833 | 0.125
Proportion SART nontarget trials correct | 0.986 | 0.018
% Female | 66.7% |
Notes. SART = Sustained Attention to Response Task. Daydream Frequency Score = total score from the daydream frequency subscale of the Imaginal Processes Inventory. MAAS Score = mean score on the Mindfulness Attention Awareness Scale. Data on proportion of correct responses are for the High Load and Low Load conditions only, as no responses were made in the Task Free condition.
Procedures
This research was approved by the Furman University institutional review board. After reporting to the laboratory and completing the consent process, participants filled out surveys including a demographics form, the Epworth Sleepiness Scale (ESS; Johns 1991), the Daydream Frequency subscale of the Imaginal Processes Inventory (a measure of trait daydream frequency; Singer and Antrobus 1972), the Mindfulness Attention and Awareness Scale (a measure of trait mind wandering propensity; Brown and Ryan 2003), and a three-night retrospective sleep log. Participants were then prepared for EEG and pupillometry recording (see below) before completing a 30 min Sustained Attention to Response Task (SART). There were three different SART conditions (High Cognitive Load/Low Cognitive Load/Task Free). This was a within-subjects design, in which participants completed all three cognitive load conditions across three different laboratory visits (order counterbalanced, visits separated by at least 24 h). While participants completed the SART, EEG and pupil diameter were measured continuously, along with SART keypress reaction times.
Following the SART, participants completed an exit questionnaire on which they rated the proportion of the interval they spent engaged in 1 or more of 11 pre-defined mental categories: “thinking about the past” (something else earlier today/yesterday to a week ago/past year or several years ago), “imagining the future” (later today/tomorrow to next week/next year or several years), “thinking about the numbers task”, “mind was blank”, “focusing on breath”, “thinking about something else”, and “other.”
SART
The SART is a simple attention task designed to facilitate mind wandering, while also measuring fluctuations in reaction time. In all conditions, participants were serially presented with the digits 1 to 9 on a computer monitor (Fig. 1A). Each digit was on-screen for 450 ms, with a 5 s stimulus onset asynchrony. Stimulus sequences were randomly generated with the following constraints: (i) target probability was set to 0.29; (ii) digit sequences were generated in blocks of 9, 12, 15, or 18 stimuli, with each of these blocks containing at least one but no more than three targets; (iii) targets were always separated by at least one nontarget.
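The block-wise constraints above can be illustrated with a short Python sketch. This is not the authors' stimulus code: the function names, the 324-trial session length (taken from the trial count reported for the thought probes), and the uniform choice of block length and target count are all assumptions made for illustration. The sketch enforces constraints (ii) and (iii) only; it does not attempt to reproduce the stated target probability, because the text does not say how that setting was combined with the block constraints.

```python
import random

BLOCK_LENGTHS = (9, 12, 15, 18)  # block sizes named in the text


def make_block(length, rng, prev_was_target):
    """Return a list of booleans (True = target) for one block.

    Each block holds 1 to 3 targets, and no two targets are adjacent,
    including across the boundary with the previous block.
    """
    n_targets = rng.randint(1, 3)  # assumption: uniform over 1..3
    while True:
        positions = sorted(rng.sample(range(length), n_targets))
        adjacent = any(b - a == 1 for a, b in zip(positions, positions[1:]))
        touches_prev = prev_was_target and positions[0] == 0
        if not adjacent and not touches_prev:
            break
    chosen = set(positions)
    return [i in chosen for i in range(length)]


def make_sequence(n_trials=324, seed=0):
    """Return a list of (digit, is_target) pairs of length n_trials."""
    rng = random.Random(seed)
    flags = []
    while len(flags) < n_trials:
        remaining = n_trials - len(flags)
        # Only pick block lengths that leave room for further whole blocks.
        options = [n for n in BLOCK_LENGTHS
                   if remaining - n == 0 or remaining - n >= min(BLOCK_LENGTHS)]
        length = rng.choice(options)
        prev_was_target = bool(flags) and flags[-1]
        flags.extend(make_block(length, rng, prev_was_target))
    digits = [rng.randint(1, 9) for _ in flags]
    return list(zip(digits, flags))
```

Rejection sampling keeps the target-placement logic easy to audit; with at most three targets in at least nine positions, a valid placement is found after very few draws.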

Experimental paradigm. A) Sustained attention to response task (SART). Participants respond to successive numeric stimuli with a button press. When the number is green, participants instead report whether that number is odd or even (Low Load condition), or whether the previous number was odd or even (High Load condition). In a control condition, participants passively view the stimuli without responding (Task Free condition). B) Experience was intermittently sampled using a forced-choice thought probe that prompted participants to categorize their immediately preceding experience as either related or unrelated to the experimental stimuli and either externally or internally directed. C) Process by which SART trials were classified into online and offline waking states. In step 1, a cluster analysis on thought probe trials defined the states based on EEG, pupil, RT, and subjective experience data. In step 2, the cluster labels from step 1 were used to train a classifier to identify cluster based on the EEG, pupil, and RT data alone (without subjective experience). Finally, in step 3, this classifier is applied to all trials, the majority of which did not have subjective experience data. D) Example temporal sequence of states across 400 s of recording in a single participant, in the Low Load condition. Gaps in the sequence indicate trials that were unclassified because they were target trials, or otherwise excluded from analysis (see Materials and methods).
During the SART, participants reported on their subjective experience by answering a multiple-choice question about their current attentional focus on 24 out of 324 trials (Fig. 1B). On these trials, a screen appeared asking participants to classify their experience into one of five categories: (i) external focus on the sensory aspects of the experimental stimuli (“external task–related”); (ii) external focus on other sensory stimuli in the environment (“external task–unrelated”); (iii) internal thoughts, feelings, or imagery about the experimental stimuli (“internal task–related”); (iv) internal thoughts, feelings or imagery unrelated to the current sensory environment (“internal task–unrelated”); or (v) mind blank/unable to recall any experience. Prior to the task, participants received detailed training on how to answer this question, including practice with classifying example experiences. In the Task Free condition, “task-related” referred to the stimuli being passively viewed on the screen.
We employed a variation of the SART that manipulated cognitive load, similar to that used by Smallwood et al. (2009). In the Low Load condition, participants were instructed to press the spacebar as quickly as possible as each digit appeared, but to refrain from responding if the digit was green (the “target”). If the digit was green, they were instructed to instead press a key indicating if the number was odd or even. In the High Load condition, participants were instructed to press the spacebar as quickly as possible as each digit appeared, but to refrain from responding if the digit was green (the “target”). If the digit was green, they were instructed to instead press a key indicating if the number that appeared just before the target was odd or even. In the Task Free condition, participants were not instructed to respond to any of the numbers. They were simply asked to sit still and view the number stimuli and to answer the thought probe questions when asked. In the Task Free condition, participants were instructed that while it was necessary to look at the screen, it was not necessary to pay attention to the stimuli. Because reaction time data were not available in this condition, the amplitude of visual evoked potentials in response to the number stimuli was used as an alternate measure of sensory responsiveness.
EEG and pupillometry recordings
For the duration of the SART, 64 EEG channels were recorded using a high-density cap following the 10-10 system of electrode placement. EEG data were acquired at 500 Hz using Brain Products’ BrainAmp amplifiers. Impedance was kept under 10 kΩ. Participants were equipped with the Pupil Labs Pupil Core head-mounted infrared eyetracking system (https://pupil-labs.com/products/core/tech-specs), for the purpose of acquiring pupillometry data throughout the duration of the SART at 200 Hz (Kassner et al. 2014). This portion of the experiment was conducted under constant lighting conditions, using a single overhead lighting source. The fixation cross that appeared between trials was sized to be approximately the same number of pixels as the numeric stimuli, in order to prevent light-related pupil reflexes. A black backdrop was hung behind the computer monitor, to prevent light-related pupil responses that might be induced if participants were to look away from the screen and toward a bright wall.
Analysis methods
Deviations from the preregistration
This was a complex analysis plan to preregister. After the study was complete, we realized we needed to make the following deviations from the preregistered approach:
(1) In the preregistration, we proposed to use t-tests, ANOVAs, and chi-squares to separately conduct analyses at the trial level and at the participant level. However, we subsequently realized that the best approach to these data would be to use hierarchical linear models, which simultaneously consider trial-level and participant-level variability.
(2) In the preregistration, we proposed that in addition to using cross-validation, final accuracy of the naïve Bayes classifier would be assessed on one-third of data, held out as a test set. However, because sample size was limited after exclusions, a separate test set was not held out for this purpose. Classification accuracy results are those from the cross-validation procedure. Cross-validation is adequate in this situation because, having extensively developed this method in previous studies (Wamsley and Summer 2020; Wamsley et al. 2023), we conducted no parameter tuning or exploration of multiple approaches.
(3) The preregistration listed a large number of potential exploratory analyses, and we decided not to conduct all of these. Specifically, we did not run a second set of analyses using 2 states and did not run a second set of analyses with different timing of the EEG and pupil analysis windows. This was because, after running all of the primary analyses, we realized that the interpretive complexity of the results was already high, and generating further sets of exploratory analyses using different parameters might be counterproductive.
(4) The preregistration stated that autocorrelation analysis would be conducted in MATLAB. Instead, we conducted this analysis in R. This is because, during the timeframe in which this study was completed, our lab was transitioning away from using MATLAB and SPSS for statistical analysis and toward conducting all statistical analyses in R.
(5) We realized it would not be possible to examine how condition affects average pupil size across all trials, because unlike for our other measures, pupil size data were z-scored separately for each recording. Therefore, mean pupil size can be meaningfully compared between waking states, but cannot be meaningfully compared between experimental conditions averaged across waking state.
Other than these exceptions, all other analyses were conducted as described in the preregistration.
Pupillometry preprocessing
Datapoints during which the pupil failed to be detected due to blinks or other artifact were deleted, as were datapoints where extreme variations in diameter were present (points greater than 3 median absolute deviations away from the median, within a 1 min sliding window). Linear interpolation was used to replace these missing datapoints. Prior to analysis, pupil timeseries were then lowpass-filtered at 10 Hz and z-scored.
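As a rough illustration of this pipeline, the Python sketch below chains the four steps (artifact rejection, linear interpolation, low-pass filtering, z-scoring) using NumPy and SciPy. It is not the authors' code. The function names are hypothetical, undetected samples are assumed to be coded as NaN, the median absolute deviation is used unscaled, and the filter is assumed to be a 4th-order zero-phase Butterworth; the text specifies only the 3-MAD threshold, the 1 min window, and the 10 Hz cutoff.

```python
import numpy as np
from scipy.signal import butter, filtfilt


def rolling_nanmedian(x, window):
    """Centered sliding-window median that ignores NaNs (simple, not fast)."""
    half = window // 2
    out = np.empty(len(x), dtype=float)
    for i in range(len(x)):
        out[i] = np.nanmedian(x[max(0, i - half): i + half + 1])
    return out


def preprocess_pupil(diameter, fs=200.0, window_s=60.0, n_mads=3.0,
                     cutoff_hz=10.0):
    """Reject artifacts, interpolate, low-pass filter, and z-score a trace."""
    x = np.asarray(diameter, dtype=float).copy()
    window = max(3, int(round(window_s * fs)))

    # Flag missing samples and samples far from the local median.
    med = rolling_nanmedian(x, window)
    mad = rolling_nanmedian(np.abs(x - med), window)
    with np.errstate(invalid="ignore"):
        bad = np.isnan(x) | (np.abs(x - med) > n_mads * mad)

    # Replace flagged samples by linear interpolation between good samples.
    t = np.arange(len(x))
    x[bad] = np.interp(t[bad], t[~bad], x[~bad])

    # Zero-phase low-pass filter (assumed 4th-order Butterworth), then z-score.
    b, a = butter(4, cutoff_hz, btype="low", fs=fs)
    smooth = filtfilt(b, a, x)
    return (smooth - smooth.mean()) / smooth.std()
```

The explicit loop in `rolling_nanmedian` is kept for readability; for a full 30 min recording at 200 Hz, a vectorized or library-based rolling median would be a more practical choice.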
EEG and ERP analysis
EEG analyses were performed with Brainstorm (Tadel et al. 2011), which is documented and freely available for download online under the GNU general public license (http://neuroimage.usc.edu/brainstorm). Prior to analysis, EEG recordings were filtered at 0.3 to 35 Hz and bad channels were removed and interpolated using spherical splines. Ocular and other artifacts were removed by deleting artifactual independent components, and the remaining artifact was manually marked via visual inspection. Trials marked as still including excessive artifact were excluded from all subsequent analysis steps.
Calculation of spectral power
To avoid excessive redundancy in the features provided to the cluster analysis, spectral power was considered at only a single electrode (Fz) for the clustering and classification analyses. For each 5 s trial, power spectral density was calculated using Welch’s method. Mean power in five a priori frequency bands (slow oscillation [0.3 to 1 Hz], delta [1 to 4 Hz], theta [4 to 7 Hz], alpha [8 to 12 Hz], and beta [13 to 35 Hz]) was passed to the cluster and classification analyses described below. Power was normalized to the 0.3 to 35 Hz range.
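A minimal Python sketch of this per-trial band-power computation follows, using `scipy.signal.welch`. It is an illustration under stated assumptions rather than the Brainstorm procedure the authors used: the segment length (`nperseg=1250`, i.e. 2.5 s at 500 Hz), the inclusive band edges, and the reading of "normalized to the 0.3 to 35 Hz range" as division by mean broadband power are all assumptions, and the function name is hypothetical.

```python
import numpy as np
from scipy.signal import welch

# A priori bands from the text, in Hz.
BANDS_HZ = {
    "slow_oscillation": (0.3, 1.0),
    "delta": (1.0, 4.0),
    "theta": (4.0, 7.0),
    "alpha": (8.0, 12.0),
    "beta": (13.0, 35.0),
}


def relative_band_power(trial, fs=500.0, nperseg=1250):
    """Mean Welch PSD per band, divided by mean PSD over 0.3 to 35 Hz."""
    freqs, psd = welch(trial, fs=fs, nperseg=nperseg)
    broadband = (freqs >= 0.3) & (freqs <= 35.0)
    reference = psd[broadband].mean()  # assumed normalization term
    powers = {}
    for name, (lo, hi) in BANDS_HZ.items():
        in_band = (freqs >= lo) & (freqs <= hi)
        powers[name] = psd[in_band].mean() / reference
    return powers
```

With a 5 s trial, shorter Welch segments give smoother spectra but coarser frequency resolution; a 2.5 s segment still places two frequency bins inside the 0.3 to 1 Hz band.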
Evoked potentials
In the Task Free condition, sensory responsiveness could not be measured via reaction time, since participants were not making any responses. In this condition, we instead quantified the amplitude of single-trial evoked potentials as a measure of sensory responsiveness. To do so, first, we calculated the grand-averaged ERP to all nontarget trials and identified the latency and scalp location of the two largest-amplitude peaks. We will refer to these peaks as “component 1” and “component 2” (Fig. 2), because, as ERPs have not previously been studied in this particular passive version of the SART, their correspondence to classically named components is not entirely clear. Component 1 was maximal at P10, reaching peak amplitude at 276 ms. Negative in polarity at P10, this component had a positive maximum over fronto-central sites. This component may correspond to the “P300,” which, in some situations, can be elicited by frequent nontarget stimuli (Polich and Criado 2006; Polich 2007) and has previously been reported to predict performance errors in the SART (Datta et al. 2007). Component 2 had a positive maximum at P8, reaching peak amplitude at 454 ms. We used these peaks as a template to estimate VEP amplitudes on individual trials at these latencies and scalp locations. This was done by calculating the cross-covariance between the template signal and the response on each individual trial, in the time range ±100 ms from the latency of the component’s peak in the grand-averaged signal. The input for clustering and classification analyses was the maximum cross-covariance value for each trial, for each of the two identified components.
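One way to read the template-matching step is sketched below in Python. This is an interpretation, not the authors' implementation: it assumes both the grand-average template and the single trial are time-locked with stimulus onset at sample 0, restricts both signals to the ±100 ms window around the component's peak latency, removes each segment's mean, and takes the maximum of the cross-covariance over all lags. The function name and the division by segment length are illustrative choices.

```python
import numpy as np


def single_trial_amplitude(trial, template, fs, peak_latency_s,
                           half_window_s=0.1):
    """Maximum cross-covariance between a trial and the ERP template.

    Both inputs are 1-D arrays at one electrode, with stimulus onset at
    sample 0. Only samples within +/- half_window_s of the component's
    grand-average peak latency are compared.
    """
    peak = int(round(peak_latency_s * fs))
    half = int(round(half_window_s * fs))
    seg_template = np.asarray(template, dtype=float)[peak - half: peak + half + 1]
    seg_trial = np.asarray(trial, dtype=float)[peak - half: peak + half + 1]

    # Cross-covariance = cross-correlation of the mean-removed segments.
    seg_template = seg_template - seg_template.mean()
    seg_trial = seg_trial - seg_trial.mean()
    xcov = np.correlate(seg_trial, seg_template, mode="full") / len(seg_template)
    return xcov.max()
```

Taking the maximum across lags, rather than the zero-lag value, tolerates some trial-to-trial latency jitter; a trial with a larger deflection of the template's shape yields a larger value.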

VEPs in the Task Free condition. Evoked responses during the SART are plotted at each electrode, with time = 0 representing the moment of stimulus onset. Vertical lines indicate the times at which component 1 and component 2 reached maximum amplitude, respectively. Topographical insets illustrate the scalp distribution of voltage at the time of maximal amplitude, with spherical spline interpolation.
Classification of SART trials into online and offline states
Prior to examining the effects of cognitive load, each 5 s SART trial was classified into one of three data-defined states: Online, Offline1, or Offline2. Our methods for accomplishing this are detailed elsewhere (Wamsley and Summer 2020; Wamsley et al. 2023). Here, we describe them in brief.
Expectation maximization cluster analysis
States of wakefulness were initially defined using an expectation maximization (EM) cluster analysis. Just prior to clustering, data points more than 4 SD above or below the mean on any of the to-be-clustered continuous variables were removed. EEG spectral power data, VEP amplitudes, and pupil diameter data were z-scored (separately for each subject) to facilitate cross-subject comparison. The EM cluster analysis was then applied to all probe trials. Input features were EEG spectral power at Fz (preprocessed and z-scored as described above), reaction time to SART stimuli (or in the Task Free condition, visual evoked potential amplitudes), pupil diameter (preprocessed and z-scored as described above), and participants’ forced-choice response to the experience sampling probe. Clustering was conducted twice: once on trials from the High Load and Low Load conditions combined and again for the Task Free condition. Clustering was conducted separately for the Task Free condition because the measurement of “sensory responsiveness” is qualitatively different for those data and not necessarily comparable (VEP amplitudes instead of reaction times).
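The two data-preparation steps named here (within-subject z-scoring and removal of extreme points) can be sketched in Python as follows. The function names are hypothetical, and several details are assumptions the text leaves open: that a "data point" is a whole trial, that the 4 SD criterion is evaluated across the pooled sample, and that z-scoring precedes outlier removal.

```python
import numpy as np


def zscore_within_subject(features, subject_ids):
    """Z-score each feature column separately within each subject."""
    features = np.asarray(features, dtype=float)
    subject_ids = np.asarray(subject_ids)
    out = np.empty_like(features)
    for subject in np.unique(subject_ids):
        rows = subject_ids == subject
        block = features[rows]
        out[rows] = (block - block.mean(axis=0)) / block.std(axis=0)
    return out


def keep_mask(features, n_sd=4.0):
    """Boolean mask that is False for trials with any feature beyond n_sd SD."""
    features = np.asarray(features, dtype=float)
    z = (features - features.mean(axis=0)) / features.std(axis=0)
    return (np.abs(z) <= n_sd).all(axis=1)
```

A typical use would be `clean = z[keep_mask(z)]`, where `z` is the output of `zscore_within_subject`, before passing the continuous features to the clustering step.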
Selection of the 3-state model
Three different EM clustering models were evaluated, describing the data using 2, 3, or 4 clusters. To evaluate clustering, we examined 3 distance-based metrics (the Davies–Bouldin, silhouette, and Calinski–Harabasz indices; Table 2). Across conditions and metrics, a 3-state model most often yielded the greatest distance between states. We additionally preferred the 3-state model because the states closely matched the characteristics of the 3 states identified in our most recent prior study using this same technique (Wamsley et al. 2023) and conformed to our theoretical expectations for the characteristics of online and offline states. Alternate clustering solutions with 2 and 4 states are presented in the Supplementary Results.
Metric | 2 clusters | 3 clusters | 4 clusters |
---|---|---|---|
High and Low Load conditions | | | |
Calinski–Harabasz index | 133.0ᵃ | 123.0 | 106.7 |
Davies–Bouldin index | 2.4 | 2.2ᵃ | 2.3 |
Silhouette index | 0.22 | 0.22ᵃ | 0.20 |
Task Free condition | | | |
Calinski–Harabasz index | 33.5 | 43.0ᵃ | 41.2 |
Davies–Bouldin index | 3.2 | 2.4 | 2.2ᵃ |
Silhouette index | 0.14 | 0.18 | 0.20ᵃ |

Notes. For the Calinski–Harabasz and Silhouette indices, higher values indicate a greater separation between clusters. For the Davies–Bouldin index, lower values indicate greater separation between clusters.
ᵃ Model with greatest separation for this metric.
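As a quick check on the statement that the 3-state model "most often" gave the greatest separation, the winners marked in Table 2 can simply be tallied. The sketch below copies the marked winner for each metric and condition from the table; the dictionary layout and variable names are illustrative assumptions. Note that the two Silhouette values for the High and Low Load conditions are tied after rounding, so the winner is taken from the table's marker rather than recomputed.

```python
from collections import Counter

# Winning number of clusters for each (condition, metric) pair,
# as marked in Table 2.
best_model = {
    ("High/Low Load", "Calinski-Harabasz"): 2,
    ("High/Low Load", "Davies-Bouldin"): 3,
    ("High/Low Load", "Silhouette"): 3,
    ("Task Free", "Calinski-Harabasz"): 3,
    ("Task Free", "Davies-Bouldin"): 4,
    ("Task Free", "Silhouette"): 4,
}

# Count how many of the six comparisons each model wins.
wins = Counter(best_model.values())
most_frequent_winner, n_wins = wins.most_common(1)[0]
```

The tally is 3 wins for the 3-cluster model, 2 for the 4-cluster model, and 1 for the 2-cluster model, so the metrics alone favor 3 states only modestly, which is consistent with the additional theoretical justification given in the text.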
Naïve Bayes classifier
A primary goal of this research was to be able to classify all trials into EM-defined online and offline states, even when experience sampling data were not present on that trial. To accomplish this, we trained a naïve Bayes classifier to determine the cluster assignment (online vs. offline) of each probe trial based on the pupil, EEG, and reaction time data alone (or VEP amplitudes in the Task Free condition), using stratified 10-fold cross validation. For the High and Low Load conditions, the classifier was 97.1% accurate at determining cluster assignment. For the Task Free condition, the classifier was 96.6% accurate at determining cluster assignment. We then applied these classifiers to label all trials as either “online” or “offline,” allowing us to define these states across the full length of the recording with 5 s temporal resolution.
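To make the classification step concrete, here is a minimal sketch of a naïve Bayes classifier evaluated with stratified 10-fold cross validation. It is not the authors' implementation: the Gaussian likelihood for every feature, the function names, and the synthetic three-feature data are assumptions chosen so the example is self-contained.

```python
import math
import random
from collections import defaultdict


def fit_gaussian_nb(X, y):
    """Estimate a prior and per-feature mean/variance for each class."""
    by_class = defaultdict(list)
    for row, label in zip(X, y):
        by_class[label].append(row)
    model = {}
    for label, rows in by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # A small constant guards against zero variance.
        variances = [sum((v - m) ** 2 for v in col) / n + 1e-9
                     for col, m in zip(columns, means)]
        model[label] = (n / len(X), means, variances)
    return model


def predict(model, row):
    """Return the class with the highest log posterior for one trial."""
    def log_posterior(label):
        prior, means, variances = model[label]
        total = math.log(prior)
        for v, m, var in zip(row, means, variances):
            total += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        return total
    return max(model, key=log_posterior)


def stratified_folds(y, k, rng):
    """Split sample indices into k folds with classes spread evenly across folds."""
    by_class = defaultdict(list)
    for i, label in enumerate(y):
        by_class[label].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, i in enumerate(indices):
            folds[position % k].append(i)
    return folds


def cross_validated_accuracy(X, y, k=10, seed=0):
    """Train on k-1 folds, test on the held-out fold, and pool the results."""
    folds = stratified_folds(y, k, random.Random(seed))
    correct = 0
    for test_idx in folds:
        held_out = set(test_idx)
        train_X = [X[i] for i in range(len(X)) if i not in held_out]
        train_y = [y[i] for i in range(len(y)) if i not in held_out]
        model = fit_gaussian_nb(train_X, train_y)
        correct += sum(predict(model, X[i]) == y[i] for i in test_idx)
    return correct / len(X)


# Synthetic, well-separated data (assumed): three z-scored features per trial.
rng = random.Random(1)
centres = {"Online": (0.0, 2.0, -1.0),
           "Offline1": (-2.0, -2.0, 2.0),
           "Offline2": (2.0, -2.0, 1.0)}
X, y = [], []
for label, centre in centres.items():
    for _ in range(60):
        X.append([rng.gauss(c, 0.5) for c in centre])
        y.append(label)

accuracy = cross_validated_accuracy(X, y)
```

Stratification keeps the proportion of each state roughly constant across folds, which matters when states are unequally frequent. Once validated, a model fit to all labeled probe trials can be applied to trials that lack an experience sampling response.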
Statistical analysis
Statistical analyses were conducted in R (R Core Team 2021). For analyses at the level of individual trials, we utilized random-intercept mixed-effect models, with trials grouped by subject. Where outcome variables were categorical, mixed-effect logistic regression models were implemented using the generalized linear model function “glmer,” from the lme4 package (Bates et al. 2015). Here, statistical significance was assessed using Wald chi-square tests. Where outcome variables were continuous, linear mixed-effect models were conducted using the lme4 and lmerTest packages for R (Kuznetsova et al. 2017). ANOVA and pairwise test statistics derived from these models used Satterthwaite’s method of estimating degrees of freedom. Pairwise comparisons between states were then conducted on the estimated marginal means using the emmeans package in R (Russell Lenth 2020).
For analyses of variables at the participant level (aggregated across trials), we used ANOVAs, t-tests, and Pearson’s correlations. Wherever large numbers of exploratory tests were conducted, the false discovery rate was controlled using the Benjamini–Hochberg method (Benjamini and Hochberg 1995). Comparison of subject-mean EEG spectral power between states at individual electrode sites was conducted using paired-samples t-tests, controlling false discovery across n = 64 electrodes using the Benjamini–Hochberg method.
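For readers unfamiliar with the Benjamini–Hochberg procedure, the step-up rule can be sketched as below. The analyses in this paper were run in R; this Python version, its function name, and the example P-values are illustrative assumptions only.

```python
def benjamini_hochberg(p_values, q=0.05):
    """Return one boolean per test: True if rejected at false discovery rate q.

    Step-up rule: find the largest rank k such that p_(k) <= q * k / m,
    then reject the k smallest P-values.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    max_rank = 0
    for rank, i in enumerate(order, start=1):
        if p_values[i] <= q * rank / m:
            max_rank = rank
    rejected = [False] * m
    for rank, i in enumerate(order, start=1):
        if rank <= max_rank:
            rejected[i] = True
    return rejected


# Example P-values (assumed) for ten exploratory tests.
example_p = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205, 0.212, 0.216]
example_decisions = benjamini_hochberg(example_p)
```

With these ten values, only the two smallest P-values survive correction, even though five are below 0.05 uncorrected. For the electrode-wise comparisons, the same rule would be applied across the 64 per-electrode P-values.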
Results
The naïve Bayes classifier was used to sort all n = 15,254 SART trials into one of three data-defined states (see Materials and methods). We refer to these states as the “Online” state (46.97% of trials), the “Offline1” state (23.55% of trials), and the “Offline2” state (29.49% of trials). Figure 3 displays the characteristics of the states in the High and Low Load conditions, and Fig. 4 displays the characteristics of the states in the Task Free condition.

Fig. 3. Characteristics of online and offline states in the High and Low Load conditions. A) Pupil diameter by waking state (z-scored, arbitrary units). B) Reaction times by waking state, in milliseconds. C) EEG spectral power by waking state (z-scored). States differed significantly from each other in all frequency bands, except that Offline1 and Online did not differ in 0.3 to 1 Hz (slow oscillation) activity. D) % Thought probe responses in each thought category, by waking state; * = P < 0.05, error bars = 95% CI.

Fig. 4. Characteristics of online and offline states in the Task Free condition. A) Pupil diameter by waking state (z-scored, arbitrary units). B, C) ERP component amplitudes by waking state (absolute value, z-scored). D) EEG spectral power by waking state (z-scored). States differed significantly from each other in all frequency bands, except that Offline1 and Offline2 did not differ in 4 to 7 Hz (theta), and Online and Offline2 did not differ in 1 to 4 Hz (delta) activity. E) % Thought probe responses in each thought category, by waking state; * = P < 0.05, error bars = 95% CI.
Online and offline states in the High and Low Load conditions
Online state
Online trials had rapid reaction times (P < 0.001 relative to both Offline1 and Offline2), as well as increased task-related external thought (P = 0.032 relative to Offline1 trials). Delta and theta power were high (Fig. 3C).
Offline1 state
Offline1 trials had high alpha EEG activity (Fig. 3C), as well as increased task-unrelated internal thought (P = 0.024 relative to Online trials). Pupil diameter was reduced during Offline1, relative to both Online (P < 0.001) and Offline2 trials (P < 0.001).
Offline2 state
Offline2 was most prominently characterized by high slow oscillation EEG activity (Fig. 3C). Reaction times were faster during Offline2 than Offline1 trials (P < 0.001), but slower than during Online trials (P < 0.001; Fig. 3).
Online and offline states in the Task Free condition
Online state
In the Task Free condition, Online trials again had high theta and delta EEG power (Fig. 4D). States did not differ in task-related external thought. However, task-related internal thought was more frequent during Online trials, relative to both Offline1 (P = 0.024) and Offline2 trials (P = 0.055).
Offline1 state
Offline1 again had high alpha EEG activity. Relative to Online trials, task-unrelated internal thought was increased (P = 0.002) and pupil diameter was decreased (P = 0.008). VEP amplitudes were increased during Offline1 trials, relative to Online trials (P = 0.002 for component 1 and P < 0.001 for component 2). Additionally, component 2 amplitude was significantly greater in Offline1 than Offline2 (P < 0.001).
Offline2 state
Offline2 was again most prominently characterized by high slow oscillation EEG activity (Fig. 4). Relative to Online trials, VEP component 1 amplitude was increased during Offline2 (P = 0.008), but component 2 amplitude was decreased (P = 0.042). Pupil diameter was smaller during Offline2, relative to Online trials (P = 0.025; Fig. 4).
Temporal features of online and offline states
As hypothesized, in the High and Low Load conditions, the probability of being Online decreased across trials (Low Load: r(271) = −0.329, P < 0.001; High Load: r(271) = −0.391, P < 0.001; Fig. 5A), while conversely, the probability of Offline1 increased across trials (Low Load: r(271) = 0.425, P < 0.001; High Load: r(271) = 0.527, P < 0.001; Fig. 5B). The probability of Offline2 decreased across trials in the High Load condition (r(271) = −0.136, P = 0.025; Fig. 5C) but did not change significantly across trials in the Low Load condition (r(271) = −0.086, P = 0.158; Fig. 5C). In the Task Free condition, the probability of being Online did not change across trials (r(272) = −0.072, P = 0.235). The probability of being in Offline1 increased substantially across trials in the Task Free condition (r(272) = 0.366, P < 0.001), while the probability of being in Offline2 decreased (r(272) = −0.301, P < 0.001). The sequence of state transitions was not significantly autocorrelated at any time lag in any of the experimental conditions.
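One plausible way to compute a time-on-task correlation of this kind is sketched below: for each trial position, take the proportion of participants classified into a given state, then correlate that proportion with trial number. Whether the authors aggregated in exactly this way is an assumption, as are the function names and the toy labels. If the value attached to each r denotes degrees of freedom (n − 2), then 271 would correspond to 273 trial positions.

```python
import math


def pearson_r(x, y):
    """Pearson product-moment correlation between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def state_probability_by_trial(labels_by_subject, state):
    """Proportion of participants classified into `state` at each trial position."""
    sequences = list(labels_by_subject.values())
    n_trials = len(sequences[0])
    return [sum(seq[t] == state for seq in sequences) / len(sequences)
            for t in range(n_trials)]


# Toy labels (assumed): two participants, four trials each.
labels = {
    "s1": ["Online", "Online", "Offline1", "Offline1"],
    "s2": ["Online", "Offline1", "Offline1", "Offline1"],
}
p_online = state_probability_by_trial(labels, "Online")
trial_numbers = list(range(1, len(p_online) + 1))
r_online = pearson_r(trial_numbers, p_online)
```

In the toy data the Online proportion falls from 1.0 to 0.0 across trials, giving a strongly negative correlation, the same direction as the effect reported for the High and Low Load conditions.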

Fig. 5. Effect of time on task on waking state. A) Association between SART trial number and probability of being in the Online state, by condition. B) Association between SART trial number and probability of being in the Offline1 state, by condition. C) Association between SART trial number and probability of being in the Offline2 state, by condition.
Effect of cognitive load on state occupancy and duration
Table 3 lists both the proportion of trials spent in each state and the mean posterior probability that a trial belongs to each state (as defined by the naïve Bayes classifier), by cognitive load condition. Because states were defined separately for the combined High/Low Load conditions and the Task Free condition, the Task Free condition is not statistically compared to the High and Low Load conditions. In the High Load condition, participants spent a greater proportion of trials Online (P = 0.005) and a lower proportion of trials in Offline1 (P < 0.001), relative to the Low Load condition (Fig. 6). Cognitive load did not significantly affect the proportion of trials spent in Offline2. Similarly, in the High Load condition, mean posterior probability of the Online state was higher (marginally, P = 0.072) and probability of the Offline1 state was lower (P = 0.009), relative to the Low Load condition. Cognitive load did not affect the probability of Offline2. There was no effect of cognitive load on dwell time (mean number of trials spent in a state before switching to a new state) for any of the three waking states (Table 3).
Measure | Task Free | Low Load | High Load | High vs. Low Load P-value |
---|---|---|---|---|
Proportion of trials by state | | | | |
% Online | 50.8% | 44.1% | 46.4% | 0.006 |
% Offline1 | 20.3% | 26.3% | 23.7% | <0.001 |
% Offline2 | 28.9% | 29.6% | 29.9% | 0.74 |
Probability of state | | | | |
Online probability | 0.48 ± 0.41 | 0.42 ± 0.41 | 0.43 ± 0.41 | 0.07 |
Offline1 probability | 0.22 ± 0.36 | 0.27 ± 0.38 | 0.25 ± 0.36 | 0.01 |
Offline2 probability | 0.31 ± 0.40 | 0.31 ± 0.40 | 0.31 ± 0.41 | 0.65 |
Dwell time | | | | |
Online | 1.68 ± 0.54 | 1.47 ± 0.27 | 1.49 ± 0.21 | 0.79 |
Offline1 | 1.25 ± 0.20 | 1.31 ± 0.15 | 1.31 ± 0.16 | 0.95 |
Offline2 | 1.33 ± 0.21 | 1.25 ± 0.12 | 1.29 ± 0.12 | 0.38 |

Notes. Means ± SD. Occupancy and dwell times in each of the three waking states, by condition. Occupancy is measured both as the % of trials classified as being in that state and the Bayesian posterior probability of a trial being classified in that state, according to the naïve Bayes algorithm. Dwell time is the mean number of trials that participants spent in each state before transitioning to another state. P-values are derived from linear mixed models comparing the High vs. Low Load conditions, as described in Materials and methods.
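The occupancy and dwell-time measures in Table 3 can be illustrated with a short sketch that operates on one participant's sequence of classified trials. The function names and the toy sequence are assumptions; in particular, counting the final (possibly interrupted) run toward dwell time is a choice made here for simplicity, not something the source specifies.

```python
from collections import defaultdict
from itertools import groupby


def occupancy(states):
    """Proportion of trials classified into each state."""
    counts = defaultdict(int)
    for state in states:
        counts[state] += 1
    return {state: count / len(states) for state, count in counts.items()}


def mean_dwell_time(states):
    """Mean number of consecutive trials spent in each state before switching."""
    run_lengths = defaultdict(list)
    for state, run in groupby(states):
        run_lengths[state].append(len(list(run)))
    return {state: sum(lengths) / len(lengths)
            for state, lengths in run_lengths.items()}


# Toy sequence (assumed) of eight consecutive 5 s trials.
sequence = ["Online", "Online", "Offline1", "Online",
            "Offline2", "Offline2", "Offline2", "Offline1"]
occ = occupancy(sequence)
dwell = mean_dwell_time(sequence)
```

Here the Online state occupies 3 of 8 trials across two runs of lengths 2 and 1, for a mean dwell time of 1.5 trials. Mean dwell times between 1 and 2 trials, as in Table 3, indicate that most visits to a state lasted only one or two 5 s trials.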

Fig. 6. State occupancy by condition. A) % of trials classified into each state, in the High and Low Load conditions. B) Posterior probability of a trial being classified into each state, in the High and Low Load conditions. P-values are derived from linear mixed models comparing the High vs. Low Load conditions, as described in Materials and methods. ** = P < 0.01, ⱡ = P < 0.1.
Effect of cognitive load on component features of the waking states
As anticipated, reaction times were slower in the High Load than the Low Load condition (Table 4). Alpha power was greater in the Low Load condition than in the High Load condition and greatest in the Task Free condition (Table 4). Meanwhile, slow oscillation power was greater in the High and Low Load conditions than during the Task Free condition (Table 4).
Measure | High Load | Low Load | Task Free |
---|---|---|---|
Reaction times (ms) | 553.64 ± 153.21ᵇ | 518.23 ± 140.67ᶜ | ND |
EEG power (z-scored) | | | |
Alpha (8 to 12 Hz) | −0.051 ± 0.943ᵃ,ᵇ | −0.009 ± 0.963ᵃ,ᶜ | 0.031 ± 1.011ᵇ,ᶜ |
Beta (13 to 35 Hz) | −0.005 ± 0.930 | −0.026 ± 0.971 | 0.001 ± 1.024 |
Theta (4 to 7 Hz) | −0.003 ± 0.976 | −0.015 ± 0.955 | −0.030 ± 0.952 |
Delta (1 to 4 Hz) | −0.01 ± 1.000 | 0.005 ± 0.997 | 0.000 ± 0.989 |
Slow (0.3 to 1 Hz) | 0.014 ± 0.992ᵃ | 0.010 ± 0.982ᵃ | −0.035 ± 1.013ᵇ,ᶜ |
Thought probe responses (%) | | | |
Task-unrelated external | 13% | 13.25% | 17.4% |
Task-unrelated internal | 20.48%ᵃ,ᵇ | 28.85%ᵃ,ᶜ | 44.2%ᵇ,ᶜ |
Mind blank | 3.52% | 4.91% | 4.3% |
Task-related external | 37.22%ᵃ,ᵇ | 23.29%ᵃ,ᶜ | 11.2%ᵇ,ᶜ |
Task-related internal | 25.77% | 29.7%ᵃ | 22.9%ᵇ |

Notes. For reaction time and EEG, values are means ± SD. For thought probes, values are the percentage of responses in each category, within condition. As described in Materials and methods, statistical comparisons were conducted using linear mixed models (or in the case of dichotomous outcomes, multilevel logistic regression). Reaction time data were not collected in the Task Free condition, because the participants were not performing a task. ND = no data.
ᵃ Differs significantly from Task Free.
ᵇ Differs significantly from Low Load.
ᶜ Differs significantly from High Load.
On thought probe trials, increased cognitive load was associated with increased task-related thought and decreased task-unrelated thought. Task-unrelated internal thoughts were most frequent in the Task Free condition, less frequent in the Low Load condition and least frequent in the High Load condition (Table 4). Task-related external thought followed the opposite pattern, being most frequent in the High Load condition, less frequent in the Low Load condition, and least frequent in the Task Free condition (Table 4). Finally, task-related internal thought was less frequent in the Task Free condition than the Low Load condition (Table 4).
As an additional measure of subjective experience during the SART, participants also retrospectively rated the proportion of time spent thinking about a variety of topics (Fig. 7). Thinking about the SART during its completion was positively related to cognitive load, such that participants spent the most time thinking about the SART in the High Load condition, less in the Low Load condition, and least in the Task Free condition (F(2, 40) = 15.8, P < 0.0001). Conversely, thoughts of the past (F(2, 40) = 3.49, P = 0.04) and future (F(2, 40) = 3.51, P = 0.04) were inversely related to cognitive load, such that participants spent the least time thinking about these topics in the High Load condition, more in the Low Load condition, and most in the Task Free condition.

Fig. 7. Effect of cognitive load on retrospective thought reports. Y-axis represents the proportion of the SART task that participants reported spending thinking about each of four predefined topics. As expected, thoughts about the task were proportionally greater when cognitive load was high, whereas thoughts about the past and future were proportionally greater when cognitive load was low or absent.
Intercorrelations between state features
There were several intercorrelations between waking state features that remained significant after correction for multiple comparisons (Fig. 8). Our a priori hypotheses that reaction times would be negatively correlated with pupil diameter and that pupil diameter would, in turn, be negatively correlated with alpha power were supported. Contrary to our predictions, reaction times were unrelated to alpha power. However, reaction times were positively correlated with beta power and negatively correlated with theta and slow oscillation power. In addition to being negatively correlated with alpha power, pupil diameter was also positively correlated with delta power. In the Task Free condition, amplitude of ERP component 1 was positively associated with pupil diameter and negatively associated with both beta and delta EEG power. The amplitudes of the two ERP components were also positively correlated with each other.

Fig. 8. Bivariate correlations between state features. Matrix of r-values for intercorrelations between the component features used to define waking states. * = correlation remains statistically significant after Benjamini–Hochberg correction for multiple comparisons.
Association of thought probe responses with other state features
As predicted, thought probe responses were associated with reaction times (High and Low Load conditions only: F(4, 911) = 7.36, P < 0.001). Specifically, mind blank responses were accompanied by slowed reaction times, compared to all other thought categories (max P = 0.009, following Benjamini–Hochberg correction). Additionally, task-unrelated internal responses were associated with slower reaction times than task-unrelated external responses (P = 0.024, following Benjamini–Hochberg correction). As anticipated, alpha power also varied by thought probe response (F(4, 1266) = 2.963, P = 0.019). Task-unrelated internal thought was associated with greater alpha power than task-related internal thought (P = 0.017). While theta power also varied by thought probe response (F(4, 1265) = 2.789, P = 0.025), no pairwise comparisons survived Benjamini–Hochberg correction. Contrary to our predictions, thought probe responses were not associated with pupil diameter. Thought probe responses also were not related to slow oscillation, delta, or beta power. In the Task Free condition, thought probe responses were not related to ERP amplitudes.
As expected, the proportion of internal task-unrelated thought probe responses was inversely associated with the proportion of external task-related responses, in all experimental conditions (High Load: r(21) = −0.59, P = 0.003; Low Load: r(21) = −0.66, P = 0.0006; Task Free: r(18) = −0.70, P = 0.0006).
Waking-state EEG topographies
To describe the topography of EEG spectral power during Offline1 and Offline2, power in each frequency band was compared to the Online state, separately at each of the 64 electrode sites. After Benjamini–Hochberg correction for multiple comparisons, Offline1 trials differed from Online trials only in increased alpha power (8 to 12 Hz), and Offline2 trials differed from Online trials only in increased slow oscillation power (0.3 to 1 Hz). As displayed in Fig. 9, alpha power was increased during Offline1 at the majority of electrodes, but most strongly at frontal and occipital locations (Fig. 9, left). During Offline2, slow oscillation power was most strongly increased at frontal locations, but was also significantly elevated over posterior regions (Fig. 9, right).

Fig. 9. Topography of EEG spectral power in offline states. After Benjamini–Hochberg correction for multiple comparisons, Offline1 trials differed from Online trials only in the alpha (8 to 12 Hz) band (left), and Offline2 trials differed from Online trials only in the slow oscillation (0.3 to 1 Hz) band (right). Color scale represents t-values derived from a t-test for unequal variances comparing states. Interpolation between electrodes by spherical splines.
Discussion
Wakefulness is not homogeneous, and accumulating evidence suggests it may comprise a state-like structure. Here, as in our prior work, we used a data-driven method to parse extended periods of wakefulness into three distinct states (Wamsley and Summer 2020; Wamsley et al. 2023). One state (“Online”) was a task-focused state, during which reaction times were rapid, task-related thought was frequent, and pupil diameter was large. As predicted, increasing the cognitive load of the SART caused participants to spend more time in the Online state. Also as predicted, participants spent less time in the Online state with increasing time on task, presumably as their attention waned. A second state (“Offline1”) appears to be an inwardly focused offline state, during which reaction times slowed, task-unrelated thoughts increased, pupil diameter decreased, and EEG alpha was high. As hypothesized, a third state (“Offline2”) was most prominently characterized by increased slow oscillation EEG. Other features of Offline2 mixed characteristics typically ascribed to online and offline states. Offline2 was accompanied by marked EEG slowing in all conditions, as well as reduced task-related thought and reduced pupil diameter in the Task Free condition, all characteristics typically associated with being offline. But in the High and Low Load conditions, Offline2 in some ways resembled the Online state, with a large pupil diameter and intermediate-speed reaction times. Notably, the Online, Offline1, and Offline2 states strongly resemble those described in our most recent prior study (Wamsley et al. 2023), which also observed both a high alpha offline state (Offline1) and a high slow oscillation offline state (Offline2).
Effect of cognitive load
We have proposed these states reflect a fluctuation between online and offline modes of cognition, with the two offline states representing internally focused attention, and the online state representing externally focused attention. If this is the case, the amount of time spent online should be greatest when task demands are high. In support of this hypothesis, increasing cognitive load indeed increased the amount of time spent online, while decreasing the amount of time spent in Offline1. This supports our interpretation that the Online state reflects task-oriented processing and that Offline1 reflects a period of sensory decoupling, in which attention turns inward.
Contrary to our hypotheses, Offline2 was not influenced by cognitive load and differed only moderately from the Online state in subjective experience, pupil diameter, and reaction time. As reviewed above, Offline2 mixed some features that typify an offline state with others indicating high arousal and task focus. Speculatively, this could indicate that Offline2 is a time during which participants are experiencing internally oriented thought that is focused and goal-directed. This would be consistent with the fact that participants report a high level of internally oriented thoughts about the SART during Offline2 and is in line with phenomenological evidence that intentional goal-directed internal thought forms a substantive portion of mind wandering (Seli, Risko, and Smilek 2016; Seli, Risko, Smilek, et al. 2016). If Offline2 represents internally oriented mental effort, this helps explain why slow oscillation power would be reduced during the Task Free condition, in comparison to the High and Low Load conditions. A prominent feature of Offline2, slow oscillation power could be associated with effortful attention to internal mental states, a category of cognitive activity that would be reduced in the Task Free condition inasmuch as participants are spending less time thinking about the SART.
In a previous study, prevalence of Offline2 following encoding predicted enhanced retention of just-learned information (Wamsley et al. 2023). Because of this, we have speculated that Offline2 is a time during which memory reactivation and consolidation occur. Future research should continue to test whether this is indeed the case, and if so, to better characterize the nature and function of Offline2.
It was not possible to validly compare the amount of time spent in each state in the Task Free condition to the amount of time spent in each state in the High and Low Load conditions. This is because states were defined separately in the Task Free condition, using a separate clustering model based on different features (i.e. evoked potentials in lieu of reaction time). Therefore, while it may at first appear odd that participants seem to spend a larger amount of time “online” in the Task Free condition than when they are performing a task, we do not think this is meaningful.
Persistence of online/offline states in the absence of a cognitive task
Our second main hypothesis was that fluctuation between online and offline states would continue even when participants were not performing a task. This hypothesis was confirmed. In the Task Free condition, participants were asked to look at the computer screen, but were told they did not need to pay attention to the stimuli or perform any task. Still, the same three states of wakefulness were apparent, suggesting that fluctuation between these waking states is an intrinsic feature of wakefulness that occurs even when we are ostensibly doing nothing and is not solely a reaction to performing an attention-demanding task. Thus, neither Offline1 nor Offline2 can strictly be referred to as a “mind wandering” state, as they exist even when participants are not intentionally attempting to attend to a task for their mind to wander from.
Increase in evoked potential amplitude during offline states
In the Task Free condition, we expected the amplitude of evoked potentials to decrease during the offline states as a result of inattention. Instead, relative to the Online state, both components increased in amplitude during Offline1, and component 1 increased in amplitude during Offline2. Our expectations were based on evidence that late ERP components decrease in amplitude when participants mind wander during an attention task (Smallwood et al. 2008; Kam et al. 2022). In hindsight, it seems clear that our study differs from these prior investigations, because in the Task Free condition, participants were instructed to look at but not pay attention to the stimuli. This creates a very different situation than in mind wandering studies, where mind wandering is typically defined as a failure to attend to an ongoing task. Studies of the effect of global vigilance state on evoked potentials may therefore be more useful in interpreting our current evoked potential results. It is well known that the amplitude of late ERP components increases during the profoundly inattentive state of sleep (Wesensten and Badia 1988; Winter et al. 1995; de Lugt et al. 1996). Even during wakefulness, states of reduced vigilance have been associated with increased amplitude of late components, in a task where participants were not explicitly attending to the stimuli (Huang et al. 2017). As observed in sleep and other states of reduced vigilance, increased amplitude of evoked responses in Offline1/Offline2 could be caused by reduced cortical–cortical connectivity, resulting in a large-amplitude response generated locally, but with reduced propagation of that response to other cortical regions, mirroring reduced stimulus awareness (Massimini et al. 2005; Ferrarelli et al. 2010).
Comparison to our earlier observations
The states described here resemble, but do not exactly replicate, those observed in our most recent prior study (Wamsley et al. 2023). EEG profiles of the states are nearly identical between studies, with one state characterized by high delta and theta power (Online), another by high alpha (Offline1), and a third by high slow oscillation power (Offline2). This strong replication of the EEG differences between states is a main factor that led us to use the same naming convention in this paper as in Wamsley et al. (2023). Other similarities between the states described here and those in Wamsley et al. (2023) include that pupil diameter was greater in Offline2 than Offline1, and being offline was associated with decreased task-related thought and increased task-unrelated thought.
However, there are also differences between the states described here and in Wamsley et al. (2023). Most notably, here we report that the Online state was associated with large pupil diameter and fast reaction times, relative to the offline states. But in contrast, the Online state in Wamsley et al. (2023) was associated with relatively smaller pupil diameter and slower reaction times. These discrepancies may be explained by the different demands of the SART versions used in these two studies. Wamsley et al. (2023) used a traditional go/no-go SART in which participants saw the digits 1 to 9 appear sequentially on the computer screen and were asked to respond as quickly as possible to each stimulus, except to the target (“3”), to which they were to make no response. In contrast, participants in the current study had to also make odd/even distinctions on target trials, thus introducing higher cognitive demands, especially in the High Load condition. These increased task demands may have required increased arousal and attention to support optimal performance, causing the Online state to be associated with larger pupil diameter than in our earlier study, in which good performance required minimal attention to nontarget trials.
Increased cognitive demands of the SART may also explain why the Online state was associated with relatively fast reaction times in the current study, but relatively slow reaction times in Wamsley et al. (2023). Optimal performance on the standard version of the SART used in Wamsley et al. (2023) is associated with slowed reaction times, indicative of participants actively monitoring for the occurrence of infrequent target trials (Polychroni et al. 2020). In the current study, as participants had to both monitor for target trials and keep track of whether stimuli were odd or even, reaction times were overall much slower than in Wamsley et al. (2023). This may have altered the association between attentiveness and reaction time—while reaction times in the current study were faster online than in either offline state, still they were slower than the mean reaction time for any state in Wamsley et al. (2023). Thus, we propose that for the current task, optimally attentive performance is associated with the ability to react more quickly, even in the face of multiple cognitive demands. Altogether, it may be that while the EEG and subjective experience features of these states are relatively impervious to behavioral context, other features of the states can vary in accordance with varying task demands.
Comparison to other approaches
Interest in defining distinct global brain states during wakefulness has grown in recent years (Greene et al. 2023). There have been diverse other approaches to parsing wakefulness into substates, coming from multiple disciplines within neuroscience and psychology, spanning a range of techniques, and measuring mind/brain activity at varying spatial and temporal scales.
In psychology, the term “mind wandering” has been applied to a diverse collection of experiences differing along dimensions including task-relatedness, stimulus-relatedness, and degree of spontaneity (Christoff et al. 2016, 2018; Seli et al. 2018). Several investigators have taken a data-driven approach to classifying the subtypes of subjective experience during wakefulness (Wang et al. 2018; Zanesco et al. 2020). Zanesco et al. (2020), for example, applied hidden Markov modeling to ratings of task-focus during the SART, decomposing these data into a set of three underlying states. While both we and Zanesco et al. (2020) arrived at a three-state model, given the highly divergent measures used, it is not clear how this or any other proposed taxonomy of subjective mind wandering maps onto the states that we describe in the current paper.
Meanwhile, numerous studies have described rhythmic fluctuations in the perception of external stimuli (VanRullen 2016). While this typically has been reported to occur at much faster timescales than studied here, rhythmic fluctuations in the perception of experimental stimuli are another line of evidence suggesting that humans continuously alternate between externally focused and internally focused modes of attention. Such alternation between “internally biased” and “externally biased” modes has been argued to support learning across multiple timescales (Honey et al. 2017).
Other investigators have previously applied machine-learning methods to multimodal data in an attempt to delineate distinct brain states associated with on-task vs. mind wandering experience or behavior (Mittner et al. 2014; Groot et al. 2021). Both Groot et al. (2021) and Mittner et al. (2014), for example, have used neural data in combination with pupillometry to understand mind wandering. Their approach differs from the current study, however, in that their aim was to find neural correlates of the subjective experience of mind wandering. In contrast, we use subjective experience as just one of several measures in a clustering approach and allow that the resulting states might not map onto categories of subjective experience in a straightforward manner.
Ultimately, our observations complement, rather than compete with, this and other prior work. While all of these approaches to operationally defining waking states have their strengths, some advantages to our particular method are as follows:
(1) Our approach incorporates information from reports of subjective experience, but does not define “offline” exclusively based on subjective experience, and therefore does not require that subjective experience reports be present on every trial. Because of this, we are able to describe changes in waking state on a seconds-long timescale.
(2) While researchers agree on the broad strokes of the concept, there has been little agreement on the precise operational definition of an offline state. A data-driven approach maximally avoids researcher preconceptions about the features of waking states, allowing the possibility of discovering state features that are not yet incorporated into contemporary theoretical models.
(3) In practical terms, the use of EEG, pupillometry, and relatively straightforward analytic methods offers an approachable and low-cost method of measuring offline states in humans.
Practical implications for future research
One impetus for this study was to explore whether varying cognitive load could be a practicable method of experimentally manipulating the amount of time spent in online and offline states, thus providing a way to experimentally test the function of these states in future research. We were successful in manipulating the amount of time spent in the Online and Offline1 states, although the size of these effects was small. However, our past research led us to a specific interest in the potential role of Offline2 in memory performance (Wamsley et al. 2023), and manipulating cognitive load had no effect on this state. Manipulating cognitive load therefore does not appear to be a promising method of experimentally testing the effect of Offline2 on memory consolidation. Cognitive load manipulations might nonetheless be useful for investigating potential functions of Offline1. Because Offline1 is a high-alpha state characterized by reduced attention to the external environment, avenues of future research might include investigating its role in semantic memory access, which has been argued to rely on oscillations in the high alpha range (Klimesch 1997; Klimesch et al. 1997; Doppelmayr et al. 2002).
In summary, here we validate a method by which extended periods of wakefulness can be usefully parsed into a series of seconds-long online and offline states. The three states identified continue to occur in the absence of a cognitive task, and two of them respond to changes in cognitive load in the hypothesized manner. Along with our other studies using this method (Wamsley and Summer 2020; Wamsley et al. 2023), these observations provide converging evidence that the movement between these states reflects an alternation between moments in which our attention is focused on the external environment, and moments in which our attention is directed inward. Our data suggest that this alternation between online and offline attention can occur rapidly, on a timescale of seconds, and persists even in the absence of a cognitive task. As proposed elsewhere, this alternation between internally focused and externally focused states is well positioned to support learning and memory (Honey et al. 2017; Wamsley et al. 2023).
Acknowledgments
We thank student research assistants Madison Arora, Hannah Wood, Regan Schuetze, Tingtong Liu, Bridget Scalia, Justin Barron, and Lauren Omotosho, as well as lab coordinator Lauren Hudachek, for their contributions to data collection.
Author contributions
Erin Wamsley (Conceptualization, Formal analysis, Funding acquisition, Methodology, Project administration, Resources, Supervision, Visualization, Writing—original draft) and Megan Collins (Investigation, Methodology, Project administration, Software, Supervision, Writing—review & editing).
Funding
This research was supported by NSF grant BCS-1849026.
Conflict of interest statement: None declared.
Footnotes
In some literatures, particularly in animal work, the term “quiet wakefulness” is sometimes used to describe the same concept that we refer to as an “offline state.” The term “offline” is related to but not synonymous with “mind wandering,” in that participants may or may not be subjectively aware of being in an offline state.