Abstract

In contrast to classical views of working memory (WM) maintenance, recent research investigating activity-silent neural states has demonstrated that persistent neural activity in sensory cortices is not necessary for active maintenance of information in WM. Previous studies in humans have measured putative memory representations indirectly, by decoding memory contents from neural activity evoked by a neutral impulse stimulus. However, it is unclear whether memory contents can also be decoded in different species and attentional conditions. Here, we employ a cross-species approach to test whether auditory memory contents can be decoded from electrophysiological signals recorded in different species. Awake human volunteers (N = 21) were exposed to auditory pure tone and noise burst stimuli during an auditory sensory memory task using electroencephalography. In a closely matching paradigm, anesthetized female rats (N = 5) were exposed to comparable stimuli while neural activity was recorded using electrocorticography from the auditory cortex. In both species, the acoustic frequency could be decoded from neural activity evoked by pure tones as well as neutral frozen noise burst stimuli. This finding demonstrates that memory contents can be decoded in different species and different states using homologous methods, suggesting that the mechanisms of sensory memory encoding are evolutionarily conserved across species.

Introduction

As a crucial component of adaptive intelligence, working memory (WM) allows for the temporary retention of information, breaking the immediacy of sensations and enabling actions that are not reflexive. Sensory WM refers to an organism’s ability to retain information from a specific sensory modality (Spector 2011). Auditory sensory memory (ASM) is a low-level subset of auditory WM wherein features of acoustic events are automatically maintained for a short period of time without the need for active cognitive retention (Pasternak and Greenlee 2005). In comparison to auditory WM, which is actively maintained over longer periods of time, ASM can be thought of as a passively retained “buffer” that decays over time and is “overwritten” by new auditory sensory input (Pasternak and Greenlee 2005), serving as a temporary store before relevant information can be moved into higher-level memory systems when required. The auditory system cannot “re-hear” acoustic events, and as such the automatic retention of such events is essential for higher-level cognitive processes (e.g., active maintenance of auditory WM or long-term storage). ASM is thus a vital, low-level function of the memory system, upon which our understanding of higher-level auditory memory functions is built.

Early findings demonstrated that maintenance of WM was accompanied by persistent neural activity in frontal areas (Tark and Curtis 2009; Huang et al. 2016). Recent studies have demonstrated that WM is not always reflected in sustained neural activity. This has led to new conceptualizations as to how WM maintenance takes place in the brain (Fries 2005; Mongillo et al. 2008; Stokes 2015; Kamiński and Rutishauser 2019). One possibility is that WM is instantiated by “activity-silent” neural states, whereby sensory cortical regions do not show sustained activity during the retention period despite clearly observed activity during encoding and recall periods (Stokes 2015). It has been shown that those activity-silent neural states can be probed by measuring “impulse responses” to a standardized broad-band probe stimulus during the activity-silent period and decoding the resultant neural response to make inferences about its contents (Wolff et al. 2020, 2015).

Existing research on silent coding of WM has focused on higher-level auditory processes, such as active retention of auditory features during WM maintenance, or on lower-level processes in the visual system (Wolff et al. 2015, 2020). Although cross-species studies enjoy the benefits of established research methods developed and refined for each species (Mishra and Gazzaley 2016), most research on the neural correlates of WM, particularly studies employing activity-silent coding paradigms, has focused on the prefrontal cortex or relied on single-species, single-conscious-state models (Constantinidis and Procyk 2004; Bigelow et al. 2014; Stokes 2015; Murray et al. 2017; Spaak et al. 2017). Thus, whether the mechanism subserving WM is preserved across species, conscious states, and hierarchical levels remains unknown. To begin to address this, here we capitalize on cross-species investigations and on understanding low-level memory processes, which we see as integral to understanding higher-level functions in the auditory WM system; as such, we focus on silent-state activity during the ASM period. We use a multivariate decoding method to analyze data acquired from electroencephalogram (EEG) recordings in awake humans and electrocorticography (ECoG) recordings in anesthetized rats, decoding stimulus features from neural activity evoked by both the stimuli and frozen noise bursts presented during the ASM period. Such an approach allows us to bring invasive techniques to bear that help with precise localization, and allows for the investigation of causal mechanisms using methods that are not available in human subjects alone.
By testing whether the neural response to frozen noise bursts contains information about the stimulus feature held in ASM, we hope to establish grounding for new models of ASM research by demonstrating the efficacy of cross-species and cross-attention-state approaches, elaborating on existing decoding research in a field whose limits have yet to be defined.

Methods and Materials

Human Electroencephalography

Participant Sample

Participants (N = 21, 12 male, 9 female, median age = 25, age range = 22–50) volunteered to take part in the study upon written consent. The work was conducted in accordance with protocols approved by the Human Subjects Ethics Sub-Committee of the City University of Hong Kong. All subjects were self-reported as healthy with no hearing impairment.

Behavioral Paradigm and Stimulus Design

Stimuli were presented in 10 separate blocks, where participants responded to stimulus pairs. Prior to task blocks, participants were given the opportunity to modify the playback volume from its default level of 83 dB SPL if they judged it to be too loud or too soft, with levels adjusted by the researcher within ±5 dB SPL to a comfortable setting for each participant.

During blocks, participants were presented with a pair of pure tones separated by an auditory burst of frozen pink noise (a full-bandwidth noise signal with equal power in proportionately wide bands and a power density decreasing at 3 dB per octave) (Fig. 1A). Tones preceding the frozen noise burst (T1) were randomly drawn from a set of six semitones forming a half-octave chromatic scale starting at 440 Hz. Tones following the frozen noise burst (T2) were detuned per trial by picking at random one of 10 possible frequencies from a set extending from one semitone below to one semitone above T1 in steps of 22.22% of a semitone. The same frozen noise burst stimulus was used across trials and participants. Each of the three events within the stimulus sequence was 200 ms in duration, with pure tones tapered by 5 ms linear on- and off-ramps. Stimulus events within each sequence were separated by gaps of silence, with both gap durations drawn from uniform distributions across trials. Gaps between T1 and the frozen noise burst ranged between 0.6 and 1.6 s, and gaps between the frozen noise burst and T2 ranged between 0.3 and 0.8 s, consistent with the lower range of time intervals used in ASM paradigms (Nees 2016). Gap durations, tone frequencies, and detuning values were assigned at random for each trial, with T2 always being a detuning of T1, for an average of 100 presentations of each T1 frequency per subject. A 600 ms wait time was employed after a “start trial” button press, and before the presentation of stimuli, in order to separate T2 traces and motor activity from the neural response to T1.
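For concreteness, the tone set and detuning grid described above can be sketched in Python/NumPy. The original stimuli were generated with different software; all names, and the construction of the detuning grid (10 steps of 2/9 of a semitone spanning ±1 semitone, none exactly zero), are our own illustrative reading of the text.

```python
import numpy as np

rng = np.random.default_rng(0)

# Six T1 candidates: a half-octave chromatic scale starting at 440 Hz
# (adjacent tones separated by one semitone, ratio 2**(1/12)).
t1_set = 440.0 * 2.0 ** (np.arange(6) / 12.0)

# Ten detunings from -1 to +1 semitone in steps of 2/9 (~22.22%) of a
# semitone; no step is exactly zero, so T2 always differs from T1.
detune_steps = -1.0 + (2.0 / 9.0) * np.arange(10)

def draw_trial(rng):
    """Draw one trial: T1 and T2 frequencies (Hz) and the silent gaps (s)."""
    t1 = rng.choice(t1_set)
    t2 = t1 * 2.0 ** (rng.choice(detune_steps) / 12.0)
    gap1 = rng.uniform(0.6, 1.6)  # T1 -> frozen noise burst
    gap2 = rng.uniform(0.3, 0.8)  # frozen noise burst -> T2
    return t1, t2, gap1, gap2
```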

Figure 1

Stimuli and recording techniques. (A) Human volunteers were exposed to homologous stimulus streams in an ASM task, in which they were asked to report whether the frequency of the sample tone was higher or lower than the frequency of the probe tone. (B) Rats were exposed to stimulus streams under anesthesia (adapted from Polley et al. 2007).

During blocks, a black screen with a white fixation cross was presented. Participants were instructed to press the “Start” button on a USB joypad to begin each trial and were tasked with identifying whether T2 was higher or lower in frequency than T1 by pressing buttons labeled either “Higher” or “Lower.” A short break at the participant’s discretion was given every 60 trials, with the percentage of correct responses over the last 60 trials displayed on the screen during the break. Note that our primary research question focused on “silent” neural activity related to the echoic memory of T1 during the period in which the frozen noise burst was presented; the task and performance feedback served merely to incentivize participants to continue through the blocks with similar levels of attention and engagement throughout. Although the presence of a task can be thought to introduce an element of active retention (thus bringing the paradigm into the purview of WM), we differentiate ASM from WM in terms of time scales rather than task requirements. No feedback was given on a trial-by-trial basis, nor were rewards or punishments employed. In total, participants completed 600 task trials.

Neural Data Acquisition and Preprocessing

Neural data were collected via an ANT Neuro EEGo Sports 64-channel 10–20 EEG system, grounded at the nasion and referenced to the CPz electrode, at a sampling rate of 1000 Hz. Each participant completed 10 blocks (~1.5 h), recorded in succession with short breaks. Participants were seated in a quiet room, fitted with Brainwavz B100 earphones, which delivered the audio stimuli via a MOTU Ultralite MK3 USB soundcard at 44.1 kHz, 16 bit. Data from all 21 subjects were included in the EEG analysis, but only 17 subjects were included in the behavioral analysis, because incorrect responses were not accurately recorded for the first four subjects. EEG data were preprocessed using the SPM12 Toolbox (Wellcome Trust Centre for Neuroimaging, University College London; RRID: SCR_007037) for MATLAB (The MathWorks; RRID: SCR_001622). Continuous data were high-pass filtered at 0.2 Hz and downsampled (with antialiasing filtering) to 300 Hz for computational efficiency. A notch filter was then applied between 48 and 52 Hz before low-pass filtering at 90 Hz. All filters were fifth-order zero-phase Butterworth. Eyeblink artifact detection was performed using channel Fpz for all but one subject (for whom Fz was substituted because of a bad Fpz channel in that subject’s recording), and eyeblink artifacts were removed by subtracting their two spatiotemporal principal components from all EEG channels (Ille et al. 2002). Data were then re-referenced to the average of all channels, segmented into epochs ranging from 100 ms before to 500 ms after stimulus onset for all stimulus events of interest (sample tone, frozen noise burst, and probe tone), and denoised using the “Dynamic Separation of Sources” (DSS) algorithm (de Cheveigné and Simon 2008).
This denoising procedure is commonly applied to maximize reproducibility of stimulus-evoked responses across trials, while preserving differences between responses evoked by different stimulus types (de Cheveigné and Simon 2008; de Cheveigné and Parra 2014). For each subject, epoched data were linearly detrended, and the first seven DSS components (constituting the most reproducible components, as determined based on data ranging from −100 to 500 ms relative to tone/frozen noise burst onset) were retained and used to project both the tone-evoked and frozen noise burst data back into sensor-space.

Animal Electrocorticography

Subjects, Experimental Apparatus, and Surgical Procedures

Five adult female Wistar rats, acquired from the Chinese University of Hong Kong, were used as subjects. Naive rats aged between 16 and 24 weeks (median age = 20 weeks), weighing between 257 and 345 g (median weight = 285 g), were tested for normal hearing (click auditory brainstem response thresholds < 20 dB) and had received no prior stimulus exposure. A mixture of ketamine (80 mg/kg, intraperitoneal injection; i.p.) and xylazine (12 mg/kg, i.p.) was used to induce anesthesia at the outset of the experiment. Dexamethasone (0.2 mg/kg, i.p.) was delivered before surgery as an anti-inflammatory. Anesthesia was maintained throughout the experiment via urethane injections (0.75 mg/kg, i.p.) starting 1 h after the initial dose of ketamine and xylazine, with supplementary doses (0.2–0.5 mL) delivered based on the presence of a withdrawal reflex when the animal’s toes were pinched. Based on previous rodent studies (Malmierca et al. 2019), this protocol allowed for faster induction of anesthesia via the initial administration of ketamine and xylazine, while avoiding the subsequent NMDA-specific inhibitory effects of ketamine through the use of urethane to maintain anesthesia for ECoG recordings. The anesthetized animal was placed in a stereotaxic frame to allow hollow ear bars to be placed for sound delivery and to fix the animal’s head for craniotomy. Body temperature was maintained at 36 ± 1 °C with an electric heating pad throughout the procedure and monitored via a rectal thermometer. During surgery, the skin was cut and muscle tissue over the temporal lobe of the skull was removed to allow for a unilateral craniotomy exposing a 5 × 4 mm region over the right primary auditory cortex, leaving the dura intact. The cranial window started 2.5 mm posterior to bregma and extended ventrally from the temporal edge of the lateral skull surface, in order to locate the auditory cortex during craniotomy (Fig. 1B).
The ECoG array was placed over the exposed cortex and a cotton roll was placed between the skin and the array to keep impedance low and the array securely in place.

Correct placement of the ECoG array was verified by recording a set of Frequency Response Areas (FRAs) from each site, by collecting responses to 100 ms pure tones varying in sound level (30–80 dB SPL) and frequency (0.5–32 kHz, ¼-octave steps). Each tone was presented 10 times, in a randomly interleaved fashion, with an onset-to-onset ISI of 500 ms. FRA maps for each ECoG array placement were used to verify that the placement of the array was similar across subjects. Note also that the spatial principal component analysis (PCA; described below), which underpins our analysis, was performed separately for each subject, minimizing any effects of array misalignment between subjects.

Experimental Paradigm and Stimulus Design

For the main experiment, the stimulus sequence closely matched the sequence administered in the human study but was adjusted for the rat hearing range and passive delivery. Audio sequences consisted of tones followed by the same frozen pink noise bursts used in the human paradigm, each separated by gaps of silence with a duration chosen randomly from the interval 0.5–1 s. Tones were randomly drawn from a set of six frequencies spaced seven semitones apart, beginning at 1100 Hz. The lower limit of 1100 Hz ensured that all tones were well above the lower limit of the rat’s frequency range. The seven-semitone spacing (just over half an octave) was chosen so that the tones would be relatively easy for the rat auditory system to discriminate, while avoiding frequency steps corresponding to an integer number of octaves, so that all tones differed not only in pitch height but also in pitch chroma. To our knowledge, there is currently no evidence that the rat auditory system is sensitive to pitch chroma or perceives octave equivalence, but ensuring that no chroma confusion could arise if it does was an easy precaution to take. Each tone was presented binaurally, in random order, 50 times per block, with two animals exposed to two blocks and the remaining three exposed to three blocks. Both tones and noise bursts were 200 ms in duration, with tones tapered by 5 ms cosine on/off ramps (Fig. 1B).

Neural Data Acquisition and Preprocessing

ECoG signals were acquired at a sampling rate of 24 414 Hz using an 8 × 8 Viventi ECoG electrode array (Woods et al. 2018) with 400 μm electrode spacing, three ground channels located in the corners of the array, and a common reference. The array was connected to a Tucker Davis Technologies (TDT) PZ5 neurodigitizer and recorded via an RZ2 processor (controlled by BrainWare software). Acoustic stimuli were delivered by a TDT RZ6 multiprocessor at a playback sampling rate of 48 828 Hz. To extract neural activity evoked by acoustic stimuli, the recorded electrode signals were first low-pass filtered at a cutoff frequency of 90 Hz using a fifth-order Butterworth filter and downsampled to 300 Hz. We decided to analyze low-frequency (local field) potentials rather than high-gamma-band activity, as they provide a closer homologue to human EEG signals. As for the human EEG data, the preprocessed signals were then re-referenced to the average of all channels, as commonly used in ECoG studies (Ball et al. 2009), and segmented by extracting 600 ms long voltage traces from −100 ms to +500 ms relative to the onset of each tone or frozen noise burst stimulus. The epoched traces were baseline-corrected by subtraction of the mean prestimulus voltage values, and linearly detrended (Salisbury 2012).
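The epoching, baseline-correction, and detrending steps common to both datasets can be sketched as follows (Python/NumPy; the function and variable names are our own, and stimulus onsets are assumed to be given as sample indices):

```python
import numpy as np
from scipy.signal import detrend

def epoch_trials(data, onsets, fs=300, pre=0.1, post=0.5):
    """Cut (-100, +500) ms epochs around each stimulus onset, subtract
    the mean prestimulus voltage, and linearly detrend each trace.
    data: (n_channels, n_samples); onsets: onset sample indices.
    Returns (n_trials, n_channels, n_times)."""
    n_pre, n_post = int(pre * fs), int(post * fs)
    epochs = np.stack([data[:, o - n_pre : o + n_post] for o in onsets])
    # Baseline correction: subtract the mean prestimulus voltage.
    baseline = epochs[:, :, :n_pre].mean(axis=-1, keepdims=True)
    # Linear detrend removes the best-fit line from each epoch.
    return detrend(epochs - baseline, axis=-1, type="linear")
```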

Univariate Analysis: Summarizing Tone-evoked and Frozen Noise Burst-evoked Activity

As an initial step, EEG and ECoG data were subject to univariate analyses, to assess whether tone frequency modulated tone-evoked and frozen noise burst-evoked activity on a channel-by-channel basis. Epoched data were averaged across trials, separately for each tone frequency. First, to visualize the evoked responses, event-related potentials (ERPs) were concatenated across tone frequencies and participants/animals, resulting in 2D matrices with single channels along one dimension and concatenated time points, tone frequencies, and participants/animals along the second dimension. These matrices were then subject to PCA using singular value decomposition, resulting in spatial principal components (describing channel topographies) and temporal principal components (describing voltage time-series concatenated across tone frequencies and participants/animals), sorted by the ratio of explained variance. The top principal components explaining 95% of the original variance were summarized by calculating their weighted average, weighted by the proportion of variance explained. The resulting summarized voltage time-series were then averaged per tone frequency across participants/animals. In an identical procedure, frozen noise burst-evoked single-trial data were averaged across trials, separately for each preceding tone frequency, and subject to PCA as described above.
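The PCA-based summary described above can be sketched via singular value decomposition (Python/NumPy). The 95% variance threshold and variance-proportional weighting follow the text; all other implementation details (demeaning, scaling of the temporal components) are assumptions of this sketch.

```python
import numpy as np

def summarize_components(erp, var_threshold=0.95):
    """Summarize ERPs by a weighted average of the top principal
    components explaining `var_threshold` of the variance.

    erp: (n_channels, n_concat_timepoints) ERPs concatenated across
    tone frequencies and participants/animals. Returns one summary
    time series of length n_concat_timepoints."""
    x = erp - erp.mean(axis=1, keepdims=True)      # demean per channel
    u, s, vt = np.linalg.svd(x, full_matrices=False)
    var = s ** 2 / np.sum(s ** 2)                  # explained-variance ratio
    k = int(np.searchsorted(np.cumsum(var), var_threshold)) + 1
    w = var[:k] / var[:k].sum()                    # weights ~ variance explained
    # Weighted average of the temporal components (scaled by singular values).
    return (w[:, None] * (s[:k, None] * vt[:k])).sum(axis=0)
```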

Next, to test whether any time points and channels showed significant amplitude correlations with tone frequency, single-participant ERP data in the original sensor-space (i.e., prior to the principal component analysis, which was only used for visualization purposes) were converted into three-dimensional images (two spatial dimensions and one temporal dimension) and entered into a general linear model (GLM), separately for each species (humans, rats) and stimulus type (pure tone, neutral frozen noise burst). Each GLM was based on a flexible factorial design with one random factor (participant/rat) and one fixed factor (tone frequency/preceding tone frequency). A parametric linear contrast across the six frequencies was designed to test for the effect of tone frequency on ERP amplitude. The resulting statistical parametric maps were thresholded at P < 0.05 (two-tailed) and corrected for multiple comparisons across spatiotemporal voxels at a family-wise error (FWE)-corrected P = 0.05 (cluster-level) (Kilner et al. 2005).

The human EEG data were additionally source-localized to infer the most probable cortical sources contributing to the sensor-level effects. Specifically, since we observed a significant univariate effect of tone frequency on the amplitude of tone-evoked responses (but not noise-evoked responses; see Results section), we focused on estimating the source activity underlying the sensor-level effects of tone frequency on tone-evoked responses. To this end, we used the multiple-sparse-priors approach to source localization under group constraints (Litvak and Friston 2008). For each tone frequency and participant, the entire time epoch (from 100 ms before to 500 ms after tone onset) of sensor-level tone-evoked responses over all EEG channels was subject to source localization. The resulting source estimates within the time window in which we observed significant results (113–260 ms relative to tone onset; see Results section) were converted into 3D images (in MNI space), smoothed with a 6 × 6 × 6 mm Gaussian kernel, and entered into a GLM with one within-subjects factor (Tone Frequency) and one between-subjects factor (Participant). Following the estimation of the GLM, we obtained statistical parametric maps for parametric linear contrasts, which were then thresholded at P < 0.05 (two-tailed, uncorrected). Significant effects were inferred at a cluster-level P < 0.05 (FWE, small-volume corrected), correcting for multiple comparisons across voxels under random field theory assumptions (Kilner et al. 2005). Sources were labeled using the Neuromorphometrics atlas, as implemented in SPM12.

Multivariate Analysis: Decoding Sensory and Mnemonic Tone Frequency Information

To test whether information about tone frequency can be decoded from the pattern of tone-evoked and frozen noise burst-evoked activity observed across multiple channels and time points, we subjected the data to multivariate analyses. To this end, we adapted methods established in previous research on multivariate EEG decoding of visual stimulus orientation during visual WM tasks (Myers et al. 2015; Wolff et al. 2017; van Ede et al. 2018) and similar approaches in decoding active retention of auditory stimuli (i.e., pure tones) during WM (Wolff et al. 2020).

A multivariate decoding method was employed to analyze data acquired from both species, decoding the frequency of the preceding T1 stimulus from neural activity evoked by the frozen noise bursts. The noise bursts did not carry any overt information about the sample tone, given that the noise tokens used were always identical and were presented well after sample tone-evoked responses had returned to baseline (> 600 ms following the offset of the sample tone in humans and > 500 ms in anesthetized rats). Channels with an average signal-to-noise ratio (SNR; defined as the ratio of root-mean-square values of poststimulus and prestimulus amplitudes) lower than 8 dB (Alaerts et al. 2009) were discarded from the analysis. This resulted in discarding 3.17% ± 1.53% of EEG channels (mean ± SD) from subsequent multivariate decoding. All ECoG channels in all rats fulfilled the SNR criterion and were used in subsequent decoding. Prior to decoding, single-trial tone-evoked responses were sorted by tone frequency, and single-trial frozen noise burst-evoked responses were sorted by preceding tone frequency.
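A sketch of the SNR channel-rejection criterion (Python/NumPy). The text does not specify the dB convention, so the amplitude convention (20·log10 of the RMS ratio) is assumed here, as is the averaging order (per trial, then averaged over trials).

```python
import numpy as np

def channel_snr_db(epochs, n_pre):
    """Per-channel SNR: RMS of poststimulus over RMS of prestimulus
    amplitudes, in dB, averaged over trials.

    epochs: (n_trials, n_channels, n_times); the first n_pre samples
    of each epoch are prestimulus."""
    rms_pre = np.sqrt((epochs[:, :, :n_pre] ** 2).mean(axis=-1))
    rms_post = np.sqrt((epochs[:, :, n_pre:] ** 2).mean(axis=-1))
    return (20.0 * np.log10(rms_post / rms_pre)).mean(axis=0)

# Channels below the 8 dB criterion would then be discarded:
# keep = channel_snr_db(epochs, n_pre) >= 8.0
```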

We sought to determine whether activity evoked by the sample tone (probing the sensory trace), and/or by the frozen noise burst (probing ASM contents), contained information about the sample tone feature (Fig. 2). To estimate decoding time-courses, we adopted a sliding window approach, integrating over the relative voltage changes within a 100 ms window of each time point (Wolff et al. 2020). This approach is a direct replication of previously established multivariate decoding methods (Wolff et al. 2020). Furthermore, pooling information over multiple time points (in addition to multiple channels) in a multivariate manner has been shown to boost decoding accuracy (Grootswagers et al. 2017; Nemrodov et al. 2018). To this end, per channel and trial, the time segments within 100 ms of each analyzed time-point were downsampled by binning the data over 10 ms bins, resulting in a vector of 10 average voltage values per channel. Next, the data were demeaned by removing the channel-specific average voltage over the entire 100 ms time window from each channel and time bin. This step ensured that the multivariate analysis approach was optimized for decoding transient activation patterns (voltage fluctuations around a zero mean) at the expense of more stationary neural processes (overall differences in mean voltage) (Wolff et al. 2020). The vectors of binned single-trial temporal data were then concatenated across channels for subsequent leave-one-out cross-validation decoding. As a multivariate decoding metric, we used the Mahalanobis distance (De Maesschalck et al. 2000), taking advantage of the potentially monotonic relation between tone frequency and neural activity (Auksztulewicz et al. 2019; Wolff et al. 2020). In other words, responses to similar tones are expected to yield low Mahalanobis distance metrics, while responses to more dissimilar tones are expected to yield larger Mahalanobis distance metrics. 
In a leave-one-out cross-validation approach (which has been shown to be optimal for EEG decoding, Grootswagers et al., 2017) per trial, we calculated six pairwise distances between EEG/ECoG amplitude fluctuations measured in a given test trial and mean vectors of EEG/ECoG amplitude fluctuations averaged for each of the 6 tone frequencies in the remaining trials. The Mahalanobis distances were computed using the shrinkage-estimator covariance obtained from all trials excluding the test trial (Ledoit and Wolf 2004). This approach, combining Mahalanobis distance with Ledoit–Wolf shrinkage, has been previously shown to outperform other correlation-based methods of measuring the dissimilarity between brain states (Bobadilla-Suarez et al. 2019). Mahalanobis distance-based decoding has also been shown to be more reliable and less biased than linear classifiers and simple correlation-based metrics (Walther et al. 2016).
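The sliding-window feature extraction and leave-one-out Mahalanobis decoding described above can be sketched as follows (Python, using scikit-learn’s Ledoit-Wolf estimator). The placement of the 100 ms window relative to each time point, and the details of the covariance estimation, are assumptions of this sketch; function names are our own.

```python
import numpy as np
from sklearn.covariance import LedoitWolf

def window_features(epochs, t_idx, fs=300):
    """Bin the 100 ms of data preceding sample `t_idx` into 10 ms
    averages, demean per channel over the window, and concatenate
    across channels. epochs: (n_trials, n_channels, n_times)."""
    n_win, n_bin = int(0.1 * fs), int(0.01 * fs)   # 30 and 3 samples at 300 Hz
    seg = epochs[:, :, t_idx - n_win : t_idx]
    n_tr, n_ch, _ = seg.shape
    binned = seg.reshape(n_tr, n_ch, n_win // n_bin, n_bin).mean(axis=-1)
    binned -= binned.mean(axis=2, keepdims=True)   # remove channel-specific means
    return binned.reshape(n_tr, -1)                # (n_trials, n_ch * 10)

def loo_mahalanobis(features, labels):
    """For each held-out trial, Mahalanobis distances to the mean
    patterns of each tone frequency computed from all other trials,
    using a Ledoit-Wolf shrinkage covariance estimate."""
    classes = np.unique(labels)
    dists = np.zeros((len(labels), classes.size))
    for i in range(len(labels)):
        train = np.delete(np.arange(len(labels)), i)
        # Shrinkage-estimator precision from all trials except the test trial.
        prec = LedoitWolf().fit(features[train]).precision_
        for j, c in enumerate(classes):
            mu = features[train][labels[train] == c].mean(axis=0)
            d = features[i] - mu
            dists[i, j] = np.sqrt(d @ prec @ d)
    return dists
```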

Figure 2

Decoding method. (A) Decoding methods were based on estimating multivariate Mahalanobis distance between EEG/ECoG feature amplitudes in a given (test) trial and average amplitudes calculated for all six frequencies features, respectively (excluding the test trial). The top panel presents EEG/ECoG feature amplitudes for two example features (empty circle, test trial; solid circles, ERPs calculated from the remaining trials; acoustic frequencies are color-coded). Dashed lines on the top panel and bars on the bottom panel represent the multivariate distance between amplitudes observed in the test trial and the remaining trials. (B) Frequency-tuning matrices summarizing the population-level tuning curves were obtained after averaging across trials, per frequency, resulting in a 6 × 6 similarity matrix between all tone frequencies (each row represents the distance of all test trials of a given frequency to the remaining trials sorted per frequency and is shown in columns). The observed frequency-tuning matrices (top, an example from one participant) were regressed with the “ideal” tuning matrix (bottom), which consisted of the difference (in Hz) between pairs of tone frequencies. This regression coefficient provided a summary statistic that reflects decoding quality (i.e., how closely the relative dissimilarity between tone-evoked neural responses; “observed” in the figure) corresponds to the relative dissimilarity between tone frequencies (“ideal” in the figure).

The single-trial relative Mahalanobis distance estimates were then averaged across trials per tone frequency (for tone-evoked responses) or preceding tone frequency (for frozen noise burst-evoked responses), resulting in a 6 × 6 distance matrix for each analyzed time point. Overall decoding quality was quantified by comparing the estimated distance matrices with an “ideal decoding” distance matrix, with the lowest distance values along the diagonal and linearly increasing distance values along the off-diagonal. To obtain an easily interpretable measure of decoding quality, for each participant/animal and time point (from 50 ms before to 450 ms after tone/frozen noise burst onset), we normalized the observed and ideal decoding matrices by demeaning and dividing the entire matrix by its maximum absolute value and calculated the linear regression slope coefficient between the estimated distance matrix and the ideal distance matrix. Following data normalization, the resulting regression coefficients ranged between −1 (below-chance decoding) and 1 (ideal decoding) and formed decoding time-series which effectively summarized, per time point, how well the observed decoding matrices approximate the ideal decoding matrix. These decoding time-series were then smoothed with a Gaussian smoothing kernel (SD = 16 ms; Wolff et al. 2020) and averaged across participants/animals.
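The normalization and regression against the “ideal” distance matrix can be sketched as follows (Python/NumPy; a minimal reading of the procedure above, with illustrative names). By construction, a distance matrix identical to the ideal one yields a coefficient of 1, and its negation yields −1.

```python
import numpy as np

def decoding_quality(dist_mat, freqs):
    """Regress the normalized observed 6x6 distance matrix onto the
    normalized 'ideal' matrix of pairwise frequency differences (Hz).
    Returns the least-squares slope coefficient."""
    ideal = np.abs(freqs[:, None] - freqs[None, :]).astype(float)

    def norm(m):
        m = m - m.mean()                 # demean the whole matrix
        return m / np.max(np.abs(m))     # scale by maximum absolute value

    x, y = norm(ideal).ravel(), norm(dist_mat).ravel()
    return float(x @ y / (x @ x))        # slope of observed on ideal
```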

Furthermore, to quantify the comparison between decoding based on human EEG and rat ECoG, we performed a representational similarity analysis on the estimated distance matrices at six different time points (from 50 ms before to 450 ms after stimulus onset, in 100 ms steps) for the tones and bursts. Specifically, for both human EEG and rat ECoG data, we calculated pairwise Pearson correlation coefficients (Kriegeskorte et al. 2008) between the 12 distance matrices (obtained for tones and bursts, at six time points). This resulted in a 12 × 12 correlation matrix which summarized how similar the multivariate decoding of tone frequency was across time points as well as between tone-evoked and burst-evoked responses.
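The representational similarity analysis reduces to correlating vectorized distance matrices, which can be sketched as (Python/NumPy; the ordering of the 12 matrices is illustrative):

```python
import numpy as np

def rsa_matrix(dist_mats):
    """Pairwise Pearson correlations between vectorized distance
    matrices (here, 12: tones and bursts at six time points).
    Returns an (n, n) correlation matrix."""
    vecs = np.stack([np.asarray(m).ravel() for m in dist_mats])
    return np.corrcoef(vecs)
```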

To establish the null distribution for statistical testing, we used a permutation-based approach, such that in each permutation the single-trial relative distance metrics were randomly reassigned stimulus labels. The resulting reshuffled single-trial decoding estimates were averaged across trials to obtain surrogate distance matrices. These distance matrices were then normalized and subjected to linear regression, smoothing over time, and averaging across participants/rats, as described above. This procedure was repeated 10 000 times to obtain a null distribution of decoding estimates for each time point. Per time point, P values quantifying the significance of above-chance decoding were calculated as the proportion of surrogate decoding estimates exceeding the observed decoding estimate. Across time points, P values were corrected using a false-discovery-rate (FDR) approach at FDR = 0.05 (Benjamini and Hochberg 1995). This procedure allowed for implementing exactly the same statistical procedures for both EEG and ECoG datasets.
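The permutation procedure for a single time point can be sketched as follows, assuming single-trial relative distances and true stimulus labels as inputs. This is a simplified, self-contained illustration (names are ours); smoothing and the FDR correction across time points are omitted.

```python
import numpy as np

def _slope_vs_ideal(mat):
    # Regression slope between the normalized observed and ideal distance matrices
    n = mat.shape[0]
    ideal = np.abs(np.subtract.outer(np.arange(n), np.arange(n))).astype(float)
    x = ideal - ideal.mean()
    x /= np.abs(x).max()
    y = mat - mat.mean()
    y /= np.abs(y).max()
    return float(np.dot(x.ravel(), y.ravel()) / np.dot(x.ravel(), x.ravel()))

def permutation_pvalue(trial_dists, labels, n_perm=10_000, seed=0):
    """One-sided P value for above-chance decoding at one time point,
    obtained by randomly reassigning stimulus labels across trials.
    trial_dists: (n_trials, n_freqs) relative distances per trial;
    labels: (n_trials,) true frequency index per trial."""
    rng = np.random.default_rng(seed)
    n_freq = trial_dists.shape[1]

    def estimate(lbl):
        # Average single-trial distances per label -> surrogate distance matrix
        mat = np.stack([trial_dists[lbl == k].mean(axis=0) for k in range(n_freq)])
        return _slope_vs_ideal(mat)

    observed = estimate(labels)
    null = np.array([estimate(rng.permutation(labels)) for _ in range(n_perm)])
    # Proportion of surrogate estimates exceeding the observed estimate
    return float((null >= observed).mean())
```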

Results

Behavioral Results

Performance across all human subjects yielded an average accuracy of 79% (standard error of the mean (SEM) = 3.49%). A repeated-measures analysis of variance (RM ANOVA) was conducted on accuracy, with a within-subject factor of probe division and participants as a random factor. A second RM ANOVA was performed on the hit rates with a within-subject factor of ISI. ISIs were categorized into 10 time windows for analysis, with the windows derived from equal-range divisions between the smallest and largest ISI. The RM ANOVAs revealed no effect of ISI time window on performance (P > 0.05), but a significant effect of the magnitude of T2 frequency detuning relative to T1 on task performance (F(1,16) = 518.45, P < 0.001), with smaller detuning values resulting in more incorrect responses. These results indicate that the task was sufficiently difficult to keep subjects engaged and that memory items were reliably retained during trials.
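The equal-range ISI binning used for this analysis can be sketched in numpy (a minimal illustration; the function name is ours):

```python
import numpy as np

def bin_isis(isis, n_bins=10):
    """Assign each ISI to one of `n_bins` equal-range windows spanning the
    smallest to the largest ISI (returns 0-indexed bin labels)."""
    isis = np.asarray(isis, dtype=float)
    edges = np.linspace(isis.min(), isis.max(), n_bins + 1)
    # np.digitize is right-exclusive; clip so the maximum ISI
    # falls into the last bin rather than overflowing it
    return np.clip(np.digitize(isis, edges) - 1, 0, n_bins - 1)
```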

Univariate Analysis: Single-channel Correlations with Tone Frequency

Our univariate analysis compared the averaged ERPs per T1 frequency, as well as the averaged ERPs for the frozen noise bursts that followed given T1 frequency values. Using the FWE-corrected univariate tests outlined in the Methods, we observed a significant effect of tone frequency on tone-evoked responses in both human (EEG) and rat (ECoG) datasets (Fig. 3). In human EEG, a single cluster of amplitudes correlated with tone frequency, extending over bilateral anterior and right temporal channels and ranging between 113 and 260 ms after stimulus onset (pFWE = 0.021, Tmax = 3.27). Using a source localization procedure (see Methods section), we inferred the most likely cortical sources contributing to these sensor-level effects, which were localized to the right superior temporal gyrus (MNI coordinates: [48, −6, −16]; cluster-level pFWE = 0.007; Fmax = 21.01; Zmax = 3.93) and the right middle/inferior frontal gyrus (MNI coordinates: [38, 50, −2]; cluster-level pFWE = 0.022; Fmax = 10.84; Zmax = 2.87). Similarly, in rat ECoG, a single cluster of amplitudes with a broad spatial distribution and a temporal range of 63–160 ms postonset was observed (pFWE = 0.012, Tmax = 7.21). In contrast, frozen noise burst-evoked responses did not correlate with preceding tone frequency (human EEG and rat ECoG: all clusters P > 0.8), indicating that the ERP amplitudes of frozen noise burst-evoked responses are not sufficient for univariate decoding of frequency labels in either EEG or ECoG.

Univariate analyses. (A, D) In humans and rats, tones and frozen noise bursts evoked robust neural activity; different frequencies are represented as individual traces, from lowest frequencies (black traces) to highest frequencies (blue/red traces). Shaded areas denote SEM across subjects. (B, E) In humans and rats, tone-evoked activity correlated with tone frequency (parametric contrast T values; highlighted clusters: pFDR < 0.05). However, no significant effects of tone frequency were observed in univariate analyses of frozen noise burst-evoked activity. (C) Source localization of the univariate effect of tone frequency on tone-evoked EEG responses. Significant sources of activity, whose activity was parametrically related to tone frequency, were identified in the right superior temporal gyrus (rSTG) and in the right middle/inferior frontal gyrus (rMFG/IFG).
Figure 3


Multivariate Analysis: Decoding Tone Frequency from Transient Response Patterns

Our multivariate analysis computed distance matrices for neural responses evoked by tones of different frequencies, or by neutral frozen noise bursts preceded by tones of given frequencies. These matrices were compared to an “ideal decoding” distance matrix to quantify overall decoding quality. This analysis revealed that, as in the univariate analysis, tone frequency was reflected in tone-evoked response patterns (Fig. 4). In human EEG data, significant decoding (pFDR < 0.05) was observed between 10 and 450 ms relative to tone onset (all betas > 0.041, peak beta = 0.175; all P < 0.027), while in rat ECoG data, significant decoding was observed between −23 and 413 ms relative to tone onset (all betas > 0.041, peak beta = 0.709; all P < 0.033). Note that each decoding estimate for a given time point is based on data pooled over a 100 ms time window centered around that time point; hence, the exact latency of decoding onsets should be treated with ±50 ms precision. Taken together, tone frequency could be robustly decoded from tone-evoked activity in both humans and rats.

Multivariate analyses. (A, D) In humans and rats, tone frequency could be decoded from both tone-evoked (blue) and frozen noise burst-evoked activity (red). Red/Blue shaded areas: SEMs across subjects. Gray shaded areas: 95% confidence intervals of the null distribution of decoding time-series, reflecting the range of values for which decoding could have been observed by chance. Horizontal bars: pFDR < 0.05. Individual markers in (D) represent individual rats’ decoding peaks. (B, E) Relative Mahalanobis distance matrices per time point, forming the basis for decoding (beta coefficients) in (A, D). (C, F) Pairwise similarity (Pearson correlation coefficients) of the Mahalanobis distance matrices. Saturated colors mark P < 0.05.
Figure 4


However, unlike in the univariate analysis, tone frequency was also reflected in the subsequent frozen noise burst-evoked response patterns. Decoding of the previously heard T1 frequency from frozen noise burst ERPs was present in both EEG and ECoG. Significant decoding in EEG occurred between 247 and 343 ms relative to frozen noise burst onset (all betas > 0.036, peak beta = 0.070; all P < 0.027, FDR-corrected). In ECoG data, significant decoding was present in three time windows (early cluster: 13–160 ms after frozen noise burst onset, all betas > 0.071, peak beta = 0.192; middle cluster: 226–303 ms, all betas > 0.075, peak beta = 0.110; late cluster: 343–400 ms, all betas > 0.062, peak beta = 0.118; all P < 0.033, FDR-corrected). Given the relatively low number of rats, we inspected individual rats’ decoding peaks to exclude the possibility that the three significant clusters resulted from individual differences in peak latencies across rats. For each of the identified clusters, the majority of individual rats had at least one decoding peak within the cluster (cluster 1: 4/5 rats; cluster 2: 4/5 rats; cluster 3: 3/5 rats). Thus, in both species, the neural response elicited by the auditory frozen noise burst contained statistically significant information about the previously heard stimuli retained in sensory memory.

We further quantified the similarity of the decoding matrices obtained for tone-evoked and burst-evoked responses (Fig. 4C and F) using a representational similarity analysis (Kriegeskorte et al. 2008). This revealed qualitatively similar decoding correlation patterns in the two species: significant correlations were observed in both species, both across time points within tone-evoked and burst-evoked responses and between tone-evoked and burst-evoked responses. However, the decoding matrices based on rat ECoG data were more similar across time points and between tone-evoked and burst-evoked responses than the decoding matrices based on human EEG data. Specifically, the decoding matrices based on rat ECoG data were highly correlated across all poststimulus time points for both tone-evoked responses (all pairwise ρ > 0.9, all P < 0.001; 50–450 ms after tone onset) and burst-evoked responses (all pairwise ρ > 0.6, all P < 0.001; 50–350 ms after burst onset), as well as between tone-evoked and burst-evoked responses (all pairwise ρ > 0.7, all P < 0.001; between 50–450 ms after tone onset and 50–350 ms after burst onset). In contrast, the decoding matrices based on tone-evoked human EEG responses were only highly correlated across neighboring time points (ρ = 0.959, P < 0.001 for 50/150 ms after tone onset; ρ = 0.913, P < 0.001 for 250/350 ms after tone onset) and, to a smaller extent, across more distant time points (all remaining pairwise ρ > 0.396, P < 0.018). They were less consistently correlated across time points for burst-evoked responses (maximum ρ = 0.583, P < 0.001) and between tone-evoked and burst-evoked responses (maximum ρ = 0.782, P < 0.001).

In a control analysis, to ensure that frequency decoding based on noise burst-evoked responses was not driven by trials presented at short ISIs (and possibly contaminated by the neural response evoked by the preceding tone), we entered the single-trial epoched data into linear regression and, per channel and time point, calculated the residuals after regressing out the ISI preceding noise burst onset from single-trial amplitude values. These residuals were then used to obtain decoding estimates for both human EEG and rat ECoG data, as described above. In both cases, the decoding results were virtually identical to those of the original analysis, with all previously reported clusters of significant decoding also reaching statistical significance in the control analysis, and no additional clusters appearing. Therefore, it is unlikely that trial-by-trial differences in the ISI between tone and noise burst contributed to the decoding results reported above.
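The ISI residualization in this control analysis amounts to ordinary least-squares regression per channel and time point. A minimal numpy sketch (names are ours):

```python
import numpy as np

def regress_out_isi(amplitudes, isis):
    """Residualize single-trial amplitudes (n_trials,) at one channel/time
    point against the preceding ISI, removing linear ISI-related variance."""
    # Design matrix: intercept + ISI
    X = np.column_stack([np.ones_like(isis, dtype=float), isis])
    beta, *_ = np.linalg.lstsq(X, amplitudes, rcond=None)
    return amplitudes - X @ beta  # residuals, orthogonal to the ISI
```

The residuals are then fed into the decoding pipeline in place of the raw amplitudes.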

Discussion

As established in previous research, the neural response to a sensory frozen noise burst contains information about the contents of WM held in the activity-silent period (Wolff et al. 2020, 2015). We extended these findings by designing a task that does not require active retention and uses shorter time intervals, placing stimuli in the range of lower-level ASM, and by applying this technique in a cross-species approach. We demonstrate that stimulus features can be decoded from responses evoked by stimulus events using a univariate analysis, in which ERP amplitude is parametrically modulated by stimulus value in both anesthetized rats and awake human subjects, consistent with existing research conducted in awake humans (Wolff et al. 2015; Auksztulewicz et al. 2019). It is worth noting that the significant decodability visible in the baseline of our stimulus decoding (Fig. 4) can be attributed to the length of the sliding time window used in decoding based on spatiotemporal patterns of transient responses. Our use of smaller intervals between frequencies in human trials (six tones, one semitone apart) further demonstrates that this technique is robust enough to decode more subtle differences in auditory stimuli than shown in previous literature (Wolff et al. 2020), and the comparable results in anesthetized rats under passive stimulus exposure serve as a counterpoint to a WM interpretation of behavioral paradigms that employ active tasks while operating within the ASM temporal range.

In our human EEG data, the observed univariate relationship between ERP amplitude and tone frequency could be source-localized to the right higher-order auditory cortex (superior temporal gyrus) and frontal regions (middle/inferior frontal gyrus). These findings are consistent with the right-lateralization of spectral (versus temporal) auditory processing in the superior temporal gyrus (Poeppel 2003; Schonwiesner et al. 2005; Britton et al. 2009) and with the parametric encoding of tone frequency in the right frontal cortex during memory tasks (Spitzer and Blankenburg 2012). Importantly, the fact that sensor-level EEG effects of tone frequency could also be source-localized to auditory regions (in addition to frontal regions) makes the human EEG results more comparable to the rat ECoG data, which were based on signals recorded only over auditory regions.

In addition to decoding a stimulus feature from the neural response to stimulus events, we also demonstrate that equivalent paradigms can be used to decode stimulus features from neural responses to uncorrelated auditory frozen noise bursts (Auksztulewicz et al. 2019; Wolff et al. 2020, 2015). Interestingly, in the univariate analysis, no relationship was found between the amplitudes of EEG/ECoG responses evoked by frozen noise bursts and the preceding tone feature, illustrating the need for multivariate decoding methods. The results of our multivariate EEG decoding show significant decoding of burst-evoked responses later in the time course (400–500 ms) than found in similar research (Wolff et al. 2020), suggesting that periods of decodability may be task-dependent. Our study employed a considerably narrower range of token values than previously attempted, possibly testing the limits of this particular decoding method in the context of silent-state neural encoding of memory tokens. Although the longer latency of frequency decoding from noise bursts was surprising, both noise-evoked and tone-evoked ERPs (Fig. 3C) were typical, showing comparable latencies and peaks, with tone-evoked decoding (Fig. 4D) also appearing at the expected latency. Additionally, late reactivation of latent WM traces has been shown by some earlier studies (Wolff et al. 2015) and may be better understood in the context of template matching, with late reactivation required for comparison in the EEG task (Myers et al. 2015; Wolff et al. 2015).

Traditionally, neural correlates of ASM in humans have been investigated in oddball paradigms yielding mismatch negativity (MMN) responses (Winkler et al. 1993). MMN responses to deviant stimuli during passive auditory oddball paradigms can also be observed in the absence of consciousness, for example, in some comatose patients (Morlet and Fischer 2014). Interestingly, previous MMN studies also fall in line with silent-coding theories, as MMN responses have been postulated to result from deviant stimuli in comparison to an existing ASM trace (Näätänen et al. 2005). Delayed match to sample (DMTS) tasks have also been considered a reliable method of investigating both sensory and WM (Daniel et al. 2016), and classical studies have employed DMTS paradigms to establish psychometric functions of ASM retention and decay periods (Nees 2016). In addition to their usefulness in behavioral studies, DMTS paradigms have been employed in modern human WM research (Myers et al. 2015; Wolff et al. 2020, 2015) to investigate neural components of sensory and WM traces. Interestingly, despite the use of rats in auditory research, many of which use similar DMTS tasks with jittered time intervals, there is sparse literature on ASM periods in the rat (O’Connor and Ison 1991).

In this study, we sought to fill several gaps in the existing literature by employing an ASM task to investigate neural correlates of ASM across species using contemporary decoding methods. In the broader scope of sensory memory research, our work introduces a useful new tool for observing and analyzing these neural phenomena. Existing tools, such as MMN and stimulus-specific adaptation (SSA), access this information indirectly by assessing the modulation of the neural response to a particular repeated stimulus (Carbajal and Malmierca 2018). As demonstrated in previous studies (Costa-Faidella et al. 2011), repetition effects can be observed in time intervals coinciding with windows typical of both ASM and SSA, making such findings somewhat ambiguous given the presence of adaptation effects across multiple timescales in the auditory system. Similarly, multiple time scales of adaptation corresponding to stimulus duration have been observed in single-unit cortical recordings in anesthetized cats (Ulanovsky et al. 2004), and recent work has demonstrated topographically organized tone selectivity in SSA across multiunit cortical recordings in the anesthetized rat (Nieto-Diego and Malmierca 2016). Taken together, this may explain why we observed decodability with the multivariate approach but not with the univariate approach: the recording methods employed in our study measure much larger neural populations than those possibly underlying SSA selectivity and, as a result, lack the sensitivity to measure the more fine-grained patterns of tone selectivity accessible in single-unit recordings (Ulanovsky et al. 2004; Nieto-Diego and Malmierca 2016; Natan et al. 2017). As such, our decoding approach may provide a valuable tool for assessing this phenomenon that is applicable both to invasive recordings in animal models and to noninvasive recordings in humans.

Our cross-species approach is, to the authors’ knowledge, the first attempt at decoding auditory memory traces in different species using the same analysis method and comparable stimuli. While the decoding matrices obtained from rat ECoG data were more strongly correlated across time points and between tone-evoked and burst-evoked responses (Fig. 4C and F), qualitatively similar correlation patterns were observed for decoding matrices obtained from human EEG data. Taken together, these findings suggest that the neural encoding of sensory memories is a general mechanism that has been evolutionarily maintained across species—a prospect that is also supported by previous MMN research using rat models. One such study observed a mismatch response from epidural potentials in anesthetized rats when presented with deviant tones in an oddball paradigm (Astikainen et al. 2011). Additional studies have yielded similar findings in awake and anesthetized rats using similar methods (Nakamura et al. 2011). As the ability for an organism to quickly differentiate between acoustic changes in its environment offers a potential benefit to its survival, such findings support the notion of ASM as an evolutionarily conserved adaptive trait. Our findings, paired with those previously mentioned and the limited behavioral studies available on rat ASM, further suggest the suitability of the rat in establishing animal models for research in central auditory processing.

In contrast to previous auditory studies requiring human participants to attend to a memory item (Wolff et al. 2020), our results demonstrate that active maintenance is not required for this approach to work, placing our findings alongside existing human ASM research relying on MMN responses, which have been shown to be conserved across conscious states (Winkler et al. 1993; Morlet and Fischer 2014). Of significance to the field, our findings suggest that animal models may provide an acceptable proxy for human sensory memory research, offering the benefits of significant decodability and the higher signal-to-noise ratios of electrocorticography, which is not feasible in human subjects, with implications for research across conscious states. Future studies could capitalize on our findings, possibly applying these methods to asleep or unconscious humans and to awake rats. Given the key differences between ASM and WM (e.g., ASM being an automatic process that is present across attentive states and operates over shorter time scales than the higher-level WM system), future studies could also apply our approaches to paradigms that manipulate WM contents, or investigate their efficacy in WM retention intervals. While at shorter time windows, such as those employed in our study, ASM and SSA partially overlap (Costa-Faidella et al. 2011), future research should also seek to establish whether the observed effects differ between active memory processes and passive adaptation. Furthermore, applying these decoding methods in further research in anesthetized rats would be a logical extension, as areas such as longer retention time scales and the manipulation of passively maintained memory items remain largely unexplored in this context.

Notes

We would like to thank Fei Peng for help with data acquisition, and Vani Rajendran for helpful discussions. Conflict of Interest: None declared.

Funding

This work has been supported by the European Commission’s Marie Skłodowska-Curie Global Fellowship (750459 to R.A.), the Hong Kong General Research Fund (11100518 to R.A. and J.S.) and a grant from European Community/Hong Kong Research Grants Council Joint Research Scheme (9051402 to R.A., D.P. and J.S.).

References

Alaerts J, Luts H, Hofmann M, Wouters J. 2009. Cortical auditory steady-state responses to low modulation rates. Int J Audiol. 48:582–593.

Astikainen P, Stefanics G, Nokia M, Lipponen A, Cong F, Penttonen M, Ruusuvirta T. 2011. Memory-based mismatch response to frequency changes in rats. PLoS One. 6.

Auksztulewicz R, Myers NE, Schnupp JW, Nobre AC. 2019. Rhythmic temporal expectation boosts neural activity by increasing neural gain. J Neurosci. 39:9806–9817.

Ball T, Kern M, Mutschler I, Aertsen A, Schulze-Bonhage A. 2009. Signal quality of simultaneously recorded invasive and non-invasive EEG. NeuroImage. 46:708–716.

Benjamini Y, Hochberg Y. 1995. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Series B. 57(1):289–300.

Bigelow J, Rossi B, Poremba A. 2014. Neural correlates of short-term memory in primate auditory cortex. Front Neurosci. 8.

Bobadilla-Suarez S, Ahlheim C, Mehrotra A, Panos A, Love B. 2019. Measures of neural similarity. Comput Brain Behav.

Britton B, Blumstein SE, Myers EB, Grindrod C. 2009. The role of spectral and durational properties on hemispheric asymmetries in vowel perception. Neuropsychologia. 47:1096–1106.

Carbajal GV, Malmierca MS. 2018. The neuronal basis of predictive coding along the auditory pathway: from the subcortical roots to cortical deviance detection. Trends Hear. 22.

Constantinidis C, Procyk E. 2004. The primate working memory networks. Cogn Affect Behav Neurosci. 4:444–465.

Costa-Faidella J, Grimm S, Slabu L, Díaz-Santaella F, Escera C. 2011. Multiple time scales of adaptation in the auditory system as revealed by human evoked potentials: adaptation in the human auditory system. Psychophysiology. 48:774–783.

Daniel TA, Katz JS, Robinson JL. 2016. Delayed match-to-sample in working memory: a brain map meta-analysis. Biol Psychol. 120:10–20.

de Cheveigné A, Parra LC. 2014. Joint decorrelation, a versatile tool for multichannel data analysis. NeuroImage. 98:487–505.

de Cheveigné A, Simon JZ. 2008. Denoising based on spatial filtering. J Neurosci Methods. 171:331–339.

De Maesschalck R, Jouan-Rimbaud D, Massart DL. 2000. The Mahalanobis distance. Chemom Intel Lab Syst. 50:1–18.

Fries P. 2005. A mechanism for cognitive dynamics: neuronal communication through neuronal coherence. Trends Cogn Sci. 9:474–480.

Grootswagers T, Wardle SG, Carlson TA. 2017. Decoding dynamic brain patterns from evoked responses: a tutorial on multivariate pattern analysis applied to time series neuroimaging data. J Cogn Neurosci. 29:677–697.

Huang Y, Matysiak A, Heil P, König R, Brosch M. 2016. Persistent neural activity in auditory cortex is related to auditory working memory in humans and nonhuman primates. Elife. 5.

Ille N, Berg P, Scherg M. 2002. Artifact correction of the ongoing EEG using spatial filters based on artifact and brain signal topographies. J Clin Neurophysiol. 19:113–124.

Kamiński J, Rutishauser U. 2019. Between persistently active and activity-silent frameworks: novel vistas on the cellular basis of working memory. Ann N Y Acad Sci.

Kilner JM, Kiebel SJ, Friston KJ. 2005. Applications of random field theory to electrophysiology. Neurosci Lett. 374:174–178.

Kriegeskorte N, Mur M, Bandettini P. 2008. Representational similarity analysis—connecting the branches of systems neuroscience. Front Syst Neurosci. 2:4.

Ledoit O, Wolf M. 2004. A well-conditioned estimator for large-dimensional covariance matrices. J Multivar Anal. 88:365–411.

Litvak V, Friston K. 2008. Electromagnetic source reconstruction for group studies. NeuroImage. 42:1490–1498.

Malmierca MS, Niño-Aguillón BE, Nieto-Diego J, Porteros Á, Pérez-González D, Escera C. 2019. Pattern-sensitive neurons reveal encoding of complex auditory regularities in the rat inferior colliculus. NeuroImage. 184:889–900.

Mishra J, Gazzaley A. 2016. Cross-species approaches to cognitive neuroplasticity research. NeuroImage. 131:4–12.

Mongillo G, Barak O, Tsodyks M. 2008. Synaptic theory of working memory. Science. 319:1543–1546.

Morlet D, Fischer C. 2014. MMN and novelty P3 in coma and other altered states of consciousness: a review. Brain Topogr. 27:467–479.

Murray JD, Bernacchia A, Roy NA, Constantinidis C, Romo R, Wang X-J. 2017. Stable population coding for working memory coexists with heterogeneous neural dynamics in prefrontal cortex. Proc Natl Acad Sci. 114:394–399.

Myers NE, Rohenkohl G, Wyart V, Woolrich MW, Nobre AC, Stokes MG. 2015. Testing sensory evidence against mnemonic templates. Elife. 4:e09000.

Näätänen R, Jacobsen T, Winkler I. 2005. Memory-based or afferent processes in mismatch negativity (MMN): a review of the evidence. Psychophysiology. 42(1):25–32.

Nakamura T, Michie PT, Fulham WR, Todd J, Budd TW, Schall U, Hunter M, Hodgson DM. 2011. Epidural auditory event-related potentials in the rat to frequency and duration deviants: evidence of mismatch negativity? Front Psychol. 2.

Natan RG, Rao W, Geffen MN. 2017. Cortical interneurons differentially shape frequency tuning following adaptation. Cell Rep. 21:878–890.

Nees MA. 2016. Have we forgotten auditory sensory memory? Retention intervals in studies of nonverbal auditory working memory. Front Psychol. 7.

Nemrodov D, Niemeier M, Patel A, Nestor A. 2018. The neural dynamics of facial identity processing: insights from EEG-based pattern analysis and image reconstruction. eNeuro. 5.

Nieto-Diego J, Malmierca MS. 2016. Topographic distribution of stimulus-specific adaptation across auditory cortical fields in the anesthetized rat. PLoS Biol. 14:e1002397.

O’Connor K, Ison JR. 1991. Echoic memory in the rat: effects of inspection time, retention interval, and the spectral composition of masking noise. J Exp Psychol Anim Behav Process. 17:377–385.

Pasternak T, Greenlee MW. 2005. Working memory in primate sensory systems. Nat Rev Neurosci. 6:97–107.

Poeppel D. 2003. The analysis of speech in different temporal integration windows: cerebral lateralization as ‘asymmetric sampling in time’. Speech Commun. 41:245–255.

Polley DB, Read HL, Storace DA, Merzenich MM. 2007. Multiparametric auditory receptive field organization across five cortical fields in the albino rat. J Neurophysiol. 97(5):3621–3638.

Salisbury DF. 2012. Finding the missing stimulus mismatch negativity (MMN): emitted MMN to violations of an auditory gestalt. Psychophysiology. 49:544–548.

Schonwiesner M, Rubsamen R, von Cramon DY. 2005. Spectral and temporal processing in the human auditory cortex—revisited. Ann N Y Acad Sci. 1060:89–92.

Spaak E, Watanabe K, Funahashi S, Stokes MG. 2017. Stable and dynamic coding for working memory in primate prefrontal cortex. J Neurosci. 37:6503–6516.

Spector F. 2011. Echoic memory. In: Kreutzer JS, DeLuca J, Caplan B, editors. Encyclopedia of Clinical Neuropsychology. New York (NY): Springer. p. 923–924.

Spitzer B, Blankenburg F. 2012. Supramodal parametric working memory processing in humans. J Neurosci. 32:3287–3295.

Stokes MG. 2015. ‘Activity-silent’ working memory in prefrontal cortex: a dynamic coding framework. Trends Cogn Sci. 19:394–405.

Tark K-J, Curtis CE. 2009. Persistent neural activity in the human frontal cortex when maintaining space that is off the map. Nat Neurosci. 12:1463–1468.

Ulanovsky N, Las L, Farkas D, Nelken I. 2004. Multiple time scales of adaptation in auditory cortex neurons. J Neurosci. 24:10440–10453.

van Ede F, Chekroud SR, Stokes MG, Nobre AC. 2018. Decoding the influence of anticipatory states on visual perception in the presence of temporal distractors. Nat Commun. 9:1–12.

Walther A, Nili H, Ejaz N, Alink A, Kriegeskorte N, Diedrichsen J. 2016. Reliability of dissimilarity measures for multi-voxel pattern analysis. NeuroImage. 137:188–200.

Winkler I, Reinikainen K, Näätänen R. 1993. Event-related brain potentials reflect traces of echoic memory in humans. Percept Psychophys. 53:443–449.

Wolff MJ, Ding J, Myers NE, Stokes MG. 2015. Revealing hidden states in visual working memory using electroencephalography. Front Syst Neurosci. 9.

Wolff MJ, Jochim J, Akyürek EG, Stokes MG. 2017. Dynamic hidden states underlying working-memory-guided behavior. Nat Neurosci. 20:864–871.

Wolff MJ, Kandemir G, Stokes MJ, Akyürek EG. 2020. Unimodal and bimodal access to sensory working memories by auditory and visual impulses. J Neurosci.

Woods V, Trumpis M, Bent B, Palopoli-Trojani K, Chiang C-H, Wang C, Yu C, Insanally MN, Froemke RC, Viventi J. 2018. Long-term recording reliability of liquid crystal polymer μECoG arrays. J Neural Eng. 15:066024.
Author notes

These authors contributed equally to this work.

This article is published and distributed under the terms of the Oxford University Press, Standard Journals Publication Model (https://academic.oup.com/journals/pages/open_access/funder_policies/chorus/standard_publication_model)