The anticipation of stimuli facilitates the top–down preparation of neuronal tissue involved in the processing of forthcoming targets. Increasing evidence in the visual modality emphasizes the anticipatory adjustment of visual cortex excitability through modulations of oscillatory alpha power. In the auditory system, however, this relationship has not yet been established. Furthermore, the association between anticipatory modulations of auditory alpha power and a potential top–down network underlying these anticipatory preparation processes remains unexplored. To elucidate these processes, we recorded magnetoencephalography while visually cuing participants to attend to either ear and to anticipate forthcoming auditory stimuli. For the cue-stimulus phase, we expected an asymmetric modulation of auditory alpha power when attending to the left or right ear, assuming that frontoparietal regions would phase synchronize with the auditory cortex in an asymmetric pattern. Beamformer source solutions demonstrate an asymmetric modulation of auditory alpha power following visual cues, expressed as a strong increase in right auditory alpha power when attending to the right ear. Furthermore, the right auditory cortex is functionally connected to the frontal eye fields during the ipsilateral alpha increase. Altogether, these results contribute significantly to the understanding of how auditory anticipation acts on a local as well as on a network level.
EEG experiments investigating prestimulus allocation of visual spatial attention suggest that the excitability of the visual cortex is modulated by decreasing or increasing ongoing alpha activity (8–12 Hz) (Klimesch et al. 2007a). A decrease in alpha power (event-related desynchronization) is functionally related to active involvement of the underlying neuronal tissue that processes the upcoming stimulus, whereas an increase in alpha power (event-related synchronization) reflects active inhibition of the brain regions involved in processing distracting information (Foxe et al. 1998; Worden et al. 2000; Kelly et al. 2006; Thut et al. 2006; Rihs et al. 2007, Rihs et al. 2009; Romei et al. 2008; Jensen and Mazaheri 2010; Snyder and Foxe 2010).
Far less is known about similar processes in the auditory domain. Lehtelä et al. (1997) showed that the processing of auditory stimuli involves a reduction of auditory alpha power. Despite this early study, only recently has accumulating evidence corroborated the existence of an auditory alpha rhythm as well as its functional role in auditory disorders such as tinnitus (Weisz et al. 2007, Weisz et al. 2011). However, to what extent the auditory alpha rhythm can be top–down modulated remains largely unexplored. To our knowledge, the only study investigating anticipatory alpha power modulations in the auditory cortex, that is, alpha modulations observed irrespective of the processing of auditory stimuli, is that of Bastiaansen et al. (2001). The authors indeed showed an anticipatory alpha modulation in the auditory cortex, however, in only 2 of 5 participants.
Assuming auditory alpha activity is indeed top–down modulated, the question arises: which brain regions are involved in this top–down control and how does communication with the auditory cortex take place? Most existing evidence is based on functional magnetic resonance imaging and recently Transcranial Magnetic Stimulation (TMS) studies in the visual domain that consistently propose activation of frontal and parietal regions responsible for the allocation of spatial attention (Kastner and Ungerleider 2000; Corbetta and Shulman 2002; Fox et al. 2006; Serences and Yantis 2006; Slagter et al. 2007; Wu et al. 2007; Siegel et al. 2008; Capotosto et al. 2009). Corbetta and Shulman (2002) described the frontal eye field (FEF) and intraparietal sulcus (IPS) as core regions of the dorsal attention network mediating the top–down control mechanisms of attention. More recent neuroimaging studies postulate an activation of the dorsal attention network also during “auditory” spatial attention (Mayer et al. 2006; Shomstein and Yantis 2006; Voisin et al. 2006; Winkowski and Knudsen 2006; Wu et al. 2007; Salmi et al. 2009). In spite of strong evidence that frontoparietal regions are involved in spatial attention in different modalities, it is unclear how these frontoparietal regions communicate with respective sensory cortices. Electrophysiological research suggests that different neuronal assemblies communicate via phase synchronization of oscillatory activity (Varela et al. 2001; Womelsdorf et al. 2007; Canolty et al. 2010). We therefore hypothesize that frontoparietal regions phase synchronize with the auditory cortex in a spatially specific pattern related to the modulation of auditory alpha power.
We accordingly designed a dichotic listening experiment that visually cued participants to attend to either ear and to anticipate forthcoming auditory stimuli. Due to the simultaneous presentation of 2 concurrent sounds (one in the left and one in the right ear), we assumed that the auditory system has to inhibit sound processing at the unattended ear and to facilitate processing at the attended ear. Accordingly, and because of the strong and preponderant contralateral anatomical connections in the auditory system (Evans 1982; Tervaniemi and Hugdahl 2003), we suggest 2 possible mechanisms that would support the processing of the attended sound. On the one hand, the auditory cortex contralateral to the attentional focus (predominantly processing the attended sound) could be facilitated; on the other hand, and possibly even more decisive, the auditory cortex ipsilateral to the attentional focus (predominantly processing the unattended sound) could be inhibited. It has to be mentioned that despite a contralateral dominance in monaural and binaural hearing, the auditory cortex is known to show functional asymmetries between the hemispheres, for example, in spatial sound localization (Zatorre and Penhune 2001) and already within the ascending auditory system during dichotic listening (Della Penna et al. 2007). Notwithstanding these asymmetries and differences compared with the visual system, we nevertheless suggest that especially in anticipation of 2 “competing” sounds (binaural presentation at the left and right ear), a differential preparation of the auditory cortices depending on the anticipated ear is advantageous. Therefore, for the cue-stimulus phase, we hypothesized an asymmetric modulation of alpha power in the auditory cortex when attending to the left or right ear, respectively.
We furthermore assumed that frontoparietal regions phase synchronize with the auditory cortex, such that coupling with the modulated auditory cortex is enhanced.
Materials and Methods
Fifteen participants reporting normal hearing and sight took part in the current study (9 male, 6 female). The mean age of participants was 25 years (range: 20–28 years). According to the Edinburgh Handedness Inventory (Oldfield 1971), all participants were right-handed and free of psychiatric or neurological disorders according to the M.I.N.I. (Mini International Neuropsychiatric Interview, German Version 5.0.0). Participants were recruited via flyers posted at the University of Konstanz. The Ethical Committee of the University of Konstanz approved the experimental procedure, and the participants gave their written informed consent prior to taking part in the study. After the experiment, each participant received 15 € compensation for participation. Two participants had to be excluded because of too many artifacts in their magnetoencephalography (MEG) recordings (less than 60 trials after artifact rejection).
Task and Stimuli
Participants were “visually” cued to attend to either ear, where they had to distinguish target from standard tones. The cue was an arrow pointing either to the left or to the right, instructing participants to shift their focus to the designated ear. Following a left cue, participants were to attend to the left ear, and following a right cue, to the right ear. Arrows were always displayed in the middle of the screen. Auditory stimuli consisted of standard tones (90%) and target tones (10%). Standard tones were amplitude modulated at either 45 or 20 Hz (carrier frequency: 655 Hz; stimulus duration: 800 ms; loudness: 50 dB above hearing level), whereas target tones altered their modulation frequency during presentation (from 45 to 25 Hz and back to 45 Hz, or from 20 to 12.5 Hz and back to 20 Hz; Fig. 1 displays such a target tone). Participants simultaneously listened to tones in both ears such that the 20 Hz modulated tone was presented to one ear and the 45 Hz modulated one to the other. The side of stimulation was randomly alternated and equally balanced between tones and ears. Target tones could only appear in the attended ear.
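As a sketch of the stimulus design, amplitude-modulated tones of this kind can be generated as follows. This is an illustrative NumPy reconstruction, not the authors' stimulus code; in particular, the sampling rate and the durations of the three target-tone segments are our assumptions.

```python
import numpy as np

FS = 44100  # audio sampling rate (Hz); assumed, not reported in the paper

def am_tone(mod_hz, dur_s=0.8, carrier_hz=655.0, fs=FS):
    """Sinusoidally amplitude-modulate a 655 Hz carrier at mod_hz."""
    t = np.arange(int(dur_s * fs)) / fs
    envelope = 0.5 * (1.0 + np.sin(2 * np.pi * mod_hz * t))
    return envelope * np.sin(2 * np.pi * carrier_hz * t)

# Standard tone: 800 ms, 45 Hz modulation (20 Hz for the other ear)
standard = am_tone(45.0)

# Target tone: modulation rate drops mid-tone and returns (45 -> 25 -> 45 Hz);
# the segment durations (250/300/250 ms) are assumed for illustration
target = np.concatenate([am_tone(45.0, 0.25),
                         am_tone(25.0, 0.30),
                         am_tone(45.0, 0.25)])
```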
Each trial commenced with a cross in the middle of the screen upon which subjects had to focus their attention for 1–1.5 s. The arrow, randomly pointing to the right (100 trials) or left side (100 trials), subsequently appeared for 1–1.5 s. One to 1.5 s after cue onset subjects were exposed to the auditory stimuli. Immediately afterward, participants were asked by a question displayed on the screen if they had noted a target. Subjects had to respond to this with a right-hand button press. The intertrial interval (ITI) varied between 2.5 and 3.5 s. During the ITI, participants were encouraged to blink so that this could be avoided during task performance. The time intervals with the fixation cross, the cue, and the ITI randomly differed by a slight margin. The procedure of one trial is illustrated in Figure 1. In total, participants had to perform 200 trials during the course of the experiment.
The experiment as well as the notation of events in the MEG data acquisition was controlled using Psyscope X (Cohen et al. 1993), an open-source environment for the design and control of behavioral experiments (http://psy.ck.sissa.it/). Tones were generated outside the magnetically shielded chamber (ASG-BTI) and delivered to the participant’s ears via the flexible plastic tubes of the sound system. Instructions and visual stimuli were presented using a video projector (JVC™ DLA-G11E) outside the MEG chamber and projected onto the ceiling of the MEG chamber by means of a mirror system. Participants used a response pad to record their responses. The individual head shapes of all subjects were collected using a digitizer. The MEG recordings were accomplished with a 148-channel whole-head magnetometer system (MAGNES™ 2500 WH, 4D Neuroimaging, San Diego, CA), installed in a magnetically shielded chamber (Vakuumschmelze Hanau), while participants lay in a supine position. MEG signals were recorded with a sampling rate of 678.17 Hz and a hard-wired high-pass filter of 0.1 Hz.
We analyzed the data sets using Matlab (The MathWorks, Natick, MA, Version 7.5.0 R2007b) and the Fieldtrip toolbox (http://fieldtrip.fcdonders.nl/). We separately extracted epochs of 4 s, including 2 s pre-cue onset (baseline interval), 2 s post-cue onset (post-cue interval), and 2 s post-sound onset (during-sound interval), from the continuously recorded MEG signal. This was done for each of the 2 conditions, resulting in 100 trials for the attend-left condition and 100 trials for the attend-right condition for each of the 3 time intervals. Trials were visually inspected for artifacts, and we rejected those that were contaminated by blinks or muscle artifacts (the same trials were retained across the 3 time intervals). After this procedure, no trials with field changes larger than 3 picotesla were left. To ensure a similar signal-to-noise ratio across conditions, the trial numbers were equalized for the compared conditions (attend left vs. right) by random omission.
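Equalizing trial counts by random omission can be sketched as follows. This is an illustrative Python sketch; the actual analysis was done in Matlab/FieldTrip, and the trial counts shown are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(0)

def equalize_trials(trials_a, trials_b):
    """Randomly drop trials from the larger set so both conditions
    contribute the same number of trials (matched signal-to-noise)."""
    n = min(len(trials_a), len(trials_b))
    keep_a = np.sort(rng.choice(len(trials_a), size=n, replace=False))
    keep_b = np.sort(rng.choice(len(trials_b), size=n, replace=False))
    return [trials_a[i] for i in keep_a], [trials_b[i] for i in keep_b]

# Hypothetical post-rejection trial counts for the two conditions
left = [f"L{i}" for i in range(83)]   # 83 clean attend-left trials
right = [f"R{i}" for i in range(91)]  # 91 clean attend-right trials
left_eq, right_eq = equalize_trials(left, right)
```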
Analysis of Auditory Alpha Power Modulations
As anticipatory auditory alpha activity could not be separated very well from premotor or parietal activity on the sensor level, we decided in a first step to define relevant auditory cortex regions as regions of interest (ROIs) (using the interval during sound stimulation, with its strong alpha power reductions elicited by the auditory stimuli). In a second step, we then extracted the time–frequency representation of the auditory ROIs (“virtual electrodes”) in the cue-stimulus interval and tested them for condition effects (attend left vs. right). In a last step, we again localized the significantly modulated time–frequency interval (derived from the virtual electrodes) in the brain to ensure that the main power modulation indeed arises from the auditory cortex.
Definition of Auditory ROIs
We defined the regions that exhibit strong alpha power modulations during auditory stimulation (Lehtelä et al. 1997) as auditory ROIs. We therefore analyzed changes in spectral power for the interval during auditory stimulation first on sensor level and localized the modulated time–frequency interval then in the brain.
We estimated oscillatory power using a multitaper Fast Fourier time–frequency transformation (Percival 1993) with frequency-dependent Discrete Prolate Spheroidal Sequences (DPSS) tapers (time window: Δt = 4/f sliding in 50 ms steps; taper: Δf = 0.3 × f) for the baseline and during-stimulus epochs and both conditions (attend left and right). We calculated power for 5–15 Hz in steps of 1 Hz and tested the obtained time–frequency power distribution for effects of activation (during sound) versus baseline. As a baseline, we chose the pre-cue interval during which participants fixated a cross in the center of the screen. As a next step, dynamic imaging of coherent sources (DICS)—a frequency-domain adaptive spatial filtering algorithm (Gross et al. 2001)—was performed to identify the sources of the time–frequency effects. We calculated spatial filters for a 3D grid covering the entire brain volume (resolution: 1 cm) as well as the leadfields for each grid point for individual participants using a multisphere head model (Huang et al. 1999). For each grid point, we constructed a common spatial filter from the cross-spectral density matrix of the MEG signal (activation and baseline) at the frequency of interest (9 ± 3 Hz, as obtained from the sensor analysis) and the respective leadfield (regularization: lambda = 15%). We then applied the spatial filters to the Fourier-transformed data (multitaper analysis) for the frequency (9 ± 2.5 Hz) and time window of interest and normalized the resulting activation volumes to a template Montreal Neurological Institute (MNI) brain provided by the SPM2 toolbox (http://www.fil.ion.ucl.ac.uk/spm/software/spm2). We calculated source solutions for the baseline period (550–100 ms pre-cue) and for the interval during stimulus presentation (300–750 ms following tone onset) for both conditions separately (attend left and attend right).
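The frequency-dependent multitaper parameters (Δt = 4/f, Δf = 0.3 × f) can be sketched for a single frequency bin as follows. This is a simplified Python illustration, not the FieldTrip implementation; the taper-count rule floor(2ΔtΔf − 1), which here yields a single DPSS taper, follows the common FieldTrip convention and is an assumption on our part.

```python
import numpy as np
from scipy.signal.windows import dpss

FS = 678.17  # MEG sampling rate (Hz)

def multitaper_power(signal, f, fs=FS):
    """Power at frequency f with a frequency-dependent window
    (dt = 4/f) and DPSS tapers for +/- 0.3*f spectral smoothing."""
    dt, df = 4.0 / f, 0.3 * f
    n = int(round(dt * fs))
    n_tapers = max(int(np.floor(2 * dt * df - 1)), 1)  # FieldTrip-style count
    tapers = dpss(n, NW=dt * df, Kmax=n_tapers)        # shape: (n_tapers, n)
    seg = signal[:n]
    t = np.arange(n) / fs
    basis = np.exp(-2j * np.pi * f * t)   # single-frequency Fourier basis
    coeffs = (tapers * seg) @ basis       # one complex coefficient per taper
    return np.mean(np.abs(coeffs) ** 2)   # power averaged over tapers

# A 10 Hz sinusoid in noise should carry more 10 Hz than 20 Hz power
rng = np.random.default_rng(1)
sig = np.sin(2 * np.pi * 10 * np.arange(2000) / FS)
sig = sig + 0.1 * rng.standard_normal(2000)
```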
We then baseline corrected source solutions by applying a voxel-wise t-statistic that tested the activation period against baseline. Regions with significant modulations compared with baseline were defined as ROIs and the respective voxel with maximal power modulation in the right (MNI coordinates: 51, −21, 22) and left auditory cortex (MNI coordinates: −61, −25, 27) as voxels of interest for the virtual electrode analysis.
Spectral Power Changes in the Prestimulus Interval Obtained from the Auditory Cortex
Time–frequency representations for the voxels of interest were calculated as follows: The raw and downsampled data sets were first projected into source space by multiplying them with the accordant spatial filters. Spatial filters were constructed from the covariance matrix of the averaged single trials at the sensor level (latency: 400 ms pre-cue to 1 s post-cue onset, 5–15 Hz, lambda 15%) and the respective leadfield by a linearly constrained minimum variance (LCMV) beamformer (Van Veen et al. 1997). Afterward, we calculated spectral power for the voxels of interest from 5 to 15 Hz in steps of 1 Hz using a multitaper Fast Fourier time–frequency transformation (Percival 1993) with frequency-dependent DPSS tapers (time window: Δt = 4/f sliding in 50 ms steps; taper: Δf = 0.3 × f). The obtained time–frequency power distributions for the right and left auditory locations of interest and the 2 attention foci were baseline corrected (baseline: 400–100 ms pre-cue, relative change) and then tested for a potential interaction between attention focus and hemisphere. We therefore subtracted the attend-right from the attend-left condition within the right and left auditory cortex and then compared these difference representations using a pointwise dependent-samples t-statistic. We thereby preserved the frequency and time periods that were significantly modulated at a virtual electrode in the right and left auditory cortex according to the attentional focus. We further extracted mean values from the significantly modulated time–frequency maps (averaged across the significant time–frequency window: 6–7 Hz, 50–650 ms) for each participant, condition (attend left vs. right), and ROI (left and right temporal cortex) and statistically tested these values using a 2 × 2 analysis of variance (condition × ROI).
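The interaction test on the baseline-corrected time–frequency maps (the attend-left minus attend-right difference per hemisphere, compared with a pointwise dependent-samples t-test) reduces to the following sketch; the arrays here are synthetic placeholders for the virtual-electrode power values, not the study's data.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)
n_sub, n_freq, n_time = 13, 11, 20  # subjects, 5-15 Hz bins, time bins

# Baseline-corrected power (relative change) for the four design cells:
# hemisphere (left/right ROI) x attention focus (left/right)
tfr = {(h, a): rng.standard_normal((n_sub, n_freq, n_time))
       for h in ("lh", "rh") for a in ("attL", "attR")}

# Interaction: the (attL - attR) difference in the left ROI versus the
# same difference in the right ROI, pointwise dependent-samples t-test
diff_lh = tfr[("lh", "attL")] - tfr[("lh", "attR")]
diff_rh = tfr[("rh", "attL")] - tfr[("rh", "attR")]
t_map, p_map = stats.ttest_rel(diff_lh, diff_rh, axis=0)
```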
In order to better separate cue-evoked from genuinely induced alpha modulations, we additionally calculated cue-locked activity for both conditions and the left and right virtual electrodes by low-pass filtering the raw data (30 Hz) and averaging the single trials. We then performed a time–frequency analysis on the evoked responses (same parameters as for the virtual electrode analysis), baseline corrected the obtained time–frequency representations (baseline: 400–100 ms pre-cue, absolute change), and again tested them for an interaction between hemisphere and attentional focus. We thereby obtained the cue-locked activity contributing to the above-described time–frequency effect.
As a last step, we wanted to validate that the hemisphere- and attention-specific alpha power modulation derived from the virtual electrode analysis indeed has its main origin in the auditory cortex. We therefore performed a DICS (as for the during-sound analysis) to identify the sources of the time–frequency effects. We calculated spatial filters for a 3D grid covering the entire brain volume (resolution: 1 cm). For each grid point, we constructed a common spatial filter (baseline and activation; regularization: lambda 15%) from the cross-spectral density matrix of the MEG signal at the frequency of interest (6.5 ± 2 Hz, according to the virtual electrode analysis) and the respective leadfield (obtained from the during-sound analysis). We then applied the spatial filters to the Fourier-transformed data (multitaper analysis) for the frequency (6.5 ± 1.5 Hz) and time window of interest and normalized the resulting activation volumes to a template MNI brain provided by the SPM2 toolbox (http://www.fil.ion.ucl.ac.uk/spm/software/spm2). Source solutions were calculated for the baseline (700–50 ms pre-cue) and the cue-stimulus period (50–700 ms post-cue) for both conditions separately (attend left vs. attend right). We then baseline corrected the source solutions by subtracting the baseline values from the activation values (post-cue) and tested the 2 attention conditions (attend left vs. right) using a voxel-wise dependent-samples t-statistic.
Phase Synchrony Analyses
In order to identify the brain regions functionally connected to the auditory cortex during anticipatory auditory alpha power modulations, we calculated phase synchrony (Lachaux et al. 1999) between the reference voxel within the right auditory ROI (voxel with the strongest power modulation associated with the attentional focus, as obtained from the prestimulus alpha power analysis; MNI coordinates: 47, −18, 23) and all other voxels. If the distribution of phase differences between 2 oscillators deviates from uniformity, they are likely to communicate with each other, whereas a uniform distribution of phase differences indicates the independence of the 2 oscillators. We first Fourier-transformed the sensor level data (multitaper analysis, latency post-cue interval: 100–650 ms post-cue, latency baseline interval: 650–50 ms pre-cue, 2–30 Hz), extracted the complex values containing phase information, and transferred these complex values into source space by multiplying them with the accordant spatial filters. Spatial filters were constructed from the covariance matrix of the averaged single trials at the sensor level (latency: 650 ms pre-cue to 1 s post-cue onset, 2–30 Hz, lambda 15%) and the respective leadfield by an LCMV beamformer (Van Veen et al. 1997). We thereby obtained complex values for each voxel and trial for the cue-stimulus and the baseline interval. We then converted these complex values into angles (radians) and calculated the difference between the reference voxel and all other voxels for each trial. This refers to the above-mentioned “phase difference” between voxels. From these values, we calculated the circular mean over all trials and employed a Fisher Z-transformation in order to approximate a normal distribution across subjects. In a final step, we subtracted the baseline values from the cue-stimulus values and thereby obtained relative phase-locking values for each voxel and condition (attend left/right).
These relative phase-locking values quantify the average change of connectivity from baseline to the cue-stimulus phase.
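The relative phase-locking computation (circular mean of per-trial phase differences, Fisher Z-transformed, baseline-subtracted) can be sketched for a single voxel as follows; the phase angles here are synthetic, and the original computation was done in Matlab.

```python
import numpy as np

def rel_plv(act_ref, act_vox, base_ref, base_vox):
    """Change in phase locking (Lachaux et al. 1999) of one voxel with
    the reference voxel, from baseline to the cue-stimulus interval.
    Inputs are per-trial phase angles in radians."""
    def plv(a, b):
        # magnitude of the circular mean of the per-trial phase differences
        return np.abs(np.mean(np.exp(1j * (a - b))))
    # Fisher Z (arctanh) to approximate a normal distribution across subjects
    return np.arctanh(plv(act_ref, act_vox)) - np.arctanh(plv(base_ref, base_vox))

rng = np.random.default_rng(3)
n_trials = 100
ref = rng.uniform(-np.pi, np.pi, n_trials)
coupled = ref + 0.2 * rng.standard_normal(n_trials)  # nearly constant phase lag
base_ref = rng.uniform(-np.pi, np.pi, n_trials)      # independent at baseline
base_vox = rng.uniform(-np.pi, np.pi, n_trials)
```

A positive return value indicates increased coupling in the cue-stimulus interval relative to baseline; a negative value indicates decoupling.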
For a more precise analysis of phase-locking patterns, we first identified a frequency band of interest in a data-driven manner. We defined this frequency band according to a global (i.e., averaged across all voxels) estimate of phase locking and its modulation according to the relative power changes: We supposed that frequencies showing a modulation of the global phase-locking values according to the different experimental conditions (attend left vs. right) are likely to be involved in top–down mechanisms related to the alpha power modulations in the different conditions. We thus estimated global phase-locking values for both conditions (attend left/right) by averaging the relative phase-locking values across all voxels. Such a procedure yields a measure that reflects large modulations of phase locking from baseline to activation and disregards precise anatomical information. We did this for frequencies from 2 to 30 Hz. We then performed a t-statistic across the global phase-locking estimates for each frequency separately and identified the frequencies that were specifically modulated according to the attentional focus (analogously to the right auditory alpha power modulation dependent on the attentional focus).
In a second step, we wanted to scrutinize the pattern of relative phase locking for the frequency band of interest (here: 5 Hz, see Results), that is, to disentangle the relative phase-locking values into coupling (positive values, i.e., increased synchrony in the cue-stimulus interval) and decoupling (negative values, i.e., decreased synchrony in the cue-stimulus interval) and to identify the main regions that (de-)couple with the right auditory reference voxel. We therefore focused on the relative phase-locking values (comprising the change in phase locking from baseline to the cue-stimulus phase for each voxel, averaged across trials) at the frequency of interest and statistically tested these values according to the different conditions (attend left vs. attend right) with a voxel-by-voxel paired Student’s t-test. As a result, we obtained statistical values for each voxel for phase locking with the right temporal reference voxel (attend right vs. attend left) and could thereby quantify the difference in phase synchrony between conditions. To correct for multiple comparisons, we defined a minimum cluster size (minimum number of neighboring voxels above a given threshold that are required for a significant cluster) with AlphaSim from the AFNI package (http://afni.nimh.nih.gov/afni/doc/manual/AlphaSim.pdf). We thereby preserved the main regions involved in coupling and decoupling with the auditory reference voxel and disregarded all voxels belonging to clusters smaller than the minimum cluster size (770 voxels). Finally, we extracted the mean relative phase-locking values from our ROI (right FEF) for the 2 conditions separately and tested them with the accordant Student’s t-tests. Since the involvement of the IPS was not evident even without correction for multiple comparisons, we did not pursue any ROI analysis for this region.
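The minimum-cluster-size correction can be illustrated generically. AlphaSim estimates the required cluster extent via simulation, but the subsequent thresholding step amounts to keeping only suprathreshold voxels in sufficiently large connected clusters, as in this Python sketch (toy data, not the study's statistics):

```python
import numpy as np
from scipy import ndimage

def threshold_clusters(t_map, t_crit, min_voxels):
    """Keep only suprathreshold voxels belonging to a connected cluster
    of at least `min_voxels` voxels (cluster-extent thresholding, in the
    spirit of the AlphaSim minimum-cluster-size criterion)."""
    mask = np.abs(t_map) > t_crit
    labels, n_clusters = ndimage.label(mask)   # connected-component labeling
    keep = np.zeros_like(mask)
    for lab in range(1, n_clusters + 1):
        cluster = labels == lab
        if cluster.sum() >= min_voxels:
            keep |= cluster
    return keep

# Toy t-map: one 4x4x4 cluster (64 voxels) and one isolated voxel
t_map = np.zeros((10, 10, 10))
t_map[2:6, 2:6, 2:6] = 5.0
t_map[8, 8, 8] = 5.0
surviving = threshold_clusters(t_map, t_crit=3.0, min_voxels=10)
```

Here the large cluster survives while the isolated suprathreshold voxel is discarded.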
As the FEFs have been associated with eye movements and also the planning of eye movements, we wanted to rule out that potential differences in phase synchrony between conditions parallel visual cortical activity. We therefore repeated the described phase synchrony analysis with a reference voxel in the right FEF (MNI coordinates: 31, −14, 65) in order to exclude that any effect for the right FEF is paralleled by coupling with primary visual areas.
Over all participants and trials, subjects correctly identified 74% of the tones, indicating that the task was feasible (but still challenging). Participants showed the same behavioral performance for the 45 Hz modulated tones (mean ± standard deviation [SD]: 76 ± 18%) and the 20 Hz modulated tones (72 ± 23%). Likewise, attending to the left (73 ± 20%) or to the right ear (76 ± 18%) did not affect the respective response patterns. The corresponding Student’s t-tests statistically confirmed equivalence; both tests argue for the absence of differences between means (each P > 0.5). Mean reaction times were significantly shorter (P < 0.001) for the attend-left (mean: 940 ms, SD: 240 ms) compared with the attend-right condition (mean: 1600 ms, SD: 290 ms). It has to be noted, however, that responses were given after stimulus offset, that is, speed was not a requirement of the task.
Alpha Power Decrease during Sound Stimulation
Time–frequency analyses showed significant alpha power decreases during sound processing compared with baseline for both conditions (P < 0.05). The alpha reductions at representative temporal sensors were most prominent from 6.5 to 11.5 Hz and from 300 to 750 ms post-sound onset. Alpha power decreases were localized in the vicinity of the primary auditory cortex (in the range of ∼1 cm distance to BA 41, i.e., a deviance to be expected considering a grid resolution of 1 cm and nonindividual magnetic resonance images (MRIs); MNI coordinates: left auditory cortex −61, −25, 27, right auditory cortex 51, −21, 22). An illustration of the power changes during auditory stimulation averaged across participants is shown in Figure 2.
Auditory Alpha Power Modulation Following the Visual Cue
Alpha power picked up by the auditory virtual electrodes was modulated already in anticipation of auditory stimuli following a visual cue. On a descriptive level, we observed 2 main processes that have to be differentiated: an immediate increase in low alpha power (5–8 Hz) that was sustained up to 650–700 ms post-cue, followed by an alpha power decrease (8–12 Hz) peaking at the end of the post-cue period (interpretable interval up to about 900 ms post-cue). The early onset of the low alpha power increase raises the notion that the visual cue could have led to an auditory evoked response that is potentially intermingled with genuine alpha activity. For this reason, we also analyzed the data in the time domain, pointing to an evoked auditory response following the visual cues, especially in the right auditory cortex (see Supplementary Fig. 1). Interestingly, the right auditory cue-locked response differs between the 2 conditions in the first 150 ms (attend right elicits a “stronger” evoked response than attend left). In an analysis described below, we tested the influence this result may have had on the reported interaction when performing the time–frequency analysis on a single-trial level. Due to the overlap of the evoked and induced components in the early (<500 ms) post-cue period, the subsequent alpha power decreases appear somewhat weaker. Note, however, that significant alpha decreases were observed in both hemispheres following a visual cue, particularly during the attend-right condition, which on a descriptive level was more pronounced for the left auditory cortex (∼8–13 Hz). See Figure 3 for a descriptive illustration of the results.
One of the major aims of our analysis was to investigate whether alpha power is differentially modulated for the left and right auditory cortex after a visual cue instructing the participants to attend to the left or right ear, respectively. For this purpose, we decided to test the interaction effect by subtracting the time–frequency representations between the attend-left versus attend-right condition for each hemisphere (left vs. right auditory ROI) separately and then to compare these difference representations using a pointwise t-test. We could indeed elucidate an interaction between hemisphere and attention focus (P < 0.01, Fig. 4a,b) for 6–7 Hz and 50–650 ms post-cue with a transient weakening of the effect around 400 ms post-cue. The interaction is mainly due to a relatively stronger right auditory alpha power increase when attending to the ipsilateral right ear (post hoc Student’s t-test: P < 0.05; Fig. 4c). Note that for this analysis, we averaged over a period spanning early strong power increases and later power decreases, thus resulting in overall positive values. Importantly, this virtual electrode–based effect (right auditory alpha power increase when attending ipsilaterally) was also located in the vicinity of the right auditory cortex (in the range of ∼1 cm distance to BA 41, i.e., a deviance to be expected considering a grid resolution of 1 cm and nonindividual MRIs, MNI coordinates: 47, −18, 23) with an independently calculated beamformer (DICS) approach (Fig. 4d). The region strongly overlaps with the area exhibiting alpha power modulations during sound stimulation (peaks are in the range of ∼5 mm distance).
As stated above, surprisingly strong evoked responses were observed following the onset of the visual cue, particularly for the right auditory cortex, and these were stronger when attending to the ipsilateral ear. It is therefore possible that the evoked response could have contributed to some extent to the interaction effect that we describe above and illustrate in Figure 4. A disambiguation of evoked and induced contributions is challenging, particularly for single-trial power estimates. We therefore decided to perform the same interaction analysis described above on the time–frequency representations of the evoked responses. This analysis (shown in Supplementary Fig. 1) shows that the evoked response contributes significantly to the described interaction effect, however, only within a short time window (300–400 ms) and to a lesser extent. For this reason, we conclude that the interaction effect is mainly due to a modulation of the induced responses.
Phase Synchrony with the Right Auditory Cortex
Global phase-locking estimates with the right auditory cortex showed a marginally significant condition effect (attend left vs. right) for 5 Hz (P = 0.06). Based on this global estimate and in line with theoretical assumptions (von Stein et al. 2000; Lakatos et al. 2008), the 5 Hz band (theta) was defined as the frequency band of interest for phase synchrony. In a second step, we calculated 5 Hz phase coupling of the right auditory reference voxel with all other voxels in the brain, thereby extracting the main regions that, compared with the baseline interval, increase or decrease their coupling with the right auditory cortex according to the attentional foci. Based on the dorsal attention network (Corbetta and Shulman 2002), the FEFs and the IPS were defined as ROIs. The right auditory cortex was mainly coupled to a region in the vicinity of the right FEF (MNI coordinates: 31, −14, 65). This region corresponds closely to the FEFs mentioned by Paus et al. (1996) (in the range of ∼1–2 cm distance to the FEF, i.e., a deviance to be expected considering a grid resolution of 1 cm and nonindividual MRIs). The observed relative coupling between the FEF and the right auditory cortex was driven by a strong coupling when attending to the ipsilateral (right) ear (paralleling the strong power modulation) and an equally strong decoupling when attention was directed to the contralateral (left) ear (see Fig. 5). The difference between this coupling and decoupling was significant (P < 0.01). For the IPS, no such modulation was evident.
Importantly, phase synchrony of the right FEF with the visual cortex (BA 17) did not differ between conditions (data shown in Supplementary Fig. 2), demonstrating that the effects reported above cannot be regarded as a side effect of visual cortical activity.
In the present work, we demonstrate for the first time that alpha power is modulated in the auditory cortex in anticipation of auditory stimuli indicated by a visual cue. Moreover, we show that this modulation follows an asymmetric pattern depending on the focus of auditory spatial attention. We furthermore show that during the periods of auditory cortical alpha modulations, the right FEFs in particular couple with the strongly modulated right auditory cortex. In the following section, we elaborate upon the auditory power modulations and scrutinize the frontoparietal regions associated with the auditory alpha changes.
Alpha Power Modulation in the Auditory Cortex
The analysis of alpha power during sound stimulation at the sensor level points to alpha reductions most prominent between 6.5 and 11.5 Hz, which localize to the right and left auditory cortex. This corroborates previous small-sample reports of an alpha power reduction with sound stimulation (Lehtelä et al. 1997) and argues for the existence of an alpha generator in the auditory cortex. Here, we would like to point out that we observed 2 consecutive processes during anticipation of the auditory stimuli: an early synchronization of low-frequency power (<10 Hz) that is particularly strong for the right auditory cortex when the ipsilateral right ear is attended, and a later, weaker desynchronization of alpha power at about 9 Hz. According to recent literature, the auditory alpha rhythm emerges in slightly different frequency bands (theta to common alpha band) depending on the task or method (Weisz et al. 2011). Our present results demonstrate an early low-frequency synchronization followed by a higher alpha desynchronization proximate to the earliest possible onset of the expected sound. This second process is descriptively weaker compared with the earlier modulations, partly because pronounced evoked activity could be observed following the visual cue (at around 150–250 ms), particularly in the right auditory cortex. An early evoked activation of the auditory cortex by visual stimuli is well documented in the literature (Pekkola et al. 2005; Schroeder and Foxe 2005; Ghazanfar and Schroeder 2006; Besle et al. 2008; Kayser et al. 2009; Raij et al. 2010) but has so far not been reported within the context of a spatial attention paradigm. Oscillatory activity, especially in the first 350 ms, thus likely reflects an overlay of cue-evoked and induced oscillatory activity.
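The event-related synchronization and desynchronization referred to throughout this section are conventionally quantified as percent power change relative to a pre-cue baseline. A minimal sketch, assuming a hypothetical array of band-limited single-trial power values (the function name and array layout are our own illustrative choices):

```python
import numpy as np

def erd_ers(power, baseline_idx):
    """Percent power change relative to a baseline interval.

    power : array (n_trials, n_samples) of band-limited power
            (e.g., squared amplitude envelope in the 6-7 Hz band).
    Negative values = desynchronization (ERD), positive = synchronization (ERS).
    """
    mean_power = power.mean(axis=0)             # average over trials
    baseline = mean_power[baseline_idx].mean()  # mean power in baseline window
    return 100.0 * (mean_power - baseline) / baseline

# Toy example: power doubling after sample 50 relative to a flat baseline
p = np.ones((10, 100))
p[:, 50:] = 2.0
change = erd_ers(p, slice(0, 50))
print(round(change[75]))  # 100, i.e., a +100% synchronization (ERS)
```

Because power is averaged over trials before the baseline ratio is taken, this measure captures induced activity; the evoked contribution discussed above is assessed separately on the time–frequency representation of the averaged (evoked) response.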
However, 2 arguments indicate that this interesting evoked response cannot entirely explain the effects observed in the single trial–based time–frequency analysis: 1) The asymmetric modulation of narrow-band low alpha power is sustained for more than 650 ms (see Fig. 4b), considerably exceeding any effects reported on the level of evoked responses. 2) Importantly, a direct test of the relevant interaction effect on the evoked time–frequency data does not show effects to the extent seen in the single-trial analysis (there is a significant interaction effect between 300 and 400 ms, but of much weaker intensity). We thus conclude that primarily genuine low alpha oscillations underlie the described asymmetric low alpha power modulation, which is functionally interpreted below. Nevertheless, the strong evoked responses as well as the counterintuitive ipsilateral increase following a visual cue raise the interesting theoretical possibility that they play a crucial role in initiating specific oscillatory patterns, for instance by cross-modal phase resetting of low alpha oscillations (Klimesch et al. 2007b; Lakatos et al. 2009; Thorne et al. 2011). Whether this is the case or whether the evoked effects constitute an independent response has to be clarified by further studies.
The fact that auditory cortical alpha activity is modulated by the presence of a visual cue already argues for top–down (anticipatory) effects on the auditory cortex. Going beyond this demonstration, however, we could also show that low (6–7 Hz) alpha power is modulated differentially according to the attentional focus, particularly in the right hemisphere. We observed a prominent relative low alpha power increase in the right auditory cortex when anticipating an ipsilateral sound, whereas no such effect was evident when anticipating the contralateral sound (as stated above, this interaction effect could not be observed to the same extent for the evoked time–frequency data). This interaction between hemisphere and attentional focus rules out nonspecific processes related to alertness and points to the capacity of the auditory cortex to actively prepare for ear-specific sound processing in dichotic listening. An increase in alpha power, particularly in the hemisphere ipsilateral to the attentional focus, which predominantly processes the unattended sound, is consistent with several studies conducted mainly in the visual domain (Worden et al. 2000; Klimesch et al. 2007a; Rihs et al. 2007; Romei et al. 2008; Jensen et al. 2010). Such an increase in alpha power has been interpreted as active gating of uncued locations (Worden et al. 2000; Jensen and Mazaheri 2010) and emphasized as an important mechanism for realizing spatial attention via inhibitory top–down control processes, potentially even more crucial than alpha power decreases (Rihs et al. 2007). In the auditory modality, however, it has not yet been convincingly demonstrated that the excitability of the auditory cortex is altered in a top–down fashion. Kerlin et al. (2010) demonstrated that the allocation of auditory attention to continuous speech is initiated by a lateralization of alpha power at parietal sites similar to the alpha modulations in visuospatial attention. Bastiaansen et al.
(2001) were the first to show a reduction in auditory alpha power following an auditory cue; however, this effect only occurred in 2 of the 5 subjects who participated in the study, reminiscent of the weak and late alpha power reduction we observed in the present experiment. Unfortunately, due to their design, they could neither investigate interaction effects nor rule out nonspecific processes related to alertness. The interaction effects derived from the current study were mainly due to a synchronization of ipsilateral low alpha power and were most pronounced in the first 650 ms following cue onset. We would like to mention that, because targets could only occur at the attended ear, we could not further investigate whether the participants indeed attended to the cued ear or whether they responded to any dichotic tone pair containing an altered modulation frequency. Even though the finding of specific low alpha modulations depending on the cued ear is very suggestive, future studies will also need to include the presentation of targets at the noncued ear.
Right Hemispheric Dominance of Auditory Alpha Power Modulations
The condition-specific alpha power modulation was significantly stronger in the right than in the left auditory cortex, driving the relevant interaction effect. We thus conclude that the right auditory cortex plays a special role in auditory spatial attention and in the processing of competing sounds at the left and right ear. This is not surprising, as many studies have provided evidence of hemispheric differences in auditory processing and in spatial attention. It has been shown that the left auditory cortex primarily localizes sounds in the contralateral right hemispace, whereas the right auditory cortex is involved in computations for the whole space (Zatorre and Penhune 2001; Spierer et al. 2009). Thus, the right auditory cortex seems to be less lateralized than the left. One could therefore assume that the right auditory cortex, which processes left and right ear stimuli equally, must be downregulated for the specific processing of an ipsilateral right ear sound (when left and right ear sounds compete, as in the present experiment). The left auditory cortex, in contrast, which processes sounds predominantly from the contralateral right ear, does not require such a modulation. Beyond these speculations, further experiments will help to elucidate asymmetries in the auditory system that are already present during the anticipation of sounds.
Functional Connectivity during Prestimulus Alpha Modulations
Our data suggest that the observed asymmetric alpha power modulation in the right auditory cortex may be related to a relative coupling with the right FEFs. The FEFs show significantly higher phase synchrony with the right auditory cortex when attending to the ipsilateral compared with the contralateral ear during periods of strong anticipation-related alpha modulations. In contrast, the FEFs are not differentially synchronized with the visual cortex, so the observed effects cannot simply be explained by eye movements or visual attention but point to a specific communication between the FEFs and the auditory cortex in the context of an auditory spatial attention task. We did not find any modulation in phase synchrony with the IPS. It is worth noting that our analysis focused on the ROIs that exhibit relative coupling (Corbetta and Shulman 2002) in the 5 Hz band, keeping in mind that the entire system involved in auditory spatial attention probably comprises a larger network with a complex pattern of coupling and decoupling.
We presume that the FEFs are involved in top–down modulations of auditory cortical activity in the cue–stimulus period and communicate with these sensory regions by phase synchronizing their respective oscillatory activities. Growing evidence suggests that neuronal communication among distributed networks is realized through neuronal synchronization (Singer 1999; Varela et al. 2001; Buzsáki and Draguhn 2004; Fries 2005; Schoffelen et al. 2005; Womelsdorf et al. 2007; Canolty et al. 2010). It has been proposed that attention may control cortical regions by synchronizing ongoing oscillatory activity (Engel et al. 2001; Salinas and Sejnowski 2001; Buzsáki and Draguhn 2004; Gross et al. 2004; Sauseng et al. 2006; Siegel et al. 2008; Gregoriou et al. 2009). In line with these findings, our data show that neuronal activity in the auditory cortex is synchronized with the right FEFs during auditory spatial attention. The specific (de-)couplings were observed in the theta/low alpha band. Phase coupling in such lower frequency bands has been implicated in long-range communication between distant brain regions (von Stein et al. 2000; Jensen 2005; Lakatos et al. 2008). However, the particular role of the 5 Hz band must still be substantiated by further data.
The FEF is one of the core regions corresponding to the dorsal attention network involved in visual (Kastner and Ungerleider 2000; Corbetta and Shulman 2002; Fox et al. 2006; Siegel et al. 2008) and, as more recently shown, in auditory spatial attention (Mayer et al. 2006; Shomstein and Yantis 2006; Voisin et al. 2006; Winkowski and Knudsen 2006; Wu et al. 2007; Salmi et al. 2009). This is in perfect accordance with our data, which demonstrate that the FEF is the main region specifically synchronized with the auditory cortex. The spatially specific pattern of coupling and decoupling corroborates the functional relevance of the FEF for the spatially specific auditory alpha modulations: The coupled ipsilateral auditory cortex shows a strong alpha power modulation in contrast to the decoupled contralateral auditory cortex with weak or almost absent alpha power modulation. Together with the recently published TMS findings, which established a causal link between the activation of the FEF and auditory/visual spatial processing (Capotosto et al. 2009; Smith et al. 2009), it seems likely that the spatially specific auditory alpha power increase is mediated by a spatially specific synchronization with the FEFs. Whether, however, the FEFs indeed modulate auditory cortical alpha power in a top–down manner must be tested in a future study using an approach (e.g., TMS) that allows for causal inferences.
To conclude, we emphasize that the present data go significantly beyond the current knowledge of how auditory spatial attention in anticipation of auditory stimuli is implemented in the brain. We demonstrate that this implementation relies 1) on a spatially specific synchronization between the FEF and the auditory cortex, such that coupling with the auditory cortex ipsilateral to the attended sound is enhanced and coupling with the contralateral auditory cortex is reduced, and 2) on the specific adjustment of auditory cortex sensitivity by actively gating the processing of irrelevant information (reflected in the increase of ipsilateral auditory alpha power).
Deutsche Forschungsgemeinschaft (grant number: 4156/2-1) and Zukunftskolleg of the University of Konstanz.
We thank Winfried Schlee for his help with data collection and the experimental setup. Conflict of Interest: None declared.