We recently demonstrated with magnetoencephalographic recordings in human observers that the focus of attention in visual search has a spatial profile consisting of a center enhancement surrounded by a narrow zone of sensory attenuation. Here, we report new data from 2 experiments providing insights into the cortical processes that cause the surround attenuation. We show that surround suppression appears in search tasks that require spatial scrutiny, that is, the precise binding of search-relevant features at the target's location, but not in tasks that permit target discrimination without precise localization. Furthermore, we demonstrate that surround attenuation is linked to a stronger recurrent activity modulation in early visual cortex. Finally, we show that surround suppression appears with a delay (more than 175 ms) that is beyond the time course of the initial feedforward sweep of processing in the visual system. Together, these observations indicate that the suppressive surround is associated with recurrent processing and binding in the visual cortex.
The spatial focus of attention has been envisioned as a simple monotonic gradient of sensory facilitation that falls off gradually with increasing distance from its center (Downing and Pinker 1985). We recently demonstrated with magnetoencephalographic (MEG) recordings in human observers that this may be an oversimplification. Visual search requiring subjects to discriminate the orientation of a popout target turned out to produce a more complex profile consisting of a center enhancement surrounded by a narrow zone of sensory attenuation (Hopf et al. 2006). Evidence for such a Mexican hat–shaped center-surround profile had previously accumulated from a number of psychophysical studies (Downing 1988; Caputo and Guerra 1998; Cave and Bichot 1999; Mounts 2000b; Cutzu and Tsotsos 2003) reporting that perceptual processing in the near surround of a target is impeded relative to the farther-away surround. As an underlying neurophysiological mechanism, it is plausible to attribute surround attenuation to neural suppression in the visual cortex, not least because neural suppression has been widely demonstrated to serve attentional selection in extrastriate and striate cortex (Moran and Desimone 1985; Chelazzi et al. 1993; Luck et al. 1997; Smith et al. 2000; Vanduffel et al. 2000). Nonetheless, the particular mechanism by which neural suppression produces that surround attenuation remains to be clarified.
At present, satisfying explanations have mainly been offered by computational accounts, like the selective tuning model (STM, Tsotsos 1990; Tsotsos et al. 1995). According to STM, focused attention is mediated by a recurrent top-down propagating winner-take-all (WTA) mechanism that starts from (winning) units in high-level visual areas most activated after the initial feedforward sweep of processing through the hierarchy. This WTA mechanism eliminates projections from lower level units activated by the feedforward sweep that do not directly contribute to the target's location or feature set. As this operation is applied iteratively from level to level down the hierarchy, it produces a zone of attenuated units around a pass zone of unattenuated units, which permits correct binding of relevant features at the relevant location (Tsotsos et al. 2008). Importantly, STM makes a number of predictions that can be directly tested with neurophysiological measurements in human observers. First, STM posits that the suppressive surround arises due to recurrent processing initiated after the feedforward sweep of processing reaches high-level visual areas. It should, therefore, appear with a certain delay relative to the feedforward sweep of processing in early visual areas. The initial feedforward sweep of processing in the macaque and presumably also in humans is approximately completed around 100–120 ms after stimulus onset (Schmolesky et al. 1998; Lamme and Roelfsema 2000; Bullier 2001), leading to the specific prediction that the onset of surround suppression should be delayed at least by this time range. Second, consistent with subsequent theoretical accounts (Treisman 1998; Lamme and Roelfsema 2000; Hochstein and Ahissar 2002), STM suggests that recurrent processing is critical when target selection requires spatial scrutiny for binding/discriminating target features, as, for example, in search for conjunctions of features like color and orientation.
A key notion of the feature integration theory (Treisman and Gelade 1980) is that the localization of an item (in a master map of locations) is critical for mediating the correct binding of features to that item. Considerable psychophysical evidence has mounted suggesting that this conjoining of features requires reentrant processing in early visual cortex areas (Di Lollo et al. 2000). In contrast, when sufficient target information for the current task can be obtained with the feedforward sweep of processing as in a simple color popout–detection task, precise binding of color to other features of a particular item would not be required. It suffices to process color in an unbound form which preempts the necessity to involve the top-down–propagating WTA process to demarcate the precise location of the colored item (Tsotsos et al. 2007). According to STM, surround attenuation arises as a consequence of the latter process; a suppressive surround is therefore not predicted to appear for a simple color-popout task. With the present study, we directly tested both predictions using MEG recordings.
Materials and Methods
The materials and methods follow the experimental approach and methods reported in Hopf et al. (2006). In this previous study, we analyzed the neuromagnetic response to a probe stimulus presented 250 ms after search frame onset at varying distances from the focus of attention. This revealed a center-surround profile of cortical excitability resembling a Mexican hat shape. In order to analyze the time it takes for this profile to arise in the present experiment, we systematically varied the probe's onset relative to the onset of the search frame (100, 175, 250, 325, and 400 ms). A second experiment addressed the prediction that the suppressive surround depends on the requirement for correct binding of features at the target location. To this end, we compared search tasks that either involve such a binding operation (orientation discrimination of a color-popout target) or do not (simple color discrimination).
Stimuli and Procedure (Experiment 1)
The stimuli and trial structure of Experiment 1 are illustrated in Figure 1a. As mentioned above, the general logic and experimental setup followed our earlier work (Hopf et al. 2006). While fixating the center of the screen, observers searched for one target, the light blue letter C, among 8 dark blue distractor Cs (background black) presented at an isoeccentric distance from fixation (8° of visual angle) in the right lower visual quadrant. Each C subtended 0.8° of visual angle, and spacing between Cs was constant (1.35°, 8.2° radial angle). While the gap of the distractor Cs varied randomly in 1 of 4 directions (left, right, up, and down), the gap of the target C varied between left and right, and observers had to indicate the gap's orientation by pressing 1 of 2 buttons with the right hand (index finger = gap left, middle finger = gap right). On each trial, the target C appeared randomly at 1 of the 9 stimulus locations, which forced subjects to change the spatial focus of attention from trial to trial. Each search frame was presented for 700 ms, followed by a variable intertrial interval jittered between 650 and 850 ms (boxcar distribution). On 50% of the trials, a probe stimulus (a white ring) was flashed around the central C for 50 ms (frame + probe [FP] trials); in the remaining trials, no probe was presented (frame-only [FO] trials). Because the probe always appeared at a fixed center location, its distance to the target changed from trial to trial and could, thus, serve as a passive measure of cortical excitability at varying distances to the focus of attention. As illustrated in Figure 1a, there were 5 target-to-probe distances (in short probe distances [PD]), ranging from PD0 (target appears at the probe's location) through PD4 (target appears 4 items away from the probe). 
With this experimental setup, we recently demonstrated that the probe, presented 250 ms after search frame onset, elicited a significantly reduced magnetic response when attention was focused onto PD1 relative to PD0 and PD2–4, suggesting the presence of a narrow zone of sensory attenuation surrounding the target location (Hopf et al. 2006). Experiment 1 aimed to analyze the time course of this surround attenuation by systematically varying the frame-probe stimulus onset asynchrony (SOA) between 100 and 400 ms in increments of 75 ms (100, 175, 250, 325, and 400 ms, see Fig. 1a, lower part).
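The trial logic of Experiment 1 can be sketched in a few lines of Python. This is only an illustration of how probe distance follows from the target position: the zero-based position indices, the function names, and the independent random draws for probe presence and SOA are assumptions, since the text does not state how trials were sequenced or whether SOA varied within blocks.

```python
import random

N_POSITIONS = 9                       # item locations in the search array
PROBE_POSITION = 4                    # zero-based index of the central item (assumed indexing)
SOAS_MS = (100, 175, 250, 325, 400)   # frame-probe SOAs used in Experiment 1


def probe_distance(target_position, probe_position=PROBE_POSITION):
    """Return the probe distance (PD0..PD4) in items between target and probe."""
    return abs(target_position - probe_position)


def make_trial(rng):
    """Draw one hypothetical trial; the sampling scheme is an assumption."""
    target = rng.randrange(N_POSITIONS)      # target at 1 of the 9 locations
    probe_present = rng.random() < 0.5       # about 50% FP trials, rest FO trials
    return {
        "target_position": target,
        "pd": probe_distance(target),
        "trial_type": "FP" if probe_present else "FO",
        "soa_ms": rng.choice(SOAS_MS) if probe_present else None,
        "iti_ms": rng.uniform(650, 850),     # boxcar (uniform) intertrial jitter
    }


rng = random.Random(0)
trials = [make_trial(rng) for _ in range(10)]
```

Because the probe location is fixed, the 9 target positions map onto only 5 probe distances, with every distance except PD0 occurring on both sides of the probe.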
Stimuli and Procedure (Experiment 2)
Experiment 2 was conducted to test the prediction that the Mexican hat profile would only appear when the search task requires spatial scrutiny for precise feature/location binding but not when the search task could be performed with location-unbound feature information—information that is commonly assumed to be instantly provided with the feedforward sweep of processing. Thus, in Experiment 2, we directly compared situations that did or did not require recurrent processing. Specifically, in one type of trial blocks, we asked subjects to just report the color of a color-popout target (red or green among blue distractors), whereas in other trial blocks, subjects had to discriminate the gap orientation of the popout item as in Experiment 1. The former condition does not require spatial scrutiny to do the task and would generally not be assumed to involve location binding and recurrent processing (Treisman and Gelade 1980; Evans and Treisman 2005). The latter condition, in contrast, is assumed to necessitate the binding of orientation and color at the target's location and, therefore, to require spatial scrutiny and recurrent processing. The stimulus setup was similar to Experiment 1, except for 3 modifications. First, the target item was either a red or a green item among blue distractors with red and green assigned randomly from trial to trial. Second, the frame-probe SOA was fixed to 250 ms (Fig. 1b, lower part), which produced a prominent center-surround profile in a similar orientation discrimination task (Hopf et al. 2006). Third, to increase the number of trials per condition, the search array was reduced to the central 5 positions around the probe's location (Fig. 1b). The central 5 positions (PD0–2) are apparently sufficient to cover the surround suppression effect in view of our previous observations (Hopf et al. 2006). 
The task of the subjects alternated between trial blocks: in half the blocks, subjects were to report the orientation of the target C (orientation task: same button mapping as in Experiment 1), whereas in the other half of the runs, subjects had to report the color of the target (color task: green = index finger; red = middle finger).
Both experiments were undertaken with the understanding and written consent of the subjects and were approved by the ethics committee of the OvG-University, Magdeburg. All subjects were paid for participation. Sixteen neurologically normal subjects took part in Experiment 1 (12 females, mean age 24.3) and 14 in Experiment 2 (10 females, mean age: 24.9). All subjects were right handed with normal color vision and normal or corrected-to-normal visual acuity.
Data Recording and Analysis
The MEG signal was recorded using a 148-channel BTi Magnes 2500 whole-head magnetometer system (Biomagnetic Technologies Inc., San Diego, CA), digitized at a rate of 254 Hz, and low-pass filtered from direct current (DC) to 50 Hz. The electrooculogram (EOG) was recorded simultaneously with the MEG using a Synamps amplifier system (NeuroScan Inc., Herndon, VA). Both the horizontal and the vertical EOG were recorded with a bipolar montage using 2 electrodes behind the lateral orbital angles for the horizontal EOG and 2 electrodes below and above the right eye for the vertical EOG. Impedances were kept below 5 kΩ; an electrode placed at FPZ served as ground. The MEG data were coregistered with anatomical data by digitizing anatomical landmarks (left and right preauricular points, nasion; Polhemus 3Space Fastrak system [Polhemus Inc., Colchester, VT]), which were then brought into register with magnetic marker fields generated by 5 spatially distributed coils attached to the subjects' heads.
MEG signals were submitted to online and offline noise reduction (Robinson 1989). Subsequent artifact rejection was performed by removing MEG epochs exceeding a peak-to-peak threshold of 3 pT and EOG voltage changes exceeding 100 μV. Epochs containing eye movements, artifacts, or incorrect button presses were excluded from further analysis.
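The epoch rejection rule can be written down as a minimal sketch. The thresholds are those given above; the function names and data layout are hypothetical, and treating the EOG criterion as a peak-to-peak range within the epoch is an assumption, because the text only speaks of "voltage changes".

```python
MEG_PTP_LIMIT_T = 3e-12   # 3 pT peak-to-peak threshold, expressed in tesla
EOG_LIMIT_V = 100e-6      # 100 microvolts, expressed in volts


def peak_to_peak(samples):
    """Range (max minus min) of one channel within one epoch."""
    return max(samples) - min(samples)


def keep_epoch(meg_channels, eog_channels, response_correct):
    """Decide whether an epoch enters the average.

    meg_channels and eog_channels are lists of per-channel sample lists.
    """
    if not response_correct:
        return False      # incorrect button presses are excluded
    if any(peak_to_peak(ch) > MEG_PTP_LIMIT_T for ch in meg_channels):
        return False      # MEG artifact
    if any(peak_to_peak(ch) > EOG_LIMIT_V for ch in eog_channels):
        return False      # eye movement or blink (assumed peak-to-peak criterion)
    return True
```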
Average event-related magnetic field (ERMF) waveforms were computed for each subject and probe distance, time locked to probe onset, and relative to a baseline interval of 300–150 ms before probe onset. Separate averages were computed for FP and FO trials. To isolate the ERMF response elicited by the probe (probe-related response) from the overlapping response elicited by the search array, FO waveforms were subtracted from FP waveforms (FP minus FO difference) of trials with the same target location. This approach has been validated by earlier studies (Luck et al. 1993; Luck and Hillyard 1995; Vogel et al. 1998) and yields the passive cortical response to the probe at varying distances to the focus of attention. The size of the probe-related response was quantified in each observer as the mean amplitude of the ERMF difference between the efflux and influx maximum (relative to baseline) which appeared on average between 110 and 140 ms in Experiment 1 and between 130 and 150 ms in Experiment 2. In both experiments, the time range of analysis was defined relative to the onset of the probe and was the same for the 5 frame-probe SOAs in Experiment 1. It should be noted that sensor sites showing efflux and influx maxima varied between subjects but were identical for all probe distances for a given subject. Statistical data validation was conducted using repeated-measures analyses of variance (rANOVAs). Nonsphericity was corrected based on the Greenhouse–Geisser algorithm (if necessary) with respective results reported with adjusted degrees of freedom.
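The quantification of the probe-related response described above can be sketched as follows. This is not the analysis code used in the study: the function names and the single-sensor list layout are hypothetical, and the sketch assumes that the efflux and influx sensors have already been selected for each subject.

```python
def mean(values):
    values = list(values)
    return sum(values) / len(values)


def average_epochs(epochs):
    """Sample-wise average across equal-length epochs of one sensor."""
    return [mean(samples) for samples in zip(*epochs)]


def window_mean(waveform, times_ms, start_ms, end_ms):
    """Mean amplitude over samples with start_ms <= t <= end_ms."""
    return mean(v for v, t in zip(waveform, times_ms) if start_ms <= t <= end_ms)


def baseline_correct(waveform, times_ms, start_ms=-300, end_ms=-150):
    """Subtract the mean of the 300-150 ms pre-probe baseline interval."""
    base = window_mean(waveform, times_ms, start_ms, end_ms)
    return [v - base for v in waveform]


def probe_related(fp_epochs, fo_epochs, times_ms):
    """FP minus FO difference wave for trials with the same target location."""
    fp = baseline_correct(average_epochs(fp_epochs), times_ms)
    fo = baseline_correct(average_epochs(fo_epochs), times_ms)
    return [a - b for a, b in zip(fp, fo)]


def probe_response(efflux_diff, influx_diff, times_ms, window_ms=(110, 140)):
    """Efflux minus influx mean amplitude in the analysis window (probe-locked)."""
    start, end = window_ms
    return (window_mean(efflux_diff, times_ms, start, end)
            - window_mean(influx_diff, times_ms, start, end))
```

For Experiment 2, the same sketch would be called with `window_ms=(130, 150)`. All times are relative to probe onset, so one window serves every frame-probe SOA.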
For source localization, current source density estimates were computed by means of standardized low-resolution electromagnetic tomography (sLORETA, Pascual-Marqui 2002) as implemented in the neuroimaging software Curry 5.08 (Compumedics Neuroscan, El Paso, TX). The sLORETA represents an extension of the minimum norm least square (MNLS) method (Hämäläinen and Ilmoniemi 1994; Fuchs et al. 1999), where current estimates at each source location are weighted by their measurement error, yielding a pseudo-F–value distribution. Source localization results provided in Figure 5 represent cortical surface-constrained MNLS estimates. All inverse computations were constrained by realistic anatomical models of volume conductor and source compartment derived by 3-dimensional (3-D) surface reconstructions (boundary element method, Hämäläinen and Sarvas 1989) of the cerebrospinal fluid space and the cortical surface (gray matter compartment), respectively. The anatomical basis for the above constraints on inverse modeling was provided by segmentations of the Montreal Neurological Institute (MNI) brain (average of 152 T1-weighted stereotaxic volumes from the ICBM project).
Subjects generally committed slightly more response errors in FP than in FO trials (97.5% vs. 96.7% correct). A 2-way rANOVA with the factors probe presence (present vs. absent) and SOA (100, 175, 250, 325, 400 ms) yielded a significant main effect for probe presence (F1,15 = 38, P < 0.001) and SOA (F2.5,37.8 = 10.1, P < 0.001), as well as a significant probe presence × SOA interaction (F2.7,40.7 = 8.9, P < 0.001). Subsequent rANOVAs with the factor probe presence testing each individual probe SOA revealed significant effects for the shortest SOA (100 ms: F1,15 = 51, P < 0.001) and the longest SOA (400 ms: F1,15 = 7.8, P = 0.013). The difference at 175 ms was only marginally significant (F1,15 = 4.3, P = 0.055), and no effects were found for the 250 ms and 325 ms SOA (both P > 0.5). The influence of the probe was also evident in the response time (RT) data (probe absent: mean RT = 507 ms; probe present: mean RT = 518 ms). An overall 2-way rANOVA with the factors probe presence (present vs. absent) and SOA (100, 175, 250, 325, 400 ms) revealed a significant main effect of SOA (F2.1,31.5 = 10.7, P < 0.001) and probe presence (F1,15 = 30.7, P < 0.001). The respective interaction was also significant (F2.8,42.8 = 33.2, P < 0.001). Separate rANOVAs for each SOA revealed significant effects of probe presence for the 100 ms (F1,15 = 69.1, P < 0.001) and 175 ms SOA (F1,15 = 17.2, P = 0.001). SOAs beyond 175 ms revealed no significant effect (all P > 0.4). Taken together, probes presented soon after the onset of the search array influenced behavioral performance, most likely because of backward masking of the target by the probe. Probes with SOAs beyond 175 ms, however, had little influence on performance, except for the 400 ms SOA, which presumably started to interfere with the choice of the correct response.
Figure 2a shows the topographical distribution and waveforms of the probe-related response (FP minus FO difference) for the 250 ms SOA collapsed across probe distances. The waveforms show the ERMF response averaged across subjects from each individual subject's efflux (thin solid line) and influx maximum (thin dashed line) which were then collapsed by subtracting the influx from the efflux response (efflux + (−influx), bold solid line). Note that the central occipital magnetic field response consists of circumscribed efflux (bright region) and influx components (dark region) which both refer to the same underlying current source. Collapsing data across efflux and influx component, thus, provides a more complete characterization of the magnetic field response. Analogous to Hopf et al. (2006), the probe response for the different probe distances and SOA conditions was quantified as the average ERMF response in a selected time range after probe onset (gray area under the bold trace).
The electromagnetic response for the different SOA conditions is illustrated in Figure 2b. For this analysis, the probe-related response was quantified as the mean ERMF amplitude between 110 and 140 ms after probe onset (see Materials and Methods and Hopf et al. 2006 for more details). When presented at 100 ms after frame-onset, the probe did not elicit a profile that markedly differed across probe distances. This was confirmed by a 1-way rANOVA with factor probe distance (PD0 through PD4), which yielded no significant effect (F2.4,35.3 = 0.8, P = 0.483). There was also no significant effect of probe distance when presenting the probe with a 175-ms SOA (F3.1,46.7 = 1, P = 0.42).
However, presenting the probe after 250 ms produced a significant effect of probe distance (F3.1,46.4 = 3.2, P = 0.031). Subsequent pairwise comparisons revealed that the probe response at PD1 was significantly smaller than at PD0 (PD0 vs. PD1: F1,15 = 9.3, P = 0.008) and at PD2 (PD1 vs. PD2: F1,15 = 6.2, P = 0.025)—a pattern that validates the presence of a suppressive surround at PD1. There was no difference between PD0 and PD2 (F1,15 < 0.1, P = 0.889) indicating that no center enhancement was present at the SOA of 250 ms. A significant effect of probe distance was also observed for the 325-ms SOA (F3.1,46.0 = 4.9, P = 0.005). As visible from the fourth bar graph in Figure 2b, this effect was mainly due to PD0 being larger in comparison to all other probe distances (PD0 vs. PD1: P = 0.001; PD0 vs. PD2: P = 0.031; PD0 vs. PD3: P = 0.015; PD0 vs. PD4: P = 0.037). There was, however, no significant difference between PD1 and PD2 (F1,15 = 0.9, P = 0.35). Finally, the 400-ms SOA produced a pattern roughly similar to that of the 325 ms SOA. While there was a significant overall effect of PD (F2.8,41.5 = 3.54, P = 0.025), this effect arose from an enhanced response at PD0 relative to all other probe distances except for PD2 (PD0 vs. PD1: P = 0.03; PD0 vs. PD2: P = 0.18; PD0 vs. PD3: P = 0.02; PD0 vs. PD4: P = 0.046). Again, there was no difference between PD1 and PD2 (F1,15 = 0.5, P = 0.5) indicating that circumscribed surround attenuation is absent at this SOA.
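For the pairwise comparisons, a repeated-measures ANOVA with a single 2-level within-subject factor is equivalent to a paired t-test, with F(1, n − 1) = t². The sketch below makes that relation explicit; the function name is hypothetical, and any values passed to it here are placeholders rather than data from the study.

```python
import math


def paired_f(cond_a, cond_b):
    """F statistic and degrees of freedom for a 2-level within-subject contrast.

    cond_a and cond_b hold one value per subject (e.g., probe response at
    PD0 and PD1). Returns (F, (df1, df2)) with F equal to the squared paired t.
    """
    diffs = [a - b for a, b in zip(cond_a, cond_b)]
    n = len(diffs)
    mean_d = sum(diffs) / n
    var_d = sum((d - mean_d) ** 2 for d in diffs) / (n - 1)  # sample variance
    t = mean_d / math.sqrt(var_d / n)
    return t * t, (1, n - 1)


# Synthetic placeholder values for 4 hypothetical subjects, not study data.
f_value, dfs = paired_f([3.0, 5.0, 4.0, 6.0], [1.0, 2.0, 3.0, 3.0])
```

With 16 subjects, as in Experiment 1, the degrees of freedom come out as (1, 15), matching the F1,15 values reported for the pairwise tests.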
In sum, the suppressive surround previously observed with a similar experimental setup was replicated in the present experiment. Importantly, confirming the prediction of STM, surround suppression appeared with a clear delay, emerging between 175 and 250 ms after search frame onset. According to STM, surround attenuation arises as a consequence of recurrent processing initiated after the initial feedforward sweep of processing reaches higher level areas of the visual hierarchy. The suppressive surround is therefore predicted to appear with a delay relative to the feedforward sweep. Furthermore, although the suppressive surround was fully apparent at 250 ms, it was no longer significant at 325 ms and completely absent at 400 ms. In contrast, at the 325 ms SOA, a significant enhancement was seen at PD0 which was not apparent for shorter SOAs, suggesting that surround attenuation precedes center enhancement as a transient cortical modulation.
Experiment 1 established that surround suppression did not appear until more than 175 ms after search frame onset, a substantial delay that is in line with the time costs imposed by recurrent processing and binding. Temporal delay is, of course, at best suggestive but not sufficient evidence for recurrent processing. With the second experiment, we wanted to address more directly whether surround suppression arises from recurrence binding in the visual processing hierarchy. Experiment 2 was based on the following logic: A task that does not require recurrence binding, like simple color discrimination (Evans and Treisman 2005), should not produce surround suppression, whereas a task that requires recurrence binding should. We, thus, instructed subjects to either perform a gap orientation discrimination of the popout target as in Experiment 1 (task requiring spatial scrutiny and recurrence binding) or to report just the color of the target (red/green), a task that does not require recurrence binding (Evans and Treisman 2005; Tsotsos et al. 2007).
Subjects responded faster during color discrimination (average: 488 ms) than during orientation discrimination (average: 544 ms; F1,13 = 46.6, P < 0.001). The presence/absence of the probe had little effect under both conditions (490 vs. 487 ms for color discrimination: F1,13 = 3, P = 0.11; 543 vs. 546 ms for orientation discrimination: F1,13 = 2.1, P = 0.17). Response accuracy did not differ between orientation and color discrimination (92% correct responses each; F1,13 = 0.3, P = 0.88). The probe, however, influenced response accuracy in the orientation discrimination task (92.8% correct without probe vs. 91.1% correct with probe; F1,13 = 8.59, P = 0.01) but had no effect on color discrimination (91.6% correct without probe vs. 91.5% correct with probe; F1,13 = 0.02, P = 0.89).
Figure 3 shows the average ERMF response elicited by the probe (frame-probe minus frame-only difference) when subjects performed the orientation discrimination task (a) and the color discrimination task (b). As can be seen, the orientation discrimination task produced a clear reduction of the probe response at PD1 relative to PD0 and PD2 (PD3 and PD4 were not measured in this experiment, see Materials and Methods), which was not present for the color discrimination task. A 2-way rANOVA with the factors condition (color/orientation) and PD confirmed that impression by revealing a significant condition × PD interaction (F1.88,24.46 = 8.1, P < 0.005). Corresponding main effects of condition (F1,13 = 21.6, P < 0.001) and PD (F1.75,22.8 = 4.21, P < 0.05) were also significant. The surround attenuation at PD1 in the orientation discrimination task was further confirmed by significant pairwise comparisons between PD1 and the flanking PDs (PD1 vs. PD0: F1,13 = 6.36, P = 0.026 and PD1 vs. PD2: F1,13 = 10.49, P = 0.006). The slight difference between PD0 and PD2 was not statistically significant (F1,13 = 0.14, P = 0.712).
The inspection of Figure 3b reveals that color discrimination did not produce a reduction of the probe response at PD1 relative to the flanking positions. Instead, the ERMF response falls gradually from PD0 to PD2 resulting in a significant main effect of PD (F2,25.8 = 5.3, P = 0.011).
Finally, a direct comparison of the probe response between the 2 experimental tasks revealed larger responses for the color discrimination task at PD0 (F1,13 = 18.16, P = 0.001) and PD1 (F1,13 = 16.99, P = 0.001). No difference was found at PD2 (F1,13 = 0.92, P = 0.355).
To summarize, in line with our previous observations the orientation discrimination task produced a narrow zone of neural attenuation surrounding the search target. Color discrimination, in contrast, did not produce such surround attenuation, but instead, gave rise to a simple gradient. Under the premise that our orientation discrimination task but not the color discrimination task required precise color-orientation binding and therefore recurrent processing for successful target discrimination, these observations clearly support the prediction that surround attenuation arises as a consequence of recurrent processing in the visual processing hierarchy.
Activity Reflecting Recurrent Processing in Early Visual Cortex
The validity of the conclusion of Experiment 2 rests on the proposition that the orientation discrimination task requires recurrence binding as compared with the color discrimination task. Although there is considerable psychophysical evidence in support of this notion, it would still be important to validate this assumption with the present data, that is, to demonstrate that orientation discrimination, in fact, involves enhanced recurrent processing in early visual cortex areas. To this end, we analyzed the ERMF response elicited by the search frame in the time range before the presentation of the probe stimulus. Specifically, we focused on 3 different time windows that contain the initial feedforward elicited response in early visual cortex (50–80 ms), the time range of the maximum response in extrastriate areas (150–180 ms), and most importantly, the time range between 200 and 250 ms which is typical for attention-driven recurrent activity in early visual cortex (Mehta et al. 2000; Martinez et al. 2001; Noesselt et al. 2002; Woldorff et al. 2002; Di Russo et al. 2003). It should be noted that we do not expect recurrent processing to be completely absent in color discrimination, as to a certain degree recurrent processing appears to be automatic in the visual system and has been shown to arise even in anesthetized animals (Hupé et al. 1998, 2001). The important question here is whether the orientation discrimination task leads to a larger recurrent activation of the early visual cortex than the color discrimination task.
Figure 4a shows the average ERMF distribution of the color discrimination (Color) and the orientation discrimination task (Orient.) in 3 selected time windows between search frame onset and the onset of the probe (data collapsed across FP and FO trials). As is apparent from comparing these maps, the topographical distribution of the ERMF response was very similar under both experimental conditions in each of the selected time ranges. However, between 200 and 250 ms, the orientation discrimination task showed a considerably larger central occipital field effect in comparison to the color task (orientation discrimination: 69 fT; color discrimination: 39 fT; effect size measured as ERMF difference between highlighted sensor sites). A respective rANOVA with the factor experimental task (orientation vs. color discrimination) revealed a significant effect (F1,13 = 11.76, P = 0.004). In contrast, no ERMF difference between experimental tasks was observed in the 2 earlier time windows, either between 50 and 80 ms over central occipital sensor sites (orientation/color: 102/94 fT; F1,13 = 0.75, P = 0.404) or between 150 and 180 ms over left occipitotemporal sites (orientation/color: 130/126 fT; F1,13 = 0.122, P = 0.733).
Figure 4b shows current source localization results (sLORETA estimates, see Materials and Methods) of the orientation task at representative time points of the 3 selected time windows shown in panel (a). The current density maximum between 50 and 80 ms appeared in the left calcarine fissure region, consistent with a primary visual cortex origin. The current density distribution in the subsequent 150–180 ms window, in contrast, shows a maximum in lateral ventral extrastriate cortex. Importantly, the sLORETA estimate between 200 and 250 ms yielded a maximum in a region of the left hemisphere calcarine fissure that is very similar to the current density maximum of the 50–80 ms time window, suggesting that the ERMF response between 200 and 250 ms, in fact, reflects recurrent activity in the primary visual cortex. A more detailed analysis is provided in Figure 5, which shows cortical surface-constrained (surface derived from a 3-D surface segmentation of the MNI brain) current density estimates of the color (upper row) and orientation discrimination task (lower row) in a more extended time range between 0 and 400 ms after search frame onset. This analysis was based on no-probe trials only and revealed a sequence of 4 different current source density distributions (A–D). The first distribution shows a central occipital maximum around 75 ms (A), which represents the initial feedforward response in early visual cortex. This current maximum as well as the subsequent extrastriate activity (B) did not differ between the color and the orientation task. In contrast, the following current maximum in early visual cortex (C) was substantially larger in the orientation than the color task. Importantly, for both experimental conditions, the current maximum in (A) and (C) appeared at the same location in early visual cortex, suggesting that the current maximum in (C) reflects recurrent activity in early visual cortex.
It is notable that, consistent with the restricted time range of surround attenuation around 250 ms observed in Experiment 1, the enhancement of recurrent activity in the orientation discrimination task appeared between 190 and 270 ms (red horizontal bar) but not later.
In sum, the current density analysis of the color and orientation discrimination task revealed that the latter, indeed, led to enhanced recurrent activity in early (presumably the primary) visual cortex, shortly before the probe's onset. In contrast, feedforward-related activity in earlier time ranges did not differ. These observations can be taken in support of the proposition that the orientation discrimination task involved enhanced recurrent processing as compared with the color discrimination task.
The present experiments replicate and extend our previous observations reported in Hopf et al. (2006) where we showed that the focus of attention in visual search is not a simple spatial gradient as traditionally envisioned (Posner 1980; Downing and Pinker 1985). Instead, it may rather resemble a Mexican hat profile consisting of a narrow zone of sensory attenuation that surrounds a central region of relative enhancement. We now extend these findings by showing that the spatial profile of attention can vary depending on the particular discrimination processes required to identify the target in visual search. With the first experiment, we provide a systematic time course analysis of the center-surround profile, which revealed that surround attenuation and center enhancement are temporally delayed effects with the center enhancement being even more delayed than surround attenuation. Specifically, the suppressive surround appeared with a delay of more than 175 ms relative to the search frame's onset. This lag is clearly beyond the timing of the initial feedforward sweep of processing in the human visual system which is presumably completed within the first ∼100 ms (Foxe and Simpson 2002). The suppressive surround is thus likely related to subsequent recurrent processing. In fact, the time course of the suppressive surround is well in line with the typical time range (∼150 to 250 ms) of recurrent activity modulations due to attention in early visual cortex (Martinez et al. 2001; Noesselt et al. 2002; Di Russo et al. 2003). This confirms the prediction of the STM (Tsotsos et al. 1995; Tsotsos 2005), according to which the suppressive surround arises from a top-down–propagating WTA process that is initiated after the feedforward sweep of processing reaches the top levels of the visual hierarchy. As the WTA process moves step-by-step down the hierarchy, it adds further delay before reaching early visual cortex areas.
Effects of surround attenuation are, thus, predicted to appear with substantial temporal delay particularly in lower level visual areas.
Our time course analysis revealed another important finding, namely that attentional enhancement seen at the target's location is delayed relative to the attenuation in the target's surround. This observation confirms a further prediction of the STM (Tsotsos et al. 2001). The top-down propagating WTA produces a pass zone of unattenuated activity at the target's location that will initially not show a center enhancement, consistent with the absence of such enhancement at the 250 ms SOA. However, when surround attenuation is established, competitive (attenuating) influences from the distractor units onto the units inside the pass zone will be removed, resulting in a relative enhancement inside the pass zone (see Tsotsos et al. 2001 for a detailed description of this prediction). Importantly, while this sequence of surround attenuation followed by a center enhancement arises as an emergent property of STM's selection architecture, it provides an optimal solution to the problem of how to amplify target information without simultaneously boosting noise in a cluttered search scene.
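The logic of this prediction can be illustrated with a toy sketch. The following Python snippet is not the STM implementation; it is a minimal one-dimensional illustration under assumed conditions, and every name and parameter in it (`lateral_response`, `surround_gain`, `pass_radius`, `surround_radius`, `attenuation`, `inhibition`, `reach`) is hypothetical. It shows only that attenuating the units around a pass zone reduces their competitive influence on the unit at the target's location, so that this unit ends up relatively enhanced although its own gain is unchanged.

```python
def lateral_response(drive, gain, inhibition, reach):
    """Toy response: gated drive of each unit minus the mean gated drive
    of its neighbors within `reach`, scaled by `inhibition` (floored at 0)."""
    n = len(drive)
    gated = [d * g for d, g in zip(drive, gain)]
    out = []
    for i in range(n):
        lo, hi = max(0, i - reach), min(n, i + reach + 1)
        neighbors = [gated[j] for j in range(lo, hi) if j != i]
        competition = inhibition * sum(neighbors) / len(neighbors)
        out.append(max(0.0, gated[i] - competition))
    return out


def surround_gain(n, target, pass_radius, surround_radius, attenuation):
    """Gain of 1.0 inside the pass zone and far away; `attenuation`
    for units in the ring between pass_radius and surround_radius."""
    gain = [1.0] * n
    for i in range(n):
        dist = abs(i - target)
        if pass_radius < dist <= surround_radius:
            gain[i] = attenuation
    return gain


# Assumed toy values: 15 equally driven units, target at the middle unit.
n_units, target = 15, 7
drive = [1.0] * n_units

# Before surround attenuation: every unit competes at full strength.
before = lateral_response(drive, [1.0] * n_units, inhibition=0.5, reach=2)

# After surround attenuation: units next to the target are attenuated.
gain = surround_gain(n_units, target, pass_radius=0,
                     surround_radius=2, attenuation=0.4)
after = lateral_response(drive, gain, inhibition=0.5, reach=2)

print("target unit:   ", before[target], "->", after[target])
print("surround unit: ", before[target - 1], "->", after[target - 1])
print("far-away unit: ", before[0], "->", after[0])
```

In this sketch, the target unit's response rises only because its attenuated neighbors contribute less competition, which mirrors the ordering discussed above: the center enhancement depends on the surround attenuation being established first.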
Whether focal attention actually operates by signal enhancement in multi-item displays has been a matter of debate (Shiu and Pashler 1994; Carrasco et al. 2000, 2002; Dosher and Lu 2000; Lu et al. 2002), with the general tenet being that attention rather relies on distractor attenuation in multi-item displays. Signal enhancement has been suggested to become relevant only when isolated objects are processed. Nevertheless, there are psychophysical data suggesting that both signal enhancement and distractor attenuation play a role during attentional selection in multi-item displays (Henderson 1996; Luck et al. 1996; Cheal and Gregory 1997). In addition, functional brain imaging data (Tootell et al. 1998; Pinsk et al. 2004) indicate that spatial attention involves retinotopically specific enhancement and suppression in visual cortex, possibly coordinated in a push–pull–like manner (Pinsk et al. 2004). The present time course analysis of ERMF data adds to these observations by showing that both processes may be spatiotemporally structured such that interfering noise in the vicinity of the target is attenuated before subsequent enhancement processes are involved.
The delayed onset of surround suppression supports STM's notion that the suppression is a consequence of recurrent processing in the visual hierarchy. Temporal delay, however, represents only suggestive but not necessarily sufficient evidence for recurrent processing. We, therefore, designed a further experiment to verify this notion more directly. In this experiment, we analyzed the passive cortical excitability to an irrelevant probe stimulus as in the first experiment but under alternative experimental conditions that differed as to whether recurrence binding was critical for performing the search task. Subjects were required either to discriminate the gap orientation of the target (a task that requires recurrent processing for attentional binding) or to simply report the color of the popout target. The latter task is commonly assumed not to involve recurrent processing for binding. For example, in the framework of feature integration theory, the color task will not require binding as it is sufficient to check the presence of the target color in the relevant feature map (Treisman and Gelade 1980; Treisman 1998) without resorting to its precise location. Feature singletons were shown to permit above-chance detection performance even when the location report is erroneous. Conversely, the successful identification of a conjunction target always involved a correct location report (Treisman and Gelade 1980; Wolfe and Cave 1999). Recently, Evans and Treisman (2005) further demonstrated with real-world scenes and objects that detection based on one (or more) features does not require precise location binding, is fast, and escapes the attentional blink. Here, we observed that reporting the color (green/red) of a popout item did not produce a suppressive surround, whereas the gap task led to a clear center-surround profile, a finding that is clearly in line with the prediction of STM.
Furthermore, it indicates that the spatial distribution of attention in visual search is not a fixed profile but changes depending on the task demands, in particular, on whether target discrimination necessitates recurrent processing to attain precise feature location binding. Importantly, the results of Experiment 2 may serve as a basis to reconcile conflicting results of previous psychophysical investigations suggesting either the presence of a simple gradient or a center-surround pattern. That is, studies that provided evidence for a simple spatial gradient typically used tasks requiring simple detections like luminance onsets (Downing and Pinker 1985; Hughes and Zimba 1985; Handy et al. 1996), that is, tasks that could be solved without spatial scrutiny and fine discrimination and thus without involving recurrence binding. Surround attenuation, on the other hand, was typically seen with search tasks that required the precise localization and individuation of the target like letter, shape, or line-length discrimination tasks (Cave and Zimmerman 1997; Caputo and Guerra 1998; Bahcall and Kowler 1999; Mounts 2000a; Cutzu and Tsotsos 2003). Thus, the particular profile of the focus of attention appears to depend on the specific computations required to discriminate the target item. In fact, such a distinction based on the requirement for recurrence binding is directly substantiated by recent psychophysical data (McCarley and Mounts 2007). McCarley and Mounts analyzed target–distractor interactions in search tasks requiring either simple feature detection or a more thorough object individuation. They observed that only tasks involving object individuation led to enhanced attentional interference in the vicinity of the target. In sum, the long-standing controversy about the spatial profile of attention may be resolvable by acknowledging that different profiles are possible, with the Mexican hat profile appearing when the discrimination of the target relies on recurrent processing.
It should finally be mentioned that although the color discrimination task did not produce surround attenuation, it showed a significant effect of PD in the form of a center enhancement. This indicates that the color task involved some form of location-based selection different from the attentional process leading to surround attenuation. In other words, the surround attenuation investigated here reflects just one among several mechanisms of attentional selection in the visual system. This is not unexpected as there is considerable evidence that multiple mechanisms of attentional selection cooperate on a tight temporal scale in visual search (Luck 1995; Hopf et al. 2005). For example, it was shown in visual search that attention to features highlights any distractor location that bears search-relevant features significantly before attention is spatially focused onto the target item (Hopf et al. 2004). As the explicit selection of color was emphasized in the color discrimination task, the center enhancement may relate to this type of feature-based selection (Schoenfeld et al. 2007).
To summarize, the results of the 2 experiments reported here converge on the conclusion that the Mexican hat profile of the focus of attention in visual search is a direct consequence of recurrent processing in the visual system. This notion is suggested 1) by the time course analysis showing that the temporal onset of surround attenuation is significantly beyond the initial feedforward sweep of processing through the visual hierarchy and 2) by the observation that surround attenuation does not appear when target identification can be achieved without recurrence binding. Both observations are in line with predictions of the STM according to which the suppressive surround arises from a top-down propagating WTA process that prunes away feedforward connections from distractor locations not contributing to the search target. Finally, because this pruning operation attenuates competitive influences from distractor units in the surround, it should cause a subsequent relative enhancement of units in the attended pass zone, a prediction that is confirmed by the present data.
Deutsche Forschungsgemeinschaft (HE 1531/9-2) and Bundesministerium fuer Bildung und Forschung (Center for advanced imaging).
J.K.T. acknowledges the Canada Research Chairs Program and Natural Sciences and Engineering Research Council of Canada. Conflict of Interest: None declared.