Feature attention operates in a spatially global way, with attended feature values being prioritized for selection outside the focus of attention. Accounts of global feature attention have emphasized feature competition as a determining factor. Here, we use magnetoencephalographic recordings in humans to test whether competition is critical for global feature selection to arise. Subjects performed a color/shape discrimination task in one visual field (VF), while irrelevant color probes were presented in the other unattended VF. Global effects of color attention were assessed by analyzing the response to the probe as a function of whether or not the probe's color was a target-defining color. We find that global color selection involves a sequence of modulations in extrastriate cortex, with an initial phase in higher tier areas (lateral occipital complex) followed by a later phase in lower tier retinotopic areas (V3/V4). Importantly, these modulations appeared with and without color competition in the focus of attention. Moreover, early parts of the modulation emerged for a task-relevant color not even present in the focus of attention. All modulations, however, were eliminated during simple onset-detection of the colored target. These results indicate that global color-based attention depends on target discrimination independent of feature competition in the focus of attention.
Attention to elementary object features such as color, orientation, or motion has been shown to operate in a spatially global manner such that the selection of a feature value at one location triggers the parallel selection of that feature value at other unattended locations. While there are abundant data documenting global feature-based attention effects in the monkey (Treue and Martinez-Trujillo 1999; McAdams and Maunsell 2000; Martinez-Trujillo and Treue 2004; Bichot et al. 2005; Maunsell and Treue 2006; Katzner et al. 2009) and in humans (Saenz et al. 2002, 2003; Hopf et al. 2004; Melcher et al. 2005; Sohn et al. 2005; Arman et al. 2006; Serences and Boynton 2007; Andersen et al. 2009, 2011, 2013; Zhang and Luck 2009; Zirnsak and Hamker 2010; Boehler et al. 2011; Liu and Hou 2011; Liu and Mance 2011; White and Carrasco 2011; Festman and Braun 2012; Lustig and Beck 2012; Stoppel et al. 2012), the critical question of what spatially unbound feature selection actually entails remains to be answered.
Behavioral observations and event-related potential (ERP) recordings in humans hint at the possibility that global feature selection arises as a consequence of resolving the competition between feature values inside the focus of attention (Saenz et al. 2003; Zhang and Luck 2009; Moher et al. 2014). For example, Zhang and Luck (2009) had participants attend a stream of superimposed red and green dots continuously changing position in a random manner in one visual field (VF). Subjects were to detect occasional luminance decrements of the dots drawn in one of the colors (target color). Colored dots flashed in the unattended VF elicited an enhanced ERP response (enhanced P1 component) when they matched the target color relative to when they matched the nontarget color. Importantly, this P1 effect was not seen when competition in the target VF was abolished by presenting the red and green dots sequentially. Such an account in terms of feature competition generally aligns with experimental evidence showing that attentional modulations of firing responses in feature-selective cells are typically largest when stimuli compete for processing inside a cell's receptive field (Moran and Desimone 1985; Luck et al. 1997; Reynolds et al. 1999; Treue and Maunsell 1999; Lee and Maunsell 2010).
On the other hand, effects of global feature-based selection were also observed in the absence of feature competition. For example, attention to orientation or motion-direction was found to operate in a spatially global way even when only a single orientation or motion-direction was presented in the focus of attention (Treue and Martinez-Trujillo 1999; McAdams and Maunsell 2000; Martinez-Trujillo and Treue 2004; Bondarenko et al. 2012; Stoppel et al. 2012). Feature competition may therefore not be the only determinant of global feature selection, and indeed, other mechanisms have been proposed like pattern grouping (Levinthal and Franconeri 2011) or the segmentation and grouping of features to form objects (Melcher et al. 2005; Arman et al. 2006; Katzner et al. 2009; Boehler et al. 2011; Festman and Braun 2012).
Here, we use magnetoencephalographic (MEG) recordings in human observers to investigate whether the effects of global feature-based attention to color are caused by the presence (and resolving) of competition between color values in the spatial focus of attention. To this end, results of 4 experiments will be reported. The first experiment will establish the general paradigm and characterize the neuromagnetic indices of global color selection under conditions of color competition. Subsequent experiments will then test whether color competition is critical to global color selection by analyzing whether respective indices are still observed after eliminating color competition in the focus of attention. We will find that global color-based attention is indexed by a sequence of distinct modulations that proceed in reverse hierarchical direction in human extrastriate visual cortex. Eliminating the distractor color from the target object (Experiment 2) will leave the MEG indices of global color selection unaltered. Experiment 3 will go a step further and show that early parts of the modulation reflecting global color selection arise for a target-defining color that is absent in the attentional focus. Finally, Experiment 4 will demonstrate that all indices of global color selection disappear when subjects are asked to simply detect the mere onset of a target stimulus, suggesting that the active discrimination of the target object is a critical determinant.
Materials and Methods
All subjects who took part in Experiments 1–4 were students of the University of Magdeburg, gave informed consent, and were paid for participation (6 €/h of participation). Twenty-one subjects (14 females, mean age 25.8) participated in Experiment 1, 22 (16 females, mean age 25.9) in Experiment 2, 25 (15 females, mean age 25.3) in Experiment 3, and 20 (13 females, mean age 25.8) in Experiment 4. All participants were right-handed, except for one participant of Experiment 4. All experiments were conducted according to the research regulations of the Declaration of Helsinki and were approved by the ethics board of the University of Magdeburg.
Stimuli and Procedure
The stimuli of Experiment 1 are illustrated in Figure 1A. On each trial, subjects were presented with 2 colored circles, one in the left and one in the right VF. The circle in the left VF served as the target and was composed of 2 differently colored half circles. One half circle was always drawn in the target color, and the other half circle was assigned a distractor color. The full circle in the right VF was uniformly colored and served as a color probe. The probe was never attended and completely task-irrelevant. The circles subtended a diameter of 3.1° (visual angle) with the center being placed 3.1° below and 4.9° to the left and right from fixation. Subjects were instructed to fixate the center-cross and exclusively attend the circle in the left VF. The task was to report whether the curved section of the half circle drawn in the target color (red or blue in Fig. 1A) faced left or right with a two-alternative-button press of the right hand (left: index finger, right: middle finger). Target color (red, magenta, or blue) was designated at the start of each trial block, and was randomly assigned to the left or right half circle on subsequent trials. The color of the distractor half circle (gray, green, or yellow) varied randomly from trial to trial, and was never one of the colors that served as the target in other trial blocks. Moreover, the color of the distractor half circle was never simultaneously used as probe color. This was done to prevent object-based selection of the irrelevant color of the target from confounding the effects of global color-based attention to the target color. In fact, we have shown with a comparable stimulus configuration in visual search (Boehler et al. 2011) that a task-irrelevant color contained in an attended object produces a delayed feature bias for that color, which spills over to other unattended objects bearing that color.
The probe's color in the right VF varied randomly from trial to trial among the 3 possible target colors, such that on one-third of the trials the probe matched the target color (match trials), while on the remaining trials the colors did not match (nonmatch trials). The effect of global color-based attention was assessed by comparing the brain magnetic response to match trials with that to nonmatch trials in visual cortex contralateral to the probe (left hemisphere).
Each stimulus array was presented for 300 ms with subsequent trials appearing with a randomly varying SOA between 1300 and 1800 ms (rectangular distribution). Subjects performed 195 trials per experimental block and a total of 9 blocks per session. Each target color was used on 3 experimental blocks forming a total of 6 possible block orders, which were randomly assigned to subjects (target color never repeated on subsequent blocks). Collapsed over the different target colors, each subject performed a total of 585 match and 1170 nonmatch trials throughout the whole experimental session.
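The trial counts given above follow directly from the block structure. The short Python sketch below recomputes them from the stated design values; the function name is ours, and the calculation assumes that the 3 probe colors occur equally often, as implied by the one-third match rate.

```python
# Design values stated in the text for Experiment 1.
TRIALS_PER_BLOCK = 195
N_BLOCKS = 9
N_PROBE_COLORS = 3  # probe color is drawn from the 3 possible target colors


def expected_trial_counts(trials_per_block, n_blocks, n_probe_colors):
    """Return (total, match, nonmatch), assuming equally frequent probe colors."""
    total = trials_per_block * n_blocks
    match = total // n_probe_colors  # probe color equals the block's target color
    nonmatch = total - match
    return total, match, nonmatch


total, match, nonmatch = expected_trial_counts(
    TRIALS_PER_BLOCK, N_BLOCKS, N_PROBE_COLORS
)
print(total, match, nonmatch)  # 1755 585 1170
```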
Stimuli were back-projected by an LCD projector (DLA-G150CLE, COVILEX GmbH, Magdeburg, Germany) from outside the recording booth onto a screen at a viewing distance of 100 cm. The background color of the screen was set to gray at a luminance of 8.3 cd/m². The luminance of the colors of the circles was psychophysically matched once prior to Experiments 1, 2, and 4 based on heterochromatic flicker photometry (Lee et al. 1988) in 3 selected subjects. As Experiment 3 required an additional color, flicker-based luminance matching was repeated in 5 selected subjects. Respective luminance values (average 44.0 cd/m²) were used for all subjects in the reported experiments.
Stimuli, setup, and task of Experiment 2 were similar to Experiment 1, except for the following modifications: 1) On half of the trial blocks, the distractor half circle was removed at the target side, such that only the half circle drawn in the target color was presented (“distractor-absent trials,” Fig. 1B). On the other half of the trial blocks, the stimulation was exactly as in Experiment 1 (distractor-present trials). As in Experiment 1, subjects were required to report whether the curved section of the half circle faced left or right. Subjects performed 6 blocks of distractor-absent trials and 6 blocks of distractor-present trials in alternation. Target color assignment was randomized to form 12 color blocks run in 6 block orders, with each subject performing one of the 6 block orders (target color never repeated on subsequent blocks). 2) To increase trial count within blocks, the between-trial SOA jitter was reduced to a shorter range (1300–1500 ms, rectangular distribution). This yielded a total of 162 trials per block. Collapsed over target color, each subject performed 324 match and 648 nonmatch trials per trial type (distractor-absent/present) throughout one experimental session.
Stimuli, setup, and task of Experiment 3 were similar to Experiment 1, except that on half of the trial blocks, 2 colors were defined as the target for a given trial block (two-color blocks). On the other half of the trial blocks, only 1 color served as the target (one-color blocks). Data of the one-color blocks are not reported here. Trials of the two-color blocks are illustrated in Figure 1C. In a given block, the target half circle could, for example, be red or green. A red probe combined with a red target would form a color “match trial” (left) analogous to Experiment 1. In contrast, a red probe combined with a green target (middle) would form a trial on which the probe matches one of the task-relevant color descriptions held in working memory, but not the actual target color presented in the focus of attention. We refer to these trials as color “cross-match” trials. Finally, there were trials on which the probe color matched neither the target color on the attended side nor the target color held in working memory (nonmatch trials). Subjects performed 8 two- and 8 one-color blocks, with each block containing 120 trials. Sixteen different block orders were randomly assigned to subjects; target color was never repeated on subsequent blocks. Collapsed over target colors of the two-color blocks, each subject performed 192 match and 192 cross-match, and 576 nonmatch trials throughout the experimental session.
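The three trial types of the two-color blocks can be stated as a simple decision rule. The following Python sketch is only an illustration of that rule; the function name and the use of "blue" as a nonmatching probe color are hypothetical and not taken from the experimental code.

```python
def classify_trial(probe_color, presented_target_color, block_target_colors):
    """Label a two-color-block trial as 'match', 'cross-match', or 'nonmatch'."""
    if presented_target_color not in block_target_colors:
        raise ValueError("presented target must be one of the block's target colors")
    if probe_color == presented_target_color:
        # Probe matches the target color shown in the focus of attention.
        return "match"
    if probe_color in block_target_colors:
        # Probe matches the other task-relevant color, which is not shown on this trial.
        return "cross-match"
    return "nonmatch"


block_targets = {"red", "green"}  # example block from the text
print(classify_trial("red", "red", block_targets))     # match
print(classify_trial("red", "green", block_targets))   # cross-match
print(classify_trial("blue", "green", block_targets))  # nonmatch ("blue" is illustrative)
```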
Stimuli, setup, and task of Experiment 4 were similar to Experiment 1, except for the following modifications: 1) On half of the trial blocks, subjects were asked to perform the color/shape task as in Experiment 1 (discrimination task). On the other half of the trial blocks, subjects simply had to detect the onset of the circle on the target side (onset-detection task) with a button press (right index finger). 2) To control correct performance when performing the onset-detection task, we introduced catch trials (20%) on which the probe, but not the target, appeared and subjects were to withhold the response. Catch trials were not presented when performing the discrimination task. 3) To increase the temporal uncertainty of target onset during the onset-detection task the between-trial SOA jitter was set to a range of 1000–1800 ms (rectangular distribution). Subjects performed 6 blocks of onset-detection trials, and 6 blocks of discrimination trials. Block order was randomized to form 6 sets, with each subject performing one of the 6 block orders, which amounted to a total of 324 match and 648 nonmatch trials per trial type (discrimination/onset-detection trials).
Data Recording and Analysis
The magnetoencephalogram (MEG) was recorded using a BTI Magnes 3600 whole-head MEG magnetometer system with 248 sensors (4D Neuroimaging, San Diego, CA, USA). The EEG was simultaneously recorded using a 32-electrode cap; respective data are, however, not reported in detail. Eye movements were monitored (Synamps amplifier, NeuroScan, El Paso, TX) by recording the horizontal and vertical electro-oculogram (HEOG and VEOG) with bipolar electrode placements at the outer canthi of both eyes (HEOG), as well as unipolar electrode placement below the right eye (VEOG). Environmental noise was canceled online based on reference coils (Robinson 1989). The MEG and EOG signals were band-pass filtered DC to 50 Hz and digitized with a sampling rate of 254.31 Hz. Artifact rejection was performed offline by removing epochs with peak-to-peak amplitudes exceeding specific thresholds. Because the data for each subject varied with respect to the size of artifacts (e.g., eye blinks of differing magnitude) relative to the ongoing background brain activity, the artifact rejection was done on an individual basis. Specifically, the raw data for each subject were visually inspected, and thresholds were set in an iterative manner until the data of a given subject were devoid of major artifacts. This led to the rejection of 4–16% of trials, with thresholds ranging between 1.7 and 3.4 × 10⁻¹² T for the MEG (mean: 2.5 × 10⁻¹² T) and between 60 and 130 µV for the HEOG/VEOG (mean: 90 µV).
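The peak-to-peak criterion can be illustrated with a minimal Python sketch. It is a simplified stand-in for the procedure described above, not the analysis code actually used: it applies a single threshold to all channels of an epoch, whereas the text describes separate, individually adjusted thresholds for MEG and EOG channels. Function names and the toy data are ours.

```python
def peak_to_peak(samples):
    """Difference between the largest and smallest sample of one channel."""
    return max(samples) - min(samples)


def reject_epochs(epochs, threshold):
    """Split epochs into (kept, rejected) by peak-to-peak amplitude.

    epochs: list of epochs; each epoch is a list of channels;
            each channel is a list of samples (same unit as threshold).
    An epoch is rejected if any of its channels exceeds the threshold.
    """
    kept, rejected = [], []
    for epoch in epochs:
        worst = max(peak_to_peak(channel) for channel in epoch)
        if worst > threshold:
            rejected.append(epoch)
        else:
            kept.append(epoch)
    return kept, rejected


# Toy data in tesla; 2.5e-12 T is the mean MEG threshold reported in the text.
clean = [[0.0, 0.5e-12, -0.5e-12], [0.2e-12, -0.1e-12, 0.0]]
noisy = [[0.0, 3.0e-12, -2.0e-12], [0.0, 0.0, 0.0]]
kept, rejected = reject_epochs([clean, noisy], threshold=2.5e-12)
print(len(kept), len(rejected))  # 1 1
```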
Co-registration of MEG and Anatomical Data
To co-register anatomical and MEG data, individual anatomical landmarks (nasion, left and right preauricular point) as well as 5 localizer coils placed at standardized positions in the EEG cap (Easycap, Herrsching, Germany) were digitized using 3Space Fastrak System (Polhemus, Colchester, VT, USA).
Primary data analysis was performed in each subject by averaging trials of the same experimental condition (match trials/nonmatch trials). Average event-related magnetic field responses (ERMFs) were computed for a 500-ms time window after stimulus onset including a baseline period 100-to-0 ms prior to stimulus onset. As there was no difference between the 3 different target colors regarding the match versus nonmatch condition, data were collapsed over target color for subsequent analyses. Furthermore, to simplify data analysis and presentation, the response from corresponding magnetic efflux and influx field components was collapsed by averaging efflux and influx waveforms after reversing the polarity of the efflux waveforms. The latter was done to compensate for the reversed polarity direction of the field effects. All ERMF waveforms shown in the figures represent efflux–influx collapsed waveforms.
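The averaging, baseline correction, and efflux–influx collapsing steps can be sketched as follows. This is a minimal illustration under stated assumptions: the function names are ours, waveforms are plain lists of samples, and the baseline window is taken to include −100 ms and exclude 0 ms, a detail the text does not specify.

```python
def average_trials(trials):
    """Sample-by-sample mean over trials of one condition."""
    n = len(trials)
    return [sum(samples) / n for samples in zip(*trials)]


def baseline_correct(waveform, times_ms, start_ms=-100.0, end_ms=0.0):
    """Subtract the mean of the pre-stimulus window from every sample."""
    baseline = [v for v, t in zip(waveform, times_ms) if start_ms <= t < end_ms]
    if not baseline:
        raise ValueError("no samples fall inside the baseline window")
    offset = sum(baseline) / len(baseline)
    return [v - offset for v in waveform]


def collapse_efflux_influx(efflux, influx):
    """Average both waveforms after reversing the polarity of the efflux waveform."""
    return [(-e + i) / 2.0 for e, i in zip(efflux, influx)]


times = [-100.0, -50.0, 0.0, 50.0, 100.0]
erf = average_trials([[1.0, 1.0, 3.0, 5.0, 1.0], [1.0, 1.0, 1.0, 3.0, 1.0]])
print(baseline_correct(erf, times))  # [0.0, 0.0, 1.0, 3.0, 0.0]
print(collapse_efflux_influx([0.0, 2.0, 4.0], [0.0, -2.0, -4.0]))  # [0.0, -2.0, -4.0]
```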
“Statistical validation” of the ERMF modulations was performed using a time-sample-by-time-sample sliding-window t-test (window width of 30 ms) in a time range between 0 and 500 ms after stimulus onset. Correction for multiple comparisons (Bonferroni adjustment) was estimated based on the number of independent variance components of the time-series data. The general logic behind this correction procedure follows the reasoning put forward by Guthrie and Buchwald (Guthrie and Buchwald 1991, p. 241), where the amount of autocorrelation of the data over time is taken as a criterion for correction. To determine the number of independent variance components over time, waveforms of each subject and experimental conditions recorded from selected sensor sites were subjected to an eigenvalue decomposition of the correlation matrix with time samples serving as variables. The number of eigenvalues >1 was then taken as estimate of independent variation in the data to be corrected for multiple comparisons. Respective estimates yielded the following number of independent variance components for the experiments: Experiment 1 = 11, Experiment 2 = 12, Experiment 3 = 13, Experiment 4 = 14.
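One possible reading of the sliding-window test is sketched below in Python. It assumes that each subject's match-minus-nonmatch difference is averaged within a 30-ms window (about 8 samples at 254.31 Hz), that the window means are tested against zero across subjects, and that the window is advanced one sample at a time. These details, the function names, and the nominal alpha of 0.05 are our assumptions; the eigenvalue decomposition itself is not shown, and only the reported component count is used for the correction.

```python
import math
from statistics import mean, stdev

SAMPLING_RATE_HZ = 254.31
WINDOW_SAMPLES = round(0.030 * SAMPLING_RATE_HZ)  # 30-ms window, about 8 samples


def sliding_window_t(diff_waves, window_samples):
    """Paired t value (match minus nonmatch against zero) for each window position.

    diff_waves: one difference waveform per subject, all of equal length.
    """
    n_subjects = len(diff_waves)
    n_samples = len(diff_waves[0])
    t_values = []
    for start in range(n_samples - window_samples + 1):
        window_means = [mean(w[start:start + window_samples]) for w in diff_waves]
        m = mean(window_means)
        se = stdev(window_means) / math.sqrt(n_subjects)
        if se == 0:
            # Degenerate case: no variance across subjects.
            t_values.append(0.0 if m == 0 else math.copysign(math.inf, m))
        else:
            t_values.append(m / se)
    return t_values


def bonferroni_alpha(alpha, n_independent_components):
    """Per-test alpha after correcting for the number of independent components."""
    return alpha / n_independent_components


toy = [[0, 0, 1, 1], [0, 0, 2, 2], [0, 0, 3, 3]]  # 3 hypothetical subjects
t_vals = sliding_window_t(toy, window_samples=2)
alpha_exp1 = bonferroni_alpha(0.05, 11)  # 11 components reported for Experiment 1
```

The resulting t values would then be compared against the critical value for the corrected alpha at n − 1 degrees of freedom.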
Current Source Analysis
For source analysis of the “grand average data,” a distributed source model was estimated using the minimum norm least squares (MNLS) approach as implemented in the Curry 7 Neuroimaging Suite (Compumedics Neuroscan USA Ltd; Fuchs et al. 1999). Source estimates were constrained by realistic anatomical data of the MNI brain (ICBM-152 template). The latter were derived by 3D surface segmentations (boundary element method) of the cerebrospinal fluid space for modeling the volume conductor, as well as the gray matter layer to model the current source compartment. Before computing the grand average across subjects, each subject's set of sensor positions was co-registered with a reference sensor set (selected from 1500 recording sessions) representing the most canonical positioning of the sensor set relative to the anatomical landmarks. Co-registration was performed by projecting each subject's field distribution onto the reference sensor set based on a lead-field inversion approach. That is, the MNI brain was first used to obtain a lead field in each subject, which was then inverted and combined with the MNI-based lead field of the reference sensor set to recompute the field distributions (using an MNLS representation of the data) as if measured with this reference sensor set.
Current source localization in 4 “individual subjects” was performed by constraining inverse estimates with realistic anatomical data obtained from MR images (3 T Siemens Trio Scanner: T1-weighted 3D spoiled gradient echo sequence; 256 × 256 matrix; field of view 25 × 25 cm; 124 slices; slice thickness 1.5 mm; in plane resolution 0.97 mm × 0.97 mm; echo time 8 ms; repetition time 24 ms; flip angle 30°). Specifically, 3D surface segmentations of the subjects' individual gray matter surface (source compartment) were computed using routines of the FreeSurfer (V.5.1.0) software. A 3D surface segmentation of the subjects' cerebrospinal fluid space served as volume conductor compartment, and was computed using the segmentation routines provided in Curry 7. Before source localization, the data of each subject were averaged across all the experimental sessions the subject took part in. Before averaging, the session data were co-registered using the lead-field inversion approach described for the grand average data above.
Retinotopic mapping in the 4 individual subjects was performed with standard methods (Sereno et al. 1995; Engel et al. 1997) using rotating-wedge and expanding-ring checkerboards. Stimuli were composed from a circular patch of 36 × 36 isopolar and eccentricity scaled checkerboard segments, with each segment subtending a width-to-radian ratio of 1:2. The wedge corresponded to one quadrant (90°) of the circular patch and rotated in successive steps of 20°/TR either clockwise or counterclockwise. The ring stimulus covered 25% of the checkerboard (9 eccentricity segments) while expanding or contracting through 18 eccentricity steps during 1 cycle (2 steps per TR). Functional data (TE = 30 ms, TR = 2 s, 90° flip angle, 28 coronal slices perpendicular to the calcarine fissure, voxel size of 2.0 × 2.0 × 2.0 mm) were acquired during 8 runs with each of the 4 stimulus types being presented during 2 runs. One run contained 10 full cycles of stimulation, with 1 cycle being completed after 18 TRs. Fixation performance was controlled by asking the subject to maintain fixation at the center fixation-cross and detect the onset of a small dot randomly appearing every 166–8300 ms (rectangular distribution). In addition, eye movements were continuously surveyed with a custom made eye-tracking system. Prior to the functional scans, a high-resolution anatomical scan was obtained (3T Siemens Trio Scanner: MPRAGE-volume, 1 × 1 × 1 mm resolution), which provided the data for segmenting the cortex layer. Functional scans were realigned to reduce movement artifacts, resliced, and smoothed with a kernel of 2 mm. Structural segmentation and unfolding of the cortical surface was then performed using routines provided with FreeSurfer (V.5.1.0) and FSL (http://www.fmrib.ox.ac.uk/fsl/).
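The stimulation timing follows from the stated values: at 20° per TR the wedge completes 1 cycle in 18 TRs, so a run of 10 cycles comprises 180 TRs (360 s at TR = 2 s). Analyses of such periodic stimulation are commonly phase-encoded, with the response phase at the stimulation frequency indicating the preferred polar angle or eccentricity of a voxel. The Python sketch below illustrates that idea on a synthetic time series; it is not the pipeline used here, the function name is ours, and hemodynamic delay is ignored.

```python
import cmath
import math

TR_S = 2.0
TRS_PER_CYCLE = 360 // 20  # wedge rotates 20 deg per TR, giving 18 TRs per cycle
CYCLES_PER_RUN = 10
TRS_PER_RUN = TRS_PER_CYCLE * CYCLES_PER_RUN  # 180 TRs, i.e. 360 s per run


def response_phase(timeseries, n_cycles):
    """Phase (radians, 0..2*pi) of the Fourier component at n_cycles per run."""
    n = len(timeseries)
    mean_val = sum(timeseries) / n
    coeff = sum(
        (v - mean_val) * cmath.exp(-2j * math.pi * n_cycles * k / n)
        for k, v in enumerate(timeseries)
    )
    # A response cos(2*pi*n_cycles*k/n - phi) gives a coefficient with angle -phi.
    return (-cmath.phase(coeff)) % (2 * math.pi)


# Synthetic voxel responding a quarter cycle after the start of each cycle.
true_phase = math.pi / 2
voxel = [
    math.cos(2 * math.pi * CYCLES_PER_RUN * k / TRS_PER_RUN - true_phase)
    for k in range(TRS_PER_RUN)
]
estimated = response_phase(voxel, CYCLES_PER_RUN)
print(round(math.degrees(estimated)))  # 90
```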
The behavioral performance is summarized in Figure 2A. Shown are the mean (over subjects) response accuracy (% correct responses) and response time (ms), for match (gray bars) and nonmatch trials (white bars). Response accuracy is very high for both match and nonmatch trials with no apparent difference. A repeated-measures ANOVA (rANOVA) confirms that there was no significant effect (F(1,20) = 0.23, P = 0.63). Response time, in contrast, shows a small but significant (F(1,20) = 29.32, P < 0.0005) increase for the match relative to the nonmatch condition. The response slowing turns out to be caused by match trials with the target half circle facing left. Match trials with the target half circle facing right show faster responses, which do not fully compensate for the slowing on the former trials. This RT asymmetry presumably reflects an issue of stimulus-response mapping. A detailed analysis and discussion is provided in Supplementary Figure 1.
Event-Related Magnetic Field Responses
We analyzed the ERMF response elicited in visual cortex contralateral to the probe (left hemisphere) as a function of whether the probe's color matched the target color in the focus of attention (match trials) or not (nonmatch trials), with the response difference between match and nonmatch trials (match minus nonmatch [M−NM] waveform difference) serving as an index of global color-based attentional selection. Figure 3A displays respective M−NM waveform difference (grand average over subjects) at selected sensor sites showing maximum field responses together with field distribution maps at time points of maximum effect size after stimulus onset (arrows). The effect of global color selection is reflected by a sequence of 2 response modulations, one showing a maximum at 200 ms (black trace) and one at 280 ms (gray trace). The topographical maps indicate that the maximum at 200 ms arises over the left lateral occipito-temporal cortex (black ellipse), whereas the second maximum at 280 ms arises over a more posterior occipital left hemisphere region (gray ellipse). Based on the contralateral retinotopic organization of the visual cortex, both maxima in the left occipito-temporal cortex reflect activity modulations related to the unattended color probe in the right VF, thus indexing the selection of target color outside the focus of attention (global feature selection). As is visible in the field distribution maps, the 2 modulation effects appear as efflux–influx field configurations (red–blue fieldlines encircled by the ellipses), with the black and gray waveforms showing the time course of the response collapsed over maximum efflux and influx sensors. To statistically validate the modulation effects, sliding-window t-tests (see Materials and Methods) comparing the response to match versus nonmatch trials were computed in a time range between 0 and 500 ms after stimulus onset. 
Time periods of significant waveform differences for the early and late modulation shown in Figure 3A are highlighted by black and gray horizontal bars, respectively. The early phase of selection onsets at 160 ms, whereas the later phase starts with a delay of 90 ms at 250 ms.
To corroborate that the modulation underlying global color selection separates into an independent early and a later portion, an independent component analysis (ICA) was performed on the ERMF response shown in Figure 3A. The ICA provides a decomposition of the spatial distribution and time course of the field response into statistically independent contributions (Makeig et al. 1996). Hence, a separation of the observed magnetic field effects into different ICs would be evidence for independent modulatory effects. Figure 3B shows the time course and distribution of 2 independent components (IC-1, IC-2) explaining the largest portion of variance of the data (expressed as signal-to-noise ratio). Since the IC-1 and IC-2 map nearly perfectly onto the time course and field distribution of the early and late field modulation shown in Figure 3A, a true separation into spatio-temporally independent modulatory effects appears highly likely.
Note that, for characterizing ERMF effects of global color selection, we decided to compare the response to identical probe colors while varying the attended color in the focus of attention. This was done to avoid confounding color attention effects with low-level sensory differences in processing probe color. Nonetheless, the overall variation of color on the target side may at least partially influence the results. This possibility cannot be ruled out completely. It is, however, possible to address the issue to some extent by comparing match and nonmatch trials with the target color held constant while varying probe color. Respective results are reported in Supplementary Figs 2 and 3. They reveal that M−NM comparisons with the target color kept constant yield the same sequence of modulation effects as when probe color is kept constant, indicating that low-level color differences on the target (or the probe) side are unlikely to account for the global color-based attention effects reported here.
To determine the approximate cortical localization of current source activity underlying the grand average M−NM difference, a distributed source model was computed using the MNLS method (see Materials and Methods). Figure 3C shows current source density (CSD) distributions during the maximum field effects at 200 and 280 ms, together with source waveforms obtained from the CSD maxima. The source analysis confirms the presence of 2 modulatory effects by showing a first CSD maximum in more anterior lateral ventral occipito-temporal cortex (200 ms, black trace) and a second maximum in more posterior medial ventral visual cortex (280 ms, gray trace).
To gain more specific localization information about the early and late modulation effect, we computed respective CSD distributions in 4 selected observers and co-registered the distributions with the subjects' individual retinotopic field-sign map (Sereno et al. 1995) as well as with a localizer of the lateral occipital complex (LOC) (Kourtzi and Kanwisher 2000). Note that the subjects were selected based on the criterion that they took part in at least 3 experiments, each containing 1 control condition matching the setup of Experiment 1. This allowed us to increase the power of localization by pooling each subject's data over those conditions before estimating the CSD maps. Results are shown in Figure 4 where each row displays the data of one subject. As visible, all subjects show a first modulation peaking around 200 ms (white traces). Corresponding current distribution maxima (maps in the left column) are located in or close to LOC as well as in a more posterior-medial area just anterior to V4. The way field-sign mapping was performed here did not allow us to reveal retinotopic structure reliably beyond V4. We believe, however, that the localization of the area is most consistent with areas VO-1/2 (Brewer et al. 2005; Wandell et al. 2007). Subsequent modulation maxima (red traces) around 280 ms display a greater temporal variability across subjects. Corresponding source distribution peaks (maps in the middle column) appear consistently in more posterior retinotopic areas V4 and V3 or even in V1 as in subject ox81. LOC and surrounding areas including VO-1/2 are known to represent a higher hierarchical level of visual representation than areas V3/V4 in human extrastriate cortex, confirming that the 2 modulations arise from different hierarchical levels of representation in visual cortex.
In sum, probes matching the target color elicited an enhanced ERMF response relative to nonmatching probes in visual cortex contralateral to the VF of probe presentation. Experiment 1 therefore demonstrates that global feature-based selection indeed arises under conditions of color competition as operationalized with the present experimental setup. Most notably, global color selection triggers a sequence of independent modulations arising with different onset-latencies (160 and 250 ms) from different hierarchical levels of representation (LOC and V3/V4) in ventral extrastriate visual cortex.
The results of Experiment 1 are consistent with the notion that attention to color operates in a spatially global manner when a target color is selected under competition with a simultaneously presented distractor color. In Experiment 2, we explicitly tested whether global feature selection depends on the presence of a competing color in the focus of attention. To this end, Experiment 1 was modified such that, on half of the trial blocks, the distractor color was removed from the target circle with only the half circle drawn in the target color being presented (distractor-absent trials, Fig. 1B). On the other half of trial blocks, the stimulation was identical to Experiment 1 (distractor-present trials). If color competition in the focus of attention is critical for global feature selection to arise, it should be absent when competition is eliminated by removing the distractor half circle.
While performance accuracy is high for both the distractor-present and the distractor-absent trials, it is generally higher for distractor-present trials than for distractor-absent trials (Fig. 2B). Furthermore, for distractor-present trials, subjects performed slightly better on nonmatch than on match trials. A two-way rANOVA with the factors color match (match/nonmatch) and distractor presence (present/absent) confirms this, yielding a significant main effect of color match (F(1,21) = 5.45, P < 0.05) and distractor presence (F(1,21) = 37.71, P < 0.0005), as well as a significant color match × distractor presence interaction (F(1,21) = 4.82, P < 0.05). Regarding response time, subjects were generally slower on match trials as compared with nonmatch trials, independent of whether distractors were present or absent. This is confirmed by a corresponding rANOVA showing a significant main effect of color match (F(1,21) = 18.65, P < 0.0005) but no other main effect or interaction. As in Experiment 1, the slight response slowing on match trials reflects an issue of response mapping. A detailed analysis is provided in the Supplementary Materials.
Event-Related Magnetic Field Responses
As visible in Figure 5, the prediction that color competition in the focus of attention is critical for global feature selection to arise is not confirmed. The M−NM difference of distractor-absent trials (panel (B)) shows the very same spatio-temporal modulation pattern as distractor-present trials shown in panel (A). That is, the M−NM difference of distractor-absent trials displays a sequence of 2 modulations in left ventral extrastriate cortex, with an initial maximum at 210 ms originating in a more anterior-lateral part, and a later maximum at 280 ms arising in a more posterior part of the ventral extrastriate cortex. Also, the corresponding CSD maps display a distribution of source activity that is rather similar to the one seen for distractor-present trials (panel (A)).
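As a rough illustration of how an M−NM difference and its peak latencies might be obtained from two condition averages, consider the following Python sketch. It is not the authors' analysis pipeline: the synthetic waveforms, the Gaussian-shaped components, the 2 ms sampling step, and the two search windows are all assumptions chosen for illustration, with the synthetic peaks placed at 210 and 280 ms only to mirror the latencies described above.

```python
import math


def difference_wave(match, nonmatch):
    """Point-by-point match-minus-nonmatch (M-NM) difference."""
    return [m - nm for m, nm in zip(match, nonmatch)]


def peak_latency(times_ms, wave, window):
    """Latency (ms) of the largest absolute value inside window (inclusive)."""
    start, stop = window
    in_window = [(abs(v), t) for t, v in zip(times_ms, wave) if start <= t <= stop]
    return max(in_window)[1]


def bump(t, center, width=20.0):
    """Gaussian-shaped deflection used to build the synthetic waveforms."""
    return math.exp(-(((t - center) / width) ** 2))


# Synthetic condition averages (NOT recorded data), sampled every 2 ms.
times_ms = list(range(0, 401, 2))
nonmatch = [0.5 * bump(t, 120, 40.0) for t in times_ms]
match = [
    nm + 1.0 * bump(t, 210) + 0.8 * bump(t, 280)
    for nm, t in zip(nonmatch, times_ms)
]

diff = difference_wave(match, nonmatch)
early = peak_latency(times_ms, diff, (150, 245))  # hypothetical early window
late = peak_latency(times_ms, diff, (250, 350))   # hypothetical late window
print(early, late)
```

Searching for the maximum within predefined windows keeps the two phases separable even when the later component is smaller than the earlier one. In real data, window boundaries would have to be justified independently of the effect being measured.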
In sum, eliminating color competition in the focus of attention did not eliminate the ERMF modulation indices of global color selection. Hence, Experiment 2 speaks against the notion that global color-based selection arises as a direct consequence of color competition in the focus of attention.
What operation then gives rise to these modulations? Given that the subjects' task was to discriminate the curved section of the target half circle, which presumably required discriminating its shape, a reasonable hypothesis would be that global color selection arises from a competition-independent selection process directly involved in the discrimination of the target half circle. Note that, while target color was designated at the start of each trial block in Experiment 2, the discrimination of the target half circle did not necessarily require explicit color discrimination. One obvious interpretation would be that global effects of color selection were mediated by object-based attention, with the bias for color selection being inherited from the shape discrimination of the target. If this is the case, the reported indices of global color selection should not be elicited when probing a task-relevant color that is not contained in the target shape. Experiment 3 addressed this possibility based on a simple modification of Experiment 1 (Fig. 1C). We used 2 colors to define the target during a given trial block. On subsequent trials of this block, one or the other target color (never both) was presented on the target side. This setup permitted us to probe either the target color that was presented in the focus of attention and defined the target shape on a given trial (as in “match trials” of Experiments 1 and 2), or to probe the alternative target color not presented in the focus of attention (cross-match [CM] trials). If object-based selection of color accounts for the global color selection effects, they should be absent when comparing cross-match with nonmatch trials (CM−NM difference).
As shown in Figure 2C, performance on cross-match trials (dark gray bars) differs from the other conditions. While accuracy is generally high, there is a decrease for cross-match trials relative to match and nonmatch trials, with no difference between the latter. An rANOVA with the three-level factor match condition yields a significant effect on accuracy (F2,23 = 10.87, P < 0.0005). Response time is slower for cross-match trials relative to match and nonmatch trials, and responses are also slightly slower to match relative to nonmatch trials, as confirmed by a significant effect of match condition (F2,23 = 42.44, P < 0.0005). As in the previous experiments, the slight response slowing on match trials reflects an issue of response mapping. A detailed analysis is provided in the Supplementary Materials.
Event-Related Magnetic Field Responses
Figure 6B shows that the modulations indexing global color selection are not entirely abolished when comparing the response to cross-match with that to nonmatch trials. The respective CM−NM difference displays an early modulation around 200 ms (130–250 ms) with the corresponding field distribution showing the typical left lateral occipito-temporal efflux–influx configuration. Source localization of this modulation yields a current maximum in anterior-lateral ventral extrastriate cortex. In terms of time course and localization, this early modulation is comparable with the initial maximum of the M−NM difference of this experiment (black trace in [A]). However, a second phase modulation clearly present in the M−NM difference is missing in the CM−NM difference (compare gray traces in [A] and [B]).
Hence, the later more posterior modulation effect seen in Experiments 1 and 2 may in fact reflect global color selection mediated by object-based attention. The early phase modulation around 200 ms, in contrast, reflects the mere presence of a task-relevant color per se. As this modulation appears for a color that is part of the attentional set but absent in the focus of attention, it seems to index the match between probe color and task-relevant color descriptions held in working memory (color template matching).
In sum, Experiment 3 shows that the early and late modulations reflecting global color-based selection separate into functionally distinct components, with an initial component reflecting global color selection as a consequence of the color-probe matching the set of task-relevant colors (color template matching), no matter whether the probe's color is actually presented and discriminated in the focus of attention. The second, later component is presumably elicited as a direct consequence of discriminating the target shape, which biases the selection of its color feature.
Given our interpretation of the above results, one would predict that when target selection does not require target discrimination, no such modulation should be elicited by the probe. Experiment 4 addresses this prediction. Setup and task were generally identical to Experiment 1 except for the following changes. On half of the trial blocks, target color and the shape of the target were irrelevant, and subjects had to detect the onset of the target as fast as possible (onset-detection task). On the other trial blocks, subjects performed the color/shape discrimination task as in Experiment 1 (discrimination task). To control for correct performance on onset-detection trials, we introduced catch trials (20%) on which the probe appeared without the target, and where subjects were required to withhold the response. The onset-detection task does not require discriminating the target; hence, the corresponding late modulation effect should be absent. In addition, onset detection also preempts color-template matching, which predicts that the early phase modulation should also be absent.
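The role of the catch trials can be made concrete with a small Python sketch of an onset-detection block. Only the 20% catch-trial proportion comes from the description above; the block length of 50 trials, the function names, and the scoring rule are hypothetical and serve illustration only.

```python
import random


def build_onset_block(n_trials=50, catch_fraction=0.2, seed=None):
    """Shuffled trial list with a fixed proportion of catch trials.

    "target": target and probe appear, a speeded response is required.
    "catch":  probe appears without the target, the response is withheld.
    """
    rng = random.Random(seed)
    n_catch = round(n_trials * catch_fraction)
    trials = ["catch"] * n_catch + ["target"] * (n_trials - n_catch)
    rng.shuffle(trials)
    return trials


def score_block(trials, responded):
    """Proportion correct: respond on target trials, withhold on catch trials."""
    correct = sum((t == "target") == r for t, r in zip(trials, responded))
    return correct / len(trials)


block = build_onset_block(seed=1)

# An observer who responds on every trial without checking for the target.
always_respond = [True] * len(block)
# An observer who responds only when the target is actually present.
ideal = [t == "target" for t in block]

print(score_block(block, always_respond), score_block(block, ideal))
```

In this sketch, responding indiscriminately to any stimulus onset caps accuracy at the proportion of target trials, so catch trials make it possible to tell genuine target detection from responding to the probe alone.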
As visible in Figure 2D, accuracy was generally high on both the discrimination and the onset-detection task, with performance on the latter being even better than on the discrimination task. No difference, however, appeared when comparing match and nonmatch trials. A two-way rANOVA with the factors color match (match/nonmatch) and task (discrimination/onset-detection) yielded a significant main effect of task (F1,19 = 47.0, P < 0.0005) but no effect of color match. Subjects generally responded faster on the onset-detection task than on the discrimination task. They were also slightly slower on match than on nonmatch trials of the discrimination task (for an explanation of the response slowing, see the Supplementary Materials), while there was no difference between match and nonmatch trials when performing the onset-detection task. The RT pattern is confirmed by a corresponding rANOVA, which revealed significant main effects of task (F1,19 = 233.2, P < 0.0005) and color match (F1,19 = 6.0, P < 0.05), and a significant color match × task interaction (F1,19 = 10.4, P < 0.005).
Event-Related Magnetic Field Responses
As shown in Figure 7A, the pattern of modulations seen for the discrimination task replicates the observations made in the corresponding conditions of the previous Experiments 1–3. Importantly, in line with our predictions, the M−NM difference of the onset-detection task does not show any significant modulation in anterior-lateral and posterior-medial ventral extrastriate cortex, neither in the early nor in the later time range (Fig. 7B). There is a small nonsignificant effect (black trace). However, the source waveforms taken from cortex locations corresponding to the CSD maxima of the discrimination task show no noticeable source activity at all in ventral extrastriate cortex.
To conclude, when target detection preempts the need for target discrimination and color template matching, the neural modulations reflecting global color-based selection are eliminated in ventral extrastriate visual cortex.
The Role of Feature Competition for Global Color-based Attention
The reported experiments together show that, at the neural population level, global color-based attentional selection is reflected by a sequence of functionally and spatio-temporally separable modulations in extrastriate visual cortex, which variably occur depending on task definitions and specific demands on target selection. Importantly, none of the modulations indexing global color selection depended on color competition in the focus of attention. That is, the modulation sequence was observed for a color target presented with (Experiment 1) as well as without a competing distractor color (Experiment 2). Moreover, the initial phase of the modulation sequence appeared for a target color not even presented in the focus of attention (Experiment 3). All modulations in extrastriate visual cortex, however, disappeared when subjects were asked to simply detect the onset of the target without further discrimination (Experiment 4).
These observations together indicate that under the conditions of the reported experiments, resolving color competition in the focus of attention is not a critical determinant for global color-based selection to arise. This conclusion is in apparent contrast with previous studies suggesting that feature competition is a determining factor (Saenz et al. 2003; Zhang and Luck 2009). It is important to acknowledge, however, that the experimental design in Zhang and Luck (2009) differed from the present experiments in critical ways. In our experiments, the target and the probe were simultaneously presented for 300 ms with common onset and offset. In Zhang and Luck (2009), color probes were flashed while subjects continuously attended a dot stream. It is likely that continuous attention led to a different form and depth of feature biasing than the more transient allocation of attention in the present experiments. In fact, the P1 enhancement indexing global color selection in Zhang and Luck (2009) appeared in a time range compatible with a modulation of the feed-forward sweep of processing in visual cortex. The early phase modulation in the present experiments shows an effect substantially later, rather consistent with a modulation due to feedback processing. It is therefore possible that resolving feature competition by biasing the forward processing of the target feature becomes a relevant factor during sustained focusing, but not when attention is transiently allocated in a trial-by-trial manner as in the present experiments.
One could maintain that at least immediately after stimulus onset, early parts of the modulation reflecting template matching arise as a consequence of competition between the target and the probe. That is, the simultaneous stimulus-onset transient in both VFs may have caused spatial attention to be attracted to both VFs, with a color match between target and probe resulting in a stronger attraction to the probe side. The initial phase suggested to underlie template matching may then reflect an increased suppression of the probe that counters this stronger attraction and facilitates refocusing to the target. In fact, ERP recordings in subjects performing visual search tasks revealed that salient distractor items may be associated with a modulation reflecting distractor suppression. This modulation is referred to as distractor positivity, as it typically appears as a positive voltage deflection of the ERP over the posterior scalp contralateral to the distractor location (Hickey et al. 2009; Sawaki and Luck 2010, 2011).
Unfortunately, a direct verification of such distractor positivity in the neuromagnetic response is not possible, as the magnetic analog of the distractor positivity has not been characterized so far. We can, however, assess whether the simultaneously recorded ERP data show such an effect. The respective ERP results were previously reported (Bartsch et al. 2011). Figure 8 replots the M−NM difference of the early modulation effect shown in Figure 3A together with the simultaneously recorded ERP response (Bartsch et al. 2011). The early modulation effect in the ERMF response has its maximum around 200 ms. In this time range, the corresponding ERP response displays an increased negative voltage deflection for the match (thin solid) relative to the nonmatch condition (thin dashed) contralateral to the probe's VF. This negative enhancement is apparently incompatible with a distractor positivity, which should appear as a larger positivity for the match relative to the nonmatch condition. Hence, the early ERMF effect is unlikely to reflect a suppressive modulation as indexed by the distractor positivity (Hickey et al. 2009) to counter a stronger attraction of attention to the matching probe. In addition, a stronger attraction of attention to the probe side on color-match trials would also predict that subjects show more pronounced refocusing toward the target side. This would typically be associated with a bigger negativity contralateral to the target item (referred to as target negativity; Hickey et al. 2009). The ERP topomap in Figure 8 shows no such negative modulation over the posterior scalp contralateral to the target. Hence, an account of the early modulation effect in terms of distractor suppression and/or an augmented refocusing of attention to the target does not fit with the response pattern of the simultaneously recorded ERP data.
In conclusion, spatially global effects of attention to color are found to arise as an early modulatory effect reflecting a preset bias for target-defining colors independent of target discrimination (template matching effect), and as a delayed modulation effect representing a more direct consequence of discriminating target attributes (discrimination matching effect).
Note that target shape selection required explicit color discrimination in Experiments 1 and 3, but not in Experiment 2, where the target half circle was presented in isolation. Nonetheless, template and discrimination matching effects were observed for the target color, which is notable to some extent. The presence of an early modulation reflecting template matching suggests that establishing a color template may not necessarily depend on an explicit definition of its task-relevance. Instead, it may arise from an implicit feature bias due to presenting the same target color on every trial of a given trial block. On the other hand, despite the fact that target color was not essential for performing the shape discrimination task in Experiment 2, it was explicitly designated at the beginning of each trial block prompting subjects to attend to color directly. As such, the present experimental data do not permit us to decide to what proportion the early modulation effect reflects a matching against an explicit or an implicit color template.
The presence of the later discrimination matching effect in Experiment 2 seems to be more puzzling in view of the fact that target shape selection did not explicitly require color discrimination. However, it is likely that target color selection reflects a consequence of object-based attention, that is, that the shape discrimination of the target object caused a subsequent global processing bias for the target's color. In fact, the latency of the discrimination matching effect is consistent with the temporal delay object-based feature biasing is typically associated with (Schoenfeld et al. 2003, 2014). Moreover, object-based attention has been shown to cause global selection of irrelevant object features in unattended objects around 260 ms after stimulus onset (Boehler et al. 2011)—a latency consistent with the time range of the discrimination matching effect observed here.
To summarize, the reported data together indicate that feature competition accounts for neither the early nor the late modulation effect underlying global color selection described in the present experiments. The interpretation that object-based selection underlies the modulation indexing discrimination matching instead supports accounts that have linked global feature selection to the segmentation and grouping of features to form objects (Melcher et al. 2005; Arman et al. 2006; Katzner et al. 2009; Boehler et al. 2011; Festman and Braun 2012).
Global Color-Based Attention is Mediated by Backward-Progressing Activity Modulations in Ventral Extrastriate Cortex
An important observation in all reported experiments is that early and late modulations underlying global color selection arise from different areas in ventral extrastriate cortex, with the temporal order of modulations proceeding backwards from higher to lower levels of representation. Comparable observations were made by Bondarenko et al. (2012) where a spatio-temporal sequence of modulations in higher followed by modulations in lower level extrastriate cortex was documented to underlie global orientation-based attention. Here, we provide a detailed spatio-temporal localization analysis of global color-based attention in individual subjects, which revealed modulation maxima underlying template matching in and around areas LOC and presumably VO-1/2 (LOC is a cluster of areas in ventral-stream visual cortex involved in object-related processing (Malach et al. 1995; Grill-Spector and Malach 2004)). Subsequent modulations underlying discrimination matching showed source maxima in retinotopic areas V4/V3, which are lower in hierarchy than areas LOC/VO-1. Note, comparative imaging in humans and monkeys suggests that LOC most likely corresponds with parts of monkey IT cortex (TEO) (Tootell et al. 2003). Consistently, the observation that hierarchical levels IT and V4/V3 are the locus of template and discrimination matching, respectively, lines up with a number of observations in the monkey. For example, color selective neurons in monkey IT were found to reflect color categorization but not the discrimination of color (Koida and Komatsu 2007). Moreover, single-unit recordings in monkeys performing a memory guided visual search task (Chelazzi et al. 1998, 2001) revealed that neurons in IT display reliable baseline firing enhancements to effective feature cues before search frame onset. These baseline effects were taken to reflect a working memory template of task relevant features and would be a prerequisite of template matching described here. In contrast, recordings in lower tier areas V4/V2 revealed feature-selective responses exclusively in response to stimulus onsets.
It is also worth considering that the temporal separation of modulations underlying global feature-based attention into early and late phases may relate to observations in monkey visual cortex (V2, V4, IT) suggesting a qualitative change of the population firing response over time (Hegde and van Essen 2004, 2006; Matsumoto et al. 2005). That is, the firing response undergoes a qualitative change from an initially more broadly tuned response optimal for grouping stimuli into broad categories, followed by later portions better at discriminating more fine-grained details of individual stimuli (Hegde 2008). Given such evolution from coarse categorical to a more detailed neuronal selectivity, it is possible that the early and later phase of global feature-based selection reflect top-down modulations timed to follow this temporal evolution of coding specificity. Of course, more experimental work is required to verify this possibility.
Taking a more general perspective, the present observations add to a growing body of data suggesting that attention involves multiple phases of selection in reverse hierarchical order in visual cortex, as has been shown for elementary features in the visual system of monkeys (Buffalo et al. 2010) or for more complex feature configurations like faces in monkeys (Sugase et al. 1999; Matsumoto et al. 2005) and in humans (Liu et al. 2002). Furthermore, they qualify the widely held notion that visual attention mediates its effects via recurrent processing in visual cortex (Martinez et al. 1999; Mehta et al. 2000; Noesselt et al. 2002; Juan and Walsh 2003; Boehler et al. 2009; Buffalo et al. 2010; Roelfsema et al. 2010). Finally, our data lend support to theoretical notions that take reverse-order selection in the visual processing hierarchy as a key property of visual attentional selection (Hochstein and Ahissar 2002; Ullman 2007; Hegde 2008; Tsotsos 2011). One fundamental problem of feature selection is how to represent and select every possible task-relevant feature in visual cortex (Maunsell and Treue 2006). Computational considerations indicate that this problem shows exponential complexity that is formally intractable without setting proper architectural constraints (Tsotsos 2011). In view of the present observations, the brain may reduce this problem to a solvable one by constraining feature selection to follow the representation hierarchy of the visual cortex in reverse direction (Ullman 2007). That is, by virtue of reverse hierarchical selection, which first determines task-defined categorical distinctions, and then selects based on more detailed sensory distinctions of the physical input, the combinatorial problem of feature selection will be dramatically reduced.
We thank Michael Scholz for providing software tools for data realignment and visualization. Conflict of Interest: None declared.