In a typical scene with many different objects, attentional mechanisms are needed to select relevant objects for visual processing and control over behavior. To test the role of area V4 in the selection of objects based on non-spatial features, we recorded from V4 neurons in the monkey, using a visual search paradigm. A cue stimulus was presented at the center of gaze, followed by a blank delay period. After the delay, a two-stimulus array was presented extrafoveally, and the monkey was rewarded for detecting the target stimulus matching the cue. The array was composed of one ‘good’ stimulus (effective in driving the cell when presented alone) and one ‘poor’ stimulus (ineffective in driving the cell when presented alone). When the choice array was presented in the receptive field (RF) of the neuron, many cells showed suppressive interactions between the stimuli as well as strong attention effects. Within 150–200 ms of array onset, responses to the array were determined by the target stimulus. If the target was the good stimulus, the response to the array became equal to the response to the good stimulus presented alone. If the target was the poor stimulus, the response approached the response to that stimulus presented alone. Thus the influence of the nontarget stimulus was filtered out. These effects were reduced or eliminated when the poor stimulus was located outside the RF and, therefore, no longer competing for the cell's response. Overall, the results support a ‘biased competition’ model of attention, according to which objects in the visual field compete for representation in the cortex, and this competition is biased in favor of the behaviorally relevant object.
A complex scene will typically contain many different visual objects, few of which are currently relevant to behavior. Thus, attentional mechanisms are needed to select the relevant objects from the scene and to reject the irrelevant ones. Human behavioral studies have shown that relevant objects may be selected on the basis of their spatial location (Posner, 1980) as well as their non-spatial features (Bundesen and Pedersen, 1983), and neurophysiological studies of attentional selection in the ventral, ‘object recognition’, stream of primates have identified a number of neural correlates of spatially directed attention in striate, prestriate and inferior temporal cortex (Moran and Desimone, 1985; Motter, 1993; Connor et al., 1996, 1997; Luck et al., 1997; McAdams and Maunsell, 1999a, 2000; Reynolds et al., 1999). When multiple stimuli compete within the receptive field (RF) of neurons in areas V2, V4 and inferior temporal (IT) cortex, spatially directed attention has been shown to gate neural responses. Responses are determined primarily by the attended stimulus, and responses to irrelevant distracters within the RF are largely blocked (Moran and Desimone, 1985; Luck et al., 1997; Reynolds et al., 1999). These effects of attention are most pronounced when the competing stimuli occupy the same RF, although, in some cases, the response to a single stimulus within the RF also appears to be enhanced with attention in areas V1, V2 and V4 (Motter, 1993; Connor et al., 1996, 1997; McAdams and Maunsell, 1999a, 2000).
Less is known about ventral stream mechanisms for object selection based on non-spatial features, such as shape or color, although studies of IT cortex suggest that they may be similar to those involved in spatial attention. In previous studies of non-spatial attention in the anterior IT cortex, we investigated the responses of neurons recorded while monkeys performed a memory-guided visual search task (Chelazzi et al., 1993, 1998). In this task, the monkey was first presented with a cue stimulus at fixation. After a brief, blank delay period, an array of stimuli was presented extrafoveally, and the monkey was rewarded for making a saccadic eye movement to the target stimulus in the array matching the previous cue. The array typically contained both a ‘good’ stimulus that elicited a strong response from the cell when presented in isolation, and a ‘poor’ stimulus that elicited little or no response when presented in isolation. Similar to what had been found in the studies of spatially directed attention described above, we found that responses to the array were largely determined by the target stimulus, at least when the array was contained within the contralateral visual field. For example, if the target stimulus was the good stimulus for the cell, the response was comparable to that elicited by the good stimulus alone; however, if the target stimulus was a poor stimulus, the response to the good stimulus in the array was suppressed, resulting in a response comparable to that elicited by the poor stimulus alone. Although this modulation of responses by attention did not occur at the initial onset of the neuronal response, it occurred ~100 ms before the initiation of the eye movement. Similar results were found in a task in which the monkey released a bar if the target stimulus was present instead of making an eye movement and, therefore, the results likely were caused by attentional mechanisms and not the motor response. 
Thus, after a short period of processing, visual processing in IT cortex is largely restricted to behaviorally relevant stimuli. We refer to this attentional modulation of responses of IT neurons to the search array as the ‘target effect’.
In addition to the target effect, IT neurons also showed a modulation of their activity by attention prior to the onset of the array. During the blank delay period preceding the onset of the array, cells maintained a higher firing rate on trials in which their good stimulus was the cue and target than on trials with a poor cue and target. That is, the IT cells representing the target features discharged spikes as though they had been primed, or biased, by the cue, before the presentation of the array.
Both the target effect and the maintained activity during the delay have been interpreted in the context of a Biased Competition model of attention (Desimone and Duncan, 1995; Luck et al., 1997; Chelazzi et al., 1998; Reynolds et al., 1999). According to this model, stimuli in the visual field activate corresponding neuronal populations throughout the ventral stream cortical areas. These activated populations engage in mutually suppressive interactions, which are strongest when the stimuli occupy the same RF. The suppressive interactions are then biased in favor of one of the competing populations by ‘top-down’ signals specifying the properties of the stimulus of interest in a given behavioral context. The winning population is then released from suppression and, in turn, further suppresses the activity of cells in competing populations.
The aim of the present study was to determine whether neural mechanisms similar to those mediating visual search in IT cortex are present in area V4, an earlier stage of processing in the ventral stream. Unlike cells in IT cortex whose RFs often include the entire central portion of both hemifields, cells in V4 have restricted, retinotopically organized RFs that are typically 3–5° in size in the central visual field. A critical question was whether any neural correlate of visual search in V4 would be strongest with stimulus configurations in which the competing stimuli were restricted to the vicinity of the same RF, as has been found in spatial attention studies (Moran and Desimone, 1985; Luck et al., 1997). Furthermore, V4 neurons typically have simpler feature selectivity than is found in IT cortex. It was possible that visual search utilizing complex combinations of stimulus features held in memory would not affect visual processing in the early and intermediate visual areas of the ventral stream.
Materials and Methods
Two adult male rhesus monkeys weighing 7.5–9.9 kg were used. The general methods were described previously (Miller et al., 1993) and will only be briefly described here. Under aseptic conditions, a post for holding the head, a recording chamber, and a scleral eye coil for monitoring eye position (Robinson, 1963) were implanted while the monkeys were under isoflurane anesthesia. A recording chamber was placed over the prelunate gyrus of both hemispheres in one animal and the right hemisphere in the other animal, and the prelunate gyrus was located in stereotaxic coordinates on the basis of a preoperative magnetic resonance imaging (MRI) scan. The two animals in the present study were the same as two of the three animals used in our previous investigation of cells in IT cortex during visual search (Chelazzi et al., 1993, 1998).
To facilitate comparison with the previous data in IT cortex, the stimuli were the same as used in the IT study (Chelazzi et al., 1993, 1998). They consisted of a set of 24 complex, multicolored pictures presented on a computer graphics display. The stimuli ranged in size from 1° × 1° to 2° × 2° and were digitized from magazine pictures. Some depicted identifiable objects (e.g. fruit, tools, faces and other body parts of humans and monkeys), while others were meaningless colored textures and patterns. We made no attempt to find ‘optimal’ stimuli; it was only necessary that the stimuli elicit a range of responses from each cell.
For each individual cell, we selected three stimuli from the set to use in the experiment, while the animal performed a simple fixation task (below). The stimuli were chosen such that one elicited a strong response from the cell and one elicited a weak response or no response. We will refer to these as the ‘good’ and the ‘poor’ stimuli, respectively. A third, ‘neutral’, stimulus was also selected without any specific response requirement, but in many cases it elicited a response that was intermediate between those elicited by the good and poor stimuli.
The basic task is shown in Figure 1. Each trial began with the presentation of a fixation target (0.1° white spot) at the center of the display, which the monkey was required to fixate. After an interval of 700–1000 ms, a cue stimulus was presented at the center of gaze for 300 ms, followed by a 1500 ms blank delay period. The fixation target remained visible during the delay, and the animal was required to maintain fixation within a 1° diameter window from the beginning of the trial until the end of the delay interval. Eye movements at any time from the onset of fixation to the end of the delay were counted as errors, and the trial was aborted.
At the end of the delay, an array of two stimuli was presented extrafoveally. On ‘target-present’ trials, one of the stimuli (the target) matched the previous cue, and the other (the distracter) did not. The monkey was required to make a saccade to the target within 700 ms of array onset. After the monkey fixated the target for 150 ms, the display was turned off, a drop of juice reward was given and the trial was terminated. Eye movements to the distracter at any time were counted as errors and immediately terminated the trial. On ‘target-absent’ trials, neither of the two stimuli in the array matched the cue. On these trials the array was presented for 600 ms, followed by a 1000 ms delay period, and the monkey was required to maintain fixation on the fixation target during this entire period. At the end of the delay, a single stimulus matching the cue was presented extrafoveally, and the monkey was rewarded for making a saccade to it. Half of the trials were target-present trials and half were target-absent trials.
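The trial logic described above can be summarized in a short sketch. This is only an illustration of the scoring rules stated in the text, not the software used in the experiment; the names `Trial`, `trial_type` and `is_correct` are hypothetical, and the function scores only the array period (it does not model the final single-stimulus saccade on target-absent trials or the 150 ms target fixation).

```python
from dataclasses import dataclass

# Response deadline from the task description (ms after array onset).
SACCADE_DEADLINE_MS = 700


@dataclass(frozen=True)
class Trial:
    cue: str        # label of the cue stimulus, e.g. "good", "poor", "neutral"
    array: tuple    # labels of the two stimuli in the search array


def trial_type(trial):
    """A trial is target-present when one array stimulus matches the cue."""
    return "target-present" if trial.cue in trial.array else "target-absent"


def is_correct(trial, saccade_to=None, latency_ms=None):
    """Score behavior during the array period.

    saccade_to: label of the stimulus the first saccade landed on,
                or None if central fixation was maintained.
    latency_ms: saccade latency relative to array onset, or None.
    """
    if trial_type(trial) == "target-present":
        # Correct only if the saccade goes to the cued stimulus in time.
        return (
            saccade_to == trial.cue
            and latency_ms is not None
            and latency_ms <= SACCADE_DEADLINE_MS
        )
    # Target-absent: any saccade to an array stimulus is an error.
    return saccade_to is None
```

For example, `is_correct(Trial("poor", ("good", "poor")), "good", 230)` is `False`, because the saccade went to the distracter.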
The goal of the task was to measure responses of V4 neurons to a given stimulus in the array on trials when it was the attended target versus on trials when it was an irrelevant distracter. Since in one of the array configurations both stimuli were presented inside the RF of the cell (see below), such a comparison was made possible by constructing stimulus arrays comprising one good stimulus and one poor stimulus for the neuron under study. With such arrays, even though two stimuli are simultaneously present inside the RF, one can infer that the neural response is predominantly determined by the good stimulus for the cell. The poor stimulus can be treated as though it were effectively outside the RF. We could then measure the effects of attention on the response to the good stimulus by comparing the firing rate in trials where the monkey attended to the good stimulus in the array versus trials where the monkey attended to the poor stimulus. On target-present trials, the cue at the start of the trial determined which stimulus was the target. For example, on trials when the cue was the good stimulus, the animal was rewarded for selecting the good stimulus in the array as the target, whereas on trials when the poor stimulus was the cue, the good stimulus became behaviorally irrelevant and the animal was rewarded for selecting the poor stimulus as the target.
On target-absent trials, we measured the response to the array with the neutral stimulus for the cell as the cue. This gave a measure of the response to the good and poor stimulus paired together that was independent of target selection. Although the focus of the study was on the response to arrays composed of the good and poor stimulus, the trials were completely balanced so that the good, poor and neutral stimuli appeared equally often as a cue, target and distracter in the arrays.
Stimuli in the array were presented along an imaginary circle centered on fixation, usually at an eccentricity of 4–8°. The eccentricity was selected independently for each cell such that the imaginary circle intersected the most sensitive portion of the RF under investigation. Different spatial configurations of the array relative to the RF boundary were used. In the ‘inside/inside’ condition, both stimuli in the array were entirely contained within the boundaries of the RF. The spatial separation between the two stimulus locations was typically 2.5–4° of visual angle. The selection of the two locations for each cell was based on initial mapping of its RF (minimum response field method), by using a mouse to move the good stimulus through the visual field while the animal was maintaining central fixation.
In addition to the inside/inside configuration, some cells were also tested using one of two ‘inside/outside’ configurations. In the ‘inside/near-outside’ configuration, the stimulus pair was presented across the RF border, with one stimulus inside and the other one just outside the RF border. The outside stimulus was always presented within the same visual field quadrant, and the spatial separation between the two stimuli was the same as that used in the inside/inside configuration for the cell (thus in the range of 2.5–4° of visual angle). In the ‘inside/far-outside’ configuration, one stimulus was presented inside the RF of the cell, and the other was presented within the same hemifield but in the opposite quadrant, at a distance of 5–10°. Because all cells were recorded from the dorsal portion of the prelunate gyrus (see below), all RFs were located in the lower contralateral quadrant.
Regardless of the specific spatial configuration of the search array, the relative locations of the stimuli within the array varied randomly across trials, and the animal had to find the target based on its features.
On some trials, the search array was replaced by a single stimulus, which was either the good or poor stimulus for the cell. The stimulus appeared randomly at each of the positions used for the two-stimulus arrays. The stimulus was equally often a target and a nontarget, depending on the preceding cue. These trials were treated as target-present and target-absent trials, respectively, which are described in the previous section. All other conditions of the task were the same as in the task with two-stimulus arrays.
Blocking of Trials
Cells were typically studied with 400–480 correct trials, which allowed for 10–12 correct trials for each trial type. Trials using two-stimulus arrays and one-stimulus arrays comprised two-thirds and one-third of the total trials, respectively. Two-stimulus array and one-stimulus array trials were run in separate blocks. A given block typically contained 10–30 trials, and each block was typically repeated two or three times, randomly interleaved, during the recording of an individual cell.
All cells were studied using a ‘blocked cue’ version of the two tasks, in which the same stimulus was used as the cue for 10–30 trials in a row. Thus, trials were blocked according to the cue and the number of stimuli in the search array (one-stimulus arrays and two-stimulus arrays).
Lever Release Task
Cells studied in the lever release task were tested using a combination of two-stimulus arrays and single-stimulus arrays.
The search task with saccades described above required both an eye movement and the explicit localization of the target. To test whether these two factors were necessary for any neuronal effects of attention in the search task, we tested some cells from a second monkey using a variation of the task in which the behavioral response was a lever release instead of a saccade. The monkey grasped a lever to initiate the trial. A fixation target then appeared at the center of the display, which the monkey was required to fixate for the remainder of the trial. If the animal broke fixation, the trial was terminated. Following an interval of 700–1000 ms, a cue stimulus was presented over the fixation target for 300 ms, followed by a 1500 ms blank delay period. At the end of the delay, a search array was presented for 500 ms inside the RF of the recorded cell. On half the trials (target-present, or match, trials) the array contained a stimulus that matched the initial cue, and the monkey was rewarded for releasing the lever within 700 ms of array onset. On the other half of the trials (target-absent, or nonmatch, trials) neither stimulus in the array matched the initial cue and the monkey was rewarded for holding the lever for an additional 1000 ms delay from array offset, at which time the presentation of a final matching stimulus signaled the monkey that it should release the bar.
On some trials, presented in separate blocks, the search array contained only a single stimulus. If the stimulus matched the previous cue, the trial was treated as a target-present trial, whereas if it did not match the previous cue, it was treated as a target-absent trial (see above).
All cells were studied with a ‘blocked cue’ design, in which the same cue stimulus was used for 10–30 trials in a row before switching to another cue or another task.
The fixation task was used only for the initial characterization of the general response properties of each isolated cell. The monkey was simply required to maintain fixation within 1° of a central fixation target for an interval of 3–5 s to receive juice reward. While the monkey was fixating, different stimuli were presented in rapid succession (~2 stimuli/s) in the region of the lower quadrant contralateral to the recording hemisphere where responses could easily be evoked by the onset of visual stimuli. This allowed us to explore the stimulus preference of each cell. Once a good (and a poor) stimulus for a given cell was identified, the same fixation task was used to map the cell's RF (minimum response field method) by manually controlling the movement of the stimulus.
Attentional effects were evaluated at both the single cell and population level using ANOVAs and t-tests. When statistical tests were conducted individually on every cell in a population, a P < 0.05 criterion was used to evaluate whether the test was significant. Cells were assessed for visual responsiveness by conducting paired t-tests on the response to each stimulus presented alone inside the RF in a time window from 50–200 ms post-stimulus onset, compared with the firing rate in a 300 ms prestimulus period. Visual selectivity was assessed by conducting an ANOVA and post hoc t-tests on the responses to the different stimuli presented individually inside the RF. Population response histograms were created by averaging the responses of all neurons, with time bins of 10–50 ms. It made virtually no difference whether the population histograms were averaged from actual firing rates or from responses normalized to the peak rate; therefore, the figures show the unnormalized responses so that the firing rates can be easily appreciated.
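The responsiveness test can be illustrated with a minimal sketch. It assumes spike times are expressed in ms relative to stimulus onset and uses the two windows stated above (50–200 ms response, 300 ms prestimulus baseline); the function names are hypothetical, and only the paired t statistic is computed, since converting it to a P value requires the t distribution from a statistics library.

```python
import math
import statistics


def firing_rate(spike_times_ms, start_ms, end_ms):
    """Mean firing rate (spikes/s) in the half-open window [start_ms, end_ms)."""
    n = sum(1 for t in spike_times_ms if start_ms <= t < end_ms)
    return 1000.0 * n / (end_ms - start_ms)


def response_and_baseline(spike_times_ms):
    """Return (rate in 50-200 ms window, rate in the 300 ms prestimulus window)."""
    return (
        firing_rate(spike_times_ms, 50, 200),
        firing_rate(spike_times_ms, -300, 0),
    )


def paired_t(responses, baselines):
    """Paired t statistic over trials: mean difference / its standard error."""
    diffs = [r - b for r, b in zip(responses, baselines)]
    n = len(diffs)
    return statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
```

In use, `response_and_baseline` would be applied to each trial of a given stimulus, and the resulting per-trial pairs passed to `paired_t` (degrees of freedom: number of trials minus one).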
At the conclusion of the experimental sessions, fluorescent dyes were injected through a cannula at the boundaries of the recording area. A few days later, following an overdose of sodium pentobarbital, the animals were perfused transcardially with formalin. Sections were cut every 50 μm, stained with thionin and examined for electrode tracks and dye marks. Although older tracks could not be visualized, recording sites could be inferred from the identifiable tracks and the location of the dye marks.
Both animals performed the task with a high level of accuracy during the recording sessions. Fixation errors (before the presentation of the choice stimuli) were made on 8% of the trials, and these trials were excluded from the following performance scores. The animal studied with the saccade version of the task made a saccade toward the correct stimulus on 88% of the target-present trials. On target-absent trials of the same task, the animal correctly maintained central fixation on 73% of the trials and made a saccade to one or the other nontarget on the remaining trials. Performance of the animal studied with the bar-release version of the task was 91% correct.
On the basis of the histological reconstruction of the recording sites, we could establish that all recordings were from the surface of the anterior portion of the prelunate gyrus, between the lunate and superior temporal sulci, where the V4 representation of the lower contralateral quadrant is located (Zeki, 1973; Maguire and Baizer, 1984; Gattass et al., 1988). Consistent with the location of the recording sites, the RFs of the recorded neurons were confined within the inferior contralateral quadrant, with an eccentricity of 4–8°. A total of 177 cells were recorded from three hemispheres of the two monkeys. Of these, 13 did not give a significant response to any of the stimuli used (paired t-test comparing stimulus-evoked response to baseline firing rate), and they will not be considered further. In addition, 70 cells did not show any significant selectivity among the stimuli tested (ANOVA, P > 0.05), and these will also be ignored in the following description of the results. The remaining 94 cells, which were both significantly responsive and selective, are the focus of the analyses reported below. Of these 94 cells, 81 were studied with the saccade version of the task, and 13 were studied with the bar-release version of the task.
Search Task with Saccades: Responses to the Cues and Cue-related Delay Activity
Because nearly all of the recorded neurons had non-foveal classical RFs, only a few of them had a clear excitatory response to the onset of the cue stimuli presented at the fovea. Out of the 81 stimulus-selective cells studied with the saccade version of the task, only four cells (4.9%) gave a significant excitatory response to at least one of the cue stimuli, while 55 neurons (67.9%) were significantly inhibited by one or more of the cues. Both excitatory and inhibitory responses to the cue stimuli located on or near the boundary of the classical RF were very small compared with the responses elicited by stimuli presented inside the RF. Across all 81 stimulus-selective cells, the average response to the good stimulus presented alone inside the RF was 29.3 spikes/s (± 22.3 SD), and the average response to the poor stimulus alone presented inside the RF was 13.4 spikes/s (± 15.8 SD). Baseline firing in the 300 ms time window preceding stimulus onset for the same cells was 3.9 spikes/s (± 3.8 SD).
Although the cue stimuli were typically presented outside the RF, we tested the cells for differential activity during the delay interval following the cue. In particular, we asked whether cells tended to have a higher maintained activity in the delay following the good stimulus presented as the cue than following the poor stimulus presented as the cue, as had previously been found in IT cortex (Chelazzi et al., 1993, 1998). To assess differential delay activity across the population, we computed a t-test (evaluated at P < 0.05) for each cell comparing the firing rate in the last 500 ms of the delay interval following the good cue versus following the poor cue. Unlike what we had previously observed in IT cortex of monkeys performing the same search task, only a few of the V4 cells showed significantly different activity in the delay depending on the preceding cue. Of the 81 stimulus-selective neurons, only nine cells (11.1%) showed significant cue-specific activity during the delay. For eight of these cells, the delay activity was higher following the good than the poor cue (3.5 versus 2.6 spikes/s), while for the remaining cell the activity was higher in the delay following the poor than the good cue (3.9 versus 2.9 spikes/s).
Search Task with Saccades: Two-stimulus Arrays Confined to the RF (Inside/Inside Configuration)
Of the 81 stimulus-selective cells, five could not be used for the analysis of responses in the inside/inside condition because only one stimulus location turned out to be inside the RF for these cells when responses were subsequently analyzed after the recordings. The inside/inside analysis is therefore restricted to the remaining 76 neurons, in which both stimuli in the array were contained within the RF.
The response to physically identical search arrays varied considerably according to which stimulus was the target. Figure 2 shows the responses of an individual cell to the search array confined to the RF on trials where the good stimulus versus the poor stimulus was the target, and Figure 3 shows the same comparison conditions for the entire population of 76 stimulus-selective cells. Both the single cell example (Fig. 2A) and the population average histogram (Fig. 3A) show that the early phase of the response to the array inside the RF was unaffected by the preceding cue, i.e. by whether the good or poor stimulus in the array was the target. However, starting ~150 ms after the onset of the array, there was a strong target effect. If the target was the good stimulus for the recorded cell, then the activity remained high until about the time of the saccade to the target (indicated by the black vertical bar), at which time the eye movement moved the stimuli outside the RF and the firing rate rapidly dropped to baseline levels. By contrast, if the target was the poor stimulus for the recorded cell, the firing rate was strongly suppressed during the same time period. Thus, consistent with our previous findings in IT cortex (Chelazzi et al., 1993, 1998), cells initially responded to the good stimulus in the RF regardless of whether it was the target or the distracter. Responses to the good stimulus soon became suppressed, however, when the poor stimulus was the target and the good stimulus was behaviorally irrelevant.
Inspection of the population histograms in Figure 3A also provides an opportunity to confirm the results from the analysis of activity during the delay interval described above. The histograms show no clear difference in sustained activity between good-cue and poor-cue trials in the time interval preceding array onset across the entire cell population, consistent with the finding that few individual cells showed significant differential delay activity during this period.
To determine the time at which the population response to the good stimulus became suppressed when the poor stimulus was the target, we computed a paired t-test (evaluated at P < 0.05) on each 10 ms bin in the population histograms. The onset of suppression was defined to begin at the first of two consecutive bins that showed a significant difference in response on the good- versus poor-target trials. According to this analysis, the response to the good stimulus became significantly suppressed at 150–160 ms after array onset when the poor stimulus was the target. Bins with a significant difference in activity between the two comparison conditions are marked by the empty circles in Figure 3A. Thus, the target effect began well before saccade onset, which on average was 237 ms after array onset (black vertical bar in Fig. 3A).
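The onset criterion (the first of two consecutive significant 10 ms bins) can be sketched as follows. The sketch assumes the per-bin P values from the paired t-tests have already been computed and that the first bin starts at array onset; the function name and argument names are illustrative only.

```python
def onset_of_effect(p_values, bin_ms=10, alpha=0.05, run=2):
    """Find the first bin that begins `run` consecutive significant bins.

    p_values: one P value per time bin, in temporal order, with the first
              bin assumed to start at 0 ms (array onset).
    Returns (start_ms, end_ms) of the onset bin, or None if no run is found.
    """
    for i in range(len(p_values) - run + 1):
        if all(p < alpha for p in p_values[i:i + run]):
            return (i * bin_ms, (i + 1) * bin_ms)
    return None
```

Requiring two consecutive bins means that an isolated significant bin, which can arise by chance when many bins are tested, does not define the onset.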
To further examine the relationship between the target effect and saccade onset, we time locked the neural responses to the saccade onset in each trial. Figures 2B and 3B show the histograms of the response to the search array time locked to the onset of the saccadic eye movement, for the single cell example and for the population average, respectively. The target effect started well in advance of any change in the retinal input caused by the eye movement. We repeated the time series of t-tests on each 10 ms bin in the population average, and these tests indicated that a significant target effect began 70–80 ms before the saccade to the target.
In the early time window the average TEI across the population of 76 cells was 0.06 (± 0.02 SEM; corresponding to a 12.8% change in firing), while it increased to 0.24 in the late window (± 0.03 SEM; corresponding to a 63.2% change in firing). The difference between the TEI in the two windows was highly significant (t-test, P < 0.001), indicating that the target effect became much stronger in the period preceding the onset of the behavioral response. Figure 4 shows a scatterplot of the TEI in the early versus late time window for each cell. Most points in the plot (67/76) fall to the left of the diagonal, indicating that for most cells, the target effect increased from the early to the late window.
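The percentage changes quoted with the TEI values are consistent with a contrast index of the form (G − P)/(G + P), where G and P are the responses on good-target and poor-target trials. The definition of the index is not reproduced in this section, so that form is an assumption here; under it, the conversion from index to percentage change can be checked with a short sketch (function names are hypothetical).

```python
def tei(good_target_rate, poor_target_rate):
    """Contrast index (G - P) / (G + P); assumed form, see lead-in."""
    return (good_target_rate - poor_target_rate) / (
        good_target_rate + poor_target_rate
    )


def percent_change_from_tei(index):
    """Percent by which G exceeds P for a given index.

    From index = (G - P) / (G + P) it follows that G / P = (1 + index) / (1 - index),
    so the percent change (G / P - 1) * 100 equals 200 * index / (1 - index).
    """
    return 200.0 * index / (1.0 - index)


print(round(percent_change_from_tei(0.06), 1))  # 12.8
print(round(percent_change_from_tei(0.24), 1))  # 63.2
```

Both values match the percentages reported for the early and late windows, which supports, but does not confirm, the assumed form of the index.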
We also asked how many individual cells showed a significant target effect. This was determined by a t-test (evaluated at P < 0.05) comparing responses on good-target and poor-target trials, in both the early and late time windows. In the early window, 7 out of 76 cells (9.2%) showed a significant positive target effect (greater responses on good-target than poor-target trials), while an additional cell showed a significant effect in the reverse direction. In the late window, the number of cells with significant positive target effects increased to 29 (38.2%), with one cell showing a significant negative effect. A repeated-measures ANOVA with target (good target versus poor target) and time (early versus late window) as main factors showed a significant interaction (P < 0.05) between the two for 28/76 neurons (36.8%). Thus, consistent with the analysis of the TEI, the effect of selecting the good or the poor stimulus in the array became much more pronounced after the initial response to the array, with many more cells showing a positive target effect in the late time window than in the early window.
Even in the late time window not all cells showed a significant target effect, however. We asked whether this might be due in part to the fact that some cells did not show strong response differences between the good and poor stimuli that we selected for the attentional experiment. If the response to the good stimulus were similar to the response to the poor stimulus, then we predicted that there would be a correspondingly small difference in response when selecting one versus the other as the target. To test this, we examined the relationship between the target effect and stimulus selectivity. First, for each cell we computed a Stimulus Selectivity Index (SSI) using the formula:
Finally, we compared the responses to the search array in the good-target and poor-target trials, on the one hand, with the responses to the two component stimuli presented in isolation, on the other. For the latter two responses we used the data from the one-stimulus arrays in target-present (matching) trials, after we had established (not shown) that the responses on these trials were equivalent to the responses on the target-absent (non-matching) trials.
Figure 5 shows the population histograms for the four conditions, synchronized on stimulus onset. Regardless of which stimulus was the target, the early phase of the response to the array (between ~75 and 175 ms after array onset) was somewhat smaller than the response to the good stimulus presented alone, suggesting that the response to the good stimulus in the array was initially suppressed by the presence of the poor stimulus. However, by the time of the behavioral response, which occurred on average 237 ms after array onset with two-stimulus arrays and 210 ms after array onset with one-stimulus arrays, the response to the array was strongly modulated according to which stimulus was the target. When the good stimulus was the target, the suppressive effect of the poor stimulus was eliminated, and the response to the array became indistinguishable from the response to the good stimulus presented alone. By contrast, when the poor stimulus was the target, the excitatory influence of the good stimulus was greatly reduced and the response to the array tended to approach the response to the poor stimulus presented alone. Thus, the effect of target selection was to largely eliminate the influence of the distracting stimulus on the response to the target. As a result, around the time of the behavioral response of the animal, the cells' firing rates approached the firing rates that would have been obtained had the target stimulus been presented alone.
Search Task with Saccades: Two-Stimulus Arrays Presented across the RF Boundary (Inside/Near-outside and Inside/Far-outside Configurations)
The next step was to determine whether the target effect depended on both competing stimuli being present within the same RF of the recorded neuron. We therefore examined trials in which one stimulus was located inside the RF and one outside. Although the position of the good and poor stimulus relative to the RF boundary (one inside, the other outside) was completely unpredictable from trial to trial, we focused on trials in which the good stimulus was located inside the RF and the poor stimulus outside the RF. Much reduced responses, if any, were obtained with the reverse configuration (i.e. the good stimulus outside and the poor stimulus inside the RF).
For some cells the outside stimulus location was placed just outside the classical RF (within the same quadrant), while for other cells the outside stimulus location was placed at a much greater distance in the opposite quadrant of the same visual hemifield. The results from the two configurations will be described separately in the next two sections.
A total of 24 significantly responsive and stimulus-selective cells were tested in this condition. None of these cells gave a significant excitatory response to the good stimulus presented alone at the stimulus location just outside the border of the RF.
Figure 6 compares the population response to the search array, averaged across the 24 cells, when the good stimulus (inside the RF) versus the poor stimulus (outside the RF) was the target. In Figure 6A the spike trains were synchronized to the onset of the search array, while in 6B the same data were synchronized to saccade onset. The population responses to the search array in the inside/near-outside condition did not show any clear effect of target selection until ~150 ms after array onset (Fig. 6A). Starting at about this time, however, neuronal responses clearly diverged depending on whether the good or the poor stimulus in the array was the target, with activity staying higher in the former than in the latter condition. Although these effects were qualitatively similar to those found in the inside/inside condition, the magnitude of the effects appeared to be smaller.
To determine the time at which the population response to the good stimulus began to show signs of suppression when the poor stimulus was the target, we computed a paired t-test (evaluated at P < 0.05) on each 20 ms bin in the population histograms. The onset of suppression was defined to begin at the first of two consecutive bins that showed a significant difference in response on the good- versus poor-target trials. According to this analysis, the response to the good stimulus became significantly suppressed at 160–180 ms after array onset when the poor stimulus was the target (average saccadic latency in this condition was 240 ms).
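The onset criterion described above (the first of two consecutive significant 20 ms bins) can be sketched as follows. This is a minimal illustration, assuming the per-bin P values from the paired t-tests are already available. The function name, its parameters and the example P values are hypothetical and are not the recorded data.

```python
def onset_of_effect(p_values, bin_ms=20, alpha=0.05, start_ms=0):
    """Return (bin_start, bin_end) in ms for the first of two consecutive
    significant bins, or None if no such pair of bins exists.

    p_values[i] is the P value for the bin that starts at
    start_ms + i * bin_ms after array onset.
    """
    for i in range(len(p_values) - 1):
        if p_values[i] < alpha and p_values[i + 1] < alpha:
            bin_start = start_ms + i * bin_ms
            return (bin_start, bin_start + bin_ms)
    return None


# Invented P values for six bins starting 100 ms after array onset:
# 100-120, 120-140, 140-160, 160-180, 180-200 and 200-220 ms.
example_p = [0.40, 0.03, 0.22, 0.04, 0.01, 0.002]
onset = onset_of_effect(example_p, start_ms=100)
```

In this invented series the isolated significant bin at 120–140 ms is ignored because the following bin is not significant, which is the purpose of requiring two consecutive bins.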
The same analysis was carried out on the same data synchronized to the onset of the behavioral response (Fig. 6B). We repeated the time series of t-tests on each 20 ms bin in the population average, with the good versus poor stimulus as the target. According to this analysis, a significant target effect was present only in the 40 ms preceding saccade onset. Although this was well in advance of any change in the retinal input determined by the eye movement, it was much later than the target effect in the inside/inside condition.
To establish the magnitude of the target effect for the individual neurons tested in the inside/near-outside condition, we next computed the target effect index on the individual cells' responses to the search array. The average TEI across the 24 cells was –0.01 (± 0.05 SEM; corresponding to a –2.0% change in firing rate) in the early time window (between 50 and 150 ms post-array onset), and it increased to 0.14 (± 0.04 SEM; corresponding to a 32.6% change in firing rate) in the late time window (spanning the last 100 ms prior to the onset of the eye movement; paired t-test, P < 0.001).
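The percentage changes quoted alongside each TEI value are consistent with a contrast index of the form (G − P)/(G + P), where G and P are the responses to the array on good-target and poor-target trials. The definition of the TEI is not restated in this section, so that form is an assumption here, and the function names below are hypothetical. Under this assumption G/P = (1 + TEI)/(1 − TEI), and the conversion can be sketched as:

```python
def target_effect_index(good_target_rate, poor_target_rate):
    """Assumed contrast-index form of the TEI: (G - P) / (G + P)."""
    return (good_target_rate - poor_target_rate) / (
        good_target_rate + poor_target_rate
    )


def tei_to_percent_change(tei):
    """Percent change in firing rate on good-target trials relative to
    poor-target trials, given TEI = (G - P) / (G + P)."""
    ratio = (1.0 + tei) / (1.0 - tei)  # G / P
    return 100.0 * (ratio - 1.0)


# TEI values from the text paired with the percent changes quoted there.
reported = [(-0.01, -2.0), (0.14, 32.6), (0.29, 81.7), (0.06, 12.8), (0.24, 63.2)]
recomputed = [round(tei_to_percent_change(tei), 1) for tei, _ in reported]
```

Each recomputed value matches the corresponding percentage quoted in the text to one decimal place, which supports the assumed form of the index without confirming it.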
Finally, we compared the response to the array when the good stimulus (inside the RF) versus the poor stimulus (outside the RF) was the target for the individual cells. In the early time window, between 50 and 150 ms after array onset, one cell showed a significantly greater response in good-target versus poor-target trials, and one additional cell showed a significant effect in the opposite direction. In the late time window, spanning the last 100 ms before onset of the saccade, five cells showed a reliable effect (20.8%); for all of them the response to the array was larger in good-target than poor-target trials.
Although at least some of the cells studied with the inside/near-outside configuration showed a clear target effect in the late time window, the question remained as to whether such an effect, assessed at the population level, was significantly reduced in magnitude relative to the effect obtained with the inside/inside configuration. It would be potentially misleading to compare the target effect in the 24 cells tested in the inside/near-outside condition to the target effect in the larger population of 76 cells studied in the inside/inside condition. We therefore tested for a significant difference in the magnitude of the target effect between the two conditions (inside/inside versus inside/near-outside) only in the 24 cells tested under both conditions. For these cells, the average TEI measured in the late time window in the inside/inside condition was 0.29 (corresponding to an 81.7% change in firing rate) compared to 0.14 in the inside/near-outside condition (corresponding to a 32.6% change in firing rate, as indicated above), and the difference between the two conditions was significant (paired t-test, P < 0.05).
In conclusion, a reliable effect of target selection was still observed when only one array stimulus was inside the RF of the recorded neuron and a second stimulus was located outside but near the RF border. However, this target effect was significantly diminished compared with when two stimuli fell within the RF of the same neuron.
A total of 25 significantly responsive and stimulus-selective cells were tested in this condition. None of these cells gave a significant excitatory response to the good stimulus presented alone at the stimulus location outside the RF.
Figure 7 compares the population response to the search array, averaged across the 25 cells, when the good versus poor stimulus was the target. In Figure 7A the spike trains were synchronized to the onset of the search array, while in 7B the same data were synchronized to saccade onset. The population responses to the search array in the inside/far-outside condition did not show any clear effect of target selection during approximately the first 150 ms after the onset of the array (Fig. 7A). Starting at about this time, neuronal responses showed only a slight tendency to differentiate depending on whether the good or the poor stimulus in the array was the target, with activity staying somewhat higher in the former than in the latter condition. To test whether there was significant suppression of the response to the good stimulus when the poor stimulus was a target at any time after stimulus onset, we computed a paired t-test on each 20 ms bin in the population histograms. The onset of suppression was defined to begin at the first of two consecutive bins that showed a significant difference in response on the good- versus poor-target trials. According to this criterion, population responses were never significantly different from one another depending on the selected target. The same analysis was carried out on the same data time-locked to the onset of the behavioral response (Fig. 7B). As shown in Fig. 7B, population responses remained generally indistinguishable through the time of saccade onset, and a significant difference was present for only the last 20 ms bin preceding onset of the eye movement.
To establish the magnitude of the target effect for the individual neurons tested in the inside/far-outside condition, we next computed the target effect index on the individual cells' responses to the search array. The average TEI across the 25 cells was 0.02 (± 0.03 SEM; corresponding to a 4.1% change in firing rate) in the early time window (between 50 and 150 ms post-array onset) and 0.06 (± 0.04 SEM; corresponding to a 12.8% change in firing rate) in the late time window (spanning the last 100 ms prior to onset of the eye movement), which were not statistically different according to a paired t-test (P = 0.29). Also, neither value was significantly different from a TEI of 0.0 (P > 0.05). Thus, in both time periods there was little or no effect of selecting the good versus poor stimulus in the array.
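As a rough check on the comparison with zero, a t statistic can be approximated from the reported mean and SEM. The sketch below assumes a one-sample t-test with 24 degrees of freedom (25 cells), which the text does not state explicitly. The reported values are rounded, so the result is only approximate. The critical value 2.064 is the standard two-tailed 5% threshold for 24 degrees of freedom, and the names are hypothetical.

```python
# Two-tailed critical t value for alpha = 0.05 with df = 24 (25 cells).
T_CRIT_DF24 = 2.064


def t_from_mean_sem(mean, sem, null_value=0.0):
    """One-sample t statistic computed from a mean and its standard error."""
    return (mean - null_value) / sem


# Mean TEI and SEM as reported for the inside/far-outside condition.
early_t = t_from_mean_sem(0.02, 0.03)  # roughly 0.67
late_t = t_from_mean_sem(0.06, 0.04)   # roughly 1.5

early_significant = abs(early_t) > T_CRIT_DF24
late_significant = abs(late_t) > T_CRIT_DF24
```

Both statistics fall below the critical value, in line with the statement that neither TEI differed significantly from zero.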
Finally, we compared the response to the array when the good stimulus (inside the RF) versus the poor stimulus (outside the RF) was the target for individual cells. In the early time window, between 50 and 150 ms post-array onset, none of the cells showed a significant difference in response between good-target and poor-target trials. In the late time window, spanning the last 100 ms before onset of the saccade, three cells showed a significant positive target effect and one cell showed a significant negative target effect.
It thus appears that, unlike the cells studied with the inside/inside and inside/near-outside conditions, this group of cells studied with the inside/far-outside configuration did not show a clear target effect. Nonetheless, it was important to verify that the same 25 cells tested under the inside/far-outside condition did show a clear effect of target selection under the inside/inside configuration. We therefore compared the magnitude of the target effect across the two conditions (inside/inside versus inside/far-outside) for this group of 25 cells tested under both conditions. For these cells, the average TEI measured in the late time window in the inside/inside condition was 0.24 (corresponding to a 63.2% change in firing rate), compared with a TEI of 0.06 in the inside/far-outside condition (corresponding to a 12.8% change in firing rate, as indicated above) and the difference between the two conditions was highly significant (paired t-test, P = 0.006).
Figure 8 summarizes the results obtained with the three types of search array configurations relative to the RF boundary, the inside/inside, inside/near-outside, and inside/far-outside configurations. For each of these conditions, the graph shows the average target effect index across the population of tested cells, separately for the early and late time windows. The largest effect of target selection was obtained in the late time window with both stimuli of the search array confined within the RF of the recorded neuron. The effect showed a small but significant reduction when one search array stimulus was moved just outside the RF, and a large, further significant decrease when the outside stimulus was moved to a greater distance within the opposite quadrant of the same visual hemifield.
Search Task with Lever Release: Two-stimulus Arrays Presented Inside the RF (Inside/Inside Configuration)
In our previous study of IT neurons during visual search (Chelazzi et al., 1998), we established that a reliable target effect could be found regardless of the specific motor response produced by the animal. A significant modulation of responses to the search array was observed both when the animal was rewarded for making a saccade to the target and when it signaled the presence of the target with a lever release. To test whether this was true in V4, we recorded the responses of 13 significantly stimulus-selective V4 cells using the same lever-release task as used for the IT cortex recordings (see Materials and Methods). The main question was whether a significant target effect could be observed under these task conditions.
Because the activity of these 13 cells was typical of the cell population described in the previous sections, we report here only their responses to the search array, which comprised a good and a poor stimulus for the individual cell, both confined to the RF of that neuron, and the modulation of these responses by the selection of either stimulus as the target.
Figure 9 illustrates an example cell showing a clear modulation of activity depending on the selected target. Starting ~125–150 ms after array onset, the firing rate of this cell was significantly higher when the target was the good stimulus for the cell than when it was the poor stimulus (see Fig. 9A). This modulation started well before the behavioral response, which occurred on average 367 ms after the onset of the array. The same result can be observed in Fig. 9B, where spike trains have been synchronized to the lever release rather than to the onset of the search array. In this case the change in activity depending on the selected target became evident ~200 ms before the behavioral response.
To establish the magnitude of the target effect for the individual neurons tested in this condition, we next computed the target effect index on the individual cells' responses to the search array. The average TEI for the 13 cells was –0.05 (± 0.03 SEM; corresponding to a –9.5% change in firing rate) in the early time window (between 50 and 150 ms after array onset) and 0.24 (± 0.06 SEM; corresponding to a 63.2% change in firing rate) in the late time window (spanning the last 100 ms prior to the lever release). Values of the TEI for the early versus late time window were statistically different according to a paired t-test (P < 0.001).
Finally, we compared the response to the array when the good versus poor stimulus was the target for the individual cells. In the early time window, between 50 and 150 ms after array onset, only one cell showed a significant target effect, which was negative. In the late time window, spanning the last 100 ms before the behavioral response, 4 out of the 13 cells showed a significant target effect, which was a positive effect in each case.
These results indicate that V4 neurons can show a substantial target effect even when no eye movement is made to the location of the selected target, similar to what we had observed in our previous recordings of IT neurons.
As we found previously in IT cortex (Chelazzi et al., 1993, 1998), responses of neurons in V4 are strongly modulated by attention in a visual search task. The monkeys were trained to search for a target stimulus in a two-stimulus array, and the location of the target was unknown in advance. Thus, the monkey had to search for the target based on non-spatial features, which is akin to finding a ‘face in a crowd’, or at least a very small crowd. This is different from tasks of spatial attention commonly used for neurophysiological investigations of attention mechanisms in extrastriate cortex, where the location of a relevant stimulus is cued well in advance of its onset (Moran and Desimone, 1985; Motter, 1993; Treue and Maunsell, 1996; Connor et al., 1996, 1997; Luck et al., 1997; Reynolds et al., 1999; McAdams and Maunsell, 1999a,b). The target stimulus on a given trial was indicated by a cue presented at fixation at the start of the trial, and the animal indicated the presence of the target in the array by making a saccadic eye movement to it. We found that when two stimuli were located inside the RF of a V4 cell, the activity of the cell in a ~100 ms time window preceding the behavioral response was determined almost completely by the target stimulus (the ‘target effect’). The response to the distracter was almost completely suppressed. Furthermore, as in IT cortex, a reliable target effect was confirmed using a version of the task in which the animal had to signal the presence of a target by a lever release, i.e. when precise localization of the target in the array for the purpose of a goal-directed response (e.g. a saccade) was not required. The neural mechanism for visual search thus appears to extend back from IT cortex to at least the level of V4 in the ventral object recognition stream. 
There were differences between V4 and IT cortex, however, in both the presence of cue-selective delay activity and in the dependency of the target effect on the spatial relationship between the stimuli, which are discussed in the following sections.
Cue-selective Delay Activity
When recorded in a similar task to that used in the present study, IT neurons typically showed cue-specific maintained activity during the delay between the cue and the presentation of the choice array (Chelazzi et al., 1998). For example, if cue stimulus A elicited a larger response than cue stimulus B, the activity in the delay following A was typically larger than in the delay following B. We have interpreted this activity as one piece of evidence for a top-down bias in favor of cells representing the properties of the cue-target stimulus. The bias in this case could be a simple increase in excitatory input to the critical cells coding the expected target stimulus. Such a bias might then give these cells a competitive advantage when the choice array is presented, resulting in the suppression of cells representing distracters (Chelazzi et al., 1998). We have termed this model of attentional selection ‘biased competition’. Consistent with this model, we found analogous maintained activity in V4 in a spatial attention task, in which cells showed higher maintained activity on trials in which the animal attended to a location within the RF of the recorded neuron than when the animal attended anywhere else in the visual field, including nearby locations outside the RF (Luck et al., 1997). The magnitude of the maintained activity was proportional to the distance between the focus of attention and the center of the RF, and, thus, the resolution of this signal was greater than the RF dimensions (Luck et al., 1997). The spatially specific maintained activity could reflect a bias in favor of cells representing the critical feature of the target stimulus in the task, namely, its spatial location, and would give these cells a competitive advantage over cells representing distracters at nearby locations. 
Recent functional neuroimaging studies in humans have shown a similar topographical activation of extrastriate cortical foci, corresponding to the locus of attention, in the absence of visual stimulation (Kastner et al., 1999; Brefczynski and DeYoe, 1999). A quantitative implementation of the biased competition model has been presented (Reynolds et al., 1999).
The near absence of cue-specific delay activity in V4 in the present study calls into question our interpretation of the role of delay activity in biased competition. Cue-selective activity during the delay interval of the task was displayed by only a small fraction of cells in our sample, and even for these cells the difference in firing rate depending on the cue was very modest. Paradoxically, Haenny and colleagues found a substantial number of V4 cells with cue-specific delay activity in an orientation match-to-sample task, although there was no search component or spatial uncertainty for the stimuli in that task (Haenny et al., 1988). One possible explanation for the absence of delay activity in V4 in the present study is that IT cells alone are the recipients of direct biasing inputs in visual search, and that, through feedback connections, IT cells then modulate the responses or the competitive interactions among V4 cells. Another possible explanation is that cue-specific maintained activity might reflect only one of many different forms of bias impinging on cortical cells. We have found, for example, that when attention is directed to a spatial location, V4 cells with RFs at that location become more sensitive to attended stimuli, as though the contrast of the stimulus had increased (Reynolds et al., 2000). This increase in effective stimulus contrast is another type of bias in favor of cells coding relevant stimuli. Other studies have also found small increases in response to single attended stimuli in the RFs of V4 cells (Spitzer et al., 1988; Connor et al., 1996, 1997; McAdams and Maunsell, 1999a,b), reflecting a bias in favor of that stimulus. Motter (1994a,b) has shown that attention directed to a particular object feature (e.g. to the color red) can enhance the activity of V4 cells encoding red elements throughout the visual field. This may be a mechanism for pre-selecting visual objects containing a specified, relevant feature, and might be an initial step toward selecting a single, behavioral target.
We have also recently found evidence that attention causes an increase in high-frequency synchronization of cells representing the relevant stimulus location, which would presumably amplify the effective influence of those cells on postsynaptic elements (Fries et al., 2001). Interestingly, the increased synchronization is not always accompanied by firing rate increases. If such synchronization also takes place for cells representing the target stimulus features in the visual search task, this might be the critical bias that drives the competition between target and distracter in V4 during visual search. Finally, it could be that the maintained cue-specific activity found during visual search in IT cortex is important for maintaining a representation of the target features in working memory, but does not specifically bias the competitive interactions between target and distracters in either IT cortex or V4.
Regardless of the specific nature of the attentional bias in area V4, one likely source of bias signals in our search task is prefrontal cortex, which we have also suggested is a source of feedback bias to IT cortex (Chelazzi et al., 1998). Prefrontal cortex has long been shown to play a key role in spatial and object working memory, a critical cognitive component of our search paradigm (e.g. Goldman-Rakic, 1987; Fuster, 1989; Miller et al., 1996). In addition, more recent studies have revealed profound modulation of neuronal responses in prefrontal cortex as a function of attention and task relevance of the visual input (e.g. Schall and Hanes, 1993; Schall et al., 1995; Rainer et al., 1998; Asaad et al., 2000; Miller, 2000; Hasegawa et al., 2000).
Target Effect in V4 and IT Cortex
Many V4 cells showed a robust target effect, at least when two stimuli were positioned within the boundaries of the cell's classical RF. The incidence and magnitude of the effect in the present study are both similar to corresponding estimates in area IT. In the standard conditions of the two studies, i.e. when the stimulus arrays were confined within the RF border of V4 neurons and within the contralateral hemifield for IT neurons (Chelazzi et al., 1998), a significant target effect was obtained in ~39% of V4 neurons and ~44% of IT neurons. Likewise, the magnitude of the effect is similar in the two areas, as measured by the TEI. In the standard conditions, the average TEI measured over the last 100 ms preceding saccade onset was 0.24 (or a 63.2% change in firing rate) for V4 cells and 0.26 (or a 70.3% change in firing rate) for IT cells.
A more complex issue is to compare the time-course of the target effect in the two areas. In the present study, a significant target effect in the standard, inside/inside RF condition began at 150–160 ms after array onset, or ~70–80 ms prior to saccade onset. By comparison, in IT cortex the target effect began at the same time prior to saccade onset as in V4 but ~20 ms later relative to array onset, i.e. 170–180 ms after array onset. However, this analysis is complicated by several factors including: (i) the target and distracter were closer to each other (typically in the same visual quadrant) in the V4 recordings than in the IT recordings (typically in the upper versus lower contralateral quadrant); (ii) the animals had more training prior to the V4 recordings than prior to the IT recordings; and (iii) possibly because of both of those factors, the average saccade latency in the V4 recordings (~240 ms) was considerably shorter than in the IT recordings (~300 ms). Thus, although it would be of great potential interest to compare the latency of the target effect between area V4 and IT, this question must be left for future studies.
Dependency on Spatial Factors
Large target effects in V4 were found only when both the target and distracter occupied the same RF, whereas IT neurons show equivalent target effects when stimuli are located anywhere in the upper and lower contralateral quadrants (Chelazzi et al., 1998). Thus, target selection appears to operate at different spatial scales in the two areas. This is consistent with the notion that a ‘central resource’ for which stimuli compete is the RF (Desimone and Duncan, 1995). When two or more stimuli occupy the same RF, the message communicated by the cells is not specific to either stimulus, and therefore attentional mechanisms are needed to improve the signal. When only one stimulus occupies the RF, there is no competition and attention is therefore not needed to resolve it. This would explain why both animals and a human patient with V4 lesions typically show only modest impairments when discriminating the features of a single target stimulus in the visual field but show much larger impairments when the same target stimulus is surrounded by nearby strong distracters (Schiller and Lee, 1991; Schiller, 1993; Merigan, 1996; De Weerd et al., 1999; Gallant et al., 2000). This explanation is also consistent with the notion that the competition between stimuli observed in behavioral studies (Desimone and Duncan, 1995) is mediated by competition between neurons in the cortex, and that this competition is strongest for nearby cells, such as those representing nearby locations in a visuotopically organized area like V4 (Desimone and Duncan, 1995; Luck et al., 1997; Reynolds et al., 1999). Brain imaging studies in human subjects have similarly found evidence for suppressive interactions between stimuli located within the same RF in extrastriate cortex, and for their modulation by spatially directed attention (Kastner et al., 1998, 1999).
One remaining puzzle is why attentional selection is stronger in IT cortex when the competing stimuli are contained in the same hemifield, even though IT RFs typically extend into both hemifields. There is recent evidence that competition between two stimuli in the ipsilateral hemifield results in target effects as strong as when the stimuli compete within the contralateral hemifield (Jagadeesh et al., 2001). The only configuration with weaker target selection effects is when the stimuli are located in opposite hemifields. It is possible that competition between the hemifields is resolved at higher levels of processing, such as in prefrontal cortex (Rainer et al., 1998).
Significant effects of spatial and non-spatial attention have been described in a number of prior studies of V4 cells (Moran and Desimone, 1985; Spitzer et al., 1988; Motter, 1993, 1994a, b; Connor et al., 1996, 1997; Luck et al., 1997; Reynolds et al., 1999; McAdams and Maunsell, 1999a,b, 2000). Of these, several studies have found that attentional effects are much stronger when two or more stimuli compete within the same RF than when one stimulus is located inside the RF and one outside (Moran and Desimone, 1985; Luck et al., 1997; Reynolds et al., 1999), and this finding has been extended to other areas of the ventral and dorsal stream of cortical visual processing, such as V2 (Luck et al., 1997; Reynolds et al., 1999) and MT/MST (Treue and Maunsell, 1996). According to the biased competition account, when only a single stimulus is located in the RF and any distracters are distant, attending to it will increase the bias in favor of that stimulus but will not modulate any competitive interactions. Because the direct biasing effects on neuronal responses are weaker than the effects of modulating competition between stimuli, attending to a single stimulus in the RF may cause little or no increase in firing rates in V4, as has been found in several previous studies (Moran and Desimone, 1985; Luck et al., 1997).
The Target Effect and the Biased Competition Model of Attention
As predicted by the biased competition account (see Introduction), target selection in V4 (and IT cortex) appeared to modulate an underlying competition between target and distracter rather than simply increasing the neuronal response to the attended stimulus. We found that in both V4 and IT cortex, when neither stimulus was a target, the effect of adding a poor stimulus to a good stimulus in the RF was to suppress the response to the good stimulus to a level that was intermediate between the response to the good stimulus alone and the poor stimulus alone. When the good stimulus became the target, the effect of attending to it was to cancel the suppressive effect of the poor stimulus. As a result, the response was restored to the magnitude of response elicited by the good stimulus when presented alone. Conversely, when neither stimulus was a target, the effect of adding a good stimulus to a poor stimulus in the RF was to increase the cell's response to a level intermediate between the response to the good stimulus presented alone and the poor stimulus presented alone. This increase in response due to the presence of the good stimulus in the RF was largely (but not completely) canceled by selecting the poor stimulus as a target, so that the response to the array approached the response to the poor stimulus when it was presented alone. This is a particularly critical condition for the biased competition account because the effect of selecting the poor stimulus as the target was to drive the cells' responses down, not up, even though the poor stimulus elicited a small excitatory response when presented alone [see Reynolds et al. for a similar result (Reynolds et al., 1999)]. The fact that the response is suppressed when attention is directed to the poor but excitatory stimulus is inconsistent with simple ‘gain’ models of attention (McAdams and Maunsell, 1999a,b), in which the effect of attention is simply to increase the gain of the response to the attended stimulus. 
Attending to an excitatory stimulus in the RF should increase the neuron's response, according to gain models, which did not occur. Rather, the results can only be explained if attention modulates an underlying competitive interaction between the stimuli within the RF, a key element of the biased competition account.
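The contrast between the two accounts can be made concrete with a toy calculation. The sketch below is not the quantitative implementation cited in the text. It simply assumes that the response to a pair is a weighted average of the responses to each stimulus alone, and that attention increases the weight of the target. The firing rates, the bias value and the function names are invented for illustration, and the gain model is a deliberate caricature.

```python
def pair_response(r_good, r_poor, w_good=1.0, w_poor=1.0):
    """Toy competition: the pair response is a weighted average of the
    responses to each stimulus presented alone."""
    return (w_good * r_good + w_poor * r_poor) / (w_good + w_poor)


def gain_model_response(unattended_pair_response, gain):
    """Caricature of a pure gain model: attention to an excitatory
    stimulus multiplies the response by a factor greater than 1."""
    return gain * unattended_pair_response


# Invented firing rates (spikes/s) for the stimuli presented alone.
R_GOOD, R_POOR = 40.0, 10.0
BIAS = 9.0  # hypothetical attentional weight given to the target

unattended = pair_response(R_GOOD, R_POOR)                 # 25.0
attend_good = pair_response(R_GOOD, R_POOR, w_good=BIAS)   # 37.0
attend_poor = pair_response(R_GOOD, R_POOR, w_poor=BIAS)   # 13.0
gain_prediction = gain_model_response(unattended, 1.2)     # 30.0
```

In this toy example, selecting the poor stimulus moves the pair response from 25 toward 10 spikes/s, i.e. downward, whereas the caricatured gain model can only raise it. This is the qualitative pattern that the argument above relies on.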
We thank J. Sewell and T. Galkin for help with the histology, and J. Hart and R. Hoag for help in training the monkeys. This research was supported in part by a grant from the Human Frontier Science Program Organization to R. Desimone and J. Duncan, a grant from the Human Frontier Science Program Organization to L. Chelazzi, grants from the Office of Naval Research (N00014-91-J-1347) and Air Force Office of Scientific Research (AFOSR-90-0043) to J. Duncan, and by a fellowship from the Human Frontier Science Program Organization to L. Chelazzi.
Address all correspondence to Leonardo Chelazzi, MD, Ph.D., Department of Neurological and Vision Sciences, Section of Physiology, University of Verona, Strada Le Grazie 8, I-37121 Verona, Italy. Email: email@example.com.