## Abstract

Attention is known to play a key role in perception, including action selection, object recognition and memory. Despite findings revealing competitive interactions among cell populations, attention remains difficult to explain. The central purpose of this paper is to link up a large number of findings in a single computational approach. Our simulation results suggest that attention can be well explained on a network level involving many areas of the brain. We argue that attention is an emergent phenomenon that arises from reentry and competitive interactions. We hypothesize that guided visual search requires the use of an object-specific template in prefrontal cortex to sensitize V4 and IT cells whose preferred stimuli match the target template. This induces a feature-specific bias and provides guidance for eye movements. Prior to an eye movement, a spatially organized reentry from oculomotor centers, specifically the movement cells of the frontal eye field, occurs and modulates the gain of V4 and IT cells. The processes involved are elucidated by quantitatively comparing the time course of simulated neural activity with experimental data. Using visual search tasks as an example, we provide clear and empirically testable predictions for the participation of IT, V4 and the frontal eye field in attention. Finally, we explain a possible physiological mechanism that can lead to non-flat search slopes as the result of a slow, parallel discrimination process.

## Introduction

Experiments investigating object detection and attention indicate that sets of cells encoding object features compete with one another in parallel. Chelazzi et al. (1993, 1998) assume that such a competition can be resolved by a feature-specific bias from working memory. Similarly, the feature-similarity framework (Treue and Martínez Trujillo, 1999) suggests that feedback implements a parallel feature-based gain control. Other work has revealed that a spatial bias can also resolve competition among cells (Luck et al., 1997; Reynolds et al., 1999).

Computational models have shown that interactions within a network can lead to attentive effects (Mumford, 1992; Tononi et al., 1992; Hamker, 1999; Kirkland and Gerstein, 1999; Corchs and Deco, 2002; Knoblauch and Palm, 2002). Specifically, we have recently shown that a global feature-specific bias can guide spatial selection by feedback within the ventral pathway (Hamker, 2004b). According to our model, a target template in prefrontal areas enhances the gain of cells in IT and V4 and facilitates processing of the features that are to be detected. The origin of a spatially selective bias, however, is rather unclear. Among others, the lateral intraparietal area (Bisley and Goldberg, 2003), the superior colliculus (Ignashchenkova et al., 2004) and the frontal eye field (FEF) (Bichot and Schall, 1999a) have been suggested to implement spatial attention. Inspired by the latter findings, we designed a computational model in which spatial attention emerges by reentry from the FEF, and showed that the temporal course of IT cell activity fits with some data of Chelazzi et al.'s (1993, 1998) experiment (Hamker, 2001, 2002, 2003). Further evidence in favor of the FEF has been given by Moore and Armstrong (2003), who have shown that the gain of V4 cells can be modified by a brief stimulation of FEF neurons. Assuming the FEF is indeed directly involved in spatial attention, the FEF could implement a gain modulation in V4 in two ways. Movement and visuomovement cells exhibit target selection and both could be the source of a reentry signal. A visual selection model and a movement preparation model have been proposed. The visual selection model predicts that target selection in the visuomovement cells provides the focus of attention (Thompson et al., 1997; Murthy et al., 2001; Sato and Schall, 2003). Alternatively, the movement plan model predicts that the activity of movement cells provides a spatial reentry signal (Hamker, 2003). 
At present there are no conclusive data in favor of one model over the other.

In order to shed more light on the function and predictions of the movement plan model, the present paper focuses on a comparison of the movement plan model with a range of experimental data. We demonstrate that the reentry signal of the movement plan model is consistent with other conditions tested in Chelazzi et al.'s (1998, 2001) visual search experiment and with data from a conjunctive visual search task (Bichot and Schall, 1999b). Alternative models are shown to be less consistent with the data of Chelazzi et al. (1998). We further show that the model exhibits target selection in the visuomovement cells similar to FEF data in an eye movement task (Sato et al., 2001), although the reentry signal originates from movement cells. In order to obtain a model with predictive power we (i) put much emphasis on the selection of areas involved in the visual search task; and (ii) constrain our model to match the typical temporal course of activity of cells in all implemented areas.

Our simulations result in novel and specific predictions, one of the most relevant being that the latency of the spatial reentry depends on the degree of target–distractor discrimination. This finding has strong implications for the emergence of search slopes in visual search experiments.

## Materials and Methods

We simulate the memory-guided search task used by Chelazzi et al. (1998). If the sample reappears in the search array, the condition is called ‘Target Present’ (Fig. 1). The result is a ‘saccade to the good stimulus’ if we observe a cell that is strongly driven by the cue. Let us now assume that we observe the same cell but present a cue stimulus that does not drive the cell very well. In this case the outcome is denoted ‘saccade to the poor stimulus’, since the chosen stimulus is a poor stimulus for the observed cell. If the good and the poor stimulus are within the search array and we present the poor stimulus as the cue, we observe distractor suppression. In the ‘Target Absent’ condition the cue stimulus is different from the stimuli in the choice array. In this case the saccade has to be withheld.

Figure 1.

Simulation of the experiment of Chelazzi et al. (1998). We use the same temporal order of events as in the real experiment. The simulated objects (banana, apple, pepper) are represented by a noisy population input in a one-dimensional feature space, here illustrated by a snapshot at the bottom of the figure. We do not use images as input. RFs without an object just have noise as input. Each object is encoded within a separate RF, illustrated by the dashed circle, of V4 cells in two simulated dimensions (only one is shown). All V4 cells are within the RF of the IT cell population. First a cue is presented to the model for 300 ms. After a delay of 1500 ms, one, two, three or five stimuli are shown. The model has to indicate the detection of the target by selecting its location for an intended eye movement. By varying the cue, we can define different search conditions: target present and target absent.

### Outline of the Model Proposed

We identified and constructed a network of relevant brain areas that are sufficient to perform the visual search task in Chelazzi's experiment (Fig. 2). Our model consists of ascending populations called ‘stimulus cells’ that can be primed by feedback connections, and descending populations of ‘target cells’ that project dominant patterns back into the source areas. In brief, the proposed dynamics of perception are as follows. Massive feedback projections within the ventral pathway implement a gain control in order to transfer target information represented in ‘higher’ areas to intermediate areas (V4). These intermediate areas drive the FEF and lead to target discrimination in visually responsive cells. By way of reentry into extrastriate visual areas from FEF movement cells, neurons in V4 and IT that have their receptive fields at the location of an intended eye movement increase their sensitivity and gain an additional advantage in competition.

Figure 2.

(a) Outline of the minimal set of interacting brain areas. Our model areas are restricted to elementary but typical processes and do not replicate all features in these areas. The arrows indicate known anatomical connections between the areas, which are relevant to the model. The area that sends feedforward input into the model is not explicitly modeled. The labels in the boxes denote the implemented areas. (b) Sketch of the simulated model areas. Each box represents a population of cells. The formation of those populations is a temporal dynamic process. Bottom-up (driving) connections are indicated by a bright arrow and top-down (modulating) connections are shown as a dark arrow. The two boxes in V4 and other areas indicate that we simulate two dimensions (e.g. ‘color’ and ‘form’) in parallel. The FEF is pooled across dimensions.

We now describe the central gain control mechanism which determines the interaction of different areas, followed by an explanation of the different model areas. A mathematical description of the model can be found in Appendix I.

### Mechanisms of Interaction between Brain Areas

The selectivity of each cell is defined by its location $i \in N$ in the population, and its activity $r_{i}$ reflects the conspicuity of its preferred stimulus. Each cell is simulated by an ordinary differential equation (equation 1) that governs its average firing rate over time. Thus, using the model we are able to observe the temporal change of activity induced by a reentry signal.

Consistent with recent findings (Hupé et al., 2001), we model the influence of reentry as a gain control mechanism on the feedforward signal. In abstract terms, the reentry signal represents the expectation $\hat{r}$ to which the input (observation) $r$ is compared. If the observation is similar to the expectation, we increase the conspicuity. This population-based inference can be achieved by a pointwise multiplication $I_{i}^{\downarrow} \propto r_{i}^{\uparrow} \cdot \hat{r}_{i}$ (Fig. 3), which relates to a neural interpretation of Bayesian inference theory (Koechlin et al., 1999). Our theoretical definition of gain control has a direct functional relevance: if the reentry signal acted not on the input, but on the output, a suppressed cell would not increase in activity even if it had a high gain, and thus state changes of the dynamical system would be impaired.
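The pointwise multiplication can be illustrated with a minimal Python sketch. The function name `gain_modulated_input`, the scaling constant `gain` and the example vectors are assumptions made for illustration; they are not taken from the model's implementation.

```python
def gain_modulated_input(r_up, r_hat, gain=1.0):
    """Top-down term as a pointwise product of input and expectation.

    r_up  : feedforward input per cell (list of floats)
    r_hat : reentrant expectation per cell, same length as r_up
    gain  : hypothetical proportionality constant (not from the paper)
    """
    if len(r_up) != len(r_hat):
        raise ValueError("input and expectation must cover the same cells")
    return [gain * u * e for u, e in zip(r_up, r_hat)]


# Two equally strong stimuli drive cells 1 and 3; the expectation favors cell 3.
r_up = [0.0, 1.0, 0.0, 1.0, 0.0]
r_hat = [0.0, 0.1, 0.2, 0.9, 0.2]
I_down = gain_modulated_input(r_up, r_hat)
```

In this sketch the cell whose input matches the expectation receives the largest top-down term, while a cell with an expectation but no input (cell 2) receives none. This mirrors the idea that reentry modulates the feedforward signal rather than driving cells on its own.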

Figure 3.

Illustration of how top-down directed expectation in a higher area modulates feedforward processing in a lower area. (a) The expectation $\hat{r}$ acts on the input $r$ and increases the gain as depicted by the arrow through the circle. The y-axis encodes the firing rate of cells and the x-axis the feature space, e.g. orientation, color or location. For simplicity, the feature space of the involved areas is identical. To give an example, the expectation could originate from a population of cells in IT and modulate the conspicuity in V4. (b) Population activity without a significant top-down influence. In this case the content is simply processed in a bottom-up manner. (c) Population activity after top-down expectation multiplicatively increases the gain of the cells and therefore emphasizes a specific pattern (or location). Due to competitive interactions the population response for the non-supported stimuli decreases, resulting in a dynamic attention effect.

Altogether, the change of activity of a cell $i$ is a function of the input $r_{i}^{\uparrow}$, the lateral influence $I_{i}^{\leftrightarrow}$ and the top-down gain control $I_{i}^{\downarrow}$, as well as an inhibitory term that depends in part on the activity $r_{i}^{V4}$ of a cell $i$ in V4:

$$\tau \frac{\mathrm{d}}{\mathrm{d}t} r_{i}^{V4} = r_{i}^{\uparrow} + I_{i}^{\leftrightarrow} + I_{i}^{\downarrow} - \left(r_{i}^{V4} + 0.1\right) I_{d}^{inh} \qquad (1)$$
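To make the dynamics of equation 1 concrete, the following Python sketch integrates it with a forward-Euler scheme for a small population. Several parts are assumptions made only for illustration and are not the definitions of Appendix I: the lateral term is set to zero, the top-down term is taken as the product of input and expectation, the inhibition is a single pool proportional to the summed activity, and the values of `tau`, `dt` and `w_inh` are arbitrary.

```python
def euler_step(r, r_up, r_hat, tau=10.0, dt=1.0, w_inh=0.5):
    """Advance all firing rates by one forward-Euler step of equation 1.

    Assumptions for illustration only: the lateral term is zero, the
    top-down term is r_up * r_hat, and the inhibition is w_inh times the
    summed activity of the population.
    """
    i_inh = w_inh * sum(r)  # shared inhibitory pool (assumed form)
    new_r = []
    for r_i, up_i, hat_i in zip(r, r_up, r_hat):
        i_lat = 0.0            # lateral influence omitted in this sketch
        i_down = up_i * hat_i  # gain control acts on the input
        drdt = (up_i + i_lat + i_down - (r_i + 0.1) * i_inh) / tau
        new_r.append(r_i + dt * drdt)
    return new_r


def simulate(r_up, r_hat, steps=300):
    """Integrate from rest for a fixed number of steps."""
    r = [0.0] * len(r_up)
    for _ in range(steps):
        r = euler_step(r, r_up, r_hat)
    return r


# Two cells with identical input; only the first receives a top-down bias.
r_up = [1.0, 1.0]
biased = simulate(r_up, [0.5, 0.0])
unbiased = simulate(r_up, [0.0, 0.0])
```

Under these assumptions the biased cell settles at a higher rate than in the unbiased run, and the unbiased competitor settles at a lower rate, because the shared inhibition grows with the total activity. The sketch reproduces only the qualitative interplay of gain increase and competition, not the time courses reported here.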

Given an identical input, the timing of reentry determines the change of activity of a target cell (Fig. 4). The difference of responses prior to 100 ms solely depends on influence from IT. After that, the FEF starts to weakly modulate the response. A strong modulation from the FEF does not occur prior to 150 ms.

Figure 4.

The temporal dynamics of gain control. We observe the influence of reentry onto the firing rate of a V4 cell in two different conditions (depending on whether the target is the good or poor stimulus), given the input is identical. A V4 cell receives a reentry signal from movement cells and IT. Both reentry signals enhance the gain and add up in their effect. Due to the different top-down signal in PF working memory in either condition, the reentry signal from IT differs for the two conditions. This difference leads to a different gain and thus to a different activity in V4. After 150 ms the reentry signals from the movement cell start to differ. Thus, the gain of the cell in the saccade to the good stimulus case is much higher than in the saccade to the poor stimulus condition. Due to competitive interactions among the V4 cells, the activity in the saccade to the poor stimulus case gets suppressed as well.

Our gain control mechanism forms the core of the system in that it defines how areas at different hierarchical levels interact with each other in a continuous fashion.

### Interactions among Model V4 Cells

The model V4 cells are driven by the input to the model and, consistent with known massive feedback projections in the ventral pathway (Rockland and van Hoesen, 1994; Rockland et al., 1994), are modulated by IT. Another source of top-down influence seems to have its origin in the oculomotor circuit (Moore, 1999; Moore and Fallah, 2001; Tolias et al., 2001), in particular the FEF (Moore and Armstrong, 2003). We suggest that FEF movement cells modulate the gain of cells in V4 and IT (Hamker, 2003). Although retrograde labeling by tracers has revealed connections from layer 5 in the FEF, which contains movement cells, to extrastriate visual areas (Schall et al., 1995a), there is no direct evidence for the assumption that the movement cells are responsible for gain control.

The V4 used in our model is consistent with a range of experimental findings (Hamker, 2004a): if the receptive field contains just one stimulus, then a spatial bias results in a multiplicative gain increase. This has been observed in MT, MST and V4 (Treue and Maunsell, 1999; McAdams and Maunsell, 1999). If two stimuli are presented within the same receptive field, then the model V4 reproduces the data of Reynolds et al. (1999): a bias towards one stimulus reduces the influence of the other stimulus within the receptive field. We explain these attention effects by an input gain increase and additionally by an indirect inhibition among active populations.

### Interactions among Model IT Cells

Consistent with the large receptive fields of IT neurons, our model IT cell population receives converging input from all V4 populations (Fig. 2). Elevated baseline activity in IT cells (Tanaka et al., 1991; Miller et al., 1993; Chelazzi et al., 1993, 1998) is likely to originate in the prefrontal cortex (Tomita et al., 1999). Consistent with this finding, model prefrontal areas provide feedback into IT (Fig. 2). Since FEF projects to TEO (Schall et al., 1995a) the input gain in IT is also affected by model FEF movement cells (Fig. 2). We use the same model for IT as we do for V4, but our IT cells have stronger lateral inhibition.

### Task Control by PF Cells

The prefrontal cortex has been extensively studied in recordings around the principal and arcuate sulci, i.e. areas 8, 46 and 45 (Miller and Cohen, 2001) and is known to participate in the coordination of tasks (White and Wise, 1999; Asaad et al., 2000; Hasegawa et al., 2000; Miller and Cohen, 2001; Tanji and Hoshi, 2001). Areas 8 and 46, which overlap the frontal eye field, are often reported to code location- and motor-related signals, while area 45 is involved in categorization and feature detection (Freedman et al., 2001). Prefrontal cortex might apply a modulation over other areas in order to alter the mapping from perception to action (Miller and Cohen, 2001). Extending this concept, we show that prefrontal modulation can change the internal state of the system. One aspect of this control function is often referred to as working memory, while another is the detection of a match between object and sample in a delayed match-to-category task (Freedman et al., 2002). Our model prefrontal cortex fulfills these two major functions, encoding a pattern in PF working memory cells and indicating a match of the incoming pattern with the memorized pattern in PF match cells. Thus, IT cells can only drive PF match cells when their pattern matches the expectation from PF working memory cells (Fig. 2).

### Saccade Target Selection by FEF Cells

The FEF has connections to occipital, temporal and parietal areas, the thalamus, superior colliculus, and prefrontal cortex (Stanton et al., 1988, 1993; Schall et al., 1995a). The FEF can be subdivided into lateral and medial parts.

The lateral FEF, which generates short and precise saccades (Bahill et al., 1975), is connected to the dorsal (LIP, MT, MST, V3) and ventral (TEO, V4, V2) pathways, the ventrolateral prefrontal cortex (Baizer et al., 1991; Schall, 1995; Schall et al., 1995a; Stanton et al., 1995), and the superior colliculus (Sommer and Wurtz, 2000). The projections from V2 and V3 are weak, while the one from V4 is intermediate. Strong projections from TEO, MT and MST suggest that the FEF uses features after several stages of processing for target selection (Webster et al., 1994; Schall et al., 1995a).

Our model is consistent with this anatomy. FEF neurons receive convergent afferents from features across all dimensions in V4 at the same retinotopic location. Since anterior IT cortex, the area from which Chelazzi et al. (1998) recorded, does not project directly to FEF, we do not model any input to this area from IT.

The neurons in the FEF can be categorized, based on their responses to visual stimuli and to saccade execution, into visual, visuomovement, fixation and movement cells (Bruce and Goldberg, 1985; Schall et al., 1995b). We consider visuomovement, fixation and movement cells (Fig. 2), and model their temporal dynamics: visuomovement cells in deep layers are active from stimulus onset until saccade execution. Typically their initial response does not distinguish between distractor and target, but the activity decays when a distractor is in the receptive field (Schall et al., 1995b). Movement cells are active prior to saccades and do not show any response to stimulus onset (Hanes et al., 1998). Fixation cells decrease their activity before a saccade and increase their firing rate after the saccade or to terminate a planned eye movement (Hanes et al., 1998). Movement-related cells in the FEF show a fixation-disengagement discharge (Dias and Bruce, 1994), which indicates that fixation cells inhibit movement cells (Burman and Bruce, 1997).

The decision to execute an eye movement or to withhold gaze is based on a threshold detection of the PF match cells. If the PF match cells fire, the target is detected in the search array and the movement cells are disinhibited by removing the input into the fixation cell (Fig. 2).
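The decision logic can be sketched in a few lines of Python. This is a static caricature under stated assumptions, not the dynamic implementation of the model: the overlap measure in `pf_match`, the threshold of 0.5, the subtractive inhibition in `movement_drive` and all numeric values are hypothetical.

```python
def pf_match(r_it, r_wm):
    """Match signal: IT activity weighted by the memorized pattern."""
    return sum(a * b for a, b in zip(r_it, r_wm))


def fixation_input(match_activity, threshold=0.5, drive=1.0):
    """Input to the fixation cell, removed once the match crosses threshold."""
    return 0.0 if match_activity >= threshold else drive


def movement_drive(visuomovement, fixation_activity, w_fix=1.0):
    """Drive to movement cells after subtractive inhibition by fixation."""
    return [max(0.0, v - w_fix * fixation_activity) for v in visuomovement]


r_it = [0.1, 0.9, 0.1]      # IT population responding to the search array
visuomovement = [0.8, 0.3]  # one visuomovement cell per location

# Target present: the memorized pattern matches the IT pattern.
present = movement_drive(visuomovement,
                         fixation_input(pf_match(r_it, [0.0, 1.0, 0.0])))

# Target absent: the memorized pattern does not match, so gaze is withheld.
absent = movement_drive(visuomovement,
                        fixation_input(pf_match(r_it, [1.0, 0.0, 0.0])))
```

In the target-present case the fixation input is removed and the movement cells inherit the spatial profile of the visuomovement cells; in the target-absent case the fixation cell stays driven and the movement drive is zero everywhere.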

## Results

### Sensory Interactions in IT During Visual Search

We now verify the reentry hypothesis by comparing the firing rate of our IT stimulus cells with recordings in IT (Fig. 5). All of our simulations correlated well with the experimental data, even with regard to the time course of competition.

Figure 5.

Activity of model IT neurons aligned to the onset of the search array. The activity of the cell with the optimal response to the good stimulus is shown. The time of an eye movement is indicated by a bar on the time axis. Activity after the eye movement cannot be reliably compared to real data, since we do not model an actual foveation. (a–e) Physiological data (left) and simulation data (right). The physiological data are reprinted from ‘Responses of neurons in inferior temporal cortex during memory-guided visual search’ by Chelazzi L, Duncan J, Miller EK, Desimone R (1998), J Neurophysiol 80:2918–2940. Copyright 1998 by the American Physiological Society. Reprinted with permission.

When an array containing both the good and the poor stimuli is displayed (Fig. 5a), each cell initially encodes the presence of its preferred stimulus, but nonetheless the target cell shows an early advantage. Between 150 and 300 ms the cells encoding the non-target get suppressed almost to baseline activity, whereas the cells encoding the target show a small dip but then increase to the same level of the initial activation or even exceed it.

When only the good stimulus is presented, the physiological data show no difference in activity between the target and non-target conditions before the execution of an eye movement (Fig. 5b). Our simulations show a slight attention effect in favor of the target, since spatial and feature feedback cannot be completely shut off. However, as the activity of a model cell increases, feedback becomes less efficient and thus the attention effect is smaller than in conditions when stimuli compete. The presentation of a poor stimulus alone leads to a suppression, since in contrast to the experiment, our chosen poor stimulus does not drive the cell encoding the good stimulus.

A crucial condition is the target-absent condition (Fig. 5c). If the good and the poor stimuli are presented, the responses decrease after the initial burst. We explain this observation based on a weak winner-take-all competition. In the target-absent condition the good and the poor stimuli receive no top-down bias; they suppress each other and self-excitation is not strong enough for one population to dominate the other. Since prefrontal areas do not indicate the presence of the target, none receives a significant reentry, and the firing rate of IT cells decreases to a limit above baseline activity.

Figure 5d shows that the response to the good stimulus in the non-target condition is approximately halfway between the responses to the good stimulus and the poor stimulus in the target-present condition.

If we compare the stimulus alone with the two-stimulus array condition (Fig. 5e), we see that in both cases the good stimulus has almost the same activity around the time of the eye movement, although the activity in the good stimulus alone condition is initially stronger.

Our simulation replicates the temporal course of activity in the different conditions of the experiment from Chelazzi et al. (1998). This constraint allows us to make reliable predictions. Thus, we now explain the possible influence of the other simulated areas on the activity in IT.

### Contribution of Other Areas to Visual Search

The good fit with the data in IT is only of value if we can demonstrate that the temporal course of activity in other model areas is consistent with experimental findings. Here, we restrict ourselves to the condition with a target and one distractor in the display (Fig. 6). The presentation of the cue elicits a response in IT cells, which is stored by working memory cells. Consistent with studies using a delayed match-to-sample task (Miller et al., 1996), elevated firing rates are visible during the delay. In addition, the temporal course of activity of the PF match cell is very similar to what has been observed in the prefrontal cortex during a delayed match-to-category task (Freedman et al., 2002).

Figure 6.

Overview of the simulated areas aligned on cue. The good stimulus of the observed cell is the ‘banana’. The cue defines whether the good or the poor stimulus is the target. Thus, the differences after cue presentation occur only due to the definition of which stimulus is the target. If the good stimulus is presented as the cue, the saccade goes to the banana, as indicated by the high movement activity. If the poor stimulus is the cue, the saccade is directed to the apple and, thus, movement activity at the location of the banana is low. The figures show the activity of the best matching cell for the good stimulus in a two-stimulus array over time. We observe attention in the system solely on the basis of interacting areas. Different attentional effects add up and their origin lies in feedback gain control as well as competition.

The receptive field of the V4 cell shown in Figure 6 does not encompass the location of the cue. Our model predicts a baseline increase during the cue presentation only for those V4 cells that receive direct feature-selective feedback from the IT. For other cells it predicts a slight suppression due to unspecific long-range inhibition. Consistent with this prediction, Chelazzi et al. (2001) report that 4.9% of V4 cells exhibit a significant baseline increase, while 67.9% are inhibited during cue presentation.

In order to guide eye movements, the information about the presence of the target encoded in IT has to be converted into the information about the target's location. We have shown that feedback from IT to V4 cells, which have smaller receptive fields, can provide information both about the features of the target and its location (Hamker, 2003). Thus, the model predicts an early target effect in V4. Consistent with this prediction, Chelazzi et al. (2001) found a slight early target effect in V4 cells, which is stronger when two stimuli are located within the same receptive field. Although this early attention effect is only small, it is remarkable since V4 is the second stage of feedback after TEO.

In the projection from V4 to FEF the neural firing pattern in V4 is averaged over dimensions and the feature specificity is lost. Thus, the initial feature-specific enhancement in IT is transferred via V4 and FEF into a location-specific advantage of some locations over others. The threshold detection in the PF match cells causes the FEF fixation cell to decrease in activity in order to plan an eye movement. Initially, FEF movement cells are able to gain activity regardless of whether they encode a target or non-target in their movement fields. This is supported by experimental data (Bichot et al., 2001b).
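The loss of feature specificity in the V4-to-FEF projection can be sketched as a pooling operation. In the Python example below, the array layout `v4[d][x][i]` (dimension, location, feature cell), the use of a maximum over feature cells within a dimension and all activity values are assumptions for illustration; only the averaging over dimensions at the same retinotopic location follows the description above.

```python
def pool_v4_to_fef(v4):
    """Collapse V4 activity into one value per retinotopic location.

    v4[d][x][i] is the activity in dimension d, at location x, of the
    cell tuned to feature i. Within a dimension the strongest feature
    cell is taken (assumed); the result is averaged over dimensions.
    """
    n_dims = len(v4)
    n_locs = len(v4[0])
    pooled = []
    for x in range(n_locs):
        per_dim = [max(v4[d][x]) for d in range(n_dims)]
        pooled.append(sum(per_dim) / n_dims)
    return pooled


# Two dimensions (e.g. color and form), two locations, three feature cells.
# Location 0 holds the target, whose cells are enhanced by feedback from IT.
v4 = [
    [[0.1, 0.9, 0.1], [0.6, 0.1, 0.1]],  # dimension 0
    [[0.1, 0.1, 0.8], [0.1, 0.6, 0.1]],  # dimension 1
]
fef_input = pool_v4_to_fef(v4)
```

After pooling, the output no longer indicates which feature was enhanced, only that the target location is more strongly driven than the distractor location.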

The time the model needs to select a target for an eye movement is variable. We notice that the latency of the eye movements increases with the set size (Fig. 7a). We observe a slope of 12 ms per item compared to 26 ms measured by Chelazzi et al. (1998). However, such a steep slope results from the fast response in the one-stimulus case. Consistent with the empirical data of Chelazzi et al. (1998), an eye movement is delayed for ∼40 ms when two stimuli are presented. Apparently, the processing of a target stimulus slows down when its selection occurs during a competition with distractors.
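A search slope in ms per item is the slope of a least-squares line through latency as a function of set size. The Python sketch below shows the computation. The set sizes are those used in the task (one, two, three or five stimuli), but the latencies are synthetic placeholders chosen only to exercise the function; they are neither simulation results nor experimental data.

```python
def linear_fit(xs, ys):
    """Ordinary least-squares fit y = intercept + slope * x."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    s_xy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    s_xx = sum((x - mean_x) ** 2 for x in xs)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return slope, intercept


set_sizes = [1, 2, 3, 5]
# Synthetic latencies in ms (placeholder values, not data from the paper).
latencies = [210.0, 220.0, 230.0, 250.0]
slope, intercept = linear_fit(set_sizes, latencies)
```

Because a single fast point at set size one pulls the fitted line down at the left end, the slope of such a fit is sensitive to the one-stimulus condition, which is the effect discussed above.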

Figure 7.

Latency of eye movements depending on the set size. (a) Linear fit of the measured latencies. If two or more stimuli appear, the model predicts a delay in eye movement selection although the search is parallel. (b) Linear fit between the time for saccade selection and time for location discrimination. The time for location discrimination is the time from the target presence detection in the PF match cells to the time of saccade selection. Since the time for ‘target-present’ detection in PF match cells is almost constant, the different latencies occur during target discrimination and selection in FEF cells and not in the bottom-up wave into PF match cells. Thus, the interactions in the FEF are responsible for the delay in eye movement selection.

Consistent with FEF data (Hanes and Schall, 1996; Schall, 2002), the variability in search time can have two reasons in our model: variability in the growth rate of the movement activity, and variability in the onset of the movement cell activity. The observed set-size effect originates primarily in the variability of the growth rate of the movement cell activity. We find that it decreases with latency: the time for the behavioral response is highly correlated with the time span from target detection to action selection (Fig. 7b). The growth rate of the movement activity in turn depends on the target discrimination in the input as well as the overall strength of the input. The two-stimulus condition shows a better target discrimination as well as a stronger input (Fig. 8).
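The two sources of variability can be made concrete with a linear rise-to-threshold idealization, in which movement activity starts at an onset time and climbs at a constant rate until it reaches threshold. This is a deliberately simplified sketch, assuming linear growth; it does not reproduce the dynamics of the model FEF, and the function name and all numbers are hypothetical.

```python
def saccade_latency(onset_ms, growth_rate, threshold=1.0):
    """Latency of a linear rise to threshold.

    onset_ms    -- time at which movement activity starts to grow (ms)
    growth_rate -- activity gained per ms (must be positive)
    threshold   -- activity level that triggers the eye movement
    """
    if growth_rate <= 0:
        raise ValueError("growth_rate must be positive")
    return onset_ms + threshold / growth_rate


# Hypothetical values: identical onset, different growth rates.
fast = saccade_latency(onset_ms=120.0, growth_rate=0.010)
slow = saccade_latency(onset_ms=120.0, growth_rate=0.005)
print(f"fast growth: {fast:.0f} ms, slow growth: {slow:.0f} ms")
```

In this idealization a constant onset combined with a variable growth rate shifts the latency by the difference in rise time alone, which mirrors the case described above in which the set-size effect stems mainly from the growth rate.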

Figure 8.

Target discrimination in the frontal eye field depending on the set size, which has also been observed in experimental data (Bichot et al., 2001a). (a) FEF visuomovement cells. (b) FEF movement cells. The initial activity of the cells within the five-stimulus array is lower. As a result they need more time to reach the threshold for eye movements in the FEF movement cells. However, this delay is partly compensated by lower overall inhibition when the distractor is suppressed.

In our model, the onset of the movement cell activity depends directly on the detection of the target's presence in PF match cells (target detection), as a result of the constraint to withhold an eye movement to a distractor. Thus, target detection also influences the set-size effect. In the present simulations, however, target detection begins at a fairly constant time, ∼120 ms after target presentation. Miller et al. (1996) measured an average match response of ∼110–120 ms in prefrontal cortex as well, while presenting just one object at a time. The reason why we find a constant target presence detection lies in the simple stimuli and the low-level feature space we use [the scenes used in the experiment of Chelazzi et al. (1998) were also relatively simple]. Consistent with our model, difficult scenes can result in a delay, or even failure, in detecting the presence of the target.

### Alternative Models

We have demonstrated that a movement plan model fits with the temporal course of activity in IT and V4 using the paradigm of Chelazzi et al. (1998, 2001). Usher and Niebur (1996) have shown target selection in IT with only a feature-specific bias. Alternatively, it was suggested that visuomovement cells in the FEF could select the target (Thompson et al., 1997; Sato and Schall, 2003). We simulated these alternative models as well, to shed more light on their limitations (Fig. 9). Since all models contain a bias, either feature-specific alone or an additional location-specific bias, we observe the trivial result that the responses to the good and the poor stimuli differ. The objectives of rating the simulation data are as follows. First, in the target-present condition the IT cells show a transient response to the good stimulus and increase in firing prior to the eye movement. Second, in the target-absent condition none of the behaviorally irrelevant stimuli gets selected. From the experiment of Chelazzi et al. (1998) we cannot rule out that attention is not directed to non-target stimuli. However, since none of the stimuli receives a bias given by the instruction and the monkey has to hold fixation, we demand that noise in the neural responses alone should not result in the selection of a behaviorally irrelevant stimulus in response to the presentation of two stimuli. The parameters of all models are optimized separately to meet the objectives as well as possible for each model.

Figure 9.

Comparison of alternative models for target selection in IT in the target presence and target absence condition using a model with only a feature-specific bias (a, b), a visual selection model with feedback from FEF visuomovement cells (c) and a movement plan model with feedback from FEF movement cells (d). In all cases the response to the distractor (Target = Poor Stim.) is suppressed. The parameters of each model are fitted to meet two objectives. First, if the target is the good stimulus, a transient response after stimulus onset followed by an increase of activity prior to the eye movement is required. The slope of the increase is indicated by a dashed line. Second, in the no-target condition, none of the stimuli should be selected by noise in the system. The model with only a feature-specific bias using a strong self-enhancement (a) meets the first objective, since the response to the target shows an increase prior to the eye movement, but it fails to meet the second one, since stimulus ‘1’ is selected. The model with only a feature-specific bias using an intermediate self-enhancement (b) does not meet the first objective. The visual selection model (c) also fails to meet the first objective, although the spatial bias is already quite strong such that noise effects result in a slight target selection of stimulus ‘2’. The movement plan model (d) meets both objectives.

We simulated the model following the classical interpretation with a feature-specific bias from prefrontal cortex using a strong feedback from IT to V4 (Fig. 9a) and an intermediate feedback from IT to V4 (Fig. 9b). The strong feedback condition fulfills our first objective and shows an increase of activity prior to the eye movement, but it clearly fails to achieve the second one. The reduction of the weight of feedback from IT to V4 decreased the sensitivity to noise (second objective), but a reasonable bias from prefrontal cortex to IT does not sufficiently activate the response to the target prior to the eye movement. In general, any form of recurrent excitation is sensitive to noise. Thus, a strong excitatory loop within IT would select a behaviorally irrelevant stimulus as well.

Another alternative model for explaining the observation is a spatial reentry signal from the FEF visuomovement cells (Fig. 9c). In this model all locations receive a transient spatial bias due to stimulus onset, but since the visuomovement cells exhibit target discrimination (Fig. 8a), their activity can be sent back to V4 and IT to spatially select a stimulus. However, a reentry from the visuomovement cells shows difficulties meeting the objectives as well. We had to choose a weak gain for the spatial reentry signal, since otherwise noise results in the selection of a non-target. Even the strongest possible gain, which already slightly selects a non-target, did not allow for meeting the first objective.

We added to the model in Figure 9b a reentry signal from the movement cells to show that this model now meets both objectives (Fig. 9d).

We conclude that the target-present condition is difficult to explain entirely through the activation of a feature-specific top-down bias from prefrontal areas. A strong self-enhancement is sensitive to noise and thus predicts a winner in the non-target condition as well. A weak self-enhancement needs an additional strong (driving) bias. Visuomovement cells do not provide a good bias, since they are not decoupled from the early sensory processing and, thus, their bias is also sensitive to noise. A spatial reentry from movement cells is decoupled from direct sensory processing, since it requires the decision to plan an eye movement and so is not sensitive to noise.
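The noise sensitivity of strong self-enhancement can be illustrated with a two-unit toy system. The sketch below is not the model used in the simulations; it assumes two rate units with identical drive, self-excitation `w_self` and mutual inhibition `w_inh`, and a small initial offset `eps` that stands in for a noise fluctuation. All names and parameter values are hypothetical.

```python
def compete(w_self, w_inh=0.5, drive=0.3, eps=0.01, dt=0.01, steps=5000):
    """Euler-integrate two rate units with self-excitation and mutual inhibition.

    Both units receive the same drive; unit 0 starts eps above unit 1.
    Rates are clipped to the range [0, 1]. Returns the final (x0, x1).
    """
    x0, x1 = eps, 0.0
    for _ in range(steps):
        dx0 = -x0 + drive + w_self * x0 - w_inh * x1
        dx1 = -x1 + drive + w_self * x1 - w_inh * x0
        x0 = min(1.0, max(0.0, x0 + dt * dx0))
        x1 = min(1.0, max(0.0, x1 + dt * dx1))
    return x0, x1


strong = compete(w_self=0.9)  # small offset is amplified into a winner
weak = compete(w_self=0.2)    # small offset decays, no winner emerges
print("strong self-excitation:", strong)
print("weak self-excitation:  ", weak)
```

In this toy system the difference between the two units grows whenever `w_self + w_inh` exceeds 1, so a strong excitatory loop selects a winner even though neither unit receives a bias. With weak self-excitation the units stay equal, and an additional driving bias would be needed to select one of them.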

We are cautious about definitively ruling out the alternative models, since the data from Chelazzi et al. (1998) do not allow a quantitative analysis. However, we exposed obvious inherent limitations of the alternative models in explaining the findings. According to our simulations, a spatial bias from the movement cells fits the objectives best.

An alternative, feature-specific explanation could be a weak early prefrontal bias and a strong late prefrontal bias. However, the monkey in Chelazzi et al.'s experiment knows the target object and its search plan is set, so it is unclear why a difference in strength between early and late prefrontal bias should occur. This does not mean that we can definitively exclude a feature-specific explanation. Nevertheless, as explained later, our hypothesis results in new testable predictions.

Our model was optimized to fit IT data in the visual search task of Chelazzi et al. (1998) using general information about the time course of activity in the FEF (Bruce and Goldberg, 1985; Schall, 1995; Bichot and Schall, 1999a). We have already discussed its fit with the V4 data obtained by Chelazzi et al. (2001). To further demonstrate that our model FEF can account for the data from a variety of experiments, we compare the same model with identical parameters to the behavioral data of a conjunction visual search experiment from Bichot and Schall (1999b) as well as FEF data from Sato et al. (2001).

Bichot and Schall (1999b) found that correct saccades are faster than incorrect ones. In our simulation (Appendix II) we varied the search efficiency of the task by a random selection of the feedback strength from PF working memory to IT. We observe a performance of 96% for correct target selection in trials with set size 4 and of 94% in trials with set size 6. Consistent with Bichot and Schall (1999b), the average time for correct saccades (291 ms) is significantly shorter than for incorrect saccades (360 ms) in the set size 4 condition (t-test, P < 0.001) as well as in the set size 6 condition (t-test, P < 0.001), with 298 ms for correct saccades and 472 ms for incorrect saccades. As we shall see next, the model predicts this increase on the basis that a poor discrimination leads to longer competition in the FEF.

A recent report investigated the effect of input discrimination on visual selection in the visuomovement cells of the FEF (Sato et al., 2001). Increasing the similarity of the distractors to the target increased reaction time and increased the time needed to discriminate the target by FEF visually responsive neurons. We have shown that increasing the target–distractor similarity increases the time to select the target and increases the number of errors (Hamker, 2004b). The target–distractor similarity and other factors, such as the availability of a target template, determine search efficiency by varying the target discrimination in the input of the FEF. We now shed more light on how target discrimination affects the time for target selection. We sorted the responses in the conjunction visual search simulation according to the reaction time and separated the trials into three equal groups (fast, medium, slow). By comparing the fast and slow groups, we see, similar to Sato et al. (2001), a clear latency increase in target discrimination with slower response time (Fig. 10). Thus, our model FEF transfers the target discrimination into the latency of a reentry signal.
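The grouping step described above can be sketched as follows. The function name, the handling of leftover trials and the trial values are assumptions made for illustration; the values are not simulation results.

```python
from statistics import median


def split_into_terciles(trials):
    """Sort trials by reaction time and split them into three groups.

    trials -- list of (reaction_time_ms, discrimination_time_ms) pairs
    Returns (fast, medium, slow). If the number of trials is not
    divisible by three, the leftover trials go to the slow group.
    """
    ordered = sorted(trials, key=lambda t: t[0])
    k = len(ordered) // 3
    return ordered[:k], ordered[k:2 * k], ordered[2 * k:]


# Hypothetical (reaction time, discrimination time) pairs in ms.
trials = [(230, 60), (310, 150), (255, 90),
          (390, 280), (275, 110), (340, 200)]
fast, medium, slow = split_into_terciles(trials)

fast_median = median(d for _, d in fast)
slow_median = median(d for _, d in slow)
print("median discrimination time, fast group:", fast_median)
print("median discrimination time, slow group:", slow_median)
```

Comparing the median discrimination times of the fast and slow groups is the kind of contrast summarized in Figure 10.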

Figure 10.

Fast and slow correct trials of visual search with varying discrimination of the target in V4 and IT (depending on the target template strength). An increase in saccade initiation time in FEF movement cells shows a correlation with the target discrimination time in FEF visuomovement cells (Wilcoxon rank sum test; significance level 0.0001). The line connects the medians of each group and shows a slope of 1.3. Thus, the model predicts that a better target discrimination in FEF visuomovement cells leads to faster eye movements. For the target discrimination, refer to Appendix III.

Figure 11 shows the activity of the visuomovement cells in the fastest and slowest conditions. The initial activity clearly reflects the top-down advantage from the ‘what’ pathway (i.e. the number of dimensions that the item shares with the target). The target extends the discrimination with increasing time, consistent with the experimental data (Bichot and Schall, 1999a; Bichot et al., 2001a; Sato et al., 2001). In the fastest trial target discrimination occurs very early (50 ms), whereas in the slowest trial the discrimination of the target occurs at 290 ms.

Figure 11.

Discrimination of the target among five distractors in efficient and less efficient parallel search. The plots show the activity of the FEF visuomovement cells during the fastest trial (a) and the slowest trial (b) over time. The first dashed line indicates the discrimination in FEF visuomovement cells (Appendix III) and the second one the eye movement initiation in the FEF movement cells. In less efficient trials the target discrimination occurs late in time. The figure also indicates that the target discrimination in the input of the FEF is essential for a fast eye-movement selection. The initial differences after 50 ms correlate with the target–distractor similarity and depend on the randomized strength of the target template.

### Predictions

#### A Poor Target Discrimination in FEF Visual Cells Results in Higher Activation of Non-target FEF Movement Cells

Our model FEF visuomovement cells show the effect of search efficiency on the visual selection in the FEF (Sato et al., 2001): low efficiency is characterized by poor (late) target discrimination in the visual cells. We now predict how search efficiency affects the movement cells, which were not investigated by Sato et al. (2001). In the case of a low efficiency, where a poor (late) target discrimination in the visual cells was observed, the model movement cells need more time to resolve the competition (Fig. 12). Our model predicts that in this case the distractor location can achieve a high activation relative to the condition with a good (fast) target discrimination.

Figure 12.

Activity of the FEF movement cells in efficient and less efficient parallel search over time. The figures show the fastest trial (a) and the slowest trial (b). The first dashed line indicates the discrimination in FEF visuomovement cells and the second one the eye movement initiation in the FEF movement cells. The non-target movement cell activity reaches higher values in non-efficient trials. Typically the items at these positions have one feature in common with the target.

#### A Late Target Effect in V4 and IT Is Launched by Spatial Reentry

Chelazzi et al. (1998) defined an early time window from 70 to 170 ms after stimulus onset and a late time window from 100 ms before the saccade until its execution. The responses of IT and V4 cells show an enhanced activity for the target in the early window, whereas a significant target selection was observed in the late time window (Chelazzi et al., 1998, 2001). It was suggested that the observed responses can be explained by a feature-specific bias from prefrontal areas (Chelazzi et al., 1998). Usher and Niebur (1996) have shown that competition among model IT cells is sufficient for the target selection observed by Chelazzi et al. (1993). However, their model is limited to the case of one target and one distractor, and did not explain the target-absent case. Our simulations of target-present and target-absent cases have shown that the target selection in the late phase is consistent with a reentry from the fronto-parietal network. A model without spatial reentry has difficulties in reconciling both the target-present and target-absent data.

#### Movement Cells of the Frontal Eye Field Are the Origin of Spatial Reentry

In the search for the saliency map, proposed by the classical hypothesis of spatial attention, a task-relevant increase has been reported in several fronto-parietal areas that process space, such as LIP (Bisley and Goldberg, 2003) and FEF (Bichot and Schall, 1999a). However, the major question is not which areas reflect attention but which areas are likely candidates for a spatially organized feedback signal — the source of spatial attention in the ventral pathway. Some recent experiments reported presaccadic activity in V4 (Moore, 1999; Tolias et al., 2001) which is likely to originate from the FEF (Moore and Armstrong, 2003). Since visual, visuomovement and movement cells exhibit target discrimination, spatial attention could be explained by a visual selection model or a movement plan model. Thompson and Schall (2000) observed a discrimination in the visuomovement cells and proposed a direct feedback of these cells into V4. We observe this discrimination in our model as well (Fig. 11). However, we suggest a movement plan model. Movement neurons have a late response and no phasic burst in response to stimulus onset. They show only little enhancement for distractors in visual search (Bichot et al., 2001b) and correct rejections in masking experiments (Thompson and Schall, 2000). Thus, movement cells are decoupled from direct visual processing. Our model suggests feedforward excitation and global inhibition from the visuomotor cells as a possible mechanism. Such a mechanism ensures that a broad activation pattern within the visuomovement cells is not transferred to movement cells. A strong and early feedback for target and distractors, as predicted if the phasic visual or visuomovement cells are the origin of reentry, introduces a selective bias in V4 and IT, which is sensitive to noise. We could only reconcile the experimental data with the simulation by assuming a feedback from the movement cells. 
The timing of a strong discrimination for our FEF movement cells, beginning 150 ms after array onset and 110 ms before eye movement, fits very well with the late target effect in the experimental data. This result is also consistent with information theory. If we define the reentry signal towards the target (true expected location) as the signal of interest and overall firing rate (false expected location) as noise, we would get a much higher signal/noise ratio in the movement cells than in the visuomovement cells.
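The signal/noise argument can be written down directly. The sketch below assumes one particular reading of the definition given above: the signal is the rate of the cell encoding the target location, and the noise is the mean rate of the cells encoding all other locations. The function name and the example rates are hypothetical.

```python
def reentry_snr(rates, target_index):
    """Ratio of the target cell's rate to the mean rate of all other cells.

    rates        -- firing rates, one per encoded location
    target_index -- position of the target location in `rates`
    """
    others = [r for i, r in enumerate(rates) if i != target_index]
    if not others:
        raise ValueError("need at least one non-target location")
    noise = sum(others) / len(others)
    if noise == 0:
        return float("inf")
    return rates[target_index] / noise


# Hypothetical normalized rates; index 0 is the target location.
visuomovement = [0.8, 0.5, 0.5, 0.4]  # broad activity across locations
movement = [0.8, 0.1, 0.1, 0.05]      # activity concentrated on the target

print("visuomovement cells:", reentry_snr(visuomovement, 0))
print("movement cells:     ", reentry_snr(movement, 0))
```

A broad activation pattern, as assumed here for the visuomovement cells, lowers the ratio even when the target cell itself responds equally strongly in both populations.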

Given this definition of spatial attention, our model predicts that target discrimination in the visuomovement cells can indicate spatial attention (Figs 8a, 11). However, in our model, target discrimination in the visuomovement cells guides spatial selection but does not provide the causal connection to spatial attention in V4.

#### Target Discrimination Translates into Latency of a Spatial Reentry Signal in Visual Search

We observed that a low target discrimination results in a slow and error-prone reentry process. As a result of a correct reentry, our simple model already predicts set-size effects in parallel searching (Fig. 7).

What factors might determine the input into the FEF (V4 activity) in such a way that it needs more time to select the target? Duncan and Humphreys (1992) have shown that varying the target–distractor and distractor–distractor similarity changes the efficiency of the task and produces different search slopes. The underlying reason could be that the different similarities determine the discrimination of the target in the ventral stream but do not produce any delay as such. An increasing set size might also reduce the discrimination through competitive interactions in V4. Since the ventral stream feeds the fronto-parietal network, the initial discrimination in action planning centers must also be poorer. Our simulations show that this poorer discrimination causes a slower spatial selection (Fig. 10). We observed selection times in the movement cells ranging from 220 to 400 ms after stimulus onset. Longer selection processes have not been observed, since noise in the system enforces either a correct or a wrong selection. Depending on the efficiency of the search task our parallel mechanism can show a difference of 180 ms in selection time. Thus, we predict no faster selection times of covert attention than ∼120 ms, which is the discrimination time of movement cells in the fastest trial (Fig. 12a). Under the assumption that the number of items in the display affects the target–distractor discrimination, we predict that shallow but non-flat search slopes are based on a parallel mechanism. The prediction of a parallel search is of course difficult to test, since it would require showing the absence of any repetitive serial selection. However, we can give theoretical evidence that a slow reentry signal from movement cells can explain non-flat search slopes as the result of a parallel process.

## Discussion

We aimed to demonstrate the suitability of our reentry hypothesis by comparing simulations with experimental data. Each modeled area exhibits a temporal course of activity that has been observed by similar physiological experiments performed by various investigators. Our approach is an attempt to tie together the existing understanding into a unified whole, so that we can better understand the interactions between different areas and design appropriate future experiments. We have demonstrated that the model can account for recent findings (Sato et al., 2001; Bichot et al., 2001a; Chelazzi et al., 2001) for which the model was not adjusted. Moreover, the simulations resulted in several experimentally testable predictions. We now discuss possible impacts of our study on theories of visual perception.

### Reentry and Competitive Mechanisms Evoke Attention

Attention is generally assumed to be computed within some brain areas in order to control processing in the brain. For example, Posner and Dehaene (1994) suggested that there were anterior and posterior attention systems. Such a localized view of attention is even more explicit in models in which attention originates within a saliency map (Treisman and Gelade, 1980; Wolfe, 1994; Itti and Koch, 2000). Other models have emphasized the controlling function of attention such as selective tuning (Tsotsos et al., 1995), the shifter-circuit (Olshausen et al., 1993) or a gain field (Salinas and Abbott, 1997). We admit that such models can be useful to describe aspects of attention, but they offer only a very abstract explanation of this phenomenon. Electrophysiology has started to investigate the neural mechanisms of attention. For example, within the biased competition framework attention has been suggested to be an emergent property of neural mechanisms (Desimone and Duncan, 1995). In particular, effects within the receptive field of cells have been revealed. In addition, the feature-similarity framework (Treue and Martínez Trujillo, 1999) suggests that mechanisms of feedback implement a global gain control.

Some recent computational models have emphasized the role of interactions within a network for explaining vision (Tononi et al., 1992; Mumford, 1992; Hamker, 1999; Kirkland and Gerstein, 1999; Hamker, 2000; Corchs and Deco, 2002). However, we are still missing an approach that allows us to describe how different areas contribute to object detection, attention and eye movement control. Tasks such as Chelazzi's visual search experiment can only be fully explained by an account that shows how different areas operate on the same event (Duncan et al., 1997). The present approach is particularly relevant, since each area is clearly defined and its cell dynamics have been observed in various experiments. We even account for the subdivision of cells in the FEF. This constraint considerably improves the validity of the claim that attention can be explained by already known areas, which compute specific variables, but not attention itself. We suggest that attention should not be regarded as a resource given by some control module. Attention is the result of mechanisms that act on the processed variables, such as gain control, by reentry and competitive interactions. We propose that future research focuses on identifying the areas that modulate vision. Movement cells of the FEF could provide an ideal signal for spatial selection. Other relevant areas controlling vision are the planning stages of the task at hand, which set task instructions and compute variables of interest. The mechanism described allows vision to be under cognitive control to resolve interference and to connect high-level task descriptions or actions with low-level scene descriptions.

### The Mechanism of Spatial Reentry Influences the Search Slope

In most visual search tasks the reaction time of subjects increases with the number of items. Two opposing theories have been suggested. The serial search hypothesis assumes that non-flat search slopes are necessarily the result of a scanning process that visits one item after another (Treisman and Gelade, 1980; Treisman and Sato, 1990; Wolfe, 1994; Itti and Koch, 2000). This assumption sometimes results in selection times of 30–50 ms per item. Parallel search has explained set-size effects in terms of a slow competitive mechanism (Duncan and Humphreys, 1989; Palmer, 1995; Deco et al., 2002).

Hybrid models have also been formulated (Bundesen, 1990, 1999; Chelazzi, 1999). They typically differentiate between a parallel capacity-limited ‘one-view search’ and an additional slow spatial shift of attention. However, they do not specify the underlying neural mechanisms, so that it is unclear on what kind of processes search is based. Since observation of human reaction times does not allow one or the other explanation to be ruled out, experiments using a variety of methods have recently been conducted to ascertain the type of process (Corbetta et al., 1995; Woodman and Luck, 1999; Donner et al., 2000; Hopf et al., 2000; Leonards et al., 2000). Although some experiments tried to identify areas involved in a serial selection, the overall results are still inconclusive.

Our suggested spatial reentry mechanism predicts the involvement of a slow parallel as well as a serial component in visual search. Based on our simulation results we suggest that the brain does not have a fast scanning mechanism, only a slow one. We explain shallow but non-flat search slopes by a poorer and slower discrimination process for reentry. Steep search slopes, however, are likely to be based on sequential reentry components. Interestingly, both modes are grounded in the same process. The strength of our approach lies in its testable predictions, which are an inherent result of the assumption that FEF movement cells provide a spatially selective reentry signal. Thus, we offer a clear description of the underlying process that can lead to set-size effects. The timing of the spatial reentry signal depends on the target discrimination and is therefore a variable parallel process. A poor discrimination, however, can lead to a wrong reentry. Since a distractor will be identified as such by the enhanced gain of cells encoding the distractor, a disengagement and following engagement of the spatial reentry component introduces the serial mechanism.

### Benefits and Limitations of the Model

At the model's core a reentry signal acts multiplicatively on the input of a cell, and thus gain control is described by means of a comparison of the feedforward with the reentry signal. The exact implementation in the brain is controversial; however, on an abstract level, multiplicative interactions are consistent with observations (Eskandar et al., 1992; McAdams and Maunsell, 1999; Hupé et al., 2001). Although we achieve a good fit with the temporal course of activity in several areas, and we have shown earlier that such a gain control also fits with recent experiments observing attention effects in V4 (Hamker, 2004a), at present it would be too early to claim that this describes a universal mechanism to implement a cognitive control of vision.
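The multiplicative character of the reentry signal can be illustrated with a one-line gain rule. The form `feedforward * (1 + gain * reentry)` is an assumption chosen for this sketch, as are the function and parameter names; it is one common way to write a multiplicative modulation and is not the gain-control equation of the model.

```python
def gain_modulated_input(feedforward, reentry, gain=1.0):
    """Scale a cell's feedforward input by a reentry signal.

    Assumed form: feedforward * (1 + gain * reentry). Without any
    feedforward input the reentry signal alone produces no drive.
    """
    return feedforward * (1.0 + gain * reentry)


# Hypothetical values: reentry enhances an existing response ...
print(gain_modulated_input(feedforward=2.0, reentry=0.5))
# ... but cannot create a response where there is no stimulus input.
print(gain_modulated_input(feedforward=0.0, reentry=0.5))
```

The second call shows the property that distinguishes a multiplicative interaction from an additive bias: feedback modulates stimulus-driven activity but does not drive a cell on its own.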

We have excluded the effects of stimulus-driven saliency. Consistent with our model, these effects might emerge from interactions in the network as well (Nothdurft et al., 1999; Kapadia et al., 2000; Li, 2002; Hochstein and Ahissar, 2002). Salient features would then be enhanced in a similar way to feature-based, top-down effects.

We compared our model with data in which the monkey responded by making an eye movement towards the target. Chelazzi et al. (1998) report similar findings in a task where the monkey responded by pressing a lever. Our model would also produce qualitatively similar results if we assume that in this task the monkey is planning an eye movement, but movement cells do not reach threshold activity. At present, no experiment has studied FEF movement cell activity in covert attention tasks.

We do not claim that the FEF movement cells are the only source of spatial reentry. Within a distributed system, other areas are likely to have established similar mechanisms. The model is based on current anatomical and electrophysiological knowledge. Other areas, if necessary, can be included based on our gain control mechanism without changing the basic functionality described. Our simulations cannot prove that the movement cells or the FEF in general necessarily are responsible for the reentry signal. However, feedback from the visuomovement cells or no feedback at all resulted in a poor fit with the temporal course of activity in IT. Thus, based on our computational evidence, we suggest that the typical temporal course of activity of the FEF movement cells (Figs 8b, 12) is a necessary signal to discriminate the target from the background. Provided that anatomical studies show evidence for feedback connections, this prediction could be used to preselect cells in other brain areas in order to investigate if they are a source of reentry. LIP, for example, has only a few movement-type cells.

A strength of this model is the testability of its predictions. In future work this model will be tested with other experimental paradigms. We have already managed to scale up the model to cope with natural scenes (Hamker and Worcester, 2002). From the theoretical point of view, our simulations reveal that an action/perception network can operate in a coordinated fashion by means of reentry. The decision in one area affects the outcome of the competition in another area, so that finally all areas operate on the same problem, an aspect of binding in the brain.

## Appendix I: Computational Aspects of the Model

We now give a formal description of the model. We first explain the input stimuli as well as the mechanisms of pooling and gain control. Then the equations of each area are given. Each connection in the model has an independent additive noise term that leads to variations in the transmission from one cell to another.

### Stimuli

Input stimuli $I_{d,i,x}$ are encoded as a population of cells i determined by a Gaussian distribution at each dimension d and each location x. For realistic experimental conditions, we delayed the input for 30 ms to account for the time a stimulus needs to reach V2. Since V1 cells typically fire very strongly in the beginning and then decrease in firing rate, we include a short-term synaptic depression $S_{d,i,x}$ (similar to Chance et al., 1998, 1999) of the input.

(2)
$\begin{array}{lll}\tau_{S}\frac{\mathrm{d}}{\mathrm{d}t}s_{d,i,x}=I_{d,i,x}-s_{d,i,x};&S_{d,i,x}=(1-d_{D}\cdot s_{d,i,x})&\begin{array}{l}d_{D}=0.45\\\tau_{S}=0.08\,\mathrm{ms}\end{array}\end{array}$
The input into V4 is then computed as $S_{d,i,x} \cdot I_{d,i,x}$. This mechanism evokes a strong early response, which is useful to transfer the stimulus information within a bottom-up wave into higher areas and then allow top-down control to take over.
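The input stage can be illustrated with a minimal sketch. It assumes a Gaussian tuning width of one cell, reads the depression factor in equation (2) as the constant 0.45, leaves the time unit abstract (only the ratio of step size to the depression time constant matters) and omits the 30 ms input delay. The function names `gaussian_population` and `depressed_input` are hypothetical and not taken from the original implementation.

```python
import numpy as np

def gaussian_population(preferred, value, width=1.0):
    """Population response of cells i to a feature `value`.
    `width` is an assumed tuning width, not a value from the text."""
    return np.exp(-0.5 * ((preferred - value) / width) ** 2)

def depressed_input(I, d_D=0.45, tau_S=0.08, dt=0.001, steps=300):
    """Euler integration of tau_S * ds/dt = I - s with S = 1 - d_D * s.
    Returns the effective input S * I at every step."""
    s = np.zeros_like(I)
    effective = np.empty((steps,) + I.shape)
    for t in range(steps):
        effective[t] = (1.0 - d_D * s) * I  # depression factor times raw input
        s = s + dt / tau_S * (I - s)        # s relaxes towards the input
    return effective

preferred = np.arange(10, dtype=float)      # hypothetical preferred feature values
I = gaussian_population(preferred, value=4.0)
effective = depressed_input(I)
print(effective[0, 4], effective[-1, 4])    # onset response vs. depressed response
```

In this sketch the best-tuned cell starts at its full input and then settles at a lower sustained level, which is the strong-onset behavior the depression term is meant to produce.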

### Gain Control

We describe the modulation of the firing rate

$$\mathbf{\mathrm{r}}_{d,k,x}^{\mathrm{II}}(t)$$
of a population with a set of neurons kN(T) in an arbitrary area II. Each cell receives input
$$\mathbf{\mathrm{I}}_{d,k,x}^{\mathrm{II}}(t)$$
from cells
$$\mathbf{\mathrm{r}}_{d,i,x{^\prime}}^{\mathrm{I}}(t)$$
at a lower hierarchy level at the positions x′ within its receptive field x′ ∈ RF(x). Each of these populations usually encodes a different variable V(d,x′;t).

The signal

$$r_{d,i,x{^\prime}}^{\mathrm{I}}$$
is sent through a linear filter F (Fig. 13). For simplicity, we do not take topographically extended patterns or an increasing complexity of features into account. Thus, the preferred stimulus
$$\mathbf{\mathrm{u}}_{i}^{\mathrm{I}}$$
of a cell i in area I samples the same feature space as the preferred stimulus
$$\mathbf{\mathrm{u}}_{i}^{\mathrm{II}}$$
in area II. The filter
$$F\left(r_{d,i,x{^\prime}}^{\mathrm{I}}\right){=}r_{d,i,x{^\prime}}^{\mathrm{I}}{\cdot}g_{k}\left({\Vert}\mathbf{\mathrm{u}}_{k}^{\mathrm{II}}{-}\mathbf{\mathrm{u}}_{i}^{\mathrm{I}}{\Vert}\right)$$
defines the feature space in area II with the preferred attribute
$$\mathbf{\mathrm{u}}_{k}^{\mathrm{II}}$$
by a set of radial basis functions gk.

Figure 13.

Sketch of the V4 and IT model to explain how afferents determine the output of a cell. Each cell in a higher area II

$$r_{k,x}^{II}$$
(e.g. in ITs) receives a weighted input from each cell in a lower area I
$$r_{i,x{^\prime}}^{I}$$
(e.g. from V4) at different locations x′ within its receptive field. Feedback connections
$${\hat{r}}_{i,x{^\prime}}$$
increase the input gain. For example, ITs receives feedback from PF working memory and FEF movement cells (Fig. 2). After the gain control stage, a spatial pooling function f is applied. Inhibition among target cells is modeled by an inhibitory pooling among all cells in the population. The final response is then determined by a differential equation, which describes the change through time of a model cell's activity.

The filtered incoming pattern is continuously compared with the expectation, such as spatial location or specific stimulus features. The gain is enhanced if the expectation

$${\hat{r}}_{d,i,x{^\prime}}^{\mathrm{{\gamma}}}$$
from the origin γ matches the feedforward signal
$$F\left(r_{d,i,x{^\prime}}^{\mathrm{I}}\right).$$
Treue and Martínez Trujillo (1999) found evidence for an additive combination of feature-based and spatial attention. Similarly, we assume that feature-specific feedback
$${\hat{r}}_{d,i,x}^{F}(t)$$
and location-specific feedback
$${\hat{r}}_{d,i,x}^{L}(t)$$
independently increase the gain of the bottom-up signal and add up.

We use a non-linear pooling function f to define the influence of the filtered afferents

$$F\left(r_{d,i,x{^\prime}}^{\mathrm{I}}\right)$$
on the cell k. To describe the process of filtering, input gain control and pooling, we define a convergent mapping function
$$\mathcal{R}$$
(see Mallot et al., 1990, for a general approach of neural mapping) of the activity at the populations of locations x′ within RF(x) onto the input
$$I_{d,k,x}^{\mathrm{II}}(t)$$
of a target population
$$r_{d,k,x}^{\mathrm{II}}(t)$$
at the location x in area II (Fig. 13):
(3)
\begin{eqnarray*}&&\mathcal{R}:\mathrm{Area}{\,}\mathrm{I}_{x{^\prime}}{\mapsto}\mathrm{Area}{\,}\mathrm{II}_{x}\\&&I_{d,k,x}^{\mathrm{II}}{=}w^{{\uparrow}}{\cdot}f{\,}\left(F\left(r_{d,i,x{^\prime}}^{\mathrm{I}}\right)\right){+}{{\sum}_{\mathrm{{\gamma}}{\in}{\{}L,F{\}}}}\mathrm{{\sigma}}\left(A{-}r_{d,k,x}^{\mathrm{II}}\right){\cdot}f{\,}\left(F\left(r_{d,i,x{^\prime}}^{\mathrm{I}}\right){\cdot}{\hat{r}}_{d,i,x{^\prime}}^{\mathrm{{\gamma}}}\right)\\&&\mathrm{{\sigma}}(a){=}max(a,0)\end{eqnarray*}
Gain control implements a multiplicative influence of feedback onto the feedforward stream. This is based on empirical data that shows that feedback connections can rapidly facilitate responses to stimuli, but do not drive cells without bottom-up activation (Hupé et al., 2001). When feedforward and feedback inputs are simultaneously active, feedback inputs could provide late polysynaptic excitatory post-synaptic potentials that influence the gain by the offset of slow inhibitory post-synaptic potentials, which provides an amplifying mechanism (Shao and Burkhalter, 1999).

Chelazzi et al. (1998) reported no attention effect on a single stimulus within a receptive field. A simple multiplicative gain increase would predict an even stronger effect. Reynolds et al. (2000) found that the effect of spatial attention can be best described as a contrast gain model. Attention increases the effective strength of a stimulus but not with high-contrast stimuli. Chelazzi et al. (1998) also used high-contrast stimuli. We do not aim to explain the possible underlying mechanisms of this effect here, but rather account for the finding by decreasing the efficiency of the feedback signal when the cell activity is higher according to

$$\mathrm{{\sigma}}\left(A{-}r_{d,k,x}^{\mathrm{II}}\right)$$
in equation (3). If the firing rate of a cell is
$$r_{d,k,x}^{\mathrm{II}}{=}A{=}0.42,$$
the effect of the feedback signal vanishes, and it weakens progressively as the firing rate approaches this value. In other words, the relative effect of the expectation increases with smaller inputs into the layer. This is also in accordance with findings in anesthetized monkeys where feedback into V1, V2 and V3 was more efficient for low-salience stimuli (Hupé et al., 2001). The mechanism implemented is similar to the saturation term introduced by Grossberg (1973), which was also used by Reynolds et al. (1999) to simulate the effect of spatial attention. However, we use this saturation only in the feedback pathway.
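A minimal sketch of the input computation in equation (3) may help to make the saturation explicit. It assumes an identity filter F (cell k in area II samples the same feature as cell i in area I), max-pooling over locations, and a feedforward weight of 0.9 as used in the area equations. The function name `gain_controlled_input` and the dictionary of feedback sources are illustrative choices, not the original code.

```python
import numpy as np

A = 0.42  # saturation level stated in the text

def sigma(a):
    """Rectification sigma(a) = max(a, 0)."""
    return np.maximum(a, 0.0)

def gain_controlled_input(F, r_II, feedback, w_up=0.9):
    """F: filtered afferents, shape (locations x', cells).
    r_II: current rates of the target population, shape (cells,).
    feedback: dict mapping a source name (e.g. "L", "F") to an
    expectation array with the same shape as F."""
    drive = w_up * F.max(axis=0)                        # pooled feedforward term
    for r_hat in feedback.values():
        # feedback multiplies the afferents and is scaled by sigma(A - r)
        drive = drive + sigma(A - r_II) * (F * r_hat).max(axis=0)
    return drive

F = np.array([[0.2, 0.0],
              [0.0, 0.3]])                  # two locations, two cells
spatial = np.array([[1.0, 1.0],
                    [0.0, 0.0]])            # location-specific expectation at x' = 0
print(gain_controlled_input(F, np.zeros(2), {"L": spatial}))
print(gain_controlled_input(F, np.full(2, A), {"L": spatial}))
```

The second call shows the saturation: once the target cells fire at A, the feedback term contributes nothing and only the feedforward drive remains. Feedback alone cannot drive a cell either, because it enters only as a product with the afferent signal.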

### Pooling Across Afferents

Following a previous study (Hamker, 2004a), we simulate a convergent projection from areas with smaller receptive field sizes to areas with larger receptive field sizes (Fig. 13) with a max-pooling function:

$$f\left(F\left(r_{d,i,x{^\prime}}^{\mathrm{I}}\right)\right){=}max\left(F\left(r_{d,i,x{^\prime}}^{\mathrm{I}}\right)\right).$$
Using essentially the proposed area V4 alone, we compared the predictions of sum- and max-pooling. We found that both pooling functions can account for data from investigations into the competition between a pair of stimuli within a V4 receptive field (Reynolds et al., 1999). However, if we present an additional probe stimulus with the pair, sum-pooling predicts a bottom-up bias, whereas the competition using max-pooling is robust against the additional stimulus. Thus, max-pooling ensures that activities from different locations x′ of the receptive field do not add up on individual neurons k, but are simultaneously represented within the population. Thus, two equal objects do not result in a double activity, but two different objects are represented by different peaks within the population. A similar mechanism has been reported to improve the robustness of object recognition in hierarchical models (Riesenhuber and Poggio, 1999).
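The difference between the two pooling functions for identical and for different objects can be shown with a toy example. The population patterns below are made up for illustration and the helper `pool` is hypothetical.

```python
import numpy as np

def pool(F, mode):
    """Pool filtered afferents F of shape (locations, cells) across locations."""
    return F.max(axis=0) if mode == "max" else F.sum(axis=0)

a = np.array([1.0, 0.2, 0.0, 0.0])   # population pattern evoked by object A
b = np.array([0.0, 0.0, 0.2, 1.0])   # population pattern evoked by object B

same = np.stack([a, a])              # two identical objects in the receptive field
diff = np.stack([a, b])              # two different objects in the receptive field

print(pool(same, "max"))             # same pattern as a single object
print(pool(same, "sum"))             # activity doubles under sum-pooling
print(pool(diff, "max"))             # two separate peaks within the population
```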

### Model V4

At each of six possible locations x ∈{1…6} and each feature dimension d we simulate a neural population

$$\mathbf{\mathrm{r}}^{V4}$$
:
(4)
$\begin{array}{ll}\mathrm{{\tau}}\frac{\mathrm{d}}{\mathrm{d}t}r_{d,i,x}^{V4}{=}I_{d,i,x}^{{\uparrow}}{+}I_{d,i,x}^{{\leftrightarrow}}{+}I_{d,i,x}^{{\downarrow}}{-}\left(r_{d,i,x}^{V4}{+}0.1\right)I_{d,x}^{inh}{-}Br_{d,i,x}^{V4};&\begin{array}{l}B{=}0.08\\\mathrm{{\tau}}{=}0.01{\,}\mathrm{s}\end{array}\end{array}$
The input is a result of bottom-up input
$$I^{{\uparrow}}$$
(equation 5) modulated by lateral
$$I^{{\leftrightarrow}}$$
(equation 6) and top-down gain control
$$I^{{\downarrow}}$$
(equation 7).
$$Br_{d,i,x}^{V4}$$
is a baseline inhibition term that keeps noise balanced. $I_{d,i,x}$ is defined by the task (Fig. 1). The lateral weights $w_{ij}$ are computed from a Gaussian with $w_{ii} = 0.3$ and $\sigma^2 = 1$. The feedback-type input originates in ITt and FEFm (Fig. 2).
(5)
$I_{d,i,x}^{{\uparrow}}{=}w^{{\uparrow}}I_{d,i,x}{\cdot}S_{d,i,x};{\ }w^{{\uparrow}}{=}0.9$

(6)
$I_{d,i,x}^{{\leftrightarrow}}{=}I_{d,i,x}^{{\uparrow}}{\cdot}\mathrm{{\sigma}}\left(A{-}r_{d,i,x}^{V4}\right){{\sum}_{j}}w_{ij}r_{d,j,x}^{V4}$

(7)
\begin{eqnarray*}&&I_{d,i,x}^{\downarrow}=I_{d,i,x}^{\uparrow}\cdot\sigma\left(A-r_{d,i,x}^{V4}\right)w^{\mathrm{ITt},\mathrm{V}4}r_{d,i}^{\mathrm{ITt}}+I_{d,i,x}^{\uparrow}\cdot\sigma\left(A-r_{d,i,x}^{V4}\right)w^{\mathrm{FEFm},\mathrm{V}4}r_{x}^{\mathrm{FEFm}}\\&&w^{\mathrm{ITt},\mathrm{V}4}=20;\ w^{\mathrm{FEFm},\mathrm{V}4}=10\end{eqnarray*}
Each population experiences short- and long-range inhibition (equation 8). We assume that long-range inhibition (Desimone and Schein, 1987) is mediated by a pool of inhibitory neurons
$$z_{d,x}^{V4}(t)$$
which collect the activity of each population.
(8)
$\begin{array}{ll}I_{d,x}^{inh}{=}w_{inh}{{\sum}_{i}}r_{d,i,x}^{V4}{+}w_{inh}^{RF}z_{d}^{V4};&\begin{array}{l}w_{inh}{=}1.3\\w_{inh}^{RF}{=}0.5\end{array}\end{array}$

(9)
$\mathrm{{\tau}}_{inh}^{RF}\frac{\mathrm{d}}{\mathrm{d}t}z_{d}^{V4}{=}{{\sum}_{x}}{\mathrm{max}_{i}}\left[r_{d,i,x}^{V4}\right]{-}z_{d}^{V4};{\ }\mathrm{{\tau}}_{inh}^{RF}{=}0.2{\,}\mathrm{s}$
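Equations (4) to (9) can be combined into a single Euler update for one feature dimension. The sketch below is an illustration under several assumptions: the additive noise is omitted, the Gaussian lateral weights are computed over cell index distance, the step size of 1 ms is arbitrary, and no rectification of the rates is applied because none is stated. The names `lateral_weights` and `v4_step` are hypothetical.

```python
import numpy as np

A, B, TAU, TAU_INH = 0.42, 0.08, 0.01, 0.2          # time constants in seconds
W_UP, W_ITT, W_FEFM, W_INH, W_INH_RF = 0.9, 20.0, 10.0, 1.3, 0.5

def sigma(a):
    return np.maximum(a, 0.0)

def lateral_weights(n, w_ii=0.3, var=1.0):
    """Gaussian lateral weights w_ij (distance in cell indices is an assumption)."""
    idx = np.arange(n)
    return w_ii * np.exp(-(idx[:, None] - idx[None, :]) ** 2 / (2.0 * var))

def v4_step(r, z, I_eff, r_itt, r_fefm, W, dt=0.001):
    """One Euler step for one feature dimension.
    r: (locations, cells) V4 rates; z: scalar long-range inhibitory pool;
    I_eff: (locations, cells) depressed input I * S;
    r_itt: (cells,) ITt feedback; r_fefm: (locations,) FEFm feedback."""
    I_up = W_UP * I_eff                                         # eq. (5)
    gate = I_up * sigma(A - r)
    I_lat = gate * (r @ W.T)                                    # eq. (6)
    I_down = gate * (W_ITT * r_itt[None, :] + W_FEFM * r_fefm[:, None])  # eq. (7)
    I_inh = W_INH * r.sum(axis=1, keepdims=True) + W_INH_RF * z  # eq. (8)
    dr = I_up + I_lat + I_down - (r + 0.1) * I_inh - B * r      # eq. (4)
    dz = r.max(axis=1).sum() - z                                # eq. (9)
    return r + dt / TAU * dr, z + dt / TAU_INH * dz

W = lateral_weights(5)
I_eff = np.zeros((2, 5))
I_eff[:, 2] = 1.0                                   # same stimulus at two locations
r0 = np.zeros((2, 5))
r_fb, _ = v4_step(r0, 0.0, I_eff, np.zeros(5), np.array([1.0, 0.0]), W)
r_no, _ = v4_step(r0, 0.0, I_eff, np.zeros(5), np.zeros(2), W)
print(r_fb[0, 2], r_no[0, 2])                       # with vs. without spatial reentry
```

With identical stimuli at both locations, the location that receives the FEFm signal gains an advantage after the very first step, whereas the other location is initially unaffected.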

### Model IT

In our model we do not increase the complexity of features from V4 to IT. Thus, our model IT populations represent the same feature space as our model V4 populations. The receptive field size, however, increases in our model, so that all populations in V4 converge onto one population in IT.

(10)
\begin{eqnarray*}&&\mathrm{{\tau}}\frac{\mathrm{d}}{\mathrm{d}t}r_{d,i}^{\mathrm{ITs}}{=}f\left(I_{d,i,x}^{{\uparrow}}\right){+}f\left(I_{d,i,x}^{{\leftrightarrow}}\right){+}f\left(I_{d,i,x}^{{\downarrow}}\right){-}\left(r_{d,i}^{ITs}{+}0.1\right)I_{d}^{inh}{-}Br_{d,i}^{\mathrm{ITs}}\\&&f{=}{\mathrm{max}_{x}};{\ }B{=}1.8\end{eqnarray*}
The overall input depends on the V4 cells that drive the population and on the feedback signals that enhance the sensitivity of IT cells (Fig. 2). The lateral weights $w_{ij}$ are computed as in V4.
(11)
$I_{d,i,x}^{{\uparrow}}{=}w^{{\uparrow}}r_{d,i,x}^{V4};{\ }w^{{\uparrow}}{=}0.9$

(12)
$I_{d,i,x}^{{\leftrightarrow}}{=}I_{d,i,x}^{{\uparrow}}{\cdot}\mathrm{{\sigma}}\left(A{-}r_{d,i}^{ITs}\right){{\sum}_{j}}w_{ij}r_{d,j}^{ITs}$

(13)
\begin{eqnarray*}&&I_{d,i,x}^{\downarrow}=I_{d,i,x}^{\uparrow}\cdot\sigma\left(A-r_{d,i}^{ITs}\right)w^{\mathrm{PFwm},\mathrm{ITs}}r_{d,i}^{\mathrm{PFwm}}+I_{d,i,x}^{\uparrow}\cdot\sigma\left(A-r_{d,i}^{ITs}\right)w^{\mathrm{FEFm},\mathrm{ITs}}r_{x}^{\mathrm{FEFm}}\\&&w^{\mathrm{PFwm},\mathrm{ITs}}=10;\ w^{\mathrm{FEFm},\mathrm{ITs}}=10\end{eqnarray*}
The inhibitory components are similar to V4 except that we only implemented one IT population.
(14)
$\begin{array}{ll}I_{d}^{inh}{=}w_{inh}{{\sum}_{i}}r_{d,i}^{ITs}{+}w_{inh}^{RF}z_{d}^{ITs}&\begin{array}{l}w_{inh}{=}0.14\\w_{inh}^{RF}{=}1.5\end{array}\end{array}$

(15)
$\tau_{inh}^{RF}\frac{\mathrm{d}}{\mathrm{d}t}z_{d}^{ITs}=\sum_{i}r_{d,i}^{ITs}-z_{d}^{ITs};\ \tau_{inh}^{RF}=0.1\,\mathrm{s}$
IT target (ITt) cells receive input only from IT stimulus (ITs) cells (Fig. 2). Through strong competition, these cells ensure that only a few active cells feed back into V4. The lateral weights $w_{ij}$ are computed as in V4.
(16)
$\mathrm{{\tau}}\frac{\mathrm{d}}{\mathrm{d}t}r_{d,i}^{\mathrm{ITt}}{=}I_{d,i}^{{\uparrow}}{+}I_{d,i}^{{\leftrightarrow}}{-}\left(r_{d,i}^{ITt}{+}2\right)I_{d}^{inh}{-}Br_{d,i}^{\mathrm{ITt}};{\ }B{=}1.8$

(17)
$I_{d,i}^{\uparrow}=w^{\uparrow}\sigma\left(r_{d,i}^{\mathrm{ITs}}-0.2\right);\ \mathrm{with}\ \sigma(a)=\max(a,0);\ w^{\uparrow}=1.4$

(18)
$I_{d,i}^{{\leftrightarrow}}{=}I_{d,i}^{{\uparrow}}{\cdot}\mathrm{{\sigma}}\left(A{-}r_{d,i}^{ITt}\right){{\sum}_{j}}w_{ij}r_{d,j}^{ITt}$

(19)
$I_{d}^{inh}{=}w_{inh}{{\sum}_{i}}r_{d,i}^{ITt};{\ }w_{inh}{=}0.6$
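For the ITs population, equations (10) to (13) apply the max over locations to each input term separately. The following sketch computes only the excitatory drive for one feature dimension, leaving out inhibition, noise and the ITt stage. The function name `its_drive` and the Gaussian weights over cell index distance are assumptions made for illustration.

```python
import numpy as np

A, W_UP, W_PFWM, W_FEFM = 0.42, 0.9, 10.0, 10.0

def sigma(a):
    return np.maximum(a, 0.0)

def lateral_weights(n, w_ii=0.3, var=1.0):
    idx = np.arange(n)
    return w_ii * np.exp(-(idx[:, None] - idx[None, :]) ** 2 / (2.0 * var))

def its_drive(r_v4, r_its, r_pfwm, r_fefm, W):
    """Excitatory drive of ITs cells for one feature dimension.
    r_v4: (locations, cells); r_its, r_pfwm: (cells,); r_fefm: (locations,)."""
    I_up = W_UP * r_v4                                          # eq. (11)
    gate = I_up * sigma(A - r_its)[None, :]
    I_lat = gate * (W @ r_its)[None, :]                         # eq. (12)
    I_down = gate * (W_PFWM * r_pfwm[None, :] + W_FEFM * r_fefm[:, None])  # eq. (13)
    # f = max_x is applied to each term on its own, as in eq. (10)
    return I_up.max(axis=0) + I_lat.max(axis=0) + I_down.max(axis=0)

r_v4 = np.zeros((2, 5))
r_v4[0, 1] = 0.3                     # target feature at location 0
r_v4[1, 3] = 0.3                     # distractor feature at location 1
template = np.zeros(5)
template[1] = 1.0                    # PF working memory holds the target feature
drive = its_drive(r_v4, np.zeros(5), template, np.zeros(2), lateral_weights(5))
print(drive)
```

Both stimuli evoke the same V4 activity here, yet the cell matching the working memory template receives a much larger drive, which is the feature-specific bias described in the text.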

### Model PF

The underlying circuits, which are responsible for memory and the detection of a match, can involve many regions including subcortical areas. For simplicity, we assume a recurrent local circuit for working memory which is driven by ITs cells. The lateral weights $w_{ij}$ are computed from a Gaussian with $w_{ii} = 0.3$ and $\sigma^2 = 0.6$. Match cells (PFm) compare in parallel the current pattern in ITs cells with those in working memory (PFwm) (Fig. 2).

(20)
$\mathrm{{\tau}}\frac{\mathrm{d}}{\mathrm{d}t}r_{d,i}^{\mathrm{PFwm}}{=}I_{d,i}^{{\uparrow}}{+}{{\sum}_{j}}w_{ij}r_{d,j}^{\mathrm{PFwm}}{-}\left(r_{d,i}^{\mathrm{PFwm}}{+}0.25{+}I^{\mathrm{store}}\right)I_{d}^{inh}$

(21)
$I_{d}^{inh}{=}w_{inh}{{\sum}_{i}}r_{d,i}^{\mathrm{PFwm}};{\ }w_{inh}{=}0.4$

(22)
$I_{d,i}^{{\uparrow}}{=}\mathrm{{\sigma}}\left(0.35{-}{\mathrm{max}_{i}}\left(r_{d,i}^{\mathrm{PFwm}}\right)\right)\mathrm{{\sigma}}\left(r_{d,i}^{\mathrm{ITs}}{-}C\right);{\ }\mathrm{{\sigma}}(a){=}max(a,0)$
The variable $I^{\mathrm{store}}$ defines whether a pattern that fulfills
$$r_{d,i}^{\mathrm{ITs}}{-}C{>}0$$
with C = 0.1 should be memorized. It is externally set according to the task instruction. If a pattern is memorized, the term
$$\mathrm{{\sigma}}\left(0.35{-}\mathrm{max}_{i}\left(r_{d,i}^{\mathrm{PFwm}}\right)\right)$$
ensures that no other stimulus in IT can penetrate the memory.

To determine whether a pattern in the visual scene is similar to the pattern in memory, we multiply the activity of the working memory cells by that of the ITs cells. Activity increases in the match cells only if populations in ITs and working memory match. Cells with such characteristics have been observed (Freedman et al., 2002). The lateral weights $w_{ij}$ are computed as in PF working memory.

(23)
$\mathrm{{\tau}}\frac{\mathrm{d}}{\mathrm{d}t}r_{d,i}^{\mathrm{PFm}}{=}I_{d,i}^{{\uparrow}}{+}{{\sum}_{j}}w_{ij}r_{d,j}^{\mathrm{PFm}}{-}\left(r_{d,i}^{\mathrm{PFm}}{+}w_{f{\,}inh}\right)I_{d}^{inh};{\ }w_{f{\,}inh}{=}0.5$

(24)
$I_{d,i}^{{\uparrow}}{=}w^{{\uparrow}}r_{d,i}^{\mathrm{PFwm}}r_{d,i}^{\mathrm{ITs}}$

(25)
$I_{d}^{inh}{=}w_{inh}{{\sum}_{i}}r_{d,i}^{\mathrm{PFm}}$
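The two feedforward terms of the PF model, equations (22) and (24), are simple enough to sketch directly. The feedforward weight in equation (24) is not given numerically in this appendix, so the default of 1.0 below is purely an assumption, as are the function names.

```python
import numpy as np

def sigma(a):
    return np.maximum(a, 0.0)

def wm_input(r_pfwm, r_its, C=0.1, limit=0.35):
    """Eq. (22): ITs input reaches working memory only while memory is empty."""
    gate = sigma(limit - r_pfwm.max())     # closes once a pattern is stored
    return gate * sigma(r_its - C)

def match_input(r_pfwm, r_its, w_up=1.0):
    """Eq. (24): match cells are driven by the product of memory and ITs activity.
    w_up = 1.0 is an assumed value."""
    return w_up * r_pfwm * r_its

r_its = np.array([0.05, 0.5])
print(wm_input(np.zeros(2), r_its))                 # empty memory accepts the pattern
print(wm_input(np.array([0.5, 0.0]), r_its))        # loaded memory blocks new input
print(match_input(np.array([0.5, 0.0]), np.array([0.4, 0.4])))
```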

### Model FEF

We simulate frontal eye field visuomovement neurons which receive convergent afferents from V4 at the same retinotopic location (Fig. 2). Different dimensions d add up.

(26)
$\mathrm{{\tau}}\frac{\mathrm{d}}{\mathrm{d}t}r_{x}^{\mathrm{FEFv}}{=}I_{x}^{{\uparrow}}{-}r_{x}^{\mathrm{FEFv}}I^{inh}{-}Br_{x}^{\mathrm{FEFv}};{\ }B{=}0.3$

(27)
$\begin{array}{ll}I_{x}^{\uparrow}=w^{\mathrm{V}4\mathrm{s}}\sum_{d}\max_{i}\left(r_{d,i,x}^{\mathrm{V}4\mathrm{s}}\right)+w^{\mathrm{FEFm}}r_{x}^{\mathrm{FEFm}};&\begin{array}{l}w^{\mathrm{V}4\mathrm{s}}=0.5\\w^{\mathrm{FEFm}}=0.2\end{array}\end{array}$

(28)
$I^{inh}{=}w_{inh}{\mathrm{max}_{x}}\left(r_{x}^{\mathrm{FEFv}}\right);{\ }w_{inh}{=}0.5$
The firing rate of these cells could be interpreted as representing the saliency or behavioral relevance of a location. Increased activity in FEF movement cells occurs when FEF fixation cells disinhibit the population (Fig. 2). This disinhibition occurs when the PF match cells signal a match with the target (the monkeys in the experiment were trained to make an eye movement only towards the target and to hold fixation in the target-absent condition). In the cue presentation phase, PF match cells have no influence over the FEF fixation cells. In addition to a feedforward excitation, the visuomovement cells exert a slight surround inhibition on the movement cells. A strong self-excitatory component
$$I_{x}^{{\leftrightarrow}}$$
allows the movement cells to ramp up. Since there is evidence that saccades are produced when movement-related activity in the FEF reaches a particular level (Hanes and Schall, 1996), we apply a fixed threshold to FEF movement cells and initiate a saccade 30 ms after their activity exceeds this threshold.
(29)
$\mathrm{{\tau}}\frac{\mathrm{d}}{\mathrm{d}t}r_{x}^{\mathrm{FEFm}}{=}I_{x}^{{\uparrow}}{+}I_{x}^{{\leftrightarrow}}{-}r_{x}^{\mathrm{FEFm}}I_{x}^{inh}$

(30)
$I_{x}^{{\uparrow}}{=}r_{x}^{\mathrm{FEFv}}{-}{{\sum}_{x{^\prime}{\neq}x}}w_{x,x{^\prime}}^{\mathrm{FEFv}}r_{x{^\prime}}^{\mathrm{FEFv}};{\ }w_{x,x{^\prime}}^{\mathrm{FEFv}}{=}0.15$

(31)
$I_{x}^{{\leftrightarrow}}{=}w^{\mathrm{FEFm}}r_{x}^{\mathrm{FEFm}};{\ }w^{\mathrm{FEFm}}{=}0.2$

(32)
$\begin{array}{ll}I_{x}^{inh}{=}w_{inh}{\mathrm{max}_{x}}\left(r_{x}^{\mathrm{FEFm}}\right){+}{{\sum}_{x{^\prime}{\neq}x}}w_{x,x{^\prime}}^{map}r_{x{^\prime}}^{\mathrm{FEFm}}{+}r^{\mathrm{FEFf}};&\begin{array}{l}w_{inh}{=}0.5\\w_{x,x{^\prime}}^{map}{=}3.6\end{array}\end{array}$
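Equations (26) to (32) and the saccade rule can be sketched as follows. Several choices here are assumptions: the time constant of the FEF cells is not stated and is set to the V4 value of 0.01 s, the fixation activity is passed in as an external scalar, noise and rectification are omitted, and the threshold passed to `saccade_time` is a free parameter because its value is not given. All function names are hypothetical.

```python
import numpy as np

TAU = 0.01                          # assumed, taken over from V4
B_V = 0.3
W_V4, W_FEFM = 0.5, 0.2
W_INH_V, W_INH_M = 0.5, 0.5
W_SURROUND, W_MAP = 0.15, 3.6

def fef_step(r_v, r_m, v4_drive, r_fix, dt=0.001):
    """One Euler step for visuomovement (r_v) and movement (r_m) cells.
    v4_drive: (locations,) sum over dimensions of the maximum V4 rate;
    r_fix: scalar activity of the fixation cells."""
    I_up_v = W_V4 * v4_drive + W_FEFM * r_m                     # eq. (27)
    I_inh_v = W_INH_V * r_v.max()                               # eq. (28)
    dr_v = I_up_v - r_v * I_inh_v - B_V * r_v                   # eq. (26)
    I_up_m = r_v - W_SURROUND * (r_v.sum() - r_v)               # eq. (30)
    I_lat_m = W_FEFM * r_m                                      # eq. (31)
    I_inh_m = W_INH_M * r_m.max() + W_MAP * (r_m.sum() - r_m) + r_fix  # eq. (32)
    dr_m = I_up_m + I_lat_m - r_m * I_inh_m                     # eq. (29)
    return r_v + dt / TAU * dr_v, r_m + dt / TAU * dr_m

def saccade_time(trace, threshold, dt=0.001, delay=0.030):
    """Time of the first threshold crossing plus the 30 ms delay, or None."""
    above = np.flatnonzero(np.asarray(trace) > threshold)
    return None if above.size == 0 else above[0] * dt + delay

r_v, r_m = np.zeros(3), np.zeros(3)
r_v, r_m = fef_step(r_v, r_m, np.array([1.0, 0.0, 0.0]), r_fix=1.0)
print(r_v, r_m)
```

Because the fixation activity enters the shunting inhibition of the movement cells, lowering `r_fix` is what allows the self-excitation to build up activity towards the threshold.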

### Specification of Parameters

The temporal dynamics, including the effect of inhibitory pools within each area, have been worked out over several years, starting from an early simple model (Hamker, 1999). Once the dynamics, including the gain control mechanism, had been set up, the parameters of the model were specified from local to global. Our choice of parameters was guided by the typical course of activity measured in cell recordings. V4 was fitted to experimental data from an attention experiment (Hamker, 2004a). The fine-tuning to fit the experimental data of Chelazzi et al. (1998) was done by iteratively adjusting the weights between the areas while keeping the parameters within the areas fixed. The final values used are examples for which the model exhibits dynamics that closely resemble those of the recordings of Chelazzi et al. (1998). The qualitative behavior of the model is stable over a reasonable range of the parameters. Although the model contains several parameters to simulate the firing rates, the degrees of freedom are strongly limited by the constraint of matching the typical course of activity and by anatomical constraints. Such systems models differ largely from mathematical models (e.g. Bundesen, 1999) in which parameters are much less constrained by electrophysiology and anatomy.

## Appendix II: Conjunctive Search Task

Two conjunction visual search experiments have been simulated: a target with three distractors and a target with five distractors. We construct a target item in two dimensions, i.e. ‘color’ and ‘shape’. The color-similar distractor activates the same neural population as the target in the first dimension, and the shape-similar distractor activates the same population as the target in the second dimension. The four-item display contains a target, a dissimilar, a shape-similar and a color-similar distractor. The six-item display is extended with an additional shape-similar and color-similar distractor. The target ‘color’ and ‘shape’ are stored in memory before the search begins without showing a cue.

To investigate interesting dependencies between correct and error trials, as well as easy and difficult trials, we varied the search efficiency of the task by varying the top-down weight from the PF working memory to IT. Among other sources that determine search efficiency, this simulates the availability of a target template. The simulations are repeated 80 times for each set size. Unlike the simulation of the experiment of Chelazzi et al. (1998), a saccade is always executed even if the match with the target template is poor.
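The composition of the two displays can be written down as a small data structure. In this sketch each item is a pair of population indices, one per dimension, and the index values 0 and 1 are arbitrary labels chosen for illustration. The function name `conjunction_display` is hypothetical.

```python
def conjunction_display(set_size):
    """Items as (color population, shape population) pairs; index 0 is the target value."""
    target = (0, 0)
    color_similar = (0, 1)     # shares the color population with the target
    shape_similar = (1, 0)     # shares the shape population with the target
    dissimilar = (1, 1)
    items = [target, dissimilar, shape_similar, color_similar]
    if set_size == 6:
        items += [shape_similar, color_similar]
    elif set_size != 4:
        raise ValueError("only set sizes 4 and 6 are described in the text")
    return items

print(conjunction_display(4))
print(conjunction_display(6))
```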

## Appendix III: Target Discrimination Analysis

To determine the time at which neural activity in FEF visuomovement cells discriminates the target from distractors, we defined a discrimination threshold. For sufficient discrimination of the target the difference between its activity and the activity of a cell encoding a distractor location has to exceed the discrimination threshold for 15 ms. This is much simpler than the method used by Sato et al. (2001) for their recordings, but sufficient for a reliable measurement, since our model cells are less noisy than real cells. For all simulations we used the same model parameters.
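The criterion can be sketched as a search for the first run of samples in which the activity difference stays above threshold for 15 ms. The sketch assumes a sampling step of 1 ms and reports the onset of that run. Whether the onset or the end of the run is taken as the discrimination time is not stated in the text, so this is an assumption, as is the function name.

```python
import numpy as np

def discrimination_time(target, distractor, threshold, dt=0.001, hold=0.015):
    """Onset time (s) of the first period in which target minus distractor
    activity exceeds `threshold` continuously for `hold` seconds, else None."""
    diff = np.asarray(target) - np.asarray(distractor)
    need = int(round(hold / dt))           # number of consecutive samples required
    run = 0
    for t, above in enumerate(diff > threshold):
        run = run + 1 if above else 0      # reset the counter on any dip
        if run >= need:
            return (t - need + 1) * dt
    return None

target = np.zeros(100)
target[10:15] = 1.0                        # brief 5 ms excursion, too short to count
target[40:] = 1.0                          # sustained separation from 40 ms on
print(discrimination_time(target, np.zeros(100), threshold=0.5))
```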

I am grateful to Jamie Mazer, Jeffrey Schall, Leonardo Chelazzi, Rufin VanRullen and Christof Koch for helpful comments on earlier versions of this manuscript. I also thank Narcisse Bichot and Andrew Rossi for valuable discussions. This research was supported by DFG HA2630/2-1 and in part by the ERC Program of the NSF (EEC-9402726).

## References

Asaad WF, Rainer G, Miller EK (2000) Task-specific neural activity in the primate prefrontal cortex. J Neurophysiol 84:451–459.
Bahill AT, Adler D, Stark L (1975) Most naturally occurring human saccades have magnitudes of 15 degrees or less. Invest Ophthalmol 14:468–469.
Baizer JS, Ungerleider LG, Desimone R (1991) Organization of visual inputs to the inferior temporal and posterior parietal cortex in macaques. J Neurosci 11:168–190.
Bichot NP, Schall JD (1999a) Effects of similarity and history on neural mechanisms of visual selection. Nat Neurosci 2:549–554.
Bichot NP, Schall JD (1999b) Saccade target selection in macaque during feature and conjunction visual search. Vis Neurosci 16:81–89.
Bichot NP, Thompson KG, Rao SC, Schall JD (2001a) Reliability of macaque frontal eye field neurons signaling saccade targets during visual search. J Neurosci 21:713–725.
Bichot NP, Rao SC, Schall JD (2001b) Continuous processing in macaque frontal eye cortex during visual search. Neuropsychologia 39:972–982.
Bisley JW, Goldberg ME (2003) Neuronal activity in the lateral intraparietal area and spatial attention. Science 299:81–86.
Bruce CJ, Goldberg ME (1985) Primate frontal eye fields. I. Single neurons discharging before saccades. J Neurophysiol 53:603–635.
Bundesen C (1990) A theory of visual attention. Psychol Rev 97:523–547.
Bundesen C (1999) A computational theory of visual attention. In: Attention, space and action: studies in cognitive neuroscience (Humphreys GW, Duncan J, Treisman A, eds), pp. 54–71. Oxford: Oxford University Press.
Burman DD, Bruce CJ (1997) Suppression of task-related saccades by electrical stimulation in the primate's frontal eye field. J Neurophysiol 77:2252–2267.
Chance FS, Nelson SB, Abbott LF (1998) Synaptic depression and the temporal response characteristics of V1 cells. J Neurosci 18:4785–4799.
Chance FS, Nelson SB, Abbott LF (1999) Complex cells as cortically amplified simple cells. Nat Neurosci 2:277–282.
Chelazzi L (1999) Serial attention mechanisms in visual search: a critical look at the evidence. Psychol Res 62:195–219.
Chelazzi L, Miller EK, Duncan J, Desimone R (1993) A neural basis for visual search in inferior temporal cortex. Nature 363:345–347.
Chelazzi L, Duncan J, Miller EK, Desimone R (1998) Responses of neurons in inferior temporal cortex during memory-guided visual search. J Neurophysiol 80:2918–2940.
Chelazzi L, Miller EK, Duncan J, Desimone R (2001) Responses of neurons in macaque area V4 during memory-guided visual search. Cereb Cortex 11:761–772.
Corbetta M, Shulman GL, Miezin FM, Petersen SE (1995) Superior parietal cortex activation during spatial attention shifts and visual feature conjunction. Science 270:802–805.
Corchs S, Deco G (2002) Large-scale neural model for visual attention: integration of experimental single-cell and fMRI data. Cereb Cortex 12:339–348.
Deco G, Pollatos O, Zihl J (2002) The time course of selective visual attention: theory and experiments. Vision Res 42:2925–2945.
Desimone R, Schein SJ (1987) Visual properties of neurons in area V4 of the macaque: sensitivity to stimulus form. J Neurophysiol 57:835–868.
Desimone R, Duncan J (1995) Neural mechanisms of selective attention. Annu Rev Neurosci 18:193–222.
Dias EC, Bruce CJ (1994) Physiological correlate of fixation disengagement in the primate's frontal eye field. J Neurophysiol 72:2532–2537.
Donner T, Kettermann A, Diesch E, Ostendorf F, Villringer A, Brandt SA (2000) Involvement of the human frontal eye field and multiple parietal areas in covert visual selection during conjunction search. Eur J Neurosci 12:3407–3414.
Duncan J, Humphreys GW (1989) Visual search and stimulus similarity. Psychol Rev 96:433–458.
Duncan J, Humphreys GW (1992) Beyond the search surface: visual search and attentional engagement. J Exp Psychol Hum Percept Perform 18:578–588.
Duncan J, Humphreys GW, Ward R (1997) Competitive brain activity in visual attention. Curr Opin Neurobiol 7:255–261.
Eskandar EN, Optican LM, Richmond BJ (1992) Role of inferior temporal neurons in visual memory. II. Multiplying temporal waveforms related to vision and memory. J Neurophysiol 68:1296–1306.
Freedman DJ, Riesenhuber M, Poggio T, Miller EK (2001) Categorical representation of visual stimuli in the primate prefrontal cortex. Science 291:312–316.
Freedman DJ, Riesenhuber M, Poggio T, Miller EK (2002) Visual categorization and the primate prefrontal cortex: neurophysiology and behavior. J Neurophysiol 88:929–941.
Grossberg S (1973) Contour enhancement, short term memory, and constancies in reverberating neural networks. Stud Appl Math 52:217–257.
Hamker FH (1999) The role of feedback connections in task-driven visual search. In: Connectionist models in cognitive neuroscience (Heinke D, Humphreys GW, Olson A, eds), pp. 252–261. London: Springer Verlag.
Hamker FH (2000) Distributed competition in directed attention. In: Proceedings in artificial intelligence, Vol. 9: Dynamische Perzeption (Baratoff G, Neumann H, eds), pp. 39–44. Berlin: AKA, Akademische Verlagsgesellschaft.
Hamker FH (2001) Attention as a result of distributed competition. Soc Neurosci Abstr 27:348.10.
Hamker FH (2002) How does the ventral pathway contribute to spatial attention and the planning of eye movements? In: Dynamic perception (Würtz RP, Lappe M, eds), pp. 83–88. St Augustin: Infix Verlag.
Hamker FH (2003) The reentry hypothesis: linking eye movements to visual perception. J Vision 11:808–816.
Hamker FH (2004a) Predictions of a model of spatial attention using sum- and max-pooling functions. Neurocomputing 56C:329–343.
Hamker FH (2004b) A dynamic model of how feature cues guide spatial attention. Vision Res 44:501–521.
Hamker FH, Worcester J (2002) Object detection in natural scenes by feedback. In: Biologically motivated computer vision (Bülthoff HH et al., eds), pp. 398–407. Berlin: Springer Verlag.
Hanes DP, Schall JD (1996) Neural control of voluntary movement initiation. Science 274:427–430.
Hanes DP, Patterson WF 2nd, Schall JD (1998) Role of frontal eye fields in countermanding saccades: visual, movement, and fixation activity. J Neurophysiol 79:817–834.
Hasegawa RP, Blitz AM, Geller NL, Goldberg ME (2000) Neurons in monkey prefrontal cortex that track past or predict future performance. Science 290:1786–1789.
Hochstein S, Ahissar M (2002) View from the top: hierarchies and reverse hierarchies in the visual system. Neuron 36:791–804.
Hopf JM, Luck SJ, Girelli M, Hagner T, Mangun GR, Scheich H, Heinze HJ (2000) Neural sources of focused attention in visual search. Cereb Cortex 10:1233–1241.
Hupé JM, James AC, Girard P, Lomber SG, Payne BR, Bullier J (
2001
) Feedback connections act on the early part of the responses in monkey visual cortex.
J Neurophysiol

85
:
134
–145.
Ignashchenkova A, Dicke PW, Haarmeier T, Thier P (2004) Neuron-specific contribution of the superior colliculus to overt and covert shifts of attention. Nat Neurosci 7:56–64.
Itti L, Koch C (2000) A saliency-based search mechanism for overt and covert shifts of visual attention. Vision Res 40:1489–1506.
Kapadia MK, Westheimer G, Gilbert CD (2000) Spatial distribution of contextual interactions in primary visual cortex and in visual perception. J Neurophysiol 84:2048–2062.
Kirkland KL, Gerstein GL (1999) A feedback model of attention and context dependence in visual cortical networks. J Comput Neurosci 7:255–267.
Koechlin E, Anton JL, Burnod Y (1999) Bayesian inference in populations of cortical neurons: a model of motion integration and segmentation in area MT. Biol Cybern 80:25–44.
Knoblauch A, Palm G (2002) Scene segmentation by spike synchronization in reciprocally connected visual areas. II. Global assemblies and synchronization on larger space and time scales. Biol Cybern 87:168–184.
Leonards U, Sunaert S, Van Hecke P, Orban GA (2000) Attention mechanisms in visual search: an fMRI study. J Cogn Neurosci 12(suppl):61–75.
Li Z (2002) A saliency map in primary visual cortex. Trends Cogn Sci 6:9–16.
Luck SJ, Chelazzi L, Hillyard SA, Desimone R (1997) Mechanisms of spatial selective attention in areas V1, V2 and V4 of macaque visual cortex. J Neurophysiol 77:24–42.
Mallot HA, von Seelen W, Giannakopoulos F (1990) Neural mapping and space-variant image processing. Neural Networks 3:245–263.
McAdams CJ, Maunsell JHR (1999) Effects of attention on orientation-tuning functions of single neurons in macaque cortical area V4. J Neurosci 19:431–441.
Miller EK, Cohen JD (2001) An integrative theory of prefrontal cortex function. Annu Rev Neurosci 24:167–202.
Miller EK, Li L, Desimone R (1993) Activity of neurons in anterior inferior temporal cortex during a short-term memory task. J Neurosci 13:1460–1478.
Miller EK, Erickson CA, Desimone R (1996) Neural mechanisms of visual working memory in prefrontal cortex of the macaque. J Neurosci 16:5154–5167.
Moore T (1999) Shape representations and visual guidance of saccadic eye movements. Science 285:1914–1917.
Moore T, Fallah M (2001) Control of eye movements and spatial attention. Proc Natl Acad Sci U S A 98:1273–1276.
Moore T, Armstrong KM (2003) Selective gating of visual signals by microstimulation of frontal cortex. Nature 421:370–373.
Mumford D (1992) On the computational architecture of the neocortex. II. The role of cortico-cortical loops. Biol Cybern 66:241–251.
Murthy A, Thompson KG, Schall JD (2001) Dynamic dissociation of visual selection from saccade programming in frontal eye field. J Neurophysiol 86:2634–2637.
Nothdurft HC, Gallant JL, Van Essen DC (1999) Response modulation by texture surround in primate area V1: correlates of ‘popout’ under anesthesia. Vis Neurosci 16:15–34.
Olshausen B, Anderson C, Van Essen D (1993) A neurobiological model of visual attention and invariant pattern recognition based on dynamic routing of information. J Neurosci 13:4700–4719.
Palmer J (1995) Attention in visual search: distinguishing four causes of set-size effects. Curr Dir Psychol Sci 4:118–123.
Posner MI, Dehaene S (1994) Attentional networks. Trends Neurosci 17:75–79.
Reynolds JH, Chelazzi L, Desimone R (1999) Competitive mechanisms subserve attention in macaque areas V2 and V4. J Neurosci 19:1736–1753.
Reynolds JH, Pasternak T, Desimone R (2000) Attention increases sensitivity of V4 neurons. Neuron 26:703–714.
Riesenhuber M, Poggio T (1999) Hierarchical models of object recognition in cortex. Nat Neurosci 2:1019–1025.
Rockland KS, van Hoesen GW (1994) Direct temporal-occipital feedback connections to striate cortex (V1) in the macaque monkey. Cereb Cortex 4:300–313.
Rockland KS, Saleem KS, Tanaka K (1994) Divergent feedback connections from areas V4 and TEO in the macaque. Vis Neurosci 11:579–600.
Salinas E, Abbott LF (1997) Invariant visual responses from attentional gain fields. J Neurophysiol 77:3267–3272.
Sato T, Schall JD (2001) Pre-excitatory pause in frontal eye field responses. Exp Brain Res 139:53–58.
Sato TR, Schall JD (2003) Effects of stimulus-response compatibility on neural selection in frontal eye field. Neuron 38:637–648.
Sato T, Murthy A, Thompson KG, Schall JD (2001) Search efficiency but not response interference affects visual selection in frontal eye field. Neuron 30:583–591.
Schall JD (1995) Neural basis of saccade target selection. Rev Neurosci 6:63–85.
Schall JD (2002) The neural selection and control of saccades by the frontal eye field. Phil Trans R Soc Lond B 357:1073–1082.
Schall JD, Morel A, King DJ, Bullier J (1995a) Topography of visual cortex connections with frontal eye field in macaque: convergence and segregation of processing streams. J Neurosci 15:4464–4487.
Schall JD, Hanes DP, Thompson KG, King DJ (1995b) Saccade target selection in frontal eye field of macaque. I. Visual and premovement activation. J Neurosci 15:6905–6918.
Shao Z, Burkhalter A (1999) Role of GABAB receptor-mediated inhibition in reciprocal interareal pathways of rat visual cortex. J Neurophysiol 81:1014–1024.
Sommer MA, Wurtz RH (2000) Composition and topographic organization of signals sent from the frontal eye field to the superior colliculus. J Neurophysiol 83:1979–2001.
Stanton GB, Bruce CJ, Goldberg ME (1993) Topography of projections to the frontal lobe from the macaque frontal eye fields. J Comp Neurol 330:286–301.
Stanton GB, Bruce CJ, Goldberg ME (1995) Topography of projections to posterior cortical areas from the macaque frontal eye fields. J Comp Neurol 353:291–305.
Stanton GB, Goldberg ME, Bruce CJ (1988) Frontal eye field efferents in the macaque monkey: I. Subcortical pathways and topography of striatal and thalamic terminal fields. J Comp Neurol 271:473–492.
Tanaka K, Saito HA, Fukada Y, Moriya M (1991) Coding visual images of objects in the inferotemporal cortex of the macaque monkey. J Neurophysiol 66:170–189.
Tanji J, Hoshi E (2001) Behavioral planning in the prefrontal cortex. Curr Opin Neurobiol 11:164–170.
Thompson KG, Schall JD (2000) Antecedents and correlates of visual detection and awareness in macaque prefrontal cortex. Vision Res 40:1523–1538.
Thompson KG, Bichot NP, Schall JD (1997) Dissociation of visual discrimination from saccade programming in macaque frontal eye field. J Neurophysiol 77:1046–1050.
Tolias AS, Moore T, Smirnakis SM, Tehovnik EJ, Siapas AG, Schiller PH (2001) Eye movements modulate visual receptive fields of V4 neurons. Neuron 29:757–767.
Tomita H, Ohbayashi M, Nakahara K, Hasegawa I, Miyashita Y (1999) Top-down signal from prefrontal cortex in executive control of memory retrieval. Nature 401:699–703.
Tononi G, Sporns O, Edelman G (1992) Reentry and the problem of integrating multiple cortical areas: simulation of dynamic integration in the visual system. Cereb Cortex 2:310–335.
Treisman AM, Gelade G (1980) A feature integration theory of attention. Cogn Psychol 12:97–136.
Treisman A, Sato S (1990) Conjunction search revisited. J Exp Psychol Hum Percept Perform 16:459–478.
Treue S, Maunsell JH (1999) Effects of attention on the processing of motion in macaque middle temporal and medial superior temporal visual cortical areas. J Neurosci 19:7591–7602.
Treue S, Martínez Trujillo JC (1999) Feature-based attention influences motion processing gain in macaque visual cortex. Nature 399:575–579.
Tsotsos JK, Culhane SM, Wai W, Lai Y, Davis N, Nuflo F (1995) Modeling visual attention via selective tuning. Artificial Intelligence 78:507–545.
Usher M, Niebur E (1996) Modeling the temporal dynamics of IT neurons in visual search: a mechanism for top-down selective attention. J Cogn Neurosci 8:311–327.
Webster MJ, Bachevalier J, Ungerleider LG (1994) Connections of inferior temporal areas TEO and TE with parietal and frontal cortex in macaque monkeys. Cereb Cortex 4:470–483.
White IM, Wise SP (1999) Rule-dependent neuronal activity in the prefrontal cortex. Exp Brain Res 126:315–335.
Woodman GF, Luck SJ (1999) Electrophysiological measurement of rapid shifts of attention during visual search. Nature 400:867–869.
Wolfe J (1994) Guided search 2.0: a revised model of visual search. Psychonom Bull Rev 1:202–238.