Scalp event-related potential (ERP) studies in humans indicate that face processes taking place between 130 and 170 ms after stimulus onset at posterior sites (N170) are strongly reduced when another face stimulus is processed concurrently or has been presented shortly before for a prolonged period. These observations suggest that neural representations of individual faces compete in the occipitotemporal cortex as early as 130 ms. Here, we tested the respective role of spatial attention and sensory competition in accounting for the amplitude reduction of the N170 during concurrent face stimulation. ERPs time locked to a lateralized face stimulus were recorded while subjects were fixating either a face or a control scrambled-face stimulus (context factor) and were engaged in either a high- or a low-attentional load task at fixation (task factor). The N170 amplitude to the lateralized face stimulus was reduced both when the central stimulus was a face compared with a scrambled face and when the attentional load at fixation was high. However, these effects of context and task factors were largely additive. Most importantly, spatial attention modulated visual processes as early as 80 ms after stimulus onset, whereas sensory competition effects started at about 130 ms. These results provide strong evidence that the N170 in response to faces is modulated by spatial attention, and also that spatial attention and sensory competition do not reflect the same mechanisms of early selection of visual information in the extrastriate cortex.
Event-related potential (ERP) studies of face processing have reported a large negative deflection over occipitotemporal electrode sites on the scalp, starting at around 130 ms and peaking at around 170 ms following stimulus onset, the N170 (e.g., Botzel and others 1995; Bentin and others 1996; Rossion and others 2000; Rousselet and others 2004). This component is thought to reflect high-level visual face processes taking place in multiple visual cortical areas during interlocked time courses (see e.g., Henson and others 2003; Rossion and others 2003; Itier and Taylor 2004).
It was recently observed that the N170 in response to a lateralized face stimulus is massively reduced in amplitude when subjects are fixating a central face picture, as compared with when they are fixating a nonface stimulus matched for low-level visual properties (Jacques and Rossion 2004). Similarly, fixating a face—relative to a control stimulus—for a prolonged period (i.e., 5 s) results in a category-specific suppression of the N170 elicited by another face presented subsequently (Kovacs and others 2006).
The N170 reduction during concurrent stimulation is in line with single-cell recording studies in the monkey inferotemporal (IT) cortex showing that neurons tuned to respond best to face stimuli (Perrett and others 1982; Rolls 1992) exhibit a decrease of their response when more than one stimulus is present in the visual field (e.g., Miller and others 1993; Rolls and Tovee 1995). These effects are generally interpreted as reflecting competitive interactions among visual stimuli for neural representation, to the extent that these stimuli recruit a common population of neurons (Desimone and Duncan 1995; Desimone 1998; Keysers and Perrett 2002).
A major interest of such a concurrent stimulation paradigm in scalp ERPs is that it can be used to test the extent and the time course of the interaction between different shape representations. For instance, fixating a nonface object in a domain of expertise, such as a car picture in car experts, leads to a reduction of the N170 elicited by a face stimulus presented next to the central object (Rossion and others 2004; B. Rossion, V. Goffaux, D. Collin, unpublished data). These observations suggest that when presented concurrently, faces and nonface objects in a domain of expertise compete for common early visual categorization processes in the occipitotemporal cortex. However, there is at least one alternative explanation for this reduction of the face N170 in a concurrent presentation paradigm. That is, subjects may allocate fewer attentional resources to the visual field location where the lateralized face stimulus appears when another behaviorally relevant stimulus such as a face (or a car picture in car experts) is present at fixation. In other words, the reduction of the N170 recorded to a face stimulus may result from a general spatially based attentional modulation rather than from an underlying competition taking place between neural representations of faces or of faces and objects of expertise.
This alternative explanation is based on behavioral and neural experiments showing that early visual processes are modulated by spatial attention. Behaviorally, spatial attention modulations have generally been examined by using spatial cueing paradigms in which a cue or an instruction predicts the location of a forthcoming target stimulus. In addition, spatially based selection of visual information has also been investigated by varying the perceptual/attentional load at the attended location. Overall, these studies reveal a facilitation/suppression of the processing of stimuli appearing at the attended/unattended location (in spatial cueing paradigm: Cheal and Lyon 1991; Lu and Dosher 1998; Lee and others 1999; Pestilli and Carrasco 2005) or a reduced processing of unattended peripheral distractor stimuli (in perceptual/attentional load paradigms: Lavie and Tsal 1994; Lavie 1995; Plainis and others 2001; for a review, see Lavie 2005). Neuroimaging studies (Rees and others 1997; Tootell and others 1998; Martinez and others 1999; Somers and others 1999; Liu and others 2005), human scalp ERP studies (Luck and others 1994; Hillyard and others 1998; Hopfinger and Mangun 1998; Di Russo and others 2003; for reviews, see Mangun 1995; Luck and others 2000), and single-cell recording in the monkey brain (Moran and Desimone 1985; Spitzer and others 1988; Motter 1993; Connor and others 1997; Luck and others 1997; McAdams and Maunsell 1999) have shown that these behavioral attentional effects are accompanied by modulations of neural activity at a sensory level, in both striate and extrastriate visual areas (for recent reviews, see Kastner and Ungerleider 2000; Reynolds and Chelazzi 2004). 
Moreover, human ERP studies indicate that spatial attention not only affects visual processing at a sensory level but also acts as early as the initial sensory input to extrastriate cortex, as reflected by amplitude modulations of the early visual ERP components P1 and N1, starting at around 60–100 ms after stimulus onset (Hillyard and others 1998; Handy and others 2001; Di Russo and others 2003). Given these observations, it is plausible that spatial attention may account for the massive reduction of the N170 recorded to lateralized faces when presented concurrently either with another face or with an object in a domain of expertise.
The present ERP study was carried out to disentangle the effect of sensory competition between concurrently presented stimuli from the putative visual spatial attention effects that may act on the processing of faces presented in the periphery. To do so, we manipulated both sensory stimulation and attentional load on the fixated stimulus in the same electroencephalography (EEG) recording session. Scalp ERPs elicited by the onset of a lateralized face were recorded while subjects were fixating a central stimulus, either a face or a control Fourier scrambled face. Attentional load at fixation was varied using 2 different tasks that were performed in different blocks, on the exact same stimulus sequence. The first task was the same as in previous experiments (Jacques and Rossion 2004; Rossion and others 2004; B. Rossion, V. Goffaux, D. Collin, unpublished data) and was considered to involve “low-attentional load” on the central fixated stimulus. The second task (high-attentional load) was designed to increase and control the level of attention allocated to the central fixated stimulus.
In short, the same subjects were submitted to the exact same stimuli while the factors spatial attention (high load vs. low load at center) and sensory competition (face vs. scrambled face at fixation) were crossed in a 2 × 2 design.
Based on previous evidence, we drew several hypotheses regarding the effects of spatial attention and sensory competition on the ERPs indexing early visual processing of face stimuli. First, we expected to replicate the N170 amplitude reduction starting at around 130 ms when the lateralized face appeared concurrently with another face. Second, we hypothesized that the N170 to the lateralized face would be reduced when subjects were engaged in the “high-load” task at fixation. Previous studies have found small effects of spatial attention on the face N170 (e.g., Eimer 2000; Holmes and others 2003). However, the N1 component occurring in the same time range as the face N170 in response to the presentation of simple visual stimuli is strongly sensitive to spatial attention (Mangun 1995; Luck and others 2000). Therefore, we hypothesized that the N170 in the present experiment would be strongly modulated by spatial attention, provided that in the high-load task, the load at fixation was sufficiently strong to consume most attentional resources (see Handy and Mangun 2000; Jenkins and others 2003; Lavie 2005). Third, we tested whether the 2 effects (spatial attention and competition) interact significantly at the level of the N170. If the N170 decrease for lateralized faces presented in the context of another face stimulus is largely dependent on spatial attention, this effect should be present in the low-load condition, but largely attenuated in the high-load condition. However, if the N170 decrease observed when 2 faces are presented concurrently is mainly related to sensory competition, its magnitude should remain roughly equal across attentional load conditions.
Materials and Methods
Eighteen paid volunteers (9 females, 3 left handed, mean age 23.2 [± 4] years) gave their informed written consent to participate in this experiment. All subjects had normal or corrected-to-normal vision.
The same stimuli as in our previous study were used (Jacques and Rossion 2004). A set of 36 colored photographs of full front faces (set A), 18 males and 18 females without glasses, facial hair, or makeup, was used. All face photographs were edited in Adobe Photoshop 4.0 (Adobe Systems Inc., San Jose, CA) to remove backgrounds, hair, and everything below the chin. They were all of a neutral facial expression. Two additional sets of stimuli were built from this set of faces. The first additional set (set B) was composed of the faces from set A embedded in a rectangular colored white noise (Fig. 1A, B). The second additional set (set C) was built by scrambling the faces from set B using a Fourier phase randomization procedure (see Nasanen 1999). The phase randomization procedure replaces the phase spectrum of the image with random values, keeping the amplitude spectrum of the image unaltered. The phase randomization procedure yields images that preserve the global low-level properties of the original image (i.e., luminance, contrast, color distribution, spatial frequency distribution, energy) while completely degrading shape information. Faces in stimulus set B were embedded in colored white noise so that they had the same shape, size, and contrast against the background as the phase-scrambled faces (set C). White noise was added in set B before phase randomization so that stimuli from sets B and C contained an equal amount of visual information. All stimuli subtended ∼2.8 × 3.7° of visual angle.
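The phase randomization described above can be illustrated with a minimal sketch. It is not the procedure of Nasanen (1999) or the code used to build set C; it assumes NumPy, and the function name `phase_scramble`, the use of one shared random phase map for all color channels, and the choice of uniform noise as the phase source are assumptions made for illustration.

```python
import numpy as np


def phase_scramble(image, seed=None):
    """Return a phase-randomized copy of a 2-D or (H x W x channels) image.

    Hypothetical helper: the amplitude spectrum of each channel is kept and
    its phase spectrum is replaced by the phase of a random noise image.
    """
    rng = np.random.default_rng(seed)
    img = np.asarray(image, dtype=float)
    is_gray = img.ndim == 2
    if is_gray:
        img = img[:, :, np.newaxis]
    height, width, n_channels = img.shape

    # The phase of real-valued noise is conjugate-antisymmetric, so the
    # inverse transform below is real up to floating-point error.
    random_phase = np.angle(np.fft.fft2(rng.random((height, width))))

    out = np.empty_like(img)
    for c in range(n_channels):
        spectrum = np.fft.fft2(img[:, :, c])
        amplitude = np.abs(spectrum)
        new_phase = random_phase.copy()
        # Keep the original phase of the DC term so mean luminance is preserved.
        new_phase[0, 0] = np.angle(spectrum[0, 0])
        out[:, :, c] = np.fft.ifft2(amplitude * np.exp(1j * new_phase)).real
    return out[:, :, 0] if is_gray else out
```

Sharing one phase map across channels is a common way to keep the relation between color channels intact; whether the original stimuli were built this way is not stated in the text.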
After electrode cap placement, subjects were seated on a comfortable chair, in a light- and sound-attenuated room, at a viewing distance of 100 cm from a computer monitor. Subjects were instructed to visually fixate the center of the monitor during the presentation of 8 consecutive blocks of 72 trials each (with a 1- to 2-min pause between blocks). Stimuli were displayed using E-prime 1.1 (Psychology Software Tools, Inc., Pittsburgh, PA), against a light-gray background (luminance: 34.5 cd/m2). In each trial, a first stimulus (either a face or a phase-scrambled face) appeared at the center of the screen for ∼900 ms (Fig. 1C). Approximately 600 ms (randomized between 500 and 700 ms) after the onset of the first stimulus, a second stimulus (a face from set A) was presented for 300 ms either on the left or on the right side of the first stimulus. The lateralized face (second stimulus) appeared equally often in the left visual field (LVF) and in the right visual field (RVF) in a randomized manner, and its center was located 5.5 cm (3.1°) away from the center of the screen. About 750 ms (600–900 ms) after the simultaneous offset of the first 2 stimuli, a third stimulus identical to the first stimulus appeared for 200 ms at the center of the screen. The intertrial interval was randomized between 1500 and 2100 ms.
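The relation between on-screen distance and visual angle used above can be checked with a short sketch. The function names are hypothetical, and the standard arctangent formulas are assumed.

```python
import math


def eccentricity_deg(offset_cm, distance_cm):
    """Angle (degrees) between fixation and a point offset_cm away on the screen."""
    return math.degrees(math.atan(offset_cm / distance_cm))


def visual_angle_deg(size_cm, distance_cm):
    """Full angle (degrees) subtended by an extent of size_cm centered on the line of sight."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))
```

With the values given in the text, `eccentricity_deg(5.5, 100)` evaluates to about 3.15°, consistent with the reported 3.1°.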
In each block, subjects had to perform one of two different tasks, while the exact same stimulus sequence was used in both tasks. In the first task, which was considered to involve low-attentional load on the central fixated stimulus, subjects had to press a key corresponding to whether the second stimulus had appeared on the left or on the right of the first stimulus. This was the same task as used in our previous studies (Jacques and Rossion 2004; Rossion and others 2004; B. Rossion, V. Goffaux, D. Collin, unpublished data), except that they were instructed to delay their response until the third stimulus appeared on the screen. In the second task (“high-attentional load” at fixation), subjects performed a same/different luminance-matching task between the first and the third stimuli (i.e., central stimuli) and gave their response by pressing one of two response buttons. In half of the trials, the luminance of the first (mean: 15.2 ± 2.4 cd/m2) and the third (26.9 ± 2.7 cd/m2) stimuli was slightly different (mean difference: 11.7 ± 0.48 cd/m2; Fig. 1B). We chose a luminance-matching task for 2 reasons. First, we needed a difficult task to increase the attentional load on the central stimulus. Second, pilot experiments indicated that such a luminance-matching task could be performed equally well on both scrambled and face stimuli.
In order to match both tasks in terms of visual stimulation, the luminance of the first and the third stimuli in the low-load task was varied in exactly the same way as in the high-load task even though luminance information was irrelevant in the low-load task. Therefore, only task instructions distinguished the low-load from the high-load task. All subjects responded with 2 fingers of their right hand. They were instructed to maintain fixation to the center of the screen during the whole trial and to respond as accurately and as fast as possible to the presentation of the third stimulus. Before each block, subjects were informed as to which of the 2 tasks they were to perform during the block. The experimental setup involved a 2 (task) × 2 (type of central stimulus—referred to as “context” in the paper) × 2 (side of stimulation) design with 72 trials/condition, thus resulting in 576 trials. The different conditions within a block as well as the order with which subjects performed the tasks were randomized.
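The trial counts of this design can be made concrete with a minimal sketch of a session generator. The source states 8 blocks of 72 trials, one task per block, and 72 trials per condition; the even split of 4 blocks per task and 18 trials per context × side cell within each block is an assumption that follows from those numbers, and all names below are hypothetical.

```python
import itertools
import random
from collections import Counter

TASKS = ("low_load", "high_load")
CONTEXTS = ("face", "scrambled_face")
SIDES = ("LVF", "RVF")


def build_session(n_blocks=8, trials_per_block=72, seed=None):
    """Return a list of trial dicts for one subject (illustrative only)."""
    rng = random.Random(seed)
    # Assumption: each task is performed in the same number of blocks.
    block_tasks = list(TASKS) * (n_blocks // len(TASKS))
    rng.shuffle(block_tasks)
    cells = list(itertools.product(CONTEXTS, SIDES))
    per_cell = trials_per_block // len(cells)
    session = []
    for block_index, task in enumerate(block_tasks):
        trials = [
            {"block": block_index, "task": task, "context": context, "side": side}
            for context, side in cells
            for _ in range(per_cell)
        ]
        rng.shuffle(trials)  # conditions are randomized within a block
        session.extend(trials)
    return session


def cell_counts(session):
    """Count trials per task x context x side condition."""
    return Counter((t["task"], t["context"], t["side"]) for t in session)
```

Under these assumptions, `cell_counts(build_session())` contains 8 conditions with 72 trials each, for 576 trials in total.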
Scalp EEG was recorded from 58 tin electrodes, mounted in an electrode cap (Quick cap, Neuroscan Inc., El Paso, TX). Electrode positions included the standard 10–20 system locations and additional intermediate positions. Vertical and horizontal eye movements were monitored using 4 additional electrodes placed above and below the left orbit and on the outer canthus of each eye. During EEG recording, all electrodes were referenced to the left earlobe, and electrode impedances were kept below 10 kΩ. EEG signal was digitized at a 1024-Hz sampling rate, and a digital antialiasing filter of 0.27 × sampling rate was applied at recording (therefore, at 1024 Hz sampling rate, the usable bandwidth is 0 to ∼276 Hz). Subjects were instructed to refrain from blinking during experimental blocks.
EEG data were analyzed using EEprobe 3.1 (ANT, Inc., Enschede, The Netherlands). After filtering of the EEG with a digital 30-Hz low-pass filter, epochs in which the standard deviation of the EEG on any electrode within a sliding 200-ms time window exceeded 35 μV were removed. Trials containing eye blink artifacts occurring during the first 2 stimuli (first ∼900-ms period of each trial) were also rejected. Remaining blink artifacts were corrected by subtraction of vertical-electrooculogram (EOG) propagation factors based on principal component analysis-transformed EOG components (Nowagk and Pfeifer 1996). For each subject, for correct trials only, averaged epochs ranging from 200 ms before the onset of the lateralized faces to 800 ms after stimulus onset were computed for each condition separately and baseline corrected using the 200-ms prestimulus epoch. Subjects' averages were re-referenced to a common average reference and high-pass filtered using a 1-Hz high-pass filter, in order to cut off slow anticipatory waves that may be elicited before the onset of the second stimulus (Jacques and Rossion 2004; Rossion and others 2004). The mean number of trials left for averaging was 68 (± 2.8) per subject and condition.
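Three of the steps above (sliding-window rejection, baseline correction, and average re-referencing) can be sketched as follows. This is not the EEprobe implementation. It assumes NumPy 1.20 or later, a window advanced one sample at a time, and the population standard deviation; the filtering and blink-correction steps are omitted, and all function names are hypothetical.

```python
import numpy as np


def exceeds_sliding_sd(epoch_uv, sfreq_hz, window_ms=200.0, threshold_uv=35.0):
    """True if any channel's SD within any sliding window exceeds the threshold.

    epoch_uv: array of shape (n_channels, n_samples), in microvolts.
    """
    win = int(round(window_ms * sfreq_hz / 1000.0))
    if win < 2 or win > epoch_uv.shape[1]:
        raise ValueError("window length must fit inside the epoch")
    # Shape (n_channels, n_windows, win): one row per window position.
    windows = np.lib.stride_tricks.sliding_window_view(epoch_uv, win, axis=1)
    return bool((windows.std(axis=-1) > threshold_uv).any())


def baseline_correct(epoch_uv, times_ms, baseline_ms=(-200.0, 0.0)):
    """Subtract each channel's mean over the prestimulus interval."""
    mask = (times_ms >= baseline_ms[0]) & (times_ms < baseline_ms[1])
    return epoch_uv - epoch_uv[:, mask].mean(axis=1, keepdims=True)


def average_reference(data_uv):
    """Re-reference to the mean of all channels at each time point."""
    return data_uv - data_uv.mean(axis=0, keepdims=True)
```

At the 1024-Hz sampling rate reported above, a 200-ms window corresponds to 205 samples in this sketch.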
Two clear visual ERP components time locked to the onset of the lateralized face were identified: the P1 and the N170. Amplitude values of these 2 components were measured at 8 occipitotemporal electrodes (right hemisphere: P8, P6, PO8, PO6; left hemisphere: P7, P5, PO7, PO5), where both components were the most prominent. (Because we recorded ERPs to lateral face stimuli, the P1 component had a slightly more temporal scalp distribution than usually observed for foveal stimulation. Given that the P1 and the N170 distribution largely overlapped in these conditions, we used the same occipitotemporal electrodes to measure their respective amplitude.) Amplitudes were quantified as the mean voltage measured within 20-ms time windows centered on the grand average peak latencies of the components' maximum, for each condition separately. This procedure was intended to account for differences in the components' latencies with respect to the experimental condition, hemisphere, and lateralization of the second stimulus (LVF/RVF). Amplitude values of each component were then submitted to repeated-measures analyses of variance, with factors task (high load vs. low load), context (face vs. scrambled face), visual field (LVF vs. RVF), hemisphere (right vs. left), and electrode (4 levels). Greenhouse–Geisser adjustments to the degrees of freedom were used when appropriate and polynomial contrasts were performed for post hoc comparisons. In Results, due to the large number of factors and for the purpose of clarity, only effects involving either of our factors of interest (task and context) and their interactions are reported.
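The amplitude measure described above (mean voltage in a 20-ms window centered on the grand average peak latency) can be sketched as follows. NumPy is assumed, the function names are hypothetical, and the search interval used to locate the peak is an illustrative assumption, since the text does not specify one.

```python
import numpy as np


def peak_latency_ms(grand_average_uv, times_ms, search_ms, polarity):
    """Latency of the most positive (polarity > 0) or most negative peak in search_ms."""
    mask = (times_ms >= search_ms[0]) & (times_ms <= search_ms[1])
    segment = grand_average_uv[mask]
    index = segment.argmax() if polarity > 0 else segment.argmin()
    return float(times_ms[mask][index])


def mean_amplitude_uv(erp_uv, times_ms, center_ms, width_ms=20.0):
    """Mean voltage within a window of width_ms centered on center_ms."""
    half = width_ms / 2.0
    mask = (times_ms >= center_ms - half) & (times_ms <= center_ms + half)
    return float(erp_uv[mask].mean())
```

In use, the latency would be located once per condition on the grand average waveform (for example with a hypothetical 130 to 210 ms search interval and `polarity=-1` for the N170), and `mean_amplitude_uv` would then be applied to each subject's average at that latency.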
Although performing almost at ceiling on both tasks (99.5% in the low-load task and 93.6% in the high-load task; Fig. 2), subjects' accuracy was significantly better in the low-load task (F1,17 = 27.76, P < 0.0001). There was also a main effect of context (F1,17 = 7.63, P = 0.013); performances being slightly lower in the scrambled-face context (95.8%) compared with the face context (97.2%). The interaction between task and context (F1,17 = 6.32, P < 0.05) indicated that the scrambled-face context yielded significantly poorer performances in the high-load task only (P = 0.016), not in the low-load task (P = 0.86). Moreover, the task × context × visual field (F1,17 = 4.64, P < 0.05) revealed that this was the case only when lateralized faces appeared in the RVF (P = 0.007; LVF: P = 0.12). In short, accuracy data confirm that the attentional load to the central stimulus was successfully varied across tasks (Fig. 2).
Reaction time data also displayed a main effect of task (F1,17 = 141.96, P < 0.0001); the high-load task was performed more slowly (680 ms) than the low-load task (336 ms). However, it should be noted that in the low-load task, subjects had to delay their response until the third stimulus appeared on the screen, allowing them to anticipate their response, which was based on the second stimulus' location. A main effect of visual field (F1,17 = 5.71, P = 0.029) indicated that subjects were slightly faster when lateralized faces appeared in the LVF (505 ms) relative to the RVF (512 ms).
Following the onset of a lateralized face, 2 main visual scalp ERP components were observed in posterior regions of the scalp (Fig. 3): 1) a positivity (P1) maximal at about 90 ms at contralateral electrode sites and at about 130 ms at ipsilateral sites and 2) a negativity (N170) peaking on average at about 170 and 210 ms in contra- and ipsilateral electrode sites, respectively. Grand average ERPs elicited by the lateralized faces with respect to experimental conditions, visual field and hemisphere, are depicted on Figure 3. Figure 4 depicts ERP responses to the lateralized face displaying the effect of task manipulation collapsed across contexts, as well as ERP responses expressing the effect of context collapsed across tasks. Both task and context effects can best be identified when plotting the scalp topographical distribution of each effect as in Figure 5 for the LVF face stimulation.
The P1 component was reduced in amplitude in the high-load task compared with the low-load task, whether the central stimulus was a face or a scrambled face (Figs 3 and 4). In contrast, there was no amplitude difference between the P1 elicited by a lateralized face appearing in the face context relative to the scrambled-face context. This appears clearly on scalp topographies and differential activities (Fig. 5), where the ERPs elicited in the low-load task start to differ from the ERPs recorded in the high-load task at around 80–100 ms after the onset of the face stimulus. In contrast, ERPs recorded in the face context differ much later (125–145 ms) from those observed in the scrambled-face context.
Confirming these observations, statistical analyses performed on the P1 amplitude revealed a main effect of task (F1,17 = 14.84, P = 0.0013); the P1 recorded in the high-load task being smaller in amplitude than in the low-load task (Fig. 6A). A task × electrode interaction (F1.8,30.8 = 13.86, P < 0.0001) reflected the larger task effect on posterior parietooccipital electrodes (PO7/8, PO5/6: 0.85 μV, P values < 0.001) than on posterior parietal electrodes (P7/8, P5/6: 0.45 μV, P values < 0.02). This interaction was further qualified by a significant task × visual field × hemisphere × electrode interaction (F2.6,43.4 = 12.07, P < 0.0001), mainly because the task effect was significant on most, but not all, of the posterior parietal electrodes, in ipsi- or contralateral hemisphere. On posterior parietooccipital electrodes, the effect of task was always significant (all P values ranging from 0.016 to 0.001), except for the PO5 electrode (left hemisphere) when stimulated in the RVF (P = 0.097). As expected, the effect of context was nonsignificant (F1,17 = 0.001, NS). A context × electrode interaction (F2.1,35.3 = 5.84, P < 0.006) was due to lateralized faces eliciting a slightly larger P1 in the face context on parietooccipital electrodes (face context–scrambled context = 0.1 μV), whereas the opposite effect (i.e., slightly larger P1 in the scrambled-face context; face context–scrambled context = −0.09 μV) was observed on posterior parietal electrodes. However, when considered separately, none of the electrodes displayed a significant effect of context (all P values > 0.3).
Two main results were found on the amplitude of the N170 elicited by lateralized face stimuli. First, as expected, the N170 was markedly smaller when elicited by a lateralized face presented in the context of a central face relative to the scrambled-face context (Figs 3 and 4). Second, similar to the task effect reported for the P1, the N170 in the high-load task was smaller in amplitude than in the low-load task.
Statistical analyses conducted on the mean N170 amplitude confirmed these observations. Replicating our previous results, the N170 was substantially reduced when elicited by a face presented in the context of another face relative to the scrambled-face context (F1,17 = 56.48, P < 0.0001) (Fig. 6B). Moreover, a main effect of task reflected the smaller N170 amplitude recorded to the lateralized face when subjects performed the high-load task as compared with the low-load task (F1,17 = 25.92, P < 0.0001). On average, the effect of context appeared to be larger (2.47 μV) than the effect of task (1.52 μV). There was no significant interaction between the 2 factors. However, the effect of context was larger in the low-load than in the high-load task in the ipsilateral hemisphere, as indicated by a significant task × context × visual field × hemisphere 4-way interaction (F1,17 = 14.08, P = 0.0016). In the contralateral hemisphere, the amplitude difference between face context and scrambled-face context was identical (2.7 μV) in the high- and low-load tasks, whereas in the ipsilateral hemisphere, the context effect was 1.81 μV in the high-load task and 2.9 μV in the low-load task. As can be seen in Table 1, the effect of context was significant across all task × visual field × hemisphere conditions, even though it was reduced in the high-load task in the ipsilateral hemisphere (Fig. 6B).
|Visual field|Measure|Left hemisphere, low load|Left hemisphere, high load|Right hemisphere, low load|Right hemisphere, high load|
|LVF stimulation|Amplitude (μV)|2.6|1.8|2.9|2.8|
|RVF stimulation|Amplitude (μV)|2.5|2.6|3.2|1.8|
Note: Amplitude values representing the difference between scrambled-face context and face context and corresponding P values are reported. Note that this effect is always significant although slightly reduced in the high-load task in ipsilateral hemisphere.
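The contralateral and ipsilateral averages quoted in the text can be recovered from the rounded Table 1 values with a short worked example. The dictionary layout and function names are hypothetical. The only assumption beyond the table is the standard mapping in which the hemisphere contralateral to LVF stimulation is the right hemisphere, and vice versa.

```python
# Context effect (scrambled-face minus face context, in microvolts), copied from Table 1.
# Keys are (visual field, hemisphere, attentional load).
CONTEXT_EFFECT_UV = {
    ("LVF", "left", "low"): 2.6, ("LVF", "left", "high"): 1.8,
    ("LVF", "right", "low"): 2.9, ("LVF", "right", "high"): 2.8,
    ("RVF", "left", "low"): 2.5, ("RVF", "left", "high"): 2.6,
    ("RVF", "right", "low"): 3.2, ("RVF", "right", "high"): 1.8,
}


def laterality(visual_field, hemisphere):
    """Classify a hemisphere as contra- or ipsilateral to the stimulated visual field."""
    contralateral_to = {"LVF": "right", "RVF": "left"}
    return "contralateral" if contralateral_to[visual_field] == hemisphere else "ipsilateral"


def mean_context_effect(load, side):
    """Average the Table 1 context effect over the two cells matching load and laterality."""
    values = [
        value
        for (visual_field, hemisphere, cell_load), value in CONTEXT_EFFECT_UV.items()
        if cell_load == load and laterality(visual_field, hemisphere) == side
    ]
    return sum(values) / len(values)
```

The contralateral mean is about 2.7 μV under both loads, and the ipsilateral mean is about 2.9 μV under low load and 1.8 μV under high load. The last value differs slightly from the 1.81 μV given in the text because the table entries are rounded.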
A significant interaction between task and hemisphere (F1,17 = 11.885, P = 0.0037) was due to the task effect being stronger in the right hemisphere (2 μV) than in the left hemisphere (1.06 μV). Finally, both task and context effects were stronger on inferior electrodes (P7/8, PO7/8) than on more medial electrodes (P5/6, PO5/6) (task × electrode: F1.96,33.3 = 13.28, P < 0.0001 and context × electrode: F2.1,36.2 = 33.29, P < 0.0001).
In summary, the amplitude of the N170 elicited by the lateralized faces was strongly shaped both by the task that subjects were performing and by the context (face or scrambled face at fixation) in which the lateralized face appeared. Importantly, the effect of context appeared largely independent of task manipulation, being of similar amplitude in both tasks. The only interactions between task and context were found in the ipsilateral hemisphere, but the effect of context was significant in the 2 conditions of attention (Table 1, Fig. 6B). These interactions are most likely due to floor effects in the face context and high-load condition, for which there was no clear peak on the grand average data (Fig. 3, ipsilateral hemisphere). Moreover, in inferior electrodes (P7/P8 and PO7/PO8), where similar context effects have been previously described (Jacques and Rossion 2004; Rossion and others 2004), the context effect was always highly significant across all tasks, visual fields, and hemispheres.
Our results can be summarized in 4 points. First, the ERP response to a laterally presented face was massively reduced as early as the onset of the N170 (∼130 ms) when subjects were concurrently fixating another face stimulus, replicating our previous observations (Jacques and Rossion 2004). Second, the N170 was strongly diminished in amplitude when subjects performed a high-attentional load task at fixation as compared with when they performed a low-load task, showing for the first time a clear modulation of the N170 by spatial attention. Third, the magnitude of the sensory competition effect was roughly equal across attentional load manipulations, with competition and attention exerting orthogonal effects on the N170 amplitude. Finally, whereas the onset time of the sensory competition modulations was around 130 ms, the spatial attention effect started at about 80 ms, as indexed by a reduction of the P1 component in the high-load task. In summary, these results bring further support to the view that when 2 faces are processed concurrently in the visual field, they compete for overlapping neural representation, leading to a reduction in the N170 amplitude, independently of a general spatial attention modulation of the processing of faces. In addition, spatial attention and sensory competition do not appear to reflect the same mechanisms of early selection of visual information in the extrastriate cortex.
Sensory Competition between Face Representations
The competition effect arises at 130 ms with a similar topography as the N170, suggesting that it originates from regions responding preferentially to faces in the occipitotemporal cortex (see Jacques and Rossion 2004). The N170 reduction to the second face stimulus may be due to baseline activation level being at saturation in populations of neurons responding to the first face stimulus or to inhibitory competition from neurons coding the first stimulus through lateral inhibitory connections (see Jacques and Rossion 2004; Rossion and others 2004). This phenomenon can also be related to so-called adaptation effects, during which fixating a face for a prolonged period (i.e., 5 s) results in a suppression of the N170 elicited by another face presented subsequently, at the same location (Kovacs and others 2006).
Overall, this kind of competitive interaction has been widely observed at the single-cell level in several visual areas of the ventral stream (V2, V4, IT: Moran and Desimone 1985; Miller and others 1993; Rolls and Tovee 1995; Chelazzi and others 1998; Missal and others 1999; Reynolds and others 1999; Reynolds and Desimone 2003) and in the dorsal stream (Recanzone and others 1997) of the monkey brain. These studies suggest that when multiple stimuli are presented in nearby locations, they activate populations of neurons that interact in a mutually suppressive way. Such competitive interactions are thought to index competition for neural representation, thus reflecting the limited capacity of single neurons to represent and selectively respond to multiple stimuli present simultaneously in their receptive field. It may also be that the reduced response to competing stimuli provides useful information about the spatial arrangement of the stimuli or about the interaction between their shapes (Missal and others 1999).
Spatial Attention Modulation of Early Visual Face Processing
When subjects performed the high- relative to the low-attentional load task at fixation, accuracy was lower and both the P1 and the N170 were suppressed. Thus, electrophysiological and behavioral markers of our task manipulation indicate that it successfully induced 2 contrasted levels of attentional load at fixation and consequently different levels of spatially based resource allocation to the peripheral faces.
These results may be best understood in the framework of the perceptual load theory of attention (Lavie and Tsal 1994; Lavie 1995; reviewed by Lavie 2005), according to which when subjects perform a task at fixation, the processing or the filtering out of task-irrelevant peripheral stimuli varies as a function of attentional resources committed to the foveal task. When attentional load at fixation is low, processing resources “spill over” to peripheral stimuli, allowing them to interfere with the task performed at fixation. Conversely, if attentional capacity associated with visual perception is fully consumed by the foveal task (i.e., under high load), the perceptual processing of parafoveal distractors is reduced (Yantis and Johnston 1990; Lavie and Tsal 1994; Lavie 1995; Plainis and others 2001). Thus, in our experiment, one may assume that the increased/decreased attentional demand to the central stimulus in the high/low-load task was coupled with decreased/increased processing of peripheral faces. The fact that this differential processing of peripheral faces was reflected in the early visual ERPs (P1, N170) is in agreement with the view that attentional load modulates spatial selection by influencing the allocation of attentional resources at an early perceptual level of processing (Lavie 1995). This proposal has been recently supported by several neuroimaging studies (Rees and others 1997; Somers and others 1999; Yi and others 2004; Schwartz and others 2005), as well as ERP studies showing that high-attentional load increases location expectancy effects (Handy and Mangun 2000) and reduces the visuocortical processing of distractors (Handy and others 2001) as reflected by modulation of the early P1 and N1 components.
Hence, under increased perceptual load, the reduced processing of peripheral distractors might be due to a narrowing of the spatial attention window around the attended stimulus, which effectively excludes stimuli appearing outside the window (Plainis and others 2001; Lavie 2005; Schwartz and others 2005).
These previous and present observations are also mostly consistent with the large number of human ERP studies investigating more directly the influence of spatial attention on early visual processing. The main findings of these studies are that the visual ERP components P1 and N1 (= N170 for faces and objects; see Rossion and others 2002) elicited by lateralized simple stimuli are enhanced in amplitude when subjects are instructed to covertly attend to the stimulated location or if central or peripheral cues predict the forthcoming stimulus location (Luck and others 1994; Hillyard and others 1998; Hopfinger and Mangun 1998; Di Russo and others 2003; for reviews, see Mangun 1995; Luck and others 2000).
Previous ERP studies have found relatively small modulations of the face N170 as a result of spatial attention (Eimer 2000; Holmes and others 2003) or selective attention (Cauquil and others 2000; for magnetoencephalography evidence, see Furey and others 2006) as compared with the large spatial attention effects found on the N1 for simpler stimuli (for reviews, see Mangun 1995; Luck and others 2000). Yet, here, we found for the first time large and reliable effects of spatial attention on the N170 evoked by faces. This suggests that previous studies did not impose sufficiently strong attentional demands to modulate the response to a face stimulus, which is associated with a highly saturated response in the occipitotemporal cortex and may be more immune to resource capacity limitations than other categories of visual stimuli (Young and others 1986; Jenkins and others 2003). Most importantly, the present observations indicate that the processing of faces is modulated by spatial attention as early as its onset time in the occipitotemporal cortex.
Dissociable Effects of Spatial Attention and Sensory Competition on Early Visual Processing
Early visual ERPs to a lateralized face (P1 and N170) were strongly influenced both by spatially based selection mechanisms due to increased attentional load at fixation and by sensory competition between concurrently presented stimuli. This fits with the view that early visual selection depends on both bottom–up stimulus configuration and endogenous allocation of attention. However, spatial attention and sensory competition appear to exert dissociable effects on visual selection, at least when the processes reflected by the P1 and N170 are considered. Several arguments support this dissociation.
First, the N170 to the lateralized face stimulus was strongly suppressed when it was presented together with a central face, whether the attentional load at fixation was low or high; the effect of spatial attention on the amplitude of the N170 was mostly orthogonal to the sensory competition effect. This finding supports the view that the amplitude reduction of the face N170 when subjects process a meaningful stimulus such as a face (Jacques and Rossion 2004; this article) or an object of expertise (Rossion and others 2004; B. Rossion, V. Goffaux, D. Collin, unpublished data) is independent of spatial attentional selection. If this effect were due to spatial attention, the reduction effect should have been attenuated under the high-load condition. Yet, interestingly, competition and attention effects at the level of the N170 have highly similar scalp topographies (see Fig. 5, at 165 ms), suggesting that they may act on common neural populations in occipitotemporal cortex. (Even though sensory competition and spatial attention have additive effects on the N170 in the present experiment, these mechanisms may interact in certain circumstances, for instance, when selective or spatial attention mechanisms give a processing advantage to an attended stimulus or an attended location when several competing stimuli are present in the visual field [e.g., Luck and others 1997; Chelazzi and others 1998; Kastner and others 1999; Reynolds and others 1999]. According to this biased competition model for attentional selection [Desimone and Duncan 1995], when multiple stimuli, e.g., 2 faces, compete for a neuron's response, if one of the stimuli is attended, the response of the neuron is biased toward that elicited when this attended stimulus is presented alone.)
The finding of an equally large competition effect in the high- and low-load conditions is partly incompatible with Lavie's (1995) proposal that once attentional/perceptual load exceeds capacity limits, task-irrelevant information is excluded from high-level processing. According to this view, in the high-load condition, because the processing of peripheral faces is gated at an early level, the magnitude of this higher level N170 competition effect should have been reduced.
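The logic of this first argument can be made concrete with a 2 × 2 interaction contrast. The sketch below is purely illustrative: the amplitude values, the dictionary layout, and the function names are hypothetical and are not taken from the present data. It contrasts an additive pattern, in which the context effect is the same at both load levels, with the pattern expected if high load gated the processing of the peripheral face before the competition stage.

```python
# Illustrative only: hypothetical mean N170 amplitudes in microvolts,
# keyed by (central stimulus, attentional load). NOT values from the study.
additive = {
    ("scrambled", "low"): -6.0,
    ("scrambled", "high"): -4.5,
    ("face", "low"): -3.0,
    ("face", "high"): -1.5,
}

# Hypothetical pattern under early gating: the context effect
# shrinks when the load at fixation is high.
gated = dict(additive)
gated[("face", "high")] = -3.5


def context_effect(amplitudes, load):
    """N170 change (face minus scrambled central stimulus) at one load level."""
    return amplitudes[("face", load)] - amplitudes[("scrambled", load)]


def interaction_contrast(amplitudes):
    """Difference of the context effects across load levels.

    A value near zero means the context and task effects are additive.
    """
    return context_effect(amplitudes, "high") - context_effect(amplitudes, "low")


print(interaction_contrast(additive))  # context effect identical at both loads
print(interaction_contrast(gated))     # context effect reduced under high load
```

In an actual analysis this contrast would be computed per subject and tested statistically; the sketch only shows what "additive" means in terms of cell means.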
A second argument for a dissociation between spatial attention and sensory competition is the differential time course with which these 2 mechanisms modulated ERP responses to the peripheral face. As mentioned earlier in the discussion, spatial attention modulated the N170 component over occipitotemporal regions. However, unlike sensory competition effects, these modulations started much earlier than the N170, at about 80 ms after stimulus onset over posterior occipital regions. Thus, not only do competition and spatial attention mechanisms exert additive effects at the level of the face N170, but their action on visual processing can also be dissociated in time. This, again, strongly suggests that when a central face is being processed, the reduced N170 response to the second lateralized face stimulus can be related to sensory competition and is not simply due to a reduction of attention to the portion of the visual field where this second face appears. If this effect were merely due to spatial attention, we should have observed a reduction of the ERP signal in the face context starting at the level of the P1. Similarly, the fact that there is no modulation of the P1 in response to faces when subjects are processing visual objects of expertise (Rossion and others 2004; B. Rossion, V. Goffaux, D. Collin, unpublished data) supports the view that the N170 reduction observed in these conditions results from a genuine competition between the representation of faces and nonface objects of expertise (for other arguments supporting this view, see Rossion and others 2004; B. Rossion, V. Goffaux, D. Collin, unpublished data) and not from a decrease in attentional allocation to the face stimulus.
More generally, this temporal dissociation might be related to differences in the respective mechanisms through which attention and sensory competition modulate neural processing. On the one hand, sensory competition effects probably depend on hard-wired properties of the visual system (neural selectivity, receptive field size, lateral inhibitory connections, etc). Thus, for instance, given that competitive interactions occur primarily at the receptive field level, the level of the visual hierarchy at which competitive suppression is observed (and thereby the timing of the effect) varies as a function of the distance between the competing stimuli in the visual field (Kastner and others 2001; for competition for stimuli located in different hemifields, see also Schwartz and others 2005). On the other hand, top-down–controlled spatial attention is a more general, flexible, and stimulus-independent mechanism that entails selection at an early cortical stage by enhancing/suppressing neural response (Spitzer and others 1988; Motter 1993; McAdams and Maunsell 1999), by modulating the neurons' baseline activity, by biasing competitive interactions, etc (Luck and others 1997; Chelazzi and others 1998; Reynolds and Desimone 2003; for reviews, see Kastner and Ungerleider 2000; Yantis and Serences 2003).
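The temporal dissociation discussed above rests on comparing the onset latencies of 2 difference waves. As a minimal sketch of how such an onset could be estimated, the code below returns the first time point at which a difference wave stays at or beyond a threshold for a minimum number of consecutive samples. The function name, the threshold, the run-length criterion, and the synthetic step-shaped waves are all assumptions made for illustration; the waves are constructed to mirror the latencies reported in the text (about 80 and 130 ms) and are not recorded data, nor is this necessarily the onset criterion used in the present study.

```python
def onset_latency(times_ms, diff_wave, threshold, min_run):
    """Return the first time at which |diff_wave| >= threshold holds for
    min_run consecutive samples, or None if this never happens."""
    run = 0
    for i, value in enumerate(diff_wave):
        if abs(value) >= threshold:
            run += 1
            if run == min_run:
                # Report the time of the first sample in the run.
                return times_ms[i - min_run + 1]
        else:
            run = 0
    return None


# Synthetic difference waves sampled every 10 ms (hypothetical values).
times_ms = list(range(0, 210, 10))
attention_diff = [1.0 if t >= 80 else 0.0 for t in times_ms]
competition_diff = [1.0 if t >= 130 else 0.0 for t in times_ms]
competition_diff[times_ms.index(40)] = 1.0  # isolated blip, ignored by min_run

attention_onset = onset_latency(times_ms, attention_diff, threshold=0.5, min_run=3)
competition_onset = onset_latency(times_ms, competition_diff, threshold=0.5, min_run=3)
print(attention_onset, competition_onset, competition_onset - attention_onset)
```

Requiring several consecutive supra-threshold samples keeps a single noisy time point from being mistaken for the onset of an effect.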
In sum, although we found that both spatial attention and sensory competition affect visual processing during the first 150 ms after stimulus onset, we mostly observed dissociations rather than interactions between these mechanisms. Hence, spatial attention and sensory competition modulate visual processing of faces at different onset times, 80 and 130 ms, respectively, and both mechanisms exert strong and additive effects on the neural response to faces reflected by the N170.
Endogenous versus Exogenous Attention
The present findings show that the N170 reduction resulting from context manipulations in concurrent presentation is independent of endogenous top-down–directed attention. Yet, one may possibly argue that the presentation of a central face (or object of expertise in experts) has led to a rapid and involuntary attentional capture as a result of automatic allocation of attention to meaningful (face vs. scrambled face) or familiar stimuli (e.g., cars for car experts). Many behavioral studies have found that attention is rapidly drawn to salient stimuli, even though they are task irrelevant (Kim and Cave 1999; Lamy and others 2003). Accordingly, one cannot fully exclude the possibility that attention may have been more strongly captured by the central face than by the scrambled stimulus in the present study, therefore leading to a reduction of the N170 that would be additive to the endogenous attentional effects. However, there are a number of arguments against this possibility. First, the high-attentional load task was designed precisely to override the putative differential level of attention directed to the central stimulus when it was a face compared with a scrambled face. Several sources of evidence, both from psychophysics (Kim and Cave 1999; Lamy and others 2003) and neurophysiology (Ogawa and Komatsu 2004), suggest that salient stimuli might indeed attract attention but that this effect is rapidly overridden by top-down visual search strategies (Connor and others 2004). Therefore, in the present study, given the relatively long delay between the first and second stimuli (600 ms on average), if there was any stronger attentional capture by the central face relative to the scrambled face, it would have been totally cancelled out by the time the second stimulus appeared.
Second, spatial attention studies in which early ERP modulations are found (see Mangun 1995; Luck and others 2000) typically report faster and more efficient performance on the detection of a salient peripheral target when it appears at an attended location. Unlike these studies, in the present and in our previous concurrent processing experiments, accuracy and reaction times at detecting the lateral face were virtually identical across context manipulations. Third, it has been shown that exogenously cued attention induces modulation of behavioral performance (Cheal and Lyon 1991; Pestilli and Carrasco 2005) and neural response (Reynolds and Desimone 2003; Liu and others 2005) similar to those observed with endogenously cued spatial attention. Critically, exogenous attention modulates visual ERPs as early as the P1 component (Hopfinger and Mangun 1998; Schuller and Rossion 2001), as is generally observed with endogenous spatial attention. This strongly suggests that if the N170 reduction were due to faces or objects of expertise being more attentionally attractive than scrambled faces, this effect should have been observed on the ERP response at a latency similar to that of the spatial attention effect, that is, on the P1 component. Instead, the effects of sensory competition consistently occur around 50 ms later than the effect of spatial attention.
The authors are supported by the Belgian National Foundation for Scientific Research (FNRS). This work was supported by a grant ARC 01/06-267 (Communauté Française de Belgique—Actions de Recherche Concertées) to BR. Conflict of Interest: None declared.