Despite ample research, the structure and the functional characteristics of neural systems involved in human face processing are still a matter of active debate. Here we dissociated between a neural mechanism manifested by the face-sensitive N170 event-related potential effect and a mechanism manifested by induced electroencephalographic oscillations in the gamma band, which have been previously associated with the integration of individually coded features and activation of corresponding neural representations. The amplitude of the N170 was larger in the absence of the face contour but not affected by the configuration of inner components (ICs). Its latency was delayed by scrambling the configuration of the components as well as by the absence of the face contour. Unlike the N170, the amplitude of the induced gamma activity was sensitive to the configuration of ICs but insensitive to their presence within or outside a face contour. This pattern suggests a dual mechanism for early face processing, each utilizing different visual cues, which might indicate their respective roles in face processing. The N170 seems to be associated primarily with the detection and categorization of faces, whereas the gamma oscillations may be involved in the activation of their mental representation.
The outstanding human ability to identify individual human faces has long been of major interest to cognitive psychologists, neuropsychologists, and neuroscientists. Prevalent models assume that the processing of a face consists of the detection of its global shape including the face contour and the canonical organization of the inner components (ICs) (“first-order” configuration) as well as of relational computations during which the metrics of the inner face components are extracted (“second-order” relational processing, Maurer et al. 2002). Because the definition of “first-order” configuration contains multiple visual cues, the face features, their canonical configuration, and the face contour, an intriguing question is how the analysis of the global shape and of the face components are implemented in the neural system and whether they have distinct roles in early stages of face processing. In the present study, we distinguish between 2 electrophysiological manifestations of neural activity associated with face processing: One that is sensitive primarily to the presence of face contour but less to the configuration of inner face features (2 eyes above the nose and mouth) and another that is primarily sensitive to the configuration of the ICs within the face space regardless of face contour.
Electrophysiological, magnetoencephalographic, and hemodynamic studies associated several neural manifestations with visual face processing. Among those, an established neural signature of face processing is the N170 effect reflected in a negative event-related potential (ERP) component peaking between 150 and 180 ms from stimulus onset at posterior temporal sites (Bentin et al. 1996; George et al. 1996). This effect manifests considerably larger amplitude of the N170 in response to human and ape faces than in response to nonface categories such as houses, tools, flowers, animals, or other human body parts, categories among which the N170 does not differ (Carmel and Bentin 2002; Itier and Taylor 2004a). The N170 effect is insensitive to face familiarity (Bentin and Deouell 2000) and is as robust in response to line drawings of schematic faces as in response to photographs of natural faces (Sagiv and Bentin 2001). The latter finding indicates that the N170 is sensitive to the global face configuration (because ICs of schematic faces are meaningless in isolation [Bentin et al. 2002]). However, other findings suggest that it can be elicited by face components even in the absence of the face configuration. Indeed, the N170 amplitude is enhanced and delayed in response to isolated eyes relative to full faces (Bentin et al. 1996). A similar pattern of augmented N170 amplitude and delayed peak latency has been repeatedly found for inverted faces, as opposed to full faces (Bentin et al. 1996; Rossion et al. 1999, 2002).
Inverting a face or presenting components in isolation disrupts the spatial configuration of face features (Bartlett and Searcy 1993; Leder and Carbon 2006). Therefore, one account of the modulation of the N170 by these manipulations links the N170 to the configural structural encoding of a face (Eimer 2000; Rossion and Gauthier 2002). According to this model, the delay of the N170 peak and, to some extent, the augmentation of its amplitude reflect the difficulty in computing spatial relations among face features when their global configuration is disrupted. However, an alternative account of these effects relates the N170 component to a face-detection mechanism, which may utilize visual information about face features in addition to global cues in order to categorize a stimulus as a face. Previous studies demonstrated that faces are processed first at a global level (Young et al. 1987; Tanaka and Farah 1993; Hole 1994; Tanaka and Sengco 1997; Maurer et al. 2002; Bentin et al. 2006). However, when the global configuration is altered (by inversion or scrambling) or when features are presented in isolation, attention is diverted from the global level to the local components (e.g., Valentine 1988). Thus, in the absence of a coherent configuration, face detection must rely only on features and, consequently, the detection process might be prolonged. According to this account, the amplitude of the N170 and its latency may reflect the extent of neural activity involved in the detection process.
The goal of the present study was to address this debate and provide further evidence for the association between the N170 effect and a face-detection mechanism that relies primarily on global cues but utilizes featural information as well. In addition, we suggest another neural mechanism that might be involved in the structural encoding and the activation of a neural representation of a face; this process would be primarily sensitive to the configuration of the ICs within the face space. Based on previous results, we propose that high-frequency oscillatory electroencephalographic (EEG) activity in the gamma band (20–80 Hz) could be a neural signature of this latter mechanism.
Ample evidence suggests that high-frequency neural oscillatory activity may be involved in integrating individually coded features and creating an identifiable percept, which activates a preexistent neural representation (for reviews, see Bertrand and Tallon-Baudry 2000; Herrmann et al. 2004). Specifically, many studies investigated induced gamma oscillations, which are not necessarily phase locked to the stimulus onset and are most conspicuous during an epoch ranging between 200 and 400 ms from stimulus onset. These studies found enhanced induced gamma activity while processing recognizable objects relative to seeing new or meaningless objects (Tallon-Baudry et al. 1996, 1997; Gruber and Muller 2005). For example, Tallon-Baudry et al. (1996) found higher induced gamma oscillations in response to coherent illusory Kanizsa figures relative to incoherent figures. The induced gamma activity elicited by coherent illusory Kanizsa figures was similar to that elicited by the connected/whole objects, which supports the view that the induced gamma oscillations manifest a process of integrating components based on their spatial configuration, to form identifiable objects and activating their neural representation.
Similar results were reported also in response to faces: “mooney faces” or schematic faces that are meaningful only when presented upright but are unidentifiable when presented upside down (George et al. 1997) elicited higher induced gamma activity in the recognizable than in the unrecognizable condition (Keil et al. 1999; Rodriguez et al. 1999, but see Lachaux et al. 2005; Trujillo et al. 2005). Likewise, an intracranial study recorded gamma activity in response to recognizable and unrecognizable faces from ventro-occipital regions traditionally associated with face perception (including the Fusiform Face Area, Tallon-Baudry, Bertrand, et al. 2004). However, these data did not extend the meaningful versus meaningless object distinction because the inverted mooney faces are, in fact, meaningless. One study that did find differential oscillatory activity to faces versus meaningful nonface stimuli is reported by Klopp et al. (1999). These authors found higher activity in the fusiform area in a broad range of frequencies (5–45 Hz) to faces relative to words.
The present study was designed to dissociate between a face-detection mechanism on the one hand and a first-order configuration–based matching process between an incoming stimulus and a preexistent neural representation on the other hand. We propose that these 2 face perception mechanisms are manifested by the N170 ERP component and by induced gamma activity, respectively. To this end, we presented normally configured and spatially scrambled upright faces and isolated ICs. Our rationale was that a first-order configuration–based process should differentiate between correctly configured and scrambled faces (SFs), with a reduced sensitivity to the presence or absence of the face contour. By contrast, a face-detection process should be particularly sensitive to global cues, mainly the presence of a face contour and inner face configuration, and to face features as well.
The participants were 17 undergraduates (9 females) from the Hebrew University ranging in age from 20 to 38 years (median age 25 years). All participants reported normal or corrected to normal visual acuity and had no history of psychiatric or neurological disorders. Among them 2 were left handed. They signed an informed consent according to the institutional review board of Hebrew University and were paid for participation.
Stimuli and Design
The stimuli were based on 80 photographs of different faces, 80 photographs of different watches, and 120 photographs of different flowers. The faces were edited to form 4 stimulus conditions, with 80 stimuli in each condition: regularly configured full faces (RF); SFs, that is, faces in which the spatial location of the ICs was scrambled; ICs, that is, normally configured eyes, nose, and mouth without the face contour; and scrambled inner components (SICs), that is, 2 isolated eyes, a nose, and a mouth presented in random configuration and without the face contour. The watches were edited to form a regularly configured watch (RW) and a scrambled watch (SW) conditions with 80 stimuli each (see examples of the stimuli in Fig. 2).
All pairs of regularly configured and scrambled stimuli (RF-SF, IC-SIC, and RW-SW) were equated for luminance, so that, on the average, the regularly configured and scrambled stimuli sets were matched for luminance and brightness. The stimuli were presented at fixation and, seen from a distance of approximately 60 cm, occupied 9.5° × 13° in the visual field (10 × 14 cm).
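The reported visual angle can be checked from the stimulus size and viewing distance. The following is a minimal sketch of that arithmetic, assuming the standard formula for the full angle subtended by a flat object centered at fixation; the function name is hypothetical.

```python
import math

def visual_angle_deg(size_cm: float, distance_cm: float) -> float:
    """Full visual angle (degrees) subtended by an object centered at fixation."""
    return math.degrees(2.0 * math.atan(size_cm / (2.0 * distance_cm)))

# Stimulus size (10 x 14 cm) and viewing distance (about 60 cm) as reported above.
width_deg = visual_angle_deg(10.0, 60.0)
height_deg = visual_angle_deg(14.0, 60.0)
print(f"{width_deg:.1f} x {height_deg:.1f} degrees")  # roughly 9.5 x 13.3
```

The result agrees with the reported 9.5° × 13° to within rounding.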
Task and Procedure
As in many previous N170 studies in our laboratory, the task was oddball target monitoring in which stimuli from different experimental categories were presented one after another and participants were requested to press a button each time a flower appeared on the screen. This procedure ensured that all stimulus categories of interest in this study were equally task relevant, that is, they were all distracters, to be ignored. The 600 stimuli were fully randomized and presented in 6 blocks of 100 stimuli each, with a short (up to a minute) break between blocks for refreshment. Each stimulus was presented for 700 ms with interstimulus intervals varying between 500 and 1250 ms. The experiment was run in an acoustically treated and electrically isolated booth. Following the mounting of the electrode cap, the participants were seated in a comfortable reclining chair and the monitor was raised to their eye level.
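The trial structure described above (600 fully randomized stimuli in 6 blocks of 100, with jittered interstimulus intervals) might be sketched as follows. The uniform distribution of the intervals, the seed, and all names are assumptions made for illustration; the text only gives the 500 to 1250 ms range.

```python
import random

# Stimuli per condition, as given in the Stimuli and Design section.
CONDITION_COUNTS = {"RF": 80, "SF": 80, "IC": 80, "SIC": 80,
                    "RW": 80, "SW": 80, "flower": 120}

def build_blocks(seed=0, n_blocks=6, isi_range_ms=(500.0, 1250.0)):
    """Shuffle all trials and split them into equal blocks of (condition, ISI)."""
    rng = random.Random(seed)
    trials = [cond for cond, n in CONDITION_COUNTS.items() for _ in range(n)]
    rng.shuffle(trials)
    block_size = len(trials) // n_blocks
    blocks = []
    for b in range(n_blocks):
        chunk = trials[b * block_size:(b + 1) * block_size]
        # Assumption: ISIs are drawn uniformly from the reported range.
        blocks.append([(cond, rng.uniform(*isi_range_ms)) for cond in chunk])
    return blocks

blocks = build_blocks(seed=1)
print(len(blocks), len(blocks[0]))
```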
The EEG analog signals were recorded continuously by 64 Ag–AgCl pin-type active electrodes mounted on an elastic cap (ElectroCap International) according to the extended 10-20 system (American Electroencephalographic Society, 1994) and from 2 additional electrodes placed at the right and left mastoids, all reference free. Eye movements, as well as blinks, were monitored using bipolar horizontal and vertical electrooculogram (EOG) derivations via 2 pairs of electrodes, one pair attached to the external canthi and the other to the infraorbital and supraorbital regions of the right eye. Both EEG and EOG were sampled at 1024 Hz using a Biosemi Active II digital 24-bits amplification system with an active input range of −262 to +262 mV per bit without any filter at input. The digitized EEG was saved and processed off-line.
Data Processing and Analysis
Raw data were 1.0-Hz high-pass filtered (24 dB) and referenced to the tip of the nose. Eye movements were corrected using an ICA procedure (Jung et al. 2000). Remaining artifacts exceeding ±100 μV in amplitude or containing a change of over 100 μV in a period of 50 ms were rejected. Artifact-free data were then segmented into epochs ranging from 250 ms before to 800 ms after stimulus onset for all conditions.
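The two rejection criteria can be illustrated with a short sketch. It assumes that "a change of over 100 μV in a period of 50 ms" means the peak-to-peak range within any sliding 50-ms window, which is one plausible reading rather than the documented implementation; the function name and defaults are hypothetical.

```python
def is_artifact(epoch_uv, sfreq_hz=1024.0, abs_limit_uv=100.0,
                change_limit_uv=100.0, window_ms=50.0):
    """Return True if a single-channel epoch violates either rejection criterion."""
    if not epoch_uv:
        return False
    # Criterion 1: any sample beyond +/- abs_limit_uv.
    if any(abs(v) > abs_limit_uv for v in epoch_uv):
        return True
    # Criterion 2 (assumed reading): peak-to-peak range within a sliding window.
    win = max(2, round(window_ms / 1000.0 * sfreq_hz))
    for start in range(max(1, len(epoch_uv) - win + 1)):
        segment = epoch_uv[start:start + win]
        if max(segment) - min(segment) > change_limit_uv:
            return True
    return False

# A 120-uV step stays within +/-100 uV but changes too fast, so it is flagged.
step_epoch = [-60.0] * 500 + [60.0] * 575
print(is_artifact(step_epoch))
```

The second criterion matters because an epoch can stay within ±100 μV and still contain an abrupt jump of up to 200 μV.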
ERPs resulted from averaging the segmented trials separately in each condition. The averaged waveforms were smoothed by applying a low-pass filter of 17 Hz (24 dB) and baseline corrected based on the time between −150 and −50 ms before stimulus onset. For each subject, the peak of the N170 was determined (based on the filtered waveform) as the most negative peak between 150 and 200 ms. Subsequent visual scrutiny ensured that the most negative values represented real peaks rather than end points of the epoch. Based on previous studies and on scrutiny of the present N170 distribution, the statistical analysis was restricted to posterior–lateral regions. The amplitudes and latencies of the N170 at sites P8, PO8, and P10 over the right temporal hemisphere and the homologue sites over the left were averaged within each hemisphere to yield the dependent variables for analysis of variance (ANOVA). The characteristic scalp distribution of the N170 in each condition was estimated by spherical spline interpolations with 4 levels.
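The baseline correction and peak-picking steps described above can be sketched as follows. The synthetic waveform and all names are illustrative assumptions; the edge flag mirrors the visual check that a minimum is a real peak rather than an end point of the search window.

```python
import math

def baseline_correct(times_ms, erp_uv, start_ms=-150.0, end_ms=-50.0):
    """Subtract the mean of the prestimulus baseline window from the waveform."""
    base = [v for t, v in zip(times_ms, erp_uv) if start_ms <= t <= end_ms]
    mean = sum(base) / len(base)
    return [v - mean for v in erp_uv]

def find_negative_peak(times_ms, erp_uv, start_ms=150.0, end_ms=200.0):
    """Most negative sample in the window, plus a flag if it sits on a window edge."""
    idx = [i for i, t in enumerate(times_ms) if start_ms <= t <= end_ms]
    best = min(idx, key=lambda i: erp_uv[i])
    at_edge = best in (idx[0], idx[-1])
    return times_ms[best], erp_uv[best], at_edge

# Synthetic averaged waveform: a -10 uV deflection at 170 ms on a 2 uV offset.
times_ms = list(range(-250, 801))
erp_uv = [2.0 - 10.0 * math.exp(-((t - 170) / 15.0) ** 2) for t in times_ms]
corrected = baseline_correct(times_ms, erp_uv)
peak_latency_ms, peak_amp_uv, at_edge = find_negative_peak(times_ms, corrected)
print(peak_latency_ms, round(peak_amp_uv, 2), at_edge)
```

The same function could pick the P1 by negating the waveform and searching between 80 and 120 ms.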
ANOVAs with repeated measures were applied on N170 amplitudes and latencies. The factors were stimulus type (full faces, ICs, watches), configuration (normal, scrambled), and hemisphere (right, left). For factors with more than 2 levels, P values were corrected for nonsphericity using the Greenhouse–Geisser correction (for simplicity, the uncorrected degrees of freedom are presented). Significant main effects and interactions were followed up by subsequent 1-way ANOVAs for each level of the interacting factors or by post hoc Bonferroni-corrected contrasts. A similar analysis was performed on the amplitudes and latencies of the P1 peaks (determined as the most positive peak between 80 and 120 ms) and on the difference in amplitude and latency between the N170 and the P1. This was done in order to determine whether the observed N170 effects could, in fact, be due to earlier P1 effects and therefore might not reflect the influence of the experimental manipulations on the neural mechanism eliciting the N170.
Oscillatory Activity Analysis
Wavelet analysis was used to obtain the amplitude of oscillatory activity in the gamma band, based on the procedure suggested by Tallon-Baudry et al. (1997). Data were convolved with a complex Gaussian Morlet wavelet, $w(t,f) = A \exp(-t^{2}/2\sigma_t^{2}) \exp(2i\pi f t)$, using a constant ratio of $f/\sigma_f = 8$, where $\sigma_f = 1/(2\pi\sigma_t)$ and $A$ is a normalization factor. This procedure was applied to frequencies ranging from 20 to 80 Hz in steps of 0.75 Hz, and the results were baseline corrected based on the time between −150 and −50 ms before stimulus onset. Induced oscillatory activity was calculated by applying wavelet analysis to individual trials and averaging the time–frequency plots. Time–frequency plots were created based on the absolute value of the wavelet outcome for each frequency.
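The distinction between induced and evoked activity can be made concrete with a small sketch of the wavelet step. The normalization factor A = (σt√π)^(−1/2), the truncation of the wavelet at ±3σt, the scaling by the sampling interval, and all function names are assumptions made for illustration. Plain Python loops are used for clarity; a real pipeline would use a numerical library.

```python
import cmath
import math

def morlet(freq_hz, sfreq_hz, ratio=8.0, n_sigma=3.0):
    """Complex Morlet wavelet sampled at sfreq_hz, with f / sigma_f = ratio."""
    sigma_f = freq_hz / ratio
    sigma_t = 1.0 / (2.0 * math.pi * sigma_f)
    norm = (sigma_t * math.sqrt(math.pi)) ** -0.5  # assumed normalization factor A
    half = int(math.ceil(n_sigma * sigma_t * sfreq_hz))  # assumed truncation
    return [norm * math.exp(-(k / sfreq_hz) ** 2 / (2.0 * sigma_t ** 2))
            * cmath.exp(2j * math.pi * freq_hz * k / sfreq_hz)
            for k in range(-half, half + 1)]

def wavelet_amplitude(signal, freq_hz, sfreq_hz):
    """Absolute value of the convolution of one trial with the wavelet."""
    taps = morlet(freq_hz, sfreq_hz)
    half = len(taps) // 2
    n = len(signal)
    out = []
    for i in range(n):
        acc = 0j
        for k, w in enumerate(taps):
            j = i - (k - half)  # convolution index: sum over tau of w(tau) * s(i - tau)
            if 0 <= j < n:
                acc += w * signal[j]
        out.append(abs(acc) / sfreq_hz)  # scale the sum by the sampling interval
    return out

def induced_amplitude(trials, freq_hz, sfreq_hz):
    """Transform each trial first, then average the amplitudes across trials."""
    per_trial = [wavelet_amplitude(tr, freq_hz, sfreq_hz) for tr in trials]
    return [sum(col) / len(per_trial) for col in zip(*per_trial)]

# Two trials carrying a 40-Hz oscillation in opposite phase.
sfreq = 256.0
trial_a = [math.sin(2.0 * math.pi * 40.0 * i / sfreq) for i in range(256)]
trial_b = [-v for v in trial_a]
induced = induced_amplitude([trial_a, trial_b], 40.0, sfreq)
average = [(a + b) / 2.0 for a, b in zip(trial_a, trial_b)]
evoked = wavelet_amplitude(average, 40.0, sfreq)
print(induced[128] > evoked[128])
```

Because the two trials cancel in the time-domain average, the evoked estimate is zero, whereas averaging single-trial amplitudes preserves the oscillation. This is why induced activity need not be phase locked to stimulus onset.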
Induced activity was assessed as the mean activity between 200 and 300 ms for 2 frequency bands: 25–45 Hz (low gamma) and 55–70 Hz (high gamma). The scalp distribution of induced oscillations was widely spread. Therefore, our analysis compared left, midline, and right clusters at anterior, center, and posterior regions (see Fig. 1).
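Extracting the dependent measure from a time–frequency plot amounts to averaging over a rectangle of frequencies and time points. A minimal sketch follows, assuming the plot is stored as a list of rows (one per frequency); the toy data and all names are hypothetical.

```python
def band_window_mean(tf, freqs_hz, times_ms, band_hz, window_ms):
    """Mean of a time-frequency matrix (rows = frequencies) over a band and window."""
    rows = [i for i, f in enumerate(freqs_hz) if band_hz[0] <= f <= band_hz[1]]
    cols = [j for j, t in enumerate(times_ms) if window_ms[0] <= t <= window_ms[1]]
    values = [tf[i][j] for i in rows for j in cols]
    return sum(values) / len(values)

# Frequency axis as described in the text: 20 to 80 Hz in steps of 0.75 Hz.
freqs_hz = [20.0 + 0.75 * k for k in range(81)]
times_ms = list(range(-250, 801, 10))

# Toy plot: unit amplitude inside 25-45 Hz and 200-300 ms, zero elsewhere.
tf = [[1.0 if 25.0 <= f <= 45.0 and 200 <= t <= 300 else 0.0 for t in times_ms]
      for f in freqs_hz]

low_gamma = band_window_mean(tf, freqs_hz, times_ms, (25.0, 45.0), (200, 300))
print(low_gamma)
```

One such value per participant, condition, and electrode cluster would then enter the repeated-measures ANOVA.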
Statistical reliability for oscillation amplitudes in both frequency bands was determined by ANOVA with repeated measures. The factors were stimulus type (faces, ICs, watches), configuration (normal, scrambled), anterior–posterior distribution (anterior, center, posterior), and laterality (left, medial, right).
For factors with more than 2 levels, P values were corrected for nonsphericity using the Greenhouse–Geisser correction (for simplicity, the uncorrected degrees of freedom are presented). Significant main effects and interactions were followed up by subsequent 1-way ANOVAs for each level of the interacting factors or by post hoc Bonferroni-corrected contrasts.
A clear N170 component was observed in all conditions. As expected, this component was larger for faces than for watches demonstrating the N170 effect (Fig. 2a). The distribution of this component was posterior temporal, predominantly over right hemisphere sites (Fig. 2c). More importantly, however, the N170 elicited by face components outside the face contour was larger and peaked later than when the face contour was presented; the configuration of the ICs had no conspicuous influence on the N170 amplitude in either case but did influence the peak latency (Fig. 2b).
The statistical analysis of the differences illustrated in Figure 2 was based on within-subject, 3-way ANOVAs for N170 amplitude and for latency. The amplitude analysis showed a main effect of stimulus type (F2,32 = 20.6, P < 0.001), a main effect of hemisphere (demonstrating that the N170 amplitude was larger over right [−11.0 μV] than over left hemisphere [−8.0 μV] sites; F1,16 = 32.3, P < 0.001), and no effect of configuration (F1,16 < 1.00) (Fig. 5a). There was a significant interaction between the effects of stimulus type and hemisphere (F2,32 = 14.4, P < 0.001). No other interactions were significant. Post hoc contrasts revealed that the N170 was larger for full faces (−9.6 μV) than for watches (−7.5 μV; F1,16 = 12.1, P < 0.01) and the N170 elicited by ICs (−11.2 μV) was larger than that elicited by full faces (F1,16 = 9.7, P < 0.01).
ANOVAs conducted separately for each hemisphere showed that whereas the stimulus type effect was significant over both hemispheres, it was slightly more conspicuous over the right hemisphere (F2,32 = 25.1, P < 0.001) than over the left (F2,32 = 9.6, P < 0.01). Post hoc contrasts showed that the ICs elicited larger N170 than full faces over both hemispheres (F1,16 = 17.8, P < 0.001 and F1,16 = 42.3, P < 0.001 for the left and right hemispheres, respectively). Faces elicited larger N170 than watches over the right hemisphere (F1,16 = 16.4, P < 0.001) but not over the left (F1,16 = 2.8, P = 0.113).
Because small differences were observed at the preceding positive peak (P1) and some authors suggested that P1 might also reflect specific face processing (e.g., Itier and Taylor 2004b), a similar analysis was performed on the P1 amplitudes. This analysis showed no main effects of stimulus type or configuration (F2,32 = 0.585, P > 0.5, F1,16 = 0.019, P > 0.8, respectively) and no significant interactions.
In summary, the above analyses demonstrated that the amplitude of the N170 was larger for ICs without the face contour than for full faces and was not affected by the components' configuration.
The same analysis conducted with the N170 latency as the dependent variable showed significant main effects for stimulus type (F2,32 = 35.6, P < 0.001), configuration (F1,16 = 40.7, P < 0.001), but not for hemisphere (F1,16 < 1.00). Post hoc contrasts showed that the N170 peaked later for watches (157.7 ms) than for faces (148.3 ms; F1,16 = 34.0, P < 0.001), both earlier than the peak latency for ICs (164.8 ms; F1,16 = 48.3, P < 0.001).
In addition to the main effects, there was a significant interaction between stimulus type and configuration (F2,32 = 17.4, P < 0.001). Bonferroni-corrected paired t-tests comparing normally configured and scrambled stimuli within each stimulus type revealed that the effect of configuration was significant for full faces (t16 = 10.67, P < 0.001) and for ICs (t16 = 5.82, P < 0.001), whereas there was no configuration effect for watches (t16 < 1.00).
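The follow-up comparisons above rest on a paired t statistic (here with 16 degrees of freedom for 17 participants) and a Bonferroni adjustment. The sketch below shows both steps on made-up numbers. Converting t to a P value requires the t distribution from a statistics library, which is left out here; all names and values are hypothetical.

```python
import math
import statistics

def paired_t(a, b):
    """Paired t statistic and degrees of freedom for two matched samples."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = statistics.mean(diffs) / (statistics.stdev(diffs) / math.sqrt(n))
    return t, n - 1

def bonferroni(p_values):
    """Multiply each P value by the number of comparisons, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]

# Made-up latencies (ms) for four participants, scrambled versus normal.
scrambled = [161.0, 158.0, 165.0, 170.0]
normal = [150.0, 149.0, 152.0, 160.0]
t_value, df = paired_t(scrambled, normal)
adjusted = bonferroni([0.01, 0.02, 0.5])  # three comparisons, illustrative P values
print(round(t_value, 2), df, adjusted)
```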
When conducting the same analysis on the P1 latencies, we found main effects for stimulus type (F2,32 = 13.121, P < 0.001) and for configuration (F1,16 = 15.215, P < 0.001) as well as an interaction between these 2 factors (F2,32 = 5.407, P < 0.025). This pattern could suggest that the latency effects found for the N170 might actually reflect earlier, face-unspecific effects. To test this hypothesis, we explored the effect of our manipulation on the N170 peak latency after subtracting the P1 latency. This analysis confirmed that, although some earlier effects exist (as revealed by the P1 analysis), all the previously described main effects persisted even when these earlier effects were subtracted. There were main effects for stimulus type (F2,32 = 12.729, P < 0.001) and configuration (F1,16 = 10.034, P < 0.01) but no significant interactions.
In summary, the above analyses demonstrated that the latency of the N170 was earlier for full faces than for watches and longest for ICs and, importantly, that for both faces and face components scrambled configurations delayed the N170 peak. These effects were found even after subtracting the earlier P1 effects.
Induced EEG Oscillations
Induced oscillations were conspicuous in a time window between 200 and 300 ms for all conditions. During this window, 2 centers of induced activity were discernable, one in the lower frequency range (25–45 Hz) and one in a higher frequency range (55–70 Hz) (Fig. 3). These 2 ranges were analyzed separately.
Oscillations between 25 and 45 Hz
ANOVA of the mean amplitude of the induced EEG oscillations between 25 and 45 Hz showed significant main effects of stimulus type (F2,32 = 5.3, P < 0.025), anterior–posterior distribution (F2,32 = 7.1, P < 0.025), and laterality (F2,32 = 8.5, P < 0.001). The induced gamma-band oscillations for faces (1.4 μV) were higher than for watches (0.5 μV; P < 0.005) but similar to those elicited by ICs (1.1 μV; P = 1.0). Additional post hoc contrasts revealed that the amplitudes were higher at central and posterior sites (1.2 and 1.1 μV, respectively) than at anterior sites (0.7 μV; P < 0.01) and higher at midline (1.2 μV) than at any of the lateral sites (0.9 μV equal across lateral sites; P < 0.02). The main effect of configuration did not reach significance (F1,16 = 3.5, P = 0.08); however, a significant interaction between the effects of configuration and anterior–posterior distribution (F2,32 = 4.3, P < 0.05) suggested that the configuration of the stimulus might have had different effects at anterior, center, and posterior sites. Indeed, post hoc contrasts between normally configured and scrambled stimuli at each region revealed that the configuration had no effect at anterior sites (P = 0.45) and approached significance at center sites (P = 0.051), whereas at posterior sites, normally configured stimuli elicited significantly higher oscillations (1.36 μV) than scrambled stimuli (0.9 μV; P < 0.05).
In summary, induced EEG oscillations in the low gamma range (25–45 Hz) elicited between 200 and 300 ms were conspicuous across a large expanse of the scalp. Across all regions, faces and face components elicited higher induced oscillations than watches. Most revealing, however, over the posterior sites, these oscillations were sensitive to configuration: normally configured stimuli elicited higher induced oscillations than scrambled stimuli (Fig. 4).
Oscillations between 55 and 70 Hz
ANOVA of the higher induced oscillations showed no effect of stimulus type (F2,32 = 2.4, P = 0.10) and no effect of configuration (F1,16 < 1.00). The main effects of anterior–posterior and lateral distributions were significant (F2,32 = 13.4, P < 0.001 and F2,32 = 11.4, P < 0.001, respectively). Post hoc contrasts of the distribution showed a similar pattern as the lower gamma oscillations, that is, equal amplitudes at center (0.82 μV) and posterior sites (0.96 μV), both larger than anterior sites (0.5 μV; P < 0.005). The main effect of lateral distribution revealed that the amplitudes were higher over medial clusters (0.9 μV) than over the left (0.6 μV) or right clusters (0.7 μV; P < 0.001). There were no significant interactions in this analysis.
In summary, in contrast to the lower range, induced EEG oscillations in the higher gamma range (55–70 Hz) were not reliably affected by the experimental manipulations in the present study.
The results of this study unveiled a number of dissociations between different forms of neural activity associated with early face processing. The amplitude of the N170 was higher in response to face components outside the face contour than in response to full faces and not influenced by the configuration of the ICs for either stimulus type (Fig. 5a). However, its peak latency (even after subtracting P1-latency effects) was influenced by both stimulus type and by configuration. The N170 peaked later for SFs than for regular faces, even later for normally configured ICs and the latest for SICs. The negative ERP elicited by watches during the N170 time range was smaller and peaked later than that elicited by faces (the N170 effect).
In contrast to the N170, induced EEG oscillations in the lower gamma band (25–45 Hz) were equally large for faces and face components. However, at posterior scalp sites, the amplitudes were lower when the stimuli were scrambled relative to when they were normally configured (Fig. 5b). In addition, across the scalp, the amplitude of induced EEG oscillations in this frequency range was higher for faces and inner face components than for watches, regardless of configuration. The dissociation between the sensitivity of the N170 to the presence of a contour but much less to configuration on the one hand, and the sensitivity of the induced EEG oscillations to configuration but not to the presence of a face contour on the other hand, suggests that these 2 types of neural activity elicited by faces manifest different perceptual mechanisms. Further evidence for this dissociation is provided by the different scalp distributions of the N170 and the EEG oscillations. Whereas the N170 had a posterior–lateral distribution and was larger over the right than over the left hemisphere, the gamma oscillations had a more medial and widespread distribution.
These results link the N170 to a face-detection mechanism, which is different from the mechanism reflected in the gamma oscillations. The N170 is larger in response to all types of face-related stimuli than in response to watches, indicating that they were all correctly categorized by the visual system as “faces” (Sagiv and Bentin 2001; Bentin et al. 2006). However, the increase in the N170 amplitude and latency when the contour was missing and the increase in its peak latency when the configuration was scrambled suggest the neural mechanism for face detection manifested by the N170 relies primarily on the global face shape and involves additional processing when presented with isolated features. Moreover, seemingly, the face contour (including the hairline) is a stronger face cue than the configuration of the ICs.
This interpretation is in line with previous reports that face processing is focused initially at a global level (Young et al. 1987; Tanaka and Farah 1993; Hole 1994; Tanaka and Sengco 1997; Maurer et al. 2002; Bentin et al. 2006). It is possible that for normally configured upright faces, the global process hinders the processing of the individual components (Sinha and Poggio 1996). In the absence of a normal configuration or face contour (which both invoke global processing), attention is diverted from the global shape to the components that might then be processed to a larger extent. According to this account, the enhanced amplitude and prolonged latency of the N170 when the face configuration is scrambled and the contour is absent reflect more extensive processing of the ICs.
In contrast to the N170, the induced gamma oscillations were clearly modulated by the configuration of the ICs. Therefore, these oscillations are possibly related to the configural computations performed during structural encoding leading to the activation of a perceptual representation. Indeed, this hypothesis is in line with ample evidence for the involvement of induced gamma-band oscillations with higher level perceptual processes such as bottom-up– and top-down–driven object recognition (Tallon-Baudry et al. 1996; Keil et al. 1999; Bertrand and Tallon-Baudry 2000, 2003) as well as accessing memory representations (Duzel et al. 2003; Gruber et al. 2004; Gruber and Muller 2005) or active maintenance of information in short-term memory (Tallon-Baudry et al. 2001; Tallon-Baudry, Mandon, et al. 2004).
Unlike previous studies, the current configural manipulations did not render the scrambled stimulus completely meaningless. Even in the scrambled configurations, the ICs remained distinguishable and meaningful. Moreover, the SFs, albeit strange, were clearly recognized and labeled as such. Therefore, in addition to distinguishing between meaningful and meaningless stimuli, the present results are more direct evidence that the perceptual mechanism manifested by the induced gamma oscillations relies on spatial configuration of the features in order to “match” the incoming stimulus with a preexistent representation. Additional support for this hypothesis is provided by emerging data in our laboratory, showing that face inversion reduces the induced gamma amplitude, whereas face familiarity enhances it (Anaki, D., Zion-Golumbic, E., and Bentin, S., in preparation).
In addition, an overall difference in amplitudes of gamma activity was found between face-related stimuli and watches. This categorical distinction goes along with the intracranial findings of Klopp et al. (1999) who found larger amplitudes of EEG oscillations between 5 and 45 Hz for faces than words in the fusiform gyrus. However, when interpreting this putative face-selective response, we should remember that the modulation of induced gamma activity is not restricted to faces. Rather, it probably reflects the coherent activation of distributed representations of any visual object.
It is possible that the present distinction between faces and watches reflects the enhanced need for configural processing and identification of faces relative to nonface objects perceived at the base-categorical level, the more established perceptual representation of faces than watches in memory, or that faces attract more attention than watches (Furey et al. 2006). This difference is intriguing also because a previous study showed that induced gamma oscillations are reduced by familiar stimulus repetition (Gruber and Muller 2005). A possible interpretation of the current pattern is that the default processing of faces at the individual level emphasized the difference among exemplars, whereas the categorical processing of watches reduced such differences. Therefore, it is possible that repetition effects were stronger for watches. Obviously, additional research is needed to shed light on this intriguing category-specificity effect.
The present data clearly demonstrate that the perceptual mechanisms reflected by the induced EEG oscillations in the gamma band and by the N170 are dissociable. Apparently, each of these mechanisms uses different visual cues and might be involved in different aspects of face processing. This outcome supports a multistage, perhaps hierarchical, account for early face processing. Specifically, we propose the existence of a face-detection (selection) stage functionally and neurologically distinct from a configural structural encoding stage (see also Liu et al. 2002).
Why would a detection mechanism be especially necessary for face processing? To answer this question, we should recall that whereas most object categories are perceived at a base (functional) level, faces (and perhaps objects of expertise) are perceived by default at the individual (within category) level (Tanaka 2001; Bukach et al. 2006). The within-category identification of a face cannot rely on its global shape because, unlike most other object categories, global shapes are similar across faces. Rather, face identification must rely on additional cues that distinguish among individual faces. Previous studies suggested that these cues are provided by face features and their second-order spatial relations (Bartlett and Searcy 1993; Cabeza and Kato 2000). In contrast, nonface objects are usually categorized at a basic (global) level and do not require additional individuation cues. Given this difference, efficient extraction of this information for structural encoding of the face should be triggered selectively. It seems that the N170 is associated with this early face-selection mechanism.
A different neural mechanism is apparently responsible for the structural encoding of the face, a process that requires computation of spatial relations between face features within the face space. We propose that induced gamma activity observed in this study might be directly or indirectly associated with this process. This hypothesis is supported by the higher induced gamma activity for correctly configured than scrambled face stimuli. Although these results are insufficient to determine whether induced gamma activity actually reflects the structural encoding stage itself or a later process, evidently structural encoding is necessary for activating the neural representation of a face that is presumably manifested by gamma oscillations.
To conclude, the present results functionally dissociated 2 neural mechanisms involved in different aspects of early face processing. The face-specific process manifested by the N170 seems to index the initial detection of a face in the visual field, whereas induced oscillatory gamma activity is related to the activation of a face's perceptual representation and perhaps is involved in the structural encoding of a face.
This study was funded by National Institute of Mental Health grant R01 MH 64458 to SB. We thank Drs David Anaki and Bruno Rossion for important comments on the manuscript. Conflict of Interest: None declared.