Abstract

Faces convey information about identity and emotional state, both of which are important for our social interactions. Models of face processing propose that changeable versus invariant aspects of a face, specifically facial expression/gaze direction versus facial identity, are coded by distinct neural pathways, yet neurophysiological data supporting this separation are incomplete. We recorded activity from neurons along the inferior bank of the superior temporal sulcus (STS) while monkeys viewed images of conspecific faces and non-face control stimuli. Eight monkey identities were used, each presented with 3 different facial expressions (neutral, fear grin, and threat). All facial expressions were displayed with both a direct and an averted gaze. In the posterior STS, we found that about one-quarter of face-responsive neurons were sensitive to social cues, the majority of which were sensitive to only one of these cues. In contrast, in the anterior STS, not only did the proportion of neurons sensitive to social cues increase, but so too did the proportion of neurons sensitive to conjunctions of identity with either gaze direction or expression. These data support a convergence of signals related to faces as one moves anteriorly along the inferior bank of the STS, which forms a fundamental part of the face-processing network.

Introduction

Faces are dynamic and complex stimuli that convey information about the identity and emotional state of an individual. A highly influential model of face processing, first proposed by Bruce and Young (1986) and later refined by Haxby and colleagues (2000), suggests that the different aspects of a face, such as its identity and expression, are processed by functionally and anatomically separated pathways. According to this model, regions in the inferior occipital gyrus first process low-level aspects of face stimuli (e.g., gender). From there, information concerning the specific aspects of a face is fed into 2 divergent pathways. The first pathway, which projects along the superior temporal sulcus (STS), is thought to be responsible for encoding the changeable aspects of a face (i.e., facial expression, gaze direction, and lip movement) (see also Allison et al. 2000). The second pathway, which projects ventrally along the fusiform gyrus, is thought to be responsible for encoding the invariant aspects of a face (i.e., identity). Collectively, these 2 pathways have been labeled the “core system” for face processing (Haxby et al. 2000). More recently, however, the degree to which these 2 pathways are indeed anatomically segregated, and at what level(s) this segregation is present, has come under debate (Calder and Young 2005; Engell and Haxby 2007; Hoffman et al. 2007; see Graham and LaBar 2012 for review).

Neurons responsive to face attributes have been found throughout macaque inferior temporal (IT) cortex, including the superior and inferior banks of the STS and the IT gyrus (Gross et al. 1969; Bruce et al. 1981; Perrett et al. 1982, 1984, 1985; Desimone et al. 1984; Rolls 2000). More recently, several groups have begun to parcellate these neuronal populations into functionally (and, in some cases, anatomically) distinct groups based on their selectivity for various facial features (Yamane et al. 1988; Hasselmo et al. 1989; Eifuku et al. 2004, 2011; Freiwald et al. 2009; Ghazanfar et al. 2010; Ohayon et al. 2012), which may provide clues as to if, where, and when the pathways responsible for processing faces might segregate. For example, a small proportion of neurons in the inferior bank, fundus, and superior bank of the STS were found to be sensitive to head orientation and gaze direction (Perrett et al. 1985) as well as facial expressions and different identities (Baylis et al. 1985; Hasselmo et al. 1989; Young and Yamane 1992; Sugase et al. 1999). The degree to which these cues interact appears to increase as one moves anteriorly along the STS (De Souza et al. 2005; Freiwald and Tsao 2010). Neurons sensitive to facial identity are found in more anterior and ventral areas of IT cortex (Eifuku et al. 2004; Leopold et al. 2006; Freiwald and Tsao 2010).

To provide further neurophysiological evidence addressing how social cues are encoded along the face-processing network, we examined the sensitivity of IT neurons to changeable and invariant properties of face stimuli. We chose to focus our recordings on 2 subdivisions of the inferior bank of the STS in monkeys, approximately corresponding to the face-selective patches recently identified using fMRI (Tsao et al. 2006, 2008; Pinsk et al. 2008; Bell et al. 2011; Ku et al. 2011). Three different facial expressions were used (neutral, fear grin, and threat), each presented with the gaze either directed toward the observer or averted, from 8 different identities. We targeted 3 fundamental questions concerning face processing in the IT cortex: (1) How are social cues encoded in IT cortex? (2) How does the encoding of social cues progress along the STS? (3) Are social cues (i.e., facial expression, gaze direction, and identity) processed by separate neural populations?

Methods

Animal Subjects and Recording Techniques

All procedures were approved by the National Institute of Mental Health Animal Care and Use Committee and observed all NIH guidelines. Two adult male rhesus monkeys (Macaca mulatta), weighing between 10 and 14 kg, were used in this study. MR-compatible head posts (Applied Prototype, Inc., Franklin, MA, USA) and recording chambers (Crist Instruments, Hagerstown, MD, USA) were surgically implanted under aseptic conditions. The recording chambers were placed over 19 mm craniotomies in the right hemisphere of both subjects, centered 12–14 mm anterior to the interaural axis.

Neuronal data were recorded from the right IT cortex (ventral bank of the STS and the immediately adjacent IT gyrus) using procedures described elsewhere (Bell et al. 2011) (Supplementary Fig. 1). During recording sessions, between 1 and 3 electrodes were lowered into IT cortex, guided by transdural guide tubes held in place by a delrin grid (Crist Instruments). Waveform data were sampled at 40 kHz and later sorted into individual units using Offline-Sorter (Plexon Systems, Dallas, TX, USA).

Fixation Task and Stimuli

Monkeys were rewarded for passively viewing images of conspecific faces and non-face control stimuli (objects and monkey body-part images). Each trial began with a fixation interval of 100–300 ms. The stimulus was presented for 300 ms followed by an additional 100-ms fixation interval. Eye position was sampled at 240 Hz using an infrared eye-tracking system (ISCAN, Inc., Woburn, MA, USA). Monkeys were given a liquid reward for maintaining stable fixation within 2–3° from a central fixation point for the entire duration of a single trial. Trials where the animal failed to maintain fixation were removed.

Forty-eight different conspecific face stimuli were used in this study (Gothard and Erickson 2004; Gothard et al. 2006), selected from the same stimulus set used previously in neuroimaging studies of social cue encoding in the temporal cortex of monkeys (Hoffman et al. 2007; Hadj-Bouziane et al. 2008). They consisted of 8 different facial identities (taken from monkeys at a different institution, therefore not familiar to the 2 subjects used in this study; male and female, juvenile to adult), each presented with 3 different facial expressions (neutral, fear grin, and threat) with the gaze (head and eyes) directed toward or averted relative to the observer. In addition, 20 images of monkey body parts (e.g., arms, torsos, etc.) and 20 familiar objects (i.e., objects that the monkeys saw/interacted with daily, such as water bottles, toys, sipper tubes, etc.) were also presented (Bell et al. 2008). Stimuli were presented using an LCD monitor (refresh rate: 60 Hz). All stimuli were full color, approximately 5 × 5° in size, and presented on a neutral gray background. They were controlled for overall luminance (variation in image intensity was <10% across the stimuli, <1% when presented on gray background) but not spatial frequency. Each of the 88 stimuli was presented at least 3 times per neuron (mean number of trials for each stimulus: 4.3 ± 3, median: 4; mean number of trials for each expression: 68.6 ± 23; mean number of trials for each gaze direction: 102.9 ± 35; mean number of trials for each identity: 25.7 ± 9), in randomized order.
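
The composition of the stimulus set can be made explicit with a short sketch. Only the counts (8 identities, 3 expressions, 2 gaze directions, 20 body-part images, and 20 objects) come from the text; the labels and the function name `build_stimulus_set` are hypothetical and are not the file names used in the study.

```python
from itertools import product

# Hypothetical labels; only the counts are taken from the text.
IDENTITIES = [f"id{i}" for i in range(1, 9)]        # 8 unfamiliar monkeys
EXPRESSIONS = ["neutral", "fear_grin", "threat"]    # 3 facial expressions
GAZES = ["direct", "averted"]                       # 2 gaze directions


def build_stimulus_set():
    """Return a list of (category, label) tuples covering all stimuli."""
    faces = [("face", f"{ident}_{expr}_{gaze}")
             for ident, expr, gaze in product(IDENTITIES, EXPRESSIONS, GAZES)]
    body_parts = [("body_part", f"body{i}") for i in range(1, 21)]
    objects = [("object", f"object{i}") for i in range(1, 21)]
    return faces + body_parts + objects


stimuli = build_stimulus_set()
n_faces = sum(1 for category, _ in stimuli if category == "face")
print(n_faces, len(stimuli))  # 48 88
```

The full factorial crossing is what allows each face trial to contribute to all 3 factors (expression, gaze direction, and identity) in the ANOVAs described under Data Analysis.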

Data Analysis

Spike trains were converted into spike-density functions using a normal Gaussian kernel (σ = 10 ms) and summed to yield a single density function for each trial. Neurons were defined as visually responsive if the average firing rate (50–400 ms following stimulus onset) to any/all of the 3 stimulus categories (neutral direct gaze faces, body parts, and objects) was significantly different from baseline (200 ms prior to 50 ms after stimulus onset) (Wilcoxon rank sum tests, P < 0.05). We examined all neurons that showed a significant response to faces (average of all 8 neutral, direct gaze face stimuli), regardless of whether the neuron responded more strongly to faces or stimuli from either non-face category. Data from neurons that responded most strongly to faces versus those that responded to faces but more strongly to stimuli from another category were quantitatively and qualitatively similar and so are considered together and are henceforth collectively referred to in this study as “face-responsive neurons.” Neuronal responses to different facial features (facial expression, gaze direction, and identity) and interactions between the various factors were analyzed using repeated-measures 3-way ANOVAs (factors: expression, gaze direction, and identity). Neurons that showed a main effect of identity, facial expression, or gaze direction were considered sensitive to that feature.
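
As a minimal sketch of the spike-density step, assuming a unit-area Gaussian kernel evaluated at 1-ms resolution (the resolution, the function names, and the example spike times are assumptions made for illustration and are not taken from the study), the conversion and the two analysis windows might look like this:

```python
import math


def spike_density(spike_times_ms, t_start_ms, t_stop_ms, sigma_ms=10.0):
    """Return (times, rates) with rates in spikes/s at 1-ms resolution.

    Each spike is replaced by a unit-area Gaussian (sigma = 10 ms, as in the
    text); summing the Gaussians gives the density function for one trial.
    """
    norm = 1.0 / (sigma_ms * math.sqrt(2.0 * math.pi))  # unit area per spike
    times = list(range(t_start_ms, t_stop_ms + 1))
    rates = []
    for t in times:
        density = sum(norm * math.exp(-0.5 * ((t - s) / sigma_ms) ** 2)
                      for s in spike_times_ms)
        rates.append(density * 1000.0)  # spikes/ms -> spikes/s
    return times, rates


def mean_rate(times, rates, win_start_ms, win_stop_ms):
    """Average the density over an analysis window (inclusive bounds)."""
    vals = [r for t, r in zip(times, rates) if win_start_ms <= t <= win_stop_ms]
    return sum(vals) / len(vals)


# Hypothetical single trial: spike times in ms relative to stimulus onset.
trial = [-150, 80, 95, 120, 160, 210, 300]
times, rates = spike_density(trial, -200, 400)
baseline = mean_rate(times, rates, -200, 50)  # baseline window from the text
response = mean_rate(times, rates, 50, 400)   # response window from the text
print(round(baseline, 1), round(response, 1))
```

In the analysis described above, the per-trial baseline and response rates would then be compared with a Wilcoxon rank sum test; that step is omitted here to keep the sketch free of external dependencies.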

Results

We recorded activity from 637 neurons from the IT cortex (between ∼5–19 mm anterior to the interaural axis) of 2 monkeys (463 from monkey A and 174 from monkey B; see Supplementary Fig. 1 for recording locations), of which 64% (439/637) responded to at least one of the visual stimuli presented (summarized in Table 1). Approximately 44% (278/637) of the neurons encountered were responsive to face stimuli, 2 examples of which are shown in Figure 1. Both of these neurons were highly selective for faces. The neuron on the left also showed a significant effect of facial expression on the magnitude of the face responses (ANOVA, F(2,160) = 16.08, P < 10^−6), such that the response to threat (direct gaze) was greater than that to either neutral or fear grin (Fig. 1B, left). In contrast, the neuron on the right showed approximately equal responses to all 3 facial expressions (Fig. 1B, right), but showed a significantly greater response to direct gazes (neutral faces) compared with averted gazes (ANOVA, F(1,186) = 39.66, P < 10^−8) (Fig. 1C, right).

Table 1

Breakdown of the sampled population of face-responsive neurons

                                   n (monkey A/monkey B)    % (monkey A/monkey B)
Face-responsive neurons (total)    278 (183/95)             44% (40%/55%)
Effect of facial expression
  Significant                      65 (44/21)               23% (24%/22%)
  Not significant                  213 (139/74)             77% (76%/78%)
Effect of gaze direction
  Significant                      77 (49/28)               28% (27%/29%)
  Not significant                  201 (134/67)             72% (73%/71%)
Effect of identity
  Significant                      128 (85/43)              46% (46%/45%)
  Not significant                  150 (98/52)              54% (54%/55%)
Figure 1.

Neuronal responses to face stimuli in the IT cortex. (A) Each row represents the average response to 3–5 repetitions of one of the 88 different stimuli. The average response to all stimuli from a given category (neutral direct gaze faces, body parts, and objects) is shown above. (B and C) Average response to the different facial expressions (B) (direct gaze only, collapsed across identity) and gaze direction (C) (neutral expression only, collapsed across identity). The neuron on the left exhibited a significant main effect of facial expression (ANOVA, F(2,160) = 16.08, P < 10^−6), whereas the neuron on the right exhibited a significant main effect of gaze direction (ANOVA, F(1,186) = 39.66, P < 10^−8). sp/s: spikes/second.

Encoding of Social Cues by Individual Neurons in IT Cortex

Of the 278 face-responsive neurons sampled, 23% (65/278) were sensitive to facial expression (ANOVAs with significant main effect of facial expression, P < 0.05). The average spike-density function for these 65 neurons is shown in Figure 2A. At the population level, threatening faces elicited the greatest response, followed by neutral faces. Fear grin/submissive faces did not elicit statistically different responses compared with neutral faces. From these data, it might be tempting to assume that neurons sensitive to facial expressions all exhibited this pattern of activity (threat > neutral > fear grin). However, closer examination of these neurons revealed that, in fact, many showed the opposite effect. Figure 2B shows the relative (normalized) responses to threat versus fear grin for all 278 face-responsive neurons. Filled symbols indicate neurons that exhibited a significant main effect of facial expression (n = 65, 23%). Those shown in red are neurons whose response to threat was greater than fear grin (n = 44/65), and those shown in blue are those whose response to threat was less than fear grin (n = 21/65). Therefore, while at the population level (Fig. 2A, inset), the greatest response was found for threatening faces, a third (32%) of these neurons responded more strongly to fear grin faces.

Figure 2.

Encoding of social cues in the IT cortex. (A) Population response to neutral, threat, and fear grin expressions across all neurons sensitive to facial expression (direct gaze only, collapsed across identity). Inset shows the average normalized response (±SEM) to each of the expressions across all neurons exhibiting a significant effect of facial expression. Responses have been normalized to the maximum response to neutral, gaze-directed faces. (B) Distribution of preferred expression across all face-responsive neurons. (C) Population response to direct versus averted gaze (neutral expression only, collapsed across identity) across all neurons sensitive to gaze direction. (D) Distribution of preferred direction across all face-responsive neurons.

A similar trend was seen for gaze direction (Fig. 2C,D). At the population level, averted gaze stimuli (neutral only) elicited stronger responses among the 28% (77/278) of neurons that showed a main effect of gaze direction (Fig. 2C, inset), with 45 of these 77 neurons responding more strongly to averted gaze. However, a substantial proportion of these neurons (n = 32/77, 42%) responded more strongly to direct gaze stimuli than to averted gaze stimuli (Fig. 2D).

Forty-six percent of face-responsive neurons (128/278) exhibited a main effect of identity (ANOVAs, P < 0.05). The average spike-density functions for these 128 neurons are shown in Figure 3A. The 4 different traces show the responses to 4 different identities, sorted according to each neuron's preference, so that the responses to each neuron's most preferred identity (neutral, direct gaze) are shown by the solid trace and the least preferred identity is shown by the black dashed trace. This panel shows that, on average, while these neurons exhibited a clear preference for specific identities, the neurons tended to respond to all 8 identities presented.

Figure 3.

Encoding of facial identity in the IT cortex. (A) Population response to the most preferred, third most, sixth most, and least preferred identity (neutral, direct gaze stimuli only) for all face-responsive neurons sensitive to identity. Order of preference determined on a neuron-by-neuron basis, and all responses were normalized to the magnitude of response to the most preferred identity. (B) Selectivity indices for identity for all face-responsive neurons. Indices calculated using the following formula: ([response to preferred identity] − [average response to all identities]) / ([response to preferred identity] + [average response to all identities]). (C) Tuning width (full-width half maximum) of the sorted responses to each of the 8 identities. Values close to zero indicate broader tuning. sp/s: spikes per second; n.s.: not significant.

It is possible that the larger proportion of neurons sensitive to identity, compared with the proportions sensitive to facial expression or gaze direction, was due in part to the larger number of exemplars used to assess identity (8) than facial expression (3) or gaze direction (2). Therefore, we repeated the ANOVAs using 3 identities (selected at random). This analysis revealed that, indeed, when sensitivity to identity was defined on the basis of 3 rather than 8 identities, the proportion of identity-sensitive neurons dropped to 22% (60/278). The implications of this finding are addressed in the Discussion.

We calculated an identity selectivity index to measure the degree to which neurons were selective for their preferred identity, with larger indices indicating a greater difference between the mean response to the preferred identity (neutral, direct gaze) and the mean response to all identities (Fig. 3B). Across the population of neurons with a significant effect of identity, the average selectivity index was 0.24 ± 0.01, indicating that the response to the preferred identity was approximately 60% greater than the average response across all identities. We also quantified the selectivity of these neurons by calculating the tuning width (full-width half maximum) (Freiwald and Tsao 2010) of the sorted responses to each of the 8 identities. Values close to zero indicate broader tuning. The average tuning width of neurons responsive to faces was 2.39 ± 1.7 (Fig. 3C).
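
The step from an index of 0.24 to "approximately 60% greater" follows from inverting the index formula given in the Figure 3 legend: if SI = (P − M)/(P + M), then P/M = (1 + SI)/(1 − SI). The sketch below works through this arithmetic; the function names and the example firing rates are hypothetical and were chosen only so that the index equals 0.24.

```python
def selectivity_index(preferred, mean_all):
    """(preferred - mean_all) / (preferred + mean_all), per the Figure 3 legend."""
    return (preferred - mean_all) / (preferred + mean_all)


def response_ratio(index):
    """Invert the index: preferred / mean_all = (1 + SI) / (1 - SI)."""
    return (1.0 + index) / (1.0 - index)


ratio = response_ratio(0.24)  # population-average index from the text
print(round(ratio, 2))        # 1.63, i.e. roughly 60% greater

# Round trip with hypothetical firing rates (spikes/s), for illustration only.
si = selectivity_index(preferred=31.0, mean_all=19.0)
print(round(si, 2))           # 0.24
```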

Hierarchical Representation of Social Cues Along the STS

To examine how the encoding of social cues (i.e., facial expression, gaze direction, and identity) evolves along the STS, we divided the sample population into 2 datasets: neurons sampled from regions 5–10 mm anterior to the interaural axis (“posterior STS” dataset, n = 206; 155 from monkey A and 51 from monkey B) and those sampled from regions 14–19 mm anterior to the interaural axis (“anterior STS” dataset, n = 317; 223 from monkey A and 94 from monkey B). These posterior and anterior subdivisions were purposefully selected so as to encompass the locations of the fMRI-identified posterior/middle and anterior face-selective regions, respectively (Pinsk et al. 2008; Tsao et al. 2008; Freiwald et al. 2009; Moeller et al. 2009; Bell et al. 2011; Ku et al. 2011).

The proportion of neurons responsive to faces (vs. neurons responsive only to non-face stimuli, Fig. 4A), the proportions sensitive to individual social cues (Fig. 4B), and the proportion exhibiting interactions (ANOVAs) between social cues all differed significantly between the 2 populations (χ2 tests, P < 0.05). Compared with the posterior STS, the proportion of neurons in the anterior STS responsive to any of the visual stimuli dropped by 8 percentage points (from 66% to 58%), whereas the proportion of face-responsive neurons increased by 7 percentage points (from 34% to 41%) (Fig. 4A). Of greater interest was the increase in the proportion of neurons sensitive to each of the social cues (facial expression, gaze direction, and identity) in the anterior STS relative to the posterior STS, in most cases at least a doubling (Fig. 4B). Likewise, the proportion of neurons that showed a significant two-way interaction (ANOVAs, P < 0.05) between identity and facial expression or gaze direction increased substantially. Notably, these interactions were largely between facial expression and identity or between gaze direction and identity, and only rarely between facial expression and gaze direction (Fig. 4C).
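
For readers unfamiliar with this kind of comparison, the following is a minimal sketch of a Pearson χ2 test on a 2 × 2 table of counts (region by sensitive/not sensitive). The text does not state which variant of the χ2 test was used (for example, with or without a continuity correction), so the uncorrected form here is an assumption, and the counts are hypothetical values chosen for illustration, not data from the study.

```python
import math


def chi_square_2x2(a, b, c, d):
    """Pearson chi-square (no continuity correction) for the table
        [[a, b],
         [c, d]]
    Returns (chi2, p) for 1 degree of freedom.
    """
    n = a + b + c + d
    chi2 = n * (a * d - b * c) ** 2 / ((a + b) * (c + d) * (a + c) * (b + d))
    # Survival function of chi-square with 1 df: P(X > x) = erfc(sqrt(x / 2)).
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p


# Hypothetical counts for illustration only (not taken from the study):
# rows = posterior / anterior, columns = sensitive / not sensitive to a cue.
chi2, p = chi_square_2x2(10, 60, 40, 90)
print(round(chi2, 2), p < 0.05)
```

With these illustrative counts the sensitive fraction rises from 10/70 to 40/130, and the test asks whether a difference of that size would be expected by chance given the sample sizes.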

Figure 4.

Distribution of neuron selectivity in posterior versus anterior STS. (A) Proportion of face-responsive neurons for posterior (all recording sites between 5–10 mm forward of interaural axis) versus anterior STS (14–19 mm). (B) Proportion of face-responsive neurons that exhibited a significant main effect (ANOVA) to each of the social cues in posterior versus anterior STS. Identity (asterisk) refers to neurons showing a significant effect of identity, as measured using a random selection of 3 identities. (C) Proportion of face-responsive neurons that exhibited a significant interaction effect (ANOVA) between 2 of the social cues. FE: facial expression; ID: identity; GD: gaze direction.

Separate Neuronal Populations for Facial Expression and Gaze Direction

Two critical questions to be addressed with these data are: (1) whether the same or different populations of face-responsive neurons process the 3 social cues and (2) whether this organization changes along the STS. If the 3 subpopulations were completely independent (meaning that no neurons show sensitivity to more than one social cue), this might suggest that social cues are processed, at this stage, independently. On the other hand, if most neurons were sensitive to 2 or more factors, this would suggest that at this level of the face-processing pathway, social cues are processed together.

In the case of neurons in the posterior STS sensitive to social cues, the majority of neurons process a single cue. For example, of the 10 neurons sensitive to facial expression, 50% are sensitive to facial expression only (Fig. 5A). Likewise, of the 28 neurons sensitive to identity, 68% are sensitive to identity only. In contrast, in the case of neurons in the anterior STS, not only is there an increase in the proportion of neurons sensitive to social cues (as described above, Fig. 4), but there is also an increase in the proportions of neurons that are sensitive to at least 2 different social cues (i.e., showing at least 2 main effects of identity, facial expression, and/or gaze direction), suggesting an integration of the different aspects of a face in the anterior STS.

Figure 5.

Convergence of neural signals related to social cues in STS. (A and B) Distribution of neurons sensitive to one or more social cue(s) in posterior (A) versus anterior (B) STS. Values indicate the number and proportion of total number sensitive to specified cue (defined as a significant main effect of the given cue). (C) Proportion of neurons in posterior versus anterior STS sensitive to only one of the 3 social cues. (D) Degree of overlap between neuronal populations sensitive to facial expression versus gaze direction, independent of recording location. FE: facial expression; ID: identity; GD: gaze direction.

This is summarized in Figure 5C, which shows the proportion of neurons sensitive only to a “single” social cue, relative to the total number of neurons sensitive to “at least” one cue. The effect was particularly evident for neurons sensitive to identity: compared with 68% of neurons sensitive only to identity in the posterior STS, 44% are sensitive only to identity in the anterior STS, with the rest being sensitive to identity and facial expression and/or gaze direction. Neurons sensitive to facial expression also exhibited a substantial decrease in the proportion sensitive to facial expression alone (from 50% to 34%). Interestingly, neurons sensitive to gaze direction showed a much more modest decrease (from 42% to 38%), which may indicate that gaze direction is processed differently.

Note again that, similar to the presence of interaction effects between social cues (Fig. 4C), the majority of neurons exhibiting sensitivity to more than one social cue (i.e., more than one main effect) are those that are sensitive to identity and either facial expression or gaze direction, with relatively few neurons sensitive to both gaze direction and facial expression.

Therefore, in addition to an increase in sensitivity to social cues in general as one progresses anteriorly along the STS, there is also an increase in the level of complexity with respect to individual neurons.

Effect of Social Cues on Neurons not Responsive to Faces

In previous fMRI studies, we demonstrated that the influence of facial expressions and gaze direction is not restricted to the fMRI-identified face patches, but is instead distributed along the STS (Hadj-Bouziane et al. 2008; Furl et al. 2012). This suggests that the effect of social cues may not be restricted to only those neurons that respond to neutral faces. To further investigate this possibility, we examined the effect of facial expressions and gaze directions on the responses of neurons that were not classified as face-responsive neurons according to our selection criterion (i.e., showed a statistically significant response above baseline to neutral, gaze-directed faces). Two-hundred and seventy-five neurons met this criterion (183 were not responsive to any of the visual stimuli tested, 35 responded significantly to body-part stimuli, and 57 responded significantly to object stimuli). Of the 275 neurons that did not exhibit significant responses to neutral, gaze-directed faces, 17% (48/275) showed a significant main effect of facial expression. The average spike-density function for these 48 neurons is shown in Figure 6A. As with face-responsive neurons (Fig. 2A,B), threatening faces appeared to elicit a greater response at the population level. However, we encountered neurons that exhibited stronger responses to fear grin as well (Fig. 6B).

Figure 6.

Influence of social cues on neurons not responsive to faces. (A) Population response to neutral, threat, and fear grin expressions across all neurons that did not show a significant response to neutral, gaze-directed faces (i.e., “non-face-responsive neurons”), but that nonetheless exhibited a significant main effect of facial expression (direct gaze only, collapsed across identity). Inset shows the average normalized response (±SEM) to each of the expressions across all neurons exhibiting a significant effect of facial expression. Responses have been normalized to the maximum response elicited by any of the visual stimuli on a neuron-by-neuron basis, in order to account for the large variation in response magnitudes. (B) Distribution of preferred expression across all non-face-responsive neurons. (C) Population response to direct versus averted gaze (neutral expression only, collapsed across identity) across all non-face-responsive neurons sensitive to gaze direction. (D) Distribution of preferred direction across all non-face-responsive neurons. (E) Distribution of non-face-responsive neurons sensitive to one or more social cue(s). Values indicate the number and proportion of total number sensitive to specified cue (defined as a significant main effect of the given cue). sp/s: spikes per second; n.s.: not significant.

Similarly, 15% (41/275) of the neurons that did not respond to neutral, gaze-directed faces nonetheless exhibited a main effect of gaze direction (Fig. 6C). The majority of these responded more strongly to faces with averted gaze, while a small subset responded more strongly to faces with direct gaze (i.e., were suppressed by averted gaze).

To determine to what degree these 2 subpopulations represented the same neurons, we constructed a Venn diagram (Fig. 6E). As with face-responsive neurons, the majority of neurons (77%) that exhibited a main effect of facial expression did not exhibit a main effect of gaze direction, and vice versa (73%), thus suggesting relatively little overlap between the 2 populations.
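The overlap proportions summarized in such a Venn diagram can be computed as in the following minimal sketch. It assumes that each neuron has an identifier and that the two populations are stored as collections of those identifiers. The function name and the example identifiers are hypothetical and do not reproduce the counts reported here.

```python
def overlap_summary(expression_ids, gaze_ids):
    """Summarize the overlap between two populations of neurons.

    Returns the number of neurons in both populations and, for each
    population, the fraction of its neurons that are absent from the other.
    """
    expression_ids, gaze_ids = set(expression_ids), set(gaze_ids)
    if not expression_ids or not gaze_ids:
        raise ValueError("both populations must contain at least one neuron")
    return {
        "both": len(expression_ids & gaze_ids),
        "expression_only_frac": len(expression_ids - gaze_ids) / len(expression_ids),
        "gaze_only_frac": len(gaze_ids - expression_ids) / len(gaze_ids),
    }


# Hypothetical identifiers: 10 expression-sensitive and 6 gaze-sensitive neurons,
# of which 2 (neurons 8 and 9) are sensitive to both cues.
result = overlap_summary(range(10), range(8, 14))
print(result)
```

Each fraction is expressed relative to its own population, which is why the two percentages in the text need not match even though they describe the same set of overlapping neurons.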

Discussion

Here, we examined the responses of neurons in the posterior and anterior STS to face stimuli, including several identities, facial expressions, and gaze directions. Consistent with previous reports (Perrett et al. 1985; Hasselmo et al. 1989; Sugase et al. 1999; De Souza et al. 2005; Freiwald et al. 2009; Freiwald and Tsao 2010), we found a modest proportion of neurons in these regions sensitive to these different facial attributes. Furthermore, as one moves from the posterior to the anterior STS, we found (1) a higher proportion of neurons sensitive to social cues (identity, facial expression, or gaze direction) and (2) a higher degree of integration of these different face attributes, particularly in the case of facial expression and identity. These data provide novel neurophysiological evidence for increased complexity in how neurons encode social cues as one moves anteriorly along the STS.

Hierarchical Encoding of Social Cues Along a Posterior-to-Anterior Gradient

According to Haxby and colleagues' model of face processing in the human (Haxby et al. 2000), the changeable aspects of a face, including its expression and gaze direction, are processed in the STS, whereas the invariant aspects of a face, such as its identity, are processed in the lateral fusiform gyrus. More recently, there has been some debate as to precisely where and to what degree the changeable versus invariant aspects of a face are encoded in anatomically distinct cortical regions (Calder and Young 2005; Calder 2011) and whether facial expression and gaze direction might be processed by partially segregated pathways (Graham and LaBar 2012).

Over the past decade, there has been increasing electrophysiological support for a posterior–anterior hierarchy in face-encoding, primarily along the dimension of viewpoint invariance. For example, De Souza and colleagues (2005) demonstrated that invariance to facial view increases as one moves anteriorly along the STS, a finding later supported by Freiwald and Tsao (2010). In the latter case, the researchers specifically targeted the fMRI-identified middle [middle lateral (ML)/middle fundus (MF)] and anterior [anterior lateral (AL)/anterior medial (AM)] face-selective regions (located ∼3–8 and 12–19 mm anterior to the interaural axis; Tsao et al. 2006; Freiwald et al. 2009; Freiwald and Tsao 2010).

Our results extend the findings from these studies by demonstrating a similar hierarchical progression for facial expression, gaze direction, and identity. Here, we show that, at a relatively early stage of face processing (area TEO, the approximate location of the face patches ML and MF; Tsao et al. 2008), these classes of social cues appear to be encoded by primarily distinct neuronal populations (Fig. 5). However, at a later stage (area TE, the approximate location of the face patches AL and anterior fundus [AF]), not only is there a general increase in the number of neurons that encode social cues (Fig. 3B), there is also a marked increase in the overlap of neurons encoding the different social cues (Fig. 5), as well as in neurons encoding interactions between social cues, especially between facial expression and identity or gaze direction and identity (Fig. 4C). This pattern further supports the hypothesis that facial expression and gaze direction are processed by partially segregated pathways.

Behavioral evidence has shown that facial expression and gaze direction influence one another (Senju and Hasegawa 2005; Adams and Franklin 2009; Graham and LaBar 2012) and early neuroimaging studies in humans found that both cues are encoded within the posterior STS (Hoffman and Haxby 2000; Andrews and Ewbank 2004). More recent studies, however, have shown that they are processed by distinct yet overlapping regions within the STS (Engell and Haxby 2007; Harris et al. 2012; Baseler et al. 2014). Our data confirm these later findings by demonstrating that neurons encoding these 2 features are found in the same locations along the STS, but form 2 largely distinct populations (e.g., Figs 4C and 5A,B). A similar segregation has been reported in the monkey amygdala (Hoffman et al. 2007).

The above data show that, while there might be an initial degree of separation in the encoding of facial attributes in the posterior STS, as one moves more anteriorly along the STS, these properties become integrated within the same neurons, at least in the monkey brain. How then does this change our conception of face processing in the human brain? There is, first, the matter of how face-responsive regions within IT cortex of the monkey brain map onto regions of the human brain, which is still under much debate. Current evidence suggests that the more posterior face-responsive regions in the monkey (ML and MF) share properties with the occipital face area (OFA) and possibly the posterior fusiform face area (FFA) in the human, whereas more anterior regions in the monkey (AL and AF) share properties with anterior FFA and anterior temporal face regions in the human (Tsao et al. 2008; see Tsao and Livingstone 2008; Yovel and Freiwald 2013 for discussion). Given these putative, albeit imperfect, homologies and the current data, one would expect to see a blending of social cue encoding in the human brain as early as the FFA, which is indeed the case (Ganel et al. 2005; Cohen Kadosh et al. 2010). One would also predict an effect of one cue on the discrimination of another at the behavioral level, which is again the case (Lobmaier et al. 2008; Ewbank et al. 2009; Pourtois et al. 2010; Milders et al. 2011). Together, these findings support the notion that the human brain, like the monkey brain, follows a pattern of converging neural signals related to social cues from posterior to anterior IT cortex.

Greater Diversity in Facial Identity Drives Increased Neuronal Selectivity

Our data showed a relatively high proportion of neurons sensitive to facial identity (Fig. 2). From this, one might conclude that neurons in the STS are, on average, more likely to show sensitivity to identity than to facial expression or gaze direction. However, further analysis revealed that this effect may be due, at least in part, to experimental design, specifically the larger number of exemplars used to assess identity than to assess facial expression or gaze direction. While we acknowledge this limitation, we nonetheless argue that, under natural conditions, a greater proportion of neurons in the STS may indeed show sensitivity to identity, given the assumed limited classes of facial expression (anger, sadness, surprise, disgust, happiness, and fear) (Darwin 1872; Ekman et al. 1969) compared with the limitless range of facial identities one can recognize.

Generalized Effect of Social Cues on Neuronal Activity in the STS

In addition to demonstrating an effect of social cues on face-responsive neurons, we have also shown that a small proportion of neurons that do not respond to neutral, gaze-directed faces nonetheless show an effect of facial expression and/or gaze direction, which is consistent with our previous neuroimaging studies (Hadj-Bouziane et al. 2008; Furl et al. 2012). In the majority of cases, this was expressed as an increase in neuronal activity in response to facial expressions or averted gazes (Fig. 6). However, in the case of gaze direction, some neurons showed a decrease in activity in response to averted faces.

These data suggest a sparse but widespread effect of social cues on IT cortex. In support of this, we recently showed that lesions of the amygdala disrupt the effect of social cues on activation within temporal cortex while sparing the processing of faces per se (Hadj-Bouziane et al. 2012). It therefore seems likely that projections from the amygdala mediate the effect of social cues on the responsiveness of IT neurons, and that these projections do not selectively target face-responsive neurons.

Supplementary Material

Supplementary material can be found at: http://www.cercor.oxfordjournals.org/.

Funding

This work was supported by the National Institute of Mental Health Intramural Research Program (E.L.M., F.H.B., L.G.U., and A.H.B.).

Notes

The authors thank Katalin Gothard for generously providing the face stimuli; Nicholas Malecek for his invaluable assistance with setting up the electrophysiological recording apparatus and his contributions to training the animals; and Andrew Calder for his insightful comments. Conflict of Interest: None declared.

References

Adams RB, Franklin RG. 2009. Influence of emotional expression on the processing of gaze direction. Motiv Emot. 33:106–112.

Allison T, Puce A, McCarthy G. 2000. Social perception from visual cues: role of the STS region. Trends Cogn Sci. 4:267–278.

Andrews TJ, Ewbank MP. 2004. Distinct representations for facial identity and changeable aspects of faces in the human temporal lobe. NeuroImage. 23:905–913.

Baseler HA, Harris RJ, Young AW, Andrews TJ. 2014. Neural responses to expression and gaze in the posterior superior temporal sulcus interact with facial identity. Cereb Cortex. 24:737–744.

Baylis G, Rolls E, Leonard C. 1985. Selectivity between faces in the responses of a population of neurons in the cortex in the superior temporal sulcus of the monkey. Brain Res. 342:91–102.

Bell AH, Hadj-Bouziane F, Frihauf JB, Tootell RBH, Ungerleider LG. 2008. Object representations in the temporal cortex of monkeys and humans as revealed by functional magnetic resonance imaging. J Neurophysiol. 101:688–700.

Bell AH, Malecek NJ, Morin EL, Hadj-Bouziane F, Tootell RBH, Ungerleider LG. 2011. Relationship between functional magnetic resonance imaging-identified regions and neuronal category selectivity. J Neurosci. 31:12229–12240.

Bruce C, Desimone R, Gross C. 1981. Visual properties of neurons in a polysensory area in superior temporal sulcus of the macaque. J Neurophysiol. 46:369–384.

Bruce V, Young A. 1986. Understanding face recognition. Br J Psychol. 77(Pt 3):305–327.

Calder AJ. 2011. Does facial identity and facial expression recognition involve separate visual routes? In: Calder AJ, Rhodes G, Johnson M, Haxby JV, editors. Oxford Handbook of Face Perception. 1st ed. Oxford: Oxford University Press, pp. 427–448.

Calder AJ, Young AW. 2005. Understanding the recognition of facial identity and facial expression. Nat Rev Neurosci. 6:641–651.

Cohen Kadosh K, Henson RNA, Cohen Kadosh R, Johnson MH, Dick F. 2010. Task-dependent activation of face-sensitive cortex: an fMRI adaptation study. J Cogn Neurosci. 22:903–917.

Darwin C. 1872. The expression of the emotions in man and animals. London: John Murray.

Desimone R, Albright TD, Gross CG, Bruce C. 1984. Stimulus-selective properties of inferior temporal neurons in the macaque. J Neurosci. 4:2051–2062.

De Souza WC, Eifuku S, Tamura R, Nishijo H, Ono T. 2005. Differential characteristics of face neuron responses within the anterior superior temporal sulcus of macaques. J Neurophysiol. 94:1252–1266.

Eifuku S, De Souza WC, Nakata R, Ono T, Tamura R. 2011. Neural representations of personally familiar and unfamiliar faces in the anterior inferior temporal cortex of monkeys. PLoS ONE. 6:e18913.

Eifuku S, De Souza WC, Tamura R, Nishijo H, Ono T. 2004. Neuronal correlates of face identification in the monkey anterior temporal cortical areas. J Neurophysiol. 91:358–371.

Ekman P, Sorenson ER, Friesen WV. 1969. Pan-cultural elements in facial displays of emotion. Science. 164:86–88.

Engell AD, Haxby JV. 2007. Facial expression and gaze-direction in human superior temporal sulcus. Neuropsychologia. 45:3234–3241.

Ewbank MP, Jennings C, Calder AJ. 2009. Why are you angry with me? Facial expressions of threat influence perception of gaze direction. J Vis. 9:16.

Freiwald WA, Tsao DY. 2010. Functional compartmentalization and viewpoint generalization within the macaque face-processing system. Science. 330:845–851.

Freiwald WA, Tsao DY, Livingstone MS. 2009. A face feature space in the macaque temporal lobe. Nat Neurosci. 12:1187–1196.

Furl N, Hadj-Bouziane F, Liu N, Averbeck BB, Ungerleider LG. 2012. Dynamic and static facial expressions decoded from motion-sensitive areas in the macaque monkey. J Neurosci. 32:15952–15962.

Ganel T, Valyear KF, Goshen-Gottstein Y, Goodale MA. 2005. The involvement of the “fusiform face area” in processing facial expression. Neuropsychologia. 43:1645–1654.

Ghazanfar AA, Chandrasekaran C, Morrill RJ. 2010. Dynamic, rhythmic facial expressions and the superior temporal sulcus of macaque monkeys: implications for the evolution of audiovisual speech. Eur J Neurosci. 31:1807–1817.

Gothard K, Erickson C. 2004. How do rhesus monkeys (Macaca mulatta) scan faces in a visual paired comparison task? Anim Cogn. 7:25–36.

Gothard KM, Battaglia FP, Erickson CA, Spitler KM, Amaral DG. 2006. Neural responses to facial expression and face identity in the monkey amygdala. J Neurophysiol. 97:1671–1683.

Graham R, LaBar KS. 2012. Neurocognitive mechanisms of gaze-expression interactions in face processing and social attention. Neuropsychologia. 50:553–566.

Gross CG, Bender DB, Rocha-Miranda CE. 1969. Visual receptive fields of neurons in inferotemporal cortex of the monkey. Science. 166:1303–1306.

Hadj-Bouziane F, Bell AH, Knusten TA, Ungerleider LG, Tootell RBH. 2008. Perception of emotional expressions is independent of face selectivity in monkey inferior temporal cortex. Proc Natl Acad Sci USA. 105:5591–5596.

Hadj-Bouziane F, Liu N, Bell AH, Gothard KM, Luh W-M, Tootell RBH, Murray EA, Ungerleider LG. 2012. Amygdala lesions disrupt modulation of functional MRI activity evoked by facial expression in the monkey inferior temporal cortex. Proc Natl Acad Sci USA. 109:E3640–E3648.

Harris RJ, Young AW, Andrews TJ. 2012. Morphing between expressions dissociates continuous from categorical representations of facial expression in the human brain. Proc Natl Acad Sci USA. 109:21164–21169.

Hasselmo ME, Rolls ET, Baylis GC. 1989. The role of expression and identity in the face-selective responses of neurons in the temporal visual cortex of the monkey. Behav Brain Res. 32:203–218.

Haxby J, Hoffman E, Gobbini M. 2000. The distributed human neural system for face perception. Trends Cogn Sci. 4:223–233.

Hoffman EA, Haxby JV. 2000. Distinct representations of eye gaze and identity in the distributed human neural system for face perception. Nat Neurosci. 3:80–84.

Hoffman KL, Gothard KM, Schmid MC, Logothetis NK. 2007. Facial-expression and gaze-selective responses in the monkey amygdala. Curr Biol. 17:766–772.

Ku S-P, Tolias AS, Logothetis NK, Goense J. 2011. fMRI of the face-processing network in the ventral temporal lobe of awake and anesthetized macaques. Neuron. 70:352–362.

Leopold DA, Bondar IV, Giese MA. 2006. Norm-based face encoding by single neurons in the monkey inferotemporal cortex. Nature. 442:572–575.

Lobmaier JS, Tiddeman BP, Perrett DI. 2008. Emotional expression modulates perceived gaze direction. Emotion. 8:573–577.

Milders M, Hietanen JK, Leppänen JM, Braun M. 2011. Detection of emotional faces is modulated by the direction of eye gaze. Emotion. 11:1456–1461.

Moeller S, Nallasamy N, Tsao DY, Freiwald WA. 2009. Functional connectivity of the macaque brain across stimulus and arousal states. J Neurosci. 29:5897–5909.

Ohayon S, Freiwald WA, Tsao DY. 2012. What makes a cell face selective? The importance of contrast. Neuron. 74:567–581.

Perrett D, Rolls E, Caan W. 1982. Visual neurones responsive to faces in the monkey temporal cortex. Exp Brain Res. 47:329–342.

Perrett D, Smith P, Potter D, Mistlin A, Head A, Milner A, Jeeves M. 1985. Visual cells in the temporal cortex sensitive to face view and gaze direction. Philos Trans R Soc Lond B Biol Sci. 223:293–317.

Perrett DI, Smith PA, Potter DD, Mistlin AJ, Head AS, Milner AD, Jeeves MA. 1984. Neurones responsive to faces in the temporal cortex: studies of functional organization, sensitivity to identity and relation to perception. Hum Neurobiol. 3:197–208.

Pinsk MA, Arcaro M, Weiner KS, Kalkus JF, Inati SJ, Gross CG, Kastner S. 2008. Neural representations of faces and body parts in macaque and human cortex: a comparative fMRI study. J Neurophysiol. 101:2581–2600.

Pourtois G, Spinelli L, Seeck M, Vuilleumier P. 2010. Modulation of face processing by emotional expression and gaze direction during intracranial recordings in right fusiform cortex. J Cogn Neurosci. 22:2086–2107.

Rolls ET. 2000. Functions of the primate temporal lobe cortical visual areas in invariant visual object and face recognition. Neuron. 27:205–218.

Senju A, Hasegawa T. 2005. Direct gaze captures visuospatial attention. Vis Cogn. 12:127–144.

Sugase Y, Yamane S, Ueno S, Kawano K. 1999. Global and fine information coded by single neurons in the temporal visual cortex. Nature. 400:869–873.

Tsao D, Moeller S, Freiwald W. 2008. Comparing face patch systems in macaques and humans. Proc Natl Acad Sci USA. 105:19514.

Tsao DY, Freiwald WA, Tootell RBH, Livingstone MS. 2006. A cortical region consisting entirely of face-selective cells. Science. 311:670–674.

Tsao DY, Livingstone MS. 2008. Mechanisms of face perception. Annu Rev Neurosci. 31:411–437.

Yamane S, Kaji S, Kawano K. 1988. What facial features activate face neurons in the inferotemporal cortex of the monkey? Exp Brain Res. 73:209–214.

Young MP, Yamane S. 1992. Sparse population coding of faces in the inferotemporal cortex. Science. 256:1327–1331.

Yovel G, Freiwald WA. 2013. Face recognition systems in monkey and human: are they the same thing? F1000Prime Rep. 5:10.