Abstract

Self-face recognition in the mirror is considered to involve multiple processes that integrate 2 perceptual cues: temporal contingency of the visual feedback on one's action (contingency cue) and matching with self-face representation in long-term memory (figurative cue). The aim of this study was to examine the neural bases of these processes by manipulating 2 perceptual cues using a “virtual mirror” system. This system allowed online dynamic presentations of real-time and delayed self- or other facial actions. Perception-level processes were identified as responses to only a single perceptual cue. The effect of the contingency cue was identified in the cuneus. The regions sensitive to the figurative cue were subdivided by the response to a static self-face, which was identified in the right temporal, parietal, and frontal regions, but not in the bilateral occipitoparietal regions. Semantic- or integration-level processes, including amodal self-representation and belief validation, which allow modality-independent self-recognition and the resolution of potential conflicts between perceptual cues, respectively, were identified in distinct regions in the right frontal and insular cortices. The results are supportive of the multicomponent notion of self-recognition and suggest a critical role for contingency detection in the co-emergence of self-recognition and empathy in infants.

Introduction

Human adults recognize their own faces in a mirror or on any reflective surface. However, this ability has been demonstrated in only a limited number of animal species (Gallup 1982; Reiss and Marino 2001; Plotnik et al. 2006). Because these animals typically have large brains and show evidence of empathic behavior, mirrored self-recognition has been understood as a hallmark of a unique social-cognitive function of a "self," which relies on a unique system in highly evolved brains (Gallup 1982; Marino 2002; Hart et al. 2008; Suddendorf and Collier-Baker 2009). This notion is consistent with the developmental co-emergence of empathic behavior and mirrored self-recognition in human infants (Bischof-Köhler 1988; Zahn-Waxler et al. 1992).

In contrast to this unique self-system notion, several lines of evidence suggest that multiple (at least 4) "ordinary" cognitive processes underpin mirrored self-recognition at both perceptual and semantic levels. At the perceptual level, there are 2 independent processes for detecting different perceptual cues from a mirrored self-image: the contingency and figurative cues. The contingency cue is the temporal contingency between one's intentional facial action and the visually observed movement feedback of one's own face. The figurative cue is the match between the perceived face and the self-face representation stored in visual long-term memory. The contingency cue may trigger 2 additional distinct processes: the sense of agency over the perceived action (Wegner and Wheatley 1999; Frith et al. 2000) and the recognition of the mirror, which involves understanding that the space represented as being behind the mirror surface is in fact a symmetrically transformed duplicate of the real space in front of it (Loveland 1986). The mechanism for figurative-cue processing appears to develop on the basis of agency-related experience derived from the contingency cue; infants begin to recognize themselves in noncontingent images (e.g., photos and recorded videos) only after they acquire the corresponding abilities for contingent images (e.g., mirrors and live videos) (Bigelow 1981; Courage et al. 2004).

At the semantic level, the 2 perceptual cues, processed independently at the perceptual level, are integrated to allow meta-level self-cognition. At this integration stage, 2 complementary processes are at work. One is access to an amodal self-representation, which is activated by input from perceptual-level processes of any modality or type of self-relevant cue (i.e., irrespective of whether it is the contingency or the figurative cue). This process seems consistent with the notion of a unique self-system. The second is the belief-validation process, which evaluates the consistency between different self-relevant inputs from the perceptual level (i.e., between the contingency and figurative cues). This hypothesis is based on observations of mirrored-self misidentification in patients with dementia, in which a false belief (i.e., that the person in the mirror is not the self) generated by an impaired perceptual-level process is not corrected by comparison with other perceptual or contextual information (Coltheart 2010; Connors and Coltheart 2011).

No study has examined the neural basis of mirrored self-recognition by addressing this multicomponent structure, whereas related studies have suggested the involvement of different cortical regions, mostly in the right hemisphere (Sugiura 2013). The neuropsychological literature on mirrored-self misidentification suggests the involvement of the right hemisphere or frontal regions, but further anatomical specification has been hampered by the severe dementia and global cortical impairment of patients (Breen et al. 2001; Villarejo et al. 2011). Many neuroimaging studies have explored the self-face-specific response of the brain using noncontingent images (i.e., pictures or recorded videos) and have thus addressed self-face recognition based solely on a figurative cue. Indeed, recent studies have typically reported self-face-specific activation in several regions in the lateral frontal, parietal, occipital, temporal, and insular cortices of the right hemisphere (Uddin et al. 2005; Sugiura et al. 2006, 2008, 2012; Devue et al. 2007; Kaplan et al. 2008; Oikawa et al. 2012). Neuroimaging studies assessing the sense of action agency identified the neural response to a contingency cue by contrasting the real-time and violated visual feedback of the subject's hand action; activation has been reported in the insula and putamen (Farrer and Frith 2002; Farrer et al. 2003; Leube et al. 2003). However, in these studies, the subjects were not provided with the figurative cue or the unique visuospatial experience of being faced with a mirror. Finally, 2 potential semantic-level self-recognition processes have also been implicated in the right lateral prefrontal cortex. Amodal self-representation has been suggested to involve a region in the right inferior frontal gyrus or sulcus; this region responded not only to the self-face but also to the self-voice (Nakamura et al. 2001) or to one's own name being called (Kaplan et al. 2008). The belief-validation process in general has been speculated to involve the right middle frontal gyrus, which is responsive to the violation of predictions based on a belief (Fletcher et al. 2001; Turner et al. 2004).

In this functional magnetic resonance imaging (fMRI) study, we directly examined the neural bases of self-face recognition in a mirror. We installed a virtual mirror system in an MRI scanner; the system featured an open-face head coil, a video camera, and a projection system capable of presenting real-time, delayed, and recorded videos. This system allowed us to manipulate the contingency and figurative cues independently and thereby to dissociate the perceptual-level processes for the 2 cues as well as the 2 semantic-level processes. During a simple facial action task, each subject was presented with a video of his own face (Self) or a similar prerecorded video of an unfamiliar face (Other). Each type of face stimulus was projected in real time (Real-time), with a 500-ms delay (Delayed), or as a still image (Static). Of these 6 conditions, 4 formed a 2-by-2 factorial design composed of the factors Contingency (Real-time, Delayed) and Face (Self, Other). The perceptual-level process of the contingency cue was identified as the main effect of Contingency, and that of the figurative cue as the main effect of Face. The semantic-level processes were expected to show an interaction between the 2 factors, with the 2 potential processes demonstrating different activation profiles. Amodal self-representation should exhibit no additive effect of the 2 cues while being sensitive to either cue alone; that is, an equivalent level of activation should be observed among the Real-time Self, Delayed Self, and Real-time Other conditions (i.e., real-time contingency is a self-relevant input) relative to the Delayed Other condition. The belief-validation process should be activated when the contingency violates the prediction based on the face, given human expertise in mirrored self-recognition (i.e., the expectation of real-time feedback from the self-face).

Materials and Methods

The Ethics Committee of Tohoku University School of Medicine approved this experimental protocol.

Subjects

Twenty-seven healthy right-handed male undergraduate or graduate students (aged 19–25 years) participated, and written informed consent was obtained from all subjects. No subject had a history of neurological or psychiatric illness.

Virtual Mirror System

The system is schematically illustrated in Figure 1a. Subjects lay on the bed of an MRI scanner with their heads fixed in the MRI head coil using elastic blocks. The head component of a SENSE head spine coil (Philips, Best, the Netherlands) was used as the open-face head coil. A high-speed (250 fps) video camera (SVS340CUCP; SVS-VISTEK, Seefeld, Germany) viewed the subject's face via a half-mirror attached to the head coil. The image was projected onto a semilucent screen behind the head coil using an Endeavor Pro4700 personal computer (Epson, Suwa, Japan) and a DLA-HD10K projector (JVC, Wayne, NJ, USA). Custom software (Physiotech, Tokyo, Japan) allowed presentation of a quasi-real-time (<60-ms delay), delayed live, or prerecorded video. The subjects viewed the screen via a mirror.
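The essential logic of such a presentation system is a frame buffer that replays the camera stream with a configurable delay. Below is a minimal sketch of that buffering logic in Python with OpenCV; it is purely illustrative and assumes a generic webcam and display window rather than the custom MRI-compatible hardware and software used in the study.

```python
import collections
import time

import cv2  # OpenCV, assumed available for camera capture and display


def run_virtual_mirror(delay_s: float = 0.5) -> None:
    """Replay the camera stream with a fixed delay (0.5 s here; 0 approximates real time)."""
    cap = cv2.VideoCapture(0)            # first attached camera (illustrative)
    buffer = collections.deque()         # holds (timestamp, frame) pairs
    try:
        while True:
            ok, frame = cap.read()
            if not ok:
                break
            now = time.monotonic()
            buffer.append((now, frame))
            # Keep buffer[0] as the newest frame that is at least `delay_s` old.
            while len(buffer) > 1 and now - buffer[1][0] >= delay_s:
                buffer.popleft()
            if now - buffer[0][0] >= delay_s:
                cv2.imshow("virtual mirror", buffer[0][1])
            if cv2.waitKey(1) & 0xFF == ord("q"):   # press 'q' to quit
                break
    finally:
        cap.release()
        cv2.destroyAllWindows()


if __name__ == "__main__":
    run_virtual_mirror(delay_s=0.5)      # Delayed condition; use 0.0 for Real-time
```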

Figure 1.

Experimental equipment and task. (a) Virtual mirror system installed in the magnetic resonance imaging (MRI) scanner. A video of the subject's face was shot via a half-mirror and stored on a personal computer (PC). The real-time, delayed, or prerecorded video from the PC was projected onto a semilucent screen behind the head coil, and the subject viewed it via a mirror. (b) Examples of the visual stimuli and the time courses of the subject's facial action and the visual stimuli under the Real-time, Delayed (500 ms), and Static conditions. Each subject quickly opened his mouth as soon as the prompt (a small circle overlaid on the image) appeared, and then immediately closed it.

Stimuli and Task

Figure 1b illustrates the task conditions. In all trials, a face image was presented for 2 s; after a 500-ms latency, a small white circle appeared at the position of the glabella and remained for 500 ms as the prompt for the mouth action. Each subject was instructed to open his mouth quickly as soon as the prompt appeared and then to close it immediately. In the scanner, the subjects first practiced the task while viewing an example video and then while viewing the various types of facial images to be presented during the experiment. After a subject had learned to perform the task quickly and consistently, 3 video clips of him were recorded: one of regular task performance, one of delayed performance (cue latency, 1 s), and one without task performance. The first 2 clips (facial actions) were not used in the experiment of the subject depicted in them; instead, they served as the stimuli for the Real-time and Delayed Other conditions for other subjects. The clip without task performance was used for the Static Self condition in that subject's own experiment and for the Static Other condition in other subjects' experiments.

Each subject participated in three 460-s sessions, each including 10 trials of each of the 6 conditions, with pseudo-random intertrial intervals of 1–16 s. Under the Real-time Self and Delayed Self conditions, the live image was projected with the minimal (<60-ms) delay and with a 500-ms delay, respectively. The recorded video clips of an unfamiliar subject's regular and delayed performances were presented under the Real-time Other and Delayed Other conditions, respectively. Videos of the subject himself and of another subject not performing the task were presented under the Static Self and Static Other conditions, respectively.
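As a worked example of how such a session could be scheduled, the sketch below generates a pseudo-random ordering of the 6 conditions with jittered intertrial intervals. The uniform 1–16 s jitter is a simplifying assumption; the actual distribution used in the study is not specified beyond this range, and it was presumably constrained so that the 60 trials fit within the 460-s session.

```python
import random

CONDITIONS = [
    "Real-time Self", "Delayed Self", "Static Self",
    "Real-time Other", "Delayed Other", "Static Other",
]
TRIALS_PER_CONDITION = 10
STIMULUS_DURATION_S = 2.0      # face image shown for 2 s on each trial
ITI_RANGE_S = (1.0, 16.0)      # pseudo-random intertrial interval


def build_session(seed: int = 0) -> list[tuple[float, str]]:
    """Return (onset time in s, condition label) pairs for one session."""
    rng = random.Random(seed)
    trials = CONDITIONS * TRIALS_PER_CONDITION     # 60 trials in total
    rng.shuffle(trials)
    schedule, t = [], 0.0
    for condition in trials:
        schedule.append((t, condition))
        t += STIMULUS_DURATION_S + rng.uniform(*ITI_RANGE_S)
    return schedule


if __name__ == "__main__":
    session = build_session()
    print(f"{len(session)} trials, last onset at {session[-1][0]:.1f} s")
```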

The latency between the onset of the action and the visual prompt was measured for each subject from his prerecorded video (i.e., the clip that served as a stimulus for other subjects). Actions could not be recorded during the fMRI measurement because the system assigned all of its computational resources to visual presentation to minimize the feedback delay under the Real-time condition.

fMRI Data Acquisition and Preprocessing

Forty-four transaxial gradient-echo images (echo time = 30 ms, flip angle = 85°, slice thickness = 2.5 mm, slice gap = 0.5 mm, FOV = 192 mm, matrix = 64 × 64, voxel size = 3 × 3 × 3 mm) covering the whole cerebrum were acquired during all sessions at a repetition time of 2.5 s, using an echo planar sequence and a Philips Achieva (3T) MR scanner. The following preprocessing procedures were performed using statistical parametric mapping (SPM8) software (Wellcome Department of Imaging Neuroscience, London, UK) and MATLAB: Adjustment of acquisition timing across slices, correction for head motion, spatial normalization using the EPI-MNI template, and smoothing using a Gaussian kernel with a full-width at half-maximum size of 8 mm.
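As an illustration of the final smoothing step only (a numpy/scipy sketch, not the SPM8 implementation), the FWHM of the Gaussian kernel is converted to a standard deviation in voxel units before filtering:

```python
import numpy as np
from scipy.ndimage import gaussian_filter


def smooth_volume(volume: np.ndarray, fwhm_mm: float = 8.0,
                  voxel_size_mm: float = 3.0) -> np.ndarray:
    """Smooth a 3D volume with an isotropic Gaussian kernel.

    The FWHM relates to the Gaussian standard deviation as
    FWHM = 2 * sqrt(2 * ln 2) * sigma (about 2.355 * sigma).
    """
    sigma_mm = fwhm_mm / (2.0 * np.sqrt(2.0 * np.log(2.0)))
    sigma_voxels = sigma_mm / voxel_size_mm
    return gaussian_filter(volume, sigma=sigma_voxels)


# Example: an 8-mm FWHM kernel applied to a volume with 3-mm isotropic voxels
smoothed = smooth_volume(np.random.rand(64, 64, 44))
```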

fMRI Data Analysis

Seven subjects were excluded from the analysis: one could not discriminate his own face from the other's face during the experiment; 4 performed the action too quickly or too slowly relative to the prerecorded video of their paired other (more than a 100-ms difference in action onset); and 2 moved their heads >4 mm within a session. Data from the remaining 20 subjects were analyzed.

A conventional two-level approach for event-related fMRI data was adopted using SPM8. A voxel-by-voxel multiple regression analysis was conducted for the first-level within-subject (fixed-effects) model. The expected signal change for each of the 6 conditions was modeled using the hemodynamic response function provided by SPM8. A high-pass filter with a cutoff period of 128 s was used to eliminate artifactual low-frequency trends.
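The first-level model amounts to regressing each voxel's time series onto condition regressors built by convolving the event onsets with a hemodynamic response function. A minimal numpy/scipy sketch follows; the double-gamma HRF parameters and the event onsets are illustrative assumptions, not the SPM8 defaults or the actual stimulus timings.

```python
import numpy as np
from scipy.stats import gamma

TR = 2.5            # repetition time (s)
N_SCANS = 184       # 460-s session / 2.5-s TR


def canonical_hrf(tr: float, duration_s: float = 32.0) -> np.ndarray:
    """Simple double-gamma hemodynamic response function (illustrative parameters)."""
    t = np.arange(0.0, duration_s, tr)
    peak = gamma.pdf(t, 6)              # positive response peaking around 6 s
    undershoot = gamma.pdf(t, 16)       # late undershoot
    hrf = peak - 0.1667 * undershoot
    return hrf / hrf.sum()


def event_regressor(onsets_s: list[float], tr: float, n_scans: int) -> np.ndarray:
    """Convolve a train of brief events with the canonical HRF."""
    stick = np.zeros(n_scans)
    for onset in onsets_s:
        stick[int(round(onset / tr))] = 1.0
    return np.convolve(stick, canonical_hrf(tr))[:n_scans]


# Design matrix: one column per condition (2 shown here) plus a constant term
condition_onsets = {
    "Real-time Self": [5.0, 40.0, 95.0],     # illustrative onsets (s)
    "Delayed Self": [20.0, 70.0, 120.0],
}
X = np.column_stack(
    [event_regressor(onsets, TR, N_SCANS) for onsets in condition_onsets.values()]
    + [np.ones(N_SCANS)]
)
y = np.random.rand(N_SCANS)                       # time series of one voxel (placeholder)
betas, *_ = np.linalg.lstsq(X, y, rcond=None)     # parameter estimate per regressor
```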

Voxel-by-voxel statistical inference on the contrasts of the parameter estimates was performed in the second-level between-subject (random-effects) model using one-sample t-tests. First, the main effects and their interaction were examined in a 2-by-2 factorial design composed of the factors Contingency (Real-time, Delayed) and Face (Self, Other) (Fig. 2a). The threshold for significant activation was initially set at P < 0.001 (uncorrected) at the voxel level; clusters were then considered significant at P < 0.05, corrected for multiple comparisons on the basis of cluster size with the whole brain as the search volume. A region-of-interest analysis was then performed using the activation peaks identified in the contrasts for the main effect of Face and for the negative interaction; the areas identified in these 2 contrasts overlapped considerably (see Results). This analysis was intended for detailed functional segregation using 3 additional contrasts at a liberal threshold (P < 0.05, without correction for multiple comparisons).
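At the second level, the random-effects inference reduces, per voxel, to a one-sample t-test on the subjects' contrast values; a voxelwise sketch is shown below (the cluster-size correction described above is omitted, and the data are placeholders):

```python
import numpy as np
from scipy import stats

# Per-subject contrast images, shape (n_subjects, x, y, z); placeholder data here
contrast_maps = np.random.randn(20, 64, 64, 44)

t_map, p_map = stats.ttest_1samp(contrast_maps, popmean=0.0, axis=0)
# ttest_1samp returns two-sided P-values; a one-sided test for activation
# halves them where t > 0.
suprathreshold = (p_map / 2 < 0.001) & (t_map > 0)
print(f"{suprathreshold.sum()} voxels exceed the uncorrected voxel-level threshold")
```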

Figure 2.

Design of the analyses. (a) Four conditions were assessed using a 2-by-2 factorial design composed of 2 factors: Contingency (Real-time, Delayed) and Face (Self, Other). Expected activation profiles for the main effects of Contingency (b) and Face (c) assumed responses to the contingency and figurative cues, respectively (i.e., perception-level processes). Expected activation profiles for negative interaction assumed 2 semantic-level processes: the amodal self-representation (d) and the belief-validation process (e). The Static condition was used post hoc to examine whether the identified face effect was dependent on facial motion.

To identify regions involved in the perceptual-level processes, the main effect of each factor was tested. To identify regions responsive to the contingency cue, the main effect of Contingency (Fig. 2b) was tested using the contrast (Real-time Self + Real-time Other) – (Delayed Self + Delayed Other). To identify regions responsive to the figurative cue, the main effect of Face (Fig. 2c) was tested using the contrast (Real-time Self + Delayed Self) – (Real-time Other + Delayed Other). For the latter regions, the region-of-interest analysis was performed on the Self–Other contrast under the Static condition to examine whether the effect of the figurative cue was dependent on facial motion.
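Expressed as contrast weights over the 4 condition parameter estimates, these 2 main effects are simple weight vectors; a small worked example (with made-up parameter estimates) follows:

```python
import numpy as np

# Parameter estimates ordered: [Real-time Self, Real-time Other, Delayed Self, Delayed Other]
betas = np.array([1.2, 0.9, 0.8, 0.4])        # illustrative values only

c_contingency = np.array([+1, +1, -1, -1])    # (Real-time Self + Real-time Other) - (Delayed Self + Delayed Other)
c_face = np.array([+1, -1, +1, -1])           # (Real-time Self + Delayed Self) - (Real-time Other + Delayed Other)

print("Main effect of Contingency:", c_contingency @ betas)
print("Main effect of Face:", c_face @ betas)
```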

To identify regions involved in the 2 semantic-level processes together, the negative interaction of the 2 factors (Fig. 2d,e) was tested using the contrast (Real-time Other + Delayed Self) – (Real-time Self + Delayed Other). To identify a moderate degree of negative interaction among the regions that exhibited the main effect of Face, the contrast Real-time Other – Delayed Other was used in the region-of-interest analysis as an index of sensitivity to the contingency cue in the absence of the figurative cue. To dissociate the 2 semantic-level processes, the contrast (Real-time Other + Delayed Self) – 2 × Real-time Self was used as an index of the violation of the predicted contingency; delayed feedback from the self-face and real-time feedback from the other's face should both violate the subject's prediction. This contrast dissociated the belief-validation process (Fig. 2e) from amodal self-representation (Fig. 2d).
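The interaction and the prediction-violation index can be written as weight vectors in the same way (same condition ordering and illustrative values as in the previous sketch):

```python
import numpy as np

# Parameter estimates ordered: [Real-time Self, Real-time Other, Delayed Self, Delayed Other]
betas = np.array([1.2, 0.9, 0.8, 0.4])                # illustrative values only

c_negative_interaction = np.array([-1, +1, +1, -1])   # (Real-time Other + Delayed Self) - (Real-time Self + Delayed Other)
c_violation = np.array([-2, +1, +1, 0])               # (Real-time Other + Delayed Self) - 2 x Real-time Self

print("Negative interaction:", c_negative_interaction @ betas)
print("Violation of predicted contingency:", c_violation @ betas)
```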

Activation specific to mirrored self-recognition (i.e., Real-time Self), if any, was explored by testing a positive interaction, namely the contrast (Real-time Self + Delayed Other) – (Real-time Other + Delayed Self).

Results

Voxel-by-Voxel Analyses

Significantly higher activation in the cuneus was identified under the Real-time condition than during the Delayed condition (i.e., the main effect of Contingency) (Fig. 3a, Table 1). The activation profile showed that the effect was derived from deactivation under the Delayed condition; this was also observed under the Static condition.

Table 1

Effects of contingency and figurative cues

| Structure | Coordinates (x, y, z) | Cluster size | Cluster P-value | t (Contingency) | t (Face) | t (Interaction) |
| --- | --- | --- | --- | --- | --- | --- |
| "Main effect of Contingency" (Real-time > Delayed) |  |  |  |  |  |  |
| Cuneus | −82, 22 | 223 | 0.02 | 5.13 | −1.33 | 0.78 |
| "Main effect of Face" (Self > Other) |  |  |  |  |  |  |
| OPTJx | 32, −76, 24 | 3169 |  | −0.31 | 7.35 | 0.16 |
| OPJx | 22, −72, 44 |  |  | 0.97 | 7.53 | −0.45 |
| IPS | 28, −62, 44 |  |  | 1.32 | 7.41 | −0.06 |
| IPS (left) | −24, −62, 48 | 882 |  | 2.77** | 6.59 | −1.23 |
| pSMG | 44, −32, 32 |  |  | 1.12 | 4.98 | −0.46 |
| aSMG | 62, −18, 28 |  |  | 1.45 | 4.78 | −0.63 |
| pSTG | 64, −36, 18 |  |  | −0.58 | 4.97 | 0.21 |
| pITG | 60, −56, −10 | 188 | 0.026 | −0.78 | 5.46 | 1.34 |
| pSFG | 28, −6, 52 | 2585 |  | −0.04 | 7.15 | −2.15 |
| pMFG | 44, 52 |  |  | 0.56 | 4.76 | 2.67** |
| pIFG | 58, 10, 16 |  |  | 1.22 | 4.99 | 2.12** |
| tIFG | 42, 38, 12 |  |  | 2.94** | 6.68 | 3.83** |
| oIFG | 44, 32, −4 |  |  | 0.84 | 5.28 | 1.23 |
| aINS | 30, 24 |  |  | 1.84** | 6.84 | 1.95** |
| mINS | 38 |  |  | 0.24 | 5.60 | −0.70 |
| "Negative interaction" (Real-time Other + Delayed Self > Real-time Self + Delayed Other) |  |  |  |  |  |  |
| aMFG | 42, 52 | 318 | 0.006 | 0.20 | 0.94 | 6.54 |
| aIFS | 42, 42, 10 |  |  | 2.83** | 4.47** | 5.25 |
| pIFS | 30, 12, 28 | 733 |  | 2.30** | 1.51 | 5.57 |
| FOP | 44, 10, 16 |  |  | 1.67 | 2.91** | 4.79 |

Stereotactic coordinates (x, y, z) of the activation peak, cluster size (number of 2 × 2 × 2 mm³ voxels), P-value (corrected for multiple comparisons), and t-values at the peak for the 3 major contrasts (i.e., the 2 main effects and the interaction) are given. Cluster size and P-value are listed once per cluster; the composition of each cluster is described in the text.

Abbreviations for structures: a, anterior; m, middle; p, posterior; OPT, occipito–parietal–temporal; OP, occipitoparietal; Jx, junction; IPS, intraparietal sulcus; SMG, supramarginal gyrus; STG, superior temporal gyrus; ITG, inferior temporal gyrus; SFG, superior frontal gyrus; MFG, middle frontal gyrus; IFG, inferior frontal gyrus; t, triangular part; o, orbital part; INS, insula; IFS, inferior frontal sulcus; FOP, frontal operculum.

*P < 0.001, **P < 0.05 (uncorrected).

Figure 3.

Main effects and interaction. (a) An area with a significant main effect of Contingency is superimposed on the midsagittal section of the anatomical image. The activation profile (activation estimate for each condition) at the activation peak is shown on the right. The main effects of Face (b) and the interaction (c) are rendered on the lateral surface of the right hemisphere. See Figure 4 for the activation profiles of the activation peaks.

Significantly higher activation under the Self than under the Other conditions (i.e., the main effect of Face) is shown in Figure 3b and Table 1. Activation was observed primarily in the right lateral cortices. A large activation cluster in the occipitoparietal region included peaks at the occipito-parieto-temporal junction (OPTJx), the occipitoparietal junction (OPJx), the intraparietal sulcus (IPS), the posterior and anterior parts of the supramarginal gyrus (pSMG and aSMG), and the posterior part of the superior temporal gyrus (pSTG). Activation was also observed in the posterior part of the right inferior temporal gyrus (pITG) and in the left IPS. In the right frontal cortices, a large frontal–insular cluster had peaks at the most posterior parts of the superior, middle, and inferior frontal gyri along the precentral sulcus (pSFG, pMFG, and pIFG, respectively), the triangular and orbital parts of the inferior frontal gyrus (tIFG and oIFG, respectively), and the anterior and middle parts of the insula (aINS and mINS, respectively). Several of these activation peaks also showed a significant main effect of Contingency or a significant interaction when a liberal (P < 0.05, uncorrected) threshold was used (Table 1).

A significant negative interaction of Contingency and Face was identified in 2 clusters in the right frontal region (Fig. 3c, Table 1). One cluster included 2 peaks at the anterior parts of the middle frontal gyrus (aMFG) and inferior frontal sulcus (aIFS), and the other involved 2 peaks at the posterior part of the IFS (pIFS) and the medial surface of the frontal operculum (FOP) facing the middle insula.

Region-of-Interest Analysis

The results are summarized in Figure 4. The activation peaks identified in the contrast for the main effect of Face (circles in Fig. 4a) were divided into 2 groups depending on the sensitivity to the contingency cue (vertical axis in the upper panel of Fig. 4a). The peaks that were not sensitive to the contingency cue (i.e., dedicated to figurative-cue processing; Fig. 2c) were segregated into 2 groups based on their sensitivity to the static self-face (horizontal axis in the upper panel of Fig. 4a): aSMG, pITG, and oIFG were responsive to a static neutral self-face (green circles), but other occipitoparietal regions and pSFG were not (beige circles). The peaks that were sensitive to the contingency cue (i.e., dedicated to amodal self-representation; Fig. 2d) were all responsive to the static neutral self-face (pMFG, pIFG, tIFG, and aINS; light-blue circles). As expected, these peaks were not sensitive to the violation of a predicted contingency (vertical axis in the lower panel of Fig. 4a).
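The grouping logic of this region-of-interest analysis can be summarized as a simple rule over the 3 index contrasts. The sketch below is a schematic re-expression of Figure 4, assuming the t-threshold of 1.72 given in the figure legend; the input t-values in the example are placeholders, not values from Table 1.

```python
T_CRIT = 1.72   # P = 0.05, uncorrected (one-tailed, 19 degrees of freedom)


def classify_peak(t_static_self: float, t_contingency_no_face: float,
                  t_violation: float) -> str:
    """Assign an activation peak to one of the groups used in the ROI analysis."""
    static = t_static_self > T_CRIT              # Static Self - Static Other
    contingent = t_contingency_no_face > T_CRIT  # Real-time Other - Delayed Other
    violated = t_violation > T_CRIT              # (RO + DS) - 2 x RS
    if contingent and static and not violated:
        return "amodal self-representation (light blue)"
    if violated and not static:
        return "belief validation (purple)"
    if static:
        return "figurative cue, responsive to static self-face (green)"
    return "figurative cue, dependent on facial motion (beige)"


# Example: a peak responsive only to the static self-face
print(classify_peak(t_static_self=2.9, t_contingency_no_face=0.5, t_violation=0.3))
```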

Figure 4.

Functional segregation of the activated areas in the contrasts for the main effect of Face or the negative interaction. 3D plots of the t-values for the peaks (a) and segregation of areas (b) based on the significance of the 3 hypothesis-driven contrasts. The sensitivities to the static self-face (Static Self – Static Other), to the contingency cue in the absence of the figurative cue (Real-time Other – Delayed Other), and to the violation of the predicted contingency ([Real-time Other + Delayed Self] – 2 × Real-time Self) are shown on the horizontal, upper vertical, and lower vertical axes, respectively, in (a) and are color-coded in green, blue, and red, respectively. Regions sensitive to both the static self-face and the contingency cue are shown in light blue, those sensitive to both the contingency cue and the violation of prediction in purple, and those sensitive to none of these in beige. In plot (a), the red line shows the statistical threshold (t = 1.72; P = 0.05, uncorrected). Circles and triangles denote peaks originally identified in the contrasts for the main effect of Face and the negative interaction, respectively. (b) Surface renderings from the top (top panel) and right (middle panel), coronal sections through the posterior frontal region (y = 10, right hemisphere only; bottom right panel), and a sagittal section through the right insula (x = 38; bottom left panel) visualize the segregation of areas. Regions in blue or red include no predefined peaks. The activation profile of a representative peak from each group is shown in (c).

In contrast, the peaks identified in the contrast for the negative interaction (triangles in Fig. 4a) were divided into the aIFS and the 3 remaining peaks depending on their sensitivity to the static self-face and to the violation of the predicted contingency (horizontal and vertical axes, respectively, in the lower panel of Fig. 4a). The aIFS was sensitive to the static self-face but not to the violation of the predicted contingency (light-blue triangle); it therefore belonged to the group of Face-main-effect peaks that exhibited sensitivity to the contingency cue (i.e., dedicated to amodal self-representation; light-blue circles; Fig. 2d). In contrast, the remaining 3 peaks (aMFG, pIFS, and FOP) were sensitive to the violation of the predicted contingency (i.e., dedicated to the belief-validation process; Fig. 2e) but not to the static neutral self-face (purple triangles).

Discussion

This fMRI study examined, for the first time, the neural mechanisms underlying mirrored self-face recognition. Our data demonstrated that multiple processes at the perceptual and semantic levels play roles in the processing of the 2 perceptual cues (i.e., contingency and figurative) and their integration. Different cortical regions were sensitive to the contingency and/or figurative cues. In particular, several right frontal–insular regions showed activation reflecting different types of negative interaction between the 2 cues. These findings support the multicomponent view of the mirrored-self-recognition process.

Main Effect of Contingency

The significant main effect of Contingency observed in the cuneus may reflect the unique visuospatial experience of mirror confrontation induced by detecting the real-time contingency in the visual feedback of one's own facial action. This region likely corresponds to V3 or V3A (Tootell et al. 1997) and has been reported to be involved in 3D depth perception (Paradis et al. 2000) and in attention to the peripersonal space in front of the body (Weiss et al. 2000; Quinlan and Culham 2007). Upon recognizing the "mirror," our subjects may have perceived the space represented in the image as real and attended to its depth within their peripersonal space. The activation profile in this region showed deactivation relative to baseline under the Delayed and Static conditions; the effect may therefore be more accurately described as suppression of 3D depth perception, or of attention to peripersonal space, during the perception of a virtual image (i.e., a recorded video or static image).

Main Effect of Face

Many occipital, parietal, temporal, frontal, and insular regions were identified, particularly in the right hemisphere. These regions were largely consistent with those previously reported as showing self-face-specific activation (Uddin et al. 2005; Platek et al. 2006; Sugiura et al. 2006, 2008, 2012; Kaplan et al. 2008; Oikawa et al. 2012). Our results segregated these regions into 3 groups. A group composed of frontal activation peaks (i.e., pMFG, pIFG, tIFG, and aINS) also exhibited sensitivity to the contingency cue, suggesting that these regions have a role in amodal self-representation at the semantic level. Functional dissociation of these frontal regions from posterior regions was previously suggested by a different pattern of intersubject variance in self-specific activation (Sugiura et al. 2006). The remaining regions, which were specifically sensitive to the figurative cue, were considered to be involved in perceptual-level processes. These were further subdivided into regions that were sensitive to the static neutral self-face (i.e., aSMG, pITG, and oIFG), and those that were not (other occipitoparietal regions and pSFG).

Although the precise processes operating in each group of perceptual-level regions remain a matter of speculation, the difference in the roles of the 2 groups seems to be traceable to different stages of infants' acquisition of this ability in front of a mirror. The regions that did not respond to the static self-face (i.e., the occipitoparietal regions and the pSFG) may be relevant to experiences during the initial stages of the acquisition of this ability. Infants show exploratory behavior (e.g., smiling, moving, and touching the mirror) when they begin to learn about the contingency between their own actions and the visual feedback from a mirror (Loveland 1986). This visuomotor learning process involves the parietal and premotor regions (Ghilardi et al. 2000; Inoue et al. 2000), which largely overlap with the regions in this group. In contrast, the group that showed sensitivity to the static self-face (i.e., pITG, aSMG, and oIFG) may be related to the bodily representation of the self-face established after this learning process. Specifically, regions in this group have been implicated in the sense of action agency or body ownership (Leube et al. 2003; David et al. 2007; Schnell et al. 2007; Ionta et al. 2011). Both groups of regions overlap with areas that receive vestibular input (Smith et al. 2012; zu Eulenburg et al. 2012), which is highly relevant to the bodily representation of the self-face as well as to its acquisition.

Amodal Self-representation

The expected activation profile was observed in 4 right frontal peaks (pMFG, pIFG, tIFG, and aINS) identified in the contrast for the main effect of Face, and in the right aIFS, which was identified in the contrast for the negative interaction. Several findings support the amodal nature of the self-representation in these regions. Responses to the self-voice (Nakamura et al. 2001) or to one's own name being called (Kaplan et al. 2008) have previously been identified in regions close to the right tIFG and aIFS peaks. Correlations between social-affective feelings and activation during self-face viewing have also been reported: a cross-trial correlation with the feeling of embarrassment in a region close to the aIFS, and a cross-individual correlation with public self-consciousness scores in regions close to the aINS and pIFG (Morita et al. 2008). On the other hand, the range or domain of the "self" represented in these regions may be limited, considering the report that these regions do not respond to a visually presented self-name even though the right tIFG responds to the self-face (Sugiura et al. 2008).

Belief-Validation Process

A response to the violation of a predicted contingency was identified in the right aMFG, pIFS, and FOP. The lack of response to the static neutral self-face may be due to our expertise in viewing a static self-face. Of these regions, the pIFS and aMFG are close to a region that has been reported to be responsive to the violation of a learned association between a drug and a side effect (Fletcher et al. 2001; Turner et al. 2004). The aMFG may select processes (i.e., perceptual cues) congruent with the validated belief; this region has been implicated in the enhancement and suppression of task-relevant and task-irrelevant processes, respectively (Sakai and Passingham 2003; Sugiura et al. 2007).

Neural Mechanisms for Mirrored Self-face Recognition and Implications

The groups of regions that exhibited different activation profiles are thus likely to accommodate the different cognitive processes that have been assumed, on the basis of developmental psychology and clinical observations, to underlie mirrored self-recognition. This finding may contribute to the discussion regarding the multiple developmental levels of the ability for self-recognition, and particularly its behavioral indices (Anderson 1984; Brooks-Gunn and Lewis 1984). Historically, the standard index has been the mark test (or rouge test), in which animals or infants are marked on an unseen part of the body and observed in front of a mirror to determine whether they direct their behavior to the mark rather than to the mirror (Gallup 1970; Anderson 1984; Brooks-Gunn and Lewis 1984). Some animal species that previously failed this test were able to pass a modified version of it; for example, pigeons passed the test after training (Epstein et al. 1981). Infants typically first pass this test in the second year of life, but they pass tests for other self-recognition-relevant indices earlier; they discriminate between contingent and noncontingent images of the self at <2 months of age (Reddy et al. 2007). It would be interesting to compare these different indices with the components of mirrored self-recognition and the related cortical regions.

It is also interesting to focus on the role of contingency detection in the development of recognition not only of the self but also of others in an interactive relationship. We attributed the neural response to the figurative cue in the right pITG, aSMG, and oIFG to the bodily representation of the self-face acquired through the experience of contingency testing during infancy. However, it has been reported that these regions, particularly the right pITG and aSMG, are also responsive to the faces of personally familiar people in a social context (Sugiura et al. 2012). This finding may suggest that the recognition of personally familiar people in a social context also involves the representation of contingent relationships. Consistent with this, it has been proposed that an innate contingency-detection module analyzes not only the input related to one's own body but also the input related to communicative others: the former carries a perfect contingency between one's own action and the perceived motion of the body, whereas the latter carries a loose contingency (i.e., delay in time, different form and strength, or different modality) between one's own social action and the perceived response of others (Gergely and Watson 1999). In fact, evidence of contingency detection by infants (e.g., a negative reaction to a recorded video compared with a real-time video) in response to both the self (Reddy et al. 2007) and the mother (Nadel et al. 1999) appears at around 2 months of age. This commonality between the contingency detection mechanisms for self- and communicative-other recognition may explain the developmental co-emergence of mirrored self-recognition and empathic behavior in infants (Bischof-Köhler 1988; Zahn-Waxler et al. 1992).

To fully understand the roles of the frontal regions during mirrored self-face recognition, further conceptual elaboration is necessary. The right lateral prefrontal cortex has been implicated in the coordination of self-relevant information related not only to the physical self but also to the social self (Sugiura 2013). The regions assigned to the amodal self-representation and belief-validation processes exhibited a mosaic-like distribution within the right frontal and insular cortices, and their overlap with the regions reported in previous relevant studies was often partial or variable. This suggests that further functional segregation may exist within these regions. For example, multiple regions in the lateral prefrontal cortex are known to be dedicated to hierarchically organized cognitive control (Koechlin and Summerfield 2007; Badre 2008).

Methodological Considerations

The absence of a recording of the subjects' facial actions during the fMRI measurement was a methodological limitation of this study, but we believe that it ultimately had no significant effect on our results, for the following reasons. A subject may occasionally have performed the facial action with a delay. This would not have affected the Self conditions, in which the visual feedback was always derived from the subject's own ongoing action (presented in real time or with a fixed 500-ms delay). It could, however, have affected the Other conditions, in which prerecorded videos were presented: a delayed action would have reduced the apparent contingency under the Real-time Other condition and increased it under the Delayed Other condition, thereby reducing the contrast between these 2 conditions. To minimize this possibility, each subject practiced the task until he could perform it quickly and consistently. In addition, during the fMRI measurement, we monitored the subjects' performance under the Real-time Self and Delayed Self conditions, in which the real-time image of the subjects' actions was available. Despite these efforts, it remains possible that some actions were delayed and that the contingency cue therefore had a reduced effect under the Other conditions. Such an artifact would have produced a spurious positive interaction between the 2 perceptual cues; that is, regions showing a main effect of the contingency cue would have appeared to respond to it predominantly under the Self conditions. Because no positive interaction was detected, this potential artifact does not affect our conclusions.

Conclusion

We successfully identified the neural bases of the processes underlying mirrored self-face recognition. Perceptual-level processes, which were responsive to a single perceptual cue, were localized primarily in the posterior cortices. The response to the real-time contingency cue was identified in the cuneus, which may reflect the unique visuospatial experience of mirror confrontation. Responses to the figurative cue of the self-face, assumed to index the self-face representation, were identified in the bilateral occipitoparietal and right temporal, frontal, and insular cortices. These regions were segregated into 2 groups depending on the presence or absence of a response to the static self-face, and their roles could be traced to different developmental stages of self-face recognition. Two sets of regions reflecting semantic-level processes were identified: regions for amodal self-representation, which were sensitive to both perceptual cues and to the static self-face, and regions for belief validation, which responded to violations of the predicted contingency. These 2 types of regions exhibited a mosaic-like distribution over the right frontal and insular cortices, suggesting the need for further conceptual elaboration of self-relevant processes at the meta-cognitive level. These results illustrate the notion of multiple developmental levels of self-recognition ability, as well as the potential role of contingency detection in the co-emergence of mirrored self-recognition and empathic behavior in infants.

Funding

This study was supported by MEXT KAKENHI 23119702 and 25560347. Funding to pay the Open Access publication charges for this article was provided by MEXT KAKENHI 25560347.

Notes

Conflict of Interest: None declared.

References

Anderson JR. 1984. The development of self-recognition—a review. Dev Psychobiol. 17:35–49.

Badre D. 2008. Cognitive control, hierarchy, and the rostro–caudal organization of the frontal lobes. Trends Cogn Sci. 12:193–200.

Bigelow AE. 1981. The correspondence between self and image movement as a cue to self-recognition for young children. J Genet Psychol. 139:11–26.

Bischof-Köhler D. 1988. On the connection between empathy and the ability to recognize oneself in the mirror. Swiss J Psychol. 47:147–159.

Breen N, Caine D, Coltheart M. 2001. Mirrored-self misidentification: two cases of focal onset dementia. Neurocase. 7:239–254.

Brooks-Gunn J, Lewis M. 1984. The development of early visual self-recognition. Dev Rev. 4:215–239.

Coltheart M. 2010. The neuropsychology of delusions. Ann N Y Acad Sci. 1191:16–26.

Connors MH, Coltheart M. 2011. On the behaviour of senile dementia patients vis-a-vis the mirror: Ajuriaguerra, Strejilevitch and Tissot (1963). Neuropsychologia. 49:1679–1692.

Courage ML, Edison SC, Howe ML. 2004. Variability in the early development of visual self-recognition. Infant Behav Dev. 27:509–532.

David N, Cohen MX, Newen A, Bewernick BH, Shah NJ, Fink GR, Vogeley K. 2007. The extrastriate cortex distinguishes between the consequences of one's own and others' behavior. Neuroimage. 36:1004–1014.

Devue C, Collette F, Balteau E, Degueldre C, Luxen A, Maquet P, Bredart S. 2007. Here I am: the cortical correlates of visual self-recognition. Brain Res. 1143:169–182.

Epstein R, Lanza RP, Skinner BF. 1981. Self-awareness in the pigeon. Science. 212:695–696.

Farrer C, Franck N, Georgieff N, Frith CD, Decety J, Jeannerod A. 2003. Modulating the experience of agency: a positron emission tomography study. Neuroimage. 18:324–333.

Farrer C, Frith CD. 2002. Experiencing oneself vs. another person as being the cause of an action: the neural correlates of the experience of agency. Neuroimage. 15:596–603.

Fletcher PC, Anderson JM, Shanks DR, Honey R, Carpenter TA, Donovan T, Papadakis N, Bullmore ET. 2001. Responses of human frontal cortex to surprising events are predicted by formal associative learning theory. Nat Neurosci. 4:1043–1048.

Frith CD, Blakemore SJ, Wolpert DM. 2000. Explaining the symptoms of schizophrenia: abnormalities in the awareness of action. Brain Res Rev. 31:357–363.

Gallup GG. 1970. Chimpanzees: self-recognition. Science. 167:86–87.

Gallup GG. 1982. Self-awareness and the emergence of mind in primates. Am J Primatol. 2:237–248.

Gergely G, Watson JS. 1999. Early socio-emotional development: contingency perception and the social biofeedback model. In: Rochat P, editor. Early social cognition. Mahwah, NJ: Erlbaum. p. 101–136.

Ghilardi M, Ghez C, Dhawan V, Moeller J, Mentis M, Nakamura T, Antonini A, Eidelberg D. 2000. Patterns of regional brain activation associated with different forms of motor learning. Brain Res. 871:127–145.

Hart BL, Hart LA, Pinter-Wollman N. 2008. Large brains and cognition: where do elephants fit in? Neurosci Biobehav Rev. 32:86–98.

Inoue K, Kawashima R, Satoh K, Kinomura S, Sugiura M, Goto R, Ito M, Fukuda H. 2000. A PET study of visuomotor learning under optical rotation. Neuroimage. 11:505–516.

Ionta S, Heydrich L, Lenggenhager B, Mouthon M, Fornari E, Chapuis D, Gassert R, Blanke O. 2011. Multisensory mechanisms in temporo-parietal cortex support self-location and first-person perspective. Neuron. 70:363–374.

Kaplan JT, Aziz-Zadeh L, Uddin LQ, Iacoboni M. 2008. The self across the senses: an fMRI study of self-face and self-voice recognition. Soc Cogn Affect Neurosci. 3:218–223.

Koechlin E, Summerfield C. 2007. An information theoretical approach to prefrontal executive function. Trends Cogn Sci. 11:229–235.

Leube DT, Knoblich G, Erb M, Grodd W, Bartels M, Kircher TTJ. 2003. The neural correlates of perceiving one's own movements. Neuroimage. 20:2084–2090.

Loveland KA. 1986. Discovering the affordances of a reflecting surface. Dev Rev. 6:1–24.

Marino L. 2002. Convergence of complex cognitive abilities in cetaceans and primates. Brain Behav Evol. 59:21–32.

Morita T, Itakura S, Saito DN, Nakashita S, Harada T, Kochiyama T, Sadato N. 2008. The role of the right prefrontal cortex in self-evaluation of the face: a functional magnetic resonance imaging study. J Cogn Neurosci. 20:342–355.

Nadel J, Carchon I, Kervella C, Marcelli D, Reserbat-Plantey D. 1999. Expectancies for social contingency in 2-month-olds. Dev Sci. 2:164–173.

Nakamura K, Kawashima R, Sugiura M, Kato T, Nakamura A, Hatano K, Nagumo S, Kubota K, Fukuda H, Ito K, et al. 2001. Neural substrates for recognition of familiar voices: a PET study. Neuropsychologia. 39:1047–1054.

Oikawa H, Sugiura M, Sekiguchi A, Tsukiura T, Miyauchi CM, Hashimoto T, Takano-Yamamoto T, Kawashima R. 2012. Self-face evaluation and self-esteem in young females: an fMRI study using contrast effect. Neuroimage. 59:3668–3676.

Paradis AL, Cornilleau-Peres V, Droulez J, Van De Moortele PF, Lobel E, Berthoz A, Le Bihan D, Poline JB. 2000. Visual perception of motion and 3-D structure from motion: an fMRI study. Cereb Cortex. 10:772–783.

Platek SM, Loughead JW, Gur RC, Busch S, Ruparel K, Phend N, Panyavin IS, Langleben DD. 2006. Neural substrates for functionally discriminating self-face from personally familiar faces. Hum Brain Mapp. 27:91–98.

Plotnik JM, de Waal FBM, Reiss D. 2006. Self-recognition in an Asian elephant. Proc Natl Acad Sci USA. 103:17053–17057.

Quinlan DJ, Culham JC. 2007. fMRI reveals a preference for near viewing in the human parieto–occipital cortex. Neuroimage. 36:167–187.

Reddy V, Chisholm V, Forrester D, Conforti M, Maniatopoulou D. 2007. Facing the perfect contingency: interactions with the self at 2 and 3 months. Infant Behav Dev. 30:195–212.

Reiss D, Marino L. 2001. Mirror self-recognition in the bottlenose dolphin: a case of cognitive convergence. Proc Natl Acad Sci USA. 98:5937–5942.

Sakai K, Passingham RE. 2003. Prefrontal interactions reflect future task operations. Nat Neurosci. 6:75–81.

Schnell K, Heekeren K, Schnitker R, Daumann J, Weber J, Hesselmann V, Moller-Hartmann W, Thron A, Gouzoulis-Mayfrank E. 2007. An fMRI approach to particularize the frontoparietal network for visuomotor action monitoring: detection of incongruence between test subjects' actions and resulting perceptions. Neuroimage. 34:332–341.

Smith AT, Wall MB, Thilo KV. 2012. Vestibular inputs to human motion-sensitive visual cortex. Cereb Cortex. 22:1068–1077.

Suddendorf T, Collier-Baker E. 2009. The evolution of primate visual self-recognition: evidence of absence in lesser apes. Proc Biol Sci. 276:1671–1677.

Sugiura M. 2013. Associative account of self-cognition: extended forward model and multi-layer structure. Front Hum Neurosci. 7:535.

Sugiura M, Friston KJ, Willmes K, Shah NJ, Zilles K, Fink GR. 2007. Analysis of intersubject variability in activation: an application to the incidental episodic retrieval during recognition test. Hum Brain Mapp. 28:49–58.

Sugiura M, Sassa Y, Jeong H, Horie K, Sato S, Kawashima R. 2008. Face-specific and domain-general characteristics of cortical responses during self-recognition. Neuroimage. 42:414–422.

Sugiura M, Sassa Y, Jeong H, Miura N, Akitsuki Y, Horie K, Sato S, Kawashima R. 2006. Multiple brain networks for visual self-recognition with different sensitivity for motion and body part. Neuroimage. 32:1905–1917.

Sugiura M, Sassa Y, Jeong H, Wakusawa K, Horie K, Sato S, Kawashima R. 2012. Self-face recognition in social context. Hum Brain Mapp. 33:1364–1374.

Tootell RB, Mendola JD, Hadjikhani NK, Ledden PJ, Liu AK, Reppas JB, Sereno MI, Dale AM. 1997. Functional analysis of V3A and related areas in human visual cortex. J Neurosci. 17:7060–7078.

Turner DC, Aitken MR, Shanks DR, Sahakian BJ, Robbins TW, Schwarzbauer C, Fletcher PC. 2004. The role of the lateral frontal cortex in causal associative learning: exploring preventative and super-learning. Cereb Cortex. 14:872–880.

Uddin LQ, Kaplan JT, Molnar-Szakacs I, Zaidel E, Iacoboni M. 2005. Self-face recognition activates a frontoparietal "mirror" network in the right hemisphere: an event-related fMRI study. Neuroimage. 25:926–935.

Villarejo A, Martin VP, Moreno-Ramos T, Camacho-Salas A, Porta-Etessam J, Bermejo-Pareja F. 2011. Mirrored-self misidentification in a patient without dementia: evidence for right hemispheric and bifrontal damage. Neurocase. 17:276–284.

Wegner DM, Wheatley T. 1999. Apparent mental causation—sources of the experience of will. Am Psychol. 54:480–492.

Weiss PH, Marshall JC, Wunderlich G, Tellmann L, Halligan PW, Freund HJ, Zilles K, Fink GR. 2000. Neural consequences of acting in near versus far space: a physiological basis for clinical dissociations. Brain. 123:2531–2541.

Zahn-Waxler C, Radke-Yarrow M, Wagner E, Chapman M. 1992. Development of concern for others. Dev Psychol. 28:126–136.

zu Eulenburg P, Caspers S, Roski C, Eickhoff SB. 2012. Meta-analytical definition and functional connectivity of the human vestibular cortex. Neuroimage. 60:162–169.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/3.0/), which permits non-commercial re-use, distribution, and reproduction in any medium, provided the original work is properly cited. For commercial re-use, please contact journals.permissions@oup.com