Self-face recognition in the mirror is considered to involve multiple processes that integrate 2 perceptual cues: temporal contingency of the visual feedback on one's action (contingency cue) and matching with self-face representation in long-term memory (figurative cue). The aim of this study was to examine the neural bases of these processes by manipulating 2 perceptual cues using a “virtual mirror” system. This system allowed online dynamic presentations of real-time and delayed self- or other facial actions. Perception-level processes were identified as responses to only a single perceptual cue. The effect of the contingency cue was identified in the cuneus. The regions sensitive to the figurative cue were subdivided by the response to a static self-face, which was identified in the right temporal, parietal, and frontal regions, but not in the bilateral occipitoparietal regions. Semantic- or integration-level processes, including amodal self-representation and belief validation, which allow modality-independent self-recognition and the resolution of potential conflicts between perceptual cues, respectively, were identified in distinct regions in the right frontal and insular cortices. The results are supportive of the multicomponent notion of self-recognition and suggest a critical role for contingency detection in the co-emergence of self-recognition and empathy in infants.
Human adults recognize their own faces in a mirror or on any reflective surface. However, this ability has been demonstrated in only a limited number of animal species (Gallup 1982; Reiss and Marino 2001; Plotnik et al. 2006). Because these animals typically have large brains and show evidence of empathic behavior, mirrored self-recognition has been understood as an engram of a unique social-cognitive function of a “self,” which relies on a unique system in highly evolved brains (Gallup 1982; Marino 2002; Hart et al. 2008; Suddendorf and Collier-Baker 2009). This notion is consistent with the developmental co-emergence of empathic behavior and mirrored self-recognition in human infants (Bischof-Köhler 1988; Zahn-Waxler et al. 1992).
In contrast to this unique self-system notion, several lines of evidence suggest that multiple (at least 4) “ordinary” cognitive processes underpin mirrored self-recognition at both perceptual and semantic levels. At the perceptual level, there are 2 independent processes for detecting different perceptual cues from a mirrored self-image: contingency and figurative cues. The contingency cue is the temporal contingency between one's intentional facial action and the visually observed movement feedback of one's own face. The figurative cue is the match between the perceived face and the self-face representation stored in the visual long-term memory. The contingency cue may trigger 2 additional distinct processes: the sense of agency underpinning the perceived action (Wegner and Wheatley 1999; Frith et al. 2000) and the recognition of the mirror, which involves understanding that the space represented as being behind the mirror surface is in fact a symmetrically transformed duplicate of the real space in front (Loveland 1986). The mechanism for the figurative-cue processing appears to develop based on agency-related experience derived from the contingency cue; infants begin self-recognition of noncontingent images (e.g., photos and recorded videos) after they acquire abilities related to contingent images (e.g., mirrors and live videos) (Bigelow 1981; Courage et al. 2004).
At the semantic level, the 2 perceptual cues, which are processed independently at the perceptual level, are integrated to allow meta-level self-cognition. In this integration stage, 2 complementary processes are also at work. One process is the access to amodal self-representation, which is activated by input from perceptual-level processes of any modality or type of self-relevant cue (i.e., irrespective of the contingency or figurative cue). This notion seems consistent with the notion of a special-self system. The second process is the belief-validation process, which evaluates the consistency between different self-relevant inputs from the perceptual level (i.e., between contingency and figurative cues). This hypothesis is based on the observation of mirrored-self misidentification in patients with dementia, in which a false belief (i.e., that the person in the mirror is not the self) generated by an impaired perceptual-level process is not corrected by comparison with other perceptual or contextual information (Coltheart 2010; Connors and Coltheart 2011).
No study has examined the neural basis of mirrored self-recognition by addressing this multicomponent structure, whereas related studies have suggested the involvement of different cortical regions, mostly in the right hemisphere (Sugiura 2013). The neuropsychological literature on mirrored-self misidentification suggests the involvement of the right hemisphere or frontal regions, but further anatomical specification has been hampered by the severe dementia and global cortical impairment of patients (Breen et al. 2001; Villarejo et al. 2011). Many neuroimaging studies have explored the self-face-specific response of the brain using noncontingent images (i.e., pictures or recorded videos) and have thus addressed self-face recognition based solely on a figurative cue. Indeed, recent studies have typically reported self-face-specific activation in several regions in the lateral frontal, parietal, occipital, temporal, and insular cortices of the right hemisphere (Uddin et al. 2005; Sugiura et al. 2006, 2008, 2012; Devue et al. 2007; Kaplan et al. 2008; Oikawa et al. 2012). Neuroimaging studies assessing the sense of action agency identified the neural response to a contingency cue by contrasting the real-time and violated visual feedback of the subject's hand action; activation has been reported in the insula and putamen (Farrer and Frith 2002; Farrer et al. 2003; Leube et al. 2003). However, in these studies, the subjects were not provided with the figurative cue or the unique visuospatial experience of being faced with a mirror. Finally, 2 potential semantic-level self-recognition processes have also been implicated in the right lateral prefrontal cortex. Amodal self-representation has been suggested to involve a region in the right inferior frontal gyrus or sulcus; this region responded not only to the self-face but also to the self-voice (Nakamura et al. 2001) or to one's own name being called (Kaplan et al. 2008). 
The belief-validation process in general has been speculated to involve the right middle frontal gyrus, which is responsive to the violation of predictions based on a belief (Fletcher et al. 2001; Turner et al. 2004).
In this functional magnetic resonance imaging (fMRI) study, we directly examined the neural bases of self-face recognition in a mirror. We installed a virtual mirror system in an MRI scanner; the system featured an open-face head coil, a video camera, and a projection system capable of presenting real-time, delayed, and recorded videos. This system allowed us to independently manipulate contingency and figurative cues, enabling us to dissociate the perceptual-level processes for the 2 cues and the 2 semantic-level processes. During a simple facial action task, each subject was presented with a video of the subject's own face (Self) or a similar prerecorded video of an unfamiliar face (Other). Each type of face stimulus was projected in real-time (Real-time), with a 500-ms delay (Delayed), or as a still image (Static). Of these 6 conditions for different types of visual stimuli, 4 were designed for a 2-by-2 factorial design composed of the factors Contingency (Real-time, Delayed) and Face (Self, Other). The perceptual-level process of the contingency cue was identified as the main effect of Contingency, and that of the figurative cue as the main effect of Face. The semantic-level processes were expected to show an interaction between the 2 factors, with the 2 potential processes demonstrating different activation profiles. Amodal self-representation should exhibit a lack of additive effect of the 2 cues while being sensitive to either cue alone; that is, an equivalent level of activation should be observed among Real-time Self, Delayed Self, and Real-time Other (i.e., Real-time contingency is a self-relevant input) compared with the Delayed Other condition. The belief-validation process should be activated when the contingency violates the prediction that is based on the face and derives from human expertise in mirrored self-recognition (i.e., that the self-face provides real-time feedback).
Materials and Methods
The Ethics Committee of Tohoku University School of Medicine approved this experimental protocol.
Twenty-seven healthy right-handed male undergraduate or graduate students (aged 19–25 years) participated, and written informed consent was obtained from all subjects. No subject had a history of neurological or psychiatric illness.
Virtual Mirror System
The system is schematically illustrated in Figure 1a. Subjects lay on the bed of an MRI scanner with their heads fixed in the MRI head coil using elastic blocks. The head component of a SENSE head spine coil (Philips, Best, the Netherlands) was used as the open-face head coil. A high-speed (250 fps) video camera SVS340CUCP (SVS-VISTEK, Seefeld, Germany) viewed the subject's face via a half-mirror attached to the head coil. The image was projected onto a semilucent screen behind the head coil using an Endeavor Pro4700 (Epson, Suwa, Japan) and a DLA-HD10K LCD projector (JVC, Wayne, NJ, USA). Custom software (Physiotech, Tokyo, Japan) allowed presentation of a quasi-real-time (<60-ms delay), delayed-real-time, or prerecorded video. The subjects viewed the screen via a mirror.
Stimuli and Task
Figure 1b illustrates the task conditions. In all trials, a face image was presented for 2 s and a small white circle appeared at the position of the glabella following a 500-ms latency, where it remained for 500 ms; this was the prompt for the mouth action. Each subject was instructed to quickly open his mouth as soon as the prompt appeared, and to then close it immediately. In the scanner, the subjects initially practiced the task while viewing an example video, and then while viewing various types of facial images that would be presented during the experiment. After the subject had acquired the ability to quickly and consistently perform the task, 3 video clips were recorded: videos of the participant engaging in regular and delayed performance (cue latency, 1 s), and a video without task performance. The first 2 facial action video clips were not used in the experiment for the subject depicted in the video, but for other subjects as the clips for the Real-time and Delayed Other conditions. Video clips without task performance were used for the Static Self condition in experiments for this subject, as well as for the Static Other condition in experiments for other subjects.
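The trial timing described above can be laid out as a small worked example. This is only an illustrative sketch: the 300-ms reaction time is a hypothetical value, and the function and constant names are not taken from the authors' software.

```python
# Timing constants taken from the task description (milliseconds).
FACE_DURATION_MS = 2000        # face image shown for 2 s
PROMPT_ONSET_MS = 500          # white circle appears 500 ms after face onset
PROMPT_DURATION_MS = 500       # the circle stays on for 500 ms
DELAYED_CUE_LATENCY_MS = 1000  # cue latency when recording the delayed clip


def action_onset_ms(cue_latency_ms, reaction_ms):
    """Time from face onset at which the mouth action starts."""
    return cue_latency_ms + reaction_ms


# Hypothetical reaction time, chosen only for illustration.
reaction_ms = 300

regular_clip = action_onset_ms(PROMPT_ONSET_MS, reaction_ms)
delayed_clip = action_onset_ms(DELAYED_CUE_LATENCY_MS, reaction_ms)
clip_shift_ms = delayed_clip - regular_clip
prompt_offset_ms = PROMPT_ONSET_MS + PROMPT_DURATION_MS

print(regular_clip, delayed_clip, clip_shift_ms, prompt_offset_ms)
```

Under these assumptions, the action in the delayed clip starts 500 ms later than in the regular clip, whatever reaction time is assumed, provided it is the same in both recordings.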
Each subject participated in three 460-s sessions, each including 10 trials for each of the 6 conditions, with pseudo-random intertrial intervals of 1–16 s. The image was projected with minimal and 500-ms delays under the Real-time and Delayed Self conditions, respectively. The recorded video clips of an unfamiliar subject's regular and delayed performances were presented during the Real-time and Delayed Other conditions, respectively. Videos of the subject himself and another subject not performing the task were presented during the Static Self and Other conditions, respectively.
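The session structure can be sketched in a few lines. The code below is a hypothetical illustration, not the authors' presentation software: the plain seeded shuffle is an assumption, as is the reading that intervals run from stimulus offset to the next onset and that trials fill the whole session.

```python
import random
from collections import Counter

FACES = ("Self", "Other")
MODES = ("Real-time", "Delayed", "Static")
TRIALS_PER_CONDITION = 10
SESSION_S = 460
STIMULUS_S = 2


def build_session(seed):
    """Return a shuffled list of (face, mode) trials for one session."""
    trials = [(f, m) for f in FACES for m in MODES] * TRIALS_PER_CONDITION
    random.Random(seed).shuffle(trials)  # plain shuffle: an assumption
    return trials


session = build_session(seed=0)
counts = Counter(session)
n_trials = len(session)

# Average time available per interval once stimulus time is subtracted
# (an upper bound under the assumptions stated in the lead-in).
mean_gap_s = (SESSION_S - n_trials * STIMULUS_S) / n_trials

print(n_trials, round(mean_gap_s, 2))
```

With 60 trials of 2 s each, roughly 5.7 s per trial remain for the intervals, so short intervals within the 1–16 s range must have been more frequent than long ones.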
The latency between the onset of the action and the visual prompt was measured for each subject using the prerecorded video used as a stimulus. Actions could not be recorded during the fMRI measurement because the system assigned all the computational resources to visual presentation to minimize the feedback delay during the Real-time condition.
fMRI Data Acquisition and Preprocessing
Forty-four transaxial gradient-echo images (echo time = 30 ms, flip angle = 85°, slice thickness = 2.5 mm, slice gap = 0.5 mm, FOV = 192 mm, matrix = 64 × 64, voxel size = 3 × 3 × 3 mm) covering the whole cerebrum were acquired during all sessions at a repetition time of 2.5 s, using an echo planar sequence and a Philips Achieva (3T) MR scanner. The following preprocessing procedures were performed using statistical parametric mapping (SPM8) software (Wellcome Department of Imaging Neuroscience, London, UK) and MATLAB: adjustment of acquisition timing across slices, correction for head motion, spatial normalization using the EPI-MNI template, and smoothing using a Gaussian kernel with a full-width at half-maximum size of 8 mm.
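The acquisition parameters are mutually consistent, as the following arithmetic sketch shows. The variable names are illustrative. The number of volumes assumes that scanning covered the full 460-s session, and the coverage figure counts one gap per slice, so it is approximate.

```python
import math

SLICES = 44
SLICE_THICKNESS_MM = 2.5
SLICE_GAP_MM = 0.5
FOV_MM = 192
MATRIX = 64
TR_S = 2.5
SESSION_S = 460
FWHM_MM = 8

in_plane_mm = FOV_MM / MATRIX                          # in-plane voxel edge
slice_spacing_mm = SLICE_THICKNESS_MM + SLICE_GAP_MM   # effective slice step
coverage_mm = SLICES * slice_spacing_mm                # approximate z coverage
volumes_per_session = SESSION_S / TR_S                 # assumes full-session scanning

# Standard conversion from FWHM to the Gaussian standard deviation.
sigma_mm = FWHM_MM / (2 * math.sqrt(2 * math.log(2)))

print(in_plane_mm, slice_spacing_mm, coverage_mm, volumes_per_session)
print(round(sigma_mm, 2))
```

The 3-mm in-plane size and 3-mm slice step reproduce the stated 3 × 3 × 3 mm voxel, and the 8-mm kernel corresponds to a standard deviation of about 3.4 mm.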
fMRI Data Analysis
Seven subjects were excluded from the analysis; one could not discriminate self from other faces during the experiment, 4 performed the action too quickly or too slowly compared with the video of the respective other (more than a 100-ms difference in action onset), and 2 moved their heads >4 mm within a session. Data from the remaining 20 subjects were analyzed.
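The exclusion criteria can be written as a simple screening rule. The sketch below is hypothetical: the field names and the order in which criteria are checked are assumptions, and the example subject is invented for illustration.

```python
from dataclasses import dataclass

MAX_ONSET_DIFF_MS = 100   # action onset vs. the video of the respective other
MAX_HEAD_MOTION_MM = 4    # head movement within a session


@dataclass
class Subject:
    discriminates_self: bool   # could tell self from other faces
    onset_diff_ms: float       # hypothetical field name
    head_motion_mm: float      # hypothetical field name


def exclusion_reason(s):
    """Return the first criterion a subject fails, or None if retained."""
    if not s.discriminates_self:
        return "discrimination"
    if abs(s.onset_diff_ms) > MAX_ONSET_DIFF_MS:
        return "timing"
    if s.head_motion_mm > MAX_HEAD_MOTION_MM:
        return "motion"
    return None


# Counts reported in the text.
recruited = 27
excluded = {"discrimination": 1, "timing": 4, "motion": 2}
analyzed = recruited - sum(excluded.values())

example = Subject(discriminates_self=True, onset_diff_ms=-130.0, head_motion_mm=1.2)
print(analyzed, exclusion_reason(example))
```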
A conventional two-level approach for event-related fMRI data was adopted using SPM8. A voxel-by-voxel multiple regression analysis was conducted for the first-level within-subject (fixed effects) model. Expected signal changes were modeled for the 6 conditions. A model of the expected signal change was constructed using the hemodynamic response function provided by SPM8. A high-pass filter with a cutoff period of 128 s was used to eliminate the artifactual low-frequency trend.
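To illustrate how an expected signal change is built from event onsets, the sketch below uses a common double-gamma approximation of the hemodynamic response (gamma shapes 6 and 16, undershoot ratio 1/6). This is an assumption for illustration; the function provided by SPM8 may differ in sampling and normalization, and the example treats each trial as an instantaneous event rather than a 2-s stimulus.

```python
import math

TR_S = 2.5


def double_gamma_hrf(t):
    """Double-gamma response at t seconds after an event (unnormalized)."""
    if t <= 0:
        return 0.0
    peak = t ** 5 * math.exp(-t) / math.gamma(6)
    undershoot = t ** 15 * math.exp(-t) / math.gamma(16)
    return peak - undershoot / 6


def predicted_signal(onsets_s, n_scans):
    """Sum the responses to all events, sampled once per TR."""
    signal = []
    for scan in range(n_scans):
        t_scan = scan * TR_S
        signal.append(sum(double_gamma_hrf(t_scan - onset) for onset in onsets_s))
    return signal


# Locate the response peak on a 0.1-s grid over the first 30 s.
peak_time_s = max((i * 0.1 for i in range(300)), key=double_gamma_hrf)

# Hypothetical onsets for one condition, in seconds.
regressor = predicted_signal([0.0, 20.0], n_scans=16)
print(round(peak_time_s, 1), [round(v, 3) for v in regressor[:4]])
```

One such regressor per condition, six in total here, would enter the multiple regression; the high-pass filter is applied separately and is not shown.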
A voxel-by-voxel statistical inference on the contrasts of the parameter estimates was performed on the second-level between-subject (random effects) model, using one-sample t-tests. First, the main effects and their interactions were examined in a 2-by-2 factorial design composed of the Contingency (Real-time, Delayed) and Face (Self, Other) factors (Fig. 2a). The threshold for significant activation was initially set at P < 0.001 (uncorrected); it was then corrected to P < 0.05 for multiple comparisons using cluster size assuming the whole brain as the search volume. A region-of-interest analysis was then performed using the activation peaks identified in the contrasts for the main effect of Face and for the negative interaction; the areas identified in the 2 contrasts overlapped considerably (see Results). The analysis was intended for detailed functional segregation using 3 additional contrasts at a liberal threshold (P < 0.05, without correction for multiple comparisons).
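The second-level test at a single voxel reduces to a one-sample t-test on the per-subject contrast values. The following is a minimal sketch using invented numbers; it does not reproduce the thresholding or the cluster-size correction.

```python
import math
import statistics


def one_sample_t(values):
    """Return the t statistic and degrees of freedom against a zero mean."""
    n = len(values)
    mean = statistics.fmean(values)
    standard_error = statistics.stdev(values) / math.sqrt(n)
    return mean / standard_error, n - 1


# Hypothetical contrast values for five subjects at one voxel.
t_value, dof = one_sample_t([1.0, 2.0, 3.0, 4.0, 5.0])
print(round(t_value, 3), dof)
```

With the 20 subjects analyzed in this study, the same computation would have 19 degrees of freedom.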
To identify regions involved in the perceptual-level processes, the main effect of each factor was tested. To identify regions responsive to the contingency cue, the main effect of Contingency (Fig. 2b) was tested using the contrast (Real-time Self + Real-time Other) – (Delayed Self + Delayed Other). To identify regions responsive to the figurative cue, the main effect of Face (Fig. 2c) was tested using the contrast (Real-time Self + Delayed Self) – (Real-time Other + Delayed Other). For the latter regions, the region-of-interest analysis was performed on the Self–Other contrast under the Static condition to examine whether the effect of the figurative cue was dependent on facial motion.
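The 2 main-effect contrasts can be expressed as weight vectors over the 4 factorial conditions. In the sketch below, the weights follow the contrasts stated above, whereas the response profile is an idealized, hypothetical example of a region that responds only to the figurative cue.

```python
CONDITIONS = ("Real-time Self", "Delayed Self", "Real-time Other", "Delayed Other")

# (Real-time Self + Real-time Other) - (Delayed Self + Delayed Other)
MAIN_CONTINGENCY = (1, -1, 1, -1)
# (Real-time Self + Delayed Self) - (Real-time Other + Delayed Other)
MAIN_FACE = (1, 1, -1, -1)


def contrast_value(weights, estimates):
    """Weighted sum of condition estimates, in CONDITIONS order."""
    return sum(w * e for w, e in zip(weights, estimates))


# Hypothetical region driven only by the self-face, regardless of delay.
face_only_profile = (1.0, 1.0, 0.0, 0.0)

face_effect = contrast_value(MAIN_FACE, face_only_profile)
contingency_effect = contrast_value(MAIN_CONTINGENCY, face_only_profile)
overlap = contrast_value(MAIN_FACE, MAIN_CONTINGENCY)

print(face_effect, contingency_effect, overlap)
```

The zero product of the 2 weight vectors shows that the main effects are orthogonal, so a region can show one effect without the other.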
To identify regions involved in the 2 semantic-level processes together, the negative interaction of the 2 factors (Fig. 2d,e) was tested using the contrast (Real-time Other + Delayed Self) – (Real-time Self + Delayed Other). To identify a moderate degree of negative interaction among regions that exhibited the main effect of Face, the contrast Real-time – Delayed under the Other condition was used in the region-of-interest analysis as an index of sensitivity to a contingency cue in the absence of a figurative cue. To dissociate the 2 semantic-level processes, the contrast (Real-time Other + Delayed Self) – 2 × Real-time Self was used as an index of violation of the predicted contingency; delayed feedback from the self-face and real-time feedback from the other face should violate the subject's prediction. This contrast dissociated the belief-validation process (Fig. 2e) from amodal self-representation (Fig. 2d).
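The way these contrasts separate the 2 semantic-level processes can be checked on idealized response profiles. The profiles below are hypothetical illustrations of the predicted patterns, not measured data, and the names are chosen for this sketch only.

```python
CONDITIONS = ("Real-time Self", "Delayed Self", "Real-time Other", "Delayed Other")

# (Real-time Other + Delayed Self) - (Real-time Self + Delayed Other)
NEGATIVE_INTERACTION = (-1, 1, 1, -1)
# (Real-time Other + Delayed Self) - 2 x Real-time Self
VIOLATION_INDEX = (-2, 1, 1, 0)


def contrast_value(weights, estimates):
    """Weighted sum of condition estimates, in CONDITIONS order."""
    return sum(w * e for w, e in zip(weights, estimates))


# Idealized (hypothetical) profiles.
# Amodal self-representation: equal response to any self-relevant cue.
amodal = (1.0, 1.0, 1.0, 0.0)
# Belief validation: response only when contingency contradicts the face.
belief_validation = (0.0, 1.0, 1.0, 0.0)

results = {
    name: (
        contrast_value(NEGATIVE_INTERACTION, profile),
        contrast_value(VIOLATION_INDEX, profile),
    )
    for name, profile in (("amodal", amodal), ("belief", belief_validation))
}
print(results)
```

Both idealized profiles yield a positive value for the negative-interaction contrast, but only the belief-validation profile yields a positive violation index, which is the basis for the dissociation.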
Activation specific to mirrored self-recognition (i.e., Real-time Self), if any, was explored by testing a positive interaction, namely the contrast (Real-time Self + Delayed Other) – (Real-time Other + Delayed Self).
Significantly higher activation in the cuneus was identified under the Real-time condition than during the Delayed condition (i.e., the main effect of Contingency) (Fig. 3a, Table 1). The activation profile showed that the effect was derived from deactivation under the Delayed condition; this was also observed under the Static condition.
|Structure||Coordinates||Cluster||Peak activation (t-value)|
|“Main effect of contingency” (Real-time > Delayed)|
|“Main effect of Face” (Self > Other)|
|“Negative interaction” (Real-time Other + Delayed Self > Real-time Self + Delayed Other)|
Stereotactic coordinates (x, y, z) of the activation peak, cluster size (number of voxels; voxel size = 2 × 2 × 2 mm³), P-value (corrected for multiple comparisons), and t-values for the 3 major contrasts (i.e., main effects and interaction) at the peak are given. Lowercase letters for each cluster indicate activity in the same cluster.
Abbreviations for structures: a, anterior; m, middle; p, posterior; OPT, occipito–parietal–temporal; OP, occipitoparietal; Jx, junction; IPS, intraparietal sulcus; SMG, supramarginal gyrus; STG, superior temporal gyrus; ITG, inferior temporal gyrus; SFG, superior frontal gyrus; MFG, middle frontal gyrus; IFG, inferior frontal gyrus; t, triangular part; o, orbital part; INS, insula; IFS, inferior frontal sulcus; FOP, frontal operculum.
*P < 0.001, **P < 0.05 (uncorrected).
Significantly higher activation under the Self than under the Other conditions (i.e., the main effect of Face) is shown in Figure 3b and Table 1. Activation was observed primarily in the right lateral cortices. A large activation cluster in the occipitoparietal region included peaks at the occipito-temporo-parietal junction (OTPJx), occipitoparietal junction (OPJx), intraparietal sulcus (IPS), posterior and anterior parts of the supramarginal gyrus (SMG), and posterior part of the superior temporal gyrus (pSTG). Activation was also observed in the posterior part of the right inferior temporal gyrus (pITG) and in the left IPS. In the right frontal cortices, a large frontal–insular cluster had peaks at the most posterior parts of the superior, middle, and inferior frontal gyri along the precentral sulcus (pSFG, pMFG, and pIFG, respectively), the triangular and orbital parts of the inferior frontal gyrus (tIFG and oIFG, respectively), and the anterior and middle parts of the insula (aINS and mINS, respectively). Several of these activation peaks showed a significant main effect of Contingency or interaction when a liberal (P < 0.05, uncorrected) threshold for statistical significance was used (Table 1).
A significant negative interaction of Contingency and Face was identified in 2 clusters in the right frontal region (Fig. 3c, Table 1). One cluster included 2 peaks at the anterior parts of the middle frontal gyrus (aMFG) and inferior frontal sulcus (aIFS), and the other involved 2 peaks at the posterior part of the IFS (pIFS) and the medial surface of the frontal operculum (FOP) facing the middle insula.
The results are summarized in Figure 4. The activation peaks identified in the contrast for the main effect of Face (circles in Fig. 4a) were divided into 2 groups depending on the sensitivity to the contingency cue (vertical axis in the upper panel of Fig. 4a). The peaks that were not sensitive to the contingency cue (i.e., dedicated to figurative-cue processing; Fig. 2c) were segregated into 2 groups based on their sensitivity to the static self-face (horizontal axis in the upper panel of Fig. 4a): aSMG, pITG, and oIFG were responsive to a static neutral self-face (green circles), but other occipitoparietal regions and pSFG were not (beige circles). The peaks that were sensitive to the contingency cue (i.e., dedicated to amodal self-representation; Fig. 2d) were all responsive to the static neutral self-face (pMFG, pIFG, tIFG, and aINS; light-blue circles). As expected, these peaks were not sensitive to the violation of a predicted contingency (vertical axis in the lower panel of Fig. 4a).
In contrast, the peaks identified in the contrast for negative interaction (triangles in Fig. 4a) were divided into the aIFS and the 3 remaining peaks depending on their sensitivity to the static self-face and violation of a predicted contingency (horizontal and vertical axes, respectively, in the lower panel of Fig. 4a). The aIFS was sensitive to the static self-face but not to the violation of a predicted contingency (light-blue triangle); therefore, it belonged to the group that exhibited sensitivity to the contingency cue among the Face-main-effect peaks (i.e., dedicated to amodal self-representation; light-blue circle; Fig. 2d). In contrast, the remaining 3 peaks (aMFG, pIFS, and FOP) were sensitive to the violation of a predicted contingency (i.e., dedicated to the belief-validation process; Fig. 2e) but not to the static neutral self-face (purple triangles).
This fMRI study is the first to examine the neural mechanisms underlying mirrored self-face recognition. Our data demonstrated that multiple processes at the perceptual or semantic level play roles in the processing of 2 perceptual cues (i.e., contingency and figurative) and their integration. We demonstrated that different cortical regions were sensitive to contingency and/or figurative cues. In particular, several right frontal–insular regions showed activation reflecting different types of negative interaction between the 2 cues. These findings support the multicomponent view of the mirrored-self-recognition process.
Main Effect of Contingency
The observed significant main effect in the cuneus may reflect the unique visuospatial experience of the mirror confrontation induced by detecting the real-time contingency in the visual feedback of one's own facial action. This region likely corresponds to V3 or V3A (Tootell et al. 1997) and has been reported to be involved in 3D depth perception (Paradis et al. 2000) and attention to peripersonal space in front (Weiss et al. 2000; Quinlan and Culham 2007). Upon recognition of the “mirror,” our subjects may have perceived the space represented on the image as real and paid attention to its depth in their peripersonal space. The activation profile in this region showed deactivation relative to the baseline activity under the delayed and static conditions; this may be more accurately described as suppression of 3D depth perception or attention to peripersonal space during the perception of a virtual image (i.e., recorded or static video).
Main Effect of Face
Many occipital, parietal, temporal, frontal, and insular regions were identified, particularly in the right hemisphere. These regions were largely consistent with those previously reported as showing self-face-specific activation (Uddin et al. 2005; Platek et al. 2006; Sugiura et al. 2006, 2008, 2012; Kaplan et al. 2008; Oikawa et al. 2012). Our results segregated these regions into 3 groups. A group composed of frontal activation peaks (i.e., pMFG, pIFG, tIFG, and aINS) also exhibited sensitivity to the contingency cue, suggesting that these regions have a role in amodal self-representation at the semantic level. Functional dissociation of these frontal regions from posterior regions was previously suggested by a different pattern of intersubject variance in self-specific activation (Sugiura et al. 2006). The remaining regions, which were specifically sensitive to the figurative cue, were considered to be involved in perceptual-level processes. These were further subdivided into regions that were sensitive to the static neutral self-face (i.e., aSMG, pITG, and oIFG), and those that were not (other occipitoparietal regions and pSFG).
Although the precise processes operating in each group of perceptual-level regions remain a matter of speculation, the difference in the roles of the 2 groups seems to be traceable to different stages of an infant's acquisition of this ability in front of a mirror. The regions that do not respond to the static self-face (e.g., occipitoparietal regions and the pSFG) may be relevant to experiences during the initial stages of the acquisition of this ability. Infants show exploratory behavior (e.g., smiling, moving, and touching a mirror) when they begin to learn about the contingency between their own actions and visual feedback from a mirror (Loveland 1986). This visuomotor learning process involves the parietal and premotor regions (Ghilardi et al. 2000; Inoue et al. 2000), which largely overlap with regions in this group. In contrast, the group that showed sensitivity to the static self-face (i.e., pITG, aSMG, and oIFG) may be related to the bodily representation of the self-face that was established after this learning process. Specifically, regions in this group have been implicated in the sense of action agency or body ownership (Leube et al. 2003; David et al. 2007; Schnell et al. 2007; Ionta et al. 2011). Both groups of regions overlap with the areas that receive vestibular input (Smith et al. 2012; zu Eulenburg et al. 2012), which is highly relevant to the bodily representation of the self-face, as well as to its acquisition process.
The expected activation profile was observed in 4 right frontal peaks (pMFG, pIFG, tIFG, and aINS) that were identified in the contrast for the main effect of Face, and in the right aIFS, which was identified in the contrast for the negative interaction. Several findings support the notion of the amodal nature of the self-representation in these regions. The response to the self-voice (Nakamura et al. 2001) or to one's own name being called (Kaplan et al. 2008) was identified previously in the regions close to the right tIFG and aIFS peaks. A correlation between social-affective feeling and activation during self-face viewing has also been reported in a cross-trial correlation between a feeling of embarrassment and a region close to the aIFS and the cross-individual correlation of a public self-consciousness score and a region close to the aINS and pIFG (Morita et al. 2008). In contrast, the range or domain of “self” represented in these regions may be limited considering the report that they do not respond to a visually presented self-name, whereas the right tIFG does respond to the self-face (Sugiura et al. 2008).
A response to the violation of a predicted contingency was identified in the right aMFG, pIFS, and FOP. The lack of response to the static neutral self-face may be due to our expertise in viewing a static self-face. Of these regions, the pIFS and aMFG are close to a region that has been reported to be responsive to the violation of a learned association between a drug and a side effect (Fletcher et al. 2001; Turner et al. 2004). The aMFG may select processes (i.e., perceptual cues) congruent with the validated belief; this region has been implicated in the enhancement and suppression of task-relevant and task-irrelevant processes, respectively (Sakai and Passingham 2003; Sugiura et al. 2007).
Neural Mechanisms for Mirrored Self-face Recognition and Implications
The groups of regions that exhibited different activation profiles are thus indeed likely to accommodate the different cognitive processes that have been assumed to underlie mirrored self-recognition based on the findings of developmental psychology and clinical observations. This finding may contribute to the discussion regarding the multiple developmental levels of the ability for self-recognition, particularly its behavioral index (Anderson 1984; Brooks-Gunn and Lewis 1984). Historically, the legitimate index has been the mark test (or rouge test), by which animals or infants are marked on an unseen part of the body and examined in front of a mirror to determine whether they show a behavior directed to the mark rather than to the mirror (Gallup 1970; Anderson 1984; Brooks-Gunn and Lewis 1984). Some animal species that previously failed to pass this test were able to pass a modified version of the test; for example, pigeons passed the test after training (Epstein et al. 1981). Infants typically initially pass this test in the second year of life, but they pass tests for other self-recognition-relevant indices earlier; they discriminate between contingent and noncontingent images of self at <2 months of age (Reddy et al. 2007). It would be interesting to compare these different indices with the components of mirrored self-recognition and the related cortical regions.
It is also interesting to focus on the role of contingency detection in the development of recognition not only of the self but also of others in an interactive relationship. We attributed the neural response to the figurative cue in the right pITG, aSMG, and oIFG to the bodily representation of the self-face acquired through the experience of contingency testing during infancy. However, it has been reported that these regions, particularly the right pITG and aSMG, are also responsive to the faces of personally familiar people in a social context (Sugiura et al. 2012). This finding may suggest that the recognition of personally familiar people in a social context also involves the representation of contingent relationships. Consistent with this, it has been proposed that an innate contingency-detection module analyzes not only the input related to one's own body but also the input related to communicative others: the former carries a perfect contingency between one's own action and the perceived motion of the body, whereas the latter carries a loose contingency (i.e., delay in time, different form and strength, or different modality) between one's own social action and the perceived response of others (Gergely and Watson 1999). In fact, evidence of contingency detection by infants (e.g., a negative reaction to a recorded video compared with a real-time video) in response to both the self (Reddy et al. 2007) and the mother (Nadel et al. 1999) appears at around 2 months of age. This commonality between the contingency detection mechanisms for self- and communicative-other recognition may explain the developmental co-emergence of mirrored self-recognition and empathic behavior in infants (Bischof-Köhler 1988; Zahn-Waxler et al. 1992).
To fully understand the roles of the frontal regions during mirrored self-face recognition, further conceptual elaborations are necessary. The right lateral prefrontal cortex has been implicated in the coordination of self-relevant information related not only to the physical self but also to the social self (Sugiura 2013). The regions assigned to the amodal self-representation and the belief-validation processes within the right frontal and insular cortices exhibited mosaic-like distribution. The overlap of these regions with those reported in previous relevant studies was often partial or variable. This suggests that further functional segregation may occur. For example, it is known that multiple regions in the lateral prefrontal cortex are dedicated to hierarchically organized cognitive control (Koechlin and Summerfield 2007; Badre 2008).
Although the absence of a recording of subjects' facial action during the fMRI measurement was a methodological limitation of this study, we believe that it eventually had no significant effect on our results for the following reasons. Although it was possible that subjects performed delayed facial actions during the experiment, this did not affect the Self conditions, which used real-time feedback under the Real-time and Delayed conditions. However, it could have affected the Other conditions, during which recorded videos were presented. The delay could have reduced and increased the real-time contingency under the Real-time Other and Delayed Other conditions, respectively, and thus resulted in a reduction in the contrast between the 2 conditions. To minimize this possibility, we had each subject practice the task until he acquired the ability to consistently perform it. Additionally, during the fMRI measurement, we monitored the subjects' performance under the Real-time Self and Delayed Self conditions in which the real-time image of the subjects' action was available. Despite such efforts, it was still possible that the subjects' performance was delayed and that the contingency cue had a reduced effect under the Other conditions. This may have produced an artifactual positive interaction between 2 perceptual cues; that is, the regions that should show the main effect of the contingency cue may have exhibited artifactual predominance in the contingency-cue effect under the Self condition. Fortunately, we did not detect a positive interaction and, therefore, do not need to discuss the possibility of artifacts.
We successfully identified the neural bases of the processes underlying mirrored-self-face recognition. Perceptual-level processes, which were responsive to a single perceptual cue, were localized primarily in the posterior cortices. Responses to real-time contingency cues were identified in the cuneus, which may reflect the unique visuospatial experience of the mirror confrontation. Responses to the figurative cues of the self-face, which were assumed to index the self-face representation, were identified in the bilateral occipitoparietal and right temporal, frontal, and insular cortices. These regions were segregated into 2 groups depending on the presence or absence of the response to the static self-face, and their roles could be traced to different developmental stages of self-face recognition. Two types of regions reflecting semantic-level processes were identified: regions for amodal self-representation, which were sensitive to both perceptual cues and the static self-face, and regions for belief validation, which responded to violations of a predicted contingency. These 2 region types exhibited a mosaic-like distribution over the right frontal and insular regions, suggesting the need for further conceptual elaboration in self-relevant processes at the meta-cognitive level. These results illustrate the notion of multiple developmental levels of self-recognition ability, as well as the potential role of contingency detection in the co-emergence of mirrored self-recognition and empathic behavior in infants.
This study was supported by MEXT KAKENHI 23119702 and 25560347. Funding to pay the Open Access publication charges for this article was provided by MEXT KAKENHI 25560347.
Conflict of Interest: None declared.