Abstract

The ability to extract sequential regularities embedded in the temporal context or temporal structure of sensory events, and to predict upcoming events on the basis of those extracted regularities, plays a central role in human cognition. In the present study, we demonstrate that upcoming emotional faces can be predicted from sequential regularities without any intention to do so. Specifically, prediction error responses, as reflected by visual mismatch negativity (MMN), an event-related brain potential (ERP) component, were evoked by emotional faces that violated a regular alternation pattern of 2 emotional faces (fearful and happy faces) under a situation where the emotional faces themselves were unrelated to the participant’s task. Face-inversion and negative-bias effects in the visual MMN further indicated the involvement of holistic face representations. In addition, successive source analyses of the visual MMN revealed that the prediction error responses were composed of activations mainly in the face-responsible visual extrastriate areas and the prefrontal areas. The present results provide primary evidence for the existence of unintentional temporal context–based prediction of biologically relevant visual stimuli, as well as empirical support for the major engagement of the visual and prefrontal areas in unintentional temporal context–based prediction in vision.

Introduction

Unintentional Temporal Context–Based Prediction in Vision

The extraction of sequential regularities embedded in the temporal context or temporal structure of sensory events and the formation of predictions about upcoming events on the basis of predictive models encoding such extracted sequential regularities are essential tasks of the brain, since 1) at the behavioral level, they allow us to optimize the accuracy and speed of perceptual, cognitive, and motor processes in response to upcoming events (Huettel et al. 2002; Doherty et al. 2005) and 2) at the neuronal and computational levels, they enable us to minimize processing resources and computational demands for redundant predictable events and to maximize them for novel unpredictable events (Friston 2003, 2005).

With regard to such a temporal context–based prediction in vision, a wide range of behavioral (Howard et al. 1992; Kelly et al. 2003), electrophysiological (Eimer et al. 1996; Rüsseler and Rösler 2000; Bubic et al. 2010), and brain imaging studies (Schubotz and von Cramon 2002a, 2002b; Wolfensteller et al. 2007; Bubic et al. 2009) have shown that these processes require the observer’s intention to use the sequential regularities to predict upcoming events. However, it has also been suggested that such an explicit intention may not always be required, and that predictions about upcoming events can be formed in an automatic and obligatory manner, especially when the sequential regularities are relatively simple. For instance, behavioral studies on infants’ anticipatory eye movements (Haith and McCarty 1990; Canfield and Haith 1991), representational momentum (Freyd and Finke 1984; Kelly and Freyd 1987; Hayes and Freyd 2002), implicit perceptual sequence learning (Mayr 1996; Remillard 2003), and, possibly, a flash-lag effect (MacKay 1958; Nijhawan 1994; but see also Whitney and Murakami 1998; Eagleman and Sejnowski 2000) have demonstrated the existence of some prediction mechanisms that allow the extraction of sequential regularities and the formation of predictions, without any intention.

Visual Mismatch Negativity

Further evidence for the existence of unintentional temporal context–based prediction in vision comes from a series of electrophysiological studies on visual mismatch negativity (MMN), an event-related brain potential (ERP) component (for reviews, see Czigler 2007; Kimura et al. 2011). Visual MMN is a negative-going ERP component with a posterior scalp distribution that usually peaks at around 200–400 ms after the onset of visual stimuli. Visual MMN has been observed in response to visual stimuli that violate regular patterns embedded in successive visual stimulations, such as those that violate a simple repeating pattern in the so-called “oddball” sequences (OOOOX; Czigler et al. 2002; Kimura et al. 2009, Kimura, Ohira, and Schröger 2010; Kimura, Schröger, Czigler, and Ohira 2010), those that violate an alternating pattern (OOXXOOXXX; Czigler et al. 2006), and those that violate a more complex and abstract pattern (OO-XX-XX-OO-OX; Stefanics et al. 2011). In addition, although visual MMN has been repeatedly observed in response to stimuli that violate regularity in oddball sequences (OOOOXOOOOOXOOOX), it can be eliminated when the order of 2 stimuli is patterned (OOOOXOOOOXOOOOX; Kimura, Widmann, and Schröger 2010a; see also Kimura, Widmann, and Schröger 2010b). Based on these findings, visual MMN is thought to be an electrophysiological correlate of prediction error responses that are generated when a current event is incongruent with events predicted on the basis of sequential regularities (Kimura et al. 2011). Importantly, all of these findings were obtained when low-level visual features in which the violations of sequential patterns occurred (e.g., color in Czigler et al. 2006, orientation in Kimura et al. 2009, and luminance in Kimura, Widmann, and Schröger 2010a) were unrelated to the participant’s task, and there was no need and no task-related benefit to intentionally make predictions about upcoming events. 
Therefore, these visual MMN findings can be regarded as ample evidence for the existence of unintentional temporal context–based prediction in vision (Kimura et al. 2011).

Visual MMN and Emotional Faces

Based on these visual MMN findings as well as the aforementioned behavioral findings, the concept of unintentional temporal context–based prediction in vision itself seems well established. However, it is not clear whether or not this unintentional temporal context–based prediction operates on a similar principle for biologically relevant visual stimuli, since almost all of the evidence for unintentional temporal context–based prediction has been obtained with the use of nonbiological visual stimuli, such as simple geometric figures. This question is not trivial, since it is related to the important issue of whether or not the concept of unintentional temporal context–based prediction can be generalized to a wide range of biological visual stimuli (e.g., familiar objects, humans, and other species). Regarding this question, ERP studies using an oddball sequence consisting of human face stimuli have provided some tentative evidence (Susac et al. 2004; Zhao and Li 2006; Astikainen and Hietanen 2009; Chang et al. 2010). For example, Susac et al. (2004) used an oddball sequence consisting of frequent happy (H) and infrequent neutral faces (N) and observed a visual MMN–like negativity in response to the infrequent faces (HHHHN; called expression-related mismatch negativity [EMMN]) under a situation where the emotional faces themselves were unrelated to the participant’s task (i.e., detection of faces with glasses). The elicitation of EMMN was replicated for infrequent fearful and infrequent happy faces inserted in repetitive neutral faces (Astikainen and Hietanen 2009) and for infrequent sad and infrequent happy faces inserted in repetitive neutral faces (Zhao and Li 2006; Chang et al. 2010).
Importantly, EMMN is considered to be associated with regularity violations not only in low-level visual features that comprise the face (e.g., local shapes of the eyes and mouth) but also in holistic face dimensions, as supported by face-inversion (Thompson 1980; Searcy and Bartlett 1996) and negative-bias effects (Hansen and Hansen 1988) in EMMN (Susac et al. 2004; Chang et al. 2010). Taken together, these EMMN findings are consistent with the idea that there exists unintentional temporal context–based prediction of biological visual stimuli and that it may operate on a principle similar to that for nonbiological visual stimuli.

Present Study

However, this idea should be tested more directly for at least 2 reasons. First, due to the exclusive use of an oddball sequence in those studies, the elicitation of EMMN is not straightforward evidence for the existence of unintentional temporal context–based prediction of emotional faces. As has often been pointed out in the MMN literature, the elicitation of MMN in an oddball sequence can be explained in terms not only of prediction errors but also of memory mismatches (Czigler 2007; Kimura et al. 2011). Thus, although the EMMN might reflect prediction error responses caused by the violation of sequential regularities embedded in the repetitive presentation of emotional faces, it is also possible that EMMN simply reflects mismatches between an infrequently presented emotional face and the memory of a frequently presented emotional face.

To provide more direct evidence for the existence of unintentional temporal context–based prediction of emotional faces, in the present study, we used upright and inverted stimulus sequences (Fig. 1), where 2 emotional faces (fearful and happy faces) were regularly alternated (regular fearful and regular happy faces), and these faces were suddenly repeated (irregular fearful and irregular happy faces) under a situation where the emotional faces themselves were unrelated to the participant’s task (detection of faces with glasses, cf. Susac et al. 2004). We expected that 1) if upcoming emotional faces were unintentionally predicted, then prediction error responses should be evoked in response to irregular compared with regular faces, 2) if the EMMN observed in previous oddball studies reflected prediction errors rather than memory mismatches, then prediction error responses would emerge as EMMN (i.e., a posterior negativity that peaks at around 200–400 ms), and 3) if the EMMN, at least partly, reflected prediction errors at the level of holistic face dimensions, then the onset latencies of the EMMN would be delayed and/or the amplitudes of the EMMN would be reduced for inverted compared with upright sequences (face-inversion effect) and for happy compared with fearful faces (negative-bias effect; cf. Susac et al. 2004; Chang et al. 2010).

Figure 1.

Schematic illustrations of the upright and inverted sequences.

Second, although EMMN has been shown to have morphologies similar to those of traditional visual MMN, further investigations are necessary to determine whether or not unintentional temporal context–based prediction operates on a similar principle for biological and nonbiological visual stimuli since no previous studies have directly tested their similarity in neural substrates. To address this issue, in the present study, we examined neural generators of the EMMN by using standardized low-resolution brain electromagnetic tomography (sLORETA; Pascual-Marqui 2002). With regard to neural substrates of the unintentional temporal context–based prediction of nonbiological visual stimuli, previous studies have provided converging evidence that suggests engagement of the visual and prefrontal areas. For instance, previous studies on neural generators of prediction error responses reflected by visual MMN with ERP (Pazo-Alvarez et al. 2004; Kimura, Ohira, and Schröger 2010), magnetoencephalography (Urakawa et al. 2010), and functional magnetic resonance imaging (fMRI; Yucel et al. 2007) have shown that visual MMN is mainly generated from the visual (typically, the visual extrastriate areas responsible for attributes in which regularity violations occurred) and prefrontal areas. Also, an fMRI study on neural substrates of perceptual sequence learning (Huettel et al. 2002) showed that prediction error responses obligatorily evoked by violations of regular alternations of 2 geometric visual stimuli (OXOXOO), as well as violations of regular repetitions (OOOOX), were generated from the prefrontal areas. (In this study, no data were recorded from brain areas other than the frontal areas.) Further, an fMRI study on neural substrates of representational momentum (Rao et al. 2004) showed that the presentation of successive regular displacements of a bar’s orientation, which was followed by the occurrence of representational momentum, evoked activations in the prefrontal areas.
Based on these findings, we expected that if the unintentional temporal context–based prediction of emotional faces operates on a principle similar to that of nonbiological visual stimuli, then neural generators of the EMMN would include the face-responsible visual areas (e.g., the occipitotemporal visual extrastriate areas including the fusiform gyrus; for a review, see Haxby et al. 2000) and the prefrontal areas.

However, a partly different expectation would also be possible. Recent brain imaging studies on social cognition have shown that, unlike the case for observation of nonbiological visual stimuli, observation of biological visual stimuli such as the facial expression of others can lead to activations in the motor-related areas in a relatively automatic manner (Carr et al. 2003; Leslie et al. 2004; see also Dimberg et al. 2000, 2002). For example, Leslie et al. (2004) showed that not only active imitation but also passive observation of others’ facial expression led to activations in the motor-related areas such as the ventral premotor areas. In light of the mirror neuron system (Gallese et al. 1996; Rizzolatti et al. 1996), such motor-related activations have been associated with the concept of simulation (i.e., mapping of the observed others’ actions onto our own motor system) and regarded as neural underpinnings of social communications such as imitation, empathy, and understanding of others’ intentions (for reviews, see Blakemore and Decety 2001; Miall 2003; Gallese et al. 2004; Blakemore and Frith 2005; Wilson and Knoblich 2005; Frith and Frith 2006). According to this view, it is assumed that the unintentional temporal context–based prediction of emotional faces may operate on a principle partly different from that of nonbiological visual stimuli. That is, it may operate on the basis of sequential regularities not only at the perceptual level but also at the motoric level (for similar concepts, see Nattkemper and Prinz 1997; Schubotz 2007). Based on this assumption, we also expected that neural substrates of the unintentional temporal context–based prediction of emotional faces might include not only the visual and prefrontal areas but also the motor-related areas, and therefore, neural generators of prediction error responses reflected by the EMMN would include not only the face-responsible visual and prefrontal areas but also the motor-related areas.

Finally, to explore neural mechanisms of the unintentional temporal context–based prediction of emotional faces, in the present study, we further examined time courses of the neural generators by conducting sLORETA for successive time windows of the EMMN. Because no previous studies have investigated this issue, it was difficult to form clear expectations. However, one possible expectation is that activations in prefrontal (and possibly, motor-related) areas might be preceded by those in face-responsible visual areas. This expectation was primarily based on a recent computational framework (Friston 2003, 2005) that proposed that perceptual predictions are interactive processes between 1) top–down prediction signals that are conveyed via feedback connections from higher cortical areas (where the prediction signals are generated) to lower sensory areas (where a current sensory event and predicted events are compared), and 2) bottom–up prediction error signals that are conveyed via feedforward connections from the lower cortical areas (where prediction error signals are initially generated) to the higher cortical areas (where predictive models are updated accordingly and new prediction signals are generated). Our expectation was also based on a wide range of reports in the literature on sensory predictions that have suggested that 1) the prefrontal and/or motor-related areas play a central role in the formation of sensory predictions and the updating of predictive models (Blakemore et al. 2000; Huettel et al. 2002; Rao et al. 2004; Bar et al. 2006; Schubotz 2007; Summerfield and Koechlin 2008), while 2) the sensory areas are regions where the sensory predictions are imposed and a current sensory event and predicted sensory events are compared (Brunia 1999; Blakemore et al. 2000; Gomez et al. 2004; Schubotz 2007). 
Thus, if the activations in the face-responsible visual areas are related to the initial generation of prediction error signals as a result of such comparisons and those in the prefrontal (and the motor-related) areas are associated with the updating of predictive models based on error signals, then prefrontal (and motor-related) activations would be preceded by (or, at the earliest, occur simultaneously with) face-responsible visual activations.
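The interactive scheme just described can be summarized in a simplified two-level form. The notation below is ours and is only a sketch in the spirit of the computational framework cited above (Friston 2003, 2005), not a model tested in the present study:

```latex
% Two-level predictive-coding sketch (our notation, hypothetical):
% level i+1 sends a top-down prediction g(\phi_{i+1}) via feedback
% connections; level i returns a bottom-up prediction error \varepsilon_i.
\varepsilon_i = \phi_i - g(\phi_{i+1})
% The higher level then updates its representation (the predictive model)
% so as to reduce the squared error carried by feedforward connections:
\Delta\phi_{i+1} \propto -\,\frac{\partial}{\partial\phi_{i+1}}\,
  \lVert \varepsilon_i \rVert^{2}
```

On this reading, the comparison generating \(\varepsilon_i\) occurs in the lower (sensory) areas, while the update of \(\phi_{i+1}\) occurs in the higher (prefrontal, and possibly motor-related) areas, which is what motivates the predicted temporal ordering of the activations.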

Materials and Methods

Participants

Twelve students (10 women, 2 men; age range = 20–29 years, M = 22.1 years) participated in this experiment. Eleven participants were right-handed, and one was left-handed. All participants had normal or corrected-to-normal vision and were free of neurological or psychiatric disorders. Written informed consent was obtained from each participant after the nature of the study had been explained.

Stimuli and Procedure

Thirty-two black-and-white oval pictures of faces at a central location against a black background (visual angle of 3.4° [width] × 5.7° [height] from a viewing distance of 100 cm) were used; they were defined by 4 models (2 females and 2 males), 2 facial expressions (fearful, happy), 2 orientations (upright, inverted), and 2 glasses conditions (with, without). These pictures were made by modifying original pictures of fearful and happy faces of 4 different models selected from the Pictures of Facial Affect (Ekman and Friesen 1976). The duration of stimuli was 250 ms, and the stimulus onset asynchrony was 500 ms in all conditions.

Figure 1 shows schematic illustrations of the upright and inverted sequences. In the upright sequence, 4 upright faces of 1 model were used (fearful face without glasses, happy face without glasses, fearful face with glasses, and happy face with glasses). The fearful and happy faces without glasses were regularly alternated (regular fearful nontarget and regular happy nontarget stimuli), and these stimuli were suddenly repeated (irregular fearful nontarget and irregular happy nontarget stimuli). Also, these 4 stimuli without glasses were occasionally replaced by those with glasses (regular fearful target, regular happy target, irregular fearful target, and irregular happy target stimuli). Each experimental block consisted of 256 trials, where the number of each stimulus was 105 (regular fearful nontarget), 105 (regular happy nontarget), 15 (irregular fearful nontarget), 15 (irregular happy nontarget), 7 (regular fearful target), 7 (regular happy target), 1 (irregular fearful target), and 1 (irregular happy target). The inverted sequence was completely the same as the upright sequence except for the orientation of the face stimuli.
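As an illustration of this sequence logic, one such block can be sketched in Python. This is not the authors' stimulus-generation code: the purely random placement of irregular and target trials (with no spacing constraints) is an assumption, and the exact per-cell counts of the actual design are not reproduced.

```python
import random

def build_block(rng, n_trials=256, n_irregular=32, n_targets=16):
    """Sketch of one block: a fearful/happy alternation in which some
    trials repeat the preceding expression (irregular) and some wear
    glasses (target). Placement constraints are assumptions."""
    # Trials at which the alternation is violated (never the first trial).
    irregular = set(rng.sample(range(1, n_trials), n_irregular))
    targets = set(rng.sample(range(n_trials), n_targets))
    seq = []
    expr = "fearful"  # expression of the first trial
    for i in range(n_trials):
        if i in irregular:
            pass  # repeat the previous expression (regularity violation)
        elif i > 0:
            expr = "happy" if expr == "fearful" else "fearful"
        seq.append({"expression": expr,
                    "regularity": "irregular" if i in irregular else "regular",
                    "glasses": i in targets})
    return seq
```

Counting the resulting trial types recovers the alternation-with-sudden-repetitions structure described above; after each irregular repetition, the alternation simply resumes from the repeated face.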

The experiment consisted of 16 experimental blocks (8 blocks for the upright sequence [2 blocks for each model] and 8 blocks for the inverted sequence [2 blocks for each model]). The order of blocks was randomized across participants. In all blocks, the participant was seated in a reclining chair in a sound-attenuated and electrically shielded room and instructed to ignore attributes (facial expressions and identities) of the face stimuli and to press a button with the right thumb as quickly and accurately as possible when face stimuli with glasses (target stimuli) were presented. They were also asked to focus on the center of the display and to minimize any eye movement during each block.

Recordings

The electroencephalogram (EEG) was recorded from 25 silver–silver chloride cup electrodes attached to an electrocap (Fp1, Fp2, F7, F3, Fz, F4, F8, T3, C3, Cz, C4, T4, T5, P3, Pz, P4, T6, PO7, PO3, POz, PO4, PO8, O1, Oz, and O2 according to the extended International 10-20 System). All electrodes were referenced to the nose tip. Blinks and eye movements were monitored with electrodes above and below the right eye (vertical electrooculogram [EOG]) and at the right and left outer canthi of the eyes (horizontal EOG). The impedance of all electrodes was kept below 5 kΩ. EEG and EOG signals were digitized at a sampling rate of 500 Hz. Offline, the EEG and EOG signals were digitally bandpass-filtered at 1–30 Hz with a finite impulse response (FIR) filter (909-point Kaiser-windowed sinc FIR filter; Kaiser beta = 5.653, maximum passband deviation = 0.001, transition band width = 2 Hz). The EEG and EOG signals were then averaged for 8 categories defined by 4 nontarget stimuli (regular fearful, regular happy, irregular fearful, and irregular happy stimuli) and 2 orientations (upright and inverted sequences). Averaging epochs were 600 ms, including a 100-ms prestimulus baseline. In the averaging procedure, 1) the first 4 epochs in each block, 2) epochs during which the participants made a button press, 3) 2 epochs preceded by target stimuli, 4) 2 epochs preceded by irregular stimuli, and 5) epochs in which the signal changes exceeded ±50 μV on any of the electrodes were omitted. As a result, the mean number of averaged epochs for each nontarget stimulus was 359 (upright regular fearful), 361 (upright regular happy), 86 (upright irregular fearful), 87 (upright irregular happy), 367 (inverted regular fearful), 369 (inverted regular happy), 87 (inverted irregular fearful), and 87 (inverted irregular happy).
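The offline filtering step can be reproduced in outline with standard tools. The sketch below (Python with SciPy, our choice of software, not necessarily what was used in the study) builds a 909-tap Kaiser-windowed sinc FIR band-pass with the reported parameters; whether the original filtering was zero-phase is not stated, so the use of filtfilt here is an assumption.

```python
import numpy as np
from scipy.signal import firwin, filtfilt

def bandpass_1_30(eeg, fs=500.0):
    """Band-pass comparable to the Methods: 909-tap Kaiser-windowed sinc
    FIR, 1-30 Hz passband, Kaiser beta = 5.653 as reported. Applied here
    forward and backward (zero-phase), which is an assumption."""
    taps = firwin(909, [1.0, 30.0], window=("kaiser", 5.653),
                  pass_zero=False, fs=fs)
    return filtfilt(taps, [1.0], eeg)
```

With the reported beta, the stopband attenuation is roughly 60 dB per pass, so slow drifts well below 1 Hz are essentially removed while in-band components such as 10-Hz activity pass unchanged.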

Data Analysis

Behavioral Performance

Behavioral performance was measured in terms of reaction time (ms), hit rate (%), and false alarm rate (%) for 4 conditions (upright fearful, upright happy, inverted fearful, and inverted happy conditions). Since the number of irregular target trials was quite low, the irregular and regular trials were collapsed. Responses were scored as a hit if the button was pressed within 100–800 ms after target onset. Responses outside this period were classified as false alarms. These measures were subjected to repeated-measures ANOVAs with 2 factors: 2 orientations (Upright, Inverted) and 2 facial expressions (Fearful, Happy). The effect sizes are shown as partial eta squared (η2).
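The scoring rule for hits and false alarms can be made concrete as follows. This is a sketch: the way presses are matched to targets (first unconsumed target within the window) is our implementation choice, not described in the original Methods.

```python
def score_responses(target_onsets_ms, press_times_ms, lo=100.0, hi=800.0):
    """Classify button presses as hits (100-800 ms after a target onset)
    or false alarms, following the scoring rule in the Data Analysis."""
    hits = set()
    false_alarms = 0
    for p in press_times_ms:
        matched = False
        for i, t in enumerate(target_onsets_ms):
            if i not in hits and lo <= p - t <= hi:
                hits.add(i)       # press credited to this target
                matched = True
                break
        if not matched:
            false_alarms += 1     # press outside every response window
    hit_rate = 100.0 * len(hits) / len(target_onsets_ms)
    return hit_rate, false_alarms
```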

ERPs and Difference Waves

To estimate the regularity violation effects, grand-average irregular-minus-regular difference waves were calculated for the upright fearful, upright happy, inverted fearful, and inverted happy conditions. In the difference waves for the fearful conditions, a right parieto-occipital (PO8) maximum negativity (EMMN) that peaked at around 276–286 ms (upright) and 340–350 ms (inverted) was observed. In the difference waves for the happy conditions, a midline parieto-occipital (POz) maximum negativity (EMMN) that peaked at around 348–358 ms (upright) and 368–378 ms (inverted) was observed.

Mean Amplitudes of EMMN

To test the significance of the elicitation of EMMN, mean amplitudes of the difference waves within the corresponding 10-ms time windows at the corresponding electrode were subjected to one-tailed paired t-tests. The effect sizes are presented as d values. Further, to compare the mean amplitudes of EMMN among the 4 conditions, the amplitudes were subjected to repeated-measures ANOVAs with 2 factors: 2 orientations (Upright, Inverted) and 2 facial expressions (Fearful, Happy). The effect sizes are shown as partial η2.
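The amplitude analysis amounts to averaging the irregular-minus-regular difference wave within the 10-ms window and testing it against zero. A minimal sketch (array shapes and function names are ours):

```python
import numpy as np
from scipy import stats

def emmn_mean_amplitude(irregular, regular, times, t0, t1):
    """Per-participant mean amplitude of the irregular-minus-regular
    difference wave in the window [t0, t1) ms.
    irregular/regular: participants x samples arrays at one electrode."""
    win = (times >= t0) & (times < t1)
    diff = irregular - regular
    return diff[:, win].mean(axis=1)

def one_tailed_t(amplitudes):
    """One-tailed paired t-test of the amplitudes against zero (EMMN is
    predicted to be negative), with Cohen's d as the effect size."""
    t, p_two = stats.ttest_1samp(amplitudes, 0.0)
    p_one = p_two / 2 if t < 0 else 1 - p_two / 2
    d = amplitudes.mean() / amplitudes.std(ddof=1)
    return t, p_one, d
```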

Onset Latencies of EMMN

To estimate the onset latencies of EMMN, a jackknife method combined with the relative criterion technique (Miller et al. 1998; Ulrich and Miller 2001; Kiesel et al. 2008) was used. With regard to 12 sub–grand-average irregular-minus-regular difference waves for each of 4 conditions (at the PO8 electrode for the fearful upright and inverted conditions and at the POz electrode for the happy upright and inverted conditions), the onset latency was evaluated as the time at which the difference waves reached 50% of the peak amplitude of EMMN, starting at 200 ms after stimulus onset. Next, to compare the onset latencies of EMMN among the 4 conditions, the evaluated values were subjected to conventional repeated-measures ANOVAs with 2 factors: 2 orientations (Upright, Inverted) and 2 facial expressions (Fearful, Happy). The F values were adjusted according to Ulrich and Miller (2001). The effect sizes are shown as partial η2.
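The jackknife/relative-criterion procedure can be sketched as follows. The search at the sampling resolution (no interpolation between samples) is a simplification, and the function names are ours.

```python
import numpy as np

def jackknife_onsets(diff_waves, times, start_ms=200.0, criterion=0.5):
    """Leave-one-out (jackknife) onset estimation (Miller et al. 1998;
    Ulrich and Miller 2001): for each sub-grand-average, the onset is the
    first time after start_ms at which the wave reaches 50% of its
    (negative) peak amplitude. diff_waves: participants x samples."""
    n = diff_waves.shape[0]
    onsets = []
    for i in range(n):
        sub = np.delete(diff_waves, i, axis=0).mean(axis=0)
        search = times >= start_ms
        peak = sub[search].min()            # EMMN is negative-going
        thresh = criterion * peak
        idx = np.flatnonzero(search & (sub <= thresh))
        onsets.append(times[idx[0]])
    return np.array(onsets)

def adjusted_F(F, n):
    """Ulrich and Miller (2001) correction for F values computed on
    jackknifed scores: F_adjusted = F / (n - 1)**2."""
    return F / (n - 1) ** 2
```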

Neural Generators of EMMN

sLORETA was used to estimate the neural generators of EMMN. Briefly, sLORETA computes the cortical 3D distribution of the current source density of scalp-recorded EEG and provides a single solution to the inverse problem of the location of cerebral sources. This method is based on a standardized current source density as well as a 3-shell spherical head model registered to the digitized atlas provided by the Brain Imaging Centre, Montreal Neurological Institute (MNI; Talairach and Tournoux 1988). It calculates the standardized current source density at each voxel in the gray matter and the hippocampus of the MNI reference brain (6239 voxels at a spatial resolution of 5 mm) and estimates the current sources under the assumption that neighboring voxels have maximally similar electrical activity. This method has been confirmed to achieve zero localization error in noise-free simulations (Pascual-Marqui 2002).

sLORETA was conducted for successive time windows of EMMN to estimate the neural generators and their time courses. For the EMMN in the upright fearful conditions, sLORETA images for difference waves within the 250- to 290-, 290- to 330-, and 330- to 370-ms time windows were calculated. For the EMMN in the upright happy conditions, sLORETA images for difference waves within the 300- to 340- and 340- to 380-ms time windows were calculated. These sLORETA images were then compared with zero with voxel-by-voxel t-tests. Statistical significance was assessed using a randomization test (number of randomizations: 5000) based on statistical nonparametric mapping (Nichols and Holmes 2002), which corrects for multiple comparisons. Voxels with significant differences (P < 0.01) were projected in specific brain regions.
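The thresholding step, voxel-wise t values corrected for multiple comparisons via the permutation distribution of the maximum statistic in the spirit of Nichols and Holmes (2002), can be sketched as follows. sLORETA itself is not reimplemented here, and the use of sign-flip permutations for a one-sample design is an assumption.

```python
import numpy as np

def snpm_max_t(data, n_perm=5000, alpha=0.01, rng=None):
    """Statistical nonparametric mapping sketch: voxel-wise one-sample
    t values, with the corrected critical t taken from the permutation
    distribution of max |t| under random sign flips of participants.
    data: participants x voxels (e.g., sLORETA difference images)."""
    rng = rng if rng is not None else np.random.default_rng(0)
    n = data.shape[0]

    def tvals(x):
        return x.mean(0) / (x.std(0, ddof=1) / np.sqrt(n))

    observed = tvals(data)
    max_t = np.empty(n_perm)
    for k in range(n_perm):
        signs = rng.choice([-1.0, 1.0], size=(n, 1))  # flip whole rows
        max_t[k] = np.abs(tvals(data * signs)).max()
    critical = np.quantile(max_t, 1.0 - alpha)        # FWER-corrected
    return observed, critical, np.abs(observed) >= critical
```

Because the critical value is the (1 − α) quantile of the maximum statistic over all voxels, any voxel exceeding it is significant at the family-wise level α, which matches the corrected P < 0.01 threshold reported in this section.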

Table 1

Behavioral performance (standard deviations in parentheses)

                       Fearful                       Happy
                       Upright        Inverted       Upright        Inverted
Reaction time (ms)     549.4 (35.4)   566.3 (39.6)   547.4 (41.0)   558.6 (42.3)
Hit rate (%)           80.6 (16.0)    78.6 (21.3)    85.6 (13.8)    81.0 (23.6)
False alarm rate (%)   0.07 (0.09)    0.07 (0.12)    0.07 (0.07)    0.04 (0.07)
Table 2

Brain regions that showed significant activation for EMMN in response to upright irregular fearful faces

Region (BA) R/L 250–290 ms 290–330 ms 330–370 ms 
x y z t value x y z t value x y z t value 
Frontal lobe 
 Medial frontal gyrus (11/10) 10 62 −16 9.96 10 62 −16 7.54 10 62 −16 8.17 
−5 62 −16 9.57 −5 62 −16 7.01 −5 62 −16 7.75 
 Orbital gyrus (11) 10 52 −24 8.98 10 52 −24 7.20 10 52 −24 7.67 
−10 52 −19 8.37 −10 52 −19 6.53 −10 52 −19 6.89 
 Rectal gyrus (11) 52 −24 8.79 52 −24 6.97 52 −24 7.48 
−5 52 −24 8.67 −5 52 −24 6.78 −5 52 −24 7.19 
 Superior frontal gyrus (11/10) 15 62 −16 10.0 20 63 −12 7.58 15 62 −16 8.24 
−5 63 −12 9.50 −5 57 −20 6.93 −5 63 −12 7.71 
 Middle frontal gyrus (11/10/47) 35 58 −11 8.93 35 53 −15 7.23 35 58 −11 7.65 
−35 58 −11 7.73 −20 53 −11 5.89 −20 53 −11 6.14 
 Inferior frontal gyrus (10/47) 50 43 −11 7.95 45 38 −15 6.35 40 54 6.99 
−40 54 6.95         
Limbic lobe 
 Anterior cingulate (32/10) 10 48 −2 7.32 10 48 −2 6.18 10 48 −2 6.72 
−5 48 −2 6.99     −5 48 −2 6.12 
Temporal lobe 
    Fusiform gyrus (20) 59 −16 −24 7.02         
    Inferior temporal gyrus (20/21) 64 −15 −24 7.30         
    Middle temporal gyrus (21) 64 −15 −12 7.04         
Occipital lobe 
    Inferior occipital gyrus (18)         −40 −87 6.09 
    Middle occipital gyrus (18/19)         −30 −92 6.25 

Note: Brain areas that showed statistically significant differences (P < 0.01) are shown. The critical t value for P < 0.01 was 6.50 (250–290 ms), 5.67 (290–330 ms), and 6.07 (330–370 ms), respectively. Three-dimensional x-, y-, and z-coordinates are given for the Talairach space (Talairach and Tournoux 1988). The x-, y-, and z-coordinates and t values show the maximum value for each location. BA, Brodmann’s area; R, right hemisphere; L, left hemisphere.

Table 3

Brain regions that showed significant activation for EMMN in response to upright irregular happy faces

Region (BA) R/L 300–340 ms 340–380 ms 
x y z t value x y z t value 
Frontal lobe 
 Medial frontal gyrus (11/10) 10 62 −16 6.55 10 62 −16 7.07 
−5 62 −16 6.56 −5 62 −16 6.83 
 Orbital gyrus (11) 10 52 −24 6.21 10 52 −24 6.86 
−10 52 −19 5.97 −10 52 −19 6.25 
 Rectal gyrus (11) 52 −24 6.15 52 −24 6.72 
−5 52 −24 6.12 −5 52 −24 6.55 
 Superior frontal gyrus (11/10) 15 62 −16 6.69 15 62 −16 7.16 
−5 63 −12 6.48 −5 57 −20 6.79 
 Middle frontal gyrus (11/10/47) 35 58 −11 6.33 35 58 −11 6.54 
−20 53 −11 5.95     
    Inferior frontal gyrus (10/47) 50 43 −11 6.04 50 43 −11 6.21 
Limbic lobe 
 Anterior cingulate (32/10) 10 48 −2 5.88     
−5 48 −2 5.71     
Temporal lobe 
    Fusiform gyrus (20) 50 −21 −24 5.63     
    Inferior temporal gyrus (20) 45 −16 −29 5.69     
    Superior temporal gyrus (38) 50 24 −14 5.81     
Occipital lobe 
 Cuneus (18/19)     −86 37 7.05 
    −5 −91 28 7.12 

Note: Brain areas that showed statistically significant differences (P < 0.01) are shown. The critical t values for P < 0.01 were 5.60 (300–340 ms) and 5.89 (340–380 ms). Three-dimensional x-, y-, and z-coordinates are given for the Talairach space (Talairach and Tournoux 1988). The x-, y-, and z-coordinates and t values show the maximum value for each location. BA, Brodmann’s area; R, right hemisphere; L, left hemisphere.

Results

Behavioral Performance

Table 1 shows the behavioral performance in the 4 conditions. With regard to the reaction times (ms), a two-way ANOVA (2 orientations × 2 facial expressions) revealed a significant main effect of orientation (F1,11 = 19.00, P < 0.01, partial η2 = 0.63), indicating that reaction times were longer in the inverted conditions than in the upright conditions. With regard to the hit and false alarm rates, two-way ANOVAs revealed no significant effects.
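As an illustration of the statistics reported here: in a 2 × 2 within-subject design (orientation × facial expression), each factor has only 2 levels, so every ANOVA effect reduces to a paired t-test on a within-subject contrast, with F = t². The following Python sketch (hypothetical data, not the study’s analysis code) shows that computation, including the partial η2 effect size reported above.

```python
import numpy as np

def rm_anova_2x2(data):
    """Repeated-measures 2x2 ANOVA via paired contrasts.

    data: array of shape (n_subjects, 2, 2), with axes
    (subject, orientation, facial expression). With one degree of
    freedom per factor, each effect is a paired t-test on a
    within-subject contrast, and F = t**2.
    Returns {effect: (F, df1, df2, partial_eta_squared)}.
    """
    n = data.shape[0]
    contrasts = {
        "orientation": data[:, 0, :].mean(axis=1) - data[:, 1, :].mean(axis=1),
        "expression":  data[:, :, 0].mean(axis=1) - data[:, :, 1].mean(axis=1),
        "interaction": (data[:, 0, 0] - data[:, 0, 1])
                       - (data[:, 1, 0] - data[:, 1, 1]),
    }
    out = {}
    for name, d in contrasts.items():
        t = d.mean() / (d.std(ddof=1) / np.sqrt(n))   # paired t statistic
        F = t ** 2                                    # F with df = (1, n - 1)
        out[name] = (F, 1, n - 1, F / (F + (n - 1)))  # partial eta^2 = F/(F+df2)
    return out
```

With n = 12 participants, df = (1, 11) as in the reaction-time analysis above.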

ERPs and Difference Waves

Figure 2A shows the grand-average ERPs elicited by irregular and regular stimuli and the grand-average irregular-minus-regular difference waves in the fearful (left 2 columns) and happy conditions (right 2 columns). In the fearful conditions, a posterior negativity (EMMN) that peaked at around 276–286 ms (upright) and 340–350 ms (inverted) was observed in the difference waves, while in the happy conditions, a posterior negativity (EMMN) that peaked at around 348–358 ms (upright) and 368–378 ms (inverted) was observed in the difference waves. Figure 2B shows topographical maps of the difference waves within the corresponding 10-ms time windows. The EMMN in the fearful conditions had a scalp distribution that peaked at a right parieto-occipital electrode (PO8), while the EMMN in the happy conditions had a scalp distribution that peaked at a midline parieto-occipital electrode (POz).

Figure 2.

(A) Grand-average ERPs elicited by irregular and regular stimuli and grand-average irregular-minus-regular difference waves for the upright fearful, inverted fearful, upright happy, and inverted happy conditions. (B) Topographical maps of the difference waves for the upright fearful, inverted fearful, upright happy, and inverted happy conditions.


Mean Amplitudes of EMMN

Figure 3A shows the grand-average mean amplitudes of EMMN (276–286 ms at PO8 for the upright fearful, 340–350 ms at PO8 for the inverted fearful, 348–358 ms at POz for the upright happy, and 368–378 ms at POz for the inverted happy conditions). The mean amplitude was −1.54 μV (standard error [SE] = 0.44) in the upright fearful, −0.93 μV (0.38) in the inverted fearful, −1.04 μV (0.45) in the upright happy, and −0.47 μV (0.24) in the inverted happy conditions. One-tailed paired t-tests showed that EMMN was significantly elicited in the upright fearful (t11 = −3.51, P < 0.01, d = 1.01), inverted fearful (t11 = −2.47, P < 0.05, d = 0.71), upright happy (t11 = −2.31, P < 0.05, d = 0.67), and inverted happy conditions (t11 = −1.97, P < 0.05, d = 0.57). However, a two-way ANOVA (2 orientations × 2 facial expressions) revealed no significant effects.
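The quantities entering this analysis (the irregular-minus-regular difference wave, its mean amplitude within a 10-ms window, and a one-tailed paired t-test against zero) can be sketched as follows. The arrays are hypothetical stand-ins for single-electrode data (e.g., PO8 or POz), not the study’s recordings.

```python
import numpy as np

def emmn_mean_amplitude(erp_irregular, erp_regular, times, window):
    """Difference wave and mean-amplitude test at one electrode.

    erp_irregular, erp_regular: arrays (n_subjects, n_samples);
    times: sample times in ms; window: (start_ms, end_ms), e.g.,
    (276, 286) for the upright fearful condition.
    Returns per-subject mean amplitudes, the paired t statistic
    against zero (EMMN predicted to be negative), and its df.
    """
    diff = erp_irregular - erp_regular                  # difference wave
    mask = (times >= window[0]) & (times <= window[1])  # 10-ms window
    amps = diff[:, mask].mean(axis=1)                   # mean amplitude per subject
    n = len(amps)
    t = amps.mean() / (amps.std(ddof=1) / np.sqrt(n))   # one-tailed: require t < 0
    return amps, t, n - 1
```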

Figure 3.

(A) Grand-average mean amplitudes of EMMN for the upright fearful (276–286 ms at PO8), inverted fearful (340–350 ms at PO8), upright happy (348–358 ms at POz), and inverted happy conditions (368–378 ms at POz). Error bars indicate standard errors of mean. Asterisks show the significance of the elicitation of EMMN (*P < 0.05 and **P < 0.01 by one-tailed paired t-tests). (B) Grand-average onset latencies of EMMN for the upright fearful (PO8), inverted fearful (PO8), upright happy (POz), and inverted happy conditions (POz). Error bars indicate standard errors of mean.

Onset Latencies of EMMN

Figure 3B shows the grand-average onset latencies of EMMN (PO8 for the fearful and POz for the happy conditions). The onset latency was 250.8 ms (SE = 0.3) in the upright fearful, 303.3 ms (0.9) in the inverted fearful, 311.3 ms (0.9) in the upright happy, and 362.3 ms (0.5) in the inverted happy conditions. A two-way ANOVA (2 orientations × 2 facial expressions) revealed significant main effects of orientation (corrected F1,11 = 33.11, P < 0.01, partial η2 = 0.75) and facial expression (corrected F1,11 = 96.76, P < 0.01, partial η2 = 0.90), reflecting longer EMMN onset latencies in the inverted than in the upright conditions and in the happy than in the fearful conditions.
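The corrected F values are consistent with jackknife-based latency scoring (Kiesel et al. 2008), in which onset latencies are measured on leave-one-out grand averages and the resulting F statistic is divided by (n − 1)² before evaluation. Assuming that procedure (the exact onset criterion used in the study is not restated here), the latency-scoring step could be sketched as:

```python
import numpy as np

def jackknife_onset_latencies(diff_waves, times, criterion):
    """Leave-one-out onset latencies for jackknife-based scoring.

    diff_waves: (n_subjects, n_samples) irregular-minus-regular waves
    at one electrode; criterion: amplitude (negative, in uV) that the
    subaverage must reach to count as EMMN onset (a hypothetical
    criterion for illustration). Returns one onset latency (ms) per
    leave-one-out grand average. An ANOVA computed on these scores
    must then be corrected as F_corrected = F / (n - 1)**2
    (Kiesel et al. 2008).
    """
    n = diff_waves.shape[0]
    onsets = []
    for i in range(n):
        sub = np.delete(diff_waves, i, axis=0).mean(axis=0)  # leave-one-out average
        below = np.nonzero(sub <= criterion)[0]              # first criterion crossing
        onsets.append(times[below[0]] if below.size else np.nan)
    return np.array(onsets)
```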

Neural Generators of EMMN

Figure 4A and Table 2 show the current sources of EMMN in response to upright irregular fearful faces. In the early phase of EMMN (250–290 ms), the current sources were located in the 1) temporal (the fusiform, inferior temporal, and middle temporal gyri; right hemisphere only), 2) frontal (the medial frontal, orbital, rectal, superior frontal, middle frontal, and inferior frontal gyri; both hemispheres but lateralized to the right hemisphere), and 3) limbic lobes (the anterior cingulate; both hemispheres but lateralized to the right hemisphere). In the middle phase (290–330 ms), the current sources were located in the 1) frontal and 2) limbic lobes. In the late phase (330–370 ms), the current sources were located in the 1) frontal, 2) limbic, and 3) occipital lobes (the inferior occipital and middle occipital gyri; left hemisphere only).

Figure 4.

(A) Neural generators of EMMN in response to upright irregular fearful faces within the 250- to 290-, 290- to 330-, and 330- to 370-ms time windows. (B) Neural generators of EMMN in response to upright irregular happy faces within the 300- to 340- and 340- to 380-ms time windows. Red areas represent brain regions where significant differences from zero were seen (P < 0.01).

Figure 4B and Table 3 show the current sources of EMMN in response to upright irregular happy faces. In the early phase of EMMN (300–340 ms), the current sources were located in the 1) temporal (the fusiform, inferior temporal, and superior temporal gyri; right hemisphere only), 2) frontal (the medial frontal, orbital, rectal, superior frontal, middle frontal, and inferior frontal gyri; both hemispheres but lateralized to the right hemisphere), and 3) limbic lobes (the anterior cingulate; both hemispheres but lateralized to the right hemisphere). In the late phase (340–380 ms), the current sources were located in the 1) frontal and 2) occipital lobes (the cuneus; both hemispheres but lateralized to the left hemisphere).
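Tables 2 and 3 report, for each anatomical region, the Talairach coordinates and value of the maximum t statistic among voxels exceeding the window-specific critical t. That selection step could be sketched as follows; the arrays are illustrative only, and the actual distributed source analysis is considerably more involved.

```python
import numpy as np

def significant_sources(t_map, coords, regions, t_crit):
    """Select regions whose maximum source-current t exceeds a critical t.

    t_map: (n_voxels,) t statistics of source activity vs. zero within
    one time window; coords: (n_voxels, 3) Talairach x, y, z;
    regions: per-voxel anatomical labels; t_crit: critical t for
    P < 0.01 (e.g., 5.60 for the 300-340 ms window of the happy EMMN).
    Returns, per significant region, the coordinates and value of its
    maximum t, as tabulated in Tables 2 and 3.
    """
    out = {}
    for region in set(regions):
        idx = [i for i, r in enumerate(regions) if r == region]
        i_max = max(idx, key=lambda i: t_map[i])     # voxel with maximum t
        if t_map[i_max] >= t_crit:                   # keep only significant regions
            out[region] = (tuple(coords[i_max]), float(t_map[i_max]))
    return out
```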

Discussion

The present study demonstrated that 1) irregular emotional faces that violated sequential regularities elicited EMMN under a situation where the emotional faces themselves were unrelated to the participant’s task, 2) the onset latencies of EMMN were significantly prolonged in the inverted conditions compared with the upright conditions (face-inversion effect) and in the happy conditions compared with the fearful conditions (negative-bias effect), 3) for both irregular fearful and happy faces, EMMN was associated with activations in the temporal, frontal, limbic, and occipital lobes, while it was not associated with activations in any motor-related areas, and 4) for both irregular fearful and happy faces, temporal activations were observed only in the early phase of EMMN and occipital activations were observed only in the late phase of EMMN, while frontal and limbic activations were observed in almost all phases of EMMN.

Evidence for the Unintentional Temporal Context–Based Prediction of Emotional Faces

When compared with the regular emotional faces, irregular emotional faces evoked robust prediction error responses under a situation where the emotional faces themselves were unrelated to the participant’s task. In addition, the prediction error responses emerged as a posterior negativity with morphologies similar to EMMN (Susac et al. 2004; Zhao and Li 2006; Astikainen and Hietanen 2009; Chang et al. 2010) and traditional visual MMN (Czigler 2007; Kimura et al. 2011). These results indicate that upcoming emotional faces were unintentionally predicted on the basis of sequential regularities and that EMMN observed in previous oddball studies reflected prediction errors rather than memory mismatches. Therefore, these results provide direct evidence for the existence of the unintentional temporal context–based prediction of emotional faces.

One may argue that the present EMMN might still be explained in terms of memory mismatches rather than prediction errors, like the EMMN obtained with oddball sequences. That is, since the present stimulus sequences also included the repetitive presentation of a particular stimulus pattern (e.g., fearful to happy faces; FH), the present EMMN might be interpreted as reflecting mismatches between the irregular emotional faces and a visual memory of the repetitively presented stimulus pattern (FH). Traditionally, automatically formed visual memory has been understood in terms of a visual sensory store that represents only temporally unintegrated (static) information (Neisser 1967; Atkinson and Shiffrin 1968). On this view, it is unlikely that a visual memory of the stimulus pattern (FH) was formed automatically. One could still argue, however, that the successive presentation of face stimuli might have induced apparent facial motion and that a visual memory of the stimulus pattern (FH) might have been formed automatically in the format of facial motion. This also seems unlikely. First, previous studies on apparent facial motion have typically used a continuous presentation of static face stimuli to induce robust apparent facial motion (e.g., Puce et al. 2000; Miyoshi et al. 2004), whereas the present study presented static face stimuli with a relatively long interstimulus interval (250 ms). Second, previous ERP studies on apparent facial motion have shown that the amplitude and/or latency of visual evoked potentials (in particular, N170, a face-sensitive ERP component; Bentin et al. 1996) are robustly modulated as a function of the type of apparent facial motion (e.g., mouth-opening vs. mouth-closing changes, Puce et al. 2000; facial expression vs. individual changes, Miyoshi et al. 2004), whereas no such difference in visual evoked potentials was observed here between the irregular and regular emotional faces (i.e., no facial-expression change vs. facial-expression change; see Fig. 2). These facts suggest that apparent facial motion was not induced in the present study and speak against a visual memory of the stimulus pattern (FH) in the format of facial motion. Thus, although the existence of an automatically formed visual memory that can represent temporally integrated information cannot be completely ruled out, we consider that the present EMMN is more reliably explained in terms of prediction errors than memory mismatches.

Furthermore, the face-inversion and negative-bias effects on the onset latencies of EMMN suggest that unintentional predictions were present not only for the low-level visual features that comprised a face but also for holistic face dimensions. Although it is difficult to explain conclusively why robust face-inversion and negative-bias effects were observed only for the onset latencies and not for the mean amplitudes of EMMN, the present results are highly consistent with previous behavioral studies on the face-inversion effect, which suggested that upright and inverted face processing do not differ qualitatively and that even inverted faces are eventually processed holistically (Sekuler et al. 2004; Richler et al. 2011; but see also Murray et al. 2000; Rossion 2008), and with studies on the negative-bias effect, which demonstrated that negative faces are processed faster than positive faces (Hansen and Hansen 1988; Öhman et al. 2001).

One may argue that the present modulations of EMMN might be explained in terms of differences in task difficulty between the upright and inverted conditions. In fact, the present glass-detection task was slightly more difficult in the inverted conditions than in the upright conditions. However, this task-difficulty account is not consistent with the finding in a previous visual MMN study that amplitudes of visual MMN were enhanced when the task difficulty increased (Kimura et al. 2008). In addition, although this account might explain the face-inversion effect on EMMN, it does not explain the negative-bias effect on EMMN. One may also argue that the present modulations of EMMN might be explained in terms of the effects of visual fields of stimulation on visual MMN (i.e., a higher sensitivity of visual MMN for stimulation of the lower visual field than for stimulation of the upper visual field; Czigler et al. 2004). In the present study, one of the prominent differences in the low-level visual features between fearful and happy faces was the shape of the mouth. The mouth was located in the lower visual field in the upright conditions and in the upper visual field in the inverted conditions, which might have caused a more prominent EMMN in the upright conditions than in the inverted conditions. Although this possibility cannot be completely ruled out based on the present results, this visual-field account also cannot explain the negative-bias effect on EMMN. Taken together, the present face-inversion and negative-bias effects on EMMN suggest that unintentional prediction occurred, at least partly, for holistic face dimensions.

In summary, the present results on EMMN provide direct evidence for the existence of the unintentional temporal context–based prediction of emotional faces and support the idea that the concept of unintentional temporal context–based prediction can be generalized for a wide range of biologically relevant stimuli.

Neural Substrates of the Unintentional Prediction of Emotional Faces

For both irregular fearful and happy faces, the neural generators of EMMN were located in the temporal, frontal, limbic, and occipital lobes. The temporal activations were located mainly in the inferior lateral temporal areas (the fusiform and inferior temporal gyri), which are known as the face-responsible occipitotemporal visual extrastriate areas (Haxby et al. 2000). In addition, the temporal activations were located only in the right hemisphere, which is highly consistent with a large body of brain imaging studies on face processing (Sergent et al. 1992; Haxby et al. 1994; Clark et al. 1996; Kanwisher et al. 1997; McCarthy et al. 1997). Further, the activations of face-responsible areas fit nicely with previous observations that prediction error responses reflected by visual MMN were engaged in the visual extrastriate areas responsible for dimensions where the regularity violations occurred (e.g., MT/V5 activations for visual MMN in response to regularity violations in the direction of motion; Pazo-Alvarez et al. 2004).

The frontal activations were located over the medial prefrontal (the medial frontal, orbital, and rectal gyri) and lateral prefrontal areas (the superior frontal, middle frontal, and inferior frontal gyri). These results are consistent with previous studies on neural generators of visual MMN that have reported activations in the medial prefrontal (the orbital gyrus; Kimura, Ohira, and Schröger 2010) and lateral prefrontal areas (the inferior frontal gyrus, Urakawa et al. 2010; the dorsolateral prefrontal cortex, Yucel et al. 2007). Further, the present results are also consistent with a previous study on perceptual sequence learning that has reported activations in the lateral prefrontal areas (the middle frontal gyrus, inferior frontal sulcus, and inferior frontal gyrus; Huettel et al. 2002) and a study on representational momentum that has reported activations in the medial prefrontal (the frontopolar cortex and medial frontal gyrus) and lateral prefrontal areas (the superior frontal and inferior frontal gyri; Rao et al. 2004). In addition, the present frontal activations showed right-hemisphere dominance, which is highly consistent with all of these studies.

The other activations were located in the limbic (the anterior cingulate, right lateralized) and occipital lobes (the occipital visual extrastriate areas including the inferior occipital gyrus, left lateralized). Activation of the anterior cingulate has not been consistently observed in the previous studies mentioned above. However, Huettel et al. (2002) reported similar right-lateralized anterior cingulate activation. The activation of occipital visual extrastriate areas has been reported in almost all studies on neural generators of visual MMN (Yucel et al. 2007; Kimura, Ohira, and Schröger 2010; Urakawa et al. 2010) and in a study on representational momentum (Rao et al. 2004). In addition, the occipital visual extrastriate areas including the inferior occipital gyrus have also been associated with face processing (in particular, early analyses of low-level facial features; Haxby et al. 2000). Taken together, the present activations in the temporal, frontal, limbic, and occipital lobes are highly consistent with our primary expectation that neural generators of the EMMN would include the face-responsible visual and prefrontal areas.

Importantly, the neural generators of EMMN did not include any motor-related areas. This indicates that, in the present study, the emotional faces were not mapped onto the participant’s motor system and that no unintentional temporal context–based prediction of emotional faces was formed on the basis of motoric sequential regularities. Given previous findings that passive observation of others’ facial expressions can lead to motor-related activations in a relatively automatic manner (Carr et al. 2003; Leslie et al. 2004), the present result might seem surprising. One possible explanation for this discrepancy concerns the participant’s top–down strategy. Although some studies have shown that even passive observation of others’ actions can lead to motor-related activations (Fadiga et al. 1995; Iacoboni et al. 1999; Buccino et al. 2001; Carr et al. 2003; Leslie et al. 2004), it has also been shown that passive observation is not necessarily sufficient for robust motor-related activations, which depend on the observer’s top–down strategy (e.g., observing others’ actions with the intention to imitate; Decety et al. 1997; Grèzes et al. 1998). In the present study, the participant was required to ignore facial expressions, and the emotional faces themselves were completely unrelated to the task. Such a task instruction may have attenuated motoric involvement in the unintentional prediction of emotional faces.

In summary, the present results on neural generators of EMMN suggest that the unintentional temporal context–based prediction of emotional faces can operate on a principle similar to that for nonbiological visual stimuli and provide further empirical support for the major engagement of the visual and prefrontal areas in unintentional temporal context–based prediction in vision.

Neural Mechanisms of the Unintentional Prediction of Emotional Faces

For both irregular fearful and happy faces, activations in the face-sensitive occipitotemporal visual extrastriate areas were observed only in the early phase of EMMN, while those in the prefrontal areas were observed in almost all phases of EMMN. Thus, although a clear precedence of occipitotemporal extrastriate activations over prefrontal activations was not observed, this temporal pattern is compatible with our expectation and with the notion that occipitotemporal visual extrastriate activation is related to the initial generation of prediction error signals, while prefrontal activation is related to the updating of predictive models.

An interesting observation is that the occipital extrastriate visual areas were activated only in the late phase of EMMN. Although it is difficult to provide a definitive interpretation, at least 2 possible explanations can be proposed. First, the occipital activation may reflect the top–down imposition of new predictions based on updated predictive models. This account seems plausible since, with the successive presentation of emotional faces as in the present study, it would be beneficial to form new predictions when prediction errors occur and to impose them on the visual areas responsible for early analyses of low-level facial features (Haxby et al. 2000) before the presentation of the next emotional face. Such formation of new predictions is theoretically possible since irregular emotional faces were always followed by an emotional face of the other valence (i.e., irregular fearful faces were always followed by happy faces and vice versa). This account may further explain why the occipital activation was more prominent for irregular happy faces (see Fig. 4): it may reflect stronger predictions for upcoming fearful than for upcoming happy faces (i.e., a kind of negative-bias effect). Second, the occipital activation may simply reflect bottom–up initial prediction error responses with respect to the low-level visual features that comprise a face. In the present study, EMMN showed face-inversion and negative-bias effects, but the presence of these effects does not mean that the prediction error responses were exclusively related to holistic face dimensions. It is therefore also possible that the occipital activation reflected bottom–up error responses, which seems consistent with the observation that the occipital activation in response to irregular fearful and happy stimuli emerged within approximately the same time windows (330–370 and 340–380 ms, respectively). Importantly, however, given that no previous studies have investigated the time courses of the neural generators of EMMN or of visual MMN, both the validity of these 2 hypotheses and the replicability of the temporal patterns of the neural generators must be extensively tested in future studies.

In summary, the present findings on the time courses of EMMN generators provide preliminary support for the idea that prediction error responses reflected by EMMN (and, presumably, visual MMN) might be composed of both bottom–up and top–down predictive processes.

Conclusions

The present study demonstrated that, without any intention on the part of the participant, upcoming emotional faces were predicted on the basis of sequential regularities, which was mainly implemented by the face-responsible visual extrastriate areas and prefrontal areas. The present results support the idea that the concept of unintentional temporal context–based prediction that has been established with the use of nonbiological visual stimuli can be generalized for a wide range of biological visual stimuli and provide further empirical evidence for the major engagement of the visual and prefrontal areas in unintentional temporal context–based prediction in vision.

Funding

Japan Society for the Promotion of Science (JSPS19007792 to M.K.).

We thank Alexandra Bendixen for her suggestions on the data analysis. We also thank Saika Araki, Saori Ishii, Yuka Kasii, Chizuru Kimura, Akane Kyou, Kazuya Maekawa, Asuka Masaki, Asumi Nakamura, Sayaka Sakaue, and Jiaming Tang for their assistance in data acquisition. Conflict of Interest: None declared.

References

Astikainen
P
Hietanen
JK
Event-related potentials to task-irrelevant changes in facial expressions
Behav Brain Funct
 , 
2009
, vol. 
5
 pg. 
30
 
Atkinson
RC
Shiffrin
RM
Spence
KW
Spence
JT
Human memory: a proposed system and its control process
The psychology of learning and motivation: advances in research and theory
 , 
1968
New York
Academic Press
(pg. 
89
-
195
)
Bar
M
Kassam
KS
Ghuman
AS
Boshyan
J
Schmid
AM
Dale
AM
Hämäläinen
MS
Marinkovic
K
Schacter
DL
Rosen
BR
, et al.  . 
Top-down facilitation of visual recognition
Proc Natl Acad Sci U S A.
 , 
2006
, vol. 
103
 (pg. 
449
-
454
)
Bentin
S
Allison
T
Puce
A
Perez
E
McCarthy
G
Electrophysiological studies of face perception in humans
J Cogn Neurosci.
 , 
1996
, vol. 
8
 (pg. 
551
-
565
)
Blakemore
SJ
Decety
J
From the perception of action to the understanding of intention
Nat Rev Neurosci
 , 
2001
, vol. 
2
 (pg. 
561
-
567
)
Blakemore
SJ
Frith
C
The role of motor contagion in the prediction of action
Neuropsychologia
 , 
2005
, vol. 
43
 (pg. 
260
-
267
)
Blakemore
SJ
Wolpert
D
Frith
CD
Why can’t you tickle yourself?
Neuroreport
 , 
2000
, vol. 
11
 (pg. 
R11
-
R16
)
Brunia
CHM
Neural aspects of anticipatory behavior
Acta Psychol
 , 
1999
, vol. 
101
 (pg. 
213
-
242
)
Bubic
A
Bendixen
A
Schubotz
RI
Jacobsen
T
Schröger
E
Differences in processing violations of sequential and feature regularities as revealed by visual event-related brain potentials
Brain Res
 , 
2010
, vol. 
1317
 (pg. 
192
-
202
)
Bubic
A
von Cramon
DY
Jacobsen
T
Schröger
E
Schubotz
RI
Violation of expectation: neural correlates reflect bases of prediction
J Cogn Neurosci
 , 
2009
, vol. 
21
 (pg. 
1
-
14
)
Buccino
G
Binkofski
F
Fink
GR
Fadiga
L
Fogassi
L
Gallese
V
Seitz
RJ
Zilles
K
Rizzolatti
G
Freund
HJ
Action observation activates premotor and parietal areas in a somatotopic manner: an fMRI study
Eur J Neurosci
 , 
2001
, vol. 
13
 (pg. 
400
-
404
)
Canfield
RL
Haith
MM
Young infants’ visual expectations for symmetric and asymmetric stimulus sequence
Dev Psychol
 , 
1991
, vol. 
27
 (pg. 
198
-
208
)
Carr
L
Iacoboni
M
Dubeau
MC
Mazziotta
JC
Lenzi
GL
Neural mechanisms of empathy in humans: a relay from neural systems for imitation to limbic areas
Proc Natl Acad Sci U S A
 , 
2003
, vol. 
100
 (pg. 
5497
-
5502
)
Chang
Y
Xu
J
Shi
N
Zhang
B
Zhao
L
Dysfunction of processing task-irrelevant emotional faces in major depressive disorder patients revealed by expression-related visual MMN
Neurosci Lett
 , 
2010
, vol. 
472
 (pg. 
33
-
37
)
Clark
VP
Keil
K
Maisog
JM
Courtney
S
Ungerleider
LG
Haxby
JV
Functional magnetic resonance imaging of human visual cortex during face matching: a comparison with positron emission tomography
Neuroimage
 , 
1996
, vol. 
4
 (pg. 
1
-
15
)
Czigler
I
Visual mismatch negativity: violation of nonattended environmental regularities
J Psychophysiol
 , 
2007
, vol. 
21
 (pg. 
224
-
230
)
Czigler
I
Balázs
L
Pató
L
Visual change detection: event-related brain potentials are dependent on stimulus location in humans
Neurosci Lett
 , 
2004
, vol. 
364
 (pg. 
149
-
153
)
Czigler
I
Balázs
L
Winkler
I
Memory-based detection of task-irrelevant visual changes
Psychophysiology
 , 
2002
, vol. 
39
 (pg. 
869
-
873
)
Czigler
I
Weisz
J
Winkler
I
ERPs and deviance detection: visual mismatch negativity to repeated visual stimuli
Neurosci Lett
 , 
2006
, vol. 
401
 (pg. 
178
-
182
)
Decety
J
Grèzes
J
Costes
N
Perani
D
Jeannerod
M
Procyk
E
Grassi
F
Fazio
F
Brain activity during observation of actions
Brain
 , 
1997
, vol. 
120
 (pg. 
1763
-
1777
)
Dimberg
U
Thunberg
M
Elmehed
K
Unconscious facial reactions to emotional facial expressions
Psychol Sci
 , 
2000
, vol. 
11
 (pg. 
86
-
89
)
Dimberg
U
Thunberg
M
Grunedal
S
Facial reactions to emotional stimuli: automatically controlled emotional responses
Cognition Emotion
 , 
2002
, vol. 
11
 (pg. 
86
-
89
)
Doherty
JR
Rao
A
Mesulam
MM
Nobre
AC
Synergistic effect of combined temporal and spatial expectations on visual attention
J Neurosci
 , 
2005
, vol. 
25
 (pg. 
8259
-
8266
)
Eagleman
DM
Sejnowski
TJ
Motion integration and postdiction in visual awareness
Science
 , 
2000
, vol. 
287
 (pg. 
2036
-
2038
)
Eimer
M
Goschke
T
Schlaghecken
F
Stürmer
B
Explicit and implicit learning of event sequences: evidence from event-related brain potentials
J Exp Psychol Learn Mem Cogn
 , 
1996
, vol. 
22
 (pg. 
970
-
987
)
Ekman
P
Friesen
WV
Pictures of facial affect
 , 
1976
Palo Alto (CA)
Consulting Psychologists Press
Fadiga
L
Fogassi
L
Pavesi
G
Rizolatti
G
Motor facilitation during action observation: a magnetic stimulation study
J Neurophysiol
 , 
1995
, vol. 
73
 (pg. 
2608
-
2611
)
Freyd
JJ
Finke
RA
Representational momentum
J Exp Psychol Learn Mem Cogn
 , 
1984
, vol. 
10
 (pg. 
126
-
132
)
Friston
KJ
Learning and inference in the brain
Neural Netw
 , 
2003
, vol. 
16
 (pg. 
1325
-
1352
)
Friston
KJ
A theory of cortical responses
Philos Trans R Soc Lond B Biol Sci
 , 
2005
, vol. 
360
 (pg. 
815
-
836
)
Frith
CD
Frith
U
How we predict what other people are going to do
Brain Res
 , 
2006
, vol. 
1079
 (pg. 
36
-
46
)
Gallese
V
Fadiga
L
Fogassi
L
Rizzolatti
G
Action recognition in the premotor cortex
Brain
 , 
1996
, vol. 
119
 (pg. 
593
-
609
)
Gallese
V
Keysers
C
Rizzolatti
G
A unifying view of the basis of social cognition
Trends Cogn Sci
 , 
2004
, vol. 
8
 (pg. 
396
-
403
)
Gomez
CM
Vaquero
E
Vaquero-Marrufo
M
A neurocognitive model for short-term sensory and motor preparatory activity in humans
Psicologica
 , 
2004
, vol. 
25
 (pg. 
217
-
229
)
Grèzes
J
Costes
N
Decety
J
Top-down effect of strategy on the perception of human biological motion: a PET investigation
Cogn Neuropsychol
 , 
1998
, vol. 
15
 (pg. 
553
-
582
)
Haith
MM
McCarty
ME
Stability of visual expectancy at 3.0 months of age
Dev Psychol
 , 
1990
, vol. 
26
 (pg. 
68
-
74
)
Hansen
CH
Hansen
RD
Finding the face in the crowd: an anger superiority effect
J Pers Soc Psychol
 , 
1988
, vol. 
54
 (pg. 
917
-
924
)
Haxby JV, Hoffman EA, Gobbini MI. 2000. The distributed human neural system for face perception. Trends Cogn Sci. 4:223-233.
Haxby JV, Horwitz B, Ungerleider LG, Maisog JM, Pietrini P, Grady CL. 1994. The functional organization of human extrastriate cortex: a PET-rCBF study of selective attention to faces and locations. J Neurosci. 14:6336-6353.
Hayes AE, Freyd JJ. 2002. Representational momentum when attention is divided. Vis Cogn. 9:8-27.
Howard JH Jr, Mutter SA, Howard DV. 1992. Serial pattern learning by event observation. J Exp Psychol Learn Mem Cogn. 18:1029-1039.
Huettel SA, Mack PB, McCarthy G. 2002. Perceiving patterns in random series: dynamic processing of sequence in prefrontal cortex. Nat Neurosci. 5:485-490.
Iacoboni M, Woods RP, Brass M, Bekkering H, Mazziotta JC, Rizzolatti G. 1999. Cortical mechanisms of human imitation. Science. 286:2526-2528.
Kanwisher N, McDermott J, Chun MM. 1997. The fusiform face area: a module in human extrastriate cortex specialized for face perception. J Neurosci. 17:4302-4311.
Kelly MH, Freyd JJ. 1987. Explorations of representational momentum. Cogn Psychol. 19:369-401.
Kelly SW, Burton AM, Riedel B, Lynch E. 2003. Sequence learning by action and observation: evidence for separate mechanisms. Br J Psychol. 94:355-372.
Kiesel A, Miller J, Jolicœur P, Brisson B. 2008. Measurement of ERP latency differences: a comparison of single-participant and jackknife-based scoring methods. Psychophysiology. 45:250-274.
Kimura M, Katayama J, Murohashi H. 2008. Underlying mechanisms of P3a task-difficulty effect. Psychophysiology. 45:731-741.
Kimura M, Katayama J, Ohira H, Schröger E. 2009. Visual mismatch negativity: new evidence from the equiprobable paradigm. Psychophysiology. 46:402-409.
Kimura M, Ohira H, Schröger E. 2010. Localizing sensory and cognitive systems for pre-attentive visual deviance detection: an sLORETA analysis of the data of Kimura et al. (2009). Neurosci Lett. 485:198-203.
Kimura M, Schröger E, Czigler I. 2011. Visual mismatch negativity and its importance in visual cognitive sciences. Neuroreport. 22:669-673.
Kimura M, Schröger E, Czigler I, Ohira H. 2010. Human visual system automatically encodes sequential regularities of discrete events. J Cogn Neurosci. 22:1124-1139.
Kimura M, Widmann A, Schröger E. 2010. Human visual system automatically represents large-scale sequential regularities. Brain Res. 1317:165-179.
Kimura M, Widmann A, Schröger E. 2010. Top-down attention affects sequential regularity representation in the human visual system. Int J Psychophysiol. 77:126-134.
Leslie KR, Johnson-Frey SH, Grafton ST. 2004. Functional imaging of face and hand imitation: towards a motor theory of empathy. Neuroimage. 21:601-607.
MacKay DM. 1958. Perceptual stability of a stroboscopically lit visual field containing self-luminous objects. Nature. 181:507-508.
Mayr U. 1996. Spatial attention and implicit sequence learning: evidence for independent learning of spatial and nonspatial sequences. J Exp Psychol Learn Mem Cogn. 22:350-364.
McCarthy G, Puce A, Gore JC, Allison T. 1997. Face-specific processing in the human fusiform gyrus. J Cogn Neurosci. 9:605-610.
Miall RC. 2003. Connecting mirror neurons and forward models. Neuroreport. 14:2135-2137.
Miller J, Patterson T, Ulrich R. 1998. Jackknife-based method for measuring LRP onset latency differences. Psychophysiology. 35:99-115.
Miyoshi M, Katayama J, Morotomi T. 2004. Face-specific N170 component is modulated by facial expressional change. Neuroreport. 15:911-914.
Murray JE, Yong E, Rhodes G. 2000. Revisiting the perception of upside-down faces. Psychol Sci. 11:492-496.
Nattkemper D, Prinz W. 1997. Stimulus and response anticipation in a serial reaction task. Psychol Res. 60:98-112.
Neisser U. 1967. Cognitive psychology. New York: Appleton-Century-Crofts.
Nichols TE, Holmes AP. 2002. Nonparametric permutation tests for functional neuroimaging: a primer with examples. Hum Brain Mapp. 15:1-25.
Nijhawan R. 1994. Motion extrapolation in catching. Nature. 370:256-257.
Öhman A, Lundqvist D, Esteves F. 2001. The face in the crowd revisited: a threat advantage with schematic stimuli. J Pers Soc Psychol. 80:381-396.
Pascual-Marqui RD. 2002. Standardized low-resolution brain electromagnetic tomography (sLORETA): technical details. Methods Find Exp Clin Pharmacol. 24D:5-12.
Pazo-Alvarez P, Amenedo E, Lorenzo-López L, Cadaveira F. 2004. Effects of stimulus location on automatic detection of changes in motion direction in the human brain. Neurosci Lett. 371:111-116.
Puce A, Smith A, Allison T. 2000. ERPs evoked by viewing facial movements. Cogn Neuropsychol. 17:221-239.
Remillard G. 2003. Pure perceptual-based sequence learning. J Exp Psychol Learn Mem Cogn. 29:581-597.
Rao H, Han S, Jiang Y, Xue Y, Gu H, Cui Y, Gao D. 2004. Engagement of the prefrontal cortex in representational momentum. Neuroimage. 23:98-103.
Richler JJ, Mack ML, Palmeri TJ, Gauthier I. 2011. Inverted faces are (eventually) processed holistically. Vision Res. 51:333-342.
Rizzolatti G, Fadiga L, Gallese V, Fogassi L. 1996. Premotor cortex and the recognition of motor actions. Cogn Brain Res. 3:131-141.
Rossion B. 2008. Picture-plane inversion leads to qualitative changes of face perception. Acta Psychol. 128:274-289.
)
Rüsseler
J
Rösler
F
Implicit and explicit learning of event sequences: evidence for distinct coding of perceptual and motor representations
Acta Psychol
 , 
2000
, vol. 
104
 (pg. 
45
-
67
)
Schubotz
RI
Prediction of external events with our motor system: towards a new framework
Trends Cogn Sci
 , 
2007
, vol. 
11
 (pg. 
211
-
218
)
Schubotz
RI
von Cramon
DY
Dynamic patterns make the premotor cortex interested in objects: influence of stimulus and task revealed by fMRI
Cogn Brain Res
 , 
2002
, vol. 
14
 (pg. 
357
-
369
)
Schubotz
RI
von Cramon
DY
Predicting perceptual events activates corresponding motor schemes in lateral premotor cortex: an fMRI study
Neuroimage
 , 
2002
, vol. 
15
 (pg. 
787
-
796
)
Searcy
JH
Bartlett
JC
Inversion and processing of component and spatial-relational information in faces
J Exp Psychol Hum Percept Perform
 , 
1996
, vol. 
22
 (pg. 
904
-
915
)
Sekuler
AB
Gaspar
CM
Gold
JM
Bennett
PJ
Inversion leads to quantitative, not qualitative, changes in face processing
Curr Biol
 , 
2004
, vol. 
14
 (pg. 
391
-
396
)
Sergent
J
Ohta
S
MacDonald
B
Functional neuroanatomy of face and object processing. A positron emission tomography study
Brain
 , 
1992
, vol. 
115
 (pg. 
15
-
36
)
Stefanics
G
Kimura
M
Czigler
I
Visual mismatch negativity reveals automatic detection of sequential regularity violation
Front Hum Neurosci
 , 
2011
, vol. 
5
 
Summerfield C, Koechlin E. 2008. A neural representation of prior information during perceptual inference. Neuron. 59:336-347.
Susac A, Ilmoniemi RJ, Pihko E, Supek S. 2004. Neurodynamic studies on emotional and inverted faces in an oddball paradigm. Brain Topogr. 16:265-268.
Talairach J, Tournoux P. 1988. Co-planar stereotaxic atlas of the human brain: three-dimensional proportional system. Stuttgart (Germany): Georg Thieme.
Thompson P. 1980. Margaret Thatcher: a new illusion. Perception. 9:483-484.
Ulrich R, Miller J. 2001. Using the jackknife-based scoring method for measuring LRP onset effects in factorial designs. Psychophysiology. 38:816-827.
Urakawa T, Inui K, Yamashiro K, Kakigi R. 2010. Cortical dynamics of the visual change detection process. Psychophysiology. 47:905-912.
Whitney D, Murakami I. 1998. Latency difference, not spatial extrapolation. Nat Neurosci. 1:656-657.
Wilson M, Knoblich G. 2005. The case for motor involvement in perceiving conspecifics. Psychol Bull. 131:460-473.
Wolfensteller U, Schubotz RI, von Cramon DY. 2007. Understanding non-biological dynamics with your own premotor system. Neuroimage. 36:T33-T43.
Yucel G, McCarthy G, Belger A. 2007. fMRI reveals that involuntary visual deviance processing is resource limited. Neuroimage. 34:1245-1252.
Zhao L, Li J. 2006. Visual mismatch negativity elicited by facial expressions under non-attentional condition. Neurosci Lett. 410:126-131.