Abstract

Two experiments investigated the network involved in the visual perception of walking. Video clips of forward and backward walk (real walk direction) were shown either as recorded, or reversed in time (rendering). In Experiment 1 (identification task), participants were asked to indicate whether or not the stimulus was time-reversed. In Experiment 2 (free-viewing), participants viewed the video clips passively. Identification accuracy was good with the more familiar scene, that is, when the visual walk was in the direction of the facing orientation, and at chance level in the opposite case. In both experiments, the temporo-occipital junction (TOJ) was activated more strongly by unfamiliar than familiar scenes. Only in Experiment 1 were intraparietal, superior temporal, and inferior temporal regions also activated. TOJ activation signals the detection in unfamiliar scenes of a mismatch between facing orientation and visual movement direction. We argue that TOJ response to a mismatch prevents the further processing of the visual input required to identify temporal inversions. When no mismatch is detected (familiar stimuli), TOJ would, instead, be involved in the kinematic analysis that makes such identification possible. The study demonstrates that unfamiliar walking movements are detected earlier than previously assumed along the visual movement processing stream.

Introduction

We perceive human movements with great accuracy and speed. The visual factors responsible for this remarkable performance have been investigated extensively both behaviorally and with brain imaging techniques (for a review, see Grosbras et al. 2012). Recently, growing attention has been paid to the role of nonvisual factors, such as expectations, contextual constraints, rationality, and motor competence. These attempts to place movement within a broader context rely increasingly on the use of more complex and realistic visual stimuli than point-light (PL) displays (Rizzolatti and Craighero 2004). It has been suggested (Frith and Frith 2003, 2006; Brass et al. 2007; de Lange et al. 2008) that visual movement provides access to the underlying intentionality of the action by involving cortical areas where mental states are attributed to the agent. Seeing people engaged in motor tasks activates an action observation network (AON; Gallese and Goldman 1998; Gallese et al. 2004), which extends beyond the temporo-occipital cortex (Han et al. 2013) and includes premotor and parietal areas (Caspers et al. 2010). It has also been argued that the AON supports a process of resonance and simulation by which motor competence would be brought to bear for movement understanding (Rizzolatti and Craighero 2004; Kilner et al. 2009; Oosterhof et al. 2010). However, the neuronal underpinnings of such a hypothetical process are still debated (e.g., Lingnau et al. 2009). Several brain imaging studies have focused on whole-body movements. In particular, it has been shown that the activation of AON components when viewing dance movements is modulated by the motor expertise of the observer (Calvo-Merino et al. 2005, 2006), movement naturalness (Cross et al. 2012), and training (Cross et al. 2006; Cross, Hamilton, et al. 2009; Cross, Kraemer, et al. 2009).

The human brain is sensitive to the temporal order in which events unfold in the real world (Hasson et al. 2008). However, to our knowledge, no previous study has investigated how the cortical network involved in the perception of human movements takes into account the arrow of time. Several complex human movements have a prevalent time arrow dictating the sequence in which their components usually follow each other. In some cases, prevalence is absolute because inverting the usual sequence is barred by the interaction between the body and the environment, as in most fetching gestures, or by internal constraints of the motor system, as in articulatory gestures. Walking is instead a case of relative prevalence, which is well suited for investigating the mechanisms leading to the identification of the time arrow. On the one hand, the sequence of stance and swing phases of the legs that produces a forward displacement of the body can easily be inverted to produce a backward displacement. On the other hand, backward walking is not the time-symmetric version of forward walking. Kinematic measurements demonstrated subtle but consistent differences between the 2 actions, which include the time course of the ankle joint angle (Thorstensson 1986; Winter et al. 1989), step length (Grasso et al. 1998), the variability of movement kinematics (Katzavelis et al. 2010), and the correlation between gait cycle, step length, and speed (Grasso et al. 1998). Therefore, even when the translational component of the movement is suppressed or is ambiguous, these asymmetric kinematic cues could still be used to discriminate forward and backward walk. In principle, discrimination may be achieved with a purely visual analysis of the cues. However, because walking is an over-trained action with strong idiosyncratic features and a robust motor representation, discriminating whether a sequence of leg movements corresponds to a forward or backward walk may also involve an interaction between visual inputs and the motor competence of the observer.

Previous studies on the perception of walk have mostly utilized PL stimuli in which the translational component of the action is eliminated. However, a recent study (Viviani et al. 2011) showed that this component plays a major role in the ability to identify the time arrow of the display. The stimuli were video clips of actors walking either forward or backward. Participants were shown both types of recordings played back either as recorded or after a time reversal and were asked to identify the play-back mode. The results showed that the ability to detect time reversals was strongly influenced by the direction of the body movement on the display. When the body appeared to move in the more familiar forward direction (forward walk rendered normally or time-reversed backward walk), response accuracy was well above chance. In contrast, responses were at chance level when the body appeared to move in the unusual backward direction. This unexpected result suggested that familiarity has a gating effect. Apparently, visual inputs describing the rather unusual scene in which the actor moves backward cannot be processed adequately to detect a temporal reversal.

Based on this finding, and using the same stimuli as in Viviani et al. (2011), we designed a brain imaging study to address 3 issues. The first issue is where along the visual stream brain activity correlates with the familiarity of the stimuli. An action can be perceptually familiar because we see it often performed, and also familiar from the motor point of view because we perform it frequently. If motor familiarity is crucial for identifying the arrow of time, the 2 directions of the body on the display should elicit different patterns of activation in premotor areas, irrespective of whether the time arrow has been preserved or reversed. If instead visual familiarity is crucial for correct identification, a differential pattern of activation should emerge already at some earlier stage of the motion processing stream. In particular, differences may be expected in the superior temporal sulcus (STS). STS is a key component of AON that responds to several aspects of biological movement (cf. Puce and Perrett 2003) and is involved in several higher-order cognitive functions such as detecting the appropriateness of the action (Pelphrey et al. 2004), action recognition (Iacoboni 2005; Molnar-Szakacs et al. 2005), action understanding (Thioux et al. 2008; Herrington et al. 2011), and intentional attunement (Gallese 2006). A computational model by Giese and Poggio (2003) predicted that forward walk elicits a stronger STS activity than backward walk. In addition, it has been argued (Jastorff and Orban 2009) that the body of the actor is linked to the action he/she is performing already in the temporo-occipital region (extra-striate body area, EBA; fusiform body areas) by integrating shape and kinematic cues.

If one can identify an area (or areas) that appears to be responsible for the gating effect of the visual body direction, the next issue is whether unfamiliar stimuli (such as a body moving backward on the display) increase or decrease the activity in these areas with respect to familiar stimuli. The prevailing view, adopted by Giese and Poggio (2003), holds that the AON should respond more energetically to familiar, than to unfamiliar, actions (Press 2011). Recently, this view has been challenged by Cross et al. (2012) who showed that parietal, premotor, and temporo-occipital areas respond more strongly to robot-like motions than to natural biological movements. Clearly, these conflicting results lead to different interpretations of the cortical responses. A more energetic response to familiar actions is in keeping with the general view that motor perceptual interactions are mediated by a resonance mechanism whereby incoming information and visual/motor competence compatible with that information reinforce mutually (Decety and Chaminade 2003; Aglioti et al. 2008). In contrast, a more energetic response to unfamiliar action may be interpreted as a startling reaction to a mismatch (Kilner et al. 2007a, 2007b; Cross et al. 2012).

The final issue concerns the identification of the time arrow. Provided that the visual direction of the walk is the familiar forward one, how then is proficient performance achieved? Viviani et al. (2011) argued that identification involves motor competence. Clearly, an association between proficient performance and a pattern of differential activation within the AON in response to normal and reversed stimuli cannot be construed as evidence that identification is causally dependent on activity in these areas (Mahon and Caramazza 2008). It would, however, be in keeping with the notion that motor expertise is conducive to a finer evaluation of the kinematic features of the stimuli (see above). Conversely, should normal and reversed stimuli fail to activate differentially any AON component, the most natural inference would be that correct identification is based mainly on visual processes either in the temporo-occipital cortex (Jastorff and Orban 2009) or in the STS (Grossman and Blake 2002).

Two event-related brain imaging experiments investigated the 3 issues above. In Experiment 1, the same two-alternative forced choice (2AFC) task as in Viviani et al. (2011) estimated the ability to detect a time reversal in 4 types of displays in which the actual direction of the walk (forward/backward) was crossed with the rendering of the recording (normal/time-reversed). In addition to components related to the visual processing, the pattern of cortical activations emerging from this experiment is likely to include also components related to the task (attention, task difficulty, and response selection). To factor out the potential confounding effects of the latter components, in Experiment 2, the same stimuli as in Experiment 1 were shown to a different sample of observers in a free-viewing condition in which no identification judgment was required.

Methods

Participants

Fourteen young adults (8 females, 6 males; age range: 22–28) participated in Experiment 1. A different sample of 15 young adults (8 females, 7 males; age range: 22–28) participated in Experiment 2. All participants were right-handed and right-foot dominant according to the Edinburgh Handedness Inventory. Participants were healthy, free of psychotropic or vasoactive medication, with no past history of psychiatric or neurological diseases. They all had normal or corrected-to-normal (contact lenses) visual acuity. After receiving the instructions, participants gave their written informed consent to procedures approved by the Institutional Review Board of Fondazione Santa Lucia. However, they were not aware of the ultimate goal of the experiments. Experimental protocols complied with the Declaration of Helsinki on the use of human subjects in research.

Stimuli

Eight young adults (4 females, 4 males, age range: 21–24) who did not participate in the imaging experiments volunteered as actors for recording the stimuli. They were selected with the criterion of minimizing anthropometric differences in the lower part of the body. Identity and sex cues were also minimized by asking actors to wear black legwear. Actors were informed that recordings of their walking movements would be used as experimental stimuli, but were not aware of the goal of the study.

Actors were asked to walk at their spontaneous pace along a 7-m long platform. Each actor performed the exercise 12 times. In 6 trials, the walk was in the forward direction (FW). In the other 6, actors walked in the backward direction (BW). Half of each set of walks was from the left to the right side of the platform (LR), and the other half was from the right to the left side (RL). There were 3 repetitions for each condition. Movements were recorded in color with a Sony Handycam NP-F330 at 25 frames per second. The camera was level with the platform at a distance of 3 m. With the selected focal length, the field of view (FOV) included the central 4-m portion of the platform. The background was uniformly gray. The walking action began before entering the FOV and continued after disappearing.

Normalized video clips [96 frames = 3.84 s, field size: 26° (H) × 9.5° (V)] were obtained by editing the original recordings with Adobe Premiere 7.0. Video clips showed only the lower part of the body from waist to feet and a portion of the platform. They included at least 2 complete stepping cycles. The experimental stimuli were the video clips shown either as recorded (N: normal) or after a time reversal (R: reverse). Altogether, there were 4 [actor] × 2 [sex] × 3 [repetition] × 2 [FW/BW] × 2 [LR/RL] × 2 [N/R] = 192 different stimuli. Each stimulus was presented twice to each participant in a different pseudorandom order.
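The factorial structure of the stimulus set can be checked with a short enumeration. The following is a minimal sketch, not the authors' stimulus-generation code: the factor labels and the helper `visual_walk` are hypothetical names introduced here for illustration, while the factor levels, frame count, and frame rate are taken from the text.

```python
from itertools import product

# Factor levels as described in the text (labels are illustrative).
actors_per_sex = [1, 2, 3, 4]
sexes = ["F", "M"]
repetitions = [1, 2, 3]
real_walk = ["FW", "BW"]      # real walking direction
orientation = ["LR", "RL"]    # left-to-right / right-to-left
rendering = ["N", "R"]        # normal / time-reversed playback

stimuli = list(product(actors_per_sex, sexes, repetitions,
                       real_walk, orientation, rendering))


def visual_walk(real, rend):
    """Apparent walking direction on the display.

    A time reversal flips the real direction: a reversed backward walk
    looks like a forward walk, and vice versa.
    """
    if rend == "N":
        return real
    return "BW" if real == "FW" else "FW"


FRAMES, FPS = 96, 25
clip_duration_s = FRAMES / FPS  # duration of one normalized clip in seconds

n_forward_visual = sum(visual_walk(s[3], s[5]) == "FW" for s in stimuli)
print(len(stimuli), n_forward_visual, clip_duration_s)
```

Half of the enumerated stimuli show a forward visual walk and half a backward visual walk, which is what makes the visual walk factor balanced against rendering.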

Procedure and Task

The total number of trials (2 × 192 = 384) was divided into 5 runs (4 blocks of 77 trials and 1 block of 76 trials) with the constraint that successive trials never involved the same actor and same walk directions. Within runs, intertrial intervals were randomized with a long-tailed (geometric) distribution (Hagberg et al. 2001). The mean onset asynchrony was 5.68 s (minimum: 4.68 s and maximum: 11.38 s). A run lasted about 8 min. Runs were administered in a single session and were separated by brief pauses.
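One way to obtain a long-tailed (geometric) schedule with the reported minimum, maximum, and mean is sketched below. Only the three timing values come from the text; the step size `STEP_S` and the stopping probability `P_STOP` are assumptions chosen for illustration so that the simulated mean lands near 5.68 s, and they should not be read as the parameters actually used in the study.

```python
import math
import random

SOA_MIN_S = 4.68    # minimum onset asynchrony, from the text
SOA_MAX_S = 11.38   # maximum onset asynchrony, from the text
STEP_S = 0.67       # ASSUMPTION: ten equal steps span the min-max range
P_STOP = 0.40       # ASSUMPTION: tuned so the mean is close to 5.68 s


def draw_soa(rng):
    """Draw one onset asynchrony: minimum plus a capped geometric step count."""
    u = 1.0 - rng.random()                          # uniform in (0, 1]
    k = int(math.log(u) / math.log(1.0 - P_STOP))   # geometric number of extra steps
    max_steps = round((SOA_MAX_S - SOA_MIN_S) / STEP_S)
    return SOA_MIN_S + STEP_S * min(k, max_steps)   # cap at the reported maximum


rng = random.Random(0)  # fixed seed so the sketch is reproducible
soas = [draw_soa(rng) for _ in range(20000)]
mean_soa = sum(soas) / len(soas)
print(f"mean SOA = {mean_soa:.2f} s, "
      f"range = [{min(soas):.2f}, {max(soas):.2f}] s")
```

Short intervals dominate under this scheme, and the occasional long interval provides the jitter that an event-related design needs to separate overlapping responses.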

In Experiment 1, participants were told that the stimuli represented either a forward or a backward walk. They were also informed that, in some cases, the recordings were reversed in time, so that what appeared to be a forward walk could either be a true forward walk or the result of inverting a backward walk. The task (2AFC) was to indicate whether the movement was displayed as recorded or reversed in time. Responses were entered by pressing with the right index finger 1 of 2 buttons marked N (normal) and R (reverse), respectively. There was no time limit, but participants were encouraged to answer as soon as possible, even before the end of the stimulus. Between trials, the display was uniformly gray and participants fixated a central point. No constraints were imposed on oculomotor behavior during the presentation of the stimuli. Before the experimental session, participants were administered 48 warm-up trials, which included at least one example for each combination of actor sex, walk direction, and rendering. The results of these trials were not analyzed.

In Experiment 2, the presentation schedule was as in Experiment 1, but participants were only asked to watch the video clips, without making any overt judgment about the stimuli (free-viewing). Unlike Experiment 1, participants were not informed that half of the stimuli were time-reversed versions of a real recording.

Stimulus Presentation

Participants lay supine in the MR scanner with the head immobilized with foam cushioning and wore ear plugs and headphones to suppress ambient noise. A digital projector (NEC LT158, 60-Hz refresh rate) sent the stimuli through an inverted telephoto lens onto a semi-opaque Plexiglas screen mounted vertically inside the scanner bore, behind the participant's head. The back-projected image was then viewed via a mirror mounted on the head coil positioned at about 4.5 cm from the eyes. The eye-to-screen equivalent distance was 57 cm, and the angular size of the projected image was 16° × 5.8°. Responses were acquired with an MR-compatible response box (fORP, Current Designs) sampled at 1 kHz. Eye movements were recorded with an ASL 504 eye-tracking system (Bedford, MA, USA) and sampled at 60 Hz for off-line processing.

Processing of Eye Movement Data

Eye movement traces included a large initial saccade from the central fixation point to the screen edge where the actor entered the scene, followed by a smooth pursuit phase interspersed with catch-up saccades. Only the horizontal component of the movement was processed with a customized analysis program. First, the program extracted the trace portion from the end of the initial saccade to either the response (Experiment 1) or the end of the video clip (Experiment 2). Then, catch-up saccades were detected and eliminated by realigning the smooth components and filling the gap with a linear interpolation. For each participant, condition (FW/N, FW/R, BW/N, and BW/R), and walk orientation (LR/RL), we computed an average eye movement trace by pooling the results over actors and repetitions (the end of the average trace was clipped to the shortest length). Finally, we estimated the average gain of the smooth pursuit component by dividing the slope of the [gaze position/time] linear regression by the average walking speed in each recording (computed separately for FW and BW).
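The final step of this procedure, the gain estimate, can be sketched as follows. This is an illustration under stated assumptions rather than the customized analysis program itself: the function names are hypothetical, the trace is synthetic and already free of saccades, and both gaze position and walking speed are assumed to be expressed in the same angular units (degrees and degrees per second).

```python
def regression_slope(times, positions):
    """Least-squares slope of gaze position against time."""
    n = len(times)
    mean_t = sum(times) / n
    mean_x = sum(positions) / n
    sxy = sum((t - mean_t) * (x - mean_x) for t, x in zip(times, positions))
    sxx = sum((t - mean_t) ** 2 for t in times)
    return sxy / sxx


def pursuit_gain(times, positions, walking_speed):
    """Pursuit gain: eye velocity (regression slope) over target velocity."""
    return regression_slope(times, positions) / walking_speed


# Synthetic example: 2 s of pursuit sampled at 60 Hz (the eye-tracker rate).
SAMPLE_RATE_HZ = 60
walking_speed = 5.0   # deg/s, HYPOTHETICAL on-screen speed of the walker
times = [i / SAMPLE_RATE_HZ for i in range(120)]
positions = [-8.0 + 0.7 * walking_speed * t for t in times]  # eye lags the target

gain = pursuit_gain(times, positions, walking_speed)
print(f"pursuit gain = {gain:.3f}")
```

A gain below 1 means that the eye moves more slowly than the walker, so the figure drifts away from the fovea until a catch-up saccade recaptures it.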

fMRI Data Acquisition

Functional imaging data were acquired by a Siemens Magnetom Allegra 3-T head-only scanning system (Siemens Medical Systems, Erlangen, Germany), equipped with a quadrature volume RF head coil. Whole-brain blood oxygen level-dependent (BOLD) echo-planar imaging (EPI) functional data were acquired with a 3T-optimized gradient-echo pulse sequence (repetition time = 2.47 s, echo time = 30 ms, flip angle = 70°, FOV = 192 mm, fat suppression). Thirty-eight image slices were acquired in the ascending order (64 × 64 voxels, 3 × 3 × 2.5 mm, distance factor 50%). For each participant, a total of 204 volumes of data were acquired for each of the 5 runs.

Image Preprocessing

For both experiments, the 204 × 5 = 1020 available fMRI volumes for each participant were processed using the SPM8 software (Wellcome Department of Cognitive Neurology, University College London; implemented in MATLAB 7.4). The first 4 volumes of each run were discarded to allow for stabilization of longitudinal magnetization. The remaining 1000 volumes underwent the following preprocessing steps: (1) realignment of all images to compensate for head motion; (2) correction of the slice acquisition delays using the middle slice as a reference; (3) normalization to the MNI EPI template and re-sampling to 2-mm isotropic voxel size; (4) spatial smoothing with an isotropic Gaussian kernel with an 8-mm full-width at half maximum.

Data Analysis for Each Experiment

Statistical analysis was performed in 2 steps (Penny et al. 2003). First, for every participant, the onset and duration of each condition were modeled by a general linear model (GLM). The design matrix included multiple regressors, each modeling 1 of the 4 conditions, and 6 additional covariates modeling head motion. Voxel time-series were processed to remove autocorrelations using a first-order autoregressive model and high-pass filtering (128-s cut-off). For each participant, the resulting parameter estimate images were sorted separately for each experimental condition (FW/N, BW/N, FW/R, and BW/R).

In the second step of the analysis, for both Experiments 1 and 2, we applied a random-effect (RFX) GLM to the individual parameter estimates obtained from the first step. The statistical threshold was set to P-corr < 0.05 (FWE correction) at the cluster level (cluster size estimated at P-uncorr < 0.001), considering the whole brain as the volume of interest. The localization of the activation clusters was based on Duvernoy's (1991) anatomical atlas of the human brain.

We tested whether additional information emerged when the analysis of cortical activity was repeated with a new design matrix in which trials were grouped according to the response. For instance, a FW/N trial the response to which was “Reversed” was reckoned as FW/R, and a BW/R trial the response to which was “Normal” was reckoned as FW/N. Four images, corresponding to the new grouping criterion, were calculated for each participant at the first level. The images were then again subjected to a second-level RFX analysis.

The analysis of the smooth pursuit component of eye movements revealed only in Experiment 1 an effect of the experimental conditions on the gain (see Results). Thus, for this experiment, we tested whether eye movements contributed to the pattern of activity. To this end, in the first-level statistical analysis of the BOLD data, we added trial by trial the individual pursuit gain as a covariate. This control was performed only on the participants (N = 8) in which the estimate of the gain was robust in at least two-thirds of the trials. The resulting image for each condition was then subjected to a second-level RFX analysis.

Comparison Between Experiments 1 and 2

First, we verified whether the regions active in Experiment 1 were also active in Experiment 2. Accordingly, we defined as regions of interest (ROIs) spheres (8-mm radius) centered on the local maxima of the clusters detected in Experiment 1, and tested the corresponding contrast in Experiment 2 (main effects and interactions). Altogether, 11 ROIs were considered (Table 7). Second, for completeness, we tested whether the effects of interest within these ROIs were significantly different between experiments. To this end, a set of unpaired t-tests compared the contrasts of interest (main effects and interactions) in Experiments 1 and 2. Note that in this second step, the contrast weights for ROI selection and for ROI test were not independent. All ROI analyses were performed using the MarsBaR toolbox for SPM8, and P-values were Bonferroni-corrected for multiple comparisons. However, the correction was not applied to the analysis of nonindependent data where we focused on single areas.
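The geometry of a spherical ROI and the Bonferroni adjustment can be illustrated with a short sketch. It does not reproduce the toolbox's own ROI routines: the function names are hypothetical, and the sketch assumes that ROI voxels lie on the 2-mm isotropic grid produced by the normalization step. The example center is the right TOJ peak listed for backward real walk in Table 4.

```python
from itertools import product

VOXEL_MM = 2.0    # resampled voxel size, from the preprocessing description
RADIUS_MM = 8.0   # ROI radius, from the text
N_ROIS = 11       # number of ROIs, from the text


def sphere_voxels(center_mm, radius_mm=RADIUS_MM, voxel_mm=VOXEL_MM):
    """Return MNI coordinates (mm) of grid voxels within radius_mm of the center.

    ASSUMPTION: voxels are sampled on a regular grid aligned with the center.
    """
    reach = int(radius_mm // voxel_mm)
    cx, cy, cz = center_mm
    voxels = []
    for i, j, k in product(range(-reach, reach + 1), repeat=3):
        dx, dy, dz = i * voxel_mm, j * voxel_mm, k * voxel_mm
        if dx * dx + dy * dy + dz * dz <= radius_mm ** 2:
            voxels.append((cx + dx, cy + dy, cz + dz))
    return voxels


def bonferroni(p_values):
    """Bonferroni-adjusted P-values: multiply by the number of tests, cap at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


roi = sphere_voxels((58, -64, -6))
hypothetical_p = [0.003] * N_ROIS   # HYPOTHETICAL uncorrected P-values
print(len(roi), bonferroni(hypothetical_p)[0])
```

With 11 ROIs, an uncorrected P-value must fall below 0.05/11 (about 0.0045) to survive the correction.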

Results

Experiment 1

Identification Performance

In agreement with the literature (Grasso et al. 1998), statistical analysis of step size (cf. Viviani et al. 2011) showed a significant difference between backward (110.2 cm) and forward (132.6 cm) movements, but no effect of actor sex or orientation of the walk (LR/RL). In subsequent analyses, response probabilities were computed by pooling repetitions and LR/RL trials. Table 1 reports the probability P{C} of a correct response averaged over all participants as a function of the experimental factors. Statistical analysis (mixed-model analyses of variance [ANOVAs], 4 [actor] × 2 [sex] × 2 [FW/BW] × 2 [N/R] with random factor actor nested within sex, arcsin transformation) showed that neither real walking direction (FW/BW: F1,427 = 2.04, P = 0.154) nor sex (F1,427 = 2.71, P = 0.100) had a significant effect on identification performance. In contrast, the effect of rendering was highly significant (N/R; F1,427 = 57.77, P < 0.001), and so was the interaction between real walk direction and rendering (F1,427 = 186.76, P < 0.001). For each participant, the interaction was confirmed by performing a χ2 test on the 2 × 2 contingency tables for correct responses. For all the participants, the test revealed that real walk direction and rendering interacted significantly. Identification performance was highly accurate in the condition FW/N (pooling over sex, P{C} = 0.870), decreased but remained well above chance level (P{C} = 0.672) in the condition BW/R, and collapsed to chance level when the actor appeared to move backward (BW/N: P{C} = 0.546; FW/R: P{C} = 0.408).
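The per-participant test of the interaction can be sketched with the standard χ2 statistic for a 2 × 2 table (1 degree of freedom, no continuity correction). The function name and the counts below are hypothetical and serve only as an illustration; whether a continuity correction was applied in the original analysis is not stated, so its absence here is an assumption.

```python
import math


def chi_square_2x2(table):
    """Pearson chi-square test of independence for a 2 x 2 contingency table.

    Returns the statistic and its P-value for 1 degree of freedom
    (ASSUMPTION: no continuity correction).
    """
    (a, b), (c, d) = table
    n = a + b + c + d
    denom = (a + b) * (c + d) * (a + c) * (b + d)
    chi2 = n * (a * d - b * c) ** 2 / denom
    # Upper tail of the chi-square distribution with 1 df.
    p_value = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p_value


# HYPOTHETICAL counts of correct responses for one participant:
# rows = real walk (FW, BW), columns = rendering (N, R).
correct_counts = [[84, 39],
                  [52, 65]]

chi2, p_value = chi_square_2x2(correct_counts)
print(f"chi2 = {chi2:.2f}, P = {p_value:.4f}")
```

A significant statistic indicates that the advantage of one rendering over the other depends on the real walk direction, which is the interaction described above.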

Table 1

Experiment 1: Probability of correct rendering identification for each combination of experimental factors

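| Real walk | Male actors, N | Male actors, R | Female actors, N | Female actors, R |
| --- | --- | --- | --- | --- |
| FW | 0.842 [0.783–0.902] | 0.391 [0.260–0.523] | 0.897 [0.857–0.937] | 0.426 [0.264–0.587] |
| BW | 0.562 [0.460–0.665] | 0.692 [0.502–0.757] | 0.530 [0.411–0.648] | 0.714 [0.566–0.862] |
| χ2 test | χ2 = 74.30, P < 0.001 | | χ2 = 111.23, P < 0.001 | |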

Note: Results averaged over participants, orientation of the walk, and repetitions. In brackets, the 0.95 confidence intervals.

N: normal rendering; R: reverse rendering; FW: real forward walk; BW: real backward walk.

Eye Movements

All participants spontaneously adopted a pursuit oculomotor strategy to keep the walking figure near the center of the visual field. Figure 1 illustrates with data from one representative participant the pursuit component of eye movements. Pursuit effectiveness was estimated by the pursuit gain (see Methods). Because pursuit gain was <1 (Table 2), a strategy of saccadic recapture was often adopted to maintain the gaze on the figure. Statistical analysis (three-way ANOVAs for repeated measures, 2 [FW/BW] × 2 [LR/RL] × 2 [N/R], Greenhouse-Geisser correction) showed that the pursuit gain was affected neither by the orientation of the walk (LR/RL: F1,13 < 0.01, P = 0.993) nor the rendering (N/R: F1,13 = 3.351, P = 0.090). Instead, the real direction of the walk was a significant factor (FW/BW: F1,13 = 9.583, P = 0.009), the gain being on average lower for forward (FW: 0.646) than for backward (BW: 0.729) walk.

Table 2

Experiment 1: Smooth pursuit gain averaged over participants and repetitions for each combination of experimental factors

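| Real walk | Left to right, N | Left to right, R | Right to left, N | Right to left, R |
| --- | --- | --- | --- | --- |
| FW | 0.682 | 0.651 | 0.662 | 0.588 |
| BW | 0.711 | 0.705 | 0.746 | 0.755 |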

Left to right/right to left: orientation of the walk; FW: real forward walk; BW: real backward walk; N: normal rendering; R: reverse rendering.

Figure 1.

Experiment 1. Oculomotor smooth pursuit in one representative participant for all 4 indicated types of stimuli and both orientations of the walk (left to right/right to left). Solid dots: pursuit trajectory averaged over all repetitions. Also shown are the linear regressions through the data points used for computing the average pursuit gain (shown inset with the 95% confidence limits). Light lines: 95% confidence bars.

fMRI Results

The analysis of BOLD signals was carried out twice using a design matrix in which trials were grouped according to either the actual conditions or the responses (see Methods). The 2 grouping criteria yielded similar results. In the following, we will detail only the results corresponding to the former criterion. A preliminary global analysis (Table 3) charted the cortical regions activated by the stimuli, irrespective of the parameters manipulated experimentally (main effect of all moving stimuli vs. baseline). As shown in Figure 2, significant activations were found in temporo-occipital, parietal, and frontal (premotor and prefrontal) areas, which are generally included in the AON.

Table 3

Experiment 1: Main effect of all moving stimuli versus baseline

| Anatomical area | Cluster size | MNI peak coordinates x y z (mm) | Z-score |
| --- | --- | --- | --- |
| Left temporo-occipital junction | 22 562 | −50 −76 | 6.17 |
| Right temporo-occipital junction | | 42 −76 −12 | 6.11 |
| Right middle occipital gyrus | | 34 −88 12 | 6.02 |
| Left post-central gyrus | 8474 | −42 −34 54 | 5.86 |
| Left precentral gyrus | | −36 66 | 5.38 |
| Left superior parietal lobe | | −34 −46 60 | 5.23 |
| Left inferior frontal gyrus | 541 | −34 18 −4 | 5.66 |
| | | −28 22 10 | 4.36 |
| Left supplementary motor area | 1786 | −2 16 50 | 5.56 |
| Left superior frontal gyrus | | −14 16 38 | 4.55 |
| | | −12 24 38 | 4.14 |
| Right inferior frontal gyrus | 3072 | 40 32 12 | 5.34 |
| | | 36 20 −6 | 5.15 |
| | | 52 14 30 | 4.57 |

Note: Anatomical areas, cluster sizes, peak coordinates in MNI space, and Z-scores for significant activations.

Figure 2.

Experiment 1. Main effect of all moving stimuli versus baseline (P-corr < 0.05).

The detailed analysis of the cortical activations was patterned after the identification performance, which, as detailed above, was strongly dependent on the interaction between real walk direction and rendering. The direction of the body displacement on the display (hereafter, “visual walk”) was designated as the first main effect and tested by the contrast [(FW/N) + (BW/R)] versus [(FW/R) + (BW/N)]. The time arrow in the playback of the video clip (hereafter, “rendering”) was designated as the second main effect and tested by the contrast [(FW/N) + (BW/N)] versus [(BW/R) + (FW/R)]. Finally, the interaction between visual walk and rendering, which corresponds to the real walking direction (hereafter, “real walk”), was tested by the contrast [(FW/N) − (BW/R)] versus [(BW/N) − (FW/R)].
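The relation among the three contrasts can be made explicit by writing them as weight vectors over the 4 conditions. The sketch below is an illustration only: the helper `contrast` and the condition ordering are choices made here, not the design-matrix layout used in the actual SPM analysis.

```python
# Condition order follows the first-level sorting given in the Methods.
CONDITIONS = ("FW/N", "BW/N", "FW/R", "BW/R")


def contrast(weights_by_condition):
    """Arrange per-condition weights in the fixed CONDITIONS order."""
    return [weights_by_condition[c] for c in CONDITIONS]


# Forward visual walk (FW/N, BW/R) versus backward visual walk (FW/R, BW/N).
visual_walk = contrast({"FW/N": 1, "BW/R": 1, "FW/R": -1, "BW/N": -1})
# Normal rendering versus reverse rendering.
rendering = contrast({"FW/N": 1, "BW/N": 1, "FW/R": -1, "BW/R": -1})
# Real forward walk versus real backward walk:
# [(FW/N) - (BW/R)] - [(BW/N) - (FW/R)].
real_walk = contrast({"FW/N": 1, "FW/R": 1, "BW/N": -1, "BW/R": -1})

# The interaction of two factors in a 2 x 2 design is the element-wise
# product of their main-effect weights.
interaction = [v * r for v, r in zip(visual_walk, rendering)]

print("visual walk:", visual_walk)
print("rendering:  ", rendering)
print("interaction:", interaction, "== real walk:", interaction == real_walk)
```

The element-wise product of the visual walk and rendering weights reproduces the real walk weights, which is why the interaction term isolates the real walking direction. Reversing the sign of any vector gives the opposite contrast (for example, backward > forward visual walk).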

Figure 3 summarizes the results for the visual walk factor. The contrast [(FW/R) + (BW/N)] > [(FW/N) + (BW/R)] (backward visual walk > forward visual walk), which reflects the effect of watching a backward walk, demonstrated significant bilateral activations of the temporo-occipital junction (TOJ). Activation was also detected bilaterally in the intraparietal sulci (IPS) and in the right STS. Table 4 reports the coordinates of the peaks of activation for all clusters. The opposite contrast [(FW/N) + (BW/R)] > [(FW/R) + (BW/N)] (forward visual walk > backward visual walk) failed to detect any significant activation. Significant differences signaled by the GLM analysis may be due to BOLD responses with similar peak amplitude but different time course. We excluded this possible source of confound by showing that the time course of the responses was actually very similar. These comparisons are reported in Supplementary Results.

Table 4

Peaks of cluster activations in Experiment 1

| Contrast | Anatomical area | Cluster size | MNI peak coordinates x y z | Z-score |
| --- | --- | --- | --- | --- |
| Backward visual walk | Left temporo-occipital junction | 951 | −52 −74 | 6.74 |
| Backward visual walk | Right temporo-occipital junction | 1173 | 48 −64 | 6.90 |
| Backward visual walk | Left intraparietal sulcus | 367 | −34 −36 54 | 4.43 |
| Backward visual walk | Right intraparietal sulcus | 425 | 32 −32 40 | 5.06 |
| Backward visual walk | Right superior temporal sulcus | 321 | 68 −34 14 | 4.67 |
| Normal rendering | Left fusiform gyrus | 274 | −38 −30 −26 | 4.21 |
| Normal rendering | Right precuneus | 743 | −40 60 | 3.82 |
| Normal rendering | Anterior cingulate cortex | 295 | 18 34 | 3.85 |
| Forward real walk | Left orbito-frontal cortex | 2633 | −8 36 −12 | 4.59 |
| Forward real walk | Left posterior cingulate gyrus | 403 | −20 −50 38 | 4.04 |
| Backward real walk | Right temporo-occipital junction | 261 | 58 −64 −6 | 3.97 |

Note: whole-brain analysis

Figure 3.

Experiment 1. Effect of visual walk (backward > forward). Bar plots: Activity changes relative to the baseline for each stimulus type (FW: forward real walk, BW: backward real walk; forward/backward: direction of the visual walk). Ordinates: Percent signal change of estimated activity. Error bars indicate SE. Clusters of activation (coordinates of the local maxima in parentheses) are superimposed on coronal sections of the canonical MNI template.

We found no evidence of learning in either behavioral or fMRI data. The rate of correct identifications did not change significantly across the 5 sessions (three-way ANOVA, 2 [FW/BW] × 2 [N/R] × 5 [session], arcsin transformation; session: F4,52 = 2.539, P = 0.073). As for the BOLD signal, we considered both the contrasts [FW/N] > [BW/R] and [BW/N] > [FW/R]. When the whole brain was defined as the volume of interest, no significant change was detected from session 1 to session 5 (regression on all sessions with weights: [−2 −1 0 1 2], P-corr > 0.3). The same analysis was then repeated separately on the 8 ROIs (8-mm spheres) centered on the peaks of the main effects of backward visual walk [(FW/R) + (BW/N)] > [(FW/N) + (BW/R)] and normal rendering [(FW/N) + (BW/N)] > [(FW/R) + (BW/R)] (see Table 4). No significant temporal trends emerged in any ROI (P-uncorr > 0.1).
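The regression across sessions amounts to a linear trend contrast. A minimal sketch under stated assumptions: the function names are hypothetical, and the transform is assumed to be the common arcsine-square-root variant, since the text specifies only an "arcsin transformation".

```python
import math

# Linear trend weights across the 5 sessions, as given in the text.
TREND_WEIGHTS = [-2, -1, 0, 1, 2]


def arcsine_transform(p):
    """Assumed variant: arcsin(sqrt(p)) for a proportion p in [0, 1]."""
    return math.asin(math.sqrt(p))


def linear_trend(session_values, weights=TREND_WEIGHTS):
    """Weighted sum of per-session values; zero when there is no linear trend."""
    if len(session_values) != len(weights):
        raise ValueError("one value per session is required")
    return sum(w * v for w, v in zip(weights, session_values))
```

Because the weights sum to zero, a flat profile across sessions yields a trend score of zero (up to floating-point rounding), whereas a steady increase or decrease yields a positive or negative score that can then be tested against zero across participants.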

The contrast [(FW/N) + (BW/N)] > [(BW/R) + (FW/R)] (normal rendering > reverse rendering) demonstrated significant activity in the left fusiform gyrus (FG), right precuneus (PrCu), and anterior cingulate cortex (ACC). Because estimated activity in the former 2 areas was negative, Figure 4 illustrates the results only in the case of ACC. No significant activation emerged from the opposite contrast [(BW/R) + (FW/R)] > [(FW/N) + (BW/N)].

Figure 4.

Experiment 1. Effect of rendering (normal > reverse). FW: forward real walk; BW: backward real walk; forward/backward: direction of the visual walk. Same format as Figure 3.

For the interaction between visual walk and rendering, which corresponds to the direction of the real walk, both contrasts [(FW/N) − (BW/R)] > [(BW/N) − (FW/R)] and [(BW/N) − (FW/R)] > [(FW/N) − (BW/R)] yielded significant results (Table 4). The first interaction (forward real walk > backward real walk) showed a significant cluster of activity in the left orbito-frontal cortex and in the posterior cingulate gyrus (Table 4). However, as in the case of the rendering contrast (see above), the estimated activity in these areas was negative. The reverse contrast (backward real walk > forward real walk) revealed a significant cluster of activation in right TOJ (Fig. 5). Note that 60% of the voxels in this cluster were also within the corresponding cluster detected by the (backward visual walk > forward visual walk) contrast (cf. Fig. 3). In fact, within the ROIs identified by the contrast (backward real walk > forward real walk), we found a significant activation also for the contrast (backward visual walk > forward visual walk) (t = 5.01, P < 0.001). Thus, the 2 contrasts identified overlapping areas within the same TOJ region.
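The overlap figure quoted above is a simple set ratio. As a sketch, with each cluster represented as a set of voxel index triples on a common grid (a representation assumed here for illustration, not taken from the study):

```python
def overlap_fraction(cluster_a, cluster_b):
    """Fraction of the voxels in cluster_a that also belong to cluster_b.

    Both clusters are sets of (i, j, k) voxel indices in the same image grid.
    """
    if not cluster_a:
        raise ValueError("cluster_a is empty")
    return len(cluster_a & cluster_b) / len(cluster_a)
```

The measure is asymmetric: it is normalized by the size of the first cluster, so the overlap of a small cluster with a large one can be high even when the converse is low.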

Figure 5.

Experiment 1. Effect of real walk (interaction of the main effects shown in Figs 3 and 4). Backward > forward contrast. FW: forward real walk; BW: backward real walk; forward/backward: direction of the visual walk. Same format as Figure 3.

Finally, because smooth pursuit gain was found to depend on the experimental conditions, we tested on a subset of the participants whether eye movements contributed to the pattern of cortical activity (see Methods). Taking the whole brain as the search volume and adding the smooth pursuit gain as a covariate, the main effect (backward visual walk > forward visual walk) was again significant in the right TOJ [50 −74 14], right IPS [28 −42 60], right STS [54 −30 6] (P-corr < 0.05), and left TOJ [−50 −70 2] (P-uncorr < 0.001) in spite of the reduced data set. The interaction (backward real walk > forward real walk) was significant (P-uncorr < 0.001) in the right TOJ [54 −74 10]. Thus, the control confirmed that eye movements did not represent a significant confound.

Experiment 2

Eye Movements

In free-viewing conditions too, several participants showed a spontaneous tendency to pursue the walking figure with smooth eye movements. However, there were some significant differences with respect to Experiment 1. First, because no response was required, the pursuit continued to the end of the video clip. Second, in 5 participants, saccadic scanning was the dominant oculomotor behavior, and a smooth pursuit component could not be reliably isolated. Third, even in the remaining 10 participants, saccadic recapture was more prominent than in the condition where a response was required by the task. As shown in Table 5, in all conditions, the average pursuit gain computed over these 10 participants was significantly lower than in Experiment 1 (cf. Table 2). Statistical analysis (three-way ANOVAs for repeated measures, 2 [FW/BW] × 2 [LR/RL] × 2 [N/R], Greenhouse-Geisser correction) showed that the pursuit gain was affected neither by the orientation of the walk (LR/RL: F1,9 = 0.420, P = 0.533) nor by the rendering (N/R: F1,9 = 0.339, P = 0.575). Unlike Experiment 1, the actual direction of the walk did not affect the pursuit gain (FW/BW: F1,9 = 1.432, P = 0.262).

Table 5

Experiment 2: Smooth pursuit gain averaged over participants and repetitions for each combination of experimental factors

        Left to right       Right to left
FW      0.342   0.373       0.323   0.338
BW      0.387   0.358       0.388   0.331

Left to right/right to left: orientation of the walk; FW: forward real walk; BW: backward real walk; N: normal rendering; R: reverse rendering.

fMRI Results

The contrast [(FW/R) + (BW/N)] > [(FW/N) + (BW/R)] (backward visual walk > forward visual walk) revealed a strong bilateral activation of TOJ (Fig. 6) similar to that revealed by Experiment 1 (cf. Fig. 3). Table 6 reports the cluster size and the coordinates of the activation peaks for both clusters. About 70% of the voxels in these clusters were also within the corresponding clusters detected by the (backward visual walk > forward visual walk) contrast in Experiment 1 (cf. Fig. 3). No significant activation was detected by the contrasts defined by the direction of the real walk. As in Experiment 1, no significant cluster of activity was detected by the reverse contrast (forward visual walk > backward visual walk). No consistent brain activations were associated with the contrast [(FW/N) + (BW/N)] > [(BW/R) + (FW/R)] (normal rendering > reverse rendering). Instead, the opposite contrast [(FW/R) + (BW/R)] > [(BW/N) + (FW/N)] (reverse rendering > normal rendering) revealed a cluster of activity in the right TOJ, not present in Experiment 1.

Table 6

Peaks of cluster activations in Experiment 2

Anatomical areas                     Cluster size   MNI peak coordinates (x y z)   Z-score
Backward visual walk
 Left temporo-occipital junction     915            −48 −74 10                     5.62
 Right temporo-occipital junction    1286           50 −64 8                       6.90
Reverse rendering
 Right temporo-occipital junction    465            50 −64 10                      5.06

Note: whole-brain analysis

Figure 6.

Experiment 2. Effect of visual walk (backward > forward). FW: forward real walk; BW: backward real walk; forward/backward: direction of the visual walk. Same format as Figure 3.

The same pattern of cortical activity emerged whether or not the population sample had a homogeneous oculomotor behavior. We performed an additional second-level analysis of the data after excluding the 5 participants who, instead of pursuing the moving body, explored the stimuli mainly with saccades (see above). The whole-brain analysis detected only a significant activation in the bilateral TOJ for the contrast (backward visual walk > forward visual walk; peaks at [50 −64 6], [−46 −72 6]). The analysis was repeated on the ROIs (8-mm radius spheres) centered on the peaks, as reported in Table 6. The main effect (backward visual walk > forward visual walk) was significant in both left and right TOJ (for all ROIs P < 0.001). The interaction [(BW/N) − (FW/R)] > [(FW/N) − (BW/R)] was significant only in right TOJ (for all ROIs P < 0.05). No other activation emerged from this new analysis.
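A spherical ROI of the kind used above can be sketched as a distance test in MNI space. This is an illustration only: the function names and the 2-mm grid step are assumptions (the voxel size is not stated in this section), and the example center is the right TOJ peak from Table 6.

```python
import math


def in_sphere(point_mm, center_mm, radius_mm=8.0):
    """True if an MNI coordinate (mm) lies within radius_mm of the ROI center."""
    return math.dist(point_mm, center_mm) <= radius_mm


def sphere_grid(center_mm, radius_mm=8.0, step_mm=2.0):
    """Grid points (mm) inside the sphere; step_mm is an assumed voxel size."""
    n = int(radius_mm // step_mm)
    offsets = [i * step_mm for i in range(-n, n + 1)]
    cx, cy, cz = center_mm
    return [
        (cx + dx, cy + dy, cz + dz)
        for dx in offsets
        for dy in offsets
        for dz in offsets
        if math.dist((dx, dy, dz), (0.0, 0.0, 0.0)) <= radius_mm
    ]


# Example: ROI centered on the right TOJ peak reported in Table 6.
right_toj_roi = sphere_grid((50.0, -64.0, 8.0))
```

Averaging the contrast estimates over the points returned by `sphere_grid` would give one value per participant and ROI, which is the quantity a ROI-level test operates on.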

Finally, we explored in further detail the effects of the moving stimuli on the more rostral components of the AON. We considered 4 ROIs in the premotor/prefrontal regions centered on the local maxima (see Fig. 2) that are nearest to the coordinates reported in the meta-analysis by Grosbras et al. (2012): superior frontal gyrus ([−24 2 58]), inferior precentral gyrus (IPG) ([−50 10 32]), and right and left inferior frontal gyrus (IFG) ([52 22 20], [−56 6 22]). Within these ROIs, we calculated the main effects (forward visual walk > backward visual walk) and (normal rendering > reversed rendering) and the interactions (real walk) for both Experiments 1 and 2. Significance was reached only in Experiment 1, for the main effect (forward visual walk > backward visual walk) in the IPG (P < 0.05, corrected for the number of ROIs) and for the interaction in the right IFG (P = 0.015, corrected for the number of ROIs).

Comparing Experiments 1 and 2

The effect of the task was investigated by testing the patterns of activation in Experiment 2 considering the 11 ROIs defined by the clusters activated in Experiment 1 (Table 7; see Methods for ROIs definition). For the contrast (backward visual walk > forward visual walk), the test involved 5 ROIs. In the 2 ROIs within the left and right TOJ, the activity was statistically indistinguishable in the task (Experiment 1) and free-viewing (Experiment 2) conditions (univariate t-test; Table 7). The similarity between experiments of the activity in this area is confirmed by the analysis of the contrast (backward real walk > forward real walk) (univariate t-test; Table 7).

Table 7

Analysis of ROIs

Anatomical areas                              No task (P-corr)   Task > no task (P-uncorr)
Visual walk: backward > forward
 Left temporo-occipital junction              <0.001             0.272
 Right temporo-occipital junction             <0.001             0.619
 Left intraparietal sulcus                    0.79               0.020
 Right intraparietal sulcus                   0.79               0.004
 Right superior temporal sulcus               0.08               0.031
Rendering: normal > reverse
 Left fusiform gyrus                          0.537              0.002
 Anterior cingulate cortex                    0.987              <0.001
 Right precuneus                              0.939              <0.001
Interaction (real walk: forward > backward)
 Left orbito-frontal cortex                   0.437              0.003
 Left posterior cingulate gyrus               0.338              <0.001
Interaction (real walk: backward > forward)
 Right temporo-occipital junction             0.013              0.087

No task: P-values (Bonferroni-corrected for the number of ROIs) for data in Experiment 2 within the ROIs detected in Experiment 1 (Table 4). Task > no task: comparison between Experiments 1 and 2. Uncorrected P-values.
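For reference, the Bonferroni correction mentioned in the legend multiplies each uncorrected P-value by the number of tests and caps the result at 1. A minimal sketch; the function name is hypothetical, and the size of the family of tests is whatever list is passed in, since the legend does not spell out how the ROIs were grouped.

```python
def bonferroni(p_values):
    """Bonferroni-corrected P-values: p * number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]
```

An uncorrected P-value of 0.01 tested alongside two others thus becomes 0.03 after correction, while any value above 1/m saturates at 1.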

The role of TOJ was investigated further by testing simple contrasts in Experiment 2 within the ROIs identified in Experiment 1. The tests showed a pattern in keeping with the behavioral results of Experiment 1 (Table 1). Stimuli showing an actor moving backward on the display activated TOJ more strongly than those showing an actor moving forward: ([BW/N > FW/N]: P = 0.001; [FW/R > BW/R]: P = 0.02; [BW/N > BW/R]: P = 0.0793; [FW/R > FW/N]: P < 0.001). Moreover, the activity in the condition BW/R was higher than in the condition FW/N (P = 0.0001). In contrast, conditions FW/R and BW/N were indistinguishable (P = 0.358). Thus, whether or not an identification response was required, TOJ responded differentially to the visual direction of the stimuli. Moreover, within forward visual walk stimuli, the activity in TOJ discriminated between rendering modes.

In the other ROIs associated in Experiment 1 with the contrasts (backward visual walk > forward visual walk), (normal > reverse), and (backward real walk > forward real walk), the free-viewing condition of Experiment 2 failed to confirm the differential activations found previously. The difference between experiments within these ROIs was confirmed statistically (univariate t-test, Table 7).

Discussion

Video clips of walking actors were shown in 2 brain imaging experiments. Four types of displays resulted from crossing the direction of the real walk with the rendering of the video clip. Identification performance in Experiment 1 (Table 1) confirmed the results of Viviani et al. (2011). Accuracy in deciding whether a stimulus was the original video clip or its time-reversed version was at chance level in the 2 conditions (FW/R and BW/N) where the actor's body moved backward on the display (backward visual walk). Accuracy improved significantly in the other 2 conditions (forward visual walk), being very good when a real forward walk was rendered normally (FW/N; P{C} = 0.870) and well above chance level when a real backward walk was time-reversed (BW/R; P{C} = 0.703).

The main fMRI result was that TOJ responded bilaterally more vigorously to backward visual walk (when it is impossible to identify the rendering mode) than to forward visual walk, irrespective of whether the backward walk was real or resulted from time-reversing a forward walk. The inverse relation between TOJ activation and identification performance was emphasized further by the analysis of simple effects. Not only did identifiable stimuli (FW/N and BW/R) activate TOJ less than indistinguishable ones (BW/N and FW/R), but also the cortical response was weaker in the easier condition (FW/N) than in the more difficult one (BW/R). TOJ activity was very similar both when the task required an identification response (Experiment 1) and in the free-viewing condition (Experiment 2). Instead, the areas that are generally considered as components of the AON reacted differently in the 2 experiments. In Experiment 1, sensitivity to the rendering mode emerged most clearly in the anterior cingulate cortex (ACC), but only when the visual walk was forward and identification responses were fairly accurate.

Effect of Visual Walk Direction on TOJ

The first issue to be addressed is the functional significance of TOJ activity in the context of our experiment. The finding that TOJ responds differently to stimuli that do or do not comply with the usual walking direction is incompatible with one influential model for the recognition of biological motion (Giese and Poggio 2003), postulating that differences in cortical response to the forward and backward walk should emerge no earlier than STS. Our results are instead in keeping with more recent evidence on the role of the early stages of visual processing in body perception. Jastorff and Orban (2009) suggested that an integration of the actor's body with the action he/she is performing takes place already in the temporo-occipital cortex. Moreover, de Lange et al. (2008) showed that actions implemented in unusual manners elicit a stronger activity than ordinary actions in the lateral temporo-occipital cortex, where the context in which body parts are viewed would be taken into account. It is already known that an area of the lateral temporo-occipital cortex (EBA) responds selectively to visual stimuli representing body movements or body parts (Downing et al. 2001; Peelen et al. 2006), is activated by limb movements (Astafiev et al. 2004), links a portrayed movement to the body scheme (Jastorff and Orban 2009), and distinguishes self- from other-generated movements (David et al. 2007). Along similar lines, TOJ responses to forward and backward visual walk may be taken to indicate that this area is selectively sensitive to the global familiarity of the action.

Neural Response to Familiarity

It is generally agreed that activity in the AON increases in response to stimuli conforming to the motor experience of the observer (Calvo-Merino et al. 2005, 2006; Aglioti et al. 2008; Cross, Hamilton, et al. 2009; Cross, Kraemer, et al. 2009). Moreover, Giese and Poggio (2003) argued that a video clip of walking played normally should elicit significantly more activity than that played in reverse. We found instead that TOJ responds more strongly to an action (backward walk) that we rarely execute and rarely see other people executing. A similar contradiction has been noted by Cross et al. (2012), who hypothesized a U-shaped relation between cortical activity and the degree of familiarity of the action being displayed, so that both very familiar and unfamiliar actions would result in high BOLD signals. These authors argued that, within a Bayesian perspective, the non-monotonic relationship between familiarity and cortical activity is compatible with the notion of prediction error (Kilner et al. 2007a, 2007b). Specifically, one may assume that, upon seeing the onset of an action, we entertain prior expectations about its future course, which are all the more specific the more familiar the action appears to be. Therefore, large prediction errors (i.e., sharp discrepancies between expected and actual course of action) would occur both when the action is very unfamiliar and the associated prior expectations are poorly defined, and when the action is very familiar, but its actual course diverges from the expected one. In either case, the level of activation would correlate with the prediction error. Seeing someone walking is a very familiar visual experience. Thus, if we adopt the above line of reasoning, the strong activation of TOJ for backward visual walk suggests that this area is signaling a strong discrepancy between expected and actual input. Of course, the main discrepancy arises when the visual direction of the body displacement is opposite to the natural one.
However, kinematics may be a further source of discrepancy. As mentioned in the Introduction, the kinematics of a backward walk is significantly different from that of a forward walk reversed in time. Thus, in the BW/R condition, the body is actually moving in the expected forward direction, but the kinematics is somewhat at variance with the visual direction. On the one hand, this may explain why identification performance is poorer with BW/R than with FW/N stimuli. Let us assume that the forward visual direction biases responses toward the “as recorded” option. Then, the perfect match of FW/N stimuli with the expected kinematic template would further prime this (correct) response. Instead, the assumed bias is in conflict with the mismatch detected in BW/R stimuli, which favors a “reversed” (correct) response. On the other hand, violations of kinematic expectations may also account for the result of the interaction (backward real walk > forward real walk), showing that, when the visual walk is forward, TOJ is activated more strongly by backward (BW/R) than by forward (FW/N) real walk (see lower panel in Fig. 5). Thus, even when the visual direction is familiar, TOJ signals that the kinematics diverges from the one expected in the case of a real forward walk.

The demonstration that TOJ was activated differentially by the direction of the visual walk lends itself to at least 2 interpretations. Both interpretations share the premise that TOJ reacts to backward visual walk by signaling a mismatch between the stimulus and an internal model embodying the a priori expectations of the observer. The interpretations diverge somewhat insofar as the consequences of the mismatch are concerned. On the one hand, it can be surmised (Friston 2005) that the brain actually uses the mismatch as an error signal to broaden the scope of the internal model, by progressively including kinematic configurations not previously experienced. In other words, the error signal would provide the basis for perceptual learning. On the other hand, as suggested by Viviani et al. (2011), the mismatch may prevent the extraction of the kinematic cues allowing the observer to detect temporal inversions. If so, a failure to pick up and feed forward these cues should have no consequence in free-viewing conditions. In contrast, when a response is requested it should be expected that areas downstream from the TOJ react differently to the direction of the visual walk.

It is not easy to assess conclusively the merits of these 2 interpretations. The fact that we detected no change across sessions in either performance or cortical response speaks against the hypothesis that the error signal is used to support learning. Moreover, in 3 areas downstream from the TOJ (IPS bilaterally and right STS), the response was stronger to backward than to forward visual walk (Fig. 3), suggesting a spread of the mismatch detected by the TOJ. However, because the time constant of the learning process may exceed the relatively short (40 min) duration of our experiments, one cannot exclude that the 2 interpretations are both valid. On a short time scale, the incongruence between expected and actual kinematics may indeed prevent any further processing of the stimuli. At the same time, with a prolonged exposure to the unusual (backward) visual walk, the error signal might drive an update of the expectations and a corresponding improvement of the performance.

Two related factors may provide an alternative account of the modulation of cortical activity by the direction of the visual walk. One can argue that identifying the rendering mode is both more difficult and more attention demanding for the unusual backward visual walk than for familiar forward visual walk. Thus, the increased BOLD signal in the former case may simply reflect both the increased difficulty of the task and the corresponding higher level of attention required to cope with it. In fact, this hypothesis is in agreement with the observed IPS responses, which in Experiment 1 were higher to backward than to forward visual walk (Fig. 3) and disappeared in Experiment 2. In contrast, in the TOJ, the BOLD signal was essentially the same (Table 7), irrespective of whether the identification task was imposed (Experiment 1) or not (Experiment 2). Therefore, task difficulty and attention levels do not seem to provide a viable alternative to the assumption that TOJ activity signals a mismatch between the actual and expected visual direction.

The activation of STS and IPS in the context of the decision process is compatible with the role of these structures. STS is involved in the understanding of dynamic stimuli (Puce and Perrett 2003; Herrington et al. 2011). It has been suggested that STS codes walking direction (Oram and Perrett 1996) and is activated when the observer's predictions on the unfolding of the action are violated (Grèzes et al. 2003; Jastorff et al. 2011). Also, the posterior parietal area is involved in the processing of the visual properties of movements and in the generation of visuo-motor transformations (Decety and Grèzes 1999). In particular, the IPS is supposed to be instrumental for bringing the observer's motor competence to bear for understanding the goal of an action (Fogassi et al. 2005), and for discriminating among different actions (Jastorff et al. 2010). The fact that both STS and IPS were activated more energetically when rendering identification is close to impossible (FW/R and BW/N) may then reflect the difficulty arising with unfamiliar stimuli, when the decision process is not supported by adequate discriminal information (Liew et al. 2011).

Is Perceptual Identification Mediated by Motor Competence?

In our previous study (Viviani et al. 2011), we discussed why the difference in step size between FW and BW is unlikely to be a critical cue for deciding whether or not the stimuli are time-reversed. We argued that discriminal cues could instead be provided by the interplay between perceived kinematics and implicit motor competence. Parietal and premotor areas encode both performed and observed actions in a way that is fairly unique for different actions (Rizzolatti and Craighero 2004; Kilner et al. 2007a, b; Oosterhof et al. 2010). If indeed the forward-moving body triggers a motoric simulation of the action, identification may then result from a congruence check between perceived and expected kinematics. The results did not support this suggestion. When a response was not required, forward and backward visual walk did not activate differentially any of the regions that, according to recent meta-analyses (Caspers et al. 2010; Grosbras et al. 2012), belong to the AON. Moreover, none of these regions reacted differently to normal and time-reversed stimuli. Conversely, the 3 areas that in Experiment 1 were activated more strongly by normal than by time-reversed stimuli, namely the left FG, the PrCu, and the ACC (Fig. 4), are not known to be involved in the interplay between motoric competence and perception. Instead, the available evidence invites the inference that time-reversals are detected mostly by perceptual processes, rather early within the visual stream.

Effect of Rendering Mode

As mentioned above, the (normal rendering vs. reverse rendering) contrast detected activity in the FG, PrCu, and ACC. FG has a role in the identification of biological motion (Bonda et al. 1996; Vaina et al. 2001; Grossman and Blake 2002; Beauchamp et al. 2003). The observed PrCu sensitivity to the rendering mode (Table 4) is consistent with previous work suggesting that this area is involved in the accumulation of information over a time window required to perceive the coherence of visual events (Bischoff-Grethe et al. 2000; Hasson et al. 2008). However, the functional significance of these 2 areas in the present context is unclear because they were actually deactivated. Finally, the anterior cingulate cortex (ACC) has a role in monitoring conflicts between competing inputs (Botvinick et al. 1999; Carter et al. 1999), decision making, and action selection (Rushworth et al. 2007). The dorsal region of the ACC is activated by cognitively demanding tasks in which conflicting evidence may lead to erroneous responses (Bush et al. 2000). The difference between ACC activity elicited by normal and reversed stimuli was larger when visual walk was forward than backward. Insofar as ACC is directly connected to both premotor and motor areas (Hatanaka et al. 2003), this pattern of activations may correspond to the effect of the rendering mode on one relay station prior to the motor response.

Summary and Conclusions

We have shown that, as face recognition is crucially dependent on the familiar (upright) spatial orientation, so correct perception of walk kinematics is crucially dependent on the familiar (forward) visual direction of walk. In both cases, visual familiarity dictates whether or not stimuli details are accurately processed. It has been claimed that different components of the AON deal with different aspects of the action (intention, rationality, kinematics, and activated body part). In particular, familiarity is supposed to be a rather abstract aspect of the action evaluated within parietal and motor areas (Calvo-Merino et al. 2006). The new significant contribution of this study is the demonstration that familiar and unfamiliar walking directions are instead discriminated fairly early along the visual stream, mainly in the TOJ. Because a similar pattern of activity in this area was observed in both the task and free-viewing conditions, TOJ appears to perform a preliminary automatic analysis of the stimuli. More specifically, we suggest that unfamiliar stimuli evoke an error signal that interferes with the further processing of the kinematics and is ultimately responsible for the poor identification performance with these stimuli.

TOJ may also have a role when identification of the rendering mode is feasible. Asked whether a walk has been shown as recorded or reversed in time, participants in Experiment 1 were quite accurate when the body moved forward. To respond “Normal” to FW/N trials, the participant had to decide that no element in the display was in conflict with the visual direction. To respond “Reversed” to BW/R trials, the observer must have detected at least one incongruence with respect to the visual direction. Albeit different, both decision criteria necessarily require the availability of kinematic templates describing the expected time course of the stance and swing phase in forward walk. The fMRI results of Experiment 1 cannot conclusively settle the outstanding issue of where this information becomes available. Indeed, the (backward visual direction > forward visual direction) contrast of Figure 3 showed activations in several areas, including parietal and prefrontal ones, that are generally included in the AON. However, the (backward real walk > forward real walk) contrast (Fig. 5) demonstrated that, when the visual direction is forward, TOJ is activated more strongly by backward (BW/R) than by forward (FW/N) real walk. This may be taken to indicate that, as suggested above, correct responses to BW/R trials are based on a mismatch between incoming information and a kinematic walk template, and that the mismatch is detected as early as TOJ. The involvement in the identification process of other areas downstream from TOJ could not be demonstrated conclusively. Further research focusing on the 2 conditions conducive to correct identification is necessary to investigate this issue.

Supplementary Material

Supplementary material can be found at: http://www.cercor.oxfordjournals.org/.

Funding

This work was supported by the Italian Ministry of Health (RF-10.057 grant), Italian Ministry of University and Research (PRIN grant), and Italian Space Agency (CRUSOE and COREA grants).

Notes

We thank the anonymous referees for their comments and suggestions. Conflict of Interest: The authors declare no competing financial interests.

References

Aglioti SM, Cesari P, Romani M, Urgesi C. 2008. Action anticipation and motor resonance in elite basketball players. Nat Neurosci. 11:1109–1116.

Astafiev SV, Stanley CM, Shulman GL, Corbetta M. 2004. Extrastriate body area in human occipital cortex responds to the performance of motor actions. Nat Neurosci. 7:542–548.

Beauchamp MS, Lee KE, Haxby JV, Martin A. 2003. fMRI responses to video and point-light displays of moving humans and manipulable objects. J Cogn Neurosci. 15:991–1001.

Bischoff-Grethe A, Proper SM, Mao H, Daniels KA, Berns GS. 2000. Conscious and unconscious processing of nonverbal predictability in Wernicke's area. J Neurosci. 20:1975–1981.

Bonda E, Petrides M, Ostry D, Evans A. 1996. Specific involvement of human parietal systems and the amygdala in the perception of biological motion. J Neurosci. 16:3737–3744.

Botvinick M, Nystrom LE, Fissel K, Carter CS, Cohen JD. 1999. Conflict monitoring versus selection-for-action in anterior cingulate cortex. Nature. 402:179–181.

Brass M, Schmitt RM, Spengler S, Gergely G. 2007. Investigating action understanding: inferential processes versus action simulation. Curr Biol. 17:2117–2121.

Bush G, Luu P, Posner MI. 2000. Cognitive and emotional influences in anterior cingulate cortex. Trends Cogn Sci. 4:215–222.

Calvo-Merino B, Glaser DE, Grèzes J, Passingham RE, Haggard P. 2005. Action observation and acquired motor skills: an fMRI study with expert dancers. Cereb Cortex. 15:1243–1249.

Calvo-Merino B, Grèzes J, Glaser DE, Passingham RE, Haggard P. 2006. Seeing or doing? Influence of visual and motor familiarity in action observation. Curr Biol. 16:1905–1910.

Carter CS, Botvinick MM, Cohen JD. 1999. The contribution of the anterior cingulate cortex to executive processes in cognition. Rev Neurosci. 10:49–57.

Caspers S, Zilles K, Laird AR, Eickhoff SB. 2010. ALE meta-analysis of action observation and imitation in the human brain. Neuroimage. 50:1148–1167.

Cross ES, Hamilton AF, Grafton ST. 2006. Building a motor simulation de novo: observation of dance by dancers. Neuroimage. 31:1257–1267.

Cross ES, Hamilton AF, Kraemer DJ, Kelley WM, Grafton ST. 2009. Dissociable substrates for body motion and physical experience in the human action observation network. Eur J Neurosci. 30:1383–1392.

Cross ES, Kraemer DJM, Hamilton AF, Kelley WM, Grafton ST. 2009. Sensitivity of the action observation network to physical and observational learning. Cereb Cortex. 19:315–326.

Cross ES, Liepelt R, Hamilton AF de C, Parkinson J, Ramsey R, Stadler W, Prinz W. 2012. Robotic movement preferentially engages the action observation network. Hum Brain Mapp. 33:2238–2254.

David N, Cohen MX, Newen A, Bewernick BH, Shah NJ, Fink GR, Vogeley K. 2007. The extrastriate cortex distinguishes between the consequences of one's own and others' behavior. Neuroimage. 36:1004–1014.

Decety J, Chaminade T. 2003. When the self represents the other: a new cognitive neuroscience view on psychological identification. Conscious Cogn. 12:577–596.

Decety J, Grèzes J. 1999. Neural mechanisms subserving the perception of human actions. Trends Cogn Sci. 3:172–178.

de Lange FP, Spronk M, Willems RM, Toni I, Bekkering H. 2008. Complementary systems for understanding action intentions. Curr Biol. 18:454–457.

Downing PE, Jiang Y, Shuman M, Kanwisher N. 2001. A cortical area selective for visual processing of the human body. Science. 293:2470–2473.

Duvernoy HM. 1991. The human brain: surface, three-dimensional sectional anatomy and MRI. New York: Springer.

Fogassi L, Ferrari PF, Gesierich B, Rozzi S, Chersi F, Rizzolatti G. 2005. Parietal lobe: from action organization to intention understanding. Science. 308:662–667.

Friston K. 2005. A theory of cortical responses. Philos Trans R Soc Lond B Biol Sci. 360:815–836.

Frith CD, Frith U. 2006. How we predict what other people are going to do. Brain Res. 1079:36–46.

Frith U, Frith CD. 2003. Development and neurophysiology of mentalizing. Philos Trans R Soc Lond B Biol Sci. 358:459–473.

Gallese V. 2006. Intentional attunement: a neurophysiological perspective on social cognition and its disruption in autism. Brain Res. 1079:15–24.

Gallese V, Goldman A. 1998. Mirror neurons and the simulation theory of mindreading. Trends Cogn Sci. 2:493–501.

Gallese V, Keysers C, Rizzolatti G. 2004. A unifying view of the basis of social cognition. Trends Cogn Sci. 8:396–403.

Giese MA, Poggio T. 2003. Neural mechanisms for the recognition of biological movements. Nat Rev Neurosci. 4:179–192.

Grasso R, Bianchi L, Lacquaniti F. 1998. Motor patterns for human gait: backward versus forward locomotion. J Neurophysiol. 80:1868–1885.

Grèzes J, Frith CD, Passingham RE. 2003. Inferring false beliefs from the actions of oneself and others: an fMRI study. Neuroimage. 21:744–750.

Grosbras MH, Beaton S, Eickhoff SB. 2012. Brain regions involved in human movement perception: a quantitative voxel-based meta-analysis. Hum Brain Mapp. 33:431–454.

Grossman ED, Blake R. 2002. Brain areas active during visual perception of biological motion. Neuron. 35:1167–1175.

Hagberg GE, Zito G, Patria F, Sanes JN. 2001. Improved detection of event-related functional MRI signals using probability functions. Neuroimage. 14:1193–1205.

Han Z, Bi Y, Chen J, Chen Q, He Y, Caramazza A. 2013. Distinct regions of right temporal cortex are associated with biological and human-agent motion: functional magnetic resonance imaging and neuropsychological evidence. J Neurosci. 33:15442–15453.

Hasson U, Yang E, Vallines I, Heeger DJ, Rubin N. 2008. A hierarchy of temporal receptive windows in human cortex. J Neurosci. 28:2539–2550.

Hatanaka N, Tokuno H, Hamada I, Inase M, Ito Y, Imanishi M, Hasegawa N, Akazawa T, Nambu A, Takada M. 2003. Thalamocortical and intracortical connections of monkey cingulate motor areas. J Comp Neurol. 462:121–138.

Herrington JD, Nymberg C, Schultz RT. 2011. Biological motion task performance predicts superior temporal sulcus activity. Brain Cogn. 77:372–381.

Iacoboni M. 2005. Understanding others: imitation, language, and empathy. In: Perspectives on imitation: from neuroscience to social science. Massachusetts: MIT Press. p. 77–100.

Jastorff J, Begliomini C, Fabbri-Destro M, Rizzolatti G, Orban GA. 2010. Coding observed motor acts: different organizational principles in the parietal and premotor cortex of humans. J Neurophysiol. 104:128–140.

Jastorff J, Clavagnier S, Gergely G, Orban GA. 2011. Neural mechanisms of understanding rational actions: middle temporal gyrus activation by contextual violation. Cereb Cortex. 21:318–329.

Jastorff J, Orban GA. 2009. Human functional magnetic resonance imaging reveals separation and integration of shape and motion cues in biological motion processing. J Neurosci. 29:7315–7329.

Katzavelis D, Mukherjee M, Decker L, Stergiou N. 2010. Variability of lower extremity joint kinematics during backward walking in a virtual environment. Nonlin Dyn Psychol Life Sci. 14:165–178.

Kilner JM, Friston KJ, Frith CD. 2007a. The mirror-neuron system: a Bayesian perspective. Neuroreport. 18:619–623.

Kilner JM, Friston KJ, Frith CD. 2007b. Predictive coding: an account of the mirror neuron system. Cogn Process. 8:159–166.

Kilner JM, Neal A, Weiskopf N, Friston KJ, Frith CD. 2009. Evidence of mirror neurons in human inferior frontal gyrus. J Neurosci. 29:10153–10159.

Liew SL, Han S, Aziz-Zadeh L. 2011. Familiarity modulates mirror neuron and mentalizing regions during intention understanding. Hum Brain Mapp. 32:1986–1997.

Lingnau A, Gesierich B, Caramazza A. 2009. Asymmetric fMRI adaptation reveals no evidence for mirror neurons in humans. Proc Natl Acad Sci USA. 106:9925–9930.

Mahon BZ, Caramazza A. 2008. A critical look at the embodied cognition hypothesis and a new proposal for grounding conceptual content. J Physiol (Paris). 102:59–70.

Molnar-Szakacs I, Iacoboni M, Koski L, Mazziotta JC. 2005. Functional segregation within pars opercularis of the inferior frontal gyrus: evidence from fMRI studies of imitation and action observation. Cereb Cortex. 15:986–994.

Oosterhof NN, Wiggett AJ, Diedrichsen J, Tipper SP, Downing PE. 2010. Surface-based information mapping reveals crossmodal vision-action representations in human parietal and occipito-temporal cortex. J Neurophysiol. 104:1077–1089.

Oram MW, Perrett DI. 1996. Integration of form and motion in the anterior superior temporal polysensory area (STPa) of the macaque monkey. J Neurophysiol. 76:109–129.

Peelen MV, Wiggett AJ, Downing PE. 2006. Patterns of fMRI activity dissociate overlapping functional brain areas that respond to biological motion. Neuron. 49:815–822.

Pelphrey KA, Morris JP, McCarthy G. 2004. Grasping the intentions of others: the perceived intentionality of an action influences activity in the superior temporal sulcus during social perception. J Cogn Neurosci. 16:1706–1716.

Penny WD, Holmes AP, Friston KJ. 2003. Random effects analysis. In: Frackowiak RSJ, Friston KJ, Frith C, Dolan R, Price CJ, Zeki S, Ashburner J, Penny WD, editors. Human brain function. 2nd ed. San Diego: Academic Press. p. 843–850.

Press C. 2011. Action observation and robotic agents: learning and anthropomorphism. Neurosci Biobehav Rev. 35:1410–1418.

Puce A, Perrett D. 2003. Electrophysiology and brain imaging of biological motion. Philos Trans R Soc Lond B Biol Sci. 358:435–445.

Rizzolatti G, Craighero L. 2004. The mirror-neuron system. Annu Rev Neurosci. 27:169–192.

Rushworth MFS, Behrens TEJ, Rudebeck PH, Walton ME. 2007. Contrasting roles for cingulate and orbitofrontal cortex in decisions and social behavior. Trends Cogn Sci. 11:168–176.

Thioux M, Gazzola V, Keysers C. 2008. Action understanding: how, what and why. Curr Biol. 18:R431–R434.

Thorstensson A. 1986. How is the normal locomotor program modified to produce backward walking? Exp Brain Res. 61:664–668.

Vaina LM, Solomon J, Chowdhury S, Sinha P, Belliveau JW. 2001. Functional neuroanatomy of biological motion perception in humans. Proc Natl Acad Sci USA. 98:11656–11661.

Viviani P, Figliozzi F, Campione GC, Lacquaniti F. 2011. Detecting temporal reversals in human locomotion. Exp Brain Res. 214:93–103.

Winter RF, Pluck N, Yang JF. 1989. Backward walking: a simple reversal of forward walking? J Motor Behav. 21:291–305.

Author notes

Vincenzo Maffei and Maria Assunta Giusti contributed equally to the work.