Recognition of biological motion is one of the essential ingredients of human evolutionary survival. When biological motion is revealed solely by a set of light dots on the joints of an invisible human figure, the perceptual system reliably distinguishes it from similar configurations. Here, we assessed the changes in neuromagnetic cortical responses during visual perception of biological motion. Healthy humans saw a randomized set of stimuli consisting of a point-light canonical walker and a scrambled configuration in which the spatial positions of dots were randomly rearranged on the screen. In separate runs, configurations were presented in either an upright or an inverted (180°) orientation in the image plane. Participants performed a one-back repetition task, lifting a forefinger in response to the second of two consecutive identical stimuli of each type. Both recognizable upright and non-recognizable inverted walkers evoke enhancements in oscillatory gamma brain activity (25–30 Hz) over the left occipital cortices as early as 100 ms from stimulus onset. Only a recognizable upright walker, however, yields further consecutive peaks over the parietal (130 ms) and right temporal (170 ms) lobes. Scrambled displays do not elicit any increases in the gamma response. The stimulus-specific time course and topographic dynamics of cortical oscillatory activity indicate that the brain rapidly dissociates spatial coherence and meaning revealed through biological movement.
The remarkable ability of the brain to interpret continuously changing configurations as single coherent objects supports functional adaptive behavior in an ever-changing environment. In human vision, in accord with the initial findings in the anaesthetized and alert cat (Eckhorn et al., 1988; Gray et al., 1989; Gray and Viana Di Prisco, 1997) and in the awake monkey (Eckhorn et al., 1993; Kreiter and Singer, 1996), cortical synchronized oscillatory activity of neuron populations in the gamma frequency range of 20–80 Hz is thought to underlie perception of Gestalt-like patterns (e.g. Rodriguez et al., 1999; Tallon-Baudry and Bertrand, 1999). Although many investigators are striving to test this assumption, it has proven non-trivial to verify it experimentally. Moreover, the existing data are controversial (e.g. Thiele and Stoner, 2003). In humans, the evoked (phase-locked to stimulus onset) and induced (non-phase-locked) electroencephalographic (EEG) gamma response increases both with a real and an illusory Kanizsa triangle, but not with a similar incoherent one (Tallon et al., 1995; Tallon-Baudry et al., 1996). Early (100–200 ms) peaks in evoked magnetoencephalographic (MEG) oscillatory activity have been observed in response to Kanizsa figures (Herrmann and Mecklinger, 2000). Induced gamma range EEG activity is enhanced when bistable faces rotating in the image plane are presented at orientations with which their expression is easily perceived (Keil et al., 1999). Yet, it remains unclear whether oscillatory activity varies with stimulus coherence by itself or with a meaningful representation of a coherent structure. To distinguish between and test the viability of these possibilities, we assessed changes in the neuromagnetic oscillatory brain response during perception of biological motion stimuli.
Anecdotal evidence abounds on the ability of humans to extract accurate perceptual information based only on the dynamic outline of a biological organism moving in the distance or in dim light. This is especially true when the walker is someone we know personally, but occurs as well with acquaintances and strangers. For example, the Russian psychiatrist Gannushkin claimed that he was able to recognize mental disorders of patients simply by observing their changing outline as they moved about in a dimly lit room. The perceptual basis of his claim receives support from numerous psychophysical investigations conducted over the past 30 years (e.g. Johansson, 1976; Barclay et al., 1978; Runeson and Frykholm, 1983; Neri et al., 1998; Pavlova and Sokolov, 2000).
In order to isolate information about structure-from-motion from featural or semantic information, a set of discrete light dots is attached to the main joints and head of a human body. Although in the resulting display only a few dots moving against a dark background are visible, the visual system is exquisitely sensitive to implicit coherent structure revealed through biological movement: human infants, monkeys, bottlenose dolphins, cats and even new-born chicks readily perceive point-light displays (Fox and McDaniel, 1982; Herman et al., 1990; Blake, 1993; Oram and Perrett, 1994; for a review, see Pavlova et al., 2001). Inverted stimulus presentation, however, dramatically impedes veridical recognition of a point-light walker (Sumi, 1984; Pavlova and Sokolov, 2000). Because an inverted display retains the same relational structure as an upright one, we changed the orientation of a configuration to examine how oscillatory brain activity varies with a reconstruction of ‘hidden’ meaning. We hypothesized that if changes in the neuromagnetic gamma response are connected with the reconstruction of meaning, then a canonical, easily recognizable point-light walker will evoke greater gamma activity than the same figure presented upside-down. To control for spatial coherence, we used a scrambled configuration in which the spatial positions of dots were randomly rearranged on the screen so that the display lacked the implicit coherent structure of a canonical figure (Fig. 1).
The other important motivation for this work was a desire to uncover the time course and dynamic topography of the brain response to biological motion stimuli. Although the brain mechanisms underlying the perception of biological motion are currently being explored by neuroimaging techniques such as positron emission tomography (PET) and especially functional magnetic resonance imaging (fMRI) (Bonda et al., 1996; Castelli et al., 2000; Grossman et al., 2000; Grèzes et al., 2001; Grossman and Blake, 2001, 2002; Vaina et al., 2001; Servos et al., 2002; Pelphrey et al., 2003; Puce and Perrett, 2003), the findings are entirely restricted to localization of cortical regions involved in biological motion processing. Yet the changes in the brain activation unfolding over time remain largely unknown.
Materials and Methods
Fourteen right-handed paid volunteers (seven females, seven males, age range 20–32 years) with normal vision participated. None had a history of neurological disorders. The study complied with the requirements of the Ethical Committee of the University of Tübingen Medical School (Ethik-Kommission der Medizinischen Fakultät der Universität Tübingen), and was conducted with the written informed consent of each subject.
Two types of computer-generated stimuli were used. A canonical point-light walker consisted of 11 dots on the head and main joints of an invisible figure (Fig. 1). The figure, facing right, was seen moving as if on a treadmill. A gait cycle was completed in 40 frames with frame duration of 31 ms that produced a walking speed of ∼48 cycles per minute. For a scrambled configuration, the spatial positions of dots were randomly rearranged on the screen so that the display lacked an implicit coherent structure of a canonical walker. The motion of each point of the scrambled display was identical to the motion of one of the points defining the canonical figure. The size, luminance and phase relations of the dots also remained unchanged. The configurations were generated by Cutting’s algorithm (Cutting, 1978) and subtended a visual angle of 9° in height and 6° in width.
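The timing and the scrambling manipulation described above can be illustrated with a short sketch. It is illustrative only: the stimuli in the study were generated with Cutting's algorithm, which is not reproduced here. The function names, the uniform random placement of dots and the display bounds are assumptions introduced for this example.

```python
import random

FRAMES_PER_CYCLE = 40   # frames per gait cycle (from the text)
FRAME_MS = 31           # frame duration in ms (from the text)


def cycles_per_minute(frames=FRAMES_PER_CYCLE, frame_ms=FRAME_MS):
    """Gait cycles per minute implied by the frame count and frame duration."""
    return 60_000 / (frames * frame_ms)


def scramble_positions(trajectories, width, height, rng):
    """Relocate each dot to a random start position, keeping its motion.

    trajectories: one list of (x, y) positions per dot, one entry per frame.
    Uniform placement within width x height is an assumption, not the
    procedure used in the study.
    """
    scrambled = []
    for trajectory in trajectories:
        x0, y0 = trajectory[0]
        new_x, new_y = rng.uniform(0, width), rng.uniform(0, height)
        # Frame-to-frame displacement of the dot is preserved exactly.
        scrambled.append(
            [(new_x + (x - x0), new_y + (y - y0)) for x, y in trajectory]
        )
    return scrambled


if __name__ == "__main__":
    print(round(cycles_per_minute(), 1))
```

With 40 frames of 31 ms each, one cycle lasts 1240 ms, which corresponds to roughly 48 cycles per minute, in agreement with the value given in the text.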
A randomized set of 200 stimuli of both types (canonical and scrambled point-light configurations) was presented in two separate runs either with an upright or inverted display orientation. The order of runs was counter-balanced between subjects. Participants performed a one-back repetition task: they lifted a forefinger following the offset of the second of two consecutive identical stimuli of each type. This movement triggered a light barrier signal. For example, in a string of stimuli A and B (AA*BABB*AA*BB*BABB*), the requested points of the responses are designated by asterisks. Participants had no explicit identification task. They were also unaware of the nature of the point-light stimuli. The reasons for using this experimental design were threefold. First, because oscillatory brain activity may be modulated by attention mechanisms (Reynolds and Desimone, 1999; Sokolov et al., 1999; Haenschel et al., 2000), the task allocates subjects’ attention to all types of stimuli and reduces possible effects of task-driven attention. Secondly, this procedure eliminates the influence of motor activity on recorded MEG traces (Kristeva-Feige et al., 1993) and helps to avoid possible effects of sensory adaptation. Accordingly, the only trials analyzed were those for which motor responses were not required by the task. Thirdly, the procedure also avoids making the task one of explicit identification. It is known that active visual search [e.g. for a camouflaged Dalmatian dog (Tallon-Baudry et al., 1997)] or even an explicit categorization task (Rodriguez et al., 1999) may lead, irrespective of stimulus type, to an increase in oscillatory gamma-range activity. The participants, therefore, were asked about their visual impressions only after the whole run was completed. After each run, participants were asked to rate the display vividness on a five-point equal-spaced unipolar scale.
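The response rule of the one-back task can be made concrete with a minimal sketch. It assumes, as the example string suggests, that a response closes a pair, so that a third consecutive identical stimulus starts a new potential pair rather than requiring another response. The function names are hypothetical.

```python
def response_points(stimuli):
    """Indices of stimuli after which a finger lift is required."""
    targets = []
    previous = None
    for index, stimulus in enumerate(stimuli):
        if stimulus == previous:
            targets.append(index)
            previous = None      # assumption: a response closes the pair
        else:
            previous = stimulus
    return targets


def mark_responses(stimuli):
    """Return the stimulus string with an asterisk after each target."""
    targets = set(response_points(stimuli))
    return "".join(
        s + ("*" if i in targets else "") for i, s in enumerate(stimuli)
    )


if __name__ == "__main__":
    print(mark_responses("AABABBAABBBABB"))  # AA*BABB*AA*BB*BABB*
```

Under this reading the sketch reproduces the example string given in the text. Trials at the returned indices are the ones excluded from the MEG analysis, because they require a motor response.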
Participants visually fixated a gray cross in the middle of the screen that was seen during the whole run. The stimulus appeared for 650 ms on a blank screen with an inter-stimulus interval that varied randomly between 2.5 and 3.0 s. For analysis of behavioral errors, separately for each subject, we calculated the miss rate as a ratio of the number of failures to respond to the second identical stimulus of each type to the total number of the required responses. Similarly, for analysis of the false alarm rate, the number of false alarms for each type of stimulus was divided by the total number of trials in which this type of error might occur.
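The two behavioral error measures are simple ratios. The following sketch states them explicitly; the function names and the example counts are hypothetical and are not data from the study.

```python
def miss_rate(misses, required_responses):
    """Failures to respond divided by the number of required responses."""
    if required_responses <= 0:
        raise ValueError("required_responses must be positive")
    return misses / required_responses


def false_alarm_rate(false_alarms, possible_false_alarm_trials):
    """False alarms divided by the trials in which one could occur."""
    if possible_false_alarm_trials <= 0:
        raise ValueError("possible_false_alarm_trials must be positive")
    return false_alarms / possible_false_alarm_trials


if __name__ == "__main__":
    # Hypothetical counts for one subject and one stimulus type.
    print(miss_rate(1, 50), false_alarm_rate(1, 70))
```

Both rates are computed separately for each subject and each stimulus type before being averaged across the group.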
A participant was seated in an electromagnetically shielded chamber (Vakuum-Schmelze, Hanau, Germany). The cortical responses were recorded with the whole head MEG system (CTF Systems Inc., Vancouver, Canada) consisting of 151 hardware first-order magnetic gradiometers distributed with an average distance between sensors of 3 cm. The signals were sampled at a rate of 312.5 Hz in the frequency range of 0–100 Hz. A baseline was recorded during 300 ms pre-stimulus. Subjects were instructed to blink only during inter-trial intervals. Vertical eye movements were monitored by EEG/EOG recording from the left eye (impedance was kept below 5 kΩ). Both at the beginning and at the end of each recording session, the subject’s head position was determined with three localization coils fixed at the nasion and the preauricular sites. Sessions with head movements exceeding 0.5 cm were discarded. Each MEG recording session (during presentation of a set of 200 stimuli in a run) lasted 10–12 min. The entire experimental session including preparatory period, instruction, familiarization and MEG recording took ∼1.5 h per subject.
All epochs of MEG activity were first automatically and then manually inspected for artifacts. Epochs containing blinks or eye movements (>±100 µV) were rejected. The only trials analyzed were those for which a motor response was not required (for reasons see the Experimental Design section above). If a subject failed to respond to the second identical stimulus, all trials beginning from the last correct response were also discarded. A total of ∼70 correct trials were processed for each type of stimulus and subject. The evoked oscillatory response was analyzed in this study because recent work suggests that stimulus-related feature integration processing is reflected in evoked rather than induced gamma brain activity (e.g. Haenschel et al., 2000; Palva et al., 2002; Debener et al., 2003; see also Pantev et al., 1991). The latter has been shown to be largely modulated by task-driven and top-down influences. For example, retention of an object representation in short-term memory substantially affects induced gamma EEG activity (Tallon-Baudry et al., 1998). It has been assumed, therefore, that the evoked gamma band response may functionally reflect synchronously active neural assemblies, i.e. feature binding, while induced gamma activity is related to object representations or the activation of associative memories (e.g. Tallon-Baudry and Bertrand, 1999; Kaiser et al., 2002).
The artifact-free MEG data for each type of stimulus were averaged and digitally filtered in the frequency domain. An acausal Gaussian-shaped Gabor filter with center frequencies ranging from 10 to 65 Hz and a width of ±5 Hz was separately applied over the entire epochs of 300 ms before and 650 ms after stimulus onset. To calculate amplitude demodulation, a Hilbert transformation was employed (Clochon et al., 1996). To eliminate baseline distortions caused by these procedures, the baseline epoch was restricted to 200 ms pre-stimulus. To assess differences in spectral narrow-band neuromagnetic activity for distinct stimuli relative to baseline, a statistical-probability mapping algorithm was used, based on paired two-sided t-tests for every MEG channel and data point (in the time interval of 0–300 ms post-stimulus) for the group of subjects; t-values were converted to P-values. A data point was considered significant if the mean of three consecutive P-values met the significance criterion (P < 0.001), thus reducing the probability of occurrence of false positives while preserving a reasonably high level of time resolution. This procedure yielded isocontour regions of significant effects on the map of MEG sensors. The mapping algorithm is described in more detail elsewhere (e.g. Kaiser et al., 2000, 2002). In brief, for assessment of the cortical topography of significant changes in spectral amplitude, we used a common coil system. For each participant, the sensor positions were assigned to common spatial sensor coordinates of one representative subject and the spatial locations of changes in spectral amplitude were determined on the 2-D brain model derived from this subject’s volumetric structural MRI. Applicability of the common coil system for the purposes of the present study, in which the exact source localization was beyond the focus of interest, has been established elsewhere (e.g. Kaiser et al., 2000).
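The combination of frequency-domain Gaussian filtering and Hilbert-based amplitude demodulation can be sketched as follows. This is not the analysis code used in the study. It is a minimal illustration that applies a Gaussian weighting to the positive-frequency half of the spectrum and takes the magnitude of the resulting analytic signal. Treating the ±5 Hz width as the Gaussian standard deviation, the naive DFT and all function names are assumptions made for the example.

```python
import cmath
import math


def dft(samples):
    """Naive discrete Fourier transform (adequate for short epochs)."""
    n = len(samples)
    return [
        sum(samples[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]


def idft(spectrum):
    """Naive inverse discrete Fourier transform."""
    n = len(spectrum)
    return [
        sum(spectrum[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n
        for t in range(n)
    ]


def band_envelope(samples, fs, center_hz, sigma_hz=5.0):
    """Amplitude envelope of a narrow band around center_hz.

    A Gaussian gain is applied to positive frequencies only and doubled,
    which yields the analytic (Hilbert) signal of the filtered data.
    sigma_hz = 5.0 is an assumed mapping of the '±5 Hz' width.
    """
    n = len(samples)
    weighted = []
    for k, value in enumerate(dft(samples)):
        if 0 < k < n / 2:                      # positive frequencies only
            freq = k * fs / n
            gain = 2.0 * math.exp(-0.5 * ((freq - center_hz) / sigma_hz) ** 2)
        else:
            gain = 0.0
        weighted.append(gain * value)
    return [abs(v) for v in idft(weighted)]


if __name__ == "__main__":
    fs, n = 312.5, 250                         # sampling rate from the text
    gamma = [math.cos(2 * math.pi * 27.5 * t / fs) for t in range(n)]
    print(round(max(band_envelope(gamma, fs, 27.5)), 3))
```

Because the filter is applied in the frequency domain with a real, symmetric gain, it is acausal and introduces no phase shift, which matches the description of the filter in the text. In practice an FFT routine would replace the naive transform.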
The localization errors introduced by employing the common coil system, as opposed to the individual sensor locations, were within the range of spatial resolution determined by the spacing of sensors in the MEG system. Paired Student’s t-tests and analyses of variance (ANOVA) were used to analyze the differences in the spectral amplitudes of gamma activity for distinct stimuli.
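The significance criterion used for the probability maps, in which the mean of three consecutive P-values must fall below 0.001, can be sketched for a single channel as follows. Assigning the result to the central data point of each three-sample window is an assumption of this example, as is the function name.

```python
def significant_points(p_values, alpha=0.001, window=3):
    """Flag data points whose surrounding window has a mean P below alpha.

    The flag is assigned to the central sample of each window
    (an assumption; the source does not specify the alignment).
    """
    flags = [False] * len(p_values)
    for start in range(len(p_values) - window + 1):
        mean_p = sum(p_values[start:start + window]) / window
        if mean_p < alpha:
            flags[start + window // 2] = True
    return flags


if __name__ == "__main__":
    # Hypothetical P-values for consecutive time samples of one channel.
    print(significant_points([0.5, 0.0005, 0.0004, 0.0006, 0.5]))
```

Averaging over consecutive samples means that an isolated low P-value flanked by large ones does not reach significance, which is how the criterion limits false positives.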
Performance and Spontaneous Stimulus Recognition
Subjects performed the task with great accuracy: they responded to the second of two consecutive stimuli of each type almost without error. When figures were presented upside-down, the task was often reported as more difficult. Yet, the performance level was equally high in both runs. The miss rate (failures to respond to the second identical consecutive stimulus; see Experimental Design) was on average 0.018 ± 0.031 (mean ± SD) and 0.011 ± 0.018 in responding to the upright canonical and scrambled displays, respectively, and 0.02 ± 0.02 and 0.028 ± 0.037 in responding to the inverted canonical and scrambled configurations. Very few participants made false alarms: the rate of this type of error was 0.002 ± 0.005 and 0.006 ± 0.013 in responding to the upright canonical and scrambled displays, and 0.008 ± 0.016 and 0.001 ± 0.004 in responding to the inverted canonical and scrambled configurations. Pair-wise comparisons performed on individual error rates did not reveal any significant differences between the distinct stimuli. Following each run, subjects briefly described what they saw and indicated any stimulus interpretations they might have had. Even during the familiarization trials, all subjects spontaneously reported seeing the upright walker. Their impression of the walking figure was vivid and resulted in high ratings of the display’s vividness on a five-point scale. The ratings were significantly higher for the upright walker display than for the inverted figure [mean rating 4.29 ± 0.61 and 2.22 ± 1.12 for upright and inverted walker, respectively; t(13) = 6.11, P < 0.001, two-tailed] and than for upright or inverted scrambled displays [mean rating 1.57 ± 0.85 and 1.57 ± 0.64, for upright and inverted scrambled display respectively; t(13) = 8.05 and t(13) = 8.64, P < 0.001]. No significant difference was found between the inverted walker and scrambled displays. None of the subjects recognized an inverted point-light figure. 
Instead, consistent with earlier psychophysical findings (Pavlova and Sokolov, 2000, 2003; Grossman and Blake, 2001), the display was described in a variety of ways ranging from ‘swinging of dots back and forth’ to ‘rotation in 3-D’. Scrambled configurations, irrespective of orientation, were typically described as a disorganized cloud of dots.
Stimulus-specific Effects in Oscillatory Brain Activity
Inspection of the statistical probability maps (Fig. 2) shows that both upright and inverted canonical walkers evoke an increased, relative to baseline, gamma-band (25–30 Hz) MEG activity over the occipital areas (P < 0.001). The evoked gamma enhancements were observed markedly early at 80 ms, reaching a maximum at 100 ms after stimulus onset. In contrast, neither upright nor inverted scrambled configuration elicited any significant enhancements in the gamma response.
Averaged individual values of spectral amplitude of the gamma response over the occipital areas were submitted to two-way repeated measures ANOVA with factors stimulus type (canonical/scrambled) and orientation (upright/inverted). The results revealed a highly significant main effect of stimulus type [F(1,13) = 15.6, P < 0.002]. Neither the main effect of orientation nor the interaction stimulus type × orientation was significant. The mean spectral amplitudes of gamma response were greater for the upright walker than for the scrambled display [t(13) = 3.20, P < 0.007] and greater for the inverted walker than for the scrambled configuration [t(13) = 2.30, P < 0.039]. The difference in the spectral power of the early gamma response between upright and inverted walking figures was not significant [t(13) = 1.40, n.s.].
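The paired comparisons reported above rest on the standard paired t statistic, whose degrees of freedom equal the number of subjects minus one (13 for 14 participants). A minimal sketch using only the Python standard library is given below. The function name and input values are hypothetical, and the P-value is not computed because the standard library provides no t distribution.

```python
import math
import statistics


def paired_t(condition_a, condition_b):
    """Return (t, degrees of freedom) for a paired two-sample comparison."""
    if len(condition_a) != len(condition_b) or len(condition_a) < 2:
        raise ValueError("need two equally long samples with at least 2 pairs")
    differences = [a - b for a, b in zip(condition_a, condition_b)]
    n = len(differences)
    standard_error = statistics.stdev(differences) / math.sqrt(n)
    return statistics.mean(differences) / standard_error, n - 1


if __name__ == "__main__":
    # Hypothetical amplitudes for three subjects in two conditions.
    t_value, df = paired_t([2.0, 4.0, 6.0], [1.0, 2.0, 3.0])
    print(round(t_value, 3), df)
```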
For the upright-oriented walker only, two later consecutive increases in gamma activity occurred over the parietal and right temporal cortical areas (Figs 2 and 3A,B). The maximum enhancement over the parietal cortices was found at 130 ms and over the temporal at 170 ms from stimulus onset (P < 0.0001). The enhancements for the upright walker were greater for the left than for the right occipital areas [F(1,13) = 48.3, P < 0.001] and greater for the right than for the left temporal cortices [F(1,13) = 8.6, P < 0.01]. In the frequency range of 20–70 Hz, there were no other stimulus-specific effects. As can be seen in Figure 3A, there was activation at the low frequencies (10–20 Hz) in response to the canonical walker which, however, did not exhibit any systematic character. Overall, analyses of the low frequency (<20 Hz) and slow neuromagnetic responses indicated that the effects found are specific to the gamma range. Figure 4 shows the time course and topographic dynamics of evoked gamma activity in response to the upright point-light walker as depicted by averaged MEG traces for a representative subject (left) and the group mean of spectral amplitude in the gamma range (right panel). Comparison of spectral amplitudes of gamma activity over the occipital, parietal and temporal areas together reveals a stronger gamma response to the upright than to the inverted walker [t(13) = 2.22, P < 0.045] or to a scrambled configuration [t(13) = 2.81, P < 0.015].
By manipulating display orientation of an impoverished point-light biological motion pattern, we examined whether oscillatory brain activity varies with a meaningful representation of a structure revealed through biological motion. The results show that visual processing of either a recognizable upright or non-recognizable inverted point-light walker leads to an early occipital increase in the oscillatory cortical response in the lower gamma-band range peaking at 100 ms after stimulus onset (Figs 2 and 3). In contrast, spatially scrambled configurations do not evoke any reliable changes in gamma activity. These findings suggest that the human brain rapidly discriminates between similar moving configurations with and without implicit coherent structure. The rapid response of the brain, however, might be specialized for extracting structure from biological motion: psychophysical findings show that naïve human observers need 100 ms of stimulus duration (that corresponds to about two frames with frame duration of 42 ms) to discriminate different types of filmed biological motion such as walking and jogging (Johansson, 1976). Notably, human observers discriminate a canonical point-light walker from a jumbled point-light figure embedded in a static-dot mask for two or three frames of exposure, and only 40 ms of stimulus duration are needed to recognize a known point-light figure (Perrett et al., 1990).
In view of the inability of participants to spontaneously recognize the inverted walker, it is remarkable that both upright and upside-down walking figures elicit an increased gamma activity over the lateral occipital cortices. This suggests that, irrespective of perceptual interpretations, spatial coherence extracted from motion alone is sufficient to evoke an increase in the gamma response over the modality-specific cortical regions. Moreover, the findings indicate that discrimination between both upright and inverted canonical figures and scrambled displays is likely to be accomplished at relatively early stages of cortical processing. This agrees with functional mapping data in humans (Grill-Spector et al., 1998; Kourtzi and Kanwisher, 2000) and macaque single neuron results (Vogels, 1999), which suggest that the occipital cortices exhibit a sensitivity to image scrambling.
Although gamma activity is considered to underlie perception of coherent patterns, earlier data supporting this assumption may have been confounded by task-driven attention. For example, static geometrical figures are reported to elicit enhancements in the evoked gamma-band EEG activity (Tallon et al., 1995). Yet, the particular task of counting a target figure directs a subject’s attention to the similar stimuli and this might affect gamma activity. Manipulating attention to regularly moving bars by an auditory distracter in a temporal-order detection task, we have found an increase in evoked gamma MEG activity over the occipital cortices in the attended conditions (Sokolov et al., 1999). The present data indicate that when a task requires attention to both a point-light walker and a similar noise, only patterns with implicit coherent spatial structures (i.e. inverted and upright walkers) elicit an early enhancement in gamma brain activity.
Our findings suggest that changes in gamma activity can serve as a neurophysiological indicator for recovering meaningful structure from motion. Only a few earlier results indirectly favor this conjecture. Real and illusory Kanizsa figures, unambiguously perceived as triangles, yield similar perceptual interpretations and induce the same increase in gamma EEG activity (30–40 Hz) at two parietal electrodes Pz and P4 (Tallon-Baudry et al., 1996). ‘Mooney faces’ elicit stronger gamma EEG activity than the same patterns inverted 180° in the image plane, which become non-recognizable configurations (Rodriguez et al., 1999). Gamma (∼30 Hz) EEG activity is enhanced when bistable faces rotating in the image plane are presented at orientations in which their expression is easily perceived (Keil et al., 1999). In the present work, changing orientation of a point-light walker, i.e. using the same stimuli under conditions providing for different possibilities for reconstruction of ‘hidden’ meaning, we have found enhancements in evoked gamma activity to the meaningful structure revealed from biological motion. The increases over the parietal and right temporal cortices observed solely for the upright walker occur later than they do over the occipital areas in response to both upright and inverted figures, most likely reflecting the processing of a recognizable meaningful stimulus (Figs 3 and 4). The robustness of this pattern of activity in response to attended canonical point-light biological motion is confirmed by our recent work with a different task and an independent group of participants (Pavlova et al., 2000).
Hemispheric asymmetries in the response to the upright walker are consistent with the assumption that the two hemispheres are biased toward different aspects of visual processing (Corballis et al., 1999). The left hemisphere is thought primarily to process the local details of a visual stimulus, while the right hemisphere is mainly used for processing its global form (Robertson et al., 1988). A stronger left-side enhancement in the oscillatory response over the occipital cortices is likely to reflect the early processing of coherent structure emerging from the array of moving dots, while a late right-side increase over the temporal areas may reflect the processing of the whole configuration.
Neurobiological mechanisms underlying the perception of biological motion are currently receiving considerable attention. Following observations first reported by Bruce et al. (1981), cells in the superior temporal polysensory area (STPa) and in the macaque inferotemporal cortex (Oram and Perrett, 1994; Wachsmuth et al., 1994) were found to respond selectively to a body view, the direction the walker is facing and the type of body motion. The STPa is supposed to be involved in object recognition, integrating the ventral and dorsal pathways that are responsible for form and motion processing (Baizer et al., 1991). These areas are also considered to be essential for visual awareness of a stimulus (Sheinberg and Logothetis, 1997). Neuropsychological case studies show that the ability to identify human actions in point-light displays is preserved in patients with bilateral lesions in the dorsal occipito-parietal cortex, if the temporal lobe is spared (Vaina et al., 1990; McLeod et al., 1996). Patient AF, for example, first described by Vaina et al. (1990), is able to identify actions in point-light displays. This person’s performance on early-motion tasks such as speed and direction discrimination, however, is very poor. Yet, patients with bilateral lesions in the parietal cortex (Brodmann areas 7 and 40) are unable to recognize moving point-light figures, although they demonstrate unimpaired ability to segment the figures from the stationary background and have normal motion-coherence thresholds (Schenk and Zihl, 1997). Cowey and Vaina (2000) reported the case of patient AL, who had extensive damage to the rostral ventral temporal cortex together with white-matter damage. She was unable to derive meaning from point-light displays.
Patients with impairments in high-level symbolic processing need only very brief exposure to recognize point-light figures (Moore et al., 1997; cf. Blake et al., 2003). Children with Williams syndrome aged 9–15 years reliably judge the facing direction (right- or leftward) of a slightly camouflaged walker moving as if on a treadmill (Jordan et al., 2002). Adolescents who were born preterm and exhibited early motor disability (bilateral cerebral palsy) can detect a point-light walker embedded in a complex simultaneous mask (Pavlova et al., 2003). Their sensitivity to the presence of the camouflaged figure, however, correlates negatively with the extent of periventricular damage to the parieto-occipital complex. The latter finding suggests that biological motion processing might be markedly modulated by subcortical lesions.
Overall, the lesion data converge with the recent brain imaging findings (e.g. Vaina et al., 2001). Positron emission tomography (PET) indicates that the rostrocaudal part of the right superior temporal sulcus (STS) and basal temporal regions (fusiform gyrus and temporal poles adjacent to the amygdala) are involved in the perception of signs and actions conveyed by biological motion (Bonda et al., 1996; Castelli et al., 2000; Ptito et al., 2003). Functional MRI reveals that a gradient of activation during viewing of point-light biological motion stimuli is located within a region of posterior STS, over the intraparietal cortex, and over the lateral occipital complex (Grèzes et al., 2001; Grossman and Blake, 2001, 2002; Vaina et al., 2001; Servos et al., 2002). These areas of activity coincide with the changes in cortical synchronization revealed by MEG (Singh et al., 2002). The stimulus-specific brain activity unfolding over time in response to point-light stimuli, as reported in the present work, substantially extends these findings. The data indicate that the human brain rapidly dissociates spatial coherence and meaning conveyed by biological movement. The early gamma band response over the occipital cortex exhibits a sensitivity to coherent structure in point-light displays, while later enhancements over the parietal and right temporal areas appear to reflect the neural representation of meaningful structure revealed through biological movement.
Portions of this work were supported by the Human Frontier Science Program Organization, the Deutsche Forschungsgemeinschaft (DFG) and by the University of Tübingen Medical School (fortüne-Programs 716 and 979). We thank John C. Baird for valuable advice on the manuscript, Jürgen Dax for helpful technical assistance, Ella Maslova, an honored artist of the Russian Federation, for creating an outline image of the point-light walker, Arseny Sokolov for assistance in data collection and anonymous reviewers for valuable comments. We especially would like to acknowledge the valuable advice of Patricia S. Goldman-Rakic, a co-editor of this journal.