How are the bits and pieces of retinal information assembled and integrated to form the coherent objects that we see? One long-established principle is that elements that move as a group are linked together. For instance a fragmented line-drawing of an object, placed on a background of randomly distributed short lines, can be impossible to see. But if the object moves relative to the background, its shape is instantly recognized. Even after the motion stops, the percept of the object persists briefly before it fades into the background of random lines. Where in the brain does the percept of the object persist? Using functional brain imaging, we found that such moving line-drawings activated both motion-sensitive areas (medial temporal complex, MT+) and object-sensitive areas (lateral occipital complex, LOC). However, after the motion stopped only the LOC maintained its activity while the percept endured. Evidently a percept assembled by motion-sensitive areas like MT+ can be stored, at least briefly, in the LOC.
From the light impinging on our eyes our brains construct a world of objects and scenes. That is, our awareness is of objects and scenes, but how does the brain group visual elements into the coherent objects that we perceive? This question is the key problem in understanding high-level vision and was a major concern of the Gestalt psychologists. They proposed several ‘laws’ for how visual elements are grouped into objects. One of these, the ‘law’ of common fate, says that the visual system naturally binds elements that move together. A vivid demonstration of this ability is Regan’s (Regan, 1986, 2000) shape-from-motion phenomenon. If the fragmented line-drawing of the object depicted in Figure 1A is superimposed on the background of pseudo-randomly distributed lines depicted in Figure 1B, the fragmented object is grouped with the background and almost perfectly camouflaged (see Fig. 1C). If the line-drawing of the object is moved over the background, however, the camouflage is invariably broken and the object is easily recognized.
Neuroimaging studies have consistently shown that an area in the ventral visual pathway, the lateral occipital complex (LOC), is highly active during object perception and recognition (Corbetta et al., 1990; Sergent et al., 1992; Malach et al., 1995; Kanwisher et al., 1996; Faillenot et al., 1997; Kraut et al., 1997; Halgren et al., 1999; Grill-Spector et al., 2000, 2001; James et al., 2000), whereas an area in the dorsal visual pathway, the human extrastriate motion complex (MT+), plays a central role in motion processing (Zeki et al., 1991; Watson et al., 1993; Tootell et al., 1995a). Of particular relevance here is that the two areas are not strictly segregated: the LOC is activated by ‘shape-from-motion’ stimuli (Gulyás et al., 1994; Grill-Spector et al., 1998; Wang et al., 1999), and area MT+ is engaged in the analysis of object shape (Kourtzi et al., 2002).
A critical aspect of the shape-from-motion displays described by Regan (Regan, 1986, 2000) is that although the object appears as soon as it starts to move, it does not disappear immediately after the motion stops. Instead, it fades away over one or two seconds. This indicates that the motion is not necessary to sustain the percept once the elements have been integrated into one shape. Where in the brain does the representation persist in the absence of the inducing stimulus? The two obvious candidates are area MT+ and the LOC.
On the grounds that the integration time constant for detecting shape-from-motion is longer than for luminance-defined shapes, Regan and co-workers (Regan and Beverley, 1984; Regan, 2000) speculated that the persistence may be the result of a long-lived aftereffect caused by adaptation of filters for motion-defined form. These filters may be located in area MT+. If so, area MT+ should be activated during the perceptual persistence after the motion stopped. It is important to note, however, that the lingering percept is that of a stationary object; no motion is perceived. Moreover, observers experience the persistence only when the fragmented line drawing of the object remains part of the display. In other words, if the fragmented line drawing of the object disappears from the display the very instant the motion stops, the percept disappears with it. Thus, although the fragmented object is perfectly camouflaged and unrecognizable without motion, its presence is required to experience persistence after the motion stops.
The LOC, as well as other areas, is involved in binding elements into forms. This includes the binding of illusory contours (Mendola et al., 1999). Subjects perceive illusory contours of objects between the inducing elements (Kanizsa, 1979) and, although the contour does not physically exist, the activation in an area within the LOC correlates with the percept. Similarly, the LOC may mediate the persistence of object contours in the shape-from-motion displays after the motion terminates.
Here we examined brain activation in regions that were activated by moving objects in shape-from-motion displays and that were also active during the perceptual persistence after the motion stopped. The question asked was which brain areas sustain the percept after the binding occurred. In other words, where is the percept being stored, in the motion sensitive area MT+ or in the object-sensitive area in the LOC?
Materials and Methods
Seven healthy volunteers (four men and three women) who gave written consent participated in the study. All procedures were approved by the University of Western Ontario Ethics Review Board.
Experiments were performed using a 4.0 Tesla Varian Siemens whole-body imaging system. Functional data were obtained using a navigator-echo-corrected T2*-weighted segmented gradient EPI pulse sequence (TE = 15 ms; FA = 45°; FOV = 19.2 cm × 19.2 cm; in-plane pixel size = 3 × 3 mm; slice thickness = 5 mm). We used a 15.5 × 11.5 cm quadrature radio frequency surface coil placed at the occipital pole to improve the signal-to-noise ratio. Functional data were aligned to high-resolution inversion-prepared 3-D T1-weighted anatomical images of the brain collected immediately after the functional images using the same in-plane field of view (64 slices; TE = 5.5 ms; in-plane pixel size = 0.75 × 0.75 mm; slice thickness = 2.5 mm).
Each subject participated in one session consisting of 12 functional scans, each comprising 18 epochs (each epoch was 12 s long): four MT+ motion localizer scans, four lateral occipital complex (LOC) object localizers, and finally four shape-from-motion scans. Subjects viewed, through a mirror, images that were back-projected onto a screen. In all experiments, the subjects fixated centrally on a stationary red dot.
From the MT+ and LOC localizer scans (volume acquisition time 2 s; two shots) we selected five contiguous slices that included MT+ and the adjacent LOC. The slices were oriented approximately parallel to the calcarine sulcus. These slices were scanned with a volume acquisition time of 0.5 s to achieve a high temporal resolution (two shots; FA = 33°).
To identify brain areas that are sensitive to motion, the display alternated between moving and stationary (control condition) lines. These lines were randomly oriented and, as a group, were either rotating, translating, contracting or expanding. The display extended 45° horizontally and 20° vertically.
To identify object-sensitive brain areas, we presented our subjects with intact and scrambled versions of 2-D images of the same set of objects (black-and-white line-drawings). Twelve images were presented in each epoch at 1 s intervals. The images subtended 5° of visual angle. To control attention across stimulus conditions, subjects performed a one-back matching task in which they pressed a response key whenever they saw two identical images in a row. Presentation of a new image was indicated by a small horizontal displacement.
Subjects were presented with three different stimulus conditions: (a) object move: segmented line-drawings that formed incomplete but recognizable shapes of objects were superimposed on a background of randomly oriented lines, and the segmented line-drawings were rotated clockwise and counterclockwise ±15° with a period of 2.5 s while the background rotated in counter phase; (b) object stop: the same segmented line-drawings and the same background stopped rotating; (c) object vanish: the background stopped rotating and at the same time the segmented line-drawing of the object vanished (see www.med.uwo.ca/neuroscience/gap/demo/bird.htm). We measured the perceptual persistence that observers experienced after the motion stopped in the object stop and object vanish conditions by asking our subjects to indicate with a button press when the percept of the object had disappeared. Each scan consisted of two repetitions of an a-b-a-c-a-c-a-b block design (see Fig. 1D).
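The block order described above can be laid out as a simple timeline. The following is only an illustrative sketch, not the presentation software used in the study. It assumes that the 12 s epoch length and the 0.5 s volume acquisition time apply to the shape-from-motion scans, and it leaves out any fixation epochs, so onsets are counted from the first stimulus block. All function and variable names are hypothetical.

```python
# Illustrative reconstruction of the a-b-a-c-a-c-a-b block order.
# Assumptions (not confirmed by the text): fixation epochs are omitted and
# onsets are measured from the start of the first stimulus block.
CONDITIONS = {"a": "object move", "b": "object stop", "c": "object vanish"}
BLOCK_ORDER = "abacacab"  # one repetition of the design
REPETITIONS = 2           # two repetitions per scan
EPOCH_S = 12.0            # epoch duration in seconds
VOLUME_S = 0.5            # volume acquisition time of the fast scans


def build_schedule(order=BLOCK_ORDER, repetitions=REPETITIONS, epoch_s=EPOCH_S):
    """Return a list of (onset_s, condition_name), one entry per epoch."""
    return [(i * epoch_s, CONDITIONS[code])
            for i, code in enumerate(order * repetitions)]


def motion_offsets(schedule):
    """Return (time_s, following_condition) for every motion offset."""
    offsets = []
    for (_, previous), (onset, current) in zip(schedule, schedule[1:]):
        if previous == "object move" and current != "object move":
            offsets.append((onset, current))
    return offsets


schedule = build_schedule()
offsets = motion_offsets(schedule)
volumes_per_epoch = int(EPOCH_S / VOLUME_S)
```

Under these assumptions each scan contains equal numbers of object stop and object vanish transitions, and every transition is preceded by a full epoch of motion.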
Image Analysis and Regions of Interest (ROIs)
Analysis was carried out using BrainVoyager 4.4 software. For each subject, 3-D statistical maps were calculated by comparing the experimental condition (moving lines, images of intact objects) to the appropriate control condition (stationary lines, images of scrambled objects) within the context of the general linear model.
The MT+ and LOC ROIs were located by contrasting the stimulus condition (moving lines or images of intact objects) with the control condition (stationary lines or scrambled objects). An overlap ROI was located by using a conjunction analysis to find the brain area that responds to both motion and objects (see Fig. 2A). Within these areas the point of peak activation was located, and cubic regions of interest (9 × 9 × 10 mm) were centered around these points.
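The logic of the conjunction and of the cubic ROIs can be sketched as follows. This is not the BrainVoyager procedure itself. It assumes that a voxel enters the conjunction when its statistic exceeds the threshold in both localizer maps, that voxel positions are given in millimetres, and that the 9 × 9 × 10 mm cube is centered on the peak voxel. The statistic values below are invented purely to exercise the functions.

```python
# Sketch of conjunction and cubic-ROI selection on toy maps.
# Maps are dicts from (x, y, z) position in mm to a statistic value.
# All names and numbers are hypothetical illustrations.

def conjunction(map_a, map_b, threshold):
    """Keep voxels above threshold in both maps; store the smaller value."""
    shared = map_a.keys() & map_b.keys()
    return {v: min(map_a[v], map_b[v]) for v in shared
            if map_a[v] > threshold and map_b[v] > threshold}


def peak(stat_map):
    """Return the position of the largest statistic in the map."""
    return max(stat_map, key=stat_map.get)


def in_cube(point_mm, center_mm, size_mm=(9.0, 9.0, 10.0)):
    """True if the point lies inside a box of size_mm centered on center_mm."""
    return all(abs(p - c) <= s / 2.0
               for p, c, s in zip(point_mm, center_mm, size_mm))


# Toy localizer maps (made-up values, not data from the study).
motion_map = {(0, 0, 0): 5.0, (3, 0, 0): 6.0, (30, 0, 0): 7.0}
object_map = {(0, 0, 0): 4.5, (3, 0, 0): 2.0, (30, 0, 0): 1.0}

overlap = conjunction(motion_map, object_map, threshold=3.0)
overlap_peak = peak(overlap)
overlap_roi = [v for v in motion_map if in_cube(v, overlap_peak)]
```

Taking the smaller of the two statistics is one common way to express a conjunction; the text does not specify which variant was used.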
Using these independently defined ROIs (motion sensitive area, object sensitive area, overlap area), functional magnetic resonance imaging (fMRI) responses of the shape-from-motion scans were extracted by averaging the data from all activated voxels in these ROIs (P < 10⁻⁴). The average percent signal change for the two stimulus conditions, object stop and object vanish, was calculated using the initial fixation period as a baseline (10 volumes).
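The extraction step amounts to averaging across voxels and then expressing the result relative to the fixation baseline. A minimal sketch follows, assuming the baseline is the mean of the first 10 volumes of the ROI time course and that percent signal change is computed as 100 × (signal − baseline) / baseline. The function names are hypothetical.

```python
# Sketch of ROI averaging and percent signal change.
# Assumption: the first n_baseline volumes are the initial fixation period.

def roi_average(voxel_timecourses):
    """Average equally long voxel time courses into one ROI time course."""
    n_voxels = len(voxel_timecourses)
    return [sum(values) / n_voxels for values in zip(*voxel_timecourses)]


def percent_signal_change(timecourse, n_baseline=10):
    """Express each volume as percent change from the baseline mean."""
    baseline = sum(timecourse[:n_baseline]) / n_baseline
    if baseline == 0:
        raise ValueError("baseline mean is zero; cannot normalize")
    return [100.0 * (x - baseline) / baseline for x in timecourse]
```

For example, a time course that sits at 100 during fixation and rises to 102 afterwards yields a 2% signal change.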
Because we were interested in the persistence of brain activation we needed to determine when the fMRI signal started to drop. A robust measure of this was the time at which the signal had decreased by 25%. To determine the 25% drop time point we normalized the data to the activation of 10 volumes of the transition period (the average of the last five volumes of the object move condition and the first five volumes of the following condition: object stop or object vanish, see Fig. 1D).
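The 25% drop measure can be sketched in a few lines. The sketch assumes that the reference level is the mean of the five volumes before and the five volumes after motion offset, that the search starts at motion offset, and that each volume lasts 0.5 s. The function name and argument names are hypothetical, and the search direction is an assumption, since the text does not spell it out.

```python
# Sketch of the 25% drop-time measure on a percent-signal-change series.
# Assumptions: offset_idx is the first volume after motion stops, and the
# transition window holds n_each_side volumes on either side of it.

def drop_time(psc, offset_idx, volume_s=0.5, n_each_side=5, fraction=0.75):
    """Seconds after motion offset at which the normalized signal first
    falls below `fraction` of the transition-period mean, or None."""
    window = psc[offset_idx - n_each_side: offset_idx + n_each_side]
    reference = sum(window) / len(window)
    if reference <= 0:
        raise ValueError("transition-period activation must be positive")
    for i in range(offset_idx, len(psc)):
        if psc[i] / reference < fraction:
            return (i - offset_idx) * volume_s
    return None  # the signal never dropped by 25% within the series
```

Comparing the value returned for object stop epochs with that for object vanish epochs, separately for each ROI, gives the dependent variable analyzed in the Results.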
Subjects viewed shape-from-motion displays composed of fragmented line-drawings of objects rotating relative to a background of random lines. After the motion stopped, subjects indicated with a button press during functional imaging when the percept of the object had disappeared into the background. We compared two different stimulus conditions. In the object-stop condition, the background and the fragmented line-drawing simply stopped moving. In the object-vanish condition, the fragmented line-drawing of the object was removed when the motion stopped.
We found a clear difference in the responses to these two stimulus conditions in each of our seven subjects. Figure 2B shows that the percept lasted longer in the object stop condition (2045 ms) than in the object vanish condition (695 ms, paired t-test, t = 4.807, P = 0.003).
To identify the brain regions that subserve this perceptual persistence we analyzed the brain activation in the motion-sensitive area MT+ and the object-sensitive LOC. We identified MT+ as the brain area that responded more strongly to moving than stationary random lines (Tootell et al., 1995a), and the LOC as the brain area that responded more strongly to stationary line-drawings of objects than to scrambled versions of the same objects (Kourtzi et al., 2002).
Our goal was to assess the role of these independently identified ROIs (MT+ and LOC) in the perceptual persistence experienced when viewing shape-from-motion displays. We determined whether the brain activation for our stimulus conditions, object stop versus object vanish, persisted longer in one of these ROIs. Figure 3 compares the activation levels in the MT+ and LOC ROIs for both conditions in each hemisphere separately. Simple visual inspection shows that the brain activations in our MT+ ROI dropped in a similar way for the object stop (light gray line) and object vanish (dark gray line) conditions. In the LOC ROI, however, we found that brain activation persisted longer in the object stop condition.
To quantify the persistence of the activation, we identified for each ROI the time (in s) when the fMRI signal first dropped below 75% of its normalized peak activation. It can be clearly seen in Figure 2C that these time points for both stimulus conditions did not differ systematically in area MT+. In contrast, in the LOC we found that the brain activation dropped below 75% of its peak activation later in the object stop condition than in the object vanish condition in each of our seven subjects.
A repeated measures ANOVA with hemisphere (left versus right), ROI (MT+, LOC and an overlap area, i.e. an area that is activated by motion and objects, see Methods) and stimulus condition (object stop versus object vanish) as the within-subjects factors and the time point of the 25% signal drop as the dependent variable revealed significant main effects of ROI (F = 56.671, P < 0.001) and stimulus condition (F = 18.256, P = 0.008). However, these main effects were qualified by a significant two-way interaction between ROI and stimulus condition (F = 11.375, P = 0.003). Post hoc comparisons (t-tests for paired samples, Bonferroni corrected) showed that the 25% signal drop for the object stop versus object vanish condition occurred at significantly different time points in our LOC ROI only (t = 15.185, P < 0.001), whereas no significant differences were found for MT+ (t = 0.504, P = 0.636) and the overlap region (t = 1.919, P = 0.113). This means that, in correspondence with the behavioral data, the brain activation persisted longer in the object stop condition than in the object vanish condition, but only in the LOC ROI.
The purpose of the experiment was to assess the role of motion-sensitive and object-sensitive brain areas in the persistence experienced after observing shape-from-motion. Our data suggest that area MT+ and the LOC both respond to moving objects. However, only activity in the LOC, a part of the ventral visual stream, is correlated with the perceptual persistence that observers experience after the motion has stopped. The fMRI signal in area MT+ dropped as soon as the motion stopped for both the object stop and object vanish conditions, but we found sustained activation for the object stop condition in the LOC. Although the shape-from-motion phenomenon depends on motion analysis, it seems to be the object-selective LOC that maintains the percept of the object after the motion has stopped. From previous research we know that grouping processes activate the LOC even in the absence of a physical object (Mendola et al., 1999). Our results confirm these findings. Moreover, here we were able to demonstrate that the LOC responds even when the grouping is induced by motion and not by contours as in the previous studies.
Regan and co-workers (Regan and Beverley, 1984; Regan, 2000) argued that the persistence may be due to adaptation of filters for motion-defined form. This is consistent with the findings of Shioiri and Cavanagh (Shioiri and Cavanagh, 1992). Moreover, the same authors demonstrated that the source of the perceptual persistence is not a motion aftereffect, a process commonly attributed to area MT+ (Tootell et al., 1995b; He et al., 1998; Culham et al., 1999). Our findings clearly support this idea. Area MT+ did not show sustained activity during the persistence. Thus, the perceptual persistence is not subserved by the motion-sensitive area MT+, indicating that if such filters for motion-defined form mediate the persistence they are not located in this area either.
It is the motion that binds the elements of the fragmented line drawing into one object. The neurophysiological process, however, is still unclear. Hess and Field (Hess and Field, 1999) suggested that lateral connections within and between cortical areas integrate components into one coherent whole. This fits with Regan et al.’s (Regan et al., 1992) observation that patients with lesions in the parieto-temporo-occipital white matter showed impairments in motion-defined form recognition. The authors suggested that the lesions interrupted connections between motion-sensitive brain areas in the dorsal pathway and form-sensitive areas in the ventral pathway that are relevant for the recognition of motion-defined form. Given the ease with which we extract shape from motion cues, it is reasonable to argue that interconnections between neighboring cortical areas, such as MT+ and the LOC, subserve the shape-from-motion phenomenon. Our study does not address the involvement of earlier visual areas or top-down signals from higher-order areas. The high temporal resolution fMRI scans required to measure persistence limited our focus to five thin slices centered on MT+ and the adjacent LOC.
A possible interpretation of our results is that there was a differential allocation of attention in our object stop and object vanish stimulus conditions. The presence of the fragmented line drawing of the object in the object stop condition could cause a higher attentional engagement and thus lead to a prolonged fMRI response. The idea here is that brain activation is simply correlated with the effort exerted by the subjects in their attempt to maintain the percept of the object in the absence of visual stimulation (Kastner et al., 1999). From previous research we know that fMRI signals in MT+ vary according to the attentional load of the task at hand (Huk and Heeger, 2000). The absence of a difference in the fMRI signal in area MT+ between the two conditions, object stop and object vanish, indicates that the attentional load across the two tasks was the same. There is no reason to assume that a possible differential allocation of attention leads to a sustained fMRI response in the LOC only; it should have affected MT+ in a similar way. Another possible confound would affect the LOC only. O’Craven and Kanwisher (O’Craven and Kanwisher, 2000) were able to demonstrate that imagery and perception share common neural mechanisms, and thus any activation in the LOC could be due to a mental image of the previously visible object. We cannot rule out that our subjects engaged in some form of mental imagery, but there is no reason to believe that this affected the two stimulus conditions differently. Thus, we believe that explanations based on a differential employment of attention or visual imagery cannot account for the difference in the perceptual persistence between our two stimulus conditions.
Our conclusion that the LOC is crucial to perceptual persistence fits with Gestalt theory, which would hold that past experience, here the previous exposure to the moving object, subserves the persistence. The idea is that previously grouped elements are stored as an object, and this stored representation enables the viewer to maintain a visual percept. Our study supports this idea by showing that motion-defined percepts are preserved not in motion-sensitive brain regions but in object-sensitive brain regions such as the LOC.
This research was supported by the Canadian Institutes of Health Research.
Address correspondence to Susanne Ferber, Department of Psychology, University of Toronto, 100 St George Street, Toronto, Ontario, Canada, M5S 3G3. Email: firstname.lastname@example.org.