Abstract

Contour curvature (CC) is a vital cue for the analysis of both form and motion. Using functional magnetic resonance imaging, we localized the neural correlates of CC for the processing and perception of rotational motion. We found that the blood oxygen level–dependent signal in retinotopic area V3A and possibly also lateral occipital cortex (LOC) varied parametrically with the degree of CC. Control experiments ruled out the possibility that these modulations resulted from changes in the area of the stimuli, the velocity with which contour elements were actually translating, or perceived angular velocity. We conclude that neurons within V3A and perhaps also LOC process continuously moving CC as a trackable feature. These data are consistent with the hypothesis that V3A contains neural populations that process trackable form features such as CC, not to solve the “ventral problem” of determining object shape but in order to solve the “dorsal problem” of what is going where.

Introduction

The goal of this research was to employ functional magnetic resonance imaging (fMRI) and psychophysics to characterize brain circuitry that uses contour curvature (CC) as a primary cue for the computation of 2-dimensional (2D) rotational object motion. It has been widely recognized that CC is an important cue for shape processing (Attneave 1954; Kristjansson and Tse 2001). In this paper, we investigate the possibility that CC may also be an important cue in motion processing.

How the visual system constructs the perception of motion from the temporal dynamics of the retinal image is a fundamental question that continues to challenge vision scientists. This is true even for the perception of relatively simple stimuli such as those completely defined by a closed contour. At the heart of the problem is the fact that an infinite number of 3D velocity fields can generate the same 2D retinal sequence. The local motion information at any point along a contour is consistent with an infinite number of possible motions that all lie on a “constraint line” in velocity space (Adelson and Movshon 1982) for the 2D case. The problem of interpreting this many-to-one mapping is commonly termed the “aperture problem” (Fennema and Thompson 1979; Adelson and Movshon 1982; Marr 1982; Nakayama and Silverman 1988).

Explaining how the aperture problem is solved is perhaps the most basic challenge that must be met by any model of motion perception. Although there are several theoretical solutions to the aperture problem that account for many aspects of motion perception, no single general theory has yet emerged that can explain how the visual system actually processes motion in every instance. Several models that provide reasonable solutions to the aperture problem for the case of translational motion, such as “intersection of constraints” and “vector summation” models, fail to provide unique solutions in the case of rotational motion.

An account that provides a solution to the aperture problem in the case of both translational and rotational motion is that based upon the tracking of “trackable features” (TFs, Ullman 1979). Image features, such as contour corners, terminators, and junctions, can provide unambiguous motion signals when they correspond to attributes that are intrinsically part of the moving object.

This paper seeks to isolate the neural circuitry underlying the processing of CC as a TF for the perception of motion in general and rotational motion in particular.

Materials and Methods

Subjects

All subjects had normal or corrected-to-normal vision and, prior to participating, gave written informed consent according to the guidelines of the Department of Psychological and Brain Sciences and the institutional review board (IRB) of Dartmouth College. Subjects received $20 for their participation in each magnetic resonance imaging (MRI) scanning session.

Stimuli

Five groups of stimuli were used in each experiment. Each group consisted of a set of ellipses or modified ellipses known as “bumps” (Tse and Albert 1998; Kristjansson and Tse 2001). Bumps are formed by joining 2 half-ellipses along a common axis (Fig. 1a–c). If the noncommon axes of the 2 half-ellipses are unequal, this conjunction forms 2 CC discontinuities, one at each end of the common axis along which the 2 half-ellipses are joined. The ratio of the unequal axes of the 2 half-ellipses determines the severity of the discontinuity and the corresponding CC. For example, a bump constructed from a fat ellipse and a thin ellipse will have a more severe discontinuity than one constructed from 2 fat ellipses. As such, it will appear to have sharper “rounded corners” than a bump constructed from 2 fat ellipses. One of the 5 stimulus groups used in these experiments was simply a set of ellipses, each with an aspect ratio of 1.083, a major axis of 3.25° visual angle from end to end, and a minor axis of 3.0° visual angle from end to end. Each of the other 4 groups was made up of bumps that were constructed by parametrically varying the degree of CC discontinuity.

Figure 1.

Bumps. (a) A bump and an ellipse. (b) The bumps were constructed by combining the bottom half of an ellipse of major axis a and minor axis b and the top half of an ellipse with a common major axis and different minor axis c. The bumps contain a CC discontinuity along the contour where the 2 component half-ellipses meet. (c) The bump has major axis a and minor axis b + c. The ellipse has the same major axis a and minor axis 2d. By choosing (b + c) = 2d, the widths and heights of the bumps and ellipses are exactly the same because 2d = b + c. Moreover, the areas of the bumps and ellipses are identical because π × a(2d) = π × a(b + c). (d) By increasing the ratio between b and c while preserving (b + c) = 2d, the degree of CC discontinuity can be modulated while preserving the area of the stimulus. The color of the bounding box distinguishes the individual stimuli that made up each presentation group used in experiment 1. (e) In each presentation block, a set of 13 individual black stimuli was rotated against a white background. The initial orientation and direction of rotation for each stimulus was randomly determined. Subjects were required to maintain fixation on a central square whose color periodically changed.

In each group, 13 stimuli were presented as illustrated in Figure 1e. The individual ellipses and bumps were rotated about their center of gravity at an angular velocity of 126°/s. The direction of rotation for each stimulus was randomly determined based on position within the group. These directions of rotation were held constant across all runs.

In addition to these primary stimuli, a central fixation spot was present at all times, comprised of a small square extending 0.25° visual angle.

Fixation Task

Eye movements, wakefulness, and attention to the fovea were controlled for by requiring subjects to perform a reaction time task in which they had to press a button as fast as possible in response to a pseudorandomly occurring change in fixation point color. The fixation point was 0.25° × 0.25° visual angle, located at the center of the screen. It changed color approximately every 7.3 s. This color change occurred an equal number of times during each block. This task could only be carried out successfully if subjects were fixating during both condition- and fixation-only blocks and attending to the fixation point carefully. A psychophysical control experiment was conducted outside the scanner in which the stimuli and fixation task were presented to subjects while their eye movements were monitored. The results of this control experiment established that subjects were able to successfully perform the fixation task and that the number and magnitude of fixational eye movements did not vary as a function of the different stimulus conditions. Thus, fixational eye movements can be ruled out as a possible confounding source of blood oxygen level–dependent (BOLD) signal variation among conditions.

Stimulus Presentation fMRI

Each subject participated in multiple runs in the scanner. Each run lasted 220 s (88 volumes, repetition time [TR] = 2.5 s). In each run, there were five 20-s (8 volumes) stimulation blocks. These blocks were interleaved with six 20-s blank periods, during which only the fixation spot was present. Thus, each run began and finished with blank periods. In addition, 10 s (4 TRs) of dummy images were collected at the beginning of each run to allow transient activation of BOLD signal to return to baseline.
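The block timing described above can be checked with a short sketch. The function and variable names are illustrative only and are not taken from the original presentation software; the durations and counts come from the text.

```python
TR_S = 2.5            # repetition time in seconds (from the text)
BLOCK_S = 20          # duration of each stimulation or blank block, in seconds
N_STIM_BLOCKS = 5     # stimulation blocks per run
N_BLANK_BLOCKS = 6    # fixation-only blocks per run


def run_schedule():
    """Return (label, onset_s, onset_volume) for every block in one run.

    Blocks alternate blank, stimulus, blank, ... so that the run begins and
    ends with a blank period. Dummy volumes are not included.
    """
    schedule = []
    for i in range(N_STIM_BLOCKS + N_BLANK_BLOCKS):
        label = "blank" if i % 2 == 0 else "stimulus"
        onset_s = i * BLOCK_S
        schedule.append((label, onset_s, int(onset_s / TR_S)))
    return schedule


schedule = run_schedule()
run_length_s = len(schedule) * BLOCK_S
n_volumes = int(run_length_s / TR_S)
```

Eleven 20-s blocks give the stated run length of 220 s, which at TR = 2.5 s corresponds to the stated 88 volumes.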

In each of the stimulation blocks, one of the 5 stimulus groups was presented. The order in which the 5 groups were presented was randomly determined for each run. Stimuli were projected from a digital data projector (refresh rate 60 Hz) onto a frosted Plexiglas screen outside the bore of the magnet and viewed via a tangent mirror inside the magnet that permitted a maximum of 22° × 16° visible area. The projected image was smaller than this and subtended approximately 17° × 12° of visual angle. The objects in each group were black. The background was always white, even during the blank period.

Psychophysics

The purpose of this staircase-psychophysics experiment was to find the angular velocity at which each group must be rotated in order to be perceived as equivalent. The 5 groups of stimuli used in experiment 2 were used in this experiment.

Procedure Psychophysics Experiment 3

Subjects were presented with a control stimulus group (the central bumps type shown in Fig. 1d in blue) for 500 ms. Each of the 13 bumps was rotated at 126°/s in a randomly determined direction. Orientations were also randomized. Immediately following this, one of the other 4 groups was presented for 500 ms (test group). Each of the 13 objects in this group was rotated either faster or slower than 126°/s. Subjects were asked to press one of 2 buttons if they perceived the test group as rotating faster than the control group and the other button if they perceived the test group as rotating slower than the control group. This process is illustrated in Figure 3A.

Upon pressing a button, subjects were again presented with the control group for 500 ms, followed by the same test group as before except that the objects in this group were rotated either slightly faster or slower than on the previous trial depending on which button had been pressed. This process was repeated until subjects indicated, by the pressing of a third button, that the 2 groups seemed to be rotating at the same speed.

The speed of rotation of the test group was then recorded, and the process was begun again with a new test group. The experiment concluded when subjects had matched the speeds of all 4 test groups to the control group 4 times (2 times with the test group starting out much faster than the control group and 2 times with the test group starting out much slower than the control group). For each test group, the 4 recorded speeds were averaged together to determine the speed at which each group must be rotated in order to be perceived to be rotating at the same speed as the control group, which was rotating at 126°/s.
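The matching procedure can be illustrated with a minimal simulation. This is a sketch under stated assumptions, not the experiment code: the simulated observer, the perceptual gain of 1.2, the step size, and the match tolerance are all hypothetical, since the text does not report them. Only the 126°/s reference speed, the two fast and two slow starting points, and the averaging of 4 settings come from the text.

```python
REFERENCE_SPEED = 126.0  # deg/s, rotation speed of the control group (from the text)


def match_speed(perceived_gain, start_speed, step=2.0, tolerance=1.5, max_trials=1000):
    """Adjust the test speed until a simulated observer reports a match.

    The observer is modeled (hypothetically) as perceiving the test group at
    perceived_gain * physical speed. step and tolerance are assumed values.
    """
    speed = start_speed
    for _ in range(max_trials):
        perceived = perceived_gain * speed
        if abs(perceived - REFERENCE_SPEED) <= tolerance:
            return speed  # observer presses the "same speed" button
        # "faster" response lowers the next test speed, "slower" raises it
        speed += -step if perceived > REFERENCE_SPEED else step
    raise RuntimeError("no match reached within max_trials")


# Two runs starting much faster and two starting much slower than the control
start_speeds = [180.0, 180.0, 60.0, 60.0]
settings = [match_speed(1.2, s) for s in start_speeds]
matched_speed = sum(settings) / len(settings)
```

With these assumed values, the descending and ascending runs stop on either side of the point of subjective equality, and their average lands at 105°/s, the speed at which 1.2 times the physical speed equals 126°/s. Averaging over both starting directions is what cancels the bias introduced by each direction of approach.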

Stimulus Presentation Psychophysics Experiment 3

The visual stimulator was a 2-GHz Dell workstation running Windows 2000. The stimuli were presented on a 24-inch SONY CRT monitor with 1600 × 1200 pixel resolution and an 80-Hz frame rate. Observers viewed stimuli from a distance of 76.2 cm with their chin in a chin rest. Stimulus size (in visual angle) was set to be precisely that of the bumps when shown using the MRI projection system. Fixation was ensured using a head-mounted eye tracker (Eyelink2, SR Research, Ontario, Canada; Tse and others 2002). Any time subjects' monitored left eye was outside a fixation window of 1.5° radius, the trial was automatically aborted, and the current experimental state was repeated. The eye tracker was recalibrated whenever subjects' monitored eye remained outside the fixation window although subjects reported maintaining fixation. Once calibration was completed, the experiment resumed.
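Matching stimulus size in visual angle across the two displays comes down to a standard trigonometric conversion. The sketch below shows the on-screen extent implied by the stated 76.2 cm viewing distance; the function name is illustrative, and the conversion to pixels is omitted because the physical dimensions of the visible screen area are not given in the text.

```python
import math

VIEWING_DISTANCE_CM = 76.2  # from the text


def visual_angle_to_cm(angle_deg, distance_cm=VIEWING_DISTANCE_CM):
    """On-screen extent (cm) of a stimulus subtending angle_deg, centered on the line of sight."""
    return 2.0 * distance_cm * math.tan(math.radians(angle_deg) / 2.0)


major_axis_cm = visual_angle_to_cm(3.25)   # end-to-end major axis of the ellipse
fixation_cm = visual_angle_to_cm(0.25)     # side length of the fixation square
```

Under this geometry, the 3.25° major axis corresponds to roughly 4.3 cm on the monitor.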

Data Acquisition

T1-weighted anatomical images were acquired using a high-resolution 3D spoiled gradient recovery sequence (124 sagittal slices, echo time [TE] = 6 ms, TR = 25 ms, flip angle = 25°, 1 × 1 × 1.2 mm voxels). A T1-weighted coplanar anatomical image with the same slice orientation as the EPI data was also acquired and used for coregistration. Continuous whole-brain BOLD signal was acquired at the Dartmouth Brain Imaging Center on a GE 1.5-T Signa scanner using a standard head coil. Standard T2* gradient-weighted echoplanar functional images sensitive to BOLD contrast were collected using 25 slices (4.5 mm thickness and 3.75 × 3.75 mm in-plane voxel resolution, interslice distance 1 mm, TR = 2500 ms, T2* TE = 35 ms, flip angle = 90°, field of view [FOV] = 240 × 240 × 256 mm, descending interleaved slice acquisition, matrix size = 64 × 64) oriented approximately along the anterior-commissure posterior-commissure plane. These slices were sufficient to encompass the entire brain of each subject.

Analysis of fMRI Data

Preprocessing

Data were analyzed using BRAIN VOYAGER (BV) 4.9.6 and MATLAB software developed in-house. Effects of small head movements were removed using BV's motion correction algorithm. Slice scan-time correction was carried out to correct for the fact that slices were not collected at the same time and were collected in interleaved and descending order. Slices were corrected to have the same mean intensity. Functional data were not smoothed in the space domain, but any low-frequency temporal fluctuations whose wavelength was greater than 29 TRs were removed. This did not introduce correlations between a voxel and its neighbors. For each subject, the functional data were coregistered to the high-resolution anatomical image and normalized into the Talairach stereotactic coordinate space, which enables comparisons to be made across subjects.

General Linear Model

Data from each experiment and each of the localizer scans were analyzed using the general linear model (GLM) with a boxcar waveform convolved with a canonical hemodynamic response function. This returned a beta weight parameter estimate for each condition for each voxel.
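A minimal sketch of how one predictor in such a model can be built is given below. It is not the BRAIN VOYAGER implementation: the double-gamma shape and its parameters are an assumed, commonly used form of the canonical hemodynamic response, and all function names are illustrative. The TR, run length, and block length come from the text.

```python
import math

TR_S = 2.5       # repetition time in seconds (from the text)
N_VOLUMES = 88   # volumes per run (from the text)


def gamma_pdf(t, shape, scale=1.0):
    """Gamma probability density, defined as 0 for t <= 0."""
    if t <= 0:
        return 0.0
    return (t ** (shape - 1) * math.exp(-t / scale)) / (math.gamma(shape) * scale ** shape)


def hrf(t):
    """Assumed double-gamma response: early positive peak minus a late undershoot."""
    return gamma_pdf(t, 6.0) - gamma_pdf(t, 16.0) / 6.0


def boxcar(onset_vol, n_vols_on, n_volumes=N_VOLUMES):
    """1 during the stimulation block, 0 elsewhere."""
    return [1.0 if onset_vol <= v < onset_vol + n_vols_on else 0.0 for v in range(n_volumes)]


def convolve_with_hrf(box, tr=TR_S, hrf_length_s=30.0):
    """Discrete convolution of the boxcar with the HRF sampled once per TR."""
    kernel = [hrf(k * tr) for k in range(int(hrf_length_s / tr) + 1)]
    return [
        sum(kernel[k] * box[v - k] for k in range(len(kernel)) if v - k >= 0)
        for v in range(len(box))
    ]


# Predictor for a 20-s (8-volume) block that starts after the first blank period
regressor = convolve_with_hrf(boxcar(onset_vol=8, n_vols_on=8))
```

One such regressor per stimulus condition forms a column of the design matrix, and the least-squares weight on each column is the beta weight for that condition.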

Constructing Region of Interest Masks

Retinotopic Mapping

Retinotopy was carried out on each subject who participated in the study using standard phase-encoding techniques (4.5 mm thickness and 3.75 × 3.75 mm in-plane voxel resolution, interslice distance 1 mm, TR = 1600 ms, flip angle = 90°, FOV = 240 × 240 × 256 mm, interleaved slice acquisition, matrix size = 64 × 64; 16 slices oriented along the calcarine sulcus) with the modification that 2 wedges of an 8-Hz flicker black-and-white polar checkerboard grating were bilaterally opposite like a bow tie, to enhance signal to noise (Sereno and others 1995; Slotnick and Yantis 2003). Wedges occupied a given location for 2 TRs (3.2 s) before moving to the adjacent location in a clockwise fashion. Each wedge subtended 18° of 360°. Six TRs of dummy scans were discarded before each run to bring spins to baseline. A total of 168 volumes were collected on each run. A minimum of 7 wedge runs were collected for each subject and then averaged to minimize noise before retinotopic data analysis in BV. A minimum of 3 runs were collected per subject using expanding 8-Hz flickering concentric rings that each spanned approximately one degree of visual angle in ring width. Each ring was updated after one TR (1.6 s) after which it was replaced by its outward neighbor, except that the outermost ring was replaced by the innermost ring, whereupon the cycle was repeated. Retinotopic areas (V1, V2d, V2v, V3d, V3v, V4v/VO1, and V3A) were defined as masks on the basis of standard criteria (Sereno and others 1995), assuming a contralateral quadrant representation for V1, V2d, V2v, V3d, and V3v and a contralateral hemifield representation for V4v/VO1 and V3A (Tootell and others 1997). V4v and the hemifield representation just anterior to it, called VO (Brewer and others 2005), were combined into a common mask because the border between these regions was not distinct in all subjects.

The Determination of V3A

The identification of area V3A has been debated in the literature. There is a general agreement that V3A has a contralateral hemifield representation. However, there is evidence for a second retinotopic area (V3B) lateral to V3A that shares the same foveal representation (Smith and others 1998; Press and others 2001). We were unable to clearly identify the boundary between V3A and V3B in all the subjects, so they were combined into a common “V3A” mask. It has been argued that V3B is actually the same as the “kinetic occipital” (KO) area (Smith and others 1998; Zeki and others 2003), which is an area particularly responsive to motion-defined borders (Dupont and others 1997; Van Oostende and others 1997; Grossman and others 2000; Kononen and others 2003). However, it is important to note that the conclusion that V3B and KO are in fact the same was made without employing retinotopic criteria but rather on the basis of the similarity between normalized Talairach coordinates. An alternative segmentation of the areas has been put forth, in which all the cortex lateral to V3A has been grouped into a common topographically defined “V4d topolog” (Sereno and others 1995; Tootell and others 1995, 1996, 1997; Hadjikhani and others 1998; Tootell, Hadjikhani, Hall, and others 1998; Tootell, Hadjikhani, Vanduffel, and others 1998; Malach and others 2002; Tsao and others 2003). V4d is anatomically rather than functionally defined and encompasses both V3B and KO.

Our V3A/B localizations were functionally defined on a strictly retinotopic basis and, therefore, unlikely to overlap with KO. However, without a specific localization of KO, it is impossible to rule out the possibility that some overlap does exist. Because it is still a matter of debate whether V3B is distinct from V3A and whether V3B, if it exists, is the same as KO, we decided to define V3A as all contiguous retinotopically mapped cortex responsive to the contralateral hemifield beyond the vertical meridian border with area V3d.

Individual hMT+ Mask

The human analog of macaque motion processing area MT has been called V5 or hMT+. Left and right hMT+ were localized in a subset of subjects tested in each experiment using a localizer scan consisting of 3–6 runs of 3 min each. The hMT+ localizer stimuli consisted of a grid of 3 × 3 subgrids of white squares on a black background whose length and height were approximately 1° × 1° visual angle. This was constructed by eliminating the zeroth, ±fourth, and ±eighth rows and columns from a regular grid of squares. Square centers were separated by approximately 3° visual angle. In baseline blocks, the grid remained stationary for a 20-s epoch, followed by an epoch where the grid rotated clockwise around its center at a speed of 270° of rotational angle per second. Each run contained 9 epochs of alternating motion and nonmotion stimulation. As in the main experiment, subjects carried out a simple fixation task, pressing a button with the right hand any time the fixation point changed color. The hMT+ was localized as activity in the motion > nonmotion GLM contrast. In each case, a statistical threshold of at least P < 0.0001 corrected (fixed effects) was used. In addition, activation had to occupy the inferior occipital gyrus or inferior temporal sulcus in order to be localized as hMT+. The mean Talairach coordinates of hMT+ in the right hemisphere were x = 45.5 (standard error [SE] = 1.3), y = −68.9 (1.6), and z = −5.2 (2.6) and in the left hemisphere x = −41.4 (1.0), y = −68.4 (1.4), and z = 2.5 (1.4).

Individual Lateral Occipital Complex Masks

An LOC mask was also determined individually for the same subjects for whom individual hMT+ masks were made, following standard procedures (Kourtzi and Kanwisher 2000). Object images (7° × 7°) were placed on a white background and were embedded within a black grid. Control images consisted of the same images scrambled within the same grid. Their centroid position was updated randomly every TR within a 1° radius of the fixation point in order to prevent perceptual fading. The left and right hemisphere LOC masks were created from the fixed effects GLM analysis contrast of unscrambled objects > scrambled objects for each subject. The masks in each case were determined using a statistical threshold of at least P < 0.001 uncorrected. The mean left LOC mask location in Talairach coordinates was x = −42.8 (SE = 1.0), y = −73.6 (1.4), and z = −5.1 (2.1), and the mean right hemisphere LOC mask location was x = 45.9 (1.5), y = −71.1 (1.5), and z = −7.2 (2.1). (By way of comparison, the mean location reported by Kourtzi and others [2003; left hemisphere Talairach coordinates −41.9, −64.8, −2.7; right hemisphere 39.1, −65.6, −12.0] found using an LOC localizer in each of 10 subjects fit well within the bilateral activations found in the present study.)

In many subjects, there is an anterior portion of the LOC located in the middle fusiform gyrus and a posterior portion located just inferior to hMT+ that is activated by this contrast. The present LOC masks were selected as the posterior region because the 2 subregions were not abutting in any subject and could well comprise areas with different functionalities.

Separation of hMT+ and LOC

In some subjects, a small number of voxels that were localized as being in hMT+ were also localized as belonging to LOC. This occurs because, depending on the threshold at which hMT+ and LOC masks are specified, there can be common voxels shared between these masks. In order to eliminate the possibility that measured responses were driven by this common overlap region in the hMT+ and LOC region of interest (ROI) masks, any overlapping voxels were removed from both the corresponding hMT+ and LOC masks and were not included in the analysis.
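The overlap removal amounts to a set difference on voxel coordinates. The sketch below assumes each mask is represented as a set of (x, y, z) voxel indices; that representation, the function name, and the example coordinates are illustrative and do not reflect the mask format of the original analysis software.

```python
def remove_overlap(hmt_voxels, loc_voxels):
    """Drop voxels shared by both masks from each of them.

    Returns the cleaned hMT+ mask, the cleaned LOC mask, and the removed overlap.
    """
    overlap = hmt_voxels & loc_voxels
    return hmt_voxels - overlap, loc_voxels - overlap, overlap


# Hypothetical voxel coordinates for one hemisphere of one subject
hmt_mask = {(45, -68, -5), (46, -68, -5), (46, -69, -6)}
loc_mask = {(46, -69, -6), (46, -71, -7), (45, -72, -7)}

hmt_clean, loc_clean, shared = remove_overlap(hmt_mask, loc_mask)
```

After this step the two masks are disjoint, so no voxel contributes to the mean beta weight of both ROIs.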

Individual ROI Statistical Analysis

In each experiment, voxel masks belonging to individual ROIs were available for a subset of subjects. For each of these subjects, one mean beta weight per condition was computed for each ROI. Beta weights are the regressor weights obtained by separate GLM analyses carried out within each subject's ROI and reflect the degree to which the BOLD signal was modulated away from baseline during the corresponding stimulus condition. The ROI beta weights were then averaged across hemispheres (Fig. 4A). The magnitude of each beta weight taken by itself is not especially informative. This is because the magnitude of the beta weight is computed based upon a least-squares analysis of the raw signal, not percent BOLD signal change. As such, one subject can have larger raw beta weights within a given ROI than another subject simply because the mean level of the magnitude of the raw signal coming off the fMRI scanner was higher in one subject than another. In contrast, the relative magnitude of one beta weight versus another within an ROI is highly informative because it indicates relative neuronal activity between conditions within an area. To capture this relative, between-condition information, Z scores for each condition were obtained from the beta weights in each ROI (Fig. 4B). For a given subject, these Z scores signify the degree to which any one condition differed from the mean response within the ROI across all the conditions. Thus, a positive Z score indicates that the percent BOLD signal change for that condition was greater than the mean percent BOLD signal change for that subject within that ROI. In general, normalized ROI-specific beta weights are a more appropriate means of comparing BOLD signal responses within ROIs across subjects because they are not subject to the problem of differential weighting that would arise if percent BOLD signal or raw beta weights were used for comparing responses across subjects.
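The normalization step can be sketched as follows. The beta weights are hypothetical values, and the use of the sample standard deviation is an assumption, since the text does not specify which estimator was used.

```python
import statistics


def zscore_betas(betas):
    """Z-score one subject's per-condition beta weights within a single ROI."""
    mean = statistics.fmean(betas)
    sd = statistics.stdev(betas)  # sample SD (assumed; the estimator is not stated)
    return [(b - mean) / sd for b in betas]


# Hypothetical beta weights for the 5 conditions in one ROI
betas_subject_a = [1.0, 1.2, 1.5, 1.9, 2.4]
# A second subject with the same response pattern but a 10x larger raw signal scale
betas_subject_b = [10.0 * b for b in betas_subject_a]

z_a = zscore_betas(betas_subject_a)
z_b = zscore_betas(betas_subject_b)
```

The two subjects end up with identical Z scores even though their raw beta weights differ tenfold, which is the property that makes the normalized values comparable across subjects.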

For each ROI, a repeated-measures analysis of variance (ANOVA) with a linear contrast based on increasing CC was then performed on the Z scores incorporating the data from all subjects (Fig. 4C). Mauchly's test for sphericity was performed for each repeated-measures ANOVA. ANOVA assumes that variance across conditions is constant but is robust even if this assumption is violated, assuming that the data are uncorrelated. However, heterogeneity of variances in correlated data makes it spuriously easier to reach significance using an ANOVA. Passing a test for sphericity indicates that data are uncorrelated and have homogeneous variance. In every instance, the test was unable to reject the null hypothesis (P > 0.2) that the data sets were spherical, confirming that the variance assumptions made by the repeated-measures ANOVA carried out in this study were not violated. For each experiment, ROIs are reported to be parametrically modulated by CC if the amount of variance in the Z scores that is accounted for by the linear contrast (partial eta squared, ηp2) reaches significance at a level of P < 0.05.
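One standard way to compute a within-subject linear contrast is to reduce each subject's 5 Z scores to a single contrast score and test whether the mean score differs from zero; the resulting F(1, n − 1) equals the squared one-sample t statistic. The sketch below follows that approach. The weights −2 to 2 are the usual equally spaced linear weights for 5 conditions, the data are hypothetical, and the authors' statistics package may have arrived at the same quantity by a different route.

```python
import math
import statistics

LINEAR_WEIGHTS = [-2, -1, 0, 1, 2]  # equally spaced linear trend over 5 conditions


def linear_contrast_f(z_by_subject, weights=LINEAR_WEIGHTS):
    """Return (F, (df_effect, df_error)) for a within-subject linear contrast."""
    scores = [sum(w * z for w, z in zip(weights, row)) for row in z_by_subject]
    n = len(scores)
    t = statistics.fmean(scores) / (statistics.stdev(scores) / math.sqrt(n))
    return t * t, (1, n - 1)


# Hypothetical Z scores: one row per subject, one column per CC condition
z_scores = [
    [-1.2, -0.6, 0.1, 0.5, 1.2],
    [-0.9, -0.8, 0.2, 0.4, 1.1],
    [-1.0, 0.3, -0.4, 0.2, 0.9],
    [0.2, -1.1, -0.5, 0.6, 0.8],
]

f_value, dof = linear_contrast_f(z_scores)
```

With 9 subjects this procedure yields the F(1,8) form of the statistic, and with 7 subjects the F(1,6) form.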

Results

The goal of this study was to isolate brain regions that process CC as a form-based cue for motion processing. In each experiment, CC was parametrically varied across stimulus conditions. Parametric variation in percent BOLD signal change (more precisely, normalized beta weights obtained from raw BOLD signal values) across stimulus conditions is taken as evidence that such brain regions may be using CC as a cue for motion processing. In order to ultimately conclude that a region is indeed processing CC, all other stimulus parameters that covary with CC must be ruled out. Each of the following experiments was designed to control for such covarying factors.

Experiment 1: Area Control

By fixing the length of the common axis (1.625° visual angle from origin to one end) and appropriately selecting the noncommon axes used to construct the bumps (Fig. 1a–c), the severity of CC discontinuity was systematically varied across stimulation group (Fig. 1d). This permitted the perceived sharpness of the rounded corners or degree of CC to vary while keeping the area of all bumps constant and keeping the contour itself differentiable everywhere. Each stimulus group consisted of 13 identical bumps (Fig. 1e). The sizes (in visual angle) of the noncommon axes, from origin to one end, used to define the bumps in each stimulation group were as follows: Group 1, 1.5°/1.5°; Group 2, 1.75°/1.25°; Group 3, 2.0°/1.0°; Group 4, 2.25°/0.75°; and Group 5, 2.5°/0.5°. Thus, the stimuli used in each group had the same height/width proportions 3.25° × 3.0°.
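The area and bounding-box claims can be verified directly from the listed axes. In the sketch below the listed values are treated as semi-axes (measured from the origin to one end), so each bump is the union of 2 half-ellipses with areas πab/2 and πac/2. The function names are illustrative.

```python
import math

COMMON_SEMI_AXIS = 1.625  # degrees visual angle, origin to one end (from the text)

# Noncommon semi-axes (b, c) for each stimulus group, from the text
GROUPS = {
    1: (1.5, 1.5),
    2: (1.75, 1.25),
    3: (2.0, 1.0),
    4: (2.25, 0.75),
    5: (2.5, 0.5),
}


def bump_area(a, b, c):
    """Area of 2 half-ellipses sharing semi-axis a: (pi*a*b)/2 + (pi*a*c)/2."""
    return math.pi * a * (b + c) / 2.0


def bump_extent(a, b, c):
    """End-to-end size along the common axis and along the noncommon axis."""
    return (2.0 * a, b + c)


areas = {g: bump_area(COMMON_SEMI_AXIS, b, c) for g, (b, c) in GROUPS.items()}
extents = {g: bump_extent(COMMON_SEMI_AXIS, b, c) for g, (b, c) in GROUPS.items()}
```

Because b + c = 3.0° in every group, the area depends only on the shared sum and is the same for all 5 groups, while the ratio b/c, and with it the severity of the CC discontinuity, increases from Group 1 to Group 5.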

Subjects

Nine people (5 women and 4 men, mean age 27 years, range from 18 to 41 years) participated in this study. Of these 9 participants, 4 also participated in experiments 2 and 3.

Results Experiment 1

Retinotopic ROIs were available for all the subjects who participated in this experiment. In addition, hMT+ and LOC were localized for 7 of the 9 subjects. Figure 2A indicates the degree to which percent BOLD signal varies parametrically with CC in each of the ROIs. Table 1 provides the statistical results of the repeated-measures ANOVA. Parametric variation of percent BOLD signal change with increasing CC as measured by a repeated-measures ANOVA linear contrast analysis was observed in areas V2v, V3v, V3A, hMT+, and LOC. ηp2 is a measure of the effect size or the amount of variance in the data accounted for by the linear relationship embodied in the contrast used.

Figure 2.

fMRI results. Results of a repeated-measures ANOVA linear contrast between percent BOLD signal Z scores and increasing CC. ηp2 values are plotted by ROI. Significance (*P < 0.05) indicates the existence of a linear relationship between percent BOLD signal and CC. (A) Experiment 1: Retinotopic areas V2v, V3v, and V3A as well as areas hMT+ and LOC modulate parametrically across stimulus condition. (B) Experiment 2: Retinotopic area V3A and areas hMT+ and LOC modulate parametrically across stimulus group. Based on this, we can no longer conclude that areas V2v and V3v, found to modulate in experiment 1, are processing CC as a cue for motion perception. (C) Experiment 3: With the stimulus groups presented so that each group was perceived to rotate at the same angular velocity, only the BOLD signal in retinotopic area V3A modulates parametrically across stimulus condition, thus enabling us to conclude that V3A processes CC as a cue for motion perception.

Table 1

Statistical results for experiment 1

Area    F score             P value    Effect size (ηp2)
V1      F(1,8) = 1.598      >0.24      0.167
V2v     F(1,8) = 16.814     <0.003     0.678
V2d     F(1,8) = 1.869      >0.20      0.189
V3v     F(1,8) = 8.31       <0.02      0.510
V3d     F(1,8) = 3.105      >0.11      0.28
V3A     F(1,8) = 7.675      <0.025     0.49
V4v     F(1,8) = 0.797      >0.39      0.091
hMT+    F(1,6) = 14.383     <0.009     0.706
LOC     F(1,6) = 9.853      <0.002     0.622

Note: For each of the individually identified ROIs, F scores, P values, and effect sizes are shown for the repeated-measures ANOVA with linear contrast. ROIs in which a significant amount of variance was accounted for by the linear contrast are shown in bold font (V2v, V3v, V3A, hMT+, LOC).
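For a single-degree-of-freedom contrast, partial eta squared follows directly from the F score and the error degrees of freedom: ηp2 = F × df_effect / (F × df_effect + df_error). The sketch below recomputes the effect sizes from the F scores in Table 1 as a consistency check. The function and dictionary names are illustrative.

```python
def partial_eta_squared(f_value, df_effect, df_error):
    """Partial eta squared recovered from an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# ROI: (F score, error df, reported effect size), copied from Table 1
TABLE_1 = {
    "V1": (1.598, 8, 0.167),
    "V2v": (16.814, 8, 0.678),
    "V2d": (1.869, 8, 0.189),
    "V3v": (8.31, 8, 0.510),
    "V3d": (3.105, 8, 0.28),
    "V3A": (7.675, 8, 0.49),
    "V4v": (0.797, 8, 0.091),
    "hMT+": (14.383, 6, 0.706),
    "LOC": (9.853, 6, 0.622),
}

recomputed = {
    roi: partial_eta_squared(f, 1, df_error)
    for roi, (f, df_error, _) in TABLE_1.items()
}
max_difference = max(
    abs(recomputed[roi] - reported) for roi, (_, _, reported) in TABLE_1.items()
)
```

The recomputed values agree with the reported effect sizes to within rounding of the tabulated F scores.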

The results of experiment 1 identify V2v, V3v, V3A, hMT+, and LOC as possible candidates for areas that use CC as a cue for motion processing. Along with CC, however, there are other stimulus parameters that varied parametrically across stimulus condition in the current experiment, which are controlled for in the following 2 experiments.

Experiment 2: Maximum Velocity Control

For any rotating object, the point along the contour that is farthest from the center of rotation will be the one that moves the fastest. For a bump or ellipse rotating about its center of gravity, this point is located at the end of the major axis (ellipse) or at the junction of the 2 half-ellipses (bump).

The method used to preserve the area across the stimulus groups used in experiment 1 involved fixing the length of the major axes and modifying the minor axes of the 2 half-ellipses that made up each bump. This had the result of systematically changing the maximum distance from the center of rotation to the contour. As a result, the points along the stimulus contour corresponding to the ends of the major axis translated at a velocity that covaried with CC in experiment 1. In addition, because the distance to the farthest point along the contour increased with increasing CC, the area of the circular patch of visual field “swept out” by each of the bumps increased across the stimulus groups. Thus, the visual areas found to vary parametrically in experiment 1 might have been modulated by either of these potentially confounding factors.

This experiment was designed to control for both confounding factors. This was accomplished by uniformly scaling the sizes of each of the stimuli so that the distance to the farthest point along the contour from the center of rotation was held constant across stimulus groups. This necessarily decreased the area of the stimuli as a function of increasing CC and kept the maximum velocity of all points along the contour constant across bump types. The scale factors used to match each of the stimulus groups were as follows: Group 1, 100% (unchanged); Group 2, 99%; Group 3, 96%; Group 4, 91%; and Group 5, 85%.
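
The arithmetic behind this control can be sketched as follows. The sketch assumes rigid rotation, so that the speed of a contour point equals the angular velocity times its distance from the center of rotation, and it uses the fact that uniform scaling by a factor s changes area by s squared. The Group 1 radius and the angular velocity are hypothetical placeholder values, not stimulus dimensions taken from the study; only the scale factors come from the text.

```python
import math

# Scale factors reported for experiment 2 (Group 1 to Group 5).
SCALE = {1: 1.00, 2: 0.99, 3: 0.96, 4: 0.91, 5: 0.85}

# Hypothetical values, for illustration only (not taken from the study).
R_MAX_GROUP1 = 1.0       # farthest contour point of Group 1, arbitrary units
OMEGA_DEG_PER_S = 126.0  # angular velocity in degrees per second


def max_tangential_speed(radius, omega_deg_per_s):
    """Speed of the farthest contour point under rigid rotation: v = omega * r."""
    return math.radians(omega_deg_per_s) * radius


def summarize(scale):
    """Per group: maximum speed before and after scaling, and the area ratio."""
    rows = {}
    for group, s in scale.items():
        # Radius the group must have had before scaling if the factor s
        # brings its farthest point to R_MAX_GROUP1.
        unscaled_radius = R_MAX_GROUP1 / s
        scaled_radius = unscaled_radius * s
        rows[group] = {
            "speed_before": max_tangential_speed(unscaled_radius, OMEGA_DEG_PER_S),
            "speed_after": max_tangential_speed(scaled_radius, OMEGA_DEG_PER_S),
            "area_ratio": s ** 2,  # uniform scaling changes area by s squared
        }
    return rows


rows = summarize(SCALE)
for group, row in rows.items():
    print(group, round(row["speed_before"], 3),
          round(row["speed_after"], 3), round(row["area_ratio"], 4))
```

Under these assumptions the maximum speed is equalized across groups after scaling, while the area of the most curved group falls to roughly 72% of its unscaled value, which is the trade-off described above.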

Subjects

Thirteen people (9 naive, 8 women and 5 men, mean age 24 years, range from 18 to 41 years) participated in this study.

Results Experiment 2

Retinotopic ROIs were available for all 13 of the subjects who participated in this experiment. In addition, hMT+ and LOC were localized for 9 of the subjects. Figure 2B indicates the degree to which percent BOLD signal varies parametrically with CC in each of the ROIs. Table 2 provides the statistical results of the repeated-measures ANOVA. Parametric variation of percent BOLD signal change with increasing CC was observed in areas V3A, hMT+, and LOC.

Table 2

Statistical results for experiment 2

Area\Stats F score P value Effect size (ηp2)
V1 F1,12 = 0.083 >0.77 0.007 
V2v F1,12 = 0.153 >0.70 0.013 
V2d F1,12 = 1.652 >0.22 0.121 
V3v F1,12 = 0.017 >0.89 0.001 
V3d F1,12 = 1.236 >0.28 0.093 
V3A F1,12 = 7.354 <0.019 0.38 
V4v F1,12 = 0.449 >0.51 0.036
hMT+ F1,8 = 8.342 <0.02 0.51 
LOC F1,8 = 8.997 <0.017 0.529 

Note: For each of the individually identified ROIs, F scores, P values, and effect sizes are shown for the repeated-measures ANOVA with linear contrast. ROIs in which a significant amount of variance was accounted for by the linear contrast are shown in bold font (V3A, hMT+, LOC).

Of the areas found in experiment 1, the percent BOLD signal in V3A, hMT+, and LOC was found to modulate parametrically across stimulus groups. From this, we can conclude that in these areas the parametric modulation of percent BOLD signal is not due to the area of the stimuli, the true translational speed of points along the contour, or the area of the visual field swept out by the rotation. Furthermore, we can no longer conclude that V2v or V3v are processing CC as a cue for motion perception because we cannot rule out the possibility that these areas were modulated by the actual translational velocity of points along the contour or by the amount of visual space stimulated by the individual bumps. Indeed, given the different results for V2v and V3v in this and the previous experiment, we can conclude that the parametric variation of BOLD signal in areas V2v and V3v was modulated by one or both of these confounding factors.

Experiment 3: Perceived Speed Control

The perceived angular velocity of an ellipse rotating about its origin is underestimated as a function of its aspect ratio (Caplovitz and others 2006). A “skinny” ellipse will be perceived to rotate faster than a “fatter” ellipse even if they are, in fact, rotating at the same speed. Subjectively, this same illusory underestimation of angular velocity exists for the bumps used in experiments 1 and 2. As the CC of the bumps increases, so does their perceived angular velocity. This experiment was designed to isolate the processing of CC from the perceived angular velocity of the stimulus groups. Using psychophysics, we determined for each subject the speed at which the bumps in each stimulus group needed to rotate in order to be perceived as rotating at the same speed. These psychophysical results were incorporated into the presentation of the stimuli while subjects were scanned using the same stimuli as those used in experiment 2. Therefore, in this third fMRI experiment the actual rate of rotation decreased with increasing CC in a subject-specific manner, so that all bumps appeared to rotate at the same subjective angular velocity.

Subjects

Twelve people (8 naive, 5 women and 7 men, mean age 27 years, range from 19 to 42 years) participated in this study. All 12 subjects participated in the psychophysics portion of this study for $5 and returned to participate in the fMRI portion of this study.

Results Psychophysics Experiment 3

The results of the psychophysics are illustrated in Figure 3B and demonstrate that as the degree of CC increased so did the subjectively perceived speed of rotation. This was demonstrated by the fact that stimulus groups 4 and 5 (greater CC than the control, shown as the yellow and pink bumps in Fig. 1d) had to be rotated slower than 126°/s to be perceived to be rotating at the same angular velocity as the control group. In contrast, stimulus groups 1 and 2 (less CC than control, shown as the red and green bumps in Fig. 1d) had to be rotated faster than 126°/s to be perceived to rotate at the same angular velocity as the control group. Overall, as the degree of CC increased, so did the perceived speed of rotation.
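
How a point of subjective equality can be read off faster/slower judgments is sketched below. This is only an illustration: the article does not say how the point of subjective equality was estimated, so the linear interpolation used here is an assumption, and the test speeds and response proportions are hypothetical. Only the 126°/s control speed comes from the text.

```python
def point_of_subjective_equality(speeds, prop_faster):
    """Linearly interpolate the test speed judged 'faster' on 50% of trials."""
    pairs = sorted(zip(speeds, prop_faster))
    for (s0, p0), (s1, p1) in zip(pairs, pairs[1:]):
        if p0 == 0.5:
            return s0
        if (p0 - 0.5) * (p1 - 0.5) < 0:  # 50% is crossed between s0 and s1
            return s0 + (0.5 - p0) * (s1 - s0) / (p1 - p0)
    if pairs[-1][1] == 0.5:
        return pairs[-1][0]
    raise ValueError("responses never cross 50%")


CONTROL_SPEED = 126.0  # deg/s, control rotation rate reported in the text

# Hypothetical data for one high-curvature test group: rotation speeds (deg/s)
# and the proportion of trials on which the test was judged faster.
test_speeds = [90.0, 108.0, 126.0, 144.0, 162.0]
prop_faster = [0.10, 0.30, 0.70, 0.90, 1.00]

pse = point_of_subjective_equality(test_speeds, prop_faster)
percent_change = 100.0 * (pse - CONTROL_SPEED) / CONTROL_SPEED
print(round(pse, 2), round(percent_change, 2))
```

With these made-up responses the point of subjective equality falls below the control speed, the pattern the text reports for the higher-curvature groups.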

Figure 3.

Experiment 3: Psychophysics. (A) Psychophysics paradigm: a group of 13 control stimuli (Fig. 1d blue) rotating at 126°/s was presented to subjects for 500 ms. Subjects were then presented with a test group, adjusted for size according to experiment 2, for 500 ms rotating either faster or slower than the control stimulus. Subjects indicated via a button press whether the test group was rotating faster or slower than the control. (B) Results of experiment 3 psychophysics: the color of the plotted bars corresponds to the stimuli bounded by similar color in Figure 1d corrected for size as in experiment 2. The magnitude of the bars indicates how much faster or slower, in percentage relative to the control group, each test stimulus group needed to be rotated in order to be perceived as rotating at the same angular velocity (point of subjective equality) as the control group (blue). As the degree of CC increased, the point of subjective equality decreased. This indicates that the higher curvature bumps were perceived to rotate faster than lower curvature ones. Error bars indicate the standard error of the mean across subjects.

Results fMRI Experiment 3

Psychophysical data allowed the presentation of stimuli in the fMRI portion of the experiment to be individually calibrated for each subject so that the perceived speed of rotation was the same across stimulus groups. Retinotopic ROIs were available for all 12 of the subjects who participated in this experiment. In addition, hMT+ and LOC were localized for 9 of the subjects. Figure 2C indicates the degree to which percent BOLD signal varies parametrically with CC in each of the ROIs. Table 3 provides the statistical results of the repeated-measures ANOVA. Parametric variation of percent BOLD signal change with increasing CC was observed only in area V3A.

Table 3

Statistical results for experiment 3

Area\Stats F score P value Effect size (ηp2)
V1 F1,11 = 0.121 >0.73 0.011 
V2v F1,11 = 0.772 >0.41 0.062 
V2d F1,11 = 0.088 >0.77 0.008 
V3v F1,11 = 0.35 >0.56 0.031 
V3d F1,11 = 0.014 >0.90 0.001 
V3A F1,11 = 14.252 <0.003 0.564 
V4v F1,11 = 0.561 >0.46 0.048 
hMT+ F1,8 = 4.19 >0.07 0.344 
LOC F1,8 = 4.11 >0.077 0.34 

Note: For each of the individually identified ROIs, F scores, P values, and effect sizes are shown for the repeated-measures ANOVA with linear contrast. ROIs in which a significant amount of variance was accounted for by the linear contrast are shown in bold font (V3A).

Interestingly, however, the BOLD signal in both hMT+ and LOC appears to have some parametric relationship with CC, albeit a statistically nonsignificant one (P > 0.07). Directly interpreting “near-significant” results can be very difficult; however, it would be imprudent to neglect such results without first attempting to eliminate potential sources of confounding noise. One possible source of noise is the threshold criterion used to determine the hMT+ and LOC masks. For example, too liberal a criterion could include voxels not directly responsive to our stimuli. Because all the voxels within the localized masks are averaged together, these “extra” voxels would only serve to reduce the signal-to-noise ratio. To investigate this possibility, a second set of hMT+ and LOC masks was created using a stricter threshold criterion. Repeating the analysis with these masks revealed that LOC was indeed parametrically modulated (F1,8 = 5.48, P < 0.047, ηp2 = 0.407), whereas hMT+ was not (F1,8 = 2.07, P > 0.18, ηp2 = 0.206). From this, we conclude that in addition to V3A, the LOC is more likely than hMT+ to process CC as a cue for rotational motion.

The results of this experiment indicate that the parametric modulation of BOLD signal observed in area V3A was not due to the perceived speed of rotation. Based on the results of experiments 1, 2, and 3, we can conclude that the percent BOLD signals observed in V3A, and possibly the LOC among the areas tested, vary parametrically with the CC of the stimuli. As an example, Figure 5 shows the GLM activation map of a single subject for parametric modulation as a function of CC in each of the 3 experiments.

Figure 4.

Individual ROI analysis. Based on a voxel by voxel GLM analysis performed in each subject individually, mean beta weights were computed within each ROI and then averaged across hemisphere. (A) The mean beta weights for each stimulus condition for area hMT+ across subject. (B) The raw beta weights for each subject converted to Z scores to eliminate between-subject magnitude variance. (C) A repeated-measures ANOVA with a linear contrast incorporating the data from all subjects performed on the Z scores to determine the degree of parametric percent BOLD signal change across stimulus condition.
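
The analysis steps in panels B and C can be sketched in a few lines. This is a minimal illustration, not the authors' analysis code: the beta weights are hypothetical, the Z scores use the sample standard deviation (the article does not specify which), and the linear contrast is computed through the standard equivalence between a 1-degree-of-freedom within-subject contrast and a one-sample t test on per-subject contrast scores (F = t squared).

```python
import math
import statistics


def zscore(values):
    """Convert one subject's beta weights to Z scores across conditions."""
    mean = statistics.fmean(values)
    sd = statistics.stdev(values)  # sample SD; an assumption of this sketch
    return [(v - mean) / sd for v in values]


def linear_contrast(data):
    """Linear-trend test across ordered conditions.

    data: one list per subject, one value per condition.
    Returns (F, error degrees of freedom, partial eta squared).
    """
    k = len(data[0])
    weights = [j - (k - 1) / 2 for j in range(k)]  # e.g. -2, -1, 0, 1, 2
    scores = [sum(w * v for w, v in zip(weights, subj)) for subj in data]
    n = len(scores)
    t = statistics.fmean(scores) / (statistics.stdev(scores) / math.sqrt(n))
    f = t * t
    df_error = n - 1
    return f, df_error, f / (f + df_error)


# Hypothetical mean beta weights: 4 subjects x 5 curvature groups.
betas = [
    [0.2, 0.4, 0.5, 0.9, 1.0],
    [1.0, 1.1, 1.5, 1.4, 1.9],
    [0.5, 0.3, 0.8, 0.7, 1.2],
    [0.1, 0.3, 0.2, 0.6, 0.5],
]
z = [zscore(subject) for subject in betas]
f_value, df_error, eta_p2 = linear_contrast(z)
print(round(f_value, 3), df_error, round(eta_p2, 3))
```

Z scoring each subject first removes differences in overall response magnitude, so the contrast reflects only the shape of the response across curvature groups.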

Figure 5.

Single-subject activation maps. Activation map based on a single-subject GLM (fixed effects) showing parametric BOLD variation as a function of CC in a single exemplar subject (left hemisphere). The activation maps for (A) experiment 1, (B) experiment 2, and (C) experiment 3 are shown overlaid with this subject's ROIs outlined in black.

Experiment 4: Monitoring Eye Movements

In any fMRI study investigating neuronal activations within the retinotopic areas of the visual cortex, one has to take into consideration the role of eye movements (both voluntary and involuntary fixational microsaccades) as a potential source of confounding retinal stimulation. Due to technological limitations, we were unable to record eye movements during the fMRI scans. However, in order to determine whether or not a systematic relationship between our stimulus conditions and eye movements existed, we monitored the eye movements of a subset of subjects who participated in the fMRI experiments outside the scanner.

Methods

Twelve subjects (the 2 authors and 10 naïve Dartmouth students) participated in this control experiment conducted outside the scanner. Each subject was presented with one stimulus run like those presented in experiment 1 while simultaneously having their eye movements monitored using the presentation and eye-tracking system described in the psychophysics portion of experiment 3. As in the scanner, subjects were instructed to maintain fixation and press a button whenever the fixation square changed color.

The detection of microsaccades was performed in MATLAB using detection algorithms developed by Engbert and Kliegl (2002). The number and amplitude of microsaccades were computed for each stimulus condition. A repeated-measures ANOVA with linear contrast was performed to determine if there was a parametric relationship between eye movements and the CC of the stimulus groups.
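
The general idea of velocity-threshold microsaccade detection can be sketched as follows. This is a simplified illustration in the spirit of the cited approach, not the MATLAB code used in the study: the threshold multiplier, the minimum duration, the 500 Hz sampling rate, and the gaze trace are all assumed values, and the spread estimate is a median-based variant chosen so that it can never be negative.

```python
import math
import statistics


def velocities(pos, dt):
    """Smoothed velocity estimate from a 5-sample moving window."""
    v = [0.0] * len(pos)
    for n in range(2, len(pos) - 2):
        v[n] = (pos[n + 2] + pos[n + 1] - pos[n - 1] - pos[n - 2]) / (6.0 * dt)
    return v


def robust_spread(v):
    """Median-based spread of velocities, insensitive to rare fast samples."""
    med = statistics.median(v)
    return math.sqrt(statistics.median([(x - med) ** 2 for x in v]))


def detect_microsaccades(x, y, dt, lam=6.0, min_samples=3):
    """Return (first_index, last_index, amplitude) for each detected event.

    lam and min_samples are assumed values, not those used in the study.
    """
    vx, vy = velocities(x, dt), velocities(y, dt)
    tx, ty = lam * robust_spread(vx), lam * robust_spread(vy)
    if tx == 0.0 or ty == 0.0:
        raise ValueError("velocity spread is zero; cannot set a threshold")
    # A sample is a candidate when it falls outside the threshold ellipse.
    above = [(a / tx) ** 2 + (b / ty) ** 2 > 1.0 for a, b in zip(vx, vy)]
    events, start = [], None
    for i, flag in enumerate(above + [False]):  # sentinel closes a final run
        if flag and start is None:
            start = i
        elif not flag and start is not None:
            if i - start >= min_samples:
                amp = math.hypot(x[i - 1] - x[start], y[i - 1] - y[start])
                events.append((start, i - 1, amp))
            start = None
    return events


def synthetic_trace(n=300, jump_at=100, jump_len=5, jump_size=0.5):
    """Hypothetical gaze trace (deg): small jitter plus one fast 0.5 deg shift."""
    x, y = [], []
    for i in range(n):
        ramp = min(max(i - jump_at, 0), jump_len) * jump_size / jump_len
        x.append(0.01 * math.sin(1.3 * i) + ramp)
        y.append(0.01 * math.cos(0.9 * i))
    return x, y


x, y = synthetic_trace()
events = detect_microsaccades(x, y, dt=0.002)  # 500 Hz sampling is assumed
print(len(events), [round(e[2], 2) for e in events])
```

Counting the returned events and averaging their amplitudes within each stimulus condition would give the two measures that were entered into the repeated-measures ANOVA.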

Results

There was no statistically significant parametric variation in the number (F1,11 = 0.316, P > 0.58, ηp2 = 0.028) or amplitude (F1,11 = 1.26, P > 0.28, ηp2 = 0.103) of microsaccades across stimulus condition. Importantly, the effect sizes for the 2 ANOVAs were quite small relative to the effect sizes observed in the fMRI portions of the study. Subjects were nearly perfect (98% hit rate) in successfully responding to the fixation color changes. All responses to fixation color changes occurred within 1 s of the actual color change.

Discussion of Eye Movements

Although eye movements were not recorded during the fMRI runs, the above results indicate little or no correlation between the stimulus groups and the rate or magnitude of eye movements. For this reason, we conclude that eye movements and the confounding retinal stimulation they produce are unlikely to be the source of the parametric BOLD activations we observe in our fMRI experiments. Further evidence against the role of eye movements can be found in the fMRI data themselves in which BOLD modulations in V1 are not observed, as one would expect with systematic variations of eye movements.

General Discussion

Past research on smoothly moving contours has suggested that contour relationships contribute to motion analysis. Wallach (1935, 1976; Wuerger and others 1996) found that the perceived direction of motion of a straight line drifting smoothly behind an aperture depends on the shape of that aperture, even though such a line does not possess a mathematically well-defined direction of motion. Rather than being constructed from the ill-defined motion of the line itself, the motion percept followed the well-defined direction of motion of the line terminators defined by the aperture. In doing so, Wallach presaged what many years later would be called the aperture problem (Marr and Ullman 1981; Marr 1982; Hildreth 1984; Movshon and others 1985).

More recent work has shown how such terminator motions influence processes such as amodal completion and global integration of local motion signals (Lorenceau and Shiffrar 1992; Shiffrar and others 1995). However, the interactions of form and motion are not limited to local contour terminators. For example, there are several illusions where a 3D shape appears to change its direction of rotation depending on the 3D form interpretation one places over the moving object (e.g., Ames' rotating trapezoidal window illusion, Ames 1951; the rotating mask illusion, e.g., Klopfer 1991; motion from structure, e.g., Ullman 1979). Others (Dosher and others 1986) have noted that when a Necker stimulus is continuously rotated from a stationary position, it appears to rotate in a direction consistent with the 3D interpretation it had when stationary.

Similarly, Sinha and Poggio (1996) have shown that the representation of the 3D form of an ambiguous “rotating” wire silhouette determines whether rigid rotation or deformation is seen. They rotated a computer-generated wire silhouette. Although an infinity of 3D motions are consistent with the silhouette motion, an assumption of object rigidity allows the perception of a single rigidly rotating 3D shape. When a new wire is rotated from an initial position that happens to cast the same silhouette as the final position of the first wire, observers tend to see the wire deform, as the silhouette takes on shapes inconsistent with the shape inferred from the first rotating silhouette. Interestingly, observers who do not receive training with the first wire do not see deformation in the second stimulus but instead see rigid rotation. This demonstrates both the existence of an object rigidity assumption and the existence of an internal 3D model that can bias perceived motions toward paths involving rigid rotation or deformation. These studies demonstrate that global form analysis plays a role in the perception of continuous rotational motion.

The purpose of the experiments conducted in this study was to identify where in the brain such global form–motion processing takes place specifically in the context of 2D rotational motion. Each of the 3 studies controlled for a different aspect of the visual stimulus and together isolated CC as the sole form cue that varied across stimulus conditions. The convergent results of the 3 fMRI experiments identify visual area V3A and possibly the LOC as potential locations for neuronal activity underlying the processing of CC as a TF for the perception of continuous rotational motion.

Human V3A has been shown to be motion selective using neuroimaging techniques (fMRI: Tootell and others 1997; Braddick and others 2000, 2001; Vanduffel and others 2002; Vaina and others 2003; Lui and others 2004; Koyama and others 2005; Liu and Wandell 2005; Moutoussis and others 2005; magnetoencephalography: Schellart and others 2004; Aspell and others 2005). While anatomical investigations on the brain of nonhuman primates have revealed that V3A has direct reciprocal connections with primate area MT (for review of primate MT, see Born and Bradley 2005), it has been shown that V3A in humans is much more sensitive to motion than in nonhuman primates (Tootell and others 1997; Vanduffel and others 2001; Orban and others 2003). That we find activity in V3A to be parametrically modulated by CC is consistent with the work of Schira and others (2004) who demonstrated that percent BOLD signal in V3A is correlated with contour and figural processing, even in the absence of conscious perception. Figural processing is central to the TFs argument because the motion signal derived from the TF must be generalized to the rest of the contour. Whereas their findings demonstrated that V3A responded to figural contours, our findings extend these by demonstrating that the percent BOLD signal in V3A is parametrically modulated by CC specifically in the context of continuous rotational motion. Unlike previous work, we provide a functional role for the processing of contours in V3A, namely, regions of high curvature can serve as a form-based TF that can be used by the visual system to solve the aperture problem.

Our findings are also consistent with recent imaging work that has investigated the neural correlates of form–motion interactions. Several groups (Braddick and others 2000, 2001; Vaina and others 2003; Moutoussis and others 2005) have shown that percent BOLD signal change in V3A was greater for coherent than for random motion. Koyama and others (2005) showed that V3A is more responsive to radial than to translational motion, at least in the central portion of the visual field. These findings suggest a role for V3A in the generation of global motion percepts. Our findings expand upon this work by suggesting a specific mechanism concerning how form and motion may interact to construct global motion percepts. Namely, we hypothesize that neural activity within V3A serves to extract reliable motion information from regions of high CC. Such TF motion information may then be propagated to the entire moving object, resulting in the global motion percept. The convergent evidence from these studies, as well as our own, leads to the hypothesis that V3A contains neural populations that process form, not to solve the “ventral problem” of determining object shape, but in order to solve the “dorsal problem” of what is going where. The form analysis that we hypothesize takes place here involves the specification and tracking of key TFs, such as CC.

Trackable Features

Motion perception is beset with the problem that many of the motion signals generated by early detectors in the visual system are ambiguous. There are many motions in the world that can give rise to any particular motion observed at the level of the retina or later. This ambiguity, known as the aperture problem, arises because of the receptive field properties of neurons in the early stages of visual processing.

The receptive fields of early motion detectors are small and tuned for orientation. Because of this, each of these neurons can detect motion only in the direction perpendicular to its preferred orientation. Many authors have argued that the aperture problem can be solved by integrating component motion signals along the contour (Bonnet 1981; Burt and Sperling 1981; Adelson and Movshon 1982; Watson and Ahumada 1985). These models are based on the assertion that ambiguous motion signals can, via integration, be disambiguated. However, certain locations along a contour such as corners, terminators, and junctions do not move ambiguously when they are intrinsically part of the moving object. An alternative solution to the aperture problem lies in exploiting such TFs in order to disambiguate ambiguous component motion signals that arise along portions of contour distant from TFs (Ullman 1979).
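
The ambiguity and one integration-based way of resolving it can be made concrete with a small numerical sketch. It assumes a rigidly translating contour and idealized detectors that report only the velocity component normal to an edge; combining two differently oriented edges is the intersection-of-constraints idea associated with constraint lines in velocity space. The velocities and orientations are hypothetical, and this is not a model proposed in the present study.

```python
import math


def normal_speed(velocity, edge_angle_deg):
    """Component of the true velocity along the unit normal of an edge.

    Returns (speed along the normal, unit normal vector).
    """
    theta = math.radians(edge_angle_deg)
    nx, ny = -math.sin(theta), math.cos(theta)  # unit normal to the edge
    return velocity[0] * nx + velocity[1] * ny, (nx, ny)


def intersect_constraints(m1, m2):
    """Recover (vx, vy) from two (normal speed, normal) measurements.

    Solves n1 . v = s1 and n2 . v = s2, i.e. intersects two constraint lines.
    """
    s1, (a, b) = m1
    s2, (c, d) = m2
    det = a * d - b * c
    if abs(det) < 1e-12:
        raise ValueError("edges are parallel; velocity remains ambiguous")
    return (s1 * d - b * s2) / det, (a * s2 - s1 * c) / det


# Two different hypothetical velocities look identical through one
# horizontal edge, because only the vertical component is visible.
same_a, _ = normal_speed((3.0, 1.0), 0.0)
same_b, _ = normal_speed((7.0, 1.0), 0.0)

# Adding a second edge at a different orientation removes the ambiguity.
true_velocity = (3.0, 1.0)
recovered = intersect_constraints(
    normal_speed(true_velocity, 0.0),
    normal_speed(true_velocity, 60.0),
)
print(same_a, same_b, tuple(round(c, 6) for c in recovered))
```

A trackable feature such as a corner supplies both components at a single location, which is why it does not require this kind of integration across edges.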

Recent neurophysiological data have shown that neurons in MT in the macaque respond more to terminator motion in a barber pole stimulus than to the ambiguous signals generated by portions of the contour away from terminators. Furthermore, they respond more to intrinsically owned terminators than to extrinsic terminators (Pack and others 2004). It has also been shown that neurons in MT in the macaque will initially respond to the direction of motion that is perpendicular (component direction) to a moving line independent of the actual direction of motion (Pack and Born 2001). These same neurons will, over a period of ∼60 ms, shift their response properties so that they respond to the true motion of the line independent of its orientation, suggesting that the unambiguously moving endpoints of the line are quickly but not instantaneously exploited to generate a veridical motion solution. The response properties of these neurons match behavioral data that show that initial pursuit eye movements will be in the direction perpendicular to the moving line and then rapidly adapt to follow the direction of veridical motion as defined by the line terminators (Pack and Born 2001). There is also neurophysiological evidence of end-stopped neurons in V1 that respond to the motion of line terminators independently of the line's orientation (Pack and others 2003), suggesting that form-based TFs such as line terminators can be directly extracted from the image as early as V1. Such cells are largely immune to the aperture problem.

In line with this view, features to which such end-stopped cells would respond have been shown psychophysically to be processed both rapidly and in parallel across the visual scene. Visual search studies have found several form-based “features,” including certain types of contour junctions (Enns and Rensink 1991), contour concavities (Hulleman and others 2000), corners (Humphreys and others 1994), CC (Wolfe and others 1992), and CC discontinuities (Kristjansson and Tse 2001), that will pop out among a set of distracters. It is commonly believed that features that exhibit pop out during visual search are processed rapidly and in parallel across the visual field (Treisman and Gelade 1980), suggesting the existence of hardwired contour discontinuity detectors. Indeed, contour discontinuity information may begin to be extracted even before V1 because circular center-surround receptive fields will respond more to corners than to edges and more to bar terminators than corners (Troncoso and others 2005).

Perceptual Phenomenology of Rotational Motion

Caplovitz and others (2006) characterized the relationship between the degree of curvature along the contour of an ellipse and the speed at which it is perceived to rotate. They found that as the degree of CC increases, so does the perceived speed of rotation. Thus, a skinny ellipse will be perceived to rotate faster than a “fat” ellipse. This observation served as the basis for the design of experiment 3 of the current study, in which the perceived rotational speed of the bumps was shown to parametrically modulate with CC. Indeed, we hypothesize that this illusory percept results from the processing of motion signals generated by the regions of high curvature (TFs) and that the modulation of percent BOLD signal observed in V3A, hMT+, and LOC may reflect this processing.

It is important to note, however, that both the illusory percept of rotational speed as well as the TFs hypothesis only apply to percepts of rigid rotation. It has been shown (Wallach and others 1956; Weiss and Adelson 2000) that a low-aspect ratio ellipse rotating continuously in the 2D plane can be perceived to deform as though its contour were made out of jelly. This nonrigid percept can be influenced by many factors including the presence of satellites (Weiss and Adelson 2000) but most importantly depends on the aspect ratio of the ellipse being low (i.e., close to 1). In the case of nonrigid motion, the question of rotational speed no longer applies because the object is not perceived to rotate at all, but to deform. Similarly, because the shape of the nonrigid object is continuously changing, the concept of a shape-defined TF for a gelatinous ellipse is ill-defined. Based on the verbal reports of the participants (both authors and naives), we are confident that all the stimuli used in the current study were indeed perceived to rotate rigidly at all times by all observers.

Psychophysical evidence examining other forms of motion perception has demonstrated a critical link between form-defined features and the perception of rigid motion. An example can be found in the kinetic depth effect (Wallach and O'Connell 1953) in which the projected 2D image of a rotating, backlit, 3D bent wire will appear to pop out into a 3D rotating object. Importantly, the kinetic depth effect will only occur if the 3D bent wire object has regions of high or discontinuous curvature. Similarly, the phenomenon of anorthoscopic projection (Zollner 1862; von Helmholtz 1867/1925; Parks 1965), in which an object moving behind a narrow slit can be recognized, is also heavily dependent upon the presence of highly salient contour features. In the absence of corners or regions of high curvature, nonrigid motion is perceived through the slit, and object recognition does not occur. Based on these findings, one can speculate that the degree of CC or the presence of other TFs may play a critical role in determining whether a rotating object is perceived as rigid or nonrigid.

Conclusions

Based on the results of these experiments, we conclude that neuronal processing in area V3A and possibly the LOC serves to analyze CC as a TF for the perception of rotational motion. This raises the possibility that these areas contain neural populations that process form, not to solve the ventral problem of determining object shape but in order to solve the dorsal problem of what is going where. We predict that neurons in V3A and possibly in the LOC will respond to the continuous motion of other TFs defined by contour discontinuities, such as junctions, corners, and terminators.

We thank Melissa Henley for assistance in collecting and analyzing data. This project was funded by NIH R03 MH0609660-01 grant to PUT and NSF fellowship 2005031192 to GPC. Conflict of Interest: None declared.

References

Adelson EH, Movshon JA. Phenomenal coherence of moving visual patterns, Nature, 1982, vol. 30 (pg. 523-525)
Ames A. Visual perception and the rotating trapezoidal window, Psychol Monogr Gen Appl, 1951, vol. 65 (pg. 1-31)
Aspell JE, Tanskanen T, Hurlbert AC. Neuromagnetic correlates of visual motion coherence, Eur J Neurosci, 2005, vol. 22(11) (pg. 2937-2945)
Attneave F. Some informational aspects of visual perception, Psychol Rev, 1954, vol. 61 (pg. 183-193)
Bonnet C. Processing configurations of visual motion. In: Long J, Bradley A, editors. Attention and performance IX, 1981, Hillsdale, NJ: Erlbaum
Born RT, Bradley DC. Structure and function of visual area MT, Annu Rev Neurosci, 2005, vol. 28 (pg. 157-189)
Braddick OJ, O'Brien JM, Wattam-Bell J, Atkinson J, Hartley T, Turner R. Brain areas sensitive to coherent visual motion, Perception, 2001, vol. 30 (pg. 61-72)
Braddick OJ, O'Brien JM, Wattam-Bell J, Atkinson J, Turner R. Form and motion coherence activate independent, but not dorsal/ventral segregated, networks in the human brain, Curr Biol, 2000, vol. 10 (pg. 731-734)
Brewer AA, Liu J, Wade AR, Wandell BA. Visual field maps and stimulus selectivity in human ventral occipital cortex, Nat Neurosci, 2005, vol. 8(8) (pg. 1102-1109)
Burt P, Sperling G. Time, distance, and feature trade-offs in visual apparent motion, Psychol Rev, 1981, vol. 88 (pg. 171-195)
Caplovitz GP, Hsieh P-J, Tse PU. Mechanisms underlying the perceived angular velocity of a rigidly rotating object, Vision Res, 2006, vol. 46 (pg. 2877-2893)
Dosher BA, Sperling G, Wurst SA. Tradeoffs between stereopsis and proximity luminance covariance as determinants of perceived 3D structure, Vision Res, 1986, vol. 26(6) (pg. 973-990)
Dupont P, De Bruyn B, Vandenberghe R, Rosier AM, Michiels J, Marchal G, Mortelmans L, Orban GA. The kinetic occipital region in human visual cortex, Cereb Cortex, 1997, vol. 7 (pg. 283-292)
Engbert R, Kliegl R. Microsaccades uncover the orientation of covert attention, Vision Res, 2002, vol. 43 (pg. 1035-1045)
Enns JT, Rensink RA. Preattentive recovery of three-dimensional orientation from line drawings, Psychol Rev, 1991, vol. 98(3) (pg. 335-351)
Fennema C, Thompson W. Velocity determination in scenes containing several moving objects, Comput Graph Image Process, 1979, vol. 9 (pg. 301-305)
Grossman E, Donnelly M, Price R, Pickens D, Morgan V, Neighbor G, Blake R. Brain areas involved in perception of biological motion, J Cogn Neurosci, 2000, vol. 12 (pg. 711-720)
Hadjikhani N, Liu AK, Dale AM, Cavanagh P, Tootell RBH. Retinotopy and color sensitivity in human visual cortical area V8, Nat Neurosci, 1998, vol. 1 (pg. 235-241)
Hildreth EC. The measurement of visual motion, 1984, Cambridge, MA: MIT Press
Hulleman J, te Winkel W, Boselie F. Concavities as basic features in visual search: evidence from search asymmetries, Percept Psychophys, 2000, vol. 62(1) (pg. 162-174)
Humphreys GW, Keulers N, Donnelly N. Parallel visual coding in three dimensions, Perception, 1994, vol. 23(4) (pg. 453-470)
Klopfer DS. Apparent reversals of a rotating mask: a new demonstration of cognition in perception, Percept Psychophys, 1991, vol. 49(6) (pg. 522-530)
Kononen M, Paakkonen A, Pihlajamaki M, Partanen K, Karjalainen PA, Soimakallio S, Aronen HJ. Visual processing of coherent rotation in the central visual field: an fMRI study, Perception, 2003, vol. 32 (pg. 1247-1257)
Kourtzi Z, Erb M, Grodd W, Bulthoff HH. Representation of the perceived 3-D object shape in the human lateral occipital complex, Cereb Cortex, 2003, vol. 13(9) (pg. 911-920)
Kourtzi Z, Kanwisher N. Cortical regions involved in perceiving object shape, J Neurosci, 2000, vol. 20(9) (pg. 3310-3318)
Koyama S, Sasaki Y, Andersen GJ, Tootell RB, Matsuura M, Watanabe T. Separate processing of different global-motion structures in visual cortex is revealed by fMRI, Curr Biol, 2005, vol. 15(22) (pg. 2027-2032)
Kristjansson A, Tse PU. Curvature discontinuities are cues for rapid shape analysis, Percept Psychophys, 2001, vol. 63(3) (pg. 390-403)
Liu J, Wandell BA. Specializations for chromatic and temporal signals in human visual cortex, J Neurosci, 2005, vol. 25(13) (pg. 3459-3468)
Lorenceau J, Shiffrar M. The role of terminators in motion integration across contours, Vision Res, 1992, vol. 32 (pg. 263-273)
Lui T, Slotnick SD, Yantis S. Human MT+ mediates perceptual filling-in during apparent motion, Neuroimage, 2004, vol. 21(4) (pg. 1772-1780)
Malach R, Levy I, Hasson U. The topography of high-order human object areas, Trends Cogn Sci, 2002, vol. 6 (pg. 176-184)
Marr D. Vision, 1982, New York: W. H. Freeman and Co.
Marr D, Ullman S. Direction selectivity and its use in early visual processing, Proc R Soc Lond Ser B, 1981, vol. 211 (pg. 151-180)
Moutoussis K, Keliris G, Kourtzi Z, Logothetis N. A binocular rivalry study of motion perception in the human brain, Vision Res, 2005, vol. 45(17) (pg. 2231-2243)
Movshon JA, Adelson EH, Gizzi MS, Newsome WT. The analysis of moving visual patterns. In: Chagas C, Gattass R, Gross C, editors. Pattern recognition mechanisms, 1985, New York: Springer (pg. 117-151)
Nakayama K, Silverman GH. The aperture problem I. Perception of nonrigidity and motion direction in translating sinusoidal lines, Vision Res, 1988, vol. 28(6) (pg. 739-746)
Orban GA, Fize D, Peuskens H, Denys K, Nelissen K, Sunaert S, Todd J, Vanduffel W. Similarities and differences in motion processing between the human and macaque brain: evidence from fMRI, Neuropsychologia, 2003, vol. 41(13) (pg. 1757-1768)
Pack CC, Born RT. Temporal dynamics of a neural solution to the aperture problem in visual area MT of macaque brain, Nature, 2001, vol. 409(6823) (pg. 1040-1042)
Pack CC, Gartland AJ, Born RT. Integration of contour and terminator signals in visual area MT of alert macaque, J Neurosci, 2004, vol. 24(13) (pg. 3268-3280)
Pack CC, Livingstone MS, Duffy KR, Born RT. End-stopping and the aperture problem: two-dimensional motion signals in macaque V1, Neuron, 2003, vol. 39(4) (pg. 671-680)
Parks TE. Post-retinal visual storage, Am J Psychol, 1965, vol. 78 (pg. 145-147)
Press WA, Brewer AA, Dougherty RF, Wade AR, Wandell BA. Visual areas and spatial summation in human visual cortex, Vision Res, 2001, vol. 41 (pg. 1321-1332)
Schellart NA, Trindade MJ, Reits D, Verbunt JP, Spekreijse H. Temporal and spatial congruence of components of motion-onset evoked responses investigated by whole-head magneto-electroencephalography, Vision Res, 2004, vol. 44(2) (pg. 119-134)
Schira MM, Fahle M, Donner TH, Kraft A, Brandt SA. Differential contribution of early visual areas to the perceptual process of contour processing, J Neurophysiol, 2004, vol. 91(4) (pg. 1716-1721)
Sereno MI, Dale AM, Reppas JB, Kwong KK, Belliveau JW, Brady TJ, Rosen BR, Tootell RB. Borders of multiple visual areas in humans revealed by functional magnetic resonance imaging, Science, 1995, vol. 268 (pg. 889-893)
Shiffrar M, Li X, Lorenceau J. Motion integration across differing image features, Vision Res, 1995, vol. 35(15) (pg. 2137-2146)
Sinha P, Poggio T. Role of learning in three-dimensional form perception, Nature, 1996, vol. 384 (pg. 460-463)
Slotnick SD, Yantis S. Efficient acquisition of human retinotopic maps, Hum Brain Mapp, 2003, vol. 18(1) (pg. 22-29)
Smith AT, Greenlee MW, Singh KD, Kraemer FM, Hennig J. The processing of first- and second-order motion in human visual cortex assessed by functional magnetic resonance imaging (fMRI), J Neurosci, 1998, vol. 18 (pg. 3816-3830)
Tootell RB, Mendola JD, Hadjikhani NK, Ledden PJ, Liu AK, Reppas JB, Sereno MI, Dale AM. Functional analysis of V3A and related areas in human visual cortex, J Neurosci, 1997, vol. 17(18) (pg. 7060-7078)
Tootell RBH, Dale AM, Sereno MI, Malach R. New images from human visual cortex, Trends Neurosci, 1996, vol. 95 (pg. 818-824)
Tootell RBH, Hadjikhani N, Hall EK, Marrett S, Vanduffel W, Vaughan JT, Dale AM. The retinotopy of visual spatial attention, Neuron, 1998, vol. 21 (pg. 1409-1422)
Tootell RBH, Hadjikhani NK, Vanduffel W, Liu AK, Mendola JD, Sereno MI, Dale AM. Functional analysis of primary visual cortex (V1) in humans, Proc Natl Acad Sci USA, 1998, vol. 95 (pg. 811-817)
Tootell RBH, Reppas JB, Kwong KK, Malach R, Born RT, Brady TJ, Rosen BR, Belliveau JW. Functional analysis of human MT and related visual cortical areas using magnetic resonance imaging, J Neurosci, 1995, vol. 15 (pg. 3215-3230)
Treisman AM, Gelade GA. Feature-integration theory of attention, Cognit Psychol, 1980, vol. 12(1) (pg. 97-136)
Troncoso XG, Macknik SL, Martinez-Conde S. Novel visual illusions related to Vasarely's ‘nested squares’ show that corner salience varies with corner angle, Perception, 2005, vol. 34(4) (pg. 409-420)
Tsao DY, Vanduffel W, Sasaki Y, Fize D, Knutsen TA, Mandeville JB, Wald LL, Dale AM, Rosen BR, Van Essen DC, Livingstone MS, Orban GA, Tootell RBH. Stereopsis activates V3A and caudal intraparietal areas in macaques and humans, Neuron, 2003, vol. 39 (pg. 555-568)
Tse PU, Albert MK. Amodal completion in the absence of image tangent discontinuities, Perception, 1998, vol. 27(4) (pg. 455-464)
Tse PU, Sheinberg DL, Logothetis NK. Fixational eye-movements are not affected by abrupt onsets that capture attention, Vision Res, 2002, vol. 42 (pg. 1663-1669)
Ullman S. The interpretation of visual motion, 1979, Cambridge, MA: MIT Press
Vaina LM, Gryzacz NM, Saiviroonporn P, LeMay M, Bienfang DC, Conway A. Can spatial and temporal motion integration compensate for deficits in local motion mechanisms? Neuropsychologia, 2003, vol. 41 (pg. 1817-1836)
Vanduffel W, Fize D, Mandeville JB, Nelissen K, Van Hecke P, Rosen BR, Tootell RB, Orban GA. Visual motion processing investigated using contrast agent-enhanced fMRI in awake behaving monkeys, Neuron, 2001, vol. 32(4) (pg. 565-577)
Vanduffel W, Fize D, Peuskens H, Denys K, Sunaert S, Todd JT, Orban GA. Extracting 3D from motion: differences in human and monkey intraparietal cortex, Science, 2002, vol. 298 (pg. 413-415)
Van Oostende S, Sunaert S, Van Hecke P, Marchal G, Orban GA. The kinetic occipital (KO) region in man: an fMRI study, Cereb Cortex, 1997, vol. 7 (pg. 690-701)
von Helmholtz H. Treatise on physiological optics, 1867, vol. 3, 3rd ed. New York: Dover Press
Wallach H. Über visuell wahrgenommene Bewegungsrichtung, Psychol Forsch, 1935, vol. 20 (pg. 325-380)
Wallach H. On perception, 1976, New York: Quadrangle
Wallach H, O'Connell DN. The kinetic depth effect, J Exp Psychol, 1953, vol. 45 (pg. 205-217)
Wallach H, Weisz A, Adams PA. Circles and derived figures in rotation, Am J Psychol, 1956, vol. 69 (pg. 48-59)
Watson AB, Ahumada AJ Jr. Model of human visual-motion sensing, J Opt Soc Am A, 1985, vol. 2 (pg. 232-342)
Weiss Y, Adelson EH. Adventures with gelatinous ellipses-constraints on models of human motion analysis, Perception, 2000, vol. 29(5) (pg. 543-566)
Weiss Y, Simoncelli EP, Adelson EH. Motion illusions as optimal percepts, Nat Neurosci, 2002, vol. 5(6) (pg. 598-604)
Wolfe JM, Yee A, Friedman-Hill SR. Curvature is a basic feature for visual search tasks, Perception, 1992, vol. 21(4) (pg. 465-480)
Wuerger S, Shapley R, Rubin N. “On the visually perceived direction of motion” by Hans Wallach: 60 years later, Perception, 1996, vol. 25 (pg. 1317-1367)
Zeki S, Perry RJ, Bartels A. The processing of kinetic contours in the brain, Cereb Cortex, 2003, vol. 13 (pg. 193-203)
Zollner F. Über eine neue Art anorthoskopischer Zerrbilder, Ann Phys Chem, 1862, vol. 117 (pg. 477-484)