An internal sense of eye position is necessary to maintain the constancy of the visual world in spite of movements of the eyes. Neuroimaging studies have localized human homologs of monkey visual motion processing areas in MT/MST and also in the collateral sulcus (V4), an area that codes features within objects. We show that these two areas have a baseline fMRI signal that is modulated by eye position and that the preferred direction of the eye position signal differs between the two areas: the signal increases for ipsiversive eye positions in MT/MST and for contraversive eye positions within the collateral sulcus. This baseline modulation is a true eye position signal, one that is present in the absence of visual motion stimuli. The difference in the preferred direction of the eye position signal may reflect the different transformations in these two areas: a transformation from a retinotopic (eye-centered) to an egocentric coordinate frame necessary for guiding action in MT/MST, and to an object-centered frame for object recognition in the collateral sulcus.
Whenever the eyes move, the image of the world sweeps across the retina, yet our perception of the world remains stable. Objects that are stationary continue to appear so. How is this achieved? A simple solution, proposed by Helmholtz (Southall, 1925) over a century ago, is that a copy of the motor command, a corollary discharge, may be sent to visual areas to preserve this constancy. Indeed, in the monkey early visual areas show modulation by eye position signals (Galletti and Battaglini, 1989; Guo and Li, 1997; Trotter and Celebrini, 1999). Without this cortical eye position signal, eye movements generate the percept of a constantly shifting world (Haarmeier et al., 1997).
There are several ways in which this eye position signal can be used to achieve constancy. One is to use eye position to shift the locus of visually activated neurons during the course of an eye movement (Duhamel et al., 1992). Another is to use eye position to transform the reference frame from one that is retinotopic into one that is centered in the head or some other body part (Brotchie et al., 1995; Snyder et al., 1998). This transformation is achieved in the monkey parietal cortex by modulating the activity of visual neurons by an eye position signal (Brotchie et al., 1995; Andersen, 1997). Recently, fMRI has been used to demonstrate that a similar eye position modulation occurs within human parietal cortex (Baker et al., 1999; DeSouza et al., 2000).
Transformations are also required when sensing motion through visual cues. To judge the direction of self-motion of our bodies during locomotion, the interpretation of the optic flow pattern must be invariant to eye position. Recent studies in the monkey (Bremmer et al., 1997) suggest that this transformation may begin as early as MT. Surprisingly, in MST an eye position signal is present even in the absence of visual input (Squatrito and Maioli, 1996). Here the tonic basal firing rate changes in proportion to eye position, even in complete darkness, thus demonstrating the presence of a true eye position signal.
Motion is also important for processing the features of objects within the ventral pathway (Ungerleider and Mishkin, 1982). Early in the human ventral pathway lies area V4v, an area involved in the initial stages of color and form processing (Van Essen and Drury, 1997). Moving visual stimuli also activate ventral areas along the human collateral sulcus (de Jong et al., 1994; Dupont et al., 1994; Sunaert et al., 1999), areas corresponding to V4v/V8. In area V4 of the monkey a true eye position modulation, similar to that in area MST, has recently been observed (Bremmer, 2000).
The goal of our experiment was to use fMRI to determine whether a similar modulation by eye position is present in the human equivalents of monkey areas MT/MST and V4. To examine whether this was a true eye position signal, changes in the basal levels of fMRI activation, those present in the absence of a moving visual stimulus, were examined.
Materials and Methods
Subjects and Training
The seven subjects were paid volunteers in this study (five male, two female, mean age = 28 ± 1.2 years). Subjects gave informed written consent and the University of Western Ontario Ethics Review Board approved all procedures. Each subject performed three training sessions: (i) 1 day before the experiment, (ii) immediately before the experiment while viewing a regular computer monitor outside the magnetic bore and (iii) just after entering the magnet bore while viewing the actual experimental screen.
Expanding Optic Flow Stimulus
Subjects were supine and entered the bore feet first (backwards) to increase the viewable visual angle (effective visual angle 90° wide, 30° tall). Images were back-projected onto a screen (Da-Lite Corp., Warsaw, IN) and subjects viewed them through a mirror. Subjects were instructed to fixate a red cross either while viewing an expanding optic flow pattern or while in near darkness with only the red visual fixation cross present. The red visual fixation cross was either 30° to the left or 30° to the right of center (Fig. 1A,B). During the optic flow epochs, the fixation cross was surrounded by an expanding optic flow pattern that subtended a square of ±15° horizontally and vertically. This expanding flow pattern consisted of 400 centrifugally moving white dots displayed on a black background (dot size 0.28°, average dot speed 8.0°/s; dots that disappeared past the edge of the display were replaced by distant dots at random locations). In a previous experiment in our laboratory we compared the activation produced by expanding optic flow to that produced by an identical stationary image (Dukelow et al., 2001). That comparison produced activation along the collateral sulcus similar to that observed in the present study.
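The following is a minimal sketch of how an expanding dot field of this kind might be updated from frame to frame. It is not the stimulus program used in the experiment, which is not described in detail here. The dot count and the ±15° extent come from the text; the constant per-dot speed, the uniform respawn rule, the frame interval and all function names are assumptions made for illustration.

```python
import math
import random

HALF_WIDTH_DEG = 15.0   # from the text: pattern spans +/-15 deg around fixation
N_DOTS = 400            # from the text
SPEED_DEG_S = 8.0       # the text gives an average speed; a constant speed is assumed


def random_dot(rng):
    """Place a dot uniformly inside the square patch (assumed respawn rule)."""
    return (rng.uniform(-HALF_WIDTH_DEG, HALF_WIDTH_DEG),
            rng.uniform(-HALF_WIDTH_DEG, HALF_WIDTH_DEG))


def step(dots, dt_s, rng):
    """Move every dot radially away from fixation (0, 0) for dt_s seconds.

    Dots that leave the square patch are replaced at a random location.
    """
    new_dots = []
    for x, y in dots:
        r = math.hypot(x, y)
        if r == 0.0:
            # A dot exactly on fixation has no radial direction; pick one.
            angle = rng.uniform(0.0, 2.0 * math.pi)
            ux, uy = math.cos(angle), math.sin(angle)
        else:
            ux, uy = x / r, y / r
        x = x + ux * SPEED_DEG_S * dt_s
        y = y + uy * SPEED_DEG_S * dt_s
        if abs(x) > HALF_WIDTH_DEG or abs(y) > HALF_WIDTH_DEG:
            x, y = random_dot(rng)
        new_dots.append((x, y))
    return new_dots


rng = random.Random(0)
dots = [random_dot(rng) for _ in range(N_DOTS)]
for _ in range(120):                 # 2 s at an assumed 60 frames per second
    dots = step(dots, 1.0 / 60.0, rng)
```

Coordinates are in degrees relative to the fixation cross, so the same update applies whether the cross sits 30° to the left or 30° to the right of center.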
Optic flow epochs always alternated with visual fixation epochs as follows: FR, FLof, FL, FRof, FR, FLof, FL, FRof, FR (FL, left visual fixation; FLof, left visual fixation with optic flow; FR, right visual fixation; FRof, right visual fixation with optic flow). The order of the left and right fixation (FL and FR) was pseudo-randomly alternated across subjects. Each epoch was 16 s long (for one subject the epochs were 24 s long) and thus the entire series of epochs lasted 144 (or 216) s. Each subject completed four to six repetitions of the series of epochs and the corresponding functional scans were averaged.
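As a check on the timing described above, the epoch series can be laid out programmatically. This is only an illustrative sketch: the epoch labels and durations come from the text, while the function names and the 1 s sampling step are hypothetical.

```python
# Epoch order as listed in the text (FL/FR = left/right fixation, "of" = with optic flow).
EPOCH_ORDER = ["FR", "FLof", "FL", "FRof", "FR", "FLof", "FL", "FRof", "FR"]


def build_timeline(epoch_s=16):
    """Return (label, onset_s, offset_s) for each epoch in one series."""
    return [(label, i * epoch_s, (i + 1) * epoch_s)
            for i, label in enumerate(EPOCH_ORDER)]


def total_duration(epoch_s=16):
    """Length of the whole series in seconds."""
    return build_timeline(epoch_s)[-1][2]


def flow_boxcar(epoch_s=16, step_s=1):
    """Boxcar sampled every step_s seconds: 1 during optic flow, 0 during fixation."""
    box = []
    for label, onset, offset in build_timeline(epoch_s):
        n_samples = (offset - onset) // step_s
        box.extend([1 if label.endswith("of") else 0] * n_samples)
    return box


print(total_duration(16), total_duration(24))
```

Nine epochs of 16 s give the stated 144 s, and nine epochs of 24 s give 216 s for the subject with longer epochs.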
Experiments were carried out using a 4.0 T Varian UNITY INOVA whole body imaging system (Siemens, Palo Alto, CA and Erlangen, Germany) equipped with whole body shielded gradients. An 8 cm diameter quadrature radio frequency (RF) surface coil was used for transmission and reception of the RF signal from MT+ and fusiform regions. This coil was centered over the right hemisphere occipito-temporal region. Subjects’ heads were restrained using a vice system with padded side and forehead restraints.
Functional data were collected using BOLD (blood oxygenation level-dependent) navigator echo corrected T2*-weighted segmented gradient echoplanar imaging [11 slices, 64 × 64 resolution, 19.2 cm in-plane field of view (FOV), time to echo (TE) 15 ms, flip angle 30°, volume acquisition time 1.0–2.0 s and voxel size 3 × 3 × 5 mm].
Functional data were superimposed on high resolution inversion prepared 3-dimensional T1-weighted anatomical images of the brain collected immediately after functional images using the same in-plane FOV (64 slices, 256 × 256 resolution, TE 5.5 ms, TR 11.5 ms, flip angle 15°) with a phase reference image that corrected for high field geometric distortions. Subjects were also scanned in another session using a cylindrical quadrature birdcage RF coil to obtain full-brain anatomical images (256 slices, 256 × 256 resolution, FOV 22 cm, TE 5.5–6.0 ms, TR 11.5 ms, flip angle 15°). The surface coil anatomical for each subject was manually realigned with that subject’s head coil anatomical. Images were transformed into the space of the Talairach atlas (Talairach and Tournoux, 1988) to obtain three-dimensional coordinates. Anatomical images for each subject were then segmented at the gray/white matter boundary, rendered, inflated and flattened for visualization purposes (Goebel et al., 1998).
Analysis was carried out using Brain Voyager 4.3 software (Brain Innovation, Maastricht, The Netherlands). Collected images underwent motion correction (within-slice) and linear trend removal. Each subject’s data were analyzed using a voxel-by-voxel linear cross-correlation analysis to detect significantly activated regions of cortex. We correlated each voxel’s time course with a reference function reflecting the optic flow and visual fixation epochs (left and right visual fixations were grouped) that was convolved with the hemodynamic response, thus mapping voxels independently of eye position. Correlation coefficients of r > 0.4 (P < 0.000001) were used for functional map generation within each subject. Data for statistical comparisons across experimental epochs comparing eye position signals were generated from these significantly activated voxels within a specified cortical region (MT+ or V4v/V8) of each individual subject. The signal intensities from these voxels, functionally mapped using the optic flow stimulus, were then parsed for eye position effects. While analysis used linear correlations throughout, general linear model analysis was used to generate Figure 3A,B of the functional map averaged across the seven subjects (Friston et al., 1991). We excluded data in which head movement artifacts of >0.5 mm were observed.
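The voxel-wise correlation mapping described above can be sketched as follows. This is not the Brain Voyager implementation. The single-gamma hemodynamic response, its parameters, the 1 s sampling interval and all function names are assumptions made for illustration; only the boxcar layout and the r > 0.4 threshold come from the text.

```python
import math

# One series sampled at an assumed 1 s interval: fixation and optic flow
# epochs of 16 s alternate, with left and right fixation grouped together.
BOXCAR = ([0] * 16 + [1] * 16) * 4 + [0] * 16


def gamma_hrf(t_s, shape=6.0, scale=1.0):
    """Assumed single-gamma response; peaks at (shape - 1) * scale seconds."""
    if t_s <= 0:
        return 0.0
    return (t_s ** (shape - 1) * math.exp(-t_s / scale)
            / (math.gamma(shape) * scale ** shape))


def convolve(signal, kernel):
    """Causal discrete convolution, truncated to the length of the signal."""
    out = [0.0] * len(signal)
    for i in range(len(signal)):
        for j, k in enumerate(kernel):
            if i - j >= 0:
                out[i] += signal[i - j] * k
    return out


def reference_function(boxcar, tr_s=1.0, hrf_len_s=24):
    """Boxcar convolved with the assumed hemodynamic response."""
    kernel = [gamma_hrf(i * tr_s) for i in range(int(hrf_len_s / tr_s))]
    return convolve(boxcar, kernel)


def pearson_r(a, b):
    """Pearson correlation; returns 0.0 if either series is constant."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    if var_a == 0 or var_b == 0:
        return 0.0
    return cov / math.sqrt(var_a * var_b)


def active_voxels(voxel_timecourses, boxcar, r_threshold=0.4):
    """Indices of voxels whose correlation with the reference exceeds the threshold."""
    ref = reference_function(boxcar)
    return [i for i, tc in enumerate(voxel_timecourses)
            if pearson_r(tc, ref) > r_threshold]
```

Because the reference function groups left and right fixation, a voxel selected this way is chosen for its response to optic flow only; eye position effects are then examined afterwards within the selected voxels.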
Functional Mapping of Expanding Optic Flow
To examine eye position signals in both MT+ and the V4v/V8 complex, we first functionally mapped these areas using an expanding optic flow stimulus with the prediction that it should activate visually responsive and motion responsive neurons. Unlike many functional imaging studies of motion that have compared a motion stimulus to a stationary control stimulus (Zeki et al., 1991; Watson et al., 1993; Tootell et al., 1995; Sunaert et al., 1999), we chose to have visual fixation as our control period (subjects looked at a red fixation cross). This allowed us to measure a true eye position signal uncorrupted by extraneous visual information (other than the fixation cross).
As expected, the optic flow stimulus (compared with visual fixation) produced functional activation in MT+ along the inferior temporal sulcus in all seven subjects (see Fig. 2A, single subject). The average location in Talairach coordinates (Talairach and Tournoux, 1988) of the activation in MT+ was x = 46 ± 2, y = –62 ± 2 and z = –1 ± 2 (mean ± SD), which was consistent with previous reports of human MT+ (Zeki et al., 1991; Watson et al., 1993; de Jong et al., 1994; Tootell et al., 1995; Sunaert et al., 1999).
This stimulus also produced robust functional activation in the V4v/V8 complex within the cortex lining the collateral sulcus along the lingual and fusiform gyri (Fig. 2A,B). The average location in Talairach coordinates of the collateral sulcus activation was x = 23 ± 2, y = –66 ± 3 and z = –14 ± 2. This was consistent with the foci described earlier for lingual gyrus and fusiform gyrus during other motion perception experiments (de Jong et al., 1994; Sunaert et al., 1999).
The activation map in Figure 2A,B for optic flow was overlaid onto the rendered cortical surface (Fig. 2C). This right hemisphere was then flattened (see Materials and Methods) (Fig. 2D). The volume of functional activation in MT+ was larger than the activation in the V4v/V8 complex (Fig. 2A,B). For this subject, the volume of cortex for MT+ was 1518 mm³ and for V4v/V8 was 1163 mm³. Across our subjects, the volume of cortex activated in MT+ and V4v/V8 was 1180 ± 213 and 577.1 ± 124 mm³, respectively. The collateral sulcus activation (dashed gray line) had a retinotopic representation in this subject [using visual mapping stimuli similar to others (Sereno et al., 1995; Engel et al., 1997); data not shown]. This suggests that our optic flow collateral sulcus activation may correspond to V4v and portions of V8 (Hadjikhani et al., 1998), called the V4v/V8 complex. Activation was also observed in the early visual areas (labeled V1 in Fig. 2) and within the cortex surrounding the parieto-occipital sulcus. Since our MR signal decayed with increased distance from the surface coil center, we focused our analysis on areas near its center, namely right hemisphere MT+ and V4v/V8 regions.
We averaged all seven subjects’ optic flow activation maps to produce an average activation on a flattened map of a representative subject (Fig. 3A). For a more direct comparison, the dashed yellow oval highlights the common activation of area V4v/V8 along the collateral sulcus, excluding object recognition areas more anterior along the fusiform gyrus (for details see James et al., 2002).
Eye Position Signals in MT+ and V4v/V8 During Visual Fixation
The signal intensity within right hemisphere MT+ and V4v/V8 was compared during the visual fixation epochs to the right and left. Figure 4A shows the comparison for one subject (aligned to optic flow offset). The activations during both left and right fixation began to decrease together shortly after the onset of the visual fixation epoch but then diverged at 10 s (downward pointing arrow in Fig. 4A). During the next 6 s the signal intensity in MT+ for fixating rightward was greater than for fixating leftward [t(9) = 4.56, P < 0.025]. The dotted gray vertical line at 16 s represents a simultaneous change in gaze position with the onset of optic flow. The averaged waveform of six subjects shows the two lines diverging at 9 s after the onset of the visual fixation epoch/offset of optic flow (upward pointing arrow in Fig. 4B) and then remaining separate for the duration of the epoch.
The V4v/V8 activation (Fig. 4C) showed a similar divergence of waveform patterns starting at 7 s after optic flow offset (the visual fixation epoch onset), but this time the difference was larger and occurred earlier than in MT+ (downward pointing arrow in Fig. 4C). Most interesting was the fact that the polarity of this eye position signal in V4v/V8 was reversed when compared with MT+. Here, left fixation showed an increased signal compared with right fixation [t(9) = 8.08, P < 0.001]. This pattern was also observed for the population average (Fig. 4D). In the V4v/V8 complex, on average, the waveforms for left and right fixation began to diverge ~3 s earlier (Fig. 4D, upward pointing arrow) as compared with MT+ (Fig. 4B, upward pointing arrow).
Figure 5A,B compares the averaged signal intensity across subjects during right and left visual fixation from the right hemisphere MT+ and V4v/V8 complex (excluding hemodynamic fall and rise). On average, both brain areas showed a significant eye position modulation during visual fixation. The MT+ signal intensity during visual fixation to the right was greater than during visual fixation to the left [t(6) = 3.57, P < 0.025] and the V4v/V8 signal intensity for the analogous comparison was significantly different, but in the reverse direction [t(6) = 4.00, P < 0.01; far right bars in Fig. 5].
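The across-subject statistics above, with 6 degrees of freedom for seven subjects, are consistent with a paired comparison of right versus left fixation within each subject, although the text does not name the test. A minimal sketch under that assumption follows; the function name and the input values are hypothetical and are not the study's data.

```python
import math
import statistics


def paired_t(right, left):
    """Paired t statistic and degrees of freedom for per-subject values.

    right, left: equal-length sequences, one value per subject
    (e.g. mean signal intensity during right and left fixation).
    """
    diffs = [r - l for r, l in zip(right, left)]
    n = len(diffs)
    mean_diff = statistics.mean(diffs)
    sd_diff = statistics.stdev(diffs)          # sample standard deviation
    t = mean_diff / (sd_diff / math.sqrt(n))
    return t, n - 1


# Hypothetical per-subject percent signal changes, for illustration only.
right_fix = [0.30, 0.25, 0.40, 0.10, 0.35, 0.20, 0.28]
left_fix = [0.20, 0.22, 0.25, 0.12, 0.20, 0.15, 0.18]
t_value, dof = paired_t(right_fix, left_fix)
```

A positive t value here means a larger signal for right fixation, which in the right hemisphere corresponds to an ipsiversive preference.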
Eye Position Signals in MT+ and V4v/V8 During Optic Flow
The signal intensity in response to the optic flow stimulus was also examined for potential eye position effects. For each subject, the average of all optic flow epochs with eye right was calculated and compared with the average of those when the eye looked left for both MT+ and V4v/V8. Across all subjects, changes in eye position during optic flow produced a variable modulation in both MT+ and V4v/V8 (far right bars in Fig. 6A,B). In area MT+ one subject (1) showed a significantly higher signal intensity during eye right [t(9) = 19.2, P < 0.005] and two other subjects (3 and 4) showed a higher signal intensity for the reverse direction [subject 3, t(9) = 2.39, P < 0.05; subject 4, t(9) = 2.93, P < 0.05; Fig. 6A]. Thus the average across subjects showed no significant eye position effects during optic flow (far right bars in Fig. 6A).
Likewise for area V4v/V8, two subjects (3 and 6) showed a significantly larger signal for eye right compared with eye left [subject 3, t(9) = 6.23, P < 0.001; subject 6, t(9) = 4.31, P < 0.005; Fig. 6B], but again the average across subjects showed no influence of changing the eye position with optic flow (far right bars in Fig. 6B). Interestingly, there was no difference in the variability of signal intensities during epochs of optic flow when compared with epochs of visual fixation (i.e. variances in Fig. 5A,B are the same as in Fig. 6A,B).
Summary of Eye Position Effects
Figure 7A compares the effects of eye position during optic flow and visual fixation epochs within MT+ and V4v/V8. It plots the percent signal changes for each subject during right fixation on the x-axis against that during left fixation. The region enclosed by the gray box highlights the data obtained in the visual fixation epochs. Area MT+ tends to fall in the lower right quadrant, the quadrant that represents an ipsiversive eye position effect (all data were from the right hemisphere). In contrast, data for V4v/V8 revealed a contraversive eye position effect. This difference was also larger in amplitude than that in MT+ [t(9) = 8.08, P < 0.0001]. When comparing the eye position effects for MT+ with V4v/V8 the two areas showed non-overlapping populations (Fig. 7A).
The figure also illustrates that the influence of eye position is different during epochs of optic flow (data points outside the gray box in Fig. 7B). If the optic flow data were to show the same eye position effects as those seen during visual fixation, one would expect that the data points for visual fixation (triangles and squares) would be shifted diagonally up to the right along the dotted line (Fig. 7B). All the triangles would remain above the line and the squares below the line. However, the optic flow data showed no such relationship; most subjects’ data lay close to the equality line (dotted line) for both eye positions. Also, it is clear that the two test conditions of (i) visual fixation and (ii) fixation during optic flow were indeed two different populations, with the optic flow data having a stronger percent signal change.
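The reading of the scatter plot relative to the equality line can be made explicit with a small helper. This is an illustrative sketch only; the function name, the tolerance parameter and the example values are hypothetical.

```python
def eye_position_preference(right_pct, left_pct, hemisphere="right", tol=0.0):
    """Classify a data point by which gaze direction gives the larger signal.

    right_pct, left_pct: percent signal change during right and left fixation.
    hemisphere: the hemisphere the signal was recorded from.
    tol: differences no larger than this count as lying on the equality line.
    """
    diff = right_pct - left_pct
    if abs(diff) <= tol:
        return "none"
    prefers_right = diff > 0
    # Ipsiversive means gaze toward the same side as the recorded hemisphere.
    if (hemisphere == "right") == prefers_right:
        return "ipsiversive"
    return "contraversive"


# Hypothetical right-hemisphere values, for illustration only.
print(eye_position_preference(0.30, 0.10))   # larger signal for right gaze
print(eye_position_preference(0.10, 0.40))   # larger signal for left gaze
```

With right fixation on the x-axis, a right-hemisphere point below the equality line is ipsiversive and a point above it is contraversive.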
These results demonstrate that in the absence of moving visual stimuli, the human motion sensitive area MT+ is clearly modulated by the position of the eye in the head. This eye position modulation is similar to that observed from single unit studies in monkey area MST (Squatrito and Maioli, 1996). In monkey MST the on direction (preferred direction) of this modulation varies from neuron to neuron but more than twice as many neurons show an increase for ipsiversive gaze (Squatrito and Maioli, 1996) compared with contraversive gaze. The present fMRI activation is presumably the equivalent of this average.
As previously suggested, one function of an eye position signal may be to transform a signal coding retinal slip velocity into a velocity coded with respect to the head. The motion signals coded by MT are passed on to area MST, where the direction of self-motion and that of moving small objects is computed (Andersen et al., 2000). In both cases, motion with respect to a common egocentric frame of reference is useful (Andersen, 1997; Shenoy et al., 1999). To compute this, an eye position signal is required.
The modulation by eye position that is observed during visual fixation is, quite surprisingly, no longer present when subjects view a moving visual stimulus. This null result may be comparable to the observation that the on direction of motion-sensitive neurons in monkey MST (Bremmer et al., 1997) varies considerably from neuron to neuron with no statistically significant bias for any particular gaze direction. In these neurons the interaction between the eye position signal and the signal coding visual motion is complementary for some neurons and antagonistic for other neurons. Still others show no eye position modulation. One possible explanation for our result is that the small difference observed during visual fixation becomes swamped by a large and variable visual motion activation. However, this does not appear to be the case because the variability of the signal over time is not different between fixation and motion stimulus conditions. Rather, the direction of the interactive effect of the visual motion stimulus and the eye position signal varies from subject to subject. In some subjects the original on direction is preserved, while in others it is reversed. Another explanation is a saturation of the fMRI signal. Perhaps a given increase in the average neuronal firing rate produces a smaller increment in fMRI signal when added to a large activation.
Our results also show that the collateral sulcus activation (area V4v/V8) is modulated by eye position during visual fixation. On average, this modulation is larger than that observed in MT+. An eye position modulation of the tonic firing rate of single units has also been observed in the monkey during visual fixation of a small target light (Bremmer, 2000). Here the on direction varies from neuron to neuron with no apparent average on preference (Bremmer, 2000).
Surprisingly, our study shows that the eye position modulation in V4v/V8 is in the opposite direction to that found in MT+. In V4v/V8 an increase is observed for a contraversive displacement of the eye. Given that a possible role of the eye position signal is that of a coordinate transformation, this suggests that the transformations performed in V4v/V8 and beyond are different from those in MT+. The former cortical area is part of the ventral stream, which specializes in object recognition, while the latter is considered part of the dorsal stream, which specializes in guiding actions (Ungerleider and Mishkin, 1982; Goodale and Milner, 1992). One possible function for an eye position signal in the ventral stream is to transform features within an object to a reference frame centered within the object. In contrast, the dorsal stream performs transformations between various egocentric frames of reference (Andersen et al., 1993).
In summary, an eye position signal is likely an important tag for detecting and functionally mapping the many areas within the human cortex involved in transforming the retinotopic coding of early visual areas into a code suitable for early object recognition and for guiding our action. These data provide the first evidence in the human that this tag is most prominent when not contaminated by a moving visual input.
The Canadian Institutes of Health Research supported this research. We thank Leopold van Cleeff for equipment construction and assistance during experimental testing, Ralph Baddour for programming the stimulus, Joseph S. Gati for technical assistance during imaging, Alain Proulx for assistance with brain flattening and M. Goodale, K. Humphrey, J. Connolly, T. James, M. Brown and J. Culham for insightful suggestions and comments.