Novel mapping stimuli composed of biological motion figures were used to study the extent and layout of multiple retinotopic regions in the entire human brain and to examine the independent manipulation of retinotopic responses by visual stimuli and by attention. A number of areas exhibited retinotopic activations, including full or partial visual field representations in occipital cortex, the precuneus, motion-sensitive temporal cortex (extending into the superior temporal sulcus), the intraparietal sulcus, and the vicinity of the frontal eye fields in frontal cortex. Early visual areas showed mainly stimulus-driven retinotopy; parietal and frontal areas were driven primarily by attention; and lateral temporal regions could be driven by both. We found clear spatial specificity of attentional modulation not just in early visual areas but also in classical attentional control areas in parietal and frontal cortex. Indeed, strong spatiotopic activity in these areas could be evoked by directed attention alone. Conversely, motion-sensitive temporal regions, while exhibiting attentional modulation, also responded significantly when attention was directed away from the retinotopic stimuli.
The primate brain contains multiple re-representations of the retina laid out in topological maps, often called retinotopic maps, in the midbrain, thalamus, and occipital lobe (Felleman and Van Essen 1991; Sereno and Allman 1991). Functional magnetic resonance imaging (fMRI) has been used for over a decade to study early cortical retinotopic maps in the human brain (Sereno et al. 1995).
In higher areas, visual areas are smaller and receptive fields of neurons are larger (Gattass et al. 2005; Serences and Yantis 2007). Neurons with large receptive fields are sometimes mistakenly considered unsuitable candidates for encoding spatial location. In fact, in a small cortical area containing somewhat noisy single units, it may actually be preferable to have larger receptive fields, since then, spatial location can be estimated from a larger number of neurons, increasing signal-to-noise (see Baldi and Heiligenberg 1988 for a formal model). Large receptive fields are not incompatible with either retinotopy, or computation of exact spatial locations. Indeed, in recent years fMRI studies have shown that there are topographic representations outside of occipital regions, in temporal, parietal, and even frontal cortex (Sereno et al. 2001, 2003; Huk et al. 2002; Hasson et al. 2003; Brewer et al. 2005; Schluppeck et al. 2005; Silver et al. 2005; Hagler and Sereno 2006; Larsson and Heeger 2006; Sereno and Huang 2006; Hagler et al. 2007; Kastner et al. 2007; Swisher et al. 2007).
The discovery of maps in higher-level areas, including those not previously thought to be retinotopic based on studies in macaque monkeys, brings up the question of what actually drives retinotopy in these regions. The discovery of new maps or areas is due in part to developments in neuroimaging technology that have increased signal-to-noise over the years. But another possibility is stimulus-based selectivity: retinotopic mapping protocols typically use flickering checkerboards as stimuli, whereas higher areas in the brain respond preferentially to complex, higher-order visual properties. More complex stimuli might be better suited to reveal maps in higher areas (Sereno et al. 2003). On the other hand, tasks that rely on spatial attention, saccade preparation, and working memory appear to activate maps in regions known to be involved in these tasks when modified to be performed retinotopically (e.g., Sereno et al. 2001; Schluppeck et al. 2005, 2006; Silver et al. 2005; Hagler and Sereno 2006; Hagler et al. 2007), suggesting that retinotopy in these regions may be a way to allocate processing resources.
Spatially specific attentional modulation has been demonstrated in human visual areas in several studies (e.g., Tootell et al. 1998; Brefczynski and DeYoe 1999; Gandhi et al. 1999; Kastner et al. 1999; Martinez et al. 1999; Somers et al. 1999). In the classical attentional “control” regions in parietal and frontal cortex (see Corbetta and Shulman 2002; Pessoa et al. 2003; Schall 2004; Boynton 2005; Serences and Yantis 2006 for overviews), evidence for spatially specific attentional modulation has been inconsistent (see also Discussion). However, given that these regions themselves contain topographic representations, an obvious question is whether these representations are actually used during spatial attention tasks.
In the present study, we employed stimuli that contain complex visual features and are ecologically relevant (biological motion), together with a spatial attention task. Combining retinotopic mapping with an experimental design allowed us to identify retinotopic regions that respond primarily to stimulus properties and those that are actively used during spatial attention. Though the 2 are not mutually exclusive, we will refer to these 2 endpoints as “stimulus-driven” and “attention-driven” retinotopy.
Our novel stimuli allowed us to manipulate these 2 factors as independently as possible so that activity driven by stimulus features and activity driven by attention could both be measured. In order to study attentional modulation while holding stimulus properties constant, we modified the standard retinotopic mapping paradigm, in which only a particular portion of the visual field contains a stimulus at any given time, and stimulated the entire visual field at all times. Note that in standard retinotopic mapping, neural activity evoked by these factors cannot be differentiated because the stimulus and attention (either via an explicit task, or because nothing else in the visual field is competing for attention) are at the same location.
The standard flickering checkerboard stimuli typically used in retinotopic mapping experiments are not well suited to study effects of stimulus complexity on retinotopic maps. Our stimuli are instead based on point-light biological motion animations (Johansson 1973). These animations were chosen because they contain high-level features such as motion and form, and are perceived as meaningful objects. Perceiving biological motion has cross-species importance and many organisms appear to have evolved specialized mechanisms to process this information (Vallortigara et al. 2005; Troje and Westhoff 2006). At the same time, in contrast to other possible complex stimuli (e.g., video), control stimuli and prior psychophysical and neuroimaging data are available for point-light biological motion, making them far better suited for experimental manipulation (see Methods).
The basic design of our study can be summarized as follows: a retinotopically rotating polar angle mapping wedge contained point-light biological motion figures surrounded by a field of scrambled figures (stimulus contrast) or a field of identical figures (no stimulus contrast). Additionally, while fixating, subjects performed either a task that required them to attend to the wedge (attention), or a task that required them to attend to the center of gaze (withdrawn attention, see Fig. 1).
Nine adults with normal or corrected-to-normal vision (age 25–35, 5 women) participated in this study. All were experienced with behavioral and functional MRI experiments, including retinotopic mapping. Each subject was scanned in 4–6 runs of each of the 3 conditions of the experiment on 3–4 different days. Some subjects participated in additional control or pilot sessions. Before starting each experiment, subjects were trained and familiarized with the stimuli and tasks outside the scanner, and additionally practiced each task for 8–10 min in the scanner. The experimental protocol was approved by the UCSD Human Subjects Research Protections Program. Informed consent was obtained from each participant.
Experimental Stimuli and Paradigm
In creating the stimuli, we used point-light biological motion animations (Johansson 1973). These are salient structured motion stimuli that are also perceived as coherent, meaningful objects. At the same time, because they lack many other visual cues, they are easily manipulable, and control stimuli that disrupt the structured motion are readily available—such as “scrambled biological motion.”
The point-light biological motion animations used here were a subset of motion sequences created by Ahlstrom et al. (1997) by videotaping an actor performing various activities and subsequently encoding joint positions in the digitized videos. The actions used here depicted walking, walking up stairs, jogging, jumping jacks, throwing, underarm throwing, skipping, stepping up, a high kick into the air, a lower kick, and jumping rope. Scrambled biological motion animations were spatially scrambled; the starting positions of the point-lights were randomized while keeping each dot's trajectory intact. The starting positions of the dots were chosen randomly within a region such that the total area encompassed by the figure was similar to that of the structured motion figures. Eleven biological motion animations and 11 corresponding scrambled animations were consistently used in the experiment. All point-light figures were identical to those used in our previous block design fMRI study (Saygin et al. 2004).
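The spatial scrambling described above can be illustrated with a minimal sketch. It is not the code used to generate the stimuli. The data layout (a list of frames, each a list of (x, y) dot positions), the function name, and the choice to draw new starting positions uniformly within the bounding box of the intact figure are all assumptions made for illustration.

```python
import random

def scramble_point_lights(frames, rng=None):
    """Spatially scramble a point-light animation (illustrative sketch).

    frames: list of frames; each frame is a list of (x, y) tuples, one per dot.
    Each dot keeps its own trajectory (its displacement over time), but its
    starting position is redrawn at random. Assumption: new starts are drawn
    uniformly from the bounding box of the intact figure, so that the
    scrambled display covers a similar area.
    """
    rng = rng or random.Random()
    xs = [x for frame in frames for (x, _) in frame]
    ys = [y for frame in frames for (_, y) in frame]
    x_min, x_max, y_min, y_max = min(xs), max(xs), min(ys), max(ys)

    # One fixed offset per dot moves its start to a random location.
    offsets = []
    for (x0, y0) in frames[0]:
        new_x = rng.uniform(x_min, x_max)
        new_y = rng.uniform(y_min, y_max)
        offsets.append((new_x - x0, new_y - y0))

    # Applying the same offset on every frame leaves the trajectory intact.
    return [
        [(x + dx, y + dy) for (x, y), (dx, dy) in zip(frame, offsets)]
        for frame in frames
    ]
```

Because every dot receives a constant offset, local motion is preserved while the global configuration of the figure is destroyed.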
We used phase-encoded retinotopic mapping within an experimental design. Neuroimaging experiments on spatial attention typically sample a small number of locations in space. Phase encoding not only complements prior work but also provides a high signal-to-noise mapping of the entire visual field. Phase encoding also allowed us to address 2 possible problems that are especially important in studying attentional modulation. First, potential effects of task or set shifting were avoided because the stimuli and task were constant throughout each session. Second, the present design allowed us to minimize “surround” effects. When a visual stimulus is presented at 1 location, both the stimulus and attention at this location cause changes in neural activity in cortical representations of untested locations, and these changes may vary depending on brain area, visual stimulus, task, and attentional state (Boynton 2005; Schwartz et al. 2005). In an experiment aiming to disentangle effects of stimulus representations and attention, these surround effects are in essence extraneous factors in the experiment. Phase encoding minimizes this issue because every condition is Fourier analyzed within itself (see below and Supplementary Methods). A block design study could also minimize this problem by testing many locations in space and comparing each location to all of the others, but that is less efficient and essentially approaches a phase-encoded design; if locations are interrogated randomly, it also requires that the subject shift attention to monitor for new targets at many locations.
In phase-encoded polar angle retinotopic mapping, subjects fixate and view a clockwise or counterclockwise rotating pie-shaped “wedge” (Sereno et al. 1995). In the present study, visual stimuli were presented in both the rotating wedge and the background. The retinotopic wedge was not separated from the background with a border or any other demarcation and contained 3 biological motion animations increasing in size with eccentricity (Fig. 1 and Supplementary Videos 1–3). The background was similarly filled with point-light animations arranged around a central fixation cross and increasing in size with eccentricity. The composite stimulus was thus a circular area populated with 18 individual point-light animations. This circle was on average 55° of visual angle in diameter. The wedge, and consequently the whole display of dots, rotated around the fixation cross at constant speed. Each animation completed its movement in 1 s, and the next animation was presented immediately afterward (on the next frame), so no discontinuity was perceived in the rotation of the animations even though the individual animations changed every second. Each animation was presented in 1 of 11 randomly selected, approximately isoluminant colors, whether in the wedge or the background (except in the Attention condition, see below). The stimuli are illustrated in the Supplementary Videos.
The rotation always started with the retinotopic wedge at the horizontal meridian of the right hemifield (i.e., 3 o'clock). For each subject, in half of the scans the rotation direction was counterclockwise, in the other half clockwise. This allows us to ascertain that reversing the rotation direction of the stimulus leads to a reversal in the phase map; but it also allows us to cancel phase errors due to local static differences in hemodynamic delay by combining data from opposite rotation directions (Sereno et al. 1995, 2001).
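The cancellation of delay-related phase errors can be illustrated with a short sketch under a simple model. It assumes that a location preferring polar angle θ responds with phase θ + δ during counterclockwise rotation and −θ + δ during clockwise rotation, where δ is the local hemodynamic delay expressed as a phase. The function name and sign convention are illustrative assumptions, not the analysis code used in the study.

```python
import cmath
import math

def combine_directions(z_ccw, z_cw):
    """Combine complex Fourier coefficients from opposite rotation directions.

    Assumed model: phase(z_ccw) = theta + delay, phase(z_cw) = -theta + delay.
    Conjugating the clockwise coefficient gives phase theta - delay, so the
    vector average has phase theta (as long as |delay| < pi / 2) and the
    delay term cancels.
    """
    return 0.5 * (z_ccw + z_cw.conjugate())

# Example with made-up values: preferred angle 1.0 rad, delay 0.4 rad.
theta, delay, amplitude = 1.0, 0.4, 2.0
z = combine_directions(
    cmath.rect(amplitude, theta + delay),
    cmath.rect(amplitude, -theta + delay),
)
recovered_angle = cmath.phase(z)  # close to theta
```

Under this model the combined amplitude is scaled by cos(δ), so the estimate stays well behaved for delays that are small relative to the stimulus period.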
Polar angle mapping was used rather than eccentricity mapping because the latter was difficult to adapt into our experimental design aimed at contrasting stimulus and attention effects retinotopically: Perception and attention in the fovea are better than in the periphery—indeed, for the present stimuli, even after adjustment for cortical magnification (Ikeda et al. 2005).
In reporting some individual subject results, we have additionally used data from separate localizer scans to provide approximate locations of functionally defined cortical visual areas in relation to the present results. The middle temporal area was identified by contrasting low-contrast moving rings with static rings (Tootell et al. 1995), and the fusiform face area or FFA (Kanwisher et al. 1997) was identified by contrasting images of faces with scrambled faces.
There were 3 conditions corresponding to the experimental factors rotating with the wedge: Attention + Stimulus, Stimulus, and Attention. The content of the wedge and background as well as the subjects’ task varied by condition as follows.
Attention + Stimulus Condition
The wedge contained point-light biological motion, whereas the background contained scrambled versions of the same motion (Fig. 1a, Supplementary Video 1). This is a rather subtle stimulus contrast compared with standard retinotopy, which has no stimuli in the background, or even compared with various possible control stimuli such as stationary dots (Saygin et al. 2004). In addition to a stimulus contrast, here the subjects’ attention was actively directed to the wedge stimuli with an explicit task. While fixating centrally, subjects were asked to keep their attention on the rotating wedge and monitor for trials in which the 3 animations in the wedge were not identical. This is a difficult and attention-demanding task at the rate the stimuli refresh, especially with the large field of view of the stimuli.
Stimulus Condition
In the Stimulus condition, the retinotopic stimuli were identical to those in the Attention + Stimulus condition, with biological motion in the wedge and scrambled motion in the background. The only difference was the fixation cross, which changed color once per second (Fig. 1b, Supplementary Video 2). Subjects were asked to ignore all peripheral stimuli and carry out a 2-back working memory task with the color of the fixation cross (respond when a trial matches the trial before the previous trial, e.g., Red, Blue, Red). This task is very difficult to perform at the refresh rate of these stimuli and requires sustained attention. This task was chosen because it is attention-demanding and alters the stimulus minimally, centrally, and in a nonperiodic manner, allowing an attention contrast to be made with the Attention + Stimulus condition while keeping the retinotopic stimulus identical (Lavie 2005).
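The 2-back rule can be made concrete with a small sketch that marks the trials on which a response is expected. The function name and the representation of fixation colors as strings are assumptions for illustration only.

```python
def two_back_targets(colors):
    """Return the indices of trials whose color matches the color 2 trials back."""
    return [i for i in range(2, len(colors)) if colors[i] == colors[i - 2]]

# Example: "red, blue, red" makes the third trial (index 2) a target.
sequence = ["red", "blue", "red", "green", "red", "green"]
targets = two_back_targets(sequence)
```

In this example sequence, indices 2, 4, and 5 are targets, because each of those colors repeats the color shown 2 trials earlier.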
Attention Condition
This condition aimed to drive the retinotopy with attention as opposed to a stimulus contrast. Here, biological motion was presented in both the wedge and the background. As in the Attention + Stimulus condition, subjects kept their eyes on the fixation cross, attended to the wedge, and responded whenever the 3 figures in the wedge were not identical.
Even though the attended wedge and the background both contained biological motion, the animations in the attended wedge do lie approximately along a salient line. However, there were 2 other such sets of animations in the background defining 2 alternate wedges (3 wedges centered 120° apart). The fact that there was little signal at 3 times the base rotation frequency (data not shown) suggests that the imaginary contours of the attended wedge cannot explain our results.
The competing wedges make it crucial that subjects not “lose” the attended wedge. To help subjects track the wedge, a color cue was used: instead of using a random color for each figure, the point-lights in the wedge were consistently presented in 1 of the (approximately isoluminant) colors used elsewhere in the display (Fig. 1c, Supplementary Video 3). As additional controls, we showed that this display does not generate retinotopy in the absence of attention, and we replicated the result in trained individual subjects after the color cue was removed (see Results).
Scanning and analysis parameters were the same for all scans and were as follows. We used a 3-Tesla GE Excite scanner and an 8-channel head coil. For functional scans, a T2*-weighted echo planar gradient echo pulse sequence (8′32″ scan time, repetition time [TR] = 2000 ms, echo time [TE] = 30 ms, flip angle = 90°, bandwidth = 125 kHz, 64 × 64 matrix, 31 axial slices, 3.125 × 3.125 × 3.5 mm voxels, 0 gap) was used. When possible, a per-voxel static magnetic field (B0) map was collected at each session and later used to reduce distortions in the images (Reber et al. 1998). A T1-weighted fast spoiled gradient-recalled scan (TR = 10.5 ms, flip angle = 15°, bandwidth = 20.83 kHz, 256 × 256 matrix, 143 axial slices, 1 × 1 × 1.3 mm voxels) was also acquired during each session to align the functional images to a previously obtained (at 1.5, 3, or 4 Tesla Siemens, GE or Varian scanners) high-resolution (1 × 1 × 1 mm) T1-weighted magnetization-prepared rapid gradient echo scan of each subject.
Subjects’ heads were stabilized with foam padding in order to minimize movement during the scans. Subjects directly viewed the stimuli on a screen that was suspended inside the magnet bore above their chest. Stimuli were projected onto this screen using an XGA video projector and a 7.38–12.3″ focal length Xtra Bright Zoom lens (Buhl Optical/Navitar, Rochester, NY). This setup allowed a large field of view (on average 55° in diameter).
Unfortunately, we did not have access to a scanner with eye-tracking capabilities at the time of data collection. We verified that fixation was adequate by later collecting data from an individual subject with simultaneous eye-tracking (not shown). In addition, the fact that retinotopic maps in primary visual cortex appeared as expected indicates that, in general, all subjects maintained good fixation (see Results).
The experiments were programmed and presented using MATLAB (Mathworks, Natick, MA) and the Psychophysics Toolbox (Brainard 1997). Subjects used a button box (Photon Control Inc, Burnaby, B.C., Canada) to report matches in the task.
The data were analyzed with cortical surface-based methods using FreeSurfer (Dale et al. 1999; Fischl, Sereno, and Dale 1999), AFNI (Cox 1996), and custom software extensions (Hagler et al. 2006; see also http://kamares.ucsd.edu/~sereno/csurf/tarballs/).
The functional scans were motion-corrected using the AFNI program 3dvolreg. For each subject and each session, the alignment structural scan was registered with the high-resolution structural scan used to construct the cortical surface. The registration was refined using manual blink comparison to achieve a very precise overlay of the functional data onto the cortical surface.
Each subject's phase-encoded data were analyzed using a Fourier analysis, yielding an amplitude and a phase value at each voxel. For each subject, multiple scans were averaged in the Fourier domain in a manner that uses both amplitude and phase in maximizing signal-to-noise (a vector sum). This method corrects for between-voxel differences in hemodynamic delay and strongly penalizes inconsistent phases across scans (see Supplementary Methods).
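As an illustration of the analysis described above, the following sketch extracts the amplitude and phase of a time series at the stimulus frequency and averages several scans as complex vectors. It is a minimal stand-in, not the analysis code used in the study. The function names, the sign convention (chosen so that the angle of the coefficient equals the phase lag of the response), and the number of stimulus cycles per scan in the example (8, a placeholder) are assumptions.

```python
import cmath
import math

def fourier_component(timeseries, n_cycles):
    """Complex Fourier coefficient at n_cycles per scan.

    abs() of the result is the response amplitude and cmath.phase() is the
    phase lag. The sign of the exponent is an assumed convention chosen so
    that a signal A*cos(2*pi*n_cycles*t/n - phi) returns A*exp(1j*phi).
    """
    n = len(timeseries)
    mean = sum(timeseries) / n  # remove the constant (DC) component
    total = sum(
        (v - mean) * cmath.exp(2j * math.pi * n_cycles * t / n)
        for t, v in enumerate(timeseries)
    )
    return 2.0 * total / n

def vector_average(coefficients):
    """Average scans as complex vectors, using both amplitude and phase.

    Scans with consistent phase add constructively; scans with inconsistent
    phase partially cancel, which lowers the averaged amplitude.
    """
    return sum(coefficients) / len(coefficients)

# Synthetic example: 256 volumes, 8 cycles (placeholder), phase lag 1.2 rad.
n_volumes, cycles = 256, 8
scan = [100.0 + 3.0 * math.cos(2 * math.pi * cycles * t / n_volumes - 1.2)
        for t in range(n_volumes)]
z = fourier_component(scan, cycles)
```

The same function evaluated at a multiple of the stimulus frequency (for example, `fourier_component(scan, 3 * cycles)`) gives the amplitude at that harmonic.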
The group analyses are described in detail elsewhere (Hagler et al. 2007) as well as in Supplementary Methods. Briefly, each subject's cortical hemispheres were reconstructed, inflated, resampled to a sphere, and then morphed to the average spherical representations of the cerebral hemispheres (Fischl, Sereno, Tootell, et al. 1999). Group statistics were then carried out on this common spherical coordinate system. Two kinds of group analyses were conducted. First, the amplitude and phase values of the Fourier analysis from each subject were averaged directly to make group retinotopic maps. Significant cortical patches in these average maps represent areas that not only have strong responses at the stimulus frequency but also have consistent phase across subjects, indicating a strong and highly consistent retinotopic representation (see Supplementary Methods). Second, at each voxel, the signed amplitude of the Fourier transform was used as a quantitative measure of strength of contralateral representation (“signed” positive or negative depending on whether the phase corresponds to contralateral or ipsilateral space, respectively). The conditions of the experiment were compared (i.e., Attention + Stimulus − Stimulus; Attention + Stimulus − Attention) by running voxel-by-voxel analyses of variance (ANOVA) with subjects as random effects and condition as fixed effects.
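The signed amplitude measure can be sketched as follows. The sketch assumes that the phase of each complex Fourier coefficient has already been converted to a polar angle measured from the right horizontal meridian, so that angles with a positive cosine fall in the right visual field. The function name and the hemisphere labels "lh" and "rh" are illustrative assumptions.

```python
import cmath
import math

def signed_amplitude(z, hemisphere):
    """Amplitude of z, positive for contralateral and negative for ipsilateral phase.

    Assumption: cmath.phase(z) is a polar angle with 0 at the right horizontal
    meridian. The left hemisphere ("lh") is contralateral to the right visual
    field, the right hemisphere ("rh") to the left visual field.
    """
    if hemisphere not in ("lh", "rh"):
        raise ValueError("hemisphere must be 'lh' or 'rh'")
    in_right_field = math.cos(cmath.phase(z)) > 0
    contralateral = in_right_field if hemisphere == "lh" else not in_right_field
    return abs(z) if contralateral else -abs(z)

# Example: a response of amplitude 1.5 at 0.3 rad lies in the right visual field.
value_lh = signed_amplitude(cmath.rect(1.5, 0.3), "lh")  # positive
value_rh = signed_amplitude(cmath.rect(1.5, 0.3), "rh")  # negative
```

A scalar of this kind can be entered into a per-voxel ANOVA across conditions, whereas the complex coefficient itself cannot be compared as a single number.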
In all figures, colored areas represent regions that showed a significant contralateral periodic response (henceforth, a “retinotopic response”) at the retinotopic stimulus frequency (see Methods). Ipsilateral responses were virtually nonexistent and were truncated (a few voxels per scan). Color is used to represent the phase of the response. Although the precise delineation and naming of retinotopic areas is not the focus of the present study, when an area could clearly be identified due to its previously known retinotopic organization and anatomical location, we used common nomenclature, including V1–V3, V3A, V4, V6, V7, MT, intraparietal sulcus (IPS)1, IPS2, and FEF (Sereno et al. 1995; Hadjikhani et al. 1998; Tootell et al. 1998; Sereno et al. 2001; Huk et al. 2002; Wade et al. 2002; Brewer et al. 2005; Schluppeck et al. 2005; Sereno and Tootell 2005; Silver et al. 2005; Hagler and Sereno 2006). The most recently discovered parietal retinotopic regions IPS3, IPS4, and ventral intraparietal area (VIP) (Sereno and Huang 2006; Swisher et al. 2007) are less consistent across subjects and these boundaries were not marked in the figures.
Continuous regions spanned by periodic responses could not in all cases be broken into areas each containing a complete hemifield map. This could be due to limitations of resolution, vagaries of vasculature, or blurring due to cross-subject averaging. But it could also indicate that some areas do not represent all polar angles uniformly. Invasive studies in primates have shown that even areas with well-established retinotopy such as MT do not emphasize all polar angles equally, and these emphases can differ across individual animals (Maunsell and Van Essen 1987).
We first present average data from each experimental condition, followed by illustrations of selected individual subjects.
Phase-Encoded Retinotopy—Attention + Stimulus Condition
Behavioral data showed that sensitivity was high (mean d′ = 2.76, SD = 0.31; range = 2.20–3.03), indicating that subjects performed the task (see Methods) and maintained their attention on the retinotopic stimuli.
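For reference, the sensitivity index d′ is conventionally defined as the difference between the z-transformed hit rate and false alarm rate. The sketch below computes it under that standard definition. Whether any correction for extreme rates was applied in the present analysis is not stated here, so none is included, and the function name is an illustrative assumption.

```python
from statistics import NormalDist

def d_prime(hit_rate, false_alarm_rate):
    """Sensitivity index d' = z(hit rate) - z(false alarm rate).

    z is the inverse of the standard normal cumulative distribution.
    Rates of exactly 0 or 1 are outside the domain of the inverse CDF.
    """
    z = NormalDist().inv_cdf
    return z(hit_rate) - z(false_alarm_rate)

# Example with made-up rates: equal hit and false alarm rates give d' = 0.
chance = d_prime(0.5, 0.5)
```

Higher values indicate that targets were distinguished from nontargets more reliably, independent of the overall tendency to respond.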
Significant activity was found in extensive regions of early visual cortex, temporal, parietal, and frontal cortex bilaterally; many of these regions contained clear phase spreads indicating full or partial visual field representation (Fig. 2).
Figure 2a shows the lateral views of the inflated hemispheres. Retinotopic responses covered an extensive region of occipital and temporal cortex including lateral occipital cortex (LOC) and MT/medial superior temporal areas (henceforth MT+). This activity likely covers putative human analogs of occipitotemporal motion-sensitive areas that are not yet well-mapped in the human brain (e.g., FST, V4t—Kaas and Morel 1993) and reaches into the superior temporal sulcus (STS, especially clearly in the left hemisphere).
There was also well-defined bilateral retinotopic activity with substantial phase spread in the superior precentral sulcus, corresponding to the frontal eye fields (FEF). The location of this activity was verified with recent data from experiments in our lab that activated the FEF (Hagler and Sereno 2006; Hagler et al. 2007). Further anteriorly, there were also responses in smaller areas in the precentral sulcus.
The more dorsal retinotopic areas in Figure 2a are better viewed by rotating and tilting each hemisphere (Fig. 2b). There was a continuous, large region of retinotopic activity along and around the intraparietal sulcus, which contains several phase reversals indicating multiple areas. Moving dorsally, there was a band of retinotopic activity covering previously studied areas V3A, V7, IPS1, and IPS2 (Tootell et al. 1998; Sereno et al. 2001, 2003; Silver et al. 2005; Schluppeck et al. 2006). From here, significant activity extended into the postcentral sulcus covering new retinotopic regions anterior and lateral to IPS2 (Sereno and Huang 2006; Swisher et al. 2007).
The most anterior and lateral portions of this activity have a reduced representation of the upper visual field. Posterior parietal areas (e.g., area VIP, Avillac et al. 2005) are known to contain neurons that code space in eye-centered coordinates as well as those that code space in head-centered coordinates. In the present study, subjects had to look slightly downward at the direct-view screen and thus, even though stimuli would be retinotopically centered as subjects fixated, in head-centered terms, this point would have a relative shift towards the lower field (in relation to the head). Thus, even though there may be neurons in these areas responding to stimuli at all visual locations, when the lower field position relative to the head coincides with a lower visual field stimulus, the signal may be slightly larger than for other visual fields (cf. Hagler et al. 2007).
In ventral temporal cortex (Fig. 2c), there was significant periodic activity covering V2 and VP (Sereno et al. 1995) extending anteriorly into a region previously labeled V4v + V8 or hV4 + VO (Hadjikhani et al. 1998; Wade et al. 2002), henceforth V4+. From here, activation continued further anteriorly, into posterior inferotemporal areas that are important for high-level form processing (Hasson et al. 2003). In the group data, the boundaries of ventral areas were less clear than in other regions, perhaps due to the wide visual angle of our stimuli; it has been reported that ventral temporal areas may be best mapped using stimuli that do not extend far into the periphery (Brewer et al. 2005).
In primary visual cortex (Fig. 2d), retinotopic maps were significant, despite the presence of visual stimuli both in the wedge and the background. Note that here the response to the more structured biological motion was greater than the response to the seemingly less structured scrambled biological motion (if the situation were reversed, the phase of the response would have been shifted by π, i.e., inverted). There was also a retinotopic area at the medial border of V3 and V3A, most likely corresponding to human V6, which exhibits contralateral retinotopy when stimuli covering a wide visual field are used (Galletti et al. 1999; Pitzalis et al. 2006). Further anteriorly, there were significant responses in the precuneus. The location of this activation overlaps the parietal reach region (PRR) (Connolly et al. 2003), but we did not perform any functional tests to localize the PRR in the present study.
Phase-Encoded Retinotopy—Stimulus Condition
Behavioral data indicated that the central task that was intended to keep the subjects’ attention away from the retinotopic stimuli (2-back working memory at fixation, see Methods) was more difficult than the task used in the Attention + Stimulus and Attention conditions. Although subjects were engaged in the task (mean d′ = 1.6; range = 0.90–2.46), they performed significantly worse in the Stimulus condition than in the peripheral task (paired t-test compared with the Attention + Stimulus condition, P < 0.05). In line with this, in postexperiment questioning, all subjects found the central working memory task subjectively harder than the retinotopic task performed in the other conditions; for example, subjects “were not even aware [of the simultaneous peripheral stimuli] except that [they] were there,” or they “had completely tuned it out.” These behavioral results and subject reports verify our expectation that there should be a notable attentional differential in the retinotopic response between the Attention + Stimulus and Stimulus conditions.
When subjects viewed the exact same retinotopic stimuli as in the Attention + Stimulus condition, but attended the central task instead of the retinotopic stimuli, the activation was significantly reduced in most areas, both in extent and in strength (Figs 2 and 3—for a particularly easy-to-view comparison, see Supplementary Fig. S1 where these conditions are shown together in animated gif format.)
Even as subjects’ attention was withdrawn from the stimuli and engaged strongly elsewhere, significant responses were found in some regions. Motion-sensitive areas in lateral temporal cortex, including the left STS, as well as V3A exhibited significant activity (Fig. 3a,b), though activity was slightly reduced in extent compared with the Attention + Stimulus condition in most of these regions. On the other hand, frontal and parietal areas showed larger reductions in their response when attention was not actively directed to the stimuli. Some retinotopic maps in the dorsal stream including IPS1 and to a lesser extent IPS2 and FEF still revealed retinotopy, but only in the left hemisphere (Fig. 3b).
Ventrally, retinotopic activity was also reduced in extent (Fig. 3c); the more anterior and lateral portions of inferotemporal cortex were no longer responsive, and the remaining activity likely corresponds to V2 and VP and possibly part of V4+.
On the other hand, primary visual cortex (medial view, Fig. 3d) showed significant response, very similar to the Attention + Stimulus condition. Responses in V6 and precuneus were present but diminished.
Phase-Encoded Retinotopy—Attention Condition
Behavioral data analysis revealed high sensitivity (mean d′ = 2.90, SD = 0.43; range = 2.13–3.39) indicating good attention to the retinotopic stimuli. This performance was slightly better than in the Attention + Stimulus condition, approaching significance (paired t-test P = 0.06).
The results looked very similar to the Attention + Stimulus condition when the background stimulus was also biological motion and subjects attended the retinotopic wedge, with significant maps in lateral and ventral temporal cortex, the STS, parietal cortex, the FEF, precentral sulcus, V6, and precuneus (Fig. 4). Notably, primary visual cortex did not respond with a well-defined retinotopic map in the absence of a stimulus contrast (medial view, Fig. 4d).
We ran 2 additional experiments on individual subjects to verify that activity revealed in this condition was in fact attention-driven. First, we performed an experiment in which 1) there was no stimulus contrast (biological motion in the wedge and the background), 2) the wedge was presented in a uniform color, and 3) the same central task as in the Stimulus condition was used (2-back working memory with the color of the fixation cross). Confirming our predictions, there was no significant activation at the stimulus frequency under these conditions, even at lower thresholds (data not shown). Thus, the uniform color of the point-lights alone is not sufficient to account for our results. This also confirmed that the results were not driven by any other stimulus confound correlated with the stimulus frequency (e.g., perceived edges of wedges).
Second, we altered the Attention stimulus so that the point-light figures in the wedge were presented in random colors just like those in the background. After training outside the scanner, Subject 3 was scanned in the Attention condition once more, this time keeping track of the attended wedge for the duration of each run with no overt cue. The results (Supplementary Fig. S2) were notably similar to those obtained with a color cue, indicating these maps are indeed primarily driven by attention.
Analysis of Variance
To quantify the results presented above, we also performed a voxel-by-voxel ANOVA using the signed amplitude of the Fourier analysis, with subjects as random effects (see Methods). These data are reported in Supplementary Materials. To summarize, highly significant Attention effects were found in posterior parietal cortex (especially in the right hemisphere) and the FEF, as well as in lateral and ventral temporal cortex. The Stimulus effect was found mainly in earlier areas V1, V2, V3, VP, and V3A.
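As an illustration of the kind of quantity that enters such an analysis, the sketch below extracts the amplitude and phase of a single voxel's time series at the stimulus frequency with a one-bin discrete Fourier transform. It is a simplified stand-in under stated assumptions, not the pipeline described in Methods: the function name, the synthetic time series, and the parameter values are hypothetical, and the rule used to assign a sign to the amplitude is not reproduced here.

```python
import cmath
import math


def response_at_frequency(timeseries, cycles_per_run):
    """Return (amplitude, phase) of one voxel's time series at the
    stimulus frequency, given as a whole number of cycles per run.

    The phase is the delay phi of a response A*cos(2*pi*k*t/n - phi),
    which in phase-encoded mapping indexes the preferred polar angle.
    """
    n = len(timeseries)
    mean = sum(timeseries) / n
    # Single-bin discrete Fourier transform of the mean-removed signal.
    coeff = sum(
        (x - mean) * cmath.exp(-2j * math.pi * cycles_per_run * t / n)
        for t, x in enumerate(timeseries)
    )
    amplitude = 2.0 * abs(coeff) / n
    phase = (-cmath.phase(coeff)) % (2.0 * math.pi)
    return amplitude, phase


# Synthetic voxel (hypothetical values): 8 stimulus cycles in 128 volumes.
n_timepoints, cycles, true_phase = 128, 8, 1.0
synthetic = [
    100.0 + 2.0 * math.cos(2.0 * math.pi * cycles * t / n_timepoints - true_phase)
    for t in range(n_timepoints)
]
amp, ph = response_at_frequency(synthetic, cycles)
print(amp, ph)
```

In a real data set the same computation would be repeated for every voxel (or surface vertex), and the resulting per-subject values would then be submitted to the group statistics.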
Although spherical surface-based averaging methods cause less blurring than 3D methods, retinotopic areas that are smaller or that tend to be more variable across subjects might be better viewed in individual subjects. Exploring individual cases also makes it possible to examine the agreement between group results and individual data.
Figure 5 depicts Subject 3's data for the Attention + Stimulus condition. In this subject, the responses were similar to those already presented in the group data (the group data have been displayed on the cortical surface of Subject 3). This subject had an especially strong response in the left STS and clear additional frontal responses in the precentral sulcus. Subject 3 also had several additional contralateral field representations beyond IPS2, seen clearly in the left hemisphere. These regions have recently been studied by Swisher et al. (2007) and subdivided into areas IPS3 and IPS4 in some subjects. Here, retinotopy also extended somewhat lateral to those areas, possibly overlapping with human VIP as defined in Sereno and Huang (2006).
Ventrally, retinotopic activity extended anteriorly, intersecting with this subject's functionally mapped FFA. In the medial view, despite the noise caused by image distortions near the tip of the occipital lobe, primary visual cortex showed the expected phase pattern, with less blurring than in the average. This particular subject did not show significant V6 activation.
Retinotopic responses were highly reliable from session to session. As an example, compare Figure 5 with Supplementary Figure S2, which shows data from the same subject collected approximately 6 months apart.
In Figure 6, we show an additional 4 hemispheres (2 left and 2 right) with the data presented on each subject's own inflated cortical surface, this time focusing on within-subject attentional modulation of retinotopic activity. As in previous studies, there was variability in retinotopic organization between subjects—but the responses were consistent enough to be conserved across the group after averaging. Subjects showed extensive activity in temporal, parietal, and frontal regions in the Attention + Stimulus condition. Withdrawn attention led to a strong reduction in the responses. Areas in the vicinity of MT+ proved to be the most resistant to the withdrawal of attention here (especially see Subject 1) and across other subjects.
In a variant of the experiment that featured nonbiologically moving point-light objects, we explored whether the present results were specific to biological motion. The higher retinotopic areas we identified here do not appear to be specifically driven by biological motion, and were activated to at least a similar degree by coherently but nonbiologically moving objects composed of point-lights (Supplementary Materials). This is consistent with our general finding that higher areas are primarily attention driven.
In the last few years, distinct topographic regions in higher cortical areas have been studied by different groups (e.g., Silver et al. 2005; Hagler and Sereno 2006; Larsson and Heeger 2006; Pitzalis et al. 2006; Kastner et al. 2007; Swisher et al. 2007). Here, as in those studies, we identified topographic maps in V6, lateral occipital cortex, several areas in the vicinity of the IPS, and the FEF, but all at the same time. In addition, we identified retinotopy in a large portion of lateral temporal cortex extending anteriorly from MT/MST into the STS and a region in the precuneus that may correspond to the human PRR.
Retinotopic activity in the human brain changes in steps from primarily stimulus driven to primarily attention driven. Retinotopy in early areas—especially primary visual cortex—appears primarily stimulus driven and shows small attentional modulation compared with higher areas (Tootell et al. 1998). Retinotopy in motion-sensitive areas shows attentional modulation, but also shows sensitivity to stimulus structure in the absence of attention. Retinotopy in parietal and frontal regions known to be involved in spatial orienting and attentional control (Kastner and Ungerleider 2000; Corbetta and Shulman 2002; Pessoa et al. 2003) is strongly and primarily driven by attention.
Lateral temporal retinotopy covers several motion-sensitive areas. The region activated here extending into the STS almost certainly includes more than the previously studied retinotopic LO regions, MT and MST (Huk et al. 2002; Larsson and Heeger 2006). In contrast to our human data, areas beyond MT have shown little or no retinotopy in monkeys (see Nelissen et al. 2006). It remains to be determined whether this difference is due to experimental factors (stimuli, phase-encoded mapping), or to actual cross-species differences.
As shown in Supplementary Figure 3, the retinotopic activity in lateral temporal cortex overlaps brain areas responsive to biological motion, the stimuli used in the present study (Grossman et al. 2000; Saygin et al. 2004).
Even though lateral temporal cortex exhibited attentional modulation, these regions were active even when subjects did not attend to the stimuli. This was true both at the group level and for individual subjects. It appears that these areas represent the stimuli retinotopically even in the absence of attention.
Parietal and Frontal Cortex
Detailed spatial representations in human cortex are not restricted to early visual areas, but continue to higher levels of processing, all the way to frontal cortex. In the present study, we saw clear, strong, attention-driven retinotopic activity in multiple parietal and frontal areas.
Recently, it was suggested that phase-encoded methods might give biased results in higher areas and that a measure of contralateral preference is more reliable than within-hemifield retinotopy (Jack et al. 2007). Contralateral–ipsilateral biases are likely to be more significant than within-hemifield biases because the average distance between ipsilateral and contralateral receptive field centers is larger than the average distance between, for example, receptive fields with polar angles from 12 o'clock to 2 o'clock (the “vertical meridian”) and those from 2 o'clock to 4 o'clock (the “horizontal meridian”) within 1 hemifield. Also, as receptive field size increases, the periodic modulation of a phase-encoded signal will be reduced (cf. Tootell et al. 1997). However, we still see significant modulation in single subjects, and the fact that retinotopic organization in parietal and frontal areas survives in cross-subject averages strongly argues that the present results are not merely an artifact of amplifying noise in randomly distributed receptive field centers.
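The expected loss of periodic modulation with increasing receptive field size can be illustrated with a toy simulation. The sketch below is not a model fitted to our data: it assumes a single unit with Gaussian tuning over polar angle, a rotating wedge of hypothetical width 45°, and a response equal to the average tuning value under the wedge. The function name and all parameter values are illustrative assumptions.

```python
import cmath
import math


def modulation_depth(rf_sigma_deg, wedge_width_deg=45.0, n_steps=360):
    """Relative modulation (fundamental amplitude / mean response) of a
    toy unit as a wedge rotates once around fixation."""
    preferred = 90.0  # hypothetical preferred polar angle, in degrees
    n_samples = 45    # angular samples taken across the wedge
    responses = []
    for step in range(n_steps):
        wedge_center = 360.0 * step / n_steps
        total = 0.0
        for i in range(n_samples):
            angle = (wedge_center - wedge_width_deg / 2.0
                     + wedge_width_deg * (i + 0.5) / n_samples)
            # Shortest angular distance between sample and preferred angle.
            dist = (angle - preferred + 180.0) % 360.0 - 180.0
            total += math.exp(-dist ** 2 / (2.0 * rf_sigma_deg ** 2))
        # Response = average Gaussian tuning value under the wedge.
        responses.append(total / n_samples)
    mean = sum(responses) / n_steps
    # Fourier coefficient at one cycle per wedge rotation.
    coeff = sum(
        r * cmath.exp(-2j * math.pi * t / n_steps)
        for t, r in enumerate(responses)
    )
    return 2.0 * abs(coeff) / n_steps / mean


# Tuning widths (degrees of polar angle) chosen only for illustration.
for sigma in (10.0, 30.0, 60.0, 120.0):
    print(sigma, round(modulation_depth(sigma), 3))
```

In this toy model the modulation depth falls monotonically as the tuning width grows but remains above zero, which is in line with the argument that large receptive fields weaken, rather than abolish, a phase-encoded signal.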
It is clear that there are one or more retinotopic areas anterior and lateral to IPS2, as reported recently (Swisher et al. 2007). The lateral edge of the anterior parietal activity in our data also overlaps with the putative homologue of macaque VIP (Sereno and Huang 2006). V7, the intraparietal areas, and the FEF are predominantly attention-driven, in contrast to superior occipital area V3A, which responded equally well in the Attention + Stimulus and Stimulus conditions. This pattern agrees well with the literature on the neural correlates of self-directed attention (Kincade et al. 2005).
The topography of the FEF region was studied recently using eye movement tasks by Hagler et al. (2007) and Kastner et al. (2007). Frontal areas, specifically the FEF, have long been known to receive topographic connections from posterior areas (Schall et al. 1995), and spatially specific modulatory influences of the FEF on retinotopic cortex have recently been demonstrated in the human brain (Ruff et al. 2006). We now see that the human FEF exhibit retinotopy that can be driven by attention alone.
There were weak but significant responses to the retinotopic stimuli in parietal and, to a lesser extent, in frontal cortex even in the withdrawn attention condition. This could be due to the inherent salience of these stimuli (e.g., monkey lateral intraparietal area neurons are known to represent salient sensory stimuli that are not, but might become, behaviorally relevant; Gottlieb et al. 1998). It is not immediately clear why this activity was stronger in the left hemisphere given that the maps were contralateral. In individual subjects there were exceptions to this pattern of lateralization.
In the majority of spatial attention studies, classical attention areas have either been shown to have no spatial selectivity or only a coarse spatial representation (e.g., Corbetta et al. 2005; Wilson et al. 2005; Serences and Yantis 2007). Here, however, we showed that attention to particular spatial locations is accompanied by precise, predictable changes in the locus of activity in retinotopic maps in intraparietal and frontal cortex. The relative scarcity of similar results in previous studies may be due in part to the 3D blurring typically applied in volume-based group averages. The spatial extent of the stimuli (compare Serences and Yantis 2007 with Yantis et al. 2002; Serences et al. 2005) and stimulus modality (Macaluso et al. 2003) might also be relevant to whether these modulations are detected.
In inferotemporal cortex, retinotopic activation covered V4+ and extended anteriorly, overlapping with the FFA in individual cases. Inferotemporal cortex exhibited both attentional modulation and stimulus-driven activity.
Early Visual Cortex
Given that neurons in early areas have small receptive fields, presenting visual stimuli covering the entire visual field could well have led to no retinotopic response from primary visual cortex. Instead, activity here resembled maps obtained using stimuli that are optimal for these areas, even when subjects did not attend the stimuli (Sereno et al. 1995). On the other hand, the maps were disrupted when there was no stimulus contrast between the wedge and the background, even when subjects attended to the wedge. It is likely that the responses in V1 are due to the perceived difference between the wedge and the background, rather than the specific contents of the wedge: V1 is not known to respond better to structure-from-motion, has not shown a preference for biological motion (Grossman et al. 2000; Saygin et al. 2004), and in fact may have a preference for unstructured motion (Braddick et al. 2001; Murray et al. 2002).
Attentional modulation of neural activity in higher areas but not in primary visual cortex has been reported in neurophysiological studies of nonhuman primates (McAdams and Maunsell 1999; Cook and Maunsell 2002). On the other hand, human fMRI studies have repeatedly shown attentional effects in early visual cortex, including V1 (Brefczynski and DeYoe 1999; Gandhi et al. 1999; Kastner et al. 1999; Martinez et al. 1999; Somers et al. 1999). It is possible that the present study did not reveal reliable attentional modulation of retinotopic activity in V1 due to insufficient statistical power. Also, image distortions at 3T due to B0 inhomogeneities are especially prominent in posterior and posterior-medial cortex where V1 is located, even after field map corrections. At the very least, the present data show that the maps in V1 are not driven as strongly by attention as those in higher areas, a finding that is consistent with both the neurophysiology and the human neuroimaging data.
Several years ago, in their classic study, Brefczynski and DeYoe (1999) reported that directing attention to different locations in space leads to increased fMRI activity in the cortical representation of those locations in primary visual cortex even when the stimulus is kept well controlled. Recently, Silver et al. (2005) reported a related result in dorsal stream areas. In the Attention condition here, we were able to reveal the full extent of topographic activity that is driven in temporal, parietal and frontal cortex as attention moves across space.
Topography, Vision, and Attention
Multiple cortical areas exhibit activity correlated with the retinotopic position of visual stimuli. The areas differ in the degree to which these responses are modulated by visual stimuli or by attention: Some areas (early visual cortex) are primarily driven by stimuli, whereas others (parietal and frontal cortex) exhibit primarily attention-driven retinotopy. There are also areas (motion-sensitive cortex) that maintain a reliable retinotopic response to the stimuli even in the absence of attention. These findings indicate that retinotopic representations in different areas may have varying functional roles during perception and spatial attention. In general, we propose that retinotopic maps in higher areas are not epiphenomenal. Instead, they actively subserve and may provide an infrastructure for spatial tasks such as attention.
National Science Foundation grant (BCS 0224321) to M.I.S.; and European Commission Marie Curie grant (FP6-025044) to A.P.S.
We thank D. J. Hagler for developing group analysis software; S. M. Wilson for help with visual stimuli; R. Buxton, E. Wong, T. Liu, and L. Frank at the University of California San Diego fMRI Center for scan time and pulse sequences; F. Dick, J. Driver, and G. Rees for their comments on the manuscript. Conflict of Interest: None declared.