The capacity to imagine being somewhere else and seeing the environment from a different point of view is crucial for spatial planning in daily life and for understanding the intentions, actions, and state of mind of other people. The neural bases of spatial updating of multiple object locations were investigated using functional magnetic resonance imaging. Healthy volunteers saw an array of objects on a table in a virtual reality environment and imagined movement of their own viewpoint or rotation of the array. Their memory for the locations of the objects was then tested with a change-detection task. Behavioral results confirmed the advantage for imagined viewpoint change compared with imagined array rotation of equivalent size. Encoding of object locations was associated with a network of areas, including bilateral superior and inferior parietal cortices. The precuneus was additionally activated by the demands of both viewpoint- and array rotation. The parieto-occipital sulcus/retrosplenial cortex and hippocampus were additionally activated by the demands of viewpoint rotation, while array rotation was associated with activation of the right intraparietal sulcus. These findings support a computational model of spatial memory in which parieto-occipital sulcus/retrosplenial cortex mediates spatial updating as part of a process of translation between “egocentric” and “allocentric” reference frames.
The capacity to imagine what a scene, an object, or an array of objects looks like when seen from a different viewpoint is crucial in everyday life. It allows, for instance, recognizing landmark buildings from memory when they are approached from a new direction and planning a novel scenario or route. It also allows imagining what another individual sees and thus better understanding his or her intentions, actions, and emotional reactions.
When an observer is facing an array of objects and has to imagine it from another point of view, he can either imagine the configuration rotating in front of him—“object rotation” or “array rotation”—or he can imagine himself moving around the array—that is, “viewer rotation” or “visual perspective taking.” Several lines of evidence suggest that the 2 processes of perspective taking and array rotation are distinct (see Zacks and Michelon 2005). In the original experiment of Wang and Simons (1999), participants saw an array of objects from one viewpoint and, after a pause, were asked which object had been moved. During the pause the participant might be moved to a new viewpoint around the array or the array might be rotated by a corresponding amount or both or neither. The authors showed that performance was better after movement of viewpoint than after the equivalent rotation of the array, consistent with numerous similar findings in terms of accuracy or reaction times (RTs) and for both actual and imagined motion (Presson 1982; Amorim and Stucchi 1997; Wang and Simons 1999; Wraga et al. 2000, 2005; Creem, Downs, et al. 2001; Creem, Wraga, et al. 2001; King et al. 2002; Burgess et al. 2004). Moreover, the linear modulation of RTs by angle of rotation, which is well established in array rotation (Diwadkar and McNamara 1997; Shelton and McNamara 1997; Amorim 2003), is not necessarily found in perspective taking (Amorim 2003; Lambrey et al. 2008).
Neuroimaging studies have identified different parietal and premotor activations associated with object or array rotation (e.g., Bonda et al. 1995; Parsons et al. 1995; Cohen et al. 1996; Tagaris et al. 1997; Kosslyn et al. 1998, 2001; Wexler et al. 1998; Carpenter et al. 1999; Wraga 2003). In a recent meta-analysis (Zacks 2008), Zacks showed that object rotation is accompanied by increased activity in the intraparietal sulcus and adjacent cortices as well as in motor areas of the medial precentral cortex, supporting the view that object/array rotation may depend on motor simulation in certain situations. Conversely, the neural correlates of imagined self or viewer rotation have been less explored. In a study by Creem, Downs, et al. (2001), participants were required to indicate the position of 1 of 4 external objects after they had performed an imagined self-rotation to a new position. Using functional magnetic resonance imaging (fMRI), the authors found that this task was associated with bilateral superior parietal activation (stronger in the left hemisphere), as well as activation in the left premotor and supplementary motor areas (but not in M1 motor cortex). In another study, Vogeley et al. (2004) presented a virtual scene with an avatar and red balls in a room to normal volunteers and asked them to count the balls as seen either from the avatar's (3rd person perspective, 3PP) or one's own perspective (1st person perspective, 1PP). Their results revealed that the 3PP condition was associated with activation in the medial superior parietal and right premotor cortices, whereas the 1PP condition was associated with activation in the right insula, the medial prefrontal cortex, the superior temporal cortex as well as in the posterior and anterior cingulate cortices. So far, only a few studies attempted to directly compare the neural correlates of imagined perspective taking and array rotation (Zacks et al. 2003; Wraga et al. 2005, 2010; Keehner et al. 2006). 
In Zacks et al. (2003), participants were instructed to imagine either a square array of 4 objects rotating or themselves rotating around the array. Object rotation led to selective increased activity in the right intraparietal sulcus and decreased activity in the left superior temporal sulcus and temporo-parietal junction, whereas viewer rotation led to selective activation of left superior temporal sulcus and parieto-temporo-occipital junction. In Keehner et al. (2006), participants were presented with one ball on a table in virtual reality and had to decide whether this ball would be on their left or their right after either imagined rotation of the table (i.e., object rotation) or of themselves around the table (i.e., perspective taking). Their results showed that during object rotation the right superior parietal cortex exhibited a positive linear relationship between hemodynamic responses and degrees of rotation, whereas the same region exhibited a negative linear trend during perspective taking. In 2 other studies, the authors used the Shepard and Metzler (1971) objects to assess the distinction between object and viewer rotation (Wraga et al. 2005, 2010). In Wraga et al. (2005), participants were asked either to imagine such an object rotating in front of them or themselves moving around the object and then had to indicate whether one particular landmark on the object was visible from the imagined viewpoint. Here, object rotation yielded neural activity spreading from left premotor to left primary motor (M1) cortex, whereas equivalent self-rotation was associated with activation in the left supplementary motor area. In a more recent study (Wraga et al. 2010), the same authors used a similar paradigm with a decision task that required participants to associate the stimuli with their physical body. After having performed the spatial transformation, they had to indicate whether the landmark on the object was on the right or on the left of their body midline. This time, both object rotation and perspective taking yielded activation in the left primary motor cortex (M1), in the precuneus, and in the cingulate gyrus. Moreover, perspective taking also yielded activation in the left premotor cortex as well as in the inferior parietal lobule, whereas object rotation yielded activation in the superior parietal lobule.
To summarize, previous fMRI studies identified different parietal and premotor activations associated with viewer and array rotation (Zacks et al. 2003; Wraga et al. 2005, 2010; Keehner et al. 2006), although these patterns were not consistent across the studies. However, none of these studies reported activation of the medial temporal lobe or parieto-occipital sulcus. This is at odds with neuropsychological findings, which show that hippocampal damage specifically impairs spatial memory across changes of viewpoint (Morris et al. 1996; Abrahams et al. 1997; Parslow et al. 2005), leading to impairments with even small numbers of items and short delays (King et al. 2002, 2004; Hartley et al. 2007). In addition, the parieto-occipital sulcus (including retrosplenial cortex anteriorly) has been proposed as a key structure for the translation between allocentric hippocampal representations and egocentric parietal ones (Galletti et al. 1993; Burgess, Becker, et al. 2001; Burgess, Maguire, et al. 2001; Byrne et al. 2007; Vann et al. 2009), a process likely required during imagined movement of viewpoint through an external environment (Burgess, Maguire, et al. 2001; Committeri et al. 2004). Moreover, part of the advantage for memory in the self-motion condition of Wang and Simons over the array rotation condition was due to the influence of allocentric representations of the array locations relative to external landmarks in the environment (Burgess et al. 2004), representations thought to be encoded by medial temporal lobe structures (O'Keefe and Nadel 1978; Burgess et al. 2002; Bird et al. 2010). The absence of a surrounding environment in previous fMRI studies, notably the studies by Zacks et al. (2003) and Wraga et al. (2005), may therefore explain the absence of activation in the medial temporal lobe or the parieto-occipital sulcus. In addition, the use of a regular square array and rotations of multiples of 90° by Zacks et al. (2003) may have left the “perspective taking” task solvable via logical/verbal rules, while the use of a single object for the stimuli by Wraga et al. (2005) may not have recruited the same network as required for spatial updating of multiple locations (King et al. 2002).
The main aim of the present fMRI study was therefore to investigate the neural bases of perspective taking and array rotation in conditions in which an environmental reference frame is clearly available. Healthy volunteers were presented an array of 4 objects on a table in a rich virtual environment and asked to imagine either movement of their own viewpoint (perspective-taking condition) or rotation of the table and array (array rotation condition). Their memory for the locations of the objects was then tested with a change-detection task. Behaviorally, participants were expected to show greater performance in the perspective-taking condition than in the array rotation condition. In terms of neural correlates, array rotation was expected to be associated with differential increases of activity in the intraparietal sulcus, as suggested by previous studies, whereas perspective taking was expected to be associated with activation in the parieto-occipital sulcus as well as in the medial temporal lobe, and in particular the hippocampus.
Perspective taking also provides critical information for monitoring social interactions. It is likely a prerequisite to understand another's intentions, actions, or emotional reactions, as well as to adapt our own behavior to the current situation (Frith and Frith 2006). Interestingly, Langdon and Coltheart (2001) have shown that the general advantage of perspective taking over array rotation is reversed in participants with significant schizotypal traits. Accordingly, the distinction between perspective taking and array rotation might have relevance to social cognition concerning our more general ability to appreciate another person's perspective (Frith and Frith 2005; Frith and de Vignemont 2005). An additional aim of the present study was therefore to compare the behavioral and fMRI effects of simply imagining a new perspective versus imagining “someone else's perspective.” To do so, both perspective taking and array rotation were cued either by a simple arrow or by a virtual character present in the scene. We expected the distinction between the arrow and the avatar to be relevant for perspective taking but not for array rotation.
Materials and Methods
There were initially 21 participants, but data from 3 of them had to be excluded from further analyses due to technical problems during scanning. Therefore, data from 18 participants (9 males and 9 females, aged between 20 and 23, mean age = 21.0) were analyzed. All participants were healthy right-handed university students. They all gave informed written consent and were paid for participating in this study, in accordance with the local Ethics Committee.
Stimuli and Trial Structure
Images were created using 3D Studio Max 6 (Autodesk, Inc., San Rafael, USA) and presented using Cogent (http://www.vislab.ucl.ac.uk/cogent.php), a toolbox for Matlab (The MathWorks, Inc., Natick, USA). Each trial comprised 3 phases: a “presentation phase,” in which the participant saw an array of 4 objects on a table set within a rich 3D environment (maximum 10 s; participants could choose to move on to the delay phase at any time during those 10 s); a “delay phase” (2–6 s, mean 4 s); and a “test phase,” in which the participant was presented with the scene again and had to indicate which 1 of the 4 objects had been moved (a forced choice between 2 objects; maximum 14 s). The presentation-phase scene showed an arrow and an avatar placed at varying locations around the table (112.5° or 157.5° from the viewer), with a pole set in the table at its nearest point to the viewer (see Figure 1). During the test phase, the 2 answer choices were shown at the bottom left and bottom right of the screen. Participants made their response by pressing either the left or the right button on a keypad with their right hand. The task instructions (“cue phase”; 4 s) preceded the encoding image and told participants what kind of rotation to imagine (see below).
We aimed to investigate the neural bases of self-rotation (taking a new perspective at a different position) and array rotation (table rotation to new position) within spatial memory. We also aimed to examine the effects of taking the perspective of another observer (an avatar) versus simply changing one's own perspective (indicated by an arrow). Therefore, our experiment comprised a 2 × 2 factorial design, with the within-subject factors “type of rotation” (self vs. table) and “cue” (avatar vs. arrow), see Figure 2A. Our conditions were thus determined by the instructions prior to presentation and the changes between the presentation and test phase scenes: a rotation of the table so that the pole is now at the arrow (“table arrow” condition) or at the avatar (“table avatar” condition); a movement of viewpoint to the location indicated by the arrow (“self-arrow” condition) or by the avatar (“self-avatar” condition) and no change between presentation and test scenes (“control” condition). Prior to the presentation phase, the participant was informed of the forthcoming condition and had to imagine the appropriate movement (table rotation or movement of viewpoint to either arrow or avatar) during presentation. For the control condition, they were simply asked to memorize the object locations. The participant was instructed to press a button during the presentation phase to indicate when they had completed the imagined movement. To control for any unintended effects of difficulty or memory for specific object configurations, no configuration appeared more than once from a particular viewpoint, and the configurations used for table rotation and self-rotation were counterbalanced across participants. During the fMRI scan, participants performed a set of 60 trials, 12 for each condition. The order of the trials was randomized and the same condition was not presented during consecutive trials.
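The trial-order constraint described above (60 trials, 12 per condition, with no condition on two consecutive trials) can be sketched as follows. This is only an illustration under stated assumptions, not the procedure used in the study: the function name `make_trial_order`, the condition labels, and the seed are hypothetical, and the greedy draw with restarts does not sample all valid orders with equal probability.

```python
import random

# Hypothetical labels for the 5 conditions named in the text.
CONDITIONS = ["self-arrow", "self-avatar", "table-arrow", "table-avatar", "control"]


def make_trial_order(conditions, n_per_condition, rng):
    """Return a random order with n_per_condition trials per condition
    and no condition appearing on two consecutive trials."""
    n_trials = n_per_condition * len(conditions)
    while True:  # restart whenever the greedy draw hits a dead end
        remaining = {c: n_per_condition for c in conditions}
        order = []
        for _ in range(n_trials):
            previous = order[-1] if order else None
            candidates = [c for c in conditions
                          if remaining[c] > 0 and c != previous]
            if not candidates:
                break  # only the previous condition is left: start over
            # Favor conditions with more trials left to reduce dead ends.
            weights = [remaining[c] for c in candidates]
            choice = rng.choices(candidates, weights=weights, k=1)[0]
            remaining[choice] -= 1
            order.append(choice)
        else:
            return order  # loop finished without a dead end


order = make_trial_order(CONDITIONS, 12, random.Random(0))
print(len(order), order[:5])
```

Passing a seeded `random.Random` instance keeps an order reproducible for a given participant while leaving the global random state untouched.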
Participants were trained prior to their fMRI scan. Training first included written instructions with pictures. Participants were then familiarized with the virtual environment by way of a 360° video (shown twice). They were also shown videos of the situation to be imagined in the 4 main conditions: a smooth rotation of the table or of the viewpoint to the arrow or avatar. Finally, 10 practice trials were run: 2 trials for each of the 5 conditions, presented in randomized order.
In order to have an idea of how the participants performed the experimental tasks, they were asked the 2 following questions at the end of the experiment: 1) “Did you always follow the instructions, that is ‘imagining yourself moving’ in the ‘self’ trials and ‘imagining the table rotating’ in the ‘table’ trials?” and 2) “Did you do the required mental transformation during the presentation phase, during the delay phase or during the test phase?”
Blood oxygen level–dependent sensitive T2*-weighted functional images were acquired on a 3-T Siemens Allegra scanner using a gradient-echo echo planar imaging (EPI) pulse sequence with the following parameters: repetition time (TR) = 2600 ms, echo time (TE) = 30 ms, flip angle = 90°, slice thickness = 2 mm, interslice gap = 1 mm, in-plane resolution = 3 × 3 mm, field of view = 192 mm², 40 slices/volume. The first 5 volumes were discarded to allow for T1 equilibration. The sequence was optimized to minimize signal dropouts in the medial temporal lobes (Weiskopf et al. 2006). In addition, a field map using a double echo FLASH sequence was recorded for distortion correction of the acquired EPI images (Weiskopf et al. 2006), see below.
The imaging analysis was performed with statistical parametric mapping (SPM5) (www.fil.ion.ucl.ac.uk/spm). First, EPI images were spatially realigned to the first image in the time series. Using the field map routines in SPM5 (Hutton et al. 2002), field maps were estimated from the phase differences between the images acquired at the short and long TE. The EPI images were corrected for distortions based on the field map (Hutton et al. 2002) and for the interaction of motion and distortion using the unwarp routines in SPM5 (Andersson et al. 2001; Hutton et al. 2002). Subsequently, images were normalized to an EPI template specific to our sequence and scanner that was aligned to the T1 Montreal Neurological Institute (MNI) template. Finally, the normalized functional images were spatially smoothed with an isotropic 8-mm full-width at half-maximum Gaussian kernel.
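As a small worked example of the smoothing step, the full width at half maximum (FWHM) of a Gaussian kernel relates to its standard deviation by sigma = FWHM / (2 * sqrt(2 * ln 2)). The sketch below only illustrates that standard relation for the 8-mm kernel mentioned above; the function names are hypothetical and this is not SPM5 code.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian kernel's FWHM (mm) to its standard deviation (mm)."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


def gaussian_profile(x_mm, sigma_mm):
    """Unnormalized 1D Gaussian profile, equal to 1 at the kernel center."""
    return math.exp(-(x_mm ** 2) / (2.0 * sigma_mm ** 2))


sigma = fwhm_to_sigma(8.0)  # 8-mm FWHM kernel from the text
print(round(sigma, 3))  # 3.397
# At half the FWHM (4 mm) from the center, the profile falls to half its peak.
print(round(gaussian_profile(4.0, sigma), 3))  # 0.5
```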
Statistical Analysis of fMRI Time Series
fMRI time series were modeled by a general linear model including regressors for the 4 experimental conditions and the control condition, separately for the encoding, the delay, and retrieval phase. Data were high-pass filtered (cutoff = 128 s) and scaled for global activity. Coefficients for each regressor were estimated for each participant using maximum likelihood estimates to account for serial correlations in the data. At a first level, linear contrasts of the parameter estimates for each experimental condition regressor versus the control condition regressor (separately for the 3 trial phases) were calculated for each participant and brought to a second level random effects analysis. The data were then subjected to a 2 × 2 analysis of variance (ANOVA) on the second level with the factors rotation (self vs. table rotation) and cue (arrow vs. avatar), separately for the encoding, delay, and retrieval phase. Based on our strong a priori hypotheses, we report activations of 3 or more contiguous voxels at a statistical threshold of P < 0.001 (uncorrected). Coordinates of brain regions are reported in MNI space.
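The 2 × 2 second-level design described above can be illustrated with contrast weights over the 4 experimental conditions (each already expressed relative to the control condition). This is a sketch under assumptions: the condition ordering, the sign conventions (self = +1, avatar = +1), and the function names are hypothetical and are not taken from the SPM5 batch used in the study.

```python
# Assumed ordering of the 4 condition-versus-control contrast images.
CONDITIONS = ["self-arrow", "self-avatar", "table-arrow", "table-avatar"]

# Factor coding per condition: rotation (self = +1, table = -1)
# and cue (avatar = +1, arrow = -1).
ROTATION = {"self-arrow": 1, "self-avatar": 1, "table-arrow": -1, "table-avatar": -1}
CUE = {"self-arrow": -1, "self-avatar": 1, "table-arrow": -1, "table-avatar": 1}


def contrast_weights():
    """Weights over CONDITIONS for both main effects and their interaction."""
    rotation = [ROTATION[c] for c in CONDITIONS]
    cue = [CUE[c] for c in CONDITIONS]
    interaction = [r * q for r, q in zip(rotation, cue)]
    return {
        "self > table": rotation,
        "avatar > arrow": cue,
        "rotation x cue": interaction,
    }


def apply_contrast(weights, estimates):
    """Weighted sum of per-condition estimates for one participant or voxel."""
    return sum(w * e for w, e in zip(weights, estimates))


for name, w in contrast_weights().items():
    print(name, w)
# self > table [1, 1, -1, -1]
# avatar > arrow [-1, 1, -1, 1]
# rotation x cue [-1, 1, 1, -1]
```

With this coding, a positive value of the interaction contrast means that the avatar-minus-arrow difference is larger for self-rotation than for table rotation; reversing any sign convention simply flips the direction of the corresponding comparison.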
To the question “Did you always follow the instructions, that is ‘imagining yourself moving’ in the ‘self’ trials and ‘imagining the table rotating’ in the ‘table’ trials?,” all subjects but 2 answered yes, suggesting that they actually performed the required task. The 2 subjects who answered “no” reported sometimes having “just jumped to image of what it would be like” or “just jumped to test image.”
To the question “Did you do the required mental transformation during the presentation phase, during the delay phase or during the test phase?,” all subjects answered having done it during the presentation phase. Four of them reported that they continued to do it during the delay phase and 2 others said that they continued to do it during both the delay and the test phases.
The behavioral data were analyzed with a 2 × 2 × 2 × 2 ANOVA, composed of the between-subject factor gender and 3 within-subject factors: rotation type (self vs. table), cue (avatar vs. arrow), and angle (112.5° vs. 157.5°). This analysis was conducted separately for the RTs during the presentation phase, and for both RT and performance (percentage correct) during the test phase (see Figure 2B). RTs for the self-rotation condition were significantly shorter than for the table rotation condition in both the presentation (F(1,16) = 6.22; P = 0.024) and test (F(1,16) = 20.61; P < 0.001) phases. Test performance was also significantly better for the self-rotation condition than for the table rotation condition (F(1,16) = 11.37; P = 0.004). RT during encoding (F(1,16) = 2.16; P = 0.161) and performance (F(1,16) = 1.44; P = 0.247) did not differ between the 2 cue types. However, the main effect of cue on RT during retrieval approached significance, with RT being shorter for the arrow than for the avatar (F(1,16) = 4.39; P = 0.053). There was also a significant interaction between the factors rotation type and cue on the test-phase RT (F(1,16) = 14.66; P = 0.001) but not on the presentation-phase RT (F(1,16) = 2.596; P = 0.127) or on the test-phase performance (F(1,16) = 0.40; P = 0.535). This rotation × cue interaction on the test-phase RT reflects a significantly longer RT for change of viewpoint to the avatar than to the arrow (t(17) = 3.50; P = 0.003), with no difference in array rotation to either cue (t(17) = −1.20; P = 0.248).
There were no main effects of gender or rotation angle, nor any significant interactions between these factors and rotation type or cue on the 3 dependent measures, with the following exceptions: we observed a cue × angle (F(1,16) = 4.63; P = 0.047) and a cue × angle × gender (F(1,16) = 5.02; P = 0.040) interaction on RT during test. However, the rotation × cue interaction was not modulated by angle (no rotation × cue × angle interaction: F(1,16) = 0.46; P = 0.509) or by gender (no rotation × cue × gender interaction: F(1,16) = 0; P = 0.993). Finally, for the performance measure, we observed an additional angle × gender interaction (F(1,16) = 5.66; P = 0.030).
To summarize, behavioral results confirmed the advantage for imagined viewpoint change compared with imagined array rotation of equivalent size.
Spatial Updating of Object Locations in Memory
First, we were interested in brain regions involved in imagining either type of rotation (self and table), independent of cue type. Thus, we compared activity during the presentation phase of all 4 experimental conditions with the control condition, which involves encoding the object locations knowing that there will be no change of perspective or rotation of the table. We observed bilateral increased activation in medial parietal cortex, that is, precuneus (MNI coordinates x, y, z: 12/−66/54; peak z-score: 4.87) and in intraparietal sulcus (−42/−48/48; z = 4.12), see Figure 3. See also Supplementary Table 1 for a full list of regions, including results for the delay and test phase.
fMRI data were then subjected to a 2 × 2 ANOVA with the factors rotation type (self vs. table) and cue type (avatar vs. arrow), separately for the presentation, delay, and test phases. To control for any nonspecific task effects, these analyses were performed on the contrast images of the 4 experimental conditions versus the control condition (see Materials and Methods).
Effects of Rotation Type
When comparing activations in the self-rotation conditions with those of the table rotation conditions during the presentation phase, we observed bilateral activation in the parieto-occipital sulcus extending into the retrosplenial cortex anteriorly (21/−57/6; z = 4.36). We also found bilateral activation in insular cortex (−51/9/0; z = 4.50) in this comparison. In contrast to self-rotation, imagined table rotation activated a parietal network, including the bilateral medial parietal cortex (12/−50/57; z = 4.42) and the inferior parietal lobule extending into the intraparietal sulcus (33/−51/63; z = 4.81). See Figure 4 and Supplementary Table 2.
A subset of these regions was also activated during the test phase when comparing the self-rotation conditions with the array rotation conditions, including the right parieto-occipital sulcus extending into the retrosplenial cortex (15/−51/12; z = 3.98) and the left insula (−45/−15/15; z = 3.64), with an additional recruitment of the left hippocampus (−18/−9/−15; z = 3.47) as well as of the right parahippocampal gyrus extending into the right hippocampus (30/−27/−18; z = 3.28). See Figure 5. The reversed comparison revealed activations in right lateral prefrontal cortex, see Supplementary Table 4.
We also analyzed the delay between presentation and test phase. Right retrosplenial cortex (−15/−54/6; z = 3.88), bilateral insula (−33/−3/9; z = 3.97), and left hippocampus/parahippocampus (−21/−30/−9; z = 3.63) showed increased activity for the self-rotation relative to the table rotation conditions during the delay phase (anticipating several of the activations found in the test phase). We additionally found activations in the left intraparietal sulcus (−27/−57/51; z = 3.63) during imagined table versus self-rotation. See Figure 6 and Supplementary Table 3.
In order to assess the differences between participants with good and poor performance in changing point of view in space, we performed correlations of each individual subject's performance with their fMRI activation during the presentation phase in both the self-rotation and the table rotation conditions, separately. Considering the self-rotation condition, we found a significant effect in the right parieto-occipital sulcus (27/−54/9; z = 3.45) such that participants with higher fMRI signal in this region showed better performance. No effect was found in the table rotation condition.
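The brain-behavior analysis above relates one performance score per participant to one activation estimate per participant. The text does not specify how the correlation was computed, so the following is only a minimal sketch using Pearson's r and the associated t statistic. The function names are hypothetical, and the example values are made-up placeholders, not data from the study.

```python
import math


def pearson_r(x, y):
    """Pearson correlation between two equal-length sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for constant input")
    return sxy / math.sqrt(sxx * syy)


def t_from_r(r, n):
    """t statistic (n - 2 degrees of freedom) for testing r against zero."""
    return r * math.sqrt((n - 2) / (1.0 - r * r))


# Placeholder values for illustration only (NOT data from the study).
performance = [0.60, 0.70, 0.75, 0.80, 0.90, 0.95]  # proportion correct
activation = [0.10, 0.30, 0.20, 0.50, 0.40, 0.70]   # arbitrary units

r = pearson_r(performance, activation)
print(r > 0, t_from_r(r, len(performance)) > 0)
```

In a voxel-wise analysis the same computation would be repeated at every voxel, which is why a statistical threshold and a minimum cluster size are needed to interpret the result.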
Effects of Cue Type
A final question concerns the effect of taking someone else's perspective (avatar) over taking a new perspective per se (arrow). As shown in Figure 7, the comparison of activations in the avatar-cued trials with those of the arrow-cued trials revealed an increased activation in the medial prefrontal/paracingulate cortex, bilaterally, both during delay (18/51/21; z = 3.96) and test phase (9/30/24; z = 3.29). See Supplementary Tables 2–4.
On the other hand, the comparison of activations in the arrow-cued trials with those of the avatar-cued trials revealed increased activation in the inferior frontal gyrus/sulcus, both in the presentation phase (51/9/9; z = 3.17 and −33/42/18; z = 3.21) and in the test phase (−33/34/12; z = 3.83), in the middle frontal gyrus in the presentation phase (21/27/33; z = 4.03), in the inferior temporal gyrus in the presentation phase (42/−57/−6; z = 3.50), and in the precentral gyrus in the test phase (−63/3/18; z = 3.60).
The performance and RT effects in the retrieval phase confirm the advantage for spatial memory following self-rotation compared with an equivalent amount of array rotation. This advantage may come from the fact that the internal spatial relationships within the array must be maintained during array rotation but not necessarily during perspective taking. Moreover, it has been shown that part of the advantage for perspective taking over array rotation is due to the possibility, in addition to spatial updating of egocentric locations across imagined self-motion, of coding the array locations relative to external landmarks in the environment (Burgess et al. 2004). Another explanation could be that, whereas array rotation necessarily requires imagined movement of the array, perspective taking may be, at least in some conditions, more akin to a blink transformation, in which the subject instantly jumps to the new imagined location without passing through any intervening points (Kosslyn 1980; Zacks and Michelon 2005; Kessler and Thomson 2010). In the present study, the absence of an effect of angular disparity on RT in the perspective-taking condition could be interpreted as an argument in favor of that explanation. However, there was no effect of angular disparity in either the perspective-taking condition or the array rotation condition, probably because the 2 angles used were too close to each other. Finally, we think that subjects here were likely to have imagined themselves moving around in the perspective-taking condition. Indeed, before the experiment, they were shown movies of the movements they were required to imagine, and all reported having done so at the end of the experiment (debriefing). Only 2 of them spontaneously said that they sometimes jumped directly to the test image.
Considering the neuroimaging results, the basic task of encoding the locations of the 4 objects (“control condition”) produced activation of a network of areas often associated with spatial navigation and memory for locations within a surrounding virtual environment (Maguire et al. 1998; Gron et al. 2000; Hartley et al. 2003; Iaria et al. 2003; Hannula and Ranganath 2008), including medial, inferior, and superior parietal cortices. The additional demands of imagining the effect of either rotation of the table and array or of the participant's own viewpoint produced extensive activation of the precuneus. This finding is consistent with evidence that the precuneus is involved in spatial working memory and mental imagery (Fletcher et al. 1996; Burgess, Maguire, et al. 2001; Wallentin et al. 2006). It is also consistent with electrophysiological results in monkeys suggesting that the medial parietal region plays a critical role in route-based navigation by integrating location information and self-movement information (Sato et al. 2006). The demands of imagining the effect of either rotation of the array or of the participant's viewpoint also produced activation of the intraparietal sulcus, which is consistent with the evidence that parietal neurons compute coordinate transformations (Andersen et al. 1985).
During the encoding phase, the self-rotation task differs from the array rotation task in the participant's instruction to imagine movement of their perspective around the array (in preparation for the retrieval phase) as compared with the instruction to imagine rotation of the table and array in front of them. A significantly greater activation in the parieto-occipital sulcus, including retrosplenial cortex, was observed when the self-rotation condition was compared with array rotation. Furthermore, activation in this brain area positively correlated with self-rotation performance. This activation is consistent with the association of this area with motion-related spatial updating of egocentric locations within an allocentric environmental reference frame (Galletti et al. 1993; Chen et al. 1994; Burgess, Becker, et al. 2001; Byrne et al. 2007). More generally, this area has often been associated with spatial navigation and memory (Burgess, Maguire, et al. 2001; Maguire 2001; Ino et al. 2002), and some have argued that it might support the formation of an allocentric cognitive map (Wolbers and Buchel 2005). Alternatively it might be important for processing information relating to heading (Wiener et al. 2002; Committeri et al. 2004) or the integration of internal (motion related) and external (visual perceptual) information (Cooper and Mizumori 2001). The close links between the neural bases of mental imagery for spatial scenes and spatial navigation (Ghaem et al. 1997; Byrne et al. 2007), indicated by all of the above accounts, are consistent with recent neuropsychological findings (Guariglia et al. 2005).
By contrast, comparison of the array rotation condition with self-rotation during encoding produced bilateral activation of the intraparietal sulcus extending into the inferior parietal cortex. This result is consistent with the association of these regions with mental rotation of single compound objects (Ratcliff 1979; Zacks et al. 2003; Podzebenko et al. 2005; Keehner et al. 2006), suggesting that imagined rotation of the array involves similar processes to mental rotation of single objects.
The basic retrieval task (control condition) elicited extensive bilateral activation of the lateral parietal lobes, including the intraparietal sulcus, as well as dorsal prefrontal cortical activation including lateral and extensive medial areas. The test phase revealed additional activation of the right parieto-occipital sulcus/retrosplenial cortex in the self-rotation condition compared with the control condition. This is consistent with preferential involvement of this region in retrieval of spatial information from a new point of view, and the allocentric–egocentric translation implied by that (Burgess, Maguire, et al. 2001; Byrne et al. 2007). However, this activation might also reflect processes relating to seeing the array of objects against a new (rotated) background, such as those relating to encoding the new scene or comparing it against the initial scene.
When comparing the activations during retrieval in the self- and array rotation conditions, we found only regions that were more active for self- than for array rotation, and none showing the reverse pattern. These areas included the right insula, left superior temporal gyrus, right parieto-occipital sulcus/retrosplenial cortex, and left anterior hippocampus. As with the areas more activated during the self-rotation encoding phase, some of these activations may reflect the processes that confer the behavioral advantage to self-rotation. The prime suspect among these is the right parieto-occipital sulcus/retrosplenial cortex, which was also activated during the encoding phase. The fact that several of these activations were also observed during the delay between encoding and retrieval phases indicates that they are unlikely to reflect the perceptual difference between these 2 conditions at retrieval (namely the shifted view of the environmental background following self-rotation).
In addition to investigating the neural bases of movement-related spatial updating, we also aimed to investigate the effects, if any, of taking someone else's perspective over taking a new perspective per se. Our behavioral results when participants performed the retrieval task from a new perspective indicated an increased RT when the new perspective was that of the avatar rather than the arrow. Thus, additional incidental (i.e., non-task-relevant) processes may be triggered simply by taking another being's point of view. In the neuroimaging data, we found additional activation of the anterior medial prefrontal cortex during the test and the delay phases when the task involved self- or array rotation to the avatar rather than to the arrow (even though the cues themselves were only visible during the encoding phase). This provides a hint as to the nature of this incidental processing, as activation of the medial prefrontal cortex has been repeatedly associated with processing of the intentions of others or “mentalizing” (Frith 2001; Kampe et al. 2003; D'Argembeau et al. 2007). The presumed automaticity of mentalizing (Leslie 1987) would be consistent with the incidental nature of whatever processing was slowing down our participants in the retrieval phase. We also found areas that showed increased activation in the arrow-cued condition relative to the avatar-cued condition. These areas mainly included the inferior frontal gyrus, the middle frontal gyrus, and the inferior temporal gyrus. We speculate that these areas might be involved in calculating the body position and posture to be imagined, given that this is less well specified by the arrow than it is by the avatar (which serves as an example). However, further work will be required to assess this interpretation.
Our fMRI data revealed a network of areas, including bilateral parietal cortices, involved in remembering the locations of an array of objects. The additional effects of rotation of the participant's viewpoint or of the array revealed a role for the precuneus in spatial working memory and a role for the parieto-occipital sulcus/retrosplenial cortex in supporting movement of viewpoint. By contrast, array rotation was associated with activation of the intraparietal sulcus, in common with previous studies. These findings support suggestions that parieto-occipital sulcus/retrosplenial cortex serves to translate between allocentric representations in medial temporal lobe and egocentric representations in parietal cortex and can perform “spatial updating” of viewpoint as a consequence (Burgess, Becker, et al. 2001; Byrne et al. 2007). In addition, the paracingulate cortex was implicated in incidental processing of the perspective of the avatar used in some trials, potentially linking the neural mechanisms of spatial perspective taking with social perspective taking or mentalizing.
Medical Research Council (G0501672); European Union (“Wayfinding Project”).
We gratefully acknowledge the technical assistance of Guillaume Flandin, much help from Manja Lehmann and Rita Chung in piloting and collecting data, many useful discussions with John King and Uta Frith, and the Wellcome Trust Centre for Neuroimaging at UCL for providing help and scanning facilities. Conflict of Interest: None declared.