Human visual cortex shows retinotopic organization during both perception and attention, but whether this remains true for visual short-term memory (VSTM) is uncertain. In 2 functional magnetic resonance imaging experiments, we separated retinotopic activation during perception, attention, and VSTM maintenance. The 2 experiments differed in whether spatial encoding of the VSTM stimuli and prospective attention to the locations of the remembered items was encouraged or discouraged. Using multivoxel pattern analysis to extract a measure of spatial coding in early visual cortex, we saw sensory and attentional retinotopic coding in both experiments. However, significant spatial coding during memory maintenance was only seen where a spatial strategy was encouraged. Furthermore, individual differences in attentional spatial coding predicted performance in both experiments, while individual differences in maintenance spatial coding predicted performance in neither. We conclude that retinotopic coding in the early visual cortex during VSTM maintenance is not obligatory, that attentional processes during stimulus perception modulate memory performance, and that different attentional strategies are used depending on the task in hand.
Understanding of retinotopic occipital cortex (for review, see Wandell et al. 2007) has progressed from a simple proof of activation in response to visual stimulation (Belliveau et al. 1991; Kwong et al. 1992), to demonstrations of visual cortex topography (Schneider et al. 1993; DeYoe et al. 1994; Engel et al. 1994). Later accounts have shown that attention evokes a similar topography to sensory retinotopy in the occipital striate and extrastriate cortex (Tootell et al. 1998; Brefczynski and DeYoe 1999; Silver et al. 2005; Saygin and Sereno 2008).
On the one hand, despite the wealth of sensory and attentional paradigms, visual short-term memory (VSTM) paradigms exploring spatial coding in early visual cortex have been scarce. Many studies have chosen to describe spatial coding only in posterior parietal cortex (PPC) and prefrontal cortex (PFC). Two studies that did explore visual cortical responses (Sereno et al. 2001; Schluppeck et al. 2005), using a delayed saccade task, revealed no retinotopy. This absence of occipital spatial coding may have been due to the presence of a ring-shaped visual distracter mask covering all radial positions during the delay period, which likely resulted in spatially uninformative visual cortex activation overshadowing potential attentional and memory effects. Another study that did find spatial coding in the visual cortex also placed a patch of distracter dots at the approximate location of the stimulus during the delay period (Jack et al. 2007), meaning that the spatial coding seen there was likely due to perceptual activity. Two other studies had no distracting stimuli during the maintenance interval (Hagler and Sereno 2006; Hagler et al. 2007), but employed fixed periods for stimulus presentation and the memory delay, which, given the sluggishness of the blood oxygen level–dependent (BOLD) signal, hinders the clear separation of activations specific to perception and memory.
On the other hand, there have been several studies examining nonretinotopic aspects of activity in the early visual cortex during the delay period. One study found that univariate bulk visual cortex activity during the delay was negligible during a VSTM task, but stronger during an attentional task (Offen et al. 2009). Other work has used multivoxel pattern analysis (MVPA), a multivariate technique that examines whether the pattern of activity elicited in a region contains information about the experimental conditions (for a review of MVPA, see Norman et al. 2006). Two such studies (Harrison and Tong 2009; Serences et al. 2009) found that pattern activity in the visual cortex contained information relating to the stimulus orientation during the delay period. However, none of these studies assessed the degree of spatial coding during the delay period.
We therefore aimed to measure the spatial coding in the visual cortex during perception, attention, and VSTM, in particular whether spatial information is necessarily preserved during the delay period. We used temporal jittering techniques (Dale 1999) in order to separate the neural activity for the encoding, maintenance, and retrieval phases of the task. In addition, we used MVPA, basing our method on work by Haxby et al. (2001) and Kriegeskorte et al. (2006) to assess the spatial coding in visual cortex. Previous studies using delayed saccades may be conceptualized as sustained spatial attention (see Sereno et al. 2001) or even as motor preparation (Curtis and Connolly 2008) paradigms, in contrast to our feature-based paradigm. Therefore, in order to control for and examine spatial attention effects, we ran 2 studies, where spatial encoding and spatial prospective attention were either encouraged or discouraged during the delay period.
Materials and Methods
The present experiments were designed to characterize the spatial coding of perception, attention, and memory using MVPA. To this effect, we used a task where the memoranda were presented at 4 possible locations (Fig. 1—Experiment 1), and the coding of this location information was analyzed using MVPA in each of the 3 phases of the task. We also controlled for the role of prospective attention during the memory period by using a change detection paradigm where the change detection probe was presented centrally, rather than at the remembered location (Fig. 1—Experiment 2).
Nineteen individuals were scanned in Experiment 1: 4 participated in 1 imaging session, 15 participated in 2 sessions. Three sessions were rejected: 1 due to the lack of detectable retinotopy during the response period, 1 due to poor behavioral performance on the task, and 1 due to excessive head movement during the scanning. This resulted in 31 sessions for 17 participants (8 females) of mean age 25.6 (standard deviation [SD] 4.5).
Thirty-two individuals participated in a single imaging session each in Experiment 2. Six participants were rejected: 5 participants due to very low or chance performance on the task, 1 due to large movement (>10 mm), resulting in the area of interest moving outside the imaged echo-planar imaging (EPI) space. The remaining 26 participants (16 females) had mean age 26.2 (SD 4.6). Twenty-two of the participants in Experiment 2 also carried out one short (9 min, 260 EPI volumes) imaging block mapping sensory spatial coding.
The experiments were undertaken with the understanding and written consent of each subject, and subjects were paid for their participation. Visual acuity of participants was normal or corrected to normal with scanner-compatible glasses.
Experiment 1: Attentional, maintenance, and sensory retinotopy
The paradigm used in this study (Visual Basic.Net was used to present the visual stimuli and to collect the responses) resembles previous attentional retinotopy tasks, in which visual stimulation is presented in multiple places in the display, but attention is directed to a limited number of areas (Brefczynski and DeYoe 1999; Silver et al. 2005). The stimulus display consisted of 4 sectors presented around fixation, each sector occupying a quadrant of the visual field. Each sector was delimited from the others vertically and horizontally, and the sectors were separated from one another (10 radial degrees). The fixation cross was surrounded by an empty circular area (radius 1.63° of visual angle), in order to reduce the effect of small deviations from the fixation, and the sectors themselves subtended 4.48° from their innermost to their outermost bounds. Two of the sectors were attended, while the remaining 2 were ignored on the basis of their color (either magenta or cyan, chosen to be bright and easy to distinguish). Rather than vary the number of sectors attended, we required attention to 2 sectors in order to optimize the statistical power of each sector-specific regressor. Each of the 3 task blocks in each scanning session was divided into 9 sub-blocks in which all 6 possible combinations of 2 sectors were displayed in random order.
Due to a computing error, 10 of the sessions had a slight unbalancing, such that one combination was presented twice as often and another combination omitted. This led to a slight reduction in power in these sessions, although not a bias. Since both sessions were affected for only 1 participant, this loss of power should be compensated by their second sessions and those of unaffected subjects, which sum up to 21.
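The balanced design described above can be illustrated with a minimal sketch. The sector labels 0 to 3 are hypothetical and serve only to enumerate the pairs; this is not the presentation code used in the study.

```python
from itertools import combinations
from collections import Counter

# Hypothetical labels for the 4 quadrant sectors.
SECTORS = (0, 1, 2, 3)

# Every unordered pair of attended sectors: C(4, 2) = 6 combinations.
attended_pairs = list(combinations(SECTORS, 2))

# In a balanced sub-block, each sector is attended in 3 of the 6 combinations.
attended_counts = Counter(sector for pair in attended_pairs for sector in pair)

print(attended_pairs)
print(dict(attended_counts))
```

Presenting one combination twice and omitting another, as happened in the affected sessions, changes these per-sector counts, which is why power rather than bias is the concern.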
Participants were required to remember their final impression of the 2 attended sectors. To secure attention throughout the stimulus presentation period, the features within each sector changed (Blaser et al. 2000) over the duration of the attention period and froze during the last 300 ms at its end to permit encoding of the static feature information. Furthermore, the use of dynamically varying stimuli reduced the potential problem of informative afterimages in the memory delay, which could have facilitated performance and, more importantly, patterned sensory activation. In addition, the primary sensory afterimage would have comprised all 4 sectors, which was uncorrelated with the location of attention.
The duration of the attention period was jittered (1, 6, or 11 s), such that participants could not predict the offset of the stimulus display. The duration of the blank memory period was similarly randomized. The random jitter allowed us to dissociate the functional magnetic resonance imaging (fMRI) BOLD activity related to the encoding, maintenance, and retrieval phases (Dale 1999).
A response display appeared after the mnemonic period had elapsed. This display contained the 2 attended sectors, where either both gratings were identical to the remembered gratings, or where one grating was changed in terms of either spatial frequency or orientation. Participants performed a same–different discrimination with respect to the remembered sectors (2700 ms).
Experiment 2: Attentional and maintenance retinotopy
The task in Experiment 2 was similar to the one used in Experiment 1 and was designed as a control for prospective attention during the delay period. In Experiment 1, it is likely that participants could have deployed their attention to the locations of the attended stimuli on the display during the maintenance period, because the test stimuli were presented in these same locations. This prospective attention strategy could have created spatial coding unrelated to the maintenance of visual information in memory. In Experiment 2, the locations of the attended stimuli were made unpredictive of where the probe would appear, discouraging attention to those locations during the delay by presenting the probe centrally.
Four sectors were presented around fixation, identical in shape to the ones used in the previous experiment. However, in this experiment, the sectors were rotated by 45° with respect to the previous experiment, such that their borders were diagonal rather than vertical and horizontal. Our rationale for this was based on the properties of early visual cortex. While the V1 representation is composed of 2 hemifields, most topographic maps in occipital and parietal cortex are quarterfield representations divided by horizontal and vertical meridians (Wandell et al. 2007), and visual map representations corresponding to the same quarterfield are found side by side. This means that the stimuli in Experiment 1, each of which occupied an entire quarterfield, could not be used to distinguish the border between these quarterfield representations, potentially losing spatial coding precision. To attempt to capture more spatial information, and with a view to potentially tracing the borders between retinotopic areas, we rotated the stimuli so that they overlapped the horizontal and vertical meridian borders.
We encouraged the participants to attend to and remember the 2 relevant sectors separately. The test sector was shaped in the same way as one of the remembered sectors, cueing the participant to compare the presented test sector with a single relevant sector, from the 2 held in memory. The test sector contained features that were either identical to the relevant sector, or differed in orientation or spatial frequency.
Experiment 2: Sensory retinotopy
Because the probe display always placed the test sector centrally, we were not able to probe sensory retinotopy during the response period of the main experiment. Therefore, we devised a control task to this effect. The task consisted of monitoring a stream of central letters (changing every 0.50 s) for the presence of the letter X, whereupon the participants pressed a button. While the task was being performed, 2 sectors identical to the ones presented in the main task of Experiment 2 were presented for 11 s. As in Experiment 1, the positions of the sectors were counterbalanced to allow individual modeling of the retinotopic activations for each sector.
Imaging data were acquired with a Siemens TIM Trio 3T scanner, using a 12-channel head coil. Functional images were acquired with a custom high-resolution EPI sequence (repetition time = 2150 ms; echo time = 30 ms; flip angle = 78°; 32 slices of matrix 64 × 64 with a 25% gap; voxel size 2.4 × 2.4 × 2.4 mm³). The acquired volume covered the parietal and occipital cortex entirely, a restricted posterior part of the temporal cortex, and superior aspects of PFC. Each scanning session was subdivided into 3 blocks of EPI, with 430 scans, the first 7 scans of which were discarded to allow magnetization to reach a steady state. A 390-scan sequence without breaks between task sub-blocks was used for the first 4 sessions, 2 of which were subsequently analyzed, whereas the following sessions contained rest periods used to obtain an implicit measure of baseline activity in the subjects. A T1-weighted structural image of the entire brain was also acquired (MPRAGE, 1-mm isotropic resolution).
We extracted the mean amount of information in VSTM held during the experiment by each participant, aggregating behavioral performance across 2 sessions where necessary. This measure was extracted by using Cowan's K (Cowan 2001). This expresses the amount of feature information the participants remembered in terms of the number of perfectly remembered items, based on the proportion of correct responses (P(C)) and the number of items to be remembered (N): $K_N = (2 \times P(C)_N - 1) \times N$. When computing Cowan's K, we ignored trials where no response was given. Comparisons using Cowan's K led to the same conclusions as those using the raw proportion of correct responses; therefore, we report only the former. In addition, we extracted reaction time (RT) measures from each participant.
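As a worked illustration of the capacity measure, the following minimal sketch implements the formula as stated in the text. The function name `cowan_k` and the input check are illustrative additions, not part of the original analysis code.

```python
def cowan_k(prop_correct: float, n_items: int) -> float:
    """Capacity estimate K = (2 * P(C) - 1) * N, as given in the text.

    prop_correct: proportion of correct responses, P(C), in [0, 1]
    n_items:      number of items to be remembered, N
    """
    if not 0.0 <= prop_correct <= 1.0:
        raise ValueError("prop_correct must lie between 0 and 1")
    return (2.0 * prop_correct - 1.0) * n_items


# With 2 sectors to remember, chance performance (0.5) gives K = 0
# and perfect performance (1.0) gives K = 2.
print(cowan_k(0.5, 2), cowan_k(0.75, 2), cowan_k(1.0, 2))
```

Under this formula, a participant who responds correctly on 75% of trials with N = 2 has K = 1, that is, one sector's worth of feature information.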
Participants were instructed to maintain fixation throughout the entire scan duration. We tested fixation performance by obtaining eye-tracking data during Experiment 2 only (15 sessions). Eye movements were monitored using an fMRI-compatible infrared eye-tracking camera (50-Hz acquisition rate, SensoMotoric Instruments, Germany). The eye tracker was calibrated at the start of the experiment.
Eye-tracking data were analyzed using custom code developed in Matlab. To assess the quality of the data, we measured the variability (i.e., the root mean square deviation from mean fixation position) for each participant, and then averaged across participants. During attention, measurement variability was 0.24° horizontally (SD 0.07) and 0.46° vertically (SD 0.11). This difference was significant (t(16) = 8.79, P < 0.001). During maintenance, measurement variability was 0.25° horizontally (SD 0.08) and 0.41° vertically (SD 0.12). This difference was also highly significant (t(16) = 5.34, P < 0.001). The increased variability in the vertical plane appeared to be caused by pupil isolation errors due to the shadow of the participant's eyelid. Since this artifact rendered vertical eye movements unreliable, all remaining analyses were restricted to the horizontal plane. Furthermore, analyses were focused on the key attentional and maintenance periods of the task.
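The variability measure can be sketched as follows. This is a minimal Python illustration of a root mean square deviation from the mean position, not the Matlab code used in the study; the function name and the sample values are hypothetical.

```python
import math


def rms_deviation(positions):
    """Root mean square deviation of gaze samples from their mean position.

    positions: gaze coordinates along one axis, in degrees of visual angle.
    """
    if not positions:
        raise ValueError("at least one sample is required")
    mean_position = sum(positions) / len(positions)
    squared_dev = [(p - mean_position) ** 2 for p in positions]
    return math.sqrt(sum(squared_dev) / len(squared_dev))


# Hypothetical horizontal samples (degrees) scattered around fixation.
horizontal_samples = [0.1, -0.2, 0.3, -0.1, 0.0, 0.2, -0.3]
print(rms_deviation(horizontal_samples))
```

Computed per participant and per axis, values of this kind would then be averaged across participants and compared between the horizontal and vertical planes.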
Two participants showed excessive (2 SDs above the sample mean) eye position shifts. Specifically, eye movements were detected, respectively, in these participants in 45.5% and 36.5% of trials during the attentional period. The first of these participants also showed excessive eye movements during the maintenance period: 42.6%. We excluded these participants from analysis, which did not significantly alter the pattern of results.
The remaining participants showed eye movements in the attentional phase on an average of 6.1% of trials (SD 5.8%). In the maintenance phase, eye movements occurred on an average of 6.5% of trials (SD 8.0%). This is a small proportion of the trials and is unlikely to explain the difference between experiments during the maintenance phase. Nevertheless, we repeated our analyses while additionally excluding those trials that contained significant eye movements, but this did not significantly change the conclusions of the experiment. Furthermore, eye movements during neither attention nor maintenance correlated with spatial coding during these epochs anywhere in the brain.
The analysis of functional data was carried out in SPM 5 (Wellcome Department of Imaging Neuroscience, London, United Kingdom; http://www.fil.ion.ucl.ac.uk/spm). Preprocessing included slice-time and motion corrections, mutual information coregistration to the structural scan, nonlinear normalization to the Montreal Neurological Institute template brain (Mazziotta et al. 1995), spatial filtering with a 10-mm full-width half-maximum Gaussian kernel, and high-pass filtering with a 128-s cutoff to reduce low-frequency drift in the functional signal. These steps were automated through the use of the automatic analysis library (http://www.github.com/rhodricusack/automaticanalysis).
Using a general linear model, regressors were fitted to the time-series at each voxel in each of the 3 imaging blocks. Three sets of 4 regressors modeled the neural response related to the attention, maintenance, and response retinotopy phases for each of the 4 quadrants attended. Each of the phases was modeled as an epoch of either variable duration for attention and maintenance (these durations were randomly jittered to 1, 6, or 11 s and counterbalanced in order to help separate the selective responses to each regressor independently of the others) or constant duration (sensory retinotopy). Since the appearance of each sector was independent of the others, it was possible to extract activity selective for each quadrant at each epoch. These regressors were convolved with the canonical haemodynamic response function, producing a modeled time-course of neural activity. All analyses contained regressors modeling the mean effect of the block and 6 unconvolved regressors to control for head movement.
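To illustrate how a variable-duration epoch becomes a model regressor, the sketch below builds a boxcar and convolves it with a double-gamma response function. The shape parameters (6 and 16, with an undershoot ratio of 1/6) are an assumed, commonly used approximation and are not taken from SPM's code; the function names, the 0.1-s grid, and the example onsets are hypothetical.

```python
import math

TR = 2.15  # repetition time in seconds, from the acquisition parameters


def double_gamma_hrf(t, peak_shape=6.0, under_shape=16.0, ratio=1.0 / 6.0):
    """Assumed double-gamma response at time t (s); zero for t <= 0."""
    if t <= 0.0:
        return 0.0

    def gamma_pdf(shape):
        return t ** (shape - 1.0) * math.exp(-t) / math.gamma(shape)

    return gamma_pdf(peak_shape) - ratio * gamma_pdf(under_shape)


def epoch_regressor(onsets, durations, n_scans, tr=TR, dt=0.1):
    """Boxcar epochs convolved with the HRF, sampled once per scan."""
    n_fine = int(round(n_scans * tr / dt))
    boxcar = [0.0] * n_fine
    for onset, duration in zip(onsets, durations):
        start = int(round(onset / dt))
        stop = min(int(round((onset + duration) / dt)), n_fine)
        for i in range(start, stop):
            boxcar[i] = 1.0
    kernel = [double_gamma_hrf(k * dt) * dt for k in range(int(round(32.0 / dt)))]
    convolved = [0.0] * n_fine
    for i, value in enumerate(boxcar):
        if value:
            for k, h in enumerate(kernel):
                if i + k < n_fine:
                    convolved[i + k] += value * h
    return [convolved[min(int(round(s * tr / dt)), n_fine - 1)] for s in range(n_scans)]


# Hypothetical epochs with jittered durations of 1, 6, and 11 s.
regressor = epoch_regressor(onsets=[10.0, 40.0, 75.0], durations=[1.0, 6.0, 11.0], n_scans=60)
print([round(v, 3) for v in regressor[:12]])
```

Because the epochs differ in duration from trial to trial, regressors for successive phases are less collinear than they would be with fixed timings, which is the purpose of the jitter.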
These regressors were combined to extract voxel-wise contrasts related to the overall neural activity in each epoch, and to the sector-specific activity in each quadrant and each epoch (calculated as the difference between the activity related to a particular sector and the average activity related to the 3 other sectors). Individual contrasts were included in a random-effects group analysis and reported at a corrected threshold [false discovery rate (FDR) P < 0.05].
We also performed a region of interest (ROI) analysis to observe the activation of individual areas to the various epochs of our task. The analysis used both anatomically defined ROIs, including V1 and V2 (Amunts et al. 2000; Wohlschläger et al. 2005) at a threshold of 6 (out of 10) overlapping brains.
MVPA is an analysis technique typically applied to fMRI data (although see Ashburner (2007) for its use on structural data). Unlike univariate analysis, it does not test the activation of individual voxels, but instead examines the activation of many voxels at once, to look at the spatial pattern of activity within a local region (Haxby et al. 2001; Norman et al. 2006). While univariate analysis is “activation-based,” MVPA is “information-based” functional brain mapping (Kriegeskorte et al. 2006). The principal steps of our MVPA procedure are summarized in Figure 2 and described in more detail below.
To perform MVPA, we extracted the voxel-wise unsmoothed beta values for each ROI and each quadrant- and epoch-specific regressor. This resulted in 12 voxel-wise sets of beta values for each of the 3 blocks of the session. The sets of each block were correlated with the 12 sets of beta values from the other 2 blocks. This yielded 3 unique block-to-block comparisons, each comprising a 12 by 12 correlation matrix of each epoch and quadrant with one another. A number of hypotheses were then evaluated in each participant, using these correlation matrices as the dependent variable in a general linear model (GLM) analysis, where a contrast matrix served as the predictor.
The contrast matrix used here tested for the spatial coding of the sector locations during the attentional, maintenance, and response periods. This contrasts within-quadrant correlations against between-quadrant correlations (weighted to avoid a bias where a different number of cells are positive or negative). This is depicted graphically in Figure 3.
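The logic of the spatial contrast can be sketched in simplified form. The example below handles a single epoch (a 4 by 4 block-to-block correlation matrix rather than the full 12 by 12 matrix) and replaces the GLM with a difference of means between within-quadrant and between-quadrant correlations; the function names and the synthetic data are hypothetical.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation between two equal-length voxel patterns."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)


def spatial_contrast(block_a, block_b):
    """Mean within-quadrant minus mean between-quadrant correlation.

    block_a, block_b: lists of 4 voxel patterns (one per quadrant),
    each estimated from a different imaging block.
    """
    within, between = [], []
    for i, pattern_a in enumerate(block_a):
        for j, pattern_b in enumerate(block_b):
            (within if i == j else between).append(pearson(pattern_a, pattern_b))
    return sum(within) / len(within) - sum(between) / len(between)


# Synthetic data: each quadrant has a stable pattern plus block-specific noise.
random.seed(0)
n_voxels = 50
templates = [[random.gauss(0, 1) for _ in range(n_voxels)] for _ in range(4)]


def noisy(pattern, sd=0.5):
    return [value + random.gauss(0, sd) for value in pattern]


block_a = [noisy(t) for t in templates]
block_b = [noisy(t) for t in templates]
contrast = spatial_contrast(block_a, block_b)
print(round(contrast, 3))
```

Taking means separately over the 4 within-quadrant and 12 between-quadrant cells plays the role of the weighting mentioned above: the unequal number of cells does not bias the contrast. A positive value indicates that patterns for the same quadrant replicate across blocks better than patterns for different quadrants.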
By applying these 3 measures to the data of each subject and then carrying out group statistics on the summary values, we arrived at a number of results that indicate whether an area shows retinotopic response properties. The MVPA was performed in functional ROIs used in the univariate ROI analysis described above, and also using searchlight mapping (Kriegeskorte et al. 2006) in the entire volume.
Searchlight mapping was performed on the native space images of each participant by moving a spherical ROI of 4 voxel radius (∼10 mm) through the gray-matter masked volume one voxel at a time. Resultant single-subject statistics were mapped back to the center voxel of each spherical ROI, thus yielding a single-subject information map that was entered into a group analysis. Analysis was restricted to searchlights that contained at least 30 voxels. The first-level results were normalized, and a second-level model was carried out to examine the spatial coding at the group level.
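The searchlight geometry can be illustrated with a short sketch. The coordinate convention, the function names, and the toy cubic mask are assumptions made for illustration; only the 4-voxel radius and the 30-voxel minimum come from the text.

```python
RADIUS_VOXELS = 4   # searchlight radius from the text (about 10 mm at 2.4 mm voxels)
MIN_VOXELS = 30     # searchlights with fewer in-mask voxels are skipped


def sphere_offsets(radius):
    """Integer (dx, dy, dz) offsets lying within a sphere of the given radius."""
    r = int(radius)
    span = range(-r, r + 1)
    return [
        (dx, dy, dz)
        for dx in span
        for dy in span
        for dz in span
        if dx * dx + dy * dy + dz * dz <= radius * radius
    ]


def searchlight_voxels(center, mask, offsets):
    """In-mask voxel coordinates of the searchlight centered on `center`."""
    cx, cy, cz = center
    candidates = ((cx + dx, cy + dy, cz + dz) for dx, dy, dz in offsets)
    return [voxel for voxel in candidates if voxel in mask]


offsets = sphere_offsets(RADIUS_VOXELS)

# Toy gray-matter mask: a 10 x 10 x 10 cube of voxel coordinates.
mask = {(x, y, z) for x in range(10) for y in range(10) for z in range(10)}

for center in [(5, 5, 5), (0, 0, 0)]:
    voxels = searchlight_voxels(center, mask, offsets)
    status = "analyzed" if len(voxels) >= MIN_VOXELS else "skipped"
    print(center, len(voxels), status)
```

A full sphere of this radius contains 257 voxels, so the 30-voxel minimum mainly excludes searchlights at the edges of the gray-matter mask, where most of the sphere falls outside it.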
Individual Differences in Performance
Performance differences in the task could originate in a number of distinct processes that are related to task success. We consider 3 factors. The first is selective attention, which allows subjects to attend to only relevant information in the scene, allowing visual processing of the target sectors while ignoring distracter sectors. Therefore, good attentional selection should aid the encoding of the relevant sectors into memory, with minimal perceptual interference from task-irrelevant information. The second factor is, naturally, memory capacity, which determines how much visual information can be maintained in memory. This mnemonic capacity dictates the number of representations that can be maintained, as well as the resolution and detail of each representation, which in turn relates to how well feature changes can be detected. The third factor is the comparison and decision-making processes occurring at the probe phase, where the actual change detection occurs.
All 3 factors can influence task performance, so we need a way to separate their individual contribution. Let us consider that we can separate the attentional, memory capacity, and decision-making processes into the attentional, mnemonic, and response epochs of our visual task. Thus, it is possible to relate behavioral performance on the task to various univariate and multivariate measures in each of the epochs, thus allowing us to explore which aspects of neural responses relate to each of these 3 factors. For simplicity, we restricted this analysis to our preselected anatomical V1/V2 ROI (Amunts et al. 2000; Wohlschläger et al. 2005).
To maximize sensitivity to individual differences, we chose to pool the results of both studies together. However, given that the 2 studies are not identical, this may also confound true individual differences effects with study-specific effects. Therefore, we ran an analysis of covariance (ANCOVA), with individual differences as one predicting variable and study as the other, and with each ROI contrast as the dependent variable. This analysis covaries out the study effect from our other comparisons of interest, namely the linear relationship between performance and either bulk activity or spatial coding. We first examine any differences between studies, then the common linear relationship with performance, and finally the interaction between study and performance.
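As a sketch, the ANCOVA described above can be written as a linear model with three effects. The notation is illustrative and is not taken from the analysis itself: $y_i$ is the ROI contrast for participant $i$, $s_i$ codes the study (for example, 0 for Experiment 1 and 1 for Experiment 2), and $p_i$ is that participant's behavioral performance.

$$
y_i = \beta_0 + \beta_1 s_i + \beta_2 p_i + \beta_3 (s_i \times p_i) + \varepsilon_i
$$

In this form, $\beta_1$ captures the difference between studies, $\beta_2$ the common linear relationship with performance, and $\beta_3$ the interaction between study and performance, matching the order in which the effects are examined.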
Participants in Experiment 1 remembered significantly more feature information than that contained in one sector, such that Cowan's K was greater than 1 (t(16) = 3.10, P < 0.005). In addition, participants who performed 2 scanning sessions of the experiment showed no improvement across the 2 sessions [t(13) = 0.61, not significant (n.s.)]. For those in Experiment 2, the amount remembered did not differ significantly from a single sector's worth of information (t(25) = 0.95, n.s.). Comparing the experiments with regard to the amount of information held in memory shows that participants in Experiment 1 remembered more information than those in Experiment 2 (t(41) = 2.55, P < 0.025).
The average RT across participants in Experiment 1 was 1169 ms (SD 212 ms) and was greater for incorrect trials than for correct trials (t(16) = 8.13, P < 0.001), but did not change between sessions (t(13) = 0.73, n.s.). In Experiment 2, the mean RT across participants was 1234 ms (SD 174 ms) and was higher for incorrect than correct trials (t(25) = 9.32, P < 0.001). Mean RT did not differ between the 2 experiments (t(41) = 1.13, n.s.). The increased RT for incorrect trials relative to correct ones suggests that the experiment is unlikely to have been affected by speed–accuracy tradeoffs, such that errors likely reflect a failed mnemonic representation, rather than impulsive responding.
The sensory retinotopy task in Experiment 2 showed ceiling performance (mean accuracy 95.6% correct, SD 2.6%). The average RT was 648.1 ms (SD 143.7 ms). There was no significant relationship between performance on the task and RT (r = −0.27, n.s.).
Bulk Activity (Univariate, Experiments 1 and 2)
In both experiments, we saw very similar univariate responses; therefore, we will discuss the bulk activity results together. These can be seen side by side in Figure 4.
The attentional period showed an extended locus of visual cortex activation, in occipital cortex, extending ventrolaterally into the occipito-temporal junction and dorsally into the parietal cortex. Also evident are small loci of thalamic activation. It also shows some deactivation in the occipital pole, which may reflect attention away from the fovea, since the sectors are presented parafoveally. Other deactivations, such as those typically seen in the temporo-parietal junction (TPJ) during VSTM tasks (Todd et al. 2005; Anticevic et al. 2010) and in the medial surface of the hemispheres, are consistent with the “default mode network”, originally found by observing task-induced deactivations (Raichle et al. 2001; Fox et al. 2005; Fox and Raichle 2007).
We see deactivation of the occipital visual cortex during maintenance, particularly at the loci where slight deactivation was seen during attention. PPC appears strongly activated, and there remains some evidence of the deactivation of TPJ.
During the response period, we saw similar activation of the visual cortex and thalamus, as well as dorsolateral prefrontal and posterior parietal activations, and deactivation of the lateral parietal cortex and precuneus.
We also directly compared the 2 experiments for differences in activation. This revealed significant differences between the epochs, with increased activations and deactivations in Experiment 1 relative to Experiment 2 across a set of regions, particularly in the attention and response periods. With a few exceptions (Supplementary Tables 1–3), these differences typically reflected differences in the strength of activity, rather than reflecting areas (de-)activated in Experiment 1 but inactive in Experiment 2, suggesting a subtle but significant difference in power across the 2 experiments.
Coding of Spatial Information (Multivariate, Sensory Experiment)
Response patterns in occipital cortex during the sensory experiment showed a correlation with the spatially-specific predictor (the “spatial contrast”), thus demonstrating that these areas showed spatial coding and that our method reliably detects it at a whole-brain–corrected statistical threshold (Fig. 5, bottom row, center column). This serves as our baseline measure of retinotopy, against which we can compare spatial coding in the other experiments.
Coding of Spatial Information (Multivariate, Experiments 1 and 2)
The results of MVPA spatial coding are depicted in Figure 5 for Experiments 1 (left column) and 2 (middle column). Both experiments required attention to the locations of the attended items initially, and both showed similar spatial coding throughout much of the occipital cortex, focusing around the locations of V1 and V2.
During the response period of Experiment 1, we observed spatial coding, strongest around areas V1/V2. In Experiment 2, while spatial coding was also seen, this appeared to be of markedly lesser extent and localized to the occipital poles.
The maintenance period produced clear differences between the experiments. In Experiment 1, we saw spatial coding during maintenance, which was still localized in areas V1/V2. In Experiment 2, however, we saw no significant spatial coding.
A direct comparison between the experiments revealed that the differences in spatial coding across the 2 experiments were significant only during the response period, with Experiment 1 exhibiting significantly stronger coding primarily in the areas V1/V2 of occipital cortex, but extending parietally into the angular gyrus (Supplementary Table 4).
For increased sensitivity, we also ran an analysis within the cytoarchitectonic ROI comprising V1 and V2. The MVPA spatial contrast on this V1/V2 ROI showed a similar pattern of results to our searchlight.
In Experiment 1, bulk activity was elevated relative to rest in the response (µ = 0.547; t(31) = 12.56; P < 0.001) and the attention (µ = 0.642; t(31) = 10.07; P < 0.001) phases, but was deactivated during maintenance (µ = −0.131; t(31) = 3.24; P < 0.005).
In Experiment 2, as in Experiment 1, we saw activation during the response (µ = 0.315; t(31) = 13.79; P < 0.001) and the attention (µ = 0.432; t(31) = 13.38; P < 0.001) periods. The maintenance period showed a trend toward deactivation (µ = −0.042; t(31) = 1.58; n.s.).
Comparison between Experiments 1 and 2
Although qualitatively, the pattern of activity was similar across the 2 experiments, there were some differences in the visual cortex. Bulk activity was greater for Experiment 1 than Experiment 2 during the response (Δµ = 0.232; t(56) = 4.22; P < 0.001) and attention (Δµ = 0.209; t(56) = 2.70; P < 0.01) periods. During the maintenance period, activity was greater in Experiment 2, but this difference was not significant (Δµ = −0.089; t(56) = 1.83; n.s.).
We present the mean spatial coding in terms of the β values from the GLM analysis on the correlations.
First, Experiment 1 showed consistent spatial coding during the response (µ = 0.119; t(31) = 14.20; P < 0.001), attention (µ = 0.026; t(31) = 9.02; P < 0.001), and memory maintenance epochs (µ = 0.032; t(31) = 5.40; P < 0.001).
Furthermore, we tested whether the pattern of spatial coding from one epoch was similar to that found in another epoch. Crucially, this coding was similar between all epochs, suggesting the spatial coding is consistent across different types of sensory, attentional, and mnemonic processes (attention vs. maintenance: µ = 0.011, t(31) = 4.07, P < 0.001; attention vs. response: µ = 0.036, t(31) = 10.18, P < 0.001; maintenance vs. response: µ = 0.042, t(31) = 9.74, P < 0.001).
The sensory block of Experiment 2 revealed spatial coding (µ = 0.077; t(21) = 8.55, P < 0.001). Within the main task, spatial coding was seen in the attention phase (µ = 0.017; t(25) = 12.70, P < 0.001).
Importantly, contrary to the results in Experiment 1, the maintenance period showed no retinotopy in the visual cortex (t(25) = 1.77, n.s.).
However, contrary to our prediction that the response period would result in no spatial coding, given the central, nonspatially informative presentation of the probes, we did see spatial coding during this phase (t(25) = 4.26, P < 0.005).
We again compared the spatial pattern across the 3 epochs and found that spatial coding in maintenance was not similar to that in the other 2 epochs (attention vs. maintenance: t(25) = 0.25; n.s.; maintenance vs. response: t(25) = 1.62; n.s.). Importantly, spatial coding across the attentional and response periods showed a negative value (attention vs. response: t(25) = 5.35; P < 0.001), suggesting that the nature of the spatial code is dissimilar across the attentional and response epochs.
Comparison of Experiments 1 and 2
Contrasting the spatial coding in V1/V2 revealed the expected significant difference in spatial coding during the memory maintenance (Δµ = 0.026; t(56) = 3.41; P < 0.005) period, confirming the results of the searchlight analysis and suggesting that spatial coding during the maintenance period was greater in Experiment 1 than in Experiment 2. Similarly, we found a difference in the response (Δµ = 0.110; t(56) = 10.53; P < 0.001) period. We also found a quantitatively smaller, but also significant difference in the attentional period (Δµ = 0.009; t(56) = 2.40; P < 0.05).
To compare the encoding and maintenance spatial coding (Fig. 6) in the 2 studies, we ran a mixed analysis of variance on the between-subjects factor of study and the within-subject factor of epoch (attention or maintenance). This revealed a main effect of study (F1,55 = 14.85, P < 0.001) and an interaction between study and epoch (F1,55 = 4.81, P < 0.05), but no main effect of epoch (F1,55 < 1, n.s.). This confirms that while there were differences between studies in both the attention and maintenance epochs, this difference was greater in the maintenance period, arguing against a simple power explanation and suggesting that the spatial coding in the maintenance period was differentially affected by our probe manipulation.
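For a design with one 2-level between-subjects factor and one 2-level within-subject factor, the interaction can be tested as an independent-samples t-test on each participant's difference score (attention minus maintenance), with F(1, N − 2) = t². The sketch below shows this equivalence on invented numbers; the variable names and values are hypothetical, and it is not the analysis code used here.

```python
import math

def pooled_t(group1, group2):
    """Independent-samples t-test with pooled variance; returns (t, df)."""
    n1, n2 = len(group1), len(group2)
    m1, m2 = sum(group1) / n1, sum(group2) / n2
    ss1 = sum((v - m1) ** 2 for v in group1)
    ss2 = sum((v - m2) ** 2 for v in group2)
    df = n1 + n2 - 2
    pooled_var = (ss1 + ss2) / df
    se = math.sqrt(pooled_var * (1 / n1 + 1 / n2))
    return (m1 - m2) / se, df

def interaction_f(study1, study2):
    """Study x epoch interaction as F = t**2 on difference scores.

    study1, study2: lists of (attention, maintenance) pairs, one per participant.
    """
    d1 = [attention - maintenance for attention, maintenance in study1]
    d2 = [attention - maintenance for attention, maintenance in study2]
    t, df = pooled_t(d1, d2)
    return t ** 2, df

# Hypothetical spatial coding values (attention, maintenance) per participant.
exp1 = [(0.03, 0.04), (0.02, 0.03), (0.03, 0.02), (0.02, 0.04)]
exp2 = [(0.02, 0.00), (0.01, 0.01), (0.02, -0.01), (0.02, 0.01)]

f_value, df = interaction_f(exp1, exp2)  # F statistic on (1, df) degrees of freedom
```

A large F here means that the attention-minus-maintenance difference is not the same in the 2 studies, which is the pattern expected if only the maintenance epoch is affected by the manipulation.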
Individual Differences ANCOVA
We related differences in performance across participants with differences in the BOLD bulk activity or the degree of spatial coding as measured with MVPA.
Bulk Activity Versus Performance
When comparing the relationship between performance and bulk activity, the ANCOVA found a significant effect of study during the attentional (F1,53 = 6.74, P < 0.05) and response (F1,53 = 15.70, P < 0.001) periods, but not in the maintenance period (F1,53 = 3.61, n.s.). Furthermore, bulk activity was not related to performance in any of these epochs (F1,53 < 1, n.s.), nor was there any interaction between study and performance (F1,53 < 1, n.s.).
Spatial Coding Versus Performance
We found an effect of study during the maintenance (F1,53 = 8.32, P < 0.01) and response (F1,53 = 96.31, P < 0.001) periods, but not during attention (F1,53 = 2.15, n.s.).
However, the attentional period displayed a significant effect of performance (F1,53 = 19.58, P < 0.001) and an interaction between study and performance (F1,53 = 5.81, P < 0.05). In contrast, the maintenance and response periods showed neither the effect of performance (F1,53 < 2.13, n.s.) nor the interaction (F1,53 < 1.11, n.s.).
To examine the relationship between attentional spatial coding and performance, we looked at the correlations within each study. This revealed a strongly significant correlation in Experiment 1 (attention: r = 0.59, P < 0.001), whereas in Experiment 2 the relationship was in the same direction but only marginally significant (attention: r = 0.33, P = 0.051).
The scatterplot for the spatial coding results for the 2 experiments during encoding and maintenance can be seen in Figure 7.
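The logic of the study-by-performance interaction can be sketched as follows: in a model containing study, performance, and their interaction, the interaction term corresponds to the difference between the slopes of the regression lines fitted separately within each study. The example below fits those lines and computes the within-study Pearson correlations on invented data; all values and names are hypothetical, and it omits the significance tests of the full ANCOVA.

```python
import math

def linear_fit(x, y):
    """Least-squares line y = intercept + slope * x, plus Pearson r."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    slope = sxy / sxx
    intercept = my - slope * mx
    r = sxy / math.sqrt(sxx * syy)
    return slope, intercept, r

# Hypothetical per-participant behavioral accuracy and attentional spatial coding.
performance_1 = [0.70, 0.75, 0.80, 0.85, 0.90]
coding_1 = [0.010, 0.020, 0.022, 0.035, 0.040]
performance_2 = [0.60, 0.65, 0.70, 0.75, 0.80]
coding_2 = [0.012, 0.015, 0.014, 0.020, 0.019]

slope_1, _, r_1 = linear_fit(performance_1, coding_1)
slope_2, _, r_2 = linear_fit(performance_2, coding_2)

# Difference in slopes: the quantity estimated by the interaction term.
slope_difference = slope_1 - slope_2
```

In this toy case both studies show a positive relationship, but it is steeper in the first, which is the kind of pattern that yields both an effect of performance and a study-by-performance interaction.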
The results show that spatial coding can occur during the maintenance period of visual memory tasks and that it can be detected and localized using MVPA. However, they show it is not an automatic, obligatory feature of visual memories, which places important constraints on the nature of the neural code. Spatial coding disappeared in Experiment 2 when the possibility for prospective attention to the upcoming locations of the probe was removed, suggesting that the cause of the spatial coding during maintenance in Experiment 1 was prospective attention and not memory per se. Alternatively, participants might have changed strategy in response to the slightly different task requirements and focused less on encoding spatial information in the second experiment. In either case, this indicates that retinotopic coding is not an obligatory feature of visual memory, but is evoked by specific attentional or task strategy requirements, which may be epiphenomenal to the memory itself. This was not due to a difference in sensitivity across experiments, as we found similar univariate results and only a small difference in attentional retinotopy in multivariate measures. Furthermore, the strength of attentional spatial coding reliably predicted behavioral performance on the task, while the maintenance period spatial coding did not, regardless of whether prospective attention was encouraged or discouraged.
Relationship to Other Studies
Previous experiments addressing spatial VSTM have typically used delayed saccades (Sereno et al. 2001; Schluppeck et al. 2005, 2006; Hagler et al. 2007; Jack et al. 2007; Kastner et al. 2007) or delayed responses (Hagler et al. 2007; Kastner et al. 2007). Our results suggest that spatial and feature VSTM may rely on independent mechanisms, as spatial coding is modulated by the presence or absence of spatial requirements in a VSTM task. Furthermore, our data raise the possibility that effects in previous studies are due to prospective attention rather than memory per se, given that performance on our VSTM task was uncorrelated with spatial coding during maintenance. However, the distinction between spatial attention and memory is itself controversial, with some researchers arguing that they rely on overlapping neural mechanisms (Awh and Jonides 2001; Awh et al. 2006). Indeed, a PPC area argued to track the memory representations also appears to track the number of objects that are attended (Mitchell and Cusack 2008; Xu 2009).
Another intriguing possibility is that spatial working memory may be viewed not only as a case of sustained spatial attention, but also as a special case of sustained feature attention (e.g. “coherence fields” in Rensink 2000). Were this the case, the crucial distinction would be the task relevance and usefulness of this spatial activity, since during delayed saccades this information is vital to perform the task, while in our first experiment spatial information and the resulting spatial activity were task irrelevant. If previous spatial delayed response studies truly reflect memory, we would predict that the degree of spatial coding in spatial delayed saccade and delayed response tasks during the maintenance period should correlate with task performance.
Contrasting with our own results, there have been recent accounts where the early visual cortex appears to hold visual information during the VSTM delay period. For instance, recent findings show that visual cortex may contain information on the feature contents of VSTM (Harrison and Tong 2009). In their experiment, a classifier was able to decode the orientation of a grating from MVPA patterns in the early visual cortex during the delay period, even in those cases where bulk activity was indistinguishable from the baseline. Other emerging accounts suggest that the remembered orientation is held in foveal visual cortex, regardless of visual eccentricity (Shin and Kanwisher 2010). This last study also suggests that spatial information is abstracted out when it is not needed for the task at hand. However, it remains unclear whether this apparent discrepancy between coding of orientation and location in the early visual cortex is due to an intrinsic difference in the processing of location and orientation or due to the relative task relevance of the feature information over spatial information. It is also unclear whether these studies reflect true memory maintenance or an effect of prospective feature attention to the expected stimulus orientation. Our own results are generally compatible with the above accounts, although we cannot speak to feature encoding in our own experiments, which were not optimized for such an analysis.
In summary, our results argue that spatial coding during VSTM is not obligatory, but that it depends on task demands. This runs counter to the original proposal by Luck and Vogel (1997), who proposed that all features (including spatial location) are encoded in VSTM regardless of the actual task demands. Our own results extend recent demonstrations that visual cortex holds only the color or orientation information that is attended (Serences et al. 2009) by showing that spatial information is not excluded from such filtering. Whether similar limitations apply for the special case of visual imagery is not clear, but would benefit from further scrutiny (Slotnick et al. 2005).
The Response Period
Contrary to our expectations, we found that the response period in Experiment 2 showed significant spatial coding, even though the sectors were presented centrally in this task. We propose, however, that this coding was not spatial in nature, but related to the differences in the shape of the different sectors (which were necessary to ensure that the subject was cued to the correct spatial sector in memory).
This conclusion is supported by our ROI analysis of Experiment 2, which found that the pattern of spatial coding during the response period was significantly dissimilar to that in the attentional period, unlike in Experiment 1.
We observed that the strength of spatial coding during the attentional period was positively correlated with performance. This follows the logic of previous reports that the separability of multivoxel activation patterns can be related to behavioral performance (Raizada et al. 2009). Further, this supports previous reports (Vogel et al. 2005; Cowan and Morey 2006; McNab and Klingberg 2008; Cusack et al. 2009; Linke et al. 2011) that processes during encoding dominate individual differences in memory capacity. In our study, retinotopy during the attentional period should be stronger in participants who were able to direct attention to the relevant stimuli, and this attentional focus on the relevant items should aid the selection and encoding of the relevant information for maintenance. From this logic, it follows that participants who show stronger attentional retinotopy will be better at encoding the relevant information and will therefore do better on the task.
Furthermore, we saw that bulk activity in the early visual cortex was neither significant (Harrison and Tong 2009; Offen et al. 2009; Serences et al. 2009), nor correlated with attentional performance, which may appear surprising given previous reports that bulk activity in V1 is positively related to performance on an attentional task (Ress et al. 2000; Ress and Heeger 2003; Silver et al. 2007). However, note that the tasks used by Ress and colleagues were visual detection tasks at the threshold of vision, whereas our task is an above-threshold selective attention task. Indeed, Ress and colleagues showed that the visual cortex ceased to show marked attentional responses to no-stimulus trials when the stimuli presented were raised above threshold (Ress et al. 2000). Furthermore, Silver et al. (2007) found that attentional activity is spatially selective for the stimulus location, such that sustained activity increases for attended locations and decreases for unattended locations. It has been suggested that attention is not encoded as bulk activation or deactivation of the visual cortex, but as an asymmetry in activation (Sylvester et al. 2007). In our experiment, each location was equally often attended and unattended, so it is reasonable to assume that the selective attentional activation and inattentional deactivation should cancel each other out when averaged. This activation asymmetry also explains our ability to extract retinotopic information during the attentional period and in the maintenance period where prospective attention was encouraged. Importantly, behavioral performance is correlated with attentional activation in the relevant location only, which explains why the strength of attentional retinotopy was correlated with performance whereas bulk activity was not.
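The cancellation argument can be illustrated with a toy simulation. It assumes, purely for illustration, that attending to a sector raises the response of voxels representing that sector by a fixed amount and lowers the response of voxels representing the other sector by the same amount. Under that assumption the region-wide mean is unchanged, while the voxel pattern still distinguishes the attended location. The voxel counts, gain value, and function names are hypothetical.

```python
def attended_pattern(attended_sector, n_per_sector=4, gain=1.0):
    """Voxel responses for 2 sectors: +gain in attended voxels, -gain elsewhere."""
    pattern = []
    for sector in (0, 1):
        value = gain if sector == attended_sector else -gain
        pattern.extend([value] * n_per_sector)
    return pattern

def mean(values):
    return sum(values) / len(values)

def pattern_difference(p, q):
    """Mean absolute voxelwise difference between two patterns."""
    return mean([abs(a - b) for a, b in zip(p, q)])

attend_first = attended_pattern(0)
attend_second = attended_pattern(1)

# Bulk (region-wide mean) activity is identical for both attended locations.
bulk_first = mean(attend_first)
bulk_second = mean(attend_second)

# Averaging each voxel over the 2 conditions also cancels to zero.
condition_average = [(a + b) / 2 for a, b in zip(attend_first, attend_second)]

# The multivoxel patterns nevertheless differ between attended locations.
separation = pattern_difference(attend_first, attend_second)
```

In this sketch the bulk measure carries no information about where attention was directed, whereas the pattern measure does, which mirrors the dissociation between bulk activity and attentional retinotopy described above.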
Beyond the Visual Cortex
First, we note that some components of the bulk activity are often seen in executive tasks. In particular, we saw the expected pattern of activity comprising parietal and dorsolateral prefrontal cortex areas (Cohen et al. 1997; Courtney et al. 1997, 1998; Curtis and D'Esposito 2003), suggested to be part of a multiple demands network by Duncan and Owen (2000). We also saw specific parietal activations usually seen in visual working memory (Todd and Marois 2004; Xu and Chun 2006; Kawasaki et al. 2008; Mitchell and Cusack 2008).
While the scope of the paper precludes a lengthy discussion regarding the neural factors relevant for memory maintenance during the delay, we briefly mention the literature on related individual differences relevant to our finding of a link between performance and spatial coding. First, we have previously shown that bulk activity in multiple fronto-parietal and lateral occipital areas correlates with performance during the encoding–attentional period, but not during maintenance, of change detection tasks (Linke et al. 2011). These results relate to the previous work showing that individual differences in the activity in the PPC predicted performance (Todd and Marois 2005; McNab and Klingberg 2008). Furthermore, the TPJ, which appears to be important in stimulus-driven attention (Corbetta and Shulman 2002), shows a negative correlation with working memory load and relates to inattentional blindness (Todd et al. 2005), a phenomenon that relates to the selection of relevant information and the filtering of environmental distracters.
A further issue relates to the interactions between higher areas and the visual representations held in memory. Given recent advances in techniques such as dynamic causal modeling (DCM; Friston et al. 2003), it would be interesting to examine the causal relationships between univariate activation and pattern information in different regions, but the strictly univariate nature of DCM precludes such an analysis at this point.
A Confound of Behavioral Performance
One potential confound in comparing the 2 experiments is the poorer behavioral performance in Experiment 2 compared with Experiment 1. We could assume that the behavioral performance was directly related to the amount of feature information held in memory, resulting in sharper representations and, furthermore, that the “sharpness” of the remembered representations related to the degree of retinotopy found during the maintenance period. Since the representations held in memory were sharper in Experiment 1 than in Experiment 2, we might expect a lesser degree of retinotopy in Experiment 2 regardless of the effect of prospective attention. However, if these assumptions were true, we would also expect to see a positive correlation between behavioral performance and spatial coding during maintenance in both of these experiments, since we would also expect that a greater amount of feature information held in memory would result in sharper spatial coding across subjects. Nevertheless, we found that the strength of spatial coding during memory was unrelated to behavioral performance in both Experiments 1 and 2. Similarly, it could be argued that perhaps spatial coding during maintenance was weaker, and more difficult to detect. However, this interpretation is not supported by the finding that the mean spatial index during maintenance was not less than that of encoding in Experiment 1 (Fig. 6). The interaction between task epoch and experiment further supports our claim that spatial coding during maintenance was selectively impaired in Experiment 2.
This difference in performance may be due to several possible factors. One possibility is that in Experiment 2 one needs to compare information from the parafovea with a central probe, which may pose greater difficulty for the visual system than a comparison at an identical parafoveal location. However, this is unlikely given that VSTM has been shown to be resilient to changes in visual location, in contrast to sensory memory (Phillips 1974) and, indeed, one may argue that memory could even improve when unimpeded by local BOLD adaptation effects derived from extended activation. Alternatively, Experiment 2 necessitated the segregation of information on the display into 2 chunks of information separately maintained in memory. Indeed, participants tend to encode the configural information of visual displays, and when this information is lost (e.g. by presenting only part of the memorized display), performance is disrupted (Jiang et al. 2000, 2004; Delvenne and Bruyer 2006). Further, the necessity to select which of the 2 memorized stimuli to compare with the probe may increase errors. Lastly, the different positioning of the sectors in Experiment 2 may have also affected performance, since 2 of the sectors shown were placed such that each was presented on both hemifields, which might split processing across hemispheres (Alvarez and Cavanagh 2005; Delvenne 2005).
In conclusion, we found that spatial coding in the early visual cortex during VSTM is task dependent; spatial coding is found when a strategy based on the encoding of the spatial information and prospective attention to the item locations is encouraged, but not when such strategies are discouraged. Furthermore, spatial coding during the attentional, but not the maintenance period, predicted performance on the feature VSTM task. Our results suggest that spatial location is not obligatorily encoded into VSTM and emphasize the importance of considering the roles of strategy and prospective attention in future studies.
Supplementary material can be found at: http://www.cercor.oxfordjournals.org/.
This work was supported by the Medical Research Council (United Kingdom).
We thank Daniel J. Mitchell and Annika Linke for helpful discussion and comment on earlier versions of the manuscript. Conflict of Interest: None declared.