## Abstract

We can more precisely tune attention to highly rewarding objects than other objects in our environment, but how our brains do this is unknown. After a few trials of searching for the same object, subjects' electrical brain activity indicated that they handed off the memory representations used to control attention from working memory to long-term memory. However, when a large reward was possible, the neural signature of working memory returned as subjects recruited working memory to supplement the cognitive control afforded by the representations accumulated in long-term memory. The amplitude of this neural signature of working memory predicted the magnitude of the subsequent behavioral reward-based attention effects across tasks and individuals, showing the ubiquity of this cognitive reaction to high-stakes situations.

## Introduction

Reward has profound effects in shaping behavior and learning (Rescorla and Wagner 1972). Neuroscientists have made substantial progress examining reward processing at the molecular and neuronal levels (Tobler et al. 2005; Björklund and Dunnett 2007; Penny 2011), as well as in large-scale networks (O'Doherty 2004; Vickery et al. 2011).

However, it remains unclear how the opportunity to earn additional reward influences what information is selected by attention (Della Libera and Chelazzi 2006; Serences 2008; Peck et al. 2009; Raymond and O'Brien 2009). Our goal here was to determine whether changes in the nature of top-down attentional control underlie these reward-based modulations of performance that require selective information processing by the brain.

To test the hypothesis that changes in top-down attentional control are responsible for reward-based attention effects, we developed a new technique to simultaneously measure separate human event-related potentials (ERPs) indexing the working and long-term memory representations that control visual attention. The contralateral delay activity (or CDA) of subjects' ERPs was used to track the maintenance of target representations in visual working memory, known as attentional templates (Carlisle et al. 2011). The CDA is a sustained posterior negativity, contralateral to the location where an object appeared, that is present while the object is actively maintained in visual working memory (Vogel and Machizawa 2004; Vogel et al. 2005; Woodman and Vogel 2008). A separate component, the anterior P1 component, was used to directly measure the accumulation of long-term memory representations. In the present study, we call this anterior P1 effect the P170 component, based on previous work showing that this frontocentral positivity is observable during memory tasks using simple geometric shapes (Voss et al. 2010) and appears to reflect the accumulation of information that supports successful recognition via familiarity (Tsivilis et al. 2001; Duarte et al. 2004; Friedman 2004; Diana et al. 2005). This waveform is more negative when a given stimulus has previously been stored in long-term memory and is encountered again. Here, we used simultaneous measurements of the CDA and P170 to determine the roles that working and long-term memory representations play in controlling attention and how those roles change when reward is at stake.

It is possible that, when high rewards are at stake, the brain uses redundant target representations in both working and long-term memories, even though representations of just one type are sufficient to guide attention. If we maintain a representation of the task-relevant object in both visual working and long-term memories, then both sources of top-down control can converge on the same critical perceptual input, increasing its attentional priority. To determine whether this type of redundancy gain underlies reward-driven changes in performance, we examined the influence of reward incentives on visual working memory after several trials of searching for the same target occurred, and subjects were beginning to rely on long-term memory to control attention (Logan 1988, 2002; Anderson 2000). If this redundant-template hypothesis is correct, then the opportunity to earn additional reward should result in a reinstatement of the CDA measuring the maintenance of the attentional template in visual working memory even after long-term memory representations have begun to control attention.

## Materials and Methods

### Subjects

Different groups of 15 volunteers (18–32 years of age) participated in each experiment in exchange for monetary compensation. All participants had normal color vision and normal or corrected-to-normal visual acuity. All procedures were approved by the Vanderbilt University Institutional Review Board, and all subjects gave consent prior to the beginning of the experiment.

### Stimuli

Stimuli were presented against a gray background (54.3 cd/m2) at a viewing distance of 114 cm. A black fixation cross (<0.01 cd/m2, 0.4 × 0.4° of visual angle) was visible throughout each trial. Reward stimuli were outlined circles (0.88° diameter and 0.13° thick) presented at the center of the monitor. Both low and high rewards were distinguished by color (blue: x = 0.146, y = 0.720, 6.41 cd/m2; yellow: x = 0.408, y = 0.505, 54.1 cd/m2), with the assignment of color to reward amount counterbalanced across subjects. The elements in the cue and search arrays were Landolt-C stimuli (0.88° diameter, 0.13° thick, and 0.22° gap width), of 8 possible orientations (0°, 22.5°, 45°, 67.5°, 90°, 112.5°, 135°, and 157.5°), 1 of which was green (x = 0.281, y = 0.593, 45.3 cd/m2) and the others red (x = 0.612, y = 0.333, 15.1 cd/m2). The task-relevant color of the cue stimulus was determined prior to the start of each experiment, counterbalanced across subjects. Each cue stimulus was presented 2.2° to either the left or right of the center of the monitor.

The search array contained 1 red, 1 green, and 10 black distractor Landolt-C stimuli (<0.01 cd/m2) arranged like the numbers on a clock face (centered 4.4° from the middle of the monitor; Fig. 1A). The target orientation could only appear in the task-relevant color. All subjects also completed a change-detection task, identical to that used in previous work (e.g., Vogel et al. 2005), to estimate the number of simple colored objects each individual could remember.

Figure 1.

Stimuli and results from Experiment 1. (A) A precue stimulus (blue or yellow circle) signaled whether the trial was low or high reward. The task-relevant object in the cue array (red or green Landolt-C), signaled the shape of the target in the search array. Feedback was given at the end of each trial. Central fixation was maintained for the trial duration. (B) Reaction times (RTs) across target repetitions following low-reward cues (white), and the critical high-reward cue (green) followed by trials with low-reward cues (gray). Error bars are 95% confidence intervals. (C) Grand-average ERP waveforms from posterior and lateral electrodes contralateral (red) and ipsilateral (black) to the cue location across target repetition. CDA current density distributions across target repetitions and reward are shown collapsed across the right and left cue locations with all contralateral activity projected onto the left hemisphere. (D) CDA amplitude across target repetitions following low-reward cues (white bars), and the critical high-reward cue (green bar) followed by trials with low-reward cues (gray bars). Superimposed is the simultaneously measured P170 amplitude across the same low-reward cues (cyan circles) and high reward followed by low-reward cues (cyan squares). Error bars are as in (B). (E) The relationship between individual subjects' CDA amplitude and RT following high- minus low-reward cues at the same target repetition.

### Procedure

In Experiment 1, each trial began with the presentation of the fixation cross for 1200–1600 ms (randomly jittered using a rectangular distribution). The reward-cue stimulus was presented next for 200 ms, followed by the target-cue stimuli for 100 ms. Following a 1000-ms interval in which the screen was blank other than the fixation point, we then presented the search array for 2000 ms. The final stimulus event of each trial was a feedback screen displaying the current and total cents earned. On low-reward trials, subjects earned 1 cent ($0.01) for a correct response within 2000 ms, and no penalty for an incorrect or missed response. On high-reward trials, subjects earned 5 cents ($0.05) for a correct response within 2000 ms and were penalized 50 cents ($0.50) for an incorrect response. Subjects were aware that money earned for performance would be added to their hourly compensation ($10.00), which typically totalled $15.00, leaving subjects with a total compensation of approximately $45.00.
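The payoff rules for Experiment 1 can be summarized as a small lookup table. The sketch below is only illustrative: the function name `payoff_cents` and the outcome labels are hypothetical, and the payoff for a missed response on a high-reward trial is an assumption, because the text specifies a penalty only for incorrect responses.

```python
def payoff_cents(reward_level, outcome):
    """Cents earned on one Experiment 1 trial (hypothetical helper)."""
    table = {
        ("low", "correct"): 1,      # correct response within 2000 ms
        ("low", "incorrect"): 0,    # no penalty on low-reward trials
        ("low", "missed"): 0,
        ("high", "correct"): 5,     # correct response within 2000 ms
        ("high", "incorrect"): -50,
        ("high", "missed"): 0,      # ASSUMPTION: not stated in the text
    }
    return table[(reward_level, outcome)]

# A 7-trial run with the high-reward cue on the fifth target repetition,
# all responses correct.
run = [("low", "correct")] * 4 + [("high", "correct")] + [("low", "correct")] * 2
total = sum(payoff_cents(level, outcome) for level, outcome in run)
print(total)
```

Under these rules a single error on a high-reward trial costs as much as ten correct high-reward responses earn, which is the asymmetry Experiment 4 removes.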

Experiments 2 and 3 were identical to Experiment 1 with a few key exceptions. In Experiment 2, no explicit reward precue was used. Subjects instead learned to associate low- and high-reward trials with the orientation of the task-relevant cue stimulus (i.e., orientations at 0°, 45°, 90°, and 135° vs orientations at 22.5°, 67.5°, 112.5°, and 157.5°). The mapping of reward values to these 2 classes of orientations was counterbalanced across subjects. In Experiment 3, the search array that was presented in Experiment 1 was replaced with a centered black Landolt-C stimulus presented for 100 ms.

Because error responses were infrequent (0–8% of trials across subjects and experiments), the penalty for incorrect responses on high-reward trials was rarely incurred. However, to verify that the effects we observed were due to the reward payoffs and not the avoidance of losses, we ran Experiment 4, which was identical to Experiment 1 except that no penalties existed and low- and high-reward trials differed only in the size of the gain (1 vs. 5 cents for correct responses within the 2-s window).

The target-cue stimulus remained the same color, orientation, and location throughout each run of 7 trials. The target presented during visual search (Experiments 1, 2, and 4) or the target discrimination task (Experiment 3) matched the shape of the task-relevant cue on half the total number of trials. The cued target orientation and target presence (present or absent) were randomly selected on each trial for all experiments. Target location was randomized on each trial for Experiments 1, 2, and 4. In Experiments 1, 3, and 4, 75% of all trials were preceded by low-reward cues with the remainder being preceded by high-reward cues. Ninety-five percent of the high-reward cues were presented on the fifth target repetition during the same-target runs. The remaining 5% of high-reward cues were evenly distributed across the remaining 6 serial positions in the same-target runs. In Experiment 2, because reward incentives were the same within each run of 7 trials, runs were randomized with the constraint that half would be high reward. Participants were instructed to respond to the search array (Experiments 1, 2, and 4) or the possible target stimulus (Experiment 3) by pressing one button on a handheld gamepad (Logitech Precision) to indicate the target presence and a different button to indicate the target absence, using the thumb of their right hand, giving equal importance to speed and accuracy. In every experiment, each participant performed 2 blocks of 420 trials (i.e. 840 trials in total or 120 runs containing 7 trials each), including a 30-s break approximately every 65 trials with the constraint that breaks would not interrupt a run of trials.

The electroencephalogram was acquired (250-Hz sampling rate, 0.01–100 Hz bandpass filter) using an SA Instrumentation Amplifier from 21 tin (Sn) electrodes arrayed according to the International 10–20 System, including 3 midline (Fz, Cz, and Pz), 7 lateral pairs (F3/F4, C3/C4, P3/P4, PO3/PO4, T3/T4, T5/T6, and O1/O2), and 2 nonstandard electrodes (OL, halfway between O1 and T5; and OR halfway between O2 and T6), and held on the scalp with an elastic cap (Electrocap International, Eaton, OH, USA). The right mastoid electrode served as the online reference for these active electrode sites. Signals were re-referenced offline to the average of the left and right mastoids (Nunez 1981). The electrooculogram was recorded by placing electrodes 1 cm lateral to the external canthi to measure horizontal eye movements and by placing an electrode beneath the left eye, referenced to the right mastoid, to measure vertical eye movements and blinks. We used a 2-step ocular artifact rejection method (Woodman and Luck 2003). Trials accompanied by incorrect behavioral responses or ocular or myogenic artifacts were excluded from the averages.

### Data Analysis

We computed visual memory capacity using a standard formula that estimates the number of objects' worth of information stored while correcting for guessing (Cowan 2001).
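For reference, the Cowan (2001) estimate for single-probe change detection is commonly written as K = N × (H − FA), where N is the set size, H the hit rate, and FA the false-alarm rate. A minimal sketch follows; the set size and rates are made-up illustration values, not data from this study.

```python
def cowan_k(set_size, hit_rate, false_alarm_rate):
    """Cowan's K: items' worth of information stored, corrected for guessing."""
    return set_size * (hit_rate - false_alarm_rate)

# Hypothetical subject: 4-item arrays, 85% hits, 10% false alarms.
k = cowan_k(set_size=4, hit_rate=0.85, false_alarm_rate=0.10)
print(round(k, 2))
```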

The CDA was measured at lateral posterior parietal, occipital, and temporal electrode sites (i.e. PO3/4, O1/2, OL/R, and T5/6) as the difference in mean amplitude between the ipsilateral and contralateral waveforms during 300–1000 ms following target cue onset, consistent with previous CDA experiments (Vogel and Machizawa 2004; Vogel et al. 2005; Carlisle et al. 2011). The P170 was measured at the frontocentral electrode site (i.e. Fz) during 170–370 ms following target cue onset, similar to previous work (Voss et al. 2010). To examine the effects of the learning and reward manipulations on the deployment of attention to the targets in the search arrays, we also measured the N2pc time locked to array onset. The N2pc is a negative-going potential observed over the posterior cortex contralateral to where in the visual field attention is focused (Luck and Hillyard 1990; Luck et al. 1993). The N2pc was measured at lateral occipital electrodes (i.e. OL/R) as the mean difference in amplitude between the ipsilateral and contralateral waveforms during 200–300 ms following the onset of the search array in Experiments 1, 2, and 4. Mean amplitudes were compared across learning and reward conditions by analyses of variance (ANOVAs), and P-values were adjusted using the Greenhouse-Geisser epsilon correction for nonsphericity (Jennings and Wood 1976). Learning effects from all experiments were assessed across the critical target repetitions (changes in performance and ERPs between target repetition 1 through 5; but see Table 1 for additional comparisons). Significant results were further analyzed using the Fisher least significant difference post hoc test. Reward effects were assessed between low- and high-reward cues at the same serial position of target repetition (i.e. the fifth trial in a run in Experiments 1, 3, and 4). In Experiment 2, reward effects were assessed by an interaction between reward value (high vs. low) and target repetition 1 through 5 (but see Table 1 for additional comparisons).
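The mean-amplitude measurement described above can be sketched as follows. This is an illustration on synthetic waveforms, not the analysis code used in the study: the function names, the −200 to 1000 ms epoch, and the contralateral-minus-ipsilateral sign convention are assumptions made for the example.

```python
import numpy as np

def mean_amplitude(waveform, times_ms, start_ms, end_ms):
    """Mean voltage of a 1-D waveform within [start_ms, end_ms]."""
    window = (times_ms >= start_ms) & (times_ms <= end_ms)
    return waveform[window].mean()

def cda_amplitude(contra, ipsi, times_ms, start_ms=300, end_ms=1000):
    """Contralateral minus ipsilateral mean amplitude (assumed sign convention)."""
    return (mean_amplitude(contra, times_ms, start_ms, end_ms)
            - mean_amplitude(ipsi, times_ms, start_ms, end_ms))

# Synthetic example: 250-Hz sampling (4-ms steps), time-locked to cue onset.
times_ms = np.arange(-200, 1001, 4)
ipsi = np.zeros(times_ms.size)
contra = np.where(times_ms >= 300, -1.5, 0.0)  # sustained negativity in microvolts
cda = cda_amplitude(contra, ipsi, times_ms)
print(cda)
```

The same `mean_amplitude` helper would apply to the P170 (170–370 ms at Fz) and, with a contralateral/ipsilateral pair, to the N2pc (200–300 ms after search array onset).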

Table 1

Statistical comparisons during learning (trials 1–7)

| Experiment | Comparison | Measure | df | Statistic | P-value |
|---|---|---|---|---|---|
| 1 | Accuracy | Percent correct | 6,84 | F = 1.500 | 0.19 |
| 1 | Target repetition | Search RT | 6,84 | F = 1.836 | 0.10 |
| 1 | Target repetition | CDA amplitude | 6,84 | F = 4.553 | <0.01 |
| 1 | Target repetition | Alpha suppression | 6,84 | F = 2.592 | <0.03 |
| 1 | Target repetition | P170 amplitude | 6,84 | F = 2.457 | <0.03 |
| 1 | Visual working memory capacity | ΔCDA amplitude | 14 | r = −0.715 | <0.01 |
| 1 | Visual working memory capacity | ΔP170 amplitude | 14 | r = −0.798 | <0.01 |
| 1 | Visual working memory capacity | ΔRT | 14 | r = 0.730 | <0.01 |
| 2 | Accuracy | Percent correct (low reward) | 6,84 | F = 1.138 | 0.35 |
| 2 | Accuracy | Percent correct (high reward) | 6,84 | F = 0.812 | 0.56 |
| 2 | Target repetition | Search RT (low reward) | 6,84 | F = 2.252 | <0.05 |
| 2 | Target repetition | Search RT (high reward) | 6,84 | F = 2.424 | <0.04 |
| 2 | Target repetition | CDA amplitude (low reward) | 6,84 | F = 2.684 | <0.02 |
| 2 | Target repetition | CDA amplitude × Reward | 6,84 | F = 3.829 | <0.01 |
| 2 | Target repetition | Alpha suppression (low reward) | 6,84 | F = 6.420 | <0.01 |
| 2 | Target repetition | Alpha suppression × Reward | 6,84 | F = 4.178 | <0.01 |
| 2 | Target repetition | P170 amplitude (low reward) | 6,84 | F = 1.866 | 0.96 |
| 2 | Target repetition | P170 amplitude (high reward) | 6,84 | F = 2.158 | <0.05 |
| 2 | Target repetition | P170 amplitude × Reward | 6,84 | F = 0.984 | 0.44 |
| 2 | Visual working memory capacity | ΔCDA amplitude (low reward) | 14 | r = −0.780 | <0.01 |
| 2 | Visual working memory capacity | ΔP170 amplitude (low reward) | 14 | r = −0.681 | <0.01 |
| 2 | Visual working memory capacity | ΔRT (low reward) | 14 | r = 0.767 | <0.01 |
| 3 | Accuracy | Percent correct | 6,84 | F = 1.134 | 0.35 |
| 3 | Target repetition | Search RT | 6,84 | F = 2.873 | <0.05 |
| 3 | Target repetition | CDA amplitude | 6,84 | F = 3.401 | <0.01 |
| 3 | Target repetition | Alpha suppression | 6,84 | F = 2.475 | <0.03 |
| 3 | Target repetition | P170 amplitude | 6,84 | F = 3.153 | <0.01 |
| 3 | Visual working memory capacity | ΔCDA amplitude | 14 | r = −0.78 | <0.01 |
| 3 | Visual working memory capacity | ΔP170 amplitude | 14 | r = −0.822 | <0.01 |
| 3 | Visual working memory capacity | ΔRT | 14 | r = 0.767 | <0.01 |
| 4 | Accuracy | Percent correct | 6,84 | F = 0.882 | 0.49 |
| 4 | Target repetition | Search RT | 6,84 | F = 2.442 | <0.04 |
| 4 | Target repetition | CDA amplitude | 6,84 | F = 4.142 | <0.01 |
| 4 | Target repetition | P170 amplitude | 6,84 | F = 2.513 | <0.03 |

Note: ANOVA and correlation analysis summaries using data across all target repetitions from Experiments 1–4.

Current density topography analyses were performed in CURRY 6 (Compumedics Neuroscan, Singen, Germany). The interpolated Boundary Element Method (BEM) model (Fuchs et al. 1998) was derived from the averaged magnetic resonance imaging data from the Montreal Neurological Institute. It consisted of 9300 triangular meshes overall or 4656 nodes, which describe the smoothed inner skull (2286 nodes), the outer skull (1305 nodes), and the outside of the skin (1065 nodes). The mean triangle edge lengths (node distances) were 9 (skin), 6.8 (skull), and 5.1 mm (brain compartment). Standard conductivity values for the 3 compartments were set to: skin = 0.33 S/m, skull = 0.0042 S/m, and brain = 0.33 S/m. The Standardized low-resolution electromagnetic tomography Weighted AccuRate minimum norm Method (SWARM) was estimated using sensor positions based on the International 10-20 System and a cortical surface obtained from a segmentation of the CURRY 6 individual reference brain. Current density distributions were drawn either from CDA difference waves or from P170 waves during the interval following target cue onset (CDA: 300–1000 ms and P170: 170–370 ms) using all of the available electrodes. Prior to constructing CDA difference waves, data were collapsed across left and right cue locations and averaged using a procedure that preserved the electrode location relative to the cue location. All contralateral current density activity was projected onto the left hemisphere of a 3-dimensional reconstruction of the cortical surface, and all ipsilateral activity was projected onto the right hemisphere.

Time-frequency analyses were performed using a continuous Morlet wavelet decomposition with FieldTrip software (Oostenveld et al. 2011). The Morlet wavelet is a complex wavelet (i.e. containing both real and imaginary sinusoidal oscillations) that is multiplied by a Gaussian envelope. The shape of this wavelet is therefore largest at its center and tapered toward its edges. The Morlet wavelet used here was defined by a constant ratio (σf = f/7) and a wavelet duration (6σt), where f is the center frequency and σt = 1/(2πσf). After obtaining complex time-frequency data points for every individual trial, these data were transformed into a measure of cross-trial total power, which involves extracting, squaring, and averaging the magnitude (length) of the complex number vectors. This measure contains both target phase-locked and nonphase-locked components. Power was estimated in the −200- to 1000-ms window centered on target onset in 4-ms bins and from 2 to 30 Hz in 1-Hz bins. The data were then baseline-corrected using the data in the time window centered 200 ms before target onset. No post-target, 10-Hz power was observed to be bleeding into the baseline. Single-trial alpha-band power values (8–12 Hz) were averaged across the 300–1000-ms interval following target cue onset and extracted.
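To make the wavelet parameters concrete, the sketch below builds a Morlet wavelet with σf = f/7, σt = 1/(2πσf), and a duration of 6σt, then computes cross-trial total power on synthetic data. It is written in plain NumPy and does not reproduce FieldTrip's implementation; the function names, the unit-energy normalization, and the simulated 10-Hz trials are assumptions made for illustration, and baseline correction is omitted.

```python
import numpy as np

def morlet_wavelet(freq_hz, sfreq_hz, ratio=7.0, n_sigma=6.0):
    """Complex Morlet wavelet with sigma_f = f / ratio and duration n_sigma * sigma_t."""
    sigma_f = freq_hz / ratio
    sigma_t = 1.0 / (2.0 * np.pi * sigma_f)
    n_half = int(round(0.5 * n_sigma * sigma_t * sfreq_hz))
    t = np.arange(-n_half, n_half + 1) / sfreq_hz
    wavelet = np.exp(-t**2 / (2.0 * sigma_t**2)) * np.exp(2j * np.pi * freq_hz * t)
    # ASSUMPTION: unit-energy normalization; the text does not specify one.
    return wavelet / np.sqrt(np.sum(np.abs(wavelet) ** 2))

def total_power(trials, freq_hz, sfreq_hz):
    """Cross-trial total power: mean over trials of the squared magnitude."""
    wavelet = morlet_wavelet(freq_hz, sfreq_hz)
    # Each trial must be longer than the wavelet for mode="same" to keep its length.
    coeffs = np.array([np.convolve(trial, wavelet, mode="same") for trial in trials])
    return np.mean(np.abs(coeffs) ** 2, axis=0)

# Synthetic data: 20 trials of a 10-Hz oscillation with random phase,
# sampled at 250 Hz from -200 to 1000 ms.
rng = np.random.default_rng(0)
sfreq = 250.0
times_ms = np.arange(-200, 1001, 4)
t_s = times_ms / 1000.0
phases = rng.uniform(0.0, 2.0 * np.pi, size=20)
trials = np.array([2.0 * np.sin(2.0 * np.pi * 10.0 * t_s + p) for p in phases])

# Average power over the alpha band (8-12 Hz in 1-Hz steps), then over 300-1000 ms.
alpha = np.mean([total_power(trials, f, sfreq) for f in range(8, 13)], axis=0)
window = (times_ms >= 300) & (times_ms <= 1000)
alpha_mean = alpha[window].mean()
beta_mean = total_power(trials, 20.0, sfreq)[window].mean()
print(alpha_mean > beta_mean)
```

Because power is computed per trial before averaging, random-phase (nonphase-locked) activity survives in the estimate, which is what distinguishes total power from the power of the averaged ERP.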

## Results

### Experiment 1

In Experiment 1, we tested the hypothesis that reward triggered the use of redundant target representations in memory, while subjects (n = 15) searched complex visual scenes for target objects. A color precue at the start of each trial (i.e. yellow or blue circle) signaled the potential for either low (correct response = $0.01) or high reward (correct response = $0.05) on that trial (Fig. 1A). Before each search array of objects, a target cue was presented that signaled the shape that was to be searched for on that trial. The task-relevant object in the cue array was indicated by color (e.g. the red shape), with both red and green items presented to eliminate physical stimulus confounds (Woodman 2010). Subjects were cued to search for the same target orientation across a run of consecutive trials before the target changed to a new, randomly selected shape. Subjects responded to the item in the search array of the task-relevant color (e.g. red) as fast and accurately as they could on each trial by pressing 1 of the 2 buttons on a gamepad. Feedback at the end of the trial indicated how many cents had been earned in bonus money paid to the subject. Our analyses focused on trial runs in which high reward was possible in the middle of a same-target run of trials (i.e. target repetition 5). These trial runs afforded us the opportunity to see whether the CDA indexing the target representation in visual working memory returned following a high-reward cue.

Reaction time (RT) data showed that responses became faster across trials of search for the same target (Fig. 1B), yielding a significant main effect of target repetition (F4,56 = 2.998, P < 0.03). Post hoc analysis was significant for repetitions 1 versus 5 (P < 0.03).

Critically, there was a marked reduction in RT following high- relative to low-reward cues even after 5 trials of searching for the same target (F1,14 = 9.851, P < 0.01; Fig. 1B). Search accuracy was near ceiling and did not significantly differ across target repetitions (mean: 97.2% correct; P > 0.30) or between low- and high- reward cues (mean: 97.7% correct; P > 0.75). Across the same-target runs, we found that the CDA amplitude elicited by the target cue systematically decreased (Fig. 1C,D), similar to the pattern of RTs, while P170 amplitude systematically increased in negativity (Figs 1D and 2), resulting in a significant main effect of target repetition on CDA amplitude (F4,56 = 4.115, P < 0.01) and P170 amplitude (F4,56 = 2.832, P < 0.04). Post hoc analyses were significant for repetitions 1 versus 5 (CDA, P < 0.02 and P170, P < 0.02). Further, we found a significant correlation between the learning-related changes in P170 amplitude and RT improvements from target repetitions 1 to 5 (r = −0.719, P < 0.01). This is as we expected if subjects were becoming increasingly reliant on long-term memory representations of the target item across the same-target runs. However, when the fifth trial in a same-target run was preceded by a high-reward cue, we found that the CDA returned to the amplitude measured when a new target orientation was introduced. This resulted in the CDA amplitude on the fifth target repetition being significantly larger for high- relative to low-reward cues (F1,14 = 10.701, P < 0.01; Fig. 1C,D). In contrast, the amplitude of the P170 was only slightly more negative and not significantly affected by reward (P > 0.20; Fig. 1D). We also quantified these learning and reward-based effects using time-frequency analyses, because previous work has proposed that visual working memory maintenance can be measured as the strength of the alpha-band suppression contralateral to a remembered stimulus (Mazaheri and Jensen 2008, 2010; van Dijk et al. 2010). 
These analyses of the frequency content converge with our measurements of CDA amplitude (learning: F4,56 = 2.851, P < 0.04 and reward: F1,14 = 9.589, P < 0.01; Fig. 3). Thus, our findings support the hypothesis that, with high rewards at stake, subjects implement more potent attentional control by reloading into working memory the target representation to supplement the representations accruing in long-term memory that typically guide attention after a short period of learning (Logan 1988, 2002; Carlisle et al. 2011). These redundant target representations in working and long-term memories result in faster behavioral responses during search for targets in complex scenes.

Figure 2.

The P170 from Experiment 1. Grand-average ERP waveforms from midline electrodes binned by target repetitions 1–2 (black), 3–4 (red), 5 low reward (blue), 5 high reward (green), and 6–7 (purple). As illustrated in Figure 1D, inset shows P170 amplitudes at electrode Fz from 170 to 370 ms (gray-shaded region) for each target repetition following low-reward cues (cyan circles) and high reward followed by low-reward cues (cyan squares). Error bars are 95% confidence intervals.

Figure 3.

Time-frequency representations of lateralized alpha-band total power during learning and reward from Experiment 1. Grand-average contralateral alpha-band suppression data are illustrated across target repetitions. These data are synchronized to the cue stimulus and by convention shown as the difference in power from both right and left hemifield stimuli, collapsed across the right and left hemisphere electrodes (i.e. comparable to the CDA difference wave).

We reasoned that if the reward-triggered reinstatement of the working memory representations of the target was the source of the effects of reward on behavior, then we should be able to use the magnitude of the CDA amplitude rebound following a high-reward cue to predict the subsequent speeding of the search response. Indeed, we observed that the difference in CDA amplitude following high- relative to low-reward cues after 5 target repetitions was correlated with the speeding of RT across subjects (r = 0.732, P < 0.01; Fig. 1E). This shows that the behavioral gains induced by reward are predicted by the rebound in CDA amplitude measured in the preceding second. Previous research with the CDA has shown that it varies across individuals with different visual working memory capacities (Vogel and Machizawa 2004; Vogel et al. 2005; Woodman and Vogel 2008). Each of our subjects performed an independent task used to estimate their visual working memory capacity, allowing us to determine whether the reward-related effects on the CDA were simply an artifact of differences in working memory capacity. However, we found that individuals' working memory capacity was unrelated to the magnitude of the reward-triggered CDA (P = 0.542), P170 (P = 0.662), and RT speeding (P = 0.485). Visual working memory capacity was predictive of the learning effects we observed though. We found that visual working memory capacity predicted the size of the learning effects between 1 and 5 target repetitions across low-reward trials measured with CDA amplitude (r = −0.785, P < 0.01; Fig. 4A; the correlation remained significant after removing the high-capacity outlier in the top left corner, r = −0.627, P < 0.02), P170 amplitude (r = −0.821, P < 0.01; Fig. 4B), and RT (r = 0.747, P < 0.01; Fig. 4C). These results for CDA amplitude remained significant with P170 amplitude partialled out (r = −0.660, P < 0.01), and likewise for P170 amplitude with CDA amplitude partialled out (r = −0.722, P < 0.01). 
Additionally, we grouped subjects based on a median split of their behaviorally estimated visual working memory capacity (i.e. k) and compared their CDA and P170 amplitudes measured on the first target repetition. This analysis showed no significant differences, indicating that both high- and low-capacity individuals started each run of trials with essentially the same amplitude CDA (F1,14 = 0.543, P = 0.474) and P170 (F1,14 = 1.029, P = 0.329), but that these different groups exhibited ERPs that changed differently as learning unfolded. This shows that individuals with larger visual working memory capacities switch to relying more heavily on long-term memory to guide selection during learning, but the effect of reward on the reinstatement of the visual working memory target and the subsequent behavioral benefits was a general observation across individuals who differed in working memory capacity.
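The partial correlations reported above (e.g. capacity versus the change in CDA amplitude with the change in P170 amplitude partialled out) follow the standard first-order formula r_xy·z = (r_xy − r_xz r_yz) / √((1 − r_xz²)(1 − r_yz²)). The sketch below applies it to randomly generated values for 15 hypothetical subjects; the variable names and numbers are illustrative only and are not the study's data.

```python
import numpy as np

def partial_corr(x, y, z):
    """First-order partial correlation between x and y, controlling for z."""
    r_xy = np.corrcoef(x, y)[0, 1]
    r_xz = np.corrcoef(x, z)[0, 1]
    r_yz = np.corrcoef(y, z)[0, 1]
    return (r_xy - r_xz * r_yz) / np.sqrt((1.0 - r_xz**2) * (1.0 - r_yz**2))

# Made-up values for 15 subjects (NOT data from the study).
rng = np.random.default_rng(1)
capacity = rng.normal(3.0, 0.8, size=15)                       # K estimates
p170_change = -0.5 * capacity + rng.normal(0.0, 0.3, size=15)  # change in P170 amplitude
cda_change = -0.6 * capacity + rng.normal(0.0, 0.3, size=15)   # change in CDA amplitude

r_partial = partial_corr(capacity, cda_change, p170_change)
print(round(r_partial, 3))
```

The same value is obtained by regressing `p170_change` out of both other variables and correlating the residuals, which is a useful check on the formula.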

Figure 4.

Memory capacity correlations with neural activity and behavior during learning from Experiment 1. The relationship between an individual's visual working memory capacity and the change in CDA (A) and P170 amplitude (B) from target repetitions 1 minus 5 of the low-reward runs. (C) The relationship between an individual's visual working memory capacity and the change in RT from target repetitions 1 minus 5 of the low-reward runs.

### Experiment 2

Experiment 1 showed that, after attentional templates accrue in long-term memory and start controlling attention, visual working memory representations are reinstated when a large reward is at stake. However, it is possible to argue that, in the real world, the opportunity to earn large rewards is often correlated with the identity of the target, unlike the cuing procedure we used in Experiment 1. That is, the reward values associated with the objects in our environment are typically stable. When we see a gold ring on the beach, it is a high-reward target regardless of whether we have recently been precued about the price of gold. We designed Experiment 2 to determine whether redundant attentional templates in working and long-term memories underlie the more efficient processing of consistently high-reward targets.

We trained subjects (n = 15) to associate 2 classes of Landolt-C orientations with high versus low reward (i.e. orientations at 0°, 45°, 90°, and 135° vs. orientations at 22.5°, 67.5°, 112.5°, and 157.5°) with explicit instructions and feedback on each trial. As in Experiment 1, subjects searched for the same target across a run of consecutive trials, but because target assignment to reward value was stable, the trials within each run were composed of search for the same high- or low-reward target. Given this consistent relationship between target identity and reward, the reward-driven CDA modulation should be sustained as the P170 increases in negativity across multiple consecutive high-reward trials in Experiment 2.

We again observed a clear behavioral advantage (i.e. faster RTs) when subjects searched for a high- relative to low-reward target across target repetitions (Fig. 5A). This behavioral reward effect was superimposed on the learning effects found when subjects searched for the same target across successive trials. This led to a main effect of target repetition on search RT across low- (F4,56 = 3.364, P < 0.02) and high-reward runs (F4,56 = 3.032, P < 0.03). Post hoc analysis was significant for repetitions 1 versus 4 (low, P < 0.05 and high, P < 0.05) and repetitions 1 versus 5 (low, P < 0.04 and high, P < 0.03). Search accuracy was near ceiling across low- (mean: 95.2%) and high-reward target repetitions (mean: 97.7%) and did not significantly differ across same-target repetitions (Ps > 0.15).

Consistent with our predictions, we found that the amplitude of CDA dropped quickly across low-reward target repetitions, but remained consistently higher across high-reward target repetitions (Fig. 5B–D), whereas the P170 steadily increased in negativity across both low- and high-reward conditions, reaching significance in the high-reward condition across repetitions 1 to 7 (F6,84 = 2.158, P < 0.05; Fig. 6A–C) and predicting learning-based RT speeding similar to Experiment 1 (r = −0.729, P < 0.01). Importantly, the consistently higher CDA amplitude across runs of high-reward targets relative to the gradually decreasing CDA amplitude across runs of low-reward targets was evidenced by an interaction between repetition and reward value (F4,56 = 3.060, P < 0.03; Fig. 5B–D). This interaction was also observed in the time-frequency domain for contralateral alpha-band suppression (F4,56 = 5.329, P < 0.01; Fig. 7A,B). It is noteworthy that the P170 was not significantly modulated by reward value, consistent with results from Experiment 1 and indicated graphically by overlapping 95% confidence intervals (Fig. 6C). Additionally, high-capacity individuals exhibited a larger drop in CDA amplitudes (r = −0.791, P < 0.01; Fig. 8A), increase in P170 amplitudes (r = −0.693, P < 0.01; Fig. 8B), and faster RTs (r = 0.766, P < 0.01; Fig. 8C) across low-reward target repetitions. The results for CDA amplitude remained significant when P170 amplitude was partialled out (r = −0.670, P < 0.01), but were at trend level for P170 amplitude when CDA amplitude was partialled out (r = −0.485, P = 0.07). In addition, these results cannot be explained by different starting levels, because the CDA (F1,14 = 1.615, P = 0.226) and P170 amplitudes (F1,14 = 1.015, P = 0.332) from the first low-reward target repetition did not significantly differ between high- and low-capacity subjects. 
These findings suggest that individuals with high working memory capacity switch to relying on long-term memory more quickly than low capacity individuals. However, we observed similar effects of reward across subjects regardless of capacity (r = −0.134, P = 0.634), similar to Experiment 1.
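
The comparison of starting amplitudes between high- and low-capacity subjects rests on a median split of capacity. A minimal Python sketch of that grouping step is given here under stated assumptions: the helper names (`median_split`, `group_mean`), the rule that subjects exactly at the median fall into the low group, and every value shown are hypothetical, and the sketch only computes group means rather than the F tests reported in the text.

```python
import statistics

def median_split(capacity_by_subject):
    """Return (low, high) lists of subject IDs split at the median capacity.

    Assumption: subjects exactly at the median are assigned to the low group.
    """
    med = statistics.median(capacity_by_subject.values())
    low = [s for s, k in capacity_by_subject.items() if k <= med]
    high = [s for s, k in capacity_by_subject.items() if k > med]
    return low, high

def group_mean(amplitude_by_subject, subjects):
    """Mean amplitude over the listed subjects."""
    return statistics.mean([amplitude_by_subject[s] for s in subjects])

# Made-up illustrative values (NOT data from the study).
capacity = {"s1": 1.9, "s2": 2.4, "s3": 3.1, "s4": 3.8}
first_rep_cda = {"s1": -1.2, "s2": -1.0, "s3": -1.1, "s4": -1.3}

low, high = median_split(capacity)
print("low-capacity group:", low)
print("high-capacity group:", high)
print("mean first-repetition CDA (low):", group_mean(first_rep_cda, low))
print("mean first-repetition CDA (high):", group_mean(first_rep_cda, high))
```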

Figure 5.

Results from Experiment 2. (A) RT across target repetitions for the runs of low (white) and high reward (green). (B) CDA amplitude across target repetitions for the runs of low (white) and high reward (green). Error bars are 95% confidence intervals. Grand-average ERP waveforms from posterior parietal and lateral occipital-temporal sites contralateral (red) and ipsilateral (black) to the location of the cue. The data are binned according to the number of trials during a run since a change of target identity across low- (C) and high-reward runs (D).

Figure 6.

The P170 from Experiment 2. Grand-average ERP waveforms from midline electrodes for low- (A) and high-reward runs (B), binned by target repetitions 1–2 (black), 3–4 (red), and 5–7 (blue). (C) P170 amplitudes measured from the frontocentral electrode, 170–370 ms postcue onset illustrated across target repetitions for low- (cyan circles) and high-reward runs (cyan squares).

Figure 7.

Time-frequency representations of lateralized alpha-band total power during learning and reward from Experiment 2. Grand-average contralateral alpha-band suppression data are illustrated across target repetitions during consecutive low- (A) and high-reward runs (B). These data are synchronized to the cue stimulus and by convention shown as the difference in power from right and left hemifield stimuli, collapsed across the right and left hemisphere electrodes.

Figure 8.

Memory capacity correlations with neural activity and behavior during learning from Experiments 2–3. The subject-wise relationship between visual working memory capacity and the change in CDA amplitude from low-reward trials 1–5 during search for the same target from Experiment 2 (A) and Experiment 3 (D). The subject-wise relationship between visual working memory capacity and the change in P170 amplitude from low-reward trials 1–5 during search for the same target from Experiment 2 (B) and Experiment 3 (E). The subject-wise relationship between visual working memory capacity and the change in RT from low-reward trials 1–5 from Experiment 2 (C) and Experiment 3 (F). The correlation remained significant after removing the outlier in C (r = 0.616, P < 0.03).

The findings of Experiment 2 address 2 alternative explanations for the findings of Experiment 1. First, the elevated working memory-related CDA amplitude across a run of high-reward trials rules out the alternative hypothesis that the reward-triggered working memory reinstatement observed following the infrequent high-reward cues in Experiment 1 was simply a release from a repetition suppression effect due to the frequent low-reward trials. Given such an account, the learning-induced reduction of the CDA would have been explained as a repetition-priming effect in which the neural response to the target cues was reduced across repetitions. However, the findings of Experiment 2 showing that a run of high-reward targets is immune to this reduction rule out this story. Secondly, Experiment 2 demonstrated that objects that are consistently high-reward targets are processed more efficiently, because subjects store redundant target representations of these items in both working and long-term memories when searching for them in complex scenes.

Recent behavioral research shows that linking reward value with a basic stimulus feature (e.g. color) rather than the more complex combinations of features can cause significant and enduring distraction (Anderson et al. 2010). Our data from Experiment 2 allowed us to examine whether distractors associated with large rewards along a single-stimulus dimension (i.e. orientation) would interfere with performance when searching for low-reward targets. That is, each search array contained 2 colored objects (Fig. 1A), one of the target color (e.g. red) and one that was a uniquely colored distractor (e.g. green, for counterbalancing purposes). This analysis looked at whether a green distractor of a shape associated with high reward would be more distracting than a low-reward green shape when searching for a red target. We observed significant RT slowing in the presence of a high-reward distractor (mean ± SEM, 764 ± 21 ms) relative to a low-reward distractor (749 ± 33 ms; F1,14 = 4.976, P < 0.05). Additionally, the magnitude of this high-reward distractor interference effect correlated with individuals' visual working memory capacity (r = −0.618, P < 0.02; Fig. 9A). Most critically, we found that CDA amplitude significantly increased on the trial after high-reward distractor interference (F1,14 = 6.159, P < 0.03, gray bars), resulting in faster RTs on that subsequent trial (F1,14 = 6.725, P < 0.02; Fig. 9B). These results were obtained by comparing CDA amplitude and RT on the high-reward n + 1 trial with the low-reward n + 1 trial. However, similar results were found for RT (F1,14 = 5.957, P < 0.03) and CDA amplitude (F1,14 = 6.331, P < 0.03) by comparing the high-reward n + 1 trial with the high-reward n trial (Fig. 9B, ‘High’ white vs. gray bars). 
This reactive use of the working memory representation to overcome reward-based distraction occurred despite our observation that before the subjects see the search array with a high- or low-reward distractor, the CDA amplitude is similar (Fig. 9B, white bars). These findings demonstrate that working memory attentional templates can also be reinstated as a reaction to overcome reward-related distraction.
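
The n + 1 comparison described above amounts to sorting each trial by the distractor type that appeared on the preceding trial. The following Python sketch shows one way this could be done; the function name, the dictionary keys (`distractor`, `cda`), and the values are hypothetical, and for brevity the sketch ignores run boundaries and treats the trial list as one continuous sequence.

```python
def split_by_previous_distractor(trials):
    """Group each trial's measure by the distractor type on the preceding trial.

    `trials` is an ordered list of dicts with hypothetical keys
    'distractor' ('high' or 'low') and 'cda' (amplitude in microvolts).
    The first trial has no predecessor and is therefore not included.
    """
    following = {"high": [], "low": []}
    for prev, curr in zip(trials, trials[1:]):
        following[prev["distractor"]].append(curr["cda"])
    return following

# Made-up illustrative trial sequence (NOT data from the study).
trials = [
    {"distractor": "high", "cda": -1.0},
    {"distractor": "low", "cda": -2.0},
    {"distractor": "high", "cda": -0.5},
    {"distractor": "low", "cda": -1.5},
]

following = split_by_previous_distractor(trials)
print("CDA on trials after a high-reward distractor:", following["high"])
print("CDA on trials after a low-reward distractor:", following["low"])
```

The same grouping applies to RT by storing a response time in place of the amplitude.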

Figure 9.

Reward-driven attentional capture. (A) The relationship between individual visual working memory capacity and the magnitude of reward-driven attentional capture (indexed as RT in the presence of a high- minus low-reward distractor). (B) CDA amplitude on low- (“Low”) and high-reward distractor trials (“High”, white bars), and trials following a low- (“Low”) and high-reward distractor (“High”, gray bars) during visual search. Error bars are 95% confidence intervals.

### Experiment 3

To determine whether redundant target representations in working and long-term memories might explain reward benefits across a variety of tasks, and not just search tasks, we ran Experiment 3 in which subjects (n = 15) performed a simple target discrimination task (Donders 1868/1969; Posner and Snyder 1975; Wickens 2002). We again used cues to indicate the object that might or might not be present and measured the CDA; however, only a single, centrally presented black Landolt-C was shown on each trial (Fig. 10A). Demonstrating that our findings are ubiquitous across tasks, the results of Experiment 3 were essentially identical to those of Experiment 1. Subjects responded faster with repeated discrimination of the same low-reward target (F4,56 = 2.968, P < 0.03; contrasts 1 vs. 5, P < 0.05; Fig. 10B) with a consistently high accuracy across repetitions (mean: 97.6% correct; F4,56 = 1.632; P = 0.179). These behavioral results were accompanied by a reliable decrease in CDA amplitude (F4,56 = 2.552, P < 0.05; contrasts 1 vs. 4, P < 0.05 and 1 vs. 5, P < 0.04; Fig. 10C,D) and alpha-band suppression (F4,56 = 2.982, P < 0.03; Fig. 11), and an increase in P170 negativity (F4,56 = 2.634, P < 0.05; contrasts 1 vs. 4, P < 0.05 and 1 vs. 5, P < 0.03; Figs 10C and 12) across consecutive low-reward target repetitions. Learning-related changes in P170 amplitude and RT speeding from target repetitions 1 to 5 were correlated, further reinforcing the connection between P170 and the strength of long-term memory representations (r = −0.701, P < 0.01). Similar to Experiments 1 and 2, individuals' visual working memory capacity correlated with the difference in CDA amplitude (r = −0.703, P < 0.01; Fig. 8D), P170 amplitude (r = −0.819, P < 0.01; Fig. 8E), and RT (r = 0.818, P < 0.01; Fig. 8F) from low-reward target repetitions 1 to 5. 
The results for CDA amplitude remained significant when P170 amplitude was partialled out (r = −0.640, P < 0.02), and for P170 amplitude when CDA amplitude was partialled out (r = −0.677, P < 0.01). These differences between high- and low-capacity individuals could not be explained by different starting levels of their CDA (F1,14 = 1.593, P = 0.229) or P170 (F1,14 = 0.117, P = 0.738) at the beginning of each run of trials. Additionally, the changes in CDA and P170 amplitudes during learning (i.e. target repetitions 1 minus 5) were significantly correlated across experiments (Experiment 1: r = 0.583, P < 0.03, Experiment 2: r = 0.568, P < 0.03, Experiment 3: r = 0.707, P < 0.01; Fig. 13A–C). It should be pointed out that not all subjects exhibited similarly strong CDA and P170 differences during learning. However, these individual differences were consistent across experiments in that those subjects with little change in CDA amplitude during learning also showed little change in P170 amplitude during learning. This shows that the learning rate, as measured behaviorally and electrophysiologically, was predicted by our independent estimate of subjects' visual working memory capacity, even during a simple discrimination task without spatial uncertainty or distractors.

Figure 10.

Stimuli and results from Experiment 3. (A) Schematic representation of Experiment 3. (B) RT across target repetitions following low-reward cues (white), and the critical high-reward cue (green) followed by trials with low-reward cues (gray). Error bars are 95% confidence intervals. (C) CDA and P170 amplitudes across target repetitions following low- and high-reward cues using the same conventions as in Figure 1. Error bars are as in (B). (D) Grand-average ERP waveforms from posterior parietal and lateral occipital-temporal sites contralateral (red) and ipsilateral (black) to the location of the cue, binned according to the number of trials since a change of target identity. (E) The relationship between CDA amplitude and RT on high- minus low-reward cues at the fifth target repetition.

Figure 11.

Time-frequency representations of lateralized alpha-band total power during learning and reward from Experiment 3. Grand-average contralateral alpha-band suppression data are illustrated across target repetitions. These data are synchronized to the cue stimulus and by convention shown as the difference in power from right and left hemifield stimuli, collapsed across right and left hemisphere electrodes.

Figure 12.

The P170 from Experiment 3. Grand-average ERP waveforms from midline electrodes binned by target repetitions 1–2 (black), 3–4 (red), low reward 5 (blue), high reward 5 (green), and 6–7 (purple). As illustrated in Figure 10C, inset shows P170 amplitudes at electrode Fz from 170 to 370 ms (gray-shaded region) for each target repetition following low-reward cues (cyan circles) and high reward followed by low-reward cues (cyan squares). Error bars are 95% confidence intervals.

Figure 13.

CDA and P170 correlations during learning in Experiments 1–3. The relationship between an individual's CDA and P170 amplitudes from target repetitions 1 minus 5 of the low-reward runs in Experiments 1 (A), 2 (B), and 3 (C).

The reward-based results from Experiment 3 demonstrate that even during a simple object-discrimination task, reward incentives led individuals to engage additional top-down control by reinstatement of the target representation in visual working memory, which improved task performance. Following high- relative to low-reward cues, we found a significant decrease in RT (F1,14 = 6.200, P < 0.03; Fig. 10B), whereas accuracy did not significantly differ between reward conditions (mean: 96.1% correct; F1,14 = 1.565, P = 0.231). Critically, we again found a significant increase in CDA amplitude (F1,14 = 7.132, P < 0.02; Fig. 10C,D) and lateralized alpha suppression (F1,14 = 8.030, P < 0.02; Fig. 11), but no significant change in P170 (P > 0.08) between high- and low-reward cues. Moreover, the difference in CDA amplitude and RT following high- versus low-reward cues was significantly correlated across subjects (r = 0.798, P < 0.01; Fig. 10E), indicating that reward-based behavioral gains can be predicted by the magnitude of the CDA indexing the reinstatement of the attentional template in working memory. However, reward-driven CDA (r = −0.367, P = 0.178) and RT (r = 0.357, P = 0.192) effects were not significantly related to individual differences in visual working memory capacity, showing that these reward effects are a general mechanism used across individuals with different working memory capacities. In sum, the results obtained from Experiment 3 underscore the reliability and generality of our findings across different tasks. Given the simplicity of the target discrimination task in Experiment 3, it appears that the recruitment of working memory to aid the cognitive control provided by long-term memory is a general mechanistic response to reward across tasks that span a large range of complexity. 
These findings again show that the reward-based effects are driven by redundancy gains in having both visual working and long-term memory representations facilitating the perception of target objects.

### Experiment 4

To rule out that our findings were due to the possibility of penalties on high-reward trials with incorrect responses, we re-ran Experiment 1 without the possibility of penalties on incorrect high-reward trials. The results of this experiment were identical to those of Experiment 1. Specifically, RTs were increasingly faster with each trial searching for the same target (F4,56 = 2.722, P < 0.05; Fig. 14A), and markedly reduced following high- relative to low-reward cues after 5 trials of searching for the same target (F1,14 = 5.449, P < 0.04; Fig. 14A). Search accuracy was near ceiling and did not significantly differ across target repetitions (mean: 96.3% correct; P > 0.30) or between low- and high-reward cues (mean: 95.3% correct; P > 0.45). Across the same-target runs, we observed a significant CDA amplitude decrease (F4,56 = 4.432, P < 0.01; contrasts 1 vs. 5, P < 0.03; Fig. 14B), and a P170 amplitude increase in negativity (F4,56 = 2.600, P < 0.04; contrasts 1 vs. 5, P < 0.04; Fig. 14B). When the fifth trial in a same-target run was preceded by a high-reward cue, we found that the CDA returned to its full amplitude (F1,14 = 8.992, P < 0.01; Fig. 14B), comparable with when a new target orientation was introduced (i.e. target repetition 1), while reward-triggered P170 modulation was not significant (P > 0.14). Finally, we observed that the difference in CDA amplitude following high- relative to low-reward cues after 5 target repetitions was correlated with the speeding of RT across subjects (r = 0.682, P < 0.01; Fig. 14C), indicating that the behavioral gains induced by reward are predicted by the rebound in CDA amplitude. 
Thus, our findings show that the possibility of receiving a large reward rather than incurring a large penalty is sufficient to lead subjects to exercise more precise attentional control by reloading the target representation into working memory to supplement the representations accumulating in long-term memory that typically guide attention after a short period of learning.

Figure 14.

Results from Experiment 4. (A) RTs across target repetitions following low-reward cues (white), and the critical high-reward cue (green) followed by trials with low-reward cues (gray). Error bars are 95% confidence intervals. (B) CDA amplitude across target repetitions following low-reward cues (white bars), and the critical high-reward cue (green bar) followed by trials with low-reward cues (gray bars). Superimposed is the simultaneously measured P170 amplitude across the same low-reward cues (cyan circles) and high-reward followed by low-reward cues (cyan squares). Error bars are as in (A). (C) The relationship between individual subjects' CDA amplitude and RT following high- minus low-reward cues at the same target repetition.

### Spatial Distributions and Downstream Effects

We assessed the spatial distributions of the CDA and P170 by calculating current density distributions (see Materials and Methods for details). The current distributions were similar to the voltage distributions across the scalp during the time windows of the CDA and P170. The current density topographies illustrated in Figure 15 show that the lateral posterior activation of the CDA decreased in magnitude and coverage across the cortical surface, while the frontomedial distribution of the P170 increased in magnitude and spread across the cortical surface as subjects continued to search for the same low-reward target. The cortical coverage and magnitude of current density of the CDA increased when a high reward was expected. In all experiments, the current distributions contributing to the CDA explained 87–94% of the variance, while that contributing to the P170 explained 83–95% of the variance. These findings are consistent with the previous reports of the spatial topography of the CDA (Vogel and Machizawa 2004; Vogel et al. 2005; Woodman and Vogel 2008) and P170 (Voss et al. 2010) when these components were measured in isolation. The differences in current distribution, latency, and correlations with behavior all support the conclusion that these different ERP components are indices of separate cognitive processes in working and long-term memories.

Figure 15.

Current density distributions of P170 and CDA from Experiment 1. Distributed current densities projected across the cortical surface for the P170 (top view) and CDA (lateral rear view). The P170 model was computed based on the cue-locked grand-average ERPs from 170 to 370 ms using all scalp electrodes. The CDA model was computed based on the contralateral minus ipsilateral cue-locked grand-average ERP difference waves from 300 to 1000 ms using all scalp electrodes. Although both left and right visual field conditions are included in the CDA model, for visualization purposes all CDA contralateral signals are projected onto the left hemisphere.

Finally, to verify that the deployment of attention during the visual search tasks was modulated by our learning and reward manipulations as indicated by subjects' RT speed, we examined the electrophysiological responses during visual search using an ERP component known to be sensitive to the deployment of covert visual attention (i.e. N2pc, Luck and Hillyard 1990; Luck et al. 1993). RT is a useful index of the output of all of the computations performed during a task, but we used the N2pc to more directly measure the influence of our manipulations on this mechanism of perceptual attention (Woodman and Luck 2003). The results of the analyses of the N2pc across the learning and reward manipulations of Experiment 1 are shown in Figure 16. In the experiments where it was possible to measure an N2pc during search (i.e. Experiments 1, 2, and 4), we found a systematic increase in N2pc amplitude as the subjects searched for the same target across the runs of low-reward trials (Experiment 1: F4,56 = 2.963, P < 0.03; Experiment 2: F4,56 = 3.144, P < 0.03; Experiment 4: F4,56 = 2.836, P < 0.04). Post hoc contrasts showed that repetitions late in the run of trials were driving these effects relative to the significantly smaller amplitude N2pc to the targets at the beginning of the runs (1 vs. 4: Experiment 1, P < 0.05; 1 vs. 5: Experiment 1, P < 0.05; Experiment 2, P < 0.05; Experiment 4, P < 0.04). The amplitude of the N2pc was also robustly increased following high- relative to low-reward cues on the critical fifth target repetitions (Experiment 1: F1,14 = 9.025, P < 0.01; Experiment 4: F1,14 = 9.856, P < 0.01). In Experiment 2, the N2pc was consistently enhanced across all trials in the runs of high-reward trials relative to the gradually increasing N2pc observed across the runs of low-reward trials as evidenced by an interaction between repetition and reward value (F4,56 = 3.963, P < 0.02). 
These results provide evidence that converges with the behavior in showing that attention was more precisely tuned to the target features across the short bursts of learning and when a large reward was at stake.

Figure 16.

Search array-locked ERPs from Experiment 1. (A) Grand-average ERP waveforms from lateral occipital electrodes contralateral (red) and ipsilateral (black) to the search array target location across target repetition. N2pc components are shaded gray. (B) N2pc amplitude across target repetitions following low-reward cues (white bars), and the critical high-reward cue (green bar) followed by trials with low-reward cues (gray bars). N2pc amplitude is the contralateral minus ipsilateral difference wave between 200 and 300 ms postsearch array. Error bars are 95% confidence intervals.
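
The amplitude measure defined in this caption, the mean of the contralateral minus ipsilateral difference wave within a fixed window, can be sketched in a few lines of Python. The function name `mean_amplitude`, the inclusive treatment of the window endpoints, and the sample waveforms are assumptions made for illustration only; they do not reproduce the recording parameters or data of the study.

```python
def mean_amplitude(times_ms, contra_uv, ipsi_uv, start_ms, end_ms):
    """Mean contralateral-minus-ipsilateral voltage within [start_ms, end_ms].

    Assumption: both window endpoints are inclusive.
    """
    diffs = [c - i for t, c, i in zip(times_ms, contra_uv, ipsi_uv)
             if start_ms <= t <= end_ms]
    if not diffs:
        raise ValueError("no samples fall inside the measurement window")
    return sum(diffs) / len(diffs)

# Made-up illustrative waveforms sampled at a few latencies
# (NOT data from the study); voltages are in microvolts.
times_ms = [100, 200, 250, 300, 400]
contra_uv = [0.0, -2.0, -3.0, -2.0, 0.0]
ipsi_uv = [0.0, -1.0, -1.0, -1.0, 0.0]

# N2pc window from the caption: 200 to 300 ms after search array onset.
n2pc = mean_amplitude(times_ms, contra_uv, ipsi_uv, 200, 300)
print(round(n2pc, 3))
```

Because the window is a parameter, the same function could be applied to other lateralized components by passing a different latency range.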

## Discussion

The present study shows that reward-driven modulations of information processing and attentional selection can be accounted for by existing cognitive models, provided that these models specify that redundant memory representations can be concurrently engaged to control processing (Desimone and Duncan 1995; Logan 2002; Bundesen et al. 2005). Our results also show that neuroeconomic influences can create a dynamic situation in which the source of cognitive control involves the interplay of working and long-term memory representations. Our electrophysiological findings show that long-term memory plods along, accumulating representations of instances of searching for a specific item, regardless of its reward value. In contrast, working memory representations are used in a more responsive manner to supplement cognitive control when necessary. Consistent with this, we found that high stakes triggered the recruitment of visual working memory to supplement long-term memory. Moreover, when a potent, high-reward distractor slowed processing on a given trial, the brain responded by recruiting visual working memory on the following trial to recover from this distraction. It is therefore possible that visual working memory is called upon as a more general-purpose top-down strategy to overcome distraction, irrespective of reward processing. Our simultaneous measurements of multiple components indexing working and long-term memories provide novel insights into how the memory representations that control attention change in the face of potential rewards.

Our findings add to a developing literature demonstrating robust modulations of attentional control and learning during attention-demanding tasks by manipulations of reward (Della Libera and Chelazzi 2006, 2009; Serences 2008; Kiss et al. 2009; Peck et al. 2009; Raymond and O'Brien 2009; Krebs et al. 2010; Navalpakkam et al. 2010; Hickey et al. 2010a, 2010b). For example, prior work has shown that sustained attentional suppression is only present in high-reward situations (Della Libera and Chelazzi 2006), and that subjects can more efficiently select stimuli that have previously been tied to large reward, but have difficulty ignoring these stimuli (Della Libera and Chelazzi 2009; Anderson et al. 2010), even when the stimuli are independent of goals and salience (Anderson et al. 2010). Such reward maximization behavior is robust across multiple visual features and does not depend on the type of motor response (Navalpakkam et al. 2010). Reward-based perceptual and attentional advantages have also been shown to influence stimulus detectability, leading to increased sensitivity for reward-related stimuli in the attentional blink task (Raymond and O'Brien 2009). Moreover, many of these reward-based attentional effects may depend on individual differences in personality traits such as reward-seeking (Hickey et al. 2010b). The current study is consistent with this growing literature, but also suggests a new perspective on the mechanisms underlying these effects: the use of converging memory representations to enhance attentional priority and to optimize performance.

Our findings shed light on the nature of the transition between working and long-term memories during learning. Following from Fitts and Posner (1967), theories of memory and learning have proposed that as we become practiced at a certain task, we switch from processing information guided by working memory representations to relying on long-term memory representations (e.g. Anderson 2000). Alternatively, another prominent theory of task learning posits that representations from both memory stores race in parallel, with the speeding of RTs with practice being due to the fast long-term memory processes more frequently winning the race to categorize the task-relevant objects as experience accrues (Logan 1988, 2002). This latter model proposes that we should continue to see the ERP index of working memory representations during learning and across experience as the effects of long-term memory representations mount. Our findings support this account of learning and suggest that the nature of the transition between working and long-term memories is a matter of degree rather than an all-or-none phenomenon. That is, during learning, our electrophysiological evidence shows a gradual tradeoff between using working and long-term memory representations to guide attention. Working memory even appeared to be marginally engaged later in learning (e.g. target repetitions 6–7) when long-term memory was the primary controller of attention. However, in our experiments and others (Carlisle et al. 2011), subjects were cued a maximum of 7 times with the same object before having to switch to a new search target, and thus future work using longer runs is needed to determine whether and where, in its time course, working memory-guided processing becomes completely disengaged and RT asymptotes. 
In addition, it is possible that the switch from using working memory to using long-term memory representations to control attention is a punctate event (Rickard 1997), but that the averaging that was necessary to measure the ERPs smeared out this transition. We are exploring the possibility that other measurement techniques might be able to distinguish between a gradual and a discrete transition between memory systems. However, the present findings clearly show how separate electrophysiological indices can be used to study the types of memory representations that control information processing as participants become proficient at a particular task.
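
The averaging argument above can be made concrete with a toy simulation. The sketch below is illustrative only and is not a model fitted to the data: it assumes that, within each run, working memory is either fully engaged or fully disengaged, and that the handoff occurs at a repetition drawn uniformly at random (a hypothetical choice). The function name `average_wm_engagement` and all parameter values are assumptions made for the example.

```python
import random


def average_wm_engagement(n_runs=10000, n_reps=7, seed=1):
    """Average an all-or-none working memory signal across simulated runs."""
    rng = random.Random(seed)
    engaged_counts = [0] * n_reps
    for _ in range(n_runs):
        # Hypothetical assumption: the handoff to long-term memory happens at a
        # repetition drawn uniformly from 2..n_reps and differs from run to run.
        switch_rep = rng.randint(2, n_reps)
        for rep in range(1, n_reps + 1):
            if rep < switch_rep:
                engaged_counts[rep - 1] += 1  # working memory still in control
    return [count / n_runs for count in engaged_counts]


curve = average_wm_engagement()
print([round(value, 2) for value in curve])
```

Although every simulated run contains a discrete switch, the averaged curve declines gradually across repetitions, which is why averaged ERPs alone cannot distinguish a punctate transition from a gradual one.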

The instance theory of attention and memory is the natural modeling framework in which to integrate and explain the current findings (Logan 1988, 2002). Briefly, instance theory conceives of representations from working and long-term memories as runners racing toward a threshold, with the cognitive process triggered once the first runner crosses the threshold. In the context of the present study, we can view the deployment of attention to task-relevant items in the visual search array as the process of interest. With a larger number of runners, the winner's finishing time will be faster on average, assuming variability in the speed of the runners. As a result, this theory predicts that when we have more representations in multiple memory stores converging to drive attention to the target objects in the scene, we will have more efficient processing of the target information. Our work extends this basic logic by proposing that top-down redundancy not only occurs at the start of learning, but can also be implemented whenever it would be adaptive to accelerate visual processing. Alternatively, it is possible that subjects rely on information stored exclusively in working memory to tune attention in high-stakes situations. However, to account for the behavioral effects, this use of working memory would need to act on a system in a different state than the one it faces when a new target is presented. Experiment 2 of the current study is a special and interesting case in which visual cognition appears to step on the accelerator repeatedly across high-reward trials. Moreover, across all experiments, subjects appear to perseverate on the strategy of using redundant representations to more precisely control attention. This is supported by the observation that the CDA remains slightly elevated for a couple of trials following the high-reward trial (i.e. as indicated by the difference between the gray and white bars in Figs. 1, 10, and 14). 
Future modeling simulations in conjunction with brain and behavioral data will be necessary to verify that a cognitive model can in fact predict that processing will be more efficient after a moderate amount of task practice when working memory representations are brought back on line as well as in cases when working memory reengagement is triggered repeatedly on trial after trial.
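
The statistical core of the race logic described above (more runners yield a faster winner on average) can be illustrated with a short simulation. This is a sketch under simplifying assumptions, not an implementation of instance theory: every runner is given the same Weibull finishing-time distribution, the scale and shape values are arbitrary, and the function name `mean_winning_time` is hypothetical. In particular, treating working and long-term memory runners as identically distributed is an assumption made only to keep the example small.

```python
import random
from statistics import mean


def mean_winning_time(n_runners, n_trials=20000, scale_ms=500.0, shape=2.0, seed=0):
    """Average finishing time of the fastest of n_runners independent runners."""
    rng = random.Random(seed)
    return mean(
        # The process is triggered by whichever runner finishes first
        min(rng.weibullvariate(scale_ms, shape) for _ in range(n_runners))
        for _ in range(n_trials)
    )


one_store = mean_winning_time(1)   # a single representation controls attention
two_stores = mean_winning_time(2)  # a second, redundant representation joins the race
print(round(one_store), round(two_stores))
```

Adding the second runner lowers the average winning time even though no individual runner has become faster, which is the sense in which redundant representations could speed the deployment of attention.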

Our empirical findings may provide an explanation for why several clinical disorders, such as schizophrenia and attention deficit hyperactivity disorder, appear to involve a clustering of impairments that include attention and working memory deficits, abnormal reward processing, and dysfunction of the dopaminergic system (Zubin 1975; Seeman et al. 1993; Gong et al. 2011). The present study suggests that these impairments might have a common locus in the information processing system of the brain, instead of being due to a diversity of deficits scattered across distinct cognitive mechanisms with different neural substrates. Specifically, a visual working memory impairment that prevents the use of this mechanism in responding to changing environmental demands can manifest itself as disordered focusing of attention, disordered reactions to reward value, and disordered changes of task or target, as in the Wisconsin Card Sorting Task.

In this study, we used concurrent measurements of electrophysiological indices of working and long-term memories to understand how information processing changes in the face of reward. The paradigm we developed allows us to observe short bursts of learning both behaviorally and electrophysiologically, providing a proving ground for these concurrent measures. We believe that this combination of measures can also be used to determine the nature of the memory representations brought to bear during a host of different tasks and paradigms. These noninvasive electrophysiological tools can be used with healthy adult subjects as well as with a variety of clinical or developmental populations.

## Funding

This work was made possible by grants from the National Institutes of Health (R01-EY019882, P30-EY08126, and P30-HD015052) and the National Science Foundation (BCS-0957072).

## Notes

Conflict of Interest: None declared.

## References

Anderson B, Laurent P, Yantis S. Value-driven attentional capture. 2010, vol. 108 (pg. 10367-10371).

Anderson JR. Learning and memory. 2000. New York: Wiley.

Björklund A, Dunnett S. Dopamine neuron systems in the brain: an update. Trends Neurosci, 2007, vol. 30 (pg. 194-202).

Bundesen C, Habekost T, Kyllingsbaek S. A neural theory of visual attention: bridging cognition and neurophysiology. Psychol Rev, 2005, vol. 112 (pg. 291-328).

Carlisle NB, Arita JT, Pardo D, Woodman GF. Attentional templates in visual working memory. J Neurosci, 2011, vol. 31 (pg. 9315-9322).

Cowan N. The magical number 4 in short-term memory: a reconsideration of mental storage capacity. Behav Brain Sci, 2001, vol. 24 (pg. 87-185).

Della Libera C, Chelazzi L. Learning to attend and to ignore is a matter of gains and losses. Psychol Sci, 2009, vol. 20 (pg. 778-784).

Della Libera C, Chelazzi L. Visual selective attention and the effects of monetary reward. Psychol Sci, 2006, vol. 17 (pg. 222-227).

Desimone R, Duncan J. Neural mechanisms of selective visual attention. Ann Rev Neurosci, 1995, vol. 18 (pg. 193-222).

Diana R, Vilberg K, Reder L. Identifying the ERP correlate of a recognition memory search attempt. Cogn Brain Res, 2005, vol. 24 (pg. 674-684).

Donders FC. On the speed to mental processes. In: Koster WG, editor. Attention and performance II. 1868/1969. Amsterdam: North-Holland Publishing Co (pg. 412-431).

Duarte A, Ranganath C, Winward L, Hayward D, Knight R. Dissociable neural correlates for familiarity and recollection during the encoding and retrieval of pictures. Cogn Brain Res, 2004, vol. 18 (pg. 255-272).

Fitts PM, Posner MI. Human performance. 1967. Belmont, CA: Brooks Cole.

Friedman D. ERP studies of recognition memory: differential effects of familiarity, recollection, and episodic priming. Cogn Sci, 2004, vol. 1 (pg. 81-121).

Fuchs M, Drenckhahn R, Wischmann HA, Wagner M. An improved boundary element method for realistic volume-conductor modeling. IEEE Trans Biomed Eng, 1998, vol. 45 (pg. 980-997).

Gong R, Ding C, Hu J, Lu Y, Liu F, Mann E, Xu F, Cohen M, Luo M. Role for the membrane receptor guanylyl cyclase-c in attention deficiency and hyperactive behavior. Science, 2011, vol. 333 (pg. 1642-1646).

Hickey C, Chelazzi L, Theeuwes J. Reward changes salience in human vision via the anterior cingulate. J Neurosci, 2010a, vol. 30 (pg. 11096-11103).

Hickey C, Chelazzi L, Theeuwes J. Reward guides vision when it's your thing: trait reward-seeking in reward-mediated visual priming. PLoS One, 2010b, vol. 5 pg. e14087.

Jennings JR, Wood CC. The e-adjustment procedure for repeated-measures analyses of variance. Psychophysiology, 1976, vol. 13 (pg. 277-278).

Kiss M, Driver J, Eimer M. Reward priority of visual target singletons modulates event-related potential signatures of attentional selection. Psychol Sci, 2009, vol. 20 (pg. 245-251).

Krebs R, Boehler C, Woldorff M. The influence of reward associations on conflict processing in the Stroop task. Cognition, 2010, vol. 117 (pg. 341-247).

Logan GD. Toward an instance theory of automatization. Psychol Rev, 1988, vol. 95 (pg. 492-527).

Logan GD. An instance theory of attention and memory. Psychol Rev, 2002, vol. 109 (pg. 376-400).

Luck SJ, Fan S, Hillyard SA. Attention-related modulation of sensory-evoked brain activity in a visual search task. J Cogn Neurosci, 1993, vol. 5 (pg. 188-195).

Luck SJ, Hillyard SA. Electrophysiological evidence for parallel and serial processing during visual search. Percept Psychophys, 1990, vol. 48 (pg. 603-617).

Mazaheri A, Jensen O. Asymmetric amplitude modulations of brain oscillations generate slow evoked responses. J Neurosci, 2008, vol. 28 (pg. 7881-7787).

Mazaheri A, Jensen O. Rhythmic pulsing: linking ongoing brain activity with evoked responses. Front Hum Neurosci, 2010, vol. 4 (pg. 1-13).

Navalpakkam V, Koch C, Rangel A, Perona P. Optimal reward harvesting in complex perceptual environments. 2010, vol. 107 (pg. 5232-5237).

Nunez PL. Electric Fields of the Brain. 1981. New York: Oxford University Press.

O'Doherty J. Reward representations and reward-related learning in the human brain: insights from human neuroimaging. Curr Opin Neurobiol, 2004, vol. 14 (pg. 769-76).

Oostenveld R, Fries P, Maris E, Schoffelen JM. FieldTrip: open source software for advanced analysis of MEG, EEG, and invasive electrophysiological data. Comput Intell Neurosci, 2011, vol. 2011 (pg. 1-9).

Peck C, Jangraw D, Suzuki M, Efem R, Gottlieb J. Reward modulates attention independently of action value in posterior parietal cortex. J Neurosci, 2009, vol. 29 (pg. 11182-11191).

Penny P. Common cellular and molecular mechanisms in obesity and drug addiction. Nat Rev Neurosci, 2011, vol. 12 (pg. 638-651).

Posner MI, Snyder CR. Attention and cognitive control. In: Solso R, editor. Information processing and cognition. 1975. Hillsdale (NJ): Erlbaum (pg. 55-85).

Raymond JE, O'Brien JL. Selective visual attention and motivation: the consequences of value learning in an attentional blink task. Psychol Sci, 2009, vol. 20 (pg. 981-988).

Rescorla R, Wagner A. Classical conditioning II: current research and theory. In: Black A, Prokasy W, editors. Classical conditioning II: current research and theory. 1972. New York: Appleton Century Crofts. pg. 64.

Rickard TC. Bending the power law: a CMPL theory of strategy shifts and the automatization of cognitive skills. J Exp Psychol Gen, 1997, vol. 126 (pg. 288-311).

Seeman P, Guan H-C, Van Tol H. Dopamine D4 receptors elevated in schizophrenia. Nature, 1993, vol. 365 (pg. 441-445).

Serences J. Value-based modulations in human visual cortex. Neuron, 2008, vol. 60 (pg. 1169-1181).

Tobler P, Fiorillo C, Schultz W. Adaptive coding of reward value by dopamine neurons. Science, 2005, vol. 307 (pg. 1642-1645).

Tsivilis D, Otten L, Rugg M. Context effects on the neural correlates of recognition memory: an electrophysiological study. Neuron, 2001, vol. 31 (pg. 497-505).

van Dijk H, Van der Werf J, Mazaheri A, Medendorp P, Jensen O. Modulations in oscillatory activity with amplitude asymmetry can produce cognitively relevant event-related responses. 2010, vol. 107 (pg. 900-905).

Vickery T, Chun M, Lee D. Ubiquity and specificity of reinforcement signals throughout the human brain. Neuron, 2011, vol. 72 (pg. 166-177).

Vogel EK, Machizawa MG. Neural activity predicts individual differences in visual working memory capacity. Nature, 2004, vol. 428 (pg. 748-751).

Vogel EK, McCollough AW, Machizawa MG. Nature, 2005, vol. 438 (pg. 500-503).

Voss J, Schendan H, Paller K. Finding meaning in novel geometric shapes influences electrophysiological correlates of repetition and dissociates perceptual and conceptual priming. Neuroimage, 2010, vol. 49 (pg. 2879-2889).

Wickens T. Elementary signal detection theory. 2002. Oxford: Oxford University Press.

Woodman GF. A brief introduction to the use of event-related potentials (ERPs) in studies of perception and attention. Attent Percept Psychophys, 2010, vol. 72 (pg. 2031-2046).

Woodman GF, Luck SJ. Serial deployment of attention during visual search. J Exp Psychol: Hum Percep Perform, 2003, vol. 29 (pg. 121-138).

Woodman GF, Vogel EK. Top-down control of visual working memory consolidation. Psychon Bull Rev, 2008, vol. 15 (pg. 223-229).

Zubin J. Problem of attention in schizophrenia. In: Sutton S, Zubin J, editors. Experimental approaches to psychopathology. 1975. New York