Abstract

We investigated how orbitofrontal cortex (OFC) contributes to adaptability in the face of changing reward contingencies by examining how reward representations in monkey orbitofrontal neurons change during a visually cued, multi-trial reward schedule task. A large proportion of orbitofrontal neurons were sensitive to events in this task (69/80 neurons in the valid and 48/58 neurons in the random cue context). Neuronal activity depended upon preceding reward, upcoming reward, reward delivery, and schedule state. Preceding reward–dependent activity occurred in both the valid and random cue contexts, whereas upcoming reward-dependent activity was observed only in the valid context. A greater proportion of neurons encoded preceding reward in the random than the valid cue context. The proportion of neurons with preceding reward–dependent activity declined as each trial progressed, whereas the proportion encoding upcoming reward increased. Reward information was represented by ensembles of neurons, the composition of which changed with task context and time. Overall, neuronal activity in OFC adapted to reflect the importance of different types of reward information in different contexts and time periods. This contextual and temporal adaptability is one hallmark of neurons participating in executive functions.

Introduction

To survive and reproduce, animals must be able to recognize rewarding events when they occur, utilize environmental cues to predict future rewards, and remember rewards from the past. It is widely accepted that orbitofrontal cortex (OFC) plays an important role in this process by integrating information about current and future rewards. Because past rewards can play an important role in planning future behavior, we have investigated the role of OFC in encoding information about preceding rewards. The importance of different rewards can vary in different situations and at different times, so we have also investigated the ways in which reward representations in OFC change with context and time.

OFC is activated by reward delivery itself. In humans, OFC activations have been observed in response to primary sensory rewards and to the direct experience of scoring points or winning money (Rolls et al. 1990; Thut et al. 1997; O'Doherty, Kringelbach, et al. 2001; O'Doherty, Rolls, et al. 2001; O'Doherty et al. 2002; Elliott et al. 2003, 2004; Kringelbach et al. 2003; Rogers et al. 2004; Pritchard et al. 2005; Remijnse et al. 2005). In monkeys, OFC neurons respond to reward delivery, discriminate between different rewards, and encode reward preference (Rosenkilde et al. 1981; Thorpe et al. 1983; Tremblay and Schultz 1999, 2000b; Ichihara-Takeda 2006; Padoa-Schioppa and Assad 2006).

OFC is also important for predicting future rewards. In humans, cues predicting taste, odor, or financial rewards activate OFC (O'Doherty et al. 2002; Gottfried et al. 2003; Knutson et al. 2005; Ursu and Carter 2005). In monkeys, OFC lesions impair the ability to associate visual stimuli with rewards and to respond appropriately when reinforcement contingencies change (Jones and Mishkin 1972; Pears et al. 2003; Izquierdo et al. 2004). OFC neurons in monkeys respond to cues that predict rewarding outcomes and alter their responses when changes in stimulus-reinforcer contingencies occur (Rolls et al. 1996; Tremblay and Schultz 1999, 2000a, 2000b; Wallis and Miller 2003; Roesch and Olson 2004; Hosokawa et al. 2005).

The degree to which OFC maintains information about past rewards has been studied less extensively. OFC lesions in humans impair their ability to weigh short-term gains against long-term losses, but these deficits appear to arise from an insensitivity to future consequences rather than from an impairment in the memory of recent rewards (Bechara et al. 1994, 2000). OFC neurons maintain information about future rewards over delays, but these neuronal responses have been seen during stimulus-reinforcer time intervals and relate more to reward expectancy than to reward history (Hikosaka and Watanabe 2000; Wallis and Miller 2003).

We have investigated the ways in which OFC represents reward information by recording from single neurons in monkeys performing a reward schedule task. We examined the effects of preceding, current, and upcoming rewards and found that OFC neurons maintained reward information from the preceding trial, in addition to encoding reward expectancy and responding to reward delivery itself. The degree to which ensembles of OFC neurons represented preceding or upcoming reward changed with context and time, reflecting the reward information available and the relative importance of that information. The neuronal ensembles representing preceding or upcoming reward included overlapping but different sets of neurons in different contexts and time periods. Together, our results suggest that OFC carries signals that can provide a basis for the reward components of executive function.

Materials and Methods

Subjects

Subjects were 2 male rhesus monkeys weighing 6 and 8.6 kg.

Behavioral Training and Testing

The monkeys were trained to perform a visually cued reward schedule task (Fig. 1A,B; Bowman et al. 1996). All behavioral training and testing took place with the monkeys squatting in a primate chair, responding to visual stimuli presented on a computer display monitor. Behavioral control and data acquisition were performed using the REX program (Hays et al. 1982). Neurobehavioral Systems Presentation software was used to display visual stimuli (Neurobehavioral Systems, Inc., Albany, CA).

Figure 1.

Reward schedule task. (A) Events occurring in each trial of the reward schedule task. Trials begin when the monkey touches a bar mounted at the front of the chair. A visual cue then appears alone for 2–2.5 s. The visual cue remains on for the duration of the trial. The wait signal (a red spot) appears for 0.5–1.5 s, and the monkey must hold the bar until the red spot turns green (the go signal). A trial is performed correctly when the monkey releases the bar between 0.2 and 0.8 s after the go signal. Correct trials are signaled by the green spot turning blue. Intertrial intervals (ITI) and interschedule intervals (ISI) are 1 s in length. (B) Trial sequences within each schedule. Schedules contain sequences of 1, 2, or 3 trials. The schedule state indicates the trial and schedule length (e.g., 1:1, 1:2, 2:2, 1:3, 2:3, and 3:3). In the valid cue context, schedule states are differentiated by the gray intensity of the visual cue (as illustrated). In the random cue context, the visual cue is randomly chosen on each trial and has no relation to the underlying schedule state. After correct performance on each trial, the monkey progresses to the next trial in the schedule (blue arrows), receiving a reward at the end of a complete schedule (liquid drop). If an error is made, the monkey must repeat that trial but is not required to return to the beginning of the schedule. After successful completion of the current schedule, a new schedule is picked at random. (C) Behavioral performance for 2 monkeys (M1 and M2). Error rates are plotted as a function of schedule state. Error rates were calculated by dividing the total number of errors by the total number of trials in each schedule state across all recording sessions, resulting in a single grand error rate for each schedule state in each context. 
In the valid cue context (black lines), the monkeys had the lowest error rates in the rewarded schedule states and progressively higher error rates as a function of the number of trials remaining in the schedule (chi-square tests for linear trend, ***P < 0.001). In the random cue context (red lines), each monkey responded with low and indistinguishable error rates across all schedule states.

The monkeys were first taught to respond to a fixation spot changing color. Each color discrimination trial was initiated when the monkey touched a bar mounted at the front of the chair. After the bar was touched, a red spot appeared (wait signal). To perform a trial correctly, the monkey was required to release the bar after the spot turned green (go signal) and before it disappeared. On correct trials, the spot then turned blue (correct signal). During this phase of training (∼4 weeks), all correctly performed trials were rewarded.

After the monkeys learned to perform color discrimination trials at greater than 80% correct, the visually cued, multi-trial reward schedule task was introduced. In this task, sequences of trials were embedded within schedules (Fig. 1B). Each trial was identical to that described above but began with the presentation of a visual cue for 2000–2500 ms. Each schedule contained 1, 2, or 3 trials. The monkey was required to complete all of the trials in a schedule before receiving a reward. After the successful completion of a schedule, a liquid reward was delivered, and a new schedule (1, 2, or 3 trials) was chosen randomly. Errors consisted of any bar release occurring outside the 200- to 800-ms post-go period. After an error, there was no explicit punishment, but the monkey had to repeat the current trial to advance in the schedule or receive a reward. The ordered placement of a trial within a schedule is referred to as the “schedule state”; the schedule state indicates the current trial and the current schedule length (e.g., 1:1, 1:2, 2:2, 1:3, 2:3, and 3:3). The first trial of any schedule always follows the delivery of a reward. Within a multi-trial schedule, the final rewarded trial always follows one or more unrewarded trials.
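The progression through schedule states described above can be sketched as a small simulation. This is an illustration only, not the REX task code: the function name `run_schedules`, the fixed error probability `p_error`, and the seeded random generator are assumptions introduced for the example.

```python
import random

def run_schedules(n_schedules, p_error=0.0, seed=0):
    """Simulate the trial sequence of the reward schedule task.

    Returns a list of (schedule_state, correct, rewarded) tuples.
    p_error is a hypothetical, fixed probability of an incorrect bar release.
    """
    rng = random.Random(seed)
    trials = []
    for _ in range(n_schedules):
        length = rng.choice([1, 2, 3])      # new schedule chosen at random
        trial = 1
        while trial <= length:
            correct = rng.random() >= p_error
            # Reward is delivered only after the last trial of a schedule.
            rewarded = correct and trial == length
            trials.append((f"{trial}:{length}", correct, rewarded))
            if correct:
                trial += 1                  # an error repeats the same trial
    return trials
```

With `p_error=0.3`, for example, error trials appear as repeated schedule states, while the number of rewards still equals the number of completed schedules.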

The reward schedule task was run in 2 contexts, a valid cue context and a random cue context, during separate blocks within each session. In both contexts, schedules of different lengths were intermixed randomly and trials progressed in an ordered sequence within each schedule. In both contexts, preceding reward information was available to the monkey. In the valid cue context, the gray value intensity of the cue and the schedule state were paired (1:1 = 100% black; 1:2 = 50% black; 2:2 = 100% black; 1:3 = 33% black; 2:3 = 66% black; 3:3 = 100% black), so that the cue provided information about upcoming reward expectancy. In the random cue context, the gray value of the cue on any given trial was chosen randomly, so the cue provided no information about upcoming reward expectancy. The monkeys were not specifically trained to use cue-related information to perform the task.
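The cue values listed for the valid context are consistent with the fraction of the schedule completed (trial number divided by schedule length, truncated to a whole percentage). A minimal sketch under that assumption follows. The function names are hypothetical, and drawing the random-context cue from the same six values is also an assumption, because the text does not say which set the random cues were drawn from.

```python
import random

SCHEDULE_STATES = [(1, 1), (1, 2), (2, 2), (1, 3), (2, 3), (3, 3)]

def valid_cue_percent_black(trial, length):
    """Cue intensity in the valid context: trial/length as a truncated percentage."""
    return 100 * trial // length

def random_cue_percent_black(rng):
    """Cue intensity in the random context (assumed drawn from the valid-cue set)."""
    trial, length = rng.choice(SCHEDULE_STATES)
    return valid_cue_percent_black(trial, length)

# Example: the cue for schedule state 2:3
print(valid_cue_percent_black(2, 3))  # 66
```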

Surgery

A recording chamber and head fixation post (Crist Instrument Co., Inc., Hagerstown, MD) were implanted using a sterile surgical procedure under general anesthesia in a veterinary operating facility. Craniotomy and chamber locations were determined using stereotaxic coordinates from a baseline magnetic resonance image (MRI) of each animal's brain. During the same surgery, a scleral magnetic search coil was implanted in one eye so that eye position could be monitored during recording sessions (Robinson 1963; Judge et al. 1980). All procedures were carried out according to the National Institutes of Health guidelines, supervised by a board certified veterinarian, and approved by the National Institute of Mental Health Animal Care and Use Committee.

Magnetic Resonance Images

MRIs at 1.5 T were obtained regularly throughout this study to determine initial recording well placement and to localize recording sites.

Electrophysiological Recordings

Single-unit recordings were made in caudal area 11 and rostral area 13 of OFC (Fig. 2; Carmichael and Price 1994). Recording sessions began after behavioral patterns on the reward schedule task had stabilized (8–9 weeks). All well-isolated single-unit action potentials were recorded and included in the data analyses. Action potentials were converted to pulses using a time–voltage window discriminator (FHC Inc., Bowdoin, ME) and were recorded at 1 ms resolution with REX.

Figure 2.

MRI with recording electrode in area 13 of OFC (tip of recording electrode marked with star). All recording sites were located between the medial orbital sulci (MOS) and lateral orbital sulci (LOS), from 33 to 37 mm rostral to the interaural line (caudal area 11 and rostral area 13). MRIs were obtained on a 1.5-T General Electric Signa unit, using a 5-inch surface coil and a 3-dimensional volume spoiled GRASS pulse sequence (echo time 6, repetition time 25, flip angle 30, field of view 11 cm, slice thickness 1 mm).

Data Analyses

All data analyses were performed in the R statistical computing environment (Team RDC 2004). Neuronal activity was quantified by measuring spike counts during 400 ms time windows at 4 points in each trial: 1) just prior to cue appearance (precue period), 2) following cue appearance (cue-triggered period), 3) during the wait period, and 4) at the time of reward delivery in rewarded trials or the time when a reward would have occurred in unrewarded trials (outcome-triggered period). The 400-ms duration of the time windows was chosen because it was similar to that used in other studies of orbitofrontal neurons (Rosenkilde et al. 1981; Thorpe et al. 1983; Tremblay and Schultz 2000b) and because it was compatible with the times separating successive events in the reward schedule task.

The onset of each analysis time period was determined with a sliding window. For each neuron, a 400-ms window was moved in 25 ms increments across a 1-s period around each event of interest. At each time point, 1-way analyses of variance (ANOVAs) were used to test whether firing rates depended upon the occurrence of a preceding reward, the expectation of an upcoming reward, the outcome of a trial, or schedule state (see below). The numbers of neurons with activity modulations dependent on each of these factors and the average variance explained by each factor were determined at every time point. For each event of interest, we chose the time period that captured the maximum numbers of neurons with significant activity modulations and/or the maximum variance.
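The sliding-window step can be illustrated with a short sketch. The analyses themselves were run in R; this Python version is illustrative only. It assumes that the whole 400-ms window stays inside the 1-s period centered on the event, which gives 25 window positions. The text does not state whether the window start or the whole window was confined to that period, and the function name and arguments are hypothetical.

```python
def sliding_window_counts(spike_times_ms, event_ms, width=400, step=25, half_span=500):
    """Spike counts in a window slid across a period centered on an event.

    Returns a list of (window_start_relative_to_event, spike_count).
    Each window covers [start, start + width) in ms.
    """
    counts = []
    start = -half_span
    while start + width <= half_span:        # assumption: window stays inside the period
        lo = event_ms + start
        hi = lo + width
        n = sum(1 for t in spike_times_ms if lo <= t < hi)
        counts.append((start, n))
        start += step
    return counts
```

In the full analysis, the counts at each window position would be passed to the ANOVAs described below, and the position capturing the most significant neurons or the most variance would be retained.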

We examined 3 possible ANOVA models to explain the activity modulations observed in this task. We looked for activity dependent upon 1) “Preceding reward,” by comparing activity during the schedule states that immediately followed a rewarded trial to activity during the schedule states that followed a correctly performed, unrewarded trial. The schedule states following a rewarded trial consisted of the first trials of each schedule (1:1, 1:2, 1:3); the schedule states following an unrewarded trial were the later trials in the multi-trial schedules (2:2, 2:3, 3:3). 2) “Upcoming reward” and “reward delivery,” by comparing activity during rewarded schedule states (1:1, 2:2, 3:3) to schedule states in which no reward would be delivered, even if the trial were performed correctly (1:2, 1:3, 2:3). 3) “Schedule state,” by comparing activity across all 6 of the schedule states.
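The three groupings of schedule states can be written out explicitly. The sketch below only restates the definitions given above; the function names are hypothetical.

```python
STATES = ["1:1", "1:2", "2:2", "1:3", "2:3", "3:3"]

def follows_reward(state):
    """Preceding reward factor: first trials of a schedule follow a rewarded trial."""
    trial, _length = state.split(":")
    return trial == "1"

def is_rewarded(state):
    """Upcoming reward / reward delivery factor: the last trial of a schedule is rewarded."""
    trial, length = state.split(":")
    return trial == length

def factor_levels(state):
    """Levels of the three ANOVA factors for one schedule state."""
    return {
        "preceding_reward": follows_reward(state),   # 2 levels
        "upcoming_reward": is_rewarded(state),       # 2 levels
        "schedule_state": state,                     # 6 levels
    }
```

Note that state 1:1 belongs to the "reward" level of both 2-level factors, whereas 2:3 belongs to the "no reward" level of both, so the two factors are not mirror images of one another.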

Two-level, 1-way ANOVAs were used to identify activity modulations dependent on preceding reward, upcoming reward, or reward delivery. Six-level, 1-way ANOVAs were used to identify schedule state–dependent activity. During each time window, the activity of each neuron was tested with 3 different ANOVAs; therefore, a significance threshold of P < 0.01 was used. To quantify the degree to which neuronal activity depended upon the factors of interest, we determined the variance in the neuronal activity explained by each factor. We calculated this variance directly from the ANOVA results by dividing the sum of squares of each factor by the total sum of squares (SSfactor/[SSfactor + SSresiduals] × 100). This measure is equivalent to calculating the power of the neuronal signal related to each factor.
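The variance-explained measure can be illustrated with a minimal 1-way ANOVA written in plain Python. This is a sketch, not the R code used for the analyses. The function name is hypothetical, and the P value is omitted because it would require the F distribution.

```python
def one_way_anova(groups):
    """1-way ANOVA on spike counts.

    groups: one list of spike counts per factor level.
    Returns (F, df_factor, df_residual, percent_variance_explained).
    Assumes the residual sum of squares is nonzero.
    """
    values = [x for g in groups for x in g]
    n = len(values)
    grand_mean = sum(values) / n
    means = [sum(g) / len(g) for g in groups]
    ss_factor = sum(len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, means))
    ss_resid = sum((x - m) ** 2 for g, m in zip(groups, means) for x in g)
    df_factor = len(groups) - 1
    df_resid = n - len(groups)
    f_stat = (ss_factor / df_factor) / (ss_resid / df_resid)
    pct = 100 * ss_factor / (ss_factor + ss_resid)
    return f_stat, df_factor, df_resid, pct

# Toy example: counts after rewarded vs. unrewarded trials
f_stat, df1, df2, pct = one_way_anova([[2, 3, 4], [6, 7, 8]])
print(f_stat, df1, df2, round(pct, 1))  # 24.0 1 4 85.7
```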

Neuronal activity modulations were placed into the schedule state category when the 6-level ANOVA model explained significantly more variance than either of the 2-level ANOVAs, after correcting for the degrees of freedom (R ANOVA function, P < 0.01). Otherwise, they were categorized according to which of the 2-level ANOVA models explained the most variance. For schedule state–dependent activity, post hoc comparisons of pairs of schedule states were performed using the Tukey test, which corrects automatically for multiple comparisons.
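One way to read this comparison is as an F test between nested models, since each 2-level grouping pools the 6 schedule states into 2 sets. The sketch below works under that assumption. The exact R call used in the study is not given in the text, the function names are hypothetical, and the P value is again omitted.

```python
def residual_ss(groups):
    """Residual sum of squares around each group's own mean."""
    return sum((x - sum(g) / len(g)) ** 2 for g in groups for x in g)

def nested_f(counts_by_state, level_of_state):
    """F statistic comparing a 2-level model with the 6-level schedule state model.

    counts_by_state: dict mapping schedule state -> list of spike counts.
    level_of_state:  dict mapping schedule state -> level of the 2-level factor.
    Returns (F, df_numerator, df_denominator).
    """
    pooled = {}
    for state, counts in counts_by_state.items():
        pooled.setdefault(level_of_state[state], []).extend(counts)
    rss_small = residual_ss(list(pooled.values()))          # 2-level model
    rss_big = residual_ss(list(counts_by_state.values()))   # 6-level model
    n = sum(len(c) for c in counts_by_state.values())
    df_small = n - len(pooled)
    df_big = n - len(counts_by_state)
    f_stat = ((rss_small - rss_big) / (df_small - df_big)) / (rss_big / df_big)
    return f_stat, df_small - df_big, df_big
```

A large F indicates that distinguishing all 6 states captures variance beyond what the 2-level factor already explains, given the 4 additional degrees of freedom.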

To create population spike density functions, we identified the schedule state with the greatest firing rate within the time period under analysis and normalized that rate to 1.0. The firing rates in each of the schedule states were then scaled relative to that maximum.
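The normalization can be sketched as follows. Each neuron's firing rates are scaled to its own maximum across schedule states. Averaging the normalized values across neurons to obtain the population curve is an assumption, as the text does not spell out that step, and the function names are hypothetical.

```python
def normalize_neuron(rate_by_state):
    """Scale one neuron's firing rates so that its highest-rate state equals 1.0.

    rate_by_state: dict mapping schedule state -> firing rate in the analysis window.
    Assumes the peak rate is greater than zero.
    """
    peak = max(rate_by_state.values())
    return {state: rate / peak for state, rate in rate_by_state.items()}

def population_mean(neurons):
    """Average the normalized rates across neurons (assumed population step)."""
    normalized = [normalize_neuron(n) for n in neurons]
    states = normalized[0].keys()
    return {s: sum(n[s] for n in normalized) / len(normalized) for s in states}
```

The same scaling would be applied at every time point of a spike density function, so that neurons with very different absolute firing rates contribute equally to the population curve.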

Results

Behavioral Results

As shown in Figure 1C, both monkeys showed similar patterns of behavior in the reward schedule task. In the valid cue context, monkeys made the fewest errors on the rewarded schedule states (1:1, 2:2, 3:3), made more errors on the unrewarded states (1:2, 1:3, 2:3), and made fewer errors as they progressed in the 2- and 3-trial schedules (chi-square tests for linear trend; M1: 3-trial schedule, χ2 = 39.3, df = 1, P < 0.001; M2: 2-trial schedule, χ2 = 61.2, df = 1, P < 0.001; 3-trial schedule, χ2 = 464.2, df = 1, P < 0.001). In the random cue context, each animal responded with low and indistinguishable error rates across all schedule states (chi-square tests for linear trend; M1: 2-trial schedule, χ2 = 1.4, P = 0.2; 3-trial schedule, χ2 = 2.3, P = 0.1; M2: 2-trial schedule, χ2 = 0.02, P = 0.9; 3-trial schedule, χ2 = 0.1, P = 0.8). These error patterns show that the monkeys used the visual cues to guide their behavior in the valid cue context. They appeared to ignore the visual cues in the random cue context.

Electrophysiological Results

Single-unit recordings were obtained from caudal area 11 and rostral area 13 of OFC, as confirmed by MRI (Fig. 2). Every well-isolated neuron was recorded during the task and included in the data analyses. Recording sites were similar in the 2 monkeys, and the same types of neuronal activity modulations were found in both monkeys. No spatial segregation of responses was observed in either monkey. Therefore, all of the neurons from the 2 monkeys were treated as a single population. Eighty neurons were recorded during the valid cue context of the reward schedule task (M1 n = 36, M2 n = 44); 58/80 were also recorded during the random cue context (M1 n = 23, M2 n = 35).

Neuronal activity modulations were analyzed during four 400 ms time periods. The precue time period began 400 ms before cue onset. The cue-triggered time period began 200 ms after cue presentation. Activity modulations during the wait period developed slowly, with the largest number starting 350 ms after the red spot appeared. The end time of this 400 ms window occurred after the end of the red period, that is, after the green spot appeared, in 20% of trials. These trials were excluded from analysis. The outcome-triggered period began 150 ms after reward delivery (or 150 ms after the reward would have been delivered in the unrewarded trials).
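The four analysis windows and the exclusion rule for the wait period can be summarized in a short sketch. The event names (`cue_on`, `red_on`, `green_on`, `outcome`) and function names are hypothetical labels introduced here; the offsets are those given above.

```python
WINDOW_MS = 400

# period -> (aligning event, window onset relative to that event, in ms)
WINDOWS = {
    "precue": ("cue_on", -400),
    "cue_triggered": ("cue_on", 200),
    "wait": ("red_on", 350),
    "outcome_triggered": ("outcome", 150),
}

def window_bounds(period, event_times):
    """Start and end (ms) of the analysis window for one trial."""
    event, offset = WINDOWS[period]
    start = event_times[event] + offset
    return start, start + WINDOW_MS

def keep_for_wait_analysis(event_times):
    """Keep a trial only if the wait window ends before the go signal appears."""
    _start, end = window_bounds("wait", event_times)
    return end <= event_times["green_on"]

# Example trial: the red spot lasts 700 ms, so the wait window (350-750 ms after
# red onset) runs past the go signal and the trial is dropped from wait analyses.
trial = {"cue_on": 1000, "red_on": 3200, "green_on": 3900, "outcome": 4500}
print(keep_for_wait_analysis(trial))  # False
```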

A large proportion of OFC neurons showed significant activity modulations during the reward schedule task. Sixty-nine out of 80 (86%) neurons in the valid cue context and 48/58 (83%) neurons in the random cue context showed significant activity modulations during at least one time period in this task. Many OFC neurons had significant activity modulations during more than one time period (45/80 in valid cue; 24/58 in random cue).

Neuronal activity modulations could be placed into one of 4 categories: 1) activity dependent upon the reward outcome of the preceding trial; 2) activity dependent upon the upcoming reward outcome of the current trial; 3) activity dependent upon schedule state; and 4) activity dependent upon reward delivery itself. Table 1 summarizes these findings.

Table 1

Numbers (proportion) of OFC neurons with significant activity modulations in each context and time period (ANOVA, P < 0.01)

                                     Precue      Cue triggered   Wait        Outcome triggered
Valid cue context (n = 80)
    Preceding reward                 21 (26%)*   14 (18%)*       2 (3%)      3 (4%)
    Upcoming reward/reward delivery  1 (1%)      15 (19%)*       24 (30%)*   27 (34%)*
    Schedule state                   4 (5%)      15 (19%)*       7 (9%)      13 (16%)*
Random cue context (n = 58)
    Preceding reward                 29 (50%)*   16 (28%)*       5 (9%)      3 (5%)
    Upcoming reward/reward delivery  0 (0%)      2 (3%)          1 (2%)      21 (36%)*
    Schedule state                   1 (2%)      2 (3%)          1 (2%)      3 (5%)

* Number of neurons greater than expected by chance (1-sided binomial test, P < 0.001).
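A 1-sided binomial test of this kind asks how likely it is that at least k of n neurons would reach significance by chance alone. The sketch below computes that upper-tail probability. The per-neuron chance rate used for Table 1 is not stated in the text, so `p_chance` is left as a free parameter. The value 0.01 in the usage line is only an illustrative assumption taken from the ANOVA threshold, and the result should not be expected to reproduce the asterisks in the table.

```python
from math import comb

def binomial_upper_tail(k, n, p_chance):
    """P(X >= k) for X ~ Binomial(n, p_chance)."""
    return sum(
        comb(n, i) * p_chance ** i * (1 - p_chance) ** (n - i)
        for i in range(k, n + 1)
    )

# Illustrative use: 21 of 80 neurons, with an assumed chance rate of 0.01
p_value = binomial_upper_tail(21, 80, 0.01)
print(p_value < 0.001)  # True
```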

Activity Dependent upon Preceding Reward

Precue Period

For 21 neurons (26%), activity modulations encoded the outcome of the preceding trial in the valid cue context during the precue period (2-level, 1-way ANOVA, P < 0.01). Twelve of these neurons had higher firing rates after rewarded trials. Figure 3 shows one of the neurons with this type of activity modulation; the inset shows the population spike density function for all 12 such neurons. In the other 9 neurons, firing rates were tonically higher in trials following correctly performed, unrewarded trials than in trials following a reward. Overall, the reward outcome of the preceding trial explained an average of 13.9% of the variance in the valid cue context.

Figure 3.

Preceding reward–dependent activity during the precue period in the valid cue context. In this example, activity levels in the trials following a rewarded trial (1:1, 1:2, 1:3, thick gray lines) were significantly higher than in the trials following a correctly completed, unrewarded trial (2:2, 2:3, 3:3, thin black lines; 2-level, 1-way ANOVA, F1,187 = 274.6, P < 0.001). The firing rate in this neuron also increased as the onset of the first cue in a schedule approached (1-tailed, paired t-test between the 2 halves of the precue time window; t = 9.1, df = 86, P < 0.001). Each row of dots represents an individual trial in the task. Spike density function curves (Gaussian kernel sd = 25 ms) are superimposed on each raster. Rasters are aligned on cue onset (0 ms, vertical line). The ordinate shows firing rate per trial (in Hertz). The precue time window (−400 to 0 ms) is indicated above each raster with a horizontal line. Inset, normalized population spike density functions for the 12 neurons with higher firing rates following a rewarded trial. Each thin black or thick gray line represents a different schedule state. Thin light gray lines represent the outer bounds of the standard errors for each set of curves.

The activity of 29 neurons (50%) encoded the outcome of the preceding trial in the random cue context (2-level, 1-way ANOVA, P < 0.01). Figure 4 shows one of the 12 neurons for which the precue firing rate following rewarded trials was tonically higher than its firing rate following correctly completed, unrewarded trials. The inset shows the population spike density function for the 12 neurons with this type of activity modulation. In the other 17 neurons, precue firing rates were higher after a previously unrewarded trial. Overall, the preceding reward factor explained an average of 10.9% of the variance in the random cue context (a result statistically indistinguishable from that explained in the valid context; 2-tailed, unpaired t-test, P = 0.44).

Figure 4.

Preceding reward–dependent activity during the precue period in the random cue context. In this example, activity levels in the trials following a rewarded trial (1:1, 1:2, 1:3, thick gray lines) were significantly higher than in the trials following a correctly completed, unrewarded trial (2:2, 2:3, 3:3, thin black lines; 2-level, 1-way ANOVA, F1,276 = 132.7, P < 0.001). Inset, normalized population spike density functions for the 12 neurons with higher firing rates following rewarded trials. Display conventions as in Figure 3.

Across the OFC population, precue activity modulations depended upon task context. A significantly greater proportion of neurons encoded the reward outcome of the preceding trial in the random than the valid cue context during this time period (Fig. 5A; 2-sided proportions test, χ2 = 7.21, df = 1, P = 0.007). In addition, the particular neurons encoding this factor changed when the task context changed. The 50 activity modulations dependent upon preceding reward occurred in 39 neurons; only 11 neurons encoded preceding reward in both contexts.
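The proportions test can be checked by hand from the counts in Table 1 (21/80 neurons in the valid and 29/58 in the random cue context). The sketch below is a plain-Python version of a 2-sample test of proportions with the Yates continuity correction, which is the default behavior of R's `prop.test`. That the correction was used here is an inference: with it, the counts give the reported statistic. The function name is hypothetical.

```python
from math import erfc, sqrt

def two_proportion_chi2(x1, n1, x2, n2, correct=True):
    """Chi-square test (df = 1) for equality of two proportions.

    Returns (chi2, two_sided_p). With correct=True, applies the Yates
    continuity correction.
    """
    a, b = x1, n1 - x1            # successes / failures in sample 1
    c, d = x2, n2 - x2            # successes / failures in sample 2
    n = n1 + n2
    diff = abs(a * d - b * c)
    if correct:
        diff = max(0.0, diff - n / 2)
    chi2 = n * diff ** 2 / (n1 * n2 * (a + c) * (b + d))
    p_value = erfc(sqrt(chi2 / 2))   # survival function of chi-square with df = 1
    return chi2, p_value

chi2, p_value = two_proportion_chi2(21, 80, 29, 58)
print(round(chi2, 2), round(p_value, 3))  # 7.21 0.007
```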

Figure 5.

Percentages of neurons with preceding reward–dependent and upcoming reward–dependent activity in the precue, cue-triggered, and wait time periods during the valid and random cue contexts. (A) Preceding reward–dependent activity. During the precue period, the proportion of neurons with preceding reward–dependent activity was significantly larger in the random than the valid cue context (2-sided proportions test, **P < 0.01). The proportion of neurons with preceding reward–dependent activity decreased linearly across time periods in both contexts (chi-square tests for linear trend; ***P < 0.001). (B) Upcoming reward–dependent activity. In the cue-triggered and wait periods, the proportion of neurons with upcoming reward–dependent activity was significantly larger in the valid than the random cue context (2-sided proportions tests, *P < 0.05, ***P < 0.001). The proportion of neurons with upcoming reward–dependent activity increased linearly across time periods in the valid context (chi-square tests for linear trend, ***P < 0.001) but did not change in the random context.

After rewarded trials, 7/21 activity modulations in the valid and 10/29 modulations in the random cue context consisted of statistically significant increases in firing rate across the precue period (Fig. 3; 1-tailed, paired t-tests between the 2 halves of the 400 ms precue time window, P < 0.025). The magnitude of these increases was significantly higher in the valid than the random cue context (1-tailed, unpaired t-test, P = 0.025). This rising activity pattern appears to reflect anticipation of the first trial in a new schedule: greater increases occur in the valid context, when the cue will provide information about the upcoming schedule.
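The test for rising precue activity compares, trial by trial, the spike count in the second half of the precue window with the count in the first half. A minimal sketch of the paired t statistic follows. The function name is hypothetical, the inputs are assumed to be per-trial spike counts from the two 200-ms halves, and the P value is omitted because it would require the t distribution.

```python
from math import sqrt

def paired_t(first_half_counts, second_half_counts):
    """Paired t statistic for second-half minus first-half spike counts.

    Returns (t, df). A positive t indicates rising activity across the window.
    Assumes the trial-by-trial differences are not all identical.
    """
    diffs = [b - a for a, b in zip(first_half_counts, second_half_counts)]
    n = len(diffs)
    mean_diff = sum(diffs) / n
    var_diff = sum((d - mean_diff) ** 2 for d in diffs) / (n - 1)
    t_stat = mean_diff / sqrt(var_diff / n)
    return t_stat, n - 1

# Toy example with 4 trials
t_stat, df = paired_t([1, 2, 3, 4], [2, 4, 5, 5])
print(round(t_stat, 3), df)  # 5.196 3
```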

Precue activity modulations could have resulted from sustained taste-related activity from the preceding trial. If so, we would expect the neurons with activity dependent upon preceding rewards to be the same as those with activity dependent upon reward delivery (see below). However, this criterion was met for only 4/21 (19%) and 9/29 (31%) neurons in the valid and random cue contexts, respectively. We would also expect the same type of activity modulation during the outcome-triggered and precue periods. For example, if a neuron's firing rates were higher at the end of rewarded versus unrewarded trials, this same neuron should have higher activity prior to first trials. This was true for 2/4 cells in the valid and 8/9 cells in the random cue context. Finally, if the precue activity modulations were an extension of reward delivery–dependent modulations, firing rate differences should be continuous throughout the intertrial interval. Four neurons (1 in the valid and 3 in the random cue context) had sustained activity differences across the intertrial interval, from the previous trial's outcome to and including the precue time window (sliding window analysis; 2-level, 1-way ANOVAs, P < 0.01). For these 4 neurons, we could not rule out the possibility that precue activity modulations resulted from sustained reward delivery-dependent or taste-related activity.

Cue-Triggered Period

Fourteen (18%) cells in the valid cue context and 16 (28%) cells in the random cue context encoded the reward outcome of the preceding trial even after a new trial had begun and a new cue had been presented (Fig. 5A; 2-level, 1-way ANOVA, P < 0.01). The proportion of neurons in this category was statistically indistinguishable between the valid and random cue contexts (2-sided proportions test, χ² = 1.46, df = 1, P = 0.23). Preceding reward also explained statistically indistinguishable amounts of the variance in the 2 contexts (valid mean = 6.7%; random mean = 8.4%; 2-tailed, unpaired t-test, P = 0.34). However, the particular neurons encoding the reward outcome of the preceding trial changed when the task context changed: the 30 cue-triggered preceding reward modulations occurred in 27 neurons, only 3 of which encoded this factor in both contexts.
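As an illustration of the between-context comparison above, the following minimal Python sketch computes a 2-sided test of equal proportions as a chi-square test with 1 degree of freedom. It assumes a Yates continuity correction, which the text does not specify, and the function name `two_proportion_chi2` is hypothetical. The counts are those reported above (14 of 80 neurons in the valid context, 16 of 58 in the random context).

```python
import math

def two_proportion_chi2(x1, n1, x2, n2, correct=True):
    """Chi-square test (df = 1) comparing the proportions x1/n1 and x2/n2.

    correct=True applies a Yates continuity correction (an assumption here;
    the paper does not say whether one was used).
    """
    total = n1 + n2
    hits = x1 + x2
    observed = [x1, n1 - x1, x2, n2 - x2]
    expected = [
        n1 * hits / total, n1 * (total - hits) / total,
        n2 * hits / total, n2 * (total - hits) / total,
    ]
    chi2 = 0.0
    for o, e in zip(observed, expected):
        diff = abs(o - e)
        if correct:
            diff = max(diff - 0.5, 0.0)
        chi2 += diff ** 2 / e
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p = math.erfc(math.sqrt(chi2 / 2.0))
    return chi2, p

# Cue-triggered preceding reward modulations: valid 14/80, random 16/58.
chi2, p = two_proportion_chi2(14, 80, 16, 58)
print(f"chi2 = {chi2:.2f}, P = {p:.2f}")
```

With the continuity correction, the sketch gives χ² ≈ 1.46 and P ≈ 0.23 for these counts, in line with the values reported in the text.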

Wait Period

The wait period began at least 2 s after the end of the previous trial. During this time window, the number of neurons with significant preceding reward–dependent modulations was no greater than that expected by chance (Table 1; 1-sided binomial tests, valid P = 0.91; random P = 0.16).

Across Time Periods

The number of OFC neurons encoding preceding reward depended upon the proximity of the preceding trial outcome. The proportion of neurons with preceding reward–dependent activity decreased significantly across the 3 time periods in both the valid and the random cue contexts (Fig. 5A; chi-square tests for linear trend, valid χ² = 17.3, df = 1, P = 3.2 × 10⁻⁵; random χ² = 24.2, df = 1, P = 8.5 × 10⁻⁷).
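A chi-square test for linear trend in proportions can be sketched as follows. This is a minimal illustration using equally spaced scores for the 3 time periods, not necessarily the exact routine used for the analysis. The precue (21) and cue-triggered (14) counts for the valid context are taken from the text. The wait-period count is reported in Table 1 rather than in this section, so the value used below is a placeholder assumption, and the function name `chi2_linear_trend` is hypothetical.

```python
import math

def chi2_linear_trend(counts, totals, scores=None):
    """Chi-square test (df = 1) for a linear trend in proportions.

    counts[i] of totals[i] neurons are significant in ordered group i.
    """
    if scores is None:
        scores = list(range(1, len(counts) + 1))  # equally spaced periods
    n = sum(totals)
    p_bar = sum(counts) / n
    s_bar = sum(t * s for t, s in zip(totals, scores)) / n
    # Covariance-like term between group score and number of significant neurons.
    num = sum(c * (s - s_bar) for c, s in zip(counts, scores))
    denom = p_bar * (1.0 - p_bar) * sum(
        t * (s - s_bar) ** 2 for t, s in zip(totals, scores)
    )
    chi2 = num ** 2 / denom
    p = math.erfc(math.sqrt(chi2 / 2.0))  # chi-square upper tail, df = 1
    return chi2, p

# Valid context, preceding reward: precue and cue-triggered counts from the text.
WAIT_COUNT_PLACEHOLDER = 2  # ASSUMPTION: not stated in this section (see Table 1)
trend_chi2, trend_p = chi2_linear_trend(
    [21, 14, WAIT_COUNT_PLACEHOLDER], [80, 80, 80]
)
print(f"chi2 = {trend_chi2:.1f}, P = {trend_p:.1e}")
```

Equally spaced scores (1, 2, 3) treat the three time periods as evenly ordered; other spacings could be passed through `scores`.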

Activity Dependent upon Upcoming Reward

Precue Period

Prior to cue presentation, no information about the reward expectancy of the upcoming trial was available. During this time period, the number of neurons with significant upcoming reward–dependent activity was no greater than that expected by chance (Table 1; 1-sided binomial tests; valid P = 0.98; random P = 1).

Cue-Triggered Period

In the valid cue context, the cue-triggered activity of 15 (19%) neurons depended on whether or not the trial would be rewarded (Fig. 5B; 2-level, 1-way ANOVA, P < 0.01). The firing rates of 10 of these neurons were higher in to-be rewarded trials. Figure 6 shows one of the neurons with this type of activity; the inset shows the population spike density function for all 10 such neurons. The other 5 neurons had higher firing rates in to-be unrewarded trials. Overall, the upcoming reward factor explained an average of 13.5% of the variance in the valid cue context.
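The 2-level, 1-way ANOVA and the "variance explained" measure used throughout this section can be illustrated with a short Python sketch. The spike counts below are invented for illustration only and are not data from the study. Variance explained is computed here as η² (between-group sum of squares divided by the total sum of squares), which is one common definition; whether the authors used exactly this measure is an assumption.

```python
def one_way_anova_two_groups(a, b):
    """2-level, 1-way ANOVA: returns (F, df_within, eta_squared)."""
    n_a, n_b = len(a), len(b)
    mean_a = sum(a) / n_a
    mean_b = sum(b) / n_b
    grand = (sum(a) + sum(b)) / (n_a + n_b)
    ss_between = n_a * (mean_a - grand) ** 2 + n_b * (mean_b - grand) ** 2
    ss_within = (sum((x - mean_a) ** 2 for x in a)
                 + sum((x - mean_b) ** 2 for x in b))
    df_within = n_a + n_b - 2          # df_between is 1 for two groups
    f_stat = ss_between / (ss_within / df_within)
    eta_squared = ss_between / (ss_between + ss_within)
    return f_stat, df_within, eta_squared

# HYPOTHETICAL spike counts in the cue-triggered window (not from the paper).
rewarded_trials = [12, 15, 14, 13]
unrewarded_trials = [8, 9, 10, 9]

f_stat, df_within, eta_squared = one_way_anova_two_groups(
    rewarded_trials, unrewarded_trials
)
print(f"F(1,{df_within}) = {f_stat:.2f}, variance explained = {eta_squared:.1%}")
```

In an analysis of this kind, a neuron would be counted as upcoming reward–dependent when the P value associated with F falls below the 0.01 criterion, and the per-neuron η² values would then be averaged across the significant neurons.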

Figure 6.

Upcoming reward–dependent activity during the cue-triggered period in the valid cue context. In this example, activity levels in the rewarded trials (1:1, 2:2, 3:3, thick gray lines) were significantly higher than in the unrewarded trials (1:2, 1:3, 2:3, thin black lines; 2-level, 1-way ANOVA, F(1,201) = 37.9, P < 0.001). Rasters are aligned on cue onset (0 ms, vertical line). The cue-triggered time window (200–600 ms) is indicated above each raster with a horizontal line. Inset, normalized population spike density functions for the 10 cells with higher firing rates in rewarded trials. Display conventions as in Figure 3.

In the random cue context, the cues provided no information about upcoming reward, and the number of OFC neurons with upcoming reward–dependent activity was not significantly greater than that expected by chance (Table 1; 1-sided binomial test, P = 0.8). During this time period, significantly more neuronal modulations encoded reward expectancy in the valid than in the random cue context (Fig. 5B; 2-sided proportions test, χ² = 5.9, df = 1, P = 0.015).

Wait Period

In the valid cue context, 24 (30%) cells differentiated between to-be rewarded and unrewarded trials during the wait period (Fig. 5B; 2-level, 1-way ANOVA, P < 0.01). Thirteen of these neurons had increases in firing rate in to-be rewarded trials (Fig. 7). The firing rates of the other 11 neurons were higher in to-be unrewarded trials. The upcoming reward factor explained an average of 14.2% of the variance during the wait period in the valid cue context.

Figure 7.

Upcoming reward–dependent activity during the wait period in the valid cue context. In this example, activity levels in the rewarded trials (1:1, 2:2, 3:3, thick gray lines) were significantly higher than in the unrewarded trials (1:2, 1:3, 2:3, thin black lines; 2-level, 1-way ANOVA, F(1,196) = 122.3, P < 0.001). Rasters are aligned on the time at which the red spot appeared (0 ms, vertical line). The wait time window (350–750 ms) is indicated above each raster with a horizontal line. Inset, the population spike density functions for the 13 cells with higher firing rates in rewarded trials. Display conventions as in Figure 3.

In the random cue context, the activity of only one neuron appeared to depend on reward expectancy, a number no greater than that expected by chance (1-sided binomial test, P = 0.95). During this time period, significantly more neuronal activity modulations encoded upcoming reward in the valid than in the random cue context (Fig. 5B; 2-sided proportions test, χ² = 16.27, df = 1, P = 5.5 × 10⁻⁵).
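The chance comparison above can be sketched as a 1-sided binomial test: the probability of observing at least the reported number of significant neurons if each neuron reached significance independently at some false-positive rate. The sketch below uses the 58 neurons recorded in the random cue context and the single significant neuron reported above. The per-neuron chance rate of 0.05 is an assumption, because this section does not restate the rate used, and the function name `binomial_upper_tail` is hypothetical.

```python
import math

def binomial_upper_tail(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(
        math.comb(n, i) * p ** i * (1.0 - p) ** (n - i)
        for i in range(k, n + 1)
    )

N_RANDOM = 58        # neurons recorded in the random cue context
N_SIGNIFICANT = 1    # neurons with upcoming reward-dependent wait activity
CHANCE_RATE = 0.05   # ASSUMPTION: per-neuron false-positive rate

p_value = binomial_upper_tail(N_SIGNIFICANT, N_RANDOM, CHANCE_RATE)
print(f"P(X >= {N_SIGNIFICANT}) = {p_value:.2f}")
```

Under the assumed 0.05 rate, this evaluates to approximately 0.95, so a single significant neuron out of 58 is well within what chance alone would produce.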

Across Time Periods

As expected, upcoming reward–dependent activity was observed only in the valid cue context. Within the valid context, the number of OFC neurons encoding this factor depended upon the proximity of the trial outcome. A greater proportion of neurons encoded reward expectancy as each trial progressed and the predicted outcome approached (Fig. 5B; chi-square test for linear trend, χ² = 23.8, df = 1, P = 1.1 × 10⁻⁶).

Activity Dependent upon Schedule State

Precue Period

The numbers of neurons with activity modulations dependent upon schedule state were no greater than that expected by chance (Table 1; 1-sided binomial tests; valid P = 0.57; random P = 0.95).

Cue-Triggered Period

The activity of 15 (19%) cells in the valid cue context depended upon the particular schedule state indicated by the cue (Fig. 8; 6-level, 1-way ANOVA, P < 0.01). The activity modulations in these 15 cells could be divided into 9 subtypes (Tukey tests, 2-sided, P < 0.05). In 4 neurons, firing rates in the 1:1 schedule state differed significantly from rates in the other schedule states. In 3 neurons, firing rates in the 2:3 schedule state differed significantly from the others. In 2 neurons, activity modulations consisted of a monotonic trend in firing through the multi-trial schedules. Overall, state dependency in the valid cue context explained an average of 20.8% of the variance in these neurons.

Figure 8.

Percentages of neurons with schedule state–dependent activity modulations in the precue, cue-triggered, and wait time periods during the valid and random cue contexts. During the cue-triggered period, significantly more neurons had schedule state–dependent activity in the valid than the random cue context (2-sided proportions tests; *P < 0.05).

Only 2 neurons in the random cue context appeared to have schedule state–dependent activity, a number not significantly different from that expected by chance (1-sided binomial test, P = 0.79). During the cue-triggered period, significantly more activity modulations depended on schedule state in the valid than in the random cue context (Fig. 8; 2-sided proportions test, χ² = 5.9, df = 1, P = 0.015).

Wait Period

The numbers of neurons with activity modulations dependent upon schedule state were no greater than that expected by chance (Table 1; 1-sided binomial tests; valid P = 0.11; random P = 1).

Activity Dependent upon Reward Delivery

The activity of 27 (34%) neurons in the valid cue context and 21 (36%) neurons in the random cue context depended on whether or not a reward was delivered at the end of a trial (150–550 ms after trial outcome; 2-level, 1-way ANOVA, P < 0.01). For 12 neurons in the valid and 15 neurons in the random cue context, firing rates during the outcome-triggered period were higher in rewarded than unrewarded trials (Fig. 9). In other cases, outcome-triggered activity was higher in the unrewarded trials (valid cue n = 15, random cue n = 6).

Figure 9.

Reward delivery–dependent activity. (A) Valid cue context. In this example, activity levels in the rewarded trials (1:1, 2:2, 3:3, thick gray lines) were significantly higher than in the unrewarded trials (1:2, 1:3, 2:3, thin black lines; 2-level, 1-way ANOVA, F(1,178) = 101.3, P < 0.001). Rasters are aligned on the time at which the reward was delivered in the rewarded trials or the time at which the reward would have been delivered in the unrewarded trials (0 ms, vertical line). The outcome-triggered window (150–550 ms) is indicated above each raster with a horizontal line. Inset, the population spike density functions for the 12 cells with higher firing rates in rewarded trials in the valid cue context. (B) Random cue context. In this example, activity levels in the rewarded trials were significantly higher (2-level, 1-way ANOVA, F(1,196) = 41.4, P < 0.001) than in the unrewarded trials. Inset, the population spike density functions for the 15 cells with higher firing rates in rewarded trials in the random cue context. Display conventions as in Figure 3.

The proportion of neurons with reward delivery–dependent activity was statistically indistinguishable between the valid and random cue contexts (2-sided proportions test, χ² = 0.014, df = 1, P = 0.9). Trial outcome explained indistinguishable amounts of the variance in the valid and random cue contexts (valid mean = 9.8%; random mean = 13.3%; 2-tailed, unpaired t-test, P = 0.29). However, when the task context changed, the particular neurons encoding trial outcome changed. The 48 reward delivery–dependent activity modulations occurred in 41 neurons. The activity of 7 neurons depended on this factor in both contexts.

Composition of Neuronal Ensembles

Within any given context and time period, ensembles of OFC neurons encoded preceding reward, upcoming reward, reward delivery, or schedule state. However, the activity modulations of individual OFC neurons did not necessarily depend on the same factor between contexts or across time periods. We reported above that only a small proportion of the preceding reward–dependent and reward delivery–dependent modulations occurred in the same neurons in the valid and random cue contexts. We performed an additional analysis to test whether the activity of individual neurons depended upon the same factor more frequently than would be expected by chance.

Given that 3 separate ANOVAs were performed on the data from each time period, we assumed a given neuron would depend upon the same reward factor one-third of the time by chance. We then compared the activity dependence of each neuron between pairs of time windows within a context and between the 2 contexts within each time window. In the random context, individual neurons were significantly more likely than chance to have activity dependent on preceding reward in both the precue and cue-triggered time periods (chi-square test, χ² = 15.4, df = 1, P < 0.001). During the precue period, individual neurons were significantly more likely than chance to have preceding reward–dependent activity modulations in both the valid and random contexts (χ² = 10.8, df = 1, P = 0.001). All other comparisons were statistically indistinguishable from chance. Thus, the composition of the neuronal ensembles encoding preceding reward outcome and upcoming reward expectancy generally changed between task contexts and across time periods.

Discussion

We have found that neuronal activity in OFC changes to reflect the types and relative importance of the reward information available in each context of the reward schedule task. First, OFC neurons represent upcoming reward information only in the valid cue context but encode preceding reward information in both the valid and random contexts. Second, the proportion of neurons with activity dependent upon the reward outcome of the preceding trial is significantly larger in the random cue context.

The activity of OFC neurons also changes to reflect the relative temporal proximity of different rewards. The proportion of neurons with preceding reward–dependent activity is large before a trial begins and decreases over the course of each trial. Conversely, the proportion of neurons with upcoming reward–dependent activity grows over time, as the trial outcome gains immediacy.

Finally, the ensembles of neurons representing different types of reward information are also context and time dependent. The specific neurons encoding preceding or upcoming reward generally change when the task context changes or as a trial unfolds. For example, during the cue-triggered period, the neuronal ensembles carrying preceding reward information included overlapping but different sets of neurons in the random and valid cue contexts. Likewise, in the valid context, the neuronal ensemble carrying upcoming reward information during the wait period was overlapping with but different than the ensemble carrying this information during the cue-triggered period.

Earlier work has also shown that orbitofrontal responses carry information about current and future rewards, encode information about predicted reward expectancy during different task time periods, and change with learning or changes in stimulus-reinforcer contingencies (Rosenkilde et al. 1981; Thorpe et al. 1983; Rolls et al. 1996; Tremblay and Schultz 1999; Hikosaka and Watanabe 2000; Tremblay and Schultz 2000a, 2000b; Wallis and Miller 2003; Roesch and Olson 2004). Our data replicate these findings and extend them in 2 ways: we have found that OFC neurons can encode preceding as well as upcoming rewards and that changes in orbitofrontal activity reflect the types of reward information available and the relative importance of that information in a given context and time period.

Interpretations of Precue Activity

We have concluded that the precue activity in OFC is related to the reward outcome of the preceding trial rather than to anticipation of an upcoming schedule or to estimations of upcoming reward probability. We have based this interpretation on several findings: 1) The proportion of neurons with precue activity modulations is significantly greater in the random than the valid cue context; in the random context, the preceding trial's outcome is more relevant and information about the upcoming schedule is less so. 2) We have also observed precue activity related to anticipation of a new schedule. This dynamic activity is superimposed on the tonic firing rate differences that encode the reward outcome of the preceding trial and is greater in the valid cue context. 3) Preceding reward–dependent modulation occurs in a significant number of neurons even after the upcoming reward has been completely predicted by the cue in the valid cue context, making it less likely that these activity differences are related to calculations of reward probability. 4) The specific neurons with precue activity modulations are more likely than chance to be the same in the valid and random cue contexts, suggesting that a subset of neurons encodes the same variable in both contexts.

Precue activity has been observed in other studies of primate orbitofrontal neurons. However, the previous trial outcome had little or no effect on the precue activity levels in those studies, suggesting that these responses were not related to preceding reward history (Tremblay and Schultz 2000b; Hikosaka and Watanabe 2004). In contrast, neurons in lateral orbital cortex of rodents have been shown to encode the valence of a previous trial throughout a subsequent trial (Schoenbaum and Eichenbaum 1995). Our results extend these findings from the rodent to the monkey and demonstrate that the size of the preceding reward representation in OFC can depend upon the temporal proximity of the preceding reward.

Comparison to Decision-Making Tasks

In other work using stochastic decision-making tasks, neurons in dorsolateral prefrontal cortex and parietal cortex have been shown to carry signals reflecting the reward value expected in a trial as estimated from choices made and rewards received over some number of previous trials (Barraclough et al. 2004; Sugrue et al. 2004). In addition, lesions of anterior cingulate cortex in monkeys have been shown to impair their ability to use action-outcome histories to guide behavior (Kennerley et al. 2006). In these cases, different cortical regions appear to use reward history to predict the probability of a reward in the current trial.

Our reward schedule task and stochastic decision-making tasks differ in at least 2 major ways. First, monkeys performing decision-making tasks can make operant responses to select the more preferable choice; in our task, each trial must be completed independent of preceding rewards or predictions about upcoming rewards. Second, in the valid cue context of our task, the outcome of each trial is predicted with 100% certainty by the visual cue. Therefore, the upcoming reward signals in OFC arise from the information provided by the cue rather than from information estimated from prior actions or outcomes.

In the random cue context, the cues do not provide information about the schedule state, and there is a relation between the time of the last reward and the probability of receiving a reward in the current trial. Reward probability is 1/3 in the first trial after reward, 1/2 in the second trial, and 1 in the third trial. However, as shown in this experiment and in others from our laboratory, monkeys performing this task do not show a behavioral sensitivity to the reward probability structure in the random cue context (Shidara et al. 1998; Shidara and Richmond 2002; Ravel and Richmond 2006). Therefore, the activity modulations we have observed in OFC seem unlikely to reflect upcoming reward probability even in the random context of the reward schedule task. On the other hand, if needed to perform a stochastic decision-making task, the preceding reward–dependent modulations seen in OFC could be integrated to develop signals that encode estimates of future reward contingencies.
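The reward probabilities quoted above follow from conditioning on how many unrewarded trials have elapsed since the last reward. The following minimal Python sketch makes that arithmetic explicit. It assumes that schedules of 1, 2, and 3 trials are equally likely and that only the final trial of a schedule is rewarded; the equal weighting is an assumption that is consistent with the stated probabilities, and the function name is hypothetical.

```python
from fractions import Fraction

def reward_probability_by_trial(schedule_lengths=(1, 2, 3)):
    """P(reward on the k-th trial after the last reward), for each k.

    ASSUMPTION: each schedule length is equally likely, and only the last
    trial of a schedule is rewarded.
    """
    prior = Fraction(1, len(schedule_lengths))
    probabilities = {}
    for k in range(1, max(schedule_lengths) + 1):
        # Schedules still possible once k - 1 unrewarded trials have elapsed.
        still_possible = sum(prior for n in schedule_lengths if n >= k)
        # Schedules that end (and are therefore rewarded) on trial k.
        ends_now = sum(prior for n in schedule_lengths if n == k)
        probabilities[k] = ends_now / still_possible
    return probabilities

for trial, prob in reward_probability_by_trial().items():
    print(f"trial {trial} after reward: P(reward) = {prob}")
```

Under these assumptions the sketch returns 1/3, 1/2, and 1 for the first, second, and third trials after a reward, matching the values given in the text.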

Comparison to Other Areas

OFC is strongly interconnected with the midbrain dopaminergic nuclei, ventral striatum, amygdala, and anterior cingulate cortex (Oades and Halliday 1987; Barbas and De Olmos 1990; Carmichael and Price 1995; Haber et al. 1995). The ways in which these different brain regions represent different reward-related properties have been a subject of some interest (Holland and Gallagher 2004; Schultz 2004; Amiez et al. 2006; Cardinal 2006; Haber et al. 2006). Previous studies from our laboratory have examined the responses of neurons within each of these areas during the reward schedule task.

In substantia nigra, pars compacta, many dopamine neurons responded preferentially to cues in the first trial of each schedule (Ravel and Richmond 2006). These types of responses were interpreted as encoding the relative salience of the visual cues. However, about the same proportion of dopamine neurons as OFC neurons responded to first cues in the random and the valid cue context, suggesting the possibility that dopamine neurons might also encode information about preceding reward.

In ventral striatum, neuronal responses depended upon schedule state during the cue-triggered period in the valid cue context (Shidara et al. 1998; Shidara and Richmond 2004). The largest groups of cue-triggered neuronal responses in ventral striatum appeared to differentiate the first from the later trials in a schedule. However, these responses disappeared during the random context. Therefore, unlike OFC, the ventral striatum does not appear to encode preceding reward information.

In amygdala, as in OFC, differential activity modulations during the precue period were observed (Sugase-Miyamoto and Richmond 2005). However, unlike the more numerous precue OFC modulations in the random context, precue responses in amygdala were reported only in the valid context. Therefore, the precue amygdala responses appear to reflect anticipation of the information to be provided by the first cue in a schedule rather than information about whether or not the preceding trial was rewarded. In one-third of OFC neurons, we did see increases in firing rates across the precue period that mirrored the amygdala responses. Schedule anticipatory signals seen in OFC, at least in the valid context, may result from amygdalar input (Barbas and De Olmos 1990).

In anterior cingulate cortex, one-third of the neuronal responses reflected increasing reward expectancy as an animal progressed through multi-trial schedules in the valid cue context (Shidara and Richmond 2002). In OFC, only 2 cells recorded in the valid cue context showed such monotonic trends in firing rate across trials within a schedule. The proportion of OFC neurons encoding upcoming reward expectancy did increase monotonically over time but did so on a trial-by-trial basis. Thus, while both regions process information about the temporal proximity of future outcomes, neurons in anterior cingulate cortex appear to integrate this information over a longer timescale.

Activity in anterior cingulate cortex and OFC also differs in the random cue context. Anterior cingulate neurons did not respond to events in the random context, whereas activity in orbitofrontal neurons depended upon the reward outcome of the preceding trial during the precue and cue-triggered periods. Therefore, it appears that anterior cingulate cortex encodes information only about upcoming rewards whereas OFC processes the reward information available in a given context, whether related to the past, present, or future.

Executive Function

OFC has been seen as one component of the prefrontal cortical system underlying executive functions (Schoenbaum and Setlow 2001; O'Reilly et al. 2002). To make decisions related to a reward or goal, animals must correctly assess reward context in a dynamically changing environment. Our findings support a role for OFC in executive function by demonstrating that this region can provide the dynamically adaptable, context-dependent, and time-sensitive reward information required by an animal to organize goal-directed activity.

This work was supported by the Intramural Research Program of the National Institute of Mental Health and a National Alliance for Research on Schizophrenia and Depression Young Investigator Award (JMS). We thank R.C. Saunders, S. Ravel, G. La Camera, T. Minamimoto, S. Bouret, and D. Pritchett for their help. Conflict of Interest: None declared.

References

Amiez C, Joseph JP, Procyk E. Reward encoding in the monkey anterior cingulate cortex. Cereb Cortex. 2006;16(7):1040-1055.
Barbas H, De Olmos J. Projections from the amygdala to basoventral and mediodorsal prefrontal regions in the rhesus monkey. J Comp Neurol. 1990;300(4):549-571.
Barraclough DJ, Conroy ML, Lee D. Prefrontal cortex and decision making in a mixed-strategy game. Nat Neurosci. 2004;7(4):404-410.
Bechara A, Damasio AR, Damasio H, Anderson SW. Insensitivity to future consequences following damage to human prefrontal cortex. Cognition. 1994;50(1–3):7-15.
Bechara A, Tranel D, Damasio H. Characterization of the decision-making deficit of patients with ventromedial prefrontal cortex lesions. Brain. 2000;123(Pt 11):2189-2202.
Bowman EM, Aigner TG, Richmond BJ. Neural signals in the monkey ventral striatum related to motivation for juice and cocaine rewards. J Neurophysiol. 1996;75(3):1061-1073.
Cardinal RN. Neural systems implicated in delayed and probabilistic reinforcement. Neural Netw. 2006;19(8):1277-1301.
Carmichael ST, Price JL. Architectonic subdivision of the orbital and medial prefrontal cortex in the macaque monkey. J Comp Neurol. 1994;346(3):366-402.
Carmichael ST, Price JL. Limbic connections of the orbital and medial prefrontal cortex in macaque monkeys. J Comp Neurol. 1995;363(4):615-641.
Elliott R, Newman JL, Longe OA, Deakin JF. Differential response patterns in the striatum and orbitofrontal cortex to financial reward in humans: a parametric functional magnetic resonance imaging study. J Neurosci. 2003;23(1):303-307.
Elliott R, Newman JL, Longe OA, William Deakin JF. Instrumental responding for rewards is associated with enhanced neuronal response in subcortical reward systems. Neuroimage. 2004;21(3):984-990.
Gottfried JA, O'Doherty J, Dolan RJ. Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science. 2003;301(5636):1104-1107.
Haber SN, Kim KS, Mailly P, Calzavara R. Reward-related cortical inputs define a large striatal region in primates that interface with associative cortical connections, providing a substrate for incentive-based learning. J Neurosci. 2006;26(32):8368-8376.
Haber SN, Kunishio K, Mizobuchi M, Lynd-Balta E. The orbital and medial prefrontal circuit through the primate basal ganglia. J Neurosci. 1995;15(7 Pt 1):4851-4867.
Hays AV, Richmond BJ, Optican LM. A Unix-based multiple process system for real-time data acquisition and control. WESCON Conf Proc. 1982;2:1-10.
Hikosaka K, Watanabe M. Delay activity of orbital and lateral prefrontal neurons of the monkey varying with different rewards. Cereb Cortex. 2000;10(3):263-271.
Hikosaka K, Watanabe M. Long- and short-range reward expectancy in the primate orbitofrontal cortex. Eur J Neurosci. 2004;19(4):1046-1054.
Holland PC, Gallagher M. Amygdala-frontal interactions and reward expectancy. Curr Opin Neurobiol. 2004;14(2):148-155.
Hosokawa T, Kato K, Inoue M, Mikami A. Correspondence of cue activity to reward activity in the macaque orbitofrontal cortex. Neurosci Lett. 2005;389(3):146-151.
Ichihara-Takeda S, Funahashi S. Reward-period activity in primate dorsolateral prefrontal and orbitofrontal neurons is affected by reward schedules. J Cogn Neurosci. 2006;18(2):212-226.
Izquierdo A, Suda RK, Murray EA. Bilateral orbital prefrontal cortex lesions in rhesus monkeys disrupt choices guided by both reward value and reward contingency. J Neurosci. 2004;24(34):7540-7548.
Jones B, Mishkin M. Limbic lesions and the problem of stimulus-reinforcement associations. Exp Neurol. 1972;36(2):362-377.
Judge SJ, Richmond BJ, Chu FC. Implantation of magnetic search coils for measurement of eye position: an improved method. Vision Res. 1980;20(6):535-538.
Kennerley SW, Walton ME, Behrens TE, Buckley MJ, Rushworth MF. Optimal decision making and the anterior cingulate cortex. Nat Neurosci. 2006;9(7):940-947.
Knutson B, Taylor J, Kaufman M, Peterson R, Glover G. Distributed neural representation of expected value. J Neurosci. 2005;25(19):4806-4812.
Kringelbach ML, O'Doherty J, Rolls ET, Andrews C. Activation of the human orbitofrontal cortex to a liquid food stimulus is correlated with its subjective pleasantness. Cereb Cortex. 2003;13(10):1064-1071.
Oades RD, Halliday GM. Ventral tegmental (A10) system: neurobiology. 1. Anatomy and connectivity. Brain Res. 1987;434(2):117-165.
O'Doherty J, Kringelbach ML, Rolls ET, Hornak J, Andrews C. Abstract reward and punishment representations in the human orbitofrontal cortex. Nat Neurosci. 2001;4(1):95-102.
O'Doherty J, Rolls ET, Francis S, Bowtell R, McGlone F. Representation of pleasant and aversive taste in the human brain. J Neurophysiol. 2001;85(3):1315-1321.
O'Doherty JP, Deichmann R, Critchley HD, Dolan RJ. Neural responses during anticipation of a primary taste reward. Neuron. 2002;33(5):815-826.
O'Reilly RC, Noelle DC, Braver TS, Cohen JD. Prefrontal cortex and dynamic categorization tasks: representational organization and neuromodulatory control. Cereb Cortex. 2002;12(3):246-257.
Padoa-Schioppa C, Assad JA. Neurons in the orbitofrontal cortex encode economic value. Nature. 2006;441(7090):223-226.
Pears A, Parkinson JA, Hopewell L, Everitt BJ, Roberts AC. Lesions of the orbitofrontal but not medial prefrontal cortex disrupt conditioned reinforcement in primates. J Neurosci. 2003;23(35):11189-11201.
Pritchard TC, Edwards EM, Smith CA, Hilgert KG, Gavlick AM, Maryniak TD, Schwartz GJ, Scott TR. Gustatory neural responses in the medial orbitofrontal cortex of the old world monkey. J Neurosci. 2005;25(26):6047-6056.
Ravel S, Richmond BJ. Dopamine neuronal responses in monkeys performing visually cued reward schedules. Eur J Neurosci. 2006;24(1):277-290.
Remijnse PL, Nielen MM, Uylings HB, Veltman DJ. Neural correlates of a reversal learning task with an affectively neutral baseline: an event-related fMRI study. Neuroimage. 2005;26(2):609-618.
Robinson DA. A method of measuring eye movement using a scleral search coil in a magnetic field. IEEE Trans Biomed Eng. 1963;10:137-145.
Roesch MR, Olson CR. Neuronal activity related to reward value and motivation in primate frontal cortex. Science. 2004;304(5668):307-310.
Rogers RD, Ramnani N, Mackay C, Wilson JL, Jezzard P, Carter CS, Smith SM. Distinct portions of anterior cingulate cortex and medial prefrontal cortex are activated by reward processing in separable phases of decision-making cognition. Biol Psychiatry. 2004;55(6):594-602.
Rolls ET, Critchley HD, Mason R, Wakeman EA. Orbitofrontal cortex neurons: role in olfactory and visual association learning. J Neurophysiol. 1996;75(5):1970-1981.
Rolls
ET
Yaxley
S
Sienkiewicz
ZJ
Gustatory responses of single neurons in the caudolateral orbitofrontal cortex of the macaque monkey
J Neurophysiol
 , 
1990
, vol. 
64
 
4
(pg. 
1055
-
1066
)
Rosenkilde
CE
Bauer
RH
Fuster
JM
Single cell activity in ventral prefrontal cortex of behaving monkeys
Brain Res
 , 
1981
, vol. 
209
 
2
(pg. 
375
-
394
)
Schoenbaum
G
Eichenbaum
H
Information coding in the rodent prefrontal cortex. II. Ensemble activity in orbitofrontal cortex
J Neurophysiol
 , 
1995
, vol. 
74
 
2
(pg. 
751
-
762
)
Schoenbaum
G
Setlow
B
Integrating orbitofrontal cortex into prefrontal theory: common processing themes across species and subdivisions
Learn Mem
 , 
2001
, vol. 
8
 
3
(pg. 
134
-
147
)
Schultz
W
Neural coding of basic reward terms of animal learning theory, game theory, microeconomics and behavioural ecology
Curr Opin Neurobiol
 , 
2004
, vol. 
14
 
2
(pg. 
139
-
147
)
Shidara
M
Aigner
TG
Richmond
BJ
Neuronal signals in the monkey ventral striatum related to progress through a predictable series of trials
J Neurosci
 , 
1998
, vol. 
18
 
7
(pg. 
2613
-
2625
)
Shidara
M
Richmond
BJ
Anterior cingulate: single neuronal signals related to degree of reward expectancy
Science
 , 
2002
, vol. 
296
 
5573
(pg. 
1709
-
1711
)
Shidara
M
Richmond
BJ
Differential encoding of information about progress through multi-trial reward schedules by three groups of ventral striatal neurons
Neurosci Res
 , 
2004
, vol. 
49
 
3
(pg. 
307
-
314
)
Sugase-Miyamoto
Y
Richmond
BJ
Neuronal signals in the monkey basolateral amygdala during reward schedules
J Neurosci
 , 
2005
, vol. 
25
 
48
(pg. 
11071
-
11083
)
Sugrue
LP
Corrado
GS
Newsome
WT
Matching behavior and the representation of value in the parietal cortex
Science
 , 
2004
, vol. 
304
 
5678
(pg. 
1782
-
1787
)
Team
RDC
A language and environment for statistical computing
2004
Vienna (Austria)
R Foundation for Statistical Computing
Thorpe
SJ
Rolls
ET
Maddison
S
The orbitofrontal cortex: neuronal activity in the behaving monkey
Exp Brain Res
 , 
1983
, vol. 
49
 
1
(pg. 
93
-
115
)
Thut
G
Schultz
W
Roelcke
U
Nienhusmeier
M
Missimer
J
Maguire
RP
Leenders
KL
Activation of the human brain by monetary reward
Neuroreport
 , 
1997
, vol. 
8
 
5
(pg. 
1225
-
1228
)
Tremblay
L
Schultz
W
Relative reward preference in primate orbitofrontal cortex
Nature
 , 
1999
, vol. 
398
 
6729
(pg. 
704
-
708
)
Tremblay
L
Schultz
W
Modifications of reward expectation-related neuronal activity during learning in primate orbitofrontal cortex
J Neurophysiol
 , 
2000
, vol. 
83
 
4
(pg. 
1877
-
1885
)
Tremblay
L
Schultz
W
Reward-related neuronal activity during go-nogo task performance in primate orbitofrontal cortex
J Neurophysiol
 , 
2000
, vol. 
83
 
4
(pg. 
1864
-
1876
)
Ursu
S
Carter
CS
Outcome representations, counterfactual comparisons and the human orbitofrontal cortex: implications for neuroimaging studies of decision-making
Brain Res Cogn Brain Res
 , 
2005
, vol. 
23
 
1
(pg. 
51
-
60
)
Wallis
JD
Miller
EK
Neuronal activity in primate dorsolateral and orbital prefrontal cortex during performance of a reward preference task
Eur J Neurosci
 , 
2003
, vol. 
18
 
7
(pg. 
2069
-
2081
)

Author notes

Funding to pay the Open Access publication charges for this article was provided by NIMH/DIRP.
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.