Abstract

Separate regions of the orbitofrontal cortex (OFC) have been implicated in mediating different aspects of cost–benefit decision-making in humans and animals. Anatomical and functional imaging studies indicate that the medial (mOFC) and lateral OFC may subserve dissociable functions related to reward and decision-making processes, yet the majority of studies in rodents have focused on the lateral OFC. The present study investigated the contribution of the rat mOFC to risk and delay-based decision-making, assessed with probabilistic and delay-discounting tasks. In well-trained rats, reversible inactivation of the mOFC increased risky choice on the probabilistic discounting task, irrespective of whether the odds of obtaining a larger/risky reward decreased (100–12.5%) or increased (12.5–100%) over the course of a session. The increase in risky choice was associated with enhanced win-stay behavior, wherein rats showed an increased tendency to choose the risky option after being rewarded for the risky choice on a preceding trial. In contrast, mOFC inactivation did not alter delay discounting. These findings suggest that the mOFC plays a selective role in decisions involving reward uncertainty, mitigating the impact that larger, probabilistic rewards exert on subsequent choice behavior. This function may promote the exploration of novel options when reward contingencies change.

Introduction

The orbitofrontal cortex (OFC) has been identified as a critical component of the neural circuitry mediating reward-related learning and different aspects of cost–benefit decision-making, although its specific contributions to these processes remain elusive. Anatomical studies have suggested an important division between the medial (mOFC) and lateral OFC regions (Price 2007; Hoover and Vertes 2011). However, the majority of research on OFC function in animals has focused primarily on the lateral OFC or indiscriminately targeted both subregions of OFC. As such, the lack of consensus regarding the role of the OFC in decision-making may be a result of studying the mOFC and lateral OFC as a unified region. Examples of inconsistencies in the literature come from animal studies on the contribution of the OFC to decision-making. Much of this work has focused on how this region mediates choice in situations where subjects must choose between smaller, immediate rewards and larger rewards delivered after a delay (delay discounting), which serves as a measure of impulsive choice. In these situations, the subjective value of an objectively larger reward is diminished when its receipt is delayed, resulting in an increased preference for the more immediate, but smaller reward. Larger lesions of the OFC in rats have been reported to either increase (Winstanley et al. 2004; Mar et al. 2011) or decrease (Mobini et al. 2002; Rudebeck et al. 2006) preference for larger, delayed rewards. More discrete, reversible inactivation of the lateral OFC induced differential effects on delay discounting that were dependent on the presence or absence of a cue during the delay and baseline levels of impulsivity (Zeeb et al. 2010). In comparison, Mar et al. (2011) reported that permanent lateral OFC lesions increased delay discounting. However, in the same study, lesions of the mOFC increased preference for larger, delayed rewards after extended retraining, indicative of an enhanced sensitivity to relative reward magnitudes (Mar et al. 2011), a conclusion consistent with other lesion studies in mice (Gourley et al. 2010).

Studies with brain-damaged human subjects have implicated the OFC in mediating decisions about risks and rewards, although again, the specific contribution of this region remains the topic of continued debate. Initial studies using the Iowa Gambling Task (IGT) revealed that OFC patients displayed impaired decision-making, increasing the selection of risky options (Bechara et al. 1994, 1999). However, on a variant of the IGT where the frequencies of rewards and punishments were randomized, patients with OFC lesions performed similarly to controls (Fellows and Farah 2005). Other studies using the Cambridge Gamble Task, where outcome probabilities are presented explicitly, have shown that OFC damage is associated either with no difference in choice behavior relative to control patients or even with risk-averse tendencies (Rogers et al. 1999; Manes et al. 2002). These discrepant findings may also be related to the specific OFC region damaged. Clark et al. (2008) reported that patients with damage to the mOFC showed an increase in risk-taking behavior, whereas a lesion control group including lateral OFC patients performed similarly to controls. This suggests that the mOFC may play a more prominent role in facilitating risk-based decisions by biasing choice toward conservative options.

Lesions targeting the rat lateral OFC interfere with risk-based decision-making, impairing learning of optimal decision strategies, or reducing sensitivity to risk (Mobini et al. 2002; Pais-Vieira et al. 2007; Zeeb and Winstanley 2011). In contrast, lesions made after training do not impair performance on a rodent version of the IGT (Zeeb and Winstanley 2011), suggesting that this region plays a more critical role in the initial learning of risk/reward contingencies. Similarly, work in our laboratory has shown that inactivation of the lateral OFC does not alter risky choice in well-trained animals performing a probabilistic discounting task (St. Onge and Floresco 2010). On the other hand, inactivation of the medial prelimbic prefrontal cortex (PFC) disrupts adjustments of choice biases in response to changing reward probabilities. Thus, when the odds of obtaining a larger, uncertain reward decreased (from 100% to 12.5%) or increased over time, PFC inactivation increased or decreased risky choice, respectively. However, despite studies in humans implicating the mOFC in mediating risk/reward judgments, there have been no studies that have specifically investigated the contribution of this region to this aspect of decision-making, and whether it plays a role that is dissociable from other prefrontal regions such as the lateral OFC or medial PFC.

With these issues in mind, the present study was undertaken to further explore the contribution of the rat mOFC to different forms of cost–benefit decision-making. Reversible inactivations of the mOFC were induced in well-trained rats performing either a probabilistic discounting task (sensitive to manipulations of the medial PFC, but not the lateral OFC; St. Onge and Floresco 2010) or a delay-discounting task that is differentially affected by lesions of the mOFC versus the lateral OFC (Winstanley et al. 2004; Floresco et al. 2008; Zeeb et al. 2010; Mar et al. 2011).

Materials and Methods

Animals

Male Long-Evans rats (Charles River Laboratories, Montreal, Canada) weighing 250–300 g at the beginning of training were used. On arrival, rats were given 1 week to acclimatize to the colony and then were food restricted to 85–90% of their free-feeding weight for an additional week before behavioral training and given ad libitum access to water for the duration of the experiment. Feeding occurred in the rats' home cages at the end of the experimental day, and body weights were monitored daily. All testing was in accordance with the Canadian Council on Animal Care and the Animal Care Committee of the University of British Columbia.

Apparatus

Behavioral testing was conducted in 12 operant chambers (30.5 × 24 × 21 cm; Med Associates, St Albans, VT, United States of America) enclosed in sound-attenuating boxes. The boxes were equipped with a fan that provided ventilation and masked extraneous noise. Each chamber was fitted with 2 retractable levers, one located on each side of a central food receptacle where food reinforcement (45 mg; Bioserv, Frenchtown, NJ, United States of America) was delivered by a pellet dispenser. The chambers were illuminated by a single 100-mA house light located in the top center of the wall opposite the levers. Four infrared photobeams were mounted on the side of each chamber, and another photobeam was located in the food receptacle. Locomotor activity was indexed by the number of photobeam breaks that occurred during a session. All experimental data were recorded by personal computers connected to the chambers through an interface.

Lever-Press Training

Our initial training protocols were identical to those of St. Onge and Floresco (2009), as adapted from Cardinal and Howes (2005). On the day before their first exposure to the operant chamber, rats were given approximately 25 food reward pellets in their home cage. On the first day of training, 2–3 pellets were delivered into the food cup and crushed pellets were placed on a lever before the animal was placed in the chamber. Rats were first trained under a fixed ratio 1 schedule to a criterion of 60 pellets in 30 min, first for 1 lever, and then repeated for the other lever (counterbalanced left/right between subjects). They were then trained on a simplified version of the full task. These 90-trial sessions began with the levers retracted and the operant chamber in darkness. Every 40 s, a trial was initiated with the illumination of the house light and the insertion of 1 of the 2 levers into the chamber, and the rat was required to press it within 10 s of its insertion. Failure to press the lever resulted in its retraction, no food delivery, and the chamber reverting to darkness. For rats that were to be trained on the probabilistic discounting task, responding on a lever delivered a pellet with 50% probability. This procedure was used to familiarize the rats with the probabilistic nature of the full task. For those to be trained on the delay-discounting task, a pellet was delivered after every press. In every pair of trials, the left or right lever was presented once, and the order within the pair of trials was random. Rats were trained for 3–5 days to a criterion of 80 or more successful trials (i.e. ≤10 omissions).

Decision-Making Tasks

Probabilistic Discounting

Risk-based decision-making was assessed with a probabilistic discounting task we have described previously (Floresco and Whelan 2009; Ghods-Sharifi et al. 2009) and was originally modified from that described by Cardinal and Howes (2005) (Fig. 1, top). Rats received daily sessions consisting of 72 trials, separated into 4 blocks of 18 trials. The entire session took 48 min to complete, and the animals were trained 5–7 days per week. A session began in darkness with both levers retracted (the intertrial state). A trial began every 40 s with the illumination of the house light and the insertion of one or both levers into the chamber. One lever was designated the large/risky lever, the other the small/certain lever, which remained consistent throughout training (counterbalanced left/right). If the rat did not respond within 10 s of lever presentation, the chamber was reset to the intertrial state until the next trial (omission). When a lever was chosen, both levers retracted. A choice of the small/certain lever always delivered 1 pellet with 100% probability; choice of the large/risky lever delivered 4 pellets but with a particular probability (see below). After a response was made and food delivered, the house light remained on for another 4 s, after which the chamber reverted back to the intertrial state until the next trial. Multiple pellets were delivered 0.5 s apart. Each of the 4 blocks consisted of 8 forced-choice trials, where only 1 lever was presented (4 trials for each lever, randomized in pairs), permitting animals to learn the amount of food associated with each lever press and the respective probability of receiving reinforcement over each block. This was followed by 10 free-choice trials, where both levers were presented and the animal chose between the small/certain and the large/risky lever.
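The session layout described above can be sketched as a simple schedule generator. This is only an illustration of the stated trial counts and timing, not the control program used in the study; the function name, the lever labels, and the dictionary fields are hypothetical, and the pairwise shuffling is one reading of "randomized in pairs".

```python
import random

TRIAL_INTERVAL_S = 40  # a trial began every 40 s (from the text)

def build_session(n_blocks=4, forced_per_block=8, free_per_block=10, seed=None):
    """Return an ordered list of trial descriptions for one session (hypothetical format)."""
    rng = random.Random(seed)
    trials = []
    for block in range(n_blocks):
        # Forced-choice trials: within each consecutive pair, each lever is
        # presented once, in random order (assumed meaning of "randomized in pairs").
        for _ in range(forced_per_block // 2):
            pair = ["large_risky", "small_certain"]
            rng.shuffle(pair)
            for lever in pair:
                trials.append({"block": block, "type": "forced", "levers": [lever]})
        # Free-choice trials: both levers are presented.
        for _ in range(free_per_block):
            trials.append({"block": block, "type": "free",
                           "levers": ["large_risky", "small_certain"]})
    return trials

session = build_session(seed=0)
minutes = len(session) * TRIAL_INTERVAL_S / 60
print(f"{len(session)} trials, {minutes:.0f} min")
```

With the default arguments this reproduces the 72 trials and 48-min session length given in the text.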

Figure 1.

Probabilistic and delay-discounting task design. Format of a single free-choice trial for the probabilistic (top) and delay- (bottom) discounting tasks used in the present study.

The probability of obtaining 4 pellets after pressing the large/risky lever was varied systematically across the 4 blocks. Previous research in our laboratory has shown that inactivation of the more caudal and dorsal, prelimbic region of the medial PFC induces differential effects on probabilistic discounting. When the probability of obtaining the larger reward decreased over the session, PFC inactivation increased risky choice, whereas when reward probabilities were initially low and subsequently increased, PFC inactivation had the opposite effect (St. Onge and Floresco 2010). In light of these findings, separate groups of rats were trained on a variant of the task where probabilities of obtaining the large/risky reward systematically descended (100%, 50%, 25%, and 12.5%) or ascended (12.5%, 25%, 50%, and 100%) across blocks. Thus, when the probability of obtaining the 4-pellet reward was 100% or 50%, this option would be more advantageous. At 25%, it is arbitrary which lever the animal chooses, and at 12.5%, the small/certain lever would be the more advantageous option in the long-term.
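The relative advantage of each lever follows from the expected number of pellets per choice. The short calculation below is a sketch using the reward sizes and probabilities stated above; the function and constant names are illustrative only.

```python
from fractions import Fraction

LARGE_PELLETS = 4  # large/risky reward size (from the text)
SMALL_PELLETS = 1  # small/certain reward, always delivered

def expected_large(probability):
    """Expected pellets per choice of the large/risky lever."""
    return LARGE_PELLETS * probability

block_probabilities = [Fraction(1), Fraction(1, 2), Fraction(1, 4), Fraction(1, 8)]
for p in block_probabilities:
    ev = expected_large(p)
    if ev > SMALL_PELLETS:
        verdict = "large/risky lever advantageous"
    elif ev == SMALL_PELLETS:
        verdict = "levers equivalent"
    else:
        verdict = "small/certain lever advantageous"
    print(f"p = {float(p) * 100:g}%: expected {float(ev):g} pellets -> {verdict}")
```

At 25% the expected yield of the risky lever (4 × 0.25 = 1 pellet) equals the certain reward, which is why choice in that block is arbitrary in terms of long-run payoff.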

For each session and trial block, the probability of receiving the large reward was drawn from a set probability distribution. Therefore, on any given day, the probabilities in each block may have varied, but averaged across many training days, the actual probability experienced by the rat approximated the set value. Previous behavioral studies from our laboratory have shown that even though different rats may experience better or worse “luck” following the choice of the large/risky lever during earlier phases of training on this task, these outcomes do not substantially impact the final discounting rates exhibited by animals once they display stable discounting behavior at the end of training (St. Onge 2011).
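The gap between the set and the experienced probability can be illustrated with a small simulation. This sketch assumes that each risky choice is rewarded independently with the block's set probability, which is our reading of the description rather than a documented detail of the task software; the function name, the seed, and the choice of 14 risky trials per block (4 forced plus 10 free) are illustrative.

```python
import random

def experienced_probability(set_p, n_risky_trials, rng):
    """Fraction of risky lever presses rewarded in one block (assumes independent draws)."""
    wins = sum(rng.random() < set_p for _ in range(n_risky_trials))
    return wins / n_risky_trials

rng = random.Random(2024)  # fixed seed so the sketch is repeatable
daily = [experienced_probability(0.125, 14, rng) for _ in range(200)]
long_run = sum(daily) / len(daily)

print(f"single-day range: {min(daily):.3f} to {max(daily):.3f}")
print(f"average over {len(daily)} days: {long_run:.3f}")
```

Individual days can deviate noticeably from 12.5%, whereas the average across many days stays close to the set value.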

Delay Discounting

The delay-discounting task (Cardinal et al. 2000; Winstanley et al. 2004; Zeeb and Winstanley 2011) was similar to the probabilistic discounting tasks in a number of respects, but with some key differences (Fig. 1, bottom). Rats received daily sessions consisting of 48 trials, separated into 4 blocks of 12 trials (2 forced-choice followed by 10 free-choice per block). The entire session took 56 min to complete. Trials began every 70 s with the illumination of the house light and the insertion of one or both levers into the chamber. One lever was designated the large/delayed lever, and the other the small/immediate lever, which remained consistent throughout training (counterbalanced left/right). A choice of the small/immediate lever always delivered 1 pellet immediately. Selecting the large/delayed lever delivered 4 pellets after a delay that increased systematically over the 4 blocks: It was initially 0 s, then 15 s, 30 s, and 45 s. There were no explicit cues presented during the delay period; the house light was extinguished, and then re-illuminated upon delivery of the 4 pellets.

Rats on either task were trained until, as a group, they demonstrated prominent and stable baseline levels of discounting. Infusions were administered after a group of rats displayed stable patterns of choice for 3 consecutive days, assessed using a procedure similar to that we have described previously (St. Onge and Floresco 2010). Data from 3 consecutive sessions were analyzed with a repeated-measures analysis of variance (ANOVA) with 2 within-subjects factors (day and trial block). If the effect of block was significant at the P < 0.05 level, but there was no main effect of day or day × block interaction (at the P > 0.1 level), animals were judged to have displayed stable baseline levels of discounting.
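The stability rule can be written as a simple decision on the three P-values from the day × block ANOVA. The sketch below assumes those P-values have already been obtained from a statistics package; the function and parameter names are hypothetical.

```python
def is_stable_baseline(p_block, p_day, p_day_by_block,
                       alpha_block=0.05, alpha_stability=0.1):
    """Apply the stability criterion described in the text.

    Stable means: a significant effect of block (P < 0.05) together with
    no main effect of day and no day x block interaction (both P > 0.1).
    """
    discounting_present = p_block < alpha_block
    no_day_effect = p_day > alpha_stability
    no_interaction = p_day_by_block > alpha_stability
    return discounting_present and no_day_effect and no_interaction

# Illustrative P-values only, not data from the study.
print(is_stable_baseline(p_block=0.001, p_day=0.45, p_day_by_block=0.62))
print(is_stable_baseline(p_block=0.001, p_day=0.04, p_day_by_block=0.62))
```

The first call meets the criterion; the second does not, because choice still differs across days.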

Surgery and Microinfusion Protocol

Rats were trained on their respective tasks until they displayed stable levels of choice, after which they were provided food ad libitum for 1–3 days and were then subjected to surgery. Rats were anesthetized with 100 mg/kg ketamine hydrochloride and 7 mg/kg xylazine and implanted with bilateral 23-gauge stainless steel guide cannulae aimed at the mOFC (flat skull: anteroposterior = +4.4 mm; mediolateral = ±0.7 mm; dorsoventral = −3.2 mm from dura) using standard stereotaxic techniques. Guide cannulae were held in place with stainless steel screws and dental acrylic. Thirty-gauge obturators flush with the end of guide cannulae remained in place until the infusions were made. Rats were given at least 7 days to recover from surgery before testing. During this period, they were handled at least 5 min each day and were food restricted to 85% of their free-feeding weight.

Rats were subsequently trained on their respective task for at least 5 days until the group displayed stable levels of choice behavior for 3 consecutive days. One to 2 days before their first microinfusion test day, obturators were removed, and a mock infusion procedure was conducted. Stainless steel injectors were placed in the guide cannulae for 2 min, but no infusion was administered. The day after displaying stable discounting, the group received its first microinfusion test day.

A within-subjects design was used for all experiments. Inactivation was achieved by microinfusion of a solution containing the gamma-aminobutyric acid (GABA)B agonist baclofen and the GABAA agonist muscimol (Sigma Aldrich). Both drugs were dissolved in physiological saline, mixed separately at a concentration of 500 ng/μL, and then combined in equal volumes so that the final concentration of each compound in solution was 250 ng/μL. Drugs or saline were infused at a volume of 0.5 μL, so the final dose was 125 ng of each drug per side. Infusions of GABA agonists or saline were administered bilaterally via 30-gauge injection cannulae that protruded 0.8 mm past the end of the guide cannulae, at a rate of 0.4 μL/min by a microsyringe pump. Injection cannulae were left in place for an additional 1 min to allow for diffusion. Each rat remained in its home cage for an additional 10-min period before behavioral testing.
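The dose and infusion time follow directly from the stated concentrations, volume, and pump rate. The arithmetic is sketched below; the variable names are illustrative.

```python
STOCK_CONC_NG_PER_UL = 500.0   # each drug dissolved separately (from the text)
INFUSION_VOLUME_UL = 0.5       # volume infused per side
PUMP_RATE_UL_PER_MIN = 0.4     # microsyringe pump rate

# Mixing two stocks in equal volumes halves the concentration of each drug.
final_conc = STOCK_CONC_NG_PER_UL / 2

# Mass of each drug delivered per side.
dose_per_side_ng = final_conc * INFUSION_VOLUME_UL

# Time needed to deliver the full volume at the stated rate.
infusion_time_s = INFUSION_VOLUME_UL / PUMP_RATE_UL_PER_MIN * 60

print(f"final concentration: {final_conc:.0f} ng/uL of each drug")
print(f"dose: {dose_per_side_ng:.0f} ng of each drug per side")
print(f"infusion duration: {infusion_time_s:.0f} s")
```

This gives 250 ng/μL, 125 ng per side, and an infusion lasting 75 s, after which the injectors stayed in place for a further 1 min.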

On the first infusion test day, half of the rats in each group received saline infusions, and the other half received baclofen–muscimol. The next day, they received a baseline training day (no infusion). If, for any individual rat, choice of the large reward lever deviated by >15% from its preinfusion baseline, it received an additional day of training before the second infusion test. On the following day, rats received a second counterbalanced infusion of either saline or baclofen–muscimol.

Histology

After completion of behavioral testing, rats were euthanized in a carbon dioxide chamber. Brains were removed and fixed in a 4% formalin solution. The brains were frozen and sliced in 50-μm sections before being mounted and stained with Cresyl Violet. Placements were verified with reference to the neuroanatomical atlas of Paxinos and Watson (2005). Data from rats whose placements were outside the borders of the mOFC were removed from the analysis. In general, animals with inaccurate placements did not display prominent changes in choice behavior following inactivation treatments relative to saline infusions. The location of acceptable placements is presented in Figure 2.

Figure 2.

Schematic of sections of the rat brain showing location of acceptable infusions in the mOFC for rats in probabilistic (filled circles) and delay-discounting (open circles) experiments. Numbers correspond to mm from the bregma.

Data Analysis

The primary dependent measure of interest was the proportion of choices directed toward the large reward lever (large/risky or large/delayed) for each block of free-choice trials, factoring in trial omissions. For each block, this was calculated by dividing the number of choices of the large reward lever by the total number of successful trials. For the probabilistic discounting experiment, choice data were analyzed using 3-way, between-/within-subjects ANOVAs, with treatment and probability block as 2 within-subjects factors and task variant (i.e. reward probabilities descending or ascending over blocks) as a between-subjects factor. Thus, in this analysis, the proportion of choices of the large/risky option across the 4 levels of the trial block was analyzed irrespective of the order in which they were presented. For the treatment factor, we included data obtained after saline and baclofen–muscimol infusions, as well as baseline choice data, composed of the average values from the days prior to each infusion. For the delay-discounting experiment, choice data were analyzed with a 2-way repeated-measures ANOVA, with treatment and trial block as factors. In each of these analyses, the effect of trial block was always significant (P < 0.001) for both tasks and will not be reported further. When significant effects of choice were obtained, P-values were corrected using the Huynh-Feldt Epsilon to accommodate any violations of sphericity. Response latencies (the time elapsed between lever insertion and subsequent press), locomotor activity (i.e. photobeam breaks), and the number of trial omissions were analyzed with 1-way repeated-measures ANOVA (the between-subjects factor of task or baseline values was not incorporated in these analyses).
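The primary measure can be sketched as follows. The encoding of a block as a list of "large", "small", or None (for an omitted trial) is an assumption made for illustration, as is the function name.

```python
def large_choice_proportion(block_choices):
    """Proportion of completed free-choice trials on which the large reward lever was chosen.

    block_choices: list with one entry per free-choice trial, each "large",
    "small", or None for an omission. Omissions are excluded from the
    denominator, so only successful trials count.
    """
    completed = [c for c in block_choices if c is not None]
    if not completed:
        return None  # undefined if every trial in the block was omitted
    return sum(c == "large" for c in completed) / len(completed)

# Illustrative block: 6 large choices, 3 small choices, 1 omission.
example_block = ["large"] * 6 + ["small"] * 3 + [None]
print(round(large_choice_proportion(example_block), 3))
```

In this example the proportion is 6/9 rather than 6/10, because the omitted trial is not counted as a successful trial.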

Win-Stay/Lose-Shift Analysis

For data obtained from the probabilistic discounting experiment, we conducted a supplementary analysis to further clarify whether changes in choice behavior were due to alterations in sensitivity to reward (win-stay performance) or negative feedback (lose-shift performance; Bari et al. 2009; Stopper and Floresco 2011; St. Onge et al. 2011, 2012). Animals' choices during the task were analyzed according to the outcome of each preceding trial (reward or nonreward) and expressed as a ratio. The proportion of win-stay trials was calculated from the number of times a rat chose the large/risky lever after choosing the risky option on the preceding trial and obtaining the large reward (a win), divided by the total number of free-choice trials where the rat obtained the larger reward. Conversely, lose-shift performance was calculated from the number of times a rat shifted choice to the small/certain lever after choosing the risky option on the preceding trial and was not rewarded (a loss), divided by the total number of free-choice trials resulting in a loss. These computations were conducted for all trials across the 4 blocks. We could not conduct a block-by-block analysis of these data because there were many instances where rats did not obtain the large reward at all during the latter blocks. Win-stay and lose-shift ratios observed after saline and inactivation treatments were analyzed separately with 2-way ANOVAs, with task variant as a between-subjects factor and treatment (saline or inactivation) as a within-subjects factor.
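A minimal sketch of these ratios is given below. It assumes an ordered list of completed free-choice trials for one session, each recorded as a (choice, rewarded) pair; that representation, the handling of omissions, and the function name are illustrative. Note one simplification: a risky outcome on the final trial has no following choice, so it is left out of the denominators here.

```python
def win_stay_lose_shift(trials):
    """Return (win_stay_ratio, lose_shift_ratio) for one session.

    trials: ordered list of (choice, rewarded) tuples, with choice equal to
    "risky" or "certain" and rewarded a bool. Omitted trials are assumed to
    have been dropped beforehand.
    """
    wins = win_stays = losses = lose_shifts = 0
    for (prev_choice, prev_rewarded), (next_choice, _) in zip(trials, trials[1:]):
        if prev_choice != "risky":
            continue  # only risky choices can count as a win or a loss
        if prev_rewarded:
            wins += 1
            win_stays += next_choice == "risky"
        else:
            losses += 1
            lose_shifts += next_choice == "certain"
    win_stay = win_stays / wins if wins else None
    lose_shift = lose_shifts / losses if losses else None
    return win_stay, lose_shift

# Illustrative sequence, not data from the study.
example = [("risky", True), ("risky", False), ("certain", True), ("risky", True),
           ("risky", True), ("certain", True), ("risky", False), ("risky", False)]
ws, ls = win_stay_lose_shift(example)
print(f"win-stay = {ws:.2f}, lose-shift = {ls:.2f}")
```

Here 2 of the 3 rewarded risky choices are followed by another risky choice, and 1 of the 2 unrewarded risky choices that have a following trial is followed by a shift to the certain lever.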

Results

mOFC Inactivation and Risk-Based Decision-Making

Initially, 24 rats were trained on the probabilistic discounting tasks (13 on the descending task) for an average of 20 days prior to being implanted with guide cannulae in the mOFC, retrained on the task and receiving counterbalanced microinfusions. Ten of these rats were excluded from the data analyses because of inaccurate cannula placements that resided ventral to the mOFC (6 from the descending group). In these rats, risky choice did not differ across treatments (all P > 0.50, data not shown). Data from the remaining 14 rats with acceptable placements were included in the analysis (7 rats each from the descending and ascending variants of the task, Fig. 2). Inactivation of the mOFC markedly increased risky choice, reflected in the ANOVA as a significant main effect of treatment (F2,24 = 3.76, P < 0.05; Huynh-Feldt Epsilon = 1.0; Fig. 3A). Multiple comparisons revealed that infusions of baclofen–muscimol significantly (Dunnett's, P < 0.05) increased risky choice relative to saline, whereas choice after saline infusions did not differ from the baseline. Importantly, the analysis did not yield significant task × treatment [F2,24 = 1.33, not significant (n.s.)] or task × treatment × block interactions (F6,72 = 1.12, n.s.). The lack of an interaction effect indicates that the increase in risky choice induced by mOFC inactivation in rats trained on the task where the odds of obtaining the larger reward were initially 100% and then decreased (descending group, Fig. 3B) was comparable with that displayed by rats trained on the version where reward probabilities were initially 12.5% and then increased over blocks (ascending group, Fig. 3C). Response latencies were unaffected by mOFC inactivation (F1,12 = 1.97, n.s.; Table 1). mOFC inactivation increased locomotion (F1,13 = 4.512, P = 0.05), but did not affect trial omissions (F1,13 = 0.65, n.s.; Table 1).

Table 1

Mean (SEM) latency, omission, and locomotor data

                                   Saline          Inactivation
Probabilistic discounting
  Response latency (s)             0.51 (0.04)     0.58 (0.08)
  Trial omissions                  0.14 (0.1)      0.26 (0.22)
  Locomotion (photobeam breaks)    1516 (112)      1831 (195)*
  Experienced reward probability
    50% block                      45.6% (0.04)    44.9% (0.05)
    25% block                      25.5% (0.07)    25.4% (0.07)
    12.5% block                    12.7% (0.06)    7.7% (0.03)
Delay discounting
  Response latency (s)             0.83 (0.11)     0.92 (0.12)
  Trial omissions                  0.36 (0.2)      0.64 (0.3)
  Locomotion                       1761 (170)      1678 (203)

* = P < 0.05.

Figure 3.

Inactivation of the mOFC increases risky choice on a probabilistic discounting task. (A) Percentage choice of the large/risky option on baseline training days, and following infusions of baclofen–muscimol or saline into mOFC across 4 blocks of free-choice trials for all animals included in this experiment. Choice data are plotted as a function of probability block, irrespective of the order in which reward probabilities changed over a session. Symbols represent mean ± standard error of the mean (SEM). Black star denotes P < 0.05 for the average choice versus saline. (B) Choice data from the subset of rats tested on the task variant in which reward probabilities decreased from 100% to 12.5% (descending odds) across the 4 blocks of trials. (C) Choice data from the group that was tested on the variant in which reward probabilities increased from 12.5% to 100% (ascending odds) across the trial blocks. In both instances, mOFC inactivation increased risky choice. (D) Win-stay/lose-shift ratios following inactivation and saline treatments. Win-stay values are displayed as the proportion of choices on the large/risky lever following a rewarded risky choice on the preceding trial. Lose-shift values are displayed as the proportion of choices on the small/certain lever following an unrewarded risky choice on the preceding trial. mOFC inactivation selectively increased the tendency to select the large/risky option after obtaining the larger reward on the preceding trial, as indexed by an increase in win-stay tendencies. (E and F) Win-stay and lose-shift ratios obtained from the subset of rats tested on the task variant in which reward probabilities descended or ascended across blocks.

Although inactivation of the mOFC increased risky choice relative to saline treatments, the number of reward pellets earned was nearly identical across the 2 treatment conditions (saline = 72 ± 2, inactivation = 73 ± 2, t(13) = 0.17, n.s.). This is likely attributable to rats obtaining slightly more pellets in the 50% block, and fewer pellets in the 12.5% block after mOFC inactivation relative to saline. Similarly, the actual experienced reward probabilities in the risky blocks (50–12.5%) were not different across the 2 treatment conditions or 2 task variants (all Fs < 2.1, all Ps > 0.17; Table 1). These observations indicate that the increase in risky choice induced by mOFC inactivation cannot easily be attributed to differences in the reward probabilities experienced on saline versus inactivation test days, or to alterations in choice strategy to maximize the amount of reward obtained.

We further analyzed the proportion of “win-stay” and “lose-shift” trials to determine whether the increase in risky choice induced by mOFC inactivation could be attributed to an altered reward or negative feedback sensitivity, respectively. These analyses confirmed that mOFC inactivation caused a significant increase in win-stay tendencies (main effect of treatment, F1,12 = 8.57, P < 0.05). Thus, under control conditions, rats displayed a strong tendency to select the risky lever after selecting this lever on the preceding trial and receiving a reward, doing so on approximately 80% of these trials. However, this tendency was even more pronounced following mOFC inactivation (Fig. 3D, left). Furthermore, this analysis did not yield a significant main effect of task (F1,12 = 0.05, n.s.) or task × treatment interaction (F1,12 = 0.26, n.s.), indicating that the increase in win-stay tendencies induced by mOFC inactivation was comparable across groups trained on either variant of the task (Fig. 3E,F). In contrast, lose-shift performance was not altered following inactivation of the mOFC (F1,12 = 1.06, n.s.). The lose-shift analysis did yield a significant main effect of task (F1,12 = 5.94, P < 0.05), which was driven by the fact that rats trained on the ascending version showed greater lose-shift tendencies across both treatments when compared with those trained on the descending version of the task (Fig. 3E,F, right). However, there was no significant task × treatment interaction (F1,12 = 0.13, n.s.), confirming that mOFC inactivation did not differentially alter this aspect of performance across the 2 task variants. Collectively, these results indicate that increases in risky choice induced by mOFC inactivation appear to be due to an increased tendency to choose the risky option after obtaining a larger reward, as opposed to a decreased tendency to shift to the small/certain option after reward omission.

mOFC Inactivation and Delay-Based Decision-Making

Initially, 14 rats were trained on the delay-discounting task for an average of 23 days before being implanted with guide cannulae into the mOFC, retrained on the task, and given counterbalanced microinfusions (Fig. 4). Three of these rats were excluded from the data analyses because of inaccurate cannulae placements that, again, resided ventral to the mOFC, leaving a final 11 rats. In contrast to the effects of mOFC inactivation on probabilistic discounting, similar treatments did not alter decision-making on the delay-discounting task. Analysis of choice behavior following bilateral infusions of baclofen–muscimol or saline into the mOFC (n = 11) did not yield a significant main effect of treatment (F2,20 = 1.07, n.s.) or a significant treatment × block interaction (F6,60 = 1.45, n.s.). Similarly, mOFC inactivation had no effect on response latencies, locomotion, or trial omissions (all Fs < 1.0, n.s.; Table 1).

Figure 4.

Inactivation of the mOFC does not affect delay discounting. Percentage choice of the large/delayed option on baseline training days, and following infusions of baclofen–muscimol or saline into the mOFC across 4 blocks of free-choice trials. Symbols represent mean + SEM.

Discussion

The present data provide novel insight into the contribution of the rat mOFC to different aspects of cost–benefit decision-making. Inactivation of the mOFC increased risky choice during probabilistic discounting. This effect was attributable primarily to an increased tendency to select the risky option after choosing risky on the previous trial and obtaining the larger reward. In contrast, similar inactivations did not affect choice between small/immediate and larger/delayed rewards. Thus, in choice situations that involve reward uncertainty, the mOFC appears to mitigate the impact that larger, probabilistic rewards exert on subsequent choice behavior, with inactivation of this region leading to an increased tendency to select riskier options after these options have recently paid off.

mOFC and Risk-Based Decision-Making

The finding that inactivation of the mOFC increased risky choice irrespective of the manner in which reward probabilities changed over time is in marked contrast to the effect of similar inactivation of other medial or orbital regions of the PFC. Thus, inactivation of either the dorsal anterior cingulate cortex or the lateral OFC did not alter probabilistic discounting using an identical procedure, although inactivation of the lateral OFC did increase decision latencies (St. Onge and Floresco 2010). In contrast, inactivation of the medial, prelimbic region of the PFC affected risk-based decision-making differentially, increasing risky choice when reward probabilities were initially 100% and decreased across the session, but having the opposite effect when reward probabilities started at 12.5% and increased across blocks (St. Onge and Floresco 2010). In that study, we concluded that the prelimbic PFC plays a key role in facilitating adjustments in decision biases under conditions where reward probabilities are volatile. The present findings highlight that the involvement of the mOFC in decision-making is dissociable from that of the lateral OFC or the caudal and dorsal medial PFC, and broaden our understanding of the cortical/subcortical networks that regulate different component processes that guide risk/reward judgments. The medial PFC, working in conjunction with basolateral amygdala–ventral striatal circuitry (and possibly the mOFC), facilitates tracking of changes in reward probability and modifying decision policies over time (St. Onge and Floresco 2010; Hoover and Vertes 2011; St. Onge et al. 2012). In comparison, the lateral OFC aids in the speed of decisions, whereas the mOFC appears to temper the urge to select options linked to larger yet uncertain rewards.

Further insight into the increase in risky choice induced by mOFC inactivation comes from a detailed analysis of choice behavior on trials following those where animals chose risky and received the larger reward (win-stay) versus those where they selected the risky option and did not receive reward (lose-shift). Under control conditions, rats chose the risky option on ∼80% of trials after obtaining the larger reward on the preceding trial. Conversely, on trials following a risky choice and loss, animals shifted to the small/certain option on ∼30% of subsequent trials. Inactivation of the mOFC did not alter lose-shift tendencies, suggesting that the increase in risky choice was not attributable to a reduction in negative feedback sensitivity. Instead, these manipulations selectively increased win-stay tendencies, increasing the likelihood that rats would chase after “wins” with another risky choice. These findings suggest that under conditions involving reward uncertainty, the mOFC appears to temper the impact that large/risky rewards exert on subsequent behavior directed toward riskier reward options.

The notion that the mOFC may serve to mitigate the impact that reward has over behavior is in keeping with recent rodent studies investigating the effects of lesions of this region on reward-related learning. Gourley et al. (2010) reported that mice with mOFC lesions displayed enhanced instrumental responding for food delivered on a progressive ratio schedule. In a similar vein, post-training lesions of the mOFC in rats increased preference for larger delayed rewards (Mar et al. 2011), although this effect was only observed after extended postoperative retraining (see below). These findings have led to the conclusion that disruption of mOFC function may enhance “sensitivity to reward magnitude” (Mar et al. 2011), and that this region facilitates “goal-directed response inhibition under circumstances that require the adoption of novel response strategies” (Gourley et al. 2010). This proposal is compatible with the present findings, where inactivation of the mOFC augmented the tendency to “chase” after larger, riskier rewards, even when the probability of obtaining these rewards was low (12.5–25%).

It is now well established that lesions to the OFC in humans impair risk-based decision-making on a variety of laboratory tasks. For example, Bechara et al. (1994) showed that patients with lesions to the ventromedial PFC (including the mOFC) persist in making risky, but disadvantageous choices on the IGT. Similarly, Clark et al. (2008) assessed brain-damaged patients performing the Cambridge Gambling Task, wherein subjects are explicitly informed about reward probabilities prior to making a decision. Patients with mOFC damage played riskier and increased their betting regardless of the odds of winning, consistent with a role of this region in biasing choice toward conservative options under risk. It is notable that in that study, a lesion control group that included individuals with lateral OFC damage performed similarly to healthy controls. These findings, in combination with the present data, suggest that damage to the mOFC, rather than the lateral portion, may be primarily responsible for the increases in risky decisions made by patients with OFC lesions, due to a disruption in inhibitory control over biases to select larger, riskier rewards.

Functional imaging studies in humans provide additional insight into the cognitive operations governed by the mOFC that can influence decision biases, which in turn may help clarify how disruption of this region increases risky choice. Early studies pointed to mOFC activity correlating with positive outcomes or reward (O'Doherty et al. 2001; Small et al. 2001; Gottfried et al. 2002; Anderson et al. 2003; Rolls et al. 2003; Ursu and Carter 2005; Kim et al. 2006). More contemporary studies focusing on choice between uncertain rewards have modified this view, providing evidence that the mOFC may compute comparisons between the “relative” values of different options that might be chosen (Padoa-Schioppa and Assad 2008; Boorman et al. 2009; FitzGerald et al. 2009; Basten et al. 2010; Philiastides et al. 2010; Lim et al. 2011; Rushworth et al. 2011). Viewing this conceptual framework in light of the present findings suggests that when animals are faced with a choice between uncertain, larger rewards and smaller, certain ones, information processed by the mOFC may assist in judging whether the risky or certain option may be of greater relative value. This signal may contribute to biasing choice away from options associated with larger yet riskier rewards, given that these options are not always more advantageous in the long term. It follows that disruption of this relative value signal would reduce these competing biases, causing animals to view the larger reward option as more attractive despite its uncertainty. Moreover, given the probabilistic manner of reward delivery, it would be expected that the incentive salience of the larger reward option would be greater when a risky choice resulted in reward delivery on the preceding trial. This notion is consistent with the observation that mOFC inactivation made rats more likely to select the larger/risky option after obtaining a larger reward on the previous choice. Thus, it is possible that disruption of mOFC activity may have caused animals to either forget or not care that the larger reward option was associated with some risk, leading to an increased preference for this option relative to the small/certain reward.

mOFC and Delay-Based Decision-Making

In contrast to the above-mentioned findings, acute, temporary inactivation of the mOFC did not alter decision biases during a delay-discounting task, when rats chose between larger/delayed and smaller/immediate rewards. In comparison, lesions or inactivation of the lateral OFC exert a more prominent effect on this form of decision-making, although the specific effects on choice can depend on a variety of factors, including the presence/absence of cues during a delay, and baseline levels of impulsive choice (Winstanley et al. 2004; Rudebeck et al. 2006; Zeeb et al. 2010). Collectively, these findings provide further evidence that probabilistic and delay discounting are dissociable forms of decision-making, each recruiting different cortical circuits.

It is important to note that, superficially, the lack of effect of mOFC inactivation on delay discounting appears to contrast with the findings of Mar et al. (2011), who reported an increased preference for larger, delayed rewards after lesions of the mOFC. In that study, rats were well-trained on a delay-discounting task very similar to the one used here, after which they received permanent excitotoxic lesions of either the mOFC, lateral OFC or large lesions encompassing both regions. Following retraining, rats with mOFC lesions eventually developed a greater bias toward the larger/delayed reward option. However, this effect only emerged after extended retraining on the task (14–18 days). In contrast, choice behavior during early post-lesion retraining (6 days) was indistinguishable from control animals, which is in keeping with the lack of effect on delay discounting following temporary inactivation of the mOFC reported here. Thus, it appears that neural activity in the mOFC does not make a major contribution to delay-related judgments in the short-term, although this region may exert a longer-term influence over these biases when animals must learn (or re-learn) the relative cost–benefit contingencies associated with different options.

The lack of effect of mOFC inactivation on delay discounting is an important finding, in that it demonstrates that this region does not uniformly blunt reward sensitivity or promote decision biases away from larger rewards when their subjective value is diminished by certain costs (e.g., delays). Thus, the increase in risky choice induced by mOFC inactivation is unlikely to be attributable to a general increase in preference for larger rewards. A survey of findings from other studies of the mOFC permits a refinement of our understanding of its contribution to reward processing. For example, lesions of the mOFC enhanced instrumental responding for food delivered on a progressive ratio schedule, requiring mice to make multiple lever presses with increasing response requirements to obtain a reward (Gourley et al. 2010). Along similar lines, the most robust increases in preference for larger/delayed rewards induced by mOFC lesions emerged after changes in reward contingencies (i.e., switching from nondelayed to delayed rewards, reversing reward contingencies; Mar et al. 2011). These data, in addition to the present findings, suggest that the mOFC may influence how receipt of reward influences subsequent behavior most prominently in novel or uncertain situations, or when specific actions do not always yield at least some reward. Indeed, it is precisely these types of scenarios that would require a system to lessen the impact of reward on subsequent action selection and reduce persistent modes of responding, to promote flexibility and exploration of other options if reward contingencies change. Thus, by mitigating the impact that rewards exert over decision biases, the mOFC may promote more adaptive response selection in changing environments and aid in “deciding whether or not it is worth adapting or maintaining decisions” (Boorman et al. 2009).

Funding

This work was supported by a grant from the Canadian Institutes of Health Research (MOP 89861) to S.B.F. S.B.F. is a Michael Smith Foundation for Health Research Senior Scholar.

Notes

Conflict of Interest: None declared.

References

Anderson AK, Christoff K, Stappen I, Panitz D, Ghahremani DG, Glover G, Gabrieli JD, Sobel N. Dissociated neural representations of intensity and valence in human olfaction. Nat Neurosci, 2003, vol. 6 (pg. 196-202)
Bari A, Eagle DM, Mar AC, Robinson ES, Robbins TW. Dissociable effects of noradrenaline, dopamine, and serotonin uptake blockade on stop task performance in rats. Psychopharmacology, 2009, vol. 205 (pg. 273-283)
Basten U, Biele G, Heekeren HR, Fiebach CJ. How the brain integrates costs and benefits during decision making. Proc Natl Acad Sci USA, 2010, vol. 107 (pg. 21767-21772)
Bechara A, Damasio AR, Damasio H, Anderson SW. Insensitivity to future consequences following damage to the human prefrontal cortex. Cognition, 1994, vol. 50 (pg. 7-15)
Bechara A, Damasio H, Damasio AR, Lee GP. Different contributions of the human amygdala and ventromedial prefrontal cortex to decision-making. J Neurosci, 1999, vol. 19 (pg. 5473-5481)
Boorman ED, Behrens TE, Woolrich MW, Rushworth MF. How green is the grass on the other side? Frontopolar cortex and the evidence in favor of alternative courses of action. Neuron, 2009, vol. 62 (pg. 733-743)
Cardinal RN, Howes NJ. Effects of lesions of the nucleus accumbens core on choice between small certain rewards and large uncertain rewards in rats. BMC Neurosci, 2005, vol. 6 pg. 9
Cardinal RN, Robbins TW, Everitt BJ. The effects of d-amphetamine, chlordiazepoxide, alpha-flupenthixol and behavioural manipulations on choice of signaled and unsignalled delayed reinforcement in rats. Psychopharmacology, 2000, vol. 152 (pg. 362-375)
Clark L, Bechara A, Damasio H, Aitken MR, Sahakian BJ, Robbins TW. Differential effects of insular and ventromedial prefrontal cortex lesions on risky decision-making. Brain, 2008, vol. 131 (pg. 1311-1322)
Fellows LK, Farah MJ. Different underlying impairments in decision-making following ventromedial and dorsolateral frontal lobe damage in humans. Cereb Cortex, 2005, vol. 15 (pg. 58-63)
FitzGerald TH, Seymour B, Dolan RJ. The role of human orbitofrontal cortex in value comparison for incommensurable objects. J Neurosci, 2009, vol. 29 (pg. 8388-8395)
Floresco SB, St Onge JR, Ghods-Sharifi S, Winstanley CA. Cortico-limbic-striatal circuits subserving different forms of cost-benefit decision making. Cogn Affect Behav Neurosci, 2008, vol. 8 (pg. 375-389)
Floresco SB, Whelan JM. Perturbations in different forms of cost/benefit decision making induced by repeated amphetamine exposure. Psychopharmacology, 2009, vol. 205 (pg. 189-201)
Ghods-Sharifi S, St Onge JR, Floresco SB. Fundamental contribution by the basolateral amygdala to different forms of decision making. J Neurosci, 2009, vol. 29 (pg. 5251-5259)
Gottfried JA, Deichmann R, Winston JS, Dolan RJ. Functional heterogeneity in human olfactory cortex: an event-related functional magnetic resonance imaging study. J Neurosci, 2002, vol. 22 (pg. 10819-10828)
Gourley SL, Lee AS, Howell JL, Pittenger C, Taylor JR. Dissociable regulation of instrumental action within mouse prefrontal cortex. Eur J Neurosci, 2010, vol. 32 (pg. 1726-1734)
Hoover WB, Vertes RP. Projections of the medial orbital and ventral orbital cortex in the rat. J Comp Neurol, 2011, vol. 519 (pg. 3766-3801)
Kim H, Shimojo S, O'Doherty JP. Is avoiding an aversive outcome rewarding? Neural substrates of avoidance learning in the human brain. PLoS Biol, 2006, vol. 4 pg. e233
Lim SL, O'Doherty JP, Rangel A. The decision value computations in the vmPFC and striatum use a relative value code that is guided by visual attention. J Neurosci, 2011, vol. 31 (pg. 13214-13223)
Manes F, Sahakian B, Clark L, Rogers R, Antoun N, Aitken M, Robbins T. Decision-making processes following damage to the prefrontal cortex. Brain, 2002, vol. 125 (pg. 624-639)
Mar AC, Walker AL, Theobald DE, Eagle DM, Robbins TW. Dissociable effects of lesions to orbitofrontal cortex subregions on impulsive choice in the rat. J Neurosci, 2011, vol. 31 (pg. 6398-6404)
Mobini S, Body S, Ho MY, Bradshaw CM, Szabadi E, Deakin JF, Anderson IM. Effects of lesions of the orbitofrontal cortex on sensitivity to delayed and probabilistic reinforcement. Psychopharmacology, 2002, vol. 160 (pg. 290-298)
O'Doherty J, Kringelbach ML, Rolls ET, Hornak J, Andrews C. Abstract reward and punishment representations in the human orbitofrontal cortex. Nat Neurosci, 2001, vol. 4 (pg. 95-102)
Padoa-Schioppa C, Assad JA. The representation of economic value in the orbitofrontal cortex is invariant for changes of menu. Nat Neurosci, 2008, vol. 11 (pg. 95-102)
Pais-Vieira M, Lima D, Galhardo V. Orbitofrontal cortex lesions disrupt risk assessment in a novel serial decision-making task in rats. Neuroscience, 2007, vol. 145 (pg. 225-231)
Paxinos G, Watson C. The rat brain in stereotaxic coordinates, 2005, 4th ed. San Diego, CA: Academic Press
Philiastides MG, Biele G, Heekeren HR. A mechanistic account of value computation in the human brain. Proc Natl Acad Sci USA, 2010, vol. 107 (pg. 9430-9435)
Price JL. Definition of the orbital cortex in relation to specific connections with limbic and visceral structures and other cortical regions. Ann N Y Acad Sci, 2007, vol. 1121 (pg. 54-71)
Rogers RD, Everitt BJ, Baldacchino A, Blackshaw AJ, Swainson R, Wynne K, Baker NB, Hunter J, Carthy T, Booker E, et al. Dissociable deficits in the decision-making of chronic amphetamine abusers, opiate abusers, patients with focal damage to prefrontal cortex, and tryptophan-depleted normal volunteers: evidence for monoaminergic mechanisms. Neuropsychopharmacology, 1999, vol. 20 (pg. 322-339)
Rolls ET, Kringelbach ML, de Araujo IE. Differential representations of pleasant and unpleasant odours in the human brain. Eur J Neurosci, 2003, vol. 18 (pg. 695-703)
Rudebeck PH, Walton ME, Smyth AN, Bannerman DM, Rushworth MFS. Separate neural pathways process different decision costs. Nat Neurosci, 2006, vol. 9 (pg. 1161-1168)
Rushworth MF, Noonan MP, Boorman ED, Walton ME, Behrens TE. Frontal cortex and reward-guided learning and decision-making. Neuron, 2011, vol. 70 (pg. 1054-1069)
Small DM, Zatorre RJ, Dagher AJ, Evans AC, Jones-Gotman M. Changes in brain activity related to eating chocolate: from pleasure to aversion. Brain, 2001, vol. 124 (pg. 1720-1733)
St. Onge JR, Abhari H, Floresco SB. Dissociable contributions by prefrontal D1 and D2 receptors to risk-based decision making. J Neurosci, 2011, vol. 31 (pg. 8625-8633)
St. Onge JR, Floresco SB. Dopaminergic modulation of risk-based decision making. Neuropsychopharmacology, 2009, vol. 34 (pg. 681-697)
St. Onge JR, Floresco SB. Prefrontal cortical contribution to risk-based decision making. Cereb Cortex, 2010, vol. 20 (pg. 1816-1828)
St. Onge JR. The contribution of prefrontal-subcortical circuitry to risk-based decision making, 2011. Vancouver, Canada: University of British Columbia Doctoral Thesis
St. Onge JR, Stopper CM, Zahm DS, Floresco SB. Separate prefrontal-subcortical circuits mediate different components of risk-based decision making. J Neurosci, 2012, vol. 32 (pg. 2886-2899)
Stopper CM, Floresco SB. Contributions of the nucleus accumbens and its subregions to different aspects of risk-based decision making. Cogn Affect Behav Neurosci, 2011, vol. 11 (pg. 97-112)
Ursu S, Carter CS. Outcome representations, counterfactual comparisons and the human orbitofrontal cortex: implications for neuroimaging studies of decision-making. Brain Res Cogn Brain Res, 2005, vol. 23 (pg. 51-60)
Winstanley CA, Theobald DE, Cardinal RN, Robbins TW. Contrasting roles of basolateral amygdala and orbitofrontal cortex in impulsive choice. J Neurosci, 2004, vol. 24 (pg. 4718-4722)
Zeeb FD, Floresco SB, Winstanley CA. Contributions of the orbitofrontal cortex to impulsive choice: interactions with basal levels of impulsivity, dopamine signalling, and reward-related cues. Psychopharmacology, 2010, vol. 211 (pg. 87-98)
Zeeb FD, Winstanley CA. Lesions of the basolateral amygdala and orbitofrontal cortex differentially affect acquisition and performance of a rodent gambling task. J Neurosci, 2011, vol. 31 (pg. 2197-2204)