Abstract

Instrumental actions are a vital cognitive asset that endows an organism with sensitivity to the consequences of its behavior. Response–outcome feedback allows responding to be shaped in order to maximize beneficial, and minimize detrimental, outcomes. Lesions of the medial prefrontal cortex (mPFC) result in behavior that is insensitive to changes in outcome value in animals and compulsive behavior in several human psychopathologies. Such insensitivity to changes in outcome value is a defining characteristic of instrumental habits: responses that are controlled by antecedent stimuli rather than goal expectancy. Little is known regarding the neurochemical substrates mediating this sensitivity. The present experiments used sensitivity to posttraining outcome devaluation to index the action–habit status of instrumental responding. Infusions of dopamine into the ventral mPFC (vmPFC), but not dorsal mPFC, restored outcome sensitivity bidirectionally—decreasing responding following outcome devaluation and increasing responding when the outcome was not devalued. This bidirectionality makes the possibility that these infusions nonspecifically dysregulated vmPFC dopamine transmission unlikely. VmPFC dopamine promoted instrumental responding appropriate to outcome value. Reinforcer consumption data indicated that this was not a consequence of altered sensitivity to the reinforcer itself. We suggest that vmPFC dopamine reengages attentional processes underlying goal-directed behavior.

Introduction

Acquiring knowledge about the relationship between behavior and its consequences is a vital cognitive asset that permits responses to be acquired and/or modified in the pursuit of specific goals (Balleine and Dickinson 1998). This “response–outcome association” is one of many encoded during instrumental conditioning (Colwill 1994). Clearly, dysfunction of the neural substrates that encode or process this information may result in maladaptive behavior (Toates 1998).

Instrumental responses controlled by knowledge of their specific consequences are termed “actions.” The goal-directed nature of actions is reflected by their sensitivity to posttraining changes in outcome value. Thus, the performance of an action spontaneously tracks changes in the value of its specific outcome (Dickinson 1985). Continuous monitoring of the response–outcome relationship requires allocation of prefrontal cortex (PFC)-dependent, limited-capacity cognitive processes (Gehring and Knight 2000; Miller and Cohen 2001). Such monitoring can become redundant once responding has been shaped to produce the optimal behavioral outcome. Thus, extended, stable response–outcome conditions promote a shift in the control of instrumental responding from being goal directed to being stimulus elicited. Instrumental responses triggered by environmental stimuli are termed “habits.” The stimulus–response nature of habits is reflected by their insensitivity to posttraining changes in outcome value (Dickinson 1985). This action–habit shift allows efficient responding to be maintained while permitting the reallocation of cognitive resources to other demands (Gehring and Knight 2000). Conversely, the inability to shift from habit to action leaves an organism unable to readily adapt to changing environmental conditions. This deficit is apparent in various human psychiatric populations in whom general cognitive functions are intact yet who compulsively perform maladaptive behaviors that are primarily stimulus elicited (Lhermitte 1986; Toates 1998; Montague and Berns 2002).

Studies in animals indicate that medial PFC (mPFC) integrity is necessary both for the initial acquisition of task-relevant responses and the suppression of task-irrelevant, especially prepotent, responses (Balleine and Dickinson 1998; Coutureau and Killcross 2003; Killcross and Coutureau 2003; Clark et al. 2004; Ostlund and Balleine 2005). Whereas these studies help delineate the anatomical substrates mediating the balance between actions and habits, the neurochemical substrates that specifically influence this balance are not well understood. Dopamine is one likely candidate because activation of dopamine neurons increases during response acquisition but declines following extended training under stable (i.e., habit promoting) conditions (Ljungberg et al. 1992). The preferential cortical targets of these dopaminergic neurons are precisely those areas implicated in outcome sensitivity, namely, the ventral mPFC (vmPFC) (Van Eden et al. 1987; Smiley et al. 1992).

Whereas striatal dopamine has been implicated in habitual instrumental responding (Faure et al. 2005), there have been no investigations of mPFC dopamine modulation of actions and/or habits. Here, we tested the hypothesis that infusions of dopamine into the vmPFC would promote goal-directed behavior by increasing the sensitivity of instrumental responding to the value of its outcome. Accordingly, we examined whether, following habit formation, vmPFC infusions of dopamine would promote responding appropriate to outcome value, which we manipulated using a devaluation procedure that we and others have previously shown to reduce the performance of an instrumental action (e.g., Dickinson et al. 2002; Miles et al. 2003; Hitchcott et al. 2005; Quinn et al. 2006). Because the vmPFC is implicated in the production of adaptive behavior (e.g., Dalley et al. 2004), and dopamine has been suggested to exert a permissive effect on this function (Hollerman et al. 2000; Sullivan 2004; Fletcher et al. 2005; Floresco et al. 2006), we hypothesized that dopamine infusions would promote adaptive behavioral adjustments appropriate to current outcome value (i.e., enhance responding for a valued outcome and reduce responding for a devalued outcome).

Methods

General Procedural Outline

We conducted 2 experiments examining the role of vmPFC dopamine in the expression of an instrumental habit. We first trained all animals using procedures intended to generate an instrumental habit. Subsequently, half of each group underwent a posttraining associative reinforcer devaluation procedure in which the reinforcer that was used during instrumental training was paired with lithium chloride (LiCl). A common procedure was then used to probe whether instrumental performance was an action or a habit. Because actions are determined by knowledge of their outcomes, their performance spontaneously tracks outcome value such that posttraining devaluation of the outcome results in decreased performance of the action. By contrast, habits are directly elicited by antecedent environmental stimuli. As a result, posttraining associative devaluation does not affect the performance of habits. Hence, sensitivity to posttraining changes in outcome value provides an objective criterion for distinguishing actions from habits. In these experiments, the impact of an intra-vmPFC dopamine infusion on instrumental performance was assessed. Finally, in experiment 2, all rats underwent a reinforcer consumption test immediately following the instrumental test session. This was done to test whether dopamine infusions directly affected reinforcer value.

Subjects and Apparatus

Experimentally naive male Sprague-Dawley rats (Charles River Laboratories, Wilmington, MA), weighing 300–420 g at the time of surgery, were used for these experiments. The rats were pair housed and allowed to acclimate to the colony room for at least 1 week prior to surgery. Training and testing were conducted in 8 Med-Associates (St Albans, VT) operant chambers housed within sound-attenuating boxes. Each chamber was equipped with a liquid dipper that, when activated, delivered 0.06 mL of 20% (w/v) sucrose solution into a recessed magazine located in one of the end walls of the chamber. A retractable lever was located to the left of the magazine. A 3-W 24-V house light mounted at the center of the top wall opposite the magazine provided illumination. Each sound-attenuating chamber was fitted with a fan and a speaker to deliver white noise to mask sounds from outside. A PC equipped with the MED-PC software controlled the equipment and recorded lever presses.

Surgery

Rats were anesthetized with Equithesin (4 mL/kg intraperitoneally [i.p.]), treated with the nonsteroidal anti-inflammatory drug carprofen (5 mg/kg subcutaneously), and placed in a stereotaxic frame. Small holes were drilled into the skull, and a bilateral guide cannula (22 gauge) was lowered into either the vmPFC, specifically targeting 1 mm above the infralimbic (IL) cortex (+3.0 mm anterior-posterior [AP], ±0.5 mm medial-lateral [ML], and −4.2 mm dorsal-ventral [DV] relative to bregma), or the dorsal mPFC (dmPFC), specifically targeting 1 mm above the cingulate (CG1) cortex (+3.0 mm AP, ±0.5 mm ML, and −2.0 mm DV relative to bregma) (Paxinos and Watson 2005). Subsequent infusions were performed using a bilateral injection cannula (26 gauge) that projected 1 mm beyond the guide cannula. The guide cannula was fixed to the skull using stainless steel screws and acrylic cement. Not less than 1 week after surgery, rats were placed on a food deprivation schedule receiving 10–15 g/day of their maintenance diet until body weights were reduced to 85% of their preoperative, free-feeding level. Once training began, they were fed sufficient food, at least 1 h after the training session, to maintain this body weight. Access to water in their home cage was ad libitum.

Lever-Press Training

Experiment 1: vmPFC Dopamine and Instrumental Performance following Devaluation

Rats (n = 40) were first given a single session of magazine training in which sucrose solution was delivered (32 presentations) on a variable time 60-s schedule with the lever removed. Next, lever-press training was initiated during a single session of continuous reinforcement that terminated once 30 reinforcers were earned. For all subsequent training sessions (once daily), a random interval (RI) schedule of reinforcement was used. Sessions ended after 90 min or the delivery of 100 reinforcers, whichever came first. Sucrose was initially delivered on a RI 15-s schedule that was incremented in steps of 15 s each day until RI 60 s was reached. Training continued for a further 4 sessions using a RI 60-s schedule. All animals successfully acquired the lever-press response during the initial continuous reinforcement (fixed ratio 1) session, and the rate of responding on the final RI 60-s session was 12.4 ± 1.4 responses per minute. Animals were matched for their final rate of responding prior to group assignment. Final group sizes were (value/infusion) the following: valued/vehicle, n = 12; valued/dopamine, n = 12; devalued/vehicle, n = 6; and devalued/dopamine, n = 10.
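
The RI progression described above can be laid out as a short sketch. It reads "a further 4 sessions" as four sessions after the first RI 60-s session, which is an interpretation. The per-second arming rule in `simulate_ri_session` is one common way to implement a random interval schedule and is an assumption, not a detail given in the text; the function names and the press probability are hypothetical, while the 90-min limit and the 100-reinforcer cap come from the procedure.

```python
import random

def ri_training_plan(start_s=15, step_s=15, final_s=60, extra_final_sessions=4):
    """Return the RI value (in seconds) for each daily RI session."""
    plan = list(range(start_s, final_s + 1, step_s))   # 15, 30, 45, 60
    plan += [final_s] * extra_final_sessions           # four further RI 60-s sessions
    return plan

def simulate_ri_session(ri_s, press_prob_per_s, max_minutes=90,
                        max_reinforcers=100, seed=0):
    """Simulate one session second by second (assumed RI implementation).

    Each second a reinforcer becomes available with probability 1/ri_s.
    Once available, it stays available until the next lever press collects it.
    Returns (lever presses, reinforcers earned).
    """
    rng = random.Random(seed)
    armed = False
    presses = 0
    earned = 0
    for _ in range(max_minutes * 60):
        if not armed and rng.random() < 1.0 / ri_s:
            armed = True
        if rng.random() < press_prob_per_s:   # hypothetical press probability
            presses += 1
            if armed:
                earned += 1
                armed = False
                if earned >= max_reinforcers:
                    break                     # session ends at the reinforcer cap
    return presses, earned

print(ri_training_plan())
```

Because reinforcer availability depends on elapsed time rather than on the number of presses, responding faster on such a schedule earns little extra reward, a property often associated with habit formation.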

Experiment 2: Dissociation between vmPFC and dmPFC

In experiment 1, infusions of dopamine into the vmPFC produced bidirectional effects on responding depending upon the outcome value. A second experiment was conducted to assess the anatomical specificity of these effects. This consisted of a replication of experiment 1 that included additional groups of animals that received infusions of vehicle or dopamine into the dmPFC. The training procedures were identical to those used in experiment 1. All animals acquired the instrumental response, and no differences were observed between vmPFC and dmPFC cannulated groups either in the rate at which this response was acquired or in the final rate of responding (responses per minute: dmPFC = 15.2 ± 1.5; vmPFC = 17.7 ± 2.0). Animals were matched for their final rate of responding prior to group assignment. Final group sizes were vmPFC (value/infusion): valued/vehicle, n = 4; valued/dopamine, n = 7; devalued/vehicle, n = 5; devalued/dopamine, n = 6; dmPFC (value/infusion): valued/vehicle, n = 7; valued/dopamine, n = 6; devalued/vehicle, n = 7; devalued/dopamine, n = 8.

Experiments 1 and 2: Outcome Devaluation by Conditioned Taste Aversion

In experiment 1, over the 3 consecutive days following the final instrumental training session, the sucrose reward was devalued using a conditioned taste aversion procedure. For the devaluation sessions, each rat was allowed 30 min free access to a drinking tube containing 20% (w/v) sucrose (i.e., the reinforcer used during instrumental training) in a novel context. Immediately thereafter, half of the animals received an i.p. injection of lithium chloride (LiCl; 0.6 M, 5 mL/kg), whereas the remaining half received sodium chloride (0.6 M, 5 mL/kg). During these 3 days, conditioned taste aversion training was conducted at least 4 h in advance of daily feeding. In experiment 2, the outcome devaluation phase was modified such that animals in the valued and devalued groups received equivalent exposure to LiCl, except that the LiCl was paired with sucrose only in the devalued groups. Thus, over the 6 consecutive days following the final instrumental training session, all animals received 3 injections of LiCl (0.6 M, 5 mL/kg) once every 2 days. Subjects assigned to the “devalued” subgroups received this injection immediately following 30 min access to sucrose as in experiment 1 (i.e., the sucrose and LiCl were paired). Subjects assigned to the “valued” subgroups received nothing following sucrose access. On the alternate day of each 2-day cycle, valued subgroups received LiCl (0.6 M, 5 mL/kg) in the home cage (i.e., the sucrose and LiCl were unpaired), whereas devalued subgroups received nothing. The order of treatment was counterbalanced such that half the subjects received LiCl on days 1, 3, and 5 and half on days 2, 4, and 6.
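
The paired and unpaired arrangement used in experiment 2 is easier to follow when written out day by day. The sketch below is an illustration only: the function names and the "odd"/"even" labels are hypothetical, and it assumes that valued animals received sucrose access on the days they did not receive LiCl, as the description of the 2-day cycle implies. The dose helper simply converts the stated 0.6 M at 5 mL/kg into mmol/kg.

```python
def experiment2_schedule(group, licl_days):
    """Return a list of 6 (sucrose_access, licl_injection) pairs, one per day.

    group: "devalued" (sucrose paired with LiCl) or "valued" (unpaired).
    licl_days: "odd" for LiCl on days 1, 3, 5 or "even" for days 2, 4, 6.
    """
    if group not in ("devalued", "valued"):
        raise ValueError("group must be 'devalued' or 'valued'")
    if licl_days not in ("odd", "even"):
        raise ValueError("licl_days must be 'odd' or 'even'")
    schedule = []
    for day in range(1, 7):
        licl_today = (day % 2 == 1) == (licl_days == "odd")
        if group == "devalued":
            sucrose_today = licl_today        # sucrose access is followed by LiCl
        else:
            sucrose_today = not licl_today    # sucrose and LiCl on alternate days
        schedule.append((sucrose_today, licl_today))
    return schedule

def licl_dose_mmol_per_kg(molarity_mol_per_l=0.6, volume_ml_per_kg=5.0):
    """mol/L multiplied by mL/kg gives mmol/kg."""
    return molarity_mol_per_l * volume_ml_per_kg

for group in ("devalued", "valued"):
    print(group, experiment2_schedule(group, "odd"))
```

Under this reading every animal receives 3 sucrose exposures and 3 LiCl injections, so the groups differ only in whether the two events coincide.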

Experiments 1 and 2: Habit Test

In experiments 1 and 2, the day following the final outcome devaluation session, each rat was lightly restrained and over 2 min received bilateral intracranial infusions (0.25 μL/min) of dopamine (total dose 0 or 20 μg in 1 μL) dissolved in phosphate-buffered saline containing 0.1% w/v ascorbic acid. Selection of the dose of dopamine was based on data showing that higher doses produced a nonspecific suppression of responding (Hitchcott PK, Taylor JR, unpublished observations). Ten minutes after this infusion, all animals received a 5-min test of instrumental performance conducted in extinction. This test began with the illumination of the house light and insertion of the lever and ended with lever retraction and offset of the house light. No reinforcement was delivered during this test, to ensure that responding was determined solely by the information encoded during the previous training phases. In addition, a short test session was employed to obviate any influence of extinction (Dickinson 1985). In experiment 2, a brief (5 min) sucrose consumption test was conducted immediately following the habit test. The purpose of this was to establish whether the observed effects of dopamine infusions might be due to altered sensitivity to the outcome per se.
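
The infusion parameters can be checked with simple arithmetic. The sketch below assumes that the 0.25 μL/min rate applies to each side of the bilateral cannula and that "20 μg in 1 μL" refers to the total across both hemispheres; neither point is stated explicitly, and the function name is hypothetical. Under these assumptions the stated rate, duration, and total volume are mutually consistent.

```python
def infusion_summary(rate_ul_per_min=0.25, duration_min=2.0,
                     sides=2, total_dose_ug=20.0):
    """Derive per-side and total infusion quantities (assumed per-side rate)."""
    volume_per_side_ul = rate_ul_per_min * duration_min    # 0.25 * 2 = 0.5 uL
    total_volume_ul = volume_per_side_ul * sides           # 0.5 * 2 = 1.0 uL
    dose_per_side_ug = total_dose_ug / sides               # 20 / 2 = 10 ug
    concentration_ug_per_ul = total_dose_ug / total_volume_ul
    return {
        "volume_per_side_ul": volume_per_side_ul,
        "total_volume_ul": total_volume_ul,
        "dose_per_side_ug": dose_per_side_ug,
        "concentration_ug_per_ul": concentration_ug_per_ul,
    }

print(infusion_summary())
```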

Histology

Upon completion of behavioral testing, animals were anesthetized using pentobarbital (≥90 mg/kg i.p.) and perfused transcardially with 0.9% saline followed by 10% formalin. Brains were extracted and placed in 10% formalin. Two days prior to being sliced, the brains were transferred to a 10% formalin/30% sucrose solution. Brains were then sliced in 50-μm-thick coronal sections using a cryostat. Sections at the level of the infusion site were mounted on microscope slides and subsequently stained using Cresyl Violet Acetate (Sigma, St. Louis, MO). Infusion sites were identified under low-power light microscopy and their location recorded using the atlas of Paxinos and Watson (2005).

Statistical Analyses

Consumption data derived from the outcome devaluation phase of each experiment were analyzed using a 3-way mixed-design analysis of variance (ANOVA). Devaluation treatment (experiment 1: sodium chloride vs. lithium chloride; experiment 2: LiCl paired or LiCl unpaired) and infusion (experiments 1 and 2: vehicle or dopamine, which would be administered before the final test) were included as between-subject factors and day as a within-subject factor. Lever-press data from the final instrumental test were analyzed using a 2-way between-subjects ANOVA using devaluation treatment and infusion (vehicle or dopamine) as factors. Where indicated by a significant interaction of these factors, post hoc t-tests were performed. In experiment 2, posttest sucrose consumption data were not normally distributed in the devalued groups; therefore, these data were further analyzed using nonparametric statistics (Mann–Whitney U).
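
As a quick consistency check on the 2-way between-subjects design, the error degrees of freedom for the lever-press ANOVA should equal the total number of animals minus the number of cells (here 2 × 2 = 4). The sketch below applies that rule to the group sizes listed in the training sections; the function name and dictionary keys are hypothetical labels.

```python
def error_df_between(cell_sizes):
    """Error df for a fully between-subjects factorial ANOVA: N minus cells."""
    return sum(cell_sizes) - len(cell_sizes)

# Cell sizes in the order valued/vehicle, valued/dopamine,
# devalued/vehicle, devalued/dopamine (taken from the group listings).
groups = {
    "exp1_vmPFC": [12, 12, 6, 10],
    "exp2_vmPFC": [4, 7, 5, 6],
    "exp2_dmPFC": [7, 6, 7, 8],
}

for name, sizes in groups.items():
    print(name, "N =", sum(sizes), "error df =", error_df_between(sizes))
```

The resulting values (36, 18, and 24) match the error degrees of freedom attached to the habit-test F ratios reported in the Results.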

Results

Histology

In experiment 1, all animals had infusion sites located within the vmPFC (these were distributed throughout the entire dorsal–ventral limits of IL cortex and within 0.5 mm of the intended coronal plane, i.e., AP +3.0 ± 0.5 mm) and were therefore included in the statistical analyses (see Fig. 1a). In experiment 2, a similar distribution of infusion sites was observed in the vmPFC group, whereas animals in the dmPFC group had infusion sites located within the CG1 region (these also were distributed within 0.5 mm of the intended coronal plane, i.e., AP +3.0 ± 0.5 mm; see Fig. 1b).

Figure 1.

Simplified schematic (adapted from Paxinos and Watson 2005) of the rat PFC showing the location of the sites at which dopamine or vehicle was infused in experiment 1 (a) and experiment 2 (b).

Experiment 1: vmPFC Dopamine and Instrumental Performance following Devaluation

Outcome Devaluation

The data from the outcome devaluation phase are presented in Figure 2a. The conditioned taste aversion was acquired rapidly in those groups receiving lithium chloride but not in animals receiving sodium chloride (devaluation: F1,36 = 34.4, P < 0.001; devaluation × day interaction: F2,72 = 59.63, P < 0.001).

Figure 2.

(a) Sucrose consumption during outcome devaluation in rats receiving saline (open symbols) or lithium chloride (closed symbols) and that would subsequently receive intra-vmPFC vehicle (circles) or dopamine (squares) prior to test. (b) Instrumental performance during the 5-min habit test following infusion of vehicle or dopamine. *P < 0.05 and **P < 0.01. ††P < 0.01 versus respective valued group.

Habit Test

As seen in Figure 2b, rats infused with vehicle did not differ in responding whether or not the reinforcer had been devalued, confirming that the training procedures resulted in the development of an instrumental habit. The effect of vmPFC dopamine infusion was dependent upon outcome value (devaluation × infusion interaction: F1,36 = 25.26, P < 0.01). In comparison with vehicle-infused controls, dopamine infusions reduced responding for a devalued outcome (P < 0.001) and increased responding for a valued outcome (P < 0.05).

Experiment 2: Dissociation between vmPFC and dmPFC

Outcome Devaluation

The data from the outcome devaluation phase are presented in Figure 3a,b. It is evident that the conditioned taste aversion was acquired rapidly in the groups receiving sucrose–lithium chloride pairings (devalued) but not in animals receiving unpaired (valued) sucrose and lithium chloride (vmPFC devaluation: F1,18 = 16.98, P < 0.01; devaluation × day interaction: F2,34 = 23.51, P < 0.001; dmPFC devaluation: F1,24 = 22.75, P < 0.001; devaluation × day interaction: F2,46 = 34.87, P < 0.001).

Figure 3.

(a) Sucrose consumption during outcome devaluation in rats receiving unpaired (open symbols; valued) or paired (closed symbols; devalued) presentations of sucrose and LiCl and that would subsequently receive intra-vmPFC vehicle (circles) or dopamine (squares) prior to test. (b) Sucrose consumption during outcome devaluation in rats receiving unpaired (open symbols; valued) or paired (closed symbols; devalued) presentations of sucrose and LiCl and that would subsequently receive intra-dmPFC vehicle (circles) or dopamine (squares) prior to test. (c) Instrumental performance during the 5-min habit test following vmPFC infusion of vehicle or dopamine. *P < 0.05 and **P < 0.01. ††P < 0.01 versus respective valued group. (d) Instrumental performance during the 5-min habit test following dmPFC infusion of vehicle or dopamine. *P < 0.05 and **P < 0.01. ††P < 0.01 versus respective valued group. (e) Posttest sucrose consumption following vmPFC infusion of vehicle or dopamine. (f) Posttest sucrose consumption following dmPFC infusion of vehicle or dopamine.

Habit Test

The data from the final habit test session are shown in Figure 3c,d. Separate analyses were performed on the data obtained from vmPFC- and dmPFC-infused groups. There was no effect of outcome devaluation in either vmPFC or dmPFC vehicle-infused rats, confirming that the training procedures produced an instrumental habit. Figure 3c shows the effect of dopamine infusions into the vmPFC on instrumental performance. The statistical analysis replicated the results of experiment 1. Dopamine exerted a differential effect on responding depending on outcome value (infusion × devaluation interaction: F1,18 = 12.48, P < 0.01). Post hoc analyses indicated that dopamine enhanced responding in valued animals (P < 0.05) and reduced responding in devalued animals (P < 0.05). The latter effect reflected a reversal of the instrumental habit that was observed in the vehicle-infused subjects. Figure 3d shows the effect of dopamine infusions into the dmPFC on instrumental performance. Dopamine nonspecifically reduced responding regardless of outcome value (infusion: F1,24 = 6.23, P < 0.05; infusion × devaluation interaction: F1,24 < 1).

Consumption Test

Data from the consumption test are shown in Figure 3e,f. Separate analyses were performed on the data obtained from vmPFC- and dmPFC-infused groups. Figure 3e shows the effect of dopamine infusions into the vmPFC on sucrose consumption. Prior outcome devaluation was clearly effective in reducing sucrose consumption (devaluation: F1,14 = 9.74, P < 0.01). However, dopamine infusions had no significant effects on sucrose consumption in either the valued or the devalued groups (both F values <1). A very similar pattern of results was observed in animals that received dopamine infusions into the dmPFC (Fig. 3f). Again, prior outcome devaluation reduced sucrose consumption (devaluation: F1,24 = 11.44, P < 0.01), and this effect was unaltered by dopamine infusion (both F values <1). The near complete suppression of sucrose consumption following devaluation resulted in a violation of normality within those groups. The consumption data were, therefore, further analyzed using the nonparametric Mann–Whitney U test. This analysis entirely supported the ANOVA results. Prior outcome devaluation was clearly effective in reducing sucrose consumption in the vmPFC group in both vehicle- (U = 20.0, P < 0.01) and dopamine- (U = 40.0, P < 0.01) infused rats. However, dopamine had no significant effects on sucrose consumption in either the valued or the devalued (P values >0.05) groups (Fig. 3e). An identical pattern of results was observed in animals that received dopamine infusions into the dmPFC (Fig. 3f). Prior outcome devaluation reduced sucrose consumption in both vehicle- (U = 49.0, P < 0.01) and dopamine- (U = 48.0, P < 0.01) infused groups, and these effects were unaltered by dopamine infusion in both the valued and the devalued (P values >0.05) groups.
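
For readers unfamiliar with the Mann–Whitney U statistic, a minimal pure-Python sketch follows. The group sizes of 4 and 5 are those of the vmPFC valued/vehicle and devalued/vehicle groups, but the consumption values are invented purely for illustration and are not data from the study. The function name is hypothetical, and the text does not state which of the two U values was reported.

```python
def mann_whitney_u(x, y):
    """Return (U_x, U_y) for two independent samples, using midranks for ties."""
    combined = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    ranks = [0.0] * len(combined)
    i = 0
    while i < len(combined):
        j = i
        while j + 1 < len(combined) and combined[j + 1][0] == combined[i][0]:
            j += 1
        midrank = (i + j) / 2.0 + 1.0      # mean of the 1-based ranks i+1 .. j+1
        for k in range(i, j + 1):
            ranks[k] = midrank
        i = j + 1
    rank_sum_x = sum(r for r, (_, g) in zip(ranks, combined) if g == 0)
    n1, n2 = len(x), len(y)
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    u_y = n1 * n2 - u_x
    return u_x, u_y

# Hypothetical consumption values (mL); not taken from the study.
valued_vehicle = [10.0, 11.5, 9.0, 12.0]
devalued_vehicle = [0.5, 0.0, 1.0, 0.2, 0.0]
print(mann_whitney_u(valued_vehicle, devalued_vehicle))  # (20.0, 0.0)
```

When the two groups do not overlap at all, the larger U equals the product of the group sizes. The reported values of 20.0, 49.0, and 48.0 equal 4 × 5, 7 × 7, and 6 × 8, respectively, which is consistent with complete separation of valued and devalued animals in those comparisons, whereas 40.0 falls just short of the maximum of 7 × 6 = 42.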

Discussion

The present study demonstrates that when animals are trained using procedures that produce an instrumental habit, vmPFC dopamine infusions produce opposite effects depending on whether the reinforcer has been devalued by pairings with lithium chloride. When the reinforcer is devalued, dopamine reduces responding. When it is not devalued, dopamine enhances responding. Therefore, vmPFC dopamine increases the sensitivity of instrumental responding to the value of its outcome, leading to adaptive bidirectional changes in performance. This bidirectionality makes the possibility that vmPFC dopamine nonspecifically disrupted behavior unlikely. Consistent with the anatomical and functional heterogeneity of the mPFC, dopamine infusions into the vmPFC, but not the dmPFC, produced this bidirectional modulation of performance. The mechanism by which performance was affected appeared not to involve changes in the primary motivational properties of the reinforcer. First, dopamine infusions into either region failed to alter posttest reinforcer consumption. Second, in a separate study, identical vmPFC dopamine infusions suppressed, rather than enhanced, responding for a valued reinforcer as assessed by progressive ratio performance. Taken together, our data indicate that dopamine in the vmPFC can modify the sensitivity of behavior to its consequences, generating instrumental responding that is appropriate to current reinforcer value. These data demonstrate a specific neurochemical substrate that determines whether instrumental responding is expressed as an action or a habit.

It is possible that the present data showing bidirectional effects of vmPFC dopamine on instrumental performance reflect 2 dissociable underlying processes: one subserving the decrease in responding in devalued animals and another contributing to the increase in responding in valued animals. If this is the case, then the general decrease in responding following dmPFC dopamine infusions might suggest that both vmPFC and dmPFC dopamine contribute to a reversal of instrumental habit (promoting goal-directed behavior) as indicated by a decrease in responding following devaluation. However, the decrease in responding in valued animals is inconsistent with this interpretation because goal-directed behavior should track current reinforcer value (Dickinson 1985). Thus, there would be no reason for responding in valued animals to have decreased. Therefore, we believe that dmPFC dopamine generally decreases responding, though it is unclear what process this general suppression may reflect.

Given that the posttraining reinforcer devaluation by conditioned taste aversion was conducted in a separate context (not in the instrumental training context), there is some concern that this devaluation training may transfer to the instrumental context differentially among dopamine- and vehicle-infused animals. Whereas this could potentially account for dopamine-induced differences in devalued animals, this cannot account for the dopamine-induced differences observed in valued animals.

It is interesting that vmPFC dopamine appears to have the direct opposite effect in modulating goal-directed behavior compared with striatal dopamine. Faure et al. (2005) showed that lesions of the nigrostriatal dopamine system disrupt the development of an instrumental habit. Thus, generally speaking, it appears that decreased dopamine function in striatum and increased dopamine transmission in vmPFC both promote goal-directed behavior. It will be interesting to see whether infusions of dopamine directly into dorsal striatum and dopamine depletion (or antagonism) within the vmPFC have similar effects in promoting habitual responding. Future studies will need to address how dopamine transmission is differentially regulated in these 2 regions across training and/or how these 2 systems compete for control over performance.

vmPFC Dopamine Amplifies the Sensitivity of Behavior to Its Consequences

In humans, lesions of the vmPFC increase control over behavior by external stimuli at the expense of internally generated goals (Lhermitte 1986; Bechara et al. 1994). This shift is hypothesized to bias the expression of instrumental responding in animals toward habit (Balleine and Dickinson 1998)—objectively defined by insensitivity to posttraining changes in reinforcer value (Dickinson 1985). We found that in rats trained to express an instrumental stimulus–response habit, vmPFC infusions of dopamine restored the spontaneous sensitivity to altered reinforcer value. That is, animals that had the reinforcer devalued by pairings with lithium chloride responded less than animals that had not received reinforcer devaluation. A novel finding of this study is, therefore, the demonstration that vmPFC dopamine is a specific substrate that determines the expression of an instrumental response as an action or habit.

Numerous reports have shown that vmPFC function affects behavioral disinhibition (Arnsten and Li 2005; Robbins 2005) including, possibly, the disinhibition of instrumental actions (Coutureau and Killcross 2003; Killcross and Coutureau 2003). This latter study reported that temporary inactivation of the vmPFC using the γ-aminobutyric acid A agonist, muscimol, increased responding for a “valued reinforcer” (i.e., nondevalued) after habit training. Notably, muscimol failed to alter responding for the “devalued reinforcer” when compared with muscimol-infused valued animals. This finding appears inconsistent with a reversal of habit because instrumental responding that fails to track reinforcer devaluation does not meet the objective criterion defining habits (Dickinson 1985). It is possible in the study of Coutureau and Killcross (2003) that muscimol infusions produce a general increase in responding that masks any decrease in devalued animals. The increased responding in muscimol-infused valued animals would be consistent with this possibility. A major component of our data is the unequivocal demonstration of a reversal of instrumental habit because dopamine reduces responding for a devalued reinforcer and increases responding for a valued reinforcer, thereby eliminating interpretational difficulties imposed by potential changes in general performance.

In addition to the reversal of an instrumental habit measured by a reduction of instrumental responding following reinforcer devaluation, vmPFC dopamine infusions enhanced responding when the reinforcer was not devalued. A second major finding of this study is, therefore, that vmPFC dopamine exerted a bidirectional effect on instrumental performance depending on the current reinforcer value. This observation is significant for 2 reasons. First, the fact that identical infusions both reduced and enhanced responding argues against an interpretation based upon a nonspecific disruption of vmPFC function by dopamine. Second, a bidirectional modulation of responding, appropriate to the current value of the reinforcer, suggests that dopamine likely contributes to the generation of adaptive behavior by the vmPFC (Price 2005). It is possible that the increased responding in valued animals following vmPFC dopamine infusion may reflect the normal level of responding expected of animals not responding habitually. That is, the higher levels of responding observed in dopamine-infused, relative to vehicle-infused, valued animals may be attributable to differences between action and habit responders, respectively. Dysregulation of vmPFC dopamine function may contribute to maladaptive states, where behavior is dissociated from its consequences.

Attentional Modulation within the vmPFC

The vmPFC has been proposed to generate cognitive–emotional response “sets” (Price 1999; Barbas 2000; Wall and Messier 2001; Critchley 2005). Specifically, it has been demonstrated that vmPFC integrity is necessary for the generation of autonomic responses that guide effective decision making (Bechara et al. 1996; but see also Balleine 2005). Moreover, the vmPFC has been shown to gate input of biologically significant information, requiring the redirection of attention, into working memory (Botvinick et al. 2001; Wall and Messier 2001; Corbetta and Shulman 2002; Ullsperger and von Cramon 2004) while filtering out information extraneous to the ongoing task (Fuster 1997). Notably, the labeling of information as biologically significant predominantly occurs “upstream” of the vmPFC in regions such as the amygdala, hippocampus and the lateral subregion of orbitofrontal cortex (OFC) (Schoenbaum et al. 1998; Wall and Messier 2001). Thus, the vmPFC itself does not appear to modulate the biological significance/value of events but rather controls which events are attended to and acted upon. This view is consistent with the effects in humans of vmPFC lesions, and in certain clinical populations such as drug addicts, who repeatedly experience the detrimental consequences of their behavior yet are unable to modify it (Bechara 2005; Ersche et al. 2005).

Our data accord with the view that vmPFC function determines the sensitivity of behavior to its consequences and extend it by implicating dopamine. Dopamine infusions had no effect on consumption of the reinforcer itself, indicating that dopamine did not alter sensitivity to the reinforcer per se. We believe that vmPFC dopamine enhanced the processing of task-relevant information and suppressed the processing of task-irrelevant information. That such a view implicates dopamine in the modulation of a known function of the vmPFC (Fuster 1997; Wall and Messier 2001; Corbetta and Shulman 2002) lends additional credence to this hypothesis, which will be evaluated in future studies.

Attentional Modulation by Dopamine: Potential Mechanism of Action

The region of the vmPFC targeted in the present study has been implicated in “supervisory attentional functions,” such as extradimensional set shifting (Birrell and Brown 2000; Dalley et al. 2004). Recent evidence demonstrates that performance in such tasks is facilitated by vmPFC dopamine (Fletcher et al. 2005; Floresco et al. 2006). This dopaminergic facilitation of cognitive flexibility has been argued to promote behavioral adaptation in both appetitive and aversive situations (Hollerman et al. 2000; Sullivan 2004). Our findings add to this role of vmPFC dopamine in 2 new ways. First, we have extended the role of vmPFC dopamine to include the redirection of attention necessary for the shift in instrumental control from habit to action that occurs when a prepotent response produces a detrimental outcome. Second, our data indicate that mPFC control of behavioral flexibility is anatomically dissociable (Dalley et al. 2004): only dopamine infusions into the vmPFC, and not the dorsal mPFC, were effective in producing bidirectional modulation of instrumental performance.

Summary and Implications

The insensitivity of instrumental behavior to its consequences is not necessarily maladaptive, whereas a failure to restore this sensitivity can be. An organism without such flexibility is left vulnerable to the detrimental consequences of its behavior. The associative basis for this deficit is a failure to shift responding from habit to action, the anatomical locus of which has been hypothesized to be the vmPFC (Balleine and Dickinson 1998). Our data confirm this view and, furthermore, provide evidence that optimal dopamine function is necessary for adaptive, prospective (i.e., goal-directed) control of behavior. In this respect, it is notable that cortical dysfunction is common to several psychiatric disorders characterized by stimulus-elicited, compulsive behaviors lacking sensitivity to their detrimental consequences (Toates 1998; Volkow et al. 2004). Perhaps the most obvious example is the inability of addicts to modify the instrumental act of drug taking despite knowledge of the negative consequences of this behavior (Bechara 2005; Everitt and Robbins 2005). Whereas previous models have emphasized the role of PFC dopamine in the inhibition of subcortical substrates of reward-related learning (Jentsch and Taylor 1999; Jentsch et al. 2000), our data suggest that dysfunction of cortical dopamine transmission would result in an attentional deficit whereby biologically significant information fails to functionally interrupt striatal-dependent, prepotent stimulus–response habits (Yin and Knowlton 2006).

The authors thank Peter Holland for his helpful comments on an earlier draft of the manuscript. This research was supported by grant DA11717 from the National Institutes of Health (NIH) and by the Tourette's Syndrome Association. JJQ was supported by NIH 5T32MH014276 to Ronald Duman. Conflict of Interest: None declared.

References

Arnsten AF, Li BM. Neurobiology of executive functions: catecholamine influences on prefrontal cortical functions. Biol Psychiatry. 2005, vol. 57 (pg. 1377-1384).
Balleine BW. Neural bases of food-seeking: affect, arousal and reward in corticostriatolimbic circuits. Physiol Behav. 2005, vol. 86 (pg. 717-730).
Balleine BW, Dickinson A. Goal-directed instrumental action: contingency and incentive learning and their cortical substrates. Neuropharmacology. 1998, vol. 37 (pg. 407-419).
Barbas H. Connections underlying the synthesis of cognition, memory, and emotion in primate prefrontal cortices. Brain Res Bull. 2000, vol. 52 (pg. 319-330).
Bechara A. Decision making, impulse control and loss of willpower to resist drugs: a neurocognitive perspective. Nat Neurosci. 2005, vol. 8 (pg. 1458-1463).
Bechara A, Damasio AR, Damasio H, Anderson SW. Insensitivity to future consequences following damage to human prefrontal cortex. Cognition. 1994, vol. 50 (pg. 7-15).
Bechara A, Tranel D, Damasio H, Damasio AR. Failure to respond autonomically to anticipated future outcomes following damage to prefrontal cortex. Cereb Cortex. 1996, vol. 6 (pg. 215-225).
Birrell JM, Brown VJ. Medial frontal cortex mediates perceptual attentional set shifting in the rat. J Neurosci. 2000, vol. 20 (pg. 4320-4324).
Botvinick MM, Braver TS, Barch DM, Carter CS, Cohen JD. Conflict monitoring and cognitive control. Psychol Rev. 2001, vol. 108 (pg. 624-652).
Clark L, Cools R, Robbins TW. The neuropsychology of ventral prefrontal cortex: decision-making and reversal learning. Brain Cogn. 2004, vol. 55 (pg. 41-53).
Colwill RM. Associative representations of instrumental contingencies. In: Medin D, editor. The psychology of learning and motivation: advances in research and theory. 1994, Vol. 31. San Diego, CA: Academic Press (pg. 1-72).
Corbetta M, Shulman GL. Control of goal-directed and stimulus-driven attention in the brain. Nat Rev Neurosci. 2002, vol. 3 (pg. 201-215).
Coutureau E, Killcross S. Inactivation of the infralimbic prefrontal cortex reinstates goal-directed responding in overtrained rats. Behav Brain Res. 2003, vol. 146 (pg. 167-174).
Critchley HD. Neural mechanisms of autonomic, affective, and cognitive integration. J Comp Neurol. 2005, vol. 493 (pg. 154-166).
Dalley JW, Cardinal RN, Robbins TW. Prefrontal executive and cognitive functions in rodents: neural and neurochemical substrates. Neurosci Biobehav Rev. 2004, vol. 28 (pg. 771-784).
Dickinson A. Actions and habits: the development of behavioural autonomy. Philos Trans R Soc Lond Ser B Biol Sci. 1985, vol. 308 (pg. 67-78).
Dickinson A, Wood N, Smith JW. Alcohol seeking by rats: action or habit? Q J Exp Psychol. 2002, vol. 55B (pg. 331-348).
Ersche KD, Fletcher PC, Lewis SJ, Clark L, Stocks-Gee G, London M, Deakin JB, Robbins TW, Sahakian BJ. Abnormal frontal activations related to decision-making in current and former amphetamine and opiate dependent individuals. Psychopharmacology (Berl). 2005, vol. 180 (pg. 612-623).
Everitt BJ, Robbins TW. Neural systems of reinforcement for drug addiction: from actions to habits to compulsion. Nat Neurosci. 2005, vol. 8 (pg. 1481-1489).
Faure A, Haberland U, Conde F, Massioui NE. Lesion to the nigrostriatal dopamine system disrupts stimulus-response habit formation. J Neurosci. 2005, vol. 25 (pg. 2771-2780).
Fletcher PJ, Tenn CC, Rizos Z, Lovic V, Kapur S. Sensitization to amphetamine, but not PCP, impairs attentional set shifting: reversal by a D1 receptor agonist injected into the medial prefrontal cortex. Psychopharmacology (Berl). 2005, vol. 183 (pg. 190-200).
Floresco SB, Magyar O, Ghods-Sharifi S, Vexelman C, Tse MT. Multiple dopamine receptor subtypes in the medial prefrontal cortex of the rat regulate set-shifting. Neuropsychopharmacology. 2006, vol. 31 (pg. 297-309).
Fuster JM. Network memory. Trends Neurosci. 1997, vol. 20 (pg. 451-459).
Gehring WJ, Knight RT. Prefrontal-cingulate interactions in action monitoring. Nat Neurosci. 2000, vol. 3 (pg. 516-520).
Hitchcott P, Anderson G, Lombroso P, Taylor JR. Prefrontal cortical modulation of striatal instrumental habit learning. 2005. Soc Neurosci Abstr No. 997.918. 31
Hollerman JR, Tremblay L, Schultz W. Involvement of basal ganglia and orbitofrontal cortex in goal-directed behavior. Prog Brain Res. 2000, vol. 126 (pg. 193-215).
Jentsch JD, Roth RH, Taylor JR. Role for dopamine in the behavioral functions of the prefrontal corticostriatal system: implications for mental disorders and psychotropic drug action. Prog Brain Res. 2000, vol. 126 (pg. 433-453).
Jentsch JD, Taylor JR. Impulsivity resulting from frontostriatal dysfunction in drug abuse: implications for the control of behavior by reward-related stimuli. Psychopharmacology (Berl). 1999, vol. 146 (pg. 373-390).
Killcross S, Coutureau E. Coordination of actions and habits in the medial prefrontal cortex of rats. Cereb Cortex. 2003, vol. 13 (pg. 400-408).
Lhermitte F. Human autonomy and the frontal lobes. Part II: patient behavior in complex and social situations: the “environmental dependency syndrome”. Ann Neurol. 1986, vol. 19 (pg. 335-343).
Ljungberg T, Apicella P, Schultz W. Responses of monkey dopamine neurons during learning of behavioral reactions. J Neurophysiol. 1992, vol. 67 (pg. 145-163).
Miles FJ, Everitt BJ, Dickinson A. Oral cocaine seeking by rats: action or habit? Behav Neurosci. 2003, vol. 117 (pg. 927-938).
Miller EK, Cohen JD. An integrative theory of prefrontal cortex function. Annu Rev Neurosci. 2001, vol. 24 (pg. 167-202).
Montague PR, Berns GS. Neural economics and the biological substrates of valuation. Neuron. 2002, vol. 36 (pg. 265-284).
Ostlund SB, Balleine BW. Lesions of medial prefrontal cortex disrupt the acquisition but not the expression of goal-directed learning. J Neurosci. 2005, vol. 25 (pg. 7763-7770).
Paxinos G, Watson C. The rat brain in stereotaxic coordinates. 2005. 5th ed. San Diego (CA): Academic Press.
Price JL. Prefrontal cortical networks related to visceral function and mood. Ann N Y Acad Sci. 1999, vol. 877 (pg. 383-396).
Price JL. Free will versus survival: brain systems that underlie intrinsic constraints on behavior. J Comp Neurol. 2005, vol. 493 (pg. 132-139).
Quinn J, Hitchcott PK, Arnold AP, Taylor JR. Chromosomal sex determines habit formation: relevance to addiction. 2006. Soc Neurosci Abstr No. 483.4. 32
Robbins TW. Chemistry of the mind: neurochemical modulation of prefrontal cortical function. J Comp Neurol. 2005, vol. 493 (pg. 140-146).
Schoenbaum G, Chiba AA, Gallagher M. Orbitofrontal cortex and basolateral amygdala encode expected outcomes during learning. Nat Neurosci. 1998, vol. 1 (pg. 155-159).
Smiley JF, Williams SM, Szigeti K, Goldman-Rakic PS. Light and electron microscopic characterization of dopamine-immunoreactive axons in human cerebral cortex. J Comp Neurol. 1992, vol. 321 (pg. 325-335).
Sullivan RM. Hemispheric asymmetry in stress processing in rat prefrontal cortex and the role of mesocortical dopamine. Stress. 2004, vol. 7 (pg. 131-143).
Toates F. The interaction of cognitive and stimulus-response processes in the control of behaviour. Neurosci Biobehav Rev. 1998, vol. 22 (pg. 59-83).
Ullsperger M, von Cramon DY. Neuroimaging of performance monitoring: error detection and beyond. Cortex. 2004, vol. 40 (pg. 593-604).
Van Eden CG, Hoorneman EM, Buijs RM, Matthijssen MA, Geffard M, Uylings HB. Immunocytochemical localization of dopamine in the prefrontal cortex of the rat at the light and electron microscopical level. Neuroscience. 1987, vol. 22 (pg. 849-862).
Volkow ND, Fowler JS, Wang GJ. The addicted human brain viewed in the light of imaging studies: brain circuits and treatment strategies. Neuropharmacology. 2004, vol. 47 Suppl 1 (pg. 3-13).
Wall PM, Messier C. The hippocampal formation—orbitomedial prefrontal cortex circuit in the attentional control of active memory. Behav Brain Res. 2001, vol. 127 (pg. 99-117).
Yin HH, Knowlton BJ. The role of the basal ganglia in habit formation. Nat Rev Neurosci. 2006, vol. 7 (pg. 464-476).
This is an Open Access article distributed under the terms of the Creative Commons Attribution Non-Commercial License (http://creativecommons.org/licenses/by-nc/2.0/uk/) which permits unrestricted non-commercial use, distribution, and reproduction in any medium, provided the original work is properly cited.