Abstract

Learning to predict rewarding and aversive outcomes is based on the comparison between predicted and actual outcomes (prediction error: PE). Recent electrophysiological studies reported that, during a Pavlovian procedure, some dopamine neurons code a classical PE signal while a larger population of dopaminergic neurons reflects a “salient” prediction error (SPE) signal, being excited both by unpredictable aversive events and by rewards. Yet, it is still unclear whether specific human brain structures receiving afferents from dopaminergic neurons code an SPE and whether this signal depends upon reinforcer type. Here, we used a model-based functional magnetic resonance imaging approach implementing a reinforcement learning model to compute the PE while subjects underwent a Pavlovian conditioning procedure with 2 types of rewards (pleasant juice and monetary gain) and 2 types of punishments (aversive juice and aversive picture). The results revealed that activity of a brain network composed of the striatum, anterior insula, and anterior cingulate cortex covaried with an SPE for appetitive and aversive juice. Moreover, amygdala activity correlated with an SPE for these 2 reinforcers and for aversive pictures. These results provide insights into the neurobiological mechanisms underlying the ability to learn stimulus-reward and stimulus-punishment contingencies, by demonstrating that the network reflecting the SPE depends upon reinforcer type.

Introduction

Rewards and punishments have opposite hedonic valences, but both are motivationally salient events. A fundamental question is whether motivational salience and hedonic valence, 2 distinct but closely related attributes of reward and punishment, are separately encoded by the brain. Animals, including humans, learn to associate various stimuli with different types of rewards and punishments. Central to such behavior is the capacity to compute the discrepancy between the prediction and the actual outcome (prediction error: PE). Based on a wealth of evidence from electrophysiological recording studies in nonhuman primates, rodents, and humans, it has been widely assumed that dopaminergic neurons encode a reward PE (RPE), with a positive phasic response when the outcome is better than expected (unexpected reward or omission of expected punishment) and a negative response when it is worse than expected (unexpected punishment or omission of expected reward) (Schultz 1998; Bayer and Glimcher 2005; Pan et al. 2005; Roesch et al. 2007; Zaghloul et al. 2009). According to this hypothesis, referred to as the RPE hypothesis, the sign of the PE is opposite for rewards and punishments.

However, in awake monkeys, recent recordings from the same dopaminergic neurons for rewards and aversive events point to the coexistence of a phasic dopaminergic signal encoding biologically salient events conveying both positive and negative information (Matsumoto and Hikosaka 2009b). During a Pavlovian procedure, one class of dopaminergic neurons located ventromedially, some in the ventral tegmental area (VTA), is excited by unexpected rewards and inhibited by unexpected aversive stimuli, as expected under the RPE hypothesis. Yet, a larger subpopulation of dopamine neurons, located more dorsolaterally in the substantia nigra pars compacta, is excited both by unpredictable reward and by aversive stimuli, as predicted by a salient PE (SPE) hypothesis. Moreover, recent results in rodents confirm that, while some dopaminergic neurons of the VTA are inhibited by aversive stimuli, others are excited by these same stimuli (Brischoux et al. 2009). These findings suggest that different groups of dopamine neurons convey RPE and SPE signals, shedding light on the increased striatal dopamine levels observed not only during appetitive conditioning (Reynolds et al. 2001) but also during aversive conditioning (Pezze and Feldon 2004; Young 2004). Together, these results raise the possibility of the coexistence of 2 brain networks active during the learning of associations between cues and rewards or punishments: a reward brain network, treating reward and punishment in opposite ways (opposite hedonic valences), and a salient brain network, which treats them in a similar manner, as motivationally salient events.

To date, most human functional magnetic resonance imaging (fMRI) studies using computational reinforcement learning models have investigated which brain systems show responses consistent with the RPE when learning associations between conditioned stimuli and different types of rewards, such as juice (Berns et al. 2001; McClure et al. 2003; O'Doherty et al. 2004), odor (Gottfried et al. 2003), money (Tanaka et al. 2004; Abler et al. 2006; Pessiglione et al. 2006), and attractive faces (Bray and O'Doherty 2007). Paralleling these investigations, a number of human fMRI studies also investigated the cerebral substrates of aversive conditioning using a variety of punishments, such as painful stimuli, aversive juices, odors, tones, or visual stimuli (LaBar et al. 1998; Buchel et al. 1999; Gottfried, Deichmann, et al. 2002; Seymour et al. 2004; Knight et al. 2010; Sarinopoulos et al. 2010) or monetary losses (Delgado et al. 2000, 2008; Knutson et al. 2000; Nieuwenhuis et al. 2005). However, these fMRI studies either investigated RPE or PE related to aversive stimuli separately, in designs using only positive or only aversive events, or did not vary the type of reinforcer (e.g., only used monetary gains and losses). Thus, it is still unclear whether the regions with response profiles consistent with the SPE and RPE signals depend upon the reinforcer type.

Only a task design simultaneously manipulating both positive and negative outcomes, as well as reinforcer types, would allow us to directly investigate whether separate or overlapping targets of dopaminergic neurons are consistent with an RPE or with an SPE signal. Using a computational reinforcement learning approach in a Pavlovian conditioning procedure combining appetitive and aversive juice, money, and aversive pictures, we investigated whether the RPE and SPE signals depend upon reinforcer type. We chose to investigate the cerebral representations of PE and SPE related to 2 appetitive cues, one immediate and gustatory (apple juice), the other visual and delayed (money), and 2 aversive cues, both indicating an immediate aversive outcome (aversive picture and salty water).

Brain regions in which the activity reflects an RPE signal should represent reward and punishment in opposite fashion, responding with a positive PE when the outcome is better than expected (unexpected reward or omission of expected punishment) and a negative PE when it is worse than expected (unexpected punishment or omission of expected reward). In contrast, brain regions with response profiles consistent with an SPE signal should treat both rewards and punishments in a similar fashion, as salient outcomes, with a positive response when both of these salient events are delivered and a negative response for their omission (Fig. 1).

Figure 1.

Experimental design and computational model. (A) Subjects learned to associate various cues with 4 different types of reinforcers (2 appetitive and 2 aversive) in a classical reinforcement learning paradigm. Two types of cues were followed by positive reinforcers (apple juice and money) on 50% of occasions or by a scrambled picture (unreinforced), 2 other types of cues were followed by negative reinforcers (salty water and aversive picture) on 50% of occasions or by a scrambled picture (unreinforced), while some cues were always followed by a scrambled picture (neutral condition). (B) Time course of a single trial. After the cue presentation, subjects pressed a response button (<1 s), immediately followed by a delay period (fixation cross) and by the reinforcer or a scrambled picture. (C) Top: salient computational model, predicted neural response. Schematic showing the mean representation of the SPE signal, which responds to reward and punishment in the same way, as motivationally salient events, generating positive PEs for reinforced trials and negative PEs for unreinforced trials. Bottom: reward computational model, predicted neural response. The RPE model signals rewards and punishments in opposite ways, generating a positive PE when an unexpected reward is delivered or an expected punishment is omitted and a negative PE when an unexpected punishment is delivered or an expected reward is omitted (Unreinf., Unreinforced; Reinf., Reinforced).

Materials and Methods

Subjects

Twenty healthy subjects (10 females) with no history of neurological or psychiatric illness participated in the experiment (mean age: 24.4 years; range: 18–33). All subjects were right-handed as assessed by the Edinburgh Handedness Inventory (Oldfield 1971). Subjects were drink deprived for 12 h prior to scanning to ensure that they remained thirsty during the experiment, and they drank only 2.4 mL of liquid over the experiment to maintain this state of deprivation. The study was approved by the local ethics committee (Lyon) and all subjects gave informed consent.

Paradigm

The task used a Pavlovian conditioning procedure in which affectively neutral visual cues were paired with 4 types of reinforcers: apple juice, salty water, money, or aversive picture (Fig. 1A). Each trial was divided into 2 phases: anticipation and reception. The anticipation phase began with a cue (geometric form) displayed until the subject pressed a response button (within 1 s). This cue was followed by a 6-s delay period displaying a fixation cross. Then, in the reception phase, either the corresponding reinforcer or a scrambled picture was presented for 1.5 s, on a 50% reinforcement schedule. Each reinforced trial was followed by real consequences: 1) in the apple juice condition, 0.05 mL of apple juice was delivered into the subject's mouth while a picture representing a glass of apple juice was presented; 2) in the salty water condition, 0.05 mL of salty water was delivered while a picture representing a brown glass of water was presented; 3) in the aversive picture condition, an aversive picture was presented; 4) in the monetary reward condition, subjects were presented with a picture of a 20-Euro bill and were informed that they would earn a percentage of each of these bills at the end of the experiment. A blank screen was finally used as an intertrial interval of variable duration (2.5–5.5 s) (Fig. 1B). In a fifth condition (neutral condition), a cue announced the neutral scrambled picture with certainty. To maintain the subjects’ attention, they were asked to press a response button as soon as they saw the cue. Subjects were explicitly informed that the delivery of the reinforcer was independent of their responses and knew that the cue would disappear after 1 s, even if they did not make a key press.

Stimuli and Reinforcers

Visual stimuli were back projected on a screen located at the head of the scanner bed and presented to the subjects through an adjustable mirror located above their head. We investigated the neural representations of PE and SPE related to 2 appetitive cues (apple juice and money) and 2 aversive cues (aversive picture and salty water). The presentation of the stimuli as well as the juice delivery were controlled by Presentation software (Neurobehavioral Systems), which also recorded trigger pulses from the scanner signaling the beginning of each volume acquisition.

We used 2 liquids of opposite valence: appetitive apple juice and aversive salty water (0.2 M NaCl). They were contained in two 60-mL syringes connected to an IVAC P7000 electronic pump positioned in the scanner control room. For each reinforced trial in the taste conditions, 0.05 mL of liquid was delivered into the subject's mouth via one of 2 separate 6-m-long, 1-mm-wide polyethylene tubes. This small amount of liquid was chosen to minimize any satiety effect that could occur during the experiment. To reduce head movement related to swallowing, subjects were instructed to swallow only during the intertrial interval, after the reinforcer offset and before the next trial onset.

For the monetary reward trials, a picture of a 20-Euro bill was presented at the center of the screen and subjects were told that they would earn a percentage of this amount after the experiment. Subjects were not told the exact percentage, to prevent them from counting during the experiment. By the end of the experiment, each subject had seen 24 bills and earned 20€, in addition to the 50€ earned for being scanned. Thus, subjects were paid a fixed amount of 70€ for their participation but did not know this amount before the end of the experiment. Finally, in the aversive picture condition, reinforced trials consisted of a highly aversive picture from the International Affective Picture System (Lang et al. 2005) showing a mutilated face (picture no. 3060: valence = 1.79 ± 1.56; arousal = 7.12 ± 2.09). All unreinforced trials ended with the presentation of the same scrambled neutral picture.

There were intrinsic differences between the 4 reinforcers due to their specific nature, since they had distinct valences (appetitive/aversive), modalities (visual only/visual + gustatory), and types (primary/secondary). Each reinforcer can be characterized as follows: apple juice: appetitive, gustatory + visual, immediate (primary); monetary reward: appetitive, visual, delayed (secondary); salty water: aversive, gustatory + visual, immediate (primary); aversive picture: aversive, visual, immediate (primary). One should thus keep in mind that the 4 conditions differed by more than one factor; however, this is not a major problem since we did not perform one-to-one comparisons between individual reinforcers.

Note that we did not include monetary losses in the experiment because monetary gains and losses are known to be weighted differently (Tom et al. 2007) and because 1) it can be questioned whether monetary loss acts as a primary punishment in the way aversive liquids or aversive pictures do; and 2) monetary losses are the removal of a valued appetitive stimulus (type II punishment), whereas physical punishments (type I punishment) are the administration of an aversive stimulus (Skinner 1938).

In fact, monetary losses and physical punishments may be coded in opposite directions during aversive conditioning, unexpected monetary losses leading to a decrease in striatal blood oxygen level–dependent (BOLD) response (Delgado et al. 2000; Yacubian et al. 2006) while unexpected painful electrical stimulation may lead to an increase in striatal BOLD response (Seymour et al. 2004; Menon et al. 2007). Since we did not include monetary losses, it should be noted that our design is not symmetric. However, if we had chosen to use only money (gains and losses) and juice (appetitive and aversive), we could not have made generalizations about PE and SPE coding for primary reinforcers because we would have included only one type of primary reward/punishment (juice).

Experimental Design

The experiment consisted of 3 scanning runs of 15 min, separated by short breaks during which the echo-planar image (EPI) sequence was stopped, allowing the subject to rest. In each run, the 5 conditions were presented in blocks of 16 successive trials. Each run was pseudorandomly ordered according to a Latin square design, so that each of the 5 conditions appeared exactly once per run and occupied a different serial position across runs (e.g., run I: 12345, run II: 43521, run III: 51432, where 1, 2, 3, 4, 5 correspond to the conditions). The order of the runs was also counterbalanced across subjects. In each run, a new cue was used for each condition and subjects had to learn the probabilistic association between this cue and the corresponding reinforcer. Trials from the different conditions were not intermixed, to avoid relative comparison between the values of the different reinforcers. Indeed, a large body of literature reports context-dependent activity in different components of the reward system (Tremblay and Schultz 1999; Nieuwenhuis et al. 2005). This design allowed us to test the SPE hypothesis versus the RPE hypothesis (Fig. 1C), while distinguishing between the modality (gustatory or visual) and the nature (primary/immediate or secondary/delayed) of the reinforcers.
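For illustration, the sketch below (Python, with hypothetical condition labels) generates run orders from a cyclic Latin square so that each condition appears exactly once per run and occupies a different serial position in every run; the exact randomization scheme used in the original experiment may have differed.

```python
import random

CONDITIONS = ["apple_juice", "money", "salty_water", "aversive_picture", "neutral"]

def latin_square_runs(conditions, n_runs=3, seed=None):
    """Pseudorandom run orders based on a cyclic Latin square: each condition
    appears once per run and sits at a different serial position in every run.
    This is only a sketch of one possible counterbalancing scheme."""
    rng = random.Random(seed)
    n = len(conditions)
    # Row r of a cyclic Latin square is the condition list rotated by r positions.
    square = [[conditions[(i + r) % n] for i in range(n)] for r in range(n)]
    rng.shuffle(square)            # choose which rows serve as runs, in random order
    return square[:n_runs]

if __name__ == "__main__":
    for run_idx, order in enumerate(latin_square_runs(CONDITIONS, seed=1), start=1):
        print(f"run {run_idx}: {order}")
```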

Behavioral Measures

Before the scanning session, subjects performed a two-alternative forced-choice preference task designed to investigate whether they preferred cues explicitly associated with positive reinforcers over cues associated with negative reinforcers. On each trial (72 trials total), 2 out of 4 possible cues explicitly indicating both the type of reinforcer and the chance of being reinforced (P = 0.5) were presented side by side, and subjects had to choose which one they preferred by a left or right key press. There were 6 different pairs of cues, repeated 12 times each, and the cues were randomly assigned to the left or right side of the screen. After the decision, the chosen reinforcer was effectively delivered with a probability of P = 0.5 (i.e., the cues predicting apple juice or salty water were followed in 50% of the trials by the simultaneous presentation of the corresponding glass and the delivery of 0.05 mL of liquid into the subject's mouth). For each reinforcer, we computed a preference score (percent chosen when available) as the number of times a cue was chosen divided by the number of times it was presented.

During scanning, to assess whether the probabilities of the cue-reinforcer association were explicitly learnt, subjects were asked, at the end of each block, to rate the probability that each cue was associated with the reinforcer. Such rating was done by positioning a cursor on a continuous scale from 0 to 1, representing an estimate of the probability that a cue was paired with a specific reinforcer.

Finally, to assess the value of each reinforcer after the scanning session, subjects were also asked to provide pleasantness ratings for each reinforcer on a scale ranging from −2 (very unpleasant) to 2 (very pleasant). Subjects were also asked to rate their thirst on a scale ranging from 1 (not thirsty at all) to 5 (extremely thirsty), both before and after scanning. The a priori significance level was defined at P < 0.05 for all the behavioral tests.

fMRI Data Acquisition and Preprocessing

fMRI data were acquired on a 1.5 Tesla Siemens MRI scanner. BOLD signal was measured with gradient echo T2*-weighted EPIs. Twenty-six interleaved slices parallel to the AC-PC line were acquired per volume (matrix 64 × 64, voxel size = 3.4 × 3.4 × 4 mm). In total, 410 volumes were acquired continuously every 2.5 s for each of the 3 runs. The first 4 volumes of each run were discarded to allow the BOLD signal to reach a steady state. A T1-weighted structural image (1 × 1 × 1 mm) was also acquired for each subject at the end of the experiment.

Data were preprocessed using the SPM5 software package. First, outlier scans (>1.5% variation in global intensity or >0.5 mm/time repetition scan-to-scan motion) were detected using the ArtRepair SPM toolbox (http://spnl.stanford.edu/tools/ArtRepair/ArtRepair.htm) (Mazaika et al. 2009). Since less than 5% of outlier scans were detected per subject, no repair was performed. Then, images were corrected for slice timing and spatially realigned to the first image of the first run. They were normalized to SPM5’s EPI template in Montreal Neurological Institute (MNI) space with a resampled voxel size of 3 × 3 × 3 mm and spatially smoothed using a Gaussian kernel with full-width at half-maximum of 8 mm. The T1-weighted structural scan of each subject was normalized to a standard T1 template in MNI space with a resampled voxel size of 1 × 1 × 1 mm.

Computational Model

We computed the value of the PE for each subject according to the sequence of stimuli they received, providing a statistical regressor for the fMRI data. The use of a probabilistic reinforcement strategy, in which the cues are only 50% predictive of their outcomes, ensures constant learning and updating of predictions and generates both positive and negative PE throughout the course of the experiment, therefore maximizing the variability of PE over each run.

Predicted values and PE values were calculated trial-by-trial by using a Rescorla–Wagner rule (Rescorla and Wagner 1972). For each trial t, a PE δ(t) was computed as the difference between the actual outcome value R(t) and its predicted value V(t) on that trial (eq. 1):

 
δ(t) = R(t) − V(t)                (1)

Then, the predicted value for the next trial, V(t + 1), was updated by adding the PE δ(t) weighted by a learning rate α (eq. 2):

 
V(t + 1) = V(t) + α δ(t)                (2)

The outcome value R(t) was set to 1 when a reinforcer (either a reward or a punishment) was delivered and to 0 when a scrambled picture was delivered. V(t) was initialized to 0. The learning rate α was derived from subjects’ response times (RTs) to the cue. RTs have been shown to be good indicators of conditioning (Critchley et al. 2002; Gottfried et al. 2003) and to correlate with the prediction V(t) estimated by a reinforcement learning model (Seymour et al. 2004). Several recent fMRI studies have used RTs to estimate the learning rate of reinforcement learning models during tasks in which button presses were irrelevant to reward delivery (Seymour et al. 2005; Bray and O'Doherty 2007).
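A minimal sketch of this trial-by-trial computation is given below (Python; not the original implementation). It assumes the binary outcome coding described above, with R(t) = 1 for any delivered reinforcer and 0 for the scrambled picture, and uses by default the best-fitting learning rate reported below; the outcome sequence in the example is purely illustrative.

```python
import numpy as np

def rescorla_wagner(outcomes, alpha=0.24, v0=0.0):
    """Trial-by-trial Rescorla-Wagner predictions and prediction errors.

    outcomes : sequence of R(t), coded 1 when a reinforcer (reward or
               punishment) is delivered and 0 for the scrambled picture
    alpha    : learning rate (0.24 is the best-fitting value reported below)
    v0       : initial predicted value
    Returns (V, PE), one value per trial.
    """
    outcomes = np.asarray(outcomes, dtype=float)
    V = np.empty(outcomes.size + 1)
    V[0] = v0
    PE = np.empty(outcomes.size)
    for t, R in enumerate(outcomes):
        PE[t] = R - V[t]                   # eq. (1): delta(t) = R(t) - V(t)
        V[t + 1] = V[t] + alpha * PE[t]    # eq. (2): value update
    return V[:-1], PE

# Illustrative 50%-reinforced block of 16 trials (hypothetical sequence)
V, PE = rescorla_wagner([1, 0, 1, 1, 0, 0, 1, 0, 1, 0, 0, 1, 1, 0, 1, 0])
```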

First, RTs were normalized to allow analysis across subjects. We derived the prediction V(t) for each subject based on their individual conditioning histories for a range of learning rates (ranging from 0.01 to 0.5). Then, trial-by-trial RTs across subjects were fitted to a regression model that included the prediction V(t). The best fit yielded a learning rate of 0.24, which is close to the value used in other studies (O'Doherty, Dayan, et al. 2003; Seymour et al. 2005; Jensen et al. 2007). Note, however, that the qualitative behavior of the model is robust to a range of such parameters (between 0.1 and 0.5).
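The grid search over learning rates can be sketched as follows. The normalization of RTs and the exact regression model are assumptions (here, z-scoring within subject and a simple pooled Pearson correlation, a simplification of the regression fit described above); `rescorla_wagner` refers to the function sketched earlier.

```python
import numpy as np
from scipy import stats

def fit_learning_rate(rts_by_subject, outcomes_by_subject,
                      alphas=np.arange(0.01, 0.51, 0.01)):
    """Grid search for the learning rate whose predictions V(t) best explain RTs.

    rts_by_subject      : list of per-subject RT arrays (one RT per trial),
                          z-scored within subject before pooling (assumption)
    outcomes_by_subject : matching list of 0/1 outcome sequences
    Selection criterion : largest absolute pooled correlation between V(t)
                          and RTs (a simplified stand-in for the regression fit).
    """
    best_alpha, best_score = None, -np.inf
    for alpha in alphas:
        v_all, rt_all = [], []
        for rts, outcomes in zip(rts_by_subject, outcomes_by_subject):
            V, _ = rescorla_wagner(outcomes, alpha=alpha)   # sketched above
            v_all.append(V)
            rt_all.append(stats.zscore(np.asarray(rts, dtype=float)))
        r, _ = stats.pearsonr(np.concatenate(v_all), np.concatenate(rt_all))
        if abs(r) > best_score:
            best_alpha, best_score = alpha, abs(r)
    return best_alpha
```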

fMRI Data Analysis

First, statistical analysis was performed using the general linear model. For each of the 5 conditions (apple juice, monetary reward, salty water, aversive picture, and neutral), 2 phases (anticipation and reception) were modeled, resulting in 10 regressors. The anticipation phase was modeled as an epoch, time locked to the onset of the cue, with a duration equal to the RT plus the anticipatory period (6 s). The reception phase was modeled as a boxcar of 1.5-s duration. For each reinforced condition (i.e., all conditions except the neutral condition), the predicted values V(t) and PEs δ(t) generated by the Rescorla–Wagner model were used as parametric modulators of the anticipation and reception regressors, respectively. All of these 18 regressors were convolved with a canonical hemodynamic response function. In addition, the 6 motion parameters estimated during realignment were included as regressors of no interest.
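The construction of a PE-modulated reception regressor can be illustrated as below. This is a simplified stand-in for SPM5's internal machinery: the double-gamma HRF, the high-resolution sampling grid, and the mean-centering of the modulator are approximations, not the exact SPM implementation, and the example values are hypothetical.

```python
import numpy as np
from scipy.stats import gamma

def canonical_hrf(dt, duration=32.0):
    """Double-gamma approximation of the canonical HRF, sampled every dt seconds."""
    t = np.arange(0.0, duration, dt)
    hrf = gamma.pdf(t, 6) - gamma.pdf(t, 16) / 6.0   # peak ~5-6 s, undershoot ~15-16 s
    return hrf / hrf.sum()

def pe_modulated_regressor(onsets, durations, pe_values, n_scans, tr=2.5, dt=0.1):
    """Reception-phase regressor parametrically modulated by the mean-centered PE.

    onsets, durations : event onsets and durations in seconds
    pe_values         : one model-derived PE per event
    Returns one value per scan, ready to enter a GLM design matrix.
    """
    pe = np.asarray(pe_values, dtype=float)
    pe = pe - pe.mean()                              # mean-center the modulator
    hires = np.zeros(int(round(n_scans * tr / dt)))
    for onset, dur, p in zip(onsets, durations, pe):
        i0 = int(round(onset / dt))
        i1 = int(round((onset + dur) / dt))
        hires[i0:i1] += p                            # boxcar scaled by the PE
    conv = np.convolve(hires, canonical_hrf(dt))[: hires.size]
    return conv[:: int(round(tr / dt))][:n_scans]    # downsample to the scan grid

# Example: three 1.5-s reception events with hypothetical PE values
reg = pe_modulated_regressor(onsets=[10.0, 40.0, 70.0], durations=[1.5, 1.5, 1.5],
                             pe_values=[0.8, -0.4, 0.5], n_scans=120)
```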

Second, we calculated first-level single-subject contrasts at the time of reception for: 1) apple juice delivery and omission positively modulated by the PE (C1); 2) monetary reward delivery and omission positively modulated by the PE (C2); 3) aversive juice delivery and omission positively modulated by the PE (C3); 4) aversive picture delivery and omission positively modulated by the PE (C4); 5) aversive juice delivery and omission negatively modulated by the PE (C5); and 6) aversive picture delivery and omission negatively modulated by the PE (C6).

Third, we performed 2 second-level one-way flexible factorial designs (described below), each including a subject factor accounting for between-subject variability, in which we used a series of conjunctions testing the conjunction null hypothesis, as implemented in SPM5 (Nichols et al. 2005).

SPE Flexible Factorial Design

This flexible factorial design included the contrasts C1–C4 described above. This analysis was used to identify an “SPE brain network,” defined as a set of brain regions responding in the same way for positive and negative reinforcers: that is, showing high BOLD signal when an unexpected reward or punishment is delivered and low BOLD signal for its unexpected omission (Fig. 1C, top). First, we searched for a “global SPE brain network,” by performing a formal conjunction analysis of brain regions showing a positive correlation with the PE for all 4 reinforcers. Next, we investigated a “primary SPE brain network” by performing a conjunction analysis of the brain regions showing positive correlations with the PE for each of the primary reinforcers (apple juice, salty water, and aversive picture). Then, we investigated a “gustatory SPE brain network” by performing a conjunction analysis of the regions showing a positive correlation with the PE for the 2 gustatory conditions (apple juice and salty water). Finally, we searched for a “visual SPE brain network” by performing a conjunction analysis of the brain regions showing positive correlations with PE for the 2 visual conditions (money and aversive picture).
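Conceptually, each of these conjunction-null tests reduces to thresholding the voxel-wise minimum statistic across the contrasts entering the conjunction, as sketched below. The actual analyses were run in SPM5; the computation of the threshold itself (here, the FDR-corrected P < 0.05 level) is omitted, and the toy data in the example are hypothetical.

```python
import numpy as np

def conjunction_min_t(t_maps, threshold):
    """Minimum-statistic conjunction under the conjunction null (Nichols et al. 2005):
    a voxel is declared significant only if it exceeds the single-contrast
    threshold in every contrast entering the conjunction.

    t_maps    : array-like of shape (n_contrasts, n_voxels), e.g. the four
                PE-correlation t-maps C1-C4 for the global SPE conjunction
    threshold : single-contrast t threshold (not computed in this sketch)
    Returns the conjunction map and a boolean mask of significant voxels.
    """
    t_maps = np.asarray(t_maps, dtype=float)
    conj = t_maps.min(axis=0)          # the weakest contrast limits each voxel
    return conj, conj > threshold

# Toy example: 4 contrasts, 5 voxels of random t values
conj_map, significant = conjunction_min_t(np.random.randn(4, 5) + 2.0, threshold=3.0)
```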

RPE Flexible Factorial Design

This flexible factorial design included the contrasts C1, C2, C5, and C6. This analysis was used to identify an “RPE brain network,” defined as a set of brain regions responding to positive and negative reinforcers in opposite ways: that is, showing high BOLD signal when an unexpected reward is delivered and an unexpected punishment is omitted and low BOLD signal for an unexpected reward omission and unexpected punishment delivery (Fig. 1C, bottom). Again, we first tested for a “global RPE brain network” by performing a conjunction analysis across the brain regions showing a positive correlation with the PE for the appetitive conditions and the regions showing a negative correlation with PE for the aversive conditions. We also performed specific conjunction analyses to look for brain regions in which the RPE signal explains the BOLD signal for the primary reinforcers, the gustatory reinforcers and finally, the visual reinforcers.

Activations Localization and Reported Statistics

Anatomic labeling of activated regions was done using the SPM Anatomy toolbox (Eickhoff et al. 2005) and the probabilistic atlas of Hammers et al. (2003). Reported coordinates conform to the MNI space. Activations from whole brain analysis are reported with a threshold of P < 0.05 after false discovery rate (FDR) correction for multiple comparisons.

Region of Interest Analyses

For illustrative purposes, we extracted and plotted the time course of activity corresponding to the reinforced and unreinforced trials for each condition in several regions of interest (ROIs). The ROIs were created in 3 stages. First, we identified several brain regions that have established roles during aversive and appetitive conditioning: the striatum, the insula, the anterior cingulate cortex (ACC) and the amygdala (Jensen et al. 2003; Seymour et al. 2004; Bray and O'Doherty 2007; Herwig et al. 2007; Sescousse et al. 2010). Second, we built 7 anatomical ROIs defined with the probabilistic atlas of Hammers et al. (2003): the ACC ROI resulted from the union of left and right ACC cortex and the other anatomical ROIs were left/right insula, left/right amygdala, and left/right striatum. Third, we intersected each of these anatomical ROIs with the functional clusters revealed by our whole brain conjunction analyses. Time course extractions were conducted with MarsBaR toolbox for SPM (http://marsbar.sourceforge.net/). Moreover, because the substantia nigra (SN) is known to be a key region in processing rewarding and aversive events (Schultz 1998; Matsumoto and Hikosaka 2009b), we used the probabilistic brain atlas of Hammers et al. (2003) to build bilateral ROIs in this region.
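A simple stand-in for this ROI construction and time-course extraction (performed here with the MarsBaR toolbox) is sketched below using nibabel and NumPy. The file names are placeholders, and the masks are assumed to be binary images already resampled to the EPI grid.

```python
import numpy as np
import nibabel as nib

def roi_mean_timecourse(bold_path, anat_mask_path, func_cluster_path):
    """Mean BOLD time course within the intersection of an anatomical ROI and a
    functional cluster (both binary masks in the same space as the EPI data).
    File names are placeholders; the original analysis used MarsBaR for this step."""
    bold = nib.load(bold_path).get_fdata()            # 4D array: x, y, z, time
    anat = nib.load(anat_mask_path).get_fdata() > 0
    func = nib.load(func_cluster_path).get_fdata() > 0
    roi = np.logical_and(anat, func)                  # anatomical-by-functional ROI
    return bold[roi].mean(axis=0)                     # one mean value per scan

# e.g. (hypothetical file names):
# tc = roi_mean_timecourse("swa_run1.nii", "left_amygdala_mask.nii",
#                          "spe_conjunction_cluster.nii")
```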

Results

Behavioral Results

We performed a 2 valences (positive vs. negative) × 2 modalities (gustatory vs. visual) repeated-measures analysis of variance (ANOVA) on the preference scores. As expected, the main effect of valence demonstrated that the cues announcing positive reinforcers (apple juice and money) were preferred to those announcing negative reinforcers (salty water and aversive picture) (F(1,19) = 2684.94, P < 10⁻⁶) (Fig. 2A). During scanning, the percent accuracy for detecting the cues was 97%, confirming that subjects paid attention to the cues. A one-way repeated-measures ANOVA on the RTs from the 5 conditions did not reveal any significant RT difference between reinforcer types at the time of the cue (F(4,72) = 2.0242, P = 0.1). For each of the 4 conditions having a reinforcement probability of P = 0.5, the rating of the estimated probability that a cue led to a specific reinforcer did not differ significantly from 0.5 (Fig. 2B). For the neutral condition (P = 0), this estimated probability did not differ from 0. These results show that the probabilities of the association between cues and reinforcers were explicitly learnt in a valence-insensitive manner. Postscan subjective ratings confirmed that subjects perceived positive reinforcers as more pleasant than negative reinforcers. This was assessed using a 2 valences (positive vs. negative) × 2 modalities (gustatory vs. visual) repeated-measures ANOVA, in which the main effect of valence was significant (F(1,19) = 180.91, P < 10⁻⁶) (Fig. 2C). Finally, subjects’ self-reports indicated that they were drink deprived for an average of 12 h 24 min ± 2 h 16 min prior to scanning, thereby complying with the instruction to be drink deprived for 12 h. No significant difference was observed between the ratings of thirst sensation performed before and after scanning (before: 3.16 ± 1.01, after: 2.74 ± 0.99; paired t-test: t(18) = −1.41, P = 0.18). Although absence of evidence is not evidence of absence, the fact that our subjects were drink deprived for 12 h (which included asking them to refrain from drinking water during this period) and drank only a very small amount of apple juice (1.2 mL, i.e., less than 2% of a 200-mL glass) over the course of the experiment supports the idea that the motivation to drink remained stable throughout the experiment. Interestingly, previous reinforcer devaluation fMRI studies reported that the amygdala and orbitofrontal cortex showed reduced activation after feeding subjects to satiety with one type of food but not with another (Small et al. 2001; Gottfried et al. 2003; Kringelbach et al. 2003). Yet, it is unlikely that such satiety effects were present in our Pavlovian conditioning paradigm because motivation for apple juice was not directly manipulated in our experiment and because apple juice was not devalued over the course of the experiment.
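For reference, a 2 × 2 repeated-measures ANOVA of this kind can be run as sketched below with statsmodels; the data frame layout and the synthetic scores are purely illustrative and do not reproduce the reported values.

```python
import numpy as np
import pandas as pd
from statsmodels.stats.anova import AnovaRM

# Long-format table: one preference score per subject x valence x modality cell
# (synthetic scores for illustration only)
rng = np.random.default_rng(0)
rows = []
for subj in range(20):
    for valence in ("positive", "negative"):
        for modality in ("gustatory", "visual"):
            base = 0.9 if valence == "positive" else 0.1
            rows.append({"subject": subj, "valence": valence, "modality": modality,
                         "score": base + rng.normal(scale=0.05)})
df = pd.DataFrame(rows)

# 2 (valence) x 2 (modality) repeated-measures ANOVA
res = AnovaRM(df, depvar="score", subject="subject",
              within=["valence", "modality"]).fit()
print(res)
```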

Figure 2.

Behavioral results. (A) Preference scores. Before scanning, subjects performed a two-alternative forced-choice preference task. Scores are normalized out of 12 pairings for each cue (higher scores indicate stronger preference). (B) Learning of the cue-reinforcer association. At the end of each block, subjects rated how likely the cue was to predict the reinforcer. Subjects learned that the probability of the cue-reinforcer association was equal to 0.5. (C) Appetitive and aversive ratings. Subjects rated the pleasantness or aversiveness of each reinforcer on a scale from −2 (very unpleasant) to 2 (very pleasant). As expected, the ratings for positive reinforcers were significantly higher than the ratings for negative reinforcers. Error bars represent standard deviations.

fMRI Results

Brain regions with response profiles consistent with an RPE signal should code reward and punishment in opposite fashion, responding with a “positive” PE when the outcome is better than expected (unexpected reward or omission of expected punishment) and a “negative” PE when it is worse than expected (unexpected punishment or omission of expected reward). In contrast, brain regions in which the activity reflects an SPE signal should treat both rewards and punishments in a similar fashion, as salient outcomes, with a positive response when both of these salient events are delivered and a negative response for their omission.

SPE Analysis

To test the hypothesis of different cerebral networks modulated by the SPE signal (Fig. 1C, top), we performed 4 different conjunctions combining all or different subsets of the reinforcers (primary, gustatory, and visual). All reported results are FDR corrected, P < 0.05.

Global SPE Analysis

First, we investigated whether a set of brain regions is modulated by a global SPE signal encoding all unexpected reinforcers in a similar fashion, independently of their valence; significant clusters of activation were found only in the bilateral occipital lobe (x, y, z = −30, −69, −18, T = 6.02; 24, −81, −6, T = 5.18) (Table 1a). Note that this global SPE analysis can either reflect an SPE signal regardless of reinforcer type or reflect a general visual SPE-related signal, since all conditions share a visual component.

Table 1

MNI coordinates and t statistics for regions in which cerebral activity positively correlates with the PE (FDR corrected, P < 0.05)

Region                            x     y     z      t
a. Global salient conjunction
    Occipital lobe              −30   −69   −18   6.02
                                 24   −81    −6   5.18
b. Primary/immediate salient conjunction
    Occipital lobe              −30   −69   −18   6.02
                                 24   −81    −6   5.18
    Amygdala                    −21    −6   −18   4.48
                                 21    −3   −18   3.75
c. Gustatory salient conjunction
    Occipital lobe              −30   −69   −18   6.02
                                 24   −81    −6   5.18
    Postcentral gyrus           −57   −21    30   9.93
    Supramarginal gyrus          57   −18    24   9.11
    Precentral gyrus            −57    −3    36   7.89
                                 63          30   7.72
    Supplementary motor area     −3    −3    63   5.95
                                             60   5.81
    Insula                      −39    −3    −6   8.91
                                 42    −6     6   6.83
    Amygdala                    −21    −3   −15   5.60
                                 21    −3   −12   4.63
    Putamen                     −21     3    −9   5.87
                                 21     6   −12   4.83
    ACC                          −6     9    39   5.63
                                 12          42   5.93
    Middle frontal gyrus        −33    36    36   4.02
                                 33    42    30   3.19
d. Visual salient conjunction
    Occipital lobe              −27   −81   −12  10.86
                                 24   −81    −6   9.71
    Lateral orbital gyrus        39    27   −12   3.89
    Middle frontal gyrus         48          57   3.24

Gustatory SPE Analysis

To test for a gustatory SPE brain network, we performed a conjunction analysis of the positive correlation between BOLD activity and the PE in the gustatory conditions (apple juice and salty water). This analysis revealed activity in the bilateral putamen (x, y, z = −21, 3, −9, T = 5.87; 21, 6, −12, T = 4.83), the bilateral insula (x, y, z = −39, −3, −6, T = 8.91; 42, −6, 6, T = 6.83), and the ACC (x, y, z = −6, 9, 39, T = 5.63) (Fig. 3 and Table 1c). All these activations were also observed when directly comparing the brain regions positively correlating with the SPE for the 2 gustatory reinforcers with those responding to the money-related PE and the aversive picture-related PE taken together (with a threshold of P < 0.05 after FDR correction for multiple comparisons), demonstrating that the BOLD signal in these brain regions correlated more strongly with the SPE related to gustatory reinforcers than with the PE related to the other 2 reinforcers. For illustrative purposes, the time courses of activation were extracted in the ROIs corresponding to the 5 brain regions noted above (Fig. 3). They showed an increase in BOLD signal when unexpected apple juice or salty water was received and a decrease when either type of juice was unexpectedly omitted.

Figure 3.

Gustatory SPE Signal. Statistical parametric maps showing that activity in ACC, bilateral putamen, and bilateral insula correlates with the SPE in the 2 gustatory conditions (conjunction analysis). Plotted below are the time courses of inferred mean neuronal activity aligned to the onset of the reception phase for the 4 types of outcomes, in each of these brain regions. Reinforced and unreinforced trials are plotted separately. Color bars represent t values. Statistical significance was thresholded at P < 0.05, FDR corrected.

Primary SPE Analysis

Next, we investigated whether the activity of a specific set of brain regions may be modulated by an SPE signal for primary reinforcers. We thus performed a conjunction analysis (logical AND) of the positive correlations between the BOLD signal and the PE for the 3 primary reinforcer conditions (apple juice, salty water, and aversive picture). This analysis revealed that activity in the bilateral amygdala (x, y, z = −21, −6, −8, T = 4.48; 21, −3, −18, T = 3.75) was positively modulated by the SPE for the 3 primary reinforcers (Fig. 4 and Table 1b). The left amygdala activation survived the direct contrast between the brain regions modulated by the SPE for the 3 primary reinforcers and those modulated by the money-related PE, demonstrating that the BOLD signal in this region correlated more strongly with the SPE related to primary reinforcers than with the PE related to monetary reward (threshold of P < 0.05 after FDR correction for multiple comparisons). To illustrate this point, we built 2 ROIs, in the left and right amygdala, from which we extracted the time course of each condition for the reinforced and unreinforced trials separately. The resulting time courses indicate an increase in BOLD signal when an unexpected primary reinforcer is delivered but not when it is omitted.

Figure 4.

Primary SPE Signal. (A) Bilateral amygdala activity showing positive correlation with the SPE for the primary/immediate reinforcers: apple juice (blue), salty water (green), and aversive picture (red). The overlap between the 3 conditions is represented in white. (B) Activation resulting from the conjunction (logical AND) of the correlation between SPE and amygdala activity for these 3 conditions. Below are plotted the average time course, aligned to the onset of the reception phase. For each brain hemisphere, time courses are extracted in the functional clusters of the conjunction analysis. Reinforced and unreinforced trials are plotted separately. Color bars represent t values. Statistical significance was thresholded at P < 0.05, FDR corrected.

Visual SPE Analysis

Monetary reward and aversive picture were presented in the same visual modality, without a gustatory component. Thus, we searched for common brain regions responding as an SPE signal for these 2 visual conditions. The conjunction of the brain regions showing a positive correlation with the PE for both the monetary reward and the aversive picture conditions revealed large clusters of activity in the bilateral occipital lobe (x, y, z = −27, −81, −12, T = 10.86; 24, −81, −6, T = 9.71), and smaller clusters in the right lateral orbital gyrus and the middle frontal gyrus (Table 1d).

RPE Analysis

As noted in the Introduction, an alternative to the SPE hypothesis is that the activity of some brain regions covaries with an RPE signal, responding in opposite fashion to positive and negative events, for both reinforced and unreinforced trials (Fig. 1C, bottom). Such an RPE signal should thus be high for unexpected reward delivery (apple juice and money) but low for unexpected punishment (salty water and aversive picture).

Global, Gustatory, and Visual RPE Analysis

To test for brain regions responding like a global RPE signal for all reinforcers, we performed a conjunction analysis of the brain regions showing activity positively correlated with the PE for the 2 rewarding conditions and of the brain regions showing activity negatively correlated with the PE for the 2 aversive conditions. This analysis revealed no significant correlation with the RPE signal, even at the very liberal threshold of P = 0.01 uncorrected. Moreover, no brain region showed activity modulated by the RPE signal for the gustatory or visual conditions when restricting the conjunction analyses to one or the other of these 2 modalities.

RPE for Each Reinforcer

For completeness, we also report the list of brain regions positively and negatively modulated by the PE for each reinforcer taken separately (Supplementary Tables S1–S4). Using our standard imaging procedure, we were unable to find evidence of SPE-related activity in the VTA/SN for any of the reinforcers when performing a small volume correction in a spherical ROI centered at x, y, z = 6, −16, −14 (6 mm radius) or when using the SN ROI of a probabilistic brain atlas (Hammers et al. 2003).

Discussion

The present study identifies the brain structures responding to the “salient” and the “reward” PE signals in a Pavlovian reinforcement procedure with 2 types of rewards and 2 types of punishments. Critically, our model-based fMRI approach allowed us to extend to the whole-brain level in humans the distinction between RPE and SPE signals recently observed in midbrain dopaminergic neurons in monkeys (Matsumoto and Hikosaka 2009b). Our results reveal the contributions of specific brain systems covarying with distinct SPE signals during Pavlovian learning of different types of stimulus-reward and stimulus-punishment associations. We found evidence of a brain network including the striatum, anterior insula, and the ACC in which the activity reflected an SPE signal that depended upon the type of reinforcer. This brain network responded more robustly for appetitive and aversive juice outcomes than for monetary or aversive picture outcomes. Moreover, activity in the amygdala correlated with the SPE for primary reinforcers (i.e., for both types of juice and for aversive picture outcomes), but we found no evidence for this brain region encoding an SPE for the secondary reinforcer (money). In contrast, no evidence was found for a nonsensory brain network coding an RPE regardless of reinforcer type. These results demonstrate that an SPE signal covaries with the activity of specific components of the reward system and depends upon reinforcer type.

A current theory proposes that dopaminergic neurons encode a form of valence for learning and motivating reward-seeking behavior (Schultz et al. 1997; Berridge and Robinson 1998; Wise 2004), while another theory states that these neurons encode a form of salience (or ‘‘alerting’’) for shifting attention to unpredicted events (Redgrave et al. 1999; Horvitz 2000). In contrast, recent electrophysiological data suggest that dopamine neurons may follow both theories at different times during a single task (Bromberg-Martin et al. 2010). Our finding that some components of the reward system respond to an SPE signal that depends upon the type of reinforcer is thus consistent with the latter results.

Previous fMRI studies using only potentially rewarded outcomes (presented or omitted) or only negative reinforcers could not localize brain activity covarying with an SPE signal regardless of reinforcer type (i.e., activity engaged in the same way by unexpected reward and punishment). The brain network we observed, in which activity covaries with an SPE signal for both appetitive and aversive juices, complements 2 bodies of literature, one concerning appetitive conditioning and the other concerning aversive conditioning. First, our findings are consistent with previous human associative learning fMRI studies reporting striatal activity for different types of rewards, such as unexpected appetitive juice (McClure et al. 2003; O'Doherty, Dayan, et al. 2003) or positive odors and faces (Gottfried, O'Doherty, et al. 2002; Bray and O'Doherty 2007). Second, a number of human fMRI studies also investigated the neural substrates of aversive conditioning using a variety of punishments (Seymour et al. 2004; Jensen et al. 2007; Menon et al. 2007; Sarinopoulos et al. 2010). In agreement with the current findings, some experiments using a reinforcement learning approach involving cues that predict potential painful stimuli reported that the striatal BOLD signal increases with unexpected punishment and decreases when it is unexpectedly omitted (Jensen et al. 2003; Seymour et al. 2004; Menon et al. 2007). However, most aversive Pavlovian conditioning experiments did not use computational reinforcement learning models and did not directly compare how the brain learns to predict different types of rewards and punishments, making the comparison with our current findings difficult.

The present findings add to the growing body of evidence supporting the role of the striatum in coding PE for aversive outcomes, both in humans (Jensen et al. 2003; Seymour et al. 2004; Menon et al. 2007) and in animals (Pezze and Feldon 2004). Moreover, an SPE-like response has been observed both in the human striatum, which shows increased activity for unexpected reward and for unexpected punishment (Seymour et al. 2005; Jensen et al. 2007), and in the striatum of nonhuman primates, which represents both appetitive and aversive events (Ravel et al. 2003). Together, our results point to a general role of the striatum in coding an SPE signal across a broad range of reinforcer types and complement previous reports that the striatum responds to nonrewarding unexpected stimuli, consistent with a saliency interpretation (Dreher and Grafman 2002; Zink et al. 2006).

In addition to the striatum, responses in other brain regions, including the ACC and the anterior insula, also correlated with a gustatory SPE, regardless of juice valence. ACC neurons are known to encode PE when macaque monkeys learn a correct action to make in a given context (Matsumoto et al. 2007) and the activity of this region has been shown to correlate with PE for both pain increase and pain relief (Seymour et al. 2005). This literature is consistent with our current findings that both appetitive and aversive PE correlate with ACC activity. Our results also suggest that the anterior insula activation signals an error in predicting juice salience, regardless of its valence. This result extends previous findings that anterior insula activity is often reported with various aversive events (Jensen et al. 2003; Liu et al. 2007), is driven by taste intensity irrespective of valence (Small et al. 2003), and correlates with the PE for both aversive and appetitive stimuli when using pain and pain relief as reinforcers (Seymour et al. 2005).

A number of studies have focused on valence, using monetary incentives or aversive shock, but have not independently varied salience (Breiter et al. 2001; Knutson et al. 2001; Abler et al. 2006; Dreher et al. 2006; Jensen et al. 2007; Tobler et al. 2007; Tom et al. 2007; Sescousse et al. 2010). Other studies have varied salience but have not independently varied valence across gains and losses (Jensen et al. 2003; Tricomi et al. 2004; Zink et al. 2006; Bjork and Hommer 2007). The novel aspect of the current study is that aversive and appetitive stimuli of different types are combined within a single paradigm and that different modality-dependent saliency PEs are computed to identify specific saliency networks. When valence and salience have been manipulated simultaneously, the results have shown some inconsistencies across studies, in part because the concept of saliency has been defined in different ways (Zink et al. 2006; Cooper and Knutson 2008; Carter et al. 2009). For example, Cooper and Knutson (2008) independently manipulated valence and salience by cuing participants to anticipate certain and uncertain monetary gains and losses. They found a significant interaction between valence and salience of anticipated incentives in the ventral striatum, suggesting that it separately represents valence and salience. Other studies reported that ventral striatal activation increases in anticipation of both gains and losses, supporting a saliency interpretation (Carter et al. 2009), or that some components of the reward system depend on valence (Seymour et al. 2007). Salient trials have also been reported to engage the dorsal striatum when salience is defined by a monetary reception dependent upon a correct response (active condition) as opposed to a passive monetary reception (Tricomi et al. 2004; Zink et al. 2006). The concept of saliency used in these previous studies, however, greatly differed between paradigms and was not implemented in terms of computational processes, as is the case in our study.

It is worth noting that monetary RPE did not evoke reliable striatal or amygdala activity in our Pavlovian conditioning experiment. To the best of our knowledge, striatal activity is evoked in instrumental conditioning experiments using money as a reward (Pessiglione et al. 2006; Yacubian et al. 2006; Liu et al. 2007) but has not been reported during monetary Pavlovian conditioning. In most experiments involving monetary rewards, receipt of money is contingent upon the subject's action (i.e., monetary decision making, guessing, and gambling tasks) and money delivery only follows a correct response (O'Doherty, Critchley, et al. 2003; Dreher et al. 2006; Liu et al. 2007). Similarly, fMRI experiments reporting a PE signal in the striatum (O'Doherty, Critchley, et al. 2003; Pessiglione et al. 2006; Yacubian et al. 2006) did so during instrumental learning paradigms. The idea that instrumental, but not Pavlovian, conditioning engages the striatum for monetary reward is directly supported by a study reporting higher striatal activation when money delivery depended upon the subject's response as compared with when the receipt of money was independent of the subject's actions (Zink et al. 2004). In our paradigm, subjects did not have to choose between actions leading to potentially aversive/positive outcomes and therefore could not actively avoid or approach anticipated outcomes. This may explain the absence of striatal response to monetary reward in our Pavlovian conditioning procedure, since the ability to learn the valence of stimulus-outcome associations is fundamental for appropriate subsequent approach or avoidance behavior. It is also possible that our study was insensitive to the RPE for signal-to-noise reasons.

Bilateral amygdala responses covaried with the SPE for all types of primary reinforcers (appetitive/aversive juices and aversive pictures) but were not significantly modulated by the PE for the secondary reward (money). One reason may be that primary reinforcers are more salient than secondary reinforcers because they are more important for survival, capturing orienting or information-seeking behavior more effectively. The amygdala is critical for processing rewarding and aversive outcomes (Murray 2007). Consistent with its role in processing both emotional valence and intensity (Machado et al. 2009), it is ideally positioned to integrate appetitive and aversive events to direct emotional responses. Information about primary pleasant and aversive stimuli converges in the amygdala, which receives input from sensory systems of all modalities (Stefanacci and Amaral 2002). Our results are thus consistent with studies emphasizing the role of the amygdala as a detector of behaviorally relevant stimuli (Sander et al. 2003) and during conditioning procedures with both appetitive and aversive stimuli (Buchel et al. 1999; Seymour et al. 2005; Paton et al. 2006). A number of brain regions, including the amygdala, may also reflect greater saliency associated with risk-taking during neuroeconomics paradigms (Kahn et al. 2002; Bechara et al. 2003; Cohen and Ranganath 2005; Hsu et al. 2005), as well as during speeded as compared with delayed motor responses in the stop signal task, a classical cognitive control paradigm requiring individuals to restrain a habitual response (Li et al. 2009). Interestingly, at the neuronal level, electrophysiological responses in the primate amygdala support both valence-specific and salience-specific processes, with both common and distinct populations of neurons involved in learning the association between conditioned visual stimuli and rewards or punishments (Buchel et al. 1999; Paton et al. 2006; Belova et al. 2008). Supporting a saliency signal, some cells showed a similar effect of expectation on responses to rewards and punishments. In contrast, in other cells, expectation modulated the responses to either rewards or aversive stimuli but not both, potentially playing a role in processes that require information about stimulus valence (Paton et al. 2006; Belova et al. 2008). A recent study in rodents also demonstrated that the amygdala provides an unsigned error signal (Roesch et al. 2010) with characteristics consistent with those postulated for a signal of “motivational salience” (Belova et al. 2008). Many of these processes, which include learning when to exhibit defensive as opposed to approach behavior, are thus likely to involve the amygdala (LeDoux 2000).

A recent study used an axiomatic approach rooted in economic theory to formally test the class of RPE models on neural data (Rutledge et al. 2010). This approach divides the space of all possible models into subdomains and attempts to falsify the hypothesis that one or more members of an entire class of models can account for a set of empirical observations. This elegant study showed that the neural responses observed in the striatum, medial prefrontal cortex, amygdala, and posterior cingulate cortex satisfy the axioms (necessary and sufficient conditions) for the entire class of RPE models. However, activity measured from the anterior insula falsified the axiomatic model, and therefore no RPE model can account for measured activity in this brain region. Additional analyses suggested that the anterior insula might encode a saliency signal at outcome, consistent with our results. In contrast, our current study, which used a regression approach (as do all other model-based fMRI studies of the RPE), cannot falsify the hypothesis that dopamine-related activity encodes an RPE signal. It is also important to note that the axiomatic study used a task requiring choices between lotteries in the monetary domain only. Therefore, this paradigm, in contrast to the current one, did not allow the authors to test whether the nature of the reinforcer differentially influences distinct brain networks.
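To make the regression logic concrete, the following is a minimal sketch only, not the actual analysis pipeline of the present study: it assumes a simple Rescorla-Wagner learner with an arbitrary learning rate and a toy trial sequence, and shows how a signed RPE regressor and an unsigned SPE regressor would be derived from the same trials before being convolved with a hemodynamic response function and entered into a general linear model.

import numpy as np

def rescorla_wagner_pe(outcomes, alpha=0.3):
    """Trial-by-trial prediction errors for a single conditioned stimulus.

    outcomes : received outcome values (e.g., +1 reward, 0 omission)
    alpha    : learning rate (hypothetical value, not estimated from data)
    """
    v = 0.0                       # current outcome prediction for this CS
    pe = np.zeros(len(outcomes))
    for t, r in enumerate(outcomes):
        pe[t] = r - v             # signed prediction error: outcome minus prediction
        v += alpha * pe[t]        # delta-rule update of the prediction
    return pe

# toy partially reinforced trial sequence (reward = 1, omission = 0)
outcomes = np.array([1, 0, 1, 1, 0, 1, 0, 1])
rpe = rescorla_wagner_pe(outcomes)   # signed RPE regressor
spe = np.abs(rpe)                    # unsigned "salient" PE regressor

# In a model-based fMRI analysis each regressor would be convolved with an HRF
# and entered into a GLM; a voxel whose activity correlates with the signed
# regressor is consistent with an RPE account, whereas correlation with the
# unsigned regressor is also consistent with a mixture of inputs, which a
# regression fit alone cannot falsify.
print(np.round(rpe, 2))
print(np.round(spe, 2))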

The absence of positive results concerning the VTA/SN in relation to the PE or SPE could be due to the known difficulty of imaging midbrain dopamine nuclei without high-resolution fMRI and midbrain-specific alignment algorithms (D'Ardenne et al. 2008). Indeed, a number of previous fMRI studies did report midbrain PE-related activity in different paradigms (Dreher et al. 2006; D'Ardenne et al. 2008), while others did not (McClure et al. 2003; O'Doherty, Dayan, et al. 2003; Abler et al. 2006; Li et al. 2006; Pessiglione et al. 2006; Behrens et al. 2008; Hare et al. 2008; Sescousse et al. 2010). Along the same lines, the neural activity of the lateral habenula has been reported to increase when expected rewards are not delivered or when unexpected punishments are received (Matsumoto and Hikosaka 2007, 2009a). Such signals may be an important source of negative RPE inputs to the VTA, since increases in lateral habenular activity inhibit dopamine neurons in the VTA. Some fMRI studies in humans have also reported that the habenula responds to negative PEs (Salas et al. 2010) and sends a signal to the VTA/SN during error detection in a stop signal task (Ide and Li 2011). Although these results are consistent with the hypothesis that the VTA and habenula are part of a saliency brain circuit, most fMRI studies on PEs using either standard or high-resolution imaging have failed to report such activity in the lateral habenula (McClure et al. 2003; O'Doherty, Dayan, et al. 2003; Seymour et al. 2004; Abler et al. 2006; Dreher et al. 2006; Li et al. 2006; Pessiglione et al. 2006; Behrens et al. 2008; D'Ardenne et al. 2008; Hare et al. 2008; Rutledge et al. 2010; Sescousse et al. 2010).

Finally, it should be noted that finding BOLD activity that covaries with an SPE signal does not mean that the source of the SPE network is unique. Indeed, a region X might receive an RPE from region A and a punishment PE from region B and may therefore appear to receive a single SPE signal. This is not merely a theoretical possibility, since it has been suggested that dopamine encodes an RPE while serotonin may encode a punishment PE or may regulate patience while waiting for future rewards (Daw et al. 2002; Miyazaki et al. 2012). Thus, there is a conceptual distinction to be made between brain activity that directly reflects an SPE, that is, computes an SPE or receives direct projections from areas that compute it, and brain activity that merely correlates with this signal as a result of increased salience. This distinction is particularly useful for understanding why the only brain area responding to the global SPE, the occipital lobe, is devoid of dopaminergic projections.
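The following toy example (with purely hypothetical inputs and rectified responses, chosen only for illustration and not as a claim about anatomy) makes this ambiguity explicit: a target region summing a reward PE input from one afferent and a punishment PE input from another produces a trial-by-trial response that is numerically identical to a single unsigned SPE.

import numpy as np

# signed prediction errors across interleaved reward and punishment trials (toy values)
delta = np.array([+0.8, -0.6, +0.3, -0.9, +1.0, -0.2])

# hypothetical afferent "A": reward PE input, excited only when outcomes are better than expected
rpe_input = np.clip(delta, 0, None)

# hypothetical afferent "B": punishment PE input, excited only when outcomes are worse than expected
ppe_input = np.clip(-delta, 0, None)

# a target region summing the two inputs ...
summed = rpe_input + ppe_input

# ... is indistinguishable from one receiving a single unsigned salient PE
spe = np.abs(delta)
assert np.allclose(summed, spe)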

Our findings advance our understanding of the neurobiological mechanisms underlying the ability to learn associations between stimuli and rewards or punishments. Our model-based fMRI approach allowed us to pinpoint the brain structures responding to the 2 types of hypothesized PE signals (Matsumoto and Hikosaka 2009b). In that electrophysiological study, the SPE hypothesis was developed using a liquid reward and an aversive airpuff, stimuli that not only have opposite valences but are also of different types (gustatory vs. tactile). Our results reveal the contributions made by distinct brain regions to the computation of the PE depending upon the type and valence of the outcome. Thus, our results provide direct empirical evidence for formal learning theories that posit a critical role for the SPE signal during Pavlovian conditioning in humans, and demonstrate that the cerebral representation of this signal depends not only upon the valence but also upon the type of reinforcer.

Supplementary Material

Supplementary material can be found at: http://www.cercor.oxfordjournals.org/

Funding

Marie Curie International reintegration grant; Fyssen Foundation grant (to J.-C.D.).

We thank the CERMEP staff for help during scanning. Conflict of Interest: None declared.

References

Abler B, Walter H, Erk S, Kammerer H, Spitzer M. 2006. Prediction error as a linear function of reward probability is coded in human nucleus accumbens. Neuroimage. 31(2):790-795.
Bayer HM, Glimcher PW. 2005. Midbrain dopamine neurons encode a quantitative reward prediction error signal. Neuron. 47(1):129-141.
Bechara A, Damasio H, Damasio AR. 2003. Role of the amygdala in decision-making. Ann N Y Acad Sci. 985:356-369.
Behrens TEJ, Hunt LT, Woolrich MW, Rushworth MFS. 2008. Associative learning of social value. Nature. 456(7219):245-249.
Belova MA, Paton JJ, Salzman CD. 2008. Moment-to-moment tracking of state value in the amygdala. J Neurosci. 28(40):10023-10030.
Berns GS, McClure SM, Pagnoni G, Montague PR. 2001. Predictability modulates human brain response to reward. J Neurosci. 21(8):2793-2798.
Berridge KC, Robinson TE. 1998. What is the role of dopamine in reward: hedonic impact, reward learning, or incentive salience? Brain Res Brain Res Rev. 28(3):309-369.
Bjork JM, Hommer DW. 2007. Anticipating instrumentally obtained and passively-received rewards: a factorial fMRI investigation. Behav Brain Res. 177(1):165-170.
Bray S, O'Doherty J. 2007. Neural coding of reward-prediction error signals during classical conditioning with attractive faces. J Neurophysiol. 97(4):3036-3045.
Breiter HC, Aharon I, Kahneman D, Dale A, Shizgal P. 2001. Functional imaging of neural responses to expectancy and experience of monetary gains and losses. Neuron. 30(2):619-639.
Brischoux F, Chakraborty S, Brierley DI, Ungless MA. 2009. Phasic excitation of dopamine neurons in ventral VTA by noxious stimuli. Proc Natl Acad Sci U S A. 106(12):4894-4899.
Bromberg-Martin ES, Matsumoto M, Hikosaka O. 2010. Distinct tonic and phasic anticipatory activity in lateral habenula and dopamine neurons. Neuron. 67(1):144-155.
Buchel C, Dolan RJ, Armony JL, Friston KJ. 1999. Amygdala-hippocampal involvement in human aversive trace conditioning revealed through event-related functional magnetic resonance imaging. J Neurosci. 19(24):10869-10876.
Carter RM, Macinnes JJ, Huettel SA, Adcock RA. 2009. Activation in the VTA and nucleus accumbens increases in anticipation of both gains and losses. Front Behav Neurosci. 3:21.
Cohen MX, Ranganath C. 2005. Behavioral and neural predictors of upcoming decisions. Cogn Affect Behav Neurosci. 5(2):117-126.
Cooper JC, Knutson B. 2008. Valence and salience contribute to nucleus accumbens activation. Neuroimage. 39(1):538-547.
Critchley HD, Mathias CJ, Dolan RJ. 2002. Fear conditioning in humans: the influence of awareness and autonomic arousal on functional neuroanatomy. Neuron. 33(4):653-663.
D'Ardenne K, McClure SM, Nystrom LE, Cohen JD. 2008. BOLD responses reflecting dopaminergic signals in the human ventral tegmental area. Science. 319(5867):1264-1267.
Daw ND, Kakade S, Dayan P. 2002. Opponent interactions between serotonin and dopamine. Neural Netw. 15(4-6):603-616.
Delgado MR, Li J, Schiller D, Phelps EA. 2008. The role of the striatum in aversive learning and aversive prediction errors. Philos Trans R Soc Lond B Biol Sci. 363(1511):3787-3800.
Delgado MR, Nystrom LE, Fissell C, Noll DC, Fiez JA. 2000. Tracking the hemodynamic responses to reward and punishment in the striatum. J Neurophysiol. 84(6):3072-3077.
Dreher J-C, Grafman J. 2002. The roles of the cerebellum and basal ganglia in timing and error prediction. Eur J Neurosci. 16(8):1609-1619.
Dreher J-C, Kohn P, Berman KF. 2006. Neural coding of distinct statistical properties of reward information in humans. Cereb Cortex. 16(4):561-573.
Eickhoff SB, Stephan KE, Mohlberg H, Grefkes C, Fink GR, Amunts K, Zilles K. 2005. A new SPM toolbox for combining probabilistic cytoarchitectonic maps and functional imaging data. Neuroimage. 25(4):1325-1335.
Gottfried JA, Deichmann R, Winston JS, Dolan RJ. 2002. Functional heterogeneity in human olfactory cortex: an event-related functional magnetic resonance imaging study. J Neurosci. 22(24):10819-10828.
Gottfried JA, O'Doherty J, Dolan RJ. 2002. Appetitive and aversive olfactory learning in humans studied using event-related functional magnetic resonance imaging. J Neurosci. 22(24):10829-10837.
Gottfried JA, O'Doherty J, Dolan RJ. 2003. Encoding predictive reward value in human amygdala and orbitofrontal cortex. Science. 301(5636):1104-1107.
Hammers A, Allom R, Koepp MJ, Free SL, Myers R, Lemieux L, Mitchell TN, Brooks DJ, Duncan JS. 2003. Three-dimensional maximum probability atlas of the human brain, with particular reference to the temporal lobe. Hum Brain Mapp. 19(4):224-247.
Hare TA, O'Doherty J, Camerer CF, Schultz W, Rangel A. 2008. Dissociating the role of the orbitofrontal cortex and the striatum in the computation of goal values and prediction errors. J Neurosci. 28(22):5623-5630.
Herwig U, Abler B, Walter H, Erk S. 2007. Expecting unpleasant stimuli—an fMRI study. Psychiatry Res. 154(1):1-12.
Horvitz JC. 2000. Mesolimbocortical and nigrostriatal dopamine responses to salient non-reward events. Neuroscience. 96(4):651-656.
Hsu M, Bhatt M, Adolphs R, Tranel D, Camerer CF. 2005. Neural systems responding to degrees of uncertainty in human decision-making. Science. 310(5754):1680-1683.
Ide JS, Li C-SR. 2011. Error-related functional connectivity of the habenula in humans. Front Hum Neurosci. 5:25.
Jensen J, McIntosh AR, Crawley AP, Mikulis DJ, Remington G, Kapur S. 2003. Direct activation of the ventral striatum in anticipation of aversive stimuli. Neuron. 40(6):1251-1257.
Jensen J, Smith AJ, Willeit M, Crawley AP, Mikulis DJ, Vitcu I, Kapur S. 2007. Separate brain regions code for salience vs. valence during reward prediction in humans. Hum Brain Mapp. 28(4):294-302.
Kahn I, Yeshurun Y, Rotshtein P, Fried I, Ben-Bashat D, Hendler T. 2002. The role of the amygdala in signaling prospective outcome of choice. Neuron. 33(6):983-994.
Knight DC, Waters NS, King MK, Bandettini PA. 2010. Learning-related diminution of unconditioned SCR and fMRI signal responses. Neuroimage. 49(1):843-848.
Knutson B, Adams CM, Fong GW, Hommer D. 2001. Anticipation of increasing monetary reward selectively recruits nucleus accumbens. J Neurosci. 21(16):RC159.
Knutson B, Westdorp A, Kaiser E, Hommer D. 2000. FMRI visualization of brain activity during a monetary incentive delay task. Neuroimage. 12(1):20-27.
Kringelbach ML, O'Doherty J, Rolls ET, Andrews C. 2003. Activation of the human orbitofrontal cortex to a liquid food stimulus is correlated with its subjective pleasantness. Cereb Cortex. 13(10):1064-1071.
LaBar KS, Gatenby JC, Gore JC, LeDoux JE, Phelps EA. 1998. Human amygdala activation during conditioned fear acquisition and extinction: a mixed-trial fMRI study. Neuron. 20(5):937-945.
Lang PJ, Bradley MM, Cuthbert BN. 2005. International affective picture system (IAPS): affective ratings of pictures and instruction manual. Technical Report A-6. Gainesville (FL): University of Florida.
LeDoux JE. 2000. Emotion circuits in the brain. Annu Rev Neurosci. 23:155-184.
Li C-SR, Chao HHA, Lee T-W. 2009. Neural correlates of speeded as compared with delayed responses in a stop signal task: an indirect analog of risk taking and association with an anxiety trait. Cereb Cortex. 19(4):839-848.
Li C-SR, Huang C, Constable RT, Sinha R. 2006. Imaging response inhibition in a stop-signal task: neural correlates independent of signal monitoring and post-response processing. J Neurosci. 26(1):186-192.
Liu X, Powell DK, Wang H, Gold BT, Corbly CR, Joseph JE. 2007. Functional dissociation in frontal and striatal areas for processing of positive and negative reward information. J Neurosci. 27(17):4587-4597.
Machado CJ, Kazama AM, Bachevalier J. 2009. Impact of amygdala, orbital frontal, or hippocampal lesions on threat avoidance and emotional reactivity in nonhuman primates. Emotion. 9(2):147-163.
Matsumoto M, Hikosaka O. 2007. Lateral habenula as a source of negative reward signals in dopamine neurons. Nature. 447(7148):1111-1115.
Matsumoto M, Hikosaka O. 2009a. Representation of negative motivational value in the primate lateral habenula. Nat Neurosci. 12(1):77-84.
Matsumoto M, Hikosaka O. 2009b. Two types of dopamine neuron distinctly convey positive and negative motivational signals. Nature. 459(7248):837-841.
Matsumoto M, Matsumoto K, Abe H, Tanaka K. 2007. Medial prefrontal cell activity signaling prediction errors of action values. Nat Neurosci. 10(5):647-656.
Mazaika P, Hoeft F, Glover G, Reiss AL. 2009. Methods and software for fMRI analysis for clinical subjects. Presentation at the 15th Annual Meeting of the Organization for Human Brain Mapping, San Francisco, CA, June 18-23, 2009. spnl.stanford.edu.
McClure SM, Berns GS, Montague PR. 2003. Temporal prediction errors in a passive learning task activate human striatum. Neuron. 38(2):339-346.
Menon M, Jensen J, Vitcu I, Graff-Guerrero A, Crawley A, Smith MA, Kapur S. 2007. Temporal difference modeling of the blood-oxygen level dependent response during aversive conditioning in humans: effects of dopaminergic modulation. Biol Psychiatry. 62(7):765-772.
Miyazaki K, Miyazaki KW, Doya K. 2012. The role of serotonin in the regulation of patience and impulsivity. Mol Neurobiol. doi:10.1007/s12035-012-8232-6.
Murray EA. 2007. The amygdala, reward and emotion. Trends Cogn Sci. 11(11):489-497.
Nichols T, Brett M, Andersson J, Wager T, Poline J-B. 2005. Valid conjunction inference with the minimum statistic. Neuroimage. 25(3):653-660.
Nieuwenhuis S, Heslenfeld DJ, Geusau NJAv, Mars RB, Holroyd CB, Yeung N. 2005. Activity in human reward-sensitive brain areas is strongly context dependent. Neuroimage. 25(4):1302-1309.
O'Doherty J, Critchley H, Deichmann R, Dolan RJ. 2003. Dissociating valence of outcome from behavioral control in human orbital and ventral prefrontal cortices. J Neurosci. 23(21):7931-7939.
O'Doherty J, Dayan P, Schultz J, Deichmann R, Friston K, Dolan RJ. 2004. Dissociable roles of ventral and dorsal striatum in instrumental conditioning. Science. 304(5669):452-454.
O'Doherty JP, Dayan P, Friston K, Critchley H, Dolan RJ. 2003. Temporal difference models and reward-related learning in the human brain. Neuron. 38(2):329-337.
Oldfield RC. 1971. The assessment and analysis of handedness: the Edinburgh inventory. Neuropsychologia. 9(1):97-113.
Pan W-X, Schmidt R, Wickens JR, Hyland BI. 2005. Dopamine cells respond to predicted events during classical conditioning: evidence for eligibility traces in the reward-learning network. J Neurosci. 25(26):6235-6242.
Paton JJ, Belova MA, Morrison SE, Salzman CD. 2006. The primate amygdala represents the positive and negative value of visual stimuli during learning. Nature. 439(7078):865-870.
Pessiglione M, Seymour B, Flandin G, Dolan RJ, Frith CD. 2006. Dopamine-dependent prediction errors underpin reward-seeking behaviour in humans. Nature. 442(7106):1042-1045.
Pezze MA, Feldon J. 2004. Mesolimbic dopaminergic pathways in fear conditioning. Prog Neurobiol. 74(5):301-320.
Ravel S, Legallet E, Apicella P. 2003. Responses of tonically active neurons in the monkey striatum discriminate between motivationally opposing stimuli. J Neurosci. 23(24):8489-8497.
Redgrave P, Prescott TJ, Gurney K. 1999. Is the short-latency dopamine response too short to signal reward error? Trends Neurosci. 22(4):146-151.
Rescorla RA, Wagner AR. 1972. A theory of Pavlovian conditioning: variations in the effectiveness of reinforcement and non-reinforcement. In: Black A, Prokasy WF, editors. Classical conditioning II: current research and theory. New York: Appleton-Century-Crofts. p. 64-99.
Reynolds JN, Hyland BI, Wickens JR. 2001. A cellular mechanism of reward-related learning. Nature. 413(6851):67-70.
Roesch MR, Calu DJ, Esber GR, Schoenbaum G. 2010. Neural correlates of variations in event processing during learning in basolateral amygdala. J Neurosci. 30(7):2464-2471.
Roesch MR, Calu DJ, Schoenbaum G. 2007. Dopamine neurons encode the better option in rats deciding between differently delayed or sized rewards. Nat Neurosci. 10(12):1615-1624.
Rutledge RB, Dean M, Caplin A, Glimcher PW. 2010. Testing the reward prediction error hypothesis with an axiomatic model. J Neurosci. 30(40):13525-13536.
Salas R, Baldwin P, de Biasi M, Montague PR. 2010. BOLD responses to negative reward prediction errors in human habenula. Front Hum Neurosci. 4:36.
Sander D, Grafman J, Zalla T. 2003. The human amygdala: an evolved system for relevance detection. Rev Neurosci. 14(4):303-316.
Sarinopoulos I, Grupe DW, Mackiewicz KL, Herrington JD, Lor M, Steege EE, Nitschke JB. 2010. Uncertainty during anticipation modulates neural responses to aversion in human insula and amygdala. Cereb Cortex. 20(4):929-940.
Schultz W. 1998. Predictive reward signal of dopamine neurons. J Neurophysiol. 80(1):1-27.
Schultz W, Dayan P, Montague PR. 1997. A neural substrate of prediction and reward. Science. 275(5306):1593-1599.
Sescousse G, Redoute J, Dreher J-C. 2010. The architecture of reward value coding in the human orbitofrontal cortex. J Neurosci. 30(39):13095-13104.
Seymour B, Daw N, Dayan P, Singer T, Dolan R. 2007. Differential encoding of losses and gains in the human striatum. J Neurosci. 27(18):4826-4831.
Seymour B, O'Doherty JP, Dayan P, Koltzenburg M, Jones AK, Dolan RJ, Friston KJ, Frackowiak RS. 2004. Temporal difference models describe higher-order learning in humans. Nature. 429(6992):664-667.
Seymour B, O'Doherty JP, Koltzenburg M, Wiech K, Frackowiak R, Friston K, Dolan R. 2005. Opponent appetitive-aversive neural processes underlie predictive learning of pain relief. Nat Neurosci. 8(9):1234-1240.
Skinner BF. 1938. The behavior of organisms: an experimental analysis. New York: Appleton-Century-Crofts.
Small DM, Gregory MD, Mak YE, Gitelman D, Mesulam MM, Parrish T. 2003. Dissociation of neural representation of intensity and affective valuation in human gustation. Neuron. 39(4):701-711.
Small DM, Zatorre RJ, Dagher A, Evans AC, Jones-Gotman M. 2001. Changes in brain activity related to eating chocolate: from pleasure to aversion. Brain. 124(Pt 9):1720-1733.
Stefanacci L, Amaral DG. 2002. Some observations on cortical inputs to the macaque monkey amygdala: an anterograde tracing study. J Comp Neurol. 451(4):301-323.
Tanaka SC, Doya K, Okada G, Ueda K, Okamoto Y, Yamawaki S. 2004. Prediction of immediate and future rewards differentially recruits cortico-basal ganglia loops. Nat Neurosci. 7(8):887-893.
Tobler PN, O'Doherty JP, Dolan RJ, Schultz W. 2007. Reward value coding distinct from risk attitude-related uncertainty coding in human reward systems. J Neurophysiol. 97(2):1621-1632.
Tom SM, Fox CR, Trepel C, Poldrack RA. 2007. The neural basis of loss aversion in decision-making under risk. Science. 315(5811):515-518.
Tremblay L, Schultz W. 1999. Relative reward preference in primate orbitofrontal cortex. Nature. 398(6729):704-708.
Tricomi EM, Delgado MR, Fiez JA. 2004. Modulation of caudate activity by action contingency. Neuron. 41(2):281-292.
Wise RA. 2004. Dopamine, learning and motivation. Nat Rev Neurosci. 5(6):483-494.
Yacubian J, Gläscher J, Schroeder K, Sommer T, Braus DF, Büchel C. 2006. Dissociable systems for gain- and loss-related value predictions and errors of prediction in the human brain. J Neurosci. 26(37):9530-9537.
Young AMJ. 2004. Increased extracellular dopamine in nucleus accumbens in response to unconditioned and conditioned aversive stimuli: studies using 1 min microdialysis in rats. J Neurosci Methods. 138(1-2):57-63.
Zaghloul KA, Blanco JA, Weidemann CT, McGill K, Jaggi JL, Baltuch GH, Kahana MJ. 2009. Human substantia nigra neurons encode unexpected financial rewards. Science. 323(5920):1496-1499.
Zink CF, Pagnoni G, Chappelow J, Martin-Skurski M, Berns GS. 2006. Human striatal activation reflects degree of stimulus saliency. Neuroimage. 29(3):977-983.
Zink CF, Pagnoni G, Martin-Skurski ME, Chappelow JC, Berns GS. 2004. Human striatal responses to monetary reward depend on saliency. Neuron. 42(3):509-517.