Hackjin Kim, Shinsuke Shimojo, John P. O'Doherty, Overlapping Responses for the Expectation of Juice and Money Rewards in Human Ventromedial Prefrontal Cortex, Cerebral Cortex, Volume 21, Issue 4, April 2011, Pages 769–776, https://doi.org/10.1093/cercor/bhq145
Abstract
Although much is known about the neural substrates of reward, the question of whether expectation of different types of reinforcers engages distinct or overlapping brain circuitry has not been addressed definitively. In the present study, human subjects, while being scanned with functional magnetic resonance imaging, performed a simple reward-based action selection task to obtain different magnitudes of either monetary outcomes (winning or losing money) or juice outcomes (pleasant apple juice or an unpleasant salt flavor). At the group level, we found partially overlapping value-related activity within ventromedial prefrontal cortex (vmPFC) during anticipation of juice and money reward outcomes. Analogous results were found in the right anterior insula, except that this region showed negative correlations as a function of increasing expected reward. These results indicate that vmPFC and anterior insula contain overlapping representations of anticipatory value, consistent with the existence of a common currency for the value of expected outcomes in these regions.
Introduction
In order to make decisions between disparate rewards, such as when choosing between buying a refrigerator and booking a vacation, it is often suggested that the value of these rewards needs to be encoded on a common scale or currency so that it is possible to work out which option is preferable to that individual (Montague and Berns 2002). In economics, the term “utility” is used to describe the relative preference that an individual will express for such different types of reward, a construct implying that all of these items can be ranked on a common scale (Georgescu-Roegen 1968). It is still an open question, however, as to how such a common utility, if one exists, might be represented in the brain.
While a large number of neuroimaging studies have been conducted on the neural representation of reward and expected reward for many different types of reinforcers including money (Delgado et al. 2000; Knutson et al. 2000; O'Doherty et al. 2001; O'Doherty, Critchley, et al. 2003), food (or juice) (O'Doherty et al. 2002; De Araujo and Rolls 2004; Valentin et al. 2007), odors (O'Doherty et al. 2000; Gottfried et al. 2002; Royet et al. 2003), attractive faces (Aharon et al. 2001; O'Doherty, Winston, et al. 2003), esthetic art (Kawabata and Zeki 2004), and pleasant touch (Rolls et al. 2003) to name but a few, the question of whether there exists a unitary representation of reward value for different types of reinforcer has not been addressed in a definitive manner to date. This is partly due to the lack of an appropriate methodology, in which studies have typically been restricted to exploring responses to a single type of reward and thus any comparison between rewards or punishers can only be made in a qualitative manner between studies that involve different subject groups, imaging methodologies, and tasks. Thus, any differences in reported activation for different types of reinforcers across studies could be due to differences in the task, analysis methodology employed, scanning technology or parameters, subject populations, or can merely be ascribed to random interstudy variability. There is some preliminary evidence that social reinforcement may recruit partially overlapping regions of striatum to monetary reward (Izuma et al. 2008). However, the degree to which neural representations of more abstract reinforcers such as money and that of natural reinforcers such as juice are distinct or overlapping is as yet unknown.
In the present study, we aimed to use functional magnetic resonance imaging (fMRI) to probe representations for 2 different types of reinforcer: juice flavors and money. This was achieved by comparing responses to these 2 distinct reinforcer types within the same functional imaging study, using the same task and the same subjects. This enabled us to compare directly between regions activated by different types of reward or punisher rather than trying to infer differences in a qualitative manner between subjects and studies.
To achieve this, we used a reward-based action selection paradigm similar to the monetary incentive delay task of Knutson et al. (2000), in which subjects performed actions in order to obtain both juice and money rewards. In this task, each trial began with the presentation of a visual cue or discriminative stimulus (DS), to which subjects had to respond immediately in order to obtain an outcome that would be delivered 4 s later (Fig. 1A). The shape, color, and brightness of the DS informed subjects of the specific type (juice or money), valence (winning and losing money; receiving pleasant fruit juice or an unpleasant salty stimulus), and the magnitude of the outcome that would subsequently be delivered (Fig. 1B). This design allowed us to separately evaluate neural responses related to the anticipation and receipt of both juice and money rewards and punishers.
Figure 1. Schematic diagram of the experimental design. (A) Each trial began with the presentation of a visual cue or DS, to which subjects were asked to respond immediately. A yellow box appears following a key press, informing subjects of the successful registration of their key press, and 4 s later, both the DS and the yellow box disappear, while at the same time subjects see a text message indicating an outcome. (B) The shape, color, and brightness of the DS informed subjects of the specific type, valence and magnitude of outcome that would be subsequently delivered, and the associations of a DS with a specific outcome remained constant for a single subject, which were then counterbalanced across subjects. On trials involving monetary reward, subjects received monetary outcomes ranging from a gain of 20 cents down to a loss of 12 cents, whereas on trials involving juice reward, subjects received 1 of 3 different magnitudes of pleasant apple juice, tasteless solution, or else an aversive juice.
A major methodological issue when directly comparing neural responses between different types of reinforcer is that in order for such a comparison to be meaningful either the utility of these reinforcers need to be equated (so that the reinforcers are equally valued) or else possible differences in the utility of the reinforcers need to be taken into account. Otherwise, differential neural responses between reinforcers could merely reflect differences in the overall utility of these reinforcers to the individual. Pilot behavioral experiments in which subjects could directly choose between squirts of juice and monetary outcomes demonstrated that it is extremely difficult to equate the value of these reinforcers directly. Given that only small quantities of juice can be delivered at any one time in the scanner, the monetary value of each individual squirt of juice was determined by even juice deprived subjects to be negligible, especially given that as soon as the experiment is complete subjects can then go out and purchase as much juice as they deem fit. Consequently, rather than trying to equate their value directly, we resorted to a statistical approach to allow us to evaluate responses to both types of reinforcer and to compare between them. Regions involved in encoding the value of juice but not money were hypothesized to show a significant monotonic increase in blood oxygen level–dependent (BOLD) responses with increasing values of the juice reinforcer (as detected by a parametric regression analysis), whereas BOLD responses to the money reinforcer should show no significant increases in activation with increasing values of money. Similarly, areas sensitive to the value of money but not juice were hypothesized to show the converse: a monotonically increasing response to money reward but no significant change in activity to juice. 
Finally, regions encoding a common utility for juice and money would be expected to show significantly increasing responses to increasing values of both juice and money reward (Fig. 1B). This design therefore allowed us to detect regions involved in coding separately for the value of juice and money rewards, as well as to uncover regions involved in responding commonly to both types of reward, constituting candidate areas for mediating a common utility.
In our search for a common utility, we focused our attention specifically on the ventromedial prefrontal cortex (vmPFC), as this area has been found to be involved in responding during the expectation and receipt of diverse types of reward in numerous studies (O'Doherty et al. 2001; Knutson et al. 2003; Hampton et al. 2006) and has previously been postulated as the possible locus for a common currency representation (Montague and Berns 2002; O'Doherty and Dolan 2006).
In addition to exploring the representation of expected reward, that is, for areas increasing in activity with increasing reward magnitude, we also tested for the opposite response profile, that is, for areas exhibiting an increase in activity as the expected reward magnitude decreased. Furthermore, we also tested for common and distinct negative value-related encoding responses to juice and money. For this analysis, we focused in particular on the anterior insula, as this region has long been associated with negative affect in general and in responding during anticipation and receipt of aversive reinforcers such as pain or monetary loss (Ploghaus et al. 1999).
Materials and Methods
Subjects
Eighteen right-handed healthy normal subjects (10 females; mean age = 26.5 ± 4.94 years, age range = 18–40) participated in the fMRI experiment. Participants were instructed to abstain from eating solid foods and from drinking sugared or sweetened drinks (but not from drinking water) for 6 h prior to their participation in the experiment. Subjects were food deprived to ensure that the liquid food rewards were deemed valuable by the subjects but were not water deprived to ensure that the tasteless control solution was deemed affectively neutral, according to a well-established experimental protocol for studying liquid food reward in human fMRI experiments (Gottfried et al. 2003; Valentin et al. 2007; Valentin and O'Doherty 2009). All the participants reported that they were hungry when asked immediately prior to the experiment. All subjects gave informed consent, and the study was approved by the Institutional Review Board of the California Institute of Technology.
Stimuli and Task
Visual stimuli were presented through MRI-compatible goggles (Resonance Technology), and subjects made choices using an MRI-compatible button box. Three types of liquid stimuli were used: apple juice as the pleasant stimulus; an affectively neutral, tasteless solution consisting of the main ionic components of human saliva (25 mM KCl and 2.5 mM NaHCO3) as the neutral stimulus; and an aversive flavor consisting of 0.5 M NaCl dissolved in cold black tea as the aversive stimulus.
The liquid stimuli were delivered by means of separate electronic syringe pumps (one for each liquid) positioned in the scanner control room. These pumps transferred the liquid stimuli to the subject via ∼10-m-long polyethylene plastic tubes (6.4 mm diameter), the other ends of which were held between the subject’s lips like a straw while the subject lay supine in the scanner.
To detect transient head movements due to swallowing, we attached an 8-cm long copper coil with a radius of 2.5 cm to the neck of each subject. Small movements of the coil induced a current in the magnetic field that could be detected when amplified using one channel of a Biopac system (Biopac systems, Inc) positioned in the scanner control room. This produced a time series over the whole experiment reflecting transient head movement.
Each subject had a single functional scan. In each scan, there were 4 juice and 4 money blocks, which were pseudorandomly distributed and counterbalanced across subjects. This feature of the design was intended to mitigate relative contrast effects in responses to food and money rewards, which might have occurred had juice and money trials been fully intermixed. There were a total of 5 different types of trials (High, Med, Low, Neu, and Neg), and each trial type was repeated 5 times in each block (i.e., 25 trials per block). Each experiment lasted approximately 38 min (composed of 25 trials per block, 11 s per trial, and 8 blocks in total).
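The stated duration can be checked against the block structure. The sketch below is only an arithmetic check: the names are illustrative, and it assumes that every repeated (aborted) trial costs a full 11 s, which the text does not state. The mean number of aborted trials per subject (6.39) is the figure reported in the task description.

```python
# Nominal session length from the design parameters given in the text.
TRIALS_PER_BLOCK = 25    # 5 trial types x 5 repetitions
SECONDS_PER_TRIAL = 11
N_BLOCKS = 8             # 4 juice + 4 money blocks


def session_minutes(extra_trials=0.0):
    """Session length in minutes; extra_trials counts repeated (aborted) trials.

    Assumption: a repeated trial adds a full SECONDS_PER_TRIAL to the session.
    """
    total_trials = TRIALS_PER_BLOCK * N_BLOCKS + extra_trials
    return total_trials * SECONDS_PER_TRIAL / 60.0


nominal = session_minutes()
with_repeats = session_minutes(extra_trials=6.39)  # mean aborted trials per subject
print(f"nominal: {nominal:.1f} min, with repeats: {with_repeats:.1f} min")
```

Under these assumptions, the nominal design accounts for slightly less than the reported duration, and the repeated trials make up most of the difference.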
Each trial began with the presentation of a visual cue or DS. Subjects had to respond immediately after seeing each DS in order to obtain the outcome, which was delivered 4 s later (Fig. 1A). We used a fixed 4-s interval, since having a jittered interval between a conditioned stimulus and an unconditioned stimulus may introduce confounding prediction error-related signals (McClure et al. 2003; O'Doherty, Dayan, et al. 2003). A previous study using a similar fixed 4-s cue-outcome interval demonstrated 2 separate hemodynamic peaks corresponding to anticipation and outcome (Kim et al. 2006). If subjects did not respond within 1.5 s, the trial was aborted and subjects were then presented with the same trial again. In this way, subjects could not avoid obtaining the aversive outcomes by simply refusing to respond on those trials (in practice subjects missed only a small number of trials, with 6.39 trials aborted on average per subject). The shape (triangle and square) of the DS predicted the type of the outcome (i.e., juice or money), the color (red, green, and blue) of the DS predicted the valence of the outcome (i.e., good, neutral, and bad), and the brightness (high, medium, and low) of the DS informed subjects of the magnitude (i.e., large, medium, or small) of outcome that would be subsequently delivered. The associations of a DS with a specific outcome remained constant for a single subject, which were then counterbalanced across subjects (Fig. 1B). On trials involving monetary reward, subjects received monetary outcomes ranging from a gain of 20 cents down to a loss of 12 cents (i.e., +20, +12, +4, 0, and −12 cents), whereas on trials involving juice reward, subjects received 1 of 3 different magnitudes of pleasant apple juice, tasteless solution, or else an aversive juice (i.e., 1.0 mL [apple juice], 0.6 mL [apple juice], 0.2 mL [apple juice], 0.6 mL [artificial saliva], and 0.6 mL [salty tea]).
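The cue structure described above can be written down as a small lookup. This is a sketch for one hypothetical subject only: the particular shape, color, and brightness assignments below are assumptions, since the real assignments were counterbalanced across subjects and are not listed in the text. It also assumes that brightness is irrelevant for neutral and negative cues, which had a single magnitude.

```python
# Hypothetical cue-to-outcome mapping for ONE subject (assignments assumed).
SHAPE_TO_TYPE = {"triangle": "juice", "square": "money"}
COLOR_TO_VALENCE = {"green": "good", "blue": "neutral", "red": "bad"}
BRIGHTNESS_TO_MAGNITUDE = {"high": "large", "medium": "medium", "low": "small"}

# Outcomes taken from the text: three magnitudes for positive outcomes...
GOOD = {
    "money": {"large": "+20 cents", "medium": "+12 cents", "small": "+4 cents"},
    "juice": {"large": "1.0 mL apple juice", "medium": "0.6 mL apple juice",
              "small": "0.2 mL apple juice"},
}
# ...and a single magnitude for neutral and negative outcomes.
FIXED = {
    "money": {"neutral": "0 cents", "bad": "-12 cents"},
    "juice": {"neutral": "0.6 mL artificial saliva", "bad": "0.6 mL salty tea"},
}


def outcome_for(shape, color, brightness):
    """Return the outcome signaled by a cue under the assumed mapping."""
    kind = SHAPE_TO_TYPE[shape]
    valence = COLOR_TO_VALENCE[color]
    if valence == "good":
        return GOOD[kind][BRIGHTNESS_TO_MAGNITUDE[brightness]]
    return FIXED[kind][valence]  # brightness ignored (assumption)


print(outcome_for("square", "green", "high"))
print(outcome_for("triangle", "red", "medium"))
```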
In the present experiment, we modulated the magnitude of expected outcomes for the positive reward stimuli but only offered one magnitude for the negative stimuli in both modalities. This feature of the design was introduced due to limitations in the number of trial types we could incorporate into the one experiment due to constraints on experimental duration (for subject performance and comfort) and because the overall focus of the study was on the representation of expected reward and not punishers. Thus, although our design does allow us to explore representations of negative as well as positive reinforcers, it is important to note that the present design offers greater experimental power for detecting regions responding to expectation and receipt of positive than negative reinforcers. The present task is similar to the monetary incentive delay task by Knutson et al. (2000), except unlike in the Knutson paradigm, the outcome delivery was not contingent upon the relative speed of participants’ responses. Here, subjects were asked to press a single key every time they saw either a triangle or a square, and every response within 1.5 s was followed by either a positive or a negative outcome irrespective of their response time once the response was recorded within the 1.5 s window. If subjects failed to respond within the 1.5 s window then an aborted trial was repeated in the immediately following trial. At the time of juice or money delivery (i.e., outcome phase), subjects saw the computer screen with a message saying either “Juice Delivered!” for juice trials or “You Won XXX Cents!” for money trials in order to provide comparable textual feedback in both conditions.
Immediately following the experiments, subjects performed a 2-alternative forced-choice preference ranking task, where they were asked to choose a more preferred DS from each pair of 2 DSs that they saw during the experiment. Upon completing the preference task, each subject was presented with each DS again one at a time and asked to provide pleasantness ratings based on a scale ranging from 1 (not pleasant at all) to 9 (very pleasant). All the participants were told that they would receive the money they earned during the fMRI experiment (which amounted to an average of $9.60 across participants) plus a fixed participation fee ($30) by the end of the experiment.
Imaging Procedures
The functional imaging was conducted on a 3-T Siemens TRIO MRI scanner. For each subject, we acquired whole brain T1-weighted anatomical scans (256 × 256 voxels; 1 × 1 × 1–mm resolution; 176 axial slices) and gradient echo T2*-weighted echo-planar images (EPIs) with BOLD contrast (64 × 64 voxels; repetition time = 2000 ms; echo time = 30 ms; 3 × 3 × 3–mm resolution; 32 oblique axial slices). We used a tilted acquisition sequence at 30° to the anterior commissure-posterior commissure line to recover signal loss in medial orbitofrontal cortex (OFC) (Deichmann et al. 2003). In addition, we used an 8-channel phased-array coil, which yields a ∼40% signal increase in medial OFC over a standard head coil. Visual inspection of raw EPIs showed excellent signal quality in the medial OFC. Each brain volume comprised 32 axial slices of 3-mm thickness and 3-mm in-plane resolution. Each scan lasted approximately 40 min depending on performance, and the first 5 volumes of images were discarded to allow for equilibration effects.
Imaging Data Analysis
Image analysis was performed using SPM2 (Wellcome Department of Imaging Neuroscience, Institute of Neurology, London, UK). To correct for subject motion, the images were realigned to the first volume. They were then spatially normalized to a standard template with a resampled voxel size of 3 mm³ and spatially smoothed using a Gaussian kernel with a full-width at half-maximum of 6 mm.
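For readers who think of Gaussian kernels in terms of a standard deviation rather than a full-width at half-maximum (FWHM), the conversion is sigma = FWHM / (2 * sqrt(2 * ln 2)). The sketch below applies it to the 6-mm kernel; the function name is illustrative, and the 3-mm isotropic voxel edge is our reading of the resampled voxel size.

```python
import math


def fwhm_to_sigma(fwhm_mm):
    """Convert a Gaussian full-width at half-maximum to a standard deviation."""
    return fwhm_mm / (2.0 * math.sqrt(2.0 * math.log(2.0)))


FWHM_MM = 6.0    # smoothing kernel stated in the text
VOXEL_MM = 3.0   # assumed isotropic voxel edge length after resampling

sigma_mm = fwhm_to_sigma(FWHM_MM)
sigma_voxels = sigma_mm / VOXEL_MM
print(f"sigma = {sigma_mm:.2f} mm = {sigma_voxels:.2f} voxels")
```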
The structural T1 images were coregistered to the mean functional EPI images for each subject and normalized using the parameters derived from the EPI images. Anatomical localization was performed by overlaying the statistical maps on a normalized structural image averaged across subjects and with reference to an anatomical atlas.
Regressors for the events of DS presentation and of outcome delivery were convolved with a canonical hemodynamic response function, the duration of which was about 21 s, as measured from the onset time to the offset time. In the main results described in this manuscript, we orthogonalized the regressors for outcome delivery with respect to those for DS presentation events before entering them into the design matrices. It should be noted that orthogonalization does not allow one to resolve issues relating to collinearity (Henson 2007). However, we ran additional analyses without the orthogonalization procedure and replicated all the main findings, albeit at a weaker threshold (Supplementary Fig. S3). In addition, the 6 scan-to-scan motion parameters produced during realignment were entered into a regression analysis against the fMRI data for each individual subject, in order to account for residual effects of movement. To take into account transient head motion effects produced by, for example, swallowing, we also included an additional motion regressor that featured the output of the motion detector coil, band-pass filtered appropriately, smoothed, and subsampled to the number of scans in the experiment.
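Orthogonalizing one regressor with respect to another amounts to removing its projection onto the other, so that any shared variance is attributed to the reference regressor. The sketch below illustrates this with two toy, partially overlapping time courses; the values and function names are hypothetical, and this is not the SPM2 routine itself.

```python
def dot(a, b):
    """Inner product of two equal-length sequences."""
    return sum(x * y for x, y in zip(a, b))


def orthogonalize(target, reference):
    """Remove from `target` its projection onto `reference`."""
    scale = dot(target, reference) / dot(reference, reference)
    return [t - scale * r for t, r in zip(target, reference)]


# Toy regressors (hypothetical values): responses to the cue and to the
# outcome overlap in time because the outcome follows the cue closely.
cue = [0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0, 0.0]
outcome = [0.0, 0.0, 1.0, 2.0, 1.0, 0.0, 0.0, 0.0]

outcome_orth = orthogonalize(outcome, cue)
print("overlap before:", dot(outcome, cue))
print("overlap after: ", round(dot(outcome_orth, cue), 12))
```

After this step, the outcome regressor carries only the variance not shared with the cue regressor, which is why the procedure changes how shared variance is assigned rather than removing the underlying correlation between the two events.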
We computed linear contrasts of regressor coefficients to detect regions showing increasing response to increases in expected reward (i.e., [HIGH, MED, LOW, NEU, NEG] = [1.5, 1, 0.5, 0, −3]) and decreasing responses to increases in expected reward (i.e., [HIGH, MED, LOW, NEU, NEG] = [−1.5, −1, −0.5, 0, 3]) for both juice and money at the single subject level. When we used separate contrasts for positive and negative trials (compared with neutral), we found qualitatively similar activity patterns in the same regions. However, probably due to power considerations, we found these effects at somewhat weaker significance levels than in the analysis reported above where we integrated across both positive and negative trials. The results from each subject were taken to a random effects level by including the contrast images from each single subject into a one-sample t-test and a one-way analysis of variance with no mean term for conjunction analyses. Given our specific a priori region of interest, we used small volume correction (SVC) on vmPFC using a sphere with 8-mm radius centering around the coordinate of the vmPFC peak voxel (0, 30, −18) from a previous study where juice was used as a reinforcer (O'Doherty et al. 2006). We also tested for areas showing decreasing responses as a function of increasing anticipated reward value in anterior insula, a region that has previously been implicated in responding to a wide array of aversive reinforcers including monetary losses (O'Doherty, Critchley, et al. 2003; Kuhnen and Knutson 2005; Paulus and Stein 2006), disgusting odors (Wicker et al. 2003), faces depicting disgust (Phillips et al. 1997), as well as painful stimulations (Ploghaus et al. 1999; Seymour et al. 2004). For SVC on insula, we used the coordinate of the peak voxels within insula (left: −36, 22, 8; right: 36, 28, −2) taken from a previous study where aversive stimuli were used (Wicker et al. 2003).
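The parametric contrast weights given above sum to zero, so a region with an identical response in every condition yields a contrast value of zero. The sketch below applies the weights to a set of hypothetical regression coefficients; the beta values and the helper name are illustrative only.

```python
CONDITIONS = ["HIGH", "MED", "LOW", "NEU", "NEG"]
INCREASING = [1.5, 1.0, 0.5, 0.0, -3.0]   # weights from the text
DECREASING = [-w for w in INCREASING]     # sign-flipped weights


def contrast_value(betas, weights):
    """Weighted sum of per-condition regression coefficients."""
    return sum(b * w for b, w in zip(betas, weights))


# Hypothetical betas for a voxel whose response rises with expected value.
betas = [0.9, 0.6, 0.3, 0.0, -0.5]
flat = [0.4] * len(CONDITIONS)            # no value sensitivity

print("weights sum:", sum(INCREASING))
print("value-sensitive voxel:", contrast_value(betas, INCREASING))
print("flat voxel:", contrast_value(flat, INCREASING))
```

Because the negative condition carries a weight of −3, it contributes as much to the contrast as the three positive conditions combined, which is consistent with the remark that separate positive-only and negative-only contrasts had less power.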
In order to plot the effects in the activated regions, we used a leave-one-out analysis to obtain plots of effect sizes in our regions of interest (Kriegeskorte et al. 2009). For this, we computed a random effects contrast for all subjects but one and used the peak voxel from that analysis as the coordinate to extract the parameter estimates from the remaining (left out) subject. This was repeated 18 times by omitting a different subject each time, and the ensuing parameter estimates were then averaged across subjects and plotted.
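The leave-one-out procedure can be sketched as follows. This is a simplified illustration, not the actual pipeline: each subject's contrast map is reduced to a short list of values for the voxels in a region of interest, the group statistic is a plain one-sample t value, and the toy data are hypothetical.

```python
import math
from statistics import mean, stdev


def one_sample_t(values):
    """One-sample t statistic against zero."""
    return mean(values) / (stdev(values) / math.sqrt(len(values)))


def leave_one_out_estimates(contrast_maps):
    """For each subject, pick the peak voxel from all OTHER subjects and
    read out the left-out subject's value at that voxel.

    contrast_maps: one list of per-voxel contrast values per subject.
    """
    n_voxels = len(contrast_maps[0])
    estimates = []
    for left_out, held_out_map in enumerate(contrast_maps):
        training = [m for i, m in enumerate(contrast_maps) if i != left_out]
        t_values = [one_sample_t([m[v] for m in training])
                    for v in range(n_voxels)]
        peak = max(range(n_voxels), key=lambda v: t_values[v])
        estimates.append(held_out_map[peak])
    return estimates


# Hypothetical data: 4 subjects x 3 voxels; voxel 1 carries the effect.
maps = [
    [0.1, 1.0, 0.2],
    [-0.2, 1.2, 0.1],
    [0.3, 0.9, -0.1],
    [0.0, 1.1, 0.3],
]
estimates = leave_one_out_estimates(maps)
print(estimates, "mean:", round(mean(estimates), 3))
```

Because the voxel is always chosen without reference to the subject whose value is extracted, the averaged estimates are not biased by the selection step.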
Results
Behavioral Evidence of Learning
Reaction Times
In order to provide behavioral evidence of learning, we analyzed subjects’ reaction times separately for each condition. Jonckheere–Terpstra (JT) trend tests revealed that the subjects showed a significant linearly decreasing trend in reaction times in response to cues signaling increasing values for the predicted outcome in money (JT = 2.52, P < 0.05) but not juice (JT = 0.91, P > 0.05) conditions (Fig. 2A). Testing the linear trend of positive trials did not reveal any significant linear trends for either juice (JT = 0.51, P > 0.05) or money (JT = 0.98, P > 0.05) trials.
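For reference, a Jonckheere–Terpstra trend statistic can be computed as sketched below. Several assumptions apply: the sketch treats the conditions as independent groups (whereas the data here are repeated measures within subjects), it uses the large-sample normal approximation without a tie correction to the variance, and the sample values are hypothetical. Whether the reported JT values are standardized statistics of this kind is our reading and is not stated in the text.

```python
import math


def jonckheere_terpstra(groups):
    """Return (J, z) for samples ordered by the hypothesized increasing trend.

    J counts, over all ordered pairs of groups, the observation pairs that
    follow the predicted order (ties count one half). z standardizes J using
    the no-ties variance formula.
    """
    j_stat = 0.0
    for i in range(len(groups)):
        for j in range(i + 1, len(groups)):
            for x in groups[i]:
                for y in groups[j]:
                    if x < y:
                        j_stat += 1.0
                    elif x == y:
                        j_stat += 0.5
    sizes = [len(g) for g in groups]
    n = sum(sizes)
    mean_j = (n * n - sum(s * s for s in sizes)) / 4.0
    var_j = (n * n * (2 * n + 3)
             - sum(s * s * (2 * s + 3) for s in sizes)) / 72.0
    return j_stat, (j_stat - mean_j) / math.sqrt(var_j)


# Hypothetical, perfectly ordered data in three groups of three.
j_value, z_value = jonckheere_terpstra([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print(j_value, z_value)
```

To test for a decreasing trend, as with reaction times that shorten as value increases, the groups can simply be passed in the reverse order.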
Figure 2. Behavioral results. (A) A plot of the differences in reaction times (RTs) for each condition plotted with respect to the RTs for the neutral condition reveals that subjects were significantly faster in their responding as the reward value of the expected outcome increased. (B) Plot of the number of times each DS was chosen in the postexperiment preference choice test reveals that subjects show a preference ranking consistent with the value of the outcome with which that DS was associated. (C) Plot of subjective pleasantness measures for the DS obtained postexperiment also showed linear change in subjective pleasantness ratings consistent with the change in the value of the outcomes with which that DS was associated.
Preference Rankings
We also determined overall preference ranks for each cue after the experiment by repeatedly presenting pairs of cues incorporating all possible combinations of pairs (including the comparisons between cues for juice and money to assess relative preference for either juice or money) and recording subjects’ choices on each occasion. The total number of times each cue was chosen showed a significant linearly increasing trend with the increasing value of the outcomes with which that cue had been associated, separately for the juice (JT = 7.72, P < 0.05) and money (JT = 7.63, P < 0.05) conditions (Fig. 2B). The linear trends remained significant even when only the positive trials were considered: juice (JT = 5.68, P < 0.05) and money (JT = 5.36, P < 0.05).
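The pairing scheme can be illustrated with a short sketch. With 5 cues per reinforcer there are 10 cues and 45 distinct pairs, and each cue appears in 9 of them. The cue labels, the single pass through the pairs, and the toy choice rule below are assumptions made for illustration; the text does not state how many times each pair was presented.

```python
from collections import Counter
from itertools import combinations

LEVELS = ("HIGH", "MED", "LOW", "NEU", "NEG")
CUES = [f"{kind}_{level}" for kind in ("juice", "money") for level in LEVELS]
PAIRS = list(combinations(CUES, 2))   # all unordered pairs, incl. juice vs money


def tally_choices(choices):
    """Count how often each cue was chosen; unchosen cues are kept at zero."""
    counts = Counter({cue: 0 for cue in CUES})
    counts.update(choices)
    return counts


# Toy chooser (assumption): always picks the first cue of each pair.
choices = [first for first, _second in PAIRS]
counts = tally_choices(choices)
print(len(PAIRS), counts["juice_HIGH"], counts["money_NEG"])
```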
Pleasantness Ratings
In addition, subjects provided subjective pleasantness ratings for the cues after the experiment using a scale ranging from 1 (very unpleasant) to 9 (very pleasant). Again, subjects’ pleasantness ratings for the cues showed a significant linearly increasing trend with increasing value of the outcomes with which the cue had been associated (Fig. 2C). Testing the linear trend across all trials revealed significant trends for both juice (JT = 6.75, P < 0.05) and money (JT = 6.93, P < 0.05) trials. The linear trends remained significant even when only the positive trials were considered: juice (JT = 4.54, P < 0.05) and money (JT = 4.59, P < 0.05). These results suggest that the observed linear trend is not simply due to the comparison between the negative and the other trials.
We did not find any significant gender differences in either the behavioral or fMRI results.
FMRI Results
Positive Value-Related Responses during Expectation of Juice and Money
At the group random effects level, we found that the vmPFC showed BOLD responses following cue presentation that increased monotonically as the value of the associated outcome was increased, separately for juice (x = −6, y = 27, z = −15, Z = 2.84, P < 0.05, SVC corrected) and money (x = −6, y = 36, z = −12, Z = 4.42, P < 0.05, SVC corrected) conditions (Fig. 3A). To formally test for an overlap in value-related responses during the 2 conditions, we performed a conjunction analysis. A region of vmPFC was found to exhibit a conjunction effect indicative of common value-related activity during expectation of money and juice (Fig. 3B: x = −6, y = 27, z = −15, Z = 3.03, P < 0.005, uncorrected). To plot the activity profile in vmPFC, we used a leave-one-out cross validation procedure (for details, see Materials and Methods), where a random effects contrast of juice (or money) value is computed for all subjects but one, and the peak voxel from that analysis is used as the coordinate to extract the parameter estimates from the remaining (left out) subject. The results of this plot are shown in Figure 3C,D for the regions encoding juice and money values, respectively. Direct contrasts between areas showing monotonic increases during anticipation of different magnitudes of money and juice reward did not reveal any significant effects, indicating that there is no evidence at the group random effects level of differential processing as a function of reinforcer type within vmPFC.
Figure 3. Group statistical maps showing areas with monotonically increasing responses to the value of an expected outcome for juice and money. (A) Areas showing monotonically increasing responses during expectation of juice (shown in red) and money (shown in green) are shown superimposed on a sagittal slice through a subject averaged anatomical scan (the threshold is set at P < 0.005 in this and subsequent figures). (B) Region of vmPFC showing significant overlapping responses during expectation of both juice and money reward (x = −6, y = 27, z = −15; Z = 3.03, P < 0.005, uncorrected). (C,D) Effect size plot from voxels encoding juice (C) and money (D) values defined using a leave-one-out analysis.
Negative Value-Related Responses during Expectation of Juice and Money
At the group random effects level, we found significant negative value-related responses in the insula bilaterally for both juice (x = 27, y = 33, z = 6, Z = 3.62, P < 0.05, SVC corrected; x = −33, y = 30, z = 6, Z = 3.93, P < 0.05, SVC corrected) and money (x = 36, y = 24, z = 3, Z = 4.25, P < 0.05, SVC corrected; x = −39, y = 12, z = 3, Z = 3.47, P < 0.05, SVC corrected) (Fig. 4A). Furthermore, a conjunction analysis revealed an overlapping region in the right insula responding negatively during anticipation of both juice and money (Fig. 4B: x = 33, y = 24, z = 0, Z = 3.27, P < 0.005, uncorrected). To plot the activity profile in these regions as a function of value, we used a leave-one-out cross validation procedure (for details, see Materials and Methods). The results of this plot are shown in Figure 4C,D for the regions encoding juice and money values, respectively. Direct contrasts between areas showing monotonic increases during anticipation of different magnitudes of money and juice reward did not reveal any significant effects, indicating that there is no evidence at the group random effects level for differential processing as a function of reinforcer type in the insula.
Figure 4. Group statistical maps showing areas with monotonically decreasing responses to the value of an expected outcome for juice and money. (A) Statistical maps of areas showing negative value-related responses during anticipation of juice (red) and money (green), superimposed on an axial slice from a subject averaged T1 scan. (B) Region of right anterior insula cortex showing significant overlapping negative value responses during anticipation of juice and money (arrows; x = 33, y = 24, z = 0, Z = 3.27, P < 0.005, uncorrected). Plot of effect sizes for voxels encoding juice (C) and money (D) values in the left insula and in the right insula (E, F) defined using a leave-one-out analysis.
Responses Related to the Receipt of Juice and Money Outcomes
At the group random effects level, we found evidence for separate representations within vmPFC for the value of juice and money outcomes (Supplementary Fig. S2) but no evidence for overlap in representations at the group level, as a conjunction analysis revealed no significant overlapping voxels even at P < 0.05 (uncorrected).
We did not find any significant aversive outcome value-related activation in either right or left insula in the money condition at P < 0.005 (uncorrected), although the anterior insula (x = −36, y = 27, z = 3) showed monotonically decreasing responses to juice delivery (Z = 2.76, P < 0.005, uncorrected) as shown in Supplementary Table 2.
Activity in Areas Outside Our Regions of Interest in the Group Analyses
We also tested for areas showing significant effects in the group analyses in the rest of the brain outside our main regions of interest. These areas are listed in Supplementary Table 1. However, because none of these areas showed effects that survived correction for multiple comparisons across the whole brain, we do not draw definitive conclusions about them but merely report them for completeness. One finding is of note: activity was observed in nucleus accumbens which showed a differential effect during expectation of reward in the direction of enhanced responses during expectation of money compared with expectation of juice reward (x = 0, y = 12, z = 0, Z = 3.16, P < 0.005, uncorrected). However, subsequent inspection of the parameter estimates suggested that effects in this region were mostly driven by differences in responses to the anticipation of the aversive outcomes between the reinforcers (expectation of losing money compared with expectation of the aversive flavor) (see Supplementary Fig. S1).
Discussion
In the present study, we addressed how representations for the value of expected money and juice reward outcomes are encoded in the brain and specifically whether overlapping or distinct regions are engaged during expectation and receipt of money and juice reward, respectively. We found evidence for at least partly overlapping neural responses during expectation of money and juice reward in the vmPFC. These findings could be interpreted in the context of a common currency, in that one of the ways by which such a common utility could be implemented at the neural level is for the same brain area to respond commonly to multiple types of reward such as juice and money. These findings are also complementary to the results of a previous study that reported common overlapping representations of decision values for diverse goods ranging from monetary gambles to trinkets and food items (Chib et al. 2009). The key difference between the present study and that previous report is that here we are studying the representation of expected rewards, that is, the state elicited following the presentation of a cue that signals the subsequent presentation of variable quantities of either a money or juice reward. The expected reward signal studied here is conceptually quite different from the decision value signal reported by Chib et al. (2009), in that the latter is part of the input to the decision process while the former need not necessarily be decision related but merely indicates whether a reward is expected or not (and its corresponding value). The finding that in the present paradigm, vmPFC is also recruited and that a common region is also commonly engaged during expectation of food and money reward, suggests that vmPFC plays a very flexible role in representing values not only for diverse types of reinforcer but also represents a variety of different value signals including expected rewards, decision values, as well as outcome values (Knutson et al. 2001; O'Doherty et al. 2001).
One difference between primary reinforcers and abstract reinforcers such as money is the mechanism of delivery: in the case of primary reinforcers such as food, delivery is tangible and involves the direct consumption of the food item. On the other hand, monetary reinforcers are by their very nature abstract, and their attainment can be signaled in any number of ways, whether it is the flash of a number on a trading screen, a cheque, a payslip, or a visual signal presented in an fMRI scanner indicating that an amount has been won. It is certainly possible that the more intangible nature of the monetary reward and the difference in the method of delivery in the scanner could explain any differences between money and juice representations if present in the brain, although no significant differences in representations were observed in the present study.
It is also notable that we did not find simple main effects of reward expectation for juice or money reward in the ventral striatum. This result may pertain to the fact that the ventral striatum is especially engaged by reward prediction errors as opposed to reward prediction per se (McClure et al. 2003; O'Doherty, Dayan, et al. 2003; Hare et al. 2008). While the onset of the different cues does constitute a form of prediction error, this signal alone may not have been sufficient to robustly recruit the ventral striatum in the present design. We did find a difference in striatal responses during expectation of money and juice reward, but in the absence of simple main effects of reward prediction for money reward, this finding should be interpreted with caution, although it does resonate with a previously reported finding for money and juice prediction errors (Valentin and O'Doherty 2009).
In addition to finding regions responding during expectation of reward, where activation increased as a consequence of an increase in the positive reward value of an expected outcome, we also found evidence for neural signals exhibiting a decrease for both juice and money. That is, some areas decreased their responses as the value of the expected outcome increased and responded maximally during expectation of the least desirable and aversive outcomes (monetary loss or “salty” tea). These areas were the left and right insula. These areas have previously been implicated in responding during many different types of aversive conditions, such as during receipt of monetary loss (O'Doherty, Critchley, et al. 2003; Paulus and Stein 2006), when considering risky gambles (in risk averse individuals) (Kuhnen and Knutson 2005; Huettel et al. 2006; Preuschoff et al. 2006), in response to disgusting odors (Wicker et al. 2003) and aversive tastes (Small et al. 1999) or faces displaying disgust (Phillips et al. 1997), and during the anticipation and receipt of painful stimulation (Ploghaus et al. 1999; Seymour et al. 2004). Here, we report a common negative value-coding representation during expectation of money and juice in the right but not left insula, which provides evidence that the right insula may have a very general role in signaling expected negative consequences for aversive outcomes, irrespective of the specific reinforcer involved.
In the present study, we did not find activity in lateral OFC during prediction or receipt of the aversive outcomes, in contrast to a number of previous findings (O'Doherty et al. 2001; O'Doherty, Critchley, et al. 2003; Ursu and Carter 2005; Fujiwara et al. 2008; Elliott et al. 2010). One possible explanation for this is that lateral OFC appears to be especially involved in situations where subjects need to initiate behavioral responses that result in the subsequent avoidance of that outcome (O'Doherty, Critchley, et al. 2003; Kim et al. 2006; Wrase et al. 2007; Elliott et al. 2010), although aversive-related activity in this region has sometimes been reported independently of response selection considerations (O'Doherty, Winston, et al. 2003; Seymour et al. 2005; Cloutier et al. 2008; Elliott et al. 2010).
Besides the possible role of a common currency to facilitate comparisons between different reinforcer types, it should also be noted that conceptually there is an equally important argument for the necessity of distinct representations for different types of reward in the brain as there is for a common representation. This is because different rewards can be valued differently by an animal at different points in time depending on its current biological needs and ensuing motivational states. For instance, when thirsty, water might be valued highly, whereas salted peanuts would have very low value, while the opposite might hold true if the animal is hungry or salt deprived. In order for value representations for each distinct type of reward to be modulated independently as a function of motivational state, a separate representation needs to exist for the value of each type of reinforcer. Although in the present study, activations correlating with expected juice reward did extend more posteriorly within vmPFC than activations correlating with expected money reward, a direct statistical comparison did not reveal a significant difference between these activated areas as a function of reinforcer type. Thus, we cannot conclude from this that expectation of money and juice reward is differentially represented within vmPFC. It is also notable that clear evidence for a differential representation in the voxels correlating with decision values for different goods was not found by Chib et al. (2009). One limitation of the present study and that of Chib et al. (2009) is that both studies focused on group-averaged data that could potentially obscure any differences in reinforcer representation present at the single-subject level. This could occur if such differential representations vary across subjects in their extent and location within vmPFC, as any such variability would likely be washed out in the group average. Further studies on this topic could address this limitation by acquiring repeated sessions of fMRI data from each subject in order to provide sufficient statistical power to test for value-coding representations at the single-subject level.
To conclude, the present study provides evidence for overlapping value-related responses during expectation of juice and money outcomes within vmPFC and anterior insula. These results suggest that during anticipation of future rewarding and aversive outcomes, some brain areas are involved in signaling the value of those outcomes irrespective of the nature of the reinforcer involved. These results are compatible with the possibility that vmPFC and anterior insula contribute to the computation of a common currency, whereby anticipated outcomes are represented on a common scale within the same brain areas. The present results further indicate that the vmPFC in particular is involved in common valuation coding for diverse types of reinforcer not only for decision values as shown previously but also for anticipatory values.
Funding
National Science Foundation grant (0617174) and Searle Scholarship (to J.P.O.D.); Japan Science and Technology Agency, Exploratory Research for Advanced Technology (to S.S.); the Gordon and Betty Moore Foundation (to J.P.O.D.); World Class University program through the Korea Science and Engineering Foundation funded by the Ministry of Education, Science and Technology (R31-2008-000-10008-0 to H.K.).
We thank Po-Yin Samuel Huang, Tony Bruguier, and Shawn Wagner for technical assistance. Conflict of Interest: None declared.



